Commun. Math. Phys. 288, 1–42 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0733-4
Communications in
Mathematical Physics
Bethe Algebra of Homogeneous XXX Heisenberg Model has Simple Spectrum

E. Mukhin1, V. Tarasov1,2, A. Varchenko3

1 Department of Mathematical Sciences, Indiana University – Purdue University, Indianapolis, 402 North Blackford St, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]
2 St. Petersburg Branch of Steklov Mathematical Institute, Fontanka 27, St. Petersburg, 191023, Russia. E-mail: [email protected]; [email protected]
3 Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3250, USA. E-mail: [email protected]

Received: 12 September 2007 / Accepted: 23 October 2008 / Published online: 18 February 2009 – © Springer-Verlag 2009
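Abstract: We show that the algebra of commuting Hamiltonians of the homogeneous XXX Heisenberg model has simple spectrum on the subspace of singular vectors of the tensor product of two-dimensional $\mathfrak{gl}_2$-modules. As a byproduct we show that there exist exactly $\binom{n}{l} - \binom{n}{l-1}$ two-dimensional vector subspaces $V \subset \mathbb{C}[u]$ with a basis $f, g \in V$ such that $\deg f = l$, $\deg g = n - l + 1$ and $f(u)\,g(u-1) - f(u-1)\,g(u) = (u+1)^n$.

1. Introduction

1.1. Homogeneous XXX Heisenberg model. Consider the vector space $(\mathbb{C}^2)^{\otimes n}$ and the linear operator

$$H_{XXX} = -\sum_{j=1}^{n} \bigl( \sigma_1^{(j)} \sigma_1^{(j+1)} + \sigma_2^{(j)} \sigma_2^{(j+1)} + \sigma_3^{(j)} \sigma_3^{(j+1)} \bigr),$$

where $\sigma_a^{(k)} = 1^{\otimes (k-1)} \otimes \sigma_a \otimes 1^{\otimes (n-k)}$, $\sigma_a^{(n+1)} = \sigma_a^{(1)}$, and $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices,

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$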
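As a sanity check on this definition, the following Python sketch assembles $H_{XXX}$ for a small periodic chain from Kronecker products of Pauli matrices and tests numerically that it commutes with the total $\mathfrak{gl}_2$ generators $e_{ab}$ acting on $(\mathbb{C}^2)^{\otimes n}$. The helper names (`site_operator`, `hamiltonian`, `total_generator`, `commute`) are illustrative choices and do not come from the paper; only the standard library is used.

```python
# Illustrative sketch: all helper names below are hypothetical, not from the paper.

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def combine(A, B, sign=1):
    """Entrywise A + sign * B."""
    return [[x + sign * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

ONE = [[1, 0], [0, 1]]
PAULI = (
    [[0, 1], [1, 0]],       # sigma_1
    [[0, -1j], [1j, 0]],    # sigma_2
    [[1, 0], [0, -1]],      # sigma_3
)

def site_operator(op, k, n):
    """op acting on the k-th tensor factor (1-based, periodic: n + 1 means 1)."""
    k = (k - 1) % n
    result = [[1]]
    for position in range(n):
        result = kron(result, op if position == k else ONE)
    return result

def zero(n):
    return [[0] * 2**n for _ in range(2**n)]

def hamiltonian(n):
    """H_XXX = - sum_j sum_a sigma_a^(j) sigma_a^(j+1) on (C^2)^{tensor n}."""
    H = zero(n)
    for j in range(1, n + 1):
        for sigma in PAULI:
            bond = matmul(site_operator(sigma, j, n), site_operator(sigma, j + 1, n))
            H = combine(H, bond, sign=-1)
    return H

def total_generator(a, b, n):
    """e_ab acting on the tensor product as the sum of matrix units over all sites."""
    unit = [[1 if (i, j) == (a, b) else 0 for j in (1, 2)] for i in (1, 2)]
    total = zero(n)
    for k in range(1, n + 1):
        total = combine(total, site_operator(unit, k, n))
    return total

def commute(A, B, tol=1e-12):
    C = combine(matmul(A, B), matmul(B, A), sign=-1)
    return all(abs(x) < tol for row in C for x in row)
```

For instance, `commute(hamiltonian(3), total_generator(1, 2, 3))` tests the commutation with $e_{12}$ on a three-site chain, while a single-site operator such as `site_operator(PAULI[2], 1, 3)` is not expected to commute with the Hamiltonian.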
The operator $H_{XXX}$ is the Hamiltonian of the celebrated XXX Heisenberg model, also called the homogeneous XXX model, and the problem is to find eigenvalues and eigenvectors of the Hamiltonian.

Supported in part by NSF grant DMS-0601005. Supported in part by RFFI grant 08-01-00638. Supported in part by NSF grant DMS-0555327.
This problem was first addressed in the pioneering work [Be] by H. Bethe, who looked for eigenvectors of $H_{XXX}$ in a certain special form. His method and its further extensions are traditionally called the Bethe ansatz. The current literature on the XXX model and its generalizations, the XXZ and XYZ models, as well as their counterparts in statistical mechanics, the six- and eight-vertex models, is enormous. We limit ourselves to mentioning just two books, [B1] and [KBI]. However, even the numerous references therein hardly cover half of the bibliography on the subject.

The Hamiltonian $H_{XXX}$ can be included into a one-parameter family of commuting linear operators called the transfer matrix, see [B1,FT,KBI]. We call the commutative unital subalgebra of linear operators on $(\mathbb{C}^2)^{\otimes n}$ generated by the transfer matrix the Bethe algebra. The actual problem is to construct eigenvalues and eigenvectors for the Bethe algebra.

The elements of the Bethe algebra commute with the natural $\mathfrak{gl}_2$-action on $(\mathbb{C}^2)^{\otimes n}$. Therefore, the eigenspaces of the Bethe algebra are representations of $\mathfrak{gl}_2$, and it suffices to construct highest weight vectors of those representations.

The Bethe ansatz method associates to every admissible solution $(\lambda_1, \dots, \lambda_l)$ of the system of equations

$$\left( \frac{\lambda_j + \frac{i}{2}}{\lambda_j - \frac{i}{2}} \right)^{n} = \prod_{\substack{k=1 \\ k \neq j}}^{l} \frac{\lambda_j - \lambda_k + i}{\lambda_j - \lambda_k - i}, \qquad j = 1, \dots, l, \tag{1.1}$$

a vector in $(\mathbb{C}^2)^{\otimes n}$, called the corresponding Bethe vector, see [FT]. A solution $(\lambda_1, \dots, \lambda_l)$ is called admissible if all $\lambda_1, \dots, \lambda_l$ are distinct, and all factors in (1.1) are nonzero. A nonzero Bethe vector is a highest weight vector of an $(n - 2l + 1)$-dimensional irreducible representation of $\mathfrak{gl}_2$, and all vectors in that representation are eigenvectors of each element of the Bethe algebra sharing the same eigenvalue.

It is an important question whether the Bethe ansatz method produces all eigenvectors of the Bethe algebra. This question is referred to as the question of completeness of the Bethe ansatz for finite chains. It was discussed by H. Bethe himself in [Be] and many times since then by other authors. For instance, see a recent discussion in [B2]. However, no rigorous proof is available even for the so-called inhomogeneous models. Moreover, as one can see from the results of this paper, Sklyanin's separation of variables does not prove completeness of the Bethe ansatz to the very end, though it is indeed an important step towards the proof.

To be more precise, there are certain quantum integrable models for which the completeness of the Bethe ansatz has been proved. For example, see [YY] and Theorem 1.2.2 in [KBI]. The proofs for those models are based on a variational principle and convexity of some auxiliary action. However, for the XXX model, the corresponding action is not convex, and that technique fails.

In this paper we establish the completeness of the Bethe ansatz method for the homogeneous XXX model provided the method is improved in a certain way, see below in the introduction. We show that the spectrum of the Bethe algebra of the homogeneous XXX model is simple, that is, all eigenspaces of the Bethe algebra are irreducible $\mathfrak{gl}_2$-modules. We also show that eigenvalues of the Bethe algebra are in a one-to-one correspondence with certain second-order linear difference equations with two linearly independent polynomial solutions.
We prove similar results for inhomogeneous higher spin XXX models.
To continue with the introduction and match the notation in the main part of the paper, we change the variables in system (1.1),

$$\lambda_j = i \left( t_j + \tfrac{3}{2} \right), \qquad j = 1, \dots, l,$$
and write the system in the polynomial form

$$(t_j + 2)^n \prod_{\substack{k=1 \\ k \neq j}}^{l} (t_j - t_k - 1) = (t_j + 1)^n \prod_{\substack{k=1 \\ k \neq j}}^{l} (t_j - t_k + 1), \qquad j = 1, \dots, l. \tag{1.2}$$
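The passage from (1.1) to (1.2) can be checked numerically. The sketch below assumes the substitution reads $\lambda_j = i\,(t_j + \tfrac{3}{2})$, which is the shift that maps the factors of (1.1) to those of (1.2); the function names are hypothetical. As a sample point it uses the admissible solution for $n = 4$, $l = 2$ quoted in the example later in this introduction.

```python
import math

def lambda_of_t(t):
    """Assumed change of variables: lambda = i (t + 3/2)."""
    return 1j * (t + 1.5)

def factor_identities_hold(tj, tk, tol=1e-12):
    """Compare the individual factors of (1.1) and (1.2) at arbitrary complex points."""
    lj, lk = lambda_of_t(tj), lambda_of_t(tk)
    first = (lj + 0.5j) / (lj - 0.5j) - (tj + 2) / (tj + 1)
    second = (lj - lk + 1j) / (lj - lk - 1j) - (tj - tk + 1) / (tj - tk - 1)
    return abs(first) < tol and abs(second) < tol

def residual_1_1(lams, n, j):
    """Left-hand side minus right-hand side of the j-th equation of (1.1)."""
    lhs = ((lams[j] + 0.5j) / (lams[j] - 0.5j)) ** n
    rhs = 1
    for k, lk in enumerate(lams):
        if k != j:
            rhs *= (lams[j] - lk + 1j) / (lams[j] - lk - 1j)
    return lhs - rhs

def residual_1_2(ts, n, j):
    """Left-hand side minus right-hand side of the j-th equation of (1.2)."""
    lhs = (ts[j] + 2) ** n
    rhs = (ts[j] + 1) ** n
    for k, tk in enumerate(ts):
        if k != j:
            lhs *= ts[j] - tk - 1
            rhs *= ts[j] - tk + 1
    return lhs - rhs

# Sample point: t = -3/2 +- (1/2) sqrt(-1/3), the n = 4, l = 2 example of the introduction.
HALF_ROOT = 0.5 * math.sqrt(1 / 3)
T_EXAMPLE = [complex(-1.5, HALF_ROOT), complex(-1.5, -HALF_ROOT)]
LAMBDA_EXAMPLE = [lambda_of_t(t) for t in T_EXAMPLE]
```

Evaluating `residual_1_2(T_EXAMPLE, 4, j)` and `residual_1_1(LAMBDA_EXAMPLE, 4, j)` for `j = 0, 1` should give values that vanish up to floating-point rounding.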
We call system (1.2) the system of the Bethe ansatz equations. The system is invariant with respect to permutations of $t_1, \dots, t_l$, so the symmetric group $S_l$ acts on solutions to the Bethe ansatz equations. We denote by $\omega(t_1, \dots, t_l)$ the Bethe vector corresponding to an admissible solution $(t_1, \dots, t_l)$ of the Bethe ansatz equations. The Bethe vectors corresponding to admissible solutions with permuted coordinates are equal. The number of Bethe vectors $\omega(t_1, \dots, t_l)$ is equal to the number of $S_l$-orbits of admissible solutions to system (1.2).

Since each element of the Bethe algebra commutes with the natural $\mathfrak{gl}_2$-action on $(\mathbb{C}^2)^{\otimes n}$, it is enough to diagonalize the action of the Bethe algebra on each subspace of $\mathfrak{gl}_2$-singular vectors of given weight,

$$\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l] = \{\, v \in (\mathbb{C}^2)^{\otimes n} \mid e_{12} v = 0,\ e_{11} v = (n - l)\, v,\ e_{22} v = l\, v \,\},$$

with $2l \le n$. For every admissible solution $(t_1, \dots, t_l)$ of the Bethe ansatz equations, the Bethe vector $\omega(t_1, \dots, t_l)$ belongs to the subspace $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$.

To illustrate the problem with completeness of the Bethe ansatz in the standard form and the way it can be resolved, let us consider an example. Let $n = 4$ and $l = 2$. Then $\dim \mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[2] = 2$, and the operator $H_{XXX}$ restricted to $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[2]$ has eigenvalues $5$ and $-3$. The Bethe ansatz equations are

$$(t_1 + 2)^4 (t_1 - t_2 - 1) = (t_1 + 1)^4 (t_1 - t_2 + 1), \qquad (t_2 + 2)^4 (t_2 - t_1 - 1) = (t_2 + 1)^4 (t_2 - t_1 + 1), \tag{1.3}$$

and there is only one orbit of admissible solutions:

$$t_1 = -\frac{3}{2} + \frac{1}{2} \sqrt{-\frac{1}{3}}, \qquad t_2 = -\frac{3}{2} - \frac{1}{2} \sqrt{-\frac{1}{3}}. \tag{1.4}$$

The Bethe vector $\omega(t_1, t_2)$ is an eigenvector of $H_{XXX}$ with eigenvalue $5$. The results of this paper say that each eigenspace of $H_{XXX}$ acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[2]$ corresponds to a difference equation

$$u^4 f(u) - B(u)\, f(u - 1) + (u + 1)^4 f(u - 2) = 0, \tag{1.5}$$

where $B(u)$ is a polynomial, and the difference equation has polynomial solutions of degree $2$ and $3$. The corresponding eigenvalue of $H_{XXX}$ equals $1 - 2B'(0)/B(0)$. Indeed, there are exactly two such difference equations. The first one has $B(u) = 2u^4 + 4u^3 - 2u + 1$, and solutions $u^2 + 3u + \frac{7}{3}$ and $u^3 + 6u^2 + 11u + \frac{13}{2}$. The roots of the quadratic polynomial are the numbers $t_1$ and $t_2$ given by (1.4).
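The polynomial identities of this example are easy to confirm with exact rational arithmetic. The sketch below checks both difference equations (1.5) of the example: the one with $B(u) = 2u^4 + 4u^3 - 2u + 1$ and the one with $B(u) = 2u^4 + 4u^3 - 2u - 1$ that completes the example. A polynomial identity of degree at most $D$ is tested at $D + 1$ points. The sketch also evaluates $1 - 2B'(0)/B(0)$ and the combination $f(u)g(u-1) - f(u-1)g(u)$; for monic $f$ and $g$ this combination has leading coefficient $2l - n - 1 = -1$ here, so it equals $-(u+1)^4$, and rescaling $g$ by $-1$ gives the normalization of the abstract. All function and dictionary names are illustrative.

```python
from fractions import Fraction as F

N = 4  # number of sites in the example

def poly(coeffs):
    """Polynomial u -> sum coeffs[k] * u**k, coefficients listed from the constant term up."""
    cs = [F(c) for c in coeffs]
    return lambda u: sum(c * u**k for k, c in enumerate(cs))

def baxter_residual(f, B, u, n=N):
    """Left-hand side of u^n f(u) - B(u) f(u-1) + (u+1)^n f(u-2) = 0."""
    return u**n * f(u) - B(u) * f(u - 1) + (u + 1)**n * f(u - 2)

def casoratian(f, g, u):
    return f(u) * g(u - 1) - f(u - 1) * g(u)

def vanishes_identically(expr, degree_bound):
    """A polynomial of degree <= degree_bound vanishing at degree_bound + 1 points is zero."""
    return all(expr(F(u)) == 0 for u in range(degree_bound + 1))

EXAMPLES = {
    "admissible": {"B": [1, -2, 0, 4, 2], "f": [F(7, 3), 3, 1], "g": [F(13, 2), 11, 6, 1]},
    "nonadmissible": {"B": [-1, -2, 0, 4, 2], "f": [2, 3, 1], "g": [F(9, 2), 10, 6, 1]},
}

def check_example(data, n=N):
    B, f, g = poly(data["B"]), poly(data["f"]), poly(data["g"])
    f_solves = vanishes_identically(lambda u: baxter_residual(f, B, u, n), n + 2)
    g_solves = vanishes_identically(lambda u: baxter_residual(g, B, u, n), n + 3)
    wronskian_ok = vanishes_identically(lambda u: casoratian(f, g, u) + (u + 1)**n, n)
    # B'(0) is the coefficient of u, B(0) is the constant term.
    eigenvalue = 1 - 2 * F(data["B"][1]) / F(data["B"][0])
    return f_solves, g_solves, wronskian_ok, eigenvalue
```

Calling `check_example(EXAMPLES["admissible"])` and `check_example(EXAMPLES["nonadmissible"])` returns three booleans followed by the value of $1 - 2B'(0)/B(0)$ for the corresponding $B(u)$.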
The second difference equation (1.5) with polynomial solutions of degree $2$ and $3$ has $B(u) = 2u^4 + 4u^3 - 2u - 1$, and solutions $(u + 1)(u + 2)$ and $u^3 + 6u^2 + 10u + \frac{9}{2}$. The roots of the quadratic polynomial, $t_1 = -1$ and $t_2 = -2$, form a nonadmissible solution to system (1.3), and the Bethe vector $\omega(t_1, t_2)$ for $t_1 = -1$, $t_2 = -2$ equals zero.

For general $n$ and $l$ such that $2l \le n$, the results of this paper for the homogeneous XXX model say that eigenspaces of the Bethe algebra acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$ are one-dimensional. They are in a one-to-one correspondence with difference equations

$$u^n f(u) - B(u)\, f(u - 1) + (u + 1)^n f(u - 2) = 0, \tag{1.6}$$

where $B(u)$ is a polynomial, and those difference equations have polynomial solutions of degree $l$ and $n - l + 1$. The corresponding eigenvalues of elements of the Bethe algebra are described by the polynomial $B(u)$. In particular, the eigenvalue of $H_{XXX}$ equals $1 - 2B'(0)/B(0)$. The roots $t_1, \dots, t_l$ of the polynomial solution of Eq. (1.6) of degree $l$ form a solution of system (1.2). The Bethe vector $\omega(t_1, \dots, t_l)$ is nonzero if and only if the solution $(t_1, \dots, t_l)$ is admissible.

To obtain an eigenvector of the Bethe algebra corresponding to a difference equation (1.6) with two polynomial solutions, we use the following construction. The space $(\mathbb{C}^2)^{\otimes n}$ has a structure of a module over the Yangian $Y(\mathfrak{gl}_2)$, and the Bethe algebra of the homogeneous XXX model is the image of a commutative subalgebra $\mathcal{B} \subset Y(\mathfrak{gl}_2)$, called the Bethe subalgebra. We take another $Y(\mathfrak{gl}_2)$-module $W_{a,d}$, described in Sect. 2.5, which is the holomorphic representation of $Y(\mathfrak{gl}_2)$ associated with the polynomials $a(u) = (u + 1)^n$ and $d(u) = u^n$. There is a natural epimorphism $\sigma : W_{a,d} \to (\mathbb{C}^2)^{\otimes n}$ of $Y(\mathfrak{gl}_2)$-modules. Using the roots $t_1, \dots, t_l$ of the polynomial solution of Eq. (1.6) of degree $l$ and Sklyanin's procedure of separation of variables [Sk], we define a nonzero vector $\tilde{\omega}(t_1, \dots, t_l)$ in $W_{a,d}$, which is an eigenvector of $\mathcal{B}$ acting on $W_{a,d}$. We consider the maximal $\mathcal{B}$-invariant subspace $V \subset W_{a,d}$ that contains $\tilde{\omega}(t_1, \dots, t_l)$ and does not contain other linearly independent eigenvectors of $\mathcal{B}$. We show that the image $\sigma(V) \subset (\mathbb{C}^2)^{\otimes n}$ is a one-dimensional subspace of $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$. Since $\sigma$ is a homomorphism of $Y(\mathfrak{gl}_2)$-modules, $\sigma(V)$ is an eigenspace of the Bethe algebra acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$ with the same eigenvalues as the eigenvalues of $\tilde{\omega}(t_1, \dots, t_l)$ with respect to the action of $\mathcal{B}$ on $W_{a,d}$.
The subspace $\sigma(V) \subset \mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$ is that one-dimensional subspace of eigenvectors which we assigned to difference equation (1.6) with two polynomial solutions. If $(t_1, \dots, t_l)$ is an admissible solution, then the subspace $V \subset W_{a,d}$ is one-dimensional, and the subspace $\sigma(V)$ is spanned by the Bethe vector $\omega(t_1, \dots, t_l)$.

The construction described above provides a generalization of the Bethe ansatz method in which the solutions to the Bethe ansatz equations are replaced by difference equations (1.6) with two polynomial solutions, and the Bethe vectors in $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$ are replaced by the subspaces $\sigma(V)$. Our result says that the generalized Bethe vectors form a basis in $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$ and, moreover, the spectrum of the Bethe algebra is simple.

As a remark, we would like to indicate another way to obtain the eigenspace of the Bethe algebra acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[l]$ corresponding to the difference equation (1.6). We may consider the inhomogeneous XXX model depending on parameters $z_1, \dots, z_n$. The corresponding system of the Bethe ansatz equations is

$$\prod_{s=1}^{n} (t_j - z_s + 2) \prod_{\substack{k=1 \\ k \neq j}}^{l} (t_j - t_k - 1) = \prod_{s=1}^{n} (t_j - z_s + 1) \prod_{\substack{k=1 \\ k \neq j}}^{l} (t_j - t_k + 1), \qquad j = 1, \dots, l. \tag{1.7}$$

It follows from the results of this paper that if $f(u) = \prod_{j=1}^{l} (u - t_j)$ is a solution of the difference equation (1.6), then for generic $z = (z_1, \dots, z_n)$ there exists an admissible solution $t(z) = (t_1(z), \dots, t_l(z))$ of system (1.7) such that $t(z) \to (t_1, \dots, t_l)$ as $z \to 0$. The Bethe vectors $\omega(t(z); z)$ are nonzero for generic $z$, and the eigenspaces $\mathbb{C}\, \omega(t(z); z)$ have a one-dimensional limit as $z \to 0$, which is the eigenspace of the Bethe algebra of the homogeneous XXX model. A similar approach for the $\mathfrak{gl}_N$ Gaudin model is developed in [MTV5].

The correspondence between the eigenvectors of the Bethe algebra and second-order linear difference equations with two polynomial solutions is in the spirit of the geometric Langlands correspondence, in which eigenfunctions of commuting differential operators correspond to connections on curves.

Equation (1.6) is known in the physical literature as Baxter's equation. Its connection with the Bethe ansatz equations has been studied in many papers. The fact that the roots of a polynomial solution of Baxter's equation give a solution of the Bethe ansatz equations (provided the roots are distinct) is known as Manakov's principle and the analytic Bethe ansatz. An important observation about the existence of a second polynomial solution of Baxter's equation was made in [PS]. A similar observation in a much more general context was made independently in [MV2,MV3].

1.2. Content of the paper. The results of this paper for the XXX model are discrete analogues of the results of [MTV3] for the Gaudin model.

In Sect. 2 we discuss the Yangian $Y(\mathfrak{gl}_2)$, the Bethe subalgebra $\mathcal{B} \subset Y(\mathfrak{gl}_2)$, and Yangian modules. In particular, we describe the holomorphic representation $W_{a,d}$ of the Yangian $Y(\mathfrak{gl}_2)$. The module $W_{a,d}$ is associated with two monic polynomials

$$a(u) = \prod_{i=1}^{n} (u - z_i + m_i) \qquad \text{and} \qquad d(u) = \prod_{i=1}^{n} (u - z_i)$$
and is isomorphic to $\mathbb{C}[x_1, \dots, x_n]$ as a vector space. We introduce a collection $((m_1, 0), \dots, (m_n, 0))$ of $\mathfrak{gl}_2$-weights and say that the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating if $\sum_{i=1}^{n} m_i - 2l + 1 + s \neq 0$ for all $s = 1, \dots, l$.

In Sects. 3–7, we study the algebras $\mathcal{A}_W$ and $\mathcal{A}_D$, and relations between them. Eventually, we show that the algebras $\mathcal{A}_W$ and $\mathcal{A}_D$ are isomorphic, see Theorem 7.3.1.

The algebra $\mathcal{A}_W$ is the image of the Bethe subalgebra $\mathcal{B}$ acting on the subspace $\mathrm{Sing}\,W_{a,d}[l] \subset W_{a,d}$ of $\mathfrak{gl}_2$-singular vectors. We consider a polynomial $B(u, \boldsymbol{H}) = 2u^n + H_1 u^{n-1} + \dots + H_n$, whose coefficients $H_k \in \mathrm{End}\,(\mathrm{Sing}\,W_{a,d}[l])$ are generators of $\mathcal{A}_W$, and introduce the universal difference operator $D_{\mathrm{Sing}\,W_{a,d}[l]} = d(u) - B(u, \boldsymbol{H})\, \vartheta^{-1} + a(u)\, \vartheta^{-2}$ acting on $\mathrm{Sing}\,W_{a,d}[l]$-valued functions in $u$. Here $\vartheta : f(u) \mapsto f(u + 1)$.

The algebra $\mathcal{A}_D$ is defined in Sect. 4. We consider the space $\mathbb{C}^{l+n}$ with coordinates $\boldsymbol{a} = (a_1, \dots, a_l)$ and $\boldsymbol{h} = (h_1, \dots, h_n)$, polynomials $B(u, \boldsymbol{h}) = 2u^n + h_1 u^{n-1} + \dots + h_n$ and $p(u, \boldsymbol{a}) = u^l + a_1 u^{l-1} + \dots + a_l$, and the difference operator $D_{\boldsymbol{h}} = d(u) - B(u, \boldsymbol{h})\, \vartheta^{-1} + a(u)\, \vartheta^{-2}$. We define the scheme $C_D$ of points $p \in \mathbb{C}^{l+n}$ such that the polynomial $p(u, \boldsymbol{a}(p))$ lies in the kernel of the difference operator $D_{\boldsymbol{h}(p)}$. The algebra $\mathcal{A}_D$ is the algebra of functions
on $C_D$. There is a natural epimorphism $\psi_{DW} : \mathcal{A}_D \to \mathcal{A}_W$ such that $\psi_{DW}(h_k) = H_k$, see Theorem 4.3.3.

Using the Bethe ansatz method, we prove that if $z_1, \dots, z_n$ are generic and the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating, then the scheme $C_D$ considered as a set has at least $\dim \mathrm{Sing}\,W_{a,d}[l]$ distinct points, see Sect. 5.

In Sect. 6, we review Sklyanin's procedure of separation of variables in the XXX model and construct the universal weight function. Theorem 6.3.2 connects the algebras $\mathcal{A}_D$, $\mathcal{A}_W$ and the universal weight function.

The algebra $\mathcal{A}_D$ acts on itself by multiplication operators. We denote by $L_f$ the operator of multiplication by an element $f \in \mathcal{A}_D$. The algebra $\mathcal{A}_D$ acts on its dual space $\mathcal{A}_D^*$ by operators $L_f^*$, dual to multiplication operators. Using the universal weight function we define a linear map $\tau : \mathcal{A}_D^* \to \mathrm{Sing}\,W_{a,d}[l]$ and prove that if the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating, then $\tau$ is an isomorphism that intertwines the action of operators $L_f^*$, $f \in \mathcal{A}_D$, with the action of operators $\psi_{DW}(f) \in \mathrm{End}\,(\mathrm{Sing}\,W_{a,d}[l])$, see Theorem 7.3.1. Therefore, $\psi_{DW} : \mathcal{A}_D \to \mathcal{A}_W$ is an algebra isomorphism. Theorem 7.3.1 is our first main result.

Using the Grothendieck residue, we define an isomorphism $\phi : \mathcal{A}_D \to \mathcal{A}_D^*$ of $\mathcal{A}_D$-modules, see Sect. 7.4. Therefore, if the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating, the composition $\tau\phi : \mathcal{A}_D \to \mathrm{Sing}\,W_{a,d}[l]$ is a linear isomorphism which intertwines the action of the algebra $\mathcal{A}_D$ on itself by multiplication operators and the action of the Bethe algebra $\mathcal{A}_W$ on $\mathrm{Sing}\,W_{a,d}[l]$.

In Sects. 8 through 11, we impose more conditions on $m_1, \dots, m_n$ and $z_1, \dots, z_n$. We assume that $m_1, \dots, m_n$ are natural numbers. We keep the assumption that the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating, which takes the form $2l \le \sum_{s=1}^{n} m_s$. We also assume that $z_i - z_j \notin \mathbb{Z}$ if $i \neq j$.

In Sects. 8–11, we study three more algebras $\mathcal{A}_G$, $\mathcal{A}_P$ and $\mathcal{A}_L$, and relations between them.

The algebra $\mathcal{A}_G$ is defined in Sect. 8. We consider the subspace $\mathbb{C}_d[u] \subset \mathbb{C}[u]$ of all polynomials of degree $\le d$ for a suitably large number $d$, and the Grassmannian of all two-dimensional subspaces of $\mathbb{C}_d[u]$. Using the numbers $z_1, \dots, z_n$ and $m_1, \dots, m_n$ we define $n + 1$ Schubert cycles $C_{F(z_1), \Lambda^{(1)}}, \dots, C_{F(z_n), \Lambda^{(n)}}, C_{F(\infty), \Lambda^{(\infty)}}$ in the Grassmannian. The algebra $\mathcal{A}_G$ is the algebra of functions on the intersection of the Schubert cycles.

The algebra $\mathcal{A}_P$ is defined in Sect. 9.1. Let $\tilde{l} = \sum_{s=1}^{n} m_s + 1 - l$, $\tilde{\boldsymbol{a}} = (\tilde{a}_1, \dots, \tilde{a}_{\tilde{l}-l-1}, \tilde{a}_{\tilde{l}-l+1}, \dots, \tilde{a}_{\tilde{l}})$,

$$\tilde{p}(u, \tilde{\boldsymbol{a}}) = u^{\tilde{l}} + \tilde{a}_1 u^{\tilde{l}-1} + \dots + \tilde{a}_{\tilde{l}-l-1} u^{l+1} + \tilde{a}_{\tilde{l}-l+1} u^{l-1} + \dots + \tilde{a}_{\tilde{l}},$$

and consider the space $\mathbb{C}^{l+\tilde{l}+n-1}$ with coordinates $\tilde{\boldsymbol{a}}, \boldsymbol{a}, \boldsymbol{h}$. We define the scheme $C_P$ as the scheme of points $p \in \mathbb{C}^{l+\tilde{l}+n-1}$ such that the polynomials $p(u, \boldsymbol{a}(p))$ and $\tilde{p}(u, \tilde{\boldsymbol{a}}(p))$ lie in the kernel of the difference operator $D_{\boldsymbol{h}(p)}$. The algebra $\mathcal{A}_P$ is the algebra of functions on $C_P$. The map $(p(u, \boldsymbol{a}(p)), \tilde{p}(u, \tilde{\boldsymbol{a}}(p)), D_{\boldsymbol{h}(p)}) \mapsto (p(u, \boldsymbol{a}(p)), D_{\boldsymbol{h}(p)})$ defines a natural epimorphism $\psi_{DP} : \mathcal{A}_D \to \mathcal{A}_P$. We also show that the algebras $\mathcal{A}_G$ and $\mathcal{A}_P$ are naturally isomorphic.

To define the algebra $\mathcal{A}_L$, see Sect. 9.3, we consider the tensor product $L_{\boldsymbol{\Lambda}}(z) = L_{\Lambda^{(1)}}(z_1) \otimes \dots \otimes L_{\Lambda^{(n)}}(z_n)$ of evaluation Yangian modules, where $L_{\Lambda^{(i)}}$ is the irreducible $\mathfrak{gl}_2$-module of highest weight $\Lambda^{(i)} = (m_i, 0)$. The algebra $\mathcal{A}_L$ is the image of the Bethe subalgebra $\mathcal{B} \subset Y(\mathfrak{gl}_2)$ acting on the subspace $\mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l] \subset L_{\boldsymbol{\Lambda}}(z)$ of $\mathfrak{gl}_2$-singular vectors. The $Y(\mathfrak{gl}_2)$-module $L_{\boldsymbol{\Lambda}}(z)$ is isomorphic to the quotient module $W_{a,d}/K$, where $K \subset W_{a,d}$ is the kernel of
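the Yangian Shapovalov form on $W_{a,d}$. We denote by $\sigma : \mathrm{Sing}\,W_{a,d}[l] \to \mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$ the epimorphism of vector spaces corresponding to the epimorphism $W_{a,d} \to L_{\boldsymbol{\Lambda}}(z)$ of $Y(\mathfrak{gl}_2)$-modules. The epimorphism $\sigma$ induces the algebra epimorphism $\psi_{WL} : \mathcal{A}_W \to \mathcal{A}_L$.

We denote by $\xi : \mathcal{A}_D \to \mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$ the composition of maps $\sigma\tau\phi$, and by $\psi_{DL} : \mathcal{A}_D \to \mathcal{A}_L$ the composition of maps $\psi_{WL}\psi_{DW}$. We show that the kernels of the maps $\xi$, $\psi_{DL}$ and $\psi_{DP}$ coincide. This allows us to obtain an algebra isomorphism $\psi_{PL} : \mathcal{A}_P \to \mathcal{A}_L$ and a linear isomorphism $\zeta : \mathcal{A}_P \to \mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$ intertwining the action of $\mathcal{A}_P$ on itself by multiplication operators and the action of the Bethe algebra $\mathcal{A}_L$ on $\mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$. This is our second main result, see Theorem 10.3.1.

In Sect. 11, we use the Yangian Shapovalov form on $L_{\boldsymbol{\Lambda}}(z)$ and the map $\zeta$ to obtain a linear isomorphism $\theta : \mathcal{A}_P^* \to \mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$ intertwining the action of operators $L_f^*$, $f \in \mathcal{A}_P$, with the action of operators $\psi_{PL}(f) \in \mathrm{End}\,(\mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l])$, see Theorem 11.2.1. Using the isomorphism, we show that eigenvectors of the action of the algebra $\mathcal{A}_L$ on $\mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$ are in a one-to-one correspondence with certain second-order linear difference equations with two polynomial solutions of degrees $l$ and $n - l + 1$, see Corollary 11.2.3.

Section 12 contains the analogues of the previous results for the homogeneous XXX Heisenberg model.

We recapitulate the main results of this paper as three commutative diagrams. The horizontal arrows of the diagrams are isomorphisms, the downward vertical arrows are epimorphisms, and the upward vertical arrow is an embedding. The first diagram shows the algebras of functions $\mathcal{A}_D$, $\mathcal{A}_P$ on difference operators with respectively one or two polynomials in the kernels, the algebra $\mathcal{A}_G$ of functions on the intersection of Schubert cycles, the Bethe algebras $\mathcal{A}_W$ and $\mathcal{A}_L$, associated with $\mathrm{Sing}\,W_{a,d}[l]$ and $\mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]$, respectively, and their homomorphisms:

$$\begin{array}{ccccc}
 & & \mathcal{A}_D & \xrightarrow{\ \psi_{DW}\ } & \mathcal{A}_W \\
 & & \Big\downarrow {\scriptstyle \psi_{DP}} & & \Big\downarrow {\scriptstyle \psi_{WL}} \\
\mathcal{A}_G & \xrightarrow{\ \psi_{GP}\ } & \mathcal{A}_P & \xrightarrow{\ \psi_{PL}\ } & \mathcal{A}_L
\end{array}$$

The other two diagrams show the vector spaces involved:

$$\begin{array}{ccc}
\mathcal{A}_D^* & \xrightarrow{\ \tau\ } & \mathrm{Sing}\,W_{a,d}[l] \\
\Big\uparrow {\scriptstyle (\psi_{DP})^*} & & \Big\downarrow {\scriptstyle \sigma} \\
\mathcal{A}_P^* & \xrightarrow{\ \theta\ } & \mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]
\end{array}
\qquad\qquad
\begin{array}{ccc}
\mathcal{A}_D & \xrightarrow{\ \tau\phi\ } & \mathrm{Sing}\,W_{a,d}[l] \\
\Big\downarrow {\scriptstyle \psi_{DP}} & & \Big\downarrow {\scriptstyle \sigma} \\
\mathcal{A}_P & \xrightarrow{\ \zeta\ } & \mathrm{Sing}\,L_{\boldsymbol{\Lambda}}[l]
\end{array}$$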
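Each vector space on these diagrams is a module over the corresponding algebra on the first diagram, and all linear maps are consistent with the algebra homomorphisms.

2. Yangian $Y(\mathfrak{gl}_2)$ and Yangian Modules

2.1. Lie algebra $\mathfrak{gl}_2$. Let $e_{ab}$, $a, b = 1, 2$, be the standard generators of the complex Lie algebra $\mathfrak{gl}_2$. We have $\mathfrak{gl}_2 = \mathfrak{n}_+ \oplus \mathfrak{h} \oplus \mathfrak{n}_-$, where

$$\mathfrak{n}_+ = \mathbb{C} \cdot e_{12}, \qquad \mathfrak{h} = \mathbb{C} \cdot e_{11} \oplus \mathbb{C} \cdot e_{22}, \qquad \mathfrak{n}_- = \mathbb{C} \cdot e_{21}.$$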
For a $\mathfrak{gl}_2$-weight $\Lambda \in \mathfrak{h}^*$, we denote by $M_\Lambda$ the Verma $\mathfrak{gl}_2$-module with highest weight $\Lambda$ and by $L_\Lambda$ the irreducible $\mathfrak{gl}_2$-module with highest weight $\Lambda$.

2.1.1. Let $\boldsymbol{\Lambda} = (\Lambda^{(1)}, \dots, \Lambda^{(n)})$ be a collection of $\mathfrak{gl}_2$-weights, where $\Lambda^{(i)} = (\Lambda_1^{(i)}, \Lambda_2^{(i)})$ for $i = 1, \dots, n$. Let $l$ be a nonnegative integer. The pair $\boldsymbol{\Lambda}, l$ will be called separating if $\sum_{i=1}^{n} (\Lambda_1^{(i)} - \Lambda_2^{(i)}) - 2l + 1 + s \neq 0$ for all $s = 1, \dots, l$, cf. [MV1,MV2,MTV3]. In the following, we need the next lemma.

2.1.2. Lemma. Let $m$ be a complex number and $l$ a nonnegative integer. Let $V$ be a $\mathfrak{gl}_2$-module with weight decomposition $V = \bigoplus_{k=0}^{\infty} V[k]$, where $V[k] \subset V$ is a weight subspace of weight $(m - k, k)$. Assume that $m - 2l + 1 + s \neq 0$ for all $s = 1, \dots, l$. Then the map $e_{12} e_{21} : V[l - 1] \to V[l - 1]$ is an isomorphism of vector spaces.

Proof. Let $U_k = \ker e_{12}^{\,l-k}\big|_{V[l-1]}$. Clearly,

$$V[l - 1] = U_0 \supset U_1 \supset \dots \supset U_{l-1} \supset U_l = \{0\}.$$

Let $C = e_{11}(e_{22} + 1) - e_{12} e_{21}$. Set $P(x) = \prod_{k=0}^{l-1} (x - c_k)$, where $c_k = k\,(m - k + 1)$, and

$$Q(x) = \frac{P(x) - P(c_l)}{x - c_l}.$$

We have $e_{12} e_{21}\big|_{V[l-1]} = (c_l - C)\big|_{V[l-1]}$. Since $C$ is a central element, we have $(C - c_k)\, U_k \subset U_{k+1}$. Therefore, $P(C)\big|_{V[l-1]} = 0$, and $(c_l - C)\big|_{V[l-1]}\, Q(C)\big|_{V[l-1]} = P(c_l)$. The assumption on $m$ and $l$ implies that $P(c_l) \neq 0$. Hence, the operator $(c_l - C)\big|_{V[l-1]}$ is invertible.
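The last step of the proof rests on the factorization $c_l - c_k = (l - k)(m - l - k + 1)$, which gives $P(c_l) = l! \prod_{s=1}^{l} (m - 2l + 1 + s)$, so $P(c_l) \neq 0$ is exactly the separating condition. The following sketch checks this identity with exact rational arithmetic and wraps the separating condition of 2.1.1 in a small predicate. The function names are hypothetical, and weights are passed as pairs $(\Lambda_1^{(i)}, \Lambda_2^{(i)})$.

```python
from fractions import Fraction as F
from math import factorial

def c(k, m):
    """c_k = k (m - k + 1), as in the proof of Lemma 2.1.2."""
    return k * (m - k + 1)

def P_at_c_l(m, l):
    """P(c_l) = prod_{k=0}^{l-1} (c_l - c_k)."""
    result = F(1)
    for k in range(l):
        result *= c(l, m) - c(k, m)
    return result

def factored_form(m, l):
    """l! * prod_{s=1}^{l} (m - 2l + 1 + s)."""
    result = F(factorial(l))
    for s in range(1, l + 1):
        result *= m - 2 * l + 1 + s
    return result

def is_separating(weights, l):
    """Separating condition of 2.1.1 for a list of weights (L1, L2) and an integer l >= 0."""
    m = sum((F(w1) - F(w2) for w1, w2 in weights), F(0))
    return all(m - 2 * l + 1 + s != 0 for s in range(1, l + 1))
```

For example, `is_separating([(1, 0)] * 4, 2)` tests the pair relevant to the homogeneous chain with $n = 4$, $l = 2$.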
2.2. Yangian. The Yangian $Y(\mathfrak{gl}_2)$ is the unital associative algebra with generators $T_{ab}^{\{s\}}$, $a, b = 1, 2$ and $s = 1, 2, \dots$. Let

$$T_{ab}(u) = \delta_{ab} + \sum_{s=1}^{\infty} T_{ab}^{\{s\}} u^{-s}, \qquad a, b = 1, 2.$$

Then the defining relations in $Y(\mathfrak{gl}_2)$ have the form

$$(u - v) \bigl( T_{ab}(u) T_{cd}(v) - T_{cd}(v) T_{ab}(u) \bigr) = T_{cb}(v) T_{ad}(u) - T_{cb}(u) T_{ad}(v), \tag{2.1}$$

for all $a, b, c, d$. The Yangian is a Hopf algebra with coproduct

$$\Delta : T_{ab}(u) \mapsto \sum_{c=1}^{2} T_{cb}(u) \otimes T_{ac}(u) \tag{2.2}$$

for all $a, b$.
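2.2.1. Proposition [KBI]. The following relations hold:

$$T_{11}(u)\, T_{12}(u_1) \cdots T_{12}(u_k) = \prod_{i=1}^{k} \frac{u - u_i - 1}{u - u_i}\; T_{12}(u_1) \cdots T_{12}(u_k)\, T_{11}(u) + \frac{1}{(k - 1)!}\, T_{12}(u) \sum_{\sigma \in S_k} \frac{1}{u - u_{\sigma_1}} \prod_{i=2}^{k} \frac{u_{\sigma_1} - u_{\sigma_i} - 1}{u_{\sigma_1} - u_{\sigma_i}}\; T_{12}(u_{\sigma_2}) \cdots T_{12}(u_{\sigma_k})\, T_{11}(u_{\sigma_1}),$$

$$T_{22}(u)\, T_{12}(u_1) \cdots T_{12}(u_k) = \prod_{i=1}^{k} \frac{u - u_i + 1}{u - u_i}\; T_{12}(u_1) \cdots T_{12}(u_k)\, T_{22}(u) - \frac{1}{(k - 1)!}\, T_{12}(u) \sum_{\sigma \in S_k} \frac{1}{u - u_{\sigma_1}} \prod_{i=2}^{k} \frac{u_{\sigma_1} - u_{\sigma_i} + 1}{u_{\sigma_1} - u_{\sigma_i}}\; T_{12}(u_{\sigma_2}) \cdots T_{12}(u_{\sigma_k})\, T_{22}(u_{\sigma_1}).$$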
2.2.2. A series $f(u)$ in $u^{-1}$ is called monic if $f(u) = 1 + O(u^{-1})$. For a monic series $f(u)$, there is an automorphism

$$\varphi_f : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2), \qquad T_{ab}(u) \mapsto f(u)\, T_{ab}(u).$$

There is a one-parameter family of automorphisms

$$\rho_z : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2), \qquad T_{ab}(u) \mapsto T_{ab}(u - z),$$

where in the right-hand side, $(u - z)^{-1}$ has to be expanded as a power series in $u^{-1}$.

The Yangian $Y(\mathfrak{gl}_2)$ contains the universal enveloping algebra $U(\mathfrak{gl}_2)$ as a Hopf subalgebra. The embedding is given by the formula $e_{ab} \mapsto T_{ba}^{\{1\}}$ for all $a, b$. We identify $U(\mathfrak{gl}_2)$ with its image.

The evaluation homomorphism $\epsilon : Y(\mathfrak{gl}_2) \to U(\mathfrak{gl}_2)$ is defined by the rule: $T_{ab}^{\{1\}} \mapsto e_{ba}$ for all $a, b$, and $T_{ab}^{\{s\}} \mapsto 0$ for all $a, b$ and all $s > 1$.

We denote by $+ : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2)$ the antiinvolution defined by

$$(T_{ab}(u))^+ = T_{ba}(u). \tag{2.3}$$
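The defining relations (2.1) can be tested concretely in the simplest evaluation picture. The sketch below assumes that $T_{ab}(u)$ is represented on $\mathbb{C}^2$ by $\delta_{ab} + (u - z)^{-1} E_{ba}$, where $E_{ba}$ is the $2 \times 2$ matrix unit representing $e_{ba}$ (the composition of $\rho_z$ with the evaluation homomorphism in the defining representation). It verifies relation (2.1) for all index choices and also checks the $k = 1$ case of the exchange relations of Proposition 2.2.1 in the form $T_{11}(u) T_{12}(v) = \frac{u-v-1}{u-v} T_{12}(v) T_{11}(u) + \frac{1}{u-v} T_{12}(u) T_{11}(v)$ and $T_{22}(u) T_{12}(v) = \frac{u-v+1}{u-v} T_{12}(v) T_{22}(u) - \frac{1}{u-v} T_{12}(u) T_{22}(v)$. All helper names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

IDENTITY = [[F(1), F(0)], [F(0), F(1)]]

def unit(r, c):
    """2x2 matrix unit with a single 1 in row r, column c (indices 1 or 2)."""
    return [[F(1) if (i, j) == (r, c) else F(0) for j in (1, 2)] for i in (1, 2)]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def comb(*terms):
    """Linear combination: comb((x, A), (y, B), ...) returns x*A + y*B + ..."""
    return [[sum(x * M[i][j] for x, M in terms) for j in range(2)] for i in range(2)]

def T(a, b, u, z=F(0)):
    """Assumed evaluation image of T_ab(u) on C^2: delta_ab + E_ba / (u - z)."""
    return comb((F(int(a == b)), IDENTITY), (1 / (u - z), unit(b, a)))

def rtt_residual(a, b, c, d, u, v, z=F(0)):
    """Left-hand side minus right-hand side of relation (2.1)."""
    left = comb((u - v, mul(T(a, b, u, z), T(c, d, v, z))),
                (-(u - v), mul(T(c, d, v, z), T(a, b, u, z))))
    right = comb((1, mul(T(c, b, v, z), T(a, d, u, z))),
                 (-1, mul(T(c, b, u, z), T(a, d, v, z))))
    return comb((1, left), (-1, right))

def exchange_residuals(u, v, z=F(0)):
    """Residuals of the k = 1 exchange relations for T_11 and T_22 with T_12."""
    r11 = comb((1, mul(T(1, 1, u, z), T(1, 2, v, z))),
               (-(u - v - 1) / (u - v), mul(T(1, 2, v, z), T(1, 1, u, z))),
               (-1 / (u - v), mul(T(1, 2, u, z), T(1, 1, v, z))))
    r22 = comb((1, mul(T(2, 2, u, z), T(1, 2, v, z))),
               (-(u - v + 1) / (u - v), mul(T(1, 2, v, z), T(2, 2, u, z))),
               (1 / (u - v), mul(T(1, 2, u, z), T(2, 2, v, z))))
    return r11, r22

def is_zero(M):
    return all(x == 0 for row in M for x in row)
```

The check is exact because all entries are rational; the spectral parameters must satisfy $u \neq v$, $u \neq z$ and $v \neq z$.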
2.3. Bethe subalgebra. The series

$$\mathrm{qdet}\, T(u) = T_{11}(u)\, T_{22}(u - 1) - T_{12}(u)\, T_{21}(u - 1) \tag{2.4}$$

is called the quantum determinant. The coefficients of the series $\mathrm{qdet}\, T(u)$ belong to the center of the Yangian $Y(\mathfrak{gl}_2)$ [IK].

The series $T_{11}(u) + T_{22}(u)$ is called the transfer matrix. It is known that the coefficients of the series $T_{11}(u) + T_{22}(u)$ commute [FT]. We call the unital subalgebra $\mathcal{B} \subset Y(\mathfrak{gl}_2)$ generated by coefficients of the series $\mathrm{qdet}\, T(u)$ and $T_{11}(u) + T_{22}(u)$ the Bethe subalgebra. The Bethe subalgebra is commutative. Elements of the Bethe subalgebra commute with elements of the subalgebra $U(\mathfrak{gl}_2)$ and are invariant under the antiinvolution (2.3).

2.4. Yangian modules.

2.4.1. Theorem [T]. Let $V$ be an irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-module. There exist a unique up to proportionality vector $v \in V$, monic series $c_1(u)$, $c_2(u)$, and a monic polynomial $P(u)$ such that
$$T_{21}(u)\, v = 0, \qquad T_{aa}(u)\, v = c_a(u)\, v, \quad a = 1, 2,$$

and

$$\frac{c_1(u)}{c_2(u)} = \frac{P(u + 1)}{P(u)}. \tag{2.5}$$

The vector $v$ is called a highest weight vector, the series $c_1(u)$, $c_2(u)$ are called the Yangian highest weights, and the polynomial $P(u)$ is called the Drinfeld polynomial of the module $V$.

2.4.2. Theorem [T]. For any monic series $c_1(u)$, $c_2(u)$ and a monic polynomial $P(u)$ obeying relation (2.5), there exists a unique irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-module $V$ such that $c_1(u)$, $c_2(u)$ are the Yangian highest weights of the module $V$.

2.4.3. Let $V_1$, $V_2$ be irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-modules with respective highest weight vectors $v_1$, $v_2$. Then for the $Y(\mathfrak{gl}_2)$-module $V_1 \otimes V_2$, we have

$$T_{21}(u)\, v_1 \otimes v_2 = 0, \qquad T_{aa}(u)\, v_1 \otimes v_2 = c_a^{(1)}(u)\, c_a^{(2)}(u)\, v_1 \otimes v_2, \quad a = 1, 2.$$

Let $W$ be the irreducible subquotient of $V_1 \otimes V_2$ generated by the vector $v_1 \otimes v_2$. Then the Drinfeld polynomial of the module $W$ equals the product of the Drinfeld polynomials of the modules $V_1$ and $V_2$.

2.4.4. For a $\mathfrak{gl}_2$-module $V$, let the $Y(\mathfrak{gl}_2)$-module $V(z)$ be the pullback of $V$ through the homomorphism $\epsilon \circ \rho_z$; that is, the series $T_{ab}(u)$ acts on $V(z)$ as $\delta_{ab} + (u - z)^{-1} e_{ba}$. The module $V(z)$ is called the evaluation module with evaluation point $z$.

2.4.5. Let $\boldsymbol{\Lambda} = (\Lambda^{(1)}, \dots, \Lambda^{(n)})$ be a collection of integral dominant $\mathfrak{gl}_2$-weights, where $\Lambda^{(i)} = (\Lambda_1^{(i)}, \Lambda_2^{(i)})$, $\Lambda_1^{(i)} \ge \Lambda_2^{(i)}$, for $i = 1, \dots, n$. For generic complex numbers $z_1, \dots, z_n$, the tensor product of evaluation modules

$$L_{\boldsymbol{\Lambda}}(z) = L_{\Lambda^{(1)}}(z_1) \otimes \dots \otimes L_{\Lambda^{(n)}}(z_n)$$

is an irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-module and the corresponding highest weight series $c_1(u)$, $c_2(u)$ have the form

$$c_a(u) = \prod_{i=1}^{n} \frac{u - z_i + \Lambda_a^{(i)}}{u - z_i}.$$

The corresponding Drinfeld polynomial equals

$$P(u) = \prod_{i=1}^{n} \prod_{s = \Lambda_2^{(i)}}^{\Lambda_1^{(i)} - 1} (u - z_i + s). \tag{2.6}$$
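Relation (2.5) for these data is a telescoping product, which the following sketch checks at rational sample points. It assumes integer weights with $\Lambda_1^{(i)} \ge \Lambda_2^{(i)}$; the function names and the sample values are illustrative only.

```python
from fractions import Fraction as F

def highest_weight_ratio(u, zs, weights):
    """c_1(u) / c_2(u) for c_a(u) = prod_i (u - z_i + L_a^(i)) / (u - z_i)."""
    ratio = F(1)
    for z, (w1, w2) in zip(zs, weights):
        ratio *= (u - z + w1) / (u - z + w2)
    return ratio

def drinfeld_polynomial(u, zs, weights):
    """P(u) = prod_i prod_{s = L_2^(i)}^{L_1^(i) - 1} (u - z_i + s), as in (2.6)."""
    value = F(1)
    for z, (w1, w2) in zip(zs, weights):
        for s in range(w2, w1):
            value *= u - z + s
    return value

def relation_2_5_holds(u, zs, weights):
    """Check c_1(u)/c_2(u) = P(u + 1)/P(u) at a point u where P(u) is nonzero."""
    lhs = highest_weight_ratio(u, zs, weights)
    rhs = drinfeld_polynomial(u + 1, zs, weights) / drinfeld_polynomial(u, zs, weights)
    return lhs == rhs

# Hypothetical sample data: three evaluation points and integral dominant weights.
SAMPLE_ZS = [F(1, 3), F(-5, 7), F(2)]
SAMPLE_WEIGHTS = [(3, 0), (2, -1), (1, 1)]
```

The sample point $u$ has to avoid the zeros of $P(u)$ and of the denominators $u - z_i + \Lambda_2^{(i)}$.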
2.5. Holomorphic representation. The results of this section go back to [T]. Choose monic polynomials $a(u), d(u) \in \mathbb{C}[u]$ of positive degree $n$,

$$a(u) = \prod_{i=1}^{n} (u - z_i + m_i), \qquad d(u) = \prod_{i=1}^{n} (u - z_i). \tag{2.7}$$

2.5.1. Proposition. There exists a unique $Y(\mathfrak{gl}_2)$-action on the vector space $\mathbb{C}[x_1, \dots, x_n]$ such that

$$(T_{12}(u) \cdot p)(x) = \frac{1}{d(u)} \sum_{i=1}^{n} u^{n-i} x_i\, p(x_1, \dots, x_n) = \left( \frac{x_1}{u} + \frac{x_2 + x_1 \sum_{i=1}^{n} z_i}{u^2} + \dots \right) p(x_1, \dots, x_n) \tag{2.8}$$

for any polynomial $p \in \mathbb{C}[x_1, \dots, x_n]$, and

$$T_{11}(u) \cdot 1 = \frac{a(u)}{d(u)} \cdot 1, \qquad T_{22}(u) \cdot 1 = 1, \qquad T_{21}(u) \cdot 1 = 0, \tag{2.9}$$

where $1$ stands for the constant polynomial equal to $1$ as an element of $\mathbb{C}[x_1, \dots, x_n]$.

We denote by $W_{a,d}$ the $Y(\mathfrak{gl}_2)$-module defined by formulae (2.8), (2.9) and call it the holomorphic representation of $Y(\mathfrak{gl}_2)$ associated with the polynomials $a(u)$, $d(u)$.

The Yangian module $W_{a,d}$ is cyclic: every element of $W_{a,d}$ can be obtained from $1$ by the action of a suitable polynomial in $T_{12}^{\{1\}}, T_{12}^{\{2\}}, \dots$. Formulae (2.9) mean that $1$ is an eigenvector of the operators $T_{11}^{\{s\}}$, $T_{22}^{\{s\}}$ and $1$ is annihilated by the operators $T_{21}^{\{s\}}$ with $s = 1, 2, \dots$. Then the Yangian commutation relations (2.1) allow us to determine the action of $T_{11}^{\{s\}}$, $T_{22}^{\{s\}}$, $T_{21}^{\{s\}}$ on all elements of $W_{a,d}$.

Since the coefficients of the series $\mathrm{qdet}\, T(u)$ are central, and the module $W_{a,d}$ is generated by the polynomial $1$, we have

$$\mathrm{qdet}\, T(u)\big|_{W_{a,d}} = \frac{a(u)}{d(u)}. \tag{2.10}$$

For every $i, j = 1, 2$, we have

$$T_{ij}(u)\big|_{W_{a,d}} = \frac{\tilde{T}_{ij}(u)}{d(u)}, \tag{2.11}$$

where $\tilde{T}_{ij}(u)$ is an $\mathrm{End}\,(W_{a,d})$-valued polynomial in $u$ of degree $n$ for $i = j$, and of degree $n - 1$ for $i \neq j$.

2.5.2. The embedding $U(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2)$ defines a $\mathfrak{gl}_2$-module structure on $W_{a,d}$. The $\mathfrak{gl}_2$-weight decomposition of $W_{a,d}$ is the degree decomposition $W_{a,d} = \bigoplus_{l=0}^{\infty} W_{a,d}[l]$ into subspaces of homogeneous polynomials. The subspace $W_{a,d}[l]$ of homogeneous polynomials of degree $l$ has $\mathfrak{gl}_2$-weight $\bigl( \sum_{i=1}^{n} m_i - l,\ l \bigr)$.
2.5.3. Lemma. Let $\mathrm{Sing}\,W_{a,d}[l] = \{\, p \in W_{a,d}[l] \mid e_{12}\, p = 0 \,\}$ be the subspace of $\mathfrak{gl}_2$-singular vectors. Assume that the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating. Then

$$\dim \mathrm{Sing}\,W_{a,d}[l] = \dim W_{a,d}[l] - \dim W_{a,d}[l - 1].$$

Proof. The map $e_{12} e_{21} : W_{a,d}[l - 1] \to W_{a,d}[l - 1]$ is an isomorphism of vector spaces since the pair $((m_1, 0), \dots, (m_n, 0)),\, l$ is separating, see Lemma 2.1.2. The fact that $e_{12} e_{21}$ is an isomorphism implies the lemma.

2.5.4. Denote by $+ : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2)$ the antiinvolution defined by $T_{ij}^+(u) = T_{ji}(u)$. Denote by $\phi : W_{a,d} \to \mathbb{C}$ the linear function $p(x_1, \dots, x_n) \mapsto p(0, \dots, 0)$.

The Yangian Shapovalov form on $W_{a,d}$ is the unique symmetric bilinear form $S$ on $W_{a,d}$ defined by the formula

$$S(x \cdot 1, y \cdot 1) = \phi(x^+ y \cdot 1)$$

for all $x, y \in Y(\mathfrak{gl}_2)$. Different $\mathfrak{gl}_2$-weight subspaces of $W_{a,d}$ are orthogonal with respect to the form $S$, and

$$\det S\big|_{W_{a,d}[l]} = \mathrm{const} \prod_{i,j=1}^{n} \prod_{s=0}^{l-1} (z_i - z_j + m_j - s)^{\binom{n+l-s-2}{n-1}},$$

where the constant does not depend on $z_1, \dots, z_n$, $m_1, \dots, m_n$.

2.6. The kernel of the Yangian Shapovalov form $K \subset W_{a,d}$ is a $Y(\mathfrak{gl}_2)$-submodule. The $Y(\mathfrak{gl}_2)$-module $W_{a,d}/K$ is irreducible. The Yangian Shapovalov form on $W_{a,d}$ induces a nondegenerate symmetric bilinear form on $W_{a,d}/K$ called the Yangian Shapovalov form of the module $W_{a,d}/K$.

2.6.1. Theorem [T]. For generic $z_1, \dots, z_n$, $m_1, \dots, m_n$, the $Y(\mathfrak{gl}_2)$-module $W_{a,d}$ is irreducible and isomorphic to the tensor product of evaluation Verma modules $M_{(m_1,0)}(z_1) \otimes \dots \otimes M_{(m_n,0)}(z_n)$. Any such isomorphism sends $1$ to a scalar multiple of the tensor product $v_{(m_1,0)} \otimes \dots \otimes v_{(m_n,0)}$ of highest weight vectors of the corresponding Verma modules.

2.6.2. Theorem [T]. Let $m_i \in \mathbb{Z}_{\ge 0}$ for $i = 1, \dots, n$, and $m_1 \le m_2 \le \dots \le m_n$. Assume that $z_i - z_j + m_j - s \neq 0$ and $z_i - z_j - 1 - s \neq 0$ for all $i < j$ and $s = 0, 1, \dots, m_i - 1$. Then for any permutation $\sigma \in S_n$, the irreducible $Y(\mathfrak{gl}_2)$-module $W_{a,d}/K$ is isomorphic to the tensor product of evaluation irreducible modules $L_{(m_{\sigma_1},0)}(z_{\sigma_1}) \otimes \dots \otimes L_{(m_{\sigma_n},0)}(z_{\sigma_n})$. Any such isomorphism sends the element corresponding to $1$ to a scalar multiple of the tensor product $v_{(m_{\sigma_1},0)} \otimes \dots \otimes v_{(m_{\sigma_n},0)}$ of highest weight vectors of the corresponding irreducible modules.

For a proof of this theorem see also [CP].

2.6.3. The assumption of Theorem 2.6.2 can be formulated geometrically as the assumption that for $i < j$ the sets $Z_i = \{z_i, z_i - 1, \dots, z_i - m_i\}$ and $Z_j = \{z_j, z_j - 1, \dots, z_j - m_j\}$ either do not intersect, or the smaller set $Z_i$ is a subset of the larger set $Z_j$ (since we assumed that $m_i \le m_j$).
3. Algebra AW and Universal Difference Operator 3.1. Definition. Let V be a Y (gl2 )-module. We call the image of the Bethe algebra B ⊂ Y (gl2 ) in End (V ) the Bethe algebra associated with V . If U ⊂ V is a vector subspace preserved by elements of the Bethe algebra BV , then their restrictions to U define a commutative unital subalgebra BU ⊂ End (U ) called the Bethe algebra associated with U . 3.1.1. Define the operator ϑ acting on functions of u as (ϑ f )(u) = f (u + 1). Let V be a Y (gl2 )-module such that for all a, b the series Tab (u)|V sum up to End (V )-valued rational functions in u. Let U ⊂ V be a vector subspace preserved by the Bethe algebra BV . The universal difference operator DU acting on U -valued functions in u is defined by the formula DU = 1 − (T11 (u) + T22 (u)) U ϑ −1 + qdet T (u) ϑ −2 , U
see [Tal, MTV1, (4.16)], [MTV2]. The operator $\mathcal{D}_U$ is a linear second-order difference operator.

3.2. Algebra $A_W$. Operator $\mathcal{D}_{\operatorname{Sing}W_{a,d}[l]}$. Consider the Bethe algebra $\mathcal{B}_{W_{a,d}}$ associated with the $Y(\mathfrak{gl}_2)$-module $W_{a,d}$. Recall that
\[
\bigl(\operatorname{qdet}T(u)\bigr)\big|_{W_{a,d}}=\frac{a(u)}{d(u)},
\]
see (2.10), and
\[
\bigl(T_{11}(u)+T_{22}(u)\bigr)\big|_{W_{a,d}}=\frac{\tilde B(u,\tilde H)}{d(u)},
\]
where
\[
\tilde B(u,\tilde H)=\tilde H_0u^n+\tilde H_1u^{n-1}+\dots+\tilde H_n \tag{3.1}
\]
for suitable coefficients $\tilde H_k\in\operatorname{End}W_{a,d}$, see (2.11). It follows from Proposition 2.2.1 that the coefficients $\tilde H_0,\tilde H_1$ are scalar operators, $\tilde H_0=2$, $\tilde H_1=\sum_{i=1}^n(m_i-2z_i)$. The elements $\tilde H_k$ are called the XXX Hamiltonians associated with $W_{a,d}$.

3.2.1. The Hamiltonians $\tilde H_k$ preserve the subspace $\operatorname{Sing}W_{a,d}[l]$ defined in Sect. 2.5.3. Set $H_k=\tilde H_k|_{\operatorname{Sing}W_{a,d}[l]}\in\operatorname{End}(\operatorname{Sing}W_{a,d}[l])$ and $B(u,H)=H_0u^n+H_1u^{n-1}+\dots+H_n$. The coefficients $H_0,H_1,H_2$ are scalar operators,
\[
H_0=2,\qquad H_1=\sum_{i=1}^n(m_i-2z_i),\qquad
H_2=l\Bigl(l-1-\sum_{i=1}^nm_i\Bigr)+\sum_{1\le i<j\le n}\bigl(z_iz_j+(z_i-m_i)(z_j-m_j)\bigr).
\]
The simplest way to get the last formula is to extract $H_2$ from the coefficient of $u^{-2}$ of the series $(\operatorname{qdet}T(u))|_{\operatorname{Sing}W_{a,d}[l]}$, see (2.4), and to use formula (2.10).

We denote by $A_W$ the Bethe algebra associated with $\operatorname{Sing}W_{a,d}[l]$. It is the unital subalgebra of $\operatorname{End}(\operatorname{Sing}W_{a,d}[l])$ generated by the operators $H_3,H_4,\dots,H_n$, called the XXX Hamiltonians associated with $\operatorname{Sing}W_{a,d}[l]$.

3.2.2. The operators of the algebra $A_W$ are symmetric with respect to the Yangian Shapovalov form on $W_{a,d}$: $S(fv,w)=S(v,fw)$ for all $f\in A_W$ and $v,w\in W_{a,d}$, see [MTV1].

3.3. Operator $D_{\operatorname{Sing}W_{a,d}[l]}$. Consider the universal difference operator $\mathcal{D}_{\operatorname{Sing}W_{a,d}[l]}$ acting on $\operatorname{Sing}W_{a,d}[l]$-valued functions,
\[
\mathcal{D}_{\operatorname{Sing}W_{a,d}[l]}=1-\frac{B(u,H)}{d(u)}\,\vartheta^{-1}+\frac{a(u)}{d(u)}\,\vartheta^{-2}.
\]
The modified universal difference operator $D_{\operatorname{Sing}W_{a,d}[l]}$ is defined by the formula $D_{\operatorname{Sing}W_{a,d}[l]}=d(u)\,\mathcal{D}_{\operatorname{Sing}W_{a,d}[l]}$. Then
\[
D_{\operatorname{Sing}W_{a,d}[l]}=d(u)-B(u,H)\,\vartheta^{-1}+a(u)\,\vartheta^{-2}.
\]

3.3.1. Theorem. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Then for any $v_0\in\operatorname{Sing}W_{a,d}[l]$ there exist unique $v_1,\dots,v_l\in\operatorname{Sing}W_{a,d}[l]$ such that the function $w(u)=v_0u^l+v_1u^{l-1}+\dots+v_l$ is a solution of the difference equation $D_{\operatorname{Sing}W_{a,d}[l]}\,w(u)=0$.

Proof. By Lemma 2.5.3 the dimension of $\operatorname{Sing}W_{a,d}[l]$ does not depend on $z_1,\dots,z_n$, $m_1,\dots,m_n$, if the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Because of that, we may consider the difference equation $D_{\operatorname{Sing}W_{a,d}[l]}\,v(u)=0$ as a difference equation on a fixed vector space with coefficients depending algebraically on the parameters $z_1,\dots,z_n$, $m_1,\dots,m_n$.

Given a vector $v_0\in\operatorname{Sing}W_{a,d}[l]$, we look for a solution of the difference equation $D_{\operatorname{Sing}W_{a,d}[l]}\,v(u)=0$ in the form $v_0u^l+\sum_{j=1}^{\infty}v_ju^{l-j}$. Substituting this expression into the equation, we can calculate all of the coefficients $v_j$ recursively, and they are algebraic functions of $z_1,\dots,z_n$, $m_1,\dots,m_n$. For generic $z_1,\dots,z_n$ and large positive integral $m_1,\dots,m_n$, the coefficients $v_j$ are equal to zero for all $j>l$ by Theorem 2.6.2 and [MTV3, Theorem 7.3]. Hence, the same coefficients are equal to zero for all $z_1,\dots,z_n$, $m_1,\dots,m_n$ such that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating.
4. Algebra $A_D$

4.1. Definition. From now on until the end of Sect. 11 we fix complex numbers $z_1,\dots,z_n$, $m_1,\dots,m_n$, and a nonnegative integer $l$. We always assume that the polynomials $a(u)$ and $d(u)$ are given by formulae (2.7).

Let $a=(a_1,\dots,a_l)$ and $h=(h_1,\dots,h_n)$. Consider the space $\mathbb{C}^{l+n}$ with coordinates $a,h$. Let $D$ be the affine subspace of $\mathbb{C}^{l+n}$ defined by the equations $q_1(h)=0$, $q_2(h)=0$, where
\[
q_1(h)=h_1-\sum_{i=1}^n(m_i-2z_i),\qquad
q_2(h)=h_2-l\Bigl(l-1-\sum_{i=1}^nm_i\Bigr)-\sum_{1\le i<j\le n}\bigl(z_iz_j+(z_i-m_i)(z_j-m_j)\bigr).
\]
Let $p(u,a)=u^l+a_1u^{l-1}+\dots+a_l$, $B(u,h)=2u^n+h_1u^{n-1}+\dots+h_n$, and
\[
D_h=d(u)-B(u,h)\,\vartheta^{-1}+a(u)\,\vartheta^{-2}. \tag{4.1}
\]
If $h$ satisfies the equations $q_1(h)=0$ and $q_2(h)=0$, then the polynomial $D_h(p(u,a))$ is a polynomial in $u$ of degree $l+n-3$,
\[
D_h(p(u,a))=q_3(a,h)\,u^{l+n-3}+\dots+q_{l+n}(a,h).
\]
The coefficients $q_i(a,h)$ are linear functions in $a$ and linear functions in $h$. Denote by $I_D$ the ideal in $\mathbb{C}[a,h]$ generated by the polynomials $q_1,q_2,q_3,\dots,q_{l+n}$. The ideal $I_D$ defines a scheme $C_D\subset D$. Then $A_D=\mathbb{C}[a,h]/I_D$ is the algebra of functions on $C_D$. The scheme $C_D$ is the scheme of points $p\in D$ such that the polynomial $p(u,a(p))$ solves the difference equation $D_{h(p)}\,w(u)=0$.

4.2. Independence of dimension of $A_D$ on $z_1,\dots,z_n$. For fixed $m_1,\dots,m_n$, the scheme $C_D$ and the algebra $A_D$ depend on the choice of numbers $z=(z_1,\dots,z_n)$: $C_D=C_D(z)$, $A_D=A_D(z)$.

4.2.1. Theorem. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Then the dimension of $A_D(z)$, considered as a vector space, is finite and does not depend on the choice of numbers $z_1,\dots,z_n$.

Proof. It suffices to prove two facts:
(i) For any $z$, there are no algebraic curves over $\mathbb{C}$ lying in $C_D(z)$.
(ii) Let a sequence $z^{(i)}$, $i=1,2,\dots$, tend to a finite limit $z=(z_1,\dots,z_n)$. Let $p^{(i)}\in C_D(z^{(i)})$, $i=1,2,\dots$, be a sequence of points. Then all coordinates $a(p^{(i)})$, $h(p^{(i)})$ remain bounded as $i$ tends to infinity.
By fact (i), the dimension of A D (z) is finite for any z, whereas fact (ii) implies that dim A D (z) does not depend on z 1 , . . . , z n . For a point p in C D (z), the operator Dh( p) has the form d(u) − (2u n + h 1 ( p)u n−1 + h 2 ( p)u n−2 + h 3 ( p)u n−3 + · · · + h n ( p)) ϑ −1 + a(u) ϑ −2 , where the coefficients h 1 ( p) , h 2 ( p) are determined by the equations q1 (h) = 0 and q2 (h) = 0. Assume that (i) is not true. Since any affine algebraic curve over C is unbounded, there exists a sequence of points p(i) ∈ C D (z), i = 1, 2, . . . , which tends to infinity as i tends to infinity. Then it is easy to see that h( p(i) ) cannot tend to infinity since it would contradict the fact that Dh( p(i) ) p(u, a( p(i) )) = 0. Choosing a subsequence, we may assume that h( p(i) ) has a finite limit as i tends to infinity. Then a( p(i) ) cannot tend to infinity since it would mean that the limiting difference equation has a polynomial solution of degree less than l, and this is impossible. This reasoning implies that p(i) ∈ C D (z) cannot tend to infinity. Thus we get a contradiction and statement (i) is proved. The proof of statement (ii) is similar.
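The degree count of Sect. 4.1 can be checked on a small case. The Python sketch below applies $D_h$ to $p(u,a)=u+a_1$ for hypothetical data $n=2$, $l=1$, $m=(1,1)$, $z=(0,3)$. It assumes the normalization $a(u)=\prod_s(u-z_s+m_s)$, $d(u)=\prod_s(u-z_s)$, which is what (5.1) and Lemma 8.2.1 suggest for formulae (2.7). The helper names are illustrative and not part of the paper.

```python
# Polynomials in u are coefficient lists, constant term first.

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def add(p, q, sign=1):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + sign * y for x, y in zip(p, q)]

def shift(p, c):
    """Return the coefficients of p(u - c)."""
    out, power = [0], [1]            # power holds (u - c)^k
    for coef in p:
        out = add(out, [coef * x for x in power])
        power = mul(power, [-c, 1])
    return out

def apply_D(d, B, a, p):
    """Coefficients of d(u) p(u) - B(u) p(u-1) + a(u) p(u-2)."""
    return add(add(mul(d, p), mul(B, shift(p, 1)), -1), mul(a, shift(p, 2)))

# Assumed data: n = 2, l = 1, m = (1, 1), z = (0, 3).
d = mul([0, 1], [-3, 1])             # d(u) = u (u - 3)
a = mul([1, 1], [-2, 1])             # a(u) = (u + 1)(u - 2)
# q1(h) = q2(h) = 0 forces h1 = -4 and h2 = -4 for this data.
B = [-4, -4, 2]                      # B(u, h) = 2u^2 - 4u - 4

result_generic = apply_D(d, B, a, [5, 1])   # p = u + 5
result_zero = apply_D(d, B, a, [0, 1])      # p = u
```

For this data the terms of degree 3, 2 and 1 cancel and only the constant term, of degree $l+n-3=0$, survives: `result_generic` is `[10, 0, 0, 0]`, that is $q_3(a,h)=2a_1$, and `result_zero` vanishes, so the only point of $C_D$ here has $a_1=0$.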
4.3. Second description of $A_D$ and epimorphism $\psi_{DW}:A_D\to A_W$.

4.3.1. Theorem. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Assume that $h$ satisfies equations $q_1(h)=0$ and $q_2(h)=0$. Consider the system
\[
q_i(a,h)=0,\qquad i=3,\dots,l+2, \tag{4.2}
\]
as a system of linear equations with respect to $a_1,\dots,a_l$. Then this system has a unique solution $a_i=a_i(h)$, $i=1,\dots,l$, where $a_i(h)$ are polynomials in $h$.

Proof. The claim follows from the fact that
\[
q_{2+i}(a,h)=i\Bigl(\sum_{s=1}^nm_s-2l+i+1\Bigr)a_i+\sum_{j=1}^{i-1}q_{ij}(h)\,a_j
\]
for $i=1,\dots,l$. Here $q_{ij}$ are some linear functions of $h$. The coefficient of $a_i$ does not vanish since the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating.
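The diagonal coefficient in the proof of Theorem 4.3.1 can be checked numerically. The sketch below computes the coefficient of $a_i$ in $q_{2+i}$, that is, the coefficient of $u^{l+n-2-i}$ in $D_h(u^{l-i})$, for hypothetical data $n=2$, $l=2$, $m=(2,3)$, $z=(0,1)$. It again assumes $a(u)=\prod_s(u-z_s+m_s)$ and $d(u)=\prod_s(u-z_s)$ for formulae (2.7); all function names are illustrative.

```python
# Polynomials in u are coefficient lists, constant term first.

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def add(p, q, sign=1):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + sign * y for x, y in zip(p, q)]

def shift(p, c):
    """Return the coefficients of p(u - c)."""
    out, power = [0], [1]
    for coef in p:
        out = add(out, [coef * x for x in power])
        power = mul(power, [-c, 1])
    return out

def apply_D(d, B, a, p):
    """Coefficients of d(u) p(u) - B(u) p(u-1) + a(u) p(u-2)."""
    return add(add(mul(d, p), mul(B, shift(p, 1)), -1), mul(a, shift(p, 2)))

# Assumed data: n = 2, l = 2, m = (2, 3), z = (0, 1).
m, z, l = (2, 3), (0, 1), 2
n, M = len(m), sum(m)
d, a = [1], [1]
for ms, zs in zip(m, z):
    d = mul(d, [-zs, 1])             # factor (u - z_s)
    a = mul(a, [ms - zs, 1])         # factor (u - z_s + m_s)

# h1, h2 are fixed by q1(h) = 0 and q2(h) = 0.
h1 = sum(ms - 2 * zs for ms, zs in zip(m, z))
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
h2 = l * (l - 1 - M) + sum(
    z[i] * z[j] + (z[i] - m[i]) * (z[j] - m[j]) for i, j in pairs)
B = [h2, h1, 2]                      # B(u, h) for n = 2

diagonal = []
for i in range(1, l + 1):
    monomial = [0] * (l - i) + [1]   # u^(l - i), the monomial multiplying a_i
    image = apply_D(d, B, a, monomial)
    diagonal.append(image[l + n - 2 - i])

expected = [i * (M - 2 * l + i + 1) for i in range(1, l + 1)]
```

Both lists equal `[3, 8]` for this data, in agreement with the formula $i\bigl(\sum_s m_s-2l+i+1\bigr)$.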
4.3.2. Denote by $I'_D$ the ideal in $\mathbb{C}[h]$ generated by the polynomials $q_1(h)$, $q_2(h)$, $q_j(a(h),h)$, $j=l+3,\dots,l+n$. The ideal $I'_D$ defines a scheme $C'_D$ in the space $\mathbb{C}^n$ with coordinates $h=(h_1,\dots,h_n)$. The scheme $C'_D$ is the scheme of points $r\in\mathbb{C}^n$ such that the difference equation $D_{h(r)}\,w(u)=0$ has a polynomial solution of degree $l$. Theorem 4.3.1 implies that
\[
A_D\cong\mathbb{C}[h]/I'_D. \tag{4.3}
\]
Let $H_1,\dots,H_n$ be the operators introduced in Sect. 3.2.1.

4.3.3. Theorem. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Then the assignment $h_s\mapsto H_s$, $s=1,\dots,n$, determines an algebra epimorphism $\psi_{DW}:A_D\to A_W$.

Proof. We use description (4.3) of the algebra $A_D$. The equations defining the scheme $C'_D$ are the equations of existence of a polynomial solution of degree $l$ to the polynomial difference equation $D_h\,w(u)=0$. The operators $H_1,\dots,H_n$ satisfy the defining equations for $C'_D$ by Theorem 3.3.1.
5. Bethe Ansatz Equations

5.1. Bethe ansatz equations. The Bethe ansatz equations are the following system of equations with respect to complex numbers $t=(t_1,\dots,t_l)$:
\[
\prod_{s=1}^n(t_j-z_s+1+m_s)\prod_{k\ne j}(t_j-t_k-1)=\prod_{s=1}^n(t_j-z_s+1)\prod_{k\ne j}(t_j-t_k+1),\qquad j=1,\dots,l. \tag{5.1}
\]
A solution $t$ is called admissible if all $t_1,\dots,t_l$ are distinct, and all factors in (5.1) are nonzero. The permutation group $S_l$ acts on admissible solutions: if $t=(t_1,\dots,t_l)$ is an admissible solution, then any permutation of these numbers is an admissible solution too. We shall consider $S_l$-orbits of admissible solutions.

The following lemma is well known, see for example Lemma 2.2 in [MV2].

5.1.1. Lemma. Let $t$ be an admissible solution of system (5.1). Let
\[
p(u)=\prod_{i=1}^l(u-t_i),\qquad B(u)=\frac{d(u)\,p(u)+a(u)\,p(u-2)}{p(u-1)}.
\]
Then $B(u)$ is a polynomial of degree $n$ and $p(u)$ is annihilated by the difference operator $d(u)-B(u)\,\vartheta^{-1}+a(u)\,\vartheta^{-2}$.
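Lemma 5.1.1 can be illustrated on the smallest nontrivial case. The sketch below takes hypothetical data $n=2$, $l=1$, $m=(1,1)$, $z=(0,3)$, for which $t=(0)$ solves (5.1), and divides $d(u)p(u)+a(u)p(u-2)$ by $p(u-1)$. It assumes $a(u)=\prod_s(u-z_s+m_s)$, $d(u)=\prod_s(u-z_s)$ for formulae (2.7); the function names are illustrative.

```python
from math import prod

# Polynomials in u are coefficient lists, constant term first.

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def add(p, q, sign=1):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + sign * y for x, y in zip(p, q)]

def shift(p, c):
    """Return the coefficients of p(u - c)."""
    out, power = [0], [1]
    for coef in p:
        out = add(out, [coef * x for x in power])
        power = mul(power, [-c, 1])
    return out

def divide_monic(num, den):
    """Quotient and remainder of num by a monic polynomial den."""
    num = list(num)
    quo = [0] * (len(num) - len(den) + 1)
    for k in range(len(quo) - 1, -1, -1):
        c = num[k + len(den) - 1]
        quo[k] = c
        for j, b in enumerate(den):
            num[k + j] -= c * b
    return quo, num[:len(den) - 1]

def bethe_sides(t, z, m, j):
    """Left- and right-hand sides of the j-th equation of (5.1)."""
    others = [tk for k, tk in enumerate(t) if k != j]
    lhs = prod(t[j] - zs + 1 + ms for zs, ms in zip(z, m)) * prod(
        t[j] - tk - 1 for tk in others)
    rhs = prod(t[j] - zs + 1 for zs in z) * prod(
        t[j] - tk + 1 for tk in others)
    return lhs, rhs

# Assumed data: n = 2, l = 1, m = (1, 1), z = (0, 3), t = (0,).
m, z, t = (1, 1), (0, 3), (0,)
sides = bethe_sides(t, z, m, 0)

d = mul([0, 1], [-3, 1])             # d(u) = u (u - 3)
a = mul([1, 1], [-2, 1])             # a(u) = (u + 1)(u - 2)
p = [-t[0], 1]                       # p(u) = u - t_1
numerator = add(mul(d, p), mul(a, shift(p, 2)))
B, remainder = divide_monic(numerator, shift(p, 1))
```

Here both sides of (5.1) equal $-2$, the remainder is zero, and `B` is `[-4, -4, 2]`, that is $B(u)=2u^2-4u-4$, a polynomial of degree $n=2$ whose coefficients $h_1=h_2=-4$ satisfy $q_1(h)=q_2(h)=0$.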
5.1.2. Corollary. Any Sl -orbit of admissible solutions of the Bethe ansatz equations gives a point of the scheme C D considered as a set. Moreover, different Sl -orbits give different points. 5.1.3. Theorem. Assume that the pair ((m 1 , 0), . . . , (m n , 0)) , l is separating. Then for generic z 1 , . . . , z n the Bethe ansatz equations have at least dim Sing Wa,d [l ] distinct Sl -orbits of admissible solutions.
5.1.4. Corollary. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Then for generic $z_1,\dots,z_n$ the scheme $C_D$ considered as a set has at least $\dim\operatorname{Sing}W_{a,d}[l]$ distinct points.

Proof of Theorem 5.1.3. Make the change of variables: $z_s=\hat z_s/\varepsilon$, $s=1,\dots,n$, and $t_i=\hat t_i/\varepsilon$, $i=1,\dots,l$. Then Eqs. (5.1) take the form
\[
\prod_{s=1}^n\frac{\hat t_j-\hat z_s+\varepsilon+m_s\varepsilon}{\hat t_j-\hat z_s+\varepsilon}\;\prod_{k\ne j}\frac{\hat t_j-\hat t_k-\varepsilon}{\hat t_j-\hat t_k+\varepsilon}=1,\qquad j=1,\dots,l. \tag{5.2}
\]
As $\varepsilon$ tends to zero, Eqs. (5.2) take the form
\[
\sum_{s=1}^n\frac{m_s}{\hat t_j-\hat z_s}-\sum_{k\ne j}\frac{2}{\hat t_j-\hat t_k}=O(\varepsilon),\qquad j=1,\dots,l,
\]
and in the limit we obtain
\[
\sum_{s=1}^n\frac{m_s}{\hat t_j-\hat z_s}-\sum_{k\ne j}\frac{2}{\hat t_j-\hat t_k}=0,\qquad j=1,\dots,l. \tag{5.3}
\]
The last system is the system of the Bethe ansatz equations for the Gaudin model. It was proved in [RV] that if the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating and $\hat z_1,\dots,\hat z_n$ are generic, then system (5.3) has at least $\dim\operatorname{Sing}W_{a,d}[l]$ distinct $S_l$-orbits of admissible solutions. This proves Theorem 5.1.3.
6. Separation of Variables

6.1. Change of variables. For a nonnegative integer $l$ let $\mathbb{C}_l[y_1,\dots,y_{n-1}]^{\mathrm{Sym}}$ be the vector space of symmetric polynomials in $y_1,\dots,y_{n-1}$ of degree not greater than $l$ with respect to each variable. Let
\[
\mathcal{W}_{a,d}[l]=y_0^l\,\mathbb{C}_l[y_1,\dots,y_{n-1}]^{\mathrm{Sym}}\subset\mathbb{C}[y_0,y_1,\dots,y_{n-1}]
\]
and set $\mathcal{W}_{a,d}=\oplus_{l=0}^{\infty}\mathcal{W}_{a,d}[l]$. Define an isomorphism of vector spaces
\[
W_{a,d}\cong\mathcal{W}_{a,d} \tag{6.1}
\]
using the formula
\[
\sum_{i=1}^nx_iu^{n-i}=y_0\prod_{j=1}^{n-1}(u-y_j),
\]
that is, by setting $x_i=(-1)^{i-1}y_0\,\sigma_{i-1}(y_1,\dots,y_{n-1})$, where $\sigma_{i-1}$ is the $(i-1)$st elementary symmetric function. For example, for $n=2$ we have $x_1u+x_2=y_0(u-y_1)$ and $x_1=y_0$, $x_2=-y_0y_1$.
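The change of variables above is easy to tabulate. The following sketch computes $x_1,\dots,x_n$ from $y_0,y_1,\dots,y_{n-1}$ in two ways, by expanding the product and by the elementary symmetric functions, and compares them. The function names and sample values are illustrative only.

```python
from itertools import combinations
from math import prod

def x_from_product(y0, ys):
    """Coefficients x_1, ..., x_n of y0 * prod_j (u - y_j), leading term first."""
    coeffs = [y0]                    # constant term first while multiplying
    for y in ys:
        shifted = [0] + coeffs       # coeffs * u
        scaled = [-y * c for c in coeffs] + [0]
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs[::-1]

def x_from_sigma(y0, ys):
    """x_i = (-1)^(i-1) * y0 * sigma_{i-1}(y_1, ..., y_{n-1})."""
    n = len(ys) + 1
    return [(-1) ** (i - 1) * y0 * sum(prod(c) for c in combinations(ys, i - 1))
            for i in range(1, n + 1)]

two = x_from_product(5, (7,))        # n = 2: x1 = y0, x2 = -y0 * y1
three_a = x_from_product(2, (1, 3))  # n = 3
three_b = x_from_sigma(2, (1, 3))
```

For $n=3$, $y_0=2$, $(y_1,y_2)=(1,3)$ both routes give $x=(2,-8,6)$, matching $2(u-1)(u-3)=2u^2-8u+6$.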
We will identify the spaces $W_{a,d}$ and $\mathcal{W}_{a,d}$ using isomorphism (6.1). In particular, this defines a $Y(\mathfrak{gl}_2)$-module structure on $\mathcal{W}_{a,d}$. We denote by $\operatorname{Sing}\mathcal{W}_{a,d}[l]\subset\mathcal{W}_{a,d}[l]$ the subspace of $\mathfrak{gl}_2$-singular vectors. Isomorphism (6.1) defines on $\mathcal{W}_{a,d}$ and its subspaces the operators which were previously defined on $W_{a,d}$ and its subspaces. Those new operators will be denoted by the same symbols. In particular, we shall consider the action of the operators $\tilde T_{ij}(u)$ and $\tilde H_0,\dots,\tilde H_n$ on $\mathcal{W}_{a,d}$.

6.2. Sklyanin's theorem.

6.2.1. Theorem [Sk]. The action of $e_{11}$, $e_{22}$, $\tilde T_{11}(u)$, $\tilde T_{22}(u)$ on $\mathcal{W}_{a,d}$ is given by the following formulae:
\[
e_{11}=\sum_{i=1}^nm_i-y_0\frac{\partial}{\partial y_0},\qquad e_{22}=y_0\frac{\partial}{\partial y_0}, \tag{6.2}
\]
\[
\tilde T_{11}(u)=\Bigl(u+e_{11}-\sum_{i=1}^nz_i+\sum_{j=1}^{n-1}y_j\Bigr)\prod_{j=1}^{n-1}(u-y_j)+\sum_{j=1}^{n-1}a(y_j)\prod_{j'\ne j}\frac{u-y_{j'}}{y_j-y_{j'}}\;\vartheta_{y_j}^{-1}, \tag{6.3}
\]
\[
\tilde T_{22}(u)=\Bigl(u+e_{22}-\sum_{i=1}^nz_i+\sum_{j=1}^{n-1}y_j\Bigr)\prod_{j=1}^{n-1}(u-y_j)+\sum_{j=1}^{n-1}d(y_j)\prod_{j'\ne j}\frac{u-y_{j'}}{y_j-y_{j'}}\;\vartheta_{y_j}, \tag{6.4}
\]
where $\vartheta_{y_j}:f(y_0,\dots,y_{n-1})\mapsto f(y_0,\dots,y_j+1,\dots,y_{n-1})$.

Proof. The proofs of formulae (6.2) are straightforward. The proofs of formulae (6.3) and (6.4) are similar. We will prove formula (6.4).

Clearly, the weight subspace $\mathcal{W}_{a,d}[l]$ is spanned by vectors of the form
\[
\tilde T_{12}(u_1)\cdots\tilde T_{12}(u_l)\cdot 1=y_0^l\prod_{i=1}^l\prod_{j=1}^{n-1}(u_i-y_j) \tag{6.5}
\]
with various $u_1,\dots,u_l$. So, it suffices to verify formula (6.4) on such vectors.

Both the expression $\tilde T_{22}(u)\,\tilde T_{12}(u_1)\cdots\tilde T_{12}(u_l)\cdot 1$ and the right-hand side of formula (6.4) applied to $\tilde T_{12}(u_1)\cdots\tilde T_{12}(u_l)\cdot 1$ are polynomials in $u$ of degree $n$. Therefore, they are uniquely determined by their coefficients at $u^n$ and $u^{n-1}$, and the values at the $n-1$ points $y_1,\dots,y_{n-1}$. Proposition 2.2.1 and formulae (2.9), (2.11), (6.5) yield that
\[
\tilde T_{22}(u)\,\tilde T_{12}(u_1)\cdots\tilde T_{12}(u_l)\cdot 1=\Bigl(u^n+\Bigl(l-\sum_{i=1}^nz_i\Bigr)u^{n-1}+O(u^{n-2})\Bigr)\,y_0^l\prod_{i=1}^l\prod_{j=1}^{n-1}(u_i-y_j)
\]
as $u\to\infty$, and
\[
\tilde T_{22}(u)\,\tilde T_{12}(u_1)\cdots\tilde T_{12}(u_l)\cdot 1\,\Big|_{u=y_j}=d(y_j)\,y_0^l\prod_{i=1}^l\Bigl((u_i-y_j-1)\prod_{j'\ne j}(u_i-y_{j'})\Bigr),
\]
which is the value obtained by applying $d(y_j)\,\vartheta_{y_j}$ to the vector (6.5). This proves the theorem.
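6.2.2. Corollary. We have
\[
\tilde B(u,\tilde H)=\tilde T_{11}(u)+\tilde T_{22}(u)=\Bigl(2u+\sum_{i=1}^n(m_i-2z_i)+2\sum_{j=1}^{n-1}y_j\Bigr)\prod_{j=1}^{n-1}(u-y_j)
+\sum_{j=1}^{n-1}\Bigl(\prod_{j'\ne j}\frac{u-y_{j'}}{y_j-y_{j'}}\Bigr)\bigl(a(y_j)\,\vartheta_{y_j}^{-1}+d(y_j)\,\vartheta_{y_j}\bigr).
\]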
6.3. Universal weight function. Let $y=(y_0,\dots,y_{n-1})$. Recall that $a=(a_1,\dots,a_l)$, $h=(h_1,\dots,h_n)$ and $p(x,a)=x^l+a_1x^{l-1}+\dots+a_l$. Let
\[
\omega(y,a)=y_0^l\prod_{j=1}^{n-1}p(y_j-1,a).
\]
This element of $\mathcal{W}_{a,d}[l]\otimes\mathbb{C}[a]\subset\mathcal{W}_{a,d}[l]\otimes\mathbb{C}[a,h]$ is called the universal weight function. A trivial but important property of the universal weight function is given by the following lemma.

6.3.1. Lemma. Consider $\mathbb{C}^{l+n}$ with coordinates $a,h$. Then for every $p\in\mathbb{C}^{l+n}$, the vector $\omega(y,a(p))$ is a nonzero vector of $\mathcal{W}_{a,d}[l]$.

Denote by $\omega_D$ the projection of the universal weight function $\omega(y,a)$ to $\mathcal{W}_{a,d}[l]\otimes A_D=W_{a,d}[l]\otimes A_D$.

6.3.2. Theorem. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Then for $s=1,\dots,n$, we have
\[
\tilde H_s\,\omega_D=h_s\,\omega_D \tag{6.6}
\]
in $W_{a,d}[l]\otimes A_D$. Moreover, we have
\[
\omega_D\in\operatorname{Sing}W_{a,d}[l]\otimes A_D\subset W_{a,d}[l]\otimes A_D. \tag{6.7}
\]
6.3.3. Corollary. Let $p$ be a point of the scheme $C_D$ considered as a set. Then
\[
\omega(y,a(p))\in\operatorname{Sing}W_{a,d}[l]. \tag{6.8}
\]
Moreover, for $s=1,\dots,n$, we have
\[
H_s\,\omega(y,a(p))=h_s(p)\,\omega(y,a(p)). \tag{6.9}
\]

Proof of Corollary 6.3.3. Let $\pi:\mathbb{C}[a,h]\to A_D$ be the canonical projection. A point $p\in C_D$ determines uniquely an algebra homomorphism $\hat p:A_D\to\mathbb{C}$ such that $f(p)=\hat p(\pi(f))$ for any $f\in\mathbb{C}[a,h]$. In particular,
\[
\omega(y,a(p))=(\mathrm{id}\otimes\hat p)(\omega_D). \tag{6.10}
\]
Therefore, formulae (6.8) and (6.9) follow from formulae (6.7) and (6.6), respectively.
6.3.4. Corollary. Let p1 , . . . pd be distinct points of the scheme C D considered as a set. Then the vectors ω( y, a( p1 )), . . . , ω( y, a( pd )) are linearly independent. Proof of Corollary 6.3.4. The vector ω( y, a( p j )) is nonzero by Lemma 6.3.1 and is an eigenvector of the operator Hs with eigenvalue h s ( p j ) by formula (6.9). Moreover, the collections of eigenvalues h( p1 ), . . . , h( pd ) are distinct, because a point p ∈ C D is uniquely determined by its coordinates h( p) by Theorem 4.3.1. The corollary is proved.
Proof of Theorem 6.3.2. To prove formula (6.6) it is enough to show that the polynomial $\bigl(\tilde B(u,\tilde H)-B(u,h)\bigr)\,\omega(y,a)$ projects to zero in $\mathbb{C}[u]\otimes W_{a,d}[l]\otimes A_D$. Let
\[
\hat B(u,y_1,\dots,y_{n-1},h)=\sum_{j=1}^{n-1}B(y_j,h)\prod_{j'\ne j}\frac{u-y_{j'}}{y_j-y_{j'}}.
\]
For $j=1,\dots,n-1$, we have $\hat B(y_j,y_1,\dots,y_{n-1},h)=B(y_j,h)$, and $\hat B(u,y_1,\dots,y_{n-1},h)$ is a polynomial in $u$ of degree $n-2$. Hence
\[
B(u,h)-\hat B(u,y_1,\dots,y_{n-1},h)=\Bigl(2u+h_1+2\sum_{j=1}^{n-1}y_j\Bigr)\prod_{j=1}^{n-1}(u-y_j).
\]
We have
\[
\bigl(\tilde B(u,\tilde H)-B(u,h)+\hat B(u,y_1,\dots,y_{n-1},h)-\hat B(u,y_1,\dots,y_{n-1},h)\bigr)\,\omega(y,a)
\]
\[
=\Bigl(-h_1+\sum_{i=1}^n(m_i-2z_i)\Bigr)\prod_{j=1}^{n-1}(u-y_j)\;\omega(y,a)
+\sum_{j=1}^{n-1}y_0^l\Bigl(\prod_{j'\ne j}\frac{u-y_{j'}}{y_j-y_{j'}}\,p(y_{j'}-1,a)\Bigr)\bigl(a(y_j)\,\vartheta_{y_j}^{-2}-B(y_j,h)\,\vartheta_{y_j}^{-1}+d(y_j)\bigr)\,p(y_j,a).
\]
Clearly all terms in the right-hand side of this formula project to zero in $\mathbb{C}[u]\otimes W_{a,d}[l]\otimes A_D$. Hence, formula (6.6) is proved.

The proof of formula (6.7) is based on the following lemma.
Lemma. We have $e_{21}e_{12}\,\omega_D=0$.

Proof. From the formula for the quantum determinant we have
\[
\tilde T_{12}(u)\,\tilde T_{21}(u-1)\,\omega(y,a)=\bigl(\tilde T_{11}(u)\,\tilde T_{22}(u-1)-a(u)\,d(u-1)\bigr)\,\omega(y,a), \tag{6.11}
\]
where $\tilde T_{12}(u)\,\tilde T_{21}(u-1)=e_{21}e_{12}\,u^{2n-2}+O(u^{2n-3})$. Therefore, our goal is to calculate the coefficient of $u^{2n-2}$ in the right-hand side. We have
\[
T_{11}(u)\,T_{22}(u-1)=1+\frac{e_{11}+e_{22}}{u}+\frac{e_{22}(e_{11}+1)+T_{11}^{\{2\}}+T_{22}^{\{2\}}}{u^2}+O(u^{-3}).
\]
Hence
\[
T_{11}(u)\,T_{22}(u-1)-T_{11}(u)-T_{22}(u)+1=\frac{e_{22}(e_{11}+1)}{u^2}+O(u^{-3})
\]
and
\[
\tilde T_{11}(u)\,\tilde T_{22}(u-1)-\tilde B(u,\tilde H)\,d(u-1)+d(u)\,d(u-1)=e_{22}(e_{11}+1)\,u^{2n-2}+O(u^{2n-3}).
\]
Thus the right-hand side of (6.11) equals
\[
\bigl(\tilde B(u,\tilde H)-a(u)-d(u)\bigr)\,d(u-1)\,\omega(y,a)+e_{22}(e_{11}+1)\,u^{2n-2}\,\omega(y,a)+O(u^{2n-3}).
\]
Here
\[
e_{22}(e_{11}+1)\,\omega(y,a)=l\Bigl(\sum_{i=1}^nm_i-l+1\Bigr)\omega(y,a),\qquad
\tilde B(u,\tilde H)\,\omega(y,a)=B(u,h)\,\omega(y,a),
\]
\[
a(u)+d(u)=2u^n-\sum_{s=1}^n(2z_s-m_s)\,u^{n-1}+\sum_{1\le i<j\le n}\bigl(z_iz_j+(z_i-m_i)(z_j-m_j)\bigr)u^{n-2}+\cdots
\]
and $B(u,h)=2u^n+h_1u^{n-1}+h_2u^{n-2}+\cdots$. Therefore, the right-hand side of (6.11) equals
\[
\Bigl(h_1+\sum_{s=1}^n(2z_s-m_s)\Bigr)u^{n-1}\,d(u-1)\,\omega(y,a)
+\Bigl(h_2+l\Bigl(\sum_{i=1}^nm_i-l+1\Bigr)-\sum_{1\le i<j\le n}\bigl(z_iz_j+(z_i-m_i)(z_j-m_j)\bigr)\Bigr)u^{2n-2}\,\omega(y,a)+O(u^{2n-3}).
\]
Clearly the first two terms of this expression project to zero in C[u] ⊗ Wa,d [ l ] ⊗ A D . This proves the lemma. In order to deduce formula (6.7) from the lemma, it is enough to notice that the operator e21 is injective, in variables y it is the operator of multiplication by y0 . Therefore, e12 ω D = 0. Theorem 6.3.2 is proved.
7. Multiplication in Algebra $A_D$ and Bethe Algebra $A_W$

7.1. Multiplication in $A_D$. By Theorem 4.2.1, the scheme $C_D$ considered as a set is finite, and the algebra $A_D$ is the direct sum of local algebras, $A_D=\oplus_pA_{p,D}$, corresponding to the points $p$ of the set $C_D$. The local algebra $A_{p,D}$ may be defined as the quotient of the algebra of germs at $p$ of holomorphic functions in $a,h$ modulo the ideal $I_{p,D}$ generated by all functions $q_1,q_2,\dots,q_{l+n}$. The local algebra $A_{p,D}$ contains the maximal ideal $\mathfrak{m}_p$ generated by germs which are zero at $p$.

For $f\in A_D$, denote by $L_f$ the linear operator $A_D\to A_D$, $g\mapsto fg$, of multiplication by $f$. Consider the dual space $A_D^*=\oplus_pA_{p,D}^*$ and the dual operators $L_f^*:A_D^*\to A_D^*$. Every summand $A_{p,D}^*$ contains the distinguished one-dimensional subspace $\mathfrak{m}_p^\perp$ which is the annihilator of $\mathfrak{m}_p$.

7.1.1. Lemma [MTV3].
(i) For any point $p$ of the scheme $C_D$ considered as a set and any $f\in A_D$, we have $L_f^*(\mathfrak{m}_p^\perp)\subset\mathfrak{m}_p^\perp$.
(ii) For any point $p$ of the scheme $C_D$ considered as a set, if $W\subset A_{p,D}^*$ is a nonzero vector subspace invariant with respect to all operators $L_f^*$, $f\in A_D$, then $W$ contains $\mathfrak{m}_p^\perp$.

Proof. For any $f\in\mathfrak{m}_p$ we have $L_f^*(\mathfrak{m}_p^\perp)=0$. This proves part (i).

To prove part (ii) we consider the filtration of $A_{p,D}$ by powers of the maximal ideal, $A_{p,D}\supset\mathfrak{m}_p\supset\mathfrak{m}_p^2\supset\dots\supset\{0\}$. We consider a linear basis $\{f_{a,b}\}$ of $A_{p,D}$, $a=0,1,\dots$, $b=1,2,\dots$, which agrees with this filtration. Namely, we assume that for every $i$, the subset of all vectors $f_{a,b}$ with $a\ge i$ is a basis of $\mathfrak{m}_p^i$. Since $\dim A_{p,D}/\mathfrak{m}_p=1$, there is only one basis vector with $a=0$, and we also assume that this vector $f_{0,1}$ is the image of $1$ in $A_{p,D}$. Let $\{f^{a,b}\}$ denote the dual basis of $A_{p,D}^*$. Then the vector $f^{0,1}$ generates $\mathfrak{m}_p^\perp$.

Let $w=\sum_{a,b}c_{a,b}f^{a,b}$ be a nonzero vector in $W$. Let $a_0$ be the maximum value of $a$ such that there exists $b$ with a nonzero $c_{a,b}$. Let $b_0$ be such that $c_{a_0,b_0}$ is nonzero. Then it is easy to see that $L^*_{f_{a_0,b_0}}w=c_{a_0,b_0}f^{0,1}$. Hence $W$ contains $\mathfrak{m}_p^\perp$.
7.2. Linear map $\tau:A_D^*\to\operatorname{Sing}W_{a,d}[l]$. Let $f_1,\dots,f_\mu$ be a basis of $A_D$ considered as a vector space over $\mathbb{C}$. Write
\[
\omega_D=\sum_iv_i\otimes f_i\qquad\text{with}\qquad v_i\in\operatorname{Sing}\mathcal{W}_{a,d}[l]=\operatorname{Sing}W_{a,d}[l]. \tag{7.1}
\]
Denote by $V\subset\operatorname{Sing}W_{a,d}[l]$ the vector subspace spanned by $v_1,\dots,v_\mu$. Define the linear map
\[
\tau:A_D^*\to\operatorname{Sing}W_{a,d}[l],\qquad g\mapsto g(\omega_D)=\sum_ig(f_i)\,v_i. \tag{7.2}
\]
Clearly, $V$ is the image of $\tau$.

7.2.1. Lemma. Let $p$ be a point of $C_D$ considered as a set. Let $\omega(y,a(p))\in\mathcal{W}_{a,d}[l]=W_{a,d}[l]$ be the value of the universal weight function at $p$. Then the vector $\omega(y,a(p))$ belongs to the image of $\tau$.

Proof. The statement follows from formula (6.10).
Let $\psi_{DW}:A_D\to A_W$ be the epimorphism defined in Theorem 4.3.3.

7.2.2. Lemma. Assume that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating. Then for any $f\in A_D$ and $g\in A_D^*$, we have $\tau(L_f^*(g))=\psi_{DW}(f)(\tau(g))$. In other words, the map $\tau$ intertwines the action of the algebra of multiplication operators $L_f^*$ on $A_D^*$ and the action of the Bethe algebra on $\operatorname{Sing}W_{a,d}[l]$.

Proof. The algebra $A_D$ is generated by $h_1,\dots,h_n$. It is enough to prove that for any $s$ we have $\tau(L_{h_s}^*(g))=H_s(\tau(g))$. But
\[
\tau(L_{h_s}^*(g))=\sum_ig(h_sf_i)\,v_i=g\Bigl(\sum_iv_i\otimes h_sf_i\Bigr)=g\Bigl(\sum_iH_sv_i\otimes f_i\Bigr)=H_s(\tau(g)).
\]

7.2.3. Corollary. The vector subspace $V\subset\operatorname{Sing}W_{a,d}[l]$ is invariant with respect to the action of the Bethe algebra $A_W$, and the kernel of $\tau$ is a subspace of $A_D^*$ invariant with respect to the multiplication operators $L_f^*$, $f\in A_D$.
7.3. First main theorem. 7.3.1. Theorem. Assume that the pair ((m 1 , 0), . . . , (m n , 0)), l is separating. Then the image of τ is Sing Wa,d [ l ] and the kernel of τ is zero.
7.3.2. Corollary. The map $\tau$ identifies the action of the operators $L_f^*$, $f\in A_D$, on $A_D^*$ and the action of the Bethe algebra on $\operatorname{Sing}W_{a,d}[l]$. Hence the epimorphism $\psi_{DW}:A_D\to A_W$ is an isomorphism.

Proof of Theorem 7.3.1. First we will show that $\tau$ is an epimorphism for generic $z$. Let $d_l=\dim\operatorname{Sing}W_{a,d}[l]$. Corollary 5.1.4 says that for generic $z$ there exist $d_l$ distinct points $p_1,\dots,p_{d_l}$ in $C_D$. By Corollary 6.3.4, the vectors $\omega(y,a(p_1)),\dots,\omega(y,a(p_{d_l}))$ are linearly independent and hence form a basis of $\operatorname{Sing}W_{a,d}[l]$. Therefore, $\tau$ is an epimorphism for generic $z$ by Lemma 7.2.1.

By Theorem 4.2.1 and Lemma 2.5.3, the dimensions of $A_D$ and $\operatorname{Sing}W_{a,d}[l]$ do not depend on $z$. Hence $\dim A_D\ge\dim\operatorname{Sing}W_{a,d}[l]$ for all $z_1,\dots,z_n$. Therefore, to prove Theorem 7.3.1 it remains to prove that $\tau$ has zero kernel.

Denote the kernel of $\tau$ by $K$. Let $A_D=\oplus_pA_{p,D}$ be the decomposition into the direct sum of local algebras. Since $K$ is invariant with respect to the multiplication operators, we have $K=\oplus_p\,K\cap A_{p,D}^*$, and for every $p$ the vector subspace $K\cap A_{p,D}^*$ is invariant with respect to the multiplication operators. By Lemma 7.1.1, if $K\cap A_{p,D}^*$ is nonzero, then $K\cap A_{p,D}^*$ contains the one-dimensional subspace $\mathfrak{m}_p^\perp$.

Let $\{f_{a,b}\}$ be the basis of $A_{p,D}$ constructed in the proof of Lemma 7.1.1, and let $\{f^{a,b}\}$ be the dual basis of $A_{p,D}^*$. Then the vector $f^{0,1}$ generates $\mathfrak{m}_p^\perp$. By the definition of $\tau$, the vector $\tau(f^{0,1})$ is equal to the value of the universal weight function at $p$. By Lemma 6.3.1, this value is nonzero, which contradicts the assumption that $f^{0,1}$ lies in the kernel of $\tau$.
7.4. Grothendieck bilinear form on $A_D$. Realize the algebra $A_D$ as $\mathbb{C}[h]/I'_D$, where $I'_D$ is the ideal generated by the $n$ polynomials $q_1(h)$, $q_2(h)$, $q_j(a(h),h)$, $j=l+3,\dots,l+n$, see (4.3). Let $\rho:A_D\to\mathbb{C}$ be the Grothendieck residue,
\[
f\mapsto\frac{1}{(2\pi i)^n}\operatorname{Res}_{C'_D}\frac{f}{q_1(h)\,q_2(h)\prod_{j=l+3}^{l+n}q_j(a(h),h)}.
\]
Let $(\,,\,)_D$ be the Grothendieck symmetric bilinear form on $A_D$ defined by the rule $(f,g)_D=\rho(fg)$. The Grothendieck bilinear form is nondegenerate.

The form $(\,,\,)_D$ determines a linear isomorphism $\phi:A_D\to A_D^*$, $f\mapsto(f,\cdot)_D$.

7.4.1. Lemma. The isomorphism $\phi$ intertwines the operators $L_f$ and $L_f^*$ for any $f\in A_D$.

Proof. For $g\in A_D$ we have $\phi(L_f(g))=\phi(fg)=(fg,\cdot)_D=(g,f\,\cdot)_D=L_f^*((g,\cdot)_D)=L_f^*\phi(g)$.
7.4.2. Corollary. Assume that the pair ((m 1 , 0), . . . , (m n , 0)) , l is separating. Then the composition τ φ : A D → Sing Wa,d [ l ] is a linear isomorphism which intertwines the algebra of multiplication operators on A D and the action of the Bethe algebra A W on Sing Wa,d [ l ].
8. Algebra $A_G$

8.1. New conditions on $(m_1,0),\dots,(m_n,0)$, $l$. In the remainder of the paper we assume that $\Lambda=(\Lambda^{(1)},\dots,\Lambda^{(n)})=((m_1,0),\dots,(m_n,0))$ is a collection of dominant integral $\mathfrak{gl}_2$-weights, that is, $m_s\in\mathbb{Z}_{\ge 0}$ for $s=1,\dots,n$.

We assume that $l\in\mathbb{Z}_{\ge 0}$ is such that the weight $\bigl(\sum_{s=1}^nm_s-l,\;l\bigr)$ is dominant integral, that is, $\sum_{s=1}^nm_s-l\ge l$. This assumption implies that the pair $((m_1,0),\dots,(m_n,0)),\,l$ is separating.

Let $\tilde l=\sum_{s=1}^nm_s+1-l$. We have $\tilde l>l$.

8.2. Wronskian. The (discrete) Wronskian of polynomials $f,g\in\mathbb{C}[u]$ is the polynomial
\[
\operatorname{Wr}(f(u),g(u))=f(u)\,g(u-1)-f(u-1)\,g(u).
\]

8.2.1. Lemma. Let $f,g,B\in\mathbb{C}[u]$. Assume that $f,g$ are monic polynomials of degrees $l,\tilde l$, respectively, that lie in the kernel of the difference operator $d(u)-B(u)\,\vartheta^{-1}+a(u)\,\vartheta^{-2}$. Then
\[
\operatorname{Wr}(f(u),g(u))=(l-\tilde l)\prod_{s=1}^n\prod_{j=1}^{m_s}(u-z_s+j).
\]

Proof. Let $C(u)=\operatorname{Wr}(f(u),g(u))$. Then the top coefficient of $C(u)$ equals $l-\tilde l$, and
\[
\frac{C(u)}{C(u-1)}=\frac{a(u)}{d(u)},
\]
which determines the polynomial $C(u)$ uniquely.

8.2.2. Lemma. Let $f,g\in\mathbb{C}[u]$, $z\in\mathbb{C}$, $m\in\mathbb{Z}_{>0}$. Assume that $f(z-j)=0$ for $j=1,\dots,m+1$. Then the polynomial $\operatorname{Wr}(f(u),g(u))$ is equal to zero at $u=z-j$, $j=1,\dots,m$, and the polynomial $f(u)\,g(u-2)-f(u-2)\,g(u)$ is equal to zero at $u=z-j$, $j=1,\dots,m-1$.

8.2.3. Lemma. Let $f,g,C\in\mathbb{C}[u]$, $z\in\mathbb{C}$, and $\operatorname{Wr}(f(u),g(u))=C(u)$.
(i) If $C(z)\ne 0$ and $f(z-1)=0$, then $g(z-1)\ne 0$.
(ii) If $C(z)\ne 0$ and $f(z)=0$, then $f(z-1)\ne 0$.

8.2.4. Lemma. Let $f,g\in\mathbb{C}[u]$, $z\in\mathbb{C}$. Then
\[
\operatorname{Wr}((u-z)f(u),(u-z)g(u))=(u-z)(u-z-1)\operatorname{Wr}(f(u),g(u)).
\]
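The Wronskian identities of this section can be checked on explicit polynomials. The sketch below verifies Lemma 8.2.1 and Lemma 8.2.4 for hypothetical data $n=2$, $m=(1,1)$, $z=(0,3)$, $l=1$, $\tilde l=2$, with $f(u)=u$ and $g(u)=u^2+2$. These two polynomials were found by hand to lie in the kernel of $d(u)-B(u)\vartheta^{-1}+a(u)\vartheta^{-2}$ with $B(u)=2u^2-4u-4$, under the assumption $a(u)=(u+1)(u-2)$, $d(u)=u(u-3)$. The helper names are illustrative.

```python
# Polynomials in u are coefficient lists, constant term first.

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def add(p, q, sign=1):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + sign * y for x, y in zip(p, q)]

def shift(p, c):
    """Return the coefficients of p(u - c)."""
    out, power = [0], [1]
    for coef in p:
        out = add(out, [coef * x for x in power])
        power = mul(power, [-c, 1])
    return out

def trim(p):
    """Drop zero coefficients of the highest powers."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def wr(f, g):
    """Discrete Wronskian f(u) g(u-1) - f(u-1) g(u)."""
    return trim(add(mul(f, shift(g, 1)), mul(shift(f, 1), g), -1))

f = [0, 1]                           # f(u) = u, degree l = 1
g = [2, 0, 1]                        # g(u) = u^2 + 2, degree l~ = 2

# Lemma 8.2.1: Wr(f, g) = (l - l~) (u + 1)(u - 2) for the assumed data.
wr_fg = wr(f, g)
predicted = [-c for c in mul([1, 1], [-2, 1])]

# Lemma 8.2.4 with the sample value z = 5.
z0 = 5
lin = [-z0, 1]                       # u - z
lhs = wr(mul(lin, f), mul(lin, g))
rhs = trim(mul(mul(lin, [-z0 - 1, 1]), wr_fg))
```

Here `wr_fg` and `predicted` both equal `[2, 1, -1]`, that is $-u^2+u+2$, and the two sides of Lemma 8.2.4 agree coefficient by coefficient.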
8.3. Intersection of Schubert cycles $C_G$. Let $d$ be a sufficiently large natural number with respect to the numbers $m_1,\dots,m_n$ considered in Sect. 8.1. Let $\mathbb{C}_d[u]$ be the vector subspace in $\mathbb{C}[u]$ of polynomials of degree not greater than $d$.

8.3.1. Denote by $G$ the Grassmannian of all two-dimensional vector subspaces in $\mathbb{C}_d[u]$. Let
\[
\mathcal{F}=\{0=F_{d+1}\subset F_d\subset\dots\subset F_1\subset F_0=\mathbb{C}_d[u]\}
\]
be a complete flag and $\Lambda=(a,b)$ a $\mathfrak{gl}_2$ dominant integral weight such that $d\ge a\ge b\ge 0$ and $a,b\in\mathbb{Z}$. Define a Schubert cell $C^o_{\mathcal{F},\Lambda}\subset G$ to be the set of all two-dimensional subspaces $V\subset\mathbb{C}_d[u]$ having a basis $f,g$ such that
\[
f\in F_{a+1}-F_{a+2}\qquad\text{and}\qquad g\in F_b-F_{b+1}.
\]
Define a Schubert cycle $C_{\mathcal{F},\Lambda}\subset G$ as the closure of the Schubert cell $C^o_{\mathcal{F},\Lambda}$.

For $z\in\mathbb{C}$ and $i\in\mathbb{Z}_{>0}$, set
\[
\varphi_i(u,z)=\prod_{j=1}^i(u-z+j).
\]
Introduce a complete flag in $\mathbb{C}_d[u]$:
\[
\mathcal{F}(z)=\{0=F_{d+1}(z)\subset F_d(z)\subset\dots\subset F_1(z)\subset F_0(z)=\mathbb{C}_d[u]\},
\]
where $F_i(z)$ consists of all polynomials divisible by $\varphi_i(u,z)$. Introduce the complete flag in $\mathbb{C}_d[u]$ associated with infinity:
\[
\mathcal{F}(\infty)=\{0=F_{d+1}(\infty)\subset F_d(\infty)\subset\dots\subset F_1(\infty)\subset F_0(\infty)=\mathbb{C}_d[u]\},
\]
where $F_i(\infty)$ consists of all polynomials of degree not greater than $d-i$.

We consider the Schubert cells $C^o_{\mathcal{F}(z_s),\Lambda^{(s)}}\subset G$, $s=1,\dots,n$, where $\Lambda^{(s)}=(m_s,0)$, and the Schubert cell $C^o_{\mathcal{F}(\infty),\Lambda^{(\infty)}}\subset G$, where $\Lambda^{(\infty)}=(d-l,\,d-\tilde l-1)$. The cell $C^o_{\mathcal{F}(z_s),\Lambda^{(s)}}$ is the set of all two-dimensional subspaces $V\subset\mathbb{C}_d[u]$ having a basis $f,g$ such that
\[
g(z_s-1)\ne 0,\qquad f(z_s-m_s-2)\ne 0,\qquad f(z_s-j)=0\ \text{ for } j=1,\dots,m_s+1,
\]
and the cell $C^o_{\mathcal{F}(\infty),\Lambda^{(\infty)}}$ is the set of all two-dimensional subspaces $V\subset\mathbb{C}_d[u]$ having a basis $f,g$ such that $\deg f=l$ and $\deg g=\tilde l$.

Consider the (scheme-theoretic) intersection
\[
C_G=C_{\mathcal{F}(\infty),\Lambda^{(\infty)}}\ \cap\ \bigcap_{s=1}^nC_{\mathcal{F}(z_s),\Lambda^{(s)}} \tag{8.1}
\]
of the corresponding Schubert cycles. Denote by $A_G$ the algebra of functions on $C_G$.
8.3.2. Lemma. Let $z_i-z_j\notin\mathbb{Z}$ for $i\ne j$. Then
\[
C_G=C^o_{\mathcal{F}(\infty),\Lambda^{(\infty)}}\ \cap\ \bigcap_{s=1}^nC^o_{\mathcal{F}(z_s),\Lambda^{(s)}}
\]
as sets.

Proof. Let $V$ be a point of $C_{\mathcal{F}(\infty),\Lambda^{(\infty)}}\cap\bigcap_{s=1}^nC_{\mathcal{F}(z_s),\Lambda^{(s)}}$. Let $f,g$ be a monic basis of $V$ such that $\deg f\le l$ and $\deg g\le\tilde l$. Then $\deg\operatorname{Wr}(f(u),g(u))\le l+\tilde l-1$. On the other hand, the polynomial $\operatorname{Wr}(f(u),g(u))$ is divisible by $\prod_{s=1}^n\prod_{j=1}^{m_s}(u-z_s+j)$ by Lemma 8.2.2. Since $\sum_{s=1}^nm_s=l+\tilde l-1$, we conclude that $\deg f=l$, $\deg g=\tilde l$, $V$ is a point of $C^o_{\mathcal{F}(\infty),\Lambda^{(\infty)}}$, and
\[
\operatorname{Wr}(f(u),g(u))=(l-\tilde l)\prod_{s=1}^n\prod_{j=1}^{m_s}(u-z_s+j).
\]
Since a suitable linear combination of $f$ and $g$ is divisible by $\varphi_{m_s+1}(u,z_s)$, the subspace $V$ is a point of $C^o_{\mathcal{F}(z_s),\Lambda^{(s)}}$ by Lemma 8.2.3.
8.3.3. Let $V$ be a point of $C_G$ considered as a set. Then there exists a unique basis $f,g$ of $V$ such that
\[
f(u)=u^l+f_1u^{l-1}+\dots+f_l,
\]
\[
g(u)=u^{\tilde l}+g_1u^{\tilde l-1}+\dots+g_{\tilde l-l-1}u^{l+1}+g_{\tilde l-l+1}u^{l-1}+\dots+g_{\tilde l}
\]
for suitable complex numbers $f_1,\dots,f_l$, $g_1,\dots,g_{\tilde l-l-1}$, $g_{\tilde l-l+1},\dots,g_{\tilde l}$.

8.3.4. Lemma. Let $z_i-z_j\notin\mathbb{Z}$ for $i\ne j$. Then all polynomials of the subspace $V$ are annihilated by the difference operator $D_V=d(u)-B_V(u)\,\vartheta^{-1}+a(u)\,\vartheta^{-2}$, where
\[
B_V(u)=\frac{1}{\tilde l-l}\,\bigl(g(u)\,f(u-2)-g(u-2)\,f(u)\bigr)\prod_{s=1}^n\prod_{j=1}^{m_s-1}(u-z_s+j)^{-1}
\]
is a polynomial of degree $n$.

Proof. Let $W(u)=\operatorname{Wr}(f(u),g(u))$. It is straightforward to see that all polynomials of the subspace $V$ are annihilated by the difference operator
\[
W(u-1)-\bigl(f(u)\,g(u-2)-f(u-2)\,g(u)\bigr)\,\vartheta^{-1}+W(u)\,\vartheta^{-2}. \tag{8.2}
\]
Since
\[
W(u)=(l-\tilde l)\prod_{s=1}^n\prod_{j=1}^{m_s}(u-z_s+j),
\]
see the proof of Lemma 8.3.2, and all coefficients of the difference operator (8.2) are divisible by $\prod_{s=1}^n\prod_{j=1}^{m_s-1}(u-z_s+j)$ by Lemma 8.2.2, the statement follows after dividing (8.2) by $(l-\tilde l)\prod_{s=1}^n\prod_{j=1}^{m_s-1}(u-z_s+j)$.
Write
BV (u) = 2u n + h 1 u n−1 + · · · + h n .
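The formula for $B_V(u)$ in Lemma 8.3.4 can be evaluated on a small example. The sketch below uses hypothetical data $n=2$, $m=(1,1)$, $z=(0,\tfrac12)$, so that $z_1-z_2\notin\mathbb{Z}$, with $l=1$, $\tilde l=2$ and the basis $f(u)=u+\tfrac54$, $g(u)=u^2-\tfrac74$ normalized as in Sect. 8.3.3. The basis was found by hand from Lemma 8.2.1. Since every $m_s=1$, the product over $j=1,\dots,m_s-1$ is empty and no polynomial division is needed. Helper names are illustrative.

```python
from fractions import Fraction

# Polynomials in u are coefficient lists, constant term first.

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def add(p, q, sign=1):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + sign * y for x, y in zip(p, q)]

def shift(p, c):
    """Return the coefficients of p(u - c)."""
    out, power = [0], [1]
    for coef in p:
        out = add(out, [coef * x for x in power])
        power = mul(power, [-c, 1])
    return out

def trim(p):
    """Drop zero coefficients of the highest powers."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

l, l_tilde = 1, 2
f = [Fraction(5, 4), 1]              # f(u) = u + 5/4
g = [Fraction(-7, 4), 0, 1]          # g(u) = u^2 - 7/4

# Wr(f, g) should equal (l - l~)(u + 1)(u + 1/2) for the assumed data.
wronskian = trim(add(mul(f, shift(g, 1)), mul(shift(f, 1), g), -1))

# B_V(u) = (g(u) f(u-2) - g(u-2) f(u)) / (l~ - l); empty product for m_s = 1.
numerator = add(mul(g, shift(f, 2)), mul(shift(g, 2), f), -1)
B_V = trim([Fraction(c) / (l_tilde - l) for c in numerator])
```

The result is $B_V(u)=2u^2+u-\tfrac32$, whose coefficients $h_1=1$ and $h_2=-\tfrac32$ agree with the values forced by $q_1(h)=0$ and $q_2(h)=0$ in Sect. 4.1 for this data.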
Recall that the scheme C D is defined in Sect. 4.1.
8.3.5. Corollary of Lemma 8.3.4. Consider the schemes $C_D$ and $C_G$ as sets. Then the assignment $V\mapsto(f_1,\dots,f_l,h_1,\dots,h_n)\in\mathbb{C}^{l+n}$ defines an injective map of sets $C_G\to C_D$.

8.3.6. Theorem. Let $z_i-z_j\notin\mathbb{Z}$ for $i\ne j$. Assume that $V\in G$ has a basis $f,g$ such that $\deg f=l$ and $\deg g=\tilde l$, and $V$ is annihilated by a difference operator of the form $d(u)-B(u)\,\vartheta^{-1}+a(u)\,\vartheta^{-2}$, where $B(u)$ is a polynomial. Then $V$ is a point of $C_G$.

The proof is similar to the proof of Theorem 7.2 in [MTV2].

8.4. Algebra $A_G$.

8.4.1. Lemma. Let $z_i-z_j\notin\mathbb{Z}$ for $i\ne j$. Then $A_G$ considered as a vector space is finite-dimensional. Moreover, this dimension does not depend on $z$.

Proof. The claim follows from Corollary 8.3.5, and the reasoning is similar to the proof of Theorem 4.2.1.
Under the conditions of Lemma 8.4.1, the dimension of $A_G$ as a vector space is given by Schubert calculus. Namely, let $\Lambda=(\Lambda^{(1)},\dots,\Lambda^{(n)})$ be the collection of $\mathfrak{gl}_2$-highest weights, where $\Lambda^{(s)}=(m_s,0)$. Denote by $L_\Lambda=L_{\Lambda^{(1)}}\otimes\dots\otimes L_{\Lambda^{(n)}}$ the tensor product of irreducible $\mathfrak{gl}_2$-modules with highest weights $\Lambda^{(1)},\dots,\Lambda^{(n)}$, respectively. Let $\operatorname{Sing}L_\Lambda[l]$ be the subspace of $L_\Lambda$ of $\mathfrak{gl}_2$-singular vectors of weight $\bigl(\sum_{s=1}^nm_s-l,\;l\bigr)$. Then by Schubert calculus,
\[
\dim A_G=\dim\operatorname{Sing}L_\Lambda[l], \tag{8.3}
\]
see [Fu].
8.5. Presentation of algebra $A_G$. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, we shall use the following presentation of the algebra $A_G$: Let $\tilde a = (\tilde a_1, \dots, \tilde a_{\tilde l - l - 1}, \tilde a_{\tilde l - l + 1}, \dots, \tilde a_{\tilde l})$.
Consider the space $\mathbb{C}^{\tilde l + l + n - 1}$ with coordinates $\tilde a, a, h$, cf. Sect. 4.1. Denote by $\tilde p(u, \tilde a)$ the following polynomial in $u$ depending on parameters $\tilde a$:
$$\tilde p(u, \tilde a) = u^{\tilde l} + \tilde a_1 u^{\tilde l - 1} + \cdots + \tilde a_{\tilde l - l - 1} u^{l+1} + \tilde a_{\tilde l - l + 1} u^{l-1} + \cdots + \tilde a_{\tilde l}.$$
Recall that $p(u, a) = u^l + a_1 u^{l-1} + \cdots + a_l$ and $B(u, h) = 2u^n + h_1 u^{n-1} + \cdots + h_n$. Let us write
$$\operatorname{Wr}(\tilde p(u, \tilde a), p(u, a)) = (\tilde l - l)\,u^{\tilde l + l - 1} + w_1(\tilde a, a)\,u^{\tilde l + l - 2} + \cdots + w_{\tilde l + l - 1}(\tilde a, a),$$
$$\tilde p(u, \tilde a)\,p(u - 2, a) - \tilde p(u - 2, \tilde a)\,p(u, a) = 2(\tilde l - l)\,u^{\tilde l + l - 1} + \hat w_1(\tilde a, a)\,u^{\tilde l + l - 2} + \cdots + \hat w_{\tilde l + l - 1}(\tilde a, a)$$
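for suitable polynomials $w_1, \dots, w_{\tilde l + l - 1}, \hat w_1, \dots, \hat w_{\tilde l + l - 1}$ in variables $\tilde a, a$. Let us write
$$(\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j) = (\tilde l - l)\,u^{\tilde l + l - 1} + c_1 u^{\tilde l + l - 2} + \cdots + c_{\tilde l + l - 1},$$
$$(\tilde l - l)\, B(u, h) \prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j) = 2(\tilde l - l)\,u^{\tilde l + l - 1} + \hat c_1(h)\,u^{\tilde l + l - 2} + \cdots + \hat c_{\tilde l + l - 1}(h),$$
for suitable numbers $c_1, \dots, c_{\tilde l + l - 1}$ and polynomials $\hat c_1, \dots, \hat c_{\tilde l + l - 1}$ in variables $h$. Denote by $I_G$ the ideal in $\mathbb{C}[\tilde a, a, h]$ generated by $2(\tilde l + l - 1)$ polynomials
$$w_i(\tilde a, a) - c_i, \qquad \hat w_i(\tilde a, a) - \hat c_i(h), \qquad i = 1, \dots, \tilde l + l - 1. \tag{8.4}$$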
8.5.1. Lemma. Let $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$. Then $A_G = \mathbb{C}[\tilde a, a, h]/I_G$.
Proof. The scheme defined by the ideal $I_G$ consists of points $p$ such that
$$\operatorname{Wr}(\tilde p(u, \tilde a(p)), p(u, a(p))) = (\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j),$$
$$\tilde p(u, \tilde a)\,p(u - 2, a) - \tilde p(u - 2, \tilde a)\,p(u, a) = (\tilde l - l)\, B(u, h(p)) \prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j).$$
Hence, the polynomials $\tilde p(u, \tilde a(p))$, $p(u, a(p))$ span a vector subspace $V$ lying in the intersection $C_G$, see Theorem 8.3.6. Conversely, if $V$ is a point of $C_G$, then $V$ has a basis $f, g$ like in Lemma 8.3.4. Then by Lemma 8.3.4 we have
$$\operatorname{Wr}(g(u), f(u)) = (\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j),$$
$$g(u)\,f(u - 2) - g(u - 2)\,f(u) = (\tilde l - l)\, B(u) \prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j)$$
for a suitable polynomial $B(u)$. Hence, the triple $g, f, B$ determines a point $p$, whose coordinates satisfy Eqs. (8.4).
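The leading coefficients $(\tilde l - l)$ and $2(\tilde l - l)$ in the two expansions of Sect. 8.5 can be checked numerically. The following is a minimal sketch, assuming that Wr denotes the discrete Wronskian $\operatorname{Wr}(f, g) = f(u)g(u-1) - f(u-1)g(u)$ (its definition lies outside this section); the sample polynomials and the helper names `shift`, `mul`, `sub`, `trim`, `shifted_wronskian` are hypothetical and chosen only for illustration.

```python
from math import comb

def shift(p, c):
    """Coefficients of p(u - c); p lists coefficients, lowest degree first."""
    out = [0] * len(p)
    for k, coeff in enumerate(p):
        for i in range(k + 1):
            # (u - c)^k = sum_i C(k, i) u^i (-c)^(k - i)
            out[i] += coeff * comb(k, i) * (-c) ** (k - i)
    return out

def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def trim(p):
    """Drop vanishing top coefficients."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def shifted_wronskian(pt, p, c):
    """pt(u) p(u - c) - pt(u - c) p(u)."""
    return trim(sub(mul(pt, shift(p, c)), mul(shift(pt, c), p)))

# Hypothetical sample with l = 2 and l~ = 4; the u^l coefficient of pt is
# absent, as in the definition of p~(u, a~).
p = [5, -3, 1]          # u^2 - 3u + 5
pt = [7, 3, 0, -1, 1]   # u^4 - u^3 + 3u + 7

w1 = shifted_wronskian(pt, p, 1)  # assumed form of Wr(p~, p)
w2 = shifted_wronskian(pt, p, 2)  # p~(u) p(u-2) - p~(u-2) p(u)
```

For this sample both results have degree $\tilde l + l - 1 = 5$, and their top coefficients are $\tilde l - l = 2$ and $2(\tilde l - l) = 4$ respectively, independently of the lower coefficients chosen.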
9. Algebras $A_P$ and $A_L$
9.1. Algebra $A_P$. Consider the space $\mathbb{C}^{\tilde l + l + n - 1}$ with coordinates $\tilde a, a, h$. Let $D_h = d(u) - B(u, h)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$ be the difference operator defined in (4.1). If $h$ satisfies equations $q_1(h) = 0$ and $q_2(h) = 0$, then the polynomial $D_h(\tilde p(u, \tilde a))$ is a polynomial in $u$ of degree $\tilde l + n - 3$,
$$D_h(\tilde p(u, \tilde a)) = \tilde q_3(\tilde a, h)\,u^{\tilde l + n - 3} + \cdots + \tilde q_{\tilde l + n}(\tilde a, h).$$
The coefficients $\tilde q_i(\tilde a, h)$ are functions linear in $\tilde a$ and linear in $h$.
Recall that if $p(u, a) = u^l + a_1 u^{l-1} + \cdots + a_l$, and $h$ satisfies equations $q_1(h) = 0$ and $q_2(h) = 0$, then the polynomial $D_h(p(u, a))$ is a polynomial in $u$ of degree $l + n - 3$,
$$D_h(p(u, a)) = q_3(a, h)\,u^{l + n - 3} + \cdots + q_{l + n}(a, h).$$
Denote by $I_P$ the ideal in $\mathbb{C}[\tilde a, a, h]$ generated by polynomials $q_1, q_2, q_3, \dots, q_{l+n}, \tilde q_3, \dots, \tilde q_{\tilde l + n}$. The ideal $I_P$ defines a scheme $C_P \subset \mathbb{C}^{\tilde l + l + n - 1}$. The algebra $A_P = \mathbb{C}[\tilde a, a, h]/I_P$ is the algebra of functions on $C_P$. The scheme $C_P$ is the scheme of points $p \in \mathbb{C}^{\tilde l + l + n - 1}$ such that the difference equation $D_{h(p)}\, w(u) = 0$ has two polynomial solutions $\tilde p(u, \tilde a(p))$ and $p(u, a(p))$.
9.2. Isomorphism $\psi_{GP} : A_G \to A_P$.
9.2.1. Theorem. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, then the identity map $\mathbb{C}^{\tilde l + l + n - 1} \to \mathbb{C}^{\tilde l + l + n - 1}$ induces an algebra isomorphism $\psi_{GP} : A_G \to A_P$.
Proof. If $p$ is a point of $C_P$, then polynomials $\tilde p(u, \tilde a(p))$, $p(u, a(p))$ are annihilated by the difference operator $d(u) - B(u, h(p))\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, then the span $V$ of polynomials $\tilde p(u, \tilde a(p))$, $p(u, a(p))$ is a point of $C_G$ by Theorem 8.3.6. This reasoning defines an algebra homomorphism $\psi_{GP} : A_G \to A_P$. Conversely, if $p$ is a point of $C_G$, then the triple $\tilde p(u, \tilde a(p))$, $p(u, a(p))$, $B(u, h(p))$ satisfies equations
$$\operatorname{Wr}(\tilde p(u, \tilde a(p)), p(u, a(p))) = (\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j),$$
$$\tilde p(u, \tilde a)\,p(u - 2, a) - \tilde p(u - 2, \tilde a)\,p(u, a) = (\tilde l - l)\, B(u, h(p)) \prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j).$$
Hence the polynomials $\tilde p(u, \tilde a(p))$, $p(u, a(p))$ are annihilated by the difference operator $d(u) - B(u, h(p))\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$. Therefore, $p$ is a point of $C_P$.
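9.3. Algebra $A_L$. Assume that $m_1, \dots, m_n, l$ satisfy conditions of Sect. 8.1. Let $\boldsymbol{\Lambda} = (\Lambda^{(1)}, \dots, \Lambda^{(n)})$ be the collection of $\mathfrak{gl}_2$-highest weights with $\Lambda^{(s)} = (m_s, 0)$. Let $L_{\boldsymbol{\Lambda}} = L_{\Lambda^{(1)}} \otimes \cdots \otimes L_{\Lambda^{(n)}}$ be the tensor product of irreducible $\mathfrak{gl}_2$-modules with highest weights $\Lambda^{(1)}, \dots, \Lambda^{(n)}$, respectively, and $v_{\boldsymbol{\Lambda}} = v_{(m_1, 0)} \otimes \cdots \otimes v_{(m_n, 0)}$ the tensor product of the corresponding highest weight vectors. Denote by $L_{\boldsymbol{\Lambda}}(z) = L_{\Lambda^{(1)}}(z_1) \otimes \cdots \otimes L_{\Lambda^{(n)}}(z_n)$ the tensor product of evaluation modules.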
n Let Sing L [ l ] ⊂ L (z) be the subspace of gl2 -singular vectors of weight ( i=1 m i − l, l). The algebra A L is the Bethe algebra associated with Sing L [ l ]. Assume that m i ∈ Z0 for i = 1, . . . , n, and m 1 m 2 · · · m n . Assume that z i − z j + m j − s = 0 and z i − z j − 1 − s = 0 for all i < j and s = 0, 1, . . . , m i − 1. Then by Theorem 2.6.2, there is a natural isomorphism Wa,d /K → L (z) such that 1 → v . Here K ⊂ Wa,d is the kernel of the Yangian Shapovalov form on Wa,d . The Yangian Shapovalov form on Wa,d induces the Yangian Shapovalov form S on L (z) such that S(v , v ) = 1 and S(x · v, w) = S(v, x + · w) for all x ∈ Y (gl2 ) and v, w ∈ L (z). The form S is nondegenerate and symmetric. We have the composition of linear maps Wa,d → Wa,d /K → L (z). Restricting this composition to Sing Wa,d we get a linear epimorphism σ : Sing Wa,d [ l ] → Sing L [ l ]. The Bethe algebra A W preserves the kernel of σ and induces a commutative subalgebra in End (Sing L [ l ]). The induced subalgebra coincides with the Bethe algebra A L . We denote by ψW L : A W → A L the corresponding epimorphism. The operators of the algebra A L are symmetric with respect to the Yangian Shapovalov form on L (z). 9.3.1. Denote by D L = d(u) − (2u n + ψW L (H1 )u n−1 + · · · + ψW L (Hn )) ϑ −1 + a(u) ϑ −2 the universal difference operator associated with the subspace Sing L [ l ] and collection z. 9.3.2. Theorem. Assume that the pair , l satisfies conditions of Sect. 8.1. Then for any v0 ∈ Sing L [ l ] there exist v1 , . . . , vl˜ ∈ Sing L [ l ] such that the function ˜
$$w(u) = v_0 u^{\tilde l} + v_1 u^{\tilde l - 1} + \cdots + v_{\tilde l}$$
is a solution of the difference equation $D_L\, w(u) = 0$. This theorem is a particular case of Theorem 7.3 in [MTV2].
10. Homomorphisms of Algebras $A_D$, $A_P$ and $A_L$
10.1. Epimorphism $\psi_{DP} : A_D \to A_P$. A point $p$ of $C_P$ determines the difference equation $D_{h(p)}\, w(u) = 0$ and two solutions $\tilde p(u, \tilde a(p))$ and $p(u, a(p))$. Then the pair, consisting of the difference operator $D_{h(p)}$ and the solution $p(u, a(p))$ of the smaller degree, determines a point of $C_D$, see Sect. 4.1. This correspondence defines a natural algebra epimorphism $\psi_{DP} : A_D \to A_P$.
10.2. Linear map $\xi : A_D \to \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$. Assume that $z_1, \dots, z_n, m_1, \dots, m_n$ satisfy the assumptions of Theorem 2.6.2. Then we have the composition of linear maps
$$A_D \xrightarrow{\ \phi\ } A_D^* \xrightarrow{\ \tau\ } \operatorname{Sing} W_{a,d}[l] \xrightarrow{\ \sigma\ } \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l].$$
Denote this composition by $\xi : A_D \to \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$. By Theorem 7.3.1, $\xi$ is a linear epimorphism. Let $\psi_{DL} : A_D \to A_L$ be the algebra epimorphism defined as the composition $\psi_{WL}\,\psi_{DW}$.
10.2.1. Lemma. If $z_1, \dots, z_n, m_1, \dots, m_n$ satisfy the assumptions of Theorem 2.6.2, then the linear map $\xi$ intertwines the action of the multiplication operators $L_f$, $f \in A_D$, on $A_D$ and the action of the Bethe algebra $A_L$ on $\operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$, that is, for any $f, g \in A_D$ we have $\xi(L_f(g)) = \psi_{DL}(f)(\xi(g))$.
The lemma follows from Corollary 7.4.2.
10.2.2. Lemma. If $z_1, \dots, z_n, m_1, \dots, m_n$ satisfy the assumptions of Theorem 2.6.2, then the kernel of $\xi$ coincides with the kernel of $\psi_{DL}$.
Proof. If $\psi_{DL}(f) = 0$, then $\xi(f) = \xi(L_f(1)) = \psi_{DL}(f)(\xi(1)) = 0$. On the other hand, if $\xi(f) = 0$, then for any $g \in A_D$ we have $\psi_{DL}(f)(\xi(g)) = \xi(L_f(g)) = \xi(fg) = \xi(L_g(f)) = \psi_{DL}(g)(\xi(f)) = 0$. Since $\xi$ is an epimorphism, this means that $\psi_{DL}(f) = 0$.
10.2.3. Lemma. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, then the kernel of $\xi$ coincides with the kernel of $\psi_{DP}$.
Proof. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, then the assumptions of Theorem 2.6.2 are satisfied and $\xi$ is defined. By Schubert calculus, $\dim \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l] = \dim A_G$. By Theorem 9.2.1, $\dim A_G = \dim A_P$ if $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$. Hence it suffices to show that the kernel of $\xi$ contains the kernel of $\psi_{DP}$. But this follows from Theorems 3.3.1 and 9.3.2. Indeed, the defining relations in $A_P = A_D/(\ker \psi_{DP})$ are the conditions on the operator $D_h$ to have two linearly independent polynomials in the kernel. Theorems 3.3.1 and 9.3.2 guarantee these relations for elements of the Bethe algebra $A_L$. Hence, the kernel of $\psi_{DL}$ contains the kernel of $\psi_{DP}$. By Lemma 10.2.2, the kernel of $\xi$ coincides with the kernel of $\psi_{DL}$. Therefore, the kernel of $\xi$ contains the kernel of $\psi_{DP}$.
10.2.4. Corollary. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Then the algebras $A_P$, $A_L$ and $A_G$ are isomorphic.
Proof. Since the algebra epimorphisms $\psi_{DP}$ and $\psi_{DL}$ have the same kernels, the algebras $A_P$ and $A_L$ are isomorphic. Then $A_L$ and $A_G$ are isomorphic by Theorem 9.2.1.
10.3. Second main theorem. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Denote by $\psi_{PL} : A_P \to A_L$ the isomorphism induced by $\psi_{DL}$ and $\psi_{DP}$. Lemmas 10.2.1 – 10.2.3 imply the following theorem.
10.3.1. Theorem. If $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$, then the linear map $\xi$ induces a linear isomorphism $\zeta : A_P \to \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ which intertwines the multiplication operators $L_f$, $f \in A_P$, on $A_P$ and the action of the Bethe algebra $A_L$ on $\operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$, that is, for any $f, g \in A_P$ we have $\zeta(L_f(g)) = \psi_{PL}(f)(\zeta(g))$.
10.3.2. Corollary. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Assume that every operator $f \in A_L$ is diagonalizable. Then the algebra $A_L$ has simple spectrum and all points of the intersection of Schubert cycles
$$C_G = C_{\mathcal{F}(\infty), \Lambda^{(\infty)}} \cap \Bigl( \bigcap_{i=1}^{n} C_{\mathcal{F}(z_i), \Lambda^{(i)}} \Bigr)$$
are of multiplicity one.
Proof. The algebras $A_L$, $A_P$ and $A_G$ are isomorphic. We have $A_P = \oplus_p A_{p,P}$, where the sum is over the points of the scheme $C_P$ considered as a set and $A_{p,P}$ is the local algebra associated with a point $p$. The algebra $A_{p,P}$ has nonzero nilpotent elements if $\dim A_{p,P} > 1$. If every element $f \in A_P$ is diagonalizable, then the algebra $A_P$ is the direct sum of one-dimensional local algebras. Hence $A_P$ has simple spectrum as well as the algebras $A_L$ and $A_G$.
Corollary 10.3.2 has the following application. / Z and |z i − z j | 1 for 10.3.3. Corollary. Assume that z 1 , . . . , z n are real, z i − z j ∈ all i = j. Then all points of the intersection of Schubert cycles n C G = C∞, (∞) C zi , (i) ) ( ∩i=1 are of multiplicity one. Proof. If z 1 , . . . , z n are real and |z i − z j | 1 for all i = j, then the Yangian Shapovalov form, restricted to the real part of Sing L [ l ], is positive definite, see Appendix C in [MTV1]. The Hamiltonians ψW L (H1 ), . . . , ψW L (H1 ), restricted to the real part of Sing L [ l ], are real symmetric operators with respect to the Yangian Shapovalov form, see [MTV1]. Hence, all elements of the Bethe algebra A L are diagonalizable operators. Therefore, the spectrum of A G is simple and all points of C G are of multiplicity one.
Corollary 10.3.3 is related to Theorem 1 from [EGSV] and Theorem 2.1 from [MTV4] concerning the real Schubert calculus.
10.3.4. Example. Let $n = 3$, $\Lambda^{(s)} = (1, 0)$, $s = 1, 2, 3$, $\Lambda^{(\infty)} = (2, 1)$, and
$$R = 4\,(z_1^2 + z_2^2 + z_3^2 - z_1 z_2 - z_1 z_3 - z_2 z_3) - 3.$$
If $R \neq 0$, then every element of $A_L$ is diagonalizable and the algebra $A_L$ is isomorphic to the direct sum $\mathbb{C} \oplus \mathbb{C}$. If $R = 0$, then the algebra $A_L$ contains a nonzero nilpotent matrix and is isomorphic to $\mathbb{C}[b]/\langle b^2 \rangle$.
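The quantity $R$ of Example 10.3.4 can be related to the Wronskian identity of Lemma 8.3.4 by a short computation. This is a sketch under stated assumptions, not the authors' derivation: take $m_s = 1$, $l = 1$, $f(u) = u - t$, and assume $\operatorname{Wr}(g, f) = g(u) f(u-1) - g(u-1) f(u)$. Evaluating the identity $\operatorname{Wr}(g, f) = (\tilde l - l)\, W(u)$ with $W(u) = \prod_s (u - z_s + 1)$ at $u = t$ and at $u = t + 1$ gives $-g(t)$ on the left both times, hence $W(t + 1) = W(t)$, a quadratic equation in $t$ for $n = 3$. The code below checks with exact arithmetic that the discriminant of this quadratic equals $R$; the function names are hypothetical.

```python
from fractions import Fraction

def poly_from_roots(roots):
    """Coefficients (lowest degree first) of prod_s (t - r_s)."""
    coeffs = [Fraction(1)]
    for r in roots:
        nxt = [Fraction(0)] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k + 1] += c      # contribution of t * c t^k
            nxt[k] -= r * c      # contribution of -r * c t^k
        coeffs = nxt
    return coeffs

def bethe_quadratic(z):
    """Coefficients of W(t+1) - W(t) for W(u) = prod_s (u - z_s + 1), n = 3."""
    w_next = poly_from_roots([zs - 2 for zs in z])  # W(t+1) = prod (t - z_s + 2)
    w_here = poly_from_roots([zs - 1 for zs in z])  # W(t)   = prod (t - z_s + 1)
    diff = [a - b for a, b in zip(w_next, w_here)]
    assert diff[3] == 0  # the cubic terms cancel
    return diff[:3]

def discriminant(q):
    """Discriminant c1^2 - 4 c2 c0 of c2 t^2 + c1 t + c0."""
    c0, c1, c2 = q
    return c1 * c1 - 4 * c2 * c0

def R(z):
    """The quantity R from Example 10.3.4."""
    z1, z2, z3 = z
    return 4 * (z1 * z1 + z2 * z2 + z3 * z3 - z1 * z2 - z1 * z3 - z2 * z3) - 3

z_generic = (Fraction(1, 2), Fraction(1, 3), Fraction(2))
z_degenerate = (Fraction(11, 14), Fraction(-1, 7), Fraction(0))  # R = 0 here
```

At $z = (0, 0, 0)$ the quadratic is $3t^2 + 9t + 7$ with discriminant $-3 = R$, so the two roots are distinct, in line with the homogeneous case treated in Sect. 12. At the hypothetical sample point $z = (11/14, -1/7, 0)$, whose pairwise differences are not integers, the discriminant vanishes and the two roots collide, which is at least consistent with the appearance of a nilpotent element in the example.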
11. Operators with Polynomial Kernel and Bethe Algebra $A_L$
11.1. Linear isomorphism $\theta : A_P^* \to \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Define the symmetric bilinear form on $A_P$ by the formula
$$(f, g)_P = S(\zeta(f), \zeta(g)) \qquad \text{for all } f, g \in A_P,$$
where $S(\,,\,)$ denotes the Yangian Shapovalov form on $\operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$.
11.1.1. Lemma. The form $(\,,\,)_P$ is nondegenerate.
The lemma follows from the fact that the Yangian Shapovalov form on $\operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ is nondegenerate and the fact that $\zeta$ is an isomorphism.
11.1.2. Lemma. We have $(fg, h)_P = (g, fh)_P$ for all $f, g, h \in A_P$.
The lemma follows from the fact that the elements of the Bethe algebra are symmetric operators with respect to the Yangian Shapovalov form, see Sect. 3.2.2. The form $(\,,\,)_P$ defines a linear isomorphism $\pi : A_P \to A_P^*$, $f \mapsto (f, \cdot)_P$.
11.1.3. Corollary. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Then the map $\pi$ intertwines the multiplication operators $L_f$, $f \in A_P$, on $A_P$ and the dual operators $L_f^*$, $f \in A_P$, on $A_P^*$.
11.2. Third main theorem. Summarizing Theorem 10.3.1 and Corollary 11.1.3 we obtain the following theorem.
11.2.1. Theorem. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Then the composition $\theta = \zeta\,\pi^{-1}$ is a linear isomorphism from $A_P^*$ to $\operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ which intertwines the multiplication operators $L_f^*$, $f \in A_P$, on $A_P^*$ and the action of the Bethe algebra $A_L$ on $\operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$, that is, for any $f \in A_P$ and $g \in A_P^*$ we have $\theta(L_f^*(g)) = \psi_{PL}(f)(\theta(g))$.
11.2.2. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Assume that $v \in \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ is an eigenvector of the Bethe algebra $A_L$, that is, $\psi_{WL}(H_s)\,v = \lambda_s v$ for suitable $\lambda_s \in \mathbb{C}$ and $s = 1, \dots, n$. Then, by Corollary 7.4 in [MTV2], the difference equation
$$\bigl( d(u) - (2u^n + \lambda_1 u^{n-1} + \cdots + \lambda_n)\,\vartheta^{-1} + a(u)\,\vartheta^{-2} \bigr)\, w(u) = 0$$
has two linearly independent polynomial solutions, one of degree $l$ and the other of degree $\tilde l$. The following corollary of Theorem 11.2.1 gives the converse statement.
11.2.3. Corollary of Theorem 11.2.1. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point such that
$$\lambda_1 = \sum_{i=1}^{n} (m_i - 2 z_i), \qquad \lambda_2 = l \Bigl( l - 1 - \sum_{i=1}^{n} m_i \Bigr) + \sum_{1 \le i < j \le n} \bigl( z_i z_j + (z_i - m_i)(z_j - m_j) \bigr),$$
and the difference equation
$$\bigl( d(u) - (2u^n + \lambda_1 u^{n-1} + \cdots + \lambda_n)\,\vartheta^{-1} + a(u)\,\vartheta^{-2} \bigr)\, w(u) = 0 \tag{11.1}$$
has two linearly independent polynomial solutions. Then there exists a unique up to normalization eigenvector $v \in \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ of the action of the Bethe algebra $A_L$ such that for every $s = 1, \dots, n$ we have
$$\psi_{WL}(H_s)\, v = \lambda_s v. \tag{11.2}$$
Proof of Corollary 11.2.3. Indeed, such a point $(\lambda_1, \dots, \lambda_n)$ defines a linear function $\eta : A_P \to \mathbb{C}$, $h_s \mapsto \lambda_s$, for $s = 1, \dots, n$. Moreover, $\eta(fg) = \eta(f)\eta(g)$ for all $f, g \in A_P$. Hence $\eta \in A_P^*$ is an eigenvector of operators $L_f^*$ acting on $A_P^*$. By Theorem 11.2.1, the vector $v = \theta(\eta) \in \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ is an eigenvector of the action of the Bethe algebra $A_L$ with eigenvalues prescribed in Corollary 11.2.3. Let $v' \in \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ satisfy (11.2), then $\eta' = \theta^{-1}(v') \in A_P^*$ satisfies $\eta'(fg) = \eta(f)\,\eta'(g)$ for all $f, g \in A_P$. Hence, for $g = 1$ we have $\eta'(f) = \eta(f)\,\eta'(1)$. Therefore, $\eta'$ is proportional to $\eta$, and $v'$ is proportional to $v$.
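As a consistency check on the constraints of Corollary 11.2.3, one can specialize them to $z_1 = \cdots = z_n = 0$, $m_1 = \cdots = m_n = 1$ and compare with the values $n$ and $l(l - 1 - n) + n(n-1)/2$ that appear for the homogeneous model in Sect. 12.2. The sketch below assumes the reading $\lambda_1 = \sum_i (m_i - 2z_i)$ and $\lambda_2 = l\,(l - 1 - \sum_i m_i) + \sum_{i<j} \bigl(z_i z_j + (z_i - m_i)(z_j - m_j)\bigr)$ of the displayed formulas; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def lambda1(z, m):
    """lambda_1 = sum_i (m_i - 2 z_i), as read from Corollary 11.2.3."""
    return sum(mi - 2 * zi for zi, mi in zip(z, m))

def lambda2(z, m, l):
    """lambda_2 = l (l - 1 - sum m_i) + sum_{i<j} (z_i z_j + (z_i - m_i)(z_j - m_j))."""
    pairs = sum(
        zi * zj + (zi - mi) * (zj - mj)
        for (zi, mi), (zj, mj) in combinations(list(zip(z, m)), 2)
    )
    return l * (l - 1 - sum(m)) + pairs

def homogeneous_values(n, l):
    """The values n and l(l - 1 - n) + n(n - 1)/2 quoted for the homogeneous model."""
    return n, l * (l - 1 - n) + Fraction(n * (n - 1), 2)

# Specialize to z = 0, m = 1 for a few small n and all admissible l (2l <= n).
checks = [
    (lambda1([0] * n, [1] * n), lambda2([0] * n, [1] * n, l)) == homogeneous_values(n, l)
    for n in range(1, 8)
    for l in range(n // 2 + 1)
]
```

Every entry of `checks` is `True`: with $z = 0$ and $m = 1$ each pair $i < j$ contributes $0 + (-1)(-1) = 1$, so the double sum equals $n(n-1)/2$.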
11.2.4. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point satisfying the assumptions of Corollary 11.2.3. We describe how to find the eigenvector $v \in \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$, indicated in Corollary 11.2.3. Let $f(u)$ be the monic polynomial of degree $l$ which is a solution of the difference equation (11.1). Consider the polynomial
$$\omega(y) = y_0^{\,l} \prod_{j=1}^{n-1} f(y_j - 1)$$
as an element of $W_{a,d}$, see Sect. 6.3. By Theorem 6.3.2 this vector lies in $\operatorname{Sing} W_{a,d}[l]$ and $\omega(y)$ is an eigenvector of the Bethe algebra $A_W$ with eigenvalues prescribed in Corollary 11.2.3. Consider a maximal subspace $V \subset \operatorname{Sing} W_{a,d}[l]$ with three properties: i) $V$ contains $\omega(y)$, ii) $V$ does not contain other eigenvectors of the Bethe algebra $A_W$, iii) $V$ is invariant with respect to the Bethe algebra $A_W$. Such a maximal subspace does exist and is unique. Let $\sigma(V) \subset \operatorname{Sing} L_{\boldsymbol{\Lambda}}[l]$ be the image of $V$ under the epimorphism $\sigma$. Then by Corollary 11.2.3, the subspace $\sigma(V)$ contains a unique one-dimensional subspace of eigenvectors of the Bethe algebra $A_L$. Any such eigenvector may serve as an eigenvector of the Bethe algebra $A_L$ indicated in Corollary 11.2.3.
12. Homogeneous XXX Heisenberg model
12.1. Statement of results. In Sects. 8–11, in most of the assertions we assumed that $z_1, \dots, z_n \in \mathbb{C}$ are such that $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, and $m_1, \dots, m_n$ are natural numbers. In this section we assume that
$$z_1 = \cdots = z_n = 0 \qquad \text{and} \qquad m_1 = \cdots = m_n = 1. \tag{12.1}$$
This special case is called the homogeneous XXX Heisenberg model.
In other words, in this section we consider the $Y(\mathfrak{gl}_2)$-module
$$L_{\mathbf 1}(0) = L_{(1,0)}(0) \otimes \cdots \otimes L_{(1,0)}(0),$$
which is the tensor product of $n$ copies of the two-dimensional evaluation module, and the subspace of $\mathfrak{gl}_2$-singular vectors of weight $(n - l, l)$,
$$\operatorname{Sing} L_{\mathbf 1}[l] = \{\, p \in L_{\mathbf 1}(0) \mid e_{12}\, p = 0,\ e_{22}\, p = l p \,\}.$$
The subspace $\operatorname{Sing} L_{\mathbf 1}[l]$ is nonzero if and only if $2l \le n$, that is, if and only if the pair $((1, 0), \dots, (1, 0)),\ l$ is separating. In that case
$$\dim \operatorname{Sing} L_{\mathbf 1}[l] = \binom{n}{l} - \binom{n}{l-1}.$$
The algebra $A_L$ is the Bethe algebra associated with the subspace $\operatorname{Sing} L_{\mathbf 1}[l]$. It is generated by the coefficients of the series $(T_{11}(u) + T_{22}(u))|_{\operatorname{Sing} L_{\mathbf 1}[l]}$. The main result of this section is the following theorem.
12.1.1. Theorem. For the homogeneous XXX Heisenberg model, the Bethe algebra $A_L$ has simple spectrum.
The theorem will be proved in Sect. 12.7.
12.1.2. Let $\tilde l = n + 1 - l$. We have $\tilde l + l - 1 = n$ and $\tilde l > l$. Denote by $f, g$ two polynomials in $\mathbb{C}[u]$ of the form:
$$f(u) = u^l + f_1 u^{l-1} + \cdots + f_l, \qquad g(u) = u^{\tilde l} + g_1 u^{\tilde l - 1} + \cdots + g_{\tilde l - l - 1} u^{l+1} + g_{\tilde l - l + 1} u^{l-1} + \cdots + g_{\tilde l}. \tag{12.2}$$
As a byproduct of the proof of Theorem 12.1.1 we prove the following theorem.
Theorem. There exist exactly $\binom{n}{l} - \binom{n}{l-1}$ distinct pairs of polynomials $f, g$ of the form (12.2), such that
$$f(u)\,g(u - 1) - f(u - 1)\,g(u) = (l - \tilde l)\,(u + 1)^n.$$
Theorem 12.1.2 will be proved in Sect. 12.8.
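The smallest nontrivial instance of Theorem 12.1.2 can be worked out by hand and verified mechanically. Take $n = 2$, $l = 1$, so $\tilde l = 2$ and the predicted number of pairs is $\binom{2}{1} - \binom{2}{0} = 1$. With $f(u) = u + f_1$ and $g(u) = u^2 + g_2$ (the $u^l$ term of $g$ is absent by (12.2)), expanding gives $f(u)g(u-1) - f(u-1)g(u) = -u^2 + (1 - 2f_1)u + f_1 + g_2$, and matching this with $(l - \tilde l)(u+1)^2 = -(u+1)^2$ forces $f_1 = 3/2$, $g_2 = -5/2$. The sign $(l - \tilde l)$ on the right-hand side is the one dictated by the leading coefficient of the left-hand side. The sketch below, with hypothetical names, confirms the identity for this pair using exact rational arithmetic.

```python
from fractions import Fraction

N, L = 2, 1
L_TILDE = N + 1 - L  # equals 2

def f(u):
    """f(u) = u + 3/2, the degree-l polynomial found by hand."""
    return u + Fraction(3, 2)

def g(u):
    """g(u) = u^2 - 5/2, the degree-l~ polynomial with no u^l term."""
    return u * u - Fraction(5, 2)

def lhs(u):
    """f(u) g(u - 1) - f(u - 1) g(u)."""
    return f(u) * g(u - 1) - f(u - 1) * g(u)

def rhs(u):
    """(l - l~) (u + 1)^n."""
    return (L - L_TILDE) * (u + 1) ** N

def identity_holds():
    # Both sides are polynomials of degree at most 3, so agreement at
    # eleven distinct points implies equality as polynomials.
    return all(lhs(Fraction(k)) == rhs(Fraction(k)) for k in range(-5, 6))
```

Since the two coefficient equations are linear in $f_1$ and $g_2$ with a unique solution, this is the only pair for $n = 2$, $l = 1$, in agreement with the count in the theorem.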
12.2. Algebra $A_L$ for the homogeneous XXX model. Consider the Yangian module $W_{a,d}$ corresponding to the polynomials
$$a(u) = (u + 1)^n, \qquad d(u) = u^n.$$
The numbers (12.1) satisfy the assumptions of Theorem 2.6.2. Therefore the $Y(\mathfrak{gl}_2)$-module $L_{\mathbf 1}(0)$ is irreducible, and there is a natural epimorphism $W_{a,d} \to L_{\mathbf 1}(0)$ of $Y(\mathfrak{gl}_2)$-modules. Restricting this epimorphism to $\operatorname{Sing} W_{a,d}[l]$, we obtain a linear epimorphism $\sigma : \operatorname{Sing} W_{a,d}[l] \to \operatorname{Sing} L_{\mathbf 1}[l]$.
The Bethe algebra $A_W$ preserves the kernel of $\sigma$ and induces a commutative subalgebra in $\operatorname{End}(\operatorname{Sing} L_{\mathbf 1}[l])$. The induced subalgebra coincides with the Bethe algebra $A_L$, see Sect. 9.3. Denote by $\psi_{WL} : A_W \to A_L$ the corresponding epimorphism. We have
$$(T_{11}(u) + T_{22}(u))|_{\operatorname{Sing} L_{\mathbf 1}[l]} = 2 + \psi_{WL}(H_1)\,u^{-1} + \cdots + \psi_{WL}(H_n)\,u^{-n},$$
where
$$\psi_{WL}(H_1) = n, \qquad \psi_{WL}(H_2) = l(l - 1 - n) + \frac{n(n-1)}{2},$$
see Sect. 3.2.1. Thus the Bethe algebra $A_L$ is generated by elements $\psi_{WL}(H_3), \dots, \psi_{WL}(H_n)$.
12.3. Algebra $A_P$ for the homogeneous XXX model. Consider the space $\mathbb{C}^{2n}$ with coordinates $\tilde a, a, h$, as in Sect. 8.5, and polynomials $\tilde p(u, \tilde a)$, $p(u, a)$, $B(u, h)$. Given the polynomials $a(u) = (u + 1)^n$ and $d(u) = u^n$, we define the ideal $I_P$, the algebra $A_P$, and the scheme $C_P$ as in Sect. 9.1. The scheme $C_P$ is the scheme of points $p \in \mathbb{C}^{2n}$ such that the difference equation
$$\bigl( u^n - B(u, h)\,\vartheta^{-1} + (u + 1)^n\,\vartheta^{-2} \bigr)\, w(u) = 0$$
has two polynomial solutions $\tilde p(u, \tilde a(p))$ and $p(u, a(p))$.
12.4. Algebra $A_G$ for the homogeneous XXX model. Consider the space $\mathbb{C}^{2n}$ with coordinates $\tilde a, a, h$, and polynomials $\tilde p(u, \tilde a)$, $p(u, a)$, $B(u, h)$. Let us write
$$\operatorname{Wr}(\tilde p(u, \tilde a), p(u, a)) = (\tilde l - l)\,u^n + w_1(\tilde a, a)\,u^{n-1} + \cdots + w_n(\tilde a, a),$$
$$\tilde p(u, \tilde a)\,p(u - 2, a) - \tilde p(u - 2, \tilde a)\,p(u, a) = 2(\tilde l - l)\,u^n + \hat w_1(\tilde a, a)\,u^{n-1} + \cdots + \hat w_n(\tilde a, a)$$
for suitable polynomials $w_1, \dots, w_n, \hat w_1, \dots, \hat w_n$ in variables $\tilde a, a$. Denote by $I_G$ the ideal in $\mathbb{C}[\tilde a, a, h]$ generated by $2n$ polynomials
$$w_i(\tilde a, a) - (\tilde l - l) \binom{n}{i}, \qquad \hat w_i(\tilde a, a) - (\tilde l - l)\, h_i, \qquad i = 1, \dots, n. \tag{12.3}$$
The ideal $I_G$ defines a scheme $C_G \subset \mathbb{C}^{2n}$. Then $A_G = \mathbb{C}[\tilde a, a, h]/I_G$ is the algebra of functions on $C_G$. The scheme $C_G$ is the scheme of points $p \in \mathbb{C}^{2n}$ such that
$$\operatorname{Wr}(\tilde p(u, \tilde a(p)), p(u, a(p))) = (\tilde l - l)\,(u + 1)^n, \qquad \tilde p(u, \tilde a)\,p(u - 2, a) - \tilde p(u - 2, \tilde a)\,p(u, a) = (\tilde l - l)\, B(u, h(p)). \tag{12.4}$$
12.4.1. Theorem. The identity map $\mathbb{C}^{2n} \to \mathbb{C}^{2n}$ induces an algebra isomorphism $\psi_{GP} : A_G \to A_P$.
Proof. The proof is similar to the proof of Theorem 9.2.1.
12.4.2. Lemma. The dimension of $A_G$ considered as a vector space is equal to
$$\dim \operatorname{Sing} L_{\mathbf 1}[l] = \binom{n}{l} - \binom{n}{l-1}.$$
Proof. Consider the ideal $I_G(z)$ defined by (8.4) for $m_1 = \cdots = m_n = 1$ and arbitrary $z_1, \dots, z_n$. Consider the algebra $A_G(z) = \mathbb{C}[\tilde a, a, h]/I_G(z)$. By Lemma 8.5.1, if $z_1, \dots, z_n$ are distinct and close to zero, then $A_G(z)$ is the algebra of functions on the intersection of Schubert cells $C_G(z)$, see (8.1), and by (8.3) we have
$$\dim A_G(z) = \binom{n}{l} - \binom{n}{l-1}.$$
To complete the proof of Lemma 12.4.2, it suffices to verify two facts:
(i) There are no algebraic curves over $\mathbb{C}$ lying in the scheme $C_G(0)$, defined by the ideal (12.3).
(ii) Let a sequence $z^{(i)}$, $i = 1, 2, \dots$, tend to 0. Let $p^{(i)} \in C_G(z^{(i)})$, $i = 1, 2, \dots$, be a sequence of points. Then all coordinates $\tilde a(p^{(i)})$, $a(p^{(i)})$, $h(p^{(i)})$ remain bounded as $i$ tends to infinity.
By Theorem 9.2.1, the schemes $C_G(z)$ and $C_P(z)$ are isomorphic if $z_1, \dots, z_n$ are distinct and close to zero. By Theorem 12.4.1, the schemes $C_G(0)$ and $C_P(0)$ are isomorphic as well. Claims (i) and (ii) hold for the scheme $C_P(z)$ by Theorem 4.2.1 because $C_P(z)$ is a subscheme of the scheme $C_D(z)$.
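The dimension $\binom{n}{l} - \binom{n}{l-1}$ in Lemma 12.4.2 can be cross-checked against a standard fact that is not specific to this paper: $(\mathbb{C}^2)^{\otimes n}$ decomposes into irreducible $\mathfrak{gl}_2$-modules, and each singular vector of weight $(n - l, l)$ generates an irreducible submodule of dimension $n - 2l + 1$, so the weighted sum over $l$ must equal $2^n$. A minimal sketch, with hypothetical function names:

```python
from math import comb

def dim_sing(n, l):
    """binom(n, l) - binom(n, l - 1) for 0 <= 2l <= n, and 0 otherwise."""
    if l < 0 or 2 * l > n:
        return 0
    return comb(n, l) - (comb(n, l - 1) if l >= 1 else 0)

def total_dimension(n):
    """Sum over l of (n - 2l + 1) * dim Sing[l]; should equal 2**n."""
    return sum((n - 2 * l + 1) * dim_sing(n, l) for l in range(n // 2 + 1))
```

For example, `dim_sing(3, 1)` evaluates to 2, the dimension of the space considered in Example 10.3.4, and `total_dimension(n)` equals `2 ** n` for every `n` tried.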
12.5. Three more homomorphisms for the homogeneous XXX model. In Sects. 10.1 and 10.2, we define an algebra epimorphism $\psi_{DP} : A_D \to A_P$, a linear epimorphism $\xi : A_D \to \operatorname{Sing} L_{\mathbf 1}[l]$ as the composition of linear maps
$$A_D \xrightarrow{\ \phi\ } A_D^* \xrightarrow{\ \tau\ } \operatorname{Sing} W_{a,d}[l] \xrightarrow{\ \sigma\ } \operatorname{Sing} L_{\mathbf 1}[l],$$
and an algebra epimorphism $\psi_{DL} : A_D \to A_L$ as the composition $\psi_{WL}\,\psi_{DW}$. For the homogeneous XXX model, we have Lemmas 10.2.1 and 10.2.2 and the following analogue of Lemma 10.2.3.
12.5.1. Lemma. For the homogeneous XXX model, the kernel of $\xi$ coincides with the kernel of $\psi_{DP}$.
Proof. The proof is similar to the proof of Lemma 10.2.3 with Theorem 12.4.1 replacing Theorem 9.2.1.
12.5.2. Corollary. For the homogeneous XXX model, the algebras $A_P$, $A_L$ and $A_G$ are isomorphic.
Denote by $\psi_{PL} : A_P \to A_L$ the isomorphism induced by $\psi_{DL}$ and $\psi_{DP}$. We have the following analogue of Theorem 10.3.1.
12.5.3. Theorem. For the homogeneous XXX model, the linear map $\xi$ induces a linear isomorphism $\zeta : A_P \to \operatorname{Sing} L_{\mathbf 1}[l]$ which intertwines the multiplication operators $L_f$, $f \in A_P$, on $A_P$ and the action of the Bethe algebra $A_L$ on $\operatorname{Sing} L_{\mathbf 1}[l]$, that is, for any $f, g \in A_P$ we have $\zeta(L_f(g)) = \psi_{PL}(f)(\zeta(g))$.
12.6. The Bethe algebra $A_L$ of the homogeneous XXX model is diagonalizable.
12.6.1. Theorem. For the homogeneous XXX model, all elements of $A_L$ are diagonalizable operators.
Proof. Let $v_+$ be a highest $\mathfrak{gl}_2$-weight vector of $L_{(1,0)}$ and $v_- = e_{21} v_+$. Then $v_+, v_-$ form a basis of $L_{(1,0)}$. Consider the Hermitian form on $L_{\mathbf 1}(0)$ for which the vectors
$$v_{i_1} \otimes \cdots \otimes v_{i_n} \qquad \text{with } i_j \in \{+, -\}$$
generate an orthonormal basis of $L_{\mathbf 1}(0)$. For any $X \in \operatorname{End}(L_{\mathbf 1}(0))$, denote by $X^\dagger$ the Hermitian conjugate operator with respect to this Hermitian form. It is clear that
$$\bigl( (1^{\otimes (j-1)} \otimes e_{ab} \otimes 1^{\otimes (n-j)})|_{L_{\mathbf 1}(0)} \bigr)^\dagger = (1^{\otimes (j-1)} \otimes e_{ba} \otimes 1^{\otimes (n-j)})|_{L_{\mathbf 1}(0)}.$$
Using the fact that $(e_{11} + e_{22})|_{L_{(1,0)}} = 1$ and the definition of the coproduct (2.2), it is straightforward to verify by induction on $n$ that
$$\bigl( T_{ab}(u)|_{L_{\mathbf 1}(0)} \bigr)^\dagger = (-1)^{a+b+n}\, T_{3-a,3-b}(-\bar u - 1)|_{L_{\mathbf 1}(0)},$$
where $\bar u$ is the complex conjugate of $u$. Therefore,
$$\bigl( (T_{11}(u) + T_{22}(u))|_{L_{\mathbf 1}(0)} \bigr)^\dagger = - \bigl( T_{11}(-\bar u - 1) + T_{22}(-\bar u - 1) \bigr)|_{L_{\mathbf 1}(0)}.$$
This means that for any $X \in A_L$, the Hermitian conjugate operator $X^\dagger$ lies in $A_L$. Hence, any element of $A_L$ commutes with its Hermitian conjugate and, therefore, is diagonalizable.
12.7. Proof of Theorem 12.1.1. The proof is similar to the proof of Corollary 10.3.2, because every element of $A_L$ is diagonalizable by Theorem 12.6.1.
12.8. Proof of Theorem 12.1.2. The algebras $A_G$ and $A_L$ are isomorphic. So, by Theorem 12.6.1 every element $f \in A_G$ is diagonalizable. Therefore, the algebra $A_G$ is the direct sum of one-dimensional local algebras. Hence $C_G$ considered as a set consists of $\dim A_G = \binom{n}{l} - \binom{n}{l-1}$ distinct points, see Lemma 12.4.2. Theorem 12.1.2 is proved.
12.8.1. Assume that $v \in \operatorname{Sing} L_{\mathbf 1}[l]$ is an eigenvector of the Bethe algebra $A_L$, that is, $\psi_{WL}(H_s)\, v = \lambda_s v$ for suitable $\lambda_s \in \mathbb{C}$ and $s = 1, \dots, n$. Then by Corollary 7.4 in [MTV2], the difference equation
$$\bigl( u^n - (2u^n + \lambda_1 u^{n-1} + \cdots + \lambda_n)\,\vartheta^{-1} + (u + 1)^n\,\vartheta^{-2} \bigr)\, w(u) = 0$$
has two linearly independent polynomial solutions, one of degree $l$ and the other of degree $n - l + 1$. The following corollary of Theorem 12.1.1 gives the converse statement.
12.8.2. Corollary of Theorem 12.1.1. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point such that
$$\lambda_1 = n, \qquad \lambda_2 = l(l - 1 - n) + \frac{n(n-1)}{2},$$
and the difference equation
$$\bigl( u^n - (2u^n + \lambda_1 u^{n-1} + \cdots + \lambda_n)\,\vartheta^{-1} + (u + 1)^n\,\vartheta^{-2} \bigr)\, w(u) = 0$$
has two linearly independent polynomial solutions. Then there exists a unique up to normalization eigenvector $v \in \operatorname{Sing} L_{\mathbf 1}[l]$ of the action of the Bethe algebra $A_L$ of the homogeneous XXX model such that for every $s = 1, \dots, n$ we have $\psi_{WL}(H_s)\, v = \lambda_s v$.
The proof of Corollary 12.8.2 is similar to the proof of Corollary 11.2.3.
12.8.3. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point satisfying the assumptions of Corollary 12.8.2. In order to find the eigenvector $v \in \operatorname{Sing} L_{\mathbf 1}[l]$, indicated in Corollary 12.8.2, one needs to apply the procedure described in Sect. 11.2.4.
Acknowledgements. The authors thank referees for helpful comments.
References
[B1] Baxter, R.: Exactly solved models in statistical mechanics. London: Academic Press, Inc., 1982
[B2] Baxter, R.: Completeness of the Bethe ansatz for the six- and eight-vertex models. J. Stat. Phys. 108(1-2), 1–48 (2002)
[Be] Bethe, H.: Zur Theorie der Metalle: I. Eigenwerte und Eigenfunktionen der linearen Atomkette. Z. Phys. 71, 205–226 (1931)
[CP] Chari, V., Pressley, A.: A Guide to quantum groups. Cambridge: Cambridge University Press, 1994
[EGSV] Eremenko, A., Gabrielov, A., Shapiro, M., Vainshtein, A.: Rational functions and real Schubert calculus. Proc. Amer. Math. Soc. 134(4), 949–957 (2006)
[FT] Faddeev, L.D., Takhtajan, L.A.: The quantum method for the inverse problem and the XYZ Heisenberg model. Russ. Math. Surv. 34, no. 5, 11–68 (1979); The spectrum and scattering of excitations in the one-dimensional isotropic Heisenberg model. J. Sov. Math. 24, 241–267 (1984)
[Fu] Fulton, W.: Intersection Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1984
[IK] Izergin, A.G., Korepin, V.E.: Lattice model connected with nonlinear Schrödinger equation. Sov. Phys. Doklady 26, 653–654 (1981)
[KBI] Korepin, V.E., Bogoliubov, N.M., Izergin, A.G.: Quantum inverse scattering method and correlation functions. Cambridge: Cambridge University Press, 1993
[MTV1] Mukhin, E., Tarasov, V., Varchenko, A.: Bethe eigenvectors of higher transfer matrices. J. Stat. Mech. Theor. Exp. 2006, no. 8, P08002, 1–44 (electronic) (2006)
[MTV2] Mukhin, E., Tarasov, V., Varchenko, A.: Generating operator of XXX or Gaudin transfer matrices has quasi-exponential kernel. SIGMA 6, 060, 1–31 (2007)
[MTV3] Mukhin, E., Tarasov, V., Varchenko, A.: Bethe algebra and algebra of functions on the space of differential operators of order two with polynomial solutions. http://arxiv.org/abs/0705.4114v1[math.QA], 2007
[MTV4] Mukhin, E., Tarasov, V., Varchenko, A.: On reality property of Wronski maps. http://arxiv.org/abs/0710.5856v2[math.QA], 2008
[MTV5] Mukhin, E., Tarasov, V., Varchenko, A.: On separation of variables and completeness of the Bethe ansatz for quantum $\mathfrak{gl}_N$ Gaudin model. http://arxiv.org/abs/0712.0981v1[math.QA], 2007
[MV1] Mukhin, E., Varchenko, A.: Critical points of master functions and flag varieties. Comm. Contemp. Math. 6(1), 111–163 (2004)
[MV2] Mukhin, E., Varchenko, A.: Solutions to the XXX type Bethe ansatz equations and flag varieties. Cent. Eur. J. Math. 1(2), 238–271 (2003)
[MV3] Mukhin, E., Varchenko, A.: Discrete Miura opers and solutions of the Bethe ansatz equations. Commun. Math. Phys. 256(3), 565–588 (2005)
[PS] Pronko, G.P., Stroganov, Yu.G.: Bethe equations “on the wrong side of equator”. J. Phys. A 32(12), 2333–2340 (1999)
[RV] Reshetikhin, N., Varchenko, A.: Quasiclassical asymptotics of solutions to the KZ equations. In: Geometry, Topology and Physics for R. Bott, Somerville, MA: Intern. Press, 1995, pp. 293–322
[Sk] Sklyanin, E.: Quantum inverse scattering method. Selected topics. In: Nankai Lectures Math. Phys., River Edge, NJ: World Sci. Publ., 1992, pp. 63–97
[T] Tarasov, V.: Irreducible monodromy matrices for the R-matrix of the XXZ-model and lattice local quantum Hamiltonians. Theor. Math. Phys. 63(2), 440–454 (1985)
[Tal] Talalaev, D.: Quantization of the Gaudin system. http://arxiv.org/abs/list/hep-th/0404153, 2004
[YY] Yang, C.N., Yang, C.P.: Thermodynamics of a one-dimensional system of bosons with repulsive delta-function interaction. J. Math. Phys. 10, 1115–1122 (1969)
Communicated by L. Takhtajan
Commun. Math. Phys. 288, 43–53 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0731-6
Communications in
Mathematical Physics
Conformal Radii for Conformal Loop Ensembles Oded Schramm1 , Scott Sheffield2,3, , David B. Wilson1 1 Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA 2 Courant Institute, New York University, 251 Mercer Street, New York, NY 10021, USA 3 Department of Mathematics, M. I. T., 77 Massachusetts Ave., Cambridge, MA 02139, USA
Received: 20 September 2007 / Accepted: 10 November 2008 Published online: 26 February 2009 – © Springer-Verlag 2009
Abstract: The conformal loop ensembles CLE$_\kappa$, defined for $8/3 \le \kappa \le 8$, are random collections of loops in a planar domain which are conjectured scaling limits of the O(n) loop models. We calculate the distribution of the conformal radii of the nested loops surrounding a deterministic point. Our results agree with predictions made by Cardy and Ziff and by Kenyon and Wilson for the O(n) model. We also compute the expectation dimension of the CLE$_\kappa$ gasket, which consists of points not surrounded by any loop, to be
$$2 - \frac{(8 - \kappa)(3\kappa - 8)}{32\kappa},$$
which agrees with the fractal dimension given by Duplantier for the O(n) model gasket.
1. Introduction
The conformal loop ensembles CLE$_\kappa$, defined for all $8/3 \le \kappa \le 8$, are random collections of loops in a simply connected planar domain $D \subsetneq \mathbb{C}$. They were defined and constructed from branching variants of SLE$_\kappa$ in [She06], where they were conjectured to be the scaling limits of various random loop models from statistical physics, including the so-called O(n) loop models with
$$n = -2\cos(4\pi/\kappa), \tag{1}$$
see e.g. [KN04] for an exposition. This paper is a sequel to [She06]. We will state the results about CLE$_\kappa$ from [She06] that we need for this paper (namely Propositions 1 and 2), but we will not repeat the definition of CLE$_\kappa$ here. When $8/3 < \kappa < 8$, CLE$_\kappa$ is almost surely a countably infinite collection of loops. CLE$_8$ is a single space-filling loop almost surely and CLE$_{8/3}$ is almost surely empty.
Partially supported by NSF grant DMS0403182.
CLE6 is the scaling limit of the cluster boundaries of critical site percolation on the triangular lattice [CN06,Smi01,CN07]. We will henceforth assume 8/3 < κ < 8. For each z ∈ D, we inductively define L kz to be the outermost loop surrounding z z when the loops L 1z , . . . , L k−1 are removed (provided such a loop exists). For each deterministic z ∈ D, the loops L kz exist for all k ≥ 1 with probability one. Define A0z = D and let Akz be the component of D\L kz that contains z. The conformal gasket is the random closed set consisting of points that are not surrounded by any loop of an instance of CLEκ , i.e. the set of points for which L 1z does not exist. If D is a simply connected planar domain and z ∈ D, the conformal radius of D viewed from z is defined to be CR(D, z) := |g (z)|−1 , where g is any conformal map from D to the unit disk D that sends z to 0. The following is immediate from the construction in [She06]: Proposition 1. Let D be a simply connected bounded planar domain, and consider a CLEκ on D for some 8/3 < κ < 8. Then is almost surely the closure of the set of points that lie on an outermost loop (i.e., a loop of the form L 1z for some z). Conditioned on the outermost loops, the law of the remaining loops is given by an independent CLEκ in each component of D\. For z ∈ D and k = 1, 2, 3, . . . , define z , z) − log CR(Akz , z). Bkz := log CR(Ak−1
For any fixed z, the $B_k^z$'s are i.i.d. random variables.

Various authors in the physics literature have used heuristic arguments (based on the so-called Coulomb gas method) to calculate properties of the scaling limits of statistical physical loop models, including the O(n) models, based on certain conformal invariance hypotheses of these limits. Although the scaling limits of the O(n) models have not been shown to exist, there is strong evidence that if they do exist they must be CLEκ. (For example, there is heuristic evidence that any scaling limit of the O(n) models should be in some sense conformally invariant; it is shown in [She06,SW08b,SW08a] that any random loop ensemble satisfying certain hypotheses including conformal invariance and a Markov-type property must be a CLEκ.) It is therefore natural to interpret these calculations as predictions about the behavior of the CLEκ. Cardy and Ziff [CZ03] predicted and experimentally verified the expected number of loops surrounding a point in the O(n) model, which, in light of (1), may be interpreted as a prediction of the expectation of $B_k^z$:
$$\frac{1}{\mathbb{E}[B_k^z]} = \frac{(\kappa/4 - 1)\cot(\pi(1 - 4/\kappa))}{\pi}. \tag{2}$$
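Kenyon and Wilson [KW04] went further and predicted the distribution of $B_k^z$, giving its moment generating function
$$\mathbb{E}[\exp(\lambda B_k^z)] = \frac{-\cos(4\pi/\kappa)}{\cos\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)} \tag{3}$$
for λ satisfying $\operatorname{Re}\lambda < 1 - \frac{2}{\kappa} - \frac{3\kappa}{32}$, and density function
$$\frac{d}{dx}\Pr[B_k^z < x] = \frac{-\kappa\cos(4\pi/\kappa)}{4\pi}\sum_{j=0}^{\infty}(-1)^j\,(j + 1/2)\exp\left[-\frac{(j + 1/2)^2 - (1 - 4/\kappa)^2}{8/\kappa}\,x\right]. \tag{4}$$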
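As a numerical sanity check on how (3) and (4) fit together, one can integrate $e^{\lambda x}$ against the series (4) term by term and compare the result with the closed form (3). The Python sketch below does this under assumptions that are not part of the original text: the function names are illustrative, λ is taken real and below the threshold 1 − 2/κ − 3κ/32, the interchange of sum and integral is treated formally, and the slowly converging alternating sum is stabilized by averaging two consecutive partial sums.

```python
import cmath
import math


def mgf_closed_form(kappa, lam):
    """Right-hand side of (3); cmath handles a negative radicand."""
    root = cmath.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    return (-math.cos(4 * math.pi / kappa) / cmath.cos(math.pi * root)).real


def mgf_from_density_series(kappa, lam, terms=20000):
    """Integrate exp(lam * x) against (4) term by term (formal interchange)."""
    prefactor = -kappa * math.cos(4 * math.pi / kappa) / (4 * math.pi)
    partial = 0.0
    previous = 0.0
    for j in range(terms):
        rate = ((j + 0.5) ** 2 - (1 - 4 / kappa) ** 2) * kappa / 8
        previous = partial
        # The integral of exp(-(rate - lam) * x) over x > 0 is 1 / (rate - lam).
        partial += (-1) ** j * (j + 0.5) / (rate - lam)
    # Averaging two consecutive partial sums damps the alternating tail.
    return prefactor * 0.5 * (partial + previous)
```

For example, `mgf_from_density_series(6.0, -0.5)` and `mgf_closed_form(6.0, -0.5)` should agree to several digits, and both functions return a value close to 1 at λ = 0, as a probability density requires.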
Conformal Radii for Conformal Loop Ensembles
The main result of this paper is Theorem 1, which confirms these predictions. In the special case κ = 6, this prediction for the law of $B_k^z$ was independently confirmed by Dubédat [Dub05].

Theorem 1. Let $f_\kappa$ denote the density function for the first time that a standard Brownian motion started at 0 exits the interval $(-2\pi/\sqrt{\kappa},\, 2\pi/\sqrt{\kappa})$. Then for 8/3 < κ < 8, the density function for $B_k^z$ is
$$\frac{d}{dx}\Pr[B_k^z < x] = -f_\kappa(x)\cos(4\pi/\kappa)\exp\left[\frac{(\kappa - 4)^2}{8\kappa}\,x\right]. \tag{5}$$

The equivalence of the formulations (4) and (5), and the fact that they imply (3), follows from a calculation of Ciesielski and Taylor, who showed that the exit-time distribution of a Brownian motion from the center of a 1-dimensional ball of radius r has a moment generating function of $1/\cos\sqrt{2r^2\lambda}$, and who gave two series expansions (one in powers of $e^{-x}$ and the other in powers of $e^{-1/x}$) for its density function [CT62, Theorem 2 and Eq. 2.22] (see also [BS02, Eqs. 1.3.0.1 and 1.3.0.2]). Since the Fourier transform is invertible on $L^2(\mathbb{R})$, the equivalence of (3) and (4) follows by considering the moment generating function restricted to the imaginary line Re λ = 0.

Duplantier [Dup90] predicted the fractal dimension of the gasket associated with the O(n) model to be
$$\frac{3\kappa}{32} + 1 + \frac{2}{\kappa},$$
where as usual n = −2 cos(4π/κ). We partially confirm this prediction by giving the expectation dimension of the gasket associated with CLEκ. The expectation dimension of a random bounded set A is defined to be
$$D_E(A) = \lim_{\varepsilon\to 0}\frac{\log \mathbb{E}[\text{minimal number of balls of radius } \varepsilon \text{ required to cover } A]}{|\log \varepsilon|},$$
provided the limit exists. The expectation dimension upper bounds the Hausdorff dimension.

Theorem 2. Let Γ be the gasket of a CLEκ in the unit disk with κ ∈ (8/3, 8). Then
$$\mathbb{E}[\text{minimal number of balls of radius } \varepsilon \text{ required to cover } \Gamma] \asymp \left(\frac{1}{\varepsilon}\right)^{\frac{3\kappa}{32} + 1 + \frac{2}{\kappa}}.$$
In particular, the expectation dimension of the gasket is $\frac{3\kappa}{32} + 1 + \frac{2}{\kappa}$.

Here, ≍ denotes equivalence up to multiplicative constants. Lawler, Schramm, and Werner [LSW02] studied the percolation gasket (associated with CLE6), effectively proving Theorem 2 in the case κ = 6. More generally, they studied how long it takes for a radial SLEκ to surround the origin when κ > 4, and their results implicitly yield Theorem 2 when κ > 4; see the remark in Sect. 2.1 for further discussion.

We conclude our introduction by noting that the gasket dimension described above plays an important role in the physics literature, where it is related (at least heuristically) to the exponents of magnetization and multipoint correlation functions in critical lattice models. We briefly describe this connection in the case of the q-state Potts model on the square lattice. More details and references are found in [She06,Car07,Gri06].
A sample from the q-state Potts model on a connected planar graph G is a random function σ : V → {1, 2, ..., q}, where V is the set of vertices of G and the image values 1, 2, ..., q are often called spins. If the boundary vertices (those on the boundary of the unbounded face) of G are all assigned a particular value (say b), then using the standard FK random cluster decomposition [FK72], one may construct a sample from the Potts model as follows:

1. Sample a random subgraph G′ of G containing all boundary edges (edges on the boundary of the unbounded face), with probability proportional to
$$\left(\frac{p}{1 - p}\right)^{\#\text{ edges of } G'} q^{\#\text{ components of } G'},$$
where 0 < p < 1 is a parameter. The law of G′ is called the FK random cluster model with parameters p and q. Call the component of G′ which contains the boundary vertices of G the FK gasket.
2. Set σ(v) = b for each v in the FK gasket, and independently assign one of the q states uniformly at random to each of the remaining connected components of G′ (assigning all vertices in that component the corresponding state).

The “magnetization” at an interior vertex v of G (i.e., the probability that σ(v) = j minus 1/q) is proportional to the probability that v is in the FK gasket. Given distinct vertices v and w, the covariance of σ(v) and σ(w) is proportional to the probability that both v and w lie in the same component of G′.

We now restrict to the case in which G is a finite piece of the square grid in the plane and the parameter p satisfies $p/(1 - p) = \sqrt{q}$. (With this choice of p, the FK model is self-dual and believed to be critical, see e.g., [Gri06, Chap. 6].) It is shown in [She06] that if q ≤ 4 and certain other hypotheses including conformal invariance hold, then the scaling limit of the set of boundaries between clusters and dual clusters in the critical FK random cluster models discussed above must be given by CLEκ for the κ satisfying $q = 4\cos^2(4\pi/\kappa)$ and 4 ≤ κ ≤ 8.
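The quantities in this construction are easy to compute for small graphs. The following Python sketch evaluates the unnormalized FK weight of a given edge set, the self-dual value of p, and the κ associated with q through $q = 4\cos^2(4\pi/\kappa)$ on 4 ≤ κ ≤ 8. All function names are hypothetical, and the sketch assumes that G′ contains every vertex of G, so that isolated vertices count as components.

```python
import math


def count_components(num_vertices, edges):
    """Number of connected components of the graph (vertices 0..n-1, edges)."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return sum(1 for v in range(num_vertices) if find(v) == v)


def fk_weight(num_vertices, open_edges, p, q):
    """Unnormalized weight (p / (1 - p))^(#edges) * q^(#components)."""
    components = count_components(num_vertices, open_edges)
    return (p / (1 - p)) ** len(open_edges) * q ** components


def self_dual_p(q):
    """The p with p / (1 - p) = sqrt(q)."""
    root = math.sqrt(q)
    return root / (1 + root)


def kappa_for_q(q):
    """The kappa in [4, 8] with q = 4 cos^2(4 pi / kappa), for 0 <= q <= 4."""
    # On [4, 8] the angle 4 pi / kappa lies in [pi/2, pi], so its cosine is <= 0.
    return 4 * math.pi / math.acos(-math.sqrt(q) / 2)
```

With these definitions, q = 1 (percolation) corresponds to κ = 6 and q = 4 to κ = 4, matching n = √q in (1).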
Assuming these hypotheses, the scaling limit of the discrete gasket is precisely the continuum CLEκ gasket. A heuristic ansatz is that the law of the critical FK gasket should have properties similar to those of the law of the set of squares in a fine grid which intersect the continuum gasket. If this heuristic holds, then when G is a bounded domain intersected with a square grid with spacing ε, the magnetization at a vertex v of macroscopic distance from the boundary in the discrete model will be on the order of $\varepsilon^{2-d}$, where d is the limiting expectation dimension of the continuum gasket. Similarly, the covariance between σ(v) and σ(w), for two macroscopically separated vertices v and w, should be on the order of $\varepsilon^{2(2-d)}$ (since in the continuum model, the set of pairs v and w which lie in the same continuum spin cluster has dimension 2d; see [She06]).

2. Diffusions and Martingales

2.1. Reduction to a diffusion problem. Let $B_t : [0, \infty) \to \mathbb{R}$ be a standard Brownian motion and let $\theta_t : [0, \infty) \to [0, 2\pi]$ be a random continuous process on the interval [0, 2π] that is instantaneously reflecting at its endpoints (i.e., the set $\{t : \theta_t \in \{0, 2\pi\}\}$ has Lebesgue measure zero almost surely) and evolves according to the SDE
$$d\theta_t = \frac{\kappa - 4}{2}\cot(\theta_t/2)\,dt + \sqrt{\kappa}\,dB_t \tag{6}$$
on each interval of time for which $\theta_t \notin \{0, 2\pi\}$. In other words, $\theta_t$ is a random continuous process adapted to the filtration of $B_t$ which almost surely satisfies
$$\frac{\partial}{\partial t}\left(\theta_t - \sqrt{\kappa}\,B_t\right) = \frac{\kappa - 4}{2}\cot(\theta_t/2)$$
for all t for which the right hand side is well defined. The law of this process is uniquely determined by $\theta_0$ [She06], and we also have the following from [She06]:

Proposition 2. When 8/3 < κ < 8, the law of $B_k^z$ is the same as the law of $\inf\{t : \theta_t = 2\pi\}$ for the diffusion (6) started at $\theta_0 = 0$.

It is convenient to lift the process $\theta_t$ so that, rather than taking values in [0, 2π], it takes values in all of $\mathbb{R}$. Let $R : \mathbb{R} \to [0, 2\pi]$ be the piecewise affine map for which R(x) = |x| when x ∈ [−2π, 2π] and R(4π + x) = R(x) for all x. Given $\theta_t$, we can generate a continuous process $\tilde\theta_t$ with $R(\tilde\theta_t) = \theta_t$ in such a way that for each component $(t_1, t_2)$ of the set $\{t : \theta_t \notin 2\pi\mathbb{Z}\}$, we independently toss a fair coin to decide whether $\tilde\theta_t > \tilde\theta_{t_1}$ or $\tilde\theta_t < \tilde\theta_{t_1}$ on that component. The $\theta_t$ together with these coin tosses (for each interval of $\{t : \theta_t \notin 2\pi\mathbb{Z}\}$) determine $\tilde\theta_t$ uniquely. This $\tilde\theta_t$ is still a solution to (6) provided we modify $B_t$ in such a way that $dB_t$ is replaced with $-dB_t$ on those intervals for which $d\tilde\theta_t = -d\theta_t$. (This modification does not change the law of $B_t$.) In the remainder of the text, we will drop the $\tilde\theta_t$ notation and write $\theta_t$ for the lifted process on $\mathbb{R}$.

Remark. A very similar diffusion process was studied by Lawler, Schramm, and Werner [LSW02], namely
$$d\theta_t = \cot(\theta_t/2)\,dt + \sqrt{\kappa}\,dB_t, \tag{7}$$
which is the same as (6) but without the factor of (κ − 4)/2, and they too studied the time for the diffusion to reach θt = 2π when started at θ0 = 0. Diffusions (6) and (7) are identical when κ = 6, and Diffusion (6) (for 4 < κ < 8) is given by Diffusion (7) (for 4 < κ < ∞) upon substituting κ → (2κ)/(κ − 4) and scaling time by t → t(κ − 4)/2. However, Diffusion (6) is a more singular Bessel-type process when κ ≤ 4, requiring additional technical analysis to deal with the times at which the process is at 0 (see e.g. Lemma 3). Furthermore, only the large-time asymptotic decay rate of the hitting time distribution (used in the proof of Theorem 2) is given in [LSW02], and additional effort is required to obtain the precise hitting time distribution provided in Theorem 1.
2.2. Local martingales. Recall the hypergeometric function defined by
$$F(a, b; c; z) = \sum_{n=0}^{\infty}\frac{(a)_n\,(b)_n}{(c)_n\, n!}\, z^n,$$
where a, b, c ∈ ℂ are parameters, $c \notin -\mathbb{N}$ (where ℕ = {0, 1, 2, ...}), and $(\ell)_n$ denotes $\ell(\ell + 1)\cdots(\ell + n - 1)$. This definition holds for z ∈ ℂ when |z| < 1, and it may be
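defined by analytic continuation elsewhere (though it is then not always single-valued). We define for λ ∈ ℂ,
$$M^e_{\kappa,\lambda}(\theta) = F\left(1 - \tfrac{4}{\kappa} + \sqrt{\left(1 - \tfrac{4}{\kappa}\right)^2 + \tfrac{8\lambda}{\kappa}},\; 1 - \tfrac{4}{\kappa} - \sqrt{\left(1 - \tfrac{4}{\kappa}\right)^2 + \tfrac{8\lambda}{\kappa}};\; \tfrac{3}{2} - \tfrac{4}{\kappa};\; \sin^2\tfrac{\theta}{4}\right),$$
$$M^o_{\kappa,\lambda}(\theta) = F\left(1 - \tfrac{2}{\kappa} + \sqrt{\left(\tfrac{1}{2} - \tfrac{2}{\kappa}\right)^2 + \tfrac{2\lambda}{\kappa}},\; 1 - \tfrac{2}{\kappa} - \sqrt{\left(\tfrac{1}{2} - \tfrac{2}{\kappa}\right)^2 + \tfrac{2\lambda}{\kappa}};\; \tfrac{3}{2};\; \cos^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}$$
(where the formula for $M^e_{\kappa,\lambda}(\theta)$ makes sense whenever κ ≠ 8/3, 8/5, 8/7, ...). There is some ambiguity in the choice of square root, but since F(b, a; c; z) = F(a, b; c; z), as long as the same choice of square root is made for both occurrences, there is no ambiguity in these definitions.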
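Both functions can be evaluated directly from the defining power series wherever its argument has modulus below 1. The Python sketch below is an illustration only: the names `hyp2f1`, `m_even`, `m_odd` and `generator_residual` are hypothetical, the series is simply truncated after a fixed number of terms, and derivatives are approximated by central differences. The last function estimates the expression $\lambda M + \frac{\kappa-4}{2}\cot(\theta/2)M' + \frac{\kappa}{2}M''$, which has to vanish for $e^{\lambda t}M(\theta_t)$ to be a local martingale for the diffusion (6).

```python
import cmath
import math


def hyp2f1(a, b, c, z, terms=400):
    """Truncated Gauss series for F(a, b; c; z); only meant for |z| < 1."""
    total = 0.0
    term = 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total


def m_even(kappa, lam, theta):
    root = cmath.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    a = 1 - 4 / kappa + root
    b = 1 - 4 / kappa - root
    return hyp2f1(a, b, 1.5 - 4 / kappa, math.sin(theta / 4) ** 2)


def m_odd(kappa, lam, theta):
    root = cmath.sqrt((0.5 - 2 / kappa) ** 2 + 2 * lam / kappa)
    a = 1 - 2 / kappa + root
    b = 1 - 2 / kappa - root
    u = math.cos(theta / 2)
    return hyp2f1(a, b, 1.5, u * u) * u


def generator_residual(m, kappa, lam, theta, h=1e-3):
    """lam*M + (kappa-4)/2 * cot(theta/2) * M' + kappa/2 * M'' by central differences."""
    center = m(kappa, lam, theta)
    plus = m(kappa, lam, theta + h)
    minus = m(kappa, lam, theta - h)
    first = (plus - minus) / (2 * h)
    second = (plus - 2 * center + minus) / (h * h)
    cot = math.cos(theta / 2) / math.sin(theta / 2)
    return lam * center + (kappa - 4) / 2 * cot * first + kappa / 2 * second
```

At λ = 0 one of the first two parameters of $M^e_{\kappa,\lambda}$ is zero, so `m_even(kappa, 0.0, theta)` returns 1 up to rounding, and the residual should be small (limited by the finite-difference step) for both functions away from θ ∈ 2πℤ.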
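Lemma 1. For the diffusion (6) with κ > 0, let T be the first time at which $\theta_t \in 2\pi\mathbb{Z}$, and let $\bar t = \min(t, T)$. For any λ ∈ ℂ, both $\exp[\lambda\bar t]\,M^e_{\kappa,\lambda}(\theta_{\bar t})$ and $\exp[\lambda\bar t]\,M^o_{\kappa,\lambda}(\theta_{\bar t})$ are local martingales parameterized by t.

Proof. Given these formulas, in principle it is straightforward to verify that the dt term of the Itô expansion of
$$d\left(e^{\lambda t} M^{e|o}_{\kappa,\lambda}(\theta_t)\right)$$
is equal to zero (where e|o is either e or o). This term can be expressed as
$$e^{\lambda t}\left[\lambda M^{e|o}_{\kappa,\lambda}(\theta_t) + \frac{\kappa - 4}{2}\cot(\theta_t/2)\,{M^{e|o}_{\kappa,\lambda}}'(\theta_t) + \frac{\kappa}{2}\,{M^{e|o}_{\kappa,\lambda}}''(\theta_t)\right]dt.$$
Since Mathematica does not simplify this to zero, we show how to do this for $M^e_{\kappa,\lambda}$; the case for $M^o_{\kappa,\lambda}$ is similar.

We abbreviate $M^e_{\kappa,\lambda}$ with M, let F denote the hypergeometric function in the definition of $M^e_{\kappa,\lambda}$, let a, b, and c denote the parameters of F in M, and change variables to $y = y(\theta) = \sin^2(\theta/4) = (1 - \cos(\theta/2))/2$:
$$M(\theta) = F(y(\theta)),$$
$$M'(\theta) = \tfrac{1}{4}F'(y(\theta))\sin(\theta/2),$$
$$M''(\theta) = \tfrac{1}{16}F''(y(\theta))\sin^2(\theta/2) + \tfrac{1}{8}F'(y(\theta))\cos(\theta/2) = F''(y)\,\frac{y - y^2}{4} + F'(y)\,\frac{\tfrac{1}{2} - y}{4},$$
so that the coefficient of dt is
$$e^{\lambda t}\left[\lambda F(y) - \frac{\kappa - 4}{4}F'(y)\left(y - \tfrac{1}{2}\right) + \frac{\kappa}{8}F''(y)\left(y - y^2\right) + \frac{\kappa}{8}F'(y)\left(\tfrac{1}{2} - y\right)\right]$$
$$= e^{\lambda t}\left[\lambda F(y) + (1 - 3\kappa/8)\,F'(y)\left(y - \tfrac{1}{2}\right) + \frac{\kappa}{8}F''(y)\left(y - y^2\right)\right]$$
$$= e^{\lambda t}\sum_{n=0}^{\infty}\frac{(a)_n (b)_n}{(c)_n\, n!}\, y^n\, \underbrace{\left[\lambda + (1 - 3\kappa/8)\left(n - \frac{1}{2}\,\frac{(a+n)(b+n)}{c+n}\right) + \frac{\kappa}{8}\left(n\,\frac{(a+n)(b+n)}{c+n} - n(n-1)\right)\right]}_{E_n/(c+n)}$$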
and we define $E_n$ to be c + n times the expression in brackets. (Note that indeed $c \notin -\mathbb{N}$.) We may write $E_n$ as a polynomial in n:
$$E_n = \left[(1 - 3\kappa/8)(1 - 1/2) + (\kappa/8)(1 - c + a + b)\right] n^2 + \left[\lambda + (1 - 3\kappa/8)(c - a/2 - b/2) + (\kappa/8)(c + ab)\right] n + \left[c\lambda - (1 - 3\kappa/8)\,ab/2\right].$$
By our choices of a, b and c, $E_n = 0$ for each n, which proves the claim for $M^e_{\kappa,\lambda}$.
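The vanishing of $E_n$ amounts to the vanishing of its three coefficients as a polynomial in n, which is quick to confirm numerically. The Python sketch below plugs the parameters a, b, c of $M^e_{\kappa,\lambda}$ into those coefficients; the function name is illustrative, and the check is in floating point for sample values of κ and complex λ, not a symbolic proof.

```python
import cmath


def e_n_coefficients(kappa, lam):
    """Coefficients of n^2, n^1 and n^0 in E_n for the parameters of M^e."""
    root = cmath.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    a = 1 - 4 / kappa + root
    b = 1 - 4 / kappa - root
    c = 1.5 - 4 / kappa
    w = 1 - 3 * kappa / 8
    quadratic = w * (1 - 0.5) + (kappa / 8) * (1 - c + a + b)
    linear = lam + w * (c - a / 2 - b / 2) + (kappa / 8) * (c + a * b)
    constant = c * lam - w * a * b / 2
    return quadratic, linear, constant
```

All three returned values should be zero up to rounding error, reflecting a + b = 2 − 8/κ, ab = −8λ/κ and c = 3/2 − 4/κ.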
2.3. Expected first hitting of 2πℤ. In this subsection we obtain asymptotics for the function $L(\theta) := \mathbb{E}[\theta_T \mid \theta_0 = \theta]$, where $\theta_t$ is the diffusion (6) and T is the first time t ≥ 0 at which $\theta_t \in 2\pi\mathbb{Z}$. (Recall that T is finite a.s. when κ < 8.) Whenever θ ∈ 2πℤ, trivially L(θ) = θ.

Lemma 2. For the diffusion (6) with 8/3 < κ < 8, $L(\theta_t)$ is a martingale.

Proof. Since L(θ) is defined in terms of expected values, $L(\theta_t)$ is a local martingale whenever $\theta_t \notin 2\pi\mathbb{Z}$, and the stopped process $L(\theta_{\min(t,T)})$ is a martingale. Since the diffusion behaves symmetrically around the points 2πℤ and the number of intervals of $\mathbb{R} \setminus 2\pi\mathbb{Z}$ crossed before some deterministic time has exponentially decaying tails (which implies integrability), $L(\theta_t)$ is a martingale.
Next, we express $L(\theta_t)$ in terms of the λ = 0 case of the local martingales $\exp[\lambda t]\,M^e_{\kappa,\lambda}(\theta_t)$ and $\exp[\lambda t]\,M^o_{\kappa,\lambda}(\theta_t)$. Because $M^e_{\kappa,0}(\theta) = 1$, this local martingale is uninformative, but
$$M^o_{\kappa,0}(\theta) = F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2}; \cos^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}.$$
We have $M^o_{\kappa,0}(0) = -M^o_{\kappa,0}(2\pi) = \sqrt{\pi}\,\Gamma(4/\kappa - 1/2)/(2\Gamma(4/\kappa))$, and since $M^o_{\kappa,0}$ is bounded (when κ ≠ 8), the stopped process $M^o_{\kappa,0}(\theta_{\min(t,T)})$ is also a martingale. This determines L, namely,
$$L(\theta) = \pi - \frac{2\sqrt{\pi}\,\Gamma\left(\frac{4}{\kappa}\right)}{\Gamma\left(\frac{4}{\kappa} - \frac{1}{2}\right)}\,F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2}; \cos^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}, \qquad \theta \in [0, 2\pi] \tag{8}$$
and L(θ + 2π) = L(θ) + 2π. (It is also possible to derive (8) from the formula for Pr[SLE trace passes to left of x + iy] [Sch01] after applying a Möbius transformation and suitable hypergeometric identities.) We wish to understand the behavior of L near the points in 2πℤ, and to this end we use the formula (see [EMOT53, p. 108, Eq. 2.10.1])
$$F(a, b; c; z) = \frac{\Gamma(c)\Gamma(c - a - b)}{\Gamma(c - a)\Gamma(c - b)}\,F(a, b; a + b - c + 1; 1 - z) + (1 - z)^{c-a-b}\,\frac{\Gamma(c)\Gamma(a + b - c)}{\Gamma(a)\Gamma(b)}\,F(c - a, c - b; c - a - b + 1; 1 - z) \tag{9}$$
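which is valid when 1 − c, b − a, and c − b − a are not integers and |arg(1 − z)| < π. In our case the nonintegrality condition is satisfied whenever $8/\kappa \notin \mathbb{N}$. For the range of κ that we are interested in, this rules out κ = 4, for which we already know L(θ) = θ, and therefore do not need asymptotics. The endpoints of the range, κ = 8/3 and κ = 8, are also ruled out, but for the remaining κ's we have
$$F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2}; \cos^2\tfrac{\theta}{2}\right) = \frac{\Gamma\left(\frac{3}{2}\right)\Gamma\left(\frac{4}{\kappa} - \frac{1}{2}\right)}{\Gamma\left(\frac{4}{\kappa}\right)\Gamma(1)}\,F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2} - \tfrac{4}{\kappa}; \sin^2\tfrac{\theta}{2}\right) + \frac{\Gamma\left(\frac{3}{2}\right)\Gamma\left(\frac{1}{2} - \frac{4}{\kappa}\right)}{\Gamma\left(\frac{3}{2} - \frac{4}{\kappa}\right)\Gamma\left(\frac{1}{2}\right)}\left(\sin\tfrac{\theta}{2}\right)^{\frac{8}{\kappa} - 1} F\left(\tfrac{4}{\kappa}, 1; \tfrac{4}{\kappa} + \tfrac{1}{2}; \sin^2\tfrac{\theta}{2}\right)$$
$$= \frac{\sqrt{\pi}\,\Gamma\left(\frac{4}{\kappa} - \frac{1}{2}\right)}{2\,\Gamma\left(\frac{4}{\kappa}\right)}\,\frac{1}{\cos\frac{\theta}{2}} + \frac{\Gamma\left(\frac{1}{2} - \frac{4}{\kappa}\right)}{2\,\Gamma\left(\frac{3}{2} - \frac{4}{\kappa}\right)}\left(\sin\tfrac{\theta}{2}\right)^{\frac{8}{\kappa} - 1} F\left(\tfrac{4}{\kappa}, 1; \tfrac{4}{\kappa} + \tfrac{1}{2}; \sin^2\tfrac{\theta}{2}\right),$$
and by (8),
$$L(\theta) = c_\kappa\left(\sin\tfrac{\theta}{2}\right)^{\frac{8}{\kappa} - 1} F\left(\tfrac{4}{\kappa}, 1; \tfrac{4}{\kappa} + \tfrac{1}{2}; \sin^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}, \qquad \theta \in (0, \pi),$$
for some constant $c_\kappa > 0$. (One can use the Legendre duplication formula to show that $c_\kappa = 2^{8/\kappa - 1}\,\Gamma(4/\kappa)^2/\Gamma(8/\kappa)$, but we do not need this.) Since L(−θ) = −L(θ), we conclude that
$$L(\theta) = A_0(\theta^2)\,|\theta|^{8/\kappa}/\theta, \qquad \theta \in (-\pi, \pi),$$
where $A_0$ is an analytic function (depending on κ) satisfying $A_0(0) > 0$. This implies
$$\theta^2 = A\left(|L(\theta)|^{2\kappa/(8-\kappa)}\right) \tag{10}$$
near θ = 0, for some analytic A.

2.4. Starting at θ0 = 0. We will eventually need to start the diffusion at θ0 = 0, but Lemma 1 only covers what happens up until the first time that $\theta_t \in 2\pi\mathbb{Z}$. In this subsection we show

Lemma 3. For the diffusion (6) with 8/3 < κ < 8, let T be the first time at which $\theta_t \in 2\pi\mathbb{Z} \setminus 4\pi\mathbb{Z}$, and let $\bar t = \min(t, T)$. For any λ ∈ ℂ, the process $\exp[\lambda\bar t]\,M^e_{\kappa,\lambda}(\theta_{\bar t})$ is a martingale.

Proof. Let M abbreviate $M^e_{\kappa,\lambda}$, and let us assume without loss of generality that −2π < θ0 < 2π. Let us define $\omega_t = L(\theta_t)$, which is a martingale and may be interpreted as a time-changed Brownian motion. We wish to argue that $e^{\lambda\bar t}M(L^{-1}(\omega_{\bar t})) = e^{\lambda\bar t}M(\theta_{\bar t})$ is a local martingale. (The definition of L implies that it is strictly monotone, and hence $L^{-1}$ is well defined.) Note that by Lemma 1 it is a local martingale when $\theta_{\bar t} \notin 2\pi\mathbb{Z}$. To extend this to a neighborhood of $\theta_{\bar t} = 0$, one could try to use Itô's formula. To do this, it would be necessary that $f := M \circ L^{-1}$ be twice differentiable. We have $M(\theta) = A_1(\theta^2)$ in (−2π, 2π), where $A_1$ is analytic. Consequently, (10) gives for 8/3 < κ < 8 and for ω in a neighborhood of 0,
$$f(\omega) = A_2\left(|\omega|^{2\kappa/(8-\kappa)}\right), \tag{11}$$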
for some analytic $A_2$. Though this is not necessary for the proof, one can check that $A_2'(0) \ne 0$ and therefore $f''(0)$ is not finite when κ < 4.

To circumvent the problem of $f = M \circ L^{-1}$ not being twice differentiable, we use the Itô-Tanaka Theorem ([RY99, Theorem 1.5 on p. 223]). The exponent $\frac{2\kappa}{8-\kappa}$ in (11) ranges from 1 to ∞ as κ ranges from 8/3 to 8. In particular, $f'(0) = 0$ and $f'$ is continuous near 0. Since $A_2$ is analytic, near 0 the function f may be expressed as the difference of two convex functions, namely, $f(\omega) = (f(\omega) - f(0))\,1_{\omega \ge 0} + (f(\omega) - f(0))\,1_{\omega \le 0} + f(0)$. Therefore, we may apply the Itô-Tanaka Theorem to conclude that $e^{\lambda\bar t} f(\omega_{\bar t}) = e^{\lambda\bar t} M(\theta_{\bar t})$ is a local martingale also when $\theta_{\bar t}$ is near zero.

Now, the hypergeometric function F satisfies
$$F(a, b; c; 1) = \frac{\Gamma(c)\Gamma(c - a - b)}{\Gamma(c - a)\Gamma(c - b)} \tag{12}$$
provided $-c \notin \mathbb{N}$ and Re c > Re(a + b) (see e.g. [EMOT53, p. 104, Eq. 46]). Therefore, M(±2π) is finite. Thus, $e^{\lambda\bar t}M(\theta_{\bar t})$ is bounded for bounded t, and we may conclude that it is a martingale.
e (±2 π ). Observe that the parameters For future reference, we now calculate Mκ,λ e satisfy 2 c − a − b = 1. a, b, c of the hypergeometric function in the definition of Mκ,λ Consequently, the identity (z)(1 − z) = π/sin(π z) and (12) give
sin e Mκ,λ (±2 π ) =
π 2
2 − π 1 − κ4 +
8λ κ
=
sin (3π/2 − 4π/κ)
cos π (1 − 4/κ)2 + 8λ/κ cos(π(1 − 4/κ))
. (13)
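The passage from (12) to (13) can be checked numerically with the standard library's gamma function. The Python sketch below evaluates the Gauss formula (12) at the parameters of $M^e_{\kappa,\lambda}$ and compares it with the closed form (13). It rests on assumptions beyond the text: the function names are hypothetical, λ is restricted to real values with $(1 - 4/\kappa)^2 + 8\lambda/\kappa \ge 0$ (since `math.gamma` only accepts real arguments), and the square root must not be a half-odd-integer, where a gamma factor has a pole.

```python
import math


def m_even_at_2pi_gauss(kappa, lam):
    """F(a, b; c; 1) from (12) with the parameters a, b, c of M^e."""
    root = math.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    a = 1 - 4 / kappa + root
    b = 1 - 4 / kappa - root
    c = 1.5 - 4 / kappa
    return (math.gamma(c) * math.gamma(c - a - b)
            / (math.gamma(c - a) * math.gamma(c - b)))


def m_even_at_2pi_closed(kappa, lam):
    """Right-hand side of (13)."""
    root = math.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    return math.cos(math.pi * root) / math.cos(math.pi * (1 - 4 / kappa))
```

Both functions return 1 at λ = 0, consistent with $M^e_{\kappa,0} = 1$.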
3. Proofs of Main Results

We now restate and prove Theorem 1.

Theorem 3. Suppose the diffusion process (6) (with 8/3 < κ < 8) is started at θ0 = 0, and T is the first time at which θt = ±2π. If Re λ ≤ 0, then
$$\mathbb{E}\left[e^{\lambda T} \mid \theta_0 = 0\right] = \frac{\cos(\pi(1 - 4/\kappa))}{\cos\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)}.$$
(This is equivalent to Theorem 1 by Proposition 2 and the remarks following the statement of Theorem 1.)

Proof. Since $M^e_{\kappa,\lambda}(\theta_T) = M^e_{\kappa,\lambda}(\pm 2\pi) = M^e_{\kappa,\lambda}(2\pi)$ a.s. and $\exp[\lambda\bar t]\,M^e_{\kappa,\lambda}(\theta_{\bar t})$ is a martingale, the optional sampling theorem gives
$$M^e_{\kappa,\lambda}(2\pi)\,\mathbb{E}\left[e^{\lambda T} \mid \theta_0 = 0\right] = \mathbb{E}\left[e^{\lambda T} M^e_{\kappa,\lambda}(\theta_T) \mid \theta_0 = 0\right] = M^e_{\kappa,\lambda}(0) = 1,$$
and the proof is completed by appeal to (13).
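Differentiating the formula of Theorem 3 at λ = 0 gives the mean of T, which can be compared with the Cardy-Ziff prediction (2) from the introduction. The Python sketch below does this with a central finite difference. The function names and the step size are illustrative assumptions, and the sketch evaluates the formula at a small positive λ as well, which relies on the right-hand side extending analytically slightly beyond Re λ ≤ 0. The value κ = 4 is excluded because (2) is a 0/0 expression there.

```python
import cmath
import math


def laplace_exit(kappa, lam):
    """Right-hand side of the formula in Theorem 3 for real lam."""
    root = cmath.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    return (math.cos(math.pi * (1 - 4 / kappa)) / cmath.cos(math.pi * root)).real


def mean_by_finite_difference(kappa, h=1e-5):
    """Approximate d/d(lam) of E[exp(lam * T)] at lam = 0."""
    return (laplace_exit(kappa, h) - laplace_exit(kappa, -h)) / (2 * h)


def mean_from_eq2(kappa):
    """E[B] as predicted by (2): the reciprocal of (kappa/4 - 1) cot(pi (1 - 4/kappa)) / pi."""
    cot = 1 / math.tan(math.pi * (1 - 4 / kappa))
    return math.pi / ((kappa / 4 - 1) * cot)
```

The two means should agree to within the finite-difference error for any κ in (8/3, 8) other than 4.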
Proof of Theorem 2. Fix some ε > 0 and let z ∈ D. Set $r_0 := 1 - |z|$, $r_1 := \mathrm{dist}(z, L_1^z)$, and suppose that $\varepsilon < r_0$. We seek to estimate the probability that the open disk of radius ε about z intersects the gasket; that is, the probability that $r_1 < \varepsilon$. By the Koebe 1/4 theorem, $r_1 \le \mathrm{CR}(D_1, z) \le 4 r_1$. Likewise, $r_0 \le \mathrm{CR}(D, z) \le 4 r_0$. Thus, $B_1^z = \log \mathrm{CR}(D, z) - \log \mathrm{CR}(D_1, z) = \log(r_0/r_1) + O(1)$. Referring to the density function of $B_k^z$ (4), we see that $\Pr[r_1 < \varepsilon] \asymp \exp[-\alpha\log(r_0/\varepsilon)] = (\varepsilon/r_0)^\alpha$, where
$$\alpha = \frac{1/4 - (1 - 4/\kappa)^2}{8/\kappa} = -\frac{3\kappa}{32} + 1 - \frac{2}{\kappa} = \frac{(8 - \kappa)(3\kappa - 8)}{32\kappa}.$$
For each j = 1, ..., ⌈1/ε⌉, we may cover the annulus {z : (j − 1)ε ≤ 1 − |z| ≤ jε} by O(1/ε) disks of radius ε. The total expected number of these disks that intersect the gasket is at most
$$\sum_{j=1}^{\lceil 1/\varepsilon\rceil} O(1/\varepsilon) \times O(\varepsilon/(j\varepsilon))^{\alpha} = O(\varepsilon^{\alpha - 2}).$$
(Here we made use of the fact that α < 1.) Thus on average $O(\varepsilon^{\alpha-2})$ disks of radius ε suffice to cover the gasket.

On the other hand, we may pack into D at least $\Theta(1/\varepsilon^2)$ points so that every two of them are more than distance 4ε apart, and each of them is at least distance 1/2 from the boundary. For each such point z there is a $\Theta(\varepsilon^\alpha)$ chance that the disk of radius ε centered at z is not surrounded by a loop, i.e., that the gasket contains a point z′ that is within distance ε of z. Since the points z are sufficiently far apart, the points z′ must be covered by distinct disks in any covering of the gasket by disks of radius ε. Thus the expected number of disks of radius ε required to cover the gasket is at least $\Theta(\varepsilon^{\alpha-2})$.
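The three expressions for the exponent α in the proof, and the resulting gasket dimension 2 − α = 3κ/32 + 1 + 2/κ, can be confirmed in exact rational arithmetic. The Python sketch below uses hypothetical function names and assumes κ is passed as a `Fraction`.

```python
from fractions import Fraction


def alpha_from_density(kappa):
    """(1/4 - (1 - 4/kappa)^2) / (8/kappa), the j = 0 decay rate in (4)."""
    return (Fraction(1, 4) - (1 - 4 / kappa) ** 2) / (8 / kappa)


def alpha_expanded(kappa):
    return -3 * kappa / 32 + 1 - 2 / kappa


def alpha_factored(kappa):
    return (8 - kappa) * (3 * kappa - 8) / (32 * kappa)


def gasket_dimension(kappa):
    """Expectation dimension 2 - alpha from Theorem 2."""
    return 2 - alpha_factored(kappa)
```

For κ = 6 this gives α = 5/48 and dimension 91/48, and for κ = 4 it gives dimension 15/8.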
4. Open Problems

Kenyon and Wilson [KW04] also predicted the large-k limiting distribution of another quantity, the “electrical thickness” of the loops $L_k^z$ when k → ∞. The electrical thickness of a loop compares the conformal radius of the loop to the conformal radius of the image of the loop under the map m(w) = 1/(w − z), and more precisely it is
$$\vartheta_z(L_k^z) = -\log \mathrm{CR}(L_k^z, z) - \log \mathrm{CR}(m(L_k^z), z).$$
Kenyon and Wilson [KW04] predicted that the large-k moment generating function of $\vartheta_z(L_k^z)$ is
$$\lim_{k\to\infty}\mathbb{E}[\exp(\lambda\vartheta_z(L_k^z))] = \frac{\sin(\pi(1 - 4/\kappa))}{\pi(1 - 4/\kappa)}\cdot\frac{\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}}{\sin\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)}, \tag{14}$$
or equivalently that the limiting probability density function is given by the density function of the exit time of a standard Brownian excursion started in the middle of the interval $(-2\pi/\sqrt{\kappa},\, 2\pi/\sqrt{\kappa})$, reweighted by a factor of const × exp[(κ − 4)²x/(8κ)]. (This equivalence follows from [BS02, Eq. 5.3.0.1].) Recall that the density function
of $B_k^z$ is given by the density function of the exit time of a standard Brownian motion started in the middle of the interval $(-2\pi/\sqrt{\kappa},\, 2\pi/\sqrt{\kappa})$, also reweighted by a factor of const × exp[(κ − 4)²x/(8κ)]. These forms are highly suggestive, but currently we do not know how to calculate the electrical thickness using CLEκ, nor do we have a conceptual explanation for why these distributions take these forms.

References

[BS02] Borodin, A.N., Salminen, P.: Handbook of Brownian Motion - Facts and Formulae. Probability and its Applications. Basel: Birkhäuser Verlag, 2nd edition, 2002
[Car07] Cardy, J.: ADE and SLE. J. Phys. A 40(7), 1427–1438 (2007)
[CN06] Camia, F., Newman, C.M.: Two-dimensional critical percolation: the full scaling limit. Commun. Math. Phys. 268(1), 1–38 (2006)
[CN07] Camia, F., Newman, C.M.: Critical percolation exploration path and SLE6: a proof of convergence. Probab. Theory Related Fields 139(3-4), 473–519 (2007)
[CT62] Ciesielski, Z., Taylor, S.J.: First passage times and sojourn times for Brownian motion in space and the exact Hausdorff measure of the sample path. Trans. Amer. Math. Soc. 103(3), 434–450 (1962)
[CZ03] Cardy, J., Ziff, R.M.: Exact results for the universal area distribution of clusters in percolation, Ising, and Potts models. J. Stat. Phys. 110(1-2), 1–33 (2003)
[Dub05] Dubédat, J.: 2005, Personal communication
[Dup90] Duplantier, B.: Exact fractal area of two-dimensional vesicles. Phys. Rev. Lett. 64(4), 493 (1990)
[EMOT53] Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher Transcendental Functions. Vol. I. New York: McGraw-Hill Book Company, 1953, based, in part, on notes left by Harry Bateman
[FK72] Fortuin, C.M., Kasteleyn, P.W.: On the random-cluster model. I. Introduction and relation to other models. Physica 57, 536–564 (1972)
[Gri06] Grimmett, G.: The Random-Cluster Model, Volume 333 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 2006
[KN04] Kager, W., Nienhuis, B.: A guide to stochastic Löwner evolution and its applications. J. Stat. Phys. 115(5-6), 1149–1229 (2004)
[KW04] Kenyon, R.W., Wilson, D.B.: Conformal radii of loop models, 2004. Manuscript
[LSW02] Lawler, G.F., Schramm, O., Werner, W.: One-arm exponent for critical 2D percolation. Electron. J. Probab. 7: Paper No. 2, 13 pp. (2002)
[RY99] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, Volume 293 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, third edition, 1999
[Sch01] Schramm, O.: A percolation formula. Electron. Comm. Probab. 6, 115–120 (2001)
[She06] Sheffield, S.: Exploration trees and conformal loop ensembles. http://arxiv.org/abs/math.PR/0609167, 2006. Duke Math. J., to appear
[Smi01] Smirnov, S.: Critical percolation in the plane: conformal invariance, Cardy’s formula, scaling limits. C. R. Acad. Sci. Paris Sér. I Math. 333(3), 239–244 (2001)
[SW08a] Sheffield, S., Werner, W.: Conformal loop ensembles: Construction via loop-soups, 2008, in preparation
[SW08b] Sheffield, S., Werner, W.: Conformal loop ensembles: The Markovian characterization, 2008, in preparation
Communicated by M. Aizenman
Commun. Math. Phys. 288, 55–96 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0767-7
Communications in
Mathematical Physics
A Groupoid Approach to Noncommutative T-Duality Calder Daenzer University of California at Berkeley, 970 Evans Hall, Berkeley, CA 94720-3840, USA. E-mail:
[email protected] Received: 6 October 2007 / Accepted: 15 December 2008 Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract: Topological T-duality is a transformation taking a gerbe on a principal torus bundle to a gerbe on a principal dual-torus bundle. We give a new geometric construction of T-dualization, which allows the duality to be extended in the following two directions. First, bundles of groups other than tori, even bundles of some nonabelian groups, can be dualized. Second, bundles whose duals are families of noncommutative groups (in the sense of noncommutative geometry) can be treated, though in this case the base space of the bundles is best viewed as a topological stack. Some methods developed for the construction may be of independent interest. These are a Pontryagin type duality that interchanges commutative principal bundles with gerbes, a nonabelian Takai type duality for groupoids, and the computation of certain equivariant Brauer groups.

Contents
1. Introduction
2. Groupoids and G-Groupoids
3. Modules and Morita Equivalence for Groupoids
4. Some Relevant Examples of Groupoids and Morita Equivalences
5. Groupoid Algebras, K-Theory, and Strong Morita Equivalence
6. Equivariant Groupoid Cohomology
7. Gerbes and Twisted Groupoids
8. Pontryagin Duality for Generalized Principal Bundles
9. Twisted Morita Equivalence
10. Some Facts about Groupoid Cohomology
11. Generalized Mackey-Rieffel Imprimitivity
12. Classical T-Duality
The research reported here was supported in part by National Science Foundation grants DMS-0703718 and DMS-0611653.
13. Nonabelian Takai Duality
14. Nonabelian Noncommutative T-Duality
15. The Equivariant Brauer Group
16. Conclusion
A. Connection with the Mathai-Rosenberg Approach
1. Introduction

A principal torus bundle with U(1)-gerbe and a principal dual-torus bundle¹ with U(1)-gerbe are said to be topologically T-dual when there is an isomorphism between the twisted K-theory groups of the two bundles, where the “twisting” of the K-groups is determined by the gerbes on the two bundles.

¹ If a torus is written V/Λ, where V is a real vector space and Λ a full rank lattice, then its dual is the torus $\hat\Lambda := \mathrm{Hom}(\Lambda, U(1))$.

The original motivation for the study of T-duality comes from theoretical physics, where it describes several phenomena and is by now a fundamental concept. For example T-duality provides a duality between type IIa and type IIb string theory and a duality on type I string theory (see e.g. [Pol]), and it provides an interpretation of a certain sector of mirror symmetry on Calabi-Yau manifolds (see [SYZ]).

There have been several approaches to constructing T-dual pairs, each with their particular successes. For example Bunke, Rumpf and Schick ([BRS]) have given a description using algebraic topology methods which realizes the duality functorially. This method is very successful for cases in which T-duals exist as commutative spaces, and has recently been extended ([BSST]) to abelian groups other than tori. In complex algebraic geometry, T-duality is effected by the Fourier-Mukai transform; in that context duals to certain toric fibrations with singular fibers (so they are not principal bundles) can be constructed (e.g. [DP,BBP]). Mathai and Rosenberg have constructed T-dual pairs using C*-algebra methods, and with these methods arrived at the remarkable discovery that in certain situations one side of the duality must be a family of noncommutative tori ([MR]).

In this paper we propose yet another construction of T-dual pairs, which can be thought of as a construction of the geometric duality that underlies the C*-algebra duality of the Mathai-Rosenberg approach. To validate the introduction of yet another T-duality construction, let us immediately list some of the new results which it affords. Any of the following language which is not standard will be reviewed in the body of the paper.

• Duality for groups other than tori can be treated, even groups which are not abelian. More precisely, if N is a closed normal subgroup of a Lie group G, then the dual of any G/N-bundle P → X with U(1)-gerbe can be constructed as long as the gerbe is “equivariant” with respect to the translation action of G on the G/N-bundle. The dual is found to be an N-gerbe over X, with a U(1)-gerbe on it. The precise sense in which it is a duality is given by what we call nonabelian Takai duality for groupoids, which essentially gives a way of returning from the dual side to something canonically Morita equivalent to its predual. A twisted K-theory isomorphism is not necessary for there to be a nonabelian Takai duality between the two objects, though we show that there is nonetheless a K-isomorphism whenever G is a simply connected solvable Lie group.
• Duality can be treated for torus bundles (or more generally G/N-bundles, as above) whose base is a topological stack rather than a topological space. Such a generalization is found to be crucial for the understanding of duals which are noncommutative (in the sense of noncommutative geometry).
• We find new structure in noncommutative T-duals. For example in the case of principal T-bundles, where $T = V/\Lambda \simeq \mathbb{R}^n/\mathbb{Z}^n$ is a torus, we find that a noncommutative T-dual is in fact a deformation of a Λ-gerbe over the base space X. The Λ-gerbe will be given explicitly, as will be the 2-cocycle giving the deformation, and when we restrict the Λ-gerbe to a point m ∈ X, so that we are looking at what in the classical case would be a single dual-torus fiber, the (deformed) Λ-gerbe is presented by a groupoid with twisting 2-cocycle, whose associated twisted groupoid algebra is a noncommutative torus. Thus the twisted C*-algebra corresponding to a groupoid presentation of the deformed Λ-gerbe is a family of noncommutative tori, which matches the result of the Mathai-Rosenberg approach, but we now have an understanding of the “global” structure of this object, which one might say is that of a Λ-gerbe fibred in noncommutative dual tori. Another benefit to our setup is that a cohomological classification of noncommutative duals is available, given by groupoid (or stack) cohomology.
• Groupoid presentations are compatible with extra geometric structure such as smooth, complex or symplectic structure. This will allow, in particular, for the connection between topological T-duality and the complex T-duality of [DP] and [BBP] to be made precise. We will begin an investigation of this and possible applications to noncommutative homological mirror symmetry in a forthcoming paper with Jonathan Block [BD].

Let us now give a brief outline of our T-dualization construction.
For the outline to be intelligible, the reader should be familiar with groupoids and gerbes or else should browse Sects. (2)–(4) and (7). Let $N$ be a closed normal subgroup of a locally compact group $G$, let $P \to X$ be a principal $G/N$-bundle over a space $X$, and suppose we are given a Čech 2-cocycle $\sigma$ on $P$ with coefficients in the sheaf of $U(1)$-valued functions. It is a classic fact that $\sigma$ determines a $U(1)$-gerbe on $P$, and that such gerbes are classified by the Čech cohomology class of $\sigma$, written $[\sigma] \in \check{H}^2(P; U(1))$. So $\sigma$ represents the gerbe data. (The case which has been studied in the past is $G \cong \mathbb{R}^n$ and $N \cong \mathbb{Z}^n$, so that $P$ is a torus bundle.) Given this data $(P, [\sigma])$ of a principal $G/N$-bundle $P$ with $U(1)$-gerbe, we construct a T-dual according to the following prescription:
1. Choose a lift of $[\sigma] \in \check{H}^2(P; U(1))$ to a 2-cocycle $[\tilde\sigma] \in \check{H}^2_G(P; U(1))$ in $G$-equivariant Čech cohomology (see Sect. (6)). If no lift exists there is no T-dual in our framework.
2. From the lift $\tilde\sigma$, define a new gerbe as follows. By definition, $\tilde\sigma$ will be realized as a $G$-equivariant 2-cocycle in the groupoid cohomology of some groupoid presentation $\mathcal{G}(P)$ of $P$ (see Example (3)). Because $\tilde\sigma$ is $G$-equivariant, it can be interpreted as a 2-cocycle on the crossed product groupoid $G \ltimes \mathcal{G}(P)$ for the translation action of $G$ on $\mathcal{G}(P)$ (see Example (2)). Thus $\tilde\sigma$ determines a $U(1)$-gerbe on the groupoid $G \ltimes \mathcal{G}(P)$.
3. The crossed product groupoid $G \ltimes \mathcal{G}(P)$ is shown to present an $N$-gerbe over $X$, so $\tilde\sigma$ is interpreted as the data for a $U(1)$-gerbe on this $N$-gerbe. This $U(1)$-gerbe on an $N$-gerbe can be viewed as the T-dual (there will be ample motivation for this). We construct a canonical induction procedure, nonabelian Takai duality (see Sect. (13)), that recovers the data $(P, \tilde\sigma)$ from this T-dual.
4. In the special case that $N$ is abelian and $\tilde\sigma$ has a vanishing “Mackey obstruction”, we construct a “Pontryagin dual” of the $U(1)$-gerbe over the $N$-gerbe of Step (3). This dual object is a principal $G/N^{\mathrm{dual}}$-bundle with $U(1)$-gerbe, where $G/N^{\mathrm{dual}} \equiv \widehat{N} := \mathrm{Hom}(N; U(1))$ is the Pontryagin dual.$^2$ Thus in this special case we arrive at a classical T-dual, which is a principal $G/N^{\mathrm{dual}}$-bundle with $U(1)$-gerbe, and the other cases in which we cannot proceed past Step (3) are interpreted as noncommutative and nonabelian versions of classical T-duality.

The above steps are Morita invariant in the appropriate sense and can be translated into statements about stacks. Furthermore, they produce a unique dual object (up to Morita equivalence or isomorphism) once a lift $\tilde\sigma$ has been chosen. It should be noted, however, that neither uniqueness nor existence is an intrinsic feature of a T-dualization whose input data is only $(P, [\sigma])$. In fact, the different possible T-duals are parameterized by the fiber over $[\sigma]$ of the forgetful map $\check{H}^2_G(P; U(1)) \to \check{H}^2(P; U(1))$, which is in general neither injective nor surjective. In some cases the forgetful map is injective. For example this is true for 1-dimensional tori, and consequently T-duals of gerbes over principal circle bundles are unique.

At the core of our construction is the concept of dualizing by taking a crossed product for a group action. This concept was first applied by Jonathan Rosenberg and Mathai Varghese, albeit in a quite different setting than ours. We have included an Appendix which makes precise the connection between our approach and the approach presented in their paper [MR]. The role of Pontryagin duality in T-duality may have been first noticed by Arinkin and Beilinson, and some notes to this effect can be found in Arinkin’s appendix in [DP] (though this is in the very different setting of complex T-duality).
The idea from Arinkin’s appendix has recently been expanded upon in the topological setting in [BSST]. Our version of Pontryagin duality almost certainly coincides with these, though we arrived at it from a somewhat different perspective.

$^2$ For example when $N \cong \mathbb{Z}^n$, $\widehat{N}$ is the $n$-torus which is (by definition) dual to $\mathbb{R}^n/\mathbb{Z}^n$.

2. Groupoids and G-Groupoids

Let us fix notation and conventions for groupoids. A set theoretic groupoid is a small category $\mathcal{G}$ with all arrows invertible, written as follows:
$$\mathcal{G} := (\mathcal{G}_1 \overset{s,r}{\rightrightarrows} \mathcal{G}_0).$$
Here $\mathcal{G}_1$ is the set of arrows, $\mathcal{G}_0$ is the set of units (or objects), $s$ is the source map, and $r$ is the range map. The $n$-tuples of composable arrows will be denoted $\mathcal{G}_n$. Throughout the paper $\gamma$'s will be used to denote arrows in a groupoid unless otherwise noted. A topological groupoid is one whose arrows $\mathcal{G}_1$ and objects $\mathcal{G}_0$ are topological spaces and for which the structure maps (source, range, multiplication, and inversion) are continuous. A left Haar system on a groupoid (see [Ren]) is, roughly speaking, a continuous family of measures on the range fibers of the groupoid that is invariant under left groupoid multiplication. It is shown in [Ren] that for every groupoid admitting a left Haar system, the source and range maps are open maps.

In this paper, a groupoid will mean a topological groupoid whose space of arrows is locally compact Hausdorff. Also each groupoid will be implicitly equipped with a left Haar system. These extra conditions are needed so that groupoid $C^*$-algebras can be defined. Furthermore, all groupoids will be assumed second countable, that is, the space of arrows will be assumed second countable. Second countability of a groupoid ensures that the groupoid algebra is well behaved. For example, second countability implies that the groupoid algebra is separable and thus well-suited for $K$-theory; second countability is invoked in [Ren] when showing that every representation of a groupoid algebra comes from a representation of the groupoid; and the condition is used in [MRW] when showing that Morita equivalence of groupoids implies strong Morita equivalence of the associated groupoid algebras. On the other hand, several results presented here do not involve groupoid algebras in any way. It will hopefully be clear in these situations that the Haar measure and second countability hypotheses, and in some cases local compactness, are unnecessary.

Topological groups and topological spaces are groupoids, and they will be assumed here to satisfy the same implicit hypotheses as groupoids. Thus spaces and groups are always second countable, locally compact Hausdorff, and equipped with a left Haar system of measures. A group $G$ can act on a groupoid, forming what is called a $G$-groupoid.

Definition 2.1. A (left) $G$-groupoid is a groupoid $\mathcal{G}$ with a (continuous) left $G$-action on its space of arrows that commutes with all structure maps and whose Haar system is left $G$-invariant.

3. Modules and Morita Equivalence for Groupoids

Let $\mathcal{G}$ be a groupoid. A left $\mathcal{G}$-module is a space $P$ with a continuous map $P \xrightarrow{\varepsilon} \mathcal{G}_0$ called the moment map and a continuous “action” $\mathcal{G} \times_{\mathcal{G}_0} P \to P$; $(\gamma, p) \mapsto \gamma p$. Here $\mathcal{G}_1 \times_{\mathcal{G}_0} P := \{ (\gamma, p) \mid s\gamma = \varepsilon p \}$ is the fibred product and by “action” we mean that $\gamma_1(\gamma_2 p) = (\gamma_1\gamma_2) p$. A right module is defined similarly, and one can convert a left module $P$ to a right module $P^{op}$ by setting $p \cdot \gamma := \gamma^{-1} p$; $\gamma \in \mathcal{G}$, $p \in P^{op}$. A $\mathcal{G}$-action is called free if $(\gamma p = \gamma' p) \Rightarrow (\gamma = \gamma')$ and is called proper if the map $\mathcal{G} \times_{\mathcal{G}_0} P \to P \times P$; $(\gamma, p) \mapsto (\gamma p, p)$ is proper. A $\mathcal{G}$-module is called principal if the $\mathcal{G}$-action is both free and proper, and is called locally trivial when the quotient map $P \to \mathcal{G}\backslash P$ admits local sections. Note that when $\mathcal{G} = (G \rightrightarrows *)$ is a group, a locally trivial principal $G$-module $P$ is exactly a principal $G$-bundle over the quotient space $G\backslash P$. For this reason principal modules are sometimes called principal bundles. We are reserving the term principal bundle for something else (see Example (3)).

Now we come to the important notion of groupoid Morita equivalence.

Definition 3.1. Two groupoids $\mathcal{G}$ and $\mathcal{H}$ are said to be Morita equivalent when there exists a Morita equivalence $(\mathcal{G}$-$\mathcal{H})$-bimodule. This is a space $P$ with commuting left $\mathcal{G}$-module and right $\mathcal{H}$-module structures that are both principal, and satisfying the following extra conditions:
• The quotient space $\mathcal{G}\backslash P$ (with its quotient topology) is homeomorphic to $\mathcal{H}_0$ in a way that identifies the right moment map $P \to \mathcal{H}_0$ with the quotient map $P \to \mathcal{G}\backslash P$.
• The quotient space $P/\mathcal{H}$ (with its quotient topology) is homeomorphic to $\mathcal{G}_0$ in a way that identifies the left moment map $P \to \mathcal{G}_0$ with the quotient map $P \to P/\mathcal{H}$.

In the literature on groupoids one finds several other ways to express Morita equivalence, but they are all equivalent (see for example [BX]).

Morita equivalence bimodules give rise to equivalences of module categories. To see this, let $E$ be a right $\mathcal{G}$-module and $P$ a Morita $(\mathcal{G}$-$\mathcal{H})$-bimodule. Then $\mathcal{G}$ acts on $E \times_{\mathcal{G}_0} P$ by $\gamma \cdot (e, p) := (e\gamma^{-1}, \gamma p)$ and one checks that the right $\mathcal{H}$-module structure on $P$ induces one on $E * P := \mathcal{G}\backslash(E \times_{\mathcal{G}_0} P)$. The assignment $E \mapsto E * P$ induces the desired equivalence of module categories. The inverse is given by $P^{op}$; in fact the properties of Morita bimodules ensure an isomorphism of $(\mathcal{G}$-$\mathcal{G})$-bimodules, $P * P^{op} \cong \mathcal{G}$, and this in turn induces an isomorphism $((E * P) * P^{op}) \cong E$. If $E$ is principal then so is $E * P$, and if furthermore both $E$ and $P$ are locally trivial, then so is $E * P$.

There is also a notion of $G$-equivariant Morita equivalence of $G$-groupoids. This is given by a Morita equivalence $(\mathcal{G}$-$\mathcal{H})$-bimodule $P$ with compatible $G$-action. The compatibility is expressed by saying that the map $\mathcal{G} \times_{\mathcal{G}_0} P \times_{\mathcal{H}_0} \mathcal{H} \to P$; $(\gamma, p, \eta) \mapsto \gamma p \eta$ satisfies, for $g \in G$,
$$g(\gamma p \eta) = g(\gamma)\, g(p)\, g(\eta). \tag{1}$$
4. Some Relevant Examples of Groupoids and Morita Equivalences

Here are some groupoids and Morita equivalences which will be used throughout the paper.

Example 1. Čech groupoids and refinement. If $\mathcal{U} := \{U_i\}_{i \in I}$ is an open cover of a topological space $X$ then the Čech groupoid of the cover, which we denote $\mathcal{G}_{\mathcal{U}}$, is defined as follows:
$$\mathcal{G}_{\mathcal{U}} := \Big( \coprod_{I \times I} U_{ij} \rightrightarrows \coprod_{I} U_i \Big), \qquad s : U_{ij} \to U_j, \quad r : U_{ij} \to U_i.$$
This groupoid is Morita equivalent to the unit groupoid $X \rightrightarrows X$. Indeed, $\mathcal{G}_0$ is a Morita equivalence bimodule. It is a right $\mathcal{G}$-module in the obvious way. As for the left $(X \rightrightarrows X)$-module structure, the moment map $\mathcal{G}_0 \to X$ is “glue the cover together” and the $X$-action is the trivial one $X \times_X \mathcal{G}_0 \to \mathcal{G}_0$.

More generally, let $\mathcal{G}$ be any groupoid and suppose $\mathcal{U} := \{U_i\}_{i \in I}$ is a locally finite cover of $\mathcal{G}_0$. Define a new groupoid
$$\mathcal{G}_{\mathcal{U}} := \Big( \coprod_{I \times I} G_{ij} \rightrightarrows \coprod_{I} U_i \Big), \qquad \text{where } G_{ij} := r^{-1}U_i \cap s^{-1}U_j.$$
The source and range maps are $s_{ij}$ and $r_{ij}$, where
$$s_{ij} : G_{ij} \to s^{-1}U_j \xrightarrow{\,s\,} U_j \qquad \text{and} \qquad r_{ij} : G_{ij} \to r^{-1}U_i \xrightarrow{\,r\,} U_i.$$
We will call such a groupoid a refinement of $\mathcal{G}$ and write $\gamma_{ij}$ for $\gamma \in G_{ij}$. A groupoid is always Morita equivalent to its refinements. A Morita equivalence $(\mathcal{G}$-$\mathcal{G}_{\mathcal{U}})$-bimodule $P$ is defined as follows:
$$P := \coprod_{I} s^{-1}U_i.$$
For a left $\mathcal{G}$-module structure on $P$, let the moment map be $r : P \to \mathcal{G}_0$ and, writing $\eta_i$ for $\eta \in s^{-1}U_i \subset P$, define the action by
$$(\gamma, \eta_i) \mapsto (\gamma\eta)_i, \qquad \gamma \in \mathcal{G},\ \eta_i \in P.$$
For the right $\mathcal{G}_{\mathcal{U}}$-module structure, the moment map is $\eta_i \mapsto s\eta \in U_i$ and the right action is $(\eta_i, \gamma_{ij}) \mapsto (\eta\gamma)_j$. Of course a Čech groupoid is exactly a refinement of a unit groupoid. In order to keep within our class of second countable groupoids, we restrict to countable covers of $\mathcal{G}_0$.

Example 2. Crossed product groupoids. From a $G$-groupoid $\mathcal{G}$ one can form the crossed product groupoid
$$G \ltimes \mathcal{G} := (G \times \mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$$
whose source and range maps are $s(g, \gamma) := s(g^{-1}\gamma)$ and $r(g, \gamma) := r\gamma$, and for which a composed pair looks like:
$$(g, \gamma) \circ (g', g^{-1}\gamma') = (gg', \gamma\gamma').$$
Now suppose two $G$-groupoids $\mathcal{G}$ and $\mathcal{H}$ are equivariantly Morita equivalent via a bimodule $P$ with moment maps $b_\ell : P \to \mathcal{G}_0$ and $b_r : P \to \mathcal{H}_0$. Then $G \times P$ has the structure of a Morita $(G \ltimes \mathcal{G})$-$(G \ltimes \mathcal{H})$-bimodule. The left $G \ltimes \mathcal{G}$ action is
$$(g, \gamma) \cdot (g', p) := (gg', \gamma g p), \qquad (g, \gamma) \in G \ltimes \mathcal{G},\ (g', p) \in G \times P,$$
with moment map $(g', p) \mapsto b_\ell(p)$. The right $G \ltimes \mathcal{H}$-module structure is
$$(g', p) \cdot (g'', \eta) := (g'g'', p\, g'\eta), \qquad (g'', \eta) \in G \ltimes \mathcal{H},$$
with moment map $(g', p) \mapsto b_r(g'^{-1}p)$.

Example 3. Generalized principal bundles. Let $\mathcal{G}$ be a groupoid, $G$ a locally compact group, and $\rho : \mathcal{G} \to G$ a homomorphism of groupoids. The generalized principal bundle associated to $\rho$ is the groupoid
$$G \rtimes_\rho \mathcal{G} := (G \times \mathcal{G}_1 \rightrightarrows G \times \mathcal{G}_0)$$
whose source and range maps are $s : (g, \gamma) \mapsto (g\rho(\gamma), s\gamma)$ and $r : (g, \gamma) \mapsto (g, r\gamma)$ and for which a composed pair looks like
$$(g, \gamma_1) \circ (g\rho(\gamma_1), \gamma_2) = (g, \gamma_1\gamma_2).$$
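The role of the homomorphism property of ρ in Example 3 can be seen on a toy case. The following Python sketch is only an illustration under assumed data that does not appear in the text: the groupoid is the pair groupoid on three points (the Čech groupoid of a three-set cover of a point), the group is Z/5 written additively, and ρ(i, j) := c[j] − c[i] for a hypothetical list `c`. It checks that the source and range of a composite agree with the source of the second factor and the range of the first.

```python
from itertools import product

N_POINTS = 3          # objects of the toy groupoid (assumption)
ORDER = 5             # the group is Z/5, written additively (assumption)
c = [0, 2, 3]         # hypothetical data defining rho

def r(gamma):         # range of the arrow gamma = (i, j)
    return gamma[0]

def s(gamma):         # source of the arrow gamma = (i, j)
    return gamma[1]

def compose(g1, g2):  # (i, j)(j, k) = (i, k), defined when s(g1) == r(g2)
    assert s(g1) == r(g2)
    return (g1[0], g2[1])

def rho(gamma):       # a groupoid homomorphism to Z/5
    i, j = gamma
    return (c[j] - c[i]) % ORDER

# Structure maps of the generalized principal bundle, following Example 3.
def bundle_source(g, gamma):
    return ((g + rho(gamma)) % ORDER, s(gamma))

def bundle_range(g, gamma):
    return (g, r(gamma))

def bundle_compose(a, b):
    (g, gamma1), (g2, gamma2) = a, b
    assert bundle_source(g, gamma1) == bundle_range(g2, gamma2)
    return (g, compose(gamma1, gamma2))

arrows = list(product(range(N_POINTS), repeat=2))

def check_bundle():
    for g, gamma1, gamma2 in product(range(ORDER), arrows, arrows):
        if s(gamma1) != r(gamma2):
            continue
        first = (g, gamma1)
        second = ((g + rho(gamma1)) % ORDER, gamma2)
        composite = bundle_compose(first, second)
        # Range of the composite is the range of the first factor.
        if bundle_range(*composite) != bundle_range(*first):
            return False
        # Source of the composite is the source of the second factor;
        # this is exactly where rho(gamma1 gamma2) = rho(gamma1) rho(gamma2) is used.
        if bundle_source(*composite) != bundle_source(*second):
            return False
    return True
```

Replacing `rho` by a function that is not a homomorphism makes the source check fail, which is one way to see why ρ is required to be a homomorphism.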
The reason $G \rtimes_\rho \mathcal{G}$ is called a generalized principal bundle is that when $\mathcal{G} = \check{\mathcal{G}}\{U_i\}$ is the Čech groupoid of Example (1), $G \rtimes_\rho \mathcal{G}$ is $G$-equivariantly Morita equivalent to a principal bundle on $X$. Indeed, in this case $\rho$ is the same thing as a $G$-valued Čech 1-cochain on the cover, and the homomorphism property $\rho(\gamma_1\gamma_2) = \rho(\gamma_1)\rho(\gamma_2)$ translates to $\rho$ being closed. Thus $\rho$ gives transition functions for the principal $G$-bundle on $X$,
$$P(\rho) := \coprod_\alpha G \times U_\alpha \,/\sim, \qquad (g, u \in U_\alpha) \sim (g\rho(\gamma), u \in U_\beta),$$
where $\gamma = u \in U_{\alpha\beta} \subset \mathcal{G}$. Let $\pi$ denote the bundle map $P(\rho) \to X$, then there are isomorphisms $h_\alpha : G \times U_\alpha \to \pi^{-1}U_\alpha$ which satisfy $h_\beta^{-1} h_\alpha(g, u) = (g\rho(\gamma), u)$, and the maps $h_\alpha$ give a $G$-equivariant isomorphism of groupoids between $G \rtimes_\rho \mathcal{G}$ and the Čech groupoid $\check{\mathcal{G}}(\{\pi^{-1}U_\alpha\})$ by sending $(g, \gamma) \in G \times U_{\alpha\beta} \subset G \rtimes_\rho \mathcal{G}$ to $h_\alpha(g, \gamma) \in \pi^{-1}U_{\alpha\beta} \subset \check{\mathcal{G}}(\{\pi^{-1}U_\alpha\})$. Finally, $\check{\mathcal{G}}(\{\pi^{-1}U_\alpha\})$ is $G$-equivariantly Morita equivalent to the unit groupoid $P(\rho) \rightrightarrows P(\rho)$.

To see the importance of keeping track of $G$-equivariance, note for instance that when $G$ is abelian $G \rtimes_\rho \mathcal{G}$ and $G \rtimes_{\rho^{-1}} \mathcal{G}$ are isomorphic (and therefore Morita equivalent), whereas these two groupoids with their natural $G$-groupoid structures are not equivariantly equivalent.

Example 4. Isotropy subgroups. Let $G$ be a locally compact group and $N$ a closed subgroup. Then $G$ acts on the homogeneous space $G/N$ by left translation and one can form the crossed product groupoid $G \ltimes G/N \rightrightarrows G/N$. There is a Morita equivalence
$$(G \ltimes G/N \rightrightarrows G/N) \sim (N \rightrightarrows *).$$
The bimodule implementing the equivalence is $G$, with $N$ acting on the right by translation and $G \ltimes G/N$ acting on the left by $(g, ghN) \cdot h := gh$.

Example 5. Nonabelian groupoid extensions. Let $\mathcal{G}$ be a groupoid and $B \to \mathcal{G}_0$ a bundle of not necessarily abelian groups over $\mathcal{G}_0$. Suppose we have two continuous functions
$$\mathcal{G}_2 \times_{\mathcal{G}_0} B \to B;\ (\gamma_1, \gamma_2, p) \mapsto \sigma(\gamma_1, \gamma_2)\, p \qquad \text{and} \qquad \mathcal{G} \times_{\mathcal{G}_0} B \to B;\ (\gamma, p) \mapsto \tau(\gamma)(p),$$
such that $\sigma(\gamma_1, \gamma_2)$ is an element of the fiber of $B$ over $r\gamma_1$, $\tau(\gamma)$ is an isomorphism from the fiber over $s\gamma$ to the fiber over $r\gamma$, and the following equations are satisfied:
$$\tau(\gamma_1) \circ \tau(\gamma_2) = \mathrm{ad}(\sigma(\gamma_1, \gamma_2)) \circ \tau(\gamma_1\gamma_2), \tag{2}$$
$$(\tau(\gamma_1) \circ \sigma(\gamma_2, \gamma_3))\, \sigma(\gamma_1, \gamma_2\gamma_3) = \sigma(\gamma_1, \gamma_2)\, \sigma(\gamma_1\gamma_2, \gamma_3), \tag{3}$$
where $\mathrm{ad}(p)(q) := pqp^{-1}$ for elements $p, q \in B$ that both lie in the same fiber over $\mathcal{G}_0$. We will write $\gamma(p) := \tau(\gamma)(p)$. The pair $(\sigma, \tau)$ can be thought of as a 2-cocycle in “nonabelian cohomology” with values in $B$, and when $B$ is a bundle of abelian groups, $\tau$ is simply an action and $\sigma$ a 2-cocycle as in Sect. (6). From the data $(\sigma, \tau)$ we form an extension of $\mathcal{G}$ by $B$, which is the groupoid
$$B \rtimes^\sigma \mathcal{G} := (B \times_{b, \mathcal{G}_0, r} \mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$$
with source, range and multiplication maps
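1. $s(p, \gamma) := s\gamma$, $\quad r(p, \gamma) := r\gamma$,
2. $(p_1, \gamma_1) \circ (p_2, \gamma_2) = (p_1\, \gamma_1(p_2)\, \sigma(\gamma_1, \gamma_2), \gamma_1\gamma_2)$.

The next example combines Examples (3), (4), and (5).

Example 6. Let $\rho : \mathcal{G} \to G$ be a continuous function. Define
$$\delta\rho(\gamma_1, \gamma_2) := \rho(\gamma_1)\rho(\gamma_2)\rho(\gamma_1\gamma_2)^{-1}, \qquad (\gamma_1, \gamma_2) \in \mathcal{G}_2.$$
Suppose $\delta\rho$ takes values in a closed normal subgroup $N$ of $G$, and write $\bar\rho$ for the composition $\mathcal{G} \xrightarrow{\rho} G \to G/N$, which is a homomorphism. Associated to $\rho$ we construct two groupoids.
1. $G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$. This is the crossed product groupoid of $G$ acting by translation on the generalized principal bundle $G/N \rtimes_{\bar\rho} \mathcal{G}$. This means that for $(g, t, \gamma)$'s $\in G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$, the source, range and multiplication maps are
 (a) $s(g, t, \gamma) = (g^{-1} t \rho(\gamma), s\gamma)$,
 (b) $r(g, t, \gamma) = (t, r\gamma)$,
 (c) $(g_1, t, \gamma_1) \circ (g_2, g_1^{-1} t \rho(\gamma_1), \gamma_2) = (g_1 g_2, t, \gamma_1\gamma_2)$.
2. $N \rtimes^{\delta\rho} \mathcal{G}$. In the notation of Example (5) this is the extension determined by the pair $(\sigma, \tau) := (\delta\rho, \mathrm{ad}(\rho))$ and the constant bundle $B := \mathcal{G}_0 \times N$. Note that as an $N$-valued groupoid 2-cocycle $\delta\rho$ is not necessarily a coboundary. Explicitly, the source, range and multiplication are
 (a) $s(n, \gamma) = s\gamma$,
 (b) $r(n, \gamma) = r\gamma$,
 (c) $(n_1, \gamma_1) \circ (n_2, \gamma_2) = (n_1\, \gamma_1(n_2)\, \delta\rho(\gamma_1, \gamma_2), \gamma_1\gamma_2)$, where $\gamma(n) := \rho(\gamma)\, n\, \rho(\gamma)^{-1}$.

Proposition 4.1. The two groupoids $\mathcal{H} := G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$ and $\mathcal{K} := (N \rtimes^{\delta\rho} \mathcal{G})$ of Example (6) are Morita equivalent.

Proof. The equivalence bimodule is $P = G \times \mathcal{G}$, endowed with the following structures:
1. Moment maps: $P \ni (g, \gamma) \mapsto (g\rho(\gamma)^{-1}, r\gamma) \in \mathcal{H}_0$ and $P \ni (g, \gamma) \mapsto s\gamma \in \mathcal{K}_0$.
2. $\mathcal{H}$-action: $\mathcal{H} \times_{\mathcal{H}_0} P \ni ((g_1, t, \gamma_1), (g_2, \gamma_2)) \mapsto (g_1 g_2, \gamma_1\gamma_2) \in P$, whenever $t = g_1 g_2 \rho(\gamma_2)^{-1} \rho(\gamma_1)^{-1} \in G/N$.
3. $\mathcal{K}$-action: $P \times_{\mathcal{K}_0} \mathcal{K} \ni ((g, \gamma_1), (n, \gamma_2)) \mapsto (g n \rho(\gamma_2), \gamma_1\gamma_2) \in P$.
Direct checks show that these definitions make $P$ a Morita equivalence bimodule.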
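A toy instance of the second groupoid in Example 6 can be checked by brute force. The Python sketch below is only an illustration under assumed data that is not part of the text: the groupoid is the group Z/2 (one object), G = Z/4 written additively, N = {0, 2}, and ρ(0) = 0, ρ(1) = 1. Since this G is abelian, the action γ(n) = ρ(γ) n ρ(γ)⁻¹ is trivial, so only the cocycle δρ enters the multiplication.

```python
from itertools import product

G_ORDER = 4            # G = Z/4, written additively (assumption)
N = (0, 2)             # closed normal subgroup N of G
ARROWS = (0, 1)        # the groupoid is the group Z/2 with a single object

def rho(gamma):
    # A function from the groupoid to G that is not a homomorphism:
    # rho(0) = 0, rho(1) = 1, but rho(1) + rho(1) = 2 != rho(0).
    return gamma

def delta_rho(g1, g2):
    # delta rho(g1, g2) = rho(g1) rho(g2) rho(g1 g2)^{-1}, in additive notation
    return (rho(g1) + rho(g2) - rho((g1 + g2) % 2)) % G_ORDER

def multiply(a, b):
    # (n1, g1)(n2, g2) = (n1 g1(n2) delta rho(g1, g2), g1 g2); the action is trivial here
    (n1, g1), (n2, g2) = a, b
    return ((n1 + n2 + delta_rho(g1, g2)) % G_ORDER, (g1 + g2) % 2)

ELEMENTS = list(product(N, ARROWS))

def takes_values_in_N():
    return all(delta_rho(g1, g2) in N for g1, g2 in product(ARROWS, repeat=2))

def is_associative():
    return all(
        multiply(multiply(a, b), c) == multiply(a, multiply(b, c))
        for a, b, c in product(ELEMENTS, repeat=3)
    )

def to_G(element):
    # (n, gamma) -> n rho(gamma), a map from the extension to G
    n, gamma = element
    return (n + rho(gamma)) % G_ORDER
```

In this toy case the map `to_G` is a bijective homomorphism, so the extension built from δρ is a copy of Z/4 itself; this is a feature of the chosen example and is not claimed for the general construction.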
Summary of notation. For convenience, let us summarize the notation that has been developed in these examples.
• $(G \ltimes \mathcal{G})$ denotes a crossed product groupoid. It is in some sense a quotient of $\mathcal{G}$ by $G$.
• $(G \rtimes_\rho \mathcal{G})$ denotes a principal bundle over $\mathcal{G}$.
• $(G \rtimes^\sigma \mathcal{G})$ denotes an extension of $\mathcal{G}$ by $G$. We will see that this corresponds to a presentation of a $G$-gerbe over $\mathcal{G}$.
5. Groupoid Algebras, K-Theory, and Strong Morita Equivalence

Let $\mathcal{G}$ be a groupoid. The continuous compactly supported functions $\mathcal{G}_1 \to \mathbb{C}$ form an associative algebra, denoted $C_c(\mathcal{G})$, for the following multiplication called groupoid convolution:
$$a * b(\gamma) := \int_{\gamma_1\gamma_2 = \gamma} a(\gamma_1)\, b(\gamma_2) \tag{4}$$
for $a, b \in C_c(\mathcal{G})$ and $\gamma$'s $\in \mathcal{G}$. Integration is with respect to the fixed left Haar system of measures. This algebra has an involution, $a \mapsto a^*(\gamma) = \overline{a(\gamma^{-1})}$ (the overline denotes complex conjugation), and can be completed in a canonical way to a $C^*$-algebra (see [Ren]) which we simply refer to as the groupoid algebra and denote $C^*(\mathcal{G})$ or $C^*(\mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$.

The groupoid algebra is a common generalization of the continuous functions on a topological space, to which this reduces when $\mathcal{G}$ is the unit groupoid, and of the convolution $C^*$-algebra of a locally compact group, to which this reduces when the unit space is a point. Indeed, by definition of the groupoid algebra we have $C^*(X \rightrightarrows X) = C(X)$ and $C^*(G \rightrightarrows *) = C^*(G)$ when $X$ is a locally compact Hausdorff space and $G$ is a locally compact Hausdorff group.

As is probably common, we will define the K-theory of $\mathcal{G}$, denoted $K(\mathcal{G})$, to be the $C^*$-algebra K-theory of its groupoid algebra. Here are the facts we need about groupoid algebras and $K$-theory:

Proposition 5.1.
1. [MRW] A Morita equivalence of groupoids gives rise to a (strong) Morita equivalence of the associated groupoid algebras.
2. A Morita equivalence between $G$-groupoids $\mathcal{G}$ and $\mathcal{H}$ gives rise to a Morita equivalence between the crossed product $C^*$-algebras $G \ltimes C^*(\mathcal{G})$ and $G \ltimes C^*(\mathcal{H})$.
3. Groupoid $K$-theory is invariant under Morita equivalence.

Proof. The first statement is the main theorem of [MRW]. The second statement follows from the first after noting that the definitions of $G \ltimes C^*(\mathcal{G})$ and $C^*(G \ltimes \mathcal{G})$ coincide and that $G \ltimes \mathcal{G}$ is Morita equivalent to $G \ltimes \mathcal{H}$ (see Example (2)). The last statement now follows from the Morita invariance of $C^*$-algebra $K$-theory.

Let $\mathcal{G}$ and $\mathcal{H}$ be Morita equivalent groupoids. It is useful to know that in [MRW] a $C^*(\mathcal{G})$-$C^*(\mathcal{H})$-bimodule is constructed directly from a $\mathcal{G}$-$\mathcal{H}$-Morita equivalence bimodule $P$. The $C^*$-algebra bimodule is a completion of $C_c(P)$ and has the actions induced from the translation actions of $\mathcal{G}$ and $\mathcal{H}$ on $P$. We present a generalization of this in Lemma (A.4).

6. Equivariant Groupoid Cohomology

In this section we define equivariant groupoid cohomology for $G$-groupoids. Equivariant 2-cocycles will give rise to what we call equivariant gerbes.

Let $\mathcal{H}$ be a groupoid and $B \xrightarrow{b} \mathcal{H}_0$ a left $\mathcal{H}$-module each of whose fibers over $\mathcal{H}_0$ is an abelian group, that is a (not necessarily locally trivial) bundle of groups over $\mathcal{H}_0$.
Then one defines the groupoid cohomology with $B$ coefficients, denoted $H^*(\mathcal{H}; B)$, as the cohomology of the complex $(C^\bullet(\mathcal{H}; B), \delta)$, where
$$C^k(\mathcal{H}; B) := \{\text{continuous maps } f : \mathcal{H}_k \to B \mid b(f(h_1, \ldots, h_k)) = r h_1\}$$
and for $f \in C^k(\mathcal{H}; B)$,
$$\delta f(h_1, \ldots, h_{k+1}) := h_1 \cdot f(h_2, \ldots, h_{k+1}) + \sum_{i=1 \ldots k} (-1)^i f(h_1, \ldots, h_i h_{i+1}, \ldots, h_{k+1}) + (-1)^{k+1} f(h_1, \ldots, h_k).$$
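The formula for δ given above can be sanity-checked numerically in the simplest case. The following Python sketch is an illustration only, under assumptions that are not made in the text: the groupoid is the finite group Z/n (a single object, so the condition on r h_1 is vacuous) and the coefficients are the integers with trivial action. It implements δ term by term and verifies that δ∘δ = 0 on sample cochains.

```python
from itertools import product

def coboundary(f, k, n):
    """Return delta f for a k-cochain f on Z/n with trivial integer coefficients."""
    def delta_f(*h):
        assert len(h) == k + 1
        total = f(*h[1:])                       # h_1 . f(h_2, ..., h_{k+1}), trivial action
        for i in range(1, k + 1):               # middle terms, i = 1..k
            merged = h[:i - 1] + ((h[i - 1] + h[i]) % n,) + h[i + 1:]
            total += (-1) ** i * f(*merged)
        total += (-1) ** (k + 1) * f(*h[:k])    # last term drops h_{k+1}
        return total
    return delta_f

def squares_to_zero(f, k, n):
    """Check that delta(delta f) vanishes on every (k+2)-tuple of Z/n."""
    ddf = coboundary(coboundary(f, k, n), k + 1, n)
    return all(ddf(*h) == 0 for h in product(range(n), repeat=k + 2))

def sample_1_cochain(h):       # arbitrary test data
    return h * h + 3

def sample_2_cochain(h1, h2):  # arbitrary test data
    return h1 * h2 + 2 * h1
```

For instance, `squares_to_zero(sample_1_cochain, 1, 4)` runs over all triples in Z/4. The sample cochains are arbitrary and need not be normalized on units, since δ∘δ = 0 holds on the full complex.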
As is common, we tacitly restrict to the quasi-isomorphic subcomplex $\{ f \in C^k \mid f(h_1, \ldots, h_k) = 0 \text{ if some } h_i \text{ is a unit}\}$, except for 0-cochains which have no such restriction. When the $\mathcal{H}$-module is $B = \mathcal{H}_0 \times A$, where $A$ is an abelian group, we write $A$ for the cohomology coefficients.

When $\mathcal{H}$ is the Čech groupoid of a locally finite cover of a topological space $X$ and $B$ is the étale space of a sheaf of abelian groups on $X$, then $C^\bullet$ is identical to the Čech complex of the cover with coefficients in the sheaf of sections of $B$, so this recovers Čech cohomology of the given cover. On the other hand, when $\mathcal{H}$ is a group this recovers continuous group cohomology.

If $\mathcal{H}$ is a $G$-groupoid then $C^\bullet(\mathcal{H}; B)$ becomes a complex of left $G$-modules by
$$g \cdot f(h_1, \ldots, h_k) := f(g^{-1}h_1, \ldots, g^{-1}h_k), \qquad f \in C^k(\mathcal{H}; B),$$
and one can form the double complex $K^{p,q} = (C^p(G; C^q(\mathcal{H}; B)), d, \delta)$, where $d$ denotes the groupoid cohomology differential for $G \rightrightarrows *$. The $G$-equivariant cohomology of $\mathcal{H}$ with values in $B$, denoted $H^*_G(\mathcal{H}; B)$, is the cohomology of the total complex
$$\mathrm{tot}(K)^n := \Big( \bigoplus_{p+q=n} K^{p,q},\ D = d + (-1)^p \delta \Big).$$
As one would hope, there is a chain map from the complex $\mathrm{tot}(K)$ computing equivariant cohomology to the chain complex associated to the crossed product groupoid:

Proposition 6.1. The map
$$F : \mathrm{tot}\, K^\bullet \longrightarrow C^\bullet(G \ltimes \mathcal{H}; B), \tag{5}$$
defined to be the sum of the maps
$$f_{pq} : C^p(G; C^q(\mathcal{H}; B)) \longrightarrow C^{p+q}(G \ltimes \mathcal{H}; B),$$
$$f_{pq}(c)\big((g_1, \gamma_1), (g_2, g_1^{-1}\gamma_2), \ldots, (g_{p+q}, (g_1 g_2 \cdots g_{p+q-1})^{-1}\gamma_{p+q})\big) := c(g_1, \ldots, g_p, \gamma_{p+1}, \ldots, \gamma_{p+q})$$
for $c \in C^p(G; C^q(\mathcal{H}; B))$, $g$'s $\in G$, and $(\gamma_1, \ldots, \gamma_{p+q}) \in \mathcal{H}_{p+q}$, is a morphism of chain complexes.

Proof. This is a direct check.
It seems likely that this is a quasi-isomorphism, but we have not proved it. In Sect. (10) it is shown that $H^*(G \ltimes \mathcal{H}; B)$ is always a summand of $H^*_G(\mathcal{H}; B)$, which is enough for the present purposes.

Remark 6.2. These cohomology groups are not Morita invariant. For example, different covers can have different Čech cohomology. One can form a Morita invariant cohomology (that is, stack cohomology); it is the derived functors of
$$B \mapsto \mathrm{Hom}_{\mathcal{H}}(\mathcal{H}_0, B) = \Gamma(\mathcal{H}_0, B)^{\mathcal{H}},$$
which homological algebra tells us can be computed by using a resolution of $B$ by injective $\mathcal{H}$-modules.$^3$ However, the cocycles obtained via injective resolutions are often not useful for describing geometric objects such as bundles and groupoid extensions, so we will stick with the groupoid cohomology as defined above (which is the approximation to these derived functors obtained by resolving $\mathcal{H}_0$ by $\mathcal{H}_\bullet$ and taking the cohomology of $\mathrm{Hom}_{\mathcal{H}}(\mathcal{H}_\bullet; B)$). In Sect. (9) we will show how to compare the respective groupoid cohomology groups of two groupoids which are Morita equivalent.

We will also encounter cocycles with values in a bundle of nonabelian groups $B$, defined in degrees $n = 0, 1, 2$. The spaces of cochains are the same and a 0-cocycle is also the same as in the abelian setting. In degree one we say $\rho \in C^1(\mathcal{H}; B)$ is closed when
$$\delta\rho(h_1, h_2) := \rho(h_1)\, h_1 \cdot \rho(h_2)\, \rho(h_1 h_2)^{-1} = 1$$
and $\rho$ and $\rho'$ are cohomologous when $\rho'(h) = h \cdot \alpha(sh)\, \rho(h)\, \alpha^{-1}(rh)$. A nonabelian 2-cocycle is a pair $(\sigma, \tau)$ as in Example (5).

7. Gerbes and Twisted Groupoids

In this section we describe various constructions that can be made with 2-cocycles and, in particular, explain our slightly non-standard use of the term gerbe. We also describe the construction of equivariant gerbes from equivariant cohomology.

Given a 2-cocycle $\sigma \in Z^2(\mathcal{G}; N)$, where $N$ is an abelian group, we can form an extension of $\mathcal{G}$ by $N$:
$$N \rtimes^\sigma \mathcal{G} := (N \times \mathcal{G}_1 \rightrightarrows \mathcal{G}_0),$$
with multiplication $(n_1, \gamma_1) \circ (n_2, \gamma_2) := (n_1 n_2 \sigma(\gamma_1, \gamma_2), \gamma_1\gamma_2)$.
More generally, if $B$ is a bundle of not necessarily abelian groups, and $(\sigma, \tau)$ a $B$-valued nonabelian 2-cocycle, then we can form the groupoid extension $B \rtimes^\sigma \mathcal{G}$ that was described in Example (5). We will call such an extension a $B$-gerbe, or an $N$-gerbe if $B = \mathcal{G}_0 \times N$ is a constant bundle of (not necessarily abelian) groups. The term gerbe comes from Giraud’s stack theoretic interpretation of degree two nonabelian cohomology ([Gir]). In the following few paragraphs (everything up to Definition (7.2)) we will outline the stack theoretic terminology leading to Giraud’s gerbes. The point of the outline is only to clear up terminology, and can be skipped. A nice reference for topological stacks is [Met].

$^3$ There are enough injective $\mathcal{H}$-modules for étale groupoids, but we do not know if this is true for general groupoids.

Let $C$ be any category. A topological stack is a functor $F : C \to \mathcal{T}op$ satisfying a certain list of axioms. A morphism of stacks from $(F : C \to \mathcal{T}op)$ to $(F' : C' \to \mathcal{T}op)$ is a functor $\alpha : C \to C'$ (satisfying a couple of axioms) such that $F = F' \circ \alpha$. Such a morphism is an equivalence of stacks when $\alpha$ is an equivalence of categories.

Given a groupoid $\mathcal{G}$, define $\mathrm{Prin}_{\mathcal{G}}$ to be the category whose objects are locally trivial right principal $\mathcal{G}$-modules and whose homs are the continuous $\mathcal{G}$-equivariant maps. This category has a natural functor to $\mathcal{T}op$ which sends a principal module $P$ to the quotient $P/\mathcal{G}$, and in fact satisfies the axioms for a stack. This stack (which we denote $\mathrm{Prin}_{\mathcal{G}}$) is called the stack associated to $\mathcal{G}$. A stack which is equivalent to $\mathrm{Prin}_{\mathcal{G}}$ is called presentable and $\mathcal{G}$ is called a presentation of the stack.

The discussion following Definition (3.1) shows that a locally trivial right principal $\mathcal{G}$-$\mathcal{H}$-bimodule $P$ induces a functor $*P : \mathrm{Prin}_{\mathcal{G}} \to \mathrm{Prin}_{\mathcal{H}}$. It is in fact a morphism of stacks, and is an equivalence of stacks when $P$ is also left principal (that is when $P$ is a Morita equivalence bimodule). Conversely, if there is an equivalence of stacks $\mathrm{Prin}_{\mathcal{G}} \to \mathrm{Prin}_{\mathcal{H}}$, then $\mathcal{G}$ and $\mathcal{H}$ are Morita equivalent. Thus any statement about groupoids which is Morita invariant is naturally a statement about presentable stacks. We will only work with presentable stacks in this paper, and when stacks are mentioned at all, it will only be as motivation for making Morita invariant constructions.

According to Giraud [Gir], a gerbe over a stack $C$ is a stack $C'$ equipped with a morphism of stacks $\alpha : C' \to C$ satisfying a couple of axioms. Now, the extension $B \rtimes^\sigma \mathcal{G}$ has its natural quotient map to $\mathcal{G}$ (and this quotient map is a functor), and this determines a morphism of stacks $\mathrm{Prin}_{(B \rtimes^\sigma \mathcal{G})} \to \mathrm{Prin}_{\mathcal{G}}$ which in fact makes $\mathrm{Prin}_{(B \rtimes^\sigma \mathcal{G})}$ into what Giraud called a $B$-gerbe over the stack $\mathrm{Prin}_{\mathcal{G}}$ (see also [Met] Definition 84). When $\mathcal{G}$ is Morita equivalent to a space $X$, one usually calls this a gerbe over $X$. Thus we call the groupoid $B \rtimes^\sigma \mathcal{G}$ a $B$-gerbe, although it is actually a presentation of a $B$-gerbe. Hopefully this will not cause much confusion.

Remark 7.1.
Every gerbe described so far has the property that $N$ acts on $N \rtimes^\sigma \mathcal{G}_1$, making it a trivial principal $N$-bundle over $\mathcal{G}_1$. Any groupoid presentation of a stack theoretic $N$-gerbe will admit a principal $N$-action on its space of arrows, but not every one is a trivializable principal bundle over $\mathcal{G}_1$. Those that are not trivializable do not admit the 2-cocycle description we have been using. The obstruction to all gerbes being trivializable bundles is the degree one sheaf cohomology of the space $\mathcal{G}_1$ with coefficients in the sheaf of $N$-valued functions, $H^1_{\mathrm{Sheaf}}(\mathcal{G}_1; N)$. Since many groupoids admit a refinement for which this obstruction vanishes (in particular Čech groupoids do), there are plenty of situations in which one may assume the gerbe admits the above 2-cocycle description (in particular, for gerbes on spaces this is fine). Nonetheless, we will encounter gerbes which are not trivial bundles, such as the ones in Example (9).

Closely related to gerbes is the following notion:

Definition 7.2. Let $\mathcal{G}$ be a groupoid and let $B \to \mathcal{G}_0$ be a bundle of groups. A $B$-twisted groupoid is a pair $(\mathcal{G}, (\sigma, \tau))$, where $(\sigma, \tau)$ is a $B$-valued (nonabelian) 2-cocycle over $\mathcal{G}$ as in Example (5). When $\tau$ is understood to be trivial, we simply write $(\mathcal{G}, \sigma)$, and when $B$ is the constant bundle $\mathcal{G}_0 \times U(1)$ we simply call the pair a twisted groupoid.

In fact a $B$-twisted groupoid contains the exact same data as a $B$-gerbe. However, we will encounter a type of duality which takes twisted groupoids to $U(1)$-gerbes and does not extend to a “gerbe-gerbe” duality. Thus it is necessary to have both descriptions at hand.

We would like to make $C^*$-algebras out of twisted groupoids in order to define twisted K-theory. Here is the definition.
Definition 7.3. [Ren]. Given a twisted groupoid $(\mathcal{G}, \sigma \in Z^2(\mathcal{G}; U(1)))$, the associated twisted groupoid algebra, denoted $C^*(\mathcal{G}, \sigma)$, is the $C^*$-algebra completion of the compactly supported functions on $\mathcal{G}_1$, with $\sigma$-twisted multiplication
$$a * b(\gamma) := \int_{\gamma_1\gamma_2 = \gamma} a(\gamma_1)\, b(\gamma_2)\, \sigma(\gamma_1, \gamma_2); \qquad a, b \in C_c(\mathcal{G}_1)$$
and involution $a \mapsto a^*(\gamma) := \overline{a(\gamma^{-1})\, \sigma(\gamma, \gamma^{-1})}$. Here functions are $\mathbb{C}$-valued and the overline denotes complex conjugation.

Of course a groupoid algebra is exactly a twisted groupoid algebra for $\sigma = 1$.

Definition 7.4. The twisted K-theory of a twisted groupoid $(\mathcal{G}, \sigma)$ is the K-theory of $C^*(\mathcal{G}, \sigma)$.

Now suppose $G$ is a locally compact group and $\mathcal{H}$ is a $G$-groupoid. By definition, a $U(1)$-valued 2-cocycle in $G$-equivariant cohomology is of the form:
$$(\sigma, \lambda, \beta) \in C^0(G; Z^2(\mathcal{H}; U(1))) \times C^1(G; C^1(\mathcal{H}; U(1))) \times Z^2(G; C^0(\mathcal{H}; U(1))) \tag{6}$$
and it satisfies the cocycle condition:
$$D(\sigma, \lambda, \beta) = (\delta\sigma,\ \delta\lambda^{-1} d\sigma,\ \delta\beta\, d\lambda,\ d\beta) = (1, 1, 1, 1).$$
The first component, $\sigma$, of the triple determines a twisted groupoid algebra $C^*(\mathcal{H}; \sigma)$. Now the translation action of $G$ on $C^*(\mathcal{H})$,
$$g \cdot a(\gamma) := a(g^{-1}\gamma); \qquad g \in G,\ \gamma \in \mathcal{H},\ a \in C^*(\mathcal{H}; \sigma),$$
is not an action on $C^*(\mathcal{H}; \sigma)$ because $g \cdot (a *_\sigma b) \neq (g \cdot a) *_\sigma (g \cdot b)$ for $a, b \in C^*(\mathcal{H}; \sigma)$, $g \in G$. The second and third components are “correction terms” that allow $G$ to act on the twisted groupoid algebra. Indeed, define a map
$$\alpha : G \to \mathrm{Aut}(C^*(\mathcal{H}; \sigma)), \qquad \alpha_g(a)(h) := \lambda(g, h)\, g \cdot a(h).$$
Then we have
$$\{ \alpha_g(a *_\sigma b) = \alpha_g(a) *_\sigma \alpha_g(b) \} \iff \{ d\sigma = \delta\lambda \},$$
so $\alpha$ does land in the automorphisms of $C^*(\mathcal{H}; \sigma)$. However, this is still not a group homomorphism since in general $\alpha_{g_1} \circ \alpha_{g_2} \neq \alpha_{g_1 g_2}$. If we attempted to construct a crossed product algebra $G \ltimes_\alpha C^*(\mathcal{H})$ it would not be associative. The failure of $\alpha$ to be homomorphic is corrected by the third component, $\beta$. An interpretation of $\beta$ is that it determines a family over $\mathcal{H}_0$ of deformations of $G$ as a noncommutative space from which $\alpha$ is in some sense a homomorphism. We encode this “noncommutative $G$-action” in the following twisted crossed product algebra:
$$G \ltimes_{\lambda, \beta} C^*(\mathcal{H}; \sigma),$$
which is the algebra with multiplication
$$a * b(g, h) := \int_{\substack{h_1 h_2 = h \\ g_1 g_2 = g}} a(g_1, h_1)\, b(g_2, g_1^{-1}h_2)\, \chi\big((g_1, h_1), (g_2, g_1^{-1}h_2)\big),$$
where
$$\chi\big((g_1, h_1), (g_2, g_1^{-1}h_2)\big) := \sigma(h_1, h_2)\, \lambda(g_1, h_2)\, \beta(g_1, g_2, s h_2).$$
Lemma 7.5. Let $G \ltimes \mathcal{H}$ be the crossed product groupoid associated to the $G$-action on $\mathcal{H}$ (as in Example (2)). Then $\chi \in Z^2(G \ltimes \mathcal{H}; U(1))$. Consequently the multiplication on $G \ltimes_{\lambda, \beta} C^*(\mathcal{H}; \sigma) \equiv C^*(G \ltimes \mathcal{H}, \chi)$ is associative.

Proof. $\chi$ is the image of the cocycle $(\sigma, \lambda, \beta)$ under the chain map of Eq. (5), thus it is a cocycle.

Now it is clear how to interpret the data $(\sigma, \lambda, \beta)$ at groupoid level: it is the data needed to extend the twisted groupoid $(\mathcal{H}, \sigma)$ to a twisted crossed product groupoid $(G \ltimes \mathcal{H}, \chi)$. The meaning of “extend” in this context is that $(\mathcal{H}, \sigma)$ is a sub-(twisted groupoid):
$$(\mathcal{H}, \sigma) \cong (\{1\} \ltimes \mathcal{H}, \sigma) \subset (G \ltimes \mathcal{H}, \chi),$$
which follows from the fact that $\chi|_{\mathcal{H}} \equiv \sigma$. Clearly in the groupoid interpretation the $U(1)$ coefficients can be replaced by an arbitrary system of coefficients.

Definition 7.6. The pair $(\mathcal{H}, (\sigma, \lambda, \beta))$, where $\mathcal{H}$ is a $G$-groupoid and $(\sigma, \lambda, \beta) \in Z^2_G(\mathcal{H}, U(1))$, will be called a twisted $G$-groupoid.

8. Pontryagin Duality for Generalized Principal Bundles

In this section we introduce an extension of Pontryagin duality, which has been a duality on the category of abelian locally compact groups, to a correspondence of the form
$$\{\text{generalized principal } G\text{-bundles}\} \longleftrightarrow \{\text{twisted } \widehat{G}\text{-gerbes}\}$$
for any abelian locally compact group $G$. In fact we extend this to a duality between a $U(1)$-gerbe on a principal $G$-bundle (though only certain types of $U(1)$-gerbes are allowed) and a $U(1)$-gerbe on a $\widehat{G}$-gerbe. By construction, there will be a Fourier type isomorphism between the twisted groupoid algebras of a Pontryagin dual pair; consequently any invariant constructed from twisted groupoid algebras ($K$-theory for example) will be unaffected by Pontryagin duality. This duality might be of independent interest, especially because it is not a Morita equivalence and thus induces a nontrivial duality at stack level. The Pontryagin dual of a $U(1)$-gerbe on a principal torus bundle will play a crucial role in the understanding of T-duality.
Let us fix the following notation for the remainder of this section: $G$ denotes an abelian locally compact group, $\widehat{G} = \mathrm{Hom}(G, U(1))$ denotes its Pontryagin dual group, and for elements of $G$ and $\widehat{G}$ we use $g$'s and $\phi$'s respectively. Evaluation is often written as a pairing $\langle \phi, g \rangle \equiv \phi(g)$. As usual $\gamma$'s are elements of a groupoid $\mathcal{G}$. According to our groupoid notation, $(G \rightrightarrows G)$ denotes the group $G$ thought of as a topological space while $(G \rightrightarrows *)$ denotes the group thought of as a group. Thus by definition, the groupoid algebras $C^*(G \rightrightarrows G)$ and $C^*(G \rightrightarrows *)$ are functions on $G$ with pointwise multiplication in the first case and convolution multiplication in the
C. Daenzer
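second. Keeping this in mind, Fourier transform can be interpreted as an isomorphism of groupoid algebras:
$$\mathcal{F} : C^*(G \rightrightarrows G) \longrightarrow C^*(\widehat{G} \rightrightarrows *), \qquad a \mapsto \mathcal{F}(a)(\phi) := \int_{g \in G} a(g)\,\phi(g^{-1}).$$
We use the Plancherel measure so that the inverse transform is given by
$$\mathcal{F}^{-1}(\hat{a})(g) := \int_{\phi \in \widehat{G}} \hat{a}(\phi)\,\phi(g), \qquad \hat{a} \in C^*(\widehat{G} \rightrightarrows *).$$
The group $G$ acts on $C^*(G \rightrightarrows G)$ by translation
$$g_1 \cdot a(g) := a(g g_1), \qquad a \in C^*(G \rightrightarrows G),$$
and the dual group acts by “dual translation” on $C^*(G \rightrightarrows G)$ by $\phi \star a(g) := \langle \phi, g \rangle a(g)$. Under Fourier transform translation and dual translation are interchanged:
$$\mathcal{F}(g \cdot a)(\phi) = \langle \phi, g \rangle \mathcal{F}(a)(\phi) =: g \star \mathcal{F}(a)(\phi), \qquad (7)$$
$$\mathcal{F}(\phi \star a)(\psi) = \mathcal{F}(a)(\phi^{-1}\psi) =: \phi^{-1} \cdot \mathcal{F}(a)(\psi), \qquad (8)$$
for $\phi, \psi \in \widehat{G}$.

Let us quickly check the first one:
$$\mathcal{F}(g \cdot a)(\phi) = \int_{g_1} a(g_1 g)\,\phi(g_1^{-1}) = \int_{g'} a(g')\,\phi((g' g^{-1})^{-1}) = \phi(g) \int_{g'} a(g')\,\phi(g'^{-1}) = g \star \mathcal{F}(a)(\phi).$$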
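The two interchange rules of Eqs. (7) and (8) can be checked numerically in the simplest case. The following is a minimal sketch, assuming the finite cyclic group $G = \mathbb{Z}/n$, so that the integrals become finite sums and the characters are $\phi_k(g) = e^{2\pi i k g / n}$. All function names (`character`, `fourier`, `translate`, `dual_translate`, `check_eq7`, `check_eq8`) are hypothetical and introduced only for this illustration.

```python
import cmath

def character(n, k):
    """Character phi_k of Z/n: phi_k(g) = exp(2*pi*i*k*g/n)."""
    return lambda g: cmath.exp(2j * cmath.pi * k * g / n)

def fourier(a, n):
    """F(a)(phi_k) = sum_g a(g) * phi_k(-g), as a list indexed by k."""
    return [sum(a[g] * character(n, k)((-g) % n) for g in range(n))
            for k in range(n)]

def translate(a, g1, n):
    """Translation: (g1 . a)(g) = a(g + g1)."""
    return [a[(g + g1) % n] for g in range(n)]

def dual_translate(a, k, n):
    """Dual translation by phi_k: g -> <phi_k, g> * a(g)."""
    return [character(n, k)(g) * a[g] for g in range(n)]

def close(u, v, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(u, v))

def check_eq7(a, g1, n):
    """Eq. (7): F(g1 . a)(phi) = <phi, g1> F(a)(phi)."""
    lhs = fourier(translate(a, g1, n), n)
    Fa = fourier(a, n)
    rhs = [character(n, k)(g1) * Fa[k] for k in range(n)]
    return close(lhs, rhs)

def check_eq8(a, k, n):
    """Eq. (8): F(phi_k dual-translating a)(psi) = F(a)(phi_k^{-1} psi)."""
    lhs = fourier(dual_translate(a, k, n), n)
    Fa = fourier(a, n)
    rhs = [Fa[(psi - k) % n] for psi in range(n)]
    return close(lhs, rhs)
```

Calling `check_eq7(a, g1, n)` and `check_eq8(a, k, n)` on any list `a` of `n` complex numbers compares both sides up to floating-point tolerance; the sketch says nothing about the analytic issues that arise for non-compact or non-discrete $G$.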
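With those basic rules of Fourier transform in mind, we are ready to formulate and prove the duality.

Definition 8.1. Let $G$ be a locally compact abelian group and $\mathcal{G}$ a groupoid. The following data:
$$\rho \in Z^1(\mathcal{G}; G), \qquad f \in Z^2(\mathcal{G}; \widehat{G}), \qquad \text{and} \qquad \nu \in C^2(\mathcal{G}; U(1)),$$
satisfying
$$\delta\nu(\gamma_1, \gamma_2, \gamma_3) = \langle f(\gamma_1, \gamma_2), \rho(\gamma_3) \rangle^{-1}$$
will be called Pontryagin duality data. Given Pontryagin duality data $(\rho, f, \nu)$, the following formulas define twisted groupoids:

1. The generalized principal bundle $(G \rtimes_\rho \mathcal{G})$ with twisting 2-cocycle:
$$\sigma^{\nu f}((g, \gamma_1), (g\rho(\gamma_1), \gamma_2)) := \nu(\gamma_1, \gamma_2)\,\langle f(\gamma_1, \gamma_2), g \rangle \in Z^2(G \rtimes_\rho \mathcal{G}; U(1)).$$
2. The $\widehat{G}$-gerbe $(\widehat{G} \rtimes^f \mathcal{G})$ with twisting 2-cocycle:
$$\tau^{\rho\nu}((\phi_1, \gamma_1), (\phi_2, \gamma_2)) := \nu(\gamma_1, \gamma_2)\,\langle \phi_2, \rho(\gamma_1) \rangle \in Z^2(\widehat{G} \rtimes^f \mathcal{G}; U(1)).$$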
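To verify this, simply check that the twistings are indeed 2-cocycles, so that the pairs $(G \rtimes_\rho \mathcal{G}, \sigma^{\nu f})$ and $(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ are actually twisted groupoids.

Theorem 8.2 (Pontryagin duality for groupoids). Let $(G \rtimes_\rho \mathcal{G}, \sigma^{\nu f})$ and $(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ be twisted groupoids constructed from Pontryagin duality data $(\rho, f, \nu)$ as in Definition (8.1). Then there is a Fourier type isomorphism between the associated twisted groupoid algebras:
$$C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f}) \xrightarrow{\ \mathcal{F}\ } C^*(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu}), \qquad a \mapsto \mathcal{F}(a)(\phi, \gamma) := \int_{g \in G} a(g, \gamma)\,\phi(g^{-1}), \quad \gamma \in \mathcal{G}.$$
Also, the natural translation action of $G$ on $C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ is taken to the dual translation analogous to Eq. (7):
$$\mathcal{F}(g \cdot a)(\phi, \gamma) = \langle \phi, g \rangle \mathcal{F}(a)(\phi, \gamma) =: g \star \mathcal{F}(a)(\phi, \gamma).$$
Note, however, that $G$ only acts by vector space automorphisms (as opposed to algebra automorphisms) unless $f = 1$.

Proof. The fact that $\mathcal{F}$ is an isomorphism of Banach spaces follows because “fibrewise” this is a classical Fourier transform. Thus verifying that $\mathcal{F}$ is a $C^*$-isomorphism is a matter of seeing that $\mathcal{F}$ takes the multiplication on the first algebra into the multiplication on the second. Let us check. Set $f_{1,2} = f(\gamma_1, \gamma_2) \in \widehat{G}$, $\nu_{1,2} = \nu(\gamma_1, \gamma_2) \in U(1)$, and $\rho_i = \rho(\gamma_i) \in G$. The multiplication on $C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ is by definition
$$a * b(g, \gamma) := \int_{\gamma_1\gamma_2 = \gamma} a(g, \gamma_1)\, b(g\rho_1, \gamma_2)\, \nu_{1,2}\, \langle f_{1,2}, g \rangle = \int_{\gamma_1\gamma_2 = \gamma} a(g, \gamma_1)\, \big(f_{1,2} \star (\rho_1 \cdot b)\big)(g, \gamma_2)\, \nu_{1,2},$$
where $f_{1,2} \star (\,\cdot\,)$ is dual translation by $f_{1,2}$. Pointwise multiplication on $G$ is transformed to convolution on $\widehat{G}$, which we denote by $\hat{*}$. Group translation and dual group translation behave under the transform according to Eqs. (7) and (8). Using these rules, we have
$$\mathcal{F}(a * b)(\phi, \gamma) = \int_{\substack{\phi_1\phi_2 = \phi \\ \gamma_1\gamma_2 = \gamma}} \mathcal{F}(a)(\phi_1, \gamma_1)\ \hat{*}\ \mathcal{F}\big(f_{1,2} \star (\rho_1 \cdot b)\big)(\phi_2, \gamma_2)\, \nu_{1,2}$$
$$= \int_{\substack{\phi_1\phi_2 = \phi \\ \gamma_1\gamma_2 = \gamma}} \mathcal{F}(a)(\phi_1, \gamma_1)\, \mathcal{F}(b)(\phi_2 f_{1,2}^{-1}, \gamma_2)\, \langle \phi_2 f_{1,2}^{-1}, \rho_1 \rangle\, \nu_{1,2}$$
$$= \int_{\substack{\phi_1\phi_2 f_{1,2} = \phi \\ \gamma_1\gamma_2 = \gamma}} \mathcal{F}(a)(\phi_1, \gamma_1)\, \mathcal{F}(b)(\phi_2, \gamma_2)\, \langle \phi_2, \rho_1 \rangle\, \nu_{1,2},$$
and this last line is exactly the multiplication on $C^*(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$. The statement about $G$-actions is proved by the same computation as for Eq. (7).

Definition 8.3. A pair of a twisted generalized principal $G$-bundle and twisted $\widehat{G}$-gerbe as in Theorem (8.2) are said to be Pontryagin dual.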
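The multiplicativity computation in the proof of Theorem (8.2) can be tested on a finite toy model. The sketch below makes several assumptions that are not in the text: the groupoid is the one-object groupoid given by the group $\mathbb{Z}/m$, the group is $G = \mathbb{Z}/n$ with $\widehat{G}$ identified with $\mathbb{Z}/n$ via $\langle \phi_k, g \rangle = e^{2\pi i k g / n}$, all integrals are finite sums, and the Plancherel measure on $\widehat{G}$ is taken to be counting measure divided by $n$. The names `omega`, `fourier`, `bundle_product`, `gerbe_product` and `transform_is_multiplicative` are hypothetical.

```python
import cmath
from itertools import product

def omega(n, x):
    """exp(2*pi*i*x/n); the pairing <phi_k, g> on Z/n is omega(n, k*g)."""
    return cmath.exp(2j * cmath.pi * x / n)

def fourier(a, n, m):
    """F(a)(k, c) = sum_g a[g][c] * <phi_k, -g>."""
    return [[sum(a[g][c] * omega(n, -k * g) for g in range(n))
             for c in range(m)] for k in range(n)]

def bundle_product(a, b, n, m, rho, f, nu):
    """(a*b)(g,c) = sum_{c1+c2=c} a(g,c1) b(g+rho(c1),c2) nu(c1,c2) <f(c1,c2), g>."""
    out = [[0j] * m for _ in range(n)]
    for g, c1, c2 in product(range(n), range(m), range(m)):
        out[g][(c1 + c2) % m] += (a[g][c1] * b[(g + rho(c1)) % n][c2]
                                  * nu(c1, c2) * omega(n, f(c1, c2) * g))
    return out

def gerbe_product(ah, bh, n, m, rho, f, nu):
    """Sum over k1+k2+f(c1,c2)=k and c1+c2=c of
    ah(k1,c1) bh(k2,c2) <phi_k2, rho(c1)> nu(c1,c2), with weight 1/n."""
    out = [[0j] * m for _ in range(n)]
    for k1, k2, c1, c2 in product(range(n), range(n), range(m), range(m)):
        k = (k1 + k2 + f(c1, c2)) % n
        out[k][(c1 + c2) % m] += (ah[k1][c1] * bh[k2][c2]
                                  * omega(n, k2 * rho(c1)) * nu(c1, c2)) / n
    return out

def transform_is_multiplicative(a, b, n, m, rho, f, nu, tol=1e-9):
    """Compare F(a*b) with the twisted product of F(a) and F(b)."""
    lhs = fourier(bundle_product(a, b, n, m, rho, f, nu), n, m)
    rhs = gerbe_product(fourier(a, n, m), fourier(b, n, m), n, m, rho, f, nu)
    return all(abs(lhs[k][c] - rhs[k][c]) < tol
               for k in range(n) for c in range(m))
```

Two choices of data that satisfy the condition on $\delta\nu$ in this toy setting are $(\rho(c) = 2c,\ f = 0,\ \nu = 1)$ and $(\rho = 0,\ f = \text{carry cocycle},\ \nu = 1)$ for $n = 4$, $m = 2$. The identity being tested is purely a statement about the Fourier transform, so it holds for arbitrary functions $\rho, f, \nu$; the cocycle conditions are only needed for associativity of the two products.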
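This Pontryagin duality is independent of choice of cocycles $\rho$ and $f$ within their cohomology classes, and furthermore, if we alter $\nu$ by a closed 2-cochain (recall $\nu$ itself is not closed) we obtain a new Pontryagin dual pair. Here is the precise statement of these facts; the proof is a simple computation.

Proposition 8.4. Suppose $(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ and $(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ are Pontryagin dual and we are given $\alpha \in C^1(\mathcal{G}; \widehat{G})$ and $\beta \in C^0(\mathcal{G}; G)$ and $c \in Z^2(\mathcal{G}; U(1))$. Define $\rho' := \rho\,\delta\beta$ and $f' := f\,\delta\alpha$. Then $(G \rtimes_{\rho'} \mathcal{G}; \sigma^{\nu' f'})$ and $(\widehat{G} \rtimes^{f'} \mathcal{G}; \tau^{\rho'\nu'})$ are Pontryagin dual as well, where
$$\nu'_{1,2} := c_{1,2}\, \nu_{1,2}\, \langle f_{1,2}, \beta(s\gamma_2) \rangle^{-1} \langle \alpha_1, \rho_2 \rangle^{-1}.$$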
Since Pontryagin duality is defined in terms of a Fourier isomorphism, we have the following obvious Morita invariance property:

Proposition 8.5. Suppose we are given two sets of Pontryagin duality data $(\mathcal{G}, G, \rho, \nu, f)$ and $(\mathcal{G}', G, \rho', \nu', f')$. Then

1. $C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ is Morita equivalent to $C^*(G \rtimes_{\rho'} \mathcal{G}'; \sigma^{\nu' f'})$ if and only if $C^*(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ is Morita equivalent to $C^*(\widehat{G} \rtimes^{f'} \mathcal{G}'; \tau^{\rho'\nu'})$.
2. In particular, if $G \rtimes_\rho \mathcal{G}$ is $G$-equivariantly Morita equivalent to $G \rtimes_{\rho'} \mathcal{G}'$ and the twisted groupoid $(G \rtimes_\rho \mathcal{G}, \sigma^{\nu f})$ is Morita equivalent to $(G \rtimes_{\rho'} \mathcal{G}', \sigma^{\nu' f'})$ (see Theorem (9.1) for Morita equivalence of twisted groupoids), then all $C^*$-algebras in (1) are Morita equivalent. The same is true if the $\widehat{G}$-twisted groupoid $(\mathcal{G}, f)$ is Morita equivalent to $(\mathcal{G}', f')$ and the twisted groupoid $(\widehat{G} \rtimes^f \mathcal{G}, \tau^{\rho\nu})$ is Morita equivalent to $(\widehat{G} \rtimes^{f'} \mathcal{G}', \tau^{\rho'\nu'})$.

Note that we have not claimed that a Morita equivalence at the groupoid level produces a Morita equivalence of Pontryagin dual groupoids. Though there is often such a correspondence, it is not clear that there is always one. Here are a couple of important examples of Pontryagin duality:

Example 7. Pontryagin duality actually shows that any twisted groupoid algebra is a $C^*$-subalgebra of the (untwisted) groupoid algebra of a $U(1)$-gerbe. Using the notation of Theorem (8.2), set $\rho = \nu = 1$, $G = \mathbb{Z}$, and $\widehat{G} = U(1)$. Then the twisted groupoid $((\mathbb{Z} \rtimes_1 \mathcal{G}), \sigma^f)$ (this is a trivial $\mathbb{Z}$-bundle with twisting $\sigma^f$) is Pontryagin dual to the gerbe $(U(1) \rtimes^f \mathcal{G})$. Explicitly,
$$\sigma^f((n, \gamma_1), (n, \gamma_2)) = f(\gamma_1, \gamma_2)^n, \qquad \text{for } ((n, \gamma_1), (n, \gamma_2)) \in (\mathbb{Z} \rtimes_1 \mathcal{G})_2, \quad f \in Z^2(\mathcal{G}; U(1)).$$
But the functions in $C^*(U(1) \rtimes^f \mathcal{G}) \simeq C^*(\mathbb{Z} \rtimes_1 \mathcal{G}, \sigma^f)$ with support in $\{1\} \rtimes_1 \mathcal{G}$ clearly form a $C^*$-subalgebra identical to $C^*(\mathcal{G}; f)$.

Example 8. Again in the notation of Theorem (8.2), the case $\nu = f = 1$ shows that a generalized principal bundle $G \rtimes_\rho \mathcal{G}$ is Pontryagin dual to the twisted groupoid $(\widehat{G} \rtimes^1 \mathcal{G}, \tau^\rho)$. (The latter object is the trivial $\widehat{G}$-gerbe on $\mathcal{G}$, with twisting $\tau^\rho$.)
One might wonder if this duality can be expressed purely in terms of gerbes. The answer is that it cannot. More precisely, this duality does not extend via the association (twisted groupoids) ←→ (U (1)-gerbes)
to a duality between $U(1)$-gerbes that is implemented by the Fourier isomorphism of groupoid algebras. Indeed, the gerbe associated to the right side is $U(1) \rtimes^{\tau} (\widehat{G} \rtimes^1 \mathcal{G})$ (here $\tau = \tau^\rho$), but the extension on the left side corresponding (in the sense of having a Fourier isomorphic $C^*$-algebra) to that gerbe is $G \rtimes_\chi (\mathbb{Z} \rtimes_1 \mathcal{G})$, where $\chi(n, \gamma) := \rho(\gamma)^n$, which cannot be written as a $U(1)$-gerbe. The groupoid $G \rtimes_\chi (\mathbb{Z} \rtimes_1 \mathcal{G})$ actually corresponds to the disjoint union of the tensor powers of the generalized principal bundle.

Pontryagin duality will be used to explain “classical” T-duality, in Sect. (12). Before we get to T-duality, however, we need the tools to compare Morita equivalent twisted groupoids, and we need to know some specific Morita equivalences. The development of these tools is the subject of the next two sections.

9. Twisted Morita Equivalence

Let $\mathcal{H}$ and $\mathcal{K}$ be groupoids. A Morita $(\mathcal{H}$-$\mathcal{K})$-bimodule $P$ determines a way to compare the groupoid cohomology of $\mathcal{H}$ with that of $\mathcal{K}$. More specifically, there is a double complex $C^{ij}(P)$ associated to the bimodule such that the moment maps $\mathcal{H}_0 \leftarrow P \rightarrow \mathcal{K}_0$ induce augmentations by the groupoid cohomology complexes
$$C^i(\mathcal{H}; M) \to C^{i\bullet}(P; M) \quad \text{and} \quad C^{\bullet j}(P; M) \leftarrow C^j(\mathcal{K}; M). \qquad (9)$$
We say groupoid cocycles $c \in C^n(\mathcal{H}; M)$ and $c' \in C^n(\mathcal{K}; M)$ are cohomologous when their images in $C(P; M)$ are cohomologous. We assume for simplicity that the coefficients $M$ are constant and have the trivial actions of $\mathcal{H}$ and $\mathcal{K}$. The double complex is defined as follows:
$$C^{ij}(P) := \big(C(\mathcal{H}_j \times_{\mathcal{H}_0} P \times_{\mathcal{K}_0} \mathcal{K}_i; M);\ \delta_H, \delta_K\big).$$
The differentials $(\delta_H : C^{ij} \to C^{i\,j+1})$ and $(\delta_K : C^{ij} \to C^{i+1\,j})$ are given, for $f \in C^{ij}$, by the formulas
$$\delta_H f(h_1, \dots, h_{j+1}, p, k\text{'s}) := f(h_2, \dots, h_{j+1}, p, k\text{'s}) + \sum_{n=1 \dots j} (-1)^n f(\dots, h_n h_{n+1}, \dots, p, k\text{'s}) + (-1)^{j+1} f(h_1, \dots, h_j, h_{j+1} p, k\text{'s}),$$
$$\delta_K f(h\text{'s}, p, k_1, \dots, k_{i+1}) := f(h\text{'s}, p k_1, k_2, \dots, k_{i+1}) + \sum_{n=1 \dots i} (-1)^n f(h\text{'s}, p, \dots, k_n k_{n+1}, \dots) + (-1)^{i+1} f(h\text{'s}, p, k_1, \dots, k_i).$$
Our main reason for comparing cohomology of Morita equivalent groupoids is the following theorem. The first statement of the theorem is a classic statement about (stack theoretic) gerbes, but we will reproduce it for convenience.
Theorem 9.1. Let $P$ be a Morita equivalence $(\mathcal{H}$-$\mathcal{K})$-bimodule, let $M$ be an Abelian group acted upon trivially by the two groupoids, and suppose we are given 2-cocycles $\psi \in Z^2(\mathcal{H}; M)$ and $\chi \in Z^2(\mathcal{K}; M)$ whose images $\tilde{\psi}$ and $\tilde{\chi}$ in the double complex of the Morita equivalence are cohomologous. Then

1. The $M$-gerbes $M \rtimes^\psi \mathcal{H}$ and $M \rtimes^\chi \mathcal{K}$ are Morita equivalent.
2. For any group homomorphism $\phi : M \to N$, the $N$-gerbes $N \rtimes^{\phi\circ\psi} \mathcal{H}$ and $N \rtimes^{\phi\circ\chi} \mathcal{K}$ are Morita equivalent.
3. For any group homomorphism $\phi : M \to U(1)$, the twisted $C^*$-algebras $C^*(\mathcal{H}; \phi\circ\psi)$ and $C^*(\mathcal{K}; \phi\circ\chi)$ are Morita equivalent.

For example when $M = U(1)$ with $\mathcal{H}$ and $\mathcal{K}$ acting trivially, the representations of $M$ are identified with the integers, $(u \mapsto u^n)$, and the theorem implies that the “gerbes of weight $n$”, $C^*(\mathcal{H}; \psi^n)$ and $C^*(\mathcal{K}; \chi^n)$, are Morita equivalent.

Proof. We begin by proving the statement about $M$-gerbes. The idea is that a cochain $(\mu, \nu^{-1}) \in C^{1,0} \times C^{0,1}$ satisfying
$$D(\mu, \nu^{-1}) := (\delta_H\mu,\ \delta_K\mu\,\delta_H\nu,\ \delta_K\nu^{-1}) = (\tilde{\psi}, 0, \tilde{\chi}^{-1}) \in C^{2,0} \times C^{1,1} \times C^{0,2} \qquad (10)$$
provides exactly the data needed to form a Morita $(M \rtimes^\psi \mathcal{H})$-$(M \rtimes^\chi \mathcal{K})$-bimodule structure on $M \times P$. Indeed, define for $m$'s $\in M$, $h$'s $\in \mathcal{H}$, $k$'s $\in \mathcal{K}$, and $p$'s $\in P$,

1. Left multiplication of $M \rtimes^\psi \mathcal{H}$: $(m_1, h) * (m_2, p) := (m_1 m_2\, \mu(h, p),\ hp)$,
2. Right multiplication of $M \rtimes^\chi \mathcal{K} \rightrightarrows \mathcal{K}_0$: $(m_1, p) * (m_2, k) := (m_1 m_2\, \nu(p, k),\ pk)$.

Then

$\delta_H\mu = \tilde{\psi}$ ⇔ the left multiplication is homomorphic,
$\delta_K\nu = \tilde{\chi}$ ⇔ the right multiplication is homomorphic,
$\delta_K\mu = \delta_H\nu^{-1}$ ⇔ the left and right multiplications commute.

For example the equality
$$\delta_K\nu(p, k_1, k_2) := \nu(pk_1, k_2)\, \nu(p, k_1 k_2)^{-1}\, \nu(p, k_1) = \tilde{\chi}(p, k_1, k_2) =: \chi(k_1, k_2)$$
holds if and only if
$$((m, p) * (m_1, k_1)) * (m_2, k_2) = (m m_1 m_2\, \nu(p, k_1)\nu(pk_1, k_2),\ pk_1k_2) = (m (m_1 m_2\, \chi(k_1, k_2))\, \nu(p, k_1k_2),\ pk_1k_2) = (m, p) * ((m_1, k_1) * (m_2, k_2)).$$
Now we will check that the right action is principal and that the orbit space $(M \times P)/(M \rtimes^\chi \mathcal{K})$ is isomorphic to $\mathcal{H}_0$. Suppose $(m, p) * (m_1, k_1) = (m, p) * (m_2, k_2)$. Then $k_1 = k_2$ since the action of $\mathcal{K}$ is principal, and then $m_1 = m_2$ is forced, so the action is principal. Next, the equation
$$(m, p) * (m_1 \nu(p, k)^{-1}, k) = (m m_1, pk)$$
makes it clear that the orbit space $(M \times P)/(M \rtimes^\chi \mathcal{K})$ is the same as the orbit space $P/\mathcal{K}$, which is $\mathcal{H}_0$. The situation is obviously symmetric, so the left action satisfies the analogous properties, and thus the first statement of the theorem is proved.
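The equivalence “$\delta_K\nu = \tilde{\chi}$ ⇔ the right multiplication is homomorphic” used in the proof can be illustrated on a small finite example. This is only a sketch under assumptions made for the illustration: $M = \mathbb{Z}/4$ and $\mathcal{K} = \mathbb{Z}/3$ are written additively, $P = \mathcal{K}$ with the right translation action, $\chi$ is the carry 2-cocycle, and $\nu(p, k) := \chi(p, k)$. None of these choices, nor the names `chi`, `nu`, `gerbe_mul`, `right_act`, come from the text.

```python
from itertools import product

N_M, N_K = 4, 3  # hypothetical sizes: M = Z/4, K = Z/3, both additive

def chi(k1, k2):
    """Carry 2-cocycle on Z/3 with values in Z/4 (arguments in range(N_K))."""
    return 1 if k1 + k2 >= N_K else 0

def nu(p, k):
    """A cochain on P = K whose K-coboundary is chi; here nu = chi."""
    return chi(p, k)

def gerbe_mul(x, y):
    """Multiplication in the M-gerbe: (m1,k1)(m2,k2) = (m1+m2+chi(k1,k2), k1+k2)."""
    (m1, k1), (m2, k2) = x, y
    return ((m1 + m2 + chi(k1, k2)) % N_M, (k1 + k2) % N_K)

def right_act(x, y):
    """Right multiplication on M x P: (m,p)*(m1,k) = (m+m1+nu(p,k), p+k)."""
    (m, p), (m1, k) = x, y
    return ((m + m1 + nu(p, k)) % N_M, (p + k) % N_K)

def delta_K_nu_equals_chi():
    """Additive form of nu(pk1,k2) nu(p,k1k2)^{-1} nu(p,k1) = chi(k1,k2)."""
    return all(
        (nu((p + k1) % N_K, k2) - nu(p, (k1 + k2) % N_K)
         + nu(p, k1) - chi(k1, k2)) % N_M == 0
        for p, k1, k2 in product(range(N_K), repeat=3))

def right_action_is_homomorphic(act=right_act):
    """Check ((m,p)*y)*z == (m,p)*(y z) for all elements."""
    elems = list(product(range(N_M), range(N_K)))
    return all(act(act(x, y), z) == act(x, gerbe_mul(y, z))
               for x, y, z in product(elems, repeat=3))
```

Replacing `nu` by the zero cochain breaks the coboundary condition, and the corresponding untwisted right multiplication then fails the homomorphism check, which matches the direction of the equivalence that the proof relies on.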
The second statement now follows immediately because whenever Eq. (10) is satisfied for the quadruple $(\mu, \nu, \psi, \chi)$ in $C(P; M)$, it is also satisfied for $(\phi\circ\mu, \phi\circ\nu, \phi\circ\psi, \phi\circ\chi)$ in $C(P; N)$. In other words $\phi\circ\psi$ and $\phi\circ\chi$ are cohomologous.

The third statement of the theorem can be proved directly by exhibiting a $C^*$-algebra bimodule, but this will not be necessary because we already have a Morita $C^*(M \rtimes^\psi \mathcal{H})$-$C^*(M \rtimes^\chi \mathcal{K})$-bimodule, coming from Proposition (5.1) and the fact that $M \rtimes^\psi \mathcal{H}$ and $M \rtimes^\chi \mathcal{K}$ are Morita equivalent. We will manipulate this bimodule into a $C^*(\mathcal{H}, \phi\circ\psi)$-$C^*(\mathcal{K}, \phi\circ\chi)$-bimodule.

Note that $(\widehat{M} \rtimes_1 \mathcal{H}, \sigma^\psi)$ is Pontryagin dual to $(M \rtimes^\psi \mathcal{H})$ and $(\widehat{M} \rtimes_1 \mathcal{K}, \sigma^\chi)$ is Pontryagin dual to $(M \rtimes^\chi \mathcal{K})$, thus the associated $C^*$-algebras are pairwise Fourier isomorphic. Then the Morita equivalence bimodule (which is a completion of $C_c(M \times P)$, where $P$ is the $\mathcal{H}$-$\mathcal{K}$-bimodule) for $C^*(M \rtimes^\psi \mathcal{H})$ and $C^*(M \rtimes^\chi \mathcal{K})$ is taken via the Fourier isomorphism to a Morita equivalence bimodule between $C^*(\widehat{M} \rtimes_1 \mathcal{H}, \sigma^\psi)$ and $C^*(\widehat{M} \rtimes_1 \mathcal{K}, \sigma^\chi)$. This Fourier transformed bimodule will be a completion $X$ of $C_c(\widehat{M} \times P)$. Then for $\phi \in \widehat{M}$, evaluation at $\phi$ determines projections
$$\mathrm{ev}_\phi : C^*(\widehat{M} \rtimes_1 \mathcal{H}, \sigma^\psi) \to C^*(\mathcal{H}, \phi\circ\psi),$$
$$\mathrm{ev}_\phi : C^*(\widehat{M} \rtimes_1 \mathcal{K}, \sigma^\chi) \to C^*(\mathcal{K}, \phi\circ\chi),$$
$$\text{and} \quad \mathrm{ev}_\phi : C_c(\widehat{M} \times P) \to C_c(P),$$
that are compatible with the bimodule structure of $X$, so $\mathrm{ev}_\phi(X)$ is automatically a Morita equivalence $C^*(\mathcal{H}, \phi\circ\psi)$-$C^*(\mathcal{K}, \phi\circ\chi)$-bimodule. Thus $X$ is actually a family of Morita equivalences parameterized by $\phi \in \widehat{M}$, and in particular the third statement of the theorem is true.

Remark 9.2. Just as a Morita equivalence lets you compare cohomology, a $G$-equivalence lets you compare $G$-equivariant cohomology. Indeed, all complexes involved will have the commuting $G$-actions, and there will be an associated tricomplex which is coaugmented by the complexes computing equivariant cohomology of the two groupoids.
Rather than using the tricomplex, however, one can map the equivariant complexes into the complexes of the crossed product groupoids (as in Eq. (5)) and do the comparing there. The result is the same but somewhat less tedious to compute. 10. Some Facts about Groupoid Cohomology Now there is good motivation for wanting to know when groupoid cocycles are cohomologous. In this section we will collect some facts that help in that pursuit. The Morita equivalences that come from actual groupoid homomorphisms have nice properties with respect to cohomology. They are called essential equivalences. Definition 10.1. [C]. Let φ : G → H be a morphism of topological groupoids (i.e. a continuous functor). This morphism determines a right principal G-H-bimodule Pφ := G0 ×H0 H1 . If Pφ is a Morita equivalence bimodule (which follows if it is left principal) then φ is called an essential equivalence. Note that for any morphism φ the left moment map Pφ → G0 has a canonical section. Proposition 10.2. Let G and H be topological groupoids, and suppose P is a Morita
equivalence $\mathcal{G}$-$\mathcal{H}$-bimodule with left and right moment maps $\mathcal{G}_0 \xleftarrow{\ \ell\ } P \xrightarrow{\ \rho\ } \mathcal{H}_0$. Then:
1. $P$ is equivariantly isomorphic to $P_\phi$ for some essential equivalence $\phi : \mathcal{G} \to \mathcal{H}$ if and only if the left moment map $\ell : P \to \mathcal{G}_0$ admits a continuous section.
2. Any morphism $\phi : \mathcal{G} \to \mathcal{H}$ determines, via pullback, a chain morphism $\phi^* : C^\bullet(\mathcal{H}) \to C^\bullet(\mathcal{G})$.
3. The moment maps $\ell$ and $\rho$ determine chain morphisms
$$C^\bullet(\mathcal{G}) \xrightarrow{\ \ell^*\ } \mathrm{tot}(C^{\bullet\bullet}(P)) \xleftarrow{\ \rho^*\ } C^\bullet(\mathcal{H}).$$
4. Any continuous section of $\ell : P \to \mathcal{G}_0$ determines a contraction of the coaugmented complex $C^k(\mathcal{G}) \to C^{\bullet k}(P)$ for each $k$, and thus a quasi-inverse $[\ell^*]^{-1} : H^*(\mathrm{tot}(C(P))) \to H^*(\mathcal{G})$, and thus a homomorphism $[\ell^*]^{-1} \circ [\rho^*] : H^*(\mathcal{H}) \to H^*(\mathcal{G})$.
5. The two chain morphisms $\ell^* \circ \phi^*$ and $\rho^*$ are homotopic; in particular $[\phi^*] = [\ell^*]^{-1} \circ [\rho^*]$.

Proof. The first statement is an easy exercise and the next two statements do not require proof. The homotopy for the fourth statement is in the proof of Lemma 1 [C] and the homotopy for the fifth statement is written on page (8) of [C].

Corollary 10.3. Suppose that $\phi : \mathcal{G} \to \mathcal{H}$ is an essential equivalence, that $M$ is a locally compact abelian group viewed as a trivial module over both groupoids, and that $\chi \in Z^2(\mathcal{H}; M)$ is a 2-cocycle. Then $\chi$ and $\phi^*\chi$ satisfy the conditions of Theorem (9.1); in particular $M \rtimes^\chi \mathcal{H}$ is Morita equivalent to $M \rtimes^{\phi^*\chi} \mathcal{G}$.

Proof. Use the notation of Proposition (10.2). The images of $\phi^*\chi$ and $\chi$ in $C^{\bullet\bullet}(P)$ are $\ell^* \circ \phi^*\chi$ and $\rho^*\chi$ respectively, which are cohomologous by statement (5), and these are precisely the conditions of Theorem (9.1).

Corollary 10.4. If $\phi : \mathcal{G} \to \mathcal{H}$ is an essential equivalence such that the right moment map $P_\phi \xrightarrow{\ \rho\ } \mathcal{H}_0$ admits a section then $[\phi^*] : H^*(\mathcal{H}) \to H^*(\mathcal{G})$ is an isomorphism.

Proof. This follows from Corollary (10.3) and Part (4) of Proposition (10.2).
Corollaries (10.3) and (10.4) will be very useful because all of the Morita equivalences that have been introduced so far are essential equivalences, as the next proposition shows. Before the proposition let us describe two more groupoids.

Example 9. Let $\rho : \mathcal{G} \to G/N$ be a homomorphism and let $G \times_{G/N,\rho} \mathcal{G}_1$ denote the fibred product (i.e. the space $\{(g, \gamma) \mid gN = \rho(\gamma) \in G/N\}$). Define
$$G \times_{G/N,\rho} \mathcal{G} := (G \times_{G/N,\rho} \mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$$
with structure maps $s(g, \gamma) := s\gamma$, $r(g, \gamma) := r\gamma$, and $(g_1, \gamma_1) \circ (g_2, \gamma_2) := (g_1 g_2, \gamma_1\gamma_2)$. Note that any lift of $\rho$ to a continuous map $\tilde{\rho} : \mathcal{G}_1 \to G$ determines an isomorphism of groupoids
$$N \rtimes^{\delta\tilde{\rho}} \mathcal{G} \longrightarrow G \times_{G/N,\rho} \mathcal{G}, \qquad (n, \gamma) \mapsto (n\tilde{\rho}(\gamma), \gamma).$$
When no such lift exists this fibred product groupoid is an example of an $N$-gerbe without section.
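Example 10. The fibred product groupoid of Example (9) is equipped with a canonical right module $P := G \times \mathcal{G}_0$, for which the induced groupoid is
$$G \ltimes (G \times_{G/N,\rho} \mathcal{G}) := ((G \times G \times_{G/N,\rho} \mathcal{G}_1) \rightrightarrows G \times \mathcal{G}_0)$$
with structure maps $s(g, g', \gamma) := (gg', s\gamma)$, $r(g, g', \gamma) := (g, r\gamma)$, and $(g, g_1, \gamma_1) \circ (gg_1, g_2, \gamma_2) := (g, g_1 g_2, \gamma_1\gamma_2)$. A lift $\tilde{\rho}$ determines a homeomorphism
$$G \ltimes (N \rtimes^{\delta\tilde{\rho}} \mathcal{G}) \longrightarrow G \ltimes (G \times_{G/N,\rho} \mathcal{G}), \qquad (g, n, \gamma) \mapsto (g, n\tilde{\rho}(\gamma), \gamma)$$
which implicitly defines the groupoid structure on $G \ltimes N \rtimes^{\delta\tilde{\rho}} \mathcal{G}$ as the one making this a groupoid isomorphism.

Proposition 10.5 (Essential equivalences). Let $\mathcal{G}$ be a groupoid, $N$ a closed subgroup of a locally compact group $G$, $N_G(N)$ its normalizer in $G$, and $\rho : \mathcal{G} \to N_G(N)/N \subset G/N$ a homomorphism. Then

1. The gluing morphism $\mathcal{G}_U \to \mathcal{G}$ from a refinement $\mathcal{G}_U$ corresponding to a locally finite cover $U$ of $\mathcal{G}_0$ (see Example (1)) is an essential equivalence.
2. The quotient morphism
$$X \rtimes N \to (X/N \rightrightarrows X/N), \qquad (x, n) \mapsto xN \in X/N,$$
for $X$ a free and proper right $N$-space, is an essential equivalence.
3. The inclusion
$$\iota : (G \times_{G/N,\rho} \mathcal{G}) \to G \ltimes G/N \rtimes_\rho \mathcal{G}, \qquad (g, \gamma) \mapsto (g, eN, \gamma)$$
is an essential equivalence and in particular if $\rho$ lifts to a continuous map $\tilde{\rho} : \mathcal{G} \to G$ then
$$\iota : (N \rtimes^{\delta\tilde{\rho}} \mathcal{G}) \to G \ltimes (G/N \rtimes_\rho \mathcal{G}), \qquad (n, \gamma) \mapsto (n\tilde{\rho}(\gamma), eN, \gamma)$$
is an essential equivalence.
4. The quotient map
$$\kappa : G \ltimes (G \times_{G/N,\rho} \mathcal{G}) \to G/N \rtimes_\rho \mathcal{G}, \qquad (g, g', \gamma) \mapsto (gN, \gamma)$$
is an essential equivalence and in particular if $\rho$ lifts to a continuous map $\tilde{\rho} : \mathcal{G} \to G$ then
$$\kappa : G \ltimes (N \rtimes^{\delta\tilde{\rho}} \mathcal{G}) \to G/N \rtimes_\rho \mathcal{G}, \qquad (g, n, \gamma) \mapsto (gN, \gamma)$$
is an essential equivalence.
5. If $\mathcal{G}$ is a Čech groupoid, $N$ is normal in $G$, and $Q := (G/N \times \mathcal{G}_0)/(t, r\gamma) \sim (t\rho(\gamma), s\gamma)$ is the principal $G/N$-bundle on $X = \mathcal{G}_0/\mathcal{G}_1$ with transition functions given by $\rho$, then the quotient $q : (G/N \rtimes_\rho \mathcal{G}) \longrightarrow (Q \rightrightarrows Q)$ is an essential equivalence.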
Furthermore, in all cases where there are obvious left $G$ actions, these are essential $G$-equivalences.

Proof. For the groupoids that were introduced in Sect. (4), simply check that the bimodules determined by these morphisms are the same as the Morita equivalence bimodules that were described there. The two new cases involving Examples (9) and (10) are very similar and are left for the reader. The statement about $G$-equivariance is clear.

Proposition 10.6. In the notation of Proposition (10.5), suppose that $G \to G/N$ admits a continuous section (for example if $N = \{e\}$, or if $N$ is a component of $G$, or if $G$ is discrete). Then the essential equivalences $\iota$ and $\kappa$ induce isomorphisms of groupoid cohomology.

Proof. By Corollary (10.4), we only need to produce sections of the right moment maps, and the section $\sigma : G/N \to G$ provides these. The case $\mathcal{G} = *$ illustrates the answer: for $\iota$,
$$G/N \ni gN \mapsto (*, \sigma(gN), eN) \in \{*\} \times G \times eN \simeq P_\iota$$
does the job, while for $\kappa$ it is
$$G/N \ni gN \mapsto (\sigma(gN), gN) \in G \times_{G/N} G/N \simeq P_\kappa.$$

The following proposition is important because it implies that cocycles on crossed product groupoids can be assumed to be in a special form.

Proposition 10.7. The groupoid cohomology $H^*(G \ltimes (G \rtimes_\rho \mathcal{G}); B)$ is a direct summand of the equivariant cohomology $H^*_G(G \rtimes_\rho \mathcal{G}; B)$.

Proof. We will show that the identity map on $H^*(G \ltimes (G \rtimes_\rho \mathcal{G}); B)$ factors through $H^*_G(G \rtimes_\rho \mathcal{G}; B)$. The inclusion $\iota : \mathcal{G} \to G \ltimes (G \rtimes_\rho \mathcal{G})$ sending $\gamma \mapsto (\rho(\gamma), e, \gamma)$ induces a chain map
$$\iota^* : C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}); B) \longrightarrow C^\bullet(\mathcal{G}; B)$$
which is a quasi-isomorphism by Proposition (10.6). The quotient $G \ltimes (G \rtimes_\rho \mathcal{G}) \longrightarrow \mathcal{G}$ induces the chain map
$$q^* : C^\bullet(\mathcal{G}; B) \longrightarrow C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}); B)$$
which is also a quasi-isomorphism, and in fact induces the quasi-inverse to $\iota^*$ since $\iota^* \circ q^* = (q \circ \iota)^* = \mathrm{Id}^*$. Now the quotient $G \rtimes_\rho \mathcal{G} \to \mathcal{G}$ induces a chain map $C^\bullet(\mathcal{G}) \to C^\bullet(G \rtimes_\rho \mathcal{G})$ and thus a chain map
$$C^\bullet(\mathcal{G}) \to C^\bullet(G \rtimes_\rho \mathcal{G}) \to \mathrm{tot}\, C^\bullet(G, C^\bullet(G \rtimes_\rho \mathcal{G})).$$
Note that the first map and the composition of both maps are chain morphisms, but the second is not.

Following this by the morphism $\mathrm{tot}\, C^\bullet(G, C^\bullet(G \rtimes_\rho \mathcal{G})) \to C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}))$ of Eq. (5) induces a sequence
$$C^\bullet(\mathcal{G}) \longrightarrow \mathrm{tot}\, C^\bullet(G; C^\bullet(G \rtimes_\rho \mathcal{G})) \longrightarrow C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}))$$
whose composition is easily seen to equal $q^*$. Finally, precomposing with $\iota^*$ provides the promised factorization.
Remark 10.8. Whenever a group $G$ acts freely and properly on a groupoid $\mathcal{H}$ and $\mathcal{H} \to G\backslash\mathcal{H}$ has local sections, then $\mathcal{H}$, being a principal $G$-bundle, is equivariantly Morita equivalent to $G \rtimes_\rho \mathcal{G}$ for some $\rho$ and $\mathcal{G}$, where $\mathcal{G}$ is a refinement of the quotient groupoid $\mathcal{H}/G := (\mathcal{H}_1/G \rightrightarrows \mathcal{H}_0/G)$. In particular there is a refinement of $\mathcal{H}$ that is equivariantly isomorphic to $G \rtimes_\rho \mathcal{G}$. Thus after a possible refinement the above proposition is a statement about the cohomology of all free and proper $G$-groupoids with local sections.

For completeness we will include the following proposition, which might be attributable to Haefliger. It implies that one can work exclusively with essential equivalences if desired.

Proposition 10.9. Let $P$ be a $\mathcal{G}$-$\mathcal{H}$-Morita equivalence which is locally trivial as an $\mathcal{H}$-module. Then the Morita equivalence can be factored in the form
$$\mathcal{G} \longleftarrow \mathcal{G}_U \xrightarrow{\ \phi\ } \mathcal{H},$$
where $\mathcal{G}_U$ is a refinement of $\mathcal{G}$ and $\phi$ is an essential equivalence.

Proof. The left moment map for $P$ admits local sections so there is a cover $U$ of $\mathcal{G}_0$ such that the refined Morita $\mathcal{G}_U$-$\mathcal{H}$-bimodule $P_U$ has a section for its left moment map. By Proposition (10.2), $P_U \simeq P_\phi$ for some essential equivalence $\phi$.

11. Generalized Mackey-Rieffel Imprimitivity

In this section we show that a specific pair of twisted groupoids is Morita equivalent. The two Morita equivalent twisted groupoids correspond, respectively, to a $U(1)$-gerbe on a crossed product groupoid for a $G$-action on a generalized principal $G/N$-bundle, and to a $U(1)$-gerbe over an $N$-gerbe. We also describe group actions on the groupoids that make the Morita equivalence equivariant. The equivalence is a simple consequence of the methods of Sect. (10), but it deserves to be singled out because both twisted groupoids appear in the statement of classical T-duality.

Theorem 11.1 (Generalized Mackey-Rieffel imprimitivity). Let $\mathcal{G}$ be a groupoid, $G$ a locally compact group, $N < G$ a closed normal subgroup, and $\bar{\rho} : \mathcal{G} \to G/N$ a homomorphism that admits a continuous lift $\rho : \mathcal{G} \to G$. Form the two groupoids of Example (6):

1. $\mathcal{H} := G \ltimes (G/N \rtimes_{\bar{\rho}} \mathcal{G})$,
2. $\mathcal{K} := N \rtimes^{\delta\rho} \mathcal{G}$.

Then for any $G$-equivariant gerbe presented by a 2-cocycle $(\sigma, \lambda, \beta)$ as in Eq. (6), there is a Morita equivalence of twisted groupoids $(\mathcal{H}, \psi) \sim (\mathcal{K}, \chi)$, where $\psi \in Z^2(\mathcal{H}; U(1))$ and $\chi \in Z^2(\mathcal{K}; U(1))$ are given by
$$\psi((g_1, t, \gamma_1), (g_2, g_1^{-1} t\rho(\gamma_1), \gamma_2)) := \sigma(t, \gamma_1, \gamma_2)\, \lambda(g_1, t\rho(\gamma_1), \gamma_2) \times \beta(g_1, g_2, t\rho(\gamma_1)\rho(\gamma_2), s\gamma_2), \qquad (11)$$
$$\chi((n_1, \gamma_1), (n_2, \gamma_2)) := \sigma(e_{G/N}, \gamma_1, \gamma_2)\, \lambda(n_1\rho(\gamma_1), \rho(\gamma_1), \gamma_2) \times \beta(n_1\rho(\gamma_1), n_2\rho(\gamma_2), \rho(\gamma_1)\rho(\gamma_2), s\gamma_2), \qquad (12)$$
for $g$'s $\in G$, $t \in G/N$, $n$'s $\in N$, and $\gamma$'s $\in \mathcal{G}$.
Proof. For the sake of understanding, let us first see where $\psi$ comes from. Set $(g_1, h_1) = (g_1, t, \gamma_1)$, $(g_2, h_2) = (g_2, g_1^{-1} t\rho(\gamma_1), \gamma_2)$. Then a composed pair looks like $(g_1, h_1)(g_2, h_2) = (g_1 g_2, h_1\, g_1 h_2)$, and
$$\psi = \sigma(h_1, g_1 h_2)\, \lambda(g_1, g_1 h_2) \times \beta(g_1, g_2, g_1 g_2 (s h_2)).$$
In other words, $\psi = F(\sigma, \lambda, \beta)$, the image of $(\sigma, \lambda, \beta)$ under the chain map
$$F : \mathrm{tot}\, C^\bullet(G, C^\bullet((G/N \rtimes_{\bar{\rho}} \mathcal{G}), U(1))) \to C^\bullet(\mathcal{H}; U(1))$$
of Eq. (5). So $\psi$ comes from extending a 2-cocycle from $(G/N \rtimes_{\bar{\rho}} \mathcal{G})$ to an equivariant 2-cocycle. Now $\chi = \iota^*\psi$, where $\iota : N \rtimes^{\delta\rho} \mathcal{G} \to G \ltimes G/N \rtimes_{\bar{\rho}} \mathcal{G}$ is as in Proposition (10.5), and the theorem follows immediately from the results in Sect. (10).

Remark 11.2. In the hypotheses of the above theorem it is not necessary to have the lift $\rho$. In the absence of such a lift, one simply replaces $N \rtimes^{\delta\rho} \mathcal{G}$ by $G \times_{G/N,\bar{\rho}} \mathcal{G}$ as in Example (9).

When $G$ is abelian there is a canonical action of $\widehat{G}$ on $U(1) \rtimes^\psi \mathcal{H}$ (denoted $\hat{\alpha}$), given by the formula
$$\phi \cdot (\theta, g, t, \gamma) = (\theta\langle \phi, g \rangle, g, t, \gamma), \qquad \phi \in \widehat{G}, \quad (\theta, g, t, \gamma) \in U(1) \rtimes^\psi \mathcal{H}. \qquad (13)$$
This action corresponds to the natural $\widehat{G}$-action on a crossed product algebra $G \ltimes A$. As one expects, the above Morita equivalence takes this action to the same action, pulled back via $\iota$.

Proposition 11.3. In the notation of Theorem (11.1) let $G$ be an abelian group. Then the canonical $\widehat{G}$-action, $\hat{\alpha}$, on $U(1) \rtimes^\psi \mathcal{H}$ is transported under the Morita equivalence $(U(1) \rtimes^\psi \mathcal{H}) \sim (U(1) \rtimes^\chi \mathcal{K})$ to
$$\iota^*(\hat{\alpha})(\phi)(\theta, n, \gamma) := (\theta\langle \phi, n\rho(\gamma) \rangle, n, \gamma), \qquad (\theta, n, \gamma) \in U(1) \rtimes^\chi \mathcal{K}.$$
Thus with these actions the Morita equivalence of Theorem (11.1) becomes $\widehat{G}$-equivariant.

Proof. $U(1) \times P_\iota$ is the Morita $(U(1) \rtimes^\psi \mathcal{H})$-$(U(1) \rtimes^\chi \mathcal{K})$-bimodule here. For $\phi \in \widehat{G}$ and $(\theta, g, \gamma) \in U(1) \times P_\iota$, define an action by $\phi \cdot (\theta, g, \gamma) = (\theta\langle \phi, g \rangle, g, \gamma)$. Then this action, along with the actions $\hat{\alpha}$ and $\iota^*\hat{\alpha}$, determines an equivariant Morita equivalence (that is, Eq. (1) is satisfied for these actions).

12. Classical T-Duality

Let us start with the definition. For us classical T-duality will refer to a T-duality between a $U(1)$-gerbe on a generalized torus bundle and a $U(1)$-gerbe on a generalized principal dual-torus bundle. If instead of a torus bundle we start with a $G/N$-bundle, for $N$ a closed subgroup of an abelian locally compact group $G$, then we still call this classical T-duality since it is the same phenomenon. Thus “classical” means for us that the groups involved are abelian and furthermore that the dual object does not involve noncommutative geometry. We are now ready to fully describe the T-dualization procedure. There will be several remarks afterwards.

Classical T-duality. Suppose we are given the data of a generalized principal $G/N$-bundle and $G$-equivariant twisting 2-cocycle whose $C^{2,0}$-component $\beta \in C^2(G; C^0(G/N \rtimes_{\bar{\rho}} \mathcal{G}; U(1)))$ is trivial:
$$(G/N \rtimes_{\bar{\rho}} \mathcal{G},\ (\sigma, \lambda, 1) \in Z^2_G(G/N \rtimes_{\bar{\rho}} \mathcal{G}; U(1))). \qquad (14)$$
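Then this is the initial data for a classically T-dualizable bundle, and the following steps produce its T-dual.

Remark 12.1. Because $G$ is abelian the restriction of $\lambda : G \times G/N \times \mathcal{G}_1 \to U(1)$ to $N \times G/N \times \mathcal{G}_1$ does not depend on $G/N$. Indeed, for $n \in N$, $g \in G$, and $t \in G/N$,
$$\lambda(n, t, \gamma) = \lambda(ng, t, \gamma)\lambda(g, t, \gamma)^{-1} = \lambda(gn, t, \gamma)\lambda(g, t, \gamma)^{-1} = \lambda(n, g^{-1}t, \gamma).$$
Furthermore, the cocycle conditions ensure that $\lambda|_{N \times G/N \times \mathcal{G}_1}$ is homomorphic in both $N$ and $\mathcal{G}$. We write $\bar{\lambda}$ for the induced homomorphism $\bar{\lambda} : \mathcal{G} \to \widehat{N}$.

Step 1. Pass from $(G/N \rtimes_{\bar{\rho}} \mathcal{G}, (\sigma, \lambda, 1))$ to the crossed product groupoid $G \ltimes (G/N \rtimes_{\bar{\rho}} \mathcal{G})$, with twisting 2-cocycle $\psi := F(\sigma, \lambda, 1)$ of Eq. (11), and with canonical $\widehat{G}$-action $\hat{\alpha}$ of Eq. (13):
$$(G \ltimes (G/N \rtimes_{\bar{\rho}} \mathcal{G}),\ \psi,\ \hat{\alpha}).$$
Step 2. Choose a lift $\rho : \mathcal{G} \to G$ of $\bar{\rho}$ and pass from $G \ltimes (G/N \rtimes_{\bar{\rho}} \mathcal{G})$ to the Morita equivalent $N$-gerbe $N \rtimes^{\delta\rho} \mathcal{G}$ with twisting 2-cocycle $\iota^*\psi$ and $\widehat{G}$-action $\iota^*\hat{\alpha}$ of Proposition (11.3):
$$(N \rtimes^{\delta\rho} \mathcal{G},\ \iota^*\psi,\ \iota^*\hat{\alpha}).$$
Step 3. Pass to the Pontryagin dual system. More precisely, pass to the twisted groupoid with $\widehat{G}$-action whose twisted groupoid algebra is $\widehat{G}$-equivariantly isomorphic to the algebra of Step 2, when the chosen isomorphism is Fourier transform in the $N$-direction. The result is an $\widehat{N}$-bundle, but not exactly of the form we have been considering. Here is the Pontryagin dual system:

• Groupoid: $\mathcal{K} := (\widehat{N} \times \mathcal{G}_1 \rightrightarrows \widehat{N} \times \mathcal{G}_0)$.
• Source and range: $s(\phi, \gamma) := (\phi, s\gamma)$, $r(\phi, \gamma) := (\phi\bar{\lambda}(\gamma)^{-1}, r\gamma)$ for $(\phi, \gamma) \in \mathcal{K}_1$.
• Multiplication: $(\phi\bar{\lambda}(\gamma_2)^{-1}, \gamma_1)(\phi, \gamma_2) := (\phi, \gamma_1\gamma_2)$, where $\bar{\lambda} := \lambda|_{N \times \{1\} \times \mathcal{G}_1}$ (which is homomorphic in $N$), viewed as a map $\bar{\lambda} : \mathcal{G}_1 \to \widehat{N}$.
• Twisting: $\tau((\phi\bar{\lambda}(\gamma_2)^{-1}, \gamma_1), (\phi, \gamma_2)) := \sigma(e, \gamma_1, \gamma_2)\, \lambda(\rho(\gamma_1), \gamma_2)\, \langle \phi, \delta\rho(\gamma_1, \gamma_2) \rangle$.
• $\widehat{G}$-action: $\phi \cdot a(\phi', \gamma) := \langle \phi, \rho(\gamma) \rangle\, a(\phi^{-1}\phi', \gamma)$ for $\phi \in \widehat{G}$ and $a \in C_c(\mathcal{K}_1)$.

For aesthetic reasons, it is preferable to put this data in the same form as Eq. (14). To do this first note that $\mathcal{K}$ is isomorphic to $\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}$ via:
$$\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G} \xrightarrow{\ \sim\ } \mathcal{K}, \qquad (\phi, \gamma) \mapsto (\phi\bar{\lambda}(\gamma), \gamma).$$
Using this isomorphism to import the twisting and $\widehat{G}$-action on $\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}$ from $\mathcal{K}$ determines that $\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}$ must have:

• Twisting: $\sigma^\vee(\phi, \gamma_1, \gamma_2) := \sigma(e, \gamma_1, \gamma_2)\, \lambda(\rho(\gamma_1), \gamma_2)\, \langle \phi\bar{\lambda}(\gamma_1)\bar{\lambda}(\gamma_2), \delta\rho(\gamma_1, \gamma_2) \rangle$.
• $\widehat{G}$-action: $\phi \cdot a(\phi', \gamma) := \langle \phi, \rho(\gamma) \rangle\, a(\phi^{-1}\phi', \gamma)$ for $\phi \in \widehat{G}$ and $a \in C_c(\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G})$.

But this is again classical T-duality data, indeed it is what we write as:
$$(\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G},\ (\sigma^\vee, \rho, 1) \in Z^2_{\widehat{G}}(\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}; U(1))).$$
Thus the classical T-dual pair is
$$(G/N \rtimes_{\bar{\rho}} \mathcal{G}, (\sigma, \lambda, 1)) \longleftrightarrow (\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}, (\sigma^\vee, \rho, 1)). \qquad (15)$$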
Some properties of the duality.

(A) Taking $C^*$-algebras everywhere, the dualizing process becomes:
$$C^*(G/N \rtimes_{\bar{\rho}} \mathcal{G}; \sigma) \rightsquigarrow G \ltimes_\lambda C^*(G/N \rtimes_{\bar{\rho}} \mathcal{G}; \sigma) \overset{\text{Morita}}{\sim} C^*(N \rtimes^{\delta\rho} \mathcal{G}; \iota^*\psi) \overset{\text{iso}}{\simeq} C^*(\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}; \sigma^\vee).$$
All algebras to the right of the “$\rightsquigarrow$” have canonically $\widehat{G}$-equivariantly isomorphic spectra. For the passage from Step (1) to Step (2) this follows because it is a Morita equivalence of twisted groupoids by Theorem (11.1), and by Proposition (11.3) this Morita equivalence is $\widehat{G}$-equivariant. For the passage from Step (2) to Step (3) it follows because Pontryagin dualization induces an equivariant isomorphism of twisted groupoid algebras with $\widehat{G}$-action (Theorem (8.2)).

(B) The passages (1)→(2) and (2)→(3) induce isomorphisms in $K$-theory for any $G$, since $K$-theory is invariant under Morita equivalence and isomorphism of $C^*$-algebras. Thus whenever $G$ satisfies the Connes-Thom isomorphism (i.e. $G$ satisfies $K(A) \simeq K(G \ltimes A)$ for every $G$-$C^*$-algebra $A$) the duality of (15) incorporates an isomorphism of twisted $K$-theory:
$$K^\bullet(G/N \rtimes_{\bar{\rho}} \mathcal{G}, \sigma) \simeq K^{\bullet + \dim G}(\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}, \sigma^\vee).$$
The class of groups satisfying the Connes-Thom isomorphism includes the (finite dimensional) 1-connected solvable Lie groups.

(C) When $\mathcal{G}$ is a Čech groupoid for a space $X$, this duality can be viewed as a duality
$$(P \to X,\ [(\sigma, \lambda, 1)] \in H^2_G(P; U(1))) \longleftrightarrow (P^\vee \to X,\ [(\sigma^\vee, \rho, 1)] \in H^2_{\widehat{G}}(P^\vee; U(1))),$$
where $P$ is a principal $G/N$-bundle and $P^\vee$ is a principal $\widehat{N}$-bundle. Indeed, the pair $(P, [(\sigma, \lambda)])$ are the spectrum (with its $G/N$-action) and $G$-equivariant Dixmier-Douady invariant, respectively, of the $G$-algebra $C^*(G/N \rtimes_{\bar{\rho}} \mathcal{G}; \sigma)$, and it is enough to show that the spectrum and equivariant Dixmier-Douady invariant of the dual, $C^*(\widehat{N} \rtimes_{\bar{\lambda}} \mathcal{G}, \sigma^\vee)$, is independent of the groupoid presentation of $(P, [(\sigma, \lambda)])$. But since every procedure to the right of the “$\rightsquigarrow$” is an equivariant Morita equivalence of $C^*$-algebras, the result will follow as long as two different presentations give rise to objects which are equivariantly Morita equivalent at Step (1). But if $(G/N \rtimes_{\bar{\rho}} \mathcal{G}, (\sigma, \lambda, 1))$ and $(G/N \rtimes_{\bar{\rho}'} \mathcal{G}', (\sigma', \lambda', 1))$ are $G$-equivariantly Morita equivalent groupoids with cohomologous cocycles then the groupoids $U(1) \rtimes^\psi (G \ltimes G/N \rtimes_{\bar{\rho}} \mathcal{G})$ and $U(1) \rtimes^{\psi'} (G \ltimes G/N \rtimes_{\bar{\rho}'} \mathcal{G}')$ are Morita equivalent as $\widehat{G}$-groupoids, and from this the result follows immediately.

13. Nonabelian Takai Duality

In describing classical T-duality it was crucial that the group $G$ be abelian because the dual side was viewed as a $\widehat{G}$-space, and the procedure of T-dualizing was run in reverse by taking a crossed product by the $\widehat{G}$-action. Since our goal is to describe an analogue of T-duality that is valid for bundles of nonabelian groups, we need a method of returning from the dual side to the original side that does not involve a Pontryagin dual group. A solution is provided by what we call the nonabelian Takai duality for groupoids. In this section we first review classical Takai duality, then we describe the nonabelian version. The nonabelian Takai duality is constructed so that when applied to abelian groups it reduces to what is essentially the Pontryagin dual of classical Takai duality.
Recall that Takai duality for abelian groups is the passage
    (G-C*-algebras) → (Ĝ-C*-algebras), (α, A) ↦ (α̂, G ⋉_α A),
where α̂ is the canonical Ĝ-action (α̂_φ · a)(g) := φ(g)a(g) for φ ∈ Ĝ, g ∈ G and a ∈ G ⋉_α A. This is a duality in the sense that a second application produces a G-algebra (α̂̂, Ĝ ⋉_α̂ G ⋉_α A), which is G-equivariantly Morita equivalent to the original (α, A).
Now let (α, ℋ) be a G-groupoid. Takai duality applied (twice) to the associated groupoid algebra is the passage
    (α, C*(ℋ)) → (α̂, C*(G ⋉_α ℋ)) → (α̂̂, Ĝ ⋉_α̂ C*(G ⋉_α ℋ)).
Comparing multiplications, one sees that the last algebra is identical to the groupoid algebra of the twisted groupoid (Ĝ ⋉_triv (G ⋉_α ℋ), χ), where χ((φ1, g1, γ1), (φ2, g2, γ2)) := ⟨φ1, g2⟩ and (triv) denotes the trivial action of Ĝ. Note that this duality cannot be expressed purely in terms of groupoids since the dual group Ĝ only acts on the groupoid algebra. However, taking the Fourier transform in the Ĝ-direction determines a Pontryagin duality between (Ĝ ⋉_triv G ⋉_α ℋ, χ) and the untwisted groupoid (G × G × ℋ1 ⇒ G × ℋ0) whose source, range and multiplication are
1. s(g, h, γ) := (g, h⁻¹sγ), r(g, h, γ) := (gh⁻¹, rγ),
2. (gh2⁻¹, h1, γ1) ∘ (g, h2, h1⁻¹γ2) := (g, h1h2, γ1γ2),
for g's, h's ∈ G and γ's ∈ ℋ. This groupoid has the natural left translation action of G, for which it is equivariantly isomorphic to the generalized G-bundle G ⋊_q (G ⋉_α ℋ), where q is the quotient homomorphism q : G ⋉_α ℋ → (G ⇒ ∗). The isomorphism onto G ⋊_q (G ⋉_α ℋ) is given by
    (g, h, γ) ↦ (gh⁻¹, h, γ).
As was mentioned in the construction of generalized bundles, a generalized bundle G ⋊_ρ 𝒢 can be described as the induced groupoid of the right 𝒢-module P := G × 𝒢0 with the obvious structure. In the current situation the right (G ⋉_α ℋ)-module is
    P := (G × ℋ0 → ℋ0),
where the moment map b : G × ℋ0 → ℋ0 is just the projection and the right action is (g, rγ) · (h, hγ) := (g q((h, hγ)), sγ) = (gh, sγ). For convenience we will write down this groupoid structure, G ⋊_q (G ⋉_α ℋ) ≡ P ⋊_q (G ⋉_α ℋ), with source, range and multiplication
1. s(g, h, γ) := (gh, h⁻¹sγ), r(g, h, γ) := (g, rγ),
2. (g, h1, γ1) ∘ (gh1, h2, h1⁻¹γ2) := (g, h1h2, γ1γ2),
for g's, h's ∈ G and γ's ∈ ℋ.
The content of the previous paragraph is that, up to a Pontryagin duality, a Takai duality can be expressed purely in terms of groupoids. Given a G-groupoid (α, ℋ), one forms the crossed product G ⋉_α ℋ. There is a canonical Ĝ-action on the groupoid algebra, but there is also a canonical right (G ⋉_α ℋ)-module P, and the two canonical pieces of data are essentially the same. Now to express the duality, one passes to the induced groupoid P ⋊ (G ⋉_α ℋ). This induced groupoid is itself equipped with a natural G-action coming from the left translation of G on P, and is G-equivariantly Morita equivalent to the original G-groupoid (α, ℋ). Thus the C*-algebra of this induced groupoid takes the place of Ĝ ⋉_α̂ C*(G ⋉_α ℋ) (and is isomorphic to it via Fourier transform).
This duality (α, ℋ) ↔ (P, G ⋉_α ℋ) expresses essentially the same phenomenon as Takai duality, but while Takai duality applies only to abelian groups, this formulation applies to arbitrary groups. Let us write this down formally.
Theorem 13.1 (Nonabelian Takai duality for groupoids). Let G be a locally compact group and (α, ℋ) a G-groupoid. Then
1. G ⋉_α ℋ has a canonical right module P := ((G × ℋ0) → ℋ0).
2. The induced groupoid P ⋊_q (G ⋉_α ℋ) is naturally a generalized principal G-bundle.
3. The G-groupoids (α, ℋ) and (τ, P ⋊_q (G ⋉_α ℋ)) are equivariantly Morita equivalent, where τ denotes the left principal G-bundle action.
Proof. The first statement has already been explained, and the second is just the fact that P ⋊ (G ⋉_α ℋ) ≡ G ⋊_q (G ⋉_α ℋ) and the latter groupoid is manifestly a G-bundle. For the third statement, note that the inclusion
    φ : ℋ ≅ {1} × {1} × ℋ → (G ⋊_q (G ⋉_α ℋ))
is an essential equivalence. After identifying the equivalence bimodule Pφ with the space {(g, g⁻¹, γ) | g ∈ G, γ ∈ ℋ1}, one verifies easily that the following G-actions make this an equivariant Morita equivalence: for (h, k, γ) ∈ (G ⋊_q (G ⋉_α ℋ)), γ ∈ ℋ, and (h, h⁻¹, γ) ∈ Pφ, the element g ∈ G acts by
    g · (h, k, γ) := (gh, k, γ); g · (h, h⁻¹, γ) := (gh, (gh)⁻¹, γ); g · (γ) := gγ.
Note that φ itself is not an equivariant map.
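As a sanity check on the identification used before Theorem 13.1, the following short computation may be helpful. It is only a sketch: the name Φ for the map (g, h, γ) ↦ (gh⁻¹, h, γ) and the primes on the structure maps of the generalized bundle are introduced here for illustration, and the only facts assumed are that G acts on ℋ by groupoid automorphisms (so s(hγ) = h sγ and r(hγ) = h rγ) and that s(γ1γ2) = sγ2, r(γ1γ2) = rγ1. It confirms that Φ is the identity on objects, that the stated composition is compatible with source and range, and it makes the non-equivariance of φ explicit.

```latex
% s, r   : structure maps of the groupoid (G x G x H_1 => G x H_0)
% s', r' : structure maps of the generalized bundle G \rtimes_q (G \ltimes_\alpha \mathcal{H})
% \Phi(g,h,\gamma) := (g h^{-1}, h, \gamma)   (name introduced here)
\begin{align*}
s'(\Phi(g,h,\gamma)) &= s'(gh^{-1},h,\gamma) = (gh^{-1}h,\ h^{-1}s\gamma) = (g,\ h^{-1}s\gamma) = s(g,h,\gamma),\\
r'(\Phi(g,h,\gamma)) &= r'(gh^{-1},h,\gamma) = (gh^{-1},\ r\gamma) = r(g,h,\gamma).
\end{align*}
% Composability in the generalized bundle: the pair is composable iff s\gamma_1 = r\gamma_2.
\begin{align*}
s'(g,h_1,\gamma_1) &= (gh_1,\ h_1^{-1}s\gamma_1), &
r'(gh_1,h_2,h_1^{-1}\gamma_2) &= (gh_1,\ h_1^{-1}r\gamma_2).
\end{align*}
% The product (g, h_1 h_2, \gamma_1\gamma_2) has the source of the second arrow
% and the range of the first arrow:
\begin{align*}
s'(g,h_1h_2,\gamma_1\gamma_2) &= (gh_1h_2,\ h_2^{-1}h_1^{-1}s\gamma_2) = s'(gh_1,h_2,h_1^{-1}\gamma_2),\\
r'(g,h_1h_2,\gamma_1\gamma_2) &= (g,\ r\gamma_1) = r'(g,h_1,\gamma_1).
\end{align*}
% Non-equivariance of \phi(\gamma) = (1,1,\gamma) for g \neq 1:
\[
g\cdot\phi(\gamma) = (g,1,\gamma) \;\neq\; (1,1,g\gamma) = \phi(g\cdot\gamma).
\]
```

The last line is why the theorem asserts an equivariant Morita equivalence through the bimodule Pφ rather than an equivariant map.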
For our intended application to T-duality, it will be necessary to consider a nonabelian Takai duality for twisted groupoids, which we will prove here. In this context it is not possible to make a statement of equivariant Morita equivalence because in general the twisted groupoid does not admit a G-action. Instead of G-equivariance, there is a Morita equivalence that is compatible with “extending” to a twisted crossed product groupoid.
Theorem 13.2 (Nonabelian Takai duality for twisted groupoids). Let G be a group and (α, ℋ) a G-groupoid. Suppose we are given a 2-cocycle χ ∈ Z²(G ⋉_α ℋ; U(1)). Define σ := χ|_ℋ ∈ Z²(ℋ, U(1)). Then χ can be viewed as a 2-cocycle on (G ⋊_q (G ⋉_α ℋ)) which is constant in the first G-variable, or as a 2-cocycle on G ⋉_τ (G ⋊_q (G ⋉_α ℋ)) which is constant in the first two G-variables, and we have:
1. The Morita equivalence (G ⋊ (G ⋉_α ℋ)) ∼ ℋ of Theorem (13.1) extends to a Morita equivalence of U(1)-gerbes:
    U(1) ⋊_χ (G ⋊_q (G ⋉_α ℋ)) ∼ U(1) ⋊_σ ℋ.
2. The Morita equivalence G ⋉_τ (G ⋊_q (G ⋉_α ℋ)) ∼ G ⋉_α ℋ induced from G-equivariance extends to a Morita equivalence of U(1)-gerbes:
    U(1) ⋊_χ (G ⋉_τ (G ⋊_q (G ⋉_α ℋ))) ∼ U(1) ⋊_χ (G ⋉_α ℋ).
3. The first equivalence is a subequivalence of the second.
Proof. The first statement follows just as in the proof of the Morita equivalence of G ⋊_q (G ⋉_α ℋ) and ℋ; this time the bimodule is U(1) × Pφ (using the notation from the proof of Theorem (13.1)). For the second statement, note that G × Pφ is the (G ⋉_τ (G ⋊_q (G ⋉_α ℋ)))-(G ⋉_α ℋ)-bimodule, the bimodule structure being given as in Example (2). Then U(1) × G × Pφ is the desired Morita bimodule. For the third statement, note that restricting to U(1) × {1} × Pφ ⊂ U(1) × G × Pφ recovers the Morita equivalence of the first statement as a subequivalence of the second.
14. Nonabelian Noncommutative T-Duality
Remembering our convention to call a group nonabelian when it is not commutative and to reserve the word noncommutative for a noncommutative space in the sense of noncommutative geometry, we define the following extensions to T-duality.
Definition 14.1. Let N be a closed normal subgroup of a locally compact group G. There is a canonical equivalence between the data contained in:
• a G-equivariant U(1)-gerbe on a generalized G/N-bundle, and
• a U(1)-gerbe on an N-gerbe, with canonical right module.
The interpolation between the two objects is described below, and will be called nonabelian noncommutative T-duality whenever the interpolating procedure induces an isomorphism in K-theory.
Remark 14.2. Classical T-duality was a duality between gerbes on generalized principal bundles. More precisely, for abelian groups G, the duality associated to a U(1)-extension of a groupoid of the form G/N ⋊_ρ 𝒢 a new U(1)-extension of a groupoid of the form N̂ ⋊_λ 𝒢.
Now we will instead use groupoids of the form G ⋉ N ⋊_δρ 𝒢 or, more generally, G ⋉ G ×_{G/N,ρ} 𝒢, defined in Example (10). These are equivariantly Morita equivalent to the old kind. The reason for this change is that while every gerbe on a classically T-dualizable pair of bundles can be presented by an equivariant 2-cocycle on a groupoid of the form G/N ⋊_ρ 𝒢, this is not the case in general. On the other hand, the following fact will be proved in Sect. (15): Every G-equivariant stack theoretic gerbe on a generalized principal G/N-bundle admits a presentation as a U(1)-extension of a groupoid of the form G ⋉ G ×_{G/N,ρ} 𝒢.
This is not necessarily obvious. In fact without the G-equivariance condition, not every gerbe on a generalized G/N-bundle admits such a presentation; it is the G-equivariance that forces this.
Procedure for nonabelian T-dualization. The initial data for nonabelian T-duality will be a G/N-bundle with equivariant 2-cocycle:
    (G ⋉ N ⋊_δρ 𝒢, (σ, λ, β) ∈ Z²_G(G ⋉ N ⋊_δρ 𝒢; U(1))). (16)
The nonabelian T-dual is obtained in the following two steps:
Step 1. Pass to the crossed product groupoid, together with its canonical right module P of Theorem (13.1) and the 2-cocycle ψ which is the image of (σ, λ, β) in Z²(G ⋉ G ⋉ N ⋊_δρ 𝒢; U(1)):
    (G ⋉ (G ⋉ N ⋊_δρ 𝒢), ψ, P).
Step 2. Pass to the Morita equivalent system:
    (N ⋊_δρ 𝒢; ι*ψ, ι*P),
where ι is the essential equivalence (see Proposition (10.5)):
    ι : (N ⋊_δρ 𝒢) → G ⋉ (G ⋉ N ⋊_δρ 𝒢), (n, γ) ↦ (nρ(γ), e, n, γ).
Some properties of the duality.
(A) If ρ : 𝒢 → G/N does not admit a lift to a map to G, then the same T-dualization procedure works with N ⋊_δρ 𝒢 replaced by G ×_{G/N,ρ} 𝒢 (see Example (9)).
(B) What we are describing is indeed a duality, in the sense that we can recover the initial system (16) by inducing the groupoid via its canonical module P. This fact is the content of twisted nonabelian Takai duality for groupoids, Theorem (13.2).
(C) There is an isomorphism of twisted K-theory
    K_•(G ⋉ N ⋊_δρ 𝒢; σ) ≅ K_{•+dim G}(G ⋉ (G ⋉ N ⋊_δρ 𝒢), ψ) ≅ K_{•+dim G}(N ⋊_δρ 𝒢; ι*ψ)
whenever G satisfies the Connes-Thom isomorphism theorem, in particular whenever G is a (finite dimensional) 1-connected solvable Lie group.
(D) This construction is Morita invariant in the appropriate sense. That is, if we choose two representatives for the same generalized principal bundle with equivariant gerbe, the two resulting dualized objects present the same N-gerbe with equivariant gerbe. This is proved in Sect. (15).
(E) The dual can be interpreted as a family of noncommutative groups. Indeed, suppose that in the above situation 𝒢 is a Čech groupoid of a locally finite cover of a space X. Let us look at the fiber of the dual over a point m ∈ X. This fiber corresponds to N ⋊_δρ (𝒢|_m), where 𝒢|_m is the restriction of 𝒢 to the chosen point. 𝒢|_m is a pair groupoid which is a finite set of points (there is one arrow for each double intersection U_i ∩ U_j that contains m and one object for each element of the cover that contains m).
Any inclusion of the trivial groupoid (∗ ⇒ ∗) → 𝒢|_m induces an essential equivalence φ : (N ⇒ ∗) → N ⋊_δρ (𝒢|_m) which induces an isomorphism φ* in cohomology by Corollary (10.4). So the twisted groupoids ((N ⋊_δρ 𝒢)|_m, (ι*ψ)|_m) and (N ⇒ ∗, φ*(ι*ψ|_m)) are equivalent. In particular, C*(N ⋊_δρ 𝒢|_m; ι*ψ) is Morita equivalent to C*(N; φ*(ι*ψ)), and the latter is a standard presentation of a noncommutative (and nonabelian, if desired) dual group! For example if N ≅ Zⁿ we get noncommutative n-dimensional tori. So the T-duality applied to G/N-bundles produces N-gerbes that are fibred in what should be interpreted as noncommutative versions of the dual group (G/N)∨!
15. The Equivariant Brauer Group
In this section we will describe the elements of the G-equivariant Brauer group of a principal “G/N-stack”. First recall that the Brauer group Br(X) of a space X is the set of isomorphism classes of stable separable continuous trace C*-algebras with spectrum X. The famous Dixmier-Douady classification says that each such algebra is isomorphic to the algebra Γ0(X; E) of sections that vanish at infinity of a bundle E of compact operators. Since such bundles can be described by transition functions with values in Aut K ≅ PU(h), there is an isomorphism H¹(X; Aut K) ≅ Br(X). Since H¹(X; Aut K) ≅ H²(X; U(1)), Br(X) can also be taken to classify U(1)-gerbes on X.
If X is a G-space, one can talk about the equivariant Brauer group Br_G(X), which can be most simply defined as the equivalence classes of G-equivariant bundles of compact operators under the equivalence of isomorphism and outer equivalence of actions. We intend to show that this group also corresponds to G-equivariant gerbes on X.
Generalizing the case of spaces X, one can consider the Brauer group of a presentable topological stack 𝒳 (that is, a stack 𝒳 which is equivalent to Prin_𝒢 for some groupoid 𝒢, as in Sect. (7)). So let 𝒳 be a presentable topological stack. Then a vector bundle on 𝒳 is a vector bundle on a groupoid 𝒢 presenting 𝒳. A vector bundle on a groupoid 𝒢 is a (left) 𝒢-module E → 𝒢0 which is a vector bundle on 𝒢0 and such that 𝒢 acts linearly (in the sense that the action morphism s*E → r*E is a morphism of vector bundles over the space 𝒢1). An ℋ-𝒢-Morita equivalence bimodule P makes P ∗ E → ℋ0 a vector bundle on ℋ.
In exactly the same way, one has the notion of a bundle of algebras on a stack; in particular one has bundles of compact operators on a stack. Finally, the stack with G/N-action associated to a generalized principal bundle G/N ⋊_ρ 𝒢 is what should be called a principal G/N-stack. This data can be presented in at least two ways, for example as a stack over Prin_{G/N},
    Prin_𝒢 → Prin_{G/N},
or as a stack over Prin_𝒢,
    Prin_{G/N ⋊_ρ 𝒢} → Prin_𝒢.
Now define the G-equivariant Brauer group Br_G(Prin_{G/N ⋊_ρ 𝒢}) to be the isomorphism classes of G-equivariant bundles of compact operators on Prin_{G/N ⋊_ρ 𝒢}. We will show that whenever 𝒢0 is contractible to a set of points, Br_G(Prin_{G/N ⋊_ρ 𝒢}) ≅ H¹(G ×_{G/N} 𝒢; Aut K). Of course if 𝒢0 is not contractible we can refine 𝒢 so that it is, thus we will always make that assumption.
So suppose E is a bundle on Prin_{G/N ⋊_ρ 𝒢}. It may initially be presented as a module over any groupoid ℋ which is equivariantly Morita equivalent to (G/N ⋊_ρ 𝒢); for example if 𝒢 is a Čech groupoid one could imagine that E is a bundle on the actual space Q ≅ (G/N × 𝒢0)/(G/N ⋊_ρ 𝒢). We would like E to be presented on the groupoid G ⋉ (G ×_{G/N,ρ} 𝒢) (as in Example (10)), so choose a Morita equivalence ((G ⋉ G ×_{G/N,ρ} 𝒢)-ℋ)-bimodule P, and replace E by Ẽ := P ∗ E. Remember that no data is lost here because P^op ∗ P ∗ E ≅ E. Just for the sake of not having too many G's, assume that ρ admits a lift to G so that there is an isomorphism (G ⋉ G ×_{G/N,ρ} 𝒢) ≅ (G ⋉ N ⋊_δρ 𝒢) as in Example (10). The general case when there is no lift works in exactly the same way.
If E is G-equivariant then so is Ẽ. Let us suppose this is the case, meaning precisely that Ẽ is a (G ⋉ N ⋊_δρ 𝒢)-module, and Ẽ has a G-action which is equivariant with respect to the translation action of G on (G ⋉ N ⋊_δρ 𝒢). Keep in mind that in particular Ẽ is a bundle over the objects G × 𝒢0 of the groupoid. Then the restriction of Ẽ to {e} × 𝒢0 ⊂ G × 𝒢0, denoted E0, is trivializable since 𝒢0 is contractible. So assume that E0 = 𝒢0 × K. But then the whole of Ẽ is trivializable since it is a G-equivariant bundle over a space with free G-action. For example a trivialization is given by:
    G × E0 → Ẽ, (g, ξ) ↦ gξ.
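The claim that (g, ξ) ↦ gξ trivializes the bundle can be checked directly. The sketch below assumes, as in the text, that G acts on the base G × 𝒢0 by left translation on the first factor and that the action on the bundle covers this translation; the name Θ for the map and the letter η for a point of the bundle are introduced here only for illustration.

```latex
% \Theta is the trivialization from the text; G acts on G x E_0 by h.(g,\xi) = (hg,\xi).
\[
\Theta : G \times E_0 \longrightarrow \tilde E, \qquad \Theta(g,\xi) := g\,\xi .
\]
% Equivariance:
\[
\Theta\big(h\cdot(g,\xi)\big) = \Theta(hg,\xi) = (hg)\,\xi = h\,(g\,\xi) = h\cdot\Theta(g,\xi).
\]
% Inverse: if \eta lies in the fiber over (g,x), then g^{-1}\eta lies over (e,x),
% hence in E_0 = \tilde E|_{\{e\}\times\mathcal{G}_0}, and
\[
\Theta^{-1}(\eta) = (g,\ g^{-1}\eta), \qquad
\Theta(g, g^{-1}\eta) = \eta, \qquad
g^{-1}\,(g\,\xi) = \xi .
\]
```

Freeness of the translation action is what makes the first coordinate g of Θ⁻¹(η) well defined.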
So assume that Ẽ = G × 𝒢0 × K. Note that being a G-equivariant (G ⋉ N ⋊_δρ 𝒢)-module is the same as being a G ⋉ (G ⋉ (N ⋊_δρ 𝒢))-module. Since Ẽ is trivial as a bundle, it is classified by the action of G ⋉ (G ⋉ N ⋊_δρ 𝒢), and this is given by a homomorphism
    π : G ⋉ (G ⋉ N ⋊_δρ 𝒢) → Aut K
such that the groupoid action is
    (G ⋉ (G ⋉ N ⋊_δρ 𝒢)) ×_{G×𝒢0} Ẽ → Ẽ,
    ((g1, g2, n, γ), (g1⁻¹g2nρ(γ), sγ, k)) ↦ (g2, rγ, π(g1, g2, n, γ)k)
for (g1⁻¹g2nρ(γ), sγ, k) ∈ Ẽ. Two such actions, given by π and π′, are outer equivalent if and only if π and π′ are cohomologous, and this shows that Br_G(Prin_{G/N ⋊_ρ 𝒢}) ≅ H¹(G ⋉ G ⋉ N ⋊_δρ 𝒢; Aut K). But the results of Sect. (10) imply that the inclusion
    ι : N ⋊_δρ 𝒢 → G ⋉ G ⋉ N ⋊_δρ 𝒢, (n, γ) ↦ (nρ(γ), e, n, γ)
induces a quasi-isomorphism with quasi-inverse the quotient map q : G ⋉ G ⋉ N ⋊_δρ 𝒢 → N ⋊_δρ 𝒢, therefore we have:
Proposition 15.1. Let N be a closed normal subgroup of a locally compact group G, let 𝒢 be a groupoid with 𝒢0 contractible, and let ρ : 𝒢 → G/N be a homomorphism. Then
    Br_G(Prin_{G/N ⋊_ρ 𝒢}) ≅ H¹(G ⋉ G ⋉ N ⋊_δρ 𝒢; Aut K) ≅ H¹(N ⋊_δρ 𝒢; Aut K).
More generally, N ⋊_δρ 𝒢 can be replaced by G ×_{G/N,ρ} 𝒢.
The meaning of the isomorphism on the right is that π can be taken constant in its two G-variables. In fact, one can construct such a π directly; simply note that the whole situation is determined by the module structure of E0, and that it is precisely ι(N ⋊_δρ 𝒢) that preserves this subspace. An example of this construction is carried out in the proof of Theorem (A.1).
Now let us explain the relationship between gerbes and bundles of compact operators. If N is a discrete group then there is a connecting homomorphism which is an isomorphism,
    H¹(N ⋊_δρ 𝒢; Aut K) ≅ H²(N ⋊_δρ 𝒢; U(1)).
If N is not discrete then there is the possibility that only a Borel connecting homomorphism can be chosen. Rather than tread into the territory of Borel cohomology for groupoids, we point out that for any groupoid ℋ, the U(1)-gerbe associated to π ∈ Z¹(ℋ; Aut K) is just U(h) ×_{Aut K,π} ℋ, the groupoid constructed in Example (9). It is a U(1)-gerbe on ℋ. To make that last fact clearer, note that as a space, the arrows of this groupoid form a possibly nontrivial U(1)-bundle over ℋ1, and any global section of the bundle determines an isomorphism U(h) ×_{Aut K,π} ℋ ≅ U(1) ⋊_δπ ℋ, as was pointed out in Example (9), and the latter object is clearly a U(1)-gerbe.
At the C*-algebra level there is also a relationship between gerbes and bundles of compact operators. It is shown in Lemma (A.5) that a global section of the U(1)-bundle (U(h) ×_{Aut K,π} ℋ1) → ℋ1 induces a Morita equivalence
    C*(ℋ, δπ) ∼ Γ(ℋ; E(π)) (Morita),
where E(π) is the trivial bundle ℋ0 × K with ℋ-module structure given by π. Independent of the existence of a global section, there is a Morita equivalence
    Γ(ℋ; Fund(U(h) ×_{Aut K,π} ℋ)) ∼ Γ(ℋ; E(π)) (Morita),
where Fund(U(h) ×_{Aut K,π} ℋ) denotes the associated line bundle to the U(1)-bundle (viewed as a bundle of (rank 1) C*-algebras) (U(h) ×_{Aut K,π} ℋ1) → ℋ1.
So in this section we saw that nonabelian noncommutative T-duality as presented in Sect. (14) can be used to describe a dual to a G-equivariant gerbe presented on any groupoid ℋ such that ℋ describes a principal “G/N-stack”; this is because any such gerbe could also be presented on (G ⋉ N ⋊_δρ 𝒢), as it was in Sect. (14). (The exception is the situation in which ρ and possibly π do not admit lifts to G or U(h) respectively, but we also saw how to modify the setup in these situations.) Finally, as promised:
Corollary 15.2. Nonabelian T-duality is Morita invariant.
Proof. This is easy because we may assume that any gerbe on a generalized principal bundle is presented on a groupoid of the form G ⋉ N ⋊_δρ 𝒢 and that the gerbe is defined via a cocycle which is constant in G. But then if two such presentations are given, corresponding to π ∈ Z¹(N ⋊_δρ 𝒢; Aut K) and π′ ∈ Z¹(N ⋊_δρ′ 𝒢′; Aut K), it is clear that the gerbes U(h) ×_{Aut K,π} G ⋉ N ⋊_δρ 𝒢 and U(h) ×_{Aut K,π′} G ⋉ N ⋊_δρ′ 𝒢′ are G-equivariantly Morita equivalent if and only if U(h) ×_{Aut K,π} N ⋊_δρ 𝒢 and U(h) ×_{Aut K,π′} N ⋊_δρ′ 𝒢′ are Morita equivalent.
16. Conclusion
A natural direction for future study here is to consider the case when both sides of the duality are fibred in noncommutative groups (in the sense of noncommutative geometry). Interestingly, a completely new phenomenon arises in this context: the gerbe data, that is, the 2-cocycles, become noncompactly supported distributions and it is necessary to multiply them. We have manufactured examples in which this can be done, namely when the singular supports of the distributions do not intersect, but our present methods do not provide a general method for describing a T-dual pair with both sides families of noncommutative groups.
Another direction to explore is the case of groupoids for which the G/N-action is only free on a dense set. This corresponds to singularities in the fibers of a bundle of groups. The groupoid approach to T-duality seems well suited for this. On the other hand, for some other types of singularities in fibers, notably singularities which destroy the possibility of a global G/N-action, the groupoid approach will not apply at all. It will be interesting to see if this problem can be fixed.
A third direction these methods can take is to consider complex structures on the groupoids and make the connection between topological T-duality and the T-duality of complex geometry (as in [DP]). We have initiated this project in [BD].
Lastly, it will of course be very nice to find some physically motivated examples of nonabelian T-duality.
A. Connection with the Mathai-Rosenberg Approach
The goal of this section is to describe the connection between our approach and the Mathai-Rosenberg approach to T-duality. Let us begin with a summary of the approach of Mathai and Rosenberg [MR]. One begins with the data of a principal torus bundle P → X over a space X and a cohomology class H ∈ H³(P; Z) called the H-flux. The procedure for T-dualizing is as follows:
1. Pass from the data (P, H) to a C*-algebra A(P, H). To do this one traces H through the isomorphisms
    H³(P; Z) ≅ Ȟ²(P; U(1)) ≅ Ȟ¹(P; Aut(K)) (17)
(here K = K(h) is the algebra of compact operators on a fixed separable Hilbert space h) to get an Aut(K)-valued Čech 1-cocycle. This cocycle gives transition functions for a bundle of compact operators over P, and A(P, H) is the C*-algebra of continuous sections of this bundle which vanish at infinity. According to the Dixmier-Douady classification, one can recover the torus bundle P and the H-flux from A(P, H).
2. Next, writing the torus as a vector space modulo a full rank lattice, T = V/Λ, one tries to lift the action of V on P to an action of V on A(P, H). Assuming one exists, choose an action α : V → Aut(A(P, H)) lifting the principal bundle action. If no action exists, the data is not T-dualizable.
3. Now the T-dual of A = A(P, H) is simply the crossed product algebra V ⋉_α A (or perhaps the T-dual is the spectrum and Dixmier-Douady invariant (P∨, H∨) of this algebra, if V ⋉_α A is of continuous trace), and the problem of producing a T-dual object is reduced to understanding this crossed product algebra.
4. There are two scenarios for describing the crossed product, depending on whether a certain obstruction class, called the Mackey obstruction M(α), vanishes. When M(α) = 0 the crossed product algebra is isomorphic to one of the form A(P∨, H∨) for P∨ → X a principal dual-torus bundle and H∨ ∈ H³(P∨, Z). The transition functions for P∨ are obtained from the so-called Phillips-Raeburn obstruction of the action and there exist explicit formulas for H∨ in terms of the data (P, H, α). This understanding of the crossed product is a result of work by Mackey, Packer, Phillips, Raeburn, Rosenberg, Wassermann, Williams and others (in the subject of crossed products of continuous trace algebras) which is referenced in [MR]. When M(α) ≠ 0, the crossed product algebra was shown in [MR] to be a continuous field of stable noncommutative (dual) tori over X.
We claim that the Mathai-Rosenberg setup corresponds to our approach applied to Čech groupoids and the groups (G, N) = (V, Λ). More precisely, we have the following theorem.
Theorem A.1. Let Q → X be a principal torus bundle trivialized over a good cover of X, let 𝒢 denote the Čech groupoid for this cover and let ρ : 𝒢 → V/Λ be transition functions presenting Q. Then
1. For any H ∈ H³(Q; Z) such that A := A(Q; H) admits a V-action, there is a Morita equivalence
    A ∼ C*(V ⋉ Λ ⋊_δρ 𝒢; σ) (Morita),
for some σ ∈ Z²(V ⋉ Λ ⋊_δρ 𝒢; U(1)) that is constant in V. If V acts by translation on C*(V ⋉ Λ ⋊_δρ 𝒢; σ) then the equivalence is V-equivariant.
2. [σ] is the image of H under the composite map
    H³(Q; Z) → H²(Q; U(1)) → H²(V ⋉ Λ ⋊_δρ 𝒢; U(1)),
in which the first arrow is an isomorphism.
3. Let σ∨ := σ|_{Λ ⋊_δρ 𝒢} ∈ Z²(Λ ⋊_δρ 𝒢; U(1)). Then for the chosen action of V, there is a V̂-equivariant Morita equivalence:
    V ⋉ A ∼ V ⋉ C*(V ⋉ Λ ⋊_δρ 𝒢; σ) ∼ C*(Λ ⋊_δρ 𝒢; σ∨) (Morita),
where V̂ acts by the canonical dual action on the left two algebras and on the rightmost algebra by φ · a(λ, γ) := ⟨φ, λρ(γ)⟩a(λ, γ) for φ ∈ V̂ and (λ, γ) ∈ Λ ⋊_δρ 𝒢.
The theorem will follow from the next two lemmas concerning bundles of C*-algebras.
Definition A.2. Let 𝒢 be a groupoid. A (left) 𝒢-C*-algebra A is a bundle of C*-algebras A → 𝒢0 which is a (left) 𝒢-module, such that 𝒢 acts by C*-algebra isomorphisms. The groupoid algebra of sections of A, written Γ*(𝒢; r*A), is the C*-completion of Γc(𝒢; r*A) (= the compactly supported sections of the pullback bundle r*A) with multiplication and involution given by
    ab(g) := ∫_{g1g2=g} a(g1) g1 · (b(g2)), a*(g) := g · (a(g⁻¹)*)
for g's ∈ 𝒢 and a, b ∈ Γc(𝒢; r*A), and where the last “∗” on the right is the C*-algebra involution in the fiber.
Remark A.3. This definition is a synonym for the groupoid crossed product algebra 𝒢 ⋉ Γ0(𝒢0; A), as defined in [Ren2].
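Two special cases may help in reading Definition A.2 and Remark A.3. They are offered only as a sketch, and the simplifying assumptions are ours: in (i) the bundle is the trivial line bundle with trivial action, and in (ii) the groupoid is a discrete group Γ (one object, counting measure as Haar system) acting on a single C*-algebra A by automorphisms α; the symbols Γ and α are introduced here for illustration.

```latex
% (i) A = \mathcal{G}_0 \times \mathbb{C} with trivial action: sections of r^*A are
%     functions on the groupoid and the formulas reduce to ordinary convolution.
\[
(ab)(g) = \int_{g_1 g_2 = g} a(g_1)\, b(g_2), \qquad a^*(g) = \overline{a(g^{-1})}.
\]
% (ii) One object, discrete group \Gamma acting on a C*-algebra A by \alpha:
%      substituting g_1 = h, g_2 = h^{-1} g gives the usual crossed product formulas.
\[
(ab)(g) = \sum_{h \in \Gamma} a(h)\, \alpha_h\big(b(h^{-1}g)\big), \qquad
a^*(g) = \alpha_g\big(a(g^{-1})^*\big).
\]
```

Case (i) recovers the convolution algebra of the groupoid itself, and case (ii) is the familiar crossed product of A by Γ, in line with Remark A.3.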
Lemma A.4. Let 𝒢 and ℋ be groupoids and (𝒢0 ← P → ℋ0) a (𝒢-ℋ)-Morita equivalence bimodule, with right moment map ρ : P → ℋ0. Then for any ℋ-C*-algebra π : A → ℋ0, there is a Morita equivalence
    Γ*(𝒢; r*(P ∗ A)) ∼ Γ*(ℋ; r*A) (Morita),
where, as in Sect. (3), P ∗ A := (P ×_{ρ,ℋ0,π} A)/ℋ = ((P ×_{ρ,ℋ0,π} A)/(p, a) ∼ (ph⁻¹, h · a)).
Proof. We will construct a Morita equivalence bimodule for essential equivalences. If this case is true then the lemma is true because any Morita equivalence factors into essential equivalences, and if 𝒢 ← 𝒦 → ℋ, with maps φ1 : 𝒦 → 𝒢 and φ2 : 𝒦 → ℋ, is such a factorization then, setting P = Pφ1^op ∗ Pφ2, we will have
    Γ*(ℋ; r*A) ∼ Γ*(𝒦; r*(Pφ2 ∗ A)) (Morita)
        ≅ Γ*(𝒦; r*(Pφ1 ∗ (P ∗ A))) (iso)
        ∼ Γ*(𝒢; r*(P ∗ A)) (Morita).
So in the notation of the statement of the lemma, assume P = Pφ := 𝒢0 ×_{φ,ℋ0,r} ℋ1 for φ : 𝒢 → ℋ an essential equivalence. Then the left moment map is the projection Pφ → 𝒢0 and ρ is the projection π_ℋ : Pφ → ℋ1 followed by the source map s : ℋ1 → ℋ0. We use the isomorphism Pφ ∗ A ≅ φ*A := 𝒢0 ×_{φ,ℋ0,π} A.
Now we will construct a Morita equivalence bimodule. Set Ac := Γc(𝒢; r*(φ*A)) and Bc := Γc(ℋ; r*A). The following structure defines an Ac-Bc-pre-Morita equivalence bimodule structure on Xc, the space of compactly supported sections over P of the pullback of φ*A along the left moment map, which after completion produces the desired Morita equivalence. Fix the notation: g's ∈ 𝒢, h's ∈ ℋ, p's ∈ P, a's ∈ Ac, x's ∈ Xc, and b's ∈ Bc, and as usual integration is with respect to the fixed Haar system.
The left pre-Hilbert module structure is given by the following data:
• Action: ax(p) := ∫_g a(g) φ(g) · (x(g⁻¹p)).
• Inner product: _A⟨x1, x2⟩(g) := ∫_p x1(gp) φ(g) · x2(p)*.
The right pre-Hilbert module structure is given by the following data:
• Action: xb(p) := ∫_h h · (x(ph)) π_ℋ(ph) · b(h⁻¹).
• Inner product: ⟨x1, x2⟩_B(h) := ∫_p π_ℋ(p)⁻¹ · (x1(p)* x2(ph)).
Verification that this determines a Morita equivalence bimodule is routine.
Lemma A.5. Let 𝒢 be a groupoid, σ : 𝒢2 → U(1) a 2-cocycle, and T : 𝒢1 → U(h) a continuous map satisfying
    σ(γ1, γ2) = T(γ1)T(γ2)T(γ1γ2)⁻¹ =: δT(γ1, γ2).
Let A(ad T) denote the 𝒢-C*-algebra (𝒢0 × K) → 𝒢0 with 𝒢-action given by:
    s*A(ad T) → r*A(ad T), g · (sg, k) := (rg, ad T(g)k).
Then
1. There is a Morita equivalence
    Γ*(𝒢; r*A(ad T)) ∼ C*(𝒢; σ) (Morita).
2. When 𝒢 = 𝒢̌ is a Čech groupoid on a good cover of a space X, every 𝒢̌-C*-algebra of compact operators is isomorphic to A(ad T) for some T and
    Γ0(X; E(ad T)) ∼ C*(𝒢̌; σ) (Morita),
where E(ad T) := 𝒢0 × K/(sγ, k) ∼ (rγ, ad T(γ)k) is the bundle of compact operators whose transition functions are ad T : 𝒢 → Aut K.
Proof. The Morita equivalence in the second statement is proved as follows. First note that E(ad T) = (Pφ)^op ∗ A(ad T), where Pφ is the bimodule of the essential equivalence φ : 𝒢 → 𝒢0/𝒢 ≡ X, so an application of Lemma (A.4) implies that
    Γ0(X; E(ad T)) ∼ Γ*(𝒢̌; r*A(ad T)) (Morita).
Now apply the Morita equivalence of the first statement to finish. The other part of statement (2), that all bundles of compact operators on X are of the form E(ad T) for some T, follows because the connecting homomorphism in nonabelian Čech cohomology, H¹(X; Aut K) → H²(X; U(1)), is an isomorphism.
It remains to exhibit the Morita equivalence of statement (1). Set Ac := Γc(𝒢; r*A(ad T)) and Bc := C*(𝒢; σ). We claim that the space Xc := Cc(𝒢1; h) of compactly supported h-valued maps admits a pre-Morita equivalence Γ*(𝒢; r*A(ad T))-C*(𝒢; σ)-bimodule structure. Set the notational conventions: g's ∈ 𝒢, a's ∈ Ac, x's ∈ Xc, and b's ∈ Bc. The unadorned bracket ⟨ , ⟩ denotes the C-valued inner product on h and is taken to be conjugate-linear in the first variable and linear in the second. The bra-ket notation will be used for the K-valued inner product on h, so for v's ∈ h, |v1⟩⟨v2| is the compact operator defined by |v1⟩⟨v2|(v) := ⟨v2, v⟩v1.
The left pre-Hilbert module structure on Xc is given by the following data:
• Action: ax(g) := ∫_{g1g2=g} σ(g1, g2)a(g1)T(g1)x(g2).
• Inner product: _A⟨x1, x2⟩(g) := ∫_{g1g2=g} σ(g1, g2)|x1(g1)⟩⟨T(g)x2(g2⁻¹)|.
The right pre-Hilbert module structure on Xc is given by the following data:
• Action: xb(g) := ∫_{g1g2=g} σ(g1, g2)x(g1)b(g2).
• Inner product: ⟨x1, x2⟩_B(g) := ∫_{g1g2=g} ⟨x1(g1⁻¹), x2(g2)⟩σ(g1, g2).
Verification that this structure gives a Morita equivalence is routine.
Now let us proceed to the proof of the theorem.
Proof (of Theorem A.1). By the Dixmier-Douady classification we know that A is isomorphic to the algebra of continuous sections of a bundle E → Q of compact operators, so assume A = Γ0(Q; E). A V-action on A comes from an action by automorphisms of the bundle of algebras. Now, noting that Q is isomorphic to the quotient (V/Λ ⋊_ρ 𝒢)0/(V/Λ ⋊_ρ 𝒢)1, pull back E to a bundle Ẽ over V/Λ × 𝒢0. Then Ẽ is a module for the groupoid V ⋉ V/Λ ⋊_ρ 𝒢. Denote by E0 the restriction of Ẽ to {eΛ} × 𝒢0 ⊂ V/Λ × 𝒢0. Then E0 is trivializable since 𝒢0 is contractible, so we assume E0 = 𝒢0 × K, and write (sγ, k) for a point in E0 ⊂ Ẽ, and [sγ, k] for its image under the quotient q : Ẽ → E. The V ⋉ V/Λ ⋊_ρ 𝒢-module structure on Ẽ “restricts” to a Λ ⋊_δρ 𝒢-module structure on E0, via the inclusion
    ι : Λ ⋊_δρ 𝒢 → V ⋉ V/Λ ⋊_ρ 𝒢, (λ, γ) ↦ (λρ(γ), eΛ, γ).
The action on E0 can be written as (λ, γ) · (sγ, k) := (rγ, π(λ, γ)k), where π is a homomorphism Λ ⋊_δρ 𝒢 → Aut K. (Note that there exists a lift of ρ to V since 𝒢0 is contractible and V → V/Λ is a covering space, so δρ makes sense.) Then q followed by the V-action determines a map
    V × E0 → E, (v, (sγ, k)) ↦ (v, [sγ, k]) ↦ v · [sγ, k].
This map factors through the quotient
    V × E0 → (V × E0)/(Λ ⋊_δρ 𝒢) := (V × E0)/(v, (sγ, k)) ∼ (v − λ − ρ(γ), (rγ, π(λ, γ)k)),
and induces an isomorphism of bundles
    (V × E0)/(Λ ⋊_δρ 𝒢) → E,
which is V-equivariant when the bundle on the left is equipped with the natural translation action. So A(Q; E) is equivariantly isomorphic to Γ(Q; (V × E0)/(V ⋉ Λ ⋊_δρ 𝒢)).
Let σ := δπ : (Λ ⋊_δρ 𝒢)2 → U(1) be an image of π under the composition of the connecting homomorphism (which is an isomorphism due to the contractibility of U(h)) and the pullback via the quotient map V ⋉ Λ ⋊_δρ 𝒢 → Λ ⋊_δρ 𝒢,
    H¹(Λ ⋊_δρ 𝒢; Aut K) → H²(Λ ⋊_δρ 𝒢; U(1)) → H²(V ⋉ Λ ⋊_δρ 𝒢; U(1)).
In other words, we have chosen a continuous map T : (Λ ⋊_δρ 𝒢) → U(h) such that ad T = π and δT = σ, and E ≅ E(ad T).
Now we know that A(Q; H) ≅ Γ0(Q; E) = Γ0(Q; E(ad T)), and according to Lemma (A.5) there is a Morita equivalence:
    C*(V ⋉ Λ ⋊_δρ 𝒢; σ) ∼ Γ0(Q; E(ad T)) (Morita),
which is easily seen to be equivariant since T does not depend on V. So statement (1) is proved, and statement (2) is obvious from the construction since Γ0(Q, E(ad T)) ≅ A(Q, H) when H is the image of [σ] = [δT].
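The choice of T with ad T = π and δT = σ is consistent for an elementary reason, recorded here as a sketch. The only assumption is the relation of Lemma (A.5), rewritten as T(γ1)T(γ2) = σ(γ1, γ2)T(γ1γ2) with σ taking values in the scalars U(1); under it, ad T is multiplicative and σ automatically satisfies the 2-cocycle identity.

```latex
% ad T is multiplicative because the scalar \sigma cancels under conjugation:
\begin{align*}
\operatorname{ad}T(\gamma_1)\operatorname{ad}T(\gamma_2)(k)
  &= T(\gamma_1)T(\gamma_2)\,k\,T(\gamma_2)^{-1}T(\gamma_1)^{-1}\\
  &= \sigma(\gamma_1,\gamma_2)\,T(\gamma_1\gamma_2)\,k\,T(\gamma_1\gamma_2)^{-1}\,\sigma(\gamma_1,\gamma_2)^{-1}
   = \operatorname{ad}T(\gamma_1\gamma_2)(k).
\end{align*}
% Associativity of the triple product T(\gamma_1)T(\gamma_2)T(\gamma_3), bracketed two ways:
\begin{align*}
\big(T(\gamma_1)T(\gamma_2)\big)T(\gamma_3)
  &= \sigma(\gamma_1,\gamma_2)\,\sigma(\gamma_1\gamma_2,\gamma_3)\,T(\gamma_1\gamma_2\gamma_3),\\
T(\gamma_1)\big(T(\gamma_2)T(\gamma_3)\big)
  &= \sigma(\gamma_2,\gamma_3)\,\sigma(\gamma_1,\gamma_2\gamma_3)\,T(\gamma_1\gamma_2\gamma_3),
\end{align*}
% hence the 2-cocycle identity
\[
\sigma(\gamma_1,\gamma_2)\,\sigma(\gamma_1\gamma_2,\gamma_3)
  = \sigma(\gamma_2,\gamma_3)\,\sigma(\gamma_1,\gamma_2\gamma_3).
\]
```

Both computations use only that σ is central in U(h), which is why a lift T of π to U(h) determines a U(1)-valued cocycle.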
Statement (3) now follows as well. Indeed, since an equivariant Morita equivalence induces an equivalence of the associated crossed product algebras, we have
    V ⋉ C*(V ⋉ Λ ⋊_δρ 𝒢; σ) ∼ V ⋉ A(Q; H) (Morita),
and the algebra on the left, being identical to C*(V ⋉ V ⋉ Λ ⋊_δρ 𝒢; σ), is equivariantly Morita equivalent to C*(Λ ⋊_δρ 𝒢; σ) by Proposition (10.5). This completes the proof.
So that is the correspondence: Mathai-Rosenberg do A(Q; H) ↔ V ⋉ A(Q; H), whereas we do (V ⋉ Λ ⋊_δρ 𝒢; σ) ↔ (Λ ⋊_δρ 𝒢; σ∨).
In Sects. (12) and (14) the presentation is slightly different. The difference is that in this appendix we have assumed that σ is already given as a 2-cocycle on ℋ := (V ⋉ Λ ⋊_δρ 𝒢) which is constant in the V-direction, whereas in Sect. (14) we begin with an arbitrary σ on ℋ that has been extended to an equivariant 2-cocycle (σ, λ, β). The two setups are essentially the same because according to Sect. (10), the existence of the lift (σ, λ, β) ensures that σ is cohomologous to a 2-cocycle which is constant in the V-direction. In Sect. (12) there is the further difference that σ is presented on V/Λ ⋊_ρ 𝒢 rather than ℋ, but we can easily pull it back to a cocycle on ℋ. The slightly messier presentation in Sects. (12) and (14) appeals to the notion that the initial data is a gerbe on a principal bundle (with any groupoid presentation), and that we have found an action of V (that is, a lift to an equivariant cocycle) for which the gerbe is equivariant.
The Mackey obstruction in our setup is simply β := σ|_{Λ ⋊_δρ 𝒢0}. The methods developed in Sect. (10) make it clear that since 𝒢 is a Čech groupoid, the restriction to a point in the base space, 𝒢 → 𝒢|_m, identifies β with a 2-cocycle on Λ. When β is a coboundary, we may assume that σ only depends on one copy of Λ, and the Pontryagin duality methods apply. Indeed, we may always assume that σ is in the image of equivariant cohomology, H²_V(V ⋉ Λ ⋊_δρ 𝒢; U(1)), and then β really corresponds to the component that obstructs classical dualization and Pontryagin dualization described in Sect. (12).
Acknowledgement. I would like to thank Oren Ben-Bassat, Tony Pantev, Michael Pimsner, Jonathan Rosenberg, Jim Stasheff, and most of all Jonathan Block, for advice and helpful discussions. I am also grateful to the Institut Henri Poincaré, which provided a stimulating environment for some of this research.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
[BX] Behrend, K., Xu, P.: Differentiable stacks and gerbes. http://arxiv.org/abs/math.DG/0605694, 2006
[BBP] Ben-Bassat, O., Block, J., Pantev, T.: Noncommutative tori and Fourier-Mukai duality. http://arxiv.org/abs/math.AG/0509161, 2005
[BD] Block, J., Daenzer, C.: Mukai duality for gerbes with connection. J. für die reine und ang. Math. (Crelle’s journal) in press
[BHM] Bouwknegt, P., Hannabuss, K., Mathai, V.: Nonassociative tori and applications to T-duality. Commun. Math. Phys. 264, 41–69 (2006)
[Br] Brylinski, J.-L.: Loop spaces, characteristic classes and geometric quantization. Progress in Mathematics, 107. Boston, MA: Birkhäuser Boston, Inc., 1993
[BRS] Bunke, U., Rumpf, P., Schick, T.: The topology of T-duality for T^n-bundles. Rev. Math. Phys. 18(10), 1103–1154 (2006)
[BSST] Bunke, U., Schick, T., Spitzweck, M., Thom, A.: Duality for topological abelian group stacks and T-duality. http://arxiv.org/abs/0701428v1[math.AT], 2007
[C] Crainic, M.: Differentiable and algebroid cohomology, Van Est isomorphisms, and characteristic classes. http://arxiv.org/abs/math.DG/0008064, 2000
[CMW] Curto, R., Muhly, P., Williams, D.: Crossed products of strongly Morita equivalent C∗-algebras. Proc. Amer. Math. Soc. 90, 528–530 (1984)
[DP] Donagi, R., Pantev, T.: Torus fibrations, gerbes, and duality. http://arxiv.org/abs/math.AG/0306213, 2003
[Gir] Giraud, J.: Cohomologie non abélienne. Berlin-Heidelberg-New York: Springer-Verlag, 1971
[Met] Metzler, D.: Topological and smooth stacks. http://arxiv.org/abs/math/0306176, 2003
[MR] Mathai, V., Rosenberg, J.: T-duality for torus bundles with H-fluxes via noncommutative topology, II: the high-dimensional case and the T-duality group. Adv. Theor. Math. Phys. 10, 123–158 (2006)
[MRW] Muhly, P.S., Renault, J., Williams, D.P.: Equivalence and isomorphism for groupoid C∗-algebras. J. Op. Th. 17, 3–22 (1987)
[RW] Raeburn, I., Williams, D.P.: Dixmier-Douady classes of dynamical systems and crossed products. Can. J. Math. 45(5), 1032–1066 (1993)
[RW2] Raeburn, I., Williams, D.P.: Morita equivalence and continuous-trace C∗-algebras. Mathematical Surveys and Monographs, 60. Providence, RI: Amer. Math. Soc., 1998
[Ren] Renault, J.: A groupoid approach to C∗-algebras. Lecture Notes in Mathematics, 793. Berlin: Springer, 1980
[Ren2] Renault, J.: Représentations des produits croisés d’algèbres de groupoïdes. J. Op. Th. 18, 67–97 (1987)
[Pol] Polchinski, J.: String Theory, Vol. II. Cambridge: Cambridge University Press, 1998
[SYZ] Strominger, A., Yau, S.-T., Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B 479, 243 (1996)
Communicated by A. Connes
Commun. Math. Phys. 288, 97–123 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0732-5
Communications in
Mathematical Physics
Twistor Actions for Self-Dual Supergravities
Lionel J. Mason1, Martin Wolf2
1 The Mathematical Institute, University of Oxford, 24–29 St. Giles, Oxford OX1 3LP, United Kingdom. E-mail: [email protected]
2 Theoretical Physics Group, The Blackett Laboratory, Imperial College London, Prince Consort Road, London SW7 2AZ, United Kingdom. E-mail: [email protected]
Received: 5 November 2007 / Accepted: 5 November 2008
Published online: 26 February 2009 – © Springer-Verlag 2009
Abstract: We give holomorphic Chern-Simons-like action functionals on supertwistor space for self-dual supergravity theories in four dimensions, dealing with N = 0, . . . , 8 supersymmetries, the cases where different parts of the R-symmetry are gauged, and with or without a cosmological constant. The gauge group is formally the group of holomorphic Poisson transformations of supertwistor space where the form of the Poisson structure determines the amount of R-symmetry gauged and the value of the cosmological constant. We give a formulation in terms of a finite deformation of an integrable $\bar\partial$-operator on a supertwistor space, i.e., on regions in $\mathbb{CP}^{3|8}$. For N = 0, we also give a formulation that does not require the choice of a background.
Contents
1. Introduction
2. Twistor Constructions for Self-Dual Supergravity
2.1 Definitions, notation and conventions
2.2 Self-dual supergravity equations
2.3 Twistor constructions
3. Twistor Actions
3.1 Deformations of twistor space
3.2 Action functionals
4. Covariant Approach, Covariant Action for N = 0 and Special Geometry
4.1 The supersymmetric case
5. Conclusions
Appendix A. Prepotential Formulation
A.1. Real structures on PT[N] and M[N]
A.2. Comparison of the two approaches
Appendix B. Holomorphic Volume Forms and Non-Projective Twistor Space
Appendix C. Supersymmetric BF-Type Theory
References
1. Introduction
Recently it has been discovered that N = 8 supergravity has better ultraviolet behaviour than has hitherto been anticipated, [B-Betal06,BDR07 and GRV07]. This has led some authors to speculate that it is possibly even finite. This improved behaviour relies on exact cancellations that do not follow from standard supersymmetry arguments, [St06]. One possible explanation arises from twistor string theory, [W04 and B04]. The original twistor string theories by Witten and Berkovits correspond to conformal supergravity (together with supersymmetric Yang-Mills theory), [BW04]. By gauging certain symmetries of the Berkovits twistor string, [A-ZHM08] introduced a new family of twistor string theories some of which have the appropriate field content for Einstein supergravity (including N = 4 and N = 8). Such a twistor string formulation of Einstein supergravity could be an explanation for the possible ultraviolet finiteness of N = 8 supergravity if it were fully consistent in its quantum theory. However, it now appears that these twistor string theories are chiral, [N08], unlike the original twistor string theories which were parity invariant. It remains a major open question as to whether a twistor-string theory exists that gives the full content of Einstein (super)-gravity even just at tree level.
An approach to understanding what the appropriate twistor string theory might be is via a twistor action, [M05 and MS06] and [BMS07a,BMS07b]. Such actions have two terms. The first on its own gives a kinetic term for all the fields, but with only the self-dual part of the interactions. The second gives the remaining interactions of the full theory and corresponds to the instanton contribution in the twistor-string theory. In the case of N = 4 supersymmetric Yang-Mills theory, the self-dual part of the action on twistor space is a holomorphic Chern-Simons theory, [W04], see also [S95] for a closely related harmonic superspace action.
[BW04] gave a twistor action for self-dual N = 4 conformal supergravity. The purpose of this paper is to give an analogous action in the case of self-dual N = 8 Einstein supergravity. This action is special to N = 8 supergravity in much the same way as Witten’s Chern-Simons action is special to N = 4 supersymmetric Yang-Mills theory. It lends general support to the idea that twistor space has something special to say about full N = 8 supergravity and is suggestive of the existence of an underlying twistor string theory, perhaps even with explicit N = 8 supersymmetry as opposed to those of [A-ZHM08] in which only N = 4 supersymmetry is manifest.
Penrose’s non-linear graviton construction [P76] reformulates the local data of a four-metric with self-dual Weyl tensor into the complex structure of a deformed twistor space, a three-dimensional complex manifold obtained by deforming a region in $\mathbb{CP}^3$. The space-time field equation in this case is the vanishing of the anti-self-dual part of the Weyl tensor, and in the [AHS78] approach to twistor theory, this is reformulated as the integrability of the twistor almost complex structure. [BW04] introduce a version of conformal gravity with just self-dual interactions in which the underlying conformal structure is self-dual, but in which there is also a linear anti-self-dual conformal gravity field (a linearised anti-self-dual Weyl tensor B) propagating on the self-dual background. This has a Lagrange multiplier action (analogous to a ‘BF’ action) $\int (B, C^-)\, d\,\mathrm{vol}$, where $C^-$ is the anti-self-dual part of the Weyl tensor, and $(B, C^-)$ is the natural pairing. This can be extended to N = 4 supersymmetry. [BW04] gave a corresponding (supersymmetric) twistor action of the form $\int bN$, where N is the Nijenhuis tensor of the almost complex structure and b is a Lagrange multiplier that doubles up as the Penrose
transform of the field B when the field equations are satisfied. In the non-supersymmetric case, this was extended to a twistor action for full (non-self-dual) conformal gravity in [M05] with further supersymmetric extension and connections with twistor-string theory in [MS08].
For Einstein gravity we wish to encode the vanishing of the Ricci tensor. In the non-linear graviton this can be characterised by requiring that the twistor space admits a fibration over a $\mathbb{CP}^1$ together with a certain Poisson structure up the fibre. [W80] extended this to the Einstein case, with a cosmological constant; in this case, the twistor space is required to admit a holomorphic contact structure that is non-degenerate when the cosmological constant is non-zero, see [WW90 and MW96] for textbook treatments. So, for Einstein gravity, we are seeking a twistor action whose field equations not only imply the integrability of an almost complex structure, but also the existence of some compatible holomorphic geometric structure, for example the contact one-form in the case of the cosmological constant, or the fibration together with a Poisson structure up the fibres in the case of vanishing cosmological constant.
The first task is to introduce suitable variables that encode the almost complex structure together with the relevant compatible geometric structure on the real six-manifold underlying the twistor space. This turns out to be a one-form with values in a line bundle, and we write down the appropriate field equations that it must satisfy and an action (depending also on a Lagrange multiplier field) that gives rise to them; the Lagrange multiplier field again corresponds to an anti-self-dual linear gravitational field propagating on the self-dual background via the Penrose transform when the field equations are satisfied. Our primary exposition will focus on the N = 8 supersymmetric cases, and reduce them to the cases with lesser or no supersymmetry.
Supersymmetric extensions of Penrose’s non-linear graviton construction were first discussed by [M92a,M92b] (see also [M91,M92c]) based on work by [M88] and developed further in [A-ZHM08] and in [W07].1 The construction in [W07] gives a twistor description of four-dimensional N-extended, possibly gauged, self-dual supergravity with and without cosmological constant in terms of a deformed supertwistor space, a deformation of a region in $\mathbb{CP}^{3|N}$ endowed with an even holomorphic contact structure. Here we also discuss the different gaugings in the case without a cosmological constant. It is the integrability of the almost complex structures of these twistor spaces, together with the holomorphy of the appropriate geometric structures, that corresponds to the field equations for our twistor actions.
There are now a number of contexts arising from conventional string theory and M-theory in which the task arises of finding variables and action principles whose field equations encode the integrability of complex structures compatibly with some other geometric structure. In particular Kodaira-Spencer theory, [BCOV94], leads to field equations that imply the integrability of an almost complex structure compatible with a global holomorphic volume form on a six-manifold, yielding a Calabi-Yau structure. For a compendium of such theories and relations between them, including conjectured relations to twistor-string theory, see [DGNV05]. The situations considered here are distinct from those in [DGNV05], but given that one of the form theories involved there is a self-dual form theory of four-dimensional gravity including a cosmological constant (see also [A-ZH06]) there may well be some important connections between these ideas.
The paper is structured as follows.
1 See also [S06 and W06] and references therein for recent reviews of supertwistors and their application to supersymmetric gauge theories.
In §2, we first review the equations of self-dual supergravity, with cosmological constant and gauged R-symmetry, and then go on to review the various twistor constructions and give a brief proof of the version of the
non-linear graviton construction for self-dual Einstein supergravity both with and without cosmological constant and different gaugings. In §3, we study infinitesimal deformations and show that a deformation of the contact structure determines a deformation of the almost complex structure. We develop a non-projective twistor formulation that shows that this persists in the case of a finite deformation giving a compact form for the field equations, i.e., the integrability condition for the almost complex structure. In the case of maximal supersymmetry, N = 8, we present the twistor action and show that it gives the appropriate field equations. We give a brief discussion of its invariance properties and various reductions with lesser gauging, supersymmetry, or no cosmological constant. A Chern-Simons action is always expressed in a given background frame and is not manifestly gauge invariant. In this gravitational context, our action is similarly not manifestly diffeomorphism invariant; we require the choice of some background, which we take to be a solution to the field equations. However, we go some way towards an invariant formulation. We give an invariant formulation of the field equations in general, but only find an explicitly diffeomorphism invariant action in the N = 0 case with cosmological constant. We prove that on any smooth manifold of dimension 4n + 2 equipped with a complex one-form τ up to scale (i.e., a complex line subbundle of the complexified cotangent bundle), then, if τ ∧ (dτ )n = 0, and a non-degeneracy condition is satisfied, there is a unique integrable almost complex structure for which τ is proportional to a non-degenerate holomorphic contact structure. This idea can be used to give a covariant form of the field equations in general, and a covariant action in the N = 0 case. In §5, we make some general concluding remarks.
An action principle for N = 8 self-dual supergravity with vanishing cosmological constant has been obtained by [KK98] in harmonic superspace for split space-time signature.2 In that work, harmonic superspace is the spin bundle of super space-time and in Euclidean signature, it can naturally be identified with the supertwistor space. However, their action uses structures pulled back from space-time (e.g., the Laplacian) that are not locally obtainable from the complex structure and contact structure on twistor space. It is therefore not possible to regard it as a twistor action. Nevertheless, their action is closely related to ours and we show that theirs can be obtained from ours by gauge fixing in Appendix A. In Appendix B we give a detailed discussion of the construction of the line bundle on a super-twistor space whose total space corresponds to a non-projective twistor space. In Appendix C, we discuss some alternative twistor actions.
2. Twistor Constructions for Self-Dual Supergravity
We work throughout in a complex setting. This can be understood as arising from taking a real analytic metric on a real space-time, and extending it to become a holomorphic complex metric on some neighbourhood M of the real slice in complexified space-time. We can straightforwardly restrict attention to a Euclidean or split signature slice by requiring invariance under appropriate anti-holomorphic involutions (for Euclidean signature, these are discussed in Appendix A). In the Euclidean case, one needs to restrict the number of allowed supersymmetries N to be even.
2.1. Definitions, notation and conventions. We model our definition of chiral super space-time on the paraconformal geometries of [BE91] (see also [W07]).
2 A similar action for N = 4 supersymmetric Yang-Mills theory was discovered in the context of harmonic superspace by [S95].
Definition 1. A right-chiral super space-time, $\mathcal{M}$, is a split supermanifold of superdimension $4|2N$ on which we have an identification$^3$ $T\mathcal{M} \cong H \otimes \tilde S$, where $\tilde S$ is the right (dotted) spin bundle of rank $2|0$ and $H$ is the sum of the left spin bundle $S$ and the rank-$0|N$ bundle of supersymmetry generators and so has rank $2|N$. We will also assume that $\tilde S$ and $H$ are endowed with choices of Berezinian forms (so that $T\mathcal{M}$ does also).
This is the superspace one would obtain from a full super space-time by eliminating the left-handed fermionic coordinates, leaving only the right-handed ones in play. Being a split supermanifold, it is locally of the form $\mathbb{C}^{4|2N}$ with coordinates$^4$ $(x^{\mu\dot\nu}, \theta^{m\dot\nu}) := x^{M\dot\nu}$ with $x^{\mu\dot\nu}$ bosonic and $\theta^{m\dot\nu}$ fermionic, where the indices range as follows: $\alpha, \ldots, \mu, \ldots = 0, 1$ for left-handed two-component spinors, $\dot\alpha, \ldots, \dot\mu, \ldots = \dot 0, \dot 1$ for right-handed spinors, $i, \ldots, m, \ldots = 1, \ldots, N$ indexing the supersymmetries and $A = (\alpha, i)$, $M = (\mu, m)$; it will turn out in the following that it is natural, and simplifying in this self-dual context, to group together the supersymmetry index $m$ and the undotted spinor index $\mu$ into one index $M$. We use the convention that letters from the middle of the alphabets are coordinate indices whereas letters from the beginning of the alphabets are structure frame indices.
3 By $T\mathcal{M}$ we will mean $T^{(1,0)}\mathcal{M}$. There will be no role for anti-holomorphic objects on $\mathcal{M}$.
4 The index structure on the bosonic coordinates in the curved case is not natural, but simplifies notation.
The identification $T\mathcal{M} \cong H \otimes \tilde S$ will be specified by a choice of ‘structure coframe’ given by the indexed one-forms
$$E^{A\dot\alpha} = dx^{M\dot\nu}\, E_{M\dot\nu}{}^{A\dot\alpha}. \qquad (2.1)$$
The dual vector fields will be denoted $E_{A\dot\alpha}$, $E_{A\dot\alpha} \lrcorner E^{B\dot\beta} = \delta_{\dot\alpha}{}^{\dot\beta}\,\delta_A{}^B$. When contracting a vector field $V$ with a differential one-form $\alpha$ we use the notation $V \lrcorner \alpha$. With the capital Roman indices $A, B, \ldots$ ranging over both the bosonic $\alpha, \beta, \ldots$ and the fermionic $i, j, \ldots$ indices we use the notation $\{AB\ldots]$ for graded symmetrization and $[AB\ldots\}$ for graded skew symmetrization
$$T_{\{A_1 A_2 \ldots A_n]} := \frac{1}{n!} \sum_{\sigma \in P_n} (-)^{\bar\sigma}\, T_{A_{\sigma(1)} A_{\sigma(2)} \ldots A_{\sigma(n)}}, \qquad (2.2a)$$
$$T_{[A_1 A_2 \ldots A_n\}} := \frac{1}{n!} \sum_{\sigma \in P_n} (-)^{\bar\sigma + |\sigma|}\, T_{A_{\sigma(1)} A_{\sigma(2)} \ldots A_{\sigma(n)}}, \qquad (2.2b)$$
where $P_n$ is the group of permutations of $n$ letters, $|\sigma|$ the number of transpositions in $\sigma$ and $\bar\sigma$ the number of transpositions of odd indices. For an index such as $A$ that ranges over indices for both odd and even coordinates, $p_A$ will denote the Graßmann parity of the index, $p_A = 0$ for an even coordinate, and 1 for an odd one, so that a graded skew form $\Lambda_{AB}$ satisfies
$$\Lambda_{AB} = -(-)^{p_A p_B}\, \Lambda_{BA}. \qquad (2.3)$$
We introduce $\epsilon_{\dot\alpha\dot\beta} = \epsilon_{[\dot\alpha\dot\beta]}$ with $\epsilon_{\dot 0 \dot 1} = -1$ and $\epsilon_{\dot\alpha\dot\gamma}\,\epsilon^{\dot\gamma\dot\beta} = \delta_{\dot\alpha}{}^{\dot\beta}$, and similarly for $\epsilon_{\alpha\beta}$.
In the supersymmetric setting, there is a distinction between differential and integral forms, the latter being required for integration, [M88]. Unless otherwise stated, all our forms will be differential.
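The graded (skew) symmetrizations (2.2a) and (2.2b) can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not part of the paper: tensors are stored as dictionaries keyed by index tuples, the function names are hypothetical, and $\bar\sigma$ is interpreted as the number of inversions of $\sigma$ between two slots that both carry odd indices (only its parity matters). It checks that a rank-two graded skew tensor satisfies relation (2.3).

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def inversions(perm, odd=None):
    """Count pairs i < j with perm[i] > perm[j].

    If `odd` is given (a 0/1 flag per original slot), count only the pairs in
    which both permuted slots carry an odd index (assumed reading of sigma-bar).
    """
    n = len(perm)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if perm[i] > perm[j] and (odd is None or (odd[perm[i]] and odd[perm[j]]))
    )


def graded_project(T, parity, rank, skew):
    """Graded symmetrization {A...] (skew=False) or skew symmetrization [A...} (skew=True)."""
    dim = len(parity)
    out = {}
    for idx in product(range(dim), repeat=rank):
        odd = [parity[a] for a in idx]  # Grassmann parity of the index in each slot
        total = Fraction(0)
        for perm in permutations(range(rank)):
            exponent = inversions(perm, odd) + (inversions(perm) if skew else 0)
            total += (-1) ** exponent * T[tuple(idx[k] for k in perm)]
        out[idx] = total / factorial(rank)
    return out


# A 2|3 example: index values 0, 1 are even, index values 2, 3, 4 are odd.
parity = [0, 0, 1, 1, 1]
T = {(a, b): Fraction(3 * a + 7 * b * b + 1) for a in range(5) for b in range(5)}

S = graded_project(T, parity, rank=2, skew=True)
Sym = graded_project(T, parity, rank=2, skew=False)

# Relation (2.3): S_AB = -(-)^{p_A p_B} S_BA, and the symmetric analogue.
graded_skew_ok = all(
    S[a, b] == -((-1) ** (parity[a] * parity[b])) * S[b, a]
    for a in range(5) for b in range(5)
)
graded_sym_ok = all(
    Sym[a, b] == (-1) ** (parity[a] * parity[b]) * Sym[b, a]
    for a in range(5) for b in range(5)
)
```

On the odd-odd block the graded skew part is an ordinary symmetric matrix, which is why a graded skew form can restrict to a symmetric form on the odd directions.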
2.2. Self-dual supergravity equations. We introduce connections on $H$ and $\tilde S$ represented by connection one-forms $\omega_A{}^B$ and $\omega_{\dot\alpha}{}^{\dot\beta}$, respectively. These determine a connection $\nabla$ on $T\mathcal{M}$ by
$$\nabla V^{A\dot\alpha} = dV^{A\dot\alpha} + V^{B\dot\alpha}\,\omega_B{}^A + V^{A\dot\beta}\,\omega_{\dot\beta}{}^{\dot\alpha} \qquad (2.4)$$
so that it preserves the factorisation $T\mathcal{M} \cong H \otimes \tilde S$. The fermionic parts of $\omega_A{}^B$ gauge the R-symmetry. In this supersymmetric context, a choice of scale or volume form on $\mathcal{M}$ is a section of the Berezinian of $\Omega^1\mathcal{M}$. We can assume that the Berezinians of $H$ and $\tilde S$ have been identified so that the scale is determined by a section of the Berezinian of either $H^*$ or $\tilde S^*$. The connections can be chosen uniquely so that they preserve these sections of the Berezinians of $H^*$ and $\tilde S^*$ and so that the connection on $T\mathcal{M}$ has torsion with vanishing supertrace.5 We assume from hereon that such choices have been made. In the formulae that follow, we will also assume that the connection is torsion-free as that is part of the self-dual Einstein condition (the torsion will not in general vanish on the full super space-time, only on this right-chiral (or left-chiral) reduced supermanifold).
5 Special care needs to be taken for N = 4, [W07].
The curvature two-form $R_{A\dot\alpha}{}^{B\dot\beta}$ of $\nabla$ decomposes into curvature two-forms for the connections on $H$ and $\tilde S$,
$$R_{A\dot\alpha}{}^{B\dot\beta} = \delta_A{}^B\, R_{\dot\alpha}{}^{\dot\beta} + \delta_{\dot\alpha}{}^{\dot\beta}\, R_A{}^B. \qquad (2.5)$$
Making explicit the form indices, we write the Ricci identities as
$$[\nabla_{A\dot\alpha}, \nabla_{B\dot\beta}\}\, V^{D\dot\delta} = (-)^{p_C(p_A+p_B)}\, V^{C\dot\delta}\, R_{A\dot\alpha B\dot\beta C}{}^{D} + (-)^{p_D(p_A+p_B)}\, V^{D\dot\gamma}\, R_{A\dot\alpha B\dot\beta\dot\gamma}{}^{\dot\delta}, \qquad (2.6)$$
where $V^{A\dot\alpha}$ is a vector field on $\mathcal{M}$. In the torsion free case, using the algebraic Bianchi identities, Prop. 2.6 of [W07] gives the decomposition of the curvature into irreducibles:
$$R_{A\dot\alpha B\dot\beta C}{}^{D} = -2(-)^{p_C(p_A+p_B)}\, R_{C[A|\dot\alpha\dot\beta|}\,\delta_{B\}}{}^{D} + \epsilon_{\dot\alpha\dot\beta}\, R_{ABC}{}^{D}, \qquad (2.7a)$$
$$R_{ABC}{}^{D} = C_{ABC}{}^{D} - 2(-)^{p_C(p_A+p_B)}\, \Lambda_{C\{A}\,\delta_{B]}{}^{D}, \qquad (2.7b)$$
$$R_{A\dot\alpha B\dot\beta\dot\gamma}{}^{\dot\delta} = C_{AB\dot\alpha\dot\beta\dot\gamma}{}^{\dot\delta} + 2\Lambda_{AB}\, \delta_{(\dot\alpha}{}^{\dot\delta}\,\epsilon_{\dot\beta)\dot\gamma} + \epsilon_{\dot\alpha\dot\beta}\, R_{AB\dot\gamma}{}^{\dot\delta}, \qquad (2.7c)$$
where the curvature tensors satisfy the algebraic conditions
$$R_{AB\dot\alpha\dot\beta} = R_{AB\dot\alpha}{}^{\dot\gamma}\,\epsilon_{\dot\gamma\dot\beta} = R_{AB(\dot\alpha\dot\beta)}, \quad C_{ABC}{}^{D} = C_{\{ABC]}{}^{D}, \quad (-)^{p_C}\, C_{ABC}{}^{C} = 0, \quad \Lambda_{AB} = \Lambda_{[AB\}}. \qquad (2.8)$$
Here, $\Lambda_{AB}$ is a natural supersymmetric extension of the scalar curvature and will be set equal to the cosmological constant when the field equations are satisfied. (See [W07] for further details of the construction and properties of the connections.)
Definition 2. A right-chiral superspace will be said to satisfy the N-extended self-dual supergravity equations if
(i) the unique connection that preserves the given Berezinians of $H^*$ and $\tilde S^*$ is torsion-free and satisfies $C_{AB\dot\alpha\dot\beta\dot\gamma}{}^{\dot\delta} = 0$,
(ii) $R_{AB\dot\alpha\dot\beta} = 0$,
(iii) it preserves some $P^{AB} = P^{[AB\}} \in \wedge^2 H$ of rank $2|r$ and is flat on the odd $(N - r)$-dimensional subspace of $H^*$ that annihilates $P^{AB}$.
When $\Lambda_{AB} \neq 0$ it will be said to be Einstein, whereas if $\Lambda_{AB} = 0$ it will be said to be vacuum.
When $r = 0$, the connection on $H$ is trivial in the odd directions and the R-symmetry is ungauged; all supersymmetry generators are covariantly constant. For $r > 0$, a subgroup of the R-symmetry is gauged with gauge group an extension of $SO(r, \mathbb{C})$, the subgroup of $SO(N, \mathbb{C})$ that preserves $P^{ij}$, the odd-odd part of $P^{AB}$. For $r = N$, the gauge group is $SO(N, \mathbb{C})$. Conformal supergravity corresponds to the more general situation where condition (i) alone is satisfied, and a natural supersymmetric analogue of the hypercomplex case corresponds to conditions (i) and (ii). In this work, we shall mostly be concerned with the situation where (i)–(iii) are satisfied simultaneously. There is only one possibility for the gauging in the Einstein case as follows:
Lemma 1. Either $P^{AB}$ and $\Lambda_{AB}$ both have maximal rank and can be chosen to be multiples of each other’s inverse, or $\Lambda_{AB} = 0$.
Proof. Condition (iii) of Def. 2 implies that $(-)^{p_C + p_C(p_D+p_E)}\, R_{ABC}{}^{[D} P^{E\}C} = 0$, and taking a supertrace gives the equation
$$0 = (-)^{p_C(p_A+p_E) + p_C + p_B}\, \Lambda_{C\{A}\,\delta_{B]}{}^{[B}\, P^{E\}C}, \qquad (2.9)$$
which quickly leads to the condition that $(-)^{p_B}\Lambda_{AB}P^{BC}$ is a multiple of $\delta_A{}^C$. If this multiple is non-zero, $P^{AB}$ and $\Lambda_{AB}$ have maximal rank and are multiples of each other’s inverse. If this multiple is zero, the assumption on the rank of $P^{AB}$ implies that the rank of $\Lambda_{AB}$ is less than or equal to $0|N - r$. The condition that the connection is flat on the subspace of $H^*$ that annihilates $P^{AB}$ implies that $R_{ABC}{}^{D} e_D = 0$ for all $e_D$ such that $P^{AB} e_B = 0$. Symmetrizing over $ABC$ gives that $C_{ABC}{}^{D} e_D = 0$ so we must also have $\Lambda_{C\{A}\, e_{D]} = 0$. Note that for N = 2 the multiple is always zero.
It is a consequence of the Bianchi identities that $\Lambda_{AB}$ is covariantly constant so that, when non-zero, defining $P_{AB}$ as the inverse of $P^{AB}$, we can set $\Lambda_{AB} = \Lambda P_{AB}$. When $\Lambda_{AB}$ is non-trivial, the curvature is non-trivial on the odd directions of $H$, and so the R-symmetry is necessarily gauged with gauge group $SO(N, \mathbb{C})$. We will see in §3.1 in the discussion of the deformations of twistor space how the different gaugings come about. We also obtain $(-)^{p_C + p_C(p_D+p_E)}\, C_{ABC}{}^{[D} P^{E\}C} = 0$ and $\nabla_{[A\dot\alpha} C_{B\}CD}{}^{E} = 0$. The field equations of self-dual supergravity with zero cosmological constant lead to the Ricci identities
$$[\nabla_{A\dot\alpha}, \nabla_{B\dot\beta}\}\, V^{D\dot\delta} = (-)^{p_C(p_A+p_B)}\, V^{C\dot\delta}\, \epsilon_{\dot\alpha\dot\beta}\, C_{ABC}{}^{D}, \qquad (2.10)$$
which in turn imply Ricci-flatness of $\mathcal{M}$. The self-dual supergravity equations on chiral super space-time with vanishing cosmological constant first appeared in light-cone gauge and in their covariant formulation in the work by [S92].6
6 See also [K79,K80,CDDG79,KNG92 and BS92].
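2.3. Twistor constructions. Flat supertwistor space is $\mathcal{PT}_{[N]} := \mathbb{CP}^{3|N} \setminus \mathbb{CP}^{1|N}$ with homogeneous coordinates
$$Z^I := (\omega^\alpha, \theta^i, \pi_{\dot\alpha}) = (\omega^A, \pi_{\dot\alpha}), \qquad (2.11)$$
where $\omega^\alpha$ and $\pi_{\dot\alpha}$ are bosonic coordinates and $\theta^i$ fermionic ones. The supertwistor correspondence is between $\mathcal{PT}_{[N]}$ and right-chiral complexified super space-time $\mathcal{M}_{[N]} \cong \mathbb{C}^{4|2N}$ with coordinates $(x^{\alpha\dot\alpha}, \theta^{i\dot\alpha}) = x^{A\dot\alpha}$, and is expressed by the incidence relation
$$\omega^A = x^{A\dot\alpha}\, \pi_{\dot\alpha}. \qquad (2.12)$$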
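The incidence relation (2.12) is easy to experiment with numerically. The sketch below is only an illustration, restricted by assumption to the bosonic case N = 0 so that $x$ is an ordinary 2×2 complex matrix; the helper names `twistor` and `close` are hypothetical. It checks that rescaling $\pi_{\dot\alpha}$ rescales the whole twistor, so a fixed $x$ sweeps out a projective line, and that two such lines share a point whenever $x - y$ annihilates some non-zero $\pi_{\dot\alpha}$.

```python
def twistor(x, pi):
    """Return Z = (omega^0, omega^1, pi_0, pi_1) with omega^a = sum_b x[a][b] * pi[b]."""
    omega = [sum(x[a][b] * pi[b] for b in range(2)) for a in range(2)]
    return omega + list(pi)


def close(u, v, tol=1e-12):
    """Componentwise comparison of two complex vectors up to a tolerance."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


# An arbitrary space-time point x (bosonic case) and spinor pi.
x = [[1 + 2j, 3], [0.5j, -1]]
pi = [2, 1 - 1j]
t = 0.5 - 3j

# Homogeneity: pi -> t*pi sends Z -> t*Z, so x determines a line (a CP^1) in twistor space.
Z = twistor(x, pi)
Z_scaled = twistor(x, [t * c for c in pi])
is_homogeneous = close(Z_scaled, [t * z for z in Z])

# If y - x is a rank-one matrix n with n . pi0 = 0, the lines of x and y meet at the twistor of pi0.
n = [[1, 2], [2, 4]]
y = [[x[a][b] + n[a][b] for b in range(2)] for a in range(2)]
pi0 = [2, -1]
lines_meet = close(twistor(x, pi0), twistor(y, pi0))
```

The same formula with the index $A = (\alpha, i)$ in place of $\alpha$ would involve Graßmann-valued entries, which this numerical sketch does not attempt to model.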
By holding $x^{A\dot\alpha}$ constant we see that points of $\mathcal{M}_{[N]}$ correspond to $\mathbb{CP}^1$s in supertwistor space $\mathcal{PT}_{[N]}$ with homogeneous coordinates $\pi_{\dot\alpha}$. Alternatively, by holding $Z^I$ constant, we see that points in $\mathcal{PT}_{[N]}$ correspond to $(2|N)$-dimensional isotropic superplanes. In the curved case, both sides of the correspondence are deformed, but points of super space-time still correspond to $\mathbb{CP}^1$s in supertwistor space (and points of supertwistor space to $(2|N)$-dimensional isotropic subsupermanifolds of $\mathcal{M}$). Bosonic twistor space will be denoted by $PT$ and will be a deformation of some region in $\mathbb{CP}^3$, whereas a supersymmetrically extended curved twistor space will be denoted by $\mathcal{PT}$ and will be a deformation of a region in $\mathbb{CP}^{3|N}$. Similarly, a bosonic space-time will be denoted by $M$ and a supersymmetric one (which will always in this paper be right-chiral) by $\mathcal{M}$.
We recall first Ward’s extension [W80] of the non-linear graviton construction of [P76] to the case of non-zero cosmological constant:
Theorem 1. ([P76,W80]).
(i) There is a natural one-to-one correspondence between holomorphic conformal structures $[g]$ on some four-dimensional (complex) manifold $M$ whose anti-self-dual Weyl curvature vanishes, and three-dimensional complex manifolds $PT$ (the twistor space) containing a rational curve (a $\mathbb{CP}^1$) with normal bundle $N \cong O(1) \oplus O(1)$.
(ii) The existence of a conformal scale for which the trace-free Ricci tensor vanishes, but for which the scalar curvature is non-vanishing, is equivalent to $PT$ admitting a non-degenerate contact structure.
(iii) The existence of a conformal scale for which the full Ricci tensor vanishes is equivalent to $PT$ admitting a fibration $PT \to \mathbb{CP}^1$ whose fibres admit a Poisson structure with values in the pullback of $O(-2)$ from $\mathbb{CP}^1$.
Here, $O(n)$ is the complex line bundle of Chern class $n$ on $\mathbb{CP}^1$. The holomorphic contact structure is a rank-2 distribution $D \subset T^{(1,0)}PT$ in the holomorphic tangent bundle of $PT$.
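For orientation, the non-degeneracy required of a contact structure in Theorem 1(ii) can be seen in the simplest local model. The following worked example is an illustration that is not taken from the paper: it uses the standard Darboux one-form on $\mathbb{C}^3$ with assumed coordinates $(x, y, z)$ rather than a twistor space, and non-degeneracy is understood as the statement that the commutator of two vector fields spanning $D$ does not lie in $D$.

```latex
\begin{align*}
\tau &= dz - y\,dx, \qquad D = \ker\tau = \mathrm{span}\{X, Y\}, \qquad
X = \partial_x + y\,\partial_z, \quad Y = \partial_y, \\
[X, Y] &= -\partial_z \notin D, \qquad \tau([X, Y]) = -1 \neq 0, \\
d\tau &= dx \wedge dy, \qquad \tau \wedge d\tau = dz \wedge dx \wedge dy = dx \wedge dy \wedge dz \neq 0 .
\end{align*}
```

By contrast, the one-form $\tau = dz$ has $\tau \wedge d\tau = 0$: its kernel is integrable and the leaves $z = \text{const}$ foliate $\mathbb{C}^3$, which is the local picture of the degenerate case in which the distribution defines a fibration.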
The quotient determines a line bundle $L := T^{(1,0)}PT/D$. The distribution can be defined dually to be the kernel of a holomorphic $(1, 0)$-form $\tau$ defined up to scale on $PT$, i.e. $D = \ker\tau$. If so, $\tau$ takes values in $L$ since the map $T^{(1,0)}PT \to T^{(1,0)}PT/D =: L$ is then the contraction of a vector with the $(1, 0)$-form $\tau$. The non-degeneracy condition is that the Frobenius form
$$D \wedge D \to L := T^{(1,0)}PT/D, \qquad (X, Y) \mapsto [X, Y] \bmod D, \qquad (2.13)$$
defined for any two vector fields $X$ and $Y$ in $D$, is non-degenerate on $D$. This is equivalent to $\tau \wedge d\tau \neq 0$. When it is everywhere degenerate, $D$ determines a foliation whose leaves are the fibres of the projection $PT \to \mathbb{CP}^1$, $\tau$ is the pullback of the one-form $\pi^{\dot\alpha} d\pi_{\dot\alpha}$ from $\mathbb{CP}^1$, and $L$ becomes the pullback of $O(2)$ from $\mathbb{CP}^1$. In the non-degenerate case, we can define a Poisson structure with values in $L^*$ to be the inverse of the Frobenius form on $D$. This has an analogue also in the degenerate case, now with values in $O(-2)$, although its existence no longer follows from that of $\tau$.
We can impose compatibility with, e.g., Euclidean reality conditions by requiring the existence of an anti-holomorphic involution $\rho : PT \to PT$ without fixed points sending the given Riemann sphere to itself via the antipodal map. This then induces a corresponding involution on $M$ fixing a real slice on which the metric $g$ is real and of Euclidean signature.
The above theorem has a supersymmetric extension as follows:
Theorem 2.
(i) There is a natural one-to-one correspondence between conformally self-dual holomorphic right-chiral space-times and complex supermanifolds $\mathcal{PT}$ of dimension $3|N$ with an embedded rational curve (a Riemann sphere $\mathbb{CP}^1$) with normal bundle $N \cong O(1)^{\oplus 2|N}$.
(ii) Furthermore, $\mathcal{M}$ is a complex solution to the four-dimensional N-extended self-dual supergravity equation with non-vanishing cosmological constant iff the twistor space $\mathcal{PT}$ admits a non-degenerate even contact structure.
(iii) $\mathcal{M}$ is a complex solution to the four-dimensional N-extended self-dual supergravity equation with vanishing cosmological constant iff the twistor space $\mathcal{PT}$ admits a fibration $\mathcal{PT} \to \mathbb{CP}^{1|N-r}$ and a Poisson structure of rank $2|r$ tangent to the fibres with values in the pullback of $O(-2)$.
Here, $O(n)^{\oplus r|s} := \mathbb{C}^{r|s} \otimes O(n)$. The proof breaks up into three parts; further details of the non-degenerate cosmological constant case are given in [W07].
Proof. Part (i). Let $\mathcal{F} = P(\tilde S^*)$ be the projective co-spin bundle over $\mathcal{M}$ with holomorphic projection $p : \mathcal{F} \to \mathcal{M}$.
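Its fibres $p^{-1}(x)$ over $x \in \mathcal{M}$ are complex projective lines $\mathbb{CP}^1$ with homogeneous fibre coordinates $\pi_{\dot\alpha}$. We define the twistor distribution to be the rank-$2|N$ distribution $D_{\mathcal F}$ on $\mathcal{F}$ given by
$$D_{\mathcal F} := \mathrm{span}\{\tilde E_A\} := \mathrm{span}\Big\{ \pi^{\dot\alpha} E_{A\dot\alpha} + \pi^{\dot\alpha}\pi_{\dot\gamma}\, \omega_{A\dot\alpha\dot\beta}{}^{\dot\gamma}\, \frac{\partial}{\partial \pi_{\dot\beta}} \Big\}, \qquad (2.14)$$
where the $E_{A\dot\alpha}$s are the frame fields and $\omega_{\dot\alpha}{}^{\dot\beta}$ is the connection one-form on $\tilde S$. A few lines of algebra show that $D_{\mathcal F}$ is integrable if and only if the connection is torsion-free and the $C_{AB(\dot\alpha\dot\beta\dot\gamma\dot\delta)}$-part of the curvature vanishes. In this case, the distribution $D_{\mathcal F}$ defines a foliation of $\mathcal{F}$. Working locally on $\mathcal{M}$, the resulting quotient will be our supertwistor space, a $(3|N)$-dimensional supermanifold denoted by $\mathcal{PT}$. The quotient map will be denoted by $q : \mathcal{F} \to \mathcal{PT}$ so that we have the double fibration $\mathcal{PT} \xleftarrow{\;q\;} \mathcal{F} \xrightarrow{\;p\;} \mathcal{M}$. We note that we can form a non-projective supertwistor space $\mathcal{T}$ by taking the quotient of $\tilde S^*$ by the distribution $D_{\mathcal F}$. The integral curves of the Euler vector field $\tilde\Upsilon := \pi_{\dot\alpha}\, \partial/\partial\pi_{\dot\alpha}$ are the fibres over $P(\tilde S^*)$, and $\tilde\Upsilon$ descends to give a vector field $\Upsilon$ on $\mathcal{T}$ which determines the fibration $\mathcal{T} \to \mathcal{PT}$. Since $\mathcal{F}$ is a $\mathbb{CP}^1$-bundle over $\mathcal{M}$ and the fibres are transverse to the distribution $D_{\mathcal F}$, the submanifolds $q(p^{-1}(x)) \hookrightarrow \mathcal{PT}$, for $x \in \mathcal{M}$, are $\mathbb{CP}^1$s. In the other direction, the supermanifolds $p(q^{-1}(Z)) \hookrightarrow \mathcal{M}$, for $Z \in \mathcal{PT}$, are the $(2|N)$-dimensional isotropic subsupermanifolds of $\mathcal{M}$ given by the $p$-projections of integral surfaces of $D_{\mathcal F}$.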
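A quick consistency check of the twistor distribution can be made in the flat case. The assumptions here are ours rather than part of the proof: the connection is taken to vanish and the frame fields are taken to be coordinate derivatives, $E_{A\dot\alpha} = \partial/\partial x^{A\dot\alpha}$, and $\tilde E_A$ denotes the resulting generators of the distribution. Under these assumptions the functions appearing in the incidence relation (2.12) are constant along the distribution, so they descend to the quotient and serve as coordinates on flat supertwistor space:

```latex
\begin{align*}
\tilde E_A &= \pi^{\dot\alpha}\,\frac{\partial}{\partial x^{A\dot\alpha}}, \\
\tilde E_A\big(x^{B\dot\beta}\pi_{\dot\beta}\big) &= \delta_A{}^B\,\pi^{\dot\beta}\pi_{\dot\beta} = 0, \qquad
\tilde E_A\,\pi_{\dot\beta} = 0 .
\end{align*}
```

The first expression vanishes because the bosonic spinor $\pi_{\dot\alpha}$ is contracted with itself through the skew-symmetric $\epsilon^{\dot\alpha\dot\beta}$, independently of the sign convention chosen for raising indices.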
L. J. Mason, M. Wolf
The inverse construction, i.e. starting from $PT$, follows by applying a supersymmetric extension of Kodaira's deformation theory [W86]. This allows one to reconstruct $M$ as the moduli space of $\mathbb{CP}^1$s that arise as deformations of the given $\mathbb{CP}^1$, which will correspond to some $x \in M$. According to Kodaira theory, $T_x M \cong H^0(\mathbb{CP}^1, N)$, where $N$ is the normal bundle to the given $\mathbb{CP}^1 \subset PT$, and in order that the moduli space exist, we require the vanishing of the first cohomology of the normal bundle $N$. If the given $\mathbb{CP}^1$ arises as $q(p^{-1}(x))$ for some $x \in M$, then $N \cong O(1)^{\oplus 2|\mathcal{N}}$: this can be seen by expressing it as the quotient of the horizontal tangent vectors to $F$ at $p^{-1}(x) \cong \mathbb{CP}^1$, which can be represented by $T_x M$, by $D_F$,

$$0 \longrightarrow D_F|_{p^{-1}(x)} \longrightarrow T_x M \longrightarrow q^* N \longrightarrow 0. \qquad (2.15)$$
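The dimension count behind this use of Kodaira theory can be sketched in a few lines of Python. This is only an illustration: it uses the standard facts $\dim H^0(\mathbb{CP}^1, O(n)) = \max(n+1, 0)$ and, by Serre duality, $\dim H^1(\mathbb{CP}^1, O(n)) = \max(-n-1, 0)$, and the helper names `h0`, `h1` and `normal_bundle_counts` are ours, not taken from the text.

```python
def h0(n):
    """dim H^0(CP^1, O(n)): homogeneous polynomials of degree n in two variables."""
    return max(n + 1, 0)


def h1(n):
    """dim H^1(CP^1, O(n)), obtained from Serre duality as h0(-n - 2)."""
    return max(-n - 1, 0)


def normal_bundle_counts(n_susy):
    """Even|odd dimensions of H^0 and H^1 for N = O(1)^{2|n_susy}.

    Returns ((even h0, odd h0), (even h1, odd h1)).
    """
    even_rank, odd_rank = 2, n_susy
    tangent_dim = (even_rank * h0(1), odd_rank * h0(1))
    obstruction = (even_rank * h1(1), odd_rank * h1(1))
    return tangent_dim, obstruction


if __name__ == "__main__":
    for n_susy in (0, 4, 8):
        print(n_susy, normal_bundle_counts(n_susy))
```

For every value of `n_susy` the first pair is $4|2\mathcal{N}$, matching $T_x M \cong \mathbb{C}^{4|2\mathcal{N}}$, and the second pair vanishes, which is the unobstructedness condition quoted above.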
Since the twistor distribution $D_F$ restricted to the fibres $p^{-1}(x)$ over $x \in M$ is $O(-1)^{\oplus 2|\mathcal{N}}$, and $T_x M \cong \mathbb{C}^{4|2\mathcal{N}}$, $N$ takes the form $O(1)^{\oplus 2|\mathcal{N}}$ as stated above. Kodaira theory in turn implies that we can reconstruct $M$ as the moduli space of such $\mathbb{CP}^1$s, and that the construction is stable under deformations of the complex structure on $PT$. Kodaira theory identifies the tangent space $T_x M$ with the sections of the normal bundle, $N \cong O(1)^{\oplus 2|\mathcal{N}}$, and these, by an extension of Liouville's theorem, are linear functions of $\pi_{\dot\alpha}$, i.e., $V^{A\dot\alpha}\pi_{\dot\alpha}$, where the $A$ index is associated to a basis of $\mathbb{C}^{2|\mathcal{N}}$. This gives the right-chiral manifold structure on $M$, and it is easily seen that lines through a given point of $PT$ correspond to an integrable $(2|\mathcal{N})$-dimensional manifold that will be an integral surface of the distribution $D_F$. Thus $D_F$ is integrable and $M$ is therefore conformally self-dual. Part (ii). In the self-dual Einstein case with non-vanishing cosmological constant, we may introduce a one-form of homogeneity 2 on $F$ by
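$$\tilde\tau := \pi^{\dot\alpha}\nabla\pi_{\dot\alpha} = \pi^{\dot\alpha} d\pi_{\dot\alpha} - \omega_{\dot\alpha}{}^{\dot\beta}\,\pi^{\dot\alpha}\pi_{\dot\beta}, \qquad (2.16)$$

where $\omega_{\dot\alpha}{}^{\dot\beta}$ is the connection one-form on $S$. The one-form $\tilde\tau$ automatically annihilates horizontal vectors and hence the distribution $D_F$. The form $\tilde\tau$ descends to $PT$ if and only if $d\tilde\tau$ is annihilated by $D_F$ also. This characterizes the self-dual Einstein equations since when $C_{AB\dot\alpha\dot\beta\dot\gamma\dot\delta} = 0$, as follows from the conformal self-duality condition,

$$d\tilde\tau = \nabla\pi^{\dot\alpha} \wedge \nabla\pi_{\dot\alpha} + E^{B\dot\beta} \wedge E^{A\dot\alpha}\,\Lambda_{AB}\,\pi_{\dot\alpha}\pi_{\dot\beta} - E^{B\dot\gamma} \wedge E^{A}{}_{\dot\gamma}\, R_{AB\dot\alpha\dot\beta}\,\pi^{\dot\alpha}\pi^{\dot\beta}, \qquad (2.17)$$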
and this is annihilated by $D_F$ iff $R_{AB\dot\alpha\dot\beta} = 0$. Thus, $\tilde\tau$ descends to $PT$, i.e., there exists a one-form $\tau$ on $PT$ such that $\tilde\tau = q^*\tau$. Non-degeneracy of the contact structure is the condition that $d\tau$ is non-degenerate on the kernel $D$ of $\tau$, or equivalently, the condition that the three-form $\tau \wedge d\tau$ should be non-degenerate in the sense that for any vector $X$, $X \lrcorner (\tau \wedge d\tau) = 0 \Rightarrow X = 0$. This non-degeneracy is equivalent to the non-degeneracy of $\Lambda_{AB}$ on $H$. Thus, $\tau$ defines a non-degenerate holomorphic contact structure on $PT$. Part (iii). In the self-dual vacuum case, we see that the connection on $S$ is flat and a basis for $S$ can be found in which the connection vanishes. In this basis, the $\pi_{\dot\alpha}$ are constant along the horizontal distribution on $F$, and so along the distribution (2.14). They are therefore the pullback of coordinates on $PT$. The condition that the connection is flat on the annihilator of $P^{AB}$ in $H^*$ means that there are $\mathcal{N} - r$ covariantly constant sections $e^s_A$ of the odd part of $H^*$, $s = r+1, \ldots, \mathcal{N}$. The forms $E^{A\dot\alpha} e^s_A$ are therefore constant and, since the connection is torsion free, these forms are exact and equal to $d\theta^{s\dot\alpha}$ for some odd
Twistor Actions for Self-Dual Supergravities
coordinates $\theta^{s\dot\alpha}$. The $\mathcal{N} - r$ functions $\theta^s = \theta^{s\dot\alpha}\pi_{\dot\alpha}$ can be seen to be constant also along the twistor distribution (2.14). The global holomorphic coordinates $(\pi_{\dot\alpha}, \theta^s)$ define a projection $\varpi : PT \to \mathbb{CP}^{1|\mathcal{N}-r}$ as promised. We now define the Poisson structure by considering a pair of local functions $f, g$ on $PT$. Pulled back to $F$, they satisfy

$$\pi^{\dot\alpha} E_{A\dot\alpha} f = 0, \qquad \pi^{\dot\alpha} E_{A\dot\alpha} g = 0, \qquad (2.18)$$

and this implies that

$$E_{A\dot\alpha} f = \pi_{\dot\alpha} f_A, \qquad E_{A\dot\alpha} g = \pi_{\dot\alpha} g_A \qquad (2.19)$$

for some $f_A$, $g_A$ of weight $-1$ in $\pi_{\dot\alpha}$ (this follows from the standard fact that $\pi^{\dot\alpha} b_{\dot\alpha} = 0 \Rightarrow b_{\dot\alpha} = b\,\pi_{\dot\alpha}$ for some $b$, which follows from the two-dimensionality of the spin space and the skew symmetry of $\epsilon_{\dot\alpha\dot\beta}$). We define the Poisson bracket $\{f, g\}$ of $f$ with $g$ to be

$$\{f, g\} := (-)^{p_A(p_f+1)} f_A P^{AB} g_B. \qquad (2.20)$$
It is clear that this has weight −2 in πα˙ , but, as given, this expression only lives on F . However, it is easily checked that, as a consequence of the covariant constancy of the P AB , it is constant along the distribution (2.14) and descends to PT .
See Appendix 5 for more on the non-projective formulation.

3. Twistor Actions

In order to consider actions, we must allow our fields to go off-shell, and this is most straightforwardly done in the Dolbeault setting. We can take an almost complex structure that is not necessarily integrable to be the off-shell field, and regard the integrability condition as part of the field equations. In the following we will see that if we require the almost complex structure to be compatible with a Poisson structure or complex contact structure, then the almost complex structure can be encoded in a complex one-form $h$ defined up to scale. In the following, we will mostly work 'non-projectively', i.e., on $T_{[\mathcal{N}]} = \mathbb{C}^{4|\mathcal{N}}$, or at least using homogeneous coordinates. This can also be identified as the total space of the line bundle $O(-1)$ over $PT$. On this space, we have the Euler homogeneity vector field $\Upsilon$, and a canonically defined holomorphic volume form $\Omega$ (an integral form in this supersymmetric context) of weight $4 - \mathcal{N}$, the tautological form pulled back from $\mathrm{Ber}(PT) \cong O(\mathcal{N} - 4)$, satisfying $L_\Upsilon \Omega = (4 - \mathcal{N})\,\Omega$, where $L_\Upsilon$ is the Lie derivative along $\Upsilon$. Similarly, $\tau$ will be a well-defined differential one-form of weight 2. See Appendix 5 for further discussion.
3.1. Deformations of twistor space. For simplicity, we take the supertwistor space $PT$ to be a deformation of flat twistor space $PT_{[\mathcal{N}]}$ with homogeneous coordinates as in the flat case given by$^{7}$

$$Z^I = (\omega^\alpha, \theta^i, \pi_{\dot\alpha}) = (\omega^A, \pi_{\dot\alpha}) = (Z^a, \theta^i), \qquad (3.1)$$

where the latter form distinguishes between the odd coordinates $\theta^i$ and the even coordinates $Z^a$. We also assume that we are given an 'infinity twistor', a constant graded skew bi-vector

$$I^{IJ} := \mathrm{diag}(P^{AB}, \Lambda\,\epsilon_{\dot\alpha\dot\beta}), \qquad (3.2a)$$

where

$$P^{AB} = \mathrm{diag}(\epsilon^{\alpha\beta}, P^{ij}) \quad\text{and}\quad P^{ij} = P^{(ij)}. \qquad (3.2b)$$

$^{7}$ We could take a finite deformation of any curved integrable twistor space, but would then need more coordinate patches.
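When $\Lambda = 0$, we will take $P^{ij}$ to be diagonal with $r$ ones and $\mathcal{N} - r$ zeroes along the diagonal. We also introduce the graded Poisson structure on homogeneous functions $f$ and $g$ by

$$[f, g\} := (-)^{p_I(p_f+1)}\, (\partial_I f)\, I^{IJ}\, (\partial_J g), \qquad (3.3a)$$

where we introduce the notation

$$\partial_I := \frac{\partial}{\partial Z^I}, \quad\text{and we will also use}\quad \bar\partial_{\bar I} := \frac{\partial}{\partial \bar Z^{\bar I}}. \qquad (3.3b)$$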
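To make the bracket (3.3a) concrete, the following Python sketch implements its purely bosonic part on polynomials in $Z^a = (\omega^0, \omega^1, \pi_{\dot 0}, \pi_{\dot 1})$. It is a toy model under stated assumptions: the odd coordinates and all grading signs are dropped, the infinity twistor is taken to be the block matrix $\mathrm{diag}(\epsilon, \Lambda\epsilon)$ with $\epsilon^{01} = 1$ (index positions are ignored), and the polynomial representation (dictionaries from exponent tuples to coefficients) and all function names are ours.

```python
from itertools import product

EPS = [[0, 1], [-1, 0]]


def infinity_twistor(lam):
    """Bosonic infinity twistor diag(eps, lam * eps) as a 4x4 matrix (assumed convention)."""
    I = [[0] * 4 for _ in range(4)]
    for a, b in product(range(2), repeat=2):
        I[a][b] = EPS[a][b]
        I[2 + a][2 + b] = lam * EPS[a][b]
    return I


def add(p, q, c=1):
    """Return p + c*q for polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for m, v in q.items():
        out[m] = out.get(m, 0) + c * v
    return {m: v for m, v in out.items() if v != 0}


def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (m1, v1), (m2, v2) in product(p.items(), q.items()):
        m = tuple(i + j for i, j in zip(m1, m2))
        out[m] = out.get(m, 0) + v1 * v2
    return {m: v for m, v in out.items() if v != 0}


def diff(p, k):
    """Partial derivative with respect to the k-th coordinate."""
    out = {}
    for m, v in p.items():
        if m[k] > 0:
            m2 = m[:k] + (m[k] - 1,) + m[k + 1:]
            out[m2] = out.get(m2, 0) + v * m[k]
    return out


def bracket(f, g, I):
    """Bosonic bracket [f, g] = (d_a f) I^{ab} (d_b g)."""
    out = {}
    for a, b in product(range(4), repeat=2):
        if I[a][b] != 0:
            out = add(out, mul(diff(f, a), diff(g, b)), I[a][b])
    return out


def degrees(p):
    """Set of total degrees of the monomials of p."""
    return {sum(m) for m in p}


if __name__ == "__main__":
    I = infinity_twistor(3)
    f = {(1, 0, 0, 1): 1}   # omega^0 * pi_1
    g = {(0, 1, 1, 0): 1}   # omega^1 * pi_0
    print(bracket(f, g, I), degrees(bracket(f, g, I)))
```

With two weight-2 inputs the bracket is again of weight 2, reflecting the weight $-2$ of the Poisson structure. Setting `lam = 0` makes every bracket with $\pi_{\dot\alpha}$ vanish in this toy model, which is the degenerate case in which the $\pi_{\dot\alpha}$ define a projection to $\mathbb{CP}^1$.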
Infinitesimally, a deformation of the almost complex structure is represented by a holomorphic tangent bundle valued $(0,1)$-form $j$, where the deformed and undeformed anti-holomorphic exterior derivatives are related by $\bar\partial = \bar\partial_0 + j$. The first order part of the integrability condition (assuming that $\bar\partial_0^2 = 0$) is $\bar\partial_0 j = 0$. An infinitesimal diffeomorphism induced by the real part of a $(1,0)$-vector field $X$ gives rise to the deformation $j := -\bar\partial_0 X$, so that the infinitesimal deformations of the complex structure modulo those obtained by infinitesimal diffeomorphisms define an element of the Dolbeault cohomology group $H^1(PT, T^{(1,0)}PT)$. In order to impose the Einstein or vacuum conditions, we will also demand that the deformation preserves the Poisson structure $\Pi = -I^{JI}\,\partial_I \wedge \partial_J$ of weight $-2$. In this linearised context, we can ensure this by requiring that the deforming vector fields $j$ preserve the Poisson structure, $L_j \Pi = 0$, where $L$ is the Lie derivative. This will follow if $j$ is Hamiltonian with respect to $\Pi$, i.e., if there exists a $(0,1)$-form $h$ of weight 2 such that

$$j = \Pi \lrcorner\, dh = (-)^{p_I}\, (\partial_I h)\, I^{IJ}\, \partial_J. \qquad (3.4)$$
If $h = \bar\partial\chi$, we see that $j$ is $\bar\partial$ of the Hamiltonian vector field generated by $\chi$ and so is pure gauge. Thus such deformations correspond to $h$ taken to be Dolbeault representatives for elements of $H^1(PT, O(2))$. The Penrose transform gives the identification between elements of $H^1(PT, O(2))$ and linearised self-dual gravitational fields [P68, P76], and in the supersymmetric case this will give the whole associated linearised gravitational supermultiplet. We now consider a finite deformation, again determined by $h = d\bar Z^{\bar a} h_{\bar a}$ which, at this stage, is an arbitrary (even) smooth function of $(Z^I, \bar Z^{\bar a})$, homogeneous of degree 2 in $Z^I$ and 0 in $\bar Z^{\bar I}$, holomorphic in the $\theta^i$ and satisfying $\bar Z^{\bar a} h_{\bar a} = 0$; we will never allow any dependence on the complex conjugates of the fermionic coordinates. We then define the distribution $T^{(0,1)}PT$ of anti-holomorphic tangent vectors on $PT$ by

$$T^{(0,1)}PT := \mathrm{span}\{\bar D_{\bar I}\} := \mathrm{span}\big\{ \bar\partial_{\bar a} + (-)^{p_I}\, (\partial_I h_{\bar a})\, I^{IJ}\, \partial_J,\ \bar\partial_{\bar i} \big\}. \qquad (3.5)$$
This is to be understood as a finite perturbation of the standard complex structure on flat supertwistor space with $\bar\partial$-operator $\bar\partial_0 = d\bar Z^{\bar I}\, \bar\partial_{\bar I}$.$^{8}$ The complex structure can be equivalently determined by specifying the space of $(1,0)$-forms

$$\Omega^{(1,0)}PT := \mathrm{span}\{DZ^I\} := \mathrm{span}\{dZ^I + I^{IJ}\partial_J h\}. \qquad (3.6)$$

The integrability condition for this distribution is

$$I^{IJ}\partial_J\big( \bar\partial_{\bar a} h_{\bar b} - \bar\partial_{\bar b} h_{\bar a} + [h_{\bar a}, h_{\bar b}\} \big) = 0 \iff I^{IJ}\partial_J\big( \bar\partial_0 h + \tfrac{1}{2}[h, h\} \big) = 0, \qquad (3.7)$$
where the wedge product in the last expression is understood. When this equation is satisfied, not only is the almost complex structure integrable, but also the Poisson bracket of two holomorphic functions is again holomorphic. In the case that $\Lambda = 0$, when the Poisson structure is degenerate, the coordinates $\pi_{\dot\alpha}$ and $\theta^{r+1}, \ldots, \theta^{\mathcal{N}}$ are holomorphic and define a projection to $\mathbb{CP}^{1|\mathcal{N}-r}$ as required for the characterization of a twistor space for a self-dual vacuum solution. Thus, in this case, Eq. (3.7) is the main field equation. In the Einstein case, we must produce a holomorphic contact structure. On the flat twistor space, introduce the contact structure

$$\tau_0 = dZ^I Z^J I_{JI}, \qquad (3.8a)$$

where

$$(-)^{p_K} I_{IK} I^{KJ} = \Lambda\,\delta_I{}^J \quad\text{and}\quad I_{IJ} = \mathrm{diag}(\Lambda P_{AB}, \epsilon^{\dot\alpha\dot\beta}). \qquad (3.8b)$$

For the Einstein case, from Thm. 2, we need to know that we have a holomorphic contact structure on the deformed space. The deformed one can be taken to be

$$\tau := DZ^I Z^J I_{JI} = dZ^I Z^J I_{JI} + Z^J \underbrace{(-)^{p_I} I_{JI} I^{IK}}_{=\,\Lambda\,\delta_J{}^K} \partial_K h = \tau_0 + 2\Lambda h, \qquad (3.9)$$

where the last equation follows from the homogeneity relation $Z^I\partial_I h = 2h$. The condition that $\bar\partial\tau = 0 \Leftrightarrow \bar D_{\bar I} \lrcorner\, d\tau = 0$ is

$$F^{(0,2)} := \bar\partial_0 h + \tfrac{1}{2}[h, h\} = 0. \qquad (3.10)$$

Thus, integrability of the complex structure follows from the holomorphy of the contact structure when $\Lambda \neq 0$. (When $\Lambda = 0$, $\tau_0$ remains holomorphic trivially.) Thus, not only is (3.10) our main equation in the Einstein case, it also implies (3.7) in the other cases, and so we will focus on this as the main equation in what follows. The choice of the Poisson structure reduces the diffeomorphism freedom to (infinitesimal) Hamiltonian coordinate transformations of the form

$$\delta Z^I = [Z^I, \chi\}, \qquad h \to h + \delta h, \quad\text{with}\quad \delta h = \bar\partial_0\chi + [h, \chi\}, \qquad (3.11)$$

where $\chi$ is some smooth function of weight 2. Under this transformation, the 'curvature' $F^{(0,2)}$ behaves as $F^{(0,2)} \to F^{(0,2)} + \delta F^{(0,2)}$ with $\delta F^{(0,2)} = [F^{(0,2)}, \chi\}$. Thus, the field equation (3.10) is invariant under these transformations.

$^{8}$ As in the linearised context, we eventually want to impose the Einstein condition on the space-time manifold. Therefore, we are only interested in a subclass of (finite) deformations $\bar\partial_0 \to \bar\partial_0 + j$ with $j$ given by $j = d\bar Z^{\bar a}\, j_{\bar a}{}^I \partial_I = d\bar Z^{\bar a}\, (-)^{p_I}\, \partial_I h_{\bar a}\, I^{IJ} \partial_J$.
We can see that, at least in linear theory, $h$ encodes a supergravity multiplet as follows. The form $h$ may be expanded in the odd coordinates as

$$h = h_0 + \sum_{r=1}^{\mathcal{N}} \frac{1}{r!}\, \theta^{i_1} \cdots \theta^{i_r}\, h_{i_1 \cdots i_r}. \qquad (3.12)$$
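The field content carried by the expansion (3.12) can be tabulated with a short Python sketch. It rests on two counting assumptions: the coefficient $h_{i_1 \cdots i_r}$ is totally antisymmetric and so has $\binom{\mathcal{N}}{r}$ components, and a coefficient of weight $2 - r$ corresponds under the Penrose transform to helicity $(4 - r)/2$. The function name `spectrum` is ours.

```python
from fractions import Fraction
from math import comb


def spectrum(n_susy):
    """List of (helicity, multiplicity) pairs for the components of h.

    The term with r odd coordinates contributes helicity (4 - r)/2
    with multiplicity binomial(n_susy, r).
    """
    return [(Fraction(4 - r, 2), comb(n_susy, r)) for r in range(n_susy + 1)]


if __name__ == "__main__":
    for helicity, multiplicity in spectrum(8):
        print(f"helicity {str(helicity):>4}: multiplicity {multiplicity}")
    print("total states:", sum(m for _, m in spectrum(8)))
```

For `n_susy = 8` the multiplicities are 1, 8, 28, 56, 70, 56, 28, 8, 1 for helicities running from $2$ down to $-2$ in half-integer steps, 256 states in total.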
If we further linearise (3.10) around the trivial solution $h = 0$, it tells us that $\bar\partial_0 h = 0$, or equivalently, $\bar\partial_0 h_0 = 0 = \bar\partial_0 h_{i_1 \cdots i_r}$. Because of the gauge invariance (3.11), which at the linearised level reduces to $\delta h = \bar\partial_0\chi$, we see that $h_0 \in H^1(P\mathcal{T}, O(2))$ and $h_{i_1 \cdots i_r} \in H^1(P\mathcal{T}, O(2 - r))$, where $P\mathcal{T}$ represents the body of the supermanifold $PT$ (so that $P\mathcal{T}$ is a finite deformation of $PT_{[0]}$). By virtue of the Penrose transform [P68], $h_0$ corresponds on space-time to a helicity $s = 2$ field while $h_{i_1 \cdots i_r}$ corresponds to a helicity $s = (4 - r)/2$ field. Hence, for maximal $\mathcal{N} = 8$ supersymmetry, we find

$$(s_m) = \big( -2_{1},\ -\tfrac{3}{2}{}_{8},\ -1_{28},\ -\tfrac{1}{2}{}_{56},\ 0_{70},\ \tfrac{1}{2}{}_{56},\ 1_{28},\ \tfrac{3}{2}{}_{8},\ 2_{1} \big),$$

which is precisely the (on-shell) spectrum of $\mathcal{N} = 8$ Einstein supergravity; the subscript '$m$' refers to the respective multiplicity. Altogether, we see that a single element $h \in H^1(PT, O(2))$ encodes the full particle content of maximally supersymmetric linearised Einstein gravity in four dimensions. In this linearised context, it is straightforward to see how the gauging works. The bundle of R-symmetry generators on twistor space is the tangent bundle to the odd directions spanned by $\partial/\partial\theta^i$. The linearised variation in the $\bar\partial$-operator on this bundle is $P^{ik}\,\partial^2 h/\partial\theta^j\partial\theta^k$ because the part of $\bar\partial(f^i\,\partial/\partial\theta^i)$ tangent to the odd directions is $(\bar\partial f^i + P^{ik}\,\partial^2 h/\partial\theta^j\partial\theta^k\, f^j)\,\partial/\partial\theta^i$. Because the $\theta^i$ anti-commute, $\partial^2 h/\partial\theta^i\partial\theta^j$ is skew symmetric in $ij$. Thus, in the case of non-degenerate $P^{ij}$, this gives an element of the Lie algebra of $SO(\mathcal{N}, \mathbb{C})$, and so corresponds to the maximal gauging of the R-symmetry, with gauge group $SO(\mathcal{N}, \mathbb{C})$. When $P^{ij}$ has rank $r$, for $r < \mathcal{N}$, the gauging of the R-symmetry will be reduced to the subgroup of $SO(\mathcal{N}, \mathbb{C})$ that preserves $P^{ij}$. In Appendix A, where we compare our approach with that of [KK98], we also make some comments on the space-time fields in the non-linearised setting for zero cosmological constant.

3.2. Action functionals.
We will be interested in integrating Lagrangian densities over twistor space, for which we will need the holomorphic volume integral form

$$\Omega_{\mathcal{N}} = D(DZ^I) = \tfrac{1}{4!}\, \epsilon_{abcd}\, Z^a\, DZ^b \wedge DZ^c \wedge DZ^d \otimes \bigotimes_{i=1}^{\mathcal{N}} D\theta^i, \qquad (3.13)$$
which has weight $4 - \mathcal{N}$ on account of the Berezinian integration rule $\int d\theta^i\, \theta^j = \delta^{ij}$, implying $d(\lambda\theta^i) = \lambda^{-1} d\theta^i$ for $\lambda \in \mathbb{C}^*$. Here, we use Manin's notation [M88] to denote integral forms associated with a given basis of differential one-forms. We will not integrate over any complex conjugated odd coordinates. For maximal supersymmetry, $\mathcal{N} = 8$, we can write down an action functional reproducing the field equations (3.10) and hence also (3.7),

$$S[h] = \int \Omega_8 \wedge \big( h \wedge \bar\partial_0 h + \tfrac{1}{3}\, h \wedge [h, h\} \big) = \int \Omega^{(0)}_8 \wedge \big( h \wedge \bar\partial_0 h + \tfrac{1}{3}\, h \wedge [h, h\} \big), \qquad (3.14)$$

where the integral form

$$\Omega^{(0)}_8 = D(dZ^I) = \tfrac{1}{4!}\, \epsilon_{abcd}\, Z^a\, dZ^b \wedge dZ^c \wedge dZ^d \otimes \bigotimes_{i=1}^{8} d\theta^i. \qquad (3.15)$$
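The weight bookkeeping for an action of the form (3.14) can be checked mechanically. The sketch below assumes only the weights used in this section: $h$ has weight 2, the bracket $[\cdot,\cdot\}$ has weight $-2$, and the integral form for $\mathcal{N}$ supersymmetries has weight $4 - \mathcal{N}$. The function name `action_weight` is ours.

```python
def action_weight(n_susy, w_h=2, w_bracket=-2):
    """Total homogeneity weights of the two terms in the integrand.

    Returns (weight of Omega ^ h ^ dbar h, weight of Omega ^ h ^ [h, h}).
    """
    w_measure = 4 - n_susy
    kinetic = w_measure + 2 * w_h
    cubic = w_measure + 3 * w_h + w_bracket
    return kinetic, cubic


if __name__ == "__main__":
    balanced = [n for n in range(0, 17) if action_weight(n) == (0, 0)]
    print("values of N with a weightless integrand:", balanced)
```

Both terms carry weight $8 - \mathcal{N}$, so the integrand descends to the projective space only for $\mathcal{N} = 8$.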
It can be seen that the weights balance as $h$ has weight 2, $[\cdot,\cdot\}$ weight $-2$ and $\Omega_8$ (respectively, $\Omega^{(0)}_8$) has weight $-4$. This is the only value of $\mathcal{N}$ for which there is such a balance. The action (3.14) is invariant under (3.11). This follows from the Bianchi identity for $F^{(0,2)}$,

$$\bar\partial_0 F^{(0,2)} + [h, F^{(0,2)}\} = 0, \qquad (3.16)$$

implied by the (graded) Jacobi identity for the Poisson structure. It is clear that the almost complex structure, integrability conditions and action formulation (the latter for $\mathcal{N} = 8$) only depend on the Poisson structure $I^{IJ}$ and not on $I_{IJ}$ directly. It is also clear that if $I^{IJ}$ is degenerate, the above field equations and action (the latter for $\mathcal{N} = 8$) all make good sense, although the action most directly yields (3.10) rather than the superficially weaker Eq. (3.7), which is sufficient to determine the relevant structures on the deformed twistor space. The action (3.14) can be compared with the Kodaira-Spencer actions introduced in [BCOV94], the compendium of topological M-theory related actions in [DGNV05] and the Lagrange multiplier-type action involving the Nijenhuis tensor given in [BW04] in the $\mathcal{N} = 4$ case. Our action is local, in contradistinction to the non-local Kodaira-Spencer action. Our action is given for a non-Calabi-Yau space (due to the isomorphism (B.5), the holomorphic Berezinian is only trivial when $\mathcal{N} = 4$). Ours is most closely related to that in Berkovits & Witten, although our basic variable is the one-form $h$, which is a "potential" for the deformation $j$ considered in deformation theory (i.e. $j$ is a holomorphic derivative of $h$), and the action is most naturally expressed for $\mathcal{N} = 8$ rather than $\mathcal{N} = 4$. We close this subsection by discussing the cases with $\mathcal{N} < 8$ supersymmetries. We start from the action (3.14) with $\mathcal{N} = 8$ but restrict the dependence of $h$ on $\theta^i$ by requiring invariance under an $SO(8 - \mathcal{N}, \mathbb{C})$ subgroup of the R-symmetry. Thus, we set

$$h = f + \theta^{\mathcal{N}+1} \cdots \theta^{8}\, b, \qquad (3.17)$$
where $f$ and $b$ are now one-forms depending on the bosonic twistor coordinates and $\theta^1, \ldots, \theta^{\mathcal{N}}$, $f$ has weight 2, and $b$ has weight $\mathcal{N} - 6$. We can now integrate out the anti-commuting variables $\theta^{\mathcal{N}+1}, \ldots, \theta^8$ and integrate by parts to obtain the action

$$S[b, f] = \int \Omega_{\mathcal{N}} \wedge b \wedge \big( \bar\partial_0 f + \tfrac{1}{2}[f, f\} \big). \qquad (3.18)$$

This action is now of 'BF' form where $b$ acts as a Lagrange multiplier for the field equation

$$\bar\partial_0 f + \tfrac{1}{2}[f, f\} = 0, \qquad (3.19)$$

which, as we have seen, implies that integrability of the complex structure is compatible with a holomorphic Poisson structure. Varying $f$ yields the equation

$$\bar\partial_f b = 0 \qquad (3.20)$$
and, together with the gauge freedom $b \to b + \bar\partial_f\chi$, this implies that $b$ defines an element of the cohomology group $H^1(PT, O(\mathcal{N} - 6))$ and so is the Penrose transform of a superfield of helicity $-2 + \mathcal{N}/2$.

4. Covariant Approach, Covariant Action for $\mathcal{N} = 0$ and Special Geometry

The above actions are non-covariant in the sense that they explicitly depend on the chosen background one has started with, so that diffeomorphism invariance is broken. This is normal in the context of Chern-Simons actions, for which a frame of the Yang-Mills bundle must be chosen. Nevertheless, we will see that at least for $\tau$ non-degenerate and $\mathcal{N} = 0$ we can give a covariant version. The geometric structure we are concerned with here is closely related to a (real) six-dimensional special geometry introduced by [CE03]. In their geometry, a real rank-4 distribution (subbundle of the tangent bundle) $D$ is introduced and, if suitably non-degenerate and satisfying a positivity condition, it is shown that there is a canonically defined almost complex structure $J$ for which the distribution is an almost complex contact distribution. Furthermore, the obstruction to the integrability of $J$ is identified. Our situation is somewhat different in that the primary structure on a smooth manifold $P$ is a complex one-form $\tau$ defined up to complex rescalings (or more abstractly, a complex line bundle $L^* \subset \mathbb{C}T^*P := \mathbb{C} \otimes T^*P$). This is more information in the sense that $D$ is defined directly as the kernel of $\tau$, but $\tau$ is only defined by $D$ up to $\tau \to a\tau + b\bar\tau$, where $a, b$ are complex valued functions on $P$. Given $D$, there is a unique choice of $\tau$ that is compatible with the Cap-Eastwood almost complex structure but, a priori, one does not know if that is the $\tau$ that has been chosen. Our analogue of the Cap-Eastwood theorem works in higher dimensions also and we state it in greater generality than we need.

Theorem 3.
Suppose that on a (smooth) manifold $P$ of dimension $4n + 2$ we are given a complex line subbundle $L^* \subset \mathbb{C}T^*P$, represented by a complex one-form $\tau$ defined up to complex rescalings. Suppose further that

$$\tau \wedge (d\tau)^{n+1} = 0 \quad\text{and}\quad \tau \wedge (d\tau)^n \wedge \bar\tau \wedge (d\bar\tau)^n \neq 0,$$
then there is a unique integrable almost complex structure for which $\tau$ is proportional to a non-degenerate holomorphic contact structure. Here, $(d\tau)^n := d\tau \wedge \cdots \wedge d\tau$ ($n$-times).

Proof. We claim that, with the assumptions above, the $(2n+1)$-form $\tau \wedge (d\tau)^n$ is simple, i.e., that the space of vectors $X \in \Gamma(P, \mathbb{C}TP)$ such that $X \lrcorner (\tau \wedge d\tau) = 0$ is $(2n+1)$-dimensional. This follows because the kernel of $\tau$ is $(4n+1)$-dimensional, whereas $d\tau$ defines a skew form on this kernel and so must have even rank. However, its rank is less than $2n + 2$ by $\tau \wedge (d\tau)^{n+1} = 0$ but greater than or equal to $2n$ because $\tau \wedge (d\tau)^n \neq 0$. Hence, the kernel of $\tau \wedge (d\tau)^n$ is $(2n+1)$-dimensional and we will take this kernel to be the space of anti-holomorphic tangent vectors spanning $T^{(0,1)}P$. The condition that $T^{(0,1)}P$ should contain no real vectors follows from the second assumption of the theorem. We have that $X \lrcorner (\tau \wedge (d\tau)^n) = 0 \Leftrightarrow X \lrcorner (\tau \wedge d\tau) = 0$ and we will use this latter characterisation of $T^{(0,1)}P$ in the following. We now consider the integrability of the distribution. Let $X$ and $Y$ satisfy

$$X \lrcorner (\tau \wedge d\tau) = 0 = Y \lrcorner (\tau \wedge d\tau). \qquad (4.1)$$

Then clearly

$$X \lrcorner\, \tau = 0 = Y \lrcorner\, \tau \quad\text{and}\quad \tau \wedge (X \lrcorner\, d\tau) = 0, \qquad (4.2)$$

so that $X \lrcorner\, d\tau \propto \tau$ and $L_X \tau \propto \tau$, and similarly for $Y$. Here, $L_X$ denotes the Lie derivative along $X$. Thus,

$$[X, Y] \lrcorner\, \tau = X(Y \lrcorner\, \tau) - Y(X \lrcorner\, \tau) - X \lrcorner (Y \lrcorner\, d\tau) = 0, \qquad (4.3)$$

since $X \lrcorner\, \tau = 0 = Y \lrcorner\, \tau$ by assumption and so $X \lrcorner (Y \lrcorner\, d\tau) = 0$ from above. Furthermore,

$$[X, Y] \lrcorner (\tau \wedge d\tau) = -\tau \wedge ([X, Y] \lrcorner\, d\tau) = -\tau \wedge \big([X, Y] \lrcorner\, d\tau + d([X, Y] \lrcorner\, \tau)\big) = -\tau \wedge (L_{[X,Y]}\tau) = -\tau \wedge (L_X L_Y \tau - L_Y L_X \tau) = 0, \qquad (4.4)$$

since $L_X \tau = X \lrcorner\, d\tau \propto \tau$, so $L_X L_Y \tau \propto \tau$. Thus, the almost complex structure is integrable.
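The two hypotheses of Thm. 3 can be tested at a point on a simple example with a small exterior-algebra routine in Python. The example is our own illustrative choice, not taken from the text: the flat contact form $\tau = dz^1 + z^2\, dz^3$ on $\mathbb{C}^3$ (so $n = 1$), with the six complexified covectors $dz^1, dz^2, dz^3, d\bar z^1, d\bar z^2, d\bar z^3$ labelled $0, \ldots, 5$. Forms are stored as dictionaries from increasing index tuples to coefficients, and the function name `wedge` is ours.

```python
from itertools import product


def wedge(a, b):
    """Wedge product of two constant-coefficient forms.

    A form is a dict mapping a strictly increasing tuple of basis
    indices to its coefficient.
    """
    out = {}
    for (ia, ca), (ib, cb) in product(a.items(), b.items()):
        if set(ia) & set(ib):
            continue  # a repeated basis one-form gives zero
        idx = list(ia + ib)
        # sign of the permutation that sorts idx, via its inversion count
        sign = 1
        for i in range(len(idx)):
            for j in range(i + 1, len(idx)):
                if idx[i] > idx[j]:
                    sign = -sign
        key = tuple(sorted(idx))
        out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}


# Values at a sample point with z^2 = 2 + i (any point works for this example).
z2 = 2 + 1j
tau = {(0,): 1, (2,): z2}                     # dz^1 + z^2 dz^3
dtau = {(1, 2): 1}                            # dz^2 ^ dz^3
tau_bar = {(3,): 1, (5,): z2.conjugate()}     # complex conjugate of tau
dtau_bar = {(4, 5): 1}

field_equation = wedge(tau, wedge(dtau, dtau))
open_condition = wedge(wedge(tau, dtau), wedge(tau_bar, dtau_bar))

if __name__ == "__main__":
    print("tau ^ (dtau)^2 =", field_equation)
    print("tau ^ dtau ^ taubar ^ dtaubar =", open_condition)
```

Here $\tau \wedge (d\tau)^2$ vanishes while $\tau \wedge d\tau \wedge \bar\tau \wedge d\bar\tau$ is a non-zero multiple of the volume form, so both assumptions of the theorem hold for this example with $n = 1$.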
In the twistor context, we will take $P$ to be a six-dimensional manifold with topology $U \times S^2$ with $U \subset \mathbb{R}^4$ and, as before, we shall denote it by $P\mathcal{T}$. With this theorem, then, our data is simply a complex line subbundle $L^* \subset \mathbb{C}T^*P\mathcal{T}$ represented by a differential one-form $\tau$ with values in $L$ subject to the open condition $\tau \wedge d\tau \wedge \bar\tau \wedge d\bar\tau \neq 0$. We will also require that the line bundle $L$ has Chern class 2. The field equation is $\tau \wedge (d\tau)^2 = 0$. The $\mathcal{N} = 0$ action above is simply

$$S[b, \tau] = \int b \wedge \tau \wedge (d\tau)^2, \qquad (4.5)$$

where $b \in \Omega^1 P\mathcal{T} \otimes (L^*)^3$ is a Lagrange multiplier. Clearly, the field equation obtained by varying $b$ is $\tau \wedge (d\tau)^2 = 0$, as desired. The action is clearly diffeomorphism invariant, and enjoys a gauge invariance given by $\tau \to \chi\tau$ and $b \to \chi^{-3} b$, where $\chi$ is a non-vanishing complex-valued function on $P\mathcal{T}$. This gauge freedom corresponds to the fact that $\tau$ takes values in a line bundle $L$, which we shall also denote by $O(2)$ since it becomes that on-shell, and hence $b$ is a differential one-form with values in $O(-6)$. The action is also invariant under $b \to b + \gamma$, where $\gamma \wedge \tau \wedge (d\tau)^2 = 0$, and the space of such $\gamma$ is two-dimensional when the field equations are not satisfied, but three-dimensional when they are. (When they are satisfied, this freedom can be used to ensure that $b$ is a $(0,1)$-form.) There is also a gauge freedom in $b$ obtained as follows. We can define a partial connection $\bar\partial$ on $O(n)$ by defining, for $\chi$ now assumed to be a section of $O(-6)$, $\bar\partial\chi$ to be the differential one-form modulo the kernel of $\bar\partial\chi \to \bar\partial\chi \wedge \tau \wedge (d\tau)^2$ defined by $\bar\partial\chi \wedge \tau \wedge (d\tau)^2 := d(\chi\, \tau \wedge (d\tau)^2)$. It is clear from this definition that the integrand of the action evaluated on such a $b = \bar\partial\chi$ is a boundary integral and so this represents a gauge freedom. On-shell, the above definition becomes trivial, and $\bar\partial\chi$ needs to be defined a little differently by $\bar\partial\chi^{2/3} \wedge (\tau \wedge d\tau) := d(\chi^{2/3}\, \tau \wedge d\tau)$, and in this case it leads to an honest $\bar\partial$-operator on the line bundles $O(n)$.
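The rescaling invariance $\tau \to \chi\tau$, $b \to \chi^{-3} b$ of the action (4.5) can be verified in a few lines. The following is a sketch of the computation, using only $\tau \wedge \tau = 0$ and the fact that two-forms commute under the wedge product:

```latex
\begin{align*}
d(\chi\tau) &= d\chi \wedge \tau + \chi\, d\tau, \\
\big(d(\chi\tau)\big)^2 &= \chi^2 (d\tau)^2 + 2\chi\, d\chi \wedge \tau \wedge d\tau
  && \text{since } (d\chi \wedge \tau)^2 = 0, \\
\chi\tau \wedge \big(d(\chi\tau)\big)^2
  &= \chi^3\, \tau \wedge (d\tau)^2 + 2\chi^2\, \tau \wedge d\chi \wedge \tau \wedge d\tau \\
  &= \chi^3\, \tau \wedge (d\tau)^2
  && \text{since } \tau \wedge d\chi \wedge \tau = 0 .
\end{align*}
```

The factor $\chi^3$ is then cancelled by $b \to \chi^{-3} b$, so the integrand $b \wedge \tau \wedge (d\tau)^2$ is unchanged.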
The field equation for $b$ is

$$db \wedge \tau \wedge d\tau - \tfrac{3}{2}\, b \wedge (d\tau)^2 = 0 \qquad (4.6)$$

and when the field equation for $\tau$ is satisfied, this is the $\bar\partial$-closure condition for sections of $\Omega^{(0,1)} P\mathcal{T} \otimes O(-6)$. Taking into account the gauge freedom $b \to b + \bar\partial\chi$ with $\chi$ a section of $O(-6)$, $b$ will correspond to an element of $H^1(P\mathcal{T}, O(-6))$. Thus, solutions to the field equations correspond to a complex three-dimensional manifold $P\mathcal{T}$ with holomorphic contact structure $\tau$, and the condition on the Chern class of $L$ implies that it satisfies the topological assumption of Ward's theorem, so that, if it contains a holomorphic rational curve of degree one in the $S^2$-factor, then it corresponds to a space-time $M$ with self-dual Einstein metric. The field $b \in H^1(P\mathcal{T}, O(-6))$ then corresponds via the Penrose transform to a right-handed linearised gravitational field propagating on that self-dual background. Thus, we have the self-dual sector of non-supersymmetric Einstein gravity.

4.1. The supersymmetric case. In the supersymmetric situation, we will assume that $PT$ is a smooth supermanifold with six real bosonic dimensions and $\mathcal{N}$ complex fermionic dimensions. Without loss of generality, we can always assume that the supermanifold is split in the smooth category [B79], that locally the odd coordinates are $\theta^i$, $i = 1, \ldots, \mathcal{N}$, and that we will only ever have holomorphic dependence on $\theta^i$; their complex conjugates will not enter the formalism, so, in particular, the transition functions for the supermanifold will be holomorphic in $\theta^i$.$^{9}$ We can still encode the structure of a supersymmetric non-linear graviton into a complex contact form $\tau$ as follows. We will assume that $\tau$ is a complex differential one-form on the supermanifold $PT$, again with only holomorphic dependence on the $\theta^i$, i.e., $\tau = dx^a \tau_a + d\theta^i \tau_i$, where the $x^a$ are the real bosonic coordinates on $PT$, $a = 1, \ldots, 6$, and $\tau_a$ and $\tau_i$ are holomorphic in $\theta^i$ with $\tau_i$ odd and $\tau_a$ even functions on $PT$.
On the body of the supermanifold, $\theta^i = 0$, we can assume that we have the equations $\tau \wedge (d\tau)^2 = 0$ as before, but these will not hold when $\theta^i \neq 0$, even for standard flat supertwistor space as, in general, $(d\theta)^n \neq 0$ for all $n$ for an odd variable $\theta$. Thus, we cannot express the conditions we need quite so simply in the supersymmetric case. Nevertheless, much of Thm. 3 works in the supersymmetric case also. We will require firstly, as a genericity assumption, that the complexified kernel $\mathbb{C}D$ of $\tau$ has dimension $5|2\mathcal{N}$ (here we are taking $\partial/\partial\theta^i$ and $\partial/\partial\bar\theta^{\bar i}$ to be independent). Secondly, we require that on this complexified kernel of $\tau$, the two-form $d\tau$ has rank $2|\mathcal{N}$ so that the kernel of $\tau \wedge d\tau$ is $3|\mathcal{N}$-dimensional and further, that $\ker(\tau \wedge d\tau)$ has no real vectors, i.e.

$$\ker(\tau \wedge d\tau) \cap \overline{\ker(\tau \wedge d\tau)} = \{0\}. \qquad (4.7)$$

The fact that we have required that $\tau$ depends only on $\theta^i$ and not $\bar\theta^{\bar i}$ means that $d\tau$ annihilates $\partial/\partial\bar\theta^{\bar i}$, for $i = 1, \ldots, \mathcal{N}$, and so the rank of $d\tau$ is at most $5|\mathcal{N}$ in any case. With these assumptions, the proof of Thm. 3 follows without modification to show that $\ker(\tau \wedge d\tau)$ is integrable and that $\tau$ is a holomorphic complex contact structure so that

$$T^{(0,1)}PT := \ker(\tau \wedge d\tau). \qquad (4.8)$$
The main field equation is therefore the condition that $\tau \wedge d\tau$ annihilates a complex distribution of dimension $3|\mathcal{N}$. In the supersymmetric context, we do not yet have an equation on $\tau$ analogous to the bosonic equation $\tau \wedge (d\tau)^{n+1} = 0$ for higher dimensional complex contact structures, nor an action that produces this condition as its Euler-Lagrange equation. As a consequence, we have so far been unable to find a covariant supersymmetric action functional.

$^{9}$ In a Dolbeault context, this assumption is, in effect, a gauge choice.
5. Conclusions

Given that these actions are 'Chern-Simons-like', one is led to ask the extent to which they can be interpreted coherently as holomorphic Chern-Simons theories. Clearly, in some sense, the gauge group should be taken to be the diffeomorphisms of the supertwistor space that preserve the holomorphic Poisson structure. This is most easily made sense of in a complexified context so that the holomorphic twistor variables are freed up and become independent from the conjugate twistor variables. Then the theory becomes a complexified Chern-Simons theory with gauge group the holomorphic contact transformations of the holomorphic supertwistor space, a region in $\mathbb{CP}^{3|8}$, on the conjugate supertwistor space (which is just $\mathbb{CP}^3$ as we have no anti-holomorphic fermionic coordinates). A similar connection between the self-dual vacuum equations and a gauge theory with a diffeomorphism group as gauge group was given on space-time in [MN89] (here the gauge theory was the self-dual Yang-Mills equations); see also [W07] for a supersymmetric extension thereof. The fact that Thm. 3 works in $4n + 2$ dimensions is suggestive of applications of this framework to the twistor theory for quaternionic Kähler manifolds with non-zero scalar curvature in $4n$ dimensions. It is straightforward to write down a Lagrange multiplier action $\int b \wedge \tau \wedge (d\tau)^{n+1}$ analogous to our $\mathcal{N} = 0$ action, but with $b$ a $(2n - 1)$-form, although in this context the interpretation of $b$ is less clear. An attractive feature is that we have a fully supersymmetrically invariant and Lorentz invariant off-shell formulation of the theory. However, we have so far been unable to find an action functional of $\mathcal{N} = 8$ self-dual supergravity that does not depend on a given integrable background. Such an action functional would, however, be desirable as one would hope for an explicitly diffeomorphism invariant action principle for $\mathcal{N} = 8$ self-dual supergravity.
In particular, if one wishes to be able to extend the ideas to the full theory along the lines of [M05] for conformal supergravity,$^{10}$ then it would seem awkward to have to identify a Minkowski background. A task for the future is to start with the superfield expansions (in the non-linear setting) of $\tau$ and $h$ and reproduce the covariant form of the field equations and of the action functional of $\mathcal{N} = 8$ self-dual supergravity in four dimensions as given in [S92].$^{11}$ In the zero cosmological constant case, our twistor action and field equations must correspond via the Penrose transform to Siegel's results.

Acknowledgements. We would like to thank Alexander Popov for a number of important contributions to this work and Mohab Abou-Zeid, Rutger Boels, Daniel Fox, Chris Hull, Riccardo Ricci, Christian Sämann and David Skinner for useful discussions. We would also like to thank the referee for useful suggestions. The first author is partially supported by the EU through the FP6 Marie Curie RTN ENIGMA (contract number MRTN–CT–2004–5652) and through the ESF MISGAM network. The second author was supported in part by the EU under the MRTN contract MRTN–CT–2004–005104 and by STFC under the rolling grant PP/D0744X/1.
$^{10}$ See also [A-ZH06] for a space-time action for expanding about the self-dual sector in the case of Einstein gravity.
$^{11}$ Similar expansions for certain supersymmetric gauge theories were performed in [PW04, PS05], Sämann (2005), [PSW05] and [LS06].

Appendix A. Prepotential Formulation

The subject of this appendix is the comparison of the approach of [KK98] with ours. Their formulation is based on an anti-holomorphic involution which picks a real slice of split signature in complexified space-time. Pretty much the same holds true, however, for Euclidean signature and it is this latter case we are interested in here. As already indicated, this works only for an even number of supersymmetries. In the following, we shall use conventions from [W06].

A.1. Real structures on $PT_{[\mathcal{N}]}$ and $M_{[\mathcal{N}]}$. Let us first consider the supertwistor space
PT[N] = CP3|N \CP1|N with (homogeneous) coordinates (ω A , πα˙ ) for flat super spacetime M[N] ∼ = C4|2N. An Euclidean signature real slice follows from the anti-holomorphic involution without fixed points ρ : PT[N] → PT[N] given by ˙
(ωˆ A , πˆ α˙ ) := ρ(ω A , πα˙ ) := (ω¯ B C B A , Cα˙ β π¯ β˙ ),
(A.1)
where bar denotes complex conjugation and (C A B ) = diag((Cα β ), (Ci j )), with 0 1 ˙ . (A.2) (Cα β ) = , (Ci j ) = diag( , . . . , ), (Cα˙ β ) = − , := −1 0 N 2 −times
We can extend ρ to a map from a holomorphic function f on PT[N] another holomorphic function by ρ( f (· · · )) := f (ρ(· · · )).
(A.3)
By virtue of the incidence relation, ω A = x Aα˙ πα˙ , we obtain an induced involution on M[N] explicitly given by ˙
ρ(x Aα˙ ) = −x¯ B β C B A Cβ˙ α˙ .
(A.4)
We shall use the same notation ρ for the anti-holomorphic involution induced on the different (super)manifolds in the twistor correspondence. The fixed point set of this involution, that is, $\rho(x) = x$ for $x \in M_{[N]}$, defines Euclidean right-chiral superspace $M^\rho_{[N]} \cong \mathbb{R}^{4|2N}$ inside $M_{[N]}$. Following [AHS78], the supertwistor space $\mathbb{PT}_{[N]}$ can be identified with
\[
\mathcal{O}(1)^{\oplus 2|N} \to \mathbb{CP}^1
\tag{A.5}
\]
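and so it can be covered by two (acyclic) coordinate patches $U_\pm$ and coordinatised by $(\omega^A_\pm, \pi_\pm)$, where $\omega^A_\pm$ are local fibre coordinates with $\omega^A_+ := \omega^A/\pi_{\dot 0}$, $\omega^A_- := \omega^A/\pi_{\dot 1}$, and $\pi_+ := \pi_{\dot 1}/\pi_{\dot 0}$, $\pi_- := \pi_{\dot 0}/\pi_{\dot 1}$ are the standard local holomorphic coordinates on $\mathbb{CP}^1$, with $\pi_+ = \pi_-^{-1}$ on $U_+ \cap U_- \subset \mathbb{PT}_{[N]}$. On the other hand, since $\mathbb{PT}_{[N]}$ is diffeomorphic to $M^\rho_{[N]} \times S^2 \cong \mathbb{R}^{4|2N} \times S^2$, one may equivalently coordinatise it by using $(x^{A\dot\alpha}, \lambda_\pm)$, where $\lambda_\pm$ are the standard local holomorphic coordinates on $S^2 \cong \mathbb{CP}^1$. Note that $(\omega^A_\pm, \pi_\pm) = (x^{A\dot\alpha}\lambda^\pm_{\dot\alpha}, \lambda_\pm)$, where
\[
(\lambda_+^{\dot\alpha}) := \begin{pmatrix} \lambda_+ \\ -1 \end{pmatrix}
\quad\text{and}\quad
(\lambda_-^{\dot\alpha}) := \begin{pmatrix} 1 \\ -\lambda_- \end{pmatrix}.
\tag{A.6}
\]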
Twistor Actions for Self-Dual Supergravities
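The explicit inverse transformation laws are simply
\[
x^{A\dot\alpha} = \frac{\omega^A_\pm \hat\pi_\pm^{\dot\alpha} - \hat\omega^A_\pm \pi_\pm^{\dot\alpha}}{\hat\pi_\pm^{\dot\beta}\,\pi^\pm_{\dot\beta}},
\tag{A.7}
\]
where $\pi_\pm^{\dot\alpha}$ are similarly defined as in (A.6). Altogether, we have obtained a non-holomorphic fibration
\[
\pi : \mathbb{PT}_{[N]} \to M^\rho_{[N]}.
\tag{A.8}
\]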
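Introduce
\[
(\hat\lambda_+^{\dot\alpha}) := \begin{pmatrix} 1 \\ \bar\lambda_+ \end{pmatrix}, \quad
(\hat\lambda_-^{\dot\alpha}) := \begin{pmatrix} \bar\lambda_- \\ 1 \end{pmatrix}, \quad
\gamma_\pm^{-1} := \hat\lambda_\pm^{\dot\alpha}\lambda^\pm_{\dot\alpha} = 1 + \lambda_\pm\bar\lambda_\pm,
\tag{A.9}
\]
as for $\hat\pi_\pm^{\dot\alpha} = \rho(\pi_\pm^{\dot\alpha})$. Then, due to the above diffeomorphism, we have the following transformation laws between the coordinate vector fields:
\[
\frac{\partial}{\partial\omega^A_\pm} = \gamma_\pm \hat\lambda_\pm^{\dot\alpha}\frac{\partial}{\partial x^{A\dot\alpha}},
\tag{A.10a}
\]
\[
\frac{\partial}{\partial\pi_+} = \frac{\partial}{\partial\lambda_+} - \gamma_+ x^{A\dot 1}\hat\lambda_+^{\dot\alpha}\frac{\partial}{\partial x^{A\dot\alpha}},
\tag{A.10b}
\]
\[
\frac{\partial}{\partial\pi_-} = \frac{\partial}{\partial\lambda_-} - \gamma_- x^{A\dot 0}\hat\lambda_-^{\dot\alpha}\frac{\partial}{\partial x^{A\dot\alpha}}
\tag{A.10c}
\]
for the holomorphic tangent vector fields and
\[
\frac{\partial}{\partial\bar\omega^A_\pm} = -\gamma_\pm C_A{}^B \lambda_\pm^{\dot\alpha}\frac{\partial}{\partial x^{B\dot\alpha}},
\tag{A.10d}
\]
\[
\frac{\partial}{\partial\bar\pi_+} = \frac{\partial}{\partial\bar\lambda_+} - \gamma_+ x^{A\dot 0}\lambda_+^{\dot\alpha}\frac{\partial}{\partial x^{A\dot\alpha}},
\tag{A.10e}
\]
\[
\frac{\partial}{\partial\bar\pi_-} = \frac{\partial}{\partial\bar\lambda_-} + \gamma_- x^{A\dot 1}\lambda_-^{\dot\alpha}\frac{\partial}{\partial x^{A\dot\alpha}}
\tag{A.10f}
\]
for the anti-holomorphic ones.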
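A.2. Comparison of the two approaches. In what follows, we shall restrict our discussion to the $U_+$-patch only and for notational simplicity suppress the patch index. Of course, a similar discussion carries over to the $U_-$-patch. To begin with, let us write down the field Eqs. (3.10) more explicitly. If we let the deformation be $h = d\bar\omega^{\bar\alpha} h_{\bar\alpha} + d\bar\pi\, h_{\bar\pi}$, they read as
\[
\frac{\partial}{\partial\bar\omega^{\bar\alpha}} h_{\bar\beta} - \frac{\partial}{\partial\bar\omega^{\bar\beta}} h_{\bar\alpha} + [h_{\bar\alpha}, h_{\bar\beta}\} = 0,
\tag{A.11a}
\]
\[
\frac{\partial}{\partial\bar\pi} h_{\bar\alpha} - \frac{\partial}{\partial\bar\omega^{\bar\alpha}} h_{\bar\pi} + [h_{\bar\pi}, h_{\bar\alpha}\} = 0.
\tag{A.11b}
\]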
Using the incidence relation ω A = x Aα˙ πα˙ and the involutions introduced in the preceding subsection, h can also be expressed in the coordinates (x Aα˙ , λ) as ˙
h = −γ λˆ β˙ dx α β α + dλ¯ λ¯ ,
(A.12)
˙
where α := −γ −1 Cα β h β¯ and λ¯ := h π¯ + γ x α 0 α . In order to compare our approach with those by [KK98], we notice that their formulation deals with the ‘vacuum case’, i.e. with the case of vanishing cosmological constant. Upon also recalling point (iii) of Thm. 2, we must therefore ensure that the fibration of the supertwistor space is preserved, and so (i) h is of the form ˙
h = −γ λˆ β˙ dx α β α ,
(A.13)
˙
i.e. λ¯ = 0 ⇔ h π¯ = −γ x α 0 α and (ii) the relative symplectic structure needs to be preserved which amounts to requiring a degeneracy of the Poisson structure ω = (I I J ) introduced in Sect. 3.1 according to ω = (I AB ). Notice further that α must be of weight 3 in order for h to be of weight 2. Some algebra then reveals that in the ‘vacuum case’ the above equations for h α¯ and h π¯ translate into the following set:
αβ ∂¯α β + 21 αβ [α , β } = 0, ∂λ¯ α + γ
−2 βγ
(∂β α )γ = 0,
(A.14a) (A.14b)
where ∂¯ A := λα˙ ∂/∂ x Aα˙ and ∂ A := γ λˆ α˙ ∂/∂ x Aα˙ . Before going any further, let us say a few words about gauge symmetries. The original equations for h transformed covariantly under gauge transformations of the form h → h + δh, with δh = ∂¯0 χ + [h, χ } for some function χ of weight 2. However, the above equations will no longer transform covariantly under generic gauge transformations, since we have incorporated the constraint λ¯ = 0. Nevertheless, some residual gauge symmetry remains, and which is determined as follows. In order to preserve the con˙ straint λ¯ = 0, we must have δh π¯ = −γ x α 0 δα , where δα = −γ −1 Cα β δh β¯ , i.e. transformations of h π¯ are determined by those of h α¯ . It is not difficult to verify that the remaining gauge symmetry is given by the following transformation laws: δα = −(∂¯α χ + [α , χ }),
with
∂λ¯ χ + γ −2 βγ (∂β χ )γ = 0. (A.15)
In particular, the last of these equations shows that the 2nd equation for α from above does not constrain α any further, so that the only remaining field equation we are left with is
αβ ∂¯α β + 21 αβ [α , β } = 0.
(A.16)
Since in particular α = ∂α (see also [W85]), where is some function of weight 4 (recall that α is of weight 3) and ω = (I AB ), we end up with + 21 αβ (−) p A ∂ A ∂α I AB ∂ B ∂β = 0 which is [KK98] result.
and
:= αβ ∂¯α ∂β , (A.17)
As before, in the case of maximal supersymmetry, N = 8, the field Eqs. (A.17) can be derived from an action principle, S[] = d vol + 3!1 αβ (−) p A ∂ A ∂α I AB ∂ B ∂β , (A.18a) where the measure d vol is given by d vol = d4 x γ 2 dλdλ¯ dθ 1 · · · dθ 8 .
(A.18b)
It remains to give the superfield expansion of . For brevity, let us only discuss the N = 8 case. We find = g + θ i ψi + θ i1 i2 A[i1 i2 ] + θ i1 i2 i3 χi1 i2 i3 + θ i1 i2 i3 i4 φi1 i2 i3 i4 + θi1 i2 i3 χ˜ i1 i2 i3 + θi1 i2 A˜ i1 i2 + θi ψ˜ i + θ g, ˜
(A.19)
where
\[
\theta^{i_1\cdots i_r} := \frac{1}{r!}\,\theta^{i_1}\cdots\theta^{i_r}, \quad\text{for } r = 1, \ldots, 4,
\tag{A.20a}
\]
\[
\theta_{i_1\cdots i_{8-r}} := \frac{1}{r!}\,\epsilon_{i_1\cdots i_{8-r} i_{9-r}\cdots i_8}\,\theta^{i_{9-r}}\cdots\theta^{i_8}, \quad\text{for } r = 5, \ldots, 8.
\tag{A.20b}
\]
Here, $\epsilon_{i_1\cdots i_8} = \epsilon_{[i_1\cdots i_8]}$ and $\epsilon_{1\cdots 8} = 1$. Keeping in mind (A.13), we find the following space-time fields:

Table 1. Space-time fields and their helicities and multiplicities

Field          g     ψ     A     χ     φ     χ̃      Ã     ψ̃      g̃
Helicity       2     3/2   1     1/2   0     −1/2   −1    −3/2   −2
Multiplicity   1     8     28    56    70    56     28    8      1
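The multiplicities in Table 1 can be cross-checked by a short count. The sketch below assumes, as read off from the table and the expansion (A.19), that the coefficient of r factors of θ has helicity 2 − r/2 and is totally antisymmetric in r indices taking eight values; the function name `multiplet_rows` is ours and purely illustrative.

```python
from fractions import Fraction
from math import comb

N_SUSY = 8  # maximal supersymmetry, the case tabulated in Table 1


def multiplet_rows():
    """Return (r, helicity, multiplicity) for the coefficient of r thetas.

    Assumption (read off Table 1): the r-th coefficient in (A.19) has
    helicity 2 - r/2 and is totally antisymmetric in r indices taking
    N_SUSY values, so it has comb(N_SUSY, r) independent components.
    """
    return [(r, Fraction(2) - Fraction(r, 2), comb(N_SUSY, r))
            for r in range(N_SUSY + 1)]


if __name__ == "__main__":
    for r, helicity, multiplicity in multiplet_rows():
        print(f"r={r}  helicity={helicity}  multiplicity={multiplicity}")
    print("total components:", sum(m for _, _, m in multiplet_rows()))
```

The binomial coefficients reproduce the multiplicity row of the table, and the total number of components is 2^8.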
Appendix B. Holomorphic Volume Forms and Non-Projective Twistor Space It is often convenient to work on the non-projective twistor space T as many of the geometric structures can be formulated globally there and sections of the line bundles O(n) become ordinary functions of weight n under the action of the Euler vector field ϒ = Z I ∂/∂ Z I . In the curved case, as in the proof of Theorem 2, the non-projective space can be defined as the quotient of the non-projective co-spin bundle S ∗ by DF . We can also define it intrinsically as follows. In the bosonic case, given a contact structure defined by a one-form τ with values in a line bundle L, we can see that τ ∧ dτ defines a (non-vanishing) section of
(3,0) P T ⊗ L 2 . Thus, we must have L −2 ∼ = (3,0) P T . In the flat case, non-projective 4 ∼ twistor space T[0] = C is the total space of the (tautological) line bundle O(−1) over the projective twistor space PT[0] , and (3,0) PT[0] ∼ = O(−4). In the general (non-supersymmetric) case, we can define the non-projective twistor space T to be the total space of the line bundle O(−1) now defined to be the 4th root of (3,0) P T . If so, we see that L∼ = O(2). The non-projective space has an Euler vector field ϒ that generates the C∗ action on the fibres of O(−1). The weights of functions and forms pulled back from P T are translated into the weights along ϒ on the non-projective space. In this context,
τ defines a 1-form of weight 2 on the non-projective space, and the non-degeneracy of the contact structure translates into the condition that the two-form dτ is non-degenerate as a two-form on T and being closed defines a holomorphic symplectic structure. Its inverse therefore defines a non-degenerate holomorphic Poisson structure on T of weight −2. This descends to give a Poisson structure on P T with values in O(−2). We can extend this reasoning to the supersymmetric case as follows. We again consider a holomorphic differential one-form τ with values in a complex line bundle L . It defines as its kernel the contact distribution D, which now is of rank 2|N, leading to a short exact sequence as follows: 0 −→ D −→ T (1,0) PT −→ L −→ 0 .
(B.1)
Since we assume that τ defines a non-degenerate holomorphic contact structure, dτ provides a non-degenerate skew form on D. Taking its Berezinian, we get an element Ber(dτ |D ) ∈ L 2−N ⊗ (Ber D)−2 .
(B.2)
(This follows from the fact that in the definition of the Berezinian, the odd-odd part of the matrix is inverted before its determinant is taken leading to inverse weights associated to the odd directions relative to their bosonic counterparts.) When L has a square root, we can take its square root to get an isomorphism (Ber(dτ |D )) : Ber D → L 1−N/2 . (B.3) The above exact sequence then gives an identification Ber T (1,0) PT ∼ = Ber D ⊗ L ∼ = L 2−N/2 ,
(B.4)
and so finally we obtain the isomorphism Ber(PT ) := Ber (1,0) PT ∼ = L N/2−2 .
(B.5)
We will take the body of the supertwistor space to have topology $U \times S^2$, where $U$ is an open subset of $\mathbb{R}^4$ (or more generally the total space of the projective co-spin bundle of a real smooth spin four-manifold $M$). The assumption on the normal bundle of a rational curve in supertwistor space implies that the holomorphic Berezinian bundle $\mathrm{Ber}(\mathcal{PT})$ has Chern class $N - 4$, and with the topological assumptions we have made, this will have an $|N - 4|$-th root and we may introduce the (consistent) notation $\mathcal{O}(n) := (\mathrm{Ber}(\mathcal{PT}))^{n/(N-4)}$. Thus, $L \cong \mathcal{O}(2)$ and $\mathrm{Ber}(\mathcal{PT}) \cong \mathcal{O}(N - 4)$.

Appendix C. Supersymmetric BF-Type Theory

In this appendix we wish to present an alternative interpretation of the holomorphic Chern-Simons-type theory (3.14). We shall see that this theory can be viewed as a certain supersymmetric holomorphic BF-type theory. In what follows, we will borrow ideas of [W89]. To begin with, consider some (0|2)-dimensional space $T$ with odd coordinates $\psi^1$ and $\psi^2$, which we collectively denote by $\psi^\alpha$. On $\mathcal{PT} \times T$, we may introduce a (0, 1)-form $H$ of weight 2 according to
\[
H = h + \psi^\alpha\chi_\alpha + \psi^1\psi^2 b.
\tag{C.1}
\]
Here, h and b are even and χα are odd (0, 1)-forms of weight 2 on PT . As before, we ¯ assume that these fields have no dependence on the θ¯ i coordinates. In analogy to (3.14), we may consider the action functional
(0) 1 2 S[b, h, χα ] = dψ dψ (C.2)
8 ∧ H ∧ ∂¯0 H + 13 H ∧ [H, H } . A short calculation reveals that this action reduces after integration over the ψ α coordinates to (0) S[h, b, χα ] =
8 ∧ b ∧ F (0,2) − 21 αβ χα ∧ (∂¯0 χβ + [h, χβ }) . (C.3) The equations of motion that follow from this action are F (0,2) = 0, ∂¯0 b + [h, b} = 21 αβ [χα , χβ },
∂¯0 χα + [h, χα } = 0.
(C.4a) (C.4b) (C.4c)
The first equation is the field equation (3.10). Note that for $\chi_\alpha = 0$ we get (3.18). The supersymmetry transformations are straightforwardly worked out as they follow from infinitesimal translations in the odd coordinates $\psi^\alpha$. We find
\[
\delta_\alpha h = \chi_\alpha, \quad \delta_\alpha\chi_\beta = \epsilon_{\alpha\beta}\, b \quad\text{and}\quad \delta_\alpha b = 0,
\tag{C.5}
\]
with $\{\delta_\alpha, \delta_\beta\} = 0$. Therefore, the supersymmetric holomorphic BF-type action (C.3) can also be written as
\[
S[h, b, \chi_\alpha] = -\tfrac{1}{2}\,\delta_1\delta_2 S[h],
\tag{C.6}
\]
where S[h] is the action (3.14).

References

[A-ZH06] Abou-Zeid, M., Hull, C.M.: A chiral perturbation expansion for gravity. JHEP 0602, 057 (2006)
[A-ZHM08] Abou-Zeid, M., Hull, C.M., Mason, L.J.: Einstein supergravity and new twistor string theories. Commun. Math. Phys. 282, 519 (2008)
[AHS78] Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. Lond. A 362, 425 (1978)
[BE91] Bailey, T.N., Eastwood, M.G.: Complex paraconformal manifolds - their differential geometry and twistor theory. Forum. Math. 3, 61 (1991)
[B79] Batchelor, M.: The structure of supermanifolds. Trans. Amer. Math. Soc. 253, 329 (1979)
[BS92] Bergshoeff, E., Sezgin, E.: Self-dual supergravity theories in (2 + 2)-dimensions. Phys. Lett. B 292, 87 (1992)
[B04] Berkovits, N.: An alternative string theory in twistor space for N = 4 super Yang-Mills. Phys. Rev. Lett. 93, 011601 (2004)
[BW04] Berkovits, N., Witten, E.: Conformal supergravity in twistor-string theory. JHEP 0408, 009 (2004)
[BDR07] Bern, Z., Dixon, L.J., Roiban, R.: Is N = 8 supergravity ultraviolet finite? Phys. Lett. B 644, 265 (2007)
[BCOV94] Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311 (1994)
[B-Betal06] Bjerrum-Bohr, N.E.J., Dunbar, D.C., Ita, H., Perkins, W.B., Risager, K.: The no-triangle hypothesis for N = 8 supergravity. JHEP 0612, 072 (2006)
[BMS07a] Boels, R., Mason, L.J., Skinner, D.: Supersymmetric gauge theories in twistor space. JHEP 0702, 014 (2007a)
[BMS07b] Boels, R., Mason, L.J., Skinner, D.: From twistor actions to MHV diagrams. Phys. Lett. B 648, 90 (2007b)
[CE03] Cap, A., Eastwood, M.G.: Some special geometry in dimension six. In: Proc. of the 22nd Winter School, Geometry and physics (Srni 2002), Rend. Circ. Mat. Palermo (2) Suppl. No. 71, 93 (2003)
[CDDG79] Christensen, S.M., Deser, S., Duff, M.J., Grisaru, M.T.: Chirality, self-duality, and supergravity counterterms. Phys. Lett. B 84, 411 (1979)
[DGNV05] Dijkgraaf, R., Gukov, S., Neitzke, A., Vafa, C.: Topological M-theory as unification of form theories of gravity. Adv. Theor. Math. Phys. 9, 603 (2005)
[GRV07] Green, M.B., Russo, J.G., Vanhove, P.: Ultraviolet properties of maximal supergravity. Phys. Rev. Lett. 98, 131602 (2007)
[K79] Kallosh, R.E.: Super self-duality. JETP Lett. 29, 172 [Pisma Zh. Eksp. Teor. Fiz. 29, 192] (1979)
[K80] Kallosh, R.E.: Self-duality in superspace. Nucl. Phys. B 165, 119 (1980)
[KK98] Karnas, S., Ketov, S.V.: An action of N = 8 self-dual supergravity in ultra-hyperbolic harmonic superspace. Nucl. Phys. B 526, 597 (1998)
[KNG92] Ketov, S.V., Nishino, H., Gates, S.J.J.: Self-dual supersymmetry and supergravity in Atiyah-Ward space-time. Nucl. Phys. B 393, 149 (1992). See also Phys. Lett. B 297, 323 (1992), Phys. Lett. B 307, 331 (1993), Phys. Lett. B 307, 323 (1993)
[LS06] Lechtenfeld, O., Sämann, C.: Matrix models and D-branes in twistor string theory. JHEP 0603, 002 (2006)
[M88] Manin, Yu.I.: Gauge field theory and complex geometry. New York: Springer Verlag, 1988 [Russian: Moscow: Nauka, 1984]
[M05] Mason, L.J.: Twistor actions for non-self-dual fields: A derivation of twistor string theory. JHEP 0510, 009 (2005)
[MN89] Mason, L.J., Newman, E.T.: A connection between the Einstein and Yang-Mills equations. Commun. Math. Phys. 121, 659 (1989)
[MS06] Mason, L.J., Skinner, D.: An ambitwistor Yang-Mills Lagrangian. Phys. Lett. B 636, 60 (2006)
[MS08] Mason, L.J., Skinner, D.: Heterotic twistor-string theory. Nucl. Phys. B 795, 105 (2008)
[MW96] Mason, L.J., Woodhouse, N.M.J.: Integrability, self-duality, and twistor theory. Oxford: Clarendon Press, 1996
[M91] Merkulov, S.A.: Paraconformal supermanifolds and non-standard N-extended supergravity models. Class. Quant. Grav. 8, 557 (1991)
[M92a] Merkulov, S.A.: Supersymmetric non-linear graviton. Funct. Anal. Appl. 26, 69 (1992a)
[M92b] Merkulov, S.A.: Simple supergravity, supersymmetric non-linear gravitons and supertwistor theory. Class. Quant. Grav. 9, 2369 (1992b)
[M92c] Merkulov, S.A.: Quaternionic, quaternionic Kähler, and hyper-Kähler supermanifolds. Lett. Math. Phys. 25, 7 (1992c)
[N08] Nair, V.P.: A note on graviton amplitudes for new twistor string theories. Phys. Rev. D 78, 041501 (2008)
[P68] Penrose, R.: Twistor quantization and curved space-time. Int. J. Theor. Phys. 1, 61 (1968)
[P76] Penrose, R.: Non-linear gravitons and curved twistor theory. Gen. Rel. Grav. 7, 31 (1976)
[PW04] Popov, A.D., Wolf, M.: Topological B model on weighted projective spaces and self-dual models in four dimensions. JHEP 0409, 007 (2004)
[PS05] Popov, A.D., Sämann, C.: On supertwistors, the Penrose-Ward transform and N = 4 super Yang-Mills theory. Adv. Theor. Math. Phys. 9, 931 (2005)
[PSW05] Popov, A.D., Sämann, C., Wolf, M.: The topological B model on a mini-supertwistor space and supersymmetric Bogomolny monopole equations. JHEP 0510, 058 (2005)
[S05] Sämann, C.: The topological B model on fattened complex manifolds and subsectors of N = 4 self-dual Yang-Mills theory. JHEP 0501, 042 (2005)
[S06] Sämann, C.: Aspects of twistor geometry and supersymmetric field theories within superstring theory. Ph.D. thesis, Leibniz University of Hannover, available at http://arXiv.org/list/hep-th/0603098, 2006
[S92] Siegel, W.: Self-dual N = 8 supergravity as closed N = 2 (N = 4) strings. Phys. Rev. D 47, 2504 (1992)
[S95] Sokatchev, E.S.: Action for N = 4 supersymmetric self-dual Yang-Mills theory. Phys. Rev. D 53, 2062 (1995)
[St06] Stelle, K.S.: Counterterms, holonomy and supersymmetry. In: Deserfest: A celebration of the Life and works of Stanley Deser, Ann Arbor Michigan, 2004, Liu, J.T., Duff, M.J., Stelle, K.S., Woodward, R.P., (eds.), River Edge, NJ: World Scientific, 2006, p. 303
[W86] Waintrob, A.Yu.: Deformations and moduli of supermanifolds. In: Group theoretical methods in physics, Vol. 1, Moscow: Nauka, 1986
[W80] Ward, R.S.: Self-dual space-times with cosmological constants. Commun. Math. Phys. 78, 1 (1980)
[WW90] Ward, R.S., Wells, R.O.: Twistor geometry and field theory. Cambridge: Cambridge University Press, 1990
[W89] Witten, E.: Topology changing amplitudes in (2 + 1)-dimensional gravity. Nucl. Phys. B 323, 113 (1989)
[W04] Witten, E.: Perturbative gauge theory as a string theory in twistor space. Commun. Math. Phys. 252, 189 (2004)
[W06] Wolf, M.: On supertwistor geometry and integrability in super gauge theory. Ph.D. thesis, Leibniz University of Hannover, available at http://arXiv.org/list/hep-th/0611013, 2006
[W07] Wolf, M.: Self-dual supergravity and twistor theory. Class. Quant. Grav. 24, 6287 (2007)
[W85] Woodhouse, N.M.J.: Real methods in twistor theory. Class. Quant. Grav. 2, 257 (1985)
Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 125–144 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0768-6
Communications in
Mathematical Physics
Asymptotic Stability of Lattice Solitons in the Energy Space
Tetsu Mizumachi
Faculty of Mathematics, Kyushu University, Hakozaki 6-10-1, Fukuoka 812-8581, Japan. E-mail: [email protected]
Received: 13 November 2007 / Accepted: 19 December 2008
Published online: 14 March 2009 – © Springer-Verlag 2009
Abstract: Orbital and asymptotic stability for 1-soliton solutions of the Toda lattice equations as well as for small solitary waves of the FPU lattice equations are established in the energy space. Unlike analogous Hamiltonian PDEs, the lattice equations do not conserve the adjoint momentum. In fact, the Toda lattice equation is a bidirectional model that does not fit in with the existing theory for the Hamiltonian systems by Grillakis, Shatah and Strauss. To prove stability of 1-soliton solutions, we split a solution around a 1-soliton into a small solution that moves more slowly than the main solitary wave and an exponentially localized part. We apply a decay estimate for solutions to a linearized Toda equation which has been recently proved by Mizumachi and Pego to estimate the localized part. We improve the asymptotic stability results for FPU lattices in a weighted space obtained by Friesecke and Pego.
1. Introduction

In this paper, we study asymptotic stability of solitary waves for a class of Hamiltonian systems of particles connected by nonlinear springs. A typical model of these lattices is the Toda lattice
\[
\ddot q(t,n) = e^{-(q(t,n)-q(t,n-1))} - e^{-(q(t,n+1)-q(t,n))} \quad\text{for } t \in \mathbb{R} \text{ and } n \in \mathbb{Z},
\tag{1}
\]
where $q(t,n)$ denotes the displacement of the $n$-th particle at time $t$ and the dot denotes differentiation with respect to $t$. Let $p(t,n) = \dot q(t,n)$, $r(t,n) = q(t,n+1) - q(t,n)$, $u(t,n) = {}^t(r(t,n), p(t,n))$ and $V(r) = e^{-r} - 1 + r$. The Toda lattice (1) is an integrable system with the Hamiltonian
\[
H(u(t)) = \sum_{n\in\mathbb{Z}} \Bigl( \tfrac{1}{2}\, p(t,n)^2 + V(r(t,n)) \Bigr)
\]
(see [7]) and can be rewritten as
\[
\frac{du}{dt} = J H'(u),
\tag{2}
\]
where
\[
J = \begin{pmatrix} 0 & e^{\partial} - 1 \\ 1 - e^{-\partial} & 0 \end{pmatrix},
\]
and $e^{\pm\partial} = e^{\pm\partial/\partial n}$ are the shift operators defined by $(e^{\pm\partial}f)(n) = f(n \pm 1)$ for every sequence $\{f(n)\}_{n\in\mathbb{Z}}$ and $H'$ is the Fréchet derivative of $H$ in $l^2 \times l^2$. The Toda lattice (2) has a two-parameter family of solitary waves $\mathcal{M} = \{u_c(t+\delta) \mid c > 1,\ \delta \in \mathbb{R}\}$, where $u_c(t,n) = \tilde u_c(n - ct)$, $\tilde u_c(x) = {}^t(\tilde r_c(x), \tilde p_c(x))$ and
\[
\tilde q_c(x) = \log\frac{\cosh\{\kappa(x-1)\}}{\cosh\kappa x},
\tag{3}
\]
\[
\tilde p_c(x) = -c\,\partial_x \tilde q_c(x), \quad \tilde r_c(x) = \tilde q_c(x+1) - \tilde q_c(x),
\tag{4}
\]
and $\kappa = \kappa(c)$ is the unique positive solution of $c = \sinh\kappa/\kappa$.

Friesecke and Pego [9,10] have proved asymptotic stability of solitary waves to FPU lattices in a weighted space assuming an exponential linear stability property (H1) below. To state the assumption explicitly, we introduce some notation. Let $l^2_a$ be a Hilbert space of $\mathbb{R}^2$-sequences equipped with the norm
\[
\|u\|_{l^2_a} = \Bigl( \sum_{n\in\mathbb{Z}} e^{2an}|u(n)|^2 \Bigr)^{1/2}.
\]
Let $\langle u, v\rangle := \sum_{n\in\mathbb{Z}} (u_1(n)v_1(n) + u_2(n)v_2(n))$ for $\mathbb{R}^2$-sequences $u = (u_1, u_2)$ and $v = (v_1, v_2)$, and $\|u\|_{l^2} = \langle u, u\rangle^{1/2}$.

(H1) Let $a > 0$ be a small number. There exist positive numbers $K$ and $\beta$ such that if
\[
\langle v(s), J^{-1}\dot u_c(s)\rangle = \langle v(s), J^{-1}\partial_c u_c(s)\rangle = 0,
\tag{5}
\]
then a solution to
\[
\frac{dv}{dt} = J H''(u_c(t))\, v
\tag{6}
\]
satisfies
\[
\|e^{a(\cdot - ct)} v(t,\cdot)\|_{l^2} \le K e^{-\beta(t-s)} \|e^{a(\cdot - cs)} v(s)\|_{l^2} \quad\text{for every } t \ge s.
\tag{7}
\]

Remark 1. Solutions $\dot u_c(t)$ and $\partial_c u_c(t)$ to (6) correspond to infinitesimal changes in $t$ and $c$, and they do not decay as $t \to \infty$. Since $J^{-1}\dot u_c(t)$ and $J^{-1}\partial_c u_c(t)$ are the corresponding neutral modes of the adjoint equation
\[
\frac{dw}{dt} = H''(u_c(t))\, J w,
\]
the condition (H1) says that a solution to (6) decays exponentially as $t \to \infty$ if it does not include the neutral modes $\dot u_c(t)$ and $\partial_c u_c(t)$.
Remark 2. In (5), we set
\[
J^{-1} = \begin{pmatrix} 0 & \sum_{k=-\infty}^{0} e^{k\partial} \\ \sum_{k=-\infty}^{-1} e^{k\partial} & 0 \end{pmatrix}.
\]
(Note that $J^{-1}$ is a bounded operator in $l^2_{-a}$ if $a > 0$, since $\|e^{-\partial}u\|_{l^2_{-a}} = e^{-a}\|u\|_{l^2_{-a}}$ for $u \in l^2_{-a}$.) Since $\dot u_c$ and $\partial_c u_c$ decay like $e^{-2\kappa|n-ct|}$ as $n \to \pm\infty$, we have $J^{-1}\dot u_c,\ J^{-1}\partial_c u_c \in l^2_{-a}$ for every $a \in (0, 2\kappa(c))$.
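The operator $J^{-1}$ of Remark 2 acts on finitely supported sequences by partial summation from the left, which makes its behaviour at $n \to +\infty$ easy to see. The following is a minimal sketch under our own conventions, not taken from the paper: sequences are stored as dictionaries `{site: value}`, and the helper names `partial_sums`, `apply_J_inverse` and `apply_J` are hypothetical.

```python
def partial_sums(f, inclusive):
    """Return g with g(n) = sum of f(m) over m <= n (or over m < n).

    f is a dict {site: value} with finite support; unlisted sites are 0.
    inclusive=True realises sum_{k<=0} e^{k d}; False realises sum_{k<=-1} e^{k d}.
    """
    def g(n):
        return sum(v for m, v in f.items() if (m <= n if inclusive else m < n))
    return g


def apply_J_inverse(u1, u2):
    """J^{-1}(u1, u2) = (sum_{k<=0} e^{k d} u2, sum_{k<=-1} e^{k d} u1)."""
    return partial_sums(u2, inclusive=True), partial_sums(u1, inclusive=False)


def apply_J(w1, w2):
    """J(w1, w2) = ((e^{d} - 1) w2, (1 - e^{-d}) w1) for functions on Z."""
    return (lambda n: w2(n + 1) - w2(n)), (lambda n: w1(n) - w1(n - 1))


if __name__ == "__main__":
    u1 = {0: 1.0, 1: -2.0, 3: 0.5}
    u2 = {-1: 2.0, 2: 1.0}
    w1, w2 = apply_J_inverse(u1, u2)
    v1, v2 = apply_J(w1, w2)
    # J J^{-1} should return the original pair site by site.
    print([v1(n) for n in range(-2, 5)], [v2(n) for n in range(-2, 5)])
    # J^{-1}u is constant (generally nonzero) to the right of the support.
    print(w1(100), w2(100))
```

The last line illustrates why $J^{-1}u$ lies in $l^2_{-a}$ rather than in $l^2$: the partial sums vanish to the left of the support but settle at the total sum to the right.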
Friesecke and Pego prove in [9] that solitary waves of FPU lattices are asymptotically stable in $l^2_a$ if (H1) holds. They have also proved in [10,11] that small solitary waves of FPU lattices can be approximated by KdV solitons and that they satisfy (H1). In [18], we use the linearized Bäcklund transformation to show that every 1-soliton of the Toda lattice satisfies (H1) and prove that it is asymptotically stable in $l^2_a$ without assuming smallness of solitons. Our goal in the present paper is to prove asymptotic stability of 1-solitons in $l^2$.

Theorem 1. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$ and let $u(t)$ be a solution to (2) with $u(0) = u_{c_0}(\tau_0) + v_0$. For every $\varepsilon > 0$, there exists a positive number $\delta > 0$ satisfying the following: If $\|v_0\|_{l^2} < \delta$, there exist constants $c_+ > 1$ and $\sigma \in (1, c_+)$ and a $C^1$-function $x(t)$ such that
\[
\|u(t) - \tilde u_{c_0}(\cdot - x(t))\|_{l^2} < \varepsilon,
\]
\[
\lim_{t\to\infty} \|u(t) - \tilde u_{c_+}(\cdot - x(t))\|_{l^2(n \ge \sigma t)} = 0,
\]
\[
\sup_{t\in\mathbb{R}} \bigl( |c(t) - c_0| + |\dot x(t) - c_0| \bigr) = O(\|v_0\|_{l^2}),
\]
\[
\lim_{t\to\infty} c(t) = c_+, \qquad \lim_{t\to\infty} \dot x(t) = c_+.
\]
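The solitary waves (3)–(4) entering Theorem 1 can be probed numerically. The sketch below is only an illustration under our own assumptions (function names, the bisection tolerance and the truncation window are ours): it solves $c = \sinh\kappa/\kappa$ for $\kappa$, checks that $q(t,n) = \tilde q_c(n - ct)$ satisfies (1) using the closed-form second derivative of $\tilde q_c$, and evaluates the Hamiltonian on the profile for several speeds.

```python
import math


def log_cosh(y):
    """Overflow-safe log(cosh(y))."""
    y = abs(y)
    return y + math.log1p(math.exp(-2.0 * y)) - math.log(2.0)


def kappa_of_c(c, tol=1e-14):
    """Positive root of sinh(k)/k = c for c > 1, found by bisection."""
    if c <= 1.0:
        raise ValueError("need c > 1")
    lo, hi = 0.0, 1.0
    while math.sinh(hi) / hi < c:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if math.sinh(mid) / mid < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def q_tilde(x, k):
    """Profile (3): log(cosh(k(x-1)) / cosh(kx))."""
    return log_cosh(k * (x - 1.0)) - log_cosh(k * x)


def toda_residual(x, c):
    """Left side minus right side of (1) for q(t, n) = q_tilde(n - ct), at x = n - ct."""
    k = kappa_of_c(c)
    sech2 = lambda y: math.exp(-2.0 * log_cosh(y))
    lhs = c * c * k * k * (sech2(k * (x - 1.0)) - sech2(k * x))  # c^2 q_tilde''(x)
    rhs = (math.exp(-(q_tilde(x, k) - q_tilde(x - 1.0, k)))
           - math.exp(-(q_tilde(x + 1.0, k) - q_tilde(x, k))))
    return lhs - rhs


def soliton_energy(c):
    """H(u_c) at t = 0, summed over a window where the tails are negligible."""
    k = kappa_of_c(c)
    half_width = int(40.0 / k) + 5
    total = 0.0
    for n in range(-half_width, half_width + 1):
        p = -c * k * (math.tanh(k * (n - 1)) - math.tanh(k * n))  # (4)
        r = q_tilde(n + 1, k) - q_tilde(n, k)                     # (4)
        total += 0.5 * p * p + math.exp(-r) - 1.0 + r             # V(r) = e^{-r} - 1 + r
    return total


if __name__ == "__main__":
    print("max residual:", max(abs(toda_residual(0.37 * j, 1.5)) for j in range(-30, 31)))
    for c in (1.0004, 1.05, 1.5, 3.0):
        print(f"c = {c}: kappa = {kappa_of_c(c):.6f}, H = {soliton_energy(c):.6e}")
```

The residual vanishes up to rounding because $e^{-\tilde r_c(x)} = 1 + \sinh^2\kappa\,\mathrm{sech}^2\kappa x$, so (1) reduces to $c^2\kappa^2 = \sinh^2\kappa$. The printed energies increase with $c$ and become small as $c$ approaches 1.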
Remark 3. By a simple computation, we see $dH(u_c)/dc > 0$ and $\lim_{c\to 1} H(u_c) = 0$ (see e.g. [26]). So we have arbitrarily small 1-solitons in $l^2$. However, small solitary waves do not belong to an exponentially weighted space if $c$ is close to 1 because $u_c(t)$ decays like $e^{-2\kappa(c)|n-x(t)|}$ as $n \to \infty$ and $\lim_{c\downarrow 1}\kappa(c) = 0$. Thus from Friesecke and Pego [8–11] and Mizumachi and Pego [18], we cannot see whether a solitary wave can be stable under perturbations which include small solitary waves. Theorem 1 and Theorem 2 below assert that a solitary wave does not collapse under small perturbations including other solitary waves.

Since Benjamin [1] and Bona [2] studied stability of KdV 1-solitons, many results have been obtained on stability of solitary waves to infinite dimensional Hamiltonian systems (see [5] and references therein). Those results utilize the fact that the Hamiltonian systems have another conservation law (momentum for KdV and charge for NLS) and that a solitary wave solution is a local minimizer of the Hamiltonian among solutions whose momentum or charge is the same as that of the solitary wave solution. However, a solution to the Toda lattice does not conserve adjoint momentum in general because Noether's theorem is not applicable to the spatial variable $n \in \mathbb{Z}$. Hence stability of solitary waves does not follow from the theory of Hamiltonian systems by Grillakis, Shatah and Strauss [13,14] and Shatah and Strauss [25]. For the same reason, it is not possible to use a Liouville theorem such as [15] to prove asymptotic stability of solitary waves.
Luckily, solitary waves for a class of lattice equations including the Toda lattice equation separate from each other as $t \to \infty$. As can be seen from (3) and (4), the speed of solitary waves which move to the right is larger than 1, and the larger a solitary wave is the faster it moves, whereas the absolute values of the group velocities are less than 1. So a solution to (2) is decoupled into a train of solitary waves and a remainder term as $t \to \infty$. Friesecke and Pego [8–11] utilize this fact and prove asymptotic stability of solitary waves to FPU lattices in an exponentially weighted space. They decompose a solution as
\[
u(t) = u_{c(t)}(\gamma(t)) + v(t) = \tilde u_{c(t)}(\cdot - x(t)) + v(t), \quad x(t) = c(t)\gamma(t),
\tag{8}
\]
where $u_{c(t)}(\gamma(t))$ denotes a main solitary wave, and $c(t)$ and $x(t)$ are modulation parameters of the speed and the phase shift of the main wave, respectively. They prove that a solution which lies in a neighborhood of $\mathcal{M}$ is absorbed into $\mathcal{M}$ exponentially in the $l^2_a$-norm as $t \to \infty$. Their proof basically follows the idea of Pego and Weinstein [21] and imposes the symplectic orthogonality condition (5) on $v$. One of the difficulties in the use of their method in the energy space is that $J^{-1}\partial_c u_c$ tends to a nonzero constant as $n \to \infty$ and (5) is not well defined for $v \in l^2$.

Our strategy is to decompose $v(t)$ into the sum of a small solution $v_1(t)$ of (2) and $v_2(t)$ that is driven by an interaction of $u_c$ and a dispersive part of the solution. Since $v_2(t)$ is exponentially localized in front, we can estimate $v_2(t)$ by using exponential linear stability (7). Since $v_1(t)$ moves more slowly than the main solitary wave, it locally tends to 0 around the solitary wave. To fix the decomposition, we impose the constraint
\[
\langle v, J^{-1}\dot u_c(\gamma)\rangle = \langle v_2, J^{-1}\partial_c u_c(\gamma)\rangle = 0
\]
instead of (5).

Recently, Martel and Merle [16] gave a direct proof of the asymptotic stability results in $H^1(\mathbb{R})$ for generalized KdV solitons based on a virial identity (which first appeared in Kato [19]). Because the Toda lattice and KdV equations have the similarity that the dominant solitary wave outruns and is separated from other parts of solutions as $t \to \infty$, their idea seems promising. We prove a virial lemma [Lemma 5 in Sect. 3] for $v_1(t)$ and apply local energy decay estimates for $v_2$ instead of proving a virial lemma around solitary waves. This enables us to prove our results without numerics, whereas [15,16] need some numerical computation to prove positivity of a quadratic form.
We expect our proof is applicable also for Hamiltonian PDEs like the KdV equation or bidirectional models like Boussinesq equations (see [3,4,20]) by using the renormalization method by Ei [6] and Promislow [22] (see [17] for an application to the generalized KdV equation in a weighted space). We remark that Quintero [23] proved orbital stability of solitary waves of the 1-dimensional Benney-Luke equation by the variational method [13] in a case where surface tension is strong. But the approach fails if the surface tension is weak because then a solitary wave solution is a saddle point of infinite dimensional indefiniteness ([24]). Our method can be applied to such a situation because we do not use the fact that a solitary wave is a constrained minimizer to prove stability of solitary waves. Now let us consider asymptotic stability of solitary waves to the FPU lattice equations. It is interesting to see whether solitary waves to non-integrable lattices are robust to perturbations in the energy class. Let u(t, n) = t(r (t, n), p(t, n)) be a solution to
\[
\frac{du}{dt} = J H_F'(u) \quad\text{for } t \in \mathbb{R},
\tag{9}
\]
where
\[
H_F(u(t)) = \sum_{n\in\mathbb{Z}} \Bigl( \tfrac{1}{2}\, p(t,n)^2 + V_F(r(t,n)) \Bigr),
\]
and $V_F$ is a potential satisfying

(H2) $V_F \in C^4(\mathbb{R};\mathbb{R})$, $V_F(0) = V_F'(0) = 0$, $V_F''(0) > 0$, $V_F'''(0) \ne 0$.

If $c > c_s := \sqrt{V_F''(0)}$ and $c$ is sufficiently close to $c_s$, Friesecke and Pego [8] show that there exists a unique solution $\tilde u_c(x)$ of
\[
-c\,\partial_x \tilde u_c(x) = J H_F'(\tilde u_c(x)) \quad\text{for } x \in \mathbb{R},
\tag{10}
\]
up to translation and its profile is close to that of a KdV soliton. We remark that a solitary wave solution $\tilde u_c(n - ct)$ has small amplitude and satisfies $dH(\tilde u_c)/dc > 0$ if $c > c_s$ and $c$ is close to $c_s$. See Friesecke and Wattis [12] for existence of large solitary waves. Friesecke and Pego have proved in [11] that small solitary wave solutions of (9) satisfy (H1) and are asymptotically stable in $l^2_a$. Assuming (H2), we can prove orbital and asymptotic stability of small solitary waves in $l^2$ exactly in the same way as for the Toda lattice.

Theorem 2. Suppose (H2). Let $\delta_*$ be a small positive number and let $c_0 \in (c_s, c_s + \delta_*)$ and $\tau_0 \in \mathbb{R}$. Let $u(t)$ be a solution to (9) with $u(0) = u_{c_0}(\tau_0) + v_0$. Then for every $\varepsilon > 0$, there exists a $\delta > 0$ satisfying the following: If $\|v_0\|_{l^2} < \delta$, there exist constants $c_+ > c_s$ and $\sigma \in (c_s, c_+)$ and a $C^1$-function $x(t)$ such that
\[
\|u(t) - \tilde u_{c_0}(\cdot - x(t))\|_{l^2} < \varepsilon,
\]
\[
\lim_{t\to\infty} \|u(t) - \tilde u_{c_+}(\cdot - x(t))\|_{l^2(n \ge \sigma t)} = 0,
\]
\[
\sup_{t\in\mathbb{R}} \bigl( |c(t) - c_0| + |\dot x(t) - c_0| \bigr) = O(\|v_0\|_{l^2}),
\]
\[
\lim_{t\to\infty} c(t) = c_+, \qquad \lim_{t\to\infty} \dot x(t) = c_+.
\]
In Sect. 2 of the present paper, we introduce a variant of the secular term condition for solutions in the energy class and some estimates that will be used later. In Sect. 3, we derive modulation equations of $x(t)$ and $c(t)$ and prove
\[
\dot c(t) = O\bigl( \|v_1(t)\|_W^2 + \|v_2(t)\|_X^2 \bigr)
\tag{11}
\]
for some weighted spaces $W \subset l^2_a \cap l^2_{-a}$ and $X \subset l^2_a$. On the other hand, we show that
\[
\int_0^\infty \bigl( \|v_1(t)\|_W + \|v_2(t)\|_X \bigr)^2 dt \lesssim \|v_0\|_{l^2}^2
\tag{12}
\]
by using a virial lemma for $v_1(t)$ and a local energy decay estimate (Corollary 1 in Sect. 2) for $v_2(t)$. Combining (11) and (12) with
\[
\|v(t)\|_{l^2}^2 \le C\bigl( \|v_0\|_{l^2} + |c(t) - c_0| \bigr),
\tag{13}
\]
which follows from the convexity of the Hamiltonian and the orthogonality condition, we will prove Theorem 1. In Sect. 4, we give a brief proof of Theorem 2.

Finally, let us introduce some notation. For a Banach space $X$, we denote by $B(X)$ the space of all linear continuous operators from $X$ to $X$. We use $a \lesssim b$ and $a = O(b)$ to mean that there exists a positive constant $C$ such that $a \le Cb$.

2. Preliminaries

Let $u(t)$ be a solution to (2) which lies in a tubular neighborhood of $\mathcal{M}$. We decompose $u(t)$ as (8). Since $\dot u_c = -c\,\partial_x \tilde u_c(\cdot - ct) = J H'(u_c)$, it follows from (3) and (4) that
\[
\frac{d}{dt} u_{c(t)}(\gamma(t)) = \dot c(t)\,\partial_c \tilde u_{c(t)}(n - x(t)) - \dot x(t)\,\partial_x \tilde u_{c(t)}(n - x(t))
= J H'(u_{c(t)}) + \dot c(t)\,\partial_c u_c(\gamma(t)) + \frac{\dot x(t) - c(t)}{c(t)}\,\dot u_{c(t)}(\gamma(t)).
\]
Thus by the definition of $v$,
\[
\frac{dv}{dt} = J H''\bigl(u_{c(t)}(\gamma(t))\bigr) v(t) + l_1(t) + N_1(t),
\tag{14}
\]
where
\[
l_1(t) = -\dot c(t)\,\partial_c u_{c(t)}(\gamma(t)) - \frac{\dot x(t) - c(t)}{c(t)}\,\dot u_{c(t)}(\gamma(t)),
\]
\[
N_1(t) = J\bigl\{ H'\bigl(u_{c(t)}(\gamma(t)) + v(t)\bigr) - H'\bigl(u_{c(t)}(\gamma(t))\bigr) - H''\bigl(u_{c(t)}(\gamma(t))\bigr) v(t) \bigr\}.
\]
Let $P_c(t)$ be a spectral projection associated with the subspace of neutral modes $\mathrm{span}\{\dot u_c(t), \partial_c u_c(t)\}$ and let $Q_c(t) = 1 - P_c(t)$. Then for $v \in l^2_a$ ($0 < a < 2\kappa(c)$),
\[
P_c(t) v = \theta(c)\langle v, J^{-1}\dot u_c(t)\rangle\,\partial_c u_c(t) - \theta(c)\langle v, J^{-1}\partial_c u_c(t)\rangle\,\dot u_c(t),
\]
where $\theta(c) = (dH(u_c)/dc)^{-1}$. We remark that the projections $P_c(t)$ and $Q_c(t)$ cannot be defined on $l^2$ because $J^{-1}\partial_c u_c$ does not decay as $n \to \infty$. Now we decompose $v(t)$ into the sum of a small solution to (2) and a remainder term that belongs to $l^2_a$ for some $a > 0$. More precisely, we put $v(t) = v_1(t) + v_2(t)$, where
\[
\begin{cases}
\dfrac{dv_1}{dt} = J H'(v_1), \\
v_1(0) = v_0,
\end{cases}
\tag{15}
\]
and $v_2(t)$ is a solution to
\[
\begin{cases}
\dfrac{dv_2}{dt} = J H''\bigl(u_{c(t)}(\gamma(t))\bigr) v_2 + l_1(t) + N_2(t), \\
v_2(0) = u_{c_0}(\tau_0) - u_{c(0)}(\gamma(0)),
\end{cases}
\tag{16}
\]
where $N_2(t) = N_1(t) - J H'(v_1(t)) + J H''(u_{c(t)}(\gamma(t)))\, v_1$. To fix the decomposition, we impose the constraint
\[
\langle v(t), J^{-1}\dot u_{c(t)}(\gamma(t))\rangle = 0,
\tag{17}
\]
\[
\langle v_2(t), J^{-1}\partial_c u_{c(t)}(\gamma(t))\rangle = 0.
\tag{18}
\]
We remark that $u(t) - v_1(t)$ remains in $l^2_a$ for every $0 \le a < 2\kappa(c_0)$ and $t \in \mathbb{R}$. More precisely, we have the following:

Proposition 1. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$ and $v_0 \in l^2$. Let $u(t)$ be a solution to (2) satisfying $u(0) = u_{c_0}(\tau_0) + v_0$ and let $v_1(t)$ be a solution to (15). Then $u(t) \in C^2(\mathbb{R}; l^2)$ and $u(t) - v_1(t) \in C^2(\mathbb{R}; l^2_a)$ for $0 \le a < 2\kappa(c_0)$.

Proof. By [9], we have $u, v_1 \in C^2(\mathbb{R}; l^2)$. Let $v_3(t) = u(t) - v_1(t)$. Then $v_3(0) \in \bigcap_{0 \le a < 2\kappa(c_0)} l^2_a$ and
\[
\frac{dv_3}{dt} = J\bigl\{ H'(u) - H'(v_1) \bigr\}.
\tag{19}
\]
Let $u(t) = {}^t(r(t), p(t))$, $v_1(t) = {}^t(r_1(t), p_1(t))$ and let
\[
F(u, v_1) = \begin{pmatrix} \dfrac{V'(r) - V'(r_1)}{r - r_1} & 0 \\ 0 & 1 \end{pmatrix}.
\]
Then we have $F(u, v_1) \in C^1(\mathbb{R}; B(l^2_a))$ for every $a \in [0, 2\kappa(c_0))$ and (19) can be rewritten as
\[
\frac{dv_3}{dt} = J F(u, v_1)\, v_3.
\tag{20}
\]
By [9, Appendix A], we see that there exists a unique solution $v_3 \in C^2(\mathbb{R}; l^2 \cap l^2_a)$ to (20) for every $a \in [0, 2\kappa(c_0))$. Thus we prove $u - v_1 \in C^2(\mathbb{R}; l^2_a)$ for every $a \in [0, 2\kappa(c_0))$.

If $u(t)$ and $u(t) - v_1(t)$ lie in a tubular neighborhood of a solitary wave in $l^2$ and $l^2_a$ respectively, we can find modulation parameters $c(t)$ and $\gamma(t)$ satisfying (17) and (18).

Lemma 1. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$, $\gamma_0(t) = t + \tau_0$ and $a \in (0, 2\kappa(c_0))$. Let $u(t)$ be a solution to (2) and let $v_1(t)$ be a solution to (15). Then there exist positive numbers $\delta_0$ and $\delta_1$ satisfying the following: If
\[
\sup_{t\in[T_1,T_2]} \Bigl( \|u(t) - u_{c_0}(\gamma_0(t))\|_{l^2} + e^{-ac_0\gamma_0(t)} \|u(t) - u_{c_0}(\gamma_0(t)) - v_1(t)\|_{l^2_a} \Bigr) < \delta_0
\]
for some $0 \le T_1 \le T_2 \le \infty$, there exists $(c(t), \gamma(t)) \in C^2([T_1, T_2]; \mathbb{R}^2)$ satisfying (8), (17), (18) and
\[
\sup_{t\in[T_1,T_2]} \bigl( |\gamma(t) - \gamma_0(t)| + |c(t) - c_0| \bigr) < \delta_1.
\]
In particular, $|c(0) - c_0| + |\gamma(0) - \tau_0| = O(\|v_0\|_{l^2})$.
Proof. Let
\[
F_1(u, \tilde u, c, \gamma) := \langle u - u_c(\gamma), J^{-1}\dot u_c(\gamma)\rangle, \tag{21}
\]
\[
F_2(u, \tilde u, c, \gamma) := \langle \tilde u - u_c(\gamma), J^{-1}\partial_c u_c(\gamma)\rangle. \tag{22}
\]
Then
\[
\frac{\partial(F_1, F_2)}{\partial(c, \gamma)}\bigl(u_{c_0}(\gamma_0), u_{c_0}(\gamma_0), c_0, \gamma_0\bigr) = -\Bigl(\frac{d}{dc} H(u_{c_0})\Bigr)^2 \ne 0.
\]
Let
\[
U(\delta_0) = \bigl\{ (u, \tilde u) \in l^2 \times l^2_a : \|u - u_c(\gamma_0)\|_{l^2} + e^{-ac\gamma_0}\|\tilde u - u_c(\gamma_0)\|_{l^2_a} < \delta_0 \bigr\}
\]
and $B(\delta_1) := \{ (c, \gamma) \in \mathbb{R}^2 : |c - c_0| + |\gamma - \gamma_0| < \delta_1 \}$. Using the implicit function theorem, we see that there exist positive numbers $\delta_0$ and $\delta_1$ and a mapping
\[
\Phi : U(\delta_0) \ni (u, \tilde u) \mapsto (c, \gamma) \in B(\delta_1)
\]
satisfying $F_1(u, \tilde u, \Phi(u, \tilde u)) = F_2(u, \tilde u, \Phi(u, \tilde u)) = 0$. Since $F_1$ and $F_2$ are $C^2$ in $(u, \tilde u, \gamma, c) \in U(\delta_0) \times B(\delta_1)$, we have $\Phi \in C^2(U(\delta_0))$. We remark that $\delta_0$ and $\delta_1$ can be chosen uniformly with respect to $\gamma$ due to the periodicity $u(t + 1/c) = e^{-\partial}u(t)$ ($t \in \mathbb{R}$).

Let $(c(t), \gamma(t)) = \Phi(u(t), u(t) - v_1(t))$ for $t \in [T_1, T_2]$. Then $c(t)$ and $\gamma(t)$ satisfy (17) and (18) and are of class $C^2$ because $\Phi \in C^2(U(\delta_0))$ and $(u(t), u(t) - v_1(t)) \in C^2(\mathbb{R}; U(\delta_0))$. Furthermore, we have
\[
|c(t) - c_0| + |\gamma(t) - \gamma_0(t)| \lesssim \|u(t) - u_{c_0}(\gamma_0(t))\|_{l^2} + e^{-a c_0 \gamma(t)}\|u(t) - u_{c_0}(\gamma_0(t)) - v_1(t)\|_{l^2_a}.
\]
In particular, for $t = 0$ we have $|c(0) - c_0| + |\gamma(0) - \tau_0| = O(\|v_0\|_{l^2})$. This completes the proof of Lemma 1.

To estimate the exponentially decaying part of a solution, we will use the following decay estimate for non-autonomous linearized equations.

Lemma 2 ([10,18]). Let $c_0 > 1$, $a \in (0, 2\kappa(c_0))$ and $b(a) := ca - 2\sinh(a/2)$. Let $U_0(t, \tau)\varphi$ be a solution to
\[
\frac{dv}{dt} = JH''(u_{c_0})\,v, \qquad v(\tau) = \varphi. \tag{23}
\]
Then for every $b \in (0, b(a))$, there exists a positive number $K$ such that for every $\varphi \in l^2_a$ and $t \ge \tau$,
\[
e^{-a c_0 (t - \tau)} \|U_0(t, \tau) Q_{c_0}(\tau)\varphi\|_{l^2_a} \le K e^{-b(t - \tau)} \|\varphi\|_{l^2_a}.
\]
Now let $\gamma = \gamma(t)$ be a $C^1$-function and let $U(t, \tau)\varphi$ be a solution to
\[
\frac{dv}{dt} = \dot\gamma\, JH''(u_{c_0}(\gamma))\,v, \qquad v(\tau) = \varphi. \tag{24}
\]
If a modulation parameter $\gamma(t)$ is an increasing function and $\dot\gamma(t)$ is bounded away from 0, we have the following:
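Corollary 1. Let $c_0$, $a$, $b$ and $K$ be as in Lemma 2 and let $0 \le T \le \infty$. Suppose $\inf_{t \in [0,T]} \dot\gamma(t) \ge 1/2$, $\varphi \in l^2_a$ and
\[
\langle \varphi, J^{-1}\dot u_{c_0}(\gamma(\tau))\rangle = \langle \varphi, J^{-1}\partial_c u_{c_0}(\gamma(\tau))\rangle = 0.
\]
Then $\|U(t, \tau)\varphi\|_{X(t)} \le K e^{-b(t - \tau)/2}\|\varphi\|_{X(\tau)}$ for $0 \le \tau \le t \le T$, where $\|v\|_{X(t)} := e^{-a c_0 \gamma(t)}\|v\|_{l^2_a}$.

Proof. Let $s = \gamma(t)$, $\tau_1 = \gamma(\tau)$ and $\tilde v(s) = v(\gamma^{-1}(s))$. Then for $s \in [0, \gamma(T)]$,
\[
\frac{d\tilde v}{ds} = JH''(u_{c_0})\,\tilde v \quad\text{and}\quad \tilde v(s) \in \operatorname{Range} Q_{c_0}(s).
\]
Lemma 2 and the fact that $\dot\gamma(t) \ge 1/2$ imply
\[
\|v(t)\|_{X(t)} = e^{-a c_0 s}\|\tilde v(s)\|_{l^2_a} \le K e^{-b(s - \tau_1) - a c_0 \tau_1}\|\varphi\|_{l^2_a} \le K e^{-b(t - \tau)/2} e^{-a c_0 \gamma(\tau)}\|\varphi\|_{l^2_a}.
\]
This completes the proof of Corollary 1.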
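The admissible decay rates in Lemma 2 and Corollary 1 are bounded by $b(a) = ca - 2\sinh(a/2)$, and it can help to see how this quantity behaves on the interval $(0, 2\kappa(c))$. The following is only a numerical sketch under an assumption that is not stated in this section: that $\kappa(c)$ is the positive root of $\sinh\kappa/\kappa = c$, as for Toda lattice solitons. Under that assumption $b$ vanishes at both endpoints $a = 0$ and $a = 2\kappa(c)$ and is positive in between. The function names `kappa_of_c` and `b_rate` are illustrative.

```python
import math


def kappa_of_c(c: float) -> float:
    """Solve sinh(kappa)/kappa = c for kappa > 0 by bisection.

    ASSUMPTION: this relation between speed and kappa is the Toda-lattice one;
    it is not given in the text above.
    """
    if c <= 1.0:
        raise ValueError("the relation has a positive root only for c > 1")

    def f(k: float) -> float:
        return math.sinh(k) / k - c

    lo, hi = 1e-8, 1.0
    while f(hi) < 0.0:          # grow the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(200):        # bisection; sinh(k)/k is increasing for k > 0
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def b_rate(c: float, a: float) -> float:
    """b(a) = c*a - 2*sinh(a/2), the upper bound for the rate b in Lemma 2."""
    return c * a - 2.0 * math.sinh(a / 2.0)


if __name__ == "__main__":
    c0 = 1.5
    kappa = kappa_of_c(c0)
    for j in range(1, 10):
        a = 2.0 * kappa * j / 10.0
        print(f"a = {a:.4f}   b(a) = {b_rate(c0, a):.6f}")
```

Since $b$ is concave in $a$, the printed values rise and then fall back toward zero as $a$ approaches $2\kappa(c_0)$, which is one way to see why the weight $a$ has to stay strictly inside the interval.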
We can estimate $\|v(t)\|_{l^2}$ by applying an argument from [9] that uses the convexity of the Hamiltonian and the orthogonality condition (17).

Lemma 3. Let $u(t)$ be a solution to (2) satisfying $u(0) = u_{c_0}(\tau_0) + v_0$. Then there exist positive numbers $\delta_2$ and $C$ satisfying the following: Suppose there exists $T \in [0, \infty]$ such that $v(t)$ satisfies (8) and (17) for $t \in [0, T]$ and $\sup_{t \in [0,T]} |c(t) - c_0| + \|v_0\|_{l^2} \le \delta_2$. Then
\[
\|v(t)\|_{l^2}^2 \le C\bigl(|c(t) - c_0| + \|v_0\|_{l^2}\bigr) \quad\text{for } t \in [0, T]. \tag{25}
\]

Proof. By (17), we have
\[
\langle H'(u_{c(t)}(\gamma(t))), v(t)\rangle = \langle J^{-1}\dot u_{c(t)}(\gamma(t)), v(t)\rangle = 0.
\]
Since $H(u(t))$ does not depend on $t$, it follows from the convexity of the functional $H$ and the above that
\[
\begin{aligned}
\delta H &:= H(u_{c_0}(\tau_0) + v_0) - H(u_{c_0}) = H(u_{c(t)}(\gamma(t)) + v(t)) - H(u_{c_0}) \\
&= H(u_{c(t)}) + \langle H'(u_{c(t)}(\gamma(t))), v(t)\rangle + \tfrac12 \langle H''(u_{c(t)}(\gamma(t)))v(t), v(t)\rangle - H(u_{c_0}) + O(\|v(t)\|_{l^2}^3) \\
&\ge C'\|v(t)\|_{l^2}^2 - C''|c(t) - c_0| + O(\|v(t)\|_{l^2}^3),
\end{aligned}
\]
where $C'$ and $C''$ are positive constants. Noting that $|\delta H| = O(\|v_0\|_{l^2})$, we have (25) for a $C > 0$.

Because $l^2 \subset l^r$ for every $r \in [2, \infty]$, Lemma 3 allows us to control every $l^r$-norm with $r \ge 2$.
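3. Proof of Theorem 1

First, we derive from (17) and (18) a system of ordinary differential equations which describes the motion of the modulating speed $c(t)$ and the phase shift $x(t) = c(t)\gamma(t)$ of the main solitary wave.

Lemma 4. Let $u(t)$ be a solution to (2) and $v_1(t)$ be a solution to (15). Suppose that $c$ and $\gamma$ are $C^1$-functions satisfying (17) and (18) on $[0, T]$ and $\inf_{t \in [0,T]} c(t) > 1$. Then it holds for $t \in [0, T]$ that
\[
\begin{aligned}
\dot c(t) &= O\bigl(\|v_1(t)\|_{W(t)}^2 + \|v_2(t)\|_{X(t)}^2\bigr), \\
\dot x(t) - c(t) &= O\bigl(\|v_1(t)\|_{W(t)} + (\|v(t)\|_{l^2} + \|v_1(t)\|_{l^2})\|v_2(t)\|_{X(t)}\bigr),
\end{aligned}
\]
where
\[
\|u\|_{W(t)} = \Bigl(\sum_{n \in \mathbb{Z}} e^{-\kappa(c(t))|n - x(t)|}|u(n)|^2\Bigr)^{1/2}, \qquad
\|u\|_{X(t)} = \Bigl(\sum_{n \in \mathbb{Z}} e^{2a(n - x(t))}|u(n)|^2\Bigr)^{1/2},
\]
and $a$ is a constant satisfying $0 < a \le \inf_{t \in [0,T]} \kappa(c(t))$.

Proof. Differentiating (17) with respect to $t$ and substituting (14) into the resulting equation, we have
\[
\begin{aligned}
\frac{d}{dt}\langle v, J^{-1}\dot u_c(\gamma)\rangle
&= \langle \dot v, J^{-1}\dot u_c(\gamma)\rangle + \frac{\dot x}{c}\langle v, J^{-1}\ddot u_c(\gamma)\rangle + \dot c\,\langle v, J^{-1}\partial_c \dot u_c(\gamma)\rangle \\
&= \langle JH''(u_c(\gamma))v, J^{-1}\dot u_c(\gamma)\rangle + \langle v, J^{-1}\ddot u_c(\gamma)\rangle + \langle l_1 + N_1, J^{-1}\dot u_c(\gamma)\rangle \\
&\quad + \Bigl(\frac{\dot x}{c} - 1\Bigr)\langle v, J^{-1}\ddot u_c(\gamma)\rangle + \dot c\,\langle v, J^{-1}\partial_c \dot u_c(\gamma)\rangle \\
&= 0.
\end{aligned}
\]
Substituting $\ddot u_c = JH''(u_c)\dot u_c$ and $J^* = -J$ into the above, we have
\[
\dot c\,\Bigl\{\frac{d}{dc}H(u_c) - \langle v, J^{-1}\partial_c \dot u_c(\gamma)\rangle\Bigr\} - \Bigl(\frac{\dot x}{c} - 1\Bigr)\langle v, J^{-1}\ddot u_c(\gamma)\rangle = \langle N_1, J^{-1}\dot u_c(\gamma)\rangle. \tag{26}
\]
Differentiating (18) with respect to $t$, we have
\[
\begin{aligned}
\frac{d}{dt}\langle v_2, J^{-1}\partial_c u_c(\gamma)\rangle
&= \langle \dot v_2, J^{-1}\partial_c u_c(\gamma)\rangle + \frac{\dot x}{c}\langle v_2, J^{-1}\partial_c \dot u_c(\gamma)\rangle + \dot c\,\langle v_2, J^{-1}\partial_c^2 u_c(\gamma)\rangle \\
&= \langle JH''(u_c(\gamma))v_2, J^{-1}\partial_c u_c(\gamma)\rangle + \langle v_2, J^{-1}\partial_c \dot u_c(\gamma)\rangle + \langle l_1 + N_2, J^{-1}\partial_c u_c(\gamma)\rangle \\
&\quad + \Bigl(\frac{\dot x}{c} - 1\Bigr)\langle v_2, J^{-1}\partial_c \dot u_c(\gamma)\rangle + \dot c\,\langle v_2, J^{-1}\partial_c^2 u_c(\gamma)\rangle \\
&= 0.
\end{aligned}
\]
Substituting $\partial_c \dot u_c = JH''(u_c)\partial_c u_c$ into the above, we obtain
\[
\Bigl(\frac{\dot x}{c} - 1\Bigr)\Bigl\{\frac{d}{dc}H(u_c) + \langle v_2, J^{-1}\partial_c \dot u_c(\gamma)\rangle\Bigr\} - \dot c\,\bigl\{\langle \partial_c u_c, J^{-1}\partial_c u_c\rangle - \langle v_2, J^{-1}\partial_c^2 u_c(\gamma)\rangle\bigr\} = -\langle N_2, J^{-1}\partial_c u_c(\gamma)\rangle. \tag{27}
\]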
Since $|N_1(t)| \lesssim |v(t)|^2$ and $|J^{-1}\dot u_c(t, n)| \lesssim e^{-2\kappa(c)|n - x(t)|}$ as $n \to \infty$, we have
\[
\langle N_1, J^{-1}\dot u_c(\gamma)\rangle = O(\|v(t)\|_{W(t)}^2).
\]
Let $N_2(t) = \tilde N_1(t) + \tilde N_2(t) + \tilde N_3(t)$, where
\[
\begin{aligned}
\tilde N_1(t) &= N_1(t) - JH'(v(t)) + Jv(t), \\
\tilde N_2(t) &= JH'(v(t)) - JH'(v_1(t)) - Jv_2(t), \\
\tilde N_3(t) &= J\bigl\{H''(u_{c(t)}(\gamma(t))) - 1\bigr\}v_1(t).
\end{aligned}
\]
We put $G(v) := H'(v) - H'(0) - H''(0)v = H'(v) - v$ so that $JG(v)$ represents a part of $N_1(t)$ that does not interact with the solitary wave $u_c(\gamma)$. Since $|u_c(t, n)| \lesssim e^{-2\kappa(c)|n - x(t)|}$ and $a \le \inf_{t \in [0,T]}\kappa(c(t))$, we have $\|u_{c(t)}v^2\|_{X(t)} \lesssim \|v\|_{W(t)}^2$. Hence by the definition of $\tilde N_1$ and $\tilde N_2$,
\[
\|\tilde N_1(t)\|_{X(t)} = \|N_1(t) - JG(v(t))\|_{X(t)} \lesssim \|v(t)\|_{W(t)}^2, \tag{28}
\]
\[
\|\tilde N_2(t)\|_{X(t)} = \|JG(v(t)) - JG(v_1(t))\|_{X(t)} \lesssim (\|v(t)\|_{l^\infty} + \|v_1(t)\|_{l^\infty})\|v_2(t)\|_{X(t)}. \tag{29}
\]
We see from (3) and (4) or [8] that $H''(u_c) - 1$ decays like $e^{-2\kappa|n - x(t)|}$ as $n \to \pm\infty$ and for $a \in (0, \kappa(c(t)))$,
\[
\|\tilde N_3(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)}. \tag{30}
\]
Let $\|u\|_{X(t)^*} = e^{ax(t)}\|u\|_{l^2_{-a}}$ and $\|u\|_{W(t)^*} = \bigl(\sum_{n \in \mathbb{Z}} e^{\kappa(c(t))|n - x(t)|}|u(n)|^2\bigr)^{1/2}$. In view of (26), (27) and the fact that
\[
\sup_{t \in [0,T]} \bigl(\|J^{-1}\ddot u_{c(t)}(\gamma(t))\|_{W(t)^*} + \|J^{-1}\partial_c \dot u_{c(t)}(\gamma(t))\|_{W(t)^*}\bigr) < \infty,
\]
\[
\sup_{t \in [0,T]} \bigl(\|J^{-1}\partial_c \dot u_{c(t)}(\gamma(t))\|_{X(t)^*} + \|J^{-1}\partial_c^2 u_{c(t)}(\gamma(t))\|_{X(t)^*}\bigr) < \infty,
\]
we have
\[
A(t)\begin{pmatrix} \dot c(t) \\ \dot x(t) - c(t) \end{pmatrix}
= \begin{pmatrix} O(\|v(t)\|_{W(t)}^2) \\ O\bigl(\|v_1(t)\|_{W(t)} + (\|v(t)\|_{l^2} + \|v_1(t)\|_{l^2})\|v_2(t)\|_{X(t)}\bigr) \end{pmatrix},
\]
where $A(t) = \operatorname{diag}\bigl(dH(u_c)/dc,\, dH(u_c)/dc\bigr) + O(\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)})$. We have thus proved Lemma 4.

Since $v_1(t)$ is smaller than the main wave, it moves more slowly and will be separated from the main wave. The following is an analog of the virial lemma for small solutions in Martel and Merle [16].

Lemma 5. Let $v_1(t)$ be a solution to (15).

(i) Suppose $v_0 \in l^2$. Then $\sup_{t \in \mathbb{R}} \|v_1(t)\|_{l^2} \le C\|v_0\|_{l^2}$, where $C$ can be chosen as an increasing function of $\|v_0\|_{l^2}$.
(ii) Let $c_1 > 1$ and $\tilde x(t)$ be a $C^1$-function satisfying $\inf_{t \in \mathbb{R}} \tilde x_t \ge c_1$. Then there exist positive numbers $a_0$ and $\delta_3$ such that if $a \in (0, a_0)$ and $\|v_0\|_{l^2} \le \delta_3$,
\[
\|\psi_a(t)^{1/2}v_1(t)\|_{l^2}^2 + \int_0^t \|\tilde\psi_a(s)v_1(s)\|_{l^2}^2\,ds \lesssim \|\psi_a(0)^{1/2}v_0\|_{l^2}^2,
\]
where $\psi_a(t, x) = 1 + \tanh a(x - \tilde x(t))$ and $\tilde\psi_a(t, x) = a^{1/2}\operatorname{sech} a(x - \tilde x(t))$.

Corollary 2. Let $v_1(t)$ be a solution to (15). For every $c_1 > 1$, there exists $\delta_3 > 0$ such that $\lim_{t \to \infty}\|v_1(t)\|_{l^2(n \ge c_1 t)} = 0$ if $\|v_0\|_{l^2} < \delta_3$.

Proof of Lemma 5. Since $v_1(t) \in C^2(\mathbb{R}; l^2)$ is a solution to (15), we have $H(v_1(t)) = H(v_0)$ for $t \in \mathbb{R}$. Noting that $V(x)$ is coercive and $\inf_{|x| \le R}|x|^{-2}V(x) > 0$ for every $R > 0$, we have
\[
\delta\|v_1(t)\|_{l^2}^2 \le H(v_1(t)) = H(v_0) \le C(\|v_0\|_{l^2})\|v_0\|_{l^2}^2,
\]
where $C$ can be chosen as an increasing function of $\|v_0\|_{l^2}$ and $\delta$ is a positive constant depending only on $\|v_0\|_{l^2}$. This proves (i).

Next, we prove (ii). Let $v_1(t) = {}^t(r_1(t, n), p_1(t, n))$ and
\[
h_1(t, n) = \tfrac12 p_1(t, n)^2 + V(r_1(t, n)).
\]
By (2) and the fact that there exists a $C > 0$ such that for every $n \in \mathbb{Z}$,
\[
\Bigl|V(r_1(t, n)) - \frac{r_1(t, n)^2}{2}\Bigr| \le C\|v_0\|_{l^2}|r_1(t, n)|^2, \qquad
|V'(r_1(t, n)) - r_1(t, n)| \le C\|v_0\|_{l^2}|r_1(t, n)|,
\]
we have
\[
\begin{aligned}
\frac{d}{dt}\sum_{n \in \mathbb{Z}} \psi_a(t, n)h_1(t, n)
&= \sum_{n \in \mathbb{Z}} p_1(t, n)V'(r_1(t, n-1))\bigl(\psi_a(t, n-1) - \psi_a(t, n)\bigr) + \sum_{n \in \mathbb{Z}} \partial_t\psi_a(t, n)h_1(t, n) \\
&\le -\frac{\tilde x_t(t)}{2}\sum_{n \in \mathbb{Z}} \tilde\psi_a(t, n)^2 p_1(t, n)^2 \\
&\quad + (1 + C'\|v_0\|_{l^2})\sum_{n \in \mathbb{Z}} |\psi_a(t, n-1) - \psi_a(t, n)|\,|p_1(t, n)r_1(t, n-1)| \\
&\quad - \frac{\tilde x_t(t)}{2}(1 - C'\|v_0\|_{l^2})\sum_{n \in \mathbb{Z}} \tilde\psi_a(t, n-1)^2 r_1(t, n-1)^2,
\end{aligned}
\]
where $C'$ is a positive constant. Let $\delta_3$ and $a$ be sufficiently small positive numbers. Since $\inf \tilde x_t \ge c_1 > 1$ and
\[
\sup_{n, t}\Bigl|\frac{\psi_a(t, n) - \psi_a(t, n-1)}{\tilde\psi_a(t, n)^2} - 1\Bigr| = O(a) \quad\text{as } a \downarrow 0,
\]
there exists a $\tilde\delta > 0$ such that for $t \in [0, T]$,
\[
\frac{d}{dt}\sum_{n \in \mathbb{Z}} \psi_a(t, n)h_1(t, n) \le -\tilde\delta\sum_{n \in \mathbb{Z}} \tilde\psi_a(t, n)^2\bigl(p_1(t, n)^2 + r_1(t, n)^2\bigr). \tag{31}
\]
Integrating (31) over $[0, t]$, we have
\[
\sum_{n \in \mathbb{Z}} \psi_a(t, n)h_1(t, n) + \tilde\delta\sum_{n \in \mathbb{Z}}\int_0^t \tilde\psi_a(s, n)^2\bigl(p_1(s, n)^2 + r_1(s, n)^2\bigr)\,ds \le \sum_{n \in \mathbb{Z}} \psi_a(0, n)h_1(0, n) \lesssim \|v_0\|_{l^2}^2.
\]
We have thus proved Lemma 5.
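The proof of Lemma 5 relies on two elementary properties of the weights: the $x$-derivative of $\psi_a$ equals $\tilde\psi_a^2$, and the lattice difference $\psi_a(t, n) - \psi_a(t, n-1)$ agrees with $\tilde\psi_a(t, n)^2$ up to a relative error of order $a$. The following is a small numerical sketch of both facts, not part of the original argument. The helper names and the truncation parameter `span` are illustrative choices; the lattice sites are restricted to $a|n| \le$ `span` only to keep the floating-point difference of hyperbolic tangents well conditioned.

```python
import math


def psi(a: float, x: float, x_tilde: float = 0.0) -> float:
    """psi_a = 1 + tanh(a (x - x_tilde))."""
    return 1.0 + math.tanh(a * (x - x_tilde))


def psi_tilde(a: float, x: float, x_tilde: float = 0.0) -> float:
    """tilde psi_a = sqrt(a) * sech(a (x - x_tilde))."""
    return math.sqrt(a) / math.cosh(a * (x - x_tilde))


def max_ratio_deviation(a: float, x_tilde: float = 0.0, span: float = 3.0) -> float:
    """Largest |(psi(n) - psi(n-1)) / psi_tilde(n)^2 - 1| over sites with a|n| <= span."""
    n_max = int(span / a)
    worst = 0.0
    for n in range(-n_max, n_max + 1):
        diff = psi(a, n, x_tilde) - psi(a, n - 1, x_tilde)
        ratio = diff / psi_tilde(a, n, x_tilde) ** 2
        worst = max(worst, abs(ratio - 1.0))
    return worst


if __name__ == "__main__":
    for a in (0.2, 0.1, 0.05, 0.025):
        print(f"a = {a:<6} max deviation = {max_ratio_deviation(a):.5f}")
```

The deviation shrinks roughly in proportion to $a$, in line with the $O(a)$ claim; the exact identity $\tanh y - \tanh(y - a) = \sinh a/(\cosh y\cosh(y - a))$ gives the explicit bound $(\sinh a/a)e^{a} - 1$ for it.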
Proof of Corollary 2. Let $c_2 \in (1, c_1)$ and let $\tilde x(t) = c_2 t$. Then by Lemma 5, we have $\|v_1(t)\|_{l^2(n \ge c_2 t)} \lesssim \|\psi_a(0)^{1/2}v_0\|_{l^2}$. Let $n_0(t) = [(c_1 - c_2)t]$, the largest integer which is smaller than $(c_1 - c_2)t$. Then we have $n_0(t) \to \infty$ as $t \to \infty$ and
\[
\|v_1(t)\|_{l^2(n \ge c_1 t)} \le \|v_1(t, \cdot + n_0(t))\|_{l^2(n \ge c_2 t)} \lesssim \|\psi_a(0, \cdot)^{1/2}v_0(\cdot + n_0(t))\|_{l^2}.
\]
Letting $t \to \infty$, we have $\lim_{t \to \infty}\|v_1(t)\|_{l^2(n \ge c_1 t)} = 0$. This completes the proof of Corollary 2.

Next, we will estimate $v_2$. It decays slowly due to the slow decay of the interaction between $v_1$ and the solitary wave $u_{c(t)}$.

Lemma 6. Let $c_0 > 1$, $a \in (0, \kappa(c_0)/3)$ and $\delta_4$ be a sufficiently small positive number. Suppose that the decomposition (8), (17) and (18) exists for $t \in [0, T]$ and that $\|v_0\|_{l^2} + \sup_{t \in [0,T]}(|c(t) - c_0| + |\dot x(t) - c_0|) \le \delta_4$, where $x(t) = c(t)\gamma(t)$. Then
\[
\|Q_{c(t)}(\gamma(t))v_2(t)\|_{X(t)} \le C\Bigl(e^{-bt/4}\|v_0\|_{l^2} + \int_0^t e^{-b(t-s)/4}\|v_1(s)\|_{W(s)}\,ds\Bigr) \tag{32}
\]
for $t \in [0, T]$, and
\[
\int_0^T \|v_2(t)\|_{X(t)}^2\,dt \le C\|v_0\|_{l^2}^2, \tag{33}
\]
where $C$ is a positive constant independent of $T$, and $\|v\|_{X(t)}$ and $\|v\|_{W(t)}$ are as in Lemma 4.

Proof. Since $v_2$ is exponentially localized in front, we will apply (7) to (16). Let $\tilde v_2(t) := Q_{c(t)}(\gamma(t))v_2(t)$ and $w(t) = Q_{c_0}(\tilde\gamma(t))\tilde v_2(t)$, where $\tilde\gamma(t) = x(t)/c_0$. Here we choose $\tilde\gamma(t)$ so that $u_{c(t)}(\gamma(t))$ and $u_{c_0}(\tilde\gamma(t))$ have the same phase shift and for $0 < a < \min(\kappa(c(t)), \kappa(c_0))$,
\[
\|Q_{c(t)}(\gamma(t)) - Q_{c_0}(\tilde\gamma(t))\|_{B(l^2_a)} = O(|c(t) - c_0|).
\]
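By (17) and (18),
\[
\begin{aligned}
\tilde v_2(t) &= v_2(t) - \theta(c(t))\langle v_2(t), J^{-1}\dot u_c(\gamma(t))\rangle\,\partial_c u_c(\gamma(t)) \\
&= v_2(t) + \theta(c(t))\langle v_1(t), J^{-1}\dot u_c(\gamma(t))\rangle\,\partial_c u_c(\gamma(t)).
\end{aligned} \tag{34}
\]
Thus we have
\[
\frac{d\tilde v_2}{dt} = JH''(u_{c(t)}(\gamma(t)))\,\tilde v_2 + l_1(t) + l_2(t) + l_3(t) + N_2(t),
\]
where
\[
\begin{aligned}
l_2(t) &= \frac{d}{dt}\bigl\{\theta(c(t))\langle v_1(t), J^{-1}\dot u_{c(t)}(\gamma(t))\rangle\,\partial_c u_{c(t)}(\gamma(t))\bigr\}, \\
l_3(t) &= -\theta(c(t))\langle v_1(t), J^{-1}\dot u_{c(t)}(\gamma(t))\rangle\,\partial_c \dot u_{c(t)}(\gamma(t)).
\end{aligned}
\]
Since
\[
\Bigl[\frac{d}{dt} - \dot{\tilde\gamma}\,JH''(u_{c_0}(\tilde\gamma)),\; Q_{c_0}(\tilde\gamma)\Bigr] = 0,
\]
we have
\[
\begin{aligned}
\dot w - \dot{\tilde\gamma}\,JH''(u_{c_0}(\tilde\gamma))\,w
&= Q_{c_0}(\tilde\gamma)\bigl\{\dot{\tilde v}_2 - \dot{\tilde\gamma}\,JH''(u_{c_0}(\tilde\gamma))\,\tilde v_2\bigr\} \\
&= Q_{c_0}(\tilde\gamma)\Bigl\{\sum_{1 \le k \le 4} l_k + \sum_{1 \le k \le 3} \tilde N_k\Bigr\},
\end{aligned} \tag{35}
\]
where
\[
l_4(t) = J\bigl\{H''(u_{c(t)}(\gamma(t))) - H''(u_{c_0}(\tilde\gamma(t)))\bigr\}\tilde v_2(t) - (\dot{\tilde\gamma}(t) - 1)\,JH''(u_{c_0}(\tilde\gamma(t)))\,\tilde v_2(t).
\]
In view of Lemma 4, we have for $a \in (0, 2\kappa(c(t)))$,
\[
\|l_1\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)} + \bigl(\|v(t)\|_{l^2} + \|v_1(t)\|_{l^2} + \|v_2(t)\|_{X(t)}\bigr)\|v_2(t)\|_{X(t)}. \tag{36}
\]
By (15) and the fact that $J^{-1}\dot u_c(\gamma)$, $\frac{d}{dt}J^{-1}\dot u_c(\gamma)$, $\partial_c u_c(\gamma)$ and $\frac{d}{dt}\partial_c u_c(\gamma)$ decay like $e^{-2\kappa|n - x(t)|}$ as $n \to \pm\infty$, we have
\[
\|l_2(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)}. \tag{37}
\]
Similarly, we have
\[
\|l_3(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)}. \tag{38}
\]
Since $x(t) = c_0\tilde\gamma(t) = c(t)\gamma(t)$,
\[
\|l_4(t)\|_{X(t)} \lesssim (|c(t) - c_0| + |\dot x(t) - c_0|)\|\tilde v_2(t)\|_{X(t)} \lesssim \delta_4\bigl(\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)}\bigr). \tag{39}
\]
Let $U(t, s)$ be a flow generated by
\[
\frac{dw}{dt} = \dot{\tilde\gamma}(t)\,JH''(u_{c_0}(\tilde\gamma(t)))\,w.
\]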
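Applying Corollary 1 to (35) and substituting (28)–(30) and (36)–(39), we have
\[
\begin{aligned}
\|w(t)\|_{X(t)} &\lesssim \|U(t, 0)w(0)\|_{X(t)} + \sum_{k=1}^{4}\int_0^t \|U(t, s)Q_{c_0}(\tilde\gamma(s))l_k(s)\|_{X(t)}\,ds \\
&\quad + \sum_{k=1}^{3}\int_0^t \|U(t, s)Q_{c_0}(\tilde\gamma(s))\tilde N_k(s)\|_{X(t)}\,ds \\
&\lesssim e^{-bt/2}\|w(0)\|_{X(0)} + \int_0^t e^{-b(t-s)/2}\|v_2(s)\|_{X(s)}^2\,ds \\
&\quad + \int_0^t e^{-b(t-s)/2}\bigl\{\|v_1(s)\|_{W(s)} + (\delta_4 + \|v_1(s)\|_{l^2} + \|v_2(s)\|_{l^2})\|v_2(s)\|_{X(s)}\bigr\}\,ds.
\end{aligned}
\]
Here we use $\|u\|_{W(t)} \lesssim \|u\|_{l^2}$ and $\|u\|_{W(t)} \lesssim \|u\|_{X(t)}$ for $a \in (0, \kappa(c(t))/2)$. By the definition of $\tilde v_2$ and $w$,
\[
\|v_2(t)\|_{X(t)} \lesssim \|\tilde v_2(t)\|_{X(t)} + \|v_1(t)\|_{W(t)}, \tag{40}
\]
\[
\begin{aligned}
\|\tilde v_2(t)\|_{X(t)} &\le \|w(t)\|_{X(t)} + \|(Q_{c(t)}(\gamma(t)) - Q_{c_0}(\tilde\gamma(t)))\tilde v_2(t)\|_{X(t)} \\
&\lesssim \|w(t)\|_{X(t)} + |c(t) - c_0|\,\|\tilde v_2(t)\|_{X(t)}.
\end{aligned} \tag{41}
\]
If $\delta_4$ is sufficiently small, Eqs. (40) and (41) imply $\|\tilde v_2(t)\|_{X(t)} \lesssim \|w(t)\|_{X(t)}$ and
\[
\|v_2(t)\|_{X(t)} \lesssim \|w(t)\|_{X(t)} + \|v_1(t)\|_{W(t)}. \tag{42}
\]
It follows from Lemmas 5 (i) and 3 that $\|v_1(t)\|_{l^2} + \|v_2(t)\|_{l^2}^2 \lesssim \|v_0\|_{l^2} + |c(t) - c_0|$. Thus as long as $\sup_{0 \le s \le t}\|w(s)\|_{X(s)} \le \sqrt{\delta_4}$, we have
\[
\|w(t)\|_{X(t)} \lesssim e^{-bt/2}\|w(0)\|_{X(0)} + \int_0^t e^{-b(t-s)/2}\bigl(\|v_1(s)\|_{W(s)} + \sqrt{\delta_4}\,\|w(s)\|_{X(s)}\bigr)\,ds.
\]
Applying Gronwall's inequality, we have
\[
\|w(t)\|_{X(t)} \lesssim e^{-(b/2 + O(\sqrt{\delta_4}))t}\|w(0)\|_{X(0)} + \int_0^t e^{-(b/2 + O(\sqrt{\delta_4}))(t-s)}\|v_1(s)\|_{W(s)}\,ds. \tag{43}
\]
By the definition of $w$, (16), (34) and Lemma 1,
\[
\|w(0)\|_{X(0)} \lesssim \|v_2(0)\|_{X(0)} + \|v_1(0)\|_{l^2} \lesssim \|v_0\|_{l^2}. \tag{44}
\]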
In view of Lemma 5 (i), (43) and (44), we have $\|w(t)\|_{X(t)} \lesssim \|v_0\|_{l^2} = O(\delta_4)$ and (43) persists for $t \in [0, T]$ if $\delta_4$ is sufficiently small. Thus by (41), we have (32). Combining (32), (42) and Lemma 5 (ii) and using Young's inequality, we have
\[
\|v_2(t)\|_{L^2(0,T;X(t))} \lesssim \|v_0\|_{l^2} + \|e^{-bt/4}\|_{L^1(0,T)}\|v_1\|_{L^2(0,T;W(t))} \lesssim \|v_0\|_{l^2}.
\]
We have thus completed the proof of Lemma 6.
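The last step of the proof of Lemma 6 passes from the pointwise convolution bound (32) to the $L^2$-in-time bound (33) through Young's inequality $\|k * f\|_{L^2} \le \|k\|_{L^1}\|f\|_{L^2}$ with the kernel $k(t) = e^{-bt/4}$. The sketch below checks a discretized version of that inequality on a uniform grid. It is only an illustration: the step size, the value of $b$ and the sample function $f$ are arbitrary choices, and the function names are hypothetical.

```python
import math


def causal_convolution(kernel, f, h):
    """Riemann-sum approximation of (k * f)(t_i) = int_0^{t_i} k(t_i - s) f(s) ds."""
    n = len(f)
    return [h * sum(kernel[i - j] * f[j] for j in range(i + 1)) for i in range(n)]


def l1_norm(g, h):
    """Discrete L^1 norm on a grid with spacing h."""
    return h * sum(abs(x) for x in g)


def l2_norm(g, h):
    """Discrete L^2 norm on a grid with spacing h."""
    return math.sqrt(h * sum(x * x for x in g))


if __name__ == "__main__":
    b, h, n = 1.0, 0.05, 400                       # illustrative parameters
    t = [i * h for i in range(n)]
    kernel = [math.exp(-b * s / 4.0) for s in t]   # k(t) = exp(-b t / 4)
    f = [1.0 / (1.0 + s) for s in t]               # stand-in for ||v_1(s)||_{W(s)}
    g = causal_convolution(kernel, f, h)
    print("||k*f||_2       =", l2_norm(g, h))
    print("||k||_1 ||f||_2 =", l1_norm(kernel, h) * l2_norm(f, h))
```

The discrete inequality holds exactly for the Riemann sums, and $\|k\|_{L^1} \le 4/b$ uniformly in $T$, which is why the constant in (33) does not depend on $T$.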
Now we are in a position to prove the following proposition:

Proposition 2. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$ and let $u(t)$ be a solution to (2) with $u(0) = u_{c_0}(\tau_0) + v_0$. For every $\varepsilon > 0$, there exists a positive number $\delta > 0$ satisfying the following: If $\|v_0\|_{l^2} < \delta$, there exist a constant $c_+ > 1$ and a $C^1$-function $x(t)$ such that
\[
\|u(t) - \tilde u_{c_0}(\cdot - x(t))\|_{l^2} < \varepsilon, \tag{45}
\]
\[
\lim_{t \to \infty}\|u(t) - \tilde u_{c_+}(\cdot - x(t))\|_{l^2(n \ge x(t) - R)} = 0 \quad\text{for every } R > 0, \tag{46}
\]
\[
\sup_{t \in \mathbb{R}}\bigl(|c(t) - c_0| + |\dot x(t) - c_0|\bigr) = O(\|v_0\|_{l^2}), \tag{47}
\]
\[
\lim_{t \to \infty} c(t) = c_+, \qquad \lim_{t \to \infty}\dot x(t) = c_+. \tag{48}
\]
Proof. Let $\delta_5 = \min_{1 \le i \le 4}\delta_i$ and
\[
\begin{aligned}
T_0 &:= \sup\{t : \text{(8), (17) and (18) hold for } 0 \le \tau \le t\}, \\
T_1 &:= \sup\Bigl\{t \le T_0 : \|v_0\|_{l^2} + \sup_{0 \le \tau \le t}(|c(\tau) - c_0| + |\dot x(\tau) - c_0|) \le \delta_5\Bigr\}.
\end{aligned}
\]
If $\delta$ is sufficiently small, Proposition 1 and Lemma 1 imply that $T_1 > 0$. We will show that $T_0 = T_1$ for small $\delta$. Suppose that $t \in [0, T_1)$. Lemmas 4, 5 and 6 and (40) imply
\[
|\dot x(t) - c(t)| \lesssim \|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)} + \|\tilde v_2(t)\|_{X(t)} \lesssim \|v_0\|_{l^2}. \tag{49}
\]
By Lemmas 1 and 4,
\[
|c(t) - c_0| \le |c(0) - c_0| + \int_0^t |\dot c(s)|\,ds \lesssim \|v_0\|_{l^2} + \int_0^t \bigl(\|v_1(s)\|_{W(s)}^2 + \|v_2(s)\|_{X(s)}^2\bigr)\,ds.
\]
In view of Lemmas 5 (ii) and 6, we have
\[
|c(t) - c_0| \lesssim \|v_0\|_{l^2}. \tag{50}
\]
It follows from (49) and (50) that $T_0 = T_1$ if $\delta$ is sufficiently small.

Next, we will show that $T_0 = \infty$ for small $\delta$. Suppose that for every $\delta > 0$, there exists $v_0$ such that $\|v_0\|_{l^2} < \delta$ and $T_0 < \infty$. By Lemma 3 and (50),
\[
\sup_{t \in [0, T_0]}\|v(t)\|_{l^2}^2 \lesssim \|v_0\|_{l^2}. \tag{51}
\]
Using (40), Lemmas 6 and 5 (i), we have
\[
\sup_{t \in [0, T_0]}\|v_2(t)\|_{X(t)} \lesssim \sup_{t \in [0, T_0]}\bigl(\|v_1(t)\|_{W(t)} + \|\tilde v_2(t)\|_{X(t)}\bigr) \lesssim \|v_0\|_{l^2}. \tag{52}
\]
By (51) and (52), we get
\[
\|v(T_0)\|_{l^2} + e^{-ax(T_0)}\|v_2(T_0)\|_{l^2_a} \lesssim \|v_0\|_{l^2}.
\]
Hence it follows from Lemma 1 that the decomposition (8), (17) and (18) can be extended beyond $t = T_0$ if $\|v_0\|_{l^2}$ is small. This is a contradiction. Thus we prove $T_0 = \infty$ for small $v_0 \in l^2$.

Let $\delta$ be a small positive number such that $T_0 = T_1 = \infty$. Then Lemma 5 (ii) and Lemma 6 imply $\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)} \in L^2(0, \infty)$. Thus by Lemma 4, we see that $\dot c(t)$ is integrable on $[0, \infty)$ and that there exists $c_+$ satisfying $\lim_{t \to \infty} c(t) = c_+$.

Next, we will prove (46). As in the proof of Corollary 2, we can prove $\lim_{t \to \infty}\|v_1(t)\|_{W(t)} = 0$. Combining this with (54), we have
\[
\lim_{t \to \infty}\dot x(t) = \lim_{t \to \infty} c(t) = c_+. \tag{53}
\]
By (40), Lemma 6 and the fact that $\|v_1(t)\|_{W(t)} \in L^2(0, \infty)$,
\[
\begin{aligned}
\|v_2(t)\|_{X(t)} &\lesssim \|v_1(t)\|_{W(t)} + \|\tilde v_2(t)\|_{X(t)} \\
&\lesssim \|v_1(t)\|_{W(t)} + e^{-bt/4}\|v_0\|_{l^2} + \sup_{t/2 \le s \le t}\|v_1(s)\|_{W(s)} + e^{-bt/8}\Bigl(\int_0^{t/2}\|v_1(s)\|_{W(s)}^2\,ds\Bigr)^{1/2} \\
&\to 0 \quad\text{as } t \to \infty.
\end{aligned} \tag{54}
\]
Since $\|v_2(t)\|_{l^2(n \ge x(t) - R)} \lesssim \|v_2(t)\|_{X(t)}$ for every $R > 0$, Corollary 2 and (54) imply (46). Combining this with (53) and (54), we have
\[
\lim_{t \to \infty}\dot x(t) = \lim_{t \to \infty} c(t) = c_+.
\]
We have thus completed the proof of Proposition 2.
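Combining Proposition 2 and the monotonicity argument given in [16], we obtain Theorem 1.

Proof of Theorem 1. Put
\[
{}^t(\tilde r(t, n), \tilde p(t, n)) := v(t, n), \qquad h(t, n) = \tfrac12\tilde p(t, n)^2 + V(\tilde r(t, n)),
\]
\[
N_3(t) = J\bigl\{H'(u_{c(t)}(\gamma(t)) + v(t)) - H'(u_{c(t)}(\gamma(t))) - H'(v(t))\bigr\}.
\]
Let $\sigma \in (1, c_+)$, $t_1 \ge 0$ and $\tilde x(t) = x(t_1) + \sigma(t - t_1)$. Let $\psi_a(t, n)$ and $\tilde\psi_a(t, n)$ be as in Lemma 5. Then
\[
\begin{aligned}
\frac{d}{dt}\sum_{n \in \mathbb{Z}} \psi_a(t, n)h(t, n)
&= \langle H'(v(t)), \psi_a(t)\dot v(t)\rangle + \sum_{n \in \mathbb{Z}} \partial_t\psi_a(t, n)h(t, n) \\
&= \sum_{n \in \mathbb{Z}} \tilde p(t, n)V'(\tilde r(t, n-1))\bigl(\psi_a(t, n-1) - \psi_a(t, n)\bigr) \\
&\quad + \langle \psi_a(t)l_1(t), H'(v(t))\rangle + \langle \psi_a(t)N_3(t), H'(v(t))\rangle + \sum_{n \in \mathbb{Z}} \partial_t\psi_a(t, n)h(t, n).
\end{aligned}
\]
Here we use $\frac{dv}{dt} = JH'(v(t)) + l_1(t) + N_3(t)$. Suppose that $a > 0$ and $\|v_0\|_{l^2}$ are sufficiently small. Since $\|v(t)\|_{l^2} \lesssim \|v_0\|_{l^2}$ follows from Proposition 2, we see that there exists a $\delta > 0$ such that
\[
\frac{d}{dt}\sum_{n \in \mathbb{Z}} \psi_a(t, n)h(t, n) \le -\delta\sum_{n \in \mathbb{Z}} \tilde\psi_a(t, n)\bigl(\tilde r(t, n)^2 + \tilde p(t, n)^2\bigr) + \langle \psi_a(t)l_1(t), H'(v(t))\rangle + \langle \psi_a(t)N_3(t), H'(v(t))\rangle
\]
in exactly the same way as the proof of Lemma 5. By the definitions of $l_1(t)$ and $N_3(t)$ and Lemma 4,
\[
|N_3(t)| \lesssim |u_{c(t)}(\gamma(t))v(t)|, \qquad |\langle l_1(t), H'(v(t))\rangle| \lesssim \bigl(\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)}\bigr)^2.
\]
Combining the above, we have
\[
\begin{aligned}
\sum_{n \in \mathbb{Z}} \psi_a(t, n)\bigl(\tilde r(t, n)^2 + \tilde p(t, n)^2\bigr)
&\lesssim \sum_{n \in \mathbb{Z}} \psi_a(t_1, n)h(t_1, n) + \int_{t_1}^t \bigl(\|v_1(s)\|_{W(s)} + \|v_2(s)\|_{X(s)}\bigr)^2\,ds \\
&\lesssim \sum_{n \in \mathbb{Z}} \psi_a(t_1, n)\bigl(|v_1(t_1, n)|^2 + |v_2(t_1, n)|^2\bigr) + \int_{t_1}^t \bigl(\|v_1(s)\|_{W(s)} + \|v_2(s)\|_{X(s)}\bigr)^2\,ds.
\end{aligned}
\]
As in the proof of Corollary 2, we have
\[
\lim_{t_1 \to \infty}\sum_{n \in \mathbb{Z}} \psi_a(t_1, n)|v_1(t_1, n)|^2 = 0.
\]
On the other hand, Lemma 6 implies
\[
\sum_{n \in \mathbb{Z}} \psi_a(t_1, n)|v_2(t_1, n)|^2 \lesssim \|v_2(t_1)\|_{X(t_1)}^2 \to 0 \quad\text{as } t_1 \to \infty.
\]
Furthermore, Lemmas 5 and 6 and Proposition 2 imply
\[
\lim_{t_1 \to \infty}\int_{t_1}^\infty \bigl(\|v_1(s)\|_{W(s)}^2 + \|v_2(s)\|_{X(s)}^2\bigr)\,ds = 0.
\]
Combining the above, we obtain
\[
\lim_{t_1 \to \infty}\sup_{t \ge t_1}\|v(t)\|_{l^2(n \ge \sigma t)} = 0.
\]
Thus we complete the proof of Theorem 1.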
4. Proof of Theorem 2

In this section, we will prove orbital and asymptotic stability of solitary waves to the FPU lattice equation (9). For a two-parameter family of solitary wave solutions $\{u_c(t + \delta) : c \in [c_1, c_2],\ \delta \in \mathbb{R}\}$ that satisfies the conditions (P1)–(P4) below, we can prove the orbital and asymptotic stability of solitary wave solutions in exactly the same way as Theorem 1.

(P1) There exists an open interval $I$ such that $V''(r) > 0$ for every $r \in I$ and that $\{r_c(x) : x \in \mathbb{R}\} \subset I$ for every $c \in [c_1, c_2]$.

(P2) There exists an $a > 0$ such that the map $\mathbb{R} \times [c_1, c_2] \ni (t, c) \mapsto u_c(t) \in l^2_a \cap l^2_{-a}$ is $C^2$.

(P3) The solitary wave energy $H_F(u_c)$ satisfies $dH_F(u_c)/dc \ne 0$ for $c \in [c_1, c_2]$.

(P4) Let $c_0 \in [c_1, c_2]$ and $a \in (0, 2\kappa(c_0/c_s))$. Let $U_0(t, \tau)\varphi$ be a solution to
\[
\frac{dv}{dt} = JH_F''(u_{c_0})\,v, \qquad v(\tau) = \varphi. \tag{55}
\]
Then there exist positive numbers $b$ and $K$ such that for every $\varphi \in l^2_a$ and $t \ge \tau$,
\[
e^{-a c_0(t - \tau)}\|U_0(t, \tau)Q_c(\tau)\varphi\|_{l^2_a} \le K e^{-b(t - \tau)}\|\varphi\|_{l^2_a}.
\]

Proof of Theorem 2. If $c > c_s$ and $c$ is sufficiently close to $c_s$, there exists a unique solitary wave solution to (10) up to translation ([8, Theorem 1.1]). By [8, Theorem 1.1], we see that a solitary wave solution satisfies (P1) and (P3) if $c$ is close to $c_s$. Slightly modifying the proof of [8, Prop. 6.1] and [9, Prop. A.3], we obtain (P2). Since (P4) holds for small solitary waves (see [11]), Theorem 2 can be proved in exactly the same way as Theorem 1.

Acknowledgement. The author would like to express his gratitude to Professor Robert L. Pego for his hospitality at Carnegie Mellon University where this work was carried out.
References

1. Benjamin, T.B.: The stability of solitary waves. Proc. Roy. Soc. London A 328, 153–183 (1972)
2. Bona, J.L.: On the stability of solitary waves. Proc. Roy. Soc. London A 344, 363–374 (1975)
3. Bona, J.L., Chen, M., Saut, J.C.: Boussinesq equations and other systems for small-amplitude long waves in nonlinear dispersive media I. Derivation and Linear Theory. J. Nonlinear Sci. 12(4), 283–318 (2002)
4. Bona, J.L., Chen, M., Saut, J.C.: Boussinesq equations and other systems for small-amplitude long waves in nonlinear dispersive media II. The Nonlinear Theory. Nonlinearity 17, 925–952 (2004)
5. Cazenave, T.: Semilinear Schrodinger equations. Courant Lecture Notes in Mathematics 10, New York: New York University, Courant Institute of Mathematical Sciences. Providence, RI: Amer. Math. Soc., 2003
6. Ei, S.-I.: The motion of weakly interacting pulses in reaction-diffusion systems. J. Dynam. Diff. Eq. 14, 85–137 (2002)
7. Flaschka, H.: On the Toda lattice. II. Inverse-scattering solution. Progr. Theor. Phys. 51, 703–716 (1974)
8. Friesecke, G., Pego, R.L.: Solitary waves on FPU lattices I, Qualitative properties, renormalization and continuum limit. Nonlinearity 12, 1601–1627 (1999)
9. Friesecke, G., Pego, R.L.: Solitary waves on FPU lattices. II, Linear Implies Nonlinear Stability. Nonlinearity 15, 1343–1359 (2002)
10. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices III, Howland-type Floquet theory. Nonlinearity 17, 207–227 (2004)
11. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices IV, Proof of stability at low energy. Nonlinearity 17, 229–251 (2004)
12. Friesecke, G., Wattis, J.: Existence theorem for solitary waves on lattices. Commun. Math. Phys. 161, 391–418 (1994)
13. Grillakis, M., Shatah, J., Strauss, W.A.: Stability Theory of solitary waves in the presence of symmetry I. J. Diff. Eq. 74, 160–197 (1987)
14. Grillakis, M., Shatah, J., Strauss, W.A.: Stability Theory of solitary waves in the presence of symmetry II. J. Funct. Anal. 94, 308–348 (1990)
15. Martel, Y., Merle, F.: Asymptotic stability of solitons for subcritical generalized KdV equations. Arch. Ration. Mech. Anal. 157(3), 219–254 (2001)
16. Martel, Y., Merle, F.: Asymptotic stability of solitons of the subcritical gKdV equations revisited. Nonlinearity 18, 55–80 (2005)
17. Mizumachi, T.: Weak interaction between solitary waves of the generalized KdV equations. SIAM J. Math. Anal. 35, 1042–1080 (2003)
18. Mizumachi, T., Pego, R.L.: Asymptotic stability of Toda lattice solitons. Nonlinearity 21, 2061–2071 (2008)
19. Kato, T.: On the Cauchy problem for the (generalized) Korteweg-de Vries equation, Studies in applied mathematics, Adv. Math. Suppl. Stud. 8, New York: Academic Press, 1983, pp. 93–128
20. Pego, R.L., Smereka, P., Weinstein, M.I.: Oscillatory instability of solitary waves in a continuum model of lattice vibrations. Nonlinearity 8, 921–941 (1995)
21. Pego, R.L., Weinstein, M.I.: Asymptotic stability of solitary waves. Commun. Math. Phys. 164, 305–349 (1994)
22. Promislow, K.: A renormalization method for modulational stability of quasi-steady patterns in dispersive systems. SIAM J. Math. Anal. 33, 1455–1482 (2002)
23. Quintero, J.R.: Nonlinear stability of a one-dimensional Boussinesq equation. J. Dynam. Diff. Eq. 15, 125–142 (2003)
24. Quintero, J.R., Pego, R.L.: Asymptotic stability of solitary waves in the Benney-Luke model of water waves. Unpublished manuscript
25. Shatah, J., Strauss, W.A.: Instability of nonlinear bound states. Commun. Math. Phys. 100, 173–190 (1985)
26. Toda, M.: Nonlinear waves and solitons. Mathematics and its Applications (Japanese Series) 5, Dordrecht: Kluwer Academic Publishers Group, Tokyo: SCIPRESS, 1989

Communicated by P. Constantin
Commun. Math. Phys. 288, 145–198 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0765-9
Communications in
Mathematical Physics
Global Wellposedness in the Energy Space for the Maxwell-Schrödinger System

Ioan Bejenaru1, Daniel Tataru2

1 Department of Mathematics, Texas A&M University, College Station, TX 77843-3368, USA. E-mail: [email protected]
2 Department of Mathematics, University of California, Berkeley, CA 94720, USA

Received: 19 December 2007 / Accepted: 22 December 2008
Published online: 13 March 2009 – © Springer-Verlag 2009
Abstract: We prove that the Maxwell-Schrödinger system in R3+1 is globally well-posed in the energy space. The key element of the proof is to obtain a short time wave packet parametrix for the magnetic Schrödinger equation, which leads to linear, bilinear and trilinear estimates. These, in turn, are extended to larger time scales via a bootstrap argument.
1. Introduction

The Maxwell-Schrödinger system in $\mathbb{R}^{3+1}$ describes the evolution of a charged nonrelativistic quantum mechanical particle interacting with the classical electro-magnetic field it generates. It has the form
\[
\begin{cases}
iu_t - \Delta_A u = \phi u, \\
-\Delta\phi + \partial_t \operatorname{div} A = \rho, & \rho = |u|^2, \\
\Box A + \nabla(\partial_t\phi + \operatorname{div} A) = J, & J = 2\operatorname{Im}(\bar u, \nabla_A u),
\end{cases} \tag{1}
\]
where $u$ is the wave function of the particle, $(\phi, A)$ is the electro-magnetic potential, $(u, A, \phi) : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{C} \times \mathbb{R} \times \mathbb{R}^3$ and $\nabla_A = \nabla - iA$, $\Delta_A = \nabla_A^2$. The system is invariant under the gauge transform
\[
(u, \phi, A) \to (e^{i\lambda}u,\ \phi - \partial_t\lambda,\ A + \nabla\lambda),
\]
where $\lambda : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}$. To remove this degree of freedom we need to fix the gauge. In this article we choose to work in the Coulomb gauge
\[
\operatorname{div} A = 0. \tag{2}
\]
Under this assumption, the system can be rewritten as
\[
iu_t - \Delta_A u = \phi u, \qquad \Box A = PJ, \tag{3}
\]
where $\phi = (-\Delta)^{-1}(|u|^2)$ and $P = 1 - \nabla\operatorname{div}\Delta^{-1}$ is the projection on the divergence free vector functions, also called the Helmholtz projection.

The first author was partially supported by NSF grant DMS0738442.
The second author was partially supported by NSF grant DMS0354539.

We consider the above system with a set of initial data chosen in Sobolev spaces:
\[
(u(0), A(0), A_t(0)) = (u_0, A_0, A_1) \in H^s \times H^\sigma \times H^{\sigma - 1}.
\]
The gauge condition (2) is conserved in time provided the initial data $(A_0, A_1)$ satisfies it, due to the form of the second equation in (3). The conserved quantities associated to the system are the charge and the energy,
\[
Q(u) = \int_{\mathbb{R}^3} |u|^2\,dx, \qquad
E(u) = \int_{\mathbb{R}^3} |\nabla_A u|^2 + \tfrac12\bigl(|A_t|^2 + |\nabla_x A|^2\bigr) + \tfrac12|\nabla\phi|^2\,dx.
\]
The local well-posedness of the system in various Sobolev spaces above the energy level is known, see [9,12]. On the other hand the existence of weak energy solutions is established in [2]. The main outstanding problem which we seek to address is the well-posedness in the energy space. Our result is

Theorem 1. The Maxwell-Schrödinger system (3) is globally well-posed in the energy space $H^1 \times H^1 \times L^2$ in the following sense:

i) (regular solutions) For each initial data $(u_0, A_0, A_1) \in H^2 \times H^2 \times H^1$ there exists a unique global solution $(u, A) \in C(\mathbb{R}, H^2) \times C(\mathbb{R}, H^2) \cap C^1(\mathbb{R}, H^1)$.

ii) (rough solutions) For each initial data $(u_0, A_0, A_1) \in H^1 \times H^1 \times L^2$ there exists a global solution $(u, A) \in C(\mathbb{R}, H^1) \times C(\mathbb{R}, H^1) \cap C^1(\mathbb{R}, L^2)$, which is the unique strong limit of the regular solutions in (i).

iii) (continuous dependence) The solutions $(u, A)$ in (ii) depend continuously on the initial data in $H^1 \times H^1 \times L^2$.
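The Helmholtz projection $P = 1 - \nabla\operatorname{div}\Delta^{-1}$ that appears in the Coulomb gauge formulation (3) acts on the Fourier side as multiplication by the matrix $I - \xi\xi^T/|\xi|^2$, since $\nabla$, $\operatorname{div}$ and $\Delta^{-1}$ have symbols $i\xi$, $i\xi\cdot$ and $-1/|\xi|^2$. The following sketch, which is an illustration rather than anything taken from the paper, checks at a single frequency that this matrix annihilates the direction $\xi$, is idempotent, and produces vectors orthogonal to $\xi$ (the Fourier form of being divergence free). The function names are hypothetical.

```python
def helmholtz_symbol(xi):
    """Return the 3x3 matrix I - xi xi^T / |xi|^2, the Fourier symbol of P."""
    norm_sq = sum(x * x for x in xi)
    if norm_sq == 0.0:
        raise ValueError("the symbol is not defined at xi = 0")
    return [[(1.0 if i == j else 0.0) - xi[i] * xi[j] / norm_sq for j in range(3)]
            for i in range(3)]


def apply_symbol(matrix, vec):
    """Matrix-vector product for a 3x3 matrix stored as nested lists."""
    return [sum(matrix[i][j] * vec[j] for j in range(3)) for i in range(3)]


def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    xi = [1.0, -2.0, 0.5]            # an arbitrary nonzero frequency
    v = [0.3, 0.7, -1.1]             # an arbitrary Fourier coefficient
    P = helmholtz_symbol(xi)
    Pv = apply_symbol(P, v)
    print("xi . Pv   =", dot(xi, Pv))            # divergence-free: close to 0
    print("P(Pv) - Pv =", [a - b for a, b in zip(apply_symbol(P, Pv), Pv)])
```

In particular, applying $P$ to the current $J$ removes its gradient part, which is consistent with the gauge condition (2) being preserved by the second equation in (3).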
We remark that in the process of proving the above results we establish some additional regularity properties for the energy solutions $(u, A)$ which suffice both for the uniqueness and the continuous dependence results. Traditionally these regularity properties are described using $X^{s,b}$ type spaces. Instead here we use the related $U^2$ and $V^2$ type spaces associated to both the wave equation and the magnetic Schrödinger equation. These are introduced in the next section; for more details we refer the reader to [3,6,7].

Remark 2. We note that in some directions our analysis yields stronger results than those stated in the theorem. Precisely, the same arguments as those in Sect. 5 also yield:

a) Local in time a-priori estimates in $H^\beta \times H^1 \times L^2$ for $\beta > \frac12$. This is exactly the range allowed for $\beta$ in Lemma 25 and Lemma 26 (a).

b) Local in time well-posedness in $H^\beta \times H^1 \times L^2$ for $\beta > \frac34$. This reduced range arises due to Lemma 26 (b).

The nonlinearities on the right-hand side of both equations in (3) are fairly mild. Indeed, if $\Delta_A$ were replaced by $\Delta$ then it would be quite straightforward to iteratively close the argument in $X^{s,b}$ or Strichartz spaces. For the magnetic potential $A$ it is quite reasonable to hope to obtain an $X^{s,b}$ type regularity. Thus the main difficulty stems from the linear magnetic Schrödinger equation
\[
iu_t - \Delta_A u = f, \qquad u(0) = u_0. \tag{4}
\]
The linear and bilinear estimates for $L^2$ solutions to (4) are summarized in Theorem 9 in Sect. 4. The rest of the section is devoted to the well-posedness of (4) in $H^2$, $H^{-2}$ and intermediate spaces.

The proof of our main result is completed in the following section. The first step is to establish local in time a-priori bounds for solutions to (3), first in $H^1$ and then in more regular spaces. This is done by treating the nonlinearities on the right of the equations in a perturbative manner. The transition from local in time to global in time is straightforward, using the conserved energy. The second step is to establish the continuous dependence on the initial data. This is a consequence of a Lipschitz dependence result in a weaker topology. Precisely, we show that the corresponding linearized equation is well-posed in $L^2 \times H^{\frac12} \times H^{-\frac12}$.

The rest of the paper is devoted to the study of $L^2$ solutions for (4), with the aim of proving Theorem 9. Previous approaches establish Strichartz estimates with a loss of derivatives for this equation in a perturbative manner, starting from the free Schrödinger equation. This no longer suffices for $A$ in the energy space, and instead one needs to study directly the dispersive properties for the linear magnetic Schrödinger equation.

Our approach uses some of the ideas described in [5] and [6]. To each dyadic frequency $\lambda$, we associate the time scale $\lambda^{-1}$. On this time scale we show that at frequency $\lambda$ Eq. (4) is well approximated by its paradifferential truncation, which is roughly
\[
iu_t - \Delta_{A_{<\sqrt\lambda}} u = f, \qquad u(0) = u_0. \tag{5}
\]
Following the ideas in [1,8], in Sects. 6 and 7 we obtain a wave packet parametrix for Eq. (5) on the $\lambda^{-1}$ time scale. This allows us to prove sharp Strichartz and square function estimates, as well as bilinear $L^2$ bounds and trilinear estimates.

In the last section we extend the linear, bilinear and trilinear estimates to larger time scales. A brute force summation of the short time bounds yields unacceptably large constants. Heuristically, the summation can be improved by taking advantage of the localized energy estimates for the magnetic Schrödinger equation. However these are not straightforward. Our idea is to obtain them from a weaker generalized wave packet decomposition where the localization scales in position and frequency are relaxed as the time scale is iteratively increased.

2. Definitions

Throughout the paper we use the standard Lebesgue spaces $L^q_x$ and mixed space-time versions $L^p_t L^q_x$ which are defined in the standard way. To measure regularity of functions at fixed time we use the standard Sobolev spaces $H^s_x$. Additional space-time structures will be defined in the next section.

We now introduce dyadic multipliers and a Littlewood-Paley decomposition in frequencies. Throughout the paper the letters $\lambda$, $\mu$, $\nu$ and $\gamma$ will be used to denote dyadic values, i.e. $\lambda = 2^i$ for some $i \in \mathbb{N}$. We say that a function $u$ is localized at frequency $\lambda$ if its Fourier transform is supported in the annulus $\{|\xi| \in [\frac{\lambda}{8}, 8\lambda]\}$ if $\lambda \ge 2$, respectively in the ball $\{|\xi| \le 8\}$ if $\lambda = 1$. By $S_\lambda$ we denote a multiplier with smooth symbol $s_\lambda(\xi)$ which is supported in the annulus $\{|\xi| \in [\frac{\lambda}{2}, 2\lambda]\}$ for $\lambda \ge 2$, respectively in the ball $\{|\xi| \le 2\}$ if $\lambda = 1$, and satisfies the bounds
\[
|\partial^\alpha s_\lambda(\xi)| \le c_\alpha\lambda^{-|\alpha|}. \tag{6}
\]
By $S_{<\lambda}$ we denote a multiplier with smooth symbol $s_{<\lambda}(\xi)$ which is supported in the ball $\{|\xi| \le 2\lambda\}$, equals 1 in the ball $\{|\xi| \le \lambda/2\}$ and satisfies (6). All implicit constants in the estimates involving $S_\lambda$, $S_{<\lambda}$ will depend on finitely many seminorms of their symbols, i.e. on $c_\alpha$ for $|\alpha| \le N$, for some large $N$. Associated to each $\lambda$ we also consider $\tilde S_\lambda$ to be a multiplier whose symbol $\tilde s_\lambda$ satisfies (6), is supported in $\{|\xi| \in [\lambda/4, 4\lambda]\}$ and equals 1 in the annulus $\{|\xi| \in [\lambda/2, 2\lambda]\}$. The last condition implies that $\tilde S_\lambda S_\lambda = S_\lambda$. Similarly we consider $\tilde{\tilde S}_\lambda$ to be a multiplier whose symbol satisfies (6), is supported in $\{|\xi| \in [\lambda/8, 8\lambda]\}$ and equals 1 in the annulus $\{|\xi| \in [\lambda/4, 4\lambda]\}$.

3. $V^2$ and $U^2$ Type Spaces

Let $H$ be a Hilbert space. Let $V^2H$ be the space of $H$ valued functions on $\mathbb{R}$ with bounded 2-variation
$$ \|u\|_{V^2H}^2 = \sup_{(t_i) \in \mathcal{T}} \sum_i \|u(t_{i+1}) - u(t_i)\|_H^2, $$
where $\mathcal{T}$ is the set of finite increasing sequences in $\mathbb{R}$. The functions in $V^2H$ have lateral limits everywhere and at most countably many discontinuities. By $V^{2,rc}H$ we denote the closed subspace of right continuous functions in $V^2H$. We note that to each $V^2H$ function we can uniquely associate a $V^{2,rc}H$ function so that the two coincide except at countably many points.
Let $U^2H$ be the atomic space defined by the atoms
$$ u = \sum_i h_i \chi_{[t_i,t_{i+1})}, \qquad \sum_i \|h_i\|_H^2 = 1 $$
for some $(t_i) \in \mathcal{T}$. We have the inclusion $U^2H \subset V^2H$, but in effect these spaces are very close, and also close to the homogeneous Sobolev space $\dot H^{1/2}$. Precisely, we can bracket them using homogeneous Besov spaces as follows:
$$ \dot B^{1/2}_{2,1}(\mathbb{R}, H) \subset U^2H \subset V^2H \subset \dot B^{1/2}_{2,\infty}(\mathbb{R}, H). \tag{7} $$
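As an illustration of the two definitions above, the following minimal sketch computes the squared 2-variation of a scalar step function (so it assumes $H = \mathbb{R}$, and the function name is ours, not the paper's). For a step function only the order of the values matters, so the supremum over time sequences becomes a maximum over subsequences of values. The sketch also assumes that an atom vanishes outside its partition, which is modelled by padding with zeros.

```python
from typing import Sequence


def two_variation_sq(values: Sequence[float]) -> float:
    """Squared 2-variation of a scalar step function with the given ordered values.

    The supremum over finite increasing time sequences reduces to a maximum
    over subsequences of `values`; it is computed by dynamic programming.
    """
    n = len(values)
    # best[j] = largest sum of squared jumps over subsequences ending at index j
    best = [0.0] * n
    for j in range(n):
        for i in range(j):
            jump = (values[j] - values[i]) ** 2
            best[j] = max(best[j], best[i] + jump)
    return max(best, default=0.0)


# A U^2 atom with two steps h = (0.6, -0.8), so that sum h_i^2 = 1,
# padded with zeros before and after the partition (an assumption).
atom_values = [0.0, 0.6, -0.8, 0.0]
atom_variation_sq = two_variation_sq(atom_values)
```

Each squared jump is at most twice the sum of the squares of the two values involved, and each value enters at most two jumps, so for such an atom the result never exceeds 4. This is the elementary reason behind the inclusion $U^2H \subset V^2H$.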
We denote by $DU^2H$ the space of (distributional) derivatives of $U^2H$ functions, and note the embedding $L^1(\mathbb{R}, H) \subset DU^2H$. There is also a duality relation between $V^2H$ and $U^2H$, namely
$$ (DU^2H)^* = V^2H. \tag{8} $$
Restricted to $DU^2H$ functions which belong to $L^1(\mathbb{R}, H)$ and $V^2$ functions which vanish at $-\infty$, this duality coincides with the standard $L^2$ pairing. For more details on the $U^2$ and $V^2$ spaces we refer the reader to [7,3]. The above duality relation is explained in detail in [3, Prop. 2.7–2.10].

Given an abstract evolution in $H$,
$$ iu_t = B(t)u, \qquad u(0) = u_0, $$
which generates a family of bounded evolution operators
$$ S_B(t,s) : H \to H, \qquad t, s \in \mathbb{R}, $$
we can define the associated spaces $U_B^2H$, $V_B^2H$ and $DU_B^2H$ by
$$ \|u\|_{V_B^2H} = \|S_B(0,t)u(t)\|_{V^2H}, \qquad \|u\|_{U_B^2H} = \|S_B(0,t)u(t)\|_{U^2H}, $$
respectively
$$ DU_B^2H = \{(i\partial_t - B)u;\ u \in U_B^2H\}. $$
On occasion we need to compare the above spaces associated to closely related operators. For this we use the following

Lemma 3. There exists $\epsilon_0 > 0$ so that for all $0 < \epsilon < \epsilon_0$ the following holds: Let $H$ be a Hilbert space and $B(t)$, $C(t)$ two families of bounded selfadjoint operators in $H$ so that
$$ \|(B - C)u\|_{DU_B^2H} \le \epsilon \|u\|_{U_B^2H}. $$
Then
$$ \|u\|_{U_B^2H} \approx \|u\|_{U_C^2H}, \qquad \|u\|_{V_B^2H} \approx \|u\|_{V_C^2H}, \qquad \|f\|_{DU_B^2H} \approx \|f\|_{DU_C^2H}. $$
We remark that both $\epsilon_0$ and the implicit constants above are universal, and in particular they do not depend on the operator bounds for $B$ and $C$.
Proof. By conjugating with respect to the $B$ flow we can assume without any restriction in generality that $B = 0$. Then we can solve the equation
$$ iu_t - Cu = 0, \qquad u(0) = u_0 $$
by treating $C$ perturbatively to obtain a solution
$$ u = u_0 + u_1(t), \qquad \|u_1\|_{U^2H} \lesssim \epsilon \|u_0\|_H. $$
Applying this to each step in $U_C^2$ atoms we obtain
$$ \|u\|_{U^2H} \lesssim \|u\|_{U_C^2H} $$
for arbitrary $u$. For the converse, applying the above result to each step in a $U^2H$ atom we conclude that for each $u \in U^2H$ we can find $u_1 \in U^2H$, again of size $O(\epsilon)$ relative to $u$, so that
$$ \|u + u_1\|_{U_C^2H} + \|u_1\|_{U^2H} \lesssim \|u\|_{U^2H}. $$
Iterating this shows that
$$ \|u\|_{U_C^2H} \lesssim \|u\|_{U^2H}. $$
Hence $U_C^2H = U^2H$.

Consider now $f \in DU_C^2H$. Then $f = iu_t - Cu$ for some $u \in U_C^2H = U^2H$. Since $C$ maps $U^2H$ to $DU^2H$ this implies that $f \in DU^2H$. Conversely, if $f \in DU^2H$ then we can solve the inhomogeneous equation
$$ iu_t - Cu = f, \qquad u(0) = 0 $$
iteratively in $U^2H$. This gives a solution $u \in U^2H = U_C^2H$, therefore $f \in DU_C^2H$. We have proved that $DU^2H = DU_C^2H$. Then the last relation $V^2H = V_C^2H$ follows by duality.

Following the above procedure we can associate similar spaces to the Schrödinger flow by pulling back functions to time 0 along the flow, namely
$$ \|u\|_{V_\Delta^2L^2} = \|e^{it\Delta}u\|_{V^2L^2}, \qquad \|u\|_{U_\Delta^2L^2} = \|e^{it\Delta}u\|_{U^2L^2}. $$
The magnetic Schrödinger equation has time dependent coefficients, so we replace the above exponential with the corresponding evolution operators. We denote by $S_A(t,s)$ the family of evolution operators corresponding to Eq. (4). These are $L^2$ isometries. Then we define
$$ \|u\|_{V_A^2L^2} = \|S_A(0,t)u(t)\|_{V^2L^2}, \qquad \|u\|_{U_A^2L^2} = \|S_A(0,t)u(t)\|_{U^2L^2}. $$
These spaces turn out to be a good replacement for the $X^{0,1/2}$ space associated to the Schrödinger equation. We also define
$$ DU_A^2L^2 = \{(i\partial_t - \Delta_A)u;\ u \in U_A^2L^2\}. $$
By (8) we have the duality relation
$$ (DU_A^2L^2)^* = V_A^2L^2. $$
When solving Eq. (4) we let $f \in DU_A^2L^2$, and we have the straightforward bound
$$ \|u\|_{U_A^2L^2} \lesssim \|u_0\|_{L^2} + \|f\|_{DU_A^2L^2}. \tag{9} $$
In our study of nonlinear equations later on we need to estimate multilinear expressions in $DU_A^2L^2$. By duality, this is always turned into multilinear estimates involving $V_A^2L^2$ functions.

Finally, we define similar spaces associated to the wave equation. The wave equation is second order in time, therefore we use a half-wave decomposition and set
$$ \|u\|_{V_\pm^2L^2} = \|e^{\pm it|D|}u\|_{V^2L^2}, \qquad \|u\|_{U_\pm^2L^2} = \|e^{\pm it|D|}u\|_{U^2L^2}. $$
Then the spaces for the full wave equation are
$$ \|u\|_{U_W^2L^2} = \|u\|_{U_+^2L^2 + U_-^2L^2}, \qquad \|u\|_{V_W^2L^2} = \|u\|_{V_+^2L^2 + V_-^2L^2}. $$
For the inhomogeneous term in the wave equation we use the space $DU_W^2L^2$ with norm
$$ \|f\|_{DU_W^2L^2} = \|f\|_{DU_+^2L^2 \cap DU_-^2L^2}. $$
Then to solve the inhomogeneous wave equation we use
$$ \|\nabla u\|_{U_W^2L^2} \lesssim \|\nabla u(0)\|_{L^2} + \|\Box u\|_{DU_W^2L^2}. $$
Similarly we set
$$ \|u\|_{U_W^2H^s} = \|\langle D_x\rangle^s u\|_{U_+^2L^2 + U_-^2L^2}, \qquad \|u\|_{V_W^2H^s} = \|\langle D_x\rangle^s u\|_{V_+^2L^2 + V_-^2L^2} $$
and
$$ \|f\|_{DU_W^2H^s} = \|\langle D_x\rangle^s f\|_{DU_W^2L^2}. $$
Such spaces originate in unpublished work of the second author on the wave-map equation, and have been successfully used in various contexts so far, see [1,3,6,7]. The Strichartz estimates for the wave equation turn into embeddings for the $U_W^2H^s$ spaces. If the indices $(p,q)$ satisfy
$$ \frac{1}{p} + \frac{1}{q} = \frac{1}{2}, \qquad 2 < p \le \infty, \tag{10} $$
then we have
$$ \|u\|_{L^pL^q} \lesssim \|u\|_{V_W^2H^{2/p}}. \tag{11} $$
If we consider frequency localized solutions to the wave equation on a very small time scale, then the wave equation is ineffective. Precisely,

Lemma 4. Let $B_{<\lambda}$ be a function which is localized at frequency $< \lambda$. Then
$$ \|B_{<\lambda}\|_{U_W^2(I;L^2)} \approx \|B_{<\lambda}\|_{U^2(I;L^2)}, \qquad |I| \le \lambda^{-1}. \tag{12} $$
Proof. This follows from the similar bound for the corresponding half-wave spaces $U_\pm^2L^2$, and by Lemma 3 it is a consequence of the fact that for a short time the spatial derivatives in the half wave equation can be treated perturbatively,
$$ \||D_x|B_{<\lambda}\|_{DU^2(I,L^2)} \lesssim \||D_x|B_{<\lambda}\|_{L^1(I,L^2)} \lesssim \lambda\|B_{<\lambda}\|_{L^1(I,L^2)} \lesssim |I|\lambda\|B_{<\lambda}\|_{L^\infty(I,L^2)} \lesssim |I|\lambda\|B_{<\lambda}\|_{U^2(I,L^2)}. $$

The finite speed of propagation for the wave equation allows us to spatially localize functions in the $U_W^2L^2$ spaces. For $R > 0$ we consider a covering $(Q_i^R)_{i\in\mathbb{Z}^3}$ of $\mathbb{R}^3$ with cubes of size $R$. Let $\chi_i^R$ be an associated smooth partition of unity. Then we have the following result:

Lemma 5. Assume that $I$ is a time interval with $|I| \le R$. Then
$$ \sum_{i\in\mathbb{Z}^3} \|\chi_i^R u\|_{U_W^2(I;L^2)}^2 \lesssim \|u\|_{U_W^2(I;L^2)}^2. \tag{13} $$
Proof. By rescaling we can take $R = 1$. Without any restriction in generality we can also assume that $|I| = 1$. We prove that the result holds for one of the two half-wave spaces, say $U_+^2(I; L^2)$. It is enough to verify (13) for atoms, and further for each step in an atom. Hence we can assume that $u$ solves the half wave equation $(i\partial_t + |D_x|)u = 0$. Then we have
$$ (i\partial_t + |D_x|)(\chi_i^R u) = [|D_x|, \chi_i^R]u. $$
By standard commutator estimates we have at fixed time
$$ \sum_{i\in\mathbb{Z}^3} \|[|D_x|, \chi_i^R]u(t)\|_{L^2}^2 \lesssim \|u(t)\|_{L^2}^2. $$
Then
$$ \sum_{i\in\mathbb{Z}^3} \|\chi_i^R u\|_{U_W^2(I;L^2)}^2 \lesssim \sum_{i\in\mathbb{Z}^3} \Big( \|\chi_i^R u(0)\|_{L^2}^2 + \|[|D_x|, \chi_i^R]u(t)\|_{L^1L^2}^2 \Big) \lesssim \|u(0)\|_{L^2}^2. $$

Due to the atomic structure, in many estimates it is convenient to work with the $U^2$ type spaces instead of $V^2$. In order to transfer the estimates from $U^2$ to $V^2$ we use the following result from [3, Prop. 2.5, 2.17]:

Proposition 6. Let $2 < p < \infty$. If $u \in V^{2,rc}H$ then for each $0 < \epsilon < 1$ there exist $u_1 \in U^2H$ and $u_2 \in U^pH$ such that $u = u_1 + u_2$ and
$$ |\ln\epsilon|^{-1}\|u_1\|_{U^2H} + \epsilon^{-1}\|u_2\|_{U^pH} \lesssim \|u\|_{V^2H}. \tag{14} $$
Here $U^p$ is defined in the same manner as $U^2$ but with the $l^2$ summation replaced by an $l^p$ summation. One way we use this result is as follows:

Corollary 7. Let $\epsilon > 0$ and $N$ arbitrarily large. Then
$$ V_W^2H^s \subset U_W^2H^{s-\epsilon} + L^\infty H^N. $$

Following is another example of how this result can be applied. Typically in our analysis we prove dyadic trilinear estimates of the form
$$ \left| \int_I\int_{\mathbb{R}^3} S_{\lambda_1}u\,\overline{S_{\lambda_2}v}\,S_{\lambda_3}B\,dxdt \right| \lesssim C_1(|I|, \lambda_{123})\|u\|_{U_A^2L^2}\|v\|_{U_A^2L^2}\|B\|_{U_W^2L^2}, \tag{15} $$
where $\lambda_{123}$ stands for the triplet $\{\lambda_1, \lambda_2, \lambda_3\}$. What we need instead is an estimate with one $U^2$ replaced by a $V^2$, say
$$ \left| \int_I\int_{\mathbb{R}^3} S_{\lambda_1}u\,\overline{S_{\lambda_2}v}\,S_{\lambda_3}B\,dxdt \right| \lesssim C_2(|I|, \lambda_{123})\|u\|_{U_A^2L^2}\|v\|_{U_A^2L^2}\|B\|_{V_W^2L^2}. \tag{16} $$
Denoting $\lambda_{\max} = \max\{\lambda_1, \lambda_2, \lambda_3\}$, due to Proposition 6 we can easily show that

Lemma 8. Assume (15) holds and $|I| \le 1$. Then (16) holds with
$$ C_2(|I|, \lambda_{123}) = C_1(|I|, \lambda_{123})\ln\lambda_{\max}. $$
The same holds if the $V^2$ structure is placed on any of the other two factors in (16).

Proof. Without any restriction in generality we assume that $\lambda_1$, $\lambda_2$ and $\lambda_3$ are so that the integral in (15) is nontrivial. Taking $u$, $v$ and $B$ to be time independent frequency localized bump functions we easily see that
$$ C_1(|I|, \lambda_{123}) \gtrsim |I|\lambda_{\max}^{-N} \tag{17} $$
for some sufficiently large $N$. For each $0 < \epsilon \le 1$ we decompose $B = B_1 + B_2$ as in Proposition 6. For $B_1$ we use (15) while for $B_2$ we use Bernstein's inequality to estimate
$$ \left| \int_I\int_{\mathbb{R}^3} S_{\lambda_1}u\,\overline{S_{\lambda_2}v}\,S_{\lambda_3}B_2\,dxdt \right| \lesssim |I|\lambda_{\max}^N\|u\|_{L^\infty L^2}\|v\|_{L^\infty L^2}\|B_2\|_{L^\infty L^2} \lesssim |I|\lambda_{\max}^N\|u\|_{U_A^2L^2}\|v\|_{U_A^2L^2}\|B_2\|_{U_W^pL^2}. $$
Adding the $B_1$ and the $B_2$ bounds gives
$$ C_2(|I|, \lambda_{123}) \lesssim |\ln\epsilon|\,C_1(|I|, \lambda_{123}) + \epsilon|I|\lambda_{\max}^N. $$
We set $\epsilon = \lambda_{\max}^{-2N}$. Then the conclusion of the lemma follows due to (17).
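To spell out the last step of the proof of Lemma 8, the choice of the small parameter (written $\epsilon$ here) can be tracked as follows. This is only a sketch of the arithmetic; it assumes $\lambda_{\max} \ge 2$, so that $\ln\lambda_{\max} \gtrsim 1$, and it lets the implicit constant depend on $N$.

```latex
\begin{align*}
\epsilon = \lambda_{\max}^{-2N}
  \;&\Longrightarrow\; |\ln\epsilon| = 2N\ln\lambda_{\max}, \\
\epsilon\,|I|\,\lambda_{\max}^{N}
  &= |I|\,\lambda_{\max}^{-N} \lesssim C_1(|I|,\lambda_{123})
  \quad\text{by (17)}, \\
C_2(|I|,\lambda_{123})
  &\lesssim \bigl(2N\ln\lambda_{\max} + 1\bigr)\,C_1(|I|,\lambda_{123})
  \lesssim_N C_1(|I|,\lambda_{123})\,\ln\lambda_{\max}.
\end{align*}
```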
4. The Linear Magnetic Schrödinger Equation

In this section we summarize the key properties of solutions to the homogeneous and inhomogeneous linear magnetic Schrödinger equation
$$ iu_t - \Delta_A u = 0, \qquad u(0) = u_0, \tag{18} $$
$$ iu_t - \Delta_A u = f, \qquad u(0) = u_0 \tag{19} $$
in $L^2$, and we use them in order to show that the above equation is also well-posed in $H^2$, $H^{-2}$ and in intermediate spaces.

We assume that $A \in U_W^2H^1$ with $\nabla\cdot A = 0$. All constants in the estimates depend on the $U_W^2H^1$ norm of $A$, which is why we introduce the notation $X \lesssim_A Y$, which means
$$ X \le C(\|A\|_{U_W^2H^1})\,Y. $$
The trilinear estimates are concerned with integrals of the form
$$ I^T_{\lambda_1,\lambda_2,\lambda_3}(u, v, B) = \int_0^T\int_{\mathbb{R}^3} S_{\lambda_1}u\,\overline{S_{\lambda_2}v}\,S_{\lambda_3}B\,dxdt, $$
where $u$ and $v$ are associated to the magnetic Schrödinger equation and $B$ is associated to the wave equation. In order for the above integral to be nontrivial the two highest frequencies need to be comparable. Thus by a slight abuse of notation in the sequel we restrict ourselves to the case
$$ \{\lambda_1, \lambda_2, \lambda_3\} = \{\lambda, \lambda, \mu\}, \qquad \mu \le \lambda. $$
With these notations we have

Theorem 9. For each $A \in U_W^2H^1$ with $\nabla\cdot A = 0$ Eq. (18) is well-posed in $L^2$. For each $\epsilon > 0$ there exists $\delta > 0$ so that the following properties hold for $0 < T \le 1$:
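(i) Strichartz estimates:
$$ \|S_\lambda u\|_{L^p(0,T;L^q)} \lesssim_A T^{\delta/p}\lambda^{1/p}\|u\|_{U_A^2L^2}, \qquad \frac{2}{p} + \frac{3}{q} = \frac{3}{2}, \quad 2 \le p \le \infty. \tag{20} $$
(ii) Local energy estimates. For any spatial cube $Q$ of size 1 we have
$$ \|S_\lambda u\|_{L^2([0,T]\times Q)} \lesssim_A T^\delta\lambda^{-1/2+\epsilon}\|u\|_{U_A^2L^2}. \tag{21} $$
(iii) Local Strichartz estimates. For any spatial cube $Q$ of size 1 we have
$$ \|S_\lambda u\|_{L^2(0,T;L^6(Q))} \lesssim_A T^\delta\lambda^{\epsilon}\|u\|_{U_A^2L^2}. \tag{22} $$
(iv) Trilinear estimates. For any $0 < T \le 1$ and $\mu \le \lambda$ we have
$$ |I^T_{\lambda,\lambda,\mu}(u, v, B)| \lesssim_A T^\delta\lambda^{\epsilon}\min(1, \mu\lambda^{-1/2})\,\|u\|_{U_A^2L^2}\|v\|_{U_A^2L^2}\|B\|_{U_W^2L^2}. \tag{23} $$
On the other hand, if $\mu \ll \lambda$ then
$$ |I^T_{\lambda,\mu,\lambda}(u, v, B)| \lesssim_A T^\delta\lambda^{-1/2+\epsilon}\mu^{1/2}\,\|u\|_{U_A^2L^2}\|v\|_{U_A^2L^2}\|B\|_{U_W^2L^2}. \tag{24} $$
The proof of this theorem is quite involved and is relegated to Sects. 6–10.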
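The exponent relations in (20) and (10) are easy to tabulate. The sketch below (function names are ours and purely illustrative) computes the admissible $q$ for a given $p$ in exact rational arithmetic, together with the power of $\lambda$ lost when Bernstein's inequality is used to pass from $L^2$ to $L^q$ in three space dimensions, which is $3(1/2 - 1/q)$.

```python
from fractions import Fraction


def _inv(p):
    """Return 1/p as a Fraction, with p=None standing for p = infinity."""
    return Fraction(0) if p is None else 1 / Fraction(p)


def schrodinger_q(p=None):
    """q solving 2/p + 3/q = 3/2 for 2 <= p <= infinity, as in (20)."""
    inv_p = _inv(p)
    if inv_p > Fraction(1, 2):
        raise ValueError("need 2 <= p <= infinity")
    return 1 / ((Fraction(3, 2) - 2 * inv_p) / 3)


def wave_q(p=None):
    """q solving 1/p + 1/q = 1/2 for 2 < p <= infinity, as in (10)."""
    inv_p = _inv(p)
    if inv_p >= Fraction(1, 2):
        raise ValueError("need 2 < p <= infinity")
    return 1 / (Fraction(1, 2) - inv_p)


def bernstein_loss(q):
    """Power of lambda lost going from L^2 to L^q at frequency lambda in 3D."""
    return 3 * (Fraction(1, 2) - 1 / Fraction(q))


# Endpoint and intermediate Schrodinger pairs: (2, 6), (4, 3), (infinity, 2).
pairs = [(p, schrodinger_q(p)) for p in (2, 4, None)]
```

For a Schrödinger pair the Bernstein loss equals $2/p$, which is the exponent relation used in the proof of Lemma 13 below.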
The smallness given by the $T^\delta$ factor is needed in several proofs which use either the contraction principle or bootstrap arguments. However, this factor is nontrivial only in (20). Indeed, we have

Remark 10. Assume that the conclusion of Theorem 9 holds without the $T^\delta$ factor in (21), (22), (23) and (24). Then the conclusion of Theorem 9 holds in full.

Proof. For (21) we observe that
$$ \|S_\lambda u\|_{L^2([0,T]\times Q)} \lesssim T^{1/2}\|u\|_{L^\infty L^2} \lesssim T^{1/2}\|u\|_{U_A^2L^2}. $$
Interpolating this with (21) without the $T^\delta$ factor, i.e. taking a geometric mean of the two bounds with a small weight on the first at the cost of slightly enlarging $\epsilon$, yields (21) with a $T^\delta$ factor. By Bernstein's inequality the same argument works for (22). For (23) we can also write the obvious estimate
$$ |I^T_{\lambda,\lambda,\mu}(u, v, B)| \lesssim T\mu^{3/2}\|u\|_{L^\infty L^2}\|v\|_{L^\infty L^2}\|B\|_{L^\infty L^2}, $$
which is then interpolated with (23) without the $T^\delta$ factor. The same argument applies for (24).

Next we turn our attention to the $H^2$ and $H^{-2}$ well-posedness for (18). We make the transition from $L^2$ to $H^2$ and $H^{-2}$ using the coercive elliptic operator $1 - \Delta_A$. Its properties are summarized in the following

Lemma 11. For each $0 \le s \le 2$ the operator $1 - \Delta_A$ is a diffeomorphism
$$ 1 - \Delta_A : H^s \to H^{s-2} $$
which depends continuously on $A \in H^1$.

The proof uses standard elliptic arguments and is left for the reader. Using the above operator we define the spaces $\tilde U_A^2H^2$, $\tilde V_A^2H^2$, respectively $D\tilde U_A^2H^2$ by
$$ \|u\|_{\tilde U_A^2H^2} = \|(1 - \Delta_A)u\|_{U_A^2L^2}, \qquad \|u\|_{\tilde V_A^2H^2} = \|(1 - \Delta_A)u\|_{V_A^2L^2}, $$
respectively
$$ \|f\|_{D\tilde U_A^2H^2} = \|(1 - \Delta_A)f\|_{DU_A^2L^2}. $$

Remark 12. The reason we use the $\tilde U$, $\tilde V$ notation above is to differentiate these spaces from the $U_A^2H^2$, $V_A^2H^2$, $DU_A^2H^2$ spaces which should be defined as in the previous section, with respect to the $H^2$ flow of (18). This is not possible at this point, as we have not yet proved that (18) is well-posed in $H^2$. However, after we do so we will prove that the above two sets of norms are equivalent. After that the $\tilde U$, $\tilde V$ notation is dropped.

We can transfer the estimates from Theorem 9 to the $\tilde U_A^2H^2$ spaces by making an elliptic transition between $\Delta_A$ and $\Delta$:

Lemma 13. Let $A \in U_W^2H^1$ with $\nabla\cdot A = 0$. Then for each $\epsilon > 0$ there exists $\delta > 0$ so that the following properties hold:
(i) For $p$, $q$ as in (20) we have the Strichartz estimate
$$ \|S_\lambda u\|_{L^p(0,T;L^q)} \lesssim_A T^{\delta/p}\lambda^{-2+1/p}\|u\|_{\tilde U_A^2H^2}. \tag{25} $$
(ii) Elliptic representation. Each $u \in \tilde U_A^2H^2$ can be expressed as
$$ u = (1 - \Delta)^{-1}(u_e + u_r), \qquad \|u_e\|_{U_A^2L^2} + T^{-\delta}\|u_r\|_{L^2(0,T;H^{1-\epsilon})} \lesssim_A \|u\|_{\tilde U_A^2H^2}. \tag{26} $$
(iii) Local energy estimates. For any spatial cube $Q$ of size 1 we have:
$$ \|S_\lambda u\|_{L^2([0,T]\times Q)} \lesssim_A T^\delta\lambda^{-2-1/2+\epsilon}\|u\|_{\tilde U_A^2H^2}. \tag{27} $$
(iv) Local Strichartz estimates. For any spatial cube $Q$ of size 1 we have:
$$ \|S_\lambda u\|_{L^2(0,T;L^6(Q))} \lesssim_A T^\delta\lambda^{-2+\epsilon}\|u\|_{\tilde U_A^2H^2}. \tag{28} $$
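Proof.
(i) The Strichartz estimate (25) follows from
$$ \|S_\lambda(1 - \Delta)u\|_{L^pL^q} \lesssim_A T^{\delta/p}\lambda^{1/p}\|u\|_{\tilde U_A^2H^2}. \tag{29} $$
We use the identity
$$ (1 - \Delta)u = (1 - \Delta_A)u - 2iA\nabla u - A^2u = (1 - \Delta_A)u - R_A(u) \tag{30} $$
and estimate each of the three terms. From the definition of $\tilde U_A^2H^2$ and (20) we have:
$$ \|S_\lambda(1 - \Delta_A)u\|_{L^pL^q} \lesssim_A T^{\delta/p}\lambda^{1/p}\|u\|_{\tilde U_A^2H^2}. $$
For the second term we use Bernstein's inequality and the exponent relation in (20) to estimate
$$ \|S_\lambda(A\nabla u)\|_{L^pL^q} \lesssim_A T^{1/p}\lambda^{2/p}\|S_\lambda(A\nabla u)\|_{L^\infty L^2} \lesssim T^{1/p}\lambda^{1/p}\|A\nabla u\|_{L^\infty H^{1/2}} \lesssim T^{1/p}\lambda^{1/p}\|A\|_{L^\infty H^1}\|u\|_{L^\infty H^2}. $$
Similarly for the last term we obtain
$$ \|S_\lambda(A^2u)\|_{L^pL^q} \lesssim T^{1/p}\lambda^{1/p}\|A\|_{L^\infty H^1}^2\|u\|_{L^\infty H^2}. $$
This concludes the proof for (29), which implies (25).

(ii) By (30) we can set
$$ u_e = (1 - \Delta_A)u, \qquad u_r = -R_A(u) = -2iA\nabla u - A^2u. $$
Hence it remains to prove that
$$ \|S_\lambda(A\nabla u)\|_{L^2} + \|S_\lambda(A^2u)\|_{L^2} \lesssim_A T^\delta\lambda^{-1+\epsilon}\|u\|_{\tilde U_A^2H^2}. \tag{31} $$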
By the argument in Remark 10, here and for the rest of the proof of the lemma we can neglect the $T^\delta$ factors. We decompose the expression $S_\lambda(A\nabla u)$ as
$$ S_\lambda(A\nabla u) = S_\lambda\Big( \sum_{\gamma\ll\lambda} S_\gamma A\,S_\lambda\nabla u + \sum_{\gamma\ll\lambda} S_\lambda A\,S_\gamma\nabla u + \sum_{\gamma\gtrsim\lambda} S_\gamma A\,S_\gamma\nabla u \Big). \tag{32} $$
For exponents $(p,q)$ satisfying (10) we invoke the Strichartz estimates (11) for the wave equation. By Bernstein's inequality and (25) we can derive a similar bound for $u$,
$$ \|S_\gamma\nabla u\|_{L^qL^p} \lesssim_A \gamma^{-2/p}\|u\|_{\tilde U_A^2H^2}. \tag{33} $$
The $L^2$ bound for the product is obtained by multiplying the last two inequalities. For the first term in (32) we take $q$ close to $\infty$, for the second we take $p = \infty$, while for the third any choice will do.

For the expression $S_\lambda(A^2u)$ we take a triple Littlewood-Paley decomposition,
$$ S_\lambda(A^2u) = \sum_{\lambda_1,\lambda_2,\lambda_3} S_\lambda(A_{\lambda_1}A_{\lambda_2}u_{\lambda_3}). $$
Then we must have $\lambda_{\max} \ge \lambda$. If $\lambda_{\max} = \lambda_3$ then we use the above Strichartz inequalities and Bernstein's inequality to estimate the triple product as
$$ \|A_{\lambda_1}A_{\lambda_2}u_{\lambda_3}\|_{L^2} \lesssim \|A_{\lambda_1}\|_{L^4L^\infty}\|A_{\lambda_2}\|_{L^4L^\infty}\|u_{\lambda_3}\|_{L^\infty L^2} \lesssim_A \lambda_1^{1/4}\lambda_2^{1/4}\lambda_3^{-2}\|A_{\lambda_1}\|_{U_W^2H^1}\|A_{\lambda_2}\|_{U_W^2H^1}\|u_{\lambda_3}\|_{\tilde U_A^2H^2}. $$
The summation with respect to $\lambda_1$, $\lambda_2$, $\lambda_3$ is straightforward. If $\lambda_3 \ll \lambda_{\max}$ then there are two possibilities. One is $\lambda_{\max} = \lambda$, in which case we assume w.a.r.g. that $\lambda = \lambda_1 \ge \lambda_2, \lambda_3$ and estimate
$$ \|A_{\lambda_1}A_{\lambda_2}u_{\lambda_3}\|_{L^2} \lesssim \|A_{\lambda}\|_{L^pL^q}\|A_{\lambda_2}\|_{L^qL^p}\|u_{\lambda_3}\|_{L^\infty} \lesssim_A \lambda_1^{-2/p}\lambda_2^{-2/q}\lambda_3^{-1/2}\|A_{\lambda_1}\|_{U_W^2H^1}\|A_{\lambda_2}\|_{U_W^2H^1}\|u_{\lambda_3}\|_{\tilde U_A^2H^2} $$
with $q$ close to infinity. The other possibility is $\lambda_{\max} \gg \lambda$, in which case we must have $\lambda_1 = \lambda_2 \gg \lambda_3$. Then we estimate the triple product as above, but the choice of $p$ and $q$ is no longer important. The proof of (31) is concluded.

(iii) We use the representation in (26). The bound for $u_r$ holds without any localization. For $u_e$ we can write
$$ S_\lambda(1 - \Delta)^{-1}u_e = \lambda^{-2}\big(\lambda^2(1 - \Delta)^{-1}\tilde S_\lambda\big)S_\lambda u_e, $$
where the symbol of $\tilde S_\lambda$ is still supported at frequency $\lambda$ but equals 1 in the support of the symbol of $S_\lambda$. The operator $\lambda^2\Delta^{-1}\tilde S_\lambda$ is a unit mollifier acting on the $\lambda^{-1}$ scale, therefore it is bounded in $l^\infty_Q L^2([0,1]\times Q)$.

(iv) We use the representation in (26). By Bernstein's inequality the bound for $u_r$ holds without any localization. For $u_e$ we argue as above.
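The identity (30) used in the proof above can be checked by a direct expansion. The sketch below assumes the convention $\Delta_A = (\nabla - iA)^2$ for the magnetic Laplacian, which is not restated in this section but is the sign choice consistent with (30), and it uses the gauge condition $\nabla\cdot A = 0$.

```latex
\begin{align*}
\Delta_A u = (\nabla - iA)\cdot(\nabla - iA)u
  &= \Delta u - i\,\nabla\cdot(Au) - i\,A\cdot\nabla u - |A|^2 u \\
  &= \Delta u - i\,(\nabla\cdot A)\,u - 2i\,A\cdot\nabla u - |A|^2 u \\
  &= \Delta u - 2i\,A\cdot\nabla u - |A|^2 u
  \qquad (\nabla\cdot A = 0), \\[4pt]
(1-\Delta)u
  &= (1-\Delta_A)u - 2i\,A\cdot\nabla u - |A|^2 u .
\end{align*}
```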
Next we define similar spaces $\tilde U_A^2H^{-2}$, $\tilde V_A^2H^{-2}$, respectively $D\tilde U_A^2H^{-2}$ in a manner similar to the $H^2$ case, namely
$$ \|u\|_{\tilde U_A^2H^{-2}} = \|(1 - \Delta_A)^{-1}u\|_{U_A^2L^2}, \qquad \|u\|_{\tilde V_A^2H^{-2}} = \|(1 - \Delta_A)^{-1}u\|_{V_A^2L^2}, $$
respectively
$$ \|f\|_{D\tilde U_A^2H^{-2}} = \|(1 - \Delta_A)^{-1}f\|_{DU_A^2L^2}. $$
Due to the duality relation (8) we have the $H^2$ – $H^{-2}$ duality,
$$ (D\tilde U_A^2H^{-2})^* = \tilde V_A^2H^2, \qquad (D\tilde U_A^2H^2)^* = \tilde V_A^2H^{-2}. \tag{34} $$
For functions in $\tilde U_A^2H^{-2}$ we can prove results similar to Lemma 13:

Lemma 14. Let $A \in U_W^2H^1$ with $\nabla\cdot A = 0$. Then for each $\epsilon > 0$ there exists $\delta > 0$ so that the following properties hold:

(i) For $p$, $q$ as in (20) we have the Strichartz estimate
$$ \|S_\lambda u\|_{L^p(0,T;L^q)} \lesssim_A T^{\delta/p}\lambda^{2+1/p}\|u\|_{\tilde U_A^2H^{-2}}. \tag{35} $$
(ii) Elliptic representation. Each $u \in \tilde U_A^2H^{-2}$ can be expressed as
$$ u = (1 - \Delta)(u_e + u_r), \qquad \|u_e\|_{U_A^2L^2} + T^{-\delta}\|u_r\|_{L^2(0,T;H^{1-\epsilon})} \lesssim_A \|u\|_{\tilde U_A^2H^{-2}}. \tag{36} $$
(iii) Local energy estimates. For any spatial cube $Q$ of size 1 we have:
$$ \|S_\lambda u\|_{L^2([0,T]\times Q)} \lesssim_A T^\delta\lambda^{2-1/2+\epsilon}\|u\|_{\tilde U_A^2H^{-2}}. \tag{37} $$
(iv) Local Strichartz estimates. For any spatial cube $Q$ of size 1 we have:
$$ \|S_\lambda u\|_{L^2(0,T;L^6(Q))} \lesssim_A T^\delta\lambda^{2+\epsilon}\|u\|_{\tilde U_A^2H^{-2}}. \tag{38} $$

Proof. The proof is similar to the proof of Lemma 13, so we merely outline it. We denote $v = (1 - \Delta_A)^{-1}u$. Then, by (30),
$$ u = (1 - \Delta)v + R_A(v). $$
To prove (35) we use (20) for $(1 - \Delta)v$ and it remains to show that
$$ \|S_\lambda\nabla(Av)\|_{L^pL^q} + \|S_\lambda(A^2v)\|_{L^pL^q} \lesssim_A T^\delta\lambda^{2+1/p}\|v\|_{U_A^2L^2}, $$
which is obtained using only the energy estimates for $A$ and $v$. For (36) we set
$$ u_e = v, \qquad u_r = (1 - \Delta)^{-1}R_A(v). $$
Then it remains to show that
$$ \|S_\lambda\nabla(Av)\|_{L^2} + \|S_\lambda(A^2v)\|_{L^2} \lesssim_A T^\delta\lambda^{1+\epsilon}\|v\|_{U_A^2L^2}. $$
But this is obtained in the same manner as (31) from the Strichartz estimates for $A$ and $u$. Finally, (37) and (38) are proved exactly as in the previous lemma.
Next we turn our attention to the trilinear bounds, namely the $H^2$ and $H^{-2}$ counterparts of (23) and (24). For uniformity in notations we set $\tilde U_A^2L^2 = U_A^2L^2$. Then

Lemma 15. Let $k, l \in \{-2, 0, 2\}$. Then for any $0 < T \le 1$ and $\mu \le \lambda$ we have
$$ |I^T_{\lambda,\lambda,\mu}(u, v, B)| \lesssim_A T^\delta\lambda^{\epsilon}\min(1, \mu\lambda^{-1/2})\,\lambda^{-k-l}\|u\|_{\tilde U_A^2H^k}\|v\|_{\tilde U_A^2H^l}\|B\|_{U_W^2L^2} \tag{39} $$
while if $\mu \ll \lambda$, then
$$ |I^T_{\lambda,\mu,\lambda}(u, v, B)| \lesssim_A T^\delta\lambda^{-1/2+\epsilon}\mu^{1/2}\mu^{-l}\lambda^{-k}\|u\|_{\tilde U_A^2H^k}\|v\|_{\tilde U_A^2H^l}\|B\|_{U_W^2L^2}. \tag{40} $$

Proof. The $T^\delta$ factor can be neglected by the argument in Remark 10. We represent
$$ u = (1 - \Delta)^{-k/2}(u_e + u_r), $$
where $u_e$, $u_r$ are chosen as in (26) if $k = 2$, as in (36) if $k = -2$ and with $u_r = 0$ if $k = 0$. Similarly we set
$$ v = (1 - \Delta)^{-l/2}(v_e + v_r). $$
We begin with (39) and consider all four combinations. The estimate for $u_e$ and $v_e$ is exactly (23). The estimate for $u_e$ and $v_r$ reads
$$ |I^1_{\lambda,\lambda,\mu}(u_e, v_r, B)| \lesssim_A \lambda^{\epsilon}\min(1, \mu\lambda^{-1/2})\,\|u_e\|_{U_A^2L^2}\|v_r\|_{L^2H^{1-\epsilon}}\|B\|_{U_W^2L^2} $$
and is a consequence of the stronger bilinear $L^2$ estimate
$$ \|S_\lambda u_e\,S_\mu B\|_{L^2} \lesssim_A \mu^{1/2}\lambda^{\epsilon}\|u_e\|_{U_A^2L^2}\|B\|_{U_W^2L^2}. \tag{41} $$
Due to the finite speed of propagation for the wave equation, see (13), we can localize this to the unit spatial scale. But on a unit cube $Q$ we use the local Strichartz estimate (22) for $u_e$ and the energy estimate for $B$. The estimate for $u_r$ and $v_e$ is similar. The estimate for $u_r$ and $v_r$ reads
$$ |I^1_{\lambda,\lambda,\mu}(u_r, v_r, B)| \lesssim_A \lambda^{\epsilon}\min(1, \mu\lambda^{-1/2})\,\|u_r\|_{L^2H^{1-\epsilon}}\|v_r\|_{L^2H^{1-\epsilon}}\|B\|_{U_W^2L^2}, $$
and is easily proved using $L^2$ bounds for $S_\lambda u_r$, $S_\lambda v_r$ and an $L^\infty$ bound for $S_\mu B$. The proof of (40) is similar. For later use we note the bilinear $L^2$ estimates, namely
$$ \|S_\mu v_e\,S_\lambda B\|_{L^2} \lesssim_A \mu^{1/2+\epsilon}\|v_e\|_{U_A^2L^2}\|B\|_{U_W^2L^2}, \tag{42} $$
respectively
$$ \|S_\mu(S_\lambda u_e\,S_\lambda B)\|_{L^2} \lesssim_A \mu^{1/2+\epsilon}\|u_e\|_{U_A^2L^2}\|B\|_{U_W^2L^2}. \tag{43} $$
Both are proved by localizing to a unit spatial scale and then by combining the local Strichartz estimate (22) for $u_e$ and $v_e$ and the energy estimate (11) for $B$.

Now we consider the $H^2$ well-posedness of (18).
Proposition 16. Let $A \in U_W^2H^1$ with $\nabla\cdot A = 0$. Then Eq. (19) is well-posed in $H^2$. In addition we have
$$ U_A^2H^2 = \tilde U_A^2H^2, \qquad V_A^2H^2 = \tilde V_A^2H^2, \qquad DU_A^2H^2 = D\tilde U_A^2H^2 $$
with equivalent norms.

Proof. In order to solve Eq. (18) with initial data $u_0 \in H^2$ we consider the equation for $v = (1 - \Delta_A)u$, which has the form
$$ (i\partial_t - \Delta_A)v = 2(A_t\nabla - iAA_t)u. $$
Expressing $u$ in terms of $v$ we obtain
$$ (i\partial_t - \Delta_A)v = 2(A_t\nabla - iAA_t)(1 - \Delta_A)^{-1}v. \tag{44} $$
We seek to solve this equation perturbatively in $U_A^2L^2$. For this we need first to establish suitable mapping properties for the operator $A_t\nabla - iAA_t$.

Lemma 17. The operator $A_t\nabla - iAA_t$ satisfies the space-time bound
$$ \|(A_t\nabla - iAA_t)u\|_{DU_A^2L^2} \lesssim_A T^\delta\|u\|_{\tilde U_A^2H^2}. \tag{45} $$

Proof. By duality, (45) follows from the bounds
$$ \left| \int_0^T\int_{\mathbb{R}^3} B\nabla u\,\bar v\,dxdt \right| \lesssim_A T^\delta\|u\|_{\tilde U_A^2H^2}\|v\|_{V_A^2L^2}\|B\|_{U_W^2L^2}, \tag{46} $$
respectively
$$ \left| \int_0^T\int_{\mathbb{R}^3} ABu\,\bar v\,dxdt \right| \lesssim_A T^\delta\|B\|_{U_W^2L^2}\|A\|_{U_W^2H^1}\|u\|_{\tilde U_A^2H^2}\|v\|_{V_A^2L^2}. \tag{47} $$
To prove (46) we use a triple Littlewood-Paley decomposition to write
$$ \left| \int_0^T\int_{\mathbb{R}^3} B\nabla u\,\bar v\,dxdt \right| \lesssim \sum_{\mu\lesssim\lambda} |I^T_{\lambda,\lambda,\mu}(\nabla u, v, B)| + \sum_{\mu\ll\lambda} |I^T_{\mu,\lambda,\lambda}(\nabla u, v, B)| + \sum_{\mu\ll\lambda} |I^T_{\lambda,\mu,\lambda}(\nabla u, v, B)|. $$
Then for each term we use the corresponding bounds (39) and (40), with a $\ln\lambda$ correction coming from the use of Lemma 8. The summation with respect to $\mu$ and $\lambda$ is straightforward.

For (47), by (13) the norms of $A$ and $B$ are $l^2$ summable with respect to unit spatial cubes. Hence without any restriction in generality we can assume that both $A$ and $B$ are supported in a unit cube $Q$. For $A$ we use a Strichartz estimate, for $B$ the energy and for $u$ a pointwise bound. Finally, for $v$ we interpolate the local energy estimates with the local Strichartz estimates to obtain
$$ \|S_\lambda v\|_{L^2L^{12/5}} \lesssim_A \lambda^{-1/4+\epsilon}\|v\|_{V_A^2L^2}, $$
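which leads to
$$ \|v\|_{L^2L^{12/5}} \lesssim_A \|v\|_{V_A^2L^2}. $$
Then we can estimate
$$ \left| \int_0^T\int_Q ABu\,\bar v\,dxdt \right| \lesssim T^{1/4}\|B\|_{L^\infty L^2}\|A\|_{L^4L^{12}}\|u\|_{L^\infty}\|v\|_{L^2L^{12/5}} \lesssim_A T^{1/4}\|B\|_{U_W^2L^2}\|A\|_{U_W^2H^1}\|u\|_{U_A^2H^2}\|v\|_{V_A^2L^2}. $$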
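The exponents in the last Hölder estimate can be checked mechanically. In the sketch below (all names are ours, for illustration only) the reciprocals of the spatial exponents of the four factors $B$, $A$, $u$, $v$ must add up to 1, while the reciprocals of the time exponents add up to $3/4$. The remaining $1/4$ is what produces the factor $T^{1/4}$ on a time interval of length $T$.

```python
from fractions import Fraction

INF = None  # stands for the Lebesgue exponent infinity


def inv(p):
    """Return 1/p as an exact fraction, with 1/infinity = 0."""
    return Fraction(0) if p is INF else 1 / Fraction(p)


def holder_sum(exponents):
    """Sum of reciprocals of the given Lebesgue exponents."""
    return sum((inv(p) for p in exponents), Fraction(0))


# Exponents of B, A, u, v in L^infty L^2, L^4 L^12, L^infty L^infty, L^2 L^{12/5}.
time_exponents = [INF, 4, INF, 2]
space_exponents = [2, 12, INF, Fraction(12, 5)]

space_total = holder_sum(space_exponents)    # must equal 1 for Hölder in x
time_gap = 1 - holder_sum(time_exponents)    # leftover power of T from Hölder in t
```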
We now return to Eq. (44). By the definition of the $\tilde U_A^2H^2$ norm and (45) we have
$$ \|2(A_t\nabla - iAA_t)(1 - \Delta_A)^{-1}v\|_{DU_A^2L^2} \lesssim_A T^\delta\|v\|_{U_A^2L^2}. \tag{48} $$
By (9) it follows that we can solve (44) perturbatively in $U_A^2L^2$ on short time intervals. This gives a solution $u = (1 - \Delta_A)^{-1}v \in \tilde U_A^2H^2$ for (19). Furthermore, we obtain the bound
$$ \|v(t) - S(t,0)v(0)\|_{L^2} \lesssim_A T^\delta\|v(0)\|_{L^2}, $$
where $S(t,s)$ is the evolution associated to (18). In particular this shows that $v \in C([0,1]; L^2)$. An elliptic argument allows us to return to $u$ and conclude that $u \in C([0,1]; H^2)$. This concludes the proof of the $H^2$ well-posedness.

Finally, we show that $U_A^2H^2 = \tilde U_A^2H^2$ and the other two similar identities. Via the operator $1 - \Delta_A$ these two spaces can be identified with the $U^2L^2$ spaces associated to Eq. (18), respectively (44). But by Lemma 3, these are equivalent due to (48).

Next we consider the well-posedness in $H^{-2}$, which is essentially dual to the $H^2$ well-posedness.

Proposition 18. Let $A \in U_W^2H^1$ with $\nabla\cdot A = 0$. Then Eq. (19) is well-posed in $H^{-2}$. In addition we have
$$ U_A^2H^{-2} = \tilde U_A^2H^{-2}, \qquad V_A^2H^{-2} = \tilde V_A^2H^{-2}, \qquad DU_A^2H^{-2} = D\tilde U_A^2H^{-2} $$
with equivalent norms.

Proof. By Lemma 11 we can write the initial data $u_0$ as
$$ u_0 = (1 - \Delta_A)v_0, \qquad v_0 \in L^2. $$
Then we seek the solution $u$ for (18) of the form $u = (1 - \Delta_A)v$. The equation for $v$ is
$$ (i\partial_t - \Delta_A)v = 2(1 - \Delta_A)^{-1}(A_t\nabla - iAA_t)v. \tag{49} $$
To solve it we need the following counterpart to (45):

Lemma 19. The operator $A_t\nabla - iAA_t$ satisfies the space-time bound
$$ \|(A_t\nabla - iAA_t)u\|_{D\tilde U_A^2H^{-2}} \lesssim_A T^\delta\|u\|_{U_A^2L^2}. \tag{50} $$
Proof. By duality, (50) follows from the bounds
$$ \left| \int_0^T\int_{\mathbb{R}^3} B\nabla u\,\bar v\,dxdt \right| \lesssim_A T^\delta\|u\|_{U_A^2L^2}\|v\|_{V_A^2H^2}\|B\|_{U_W^2L^2}, \tag{51} $$
respectively
$$ \left| \int_0^T\int_{\mathbb{R}^3} ABu\,\bar v\,dxdt \right| \lesssim_A T^\delta\|B\|_{U_W^2L^2}\|A\|_{U_W^2H^1}\|u\|_{U_A^2L^2}\|v\|_{V_A^2H^2}. \tag{52} $$
These are almost identical to (46) and (47), and their proofs are essentially the same.

The bound (50) allows us to solve Eq. (49) perturbatively in $U_A^2L^2$ and obtain a solution $v \in C([0,1]; L^2)$. This implies the $H^{-2}$ solvability for (18). The second part of the proposition follows again from Lemma 3.

Having the well-posedness result in $H^2$ and $H^{-2}$ allows us to prove well-posedness in a range of intermediate spaces. Given a positive sequence $\{m(\lambda)\}_{\lambda=2^j}$ satisfying
$$ 0 < \frac{m(2\lambda)}{m(\lambda)} < C, \tag{53} $$
we define the Sobolev type space $H(m)$ with norm
$$ \|u\|_{H(m)}^2 = \sum_\lambda m^2(\lambda)\|S_\lambda u\|_{L^2}^2. $$
The standard Sobolev spaces $H^\alpha$ are obtained by taking $m(\lambda) = \lambda^\alpha$. We consider the solvability for (18) in $H(m)$ under a stronger condition for $m$, namely
$$ \frac{1}{4} \le \frac{m(2\lambda)}{m(\lambda)} \le 4. \tag{54} $$
This guarantees that $H(m)$ is an intermediate space between $H^{-2}$ and $H^2$. We need to describe $H(m)$ in terms of $H^{-2}$ and $H^2$. To measure functions which are localized at some frequency $\lambda$ we can use the norm
$$ \|u\|_{H_\lambda}^2 = \lambda^{-4}\|u\|_{H^2}^2 + \lambda^4\|u\|_{H^{-2}}^2. $$
Ideally we would like to represent $H(m)$ as an almost orthogonal superposition of the $H_\lambda$ spaces with the weights $m(\lambda)$. However, this does not work so well if $H(m)$ is "close" to either $H^{-2}$ or $H^2$. Instead we need to select a subset of dyadic frequencies which achieves the desired result. We denote
$$ m_\infty = \lim_{\lambda\to\infty} \lambda^2 m(\lambda). $$
On $2^{\mathbb{N}} \cup \{\infty\}$ we introduce the relation "$\prec$" by
$$ \lambda \prec \mu \iff 2m(\mu) \ge m(\lambda)(\lambda^2\mu^{-2} + \mu^2\lambda^{-2}), \qquad \mu, \lambda < \infty, $$
respectively
$$ \infty \prec \mu \iff 2m(\mu) \ge \mu^{-2}m_\infty, \qquad \mu < \infty. $$
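To get a feeling for the relation "≺" just defined, the following minimal sketch evaluates it for the power weights $m(\lambda) = \lambda^\alpha$ with integer $\alpha$. The function names are ours, and restricting to power weights and to a finite range of dyadic frequencies is an assumption made only for illustration.

```python
from fractions import Fraction


def make_weight(alpha):
    """Power weight m(lam) = lam**alpha for an integer alpha (illustrative choice)."""
    return lambda lam: Fraction(lam) ** alpha


def satisfies_54(m, max_exp=20):
    """Check condition (54), 1/4 <= m(2 lam)/m(lam) <= 4, for lam = 2^0..2^(max_exp-1)."""
    return all(
        Fraction(1, 4) <= m(2 ** (j + 1)) / m(2 ** j) <= 4 for j in range(max_exp)
    )


def precedes(m, lam, mu):
    """lam ≺ mu  iff  2 m(mu) >= m(lam) (lam^2 mu^-2 + mu^2 lam^-2), finite lam, mu."""
    r = Fraction(lam, mu) ** 2
    return 2 * m(mu) >= m(lam) * (r + 1 / r)


def interval(m, lam, max_exp=20):
    """All dyadic mu = 2^0..2^max_exp with lam ≺ mu (a truncation of the set I_lam)."""
    return [2 ** j for j in range(max_exp + 1) if precedes(m, lam, 2 ** j)]
```

For $\alpha = 0$ (the $L^2$ case) each frequency is related only to itself. For $\alpha = 2$ the frequency 1 is related to every dyadic frequency, and for $\alpha = -2$ a frequency $\lambda$ is related to all $\mu \le \lambda$, in line with the remarks $H_1 = H^2$ and $H_\infty = H^{-2}$ made below.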
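Definition 20. We say that a subset $\Lambda(m) \subset 2^{\mathbb{N}} \cup \{\infty\}$ is $m$-representative if (i) for each $\mu \in 2^{\mathbb{N}}$ there exists $\lambda \in \Lambda(m)$ so that $\lambda \prec \mu$ and (ii) for each $\mu \in \Lambda(m)$ there is at most one $\lambda \in \Lambda(m) \setminus \{\mu\}$ such that $\lambda \prec \mu$.

Lemma 21. If $m$ satisfies (54) then an $m$-representative set $\Lambda(m)$ exists. In addition, for each $\mu \in 2^{\mathbb{N}}$ and $K \in \mathbb{N}$ we have
$$ |\{\lambda \in \Lambda(m);\ 2^K m(\mu) \ge m(\lambda)(\lambda^2\mu^{-2} + \mu^2\lambda^{-2})\}| \le 4(K + 4). \tag{55} $$

Proof. For each $\lambda \in 2^{\mathbb{N}} \cup \{\infty\}$ we denote
$$ I_\lambda = \{\mu \in 2^{\mathbb{N}} \cup \{\infty\},\ \lambda \prec \mu\}. $$
Due to (54) it is easy to see that $I_\lambda$ is an interval,
$$ I_\lambda = [\lambda^-, \lambda^+], \qquad \lambda^- \le \lambda \le \lambda^+, $$
and the endpoints $\lambda^-$ and $\lambda^+$ are nondecreasing functions of $\lambda$. We construct the set $\Lambda(m)$ as an increasing sequence $\{\lambda_j\}$ in an iterative manner. $\lambda_0$ is chosen maximal so that $\lambda_0 \prec 1$. Iteratively, $\lambda_{j+1}$ is chosen maximal so that $\lambda_{j+1} \prec 2\lambda_j^+$. Either this process continues for an infinite number of steps, or it stops at some step $k$ with $\lambda_k = \infty$. The former occurs if $m_\infty = \infty$ and the latter if $m_\infty < \infty$. The property (i) in Definition 20 is satisfied by construction. For (ii) we observe that $\lambda_{j+1} \ge 2\lambda_j^+$, therefore $\lambda_j \not\prec \lambda_{j+1}$. On the other hand by construction we have $\lambda_{j+2} \not\prec \lambda_j$. For (55) suppose $\mu \le \lambda_j$. Then
$$ m(\mu) \ge \frac{1}{4}m(\lambda_j)\mu^2\lambda_j^{-2} \ge \frac{1}{8}m(\lambda_{j+2})\mu^2\lambda_{j+2}^{-2} \ge \cdots \ge 2^{-K-2}m(\lambda_{j+2K})\mu^2\lambda_{j+2K}^{-2}. $$
A similar bound holds if we descend from $\mu$, and the conclusion follows.

Since we allow $\infty \in \Lambda(m)$ we need the equivalent of $H_\lambda$ in that case, which is defined by $H_\infty = H^{-2}$. We note that at the other extreme we have $H_1 = H^2$.

Lemma 22. Let $m$ satisfy (54), and $\Lambda(m)$ be an $m$-representative subset of $2^{\mathbb{N}} \cup \{\infty\}$. Then
$$ \|u\|_{H(m)}^2 \approx \inf\Big\{ \sum_{\lambda\in\Lambda(m)} m(\lambda)^2\|u_\lambda\|_{H_\lambda}^2,\ u = \sum_{\lambda\in\Lambda(m)} u_\lambda \Big\}. $$

Proof. By Definition 20 we have a finite covering of $2^{\mathbb{N}}$ with intervals $I_\lambda$,
$$ 2^{\mathbb{N}} \subset \bigcup_{\lambda\in\Lambda(m)} I_\lambda. $$
We consider an associated partition of unity in the Fourier space,
$$ 1 = \sum_{\lambda\in\Lambda(m)} \chi_\lambda(\xi). $$
For $\mu \in I_\lambda$ we have $m(\mu) \approx m(\lambda)(\mu^2\lambda^{-2} + \lambda^2\mu^{-2})$, therefore we obtain
$$ \|\chi_\lambda(D_x)u\|_{H(m)} \approx m(\lambda)\|\chi_\lambda(D_x)u\|_{H_\lambda}, $$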
and the "$\gtrsim$" inequality follows. For the reverse we use (55), which shows that the series $\sum u_\lambda$ is almost orthogonal in $H(m)$,
$$ \langle u_{\lambda_i}, u_{\lambda_j}\rangle_{H(m)} \lesssim 2^{-|i-j|}m(\lambda_i)m(\lambda_j)\|u_{\lambda_i}\|_{H_{\lambda_i}}\|u_{\lambda_j}\|_{H_{\lambda_j}}. $$

Finally we consider the well-posedness of (18) in $H(m)$.

Proposition 23. a) Assume that the sequence $m$ satisfies (54). Then Eq. (18) is well-posed in $H(m)$.
b) Furthermore, for each $u \in U_A^2H(m)$ there is a representation
$$ u = \sum_{\lambda\in\Lambda(m)} u_\lambda $$
with
$$ \sum_{\lambda\in\Lambda(m)} m^2(\lambda)\Big( \lambda^{-4}\|u_\lambda\|_{U_A^2H^2}^2 + \lambda^4\|u_\lambda\|_{U_A^2H^{-2}}^2 \Big) \lesssim_A \|u\|_{U_A^2H(m)}^2. \tag{56} $$
c) The following duality relation holds:
$$ (DU_A^2H(m))^* = V_A^2H(m^{-1}). $$

Proof. a) We consider a dyadic decomposition of the initial data
$$ u_0 = \sum_{\lambda\in\Lambda(m)} \chi_\lambda(D_x)u_0, $$
and denote by $u_\lambda$ the solutions to (18) with initial data $S_\lambda u_0$. Then
$$ u = \sum_{\lambda\in\Lambda(m)} u_\lambda. $$
We can measure $u_\lambda$ in both $H^2$ and $H^{-2}$,
$$ \lambda^{-2}\|u_\lambda\|_{C([0,1],H^2)} + \lambda^2\|u_\lambda\|_{C([0,1],H^{-2})} \lesssim_A \|S_\lambda u_0\|_{L^2}. $$
After summation this gives
$$ \sum_{\lambda\in\Lambda(m)} m^2(\lambda)\Big( \lambda^{-4}\|u_\lambda\|_{C([0,1],H^2)}^2 + \lambda^4\|u_\lambda\|_{C([0,1],H^{-2})}^2 \Big) \lesssim_A \|u_0\|_{H(m)}^2 $$
which, by (54), implies that $u \in C([0,1], H(m))$ and
$$ \|u\|_{C([0,1],H(m))} \lesssim_A \|u_0\|_{H(m)}. $$
b) It suffices to consider the case when $u$ is a $U_A^2H(m)$ atom. Then we consider a decomposition as in part (a) for each of the steps, and the conclusion follows.
c) This is a direct consequence of (8).
Finally, we can interpolate the properties from $U_A^2H^2$ and $U_A^2H^{-2}$ to obtain properties for $U_A^2H(m)$. Indeed, we have the following

Proposition 24. Let $A \in U_W^2H^1$ with $\nabla\cdot A = 0$. Assume that $m$, $m_1$, $m_2$ satisfy (54). Then the following estimates hold:

(i) For $p$, $q$ as in (20) we have the Strichartz estimate
$$ \|S_\lambda u\|_{L^p(0,T;L^q)} \lesssim_A T^{\delta/p}m(\lambda)^{-1}\lambda^{1/p}\|u\|_{U_A^2H(m)}. \tag{57} $$
(ii) Local energy estimates. For any spatial cube $Q$ of size 1 we have:
$$ \|S_\lambda u\|_{L^2([0,T]\times Q)} \lesssim_A T^\delta m(\lambda)^{-1}\lambda^{-1/2+\epsilon}\|u\|_{U_A^2H(m)}. \tag{58} $$
(iii) Local Strichartz estimates. For any spatial cube $Q$ of size 1 we have:
$$ \|S_\lambda u\|_{L^2(0,T;L^6(Q))} \lesssim_A T^\delta m(\lambda)^{-1}\lambda^{\epsilon}\|u\|_{U_A^2H(m)}. \tag{59} $$
(iv) Trilinear estimates. For $\mu \le \lambda$ we have
$$ |I^T_{\lambda,\lambda,\mu}(u, v, B)| \lesssim_A \frac{T^\delta\lambda^{\epsilon}\min(1, \mu\lambda^{-1/2})}{m_1(\lambda)m_2(\lambda)}\|u\|_{U_A^2H(m_1)}\|v\|_{U_A^2H(m_2)}\|B\|_{U_W^2L^2}, \tag{60} $$
while if $\mu \ll \lambda$ then
$$ |I^T_{\lambda,\mu,\lambda}(u, v, B)| \lesssim_A \frac{T^\delta\lambda^{-1/2+\epsilon}\mu^{1/2}}{m_1(\lambda)m_2(\mu)}\|u\|_{U_A^2H(m_1)}\|v\|_{U_A^2H(m_2)}\|B\|_{U_W^2L^2}. \tag{61} $$

Due to the representation in Proposition 23(b) this result is a straightforward consequence of the similar results in $H^2$ and $H^{-2}$.
5. Proof of Theorem 1

We first establish an a priori estimate for regular ($H^2$) solutions of (3). For this we need to consider the two nonlinear expressions on the right-hand side of (3). We begin with the Schrödinger nonlinearity:

Lemma 25. For $\beta > \frac{1}{2}$ and $m$ satisfying (54) we have
$$ \|\phi v\|_{DU_A^2(0,T;H(m))} \lesssim_A T^\delta\|v\|_{U_A^2H(m)}\|u\|_{U_A^2H^\beta}^2. \tag{62} $$

Proof. By duality the above bound is equivalent to the quadrilinear estimate
$$ \left| \int_0^T\int_{\mathbb{R}^3} \Delta^{-1}(u_1\bar u_2)\,u_3\bar u_4\,dxdt \right| \lesssim_A T^\delta\|u_1\|_{U_A^2H^\beta}\|u_2\|_{U_A^2H^\beta}\|u_3\|_{U_A^2H(m)}\|u_4\|_{V_A^2H(1/m)}. $$
After a simultaneous Littlewood-Paley decomposition of the three factors −1 (u 1 u¯ 2 ), u 3 and u 4 we need to consider the following three sums: T Slhh = −1 Sµ (u 1 u¯ 2 )Sλ u 3 Sλ u¯ 4 d xdt, 3 µ≤λ 0 R
Shlh = Shhl =
T
−1 Sλ (u 1 u¯ 2 )Sµ u 3 Sλ u¯ 4 d xdt,
3 µλ 0 R
T
µλ 0
−1 Sλ (u 1 u¯ 2 )Sλ u 3 Sµ u¯ 4 d xdt.
R3
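The first sum can be estimated using Strichartz and energy estimates as follows:
\[ |S_{lhh}| \lesssim \sum_\mu \mu^{-2} \|S_\mu(u_1 \bar u_2)\|_{L^1 L^\infty} \Big\| \sum_{\lambda \ge \mu} S_\lambda u_3\, S_\lambda \bar u_4 \Big\|_{L^\infty L^1} \]
\[ \lesssim \sum_\mu \mu^{-1} \|S_\mu(u_1 \bar u_2)\|_{L^1 L^3} \sup_{t \in [0,T]} \sum_{\lambda \ge \mu} \|S_\lambda u_3(t)\|_{L^2} \|S_\lambda u_4(t)\|_{L^2} \]
\[ \lesssim \|u_1\|_{L^2 L^6} \|u_2\|_{L^2 L^6} \sup_{t \in [0,T]} \|u_3(t)\|_{H(m)} \|u_4(t)\|_{H(m^{-1})} \]
\[ \lesssim_A T^{\delta} \|u_1\|_{U^2_A H^\beta} \|u_2\|_{U^2_A H^\beta} \|u_3\|_{U^2_A H(m)} \|u_4\|_{V^2_A H(m^{-1})}. \]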
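The second and third sums are similar. Using the Strichartz estimates we obtain
\[ |S_{hlh}| \lesssim \sum_{\mu \ll \lambda} \lambda^{-2} \|S_\lambda(u_1 \bar u_2)\|_{L^2 L^2} \|S_\mu u_3\|_{L^2 L^\infty} \|S_\lambda \bar u_4\|_{L^\infty L^2} \]
\[ \lesssim_A T^{\delta} \sum_{\mu \ll \lambda} \frac{\mu\, m(\lambda)}{\lambda^{\frac32} m(\mu)} \|S_\lambda(u_1 \bar u_2)\|_{L^2 L^{\frac32}} \|u_3\|_{U^2_A H(m)} \|u_4\|_{V^2_A H(m^{-1})}. \]
At least one of $u_1$ or $u_2$, say $u_1$, must have frequency at least $\lambda$. Then we continue with
\[ |S_{hlh}| \lesssim_A T^{\delta} \sum_{\mu \ll \lambda} \frac{\mu\, m(\lambda)}{\lambda^{\frac32} m(\mu)} \|u_1\|_{L^\infty L^2} \|\bar u_2\|_{L^2 L^6} \|u_3\|_{U^2_A H(m)} \|u_4\|_{V^2_A H(m^{-1})} \]
\[ \lesssim_A T^{\delta} \sum_{\mu \ll \lambda} \frac{\mu^2 m(\lambda)}{\lambda^2 m(\mu)}\, \frac{\lambda^{\frac12-\beta}}{\mu}\, \|u_1\|_{U^2_A H^\beta} \|\bar u_2\|_{U^2_A H^\beta} \|u_3\|_{U^2_A H(m)} \|u_4\|_{V^2_A H(m^{-1})}. \]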
By (54) the first fraction is less than one, therefore the summation with respect to $\lambda$ and $\mu$ is straightforward.

Next we consider the wave nonlinearity. If $m$ is as in (53) then the linear wave equation is well-posed in $H(m)$, and we can easily define the corresponding spaces $U^2_W H(m)$, $V^2_W H(m)$, respectively $DU^2_W H(\lambda^{-1} m)$. The next result asserts that in effect the contribution of the wave nonlinearity is one half of a derivative better than the solution to the Schrödinger equation. Here we impose an additional condition on $m$, namely
\[ \frac{m(\lambda)}{m(\mu)} \ge \frac{\lambda^\beta}{\mu^\beta}, \qquad \lambda > \mu, \tag{63} \]
which guarantees that the $H(m)$ norm is at least as strong as the $H^\beta$ norm.
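Lemma 26. a) Let $\beta > \frac12$ and $m$ satisfying (54) and (63). Then
\[ \|P(\bar u \nabla_A u)\|_{DU^2_W(0,T;H(\lambda^{-\frac12} m))} \lesssim_A T^{\delta} \big( \|u\|_{U^2_A H(m)} \|u\|_{U^2_A H^\beta} + \|u\|^2_{U^2_A H^\beta} \|A\|_{U^2_W H(m)} \big). \tag{64} \]
b) For $\beta > \frac34$ we have
\[ \|P(\bar u \nabla_A v)\|_{DU^2_W(0,T;H^{-\frac12})} \lesssim_A T^{\delta} \|u\|_{U^2_A L^2} \|v\|_{U^2_A H^\beta}. \tag{65} \]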
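Proof. a) By duality we have two estimates to prove. The first is
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \bar u \nabla u\, B \, dx\, dt \Big| \lesssim_A T^{\delta} \|u\|_{U^2_A H(m)} \|u\|_{U^2_A H^\beta} \|B\|_{V^2_W H(\lambda^{\frac12}/m)} \tag{66} \]
with divergence free $B$. The second is
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \bar u u\, A B \, dx\, dt \Big| \lesssim_A T^{\delta} \|u\|_{U^2_A H(m)} \|u\|_{U^2_A H^\beta} \|A\|_{V^2_W H^1} \|B\|_{V^2_W H(\lambda^{\frac12}/m)} + T^{\delta} \|u\|_{U^2_A H^\beta} \|u\|_{U^2_A H^\beta} \|A\|_{V^2_W H(m)} \|B\|_{V^2_W H(\lambda^{\frac12}/m)}. \tag{67} \]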
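Consider (66). Since $B$ is divergence free, it follows that the gradient can be placed either on $u$ or on $\bar u$. Hence using a simultaneous trilinear Littlewood-Paley decomposition of the three factors we reduce the problem to estimating the following two terms:
\[ S_{hhl} = \sum_\lambda \sum_{\mu \lesssim \lambda} |I^T_{\lambda,\lambda,\mu}(u, \nabla u, B)|, \qquad S_{hlh} = \sum_\lambda \sum_{\mu \ll \lambda} |I^T_{\lambda,\mu,\lambda}(u, \nabla u, B)|. \]
We use (60) and Lemma 8 to estimate the first term:
\[ S_{hhl} \lesssim_A T^{\delta} \sum_\lambda \sum_{\mu \lesssim \lambda} \frac{\lambda^{1+\varepsilon}\, m(\mu)}{\lambda^{\beta} \mu^{\frac12} m(\lambda)} \|u\|_{U^2_A H(m)} \|u\|_{U^2_A H^\beta} \|B\|_{V^2_W H(\lambda^{\frac12}/m)}. \]
The bound (63) ensures that the summation is straightforward if $\varepsilon$ is chosen sufficiently small. For the second term we use (61) and Lemma 8:
\[ S_{hlh} \lesssim_A T^{\delta} \sum_\lambda \sum_{\mu \ll \lambda} \frac{\mu^{\frac32-\beta}}{\lambda^{1-\varepsilon}} \|u\|_{U^2_A H(m)} \|u\|_{U^2_A H^\beta} \|B\|_{V^2_W H(\lambda^{\frac12}/m)}, \]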
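which is again summable if $\varepsilon$ is sufficiently small.

Next we turn our attention to (67). Using a Littlewood-Paley decomposition for all terms it suffices to consider factors of type
\[ I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} = \int_0^T\!\!\int_{\mathbb{R}^3} S_{\lambda_1} \bar u\, S_{\lambda_2} u\, S_{\lambda_3} A\, S_{\lambda_4} B \, dx\, dt \]
and prove that for some $\delta > 0$ they satisfy the bound
\[ I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} \lesssim_A \lambda_1^{-\delta} \lambda_2^{-\delta} \lambda_3^{-\delta} \lambda_4^{-\delta} \cdot RHS((67)). \tag{68} \]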
We begin with a weaker bound which follows directly from Strichartz estimates, namely
\[ I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} \lesssim_A T^{\frac13} \lambda_1^{-\frac14} \lambda_2^{-\frac14} \lambda_3^{-\frac23} \lambda_4^{N} \|u\|^2_{U^2_A H^\beta} \|A\|_{U^2_W H^1} \|B\|_{L^\infty L^2}. \]
Arguing as in Lemma 8, this allows us to replace (68) with
\[ I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} \lesssim_A \lambda_1^{-\delta} \lambda_2^{-\delta} \lambda_3^{-\delta} \lambda_4^{-\delta} \cdot \text{modified } RHS((67)), \tag{69} \]
where we have replaced the $V^2_W H(\lambda^{\frac12}/m)$ space with the similar $U^2$ space in the right-hand side of (67). Due to (13) both wave factors are $l^2$ summable with respect to unit spatial cubes, therefore it is enough to estimate the above integral on a unit cube $Q$. Also we must have $\lambda_4 \le \max\{\lambda_1, \lambda_2, \lambda_3\}$. Hence we consider two cases. If $\max\{\lambda_1, \lambda_2, \lambda_3\} = \lambda_1$ (or $\lambda_2$) then we estimate
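\[ I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} \le T^{\frac16} \|S_{\lambda_1} u\|_{L^4 L^3} \|S_{\lambda_2} u\|_{L^2 L^\infty} \|S_{\lambda_3} A\|_{L^3 L^6} \|S_{\lambda_4} B\|_{L^\infty L^2} \]
\[ \le T^{\frac16} \lambda_1^{-\frac12+\varepsilon} \lambda_2^{\frac12+\varepsilon-\beta} \lambda_3^{-\frac16}\, \frac{\lambda_1^{\frac12} m(\lambda_4)}{\lambda_4^{\frac12} m(\lambda_1)}\, \|u\|_{U^2_A H(m)} \|u\|_{U^2_A H^\beta} \|A\|_{V^2_W H^1} \|B\|_{U^2_W H(\frac{\sqrt{\lambda}}{m})}. \]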
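By (63) the fraction above is less than one, therefore for small enough $\varepsilon$ the bound (69) follows. The second case is when $\max\{\lambda_1, \lambda_2, \lambda_3\} = \lambda_3$. Then we estimate
\[ I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} \le T^{\frac1{12}} \|S_{\lambda_1} u\|_{L^4 L^6} \|S_{\lambda_2} u\|_{L^2 L^\infty} \|S_{\lambda_3} A\|_{L^6 L^3} \|S_{\lambda_4} B\|_{L^\infty L^2} \]
\[ \lesssim_A T^{\frac1{12}} \lambda_1^{\frac13+\varepsilon-\beta} \lambda_2^{\frac12+\varepsilon-\beta} \lambda_3^{-\frac16}\, \frac{\lambda_3^{\frac12} m(\lambda_4)}{\lambda_4^{\frac12} m(\lambda_3)}\, \|u\|_{U^2_A H^\beta} \|u\|_{U^2_A H^\beta} \|A\|_{U^2_W H(m)} \|B\|_{U^2_W H(\frac{\sqrt{\lambda}}{m})}, \]
and (69) again follows.

b) By duality we have two estimates to prove. The first one is trilinear,
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \bar u \nabla v\, B \, dx\, dt \Big| \lesssim_A T^{\delta} \|u\|_{U^2_A L^2} \|v\|_{U^2_A H^\beta} \|B\|_{V^2_W H^{\frac12}}, \tag{70} \]
with a divergence free $B$. The second one is quadrilinear,
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \bar u u\, A B \, dx\, dt \Big| \lesssim_A T^{\delta} \|u\|_{U^2_A H^\beta} \|v\|_{U^2_A L^2} \|A\|_{U^2_W H^1} \|B\|_{V^2_W H^{\frac12}}. \tag{71} \]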
Since $B$ is divergence free, in (70) we can place the gradient on either $u$ or $v$. The argument is similar to the one in part (a), using (60) and (61), as well as Lemma 8 in order to substitute the $V^2$ norm by the $U^2$ norm. The new restriction $\beta > \frac34$ arises in the case when the low frequency is on $B$. Indeed, if $\mu \ll \lambda$ then by (60) we have
\[ |I^T_{\lambda,\lambda,\mu}(u, \nabla v, B)| \lesssim_A T^{\delta} \lambda^{1-\beta+\varepsilon} \mu^{-\frac12} \min\{1, \mu\lambda^{-\frac12}\} \|u\|_{U^2_A L^2} \|v\|_{U^2_A H^\beta} \|B\|_{V^2_W H^{\frac12}}. \]
The worst case is when $\mu = \lambda^{\frac12}$, when the coefficient above is $T^{\delta} \lambda^{\frac34-\beta+\varepsilon}$.
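To see where the exponent $\frac34 - \beta$ comes from, one can maximize the $\mu$-dependent part of the coefficient over dyadic $\mu$. The following is only a sketch of that arithmetic; the shorthand $c(\mu)$ is introduced here for convenience and is not notation from the paper.

```latex
% c(\mu) collects the \mu-dependent factors of the coefficient in the bound above.
\[
c(\mu) = \mu^{-\frac12}\min\{1,\ \mu\lambda^{-\frac12}\}
=
\begin{cases}
\mu^{\frac12}\lambda^{-\frac12}, & \mu \le \lambda^{\frac12} \quad (\text{increasing in } \mu),\\[2pt]
\mu^{-\frac12}, & \mu \ge \lambda^{\frac12} \quad (\text{decreasing in } \mu).
\end{cases}
\]
% Both branches agree at \mu = \lambda^{1/2}, which is where c is largest.
\[
\max_{\mu} c(\mu) = c(\lambda^{\frac12}) = \lambda^{-\frac14},
\qquad
T^{\delta}\lambda^{1-\beta+\varepsilon}\cdot\lambda^{-\frac14}
= T^{\delta}\lambda^{\frac34-\beta+\varepsilon}.
\]
```

The resulting power of $\lambda$ is negative for small $\varepsilon$ exactly when $\beta > \frac34$, which matches the restriction in part (b).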
The estimate (71) is also proved as in part (a). Indeed, by Corollary 7 we can substitute the $V^2$ space by $U^2$ at the expense of losing $\varepsilon$ derivatives. Because of the finite speed of propagation for the wave equation, see (13), we can reduce the problem to the case when $A$, $B$ are supported in a unit cube $Q$. There we can use the Strichartz estimates for the wave equation, respectively the local Strichartz estimates for the Schrödinger equation.

The next step in the proof of the theorem is to establish an a-priori $H^1$ estimate for $H^2$ solutions. This is obtained in terms of the conserved quantities in our problem, namely $E$ and $Q$.

Proposition 27. Let $(u, A)$ be an $H^2$ solution for (3) in some time interval $[0, T_0]$ with $T_0 \le 1$. Then
\[ \|u\|_{U^2_A(0,T_0;H^1)} + \|A\|_{U^2_W(0,T_0;H^1)} \le c(E, Q). \]

Proof. We use a bootstrap argument. Since $u, A \in L^\infty H^2$, we can easily estimate the wave nonlinearity and obtain $\Box A \in L^\infty H^1$. This implies that $A \in U^2_W H^1$, and, in addition, that the function
\[ T \to \|A\|_{U^2_W(0,T;H^1)} \]
is continuous and satisfies
\[ \lim_{T \to 0} \|A\|_{U^2_W(0,T;H^1)} = \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2}. \]
A similar argument applies in the case of the Schrödinger equation. By (9) we estimate in the Schrödinger equation
\[ \|u\|_{U^2_A(0,T;H^1)} \lesssim \|u(0)\|_{H^1} + \|i u_t - \Delta_A u\|_{DU^2_A(0,T;H^1)}. \]
Then by (62) we obtain
\[ \|u\|_{U^2_A(0,T;H^1)} \le C^1_A \big( \|u(0)\|_{H^1} + T^{\delta} \|u\|^3_{U^2_A(0,T;H^1)} \big). \]
Similarly we can use (64) to obtain a bound for the wave equation
\[ \|A\|_{U^2_W(0,T;H^1)} \le \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2} + T^{\delta} C^2_A \|u\|^2_{U^2_A(0,T;H^1)}. \]
We multiply the first equation by $c^1_A = (C^1_A)^{-1}$ and add to the second equation to obtain
\[ \|A\|_{U^2_W(0,T;H^1)} + c^1_A \|u\|_{U^2_A(0,T;H^1)} \le \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2} + T^{\delta} C^3_A \big( 1 + \|u\|^3_{U^2_A(0,T;H^1)} \big). \]
We make the bootstrap assumption
\[ c^1_A \|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)} \le 2 + \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2}. \]
Then the previous bound implies that
\[ c^1_A \|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)} \le \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2} + T^{\delta} C(E, Q) \big( 1 + \|u\|^3_{U^2_A(0,T;H^1)} \big). \]
This shows that for $T \le T_0(E, Q)$ we have
\[ \|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)} \le 1 + \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2}, \tag{72} \]
improving our bootstrap assumption. Hence a continuity argument shows that (72) holds without any bootstrap assumption. The conclusion of the proposition follows by summing up with respect to $T_0(E, Q)$ time intervals.

The next step is to establish an a-priori $H^2$ bound with constants which depend only on the $H^1$ size of the data.

Proposition 28. Let $(u, A)$ be an $H^2$ solution for (3) in some time interval $[0, T_0]$ with $T_0 \le 1$. Then
\[ \|u\|_{U^2_A H^2} + \|A\|_{U^2_W H^2} \le c(E, Q) \big( \|u_0\|_{H^2} + \|A_0\|_{H^2} + \|A_1\|_{H^1} \big). \]
Proof. The argument is similar to the one above.
Given the local well-posedness result in $H^2$ proved in earlier work [9], we can iterate the argument and conclude that the $H^2$ solutions are global. Finally, our last a-priori estimate is in intermediate spaces:

Proposition 29. Let $m$ be a weight which satisfies (54) and (63). Let $(u, A)$ be an $H^2$ solution for (3) in the time interval $[0, 1]$. Then
\[ \|u\|_{U^2_A H(m)} + \|A\|_{U^2_W H(m)} \le c(E, Q) \big( \|u_0\|_{H(m)} + \|A_0\|_{H(m)} + \|A_1\|_{H(\lambda^{-1} m)} \big). \]
Proof. The argument is again similar to the one above.
In order to obtain $H^1$ solutions and to study the dependence of the solutions on the initial data we need to obtain estimates for differences of solutions. Given a solution $(u, A)$ to (3) we consider the corresponding linearized problem
\[ \begin{cases} i v_t - \Delta_A v = 2i B \nabla u + 2ABu + \phi v + \Delta^{-1}(u \bar v)\, u, \\ \Box B = P(\bar v \nabla_A u + \bar u \nabla_A v + B|u|^2). \end{cases} \tag{73} \]
Our main estimate for the linearized problem is

Proposition 30. Let $(u, A)$ be an $H^2$ solution for (3) in the time interval $[0, 1]$. Then the linearized problem (73) is well-posed in $L^2 \times H^{\frac12} \times H^{-\frac12}$ uniformly with respect to $(u, A)$ in a bounded set in the energy space,
\[ \|v\|_{U^2_A L^2} + \|B\|_{U^2_W H^{\frac12}} \le c(E, Q) \big( \|v_0\|_{L^2} + \|B_0\|_{H^{\frac12}} + \|B_1\|_{H^{-\frac12}} \big). \tag{74} \]
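Proof. The conclusion follows iteratively in short time intervals provided that we obtain appropriate estimates for the terms on the right:
\[ \|2i B \nabla u + 2ABu + \phi v + \Delta^{-1}(u \bar v)\, u\|_{DU^2_A L^2} \lesssim T^{\delta} c(E, Q) \big( \|v\|_{U^2_A L^2} + \|B\|_{U^2_W H^{\frac12}} \big), \]
respectively
\[ \|P(\bar v \nabla_A u + \bar u \nabla_A v + i B|u|^2)\|_{DU^2_W H^{-\frac12}} \lesssim T^{\delta} c(E, Q) \big( \|v\|_{U^2_A L^2} + \|B\|_{U^2_W H^{\frac12}} \big). \]
These in turn follow by duality from the trilinear and quadrilinear bounds
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} B \nabla u_1 \bar u_2 \, dx\, dt \Big| \lesssim_A T^{\delta} \|B\|_{U^2_W H^{\frac12}} \|u_1\|_{U^2_A H^1} \|u_2\|_{V^2_A L^2}, \tag{75} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} A B u_1 \bar u_2 \, dx\, dt \Big| \lesssim_A T^{\delta} \|A\|_{U^2_W H^1} \|B\|_{U^2_W H^{\frac12}} \|u_1\|_{U^2_A H^1} \|u_2\|_{V^2_A L^2}, \tag{76} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \Delta^{-1}(u_1 \bar u_2) u_3 \bar u_4 \, dx\, dt \Big| \lesssim_A T^{\delta} \|u_1\|_{U^2_A H^1} \|u_2\|_{U^2_A H^1} \|u_3\|_{U^2_A L^2} \|u_4\|_{V^2_A L^2}, \tag{77} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \Delta^{-1}(u_1 \bar u_2) u_3 \bar u_4 \, dx\, dt \Big| \lesssim_A T^{\delta} \|u_1\|_{U^2_A H^1} \|u_2\|_{U^2_A L^2} \|u_3\|_{U^2_A H^1} \|u_4\|_{V^2_A L^2}, \tag{78} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} B \nabla u_1 \bar u_2 \, dx\, dt \Big| \lesssim_A T^{\delta} \|B\|_{V^2_W H^{\frac12}} \|u_1\|_{U^2_A H^1} \|u_2\|_{U^2_A L^2}, \tag{79} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} B \nabla u_1 \bar u_2 \, dx\, dt \Big| \lesssim_A T^{\delta} \|B\|_{V^2_W H^{\frac12}} \|u_1\|_{U^2_A L^2} \|u_2\|_{U^2_A H^1}, \tag{80} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} A B u_1 \bar u_2 \, dx\, dt \Big| \lesssim_A T^{\delta} \|A\|_{U^2_W H^1} \|B\|_{V^2_W H^{\frac12}} \|u_1\|_{U^2_A H^1} \|u_2\|_{U^2_A L^2}, \tag{81} \]
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} A B u_1 \bar u_2 \, dx\, dt \Big| \lesssim_A T^{\delta} \|A\|_{U^2_W H^{\frac12}} \|B\|_{V^2_W H^{\frac12}} \|u_1\|_{U^2_A H^1} \|u_2\|_{U^2_A H^1}. \tag{82} \]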
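The quadrilinear mixed bounds (76), (81), (82) follow trivially from the Strichartz estimates. For (76) for instance we have
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} A B u_1 \bar u_2 \, dx\, dt \Big| \lesssim T^{\frac1{12}} \|A\|_{L^3 L^8} \|B\|_{L^4} \|u_1\|_{L^3 L^8} \|u_2\|_{L^\infty L^2} \lesssim_A T^{\frac1{12}} \|A\|_{U^2_W H^1} \|B\|_{U^2_W H^{\frac12}} \|u_1\|_{U^2_A H^1} \|u_2\|_{V^2_A L^2}. \]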
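As a sanity check on the Hölder step for (76), the exponents can be tallied as follows. This is a sketch that reads the four norms in the display as $L^3_t L^8_x$ for $A$ and $u_1$, $L^4_{t,x}$ for $B$, and $L^\infty_t L^2_x$ for $u_2$; that reading is an interpretation of the displayed norms and not an additional claim from the paper.

```latex
% Space exponents: A in L^8_x, B in L^4_x, u_1 in L^8_x, u_2 in L^2_x.
\[
\frac18 + \frac14 + \frac18 + \frac12 = 1 .
\]
% Time exponents on [0,T]: A in L^3_t, B in L^4_t, u_1 in L^3_t, u_2 in L^\infty_t.
\[
\frac13 + \frac14 + \frac13 + 0 = \frac{11}{12},
\qquad
1 - \frac{11}{12} = \frac1{12}
\ \Longrightarrow\ \text{a factor } T^{\frac1{12}} \text{ from H\"older in time.}
\]
```

The spatial exponents sum to exactly 1, and the slack of $\frac1{12}$ in time is what produces the power of $T$.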
We note that there is significant room for improvement in this computation by localizing first to the unit spatial scale and then using the local Strichartz estimates for the Schrödinger equation.

The quadrilinear Schrödinger bound (77) corresponds to the particular choice $m(\lambda) = 1$ and $\beta = 1 > \frac12$ in (62). For (78) we can write
\[ \Big| \int_0^T\!\!\int_{\mathbb{R}^3} \Delta^{-1}(u_1 \bar u_2) u_3 \bar u_4 \, dx\, dt \Big| \lesssim \|u_1 u_2\|_{L^2 L^{\frac32}} \|u_3 u_4\|_{L^2 L^{\frac32}} \]
\[ \lesssim T^{\frac13} \|u_1\|_{L^3 L^6} \|u_2\|_{L^\infty L^2} \|u_3\|_{L^3 L^6} \|u_4\|_{L^\infty L^2} \]
\[ \lesssim_A T^{\frac13} \|u_1\|_{U^2_A H^1} \|u_2\|_{U^2_A L^2} \|u_3\|_{U^2_A H^1} \|u_4\|_{V^2_A L^2}. \]
Finally, the bounds (79) and (80) are identical since $\operatorname{div} B = 0$ and correspond to (70). Equation (75) is essentially the same estimate.
Proof of Theorem 1, conclusion. By Proposition 30 we can obtain a weak Lipschitz dependence result for $H^2$ solutions $(u_1, A_1)$ and $(u_2, A_2)$ to (3),
\[ \|u_1 - u_2\|_{L^\infty L^2} + \|A_1 - A_2\|_{U^2_W H^{\frac12}} \le c(E_1, Q_1, E_2, Q_2) \big( \|(u_1 - u_2)(0)\|_{L^2} + \|(A_1 - A_2)(0)\|_{H^{\frac12}} + \|(A_1 - A_2)_t(0)\|_{H^{-\frac12}} \big). \tag{83} \]
We use this in order to construct solutions to (3) for $H^1$ initial data. Given $(u_0, A_0, A_1) \in H^1 \times H^1 \times L^2$ we consider a sequence of $H^2$ initial data $(u_0^n, A_0^n, A_1^n) \to (u_0, A_0, A_1)$ in $H^1 \times H^1 \times L^2$. The sequence $(u_0^n, A_0^n, A_1^n)$ is compact in $H^1 \times H^1 \times L^2$, therefore we can bound them uniformly in a stronger norm,
\[ \|(u_0^n, A_0^n, A_1^n)\|_{H(m) \times H(m) \times H(\lambda^{-1} m)} \le M, \]
where $m(\lambda) \ge \lambda$ satisfies (54) and (63) and in addition
\[ \lim_{\lambda \to \infty} \lambda^{-1} m(\lambda) = \infty. \]
By Proposition 29 we obtain a uniform bound
\[ \|u^n\|_{L^\infty H(m)} + \|A^n\|_{U^2_W H(m)} \lesssim M. \]
On the other hand, (83) shows that the solutions $(u^n, A^n)$ have a limit in a weaker topology,
\[ (u^n, A^n) \to (u, A) \quad \text{in } L^\infty L^2 \times U^2_W H^{\frac12}. \]
Combining the two bounds above we obtain strong convergence in $H^1$,
\[ (u^n, A^n) \to (u, A) \quad \text{in } L^\infty H^1 \times U^2_W H^1. \]
In addition, $u$ will also satisfy the same Strichartz estimates as $u^n$. Passing to the limit in Eq. (3) we easily see that $(u, A)$ is a solution. Due to the weak Lipschitz dependence it is also the unique uniform limit of strong solutions. Due to the Strichartz estimates we can bound the nonlinear term $\phi u$ in the Schrödinger equation as in (62). Then it also follows that $u \in U^2_A H^1$.

The weak Lipschitz dependence (83) carries over to $H^1$ solutions, as well as the bounds in Propositions 27 and 29. Then the same argument as above gives the continuous dependence on the initial data.
6. Wave Packets for Schrödinger Operators with Rough Symbols

An essential part of this article is devoted to understanding the properties of the (95) flow at frequency $\lambda$ on $\lambda^{-1}$ time intervals. As it turns out, for many estimates the parameter $\lambda$ can be factored out by rescaling. This is why in this section we consider a more general equation of the form
\[ i u_t - \Delta u + a^w(t, x, D) u = 0, \qquad u(0) = u_0 \tag{84} \]
which we study on a unit time scale. Here $a$ is a real symbol which is roughly smooth on the unit scale. For such a problem one seeks to obtain a wave packet parametrix, i.e. to write solutions as almost orthogonal superpositions of wave packets, where the wave packets are localized both in space and in frequency on the unit scale. The simplest setup is to assume uniform bounds on $a$ of the form
\[ |\partial_x^\alpha \partial_\xi^\beta a(t, x, \xi)| \le c_{\alpha\beta}, \qquad |\alpha| + |\beta| \ge k. \]
An analysis of this type has been carried out in [7,10,11]. If $k = 2$ then one obtains a wave packet parametrix where the packets travel along the Hamilton flow. If $k = 1$ the geometry simplifies, and the Hamilton flow stays close to the flow for $a = 0$; however, $a$ still affects a time modulation factor arising in the solutions. Finally if $k = 0$ then the $a^w(t, x, D)$ term is purely perturbative.

For the operators arising in the present paper the above uniform bounds on $a$ are too strong, and need to be replaced by integral bounds of the form
\[ \int_0^1 |\partial_x^\alpha \partial_\xi^\beta a(t, x^t, \xi^t)| \, dt \le c_{\alpha\beta}, \qquad |\alpha| + |\beta| \ge k, \]
where $t \to (x^t, \xi^t)$ is the associated Hamilton flow. The case $k = 2$ has been considered in [8]; as proved there, the Hamilton flow is bilipschitz and a wave packet parametrix can be constructed. The case $k = 0$ was considered in [1]; then the term $a^w(t, x, D)$ is perturbative, and one may use the $a = 0$ Hamilton flow in the above condition.

In the present article we need to deal with the case $k = 1$. This corresponds to a Hamilton flow which is close to the $a = 0$ flow. However, the term $a^w(t, x, D)$ is nonperturbative, and contributes a time modulation factor along each packet. Given these considerations, we consider the following assumption on the symbol $a$:
\[ \sup_{x,\xi} \int_0^1 |\partial_x^\alpha \partial_\xi^\beta a(t, x + 2t\xi, \xi)| \, dt \le \varepsilon\, c_{\alpha,\beta}, \qquad |\alpha| + |\beta| \ge 1. \tag{85} \]
Let $(x_0, \xi_0) \in \mathbb{R}^{2n}$. To describe functions which are localized in the phase space on the unit scale near $(x_0, \xi_0)$ we use the norm:
\[ H^{N,N}_{x_0,\xi_0} := \{ f : \langle D - \xi_0 \rangle^N f \in L^2, \ \langle x - x_0 \rangle^N f \in L^2 \}. \]
We work with the lattice $\mathbb{Z}^n$ both in the physical and Fourier space. We consider a partition of unity in the physical space,
\[ \sum_{x_0 \in \mathbb{Z}^n} \phi_{x_0} = 1, \qquad \phi_{x_0}(x) = \phi(x - x_0), \]
where $\phi$ is a smooth bump function with compact support. We use a similar partition of unity on the Fourier side:
\[ \sum_{\xi_0 \in \mathbb{Z}^n} \varphi_{\xi_0} = 1, \qquad \varphi_{\xi_0}(\xi) = \varphi(\xi - \xi_0). \]
An arbitrary function $u$ admits an almost orthogonal decomposition
\[ u = \sum_{(x_0,\xi_0) \in \mathbb{Z}^{2n}} u_{x_0,\xi_0}, \qquad u_{x_0,\xi_0} = \varphi_{\xi_0}(D)(\phi_{x_0} u), \]
so that
\[ \sum_{(x_0,\xi_0) \in \mathbb{Z}^{2n}} \|u_{x_0,\xi_0}\|^2_{H^{N,N}_{x_0,\xi_0}} \approx \|u\|^2_{L^2}. \tag{86} \]
We remark that a continuous analog of the above discrete decomposition can be obtained using the Bargmann transform. We first establish that the Hamilton flow is close to the Hamilton flow with $a = 0$:

Lemma 31. Assume that (85) holds with a small enough $\varepsilon$. Then for each $(x^0, \xi^0) \in \mathbb{R}^{2n}$ and $t \in [0, 1]$ we have
\[ |x^t - (x^0 + 2t\xi^0)| + |\xi^t - \xi^0| \lesssim \varepsilon. \]
The proof is straightforward and is left for the reader; it is also essentially contained in [1]. This allows us to apply the main result in [8]:

Proposition 32. Assume that (85) holds with a small enough $\varepsilon$. Then for each $N \ge 0$ the solution of the homogeneous problem (84) satisfies the following localization estimate:
\[ \|u(t)\|_{H^{N,N}_{x_0+2t\xi_0,\xi_0}} \lesssim_N \|u_0\|_{H^{N,N}_{x_0,\xi_0}}. \tag{87} \]
We denote the evolution operator for (84) by $S(t, s)$. If the initial data is $u_0 = \delta_x$ then it has a decomposition of the form
\[ u_0 = \sum_{\xi_0 \in \mathbb{Z}^n} u_{\xi_0}(0), \qquad \|u_{\xi_0}(0)\|_{H^{N,N}_{x,\xi_0}} \lesssim 1. \tag{88} \]
By Proposition 32, at time 1 the corresponding solutions $u_{\xi_0}$ are concentrated close to $x + 2t\xi_0$, therefore they are spatially separated. Hence we obtain the following pointwise decay:

Corollary 33. The kernel $K(1, 0)$ of $S(1, 0)$ satisfies
\[ |K(1, x, 0, y)| \lesssim 1. \]
The solution of the homogeneous equations (84) satisfies
\[ \|S(1, 0) u_0\|_{L^\infty} \lesssim \|u_0\|_{L^1}. \]
If in addition the initial data is localized at some frequency $\lambda$, say $u_0 = S_\lambda \delta_x$, then the decomposition in (88) is restricted to the range $|\xi_0| \approx \lambda$. Then the corresponding solutions travel with speed $O(\lambda)$, and we can obtain better pointwise decay away from the propagation region:
Corollary 34. The kernel $K_\lambda(1, 0)$ of $S(1, 0) S_\lambda$ satisfies
\[ |K(1, x, 0, y)| \lesssim (\lambda + |x - y|)^{-N}, \qquad |x - y| \not\approx \lambda. \tag{89} \]
The kernel $K_\lambda(t, s)$ of $S(t, s) S_\lambda$ satisfies
\[ |K(t, x, s, y)| \lesssim \lambda^{-N}, \qquad |x - y| \approx \lambda, \quad |t - s| \ll 1. \tag{90} \]
The next result concerns localized energy estimates.

Corollary 35. For each ball $B_r$ of radius $r \ge 1$ the solution $u$ to (84) satisfies
\[ \|S(t, 0) S_\lambda u_0\|_{L^2(B_r)} \lesssim \lambda^{-\frac12} r^{\frac12} \|u_0\|_{L^2}. \tag{91} \]

Proof. We consider the wave packet decomposition for $u = S(t, 0) S_\lambda u(0)$,
\[ u = \sum_{(x_0,\xi_0) \in \mathbb{Z}^{2n},\ |\xi_0| \approx \lambda} u_{x_0,\xi_0}. \]
Let $\chi_r$ be a cutoff corresponding to $B_r$. Since $r \ge 1$ it follows that the functions $\chi_r u_{x_0,\xi_0}$ are almost orthogonal, therefore it suffices to prove the estimate for a single packet. But a single packet is concentrated near a tube of spatial size 1 which travels with speed $O(\lambda)$. This tube intersects the cylinder $[0, 1] \times B_r$ over a time interval of length $\lambda^{-1} r$. The conclusion easily follows.

To obtain any results below the unit spatial scale we slightly strengthen the condition (85) by adding a weaker pointwise bound
\[ |\partial_x^\alpha \partial_\xi^\beta a(t, x, \xi)| \le c_{\alpha\beta} \langle \xi \rangle^{\frac12}, \qquad \forall\, \alpha, \beta. \tag{92} \]
This will guarantee that on a unit spatial scale the flow in (84) is a small perturbation of the flat Schrödinger flow. Then we have:

Proposition 36. Assume that the conditions (85) and (92) hold. Then

(i) For any $r > 0$ the solution $u$ to (84) satisfies the localized energy estimates
\[ \|S(t, 0) S_\lambda u(0)\|_{L^2(B_r)} \lesssim \lambda^{-\frac12} r^{\frac12} \|u_0\|_{L^2}. \tag{93} \]
(ii) For each $y, z$ with $|y - z| \approx \lambda$ we have the square function bound
\[ \Big\| S_\lambda \int_I S(t, s) S_\lambda (f(s) \delta_y) \, ds\, (z) \Big\|_{L^2} \lesssim \lambda^{-1} \|f\|_{L^2}. \tag{94} \]

Proof. To prove this result it is convenient to replace the $L^2$ initial data space by weighted $L^2$ spaces.

Definition 37. A weight $m : \mathbb{R}^{2n} \to \mathbb{R}^+$ is admissible if
\[ |m(x, \xi)/m(y, \eta)| \lesssim (1 + |x - y| + |\xi - \eta|)^N \]
for some real $N$.
Correspondingly we define a weighted $L^2$ space
\[ \|u\|^2_{L^2(m)} = \sum_{(x_0,\xi_0) \in \mathbb{Z}^{2n}} m(x_0, \xi_0) \|u_{x_0,\xi_0}\|^2_{H^{N,N}_{x_0,\xi_0}}. \]
Given a weight $m_0$ at time 0 we evolve it in time by
\[ m_t(x + 2t\xi, \xi) = m_0(x, \xi). \]
As a consequence of Proposition 32 we obtain

Lemma 38. Assume that (85) holds with a small enough $\varepsilon$. Then
\[ \|S(t, s)\|_{L^2(m_s) \to L^2(m_t)} \lesssim 1. \]
Next we consider truncated solutions on a unit spatial scale. Given a unit ball $B$ and an associated cutoff function $\chi$ we have the following weighted local energy estimates:

Lemma 39. For any solution $u$ to (84) we have
\[ \|\chi u\|_{L^2([0,1], L^2(\langle \xi \rangle^{\frac12} m_t))} + \|(i\partial_t - \Delta) \chi u\|_{L^2([0,1], L^2(\langle \xi \rangle^{-\frac12} m_t))} \lesssim \|u_0\|_{L^2(m)}. \]

Proof. We begin again with a wave packet decomposition of $u$,
\[ u = \sum_{(x_0,\xi_0) \in \mathbb{Z}^{2n}} u_{x_0,\xi_0}. \]
The functions $\chi u_{x_0,\xi_0}$ are almost orthogonal in $L^2$, therefore the bound for $\chi u$ follows. On the other hand we have
\[ (i\partial_t - \Delta) \chi u = -2 \nabla \chi \nabla u - \Delta \chi\, u + \chi a^w(t, x, D) u. \]
The first two terms are estimated using the bound for $\chi u$. For the last one we note that, by (92), the operator $a^w$ preserves the $H^{N,N}$ spaces,
\[ \|a^w(t, x, D) u\|_{H^{N-1,N-1}_{x_0,\xi_0}} \lesssim \langle \xi_0 \rangle^{\frac12} \|u\|_{H^{N,N}_{x_0,\xi_0}}. \]
Hence we can use orthogonality again.

Now we can conclude the proof of the proposition. For the local energy estimate (91) we first truncate $u$ to a unit scale. By the above lemma with $m = (1 + \lambda^{-3} \langle \xi \rangle^3)(1 + \lambda^3 \langle \xi \rangle^{-3})$ we obtain
\[ \lambda^{\frac12} \|\chi_1 u\|_{L^2(m')} + \lambda^{-\frac12} \|(i\partial_t - \Delta) \chi_1 u\|_{L^2(m')} \lesssim \|u_0\|_{L^2(m)}, \]
where $m' = (1 + \lambda^{-2} \langle \xi \rangle^2)(1 + \lambda^2 \langle \xi \rangle^{-2})$. It remains to show that
\[ \lambda^{\frac12} r^{-\frac12} \|\chi_r u\|_{L^2} \lesssim \lambda^{\frac12} \|\chi_1 u\|_{L^2(m')} + \lambda^{-\frac12} \|(i\partial_t - \Delta) \chi_1 u\|_{L^2(m')}. \]
Then we can localize the right-hand side to the $\lambda^{-1}$ time scale. On the $\lambda^{-1}$ time scale we can use the Duhamel formula to further reduce the problem to a corresponding estimate for solutions to the homogeneous constant coefficient Schrödinger equation, namely:
\[ \lambda^{\frac12} r^{-\frac12} \|\chi_r e^{-it\Delta} u_0\|_{L^2} \lesssim \|u_0\|_{L^2(m')}. \]
After a dyadic frequency decomposition this becomes
\[ \lambda^{\frac12} r^{-\frac12} \|\chi_r e^{-it\Delta} S_\lambda u_0\|_{L^2} \lesssim \|u_0\|_{L^2} \]
which is exactly the local energy estimate for the homogeneous constant coefficient Schrödinger equation.

Consider now the square function bound. For $|t - s| \ll 1$ we can use the kernel bound (90). Hence without any restriction in generality we assume that $t \in I$, $s \in J$ where $I$, $J$ are intervals of size $O(1)$ with $O(1)$ separation. Choose $t_0$ the center of the interval between $I$ and $J$. We factor the estimate in two and prove the dual estimates
\[ \Big\| \int_I S(t_0, s) S_\lambda (f(s) \delta_y) \, ds \Big\|_{L^2(m)} \lesssim \lambda^{-\frac12} \|f\|_{L^2}, \]
respectively
\[ \|(S_\lambda S(t, t_0) u)(z)\|_{L^2} \lesssim \lambda^{-\frac12} \|u\|_{L^2(m)}, \]
where the flow invariant weight $m$ is given by
\[ m(x, \xi) = (1 + \lambda^{-1} |\xi \wedge (x - y)|)^K (1 + \lambda^{-1} |\xi \wedge (x - z)|)^{-K} \]
with $K$ large enough. These are dual bounds, therefore it suffices to prove the second one. If $\chi$ is a smooth approximation of the characteristic function of $B(z, 1)$, then by (a slight modification of) Lemma 39 it remains to show that $v = \chi S_\lambda u$ satisfies
\[ \lambda^{\frac12} \|v(t, x)\|_{L^2_t(J)} \lesssim \lambda^{\frac12} \|v\|_{L^2_t L^2(m \cdot m')} + \lambda^{-\frac12} \|(i\partial_t - \Delta) v\|_{L^2_t L^2(m \cdot m')}, \]
where the additional weight $m' = (1 + \lambda^{-2} \langle \xi \rangle^2)(1 + \lambda^2 \langle \xi \rangle^{-2})$ can be added due to the localization to frequency $\lambda$. This estimate can be localized to the $\lambda^{-1}$ timescale. In addition, since $v$ has support in $B(z, 1)$ we can freeze $x = z$ in $m$ and replace $m$ by
\[ \tilde m(\xi) = (1 + \lambda^{-1} |\xi \wedge (y - z)|)^K. \]
Assuming $y - z = O(\lambda) e_1$ we get
\[ \tilde m(\xi) = (1 + |\xi'| + \lambda^{-1} |\xi_1|)^K. \]
Then the $x'$ variable can be factored out and we are left with a bound for the one dimensional Schrödinger equation,
\[ \|e^{-it\Delta} v_0(\cdot, 0)\|_{L^2(J)} \lesssim \lambda^{-\frac12} \|v_0\|_{L^2(m')}. \]
But this is exactly the one dimensional local energy estimate.
7. The Short Time Structure

In this section we consider a paradifferential approximation to the magnetic Schrödinger equation (18). Precisely, given a dyadic frequency $\lambda$ we consider the evolution
\[ i u_t - \Delta u + i (A_{<\sqrt{\lambda}} \nabla \tilde S_\lambda + \tilde S_\lambda A_{<\sqrt{\lambda}} \nabla) u = 0, \qquad u(0) = u_0, \tag{95} \]
where $A_{<\sqrt{\lambda}} = S_{<\sqrt{\lambda}} A$. The multiplier $\tilde S_\lambda$ is added here for convenience. It guarantees that waves at frequencies away from $\lambda$ evolve according to the constant coefficient Schrödinger flow, thereby strictly confining the interesting part of the evolution to frequency $\lambda$. In addition, the above expression is written in a selfadjoint form, which guarantees that the corresponding evolution operators $S(t, s)$ are $L^2$ isometries.

Later we will prove that on the time scale $\lambda^{-1}$ the evolution of the $\lambda$ dyadic piece of a solution $u$ to (18) is well approximated by the evolution in (95). Here we establish dispersive type estimates for (95). Our main result concerning the flow in (95) is as follows:

Proposition 40. Let $u_\lambda$ be the solution to (95) with initial data $u_{0,\lambda}$ localized at frequency $\lambda$. Then for any interval $I$ of size less than $\lambda^{-1}$ the following estimates hold:

(i) the full Strichartz estimates
\[ \|u_\lambda\|_{L^p(I, L^q)} \lesssim_A \|u_{0,\lambda}\|_{L^2}, \tag{96} \]
(ii) the square function estimate
\[ \|u_\lambda\|_{L^4_x(L^2_t(I))} \lesssim_A \lambda^{-\frac14} \|u_{0,\lambda}\|_{L^2}, \tag{97} \]
(iii) the localized energy estimate: for any ball $B_r$ of radius $r > 0$ we have
\[ \|u_\lambda\|_{L^2(I \times B_r)} \lesssim_A r^{\frac12} \lambda^{-\frac12} \|u_{0,\lambda}\|_{L^2}. \tag{98} \]

Eq. (95) is $L^2$ well-posed, therefore we can define the spaces $U^2_{A,\sqrt{\lambda}} L^2$, respectively $V^2_{A,\sqrt{\lambda}} L^2$. As a consequence of the above proposition we obtain

Corollary 41. Assume that $A \in U^2_W H^1$. Then for any interval $I$ of length $\le \lambda^{-1}$ and any function $u_\lambda$ localized at frequency $\lambda$ the following embeddings hold:
\[ \|u_\lambda\|_{L^p(I, L^q)} \lesssim_A \|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}, \tag{99} \]
\[ \|u_\lambda\|_{L^4_x(L^2_t(I))} \lesssim_A \lambda^{-\frac14} \|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}, \tag{100} \]
\[ \|u_\lambda\|_{L^2(I \times B_r)} \lesssim_A r^{\frac12} \lambda^{-\frac12} \|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}. \tag{101} \]
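The first step in the proof of Proposition 40 is to establish a wave packet parametrix and a wave packet decomposition for solutions to (95) on the $\lambda^{-1}$ time scale. This is done by rescaling starting from the results in the previous section. We begin by writing in the Weyl calculus
\[ i (A_{<\sqrt{\lambda}} \nabla \tilde S_\lambda + \tilde S_\lambda A_{<\sqrt{\lambda}} \nabla) = a^w(t, x, D). \]
Then the symbol $a(t, x, \xi)$ can be expressed as a principal term plus an error,
\[ a(t, x, \xi) = a_0(t, x, \xi) + a_r(t, x, \xi), \]
where the principal part $a_0$ is given by
\[ a_0(t, x, \xi) = -2i A_{<\sqrt{\lambda}}(t, x) \cdot \xi\, \tilde s_\lambda(\xi). \]
By Sobolev embeddings we have the following pointwise bound for the truncated magnetic potential:
\[ |\partial_x^\alpha A_{<\sqrt{\lambda}}(t, x)| \le c_\alpha \lambda^{\frac{1+|\alpha|}{2}} \|A(t)\|_{H^1}. \]
This yields
\[ |\partial_x^\alpha \partial_\xi^\beta a_0(t, x, \xi)| \lesssim_A c_{\alpha\beta} \lambda^{\frac{3+|\alpha|}{2} - |\beta|}. \tag{102} \]
In addition, by the Weyl calculus it follows that $a_r$ is also localized at frequency $\lambda$ and satisfies
\[ |\partial_x^\alpha \partial_\xi^\beta a_r(t, x, \xi)| \lesssim_A c_{\alpha\beta} \lambda^{\frac{1+|\alpha|}{2} - |\beta|}. \tag{103} \]
This brings us to our main integral bound for the symbol $a$, namely

Lemma 42. Assume that $A \in U^2_W H^1$ with $\operatorname{div} A = 0$. Then the above symbol $a$ satisfies
\[ \sup_{x,\xi} \int_0^T |\partial_x^\alpha \partial_\xi^\beta a(t, x + 2t\xi, \xi)| \, dt \lesssim_A \begin{cases} c_\beta (T\lambda)^{\frac12} \lambda^{-|\beta|} \log \lambda, & \alpha = 0, \\ c_{\alpha\beta} (T\lambda)^{\frac12} \lambda^{\frac{|\alpha|}{2} - |\beta|}, & |\alpha| \ge 1. \end{cases} \tag{104} \]

Proof. The bound for $a_r$ follows directly from (103), therefore it remains to consider $a_0$. Furthermore, it suffices to consider the case $|\alpha| = 0, 1$, $\beta = 0$. Then we need to prove the bounds
\[ \int_0^T |A_{<\sqrt{\lambda}}(t, x + 2t\xi)| \, dt \lesssim T^{\frac12} \lambda^{-\frac12} \ln \lambda\, \|\nabla A\|_{U^2_W L^2}, \qquad |\xi| \approx \lambda, \]
respectively
\[ \int_0^T |\nabla A_{<\sqrt{\lambda}}(t, x + 2t\xi)| \, dt \lesssim T^{\frac12} \|\nabla A\|_{U^2_W L^2}, \qquad |\xi| \approx \lambda. \]
These in turn follow by dyadic summation from
\[ \sup_{x,\xi} \int_0^T |(S_\mu B)(t, x + 2t\xi)| \, dt \lesssim T^{\frac12} \mu \lambda^{-\frac12} \|B\|_{U^2_W L^2}, \qquad |\xi| \approx \lambda. \tag{105} \]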
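The dyadic summation mentioned here can be sketched as follows. This is a heuristic outline only: it applies (105) with $B = \nabla A$, treats $S_\mu A$ as comparable to $\mu^{-1} S_\mu \nabla A$ (an assumption made for this sketch; a rigorous version would use a slightly modified frequency projection), and lumps frequencies below 1 into the piece $\mu = 1$.

```latex
% One dyadic piece of A, with 1 \le \mu < \lambda^{1/2}; (105) is applied to B = \nabla A.
\[
\int_0^T |(S_\mu A)(t,x+2t\xi)|\,dt
\lesssim \mu^{-1}\cdot T^{\frac12}\mu\lambda^{-\frac12}\,\|\nabla A\|_{U^2_W L^2}
= T^{\frac12}\lambda^{-\frac12}\,\|\nabla A\|_{U^2_W L^2}.
\]
% There are about \log_2(\lambda^{1/2}) such pieces, each of the same size:
\[
\sum_{1\le\mu<\lambda^{1/2}} T^{\frac12}\lambda^{-\frac12}
\approx T^{\frac12}\lambda^{-\frac12}\ln\lambda .
\]
% For the gradient no factor \mu^{-1} is gained, and the geometric sum is
% dominated by the largest frequency \mu \approx \lambda^{1/2}:
\[
\sum_{\mu<\lambda^{1/2}} T^{\frac12}\mu\lambda^{-\frac12}
\lesssim T^{\frac12}\lambda^{\frac12}\lambda^{-\frac12} = T^{\frac12}.
\]
```

Multiplying both results by the factor $|\xi| \approx \lambda$ coming from $a_0 \sim A \cdot \xi$ gives $(T\lambda)^{\frac12} \ln \lambda$ and $(T\lambda)^{\frac12} \lambda^{\frac12}$, consistent with the two cases of (104) for $\beta = 0$.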
The line $y = x + 2t\xi$ moves through a unit spatial cube in a time $\lambda^{-1}$. But, due to the finite speed of propagation for the wave equation, see (13), the contributions from different spatial unit cubes are square summable. Hence by Cauchy-Schwarz it suffices to prove the above bound for $T \le \lambda^{-1}$. By (12), for $T \le \lambda^{-1}$ we have
\[ \|B\|_{U^2_W(0,T;L^2)} \approx \|B\|_{U^2(0,T;L^2)}, \]
therefore it is enough to prove:
\[ \int_0^T |(S_\mu B)(t, x + 2t\xi)| \, dt \lesssim (\lambda T)^{\frac12} \mu \lambda^{-1} \|B\|_{U^2 L^2}. \]
It suffices to prove the bound when $B$ is a $U^2 L^2$ atom. By Cauchy-Schwarz it suffices to consider a single step, which corresponds to a time independent $B$. Then the last bound can be rewritten in the form
\[ \int_L |(S_\mu B)(x)| \, ds \lesssim \mu |L|^{\frac12} \|B\|_{L^2}, \]
where $L$ is an arbitrary line segment in $\mathbb{R}^3$. We can set $\mu = 1$ by rescaling. In coordinates $x = (x_1, x')$ suppose $L$ is contained in $\{x' = 0\}$. Then we use Sobolev embeddings in $x'$ and Cauchy-Schwarz with respect to $x_1$.

Next we consider the rescaling that preserves the flat Schrödinger flow and takes the time scale $\lambda^{-1}$ to 1, namely
\[ v_\lambda(x, t) = u\Big( \frac{x}{\sqrt{\lambda}}, \frac{t}{\lambda} \Big). \]
If $u$ solves (95) then for $v_\lambda$ we obtain the following equation:
\[ i v_t - \Delta v + \lambda^{-1} a^w\Big( \frac{t}{\lambda}, \frac{x}{\sqrt{\lambda}}, D\sqrt{\lambda} \Big) v = 0. \tag{106} \]
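The passage from (95) to (106) is a chain rule computation, sketched below. The auxiliary variables $y = x/\sqrt{\lambda}$ and $s = t/\lambda$ are introduced only for this illustration, and (95) is written in the form (84) with the symbol $a$ from the Weyl calculus identity above.

```latex
% v_\lambda(x,t) = u(y,s) with y = x/\sqrt{\lambda}, s = t/\lambda.
\[
\partial_t v_\lambda = \lambda^{-1}(\partial_s u)(y,s),
\qquad
\Delta_x v_\lambda = \lambda^{-1}(\Delta_y u)(y,s),
\qquad
D_y = \sqrt{\lambda}\,D_x .
\]
% Both terms of the free Schroedinger operator pick up the same factor \lambda^{-1},
% which is why this scaling preserves the flat flow:
\[
i\partial_t v_\lambda - \Delta_x v_\lambda
= \lambda^{-1}\big(i\partial_s u - \Delta_y u\big)(y,s)
= -\lambda^{-1}\big(a^w(s,y,D_y)\,u\big)(y,s).
\]
% Expressing the right-hand side in the (x,t) variables:
\[
a^w(s,y,D_y)\,u \ \longleftrightarrow\
a^w\!\Big(\frac{t}{\lambda},\frac{x}{\sqrt{\lambda}},\sqrt{\lambda}\,D_x\Big)\,v_\lambda .
\]
```

Moving the last term to the left-hand side reproduces (106), including the prefactor $\lambda^{-1}$ in front of the rescaled symbol.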
However, this is not sufficient, we need to repeat the same procedure for shorter time scales. Precisely, for each $\lambda < \mu < \lambda^2$ we can rescale the $\mu^{-1}$ time scale to the unit scale by setting
\[ v(x, t) = u\Big( \frac{x}{\sqrt{\mu}}, \frac{t}{\mu} \Big). \]
Then for $v$ we obtain the equation
\[ i v_t - \Delta v + a_\mu^w(t, x, D) v = 0, \qquad a_\mu(t, x, \xi) = \mu^{-1} a\Big( \frac{t}{\mu}, \frac{x}{\sqrt{\mu}}, \xi\sqrt{\mu} \Big). \tag{107} \]
Rescaling the bounds (102), (103) and (104) it follows that this rescaled equation belongs to the class studied in the previous section:

Lemma 43. For $\varepsilon^{-1} \lambda \le \mu \le \lambda^2$ and small enough $\varepsilon$ the symbol $a_\mu$ satisfies (85) and (92) on the time interval $[0, 1]$.
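As an illustration of how the rescaling turns (104) into (85), here is a sketch of the computation in the case $|\alpha| \ge 1$. It uses the substitution $s = t/\mu$, $y = x/\sqrt{\mu}$, $\eta = \sqrt{\mu}\,\xi$ (names chosen for this sketch) and does not address (92) or the case $\alpha = 0$, where the logarithm in (104) needs separate care.

```latex
% Derivatives of a_\mu(t,x,\xi) = \mu^{-1} a(t/\mu, x/\sqrt{\mu}, \sqrt{\mu}\xi) produce
% \mu^{-1-|\alpha|/2+|\beta|/2}; the change of variables dt = \mu\,ds returns one power of \mu.
% Note that (x+2t\xi)/\sqrt{\mu} = y + 2s\eta, so straight lines map to straight lines.
\[
\int_0^1 \big|\partial_x^\alpha\partial_\xi^\beta a_\mu(t,x+2t\xi,\xi)\big|\,dt
= \mu^{\frac{|\beta|-|\alpha|}{2}}
\int_0^{1/\mu} \big|(\partial_x^\alpha\partial_\xi^\beta a)(s,y+2s\eta,\eta)\big|\,ds .
\]
% Apply (104) with T = 1/\mu and |\alpha| \ge 1:
\[
\lesssim_A \mu^{\frac{|\beta|-|\alpha|}{2}}
\Big(\frac{\lambda}{\mu}\Big)^{\frac12}\lambda^{\frac{|\alpha|}{2}-|\beta|}
= \Big(\frac{\lambda}{\mu}\Big)^{\frac{1+|\alpha|}{2}}
\Big(\frac{\mu^{1/2}}{\lambda}\Big)^{|\beta|}
\le \varepsilon^{\frac{1+|\alpha|}{2}},
\]
% using \lambda/\mu \le \varepsilon and \mu^{1/2} \le \lambda in the stated range of \mu.
```

Both constraints on $\mu$ are used: the lower bound $\mu \ge \varepsilon^{-1}\lambda$ supplies the smallness, and the upper bound $\mu \le \lambda^2$ keeps the $\xi$-derivatives from costing anything.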
This allows us to apply the results in the previous section to the evolution (95). Rescaling the result in Corollary 33 we obtain short time pointwise bounds for the solution to (95):

Lemma 44. The solution of (95) has the pointwise decay
\[ \|u(t)\|_{L^\infty} \lesssim |t - s|^{-\frac{n}{2}} \|u(s)\|_{L^1}, \qquad |t - s| \ll \lambda^{-1}. \tag{108} \]

Proof. Without any restriction in generality we can take $s = 0$. If $\lambda^{-2} \le t \le \varepsilon \lambda^{-1}$ then this follows directly from Corollary 33 applied to Eq. (107) with $\mu^{-1} = t$. The case $t < \lambda^{-2}$ needs to be considered separately. For such $t$ we split the evolution in two parts,
\[ S(t, 0) = S(t, 0) \tilde{\tilde S}_\lambda + S(t, 0)(1 - \tilde{\tilde S}_\lambda). \]
The second part evolves according to the constant coefficient Schrödinger flow, hence it is easy to estimate. For the first part we use the rescaled parametrix in the previous section corresponding to $\mu = \lambda^2$. The solution $S(t, 0) \tilde{\tilde S}_\lambda \delta_x$ consists of a single packet on the $\lambda^{-1}$ spatial scale which does not move up to time $\lambda^{-2}$. Hence we obtain
\[ |S(t, 0) \tilde{\tilde S}_\lambda \delta_x| \lesssim \lambda^3 (1 + \lambda |x - y|)^{-N}, \qquad |t| \le \lambda^{-2}, \tag{109} \]
which concludes the proof.

By [4], the Strichartz estimates in (96) are a direct consequence of (108). We continue with a decay bound away from the propagation region:

Lemma 45. If $|t - s| \le \varepsilon \lambda^{-1}$ then the kernel of $S(t, s) S_\lambda$ satisfies
\[ |K(t, x, s, y)| \lesssim \lambda^3 (1 + \lambda |x - y| + \lambda^2 |t - s|)^{-N} \tag{110} \]
whenever $|t - s| + \lambda^{-2} \ll \lambda^{-1} |x - y|$ or $\lambda^{-1} |x - y| + \lambda^{-2} \ll |t - s|$.

Proof. Without any restriction in generality we can take $s = 0$. If $|t - s| \le \lambda^{-2}$ then we use (109). On the other hand if $\lambda^{-2} \le |t - s| \le \varepsilon \lambda^{-1}$ then we rescale (89) applied to (107) with $\mu = t^{-1}$.

Since the input is localized at frequency $\lambda$, it follows that waves need exactly a time $\approx \lambda^{-1} |x - y| + \lambda^{-2}$ to travel from $x$ to $y$. Next we consider pointwise square function bounds:

Lemma 46. The evolution $S(t, s)$ associated to (95) has the pointwise square function decay
\[ \Big\| \int_I S(t, s) S_\lambda (f(s) \delta_y) \, ds\, (x) \Big\|_{L^2_t(I)} \lesssim |x - y|^{-1} \|f\|_{L^2_t(I)}, \qquad |I| \lesssim \lambda^{-1}. \tag{111} \]

Proof. If $|x - y| \gg 1$ then we can use directly (110). If $|x - y| \lesssim 1$ then we split the integral in two parts. If $|t - s| \not\approx \lambda^{-1} |x - y|$ then we can still use (110). On the other hand if $|t - s| \approx \lambda^{-1} |x - y|$ then we rescale (94) applied to (107) with $\mu = \lambda |x - y|^{-1}$.
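We continue with the proof of (97). By the $TT^*$ argument we need to prove the bound
\[ \Big\| \int_I S(t, s) S_\lambda f(s) \, ds \Big\|_{L^4_x L^2_t} \lesssim_A \lambda^{-\frac12} \|f\|_{L^{\frac43}_x L^2_t}. \tag{112} \]
For this we use Stein's complex interpolation theorem. Define the holomorphic family of operators
\[ T_z f(t) = \int_I z (t - s)_+^{z-1} S(t, s) S_\lambda f(s) \, ds. \]
Then we need to show that
\[ \|T_1 f\|_{L^4_x L^2_t} \lesssim_A \lambda^{-\frac12} \|f\|_{L^{\frac43}_x L^2_t}. \]
This follows by interpolation from
\[ \|T_z f\|_{L^2} \lesssim \|f\|_{L^2}, \qquad \Re z = 0 \tag{113} \]
and
\[ \|T_z f\|_{L^\infty_x L^2_t} \lesssim_A \lambda^{-1} \|f\|_{L^1_x L^2_t}, \qquad \Re z = 2. \tag{114} \]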
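The bookkeeping behind this interpolation step can be sketched as follows. The parameter $\theta$ is the usual interpolation parameter and is notation introduced here; the sketch also takes for granted the admissible growth in $\Im z$ that Stein's theorem requires, which is not checked in this illustration.

```latex
% Interpolating between the lines Re z = 0 and Re z = 2 at \theta = 1/2 lands on Re z = 1.
\[
\Re z = (1-\theta)\cdot 0 + \theta\cdot 2 = 1
\quad\Longleftrightarrow\quad \theta = \tfrac12 .
\]
% Spatial exponents on the target side (L^2_x and L^\infty_x) and the source side (L^2_x and L^1_x):
\[
\frac{1-\theta}{2} + \frac{\theta}{\infty} = \frac14 ,
\qquad
\frac{1-\theta}{2} + \frac{\theta}{1} = \frac34 .
\]
% The constants interpolate geometrically:
\[
1^{\,1-\theta}\cdot(\lambda^{-1})^{\theta} = \lambda^{-\frac12}.
\]
```

This recovers the exponents $L^{\frac43}_x \to L^4_x$ and the constant $\lambda^{-\frac12}$ in (112), with the $L^2_t$ norm carried along unchanged.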
For (113) we write
\[ S(0, t) T_z f(t) = \int_I z (t - s)_+^{z-1} S_\lambda S(0, s) f(s) \, ds. \]
Since $S(t, s)$ are $L^2$ isometries it suffices to prove that
\[ \Big\| \int_I z (t - s)_+^{z-1} f(s) \, ds \Big\|_{L^2} \lesssim \|f\|_{L^2} \]
which is straightforward by Plancherel's theorem since the Fourier transform of $z t_+^{z-1}$ is $\Gamma(z + 1)(\tau + i0)^{-z}$ which is bounded.

On the other hand the bound (114) is equivalent to
\[ \|T_z (f \delta_y)(x)\|_{L^2_t} \lesssim_A \lambda^{-1} \|f\|_{L^2_t} \]
which we can rewrite in the form
\[ \Big\| \int_I (t - s)^{1+i\sigma} S_{\sqrt{\lambda}}(t, s) S_\lambda (f(s) \delta_y) \, ds\, (x) \Big\|_{L^2_t(I)} \lesssim_A \lambda^{-1} \|f\|_{L^2_t(I)}. \]
Restricting $t - s$ to the range $|t - s| \approx \lambda^{-1} |x - y| + \lambda^{-2}$ this is a consequence of (111). On the other hand for larger $t - s$ we can use directly the pointwise bound (110).

The last step of the proof of Proposition 40 is the localized energy estimate (98). This follows directly by rescaling from (93) applied to Eq. (106).
8. Short Range Bilinear and Trilinear Estimates We first consider L 2 bilinear product estimates where one factor solves the wave equation and the other solves the Schrödinger equation. 2 H 1 , 1 ≤ µ λ and |I | ≤ λ−1 . Then the following Proposition 47. Assume that A ∈ UW 2 bilinear L estimates hold: 1
Sµ (Bλ u λ ) L 2 (I ×R3 ) A µ 2 Bλ U 2
WL
2
u λ U 2 √
2
u µ U 2 √
1
Bλ u µ L 2 (I ×R3 ) A µ 2 Bλ U 2
WL
A, λ
L2 ,
(115)
,
(116)
A, µ L
2
1
Bµ u λ L 2 (I ×R3 ) A µλ− 2 Bµ U 2 L 2 vλ . U 2 √ W
A, λ
L2 .
(117)
We remark that the constants in (117) are optimal, and in effect as a consequence of the results in the last section (117) can be extended almost up to time 1. On the other hand the constants in (115), (116) are not optimal, but this is not so important because this corresponds to the non-resonant case in the trilinear estimates.

Proof. For (115) it suffices to use Bernstein’s inequality and the Strichartz estimates,
\[ \|S_\mu(B_\lambda u_\lambda)\|_{L^2} \lesssim \mu^{\frac12}\|B_\lambda u_\lambda\|_{L^2_t L^{3/2}_x} \lesssim \mu^{\frac12}\|B_\lambda\|_{L^\infty_t L^2_x}\|u_\lambda\|_{L^2_t L^6_x} \lesssim_A \mu^{\frac12}\|B_\lambda\|_{U^2_W L^2}\|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}. \]
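The exponents in this chain can be checked directly. The following is a sketch of the bookkeeping, assuming the standard Bernstein inequality in $\mathbb{R}^3$; the symbols $p$, $q$ are ours.

```latex
% Hoelder in x and in t for the product B_\lambda u_\lambda:
\[
\tfrac12 + \tfrac16 = \tfrac23
\ \Rightarrow\ L^2_x \cdot L^6_x \subset L^{3/2}_x,
\qquad
\tfrac1\infty + \tfrac12 = \tfrac12
\ \Rightarrow\ L^\infty_t \cdot L^2_t \subset L^2_t .
\]
% Bernstein at frequency \mu in dimension 3, from L^p_x to L^q_x with p = 3/2, q = 2:
\[
\|S_\mu f\|_{L^q_x} \lesssim \mu^{3\left(\frac1p - \frac1q\right)} \|f\|_{L^p_x},
\qquad
3\left(\tfrac23 - \tfrac12\right) = \tfrac12 .
\]
```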
A similar argument applies for (116).

It remains to prove (117). By (12) we replace the $U^2_W L^2$ space by the $U^2 L^2$ space on a short time scale. Hence we can rewrite (117) in the form
\[ \|B_\mu u_\lambda\|_{L^2(I\times\mathbb{R}^3)} \lesssim_A \mu\lambda^{-\frac12}\,\|B_\mu\|_{U^2 L^2}\,\|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}. \tag{118} \]
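Due to the atomic structure of the $U^2$ spaces it suffices to prove the above bound in the special case when both $B_\mu$ and $u_\lambda$ solve the corresponding homogeneous equations $\partial_t B_\mu = 0$, respectively (95). We consider a partition of unity on the $\mu^{-1}$ scale,
\[ 1 = \sum_{x_0 \in \mu^{-1}\mathbb{Z}^3} \phi_{x_0}^2(x), \]
and use the localized energy estimates (101) for $u_\lambda$ with $r = \mu^{-1}$:
\[
\begin{aligned}
\|B_\mu u_\lambda\|^2_{L^2(I\times\mathbb{R}^3)}
&\approx \sum_{x_0 \in \mu^{-1}\mathbb{Z}^3} \|\phi_{x_0}^2 B_\mu u_\lambda\|^2_{L^2}
\lesssim \sum_{x_0 \in \mu^{-1}\mathbb{Z}^3} \|\phi_{x_0} B_\mu\|^2_{L^\infty}\,\|\phi_{x_0} u_\lambda\|^2_{L^2} \\
&\lesssim_A \lambda^{-1}\mu^{-1}\,\|u_\lambda\|^2_{U^2_{A,\sqrt{\lambda}} L^2} \sum_{x_0 \in \mu^{-1}\mathbb{Z}^3} \|\phi_{x_0} B_\mu\|^2_{L^\infty}
\lesssim \lambda^{-1}\mu^{2}\,\|B_\mu\|^2_{U^2 L^2}\,\|u_\lambda\|^2_{U^2_{A,\sqrt{\lambda}} L^2}.
\end{aligned}
\]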
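Next we turn our attention to trilinear estimates. We begin with the easier case of three $U^2$ type spaces.

Proposition 48. a) If $|I| \le \lambda^{-1}$, $\mu \lesssim \lambda$ and $B_\mu$, $u_\lambda$, $v_\lambda$ are localized at frequency $\mu$, $\lambda$, respectively $\lambda$, then
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim_A \frac{\min(\mu,\lambda^{\frac12})}{\lambda}\,\|B_\mu\|_{U^2_W L^2}\,\|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}\,\|v_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}. \tag{119} \]
b) If $|I| = \lambda^{-1}$, $\mu \ll \lambda$ and $B_\lambda$, $u_\mu$, $v_\lambda$ are localized at frequency $\lambda$, $\mu$, respectively $\lambda$, then
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\lambda u_\mu \bar v_\lambda\,dx\,dt \Big| \lesssim_A \mu^{\frac12}\lambda^{-1}\,\|B_\lambda\|_{U^2_W L^2}\,\|u_\mu\|_{U^2_{A,\sqrt{\mu}} L^2}\,\|v_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}. \tag{120} \]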
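Proof. a) If $\mu < \lambda^{\frac12}$ then the conclusion follows directly from (117) since
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim |I|^{\frac12}\,\|B_\mu u_\lambda\|_{L^2}\,\|v_\lambda\|_{L^\infty L^2}. \]
If $\mu > \lambda^{\frac12}$ then we use (12) to replace $U^2_W L^2$ by $U^2 L^2 \subset L^2_x L^\infty_t$. Then we estimate
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim_A \|B_\mu\|_{L^2_x L^\infty_t}\,\|u_\lambda\|_{L^4_x L^2_t}\,\|v_\lambda\|_{L^4_x L^2_t} \]
and use the square function bounds (97) for the last two factors.

b) In the Fourier space we obtain nontrivial contributions when either all three time frequencies are $\ll \lambda^2$ or when at least two of them are $\gtrsim \lambda^2$. More precisely, using smooth time multiplier cutoffs we can write
\[
\begin{aligned}
\int_I\int_{\mathbb{R}^3} B_\lambda u_\mu \bar v_\lambda\,dx\,dt
&= \int_I\int_{\mathbb{R}^3} \chi_{\{|D_t|>\lambda^2/32\}} B_\lambda\; u_\mu \bar v_\lambda\,dx\,dt \\
&\quad + \int_I\int_{\mathbb{R}^3} \chi_{\{|D_t|<\lambda^2/32\}} B_\lambda\; \chi_{\{|D_t|>\lambda^2/32\}} u_\mu\; \bar v_\lambda\,dx\,dt \\
&\quad + \int_I\int_{\mathbb{R}^3} \chi_{\{|D_t|<\lambda^2/32\}} B_\lambda\; \chi_{\{|D_t|<\lambda^2/32\}} u_\mu\; \chi_{\{|D_t|<\lambda^2/16\}} \bar v_\lambda\,dx\,dt.
\end{aligned}
\]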
Since the wave equation has constant coefficients, for the first term we can bound the first factor in $L^2$,
\[ \|\chi_{\{|D_t|>\lambda^2/32\}} B_\lambda\|_{L^2} \lesssim \lambda^{-1}\|B_\lambda\|_{U^2_W L^2}. \]
On the other hand for the remaining product we use the energy estimate for $v_\lambda$ and the $L^2 L^\infty$ bound for $u_\mu$. We argue in a similar manner for the other two terms. The bilinear expressions
\[ \chi_{\{|D_t|<\lambda^2/32\}} B_\lambda\; \chi_{\{|D_t|<\lambda^2/32\}} u_\mu, \qquad S_\mu\big(\chi_{\{|D_t|<\lambda^2/32\}} B_\lambda\, \bar v_\lambda\big) \]
can be estimated in $L^2$ using (116), respectively (115). Hence it remains to bound in $L^2$ the high modulation factors:
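Lemma 49. We have
\[ \|\chi_{\{|D_t|<\lambda^2/16\}} v_\lambda\|_{L^2} \lesssim_A \lambda^{-1}\|v_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}, \]
respectively
\[ \|\chi_{\{|D_t|>\lambda^2/32\}} u_\mu\|_{L^2} \lesssim_A \lambda^{-1}\|u_\mu\|_{U^2_{A,\sqrt{\mu}} L^2}. \]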
Proof. In the case $A = 0$ both bounds are trivial; the difficulty is to accommodate the unbounded term involving $A$. We consider the first bound only, as the argument for the second is similar. Without any restriction in generality we can take $v_\lambda$ to be a $U^2$ atom. The kernels of the operators $\chi_{\{|D_t|<\lambda^2/16\}}$, respectively $\chi_{\{|D_t|>\lambda^2/32\}}$, decay rapidly on the $\lambda^{-2}$ time scale. Then it suffices to prove the estimate in two cases:

(i) $v_\lambda$ is supported in a $\lambda^{-2}$ time interval (this corresponds to steps of length $\lambda^{-2}$ and shorter). Then the bound follows directly from the energy estimates and Hölder’s inequality.

(ii) $v_\lambda$ solves the homogeneous equation (95) on the time interval $I$ with $\lambda^{-2} \le |I| \le \lambda^{-1}$ (this corresponds to steps of length $\lambda^{-2}$ and longer). Then we can use the bound (117) to estimate
\[ \|(i\partial_t - \Delta)v_\lambda\|_{L^2(I)} = \|A_{<\sqrt{\lambda}} \nabla \tilde S_\lambda v_\lambda\|_{L^2(I)} \lesssim_A \ln\lambda\;\lambda^{\frac12}\,\|v_\lambda\|_{L^\infty L^2}. \]
Hence with $I = [t_0,t_1]$ we can write
\[ (i\partial_t - \Delta)(\chi_I v_\lambda) = i v_\lambda(t_0)\,\delta_{t=t_0} - i v_\lambda(t_1)\,\delta_{t=t_1} + f_\lambda, \]
where $\chi_I$ is the characteristic function of $I$ and
\[ \|f_\lambda\|_{L^2} \lesssim_A \ln\lambda\;\lambda^{\frac12}\,\|v_\lambda\|_{L^\infty L^2}. \]
Hence working with the constant coefficient Schrödinger equation we obtain
\[ \|\chi_{\{|D_t|<\lambda^2/2\}} \chi_I v_\lambda\|_{L^2} \lesssim_A \lambda^{-1}\big(\|v_\lambda(t_0)\|_{L^2} + \|v_\lambda(t_1)\|_{L^2}\big) + \lambda^{-2}\|f_\lambda\|_{L^2} \lesssim_A \lambda^{-1}\|v_\lambda\|_{L^\infty L^2}, \]
which is exactly what we need.
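The powers of $\lambda$ in the last display can be traced as follows. This is a heuristic sketch under our own assumptions: a sharp time-frequency cutoff, the Fourier convention in which $i\partial_t - \Delta$ has symbol $-\tau + |\xi|^2$, and a frequency localization for which $|\xi|^2 - \tau \gtrsim \lambda^2$ on the support of the cutoff.

```latex
% Inverting the constant coefficient operator away from the paraboloid:
\[
|\xi|^2 - \tau \gtrsim \lambda^2
\ \Longrightarrow\
\|\chi_{\{|D_t|<\lambda^2/2\}} w\|_{L^2}
\lesssim \lambda^{-2}\,\|\chi_{\{|D_t|<\lambda^2/2\}} (i\partial_t - \Delta) w\|_{L^2}.
\]
% A time delta function, cut off to time frequencies below \lambda^2/2 (Plancherel in t):
\[
\|\chi_{\{|D_t|<\lambda^2/2\}}\,\delta_{t=t_0}\|_{L^2_t}
= \Big(\frac{1}{2\pi}\int_{|\tau|<\lambda^2/2} d\tau\Big)^{\frac12}
= \frac{\lambda}{\sqrt{2\pi}} .
\]
% Hence the boundary terms contribute \lambda^{-2}\cdot\lambda = \lambda^{-1},
% while the forcing term is lower order:
\[
\lambda^{-2}\,\|f_\lambda\|_{L^2}
\lesssim_A \lambda^{-\frac32}\ln\lambda\;\|v_\lambda\|_{L^\infty L^2}
\lesssim \lambda^{-1}\,\|v_\lambda\|_{L^\infty L^2}.
\]
```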
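Finally we turn our attention to the case when one of the three $U^2$ spaces is replaced by a $V^2$ space:

Proposition 50. a) Let $|I| \le \lambda^{-1}$, $\mu \lesssim \lambda$ and $B_\mu$, $u_\lambda$, $v_\lambda$ be localized at frequency $\mu$, $\lambda$, respectively $\lambda$. Then
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim_A \frac{(\lambda|I|)^{\frac12}\mu}{\lambda}\,\|B_\mu\|_{U^2_W L^2}\,\|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}\,\|v_\lambda\|_{V^2_{A,\sqrt{\lambda}} L^2}. \tag{121} \]
If in addition $\lambda^{\frac12} \lesssim \mu \lesssim \lambda$, then
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim_A \frac{\ln\frac{\mu}{\sqrt{\lambda}}}{\sqrt{\lambda}}\,\|B_\mu\|_{U^2_W L^2}\,\|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}\,\|v_\lambda\|_{V^2_{A,\sqrt{\lambda}} L^2}. \tag{122} \]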
b) If $|I| = \lambda^{-1}$, $\mu \ll \lambda$ and $B_\lambda$, $u_\mu$, $v_\lambda$ are localized at frequency $\lambda$, $\mu$, respectively $\lambda$, then
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\lambda u_\mu \bar v_\lambda\,dx\,dt \Big| \lesssim_A \frac{\mu^{\frac12}\ln\lambda}{\lambda}\,\|B_\lambda\|_{U^2_W L^2}\,\|u_\mu\|_{U^2_{A,\sqrt{\mu}} L^2}\,\|v_\lambda\|_{V^2_{A,\sqrt{\lambda}} L^2}. \tag{123} \]
Proof. a) Using the bilinear $L^2$ bound (117) for the product of the first two factors we obtain
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim_A \frac{(\lambda|I|)^{\frac12}\mu}{\lambda}\,\|B_\mu\|_{U^2_W L^2}\,\|u_\lambda\|_{U^2_{A,\sqrt{\lambda}} L^2}\,\|v_\lambda\|_{L^\infty L^2}. \]
Then (121) follows due to the trivial embedding $V^2_{A,\sqrt{\lambda}} L^2 \subset L^\infty_t L^2_x$. On the other hand the LHS of (122) can be estimated either as above or as in (119). Then (122) follows from the decomposition
\[ V^2_{A,\sqrt{\lambda}} L^2 \subset \ln\frac{\mu}{\sqrt{\lambda}}\; U^2_{A,\sqrt{\lambda}} L^2 + \Big(\frac{\mu}{\sqrt{\lambda}}\Big)^{-1} L^\infty L^2. \]
We can factor out the (95) flow by pulling functions back to time 0 along the flow. Then the above relation becomes
\[ V^2 L^2 \subset \ln\sigma\; U^2 L^2 + \sigma^{-1} L^\infty L^2, \qquad \sigma \gg 1. \]
This in turn is true due to Lemma 8.

b) This follows from a similar argument to the one above and by using (116) and (120).

9. The Short Time Paradifferential Calculus

In this section we prove that, given a dyadic frequency $\lambda$, the evolution of the $\lambda$ dyadic piece of a solution $u$ to (18) is well approximated by the evolution of the paradifferential equation (95) on time intervals of size $\lambda^{-1}$. We also introduce different paradifferential truncations
\[ i u_t - \Delta u + i\big(A_{<\nu} \nabla \tilde S_\lambda + \tilde S_\lambda A_{<\nu} \nabla\big)u = 0, \qquad u(0) = u_0 \tag{124} \]
and show that they all generate equivalent spaces. The spaces associated to (124) are denoted by $U^2_{A,\nu,\lambda} L^2$, $V^2_{A,\nu,\lambda} L^2$, respectively $DU^2_{A,\nu,\lambda} L^2$. We refer to the above evolution as the $(A_{<\nu},\lambda)$ flow. A special case of the above equation is when $A_{<\nu}$ is replaced by $A_{\ll\lambda}$. We refer to that as the $(A_{\ll\lambda},\lambda)$ flow. By a slight abuse of notation we denote the corresponding spaces by $U^2_{A,\lambda,\lambda} L^2$, etc.

Proposition 51. a) For any interval $I$ with $|I| \le \lambda^{-1}$ the solution $u$ to (18) satisfies
\[ \|S_\lambda u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)} \lesssim_A \|u_0\|_{L^2}. \tag{125} \]
b) In addition, for any $\sqrt{\lambda} < \nu \ll \lambda$ we have
\[ \|u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)} \approx_A \|u\|_{U^2_{A,\nu,\lambda}(I;L^2)}. \tag{126} \]
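The bound (125) transfers easily to $U^2$ spaces:

Corollary 52. For any interval $I$ with $|I| \le \lambda^{-1}$ we have
\[ \|S_\lambda u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)} \lesssim_A \|u\|_{U^2_A(I;L^2)}. \tag{127} \]
Combining this with (99) we immediately obtain the bounds (20) in part (i) of Theorem 9:

Corollary 53. The solution $u$ to the homogeneous magnetic Schrödinger equation (18) satisfies the Strichartz estimates (20).

The rest of the section is dedicated to the proof of Proposition 51.

Proof. a) From Eq. (18) we obtain the following equation for $S_\lambda u$,
\[ \big(i\partial_t - \Delta + i A_{<\sqrt{\lambda}} \nabla \tilde S_\lambda + i \tilde S_\lambda A_{<\sqrt{\lambda}} \nabla\big) S_\lambda u = f_\lambda, \]
where
\[ f_\lambda = S_\lambda\big(2i A_{>\sqrt{\lambda}} \nabla u + A^2 u\big) + i\,[S_\lambda, A_{<\sqrt{\lambda}}]\,\tilde S_\lambda \nabla u. \]
Then we have
\[ \|S_\lambda u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)} \lesssim \|u_0\|_{L^2} + \|f_\lambda\|_{DU^2_{A,\sqrt{\lambda}}(I;L^2)}. \]
The estimate (125) follows if we establish that the inhomogeneous terms $f_\lambda$ are uniformly small,
\[ \|f_\lambda\|_{DU^2_{A,\sqrt{\lambda}}(I;L^2)} \lesssim_A (\lambda|I|)^{\delta}\, \sup_\lambda \|S_\lambda u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)}. \]
We consider the terms in $f_\lambda$. For the first term by duality we need to prove that
\[ \Big| \int_I\int_{\mathbb{R}^3} A_{>\sqrt{\lambda}} \nabla u\; S_\lambda \bar v\,dx\,dt \Big| \lesssim_A (\lambda|I|)^{\delta}\, \|v\|_{V^2_{A,\sqrt{\lambda}} L^2}\, \sup_\lambda \|S_\lambda u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)}. \]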
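We take a Littlewood-Paley decomposition of the first two factors and estimate each dyadic piece. There are several cases to consider for the integrand:

a) $S_\mu A\, \nabla S_\lambda u\; S_\lambda \bar v$ with $\sqrt{\lambda} \le \mu \le \lambda$. Using (121) yields a constant
\[ |I|^{\frac12}\sqrt{\lambda} = (\lambda|I|)^{\frac14}\,(\mu^2|I|)^{\frac14}\,\Big(\frac{\lambda}{\mu^2}\Big)^{\frac14}, \]
which is favorable if $|I| \le \mu^{-2}$. On the other hand using (122) yields a constant
\[ \mu^{-1}\sqrt{\lambda}\,\ln\frac{\mu}{\sqrt{\lambda}} = (|I|\lambda)^{\frac14}\,(|I|\mu^2)^{-\frac14}\,\Big(\frac{\lambda}{\mu^2}\Big)^{\frac14}\ln\frac{\mu}{\sqrt{\lambda}}, \]
which is favorable if $|I| \ge \mu^{-2}$.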
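Both identities in case a) are pure exponent bookkeeping, which can be verified factor by factor. The sketch below only rearranges the constants quoted above; the remark on dyadic summation is our reading and is not stated in the text.

```latex
% First identity: collect the powers of \lambda, \mu and |I| on the right-hand side.
\[
(\lambda|I|)^{\frac14}(\mu^2|I|)^{\frac14}\Big(\frac{\lambda}{\mu^2}\Big)^{\frac14}
= \lambda^{\frac14+\frac14}\,\mu^{\frac12-\frac12}\,|I|^{\frac14+\frac14}
= \lambda^{\frac12}|I|^{\frac12}.
\]
% Second identity (the logarithm is a common factor on both sides):
\[
(|I|\lambda)^{\frac14}(|I|\mu^2)^{-\frac14}\Big(\frac{\lambda}{\mu^2}\Big)^{\frac14}
= \lambda^{\frac14+\frac14}\,\mu^{-\frac12-\frac12}\,|I|^{\frac14-\frac14}
= \lambda^{\frac12}\mu^{-1}.
\]
% In the regime |I| <= \mu^{-2} the factor (\mu^2|I|)^{1/4} is at most 1, and in the
% regime |I| >= \mu^{-2} the factor (|I|\mu^2)^{-1/4} is at most 1. In both cases
% (\lambda/\mu^2)^{1/4} = (\sqrt{\lambda}/\mu)^{1/2} <= 1 decays geometrically along
% dyadic \mu >= \sqrt{\lambda}, which presumably is what makes the sum over \mu harmless.
```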
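b) $S_\lambda A\, \nabla S_\mu u\; S_\lambda \bar v$ with $\mu \ll \lambda$. Then using (123) yields a constant
\[ \mu^{\frac32}\lambda^{-2}\ln\lambda \le \lambda^{-\frac12}\ln\lambda, \]
and a power of $|I|$ can be easily gained as in Remark 10.

c) $S_\nu A\, \nabla S_\nu u\; S_\lambda \bar v$ with $\nu \gg \lambda$. Then we can use (123) but only on $\nu^{-1}$ time intervals. We obtain a constant
\[ \lambda^{\frac12}\nu^{-1}\ln\nu \]
and again a power of $|I|$ is gained as in Remark 10.

For the second term in $f_\lambda$ by duality we need to prove that
\[ \Big| \int_I\int_{\mathbb{R}^3} A^2 u\; S_\lambda \bar v\,dx\,dt \Big| \lesssim_A (\lambda|I|)^{\delta}\, \|v\|_{V^2_{A,\sqrt{\lambda}} L^2}\, \sup_\lambda \|S_\lambda u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)}. \]
We consider the corresponding dyadic pieces
\[
\begin{aligned}
\Big| \int_I\int_{\mathbb{R}^3} S_{\lambda_1}A\; S_{\lambda_2}A\; S_{\lambda_3}u\; S_\lambda \bar v\,dx\,dt \Big|
&\le (\lambda|I|)^{\frac{5}{12}}\,\|S_{\lambda_1}A\|_{L^6 L^3}\,\|S_{\lambda_2}A\|_{L^6 L^3}\,\|S_{\lambda_3}u\|_{L^\infty L^2}\,\|S_\lambda v\|_{L^4 L^3} \\
&\lesssim_A (\lambda|I|)^{\frac{5}{12}}\,\lambda_1^{-\frac23}\lambda_2^{-\frac23}\,\|A\|^2_{U^2_W H^1}\,\|S_{\lambda_3}u\|_{L^\infty L^2}\,\|v\|_{V^2_{A,\sqrt{\lambda}} L^2},
\end{aligned}
\]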
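where for $v$ we have used the short time Strichartz estimates. The summation with respect to $\lambda_1$ and $\lambda_2$ is trivial. So is the summation with respect to $\lambda_3$ since the integral is zero unless either $\lambda_3 = \lambda$ or $\lambda_3 \le \max\{\lambda_1,\lambda_2\}$.

Finally for the commutator term in $f_\lambda$ we have the bound
\[
\begin{aligned}
\|[S_\lambda, A_{<\sqrt{\lambda}}]\nabla \tilde S_\lambda u\|_{L^1 L^2}
&\lesssim \lambda^{-1}(\lambda|I|)\,\|[S_\lambda, A_{<\sqrt{\lambda}}]\nabla \tilde S_\lambda u\|_{L^\infty L^2} \\
&\lesssim \lambda^{-1}(\lambda|I|)\,\|\nabla A_{<\sqrt{\lambda}}\|_{L^\infty}\,\|\tilde S_\lambda u\|_{L^\infty L^2}
\lesssim \lambda^{-\frac14}(\lambda|I|)\,\|A\|_{L^\infty H^1}\,\|\tilde S_\lambda u\|_{L^\infty L^2},
\end{aligned}
\]
which again suffices by duality.

b) By the same argument as in part (a) we obtain
\[ \|A_{<\sqrt{\lambda}}\nabla u - A_{<\nu}\nabla u\|_{DU^2_{A,\sqrt{\lambda}}(I;L^2)} \lesssim_A (\lambda|I|)^{\delta}\,\|u\|_{U^2_{A,\sqrt{\lambda}}(I;L^2)}, \]
which shows that the two flows are close. Then (126) follows due to Lemma 3.

10. Generalized Wave Packet Decompositions and Long Range Trilinear Estimates

Denote by $C_1(\mu,\lambda,|I|)$ the best constant in the estimate
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \le C_1(\mu,\lambda,|I|)\,\|B_\mu\|_{U^2_W L^2}\,\|u_\lambda\|_{U^2_{A,\lambda,\lambda} L^2}\,\|v_\lambda\|_{U^2_{A,\lambda,\lambda} L^2} \]
with $B_\mu$, $u_\lambda$ and $v_\lambda$ localized at frequencies $\mu$, $\lambda$, respectively $\lambda$. Similarly, let $C_2(\mu,\lambda,|I|)$ be the best constant in the estimate
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\lambda u_\mu \bar v_\lambda\,dx\,dt \Big| \le C_2(\mu,\lambda,|I|)\,\|B_\lambda\|_{U^2_W L^2}\,\|u_\mu\|_{U^2_{A,\mu,\mu} L^2}\,\|v_\lambda\|_{U^2_{A,\lambda,\lambda} L^2} \]
with $B_\lambda$, $u_\mu$ and $v_\lambda$ localized at frequencies $\lambda$, $\mu$, respectively $\lambda$. As a consequence of Proposition 48 we have
\[ C_1(\mu,\lambda,\lambda^{-1}) \lesssim_A \lambda^{-\frac12}\min\{\mu\lambda^{-\frac12},1\}. \]
A trivial summation shows that for larger time intervals we have
\[ C_1(\mu,\lambda,|I|) \lesssim_A \lambda^{-\frac12}\min\{\mu\lambda^{-\frac12},1\}\,(1+\lambda|I|). \tag{128} \]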
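For orientation, the passage from (119) to (128) can be sketched as follows. The subintervals $I_k$ and the count $N$ are our notation; we assume that restricting to a subinterval does not increase the $U^2$ norms, and that on intervals of length $\lambda^{-1}$ the norms in (119) and in the definition of $C_1$ are comparable, as (126) suggests.

```latex
% The constant in (119) rewritten in the form used for C_1:
\[
\frac{\min(\mu,\lambda^{\frac12})}{\lambda}
= \lambda^{-\frac12}\min\{\mu\lambda^{-\frac12},\,1\}.
\]
% Split I into N <= 1 + \lambda|I| subintervals I_k of length at most \lambda^{-1}:
\[
\Big|\int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt\Big|
\le \sum_{k=1}^{N}\Big|\int_{I_k}\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt\Big|
\le N\,C_1(\mu,\lambda,\lambda^{-1})\,
\|B_\mu\|_{U^2_W L^2}\,\|u_\lambda\|_{U^2_{A,\lambda,\lambda}L^2}\,\|v_\lambda\|_{U^2_{A,\lambda,\lambda}L^2}.
\]
% This costs the full factor N ~ 1 + \lambda|I|, whereas (129) asks for only its square root.
```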
We seek to iteratively improve this to
\[ C_1(\mu,\lambda,|I|) \lesssim_A \lambda^{-\frac12}\min\{\mu\lambda^{-\frac12},1\}\,(1+\lambda|I|)^{\frac12} \tag{129} \]
for intervals $I$ almost up to length 1. To achieve this we iteratively produce an increasing sequence of times $T$ for which (129) holds for $|I| \le T$. At the same time we seek to improve $C_2(\mu,\lambda,|I|)$ in a similar manner, as well as extend the time for which the local energy and local Strichartz estimates hold. More precisely, we consider a set of properties as follows:

(i) Paradifferential approximation of the flow:
\[ \|S_\lambda u\|_{U^2_{A,\lambda,\lambda}(I;L^2)} \lesssim_A \|u\|_{U^2_A(I,L^2)}. \tag{130} \]
(ii) Trilinear bounds:
\[ C_1(\mu,\lambda,|I|) \lesssim_A \lambda^{-\frac12}\min\{\mu\lambda^{-\frac12},1\}\,(1+\lambda|I|)^{\frac12}, \tag{131} \]
\[ C_2(\mu,\lambda,|I|) \lesssim_A \mu^{\frac12}\lambda^{-1}\,(1+\lambda|I|)^{\frac12}. \tag{132} \]
(iii) Local energy and local Strichartz estimates for each cube $Q$ of size 1 and each function $u_\lambda$ localized at frequency $\lambda$:
\[ \|u_\lambda\|_{L^2(I;L^2(Q))} \lesssim_A \lambda^{-\frac12}\,\|u_\lambda\|_{U^2_{A,\lambda,\lambda} L^2}, \tag{133} \]
\[ \|u_\lambda\|_{L^2(I;L^6(Q))} \lesssim_A \|u_\lambda\|_{U^2_{A,\lambda,\lambda} L^2}. \tag{134} \]
So far, by Propositions 40, 51, 48 we know that the above estimates (130)–(134) hold if $|I| \le \lambda^{-1}$. On the other hand, in order to prove Theorem 9 we need to know that (130)–(134) hold if $|I| \le \lambda^{-\varepsilon}$ for $\varepsilon$ arbitrarily small (see also Remark 10). This is accomplished in the next result.

Proposition 54. Let $0 < \alpha \le 1$. Assume that the estimates (130)–(134) hold for $|I| < \lambda^{-\alpha}$. Then (130)–(134) hold for $|I| < \lambda^{-\beta}$ for each $\beta > \frac34\alpha$.

Proof. We first improve the time range of the paradifferential calculus:
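Lemma 55. a) For each frequency $\lambda$ we have
\[ \|S_\lambda u\|_{U^2_{A,\lambda,\lambda}(I,L^2)} \lesssim_A \|u\|_{U^2_A(I,L^2)}, \qquad |I| \le \lambda^{-\frac{\alpha}{2}}(\log\lambda)^{-\frac32}. \]
b) For each frequency $\lambda^{1-\frac{\alpha}{2}}\log\lambda < \nu \ll \lambda$ we have
\[ \|u\|_{U^2_{A,\nu,\lambda}(I,L^2)} \approx_A \|u\|_{U^2_{A,\lambda,\lambda}(I,L^2)}, \qquad |I| \le T(\lambda,\nu) = \nu\lambda^{-1-\frac{\alpha}{2}}(\log\lambda)^{-1}. \]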
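Proof. a) We observe that (131), (132), (133) for $|I| = \lambda^{-\alpha}$ trivially lead to bounds for longer time,
\[ C_1(\mu,\lambda,|I|) \lesssim_A \lambda^{-\frac12}\min\{\mu\lambda^{-\frac12},1\}\,(1+\lambda|I|)^{\frac12}(1+\lambda^{\alpha}|I|)^{\frac12}, \tag{135} \]
\[ C_2(\mu,\lambda,|I|) \lesssim_A \mu^{\frac12}\lambda^{-1}\,(1+\lambda|I|)^{\frac12}(1+\lambda^{\alpha}|I|)^{\frac12}, \tag{136} \]
\[ \|u_\lambda\|_{L^2(I,L^6(Q))} \lesssim_A (1+\lambda^{\alpha}|I|)^{\frac12}\,\|u_\lambda\|_{U^2_{A,\lambda,\lambda} L^2}. \tag{137} \]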
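Furthermore, due to (130), we obtain the same constant in the trilinear estimates when we replace $u_\lambda$ by $S_\lambda u$ and $\|u_\lambda\|_{U^2_{A,\lambda,\lambda} L^2}$ by $\|u\|_{U^2_A L^2}$. Also, by the argument in Lemma 8, we can replace one of the $U^2$ norms with a $V^2$ norm at the expense of an additional $\ln\lambda$ loss.

The rest of the proof is similar to the proof of Proposition 51. For $u$ solving (18) we write
\[ \big(i\partial_t - \Delta + i A_{\ll\lambda} \nabla \tilde S_\lambda + i \tilde S_\lambda A_{\ll\lambda} \nabla\big) S_\lambda u = f_\lambda, \]
where
\[ f_\lambda = S_\lambda\big(2i A_{\gtrsim\lambda} \nabla u + A^2 u\big) + i\,[S_\lambda, A_{\ll\lambda}]\,\tilde S_\lambda \nabla u. \]
Hence it suffices to prove that
\[ \|f_\lambda\|_{DU^2_{A,\lambda,\lambda}(I;L^2)} \lesssim_A \|u\|_{U^2_A(I,L^2)}. \]
We use duality and consider each term in $f_\lambda$. For the first one we need to show that
\[ \Big| \int_I\int_{\mathbb{R}^3} A_{\gtrsim\lambda} \nabla u\; S_\lambda \bar v\,dx\,dt \Big| \lesssim_A \|A\|_{U^2_W H^1}\,\|u\|_{U^2_A(I,L^2)}\,\|v\|_{V^2_{A,\lambda,\lambda} L^2}. \]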
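After a Littlewood-Paley decomposition of the first two factors we need to consider the following three cases for the integrand:

(i) $S_\lambda A\, \nabla S_\lambda u\; S_\lambda \bar v$. Then we use (135) to obtain a constant
\[ \log\lambda\;\lambda^{\frac12}(\lambda|I|)^{\frac12}(\lambda^{\alpha}|I|)^{\frac12}\lambda^{-1} = \log\lambda\;\lambda^{\frac{\alpha}{2}}|I|, \]
which is satisfactory given the range for $I$.

(ii) $S_\lambda A\, \nabla S_\mu u\; S_\lambda \bar v$, $\mu \ll \lambda$. Then we use (136) to obtain a constant
\[ \mu^{\frac32}\lambda^{-1}(\lambda|I|)^{\frac12}(\lambda^{\alpha}|I|)^{\frac12}\lambda^{-1} \le \lambda^{\frac{\alpha}{2}}|I|, \]
which is much better than we need.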
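(iii) $S_\nu A\, \nabla S_\nu u\; S_\lambda \bar v$, $\lambda \ll \nu$. Then we use (136) to obtain a constant
\[ \lambda^{\frac32}\nu^{-1}(\nu|I|)^{\frac12}(\nu^{\alpha}|I|)^{\frac12}\nu^{-1} \le \lambda^{\frac{\alpha}{2}}|I|. \]
For the second term in $f_\lambda$ by duality we need to prove that
\[ \Big| \int_I\int_{\mathbb{R}^3} A B u\; S_\lambda \bar v\,dx\,dt \Big| \lesssim_A \|A\|_{U^2_W H^1}\,\|B\|_{U^2_W H^1}\,\|u\|_{U^2_A(I,L^2)}\,\|v\|_{V^2_{A,\lambda,\lambda} L^2}. \]
Due to the finite speed of propagation for the wave equation, see (13), it suffices to consider the case when $A$ and $B$ are supported in a unit cube. Then we bound $A$ and $B$ in $L^\infty L^6$, $u$ in $L^2 L^6$ as in (137), and $v$ in $L^\infty L^2$, and use Hölder’s inequality with respect to time. This yields the same constant $\lambda^{\frac{\alpha}{2}}|I|$ as above.

Finally we consider the commutator term in $f_\lambda$. This can be represented in the form of a rapidly convergent series of the form
\[ [S_\lambda, A_{\ll\lambda}]\nabla \tilde S_\lambda u = \sum_j S^{1j}_\lambda\big(\nabla A_{\ll\lambda}\, S^{2j}_\lambda u\big), \]
where $S^{1j}_\lambda$, $S^{2j}_\lambda$ are operators similar to $S_\lambda$. Then by duality it suffices to prove that
\[ \Big| \int_I\int_{\mathbb{R}^3} \nabla A_{\ll\lambda}\, \nabla S_\lambda u\; S_\lambda \bar v\,dx\,dt \Big| \lesssim_A (\log\lambda)^{\frac32}\lambda^{\frac{\alpha}{2}}|I|\;\|\nabla A\|_{U^2_W L^2}\,\|u\|_{U^2_A L^2}\,\|v\|_{V^2_{A,\lambda,\lambda} L^2}. \]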
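For this it suffices to consider a Littlewood-Paley decomposition of $A_{\ll\lambda}$ and to apply (135) for each dyadic piece.

b) By virtue of Lemma 3 it suffices to show that
\[ \|A_{\ll\lambda} \nabla \tilde S_\lambda u - A_{<\nu} \nabla \tilde S_\lambda u\|_{DU^2_{A,\lambda,\lambda}} \lesssim_A \|u\|_{U^2_{A,\lambda,\lambda}} \]
and the similar bound for $\tilde S_\lambda A_{\ll\lambda} \nabla - \tilde S_\lambda A_{<\nu} \nabla$. By duality this becomes
\[ \Big| \int_I\int_{\mathbb{R}^3} \sum_{\mu=\nu}^{\lambda} S_\mu A\, \nabla \tilde S_\lambda u\; \bar v\,dx\,dt \Big| \lesssim_A \|A\|_{U^2_W H^1}\,\|u\|_{U^2_{A,\lambda,\lambda}}\,\|v\|_{V^2_{A,\lambda,\lambda}}. \]
Indeed, for $|I| > \lambda^{-\alpha}$ the estimate (135) yields a constant
\[ \log\lambda\;\lambda^{\frac12}(\lambda|I|)^{\frac12}(\lambda^{\alpha}|I|)^{\frac12}\nu^{-1} = \log\lambda\;\nu^{-1}\lambda^{\frac{\alpha}{2}+1}|I|, \]
which leads to the restriction
\[ |I| \le T(\lambda,\nu) = (\log\lambda)^{-1}\nu\lambda^{-\frac{\alpha}{2}-1}. \]
We observe that this is useful only if it provides information on time intervals with $|I| > \lambda^{-\alpha}$. This leads to the condition
\[ \nu > \lambda^{1-\frac{\alpha}{2}}\log\lambda. \]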
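The two conditions at the end of the proof of part b) follow from the displayed constant by elementary algebra. The sketch below only rearranges the quantities quoted above and introduces no new estimate.

```latex
% Exponent count for the constant obtained from (135):
\[
\lambda^{\frac12}\,(\lambda|I|)^{\frac12}\,(\lambda^{\alpha}|I|)^{\frac12}
= \lambda^{\frac12+\frac12+\frac{\alpha}{2}}\,|I|
= \lambda^{1+\frac{\alpha}{2}}\,|I| .
\]
% The constant is at most 1 exactly on the range |I| <= T(\lambda,\nu):
\[
\log\lambda\;\nu^{-1}\lambda^{1+\frac{\alpha}{2}}\,|I| \le 1
\iff
|I| \le \nu\,\lambda^{-1-\frac{\alpha}{2}}(\log\lambda)^{-1} = T(\lambda,\nu).
\]
% The range goes beyond the hypothesis |I| < \lambda^{-\alpha} only if T exceeds \lambda^{-\alpha}:
\[
T(\lambda,\nu) > \lambda^{-\alpha}
\iff
\nu > \lambda^{1+\frac{\alpha}{2}-\alpha}\log\lambda = \lambda^{1-\frac{\alpha}{2}}\log\lambda .
\]
```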
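Next we consider the (124) evolution and we construct a generalized wave packet structure for the flow. The frequency scale is $\delta\xi = \nu$ and the time scale is $T(\lambda,\nu)$, therefore it is natural to define the spatial scale by $\delta x = \nu T(\lambda,\nu)$, as in the case of the flat flow. We first partition the initial data. Let $\phi$ be a smooth unit bump function in $\mathbb{R}^3\times\mathbb{R}^3$ so that
\[ \sum_{k,j\in\mathbb{Z}^3} \phi(x-k,\xi-j) = 1. \]
Denote
\[ \phi^{\nu}_{x_0,\xi_0}(x,\xi) = \phi\Big(\frac{x-x_0}{\nu T(\lambda,\nu)},\, \frac{\xi-\xi_0}{\nu}\Big), \]
where
\[ (x_0,\xi_0) \in \mathbb{Z}^{2\times3}_{\nu} = (\nu T(\lambda,\nu)\mathbb{Z})^3 \times (\nu\mathbb{Z})^3. \]
Then consider an almost orthogonal decomposition of the initial data
\[ u_0 = \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2\times3}_{\nu}} \phi^{\nu}_{x_0,\xi_0}(x,D)\,u_0. \]
We denote the corresponding solutions to (124) by $u_{x_0,\xi_0}$, and we call them generalized wave packets. To measure their evolution we consider the family of operators
\[ L_{x_0,\xi_0} = \big\{ \nu^{-1}(\xi-\xi_0),\ (\nu T(\lambda,\nu))^{-1}(x-x_0-2t\xi) \big\}, \]
which commute with $i\partial_t - \Delta$. The following lemma shows that $u_{x_0,\xi_0}$ is concentrated in a tube
\[ T^{\nu}_{x_0,\xi_0} = \{ (x,\xi) : |x-x_0-2t\xi_0| \le \nu T(\lambda,\nu),\ |\xi-\xi_0| \le \nu \} \]
and decays rapidly away from it.

Lemma 56. The solutions $u_{x_0,\xi_0}$ for the (124) flow satisfy
\[ \sum_{|\alpha|\le N}\ \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2\times3}_{\nu}} \|L^{\alpha}_{x_0,\xi_0} u_{x_0,\xi_0}(t)\|^2_{U^2_{A,\nu,\lambda}(I,L^2)} \lesssim_A \|u_0\|^2_{L^2}, \qquad |I| \le T(\lambda,\nu). \tag{138} \]
Proof. At time 0 we clearly have
\[ \sum_{|\alpha|\le N}\ \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2\times3}_{\nu}} \|L^{\alpha}_{x_0,\xi_0} u_{x_0,\xi_0}(0)\|^2_{L^2} \lesssim \|u_0\|^2_{L^2}, \]
therefore it suffices to prove that a single generalized wave packet satisfies
\[ \sum_{|\alpha|\le N} \|L^{\alpha}_{x_0,\xi_0} u_{x_0,\xi_0}\|^2_{U^2_{A,\nu,\lambda}(I,L^2)} \lesssim_A \sum_{|\alpha|\le N} \|L^{\alpha}_{x_0,\xi_0} u_{x_0,\xi_0}(0)\|^2_{L^2}. \tag{139} \]
This follows iteratively from
\[ \|L_{x_0,\xi_0} v\|^2_{U^2_{A,\nu,\lambda}(I,L^2)} \lesssim_A \|L_{x_0,\xi_0} v(0)\|^2_{L^2} + \|v\|^2_{U^2_{A,\nu,\lambda}(I,L^2)} \tag{140} \]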
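for which, in turn, we need the commutator bound
\[ \|[L_{x_0,\xi_0}, A_{<\nu}\tilde S_\lambda \nabla]\,v\|_{DU^2_{A,\nu,\lambda}(I,L^2)} \lesssim_A \|v\|_{U^2_{A,\nu,\lambda}(I,L^2)} \tag{141} \]
as well as the similar one for the operator $\tilde S_\lambda A_{<\nu}\nabla$. If $L_{x_0,\xi_0} = \nu^{-1}(\xi-\xi_0)$ then
\[ [L_{x_0,\xi_0}, A_{<\nu}\tilde S_\lambda \nabla] = \nu^{-1}(\nabla A_{<\nu})\,\tilde S_\lambda \nabla, \]
therefore by duality we need to show that
\[ \Big| \int_I\int_{\mathbb{R}^3} \nabla A_{<\nu}\, \nabla \tilde S_\lambda u\; \bar v\,dx\,dt \Big| \lesssim_A \nu\,\|A\|_{U^2_W H^1}\,\|u\|_{U^2_{A,\nu,\lambda}}\,\|v\|_{V^2_{A,\nu,\lambda}}, \]
which follows from (135). If $L_{x_0,\xi_0} = (\nu T(\lambda,\nu))^{-1}(x-x_0-2t\xi)$, then
\[ [L_{x_0,\xi_0}, A_{<\nu}\tilde S_\lambda \nabla] = (\nu T(\lambda,\nu))^{-1}\big(A_{<\nu} + t(\nabla A_{<\nu})\nabla\big)\tilde S_\lambda. \]
The second term is as above. For the first by duality we need to show that
\[ \Big| \int_I\int_{\mathbb{R}^3} A_{<\nu}\, \tilde S_\lambda u\; \bar v\,dx\,dt \Big| \lesssim_A \nu T(\lambda,\nu)\,\|A\|_{U^2_W H^1}\,\|u\|_{U^2_{A,\nu,\lambda}}\,\|v\|_{V^2_{A,\nu,\lambda}}, \]
which is much weaker and follows again from (135).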
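The parameter $\nu$ is chosen so that the packets move away from their initial support by the time $\lambda^{-\alpha}$. For this we impose the condition
\[ \nu T(\lambda,\nu) < \lambda^{1-\alpha-\varepsilon} \]
for some $\varepsilon > 0$. This is satisfied if we choose $\nu$ of the form
\[ \nu = \lambda^{1-\frac{\alpha}{4}-\varepsilon}, \]
in which case we have
\[ \nu T(\lambda,\nu) = \lambda^{-\frac{3\alpha}{4}-2\varepsilon}(\log\lambda)^{-1}. \]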
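The bookkeeping behind the choice of $\nu$ can be redone directly from the definition $T(\lambda,\nu) = \nu\lambda^{-1-\frac{\alpha}{2}}(\log\lambda)^{-1}$ of Lemma 55 b). The following is our own computation under that definition, offered as a sketch for comparison with the display above; it separates the exponent of $\nu T(\lambda,\nu)$ from that of $T(\lambda,\nu)$ itself, the latter being the one that determines $\beta$.

```latex
% Spatial scale of the packets, with \nu = \lambda^{1-\alpha/4-\varepsilon}:
\[
\nu\,T(\lambda,\nu) = \nu^2\lambda^{-1-\frac{\alpha}{2}}(\log\lambda)^{-1}
= \lambda^{2-\frac{\alpha}{2}-2\varepsilon-1-\frac{\alpha}{2}}(\log\lambda)^{-1}
= \lambda^{1-\alpha-2\varepsilon}(\log\lambda)^{-1}
< \lambda^{1-\alpha-\varepsilon}.
\]
% Time scale of the packets:
\[
T(\lambda,\nu) = \nu\,\lambda^{-1-\frac{\alpha}{2}}(\log\lambda)^{-1}
= \lambda^{-\frac{3\alpha}{4}-\varepsilon}(\log\lambda)^{-1}.
\]
% For large \lambda this exceeds \lambda^{-\beta} whenever \beta > 3\alpha/4 + \varepsilon,
% so letting \varepsilon -> 0 reaches every \beta > 3\alpha/4. The lower bound on \nu
% in Lemma 55 b), \nu > \lambda^{1-\alpha/2}\log\lambda, holds for large \lambda
% as soon as \varepsilon < \alpha/4.
```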
This leads to the choice of $\beta$ in the proposition. Indeed, the proof of the proposition is concluded due to the following lemma:

Lemma 57. Let $\varepsilon > 0$. Choose $\nu$ so that $\nu T(\lambda,\nu) < \lambda^{1-\alpha-\varepsilon}$ and $\nu < \lambda^{1-\varepsilon}$. Then (131)–(134) hold for $|I| < T(\lambda,\nu)$.

Proof. By the previous lemma we can replace the $(A_{\ll\lambda},\lambda)$ flow by the $(A_{<\nu},\lambda)$ flow in (131)–(134). We begin with (131). Without any restriction in generality we can assume that $u$ and $v$ are $U^2_{A,\nu,\lambda} L^2$ atoms. The generalized wave packet decomposition in Lemma 56 easily extends to $U^2_{A,\nu,\lambda} L^2$ atoms. Indeed, if we denote by $S_\nu(t,s)$ the evolution generated by (124), then an atom $u_\lambda$ of the form
\[ u_\lambda(t) = \sum_k 1_{[t_k,t_{k+1})}\, S_\nu(t,t_k)\,u^k_\lambda \]
can be partitioned as
\[ u = \sum_{x_0,\ |\xi_0|\approx\lambda} u_{x_0,\xi_0}, \tag{142} \]
where
\[ u_{x_0,\xi_0}(t) = \sum_k 1_{[t_k,t_{k+1})}\, S_\nu(t,t_k)\,u^k_{x_0,\xi_0} \tag{143} \]
with $u^k_{x_0,\xi_0} = \phi^{\nu}_{x_0+2t_k\xi_0,\xi_0}(x,D)\,u^k_\lambda$. Then, denoting
\[ |||u_{x_0,\xi_0}|||^2 = \sum_k \sum_{|\alpha|\le N} \|L^{\alpha}_{x_0,\xi_0}(t_k)\,u^k_{x_0,\xi_0}\|^2_{L^2}, \]
we have the orthogonality relation
\[ \sum_{x_0,\ |\xi_0|\approx\lambda} |||u_{x_0,\xi_0}|||^2 \lesssim \sum_k \|u^k\|^2_{L^2}. \tag{144} \]
We argue in a similar manner for $v$. Hence it suffices to take $u_\lambda$ as in (142), (143), and similarly for $v_\lambda$, and prove that for $|I| \le T(\lambda,\nu)$ we have
\[ \Big| \int_I\int_{\mathbb{R}^3} B_\mu u_\lambda \bar v_\lambda\,dx\,dt \Big| \lesssim_A \mathrm{RHS}(131)\cdot \|B_\mu\|_{U^2_W L^2} \Big( \sum_{x_0,\ |\xi_0|\approx\lambda} |||u_{x_0,\xi_0}|||^2 \Big)^{\frac12} \Big( \sum_{x_0,\ |\xi_0|\approx\lambda} |||v_{x_0,\xi_0}|||^2 \Big)^{\frac12}. \tag{145} \]
We proceed with several reductions, which will eventually lead to shorter time intervals.

1. Reduction to spatial scale $\lambda T(\lambda,\nu)$. Heuristically, in time $T(\lambda,\nu)$ the frequency $\lambda$ waves for (124) travel by $\lambda T(\lambda,\nu)$. Hence we partition the space into cubes $\{Q_j\}_{j\in\mathbb{Z}^3}$ of size $\lambda T(\lambda,\nu)$. Correspondingly, we decompose $u_\lambda$ into
\[ u_\lambda = \sum_{j\in\mathbb{Z}^3} u_j, \qquad u_j = \sum_{x_0\in Q_j} u_{x_0,\xi_0}, \]
and similarly for $v$. By (138) the functions $u_j$ decay rapidly away from an enlargement of $Q_j$. Precisely, if $x_0 \in Q_j$ and $|j-k| \ge 10$ then the separation between the tube $T^{\nu}_{x_0,\xi_0}$ and $Q_k$ is $O(|j-k|\lambda T(\lambda,\nu))$. Comparing this with the tube thickness $\nu T(\lambda,\nu)$ we have
\[ |u_j(t,x)|^2 \lesssim \lambda^{-N}|j-k|^{-N} \sum_{x_0\in Q_j,\ |\xi_0|\approx\lambda} |||u_{x_0,\xi_0}|||^2, \qquad x \in Q_k,\quad |j-k| \ge 10. \]
Thus in (145) it suffices to consider the output of $u_j$ and $v_{j_1}$ for $|j-j_1| < 20$, and only within an enlargement $CQ_j$ of $Q_j$; the rest is trivially estimated using the above bound. Furthermore, by Cauchy-Schwarz it suffices to consider a fixed $j$ and $k$. By a slight abuse of notation we set $k = j$ in the sequel. Then (145) is reduced to
\[ \Big| \int_I\int_{\mathbb{R}^3} \chi_{CQ_j} B_\mu u_j \bar v_j\,dx\,dt \Big| \lesssim_A \mathrm{RHS}(131)\cdot \|B_\mu\|_{U^2_W L^2} \Big( \sum_{x_0\in Q_j,\ |\xi_0|\approx\lambda} |||u_{x_0,\xi_0}|||^2 \Big)^{\frac12} \Big( \sum_{x_0\in Q_j,\ |\xi_0|\approx\lambda} |||v_{x_0,\xi_0}|||^2 \Big)^{\frac12}. \tag{146} \]
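2. Reduction to small angles. Here we partition the $\lambda$ annulus $A_\lambda$ in frequency into small angles of size $\frac{1}{10}$, with centers $\theta$ in a finite subset of $S^2$,
\[ A_\lambda = \bigcup_{\theta} A_{\lambda,\theta}. \]
Then we divide
\[ u_j = \sum_{\theta} u_{j,\theta}, \qquad u_{j,\theta} = \sum_{x_0\in Q_j,\ \xi_0\in A_{\lambda,\theta}} u_{x_0,\xi_0}, \]
and similarly for $v_j$. It remains to prove that
\[ \Big| \int_I\int_{\mathbb{R}^3} \chi_{CQ_j} B_\mu u_{j,\theta} \bar v_{j,\omega}\,dx\,dt \Big| \lesssim_A \mathrm{RHS}(131)\cdot \|B_\mu\|_{U^2_W L^2} \Big( \sum_{x_0\in Q_j,\ \xi_0\in A_{\lambda,\theta}} |||u_{x_0,\xi_0}|||^2 \Big)^{\frac12} \Big( \sum_{x_0\in Q_j,\ \xi_0\in A_{\lambda,\omega}} |||v_{x_0,\xi_0}|||^2 \Big)^{\frac12}. \tag{147} \]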
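3. Reduction to a spatial strip of size $\lambda^{\varepsilon}\nu T(\lambda,\nu)$. Given directions $\theta$ and $\omega$ as above we choose a coordinate, say $\xi_1$, so that both the $\theta$ and the $\omega$ sectors $A_\theta$, $A_\omega$ are away from $\xi_1 = 0$. Dividing the spatial coordinates $x = (x_1,x')$ we partition the space into strips $S_k$ of thickness $\lambda^{\varepsilon}\nu T(\lambda,\nu)$ in the $x_1$ direction. Arguing as in (13), the $A$ factor is square summable with respect to the strips $S_k$. There are about $\lambda^{1-\varepsilon}\nu^{-1}$ such strips which intersect $30Q_j$. Hence by losing a $\lambda^{\frac{1-\varepsilon}{2}}\nu^{-\frac12}$ factor we can use Hölder’s inequality to reduce the problem to the case when $A$ is supported in a single spatial strip $S_k$. It remains to prove that
\[ \Big| \int_I\int_{\mathbb{R}^3} \chi_{S_k\cap CQ_j} B_\mu u_{j,\theta} \bar v_{j,\omega}\,dx\,dt \Big| \lesssim_A \mathrm{RHS}(131)\cdot \lambda^{-\frac{1-\varepsilon}{2}}\nu^{\frac12}\,\|B_\mu\|_{U^2_W L^2} \Big( \sum_{x_0\in Q_j,\ \xi_0\in A_{\lambda,\theta}} |||u_{x_0,\xi_0}|||^2 \Big)^{\frac12} \Big( \sum_{x_0\in Q_j,\ \xi_0\in A_{\lambda,\omega}} |||v_{x_0,\xi_0}|||^2 \Big)^{\frac12}. \tag{148} \]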
4. Reduction to spatial scale $\lambda^{\varepsilon}\nu T(\lambda,\nu)$. Due to the choice of coordinates above, the packets in $u_{j,\theta}$ and $v_{j,\omega}$ travel in directions which are transversal to $S_k$. Hence if we partition $S_k$ into cubes $\tilde Q_l$ of size $\lambda^{\varepsilon}\nu T(\lambda,\nu)$, each packet will intersect only finitely many cubes. Then we partition further
\[ u_{j,\theta} = \sum_l u_{j,\theta,l}, \qquad u_{j,\theta,l} = \sum_{(x_0,\xi_0)\in A_{j,\theta,l}} u_{x_0,\xi_0}, \]
where
\[ A_{j,\theta,l} = \{ (x_0,\xi_0);\ x_0\in Q_j,\ \xi_0\in A_{\lambda,\theta},\ \{x_0+\mathbb{R}\xi_0\}\cap\tilde Q_l \ne \emptyset \} \]
and packets $u_{x_0,\xi_0}$ intersecting more than one cube are arbitrarily placed in one of the terms. Arguing as in the first reduction, for $|l-l_1| \gg 1$ the size of $u_{j,\theta,l}$ in $\tilde Q_{l_1}$ is rapidly decreasing,
\[ |u_{j,\theta,l}(t,x)|^2 \lesssim \lambda^{-N}|l-l_1|^{-N} \sum_{(x_0,\xi_0)\in A_{j,\theta,l}} |||u_{x_0,\xi_0}|||^2, \qquad x\in\tilde Q_{l_1},\quad |l-l_1| \gg 1. \]
Hence the joint contribution of $u_{j,\theta,l}$ and $v_{j,\omega,l_1}$ in (148) is nontrivial only on the diagonal $|l-l_1| \lesssim 1$. By Cauchy-Schwarz with respect to $l$ it suffices to estimate this contribution for fixed $l$, $l_1$, in an enlarged cube $C\tilde Q_l$. By a slight abuse of notation we set $l = l_1$. Then we need to show that
\[ \Big| \int_I\int_{\mathbb{R}^3} \chi_{C\tilde Q_l} B_\mu u_{j,\theta,l} \bar v_{j,\omega,l}\,dx\,dt \Big| \lesssim_A \mathrm{RHS}(131)\cdot \lambda^{-\frac{1-\varepsilon}{2}}\nu^{\frac12}\,\|B_\mu\|_{U^2_W L^2} \Big( \sum_{(x_0,\xi_0)\in A_{j,\theta,l}} |||u_{x_0,\xi_0}|||^2 \Big)^{\frac12} \Big( \sum_{(x_0,\xi_0)\in A_{j,\omega,l}} |||v_{x_0,\xi_0}|||^2 \Big)^{\frac12}. \tag{149} \]
5. Reduction to a time interval of size $\lambda^{\varepsilon-1}\nu T(\lambda,\nu)$. Each tube $T_{x_0,\xi_0}$ intersects the cube $\tilde Q_l$ in a time interval of size $\lambda^{\varepsilon-1}\nu T(\lambda,\nu)$. Hence it is natural to partition the time interval $I$ into subintervals $I_m$ of length $\lambda^{\varepsilon-1}\nu T(\lambda,\nu)$. Correspondingly, we split $u_{j,\theta,l}$ into
\[ u_{j,\theta,l} = \sum_m u_{j,\theta,l,m}, \qquad u_{j,\theta,l,m} = \sum_{(x_0,\xi_0)\in A_{j,\theta,l,m}} u_{x_0,\xi_0}, \]
where
\[ A_{j,\theta,l,m} = \{ (x_0,\xi_0);\ x_0\in Q_j,\ \xi_0\in A_{\lambda,\theta},\ T^{\nu}_{x_0,\xi_0}\cap (I_m\times\tilde Q_l) \ne \emptyset \}, \]
and similarly for $v_{j,\omega,l}$. Again, the size of $u_{j,\theta,l,m}$ in $I_{m_1}\times\tilde Q_l$ is negligible if $|m-m_1| \gg 1$. Thus by Cauchy-Schwarz with respect to $m$ the estimate (149) reduces to the case of a single interval $I_m$,
\[ \Big| \int_{I_m}\int_{\mathbb{R}^3} \chi_{C\tilde Q_l} B_\mu u_{j,\theta,l,m} \bar v_{j,\omega,l,m}\,dx\,dt \Big| \lesssim_A \mathrm{RHS}(131)\,\lambda^{-\frac{1-\varepsilon}{2}}\nu^{\frac12}\,\|B_\mu\|_{U^2_W L^2} \Big( \sum_{(x_0,\xi_0)\in A_{j,\theta,l,m}} |||u_{x_0,\xi_0}|||^2 \Big)^{\frac12} \Big( \sum_{(x_0,\xi_0)\in A_{j,\omega,l,m}} |||v_{x_0,\xi_0}|||^2 \Big)^{\frac12}. \tag{150} \]
But this follows from the hypothesis since
\[ |I_m| = \lambda^{\varepsilon-1}\nu T(\lambda,\nu) \le \lambda^{-\alpha} \]
and
\[ |I|^{\frac12}\lambda^{-\frac{1-\varepsilon}{2}}\nu^{\frac12} = |I_m|^{\frac12}. \]
In the case of (132) the argument is similar but with several adjustments which we outline.

1. Reduction to spatial scale $\max\{\mu,\lambda^{\varepsilon}\nu\}T(\lambda,\nu)$. This smaller initial localization scale is possible since frequency $\mu$ Schrödinger waves travel with speed $\mu$, so within time $T(\lambda,\nu)$ they can spread only as far as $\mu T(\lambda,\nu)$. Thus for the $u_\mu$ factor we have square summability on the $\mu T(\lambda,\nu)$ scale. For the wave factor $B_\lambda$ by (13) we have square summability on the same scale, therefore we are allowed to localize spatially the estimate on the $\mu T(\lambda,\nu)$ scale. If $\mu$ is small we can only take partial advantage of this due to the wider spread of frequency $\lambda$ packets.

2. Reduction to small angles. This is as before.

3. Reduction to a spatial strip of size $\lambda^{\varepsilon}\nu T(\lambda,\nu)$. The difference here is that we have only about $\max\{1,\mu\lambda^{-\varepsilon}\nu^{-1}\}$ strips intersecting a $\max\{\mu,\lambda^{\varepsilon}\nu\}T(\lambda,\nu)$ cube, therefore we only lose a factor of
\[ \max\{1,\ \mu^{\frac12}\lambda^{-\frac{\varepsilon}{2}}\nu^{-\frac12}\}. \]
4. Reduction to spatial scale $\lambda^{\varepsilon}\nu T(\lambda,\nu)$. This is as before.

5. Reduction to a time interval of size $\lambda^{\varepsilon-1}\nu T(\lambda,\nu)$. As before, the frequency $\lambda$ wave packets spend a time $\lambda^{\varepsilon-1}\nu T(\lambda,\nu)$ inside a $\lambda^{\varepsilon}\nu T(\lambda,\nu)$ cube $Q$. However, the frequency $\mu$ packets spend a longer time $\mu^{-1}\lambda^{\varepsilon}\nu T(\lambda,\nu)$ inside $Q$. Hence we can carry out first a lossless reduction down to time scale $\min\{1,\mu^{-1}\lambda^{\varepsilon}\nu\}T(\lambda,\nu)$. To further reduce the time scale to $\lambda^{\varepsilon-1}\nu T(\lambda,\nu)$ we can only use the square summability for the frequency $\lambda$ waves, therefore we apply Cauchy-Schwarz and lose an additional factor of
\[ \min\{1,\ \mu^{-1}\lambda^{\varepsilon}\nu\}^{\frac12}\,(\lambda^{\varepsilon-1}\nu)^{-\frac12}. \]
Finally, combining the two losses in Steps 3 and 5 we obtain a total loss of
\[ (\lambda^{\varepsilon-1}\nu)^{-\frac12}, \]
which is identical to the one in Step 3 of the proof of (131). We conclude as before.

The argument is considerably simpler in the case of (133) and (134). There each packet intersects $Q\times I$ in a time interval which is shorter than $\lambda^{-1}\nu T(\lambda,\nu) < \lambda^{-\alpha}$. Grouping the wave packets with respect to such time intervals we obtain the square summability of the outputs and reduce the problem to the shorter time scale $\lambda^{-\alpha}$.
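The combination of the losses in Steps 3 and 5 of the argument for (132), and the time-scale identity used to close the argument for (131), are both one-line computations. In the sketch below the shorthand $x$ is ours, and we read the time-scale identity with $|I| = T(\lambda,\nu)$.

```latex
% Shorthand: x = \mu\lambda^{-\varepsilon}\nu^{-1}.
% Step 3 loses \max\{1,x\}^{1/2}; Step 5 loses \min\{1,x^{-1}\}^{1/2}(\lambda^{\varepsilon-1}\nu)^{-1/2}.
\[
\max\{1,x\}\cdot\min\{1,x^{-1}\} =
\begin{cases}
x\cdot x^{-1} = 1, & x \ge 1,\\
1\cdot 1 = 1, & x < 1,
\end{cases}
\qquad\Longrightarrow\qquad
\text{total loss} = (\lambda^{\varepsilon-1}\nu)^{-\frac12}
= \lambda^{\frac{1-\varepsilon}{2}}\nu^{-\frac12}.
\]
% Time scales in Step 5 of the argument for (131), with |I| = T(\lambda,\nu):
\[
|I_m|^{\frac12} = \big(\lambda^{\varepsilon-1}\nu\,T(\lambda,\nu)\big)^{\frac12}
= |I|^{\frac12}\,\lambda^{-\frac{1-\varepsilon}{2}}\,\nu^{\frac12},
\qquad
|I_m| < \lambda^{\varepsilon-1}\cdot\lambda^{1-\alpha-\varepsilon} = \lambda^{-\alpha}.
\]
```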
References
1. Bejenaru, I., Tataru, D.: Large data local solutions for the derivative NLS equation. JEMS 10(4), 957–985 (2008)
2. Guo, Y., Nakamitsu, K., Strauss, W.: Global finite-energy solutions of the Maxwell-Schrödinger system. Commun. Math. Phys. 170(1), 181–196 (1995)
3. Hadac, M., Herr, S., Koch, H.: Well-posedness and scattering for the KP-II equation in a critical space. Ann. Inst. H. Poincaré (C) Nonlinear Anal., doi:10.1016/j.anitipc.2008.09.002
4. Keel, M., Tao, T.: Endpoint Strichartz estimates. Amer. J. Math. 120(5), 955–980 (1998)
5. Koch, H., Smith, H.F., Tataru, D.: Subcritical L^p bounds on spectral clusters for Lipschitz metrics. Math. Res. Lett. 14(1), 77–85 (2007)
6. Koch, H., Tataru, D.: A-priori bounds for the 1-d cubic NLS in negative Sobolev spaces. IMRN 16, Art ID rnm 053, 36pp. (2007)
7. Koch, H., Tataru, D.: Dispersive estimates for principally normal pseudodifferential operators. Comm. Pure Appl. Math. 58(2), 217–284 (2005)
8. Marzuola, J., Metcalfe, J., Tataru, D.: Wave packet parametrices for evolutions governed by pdo’s with rough symbols. Proc. AMS 136(2), 597–604 (2008)
9. Nakamura, M., Wada, T.: Local well-posedness for the Maxwell-Schrödinger equation. Math. Ann. 332(3), 565–604 (2005)
10. Tataru, D.: Strichartz estimates for second order hyperbolic operators with nonsmooth coefficients. III. J. Amer. Math. Soc. 15(2), 419–442 (electronic) (2002)
11. Tataru, D.: Phase space transforms and microlocal analysis. In: Phase Space Analysis of Partial Differential Equations. Vol. II, Pubbl. Cent. Ric. Mat. Ennio Giorgi, pp. 505–524. Pisa: Scuola Norm. Sup. 2004
12. Tsutsumi, Y.: Global existence and asymptotic behavior of solutions for the Maxwell-Schrödinger equations in three space dimensions. Comm. Math. Phys. 151(3), 543–576 (1993)

Communicated by P. Constantin
Commun. Math. Phys. 288, 199–224 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0770-z
Communications in
Mathematical Physics
Energy Multipliers for Perturbations of the Schwarzschild Metric Serge Alinhac Department des Mathematiques, Université Paris-Sud, 91405 Orsay, France. E-mail:
[email protected] Received: 25 January 2008 / Accepted: 22 December 2008 Published online: 12 March 2009 – © Springer-Verlag 2009
Abstract: We consider the wave equations associated to metrics close to the Schwarzschild metric. We investigate spacelike energy multipliers likely to yield local decay of solutions to these wave equations, in the spirit of Morawetz. For rotationally invariant metrics, we obtain multipliers giving a control of the solutions having finitely many vanishing spherical harmonics. The structure of these multipliers is closely related to the photosphere of the metric. For Kerr metrics, in contrast, we display a region, which we call the intersphere region, where no energy inequality with the required properties can exist.
1. Introduction

In this paper, we consider two types of Lorentzian metrics close to the Schwarzschild metric: i) Rotationally invariant metrics, ii) Kerr metrics. The rotationally invariant metrics are of course not solutions of the Einstein vacuum equations, but they naturally appear in rotationally invariant coupled problems, so that it may be important to study their properties. A good example of this approach would be Christodoulou’s work [3]. The Kerr metrics for small parameter $a$ are typical examples of what we are likely to run into when testing the stability of Schwarzschild metrics. For a Lorentzian metric $g$ close to the Schwarzschild metric, we consider the associated wave operator $\Box_g = \Box$. Our interest here is to explore the interior decay of smooth solutions $\psi$ of $\Box\psi = 0$. By “interior decay”, we mean a property which would correspond to the decay far inside the light cone $\{r = t\}$ for the standard $\Box$ corresponding to the Minkowski metric. More precisely, in the case of Schwarzschild, using the usual coordinates $(t, r, \theta, \phi)$, we are considering a region
$2m < r_0 \le r \le r_1$, that is, away from the black hole and away from the flat infinity $r = +\infty$. In the case of metrics close to the Minkowski metric, typical tools to prove interior decay are Morawetz type inequalities, starting with the original paper by Morawetz [10] (see also [1] for perturbed metrics). These inequalities are obtained by the classical energy inequality method, using a multiplier field $X$ of the form $f(r)\partial_r$. For the Schwarzschild metric, Dafermos and Rodnianski [4] obtain, using multipliers $X_l = f_l(r^*)\partial_{r^*}$ ($r^*$ being the Regge-Wheeler coordinate), energy inequalities for each spherical harmonic $l$. They find that $f_l$ has to vanish on the photon sphere $\{r^* = 0\}$ (corresponding to $\{r = 3m\}$). This photon sphere is characterized by the fact that null bicharacteristics starting tangentially to it must stay on it: they are trapped. Such a fact speaks strongly against local energy decay in a region close to $\{r = 3m\}$. In fact, the technique of energy inequalities leads in this case to a rather weak control of the derivatives of $\psi$ (in particular, no control of $\nabla\psi$ for $r = 3m$), yielding finally only a control of $\psi$ itself; in other words, one has lost one derivative, a bad omen if one is eventually willing to prove nonlinear stability. In this paper, we try to obtain, using energy methods, a control of the solution in a region close to $\{r = 3m\}$, for metrics close to the Schwarzschild metric. The choice of the multiplier approach, that is, integrating by parts the expression $(\Box\psi)(X\psi)$ in some domain $D$, may seem naive. Decay results have been obtained using special features of Kerr metrics (see the lectures by Dafermos and Rodnianski [6] for instance). However, it seems to us that the only robust approach likely to yield results for general metrics, when no symmetries, decomposition in spherical harmonics, etc. are present, is the multiplier approach. Use of microlocal analysis seems also excluded, because of its probable high cost in derivatives of $g$ (see however the results of Tataru and Tohaneanu [11]). We are thus looking for fields $X$ such that
\[ (\Box\psi)(X\psi) = I + \mathrm{div}(\dots), \]
with $I \ge 0$ (up to the addition of some multiple of $|\nabla\psi|^2$). We call such fields nonnegative for $g$. In case i), it turns out that close enough metrics also have a photon sphere (say $r = 3m + f(t)$ for some small $f$). We prove that a nonnegative field $X$ must vanish on this photon sphere, and this allows us to actually display a nonnegative field, working simultaneously for all spherical harmonics $l \ge l_0$. We believe that this improvement of [4] is likely to be more flexible in handling nonlinear problems. For Kerr metrics, the situation is much worse: there exists a continuous family of partial photon spheres, accumulating to the photon sphere; there exists a region close to it where any reasonable nonnegative field can only give an identically vanishing quadratic expression of $\nabla\psi$. We call this region the intersphere. In other words, the energy method cannot yield local decay in this interior region. The existence of this region is analogous to the existence of the ergosphere
\[ r_+ = m + (m^2 - a^2)^{1/2} \le r \le m + (m^2 - a^2\cos^2\theta)^{1/2}, \]
where the Killing field $K_a$ becomes spacelike (see [6]).
Energy Multipliers for Perturbations of the Schwarzschild Metric
2. Notations and Basic Facts

2.1. The geometric energy formalism. We recall here, for the benefit of the reader, some well-known facts.

a. Let g be a Lorentzian metric on R^4, for which we write g(X, Y) = <X, Y>. In a given frame e_α (1 ≤ α ≤ 4), its components are g_{αβ} = <e_α, e_β>, and g^{αβ} is the inverse matrix of g_{αβ}, g^{αβ} g_{βγ} = δ^α_γ. The associated metric connection will be denoted by D, with the usual properties

X<Y, Z> = <D_X Y, Z> + <Y, D_X Z>,  D_X Y − D_Y X = [X, Y].

For any field X, its deformation tensor ^Xπ = π is the symmetric 2-tensor defined by

π(Y, Z) = <D_Y X, Z> + <D_Z X, Y> ≡ L_X g,

the Lie derivative of the metric g. The gradient ∇ψ of a given function ψ is the dual of dψ defined by dψ(X) ≡ Xψ = <∇ψ, X>. Associated to a given function ψ, we define the symmetric 2-tensor Q by

Q(X, Y) = (Xψ)(Yψ) − (1/2)<X, Y>|∇ψ|^2,  |∇ψ|^2 = <∇ψ, ∇ψ> = (∇ψ)(ψ).

The tensor Q is called the energy momentum tensor. In the sequel, we will consider the d'Alembertian associated with the metric g, □ψ = div ∇ψ, which can be expressed in local coordinates by

□ψ = |g|^{-1/2} ∂_α(g^{αβ} |g|^{1/2} ∂_β ψ),  |g| = |det g|.

b. The energy formalism is entirely contained in the formula

div P = (□ψ)(Xψ) + (1/2) Q^{αβ} ^Xπ_{αβ},  P_α = Q_{αβ} X^β.

The last term in the right-hand side is a quadratic form in the first order derivatives of ψ, the coefficients of which depend on first order derivatives of the components of the multiplier field X. If we integrate this formula in a domain D with boundary ∂D, the divergence term will give us

∫_D (div P) dV = ∫_{∂D} P(n) dv = ∫_{∂D} Q(X, n) dv,
n being the outgoing unit normal to ∂D. If X and n are both timelike, the energy density Q(X, n) is non-negative. In this paper, we will disregard the sign of this energy density, assuming that the boundary terms can be controlled using another energy inequality. We will concentrate instead on obtaining fields X for which the interior term Q^{αβ} ^Xπ_{αβ} ≡ Q ^Xπ has a definite sign and, if possible, forms a positive definite quadratic form in ∇ψ. This is typically the case in the classical Morawetz inequality for the standard wave equation, using the multiplier X = ∂_r (see for instance [1,9]).
S. Alinhac
c. In the sequel, we will work using a null frame, that is, a frame of vectors (e1, e2, e3, e4) such that the subspace generated by (e1, e2) is orthogonal (for g) to the subspace generated by (e3, e4), and

<e1, e1> = 1, <e2, e2> = 1, <e1, e2> = 0, <e3, e3> = 0, <e4, e4> = 0, <e3, e4> = −2µ ≠ 0

for some function µ. In such a frame,

∇ψ = e1(ψ)e1 + e2(ψ)e2 − (2µ)^{-1}(e4(ψ)e3 + e3(ψ)e4),  |∇ψ|^2 = e1(ψ)^2 + e2(ψ)^2 − µ^{-1} e3(ψ)e4(ψ),

and, writing for simplicity e_α(ψ) = e_α, we obtain for the components of Q,

Q_11 = (1/2)(e1^2 − e2^2) + (2µ)^{-1} e3 e4, Q_12 = e1 e2, Q_22 = −(1/2)(e1^2 − e2^2) + (2µ)^{-1} e3 e4,
Q_13 = e1 e3, Q_14 = e1 e4, Q_23 = e2 e3, Q_24 = e2 e4, Q_33 = e3^2, Q_34 = µ(e1^2 + e2^2), Q_44 = e4^2.

To compute the double trace Q^{αβ} π_{αβ}, we use the null frame e_α and its dual frame (e1, e2, −(2µ)^{-1} e4, −(2µ)^{-1} e3). We obtain for any symmetric 2-tensor π the formula

Qπ = Q^{αβ} π_{αβ} = Q_11 π_11 + 2π_12 e1 e2 − (1/µ)π_14 e1 e3 − (1/µ)π_13 e1 e4 + π_22 Q_22 − (1/µ)π_24 e2 e3 − (1/µ)π_23 e2 e4 + (1/(4µ^2))π_44 e3^2 + (1/(4µ^2))π_33 e4^2 + (1/(2µ))π_34 (e1^2 + e2^2)
 = (1/2)(e1^2 − e2^2)(π_11 − π_22) + (1/(2µ))π_34 (e1^2 + e2^2) + 2π_12 e1 e2 − (1/µ)[π_14 e1 e3 + π_13 e1 e4 + π_24 e2 e3 + π_23 e2 e4] + (1/(4µ^2))[π_44 e3^2 + π_33 e4^2 + 2µ(π_11 + π_22) e3 e4].

d. We also recall the following integration formula: for any functions R and ψ, and any domain D with boundary ∂D,

∫_D R|∇ψ|^2 dV = (1/2) ∫_D ψ^2 (□R) dV − (1/2) ∫_{∂D} (n^α ∂_α R) ψ^2 dv + ∫_{∂D} Rψ (n^α ∂_α ψ) dv.
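The closed form for the double trace in point c can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the frame metric <e1,e1> = <e2,e2> = 1, <e3,e4> = −2µ, treats the values e_α(ψ) and the components π_αβ as arbitrary numbers, and compares the index-raising computation with the final grouped formula. The function names are hypothetical and chosen only for this illustration.

```python
import random


def double_trace_direct(e, pi, mu):
    """Q^{ab} pi_{ab}, raising indices with the inverse null-frame metric."""
    c = -2.0 * mu  # <e3, e4> = -2 mu
    g = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, c], [0.0, 0.0, c, 0.0]]
    g_inv = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 1.0 / c], [0.0, 0.0, 1.0 / c, 0.0]]
    idx = range(4)
    # |grad psi|^2 = g^{ab} e_a(psi) e_b(psi)
    grad_sq = sum(g_inv[a][b] * e[a] * e[b] for a in idx for b in idx)
    # Q_{ab} = e_a e_b - (1/2) g_{ab} |grad psi|^2
    q = [[e[a] * e[b] - 0.5 * g[a][b] * grad_sq for b in idx] for a in idx]
    return sum(g_inv[a][k] * g_inv[b][d] * q[k][d] * pi[a][b]
               for a in idx for b in idx for k in idx for d in idx)


def double_trace_formula(e, pi, mu):
    """Grouped closed form of Q pi from point c (indices 1..4 as in the text)."""
    e1, e2, e3, e4 = e

    def p(i, j):
        return pi[i - 1][j - 1]

    return (0.5 * (e1 ** 2 - e2 ** 2) * (p(1, 1) - p(2, 2))
            + p(3, 4) * (e1 ** 2 + e2 ** 2) / (2.0 * mu)
            + 2.0 * p(1, 2) * e1 * e2
            - (p(1, 4) * e1 * e3 + p(1, 3) * e1 * e4
               + p(2, 4) * e2 * e3 + p(2, 3) * e2 * e4) / mu
            + (p(4, 4) * e3 ** 2 + p(3, 3) * e4 ** 2
               + 2.0 * mu * (p(1, 1) + p(2, 2)) * e3 * e4) / (4.0 * mu ** 2))


def random_symmetric(rng):
    """Random symmetric 4x4 matrix standing in for a deformation tensor."""
    pi = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(a, 4):
            pi[a][b] = pi[b][a] = rng.uniform(-1.0, 1.0)
    return pi


if __name__ == "__main__":
    rng = random.Random(0)
    e = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    pi = random_symmetric(rng)
    print(abs(double_trace_direct(e, pi, 0.7) - double_trace_formula(e, pi, 0.7)))
```

The printed difference should be at the level of rounding error; a discrepancy would point to a sign or index slip in one of the cross terms.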
2.2. Non-negative fields. Taking into account the last identity from point 2.1.d above, we introduce the following definition:

Definition. A field X is said to be non-negative if, for some function R, the quantity Ĩ = Q^{αβ} ^Xπ_{αβ} + R|∇ψ|^2 is non-negative.

The following proposition shows that this property depends only on X and the conformal class of g.
Proposition. If X is non-negative for g, it is non-negative for e^λ g.

Proof. Set g̃ = e^λ g, that is g̃_{αβ} = e^λ g_{αβ}. Then

g̃^{αβ} = e^{−λ} g^{αβ},  g̃(∇̃ψ, ∇̃ψ) = g̃^{αβ}(∂_α ψ)(∂_β ψ) = e^{−λ}|∇ψ|^2,

and consequently Q̃_{αβ} = Q_{αβ}. Now

^Xπ̃ = L_X g̃ = e^λ [L_X g + (Xλ) g],

hence finally

Q̃ ^Xπ̃ + R̃ g̃(∇̃ψ, ∇̃ψ) = e^λ [Q ^Xπ + (Xλ) tr Q + R̃ e^{−2λ}|∇ψ|^2] = e^λ [Q ^Xπ + R|∇ψ|^2 + |∇ψ|^2 (−R + R̃ e^{−2λ} − Xλ)].

It is enough to choose R̃ = e^{2λ}(Xλ + R).

It is important to remark that a non-negative multiplier X does not necessarily yield an energy inequality: first, after integrating by parts the additional term ∫_D R|∇ψ|^2 dV, one gets a new interior term

∫_D [Ĩ − (1/2)ψ^2 □R] dV.

To obtain from this a control of ψ, one has to use a Poincaré type inequality to balance the control of the terms in ∇ψ contained in Ĩ (if any) with the ψ^2 term. Second, we obtain on ∂D the usual energy terms, plus the new terms

∫_{∂D} [Rψ n(ψ) − (1/2)ψ^2 n(R)] dv,

and it is not clear whether the sum of these terms gives a positive generalized energy. This situation is typically illustrated by the Morawetz or the conformal inequalities [8,10]. Another example is to be found in this paper for the Schwarzschild metric, see Sects. 3.5, 3.6.

3. Rotationally Invariant Metrics

3.1. General facts. a. We will deal with spacetimes homeomorphic to R^2 × S^2; here S^2 is the standard unit sphere in R^3, with longitude and colatitude coordinates φ and θ: the standard metric on S^2 is thus dσ^2 = dθ^2 + (sin θ)^2 dφ^2. We assume that there are coordinates u and v such that the metric of our spacetime is

g = 4Ω^2 du dv + f^2 dσ^2,
where Ω ≥ 0 and f ≥ 0 are smooth functions of (u, v) only (this is our definition of rotationally invariant metrics). For convenience, we set

u = (1/2)(−t + r*), v = (1/2)(t + r*), t = v − u, r* = v + u, ∂_u = −∂_t + ∂_{r*}, ∂_v = ∂_t + ∂_{r*},

and assume that in the region where we work, t ≥ 0. The wave operator associated to g is

□φ = D_α ∇^α φ = |g|^{-1/2} ∂_α(g^{αβ} |g|^{1/2} ∂_β φ).

The basic examples we have in mind are of course

1. The Minkowski case, where r* is the usual r of spatial polar coordinates, Ω = 1, f = r, and the wave operator is □φ = ∂^2_{uv} φ + (2/r)∂_r φ + r^{-2} Δ_{S^2} φ.

2. The Schwarzschild case, where, r being again the usual |x| of spatial polar coordinates, we have set r* = r + 2m log(r − 2m) − 3m − 2m log m. This is the Regge-Wheeler coordinate, normalized in such a way that r* = 0 for r = 3m (see below "photon sphere"). Then

Ω^2 = 1 − 2m/r, f = r, □φ = −(1 − 2m/r)^{-1} (∂^2_t φ − r^{-2} ∂_{r*}(r^2 ∂_{r*} φ)) + r^{-2} Δ_{S^2} φ.
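The normalization of the Regge-Wheeler coordinate can be checked directly. The sketch below is illustrative only: it evaluates r* at r = 3m and compares a finite-difference value of dr*/dr with the standard relation dr*/dr = (1 − 2m/r)^{-1}, i.e. the reciprocal of 1 − 2m/r. The function names and the step size are assumptions made for this example.

```python
import math


def regge_wheeler(r, m):
    """r* = r + 2m log(r - 2m) - 3m - 2m log m, defined for r > 2m."""
    return r + 2.0 * m * math.log(r - 2.0 * m) - 3.0 * m - 2.0 * m * math.log(m)


def omega_squared(r, m):
    """Conformal factor of the (u, v) part of the metric: 1 - 2m/r."""
    return 1.0 - 2.0 * m / r


def drstar_dr(r, m, step=1e-6):
    """Central finite-difference approximation of dr*/dr."""
    return (regge_wheeler(r + step, m) - regge_wheeler(r - step, m)) / (2.0 * step)


if __name__ == "__main__":
    m = 1.0
    # r* vanishes on the photon sphere r = 3m.
    print("r*(3m) =", regge_wheeler(3.0 * m, m))
    # (dr*/dr) * (1 - 2m/r) should be close to 1 for every r > 2m.
    for r in (2.5, 3.0, 10.0):
        print(r, drstar_dr(r, m) * omega_squared(r, m))
```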
b. An orthonormal frame tangent to the sphere at a given point is e1 = (1/f)∂_θ, e2 = (1/(f sin θ))∂_φ. We will use the null frame (∂_u, ∂_v, e1, e2). We have of course

g_uu = 0, g_vv = 0, g_uv = 2Ω^2, g^uu = 0, g^vv = 0, g^uv = (2Ω^2)^{-1}.

The volume element of the full space is dV = 2Ω^2 f^2 du dv dσ_{S^2}, while the volume element on a slab Σ_T = {t = T} is dv = Ω f^2 dr* dσ_{S^2}. For a given function ψ, its gradient ∇ψ, with components ∂^α ψ = g^{αβ} ∂_β ψ, is given by

∇ψ = (1/(2Ω^2))(∂_v ψ ∂_u + ∂_u ψ ∂_v) + e_a(ψ) e_a.

In particular, denoting derivatives by subscripts,

|∇ψ|^2 = ψ^α ψ_α = ∂^α ψ ∂_α ψ = Ω^{-2} ∂_u ψ ∂_v ψ + |∇̸ψ|^2,

where as usual ∇̸ψ = e_a(ψ) e_a is the angular component of the gradient. We compute now the components of the connection. We have first

<D_u ∂_u, ∂_u> = 0,
<D_u ∂_u, ∂_v> = <D_u(2Ω^2 ∇v), ∂_v> = 4Ω ∂_u Ω <∇v, ∂_v> + 2Ω^2 <D_v ∇v, ∂_u> = 4Ω ∂_u Ω,
<D_u ∂_u, e_a> = <D_u(2Ω^2 ∇v), e_a> = 2Ω^2 <D_a ∇v, ∂_u> = 0,

hence finally

D_u ∂_u = 2(∂_u Ω/Ω) ∂_u,  D_v ∂_v = 2(∂_v Ω/Ω) ∂_v.
Similarly,

<D_u ∂_v, ∂_v> = 0, <D_u ∂_v, ∂_u> = <D_v ∂_u, ∂_u> = 0,
<D_u ∂_v, e_a> = <D_u(2Ω^2 ∇u), e_a> = 2Ω^2 <D_a ∇u, ∂_u> = <D_a ∂_v, ∂_u> = 0,

since the left-hand side is symmetric in u, v and the right-hand side antisymmetric. Hence D_u ∂_v = D_v ∂_u = 0. Now, with the above choices,

[∂_u, e_a] = −(∂_u f/f) e_a,
<D_u e1, e2> = <[∂_u, e1], e2> + <D_1 ∂_u, e2> = <D_2 ∂_u, e1> = <D_u e2, e1> = −<D_u e1, e2>,

which gives <D_u e1, e2> = 0. Since

<D_u e_a, ∂_u> = −<e_a, D_u ∂_u> = 0, <D_u e_a, ∂_v> = −<e_a, D_u ∂_v> = 0,

we obtain finally D_u e_a = D_v e_a = 0. From the above formula, we get

D_a ∂_u = D_u e_a + [e_a, ∂_u] = (∂_u f/f) e_a,  D_a ∂_v = (∂_v f/f) e_a.

Hence

<D_a e_a, ∂_u> = −<e_a, D_a ∂_u> = −∂_u f/f,  <D_a e_a, ∂_v> = −∂_v f/f.

The values of <D_a e_b, e_c> are given by the induced connection, <D_a e_b, e_c> = <D̸_a e_b, e_c>.

3.2. Energy identity for X = a∂_u + b∂_v.

Proposition 3.2. Let X = a∂_u + b∂_v, where the coefficients a and b are C^1 functions, and depend only on (u, v). Then

I = Q^{αβ} ^Xπ_{αβ} = Ω^{-2} [∂_v a (∂_u φ)^2 + ∂_u b (∂_v φ)^2 − 2(Xf/f) ∂_u φ ∂_v φ] − |∇̸φ|^2 (∂_u a + ∂_v b + 2XΩ/Ω).

Proof. 1. We first compute the deformation tensor π = ^1π of X = ∂_u. From the above formula, we have

π_uu = π_vv = 0, π_uv = 4Ω ∂_u Ω, π_aa = 2<D_a ∂_u, e_a> = 2∂_u f/f, π_12 = 0, π_ua = π_va = 0.

In particular, tr ^1π = 4(∂_u Ω/Ω + ∂_u f/f). We have a similar formula for the deformation tensor ^2π of ∂_v. Finally,

^Xπ_{αβ} = a ^1π_{αβ} + b ^2π_{αβ} + g_{uα} ∂_β a + g_{uβ} ∂_α a + g_{vα} ∂_β b + g_{vβ} ∂_α b,

tr ^Xπ = 4a(∂_u Ω/Ω + ∂_u f/f) + 4b(∂_v Ω/Ω + ∂_v f/f) + 2(∂_u a + ∂_v b).
2. We thus obtain

I = Q^{αβ} π_{αβ} = −(1/2)|∇φ|^2 tr π + ∂_α φ ∂_β φ (a (^1π)^{αβ} + b (^2π)^{αβ}) + 2∇a(φ) ∂_u φ + 2∇b(φ) ∂_v φ.

Now

(^1π)^{αβ} ∂_α φ ∂_β φ = 2(∂_u Ω/Ω^3) ∂_u φ ∂_v φ + 2(∂_u f/f)|∇̸φ|^2,

and similarly for ^2π. Hence

I = −|∇φ|^2 (∂_u a + ∂_v b + 2XΩ/Ω + 2Xf/f) + Ω^{-2}(∂_u a ∂_v φ + ∂_v a ∂_u φ) ∂_u φ + Ω^{-2}(∂_u b ∂_v φ + ∂_v b ∂_u φ) ∂_v φ + 2Ω^{-3} ∂_u φ ∂_v φ XΩ + 2(Xf/f)|∇̸φ|^2.

This gives the formula.
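The last step of this proof is pure algebra: substituting |∇φ|^2 = Ω^{-2} ∂_u φ ∂_v φ + |∇̸φ|^2 into the long expression must reproduce the statement of Proposition 3.2. A minimal numerical sketch of that step follows; every derivative is replaced by an arbitrary number, and the argument names (a_u for ∂_u a, x_omega for XΩ, x_f_over_f for Xf/f, ang for |∇̸φ|^2) are hypothetical labels introduced here.

```python
import random


def interior_long(a_u, a_v, b_u, b_v, omega, x_omega, x_f_over_f, phi_u, phi_v, ang):
    """Expression for I obtained at the end of the proof, before simplification."""
    grad_sq = phi_u * phi_v / omega ** 2 + ang  # |grad phi|^2
    return (-grad_sq * (a_u + b_v + 2.0 * x_omega / omega + 2.0 * x_f_over_f)
            + (a_u * phi_v + a_v * phi_u) * phi_u / omega ** 2
            + (b_u * phi_v + b_v * phi_u) * phi_v / omega ** 2
            + 2.0 * phi_u * phi_v * x_omega / omega ** 3
            + 2.0 * x_f_over_f * ang)


def interior_short(a_u, a_v, b_u, b_v, omega, x_omega, x_f_over_f, phi_u, phi_v, ang):
    """Expression for I as stated in Proposition 3.2."""
    return ((a_v * phi_u ** 2 + b_u * phi_v ** 2
             - 2.0 * x_f_over_f * phi_u * phi_v) / omega ** 2
            - ang * (a_u + b_v + 2.0 * x_omega / omega))


def random_arguments(rng):
    """Arbitrary sample values; omega is kept away from zero, ang is >= 0."""
    values = [rng.uniform(-1.0, 1.0) for _ in range(10)]
    values[4] = rng.uniform(0.3, 2.0)  # omega
    values[9] = rng.uniform(0.0, 1.0)  # |angular gradient|^2
    return values


if __name__ == "__main__":
    rng = random.Random(1)
    args = random_arguments(rng)
    print(abs(interior_long(*args) - interior_short(*args)))
```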
3.3. The photon sphere. For the Schwarzschild metric, the function h̄ = 2 log(Ω/f) depends only on r*, and h̄'(0) = 0. Thus, a null bicharacteristic starting tangentially from Σ = {r* = 0} stays on Σ. For a general metric g, let h = 2 log(Ω/f). Consider a null bicharacteristic for □ (ξ and η being the dual variables of u and v)

u̇ = η, v̇ = ξ, φ̇ = ..., θ̇ = ..., ξ̇ = (∂_u h)ξη, η̇ = (∂_v h)ξη.

For an arbitrary function ψ(u, v), let ψ0(s) = ψ(u(s), v(s)) be ψ on this curve. Then

ψ̇0 = ∂_u ψ η + ∂_v ψ ξ,  (d^2/ds^2)ψ0 = ∂^2_u ψ η^2 + ∂^2_v ψ ξ^2 + ξη(2∂^2_{uv} ψ + ∂_u ψ ∂_v h + ∂_v ψ ∂_u h).

If we take for instance ψ = u, we obtain (d^2/ds^2)ψ0 = ∂_v h ξ ψ̇0, hence a null bicharacteristic starting tangentially to ψ = C will stay on ψ = C: we know that u satisfies the eikonal equation, and all level surfaces u = C are characteristic surfaces. The same applies to v. Assume now ∂_u ψ ≠ 0, ∂_v ψ ≠ 0. We write then

(d^2/ds^2)ψ0 = [(∂^2_{uu} ψ/∂_u ψ)η + (∂^2_{vv} ψ/∂_v ψ)ξ] ψ̇0 + ξη A

with

A = 2∂^2_{uv} ψ − (∂_v ψ/∂_u ψ) ∂^2_{uu} ψ − (∂_u ψ/∂_v ψ) ∂^2_{vv} ψ + ∂_u ψ ∂_v h + ∂_v ψ ∂_u h.

If ψ is such that A vanishes on ψ = 0, we deduce that a null bicharacteristic starting tangentially from ψ = 0 will stay on ψ = 0. The typical example is ψ = u + v for the Schwarzschild metric, since then A = ∂_u h̄ + ∂_v h̄ vanishes on u + v = 0. If we take simply ψ = v − k(u), the above vanishing condition reads

k''/k' = (∂_u h − k' ∂_v h)(u, k(u)).

Definition. A surface of equation v = k(u) for which k satisfies the above differential equation is called a photon sphere for g (recall h = 2 log(Ω/f)).

The following proposition gives a rough sufficient condition for such a k to exist.
Proposition 3.3. Let µ ≥ 0, ε0 > 0 be given. Define the space C̄^2_µ to be the space of C^2 functions H(r*, t), defined in the strip [−ε0, ε0] × R, for which the norm

||H|| = ||H||_∞ + ||∂_t H||_∞ + ||<t>^µ ∂_{r*} H||_∞ + ||H''||_∞

is finite. There exists c0 > 0 such that, if ||h − h̄|| ≤ c0, there exists k = k(u) ∈ C^2, solution of the differential equation

k''/k' = (∂_u h − k' ∂_v h)(u, k(u)),

for which

||(k + u)<u>^µ||_∞ + ||(k' + 1)<u>^µ||_∞ + ||k''<u>^µ||_∞ < ∞.

Moreover, k + u is a C^1 function of h − h̄ vanishing when h = h̄.

Proof. 1. This is a simple application of the implicit function theorem. Let C^2_µ be the space of C^2 functions K of u alone such that

||K<u>^µ||_∞ + ||K'<u>^µ||_∞ + ||K''<u>^µ||_∞ < ∞.

The differential equation for k = −u + K and h = h̄ + H can be written

K''/(K' − 1) = (2 − K')(∂_{r*} h̄ + ∂_{r*} H)(K, K − 2u) − K' ∂_t H(K, K − 2u).

Set

F(H, K) = K'' + (1 − K')(2 − K')(h̄'(K) + ∂_{r*} H(K, K − 2u)) − K'(1 − K') ∂_t H(K, K − 2u).

We have F(0, 0) = 0, and F is a C^1 function defined on a neighbourhood of (0, 0) in C̄^2_µ × C^2_µ, with values in C_µ, the space of continuous functions z of u with weight <u>^µ and norm ||z<u>^µ||_∞. The differential F' of the mapping K → F(0, K) is given by

y → F'(y) = y'' + 2h̄''(0) y.

To prove the proposition, it is sufficient to prove that F' is an isomorphism between C^2_µ and C_µ. Since h̄ = log(1/r^2 − 2m/r^3), dr/dr* = 1 − 2m/r, we have

h̄' = −(2/r^2)(r − 3m),  h̄''(0) = −2/(27m^2) < 0.

2. Setting 2h̄''(0) = −α^2, we have to study the equation y'' − α^2 y = z. The only solution bounded at infinity is

y(u) = (1/2α) [∫_∞^u z(s) e^{−α(s−u)} ds − ∫_{−∞}^u z(s) e^{−α(u−s)} ds].

Let us consider first

|∫_u^∞ z(s) e^{−α(s−u)} ds| ≤ C ∫_0^∞ <u + σ>^{−µ} e^{−ασ} dσ:

for u ≥ 0, it is bounded by C<u>^{−µ}. If u ≤ 0, we separate in ∫_0^{−u/2} and ∫_{−u/2}^∞. The first integral is smaller than C<u>^{−µ}, while the second is exponentially decreasing. Hence this first term in y belongs to C_µ, and similarly for the second term and for y'. This concludes the proof.
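The two Schwarzschild values used in this proof, h̄' = −(2/r^2)(r − 3m) and h̄''(0) = −2/(27m^2), can be confirmed numerically. The sketch below is an illustration under the assumption that primes denote d/dr* = (1 − 2m/r) d/dr; it uses central finite differences in r, with hypothetical helper names and step size.

```python
import math


def hbar(r, m):
    """hbar = 2 log(Omega/f) = log(1/r^2 - 2m/r^3) for Schwarzschild."""
    return math.log(1.0 / r ** 2 - 2.0 * m / r ** 3)


def d_dr_star(func, r, m, step=1e-4):
    """d/dr* = (1 - 2m/r) d/dr, with d/dr taken by a central difference."""
    return (1.0 - 2.0 * m / r) * (func(r + step) - func(r - step)) / (2.0 * step)


def hbar_prime(r, m):
    """Closed form quoted in the proof: -(2/r^2)(r - 3m)."""
    return -2.0 * (r - 3.0 * m) / r ** 2


def hbar_second_at_photon_sphere(m):
    """Finite-difference value of hbar'' at r = 3m, i.e. at r* = 0."""
    return d_dr_star(lambda r: hbar_prime(r, m), 3.0 * m, m)


if __name__ == "__main__":
    m = 1.0
    print(d_dr_star(lambda r: hbar(r, m), 4.0, m), hbar_prime(4.0, m))
    second = hbar_second_at_photon_sphere(m)
    print(second, -2.0 / (27.0 * m ** 2))
    # alpha is defined by 2 hbar''(0) = -alpha^2.
    print("alpha =", math.sqrt(-2.0 * second))
```

With these conventions α^2 = 4/(27m^2), which is the decay rate of the exponential kernel in the bounded solution of y'' − α^2 y = z.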
3.4. Necessary conditions for non-negative multipliers.

Theorem 3.4. Let g be a rotationally invariant metric close enough to the Schwarzschild metric to have a photon sphere Σ, and X = a(u, v)∂_u + b(u, v)∂_v be a smooth spacelike field close to 2∂_{r*} (that is, a and b close to 1). If there exists a function ν (with dν ≠ 0) such that νX is non-negative for g, then necessarily X is orthogonal to the photon sphere Σ of g and ν vanishes on Σ.

Proof. 1. The interior terms I = Q^{αβ} ^{νX}π_{αβ} are given by Proposition 3.2. Set

y = (a/b) ∂_u(bν), z = (b/a) ∂_v(aν).

If νX is non-negative, necessarily ∂_v(aν) ≥ 0, ∂_u(bν) ≥ 0, hence y ≥ 0, z ≥ 0. The coefficient of −|∇̸φ|^2 in I is

∂_u(aν) + ∂_v(bν) + 2νXΩ/Ω = y + z + νA,

where we defined

A = ∂_u a + ∂_v b − (a/b)∂_u b − (b/a)∂_v a + 2XΩ/Ω.

Assume that, for some R,

Ĩ = I + R(Ω^{-2} ∂_u φ ∂_v φ + |∇̸φ|^2) = Ω^{-2} [∂_v(aν)(∂_u φ)^2 + ∂_u(bν)(∂_v φ)^2 − (∂_u φ)(∂_v φ)(2νXf/f − R)] − |∇̸φ|^2 (y + z + νA − R)

is non-negative. Then, necessarily,

R̃ = −(y + z + νA − R) ≥ 0,  (R̃ + y + z + νB)^2 ≤ 4yz,

where, recalling h = 2 log(Ω/f),

B = A − 2Xf/f = ∂_u a + ∂_v b − (a/b)∂_u b − (b/a)∂_v a + Xh.

The last inequality can be written

R̃^2 + (y − z)^2 + ν^2 B^2 + 2R̃(y + z) + 2νB(R̃ + y + z) ≤ 0.

This implies in particular νB ≤ 0. Now, for a = b = 1 and h = h̄, corresponding to the Schwarzschild metric, the quantity B vanishes simply on r* = 0. If we assume a and b close to one and h close to the value h̄ for the Schwarzschild metric, B will vanish simply on a surface S = {(u, v), v = k(u)}, for k close to −u. Now, on S, we must have R̃ = 0, y = z, and, from νB ≤ 0, we also get ν = 0. From y = z and ν = 0, we deduce a∂_u ν = b∂_v ν on S. Since dν ≠ 0, ν = 0 is an equation of S, and X is orthogonal to S, that is, b = −ak' on S. Then, with Z = ∂_u + k'∂_v tangent to S, B = 0 can be written

B = Za + (1/k')Zb + Xh = Za + (1/k')(−k'Za − ak'') + Xh = −ak''/k' + Xh = −a(k''/k' − ∂_u h + k'∂_v h),

hence k satisfies the differential equation of the photon sphere

k''/k' = ∂_u h(u, k(u)) − k' ∂_v h(u, k(u)).

This concludes the proof of the theorem.
3.5. Sufficient conditions I. Let us consider now a rotationally invariant metric g close enough to the Schwarzschild metric to have a photon sphere Σ, according to Proposition 3.3. Let v = k(u) be an equation of Σ, such that k'(u) < 0, and take X = ∂_u − k'(u)∂_v.

Proposition 3.5. Recall h = 2 log(Ω/f) and set B = −k''(u)/k'(u) + Xh. Assume the metric g such that, for some B̄ < 0,

B = (v − k(u)) B̄.

Let W and β be two functions of the real variable S = v − k(u) (still to be chosen), with W(0) = 0, W' > 0. Set

ν = (−1/k'(u)) W(v − k(u)),  R = 2[W'(v − k(u)) + νXf/f],

which implies νB ≤ 0. Then

∫_D Q ^{(νX)}π dV ≥ ∫_{∂D} e dv + ∫_D [Cφ^2 − νB|∇̸φ|^2] dV,

with

C = k'Ω^{-2} [W''' + 4βW'' + 4W'(β^2 + β')] + W □((1/k')Xf/f).

The boundary terms are

∫_{∂D} e dv = ∫_{∂D} [(1/2) n^α ∂_α R] φ^2 dv − ∫_{∂D} Rφ (n^α ∂_α φ) dv.

Proof. 1. We keep here the notation of the proof of Theorem 3.4. We first arrange to have y ≡ z: this means

∂_u ν + k'∂_v ν = (−k''/k')ν.

For any W, ν = (−1/k')W(v − k(u)) is a solution of this equation. With this choice, y = z = W'. We choose now R̃ = −νB, which corresponds to R = 2(y + νXf/f). At this stage, we have

I + R|∇φ|^2 = Ĩ = −Ω^{-2} (W'/k')(Xφ)^2 − νB|∇̸φ|^2.

We transform the term R(φ^α φ_α) according to formula 2.1.d to obtain

∫_D −R(φ^α φ_α) dV = ∫_{∂D} ... dv − ∫_D φ^2 □(y + νXf/f) dV.

Now, for a function w(u, v), we have

□w = Ω^{-2} [∂^2_{uv} w + (1/f)(∂_u f ∂_v w + ∂_v f ∂_u w)].
Hence

□y = Ω^{-2} (−k'W''' + W''Xf/f),
□(νXf/f) = Ω^{-2} [W''Xf/f − W'X((1/k')Xf/f) − (W'/k')(Xf/f)^2] − W □((1/k')Xf/f).

2. Following [4], we use now a Poincaré formula to produce "good" terms in φ^2. In fact, with the above choices, we are left with the non-angular interior terms

(−1/k')y(∂_u φ)^2 − k'y(∂_v φ)^2 + 2y(∂_u φ)(∂_v φ) = (−y/k')[∂_u φ − k'∂_v φ]^2.

Let us introduce now, instead of the coordinates (u, v), the coordinates S = v − k(u), T = v + k(u). Then X = −2k'∂_S,

∫_D (−y/k') Ω^{-2} (Xφ)^2 dV = 4 ∫_D y f^2 (∂_S φ)^2 dS dT dσ_{S^2}.

Keeping T and the variable on the sphere constant, we will use in the variable S the following Poincaré inequality (see [4]), where β̃ is arbitrary and small at infinity,

∫ (∂_S φ)^2 A dS ≥ ∫ φ^2 [(Aβ̃)' − β̃^2 A] dS.

Here, A = 4yf^2. Hence we obtain

∫_D (−y/k') Ω^{-2} (Xφ)^2 dV ≥ 2 ∫_D (φ^2/Ω^2) [−2β̃k'W'' + y(Xβ̃ + 2β̃Xf/f + 2k'β̃^2)] dV.

Putting together all terms, we get for the interior terms the lower bound

φ^2 W □((1/k')Xf/f) + φ^2 Ω^{-2} [k'W''' − 2W''(Xf/f + 2β̃k') + 2W'X(β̃ + (1/2k')Xf/f) + 4k'W'((1/4k'^2)(Xf/f)^2 + (β̃/k')Xf/f + β̃^2)].

Choosing β̃ = −(1/2k')Xf/f − β(S), these terms are

φ^2 W □((1/k')Xf/f) + φ^2 Ω^{-2} k' [W''' + 4βW'' + 4W'(β^2 + β')].

3.6. Sufficient conditions II. Using Proposition 3.5, we can now construct the actual multiplier νX, that is, choose the functions W and β. Following [4], we will denote by φ_l the part of φ which is an l-spherical harmonic. We keep here the notation of Sect. 3.5.

Theorem 3.6. Let E = □((1/k')Xf/f). Assume that, for some C0 > 0, we have
i) f^2 |E| ≤ C0 |B̄| everywhere,
ii) Ω^2 (−B̄/f^2 + |E|) ≤ C0 for |v − k(u)| ≤ C0^{-1},
iii) Ω^2 |B̄| ≥ C0^{-1} f^2 for |v − k(u)| ≤ 3.
Then there exist l0(C0) ∈ N, W and β and c > 0 such that, for all φ with φ_l = 0 for l ≤ l0,

∫_D Q^{αβ} ^{(νX)}π_{αβ} dV ≥ ∫_D [Cφ^2 − νB|∇̸φ|^2] dV ≥ c ∫_D <r*>^{−3−0} φ^2 dV.

Proof. 1. We want first to obtain (1/4)a'' + βa' + a(β^2 + β') < 0, with a = W'. To this aim, we set β = −a'/a (with a still to be chosen), and get

(1/4)a'' + βa' + a(β^2 + β') = −[(3/4)a'' − a'^2/a].

Remark that the sign of this quantity is scale invariant, which justifies the normalization made in the following lemma.

Lemma. For any small η > 0, there exists an even C^2 function a, with a > 0, a(0) = 1, a ∈ L^1 and
i) For x ≥ 1, (3/4)a'' − a'^2/a > 0,
ii) For 0 ≤ x ≤ 1, (3/4)a'' − a'^2/a = O(η),
iii) ∫_1^2 a(x) dx ≥ (1 + 3η)^{-1}.

Proof of the lemma. For x ≥ 1, we just take a(x) = (1 + ηx^µ)^{-1}, with µ = (2η + 3)(η + 3)^{-1}. We get

(3/4)a'' − a'^2/a = ηµ x^{µ−2} (4(1 + ηx^µ)^3)^{-1} [(3 − µ)ηx^µ − 3(µ − 1)] ≥ ηµ x^{µ−2} (4(1 + ηx^µ)^3)^{-1} η/2.

We set now

a1(x) = a(1) + (x − 1)a'(1) + (1/2)(x − 1)^2 a''(1),

and, for 0 ≤ x ≤ 1, a(x) = χ(x) + (1 − χ(x))a1(x), where χ is smooth decreasing between zero and one, being one close to zero and zero close to one. We have then

a'(x) = χ'(x)(1 − a1) + (1 − χ(x))a1' = O(η), a ≥ 1 + O(η), a''(x) = O(η).

This finishes the proof of the lemma.

2. We fix now η small enough and choose

W(S) = ∫_0^S ā(x) dx, ā(x) = a(x − 2), S = v − k(u).

For l ≥ l0, we have, with E = □((1/k')Xf/f),

∫_D [Cφ^2 − νB|∇̸φ|^2] dV ≥ ∫_D [C − νB l0(l0 + 1) f^{-2}] φ^2 dV = ∫_D {4(−k')Ω^{-2} [(3/4)ā'' − ā'^2/ā] + W((−1/k')(−B̄)S l0(l0 + 1) f^{-2} + E)} φ^2 dV.
Thanks to the assumptions of the theorem, there is c1 > 0 big enough such that, for |S| ≥ c1/l0^2,

S[(−1/k')(−B̄)S l0(l0 + 1) f^{-2} + E] > 0.

Thus the integrand for these values of S and |S − 2| ≥ 1 is positive, by i) of the lemma. 3. For |S| ≤ c1/l0^2, we have |W| = O(l0^{-2}), and (3/4)ā''
− ā'^2/ā > 0 by i) of the lemma, hence once again the integrand is positive. We are left with the zone |S − 2| ≤ 1: but there W ≥ (1 + 3η)^{-1} by iii) of the lemma while, by ii), the first term in the integrand is small. This completes the proof.

Remarks. i) For the Schwarzschild metric, 2/(3r^2) ≤ −B̄ = (2/r^2)((r − 3m)/r*) ≤ 2/r^2, while E = O(r^{-4}), hence the assumptions of the theorem are satisfied. ii) It is possible that positive neglected terms in the right-hand side of the inequality turn out to be useful.

4. Kerr Metrics

4.1. Frame and connection coefficients. We use the usual spherical coordinates (r, φ, θ) in R^3, x1 = r sin θ cos φ, x2 = r sin θ sin φ, x3 = r cos θ. The Kerr metric is

ds^2 = −((Δ − a^2 sin^2 θ)/ρ^2) dt^2 − 4amr (sin^2 θ/ρ^2) dt dφ + (ρ^2/Δ) dr^2 + ([.]/ρ^2) sin^2 θ dφ^2 + ρ^2 dθ^2,

with

ρ^2 = r^2 + a^2 cos^2 θ, Δ = r^2 + a^2 − 2mr, [.] = (r^2 + a^2)^2 − a^2 Δ sin^2 θ.

a. Following [2,10], we use the null vectors

l = ((r^2 + a^2)/Δ)∂_t + (a/Δ)∂_φ + ∂_r,  n = ((r^2 + a^2)/(2ρ^2))∂_t + (a/(2ρ^2))∂_φ − (Δ/(2ρ^2))∂_r,

for which <l, n> = −1. We set

e3 = (Δ/(r^2 + a^2)) l = X + Y, e4 = (2ρ^2/(r^2 + a^2)) n = X − Y, X = ∂_t + (a/(r^2 + a^2))∂_φ, Y = (Δ/(r^2 + a^2))∂_r.

We have then <e3, e4> = −2µ, µ = Δρ^2/(r^2 + a^2)^2. The orthogonal space of (e3, e4) is generated by ∂_θ and ∂_φ + a sin^2 θ ∂_t. In the sequel, for simplicity, we omit the variable θ in sin = sin θ, etc. We have

<∂_θ, ∂_θ> = ρ^2,  <∂_φ + a sin^2 ∂_t, ∂_φ + a sin^2 ∂_t> = ρ^2 sin^2.
Hence we take e1 = −1/2 ∂θ , e2 = (1/ 1/2 sin θ )(∂φ + a sin2 ∂t ). Remark that < e1 , e2 >= 0, hence (e1 , e2 ) form an orthonormal system. We compute now the brackets of the vectors of our null frame (e1 , e2 , e3 , e4 ). We obtain first [e1 , e2 ] = − cos( − a 2 sin2 )/( 2 sin2 )∂φ + a cos(r 2 + a 2 )/ 2 ∂t . Since we also have 1/2 sin e2 = ∂φ + a sin2 ∂t , (1/2)(e3 + e4 ) = a/(r 2 + a 2 )∂φ + ∂t , we get ∂t = (r 2 + a 2 )/(2)(e3 + e4 ) − a sin −1/2 e2 , ∂φ = −a(r 2 + a 2 ) sin2/(2)(e3 + e4 ) + (r 2 + a 2 ) sin −1/2 e2 . Finally [e1 , e2 ] = cos(r 2 + a 2 ) −3/2 [a −1/2 (e3 + e4 ) − (1/ sin)e2 ]. We also have [e1 , e3 ] = r /((r 2 + a 2 ))e1 , [e1 , e4 ] = −r /((r 2 + a 2 ))e1 , [e2 , e3 ] = r /((r 2 + a 2 ))e2 , [e2 , e4 ] = −r /((r 2 + a 2 ))e2 . Now [e3 , e4 ] = −2[X, Y ] = −4ar /(r 2 + a 2 )3 ∂φ , [e3 , e4 ] = 4ar sin /((r 2 + a 2 )2 )[(a/2) sin(e3 + e4 ) − 1/2 e2 ]. b. Using the properties of the metric connection D, in particular the torsion free character D X Y − DY X = [X, Y ], we can compute now all the coefficients for the frame (e1 , e2 , e3 , e4 ): < D1 e1 , e1 >= 0, < D1 e1 , e2 >= − < e1 , [e1 , e2 ] >= 0, < D1 e1 , e3 >= − < e1 , [e1 , e3 ] >= −r /((r 2 + a 2 )), < D1 e1 , e4 >= r /((r 2 + a 2 )), < D2 e1 , e1 >= 0, < D2 e1 , e2 >= − < [e1 , e2 ], e2 >= (r 2 + a 2 ) −3/2 cos/sin . For j = 3, 4, we have 2 < D j e1 , e2 >=< [e j , e1 ], e2 > − < e j , [e1 , e2 ] > + < [e2 , e j ], e1 >,
hence < < < <
D2 e1 , e3 >= − < e1 , D3 e2 >= a cos /((r 2 + a 2 )), D2 e1 , e4 >=< D4 e1 , e2 >= a cos /((r 2 + a 2 )), D3 e1 , e1 >= 0, < D3 e1 , e2 >=< D2 e1 , e3 >= a cos /((r 2 + a 2 )), D3 e1 , e3 >= 0, < D3 e1 , e4 >= (1/2)D1 < e3 , e4 > = 2a 2 cos sin /( 1/2 (r 2 + a 2 )2 ), < D4 e1 , e1 >= 0, < D4 e1 , e2 >= a cos /((r 2 + a 2 )), < D4 e1 , e3 >=< D3 e1 , e4 >= 2a 2 cos sin /( 1/2 (r 2 + a 2 )2 ).
We start similar computations with e2 : < D1 e2 , e1 >=< [e1 , e2 ], e1 >= 0, < D1 e2 , e2 >= 0, < D1 e2 , e3 >=< [e1 , e2 ], e3 > + < D2 e1 , e3 >= −a cos /((r 2 + a 2 )) =< D1 e2 , e4 >, < D2 e2 , e1 >= −(r 2 + a 2 ) −3/2 cos/sin, < D2 e2 , e2 >= 0, < D2 e2 , e3 >= −r /((r 2 + a 2 )), < D2 e2 , e4 >= r /((r 2 + a 2 )), < D3 e2 , e1 >= −a cos /((r 2 + a 2 )), < D3 e2 , e2 >= 0 =< D3 e2 , e3 >, < D3 e2 , e4 >= −(1/2) < [e3 , e4 ], e2 >= 2ar sin /( 1/2 (r 2 + a 2 )2 ), < D4 e2 , e1 >= −a cos /((r 2 + a 2 )), < D4 e2 , e2 >= 0, < D4 e2 , e3 >=< e2 , [e3 , e4 ] > + < D3 e2 , e4 >= −2ar sin /( 1/2 (r 2 + a 2 )2 ), < D4 e2 , e4 >= 0. By symmetry, the other terms are obtained: < < < < < < < < < < < < < < < < <
D1 e3 , e1 D1 e3 , e3 D2 e3 , e1 D2 e3 , e3 D3 e3 , e1 D3 e3 , e4 D4 e3 , e1 D4 e3 , e2 D4 e3 , e3 D1 e4 , e1 D1 e4 , e3 D2 e4 , e1 D2 e4 , e3 D3 e4 , e1 D3 e4 , e2 D3 e4 , e3 D4 e4 , e1
>= r /((r 2 + a 2 )), < D1 e3 , e2 >= a cos /((r 2 + a 2 )), >= 0, < D1 e3 , e4 >= 2a 2 cos sin /( 1/2 (r 2 + a 2 )2 ), >= −a cos /((r 2 + a 2 )), < D2 e3 , e2 >= r /((r 2 + a 2 )), >= 0, < D2 e3 , e4 >= 2ar sin /( 1/2 (r 2 + a 2 )2 ), >= 0, < D3 e3 , e2 >= 0, < D3 e3 , e3 >= 0, >= D3 < e3 , e4 > − < e3 , [e3 , e4 ] >= 4m(a 2 − r 2 )/(r 2 + a 2 )4 , >= −2a 2 cos sin /( 1/2 (r 2 + a 2 )2 ), >= 2ar sin /( 1/2 (r 2 + a 2 )2 ), >= 0, < D4 e3 , e4 >= 4a 2 r 2 sin2 /(r 2 + a 2 )4 , >= −r /((r 2 + a 2 )), < D1 e4 , e2 >= a cos /((r 2 + a 2 )), >= 2a 2 cos sin /( 1/2 (r 2 + a 2 )2 ), < D1 e4 , e4 >= 0, >= −a cos /((r 2 + a 2 )), < D2 e4 , e2 >= −r /((r 2 + a 2 )), >= −2ar sin /( 1/2 (r 2 + a 2 )2 ), < D2 e4 , e4 >= 0, >= −2a 2 cos sin /( 1/2 (r 2 + a 2 )2 ), >= −2ar sin /( 1/2 (r 2 + a 2 )2 ), >= −4a 2 r 2 sin2 /(r 2 + a 2 )4 , < D3 e4 , e4 >= 0, >=< D4 e4 , e2 >= 0, < D4 e4 , e3 >= − < D3 e3 , e4 >
= −4m(a 2 − r 2 )/(r 2 + a 2 )4 , < D4 e4 , e4 >= 0. 4.2. Deformation tensors. We are now in a position to compute the deformation tensors of all fields. Denote i π = ei π . From the definition and the previous computations, we get easily the components of π : a. 1 π : π11 = 0, π12 = 0, π13 = −r /((r 2 + a 2 )), π14 = r /((r 2 + a 2 )), π22 = 2(r 2 + a 2 ) −3/2 cos/sin, π23 = 2a cos/((r 2 + a 2 )) = π24 , π33 = 0, π34 = 4a 2 cos sin/( 1/2 (r 2 + a 2 )2 ), π44 = 0. b. 2 π : π11 = 0, π12 = −(r 2 + a 2 ) −3/2 cos/sin, π13 = −2a cos/((r 2 + a 2 )) = π14 , π22 = 0, π23 = −r /((r 2 + a 2 )) = −π24 , π33 = 0, π34 = 0, π44 = 0. c. 3 π : π11 = 2r /((r 2 + a 2 )), π12 = 0, π13 = 0, π14 = 0, π22 = 2r /((r 2 + a 2 )), π23 = 0, π24 = 4ar sin /( 1/2 (r 2 + a 2 )2 ), π33 = 0, π34 = 4m(a 2 − r 2 )/(r 2 + a 2 )4 , π44 = 8a 2 r 2 sin2 /(r 2 + a 2 )4 . d. 4 π : π11 = −2r /((r 2 + a 2 )), π12 = 0, π13 = 0, π14 = 0, π22 = −2r /((r 2 + a 2 )), π23 = −4ar sin /( 1/2 (r 2 + a 2 )2 ), π24 = 0, π33 = −8a 2 r 2 sin2 /(r 2 + a 2 )4 , π34 = −4m(a 2 − r 2 )/(r 2 + a 2 )4 , π44 = 0. Taking into account the actual components of the tensors i π for the Kerr metric, we can finally compute their traces against Q. We obtain a. Q 1 π = −[cos/( 3/2 sin)](r 2 + a 2 − 2a 2 sin2 )e12 + [cos /( 3/2 sin)](r 2 + a 2 + 2a 2 sin2 )e22 − r (r 2 + a 2 ) −2 e1 (e3 − e4 ) − 2a(r 2 + a 2 ) cos −2 e2 (e3 + e4 ) + (r 2 + a 2 )3 cos /( 5/2 sin)e3 e4 . b. Similarly Q 2 π = −2(cos / sin)(r 2 + a 2 ) −3/2 e1 e2 + 2a(r 2 + a 2 ) −2 cos e1 (e3 + e4 ) −r (r 2 + a 2 ) −2 e2 (e3 − e4 ). c. Q 3 π = 2m[(a 2 − r 2 )/(r 2 + a 2 )2 ](e12 + e22 ) − 4ar sin −3/2 e2 e3 + 2a 2 r sin2 −2 e32 + 2r (r 2 + a 2 ) −2 e3 e4 . d. Q 4 π = −2m[(a 2 − r 2 )/(r 2 + a 2 )2 ](e12 + e22 ) + 4ar sin −3/2 e2 e4 −2r (r 2 + a 2 ) −2 e3 e4 − 2a 2 r sin2 −2 e42 .
4.3. Photon sphere. The principal symbol of the wave equation for the Kerr metric is, up to a nonvanishing factor, p, where

p = −[(r^2 + a^2)^2 − a^2 Δ sin^2 θ]τ^2 + Δ^2 R^2 + ((Δ − a^2 sin^2 θ)/sin^2 θ) φ̃^2 + ΔΘ^2 − 4amr τφ̃.

The order of the variables is (t, r, φ, θ), with dual variables (τ, R, φ̃, Θ). For a null bicharacteristic,

ṙ = 2RΔ^2, τ̇ = 0, ṫ = −2τ[(r^2 + a^2)^2 − a^2 Δ sin^2 θ] − 4amr φ̃,
−Ṙ = τ^2 (−4r(r^2 + a^2) + 2a^2 (r − m) sin^2 θ) + 4(r − m)ΔR^2 + 2(r − m)(φ̃^2/sin^2 θ + Θ^2) − 4am τφ̃.

From p = 0 we get

Δ(φ̃^2/sin^2 θ + Θ^2) = τ^2 [(r^2 + a^2)^2 − a^2 Δ sin^2 θ] − Δ^2 R^2 + a^2 φ̃^2 + 4amr τφ̃.

Hence

−Ṙ = 2τ^2 (r^2 + a^2)Δ^{-1} (4mr^2 − (m + r)(r^2 + a^2)) + 2(r − m)ΔR^2 + 2aφ̃ Δ^{-1} (a(r − m)φ̃ + 2m(r^2 − a^2)τ).

Set

q(r) = a(r − m)/(rΔ − m(r^2 − a^2)).

We see that, for any r̄, a null bicharacteristic starting from a point with r = r̄, R = 0, τ = q(r̄)φ̃, stays on the surface r = r̄: it is a partial photon sphere (in the sense that only certain null bicharacteristics stay where they started from). These partial photon spheres accumulate to the one for which q = ∞.

Definition. We call a photon sphere any surface Σ = {r = r̄} for which r̄ satisfies the relation rΔ = m(r^2 − a^2).

On Σ, we have the choice of φ̃ = 0 or a(r − m)φ̃ + 2m(r^2 − a^2)τ = 0. The last possibility is forbidden on null characteristics, for r close to 3m. The conditions φ̃ = 0, R = 0 are equivalent to ṙ = 0, −2amr ṫ + [.]φ̇ = 0. Thus we see that not all null bicharacteristics starting tangentially from the photon sphere stay on it; we can say that this is a partial photon sphere (in contrast with the case a = 0).
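The trapping condition can be illustrated numerically. The sketch below takes the expression for −Ṙ with Δ = r^2 + a^2 − 2mr, which is an assumption about the intended formula since the extracted text has lost several symbols, and checks that the right-hand side vanishes when R = 0 and τ = q(r)φ̃. It uses the identity 4mr^2 − (m + r)(r^2 + a^2) = −(rΔ − m(r^2 − a^2)). Function and argument names are hypothetical.

```python
def kerr_delta(r, a, m):
    """Delta = r^2 + a^2 - 2 m r."""
    return r * r + a * a - 2.0 * m * r


def photon_function(r, a, m):
    """r * Delta - m (r^2 - a^2); its zero defines the photon sphere."""
    return r * kerr_delta(r, a, m) - m * (r * r - a * a)


def q(r, a, m):
    """Ratio tau / phi_tilde for which the surface r = const is preserved."""
    return a * (r - m) / photon_function(r, a, m)


def minus_r_dot(r, tau, big_r, phi_tilde, a, m):
    """Right-hand side of the equation for -dR/ds after using p = 0."""
    delta = kerr_delta(r, a, m)
    return (-2.0 * tau ** 2 * (r * r + a * a) * photon_function(r, a, m) / delta
            + 2.0 * (r - m) * delta * big_r ** 2
            + 2.0 * a * phi_tilde
            * (a * (r - m) * phi_tilde + 2.0 * m * (r * r - a * a) * tau) / delta)


if __name__ == "__main__":
    a, m, phi_tilde = 0.5, 1.0, 0.7
    for r in (2.2, 4.0, 7.5):
        tau = q(r, a, m) * phi_tilde
        print(r, minus_r_dot(r, tau, 0.0, phi_tilde, a, m))
```

Each printed value should vanish up to rounding, which is the statement that a bicharacteristic with R = 0 and τ = q(r)φ̃ keeps R = 0 and hence stays on r = const.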
4.4. Intersphere region.

Lemma 4.4. Let a < m, and define the function f_α (0 ≤ α ≤ a) by

f_α(r) = 2rΔ + (m − r)(r^2 + α^2) = r^3 − 3mr^2 + (2a^2 − α^2)r + mα^2.

Then,
i) For λ2(α) = m + (m^2 + (α^2 − 2a^2)/3)^{1/2}, f_α(λ2) < 0, (∂_r f_α)(λ2) = 0, and (∂_r f_α)(r) > 0 for r > λ2(α). Hence there exists a unique smooth zero r(α) > λ2(α) > m of f_α, for which r'(α) > 0 for α > 0, and r'(0) = 0.
ii) The surface {r = r(a) ≡ r0} is a (partial) photon sphere defined in Sect. 4.3. The value of r(0) ≡ r1 is r1 = (3m + (9m^2 − 8a^2)^{1/2})/2. Hence, r1 ≤ r(α) ≤ r0. Remark also that r1 > λ2(a).
iii) The function k̃ = 2rΔ + (m − r)(r^2 + a^2 cos^2 θ) = f_{a|cos θ|}(r) vanishes on the surface r = r(a|cos θ|) ≡ S_a(θ), for which r1 ≤ S_a(θ) ≤ r0, S_a(0) = S_a(π) = r0.

Proof of Lemma 4.4. 1. We have ∂_r f_α = 3r^2 − 6mr + 2a^2 − α^2, which vanishes for r = λ1 and r = λ2,

λ1 = m − δ^{1/2}, λ2 = m + δ^{1/2}, δ = m^2 + (α^2 − 2a^2)/3.

At such a point, with ε = −1 for λ1 and ε = 1 for λ2,

f_α(r) = 2m(a^2 − m^2) − 2εδ^{3/2}.

Hence f_α(λ2) < 0, and r(α) exists and is smooth. Moreover, ∂_α f_α + (∂_r f_α) r'(α) = 0. Since ∂_α f_α(r) = 2α(m − r), point i) is proved.

2. We note that

rΔ − m(r^2 − a^2) = r^3 − 3mr^2 + a^2 r + ma^2 = f_a(r),

hence r = r0 is a (partial) photon sphere for the Kerr metric. The rest of point ii) follows from i), except the final remark. But, for a > 0,

λ2(a) = m + (m^2 − a^2/3)^{1/2} ≤ 2m = (3m + (m^2)^{1/2})/2 ≤ (3m + (9m^2 − 8a^2)^{1/2})/2 = r1.

Definition. We call the region IS = {(t, r, φ, θ), 0 ≤ θ ≤ π, S_a(θ) ≤ r ≤ r0} the intersphere region.
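The statements of Lemma 4.4 lend themselves to a quick numerical illustration. The following sketch assumes Δ = r^2 + a^2 − 2mr and 0 < a < m, locates the zero r(α) of f_α above λ2(α) by bisection, and compares r(0) with the closed form for r1. The bracketing point 4m is a choice made for this example (f_α(4m) = 16m^3 + 8a^2 m − 3mα^2 > 0 when α ≤ a), and all function names are hypothetical.

```python
import math


def f_alpha(r, alpha, a, m):
    """f_alpha(r) = 2 r Delta + (m - r)(r^2 + alpha^2), Delta = r^2 + a^2 - 2mr."""
    delta = r * r + a * a - 2.0 * m * r
    return 2.0 * r * delta + (m - r) * (r * r + alpha * alpha)


def f_alpha_expanded(r, alpha, a, m):
    """Expanded cubic form quoted in the lemma."""
    return r ** 3 - 3.0 * m * r ** 2 + (2.0 * a * a - alpha * alpha) * r + m * alpha * alpha


def lambda2(alpha, a, m):
    """Larger critical point of f_alpha."""
    return m + math.sqrt(m * m + (alpha * alpha - 2.0 * a * a) / 3.0)


def largest_root(alpha, a, m, iterations=200):
    """Zero r(alpha) > lambda2(alpha) of f_alpha, found by bisection."""
    lo, hi = lambda2(alpha, a, m), 4.0 * m  # f < 0 at lo, f > 0 at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f_alpha(mid, alpha, a, m) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def r1_closed_form(a, m):
    """r1 = (3m + sqrt(9 m^2 - 8 a^2)) / 2."""
    return 0.5 * (3.0 * m + math.sqrt(9.0 * m * m - 8.0 * a * a))


if __name__ == "__main__":
    a, m = 0.5, 1.0
    r1, r0 = largest_root(0.0, a, m), largest_root(a, a, m)
    print("r1 =", r1, "closed form =", r1_closed_form(a, m))
    print("r0 =", r0)
    # S_a(theta) = r(a |cos theta|) sweeps the interval [r1, r0].
    for theta in (0.0, math.pi / 4.0, math.pi / 2.0):
        print(theta, largest_root(a * abs(math.cos(theta)), a, m))
```

The gap between r1 and r0 is the radial extent of the intersphere region at the equator; it closes as a tends to 0, where both values tend to 3m.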
5. A Nonexistence Theorem for Kerr Metrics

Theorem. Assume 0 < a < m. Let X = α1 e1 + α2 e2 + b e3 + c e4 be a vector field with the following properties:
i) The C^1 coefficients b, c and α2 do not depend on t or φ: b = b(r, θ), c = c(r, θ), α2 = α2(r, θ).
ii) The coefficient α1 is C^∞ for 0 ≤ θ ≤ π and analytic in r for 0 < θ < π.
Assume that X is non-negative, that is, there exists a function R with Ĩ ≡ Q ^Xπ + R|∇ψ|^2 ≥ 0. Then, in the intersphere region IS, R ≡ 0 and Ĩ ≡ 0.

Remarks. i) The independence of b, c and α2 on φ is technical; it seems natural since the coefficients of the Kerr metric do not depend on φ either. The requirement that these coefficients be independent of t is motivated by the fact that the flux of the field at time t has to be controlled by the standard energy (as in the Morawetz inequality). ii) The analyticity assumption on α1 is used in point 3 of the proof of the theorem, and remains a rather mysterious technical assumption.

The remaining sections are devoted to the proof of this theorem.

5.1. Computation of the quadratic terms Ĩ. Let X = X̄ + Z, X̄ = α1 e1 + α2 e2, Z = b e3 + c e4.
Lemma 5.1. With π = X π , π¯ = X π , α π = eα π , we have, for any R, I˜ = Qπ + R|∇ψ|2 = e12 [ R¯ + A] + e22 [ R¯ − A] + 2e1 e2 (π¯ 12 + b3 π12 + c4 π12 ) + 2C T + ye32 + ze42 + (2µ)−1 he3 e4 , where h = −2R + π¯ 11 + π¯ 22 + b(3 π11 + 3 π22 ) + c(4 π11 + 4 π22 ), R¯ = R − e3 (b) − e4 (c) + (2µ)−1 [b3 π34 + c4 π34 + π¯ 34 ], 2 A = b(3 π11 − 3 π22 ) + c(4 π11 − 4 π22 ) + π¯ 11 − π¯ 22 , y = −e4 (b)/µ + (4µ2 )−1 (b3 π44 + c4 π44 ), z = −e3 (c)/µ + (4µ2 )−1 (b3 π33 + c4 π33 ), and the cross terms C T are C T = e1 e3 [e1 (b) − (2µ)−1 π¯ 14 ] + e1 e4 [e1 (c) − (2µ)−1 π¯ 13 ] +e2 e3 [e2 (b) − 2bar sin θ −3/2 − (2µ)−1 π¯ 24 ] + e2 e4 [e2 (c) + 2acr sin θ −3/2 −(2µ)−1 π¯ 23 ]. Here and in the sequence, we just write eα for eα (ψ), when no misunderstanding can occur.
Proof of Lemma 5.1. 1. For π¯ = X π , we have π¯ αβ = α1 1 παβ + α2 2 παβ + π˜ αβ , π˜ αβ = eα (α1 ) < e1 , eβ > +eα (α2 ) < e2 , eβ > +eβ (α1 ) < e1 , eα > +eβ (α2 ) < e2 , eα > . Hence π˜ 11 = 2e1 (α1 ), π˜ 22 = 2e2 (α2 ), π˜ 12 = e1 (α2 ) + e2 (α1 ), π˜ 13 = e3 (α1 ), π˜ 14 = e4 (α1 ), π˜ 23 = e3 (α2 ), π˜ 24 = e4 (α2 ), π˜ 33 = π˜ 34 = π˜ 44 = 0. 2. For Z , we have similarly Z
παβ = b3 παβ + c4 παβ + eα (b) < e3 , eβ > +eα (c) < e4 , eβ > +eβ (b) < e3 , eα > +eβ (c) < e4 , eα > .
This gives Q Z π = bQ 3 π + cQ 4 π − (e12 + e22 )(e3 (b) + e4 (c)) + 2[e1 (b)e1 e3 + e1 (c)e1 e4 + e2 (b)e2 e3 + e2 (c)e2 e4 ] − µ−1 [e4 (b)e32 + e3 (c)e42 ]. Without making explicit all the terms in the deformation tensors for e3 and e4 yet, d1 = b(3 π11 − 3 π22 ) + c(4 π11 − 4 π22 ), Q Z π = e12 [−e3 (b) − e4 (c) + (2µ)−1 (b3 π34 + c4 π34 ) + (1/2)d1 ] +e22 [−e3 (b) − e4 (c) + (2µ)−1 (b3 π34 + c4 π34 ) − (1/2)d1 ] +2e1 e2 (b3 π12 + c4 π12 ) +2[e1 (b)e1 e3 + e1 (c)e1 e4 + e2 e3 (e2 (b) − 2bar sin θ −3/2 ) + e2 e4 (e2 (c) +2acr sin θ −3/2 )] +e32 [−µ−1 e4 (b) + (4µ2 )−1 (b3 π44 + c4 π44 )] + e42 [−µ−1 e3 (c) +(4µ2 )−1 (b3 π33 + c4 π33 )] +(2µ)−1 e3 e4 [b(3 π11 + 3 π22 ) + c(4 π11 + 4 π22 )]. 5.2. The function H . Lemma 5.2.1. With the notations of Lemma 5.1, assuming (e3 + e4 )(b) = 0, (e3 + e4 )(c) = 0, we have ¯ h/(2µ) = −(y + z) − R/(2µ) + H/(4µ2 ), 1 H = k(b − c) + 2α1 π34 + (4µ)[< D1 X¯ , e1 > + < D2 X¯ , e2 >], k = 8(r 2 + a 2 )−3 (2r + (m − r )).
Proof of Lemma 5.2.1. 1. Taking into account the assumptions on b and c, we get y = e3 (b)/µ + (4µ2 )−1 (b3 π44 + c4 π44 ), R¯ = R − e3 (b) + e3 (c) + (2µ)−1 [b3 π34 + c4 π34 + π¯ 34 ], hence h has the above form with H = b[3 π33 + 3 π44 + 23 π34 + 2µ(3 π11 + 3 π22 )] + c[4 π33 + 4 π44 + 24 π34 +(2µ)(4 π11 + 4 π22 )] + 2π¯ 34 + 2µ(π¯ 11 + π¯ 22 ). The coefficient cob of b is (1/2)cob =< D3 e3 , e4 > + < D4 e3 , e4 > +(2µ)(< D1 e3 , e1 > + < D2 e3 , e2 >). The coefficient coc of c is (1/2)coc =< D3 e4 , e3 > + < D4 e4 , e3 > + (2µ)(< D1 e4 , e1 > + < D2 e4 , e2 >) = −2(e3 + e4 )µ + (2µ)[< [e1 , e3 + e4 ], e1 > + < [e2 , e3 + e4 ], e2 >] − (1/2)cob . In the present case of the Kerr metric, (e3 + e4 )µ = 0, [e1 , e3 + e4 ] = 0, [e2 , e3 + e4 ] = 0, hence cob = −coc . 2. From the formula of Sect. 4.1, we have < D3 e3 , e4 > + < D4 e3 , e4 >= 4(r 2 + a 2 )−4 [m(a 2 − r 2 ) + a 2 r sin2 θ ]. The quantity in the bracket is m(a 2 − r 2 ) + r (a 2 − a 2 cos2 θ ) = mr 2 (a 2 − r 2 ) + ra 2 + (m − r )(r 2 + a 2 )(a 2 cos2 θ ) = (m − r )(r 2 + a 2 ) − r 2 (m − r )(r 2 + a 2 ) + r (r 2 + a 2 )(a 2 − mr ) = (r 2 + a 2 )[(m − r ) + r ]. Since 2µ < D1 e3 , e1 > + < D2 e3 , e2 >= 4r 2 (r 2 + a 2 )−3 , we obtain finally (1/2)cob = 4(r 2 + a 2 )−3 [2r + (m − r )]. 3. Since 2 π34 = 0, we have from Lemma 5.1, π¯ 34 = α1 1 π34 + α2 2 π34 = α1 1 π34 , π¯ 11 + π¯ 22 = 2(< D1 X¯ , e1 > + < D2 X¯ , e2 >), which finishes the proof. Lemma 5.2.2. i) The non negativity of the quadratic form I˜ = Qπ + R|∇ψ|2 requires as necessary conditions √ √ ¯ R¯ ≥ |A|, y ≥ 0, z ≥ 0, R/(2µ) + ( y − z)2 ≤ H/(4µ2 ).
ii) For e1 = 0, e3 = e4 , the quadratic form reduces to I˜ = Qπ + R|∇ψ|2 = ( R¯ − A)e22 + Be2 e3 + (y + z + h/(2µ))e32 , where B = −4ar sin θ −3/2 (b − c) − 4a(r 2 + a 2 ) −2 α1 cos θ. The non negativity of the quadratic form requires |B| ≤ µ−3/2 H. Proof of Lemma 5.2.2. 1. If we set e1 = e2 = 0 in the expression of the quadratic form given in Lemma 1, the non negativity of ye32 + ze42 + (2µ)−1 he3 e4 requires (2µ)−1 |h| ≤ 2(yz)1/2 , and in particular ¯ −h/(2µ) = y + z + R/(2µ) − H/(4µ2 ) ≤ 2(yz)1/2 , which is the required inequality. 2. From Lemma 5.1 and the assumptions on b, c, we get B = −4ar sin θ −3/2 (b − c) − (µ)−1 (π¯ 23 + π¯ 24 ). But π¯ 23 = α1 1 π23 + α2 2 π23 + e3 (α2 ), π¯ 24 = α1 1 π24 + α2 2 π24 + e4 (α2 ), π¯ 23 + π¯ 24 = 4aα1 cos θ/((r 2 + a 2 )) + (e3 + e4 )(α2 ), since 2 π23 + 2 π24 = 0, and this gives the expression of B. The non-negativity requires ¯ R¯ ≤ H/(2µ), and R¯ ≥ |A|, which implies R¯ − A ≤ 2 R, ¯ ≤ (4/µ)(H/(2µ))2 , B 2 ≤ 2/µ( R¯ − A)(H/(2µ) − R) that is |B| ≤ µ−3/2 H . Lemma 5.2.3. Set ˜ B = −4ar sin θ −3/2 B, 2 2 1/2 ˜ B = b − c + [(r + a )/(r )]α1 cos θ/ sin θ. Then, for some C ∞ functions β1 , β2 , analytic in r , H = k B˜ + H˜ , H˜ = (4/(µ ))[∂θ α1 + β1 (r )α1 cos θ/ sin θ + β2 α1 ], β1 (r ) = 1 − 22 f a (r )/(r (r 2 + a 2 )4 ), f a (r ) = 2r + (m − r )(r 2 + a 2 ), 1/2
and, for some smooth γ , H˜ = (4/µ)e−γ −1/2 (sin θ )−β1 ∂θ w, w = eγ α1 (sin θ )β1 . Note that in the intersphere region IS, f a (r ) ≤ 0, hence β1 ≥ 1.
Proof of Lemma 5.2.3. We have, with X¯ = α1 e1 + α2 e2 , since < D1 e2 , e1 >= 0 and e2 (α2 ) = 0, < D1 X¯ , e1 > + < D2 X¯ , e2 >= e1 (α1 ) + α1 < D2 e1 , e2 >= e1 (α1 ) +(r 2 + a 2 ) −3/2 α1 cos θ/ sin θ. This gives H˜ = (4/µ)[e1 (α1 ) + α1 (r 2 + a 2 ) −3/2 cos θ/ sin θ ] + 2α1 1 π34 −k(r 2 + a 2 )α1 /(r 1/2 ) cos θ/ sin θ. Now (r 2 + a 2 )/ = (1 − a 2 sin2 θ/(r 2 + a 2 ))−1 = 1 + . . . sin2 θ, [µ(r 2 + a 2 )/(4r )]k = [22 /(r (r 2 + a 2 )4 )(2r + (m − r )(r 2 + a 2 )) + . . . sin2 θ, which finally gives H˜ = (4/(µ 1/2 ))[∂θ α1 + β1 (r )α1 cos θ/ sin θ + β2 α1 ] with β1 = 1 − 22 f a (r )/(r (r 2 + a 2 )4 ),
f a (r ) = 2r + (m − r )(r 2 + a 2 ).
We set w = eγ α1 (sin θ )β1 , ∂θ w = eγ (sin θ )β1 [∂θ α1 + β1 α1 cos θ/ sin θ + α1 ∂θ γ ], and it is enough to take γ such that ∂θ γ = β2 . 6. Proof of the Non Existence Theorem 1. We prove first that, in the interior of the intersphere region, i) the function H˜ from Lemma 5.2.3 cannot be negative, ii) if H˜ ( p) = 0, then B( p) = 0. Suppose H˜ ( p) < 0 for some p ∈ I S with sin θ = 0 at p. Then the necessary condition from Lemma 5.2.2 would imply ˜ < µ−3/2 k B˜ ≤ µ−3/2 |k|| B|, ˜ |B| = 4ar sin θ −3/2 | B| which in turn implies 4ar sin θ −3/2 < µ−3/2 |k|. In the region IS, we know from Lemma 4.4 that k = 8(r 2 + a 2 )−3 k˜ ≥ 0,
and we observe that k˜ = f a| cos θ| (r ) = 2r + (m − r )(r 2 + a 2 − a 2 sin2 θ ) = f a (r ) + (r − m)a 2 sin2 θ ≤ (r − m)a 2 sin2 θ, since, according to Lemma 4.4 ii), f a (r ) ≤ 0 for r1 ≤ r ≤ r0 . Thus the inequality we have obtained at p implies 4r −3/2 < µ−3/2 8(r 2 + a 2 )−3 (r − m)a. 2. We prove now that, for a ≤ m, the non strict inequality 4r −3/2 ≤ µ−3/2 8(r 2 + a 2 )−3 (r − m)a is impossible in the region IS. This inequality is in fact r 1/2 ≤ 2(r − m)a. Now 1/2 (r − m)2 (r 1/2 /(r − m)) = −m + r (r − m)2 = (r − m)3 + m(m 2 − a 2 ) > 0, hence the inequality would imply r1 (r12 + a 2 − 2mr1 )1/2 = r1 (mr1 − a 2 )1/2 ≤ 2(r1 − m)a. Using the equation satisfied by r1 , this is equivalent to r12 (mr1 − a 2 ) ≤ 4a 2 (r1 − m)2 , (3mr1 − 2a 2 )(mr1 − a 2 ) ≤ 4a 2 (mr1 + m 2 − 2a 2 ), 9mr1 ≤ 10a 2 . Since 2m ≤ r1 , this is impossible. This proves points i) and ii) above. 3. From 2 we deduce in particular that H˜ ≥ 0 for r = r0 and 0 < θ < π , hence H˜ ≡ 0 for r = r0 because of Lemma 5.2.3. Consider now the function Hˆ = (sin θ ) H˜ : by assumption, it is a C ∞ function, vanishing on r = r0 , analytic in r for 0 < θ < π . The function Hˆ can be written Hˆ = (r − r0 )l H1 for some l and some H1 ∈ C ∞ non-identically zero on r = r0 ; if this were not the case, Hˆ would be flat on r = r0 , which would imply H˜ ≡ 0 by the analyticity assumption. Now, by Lemma 4, for r = r0 , π (µ/4)eγ 1/2 (sin θ )β1 −1 H1 dθ = 0, 0
hence the same must be true also for r = r0 , which implies that there are points m 1 = (r0 , θ1 ), m 2 = (r0 , θ2 ), 0 < θ1 < π, 0 < θ2 < π, for which H1 (m 1 ) < 0, H1 (m 2 ) > 0.
This implies that $\tilde H < 0$ at some point interior to IS, which is impossible.

4. Hence $\tilde H \equiv 0$ everywhere by the analyticity assumption, which implies $\alpha_1 \equiv 0$ by Lemma 5.2.3. From point 1, $\tilde B = 0$ in IS, and also $b \equiv c$ in IS. Since $H = 0$ in IS, we obtain from Lemma 5.2.2 that also $\bar R$, $y - z$ and $A$ are zero there. But $b \equiv c$ implies $y = -z$, since ${}^{3}\pi_{44} = -{}^{4}\pi_{33}$. Finally $y = z = 0$ in IS, and the quadratic form $\tilde I$ obtained from $X$ in Lemma 5.1 is identically zero. Since ${}^{3}\pi_{34} + {}^{4}\pi_{34} = 0$, $b = c$ and $\alpha_1 = 0$ imply $R = \bar R = 0$.

Remark. In the case $a = 0$ of the Schwarzschild metric, the intersphere region reduces to $r = 3m$: it is possible, as in Sect. 3.6, to have on the photon sphere $b = c$ and $\partial_r b = -\partial_r c$. This explains the difference in the conclusions.

Acknowledgement. We would like to thank Prof. S. Klainerman for many helpful conversations about these subjects.
Communicated by G.W. Gibbons
Commun. Math. Phys. 288, 225–270 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0735-2
Communications in
Mathematical Physics
The N = 1 Triplet Vertex Operator Superalgebras

Dražen Adamović (1), Antun Milas (2)

(1) Department of Mathematics, University of Zagreb, Zagreb, Croatia.
(2) Department of Mathematics and Statistics, University at Albany (SUNY), Albany, NY 12222, USA.

Received: 6 February 2008 / Accepted: 9 November 2008
Published online: 18 February 2009 – © Springer-Verlag 2009
Abstract: We introduce a new family of $C_2$-cofinite N = 1 vertex operator superalgebras SW(m), m ≥ 1, which are natural super analogs of the triplet vertex algebra family W(p), p ≥ 2, important in logarithmic conformal field theory. We classify irreducible SW(m)-modules and discuss logarithmic modules. We also compute bosonic and fermionic formulas of irreducible SW(m) characters. Finally, we contemplate possible connections between the category of SW(m)-modules and the category of modules for the quantum group $U_q^{small}(sl_2)$, $q = e^{2\pi i/(2m+1)}$, by focusing primarily on properties of characters and Zhu's algebra A(SW(m)). This paper is a continuation of our paper Adv. Math. 217, no. 6, 2664–2699 (2008).

The second author was partially supported by NSF grant DMS-0802962.

Contents
1. Introduction
2. Preliminaries
2.1 Zhu's algebra A(V)
2.2 Intertwining operators among vertex operator superalgebra modules
3. N = 1 Neveu-Schwarz Vertex Operator Superalgebras
4. Fusion Rules For N = 1 Superconformal (2m + 1, 1)-Models
5. Lattice and Fermionic Vertex Superalgebras
5.1 Fermionic vertex operator superalgebra F
5.2 Vertex superalgebra SM(1)
6. The N = 1 Neveu-Schwarz Module Structure of $V_L \otimes F$-Modules
7. The Vertex Operator Superalgebra SM(1)
8. Zhu's Algebra A(SM(1)) and Classification of Irreducible SM(1)-Modules
8.1 Logarithmic SM(1)-modules
8.2 Further properties of A(SM(1))
9. The N = 1 Triplet Vertex Algebra SW(m)
10. Classification of Irreducible SW(m)-Modules
11. On the Structure of Zhu's Algebra A(SW(m))
12. Modular Properties of Characters of Irreducible SW(m)-Modules
13. SW(m)-Characters and q-Series Identities
13.1 The m = 1 case: first computation
13.2 Irreducible SW(m) characters from W(2m + 1) characters
14. A Conjectural Relation of SW(m) with Quantum Groups
15. Outlook and Final Remarks
16. Appendix
References
1. Introduction

Compared to rational vertex algebras, significantly less is known about the structure of modules for general vertex algebras. Recently, geared up with clues from the physics literature, some breakthroughs have been achieved in understanding at least quasi-rational vertex algebras (i.e., $C_2$-cofinite irrational vertex algebras), and in particular the triplet vertex algebras W(p), p ≥ 2 [AM2,FHST,CF] (cf. also [Ab] for p = 2). But apart from the symplectic fermions W(2), the description of categories of weak (logarithmic) modules for other triplets W(p), p ≥ 3, remains open, even though there is strong evidence for a Kazhdan-Lusztig correspondence between the category of logarithmic W(p)-modules and certain categories of modules for quantum groups (for these and related developments we refer the reader to [FHST], and especially [FGST1,FGST2,Se], and references therein).

In [AM2] we obtained several useful results about the structure of the category of W(p)-modules by using primarily Zhu's algebra and Miyamoto's pseudocharacters [Miy]. Eventually, we will require more-or-less explicit knowledge of "higher" Zhu's algebras for the triplet. But several obstacles (e.g., explicit realization of certain logarithmic modules) prevent us from taking this theory to the next level. We hope that this approach, in particular, will give additional evidence for the Kazhdan-Lusztig correspondence and contribute to a proper understanding of the relationship between quantum groups and triplets.

As with other familiar rational vertex operator algebras (e.g. Virasoro minimal models), one may also wonder if the triplet has (interesting!) N = k super extensions, and whether those exhibit similar properties (e.g., $C_2$-cofiniteness). In this paper we solve this problem for k = 1, by constructing a family of N = 1 vertex operator superalgebras SW(m), m ≥ 1, which share many similarities with the triplet family.
In what follows we briefly recall the construction of SW(m) and present our main results. Let us recall that the triplet vertex algebra W(p) [FHST,AM2] is defined as the kernel of a screening operator acting from $V_L$ to $V_{L-\alpha/p}$, where $V_L$ is the vertex algebra associated to the rank one even lattice $\mathbb{Z}\alpha$, $\langle\alpha,\alpha\rangle = 2p$, and $V_{L-\alpha/p}$ is a certain $V_L$-module. To construct an N = 1 super triplet we replace the even lattice with an odd lattice such that $\langle\alpha,\alpha\rangle = 2m + 1$, so that $V_L$ has a natural vertex operator superalgebra structure. Then we tensor $V_L$ with $F$, the free fermion vertex operator superalgebra (cf. [KWn]), and again, there is a screening operator
$$\tilde Q : V_L \otimes F \longrightarrow V_{L-\frac{\alpha}{2m+1}} \otimes F .$$
The kernel of this operator, denoted by SW(m), is what we call the N = 1 triplet vertex operator superalgebra (or simply, N = 1 super triplet). If we restrict the kernel of $\tilde Q$ to the charge zero subspace we obtain another vertex operator superalgebra SM(1) ⊂ SW(m), called the N = 1 singlet vertex operator superalgebra. Both vertex operator superalgebras contain the Neveu-Schwarz vector $\tau$, giving a representation of the $\mathfrak{ns}$ Lie superalgebra of central charge $\frac{3}{2} - \frac{12m^2}{2m+1}$. This is precisely the central charge of (1, 2m + 1) Neveu-Schwarz (degenerate) minimal models.

A different N = 1 extension of the symplectic fermion W-algebra W(2) was considered in [MS]. By using the notation used by physicists, our super triplet would be an example of a super W-algebra of type $W(\frac{3}{2}, 2m + \frac{1}{2}, 2m + \frac{1}{2}, 2m + \frac{1}{2})$. Similarly, the N = 1 singlet algebra SM(1) is an example of a super W-algebra of type $W(\frac{3}{2}, 2m + \frac{1}{2})$. We should say that for low m (e.g., m = 1) some general properties of W-superalgebras with two generators were also discussed in the physics literature, but mostly by using the Jacobi identity and methods of Lie algebras (cf. [BS] and references therein). We should also mention that several general results about W-superalgebras associated to affine superalgebras were recently obtained in [Ar,KWak] (see also [HK]). However, super singlet and super triplet vertex superalgebras do not appear in these works.

Because of the similarity between W(p) and SW(m), many results we obtain here are intimately related to those for the triplet [AM2] (cf. [FGST1,CF]), but there are some subtle differences which we address at various stages. However, to keep the paper self-contained, at many places we give proofs that are almost identical to those in [AM2].

Let us first consider the super singlet SM(1). This vertex superalgebra is too small to be $C_2$-cofinite (let alone rational!), which is evident from the following result:

Theorem 1.1. Assume that m ≥ 1.
(i) The singlet vertex superalgebra SM(1) is a simple N = 1 vertex operator superalgebra generated by $\tau$ and a primary vector $H$ of conformal weight $2m + \frac{1}{2}$.
(ii) The associative Zhu's algebra A(SM(1)) is isomorphic to the commutative algebra $\mathbb{C}[x, y]/\langle P(x, y)\rangle$, where $\langle P(x, y)\rangle$ is the ideal in $\mathbb{C}[x, y]$ generated by the polynomial
$$P(x, y) = y^2 - C_m \prod_{i=0}^{2m} (x - h^{2i+1,1}),$$
where $C_m$ is a non-trivial constant and $h^{2i+1,1} = \frac{i(i-2m)}{2(2m+1)}$.
So the structure and representation theory of SM(1) is quite similar to that of M(1) investigated in [A3] and [AM1]. In particular we can construct interesting logarithmic SM(1)-modules and logarithmic intertwining operators as defined in [HLZ].

Next we study the vertex operator superalgebra SW(m). The main result on the structure of this vertex superalgebra is

Theorem 1.2. Assume that m ≥ 1.
(i) SW(m) is a simple N = 1 vertex operator superalgebra generated by $\tau$ and three primary vectors $E$, $F$, $H$ of conformal weight $2m + \frac{1}{2}$.
(ii) The vertex operator superalgebra SW(m) is irrational and $C_2$-cofinite.
(iii) SW(m) has precisely 2m + 1 inequivalent irreducible modules.

Our proof of (ii) imitates the proof of $C_2$-cofiniteness for the triplet W(p) [AM2] (for a different proof see [CF]). The rest is done by combining methods of Zhu's associative algebra and our knowledge of irreducible $V_L \otimes F$-modules. In parallel with the triplet vertex algebra, we do not have an explicit description of A(SW(m)), but we believe that the following conjecture should hold true.

Conjecture 1.1. Zhu's algebra decomposes as a sum of ideals
$$A(SW(m)) = \bigoplus_{i=2m+1}^{3m} M_{h^{2i+1,1}} \oplus \bigoplus_{i=0}^{m-1} I_{h^{2i+1,1}} \oplus \mathbb{C}_{h^{2m+1,1}},$$
where $M_{h^{2i+1,1}} \cong M_2(\mathbb{C})$, $\dim(I_{h^{2i+1,1}}) = 2$ and $\mathbb{C}_{h^{2m+1,1}}$ is one-dimensional. In particular A(SW(m)) is (6m + 1)-dimensional.

In view of the classification result (cf. Theorem 1.2), it is important to compute irreducible characters and study their modular transformation properties. As with the triplet vertex algebra [F1], irreducible SW(m)-characters are often expressible as sums of modular forms of unequal weight. Also, because we are working within vertex operator superalgebras, the $SL(2, \mathbb{Z})$ group should be replaced with the θ-group $\Gamma_\theta$. Then we have

Theorem 1.3. The $\Gamma_\theta$-closure of the space spanned by irreducible SW(m)-characters is (3m + 1)-dimensional.

For a more precise statement see Theorem 12.1. Our result should be compared with [F1], where it was observed that the $SL(2, \mathbb{Z})$-closure of the vector space of W(p) characters is (3p − 1)-dimensional. Finally, in parallel with [FGK] and [FFT], we also obtain (see Sect. 13) fermionic formulas for characters of irreducible SW(m)-modules. Our main results indicate that there is an interesting relationship between characters of irreducible SW(m)-modules and irreducible characters of W(2m + 1)-modules. It is not clear if there is a deeper connection between these two W-algebras.

Notice that if Conjecture 1.1 is true, then the center of A(SW(m)) is (3m + 1)-dimensional, which is precisely the dimension of the center of the small quantum group $U_q^{small}(sl_2)$, $q = e^{2\pi i/(2m+1)}$ [Ker]. It is no accident that this dimension matches the dimension in Theorem 1.3 (a similar phenomenon occurs for the triplet vertex algebra [FGST1]). Furthermore, both $U_q^{small}(sl_2)$ and SW(m) have the same number of irreducible modules [Ker] (see also [La]). Thus, motivated by conjectures in [FGST1], we expect the following (rather bold) conjecture to be true.

Conjecture 1.2. The category of weak SW(m)-modules is equivalent to the category of modules for the quantum group $U_q^{small}(sl_2)$, where $q = e^{2\pi i/(2m+1)}$.

2. Preliminaries

In this section we briefly discuss the definition of vertex operator superalgebras, their modules and intertwining operators as developed in [FFR,K,KWn,Li,HLZ,HM], etc.
We assume the reader is familiar with basics of vertex algebra theory (cf. [FHL,FLM,FB,K,LL], etc.).

Let $V = V_{\bar 0} \oplus V_{\bar 1}$ be any $\mathbb{Z}_2$-graded vector space. Then any element $u \in V_{\bar 0}$ (resp. $u \in V_{\bar 1}$) is said to be even (resp. odd). We define $|u| = \bar 0$ if $u$ is even and $|u| = \bar 1$ if $u$ is odd. Elements in $V_{\bar 0}$ or $V_{\bar 1}$ are called homogeneous. Whenever $|u|$ is written, it is understood that $u$ is homogeneous.

The notion of vertex superalgebra is a natural (and straightforward) generalization of the notion of vertex algebra where the vector space $V$ in the definition is assumed to be $\mathbb{Z}_2$-graded, where the vertex operator map
$$Y(\cdot, z) : V \longrightarrow \mathrm{Hom}(V, V((z))), \qquad Y(u, z) = \sum_{n \in \mathbb{Z}} u_n z^{-n-1}$$
is compatible with the $\mathbb{Z}_2$-grading, and where the Jacobi identity for a pair of homogeneous elements is adjusted with an appropriate sign. A vertex superalgebra $V$ is called a vertex operator superalgebra if there is a special element $\omega \in V_{\bar 0}$ (called the conformal vector) whose vertex operator we write in the form
$$Y(\omega, z) = \sum_{n \in \mathbb{Z}} \omega_n z^{-n-1} = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2},$$
such that the $L(n)$ close the Virasoro algebra representation on $V$, and where $V$ is $\frac{1}{2}\mathbb{Z}$-graded (by weight), truncated from below, with finite-dimensional graded subspaces. Also, the grading is determined by the action of the Virasoro operator $L(0)$. In this paper, we shall assume that
$$V_{\bar 0} = \bigoplus_{n \in \mathbb{Z}_{\ge 0}} V(n), \qquad V_{\bar 1} = \bigoplus_{n \in \frac{1}{2} + \mathbb{Z}_{\ge 0}} V(n), \qquad \text{where } V(n) = \{a \in V \mid L(0)a = na\}.$$
For $a \in V(n)$, we shall write wt(a) = n or deg(a) = n. We shall sometimes refer to the vertex operator superalgebra $V$ as a quadruple $(V, Y, \mathbf{1}, \omega)$, where $\mathbf{1}$ is the vacuum vector (as for vertex operator algebras).

We say that the vertex operator superalgebra $V$ is generated by the set $S \subset V$ if
$$V = \mathrm{span}_{\mathbb{C}}\{u^1_{n_1} \cdots u^r_{n_r}\mathbf{1} \mid u^1, \dots, u^r \in S,\ n_1, \dots, n_r \in \mathbb{Z},\ r \in \mathbb{Z}_{>0}\}.$$
The vertex operator superalgebra $V$ is said to be strongly generated (cf. [K]) by the set $R$ if
$$V = \mathrm{span}_{\mathbb{C}}\{u^1_{n_1} \cdots u^r_{n_r}\mathbf{1} \mid u^1, \dots, u^r \in R,\ n_i < 0,\ r \in \mathbb{Z}_{>0}\}.$$

In parallel with vertex algebras we can define the notion of weak module for vertex operator superalgebras. Again, the only new requirement is that the vector space $M$ in the definition is $\mathbb{Z}_2$-graded, with grading compatible with the action of $V$, and where the Jacobi identity is adjusted as in the case of vertex superalgebras. The vertex operator acting on $M$ is usually denoted by $Y_M$. A weak $V$-module $(M, Y_M)$ is called an (ordinary) $V$-module if $M$ carries an action of the Virasoro algebra via the expansion of $Y_M(\omega, x)$, and in addition $M$ is equipped with an $\mathbb{R}$-grading (or even $\mathbb{C}$-grading) determined by the Virasoro operator $L(0)$. In addition, the grading is truncated from below, with finite-dimensional graded subspaces.
As usual, we say that a $V$-module $M$ is irreducible (or simple) if $M$ has no proper submodules. We say that a vertex operator superalgebra $V$ is rational if every $V$-module $M$ is semisimple (i.e., $M$ decomposes as a direct sum of irreducible modules) and if $V$ has only finitely many (inequivalent) irreducible modules.

Definition 2.1. Let $V$ be a vertex operator superalgebra. We say that a weak $V$-module $M$ is logarithmic if it carries an action of the Virasoro algebra and if it admits a decomposition
$$M = \bigoplus_{r \in \mathbb{C}} M_r, \qquad \text{where } M_r = \{v : (L(0) - r)^k v = 0 \text{ for some } k \in \mathbb{N}\}.$$

2.1. Zhu's algebra A(V). We define two bilinear maps $* : V \times V \to V$, $\circ : V \times V \to V$ as follows. For homogeneous $a, b \in V$ let
$$a * b = \begin{cases} \mathrm{Res}_x\, Y(a, x) \dfrac{(1+x)^{\deg(a)}}{x}\, b & \text{if } a, b \in V_{\bar 0}, \\ 0 & \text{if } a \text{ or } b \in V_{\bar 1}, \end{cases} \qquad (2.1)$$
$$a \circ b = \begin{cases} \mathrm{Res}_x\, Y(a, x) \dfrac{(1+x)^{\deg(a)}}{x^2}\, b & \text{if } a \in V_{\bar 0}, \\ \mathrm{Res}_x\, Y(a, x) \dfrac{(1+x)^{\deg(a) - \frac{1}{2}}}{x}\, b & \text{if } a \in V_{\bar 1}. \end{cases} \qquad (2.2)$$
Next, we extend $*$ and $\circ$ to $V \otimes V$ linearly, and denote by $O(V) \subset V$ the linear span of elements of the form $a \circ b$, and by $A(V)$ the quotient space $V/O(V)$. The image of $v \in V$ under the natural map $V \to A(V)$ will be denoted by $[v]$. The space $A(V)$ has a unital associative algebra structure, with the product $*$ and $[\mathbf{1}]$ as the unit element. The associative algebra $A(V)$ is called Zhu's algebra of $V$.

Assume that $M = \bigoplus_{n \in \frac{1}{2}\mathbb{Z}_{\ge 0}} M(n)$ is a $\frac{1}{2}\mathbb{Z}_{\ge 0}$-graded $V$-module. Then the top component $M(0)$ of $M$ is an $A(V)$-module under the action $[a] \mapsto o(a) = a_{\mathrm{wt}(a)-1}$ for homogeneous $a$ in $V_{\bar 0}$. We shall sometimes write $a(0)$ for $o(a)$. (Note that if $a \in V_{\bar 1}$, then $[a] = 0$ in $A(V)$. We formally set $o(a) = a(0) = 0$ in this case.) Moreover, there is a one-to-one correspondence between irreducible $A(V)$-modules and irreducible $\frac{1}{2}\mathbb{Z}_{\ge 0}$-graded $V$-modules (cf. [KWn]).

As usual, for a vertex operator superalgebra $V$ we let $C_2(V) = \{a_{-2} b : a, b \in V\}$. Then it is not hard to see that $P(V) = V/C_2(V)$ has a super Poisson algebra structure with the multiplication
$$\bar a \cdot \bar b = \overline{a_{-1} b},$$
and the Lie bracket
$$[\bar a, \bar b] = \overline{a_0 b},$$
where the bar denotes the natural projection from $V$ to $P(V)$ (see for instance [Z]). Therefore we have a decomposition $P(V) = P(V)_0 \oplus P(V)_1$ into even and odd subspaces, respectively. If $V/C_2(V)$ is finite-dimensional we say that $V$ is $C_2$-cofinite. Let $a, b \in V$ be $\mathbb{Z}_2$-homogeneous. Then by using super-commutator formulae in vertex operator superalgebras one can easily see that
$$\bar a \cdot \bar b - (-1)^{|a||b|}\, \bar b \cdot \bar a = 0 \quad \text{in } V/C_2(V). \qquad (2.3)$$
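To make the residues in (2.1) and (2.2) concrete, one can expand $(1+x)^d$ by the binomial theorem and read off which modes $a_n$ of $Y(a, x) = \sum_n a_n x^{-n-1}$ contribute. The sketch below is illustrative only; the function names are hypothetical, and it covers the cases where the exponent is a non-negative integer (integer $\deg(a)$ for even $a$, and $\deg(a) \in \frac{1}{2} + \mathbb{Z}_{\ge 0}$ for odd $a$), which is the grading assumed in this paper.

```python
from fractions import Fraction
from math import comb


def star_terms(deg_a):
    # a * b = Res_x Y(a,x) (1+x)^deg(a) / x  b  for even a, b.
    # Returns (n, coefficient) pairs meaning: a * b = sum of coefficient * a_n b.
    if not isinstance(deg_a, int) or deg_a < 0:
        raise ValueError("even a is assumed to have non-negative integer degree")
    return [(k - 1, comb(deg_a, k)) for k in range(deg_a + 1)]


def circ_terms(deg_a, odd=False):
    # a o b as in (2.2): exponent deg(a) over x^2 for even a,
    # exponent deg(a) - 1/2 over x for odd a.
    if odd:
        power = Fraction(deg_a) - Fraction(1, 2)
        if power.denominator != 1 or power < 0:
            raise ValueError("odd a is assumed to have degree in 1/2 + Z_{>=0}")
        d = int(power)
        return [(k - 1, comb(d, k)) for k in range(d + 1)]
    if not isinstance(deg_a, int) or deg_a < 0:
        raise ValueError("even a is assumed to have non-negative integer degree")
    return [(k - 2, comb(deg_a, k)) for k in range(deg_a + 1)]


if __name__ == "__main__":
    # Conformal vector omega (degree 2, even) and a degree 3/2 odd vector.
    print(star_terms(2))
    print(circ_terms(2))
    print(circ_terms(Fraction(3, 2), odd=True))
```

For the conformal vector $\omega$ of degree 2, with $\omega_n = L(n-1)$, the pairs returned by `star_terms(2)` correspond to $\omega * b = (L(-2) + 2L(-1) + L(0))b$, and those returned by `circ_terms(2)` to $\omega \circ b = (L(-3) + 2L(-2) + L(-1))b$.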
The following result was proved in [DK], and it is a generalization of Proposition 2.2 in [Ab].

Proposition 2.1. Let $V$ be strongly generated by the set $S$. Then we have:
(1) $P(V)$ is generated by the set $\{\bar a,\ a \in S\}$.
(2) $A(V)$ is generated by the set $\{[a],\ a \in S\}$.
(3) If $V$ is $C_2$-cofinite, $\dim(P(V)_0) \ge \dim(A(V))$.

2.2. Intertwining operators among vertex operator superalgebra modules. Intertwining operators for superconformal vertex operator algebras were introduced in [KWn]. Their theory is further developed in [HM] by using both even and odd formal variables. We briefly outline the definition here.

Definition 2.2. Let $V$ be a vertex operator superalgebra and $M_1$, $M_2$ and $M_3$ a triple of $V$-modules. An intertwining operator $\mathcal{Y}(\cdot, z)$ of type $\binom{M_3}{M_1\ M_2}$ is a linear map
$$\mathcal{Y} : M_1 \to \mathrm{Hom}(M_2, M_3)\{z\}, \qquad w_1 \mapsto \mathcal{Y}(w_1, z) = \sum_{n \in \mathbb{C}} (w_1)_n z^{-n-1},$$
satisfying the following conditions for $w_i \in M_i$, $i = 1, 2$, and $a \in V$:
(I1) $\mathcal{Y}(L(-1)w_1, z) = \frac{d}{dz}\mathcal{Y}(w_1, z)$.
(I2) $(w_1)_n (w_2) = 0$ for $\mathrm{Re}(n)$ sufficiently large.
(I3) The following Jacobi identity holds:
$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y_{M_3}(a, z_1)\mathcal{Y}(w_1, z_2)w_2 - (-1)^{|a||w_1|} z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \mathcal{Y}(w_1, z_2)Y_{M_2}(a, z_1)w_2 = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) \mathcal{Y}(Y_{M_1}(a, z_0)w_1, z_2)w_2,$$
for $\mathbb{Z}_2$-homogeneous $a$ and $w_1$.

We shall denote by $I\binom{M_3}{M_1\ M_2}$ the vector space of intertwining operators of type $\binom{M_3}{M_1\ M_2}$. Their dimensions are known as the "fusion rules".
3. N = 1 Neveu-Schwarz Vertex Operator Superalgebras

The N = 1 Neveu-Schwarz (or simply NS) algebra is the Lie superalgebra
$$\mathfrak{ns} = \bigoplus_{n \in \mathbb{Z}} \mathbb{C}L(n) \oplus \bigoplus_{m \in \frac{1}{2} + \mathbb{Z}} \mathbb{C}G(m) \oplus \mathbb{C}C$$
with commutation relations ($m, n \in \mathbb{Z}$):
$$[L(m), L(n)] = (m - n)L(m + n) + \delta_{m+n,0}\frac{m^3 - m}{12}C,$$
$$[G(m + \tfrac{1}{2}), L(n)] = (m + \tfrac{1}{2} - \tfrac{n}{2})\,G(m + n + \tfrac{1}{2}), \qquad (3.1)$$
$$\{G(m + \tfrac{1}{2}), G(n - \tfrac{1}{2})\} = 2L(m + n) + \tfrac{1}{3}m(m + 1)\delta_{m+n,0}C, \qquad (3.2)$$
$$[L(m), C] = 0, \qquad [G(m + \tfrac{1}{2}), C] = 0.$$

It is important to consider vertex algebras which admit an action of the N = 1 Neveu-Schwarz algebra (cf. [HM]). These vertex operator superalgebras are called N = 1 Neveu-Schwarz vertex operator superalgebras and are subject to an additional axiom: There exists $\tau \in V_{3/2}$ (superconformal vector) such that
$$Y(\tau, z) = \sum_{n \in \mathbb{Z} + 1/2} G(n) z^{-n-3/2}, \qquad G(n) \in \mathrm{End}(V),$$
where the $G(n)$ satisfy bracket relations as in (3.1) and (3.2).

The simplest examples of N = 1 vertex operator superalgebras are the $\mathfrak{ns}$-modules $L^{ns}(c, 0)$, $c \ne 0$, where we use the standard notation and for any $(c, h) \in \mathbb{C}^2$ we denote by $L^{ns}(c, h)$ the corresponding irreducible highest weight $\mathfrak{ns}$-module with central charge $c$ and highest weight $h$ (cf. [KWn,Li,A1,HM]). It is well-known that the vertex operator superalgebra $L^{ns}(c, 0)$, $c \ne 0$, is simple. Set
$$c_{p,q} = \frac{3}{2}\left(1 - \frac{2(p - q)^2}{pq}\right), \qquad h^{r,s}_{p,q} = \frac{(sp - rq)^2 - (p - q)^2}{8pq}.$$
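The formulas for $c_{p,q}$ and $h^{r,s}_{p,q}$ can be checked against the values quoted in the Introduction. The following sketch uses exact rational arithmetic; the function names are hypothetical and the check is a consistency test for $p = 2m + 1$, $q = 1$, not part of the paper's argument. It confirms that $c_{2m+1,1} = \frac{3}{2} - \frac{12m^2}{2m+1}$, that $h^{2i+1,1}_{2m+1,1} = \frac{i(i-2m)}{2(2m+1)}$, and that $h^{1,3}_{2m+1,1}$ coincides numerically with the conformal weight $2m + \frac{1}{2}$ appearing in Theorems 1.1 and 1.2.

```python
from fractions import Fraction


def central_charge(p, q):
    # c_{p,q} = (3/2) (1 - 2 (p - q)^2 / (p q))
    return Fraction(3, 2) * (1 - Fraction(2 * (p - q) ** 2, p * q))


def weight(r, s, p, q):
    # h^{r,s}_{p,q} = ((s p - r q)^2 - (p - q)^2) / (8 p q)
    return Fraction((s * p - r * q) ** 2 - (p - q) ** 2, 8 * p * q)


def check_2m_plus_1_model(m):
    # Compare with the closed forms quoted in the Introduction (p = 2m + 1, q = 1).
    p, q = 2 * m + 1, 1
    assert central_charge(p, q) == Fraction(3, 2) - Fraction(12 * m * m, 2 * m + 1)
    for i in range(2 * m + 1):
        assert weight(2 * i + 1, 1, p, q) == Fraction(i * (i - 2 * m), 2 * (2 * m + 1))
    assert weight(1, 3, p, q) == 2 * m + Fraction(1, 2)
    return True


if __name__ == "__main__":
    for m in range(0, 6):
        check_2m_plus_1_model(m)
    print("central charge for m = 1:", central_charge(3, 1))
```

Exact fractions are used instead of floating point so that the equalities are tested without rounding error.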
In the rest of the paper we shall focus on certain ns modules of central charge c2m+1,1 , m ≥ 1. 4. Fusion Rules For N = 1 Superconformal (2m + 1, 1)-Models From now on we will mostly focus on (non-minimal) (2m + 1, 1)-models, so that p = 2m + 1, q = 1. Relevant lowest weights are h r,s := h r,s 2m+1,1 , r, s ∈ Z. It will be of great use to determine the fusion rules L(c2m+1,1 , h r ,s ) (4.1) I L(c2m+1,1 , h r,s ) L(c2m+1,1 , h r ,s )
N = 1 Triplet Vertex Operator Superalgebras
233
for certain triples (r, s), (r , s ) and (r , s ) ∈ Z2 . For m = 0 (i.e., the c = 3/2 case) these numbers were computed in (see [M1]). In particular, for every s > 0 we have: 3 3 3 3 3 L( , h 1,3 ) × L( , h 1,2s+1 ) = L( , h 1,2s−1 ) ⊕ L( , h 1,2s+1 ) ⊕ L( , h 1,2s+3 ), 2 2 2 2 2 (4.2) where × is just a formal product indicating which triples of irreducible modules admit nontrivial fusion rules (all with multiplicity one). As shown in [M1], the fusion rules for m = 0 can be computed by using certain projection formulas for singular vectors combined with Frenkel-Zhu’s formula. It is not hard to see that the same approach extends to m ≥ 1 as well. We only have to apply appropriate projection formulas as in Lemma 3.1 of [IK1]. Actually, for purposes of this paper we do not need any results from [IK1], because we are interested only in special properties of “fusion rules” (4.1) (nevertheless, see Remark 4.1). Proposition 4.1. For every i = 0, . . . , m − 1 and n ≥ 1 we have: the space L(c2m+1,1 , h) I L(c2m+1,1 , h 1,3 ) L(c2m+1,1 , h 2i+1,2n+1 ) is nontrivial only if h ∈ {h 2i+1,2n−1 , h 2i+1,2n+1 , h 2i+1,2n+3 }, and L(c2m+1,1 , h) I L(c2m+1,1 , h 1,3 ) L(c2m+1,1 , h 2i+1,1 ) is nontrivial only if h = h 2i+1,3 . Similarly, for every i = 0, . . . , m − 1 and n ≥ 2 we have: the space L(c2m+1,1 , h) I L(c2m+1,1 , h 1,3 ) L(c2m+1,1 , h 2i+1,−2n+1 ) is nontrivial only if h ∈ {h 2i+1,−2n−1 , h 2i+1,−2n+1 , h 2i+1,−2n+3 }, and L(c2m+1,1 , h) I L(c2m+1,1 , h 1,3 ) L(c2m+1,1 , h 2i+1,−1 ) is nontrivial only if h ∈ {h 2i+1,−3 , h 2i+1,−1 }. For a stronger statement see Remark 4.1. Proof. We assume that n ≥ 1 (for other cases essentially the same argument works). Let A(L(c2m+1,0 , 0)) be Zhu’s algebra of L(c2m+1,0 , 0) (polynomial algebra in one variable) and A(L(c2m+1 , h)) the A(L(c2m+1,0 , 0))-bimodule of L(c2m+1 , h) [FZ]. 
As in [M1], it is sufficient to analyze the structure of the $A(L(c_{2m+1,0}, 0))$-module
$$ A(L(c_{2m+1}, h^{1,3})) \otimes_{A(L(c_{2m+1,0}, 0))} L(c_{2m+1,1}, h^{2i+1,2n+1})(0), \tag{4.3} $$
where $L(c_{2m+1,1}, h)(0)$ denotes the top weight component of $L(c_{2m+1,1}, h)$ (cf. [FZ]). From [IK1] (or elsewhere) it follows that the Verma module $M(c_{2m+1,0}, h^{1,3})$ fits into the following short exact sequence:
$$ 0 \longrightarrow M(c_{2m+1,0}, h^{1,3} + \tfrac{3}{2}) \longrightarrow M(c_{2m+1,0}, h^{1,3}) \longrightarrow L(c_{2m+1,0}, h^{1,3}) \longrightarrow 0. $$
Thus the maximal submodule of $M(c_{2m+1,0}, h^{1,3})$ is generated by a singular vector of weight $h^{1,3} + \tfrac{3}{2}$ (explicitly, $(-L(-1)G(-1/2) + (2m+1)G(-3/2))v_{1,3}$, where $v_{1,3}$ is the highest weight vector in $M(c_{2m+1,0}, h^{1,3})$). Now, as in [M1], it is not hard to see that the space (4.3) is three-dimensional and that all fusion rules covered by the statements are at most 1 (actually, they are all one; see Remark 4.1).

Remark 4.1. We can actually prove the “if and only if” statement in Proposition 4.1 by using at least two different methods. On the one hand, we would have to combine the methods from [M1] with the projection formula in Lemma 3.1 of [IK1] (we do not have explicit singular vectors to work with!). Alternatively, given Proposition 4.1, it is sufficient to construct non-trivial intertwining operators for all types covered in Proposition 4.1. This is done in later sections. We should say that our fusion rule formulas coincide with Iohara-Koga’s fusion rule formula in the generic case, which is computed by using coinvariants and projection formulas rather than Frenkel-Zhu’s formula [IK1]. But, as far as we know, the coinvariant approach and Frenkel-Zhu’s formula yield the same answer in practically all known examples (for further examples see [W,M1,M4]).
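5. Lattice and Fermionic Vertex Superalgebras

We shall first recall some basic facts about lattice and fermionic vertex superalgebras. Let m ∈ Z≥0. Let $\widetilde{L} = \mathbb{Z}\beta$ be a rational lattice of rank one with nondegenerate bilinear form ⟨·, ·⟩ given by
$$ \langle \beta, \beta \rangle = \frac{1}{2m+1}. $$
Let $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} \widetilde{L}$. Extend the form ⟨·, ·⟩ on $\widetilde{L}$ to $\mathfrak{h}$. Let $\hat{\mathfrak{h}} = \mathbb{C}[t, t^{-1}] \otimes \mathfrak{h} \oplus \mathbb{C}c$ be the affinization of $\mathfrak{h}$. Set $\hat{\mathfrak{h}}^{+} = t\mathbb{C}[t] \otimes \mathfrak{h}$; $\hat{\mathfrak{h}}^{-} = t^{-1}\mathbb{C}[t^{-1}] \otimes \mathfrak{h}$. Then $\hat{\mathfrak{h}}^{+}$ and $\hat{\mathfrak{h}}^{-}$ are abelian subalgebras of $\hat{\mathfrak{h}}$. Let $U(\hat{\mathfrak{h}}^{-}) = S(\hat{\mathfrak{h}}^{-})$ be the universal enveloping algebra of $\hat{\mathfrak{h}}^{-}$. Let $\lambda \in \mathfrak{h}$. Consider the induced $\hat{\mathfrak{h}}$-module
$$ M(1, \lambda) = U(\hat{\mathfrak{h}}) \otimes_{U(\mathbb{C}[t] \otimes \mathfrak{h} \oplus \mathbb{C}c)} \mathbb{C}_\lambda \simeq S(\hat{\mathfrak{h}}^{-}) \quad \text{(linearly)}, $$
where $t\mathbb{C}[t] \otimes \mathfrak{h}$ acts trivially on $\mathbb{C}_\lambda \cong \mathbb{C}$, $\mathfrak{h}$ acts as ⟨h, λ⟩ for $h \in \mathfrak{h}$, and c acts on $\mathbb{C}_\lambda$ as multiplication by 1. We shall write M(1) for M(1, 0). For $h \in \mathfrak{h}$ and n ∈ Z write $h(n) = t^n \otimes h$. Set $h(z) = \sum_{n \in \mathbb{Z}} h(n) z^{-n-1}$. Then M(1) is a vertex algebra which is generated by the fields h(z), $h \in \mathfrak{h}$, and M(1, λ), for $\lambda \in \mathfrak{h}$, are irreducible modules for M(1). As in [DL] (see also [D,FLM,GL,K]), we have the generalized vertex algebra
$$ V_{\widetilde{L}} = M(1) \otimes \mathbb{C}[\widetilde{L}], $$
where $\mathbb{C}[\widetilde{L}]$ is the group algebra of $\widetilde{L}$ with a generator $e^{\beta}$. For $v \in V_{\widetilde{L}}$, let $Y(v, z) = \sum_{s \in \frac{1}{2m+1}\mathbb{Z}} v_s z^{-s-1}$ be the corresponding vertex operator (for precise formulae see [DL]). Define α = (2m + 1)β. Then ⟨α, α⟩ = 2m + 1, implying that $L = \mathbb{Z}\alpha \subset \widetilde{L}$ is an integer lattice. Therefore the subalgebra $V_L \subset V_{\widetilde{L}}$ has the structure of a vertex superalgebra.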
Define the Schur polynomials $S_r(x_1, x_2, \ldots)$ in variables $x_1, x_2, \ldots$ by the following equation:
$$ \exp\Big( \sum_{n=1}^{\infty} \frac{x_n}{n} y^n \Big) = \sum_{r=0}^{\infty} S_r(x_1, x_2, \ldots) y^r. \tag{5.1} $$
For any monomial $x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$ we have an element $h(-1)^{n_1} h(-2)^{n_2} \cdots h(-r)^{n_r} \mathbf{1}$ in M(1) for $h \in \mathfrak{h}$. Then for any polynomial $f(x_1, x_2, \ldots)$, $f(h(-1), h(-2), \ldots)\mathbf{1}$ is a well-defined element in M(1). In particular, $S_r(h(-1), h(-2), \ldots)\mathbf{1} \in M(1)$ for r ∈ Z≥0. Set $S_r(h)$ for $S_r(h(-1), h(-2), \ldots)\mathbf{1}$. The following relations in the generalized vertex operator algebra $V_{\widetilde{L}}$ are of great importance:
$$ e^{\gamma}_i e^{\delta} = 0 \quad \text{for } i \ge -\langle \gamma, \delta \rangle. \tag{5.2} $$
Especially, if ⟨γ, δ⟩ ≥ 0, we have $e^{\gamma}_i e^{\delta} = 0$ for i ∈ Z≥0, and if ⟨γ, δ⟩ = −n < 0, we get
$$ e^{\gamma}_{i-1} e^{\delta} = S_{n-i}(\gamma) e^{\gamma+\delta} \quad \text{for } i \in \{0, \ldots, n\}. \tag{5.3} $$
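The generating-function definition (5.1) can be explored numerically. The following is only an illustrative sketch, not part of the original argument: differentiating (5.1) in y gives the recursion $k S_k = \sum_{n=1}^{k} x_n S_{k-n}$, which the hypothetical helper `schur` below uses to evaluate $S_r$ at numerical values of the $x_n$. The specialization $x_n = (-1)^{n+1} t$ turns the left side of (5.1) into $(1+y)^t$, so $S_r$ should reduce to a binomial coefficient.

```python
from fractions import Fraction
from math import comb

def schur(r, x):
    """Evaluate S_r(x_1, x_2, ...) numerically; x(n) returns the value of x_n."""
    s = [Fraction(1)]  # S_0 = 1
    for k in range(1, r + 1):
        # Recursion obtained by differentiating (5.1): k*S_k = sum_n x_n * S_{k-n}
        s.append(sum(Fraction(x(n)) * s[k - n] for n in range(1, k + 1)) / k)
    return s[r]

# Specialization x_n = (-1)^(n+1) * t, for which exp(...) = (1 + y)^t,
# so S_r should equal the binomial coefficient C(t, r).
t = 7
values = [schur(r, lambda n: (-1) ** (n + 1) * t) for r in range(6)]
binomials = [comb(t, r) for r in range(6)]
```

Exact rational arithmetic is used so that the comparison of `values` with `binomials` is not affected by rounding.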
5.1. Fermionic vertex operator superalgebra F. In what follows we consider the Clifford algebra CL, generated by $\{\phi(n),\ n \in \tfrac{1}{2} + \mathbb{Z}\} \cup \{1\}$ and relations
$$ \{\phi(n), \phi(m)\} = \delta_{n,-m}, \quad n, m \in \tfrac{1}{2} + \mathbb{Z}. $$
Let F be the CL-module generated by the vector $\mathbf{1}$ such that $\phi(n)\mathbf{1} = 0$, n > 0. Then the field
$$ Y(\phi(-\tfrac{1}{2})\mathbf{1}, z) = \phi(z) = \sum_{n \in \frac{1}{2} + \mathbb{Z}} \phi(n) z^{-n-\frac{1}{2}} $$
generates a unique vertex operator superalgebra structure on F. We choose
$$ \omega^{(s)} = \tfrac{1}{2} \phi(-\tfrac{3}{2}) \phi(-\tfrac{1}{2}) \mathbf{1} $$
for the Virasoro element giving central charge 1/2. Moreover, F is a rational vertex operator superalgebra, and F is up to equivalence the unique irreducible F-module (see [FRW,KWn,Li]).
5.2. Vertex superalgebra SM(1). In this subsection we study the vertex superalgebra SM(1) := M(1) ⊗ F. We shall first define a family of N = 1 superconformal vectors in SM(1). For every m ∈ Z≥0, we define (see also [MR,K,IK2])
$$ \tau = \frac{1}{\sqrt{2m+1}} \big( \alpha(-1)\mathbf{1} \otimes \phi(-\tfrac{1}{2})\mathbf{1} + 2m\, \mathbf{1} \otimes \phi(-\tfrac{3}{2})\mathbf{1} \big), $$
$$ \omega = \frac{1}{2(2m+1)} \big( \alpha(-1)^2 + 2m\,\alpha(-2) \big)\mathbf{1} \otimes \mathbf{1} + \mathbf{1} \otimes \omega^{(s)}. $$
Set
$$ Y(\tau, z) = G(z) = \sum_{n \in \mathbb{Z}} G(n + \tfrac{1}{2}) z^{-n-2}, \qquad Y(\omega, z) = L(z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}. $$
Then τ is an N = 1 superconformal vector, and the vertex subalgebra of SM(1) strongly generated by the fields G(z) and L(z) is isomorphic to the Neveu-Schwarz vertex operator superalgebra $L^{ns}(c_{2m+1,1}, 0)$, where $c_{2m+1,1} = \tfrac{3}{2}\big(1 - \tfrac{8m^2}{2m+1}\big)$. In other words, SM(1) becomes a Fock module for the Neveu-Schwarz algebra with central charge $c_{2m+1,1}$. Moreover, for every $\lambda \in \mathfrak{h}$, the SM(1)-module SM(1, λ) := M(1, λ) ⊗ F is also a Fock module with central charge $c_{2m+1,1}$ and conformal weight
$$ \frac{1}{2(2m+1)} \big( \langle \lambda, \alpha \rangle^2 - 2m \langle \lambda, \alpha \rangle \big). \tag{5.4} $$
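As a quick sanity check on the central charge and on formula (5.4), one can tabulate a few values with exact rational arithmetic. This is only an illustrative sketch; the function names are hypothetical, and t stands for ⟨λ, α⟩. For λ = −α one has t = −(2m + 1), and the resulting weight should be 2m + 1/2, the value used for the vector $e^{-\alpha}$ later in the paper.

```python
from fractions import Fraction

def fock_weight(t, m):
    """Conformal weight (5.4) of SM(1, lambda), with t = <lambda, alpha>."""
    return Fraction(t * t - 2 * m * t, 2 * (2 * m + 1))

def central_charge(m):
    """c_{2m+1,1} = (3/2) * (1 - 8 m^2 / (2m + 1))."""
    return Fraction(3, 2) * (1 - Fraction(8 * m * m, 2 * m + 1))

m = 2
# lambda = -alpha corresponds to t = -(2m + 1).
wt_minus_alpha = fock_weight(-(2 * m + 1), m)
# lambda = i * alpha / (2m + 1) corresponds to t = i, for i = 0, ..., 2m.
top_weights = [fock_weight(i, m) for i in range(2 * m + 1)]
```

The list `top_weights` is symmetric under i ↦ 2m − i, reflecting the fact that (5.4) depends on t only through (t − m)².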
Now we want to describe the structure of these Fock modules viewed as ns-modules. For this purpose we need the concept of screening operators. As in [A3], we shall construct these operators using generalized vertex algebras. The N = 1 superconformal vector τ ∈ M(1) ⊗ F also defines an N = 1 superconformal structure on $V_{\widetilde{L}} \otimes F$ and $V_L \otimes F$. In particular, $V_L \otimes F$ is an N = 1 vertex operator superalgebra. The operator L(0) defines a $\tfrac{1}{2}\mathbb{Z}_{\ge 0}$-gradation on $V_L \otimes F$. Recall that wt(v) = n if L(0)v = nv. Define
$$ s^{(1)} = e^{\alpha} \otimes \phi(-\tfrac{1}{2})\mathbf{1} \in V_L \otimes F, \qquad s^{(2)} = e^{-\beta} \otimes \phi(-\tfrac{1}{2})\mathbf{1} \in V_{\widetilde{L}} \otimes F. $$
By using the Jacobi identity in the (generalized) vertex algebras $V_L \otimes F$ and $V_{\widetilde{L}} \otimes F$ we get the following formulas:
$$ [G(n + \tfrac{1}{2}), s^{(1)}_i] = -\frac{i}{\sqrt{2m+1}}\, e^{\alpha}_{i+n}, \qquad [L(n), s^{(1)}_i] = -i\, s^{(1)}_{i+n} \quad (i \in \mathbb{Z}), \tag{5.5} $$
$$ [G(n + \tfrac{1}{2}), s^{(2)}_r] = r\sqrt{2m+1}\, e^{-\beta}_{r+n}, \qquad [L(n), s^{(2)}_r] = -r\, s^{(2)}_{r+n} \quad (r \in \tfrac{1}{2m+1}\mathbb{Z}). \tag{5.6} $$
Let
$$ Q = s^{(1)}_0 = \mathrm{Res}_z Y(s^{(1)}, z), \qquad \widetilde{Q} = s^{(2)}_0 = \mathrm{Res}_z Y(s^{(2)}, z). $$
From relations (5.5) and (5.6) we see that the operators Q and $\widetilde{Q}$ commute with the action of the Neveu-Schwarz algebra (see also [IK2]).
We are interested in the action of these operators on SM(1). In fact, Q and $\widetilde{Q}$ are the screening operators, and therefore $\mathrm{Ker}_{SM(1)} Q$ and $\mathrm{Ker}_{SM(1)} \widetilde{Q}$ are vertex subalgebras of SM(1) (for details see Sect. 14 in [FB] and references therein).

The following lemma gives the basic properties of the operators Q and $\widetilde{Q}$. The proof is similar to that of Lemma 2.1 in [A3].

Lemma 5.1. (i) If m ≠ 0, $[Q, \widetilde{Q}] = 0$.
(ii) $\widetilde{Q} e^{n\alpha} \neq 0$, n ∈ Z>0.
(iii) $\widetilde{Q} e^{-n\alpha} = 0$, n ∈ Z≥0.

We now define the following three (non-zero) elements in the vertex operator superalgebra $V_L \otimes F$:
$$ F = e^{-\alpha}, \quad H = QF, \quad E = Q^2 F. $$
By using the expression for conformal weights (5.4) and Lemma 5.1, we conclude that these vectors are singular vectors for the action of the Neveu-Schwarz algebra, and
$$ \mathrm{wt}(F) = \mathrm{wt}(H) = \mathrm{wt}(E) = h^{1,3} = 2m + \tfrac{1}{2}. $$
It is also important to notice that H ∈ SM(1). The proof of the following result is similar to that of Lemma 3.1 in [A3].

Lemma 5.2. In the vertex operator superalgebra $V_L \otimes F$ the following relations hold:
(i) $Q^3 F = 0$.
(ii) $E_i E = F_i F = 0$, for every i ≥ −2m − 1.
(iii) $Q(H_i H) = 0$, for every i ≥ −2m − 1.

We define
$$ \widehat{F} = e^{-\alpha} \otimes \phi(-\tfrac{1}{2}), \quad \widehat{H} = Q\widehat{F}, \quad \widehat{E} = Q^2 \widehat{F}. \tag{5.7} $$
These vectors are even and have conformal weight 2m + 1. We will need the following result.

Lemma 5.3. We have
$$ \widehat{F}_i \widehat{F} = 0, \quad \widehat{E}_i \widehat{E} = 0, \quad i \ge -2m. $$
Also,
$$ Q(\widehat{H}_i \widehat{H}) = 0, \quad i \ge -2m. $$

Proof. Since Q acts as a derivation, if $\widehat{F}_i \widehat{F} = 0$ for i ≥ −2m, then $Q^4(\widehat{F}_i \widehat{F}) = 6 \widehat{E}_i \widehat{E} = 0$ for i ≥ −2m. We only have to notice the relations
$$ \widehat{F}_k \widehat{F} = \mathrm{Res}_x\, x^k\, Y(e^{-\alpha}, x) e^{-\alpha} \otimes Y(\phi(-1/2), x)\phi(-1/2)\mathbf{1}, $$
$$ \mathrm{Res}_x\, x^i\, Y(e^{-\alpha}, x) e^{-\alpha} = 0, \quad i \ge -2m - 1, $$
proven in Lemma 5.2, and
$$ \mathrm{Res}_x\, x^j\, Y(\phi(-1/2), x)\phi(-1/2)\mathbf{1} = 0, \quad j \ge 1. $$
The last formula follows from $Q^3(\widehat{F}_i \widehat{F}) = 0$ for i ≥ −2m.
6. The N = 1 Neveu-Schwarz Module Structure of $V_L \otimes F$-Modules

For i ∈ Z, we set
$$ \gamma_i = \frac{i}{2m+1}\alpha. \tag{6.1} $$
We shall first present results on the structure of $V_L \otimes F$-modules as modules for the N = 1 Neveu-Schwarz algebra. It is a known fact that the irreducible $V_L \otimes F$-modules are given by $V_{L+\gamma_i} \otimes F$, i = 0, . . . , 2m. Each $V_{L+\gamma_i} \otimes F$ is a direct sum of super Feigin-Fuchs modules via
$$ V_{L+\gamma_i} \otimes F = \bigoplus_{n \in \mathbb{Z}} (M(1) \otimes e^{\gamma_i + n\alpha}) \otimes F. $$
We shall now investigate the action of the operator Q. Since the operators $Q^j$, j ∈ Z>0, commute with the action of the Neveu-Schwarz algebra, they are actually intertwiners between super Feigin-Fuchs modules inside $V_{L+\gamma_i} \otimes F$. Assume that 0 ≤ i ≤ m. If $Q^j e^{\gamma_i - n\alpha}$ is nontrivial, it is a singular vector in the Fock module $SM(1, \gamma_i + (j - n)\alpha)$ of weight
$$ \mathrm{wt}(Q^j e^{\gamma_i - n\alpha}) = \mathrm{wt}(e^{\gamma_i - n\alpha}) = h^{2i+1,2n+1}, $$
where $h^{2i+1,2n+1} := h^{2i+1,2n+1}_{1,2m+1}$. Since $\mathrm{wt}(e^{\gamma_i + (j-n)\alpha}) > \mathrm{wt}(e^{\gamma_i - n\alpha})$ if j > 2n, we conclude that
$$ Q^j e^{\gamma_i - n\alpha} = 0 \quad \text{for } j > 2n. \tag{6.2} $$
One can similarly see that for m + 1 ≤ i ≤ 2m:
$$ Q^j e^{\gamma_i - n\alpha} = 0 \quad \text{for } j > 2n + 1. \tag{6.3} $$
The following lemma is useful for constructing singular vectors in $V_{L+\gamma_i} \otimes F$:

Lemma 6.1. (1) $Q^{2n} e^{\gamma_i - n\alpha} \neq 0$ for 0 ≤ i ≤ m.
(2) $Q^{2n+1} e^{\gamma_i - n\alpha} \neq 0$ for m + 1 ≤ i ≤ 2m.

Proof. We shall prove assertion (1) by induction on n ∈ Z>0. For n = 1 we can see directly that $Q^2 e^{\gamma_i - \alpha} \neq 0$ (or see below). Assume now that (1) holds for a certain n ∈ Z>0. Since $V_{L+\gamma_i} \otimes F$ is a simple module for the simple vertex operator superalgebra $V_L \otimes F$, we have that
$$ Y(E, z) Q^{2n} e^{\gamma_i - n\alpha} \neq 0 $$
(for the proof see [DL]). So there is $j_0 \in \mathbb{Z}$ such that
$$ E_{j_0} Q^{2n} e^{\gamma_i - n\alpha} \neq 0 \quad \text{and} \quad E_j Q^{2n} e^{\gamma_i - n\alpha} = 0 \ \text{for } j > j_0. $$
Since
$$ E_{j_0} Q^{2n} e^{\gamma_i - n\alpha} = \frac{1}{(n+1)(2n+1)} Q^{2n+2}\big( e^{-\alpha}_{j_0} e^{\gamma_i - n\alpha} \big), $$
we have that $j_0 \le i - 1 - (2m+1)n$. By using the fusion rules from Proposition 4.1, we conclude that
$$ e^{-\alpha}_{j_0} e^{\gamma_i - n\alpha} \in U(ns).e^{\gamma_i - (n+1)\alpha}, $$
and therefore
$$ Q^{2n+2} e^{\gamma_i - (n+1)\alpha} \neq 0, $$
which proves (1). Notice that the idea used in the induction step, together with the fusion rules from Proposition 4.1, can alternatively be used to show that $Q^2 e^{\gamma_i - \alpha} \neq 0$. The proof of (2) is similar, so we omit it here.

Remark 6.1. It would be desirable, in parallel with the Virasoro algebra case, to have a direct proof of Lemma 6.1 with no reference to fusion rules. However, the Virasoro algebra approach based on matrix coefficients does not apply verbatim to superconformal (1, 2m + 1)-models, so we decided to give a proof which uses the theory of vertex algebras and fusion rules. We found this approach to be quite elegant. We also remark that Iohara and Koga proved certain properties of screening operators among super Feigin-Fuchs modules in Theorem 3.1 of [IK2] (see also [MR]), but it is not clear whether these results can be used to prove Lemma 6.1.

As in the Virasoro algebra case, the N = 1 Feigin-Fuchs modules are classified according to their embedding structure. For our purposes we shall focus only on modules of certain types (Type 4 and 5 in [IK2]). These modules are either semisimple (Type 5) or they become semisimple after quotienting by the maximal semisimple submodule (Type 4). As usual, the singular vectors will be denoted by • and the cosingular vectors by ◦. The following result follows directly from Lemma 6.1 and the structure theory of super Feigin-Fuchs modules [IK2] after some minor adjustments of parameters (cf. Type 4 embedding structure).

Theorem 6.1. Assume that i ∈ {0, . . . , m − 1}.
(i) As a module for the Neveu-Schwarz algebra, $V_{L+\gamma_i} \otimes F$ is generated by the family of singular and cosingular vectors $\widetilde{Sing}_i \cup \widetilde{CSing}_i$, where
$$ \widetilde{Sing}_i = \{ u_i^{(j,n)} \mid j, n \in \mathbb{Z}_{\ge 0},\ 0 \le j \le 2n \}; \qquad \widetilde{CSing}_i = \{ w_i^{(j,n)} \mid n \in \mathbb{Z}_{>0},\ 0 \le j \le 2n - 1 \}. $$
These vectors satisfy the following relations:
$$ u_i^{(j,n)} = Q^j e^{\gamma_i - n\alpha}, \qquad Q^j w_i^{(j,n)} = e^{\gamma_i + n\alpha}. $$
The submodule generated by the singular vectors $\widetilde{Sing}_i$, denoted by $S\Lambda(i+1)$, is isomorphic to
$$ \bigoplus_{n=0}^{\infty} (2n+1) L^{ns}(c_{2m+1,1}, h^{2i+1,2n+1}). $$
(In this section the notation $k L^{ns}(c, h)$ means $L^{ns}(c, h)^{\oplus k}$, k ∈ Z≥0.)
(ii) For the quotient module we have
$$ S\Pi(m - i) := (V_{L+\gamma_i} \otimes F)/S\Lambda(i+1) \cong \bigoplus_{n=1}^{\infty} (2n) L^{ns}(c_{2m+1,1}, h^{2i+1,-2n+1}). $$
The situation described in Theorem 6.1 can be depicted by the following diagram:

[Embedding diagram (6.4): columns labeled j = −2, −1, 0, 1, 2; singular vectors are marked • and cosingular vectors ◦.]
Let $M^*$ denote the contragredient module of a V-module M, where V is a vertex operator superalgebra. Then we have an isomorphism of M(1) ⊗ F-modules,
$$ \big( M(1) \otimes e^{\frac{j}{2m+1}\alpha + i\alpha} \otimes F \big)^* \cong \big( M(1) \otimes e^{\frac{2m-j}{2m+1}\alpha - i\alpha} \otimes F \big). $$
By taking direct sums we obtain the following isomorphism of ns-modules:
$$ (V_{L+\gamma_i} \otimes F)^* \cong V_{L+\gamma_{2m-i}} \otimes F. \tag{6.5} $$
Since the dual functor interchanges cosingular and singular vectors, Theorem 6.1 implies the next result (alternatively, use the Type 4 embedding structure in [IK2]):

Theorem 6.2. Assume that i ∈ {0, . . . , m − 1}.
(i) As a module for the Neveu-Schwarz algebra, $V_{L+\gamma_{2m-i}} \otimes F$ is generated by the family of singular and cosingular vectors $\widetilde{Sing}_i \cup \widetilde{CSing}_i$, where
$$ \widetilde{Sing}_i = \{ u_i^{(j,n)} \mid n \in \mathbb{Z}_{>0},\ 0 \le j \le 2n - 1 \}; \qquad \widetilde{CSing}_i = \{ w_i^{(j,n)} \mid j, n \in \mathbb{Z}_{\ge 0},\ 0 \le j \le 2n \}. $$
These vectors satisfy the following relations:
$$ u_i^{(j,n)} = Q^j e^{\gamma_{2m-i} - n\alpha}, \qquad Q^j w_i^{(j,n)} = e^{\gamma_{2m-i} + n\alpha}. $$
The submodule generated by the singular vectors $\widetilde{Sing}_i$ is isomorphic to
$$ S\Pi(m - i) \cong \bigoplus_{n=1}^{\infty} (2n) L^{ns}(c_{2m+1,1}, h^{2i+1,-2n+1}). $$
(ii) For the quotient module we have
$$ S\Lambda(i+1) \cong (V_{L+\gamma_{2m-i}} \otimes F)/S\Pi(m - i) \cong \bigoplus_{n=0}^{\infty} (2n+1) L^{ns}(c_{2m+1,1}, h^{2i+1,2n+1}). $$
The embedding diagram for $V_{L+\gamma_{2m-i}} \otimes F$, i = 0, . . . , m − 1, is now

[Embedding diagram (6.6): columns labeled j = −1, 0, 1, 2; singular vectors are marked • and cosingular vectors ◦.]
Finally, (6.5) implies that $V_{L+\gamma_m} \otimes F$ is a self-dual $V_L \otimes F$-module. In view of that, it is not surprising that $V_{L+\gamma_m} \otimes F$ is a semisimple ns-module. More precisely, we have the following result (for the proof see the embedding structure in the Type 5 case in [IK2]).

Theorem 6.3. As a module for the Neveu-Schwarz algebra, $V_{L+\gamma_m} \otimes F$ is completely reducible and generated by the family of singular vectors
$$ \widetilde{Sing}_m = \{ u_m^{(j,n)} := Q^j e^{\gamma_m - n\alpha} \mid j, n \in \mathbb{Z}_{\ge 0},\ 0 \le j \le 2n \}; $$
and it is isomorphic to
$$ S\Lambda(m+1) := V_{L+\gamma_m} \otimes F \cong \bigoplus_{n=0}^{\infty} (2n+1) L^{ns}(c_{2m+1,1}, h^{2m+1,2n+1}). $$
The embedding structure in the last case is a totally disconnected diagram

[Embedding diagram (6.7): columns labeled j = −2, −1, 0, 1, 2; all vectors are singular (•) and there are no arrows.]
7. The Vertex Operator Superalgebra $\overline{SM(1)}$

Let us fix a positive integer m. We shall first present the structure of the vertex operator superalgebra SM(1) as a module for the Neveu-Schwarz algebra. The next result follows directly from Theorem 6.1.

Theorem 7.1. For every n ∈ Z≥0, set
$$ u_n := u_0^{(n,n)} = Q^n e^{-n\alpha}, \qquad w_{n+1} := w_0^{(n+1,n+1)}. $$
(i) The vertex operator superalgebra SM(1), as a module for the vertex operator superalgebra $L^{ns}(c_{2m+1,1}, 0)$, is generated by the family of singular and cosingular vectors $\widetilde{Sing} \cup \widetilde{CSing}$, where
$$ \widetilde{Sing} = \{ u_n \mid n \in \mathbb{Z}_{\ge 0} \}; \qquad \widetilde{CSing} = \{ w_n \mid n \in \mathbb{Z}_{>0} \}. $$
Moreover, $U(ns) u_n \cong L^{ns}(c_{2m+1,1}, h^{1,2n+1})$.
(ii) The submodule generated by the vectors $u_n$, n ∈ Z≥0, is isomorphic to
$$ [\widetilde{Sing}] \cong \bigoplus_{n=0}^{\infty} L^{ns}(c_{2m+1,1}, h^{1,2n+1}). $$
(iii) The quotient module is isomorphic to
$$ SM(1)/[\widetilde{Sing}] \cong \bigoplus_{n=1}^{\infty} L^{ns}(c_{2m+1,1}, h^{1,-2n+1}). $$
(iv) $Q u_0 = Q\mathbf{1} = 0$, and $Q u_n \neq 0$, $Q w_n \neq 0$ for every n ≥ 1.

Our Theorem 7.1 immediately gives the following result.
Proposition 7.1. We have
$$ L^{ns}(c_{2m+1,1}, 0) \cong W_0 = \mathrm{Ker}_{SM(1)} Q. $$
Define the following vertex algebra:
$$ \overline{SM(1)} = \mathrm{Ker}_{SM(1)} \widetilde{Q}. $$
Since $\widetilde{Q}$ commutes with the action of the Neveu-Schwarz algebra, we have
$$ L^{ns}(c_{2m+1,1}, 0) \cong W_0 \subset \overline{SM(1)}. $$
This implies that $\overline{SM(1)}$ is a vertex operator subalgebra of SM(1) in the sense of [FHL] (i.e., $\overline{SM(1)}$ has the same Virasoro element as SM(1)). The following theorem describes the structure of the vertex operator superalgebra $\overline{SM(1)}$ as an $L^{ns}(c_{2m+1,1}, 0)$-module.

Theorem 7.2. The vertex operator superalgebra $\overline{SM(1)}$ is isomorphic to $[\widetilde{Sing}]$ as an $L^{ns}(c_{2m+1,1}, 0)$-module, i.e.,
$$ \overline{SM(1)} \cong \bigoplus_{n=0}^{\infty} L^{ns}(c_{2m+1,1}, h^{1,2n+1}). $$

Proof. By Theorem 7.1 we know that the $L^{ns}(c_{2m+1,1}, 0)$-submodule generated by the set $\widetilde{Sing}$ is completely reducible. So to prove the assertion, it suffices to show that the operator $\widetilde{Q}$ annihilates a vector $v \in \widetilde{Sing} \cup \widetilde{CSing}$ if and only if $v \in \widetilde{Sing}$. Let $v \in \widetilde{Sing}$; then $v = Q^n e^{-n\alpha}$ for a certain n ∈ Z≥0. Since by Lemma 5.1 $\widetilde{Q} e^{-n\alpha} = 0$, we have
$$ \widetilde{Q} v = \widetilde{Q} Q^n e^{-n\alpha} = Q^n \widetilde{Q} e^{-n\alpha} = 0. $$
Let now $v \in \widetilde{CSing}$. Then there is n ∈ Z>0 such that $Q^n v = e^{n\alpha}$. Assume that $\widetilde{Q} v = 0$. Then we have that
$$ 0 = Q^n \widetilde{Q} v = \widetilde{Q} Q^n v = \widetilde{Q} e^{n\alpha}, $$
contradicting Lemma 5.1 (ii). This proves the theorem.

Next we shall prove that the vertex operator algebra $\overline{SM(1)}$ is generated by only two generators.

Theorem 7.3. (1) The vertex operator superalgebra $\overline{SM(1)}$ is generated by τ and H.
(2) The vertex operator superalgebra $\overline{SM(1)}$ is strongly generated by the set $\{\tau, \omega, H, G(-\tfrac{1}{2})H\}$.
Proof. Let U be the vertex subalgebra of $\overline{SM(1)}$ generated by τ and H. We need to prove that $U = \overline{SM(1)}$. Let $W_n$ be the (irreducible) ns-submodule of $\overline{SM(1)}$ generated by the vector $u_n$. Then $W_n \cong L^{ns}(c_{2m+1,1}, h^{1,2n+1})$. Using Lemma 6.1 we see that
$$ \mathrm{Ker}_{\overline{SM(1)}} Q^n \cong \bigoplus_{i=0}^{n-1} W_i. $$
To prove (1) it suffices to show that $u_n \in U$ for every n ∈ Z≥0. We shall prove this claim by induction. By definition we have that $u_0, u_1 (= H) \in U$. Assume that we have k ∈ Z≥0 such that $u_n \in U$ for n ≤ k. In other words, the inductive assumption is $\bigoplus_{i=0}^{k} W_i \subset U$. We shall now prove that $u_{k+1} \in U$. Set j = −(2m + 1)k − 1. By Lemma 6.1 we have
$$ Q^{2k+2} e^{-(k+1)\alpha} = Q^{2k+2}\big( e^{-\alpha}_j e^{-k\alpha} \big) \neq 0. $$
Next we notice that
$$ Q^{k+1}(H_j u_k) = Q^{k+1}\big( (Q e^{-\alpha})_j Q^k e^{-k\alpha} \big) = \frac{1}{2k+1} Q^{2k+2}\big( e^{-\alpha}_j e^{-k\alpha} \big), $$
which implies that $Q^{k+1}(H_j u_k) \neq 0$. So we have found a vector $H_j u_k \in U$ such that $\mathrm{wt}(H_j u_k) = \mathrm{wt}(u_{k+1})$. This implies
$$ H_j u_k \in \bigoplus_{i=0}^{k+1} W_i \quad \text{and} \quad H_j u_k \notin \bigoplus_{i=0}^{k} W_i. $$
Since $Q^{k+1} \bigoplus_{i=0}^{k} W_i = 0$ and $\mathrm{wt}(H_j u_k) = \mathrm{wt}(u_{k+1})$, we conclude that there is a constant C, C ≠ 0, such that
$$ H_j u_k = C u_{k+1} + u', \quad u' \in \bigoplus_{i=0}^{k} W_i \subset U. $$
Since $H_j u_k \in U$, we conclude that $u_{k+1} \in U$. Therefore, the claim is verified, and the proof of (1) is complete. The proof of (1) shows that $\overline{SM(1)}$ is spanned by the vectors
$$ u^1_{n_1} \cdots u^r_{n_r} \mathbf{1}, \quad u^i \in \{\tau, H\}, \tag{7.1} $$
such that for 1 ≤ i ≤ r:
$$ n_i \le -1 \ \text{if } u^i = H \quad \text{and} \quad n_i \le 0 \ \text{if } u^i = \tau. \tag{7.2} $$
This implies that $\overline{SM(1)}$ is strongly generated by the set $\{\tau, \omega, H, G(-\tfrac{1}{2})H\}$, and (2) holds.
The following lemma implies that for i ≥ −(2m + 1) the vectors $H_i H$ and $\widehat{H}_{i+1} \widehat{H}$ can be constructed using only the action of the Neveu-Schwarz operators L(n) and $G(n + \tfrac{1}{2})$ on the vacuum vector $\mathbf{1}$.

Lemma 7.1. We have:
$$ H_i H \in W_0 \cong L^{ns}(c_{2m+1,1}, 0) \quad \text{for every } i \ge -(2m+1), $$
$$ \widehat{H}_i \widehat{H} \in W_0 \cong L^{ns}(c_{2m+1,1}, 0) \quad \text{for every } i \ge -2m. $$

Remark 7.1. If we adopt the notation used by physicists, then Theorem 7.3 implies that $\overline{SM(1)}$ is a $W(\tfrac{3}{2}, 2m + \tfrac{1}{2})$ superalgebra, meaning that it is generated by primary fields of weight 3/2 and 2m + 1/2. In some physics papers $W(\tfrac{3}{2}, 2m + \tfrac{1}{2})$ superalgebras are studied by using general principles (e.g., Jacobi identities) but only for low m. Because $\overline{SM(1)}$ shares many similarities with the singlet algebra $\overline{M(1)}$ [AM1], we call $\overline{SM(1)}$ the super singlet vertex algebra.

8. Zhu’s Algebra $A(\overline{SM(1)})$ and Classification of Irreducible $\overline{SM(1)}$-Modules

In this section we completely determine Zhu’s algebra $A(\overline{SM(1)})$ and classify all irreducible $\overline{SM(1)}$-modules. It turns out that the structure of Zhu’s algebra $A(\overline{SM(1)})$ is similar to the structure of Zhu’s algebra $A(\overline{M(1)})$ studied in [A3], and the proofs of the main results are completely analogous. Recall that $\widehat{H} = Q(e^{-\alpha} \otimes \phi(-\tfrac{1}{2}))$. Clearly, $\widehat{H}$ is proportional to $G(-\tfrac{1}{2})H$ and therefore $\widehat{H} \in \overline{SM(1)}$. The next result shows that Zhu’s algebra $A(\overline{SM(1)})$ is commutative.

Theorem 8.1. Zhu’s algebra $A(\overline{SM(1)})$ is spanned by the set
$$ \{ [\omega]^{*s} * [\widehat{H}]^{*t} \mid s, t \ge 0 \}. $$
In particular, Zhu’s algebra $A(\overline{SM(1)})$ is isomorphic to a certain quotient of the polynomial algebra C[x, y], where x and y correspond to [ω] and $[\widehat{H}]$.

Proof. The proof follows from Proposition 2.1, Theorem 7.3 and the fact that τ and H are odd vectors.

Let $h^{r,s} = h^{r,s}_{2m+1,1}$, so that $h^{2i+1,1} = \frac{i(i-2m)}{2(2m+1)}$. As in [A3], for X = F, E or H we let $\widehat{X}(n) := \widehat{X}_{2m+n}$ (here as usual $Y(\widehat{X}, z) = \sum_{n \in \mathbb{Z}} \widehat{X}_n z^{-n-1}$). In particular, $\widehat{H}(0)$ is a degree zero operator acting on $\overline{SM(1)}$. Since $\overline{SM(1)} \subset SM(1)$, every M(1, λ) ⊗ F is naturally an $\overline{SM(1)}$-module. Let T be the subspace of M(1) ⊗ F linearly spanned by the vectors
$$ a \otimes b, \quad \text{where } a \in M(1),\ b \in F,\ \deg(b) > 0. $$
(So we only assume that b is homogeneous in F and that it is not proportional to $\mathbf{1}$.) The following lemma is a consequence of the definition of the vertex superalgebra structure on M(1) ⊗ F.

Lemma 8.1. Let $\lambda \in \mathfrak{h}^*$ and let $v_\lambda$ be the highest weight vector in M(1, λ) ⊗ F. Assume that w ∈ T. Then $o(w) v_\lambda = 0$.
We have the following proposition about the action of the “Cartan subalgebra” of $\overline{SM(1)}$ on the top component.

Proposition 8.1. Let $\lambda \in \mathfrak{h}^*$, t = ⟨α, λ⟩ and let $v_\lambda$ be the highest weight vector in M(1, λ) ⊗ F. Then we have
$$ L(0) \cdot v_\lambda = \frac{t(t - 2m)}{2(2m+1)} v_\lambda, \qquad \widehat{H}(0) \cdot v_\lambda = \binom{t}{2m+1} v_\lambda. $$

Proof. From the very definition of Q and $\widehat{H}$ we see that
$$ \widehat{H} = \phi(1/2) S_{2m+1}(\alpha) \phi(-1/2) + w = S_{2m+1}(\alpha) + w, $$
where
$$ w = S_{2m-1}(\alpha) \otimes \phi(-\tfrac{3}{2})\phi(-\tfrac{1}{2}) + \cdots + \mathbf{1} \otimes \phi(-2m - \tfrac{1}{2})\phi(-\tfrac{1}{2}) \in T. $$
On the other hand it is known (cf. Proposition 3.1 in [A2]) that
$$ S_r(\alpha)(0) v_\lambda = \binom{t}{r} v_\lambda, \quad r \ge 1. $$
The proof follows.

It is not hard to see that $x(t) = \frac{t(t-2m)}{2(2m+1)}$ and $y(t) = \binom{t}{2m+1}$ parametrize the genus zero curve P(x, y) = 0, where
$$ P(x, y) = y^2 - C_m \Big( x + \frac{m^2}{2(2m+1)} \Big) \prod_{i=0}^{m-1} \Big( x - \frac{i(i-2m)}{2(2m+1)} \Big)^2, \tag{8.1} $$
where $C_m = \frac{2^{2m+1} (2m+1)^{2m+1}}{(2m+1)!^2}$. Alternatively, notice that we can write
$$ P(x, y) = y^2 - C_m \prod_{i=0}^{2m} \big( x - h^{2i+1,1} \big). \tag{8.2} $$
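The claim that (x(t), y(t)) lies on the curve P(x, y) = 0 can be checked with exact arithmetic. The following is only an illustrative sketch with hypothetical function names; it uses form (8.2) of P and assumes that the factorial in the denominator of $C_m$ is squared, which is the normalization under which the parametrization works. Since P(x(t), y(t)) is a polynomial in t of degree at most 4m + 2, vanishing at more than 4m + 2 integer points settles the case of a given m.

```python
from fractions import Fraction
from math import comb, factorial

def h(i, m):
    """h^{2i+1,1} = i(i - 2m) / (2(2m + 1))."""
    return Fraction(i * (i - 2 * m), 2 * (2 * m + 1))

def C(m):
    """C_m = 2^(2m+1) (2m+1)^(2m+1) / ((2m+1)!)^2 (squared factorial assumed)."""
    return Fraction((2 * (2 * m + 1)) ** (2 * m + 1), factorial(2 * m + 1) ** 2)

def P(x, y, m):
    """P(x, y) = y^2 - C_m * prod_{i=0}^{2m} (x - h^{2i+1,1}), as in (8.2)."""
    prod = Fraction(1)
    for i in range(2 * m + 1):
        prod *= x - h(i, m)
    return y * y - C(m) * prod

def on_curve(t, m):
    """Check P(x(t), y(t)) == 0 for a non-negative integer t."""
    x = Fraction(t * (t - 2 * m), 2 * (2 * m + 1))
    y = comb(t, 2 * m + 1)
    return P(x, y, m) == 0

# 4m + 6 sample points per m, more than the degree bound 4m + 2.
checks = [on_curve(t, m) for m in range(1, 4) for t in range(4 * m + 6)]
```

The check works because $x(t) - h^{2i+1,1} = \frac{(t-i)(t+i-2m)}{2(2m+1)}$, so the product over i = 0, . . . , 2m is a perfect square in t.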
By using arguments analogous to those in the proof of Lemma 6.1 from [A3], we obtain the following result:

Lemma 8.2. In Zhu’s algebra $A(\overline{SM(1)})$ we have the following relation
$$ [\widehat{H}] * [\widehat{H}] = C_m \prod_{i=0}^{2m} \big( [\omega] - h^{2i+1,1} \big), $$
where $C_m$ is as above.

By using Theorem 8.1, Lemma 8.2 and the same proof as that of Theorem 6.1 from [A3] we get:
Theorem 8.2. Zhu’s algebra $A(\overline{SM(1)})$ is isomorphic to the commutative, associative algebra C[x, y]/⟨P(x, y)⟩, where ⟨P(x, y)⟩ is the ideal in C[x, y] generated by the polynomial
$$ P(x, y) = y^2 - C_m \prod_{i=0}^{2m} \big( x - h^{2i+1,1} \big). $$

The fact that Zhu’s algebra $A(\overline{SM(1)})$ is commutative enables us to study irreducible lowest weight representations of the vertex operator superalgebra $\overline{SM(1)}$. For given (r, s) ∈ C² such that P(r, s) = 0, let L(r, s) be the irreducible lowest weight $\overline{SM(1)}$-module generated by the vector $v_{r,s}$ such that
$$ L(n) v_{r,s} = r\, \delta_{n,0}\, v_{r,s}, \qquad \widehat{H}(n) v_{r,s} = s\, \delta_{n,0}\, v_{r,s} \quad (n \ge 0). $$
Our Theorem 8.2 and standard Zhu’s theory imply the following classification result.

Theorem 8.3. The set {L(r, s) | P(r, s) = 0} provides all non-isomorphic irreducible $\tfrac{1}{2}\mathbb{Z}_{\ge 0}$-gradable $\overline{SM(1)}$-modules.

By using the classification of irreducible $\overline{SM(1)}$-modules and the same proof as that of Theorem 4.3 of [AM1] we get:

Corollary 8.1. The vertex operator superalgebra $\overline{SM(1)}$ is simple.
8.1. Logarithmic $\overline{SM(1)}$-modules. In [AM1] we studied logarithmic modules for the singlet vertex algebra $\overline{M(1)}_p$. Here we have a similar result. As in [AM1], let M(1, λ) ⊗ Ω be an $\hat{\mathfrak{h}}$-module, where Ω is a two-dimensional vector space and where $\alpha(0)|_{\Omega}$ is given by the formula
$$ \begin{pmatrix} \langle \alpha, \lambda \rangle & 1 \\ 0 & \langle \alpha, \lambda \rangle \end{pmatrix} \tag{8.3} $$
in some basis $\{w_1, w_2\}$ of Ω (see also [M3]). Then M(1, λ) ⊗ F ⊗ Ω carries an ns-module structure.

Proposition 8.2. The vector space M(1, λ) ⊗ F ⊗ Ω, $\lambda \neq \frac{m}{2m+1}\alpha$, is a genuine logarithmic $\overline{SM(1)}$-module (in other words, the module involves nontrivial Jordan blocks with respect to the action of L(0)), while for $\lambda = \frac{m}{2m+1}\alpha$, M(1, λ) ⊗ F ⊗ Ω is an ordinary $\overline{SM(1)}$-module.

Notice that the previous result is in agreement with Theorem 8.2. More precisely, because of the linear term $\big(x + \frac{m^2}{2(2m+1)}\big)$ in P(x, y), as in the proof of Proposition 7.1 of [AM1], Theorem 8.2 can now be used to show that there are no logarithmic self-extensions of $M(1, \frac{m}{2m+1}\alpha) \otimes F$.
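8.2. Further properties of $A(\overline{SM(1)})$. In the next sections we shall make use of the following important technical results.

Proposition 8.3. In Zhu’s algebra $A(\overline{SM(1)})$ we have
$$ [Q^2 e^{-2\alpha}] = B_m f_m([\omega]), $$
where
$$ f_m([\omega]) = \prod_{i=0}^{3m} \big( [\omega] - h^{2i+1,1} \big) \quad \text{and} \quad B_m = (-1)^m \frac{(2(2m+1))^{3m+1} \binom{2m}{m}}{\binom{4m+1}{m} (3m+1)!^2}. $$

Proof. First we notice that
$$ Q^2 e^{-2\alpha} = \nu H_{-2m-2} H + v, $$
where ν ≠ 0 and v ∈ U(ns).$\mathbf{1}$ (see also [AM2], Lemma 3.3). The above results on the structure of $A(\overline{SM(1)})$ imply that
$$ [Q^2 e^{-2\alpha}] = \Phi_m([\omega]) $$
for a certain $\Phi_m \in \mathbb{C}[x]$, $\deg \Phi_m \le 3m + 1$. We shall evaluate the action of $Q^2 e^{-2\alpha}$ on the top levels of the $\overline{SM(1)}$-modules M(1, λ) ⊗ F. Let $v_\lambda$ be the highest weight vector in M(1, λ) ⊗ F. First we notice that
$$ Q^2 e^{-2\alpha} = \sum_{i=0}^{4m+1} e^{\alpha}_{-i-1} e^{\alpha}_{i} e^{-2\alpha} + w, \quad \text{where } w \in T. $$
By using a direct calculation similar to that of [AM2] we see that
$$ o(Q^2 e^{-2\alpha}) v_\lambda = \sum_{i=0}^{\infty} o\big( e^{\alpha}_{-i-1} e^{\alpha}_{i} e^{-2\alpha} \big) v_\lambda $$
$$ = \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \sum_{i=0}^{\infty} z_1^{-i-1} z_2^{i} (z_1 - z_2)^{2m+1} (z_1 z_2)^{-4m-2} (1 + z_1)^t (1 + z_2)^t v_\lambda $$
$$ = \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} (z_1 - z_2)^{2m} (z_1 z_2)^{-4m-2} (1 + z_1)^t (1 + z_2)^t v_\lambda $$
$$ = \Phi_m\big( \tfrac{1}{2(2m+1)} (t^2 - 2tm) \big) v_\lambda = \widetilde{\Phi}_m(t) v_\lambda, $$
where
$$ \widetilde{\Phi}_m(t) = \sum_{k=0}^{2m} (-1)^k \binom{2m}{k} \binom{t}{4m+1-k} \binom{t}{2m+1+k}, $$
t = ⟨λ, α⟩. As in the proof of Lemma 3.4 in [AM2] one can prove the following identity:
$$ \widetilde{\Phi}_m(t) = \bar{A}_m \binom{t}{3m+1} \binom{t+m}{3m+1}, \quad \text{where } \bar{A}_m = (-1)^m \frac{\binom{2m}{m}}{\binom{4m+1}{m}}. \tag{8.4} $$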
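Identity (8.4) can be tested numerically for small m. The sketch below is illustrative only; the function names are hypothetical, and the shape of both sides is taken from the formulas as reconstructed from the text. Both sides are polynomials in t of degree at most 6m + 2, so agreement at 6m + 3 integer points settles the identity for that particular m.

```python
from fractions import Fraction
from math import comb

def phi_sum(t, m):
    """Left side: sum_k (-1)^k C(2m,k) C(t,4m+1-k) C(t,2m+1+k), k = 0..2m."""
    return sum(
        (-1) ** k * comb(2 * m, k) * comb(t, 4 * m + 1 - k) * comb(t, 2 * m + 1 + k)
        for k in range(2 * m + 1)
    )

def phi_closed(t, m):
    """Right side: A_m * C(t,3m+1) * C(t+m,3m+1), A_m = (-1)^m C(2m,m)/C(4m+1,m)."""
    a_m = Fraction((-1) ** m * comb(2 * m, m), comb(4 * m + 1, m))
    return a_m * comb(t, 3 * m + 1) * comb(t + m, 3 * m + 1)

def identity_holds(m):
    # Compare both sides at the integer points t = 0, 1, ..., 6m + 2.
    return all(phi_sum(t, m) == phi_closed(t, m) for t in range(6 * m + 3))

results = {m: identity_holds(m) for m in range(3)}
```

For non-negative integers t, `math.comb(t, k)` returns 0 when k > t, which matches the polynomial binomial coefficient at those points.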
This implies
$$ \Phi_m\big( \tfrac{1}{2(2m+1)} (t^2 - 2mt) \big) = \widetilde{\Phi}_m(t) = B_m f_m\big( \tfrac{1}{2(2m+1)} (t^2 - 2mt) \big). $$
Consequently, $\Phi_m$ is a non-trivial polynomial of degree 3m + 1 and in $A(\overline{SM(1)})$ we have
$$ [Q^2 e^{-2\alpha}] = \Phi_m([\omega]) = B_m f_m([\omega]), \quad B_m \neq 0. \tag{8.5} $$

Define the following non-trivial vector
$$ U^{F,E} := \mathrm{Res}_z Y(F, z) E \frac{(z+1)^{2m}}{z} \in \overline{SM(1)}. $$
Set
$$ U^{F,E}(0) := o(U^{F,E}) = \sum_{i \ge 0} \binom{2m}{i} o(F_{i-1} E). $$

Proposition 8.4. In Zhu’s algebra $A(\overline{SM(1)})$ we have:
$$ [U^{F,E}] = g([\omega]) [\widehat{H}], $$
where g(x) ∈ C[x] is of degree at most m.

Proof. First we notice that
$$ U^{F,E} = a \widehat{H} - H \circ H $$
for a certain a ∈ U(ns). This implies that in Zhu’s algebra $A(\overline{SM(1)})$ we have
$$ [U^{F,E}] = g([\omega]) [\widehat{H}], \tag{8.6} $$
where g ∈ C[x] is a polynomial of degree at most m. (Here we used the relation [H ◦ H] = 0, which holds in $A(\overline{SM(1)})$.)

It is not at all clear that g(x) is a nonzero polynomial.

9. The N = 1 Triplet Vertex Algebra SW(m)

Define the following vertex superalgebra
$$ SW(m) = \mathrm{Ker}_{V_L \otimes F} \widetilde{Q}. $$
Recall definition (5.7). For any X ∈ {E, F, H}, $\widehat{X}$ is proportional to $G(-\tfrac{1}{2})X$, and therefore $\widehat{X} \in SW(m)$.

Theorem 9.1. (1) For every m ≥ 1, SW(m) is an N = 1 vertex operator superalgebra and SW(m) ≅ SΛ(1).
(2) The vertex operator superalgebra SW(m) is generated by E, F, H and τ.
(3) The vertex operator superalgebra SW(m) is strongly generated by the set $\{\tau, \omega, E, F, H, \widehat{E}, \widehat{F}, \widehat{H}\}$.
Proof. Recall the structure of $V_L \otimes F$ as a module for the Neveu-Schwarz algebra from Theorem 6.1. By using Lemma 5.1, similarly to the proof of Theorem 7.2, we conclude that SW(m) is a completely reducible module for the Neveu-Schwarz algebra, generated by the family of singular vectors:
$$ Q^j e^{-n\alpha}, \quad n \in \mathbb{Z}_{\ge 0},\ j \in \{0, \ldots, 2n\}. \tag{9.1} $$
This proves (1). Let $Z_n$ be the Neveu-Schwarz module generated by the singular vectors
$$ Q^j e^{-\ell\alpha}, \quad \ell \le n,\ j \in \mathbb{Z}_{\ge 0}. $$
Therefore $SW(m) = \bigcup_{n \in \mathbb{Z}_{\ge 0}} Z_n$. Let now U be the vertex subalgebra of SW(m) generated by τ, E, F, H. Clearly, U ⊆ SW(m). We shall prove that in fact U = SW(m). In order to do so it is sufficient to show that $Z_n \subseteq U$ for every n ∈ Z>0. We shall prove this claim by induction on n. By definition, the claim holds for n = 1. Assume now that $Z_n \subseteq U$. Set $j_0 = (2m+1)n + 1$. As in the proof of Theorem 7.3 we have
$$ F_{-j_0} e^{-n\alpha} = e^{-(n+1)\alpha}, \qquad E_{-j_0} Q^{2n} e^{-n\alpha} = B_{2n+1} Q^{2n+2} e^{-(n+1)\alpha}, $$
where $B_{2n+1} \neq 0$, and
$$ H_{-j_0} Q^j e^{-n\alpha} = B_j Q^{j+1} e^{-(n+1)\alpha} + v_j, $$
where $v_j \in Z_n$, $B_j \neq 0$, 0 ≤ j ≤ 2n. These relations imply that $Z_{n+1} \subseteq U$. By induction we conclude that $Z_n \subseteq U$ for every n ∈ Z>0 and therefore U = SW(m). This proves (2). The proof of (2) actually gives that SW(m) is spanned by the vectors
$$ u^1_{n_1} \cdots u^r_{n_r} \mathbf{1}, \quad u^i \in \{\tau, E, F, H\} \tag{9.2} $$
such that for 1 ≤ i ≤ r:
$$ n_i \le -1 \ \text{if } u^i \in \{E, F, H\} \quad \text{and} \quad n_i \le 0 \ \text{if } u^i = \tau. \tag{9.3} $$
The assertion (3) follows.

Theorem 9.2. Assume that m ≥ 1. Then we have
(1) The vertex operator superalgebra SW(m) is $C_2$-cofinite.
(2) The vertex operator superalgebra SW(m) is irrational.

Proof. By using Proposition 2.1, relation (2.3) and Theorem 7.2 we conclude that $SW(m)/C_2(SW(m))$ is generated by
$$ \bar{\tau},\ \bar{\omega},\ \bar{E},\ \bar{F},\ \bar{H},\ \overline{\widehat{E}},\ \overline{\widehat{F}},\ \overline{\widehat{H}}, \tag{9.4} $$
and that every two generators either commute or anti-commute. In order to prove $C_2$-cofiniteness it suffices to prove that every generator (9.4) is nilpotent in $SW(m)/C_2(SW(m))$. Let X be either E or F. From Lemma 5.2 we see that $X_{-1} X = 0$, and thus $\bar{X}^2 = 0$. By using
$$ G(-i - 1/2)^2 = L(-2i - 1) \in U(ns) $$
we get $\bar{\tau}^2 = 0$. Similarly, from $H_{-1} H \in U(ns) \cdot \mathbf{1}$,
$$ H_{-1} H = \sum_{k/2 + i_1 + \cdots + i_k + j_1 + \cdots + j_s = 4m+1} a_{i_1, \ldots, i_k} G(-i_1 - 1/2) \cdots G(-i_k - 1/2) L(-j_1) \cdots L(-j_s) \mathbf{1}, $$
where
$$ i_1 > i_2 > \cdots > i_k \ge 1, \quad j_1, \ldots, j_s \ge 2, \quad a_{i_1, \ldots, i_k} \in \mathbb{C}, $$
it follows that $H_{-1} H \in C_2(SW(m))$, and thus
$$ \bar{H}^2 = \overline{H_{-1} H} = 0. $$
We also have $\widehat{X}_{-1} \widehat{X} = 0$, $\widehat{X} \in \{\widehat{F}, \widehat{E}\}$, in SW(m) (cf. Lemma 5.3), so that
$$ \overline{\widehat{X}}^{\,2} = 0 \quad \text{in } SW(m)/C_2(SW(m)). $$
Thus, it remains to prove that $\bar{\omega}$ and $\overline{\widehat{H}}$ are nilpotent. We prove this as in [AM2]. Since
$$ \widehat{E}_{-1} \widehat{F} + \widehat{F}_{-1} \widehat{E} + 2 \widehat{H}_{-1} \widehat{H} = 0 $$
we get
$$ \overline{\widehat{H}}^{\,4} = 0. $$
Moreover, the description of Zhu’s algebra from Theorem 8.2 implies that
$$ \overline{\widehat{H}}^{\,2} = C_m \bar{\omega}^{2m+1}, \quad (C_m \neq 0), $$
which implies that $\bar{\omega}^{4m+2} = 0$. Therefore, every generator of $SW(m)/C_2(SW(m))$ is nilpotent and SW(m) is $C_2$-cofinite. This proves (1). Assertion (2) follows from the fact that $V_L \otimes F$ is not completely reducible, viewed as an SW(m)-module.

10. Classification of Irreducible SW(m)-Modules

From the definition of Zhu’s algebra and the structure of the vertex operator superalgebra SW(m) we obtain:

Proposition 10.1. The associative algebra A(SW(m)) is generated by $[\widehat{E}]$, $[\widehat{H}]$, $[\widehat{F}]$ and [ω].
Proof. The proof follows from Proposition 2.1, Theorem 9.1 and the fact that τ, E, F and H are all odd.

Theorem 10.1. In Zhu’s algebra A(SW(m)) we have the following relation:
$$ f_m([\omega]) = 0, \quad \text{where } f_m(x) = \prod_{i=0}^{3m} \big( x - h^{2i+1,1} \big). $$

Proof. Since $O(\overline{SM(1)}) \subset O(SW(m))$, the embedding $\overline{SM(1)} \subset SW(m)$ induces an algebra homomorphism $A(\overline{SM(1)}) \to A(SW(m))$. Applying this homomorphism to Proposition 8.3 and using the fact that $Q^2 e^{-2\alpha} \in O(SW(m))$ we get that $f_m([\omega]) = 0$ in A(SW(m)).

Alternatively, we can write the polynomial $f_m(x)$ as
$$ f_m(x) = \big( x - h^{2m+1,1} \big) \prod_{i=0}^{m-1} \big( x - h^{2i+1,1} \big)^2 \prod_{i=2m+1}^{3m} \big( x - h^{2i+1,1} \big), \tag{10.1} $$
indicating the possible existence of logarithmic modules of generalized lowest conformal weight $h^{2i+1,1}$, i = 0, . . . , m − 1.

Theorem 10.2. (1) For every 0 ≤ i ≤ m, SΛ(i + 1) is an irreducible $\tfrac{1}{2}\mathbb{Z}_{\ge 0}$-gradable SW(m)-module, with the top component SΛ(i + 1)(0) of lowest weight $h^{2i+1,1}$. Moreover, SΛ(i + 1)(0) is a 1-dimensional irreducible A(SW(m))-module.
(2) For every 0 ≤ j ≤ m − 1, SΠ(m − j) is an irreducible $\tfrac{1}{2}\mathbb{Z}_{\ge 0}$-gradable SW(m)-module, with the top component SΠ(m − j)(0) of lowest weight $h^{2i+1,1}$, where i = 2m + 1 + j. Moreover, SΠ(m − j)(0) is a 2-dimensional irreducible A(SW(m))-module.

Proof. The proof is similar to that of Theorem 3.7 in [AM2], so we omit it here.

Applying the previous theorem in the case of SW(m) = SΛ(1) we get:

Corollary 10.1. The vertex operator superalgebra SW(m) is simple.

As in [AM2] we have the following result:

Proposition 10.2. In Zhu’s associative algebra we have
$$ [\widehat{H}] * [\widehat{F}] - [\widehat{F}] * [\widehat{H}] = -2q([\omega]) [\widehat{F}], \tag{10.2} $$
$$ [\widehat{H}] * [\widehat{E}] - [\widehat{E}] * [\widehat{H}] = 2q([\omega]) [\widehat{E}], \tag{10.3} $$
$$ [\widehat{E}] * [\widehat{F}] - [\widehat{F}] * [\widehat{E}] = -2q([\omega]) [\widehat{H}], \tag{10.4} $$
where q is a certain polynomial.

Theorem 10.3. The set
$$ \{ S\Pi(i)(0) : 1 \le i \le m \} \cup \{ S\Lambda(i)(0) : 1 \le i \le m + 1 \} $$
provides, up to isomorphism, all irreducible modules for Zhu’s algebra A(SW(m)).
N = 1 Triplet Vertex Operator Superalgebras
253
Proof. The proof is similar to that of Theorem 3.11 in [AM2]. Assume that $U$ is an irreducible $A(SW(m))$-module. The relation $f_m([\omega]) = 0$ in $A(SW(m))$ implies that $L(0)|_U = h_{2i+1,1}\,\mathrm{Id}$, for some $i \in \{0, \dots, m\} \cup \{2m+1, \dots, 3m\}$. Assume first that $i = 2m+1+j$ for $0 \le j \le m-1$. By combining Proposition 10.2 and Theorem 10.2 we have that $q(h_{2i+1,1}) \neq 0$. Define
$$e = \frac{1}{\sqrt{2}\,q(h_{2i+1,1})}[\widehat{E}], \qquad f = -\frac{1}{\sqrt{2}\,q(h_{2i+1,1})}[\widehat{F}], \qquad h = \frac{1}{q(h_{2i+1,1})}[\widehat{H}].$$
Therefore $U$ carries the structure of an irreducible $sl_2$-module with the property that $e^2 = f^2 = 0$ and $h \neq 0$ on $U$. This easily implies that $U$ is a 2-dimensional irreducible $sl_2$-module. Moreover, as an $A(SW(m))$-module $U$ is isomorphic to $S\Pi(m-j)(0)$.
Assume next that $0 \le i \le m$. If $q(h_{2i+1,1}) \neq 0$, as above we conclude that $U$ is an irreducible 1-dimensional $sl_2$-module. Therefore $U \cong S\Lambda(i+1)(0)$. If $q(h_{2i+1,1}) = 0$, from Proposition 10.2 we have that the actions of the generators of $A(SW(m))$ commute on $U$. Irreducibility of $U$ implies that $U$ is 1-dimensional. Since $[\widehat{H}]^2$, $[\widehat{E}]^2$, $[\widehat{F}]^2$ must act trivially on $U$, we conclude that $[\widehat{H}]$, $[\widehat{E}]$, $[\widehat{F}]$ also act trivially on $U$. Therefore $U \cong S\Lambda(i+1)(0)$.
As a consequence of the previous theorem we have:
Theorem 10.4. The set $\{S\Pi(i) : 1 \le i \le m\} \cup \{S\Lambda(i) : 1 \le i \le m+1\}$ provides, up to isomorphism, all irreducible modules for the vertex operator superalgebra $SW(m)$.
11. On the Structure of Zhu's Algebra A(SW(m))
As in [AM2], the main difficulty in describing Zhu's algebra $A(SW(m))$ is that we do not have a good understanding of logarithmic $SW(m)$-modules. For the triplet $W(p)$ this problem can be resolved, at least if $p$ is prime, by using modular invariance. We believe the same approach can be applied to $SW(m)$, which would require a super version of Miyamoto's result [Miy]. This is the main reason why in this part we focus mostly on the case when $2m+1$ is prime, but we expect all results to be true in general. In many ways this section is analogous to Sect. 5 (and the Appendix) in [AM2], but as we shall see there are some important differences.
First, a few generalities regarding the Lagrange interpolation polynomial.
Proposition 11.1. Let $S = \{(x_1, y_1), \dots, (x_n, y_n)\}$, $x_i \neq x_j$, be a set of points in $\mathbb{C}^2$ such that their Lagrange interpolation polynomial $L_n(x)$ is of degree exactly $n-1$. Then every interpolation polynomial of degree exactly $n$ is given by
$$Q_\lambda(x) = L_n(x) + \lambda \prod_{i=1}^{n} (x - x_i), \qquad \lambda \neq 0.$$
Proof. Let $P(x)$ be an arbitrary interpolation polynomial of degree $n$. Then for some $\lambda$, the polynomial $P(x) - \lambda \prod_{i=1}^{n}(x - x_i)$ is of degree less than or equal to $n-1$, but not zero. Since it still interpolates the points of $S$, uniqueness of the Lagrange polynomial gives $P(x) - \lambda \prod_{i=1}^{n}(x - x_i) = L_n(x)$.
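Proposition 11.1 can be illustrated on a small example. The following is a minimal sketch in exact rational arithmetic; the sample points and the value of $\lambda$ are hypothetical choices made only for illustration (they are not taken from the paper), and the helper names are ours.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def poly_eval(p, x):
    acc = Fraction(0)
    for c in reversed(p):  # Horner's scheme
        acc = acc * x + c
    return acc


def degree(p):
    for d in range(len(p) - 1, -1, -1):
        if p[d] != 0:
            return d
    return -1


def vanishing_poly(xs):
    """prod_i (x - x_i)."""
    p = [Fraction(1)]
    for xi in xs:
        p = poly_mul(p, [Fraction(-xi), Fraction(1)])
    return p


def lagrange(points):
    """Lagrange interpolation polynomial through the given points."""
    xs = [x for x, _ in points]
    total = [Fraction(0)]
    for i, (xi, yi) in enumerate(points):
        others = xs[:i] + xs[i + 1:]
        den = Fraction(1)
        for xj in others:
            den *= Fraction(xi - xj)
        total = poly_add(total, [Fraction(yi) * c / den for c in vanishing_poly(others)])
    return total


# Hypothetical sample data: n = 4 points on y = x^3 + x + 1, so deg L_n = n - 1.
points = [(0, 1), (1, 3), (2, 11), (3, 31)]
L = lagrange(points)

# Q_lambda = L_n + lambda * prod (x - x_i) with an arbitrary nonzero lambda.
lam = Fraction(5)
Q = poly_add(L, [lam * c for c in vanishing_poly([x for x, _ in points])])

print("L_n coefficients:", L, "degree", degree(L))
print("Q_lambda degree:", degree(Q))
print("Q_lambda interpolates:", all(poly_eval(Q, x) == y for x, y in points))
```

Every choice of nonzero `lam` gives a different degree-$n$ interpolant, which is exactly the one-parameter freedom the proposition describes.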
Lemma 11.1. Let $L_m(x)$ be the Lagrange interpolation polynomial for $\left(h_{2i+1,1}, \binom{i}{2m+1}\right)$, where $2m+1 \le i \le 3m$. If we let $r(t) = L_m\!\left(\frac{t(t-2m)}{2(2m+1)}\right)$, then
$$r(t) = \frac{\prod_{i=2m+1}^{3m} (t-i)(t-2m+i)}{(2m+1)!} \times \sum_{i=2m+1}^{3m} \frac{(i!)^2 (-1)^{i+m}}{(i-2m-1)!^2\,(3m-i)!\,(i+m)!} \left( \frac{1}{t-i} - \frac{1}{t-2m+i} \right) \in \mathbb{C}[t].$$
Now, we have an important technical result (in a slightly different setup a similar result has been proven in the Appendix of [AM2]).
Proposition 11.2. For every $m \ge 1$ we have $L_m(h_{2i+1,1}) \neq 0$, $0 \le i \le m$.
Proof. As in [AM2] it suffices to let
$$s(t) = \frac{r(t)}{\prod_{i=2m+1}^{3m} (t-i)(t-2m+i)},$$
and check first that $s(0) < 0$, $s(1) < 0$, which follows by using hypergeometric summations. That $r(h_{2i+1,1}) \neq 0$ for $0 \le i \le m$ follows now from the recursion
$$s(t)(m+t)(2m+1-t)^2 = 2(m+1-t)(2m^2 + 2tm - 2 - t^2 + 2t)\,s(t-1) + (t-1)^2(3m+2-t)\,s(t-2),$$
because all coefficients in the recursion are positive for $1 \le t \le m$.
As in the Appendix of [AM2] we now observe that
$$\widehat{H} * \widehat{F} = a.\widehat{F},$$
where $a \in U(ns)$. From
$$\deg(\widehat{H}_{-1}\widehat{F}) = 4m+2$$
and
$$[\widehat{H}] * [\widehat{F}] = -q([\omega])[\widehat{F}], \qquad (11.1)$$
for some $q \in \mathbb{C}[x]$, it follows that $q([\omega])$ is a polynomial of degree at most $m$. In [AM2] this observation was sufficient to argue that $q$ has to be the interpolation polynomial. However, in view of Proposition 11.1 and Lemma 11.1, we are unable to argue that $q = L_m$, because $L_m$ is of degree $m-1$. Thus, it is not clear what the polynomial $q$ should be.
Proposition 11.3. Let $g(x)$ be as in Proposition 8.4 and
$$u(x) = \prod_{i=2m+1}^{3m} (x - h_{2i+1,1}).$$
Then $g(x) = D_m u(x)$, for some constant $D_m$. Moreover,
$$D_m u([\omega]) * [\widehat{X}] = 0, \qquad X \in \{F, H, E\}. \qquad (11.2)$$
Proof. First we notice that $U_{F,E} = F \circ E \in O(SW(m))$. Then Proposition 8.4 implies that
$$g([\omega]) * [\widehat{H}] = 0 \quad \text{in } A(SW(m))$$
for some polynomial $g$ of degree at most $m$. Because we already know all irreducible $SW(m)$-modules we also know that $g([\omega])$ must act as zero on all $SW(m)$-modules with two-dimensional highest weight subspaces (here $[\widehat{H}]$ acts nontrivially). Thus we know that $g([\omega]) = D_m u([\omega])$ for some constant $D_m$. Since $Q$ preserves $O(SW(m))$ we get (11.2).
It is crucial for our considerations to show that $D_m \neq 0$ (i.e., $g(x) \neq 0$). This will require an explicit computation of $U_{F,E}(0)$ on the top degree subspaces of certain $\overline{SM(1)}$-modules. We have the following result.
Theorem 11.1. If $m \in \mathbb{N}$ is such that $2m+1$ is a prime integer, then $g(x) \neq 0$.
For the proof of this important technical result we refer the reader to the Appendix. If $D_m \neq 0$, from Proposition 11.3 and (11.1) we get
$$[\widehat{H}] * [\widehat{F}] = -q([\omega])[\widehat{F}] = -\tilde{q}([\omega])[\widehat{F}],$$
where $\tilde{q}([\omega])$ is a polynomial of degree $m-1$, which forces $\tilde{q} = L_m$. We should say here that in [AM2] the formula (11.2) was a consequence of a formula analogous to (11.1).
Theorem 11.2. Assume that $2m+1$ is prime or $D_m \neq 0$. Then we have
(i) $[\widehat{E}]^2 = [\widehat{F}]^2 = 0$.
(ii) $[\widehat{H}]^2 = C_m P([\omega])$, where
$$P(x) = \prod_{i=0}^{2m} (x - h_{2i+1,1}) \in \mathbb{C}[x]$$
and $C_m$ is a nonzero constant.
(iii)
$$[\widehat{H}] * [\widehat{F}] = -[\widehat{F}] * [\widehat{H}] = -q([\omega]) * [\widehat{F}],$$
$$[\widehat{H}] * [\widehat{E}] = -[\widehat{E}] * [\widehat{H}] = q([\omega]) * [\widehat{E}],$$
where $q(x)$ is a nonzero polynomial of degree $m-1$ and $q(h_{2i+1,1}) \neq 0$, $0 \le i \le m$.
(iv)
$$[\widehat{H}] * [\widehat{F}] - [\widehat{F}] * [\widehat{H}] = -2q([\omega])[\widehat{F}],$$
$$[\widehat{H}] * [\widehat{E}] - [\widehat{E}] * [\widehat{H}] = 2q([\omega])[\widehat{E}],$$
$$[\widehat{E}] * [\widehat{F}] - [\widehat{F}] * [\widehat{E}] = -2q([\omega])[\widehat{H}],$$
where $q(x)$ is as in (iii).
(v)
$$\prod_{i=2m+1}^{3m} ([\omega] - h_{2i+1,1}) * [\widehat{X}] = 0, \qquad X \in \{E, F, H\}.$$
(vi) The center of $A(SW(m))$ is a subalgebra generated by $[\omega]$.
Proof. We recall that $A(SW(m))$ is generated by $[\omega]$ and $[\widehat{X}]$, $X = F, H$ and $E$ (see Proposition 10.1). For (i) we recall [AM2] that $Q$ lifts to a derivation of $A(SW(m))$, denoted by the same symbol. Now, because of Lemma 5.3 we have
$$[\widehat{F}] * [\widehat{F}] = [\widehat{E}] * [\widehat{E}] = 0.$$
Part (ii) has been proven in Lemma 8.2. It is left to show relations (iii), (iv) and (v). As in [AM2] we compute
$$0 = Q([\widehat{F}] * [\widehat{F}]) = [\widehat{H}] * [\widehat{F}] + [\widehat{F}] * [\widehat{H}],$$
which yields
$$[\widehat{H}] * [\widehat{F}] = -[\widehat{F}] * [\widehat{H}].$$
After an application of $Q^2$ to the previous equation we get
$$[\widehat{H}] * [\widehat{E}] = -[\widehat{E}] * [\widehat{H}].$$
The two remaining formulas in (iii),
$$[\widehat{H}] * [\widehat{F}] = -q([\omega]) * [\widehat{F}], \qquad (11.3)$$
$$[\widehat{H}] * [\widehat{E}] = q([\omega]) * [\widehat{E}],$$
have already been proven in the discussion preceding the theorem. The relation (iv) follows from (iii) (cf. [AM2]). Part (v) follows directly from Proposition 11.3. Part (vi) follows from the fact that $q([\omega])$ is a unit in $A(SW(m))$.
Corollary 11.1. Under the assumptions of Theorem 11.2, the associative algebra $A(SW(m))$ is spanned by
$$\{[\omega]^i,\ 0 \le i \le 3m\} \cup \{[\omega]^i * [\widehat{X}],\ 0 \le i \le m-1,\ X = E, F \text{ or } H\}.$$
Thus, $A(SW(m))$ is at most $(6m+1)$-dimensional.
By using the same ideas as in [AM2] it is not hard to show that
Theorem 11.3. Assume $2m+1$ is prime or $D_m \neq 0$. Then Zhu's algebra decomposes as a sum of ideals
$$A(SW(m)) = \bigoplus_{i=2m+1}^{3m} M_{h_{2i+1,1}} \oplus \bigoplus_{i=0}^{m-1} I_{h_{2i+1,1}} \oplus C_{h_{2m+1,1}},$$
where $M_{h_{2i+1,1}} \cong M_2(\mathbb{C})$, $1 \le \dim(I_{h_{2i+1,1}}) \le 2$ and $C_{h_{2m+1,1}}$ is one-dimensional.
It is also not hard to find explicit generators for every ideal, in parallel with [AM2]. As with the triplet we expect that all $I_{h_{2i+1,1}}$ are two-dimensional (which is related to the existence of logarithmic modules). This is equivalent to
Conjecture 11.1. The associative algebra $A(SW(m))$ is $(6m+1)$-dimensional. Then the center of $A(SW(m))$ is $(3m+1)$-dimensional.
Remark 11.1. Dong and Jiang have recently proven [DJ] that if $A(V)$ is semisimple and every irreducible admissible module is an ordinary module, then $V$ is rational. It is reasonable to assume that their result applies to vertex operator superalgebras. This would imply $\dim(I_{h_{2i+1,1}}) = 2$ for at least one $i$, and in particular $\dim A(SW(1)) = 7$. (Note that in the case $m = 1$, $D_1 \neq 0$ certainly holds.)
12. Modular Properties of Characters of Irreducible SW(m)-Modules
We first introduce several basic facts regarding classical modular forms needed for the description of irreducible $SW(m)$ characters. The Dedekind $\eta$-function is usually defined as the infinite product
$$\eta(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 - q^n),$$
an automorphic form of weight $\frac{1}{2}$. As usual in all these formulas $q = e^{2\pi i \tau}$, $\tau \in \mathbb{H}$.³ We also introduce
$$\mathfrak{f}(\tau) = q^{-1/48} \prod_{n=0}^{\infty} (1 + q^{n+1/2}), \qquad (12.1)$$
$$\mathfrak{f}_1(\tau) = q^{-1/48} \prod_{n=1}^{\infty} (1 - q^{n-1/2}), \qquad (12.2)$$
$$\mathfrak{f}_2(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 + q^n). \qquad (12.3)$$
³ Here $\tau$, the coordinate of $\mathbb{H}$, should not be confused with the superconformal vector used in previous sections.
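These (slightly normalized) Weber functions form a vector-valued modular form of weight zero. More precisely,
$$\mathfrak{f}(-1/\tau) = \mathfrak{f}(\tau), \qquad \mathfrak{f}_2(-1/\tau) = \frac{1}{\sqrt{2}}\,\mathfrak{f}_1(\tau), \qquad \mathfrak{f}_1(-1/\tau) = \sqrt{2}\,\mathfrak{f}_2(\tau),$$
$$\mathfrak{f}(\tau+1) = e^{-2\pi i/48}\,\mathfrak{f}_1(\tau), \qquad \mathfrak{f}_2(\tau+1) = e^{2\pi i/24}\,\mathfrak{f}_2(\tau), \qquad \mathfrak{f}_1(\tau+1) = e^{-2\pi i/48}\,\mathfrak{f}(\tau).$$
In what follows, we denote by
$$\Theta_{j,k}(\tau) = \sum_{n \in \mathbb{Z}} q^{(2kn+j)^2/4k}$$
the Jacobi-Riemann $\Theta$-series, where $j \in \mathbb{Z}$ and $k \in \mathbb{N}/2$. We also let
$$(\partial\Theta)_{j,k}(\tau) = \sum_{n \in \mathbb{Z}} (2kn+j)\, q^{(2kn+j)^2/4k}.$$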
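Then we have the transformation formulas (notice that here $k \in \mathbb{N}/2$, so $\Theta_{j,k}(\tau)$ is not invariant under $\tau \mapsto \tau + 1$ in general):
$$\eta(-1/\tau) = \sqrt{-i\tau}\,\eta(\tau), \qquad \eta(\tau+1) = e^{\pi i/12}\,\eta(\tau), \qquad (12.4)$$
$$\Theta_{j,k}(-1/\tau) = \sqrt{\frac{-i\tau}{2k}} \sum_{j'=0}^{2k-1} e^{i\pi j j'/k}\, \Theta_{j',k}(\tau), \qquad (12.5)$$
$$\Theta_{j,k}(\tau+2) = e^{i\pi j^2/k}\, \Theta_{j,k}(\tau), \qquad (12.6)$$
$$(\partial\Theta)_{j,k}(\tau+2) = e^{i\pi j^2/k}\, (\partial\Theta)_{j,k}(\tau), \qquad (12.7)$$
$$(\partial\Theta)_{j,k}(-1/\tau) = (-\tau)\sqrt{\frac{-i\tau}{2k}} \sum_{j'=1}^{2k-1} e^{i\pi j j'/k}\, (\partial\Theta)_{j',k}(\tau). \qquad (12.8)$$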
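The transformation rules for $\eta$, the Weber functions and $\Theta_{j,k}$ can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the test point $\tau = 0.3 + 0.9i$, the truncation lengths and the choice $k = 3/2$ (written as the integer `K = 2k = 3`) are ours, and the function names are hypothetical. It checks (12.4), the $S$-rules and one $T$-rule for the Weber functions, and (12.5), (12.6); it does not attempt (12.7) or (12.8).

```python
import cmath
import math


def qpow(tau, e):
    """q**e with q = exp(2*pi*i*tau), evaluated on a fixed branch."""
    return cmath.exp(2j * math.pi * tau * e)


def product(tau, sign, shift, terms=60):
    """Truncation of prod_{n>=1} (1 + sign * q**(n - shift))."""
    p = 1.0 + 0j
    for n in range(1, terms + 1):
        p *= 1 + sign * qpow(tau, n - shift)
    return p


def eta(tau):
    return qpow(tau, 1 / 24) * product(tau, -1, 0)


def f(tau):
    return qpow(tau, -1 / 48) * product(tau, +1, 0.5)


def f1(tau):
    return qpow(tau, -1 / 48) * product(tau, -1, 0.5)


def f2(tau):
    return qpow(tau, 1 / 24) * product(tau, +1, 0)


def theta(j, K, tau, span=6):
    """Theta_{j,k}(tau) with K = 2k a positive integer (truncated sum)."""
    return sum(qpow(tau, (K * n + j) ** 2 / (2 * K)) for n in range(-span, span + 1))


def close(a, b, tol=1e-9):
    return abs(a - b) <= tol * max(1.0, abs(b))


def run_checks(tau=0.3 + 0.9j, K=3):
    s = -1 / tau
    results = {
        "eta_S": close(eta(s), cmath.sqrt(-1j * tau) * eta(tau)),
        "eta_T": close(eta(tau + 1), cmath.exp(1j * math.pi / 12) * eta(tau)),
        "f_S": close(f(s), f(tau)),
        "f1_S": close(f1(s), math.sqrt(2) * f2(tau)),
        "f2_S": close(f2(s), f1(tau) / math.sqrt(2)),
        "f_T": close(f(tau + 1), cmath.exp(-2j * math.pi / 48) * f1(tau)),
    }
    for j in range(K):
        # exp(i*pi*j*j'/k) equals exp(2*pi*i*j*j'/K) because K = 2k.
        rhs = cmath.sqrt(-1j * tau / K) * sum(
            cmath.exp(2j * math.pi * j * jp / K) * theta(jp, K, tau) for jp in range(K)
        )
        results[f"theta_S_{j}"] = close(theta(j, K, s), rhs)
        results[f"theta_T2_{j}"] = close(
            theta(j, K, tau + 2), cmath.exp(2j * math.pi * j * j / K) * theta(j, K, tau)
        )
    return results


for name, ok in run_checks().items():
    print(name, ok)
```

A numerical agreement at one point is of course not a proof; it is only a quick way to catch a misplaced factor or sign when transcribing the formulas.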
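For a vertex operator algebra module $M$ we define its graded dimension or simply character
$$\chi_M(\tau) = \mathrm{tr}|_M\, q^{L(0)-c/24}.$$
If $V = L^{ns}(c_{2m+1,1}, 0)$ and $M = L^{ns}(c_{2m+1,1}, h_{2i+1,2n+1})$, then (see [IK2], for instance)
$$\chi_{L^{ns}(c_{2m+1,1},\,h_{2i+1,2n+1})}(\tau) = q^{\frac{m^2}{2(2m+1)}}\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( q^{h_{2i+1,2n+1}} - q^{h_{2i+1,-2n-1}} \right). \qquad (12.9)$$
By combining Theorems 6.1, 6.2 and 6.3, and formula (12.9) we obtain
Proposition 12.1. For $i = 0, \dots, m-1$,
$$\chi_{S\Lambda(i+1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{2i+1}{2m+1}\, \Theta_{m-i,\frac{2m+1}{2}}(\tau) + \frac{2}{2m+1}\, (\partial\Theta)_{m-i,\frac{2m+1}{2}}(\tau) \right), \qquad (12.10)$$
$$\chi_{S\Pi(m-i)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{2m-2i}{2m+1}\, \Theta_{m-i,\frac{2m+1}{2}}(\tau) - \frac{2}{2m+1}\, (\partial\Theta)_{m-i,\frac{2m+1}{2}}(\tau) \right). \qquad (12.11)$$
Also,
$$\chi_{S\Lambda(m+1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, \Theta_{0,\frac{2m+1}{2}}(\tau). \qquad (12.12)$$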
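For purposes of modular invariance, it is also important to compute supercharacters of irreducible modules. Let us recall that the supercharacter of a $V$-module $M$ is defined as
$$\chi^F_M(\tau) = \mathrm{tr}|_M\, \sigma\, q^{L(0)-c/24},$$
where $\sigma$ is the sign operator taking values $1$ (resp. $-1$) on even (resp. odd) vectors. In parallel with Proposition 12.1, it is not hard to compute irreducible supercharacters of $SW(m)$-modules. Here is an explicit description in terms of $\Theta$-constants and their derivatives.
Proposition 12.2. For $i = 0, \dots, m-1$,
$$\chi^F_{S\Lambda(i+1)}(\tau) = \frac{\mathfrak{f}_2(\tau)}{\eta(\tau)} \Big[ \frac{2i+1}{2m+1} \left( \Theta_{2(m-i),2(2m+1)}(\tau) - \Theta_{2(m+i+1),2(2m+1)}(\tau) \right) \qquad (12.13)$$
$$\qquad + \frac{1}{2m+1} \left( (\partial\Theta)_{2(m-i),2(2m+1)}(\tau) - (\partial\Theta)_{2(m+i+1),2(2m+1)}(\tau) \right) \Big]; \qquad (12.14)$$
$$\chi^F_{S\Pi(m-i)}(\tau) = \frac{\mathfrak{f}_2(\tau)}{\eta(\tau)} \Big[ \frac{2m-2i}{2m+1} \left( \Theta_{2(m-i),2(2m+1)}(\tau) - \Theta_{2(m+i+1),2(2m+1)}(\tau) \right) \qquad (12.15)$$
$$\qquad - \frac{1}{2m+1} \left( (\partial\Theta)_{2(m-i),2(2m+1)}(\tau) - (\partial\Theta)_{2(m+i+1),2(2m+1)}(\tau) \right) \Big]. \qquad (12.16)$$
Also,
$$\chi^F_{S\Lambda(m+1)}(\tau) = \frac{\mathfrak{f}_2(\tau)}{\eta(\tau)} \left( \Theta_{0,2(2m+1)}(\tau) - \Theta_{2(2m+1),2(2m+1)}(\tau) \right). \qquad (12.17)$$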
As in [F2] we now study modular invariance properties of irreducible $SW(m)$ characters and supercharacters. We only consider some special modular transformations. For example,
$$\chi_{S\Lambda(i+1)}(-1/\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \sum_{k=0}^{2m} \lambda_k\, \Theta_{k,\frac{2m+1}{2}}(\tau) + (-\tau)\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \sum_{j=1}^{2m} \nu_j\, (\partial\Theta)_{j,\frac{2m+1}{2}}(\tau),$$
for some constants $\lambda_k$ and $\nu_j$. Because of
$$\Theta_{j,k} = \Theta_{-j,k} = \Theta_{2k-j,k} = \Theta_{2k+j,k}, \qquad (\partial\Theta)_{j,k} = -(\partial\Theta)_{-j,k},$$
the previous formula indicates that
$$\tau\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, (\partial\Theta)_{j,\frac{2m+1}{2}}(\tau), \qquad j = 1, \dots, m \qquad (12.18)$$
have to be added to the vector space spanned by irreducible $SW(m)$ characters in order to preserve modular invariance. In the case of the triplet vertex algebra, expressions similar to (12.18) could be interpreted as Miyamoto's pseudocharacters (cf. [AM2]). On the
other hand, the $T$ transformation $\tau \mapsto \tau + 1$ maps characters to supercharacters (multiplied by appropriate scalars). In order to find an $SL(2,\mathbb{Z})$-closure, we would have to apply the $S$ transformation to the space of supercharacters, but this requires knowledge of irreducible $\sigma$-twisted characters. Since we do not study $\sigma$-twisted $SW(m)$-modules in this paper, at this point we record the modular invariance property for the untwisted sector only.
Theorem 12.1. The vector space $NS$ spanned by
$$\chi_{S\Lambda(m+1)}(\tau), \quad \chi_{S\Lambda(i+1)}(\tau), \quad \chi_{S\Pi(m-i)}(\tau), \quad i = 0, \dots, m-1,$$
$$\tau\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, (\partial\Theta)_{m-i,\frac{2m+1}{2}}(\tau), \quad i = 0, \dots, m-1 \qquad (12.19)$$
is $(3m+1)$-dimensional and invariant under the subgroup $\Gamma_\theta \subset SL(2,\mathbb{Z})$, where $\Gamma_\theta = \langle S, T^2 \rangle$.
Remark 12.1. We expect that $S$-transforms of (generalized) supercharacters are expressible in terms of characters and generalized characters of $\sigma$-twisted $SW(m)$-modules. More precisely, the appropriately defined vector space spanned by characters and generalized supercharacters, denoted by $\widetilde{NS}$, and the vector space spanned by characters and generalized characters of $\sigma$-twisted modules, denoted by $R$, should be inter-related as on the diagram
[Diagram: the three spaces $NS$, $\widetilde{NS}$ and $R$, connected by arrows labeled $S$ and $T$.]
It is known that (super)characters of $N = 1$ minimal models in the NS and R sectors transform according to this picture (see [IK1]).
13. SW(m)-Characters and q-Series Identities
In this section we discuss fermionic expressions for irreducible characters of $SW(m)$-modules. As we shall see, irreducible $SW(m)$-modules admit $q$-series formulas similar to those for the triplet, conjectured by Flohr-Grabow-Koehn [FGK], and proven by Warnaar [Wa] (Feigin et al. independently obtained similar identities by using different methods [FFT]). More precisely, the characters of irreducible modules for the super triplet $SW(m)$ are intimately related to characters of irreducible $W(2m+1)$-modules. It is not clear whether a deeper connection persists beyond characters.
13.1. The m = 1 case: first computation. Motivated by computations in [FGK] for $W(2)$, here we probe double-sum fermionic expressions of irreducible characters of $SW(1)$-modules. As usual, we will be using
$$(a; q)_n = (1-a)(1-aq) \cdots (1-aq^{n-1}), \qquad (a; q)_\infty = \prod_{i=1}^{\infty} (1 - aq^{i-1}),$$
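and sometimes we shall write $(q)_n = (q; q)_n$, for simplicity. We start with a basic relation
$$\frac{\prod_{n=1}^{\infty} (1 + q^{n-1/2})}{\prod_{n=1}^{\infty} (1 - q^n)} = \frac{1}{\prod_{n=1}^{\infty} (1 - q^{n/2})(1 + q^n)}. \qquad (13.1)$$
We shall also use Durfee rectangle identities which hold for every $k \in \mathbb{Z}_{\ge 0}$,
$$\frac{1}{\prod_{n \ge 1} (1 - q^{n/2})} = \sum_{n=0}^{\infty} \frac{q^{(n^2+kn)/2}}{(q^{1/2})_n (q^{1/2})_{n+k}} = \sum_{n=0}^{\infty} \frac{(-q^{1/2})_n (-q^{1/2})_{n+k}\, q^{(n^2+kn)/2}}{(q)_n (q)_{n+k}}, \qquad (13.2)$$
where $(q^{1/2})_n = (q^{1/2}; q^{1/2})_n$ and $(-q^{1/2})_n = (-q^{1/2}; q^{1/2})_n$. Another useful elementary formula due to Euler is
$$\eta(\tau) = q^{1/24} \sum_{n=0}^{\infty} \frac{(-1)^n q^{(n+1)n/2}}{(q)_n}. \qquad (13.3)$$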
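The three identities (13.1)-(13.3) are statements about formal power series, so they can be checked coefficient by coefficient up to a fixed order. The following is a minimal sketch, assuming the formal variable `x` stands for $q^{1/2}$ in (13.1) and (13.2) and for $q$ in (13.3); the truncation order `N` and all helper names are our own choices.

```python
N = 30  # truncation order: all series are kept modulo x**N


def one():
    return [1] + [0] * (N - 1)


def mul(a, b):
    c = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                c[i + j] += ai * b[j]
    return c


def inv(a):
    """Inverse of a power series with constant term 1."""
    b = [0] * N
    b[0] = 1
    for n in range(1, N):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b


def shift(a, e):
    """Multiply the series a by x**e."""
    return ([0] * e + a)[:N]


def add(a, b):
    return [s + t for s, t in zip(a, b)]


def poch(c, step, n):
    """prod_{r=1}^{n} (1 - c * x**(step*r)), truncated modulo x**N."""
    p = one()
    for r in range(1, n + 1):
        if step * r >= N:
            break
        factor = one()
        factor[step * r] -= c
        p = mul(p, factor)
    return p


INF = N  # enough factors to represent an infinite product modulo x**N


def check_13_1():
    odd = one()  # prod (1 + x^(2n-1)) = prod (1 + q^(n-1/2))
    for r in range(1, N):
        if 2 * r - 1 >= N:
            break
        factor = one()
        factor[2 * r - 1] += 1
        odd = mul(odd, factor)
    lhs = mul(odd, inv(poch(1, 2, INF)))
    rhs = inv(mul(poch(1, 1, INF), poch(-1, 2, INF)))
    return lhs == rhs


def check_13_2(k):
    target = inv(poch(1, 1, INF))
    s1, s2, n = [0] * N, [0] * N, 0
    while n * n + k * n < N:
        e = n * n + k * n
        s1 = add(s1, shift(inv(mul(poch(1, 1, n), poch(1, 1, n + k))), e))
        num = mul(poch(-1, 1, n), poch(-1, 1, n + k))
        den = inv(mul(poch(1, 2, n), poch(1, 2, n + k)))
        s2 = add(s2, shift(mul(num, den), e))
        n += 1
    return s1 == target and s2 == target


def check_13_3():
    s, n = [0] * N, 0
    while n * (n + 1) // 2 < N:
        term = shift(inv(poch(1, 1, n)), n * (n + 1) // 2)
        s = add(s, [(-1) ** n * t for t in term])
        n += 1
    return s == poch(1, 1, INF)


print(check_13_1(), [check_13_2(k) for k in range(4)], check_13_3())
```

All arithmetic is on integers, so equality of the truncated coefficient lists is exact rather than approximate.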
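For $m = 1$ there are three irreducible characters. We will focus here on
$$\chi_{S\Lambda(1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{1}{3}\, \Theta_{1,\frac{3}{2}}(\tau) + \frac{2}{3}\, (\partial\Theta)_{1,\frac{3}{2}}(\tau) \right). \qquad (13.4)$$
We first notice a theta-function identity
$$(\partial\Theta)_{1,3/2}(\tau) = \frac{\eta(\tau)^3}{\mathfrak{f}(\tau)^2}$$
(essentially, a consequence of the Jacobi triple product identity), or equivalently
$$\frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, (\partial\Theta)_{1,3/2}(\tau) = \frac{\eta(\tau)^2}{\mathfrak{f}(\tau)}.$$
Now, we apply the relation
$$\mathfrak{f}(\tau) = \frac{\eta(\tau)^2}{\eta(\tau/2)\,\eta(2\tau)}$$
and (13.3), so we obtain
$$\frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, (\partial\Theta)_{1,3/2}(\tau) = \eta(2\tau)\,\eta(\tau/2) = q^{5/48} \prod_{n=1}^{\infty} (1 - q^{2n})(1 - q^{n/2})$$
$$= q^{5/48} \sum_{(m_1,m_2) \in \mathbb{Z}^2_{\ge 0}} \frac{(-1)^{m_1+m_2}\, (-q^{1/2}; q^{1/2})_{m_2}\, q^{m_1(m_1+1) + m_2(m_2+1)/4}}{(q^2)_{m_1} (q)_{m_2}}, \qquad (13.5)$$
where $(q^2)_{m_1} = (q^2; q^2)_{m_1}$.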
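The chain of equalities leading to (13.5) can be checked as an identity of power series in $x = q^{1/2}$ after stripping the fractional prefactors. The sketch below is an illustration only: it assumes the normalizations $q^{-1/6}(\partial\Theta)_{1,3/2} = \sum_n (3n+1)x^{3n^2+2n}$, $q^{1/48}\mathfrak{f} = \prod (1+x^{2n-1})$ and $q^{-1/24}\eta = \prod(1-x^{2n})$, which follow from the definitions in Sect. 12; the truncation order and helper names are ours.

```python
N = 40  # all series are truncated modulo x**N, where x = q**(1/2)


def one():
    return [1] + [0] * (N - 1)


def mul(a, b):
    c = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                c[i + j] += ai * b[j]
    return c


def inv(a):
    """Inverse of a power series with constant term 1."""
    b = [0] * N
    b[0] = 1
    for n in range(1, N):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b


def shift(a, e):
    return ([0] * e + a)[:N]


def add(a, b):
    return [s + t for s, t in zip(a, b)]


def poch(c, step, n):
    """prod_{r=1}^{n} (1 - c * x**(step*r)), truncated modulo x**N."""
    p = one()
    for r in range(1, n + 1):
        if step * r >= N:
            break
        factor = one()
        factor[step * r] -= c
        p = mul(p, factor)
    return p


def odd_product():
    """prod_{n>=1} (1 + x^(2n-1)), the series part of f(tau)."""
    p = one()
    for r in range(1, N):
        if 2 * r - 1 >= N:
            break
        factor = one()
        factor[2 * r - 1] += 1
        p = mul(p, factor)
    return p


def dtheta():
    """Series part of (d Theta)_{1,3/2}: sum over n of (3n+1) x^(3n^2+2n)."""
    s = [0] * N
    for n in range(-N, N + 1):
        e = 3 * n * n + 2 * n
        if e < N:
            s[e] += 3 * n + 1
    return s


def double_sum():
    """Right-hand side of (13.5) without the factor q^(5/48)."""
    s, m1 = [0] * N, 0
    while 2 * m1 * (m1 + 1) < N:
        m2 = 0
        while 2 * m1 * (m1 + 1) + m2 * (m2 + 1) // 2 < N:
            e = 2 * m1 * (m1 + 1) + m2 * (m2 + 1) // 2
            term = mul(poch(-1, 1, m2), inv(mul(poch(1, 4, m1), poch(1, 2, m2))))
            sign = (-1) ** (m1 + m2)
            s = add(s, [sign * t for t in shift(term, e)])
            m2 += 1
        m1 += 1
    return s


P, D, E = odd_product(), dtheta(), poch(1, 2, N)
eta_product = mul(poch(1, 4, N), poch(1, 1, N))  # series part of eta(2 tau) eta(tau/2)

theta_identity = mul(D, mul(P, P)) == mul(E, mul(E, E))   # dTheta = eta^3 / f^2
character_side = mul(mul(P, D), inv(E)) == eta_product     # f dTheta / eta = eta(2tau) eta(tau/2)
fermionic_side = double_sum() == eta_product               # double sum in (13.5)
print(theta_identity, character_side, fermionic_side)
```

The check only covers the first `N` coefficients, which is enough to detect a mistyped exponent or sign in the double sum.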
On the other hand, the Durfee rectangle identity (13.2) yields (after some computation)
$$\frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, \Theta_{1,3/2}(\tau) = \frac{q^{5/48}}{(-q; q)_\infty} \sum_{\substack{(m_1,m_2) \in \mathbb{Z}^2_{\ge 0} \\ m_1 \equiv m_2\ (2)}} \frac{(-q^{1/2}; q^{1/2})_{m_1} (-q^{1/2}; q^{1/2})_{m_2}\, q^{\frac{3(m_1-m_2)^2}{8} + \frac{m_1-m_2}{2} + \frac{m_1 m_2}{2}}}{(q)_{m_1} (q)_{m_2}}. \qquad (13.6)$$
Evidently, the double fermionic expressions for $(\partial\Theta)_{1,3/2}(\tau)$ and $\Theta_{1,3/2}(\tau)$ (cf. formulas (13.5) and (13.6), respectively) appear to have little in common, so it is unclear to us whether (13.4) admits a representation as a closed double fermionic sum. Thus, it appears that the $m = 1$ case is rather different compared to the triplet $W(2)$. This is perhaps reflected by the fact that the $p = 2$ triplet admits a fermionic construction, while such a realization seems to be absent for $SW(1)$ and its modules.
13.2. Irreducible SW(m) characters from W(2m + 1) characters. In this part we will be using character formulas of irreducible $W(p)$-modules (see for instance (6.34) and (6.35) in [AM2], or [FHST]). Recall
$$\mathfrak{f}_2(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 + q^n).$$
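The first result in this part is
Proposition 13.1.
(i) For $0 \le i \le m$, we have
$$\chi_{S\Lambda(i+1)}(\tau) = \frac{\chi_{\Lambda(2i+1)}(\frac{\tau}{2})}{\mathfrak{f}_2(\tau)}.$$
(ii) For $0 \le i \le m-1$, we also have
$$\chi_{S\Pi(m-i)}(\tau) = \frac{\chi_{\Pi(2m-2i)}(\frac{\tau}{2})}{\mathfrak{f}_2(\tau)}.$$
Here $\Lambda(i)$ and $\Pi(2m+2-i)$, $i = 1, \dots, 2m+1$, are irreducible $W(2m+1)$-modules [AM2].
Proof. The proof follows from character formulas for irreducible $W(p)$-modules, Theorem 12.1, and the following transformation formulas:
$$\Theta_{2j,2m+1}(\tau/2) = \Theta_{j,\frac{2m+1}{2}}(\tau), \qquad (\partial\Theta)_{2j,2m+1}(\tau/2) = 2\,(\partial\Theta)_{j,\frac{2m+1}{2}}(\tau), \qquad \frac{1}{\eta(\tau/2)} = \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, \mathfrak{f}_2(\tau).$$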
We recall two multi-sum identities obtained recently by Warnaar [Wa] (these identities are essentially conjectures from [FGK]):
Theorem 13.1. For $\lambda = 0, \dots, p$ and $\sigma \in \{0, 1\}$ we have
$$\sum_{\substack{n_1, \dots, n_p = 0 \\ n_{p-1} + n_p \equiv 0\ (2)}}^{\infty} \frac{q^{\sum_{i,j=1}^{p} B_{i,j} n_i n_j + \frac{\lambda}{2}(n_{p-1} - n_p + \sigma) - \sigma p/4}}{(q; q)_{n_1} \cdots (q; q)_{n_p}} = \frac{1}{(q; q)_\infty} \sum_{n \in \mathbb{Z}} q^{p n^2 + (\lambda - \sigma p) n} \qquad (13.7)$$
and
$$\sum_{\substack{n_1, \dots, n_p = 0 \\ n_{p-1} + n_p \equiv 0\ (2)}}^{\infty} \frac{q^{\sum_{i,j=1}^{p} B_{i,j} n_i n_j + \frac{\lambda}{2}(n_{p-1} + n_p + \sigma) + \sum_{i=p-\lambda}^{p-2} (i - p + \lambda + 1) n_i - \sigma p/4}}{(q; q)_{n_1} \cdots (q; q)_{n_p}} = \frac{1}{(q; q)_\infty} \sum_{n \in \mathbb{Z}} (2n - \sigma + 1)\, q^{p n^2 + (\lambda - \sigma p) n}, \qquad (13.8)$$
where $B_{i,j}$ are entries of the inverse Cartan matrix of the Lie algebra $D_p$.
Equipped with Warnaar's formulas and Proposition 13.1 it is now not hard to prove the next result.
Theorem 13.2. We have the following formulas for irreducible $SW(m)$-characters:
$$q^{-1/16} \chi_{S\Lambda(m+1)}(\tau) = \sum_{\substack{n_1, \dots, n_{2m+1} = 0 \\ n_{2m} + n_{2m+1} \equiv 0\ (2)}}^{\infty} \frac{(-q^{1/2}; q^{1/2})_{n_1} \cdots (-q^{1/2}; q^{1/2})_{n_{2m+1}}\, q^{\sum_{k,l=1}^{2m+1} B_{k,l} n_k n_l / 2}}{(-q; q)_\infty\, (q; q)_{n_1} \cdots (q; q)_{n_{2m+1}}}. \qquad (13.9)$$
For $i = 0, \dots, m-1$, we have
$$q^{-a_{i,m}} \chi_{S\Lambda(i+1)}(\tau) = \sum_{\substack{n_1, \dots, n_{2m+1} = 0 \\ n_{2m} + n_{2m+1} \equiv 0\ (2)}}^{\infty} \frac{(-q^{1/2}; q^{1/2})_{n_1} \cdots (-q^{1/2}; q^{1/2})_{n_{2m+1}}\, q^{\sum_{k,l=1}^{2m+1} B_{k,l} n_k n_l / 2 + (m-i)(n_{2m} + n_{2m+1})/2 + \sum_{k=2i+1}^{2m-1} (k-2i) n_k / 2}}{(-q; q)_\infty\, (q; q)_{n_1} \cdots (q; q)_{n_{2m+1}}},$$
and
$$q^{-b_{i,m}} \chi_{S\Pi(m-i)}(\tau) = \sum_{\substack{n_1, \dots, n_{2m+1} = 0 \\ n_{2m} + n_{2m+1} \equiv 1\ (2)}}^{\infty} \frac{(-q^{1/2}; q^{1/2})_{n_1} \cdots (-q^{1/2}; q^{1/2})_{n_{2m+1}}\, q^{\sum_{k,l=1}^{2m+1} B_{k,l} n_k n_l / 2 + (m-i)(n_{2m} + n_{2m+1})/2 + \sum_{k=2i+1}^{2m-1} (k-2i) n_k / 2}}{(-q; q)_\infty\, (q; q)_{n_1} \cdots (q; q)_{n_{2m+1}}},$$
where $a_{i,m}$ and $b_{i,m}$ are certain rational numbers.
Proof. We prove the middle formula only. The other two formulas follow along the same lines. Recall that
$$\chi_{S\Lambda(i+1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{2i+1}{2m+1}\, \Theta_{m-i,\frac{2m+1}{2}}(\tau) + \frac{2}{2m+1}\, (\partial\Theta)_{m-i,\frac{2m+1}{2}}(\tau) \right). \qquad (13.10)$$
Now,
$$\frac{2i+1}{2m+1}\, \Theta_{m-i,\frac{2m+1}{2}}(\tau) + \frac{2}{2m+1}\, (\partial\Theta)_{m-i,\frac{2m+1}{2}}(\tau) = q^{(m-i)^2/(2(2m+1))} \sum_{n \in \mathbb{Z}} (2n+1)\, q^{\frac{(2m+1)n^2 + 2(m-i)n}{2}}.$$
Finally, if we substitute $q^{1/2}$ for $q$ in (13.8), let $p = 2m+1$, $\sigma = 0$, $\lambda = 2m-2i$, and apply formula (13.1) and the simple identity
$$\frac{(-q^{1/2}; q^{1/2})_n}{(q)_n} = \frac{1}{(q^{1/2}; q^{1/2})_n},$$
the proof follows.
14. A Conjectural Relation of SW(m) with Quantum Groups
Let $\hat{g}$ be an untwisted affine Kac-Moody Lie algebra. Then there is a well-known (Kazhdan-Lusztig) equivalence between the tensor category of $L_{\hat{g}}(k, 0)$-modules, $k \in \mathbb{N}$, and the semisimple part of the tensor category of $U_q(g)$-modules, where $q$ is a certain root of unity (not to be confused with $q = e^{2\pi i \tau}$ used in the previous section) depending on the level $k$ and $g$ [Fi]. Notice that on the quantum group side we have a semisimplified category, and not the full category of $U_q(g)$-modules. In [FGST1, FGST2] (see also [Se]) the authors proposed a remarkable equivalence between the (enhanced) tensor category of $W(p)$-modules and the category of $U_q(sl_2)$-modules, $q = e^{i\pi/p}$, where $U_q(sl_2)$ is the restricted finite-dimensional quantum group. While this is still a conjecture for $p > 2$, the same authors established an important weaker equivalence between the $SL(2,\mathbb{Z})$-module $Z_{cft}$ formed by generalized $W(p)$ characters and the $SL(2,\mathbb{Z})$-module $Z$, the center of $U_q(sl_2)$.
Thus, it is a natural question to find the Kazhdan-Lusztig dual of the category of ordinary and logarithmic $SW(m)$-modules. In our case the relevant space of generalized characters is the $\Gamma_\theta$-invariant subspace described in Theorem 12.1, which is $(3m+1)$-dimensional. As indicated in the introduction, we believe that the quantum group $U_q^{small}(sl_2)$, $q = e^{\frac{2\pi i}{2m+1}}$, is relevant for the supertriplet $SW(m)$. Here is some evidence. Firstly, both $SW(m)$ and $U_q^{small}(sl_2)$ have the same number of inequivalent irreducible representations. Also, in [Ker] (see also [La]) it was proven that the center of $U_q^{small}(sl_2)$ is $(3m+1)$-dimensional, and that it carries a projective action of the modular group. Notice that $3m+1$ is also (conjecturally) the dimension of the center of $A(SW(m))$. Thus, in parallel with [FGST1], we expect the following conjecture to be true.
Conjecture 14.1. The category of weak $SW(m)$-modules is equivalent to the category of $U_q^{small}(sl_2)$-modules, where $q = e^{\frac{2\pi i}{2m+1}}$.
Finally, Proposition 13.1 is a strong indication of the possibility that the category of $SW(m)$-modules is related to a subcategory of $W(2m+1)$- and $U_q(sl_2)$-modules, $q = e^{\pi i/(2m+1)}$.
15. Outlook and Final Remarks
There are several research directions we plan to pursue in the future. Let us mention only a few that we found the most interesting.
(i) The most important problem that we left open is the existence and description of logarithmic $SW(m)$-modules. We strongly believe the ideas based on modular invariance as in [AM2] could be successfully applied to the super triplet.
(ii) As with any $N = 1$ vertex operator superalgebra, the most obvious next step would be to examine the category of $\sigma$-twisted $SW(m)$-modules, where $\sigma$ is the parity automorphism. As we already indicated (cf. Theorem 12.1), the space of $SL(2,\mathbb{Z})$-transforms of characters of irreducible $SW(m)$-modules should close up to a finite-dimensional vector space. Supposedly, characters of irreducible $\sigma$-twisted modules are included in the same vector space (cf. Remark 12.1); see [AM3].
(iii) Singular vectors in Feigin-Fuchs modules for the $N = 1$ Neveu-Schwarz algebra certainly deserve more attention. We expect these vectors to have a description in terms of modified Jack polynomials and as kernels of super Calogero-Sutherland operators. Similar results for the Virasoro algebra have been obtained in [MY].
(iv) Our fermionic expressions for the $SW(m)$-characters indicate the possibility of parafermionic (or quasiparticle) bases for $SW(m)$-modules. For the triplet $W(p)$ this problem has been resolved in [FFT].
16. Appendix
Here we prove Theorem 11.1 and give strong evidence that in Proposition 11.3 the polynomial $g(x)$ is nonzero for every $m$. In the process of proving these results we discovered certain constant term identities which are of independent interest. We recall
$$U_{F,E} := \mathrm{Res}_z\, \frac{(1+z)^{2m}}{z}\, Y(F, z) E \in \overline{SM(1)}.$$
Then we have
$$U_{F,E}(0) := o(U_{F,E}) = \sum_{i \ge 0} \binom{2m}{i}\, o(F_{i-1} E).$$
In Proposition 8.4 we proved that inside $A(\overline{SM(1)})$ we have the relation
$$[U_{F,E}] = g([\omega])[\widehat{H}].$$
Because of the homomorphism from $A(\overline{SM(1)})$ to $A(SW(m))$ and Proposition 11.3 it is sufficient to show that $U_{F,E}(0)$ acts nontrivially on the top component of at least one $\overline{SM(1)}$-module $M(1, \lambda) \otimes F$.
Proposition 16.1. Let $v_\lambda$ be the highest weight vector in $M(1, \lambda) \otimes F$. Then we have
$$U_{F,E}(0) \cdot v_\lambda = -G_m(t)\, v_\lambda,$$
where $t = \langle \lambda, \alpha \rangle$ and
$$G_m(t) = \sum_{l=1}^{2m+1} \sum_{i=0}^{l-1} \sum_{j=0}^{2m+1+i-l} \sum_{k=0}^{l-1-i} (-1)^{j+k+l} \binom{2m+1}{l} \binom{-2m-1}{j} \binom{-2m-1}{k} \binom{2m-t}{j+k+2m+1} \binom{t}{i-j-l+2m+1} \binom{t}{l-k-1-i}.$$
Proof. It is not hard to see that
$$U_{F,E} = \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \mathrm{Res}_{z_3} \sum_{i=0}^{2m} \frac{(1+z_1)^{2m}}{z_1}\, z_2^{-i-1} z_3^{i}\, Y(e^{-\alpha}, z_1) Y(e^{\alpha}, z_2) Y(e^{\alpha}, z_3) e^{-\alpha} + w,$$
where $w \in T$. By repeatedly using the well-known formula (cf. [LL])
$$E^+(\delta, x) E^-(\gamma, y) = (1 - y/x)^{\langle \delta, \gamma \rangle} E^-(\gamma, y) E^+(\delta, x),$$
which holds for every $\delta, \gamma \in \mathbb{Z}\beta$, we get
$$U_{F,E} = \sum_{i=0}^{2m} \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \mathrm{Res}_{z_3}\, \frac{(1+z_1)^{2m}}{z_1}\, z_2^{-i-1} z_3^{i}\, (z_1 z_2 z_3)^{-2m-1} \cdot (1 - z_2/z_1)^{-2m-1} (1 - z_3/z_1)^{-2m-1} (z_2 - z_3)^{2m+1} E^-(\alpha, z_1) E^-(-\alpha, z_2) E^-(-\alpha, z_3) + w.$$
The previous formula together with
$$o(E^-(\beta, x)) \cdot v_\lambda = (1+x)^{-\langle \beta, \lambda \rangle} v_\lambda$$
and $o(w) v_\lambda = 0$ implies
$$U_{F,E}(0) \cdot v_\lambda = \sum_{i=0}^{2m} \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \mathrm{Res}_{z_3}\, \frac{(1+z_1)^{2m}}{z_1}\, z_2^{-i-1} z_3^{i}\, (z_1 z_2 z_3)^{-2m-1} \cdot (1 - z_2/z_1)^{-2m-1} (1 - z_3/z_1)^{-2m-1} (z_2 - z_3)^{2m+1} (1+z_1)^{-t} (1+z_2)^{t} (1+z_3)^{t}\, v_\lambda.$$
The rest follows by expanding generalized rational functions with respect to standard conventions in vertex algebra theory and extracting the residues in all three variables.
If we view the parameter $t$ as a variable, the expression $G_m(t)$ is a polynomial in $t$ of degree at most $4m+1$. However, it is a priori not clear that the polynomial $G_m(t)$ is nonzero. Computations for small $m$ led us to the hypothesis stated as Conjecture 16.1.
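The quadruple sum of Proposition 16.1 is easy to evaluate exactly for small $m$ and integer $t$. The following is a minimal sketch of such a computation, not the authors' Mathematica code: the helper `gbinom` is our name for the generalized binomial coefficient $\binom{n}{k} = n(n-1)\cdots(n-k+1)/k!$ with $n$ an arbitrary integer, and `closed_form` encodes the product $\binom{2m}{m}^2\binom{t+m}{4m+1}$ that the small cases suggest.

```python
from math import comb, factorial


def gbinom(n, k):
    """Generalized binomial coefficient for any integer n and integer k."""
    if k < 0:
        return 0
    num = 1
    for r in range(k):
        num *= n - r
    return num // factorial(k)  # exact: a product of k consecutive integers


def G(m, t):
    """The quadruple sum G_m(t) of Proposition 16.1 at an integer t."""
    total = 0
    for l in range(1, 2 * m + 2):
        for i in range(0, l):
            for j in range(0, 2 * m + 2 + i - l):
                for k in range(0, l - i):
                    total += (
                        (-1) ** (j + k + l)
                        * comb(2 * m + 1, l)
                        * gbinom(-2 * m - 1, j)
                        * gbinom(-2 * m - 1, k)
                        * gbinom(2 * m - t, j + k + 2 * m + 1)
                        * gbinom(t, i - j - l + 2 * m + 1)
                        * gbinom(t, l - k - 1 - i)
                    )
    return total


def closed_form(m, t):
    """Candidate closed form suggested by small cases (a hypothesis, not a theorem)."""
    return comb(2 * m, m) ** 2 * gbinom(t + m, 4 * m + 1)


for m in (1, 2):
    ts = range(0, 4 * m + 3)
    print(m, [G(m, t) for t in ts], all(G(m, t) == closed_form(m, t) for t in ts))
```

For $m = 1$ the values at $t = 0, 1, 2, 3, 4$ can be confirmed by hand; in particular $G_1(4) = 4 \equiv 1 \bmod 3$, in line with the reduction modulo $2m+1$ used later in this Appendix.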
Conjecture 16.1.
$$G_m(t) = \binom{2m}{m}^2 \binom{t+m}{4m+1}.$$
We checked this conjecture by using a Mathematica package for every $m \le 20$. As in Sect. 11, by using the representation theory of $SW(m)$ it is not hard to see that $\binom{t+m}{4m+1}$ must divide $G_m(t)$ for every $m$. Since $\deg(G_m(t)) \le 4m+1$, we have
$$G_m(t) = A_m \binom{t+m}{4m+1}, \qquad (16.1)$$
for some constant $A_m$. But even proving $A_m \neq 0$ seems to be a nontrivial problem.
Proposition 16.2. Let $2m+1$ be prime. Then $G_m(t) \neq 0$.
Proof. We will prove this result by reduction mod $2m+1$. Let $p = 2m+1$ be a prime. It is not hard to see that in fact $G_m(a) \in \mathbb{Z}_{(p)}$, for every $a \in \mathbb{Z}$ (in other words, $G_m(a)$ is $p$-integral). Thus it is sufficient to prove that for some $t = t_0$ we have $G_m(t_0) \neq 0 \bmod p$. We take $t_0 = 3m+1$ and examine
$$G_m(3m+1) = \sum_{l=1}^{2m+1} \sum_{i=0}^{l-1} \sum_{j=0}^{2m+1+i-l} \sum_{k=0}^{l-1-i} (-1)^{j+k+l} \binom{2m+1}{l} \binom{-2m-1}{j} \binom{-2m-1}{k} \binom{-m-1}{j+k+2m+1} \binom{3m+1}{i-j-l+2m+1} \binom{3m+1}{l-k-1-i}.$$
The finite sum $G_m(3m+1)$ has many terms divisible by $p$. For instance, in the summation, all terms $\binom{2m+1}{l} \equiv 0 \bmod p$ unless $l = 2m+1$. After some analysis it is not hard to see that a possible nontrivial (mod $p$) contribution comes only if $k = j = 0$ and $l = 2m+1$ (in other cases at least one binomial coefficient is divisible by $p$). Thus we get:
$$G_m(3m+1) \equiv \sum_{i=0}^{2m} (-1) \binom{-m-1}{2m+1} \binom{3m+1}{i} \binom{3m+1}{2m-i} \bmod p.$$
Observe the basic relation
$$\binom{-m-1}{2m+1} = -\binom{3m+1}{2m+1} = -\binom{3m+1}{m}.$$
Also, for $i$ as in the summation we have
$$\binom{3m+1}{i} \binom{3m+1}{2m-i} \equiv 0 \bmod p, \qquad i \neq m.$$
However, for $i = m$ we have
$$\binom{3m+1}{m} \equiv 1 \cdot 1^{-1}\, 2 \cdot 2^{-1} \cdots m \cdot m^{-1} \equiv 1 \bmod p.$$
Consequently, the summation reduces to a single term
$$G_m(3m+1) \equiv \binom{3m+1}{m}^3 \equiv 1 \bmod p.$$
Notice that the previous computations support our Conjecture 16.1 because
$$\binom{2m}{m} \equiv \pm 1 \bmod p,$$
so that for $t = 3m+1$,
$$\binom{2m}{m}^2 \binom{t+m}{4m+1} = \binom{2m}{m}^2 \binom{4m+1}{4m+1} \equiv 1 \bmod p.$$
Remark 16.1. Because of the interesting arithmetic involved in Propositions 16.1 and 16.2, we plan to return to Conjecture 16.1 in our future work.
Acknowledgement. We thank the anonymous referee for his/her valuable comments.
References
[Ab] Abe, T.: A $\mathbb{Z}_2$-orbifold model of the symplectic fermionic vertex operator superalgebra. Math. Z. 255, 755–792 (2007)
[A1] Adamović, D.: Rationality of Neveu-Schwarz vertex operator superalgebras. Internat. Math. Res. Notices 17, 865–874 (1997)
[A2] Adamović, D.: Representations of the vertex algebra $W_{1+\infty}$ with a negative integer central charge. Comm. Algebra 29(7), 3153–3166 (2001)
[A3] Adamović, D.: Classification of irreducible modules of certain subalgebras of free boson vertex algebra. J. Algebra 270, 115–132 (2003)
[AM1] Adamović, D., Milas, A.: Logarithmic intertwining operators and W(2, 2p − 1)-algebras. J. Math. Physics 48, 073503 (2007)
[AM2] Adamović, D., Milas, A.: On the triplet vertex algebra W(p). Adv. Math. 217, 2664–2699 (2008)
[AM3] Adamović, D., Milas, A.: The N = 1 triplet vertex operator superalgebras: twisted sector. SIGMA 4(87), 1–24 (2008)
[Ar] Arakawa, T.: Representation theory of superconformal algebras and the Kac-Roan-Wakimoto conjecture. Duke Math. J. 130, 435–478 (2005)
[BS] Bouwknegt, P., Schoutens, K.: W-symmetry in Conformal Field Theory. Phys. Rept. 223, 183–276 (1993)
[CF] Carqueville, N., Flohr, M.: Nonmeromorphic operator product expansion and $C_2$-cofiniteness for a family of W-algebras. J. Phys. A: Math. Gen. 39, 951–966 (2006)
[D] Dong, C.: Vertex algebras associated with even lattices. J. Algebra 160, 245–65 (1993)
[DL] Dong, C., Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Boston: Birkhäuser, 1993
[DJ] Dong, C., Jiang, C.: Rationality of vertex operator algebras. http://arxiv.org/abs/math/0607679.v1[math.QA], 2006
[DK] De Sole, A., Kac, V.: Finite vs. affine W-algebras. Japanese J. Math. 1, 137–261 (2006)
[EFHHNV] Eholzer, W., Flohr, M., Honecker, A., Hubel, R., Nahm, W., Vernhagen, R.: Representations of W-algebras with two generators and new rational models. Nucl. Phys. B 383, 249–288 (1992)
[FFR] Feingold, A.J., Frenkel, I.B., Ries, J.: Spinor Construction of Vertex Operator Algebras, Triality, and $E_8^{(1)}$. Cont. Math. 121, Providence, RI: Amer. Math. Soc., 1991
[FRW] Feingold, A.J., Ries, J., Weiner, M.: Spinor construction of the c = 1/2 minimal model. In: Moonshine, The Monster and related topics, Contemporary Math. 193, Chongying Dong, Geoffrey Mason, eds., Providence, RI: Amer. Math. Soc., 1995, pp. 45–92
[FF] Feigin, B., Fuchs, D.B.: Representations of the Virasoro algebra. In: Representations of infinite-dimensional Lie groups and Lie algebras. New York: Gordon and Breach, 1989
[FFT] Feigin, B., Feigin, E., Tipunin, I.: Fermionic formulas for (1, p) logarithmic model characters in $\Phi_{2,1}$ quasiparticle realisation. http://arxiv.org/abs/0704.2464.v4[hepth], 2007
[FGST1] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: Modular group representations and fusion in logarithmic conformal field theories and in the quantum group center. Commun. Math. Phys. 265, 47–93 (2006)
[FGST2] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: The Kazhdan-Lusztig correspondence for the representation category of the triplet W-algebra in logarithmic conformal field theories. Theor. Math. Phys. 148, 1210–1235 (2006)
[FHL] Frenkel, I.B., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Amer. Math. Soc. 104, 1993
[Fi] Finkelberg, M.: An equivalence of fusion categories. Geom. Funct. Anal. 249–267 (1996)
[F1] Flohr, M.: On modular invariant partition functions of conformal field theories with logarithmic operators. Internat. J. Modern Phys. A 11, 4147–4172 (1996)
[F2] Flohr, M.: Bits and pieces in logarithmic conformal field theory. Proceedings of the School and Workshop on Logarithmic Conformal Field Theory and Its Applications (Tehran, 2001). Internat. J. Mod. Phys. A 18, 4497–4591 (2003)
[FGK] Flohr, M., Grabow, C., Koehn, M.: Fermionic formulas for the characters of $c_{p,1}$ logarithmic field theory. Nucl. Phys. B 768, 263–276 (2007)
[FB] Frenkel, E., Ben-Zvi, D.: Vertex algebras and algebraic curves. Mathematical Surveys and Monographs, no. 88, Providence, RI: Amer. Math. Soc., 2001
[FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math, Vol. 134. New York: Academic Press, 1988
[FZ] Frenkel, I.B., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[FHST] Fuchs, J., Hwang, S., Semikhatov, A.M., Tipunin, I.Yu.: Nonsemisimple fusion algebras and the Verlinde formula. Commun. Math. Phys. 247(3), 713–742 (2004)
[GK1] Gaberdiel, M., Kausch, H.G.: A rational logarithmic conformal field theory. Phys. Lett. B 386, 131–137 (1996)
[GK2] Gaberdiel, M., Kausch, H.G.: A local logarithmic conformal field theory. Nucl. Phys. B 538, 631–658 (1999)
[GL] Gao, Y., Li, H.: Generalized vertex algebras generated by parafermion-like vertex operators. J. Algebra 240, 771–807 (2001)
[HK] Heluani, R., Kac, V.: SUSY lattice vertex algebras. http://arxiv.org/abs/0710.1587.v1[math.QA], 2007
[H] Honecker, A.: Automorphisms of W algebras and extended rational conformal field theories. Nucl. Phys. B 400, 574–596 (1993)
[HLZ] Huang, Y.-Z., Lepowsky, J., Zhang, L.: Logarithmic tensor product theory for generalized modules for a conformal vertex algebra. http://arxiv.org/abs/0710.2687.v3[math.QA], 2007
[HM] Huang, Y.-Z., Milas, A.: Intertwining operator superalgebras and vertex tensor categories for superconformal algebras, I. Commun. Contemp. Math. 4, 327–355 (2002)
[IK1] Iohara, K., Koga, Y.: Fusion algebras for N = 1 superconformal field theories through coinvariants, II, N = 1 super-Virasoro-symmetry. J. Lie Theory 11, 305–337 (2001)
[IK2] Iohara, K., Koga, Y.: Representation theory of the Neveu-Schwarz and Ramond algebras II: Fock modules. Ann. Inst. Fourier, Grenoble 53(6), 1755–1818 (2003)
[K] Kac, V.G.: Vertex Algebras for Beginners. University Lecture Series, Second Edition. Providence, RI: Amer. Math. Soc., Vol. 10, 1998
[KWn] Kac, V.G., Wang, W.: Vertex operator superalgebras and their representations. Contemp. Math. 175, 161–191 (1994)
[KWak] Kac, V., Wakimoto, M.: Quantum Reduction and Representation Theory of Superconformal Algebras. Adv. Math. 185, 400–458 (2004)
[Ker] Kerler, T.: Mapping class group actions on quantum doubles. Commun. Math. Phys. 168(2), 353–388 (1995)
[La] Lachowska, A.: On the center of the small quantum group. J. Algebra 262, 313–331 (2003)
[LL] Lepowsky, J., Li, H.: Introduction to Vertex Operator Algebras and Their Representations. Progress in Mathematics Vol. 227. Boston: Birkhäuser, 2003
[Li] Li, H.: Local systems of vertex operators, vertex superalgebras and modules. J. Pure Appl. Algebra 109, 143–195 (1996)
[MS] Mavromatos, N., Szabo, R.: The Neveu-Schwarz and Ramond algebras of logarithmic superconformal field theory. JHEP 0301, 041 (2003)
[MR] Meurman, A., Rocha-Caridi, A.: Highest weight representations of the Neveu-Schwarz and Ramond algebras. Commun. Math. Phys. 107, 263–294 (1986)
[M1] Milas, A.: Fusion rings for degenerate minimal models. J. Algebra 254(2), 300–335 (2002)
[M2] Milas, A.: Weak modules and logarithmic intertwining operators for vertex operator algebras. In: Recent developments in infinite-dimensional Lie algebras and conformal field theory (Charlottesville, VA, 2000), Contemp. Math. 297, Providence, RI: Amer. Math. Soc., 2002, pp. 201–225
[M3] Milas, A.: Logarithmic intertwining operators and vertex operators. Commun. Math. Phys. 277, 497–529 (2008)
[M4] Milas, A.: Characters, Supercharacters and Weber modular functions. Crelle's Journal 608, 35–64 (2007)
[MY] Mimachi, K., Yamada, Y.: Singular vectors of the Virasoro algebra in terms of Jack symmetric polynomials. Commun. Math. Phys. 174, 447–455 (1995)
[Miy] Miyamoto, M.: Modular invariance of vertex operator algebras satisfying $C_2$-cofiniteness. Duke Math. J. 122, 51–91 (2004)
[Se] Semikhatov, A.: Factorizable ribbon quantum groups in logarithmic conformal field theories. Theo. Math. Phys. 154, 433–453 (2008)
[Wa] Warnaar, S.O.: Proof of the Flohr-Grabow-Koehn conjecture for characters of logarithmic field theory. J. Phys. A: Math. Theor. 40, 12243–12254 (2007)
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. Internat. Math. Res. Notices 71(1), 197–211 (1993)
[Z] Zhu, Y.-C.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 288, 271–285 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0734-3
On the Reeh-Schlieder Property in Curved Spacetime

Ko Sanders

Department of Mathematics, University of York, Heslington, York, YO10 5DD, United Kingdom. E-mail: [email protected]; [email protected]

Received: 26 February 2008 / Accepted: 27 November 2008
Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com
Dedicated to Klaas Landsman, out of gratitude for the support he offered when it was most needed

Abstract: We attempt to prove the existence of Reeh-Schlieder states on curved spacetimes in the framework of locally covariant quantum field theory using the idea of spacetime deformation and assuming the existence of a Reeh-Schlieder state on a diffeomorphic (but not isometric) spacetime. We find that physically interesting states with a weak form of the Reeh-Schlieder property always exist and indicate their usefulness. Algebraic states satisfying the full Reeh-Schlieder property also exist, but are not guaranteed to be of physical interest.

1. Introduction

The Reeh-Schlieder theorem ([17]) is a result in axiomatic quantum field theory which states that for a scalar Wightman field in Minkowski spacetime any state in the Hilbert space can be approximated arbitrarily well by acting on the vacuum with operations performed in any prescribed open region. The physical meaning of this is that the vacuum state has very many non-local correlations and an experimenter in any given region can exploit the vacuum fluctuations by performing a suitable measurement in order to produce any desired state up to arbitrary accuracy.

The original proof uses analytic continuation arguments, an approach which was extended to analytic spacetimes in [20] by replacing the spectrum condition of the Wightman axioms by an analytic microlocal spectrum condition. For spacetimes which are not analytic, a result by Strohmaier [19], extending an earlier result by Verch [21], shows that in a stationary spacetime all ground and thermal (KMS-)states of several types of free fields (including the Klein-Gordon, Dirac and Proca field) also have the Reeh-Schlieder property. To prove the existence of such states directly one may need to make further assumptions, depending on the type of field (see [19]).
Furthermore, the condition of [20] can be weakened to a smoothly covariant condition that implies the Reeh-Schlieder property as well as physical relevance (i.e. the microlocal spectrum condition), but this condition does not seem to be a suitable tool to find such states (see [18] Sect. 5.4).
In this paper we will investigate whether we can find states of a quantum field system in a general (globally hyperbolic) curved spacetime which have the Reeh-Schlieder property. We do this using the technique of spacetime deformation, as pioneered in [9] and as applied successfully to prove a spin-statistics theorem in curved spacetime in [23]. This means that we assume the existence of a Reeh-Schlieder state (i.e. a state with the Reeh-Schlieder property) in one spacetime and try to derive the existence of another state in a diffeomorphic (but not isometric) spacetime which also has the Reeh-Schlieder property. We will prove that for every given region there is a state in the physical state space that has the Reeh-Schlieder property for that particular region (but maybe not for all regions). Algebraic states with the full Reeh-Schlieder property also exist, i.e. states which have the Reeh-Schlieder property for all open regions simultaneously. However, their existence follows from an abstract existence principle and, consequently, such states are not guaranteed to be of any physical interest.

To keep the discussion as general as possible we will work in the axiomatic language known as locally covariant quantum field theory as introduced in [5] (see also [23], where some of these ideas already appeared, and [6] for a recent application). We outline this formulation in Sect. 2 and our most important assumption there will be the time-slice axiom, which expresses the existence of a causal dynamical law. In Sect. 3 we will prove the geometric results on spacetime deformation that we need and we will see what they mean for a locally covariant quantum field theory. Section 4 contains our main results on deforming one Reeh-Schlieder state into another one and it notes some immediate consequences regarding the type of local algebras and Tomita-Takesaki modular theory. As an example we discuss the free scalar field in Sect. 5 and we end with a few conclusions.

2. Locally Covariant Quantum Field Theory

In this section we briefly describe the main ideas of locally covariant quantum field theory as introduced in [5]. It will also serve to fix our notation for the subsequent sections.

In the following any quantum physical system will be described by a $C^*$-algebra $A$ with a unit $I$, whose self-adjoint elements are the observables of the system. It will be advantageous to consider a whole class of possible systems rather than just one.

Definition 2.1. The category Alg has as its objects all unital $C^*$-algebras $A$ and as its morphisms all injective $*$-homomorphisms $\alpha$ such that $\alpha(I) = I$. The product of morphisms is given by the composition of maps and the identity map $\mathrm{id}_A$ on a given object serves as an identity morphism.

A morphism $\alpha : A_1 \to A_2$ expresses the fact that the system described by $A_1$ is a sub-system of that described by $A_2$, which is called a super-system. The injectivity of the morphisms means that, as a matter of principle, any observable of a sub-system can always be measured, regardless of any practical restrictions that a super-system may impose.

A state of a system is represented by a normalised positive linear functional $\omega$, i.e. $\omega(A^*A) \ge 0$ for all $A \in A$ and $\omega(I) = 1$. The set of all states on $A$ will be denoted by $A^{*+}_1$. Not all of these states are of physical interest, so it will be convenient to have the following notion at our disposal.

Definition 2.2. The category States has as its objects all subsets $S \subset A^{*+}_1$, for all unital $C^*$-algebras $A$ in Alg and as its morphisms all maps $\alpha^* : S_1 \to S_2$ for which
$S_i \subset (A_i)^{*+}_1$, $i = 1, 2$, and $\alpha^*$ is the restriction of the dual of a morphism $\alpha : A_2 \to A_1$ in Alg, i.e. $\alpha^*(\omega) = \omega \circ \alpha$ for all $\omega \in S_1$. Again the product of morphisms is given by the composition of maps and the identity map $\mathrm{id}_S$ on a given object serves as an identity morphism.
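The following short check is added as an illustration and is not part of the original text. It confirms that the map $\alpha^*$ of Definition 2.2 sends states to states, assuming only that $\alpha : A_2 \to A_1$ is a unit-preserving $*$-homomorphism as in Definition 2.1.

```latex
% For \omega \in S_1 \subset (A_1)^{*+}_1 and an arbitrary element A \in A_2:
\begin{align*}
  \alpha^*(\omega)(A^*A) &= \omega(\alpha(A^*A)) = \omega(\alpha(A)^*\alpha(A)) \ge 0, \\
  \alpha^*(\omega)(I)    &= \omega(\alpha(I)) = \omega(I) = 1 .
\end{align*}
% Hence \alpha^*(\omega) = \omega \circ \alpha is a normalised positive
% linear functional on A_2, i.e. an element of (A_2)^{*+}_1.
```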
After these operational aspects we now turn to the physical ones. The systems we will consider are intended to model quantum fields living in a (region of) spacetime which is endowed with a fixed Lorentzian metric (a background gravitational field). The relation between sub-systems will come about naturally by considering sub-regions of spacetime. More precisely we consider the following:

Definition 2.3. By the term globally hyperbolic spacetime we will mean a connected, Hausdorff, paracompact, $C^\infty$ Lorentzian manifold $M = (\mathcal{M}, g)$ of dimension $d = 4$, which is oriented, time-oriented and admits a Cauchy surface.

A subset $O \subset M$ of a globally hyperbolic spacetime $M$ is called causally convex iff for all $x, y \in O$ all causal curves from $x$ to $y$ lie entirely in $O$. A non-empty open set which is connected and causally convex is called a causally convex region or cc-region. A cc-region whose closure is compact is called a bounded cc-region.

The category Man has as its objects all globally hyperbolic spacetimes $M = (\mathcal{M}, g)$ and its morphisms are given by all maps $\psi : M_1 \to M_2$ which are smooth isometric embeddings (i.e. $\psi : M_1 \to \psi(M_1)$ is a diffeomorphism and $\psi_* g_1 = g_2|_{\psi(M_1)}$) such that the orientation and time-orientation are preserved and $\psi(M_1)$ is causally convex. Again the product of morphisms is given by the composition of maps and the identity map $\mathrm{id}_M$ on a given object serves as a unit.

A region $O$ in a globally hyperbolic spacetime is causally convex if and only if $O$ itself is globally hyperbolic (see [11] Sect. 6.6), so a cc-region is exactly a connected globally hyperbolic region. The image of a morphism is by definition a cc-region. Notice that the converse also holds. If $O \subset M$ is a cc-region then $(O, g|_O)$ defines a globally hyperbolic spacetime in its own right. In this case there is a canonical morphism $I_{M,O} : O \to M$ given by the canonical embedding $\iota : O \to M$. We will often drop $I_{M,O}$ and $\iota$ from the notation and simply write $O \subset M$.
The importance of causally convex sets is that for any morphism $\Psi$ the causality structure of $M_1$ coincides with that of $\psi(M_1)$ in $M_2$:

$$\psi(J^\pm_{M_1}(x)) = J^\pm_{M_2}(\psi(x)) \cap \psi(M_1), \qquad x \in M_1. \tag{1}$$
If this were not the case then the behaviour of a quantum physical system living in $M_1$ could depend in an essential way on the super-system, which makes it practically impossible to study the smaller system as a sub-system in its own right. This possibility is therefore excluded from the mathematical framework.

Equation (1) allows us to drop the subscript in $J^\pm_M$ if we introduce the convention that $J^\pm$ is always taken in the largest spacetime under consideration. This simplifies the notation without causing any confusion, even when $O \subset M_1 \subset M_2$ with canonical embeddings, because then we just have $J^\pm(O) := J^\pm_{M_2}(O)$ and $J^\pm_{M_1}(O) = J^\pm(O) \cap M_1$. Similarly we take by convention

$$D(O) := D_{M_2}(O), \qquad O^\perp := O^{\perp_{M_2}} := M_2 \setminus J(O),$$
and we deduce from causal convexity that $D_{M_1}(O) = D(O) \cap M_1$ and $O^{\perp_{M_1}} = O^\perp \cap M_1$.

The following lemma gives some ways of obtaining causally convex sets in a globally hyperbolic spacetime.

Lemma 2.4. Let $M = (\mathcal{M}, g)$ be a globally hyperbolic spacetime, $O \subset M$ an open subset and $A \subset M$ an achronal set. Then:
1. the intersection of two causally convex sets is causally convex,
2. for any subset $S \subset M$ the sets $I^\pm(S)$ are causally convex,
3. $O^\perp$ is causally convex,
4. $O$ is causally convex iff $O = J^+(O) \cap J^-(O)$,
5. $\mathrm{int}(D(A))$ and $\mathrm{int}(D^\pm(A))$ are causally convex,
6. if $O$ is a cc-region, then $D(O)$ is a cc-region,
7. if $S \subset M$ is an acausal continuous hypersurface, then $D(S)$, $D(S) \cap I^+(S)$ and $D(S) \cap I^-(S)$ are open and causally convex.
Proof. The first two items follow directly from the definitions. The fourth follows from $J^+(O) \cap J^-(O) = \cup_{p,q \in O} (J^+(p) \cap J^-(q))$, which is contained in $O$ if and only if $O$ is causally convex. The fifth item follows from the first two and Theorem 14.38 and Lemma 14.6 in [14]. To prove the third item, assume that $\gamma$ is a causal curve between points $x, y$ in $O^\perp$ and $p \in J(O)$ lies on $\gamma$. By perturbing one of the endpoints of $\gamma$ in $O^\perp$ we may ensure that the curve is time-like. Then we may perturb $p$ on $\gamma$ so that $p \in \mathrm{int}(J(O))$ and $\gamma$ is still causal. This gives a contradiction, because there then exists a causal curve from $O$ through $p$ to either $x$ or $y$. For the sixth statement we let $S \subset O$ be a smooth Cauchy surface for $O$ (see [3]) and note that $D(O)$ is non-empty, connected and $D(O) = D(S)$. The causal convexity of $O$ implies that $S \subset M$ is acausal, which reduces this case to statement seven. The first part of statement seven is just Lemma 14.43 and Theorem 14.38 in [14]. The rest of statement seven follows from statements one and two together with the openness of $I^\pm(S)$.

We now come to the main set of definitions, which combine the notions introduced above (see [5]).

Definition 2.5. A locally covariant quantum field theory is a covariant functor $A : \mathbf{Man} \to \mathbf{Alg}$, written as $M \mapsto A_M$, $\Psi \mapsto \alpha_\Psi$.

A state space for a locally covariant quantum field theory $A$ is a contravariant functor $S : \mathbf{Man} \to \mathbf{States}$, such that for all objects $M$ we have $M \mapsto S_M \subset (A_M)^{*+}_1$ and for all morphisms $\Psi : M_1 \to M_2$ we have $\Psi \mapsto \alpha_\Psi^*|_{S_{M_2}}$. The set $S_M$ is called the state space for $M$.

When it is clear that $\Psi = I_{M,O}$ for a canonical embedding $\iota : O \to M$ of a cc-region $O$ in a globally hyperbolic spacetime $M$, i.e. when $O \subset M$, we will often simply write $A_O \subset A_M$ instead of using $\alpha_{I_{M,O}}$. For a morphism $\Psi : M \to M'$ which restricts to a morphism $\Psi|_O : O \to O' \subset M'$ we then have

$$\alpha_{\Psi|_O} = \alpha_\Psi|_{A_O} \tag{2}$$

rather than $\alpha_{I_{M',O'}} \circ \alpha_{\Psi|_O} = \alpha_\Psi \circ \alpha_{I_{M,O}}$, as one can see from a commutative diagram.

The framework of locally covariant quantum field theory is a generalisation of algebraic quantum field theory (see [5,10]). We now proceed to discuss several physically
desirable properties that such a locally covariant quantum field theory and its state space may have (cf. [5], but note that our time-slice axiom is stronger, placing a restriction on the state spaces as well as the algebras).

Definition 2.6. A locally covariant quantum field theory $A$ is called causal iff for every pair of morphisms $\Psi_i : M_i \to M$, $i = 1, 2$, such that $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ in $M$ we have that $[\alpha_{\Psi_1}(A_{M_1}), \alpha_{\Psi_2}(A_{M_2})] = \{0\}$ in $A_M$.

A locally covariant quantum field theory $A$ with state space $S$ satisfies the time-slice axiom iff for all morphisms $\Psi : M_1 \to M_2$ such that $\psi(M_1)$ contains a Cauchy surface for $M_2$ we have $\alpha_\Psi(A_{M_1}) = A_{M_2}$ and $\alpha_\Psi^*(S_{M_2}) = S_{M_1}$.

A state space $S$ for a locally covariant quantum field theory $A$ is called locally quasi-equivalent iff for every morphism $\Psi : M_1 \to M_2$ such that $\psi(M_1) \subset M_2$ is bounded and for every pair of states $\omega, \omega' \in S_{M_2}$ the GNS-representations $\pi_\omega, \pi_{\omega'}$ of $A_{M_2}$ are quasi-equivalent on $\alpha_\Psi(A_{M_1})$. The local von Neumann algebras $R^\omega_{M_1} := \pi_\omega(\alpha_\Psi(A_{M_1}))''$ are then $*$-isomorphic for all $\omega \in S_{M_2}$.

A locally covariant quantum field theory $A$ with a state space functor $S$ is called nowhere classical iff for every morphism $\Psi : M_1 \to M_2$ and for every state $\omega \in S_{M_2}$ the local von Neumann algebra $R^\omega_{M_1}$ is not commutative.

Note that the condition $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ is symmetric in $i = 1, 2$. The causality condition formulates how the quantum physical system interplays with the classical gravitational background field, whereas the time-slice axiom expresses the existence of a causal dynamical law. The condition of a locally quasi-equivalent state space is more technical in nature and means that all states of a system can be described in the same Hilbert space representation as long as we only consider operations in a small (i.e. bounded) cc-region of the spacetime.

The condition that $\psi(M_1)$ contains a Cauchy surface for $M_2$ is equivalent to $D(\psi(M_1)) = M_2$, because a Cauchy surface $S \subset M_1$ maps to a Cauchy surface $\psi(S)$ for $D(\psi(M_1))$. On the algebraic level this yields:

Lemma 2.7. For a locally covariant quantum field theory $A$ with a state space $S$ satisfying the time-slice axiom, an object $(\mathcal{M}, g) \in \mathbf{Man}$ and a cc-region $O \subset M$ we have $A_O = A_{D(O)}$ and $S_O = S_{D(O)}$. If $O$ contains a Cauchy surface of $M$ we have $A_O = A_M$ and $S_O = S_M$.

Proof. Note that both $(O, g|_O)$ and $(D(O), g|_{D(O)})$ are objects of Man (by Lemma 2.4) and that a Cauchy surface $S$ for $O$ is also a Cauchy surface for $D(O)$. (The causal convexity of $O$ in $M$ prevents multiple intersections of $S$.) The first statement then reduces to the second. Leaving the canonical embedding implicit in the notation, the result immediately follows from the time-slice axiom.

Finally we define the Reeh-Schlieder property, which we will study in more detail in the subsequent sections.

Definition 2.8. Consider a locally covariant quantum field theory $A$ with a state space $S$. A state $\omega \in S_M$ has the Reeh-Schlieder property for a cc-region $O \subset M$ iff $\overline{\pi_\omega(A_O)\Omega_\omega} = H_\omega$, where $(\pi_\omega, \Omega_\omega, H_\omega)$ is the GNS-representation of $A_M$ in the state $\omega$. We then say that $\omega$ is a Reeh-Schlieder state for $O$. We say that $\omega$ is a (full) Reeh-Schlieder state iff it is a Reeh-Schlieder state for all cc-regions in $M$.
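A small consequence of Definition 2.8 can be spelled out as a worked remark; it is an added illustration, not a statement taken from the text. The sketch assumes only the functoriality of $A$ and the convention of writing $A_O \subset A_M$ for canonical embeddings, and the symbol $O'$ for the larger region is introduced here for illustration. It shows that the Reeh-Schlieder property for a cc-region is inherited by any larger cc-region.

```latex
% Assumption: O \subset O' \subset M are cc-regions with canonical embeddings.
% Functoriality gives \alpha_{I_{M,O}} = \alpha_{I_{M,O'}} \circ \alpha_{I_{O',O}},
% hence A_O \subset A_{O'} when both are viewed as subalgebras of A_M.
\[
  H_\omega \;=\; \overline{\pi_\omega(A_O)\,\Omega_\omega}
  \;\subset\; \overline{\pi_\omega(A_{O'})\,\Omega_\omega}
  \;\subset\; H_\omega .
\]
% Equality therefore holds throughout, so a Reeh-Schlieder state for O
% is also a Reeh-Schlieder state for O'.
```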
3. Spacetime Deformation

The existence of Hadamard states of the free scalar field in certain curved spacetimes was proved in [9] by deforming Minkowski spacetime into another globally hyperbolic spacetime. Using a similar but slightly more technical spacetime deformation argument [23] proved a spin-statistics theorem for locally covariant quantum field theories with a spin structure, given that such a theorem holds in Minkowski spacetime. In the next section we will assume the existence of a Reeh-Schlieder state in one spacetime and try to deduce along similar lines the existence of such states on a deformed spacetime. As a geometric prerequisite we will state and prove in the present section a spacetime deformation result employing similar methods as the references mentioned above. First we recall the spacetime deformation result due to [9]:

Proposition 3.1. Consider two globally hyperbolic spacetimes $M_i$, $i = 1, 2$, with spacelike Cauchy surfaces $C_i$ both diffeomorphic to $C$. Then there exists a globally hyperbolic spacetime $M' = (\mathbb{R} \times C, g')$ with spacelike Cauchy surfaces $C'_i$, $i = 1, 2$, such that $C'_i$ is isometrically diffeomorphic to $C_i$ and an open neighbourhood of $C'_i$ is isometrically diffeomorphic to an open neighbourhood of $C_i$.

The proof is omitted, because the stronger result Proposition 3.3 will be proved later on. Note, however, the following interesting corollary (cf. [5] Sect. 4):

Corollary 3.2. Two globally hyperbolic spacetimes $M_i$ with diffeomorphic Cauchy surfaces are mapped to isomorphic $C^*$-algebras $A_{M_i}$ by any locally covariant quantum field theory $A$ satisfying the time-slice axiom (with some state space $S$).

Proof. Consider two diffeomorphic globally hyperbolic spacetimes $M_i$, $i = 1, 2$, let $M'$ be the deforming spacetime of Proposition 3.1 and let $W_i \subset M_i$ be open neighbourhoods of the Cauchy surfaces $C_i \subset M_i$ which are isometrically diffeomorphic under $\psi_i$ to the open neighbourhoods $W'_i \subset M'$ of the Cauchy surfaces $C'_i \subset M'$. We may take the $W_i$ and $W'_i$ to be cc-regions (as will be shown in Proposition 3.3), so that the $\Psi_i$ (determined by $\psi_i$) are isomorphisms in Man. It then follows from Lemma 2.7 that

$$A_{M_1} = A_{W_1} = A_{\psi_1^{-1}(W'_1)} = \alpha_{\Psi_1}^{-1}(A_{W'_1}) = \alpha_{\Psi_1}^{-1}(A_{M'}) = \alpha_{\Psi_1}^{-1} \circ \alpha_{\Psi_2}(A_{M_2}),$$

where the $\alpha_{\Psi_i}$ are $*$-isomorphisms. This proves the assertion.

At this point a warning is in order. Whenever $g_1, g_2$ are two Lorentzian metrics on a manifold $\mathcal{M}$ such that both $M_i := (\mathcal{M}, g_i)$ are objects in Man, Corollary 3.2 gives a $*$-isomorphism $\alpha$ between the algebras $A_{M_i}$. If $O \subset \mathcal{M}$ is a cc-region for $g_1$ then $\alpha$ is a $*$-isomorphism from $A_{(O,g_1)}$ into $A_{M_2}$. However, the image cannot always be identified with $A_{(O,g_2)}$, because $O$ need not be causally convex for $g_2$, in which case the object is not defined.

We now formulate and prove our deformation result. The geometric situation is schematically depicted in Fig. 1.

Proposition 3.3. Consider two globally hyperbolic spacetimes $M_i$, $i = 1, 2$, with diffeomorphic Cauchy surfaces and a bounded cc-region $O_2 \subset M_2$ with non-empty causal complement, $O_2^\perp \neq \emptyset$. Then there are a globally hyperbolic spacetime $M' = (\mathcal{M}', g')$, spacelike Cauchy surfaces $C_i \subset M_i$ and $C'_1, C'_2 \subset M'$ and bounded cc-regions $U_2, V_2 \subset M_2$ and $U_1, V_1 \subset M_1$ such that the following hold:
Fig. 1. Sketch of the geometry of Proposition 3.3
• There are isometric diffeomorphisms $\psi_i : W_i \to W'_i$, where $W_1 := I^-(C_1)$, $W'_1 := I^-(C'_1)$, $W_2 := I^+(C_2)$ and $W'_2 := I^+(C'_2)$,
• $U_2, V_2 \subset W_2$, $U_2 \subset D(O_2)$, $O_2 \subset D(V_2)$,
• $U_1, V_1 \subset W_1$, $U_1 \neq \emptyset$, $V_1^\perp \neq \emptyset$, $\psi_1(U_1) \subset D(\psi_2(U_2))$ and $\psi_2(V_2) \subset D(\psi_1(V_1))$.

Proof. First we recall the result of [3] that for any globally hyperbolic spacetime $(\mathcal{M}, g)$ there is a diffeomorphism $F : \mathcal{M} \to \mathbb{R} \times C$ for some smooth three dimensional manifold $C$ in such a way that for each $t \in \mathbb{R}$ the surface $F^{-1}(\{t\} \times C)$ is a spacelike Cauchy surface. The pushed-forward metric $g' := F_* g$ makes $(\mathbb{R} \times C, g')$ a globally hyperbolic manifold, where $g'$ is given by

$$g'_{\mu\nu} = \beta\, dt_\mu dt_\nu - h_{\mu\nu}. \tag{3}$$

Here $dt$ is the differential of the canonical projection on the first coordinate $t : \mathbb{R} \times C \to \mathbb{R}$, which is a smooth time function; $\beta$ is a strictly positive smooth function and $h_{\mu\nu}$ is a (space and time dependent) Riemannian metric on $C$. The orientation and time orientation of $M$ induce an orientation and time orientation on $\mathbb{R} \times C$ via $F$. (If necessary we may compose $F$ with the time-reversal diffeomorphism $(t, x) \mapsto (-t, x)$ of $\mathbb{R} \times C$ to ensure that the function $t$ increases in the positive time direction.)

Applying the above to the $M_i$ gives us two diffeomorphisms $F_i : M_i \to M'$, where $M' = \mathbb{R} \times C$ as a manifold. Note that we can take the same $C$ for both $i = 1, 2$ by the assumption of diffeomorphic Cauchy surfaces. Define $O'_2 := F_2(O_2)$ and let $t_{\min}$ and $t_{\max}$ be the minimum and maximum value that the function $t$ attains on the compact set $\overline{O'_2}$. We now prove that $F_2^{-1}((t_{\min}, t_{\max}) \times C) \cap O_2^\perp \neq \emptyset$. Indeed, if this were empty, then we see that $J(O_2)$ contains $F_2^{-1}([t_{\min}, t_{\max}] \times C)$ and hence also $C_{\max} := F_2^{-1}(\{t_{\max}\} \times C)$ and $C_{\min} := F_2^{-1}(\{t_{\min}\} \times C)$. In fact, $C_{\min} \subset J^-(O_2)$. Indeed, if $p := F_2^{-1}(t_{\min}, x)$ is in $J^+(O_2)$ then we can consider a basis of neighbourhoods of $p$ of the form $I^-(F_2^{-1}(t_{\min} + 1/n, x)) \cap I^+(F_2^{-1}(\{t_{\min} - 1/n\} \times C))$. If $q_n \in J^+(O_2)$ is in such a basic neighbourhood, then the same neighbourhood also contains a point $p_n \in O_2$. Hence, given a sequence $q_n$ in $J^+(O_2)$ converging to $p$ we find a sequence $p_n$ in $O_2$ converging to $p$ and we conclude that $p \in O_2 \subset J^-(O_2)$. Similarly we can show that $C_{\max} \subset J^+(O_2)$. It then follows that $I^+(C_{\max}) \subset J^+(O_2)$ and $I^-(C_{\min}) \subset J^-(O_2)$, so that $J(O_2) = M_2$ and $O_2^\perp = \emptyset$. This contradicts our assumption on $O_2$, so we must have $F_2^{-1}((t_{\min}, t_{\max}) \times C) \cap O_2^\perp \neq \emptyset$. Then we may choose $t_2 \in (t_{\min}, t_{\max})$ such that $C_2 := F_2^{-1}(\{t_2\} \times C)$ satisfies $C_2 \cap O_2 \neq \emptyset$ and $C_2 \cap O_2^\perp \neq \emptyset$. We define $C'_2 := F_2(C_2)$, $W_2 := I^+(C_2)$ and $W'_2 := (t_2, \infty) \times C$.
Note that $C_2 \cap J(O_2)$ is compact by [1] Corollary A.5.4. It follows that we can find relatively compact open sets $K, N \subset C$ such that $K'_2 := \{t_2\} \times K$, $K_2 := F_2^{-1}(K'_2)$, $N'_2 := \{t_2\} \times N$ and $N_2 := F_2^{-1}(N'_2)$ satisfy $K \neq \emptyset$, $N \neq C$, $K_2 \subset O_2$ and $C_2 \cap J(O_2) \subset N_2$. We let $C_{\max} := F_2^{-1}(\{t_{\max}\} \times C)$ and define $U_2 := D(K_2) \cap I^+(K_2) \cap I^-(C_{\max})$ and $V_2 := D(N_2) \cap I^+(N_2) \cap I^-(C_{\max})$. It follows from Lemma 2.4 that $U_2, V_2$ are bounded cc-regions in $M_2$. Clearly $U_2, V_2 \subset W_2$, $U_2 \subset D(O_2)$, $O_2 \subset D(V_2)$ and $V_2^\perp \neq \emptyset$.

Next we choose $t_1 \in (t_{\min}, t_2)$ and define $C'_1 := \{t_1\} \times C$, $C_1 := F_1^{-1}(C'_1)$, $W_1 := I^-(C_1)$ and $W'_1 := (-\infty, t_1) \times C$. Let $N', K' \subset C$ be relatively compact connected open sets such that $K' \neq \emptyset$, $N' \neq C$, $K' \subset K$ and $N \subset N'$. We define $N'_1 := \{t_1\} \times N'$, $K'_1 := \{t_1\} \times K'$, $N_1 := F_1^{-1}(N'_1)$, $K_1 := F_1^{-1}(K'_1)$ and $C_{\min} := F_1^{-1}(\{t_{\min}\} \times C)$. Let $U_1 := D(K_1) \cap I^-(K_1) \cap I^+(C_{\min})$ and $V_1 := D(N_1) \cap I^-(N_1) \cap I^+(C_{\min})$. Again by Lemma 2.4 these are bounded cc-regions in $M_1$. Note that $U_1, V_1 \subset W_1$ and $V_1^\perp \neq \emptyset$.

The metric $g'$ of $M'$ is now chosen to be of the form

$$g'_{\mu\nu} := \beta\, dt_\mu dt_\nu - f \cdot (h_1)_{\mu\nu} - (1 - f) \cdot (h_2)_{\mu\nu},$$

where we have written $((F_i)_* g_i)_{\mu\nu} = \beta_i\, dt_\mu dt_\nu - (h_i)_{\mu\nu}$, $f$ is a smooth function on $M'$ which is identically 1 on $W'_1$, identically 0 on $W'_2$ and $0 < f < 1$ on the intermediate region $(t_1, t_2) \times C$ and $\beta$ is a positive smooth function which is identically $\beta_i$ on $W'_i$. It is then clear that the maps $F_i$ restrict to isometric diffeomorphisms $\psi_i : W_i \to W'_i$. The function $\beta$ may be chosen small enough on the region $(t_1, t_2) \times C$ to make $(\mathcal{M}', g')$ globally hyperbolic. (As pointed out in [9] in their proof of Proposition 3.1, choosing $\beta$ small "closes up" the light cones and prevents causal curves from "running off to spatial infinity" in the intermediate region.) Furthermore, using the compactness of $(t_1, t_2) \times N'$ and the continuity of $(h_i)_{\mu\nu}$ we see that we may choose $\beta$ small enough on this set to ensure that any causal curve through $K'_1$ must also intersect $K'_2$ and any causal curve through $N'_2$ must also intersect $N'_1$. This means that $K'_1 \subset D(K'_2)$ and $N'_2 \subset D(N'_1)$ and hence $\psi_1(U_1) \subset D(\psi_2(U_2))$ and $\psi_2(V_2) \subset D(\psi_1(V_1))$. This completes the proof.

The analogue of Corollary 3.2 for the situation of Proposition 3.3 is:

Proposition 3.4. Consider a locally covariant quantum field theory $A$ with a state space $S$ satisfying the time-slice axiom and two globally hyperbolic spacetimes $M_i$, $i = 1, 2$, with diffeomorphic Cauchy surfaces. For any bounded cc-region $O_2 \subset M_2$ with non-empty causal complement there are bounded cc-regions $U_1, V_1 \subset M_1$ and a $*$-isomorphism $\alpha : A_{M_2} \to A_{M_1}$ such that $V_1^\perp \neq \emptyset$ and

$$A_{U_1} \subset \alpha(A_{O_2}) \subset A_{V_1}. \tag{4}$$
Moreover, if the spacelike Cauchy surfaces of the $M_i$ are non-compact and $P_2 \subset M_2$ is any bounded cc-region, then there are bounded cc-regions $Q_2 \subset M_2$ and $P_1, Q_1 \subset M_1$ such that $Q_i \subset P_i^\perp$ for $i = 1, 2$ and

$$\alpha(A_{P_2}) \subset A_{P_1}, \qquad A_{Q_1} \subset \alpha(A_{Q_2}), \tag{5}$$

where $\alpha$ is the same $*$-isomorphism as in the first part of this proposition.

Proof. We apply Proposition 3.3 to obtain sets $U_i, V_i$ and isomorphisms $\Psi_i : W_i \to W'_i$ associated to the isometric diffeomorphisms $\psi_i$. As in the proof of Corollary 3.2 the $\Psi_i$ give rise to $*$-isomorphisms $\alpha_{\Psi_i}$ and $\alpha := \alpha_{\Psi_1}^{-1} \circ \alpha_{\Psi_2}$ is a $*$-isomorphism from $A_{M_2}$ to $A_{M_1}$. Using the properties of $U_i, V_i$ stated in Proposition 3.3 and writing $U'_i := \psi_i(U_i)$, $V'_i := \psi_i(V_i)$, we deduce:

$$A_{U_1} = \alpha_{\Psi_1}^{-1}(A_{U'_1}) \subset \alpha_{\Psi_1}^{-1}(A_{D(U'_2)}) = \alpha_{\Psi_1}^{-1}(A_{U'_2}) = \alpha(A_{U_2}) \subset \alpha(A_{O_2})$$
$$\subset \alpha(A_{V_2}) = \alpha_{\Psi_1}^{-1}(A_{V'_2}) \subset \alpha_{\Psi_1}^{-1}(A_{D(V'_1)}) = \alpha_{\Psi_1}^{-1}(A_{V'_1}) = A_{V_1}.$$

Here we repeatedly used Eq. (2) and Lemma 2.7 (the time-slice axiom). This proves the first part of the proposition.

Now suppose that the Cauchy surfaces are non-compact and let $P_2$ be any bounded cc-region. We refer to Fig. 2 for a depiction of this part of the proof.

Fig. 2. Sketch of the proof of the second part of Proposition 3.4

First choose Cauchy surfaces $T_2, T_+ \subset W_2$ such that $T_+ \subset I^+(T_2)$. Note that $J(P_2) \cap T_2$ is compact, so it has a relatively compact connected open neighbourhood $N_2 \subset T_2$. Choosing $T_+$ appropriately we see that $R := D(N_2) \cap I^+(N_2) \cap I^-(T_+)$ is a bounded cc-region in $M_2$ by Lemma 2.4 and as usual we set $R' := \psi_2(R)$. Now let $T'_-, T'_1 \subset W'_1$ be Cauchy surfaces such that $T'_- \subset I^-(T'_1)$ and note that $J(R') \cap T'_1$ is again compact, so we can find a relatively compact connected open neighbourhood $N'_1 \subset T'_1$ and use Lemma 2.4 to define the bounded cc-region $P'_1 := D(N'_1) \cap I^-(N'_1) \cap I^+(T'_-)$ and $P_1 := \psi_1^{-1}(P'_1)$. Now let $L'_1 \subset T'_1$ be a connected relatively compact set such that $L'_1 \cap N'_1 = \emptyset$. Such an $L'_1$ exists because $T'_1$ is non-compact. Define $Q'_1 := D(L'_1) \cap I^-(L'_1) \cap I^+(T'_-)$ and $Q_1 := \psi_1^{-1}(Q'_1)$. We see that $Q_1 \subset P_1^\perp$ is a bounded cc-region and $Q'_1 \subset D(\psi_2(L_2))$ where $L_2 \subset T_2 \setminus N_2$ is a relatively compact open set. In fact, we can choose $L_2$ to be connected because $Q'_1$ lies in a connected component of $D(\psi_2(T_2 \setminus N_2))$. We now define the bounded cc-region $Q_2 := D(L_2) \cap I^+(L_2) \cap I^-(T_+)$ and $Q'_2 := \psi_2(Q_2)$, so that $Q_2 \subset P_2^\perp$ and $Q'_1 \subset D(Q'_2)$. This concludes the geometrical part of the proof.

Now note that $A_{P_2} \subset A_R$ by Lemma 2.7 on $D(N_2)$ and that $A_{R'} = \alpha_{\Psi_2}(A_R)$. Applying Lemma 2.7 in $D(N'_1)$ we see that $A_{R'} \subset A_{P'_1}$ and we have $A_{P_1} = \alpha_{\Psi_1}^{-1}(A_{P'_1})$. Putting this together yields the inclusion:

$$\alpha(A_{P_2}) \subset \alpha(A_R) = \alpha_{\Psi_1}^{-1}(A_{R'}) \subset \alpha_{\Psi_1}^{-1}(A_{P'_1}) = A_{P_1}.$$

Similarly we have $A_{Q_1} = \alpha_{\Psi_1}^{-1}(A_{Q'_1})$, $A_{Q'_2} = \alpha_{\Psi_2}(A_{Q_2})$ and $A_{Q'_1} \subset A_{Q'_2}$ by Lemma 2.7. This yields the inclusion:

$$\alpha(A_{Q_2}) = \alpha_{\Psi_1}^{-1}(A_{Q'_2}) \supset \alpha_{\Psi_1}^{-1}(A_{Q'_1}) = A_{Q_1}.$$
4. The Reeh-Schlieder Property in Curved Spacetime

The spacetime deformation argument of the previous section will have some consequences for the Reeh-Schlieder property that we describe in the current section. Unfortunately it is not clear that we can deform a Reeh-Schlieder state into another (full) Reeh-Schlieder state, but we do have the following more limited result:

Theorem 4.1. Consider a locally covariant quantum field theory $A$ with state space $S$ which satisfies the time-slice axiom. Let $M_i$ be two globally hyperbolic spacetimes with diffeomorphic Cauchy surfaces and suppose that $\omega_1 \in S_{M_1}$ is a Reeh-Schlieder state. Then given any bounded cc-region $O_2 \subset M_2$ with non-empty causal complement, $O_2^\perp \neq \emptyset$, there is a $*$-isomorphism $\alpha : A_{M_2} \to A_{M_1}$ such that $\omega_2 := \alpha^*(\omega_1)$ has the Reeh-Schlieder property for $O_2$. Moreover, if the Cauchy surfaces of the $M_i$ are non-compact and $P_2 \subset M_2$ is a bounded cc-region, then there is a bounded cc-region $Q_2 \subset P_2^\perp$ for which $\omega_2$ has the Reeh-Schlieder property. (Here $\omega_2 = \alpha^*(\omega_1)$ is still defined by the same $\alpha$ as in the first statement of the theorem.)

Proof. For the first statement let $\alpha$ and $U_1$ be as in the first part of Proposition 3.4 and note that $\alpha$ gives rise to a unitary map $U_\alpha : H_{\omega_2} \to H_{\omega_1}$. This map is the expression of the essential uniqueness of the GNS-representation, so that $U_\alpha \Omega_{\omega_2} = \Omega_{\omega_1}$ and $U_\alpha \pi_{\omega_2} U_\alpha^* = \pi_{\omega_1} \circ \alpha$. The Reeh-Schlieder property for $O_2$ then follows from the observation that $U_\alpha \pi_{\omega_2}(A_{O_2}) U_\alpha^* \supset \pi_{\omega_1}(A_{U_1})$:

$$\overline{\pi_{\omega_2}(A_{O_2})\Omega_{\omega_2}} \supset U_\alpha^* \overline{\pi_{\omega_1}(A_{U_1})\Omega_{\omega_1}} = U_\alpha^* H_{\omega_1} = H_{\omega_2}.$$

Similarly for the second statement, given a bounded cc-region $P_2$ and choosing $Q_1, Q_2$ as in the second statement of Proposition 3.4 we see that $U_\alpha \pi_{\omega_2}(A_{Q_2}) U_\alpha^* \supset \pi_{\omega_1}(A_{Q_1})$.

The second part of Theorem 4.1 means that $\omega_2$ is a Reeh-Schlieder state for all cc-regions that are big enough. Indeed, if $V_2$ is a sufficiently small cc-region then $V_2^\perp$ is connected (recall that we work with four-dimensional spacetimes) and therefore $\omega_2$ has the Reeh-Schlieder property for some cc-region in $V_2^\perp$ and hence also for $V_2^\perp$ itself. A useful consequence of Theorem 4.1 is the following:

Corollary 4.2. In the situation of Theorem 4.1 if $A$ is causal then $\Omega_{\omega_2}$ is a cyclic and separating vector for $R^{\omega_2}_{O_2}$. If the Cauchy surfaces are non-compact $\Omega_{\omega_2}$ is a separating vector for all $R^{\omega_2}_{P_2}$, where $P_2$ is a bounded cc-region.

Proof. Recall that a vector is a separating vector for a von Neumann algebra $R$ iff it is a cyclic vector for the commutant $R'$ (see [12] Proposition 5.5.11.). Choosing $V_1$ as in the first part of Proposition 3.4 we have $U_\alpha \pi_{\omega_2}(A_{O_2}) U_\alpha^* \subset \pi_{\omega_1}(A_{V_1})$ by the inclusion (4). Therefore the commutant of $U_\alpha R^{\omega_2}_{O_2} U_\alpha^*$ contains $(R^{\omega_1}_{V_1})'$. As $V_1^\perp \neq \emptyset$ this commutant contains the local algebra of some cc-region for which $\Omega_{\omega_1}$ is cyclic. Hence $\Omega_{\omega_1}$ is a separating vector for $R^{\omega_1}_{V_1}$ and $\Omega_{\omega_2}$ for $R^{\omega_2}_{O_2}$. If the Cauchy surfaces are non-compact, $P_2$ is a bounded region and $Q_2$ is as in Theorem 4.1, then $(R^{\omega_2}_{P_2})'$ contains $\pi_{\omega_2}(A_{Q_2})$, for which $\Omega_{\omega_2}$ is cyclic. It follows that $\Omega_{\omega_2}$ is separating for $R^{\omega_2}_{P_2}$.

If the theory is nowhere classical there exist non-local correlations between $O_2$ and any cc-region $V_2$ spacelike to it, just as in the Minkowski spacetime case (see e.g. [16]). Also,
if the Cauchy surfaces are non-compact, any localised non-trivial positive observable has a positive expectation value.

If the state space is locally quasi-equivalent and large enough it is possible to show the existence of full Reeh-Schlieder states. The proof uses abstract existence arguments, as opposed to the proof of Theorem 4.1 which is constructive, at least in principle.

Theorem 4.3. Consider a locally covariant quantum field theory $\mathbf{A}$ with a locally quasi-equivalent state space $\mathbf{S}$ which is causal and satisfies the time-slice axiom. Assume that $\mathbf{S}$ is maximal in the sense that for any state $\omega$ on some $A_M$ which is locally quasi-equivalent to a state in $S_M$ we have $\omega \in S_M$. Let $M_i$, $i = 1, 2$, be two globally hyperbolic spacetimes with diffeomorphic non-compact Cauchy surfaces and assume that $\omega_1$ is a Reeh-Schlieder state on $M_1$. Then $S_{M_2}$ contains a (full) Reeh-Schlieder state.

Proof. Let $\{O_n\}_{n \in \mathbb{N}}$ be a countable basis for the topology of $M_2$ consisting of bounded cc-regions with non-empty causal complement. We then apply Theorem 4.1 to each $O_n$ to obtain a sequence of states $\omega_2^n \in S_{M_2}$ which have the Reeh-Schlieder property for $O_n$. We write $\omega := \omega_2^1$ and let $(\pi, \Omega, H)$ denote its GNS-representation. For all $n \geq 2$ we now find a bounded cc-region $V_n \subset M_2$ such that $V_n \supset O_1 \cup O_n$. For this purpose we first choose a Cauchy surface $C \subset M_2$ and note that $K_n := C \cap J(O_n)$ is compact. Letting $L_n \subset C$ be a compact connected set containing $K_1 \cup K_n$ in its interior it suffices to choose $V_n := \mathrm{int}(D(L_n)) \cap I^-(C_+) \cap I^+(C_-)$ for Cauchy surfaces $C_\pm$ to the future resp. past of $O_1$, $O_n$ and $C$. Note that $\Omega$ and $\Omega_{\omega_2^n}$ are cyclic and separating vectors for $R^{\omega}_{V_n}$ and $R^{\omega_2^n}_{V_n}$ respectively by $O_1 \cup O_n \subset V_n$ and by Corollary 4.2.

Because $\omega$ and $\omega_2^n$ are locally quasi-equivalent there is a $*$-isomorphism $\phi : R^{\omega_2^n}_{V_n} \to R^{\omega}_{V_n}$. In the presence of the cyclic and separating vectors $\phi$ is implemented by a unitary map $U_n : H_{\omega_2^n} \to H$ (see [12] Theorem 7.2.9). We claim that $\psi_n := U_n \Omega_{\omega_2^n}$ is cyclic for $R^{\omega}_{O_n}$. Indeed, by the definition of quasi-equivalence we have $\phi \circ \pi_{\omega_2^n} = \pi_\omega$ on $A_{V_n}$, so

\[ \overline{\pi_\omega(A_{O_n})\psi_n} = U_n \, \overline{\pi_{\omega_2^n}(A_{O_n})\Omega_{\omega_2^n}} = U_n H_{\omega_2^n} = H_\omega. \]

We now apply the results of [8] to conclude that $H$ contains a dense set of vectors $\psi$ which are cyclic and separating for all $R^{\omega}_{O_n}$ simultaneously. Because each cc-region $O \subset M_2$ contains some $O_n$ we see that $\omega_\psi : A \mapsto \langle \psi, \pi_\omega(A)\psi \rangle / \|\psi\|^2$ defines a full Reeh-Schlieder state. Finally, because the GNS-representation of $\omega_\psi$ is just $(\pi, \psi, H)$ we see that it is locally quasi-equivalent to $\omega$ and hence $\omega_\psi \in S_{M_2}$.
One reason to assume the maximality condition of Theorem 4.3 is that it guarantees that the state spaces are closed under operations, i.e. if $\omega \in S_M$ and $A \in A_M$ are such that $\omega(A^*A) = 1$, then $S_M$ automatically contains the state $\omega_A$ defined by $\omega_A(B) := \omega(A^*BA)$. However, such a large state space may contain many singular states, as we will see in the example of the free scalar field in Sect. 5. In situations of physical interest it therefore remains to be seen whether the state space is big enough to contain full Reeh-Schlieder states. Nevertheless, Theorem 4.1 is already enough for some applications, such as the following conclusion concerning the type of local von Neumann algebras.

Corollary 4.4. Consider a nowhere classical causal locally covariant quantum field theory $\mathbf{A}$ with a locally quasi-equivalent state space $\mathbf{S}$ which satisfies the time-slice axiom. Let $M_i$ be two globally hyperbolic spacetimes with diffeomorphic Cauchy surfaces and let $\omega_1 \in S_{M_1}$ be a Reeh-Schlieder state. Then for any state $\omega \in S_{M_i}$ and any cc-region $O \subset M_i$ the local von Neumann algebra $R^{\omega}_{O}$ is not finite.
Proof. We will use Proposition 5.5.3 in [2], which says that $R^{\omega}_{O}$ is not finite if the GNS-vector is a cyclic and separating vector for $R^{\omega}_{O}$ and for a proper sub-algebra $R^{\omega}_{V}$. Note that we can drop the superscript $\omega$ if $O$ and $V$ are bounded, by local quasi-equivalence.

First we consider $M_1$. For any bounded cc-region $O_1 \subset M_1$ such that $O_1^{\perp} \neq \emptyset$ we can find bounded cc-regions $O \subset O_1^{\perp}$ and $U, V \subset O_1$ such that $U \subset V^{\perp}$. By the Reeh-Schlieder property the GNS-vector $\Omega_{\omega_1}$ is cyclic for $R_V$ and hence also for $R_{O_1}$. Moreover it is cyclic for $R_{O_1}' \supset R_O$ and therefore it is separating for $R_{O_1}$ and $R_V$. Now suppose that $R_{O_1} = R_V$. Then, by causality:

\[ \pi_\omega(A_U) \subset \pi_\omega(A_V)' = \pi_\omega(A_{O_1})' \subset \pi_\omega(A_U)'. \]

It follows that $R_U \subset R_U'$, which contradicts the nowhere classicality. Therefore, the inclusion $R_V \subset R_{O_1}$ must be proper and the cited theorem applies. Of course, if $O \subset M_1$ is a cc-region that is not bounded, then it contains a bounded sub-cc-region $O_1$ as above and $R^{\omega}_{O} \supset R^{\omega}_{O_1} \simeq R_{O_1}$ is not finite either for any $\omega \in S_{M_1}$. (If $V$ is a partial isometry in the smaller algebra such that $I = V^*V$ and $E := VV^* < I$ then the same $V$ shows that $I$ is not finite in the larger algebra.)

Next we consider $M_2$ and let $O \subset M_2$ be any cc-region. It contains a cc-region $O_2$ with $O_2^{\perp} \neq \emptyset$, so we can apply Theorem 4.1. Using the unitary map $U_\alpha : H_{\omega_2} \to H_{\omega_1}$ we see that $R_{O_2} \simeq R^{\omega_2}_{O_2}$ contains $\alpha^{-1}(R^{\omega_1}_{O_1})$, which is not finite by the first paragraph. Hence $R_{O_2}$ is not finite and the statement for $O$ then follows again by inclusion.
Instead of the nowhere classicality we could have assumed that the local von Neumann algebras in $M_1$ are infinite, which allows us to derive the same conclusion for $M_2$. Unfortunately it is in general impossible to completely derive the type of the local algebras using this kind of argument. Even if we know the types of the algebras $A_{U_1}$ and $A_{V_1}$ in the inclusions (4), we cannot deduce the type of $A_{O_2}$.

Another important consequence of Theorem 4.1 is that Corollary 4.2 enables us to apply the Tomita-Takesaki modular theory to $R^{\omega_2}_{O_2}$ (or to the von Neumann algebra of any bounded cc-region $V_2$ which contains $O_2$, if the Cauchy surfaces are non-compact). More precisely, let $O_2 \subset M_2$ be given and let $U_1, V_1 \subset M_1$ be the bounded cc-regions and $\alpha : A_{M_2} \to A_{M_1}$ the $*$-isomorphism of Proposition 3.4, so that $A_{U_1} \subset \alpha(A_{O_2}) \subset A_{V_1}$. We can then define $R := U_\alpha R^{\omega_2}_{O_2} U_\alpha^*$ and obtain $R^{\omega_1}_{U_1} \subset R \subset R^{\omega_1}_{V_1}$. It is then clear that the respective Tomita-operators are extensions of each other, $S_{U_1} \subset S_R \subset S_{V_1}$ (see e.g. [12]).

5. The Free Scalar Field

As an example we will consider the free scalar field, which can be quantised using the Weyl algebra (see [7]). For a globally hyperbolic spacetime $M$ the algebra $A_M$ is defined as follows. We let $E := E^- - E^+$ denote the difference of the advanced and retarded fundamental solution of the Klein-Gordon operator $\nabla^a \nabla_a + m^2$ for a given mass $m \geq 0$. The linear space $H := E(C_0^\infty(M))$ has a non-degenerate symplectic form defined by $\sigma(Ef, Eg) := \int_M f \, Eg$, where we integrate with respect to the volume element determined by the metric. To every $Ef \in H$ we can then associate an element $W(Ef)$ subject to the relations

\[ W(Ef)^* = W(-Ef), \qquad W(Ef)W(Eg) = e^{-\frac{i}{2}\sigma(Ef, Eg)} \, W(E(f + g)). \]

These elements form a $*$-algebra that can be given a norm and completed to a $C^*$-algebra $A_M$. It is shown in [5] Theorem 2.2 that the free scalar field is an example of a locally
covariant quantum field theory which is causal. It satisfies part of the time-slice axiom, namely if $O \subset M$ contains a Cauchy surface then $A_O = A_M$.

A state $\omega$ on $A_M$ is called regular if the group of unitary operators $\lambda \mapsto \pi_\omega(W(\lambda Ef))$ is strongly continuous for each $f$. It then has a self-adjoint (unbounded) generator $\Phi_\omega(f)$ and we can define the Hilbert-space valued distribution $\varphi_\omega(f) := \Phi_\omega(f)\Omega_\omega$. A regular state is quasi-free iff the two-point function

\[ w_2(f, h) := \langle \varphi_\omega(\bar{f}), \varphi_\omega(h) \rangle, \qquad f, h \in C_0^\infty(M), \]

determines the state by $\omega(W(Ef)) = e^{-w_2(f, f)}$. A quasi-free state is Hadamard iff WF∞(φ_ω(.)) ⊂ V⁺, where $V^+ \subset T^*M$ denotes the cone of future directed causal covectors of the spacetime (see [20] Proposition 6.1). Quasi-free Hadamard states exist on all globally hyperbolic spacetimes (see [9]) and they are believed to be the most suitable states to play a role similar to the vacuum in Minkowski spacetime. For this reason we will want to choose a state space $S_M$ which contains all quasi-free Hadamard states. If we choose these states only it can be shown that we get a locally quasi-equivalent state space (see [22] Theorem 3.6) and the time-slice axiom is satisfied (see [15] Theorem 5.1 and the subsequent discussion). We may now apply the results of Sect. 4:

Proposition 5.1. Let $M$ be a globally hyperbolic spacetime, let $O \subset M$ be a bounded cc-region with non-empty causal complement and assume that the mass $m > 0$ is strictly positive. Then there is a Hadamard state $\omega$ on $A_M$ which has the Reeh-Schlieder property for $O$. The vector $\Omega_\omega$ is cyclic and separating for $R_O$. For all bounded cc-regions $V \subset M$ the local von Neumann algebra $R_V$ is not finite. Moreover, if the Cauchy surfaces of $M$ are non-compact then $\Omega_\omega$ is a separating vector for all $R_V$.

Proof. The theory is causal, satisfies the time-slice axiom and the state space is locally quasi-equivalent. Moreover, the theory is nowhere classical. To see this we note that the local $C^*$-algebras are non-commutative and simple, so the representations $\pi_\omega$ are faithful. Now we can find an ultrastatic (and hence stationary) spacetime $M'$ diffeomorphic to $M$. Because $m > 0$ we may apply the results of [13], which imply the existence of a regular quasi-free ground state $\omega'$ on $M'$. This state has the Reeh-Schlieder property (see [19]) and is Hadamard because it satisfies the microlocal spectrum condition (see [15,20]). The conclusions now follow immediately from Theorem 4.1 and Corollaries 4.2 and 4.4.
Note that stronger results on the type of the local algebras are known, see [22]. If we enlarged our state space, following [5], to allow any state that is locally quasi-equivalent to a quasi-free Hadamard state, then it would follow from Theorem 4.3 that it also contains full Reeh-Schlieder states. In fact, if $\omega$ is a suitable quasi-free Hadamard state on $A_M$ then the proof of Theorem 4.3 shows that $H_\omega$ contains a dense $G_\delta$ of vectors which define Reeh-Schlieder states.

An important question is how many states are both Hadamard and Reeh-Schlieder states. As a partial answer we wish to note that most vectors in the given $G_\delta$ of Reeh-Schlieder vector states are not Hadamard. Indeed, if a vector $\psi \in H_\omega$ defines a Hadamard state then it must be in the domain of the unbounded self-adjoint operator $T := \Phi_\omega(f)^{**}\Phi_\omega(f)^{*}$ for every test function $f$ (see [12] Theorem 2.7.8v). We then apply
Note that this is what [5] calls the time-slice axiom. In our definition, however, we also need to choose a suitable state space functor so that we get isomorphisms of the sets of states too.
Proposition 5.2. The domain of an unbounded self-adjoint operator $T$ on a Hilbert space $H$ is a meagre $F_\sigma$ (i.e. the complement of a dense $G_\delta$).

Proof. For each $n \in \mathbb{N}$ we define $V_n := \{\psi \in H \mid \|T\psi\| \leq n\}$ and note that $\mathrm{dom}(T) = \cup_n V_n$. The sets $V_n$ are nowhere dense because $T$ is unbounded. They are also closed because for a Cauchy sequence $\psi_i \to \psi$ with $\psi_i \in V_n$ we have

\[ \|T E_{[-r,r]}\psi\| \leq \|T E_{[-r,r]}(\psi - \psi_i)\| + \|T E_{[-r,r]}\psi_i\| \leq r\|\psi - \psi_i\| + n, \]

where $E_{[-r,r]}$ is the spectral projection of $T$ on the interval $[-r, r]$. Taking $i \to \infty$ shows that $\|T E_{[-r,r]}\psi\| \leq n$ for all $r$ and hence $\|T\psi\| \leq n$, i.e. $\psi \in V_n$. This completes the proof.

It then follows that most Reeh-Schlieder vector states in $H_\omega$ are not Hadamard. The converse question, how many Hadamard states are Reeh-Schlieder states, remains open. The basic difficulty for that question seems to be that the Hilbert space topology on $H_\omega$ is not fine enough to deal with the meagre set of Hadamard states.

6. Conclusions

If one accepts locally covariant quantum field theory as a suitable axiomatic framework to describe quantum field theories in curved spacetime then one only needs to assume the very natural time-slice axiom in order to use the general technique of spacetime deformation. The geometrical ideas behind deformation results like Proposition 3.3 are insightful, even though the proofs can become a bit involved. It should be noted, however, that these geometrical results, possibly combined with other assumptions such as causality, have immediate consequences on the algebraic side which are not hard to prove. This we have seen in Sect. 4, where most proofs follow easily from the deformation, with the exception of Theorem 4.3.

Concerning the Reeh-Schlieder property we have shown that a Reeh-Schlieder state on one spacetime can be deformed in such a way that it gives a state on a diffeomorphic spacetime which is a Reeh-Schlieder state for a given cc-region.
It is even possible to get full Reeh-Schlieder states, but it is not clear whether these are "physical" enough to belong to a state space of interest. Nevertheless, our results do allow us to draw conclusions about non-local correlations and the type of local von Neumann algebras and they open up the way to use Tomita-Takesaki theory in curved spacetime.

Acknowledgement. I would like to thank Chris Fewster for suggesting the current approach to the Reeh-Schlieder property and for many helpful discussions and comments on the second draft. Many thanks also to Lutz Osterbrink for his careful proofreading of the first draft. Finally I would like to express my gratitude to the anonymous referee for pointing out some minor mistakes and providing useful comments.

Open Access. This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References

1. Bär, C., Ginoux, N., Pfäffle, F.: Wave equations on Lorentzian manifolds and quantization. EMS Publishing House, Zürich, 2007
2. Baumgärtel, H., Wollenberg, M.: Causal nets of operator algebras. Akademie Verlag, Berlin, 1992
3. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257, 43–50 (2005)
4. Bernal, A.N., Sánchez, M.: Further results on the smoothability of Cauchy hypersurfaces and Cauchy time functions. Lett. Math. Phys. 77, 183–197 (2006)
5. Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle—a new paradigm for local quantum field theory. Commun. Math. Phys. 237, 31–68 (2003)
6. Brunetti, R., Ruzzi, G.: Superselection sectors and general covariance. I. Commun. Math. Phys. 270, 69–108 (2007)
7. Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219–228 (1980)
8. Dixmier, J., Maréchal, O.: Vecteurs totalisateurs d'une algèbre de von Neumann. Commun. Math. Phys. 22, 44–50 (1971)
9. Fulling, S.A., Narcowich, F.J., Wald, R.M.: Singularity structure of the two-point function in quantum field theory in curved spacetime, II. Ann. Phys. (N.Y.) 136, 243–272 (1981)
10. Haag, R.: Local quantum physics – fields, particles, algebras. Springer Verlag, Berlin-Heidelberg, 1992
11. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge University Press, Cambridge, 1973
12. Kadison, R.V., Ringrose, J.R.: Fundamentals of the theory of operator algebras. Academic Press, London, 1983
13. Kay, B.S.: Linear spin-zero quantum fields in external gravitational and scalar fields. I. A one particle structure for the stationary case. Commun. Math. Phys. 62, 55–70 (1978)
14. O'Neill, B.: Semi-Riemannian geometry: with applications to relativity. Academic Press, New York, 1983
15. Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529–553 (1996)
16. Redhead, M.: The vacuum state in relativistic quantum field theory. In: Hull, D., Forbes, M., Burian, eds., PSA 1994, Volume 2. Phil. of Sci. Assoc., E. Lansing, MI, 1994, pp. 77–87
17. Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von Lorentzinvarianten Feldern. Nuovo Cimento 22, 1051–1068 (1961)
18. Sanders, K.: Aspects of locally covariant quantum field theory. PhD thesis, University of York, also available on http://arxiv.org/abs/:0809.4828v1[math-ph], 2008
19. Strohmaier, A.: The Reeh-Schlieder property for quantum fields on stationary spacetimes. Commun. Math. Phys. 215, 105–118 (2000)
20. Strohmaier, A., Verch, R., Wollenberg, M.: Microlocal analysis of quantum fields on curved space-times: analytic wavefront sets and Reeh-Schlieder theorems. J. Math. Phys. 43, 5514–5530 (2002)
21. Verch, R.: Antilocality and a Reeh-Schlieder theorem on manifolds. Lett. Math. Phys. 28, 143–154 (1993)
22. Verch, R.: Continuity of symplectically adjoint maps and the algebraic structure of Hadamard vacuum representations for quantum fields on curved spacetime. Rev. Math. Phys. 9, 635–674 (1997)
23. Verch, R.: A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework. Commun. Math. Phys. 223, 261–288 (2001)
24. Wald, R.M.: General relativity. The University of Chicago Press, Chicago and London, 1984

Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 287–310 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0771-y
Communications in
Mathematical Physics
Mellin Transform of the Limit Lognormal Distribution Dmitry Ostrovsky 125 Field Point Rd. #3, Greenwich, CT 06830, USA. E-mail:
[email protected] Received: 26 February 2008 / Accepted: 24 December 2008 Published online: 11 March 2009 – © Springer-Verlag 2009
Abstract: The technique of intermittency expansions is applied to derive an exact formal power series representation for the Mellin transform of the probability distribution of the limit lognormal multifractal process. The negative integral moments are computed by a novel product formula of Selberg type. The power series is summed in general by means of its small intermittency asymptotic. The resulting integral formula for the Mellin transform is conjectured to be valid at all levels of intermittency. The conjecture is verified partially by proving that the integral formula reproduces known results for the positive and negative integral moments of the limit lognormal distribution and gives a valid characteristic function of the Lévy-Khinchine type for the logarithm of the distribution. The moment problem for the logarithm of the distribution is shown to be determinate, whereas the moment problems for the distribution and its reciprocal are shown to be indeterminate. The conjecture is used to represent the Mellin transform as an infinite product of gamma factors generalizing Selberg's finite product. The conjectured probability density functions of the limit lognormal distribution and its logarithm are computed numerically by the inverse Fourier transform.

1. Introduction

Limit lognormal stochastic processes were introduced and reviewed by Mandelbrot [17,19], formalized in a series of papers by Kahane [12–14], and constructed explicitly by Bacry et al. [3]. The interest in limit lognormal processes is threefold. First, it stems from their remarkable scaling properties. Indeed, they exhibit nonlinear moment scaling, i.e. multiscaling, via the mechanism of grid-free stochastic self-similarity with lognormal multipliers, and thus are a concrete example of a family of multifractal stochastic processes. In addition, the construction of Bacry et al. has stationary non-gaussian increments and serves as an ideal mathematical model for the physical phenomenon of long-range dependence. Second, limit lognormal processes are non-markovian as their increments are strongly stochastically dependent. We showed in [22 and 23] that this
dependence is stronger than that of both the canonical multifractals [18] and 1D disordered systems [15]. The precise nature of this dependence is the source of mathematical interest as it calls for novel mathematical techniques that are suitable for a strongly non-markovian setting. Third, the positive integral moments of the Bacry et al. process were shown in [4] to be given by the Selberg integral, which has its own interest, confer [10], hence the interest in the probability distribution that generates it.

In this paper we continue our study of the limit lognormal process of Bacry et al. [3] via the technique of intermittency expansions that we introduced in [23] in the special case of the Laplace transform and developed in general in [24]. This technique is based on a novel rule of intermittency differentiation for the probability distribution of the limit lognormal process. The rule is a functional equation for the derivatives of the expectation of an arbitrary smooth function of the distribution with respect to the intermittency parameter. By formally re-summing the resulting Taylor series, we obtain a power series expansion of any such functional with universal coefficients that are independent of the function. In [24] we expressed these expansion coefficients as an alternating sum of derivatives of the Selberg integral and used the celebrated Selberg's formula [26] to compute the derivatives exactly as Bell polynomials of the values of the Riemann zeta function at positive integers. Thus, the technique of intermittency expansions provides a way of reconstructing the functional from the Selberg integral. Our goal in this paper is to carry out this program in the special case of the Mellin transform of the distribution.

The contribution of this paper is to provide several new exact results beyond what has been known since the inception of the limit lognormal distribution.
Our main result is the following formal expansion of the Mellin transform (complex moments) of the distribution in powers of the intermittency parameter µ,

\[
\mathbf{E}\bigl[M^q\bigr] = \exp\Biggl( \sum_{r=0}^{\infty} \frac{\mu^{r+1}}{r+1} \, \frac{1}{2^{r+1}} \Bigl[ \zeta(r+1) \Bigl( \frac{B_{r+2}(q+1) + 2B_{r+2}(q) - 3B_{r+2}}{r+2} - q \Bigr) + \bigl(\zeta(r+1) - 1\bigr) \frac{B_{r+2}(q-1) - B_{r+2}(2q-1)}{r+2} \Bigr] \Biggr), \quad q \in \mathbb{C}. \tag{1}
\]
As usual, ζ(s)¹ denotes the Riemann zeta function and $B_n(s)$ the $n$th Bernoulli polynomial. Equation (1) is proved by deriving and solving a linear recurrence relation for the coefficients of the corresponding intermittency expansion. In the special case of integral moments, the series is convergent for a finite range of the moments and allows us to re-derive Selberg's formula for the positive moments and derive the following formula for the negative ones:

\[
\mathbf{E}\bigl[M^{-n}\bigr] = \prod_{k=0}^{n-1} \frac{\Gamma\bigl(2 + (n+2+k)\mu/2\bigr) \, \Gamma(1 - \mu/2)}{\Gamma^2\bigl(1 + (k+1)\mu/2\bigr) \, \Gamma(1 + k\mu/2)}. \tag{2}
\]
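The agreement between the series in Eq. (1) and the product in Eq. (2) can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: it truncates the series of Eq. (1) at a finite order, evaluates ζ at integers by a partial sum with an Euler-Maclaurin tail, and compares with the logarithm of Eq. (2) for small µ. The function names, the truncation order `r_max` and the `cutoff` parameter are all illustrative assumptions.

```python
from fractions import Fraction
from math import comb, lgamma


def bernoulli_numbers(n_max):
    """Exact Bernoulli numbers B_0..B_{n_max} (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def bernoulli_poly(n, x, B):
    """B_n(x) = sum_k C(n,k) B_k x^(n-k), exact for rational x."""
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))


def zeta(s, cutoff=50):
    """Riemann zeta at an integer s >= 2: partial sum plus Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1) + 0.5 * cutoff ** -s
            + s * cutoff ** (-s - 1) / 12)
    return head + tail


def log_moment_series(q, mu, r_max=40):
    """Truncated exponent of Eq. (1) for a rational moment q (illustrative)."""
    q = Fraction(q)
    B = bernoulli_numbers(r_max + 2)
    total = 0.0
    for r in range(r_max + 1):
        n = r + 2
        zeta_coeff = (bernoulli_poly(n, q + 1, B) + 2 * bernoulli_poly(n, q, B)
                      - 3 * B[n]) / n - q
        tail = (bernoulli_poly(n, q - 1, B) - bernoulli_poly(n, 2 * q - 1, B)) / n
        a = zeta_coeff + tail  # total coefficient multiplying zeta(r + 1)
        if r == 0:
            # zeta(1) would be Euler's constant; its coefficient vanishes exactly.
            assert a == 0
            term = float(-tail)
        else:
            term = zeta(r + 1) * float(a) - float(tail)
        total += (mu / 2) ** (r + 1) / (r + 1) * term
    return total


def log_negative_moment(n, mu):
    """Logarithm of the product formula in Eq. (2)."""
    return sum(lgamma(2 + (n + 2 + k) * mu / 2) + lgamma(1 - mu / 2)
               - 2 * lgamma(1 + (k + 1) * mu / 2) - lgamma(1 + k * mu / 2)
               for k in range(n))


if __name__ == "__main__":
    for n in (1, 2):
        print(n, log_moment_series(-n, 0.1), log_negative_moment(n, 0.1))
```

The check is only meaningful inside the radius of convergence of the series for the chosen integer moment, which is why a small value of µ is used here.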
For non-integral moments, the power series expansion in Eq. (1) is divergent. We show that it is precisely the small intermittency µ → +0 asymptotic expansion of the following integral:

\[
\log \mathbf{E}\bigl[M^q\bigr] \sim \int_0^{\infty} \frac{dx}{x} \Biggl\{ \frac{1}{e^x - 1} \Biggl[ \frac{e^{\frac{\mu x}{2}(q+1)} + 2e^{\frac{\mu x}{2} q} - 3 + e^{\frac{\mu x}{2}(q-1)} - e^{\frac{\mu x}{2}(2q-1)}}{e^{\frac{\mu x}{2}} - 1} - \Bigl(1 + q + q\,e^{\frac{\mu x}{2}}\Bigr) \Biggr] + e^{-x} \Biggl[ \frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q \Biggr] \Biggr\}. \tag{3}
\]

¹ We will write ζ(1) to denote Euler's constant. It never enters any of the final formulas as the coefficient it multiplies is identically zero throughout this paper.
We conjecture that this formula holds exactly for all intermittency levels 0 ≤ µ < 1 and complex moments q such that ℜ(q) < 2/µ. While we do not know if it is true in this generality, we do know that it holds for integral q as it recovers Selberg's formula when q is positive and Eq. (2) when q is negative. The special case of purely imaginary q corresponds to the characteristic function of log M. To further substantiate our conjecture, we prove that Eq. (3) does indeed produce a valid infinitely divisible characteristic function at all intermittency levels when q is purely imaginary. The corresponding probability density, which is conjectured to be the density of log M, is not known analytically. As a first step towards the goal of computing it, we show that the exponential of the integral in Eq. (3) equals an infinite product of gamma factors in the complex plane, thereby extending Selberg's formula and Eq. (2) to the half-plane ℜ(q) < 2/µ:

\[
\mathbf{E}\bigl[M^q\bigr] = \Bigl(\frac{2}{\mu}\Bigr)^{q} \, \Gamma(1 - \mu q/2) \, \Gamma^{-q}(1 - \mu/2) \, \frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} \times \prod_{n=1}^{\infty} \Bigl(\frac{2n}{\mu}\Bigr)^{2q} \frac{\Gamma^3(1 - q + 2n/\mu) \, \Gamma(2 - q + 2n/\mu)}{\Gamma^3(1 + 2n/\mu) \, \Gamma(2 - 2q + 2n/\mu)}. \tag{4}
\]
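As an illustration of the claim that the infinite product extends Selberg's formula, the sketch below truncates the product in Eq. (4) and compares its logarithm with the logarithm of Selberg's finite product, $\prod_{k=0}^{l-1} \Gamma(1-(k+1)\mu/2)\,\Gamma^2(1-k\mu/2) / [\Gamma(1-\mu/2)\,\Gamma(2-(l+k-1)\mu/2)]$, at small positive integers. This is a hedged numerical check rather than the paper's own computation: the function names and the truncation level `n_terms` are assumptions, real q is assumed so that `math.lgamma` applies, and the truncation error decays only like 1/`n_terms`.

```python
from math import lgamma, log


def log_selberg(l, mu):
    """Logarithm of Selberg's product for E[M^l], valid for l < 2/mu."""
    return sum(lgamma(1 - (k + 1) * mu / 2) + 2 * lgamma(1 - k * mu / 2)
               - lgamma(1 - mu / 2) - lgamma(2 - (l + k - 1) * mu / 2)
               for k in range(l))


def log_mellin_product(q, mu, n_terms=5000):
    """Logarithm of Eq. (4) with the infinite product cut at n_terms factors.

    Assumes real q and parameters for which every gamma argument is positive.
    """
    x0 = 2 / mu
    total = (q * log(x0) + lgamma(1 - mu * q / 2) - q * lgamma(1 - mu / 2)
             + lgamma(2 + x0 - 2 * q) - lgamma(2 + x0 - q))
    for n in range(1, n_terms + 1):
        x = n * x0
        total += (2 * q * log(x) + 3 * lgamma(1 - q + x) + lgamma(2 - q + x)
                  - 3 * lgamma(1 + x) - lgamma(2 - 2 * q + x))
    return total


if __name__ == "__main__":
    mu = 0.3
    for q in (1, 2, 3):
        print(q, log_mellin_product(q, mu), log_selberg(q, mu))
```

For q = 1 both sides vanish, which reflects the normalization E[M] = 1.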
In particular, this representation implies novel functional equations for the Mellin transform that are stated and proved below. Finally, we use Eq. (3) to compute the conjectured densities of log M and M numerically. The main technical innovation of this paper is the derivation and use of a novel recurrence relation for the universal expansion coefficients. Its strength is in that it encodes both the representation of the coefficients as the alternating sum of Selberg integral’s derivatives and the Bell polynomial representation of these derivatives mentioned above into a single linear recurrence. We use the method of generating functions to evaluate the coefficients of this recurrence explicitly and relate them to Bernoulli polynomials. The plan of the paper is as follows. In Sect. 2 we give a brief review of our results on general intermittency expansions following [24]. In Sect. 3 we state and prove the recurrence relation for the expansion coefficients followed by the proof of Eq. (1) in Theorem 3.1. In Sect. 4 we treat the case of integral moments and derive Eq. (2) in Theorem 4.1. In Sect. 5 we establish the small intermittency asymptotic of Eq. (3) in Theorem 5.1. In Sect. 6 we state our main conjecture, explain its origin, show that it gives correct results in the special case of integral q, prove that it gives rise to a valid characteristic function for log M, and relate the Mellin transform to an infinite product of gamma factors. In Sect. 7 we use the conjecture to compute the probability density functions of log M and M numerically and compare them with the corresponding gaussian and lognormal densities, respectively. Conclusions are presented in Sect. 8. Appendix contains proofs of the infinite product representation and its corollaries. 2. Review of Intermittency Expansions In this section and throughout this paper we will write M to denote the limit lognormal distribution. It is a probability distribution on the positive real line M > 0, which can
be thought of as the value of the limit lognormal process at the time and decorrelation length equal to one. We will denote the intermittency parameter by 0 ≤ µ < 1.

The simplest approach to the limit lognormal process introduced by Bacry et al. [3] is to consider the exponential functional of a particular stationary gaussian process $\omega_\varepsilon(s)$ in the limit ε → 0. Specifically, let $\omega_\varepsilon(s)$ be a gaussian process in s, whose mean and covariance are functions of a finite scale ε > 0. Define them to be

\[ \mathbf{E}[\omega_\varepsilon(t)] \triangleq -\frac{\mu}{2}\,(1 - \log \varepsilon), \tag{5a} \]
\[ \mathrm{Cov}[\omega_\varepsilon(t), \omega_\varepsilon(s)] \triangleq -\mu \log|t - s|, \quad \varepsilon \leq |t - s| \leq 1, \tag{5b} \]
\[ \mathrm{Cov}[\omega_\varepsilon(t), \omega_\varepsilon(s)] \triangleq \mu \Bigl(1 - \log \varepsilon - \frac{|t - s|}{\varepsilon}\Bigr), \tag{5c} \]

if |t − s| < ε, and covariance is zero in the remaining case of |t − s| ≥ 1. Thus, ε is used as a truncation scale, and for simplicity we set the decorrelation length to one. The two key properties of this construction are, first, that $\mathbf{E}[\omega_\varepsilon(t)] = -\mathrm{Var}[\omega_\varepsilon(t)]/2$ so that $\mathbf{E}[\exp(\omega_\varepsilon(s))] = 1$ and, second, that $\mathrm{Var}[\omega_\varepsilon(t)] = \mu(1 - \log \varepsilon)$ is logarithmically divergent as ε → 0. The first property is essential for convergence, the second is responsible for multifractality, and both are originally due to Mandelbrot [17]. The interest in this construction stems from the ε → 0 limit of the exponential functional $M_\varepsilon(t) \triangleq \int_0^t \exp(\omega_\varepsilon(s))\,ds$. Using the theory of T-martingales developed by Kahane in a series of papers [12–14], and the work of Barral and Mandelbrot [6] on log-Poisson cascades, Bacry and Muzy [5] showed that $M_\varepsilon(t)$ converges weakly (as a measure on $\mathbb{R}_+$) a.s. to a limit process $M(t) = \lim_{\varepsilon \to 0} M_\varepsilon(t)$ provided 0 ≤ µ < 1, and the limit is nondegenerate in the sense that $\mathbf{E}[M(t)] = t$. This limit is known as the limit lognormal process, and its value at time t = 1, i.e. $M \triangleq M(1)$, is the limit lognormal distribution. We refer the reader to the original work by Bacry et al. [3 and 20] for further details of their construction including its remarkable statistical self-similarity and scaling properties, to [5] for the proof of existence, to [25] for an alternative approach to limit lognormal multifractality, and to [22 and 23] for reviews.

The positive integral moments of M were shown in [4] to be given by the celebrated Selberg integral, confer [26] and [1], 2 ≤ l < 2/µ,

\[
\mathbf{E}\bigl[M^l\bigr] = \int_0^1 \cdots \int_0^1 \prod_{i<j}^{l} |s_i - s_j|^{-\mu} \, ds^{(l)} = \prod_{k=0}^{l-1} \frac{\Gamma\bigl(1 - (k+1)\mu/2\bigr) \, \Gamma^2(1 - k\mu/2)}{\Gamma(1 - \mu/2) \, \Gamma\bigl(2 - (l+k-1)\mu/2\bigr)}, \tag{6}
\]
which from now on we will denote by $S_l(\mu)$. In general, it is shown in [5] that for q > 0 we have $\mathbf{E}[M^q] < \infty$ if q < 2/µ and, conversely, $\mathbf{E}[M^q] < \infty$ implies that q ≤ 2/µ. Let F(x) be an arbitrary smooth function that does not involve µ and let $F^{(k)}(x)$ denote its $k$th derivative. Our results on general intermittency expansions established in [24] are summarized in the following propositions.

Proposition 2.1. The expectation $\mathbf{E}[F(M)]$ has the formal expansion

\[
\mathbf{E}[F(M)] = F(1) + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} \sum_{k=2}^{2n} F^{(k)}(1) \, H_{n,k}. \tag{7}
\]
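Proposition 2.2. Given n = 1, 2, 3, ···, the coefficients $(-1)^l\, l!\, H_{n,l}$ and derivatives of the Selberg integral are binomial transforms of one another:

\[
H_{n,k} = \frac{(-1)^k}{k!} \sum_{l=2}^{k} (-1)^l \binom{k}{l} \frac{\partial^n S_l}{\partial \mu^n}\Big|_{\mu=0}, \tag{8a}
\]
\[
\frac{\partial^n S_l}{\partial \mu^n}\Big|_{\mu=0} = \sum_{t=2}^{l} \binom{l}{t} \, t! \, H_{n,t}. \tag{8b}
\]

Proposition 2.3. The expansion coefficients $H_{n,k}$ satisfy

\[
H_{n,k} = 0 \quad \forall k > 2n. \tag{9}
\]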
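The binomial-transform pair of Proposition 2.2 and the vanishing property of Proposition 2.3 can be illustrated at the lowest order n = 1. The sketch below is an illustration under a stated assumption: it uses the elementary fact that $\partial S_l/\partial\mu|_{\mu=0} = -\sum_{i<j}\int\!\!\int \log|s_i - s_j| = \tfrac{3}{2}\binom{l}{2}$, which follows from differentiating the integrand of the Selberg integral and from $\int_0^1\!\int_0^1 \log|s-t|\,ds\,dt = -3/2$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def selberg_first_derivative(l):
    """dS_l/dmu at mu = 0: each of the C(l, 2) pairs (i, j) contributes 3/2."""
    return Fraction(3, 2) * comb(l, 2)


def h_from_derivatives(k, derivative):
    """Eq. (8a): H_{n,k} from the n-th derivatives of S_l at mu = 0."""
    alternating = sum((-1) ** l * comb(k, l) * derivative(l)
                      for l in range(2, k + 1))
    return Fraction((-1) ** k, factorial(k)) * alternating


def derivative_from_h(l, h):
    """Eq. (8b): the n-th derivative of S_l at mu = 0 from the H_{n,t}."""
    return sum(comb(l, t) * factorial(t) * h(t) for t in range(2, l + 1))


if __name__ == "__main__":
    h1 = {k: h_from_derivatives(k, selberg_first_derivative) for k in range(2, 7)}
    print(h1)  # only k = 2 is non-zero, in line with Eq. (9) for n = 1
```

With this input the only non-zero first-order coefficient is $H_{1,2} = 3/4$, so that Eq. (7) gives $\mathbf{E}[F(M)] \approx F(1) + \tfrac{3}{4}\mu F''(1)$ to first order in µ.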
Proposition 2.4. Let $Y_n(x_1, \cdots, x_n)$ denote the complete exponential Bell polynomial of order n and ζ(x) the Riemann zeta function. By convention, we will write ζ(1) to mean Euler's constant. Define the sequence of coefficients $c_p(l)$ for p = 1, 2, ··· and l = 2, 3, ···

\[
c_p(l) = \frac{1}{p\,2^p} \Biggl[ \zeta(p) \sum_{j=0}^{l-1} \Bigl( (j+1)^p + 2j^p - 1 - (l+j-1)^p \Bigr) + \sum_{j=0}^{l-1} (l+j-1)^p \Biggr]. \tag{10}
\]

Then,

\[
\frac{\partial^n S_l}{\partial \mu^n}\Big|_{\mu=0} = Y_n\bigl(1!\,c_1(l), \ldots, n!\,c_n(l)\bigr). \tag{11}
\]

We end this section with several remarks. First, $H_{n,k}$ are the universal expansion coefficients, Eq. (8a) is their alternating sum representation, and Eq. (11) is the Bell polynomial representation of the derivatives that we referred to in the Introduction. Second, the intermittency expansion in Eq. (7) can be thought of as a linear operator acting on F(x) with the coefficients that are rational polynomials in values of the Riemann zeta at positive integers. It is an open question as to whether this operator has a number theoretic interpretation similar to that of the Gauss-Kuzmin-Wirsing operator. Third, the coefficient that multiplies ζ(p) in Eq. (10) vanishes if p = 1 ∀l so that Euler's constant does not enter.

3. Recurrence Relations

We begin by stating and proving the fundamental recurrence relation for the expansion coefficients $H_{n,k}$. Recall the definition of $c_p(l)$ in Proposition 2.4 above.

Proposition 3.1. The coefficients $H_{n,k}$ satisfy the recurrence relation

\[
H_{n+1,k} = A_{n,k} + \sum_{r=0}^{n-1} \binom{n}{r} \sum_{t=2}^{k} H_{n-r,\,t} \, B_{r,\,t,\,k}, \quad n \geq 0, \; k \geq 2, \tag{12}
\]
\[
A_{n,k} \triangleq (-1)^k \frac{(n+1)!}{k!} \sum_{l=2}^{k} (-1)^l \binom{k}{l} c_{n+1}(l), \tag{13}
\]
\[
B_{r,\,t,\,k} \triangleq (-1)^k \frac{(r+1)!}{k!} \, t! \sum_{l=t}^{k} (-1)^l \binom{k}{l} \binom{l}{t} c_{r+1}(l). \tag{14}
\]
Proof. By Eq. (8a) we have

\[
H_{n+1,k} = \frac{(-1)^k}{k!} \sum_{l=2}^{k} (-1)^l \binom{k}{l} \frac{\partial^{n+1} S_l}{\partial \mu^{n+1}}\Big|_{\mu=0}. \tag{15}
\]

Recall the recursion relation of complete Bell polynomials, confer [9], Chap. 11,

\[
Y_{n+1}(x_1, \cdots, x_{n+1}) = \sum_{r=0}^{n-1} \binom{n}{r} Y_{n-r}(x_1, \cdots, x_{n-r}) \, x_{r+1} + x_{n+1}. \tag{16}
\]
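The recursion in Eq. (16) translates directly into a short routine, and combined with Eq. (11) it can be tested against a case where the Selberg integral is elementary. The sketch below is illustrative only: it reads Eq. (10) as a ζ(p)-part plus a rational part, and it assumes the closed form $S_2(\mu) = 2/((1-\mu)(2-\mu))$, obtained by evaluating the product in Eq. (6) at l = 2, whose n-th derivative at µ = 0 is $n!\,(2 - 2^{-n})$. For l = 2 the ζ(p)-part of $c_p(l)$ vanishes identically, so the check is exact in rational arithmetic. Function names are assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def complete_bell(xs):
    """Return [Y_0, ..., Y_N] for xs = [x_1, ..., x_N] using Eq. (16)."""
    Y = [Fraction(1)]
    for n in range(len(xs)):
        # Y_{n+1} = sum_{r=0}^{n-1} C(n, r) Y_{n-r} x_{r+1} + x_{n+1}
        Y.append(sum(comb(n, r) * Y[n - r] * xs[r] for r in range(n)) + xs[n])
    return Y


def c_parts(p, l):
    """Split c_p(l) of Eq. (10) into (coefficient of zeta(p), rational part)."""
    zeta_part = sum((j + 1) ** p + 2 * j ** p - 1 - (l + j - 1) ** p
                    for j in range(l))
    rational_part = sum((l + j - 1) ** p for j in range(l))
    scale = Fraction(1, p * 2 ** p)
    return scale * zeta_part, scale * rational_part


if __name__ == "__main__":
    order = 6
    xs = [factorial(p) * c_parts(p, 2)[1] for p in range(1, order + 1)]
    Y = complete_bell(xs)
    for n in range(1, order + 1):
        print(n, Y[n], factorial(n) * (2 - Fraction(1, 2 ** n)))
```

For l ≥ 3 the ζ(p)-part no longer vanishes for p ≥ 2, and a numerical value of ζ(p) would be needed in place of exact rationals.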
It follows from this equation and Eq. (11) that we have the identity

\[
\frac{\partial^{n+1} S_l}{\partial \mu^{n+1}}\Big|_{\mu=0} = \sum_{r=0}^{n-1} \binom{n}{r} \frac{\partial^{n-r} S_l}{\partial \mu^{n-r}}\Big|_{\mu=0} (r+1)!\,c_{r+1}(l) + (n+1)!\,c_{n+1}(l). \tag{17}
\]

Substituting Eq. (8b) into Eq. (17), we get

\[
\frac{\partial^{n+1} S_l}{\partial \mu^{n+1}}\Big|_{\mu=0} = \sum_{r=0}^{n-1} \binom{n}{r} \sum_{t=2}^{l} \binom{l}{t} t!\,H_{n-r,\,t}\,(r+1)!\,c_{r+1}(l) + (n+1)!\,c_{n+1}(l). \tag{18}
\]

Finally, the result follows by substituting this equation back into Eq. (15) and changing the order of summation.

It is clear that Proposition 3.1 determines the $H_{n,k}$ uniquely as $H_{1,k} = A_{0,k}$. Next, we proceed to compute the alternating sums that occur in Eqs. (13) and (14).

Proposition 3.2. The coefficients $A_{n,k}$ and $B_{r,t,k}$ are

\[
A_{n,k} = \frac{n!}{k!\,2^{n+1}(n+2)} \frac{d^{n+2}}{dz^{n+2}}\Big|_{z=0} \frac{z}{e^z - 1} \Bigl[ \zeta(n+1) \Bigl( e^{z}(e^z - 1)^k + 2(e^z - 1)^k + e^{-z}(e^z - 1)^k - e^{-z}(e^{2z} - 1)^k \Bigr) + e^{-z}(e^{2z} - 1)^k - e^{-z}(e^z - 1)^k \Bigr], \tag{19}
\]
\[
B_{r,\,t,\,k} = \frac{r!}{k!\,2^{r+1}(r+2)}\, t! \Biggl\{ \zeta(r+1) \Biggl[ \binom{k}{t} \frac{d^{r+2}}{dz^{r+2}}\Big|_{z=0} \frac{z}{e^z - 1} \Bigl( e^{z(t+1)}(e^z - 1)^{k-t} + 2e^{zt}(e^z - 1)^{k-t} + e^{z(t-1)}(e^z - 1)^{k-t} - e^{z(2t-1)}(e^{2z} - 1)^{k-t} - 3\delta_{kt} \Bigr) - (r+2)\,k\,(\delta_{kt} + \delta_{k\,t+1}) \Biggr] + \binom{k}{t} \frac{d^{r+2}}{dz^{r+2}}\Big|_{z=0} \frac{z}{e^z - 1} \Bigl( e^{z(2t-1)}(e^{2z} - 1)^{k-t} - e^{z(t-1)}(e^z - 1)^{k-t} \Bigr) \Biggr\}. \tag{20}
\]

Proof. The starting point is the identity known as Faulhaber's formula that expresses the sum of powers in terms of Bernoulli polynomials $B_n(x)$,

\[
\sum_{j=x}^{y} j^p = \frac{B_{p+1}(y+1) - B_{p+1}(x)}{p+1}. \tag{21}
\]
Using the generating function of Bernoulli polynomials, it follows that $c_p(l)$ can be written as

\[ c_p(l) = \frac{1}{p(p+1)2^p}\Biggl\{\zeta(p)\Biggl(\frac{d^{p+1}}{dz^{p+1}}\Big|_{z=0}\Bigl[\frac{z}{e^z-1}\Bigl(e^{z(l+1)} + 2e^{zl} - 3 + e^{z(l-1)} - e^{z(2l-1)}\Bigr)\Bigr] - l(p+1)\Biggr) + \frac{d^{p+1}}{dz^{p+1}}\Big|_{z=0}\Bigl[\frac{z}{e^z-1}\Bigl(e^{z(2l-1)} - e^{z(l-1)}\Bigr)\Bigr]\Biggr\}. \tag{22} \]

We now substitute Eq. (22) in Eqs. (13) and (14) and compute the ensuing alternating sums

\[ \sum_{l=2}^{k}(-1)^l\binom{k}{l}e^{zl} = (1-e^z)^k + ke^z - 1, \tag{23} \]

\[ \sum_{l=t}^{k}(-1)^l\binom{k}{l}\binom{l}{t}e^{zl} = (-1)^t\binom{k}{t}e^{zt}(1-e^z)^{k-t}. \tag{24} \]

The rest of the proof follows from lengthy but straightforward algebraic manipulations.

We mention in passing that the derivatives that are involved in Eqs. (19) and (20) can all be evaluated in terms of Nörlund-Bernoulli polynomials. For our purposes, however, it is advantageous to write them in this form as will become clear shortly.

We now proceed to derive and solve a key recurrence relation that governs the intermittency expansion for the moments of $M$. The moments correspond to $F(x) = x^q$ in Eq. (7) for some given $q \in \mathbb{C}$. By Proposition 2.1, the intermittency expansion for the moments is

\[ E[M^q] = 1 + \sum_{n=1}^{\infty}\frac{\mu^n}{n!}\,f_n(q), \tag{25} \]

\[ f_n(q) = \sum_{k=2}^{\infty}(q)_k\,H_{n,k}, \quad n = 1, 2, 3, \cdots. \tag{26} \]

Here and throughout the rest of the paper we use the standard notation for the 'falling factorial' $(q)_k \triangleq q(q-1)(q-2)\cdots(q-k+1)$ so that the corresponding binomial coefficient satisfies $\binom{q}{k}\,k! = (q)_k$. Note that the upper limit of summation has been extended to infinity by Proposition 2.3.

Proposition 3.3. Let $f_0(q) = 1$ and define the coefficients $b_r(q)$, $r = 0, 1, 2, \cdots$,

\[ b_r(q) = \frac{1}{2^{r+1}}\Biggl[\zeta(r+1)\Biggl(\frac{B_{r+2}(q+1) + 2B_{r+2}(q) - 3B_{r+2}}{r+2} - q\Biggr) + \bigl(\zeta(r+1) - 1\bigr)\,\frac{B_{r+2}(q-1) - B_{r+2}(2q-1)}{r+2}\Biggr]. \tag{27} \]

Then, $f_n(q)$ satisfies the recurrence

\[ f_{n+1}(q) = n!\sum_{r=0}^{n}\frac{f_{n-r}(q)}{(n-r)!}\,b_r(q). \tag{28} \]
Proof. We begin by substituting the main recurrence in Eq. (12) into Eq. (26) and changing the order of summation in the second sum (recall that all the sums involved are finite, despite the notation)

\[ f_{n+1}(q) = \sum_{k=2}^{\infty}(q)_k\,A_{n,k} + \sum_{r=0}^{n-1}\binom{n}{r}\sum_{t=2}^{\infty}H_{n-r,t}\sum_{k=t}^{\infty}(q)_k\,B_{r,t,k}. \tag{29} \]

We now use Proposition 3.2 to evaluate $\sum_{k=2}^{\infty}(q)_k A_{n,k}$ and $\sum_{k=t}^{\infty}(q)_k B_{r,t,k}$. By Proposition 3.2 it is sufficient to sum the elementary series for $|a| < 1$,

\[ \sum_{k=2}^{\infty}a^k\,(q)_k/k! = (1+a)^q - qa - 1, \qquad \sum_{k=t}^{\infty}a^{k-t}\,(q)_k/(k-t)! = (q)_t\,(1+a)^{q-t}. \tag{30} \]

After straightforward algebraic reductions, we obtain

\[ \sum_{k=2}^{\infty}(q)_k\,A_{n,k} = n!\,b_n(q), \tag{31} \]

\[ \sum_{k=t}^{\infty}(q)_k\,B_{r,t,k} = (q)_t\,r!\,b_r(q). \tag{32} \]

The result follows.

Theorem 3.1. The moments have the following exact formal representation:

\[ E[M^q] = \exp\Biggl(\sum_{r=0}^{\infty}\frac{\mu^{r+1}}{r+1}\,b_r(q)\Biggr), \quad q \in \mathbb{C}. \tag{33} \]

Proof. The key point to observe is that the solution to the recurrence relation in Eq. (28) is

\[ f_n(q) = Y_n\bigl(b_0(q)\,0!,\ b_1(q)\,1!,\ \cdots,\ b_{n-1}(q)\,(n-1)!\bigr), \tag{34} \]

where $Y_n$ stands for the exponential Bell polynomial as in Sect. 2. This is proved by noticing that the recurrence in Eq. (28) satisfies the recurrence relation of Bell polynomials in Eq. (16). The result now follows from the well-known formula for the generating function of Bell polynomials, confer Chap. 11 in [9].

The significance of Theorem 3.1 is that it gives an exact closed-form expression for the moments. This solution is, however, formal as we have not yet said anything about convergence. This question is addressed in the next three sections.

4. Convergent Series

Our goal in this section is to sum the series in Theorem 3.1 in the special case of integral $q$ such that $-2/\mu + 1/2 < q < 2/\mu$, in which case the series is convergent. Thus, we will establish the validity of Eq. (2) for this particular range of moments and also show as a self-check that we do indeed recover Selberg's formula in Eq. (6). Throughout this section we work with the derivative series $\sum_{r=0}^{\infty}\mu^r b_r(q)$ instead of the one that is involved in Theorem 3.1 as it is technically easier to handle.
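The passage from the recurrence in Eq. (28) to the exponential in Theorem 3.1 can be illustrated with exact rational arithmetic. The sketch below is not part of the original argument: the function names and the test values of $b_r$ are arbitrary placeholders, chosen only to show that the numbers $f_n$ produced by Eq. (28) coincide with $n!$ times the Taylor coefficients of $\exp\bigl(\sum_r \mu^{r+1} b_r/(r+1)\bigr)$.

```python
from fractions import Fraction
from math import factorial

def moments_from_recurrence(b, n_max):
    """Return f_0..f_{n_max} from Eq. (28): f_{n+1} = n! * sum_r f_{n-r}/(n-r)! * b_r."""
    f = [Fraction(1)]
    for n in range(n_max):
        total = sum(f[n - r] / factorial(n - r) * b[r] for r in range(n + 1))
        f.append(factorial(n) * total)
    return f

def exp_series(b, n_max):
    """Taylor coefficients e_0..e_{n_max} in mu of exp(sum_r mu^(r+1) * b_r / (r+1))."""
    # a[m] is the coefficient of mu^m in the exponent.
    a = [Fraction(0)] + [Fraction(b[r]) / (r + 1) for r in range(n_max)]
    e = [Fraction(1)]
    # If E = exp(A) then E' = A' E, i.e. m * e_m = sum_{j=1}^{m} j * a_j * e_{m-j}.
    for m in range(1, n_max + 1):
        e.append(sum(j * a[j] * e[m - j] for j in range(1, m + 1)) / m)
    return e

# Arbitrary placeholder values standing in for b_0(q), ..., b_4(q).
b = [Fraction(3, 2), Fraction(-1, 3), Fraction(5, 7), Fraction(2), Fraction(-4, 9)]
N = len(b)
f = moments_from_recurrence(b, N)
e = exp_series(b, N)
matches = all(f[n] == factorial(n) * e[n] for n in range(N + 1))
print(matches)
```

Because both routines use `Fraction`, the comparison is exact, so the agreement does not depend on the particular values chosen for `b`.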
Proposition 4.1. Let $q = -n$, $n = 1, 2, 3, \cdots < 2/\mu - 1/2$. Then, the sum of the series is

\[ \sum_{r=0}^{\infty}\mu^r\,b_r(q) = \frac{\partial}{\partial\mu}\log\prod_{k=0}^{n-1}\frac{\Gamma\bigl(2 + (n+2+k)\mu/2\bigr)\,\Gamma(1-\mu/2)}{\Gamma^2\bigl(1 + (k+1)\mu/2\bigr)\,\Gamma(1 + k\mu/2)}. \tag{35} \]

Proof. The starting point is a generalization of Faulhaber's formula in Eq. (21),

\[ \frac{B_{p+1}(-x) - B_{p+1}(-y)}{p+1} = \sum_{j=x+1}^{y}(-j)^p, \quad 0 < x < y, \tag{36} \]

which follows from Eq. (21) by means of the identity $B_{p+1}(-x) = (-1)^{p+1}\bigl(B_{p+1}(x) + (p+1)x^p\bigr)$ that is satisfied by Bernoulli polynomials, confer [30], Chap. 1. By the definition of $b_r(q)$, we obtain

\[ b_r(-n) = \frac{1}{2^{r+1}}\Biggl[\zeta(r+1)\Biggl((-n)^{r+1} - 3\sum_{k=1}^{n}(-k)^{r+1} + n\Biggr) + \bigl(\zeta(r+1) - 1\bigr)\sum_{k=n+2}^{2n+1}(-k)^{r+1}\Biggr]. \tag{37} \]

Recall the identities, confer Sect. 3.4 of [27], that relate the Riemann zeta function to the digamma function (recall that $\zeta(1)$ denotes Euler's constant),

\[ \sum_{r=1}^{\infty}\zeta(r)\,t^{r-1} = -\psi(1-t), \quad |t| < 1, \tag{38} \]

\[ \sum_{r=1}^{\infty}\bigl(\zeta(r) - 1\bigr)\,t^{r-1} = -\psi(1-t) - \frac{1}{1-t}, \quad |t| < 2. \tag{39} \]

We now change the index of summation in $\sum_{r=0}^{\infty}\mu^r b_r(-n)$ from $r$ to $r+1$, change the order of summation over $k$ and $r$, and evaluate the ensuing sums over $r$ by means of Eqs. (38) and (39). We obtain for the sum of the series $\sum_{r=0}^{\infty}\mu^r b_r(-n)$,

\[ \frac{1}{2}\Biggl[n\,\psi\Bigl(1 + \frac{\mu n}{2}\Bigr) - 3\sum_{k=1}^{n}k\,\psi\Bigl(1 + \frac{\mu k}{2}\Bigr) - n\,\psi\Bigl(1 - \frac{\mu}{2}\Bigr) + \sum_{k=n+2}^{2n+1}\Biggl(k\,\psi\Bigl(1 + \frac{\mu k}{2}\Bigr) + \frac{k}{1 + \mu k/2}\Biggr)\Biggr]. \tag{40} \]

It is elementary to verify that the expressions in Eqs. (35) and (40) are the same.
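The agreement of Eqs. (35) and (40) can also be spot-checked numerically. The following is a minimal sketch under stated assumptions rather than part of the proof: it assumes that the factors in Eq. (35) are gamma functions, approximates the digamma function by a central difference of `math.lgamma` (the standard library has no digamma), and uses hypothetical helper names throughout.

```python
from math import lgamma

def digamma(x, h=1e-5):
    """Central-difference approximation to psi(x) = d/dx log Gamma(x)."""
    return (lgamma(x + h) - lgamma(x - h)) / (2 * h)

def log_product(n, mu):
    """Logarithm of the gamma-function product that appears in Eq. (35)."""
    total = 0.0
    for k in range(n):
        total += lgamma(2 + (n + 2 + k) * mu / 2) + lgamma(1 - mu / 2)
        total -= 2 * lgamma(1 + (k + 1) * mu / 2) + lgamma(1 + k * mu / 2)
    return total

def lhs_eq35(n, mu, h=1e-5):
    """Numerical mu-derivative of the logarithm of the product in Eq. (35)."""
    return (log_product(n, mu + h) - log_product(n, mu - h)) / (2 * h)

def rhs_eq40(n, mu):
    """The digamma expression in Eq. (40)."""
    s = n * digamma(1 + mu * n / 2) - n * digamma(1 - mu / 2)
    s -= 3 * sum(k * digamma(1 + mu * k / 2) for k in range(1, n + 1))
    s += sum(k * digamma(1 + mu * k / 2) + k / (1 + mu * k / 2)
             for k in range(n + 2, 2 * n + 2))
    return s / 2

for n in (1, 2, 3):
    print(n, lhs_eq35(n, 0.3), rhs_eq40(n, 0.3))
```

The finite-difference step limits the agreement to several significant digits, which is enough to catch a misplaced term or sign.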
Theorem 4.1. The negative integral moments satisfy Eq. (2) for $n = 1, \cdots < 2/\mu - 1/2$.

Proof. The moments solve the equation $\frac{\partial}{\partial\mu}E[M^q] = E[M^q]\sum_{r=0}^{\infty}\mu^r b_r(q)$, subject to $E[M^q](\mu = 0) = 1$, by Theorem 3.1. By Proposition 4.1, so does the product formula in Eq. (2).
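Proposition 4.2. Let $q = n$, $n = 1, 2, 3, \cdots < 2/\mu$. Then, the sum of the series is

\[ \sum_{r=0}^{\infty}\mu^r\,b_r(q) = \frac{\partial}{\partial\mu}\log\prod_{k=0}^{n-1}\frac{\Gamma\bigl(1 - (k+1)\mu/2\bigr)\,\Gamma^2(1 - k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\bigl(2 - (n+k-1)\mu/2\bigr)}. \tag{41} \]

Proof. The same argument as in the proof of Proposition 4.1 gives for the sum of the series

\[ \frac{1}{2}\Biggl[-n\,\psi\Bigl(1 - \frac{\mu n}{2}\Bigr) - 3\sum_{k=0}^{n-1}k\,\psi\Bigl(1 - \frac{\mu k}{2}\Bigr) + n\,\psi\Bigl(1 - \frac{\mu}{2}\Bigr) + \sum_{k=n-1}^{2n-2}\Biggl(k\,\psi\Bigl(1 - \frac{\mu k}{2}\Bigr) + \frac{k}{1 - \mu k/2}\Biggr)\Biggr]. \tag{42} \]

It is now easy to verify that the expressions in Eqs. (41) and (42) are the same.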
Theorem 4.2. The positive integral moments satisfy Eq. (6) for $n = 1, 2, 3, \cdots < 2/\mu$.

5. Asymptotic Series

In this section we will consider the series in Theorem 3.1 for arbitrary $q \in \mathbb{C}$. The series is a divergent asymptotic series unless $q$ is integral. Our goal is to prove Eq. (3), which gives the sum of the series in the limit of small intermittency. In other words, we will compute the sum by finding a particular integral function whose asymptotic expansion in the limit of zero intermittency coincides with the series. We begin with a preliminary result that establishes the asymptotic expansion of two particular integrals and is the basis of our summation method.

Proposition 5.1. Let $0 \le \mu < 1$, $q \in \mathbb{C}$. Then, in the limit $\mu \to +0$,

\[ \sum_{r=0}^{\infty}\Bigl(\frac{\mu}{2}\Bigr)^{r+1}\frac{1}{r+1}\,\frac{B_{r+2}(q) - B_{r+2}}{r+2} \sim \int_0^{\infty}\frac{e^{-x}}{x}\Biggl[\frac{e^{\frac{\mu x}{2}q} - 1}{e^{\frac{\mu x}{2}} - 1} - q\Biggr]dx, \tag{43} \]

\[ \sum_{r=1}^{\infty}\Bigl(\frac{\mu}{2}\Bigr)^{r+1}\frac{\zeta(r+1)}{r+1}\,\frac{B_{r+2}(q) - B_{r+2}}{r+2} \sim \int_0^{\infty}\frac{dx}{(e^x-1)\,x}\Biggl[\frac{e^{\frac{\mu x}{2}q} - 1}{e^{\frac{\mu x}{2}} - 1} - q - \frac{(q^2-q)}{2}\,\frac{\mu x}{2}\Biggr]. \tag{44} \]

Proof. We start with the well-known identity

\[ y\,\frac{e^{yq} - 1}{e^y - 1} = \sum_{r=1}^{\infty}\bigl(B_r(q) - B_r\bigr)\frac{y^r}{r!} \tag{45} \]

that is valid for $|y| < 2\pi$, hence valid asymptotically as $y \to 0$. Using that $B_1(q) - B_1 = q$, we obtain in the limit $y \to 0$,

\[ f(y) \triangleq \frac{1}{y}\Biggl[\frac{e^{yq} - 1}{e^y - 1} - q\Biggr] \sim \sum_{r=0}^{\infty}\frac{B_{r+2}(q) - B_{r+2}}{(r+2)!}\,y^r. \tag{46} \]
Finally, as the integral $\int_0^{\infty}\exp(-2y/\mu)\,f(y)\,dy$ equals the integral on the right-hand side of Eq. (43), the result follows by Watson's lemma, confer Theorem 3.1 in Chap. 3 of [21]. The proof of Eq. (44) is quite similar and requires a slight generalization of Watson's lemma in the following form. Given a function that has the asymptotic expansion $f(y) \sim \sum_{r=1}^{\infty}a_r y^r$ as $y \to 0$, then as $z \to +\infty$,

\[ \int_0^{\infty}\frac{1}{e^{zy} - 1}\,f(y)\,dy \sim \sum_{r=1}^{\infty}\zeta(r+1)\,r!\,a_r/z^{r+1}, \tag{47} \]

confer Lemma 10.2 in Chap. 38 of [8]. Now, using that $B_2(q) - B_2 = q^2 - q$, we have in the limit $y \to 0$

\[ f(y) \triangleq \frac{1}{y}\Biggl[\frac{e^{yq} - 1}{e^y - 1} - q - (q^2 - q)\,\frac{y}{2}\Biggr] \sim \sum_{r=1}^{\infty}\frac{B_{r+2}(q) - B_{r+2}}{(r+2)!}\,y^r. \tag{48} \]

The result now follows from Eq. (47).
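The expansion in Eq. (46), on which Watson's lemma is applied, is easy to check for small $y$. The sketch below is illustrative only and assumes the usual convention $B_1 = -1/2$ (so that $B_n = B_n(0)$); the helper names are hypothetical. It builds the Bernoulli numbers from their defining recurrence, evaluates the truncated series, and compares it with the closed form of $f(y)$.

```python
from fractions import Fraction
from math import comb, exp, factorial

def bernoulli_numbers(n_max):
    """B_0..B_{n_max} with B_1 = -1/2, from sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B

def bernoulli_poly(n, x, B):
    """B_n(x) = sum_{k=0}^{n} C(n, k) B_k x^(n-k)."""
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))

def f_exact(y, q):
    """Left-hand side of Eq. (46)."""
    return ((exp(y * q) - 1) / (exp(y) - 1) - q) / y

def f_series(y, q, terms, B):
    """Right-hand side of Eq. (46), truncated after `terms` terms."""
    return sum(float(bernoulli_poly(r + 2, q, B) - B[r + 2]) * y ** r / factorial(r + 2)
               for r in range(terms))

B = bernoulli_numbers(22)
q = Fraction(3, 10)  # arbitrary sample value of q
y = 0.5              # well inside the radius of convergence 2*pi
print(f_exact(y, float(q)), f_series(y, q, 20, B))
```

For real $q$ the series converges for $|y| < 2\pi$, so twenty terms at $y = 0.5$ already agree with the closed form to machine precision.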
We can naturally think of Eqs. (43) and (44) as defining the sums of the divergent series involved. The summation method in Eq. (43) is known as Borel's and the one in Eq. (44) is its close kin known as Hardy's Moment Constant Method, confer Sect. 4.12 and 4.13 in [11]. We now proceed to the main result of this section that gives an asymptotic formula for the sum of the series in Theorem 3.1.

Theorem 5.1. Let $b_r(q)$ be as in Proposition 3.3 and $q \in \mathbb{C}$. Then, as $\mu \to +0$,

\[ E[M^q] \sim \exp\Biggl(\int_0^{\infty}\frac{dx}{x}\Biggl\{\frac{1}{e^x - 1}\Biggl[\frac{e^{\frac{\mu x}{2}(q+1)} + 2e^{\frac{\mu x}{2}q} - 3 + e^{\frac{\mu x}{2}(q-1)} - e^{\frac{\mu x}{2}(2q-1)}}{e^{\frac{\mu x}{2}} - 1} - \Bigl(1 + q + q\,e^{\frac{\mu x}{2}}\Bigr)\Biggr] + e^{-x}\Biggl[\frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q\Biggr]\Biggr\}\Biggr). \tag{49} \]

Proof. It is sufficient to show that $\sum_{r=0}^{\infty}\mu^{r+1}b_r(q)/(r+1)$ is asymptotic to the integral on the right-hand side of Eq. (49) as asymptotic series can be exponentiated, confer Sect. 65 of [16]. The expression in Eq. (27) is a linear combination of terms that are of the same types as in Proposition 5.1 with the exception of the $q$ term, which can be treated using the identity

\[ \sum_{r=1}^{\infty}\Bigl(\frac{\mu}{2}\Bigr)^{r+1}\frac{\zeta(r+1)}{r+1} = \int_0^{\infty}\Bigl(e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2}\Bigr)\frac{dx}{(e^x - 1)\,x}. \tag{50} \]

The result follows by an elementary algebraic reduction.
6. The Conjecture and its Corollaries

In this section we will state our conjecture, explain its origin, show that it is consistent with the known values of integral moments, and discuss its implications. Throughout this section we denote by $D(q)$ the integral on the right-hand side of Eq. (49) so that the content of Theorem 5.1 is that $E[M^q] \sim \exp(D(q))$ as $\mu \to +0$.

Conjecture. The equality $E[M^q] = \exp(D(q))$ holds for $0 \le \mu < 1$ and $\Re(q) < 2/\mu$.

The origin of this conjecture is as follows. For a fixed $q \in \mathbb{C}$ define two functions of $z \in \mathbb{C}$,

\[ g(z) \triangleq \frac{z}{e^z - 1}\Bigl(e^{z(q+1)} + 2e^{zq} - 3 + e^{z(q-1)} - e^{z(2q-1)}\Bigr), \tag{51} \]

\[ h(z) \triangleq \frac{z}{e^z - 1}\Bigl(e^{z(2q-1)} - e^{z(q-1)}\Bigr). \tag{52} \]

It is clear that these functions are meromorphic in $z$, and $z = 0$ is a removable singularity with $g(0) = h(0) = 0$. Now, the integral in Eq. (49) can be written as

\[ D(q) = \int_0^{\infty}\frac{dx}{x}\Biggl\{\frac{1}{e^x - 1}\Biggl[\frac{1}{\mu x/2}\Bigl(g\Bigl(\frac{\mu x}{2}\Bigr) - g'(0)\,\frac{\mu x}{2} - \frac{1}{2}\,g''(0)\Bigl(\frac{\mu x}{2}\Bigr)^2\Bigr) - q\Bigl(e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2}\Bigr)\Biggr] + e^{-x}\,\frac{1}{\mu x/2}\Bigl(h\Bigl(\frac{\mu x}{2}\Bigr) - h'(0)\,\frac{\mu x}{2}\Bigr)\Biggr\}. \tag{53} \]

The Taylor series at $z = 0$ for the integrands in Eq. (53) are

\[ \frac{1}{z}\Bigl(g(z) - g'(0)\,z - \frac{1}{2}\,g''(0)\,z^2\Bigr) - q\bigl(e^z - 1 - z\bigr) = \sum_{r=1}^{\infty}\Bigl(g^{(r+2)}(0) - q(r+2)\Bigr)\frac{z^{r+1}}{(r+2)!}, \tag{54} \]

\[ \frac{1}{z}\Bigl(h(z) - h'(0)\,z\Bigr) = \sum_{r=0}^{\infty}h^{(r+2)}(0)\,\frac{z^{r+1}}{(r+2)!}. \tag{55} \]

Using

\[ g^{(n)}(0) = B_n(q+1) + 2B_n(q) - 3B_n + B_n(q-1) - B_n(2q-1), \tag{56} \]

\[ h^{(n)}(0) = B_n(2q-1) - B_n(q-1), \tag{57} \]

if we now substitute Eqs. (54) and (55) at $z = \mu x/2$ into Eq. (53) and integrate term by term, we obtain precisely the series $\sum_{r=0}^{\infty}\mu^{r+1}b_r(q)/(r+1)$. This is the origin of our conjecture.

It is worth emphasizing that the conjecture "explains" the source of divergence of $\sum_{r=0}^{\infty}\mu^{r+1}b_r(q)/(r+1)$ for non-integral $q$. In fact, when $q$ is integral, the functions $g(z)$ and $h(z)$ are entire so that the substitution of the Taylor expansions in Eqs. (54) and (55) into Eq. (53) is legitimate. For non-integral $q$, these series have finite radii of convergence, and the substitution leads to a divergent asymptotic series.

Aside from the proven correctness of our conjecture in the limit of small intermittency, we also have strong analytic evidence that supports it in the case of the general intermittency level. Specifically, the following two propositions show that Eq. (49) recovers the values of the integral moments that were computed previously in Theorems 4.1 and 4.2, and that it gives rise to a valid characteristic function when $q$ is purely imaginary.
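Theorem 6.1. Let $0 \le \mu < 1$ and $n = 1, 2, 3, \cdots$. Then, for $q = \pm n$, the integral in Eq. (49) gives

\[ E[M^n] = \prod_{k=0}^{n-1}\frac{\Gamma\bigl(1 - (k+1)\mu/2\bigr)\,\Gamma^2(1 - k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\bigl(2 - (n+k-1)\mu/2\bigr)}, \quad n < 2/\mu, \tag{58} \]

\[ E[M^{-n}] = \prod_{k=0}^{n-1}\frac{\Gamma\bigl(2 + (n+2+k)\mu/2\bigr)\,\Gamma(1-\mu/2)}{\Gamma^2\bigl(1 + (k+1)\mu/2\bigr)\,\Gamma(1 + k\mu/2)}, \quad \forall n. \tag{59} \]

Proof. If $q$ is integral, then the complicated fractions that are involved in Eq. (49) become finite sums of exponentials. The starting point is the algebraic identity

\[ \frac{1}{e^z - 1}\Bigl(e^{z(q+1)} + 2e^{zq} - 3 + e^{z(q-1)} - e^{z(2q-1)}\Bigr) - \bigl(1 + q + q\,e^z\bigr) = e^{zq} - 1 - q(e^z - 1) + 2\Bigl(\frac{e^{zq} - 1}{e^z - 1} - q\Bigr) - \frac{e^{zq} - 1}{e^z - 1}\Bigl(e^{z(q-1)} - 1\Bigr). \tag{60} \]

Consider first the case of $q = n$, $n = 1, 2, 3, \cdots$. Then,

\[ \frac{e^{zn} - 1}{e^z - 1} = \sum_{k=0}^{n-1}e^{zk}, \qquad \frac{e^{z(2n-1)} - e^{z(n-1)}}{e^z - 1} = \sum_{k=0}^{n-1}e^{z(n+k-1)}. \tag{61} \]

Denoting $z = \mu x/2$, the integral in Eq. (49) simplifies to

\[ D(n) = \int_0^{\infty}\frac{dx}{x}\Biggl\{\frac{1}{e^x - 1}\Biggl[e^{zn} - 1 - n(e^z - 1) + 2\sum_{k=0}^{n-1}\bigl(e^{zk} - 1\bigr) - \sum_{k=0}^{n-1}\bigl(e^{z(n+k-1)} - e^{zk}\bigr)\Biggr] + e^{-x}\sum_{k=0}^{n-1}\bigl(e^{z(n+k-1)} - 1\bigr)\Biggr\}. \tag{62} \]

By means of the identities, confer Sect. 1.2 of [27],

\[ \log\Gamma(1+s) = \int_0^{\infty}\Biggl(\frac{e^{-ts} - 1}{e^t - 1} + s\,e^{-t}\Biggr)\frac{dt}{t} \quad \text{(Malmstén)}, \quad \Re(s) > -1, \tag{63} \]

\[ \log(s) = \int_0^{\infty}\bigl(e^{-t} - e^{-ts}\bigr)\frac{dt}{t} \quad \text{(Frullani)}, \quad \Re(s) > 0, \tag{64} \]

it is now not too difficult to show that the integral equals

\[ D(n) = \log\prod_{k=0}^{n-1}\frac{\Gamma\bigl(1 - (k+1)\mu/2\bigr)\,\Gamma^2(1 - k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\bigl(2 - (n+k-1)\mu/2\bigr)}. \tag{65} \]

This completes the proof of Eq. (58).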
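As an illustration of the product in Eq. (58), the sketch below evaluates it with `math.gamma`, assuming that the factors are gamma functions. For $n = 1$ every factor cancels, and for $n = 2$ the functional equation $\Gamma(z+1) = z\,\Gamma(z)$ collapses the product to $1/\bigl((1-\mu)(1-\mu/2)\bigr)$, which gives two closed forms to compare against. The function name is a placeholder and not taken from the text.

```python
from math import gamma

def selberg_moment(n, mu):
    """Product on the right-hand side of Eq. (58) for a positive integer n < 2/mu."""
    prod = 1.0
    for k in range(n):
        num = gamma(1 - (k + 1) * mu / 2) * gamma(1 - k * mu / 2) ** 2
        den = gamma(1 - mu / 2) * gamma(2 - (n + k - 1) * mu / 2)
        prod *= num / den
    return prod

mu = 0.5
first = selberg_moment(1, mu)                 # all factors cancel for n = 1
second = selberg_moment(2, mu)                # product for n = 2
closed_form = 1 / ((1 - mu) * (1 - mu / 2))   # n = 2 after using Gamma(z+1) = z Gamma(z)
print(first, second, closed_form)
```

The restriction $n < 2/\mu$ keeps every gamma argument in the numerator positive, so the direct evaluation is safe in that range.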
The case of $q = -n$ is very similar. Instead of Eq. (61) we now have

\[ \frac{e^{-zn} - 1}{e^z - 1} = -\sum_{k=0}^{n-1}e^{-z(k+1)}, \qquad \frac{e^{-z(2n+1)} - e^{-z(n+1)}}{e^z - 1} = -\sum_{k=0}^{n-1}e^{-z(n+k+2)}, \tag{66} \]

so that the integral in Eq. (49) reduces to

\[ D(-n) = \int_0^{\infty}\frac{dx}{x}\Biggl\{\frac{1}{e^x - 1}\Biggl[e^{-zn} - 1 + n(e^z - 1) - 2\sum_{k=0}^{n-1}\bigl(e^{-z(k+1)} - 1\bigr) + \sum_{k=0}^{n-1}\bigl(e^{-z(n+k+2)} - e^{-z(k+1)}\bigr)\Biggr] - e^{-x}\sum_{k=0}^{n-1}\bigl(e^{-z(n+k+2)} - 1\bigr)\Biggr\}. \tag{67} \]

Again, the integral is evaluated using the identities in Eqs. (63) and (64) resulting in

\[ D(-n) = \log\prod_{k=0}^{n-1}\frac{\Gamma\bigl(2 + (n+2+k)\mu/2\bigr)\,\Gamma(1-\mu/2)}{\Gamma^2\bigl(1 + (k+1)\mu/2\bigr)\,\Gamma(1 + k\mu/2)}. \tag{68} \]

This completes the proof of Eq. (59).
Remark 6.1. It is not too difficult to see from Eq. (59) that $\log E[M^{-n}]$ grows as $n^2$ as $n \to +\infty$. This is the same rate of growth as that of the moments of the lognormal distribution. As the lognormal moment problem is well-known to be indeterminate, confer [7], it is natural to ask if the same conclusion holds for $M^{-1}$. It does, as shown in Corollary 6.1 below. This, however, cannot be concluded from the rate of growth of the moments alone because the converse to the Carleman criterion of determinacy is false in general.

Having considered the case of integral $q$, we now turn our attention to that of purely imaginary $q$ in Eq. (49), which corresponds to the characteristic function of $\log M$. Formally, we have the identity

\[ E\bigl[e^{iq\log M}\bigr] = \exp\bigl(D(iq)\bigr), \quad q \in \mathbb{R}. \tag{69} \]

Thus, a necessary condition for the validity of our conjecture is that $\exp(D(iq))$ be a valid characteristic function. This is the result of the next proposition.

Theorem 6.2. For any $0 \le \mu < 1$ the function $q \to \exp(D(iq))$, $q \in \mathbb{R}$, is the characteristic function of an infinitely divisible distribution.

Proof. We will prove the assertion by reducing $D(iq)$ to the Lévy-Khinchine functional form. Fix a $q \in \mathbb{R}$ and introduce the following functions of $x \in \mathbb{R}_+$

\[ F(x) \triangleq \Bigl(e^{iq\frac{\mu x}{2}} - 1 - iq\,\frac{\mu x}{2}\Bigr)\Bigl(e^{\frac{\mu x}{2}} + 2 + e^{-\frac{\mu x}{2} - x}\Bigr), \tag{70} \]

\[ G(x) \triangleq \Bigl(e^{2iq\frac{\mu x}{2}} - 1 - 2iq\,\frac{\mu x}{2}\Bigr)\,e^{-\frac{\mu x}{2} - x}, \tag{71} \]

\[ H(x) \triangleq \frac{\mu x}{2}\Bigl(e^{\frac{\mu x}{2}} + 2 - e^{-\frac{\mu x}{2} - x}\Bigr) - \Bigl(2 + e^{\frac{\mu x}{2}} - e^{-x}\Bigr)\Bigl(e^{\frac{\mu x}{2}} - 1\Bigr). \tag{72} \]
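Then, it is not hard to see that the integral in Eq. (49) can be written as

\[ D(iq) = \int_0^{\infty}\frac{dx}{x}\Biggl[\frac{F(x) - G(x) + iq\,H(x)}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)}\Biggr]. \tag{73} \]

Now, using that $G(0) = G'(0) = 0$, we can split the integral into two as follows

\[ \int_0^{\infty}\frac{dx}{x}\Biggl[\frac{F(x) - G''(0)\,x^2/2 + iq\,H(x)}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)}\Biggr] - \int_0^{\infty}\frac{dx}{x}\Biggl[\frac{G(x) - G''(0)\,x^2/2}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)}\Biggr]. \tag{74} \]

Next, we make a change of variables $x' = 2x$ in the second integral and then put the two integrals back under the same integral sign. Note that $H(x) = O(x^3)$ as $x \to 0$ and $G''(0) = -q^2\mu^2$. There results

\[ D(iq) = \int_0^{\infty}\frac{dx}{x}\Bigl(e^{iq\frac{\mu x}{2}} - 1 - iq\,\frac{\mu x}{2}\Bigr)\Biggl[\frac{e^{\frac{\mu x}{2}} + 2 + e^{-\frac{\mu x}{2} - x}}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)} - \frac{e^{-\frac{\mu x}{4} - \frac{x}{2}}}{\bigl(e^{\frac{x}{2}} - 1\bigr)\bigl(e^{\frac{\mu x}{4}} - 1\bigr)}\Biggr] \tag{75} \]

\[ \qquad - \frac{q^2\mu^2}{2}\int_0^{\infty}\frac{dx}{x}\Biggl[\frac{(x/2)^2}{\bigl(e^{\frac{x}{2}} - 1\bigr)\bigl(e^{\frac{\mu x}{4}} - 1\bigr)} - \frac{x^2}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)}\Biggr] \tag{76} \]

\[ \qquad + iq\int_0^{\infty}\frac{dx}{x}\Biggl[\frac{H(x)}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)}\Biggr]. \tag{77} \]

The key observation is that the functions that are involved in Eqs. (75) and (76) are non-negative. That is, it is not difficult to show by inspection that

\[ \frac{e^{\frac{\mu x}{2}} + 2 + e^{-\frac{\mu x}{2} - x}}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)} - \frac{e^{-\frac{\mu x}{4} - \frac{x}{2}}}{\bigl(e^{\frac{x}{2}} - 1\bigr)\bigl(e^{\frac{\mu x}{4}} - 1\bigr)} \ge 0, \tag{78} \]

\[ \frac{(x/2)^2}{\bigl(e^{\frac{x}{2}} - 1\bigr)\bigl(e^{\frac{\mu x}{4}} - 1\bigr)} - \frac{x^2}{(e^x - 1)\bigl(e^{\frac{\mu x}{2}} - 1\bigr)} \ge 0. \tag{79} \]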
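The non-negativity asserted in Eqs. (78) and (79) can be probed on a grid. This is a numerical illustration only, not a substitute for the inspection argument; the grid, the sample values of µ, and the function names are arbitrary choices. `math.expm1` is used to evaluate $e^t - 1$ accurately for small $t$.

```python
from math import exp, expm1

def kernel_78(x, mu):
    """Left-hand side of Eq. (78)."""
    z = mu * x / 2
    first = (exp(z) + 2 + exp(-z - x)) / (expm1(x) * expm1(z))
    second = exp(-z / 2 - x / 2) / (expm1(x / 2) * expm1(z / 2))
    return first - second

def kernel_79(x, mu):
    """Left-hand side of Eq. (79)."""
    first = (x / 2) ** 2 / (expm1(x / 2) * expm1(mu * x / 4))
    second = x ** 2 / (expm1(x) * expm1(mu * x / 2))
    return first - second

grid = [0.05 * j for j in range(1, 801)]  # x from 0.05 to 40
min_78 = min(kernel_78(x, mu) for mu in (0.1, 0.5, 0.9) for x in grid)
min_79 = min(kernel_79(x, mu) for mu in (0.1, 0.5, 0.9) for x in grid)
print(min_78, min_79)
```

Both minima are attained at the right end of the grid, where the kernels decay exponentially but remain positive.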
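Finally, the change of variables $u = \mu x/2$ in Eq. (75) and simple re-arrangement of the terms bring $D(iq)$ to the canonical form, confer Theorem 4.4 in Chap. 4 of [28],

\[ D(iq) = iqa - \frac{1}{2}\,q^2\sigma^2 + \int_{\mathbb{R}\setminus\{0\}}\Bigl(e^{iqu} - 1 - \frac{iqu}{1+u^2}\Bigr)\,d\mathcal{M}(u) \tag{80} \]

for some constants $a \in \mathbb{R}$ and $\sigma^2 > 0$ depending on µ, and the spectral function $\mathcal{M}(u)$ that is defined by

\[ \mathcal{M}(u) = -\int_u^{\infty}\frac{du}{u}\Biggl[\frac{e^{u} + 2 + e^{-u - \frac{2u}{\mu}}}{\bigl(e^{\frac{2u}{\mu}} - 1\bigr)(e^u - 1)} - \frac{e^{-\frac{u}{2} - \frac{u}{\mu}}}{\bigl(e^{\frac{u}{\mu}} - 1\bigr)\bigl(e^{\frac{u}{2}} - 1\bigr)}\Biggr], \quad u > 0, \tag{81} \]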
and $\mathcal{M}(u) = 0$ for $u < 0$. Note that $\mathcal{M}(u)$ is continuous and non-decreasing on $(-\infty, 0)$ and $(0, \infty)$ by construction, and satisfies the required integrability and limit conditions $\int_{[-1,1]\setminus\{0\}}u^2\,d\mathcal{M}(u) < \infty$ and $\lim_{u\to\pm\infty}\mathcal{M}(u) = 0$. It follows that $D(iq)$ is precisely of the Lévy-Khinchine functional form.

Corollary 6.1. The moment problem for $\log M$ is determinate and those for $M$ and $M^{-1}$ are indeterminate.

Proof. The argument is based on the general theory of tail asymptotics of infinitely divisible distributions. The two key properties of $\log M$ that we need are that its spectral function $\mathcal{M}(u)$ is concentrated on $\mathbb{R}_+$ and decays as $\exp(-2u/\mu)/u$ up to a constant in the limit $u \to +\infty$, confer Eq. (81). We conclude that the left tail $u \to -\infty$ of $\log M$ is gaussian, confer Theorem 9.7 in Chap. 4 of [28]. On the other hand, it follows from Theorem 2 in [2] that the right tail of $\log M$ satisfies $\log P(\log M > u) \sim -2(u-1)/\mu$ as $u \to +\infty$. It is immediate from here that the moment problem for $\log M$ is determinate. Indeed, the moments of $\log M$ grow no faster than those of the exponential distribution, which are known to satisfy the Carleman criterion, confer [7]. On the other hand, $\log P(M > u) \sim -2(\log u - 1)/\mu$ as $u \to +\infty$ so that the right tail of $M$ follows a power law, while its left tail $u \to +0$ is lognormal. By the same token, the right tail of $M^{-1}$ is lognormal, while its left tail is a power law. Hence, the moment problems for $M$ and $M^{-1}$ are indeterminate by the Krein criterion, confer [29].

Corollary 6.2. The positive integral moments of $\log M$ are finite and satisfy

\[ E\bigl[(\log M)^{n+1}\bigr] = \sum_{r=0}^{n}\binom{n}{r}\,E\bigl[(\log M)^{n-r}\bigr]\,D^{(r+1)}(0). \tag{82} \]

Proof. The positive integral moments of $\log M$ are given by the derivatives of $E[M^q]$ at $q = 0$. As $E[M^q] = \exp(D(q))$, we have by di Bruno's formula, confer [9], Chap. 11,

\[ E\bigl[(\log M)^n\bigr] = Y_n\bigl(D^{(1)}(0), \cdots, D^{(n)}(0)\bigr). \tag{83} \]

Equation (82) follows from the recurrence relation of Bell polynomials, confer Eq. (16).

We conclude this section with a partial result pertaining to the open question of analytically computing the probability density function of $M$. The density is the Mellin inverse of $E[M^q] = \exp(D(q))$. Computing it is not an easy task. We have so far given two representations of $D(q)$, one being its definition in Eq. (49) and the other being the Lévy-Khinchine form in Eqs. (75)–(77) for the case of purely imaginary $q$. The following proposition gives a formula for $\exp(D(q))$, which extends Selberg's finite product formula to an infinite product and reveals a deep connection between the structure of the Mellin transform and the gamma function in the complex plane.

Theorem 6.3. The Mellin transform in Eq. (49) satisfies for $\Re(q) < 2/\mu$,

\[ E[M^q] = \Bigl(\frac{2}{\mu}\Bigr)^{q}\,\Gamma(1 - \mu q/2)\,\Gamma^{-q}(1 - \mu/2)\,\frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} \tag{84} \]

\[ \qquad \times \prod_{n=1}^{\infty}\Bigl(\frac{2n}{\mu}\Bigr)^{2q}\,\frac{\Gamma^3(1 - q + 2n/\mu)\,\Gamma(2 - q + 2n/\mu)}{\Gamma^3(1 + 2n/\mu)\,\Gamma(2 - 2q + 2n/\mu)}. \tag{85} \]
Table 1. The moments of log M with intermittency µ = 0.5 versus those of N and of M versus exp(N)

                        mean      stand. dev.   skewness   kurtosis
log M                   −0.449    0.933         0.084      0.057
N (gaussian)            −0.449    0.933         0          0
M                       1         1.291         8.892      ∞
exp(N) (lognormal)      0.987     1.163         5.176      74.064

The proof is quite long and is deferred to the Appendix.² A direct proof that Theorem 6.3 implies Theorem 6.1 is easy and will be omitted.

² It is worth pointing out that the n = 1 term in Eq. (85) cancels the singularity at q = 1 + 1/µ coming from one of the gamma factors in Eq. (84). Hence, the equation has the stated region of validity.

Corollary 6.3. The Mellin transform satisfies the following functional equations.

\[ E\bigl[M^{q + \frac{2}{\mu}}\bigr] = \frac{2}{\mu}\,(2\pi)^{\frac{2}{\mu} - 1}\,\Gamma^{-\frac{2}{\mu}}\Bigl(1 - \frac{\mu}{2}\Bigr)\,\frac{\Gamma(-q)\,\Gamma^2(1 - q)\,\Gamma(2 - q + 2/\mu)}{\Gamma(2 - 2q)\,\Gamma(2 - 2q + 2/\mu)}\,E[M^q], \tag{86} \]

\[ E\bigl[M^{q-1}\bigr] = \frac{\Gamma\bigl(2 - (2q-2)\mu/2\bigr)\,\Gamma\bigl(2 - (2q-3)\mu/2\bigr)\,\Gamma(1 - \mu/2)}{\Gamma(1 - \mu q/2)\,\Gamma^2\bigl(1 - (q-1)\mu/2\bigr)\,\Gamma\bigl(2 - (q-2)\mu/2\bigr)}\,E[M^q]. \tag{87} \]

Equation (86) holds for $\Re(q) < 0$ and Eq. (87) for $\Re(q) < 2/\mu$. The proof is given in the Appendix. We remark in passing that Eqs. (86) and (87) are equivalent to integral equations of the Mellin convolution type involving the Appell $F_3$ function in the kernel for the probability density function of $M$. Their detailed study is, however, beyond the scope of this work.

7. Numerical Results

In this section we present results of simple numerical calculations that illustrate some of the main properties of $\log M$ and $M$ in comparison to their gaussian and lognormal counterparts. Equation (49) gives us the density of $\log M$ as the Fourier inverse of $\exp(D(iq))$, which we compute numerically. Corollary 6.2 gives us the moments of $\log M$; in particular, it gives us the mean and standard deviation. Let $N$ be a gaussian random variable with the same mean and standard deviation as those of $\log M$ with intermittency µ = 0.5. It is then natural to make a comparison of $\log M$ to $N$ as well as of $M$ to the lognormal random variable $\exp(N)$. Table 1 lists the mean $\langle X\rangle$, standard deviation $\sigma \triangleq \sqrt{E[(X - \langle X\rangle)^2]}$, skewness $E[(X - \langle X\rangle)^3]/\sigma^3$, and kurtosis $E[(X - \langle X\rangle)^4]/\sigma^4 - 3$ of these distributions, while Fig. 1 and 2 show graphs of the corresponding probability densities. The computation was carried out with the help of the computer program MAXIMA (http://maxima.sourceforge.net).

We can draw the following conclusions from the numerical data. First, $\log M$ is positively skewed, i.e. its mean is shifted to the left of zero resulting in the right tail that is "longer" than the left tail. This is also seen in Fig. 1. Second, both $\log M$ and $M$ are leptokurtic, i.e. they have "fat" right tails. This is seen in having a positive kurtosis as well as in higher "peaks" and lower "valleys" in graphs of their densities compared to those of $N$ and $\exp(N)$. In fact, the kurtosis of $M$ is infinite because its fourth moment is infinite at intermittency µ = 0.5. Finally, the left tail of $M$ looks lognormal, which is consistent with the asymptotic computed in the proof of Corollary 6.1.

Fig. 1. Graphs of the probability densities of log M with intermittency µ = 0.5 and N (curves: pdf of log M, gaussian)

Fig. 2. Graphs of the probability densities of M with intermittency µ = 0.5 and exp(N) (curves: pdf of M, lognormal)

8. Conclusions

We have presented a detailed study of the limit lognormal probability distribution by means of the intermittency expansion of its Mellin transform (complex moments). In particular, we examined the integral moments of the distribution and the characteristic function of its logarithm. Our findings are as follows.
We derived an explicit expansion of the Mellin transform in powers of the intermittency parameter. The resulting power series is convergent for a particular range of integral moments and is divergent otherwise. We computed the sum of the convergent series, thereby establishing a new formula for the negative integral moments of the distribution and re-deriving Selberg’s famous formula for the positive ones. In general, we computed the asymptotic behavior of the series expansion in the limit of small intermittency by finding a particular integral function whose asymptotic expansion coincides with the series in this limit. It is our conjecture that this function gives an exact representation of the Mellin transform at the general level of intermittency. The conjecture is interesting as it points to the source of divergence of the series expansion for non-integral moments, namely, that the integrand involved is entire when the moments are integral and meromorphic when they are not. More importantly, we evaluated the conjectured integral formula in the special case of integral moments and showed that the result coincides with the sum of the convergent series. Moreover, we proved that in the case of purely imaginary moments, which corresponds to the characteristic function of the logarithm of the distribution, our integral formula is indeed a valid characteristic function of the infinitely divisible type. Thus, in effect, we introduced a new probability distribution with the properties that its positive and negative integral moments at arbitrary intermittency and Mellin transform asymptotic in the limit of small intermittency coincide with the corresponding quantities of the limit lognormal distribution. Our conjecture says that the two are one and the same. The main reason our conjecture is not a theorem is the absence of a proof of uniqueness. 
As the positive integral moments of the limit lognormal distribution are given by the Selberg integral, we can summarize our results in the following way. For each intermittency level µ ∈ [0, 1) we constructed an explicit ‘analytical continuation’ of the Selberg integral, as a function of its dimension, to the Mellin transform in the half-plane (q) < 2/µ of some probability measure on R+ depending on µ. In addition, for each complex q, we showed that the asymptotic expansion of the Mellin transform coincides with our intermittency expansion in the limit µ → +0. What is lacking is the proof that the construction with these properties is unique. This cannot be concluded from the matching of the integral moments alone because we showed that the corresponding moment problems are in fact indeterminate. On the other hand, the moment problem for the logarithm is determinate, hence the answer lies in the analyticity properties of the Mellin transform as a function of both q and µ. We illustrated our results graphically by numerically inverting the conjectured characteristic function of the logarithm of the limit lognormal distribution, thereby computing its probability density function. The graph of the density and the analytical computation of the moments indicate that the distribution of the logarithm has positive skewness and kurtosis. The same is true of the limit lognormal distribution itself. At last, we mention some avenues for future research. First, aside from proving or disproving our conjecture, it would be interesting to find a closed-form expression for the probability density corresponding to our integral formula. Our results suggest that there is a deep connection between this density and the gamma function in the complex plane, for, indeed, we represented its Mellin transform as an infinite product of gamma factors. 
This representation implies functional equations for the Mellin transform that are equivalent to certain integral equations for the underlying density. Their study is left to future research. Second, in this paper we focused on the Mellin transform of the limit lognormal distribution. Similar arguments can be given for the Laplace and Stieltjes transforms of the distribution. What is lacking are explicit formulas for either transform of the kind that we gave for the Mellin transform. Third, the very structure of
the intermittency expansion studied in this paper begs the question of whether there is a connection of the limit lognormal distribution to analytic number theory.
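A. Appendix

In this section we will give proofs of Theorem 6.3 and its corollaries. We begin with the "main lemma." It can be thought of as an analogue of the Euler formula for the gamma function, whereas Theorem 6.3 is an analogue of the Weierstrass formula.

Lemma A.1. The Mellin transform in Eq. (49) satisfies for $\Re(q) < 2/\mu$,

\[ E[M^q] = \Bigl(\frac{2}{\mu}\Bigr)^{q}\,\Gamma\Bigl(1 - \frac{\mu q}{2}\Bigr)\,\Gamma^{-q}\Bigl(1 - \frac{\mu}{2}\Bigr)\,\frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} \tag{A-1} \]

\[ \qquad \times \lim_{N\to\infty}\Biggl[\Gamma^{2q}\Bigl(1 + \frac{\mu N}{2}\Bigr)\prod_{k=1}^{N}\frac{\Gamma^3\bigl(1 + \mu(k-q)/2\bigr)\,\Gamma\bigl(1 + \mu(k+1-q)/2\bigr)}{\Gamma^3(1 + \mu k/2)\,\Gamma\bigl(1 + \mu(k+1-2q)/2\bigr)}\Biggr]. \tag{A-2} \]

Proof. We assume in this proof that $\Re(q) < 1 + 1/\mu$. It will become clear from Lemma A.2 below that the result holds in fact for $\Re(q) < 2/\mu$. Consider the integral

\[ I(q) \triangleq \int_0^{\infty}\frac{dx}{(e^x - 1)\,x}\Biggl[\frac{e^{\frac{\mu x}{2}q} - 1}{e^{\frac{\mu x}{2}} - 1} - q - \frac{(q^2 - q)}{2}\,\frac{\mu x}{2}\Biggr]. \tag{A-3} \]

The defining integral for $D(q)$ in Eq. (49) can be naturally split into three terms as they exist individually in the case of $\Re(q) < 1 + 1/\mu$. Specifically, we can write

\[ D(q) = I(q+1) + 2I(q) + I(q-1) - I(2q-1) \tag{A-4} \]

\[ \qquad - q\int_0^{\infty}\Bigl(e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2}\Bigr)\frac{dx}{(e^x - 1)\,x} \tag{A-5} \]

\[ \qquad + \int_0^{\infty}\frac{e^{-x}}{x}\Biggl[\frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q\Biggr]dx. \tag{A-6} \]

The integrals in Eqs. (A-5) and (A-6) can be computed in closed form using a combination of Malmstén's formula and Frullani's identity,

\[ \int_0^{\infty}\Bigl(e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2}\Bigr)\frac{dx}{(e^x - 1)\,x} = \log\Gamma\Bigl(1 - \frac{\mu}{2}\Bigr) - \gamma\,\frac{\mu}{2}, \tag{A-7} \]

\[ \int_0^{\infty}\frac{e^{-x}}{x}\Biggl[\frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q\Biggr]dx = \log\frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} + q\log\frac{2}{\mu}, \tag{A-8} \]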
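where $\gamma$ denotes Euler's constant. Now, by the dominated convergence theorem, $I(q) = \lim_{N\to\infty}I_N(q)$, where $I_N(q)$ is defined by

\[ I_N(q) \triangleq \int_0^{\infty}\frac{dx}{(e^x - 1)\,x}\Biggl[\Bigl(e^{\frac{\mu x}{2}q} - 1\Bigr) - q\Bigl(e^{\frac{\mu x}{2}} - 1\Bigr) - \frac{(q^2 - q)}{2}\,\frac{\mu x}{2}\Bigl(e^{\frac{\mu x}{2}} - 1\Bigr)\Biggr]\sum_{k=1}^{N}e^{-\frac{\mu x}{2}k}. \tag{A-9} \]

Let us fix $k$. Applying Malmstén's formula to $\log\Gamma(1 + \mu k/2 - \mu q/2) - \log\Gamma(1 + \mu k/2)$ and $\log\Gamma(1 + \mu k/2 - \mu/2) - \log\Gamma(1 + \mu k/2)$, we obtain

\[ \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr) - q\Biggl[\log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr)\Biggr] = \int_0^{\infty}\frac{dx}{(e^x - 1)\,x}\Biggl[e^{\frac{\mu x}{2}(q-k)} - e^{-\frac{\mu x}{2}k} - q\Bigl(e^{\frac{\mu x}{2}(1-k)} - e^{-\frac{\mu x}{2}k}\Bigr)\Biggr]. \tag{A-10} \]

Similarly, we have

\[ \psi\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}\Bigr) - \psi\Bigl(1 + \frac{\mu k}{2}\Bigr) = \int_0^{\infty}\frac{dx}{(e^x - 1)}\Bigl(e^{-\frac{\mu x}{2}k} - e^{\frac{\mu x}{2}(1-k)}\Bigr). \tag{A-11} \]

Substituting Eqs. (A-10) and (A-11) into Eq. (A-9) and summing over $k$ gives us the following formula for $I_N(q)$

\[ I_N(q) = \sum_{k=1}^{N}\Biggl[\log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr)\Biggr] + q\log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) + \frac{1}{2}(q^2 - q)\,\frac{\mu}{2}\Biggl[\psi(1) - \psi\Bigl(1 + \frac{\mu N}{2}\Bigr)\Biggr]. \tag{A-12} \]

Hence, we obtain for $I_N(q+1) + 2I_N(q) + I_N(q-1) - I_N(2q-1)$,

\[ \sum_{k=1}^{N}\Biggl[3\log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - 3\log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr) + \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}(q-1)\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}(2q-1)\Bigr)\Biggr] + 2q\log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) + \log\Gamma\Bigl(1 - \frac{\mu q}{2}\Bigr) + \frac{\mu q}{2}\,\psi(1) + \log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu N}{2} - \frac{\mu q}{2}\Bigr) - \frac{\mu q}{2}\,\psi\Bigl(1 + \frac{\mu N}{2}\Bigr). \tag{A-13} \]

By Stirling's formula written in the form $\log\Gamma(a+b) - \log\Gamma(a) = b\log(a) + O(1/a)$ and $\psi(a) = \log(a) + O(1/a)$ as $a \to +\infty$, we have in the limit $N \to \infty$,

\[ \log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu N}{2} - \frac{\mu q}{2}\Bigr) - \frac{\mu q}{2}\,\psi\Bigl(1 + \frac{\mu N}{2}\Bigr) = O(1/N). \tag{A-14} \]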
Putting everything together and recalling that ψ(1) = −γ , we obtain the following formula for D(q): 2 µq (2 + 2/µ − 2q) µ + q log − q log 1 − + log 1 − (2 + 2/µ − q) µ 2 2 ! N µN µk µq + lim 2q log 1 + 3 log 1 + + − N →∞ 2 2 2 k=1 µk µk µ −3 log 1 + + log 1 + − (q − 1) 2 2 2 # µ µk − (2q − 1) . (A-15) − log 1 + 2 2
D(q) = log
The result follows.
Lemma A.2. The limit in Eq. (A-2) satisfies ! " N µN 3 (1 + µ(k − q)/2) (1 + µ(k + 1 − q)/2) 2q lim 1+ N →∞ 2 3 (1 + µk/2) (1 + µ(k + 1 − 2q)/2) k=1
(A-16)
∞ 2n 2q 3 (1 − q + 2n/µ) (2 − q + 2n/µ) . = µ 3 (1 + 2n/µ) (2 − 2q + 2n/µ)
(A-17)
n=1
Proof. The proof entails a combination of the Weierstraβ and Euler formulas for the gamma function. Recall (1 + z) = e−γ z
∞ n=1
(z) = lim
N →∞
e z/n , 1 + z/n
(A-18)
N !N z . z(z + 1) · · · (z + N )
(A-19)
We substitute the Weierstrass product formula for every gamma function that appears in Eq. (A-16) and obtain after some cancelations,
\[
\lim_{N\to\infty}\Bigl[\prod_{n=1}^{\infty}\frac{e^{2q\frac{\mu N}{2n}}}{(1+\mu N/2n)^{2q}}\prod_{k=1}^{N}\prod_{n=1}^{\infty}e^{-2q\frac{\mu}{2n}}\frac{(1+\mu k/2n)^{3}\,(1+\mu(k+1-2q)/2n)}{(1+\mu(k-q)/2n)^{3}\,(1+\mu(k+1-q)/2n)}\Bigr]. \tag{A-20}
\]
We now change the order of the products³ and let $N\to\infty$,
\[
\prod_{n=1}^{\infty}\Bigl[\frac{1}{(\mu/2n)^{2q}}\lim_{N\to\infty}\frac{1}{N^{2q}}\prod_{k=1}^{N}\frac{(k+2n/\mu)^{3}\,(k+1-2q+2n/\mu)}{(k-q+2n/\mu)^{3}\,(k+1-q+2n/\mu)}\Bigr]. \tag{A-21}
\]
It is now easy to see that Eq. (A-17) holds by the Euler product formula.

³ This can be easily justified by taking logarithms.
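The two gamma-function representations used in the proof above, the Weierstrass product (A-18) and Euler's limit (A-19), can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the function names, the truncation levels and the test point are illustrative assumptions, and the truncated product and finite-$N$ limit are only approximations to the exact identities.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gamma_weierstrass(z, terms=200_000):
    """Approximate Gamma(1 + z) by truncating the Weierstrass product (A-18)."""
    log_value = -EULER_GAMMA * z
    for n in range(1, terms + 1):
        # logarithm of the factor e^{z/n} / (1 + z/n)
        log_value += z / n - math.log1p(z / n)
    return math.exp(log_value)


def gamma_euler(z, N=200_000):
    """Approximate Gamma(z) by Euler's formula (A-19) at a finite N."""
    # logarithm of N! N^z / (z (z+1) ... (z+N)), to avoid overflow
    log_value = math.lgamma(N + 1) + z * math.log(N)
    for k in range(N + 1):
        log_value -= math.log(z + k)
    return math.exp(log_value)


if __name__ == "__main__":
    z = 0.5  # illustrative test point
    print("Weierstrass:", gamma_weierstrass(z))
    print("Euler      :", gamma_euler(1 + z))
    print("math.gamma :", math.gamma(1 + z))
```

Both approximations converge only like $1/N$, which is why the comparison is made with a loose tolerance rather than to machine precision.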
Remark A.1. It is now clear that Lemma A.1 holds for $\Re(q) < 2/\mu$. Indeed, the $n = 1$ term in Eq. (A-17) cancels the singularity at $q = 1 + 1/\mu$ coming from the ratio of gamma functions in Eq. (A-1).

Proof of Theorem 6.3. The proof is immediate from Lemmas A.1 and A.2.
Proof of Corollary 6.3. It is easy to see from Theorem 6.3 that we have the identity E M
q+ µ2
2 +1 2 2 (1 − q) (2 − q + 2/µ) µ 2 µ (−q) = E[M ] − µ 1 − µ 2 (2 − 2q) (2 − 2q + 2/µ) ! 4 " 4 (2−2q + 2(N − 1)/µ) (2−2q + 2N /µ) 2 µN × lim (N !) µ . N →∞ µ 3 (1−q + 2N /µ) (2−q + 2N /µ) q
(A-22) Equation (86) now follows from Stirling’s formula. Similarly, Theorem 6.3 implies another identity
µ (1 − µ(q − 1)/2) E M q−1 = E[M q ] 1 − (1 + µ(1 − q)/2)3 2 (1 − µq/2) ∞ (1 + µ(1 − q)/2n)3 (1 + µ(2 − q)/2n) . (A-23) × (1 + µ(2 − 2q)/2n) (1 + µ(3 − 2q)/2n) n=2
Equation (87) now follows by the Weierstrass formula for the gamma function.
Acknowledgement. The author wishes to thank the anonymous referee for many helpful suggestions.
References

1. Andrews, G.E., Askey, R., Roy, R.: Special Functions. Cambridge: Cambridge University Press, 1999
2. Antonov, S.N.: Asymptotic behavior of infinitely divisible laws. Mathematical Notes 28, 924–929 (1980)
3. Bacry, E., Delour, J., Muzy, J.-F.: Multifractal random walk. Phys. Rev. E 64, 026103 (2001)
4. Bacry, E., Delour, J., Muzy, J.-F.: Modelling financial time series using multifractal random walks. Physica A 299, 84–92 (2001)
5. Bacry, E., Muzy, J.-F.: Log-infinitely divisible multifractal random walks. Commun. Math. Phys. 236, 449–475 (2003)
6. Barral, J., Mandelbrot, B.B.: Multifractal products of cylindrical pulses. Prob. Theory Relat. Fields 124, 409–430 (2002)
7. Bisgaard, T.M., Sasvari, Z.: Characteristic Functions and Moment Sequences. Huntington: Nova Science Publishers, 2000
8. Berndt, B.C.: Ramanujan's Notebooks, Part V. New York: Springer-Verlag, 1998
9. Charalambides, C.A.: Enumerative Combinatorics. Boca Raton: Chapman & Hall/CRC, 2002
10. Forrester, P.J., Warnaar, S.O.: The importance of the Selberg integral. Bull. Amer. Math. Soc. 45, 489–534 (2008)
11. Hardy, G.H.: Divergent Series. Oxford: Clarendon Press, 1949
12. Kahane, J.-P.: Sur le chaos multiplicatif. Ann. Sci. Math. Quebec 9, 105–150 (1985)
13. Kahane, J.-P.: Positive martingales and random measures. Chin. Ann. Math. 8, 1–12 (1987)
14. Kahane, J.-P.: Produits de poids aléatoires indépendants et applications. In: Belair, J., Dubuc, S. (eds.) Fractal Geometry and Analysis. Boston: Kluwer, 1991, p. 277
15. Khorunzhiy, O.: Limit theorems for sums of products of random variables. Markov Process. Relat. Fields 9, 675–686 (2003)
16. Knopp, K.: Theory and Application of Infinite Series. New York: Dover, 1990
17. Mandelbrot, B.B.: Possible refinement of the log-normal hypothesis concerning the distribution of energy dissipation in intermittent turbulence. In: Rosenblatt, M., Van Atta, C. (eds.) Statistical Models and Turbulence, Lecture Notes in Physics 12. New York: Springer, 1972, p. 333
18. Mandelbrot, B.B.: Intermittent turbulence in self-similar cascades: divergence of high moments and dimension of the carrier. J. Fluid Mech. 62, 331–358 (1974)
19. Mandelbrot, B.B.: Limit lognormal multifractal measures. In: Gotsman, E.A. et al. (eds.) Frontiers of Physics: Landau Memorial Conference. New York: Pergamon, 1990, p. 309
20. Muzy, J.-F., Bacry, E.: Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling laws. Phys. Rev. E 66, 056121 (2002)
21. Olver, F.W.J.: Asymptotics and Special Functions. San Diego: Academic Press, 1974
22. Ostrovsky, D.: Limit lognormal multifractal as an exponential functional. J. Stat. Phys. 116, 1491–1520 (2004)
23. Ostrovsky, D.: Functional Feynman-Kac equations for limit lognormal multifractals. J. Stat. Phys. 127, 935–965 (2007)
24. Ostrovsky, D.: Intermittency expansions for limit lognormal multifractals. Lett. Math. Phys. 83, 265–280 (2008)
25. Schmitt, F.: A causal multifractal stochastic equation and its statistical properties. Eur. Phys. J. B 34, 85–98 (2003)
26. Selberg, A.: Remarks on a multiple integral. Norske Mat. Tidsskr. 26, 71–78 (1944)
27. Srivastava, H.M., Choi, J.: Series Associated with the Zeta and Related Functions. Dordrecht: Kluwer, 2001
28. Steutel, F.W., van Harn, K.: Infinite Divisibility of Probability Distributions on the Real Line. New York: Marcel Dekker, 2004
29. Stoyanov, J.: Krein condition in probabilistic moment problems. Bernoulli 6, 939–949 (2000)
30. Temme, N.M.: Special Functions: an Introduction to the Classical Functions of Mathematical Physics. New York: John Wiley, 1996

Communicated by S. Zelditch
Commun. Math. Phys. 288, 311–347 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0774-8
Communications in
Mathematical Physics
Vortex Condensates for Relativistic Abelian Chern-Simons Model with Two Higgs Scalar Fields and Two Gauge Fields on a Torus

Chang-Shou Lin¹, Jyotshana V. Prajapat²

1 Department of Mathematics, Taida Institute of Mathematical Sciences, National Taiwan University, Taipei 106, Taiwan. E-mail: [email protected]
2 Department of Mathematics, Arts and Science Program, The Petroleum Institute, Abu Dhabi, P.O. 2533, U.A.E. E-mail: [email protected]

Received: 10 March 2008 / Accepted: 9 December 2008
Published online: 13 March 2009 – © Springer-Verlag 2009

Dedicated to Professor Louis Nirenberg

Abstract: We prove the existence of maximal condensates for the relativistic Abelian Chern-Simons equations involving two Higgs particles and two gauge fields on a torus. After a change of variable, we obtain a variational formulation of the problem whose critical points correspond to solutions of the original system of equations. We prove existence of a local minimizer for this functional as well as the existence of a second mountain-pass critical point.

1. Introduction

The Chern-Simons theories were developed to explain certain condensed matter phenomena, anyon physics, superconductivity, quantum mechanics and so on. The models of this theory give rise to elliptic equations with exponential nonlinearities. Analysis of these equations poses mathematically challenging problems and requires new tools and ideas combined with the known techniques. In this paper, we study the system of equations corresponding to the relativistic Abelian Chern-Simons model involving two Higgs scalar fields and two gauge fields on a torus $\Omega$, viz.,
\[
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u}-1)+4\pi\sum_{j=1}^{k_1} m_j\,\delta_{p_j},\\
\Delta v &= \lambda e^{u}(e^{v}-1)+4\pi\sum_{j=1}^{k_2} n_j\,\delta_{q_j},
\end{aligned} \tag{1.1}
\]
where $\lambda > 0$ is a real number, $m_j$, $n_j$ are positive numbers and $N_1 = \sum_{j=1}^{k_1} m_j$, $N_2 = \sum_{j=1}^{k_2} n_j$. The sets of points $S_1 := \{p_1, p_2, \dots, p_{k_1}\}$ and $S_2 := \{q_1, q_2, \dots, q_{k_2}\}$ are prescribed and $\delta_p$ denotes the Dirac delta measure at the point $p$. When $m_j$, $n_j$ are integers, the solutions of (1.1) correspond to static solutions, also referred to as periodic
vortices or condensates of the Lagrangian studied in [8,11] satisfying 't Hooft periodic boundary conditions on a torus. The functions $u$, $v$ correspond to the gauge fields and the Dirac measure $4\pi n\delta_p$ represents the magnetic flux at a point $p$.

The mathematical analysis of the system (1.1) has been recently initiated in [12], where the problem (1.1) is studied on the plane. The authors prove existence of a topological solution of (1.1) in $\mathbb{R}^2$ by first solving the system on balls of radius $R > 0$ by a variational method and then taking the limit $R \to \infty$ (see [12] for details). More recently, the paper [6] studied the uniqueness of topological solutions and existence of nontopological solutions for (1.1) in the plane.

Our goal is to study the system (1.1) on a torus $\Omega$. It is interesting to study this problem on a torus since the solutions of the Chern-Simons equations depend on the topology of the underlying space. For example, the relativistic Abelian Chern-Simons theory (scalar equation) ([2,14–16]), non-Abelian SU(3) Chern-Simons theory ([18]) and Maxwell-Chern-Simons theory ([3,4,17]) models on a torus are well studied. In spite of the similarity with the scalar Abelian Chern-Simons equation and Abelian Higgs vortex equations (see [12] for details), the system (1.1) is difficult not only since it involves two equations, but also due to the "mixed" nature of the nonlinear terms. Our first result is the existence of maximal solutions, precisely

Theorem 1.1. Let $\Omega \subset \mathbb{R}^2$ be a torus and let $|\Omega|$ denote the measure of $\Omega$. Then, any solution $(u, v)$ of (1.1) in $\Omega$ satisfies
\[
u < 0,\quad v < 0 \quad\text{in } \Omega. \tag{1.2}
\]
Moreover, there exists $\lambda^* > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$ such that:

(i) For any $\lambda > \lambda^*$, there exists a unique maximal solution $(u_\lambda, v_\lambda)$ for the system (1.1) in the sense that if $(u, v) \in H^1(\Omega) \times H^1(\Omega)$ is any other solution of (1.1), then
\[
u < u_\lambda;\quad v < v_\lambda. \tag{1.3}
\]

(ii) For $\lambda < \lambda^*$, there exists no solution of (1.1) and the map $\lambda \mapsto (u_\lambda, v_\lambda)$ is monotone, i.e., for $\lambda^* \le \lambda_1 < \lambda_2$,
\[
u_{\lambda_1} < u_{\lambda_2},\quad v_{\lambda_1} < v_{\lambda_2}. \tag{1.4}
\]

(iii) The limits
\[
\lim_{\lambda\to\lambda^*} u_\lambda := u_*;\quad \lim_{\lambda\to\lambda^*} v_\lambda := v_* \tag{1.5}
\]
exist for a.e. $x \in \Omega$ and $(u_*, v_*) \in H^1(\Omega) \times H^1(\Omega)$ is a solution of (1.1) with $\lambda = \lambda^*$. Moreover, $(u_*, v_*)$ is a strict sub-solution of (1.1) for all $\lambda > \lambda^*$ and hence
\[
u_* < u_\lambda,\quad v_* < v_\lambda \quad\text{for all } \lambda > \lambda^*. \tag{1.6}
\]
Here, $W^{1,p}(\Omega)$ denotes the Sobolev space which is the completion of $C^1(\Omega)$ with respect to the norm
\[
\|f\|_{W^{1,p}} := \|\nabla f\|_p + \|f\|_p, \tag{1.7}
\]
and $H^1(\Omega)$ is the Hilbert space with
\[
\|f\|_{H^1} := \|\nabla f\|_2 + \|f\|_2, \tag{1.8}
\]
where we shall use the notations $\|\cdot\|_p$ and $\langle\cdot,\cdot\rangle$ to denote the $L^p$ norm and the $L^2$ inner product respectively.

Note that, as in scalar Chern-Simons equations, while solutions of the system (1.1) in $\mathbb{R}^2$ exist for all $\lambda > 0$ (see [12]), the solutions for the system (1.1) on a torus exist for $\lambda \in [\lambda^*, \infty)$, where the strict inequality $\lambda^* > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$ holds. The $\lambda^*$ given by Theorem 1.1 is optimal in the sense that a solution for (1.1) exists for any $\lambda \ge \lambda^*$ and does not exist for any $\lambda < \lambda^*$.

In order to find the maximal solution, it is conventional to develop a monotone iterative scheme. In general, it is difficult to develop an iterative scheme for a system of equations, more so with nonlinearities involving exponential functions and measures. So, it is surprising that such a scheme works for the system (1.1). Besides the simplicity of proof using an iteration scheme, an added advantage is that it allows for numerical construction of solutions of (1.1). We have used this scheme frequently in our paper, besides using it to prove Theorem 1.1, and hope that it will find applications in other problems too.

Our second result is to find a second solution of (1.1) for $\lambda > \lambda^*$. In principle, if a system of equations under consideration admits a variational structure, then a second solution might be obtained via the variational method, for example, the mountain pass lemma. Unfortunately, the variational form associated with (1.1) is indefinite in $H^1(\Omega) \times H^1(\Omega)$. Thus, we have to find a new approach in order to find a second solution. We explain it briefly below, see Sects. 5 and 6 for details. As in [12], by the change of variables $u \to u + v =: F$, $v \to u - v =: G$, the system (1.1) can be transformed into an equivalent problem
\[
\Delta F = 2\lambda e^{F} - \lambda e^{\frac{F+G}{2}} - \lambda e^{\frac{F-G}{2}} + 4\pi\sum_{j=1}^{k_1} m_j\,\delta_{p_j} + 4\pi\sum_{j=1}^{k_2} n_j\,\delta_{q_j}, \tag{1.9}
\]
\[
\Delta G = \lambda e^{\frac{F+G}{2}} - \lambda e^{\frac{F-G}{2}} + 4\pi\sum_{j=1}^{k_1} m_j\,\delta_{p_j} - 4\pi\sum_{j=1}^{k_2} n_j\,\delta_{q_j}, \tag{1.10}
\]
which is an Euler-Lagrange equation of an indefinite functional
\[
I_\lambda(\tilde F, \tilde G) = \tilde I_\lambda(F, G) := \frac12\|\nabla F\|_2^2 - \frac12\|\nabla G\|_2^2 + 2\lambda\int_\Omega\Bigl(e^{u_0+v_0+F} - e^{u_0+\frac{F+G}{2}} - e^{v_0+\frac{F-G}{2}}\Bigr)dx + \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega F\,dx - \frac{4\pi(N_1-N_2)}{|\Omega|}\int_\Omega G\,dx, \tag{1.11}
\]
where $\tilde F = F + u_0$, $\tilde G = G + v_0$ and $u_0$, $v_0$ are singular parts of $\tilde F$, $\tilde G$ respectively. Since Theorem 1.1 gives existence of a strict sub-solution, we may try to find a constrained minimizer for the functional $I_\lambda$ in the closed subset $\{F \in H^1(\Omega) : F \ge F_*\}$ as for the scalar case. However, even under the constraint on $F$, $I_\lambda$ might not be bounded below in $H^1(\Omega) \times H^1(\Omega)$. So we need new ideas to work on our problem. The idea is to first solve the second equation (1.10) for any given $F$. In fact, we show that for any fixed $F \in H^1(\Omega)$, Eq. (1.10) has a unique solution $G(F)$ depending on $F$, so that we may think of $I_\lambda$ as a function of $F$ alone:
\[
I_\lambda(F) := I_\lambda(F, G(F)). \tag{1.12}
\]
In spite of this reduction, it is still difficult to bound the functional $I_\lambda$ from below. Hence we approximate the functional $I_\lambda$ by a suitable functional $I_\lambda^\varepsilon$ and study the minimization problem for $I_\lambda^\varepsilon$. This will be done in Sect. 5 where we prove

Theorem 1.2. For every $\lambda > \lambda^*$, the functional $I_\lambda$ has a local minimizer $F_\lambda \in H^1(\Omega)$ such that
\[
F_\lambda > F_*, \tag{1.13}
\]
where $F_* := u_* + v_*$, with $(u_*, v_*)$ defined by (1.5) in Theorem 1.1.

The approximate system is introduced in Sect. 4. There, we replace the measures $\mu := 4\pi\sum_{j=1}^{k_1} m_j\,\delta_{p_j}$ and $\nu := 4\pi\sum_{j=1}^{k_2} n_j\,\delta_{q_j}$ by smooth, non-negative functions $\mu_\varepsilon$, $\nu_\varepsilon$, where $\mu_\varepsilon \rightharpoonup \mu$, $\nu_\varepsilon \rightharpoonup \nu$ in the sense of distributions, i.e., we consider
\[
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u}-1)+\mu_\varepsilon,\\
\Delta v &= \lambda e^{u}(e^{v}-1)+\nu_\varepsilon.
\end{aligned} \tag{1.14}
\]
We show that the solution of this system converges to a solution of (1.1) as $\varepsilon \to 0$. Moreover, an analogue of Theorem 1.1 holds for (1.14) and hence there exists $\lambda_\varepsilon > 0$ such that solutions to (1.14) exist for $\lambda \ge \lambda_\varepsilon$, but no solutions exist for $\lambda < \lambda_\varepsilon$. Thus, to prove Theorem 1.2, it has to be shown that
\[
\lim_{\varepsilon\to 0}\lambda_\varepsilon = \lambda^*. \tag{1.15}
\]
The equality (1.15) will be proved by the implicit function theorem, where the non-singularity of the linearized equation at the maximal solution for (1.1) has been established a priori. Generally, the non-singularity can be proved by showing the triviality of the null space of the linearized equation. However, it does not seem easy to prove this for a system of equations. Instead, taking advantage of the monotone scheme, we prove that the linearized equation is onto from $W^{2,2}(\Omega) \times W^{2,2}(\Omega)$ to $L^2(\Omega) \times L^2(\Omega)$. Thus, we have

Theorem 1.3. For almost every $\lambda > \lambda^*$, the linearized system for (1.1) at the maximal solution is invertible.

The proof of Theorem 1.3 will be given in Sect. 4. In order to apply the mountain pass lemma, we prove that the approximate functional $I_\lambda^\varepsilon$ satisfies the Palais-Smale condition, provided $\min\{N_1, N_2\} > 0$. However, due to the indefinite form of (1.11), there are some difficulties in proving the Palais-Smale condition. In any case, we succeed and will give a proof in Lemma 6.1.

It is not known whether the maximal solution obtained in Theorem 1.1 corresponds to a local minimizer of $I_\lambda$ for $\lambda > \lambda^*$. If not, then the local minimizer obtained in Theorem 1.2 already gives us a second condensate. Again, if it is not a strict local minimizer, then we obtain a sequence of local minimizers of $I_\lambda$ distinct from $(u_\lambda + v_\lambda, u_\lambda - v_\lambda)$, which would prove existence of more than one condensate. Therefore, without loss of generality, we may assume that the maximal solution obtained in Theorem 1.1 corresponds to a strict local minimizer of the functional $I_\lambda$, and then by applying the mountain pass lemma, we have

Theorem 1.4. For each $\lambda > \lambda^*$, the system (1.1) has at least two distinct solutions.
2. Equations (1.1) and the Chern-Simons-Higgs System The Chern-Simons theory is an alternative gauge theory, different from the Maxwell theory, in 2 + 1 dimensional space used to study planar condensed matter phenomena. The Chern-Simons Lagrangian is characterized by the coupling of scalar field, the YangMills or Maxwell field and the Chern-Simons gauge field. Moreover, it is interesting to study this theory on space-time with non trivial topology. Please refer to [4,7,12,17,19] and the references therein for more details. Here we are concerned with Chern-Simon model defined on the (2 + 1)- dimensional Minkowski space R2,1 with metric tensor (gr s ) = diag(1, −1, −1), r, s = 0, 1, 2 involving two scalar fields, two gauge fields and with mixed charge-flux relations. Let φ1 and φ2 be two complex scalar fields representing two Higgs particles of charges q1 , (1) (2) q2 . We denote by (Ar , Ar ) the two gauge fields associated with the given metric (2) tensor carrying the induced electromagnetic fields (Fr(1) s , Fr s ), where ( j)
( j)
( j)
Fr s = ∂r As − ∂s Ar ,
(2.1)
i, j = 1, 2 and r, s = 0, 1, 2. The Chern-Simons action density or Lagrangian L studied in [8,11] is defined as 1 (2) 1 (1) L = − κεr st Ar(1) Fst − κεr st Ar(2) Fst + Dr¯φ1 Dr φ1 + Dr¯φ2 Dr φ2 −V (φ1 , φ2 ), 4 4 (2.2) where κ > 0 is the coupling parameter, Dr φ1 = ∂φ1 − iq1 Ar(1) φ1 ,
Dr φ2 = ∂φ1 − iq2 Ar(2) φ1
(2.3)
are the covariant derivatives and V (φ1 , φ2 ) is the Higgs potential density, V (φ1 , φ2 ) :=
q12 q22 {|φ1 |2 (|φ2 |2 − c22 )2 + |φ2 |2 (|φ1 |2 − c12 )2 }. κ2
(2.4)
The special numerical factor in front of the expression of V ensures that selfduality can be achieved for the static field configurations and the positive vacuum states < φ1 >= c1 > 0 and < φ2 >= c2 > 0 lead to spontaneously broken symmetries. The equations of motions of the Lagrangian (2.2) are the Chern-Simons equations ⎫ 1 r sα F (2) = −q i(φ D r¯φ − φ¯ D r φ ) ⎪ ⎪ 1 1 1 1 1 sα 2 κε ⎪ ⎪ ⎪ ⎪ (1) 1 r sα r ⎪ r¯φ − φ¯ D φ ) κε F = −q i(φ D ⎬ 2 2 2 2 2 sα 2 . (2.5) q2q2 ⎪ = − 1κ 2 2 {2|φ2 |2 (|φ1 |2 − c12 )2 + (|φ2 |2 − c22 )2 }φ1 ⎪ Dr Dr φ1 ⎪ ⎪ ⎪ ⎪ ⎪ q12 q22 2 2 r 2 2 2 2 2 D D φ =− {2|φ | (|φ | − c ) + (|φ | − c ) }φ ⎭ r
2
κ2
1
2
2
1
1
2
The first two equations in (2.5) are the conserved matter current densities and putting r = 0 in these equations we get the Gauss laws (2) (1) κ F12 = 2q12 A0 |φ1 |2 , (2.6) (1) 2 κ F12 = 2q22 A(2) 0 |φ2 | .
These are the variational equations for $(A_0^{(1)}, A_0^{(2)})$. The Lagrangian $\mathcal{L}$ is invariant under the gauge transformations
\[
\phi \to e^{i\eta}\phi,\quad A \to A + \nabla\eta,\quad A_0 \to A_0 \tag{2.7}
\]
for any smooth real valued function η. Using the notations of [2], we let the torus be represented by a fundamental domain in R2 , generated by two independent vectors a 1 , a 2 : := {x = (x1 , x2 ) ∈ R2 : x = s1 a 1 + s2 a 2 , 0 < s1 , s2 < 1}.
(2.8)
If we denote by k := {x ∈ R2 : x = sk a k , 0 < sk < 1} , k = 1, 2, then the boundary of can be represented as ∂ = 1 ∪ 2 ∪ {a 1 + 1 } ∪ {a 2 + 2 } ∪ {0, a 1 , a 2 , a 1 + a 2 }. Due to the invariance given in (2.7), we impose as in the scalar case, the following ’t Hooft boundary conditions: ⎫ k eiηk (x+a ) φ1 (x + a k ) = eiηk x φ1 (x) ⎬ (2.9) (A( j) + ∇ηk )(x + a k ) = (A( j) + ∇ηk )(x) ⎭ ( j) ( j) = A0 (x) A0 (x + a k ) for k = 1, 2 and x ∈ 1 ∪ 2 . Here η1 , η2 are real valued smooth functions defined in a neighbourhood of 2 ∪ {a 1 + 2 } and 1 ∪ {a 2 + 1 } respectively. Let us denote the value ¯ by η(s1 , s2 ). Since φ1 is a single valued of a function η at a point x = s1 a 1 + s2 a 2 ∈ complex function, its phase around must be a multiple of 2π . Hence the boundary condition (2.9) implies that there exists an integer N such that η1 (1, 1− ) − η1 (1, 0+ ) + η1 (0, 0+ ) − η1 (0, 1− ) +η2 (0+ , 1) − η2 (1− , 1) + η2 (1− , 0) − η2 (0+ , 0) + 2π N = 0.
(2.10)
Integrating (2.6) over , from (2.9) and (2.10) it follows that there exists integers N1 , N2 corresponding to the functions φ1 , φ2 respectively, such that ⎫ (2) (1) κ(2) := κ F12 d x = κ2π N2 ; 2q12 A0 |φ1 |2 = Q (1) = κ(2) ⎪ ⎬ (2.11) (1) (2) κ(1) := κ F12 d x = κ2π N1 ; 2q12 A0 |φ2 |2 = Q (2) = κ(1) . ⎪ ⎭
It is important to note that for our model, we obtain a mixed flux-charge relation. The Hamiltonian (energy) density H for static field configuration is given by H = −L( up to a total divergence) (2) (2) (1) (1) 2 (2) 2 2 2 2 2 2 = κ A(1) 0 F12 + κ A0 F12 − q1 (A0 ) |φ| − q2 (A0 ) |φ2 | + |D j φ1 |
+|D j φ2 |2 + V (φ1 , φ2 ) (2)
=
κ 2 [F12 ]2 4q12 |φ|2
(1)
+
κ 2 [F12 ]2 4q22 |φ2 |2
+ |D j φ1 |2 + |D j φ2 |2 + V (φ1 , φ2 ),
(2.12)
where in the last equality we have used the Gauss laws (2.6). Moreover, the following identities hold:
(k) |D j φk |2 = |D1 φk ± i D2 φk |2 ± i ∂1 (φk D2¯φk ) − ∂2 (φk D1¯φk ) ± F12 |φk |2 ; k = 1, 2. (2.13)
Substituting these in (2.12) and integrating over the torus, we get the energy E = H dx
=
(2) 2 κ 2 [F12 ]
4q12 |φ|2
+
(1) 2 ] κ 2 [F12
4q22 |φ2 |2
+ |D1 φ1 ± i D2 φ2 |2 + |D1 φ2 ± i D2 φ2 |2
(1) (2) ±q1 F12 |φ1 |2 ± q2 F12 |φ2 |2 + V (φ1 , φ2 ) ⎧ 2 2 ⎨ (1) (2) κ F12 κ F12 q1 q2 q1 q2 2 2 2 2 = ± |φ2 |(|φ1 | − c1 ) + ± |φ1 |(|φ2 | − c2 ) ⎩ 2q2 |φ2 | κ 2q1 |φ1 | κ (1) (2) +|D1 φ1 ± i D2 φ2 |2 + |D1 φ2 ± i D2 φ2 |2 ± c12 q1 F12 |φ1 |2 ± c22 q2 F12 |φ2 |2 d x ≥ ±c12 q1 (1) ± c22 q2 (2) = c12 q1 |(1) | + c22 q2 |(2) |.
(2.14)
where we choose the signs so that ±( j) = |( j) |, j = 1, 2 . Hence, the energy will (1) (2) attain its lower bound if and only if the field configuration (φ1 , φ2 , Ar , Ar ) satisfy the equations ⎫ D 1 φ1 ± i D 2 φ1 = 0, ⎪ ⎪ ⎪ D 1 φ2 ± i D 2 φ2 = 0, ⎬ 2 2q1 q2 (1) (2.15) 2 2 2 F12 ± κ 2 |φ2 | (|φ1 | − c1 ) = 0, ⎪ ⎪ ⎪ 2 ⎭ 2q q (2) F12 ± κ12 2 |φ1 |2 (|φ2 |2 − c22 ) = 0. (l)
(l)
Using the change of variables ql A j → A j , φl → cl φl , l = 1, 2 and suppressed parameter λ = 4c12 c22 q12 q22 /κ 2 , we can simplify (2.15)(with positive sign) to ⎫ D 1 φ1 + i D 2 φ1 = 0, ⎪ ⎪ D 1 φ2 + i D 2 φ2 = 0, ⎬ (2.16) (1) F12 + λ2 |φ2 |2 (|φ1 |2 − 1) = 0, ⎪ ⎪ ⎭ (2) F12 + λ2 |φ1 |2 (|φ2 |2 − 1) = 0. where now D j φl = ∂ j φl − i A(l) j φl , l, j = 1, 2. It suffices to consider (2.15) with plus (l) (l) sign since the transformation A j → −A j and φ → φ¯ will help us recover results for the negative sign. The first two equations in (2.16) imply that the complex fields (φ1 , φ2 ) are holomorphic with respect to the gauge invariant derivatives while, the last two equations in (2.16) are the “vortex” equations relating “curvatures” to the “strength” of the scalar particles. The system of equations (2.16) together with the Gauss laws (2.6)( or (2.11)) and the periodic boundary conditions (2.9) are the self-dual Chern-Simons equations with two Higgs particles and two Abelian gauge fields. Note that if N2 = 0, we may choose A(2) j = 0 and |φ2 | = 1 so that (2.16) reduces to the self-dual Ginzburg-Landau equations, while if both φ1 and φ2 have only common zeroes with the same multiplicity, then we may take φ1 = φ2 so that (2.16) reduces to the single particle self-dual Abelian Chern-Simons equations.
If we substitute
\[
u := \ln|\phi_1|^2,\quad v := \ln|\phi_2|^2, \tag{2.17}
\]
then it can be seen that (2.16) reduces to the system (1.1) with a suitable periodicity condition. Integrating the equations in (1.1), we see that
\[
\int_\Omega e^{v}(1-e^{u})\,dx = \frac{4\pi N_1}{\lambda}, \tag{2.18}
\]
\[
\int_\Omega e^{u}(1-e^{v})\,dx = \frac{4\pi N_2}{\lambda}. \tag{2.19}
\]
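The integration step behind (2.18) can be spelled out as follows. This is a sketch under the assumptions that the integral of the Laplacian of $u$ over the torus vanishes by periodicity and that each Dirac measure has total mass one; (2.19) follows in the same way from the second equation of (1.1).

```latex
\begin{align*}
0 = \int_\Omega \Delta u \, dx
  &= \lambda \int_\Omega e^{v}\,(e^{u}-1)\,dx
     + 4\pi \sum_{j=1}^{k_1} m_j \int_\Omega \delta_{p_j} \\
  &= -\lambda \int_\Omega e^{v}\,(1-e^{u})\,dx + 4\pi N_1 ,
\end{align*}
% hence \int_\Omega e^{v}(1-e^{u})\,dx = 4\pi N_1/\lambda, which is (2.18).
```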
Thus as for the scalar case, we have two types of solutions. The “topological type” satisfying eu → 1 and ev → 1 as λ → ∞ equivalently, |φ1 | → 0, |φ2 | → 0 as κ → 0, (2.20) and the “non-topological type” satisfying eu → 0 and ev → 0 as λ → ∞ equivalently, |φ1 | → 1, |φ2 | → 1 as κ → 0 (2.21) This is in contrast to the non-Abelian case. In view of (1.2) and (1.6), it follows that the maximal solutions of Theorem 1.1 correspond to topological solutions. Let κ∗ be defined as κ∗ :=
c12 c22 q12 q22 , λ∗
(2.22)
where λ∗ is as in Theorem 1.1. Theorem 1.1 is equivalent to Theorem 2.1. Let N1 , N2 be given integers and S1 := { p1 , p2 , . . . , pk1 }, S2 := {q1 , q2 , . . . , qk2 } be sets of given points prescribed on a torus . Then there exists (an optimal) coupling parameter κ∗ , 0 < κ∗ <
c12 c22 q12 q22 || 4π max{N1 ,N2 }
such that
(i) For κ < κ∗ , the Chern-Simons equations (2.15) with boundary conditions (2.9) (1) (2) admits a “topological multi-vortex solution” (φ1κ , φ2κ , A j , A j )κ , with the preκ κ scribed set of zeroes S1 for φ1 and S2 for φ2 . (2) (ii) The solution (φ1κ , φ2κ , A(1) j , A j )κ is maximal i.e., most super conducting in the sense that the magnitudes |φ1κ |, |φ2κ | are largest possible values among all solutions (φ1 , φ2 ) to (2.15)–(2.9) with the same zero sets S1 and S2 respectively. (iii) The map κ → (φ1κ , φ2κ ) is strictly monotone for 0 < κ ≤ κ∗ , |φ1κ | < 1, |φ2κ | < 1.
(2.23)
3. An Iteration Scheme

We first prove the existence of an iteration scheme for a more general system of equations which includes the Chern-Simons system.

Proposition 3.1. For any $\lambda > 0$, consider the system of equations
\[
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u}-1)+f,\\
\Delta v &= \lambda e^{u}(e^{v}-1)+g
\end{aligned}
\quad\text{in } \Omega. \tag{3.1}
\]
Here, $f$ and $g$ may be non-negative continuous functions or linear combinations of Dirac measures on $\Omega$. Suppose there exist functions $(\bar u, \bar v) \in H^1(\Omega) \times H^1(\Omega)$, a super-solution of (3.1), and $(\underline{u}, \underline{v})$, a sub-solution of (3.1), such that $\underline{u} < \bar u$, $\underline{v} < \bar v$. Define $(u_0, v_0) \equiv (\bar u, \bar v)$ in $\Omega$ and consider the following iteration scheme
\[
\begin{aligned}
(\Delta - K)u_{n+1} &= \lambda e^{v_n}(e^{u_n}-1) - K u_n + f,\\
(\Delta - K)v_{n+1} &= \lambda e^{u_n}(e^{v_n}-1) - K v_n + g
\end{aligned} \tag{3.2}
\]
for $n = 1, 2, 3, \dots$, where $K > 0$ is a constant. Then $\{u_n\}$ and $\{v_n\}$ are monotone decreasing sequences of functions in $\Omega$, i.e.,
\[
\begin{aligned}
\bar u > u_1 > u_2 > \dots > u_n > u_{n+1} > \dots > \underline{u} \quad\text{a.e.},\\
\bar v > v_1 > v_2 > \dots > v_n > v_{n+1} > \dots > \underline{v} \quad\text{a.e.},
\end{aligned} \tag{3.3}
\]
and the limits $\lim_{n\to\infty} u_n(x) = u_\lambda(x)$, $\lim_{n\to\infty} v_n(x) = v_\lambda(x)$ exist for almost every $x \in \Omega$ and $(u_n, v_n) \to (u_\lambda, v_\lambda)$ in $H^1(\Omega) \times H^1(\Omega)$. The functions $(u_\lambda, v_\lambda)$ are solutions of (3.1) and are maximal (hence unique) in the sense that if $(u, v)$ is any other solution of (3.1), then
\[
u < u_\lambda,\quad v < v_\lambda. \tag{3.4}
\]

The most important observation is that the linear system (3.2) can be solved on the torus $\Omega$, see for example Theorem 4.18 in [1]. Moreover, since the Green's function for the Laplacian on the torus exists, we can allow $f$, $g$ above to be sums of Dirac measures. Thus, there exist solutions $(u_n, v_n)$ of (3.2) for each $n \in \mathbb{N}$ and hence the iteration scheme (3.2) is well defined. Also note that the regularity of $(u_n, v_n)$ depends on the regularity of $f$ and $g$. The proof of Proposition 3.1 is standard and follows by using the maximum principle. Here and in the following sections, whenever Proposition 3.1 is referred to, all we need to prove is the existence of a sub-solution and a super-solution for the system being considered. An immediate application is

Theorem 3.1. Define the measures
\[
\mu := 4\pi\sum_{j=1}^{k_1} m_j\,\delta_{p_j}\quad\text{and}\quad \nu := 4\pi\sum_{j=1}^{k_2} n_j\,\delta_{q_j}, \tag{3.5}
\]
and let $(f, g) = (\mu, \nu)$ in (3.1). Then, there exists $\lambda_0 > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$ such that for every $\lambda > \lambda_0$, the system (3.1) has a maximal solution $(u_\lambda, v_\lambda)$.
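The monotone behaviour of the scheme (3.2) can be illustrated on a deliberately simplified toy problem. The sketch below is not the construction used in the paper: it assumes that $f$ and $g$ are positive constants (stand-ins for the averaged sources) and that $u$, $v$ are spatially constant, so that the Laplacian drops out and (3.2) becomes a scalar recursion started from the super-solution $(0, 0)$. The function name `monotone_iteration` and all parameter values are hypothetical.

```python
import math


def monotone_iteration(lam, f, g, K=None, tol=1e-12, max_iter=100_000):
    """Scheme (3.2) restricted to constant u, v (toy model, Laplacian = 0).

    For constants, (Delta - K) u_{n+1} = -K u_{n+1}, so (3.2) reads
        u_{n+1} = u_n - (lam * e^{v_n} * (e^{u_n} - 1) + f) / K,
        v_{n+1} = v_n - (lam * e^{u_n} * (e^{v_n} - 1) + g) / K.
    """
    if K is None:
        K = 2.0 * lam + 1.0  # any K > 2 * lam, as chosen in the proof
    u, v = 0.0, 0.0          # (0, 0) plays the role of the super-solution
    history = [(u, v)]
    for _ in range(max_iter):
        u_new = u - (lam * math.exp(v) * (math.exp(u) - 1.0) + f) / K
        v_new = v - (lam * math.exp(u) * (math.exp(v) - 1.0) + g) / K
        history.append((u_new, v_new))
        if abs(u_new - u) + abs(v_new - v) < tol:
            break
        u, v = u_new, v_new
    return history


if __name__ == "__main__":
    # lam = 10, f = g = 1: constant solutions satisfy lam * t * (1 - t) = 1
    # with t = e^u, and the iteration should approach the larger root.
    hist = monotone_iteration(10.0, 1.0, 1.0)
    print(len(hist), hist[-1])
    print(math.log((1.0 + math.sqrt(1.0 - 4.0 / 10.0)) / 2.0))
```

In this toy model with $f = g = c$, constant solutions exist exactly when $\lambda \ge 4c$, since $t(1-t) \le 1/4$ for $t = e^{u}$; for smaller $\lambda$ the iterates decrease without bound, which mirrors on a small scale the existence threshold discussed in the text.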
Proof. We proceed in the following steps:

Step 1. $(0, 0)$ is a super-solution for (3.1). If $(u, v)$ is a solution of (3.1), then
\[
\begin{aligned}
u &\sim \ln|x - p_j|^{m_j}\ \text{as } x \to p_j, \text{ where } m_j \text{ is the multiplicity of } p_j, \text{ for } p_j \in S_1,\\
v &\sim \ln|x - q_j|^{n_j}\ \text{as } x \to q_j, \text{ where } n_j \text{ is the multiplicity of } q_j, \text{ for } q_j \in S_2,
\end{aligned} \tag{3.6}
\]
see [1]. The functions $u \in C^2(\Omega \setminus S_1)$, $v \in C^2(\Omega \setminus S_2)$ and $u, v \in W^{1,q}(\Omega)$ for all $1 < q < 2$. Thus, there exists $\varepsilon > 0$ such that the balls $B(p_j, \varepsilon)$, $j = 1, \dots, k_1$, are mutually disjoint and $u(x) < 0$ in $B(p_j, \varepsilon)$ for all $j = 1, \dots, k_1$. Let $\Omega_\varepsilon := \Omega \setminus \cup_{j=1}^{k_1} B(p_j, \varepsilon)$ and $\max_{\overline{\Omega}_\varepsilon} u = u(x_0)$. If $x_0 \in \partial\Omega_\varepsilon$, then $u(x_0) < 0$. If $x_0 \in \Omega_\varepsilon$ is an interior point, then using the differential equation (3.1) we have
\[
0 \ge \Delta u(x_0) \ge \lambda e^{v}(e^{u}-1). \tag{3.7}
\]
Therefore, $e^{u(x_0)} - 1 \le 0$, i.e., $u(x_0) \le 0$. A similar argument will show that $v \le 0$. In fact, the maximum principle implies that the strict inequality $u < 0$, $v < 0$ holds. In particular, it follows that $(0, 0)$ is a super-solution of (3.1) for any $\lambda > 0$. This proves (1.2) of Theorem 1.1. Integrating Eqs. (3.1) over $\Omega$ and using the fact that $u < 0$, $v < 0$,
\[
\begin{aligned}
\frac{4\pi N_1}{\lambda} &\le \frac{4\pi N_1}{\lambda} + \int_\Omega e^{u+v} = \int_\Omega e^{v} < |\Omega|,\\
\frac{4\pi N_2}{\lambda} &\le \frac{4\pi N_2}{\lambda} + \int_\Omega e^{u+v} = \int_\Omega e^{u} < |\Omega|,
\end{aligned} \tag{3.8}
\]
we derive the necessary condition
\[
\lambda > \frac{4\pi\max\{N_1, N_2\}}{|\Omega|} \tag{3.9}
\]
for the existence of solutions to (1.1).

Step 2. Iteration. Since here $\mu$, $\nu$ are sums of Dirac measures, we indicate the first two steps in the iteration scheme (3.2), where $K > 0$ will be chosen later. For $n = 1$, we have
\[
\begin{aligned}
(\Delta - K)u_1 &= \mu,\\
(\Delta - K)v_1 &= \nu
\end{aligned}
\quad\text{in } \Omega. \tag{3.10}
\]
Referring to [1], there exists a solution $(u_1, v_1)$ of (3.10) which is $C^2$ away from the singular points $\{p_j\}_{j=1}^{k_1}$ and $\{q_j\}_{j=1}^{k_2}$ respectively. Choose $\varepsilon > 0$ small such that the balls $B_\varepsilon(p_i)$, $B_\varepsilon(q_j)$, $i = 1, \dots, k_1$ and $j = 1, \dots, k_2$, are mutually disjoint. Since $u_1 \sim \ln|x - p_j|^{m_j}$, $v_1 \sim \ln|x - q_j|^{n_j}$ near the singular points, we have $u_1 < 0$, $v_1 < 0$ in the closure $\overline{\Omega}_\varepsilon$, where $\Omega_\varepsilon := \Omega \setminus \{\cup_{j=1}^{k_1} B_\varepsilon(p_j) \cup_{j=1}^{k_2} B_\varepsilon(q_j)\}$. Thus,
\[
(\Delta - K)u_1 = 0 \text{ in } \Omega_\varepsilon,\quad (\Delta - K)v_1 = 0 \text{ in } \Omega_\varepsilon,\quad u_1 < 0,\ v_1 < 0 \text{ on } \partial\Omega_\varepsilon.
\]
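The maximum principle implies that both $u_1$ and $v_1$ must achieve the maximum on the boundary. Hence $u_1 < 0$, $v_1 < 0$ in $\Omega$. For $n = 2$,
\[
\begin{aligned}
(\Delta - K)u_2 &= \lambda e^{v_1}(e^{u_1}-1) - K u_1 + \mu,\\
(\Delta - K)v_2 &= \lambda e^{u_1}(e^{v_1}-1) - K v_1 + \nu
\end{aligned}
\quad\text{in } \Omega.
\]
We have
\[
(\Delta - K)(u_2 - u_1) = \lambda e^{v_1}(e^{u_1}-1) - K u_1 = (K - \lambda e^{v_1}e^{w})(-u_1) > 0
\]
if we choose, say, $K > 2\lambda$. Here $w$ is a function between $u_1$ and $0$. Again, the maximum principle implies that $u_2 < u_1$ in $\Omega$. Similarly, $v_2 < v_1$. Thus choosing the constant $K > 2\lambda$, for each $\lambda > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$ we obtain monotone sequences $\{u_n\} \subset C^2(\Omega \setminus S_1)$, $\{v_n\} \subset C^2(\Omega \setminus S_2)$ from Proposition 3.1.

Step 3. Existence of sub-solution. First note that if $(\underline{u}, \underline{v})$ is a sub-solution of (3.1), i.e.,
\[
\begin{aligned}
\Delta\underline{u} &> \lambda e^{\underline{v}}(e^{\underline{u}}-1)+\mu,\\
\Delta\underline{v} &> \lambda e^{\underline{u}}(e^{\underline{v}}-1)+\nu,
\end{aligned} \tag{3.11}
\]
then $\underline{u} < 0$, $\underline{v} < 0$: If $\max\underline{u} := \underline{u}(x_0) > 0$, then at the point $x_0$, $(e^{\underline{u}(x_0)} - 1) > 0$ and hence $\lambda e^{\underline{v}}(e^{\underline{u}}-1)+\mu > 0$ at $x_0$. But then $\Delta\underline{u}(x_0) > 0$, contradicting that $x_0$ is a point of maximum. Next, we claim that
\[
\underline{u} < u_n,\quad \underline{v} < v_n \quad\text{for all } n \ge 1. \tag{3.12}
\]
We have
\[
(\Delta - K)(\underline{u} - u_1) > \lambda e^{\underline{v}}(e^{\underline{u}}-1) - K\underline{u} = (K - \lambda e^{\underline{v}}e^{w})(-\underline{u}),
\]
where $\underline{u} < w < 0$. For our choice of $K$, $(K - \lambda e^{\underline{v}}e^{w}) > 0$. Hence, the maximum principle implies $\underline{u} < u_1$. Similarly, $\underline{v} < v_1$. Assuming that $\underline{u} < u_k$, $\underline{v} < v_k$ for all $k = 1, \dots, n-1$, we have
\[
\begin{aligned}
(\Delta - K)(\underline{u} - u_n) &> \lambda e^{\underline{v}}(e^{\underline{u}}-1) - \lambda e^{v_{n-1}}(e^{u_{n-1}}-1) + K(u_{n-1} - \underline{u})\\
&> \lambda e^{v_{n-1}}(e^{\underline{u}} - e^{u_{n-1}}) + K(u_{n-1} - \underline{u})\\
&= (K - \lambda e^{v_{n-1}}e^{w})(u_{n-1} - \underline{u}) > 0,
\end{aligned}
\]
where $\underline{u} < w < u_{n-1}$. Again, using the maximum principle, we see that $\underline{u} < u_n$. Similarly, $\underline{v} < v_n$. Therefore (3.12) is proved.

Lemma 3.1. There exists $\lambda_0 > 0$ such that for all $\lambda > \lambda_0$, the system (3.1) has a sub-solution $(w, z)$ (independent of $\lambda$).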
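Proof. From Theorem 4.13 in [1] and the existence of the Green's function for the Laplacian on a torus, there exist unique solutions $(u_0, v_0)$ of the equations
\[
\Delta u_0 = -\frac{4\pi N_1}{|\Omega|} + 4\pi\sum_{j=1}^{k_1} m_j\,\delta_{p_j},\quad \int_\Omega u_0 = 0, \tag{3.13}
\]
\[
\Delta v_0 = -\frac{4\pi N_2}{|\Omega|} + 4\pi\sum_{j=1}^{k_2} n_j\,\delta_{q_j},\quad \int_\Omega v_0 = 0. \tag{3.14}
\]
The functions $u_0 \in C^2(\Omega \setminus S_1)$, $v_0 \in C^2(\Omega \setminus S_2)$ and for $\varepsilon > 0$ sufficiently small,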
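$u_0 \sim 2m_j \ln|x - p_j|$ in $B(p_j, \varepsilon)$ for $j = 1, \dots, k_1$, and $v_0 \sim 2n_j \ln|x - q_j|$ in $B(q_j, \varepsilon)$ for $j = 1, \dots, k_2$. Moreover, $u_0, v_0 \in W^{1,p}(\Omega)$ for all $1 < p < 2$ and $e^{u_0}, e^{v_0} \in L^\infty(\Omega)$. If we write $u = u_1 - u_0$, $v = v_1 - v_0$, then it can be easily verified that $(u_1, v_1)$ is a solution of (3.1) if and only if $(u, v)$ solves
\[ \begin{aligned} \Delta u &= \lambda e^{v_0+v}(e^{u_0+u} - 1) + \frac{4\pi N_1}{|\Omega|}, \\ \Delta v &= \lambda e^{u_0+u}(e^{v_0+v} - 1) + \frac{4\pi N_2}{|\Omega|}. \end{aligned} \tag{3.15} \]
Thus, it suffices to find a sub-solution of (3.15). The construction of these sub-solutions is similar to [2], with suitable modifications due to a different nonlinear term in our equations. Let $\varepsilon > 0$ be sufficiently small so that the balls $B(p_j, 2\varepsilon)$, $1 \le j \le k_1$, and $B(q_j, 2\varepsilon)$, $1 \le j \le k_2$, are mutually disjoint. Let $\varphi_\varepsilon, \psi_\varepsilon$ be smooth functions such that
\[ \begin{aligned} &0 \le \varphi_\varepsilon \le 1; && 0 \le \psi_\varepsilon \le 1; \\ &\varphi_\varepsilon \equiv 1 \text{ in } B(p_j, \varepsilon),\ j = 1, \dots, k_1; && \psi_\varepsilon \equiv 1 \text{ in } B(q_j, \varepsilon),\ j = 1, \dots, k_2; \\ &\varphi_\varepsilon \equiv 0 \text{ in } \Omega \setminus B(p_j, 2\varepsilon),\ j = 1, \dots, k_1; && \psi_\varepsilon \equiv 0 \text{ in } \Omega \setminus B(q_j, 2\varepsilon),\ j = 1, \dots, k_2. \end{aligned} \tag{3.16} \]
Consider the functions
\[ f_\varepsilon := \frac{8\pi N_1}{|\Omega|}\varphi_\varepsilon, \qquad g_\varepsilon := \frac{8\pi N_2}{|\Omega|}\psi_\varepsilon. \tag{3.17} \]
Then
\[ C_1(\varepsilon) := \int_\Omega f_\varepsilon \, dx \le \frac{32\pi^2 N_1^2}{|\Omega|}\varepsilon^2, \tag{3.18} \]
\[ C_2(\varepsilon) := \int_\Omega g_\varepsilon \, dx \le \frac{32\pi^2 N_2^2}{|\Omega|}\varepsilon^2. \tag{3.19} \]
Define $\bar f_\varepsilon := f_\varepsilon - C_1(\varepsilon)$, $\bar g_\varepsilon := g_\varepsilon - C_2(\varepsilon)$ so that $\int_\Omega \bar f_\varepsilon \, dx = 0$ and $\int_\Omega \bar g_\varepsilon \, dx = 0$. From Theorem 4.7 in [1], the equations
\[ \Delta w = \bar f_\varepsilon \tag{3.20} \]
and
\[ \Delta z = \bar g_\varepsilon \tag{3.21} \]
have a unique solution up to an additive constant. From (3.18), (3.19) we have
\[ \bar f_\varepsilon(x) \ge \frac{4\pi N_1}{|\Omega|}\left(2 - 8\pi N_1 \varepsilon^2\right) > \frac{4\pi N_1}{|\Omega|} \quad \text{for } x \in B(p_j, \varepsilon); \tag{3.22} \]
\[ \bar g_\varepsilon(x) \ge \frac{4\pi N_2}{|\Omega|}\left(2 - 8\pi N_2 \varepsilon^2\right) > \frac{4\pi N_2}{|\Omega|} \quad \text{for } x \in B(q_j, \varepsilon) \tag{3.23} \]
for $\varepsilon > 0$ small enough and $j = 1, \dots, k_1$ (respectively, $j = 1, \dots, k_2$). Henceforth we fix $\varepsilon$ so that the inequalities (3.22) and (3.23) hold. Now choose a solution $w_0$ of (3.20) and $z_0$ of (3.21) such that
\[ e^{u_0+w_0} < 1, \qquad e^{v_0+z_0} < 1. \tag{3.24} \]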
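Then for any $\lambda > 0$, we have
\[ \Delta w_0 = \bar f_\varepsilon > \frac{4\pi N_1}{|\Omega|} \ge \lambda e^{v_0+z_0}(e^{u_0+w_0} - 1) + \frac{4\pi N_1}{|\Omega|} \quad \text{in } B(p_i, \varepsilon),\ 1 \le i \le k_1, \tag{3.25} \]
\[ \Delta z_0 = \bar g_\varepsilon > \frac{4\pi N_2}{|\Omega|} \ge \lambda e^{u_0+w_0}(e^{v_0+z_0} - 1) + \frac{4\pi N_2}{|\Omega|} \quad \text{in } B(q_i, \varepsilon),\ 1 \le i \le k_2. \tag{3.26} \]
Let $m_1 := \inf\{e^{u_0(x)+w_0(x)} : x \in \Omega \setminus \cup_{i=1}^{k_1} B(p_i, \varepsilon)\}$, $m_2 := \inf\{e^{v_0(x)+z_0(x)} : x \in \Omega \setminus \cup_{i=1}^{k_2} B(q_i, \varepsilon)\}$ and $M_1 := \sup\{e^{u_0(x)+w_0(x)} : x \in \Omega \setminus \cup_{i=1}^{k_1} B(p_i, \varepsilon)\}$, $M_2 := \sup\{e^{v_0(x)+z_0(x)} : x \in \Omega \setminus \cup_{i=1}^{k_2} B(q_i, \varepsilon)\}$. Then, for $x \in \Omega \setminus \cup_{i=1}^{k_1} B(p_i, \varepsilon)$ we have $e^{v_0+z_0}(e^{u_0+w_0} - 1) < m_2(M_1 - 1)$, and for $x \in \Omega \setminus \cup_{i=1}^{k_2} B(q_i, \varepsilon)$ we have $e^{u_0+w_0}(e^{v_0+z_0} - 1) < m_1(M_2 - 1)$. Thus we can choose $\lambda_0 > 0$ sufficiently large so that for all $\lambda > \lambda_0$,
\[ \Delta w_0 = \bar f_\varepsilon > \lambda e^{v_0+z_0}(e^{u_0+w_0} - 1) + \frac{4\pi N_1}{|\Omega|} \quad \text{in } \Omega \setminus \cup_{i=1}^{k_1} B(p_i, \varepsilon), \tag{3.27} \]
\[ \Delta z_0 = \bar g_\varepsilon > \lambda e^{u_0+w_0}(e^{v_0+z_0} - 1) + \frac{4\pi N_2}{|\Omega|} \quad \text{in } \Omega \setminus \cup_{i=1}^{k_2} B(q_i, \varepsilon). \tag{3.28} \]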
In particular, $(w_0, z_0)$ is a strict sub-solution of (3.15) for all $\lambda > \lambda_0$.

Step 4. Existence of solutions for (3.1). For each $\lambda > \lambda_0$, $(u, v) := (u_0 + w_0, v_0 + z_0)$ is a sub-solution for (3.1). Hence, the sequence $(u_n(x), v_n(x)) \to (u_\lambda(x), v_\lambda(x))$ for almost every $x \in \Omega$, and $u_n \to u_\lambda$, $v_n \to v_\lambda$ in the $L^2$ norm. Since $(u_{n+1}, v_{n+1})$ satisfies (3.2), multiplying the equations by $(u_{n+1}, v_{n+1})$ and integrating by parts, we conclude that the right hand side of (3.2) converges in $W^{1,p}(\Omega)$ for all $1 < p < 2$. Taking the limit as $n \to \infty$ in Eqs. (3.2) and using the elliptic estimates, we conclude that $(u_\lambda, v_\lambda)$ is a solution of (3.1), which is $C^k$ away from the singular points. By definition, it is unique. If $(u, v)$ is any other solution of (3.1), then it is a sub-solution and hence $u < u_n$, $v < v_n$ for all $n = 1, 2, \dots$. It follows that $u < u_\lambda$, $v < v_\lambda$ and hence $(u_\lambda, v_\lambda)$ is maximal. This completes the proof of Theorem 3.1.

Hence, there exists $\lambda_0 > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$ such that for all $\lambda \ge \lambda_0$, Eq. (3.15) has a $C^2$ maximal solution, which we continue to denote by $(u_\lambda, v_\lambda)$. Let
\[ \Lambda := \{\lambda > 0 : \text{there exists a maximal solution } (u_\lambda, v_\lambda) \text{ of (3.15)}\}. \]
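For $\lambda_1 \in \Lambda$ and $\lambda_1 < \lambda_2$, it is immediate from Eqs. (3.15) that the maximal solution $(u_{\lambda_1}, v_{\lambda_1})$ is a sub-solution for (3.15) with $\lambda = \lambda_2$. Hence, from Theorem 3.1, $\lambda_2 \in \Lambda$ and $u_{\lambda_1} < u_{\lambda_2}$, $v_{\lambda_1} < v_{\lambda_2}$, i.e.,
\[ \text{the map } \lambda \mapsto (u_\lambda, v_\lambda) \text{ is monotone.} \tag{3.29} \]
Define
\[ \lambda_* := \inf\{\lambda > 0 : \lambda \in \Lambda\} \ge \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}. \tag{3.30} \]
We prove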
Theorem 3.2. (i) For $x \in \Omega$, $\inf_{\lambda > \lambda_*} u_\lambda(x)$ and $\inf_{\lambda > \lambda_*} v_\lambda(x)$ are finite.
(ii) Define
\[ \lim_{\lambda \to \lambda_*} u_\lambda(x) := u_*(x); \qquad \lim_{\lambda \to \lambda_*} v_\lambda(x) := v_*(x) \quad \text{a.e. } x \in \Omega. \tag{3.31} \]
Then $(u_*, v_*) \in H^1(\Omega) \times H^1(\Omega)$ and is a solution for (3.15) with $\lambda = \lambda_*$. In particular, $\lambda_* > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$.
(iii) $(u_*, v_*)$ is a strict sub-solution of (3.15) for all $\lambda > \lambda_*$.

Proof. (i) Writing $u_{\lambda_n} = u_n$ for simplicity of notation, suppose there exists $x_n \in \Omega$ such that $u_n(x_n) \to -\infty$. Then we claim that $u_n(x) \to -\infty$ for almost every $x \in \Omega$. Multiplying the first equation in (3.15) by $u_n - \bar u_n$ and integrating by parts, using the
Poincaré inequality we get
\[ \int_\Omega |\nabla u_n|^2 \le \int_\Omega |\lambda e^{v_0+v_n}(e^{u_0+u_n} - 1)|\,|u_n - \bar u_n| \le \lambda \|e^{v_0+v_n}(e^{u_0+u_n} - 1)\|_\infty \, |\Omega| \, \Lambda_1^{-1/2} \|\nabla u_n\|_2, \tag{3.32} \]
and hence $\|\nabla u_n\| \le C(\lambda, \Lambda_1, |\Omega|)$, where $\Lambda_1$ denotes the first eigenvalue of $\Delta$ for a torus. Thus the sequence $\{\|\nabla u_n\|\}_n$ is uniformly bounded. Again, the Poincaré inequality implies that
\[ \|u_n - \bar u_n\|_{L^2(\Omega)} \le C\Lambda_1^{-1/2}\|\nabla u_n\| \le C(\lambda, \Lambda_1, |\Omega|), \tag{3.33} \]
and hence $\{\|u_n - \bar u_n\|\}$ is uniformly bounded (independent of $n$). From the Calderón-Zygmund inequality we conclude
\[ \|u_n - \bar u_n\|_{W^{2,2}(\Omega)} \le C(\lambda, \Lambda_1, |\Omega|) \tag{3.34} \]
uniformly in $n$, and it follows from the Sobolev embedding theorem that
\[ \|u_n - \bar u_n\|_{C^0(\Omega)} \le C(\lambda, \Lambda_1, |\Omega|) \tag{3.35} \]
uniformly in $n$. If there exists $x_0 \in \Omega$ such that $u_n(x_0) \to -\infty$ as $\lambda_n \to \lambda_*$, then inequality (3.34) implies that $u_n(x) \to -\infty$ as $\lambda_n \to \lambda_*$ for all $x \in \Omega$. But then, substituting $(u_n, v_n)$ in (3.15) with $\lambda = \lambda_n$ and integrating by parts, we have
\[ \lambda \int_\Omega e^{v_0+v_n}(1 - e^{u_0+u_n}) = 4\pi N_1, \qquad \lambda \int_\Omega e^{u_0+u_n}(1 - e^{v_0+v_n}) = 4\pi N_2, \tag{3.36} \]
and we get a contradiction if $e^{u_0+u_n} \to 0$ as $\lambda_n \to \lambda_*$. Therefore, the limits $\lim_{\lambda_n \to \lambda_*} u_n(x) := u_*(x)$ and $\lim_{\lambda_n \to \lambda_*} v_n(x) := v_*(x)$ exist.

(ii) By definition $(u_n, v_n) \to (u_*, v_*)$ pointwise a.e. in $\Omega$. Since $e^{v_0+v_n} < 1$ and $0 < 1 - e^{u_0+u_n} < 1$, the functions $e^{v_0+v_n}(1 - e^{u_0+u_n})$ are uniformly bounded in $L^p(\Omega)$ for all $p \ge 1$, and $\lambda_n e^{v_0+v_n}(1 - e^{u_0+u_n}) \to \lambda_* e^{v_0+v_*}(1 - e^{u_0+u_*})$ pointwise in $\Omega$. Again, as in (3.32) we get
\[ \begin{aligned} \|\nabla u_n\|_2^2 &= \lambda_n \int_\Omega e^{v_0+v_n}(e^{u_0+u_n} - 1)u_n \, dx \le C(\lambda_*)\|e^{v_0+v_n}(e^{u_0+u_n} - 1)\|_{L^2}\|u_n\|_{L^2} \\ &\le C(\lambda_*, |\Omega|, \Lambda_1)\|\nabla u_n\|_2, \end{aligned} \tag{3.37} \]
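where $C(\lambda_*, |\Omega|, \Lambda_1)$ is a constant independent of $n$. Similarly, $\|\nabla v_n\|$ is uniformly bounded. Moreover, since $\{u_n\}$, $\{v_n\}$ are monotone sequences of functions bounded below by $u_*$, $v_*$ almost everywhere, it follows that $u_n \to u_*$, $v_n \to v_*$ in the $L^2$ norm. Therefore, $(u_n, v_n) \rightharpoonup (u_*, v_*)$ weakly in $H^1(\Omega) \times H^1(\Omega)$ and strongly in $L^p(\Omega) \times L^p(\Omega)$ for all $p \ge 1$. Using the fact that $(u_n, v_n)$ satisfies Eq. (3.15) with $\lambda = \lambda_n$, for all $\varphi \in H^1(\Omega)$,
\[ \begin{aligned} \int_\Omega \nabla u_* \nabla\varphi &= \lim_{\lambda_n \to \lambda_*} \int_\Omega \nabla u_n \nabla\varphi \\ &= \lim_{\lambda_n \to \lambda_*} \int_\Omega \left(\lambda_n e^{v_0+v_n}(e^{u_0+u_n} - 1) + \frac{4\pi N_1}{|\Omega|}\right)\varphi \\ &= \int_\Omega \left(\lambda_* e^{v_0+v_*}(e^{u_0+u_*} - 1) + \frac{4\pi N_1}{|\Omega|}\right)\varphi. \end{aligned} \tag{3.38} \]
Moreover, since the map $H^1(\Omega) \to L^1(\Omega)$ given by $\varphi \mapsto e^{p\varphi}$, $p \in \mathbb{R}$, is compact, we conclude that
\[ e^{u_0+u_n} \to e^{u_0+u_*}, \qquad e^{v_0+v_n} \to e^{v_0+v_*} \quad \text{as } \lambda_n \to \lambda_* \tag{3.39} \]
in the $L^p$ norm for all $p > 0$. Therefore, from (3.38) and the fact that $u_n$ satisfies (3.15), we get
\[ \begin{aligned} \int_\Omega |\nabla(u_n - u_*)|^2 &= \int_\Omega \left(\lambda_n e^{v_0+v_n}(e^{u_0+u_n} - 1) - \lambda_* e^{v_0+v_*}(e^{u_0+u_*} - 1)\right)(u_n - u_*) \\ &\le \|\lambda_n e^{v_0+v_n}(e^{u_0+u_n} - 1) - \lambda_* e^{v_0+v_*}(e^{u_0+u_*} - 1)\|_{L^2}\|u_n - u_*\|_{L^2} \to 0 \quad \text{as } \lambda_n \to \lambda_*, \end{aligned} \tag{3.40} \]
and $u_n \to u_*$ strongly in $H^1(\Omega)$. Similarly, we can see that $v_n \to v_*$ strongly in $H^1(\Omega)$.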
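(iii) Moreover, $(u_*, v_*)$ is a strict sub-solution of (3.15) for any $\lambda > \lambda_*$ since
\[ \Delta u_* = \lambda_* e^{v_*}(e^{u_*} - 1) + \frac{4\pi N_1}{|\Omega|} > \lambda e^{v_*}(e^{u_*} - 1) + \frac{4\pi N_1}{|\Omega|} \tag{3.41} \]
and
\[ \Delta v_* = \lambda_* e^{u_*}(e^{v_*} - 1) + \frac{4\pi N_2}{|\Omega|} > \lambda e^{u_*}(e^{v_*} - 1) + \frac{4\pi N_2}{|\Omega|}. \tag{3.42} \]
Hence, we must have
\[ u_* < u_\lambda; \qquad v_* < v_\lambda \tag{3.43} \]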
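for all $\lambda > \lambda_*$. Note that this also implies the optimality of $\lambda_*$, i.e., for $\lambda < \lambda_*$, (1.1) has no solution. Now, Theorem 1.1 is a consequence of Theorem 3.1 and Theorem 3.2.

4. An Approximate Problem

We shall now consider an approximate problem to (3.1), replacing the measures $\mu$ and $\nu$ by smooth, positive approximating functions. Choose $\delta > 0$ such that the balls $B(p, \delta)$, $p \in S_1 \cup S_2$, are mutually disjoint. For this choice of $\delta$, let $\eta(x) = \eta(|x|)$ denote a $C^\infty$ function with compact support such that $0 \le \eta \le 1$, $\eta \equiv 1$ in $B(0, \delta/2)$ and $\int \frac{\varepsilon}{(\varepsilon + r^2)^2}\eta(r) = \pi$. Thus,
\[ \frac{\varepsilon}{(\varepsilon + r^2)^2}\,\eta \to \pi\delta_0 \tag{4.1} \]
in the sense of distribution. Let $\rho(r) := \frac{4\pi\varepsilon}{(\varepsilon + r^2)^2}\eta(r)$ and define the functions
\[ \mu_\varepsilon(x) := \sum_{j=1}^{k_1} 4\pi m_j \rho(|x - p_j|), \qquad \nu_\varepsilon(x) := \sum_{j=1}^{k_2} 4\pi n_j \rho(|x - q_j|). \tag{4.2} \]
Then $\mu_\varepsilon \ge 0$, $\nu_\varepsilon \ge 0$ are smooth, non-negative functions such that $\mu_\varepsilon \rightharpoonup \mu$, $\nu_\varepsilon \rightharpoonup \nu$ in the sense of distribution as $\varepsilon \to 0$. Moreover,
\[ \int_\Omega \mu_\varepsilon = 4\pi N_1, \qquad \int_\Omega \nu_\varepsilon = 4\pi N_2 \quad \text{for all } \varepsilon > 0. \tag{4.3} \]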
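As a quick check of the normalization behind (4.1), the following sketch computes the total mass of the profile $\varepsilon/(\varepsilon + r^2)^2$ over the whole plane in polar coordinates. It ignores the cutoff $\eta$ and integrates over $\mathbb{R}^2$ rather than the torus; both simplifications are assumptions made only for this illustration.

```latex
\begin{align*}
\int_{\mathbb{R}^2} \frac{\varepsilon}{(\varepsilon + |x|^2)^2}\,dx
  &= 2\pi \int_0^\infty \frac{\varepsilon\, r}{(\varepsilon + r^2)^2}\,dr
  && \text{(polar coordinates)} \\
  &= \pi\varepsilon \int_0^\infty \frac{ds}{(\varepsilon + s)^2}
  && (s = r^2,\ ds = 2r\,dr) \\
  &= \pi\varepsilon \left[-\frac{1}{\varepsilon + s}\right]_0^\infty = \pi .
\end{align*}
```

The mass does not depend on $\varepsilon$ while the profile concentrates at the origin as $\varepsilon \to 0$, which is consistent with the limit $\pi\delta_0$ in (4.1).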
Now consider the system
\[ \begin{aligned} \Delta u &= \lambda e^{v}(e^{u} - 1) + \mu_\varepsilon, \\ \Delta v &= \lambda e^{u}(e^{v} - 1) + \nu_\varepsilon \end{aligned} \quad \text{in } \Omega. \tag{4.4} \]
Arguing as in Step 1 in the proof of Theorem 3.1, any solution $(u^\varepsilon, v^\varepsilon)$ of (4.4) must satisfy
\[ u^\varepsilon(x) < 0, \qquad v^\varepsilon(x) < 0 \quad \text{in } \Omega, \tag{4.5} \]
and integrating by parts, we immediately see that necessarily
\[ \lambda > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}. \tag{4.6} \]
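The integration step behind (4.6) can be spelled out as follows. This is a sketch for the first equation of (4.4), assuming only the periodicity of $u^\varepsilon$ on the torus, the mass identity (4.3), and the sign condition (4.5); the second equation is handled in the same way with $N_2$.

```latex
\begin{align*}
0 = \int_\Omega \Delta u^\varepsilon
  &= \lambda \int_\Omega e^{v^\varepsilon}\bigl(e^{u^\varepsilon} - 1\bigr)
     + \int_\Omega \mu_\varepsilon
   = -\lambda \int_\Omega e^{v^\varepsilon}\bigl(1 - e^{u^\varepsilon}\bigr) + 4\pi N_1, \\
4\pi N_1 &= \lambda \int_\Omega e^{v^\varepsilon}\bigl(1 - e^{u^\varepsilon}\bigr)
   < \lambda\,|\Omega|
   && \text{since } 0 < e^{v^\varepsilon}\bigl(1 - e^{u^\varepsilon}\bigr) < 1
      \text{ by (4.5)}.
\end{align*}
```

Combining this with the analogous bound $4\pi N_2 < \lambda|\Omega|$ gives the stated lower bound on $\lambda$.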
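From [1], there exist unique, smooth functions $u_0^\varepsilon$, $v_0^\varepsilon$ satisfying
\[ \begin{aligned} \Delta u_0^\varepsilon &= -\frac{4\pi N_1}{|\Omega|} + \mu_\varepsilon; & \int_\Omega u_0^\varepsilon &= 0, \\ \Delta v_0^\varepsilon &= -\frac{4\pi N_2}{|\Omega|} + \nu_\varepsilon; & \int_\Omega v_0^\varepsilon &= 0 \end{aligned} \quad \text{in } \Omega. \tag{4.7} \]
It is easy to verify that
\[ u_0^\varepsilon(x) = \sum_{j=1}^{k_1} m_j \ln(|x - p_j|^2 + \varepsilon) + O(1), \qquad v_0^\varepsilon(x) = \sum_{j=1}^{k_2} n_j \ln(|x - q_j|^2 + \varepsilon) + O(1). \]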
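Thus,
\[ \begin{aligned} &u_0^\varepsilon \to u_0 \text{ in } C^2(\Omega \setminus S_1), \quad v_0^\varepsilon \to v_0 \text{ in } C^2(\Omega \setminus S_2) && \text{as } \varepsilon \to 0, \\ &e^{u_0^\varepsilon} \to e^{u_0} \text{ in } C^0(\Omega), \quad e^{v_0^\varepsilon} \to e^{v_0} \text{ in } C^0(\Omega) && \text{as } \varepsilon \to 0. \end{aligned} \tag{4.8} \]
After the change of variables $u \to u + u_0^\varepsilon$, $v \to v + v_0^\varepsilon$, the system (4.4) is equivalent to
\[ \begin{aligned} \Delta u &= \lambda e^{v_0^\varepsilon+v}(e^{u_0^\varepsilon+u} - 1) + \frac{4\pi N_1}{|\Omega|}, \\ \Delta v &= \lambda e^{u_0^\varepsilon+u}(e^{v_0^\varepsilon+v} - 1) + \frac{4\pi N_2}{|\Omega|} \end{aligned} \quad \text{in } \Omega. \tag{4.9} \]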
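The equivalence of (4.4) and (4.9) is a one-line computation. In the sketch below, $(U, V)$ is a hypothetical name for a solution of (4.4), introduced only to keep the old and new unknowns apart, and $u := U - u_0^\varepsilon$, $v := V - v_0^\varepsilon$; the only input is (4.7).

```latex
\begin{align*}
\Delta u = \Delta U - \Delta u_0^\varepsilon
  &= \lambda e^{V}\bigl(e^{U} - 1\bigr) + \mu_\varepsilon
     - \Bigl(-\frac{4\pi N_1}{|\Omega|} + \mu_\varepsilon\Bigr) \\
  &= \lambda e^{v_0^\varepsilon + v}\bigl(e^{u_0^\varepsilon + u} - 1\bigr)
     + \frac{4\pi N_1}{|\Omega|} .
\end{align*}
```

The smooth source term $\mu_\varepsilon$ cancels, leaving the constant $4\pi N_1/|\Omega|$; the equation for $v$ follows by exchanging the roles of the two unknowns.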
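Next, we show that solutions of (4.4) converge to solutions of (3.1) as $\varepsilon \to 0$.

Lemma 4.1. For a fixed $\lambda$, assume that the system (4.9) has a solution $(u^\varepsilon, v^\varepsilon)$ for all $\varepsilon \in (0, \varepsilon_0)$. Then
(i) $\lim_{\varepsilon \to 0} u^\varepsilon(x)$ and $\lim_{\varepsilon \to 0} v^\varepsilon(x)$ exist;
(ii) $(u^\varepsilon, v^\varepsilon) \to (u, v)$ strongly in $H^1(\Omega) \times H^1(\Omega)$ and $(u, v)$ is a solution for (3.15).

Proof. Since
\[ u_0^\varepsilon + u^\varepsilon < 0, \qquad v_0^\varepsilon + v^\varepsilon < 0, \tag{4.10} \]
$\{u^\varepsilon\}_\varepsilon$, $\{v^\varepsilon\}_\varepsilon$ are uniformly bounded above in $\Omega$ for all $\varepsilon > 0$. We need to prove that $\liminf_{\varepsilon \to 0} u^\varepsilon(x)$ and $\liminf_{\varepsilon \to 0} v^\varepsilon(x)$ exist almost everywhere. Multiplying the first equation in (4.9) by $u^\varepsilon - \bar u^\varepsilon$ and integrating by parts, using the Poincaré inequality we get
\[ \int_\Omega |\nabla u^\varepsilon|^2 \le \int_\Omega |\lambda e^{v_0^\varepsilon+v^\varepsilon}(e^{u_0^\varepsilon+u^\varepsilon} - 1)|\,|u^\varepsilon - \bar u^\varepsilon| \le \lambda \|e^{v_0^\varepsilon+v^\varepsilon}(e^{u_0^\varepsilon+u^\varepsilon} - 1)\|_\infty \, |\Omega| \, \Lambda_1^{-1/2}\|\nabla u^\varepsilon\|_2, \]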
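and hence
\[ \|\nabla u^\varepsilon\| \le C(\lambda, \Lambda_1, |\Omega|), \tag{4.11} \]
where $\Lambda_1$ denotes the first eigenvalue of $\Delta$ for a torus. Thus the sequence $\{\|\nabla u^\varepsilon\|\}_\varepsilon$ is uniformly bounded. Again, the Poincaré inequality implies that
\[ \|u^\varepsilon - \bar u^\varepsilon\| \le C\Lambda_1^{-1/2}\|\nabla u^\varepsilon\| \le C(\lambda, \Lambda_1, |\Omega|), \tag{4.12} \]
and hence $\{\|u^\varepsilon - \bar u^\varepsilon\|\}$ is uniformly bounded (independent of $\varepsilon$). From the Calderón-Zygmund inequality we conclude
\[ \|u^\varepsilon - \bar u^\varepsilon\|_{W^{2,2}(\Omega)} \le C(\lambda, \Lambda_1, |\Omega|), \tag{4.13} \]
uniformly in $\varepsilon$, and it follows from the Sobolev embedding theorem that
\[ \|u^\varepsilon - \bar u^\varepsilon\|_{C^0(\Omega)} \le C(\lambda, \Lambda_1, |\Omega|) \tag{4.14} \]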
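uniformly in $\varepsilon$. If there exists $x_0 \in \Omega$ such that $u^\varepsilon(x_0) \to -\infty$ as $\varepsilon \to 0$, then inequality (4.14) implies that $u^\varepsilon(x) \to -\infty$ as $\varepsilon \to 0$ for all $x \in \Omega$. But then, substituting $(u^\varepsilon, v^\varepsilon)$ in (4.9) and integrating by parts, we have
\[ \lambda \int_\Omega e^{v_0^\varepsilon+v^\varepsilon}(1 - e^{u_0^\varepsilon+u^\varepsilon}) = 4\pi N_1, \qquad \lambda \int_\Omega e^{u_0^\varepsilon+u^\varepsilon}(1 - e^{v_0^\varepsilon+v^\varepsilon}) = 4\pi N_2, \tag{4.15} \]
and we get a contradiction if $e^{u_0^\varepsilon+u^\varepsilon} \to 0$ as $\varepsilon \to 0$. Therefore, the limits $\lim_{\varepsilon \to 0} u^\varepsilon(x) := u(x)$ and $\lim_{\varepsilon \to 0} v^\varepsilon(x) := v(x)$ exist, and hence $\|u^\varepsilon\| \to \|u\|$ and $\|v^\varepsilon\| \to \|v\|$ in the $L^2$ norm. Together with (4.11) we conclude that $\{u^\varepsilon\}$, $\{v^\varepsilon\}$ are uniformly bounded in $H^1(\Omega)$ and hence $u^\varepsilon \rightharpoonup u$, $v^\varepsilon \rightharpoonup v$ weakly in $H^1(\Omega)$ as $\varepsilon \to 0$. Hence, for $\varphi \in H^1(\Omega)$,
\[ \begin{aligned} \int_\Omega \nabla u \nabla\varphi &= \lim_{\varepsilon \to 0} \int_\Omega \nabla u^\varepsilon \nabla\varphi \\ &= \lim_{\varepsilon \to 0} \int_\Omega \lambda e^{v_0^\varepsilon+v^\varepsilon}(e^{u_0^\varepsilon+u^\varepsilon} - 1)\varphi + \frac{4\pi N_1}{|\Omega|}\int_\Omega \varphi \\ &= \int_\Omega \lambda e^{v_0+v}(e^{u_0+u} - 1)\varphi + \frac{4\pi N_1}{|\Omega|}\int_\Omega \varphi. \end{aligned} \tag{4.16} \]
From the Moser-Trudinger inequality and the fact that $(u^\varepsilon, v^\varepsilon)$ are uniformly bounded in $H^1(\Omega)$, we conclude that $e^{v_0^\varepsilon+v^\varepsilon}(e^{u_0^\varepsilon+u^\varepsilon} - 1) \to e^{v_0+v}(e^{u_0+u} - 1)$ in $L^p$ for all $p > 0$. Now, using the equation satisfied by $(u^\varepsilon, v^\varepsilon)$ and (4.16), and taking $\varphi = (u^\varepsilon - u) - \overline{(u^\varepsilon - u)}$, it can be verified that $(u^\varepsilon, v^\varepsilon) \to (u, v)$ strongly in $H^1(\Omega) \times H^1(\Omega)$.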
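Next we prove

Proposition 4.1. (Existence of maximal solutions for the system (4.4)) Given $\varepsilon_0 > 0$ small, there exists $\lambda_0 = \lambda_0(\varepsilon_0)$ depending on $\varepsilon_0$ such that for each $\varepsilon \in (0, \varepsilon_0)$ the system (4.4) has a unique, smooth maximal solution $(u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ for all $\lambda > \lambda_0$ satisfying
\[ u_\lambda^\varepsilon < 0, \qquad v_\lambda^\varepsilon < 0. \tag{4.17} \]

Proof. Since $(0, 0)$ is a super-solution of (4.4), the iteration scheme (3.2) in Proposition 3.1 can be applied to system (4.4) to obtain a smooth solution once we prove the existence of a sub-solution for (4.4) for all $0 < \varepsilon < \varepsilon_0$. From Theorem 4.18 in [1], for a fixed positive constant $M > 0$, the equations
\[ (\Delta - M)w = \frac{4\pi N_1}{|\Omega|}, \qquad (\Delta - M)z = \frac{4\pi N_2}{|\Omega|} \tag{4.18} \]
have unique smooth solutions $(w, z)$. Let $a_1(\varepsilon_0) = \sup_{\varepsilon \le \varepsilon_0} \sup_\Omega u_0^\varepsilon$, $a_2(\varepsilon_0) = \sup_{\varepsilon \le \varepsilon_0} \sup_\Omega v_0^\varepsilon$ and choose a constant $c(\varepsilon_0) > 0$ such that $\sup_{x \in \Omega} w(x) - c(\varepsilon_0) + a_1(\varepsilon_0) < 0$, $\sup_{x \in \Omega} z(x) - c(\varepsilon_0) + a_2(\varepsilon_0) < 0$. Then the functions $w_1 := w - c(\varepsilon_0)$, $z_1 := z - c(\varepsilon_0)$ are such that
\[ \begin{aligned} \Delta w_1 &= Mw_1 + Mc(\varepsilon_0) + \frac{4\pi N_1}{|\Omega|}, & w_1 + u_0^\varepsilon &< 0 \quad \text{for all } \varepsilon \in [0, \varepsilon_0], \\ \Delta z_1 &= Mz_1 + Mc(\varepsilon_0) + \frac{4\pi N_2}{|\Omega|}, & z_1 + v_0^\varepsilon &< 0 \quad \text{for all } \varepsilon \in [0, \varepsilon_0]. \end{aligned} \tag{4.19} \]
Hence, there exists $\lambda_0(\varepsilon_0) > 0$ depending on $\varepsilon_0$ such that
\[ \begin{aligned} \Delta w_1 &= Mw_1 + Mc(\varepsilon_0) + \frac{4\pi N_1}{|\Omega|} > \lambda e^{v_0^\varepsilon+z_1}(e^{u_0^\varepsilon+w_1} - 1) + \frac{4\pi N_1}{|\Omega|}, \\ \Delta z_1 &= Mz_1 + Mc(\varepsilon_0) + \frac{4\pi N_2}{|\Omega|} > \lambda e^{u_0^\varepsilon+w_1}(e^{v_0^\varepsilon+z_1} - 1) + \frac{4\pi N_2}{|\Omega|} \end{aligned} \]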
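for all $\lambda > \lambda_0$. Thus, $(w_1, z_1)$ is a sub-solution of (4.9), independent of $\varepsilon$, for all $\lambda > \lambda_0$. From Proposition 3.1, there exists a smooth, maximal solution $(u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ of (4.9) for every $\lambda > \lambda_0$.

From Proposition 4.1, the set $\{\lambda > 0 : \text{there exists a maximal solution of (4.9)}\}$ is non-empty for all $\varepsilon \in (0, \varepsilon_0]$. Moreover, from the necessary condition (4.6),
\[ \lambda_\varepsilon := \inf\{\lambda > 0 : \text{there exists a maximal solution of (4.9)}\} \tag{4.20} \]
exists. By definition, $\lambda_\varepsilon$ is an optimal value for which a solution exists for (4.9). We will show that the family of solutions $(u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ has properties similar to the ones mentioned in Theorem 1.1.

Theorem 4.1. For each $\varepsilon \in (0, \varepsilon_0]$ let $\lambda_\varepsilon$ be defined by (4.20). Then
(i) the map $\lambda \mapsto (u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ is monotone in $(\lambda_\varepsilon, \infty)$;
(ii) the limits
\[ \lim_{\lambda \to \lambda_\varepsilon} u_\lambda^\varepsilon := u_*^\varepsilon; \qquad \lim_{\lambda \to \lambda_\varepsilon} v_\lambda^\varepsilon := v_*^\varepsilon \tag{4.21} \]
exist and $(u_*^\varepsilon, v_*^\varepsilon)$ is a solution of (4.9) with $\lambda = \lambda_*^\varepsilon$. In particular, $\lambda_*^\varepsilon > \frac{4\pi}{|\Omega|}\max\{N_1, N_2\}$ and $(u_*^\varepsilon, v_*^\varepsilon)$ is a strict sub-solution of (4.9) for all $\lambda > \lambda_*^\varepsilon$.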
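The proof of Theorem 4.1 is similar to the proof of Theorem 3.2.

Remark 4.1. Since $\lambda_\varepsilon > \frac{4\pi\max\{N_1, N_2\}}{|\Omega|}$ for all $\varepsilon \in (0, \varepsilon_0)$, the limit
\[ \lim_{\varepsilon \to 0} \lambda_\varepsilon := \bar\lambda \tag{4.22} \]
exists. From Lemma 4.1, the sequence of solutions $(u_{\lambda_\varepsilon}^\varepsilon, v_{\lambda_\varepsilon}^\varepsilon)$ converges to a solution $(u, v)$ of (3.15) with $\lambda = \bar\lambda$. Hence
\[ \bar\lambda \ge \lambda_*, \tag{4.23} \]
with $\lambda_*$ defined in Theorem 1.1. To prove $\bar\lambda = \lambda_*$, we have to show that for every $\lambda > \lambda_*$, the problem (4.9) can be solved for all small $\varepsilon > 0$. This will be achieved by studying the linearized system corresponding to (3.15) at the maximal solution $(u_\lambda, v_\lambda)$, namely,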
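\[ \begin{aligned} \Delta w - \lambda e^{u_0+u_\lambda}w + \lambda e^{v_0+v_\lambda}(1 - e^{u_0+u_\lambda})z &= 0, \\ \Delta z + \lambda e^{u_0+u_\lambda}(1 - e^{v_0+v_\lambda})w - \lambda e^{v_0+v_\lambda}z &= 0. \end{aligned} \tag{4.24} \]
Heuristically, the fact that the map $\lambda \mapsto (u_\lambda, v_\lambda)$ is monotone implies that it is differentiable almost everywhere. Thus Equation (3.15) can be differentiated with respect to $\lambda$, and at such $\lambda$ the linearized equation at $(u_\lambda, v_\lambda)$ will be proved to be non-singular. Conventionally, the non-singularity can be obtained by proving that the null space of the linearized equation is trivial. Since the linearized equation is a system of linear elliptic equations, it is not easy to show the triviality of the null space. Instead, in the following, we will show that the linear system is onto, and then the non-singularity will follow from the Fredholm alternative theorem.

Lemma 4.2. Let $(u_\lambda, v_\lambda)$ denote the maximal solution for (3.15) for $\lambda > \lambda_*$. Then, the map $\lambda \mapsto (u_\lambda, v_\lambda)$ is differentiable almost everywhere in $(\lambda_*, +\infty)$.

Proof. Let $\lambda \in (\lambda_*, \infty)$ and let $h$ be sufficiently small such that $\lambda + h \in (\lambda_*, \infty)$. The difference $u_{\lambda+h} - u_\lambda$ satisfies the equation
\[ \begin{aligned} \Delta(u_{\lambda+h} - u_\lambda) = {} & \lambda\left(e^{u_0+v_0+u_{\lambda+h}+v_{\lambda+h}} - e^{u_0+v_0+u_\lambda+v_\lambda}\right) - \lambda\left(e^{v_0+v_{\lambda+h}} - e^{v_0+v_\lambda}\right) \\ & + h e^{v_0+v_{\lambda+h}}(e^{u_0+u_{\lambda+h}} - 1). \end{aligned} \tag{4.25} \]
For each fixed $x \in \Omega$, the sequence $\{(u_\lambda(x), v_\lambda(x))\}_\lambda$ is monotone in $\lambda$ and hence so is the sequence $\{(u_\lambda^2(x), v_\lambda^2(x))\}_\lambda$. Therefore, the map $\lambda \mapsto (\int_\Omega u_\lambda^2(x)\,dx, \int_\Omega v_\lambda^2(x)\,dx)$ is also monotone, and hence there exists a subset $E_1 \subset (\lambda_*, \infty)$ with measure $|(\lambda_*, \infty) \setminus E_1| = 0$ and such that
\[ \lambda \mapsto \left(\int_\Omega u_\lambda(x)\,dx, \int_\Omega v_\lambda(x)\,dx\right) \text{ and } \lambda \mapsto \left(\int_\Omega u_\lambda^2\,dx, \int_\Omega v_\lambda^2\,dx\right) \text{ are differentiable for all } \lambda \in E_1. \tag{4.26} \]
Also, note that the maps
\[ \lambda \mapsto e^{u_0(x)+v_0(x)+u_\lambda(x)+v_\lambda(x)}, \qquad \lambda \mapsto e^{u_0(x)+u_\lambda(x)}, \qquad \lambda \mapsto e^{v_0(x)+v_\lambda(x)} \tag{4.27} \]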
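are all monotone in $\lambda$ for each $x \in \Omega$. Thus, there exists $E_2 \subset (\lambda_*, \infty)$ with $|(\lambda_*, \infty) \setminus E_2| = 0$ such that
\[ \begin{aligned} &\lambda \mapsto \int_\Omega e^{u_0+v_0+u_\lambda+v_\lambda}, \quad \lambda \mapsto \int_\Omega e^{u_0+u_\lambda}, \quad \lambda \mapsto \int_\Omega e^{v_0+v_\lambda}, \\ &\lambda \mapsto \int_\Omega e^{2u_0+2v_0+2u_\lambda+2v_\lambda}, \quad \lambda \mapsto \int_\Omega e^{2u_0+2u_\lambda}, \quad \lambda \mapsto \int_\Omega e^{2v_0+2v_\lambda} \\ &\text{are differentiable for all } \lambda \in E_2. \end{aligned} \tag{4.28} \]
From $L^2$ estimates and (4.25) we have
\[ \begin{aligned} \|u_{\lambda+h} - u_\lambda\|_{W^{2,2}(\Omega)} &\le \|u_{\lambda+h} - u_\lambda\|_{L^2(\Omega)} + \|\Delta(u_{\lambda+h} - u_\lambda)\|_{L^2(\Omega)} \\ &\le o(1)|h| + \lambda\|e^{u_0+v_0+u_{\lambda+h}+v_{\lambda+h}} - e^{u_0+v_0+u_\lambda+v_\lambda}\| + \lambda\|e^{v_0+v_{\lambda+h}} - e^{v_0+v_\lambda}\| \\ &\quad + |h|\,\|e^{v_0+v_{\lambda+h}}(e^{u_0+u_{\lambda+h}} - 1)\|, \end{aligned} \tag{4.29} \]
where $o(1) \to 0$ as $|h| \to 0$. From (4.28), for $\lambda \in E_2$, we have
\[ \begin{aligned} \|e^{u_0+v_0+u_{\lambda+h}+v_{\lambda+h}} - e^{u_0+v_0+u_\lambda+v_\lambda}\|_{L^2}^2 &= \int_\Omega \left(e^{u_0+v_0+u_{\lambda+h}+v_{\lambda+h}} - e^{u_0+v_0+u_\lambda+v_\lambda}\right)^2 dx \\ &= \int_\Omega \left(e^{2u_0+2v_0+2u_{\lambda+h}+2v_{\lambda+h}} - e^{2u_0+2v_0+2u_\lambda+2v_\lambda}\right) dx \\ &\quad + 2\int_\Omega \left(e^{2u_0+2v_0+2u_\lambda+2v_\lambda} - e^{u_0+v_0+u_{\lambda+h}+v_{\lambda+h}}e^{u_0+v_0+u_\lambda+v_\lambda}\right) dx \\ &\le o(1)|h| \end{aligned} \tag{4.30} \]
and similarly,
\[ \|e^{v_0+v_{\lambda+h}} - e^{v_0+v_\lambda}\|_{L^2}^2 = \int_\Omega \left(e^{2v_0+2v_{\lambda+h}} - e^{2v_0+2v_\lambda}\right) + 2\int_\Omega e^{v_0+v_\lambda}\left(e^{v_0+v_\lambda} - e^{v_0+v_{\lambda+h}}\right) \le o(1)|h|, \tag{4.31} \]
where $o(1) \to 0$ as $|h| \to 0$. Moreover, for $\lambda \in E_1$, from (4.26),
\[ \|u_{\lambda+h} - u_\lambda\|_{L^2}^2 = \int_\Omega (u_{\lambda+h}^2 - u_\lambda^2)\,dx + 2\int_\Omega u_\lambda(u_\lambda - u_{\lambda+h})\,dx \le o(1)|h| \tag{4.32} \]
with $o(1) \to 0$ as $|h| \to 0$. Note that we use the fact that the functions $u_\lambda$, $v_\lambda$, $e^{u_\lambda}$, $e^{v_\lambda}$ are bounded on $\Omega$. Hence, if we choose $\lambda \in E_1 \cap E_2$, then $|(\lambda_*, \infty) \setminus (E_1 \cap E_2)| = 0$ and from (4.29), (4.30), (4.31) and (4.32) we conclude that
\[ \|u_{\lambda+h}(x) - u_\lambda(x)\|_{W^{2,2}(\Omega)} < o(1)|h|, \tag{4.33} \]
$o(1) \to 0$ as $|h| \to 0$. In particular, from the Sobolev embedding theorem,
\[ \|u_{\lambda+h} - u_\lambda\|_{C^0(\Omega)} < o(1)|h|, \tag{4.34} \]
$o(1) \to 0$ as $|h| \to 0$, i.e., $\lambda \mapsto u_\lambda \in C^0(\Omega)$ is differentiable for all $\lambda \in E_1 \cap E_2$. Similarly, $\lambda \mapsto v_\lambda$ defined from $E_1 \cap E_2 \to C^0(\Omega)$ is differentiable.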
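The splittings used in (4.30), (4.31) and (4.32) all rest on one elementary identity. The sketch below states it for generic functions $p, q \in L^2(\Omega)$; these two names are placeholders introduced here, not notation from the paper.

```latex
\begin{align*}
(p - q)^2 &= p^2 - 2pq + q^2 = (p^2 - q^2) + 2q\,(q - p), \\
\|p - q\|_{L^2(\Omega)}^2
  &= \int_\Omega (p^2 - q^2)\,dx + 2\int_\Omega q\,(q - p)\,dx .
\end{align*}
```

Taking $p = u_{\lambda+h}$, $q = u_\lambda$ reproduces the right-hand side of (4.32), and taking $p$, $q$ to be the corresponding exponentials reproduces (4.30) and (4.31). The first integral is a difference of values of a monotone function of $\lambda$, which is where differentiability on $E_1$ or $E_2$ enters.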
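Proposition 4.2. For any fixed $\lambda_1 > \lambda_*$ and $(u_{\lambda_1}, v_{\lambda_1}) := (u_1, v_1)$ the maximal solution for (3.15), the linearized operator $L : W^{2,2}(\Omega) \times W^{2,2}(\Omega) \to L^2(\Omega) \times L^2(\Omega)$ defined by
\[ L := \begin{pmatrix} \Delta - \lambda e^{u_0+u_\lambda} & \lambda e^{v_0+v_1}(1 - e^{u_0+u_1}) \\ \lambda e^{u_0+u_1}(1 - e^{v_0+v_1}) & \Delta - \lambda e^{v_0+v_\lambda} \end{pmatrix} \tag{4.35} \]
is invertible.

Proof. The proof is complete once we show that the map $L$ is onto.

Step (i). Let $\varphi, \psi$ be non-negative, continuous functions defined on $\Omega$ with the property that
\[ \varphi \equiv 0 \text{ in a neighbourhood of } S_1, \qquad \psi \equiv 0 \text{ in a neighbourhood of } S_2. \tag{4.36} \]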
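We claim that $(\varphi, \psi)$ lies in the image of $L$. For $\lambda > \lambda_1$, consider the equation
\[ \begin{aligned} \Delta u + \lambda e^{v_0+v}(1 - e^{u_0+u}) &= \frac{4\pi N_1}{|\Omega|} + (\lambda - \lambda_1)\varphi, \\ \Delta v + \lambda e^{u_0+u}(1 - e^{v_0+v}) &= \frac{4\pi N_2}{|\Omega|} + (\lambda - \lambda_1)\psi. \end{aligned} \tag{4.37} \]
Since $(\lambda - \lambda_1)\varphi \ge 0$ and $(\lambda - \lambda_1)\psi \ge 0$ and $(u_\lambda, v_\lambda)$ satisfies (3.15), it follows that $(u_\lambda, v_\lambda)$ is a super-solution for (4.37) for any $\lambda \ge \lambda_1$. Next, we show that (4.37) has a sub-solution for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$, with $\delta > 0$ small, to be chosen later. Set $\sigma = \lambda_1 - c(\lambda - \lambda_1)$, where $c$ is a positive constant to be chosen. If $(u_\sigma, v_\sigma)$ denotes the maximal solution of (4.9) corresponding to $\sigma = \lambda_1 - c(\lambda - \lambda_1)$, then
\[ \begin{aligned} \Delta u_\sigma + \lambda e^{v_0+v_\sigma}(1 - e^{u_0+u_\sigma}) &= \frac{4\pi N_1}{|\Omega|} + (c + 1)(\lambda - \lambda_1)e^{v_0+v_\sigma}(1 - e^{u_0+u_\sigma}), \\ \Delta v_\sigma + \lambda e^{u_0+u_\sigma}(1 - e^{v_0+v_\sigma}) &= \frac{4\pi N_2}{|\Omega|} + (c + 1)(\lambda - \lambda_1)e^{u_0+u_\sigma}(1 - e^{v_0+v_\sigma}). \end{aligned} \tag{4.38} \]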
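The factor $(c + 1)(\lambda - \lambda_1)$ in (4.38) comes from a short rearrangement. The sketch below assumes that $(u_\sigma, v_\sigma)$ solves the system in the form (3.15) with parameter $\sigma$, and treats only the first equation.

```latex
\begin{align*}
\Delta u_\sigma
  &= \sigma e^{v_0+v_\sigma}\bigl(e^{u_0+u_\sigma} - 1\bigr) + \frac{4\pi N_1}{|\Omega|}, \\
\Delta u_\sigma + \lambda e^{v_0+v_\sigma}\bigl(1 - e^{u_0+u_\sigma}\bigr)
  &= \frac{4\pi N_1}{|\Omega|}
     + (\lambda - \sigma)\, e^{v_0+v_\sigma}\bigl(1 - e^{u_0+u_\sigma}\bigr), \\
\lambda - \sigma
  &= (\lambda - \lambda_1) + c(\lambda - \lambda_1) = (c + 1)(\lambda - \lambda_1).
\end{align*}
```

The equation for $v_\sigma$ is obtained in the same way with $N_2$ and the roles of the two exponentials exchanged.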
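Since $\varphi$ and $\psi$ both vanish near the singular points, we can choose $c$ large enough so that $\varphi \le (c + 1)e^{v_0+v_\sigma}(1 - e^{u_0+u_\sigma})$ and $\psi \le (c + 1)e^{u_0+u_\sigma}(1 - e^{v_0+v_\sigma})$. Now, choose $\delta > 0$ sufficiently small such that $\lambda_1 - c(\lambda - \lambda_1) > \lambda_*$ for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$, so that the maximal solution $(u_\sigma, v_\sigma)$ with $\sigma = \lambda_1 - c(\lambda - \lambda_1)$ exists. Hence $(u_\sigma, v_\sigma)$ is a sub-solution of (4.37) for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$. By Proposition 3.1, (4.37) has a monotone family of maximal solutions $(\tilde u_\lambda, \tilde v_\lambda)$ such that
\[ u_\sigma \le \tilde u_\lambda \le u_\lambda, \qquad v_\sigma \le \tilde v_\lambda \le v_\lambda \quad \text{for all } \lambda \in (\lambda_1, \lambda_1 + \delta), \tag{4.39} \]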
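where $\tilde u_{\lambda_1} = u_1$, $\tilde v_{\lambda_1} = v_1$. In particular, for $\lambda \in (\lambda_1, \lambda_1 + \delta)$,
\[ \frac{u_\sigma - u_1}{\lambda - \lambda_1} \le \frac{\tilde u_\lambda - u_1}{\lambda - \lambda_1} \le \frac{u_\lambda - u_1}{\lambda - \lambda_1}, \tag{4.40} \]
and the limits
\[ \lim_{\lambda \to \lambda_1} \frac{\tilde u_\lambda - u_1}{\lambda - \lambda_1}, \qquad \lim_{\lambda \to \lambda_1} \frac{u_\sigma - u_1}{\lambda - \lambda_1} \tag{4.41} \]
exist. Thus the derivative $\frac{\partial \tilde u_\lambda}{\partial\lambda}\big|_{\lambda=\lambda_1}$ exists. Similarly, $\frac{\partial \tilde v_\lambda}{\partial\lambda}\big|_{\lambda=\lambda_1}$ exists. Clearly, if $A := \left(\frac{\partial \tilde u_\lambda}{\partial\lambda}\big|_{\lambda=\lambda_1} - \frac{\partial u_\lambda}{\partial\lambda}\big|_{\lambda=\lambda_1}\right)$, $B := \left(\frac{\partial \tilde v_\lambda}{\partial\lambda}\big|_{\lambda=\lambda_1} - \frac{\partial v_\lambda}{\partial\lambda}\big|_{\lambda=\lambda_1}\right)$, then $L(A, B) = (\varphi, \psi)$. Hence $(\varphi, \psi)$ is in the image of the operator $L$.

Step (ii). It is easy to see that the linear subspace spanned by the set of functions $(\varphi, \psi)$ in Step (i) satisfying the property (4.36) is dense in $L^2(\Omega) \times L^2(\Omega)$. Since the image of $L$ is closed in $L^2(\Omega) \times L^2(\Omega)$, $L$ is onto. By the Fredholm alternative theorem, $L$ is invertible.

We can now prove

Theorem 4.2. $\lim_{\varepsilon \to 0} \lambda_\varepsilon = \bar\lambda = \lambda_*$.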
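Proof. For simplicity of notation, let
\[ f_\varepsilon(u, v) := e^{v_0^\varepsilon+v}(e^{u_0^\varepsilon+u} - 1), \qquad g_\varepsilon(u, v) := e^{u_0^\varepsilon+u}(e^{v_0^\varepsilon+v} - 1) \tag{4.42} \]
with the convention that $f_0(u, v) := e^{v_0+v}(e^{u_0+u} - 1)$, $g_0(u, v) := e^{u_0+u}(e^{v_0+v} - 1)$, so that system (4.4) can be rewritten as
\[ \begin{aligned} \Delta u &= \lambda f_\varepsilon(u, v) + \frac{4\pi N_1}{|\Omega|}, \\ \Delta v &= \lambda g_\varepsilon(u, v) + \frac{4\pi N_2}{|\Omega|} \end{aligned} \tag{4.43} \]
for $\varepsilon > 0$, and for $\varepsilon = 0$ we get (3.15). Consider the map $\Phi : [0, \varepsilon_0) \times \mathbb{R} \times W^{2,2}(\Omega) \times W^{2,2}(\Omega) \to \mathbb{R} \times L^2(\Omega) \times L^2(\Omega)$ defined by
\[ \Phi(\varepsilon, \lambda, u, v) = \Phi_\varepsilon(\lambda, u, v) := (\lambda, \Delta u - \lambda f_\varepsilon(u, v), \Delta v - \lambda g_\varepsilon(u, v)). \tag{4.44} \]
The map $\Phi_\varepsilon$ is $C^1$ in the $\lambda, u, v$ variables and its derivative at a point $(\lambda, u, v)$ is given by
\[ D\Phi_\varepsilon(\lambda, u, v) = \begin{pmatrix} 1 & 0 & 0 \\ -f_\varepsilon(u, v) & \Delta - \lambda\partial_1 f_\varepsilon(u, v) & -\lambda\partial_2 f_\varepsilon(u, v) \\ -g_\varepsilon(u, v) & -\lambda\partial_1 g_\varepsilon(u, v) & \Delta - \lambda\partial_2 g_\varepsilon(u, v) \end{pmatrix}, \tag{4.45} \]
where $\partial_i$ denotes the partial derivative with respect to the $i$th variable, $i = 1, 2$. From Remark 4.1, $\bar\lambda \ge \lambda_*$. Suppose $\bar\lambda > \lambda_*$. From Lemma 4.2 and Proposition 4.2, there exists $\lambda_1 \in (\lambda_*, \bar\lambda)$ such that $(\partial_\lambda u_\lambda, \partial_\lambda v_\lambda)|_{\lambda=\lambda_1}$ exists and the operator $L$, and hence $D\Phi_0$, is invertible at $(u_{\lambda_1}, v_{\lambda_1})$. By the implicit function theorem there exists $\varepsilon_0 > 0$ small such that (4.44) has a solution for all $\varepsilon \in (0, \varepsilon_0)$, and hence a maximal solution for all $\varepsilon \in (0, \varepsilon_0)$. Thus, $\lambda_1 \ge \lambda_\varepsilon$ for all $\varepsilon \in (0, \varepsilon_0)$ and hence $\lambda_1 \ge \bar\lambda$, a contradiction to $\lambda_1 < \bar\lambda$. Therefore $\bar\lambda = \lambda_*$.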
5. Variational Method

In general, it is difficult to find a local minimizer for a system of equations. Fortunately, the variational method developed in [12] is helpful here. Since Eqs. (3.15) are symmetric in $u$ and $v$, without loss of generality we may assume that $N_2 \ge N_1$. To write the variational functional corresponding to (3.15), we first add the two equations therein to get
\[ \Delta(u + v) = 2\lambda e^{u_0+v_0+u+v} - \lambda e^{u_0+u} - \lambda e^{v_0+v} + \frac{4\pi(N_1 + N_2)}{|\Omega|}, \tag{5.1} \]
and subtracting the second equation in (3.15) from the first one we get
\[ \Delta(u - v) = \lambda e^{u_0+u} - \lambda e^{v_0+v} + \frac{4\pi(N_1 - N_2)}{|\Omega|}. \tag{5.2} \]
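Writing $F := u + v$ and $G := u - v$, $(F, G)$ satisfies
\[ \Delta F = 2\lambda e^{u_0+v_0+F} - \lambda e^{u_0+\frac{F+G}{2}} - \lambda e^{v_0+\frac{F-G}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}, \tag{5.3} \]
\[ \Delta G = \lambda e^{u_0+\frac{F+G}{2}} - \lambda e^{v_0+\frac{F-G}{2}} + \frac{4\pi(N_1 - N_2)}{|\Omega|}. \tag{5.4} \]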
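The algebra behind the sum and difference of the two equations in (3.15) can be checked directly. In the sketch below, $a := e^{u_0+u}$ and $b := e^{v_0+v}$ are shorthand introduced only for this computation.

```latex
\begin{align*}
b(a - 1) + a(b - 1) &= 2ab - a - b
  = 2e^{u_0+v_0+u+v} - e^{u_0+u} - e^{v_0+v}, \\
b(a - 1) - a(b - 1) &= a - b
  = e^{u_0+u} - e^{v_0+v}.
\end{align*}
```

Multiplying by $\lambda$ and adding or subtracting the constants $4\pi N_1/|\Omega|$ and $4\pi N_2/|\Omega|$ gives (5.1) and (5.2); substituting $u = \frac{F+G}{2}$, $v = \frac{F-G}{2}$ then gives (5.3) and (5.4).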
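Thus $(F, G)$ is a solution of (5.3)–(5.4) if and only if $(u = \frac{F+G}{2}, v = \frac{F-G}{2})$ is a solution of (3.15). It can be verified that Eqs. (5.3)–(5.4) correspond to the Euler-Lagrange equations for the functional
\[ \begin{aligned} I_\lambda(F, G) := {} & \frac{1}{2}\|\nabla F\|^2 - \frac{1}{2}\|\nabla G\|^2 + 2\lambda\int_\Omega \left(e^{u_0+v_0+F} - e^{u_0+\frac{F+G}{2}} - e^{v_0+\frac{F-G}{2}}\right) dx \\ & + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F\,dx - \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G\,dx \end{aligned} \tag{5.5} \]
\[ = E(F) - J(F, G), \tag{5.6} \]
where
\[ E(F) := \frac{1}{2}\|\nabla F\|^2 + 2\lambda\int_\Omega e^{u_0+v_0+F}\,dx + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F\,dx \tag{5.7} \]
and
\[ J(F, G) := \frac{1}{2}\|\nabla G\|^2 + 2\lambda\int_\Omega \left(e^{u_0+\frac{F+G}{2}} + e^{v_0+\frac{F-G}{2}}\right) dx + \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G\,dx. \tag{5.8} \]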
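To see why (5.3)–(5.4) are the Euler-Lagrange equations of $I_\lambda$, one can compute the two partial first variations. The following is a formal sketch: the test functions $\varphi, \psi \in H^1(\Omega)$ are names introduced here, and differentiation under the integral sign is assumed without justification.

```latex
\begin{align*}
D_1 I_\lambda(F,G)(\varphi)
  &= \int_\Omega \nabla F\cdot\nabla\varphi
   + \int_\Omega \Bigl(2\lambda e^{u_0+v_0+F}
     - \lambda e^{u_0+\frac{F+G}{2}} - \lambda e^{v_0+\frac{F-G}{2}}
     + \frac{4\pi(N_1+N_2)}{|\Omega|}\Bigr)\varphi, \\
D_2 I_\lambda(F,G)(\psi)
  &= -\int_\Omega \nabla G\cdot\nabla\psi
   - \int_\Omega \Bigl(\lambda e^{u_0+\frac{F+G}{2}}
     - \lambda e^{v_0+\frac{F-G}{2}}
     + \frac{4\pi(N_1-N_2)}{|\Omega|}\Bigr)\psi .
\end{align*}
```

Setting both variations to zero and integrating the gradient terms by parts on the torus recovers (5.3) and (5.4), respectively. The opposite signs of the two gradient terms reflect that $I_\lambda$ is indefinite, which is why $G$ is eliminated first through the functional $J$.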
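As in Sect. 4, we use the approximate problem (4.9) and define the functional $I^\varepsilon$ as
\[ \begin{aligned} I_\lambda^\varepsilon(F, G) := {} & \frac{1}{2}\|\nabla F\|^2 - \frac{1}{2}\|\nabla G\|^2 + 2\lambda\int_\Omega \left(e^{u_0^\varepsilon+v_0^\varepsilon+F} - e^{u_0^\varepsilon+\frac{F+G}{2}} - e^{v_0^\varepsilon+\frac{F-G}{2}}\right) dx \\ & + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F\,dx - \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G\,dx \end{aligned} \tag{5.9} \]
\[ = E^\varepsilon(F) - J^\varepsilon(F, G), \tag{5.10} \]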
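where
\[ E^\varepsilon(F) := \frac{1}{2}\|\nabla F\|^2 + 2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F}\,dx + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F\,dx \tag{5.11} \]
and
\[ J^\varepsilon(F, G) := \frac{1}{2}\|\nabla G\|^2 + 2\lambda\int_\Omega \left(e^{u_0^\varepsilon+\frac{F+G}{2}} + e^{v_0^\varepsilon+\frac{F-G}{2}}\right) dx + \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G\,dx. \tag{5.12} \]
The Euler-Lagrange equations for $I^\varepsilon$ are
\[ \Delta F = 2\lambda e^{u_0^\varepsilon+v_0^\varepsilon+F} - \lambda e^{u_0^\varepsilon+\frac{F+G}{2}} - \lambda e^{v_0^\varepsilon+\frac{F-G}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}, \tag{5.13} \]
\[ \Delta G = \lambda e^{u_0^\varepsilon+\frac{F+G}{2}} - \lambda e^{v_0^\varepsilon+\frac{F-G}{2}} + \frac{4\pi(N_1 - N_2)}{|\Omega|}, \tag{5.14} \]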
and $(F, G)$ are solutions of (5.13)–(5.14) if and only if $u = \frac{F+G}{2}$, $v = \frac{F-G}{2}$ are solutions of (4.9). The functional $I^\varepsilon$ is also indefinite, and on differentiation we have
\[ D_1 I_\lambda^\varepsilon(F, G) = D_1 E^\varepsilon(F) - D_1 J^\varepsilon(F, G); \tag{5.15} \]
\[ D_2 I_\lambda^\varepsilon(F, G) = -D_2 J^\varepsilon(F, G). \tag{5.16} \]
Thus, to find the critical points of $I^\varepsilon$, we first look for the critical points of $J^\varepsilon(F, G)$ as a function of $G$ for a fixed $F \in H^1(\Omega)$. Our first observation is that for a fixed $F \in H^1(\Omega)$, (5.4) and (5.14) can have at most one solution: for if $G_1$, $G_2$ are two solutions of Eq. (5.14) and $x_0 \in \Omega$ is such that $\max_\Omega (G_1 - G_2) = (G_1 - G_2)(x_0) > 0$, then
\[ 0 \ge \Delta(G_1 - G_2)(x_0) = \lambda e^{u_0^\varepsilon+\frac{F}{2}}\left(e^{G_1/2} - e^{G_2/2}\right) - \lambda e^{v_0^\varepsilon+\frac{F}{2}}\left(e^{-G_1/2} - e^{-G_2/2}\right) > 0 \tag{5.17} \]
gives a contradiction. Therefore $G_1(x) \le G_2(x)$ in $\Omega$. Interchanging $G_1$ and $G_2$, it follows that $G_1 \equiv G_2$. A similar argument works for (5.4). Hence (5.14) and (5.4) have a unique solution, if it exists. Eqs. (5.4) and (5.14) are Euler-Lagrange equations for the functionals $J(F, G)$ and $J^\varepsilon(F, G)$. We define
\[ J_F(G) := J(F, G), \tag{5.18} \]
\[ J_F^\varepsilon(G) := J^\varepsilon(F, G) \tag{5.19} \]
to emphasize the fact that we are considering the functionals for a fixed $F \in H^1(\Omega)$. We prove

Lemma 5.1. For each $\varepsilon \in [0, \varepsilon_0)$ and fixed $F \in H^1(\Omega)$, the functional $J_F^\varepsilon$ has a unique minimizer in $H^1(\Omega)$. Here we use the convention that $J^0 := J$ defined in (5.8).
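Proof. From Jensen's inequality, we have
\[ \int_\Omega e^{u_0^\varepsilon+\frac{F+G}{2}} \ge e^{\int_\Omega \frac{F+G}{2}} \quad \text{and} \quad \int_\Omega e^{v_0^\varepsilon+\frac{F-G}{2}} \ge e^{\int_\Omega \frac{F-G}{2}}. \tag{5.20} \]
Therefore,
\[ \begin{aligned} J_F^\varepsilon(G) &= \frac{1}{2}\|\nabla G\|^2 + \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G + 2\lambda\int_\Omega \left(e^{u_0^\varepsilon+\frac{F+G}{2}} + e^{v_0^\varepsilon+\frac{F-G}{2}}\right) dx \\ &> \frac{1}{2}\|\nabla G\|^2 + \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G + 2\lambda e^{\int_\Omega \frac{F+G}{2}\,dx} + 2\lambda e^{\int_\Omega \frac{F-G}{2}\,dx}. \end{aligned} \tag{5.21} \]
Since $N_2 \ge N_1$, we note that if $\int_\Omega G \to +\infty$ then $e^{\int_\Omega \frac{F+G}{2}\,dx} \to +\infty$ and will be the dominating term. It follows that
\[ \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G + 2\lambda e^{\int_\Omega \frac{F+G}{2}\,dx} + 2\lambda e^{\int_\Omega \frac{F-G}{2}\,dx} \ge C \]
and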
$J_F^\varepsilon$ is bounded from below. The existence of the global minimizer of $J_F^\varepsilon$ in $H^1(\Omega)$ then follows from the coerciveness of $\|\nabla G\|$.

In view of Lemma 5.1, for every $F \in H^1(\Omega)$ there exists a unique $G(F) \in H^1(\Omega)$ such that $G(F)$ is a point of minimum of $J_F^\varepsilon$. Substituting $G = G(F)$ in (5.9), we immediately see that $I_\lambda^\varepsilon(F, G(F))$ satisfies the condition (5.16) for any $F \in H^1(\Omega)$, and our problem is now reduced to finding $F \in H^1(\Omega)$ such that (5.15) is satisfied. Thus, we define the functionals
\[ I_\lambda(F) := I_\lambda(F, G(F)) = E(F) - J_F(G(F)) \tag{5.22} \]
and
\[ I_\lambda^\varepsilon(F) := I_\lambda^\varepsilon(F, G(F)) = E^\varepsilon(F) - J_F^\varepsilon(G(F)) \tag{5.23} \]
to emphasize the fact that they depend only on the function $F$, and our next goal is to find a minimizer for the functional $I_\lambda$. Due to the presence of singularities of $u_0$ and $v_0$, it is difficult to directly show that $I_\lambda$ is bounded from below. Thus, we first prove that $I_\lambda^\varepsilon$ is bounded below for any $\varepsilon > 0$, and then by passing to the limit $\varepsilon \to 0$, the lower bound for $I_\lambda$ can be obtained. The main result of this section can be summarized as

Theorem 5.1. For $\varepsilon \in (0, \varepsilon_0]$, let $F_*^\varepsilon := u_*^\varepsilon + v_*^\varepsilon$, where $(u_*^\varepsilon, v_*^\varepsilon)$ are solutions of the approximate system (4.9) defined by (4.21). Then, for every $\varepsilon \in (0, \varepsilon_0]$ there exists a critical point $F^\varepsilon > F_*^\varepsilon$ of the functional $I_\lambda^\varepsilon$ which is a local minimum in $H^1(\Omega)$. The limit $\lim_{\varepsilon \to 0} F^\varepsilon = \hat F$ exists and $\hat F$ is a local minimum for the functional $I_\lambda$ in $H^1(\Omega)$. Moreover, $\hat F > F_* = u_* + v_*$, where $(u_*, v_*)$ are solutions of (3.15) defined by (1.5).

The proof of Theorem 5.1 is completed in the following steps, each of which will be proved in subsequent lemmas:
(i) find a constrained minimizer for the functional $I_\lambda^\varepsilon$;
(ii) prove that this constrained minimizer is a critical point for $I_\lambda^\varepsilon$ in $H^1(\Omega)$;
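(iii) taking the limit as $\varepsilon \to 0$, obtain a constrained minimizer which is a critical point for $I_\lambda$;
(iv) the critical point is a local minimum of $I_\lambda^\varepsilon$ in $H^1(\Omega)$ for all $\varepsilon \in [0, \varepsilon_0]$, $\varepsilon_0$ small.

From (iii) of Proposition 4.1, and Theorem 1.1 in case $\varepsilon = 0$, the maximum principle implies that $F_*^\varepsilon$ is a strict sub-solution for Eq. (5.13) with $G = G_*^\varepsilon = u_*^\varepsilon - v_*^\varepsilon$ for any $\lambda > \lambda_*^\varepsilon$, i.e.,
\[ \Delta F_*^\varepsilon > 2\lambda e^{u_0^\varepsilon+v_0^\varepsilon+F_*^\varepsilon} - \lambda e^{u_0^\varepsilon+\frac{F_*^\varepsilon+G_*^\varepsilon}{2}} - \lambda e^{v_0^\varepsilon+\frac{F_*^\varepsilon-G_*^\varepsilon}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}. \tag{5.24} \]
Thus, restricting the functional $I_\lambda$ to the closed set
\[ \mathcal{F}^\varepsilon := \{F \in H^1(\Omega) : F \ge F_*^\varepsilon = u_*^\varepsilon + v_*^\varepsilon\}, \tag{5.25} \]
we prove

Lemma 5.2. $I_\lambda^\varepsilon$ attains its minimum in $\mathcal{F}^\varepsilon$.

Proof. Since $G(F)$ minimizes $J_F^\varepsilon$ in $H^1(\Omega)$, we have $J_F^\varepsilon(G(F)) \le J_F^\varepsilon(0)$. Hence,
\[ \begin{aligned} I_\lambda^\varepsilon(F) &\ge E^\varepsilon(F) - J^\varepsilon(F, 0) \\ &= \frac{1}{2}\|\nabla F\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F + 2\lambda\int_\Omega \left(e^{u_0^\varepsilon+v_0^\varepsilon+F} - e^{u_0^\varepsilon+\frac{F}{2}} - e^{v_0^\varepsilon+\frac{F}{2}}\right). \end{aligned} \tag{5.26} \]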
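By the Cauchy-Schwarz inequality,
\[ \int_\Omega \left(e^{u_0^\varepsilon+\frac{F}{2}} + e^{v_0^\varepsilon+\frac{F}{2}}\right) \le \left(\|e^{u_0^\varepsilon}\|_\infty + \|e^{v_0^\varepsilon}\|_\infty\right)\left(\int_\Omega e^F\right)^{1/2}. \]
It follows that
\[ \int_\Omega \left(e^{u_0^\varepsilon+v_0^\varepsilon+F} - e^{u_0^\varepsilon+\frac{F}{2}} - e^{v_0^\varepsilon+\frac{F}{2}}\right) \ge C_1(\varepsilon)\int_\Omega e^F - C_2(\varepsilon)\left(\int_\Omega e^F\right)^{1/2}, \tag{5.27} \]
which is bounded below as $F$ varies in $\mathcal{F}^\varepsilon$. Here $C_i(\varepsilon)$, $i = 1, 2$, is a constant independent of $F$. Therefore, for $F \in \mathcal{F}^\varepsilon$,
\[ I_\lambda^\varepsilon(F) \ge \frac{1}{2}\|\nabla F\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_*^\varepsilon\,dx + C_1(\varepsilon)\lambda\int_\Omega e^F - C_2(\varepsilon)\lambda\left(\int_\Omega e^F\right)^{1/2}, \tag{5.28} \]
in particular, $I_\lambda^\varepsilon$ is coercive in $\mathcal{F}^\varepsilon$. Moreover, $I_\lambda^\varepsilon$ is lower semi-continuous on $\mathcal{F}^\varepsilon$ since both $E^\varepsilon$ and $J^\varepsilon$ are lower semi-continuous. It follows that $I_\lambda^\varepsilon$ is bounded from below and attains its minimum at a point, say $F^\varepsilon$, in the set $\mathcal{F}^\varepsilon$.

Let $F^\varepsilon \in \mathcal{F}^\varepsilon$ be such that
\[ \min_{\mathcal{F}^\varepsilon} I_\lambda^\varepsilon = I_\lambda^\varepsilon(F^\varepsilon) \tag{5.29} \]
and let $G^\varepsilon$ denote the corresponding solution of (5.14). Then,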
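Lemma 5.3. $F^\varepsilon$ is a critical point of the functional $I_\lambda^\varepsilon$ in $H^1(\Omega)$.

Proof. Note that $\mathcal{F}^\varepsilon$ is a closed subset of $H^1(\Omega)$. To prove that $F^\varepsilon$ is a critical point of $I_\lambda^\varepsilon$ in $H^1(\Omega)$, it suffices to show that $F^\varepsilon$ lies in the interior of the set $\mathcal{F}^\varepsilon$, i.e., $F^\varepsilon > F_*^\varepsilon$. Let $\varphi \in H^1(\Omega)$ be such that $\varphi \ge 0$. Then $F^\varepsilon + t\varphi \in \mathcal{F}^\varepsilon$ for $t > 0$ and $DI_\lambda^\varepsilon(F^\varepsilon)(\varphi) = \lim_{t \to 0^+} \frac{1}{t}\int_0^t \frac{\partial}{\partial s} I_\lambda^\varepsilon(F^\varepsilon + s\varphi)\,ds \ge 0$, i.e.,
\[ \int_\Omega \nabla F^\varepsilon \nabla\varphi + \int_\Omega \left(2\lambda e^{u_0^\varepsilon+v_0^\varepsilon+F^\varepsilon} - \lambda e^{u_0^\varepsilon+\frac{F^\varepsilon+G^\varepsilon}{2}} - \lambda e^{v_0^\varepsilon+\frac{F^\varepsilon-G^\varepsilon}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}\right)\varphi \ge 0, \tag{5.30} \]
where $G^\varepsilon$ satisfies (5.14). Thus, $F^\varepsilon$ is a super-solution of (5.13),
\[ \Delta F^\varepsilon \le 2\lambda e^{u_0^\varepsilon+v_0^\varepsilon+F^\varepsilon} - \lambda e^{u_0^\varepsilon+\frac{F^\varepsilon+G^\varepsilon}{2}} - \lambda e^{v_0^\varepsilon+\frac{F^\varepsilon-G^\varepsilon}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}. \tag{5.31} \]
If we define $w := \frac{F^\varepsilon+G^\varepsilon}{2}$, $z := \frac{F^\varepsilon-G^\varepsilon}{2}$, then it can be easily verified from (5.31) and (5.14) that $(w, z)$ is a super-solution of (4.9), i.e.,
\[ \Delta w \le \lambda e^{v_0^\varepsilon+z}(e^{u_0^\varepsilon+w} - 1) + \frac{4\pi N_1}{|\Omega|}, \tag{5.32} \]
\[ \Delta z \le \lambda e^{u_0^\varepsilon+w}(e^{v_0^\varepsilon+z} - 1) + \frac{4\pi N_2}{|\Omega|}. \tag{5.33} \]
We claim that
\[ u_*^\varepsilon \le w, \qquad v_*^\varepsilon \le z \quad \text{a.e. } x \in \Omega. \tag{5.34} \]
Consider $\varphi = \min\{w - u_*^\varepsilon, 0\}$. For $x_0 \in \Omega$ such that $(w - u_*^\varepsilon)(x_0) < 0$, we must have $(z - v_*^\varepsilon)(x_0) > 0$ since $F^\varepsilon - F_*^\varepsilon \ge 0$. Therefore, $\lambda e^{v_0^\varepsilon+z}(e^{u_0^\varepsilon+w} - 1) < \lambda_* e^{v_0^\varepsilon+v_*^\varepsilon}(e^{u_0^\varepsilon+w} - 1) \le \lambda_* e^{v_0^\varepsilon+v_*^\varepsilon(x_0)}(e^{u_0^\varepsilon+u_*^\varepsilon(x_0)} - 1)$, and using (5.32) together with the equation satisfied by $u_*^\varepsilon$ we see that
\[ \Delta(w - u_*^\varepsilon)(x_0) \le 0. \tag{5.35} \]
It implies that $\varphi$ is a super-solution, and $-\int_\Omega \varphi\Delta\varphi = \int_\Omega |\nabla\varphi|^2 \le 0$, and hence $\varphi \equiv 0$ a.e.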
in $\Omega$. Hence $u_*^\varepsilon \le w$ a.e. in $\Omega$. A similar argument will show that $v_*^\varepsilon \le z$ a.e. in $\Omega$. We now apply the iteration scheme (3.2) beginning with the super-solution $(w, z)$ to get a monotone sequence $\{(u_n, v_n)\}$. Due to the claim (5.34), we note that
\[ u_*^\varepsilon \le u_n, \qquad v_*^\varepsilon \le v_n \quad \text{for all } n. \tag{5.36} \]
Hence, the sequence $(u_n, v_n)$ converges to a solution $(\hat u, \hat v)$ of the system (4.9). Clearly, $(\hat u, \hat v)$ are smooth functions and since $\lambda > \lambda_*$, there exists $\delta > 0$ such that
\[ u_*^\varepsilon + \delta < \hat u \le w, \qquad v_*^\varepsilon + \delta < \hat v \le z. \tag{5.37} \]
Therefore,
\[ F^\varepsilon = w + z \ge \hat u + \hat v > u_*^\varepsilon + v_*^\varepsilon + 2\delta > F_*^\varepsilon, \tag{5.38} \]
and the lemma is proved.
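From Lemma 4.1, it follows that $F^\varepsilon \to \hat F$ in $C^{1,\alpha}$, $0 < \alpha < 1$, as $\varepsilon \to 0$, where $\hat F$ is a critical point of $I_\lambda$ in $H^1(\Omega)$. Clearly,
\[ \hat F = \lim_{\varepsilon \to 0} F^\varepsilon \ge \lim_{\varepsilon \to 0} F_*^\varepsilon = F_* = u_* + v_*, \tag{5.39} \]
and repeating the proof of Lemma 5.3, we can further conclude $\hat F > F_*$. Let $\mathcal{F} = \{F \in H^1(\Omega) : F \ge F_*\}$.

Lemma 5.4. $\hat F$ is a local minimizer of $I_\lambda$ in $H^1(\Omega)$.

Proof. Suppose that $\hat F$ is not a local minimizer for the functional $I_\lambda$ in $H^1(\Omega)$. Thus, for every $n \in \mathbb{N}$,
\[ \inf_{\|F - \hat F\|_{H^1} \le \frac{1}{n}} I_\lambda(F) < I_\lambda(\hat F). \]
Let $F_n \in H^1(\Omega)$ be such that $\|F_n - \hat F\|_{H^1} \le \frac{1}{n}$ and
\[ I_\lambda(F_n) := \min_{\|F - \hat F\|_{H^1} \le \frac{1}{n}} I_\lambda(F). \]
Using the principle of Lagrange multipliers, there exists a constant $\mu_n \le 0$ such that
\[ DI_\lambda(F_n)(\varphi) = \mu_n\left(\int_\Omega \nabla(F_n - \hat F)\nabla\varphi + \int_\Omega (F_n - \hat F)\varphi\right) \quad \text{for all } \varphi \in H^1(\Omega), \tag{5.40} \]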
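where $DI_\lambda$ denotes the differential of $I_\lambda$. Let $G_n$ denote the unique minimizer of the functional $J_{F_n}$. Then,
\[ \begin{aligned} DI_\lambda(F_n) &= DE(F_n) - D_1 J(F_n, G_n) - D_2 J(F_n, G(F_n))\left(\frac{\partial G}{\partial F}(F_n)\right) \\ &= DE(F_n) - D_1 J(F_n, G_n), \end{aligned} \tag{5.41} \]
since $D_2 J(F_n, G(F_n)) = 0$. Therefore,
\[ \begin{aligned} &\int_\Omega \nabla F_n \nabla\varphi + \lambda\int_\Omega \left(2e^{u_0+v_0+F_n} - e^{u_0+\frac{F_n+G_n}{2}} - e^{v_0+\frac{F_n-G_n}{2}}\right)\varphi\,dx + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega \varphi\,dx \\ &\qquad = \mu_n\left(\int_\Omega \nabla(F_n - \hat F)\nabla\varphi + \int_\Omega (F_n - \hat F)\varphi\right) \end{aligned} \tag{5.42} \]
for all $\varphi \in H^1(\Omega)$. Integrating by parts, we conclude from (5.42) that
\[ -\Delta F_n + \lambda\left(2e^{u_0+v_0+F_n} - e^{u_0+\frac{F_n+G_n}{2}} - e^{v_0+\frac{F_n-G_n}{2}}\right) + \frac{4\pi(N_1 + N_2)}{|\Omega|} = \mu_n\left(-\Delta(F_n - \hat F) + (F_n - \hat F)\right). \tag{5.43} \]
Since $G_n$ is a minimizer of $J_{F_n}$,
\[ \frac{1}{2}\|\nabla G_n\|_2^2 + \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G_n\,dx \le J(F_n, G_n) \le J(F_n, 0) = 2\lambda\int_\Omega \left(e^{u_0+\frac{F_n}{2}} + e^{v_0+\frac{F_n}{2}}\right). \tag{5.44} \]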
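Putting $\varphi = 1$ in (5.42), we have
\[ \lambda\int_\Omega e^{u_0+\frac{F_n+G_n}{2}}\,dx + \lambda\int_\Omega e^{v_0+\frac{F_n-G_n}{2}}\,dx = 2\lambda\int_\Omega e^{u_0+v_0+F_n} + O(1). \tag{5.45} \]
Since
\[ \|F_n\|_{H^1(\Omega)} \le C, \tag{5.46} \]
by the Moser-Trudinger inequality $e^{F_n} \in L^p(\Omega)$ for any $p > 1$. Thus by (5.45),
\[ \int_\Omega \big(e^{u_0+\frac{F_n+G_n}{2}} + e^{v_0+\frac{F_n-G_n}{2}}\big) \le C, \tag{5.47} \]
and Jensen's inequality implies that $\int_\Omega (F_n + G_n)$ and $\int_\Omega (F_n - G_n)$ are bounded from above. Thus,
\[ \Big|\int_\Omega G_n\Big| \le C \quad\text{and}\quad \|\nabla G_n\| \le C, \tag{5.48} \]
where the last inequality is due to (5.44). Again, using the Moser-Trudinger inequality together with (5.43), we get $\Delta F_n \in L^p$ for any $p > 1$ and $\|\Delta F_n\|_{L^p(\Omega)} \le C_p$ for all $p > 1$. By the Sobolev embedding theorem and the fact that $\|F_n - \hat F\|_{H^1(\Omega)} \to 0$, we obtain
\[ \|F_n - \hat F\|_{C^0(\Omega)} \to 0. \tag{5.49} \]
Since $\hat F > F_*$, we have
\[ F_n > F_* \tag{5.50} \]
for all large $n$ and hence $F_n \in \mathcal F$ for all large $n$. In particular, there exists $n_0 \gg 0$ such that
\[ F_{n_0} > F_* \quad\text{in } \Omega. \tag{5.51} \]
Since $\lim_{\varepsilon\to0} F_*^\varepsilon = F_*$, there exists $\varepsilon_0$ such that
\[ F_{n_0} > F_*^\varepsilon \quad\text{for all } \varepsilon < \varepsilon_0, \tag{5.52} \]
and hence $F_{n_0} \in \mathcal F^\varepsilon$ for all $\varepsilon < \varepsilon_0$. Thus, $I_\lambda^\varepsilon(F_{n_0}) \ge I_\lambda^\varepsilon(F^\varepsilon)$ for all $\varepsilon < \varepsilon_0$ and taking the limit as $\varepsilon \to 0$ we conclude
\[ I_\lambda(F_{n_0}) = \lim_{\varepsilon\to0} I_\lambda^\varepsilon(F_{n_0}) \ge \lim_{\varepsilon\to0} I_\lambda^\varepsilon(F^\varepsilon) = I_\lambda(\hat F), \tag{5.53} \]
which contradicts $I_\lambda(F_{n_0}) < I_\lambda(\hat F)$. This completes the proof.
Remark 5.1. By similar arguments, we can prove that $F^\varepsilon$ is a local minimizer of $I_\lambda^\varepsilon$ in $H^1(\Omega)$.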
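6. Mountain Pass Solution
We begin this section by proving that $I_\lambda^\varepsilon$ satisfies the Palais-Smale condition, i.e.,
Lemma 6.1. Let $N_2 \ge N_1 > 0$ and $\{F_n\}$ be a sequence in $H^1(\Omega)$ such that
(i) $I_\lambda^\varepsilon(F_n) \to \alpha$ as $n \to \infty$,
(ii) $\|DI_\lambda^\varepsilon(F_n)\| \to 0$ strongly, as $n \to \infty$.
Then $\{F_n\}$ has a convergent subsequence.
Proof. Condition (i) implies that there exists $M > 0$ such that
\[ |I_\lambda^\varepsilon(F_n)| < M. \tag{6.1} \]
Let $G_n := G(F_n)$ be the unique minimizer of $J_{F_n}^\varepsilon$ in $H^1(\Omega)$. Then $J_{F_n}^\varepsilon(G_n) \le J_{F_n}^\varepsilon(0)$ and from (6.1) we have $E^\varepsilon(F_n) - J_{F_n}^\varepsilon(0) < M$, i.e.,
\[ \frac12\|\nabla F_n\|^2 + \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega F_n\,dx + 2\lambda\int_\Omega \big(e^{u_0^\varepsilon+v_0^\varepsilon+F_n} - e^{u_0^\varepsilon+\frac{F_n}{2}} - e^{v_0^\varepsilon+\frac{F_n}{2}}\big)\,dx < M. \tag{6.2} \]
Therefore,
\[ \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega F_n\,dx + 2\lambda\int_\Omega \big(e^{u_0^\varepsilon+v_0^\varepsilon+F_n} - e^{u_0^\varepsilon+\frac{F_n}{2}} - e^{v_0^\varepsilon+\frac{F_n}{2}}\big)\,dx < M. \tag{6.3} \]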
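By the Cauchy-Schwarz inequality,
\[ \int_\Omega \big(e^{u_0^\varepsilon+\frac{F_n}{2}} + e^{v_0^\varepsilon+\frac{F_n}{2}}\big) \le C(\varepsilon)\Big(\int_\Omega e^{F_n}\Big)^{1/2}. \tag{6.4} \]
Therefore,
\[ \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega F_n\,dx + 2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F_n} - C(\varepsilon)\Big(\int_\Omega e^{F_n}\Big)^{1/2} < M. \tag{6.5} \]
Note that, together with Jensen's inequality, (6.5) implies that
\[ e^{u_0^\varepsilon+v_0^\varepsilon} \ge c(\varepsilon) > 0, \tag{6.6} \]
\[ \int_\Omega F_n \le C(\varepsilon). \tag{6.7} \]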
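The Cauchy-Schwarz step behind (6.4) can be spelled out as follows. This is only a sketch: it assumes that $e^{u_0^\varepsilon}$ and $e^{v_0^\varepsilon}$ belong to $L^2(\Omega)$, and the explicit form of $C(\varepsilon)$ below is one admissible choice, not a formula taken from the text.

```latex
% Sketch: Cauchy--Schwarz applied separately to the two exponential terms in (6.4).
% Assumption: e^{u_0^\varepsilon}, e^{v_0^\varepsilon} \in L^2(\Omega).
\begin{align*}
\int_\Omega e^{u_0^\varepsilon + F_n/2}\,dx
  &\le \Big(\int_\Omega e^{2u_0^\varepsilon}\,dx\Big)^{1/2}
       \Big(\int_\Omega e^{F_n}\,dx\Big)^{1/2},\\
\int_\Omega e^{v_0^\varepsilon + F_n/2}\,dx
  &\le \Big(\int_\Omega e^{2v_0^\varepsilon}\,dx\Big)^{1/2}
       \Big(\int_\Omega e^{F_n}\,dx\Big)^{1/2}.
\end{align*}
% Adding the two lines gives (6.4) with the (illustrative) constant
\[
C(\varepsilon) = \|e^{u_0^\varepsilon}\|_{L^2(\Omega)} + \|e^{v_0^\varepsilon}\|_{L^2(\Omega)} .
\]
```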
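Recall that $G_n$ satisfies
\[ \Delta G_n = \lambda\big(e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} - e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\big) + \frac{4\pi(N_1-N_2)}{|\Omega|}. \]
If $x_0 \in \Omega$ is such that $\min_\Omega G_n = G_n(x_0)$, then
\[ \lambda\big(e^{u_0^\varepsilon(x_0)+\frac{F_n(x_0)+G_n(x_0)}{2}} - e^{v_0^\varepsilon(x_0)+\frac{F_n(x_0)-G_n(x_0)}{2}}\big) + \frac{4\pi(N_1-N_2)}{|\Omega|} \ge 0, \tag{6.8} \]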
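and since $N_2 \ge N_1$,
\[ e^{u_0^\varepsilon(x_0)+G_n(x_0)/2} - e^{v_0^\varepsilon(x_0)-G_n(x_0)/2} \ge 0. \tag{6.9} \]
Hence
\[ \inf_n \min_\Omega G_n \ge -C_\varepsilon \tag{6.10} \]
for some constant $C_\varepsilon > 0$ independent of $n$. Moreover,
\[ \lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} = \lambda\int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}} + 4\pi(N_2-N_1). \tag{6.11} \]
Since $DJ_{F_n}(G_n) = 0$,
\[ DI_\lambda(F_n) = DE(F_n) - D_1J(F_n, G_n) - D_2J(F_n, G_n)\Big(\frac{\partial G_n}{\partial F}\Big) = DE(F_n) - D_1J(F_n, G_n), \tag{6.12} \]
where $D_i$ denotes differentiation with respect to the $i$th variable, $i = 1, 2$. Therefore, (ii) implies that
\[ \|DE(F_n) - D_1J(F_n, G_n)\| \to 0 \quad\text{as } n \to \infty, \tag{6.13} \]
strongly, i.e., for any $\varphi \in H^1(\Omega)$,
\[ \int_\Omega \nabla F_n\cdot\nabla\varphi + \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega \varphi + 2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F_n}\varphi - \lambda\int_\Omega \big(e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} + e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\big)\varphi = o(1)\|\varphi\|_{H^1}, \tag{6.14} \]
where $o(1) \to 0$ as $n \to \infty$. Putting $\varphi = 1$, we get
\[ \Big|\lambda\int_\Omega \big(2e^{u_0^\varepsilon+v_0^\varepsilon+F_n} - e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} - e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\big)\,dx + 4\pi(N_1+N_2)\Big| \to 0. \tag{6.15} \]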
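Using (6.11) and (6.15), we get
\begin{align*}
2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F_n} &\le \lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} + \lambda\int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\,dx - 4\pi(N_1+N_2) + o(1) \\
&\le C(\varepsilon, \lambda, N_1, N_2)\Big(\int_\Omega e^{F_n}\Big)^{1/2}\Big(\int_\Omega e^{-G_n}\Big)^{1/2} \\
&\le C(\varepsilon, \lambda, N_1, N_2)\Big(\int_\Omega e^{F_n}\Big)^{1/2}. \tag{6.16}
\end{align*}
Since
\[ 2\lambda\big(\inf_\Omega e^{u_0^\varepsilon+v_0^\varepsilon}\big)\int_\Omega e^{F_n} \le 2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F_n}, \]
(6.16) implies that
\[ \int_\Omega e^{F_n} \le C(\varepsilon, \lambda, N_1, N_2). \tag{6.17} \]
Therefore,
\[ \int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}} \le C(\varepsilon, \lambda, N_1, N_2), \tag{6.18} \]
and from (6.11),
\[ \int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} \le C(\varepsilon, \lambda, N_1, N_2). \tag{6.19} \]
Hence from Jensen's inequality,
\[ \int_\Omega (F_n + G_n) \le C. \tag{6.20} \]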
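From (6.8),
\[ \int_\Omega |\nabla G_n|^2 + \lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}}G_n = \lambda\int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}G_n + \frac{4\pi(N_2-N_1)}{|\Omega|}\int_\Omega G_n. \tag{6.21} \]
Thus
\begin{align*}
\int_\Omega |\nabla G_n|^2 &\le \int_\Omega |\nabla G_n|^2 + \lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n}{2}}\big(e^{\frac{G_n}{2}} - 1\big)G_n \\
&= -\lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n}{2}}G_n + \lambda\int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}G_n + \frac{4\pi(N_2-N_1)}{|\Omega|}\int_\Omega G_n \\
&\le C(\varepsilon)\Big|\int_\Omega G_n\Big|, \tag{6.22}
\end{align*}
where (6.10), (6.17) and the Poincaré inequality are used. Substituting $\varphi = \tilde F_n := F_n - \alpha_n$, where $\alpha_n := \frac{1}{|\Omega|}\int_\Omega F_n$, in (6.14), we have
\[ \int_\Omega |\nabla \tilde F_n|^2 + 2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F_n}\tilde F_n = \lambda\int_\Omega \big(e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} + e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\big)\tilde F_n + o(1)\|\tilde F_n\|_{H^1(\Omega)}, \]
and hence
\begin{align*}
\int_\Omega |\nabla \tilde F_n|^2 &\le \int_\Omega |\nabla \tilde F_n|^2 + 2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+\alpha_n}\big(e^{\tilde F_n} - 1\big)\tilde F_n \\
&= -2\lambda\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+\alpha_n}\tilde F_n + \lambda\int_\Omega \big(e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} + e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\big)\tilde F_n + o(1)\|\nabla \tilde F_n\|. \tag{6.23}
\end{align*}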
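Multiplying (6.8) by $\tilde F_n$ and integrating by parts, we get
\[ \int_\Omega \nabla G_n\cdot\nabla \tilde F_n = \lambda\int_\Omega \big(e^{u_0^\varepsilon+\frac{F_n+G_n}{2}} - e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\big)\tilde F_n. \tag{6.24} \]
Thus,
\[ \lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}}\tilde F_n = \int_\Omega \nabla G_n\cdot\nabla \tilde F_n + \lambda\int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\tilde F_n. \tag{6.25} \]
From (6.10) and (6.17),
\[ \int_\Omega e^{2v_0^\varepsilon+F_n-G_n} \le C(\varepsilon). \tag{6.26} \]
Therefore, substituting (6.24) in (6.23) and using the Poincaré inequality we get
\[ \|\nabla \tilde F_n\|^2 = \int_\Omega |\nabla \tilde F_n|^2 \le C(\varepsilon)\|\nabla \tilde F_n\| + \int_\Omega \nabla G_n\cdot\nabla \tilde F_n \le \big(C(\varepsilon) + \|\nabla G_n\|\big)\|\nabla \tilde F_n\|, \tag{6.27} \]
and therefore
\[ \|\nabla \tilde F_n\| \le C(\varepsilon) + \|\nabla G_n\|. \tag{6.28} \]
Substituting (6.17), (6.18), (6.19) in (6.1) we get
\begin{align*}
-C &\le \frac12\|\nabla F_n\|^2 - \frac12\|\nabla G_n\|^2 + \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega F_n + \frac{4\pi(N_2-N_1)}{|\Omega|}\int_\Omega G_n \\
&\le C_1(\varepsilon) + c(\varepsilon)\|\nabla G_n\|^2 + \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega (F_n + G_n) - \frac{8\pi N_1}{|\Omega|}\int_\Omega G_n, \tag{6.29}
\end{align*}
where the last inequality is due to (6.28) and $c(\varepsilon)$ is small and will be chosen later. From (6.20) and (6.22), (6.29) yields
\[ \frac{8\pi N_1}{|\Omega|}\int_\Omega G_n \le c(\varepsilon)C(\varepsilon)\int_\Omega G_n + C_2(\varepsilon), \tag{6.30} \]
where $C_i(\varepsilon)$ are constants independent of $c(\varepsilon)$. Choosing $c(\varepsilon)C(\varepsilon) = \frac{4\pi N_1}{|\Omega|}$, from (6.10), we have
\[ \Big|\int_\Omega G_n\Big| \le C(\varepsilon). \tag{6.31} \]
From (6.22) and (6.28) we have
\[ \int_\Omega |\nabla F_n|^2 + \int_\Omega |\nabla G_n|^2 \le C(\varepsilon). \tag{6.32} \]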
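Again, (6.1) gives
\[ \Big|\int_\Omega F_n\Big| \le C(\varepsilon). \tag{6.33} \]
The Palais-Smale condition can now be proved from (6.31), (6.32) and (6.33). We omit the details since the remaining argument is standard (see [13]). This completes the proof of Lemma 6.1.
Now, we are in a position to prove Theorem 1.4.
Proof of Theorem 1.4. From Theorem 1.2, $I_\lambda$ has a local minimizer $F_0$ for $\lambda > \lambda_*$. We first consider the case when $\min\{N_1, N_2\} > 0$. If $F_0$ is not a strict local minimum, then a second solution of (3.15) obviously exists. Hence, we may assume that $F_0$ is a strict local minimum for $I_\lambda$, i.e., there exists $\delta_0 > 0$ such that
\[ \min_{\|F - F_0\|_{H^1(\Omega)} = \delta_0} I_\lambda(F) - I_\lambda(F_0) \ge C_1 > 0. \tag{6.34} \]
Thus there exists $\varepsilon_0 > 0$ such that
\[ \min_{\|F - F_\varepsilon\|_{H^1(\Omega)} = \delta_0} I_\lambda^\varepsilon(F) - I_\lambda^\varepsilon(F_\varepsilon) \ge C_1/2 > 0 \tag{6.35} \]
for all $0 < \varepsilon \le \varepsilon_0$. For a fixed $\varepsilon > 0$, there exists large $\alpha > 0$ such that
\begin{align*}
I_\lambda^\varepsilon(F_\varepsilon - \alpha) &= \frac12\int_\Omega |\nabla F_\varepsilon|^2 + \frac{4\pi(N_1+N_2)}{|\Omega|}\int_\Omega F_\varepsilon - 4\pi(N_1+N_2)\alpha \\
&\quad - \frac12\|\nabla G_\varepsilon\|^2 + \frac{4\pi(N_2-N_1)}{|\Omega|}\int_\Omega G_\varepsilon + \lambda e^{-\alpha}\int_\Omega e^{u_0^\varepsilon+v_0^\varepsilon+F_\varepsilon} \\
&\quad - \lambda e^{-\alpha/2}\int_\Omega \big(e^{u_0^\varepsilon+\frac{F_\varepsilon+G_\varepsilon}{2}} + e^{v_0^\varepsilon+\frac{F_\varepsilon-G_\varepsilon}{2}}\big) \\
&\le -4\pi(N_1+N_2)\alpha + C < I_\lambda^\varepsilon(F_\varepsilon), \tag{6.36}
\end{align*}
where $C$ is a constant independent of $\varepsilon$ and $\alpha$, and $\alpha$ is large. Thus, by the mountain pass lemma, there exists a solution $\hat F_\varepsilon \in H^1(\Omega)$ such that
\[ I_\lambda^\varepsilon(\hat F_\varepsilon) - I_\lambda^\varepsilon(F_\varepsilon) \ge C_1/2. \tag{6.37} \]
Taking the limit as $\varepsilon \to 0$, $\hat F_\varepsilon \to \hat F$ and $F_\varepsilon \to F_0$ in $C^2(\Omega)$ and
\[ I_\lambda(\hat F) - I_\lambda(F_0) \ge C_1/2. \tag{6.38} \]
Hence the second solution of (3.15) has been obtained provided that $\min\{N_1, N_2\} > 0$.
If $\min\{N_1, N_2\} = 0$, we may assume that $N_2 > N_1 = 0$. Without loss of generality, let the local minimum $F_0$ be a strict local minimizer in $H^1(\Omega)$, i.e., (6.34) holds. Let $k > 0$ and consider $I_\lambda^k$ corresponding to the equation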
\[ \begin{cases} \Delta u + \lambda e^{v}(1 - e^{u}) = 4\pi k\,\delta_{p_0}, \\ \Delta v + \lambda e^{u}(1 - e^{v}) = 4\pi \sum_{j=1}^{k_2} n_j\,\delta_{q_j}. \end{cases} \tag{6.39} \]
It is not difficult to see that $I_\lambda^k \to I_\lambda$ and $F_k \to F_0$ as $k \to 0$, where $F_k$ is a local minimizer of (6.39). By the above proof, there exists a second solution of (6.39) with
\[ I_\lambda^k(\hat F_k) - I_\lambda^k(F_k) \ge C_1/2 > 0. \tag{6.40} \]
Letting $k \to 0$, $\hat F_k$ will converge to $\hat F$ in $H^1(\Omega)$. Since the proof is similar to that of Lemma 4.1, it is omitted here. Obviously,
\[ I_\lambda(\hat F) - I_\lambda(F_0) \ge C_1/2 > 0. \tag{6.41} \]
Thus, a second solution is found in the case $N_1 = 0$ and the proof of Theorem 1.4 is complete.
Acknowledgement. This work was done while the second author was visiting Taida Institute for Mathematical Sciences, National Taiwan University with NSC grants. She thanks them for their support and warm hospitality.
References
1. Aubin, T.: Nonlinear analysis on manifolds: Monge-Ampère equations. Grundlehren Math. Wiss., Vol. 252, New York: Springer, 1982
2. Caffarelli, L.A., Yang, Y.S.: Vortex condensation in the Chern-Simons Higgs model: an existence theorem. Comm. Math. Phys. 168(2), 321–336 (1995)
3. Chae, D., Imanuvilov, O.Yu.: Non-topological multivortex solutions to the self-dual Maxwell-Chern-Simons-Higgs systems. J. Funct. Anal. 196(1), 87–118 (2002)
4. Chae, D., Kim, N.: Topological multivortex solutions of the self-dual Maxwell-Chern-Simons-Higgs system. J. Differential Equations 134(1), 154–182 (1997)
5. Chan, H., Fu, C.-C., Lin, C.-S.: Non-topological multi-vortex solutions to the self-dual Chern-Simons-Higgs equation. Comm. Math. Phys. 231(2), 189–221 (2002)
6. Chern, J.-L., Chen, Z.-Y., Lin, C.-S.: Uniqueness of topological solutions and the structure of solutions for the Chern-Simons system with two Higgs particles. Preprint
7. Dunne, G.V.: Aspects of Chern-Simons theory. Aspects topologiques de la physique en basse dimension/Topological aspects of low dimensional systems (Les Houches, 1998), Les Ulis: EDP Sci., 1999, pp. 177–263
8. Dziarmaga, J.: Low energy dynamics of [U(1)]^N Chern-Simons solitons and two dimensional nonlinear equations. Phys. Rev. D 49, 5469–5479 (1994)
9. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Reprint of the 1998 edition. Classics in Mathematics. Berlin: Springer-Verlag, 2001
10. Jaffe, A., Taubes, C.: Vortices and Monopoles. Progr. Phys. Vol. 2, Boston, MA: Birkhäuser Boston, 1990
11. Kim, C., Lee, C., Ko, P., Lee, B.-H.: Schrödinger fields on the plane with [U(1)]^N Chern-Simons interactions and generalized self-dual solitons. Phys. Rev. D (3) 48, 1821–1840 (1993)
12. Lin, C.-S., Ponce, A.C., Yang, Y.: A system of elliptic equations arising in Chern-Simons field theory. J. Funct. Anal. 247(2), 289–350 (2007)
13. Struwe, M.: Variational methods. Applications to nonlinear partial differential equations and Hamiltonian systems. Fourth edition. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics, 34, Berlin: Springer-Verlag, 2008
14. Spruck, J., Yang, Y.S.: The existence of nontopological solitons in the self-dual Chern-Simons theory. Comm. Math. Phys. 149(2), 361–376 (1992)
15. Spruck, J., Yang, Y.S.: Topological solutions in the self-dual Chern-Simons theory: existence and approximation. Ann. Inst. H. Poincaré Anal. Non Linéaire 12(1), 75–97 (1995)
16. Tarantello, G.: Multiple condensate solutions for the Chern-Simons-Higgs theory. J. Math. Phys. 37(8), 3769–3796 (1996)
17. Tarantello, G.: Self-dual gauge field vortices: an analytical approach. Berlin-Heidelberg, New York: Springer, 2007
18. Nolasco, M., Tarantello, G.: Vortex condensates for the SU(3) Chern-Simons theory. Comm. Math. Phys. 213(3), 599–639 (2000)
19. Yang, Y.: Solitons in field theory and nonlinear analysis. Springer Monographs in Mathematics. New York: Springer-Verlag, 2001
Communicated by I.M. Sigal
Commun. Math. Phys. 288, 349–377 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0772-x
On the Steady Compressible Navier–Stokes–Fourier System
Piotr B. Mucha1, Milan Pokorný2
1 Institute of Applied Mathematics and Mechanics, University of Warsaw, ul. Banacha 2, 02-097 Warszawa, Poland. E-mail: [email protected]
2 Mathematical Institute of Charles University, Sokolovská 83, 186 75 Praha 8, Czech Republic. E-mail: [email protected]
Received: 10 March 2008 / Accepted: 11 December 2008
Published online: 17 March 2009 – © Springer-Verlag 2009
Abstract: We study the motion of the steady compressible heat conducting viscous fluid in a bounded three dimensional domain governed by the compressible Navier–Stokes–Fourier system. Our main result is the existence of a weak solution to these equations for arbitrarily large data. A key element of the proof is a special approximation of the original system guaranteeing pointwise uniform boundedness of the density as well as the positiveness of the temperature. Therefore the passage to the limit omits tedious technical tricks required by the standard theory. Basic estimates on the solutions are possible to obtain by a suitable choice of physically reasonable boundary conditions.
1. Introduction
We consider the following system of partial differential equations describing the steady flow of a compressible heat conducting Newtonian fluid in a bounded three dimensional domain $\Omega$,
\[ \operatorname{div}(\rho v) = 0, \tag{1.1} \]
\[ \operatorname{div}(\rho v \otimes v) - \operatorname{div} S(v) + \nabla p(\rho, \theta) = \rho F, \tag{1.2} \]
\[ \operatorname{div}(\rho e(\rho, \theta) v) - \operatorname{div}(\kappa(\theta)\nabla\theta) = S(v) : \nabla v - p(\rho, \theta)\operatorname{div} v, \tag{1.3} \]
where $\rho : \Omega \to \mathbb R_0^+$ is the density of the fluid, $v : \Omega \to \mathbb R^3$ is the velocity field, $S(v) = 2\mu D(v) + \lambda(\operatorname{div} v) I$ is the viscous part of the stress tensor, $D(v) = \frac12\big(\nabla v + (\nabla v)^T\big)$ is the symmetric part of the velocity gradient, $p(\cdot, \cdot) : \mathbb R_0^+ \times \mathbb R^+ \to \mathbb R_0^+$, a given function, is the pressure, $F : \Omega \to \mathbb R^3$ is the external force, and $e(\cdot, \cdot) : \mathbb R_0^+ \times \mathbb R^+ \to \mathbb R_0^+$, a given function, is the internal energy. System (1.1)–(1.3) is known as the compressible Navier–Stokes–Fourier equations or the full Navier–Stokes system [6].
We assume that the constitutive equation has the form
\[ p(\rho, \theta) = a_1\rho^\gamma + a_2\rho\theta, \quad a_1, a_2 > 0, \tag{1.4} \]
i.e. the pressure has one part corresponding to the ideal fluid and a so-called elastic part; for more information see e.g. [6]. Even though we could consider more general pressure laws, we restrict ourselves to this simple model to avoid unnecessary technicalities in the proof. The corresponding internal energy takes the form
\[ e(\rho, \theta) = a_1\frac{\rho^{\gamma-1}}{\gamma-1} + c_v\theta, \tag{1.5} \]
see e.g. [6] or [1]. Note that in full generality, Eq. (1.3) should be replaced by the conservation of the total energy, instead of conservation of the internal energy only. For a sufficiently regular class of solutions, including the one we are going to construct, the balance of the kinetic energy is just a consequence of the momentum equation.
We further simplify (1.3). Our solutions will be such that $\rho \in L^\infty(\Omega)$ and $v \in W_p^1(\Omega)$ for all $p < \infty$. Due to the fact that $\operatorname{div}(\rho v) = 0$ in the weak sense, we get (see [16])
\[ \frac{1}{\gamma-1}\operatorname{div}(\rho^\gamma v) = -\rho^\gamma\operatorname{div} v, \]
again in the weak sense. Thus, putting $a_1 = a_2 = c_v = 1$, we write the energy equation (1.3) in the form
\[ \operatorname{div}(\rho\theta v) - \operatorname{div}(\kappa(\theta)\nabla\theta) = S(v) : \nabla v - \rho\theta\operatorname{div} v. \tag{1.6} \]
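The passage from (1.3) to (1.6) can be checked by a short formal computation. The sketch below assumes that $\rho$ and $v$ are smooth enough to apply the product rule pointwise; the weak-sense justification is the one the text attributes to [16].

```latex
% Formal computation for smooth \rho, v (an assumption made only for this sketch).
\begin{align*}
\operatorname{div}(\rho^\gamma v)
  &= \rho^\gamma \operatorname{div} v + \gamma\rho^{\gamma-1}\, v\cdot\nabla\rho, \\
0 = \rho^{\gamma-1}\operatorname{div}(\rho v)
  &= \rho^\gamma \operatorname{div} v + \rho^{\gamma-1}\, v\cdot\nabla\rho
  \;\Longrightarrow\;
  \rho^{\gamma-1}\, v\cdot\nabla\rho = -\rho^\gamma \operatorname{div} v, \\
\frac{1}{\gamma-1}\operatorname{div}(\rho^\gamma v)
  &= \frac{1-\gamma}{\gamma-1}\,\rho^\gamma \operatorname{div} v
   = -\rho^\gamma \operatorname{div} v .
\end{align*}
% With a_1 = a_2 = c_v = 1 one has \rho e = \rho^\gamma/(\gamma-1) + \rho\theta
% and p = \rho^\gamma + \rho\theta, hence
\begin{align*}
\operatorname{div}(\rho e v) &= -\rho^\gamma \operatorname{div} v + \operatorname{div}(\rho\theta v), \\
-p \operatorname{div} v &= -\rho^\gamma \operatorname{div} v - \rho\theta \operatorname{div} v .
\end{align*}
% The terms -\rho^\gamma \operatorname{div} v cancel on both sides of (1.3), which leaves (1.6).
```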
The viscosity coefficients are, for the sake of simplicity, considered to be constant such that the conditions of the thermodynamical stability
\[ \mu > 0, \quad \lambda + \frac23\mu > 0 \tag{1.7} \]
are satisfied. Finally, the heat conductivity is assumed to be temperature dependent, i.e.
\[ \kappa(\theta) = a_3(1 + \theta^m), \quad a_3, m > 0. \tag{1.8} \]
This fact is important for our study; we are not able to consider a constant heat conductivity.
Our domain $\Omega$ is sufficiently smooth, at least a $C^2$ domain. We supplement the system (1.1), (1.2) and (1.6) with the following boundary conditions at $\partial\Omega$. For the velocity, we consider the slip boundary conditions
\[ v \cdot n = 0, \quad \tau_k \cdot (T(p, v) n) + f v \cdot \tau_k = 0 \quad\text{at } \partial\Omega, \tag{1.9} \]
where $\tau_k$, $k = 1, 2$, are two perpendicular tangent vectors to $\partial\Omega$, $n$ is the outer normal vector and $T(p, v) = -pI + S(v)$ is the stress tensor. The friction coefficient $f$ is non-negative (if $f = 0$ we assume additionally that $\Omega$ is not axially symmetric). Recall that $f = 0$ corresponds to the perfect slip, while $f \to \infty$ leads to the homogeneous Dirichlet boundary conditions. However, we are not able to perform this limit passage. Concerning the temperature, we assume that
\[ \kappa(\theta)\frac{\partial\theta}{\partial n} + L(\theta)(\theta - \theta_0) = 0 \quad\text{at } \partial\Omega, \tag{1.10} \]
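where $\theta_0 : \partial\Omega \to \mathbb R^+$ is a strictly positive sufficiently smooth given function, say $\theta_0 \in C^2(\partial\Omega)$, $0 < \theta_* \le \theta_0 \le \theta^* < \infty$ with $\theta_*, \theta^* \in \mathbb R^+$, and
\[ L(\theta) = a_4(1 + \theta^l), \quad l \in \mathbb R_0^+. \tag{1.11} \]
We also add the prescribed mass of the gas
\[ \int_\Omega \rho\,dx = M > 0. \tag{1.12} \]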
The objective of this paper is to prove the existence of weak solutions to problem (1.1)–(1.12) for arbitrarily large data. Till now only partial results have been proved (see e.g. [2,9,14,15]) and the only known general theorems concern weak solutions to the evolutionary version of the system [6]. One of the main obstacles was to construct suitable a priori estimates. Due to properties of boundary condition (1.10) we are able to obtain a nontrivial energy bound for weak solutions, saving the thermodynamical structure of the system. In the case of the barotropic gas we do not meet such difficulties: the energy bound follows in an elementary way from the momentum equation. However, it is not the only difference. The standard methods introduced by P.L. Lions [9] do not work successfully for the heat conducting case. However, a generalization of the technique introduced in [11,17] gives us sufficient tools to solve the stated problem. An approach to system (1.1)–(1.12) was considered in the book [9]. Unfortunately, this result can be viewed as conditional only, since instead of (1.12) the author assumed artificially that weak solutions satisfy $\int_\Omega \rho^p\,dx = M^p$ for sufficiently large $p$. On the one hand, this condition is physically not acceptable; on the other hand, it simplifies considerably the mathematical analysis. Nevertheless, this result shows us what the difference in techniques is for the barotropic and heat conducting models. Looking at results concerning the classical solutions for problems with small data, we realize that the heat conducting system has the same mathematical structure (difficulties) as the barotropic version of the model. Thus results from [2,15] are almost immediately transformed to the case of system (1.1)–(1.12). For large data solutions the energy equation starts to play an important role, essentially changing the properties of the whole system.
The evolutionary case of system (1.1)–(1.12), under general assumptions on the pressure law, was considered in [7] and [8]; the authors assumed only the situation when the fluid is thermally isolated, i.e. $\frac{\partial\theta}{\partial n} = 0$ at the boundary. However, the same technique works also for our boundary conditions (1.10). The thermally isolated situation guarantees immediately the energy bound for weak solutions, but considering the limit $t \to \infty$, the only solution which can be obtained as the limit for large times (with time independent force) is a solution with the constant temperature. This is connected to the fact that the model does not allow the heat transfer through the boundary and either the energy increases to infinity (non-potential force) or the temperature approaches a constant value (potential force). The boundary condition (1.10) allows the heat transfer through the boundary, guaranteeing the balance of the total energy, and thus we are able to prove existence of solutions which are definitely nontrivial and physically acceptable.
The main result of this paper is the following.
Theorem 1. Let $\Omega \in C^2$ be a bounded domain in $\mathbb R^3$ which is not axially symmetric if $f = 0$. Let $F \in L^\infty(\Omega)$ and
\[ \gamma > 3, \quad m = l + 1 > \frac{3\gamma - 1}{3\gamma - 7}. \]
Then there exists a weak solution to (1.1)–(1.12) such that
\[ \rho \in L^\infty(\Omega), \quad v \in W_q^1(\Omega), \quad \theta \in W_q^1(\Omega) \quad\text{for all } 1 \le q < \infty \text{ and } \theta > 0 \text{ a.e.} \]
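The solution constructed by Theorem 1 is meant in the following sense.
Definition 1. The triple $(\rho, v, \theta)$ is a weak solution to (1.1)–(1.12), if $\rho \in L^s(\Omega)$, $s \ge \gamma$, $v \in W_2^1(\Omega)$, $\theta \in W_2^1(\Omega)$, $\theta^m\nabla\theta \in L^1(\Omega)$ and $\theta > 0$ a.e.; $v \cdot n = 0$ at $\partial\Omega$ in the sense of traces and
\[ \int_\Omega \rho v \cdot \nabla\eta\,dx = 0 \quad \forall \eta \in C^\infty(\overline\Omega), \tag{1.13} \]
\[ \int_\Omega \big(-\rho v \otimes v : \nabla\varphi + 2\mu D(v) : D(\varphi) + \lambda\operatorname{div} v\operatorname{div}\varphi - p(\rho, \theta)\operatorname{div}\varphi\big)\,dx + f\int_{\partial\Omega} (v_\tau)\cdot(\varphi_\tau)\,d\sigma = \int_\Omega \rho F \cdot \varphi\,dx \tag{1.14} \]
$\forall \varphi \in C^\infty(\overline\Omega)$; $\varphi \cdot n = 0$ at $\partial\Omega$ (we denoted by $v_\tau$ the vector $v - (v \cdot n)n$¹) and finally
\[ \int_\Omega \big(\kappa(\theta)\nabla\theta \cdot \nabla\psi - \rho\theta v \cdot \nabla\psi\big)\,dx + \int_{\partial\Omega} L(\theta)(\theta - \theta_0)\psi\,d\sigma = \int_\Omega \big(2\mu|D(v)|^2\psi + \lambda(\operatorname{div} v)^2\psi - \rho\theta\operatorname{div} v\,\psi\big)\,dx \tag{1.15} \]
$\forall \psi \in C^\infty(\overline\Omega)$.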
The proof of Theorem 1 will be based on a special approximation procedure described in the next section, which is the core of our method. This section also includes a priori estimates for the approximation. The structure of the approximative system gives us immediately the approximative density bounded uniformly in $L^\infty$, but we must prove refined $L^\infty$ estimates to verify that the limit solves the original system (1.1)–(1.3). This idea has already been successfully applied in [11] and [17] in the case of barotropic flows. The third section contains a detailed proof of existence for the approximative system. Here the main difficulty comes from the energy equation, since the required positiveness of the temperature does not follow immediately. In the next section we introduce an important quantity, the effective viscous flux, and prove its main property, i.e. the compactness. This feature allows us to improve information about the convergence of the density, which is the fundamental fact in the theory of the compressible Navier–Stokes equations [6,9]. The last section describes the refined $L^\infty$ estimates for the approximative density and the passage to the limit. Then we prove that the limit is indeed our sought solution in the sense of Definition 1.
As the reader may easily check, our method works for a slightly larger class of pressure laws. It allows us to consider e.g.
\[ p(\rho, \theta) = p_b(\rho) + \rho\theta, \tag{1.16} \]
where $p_b(\rho)$ is a strictly monotone function which behaves for large values as $\rho^\gamma$. The main steps of this generalization are similar to the barotropic case and can be found in [17]; since our problem is technically complicated enough, we shall avoid such generalizations.
¹ Note that $v \cdot n = 0$ at $\partial\Omega$ and thus $v_\tau = v$.
Our new result is closely related to the barotropic version of the system (1.1)–(1.12). Let us recall the state of the art in this theory. The steady compressible Navier–Stokes equations for arbitrarily large data were first successfully studied in the book [9], where, in the case of $p(\rho) = \rho^\gamma$, the existence of renormalized weak solutions was shown for $\gamma > 1$ ($N = 2$) and $\gamma \ge \frac53$ ($N = 3$) for Dirichlet boundary conditions. For potential forces with a small non-potential perturbation the existence was improved in [13] for $\gamma > \frac32$ ($N = 3$). In the recent paper [5] the authors proved the existence in two space dimensions also for $\gamma = 1$. See also [3], where the authors considered the three dimensional case and got existence for certain values of $\gamma$ less than $\frac53$, however, for periodic boundary conditions. P.L. Lions also considered the existence of solutions with locally bounded density: for the case of Dirichlet boundary conditions he was able to show their existence for $\gamma > 1$ ($N = 2$) and $\gamma \ge 3$ ($N = 3$). Nevertheless, to prove Theorem 1 the above methods are not sufficient, thus we present our new approach for the heat conducting model.
Throughout the paper we use the standard notations for the Lebesgue, Sobolev, etc. spaces; generic constants are denoted by $C$ and sequences $\varepsilon \to 0$ always mean suitably chosen subsequences $\varepsilon_k \to 0^+$. For the sake of simplicity we put $a_1 = a_2 = a_3 = a_4 = c_v = 1$.
2. Approximation
This section contains one of the main difficulties in the proof of Theorem 1: to find a good approximation of problem (1.1)–(1.12). Then we shall be able to show existence and prove the corresponding a priori estimates. Here we present the approximative system as well as the proof of the fundamental a priori estimates, provided the temperature is positive and all quantities are sufficiently smooth. The next section then deals with the solvability of this system and with further a priori bounds. In particular, in Sect. 3 the positiveness of the approximative temperature and smoothness of all quantities is proved.
Our approximative system will contain two parameters: a number $\varepsilon > 0$ and an auxiliary function $K(\cdot)$ defined by a number $k > 0$ as follows:
\[ K(t) = \begin{cases} 1 & \text{for } t < k - 1, \\ \in [0, 1] & \text{for } k - 1 \le t \le k, \\ 0 & \text{for } t > k; \end{cases} \tag{2.1} \]
moreover, we assume that $K'(t) < 0$ for $t \in (k - 1, k)$, where $k \in \mathbb R^+$. In the last section we pass with $\varepsilon \to 0^+$ and we shall show that we may take $k$ sufficiently large such that $K(\rho) \equiv 1$ for our solution. The approximation of our problem (1.1)–(1.12) reads as follows:
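\[ \left.\begin{aligned}
&\varepsilon\rho + \operatorname{div}(K(\rho)\rho v) - \varepsilon\Delta\rho = \varepsilon h K(\rho) \\
&\tfrac12\operatorname{div}(K(\rho)\rho v \otimes v) + \tfrac12 K(\rho)\rho v \cdot \nabla v - \operatorname{div} S(v) + \nabla P(\rho, \theta) = \rho K(\rho) F \\
&-\operatorname{div}\Big((1 + \theta^m)\frac{\varepsilon + \theta}{\theta}\nabla\theta\Big) + \operatorname{div}\Big(v\int_0^\rho K(t)\,dt\Big)\theta + \operatorname{div}(K(\rho)\rho v)\,\theta \\
&\qquad + K(\rho)\rho v \cdot \nabla\theta - \theta K(\rho) v \cdot \nabla\rho = S(v) : \nabla v
\end{aligned}\right\} \quad\text{in } \Omega, \tag{2.2} \]
where
\[ P(\rho, \theta) = \int_0^\rho \gamma t^{\gamma-1} K(t)\,dt + \theta\int_0^\rho K(t)\,dt = P_b(\rho) + \theta\int_0^\rho K(t)\,dt \tag{2.3} \]
and $h = \frac{M}{|\Omega|}$. Equation (2.2)$_3$ can be reformulated in the following way, being the modification of the entropy equation:
\[ -\operatorname{div}\Big((1 + e^{sm})\frac{\varepsilon + e^s}{e^s}\nabla s\Big) + K(\rho)\rho v \cdot \nabla s - K(\rho) v \cdot \nabla\rho + \operatorname{div}\Big(v\int_0^\rho K(t)\,dt\Big) + \operatorname{div}(K(\rho)\rho v) = \frac{S(v) : \nabla v}{e^s} + (1 + e^{sm})\frac{\varepsilon + e^s}{e^s}|\nabla s|^2 \quad\text{in } \Omega, \tag{2.4} \]
with the "entropy" $s$ defined as follows:
\[ s = \ln\theta. \tag{2.5} \]
The solvability of (2.2)–(2.4), guaranteed by Theorem 2, gives us $s$ integrable, even continuous for a fixed $\varepsilon > 0$. Hence here the temperature $\theta := e^s$ is positive. This construction is performed in Sect. 3. Additionally, if $s \in W_q^2(\Omega)$ or $\theta \in W_q^2(\Omega)$ with $q > \frac32$, then $\theta \ge c_0 > 0$ in $\Omega$, and (2.2)$_3$ and (2.4) are equivalent. The distinguished entropy will allow us to control the positiveness of the temperature, which does not seem to be elementary when working directly with an equation of type (2.2)$_3$. This system is completed by the boundary conditions at $\partial\Omega$,
\[ \begin{aligned}
&(1 + \theta^m)(\varepsilon + \theta)\frac{\partial s}{\partial n} + L(\theta)(\theta - \theta_0) + \varepsilon s = 0, \\
&v \cdot n = 0, \quad \tau_k \cdot (T(p, v) n) + f v \cdot \tau_k = 0, \quad k = 1, 2, \\
&\frac{\partial\rho}{\partial n} = 0.
\end{aligned} \tag{2.6} \]
The key element in the limit passage from the approximative problem to the original one is the energy estimate giving information independent of $\varepsilon$ and of the choice of function $K$, i.e. of the choice of the positive constant $k$, see (2.1):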
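Lemma 1. Suppose solutions to (2.1)–(2.6) to be sufficiently smooth, i.e. $\rho$, $v$ and $\theta \in W_q^2(\Omega)$ for any $q < \infty$, $\theta > 0$ in $\Omega$. Let assumptions of Theorem 1 be satisfied. Then
\[ \int_\Omega \rho\,dx \le M \quad\text{and}\quad 0 \le \rho \le k, \]
\[ \|v\|_{H^1(\Omega)} + \|\rho K(\rho)\|_{L^{2\gamma}(\Omega)} + \|P(\rho, \theta)\|_{L^2(\Omega)} + \|\theta\|_{L^{3m}(\Omega)} + \|\nabla\theta\|_{L^r(\Omega)} + \int_{\partial\Omega} (e^s + e^{-s})\,d\sigma + \|\nabla s\|_{L^2(\Omega)} \le C(\|F\|_{L^\infty(\Omega)}, M), \tag{2.7} \]
where the r.h.s. of (2.7) is independent of $\varepsilon$ and $k$, $s = \ln\theta$ and $r = \min\{2, \frac{3m}{m+1}\}$.
Proof. The nonnegativeness of the density and boundedness by $k$ follow directly from features of function $K$ and the form of (2.2)$_1$; it suffices to integrate the equation over sets $\{x \in \Omega; \rho(x) < 0\}$ and $\{x \in \Omega; \rho(x) > k\}$, respectively. The integration of this equation over $\Omega$ leads to the bound on the total mass. For details we refer to [11].
Let us prove the second part of (2.7), which is definitely more complicated. We divide our chain of estimates into eight steps to underline the main parts of our method.
Step I. Multiply the approximative momentum equation (2.2)$_2$ by $v$ and integrate it over $\Omega$:
\[ \int_\Omega \big(2\mu D^2(v) + \lambda\operatorname{div}^2 v\big)\,dx + f\int_{\partial\Omega} |v_\tau|^2\,d\sigma + \int_\Omega v \cdot \nabla P_b(\rho)\,dx = \int_\Omega \rho K(\rho) v \cdot F\,dx + \int_\Omega \Big(\int_0^\rho K(t)\,dt\Big)\theta\operatorname{div} v\,dx. \tag{2.8} \]
To find a good form of the last term of the l.h.s. of (2.8) we use the approximative continuity equation (2.2)$_1$,
\begin{align*}
\int_\Omega v \cdot \nabla P_b(\rho)\,dx &= \frac{\gamma}{\gamma-1}\int_\Omega K(\rho)\rho v \cdot \nabla\rho^{\gamma-1}\,dx \\
&= -\frac{\gamma}{\gamma-1}\int_\Omega \big(\varepsilon\Delta\rho + \varepsilon h K(\rho) - \varepsilon\rho\big)\rho^{\gamma-1}\,dx \\
&= \frac{\varepsilon\gamma}{\gamma-1}\int_\Omega [\rho - h K(\rho)]\rho^{\gamma-1}\,dx + \varepsilon\gamma\int_\Omega \rho^{\gamma-2}|\nabla\rho|^2\,dx.
\end{align*}
Thus the momentum equation gives the following inequality:
\[ \int_\Omega S(v) : \nabla v\,dx + f\int_{\partial\Omega} |v_\tau|^2\,d\sigma + \varepsilon\gamma\int_\Omega \rho^{\gamma-2}|\nabla\rho|^2\,dx + \frac{\varepsilon\gamma}{\gamma-1}\int_\Omega \rho^\gamma\,dx - \int_\Omega \Big(\int_0^\rho K(t)\,dt\Big)\theta\operatorname{div} v\,dx \le C\Big(1 + \int_\Omega |\rho K(\rho) v \cdot F|\,dx\Big). \tag{2.9} \]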
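Step II. Integrating the energy equation (2.2)$_3$ and employing the boundary condition (2.6)$_1$ we get
\[ \int_{\partial\Omega} \big(L(\theta)(\theta - \theta_0) + \varepsilon s\big)\,d\sigma = \int_\Omega \Big(S(v) : \nabla v - \Big(\int_0^\rho K(t)\,dt\Big)\theta\operatorname{div} v\Big)\,dx, \tag{2.10} \]
since the integration by parts gives the following identity:
\[ \int_\Omega \Big[K(\rho)\rho v \cdot \nabla\theta - \theta K(\rho) v \cdot \nabla\rho + \operatorname{div}\Big(v\int_0^\rho K(t)\,dt\Big)\theta + \operatorname{div}(K(\rho)\rho v)\,\theta\Big]\,dx = \int_\Omega \Big(\int_0^\rho K(t)\,dt\Big)\theta\operatorname{div} v\,dx. \]
Summing up (2.9) and (2.10) we get
\[ \int_{\partial\Omega} \big(L(\theta)\theta + \varepsilon s^+\big)\,d\sigma + \varepsilon\gamma\int_\Omega \rho^{\gamma-2}|\nabla\rho|^2\,dx + \frac{\varepsilon\gamma}{\gamma-1}\int_\Omega \rho^\gamma\,dx \le \varepsilon\int_{\partial\Omega} s^-\,d\sigma + C\Big(1 + \int_\Omega |\rho K(\rho) v \cdot F|\,dx\Big), \tag{2.11} \]
where $s^+$ and $s^-$ are the positive and negative parts of the entropy, respectively ($s = s^+ - s^-$). Note that the form of $L(\cdot)$ implies that the first term of (2.11) gives an estimate on $\int_{\partial\Omega} e^s\,d\sigma$ independently of $\varepsilon$. We shall concentrate our attention on this term, since it controls the positive part of the entropy $s$ at the boundary.
Step III. We integrate the entropy equation (2.4) over $\Omega$, getting
\[ \int_{\partial\Omega} \Big(\frac{L(\theta)(\theta - \theta_0)}{\theta} + \varepsilon s e^{-s}\Big)\,d\sigma + \int_\Omega \Big(K(\rho)\rho\frac{v \cdot \nabla\theta}{\theta} - K(\rho) v \cdot \nabla\rho\Big)\,dx = \int_\Omega \Big(\frac{S(v) : \nabla v}{\theta} + \frac{(1 + \theta^m)(\varepsilon + \theta)}{\theta}|\nabla s|^2\Big)\,dx. \tag{2.12} \]
So
\[ \int_\Omega \Big(\frac{S(v) : \nabla v}{\theta} + \frac{(1 + \theta^m)(\varepsilon + \theta)}{\theta}|\nabla s|^2\Big)\,dx + \int_{\partial\Omega} \Big(\frac{L(\theta)\theta_0}{\theta} + \varepsilon|s^-|e^{-s}\Big)\,d\sigma - \int_\Omega K(\rho)\rho v \cdot \nabla(s - \ln\rho)\,dx \le \int_{\partial\Omega} L(\theta)\,d\sigma + \varepsilon\int_{\partial\Omega} s^+ e^{-s}\,d\sigma. \tag{2.13} \]
Here we emphasize that the first term in the second integral in the l.h.s. of (2.13) gives us a bound on $\int_{\partial\Omega} e^{-s}\,d\sigma$, because of the properties of $L(\cdot)$ and $\theta_0 \ge \theta_* > 0$. Hence we control the negative part of the entropy $s$.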
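Let us look closer at the last term in the l.h.s. of (2.13). We have
\[ -\int_\Omega K(\rho)\rho v \cdot \nabla(s - \ln\rho)\,dx = \int_\Omega K(\rho)\rho v \cdot \nabla\ln\rho\,dx - \int_\Omega K(\rho)\rho v \cdot \nabla s\,dx = I_1 + I_2 \tag{2.14} \]
and employing (2.2)$_1$ we get
\begin{align*}
I_1 = \int_\Omega K(\rho)\rho v \cdot \nabla\ln\rho\,dx &= -\int_\Omega \operatorname{div}(K(\rho)\rho v)\ln\rho\,dx = \int_\Omega \big(-\varepsilon\Delta\rho + \varepsilon\rho - \varepsilon h K(\rho)\big)\ln\rho\,dx \\
&= \varepsilon\int_\Omega \Big(\frac{|\nabla\rho|^2}{\rho} - h K(\rho)\ln\rho + \rho\ln\rho\Big)\,dx. \tag{2.15}
\end{align*}
The first term has a good sign (we shall keep this term in mind), the second term has a good sign for $\rho \le 1$, too, and for $\rho \ge 1$ is easily bounded by $\varepsilon h\rho$. Similarly, the last term can be controlled by the term $\varepsilon(1 + \int_\Omega \rho^\gamma\,dx)$. The proof was rather formal, as we do not know whether $\rho > 0$ in $\Omega$. However, we may write $K(\rho) v \cdot \nabla(\rho + \delta)$ in (2.12) with $\delta > 0$ and find an analogue of (2.15) with $\ln(\rho + \delta)$. Finally we pass with $\delta \to 0^+$ and get precisely the same information as above. Next
\[ I_2 = -\int_\Omega K(\rho)\rho v \cdot \nabla s\,dx = \int_\Omega \big(\varepsilon\Delta\rho - \varepsilon\rho + \varepsilon h K(\rho)\big) s\,dx = \varepsilon\int_\Omega \big(-\nabla\rho \cdot \nabla s - \rho\ln\theta + h K(\rho)\ln\theta\big)\,dx. \tag{2.16} \]
Considering the r.h.s. of (2.16), we have
\[ \varepsilon\Big|\int_\Omega \nabla\rho \cdot \nabla s\,dx\Big| \le \varepsilon\|\nabla\rho\|_{L^2(\Omega)}\|\nabla s\|_{L^2(\Omega)} \le \frac{\varepsilon}{4}\Big(\int_\Omega \frac{|\nabla\rho|^2}{\rho}\,dx + \int_\Omega |\nabla\rho|^2\rho^{\gamma-2}\,dx\Big) + \frac14\|\nabla s\|_{L^2(\Omega)}^2. \tag{2.17} \]
Moreover, $-\varepsilon\int_\Omega \rho\ln\theta\,dx$ has a good sign for $\theta \le 1$ and for $\theta > 1$,
\[ \varepsilon\int_\Omega \rho(\ln\theta)^+\,dx \le \varepsilon\|\rho\|_{L^2(\Omega)}\|s^+\|_{L^2(\Omega)} \le \frac{\varepsilon}{4}\|s^+\|_{L^1(\partial\Omega)} + \frac{\varepsilon}{4}\|\nabla s\|_{L^2(\Omega)} + \varepsilon\|\rho^\gamma\|_{L^1(\Omega)} + C. \tag{2.18} \]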
The last term of (2.16) can be treated as follows (one part has again a good sign, so we consider only θ ∈ (0, 1], i.e. s ≤ 0): 1 1 h K ()|(ln θ )− |d x ≤ C |s − |d x ≤ C + |s − |dσ + ||∇s|| L 2 () . (2.19) 2 4
∂
Here we applied a Poincaré type inequality yielding
u L 1 () ≤ c()( u L 1 (∂) + ∇u L 2 () ).
(2.20)
Then combining (2.13) with inequality (2.11) and with (2.15)–(2.19) we obtain S(v) : ∇v 1 + θ m L(θ )θ0 2 L(θ )θ + |∇θ | d x + + + |s| dσ ≤ H, (2.21) θ θ2 θ
∂
where
⎛ H = C ⎝1 +
⎞ |K ()v · F|d x ⎠ .
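The form of the l.h.s. of (2.21) implies that we control also $\|s\|_{L_1(\partial\Omega)}$ and $\|\nabla s\|_{L_2(\Omega)}$; evidently $\|s\|_{L_1(\partial\Omega)}$ is controlled by $\int_{\partial\Omega}(e^{s} + e^{-s})\,d\sigma$ and $\int_\Omega \frac{|\nabla\theta|^2}{\theta^2}\,dx = \int_\Omega |\nabla s|^2\,dx$, which are estimated by the l.h.s. of (2.21).
Step IV. From the growth conditions and (2.21) we deduce the following "homogeneous" estimates:
\[ C\Big(\int_{\partial\Omega}\theta^{l+1}\,d\sigma\Big)^{1/(l+1)} \le \Big(\int_{\partial\Omega} L(\theta)\theta\,d\sigma\Big)^{1/(l+1)} \le H^{1/(l+1)}, \]
\[ C\Big(\int_\Omega |\nabla\theta^{m/2}|^2\,dx\Big)^{1/m} \le \Big(\int_\Omega \frac{1+\theta^m}{\theta^2}|\nabla\theta|^2\,dx\Big)^{1/m} \le H^{1/m}. \]
We use the following Poincaré type inequality (analogous to (2.20)):
\[ \Big(\int_\Omega |\theta^{m/2}|^2\,dx\Big)^{1/m} \le C(\Omega)\Big(\Big(\int_\Omega |\nabla\theta^{m/2}|^2\,dx\Big)^{1/m} + \Big(\int_{\partial\Omega}\theta^{l+1}\,d\sigma\Big)^{1/(l+1)}\Big). \]
Then the imbedding theorem $W^1_2(\Omega) \hookrightarrow L_6(\Omega)$ (for N = 3) applied to the function $\theta^{m/2}$ leads to the bound
\[ \Big(\int_\Omega \theta^{3m}\,dx\Big)^{1/(3m)} \le H^{1/m} + H^{1/(l+1)}. \tag{2.22} \]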
To simplify further calculations, we set l + 1 = m. Note that we may also allow different values of l, however, at the price that the further calculations become more technical, which we try to avoid.
On the Steady Compressible Navier–Stokes–Fourier System
Step V. We return to (2.9). Hölder’s inequality yields2 γ ||v||2H 1 () + γ γ −2 |∇|2 d x + γ d x γ −1 ⎛ ⎞ ≤ C ⎝1 + |K ()v · F|d x + |θ K (t)dt|2 d x ⎠ .
(2.23)
0
The next step of our estimation is the bound on Pb () which is necessary to estimate the r.h.s. of (2.23). We just repeat the method for the barotropic case, but here we shall obtain an extra term related to the temperature. Introduce : → R3 defined as a solution to the following problem: 1 div = Pb () − {Pb ()} in , with {Pb ()} = Pb ()d x. (2.24) =0 at ∂, ||
The basic theory to the stationary Stokes system gives the existence of a vector field satisfying (2.24) with the following estimate for a solution to (2.24) (for another possible proof, using directly estimates of special solutions to system (2.24), see [16]) |||| H 1 () ≤ C||Pb || L 2 () . (2.25) 0 From the structure of Pb () and information that d x ≤ M we easily get, applying the interpolation inequality, {Pb ()} ≤ δ||Pb ()|| L 2 () + C(δ, M)
for any δ > 0.
Multiplying the momentum equation (2.2)2 by , employing (2.23) and (2.25), we conclude after standard estimates of the r.h.s to (2.2)2 , ⎛ ⎞ (2.26) ||Pb ()||2L 2 () ≤ C ⎝1 + |K ()v ⊗ v|2 d x + |θ K (t)dt|2 d x ⎠ .
As ||Pb ()||2L 2 ()
0
⎛ ⎛ ⎞2γ ⎞ ⎜ ⎟ ≥ C ⎝ (K ())2γ d x + ⎝ K (t)dt ⎠ d x ⎠ ,
(2.27)
0
recalling that 2γ > 6, we get a bound for the first integral in the r.h.s. of (2.26), |K ()v ⊗ v|2 d x ≤ c||v||4H 1 () ||K ()||2L 6 () 2(γ −3)
10γ
−1) −1) ≤ c||v||4H 1 () ||K ()|| L3(2γ ||K ()|| L3(2γ 1 () 2γ ()
(2.28)
6(2γ −1)
−4 ≤ δ||Pb ()||2L 2 () + C(δ, M)||v|| H3γ1 () .
2 Note that we used Korn’s inequality; for f = 0 we therefore require that is not axially symmetric, for more details see [16].
Hence a suitable choice of δ in (2.28) simplifies (2.26) to ⎛ ⎞ 6(2γ −1) −4 + |θ K (t)dt|2 d x ⎠ . ||Pb ()||2L 2 () ≤ C ⎝1 + ||v|| H3γ1 ()
(2.29)
0
The last estimate can be viewed by (2.27) in the form ||
K (t)dt|| L 2γ () + ||K ()|| L 2γ () 0
⎛
⎜ ≤ C ⎝1 + ||v||
3 2γ −1 γ 3γ −4 H 1 ()
⎛ +⎝
|θ
⎞ 2γ1 ⎞ ⎟ K (t)dt|2 d x ⎠ ⎠ .
(2.30)
0
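Within our estimation we concentrate on a precise specification of powers of norms. Then, due to our growth conditions we shall be able to construct the desired bound (2.7).
Step VI. The last integral in (2.30) can be treated as follows (we need $m > \frac{2}{3}$ and $m > \frac{2\gamma}{3(\gamma-1)}$):
\[
\begin{aligned}
\Big\|\theta\int_0^{\rho} K(t)\,dt\Big\|^{1/\gamma}_{L_2(\Omega)}
&\le \|\theta\|^{1/\gamma}_{L_{3m}(\Omega)} \Big\|\int_0^{\rho} K(t)\,dt\Big\|^{1/\gamma}_{L_{\frac{6m}{3m-2}}(\Omega)} \\
&\le \|\theta\|^{\frac{1}{\gamma}}_{L_{3m}(\Omega)} \Big\|\int_0^{\rho} K(t)\,dt\Big\|^{\frac{(3m-2)\gamma-3m}{3m\gamma(2\gamma-1)}}_{L_1(\Omega)} \Big\|\int_0^{\rho} K(t)\,dt\Big\|^{\frac{3m+2}{3m(2\gamma-1)}}_{L_{2\gamma}(\Omega)},
\end{aligned}
\tag{2.31}
\]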
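The exponents in (2.31) come from interpolating the $L_{6m/(3m-2)}$ norm between $L_1$ and $L_{2\gamma}$. The following is a minimal sketch that recomputes them with exact rational arithmetic; the helper names `interpolation_weight` and `exponents_2_31` and the sample values m = 5, γ = 4 are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def interpolation_weight(p, p0, p1):
    """Return a such that 1/p = a/p0 + (1 - a)/p1 (Lebesgue interpolation)."""
    inv = lambda x: Fraction(1) / Fraction(x)
    return (inv(p) - inv(p1)) / (inv(p0) - inv(p1))


def exponents_2_31(m, gamma):
    """Powers of the L_1 and L_{2*gamma} norms in (2.31), as reconstructed here.

    Hypothetical helper: interpolates L_{6m/(3m-2)} between L_1 and
    L_{2*gamma}, then applies the overall power 1/gamma.
    """
    m, gamma = Fraction(m), Fraction(gamma)
    p = 6 * m / (3 * m - 2)
    a = interpolation_weight(p, 1, 2 * gamma)
    return a / gamma, (1 - a) / gamma


# Sample values chosen only for illustration.
m, gamma = Fraction(5), Fraction(4)
power_L1, power_L2gamma = exponents_2_31(m, gamma)
print(power_L1, power_L2gamma)
```

For admissible m and γ the two returned values should agree with the closed forms $\frac{(3m-2)\gamma-3m}{3m\gamma(2\gamma-1)}$ and $\frac{3m+2}{3m(2\gamma-1)}$ displayed in (2.31).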
so (2.30) and (2.31) with the Young inequality imply ||
K (t)dt|| L 2γ () + ||K ()|| L 2γ ()
2γ −1 3m 3 2γ −1 γ 6m(γ −1)−2 γ 3γ −4 . ≤ C 1 + ||v|| H 1 () + ||θ || L 3m ()
0
Applying the inequality for the temperature — (2.22) — we obtain (recall that we put l + 1 = m) ||
3 2γ −1 2γ −1 3 −4 K (t)dt|| L 2γ () + ||K ()|| L 2γ () ≤ C 1 + ||v|| Hγ 13γ() + H γ 6m(γ −1)−2 .
(2.32)
0
We have to estimate H ; it holds |K ()v · F|d x ≤ ||v|| L 6 () ||K ()|| L 6/5 () ||F|| L ∞ () .
Using the interpolation between 1 and 2γ as above leads to the following bound: γ −1) |K ()v · F|d x ≤ C(M)||v|| H 1 () ||K ()|| L3(2γ . (2.33) 2γ ()
Inserting this inequality to the r.h.s. of (2.32), recalling that m ≥ 14 and applying the standard Hölder inequality we obtain from (2.32) the estimate on the density K (t)dt|| L 2γ () + ||K ()|| L 2γ ()
||
2γ −1 1 3 2γ −1 γ 2m(γ −1)−1 γ 3γ −4 . (2.34) ≤ C 1 + ||v|| H 1 () + ||v|| H 1 ()
0
Step VII. As we can see later, the first term is the most restrictive. So by (2.33) and (2.34) −1 we conclude (for m > 3γ 6γ −6 )
3γ −3 . |K ()v · F|d x ≤ C 1 + ||v|| H3γ1−4 ()
(2.35)
Hence we obtain from (2.22), 1 3γ −3 −4 ||θ || L 3m () ≤ C 1 + ||v|| Hm 13γ() .
(2.36)
From (2.31) we easily see that ||θ
K (t)dt|| L 2 () ≤ C||θ || L 3m () ||
0
3m+2
γ
2γ −1 K (t)dt|| L3m . 2γ ()
(2.37)
0
Step VIII. Summing up inequalities (2.23), (2.35) and (2.37) we obtain the main bound on the norm of the velocity
\[ \|v\|^2_{H^1(\Omega)} \le C\Big(1 + \|v\|_{H^1(\Omega)}^{\frac{3\gamma-3}{3\gamma-4}} + \|v\|_{H^1(\Omega)}^{\frac{2}{m}\frac{3\gamma-3}{3\gamma-4} + \frac{2}{m}\frac{3m+2}{3\gamma-4}}\Big). \tag{2.38} \]
The above bound implies the a priori bound
\[ \|v\|_{H^1(\Omega)} \le C(\|F\|_{L_\infty}, M), \tag{2.39} \]
provided a suitable dependence between γ and m holds. The estimate (2.39) holds as the powers in the r.h.s. of (2.38) are less than 2. It can be described by the sufficient condition (γ > 3)
\[ m > \frac{3\gamma-1}{3\gamma-7}. \tag{2.40} \]
Note that for γ close to 3 the condition approaches m > 4, and for γ = 4 it reads $m > \frac{11}{5}$. Moreover, the conditions needed above, $m > \frac{3\gamma-1}{6\gamma-6}$, $m > \frac{2}{3}$ and $m > \frac{2\gamma}{3(\gamma-1)}$, are clearly less restrictive than (2.40). Bound (2.39) implies immediately the a priori estimate (2.7), since it follows from (2.21) with (1.11), (2.29), (2.34)–(2.37), together with (3.7) necessary in the next section.
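Condition (2.40) says exactly that the largest power of $\|v\|_{H^1(\Omega)}$ on the r.h.s. of (2.38) stays below 2. The sketch below checks this with exact fractions; the function names are hypothetical, and the formula for the largest power is the one obtained by combining (2.34), (2.36) and (2.37), so it should be read as an assumption about the intended exponent rather than a quotation.

```python
from fractions import Fraction


def largest_power(m, gamma):
    """Largest power of ||v||_{H^1} on the r.h.s. of (2.38), as reconstructed:
    (2/m) * ((3*gamma - 3) + (3*m + 2)) / (3*gamma - 4)."""
    m, gamma = Fraction(m), Fraction(gamma)
    return (Fraction(2) / m) * ((3 * gamma - 3) + (3 * m + 2)) / (3 * gamma - 4)


def threshold(gamma):
    """Right-hand side of (2.40): (3*gamma - 1) / (3*gamma - 7), for gamma > 7/3."""
    gamma = Fraction(gamma)
    return (3 * gamma - 1) / (3 * gamma - 7)


# Illustrative values: gamma = 4 gives the threshold 11/5 quoted in the text.
gamma = 4
print(threshold(gamma))          # threshold for m
print(largest_power(3, gamma))   # m above the threshold
print(largest_power(2, gamma))   # m below the threshold
```

At the threshold itself the power equals 2, which is why (2.40) is stated as a strict inequality.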
3. Existence for the Approximative System
The aim of this section is to show that for any ε > 0 and k > 0 there is a solution to the approximative system (2.2)–(2.6). In particular we ensure the positiveness of the temperature. We prove
Theorem 2. Let the assumptions of Theorem 1 be satisfied. Moreover, let ε > 0 and k > 0. Then there exists a strong solution (ρ, v, s) to (2.2)–(2.6) such that $\rho \in W^2_p(\Omega)$, $v \in W^2_p(\Omega)$ and $s \in W^2_p(\Omega)$ for all 1 ≤ p < ∞. Moreover 0 ≤ ρ ≤ k in Ω, $\int_\Omega \rho\,dx \le M$ and
\[ \|v\|_{W^1_{3m}(\Omega)} + \sqrt{\varepsilon}\,\|\nabla\rho\|_{L_2(\Omega)} + \|\nabla\theta\|_{L_r(\Omega)} + \|\theta\|_{L_{3m}(\Omega)} \le C(k), \tag{3.1} \]
where $\theta = e^{s} > 0$, $r = \min\{2, \frac{3m}{m+1}\}$ and the r.h.s. of (3.1) is independent of the parameter ε.
The proof of the existence to the approximative system (2.2) will follow from the standard application of the Leray-Schauder fixed point theorem. It will be split into several lemmas. First we consider the continuity equation. We denote for p ∈ [1, ∞], M p = {w ∈ W p2 (); w · n = 0 at ∂}. We have Lemma 2. Let q > 3. Then the operator S : Mq → W p2 ()
for 1 < p < ∞
such that S(v) = , where is the solution to the following problem: − = h K () − div(K ()v) in , ∂ = 0 at ∂ ∂n
(3.2)
is a well defined continuous compact operator from Mq to W p2 (), 1 < p < ∞. In particular, the solution to (3.2) is unique. Moreover,
W p1 () ≤ C(k, )( v L p () + 1) 1 < p < ∞, ⎧ ⎨ C(k, ) 1 + v 1 (1 + v L () ) 1 < p < 3, W p () 3
W p2 () ≤ 2 ⎩ C(k, )(1 + v W 1 () ) 3 ≤ p < ∞.
(3.3)
p
Proof. The well posedness of the operator S was proved in [16] for K ≡ 1, see also [11], Prop. 3.1 (there the two dimensional case with our function K was considered). The estimates are the direct consequence of the standard elliptic theory, together with the fact that L ∞ () ≤ k.
Next, we define the operator T : M p × W p2 () → M p × W p2 () such that T (v, s) = (w, z), where (w, z) is the solution to the following system: ⎫ 1 1 −div S(w) = − div(K ()v ⊗ v )− K ()v · ∇ v −∇ P(, es )+ K () F ⎪ ⎪ ⎪ 2 2 ⎛ ⎞ ⎪ ⎪ ⎬ ms s s −div (1+e )( + e )∇z = S(v ) : ∇ v −div ⎝v K (t)dt⎠ e ⎪ in, ⎪ (3.4) ⎪ ⎪ ⎪ 0 ⎭ s s s −div (K ()v) e − e K ()v · ∇s + e K()v · ∇ w · n = 0, n · S(w) · τ l + f w · τ l = 0 for l = 1, 2 at ∂, (1 + ems )( + es )∇z + z = −L(es )(es − θ0 )
where = S(w) is given by Lemma 2. The above procedure guarantees us that the temperature obtained in this way, θ = e z , will be strictly positive for fixed > 0. Our aim is to apply the Leray–Schauder fixed point theorem. Thus we need to verify that T is a continuous and compact mapping from M p × W p2 () to M p × W p2 () and that all solutions satisfying tT (w, z) = (w, z),
t ∈ [0, 1]
are bounded in M p × W p2 ().
(3.5)
First we easily have Lemma 3. Let p > 3 and all assumptions of Theorem 2 be satisfied. Then T is a continuous and compact operator from M p × W p2 () to M p × W p2 (). Proof. Note that for > 0 the system (3.4) is strictly elliptic. Since p > 3, the W p1 ()–space is algebra, thus the r.h.s. of (3.4) belongs to the L p –space (the boundary term 1−1/ p belongs to W p (∂)). The coefficients in the operator in the l.h.s. of (3.4)2 are of the C 1+α ()–class. Hence the standard theory for elliptic systems gives us the existence of the solution to (3.4) in M p × W p2 () with the following bound: ||w||W p2 () + ||z||W p2 () ≤ C(||es ||C 1+α () ) ||the r.h.s. of (3.4)1 || L p () + ||the r.h.s. of (3.4)2 || L p () + ||the r.h.s. of (3.4)4 ||W 1−1/ p (∂) p
which guarantees us the uniqueness and the continuous dependence on the data. Moreover, the r.h.s. of (3.4) is at most of the first order derivative of sought functions. Thus this structure implies the compactness for the map T . Next we consider a priori bounds for solutions to (3.5). Lemma 4. All solutions to problem (3.5) in the class M p × W p2 () satisfy the following bounds: √ 0 ≤ ≤ k, ||w|| H 1 () + ||θ || L 3m () + ||∇θ || L r () + ε ∇ L 2 () ≤ C(k), (3.6) 3m where r = min{ m+1 , 2}, θ = e z and the constant C(k) is independent of and t ∈ [0, 1].
Proof. We may basically repeat estimates of Lemma 1 from the previous section. Here we may use that ≤ k in , on the other hand we must control the behaviour of all norms with respect to t. Thus, repeating steps (2.8)–(2.13) for the case t = 1 (the corresponding terms are only multiplied by t) we finally get (1 + θ m )( + θ ) 2 (1 − t) S(w) : ∇wd x + f (w τ ) dσ + |∇z|2 d x θ ∂ S(w) : ∇w γ γ +t + γ γ −2 |∇|2 + dx θ γ −1 ! L(θ )θ0 L(θ )θ − L(θ )θ0 + + z + (1 − e−z + ) + |z − |(e|z − | −1) dσ +t − L(θ ) dσ θ ∂ ∂ ⎛ ⎞ ≤ t (K ()w · ∇z − K ()w · ∇) d x + tC ⎝1 + |K ()w · F|d x ⎠ ,
where = S(w). We may now repeat the arguments between (2.14)–(2.21) (all the corresponding terms are only multiplied by t) and we finally get L(θ )θ0 1 + θm S(w) : ∇w 2 t L(θ )θ + t d x + + |z| dσ |∇θ | d x + t θ2 θ θ ∂ ⎛ ⎞ ≤ tC ⎝1 + |K ()w · F|d x ⎠ .
As 0 ≤ ≤ k, we easily get (the Poincaré inequality is just the same as in the previous section), after dividing by t (the case t = 0 is clear; recall also m = l + 1) ||θ || L 3m () ≤ C(1 + ||w|| L 2 () )1/m , and from an analogue to (2.23) also ||w||2H 1 () ≤ C(1 + ||θ ||2L 2 () ). As m > 1, it implies ||w|| H 1 () + ||θ || L 3m () ≤ C(k). Further, if m ≥ 2 then due to the control of |∇θ| θ and |∇θ |θ ∇θ bounded in the same space. For 1 < m < 2, we only get
m−2 2
∇θ L
3m () m+1
≤ |∇θ |θ
m−2 2
in L 2 () we have also
2−m
2
L 2 () θ L 3m () ≤ C(k).
(3.7)
Finally, multiplying the approximative continuity equation by and integrating by parts we get ⎛ ⎞ (|∇|2 + 2 )d x ≤ h K ()d x + ⎝ K (t)tdt ⎠ | div w|d x,
from where we deduce the bound for
√
∇ L 2 () .
0
We continue the proof of Theorem 2. To conclude, we verify the bound on (w, z) in W p2 () × W p2 (), p < ∞, independently of t. We apply the bootstrap method to system ⎫ 1 1 ⎪ ⎪ − div S(w) = t − div(K ()w ⊗ w) − K ()w · ∇w ⎪ ⎪ 2 ⎪ 2 ⎪ ⎪ ⎪ z ⎪ ⎪ −∇ P(, e ) + K () F ⎪ ⎪ ⎡ ⎛ ⎞ ⎪ ⎪ ⎬ in , mz z z − div (1 + e )( + e )∇z = t ⎣S(w) : ∇w − div ⎝w K (t)dt ⎠ e ⎪ ⎪ ⎪ (3.8) ⎪ ⎪ 0 ⎪ ⎤ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ − div (K ()w) ez − ez K ()w · ∇z + ez K ()w · ∇⎦ ⎪ ⎪ ⎭ w · n = 0, n · S(w) · τ l + f w · τ l = 0 for l = 1, 2 (1 + emz )( + ez )∇z + z = −t L(ez )(ez − θ0 )
at ∂,
where = S(w) given by Lemma 2. Note first that due to bounds from Lemma 4 we have
w W 1 () ≤ C 3
as K ()w ⊗ w is bounded in L 3 (). Thus w is bounded in any L q (), q < ∞ and the most restrictive term is ∇ P(, ez ). As ez = θ is bounded in L 3m (), in L ∞ (), we deduce the bound
w W 1
3m ()
≤C
and consequently also W 2
3m ()
≤ C.
Note that the constant in the estimate for w is independent of . Next, we rewrite Eq. (3.8)2 as follows: ⎡ − (z) = t ⎣S(w) : ∇w − ez K ()w · ∇z + ez K ()w · ∇ ⎛ − div ⎝w
⎞
⎤
K (t)dt ⎠ ez − div (K ()w) ez ⎦ in ,
(3.9)
0
∂(z) = −z − t L(ez )(ez − θ0 ) at ∂ ∂n with z (z) =
(1 + emτ )( + eτ )dτ.
(3.10)
0
We multiply (3.9)1 by and integrate over . It leads to ||∇||2L 2 () + t L(ez )(ez −θ0 )+z dσ ≤ C||the r.h.s. of (3.9)1 || L 6/5 () |||| L 6 () . ∂
It is not difficult to realize that the most restrictive term on the r.h.s is ez K ()w · 3m ∇z ∈ L 3m (), where m+1 > 65 for m > 1. m+1 Let us look at the boundary terms. Note that (s) ∼ s for s → −∞ and (s) ∼ e(m+1)s for s → +∞. Thus t L(es )(es − θ0 ) + s I{≤0} dσ ≥ C1 2 ||I{≤0} ||2L 2 (∂) − C2 ∂
and
t L(es )(es − θ0 ) + s I{≥0} dσ ≥ C1 ||I{≥0} || L 1 (∂) − C2 .
∂
Thus, the estimates above yield W 1 () ≤ C with C independent of t which implies 2
θ
m+1
L 6 () = e
(m+1)z
L 6 () ≤ C and also ∇θ L 2 () = ez ∇z L 2 () ≤ C.
Now, it is not difficult to verify that from (3.9) we get W 2∗ () ≤ C with p
z p ∗ = min{ 3m 2 , 2} (as e ∇z ∈ L 2 () and ∇w ∈ L 3m ()). In particular,
z L ∞ () + θ L ∞ () ≤ C,
∇z L q () + ∇θ L q () ≤ C
3 p∗ 3− p ∗
> 3. Thus from the approximative momentum equation we for 1 ≤ q ≤ q ∗ = get (recall ∇(θ ) ∈ L q ∗ ()) the bound w W 2∗ () ≤ C and from the energy/entropy q
equation also
z W 2∗ () + θ W 2∗ () ≤ C. q
q
The imbedding theorem yields ∇z L ∞ () + ∇θ L ∞ () ≤ C which finally gives as above
Wr2 () + w Wr2 () + z Wr2 () + θ Wr2 () ≤ C, 1 ≤ r < ∞ with C independent of t. This finishes the proof of Theorem 2. 4. Effective Viscous Flux In this part we investigate the properties of the effective viscous flux. Estimates (3.1) from Theorem 2 guarantee us existence of a subsequence → 0+ such that 1 (), v v in W3m v → v in L ∞ (), ∗ Pb ( ) ∗ Pb () in L ∞ (), in L ∞ (), ∗ K ( ) ∗ K () in L ∞ (), K ( ) K () in L ∞ (),
K (t)dt 0
∗
K (t)dt
in L ∞ (),
(4.1)
0
3m }, θ θ in Wr1 () with r = min{2, m+1 θ → θ in L q () for q < 3m.
Here we follow the notation that a weak limit of a sequence {A(a )} is denoted by A(a) (for a fixed subsequence → 0+ ).
Passing to the limit in the weak formulation of our problem (2.2) we get div(K ()v) = 0, ⎛
(4.2) ⎛
⎜ ⎜ K ()v · ∇ v − div ⎝2µ D(v ) + ν(div v ) I − Pb () I − θ ⎝
⎞⎞ ⎟⎟ K (t)dt ⎠ I⎠ = K () F ,
0
(4.3) ⎛ ⎜ − div((1 + θ m )∇θ ) + θ ⎝ div v
⎞ ⎟ K (t)dt ⎠ + div(K ()θ v ) = 2µ|D(v )|2 + ν(div v )2 ,
0
(4.4)
together with the boundary conditions (1.9)–(1.10). Recall that (4.2)–(4.4) is satisfied in the weak sense, similar to Definition 1. In what follows we must carefully study the dependence of the a priori bounds on k. We have
Lemma 5. Under the assumptions of Theorems 1 and 2, we have
\[ \|\rho_\varepsilon\|_{L_\infty(\Omega)} \le k \quad\text{and}\quad \|v_\varepsilon\|_{W^1_{3m}(\Omega)} \le C\big(1 + k^{\frac{\gamma}{3}\frac{3m-2}{m}}\big). \tag{4.5} \]
Proof. The bound on the density follows directly from Theorem 2. We therefore estimate the velocity. If we write (2.2)2 in the form ⎛ ⎞⎞ ⎛ − div S(v) = −∇ ⎝ Pb ( ) + θ ⎝ K (t)dt ⎠⎠ + K ( ) F 0
1 1 − div[K ( ) v ⊗ v ] − K ( ) v · ∇v , 2 2 we immediately see that
v W 1 () ≤ C K ( ) v ⊗ v L 3m () + K ( ) v · ∇v L 3m () 3m m+1 ⎞ ⎛ ⎞ + Pb ( ) L 3m () + θ ⎝ K (t)dt ⎠ L 3m () + K ( ) F L 3m () ⎠ . m+1
0
Note that due to the bound of the temperature we cannot expect an –independent estimate for q > 3m. The bounds on the density and temperature yield 2
3m−2
≤ Ck γ
Pb ( ) L 3m () ≤ Pb ( ) L3m2 () Pb ( ) L 3m ∞ () while
⎛
θ ⎝
0
⎞ K (t)dt ⎠ L 3m () ≤ Ck.
3m−2 3m
,
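Note that for m and γ satisfying assumptions of Theorem 1, $\gamma\frac{3m-2}{3m} > 1$. It remains to estimate the convective terms (C.T.)
\[
\begin{aligned}
\mathrm{C.T.} &\le \big\|K(\rho_\varepsilon)\rho_\varepsilon |v_\varepsilon|^2\big\|_{L_{3m}(\Omega)} + \big\|K(\rho_\varepsilon)\rho_\varepsilon |v_\varepsilon||\nabla v_\varepsilon|\big\|_{L_{\frac{3m}{m+1}}(\Omega)} \\
&\le C\|\rho_\varepsilon\|_{L_\infty(\Omega)}\Big(\|v_\varepsilon\|^2_{L_{6m}(\Omega)} + \|\nabla v_\varepsilon\|_{L_{\frac{3m}{m+1}}(\Omega)}\|v_\varepsilon\|_{L_\infty(\Omega)}\Big)
\end{aligned}
\]
for m ≥ 2, while for m < 2 the last term is replaced by $\|\nabla v_\varepsilon\|_{L_2(\Omega)}\|v_\varepsilon\|_{L_{\frac{6m}{2-m}}(\Omega)}$. Using the fact that for 6 < q ≤ ∞,
\[ \|v_\varepsilon\|_{L_q(\Omega)} \le C\|v_\varepsilon\|^{\alpha}_{L_6(\Omega)}\|v_\varepsilon\|^{1-\alpha}_{W^1_{3m}(\Omega)} \quad\text{with}\quad \frac{1}{q} = \frac{\alpha}{6} + (1-\alpha)\Big(\frac{1}{3m} - \frac{1}{3}\Big), \]
and for 2 < r < 3m,
\[ \|\nabla v_\varepsilon\|_{L_r(\Omega)} \le \|\nabla v_\varepsilon\|^{\alpha}_{L_2(\Omega)}\|\nabla v_\varepsilon\|^{1-\alpha}_{L_{3m}(\Omega)} \quad\text{with}\quad \frac{1}{r} = \frac{\alpha}{2} + \frac{1-\alpha}{3m}, \]
we end up with
\[ \mathrm{C.T.} \le C\|\rho_\varepsilon\|_{L_\infty(\Omega)}\|v_\varepsilon\|^{\frac{2(2m-1)}{3m-2}}_{W^1_2(\Omega)}\|v_\varepsilon\|^{\frac{2(m-1)}{3m-2}}_{W^1_{3m}(\Omega)}. \]
Note that $\frac{2(m-1)}{3m-2} < 1$. Thus we may use the bound on $\rho_\varepsilon$ and Young's inequality yields
\[ \|v_\varepsilon\|_{W^1_{3m}(\Omega)} \le C\big(1 + k^{\frac{\gamma}{3}\frac{3m-2}{m}}\big) + Ck^{\frac{3m-2}{m}} + \frac{1}{2}\|v_\varepsilon\|_{W^1_{3m}(\Omega)}. \]
As γ > 3, the lemma is proved.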
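The interpolation weights used in the proof of Lemma 5 can be recomputed directly from the two scaling relations for α. This is a small sketch under the assumption that the relations read $\frac1q = \frac{\alpha}{6} + (1-\alpha)(\frac{1}{3m}-\frac13)$ and $\frac1r = \frac{\alpha}{2} + \frac{1-\alpha}{3m}$; the function names and the sample value m = 3 are illustrative only.

```python
from fractions import Fraction


def alpha_v(q_inv, m):
    """Solve 1/q = alpha/6 + (1 - alpha)*(1/(3m) - 1/3) for alpha.

    q_inv is 1/q, so q = infinity corresponds to q_inv = 0.
    """
    m = Fraction(m)
    c = 1 / (3 * m) - Fraction(1, 3)
    return (Fraction(q_inv) - c) / (Fraction(1, 6) - c)


def alpha_grad(r_inv, m):
    """Solve 1/r = alpha/2 + (1 - alpha)/(3m) for alpha; r_inv is 1/r."""
    m = Fraction(m)
    c = 1 / (3 * m)
    return (Fraction(r_inv) - c) / (Fraction(1, 2) - c)


m = Fraction(3)  # sample value, m >= 2
a_inf = alpha_v(0, m)                    # weight for ||v||_{L_infty}
a_6m = alpha_v(1 / (6 * m), m)           # weight for ||v||_{L_6m}
a_g = alpha_grad((m + 1) / (3 * m), m)   # weight for ||grad v||_{L_{3m/(m+1)}}

# Total power of the W^1_{3m} norm in each convective term.
power_first = 2 * (1 - a_6m)
power_second = (1 - a_g) + (1 - a_inf)
print(power_first, power_second)
```

Both totals should coincide with $\frac{2(m-1)}{3m-2}$, the power of $\|v_\varepsilon\|_{W^1_{3m}(\Omega)}$ appearing in the final bound for C.T.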
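Before using the bounds proved above, we show one useful result which in particular implies that the limit temperature is positive.
Lemma 6. There exists a subsequence $\{s_\varepsilon\}$ such that $s_\varepsilon \to s$ in $L_2(\Omega)$, subsequently, $\theta_\varepsilon \to \theta$ in $L_q(\Omega)$, q < 3m, with θ > 0 a.e. in Ω.
Proof. Recall that from the energy bound we have the following information:
\[ \int_\Omega |\nabla s_\varepsilon|^2\,dx + \int_{\partial\Omega}\big(e^{s_\varepsilon} + e^{-s_\varepsilon}\big)\,d\sigma \le C, \]
which in particular gives
\[ \int_\Omega |\nabla s_\varepsilon|^2\,dx + \int_{\partial\Omega} s_\varepsilon^2\,d\sigma \le C. \]
Thus, remembering that Ω is bounded, we are allowed to choose a subsequence $s_\varepsilon \to s$ in $L_2(\Omega)$. Recall also that $\theta_\varepsilon = e^{s_\varepsilon}$ and $\theta_\varepsilon \to \theta$ strongly in $L_r(\Omega)$, r < 3m. Hence by Vitali's theorem (for a subsequence, if necessary)
\[ e^{s_\varepsilon} \to e^{s} \ \text{ in } L_r(\Omega) \quad\text{and}\quad \theta = e^{s} \ \text{ with } s \in L_2(\Omega). \]
Thus θ > 0 a.e. in Ω, since s > −∞ a.e. in Ω.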
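A crucial role in the proof of the strong convergence of the density is played by a quantity called the effective viscous flux. To define it, the Helmholtz decomposition of the velocity is needed
\[ v = \nabla\phi + \operatorname{rot} A, \tag{4.6} \]
where the divergence-free part of the velocity is given as a solution to the following elliptic problem:
\[ \operatorname{rot}\operatorname{rot} A = \operatorname{rot} v = \omega \ \text{ in } \Omega, \qquad \operatorname{div}\operatorname{rot} A = 0 \ \text{ in } \Omega, \qquad \operatorname{rot} A\cdot n = 0 \ \text{ at } \partial\Omega. \tag{4.7} \]
The potential part of the velocity is given by the solution to
\[ \Delta\phi = \operatorname{div} v \ \text{ in } \Omega, \qquad \frac{\partial\phi}{\partial n} = 0 \ \text{ at } \partial\Omega, \qquad \int_\Omega \phi\,dx = 0. \tag{4.8} \]
The classical theory for elliptic equations [18,19] gives us for 1 < q < ∞,
\[
\begin{aligned}
\|\nabla\operatorname{rot} A\|_{L_q(\Omega)} &\le C\|\omega\|_{L_q(\Omega)}, & \|\nabla^2\operatorname{rot} A\|_{L_q(\Omega)} &\le C\|\omega\|_{W^1_q(\Omega)}, \\
\|\nabla^2\phi\|_{L_q(\Omega)} &\le C\|\operatorname{div} v\|_{L_q(\Omega)}, & \|\nabla^3\phi\|_{L_q(\Omega)} &\le C\|\operatorname{div} v\|_{W^1_q(\Omega)}.
\end{aligned}
\]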
The properties of the slip boundary condition enable us to state the following problem: −µ ω = rot (K ( ) F − K ( ) v · ∇v − 21 h K ( )v + 21 v − rot 21 v := H 1 + H 2 in , ω · τ 1 = −(2χ2 − f /µ)v · τ 2 at ∂, ω · τ 2 = (2χ1 − f /µ)v · τ 1 at ∂, div ω = 0 at ∂,
(4.9)
where χk are curvatures related with directions τ k . For the proof of relations (4.9)2,3 – see [12] (also [10]). The structure of ω gives us a hint to consider it as a sum of three components, ω = ω0 + ω1 + ω2 ,
(4.10)
where they are determined by the following systems: −µ ω1 = H 1 , −µ ω2 = H 2 −µ ω0 = 0, · τ 1 = −(2χ2 − f /µ)v · τ 2 , ω1 · τ 1 = 0, ω2 · τ 1 = 0 0 1 ω · τ 2 = (2χ1 − f /µ)v · τ 1 , ω · τ 2 = 0, ω2 · τ 2 = 0 0 1 div ω = 0, div ω = 0, div ω2 = 0
ω0
in , at ∂, (4.11) at ∂, at ∂.
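Lemma 7. For the vorticity ω written in the form (4.10) we have:³
\[
\begin{aligned}
\|\omega^2_\varepsilon\|_{L_r(\Omega)} &\le C(k)\,\varepsilon^{1/2} && \text{for } 1 \le r \le 2, \\
\|\omega^0_\varepsilon\|_{W^1_q(\Omega)} + \|\omega^1_\varepsilon\|_{W^1_q(\Omega)} &\le C\big(1 + k^{1+\gamma(\frac{4}{3}-\frac{2}{q})}\big) && \text{for } 2 \le q \le 3m.
\end{aligned}
\tag{4.12}
\]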
3 Note that we can prove that ω2 + L r () = o() for → 0 for any r < 3m. As we do not need it and the proof of the rate is slightly more complicated, we skip it. Analogously we may consider the other inequality also for q < 2, with different powers of k.
Proof. First, let us consider ω0 . Take α 0 any divergence–free extension of the boundary data to ω , e.g. in the form of a solution to the following Stokes problem: − µ α 0 + ∇ p0 div α 0 α0 · τ 1 α0 · τ 2 α0 · n 1−1/(3m)
Note that v ∈ W3m
= = = = =
0 in , 0 in , −(2χ2 − f /µ)v · τ 2 (2χ1 − f /µ)v · τ 1 0 at ∂.
at ∂, at ∂,
(4.13)
1 () with the estimate (∂), thus α 0 ∈ W3m
α 0 Wq1 () ≤ C v Wq1 () ,
1 < q ≤ 3m.
Thus we may transform the system for ω0 to the form − µ (ω0 − α 0 ) = µ α 0
in ,
− α0 ) · τ 1 = 0
at ∂,
− α0 ) · τ 2 = 0
at ∂,
div(ω0 − α 0 ) = 0
at ∂.
(ω0 (ω0
(4.14)
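To find the estimates for solutions to (4.14) we consider its weak form, then the r.h.s. of (4.14)₁ delivers a nontrivial boundary term. It is well defined, since $\operatorname{div}\alpha^0 = 0$. Then results from [18,19] guarantee desired bounds. As the system for $\omega^0_\varepsilon$ has the same structure as that for $\omega^1_\varepsilon$, we get
\[ \|\omega^1_\varepsilon\|_{W^1_q(\Omega)} \le C\|H^1\|_{W^{-1}_q(\Omega)} \quad\text{and}\quad \|\omega^0_\varepsilon\|_{W^1_q(\Omega)} \le C\|v_\varepsilon\|_{W^1_q(\Omega)}, \qquad 1 < q \le 3m. \]
Analyzing the form of $H^1$ we see that the only not elementary term is the convective one; so we obtain
\[ \|\omega^1_\varepsilon\|_{W^1_q(\Omega)} \le C\big(1 + \|K(\rho_\varepsilon)\rho_\varepsilon v_\varepsilon\cdot\nabla v_\varepsilon\|_{L_q(\Omega)}\big). \]
We easily see that for q ≥ 2,
\[ \|K(\rho_\varepsilon)\rho_\varepsilon v_\varepsilon\cdot\nabla v_\varepsilon\|_{L_q(\Omega)} \le k\|v_\varepsilon\|_{L_\infty(\Omega)}\|\nabla v_\varepsilon\|_{L_q(\Omega)}. \]
Using interpolation inequalities as in Lemma 5 we prove that
\[
\begin{aligned}
\|K(\rho_\varepsilon)\rho_\varepsilon v_\varepsilon\cdot\nabla v_\varepsilon\|_{L_q(\Omega)}
&\le Ck\,\|v_\varepsilon\|^{\frac{2(m-1)}{3m-2}}_{L_6(\Omega)}\|\nabla v_\varepsilon\|^{\frac{m}{3m-2}}_{L_{3m}(\Omega)}\|\nabla v_\varepsilon\|^{\frac{6m-2q}{(3m-2)q}}_{L_2(\Omega)}\|\nabla v_\varepsilon\|^{\frac{3m(q-2)}{(3m-2)q}}_{L_{3m}(\Omega)} \\
&\le Ck^{1+\gamma(\frac{4}{3}-\frac{2}{q})}.
\end{aligned}
\]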
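The power of k in the last estimate can be traced by collecting the total power of the $L_{3m}$ norm of $\nabla v_\varepsilon$ and inserting the bound (4.5) of Lemma 5. The sketch below does this bookkeeping with exact fractions; `k_power` is a hypothetical helper, and it assumes the lower-order norms ($L_6$ and $L_2$) are bounded independently of k.

```python
from fractions import Fraction


def k_power(q, m, gamma):
    """Power of k obtained by combining the interpolation above with (4.5).

    Assumes ||v||_{W^1_{3m}} <= C(1 + k^{(gamma/3)(3m-2)/m}) and that the
    L_6 and L_2 norms contribute no powers of k.
    """
    q, m, gamma = Fraction(q), Fraction(m), Fraction(gamma)
    # Total power of the L_{3m} norm of grad v in the interpolated product.
    power_w3m = m / (3 * m - 2) + 3 * m * (q - 2) / ((3 * m - 2) * q)
    # One factor k comes from the bound K(rho) rho <= k.
    return 1 + (gamma / 3) * ((3 * m - 2) / m) * power_w3m


def claimed_power(q, gamma):
    """Exponent stated in (4.12): 1 + gamma*(4/3 - 2/q)."""
    q, gamma = Fraction(q), Fraction(gamma)
    return 1 + gamma * (Fraction(4, 3) - 2 / q)


print(k_power(3, 3, 4), claimed_power(3, 4))
```

The two expressions agree identically in q, m and γ, which is why m does not appear in the exponent of (4.12).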
Evidently, the estimate for ω0 is less restrictive. Similarly, for ω2 we have
||ω2 || L q () ≤ C|| v ||Wq−1 () ≤ C sup | φ
v φd x|,
where the sup is taken over all functions belonging to Wq1 () with 1/ p + 1/q = 1. From the continuity equation we know that √ ||∇ || L 2 () ≤ C(k). (For q > 2 we have only ∇ L q () ≤ C.) As q ≤ 2, 1
||ω2 || L q () ≤ C( ∇ L 2 () v L ∞ () + ∇ L 2 () ∇v L 3m () ) ≤ C(k) 2 . The lemma is proved.
We now introduce the fundamental quantity — the effective viscous flux — which is in fact the potential part of the momentum equation. Using the Helmholtz decomposition in the approximative momentum equation we have ∇(−(2µ + ν) φ + P( , θ )) = µ rot A + K ( ) F 1 1 1 −K ( ) v · ∇v − h K ( )v + v − v . 2 2 2 We define G ε = −(2µ + ν) φ + P( , θ ) = −(2µ + ν) div v + P( , θ )
(4.15)
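and its limit version
\[ G = -(2\mu+\nu)\operatorname{div} v + \overline{P(\rho,\theta)}. \tag{4.16} \]
Note that we are able to control integrals $\int_\Omega G_\varepsilon\,dx = \int_\Omega P(\rho_\varepsilon,\theta_\varepsilon)\,dx$ and $\int_\Omega G\,dx = \int_\Omega \overline{P(\rho,\theta)}\,dx$, where $\overline{P(\rho,\theta)} = \overline{P_b(\rho)} + \theta\,\overline{\int_0^{\rho} K(t)\,dt}$. The result of the lemma below gives the most important properties of the effective viscous flux, guaranteeing the compactness of $\{G_\varepsilon\}$ as well as the pointwise bound of the limit in terms of the parameter k from definition (2.1).
Lemma 8. We have, up to a subsequence ε → 0⁺:
\[ G_\varepsilon \to G \quad\text{strongly in } L_2(\Omega) \tag{4.17} \]
and
\[ \|G\|_{L_\infty(\Omega)} \le C(\eta)\big(1 + k^{1+\frac{2}{3}\gamma+\eta}\big) \quad\text{for any } \eta > 0. \tag{4.18} \]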
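Proof. The function $G_\varepsilon$ can be naturally decomposed as $G_\varepsilon = G^1_\varepsilon + G^2_\varepsilon$, where $\int_\Omega G^2_\varepsilon\,dx = 0$ and $\nabla G^2_\varepsilon = -\frac{1}{2}\varepsilon\Delta\rho_\varepsilon v_\varepsilon - \mu\operatorname{rot}\omega^2_\varepsilon$. Thus
\[ \|G^2_\varepsilon\|_{L_q(\Omega)} \le C\big(\|\varepsilon\Delta\rho_\varepsilon v_\varepsilon\|_{W^{-1}_q(\Omega)} + \mu\|\operatorname{rot}\omega^2_\varepsilon\|_{W^{-1}_q(\Omega)}\big). \]
Using Lemma 7 we see that
\[ \|G^2_\varepsilon\|_{L_q(\Omega)} \le C(k)\,\varepsilon^{\frac{1}{2}}, \qquad 1 \le q \le 2. \]
Next, using again Lemma 7 and calculations in its proof, we immediately see that (recall that $|\int_\Omega G_\varepsilon\,dx| \le C$, we control the average of the r.h.s. of (4.15) from the energy bound, Lemma 1)
\[ \|G^1_\varepsilon\|_{W^1_q(\Omega)} \le C\big(1 + k^{1+\gamma(\frac{4}{3}-\frac{2}{q})}\big) \quad\text{for } 2 \le q \le 3m. \tag{4.19} \]
Thus we have, at least for a subsequence,
\[ G^1_\varepsilon \to G^1 \ \text{ in } L_\infty(\Omega) \quad\text{and}\quad G^2_\varepsilon \to 0 \ \text{ in } L_2(\Omega). \]
Therefore
\[ G_\varepsilon = G^1_\varepsilon + G^2_\varepsilon \to G^1 \ \text{ in } L_q(\Omega), \qquad 1 \le q \le 2, \]
and due to the definition, $G^1 = G$. Finally, choosing $q = 3 + \tilde\eta$ in (4.19),
\[ \|G\|_{L_\infty(\Omega)} \le C(q)\|G\|_{W^1_q(\Omega)} \le C(q)\sup_{\varepsilon>0}\|G^1_\varepsilon\|_{W^1_q(\Omega)} \le C(\eta)\big(1 + k^{1+\frac{2}{3}\gamma+\eta}\big) \]
with η > 0, arbitrarily small if $\tilde\eta$ is so. This finishes the proof of Lemma 8.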
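The relation between $\tilde\eta$ and η in the last step can be made explicit: substituting $q = 3 + \tilde\eta$ into the exponent of (4.19) and subtracting $1 + \frac{2}{3}\gamma$ leaves $\eta = \frac{2\gamma\tilde\eta}{3(3+\tilde\eta)}$. A minimal sketch of this arithmetic follows; the function name `eta_from_eta_tilde` and the sample values are assumptions made for illustration.

```python
from fractions import Fraction


def eta_from_eta_tilde(gamma, eta_tilde):
    """Excess of the exponent in (4.19) at q = 3 + eta_tilde over 1 + (2/3)*gamma."""
    gamma, eta_tilde = Fraction(gamma), Fraction(eta_tilde)
    q = 3 + eta_tilde
    exponent = 1 + gamma * (Fraction(4, 3) - 2 / q)
    return exponent - (1 + Fraction(2, 3) * gamma)


# Sample values: gamma = 4 and a small eta_tilde.
gamma = Fraction(4)
for eta_tilde in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(eta_tilde, eta_from_eta_tilde(gamma, eta_tilde))
```

The printed values decrease to zero with $\tilde\eta$, consistent with the remark that η is arbitrarily small if $\tilde\eta$ is.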
5. Limit Passage
In this section we apply the properties of the effective viscous flux shown in the previous part. First we prove a result characterizing the sequence of approximative densities.
Theorem 3. There exists a sufficiently large number $k_0 > 0$ such that for $k > k_0$,
\[ \frac{k-3}{k}(k-3)^{\gamma} - \|G\|_{L_\infty(\Omega)} \ge 1 \tag{5.1} \]
and for a subsequence ε → 0⁺ it holds
\[ \lim_{\varepsilon\to 0^+} |\{x \in \Omega : \rho_\varepsilon(x) > k-3\}| = 0. \tag{5.2} \]
In particular it follows: K () = a.e. in . Proof. We define a smooth function M ⎧ ⎨1 M(t) = ∈ [0, 1] ⎩0
: R+0 → [0, 1] such that for t ≤k−3 for k − 3 < t < k − 2 for k−2≤t
and M (t) < 0 for t ∈ (k − 3, k − 2). We follow the method introduced in [11]. First we multiply the approximative continuity equation (2.2)1 by M l ( ) for l ∈ N getting ⎛ ⎞ (x) ⎜ ⎟ t l M l−1 (t)M (t)dt ⎠ div v ≥ R ⎝
0
with R → 0 as → 0, as M l ( ) d x = −l M l−1 ( )M ( )|∇ |2 d x ≥ 0.
Next, recalling definitions of G and M, we obtain ⎛ ⎞ (x) ⎜ ⎟ −(k − 3) ⎝ l M l−1 (t)M (t)dt ⎠ P( , θ )d x
0
⎛
⎞ (x) ⎜ ⎟ ≤ k ⎝ −l M l−1 (t)M (t)dt ⎠ G d x + R .
0
Thus the properties of M lead us to the following inequality: k−3 l (1 − M ( ))P( , θ )d x ≤ (1 − M l ( ))|G |d x + |R |. k { >k−3}
{ >k−3}
From the explicit form of the pressure function (2.3) we find k−3 k−3 (k − 3)γ |{ > k − 3}| − ||P( , θ )|| L 2 () ||M l ( )|| L 2 () k k ≤ ||G|| L ∞ () |{ > k − 3}| + ||G − G || L 1 () + |R |. But by Lemma 8 – inequality (4.18) – we are able to choose k0 so large that for all k > k0 2 we have (5.1), since γ > 3 and ||G|| L ∞ () ≤ Cη (1 + k 1+ 3 γ +η ) with 0 < η ≤ γ −3 6 . Hence we get |{x ∈ : (x) > k −3}| ≤ C ||M l ( )|| L 2 ({ >k−3}) +||G −G || L 1 () +|R | . (5.3) Now, let us fix δ > 0. Then there exists 0 > 0 such that for < 0 , C(||G − G || L 1 () + |R |) ≤ δ/2.
(5.4)
Having fixed, we consider the sequence {M l ( )I{ >k−3} }l∈N , where I A is the characteristic function of a set A. We see that it monotonely pointwise converges to zero. Thus by the Lebesgue theorem we are able to find l = l(, δ) such that C||M l ( )|| L 2 ({ >k−3}) ≤ δ/2.
(5.5)
From (5.3), (5.4) and (5.5) we obtain lim |{x ∈ ; (x) > k − 3}| ≤ δ.
(5.6)
→0
As δ > 0 can be chosen arbitrarily small, Theorem 3 is proved.
Thanks to Theorem 3 we are prepared to present the main part of the proof, i.e. the pointwise convergence of the density.
Lemma 9. We have P(, θ )d x ≤ Gd x and P(, θ )d x = Gd x;
(5.7)
consequently, P(, θ ) = P(, θ ) and up to a subsequence → 0+ , → strongly in L q () for any q < ∞.
(5.8)
Proof. Due to Theorem 3 we are able to omit K () in the limit equation. For details we refer to [11] – Sect. 4, consideration for (4.16). Examine the approximative continuity equation (2.2)1 . We use as test function ln( + δ) and passing with δ → 0+ we obtain K ( )v · ∇ d x ≥ C(k), (5.9)
thus Theorem 3 implies div v d x ≥ R .
−
(5.10)
Applying (4.15) to (5.10), passing with → 0, then by the strong convergence of G — see (4.17) — we conclude that G = G, so the first relation in (5.7) is proved. Next, we consider the limit to the continuity equation, i.e. div(v) = 0. Testing it by ln with an application of Friedrich’s lemma to have the possibility to use test functions with lower regularity we obtain (for details see [11]) div vd x = 0.
The definition of G — (4.16) — shows the second part of (5.7). Due to elementary properties of weak limits we get P(, θ ) ≤ P(, θ ) a.e. in , but (5.7) implies (P(, θ ) − P(, θ ) )d x ≤ 0, hence P(, θ ) = P(, θ ) a.e.,
i.e. γ +1 + 2 θ = γ + 2 θ a.e.
However, γ +1 ≥ γ and 2 θ ≥ 2 θ , so γ +1 = γ a.e.
and
2 θ = 2 θ a.e.
By Lemma 6 the temperature θ > 0 a.e., we conclude 2 = 2 and for a suitably taken subsequence, lim || − ||2L 2 () = 2 − 2 L 1 () = 0.
→0
(5.11)
Thus the limit (5.11) implies → strongly in L 2 () and by the pointwise boundedness of and we conclude (5.8).
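Next, we would like to study the limit of the energy equation. The first observation concerns the velocity: we obtain the strong convergence of its gradient. Recall that from Theorem 3 and due to the strong convergence of the temperature it follows $P(\rho_\varepsilon,\theta_\varepsilon) \to P(\rho,\theta)$ strongly in $L_2(\Omega)$, hence (4.17) implies
\[ \operatorname{div} v_\varepsilon \to \operatorname{div} v \quad\text{strongly in } L_2(\Omega). \tag{5.12} \]
Additionally we already proved that
\[ \operatorname{rot} v_\varepsilon \to \operatorname{rot} v \quad\text{strongly in } L_2(\Omega), \tag{5.13} \]
since we observed that the vorticity can be written as the sum of two parts, one bounded in $W^1_q(\Omega)$ and the other one going strongly to zero in $L_2(\Omega)$. The regularity of systems (4.7) and (4.8) and convergences (5.12) and (5.13) imply immediately that
\[ v_\varepsilon \to v \quad\text{strongly in } H^1(\Omega). \]
In particular, we get
\[ S(v_\varepsilon) : \nabla v_\varepsilon \to S(v) : \nabla v \quad\text{strongly at least in } L_1(\Omega). \tag{5.14} \]
This fact will be crucial in our considerations for the limit of the energy equation. Recall that
\[
\begin{aligned}
\rho_\varepsilon &\to \rho && \text{in } L_q(\Omega) \text{ for } q < \infty, \\
v_\varepsilon &\to v && \text{in } W^1_q(\Omega) \text{ for } q < 3m, \\
\theta_\varepsilon &\to \theta && \text{in } L_q(\Omega) \text{ for } q < 3m, \\
\theta_\varepsilon &\rightharpoonup \theta && \text{in } W^1_{\min\{2,\frac{3m}{m+1}\}}(\Omega).
\end{aligned}
\tag{5.15}
\]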
Consider the weak form of (2.2)3 . For a smooth function φ we have m + θ (1 + θ ) ∇θ · ∇φd x + L(θ )(θ − θ0 )φdσ + ln θ φdσ θ ∂ ∂ ⎡⎛ ⎞ ⎤ (x) ⎢⎜ ⎟ ⎥ − ⎣⎝ K (t)dt ⎠ v · ∇(θ φ) + K ( ) v · ∇(θ φ)⎦ d x
+
⎡
0
⎢ ⎣ K ( ) v · ∇θ φ + div(θ v φ)
(x)
⎤
⎥ K (t)dt ⎦ d x =
0
+ θ ∇θ (1 + θ m )∇θ
(5.16)
S(v ) : ∇v φd x.
Thanks to (5.15), (1 + θm )
(5.15)
in L 1 ().
Passing to the limit with the last four terms of the l.h.s. of (5.16) we get −v∇(θ φ) − v∇(θ φ) + φv∇θ + div(θ φv) d x
=
−θ v · ∇φ + θ div vφ d x.
(5.17)
In (5.17) we essentially used the strong convergence of the density. To control the behavior of the boundary terms we note that due to (5.15)2 we see that θ |∂ → θ |∂ strongly in L l+1 (∂) and by Lemma 6, ln θ is bounded in L 2 (∂). Thus recalling (5.14) we get at the limit
(1 + θ )∇θ · ∇φd x +
=
θ v · ∇φd x
∂
S(v) : ∇vφd x −
L(θ )(θ − θ0 )dσ −
m
θ div vφd x.
(5.18)
To conclude, note that we may show that the limit functions θ and v belong to W p1 () θ for any p < ∞. To see this, we introduce the function (θ ) = 0 (1 + t m )dt, similarly as in Sect. 3, formula (3.10). Thus from (5.18) we immediately see that θ ∈ L ∞ () and v ∈ W p1 () for any p < ∞. Using this fact once more in the energy equation, we observe that θ ∈ W p1 (), p < ∞. The positiveness of θ follows from Lemma 6. Theorem 1 is proved. Acknowledgements. The work has been granted by the working program between Charles and Warsaw Universities. The first author has been partly supported by the Polish KBN grant No. 1 P03A 021 30 and by ECFP6 M.Curie ToK program SPADE2, MTKD-CT-2004-014508 and SPB-M. The work of the second author is a part of the research project MSM 0021620839 financed by MSMT and partly supported by the grant of the Czech Science Foundation No. 201/05/0164 and by the project LC06052 (Jindˇrich Neˇcas Center for Mathematical Modeling).
References
1. Batchelor, G.K.: An introduction to fluid dynamics. Cambridge: Cambridge University Press, 1967
2. Bause, M., Heywood, J.G., Novotný, A., Padula, M.: On some approximation schemes for steady compressible viscous flow. J. Math. Fluid Mech. 5(3), 201–230 (2003)
3. Březina, J., Novotný, A.: On Weak Solutions of Steady Navier-Stokes Equations for Monatomic Gas. preprint, http://ncmm.karlin.mff.cuni.cz/research/Preprints
4. Ducomet, B., Feireisl, E.: On the dynamics of gaseous stars. Arch. Rat. Mech. Anal. 174(2), 221–266 (2004)
5. Frehse, J., Steinhauer, M., Weigant, V.: On Stationary Solutions for 2-D Viscous Compressible Isothermal Navier-Stokes Equations. preprint, http://ncmm.karlin.mff.cuni.cz/research/Preprints
6. Feireisl, E.: Dynamics of viscous compressible fluids. Oxford Lecture Series in Mathematics and its Applications 26, Oxford: Oxford University Press, 2004
7. Feireisl, E., Novotný, A., Petzeltová, H.: On a class of physically admissible variational solutions to the Navier-Stokes-Fourier system. Z. Anal. Anwendungen 24(1), 75–101 (2005)
8. Feireisl, E., Novotný, A.: Large time behaviour of flows of compressible, viscous, and heat conducting fluids. Math. Methods Appl. Sci. 29(11), 1237–1260 (2006)
9. Lions, P.L.: Mathematical Topics in Fluid Mechanics, Vol. 2: Compressible Models. Oxford: Oxford Science Publications, 1998
10. Mucha, P.B.: On cylindrical symmetric flows through pipe-like domains. J. Diff. Eq. 201(2), 304–323 (2004)
11. Mucha, P.B., Pokorný, M.: On a new approach to the issue of existence and regularity for the steady compressible Navier–Stokes equations. Nonlinearity 19(8), 1747–1768 (2006)
12. Mucha, P.B., Rautmann, R.: Convergence of Rothe's scheme for the Navier-Stokes equations with slip conditions in 2D domains. ZAMM Z. Angew. Math. Mech. 86(9), 691–701 (2006)
13. Novo, S., Novotný, A.: On the existence of weak solutions to the steady compressible Navier-Stokes equations when the density is not square integrable. J. Math. Kyoto Univ. 42(3), 531–550 (2002)
14. Novo, S., Novotný, A., Pokorný, M.: Steady compressible Navier-Stokes equations in domains with non-compact boundaries. Math. Methods Appl. Sci. 28(12), 1445–1479 (2005)
15. Novotný, A., Padula, M.: L_p-approach to steady flows of viscous compressible fluids in exterior domains. Arch. Rat. Mech. Anal. 126(3), 243–297 (1994)
16. Novotný, A., Straškraba, I.: Mathematical Theory of Compressible Flows. Oxford: Oxford Science Publications, 2004
17. Pokorný, M., Mucha, P.B.: 3D steady compressible Navier–Stokes equations. Cont. Discr. Dyn. Systems S1, 151–163 (2008)
18. Solonnikov, V.A.: Overdetermined elliptic boundary value problems. Zap. Nauch. Sem. LOMI 21, 112–158 (1971)
19. Zajączkowski, W.: Existence and regularity of some elliptic systems in domains with edges. Dissertationes Math. (Rozprawy Mat.) 274 (1989) 95 pp
Communicated by P. Constantin
Commun. Math. Phys. 288, 379–401 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0747-y
Communications in
Mathematical Physics
$L^2$-Restriction Bounds for Eigenfunctions Along Curves in the Quantum Completely Integrable Case
John A. Toth
Department of Mathematics and Statistics, McGill University, Montreal, Canada. E-mail: [email protected]
Received: 17 March 2008 / Accepted: 5 December 2008
Published online: 11 March 2009 – © Springer-Verlag 2009
Abstract: We show that for a quantum completely integrable system in two dimensions, the $L^2$-normalized joint eigenfunctions of the commuting semiclassical pseudodifferential operators satisfy restriction bounds of the form $\int_\gamma |\varphi_j|^2\,ds = O(|\log\hbar|)$ for generic curves $\gamma$ on the surface. We also prove that the maximal restriction bounds of Burq-Gérard-Tzvetkov [BGT] are generically attained for certain exceptional subsequences of eigenfunctions.

1. Introduction

Let $(M,g)$ be a compact, closed orientable Riemannian manifold. Let $-\Delta : C^\infty(M) \to C^\infty(M)$ be the associated Laplace-Beltrami operator with eigenvalues $0 < \lambda_1 \le \lambda_2 \le \cdots$ and eigenfunctions $\varphi_j$; $j = 1,2,3,\ldots$ satisfying $-\Delta_g\varphi_j = \lambda_j^2\varphi_j$, and $L^2$-normalized so that $\int_M |\varphi_j|^2\,d\mathrm{vol}(x) = 1$. The celebrated Avakumovic-Levitan-Hörmander asymptotics for the unintegrated spectral counting function $e(x,x,\lambda) = \sum_{\lambda_j\le\lambda}|\varphi_j(x)|^2$ implies that
\[ \|\varphi_j\|_{L^\infty} = O\big(\lambda_j^{\frac{n-1}{2}}\big). \tag{1.1} \]
The example of the sphere shows that (1.1) is sharp. The corresponding sharp $L^p$-bounds are due to Sogge [So1,So2,So3]. Even though this $L^\infty$-bound is far from generic [STZ], the only general improvements on (1.1) that we are aware of are due to Sogge and Zelditch [SZ] and more recently, Sogge, Toth and Zelditch [STZ,T4]. These authors obtain pointwise $o(\lambda^{\frac{n-1}{2}})$-bounds under a certain non-recurrence condition for the geodesic flow on $(M,g)$. The methods in [STZ] follow closely the earlier work of Safarov [S] and Safarov-Vassiliev [SV].

The author was supported by a William Dawson Fellowship and NSERC Grant OGP0170280.

It is natural to ask whether one can generically improve the $O(\lambda^{\frac{n-1}{2}})$ sup-bound by polynomial powers of $\lambda$ and if so, by how much. In general, very little is known here: polynomial improvements have been obtained by Iwaniec and Sarnak in arithmetic hyperbolic cases [Sa,IS]. At the other extreme, in the quantum completely integrable (QCI) case it is known that under a natural Morse assumption, one can show that $\sup_{x\in M}|\varphi_\lambda(x)| = O(\lambda^{1/4})$ when the $\varphi_\lambda$'s are joint eigenfunctions of the commuting operators and $\dim M = 2$ (see [T4]). In the latter case, when $\dim M > 2$, one can at least hope to obtain a fairly complete answer to this question provided the $\varphi_j$'s are joint eigenfunctions of $n$ functionally independent, self-adjoint, jointly elliptic, commuting $\hbar$-pseudodifferential operators $P_1(\hbar), P_2(\hbar), \ldots, P_n(\hbar)$. However, due to the presence of often complicated degeneracies of the Lagrangian foliation, even at the classical level, the dynamical picture is only partially understood [VN2]. Likewise, at the quantum level, the asymptotic blow-up properties of eigenfunctions (e.g. sharp $L^p$-bounds) are also only partially understood [T1–T3, TZ1–TZ3].

Apart from pointwise bounds, it is natural when studying asymptotic concentration properties of eigenfunctions to consider limits of expected values $\langle A\varphi_\lambda, \varphi_\lambda\rangle$ as $\lambda\to\infty$, where $A$ is a zeroth-order pseudodifferential operator, and to compute the corresponding semiclassical defect measures. Formally, one can let $A$ approach $\delta_\gamma$, where the latter is surface measure along a submanifold, $\gamma\subset M$. Then, one is faced with estimating asymptotic upper bounds for $L^2$, or more generally, $L^p$ integrals along submanifolds of $M$.
In the case of surfaces, these are curves and the concentration of these defect measures along a periodic geodesic $\gamma$ is called strong scarring. For Laplace eigenfunctions, the eigenfunction restriction bounds have been studied by Reznikov [R] for hyperbolic surfaces and Burq-Gérard-Tzvetkov [BGT] for general manifolds (and for all $p\ge 2$). Both papers are related to earlier work of Tataru [Ta] on estimating boundary traces of wavefunctions. We will focus here on the case where $p=2$ and $\dim M = 2$ (i.e. $L^2$-restriction bounds along curves on surfaces). At the moment, it is unclear to us whether our methods extend to $L^p$-restriction bounds for $p\neq 2$. In the special case of $L^2$-integrals along curves, the estimates in [BGT] are as follows:
• (i) If $\gamma$ is a unit-length geodesic, then
\[ \int_\gamma |\varphi_j(s)|^2\,ds = O(\lambda_j^{1/2}). \]
• (ii) If $\gamma$ is a curve with strictly-positive geodesic curvature,
\[ \int_\gamma |\varphi_j(s)|^2\,ds = O(\lambda_j^{1/3}). \]
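As an illustration of the growth rate in (i), one can look at the round unit sphere. The sketch below relies on a standard fact that is not taken from the text above: the highest-weight spherical harmonic satisfies $|Y_l^l|^2 = c_l\sin^{2l}\theta$ with $c_l = (2l+1)!/(4\pi(2^l\,l!)^2)$, and its Laplace eigenvalue is $\lambda^2 = l(l+1)$. The function name `equator_mass` is ours. The code evaluates the mass over the full equator in closed form; dividing by $2\pi$ gives the mass over a unit-length segment, which has the same growth.

```python
import math

def equator_mass(l: int) -> float:
    """Integral of |Y_l^l|^2 over the full equator of the unit sphere.

    On the equator (theta = pi/2) the density equals c_l, so the integral
    over length 2*pi is 2*pi*c_l = (2l+1)! / (2 * 4^l * (l!)^2).
    lgamma is used so that large l does not overflow.
    """
    log_val = (math.lgamma(2 * l + 2)      # log (2l+1)!
               - math.log(2.0)
               - 2 * l * math.log(2.0)     # log 4^l
               - 2 * math.lgamma(l + 1))   # log (l!)^2
    return math.exp(log_val)

for l in (10, 100, 1000, 10000):
    lam = math.sqrt(l * (l + 1))           # the eigenvalue of -Laplacian is lam**2
    # The ratio mass / lam^(1/2) settles near 1/sqrt(pi), i.e. mass ~ lam^(1/2).
    print(l, equator_mass(l), equator_mass(l) / math.sqrt(lam))
```

Under these assumptions the printed ratio stabilizes, which is the $\lambda_j^{1/2}$ rate allowed by (i).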
In this article, we obtain generic asymptotic bounds for $\int_\gamma |\varphi_j(s)|^2\,ds$ in the case where the $\varphi_j$'s are joint eigenfunctions of the QCI system consisting of two commuting $\hbar$-pseudodifferential operators $P_1(\hbar)$ and $P_2(\hbar)$. Other than the fact that our analysis here is specific to QCI systems and to the case $p=2$, this paper differs from [BGT] in several ways:
1) One of the main issues here is the generic behaviour of restriction bounds, where a curve $\gamma : [a,b]\to M$ is called generic if it satisfies the Morse condition in Definition 1.1. As we show in Sect. 2, in the QCI case the restricted asymptotic eigenfunction mass, $\int_\gamma |\varphi_j|^2\,ds$, is much smaller than the prediction in (i) or (ii) above. Indeed, it is $O(\log\lambda_j)$ (see Theorems 1 and 3) and the example of zonal harmonics on the sphere (see Sect. 4.1) shows that this bound is sharp.
2) In Theorem 3, we establish a converse to (i) above, and show that the bound in (i) is generically attained in the QCI case. Moreover, we identify the specific bicharacteristics in terms of the singular Lagrangian foliation that support such large eigenfunction scars.
3) Finally, we prove all our results for a rather large class of possibly inhomogeneous semiclassical QCI Hamiltonians. The semiclassical Laplacian $P_1(\hbar) = -\hbar^2\Delta$ is a special case. The results really have to do with the bicharacteristic flow and are not specific to geodesics.

Before going on, we explain what is meant here by the term generic. Given $E_1 > 0$ a regular value of $p_1$, we assume that for $(x,\xi)\in p_1^{-1}(E_1)$,
\[ \partial_\xi p_1(x,\xi) \neq 0. \tag{A1} \]
That is, $p_1$ is real principal type on the hypersurface $p_1^{-1}(E_1)$. Given the canonical projection $\pi : T^*M\to M$, we define
\[ C_\gamma := \{(x,\xi)\in T^*M;\ p_1(x,\xi) = E_1,\ x\in\gamma\} = p_1^{-1}(E_1)\cap\pi^{-1}(\gamma). \tag{1.2} \]
Definition 1.1. Let $\iota : C_\gamma\to p_1^{-1}(E_1)$ be the standard inclusion map. We say that the $\mathbb{R}^2$-integrable system with moment map $P = (p_1,p_2)$ is generic along the curve $\gamma : [a,b]\to M$ provided $\iota^*p_2\in C^\infty(C_\gamma)$ is a Morse function and condition (A1) is satisfied. When it does not cause confusion, we call the curve segment $\gamma$ itself generic when the conditions in Definition 1.1 are satisfied.

Remark 1.2. In the homogeneous case where $p_1 = |\xi|_g^2$, the manifold $C_\gamma = S_\gamma^*M$. In this case, $E_1 = 1$ and by Euler homogeneity, $\xi\cdot\partial_\xi|\xi|_g^2 = 2|\xi|_g^2$ so that (A1) is automatically satisfied.

Theorem 1. Let $\varphi_j$; $j=1,2,3,\ldots$ be the $L^2$-normalized joint eigenfunctions of the commuting operators $P_1(\hbar)$ and $P_2(\hbar)$ on a Riemannian surface $(M^2,g)$ with joint eigenvalues $(\lambda_j^{(1)}(\hbar) = E_1 + O(\hbar),\ \lambda_j^{(2)}(\hbar))\in \mathrm{Spec}\,P_1(\hbar)\times\mathrm{Spec}\,P_2(\hbar)$; $j = 1,2,3,\ldots$. Then for generic curves $\gamma : [a,b]\to M$ and $\hbar\in(0,\hbar_0]$,
\[ \int_\gamma |\varphi_j|^2\,ds = O_{|\gamma|}(|\log\hbar|). \]
Here, $|\gamma|$ denotes the length of the curve segment $\gamma$ and the RHS of the above estimate is uniform over all energy values $\{E\in\mathbb{R};\ (E_1,E)\in P(T^*M)\}$. In the special case where $P_1(\hbar) = -\hbar^2\Delta$ one can scale out $\hbar$ and Theorem 1 becomes

Theorem 2. Let $\varphi_j$; $j=1,2,3,\ldots$ be the $L^2$-normalized joint Laplace eigenfunctions of the commuting operators $P_1 = -\Delta$ and $P_2$ on a Riemannian surface $(M^2,g)$. Then, provided $\iota^*p_2|_{S_\gamma^*M}$ is Morse, one gets
\[ \int_\gamma |\varphi_j|^2\,ds = O_{|\gamma|}(\log\lambda_j). \]
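The logarithmic rate in Theorem 2 can be observed numerically for zonal harmonics on the round unit sphere. The sketch below is an illustration of ours under stated assumptions: it uses the standard formula $Y_l^0 = \sqrt{(2l+1)/(4\pi)}\,P_l(\cos\theta)$, and it takes a full great circle through the poles as the test curve purely for computational convenience; the Morse condition for that particular curve is not checked here. The names `legendre` and `meridian_mass` are hypothetical helpers.

```python
import math

def legendre(l: int, x: float) -> float:
    """P_l(x) via the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def meridian_mass(l: int, n_grid: int = 4000) -> float:
    """Integral of |Y_l^0|^2 over a full great circle through the poles.

    Arc length along a meridian is d(theta); the full circle is two
    half-meridians theta in [0, pi]. Midpoint rule in theta.
    """
    h = math.pi / n_grid
    total = sum(legendre(l, math.cos((k + 0.5) * h)) ** 2 for k in range(n_grid))
    return 2.0 * (2 * l + 1) / (4.0 * math.pi) * total * h

for l in (25, 50, 100, 200):
    # The mass grows slowly with l; compare its increments with log(l).
    print(l, meridian_mass(l), math.log(l))
```

In this experiment each doubling of $l$ adds a roughly constant amount to the mass, as a logarithm would, in contrast with the $\lambda^{1/2}$ growth seen for highest-weight harmonics along the equator.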
In the homogeneous case, it was already observed in [BGT] (see estimate (i) above) that in the case where $\gamma$ is a geodesic, the restriction upper bounds can grow at the maximal rate $\sim\lambda^{1/2}$. Consistent with this, in the QCI case, we will show that there always exist certain bicharacteristics that support high $L^2$-mass for certain subsequences of eigenfunctions consistent with the $\lambda^{1/2}$-bound in (i) (at least up to possible loss of $\log\lambda$). However, it is important to note that the nature of the bicharacteristic is very important when discussing restriction bounds. To describe what we mean, let $B_{reg}$ (resp. $B_{sing}$) denote the regular (resp. singular) values of the moment map $P = (p_1,p_2) : T^*M\to\mathbb{R}^2$. In the general QCI case, most bicharacteristics of $H_{p_1}$ are subsets of Lagrangian tori in $P^{-1}(B_{reg})$. These do not support large $L^2$-bounds along their configuration space projections. However, as was shown in [TZ3] Lemma 3, unless $(M,g)$ is a flat torus, there is always a subsequence of joint eigenfunctions of $P_1$ and $P_2$ with mass concentrated along (singular) joint orbits of the Hamilton fields $H_{p_1}$ and $H_{p_2}$ contained in $P^{-1}(B_{sing})$. The latter eigenfunctions saturate the maximal bounds in (ii) above. This is of course consistent with simple examples like surfaces of revolution with metric $g = dr^2 + a^2(r)\,d\theta^2$, where the equator is the projection of a singular orbit of the joint flow of $H_{p_1}$ and $H_{p_2}$. In this case, $p_1 = |\xi|_g$ and $p_2 = p_\theta$ with $p_\theta(v) := \langle v,\partial_\theta\rangle$. The corresponding joint eigenfunctions, $\varphi_j$, of $P_1(\hbar) = -\hbar^2\Delta_g$ and $P_2(\hbar) = \hbar D_\theta$ with joint eigenvalues $(\lambda_j^{(1)}(\hbar),\lambda_j^{(2)}(\hbar)) = (1,0) + o(1)$ (the analogs of highest weight spherical harmonics) satisfy $\int_\gamma|\varphi_j|^2\,ds\sim\hbar^{-1/2}$ along the equator, $\gamma$. So, in particular, the restriction bound for these eigenfunctions is certainly non-generic in the sense of Definition 1.1. However, it is not hard to show (see Sect. 4) that the meridian great circles, while obviously also periodic geodesics, have associated joint eigenfunctions with very different restriction bounds. These geodesics lie in the base space projection of a maximal Lagrangian torus. The zonal harmonics have $\hbar$-microsupport on this torus and, as we show in Subsect. 4.1, the latter eigenfunctions have the $L^2$-restriction bound $\int_\gamma|\varphi_j|^2\,ds\sim\log\frac{1}{\hbar}$ along generic curves, $\gamma$, passing through the poles. These examples show that the estimates in Theorems 1 and 2 are sharp. In the case of exceptional bicharacteristics, we prove

Theorem 3. Let $P_j(\hbar)$; $j=1,2$ be an Eliasson non-degenerate, QCI system on a surface, $(M,g)$. Then,
• (i) When $\gamma$ is the projection of a bicharacteristic segment of $p_1$ contained in $P^{-1}(B_{reg})$,
\[ \int_\gamma |\varphi_j(s;\hbar)|^2\,ds = O_{|\gamma|}(1). \]
• (ii) When $\gamma$ is the projection of a singular joint orbit in $P^{-1}(B_{sing})$,
\[ \int_\gamma |\varphi_j(s;\hbar)|^2\,ds = O_{|\gamma|}(\hbar^{-1/2}). \]
Moreover, there exists a constant $c_\gamma > 0$ depending only on the curve $\gamma$, and a subsequence of joint eigenfunctions, $\varphi_{j_k}$; $k=1,2,\ldots$ such that for $\hbar\in(0,\hbar_0]$,
\[ \int_\gamma |\varphi_{j_k}(s;\hbar)|^2\,ds \ge c_\gamma\,\hbar^{-1/2} \quad\text{when } \gamma \text{ is stable}, \]
\[ \int_\gamma |\varphi_{j_k}(s;\hbar)|^2\,ds \ge c_\gamma\,\hbar^{-1/2}|\log\hbar|^{-1} \quad\text{when } \gamma \text{ is unstable}. \]
It is proved in [TZ3] (see Lemma 3) that unless $(M,g)$ is a flat torus, the joint flow $\Phi_t$ always possesses at least one singular orbit (see also [L,LS]). In the case where $\dim M = 2$ this orbit must be one-dimensional (i.e. a geodesic). Thus, the second estimate (ii) in Theorem 3 is generically attained in the QCI case and therefore, up to a power of $\log\lambda$, the maximal $L^2$-restriction bound in [BGT] is attained.

Remark 1.3. Examples to which Theorems 1 and 3 apply include: QCI Laplacians on ellipsoids (with distinct axes), surfaces of revolution, Liouville surfaces. Less well-known examples include QCI Laplacians associated with spherical metrics obtained by reducing the Goryachev-Chaplygin top as well as those constructed in [DM]. In both of the last two classes of examples, the integral in involution, $p_2$, is a cubic polynomial in the momentum variables. Finally, there are also examples known where $p_2$ is quartic in the momenta [Mi]. In addition, our results apply to inhomogeneous QCI systems such as Neumann oscillators, Euler and Kowalevsky tops and the spherical pendulum as well as many others. We have stated our results for surfaces because the formulation is quite elegant in that case. It is not hard to extend the analysis here to higher dimensions under the appropriate notion of a generic submanifold, but the formulation of results becomes more cumbersome. We hope to address this elsewhere.

Remark 1.4. In analogy with the specific results for QCI eigenfunctions in Theorems 1 and 3 above, it is natural to try to determine $L^2$ (or $L^p$) eigenfunction bounds along “typical” curves on general Riemann surfaces $(M^2,g)$ by varying the standard restriction estimates over appropriate moduli spaces of curve segments. We hope to address this point elsewhere.
2. Generic (Joint) Eigenfunction Restriction Bounds Along Curves

We say that $P(\hbar)\in Op_{\hbar,cl}(S^{m,k}(T^*M))$ if locally it has Schwartz kernel
\[ P(x,y;\hbar) = (2\pi\hbar)^{-n}\int_{\mathbb{R}^n} e^{i(x-y)\xi/\hbar}\,p(x,\xi;\hbar)\,d\xi, \]
where $p(x,\xi;\hbar)\sim\sum_{j=0}^\infty p_j(x,\xi)\,\hbar^{-k+j}$ and $p_j\in S^{m-j}_{1,0}(T^*M)$ with $\hbar\in(0,\hbar_0]$. From now on, without loss of generality, we assume that $P_j(\hbar)\in Op_{\hbar,cl}(S^{m,0})$, $P_1(\hbar)$ is elliptic in the classical sense and the $P_j(\hbar)$'s are self-adjoint. In this section, we get generic asymptotic bounds for $\int_\gamma|\varphi_j(s;\hbar)|^2\,ds$ in the case where the $\varphi_j$'s are joint eigenfunctions of $P_1(\hbar)$ and $P_2(\hbar)$. Here, the term generic refers to a non-degeneracy condition on the QCI system along the (generalized) cylinder $C_\gamma$ given in Definition 1.1.
2.1. Proof of Theorem 1. We assume here that $(M,g)$ is a compact surface with QCI quantum Hamiltonian given by $P_1(\hbar)$ and the quantum integral in involution is $P_2(\hbar)$, where we assume that its principal symbol, $p_2$, satisfies the Morse condition in Definition 1.1. The joint spectrum of $P_1(\hbar)$ (resp. $P_2(\hbar)$) will be denoted by $\lambda_j^{(1)}(\hbar)$ (resp. $\lambda_j^{(2)}(\hbar)$) with $j=1,2,3,\ldots$. Let $\rho\in\mathcal{S}(\mathbb{R})$ satisfy $\rho(u)\ge 0$ with $\rho(0)=1$ and $\hat\rho\in C_0^\infty([-\epsilon,\epsilon])$ with $\epsilon>0$ sufficiently small. For fixed $x\in M$, we form the joint unintegrated trace attached to the level $(p_1,p_2) = (E_1,E)$ given by
\[ I_E(x;\hbar) := \sum_{j=1}^\infty \rho\big(\hbar^{-1}[\lambda_j^{(1)}(\hbar)-E_1]\big)\,\rho\big(\hbar^{-1}[\lambda_j^{(2)}(\hbar)-E]\big)\,|\varphi_j(x;\hbar)|^2. \tag{2.3} \]
Our task is to obtain a locally uniform asymptotic bound (in $E$) for $\int_a^b I_E(x(\tau);\hbar)\,d\tau$ as $\hbar\to 0^+$. Writing the usual small-time $\hbar$-Fourier integral operator (FIO) parametrices for $e^{itP_1(\hbar)}$ and $e^{isP_2(\hbar)}$ and taking Fourier transforms in (2.3) gives:
\[ I_E(x;\hbar) = (2\pi\hbar)^{-4}\int e^{i\Phi(x,y,\xi,\eta,s,t;E)/\hbar}\,a_\chi(y,\eta,\xi;\hbar)\,\hat\rho(t)\,\hat\rho(s)\,ds\,dt\,d\xi\,d\eta\,dy + O(\hbar^\infty). \tag{2.4} \]
In (2.4), because of the cutoff function $\chi$ appearing in the amplitude (see (2.10) below), the uniform $O(\hbar^\infty)$ remainder follows by successive integration by parts in $t$ and $s$. The total phase function is
\[ \Phi(x,y,\xi,\eta,s,t;E) = \psi_1(x,y,\xi,t) + tE_1 + \psi_2(y,x,\eta,s) + sE, \tag{2.5} \]
where
\[ \psi_1(x,y,\xi,t) = \vartheta_1(x,\xi,t) - y\xi,\qquad \psi_2(y,x,\eta,s) = \vartheta_2(y,\eta,s) - x\eta. \tag{2.6} \]
In (2.6), the $\vartheta_j$; $j=1,2$ satisfy the usual eikonal initial value problems
\[ \partial_t\vartheta_1 + p_1(x,\partial_x\vartheta_1) = 0,\quad \vartheta_1|_{t=0} = x\xi,\qquad \partial_s\vartheta_2 + p_2(y,\partial_y\vartheta_2) = 0,\quad \vartheta_2|_{s=0} = y\eta. \tag{2.7} \]
From the equations in (2.7) one easily derives the following Taylor expansions for $\vartheta_1(x,\xi,t)$ (resp. $\vartheta_2(y,\eta,s)$) centered at $t=0$ (resp. $s=0$):
\[ \vartheta_1(x,\xi,t) = x\xi - t\,p_1(x,\xi) + O(t^2), \tag{2.8} \]
\[ \vartheta_2(y,\eta,s) = y\eta - s\,p_2(y,\eta) + O(s^2). \tag{2.9} \]
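As a consistency check on (2.8), one can substitute a power-series ansatz into the first equation in (2.7). The computation below is a routine verification of ours rather than part of the original argument; it also records the $t^2$-coefficient, which the text leaves inside the $O(t^2)$ term. All functions of $p_1$ on the right are evaluated at $(x,\xi)$.

```latex
\begin{align*}
\vartheta_1(x,\xi,t) &= x\xi - t\,p_1 + \tfrac{t^2}{2}\,\partial_\xi p_1\cdot\partial_x p_1 + O(t^3),\\
\partial_t\vartheta_1 &= -p_1 + t\,\partial_\xi p_1\cdot\partial_x p_1 + O(t^2),\\
\partial_x\vartheta_1 &= \xi - t\,\partial_x p_1 + O(t^2),\\
p_1(x,\partial_x\vartheta_1) &= p_1 - t\,\partial_\xi p_1\cdot\partial_x p_1 + O(t^2),\\
\partial_t\vartheta_1 + p_1(x,\partial_x\vartheta_1) &= O(t^2).
\end{align*}
```

The same computation with $(y,\eta,s,p_2)$ in place of $(x,\xi,t,p_1)$ gives (2.9).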
In (2.4) the amplitude is of the form
\[ a_\chi(x,y,\xi,\eta,s,t;\hbar) = \chi(E_1 - p_1(x,\xi))\,\chi(E - p_2(y,\eta))\,\chi(y-x)\,a(x,y,\eta,\xi,s,t;\hbar), \tag{2.10} \]
where $a\sim\sum_{j=0}^\infty a_j\hbar^j$, $a_0\ge\frac{1}{C_0}>0$ and $\chi\in C_0^\infty(\mathbb{R})$ with $\chi(x) = 0$ for $|x|\ge 1$ and $\chi(x)=1$ for $|x|\le 1/2$.
Since the integral in (2.4) is absolutely convergent, we carry out the $(y,\eta)$-integration first and get that
\[ I_E(x;\hbar) = (2\pi\hbar)^{-4}\int \exp\big(i[t(E_1-p_1)(x,\xi) + O(t^2)]/\hbar\big)\,\hat\rho(t)\,I(x,\xi,s,t;\hbar)\,ds\,d\xi\,dt + O(\hbar^\infty), \tag{2.11} \]
where
\[ I(x,\xi,s,t;\hbar) := \int e^{i\Psi(x,s;y,\eta)/\hbar}\,b_\chi(x,y,\xi,\eta,s,t;\hbar)\,\hat\rho(s)\,dy\,d\eta, \tag{2.12} \]
and where $b\in S^0_{cl}(1)$ with $b\sim\sum_j b_j\hbar^j$, $b_0\ge 1/C_0>0$ and $b_\chi$ has the same properties as $a_\chi$ in (2.10). The phase function is
\[ \Psi(x,s;y,\eta) = \langle x-y,\xi-\eta\rangle + s(E - p_2(y,\eta)) + O_{y,\eta}(s^2). \tag{2.13} \]
Since $\det(\Psi''_{y,\eta}) = 1 + O(s)$ and the $s$-support of $b_\chi$ can be taken arbitrarily small, one can apply stationary phase (with parameters) in the $(y,\eta)$-variables in (2.12). The critical point equations for $(y,\eta)$ are
\[ \eta = \xi + s\,\partial_y p_2(y,\eta) + O(s^2),\qquad y = x + s\,\partial_\eta p_2(y,\eta) + O(s^2). \tag{$*$} \]
By a straightforward computation, $I_E(x,\hbar)$ equals
\[ (2\pi\hbar)^{-2}\int \exp\big(i[t(E_1-p_1)(x,\xi) + s(E-p_2)(x,\xi) + O_{x,\xi}(s^2) + O_{x,\xi}(t^2)]/\hbar\big)\,c(x,\xi,s,t;\hbar)\,d\xi\,dt\,ds + O(\hbar^\infty), \tag{2.14} \]
where $c\in S^0_{cl}(1)$ with $c(x,\xi,s,t;\hbar)\sim\sum_{j=0}^\infty c_j(x,\xi,s,t)\,\hbar^j$, where the $c_j\in C_0^\infty$. Next, we make a polar variables decomposition in the $\xi$-variables in (2.14), which is legitimate since by assumption, $p_1$ is real principal type on the energy shell $p_1^{-1}(E_1)$ and so, $|\partial_\xi p_1|\ge\frac{1}{C}>0$ when $p_1\sim E_1$ and $\mathrm{supp}\,c\subset[-\epsilon,\epsilon]^2\times p_1^{-1}[E_1-\epsilon,E_1+\epsilon]$. We note that in the case of a Schrödinger operator, $p_1(x,\xi) = |\xi|_g^2 + V(x)$ and so, $\xi\cdot\partial_\xi p_1 = 2|\xi|_g^2$ by Euler homogeneity. So, as long as
\[ \gamma\cap\{x\in M;\ V(x) = E_1\} = \emptyset, \tag{$*$} \]
the condition $\partial_\xi p_1(x,\xi)\neq 0$ is satisfied for $(x,\xi)\in p_1^{-1}(E_1)\cap\pi^{-1}(\gamma)$. In the homogeneous case, where $p_1 = |\xi|_g$ and $E_1 = 1$ the condition $(*)$ is automatically satisfied. Since by assumption $\partial_\xi p_1\neq 0$ near $p_1^{-1}(E_1)$ we can choose $p_1$ as a local coordinate on $\pi^{-1}(x)$ near $(x,\xi_0)\in p_1^{-1}(E_1)$. Then, we put $p_1 = rE_1$ and extend it to a local polar coordinate system $(r,\omega) : \pi^{-1}(x)\to\mathbb{R}^2$ near $(x,\xi_0)\in p_1^{-1}(E_1)$. Cover a neighbourhood of $\pi^{-1}(x)\cap p_1^{-1}(E_1)$ by small open sets and choose a partition of unity subordinate to the covering. Then, make the change of variables $(p_1,\omega)\to\xi$ in each open set and sum over the partition to get
\[ I_E(x;\hbar) = (2\pi\hbar)^{-2}\int \exp\big(i[tE_1(1-r) + s(E - p_2(x,r\omega)) + O_{x,\xi}(s^2) + O_{x,\xi}(t^2)]/\hbar\big)\,c(x,r\omega,s,t;\hbar)\,r\,dr\,d\omega\,dt\,ds + O(\hbar^\infty), \tag{2.15} \]
where $\omega\in p_1^{-1}(E_1)\cap\pi^{-1}(x)$ is a (generalized) angle variable and $d\omega_x$ denotes Liouville measure on $p_1^{-1}(E_1)\cap\pi^{-1}(x)$. In the following and in (2.15) above, we have suppressed the dependence of $d\omega_x$ on $x\in M$. One final application of stationary phase in the $(r,t)$-variables in (2.15) gives
\[ I_E(x;\hbar) = (2\pi\hbar)^{-1}\int \exp\big(i[s(E-p_2(x,\omega)) + O(s^2)]/\hbar\big)\,\tilde c(x,\omega,s;\hbar)\,d\omega\,ds + O(\hbar^\infty), \tag{2.16} \]
where $\tilde c\in S^0_{cl}(1)$. The remainder of the proof of Theorem 1 involves integrating the restriction of $I_E(x;\hbar)$ in (2.16) to $x = x(\tau)\in\gamma$ and then carrying out a detailed analysis of the result under the generic Morse condition in Definition 1.1. From (2.16),
\[ \int_a^b I_E(x(\tau);\hbar)\,d\tau = (2\pi\hbar)^{-1}\int \exp\big(i[s(E-p_2(x(\tau),\omega)) + O(s^2)]/\hbar\big)\,\tilde c(x(\tau),\omega,s;\hbar)\,d\omega\,d\tau\,ds + O(\hbar^\infty) \tag{2.17} \]
\[ = (2\pi\hbar)^{-1}\int e^{isE/\hbar}\,I_\gamma(s;\hbar)\,ds + O(\hbar^\infty). \tag{2.18} \]
In (2.17) it is useful to absorb the $O(s^2)$-term into the $p_2$-term in the phase and write $p_2(x(\tau),\omega;s) := p_2(x(\tau),\omega) + O(s)$, uniformly in $(\tau,\omega)\in[a,b]\times(\delta_1,\delta_2)$; $\delta_j\in\mathbb{R}$, $j=1,2$. Also, an application of Fubini ensures that the $s$-integral can be carried out last and this shows that the result is uniform in the energy values $E$. By carrying out the $s$-integration last, $E$ will always appear in a harmless, linear fashion in the phase only. So, for $\hbar\in(0,\hbar_0]$ it remains to estimate the integral:
\[ I_\gamma(s;\hbar) = \int_a^b\int_{\delta_1}^{\delta_2} e^{-isp_2(x(\tau),\omega;s)/\hbar}\,\tilde c(x(\tau),\omega,s;\hbar)\,d\omega\,d\tau. \tag{2.19} \]
Because of the Morse assumption in Definition 1.1, the $(\omega,\tau)$-critical points of $p_2(x(\tau),\omega)$ are isolated and so, without loss of generality, we assume that there is a single critical point at $(\tau_0,\omega_0)$. Let $B_\delta\subset[a,b]\times(\delta_1,\delta_2)$ be a small $\delta$-ball centered at $(\tau_0,\omega_0)$ and $\chi_0\in C_0^\infty(B_\delta)$ with $\chi_0 = 1$ in $B_{\delta/2}$. We then choose $\chi_1\in C^\infty$ so that $\chi_0+\chi_1 = 1$ and split up the integral
\[ I_\gamma(s;\hbar) = \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\,\tilde c(x(\tau),\omega,s;\hbar)\,\chi_0(\tau,\omega)\,d\omega\,d\tau + \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\,\tilde c(x(\tau),\omega,s;\hbar)\,\chi_1(\tau,\omega)\,d\omega\,d\tau =: I_\gamma^{(0)}(s;\hbar) + I_\gamma^{(1)}(s;\hbar). \tag{2.20} \]
First, we deal with the second integral $I_\gamma^{(1)}(s;\hbar)$ on the RHS of (2.20): For $(\tau,\omega)\in\mathrm{supp}\,\chi_1$, we have that for $|s|$ sufficiently small,
\[ \max\{\,|\partial_\tau p_2(x(\tau),\omega;s)|,\ |\partial_\omega p_2(x(\tau),\omega;s)|\,\}\ge\frac{1}{C_0}>0. \tag{2.21} \]
By the implicit function theorem, in the case where $|\partial_\omega p_2(x(\tau),\omega;s)|\ge\frac{1}{C_0}$, one can make a local change of variables $(\tau,\omega)\to(\tau,p_2(x(\tau),\omega;s))$ in $I_\gamma^{(1)}(s;\hbar)$. Alternatively, when $|\partial_\tau p_2(x(\tau),\omega;s)|\ge\frac{1}{C_0}$, one can make the change of variables $(\tau,\omega)\to(p_2(x(\tau),\omega;s),\omega)$. So, in either case after making a change of variables, one gets
\[ (2\pi\hbar)^{-1}\int e^{iEs/\hbar}\,I_\gamma^{(1)}(s;\hbar)\,ds = (2\pi\hbar)^{-1}\int e^{is(E-\theta)/\hbar}\,\tilde c_1(s,\theta,v;\hbar)\,d\theta\,dv\,ds, \tag{2.22} \]
where, again $\tilde c_1\in S^0_{cl}(1)$ with compact support in all variables. Finally, another application of stationary phase in the $(s,\theta)$-variables gives
\[ (2\pi\hbar)^{-1}\int e^{iEs/\hbar}\,I_\gamma^{(1)}(s;\hbar)\,ds = O(1). \tag{2.23} \]
Moreover, the $O(1)$-bound on the RHS in (2.23) is clearly uniform in $E$. We now deal with $I_\gamma^{(0)}(s;\hbar)$. The Morse assumption and implicit function theorem imply that the critical point equations
\[ \partial_\tau p_2(x(\tau),\omega;s) = 0,\quad \partial_\omega p_2(x(\tau),\omega;s) = 0,\quad \tau(0) = \tau_0,\ \omega(0) = \omega_0 \]
have unique local solutions $\tau(s)$ and $\omega(s)$ which are smooth for $|s|\le\frac{1}{C}$ with $C>0$ sufficiently large. We apply stationary phase in $(\tau,\omega)$ to expand the first integral $I_\gamma^{(0)}(s;\hbar)$ on the RHS of (2.20). First, we split up the domain of $s$-integration and write
\[ \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\chi_0(\tau,\omega)\,\tilde c(x(\tau),\omega,s;\hbar)\,d\tau\,d\omega = 1_{|s|\le\hbar}\int e^{-isp_2(x(\tau),\omega;s)/\hbar}\chi_0(\tau,\omega)\,\tilde c(x(\tau),\omega,s;\hbar)\,d\tau\,d\omega + 1_{|s|\ge\hbar}\int e^{-isp_2(x(\tau),\omega;s)/\hbar}\chi_0(\tau,\omega)\,\tilde c(x(\tau),\omega,s;\hbar)\,d\tau\,d\omega. \tag{2.24} \]
Clearly,
\[ (2\pi\hbar)^{-1}\int_{|s|\le\hbar} e^{iEs/\hbar}\,I_\gamma^{(0)}(s;\hbar)\,ds = O(1). \tag{2.25} \]
An application of stationary phase with parameters ([Ho] Theorem 7.7.5) in the second integral gives
\[ 1_{|s|\ge\hbar}\,I_\gamma^{(0)}(s;\hbar) = \hbar\,s^{-1}\,\tilde c_0(x(\tau(s)),\omega(s),s)\,1_{|s|\ge\hbar}\exp\big[-isp_2(x(\tau(s)),\omega(s);s)/\hbar\big] + O(|s|^{-2}\hbar^2). \tag{2.26} \]
So, integrating (2.26) over $\{s;\ 1\ge|s|\ge\hbar\}$ gives
\[ \Big|(2\pi\hbar)^{-1}\int_{1\ge|s|\ge\hbar} e^{iEs/\hbar}\,I_\gamma^{(0)}(s;\hbar)\,ds\Big| \le C_1\hbar^{-1}\int_{1\ge|s|\ge\hbar}\frac{\hbar}{|s|}\,ds + C_2\hbar^{-1}\int_{1\ge|s|\ge\hbar}\frac{\hbar^2}{s^2}\,ds = O(|\log\hbar|) + O(1) = O(|\log\hbar|). \tag{2.27} \]
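The mechanism behind the logarithm in (2.27) can be seen in a toy model, which is an illustration of ours and not the integral treated in the proof. Replace the cutoff $\chi_0\tilde c$ by a Gaussian and the phase by an elliptic quadratic form, so that $I(s;\hbar) = \int_{\mathbb{R}^2} e^{-(\tau^2+\omega^2)}e^{-is(\tau^2+\omega^2)/\hbar}\,d\tau\,d\omega = \pi/(1+is/\hbar)$. Then $|I(s;\hbar)| = \pi\hbar/\sqrt{\hbar^2+s^2}$, which behaves like $\hbar/|s|$ for $|s|\ge\hbar$ and is bounded for $|s|\le\hbar$, and $(2\pi\hbar)^{-1}\int_{|s|\le 1}|I|\,ds = \mathrm{arcsinh}(1/\hbar)\approx\log(2/\hbar)$. The function name `model_bound` is hypothetical.

```python
import math

def model_bound(hbar: float, n: int = 4000) -> float:
    """(2*pi*hbar)^{-1} times the integral over |s| <= 1 of |I(s; hbar)|,
    for the Gaussian model I(s; hbar) = pi / (1 + 1j*s/hbar).
    This reduces to the integral of 1/sqrt(hbar^2 + s^2) over 0 <= s <= 1.
    """
    def f(s: float) -> float:
        return 1.0 / math.sqrt(hbar * hbar + s * s)

    # Region |s| <= hbar (the analogue of (2.25)): uniform midpoint rule.
    h = hbar / n
    inner = sum(f((k + 0.5) * h) for k in range(n)) * h

    # Region hbar <= |s| <= 1 (the analogue of (2.27)): substitute s = exp(v),
    # ds = s dv, so that the 1/|s| behaviour is resolved on a uniform v-grid.
    a = math.log(hbar)
    dv = -a / n
    outer = 0.0
    for k in range(n):
        s = math.exp(a + (k + 0.5) * dv)
        outer += f(s) * s * dv
    return inner + outer

for hbar in (1e-2, 1e-3, 1e-4, 1e-5):
    print(hbar, model_bound(hbar), math.asinh(1.0 / hbar), abs(math.log(hbar)))
```

In this model the contribution from $|s|\le\hbar$ stays bounded while the contribution from $\hbar\le|s|\le 1$ grows like $|\log\hbar|$, mirroring the split in (2.24).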
Combining (2.23), (2.25) and (2.27) and using the fact that each of these estimates is uniform in $E$ implies that for $\hbar\in(0,\hbar_0]$,
\[ \sup_{\{E;\ (E_1,E)\in P(T^*M)\}}\int_a^b I_E(x(\tau);\hbar)\,d\tau = O(|\log\hbar|). \]
Rewriting the last estimate, we have proved that
\[ \sup_{\{E;\ (E_1,E)\in P(T^*M)\}}\ \sum_j \rho\big(\hbar^{-1}[\lambda_j^{(1)}(\hbar)-E_1]\big)\,\rho\big(\hbar^{-1}[\lambda_j^{(2)}(\hbar)-E]\big)\times\int_\gamma|\varphi_j|^2\,ds = O(|\log\hbar|). \tag{2.28} \]
We claim that there exists a constant $C_2>0$ (independent of $\hbar\in(0,\hbar_0]$ and the joint eigenfunctions $\varphi_j$) such that for any $\hbar\in(0,\hbar_0]$, and $(\lambda_j^{(1)}(\hbar),\lambda_j^{(2)}(\hbar))\in\mathrm{Spec}(P_1(\hbar),P_2(\hbar))$, with $|\lambda_j^{(1)}(\hbar)-E_1|\le C_1\hbar$,
\[ \inf_E |\lambda_j^{(2)}(\hbar)-E|\le C_2\hbar. \tag{$*$} \]
To see this, we argue by contradiction: Assume that $(*)$ does not hold. Then there exists a sequence $(\hbar_m)_{m=1}^\infty$ with $\hbar_m\to 0^+$ as $m\to\infty$ for which $(*)$ is violated. Let $\hbar\in(\hbar_m)_{m=1}^\infty$ and treat $\hbar>0$ as an adiabatic parameter. Consider the $\hbar$-pseudodifferential operator $P(\hbar) := \hbar^{-2}[P_1(\hbar)-\lambda_j^{(1)}(\hbar)]^2 + \hbar^{-2}[P_2(\hbar)-\lambda_j^{(2)}(\hbar)]^2$ and consider $p_{k,\hbar} := \hbar^{-2}(p_k-\lambda_j^{(k)}(\hbar))^2$ and $P_{k,\hbar} := \hbar^{-2}[P_k(\hbar)-\lambda_j^{(k)}(\hbar)]^2$; $k=1,2$. Then, our assumption implies that for any $\hbar\in(\hbar_m)_{m=1}^\infty$, $p_{2,\hbar}(x,\xi)\ge C_2^2$ when $p_{1,\hbar}(x,\xi)\le C_1^2$, and thus $P(\hbar) = P_{1,\hbar}+P_{2,\hbar}$ is $\hbar$-elliptic. One then constructs an $\hbar$-pseudodifferential parametrix $Q(\hbar)$ with $Q(\hbar)P(\hbar) = Id + O(\hbar^\infty)_{L^2\to L^2}$. Applying $Q(\hbar)$ to both sides of the equation $P(\hbar)\varphi_j^{(\hbar)} = 0$ implies that
\[ \|\varphi_j^{(\hbar)}\|_{L^2} = O(\hbar^\infty). \tag{2.29} \]
But since $\hbar>0$ can be taken arbitrarily small, (2.29) contradicts the fact that all joint eigenfunctions are $L^2$-normalized. So, after possibly rescaling $\rho$ and using that $\rho\ge 0$ with $\rho(0)=1$ it follows from $(*)$ that there exists a constant $C_3>0$ (independent of $j$ and $\hbar$) such that for all $j\ge 1$ and $\hbar\in(0,\hbar_0]$,
\[ \sup_{\{E;\ (E_1,E)\in P(T^*M),\ |E_1-\lambda_j^{(1)}(\hbar)|\le C_1\hbar\}}\rho\big(\hbar^{-1}[\lambda_j^{(2)}(\hbar)-E]\big)\ge C_3>0. \]
Since the sum on the LHS of (2.28) has non-negative terms, by restricting to $\{j;\ |\lambda_j^{(1)}(\hbar)-E_1|\le C_1\hbar\}$ and (after possibly rescaling $\rho$) using that $\rho(\hbar^{-1}[\lambda_j^{(1)}(\hbar)-E_1])\ge C_5>0$ for these eigenvalues, one finally gets that
\[ \sup_{\{j;\ |\lambda_j^{(1)}(\hbar)-E_1| = O(\hbar)\}}\int_\gamma|\varphi_j|^2\,ds = O_{|\gamma|}(|\log\hbar|), \]
uniformly in $E$. This finishes the proof of Theorem 1.
Remark 2.5. The sup bound in Theorem 1 is also uniform in the energy parameter, $E_1$. However, for different values of $E_1$ one needs to excise different subvarieties of $M$ (which depend on $E_1$) to ensure that $p_1$ is real principal type on $p_1^{-1}(E_1)$. For example, in the case where $p_1 = |\xi|_g^2 + V(x)$, assumption (A1) requires that $\gamma\cap\{x\in M;\ V(x) = E_1\} = \emptyset$.

3. Non-Generic Curves

In this section, we turn to the proof of Theorem 3. In contrast to Theorem 1, this result deals with the $L^2$-restriction bounds of joint eigenfunctions of $P_1(\hbar)$ and $P_2(\hbar)$ with $\hbar$-microsupports along singular orbits of the joint bicharacteristic flow of $H_{p_1}$ and $H_{p_2}$. We show that, up to $\log\hbar$-factors, the maximal $L^2$-restriction bound in [BGT] is attained along the base projections of these orbits. In the special homogeneous case, these projections are certain (exceptional) geodesics. For example, as we discuss in Sect. 4, in the case of surfaces of revolution, the equator is such an exceptional geodesic. Just as in the previous section, one is reduced to estimating the integral $I_\gamma(s;\hbar)$. However, unlike the generic case, the phase function $\Psi(\tau,\omega)\in C^\infty(C_\gamma)$ will now have degenerate critical points and we make a change of variables to classical Birkhoff normal form along these singular orbits to compute the asymptotics.

3.1. Orbits of the joint flow, $\Phi_t$. Here, we describe an important class of exceptional curves, $\gamma$, which do not satisfy the Morse assumption in Definition 1.1. As we have already pointed out in the Introduction, it is not difficult to see that in the homogeneous case, geodesics are distinguished as far as $L^p$-restriction bounds are concerned (see for example [BGT]). In the QCI case, the same is true for the bicharacteristics of general inhomogeneous Hamiltonians. Moreover, as we will show, the nature of bicharacteristics vis-a-vis the singular Lagrangian foliation of $T^*M$ also plays a very important role as far as restriction bounds are concerned.

First, we give a slightly different characterization of what it means for a curve $\gamma$ to be generic. This consists of a series of simple but useful geometric lemmas, the main result being Proposition 3.6. Fix a smooth curve $\gamma : [a,b]\to M$ and let $(\tau(0),\omega(0))\in C_\gamma$ be any point on the cylinder. We define $\Psi : C_\gamma\to\mathbb{R}$ by $\Psi = \iota^*p_2$, where $\iota : C_\gamma\to T^*M$ is the standard inclusion map. So, in terms of the local coordinates $(\tau,\omega) : C_\gamma\to[a,b]\times(\delta_1,\delta_2)$, $\Psi(\tau,\omega) = p_2(x(\tau),\omega)$. Modulo an $O(s)$-error (which is negligible), this is the phase function in (2.19), in the integral $I_\gamma(s;\hbar)$.
The point $(\tau(0),\omega(0))$ is critical for $\Psi : C_\gamma\to\mathbb{R}$ if for every smooth curve segment $\mu(s) := \{(\tau(s),\omega(s))\in C_\gamma;\ s\in(-\epsilon,\epsilon)\}$ passing through the initial point,
\[ \frac{\partial}{\partial s}\Psi(\tau(s),\omega(s))\Big|_{s=0} = 0. \tag{3.30} \]
Since $\Psi(\tau(s),\omega(s)) = p_2(\tau(s),\omega(s))$, writing (3.30) out explicitly and applying the chain rule gives:
\[ \partial_x p_2\cdot\partial_s\tau|_{s=0} + \partial_\xi p_2\cdot\partial_s\omega|_{s=0} = 0. \tag{3.31} \]
On the other hand, differentiating the defining equation $p_1(\tau(s),\omega(s)) = 1$ in $s$ gives
\[ \partial_x p_1\cdot\partial_s\tau|_{s=0} + \partial_\xi p_1\cdot\partial_s\omega|_{s=0} = 0. \tag{3.32} \]
The following lemma is an immediate consequence of (3.31) and (3.32).

Lemma 4. A point $z_0 = (\tau(0),\omega(0))\in C_\gamma$ is critical for $\Psi : C_\gamma\to\mathbb{R}$ if and only if $T_{z_0}C_\gamma\subset\ker(dp_1)(z_0)\cap\ker(dp_2)(z_0)$.

The following simple geometric result is central to our proof of Theorem 3 since it describes the bicharacteristics that are non-generic.

Proposition 3.6. Let $\gamma\subset\pi(\tilde\gamma)$ where $\tilde\gamma = \Lambda(z_0)$ is a joint orbit of $\exp t_jH_{p_j}$; $j=1,2$ through the point $z_0\in C_\gamma$ with $\dim\tilde\gamma\ge 1$. Then, if $\tilde\gamma\subset C_\gamma$, the curve $\gamma$ is not generic.

Proof. First, the real principal type assumption combined with the implicit function theorem imply that $C_\gamma = \pi^{-1}(\gamma)\cap p_1^{-1}(E_1)$ is a smooth two-dimensional submanifold of $T^*M$. We split the analysis into two cases.

Case 1. When $\tilde\gamma$ is a two-dimensional Lagrangian torus, we have that locally $\tilde\gamma = p_1^{-1}(E_1)\cap p_2^{-1}(E)$ for some $E\in\mathbb{R}$. Since by assumption $\tilde\gamma\subset C_\gamma$, and both are two-manifolds, clearly $C_\gamma = \tilde\gamma$. Then, $C_\gamma$ is non-generic since $\Psi = p_2|_{C_\gamma} = E$ and so, all points $z\in C_\gamma$ are critical for $\Psi$.

Case 2. Here we assume that $\tilde\gamma$ is a singular joint orbit of dimension one (see Subsect. 3.2 below). Then, for all $z\in\tilde\gamma$, $dp_2(z) = \lambda(z)\cdot dp_1(z)$, for some $\lambda(z)\neq 0$. So, from Lemma 4, $z_0\in\tilde\gamma$ is a critical point of $\Psi : C_\gamma\to\mathbb{R}$ if and only if $T_{z_0}C_\gamma\subset\ker(dp_1)(z_0)$. This inclusion is always satisfied since $C_\gamma\subset p_1^{-1}(E_1)$. As a result, all points $z\in\tilde\gamma$ along the one-dimensional orbit are critical for $\Psi$ and so the latter is not Morse.
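Proposition 3.6 can be tested by hand on the round unit sphere. The following sketch is an illustration of ours, with several assumptions: the metric is $g = dr^2 + \sin^2 r\,d\theta^2$, $p_1 = |\xi|_g$ with $E_1 = 1$, $p_2 = p_\theta$, and the unit cosphere fibre is parametrized by an angle $\omega$ via $\xi = \cos\omega\,dr + \sin r\,\sin\omega\,d\theta$, so that $\iota^*p_2 = \sin(r(\tau))\sin\omega$ along a curve with radial coordinate $r(\tau)$. The helper names are hypothetical. The code compares the Hessian determinant at a critical point for a curve crossing the equator transversally with that for the equator itself.

```python
import math

def phase(r_of_tau, tau, omega):
    """p_theta restricted to the unit cosphere bundle over the curve:
    sin(r(tau)) * sin(omega) on the unit sphere (assumed parametrization)."""
    return math.sin(r_of_tau(tau)) * math.sin(omega)

def grad_and_hessian_det(r_of_tau, tau, omega, h=1e-4):
    """Central finite differences for the gradient and the Hessian determinant."""
    def f(t, w):
        return phase(r_of_tau, t, w)
    f0 = f(tau, omega)
    f_t = (f(tau + h, omega) - f(tau - h, omega)) / (2 * h)
    f_w = (f(tau, omega + h) - f(tau, omega - h)) / (2 * h)
    f_tt = (f(tau + h, omega) - 2 * f0 + f(tau - h, omega)) / h**2
    f_ww = (f(tau, omega + h) - 2 * f0 + f(tau, omega - h)) / h**2
    f_tw = (f(tau + h, omega + h) - f(tau + h, omega - h)
            - f(tau - h, omega + h) + f(tau - h, omega - h)) / (4 * h**2)
    return (f_t, f_w), f_tt * f_ww - f_tw**2

def crossing(tau):
    # Curve crossing the equator transversally at tau = 0 (r is the polar angle).
    return math.pi / 2 + tau

def equator(tau):
    # The equator itself: r is constant along the curve.
    return math.pi / 2

for name, curve in (("crossing", crossing), ("equator", equator)):
    grad, det = grad_and_hessian_det(curve, 0.0, math.pi / 2)
    print(name, grad, det)
```

For the transversal curve the critical point at $(\tau,\omega) = (0,\pi/2)$ has nonzero Hessian determinant, so it is non-degenerate there. For the equator the restricted function does not depend on $\tau$, every point $(\tau,\pi/2)$ is critical and the determinant vanishes, in line with Case 2 of the proof.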
3.2. Singular leaves of the Lagrangian foliation. Before taking up the proof of Theorem 3 we collect here some basic facts about the geometry of integrable systems and their singular sets. We refer the reader to [TZ3,VN2] for further details. Given the moment map $P = (p_1,p_2)$, the singular variety of the corresponding integrable system is defined to be the set $\Sigma_{sing} = \{(x,\xi)\in T^*M;\ dp_1\wedge dp_2(x,\xi) = 0\}$. We now recall some elementary results about $\Sigma_{sing}$ which we will need later on. First, given the joint flow $\Phi_t : T^*M\to T^*M$ defined by $\Phi_t(x,\xi) = \exp t_1H_{p_1}\circ\exp t_2H_{p_2}(x,\xi)$; $t = (t_1,t_2)\in\mathbb{R}^2$, we observe that
\[ \Phi_t(\Sigma_{sing}) = \Sigma_{sing}, \tag{3.33} \]
which follows immediately from the fact that $\{p_1,p_2\} = 0$ and $\Phi_t$ is a diffeomorphism. The singular set $\Sigma_{sing}$ consists of a union of orbits of the joint flow $\Phi_t$; $t\in\mathbb{R}^2$.

Definition. Following [TZ3], we say that an orbit $\Lambda$ of the joint flow $\Phi_t$ is singular if it is not Lagrangian; that is, if $\dim\Lambda\le 1$.

3.2.1. Eliasson nondegeneracy. For our second main result (Theorem 3), we will need to make a non-degeneracy assumption on the integrable system with moment map $P = (p_1,p_2)$. We now give a brief description of this condition. For more detailed treatment, see [VN1,VN2,TZ3]. Let $\mathfrak{p} = \mathbb{R}\{p_1,p_2\}\subset(C^\infty(T^*M-0),\{\cdot,\cdot\})$ be the standard abelian subalgebra with Poisson bracket. Then, given a singular orbit $\Lambda(v) = \exp t_1H_{p_1}\circ\exp t_2H_{p_2}(v)$ through a point $v\in P^{-1}(B_{sing})$ of rank $k\le 1$, we note that the Hessians $d_v^2p_j$; $j=1,2$, determine an abelian subalgebra $d_v^2\mathfrak{p}\subset S^2(K/L,\omega_v)^*$ of quadratic forms on the reduced symplectic subspace $K/L$, where we put
\[ K = \ker dp_1(v)\cap\ker dp_2(v),\qquad L = \mathrm{span}(H_{p_1}(v),H_{p_2}(v)). \]
Definition. We say that the orbit $\Lambda(v)$ is Eliasson non-degenerate of rank $k\le 1$ if $d_v^2\mathfrak{p}$ is a Cartan subalgebra of $S^2(K/L,\omega_v)^*$.

Lemma 5. Assume that the integrable system with moment map $P = (p_1,p_2)$ is Eliasson non-degenerate. Then, $\Sigma_{sing}$ is a finite union of orbits of the joint flow, $\Phi_t$, with dimension $\le 1$. The latter are diffeomorphic to open intervals, circles and isolated points.

Proof. From (3.33) it follows that $\Sigma_{sing}$ is a finite union of joint orbits of the joint flow $\Phi_t$ and so has dimension $\le 2$. The Eliasson non-degeneracy condition (3.2.1) implies that $\dim\Sigma_{sing}\le 1$. As a result, $\Sigma_{sing}$ consists of a union of open intervals, circles and, in the inhomogeneous case, possibly a finite number of isolated points.

Remark. In the homogeneous case where $p_1(x,\xi) = |\xi|_g^2$, the singular orbits are necessarily topological intervals or circles since $P = (p_1,p_2)$ has no isolated critical points.
The Eliasson non-degeneracy assumption implies that $\Sigma_{sing}$ is a finite union of singular orbits for $p_j$; $j=1,2$ and we use this in the next section to analyze the integral (2.19) by microlocalizing near these orbits and applying a classical Birkhoff normal form construction to analyze the resulting integral. We note that the crucial difference between the generic case and the case of a bicharacteristic which lifts to a singular joint orbit lies in the fact that due to the invariance of the integral $p_2$, the computation of $I_\gamma(s;\hbar)$ can be reduced to a single fibre $\pi^{-1}(x)\cap p_1^{-1}(E_1)$ in the latter case. Thus, there is no additional cancellation coming from the computation of the $s$-integral and this is ultimately the reason why the $O(\hbar^{-1/2})$ $L^2$-restriction bound is saturated by these singular orbits.
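3.3. Microlocalization along $C_\gamma$. In the following, it is useful to split up the mapping cylinder $C_\gamma$ as follows:

$$C_\gamma = C_\gamma^{reg} \cup C_\gamma^{sing}, \tag{3.34}$$

where $C_\gamma^{reg}$ (resp. $C_\gamma^{sing}$) denote invariant open neighbourhoods of regular (resp. singular) points of the phase function $\iota^* p_2(\tau,\omega)$. Let $\chi_{reg}(\omega)$ (resp. $\chi_{sing}(\omega)$) be a partition of unity subordinate to the corresponding covering of the parametrizing coordinate interval $(\delta_1, \delta_2)$ of a cross-section of the cylinder $C_\gamma$. We then write

$$I_\gamma(s;\hbar) = \iint e^{-is\,\iota^* p_2(\tau,\omega)/\hbar}\, c(\omega,\tau,s;\hbar)\,\chi_{reg}(\omega)\, d\omega\, d\tau + \iint e^{-is\,\iota^* p_2(\tau,\omega)/\hbar}\, c(\omega,\tau,s;\hbar)\,\chi_{sing}(\omega)\, d\omega\, d\tau =: I_{reg}(s;\hbar) + I_{sing}(s;\hbar). \tag{3.35}$$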
First, we analyze the regular term on the RHS of (3.35).
3.4. Analysis of the regular term. Given $\omega \in \operatorname{supp}\chi_{reg}$, in light of the invariance formula (3.33) it easily follows that for all $\tau \in [a,b]$ and $\omega \in \operatorname{supp}\chi_{reg}$, $d(\iota^* p_2)(\tau,\omega) \neq 0$. Indeed, $\operatorname{rank}(dp_1, dp_2)(\tau,\omega) = 2$ for all $\omega \in \operatorname{supp}\chi_{reg}$ and so, by Lagrange multipliers, the restriction $\iota^* p_2 \in C^\infty(C_\gamma)$ satisfies $d(\iota^* p_2)(\tau,\omega) \neq 0$ for all $(\tau,\omega) \in [a,b] \times \operatorname{supp}\chi_{reg}$. But then one can introduce $\theta = \iota^* p_2$ as a new coordinate on $\operatorname{supp}\chi_{reg}$, and so by the change of variables formula, for some $c \in S^0_{cl}(1)$,

$$(2\pi\hbar)^{-1} \iint e^{i[sE - s\theta]/\hbar}\, c(s,\theta;\hbar)\, d\theta\, ds = O(1). \tag{3.36}$$

The last bound on the RHS of (3.36) follows by stationary phase in $(s,\theta)$ and the estimate is uniform for $E \in \pi_2(P(T^*M))$, where $\pi_2 : (E_1, E) \mapsto E$.
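The $O(1)$ bound in (3.36) can be illustrated numerically. The sketch below is only a sanity check under a strong simplifying assumption: the amplitude is taken to be the product Gaussian $c(s,\theta) = e^{-s^2/2}e^{-\theta^2/2}$, which is not compactly supported and is chosen solely because the $\theta$-integral is then a Gaussian Fourier transform in closed form. The function name `regular_term` and all parameters are hypothetical.

```python
import math

def regular_term(hbar, E, u_max=12.0, n=4800):
    """(2*pi*hbar)^{-1} * double integral of exp(i*s*(E - theta)/hbar) * c(s, theta)
    for the ASSUMED amplitude c(s, theta) = exp(-s^2/2) * exp(-theta^2/2).

    The theta-integral equals sqrt(2*pi) * exp(-s^2 / (2*hbar^2)); the remaining
    s-integral is rescaled by s = hbar*u and evaluated with the trapezoid rule.
    The imaginary part vanishes by symmetry, so only the cosine is kept."""
    h = 2.0 * u_max / n
    total = 0.0
    for k in range(n + 1):
        u = -u_max + k * h
        f = math.cos(u * E) * math.exp(-0.5 * u * u * (1.0 + hbar * hbar))
        total += 0.5 * f if k in (0, n) else f
    return h * total / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    E = 0.7
    for hbar in (0.1, 0.01, 0.001):
        # Stationary phase at (s, theta) = (0, E) predicts the limit c(0, E).
        print(hbar, regular_term(hbar, E), math.exp(-0.5 * E * E))
```

For this amplitude the value stays bounded as $\hbar \to 0^+$ and approaches $c(0,E)$, the value of the amplitude at the critical point $(s,\theta) = (0,E)$ of the phase $s(E-\theta)$.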
3.5. Analysis of the singular term. Here we assume that $\{(x(\tau), \omega);\ a \le \tau \le b\} \subset \operatorname{supp}\chi_{sing}$. So, in particular, $(x(\tau), \omega)$ is contained in an arbitrarily small neighbourhood of $\tilde\gamma$ containing $(x(0), \omega(0))$, where $dp_1 \wedge dp_2(x(0), \omega(0)) = 0$. To deal with the second term in (3.35), it is useful to pass to a convergent singular Birkhoff normal form and write the phase function $\iota^* p_2(\tau,\omega)$ in (3.35) in normal coordinates. The analysis will be split into several cases depending on the nature of the singularity in the phase function.

3.5.1. Singular Birkhoff normal forms. First, we recall that the orbit $\bigcup_{(t_1,t_2)\in\mathbb{R}^2} \exp t_1 H_{p_1} \circ \exp t_2 H_{p_2}(x(0), \omega(0))$ of the joint flow is of dimension $\le 1$. So, it is diffeomorphic to a union of intervals and circles and possibly a finite number of (necessarily isolated) critical points. Since the latter case of fixed points is handled very similarly to the case of 1-D orbits, we only consider here restriction bounds along curves. The literature on general classical (and quantum) Birkhoff normal forms is extensive [G1,G2,ISZ,Z1,Z2] and we focus here on the integrable case where the canonical change of variables to normal form is actually convergent [CP,HS,T2,TZ3,MVN,VN2]. Without loss of generality, we assume here that the singular locus $\tilde\gamma$, the joint orbit of $(x(0), \omega(0))$, consists of a bicharacteristic. Whether or not $\gamma$ is the projection of a periodic bicharacteristic is of no consequence here. Since we are considering the case where $n = 2$, there are only two possibilities: $\gamma$ is either stable (elliptic) or unstable (hyperbolic).

3.5.2. Stable case. Let $\gamma$ be a non-degenerate, stable bicharacteristic in the singular locus $P^{-1}(b) \subset P^{-1}(B_{sing})$. Let $(x,t) : M \to \mathbb{R}^2$ be Fermi coordinates along $\gamma$ centered at the point $x_0 \in \gamma$. We choose the $t$-coordinate to run along the geodesic and the $x$-coordinate is transversal.
By possibly replacing $p_1$ and $p_2$ by appropriate functions $f_j(p_1, p_2)$; $j = 1, 2$ (the corresponding operators $f_j(P_1(\hbar), P_2(\hbar))$; $j = 1, 2$ have the same joint eigenfunctions), one can assume that

$$p_j(x,t,\xi,\sigma) = b_j + \delta_j(\sigma) + \omega_j(\sigma)(x^2 + \xi^2) + O_{t,\sigma}(|x,\xi|^3); \quad j = 1, 2,$$

where $\delta_j(0) = 0$, $\omega_j(0) \neq 0$; $j = 1, 2$ and $\omega_j, \delta_j$ are locally-defined smooth functions near $\sigma = 0$. In the following, $B^*_\delta(0,0) := \{(x,\xi);\ x^2 + \xi^2 < \delta^2\}$ and $B^*_\delta([a,b]) := \{(t,\sigma);\ |\sigma| < \delta,\ a \le t \le b\}$. In this case [TZ3,VN1,VN2], there exists a canonical map

$$\kappa : U_\gamma \longrightarrow B^*_\delta(0,0) \times B^*_\delta([a,b]),$$

where $U_\gamma \supset C_\gamma$ is a small neighbourhood of $C_\gamma$ such that in local coordinates, $\kappa : (x,t;\xi,\sigma) \mapsto (x',t';\xi',\sigma')$, with

$$(\kappa^{-1})^* p_j(x',t';\xi',\sigma') = F_j(x'^2 + \xi'^2, \sigma'); \quad j = 1, 2. \tag{3.37}$$

Here, $\delta > 0$ is a sufficiently small tube radius and $F_j \in C^\infty(B_\delta(0) \times B_\delta(0))$. In (3.37) and in the following, we abuse notation somewhat and write $\kappa$ for both the canonical mapping and its various coordinate representations. By possibly replacing the classical integrals $p_j$; $j = 1, 2$ by $f_k(p_1, p_2)$; $k = 1, 2$ with appropriate $f_k \in C^\infty$, without loss of generality, we can assume that

$$F_j(u,v) = b_j + \beta_j(v) + \alpha_j(v)\,u + O_v(u^2); \quad j = 1, 2.$$
Moreover, one can take here $\beta_j(v) = v + O(v^2)$. The Eliasson non-degeneracy condition says that for all $v \in B_\delta(0)$, $\alpha_1(v) \neq \alpha_2(v)$ with $\min_{v \in B_\delta(0)}\{|\alpha_1(v)|, |\alpha_2(v)|\} \ge C_1 > 0$. We need to compute the asymptotics of the RHS of (3.35). Without loss of generality, one can assume that $x_0 \in M$ is an interior point of the segment $\gamma$ and so $a < 0$, $b > 0$. Consider the integral

$$I_{sing}(s;\hbar) = \iint e^{-is\,\iota^* p_2(t,\omega)/\hbar}\,\chi_{sing}(\omega)\, c(\omega,t;s)\, d\omega\, dt. \tag{3.38}$$

To make the change of variables in (3.38) to Birkhoff coordinates $(x',t';\xi',\sigma') \in B^*_\delta(0,0) \times B^*_\delta([a,b])$, we use that $x'(x,t;\xi,\sigma) = x + O(x^2)$, $t'(x,t;\xi,\sigma) = t + O(x)$, and so, from the expansion in (3.37), in terms of local coordinates,

$$\kappa\big(p_1^{-1}(E_1) \cap \pi^{-1}(\gamma)\big) \cap \big(B^*_\delta(0) \times B^*_\delta([a,b])\big) = \{(x',t';\xi',\sigma') \in B^*_\delta(0) \times B^*_\delta([a,b]);\ x' = 0,\ \sigma' + \alpha_1(\sigma')\xi'^2 + O_{\sigma'}(\xi'^4) + O(\sigma'^2) = 0\}. \tag{3.39}$$

To simplify the writing, from now on we drop the primes in the Birkhoff coordinates and put $(b_1, b_2) = (E_1, E)$. Then, since

$$\frac{\partial}{\partial\sigma}\Big(\sigma + \alpha_1(\sigma)\xi^2 + O(\sigma^2) + O_\sigma(\xi^4)\Big) = 1 + O(\sigma) + O(\xi^2),$$

it follows from the implicit function theorem that one can use $(\sigma, t) \in B_\delta(0) \times [a,b]$ as local parametrizing coordinates on $\kappa(p_1^{-1}(E_1) \cap \pi^{-1}(\gamma)) = \kappa(C_\gamma)$. Substitution of the solution $\xi^2(\sigma)$ of the defining equation in (3.39) into the formula for $\kappa^* p_2$ gives

$$\iota^* p_2(\sigma) = E + \beta_2(\sigma) + \alpha_2(\sigma)\xi^2(\sigma) + O(\xi^4(\sigma)) = E + \sigma - \sigma\,\frac{\alpha_2(\sigma)}{\alpha_1(\sigma)} + O(\sigma^2) = E + \Big(1 - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)}\Big)\sigma + O(\sigma^2). \tag{3.40}$$
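The substitution behind (3.40) can be spelled out at leading order. The following is a sketch that uses only the expansions already stated ($\beta_2(\sigma) = \sigma + O(\sigma^2)$ and $|\alpha_1|$ bounded below), with error terms tracked only to the order needed:

```latex
\begin{aligned}
\sigma + \alpha_1(\sigma)\,\xi^2 + O_\sigma(\xi^4) + O(\sigma^2) = 0
  \;&\Longrightarrow\;
  \xi^2(\sigma) = -\frac{\sigma}{\alpha_1(\sigma)} + O(\sigma^2), \\
\beta_2(\sigma) + \alpha_2(\sigma)\,\xi^2(\sigma) + O\big(\xi^4(\sigma)\big)
  \;&=\; \sigma - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)}\,\sigma + O(\sigma^2).
\end{aligned}
```

Since $\xi^2(\sigma) \ge 0$, only values of $\sigma$ whose sign is opposite to that of $\alpha_1$ contribute near $\sigma = 0$, so the parametrization is one-sided in $\sigma$.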
Here we have used that $\beta_2(\sigma) = \sigma + O(\sigma^2)$ with $\beta_2'(0) \neq 0$. Next we compute the induced measure $d\omega$ in terms of the Birkhoff coordinates. Let $\Omega$ denote the canonical 2-form locally given by $dx \wedge d\xi + dt \wedge d\sigma$. Since $\kappa$ is canonical, locally the Lebesgue measure $(\kappa^*\Omega)^2 = \Omega^2 = dx\,dt\,d\xi\,d\sigma$. The induced arc-length (i.e. Liouville measure) $d\omega$ satisfies

$$\kappa_*\, d\omega\, dt = i^*\, d\sigma\, dt. \tag{3.41}$$
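In (3.41), $i : \gamma \times \operatorname{supp}\chi_1 \to \kappa(C_\gamma)$ is the local parametrization given by $i(t,\sigma) = (0, \xi(\sigma), t, \sigma)$, where $\xi(\sigma)$ satisfies the identity in (3.39). Choosing $(\sigma, t)$ as local coordinates on $\operatorname{supp}\chi_{sing} \times \gamma$, by a straightforward computation,

$$\kappa_*\, d\omega\, dt = f(\sigma)\,|\sigma|^{-1/2}\, d\sigma\, dt, \tag{3.42}$$

where, depending on the sign of $\alpha_1$, either $f(\sigma) = f_+(\sigma)\mathbf{1}_{[0,\infty)}$ or $f(\sigma) = f_-(\sigma)\mathbf{1}_{(-\infty,0]}$ with $f_\pm \in C^\infty$ and $f_\pm(\sigma) \ge C_1 > 0$. Consequently, by a change of variables, in terms of the normal coordinates,

$$(2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_{sing}(s;\hbar)\, ds = (2\pi\hbar)^{-1} \iiint \exp\Big( is\Big[\Big(1 - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)}\Big)\sigma + O(\sigma^2)\Big]\Big/\hbar \Big)\, c(s, |\sigma|^{1/2}, t;\hbar)\, \chi_{sing}(\sigma)\, |\sigma|^{-1/2} f(\sigma)\, d\sigma\, dt\, ds, \tag{3.43}$$

where $c \in S^0_{cl}(1) \cap C_0^\infty$ is $\hbar$-elliptic on $\operatorname{supp}\chi_{sing}$. Now, from the non-degeneracy of the integrable system, we use the fact that $\alpha_2(\sigma) \neq \alpha_1(\sigma)$ to make the change of variables

$$\sigma \mapsto \Big(1 - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)}\Big)\sigma + O(\sigma^2)$$

in the phase in (3.43) and integrate out the $t$-variable (note that the phase in (3.43) is independent of $t$). The result is that for some $T > 0$,

$$(2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_{sing}(s;\hbar)\, ds = (2\pi\hbar)^{-1} \iint_{0 \le \sigma \le T} e^{is\sigma/\hbar}\, \tilde c(s, \sigma^{1/2};\hbar)\, \sigma^{-1/2}\, d\sigma\, ds. \tag{3.44}$$

Here the amplitude $\tilde c$ has the same properties as $c$. Making a first-order Taylor expansion around $\sigma = 0$ we write $\tilde c(s, \sigma^{1/2};\hbar) = \tilde c(s, 0;\hbar) + \sigma^{1/2} \cdot \delta\tilde c(s, \sigma^{1/2};\hbar)$, where $\delta\tilde c(\cdot,\cdot;\hbar) \in S^0_{cl}(1)$ with compact support in the $s$-variable and both $\tilde c$ and $\delta\tilde c$ have standard symbolic expansions in $\hbar$ with $\delta\tilde c(s, \sigma^{1/2};\hbar) = \delta\tilde c(s, \sigma^{1/2}) + O(\hbar)$ and $\tilde c(s, 0;\hbar) = \tilde c(s, 0) + O(\hbar)$. The integral in (3.43) splits into the sum

$$(2\pi\hbar)^{-1} \int_{-\infty}^{\infty}\int_0^T e^{is\sigma/\hbar}\, \tilde c(s,0)\, \sigma^{-1/2}\, d\sigma\, ds + (2\pi\hbar)^{-1} \int_{-\infty}^{\infty}\int_0^T e^{is\sigma/\hbar}\, \delta\tilde c(s, \sigma^{1/2})\, d\sigma\, ds + O(1) =: I^{(1)}_{sing}(\hbar) + I^{(2)}_{sing}(\hbar) + O(1). \tag{3.45}$$

Let $F\varphi(\xi) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{-ix\xi} \varphi(x)\, dx$ be the usual Fourier transform. Then, by Fubini,

$$I^{(2)}_{sing}(\hbar) = (2\pi\hbar)^{-1} \int_0^T (F\delta\tilde c)_{s\to\sigma}(\hbar^{-1}\sigma, \sigma^{1/2})\, d\sigma = O(1).$$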
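On the other hand, again by Fubini, for the leading term

$$I^{(1)}_{sing}(\hbar) = (2\pi\hbar)^{-1} \int_0^T (F\tilde c)_{s\to\sigma}(\hbar^{-1}\sigma, 0)\, \sigma^{-1/2}\, d\sigma \sim_{\hbar\to 0^+} c_\gamma\, \hbar^{-1/2}. \tag{3.46}$$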
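The $\hbar^{-1/2}$ growth in (3.46) comes from the integrable singularity $\sigma^{-1/2}$ after rescaling $\sigma = \hbar u$. The sketch below checks this numerically for the leading double integral in (3.45) under an assumption made purely for illustration: the amplitude is replaced by the Gaussian $\tilde c(s,0) = e^{-s^2/2}$ (not compactly supported), for which the $s$-integral is explicit. The function name `leading_singular_term` and the default values of `T` and `n` are hypothetical.

```python
import math

def leading_singular_term(hbar, T=1.0, n=20_000):
    """(2*pi*hbar)^{-1} * int_R int_0^T exp(i*s*sigma/hbar) * c(s) * sigma^{-1/2} dsigma ds
    for the ASSUMED amplitude c(s) = exp(-s^2/2).

    The s-integral equals sqrt(2*pi) * exp(-sigma^2 / (2*hbar^2)).  The substitution
    sigma = w^2 removes the sigma^{-1/2} singularity, leaving a smooth integrand
    2 * exp(-w^4 / (2*hbar^2)) on [0, sqrt(T)], integrated by the trapezoid rule."""
    w_max = math.sqrt(T)
    h = w_max / n
    total = 0.0
    for k in range(n + 1):
        w = k * h
        f = math.exp(-(w ** 4) / (2.0 * hbar * hbar))
        total += 0.5 * f if k in (0, n) else f
    sigma_integral = 2.0 * h * total
    return math.sqrt(2.0 * math.pi) * sigma_integral / (2.0 * math.pi * hbar)

# Limiting constant for this particular amplitude:
# (2*pi)^{-1/2} * int_0^infty exp(-u^2/2) * u^{-1/2} du = 2^{-3/4} * Gamma(1/4) / sqrt(2*pi).
LIMIT_CONSTANT = 2.0 ** (-0.75) * math.gamma(0.25) / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    for hbar in (1e-1, 1e-2, 1e-3):
        print(hbar, math.sqrt(hbar) * leading_singular_term(hbar), LIMIT_CONSTANT)
```

The product $\hbar^{1/2} \cdot I^{(1)}_{sing}(\hbar)$ stabilizes at a positive constant, which plays the role of $c_\gamma$ for this model amplitude.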
Again, the constant $c_\gamma > 0$ appearing on the RHS in (3.46) is uniform in $E$ with $(E_1, E) \in P(T^*M)$. Consequently,

$$(2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_\gamma(s;\hbar)\, ds = I^{(1)}_{sing}(\hbar) + I^{(2)}_{sing}(\hbar) + I_{reg}(\hbar) = c_\gamma\, \hbar^{-1/2} + O(1). \tag{3.47}$$

From (3.47) it follows that

$$\sum_j \rho\big(\hbar^{-1}[\lambda^{(1)}_j(\hbar) - E_1]\big)\, \rho\big(\hbar^{-1}[\lambda^{(2)}_j(\hbar) - E]\big) \int_\gamma |\varphi_j(s)|^2\, ds \sim_{\hbar\to 0^+} c_\gamma(E;\rho)\, \hbar^{-1/2}. \tag{3.48}$$

So, by taking the supremum over $E$ in (3.48), the upper bound in (ii) of Theorem 3 follows. The final part of the proof of Theorem 3 follows from the result of Toth and Zelditch [TZ3] which says that, unless $(M,g)$ is a flat torus, the bicharacteristic flow must have a singular orbit, $\tilde\gamma$. But then, $\tilde\gamma \subset P^{-1}(E_1, E)$, where $(E_1, E) \in B_{sing}$. In the case where $\tilde\gamma$ is stable, the existence of the subsequence of joint eigenfunctions follows from the joint trace formula

$$\sum_j \rho\big(\hbar^{-1}[\lambda^{(1)}_j(\hbar) - E_1]\big)\, \rho\big(\hbar^{-1}[\lambda^{(2)}_j(\hbar) - E]\big) \sim_{\hbar\to 0^+} c(E;\rho). \tag{3.49}$$

To see this, one simply argues by contradiction: Assume that for all eigenfunctions

$$\int_\gamma |\varphi_j(s)|^2\, ds = o(\hbar^{-1/2}).$$

Then we bound the LHS in (3.48) by

$$o(\hbar^{-1/2}) \times \sum_j \rho\big(\hbar^{-1}[\lambda^{(1)}_j(\hbar) - E_1]\big)\, \rho\big(\hbar^{-1}[\lambda^{(2)}_j(\hbar) - E]\big) = o(\hbar^{-1/2})$$

by the joint trace formula (3.49). This contradicts the asymptotic $\sim \hbar^{-1/2}$ on the RHS of (3.48).

3.5.3. Unstable case. In this case, the relevant canonical transformation to normal form is given by $\kappa : U_\gamma \to B^*_\delta(0) \times B^*_\delta([a,b])$, where

$$\kappa^* p_j(x,t;\xi,\sigma) = F_j(\xi^2 - x^2, \sigma) = b_j + \beta_j(\sigma) + \alpha_j(\sigma)(\xi^2 - x^2) + O(|\xi^2 - x^2|^2); \quad j = 1, 2.$$

The computations follow in the same way as in the stable case by putting $x = 0$ and repeating the analysis in 3.5.2 with a few minor changes: In the unstable case [BPU,TZ3], the formula (3.49) gets replaced by

$$\sum_j \rho\big(\hbar^{-1}[\lambda^{(1)}_j(\hbar) - E_1]\big)\, \rho\big(\hbar^{-1}[\lambda^{(2)}_j(\hbar) - E]\big) \sim_{\hbar\to 0^+} c(E;\rho)\, |\log\hbar|.$$
As in the stable case, an argument by contradiction then proves the existence of a subsequence $\varphi_{j_k}$; $k = 1, 2, 3, \ldots$ satisfying

$$\int_\gamma |\varphi_{j_k}(s)|^2\, ds \ge c_\gamma\, \hbar^{-1/2}\, |\log\hbar|^{-1}$$

for $\hbar \in (0, \hbar_0]$. This completes the proof of Theorem 3.
4. The Example of a Convex Surface of Revolution

One can parametrize convex surfaces of revolution by using geodesic polar coordinates $(t,\varphi) \in (0,1) \times [0,2\pi]$ in terms of which $p_1(t,\varphi;\xi_t,\xi_\varphi) = \xi_t^2 + a^{-1}(t)\,\xi_\varphi^2$ and $p_2(t,\varphi,\xi_t,\xi_\varphi) = \xi_\varphi$, where the profile function satisfies $a(0) = a(1) = 0$ and $a(t)$ is a non-negative Morse function with a single non-degenerate maximum at $t = t_0 \in (0,1)$. The level curve $t = t_0$ is the equator of the surface. Let $\gamma = \{(t,\varphi(t));\ 0 < a \le t \le b < 1\}$ with $\varphi \in C^\infty(0,\pi)$ be a curve segment on the surface. The computation of the phase function in this case is easy. Clearly, $C_\gamma = \{(t,\varphi(t);\xi_t,\xi_\varphi);\ \xi_t^2 + a^{-1}(t)\xi_\varphi^2 = 1\}$, and away from the set $C_\gamma \cap \{(t,\varphi;\xi_t,\xi_\varphi);\ |\xi_\varphi| \le \delta\}$, where $\delta > 0$ is sufficiently small, one can use $t \in (0,1)$ and $\xi_t$ to parametrize $C_\gamma$. It is easy to see that $p_2 = \xi_\varphi$ restricted to $C_\gamma \cap \{(t,\varphi;\xi_t,\xi_\varphi);\ |\xi_\varphi| \le \delta\}$ has no critical points, provided $\delta > 0$ is small enough. So, when $(t,\varphi;\xi_t,\xi_\varphi) \in C_\gamma \cap \{(t,\varphi;\xi_t,\xi_\varphi);\ |\xi_\varphi| \ge \delta\}$, it suffices to consider the function

$$(\iota^* p_2)^2(t,\xi_t) = a(t)\,(1 - \xi_t^2); \quad t \in [a,b].$$

Here and in the following we work with $(\iota^* p_2)^2$ rather than $\iota^* p_2$, noting that since $\iota^* p_2 > 0$, the nature of the critical points is the same. The critical points are the solutions of $\partial_t (\iota^* p_2)^2 = a'(t)(1 - \xi_t^2) = 0$ and $\partial_{\xi_t} (\iota^* p_2)^2 = -2a(t)\xi_t = 0$. Since $t \in (0,1)$, there is a single critical point with $\xi_t = 0$ and $a'(t) = 0$. This happens precisely when $t = t_0$. The end result is that the critical point of $\iota^* p_2$ is $(t_0, 0)$. We next compute the terms of the Hessian matrix at the critical point $(t_0, 0)$. The result is that $\partial_t^2 (\iota^* p_2)^2 = a''(t_0)$, $\partial_t\partial_{\xi_t} (\iota^* p_2)^2 = 0$ and $\partial_{\xi_t}^2 (\iota^* p_2)^2 = -2a(t_0)$. Consequently, we get that

$$\det\big(d^2 (\iota^* p_2)^2\big)\big|_{(t_0,0)} = -2a(t_0)\,a''(t_0) \neq 0$$

by our Morse assumption on the profile function of the surface. It follows that the curve segment $\gamma = \{(t,\varphi(t));\ 0 < t < 1\}$ is generic. These curve segments are graphs over the meridian great circle.

Next consider curves of the form $\gamma = \{(\theta(t), \varphi(t) = t);\ t \in [a,b]\}$ which are graphs over the equator. In this case,

$$(\iota^* p_2)^2(t,\xi_t) = a(\theta(t))\,(1 - \xi_t^2); \quad t \in [a,b].$$

The critical points in this case are the solutions of $\partial_t (\iota^* p_2)^2 = a'(\theta(t)) \cdot \theta'(t)\,(1 - \xi_t^2) = 0$ and $\partial_{\xi_t} (\iota^* p_2)^2 = -2a(\theta(t))\xi_t = 0$. Since $a(\theta) > 0$, the second equation implies that $\xi_t = 0$ at critical points. The first equation implies $a'(\theta(t)) = 0$ or $\theta'(t) = 0$. For the Hessian, $\partial_t^2 (\iota^* p_2)^2 = [a''(\theta(t))|\theta'(t)|^2 + a'(\theta(t))\theta''(t)](1 - \xi_t^2)$ and $\partial_{\xi_t}^2 (\iota^* p_2)^2 = -2a(\theta(t))$. Also, clearly $\partial_t\partial_{\xi_t} (\iota^* p_2)^2 = 0$ at any critical point $(t_0, 0)$. In the case where $a'(\theta(t_0)) = 0$ at the critical point $(t_0, 0)$ one gets

$$\det\big(d^2 (\iota^* p_2)^2\big)\big|_{(t_0,0)} = -2a(\theta(t_0))\, a''(\theta(t_0))\, |\theta'(t_0)|^2. \tag{$\ast$}$$

In the case where $\theta'(t_0) = 0$ at the critical point $(t_0, 0)$ one gets

$$\det\big(d^2 (\iota^* p_2)^2\big)\big|_{(t_0,0)} = -2a(\theta(t_0))\, a'(\theta(t_0))\, \theta''(t_0). \tag{$\ast\ast$}$$

The only way $(\ast)$ can vanish is if also $\theta'(t_0) = 0$, so that both $\theta'(t_0) = 0$ and $a'(\theta(t_0)) = 0$; that is, the curve $\gamma(t)$ is tangent to the equator at $t = t_0$. In the second case where $\theta'(t_0) = 0$ and $a'(\theta(t_0)) \neq 0$, the curve $\gamma(t)$ is tangent to another circle parallel to the equator. So, curves which are graphs over the equator are generic in the sense of Definition 1.1 provided $\theta'(t) \neq 0$. This condition is satisfied provided $\theta : [a,b] \to (0,\pi)$ is never tangent to a circle parallel to the equator. In particular, this rules out the cases where $\gamma$ includes pieces of the equator $z = 0$ or parallel circles $z = \text{const}$. The equator is of course the (only) projection of a singular orbit. It is non-generic and in that case, Theorem 3 applies. The parallel circles $z = \text{const}$ are caustics which are also necessarily non-generic, since in the latter case there are joint eigenfunctions which blow up like $\sim \lambda^{1/6}$ in sup-norm along the curve.

To see what Theorem 1 means for a specific sequence of eigenfunctions, we consider the special case of the round sphere, where $\varphi_\lambda(x) = \lambda^{1/4}(x_1 + i x_2)^\lambda$ are the highest-weight spherical harmonics with $\int_M |\varphi_\lambda|^2\, d\mathrm{Vol} \sim 1$, and where $\lambda = n$; $n = 1, 2, 3, \ldots$.
The above computations showed that all smooth curves γ = {(t, ϕ(t)); t ∈ [a, b], ϕ (t) = 0 } are generic. In terms of spherical coordinates, the restricted eigenfunction ϕλ (t) = λ1/4 [cos t cos ϕ(t) + i cos t sin ϕ(t)]λ , and so
$$\int_\gamma |\varphi_\lambda|^2\, ds = \lambda^{1/2} \int_a^b (\cos t)^{2\lambda}\, dt.$$
In the case where $a < 0$ and $b > 0$ (so that $\gamma$ intersects the equator), an application of steepest descent gives

$$\lambda^{1/2} \int_a^b (\cos t)^{2\lambda}\, dt \sim_{\lambda\to\infty} c_\gamma = O(1).$$
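The boundedness of this restricted mass is easy to check numerically. The sketch below evaluates $\lambda^{1/2}\int_a^b (\cos t)^{2\lambda}\,dt$ with a trapezoid rule for a hypothetical interval $[a,b] = [-0.5, 0.5]$ containing $t = 0$. Since $\log\cos t = -t^2/2 + O(t^4)$, Laplace's method suggests that the limit is $\sqrt{\pi}$ for any such interval. The function name `restricted_mass` and the grid size are illustrative choices, not part of the original text.

```python
import math

def restricted_mass(lam, a=-0.5, b=0.5, n=20_000):
    """lam^{1/2} * int_a^b cos(t)^(2*lam) dt by the trapezoid rule.

    Requires |a|, |b| < pi/2 so that cos(t) > 0 on [a, b]; the power is
    computed as exp(2*lam*log(cos t)) to avoid intermediate overflow issues."""
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        t = a + k * h
        f = math.exp(2.0 * lam * math.log(math.cos(t)))
        total += 0.5 * f if k in (0, n) else f
    return math.sqrt(lam) * h * total

if __name__ == "__main__":
    for lam in (100, 1000, 4000):
        print(lam, restricted_mass(lam), math.sqrt(math.pi))
```

The computed values remain bounded in $\lambda$ and approach $\sqrt{\pi}$, consistent with the $O(1)$ behaviour stated above.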
Similarly, when $\gamma = \{(\theta(t), t);\ t \in [a,b]\}$ with $a < 0$, $b > 0$ and $|\theta'(t)| \ge \frac{1}{C} > 0$ one gets

$$\lambda^{1/2} \int_a^b (\cos\theta(t))^{2\lambda}\, dt \sim_{\lambda\to\infty} \tilde c_\gamma = O(1).$$
These bounds are consistent with (and slightly stronger than) the general $O(\log\lambda)$ bound given in Theorem 1.

4.1. Zonal harmonics. Let $x = (x_1, x_2)$ be geodesic normal coordinates on a convex surface of revolution centered at the north pole and $(r,\varphi)$ denote the corresponding polar variables. We consider zonal harmonics centered at the north pole which can be written as oscillatory integrals of the form $\varphi_\lambda(x) + \varphi_\lambda(-x)$ where

$$\varphi_\lambda(x) = (2\pi\lambda)^{1/2} \int_{S^1} e^{i\lambda\langle x,\omega\rangle}\, a(x,\omega;\lambda)\, d\omega, \tag{4.50}$$

where $a(x,\omega;\lambda) \sim \sum_{j=0}^{\infty} a_j(x,\omega)\lambda^{-j}$ and $|a_0(x,\omega)| \ge C_1 > 0$ with $a_0(x,\omega) = 1 + O(|x|)$. The $\lambda^{1/2}$-factor in front of the terms in (4.50) ensures that $\int_M |\varphi_\lambda|^2\, dx = 1$. Consider the generic curve segment $\gamma = \{(r, \varphi = f(r));\ 0 \le r \le \pi\}$ written in geodesic polar variables corresponding to the $x_j$-coordinates. From (4.50) it follows that $\varphi_\lambda$ is radial and we get that in this case $\int_\gamma |\varphi_\lambda|^2\, ds$ equals

$$\int_0^\pi |\varphi_\lambda(r)|^2\, dr = 2\pi\lambda \int_{r=0}^{\lambda^{-1}} \Big| \int_{S^1} e^{i\lambda\langle x,\omega\rangle} a(x,\omega;\lambda)\, d\omega \Big|^2 dr + 2\pi\lambda \int_{r=\lambda^{-1}}^{\pi} \Big| \int_{S^1} e^{i\lambda\langle x,\omega\rangle} a(x,\omega;\lambda)\, d\omega \Big|^2 dr = 2\pi\lambda \int_{r=\lambda^{-1}}^{\pi} \Big| \int_{S^1} e^{i\lambda\langle x,\omega\rangle} a(x,\omega;\lambda)\, d\omega \Big|^2 dr + O(1).$$

An application of stationary phase in the inner integral on the RHS of the last identity gives

$$\int_0^\pi |\varphi_\lambda(r)|^2\, dr = 2\pi \int_{\lambda^{-1}}^{\pi} \big| r^{-1/2} e^{i\lambda r} \big|^2\, dr + O(1) = 2\pi \int_{\lambda^{-1}}^{\pi} \frac{dr}{r} + O(1).$$

So it follows that

$$\int_0^\pi |\varphi_\lambda(r)|^2\, dr = 2\pi \log\lambda + O(1), \tag{4.51}$$

and this example shows that the upper bounds in Theorems 1 and 2 are sharp.

Acknowledgement. We thank Steve Zelditch for helpful comments and suggestions regarding an earlier version of the manuscript.
References

[A] Avakumović, G.V.: Über die eigenfunktionen auf geschlossenen riemannschen mannigfaltigkeiten. Math. Z. 65, 327–344 (1956)
[BGT] Burq, N., Gerard, P., Tzvetkov, N.: Restrictions of the Laplace-Beltrami eigenfunctions to submanifolds. Duke Math. J. 138(3), 445–486 (2007)
[BPU] Brummelhuis, R., Paul, T., Uribe, A.: Spectral estimates around a critical level. Duke Math. J. 78(3), 477–530 (1995)
[CP] Colin de Verdiere, Y., Parisse, B.: Equilibre instable en regime semi-classique I: concentration microlocale. Comm. P.D.E. 19, 1535–1563 (1994)
[DG] Duistermaat, J.J., Guillemin, V.W.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
[DM] Dullin, H.R., Matveev, V.S.: A new integrable system on the sphere. Math. Res. Lett. 11, 715–722 (2004)
[DS] Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Math. Soc. Lecture Notes 268, Cambridge: Cambridge Univ. Press, 1999
[G1] Guillemin, V.: Wave trace invariants. Duke Math. J. 83, 287–352 (1996)
[G2] Guillemin, V.: Wave trace invariants and a theorem of Zelditch. Int. Math. Res. Not. (IMRN) 12, 303–308 (1993)
[HS] Helffer, B., Sjöstrand, J.: Semiclassical analysis of Harper's equation III. Mem. Bull. Soc. Math. France, Ser. 2, 39, 1–124 (1990)
[Ho] Hörmander, L.: The Analysis of Linear Partial Differential Operators, Volume I. Berlin-Heidelberg: Springer-Verlag, 1983
[IS] Iwaniec, H., Sarnak, P.: $L^\infty$ norms of eigenfunctions of arithmetic surfaces. Ann. of Math. Second Ser. 141(2), 301–320 (1995)
[ISZ] Iantchenko, A., Sjöstrand, J., Zworski, M.: Birkhoff normal forms in semiclassical inverse problems. Math. Res. Lett. 9, 337–362 (2002)
[K] Kozlov, V.V.: Topological obstructions to the integrability of natural mechanical systems. Soviet Math. Dokl. 20(6), 1413–1415 (1979)
[Le] Levitan, B.M.: On the asymptotic behavior of the spectral function of a self-adjoint differential equation of second order. Isv. Akad. Nauk SSSR Ser. Mat. 16, 325–352 (1952)
[L] Lerman, E.: Contact toric manifolds. J. Symp. Geom. 1(4), 785–828 (2003)
[LS] Lerman, E., Shirokova, N.: Completely integrable torus actions on symplectic cones. Math. Res. Lett. 9(1), 105–115 (2002)
[Mi] Miller, A.: Riemannian manifolds with integrable geodesic flows. Preprint, available at http://www.dpmms.cam.ac.uk/~hk244/miller.cp.pdf
[MVN] Miranda, E., Vu-Ngoc, S.: A singular Poincare lemma. IMRN 1, 27–45 (2005)
[R] Reznikov, A.: Norms of geodesic restrictions for eigenfunctions on hyperbolic surfaces and representation theory. http://arXiv.org/abs/math/0403437v2[math.AP], 2004
[S] Safarov, Yu.G.: Asymptotics of a spectral function of a positive elliptic operator without a nontrapping condition. (Russian) Funkt. Anal. i Pril. 22(3), 53–65 (1988); translation in Funct. Anal. Appl. 22(3), 213–223 (1988)
[SV] Safarov, Yu., Vassiliev, D.: The asymptotic distribution of eigenvalues of partial differential operators. Translated from the Russian manuscript by the authors. Translations of Mathematical Monographs, 155. Providence, RI: Amer. Math. Soc., 1997
[Sa] Sarnak, P.: Arithmetic quantum chaos: The Schur lectures (1992) (Tel Aviv), Israel Math. Conf. Proc. 8, Ramat Gan: Bar-Ilan Univ., 1995, pp. 183–236
[So1] Sogge, C.D.: Concerning the $L^p$ norm of spectral clusters for second-order elliptic operators on compact manifolds. J. Funct. Anal. 77(1), 123–138 (1988)
[So2] Sogge, C.D.: Oscillatory integrals and spherical harmonics. Duke Math. J. 53(1), 43–65 (1986)
[So3] Sogge, C.D.: Fourier Integrals in Classical Analysis. Cambridge Tracts in Math. 105. Cambridge: Cambridge Univ. Press, 1993
[SZ] Sogge, C.D., Zelditch, S.: Riemannian manifolds with maximal eigenfunction growth. Duke Math. J. 114(3), 387–437 (2002)
[STZ] Sogge, C.D., Toth, J.A., Zelditch, S.: Geodesic recurrence and maximal growth of eigenfunctions, I., 2008, Preprint
[Ta] Tataru, D.: On the regularity of boundary traces for the wave equation. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 26(1), 185–206 (1998)
[T1] Toth, J.A.: Eigenfunction localization in the quantized rigid body. J. Diff. Geom. 43(4), 844–858 (1996)
[T2] Toth, J.A.: On the quantum expected values of integrable metric forms. J. Diff. Geom. 52(2), 327–374 (1999)
[T3] Toth, J.A.: A small-scale density of states formula. Commun. Math. Phys. 238, 225–256 (2003)
[T4] Toth, J.A.: Eigenfunctions of quantum completely integrable systems. Encyclopedia of Math. Phys. vol. 2, Amsterdam: Kluwer, 2006, pp. 148–157
[TZ1] Toth, J.A., Zelditch, S.: Riemannian manifolds with uniformly bounded eigenfunctions. Duke Math. J. 111(1), 97–132 (2002)
[TZ2] Toth, J.A., Zelditch, S.: Norms of modes and quasi-modes revisited. In: Harmonic analysis at Mount Holyoke (South Hadley, MA, 2001), Contemp. Math. 320, Providence, RI: Amer. Math. Soc., 2003, pp. 435–458
[TZ3] Toth, J.A., Zelditch, S.: $L^p$-norms of eigenfunctions in the completely integrable case. Annales Henri Poincaré 4, 343–368 (2003)
[VN1] Vu-Ngoc, S.: Formes normales semi-classiques des systemes completement integrables au voisinage d'un point critique de l'application moment. Asymptotic Analysis 24(3), 319–342 (2000)
[VN2] Vu-Ngoc, S.: Symplectic techniques for semiclassical integrable systems. In: Topological Methods in the Theory of Integrable Systems, Cambridge: Cambridge Scientific Publishers, 2006
[Z1] Zelditch, S.: Wave invariants at elliptic closed geodesics. Geom. Funct. Anal. (GAFA) 7, 145–213 (1997)
[Z2] Zelditch, S.: Wave invariants for non-degenerate closed geodesics. Geom. Funct. Anal. (GAFA) 8, 179–217 (1998)
Communicated by P. Sarnak
Commun. Math. Phys. 288, 403–429 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0766-8
Communications in
Mathematical Physics
Noncommutative Riemann Surfaces by Embeddings in R3 Joakim Arnlind1,2 , Martin Bordemann3 , Laurent Hofer4 , Jens Hoppe5 , Hidehiko Shimada2 1 Institut des Hautes Études Scientifiques, Le Bois-Marie 35, route de Chartres,
F-91440, Bures-sur-Yvette, France. E-mail:
[email protected]
2 Max Planck Institute for Gravitational Physics, Am Mühlenberg 1,
D-14476, Golm, Germany. E-mail:
[email protected];
[email protected]
3 Laboratoire de MIA, 4, rue des Frères Lumière, Université de Haute-Alsace,
F-68093, Mulhouse, France. E-mail:
[email protected]
4 Université du Luxembourg, FSTC 162a, avenue de la Faïencerie,
L-1511, Luxembourg City, Luxembourg. E-mail:
[email protected]
5 Department of Mathematics, KTH, S-10044, Stockholm, Sweden. E-mail:
[email protected]
Received: 4 January 2008 / Accepted: 17 December 2008 Published online: 20 March 2009 – © Springer-Verlag 2009
Abstract: We introduce C-Algebras of compact Riemann surfaces $\Sigma$ as non-commutative analogues of the Poisson algebra of smooth functions on $\Sigma$. Representations of these algebras give rise to sequences of matrix-algebras for which matrix-commutators converge to Poisson-brackets as $N \to \infty$. For a particular class of surfaces, interpolating between spheres and tori, we completely characterize (even for the intermediate singular surface) all finite dimensional representations of the corresponding C-algebras.

Introduction

Attaching sequences of matrix algebras to a given manifold $M$ to describe a non-commutative and approximate version of its ring of smooth functions has become a rather important tool in non-commutative field theory: more precisely, for each positive integer $N$ let $Q_N : C^\infty(M,\mathbb{C}) \to M_{N,N}(\mathbb{C})$ be a complex linear surjective map of the ring of smooth functions on $M$ into the space of all complex $N \times N$-matrices such that products of functions are approximately mapped to products of matrices in the limit $N \to \infty$. In almost all cases, $C^\infty(M,\mathbb{C})$ carries a Poisson bracket $\{\ ,\ \}$ (for instance if $M$ is symplectic, such as every orientable Riemann surface), and one further demands that Poisson brackets are approximately mapped to matrix commutators in the limit $N \to \infty$ (see e.g. [BHSS91] for details). For the 2-sphere $S^2$ [GH82] one could use the fact that the space of all spherical harmonics of fixed $l$ is in bijection with the space of all harmonic polynomials in $\mathbb{R}^3$ of degree $l$; substituting the three commuting variables by irreducible $N$-dimensional representations of the three-dimensional Lie algebra $su(2)$ allows one to define a map from functions on $S^2$ to $N \times N$ matrices that sends Poisson brackets to matrix commutators up to corrections of order $1/N$ (see also [BHSS91], Example 3, p. 218). The result was dubbed "Fuzzy Sphere" in [Mad92]. The papers [KL92] prove that the (complexified) Poisson algebra of functions on any Riemann surface arises as a $N \to \infty$ limit of
$gl(N,\mathbb{C})$ – which had been conjectured in [BHSS91]. This result was extended to any quantizable compact Kähler manifold in [BMS94], the technical tool being geometric and Berezin-Toeplitz quantization. A thorough analysis of non-commutative Riemann surfaces of genus greater than or equal to 2 as a continuous field of simple $C^*$-algebras, strongly Morita equivalent to a reduced twisted group $C^*$-algebra of its fundamental group, has been given in [NN99]. Insight on how matrices can encode topological information (certain sequences having been identifiable as converging to a particular function, but $gl(N,\mathbb{C})$ lacking topological invariants) was gained in [Shi04]. Even though the above general results are constructive, there seem to be only two explicit formulas, for the two-sphere [GH82] and for the two-torus [FFZ89] (see also [Hop89/88]), which are quite different from each other; the former uses the natural embedding of the two-sphere into $\mathbb{R}^3$ whereas the latter relies on the fact that the two-torus is a quotient $\mathbb{R}^2/\mathbb{Z}^2$. The general results are based on the complex nature of any compact orientable Riemann surface. In this paper, we should like to propose an approach which to the best of our knowledge does not seem to have been treated in the literature so far, despite its rather intuitive appeal: we are using the 'visualisable' embedding of a compact orientable Riemann surface $\Sigma$ into $\mathbb{R}^3$ explicitly given by the set of all zeros of a real polynomial $C$. The function $C$, via $\{f,g\}_C := \nabla C \cdot (\nabla f \times \nabla g)$, defines a Poisson bracket for all real-valued smooth functions $f, g$ on $\mathbb{R}^3$. Since $C$ is a Casimir function for the bracket $\{\ ,\ \}_C$, one gets a symplectic Poisson bracket on $\Sigma$ by restriction. The idea now is to use the above Poisson bracket on $\mathbb{R}^3$ to first define an infinite-dimensional algebra as a quotient algebra of the free non-commutative algebra in three variables, involving a real parameter $\hbar$ and suitably ordered non-commutative analogues of $\{x,y\}_C$, $\{y,z\}_C$ and $\{z,x\}_C$.
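The bracket $\{f,g\}_C = \nabla C \cdot (\nabla f \times \nabla g)$ is simple to experiment with. The sketch below evaluates it by central finite differences for a hypothetical constraint function, the round sphere $C = \tfrac12(x^2 + y^2 + z^2 - 1)$, chosen only for illustration, and checks that $\{x,y\}_C = \partial_z C$ (with the cyclic analogues) and that $C$ brackets to zero with a coordinate function. All function names are illustrative and not taken from the paper.

```python
def grad(f, p, h=1e-5):
    """Central-difference gradient of f: R^3 -> R at the point p = (x, y, z)."""
    g = []
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((f(*up) - f(*dn)) / (2.0 * h))
    return g

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def bracket(C, f, g, p):
    """{f, g}_C(p) = grad C . (grad f x grad g), evaluated numerically at p."""
    normal = cross(grad(f, p), grad(g, p))
    return sum(u * v for u, v in zip(grad(C, p), normal))

# Hypothetical constraint (round sphere) and the three coordinate functions.
def sphere(x, y, z):
    return 0.5 * (x * x + y * y + z * z - 1.0)

def X(x, y, z):
    return x

def Y(x, y, z):
    return y

def Z(x, y, z):
    return z

if __name__ == "__main__":
    p = (0.3, -0.7, 1.1)
    print(bracket(sphere, X, Y, p), "vs dC/dz =", p[2])
    print(bracket(sphere, Y, Z, p), "vs dC/dx =", p[0])
    print(bracket(sphere, Z, X, p), "vs dC/dy =", p[1])
    print(bracket(sphere, sphere, X, p), "vs 0 (C is a Casimir)")
```

For this choice of $C$ the three coordinate brackets reproduce the classical $su(2)$-type relations $\{x,y\} = z$, $\{y,z\} = x$, $\{z,x\} = y$.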
In a second step the resulting algebra is divided by an ideal generated by the constraint polynomial $C$, thus giving a non-commutative version of the functions on $\Sigma$. In a third step matrix representations of any size $N$ of this latter algebra are constructed where the parameter $\hbar$ takes specific values depending on $N$. It is noteworthy that the construction does not require the zero set of $C$ to be a regular surface. Thus, even for a singular surface (e.g., in the transition from sphere to torus) the non-commutative analogue is still well defined. The main result of this paper is an explicit construction of non-commutative (non-round) spheres and tori, including the transition region with a singular surface that emerges at the point of topology change. Encouraged by the explicit construction and by the fact that for the two-torus our results almost coincide with the older results of [BHSS91], we are quite optimistic that for the case of genus $g \ge 2$ this embedding approach may give more explicit constructions than the existence proof in [KL92] and [BMS94].

The paper is organized as follows: In Sect. 1 we describe Riemann surfaces of genus $g$ embedded in $\mathbb{R}^3$ as inverse images of polynomial constraint-functions, $C(\vec{x})$. The above-mentioned Poisson bracket $\{\ ,\ \}_C$ on $\mathbb{R}^3$ is treated in Sect. 2, where the bracket restricts to a symplectic bracket on the embedded Riemann surface $\Sigma$. In Sect. 3 steps one and two of the above programme are carried out explicitly for a polynomial constraint $C$ describing the two-sphere, the two-torus, and a transition region: we give a system of relations (Eqs. (3.2), (3.3), (3.4)), and show that this system satisfies the hypothesis of the Diamond lemma, thus proving that the non-commutative algebra
carries a multiplication which is a converging deformation of the pointwise multiplication of polynomials in three commuting variables (see Proposition 3.1). In the central Sect. 4 we completely classify all the finite-dimensional representations of the algebras constructed in the preceding section (two-sphere, two-torus, and transition) which are hermitian in the sense that the variables $x$, $y$, and $z$ are sent to hermitian $N \times N$-matrices. The main technical tool is graph-theory describing the non-zero entries of the matrices. Next, in Sect. 5, we confirm that the eigenvalue sequences of these representations reflect topology in the sense suggested in [Shi04]. The final Sect. 6 compares the classification results of Sect. 4 with previously known matrix constructions for the sphere and the torus. In the case of the torus it is shown that our result agrees with what can be obtained by (variants of) Berezin-Toeplitz quantization, see e.g. [BHSS91].

1. Genus g Riemann Surfaces

The aim of this section is to present compact connected Riemann surfaces of any genus embedded in $\mathbb{R}^3$ by inverse images of polynomials. For this purpose we use the regular value theorem and Morse theory. Let $C$ be a polynomial in 3 variables and define $\Sigma = C^{-1}(\{0\})$. What are the conditions on $C$ for $\Sigma$ to be a genus $g$ Riemann surface? If the restriction of $C$ to $\Sigma$ is a submersion, then $\Sigma$ is an orientable submanifold of $\mathbb{R}^3$. $\Sigma$ has to be compact and of the desired genus. For further details see [Hir76,Hof02]. The classification of 2-dimensional compact (connected) manifolds is well-known. In this case there is a one to one correspondence between topological and diffeomorphism classes. The result is that any compact orientable surface is homeomorphic (hence diffeomorphic) to a sphere or to a surface obtained by gluing tori together (connected sum). The number $g$ of tori is called the genus and is related to the Euler-Poincaré characteristic by the formula $\chi = 2 - 2g$.

To compute $\chi(\Sigma)$ we apply Morse theory to a specific function. A point $p$ of a (smooth) function $f$ on $\Sigma$ is a singular point if $Df_p = 0$, in which case $f(p)$ is a singular value. At any singular point $p$ one can consider the second derivative $D^2 f_p$ of $f$, and $p$ is said to be non-degenerate if $\det(D^2 f_p) \neq 0$. Moreover, one can attach an index to each such point depending on the signature of $D^2 f$: 0 if positive, 1 if hyperbolic and 2 if negative. A Morse function is a function such that every singular point is non-degenerate and singular values are all distinct. Then $\chi(\Sigma)$ is given by the formula

$$\chi(\Sigma) = n(0) - n(1) + n(2),$$

where $n(i)$ is the number of singular points which have index $i$. The Cote $x$ function is defined as the restriction of the first coordinate on the surface. It is not necessarily a Morse function (one has to choose a "good" embedding for that), but the singular points are those for which the gradient $\operatorname{grad} C$ is parallel to the $Ox$ axis. Moreover the Hessian matrix of Cote $x$ at such a point $p$ is:

$$-\frac{1}{\frac{\partial C}{\partial x}(p)} \begin{pmatrix} \frac{\partial^2 C}{\partial y^2}(p) & \frac{\partial^2 C}{\partial y \partial z}(p) \\ \frac{\partial^2 C}{\partial y \partial z}(p) & \frac{\partial^2 C}{\partial z^2}(p) \end{pmatrix}.$$

Take

$$C(\vec{x}) = \frac{1}{2}\big(P(x) + y^2\big)^2 + \frac{1}{2} z^2 - \frac{1}{2} c,$$
J. Arnlind, M. Bordemann, L. Hofer, J. Hoppe, H. Shimada
where c > 0, $P(x) = a_{2k} x^{2k} + a_{2k-1} x^{2k-1} + \cdots + a_1 x + a_0$ with $a_{2k} > 0$ and k > 0. Obviously $\Sigma$ is closed and bounded (P has even degree), hence compact. $\Sigma$ is a submanifold of $\mathbb{R}^3$ if, and only if, $DC_p \neq 0$ for each $p \in \Sigma$, which is equivalent to requiring that the polynomials $P - \sqrt{c}$ and $P + \sqrt{c}$ have only simple roots. The singular points of the $\mathrm{Cote}_x$ function on $\Sigma$ are the points (x, 0, 0) such that $P(x)^2 = c$, and the Hessian matrix is
\[ -\frac{1}{\frac{\partial C}{\partial x}(x,0,0)} \begin{pmatrix} 2P(x) & 0 \\ 0 & 1 \end{pmatrix}. \]
Hence it is positive or negative if, and only if, $P(x) = \sqrt{c}$, and hyperbolic if, and only if, $P(x) = -\sqrt{c}$. Thanks to the fact that P(x) never vanishes at a singular point, this also shows that $\mathrm{Cote}_x$ is a genuine Morse function. Finally,
\[ n(0) + n(2) = \#\{P = \sqrt{c}\} \quad\text{and}\quad n(1) = \#\{P = -\sqrt{c}\}. \]
If the polynomial $P - \sqrt{c}$ has exactly 2 simple roots and the polynomial $P + \sqrt{c}$ has exactly 2g simple roots, then $\chi(\Sigma) = 2 - 2g$ and $\Sigma$ is a surface of genus g. Let g > 0. Set:
(i) $G(t) = (t - 1)(t - 2^2) \cdots (t - g^2)$ and $M = \max_{0 \le t \le g^2 + 1} G(t)$, $\alpha \in \bigl(0, \frac{2\sqrt{c}}{M}\bigr)$,
(ii) $Q(x) = \alpha G(x) - \sqrt{c}$ and $P(x) = Q(x^2)$.
One can directly see that $Q + \sqrt{c}$ has exactly g simple roots, hence $P + \sqrt{c}$ has exactly 2g simple roots. For $t \in [0, g^2 + 1]$, the function $Q(t) - \sqrt{c}$ has no zero. On the other hand, for $t \ge g^2 + 1$, $Q(t) - \sqrt{c}$ is strictly increasing and has exactly one zero. Consequently the polynomial $P - \sqrt{c}$ has exactly 2 simple roots and the surface $\Sigma$ defined above is a genus g compact Riemann surface. Note that non-compact, respectively non-polynomial, higher genus Riemann surfaces have been considered in [BKL05].

2. The Construction for General Riemann Surfaces

For arbitrary smooth $C : \mathbb{R}^3 \to \mathbb{R}$,
\[ \{f, g\}_{\mathbb{R}^3} := \nabla C \cdot (\nabla f \times \nabla g) \tag{2.1} \]
defines a Poisson bracket for functions on $\mathbb{R}^3$ (see e.g. Nowak [Now97], who studied the formal deformability of (2.1)).¹ Clearly, C is a Casimir function of the bracket, i.e. C commutes with every function. Let now, as in Sect. 1, $\Sigma_g \subset \mathbb{R}^3$ be described as $C^{-1}(0)$ with
\[ C(\vec{x}) = \frac{1}{2}\bigl(P(x) + y^2\bigr)^2 + \frac{1}{2} z^2 - \frac{1}{2} c, \tag{2.2} \]
and c > 0. For this choice of C, the bracket $\{\cdot,\cdot\}_{\mathbb{R}^3}$ defines a Poisson bracket on $\Sigma_g$ through restriction. The Poisson brackets between x, y and z read:
\[ \{x, y\}_{\mathbb{R}^3} = \partial_z C = z, \qquad \{y, z\}_{\mathbb{R}^3} = \partial_x C = P'(x)\bigl(P(x) + y^2\bigr), \qquad \{z, x\}_{\mathbb{R}^3} = \partial_y C = 2y\bigl(P(x) + y^2\bigr). \tag{2.3} \]
1 While we did not (yet find a way to) use his results, we are very grateful for his “New Year’s Eve” explanations, as well as providing us with his Ph.D. Thesis.
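The polynomial P entering (2.2) can be taken from the genus-g construction of Sect. 1, and the root counts behind $\chi = 2 - 2g$ are easy to check numerically. The following is only a sketch under stated assumptions: the function names, the choice of α as half of its upper bound, the grid estimate of M and the sign-change root counter are illustrative and not part of the original text.

```python
import math


def genus_g_polynomial(g, c=1.0, alpha_fraction=0.5, samples=4001):
    """Build P(x) = Q(x^2), Q(t) = alpha*G(t) - sqrt(c), G(t) = prod_j (t - j^2).

    Assumption: alpha is chosen as alpha_fraction * 2*sqrt(c)/M, where M is
    only estimated on a grid over [0, g^2 + 1] (the endpoint is included,
    so the estimate is positive).
    """
    sqrt_c = math.sqrt(c)

    def G(t):
        value = 1.0
        for j in range(1, g + 1):
            value *= t - j * j
        return value

    upper = g * g + 1
    M = max(G(upper * i / (samples - 1)) for i in range(samples))
    alpha = alpha_fraction * 2.0 * sqrt_c / M

    def P(x):
        return alpha * G(x * x) - sqrt_c

    return P, sqrt_c


def count_sign_changes(f, lo, hi, intervals=4096):
    """Count simple roots of f in (lo, hi) via sign changes on a midpoint grid."""
    h = (hi - lo) / intervals
    values = [f(lo + (i + 0.5) * h) for i in range(intervals)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


def morse_counts(g, c=1.0):
    """Return (#{P = sqrt(c)}, #{P = -sqrt(c)}, Euler characteristic)."""
    P, sqrt_c = genus_g_polynomial(g, c)
    bound = g + 1.0  # assumed search window; wide enough for small g
    n_extrema = count_sign_changes(lambda x: P(x) - sqrt_c, -bound, bound)
    n_saddles = count_sign_changes(lambda x: P(x) + sqrt_c, -bound, bound)
    return n_extrema, n_saddles, n_extrema - n_saddles


if __name__ == "__main__":
    for g in (1, 2, 3):
        print(g, morse_counts(g))
```

For g = 1 and c = 1 this choice of α gives $P(x) = x^2 - 2$, i.e. the torus case $\mu > \sqrt{c}$ of Sect. 3.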
Noncommutative Riemann Surfaces by Embeddings in R3
We claim that fuzzy analogues of $\Sigma_g$ can be obtained via matrix analogues of (2.3). Apart from possible “explicit 1/N corrections”, direct ordering questions arise on the r.h.s. of (2.3), while on the l.h.s. one replaces Poisson brackets by commutators, i.e. $\{\cdot,\cdot\} \to \frac{1}{i\hbar}[\cdot,\cdot]$. We present the following Ansatz for the C-algebra of $\Sigma_g$, given as three relations in the free algebra generated by the letters X, Y, Z:
\[ [X, Y] = i\hbar Z, \tag{2.4} \]
\[ [Y, Z] = i\hbar \sum_{r=1}^{2g} a_r \sum_{i=0}^{r-1} X^i \bigl(P(X) + Y^2\bigr) X^{r-1-i} \equiv \hat\varphi_X, \tag{2.5} \]
\[ [Z, X] = i\hbar \bigl(2Y^3 + Y P(X) + P(X) Y\bigr) \equiv \hat\varphi_Y, \tag{2.6} \]
where $\hbar$ is a positive real number and $P(X) = \sum_{r=0}^{2g} a_r X^r$. The particular ordering in (2.5) and (2.6) is chosen such that the three equations are consistent, in the sense of the Diamond Lemma [Ber78].

Proposition 2.1. Let $S = \{\sigma_X, \sigma_Y, \sigma_Z\}$ be a reduction system with
\[ \sigma_X = (W_X, f_X) = (ZY,\; YZ - \hat\varphi_X), \quad \sigma_Y = (W_Y, f_Y) = (ZX,\; XZ + \hat\varphi_Y), \quad \sigma_Z = (W_Z, f_Z) = (YX,\; XY - i\hbar Z). \]
This reduction system contains an ambiguity; i.e., there are two ways of reducing the word ZYX: either we replace ZY by $YZ - \hat\varphi_X$ or we replace YX by $XY - i\hbar Z$. The ambiguity is called resolvable if these two reductions eventually reduce to the same expression, by using that we can replace any occurrence of $W_X, W_Y, W_Z$ by $f_X, f_Y, f_Z$ respectively. The statement of this proposition is that the ambiguity (ZY)X = Z(YX) is resolvable if and only if $[X, \hat\varphi_X] + [Y, \hat\varphi_Y] = 0$, and that this relation is satisfied for the choice in (2.5) and (2.6).

Proof. The ambiguity is resolvable if we can show that $A := (YZ - \hat\varphi_X)X - Z(XY - i\hbar Z) = 0$ using only the possibility to replace any occurrence of $W_i$ with $f_i$, for i = X, Y, Z. We get
\[ \begin{aligned} A &= YZX - ZXY - \hat\varphi_X X + i\hbar Z^2 \\ &= Y(XZ + \hat\varphi_Y) - (XZ + \hat\varphi_Y)Y - \hat\varphi_X X + i\hbar Z^2 \\ &= YXZ - XZY + [Y, \hat\varphi_Y] - \hat\varphi_X X + i\hbar Z^2 \\ &= (XY - i\hbar Z)Z - X(YZ - \hat\varphi_X) + [Y, \hat\varphi_Y] - \hat\varphi_X X + i\hbar Z^2 \\ &= [X, \hat\varphi_X] + [Y, \hat\varphi_Y]. \end{aligned} \]
It is then straightforward to check that $[Y, \hat\varphi_Y] = -[X, \hat\varphi_X]$ for the choice in (2.5) and (2.6).

Finding explicit representations of (2.4)–(2.6), let alone classifying them, is of course a very complicated task. We succeeded in doing so for a (continuously deformable) class of surfaces corresponding to spheres and tori.
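The consistency condition $[X, \hat\varphi_X] + [Y, \hat\varphi_Y] = 0$ of Proposition 2.1 involves only X and Y, so it can be checked by brute force in the free algebra on two letters. The sketch below is illustrative and not taken from the paper: polynomials are stored as dictionaries from words to coefficients, the common factor $i\hbar$ of (2.5) and (2.6) is dropped, and all function names are assumptions.

```python
def clean(p):
    """Drop zero coefficients so that the zero polynomial is the empty dict."""
    return {w: c for w, c in p.items() if c != 0}


def add(p, q, sign=1):
    out = dict(p)
    for w, c in q.items():
        out[w] = out.get(w, 0) + sign * c
    return clean(out)


def mul(p, q):
    """Product in the free algebra: words are concatenated."""
    out = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return clean(out)


def comm(p, q):
    return add(mul(p, q), mul(q, p), sign=-1)


def monomial(word, coeff=1):
    return clean({word: coeff})


def P_of_X(a):
    """P(X) = sum_r a[r] X^r; the empty word is the unit."""
    out = {}
    for r, a_r in enumerate(a):
        out = add(out, monomial("X" * r, a_r))
    return out


def phi_X(a):
    """Right-hand side of (2.5) without the factor i*hbar."""
    inner = add(P_of_X(a), monomial("YY"))
    out = {}
    for r in range(1, len(a)):
        for i in range(r):
            left = monomial("X" * i, a[r])
            right = monomial("X" * (r - 1 - i))
            out = add(out, mul(mul(left, inner), right))
    return out


def phi_Y(a):
    """Right-hand side of (2.6) without the factor i*hbar."""
    P, Y = P_of_X(a), monomial("Y")
    return add(add(monomial("YYY", 2), mul(Y, P)), mul(P, Y))


def consistency_defect(a):
    """[X, phi_X] + [Y, phi_Y]; the empty dict means the ambiguity resolves."""
    X, Y = monomial("X"), monomial("Y")
    return add(comm(X, phi_X(a)), comm(Y, phi_Y(a)))
```

With a = [-2, 0, 1], i.e. $P(x) = x^2 - 2$, `phi_X` returns the ordering $2X^3 + XY^2 + Y^2X - 4X$, and the defect vanishes identically; the same holds for integer coefficient lists of higher degree.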
3. The Torus and Sphere C-Algebras

Let us now take $P(x) = x^2 - \mu$, in which case $C^{-1}(0)$, with
\[ C(x, y, z) = \frac{1}{2}\bigl(x^2 + y^2 - \mu\bigr)^2 + \frac{1}{2} z^2 - \frac{1}{2} c \qquad (c > 0), \tag{3.1} \]
describes a surface of revolution which is a torus for $\mu > \sqrt{c}$ and a sphere for $-\sqrt{c} < \mu < \sqrt{c}$. As $\mu$ increases from $-\sqrt{c}$ to $\sqrt{c}$ the (almost) round sphere gets deformed by introducing two growing “sinks”; one at the north pole and one at the south pole. At the critical point $\mu = \sqrt{c}$ the two sinks meet and the surface develops a singularity. For larger $\mu$ the singularity vanishes and a hole appears, giving the topology of a torus. The corresponding C-algebra is defined as the quotient of the free algebra $\mathbb{C}\langle X, Y, Z\rangle$ with the two-sided ideal generated by the relations
\[ [X, Y] = i\hbar Z, \tag{3.2} \]
\[ [Y, Z] = i\hbar \bigl(2X^3 + XY^2 + Y^2 X - 2\mu X\bigr), \tag{3.3} \]
\[ [Z, X] = i\hbar \bigl(2Y^3 + YX^2 + X^2 Y - 2\mu Y\bigr). \tag{3.4} \]
By introducing W = X + iY and V = X − iY one can rewrite (3.3) and (3.4) as
\[ \bigl(W^2 V + V W^2\bigr)(1 + \hbar^2) = 4\mu\hbar^2 W + 2(1 - \hbar^2) WVW, \tag{3.5} \]
\[ \bigl(V^2 W + W V^2\bigr)(1 + \hbar^2) = 4\mu\hbar^2 V + 2(1 - \hbar^2) VWV, \tag{3.6} \]
and we denote by $I(\mu, \hbar)$ the ideal generated by these relations. Through the “Diamond lemma” [Ber78] one can explicitly construct a basis of this algebra.

Proposition 3.1. Let $C(\mu, \hbar) = \mathbb{C}\langle W, V\rangle / I(\mu, \hbar)$. Then a basis of $C(\mu, \hbar)$ is given by $\{V^i (WV)^j W^k : i, j, k = 0, 1, 2, \ldots\}$. As a vector space, $C(\mu, \hbar)$ is therefore isomorphic to the space of commutative polynomials $\mathbb{C}[X, Y, Z]$.

Proof. In the notation of the Diamond Lemma, let $S = \{\sigma_1, \sigma_2\}$ be a reduction system with
\[ \sigma_1 = (w_{\sigma_1}, f_{\sigma_1}) = \Bigl(W^2 V,\; \frac{4\mu\hbar^2}{1 + \hbar^2} W + \frac{2(1 - \hbar^2)}{1 + \hbar^2} WVW - VW^2\Bigr), \]
\[ \sigma_2 = (w_{\sigma_2}, f_{\sigma_2}) = \Bigl(W V^2,\; \frac{4\mu\hbar^2}{1 + \hbar^2} V + \frac{2(1 - \hbar^2)}{1 + \hbar^2} VWV - V^2 W\Bigr), \]
and let ≤ be a partial ordering on $\langle W, V\rangle$ such that p < q if either the total degree (in W and V) of p is less than the total degree of q, or if p is a permutation of the letters in q and the misordering index of p is less than the misordering index of q. The misordering index of a word $a_1 a_2 \ldots a_k$ is defined to be the number of pairs $(a_k, a_{k'})$ with $k < k'$ such that $a_k = W$ and $a_{k'} = V$. This partial ordering is compatible with S in the sense that every word in $f_{\sigma_i}$ is less than $w_{\sigma_i}$.
We will now argue that the partial ordering fulfills the descending chain condition, i.e. that every sequence of words such that $w_1 \ge w_2 \ge \cdots$ eventually becomes constant. Assume that $w_1$ has degree d and misordering index i. If $w_1 > w_k$, then d or i must decrease by at least 1. Since both the degree and the misordering index are non-negative integers, an infinite sequence of strictly decreasing words cannot exist. The reduction system S has one overlap ambiguity, namely, there are two ways to reduce the word $W^2 V^2$; either one writes it as $(W^2 V)V$ and uses $\sigma_1$, or one writes it as $W(WV^2)$ and uses $\sigma_2$. In an associative algebra, these must clearly be the same, and if they do reduce to the same expression, we call the ambiguity resolvable. It is now straightforward to check that the indicated ambiguity is in fact resolvable. The above observations allow for the use of the Diamond lemma, which in particular states that a basis for $C(\mu, \hbar)$ is given by the set of irreducible words. In this particular case, it is clear that the words $V^i (WV)^k W^j$ are irreducible (since they do not contain $W^2 V$ or $WV^2$) and that there are no other irreducible words.

By a straightforward calculation, using (3.5) and (3.6), one proves the following result.

Proposition 3.2. Define $D = WV$, $\tilde D = VW$ and $\hat C = (D + \tilde D - 2\mu)^2 + (D - \tilde D)^2/\hbar^2$. Then it holds that
(i) $[D, \tilde D] = 0$,
(ii) $[W, \hat C] = [V, \hat C] = 0$.

In particular, this means that the direct non-commutative analogue of the constraint (3.1) is a Casimir of $C(\mu, \hbar)$. Let us make a remark on the possibility of choosing a different ordering when constructing a non-commutative analogue of the Poisson algebra. Assume we choose to completely symmetrize the r.h.s. of Eqs. (2.3). Then, the defining relations of the algebra become
\[ [X, Y] = i\hbar Z, \]
\[ [Y, Z] = 2i\hbar \Bigl(X^3 + \frac{1}{3}\bigl(XY^2 + Y^2 X + YXY\bigr) - \mu X\Bigr), \]
\[ [Z, X] = 2i\hbar \Bigl(Y^3 + \frac{1}{3}\bigl(YX^2 + X^2 Y + XYX\bigr) - \mu Y\Bigr). \]
Again, defining W = X + iY and V = X − iY gives
\[ \bigl(W^2 V + V W^2\bigr)(1 + 4\hbar^2/3) = 4\mu\hbar^2 W + 2(1 - 2\hbar^2/3) WVW, \]
\[ \bigl(V^2 W + W V^2\bigr)(1 + 4\hbar^2/3) = 4\mu\hbar^2 V + 2(1 - 2\hbar^2/3) VWV, \]
which, by rescaling $\hbar^2 = \frac{3\hbar'^2}{3 - \hbar'^2}$, can be brought to the form of Eqs. (3.5) and (3.6), with $\hbar'$ as the new parameter.
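The combinatorial part of the proof of Proposition 3.1 can be explored by enumeration. The sketch below is illustrative only (function names are assumptions): it computes the misordering index of a word in the letters W and V, lists the words of a given length that avoid the leading words $W^2V$ and $WV^2$ of the reduction system, and compares them with the normal forms $V^i (WV)^j W^k$.

```python
from itertools import product


def misordering_index(word):
    """Number of pairs of positions k < k' with word[k] == 'W' and word[k'] == 'V'."""
    pairs = 0
    ws_seen = 0
    for letter in word:
        if letter == "W":
            ws_seen += 1
        elif letter == "V":
            pairs += ws_seen
    return pairs


def irreducible_words(d):
    """Words of length d in V, W containing neither WWV nor WVV."""
    words = ("".join(w) for w in product("VW", repeat=d))
    return sorted(w for w in words if "WWV" not in w and "WVV" not in w)


def normal_form_words(d):
    """Words V^i (WV)^j W^k of total length i + 2j + k = d."""
    return sorted(
        "V" * i + "WV" * j + "W" * (d - i - 2 * j)
        for j in range(d // 2 + 1)
        for i in range(d - 2 * j + 1)
    )


if __name__ == "__main__":
    for d in range(6):
        print(d, irreducible_words(d) == normal_form_words(d), len(normal_form_words(d)))
```

In each length the two lists agree, and every word on the right-hand side of $\sigma_1$ or $\sigma_2$ has a smaller misordering index than the word it replaces.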
4. Representations of the Torus and Sphere Algebras

Let us now turn to the task of finding representations $\phi$ of the algebra $C(\mu, \hbar)$, with $0 < \hbar < 1$, for which $\phi(X)$, $\phi(Y)$, $\phi(Z)$ are hermitian matrices, i.e. $\phi(W)^\dagger = \phi(V)$. First, we observe that any such representation is completely reducible; hence, in the following, we need only consider irreducible representations.
Proposition 4.1. Any representation $\phi$ of $C(\mu, \hbar)$ such that $\phi(W)^\dagger = \phi(V)$ is completely reducible.

Proof. Since the algebra of all complex N × N matrices equipped with the sup-norm is a $C^*$-algebra, it is clear that any $*$-subalgebra is completely reducible. For the convenience of the reader, we give the algebraic proof. Let $\phi$ be a representation of $C(\mu, \hbar)$ fulfilling the conditions in the proposition. Moreover, let A be the subalgebra of the full matrix algebra generated by $\phi(W)$ and $\phi(V)$. First we note that since $\phi(V) = \phi(W)^\dagger$, the algebra A is invariant under hermitian conjugation; thus, given $M \in A$, we know that $M^\dagger \in A$. We prove that Rad(A) (the radical of A), i.e. the largest nilpotent ideal of A, vanishes, which implies, by the Wedderburn-Artin theorem, see e.g. [ASS06], that $\phi$ is completely reducible. Let $M \in \mathrm{Rad}(A)$. Since Rad(A) is an ideal it follows that $M^\dagger M \in \mathrm{Rad}(A)$. For a finite-dimensional algebra, Rad(A) is nilpotent, which in particular implies that there exists a positive integer m such that $(M^\dagger M)^m = 0$. It follows that M = 0, hence Rad(A) = 0.

In the following, we shall always assume that $\phi$ is an hermitian irreducible representation of $C(\mu, \hbar)$. For these representations, $\phi(D)$ and $\phi(\tilde D)$ (as defined in Proposition 3.2) will be two commuting hermitian matrices and therefore one can always choose a basis such that they are both diagonal. We then conclude that the value of the Casimir $\hat C$ will always be a non-negative real number, which we will denote by 4c. Finding hermitian representations of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$ thus amounts to solving the matrix equations
\[ (WD + \tilde D W)(1 + \hbar^2) = 4\mu\hbar^2 W + (1 - \hbar^2)(W\tilde D + DW), \tag{4.1} \]
\[ \bigl(D + \tilde D - 2\mu\mathbb{1}\bigr)^2 + \frac{1}{\hbar^2}\bigl(D - \tilde D\bigr)^2 = 4c\,\mathbb{1}, \tag{4.2} \]
with $D = WW^\dagger = \mathrm{diag}(d_1, d_2, \ldots, d_N)$ and $\tilde D = W^\dagger W = \mathrm{diag}(\tilde d_1, \tilde d_2, \ldots, \tilde d_N)$ being diagonal matrices with non-negative eigenvalues. The “constraint” (4.2) constrains the pairs $\vec{x}_i = (d_i, \tilde d_i)$ to lie on the ellipse $(x + y - 2\mu)^2 + (x - y)^2/\hbar^2 = 4c$, e.g. as in Fig. 1. Representations with c = 0, which we shall call degenerate, are particularly simple, and can be directly characterized.

Proposition 4.2. Let $\phi$ be an hermitian representation of $C(\mu, \hbar)$ such that $\phi(\hat C) = 0$. Then $\mu \ge 0$ and there exists a unitary matrix U such that $\phi(W) = \sqrt{\mu}\, U$.

Proof. When D and $\tilde D$ are non-negative diagonal matrices, c = 0 implies $D = \tilde D = \mu\mathbb{1}$ via (4.2), which necessarily gives $\mu \ge 0$. In this case, Eq. (4.1) is identically satisfied, and we are left with solving the equations $WW^\dagger = W^\dagger W = \mu\mathbb{1}$. Hence, there exists a unitary matrix U such that $W = \sqrt{\mu}\, U$.

Assume in the following that c > 0. We note that any representation $\phi'$ of $C(\mu', \hbar)$, with $\phi'(\hat C) = 4c'\,\mathbb{1}$, can be obtained from a representation $\phi$ of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$, if $\mu/\sqrt{c} = \mu'/\sqrt{c'}$. Namely, one simply defines $\phi'(W) := \sqrt[4]{c'/c}\;\phi(W)$.

Proposition 4.3. Let $\phi$ be an hermitian representation of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$. Then it holds that $-\sqrt{c} \le \mu$.
Fig. 1. The constraint ellipse
Proof. Assume that there exists a representation of $C(\mu, \hbar)$ with $-\sqrt{c} > \mu$. Then the diagonal components of Eq. (4.2) describe an ellipse in the $(d, \tilde d)$-plane, for which all points $(d, \tilde d)$ satisfy that either d or $\tilde d$ is strictly negative. This contradicts the fact that D and $\tilde D$ have non-negative eigenvalues. Hence, $-\sqrt{c} \le \mu$.

Writing out (4.1) in components gives
\[ W_{ij}\Bigl[(\hbar^2 + 1)(\tilde d_i + d_j) + (\hbar^2 - 1)(d_i + \tilde d_j) - 4\mu\hbar^2\Bigr] = 0, \tag{4.3} \]
and we also note that $W\tilde D = DW$ yields $W_{ij}(d_i - \tilde d_j) = 0$. If $W_{ij} \neq 0$, the two equations give a relation between the pairs $\vec{x}_i = (d_i, \tilde d_i)$ and $\vec{x}_j = (d_j, \tilde d_j)$. Namely, $\vec{x}_j = s(\vec{x}_i)$ with
\[ s\bigl(d, \tilde d\bigr) = \bigl(4\mu\sin^2\theta + 2d\cos 2\theta - \tilde d,\; d\bigr), \tag{4.4} \]
where $\hbar = \tan\theta$ for $0 < \theta < \pi/4$. The map s is better understood if we introduce coordinates $z(\vec{x}) = (d - \tilde d)/\hbar$ and $\varphi(\vec{x}) = d + \tilde d - 2\mu$, in which case one finds that
\[ \begin{pmatrix} z\bigl(s(\vec{x})\bigr) \\ \varphi\bigl(s(\vec{x})\bigr) \end{pmatrix} = \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix} \begin{pmatrix} z(\vec{x}) \\ \varphi(\vec{x}) \end{pmatrix}. \tag{4.5} \]
We conclude that s amounts to a “rotation” on the ellipse described by the constraint (4.2). Let us collect some basic facts about s in the next proposition.

Proposition 4.4. Let $s : \mathbb{R}^2 \to \mathbb{R}^2$ be the map as defined above and let $q = e^{2i\theta}$. Then
(i) s is a bijection,
(ii) if $\vec{x}(\beta_0) = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt{c}} + \frac{\cos(\beta_0 + 2\theta)}{\cos\theta},\; \frac{\mu}{\sqrt{c}} + \frac{\cos\beta_0}{\cos\theta}\Bigr)$, then $s^l\bigl(\vec{x}(\beta_0)\bigr) = \vec{x}(\beta_0 + 2l\theta)$,
(iii) $s(\vec{x}) = \vec{x}$ if and only if $\vec{x} = (\mu, \mu)$,
(iv) if $\vec{x} \neq (\mu, \mu)$, then $s^n(\vec{x}) = \vec{x}$ if and only if $q^n = 1$.

From these considerations one realizes that it will be important to keep track of the pairs (i, j) for which $W_{ij} \neq 0$. This leads us to a graph representation of the matrix W.
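The statements of Proposition 4.4 can be tested numerically. The sketch below is not part of the paper; it assumes that points are ordered as $(d, \tilde d)$, that s is given by (4.4), and that the parametrization has first component $\mu + \sqrt{c}\cos(\beta + 2\theta)/\cos\theta$ and second component $\mu + \sqrt{c}\cos\beta/\cos\theta$. Function names are illustrative.

```python
import math


def s_map(x, mu, theta):
    """The map (4.4): (d, d~) -> (4 mu sin^2(theta) + 2 d cos(2 theta) - d~, d)."""
    d, d_tilde = x
    return (4 * mu * math.sin(theta) ** 2 + 2 * d * math.cos(2 * theta) - d_tilde, d)


def x_of_beta(beta, mu, c, theta):
    """Assumed parametrization of the constraint ellipse by the angle beta."""
    amplitude = math.sqrt(c) / math.cos(theta)
    return (mu + amplitude * math.cos(beta + 2 * theta), mu + amplitude * math.cos(beta))


def constraint_value(x, mu, theta):
    """Left-hand side of the ellipse equation; equals 4c on the ellipse."""
    hbar = math.tan(theta)
    d, d_tilde = x
    return (d + d_tilde - 2 * mu) ** 2 + ((d - d_tilde) / hbar) ** 2


def iterate(x, mu, theta, l):
    for _ in range(l):
        x = s_map(x, mu, theta)
    return x


if __name__ == "__main__":
    mu, c, theta, beta = 2.0, 1.0, math.pi / 5, 0.3
    print(iterate(x_of_beta(beta, mu, c, theta), mu, theta, 5))
    print(x_of_beta(beta, mu, c, theta))
```

With $\theta = \pi/5$ one has $q^5 = 1$, so five applications of s return to the starting point, in line with item (iv).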
4.1. Graph representation of matrices. In this section we will introduce the directed graph of the matrix W. See, e.g., [FH94] for the standard terminology concerning directed graphs.

Definition 4.5. Let G = (V, E) be a directed graph on N vertices with vertex set V = {1, 2, ..., N} and edge set E ⊆ V × V. We say that an N × N matrix W is associated to G (or G is associated to W) if it holds that $(ij) \in E \Leftrightarrow W_{ij} \neq 0$.

Given an equation for W, we say that a graph G is a solution if G is associated to a matrix W solving the equation. Needless to say, for a given solution G there might exist many different (matrix) solutions associated to G. A graph with several disconnected components is clearly associated to a matrix that is a direct sum of matrices; hence, it suffices to consider connected graphs. In the following, a solution will always refer to a solution of (4.1). Given a connected solution G, we note that given the value of $\vec{x}_i = (d_i, \tilde d_i)$, for any i, we can compute $\vec{x}_k = (d_k, \tilde d_k)$, for all k, using (4.4). Namely, since G is connected, we can always find a sequence of numbers $i = i_1, i_2, \ldots, i_l = k$, such that $W_{i_j i_{j+1}} \neq 0$ or $W_{i_{j+1} i_j} \neq 0$, which will give us $\vec{x}_k = s^m(\vec{x}_i)$, where m is the difference between the number of edges (in the path) directed from i and the number of edges directed towards i.

Proposition 4.6. Let G = (V, E) be a connected non-degenerate solution. Then
(i) G has no self-loops (i.e. $(ii) \notin E$),
(ii) there is at most one edge between any pair of vertices.

Proof. In both cases, assuming the opposite, it follows from (4.3) that there exists an i such that $d_i = \tilde d_i = \mu$. Since the graph is connected we will have $d_i = \tilde d_i = \mu$ for all i ($(\mu, \mu)$ is indeed the fixed point of s), giving c = 0. Hence, a non-degenerate solution will satisfy the two conditions above.

Any finite directed graph has a directed cycle, which we shall call loop, or a directed path from a transmitter (i.e. a vertex having no incoming edges) to a receiver (i.e. a vertex having no outgoing edges), which we shall call string. The existence of a loop or a string imposes restrictions on the corresponding representations. From Proposition 4.4 we immediately get:

Proposition 4.7. Let G be a non-degenerate solution containing a loop on n vertices. Then $q^n = 1$.

Lemma 4.8. Let G be a solution. The vertex i is a transmitter if and only if $\tilde d_i = 0$. The vertex i is a receiver if and only if $d_i = 0$.

Proof. Since $D = WW^\dagger$ and $\tilde D = W^\dagger W$, we have
\[ d_i = \sum_k W_{ik}\overline{W}_{ik} = \sum_k |W_{ik}|^2, \qquad \tilde d_i = \sum_k \overline{W}_{ki} W_{ki} = \sum_k |W_{ki}|^2, \]
and it follows that $d_i = 0$ if and only if $W_{ik} = 0$ for all k, i.e. i is a receiver. In the same way $\tilde d_i = 0$ if and only if $W_{ki} = 0$ for all k, i.e. i is a transmitter.
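Definition 4.5 and Lemma 4.8 translate directly into a few lines of code. The following sketch is illustrative (function names are assumptions, and vertices are numbered from 0 rather than from 1): it reads off the edge set of a matrix and detects transmitters and receivers from the diagonal entries of $W^\dagger W$ and $WW^\dagger$.

```python
def edges_of(W, tol=1e-12):
    """Edge set of the directed graph associated to W: (i, j) is an edge iff W[i][j] != 0."""
    n = len(W)
    return {(i, j) for i in range(n) for j in range(n) if abs(W[i][j]) > tol}


def diagonal_of_D(W):
    """Diagonal entries d_i = sum_k |W_ik|^2 of W W^dagger (row sums)."""
    return [sum(abs(w) ** 2 for w in row) for row in W]


def diagonal_of_D_tilde(W):
    """Diagonal entries d~_i = sum_k |W_ki|^2 of W^dagger W (column sums)."""
    n = len(W)
    return [sum(abs(W[k][i]) ** 2 for k in range(n)) for i in range(n)]


def transmitters(W, tol=1e-12):
    """Vertices without incoming edges, detected through d~_i = 0."""
    return [i for i, value in enumerate(diagonal_of_D_tilde(W)) if value <= tol]


def receivers(W, tol=1e-12):
    """Vertices without outgoing edges, detected through d_i = 0."""
    return [i for i, value in enumerate(diagonal_of_D(W)) if value <= tol]


if __name__ == "__main__":
    string = [[0, 1, 0], [0, 0, 2], [0, 0, 0]]
    loop = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
    print(edges_of(string), transmitters(string), receivers(string))
    print(edges_of(loop), transmitters(loop), receivers(loop))
```

A string has exactly one transmitter and one receiver, while a loop has neither.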
Next we prove that if G is a solution, then G cannot contain both a string and a loop.

Lemma 4.9. Let G be a non-degenerate connected solution and assume that G has a transmitter or a receiver. Then G has no loop and therefore there exists a string.

Proof. Let us prove the case when a transmitter exists. Let us denote the transmitter by 1 ∈ V, and by Lemma 4.8 we have $\vec{x}_1 = (a, 0)$, for some a > 0. Assume that there exists a loop and let i be a vertex in the loop. Since G is connected there exists an integer $i'$ such that $\vec{x}_i = s^{i'}(\vec{x}_1)$. Let l be the number of vertices in the loop. From Proposition 4.7 we know that $q^l = 1$, which means that there are at most l different values of $\vec{x}_k$ in the graph, and all values are assumed by vertices in the loop. In particular this means that there exists a vertex k in the loop such that $\vec{x}_k = \vec{x}_1$. But this implies, by Lemma 4.8, that k is a transmitter, which contradicts the fact that k is part of a loop. Hence, if a transmitter exists, there exists no loop and therefore there must exist a string.

The above result suggests introducing the concept of loop representations and string representations, since all representations are associated to graphs that have either a loop or a string. Let us now prove a theorem providing the general structure of the representations.

Theorem 4.10. Let $\phi$ be an N-dimensional non-degenerate connected hermitian representation of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$. Then there exists a positive integer k dividing N, a unitary N × N matrix T, unitary N/k × N/k matrices $U_0, \ldots, U_{k-1}$ and $\beta, \tilde e_0, \ldots, \tilde e_{k-1} \in \mathbb{R}$ with $\tilde e_1, \ldots, \tilde e_{k-1} > 0$, such that
\[ T\phi(W)T^\dagger = \begin{pmatrix} 0 & \sqrt{\tilde e_1}\,U_1 & 0 & \cdots & 0 \\ 0 & 0 & \sqrt{\tilde e_2}\,U_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & \sqrt{\tilde e_{k-1}}\,U_{k-1} \\ \sqrt{\tilde e_0}\,U_0 & 0 & \cdots & 0 & 0 \end{pmatrix}, \tag{4.6} \]
\[ \tilde e_l = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt{c}} + \frac{\cos(2l\theta + \beta)}{\cos\theta}\Bigr). \tag{4.7} \]

Proof. Let U be a unitary N × N matrix such that $UDU^\dagger$ and $U\tilde D U^\dagger$ are diagonal, set $\hat W = U\phi(W)U^\dagger$ and let G be the graph associated to $\hat W$. Define $\{\hat x_0, \ldots, \hat x_{k-1}\}$ to be the set of pairwise different vectors out of the set $\{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_N\}$, such that $\hat x_{i+1} = s(\hat x_i)$ for i = 0, ..., k − 2 (which is always possible since G is connected), and write $\hat x_i = (e_i, \tilde e_i)$. We note that if G has a transmitter, it must necessarily correspond to the vector $\hat x_0$, in which case $\tilde e_0 = 0$. In particular this means that no vertex corresponding to $\hat x_i$, for i > 0, can be a transmitter and hence, by Lemma 4.8, $\tilde e_1, \ldots, \tilde e_{k-1} > 0$. Now, define
\[ V_i = \{ j \in V : \vec{x}_j = \hat x_i \}, \qquad i = 0, \ldots, k - 1, \]
and set $l_i = |V_i|$. Since $\hat x_{i+1} = s(\hat x_i)$, a necessary condition for an edge from a vertex in $V_i$ to a vertex in $V_j$ to exist is that j = i + 1 (counting indices modulo k). This implies that there exists a permutation $\sigma \in S_N$ (permuting vertices to give the order $V_0, \ldots, V_{k-1}$) such that
\[ W := \sigma \hat W \sigma^\dagger = \begin{pmatrix} 0 & W_1 & 0 & \cdots & 0 \\ 0 & 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & W_{k-1} \\ W_0 & 0 & \cdots & 0 & 0 \end{pmatrix}, \]
where $W_i$ is an $l_{i-1} \times l_i$ matrix (counting indices modulo k). In this basis we get
\[ D = \mathrm{diag}(\underbrace{e_0, \ldots, e_0}_{l_0}, \ldots, \underbrace{e_{k-1}, \ldots, e_{k-1}}_{l_{k-1}}) = WW^\dagger = \mathrm{diag}(W_1 W_1^\dagger, \ldots, W_{k-1} W_{k-1}^\dagger, W_0 W_0^\dagger), \]
\[ \tilde D = \mathrm{diag}(\underbrace{\tilde e_0, \ldots, \tilde e_0}_{l_0}, \ldots, \underbrace{\tilde e_{k-1}, \ldots, \tilde e_{k-1}}_{l_{k-1}}) = W^\dagger W = \mathrm{diag}(W_0^\dagger W_0, W_1^\dagger W_1, \ldots, W_{k-1}^\dagger W_{k-1}), \]
which gives $W_i W_i^\dagger = e_{i-1}\mathbb{1}_{l_{i-1}}$ and $W_i^\dagger W_i = \tilde e_i \mathbb{1}_{l_i}$. Since $\hat x_{i+1} = s(\hat x_i)$ we know that $\tilde e_{i+1} = e_i$, which implies that $W_i W_i^\dagger = \tilde e_i \mathbb{1}_{l_{i-1}}$ for i = 1, ..., k − 1. Any matrix satisfying such conditions must be a square matrix, i.e. $l_i = l_{i-1}$ for i = 1, ..., k − 1. Hence, $W_i$ is a square matrix of dimension N/k, and there exists a unitary matrix $U_i$ such that $W_i = \sqrt{\tilde e_i}\, U_i$. Moreover, we take T to be the unitary N × N matrix $\sigma U$. Finally, since every point $\hat x_i = (e_i, \tilde e_i)$ lies on the ellipse, there exists a $\beta_0$ such that $\hat x_0$ corresponds to the point $\sqrt{c}\,(\cos(\beta_0 + \theta), \sin(\beta_0 + \theta))$ in the $(z, \varphi)$-plane, as in Proposition 4.4.
By defining $\beta = \beta_0 + 2\theta$, we get, since $\hat x_{l+1} = s(\hat x_l)$, that $\tilde e_l = \sqrt{c}\,\bigl(\frac{\mu}{\sqrt{c}} + \frac{\cos(2l\theta + \beta)}{\cos\theta}\bigr)$.

The above theorem proves the structure of any connected representation, but the question of irreducibility still remains. We will now prove that any representation is in fact equivalent to a direct sum of representations where the $U_i$'s are 1 × 1 matrices.

Lemma 4.11. Let $W_1$ and $W_2$ be matrices such that
\[ W_1 = \begin{pmatrix} 0 & w_1 U_1 & 0 & \cdots & 0 \\ 0 & 0 & w_2 U_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & w_{n-1} U_{n-1} \\ w_0 U_0 & 0 & \cdots & 0 & 0 \end{pmatrix}; \qquad W_2 = \begin{pmatrix} 0 & w_1 \mathbb{1} & 0 & \cdots & 0 \\ 0 & 0 & w_2 \mathbb{1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & w_{n-1} \mathbb{1} \\ w_0 V & 0 & \cdots & 0 & 0 \end{pmatrix}, \]
where $U_0, \ldots, U_{n-1}$ are unitary matrices, $w_0, \ldots, w_{n-1} \in \mathbb{C}$ and V is a diagonal matrix such that $SVS^\dagger = U_1 U_2 \cdots U_{n-1} U_0$ for some unitary matrix S. Then there exists a unitary matrix P such that
\[ W_1 = P W_2 P^\dagger \quad\text{and}\quad W_1^\dagger = P W_2^\dagger P^\dagger. \]

Proof. Let us define P as $P = \mathrm{diag}(S, P_1, \ldots, P_{n-1})$ with $P_l = (U_1 U_2 \cdots U_l)^\dagger S$ for l = 1, ..., n − 1. Then one easily checks that $W_1 = P W_2 P^\dagger$ and $W_1^\dagger = P W_2^\dagger P^\dagger$.

Note that a graph associated to a matrix such as $W_2$ consists of n components, each being either a string ($\tilde e_0 = 0$) or a loop ($\tilde e_0 > 0$). Therefore, we have the following result.
Fig. 2. The constraint ellipse of a Toral representation
Theorem 4.12. Let $\phi$ be a non-degenerate hermitian representation of $C(\mu, \hbar)$. Then $\phi$ is unitarily equivalent to a representation whose associated graph is such that every connected component is either a string or a loop.

The existence of strings or loops will depend on the ratio $\mu/\sqrt{c}$, and therefore we split all connected representations of $C(\mu, \hbar)$ into three subsets, in correspondence with the original surface described by the polynomial C(x, y, z):
(a) $-1 < \mu/\sqrt{c} \le 1$: Spherical representations,
(b) $1 < \mu/\sqrt{c} \le 1/\cos\theta$: Critical toral representations,
(c) $1/\cos\theta < \mu/\sqrt{c}$: Toral representations.

4.2. Toral representations. For $\mu/\sqrt{c} > 1/\cos\theta$ the constraint ellipse lies entirely in the region where both d and $\tilde d$ are strictly positive, e.g. as in Fig. 2. In particular this implies, by Lemma 4.8, that a graph associated to a toral representation cannot have any transmitters or receivers. Hence, it must have a loop, and by Proposition 4.7, there exists an integer k such that $q^k = 1$. We note that the restriction $0 < \theta < \pi/4$ necessarily gives k ≥ 5.

Theorem 4.13. Assume that $\mu/\sqrt{c} > 1/\cos\theta$ and let k be a positive integer such that $q^k = 1$. Furthermore, let $U_0, \ldots, U_{k-1}$ be unitary matrices of dimension N and let $\beta \in \mathbb{R}$. Then $\phi$ is an N · k dimensional hermitian toral representation of $C(\mu, \hbar)$, with $\phi(\hat C) = 4c\,\mathbb{1}$, if
\[ \phi(W) = \begin{pmatrix} 0 & \sqrt{\tilde e_1}\,U_1 & 0 & \cdots & 0 \\ 0 & 0 & \sqrt{\tilde e_2}\,U_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & \sqrt{\tilde e_{k-1}}\,U_{k-1} \\ \sqrt{\tilde e_0}\,U_0 & 0 & \cdots & 0 & 0 \end{pmatrix} \tag{4.8} \]
and
\[ \tilde e_l = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt{c}} + \frac{\cos(2l\theta + \beta)}{\cos\theta}\Bigr). \tag{4.9} \]
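Theorem 4.13 can be checked numerically in the simplest case of 1 × 1 blocks $U_l = e^{i\alpha_l}$. The sketch below uses plain Python lists as matrices; the helper names, the phases and the parameter values are illustrative assumptions. It builds the matrix (4.8) with weights (4.9) and evaluates the residual of relation (3.5) with $V = W^\dagger$ as well as the Casimir of Proposition 3.2.

```python
import cmath
import math


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def lincomb(terms):
    """Sum of coeff * matrix over a list of (coeff, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def single_loop_W(mu, c, theta, k, beta, alphas):
    """Matrix (4.8) with 1x1 blocks U_l = exp(i*alphas[l]) and weights (4.9)."""
    e = [mu + math.sqrt(c) * math.cos(2 * l * theta + beta) / math.cos(theta) for l in range(k)]
    W = [[0j] * k for _ in range(k)]
    for l in range(1, k):
        W[l - 1][l] = math.sqrt(e[l]) * cmath.exp(1j * alphas[l])
    W[k - 1][0] = math.sqrt(e[0]) * cmath.exp(1j * alphas[0])
    return W


def relation_residual(W, mu, hbar):
    """Largest entry of (W^2 V + V W^2)(1+h^2) - 4 mu h^2 W - 2(1-h^2) W V W."""
    V = dagger(W)
    WW = matmul(W, W)
    h2 = hbar ** 2
    R = lincomb([
        (1 + h2, matmul(WW, V)),
        (1 + h2, matmul(V, WW)),
        (-4 * mu * h2, W),
        (-2 * (1 - h2), matmul(matmul(W, V), W)),
    ])
    return max(abs(x) for row in R for x in row)


def casimir(W, mu, hbar):
    """(D + D~ - 2 mu)^2 + (D - D~)^2 / hbar^2 with D = W V, D~ = V W."""
    V = dagger(W)
    D, D_tilde = matmul(W, V), matmul(V, W)
    S = lincomb([(1, D), (1, D_tilde), (-2 * mu, identity(len(W)))])
    T = lincomb([(1, D), (-1, D_tilde)])
    return lincomb([(1, matmul(S, S)), (1 / hbar ** 2, matmul(T, T))])
```

For example, with $\theta = \pi/5$ (so k = 5), c = 1 and $\mu = 2 > 1/\cos\theta$, the residual vanishes to rounding accuracy and the Casimir is $4c$ times the identity.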
Definition 4.14. We define a single loop representation $\phi_L$ of $C(\mu, \hbar)$ to be a toral representation, as in Theorem 4.13, with $U_i$ chosen to be 1 × 1 matrices and k to be the smallest positive integer such that $q^k = 1$.

As a simple corollary to Theorem 4.12 we obtain

Corollary 4.15. Let $\phi$ be a toral representation of $C(\mu, \hbar)$. Then $\phi$ is unitarily equivalent to a direct sum of single loop representations.

Proposition 4.16. A single loop representation of $C(\mu, \hbar)$ is irreducible.

Proof. Given a single loop representation $\phi_L$ of dimension n, it holds that $q^n = 1$, and there exists no $n' < n$ such that $q^{n'} = 1$, by definition. Now, assume that $\phi_L$ is reducible. Then, by Proposition 4.1, $\phi_L$ is equivalent to a direct sum of at least two representations. In particular, this means that there exists a toral representation of $C(\mu, \hbar)$ of dimension m < n, which implies, by Proposition 4.7, that there exists an integer $n' < n$ such that $q^{n'} = 1$. But this is impossible by the above argument. Hence, $\phi_L$ is irreducible.

For two loop representations of the same dimension, it is not only the value of the Casimir $\hat C$ that distinguishes them, but there is in fact a whole set of inequivalent representations, parametrized by a complex number.

Definition 4.17. Let $\phi_L$ be a single loop representation in the notation of Theorem 4.13 with $U_l = e^{i\alpha_l}$. We define the index $z(\phi_L)$ as the complex number $z(\phi_L) = \tilde e_0 \tilde e_1 \cdots \tilde e_{k-1}\, e^{i\gamma}$ with $\gamma = \alpha_0 + \alpha_1 + \cdots + \alpha_{k-1}$.

Lemma 4.18. Let k, n be integers such that gcd(k, n) = 1 and define
\[ A_l(\beta) = \cos\Bigl(\beta + \frac{2\pi k l}{n}\Bigr) \]
for l = 0, 1, ..., n − 1. Then there exist permutations $\sigma_+, \sigma_- \in S_n$ such that
\[ A_{\sigma_+(l)}(\beta) = A_l(\beta + 2\pi/n) \quad\text{and}\quad A_{\sigma_-(l)}(\beta) = A_l(2\pi/n - \beta) \]
for l = 0, 1, ..., n − 1.

Proof. Let us prove the existence of $\sigma_+$; the proof that $\sigma_-$ exists is analogous. We want to show that there exists a permutation $\sigma_+$ such that $A_{\sigma_+(l)}(\beta) = A_l(\beta + 2\pi/n)$. Let us make an Ansatz for the permutation; namely, we take it to be a shift with $\sigma_+(l) = l + \delta \pmod n$ for some $\delta \in \mathbb{Z}$. We then have to show that there exists a $\delta$ such that
\[ \cos\Bigl(\beta + \frac{2\pi k(l + \delta)}{n}\Bigr) = \cos\Bigl(\beta + \frac{2\pi(kl + 1)}{n}\Bigr). \]
This holds if for some $m \in \mathbb{Z}$,
\[ \beta + \frac{2\pi k(l + \delta)}{n} = \beta + \frac{2\pi(kl + 1)}{n} + 2\pi m \quad\Longleftrightarrow\quad k\delta - nm = 1. \]
Now, can we find $\delta$ such that this holds for some m? It is an elementary fact in number theory that such an equation has integer solutions for $\delta$ and m if gcd(k, n) = 1. Hence, if we set $\sigma_+(l) = l + \delta \pmod n$, where $\delta$ is such a solution, then the argument above shows that $A_{\sigma_+(l)}(\beta) = A_l(\beta + 2\pi/n)$.
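The shift $\delta$ in the proof of Lemma 4.18 is simply the inverse of k modulo n, which Python (version 3.8 or later) computes with the three-argument `pow`. The sketch below is illustrative: the function names are assumptions, and the formula used for $\sigma_-$, namely $l \mapsto -l - \delta \pmod n$, is one possible choice that the text leaves implicit.

```python
import math


def A(l, beta, k, n):
    """A_l(beta) = cos(beta + 2 pi k l / n)."""
    return math.cos(beta + 2 * math.pi * k * l / n)


def sigma_plus(k, n):
    """Shift l -> l + delta (mod n) with k*delta = 1 (mod n)."""
    delta = pow(k, -1, n)  # raises ValueError if gcd(k, n) != 1
    return [(l + delta) % n for l in range(n)]


def sigma_minus(k, n):
    """Assumed form l -> -l - delta (mod n) for the second permutation."""
    delta = pow(k, -1, n)
    return [(-l - delta) % n for l in range(n)]


if __name__ == "__main__":
    k, n, beta = 3, 7, 0.37
    print(sigma_plus(k, n))
    print(sigma_minus(k, n))
```

Both lists are permutations of 0, ..., n − 1, and they reproduce the shifted and reflected arguments of the lemma.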
Lemma 4.19. Let $\theta = \pi k/n$ with gcd(k, n) = 1, and set
\[ f(\beta) = \prod_{l=0}^{n-1} \Bigl(\mu + \frac{\sqrt{c}\,\cos(2l\theta + \beta)}{\cos\theta}\Bigr). \]
Then $f(\beta) = f(\beta + 2\pi/n)$, $f(\beta) = f(2\pi/n - \beta)$ and if $\beta, \beta' \in [0, \pi/n]$ then $\beta \neq \beta'$ implies that $f(\beta) \neq f(\beta')$.

Proof. It follows directly from Lemma 4.18 that $f(\beta) = f(\beta + 2\pi/n) = f(2\pi/n - \beta)$. Since f is periodic, with period $2\pi/n$, it can be expanded in a Fourier series as
\[ f(\beta) = \sum_{l=-\infty}^{\infty} a_l\, e^{2\pi i l\beta/(2\pi/n)} = \sum_{l=-\infty}^{\infty} a_l\, e^{iln\beta}. \]
Comparing the Fourier series with the original expression for f, and introducing $q = e^{2i\theta}$, we get
\[ f(\beta) = \Bigl(\frac{\sqrt{c}}{\cos\theta}\Bigr)^{n} \prod_{l=0}^{n-1}\Bigl[\frac{\mu\cos\theta}{\sqrt{c}} + \frac{1}{2}\bigl(q^{l} e^{i\beta} + q^{-l} e^{-i\beta}\bigr)\Bigr] = \sum_{l=-\infty}^{\infty} a_l\, e^{inl\beta}. \]
From this equality we deduce that there are only three non-zero coefficients in the Fourier series, namely $a_{-1}, a_0, a_1$. Comparing both sides, we obtain (up to the common factor $(\sqrt{c}/\cos\theta)^n$)
\[ a_{-1} = \frac{1}{2^n}\, q^{-n(n-1)/2}, \qquad a_1 = \frac{1}{2^n}\, q^{n(n-1)/2}, \]
which implies that
\[ \Bigl(\frac{\sqrt{c}}{\cos\theta}\Bigr)^{-n} f(\beta) = a_0 + \frac{1}{2^n}\, q^{-n(n-1)/2} e^{-in\beta} + \frac{1}{2^n}\, q^{n(n-1)/2} e^{in\beta} = a_0 + \Bigl(-\frac{1}{2}\Bigr)^{n-1} \cos n\beta. \]
From this it is clear that $f(\beta) \neq f(\beta')$ when $\beta \neq \beta'$ and $\beta, \beta' \in [0, \pi/n]$.

Proposition 4.20. Let $\phi_L$ and $\phi'_L$ be single loop representations of dimension n, such that $\phi_L(\hat C) = \phi'_L(\hat C)$. Then $\phi_L$ and $\phi'_L$ are equivalent if and only if $z(\phi_L) = z(\phi'_L)$.

Proof. The characteristic equation of $\phi_L(W)$ is $\lambda^n - z(\phi_L) = 0$. Therefore, a necessary condition for $\phi_L$ and $\phi'_L$ to be equivalent is that $z(\phi_L) = z(\phi'_L)$. Now, to prove the opposite implication, assume that $z(\phi_L) = z(\phi'_L)$. Let us denote the $\beta$ in Theorem 4.13 by $\beta$ and $\beta'$ for $\phi_L$ and $\phi'_L$ respectively. The fact that $z(\phi_L) = z(\phi'_L)$ gives directly $\gamma = \gamma'$, and in the notation of Lemma 4.19, we must have $f(\beta) = f(\beta')$. By the same Lemma, writing $\theta = \pi k/n$, this leaves us with three possibilities: either $\beta' = \beta$, $\beta' = \beta + 2\pi m/n$ or $\beta' = 2\pi m/n - \beta$ for some $m \in \mathbb{Z}$. In all three cases, by Lemma 4.18, there exists a permutation $\sigma$ such that for $W' = \sigma\phi'_L(W)\sigma^\dagger$ it holds that $\tilde e'_l = \tilde e_l$. Then it is easy to construct a diagonal unitary matrix P such that $\phi_L(W) = P\sigma\phi'_L(W)\sigma^\dagger P^\dagger$.

Hence, for a given dimension n and for a given value of the Casimir, such that toral representations exist, the set of inequivalent irreducible representations is parametrized by a complex number w such that $\pi/n \le |w| \le 2\pi/n$. We relate w to a single loop representation by setting $w = \beta e^{i\gamma}$.
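The closed form obtained in the proof of Lemma 4.19 says that $f(\beta)$ differs from $(\sqrt{c}/\cos\theta)^n (-\tfrac12)^{n-1}\cos n\beta$ by a constant. A quick numerical check of this statement, and of the two symmetries of f, might look as follows; the function names and parameter values are illustrative assumptions.

```python
import math


def f(beta, mu, c, k, n):
    """Product over l of mu + sqrt(c) cos(2 l theta + beta) / cos(theta), theta = pi k / n."""
    theta = math.pi * k / n
    value = 1.0
    for l in range(n):
        value *= mu + math.sqrt(c) * math.cos(2 * l * theta + beta) / math.cos(theta)
    return value


def oscillating_part(beta, c, k, n):
    """(sqrt(c)/cos(theta))^n * (-1/2)^(n-1) * cos(n beta)."""
    theta = math.pi * k / n
    return (math.sqrt(c) / math.cos(theta)) ** n * (-0.5) ** (n - 1) * math.cos(n * beta)


def constant_part(beta, mu, c, k, n):
    """f minus its oscillating part; should not depend on beta."""
    return f(beta, mu, c, k, n) - oscillating_part(beta, c, k, n)


if __name__ == "__main__":
    for beta in (0.0, 0.2, 0.9):
        print(constant_part(beta, 2.0, 1.0, 1, 5))
```

Since $\cos n\beta$ is strictly monotone on $[0, \pi/n]$, this form makes the injectivity claim of the lemma evident.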
4.3. Spherical representations. In contrast to the case of toral representations, we will show that, in a spherical representation, there cannot exist any loops. The intuitive picture is that the part of the ellipse lying in the region where either d or $\tilde d$ is negative is too large to skip by a rotation through the map s; see, e.g., Fig. 1. By Lemma 4.8, we know that the $\vec{x}$ corresponding to a transmitter or a receiver must lie on the d-axis or the $\tilde d$-axis respectively. For this reason, let us calculate the points where the ellipse crosses the axes.

Lemma 4.21. Consider the ellipse $(x + y - 2\mu)^2 + (x - y)^2/\hbar^2 = 4c$. Then x = 0 implies $y = a_\pm$ and y = 0 implies $x = a_\pm$ with
\[ a_\pm = 2\sin\theta\Bigl(\mu\sin\theta \pm \sqrt{c - \mu^2\cos^2\theta}\Bigr) = 2\sin^2\theta\,\Biggl[\mu \pm \sqrt{\mu^2 + \frac{c - \mu^2}{\sin^2\theta}}\,\Biggr]. \tag{4.10} \]

Lemma 4.22. Let $\vec{x} = (0, a_+)$, with $a_+$ as in Lemma 4.21. Then $s(\vec{x}) = (a_-, 0)$.

Lemma 4.23. If $\phi$ is a spherical representation of $C(\mu, \hbar)$ that contains a string on n vertices, then
\[ 0 < (n + 1)\theta \le \pi. \tag{4.11} \]
Proof. Let us denote the vectors corresponding to the vertices in the string by $\vec{x}_1, \ldots, \vec{x}_n$ and define $0 < \beta, \theta_0 < 2\pi$ through $\vec{x}_1 = \vec{x}(\beta)$ and $\vec{x}_n = \vec{x}(\beta + \theta_0)$ in the notation of Proposition 4.4. Since $\vec{x}_n = s^{n-1}\bigl(\vec{x}(\beta)\bigr)$ we must have that $(n - 1)2\theta = \theta_0 + 2\pi k$ for some integer k ≥ 0. Let us prove that k = 0. For a spherical representation, $a_- \le 0$, which implies, by Lemma 4.22, that $s\bigl(\vec{x}(\beta + \theta_0)\bigr) = (a_-, 0)$ cannot correspond to a vertex of a connected representation. Hence, for any $\alpha \in (0, 2\theta)$, $s\bigl(\vec{x}(\beta + \theta_0 - \alpha)\bigr)$ cannot correspond to a vertex of a connected representation. This implies that k = 0, i.e. the string never crosses the $\tilde d$-axis. Therefore $0 < (n - 1)2\theta = \theta_0 < 2\pi$. Again, by Lemma 4.22, both vectors $s(0, a_+)$ and $s^2(0, a_+)$ have non-positive components, which implies that $0 < (n + 1)2\theta \le 2\pi$. In fact, equality is attained when $a_- = 0$.

Proposition 4.24. Let $\phi$ be a spherical representation of $C(\mu, \hbar)$. Then the associated graph has no loops.

Proof. In the same way as in the proof of Lemma 4.23, we can argue that for $\alpha \in (0, 2\theta)$, $s\bigl(\vec{x}(\beta + \theta_0 - \alpha)\bigr)$ has a negative component (or equals (0, 0)), which implies that it is impossible to have loops.

Hence, we have excluded the possibility of loop representations and can conclude that all spherical representations are string representations. We therefore get the following corollary to Theorem 4.12.

Corollary 4.25. Let $\phi$ be a spherical representation of $C(\mu, \hbar)$. Then $\phi$ is unitarily equivalent to a direct sum of string representations.

Let us now investigate the conditions for the existence of strings.

Lemma 4.26. Let $\vec{x}_1 = (a, 0)$ and $\vec{x}_n = (0, b)$. Then $s^{n-1}(\vec{x}_1) = \vec{x}_n$ if and only if one of the following holds:
(i) $q^n = -1$, $\mu = 0$ and a = b,
(ii) $q^n = 1$ and $b = -a + 4\mu\sin^2\theta$,
(iii) $q^n \neq \pm 1$, and
\[ a = b = -\frac{2\mu\sin\theta\,\sin(n - 1)\theta}{\cos n\theta}. \tag{4.12} \]
In particular, if $a = a_+$ and $q^n = 1$, then $b = a_-$.

Proposition 4.27. Let $\phi$ be a spherical representation of $C(\mu, \hbar)$ containing a string on $n$ vertices. Then
$$\sqrt{c}\,\cos n\theta + \mu \cos\theta = 0. \tag{4.13}$$

Proof. Assume the existence of a string on $n$ vertices. From Lemma 4.26 we can exclude the possibility that $q^n = 1$, since $a_- < 0$. Hence, either $q^n = -1$ and $\mu = 0$, or $q^n \neq \pm 1$. If $q^n = -1$ and $\mu = 0$ then (4.13) is clearly satisfied. Now, assume $q^n \neq \pm 1$ and $a = b = -2\mu \sin\theta \sin(n-1)\theta / \cos n\theta$. Demanding that $(a, 0)$ and $(0, b)$ lie on the ellipse determines $c$ as $c = \mu^2 \cos^2\theta / \cos^2 n\theta$. Let us set $\varepsilon = \operatorname{sgn} \mu$. Recalling that $0 < (n+1)\theta \le \pi$, from Lemma 4.23, demanding $a > 0$ makes it necessary that $\operatorname{sgn}(\cos n\theta) = -\varepsilon$, which determines the sign of the root in the statement.

As we have seen, the existence of a loop puts a restriction on $\hbar$ through the relation $q^n = 1$. For the case of strings, the restriction comes out as a restriction on the possible values of the Casimir. In the next theorem we show that the necessary conditions for the existence of spherical representations are in fact sufficient.

Theorem 4.28. Let $n$ be a positive integer, $c$ a positive real number such that $\sqrt{c}\,\cos n\theta + \mu\cos\theta = 0$ and $0 < (n+1)\theta \le \pi$. Furthermore, let $U_1, \ldots, U_{n-1}$ be $N \times N$ unitary matrices. Then $\phi$ is an $N \cdot n$-dimensional spherical representation of $C(\mu, \hbar)$, with $\phi(\hat{C}) = 4c\,\mathbb{1}$, if
$$\phi(W) = \begin{pmatrix} 0 & \sqrt{\tilde{e}_1}\,U_1 & 0 & \cdots & 0 \\ 0 & 0 & \sqrt{\tilde{e}_2}\,U_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & \sqrt{\tilde{e}_{n-1}}\,U_{n-1} \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$
and
$$\tilde{e}_l = \frac{2\sqrt{c}\,\sin l\theta\, \sin(n-l)\theta}{\cos\theta}.$$
Proof. It is easy to check that the matrix $\phi(W)$ satisfies (4.1), since $s(\tilde{e}_l, \tilde{e}_{l-1}) = (\tilde{e}_{l+1}, \tilde{e}_l)$. Moreover, it is clear that $\tilde{e}_l > 0$ since $0 < (n-1)\theta < \pi$. Let us show that it is indeed a spherical representation, i.e. $-1 < \mu/\sqrt{c} \le 1$. Since $\sqrt{c}\,\cos n\theta + \mu\cos\theta = 0$, we get that
$$\frac{\mu}{\sqrt{c}} = -\frac{\cos n\theta}{\cos\theta},$$
and from $0 < (n+1)\theta \le \pi$ we obtain $0 < n\theta \le \pi - \theta$. From this it follows that $|\cos n\theta| \le |\cos\theta|$, which implies that $\phi$ is a spherical representation.

Remark. Let us note that the matrix elements of the diagonal matrix $Z = [W, W^\dagger]/(2\hbar)$ can be written as $z_l = \sqrt{c}\,\sin(n+1-2l)\theta$.
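The Remark can be checked numerically. The following is a minimal sketch, assuming $1 \times 1$ unitaries $U_l = 1$, the value $\hbar = \tan\theta$ used elsewhere in the text, and an arbitrarily chosen $\theta$ with $0 < (n+1)\theta \le \pi$; the function name `single_string_W` and the parameter values are illustrative choices, not part of the paper.

```python
import numpy as np

def single_string_W(n, theta, c):
    """Build phi(W) of Theorem 4.28 with U_l = 1 (hypothetical helper).

    Returns the n x n matrix W and the weights e_l, l = 1..n-1.
    """
    l = np.arange(1, n)
    e = 2.0 * np.sqrt(c) * np.sin(l * theta) * np.sin((n - l) * theta) / np.cos(theta)
    W = np.zeros((n, n))
    W[l - 1, l] = np.sqrt(e)  # entries sqrt(e_l) on the first superdiagonal
    return W, e

# Illustrative parameters (assumptions): n vertices, Casimir value c, angle theta.
n, c = 6, 1.0
theta = 0.9 * np.pi / (n + 1)
mu = -np.sqrt(c) * np.cos(n * theta) / np.cos(theta)  # solves sqrt(c) cos(n theta) + mu cos(theta) = 0

W, e = single_string_W(n, theta, c)
hbar = np.tan(theta)
Z = (W @ W.T - W.T @ W) / (2.0 * hbar)  # W is real, so W^dagger = W^T

l_all = np.arange(1, n + 1)
z_expected = np.sqrt(c) * np.sin((n + 1 - 2 * l_all) * theta)
print("max deviation from z_l:", np.max(np.abs(np.diag(Z) - z_expected)))
```

The check relies only on the identity $\tilde{e}_l - \tilde{e}_{l-1} = 2\tan\theta\,\sqrt{c}\,\sin(n+1-2l)\theta$, which follows from the product-to-sum formulas.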
Fig. 3. The constraint ellipse of a critical toral representation
Definition 4.29. We define a single string representation $\phi_S$ of $C(\mu, \hbar)$ to be a spherical representation, as in Theorem 4.28, with the $U_i$ chosen to be $1 \times 1$ matrices.

Proposition 4.30. Any single string representation of $C(\mu, \hbar)$ is irreducible.

Proof. Assume that $\phi_S$ is reducible and has dimension $n$ with $\phi_S(\hat{C}) = 4c\,\mathbb{1}$. Then, by Proposition 4.1, $\phi_S$ is equivalent to a direct sum of at least two representations of dimension $< n$. In particular, this implies that there exists a representation $\phi$ of dimension $m < n$ with $\phi(\hat{C}) = 4c\,\mathbb{1}$. But this is false, since there is at most one integer $l$ such that $\vec{x}(\beta + 2l\theta) = \vec{x}(\beta + \theta_0)$, for $0 < (l+1)2\theta < 2\pi$ and $0 < \theta_0 < 2\pi$.

We conclude that the single string representations are the only irreducible spherical representations. Moreover, two single string representations $\phi_S$ and $\phi_S'$, of the same dimension, are equivalent if and only if $\phi_S(\hat{C}) = \phi_S'(\hat{C})$.

4.4. Critical toral representations. In the case of critical toral representations, the constraint ellipse intersects the positive $d$ (resp. $\tilde{d}$) axis twice, as in Fig. 3. As we will show, there are both loop representations and string representations. String representations can be obtained from Theorem 4.28, by demanding that $1 < \mu/\sqrt{c} \le 1/\cos\theta$ instead of $0 < (n+1)\theta \le \pi$. Let us as well give an example of a loop representation.

Proposition 4.31. Assume that $\theta = \pi/N$, $N \ge 5$ odd and $1 < \mu/\sqrt{c} \le 1/\cos\theta$. If we define $\phi$ as in Theorem 4.13 with $\beta = 0$, then $\phi$ is a critical toral representation of $C(\mu, \hbar)$.

Proof. One simply has to check that
$$\tilde{e}_l = \sqrt{c}\left(\frac{\mu}{\sqrt{c}} + \frac{\cos(2l\pi/N)}{\cos(\pi/N)}\right) > 0$$
for $l = 0, \ldots, N-1$. If $N$ is odd then $2\theta l \notin (\pi - \theta, \pi + \theta)$ and $2\theta l \notin (2\pi - \theta, 2\pi)$, which implies that $|\cos 2\theta l| < |\cos\theta|$. Since $\mu/\sqrt{c} > 1$ we conclude that $\tilde{e}_l > 0$ for $l = 0, \ldots, N-1$.
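The positivity condition in the proof of Proposition 4.31 is easy to probe numerically. The sketch below assumes the weights $\tilde{e}_l = \sqrt{c}\,(\mu/\sqrt{c} + \cos(2l\pi/N)/\cos(\pi/N))$ as written above; the helper name `loop_weights` and the sample values of $N$, $\mu$, $c$ are illustrative assumptions. It also shows why odd $N$ matters: for even $N$ the index $l = N/2$ gives a non-positive weight in the critical region.

```python
import numpy as np

def loop_weights(N, mu, c):
    """Weights e_l, l = 0..N-1, for theta = pi/N and beta = 0 (hypothetical helper)."""
    l = np.arange(N)
    theta = np.pi / N
    return np.sqrt(c) * (mu / np.sqrt(c) + np.cos(2 * l * theta) / np.cos(theta))

c, mu = 1.0, 1.05  # chosen so that 1 < mu/sqrt(c) <= 1/cos(pi/N) for N = 7 and N = 8

w_odd = loop_weights(7, mu, c)
w_even = loop_weights(8, mu, c)
print("N = 7, smallest weight:", w_odd.min())
print("N = 8, smallest weight:", w_even.min())
```

For odd $N$ the smallest weight is $\mu - \sqrt{c}$, attained at the two indices closest to $N/2$.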
In contrast to the previous cases, it is, for a given value of the Casimir, possible to have both string representations and loop representations. Namely, if we assume that $q^n = 1$ and let $\vec{x}_1$ correspond to the largest intersection with the $d$-axis, then $s^{n-1}(\vec{x}_1)$ will be the smallest intersection with the $\tilde{d}$-axis (cp. Lemma 4.26), and one can check that all pairs $\vec{x}_i$, for $i = 2, \ldots, n-1$, will be strictly positive.

4.5. Summary of representations. We have shown that every representation can be decomposed into a direct sum of irreducible representations of two types: string and loop representations. String representations correspond to matrices of the form
$$\phi(W) = \begin{pmatrix} 0 & W_{12} & 0 & \cdots & 0 \\ 0 & 0 & W_{23} & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & W_{N-1,N} \\ 0 & 0 & \cdots & \cdots & 0 \end{pmatrix},$$
and loop representations to matrices of the form
$$\phi(W) = \begin{pmatrix} 0 & W_{12} & 0 & \cdots & 0 \\ 0 & 0 & W_{23} & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & W_{N-1,N} \\ W_{N,1} & 0 & \cdots & \cdots & 0 \end{pmatrix},$$
with $W_{12}, W_{23}, \ldots, W_{N-1,N}, W_{N,1} \neq 0$. Furthermore, existence of representations puts restrictions on the parameter $\hbar$ and the value $4c$ of the Casimir $\hat{C}$. A necessary condition for loop representations to exist is that there is a positive integer $k$ such that $q^k = e^{2ik\theta} = 1$, where $\hbar = \tan\theta$. A necessary condition for string representations to exist is that $\sqrt{c}\,\cos n\theta + \mu\cos\theta = 0$, for some positive integer $n$. The structure of representations respects the classical geometry as follows: in the region $-1 < \mu/\sqrt{c} \le 1$ we have shown that there are only string representations, and when $\mu/\sqrt{c} > 1/\cos\theta$ there are only loop representations. In the critical region, where $1 < \mu/\sqrt{c} \le 1/\cos\theta$ (classically, one is close to the singular surface), there are in fact representations of both types.

5. Eigenvalue Distribution and Surface Topology

In this section we consider the eigenvalue distribution of the matrix $X$ in the representations obtained in Sect. 4, with the help of numerical computations. The eigenvalue distribution is of interest since in [Shi04] it was shown that the Morse theoretic information of topology manifests itself in certain branching phenomena of the eigenvalue distribution of a single matrix. More precisely, critical points of the Morse function correspond to branching points of the eigenvalue distribution. (The meaning of the term “branching phenomena” will be illustrated below by using the eigenvalue distribution of the matrix $X$, plotted in Fig. 4, as an example.) This was achieved by using arguments analogous to those used in the WKB approximation in quantum mechanics, and is part of a more general correspondence between matrix elements and certain geometric quantities computed from the corresponding function on the surface. For a description of this analogy and also for more examples, we refer the reader to [Shi04].
[Figure 4: three panels, for µ = 0.9, 1.1 and 1.3. Upper plots: λ_i versus i (vertical range −1.5 to 1.5). Lower plots: λ_{i+1} − λ_i versus i (vertical range 0.0 to 0.2). Horizontal axis: i from 0 to 30.]

Fig. 4. Plot of $\lambda_i$ and $\lambda_{i+1} - \lambda_i$ versus $i$, where $\lambda_1 < \lambda_2 < \ldots < \lambda_N$ are eigenvalues of $X$, for $\mu = 0.9, 1.1, 1.3$. The size of matrices is given by $N = 30$. Critical values of $x$ are also shown by the horizontal lines
Eigenvalues of $X$ (whose continuum counterpart, $x$, is a Morse function on the surface) in the representations obtained in Sect. 4 do exhibit these branching phenomena, consistent with the results in [Shi04]. In Fig. 4, eigenvalues of $X$, computed numerically, for the cases $\mu = 0.9, 1.1, 1.3$ are shown. (We use the normalization convention in which $c = 1$, so that the transition between sphere and torus occurs at $\mu = 1$. The size of matrices is given by $N = 30$. For the toral representation, we have taken the additional “phase shift” parameter $\beta$ to be zero. Using different $\beta$'s does not change the plot qualitatively.) The horizontal lines correspond to the critical values of the function $x$ on the surface.

The plots directly reflect the Morse theoretic information of topology, with $x$ as the Morse function, for each case $\mu = 0.9, 1.1, 1.3$. For the case $\mu = 0.9$, there are two critical values which are connected by a single branch. Correspondingly, the eigenvalue plot shows that there is only one “sequence” of eigenvalues $\lambda_1 < \lambda_2 < \ldots < \lambda_N$ which increase smoothly. For the cases $\mu = 1.1$ and $\mu = 1.3$, there are four critical values of $x$, say $x_A < x_B < x_C < x_D$. For $x_A < x < x_B$ and $x_C < x < x_D$ the surface consists of a single branch, whereas for $x_B < x < x_C$ the surface consists of two branches. Correspondingly, in the plot of eigenvalues, one sees that the eigenvalues with $x_A < \lambda_i < x_B$ and with $x_C < \lambda_i < x_D$ each form a single smoothly increasing sequence, whereas the eigenvalues with $x_B < \lambda_i < x_C$ are naturally divided into two sequences, both of which increase smoothly. This branching of eigenvalues can be seen more clearly if one plots the difference between consecutive eigenvalues, $\lambda_{i+1} - \lambda_i$, as is shown in the figure. From the figure it can also be seen that by decreasing the parameter $\mu$ from 1.3 to 1.1 the part of the surface which has two branches shrinks, consistent with the geometrical picture of the transition between torus and sphere.
6. Comparison with Other Quantization Methods 6.1. The torus. The purpose of this section is to compare matrix representations obtained in Sect. 4, in the torus case, with those one gets using Berezin-Toeplitz quantization. Full details and proofs can be found in [Hof07]. We shall use Theorem 5.1 from the paper [BHSS91] applied to S1 × S1 . Namely n = 1, τ = 1 and we omit the Laplacian terms: ! n n 2 rk+n π π τs r2 2 rs2 + s+n . τk rk + 2 exp − and m 2m τs2 tk k=1
s=1
We reformulate it for simplicity and to fix notations.

Theorem 6.1. Let $r_1, r_2 \in \mathbb{Z}$ and let $N \ge 5$ be an integer. Then the $N \times N$-matrix corresponding to the phase function $e^{2\pi i(r_1\vartheta + r_2\varphi)}$ is
$$M\big(e^{2\pi i(r_1\vartheta + r_2\varphi)}\big) = \chi^{r_1 r_2}\, S^{-r_1}\, T^{r_2}, \qquad \chi := e^{-\pi i/N},$$
where $S$ and $T$ are the matrices
$$S = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}, \qquad T = \operatorname{diag}(1, q, \ldots, q^{N-1}), \quad q := \chi^2 = e^{-2\pi i/N}.$$
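Remark 6.2. The map $M$ is not a morphism of algebras. However, $M$ is continuous in the topology of uniform convergence.

To apply this theorem to the torus case, i.e. the regular values of the polynomial function $(x^2 + y^2 - \mu)^2 + z^2 - c$ (with $\mu/\sqrt{c} > 1$), one has to choose an embedding:

Proposition 6.3. Assume that $\mu/\sqrt{c} > 1$. By using the parametrization
$$\begin{cases} x(\vartheta, \varphi) = \cos(2\pi\vartheta)\,\sqrt{\sqrt{c}\,\cos(2\pi\varphi) + \mu}, \\ y(\vartheta, \varphi) = \sin(2\pi\vartheta)\,\sqrt{\sqrt{c}\,\cos(2\pi\varphi) + \mu}, \\ z(\vartheta, \varphi) = \sqrt{c}\,\sin(2\pi\varphi), \end{cases}$$
one gets
$$M(x) = \frac{S^{-1}}{2}\sqrt{\mathbb{1}\mu + \frac{\sqrt{c}}{2}\big(\chi T + \chi^{-1} T^{-1}\big)} + \frac{S}{2}\sqrt{\mathbb{1}\mu + \frac{\sqrt{c}}{2}\big(\chi^{-1} T + \chi T^{-1}\big)}, \tag{6.1}$$
$$M(y) = \frac{S}{2i}\sqrt{\mathbb{1}\mu + \frac{\sqrt{c}}{2}\big(\chi^{-1} T + \chi T^{-1}\big)} - \frac{S^{-1}}{2i}\sqrt{\mathbb{1}\mu + \frac{\sqrt{c}}{2}\big(\chi T + \chi^{-1} T^{-1}\big)}, \tag{6.2}$$
$$M(z) = \frac{\sqrt{c}}{2i}\big(T - T^{-1}\big). \tag{6.3}$$

Proof. The key idea is an expansion in Fourier series of $\sqrt{\mu + \sqrt{c}\,\cos(2\pi\varphi)}$. We then replace phase functions by matrices $T$ and $S$ according to Theorem 6.1. Square roots of matrices are well defined since the matrices are positive definite.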
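Lemma 6.4. Let $D = \operatorname{diag}(d_1, \ldots, d_N)$ be a diagonal $N \times N$-matrix. Then
$$S^{-1} D S = \operatorname{diag}(d_N, d_1, \ldots, d_{N-1}) \quad\text{and}\quad S D S^{-1} = \operatorname{diag}(d_2, \ldots, d_N, d_1).$$

Let us denote
$$D := \sqrt{\mathbb{1}\mu + \frac{\sqrt{c}}{2}\big(\chi T + \chi^{-1} T^{-1}\big)} \quad\text{and}\quad \widetilde{D} := \sqrt{\mathbb{1}\mu + \frac{\sqrt{c}}{2}\big(\chi^{-1} T + \chi T^{-1}\big)}.$$
Then one can write (6.1) and (6.2) as
$$M(x) = \frac{1}{2}\big(S\widetilde{D} + S^{-1} D\big) \quad\text{and}\quad M(y) = -\frac{i}{2}\big(S\widetilde{D} - S^{-1} D\big).$$
It is easily seen that the matrices $D$ and $\widetilde{D}$ are diagonal:
$$D = \operatorname{diag}\left(\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2\pi l}{N} + \frac{\pi}{N}\Big)}\right)_{l=1,\ldots,N}, \qquad \widetilde{D} = \operatorname{diag}\left(\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2\pi l}{N} - \frac{\pi}{N}\Big)}\right)_{l=1,\ldots,N}.$$
By Lemma 6.4,
$$S\widetilde{D} = S\widetilde{D}S^{-1}S = \operatorname{diag}\left(\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2\pi l}{N} + \frac{\pi}{N}\Big)}\right)_{l=1,\ldots,N} \times S = DS.$$
As a consequence, $M(x)$ and $M(y)$ can be written as
$$M(x) = \frac{1}{2}\big(DS + S^{-1} D\big) \quad\text{and}\quad M(y) = -\frac{i}{2}\big(DS - S^{-1} D\big).$$

Theorem 6.5. The matrices $M(x)$, $M(y)$ and $M(z)$ are:
$$M(x) = \frac{1}{2}\begin{pmatrix} 0 & x_1 & 0 & \cdots & x_N \\ x_1 & 0 & x_2 & \cdots & 0 \\ 0 & x_2 & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & x_{N-1} \\ x_N & 0 & \cdots & x_{N-1} & 0 \end{pmatrix},$$
$$M(y) = -\frac{i}{2}\begin{pmatrix} 0 & y_1 & 0 & \cdots & -y_N \\ -y_1 & 0 & y_2 & \cdots & 0 \\ 0 & -y_2 & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & y_{N-1} \\ y_N & 0 & \cdots & -y_{N-1} & 0 \end{pmatrix},$$
$$M(z) = \operatorname{diag}(z_1, z_2, \ldots, z_N),$$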
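where the $x_l$'s, $y_l$'s and $z_l$'s (for $l = 1, \ldots, N$) are:
$$x_l = y_l = \sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2\pi l}{N} + \frac{\pi}{N}\Big)} \quad\text{and}\quad z_l = -\sqrt{c}\,\sin\Big(\frac{2\pi l}{N}\Big).$$
These matrices satisfy the following relations:

Theorem 6.6. Let $\mu, c \in \mathbb{R}$ and $N \ge 5$ be such that $\mu/\sqrt{c} > 1$. If one assumes $\hbar = \tan(\theta)$ with $\theta := \pi/N$, then:
$$[\tilde{X}, \tilde{Y}] = i\hbar\,(\cos(\theta)\tilde{Z}),$$
$$[\tilde{Y}, (\cos(\theta)\tilde{Z})] = i\hbar\,\big(\tilde{X}(\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1}) + (\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1})\tilde{X}\big),$$
$$[(\cos(\theta)\tilde{Z}), \tilde{X}] = i\hbar\,\big(\tilde{Y}(\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1}) + (\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1})\tilde{Y}\big),$$
$$(\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1})^2 + (\cos(\theta)\tilde{Z})^2 = (\sqrt{c}\,\cos(\theta))^2\,\mathbb{1},$$
where $\tilde{X} := M(x)$, $\tilde{Y} := M(y)$ and $\tilde{Z} := M(z)$ are the matrices obtained in Theorem 6.5.

Proof. This is a direct computation on matrices.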
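The direct computation behind Theorem 6.6 can be reproduced numerically. The sketch below builds $\tilde{X}$, $\tilde{Y}$, $\tilde{Z}$ from the entries $x_l$, $z_l$ of Theorem 6.5 and evaluates the residual of each of the four relations, taking $\hbar = \tan(\pi/N)$. The function names and the sample parameters ($N = 7$, $\mu = 1.5$, $c = 1$) are illustrative assumptions.

```python
import numpy as np

def torus_matrices(N, mu, c):
    """Matrices X, Y, Z of Theorem 6.5 (hypothetical helper); requires mu/sqrt(c) > 1."""
    l = np.arange(1, N + 1)
    theta = np.pi / N
    d = np.sqrt(mu + np.sqrt(c) * np.cos(2 * np.pi * l / N + theta))
    S = np.roll(np.eye(N), 1, axis=1)     # cyclic shift: S[l, l+1] = 1, S[N, 1] = 1
    W = np.diag(d) @ S                    # W = D S, so that X + iY = W
    X = (W + W.conj().T) / 2              # (D S + S^{-1} D) / 2
    Y = (W - W.conj().T) / 2j             # -(i/2) (D S - S^{-1} D)
    Z = np.diag(-np.sqrt(c) * np.sin(2 * np.pi * l / N)).astype(complex)
    return X, Y, Z, theta

def relation_residuals(N, mu, c):
    """Largest absolute entry of (lhs - rhs) for each relation of Theorem 6.6."""
    X, Y, Z, theta = torus_matrices(N, mu, c)
    hbar = np.tan(theta)
    cZ = np.cos(theta) * Z
    I = np.eye(N)
    P = X @ X + Y @ Y - mu * I
    comm = lambda A, B: A @ B - B @ A
    diffs = [
        comm(X, Y) - 1j * hbar * cZ,
        comm(Y, cZ) - 1j * hbar * (X @ P + P @ X),
        comm(cZ, X) - 1j * hbar * (Y @ P + P @ Y),
        P @ P + cZ @ cZ - c * np.cos(theta) ** 2 * I,
    ]
    return [np.max(np.abs(m)) for m in diffs]

residuals = relation_residuals(N=7, mu=1.5, c=1.0)
print(residuals)
```

All four residuals should be at the level of floating-point rounding; writing $W = DS$ makes the structure transparent, since $WW^\dagger$ and $W^\dagger W$ are both diagonal.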
This proves that the matrices $X = \tilde{X}$, $Y = \tilde{Y}$ and $Z = \cos(\theta)\tilde{Z}$ satisfy exactly the relations (3.2), (3.3) and (3.4). The Casimir identity is satisfied with $c$ replaced by $c\cos^2\theta$. Note that $\cos\theta$ converges to 1 as $N$ goes to infinity.

6.2. The sphere. Let us start by constructing a parametrization for the deformed sphere $\Sigma$ described by (3.1) with $-\sqrt{c} < \mu < \sqrt{c}$. Recall the well-known parametrization of the sphere
$$\xi_1 = \sin\vartheta\cos\varphi, \quad \xi_2 = \sin\vartheta\sin\varphi, \quad \xi_3 = \cos\vartheta.$$
We would like to keep the axial symmetry and therefore we make the following Ansatz outside the poles for a map $(x, y, z) : S^2 \to \Sigma$,
$$x = f(\vartheta)\cos\varphi, \qquad y = f(\vartheta)\sin\varphi.$$
Keeping the relation $\{x, y\}_{S^2} = z$, using the (round sphere) Poisson bracket,
$$\{f, g\}_{S^2} = \frac{\lambda}{\sin\vartheta}\big(\partial_\vartheta f\,\partial_\varphi g - \partial_\varphi f\,\partial_\vartheta g\big)$$
(where $\lambda > 0$ is an arbitrary parameter scaling the round sphere volume as $4\pi/\lambda$), yields
$$z = \frac{\lambda f(\vartheta) f'(\vartheta)}{\sin\vartheta}.$$
Demanding that these functions satisfy the constraint $(x^2 + y^2 - \mu)^2 + z^2 - c = 0$ gives the following differential equation for $f(\vartheta)$:
$$\big(f^2(\vartheta) - \mu\big)^2 + \frac{\lambda^2}{\sin^2\vartheta}\, f^2(\vartheta)\, f'(\vartheta)^2 - c = 0,$$
which is solved by
$$f(\vartheta) = \sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2}{\lambda}\cos\vartheta + B\Big)},$$
for arbitrary $B$. As $\vartheta$ goes to 0 or $\pi$ we need that $x$ and $y$ go to zero; there are two ways of achieving this (giving conditions on $\lambda$ and $B$) but only the following leads to an embedding:

Proposition 6.7. The map $(x, y, z) : S^2 \to \mathbb{R}^3$ defined by
$$\vartheta \in (0, \pi): \quad \begin{cases} x = \cos\varphi\,\sqrt{\mu + \sqrt{c}\,\cos\big(\tfrac{2}{\lambda}\cos\vartheta\big)}, \\ y = \sin\varphi\,\sqrt{\mu + \sqrt{c}\,\cos\big(\tfrac{2}{\lambda}\cos\vartheta\big)}, \\ z = \sqrt{c}\,\sin\big(\tfrac{2}{\lambda}\cos\vartheta\big), \end{cases}$$
$$\vartheta = 0: \ x = y = 0,\ z = \sqrt{c}\,\sin(2/\lambda); \qquad \vartheta = \pi: \ x = y = 0,\ z = -\sqrt{c}\,\sin(2/\lambda),$$
with $\mu/\sqrt{c} = -\cos(2/\lambda)$, $-1 < \mu/\sqrt{c} < 1$, and $0 < 2/\lambda < \pi$, is an embedding of the (round) sphere into $\mathbb{R}^3$ whose image coincides with $\Sigma$. Moreover, it holds that $\{x, y\}_{S^2} = z$, $\{y, z\}_{S^2} = 2x(x^2 + y^2 - \mu)$ and $\{z, x\}_{S^2} = 2y(x^2 + y^2 - \mu)$. The embedding is therefore a Poisson map and hence volume preserving (where $\Sigma$ is equipped with the volume defined by the inverse of the restriction of the C-bracket (2.1)).

Proof. Outside the poles all the assertions are computed in a straightforward manner. Around the poles we can express $x$, $y$ and $z$ by the local round sphere charts $\xi_1$ and $\xi_2$ to see that the map is a smooth embedding.

Let us introduce the hermitian $n \times n$ matrices $S_1, S_2, S_3$, whose nonzero matrix elements are
$$(S_1)_{k,k+1} = \tfrac{1}{2}\sqrt{k(n-k)} = (S_1)_{k+1,k}, \quad k = 1, \ldots, n-1,$$
$$(S_2)_{k,k+1} = -\tfrac{i}{2}\sqrt{k(n-k)} = -(S_2)_{k+1,k}, \quad k = 1, \ldots, n-1,$$
$$(S_3)_{k,k} = \tfrac{1}{2}(n + 1 - 2k), \quad k = 1, \ldots, n,$$
satisfying $[S_a, S_b] = i\epsilon_{abc} S_c$ and $S_1^2 + S_2^2 + S_3^2 = \frac{n^2 - 1}{4}\mathbb{1}$. We then define rescaled matrices $X_a = A(n) S_a$, for some function $A(n)$.

In analogy with the case of the torus, we would like to compare the Berezin-Toeplitz quantization of the embedding functions with the results obtained in Sect. 4.3. Quantizing the function $\cos\vartheta$ will (up to scaling) give the diagonal matrix $S_3$. However, a function of $\cos\vartheta$ is in general not mapped to the same function of $S_3$, and one can numerically check that the quantization of $z(\vartheta)$ (in Proposition 6.7) is not equal to the matrix $Z$ obtained in Sect. 4.3. However, they agree up to corrections of order $1/n$.
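The matrices $S_1, S_2, S_3$ can be built directly from the matrix elements listed above. The following sketch (the helper name `spin_matrices` is an illustrative choice) constructs them with NumPy and evaluates the commutation relations and the quadratic Casimir for a given $n$.

```python
import numpy as np

def spin_matrices(n):
    """Hermitian n x n matrices S1, S2, S3 with the matrix elements given in the text."""
    k = np.arange(1, n)                       # k = 1, ..., n-1
    r = 0.5 * np.sqrt(k * (n - k))
    S1 = np.zeros((n, n), dtype=complex)
    S2 = np.zeros((n, n), dtype=complex)
    S1[k - 1, k] = r                          # (S1)_{k,k+1}
    S1[k, k - 1] = r                          # (S1)_{k+1,k}
    S2[k - 1, k] = -1j * r                    # (S2)_{k,k+1}
    S2[k, k - 1] = 1j * r                     # (S2)_{k+1,k}
    S3 = np.diag(0.5 * (n + 1 - 2 * np.arange(1, n + 1))).astype(complex)
    return S1, S2, S3

def comm(A, B):
    return A @ B - B @ A

n = 5
S1, S2, S3 = spin_matrices(n)
casimir = S1 @ S1 + S2 @ S2 + S3 @ S3
print("[S1,S2] - i S3:", np.max(np.abs(comm(S1, S2) - 1j * S3)))
print("Casimir - (n^2-1)/4:", np.max(np.abs(casimir - (n**2 - 1) / 4 * np.eye(n))))
```

This is the standard spin-$j$ representation of su(2) with $n = 2j + 1$, so the Casimir value $(n^2-1)/4$ equals $j(j+1)$.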
In [GH82] the following prescription for replacing functions on $S^2$ by matrices was introduced: smooth functions on $S^2$ are expanded in terms of the spherical harmonics $Y_{lm}(\vartheta, \varphi)$, resp. $\mathcal{Y}_{lm} = r^l Y_{lm}$, written as
$$\mathcal{Y}_{lm}(x_1, x_2, x_3) = \sum_{a_1, \ldots, a_l = 1}^{3} c^{(m)}_{a_1 \cdots a_l}\, x_{a_1} \cdots x_{a_l}$$
($x_1 = r\sin\vartheta\cos\varphi$, $x_2 = r\sin\vartheta\sin\varphi$, $x_3 = r\cos\vartheta$) with $c^{(m)}_{a_1 \cdots a_l}$ chosen to be totally symmetric with respect to the lower indices. A function is then mapped to an $n \times n$ matrix via
$$T^{(n)}\big(Y_{lm}\big) = B(n, l) \sum_{a_1, \ldots, a_l = 1}^{3} c^{(m)}_{a_1 \cdots a_l}\, X_{a_1} \cdots X_{a_l},$$
where
$$B(n, l) = \sqrt{4\pi}\,\sqrt{\frac{(n^2 - 1)^l\,(n - 1 - l)!}{(n + l)!}}$$
and $X_1, X_2, X_3$ are defined as above, with the choice $A(n) = 2/\sqrt{n^2 - 1}$. Disregarding multiplication by an overall $n$-dependent function, the map $T^{(n)}$ will act on the basic functions in the following way:
$$T^{(n)}(x_1) = T^{(n)}(\sin\vartheta\cos\varphi) \sim S_1, \quad T^{(n)}(x_2) = T^{(n)}(\sin\vartheta\sin\varphi) \sim S_2, \quad T^{(n)}(x_3) = T^{(n)}(\cos\vartheta) \sim S_3.$$
We will now show that, for some scaling of $X_1, X_2, X_3$, the following hermitian matrices:
$$\hat{X} = \frac{1}{2}\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2}{\lambda}X_3\Big)}\,\Big(\sqrt{\mathbb{1} - X_3^2}\Big)^{-1} X_1 + \frac{1}{2}\,X_1\,\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2}{\lambda}X_3\Big)}\,\Big(\sqrt{\mathbb{1} - X_3^2}\Big)^{-1},$$
$$\hat{Y} = \frac{1}{2}\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2}{\lambda}X_3\Big)}\,\Big(\sqrt{\mathbb{1} - X_3^2}\Big)^{-1} X_2 + \frac{1}{2}\,X_2\,\sqrt{\mu + \sqrt{c}\,\cos\Big(\frac{2}{\lambda}X_3\Big)}\,\Big(\sqrt{\mathbb{1} - X_3^2}\Big)^{-1},$$
$$\hat{Z} = \sqrt{c}\,\sin\Big(\frac{2}{\lambda}X_3\Big),$$
being noncommutative analogues of the embedding functions in Proposition 6.7, agree with the results obtained in Sect. 4.3 up to corrections of order $1/n$; moreover, the matrix $\hat{Z}$ will have an exact agreement. (Actually, all orderings we tried for $x + iy$ gave matrices with nonzero elements only on the first off-diagonal; furthermore, they also agreed with our results up to order $1/n$, as we have seen from numerical computations.) For a spherical representation, it holds that
$$Z_{ll} = \frac{1}{2\hbar}[W, W^\dagger]_{ll} = \sqrt{c}\,\sin\big((n + 1 - 2l)\theta\big),$$
with $\mu\cos\theta + \sqrt{c}\,\cos n\theta = 0$. Furthermore, the matrix elements of $\hat{Z}$ are given by
$$\hat{Z}_{ll} = \sqrt{c}\,\sin\Big((n + 1 - 2l)\frac{A(n)}{\lambda}\Big).$$
As one can easily check, the relation $\mu\cos\theta + \sqrt{c}\,\cos n\theta = 0$ defines a unique smooth function $\theta(n)$ such that $0 < \theta(n) < \pi/(n+1)$. Defining $A(n) = \lambda\theta(n)$ gives directly that $Z_{ll} = \hat{Z}_{ll}$, and one can show that the matrices $\hat{X}, \hat{Y}$ will agree with the matrices $X, Y$ up to corrections of order $1/n$. The main ingredient is the following lemma.
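Lemma 6.8.
$$\frac{\mu}{\sqrt{c}} + \cos\Big(\frac{2}{\lambda}X_3\Big)_{l,l} = \frac{2\sin(n-l)\theta\,\sin l\theta}{\cos\theta} + O\Big(\frac{1}{n}\Big).$$

Proof. Setting $\theta = A(n)/\lambda$, we can rewrite
$$\cos\Big(\frac{2}{\lambda}X_3\Big)_{l,l} = \cos(n-l)\theta\,\cos l\theta + \sin(n-l)\theta\,\sin l\theta + (\cos\theta - 1)\cos(n-2l)\theta - \sin\theta\,\sin(n-2l)\theta.$$
Since $-\cos(2/\lambda) = \mu/\sqrt{c} = -\cos n\theta/\cos\theta$, it follows that $\theta(n) = 2/(\lambda n) + O(1/n^2)$ and we conclude that
$$\cos\Big(\frac{2}{\lambda}X_3\Big)_{l,l} = \cos(n-l)\theta\,\cos l\theta + \sin(n-l)\theta\,\sin l\theta + O\Big(\frac{1}{n}\Big).$$
Since $\mu = -\sqrt{c}\,\cos(2/\lambda)$, we have $-\cos n\theta = -\cos(2/\lambda + O(1/n)) = \mu/\sqrt{c} + O(1/n)$, which implies that
$$\frac{\mu}{\sqrt{c}} + \cos\Big(\frac{2}{\lambda}X_3\Big)_{l,l} = 2\sin(n-l)\theta\,\sin l\theta + O\Big(\frac{1}{n}\Big),$$
from which the statement of the lemma follows.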
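The $O(1/n)$ statement of Lemma 6.8 can be illustrated numerically. The sketch below finds $\theta(n)$ by bisection on $(0, \pi/(n+1))$, uses $\cos\big(\tfrac{2}{\lambda}X_3\big)_{l,l} = \cos((n+1-2l)\theta)$ (which follows from $X_3 = A(n)S_3$ with $A(n) = \lambda\theta(n)$), and reports the largest deviation over $l$. The helper names and the sample values $\mu = 0.5$, $c = 1$ are assumptions made for illustration.

```python
import numpy as np

def theta_n(n, mu, c, iters=200):
    """Root of mu*cos(t) + sqrt(c)*cos(n*t) in (0, pi/(n+1)), for -sqrt(c) < mu < sqrt(c)."""
    g = lambda t: mu * np.cos(t) + np.sqrt(c) * np.cos(n * t)
    lo, hi = 0.0, np.pi / (n + 1)     # g(lo) > 0 and g(hi) < 0 in this parameter range
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lemma_error(n, mu, c):
    """Max over l of |lhs - rhs| in Lemma 6.8, together with theta(n)."""
    t = theta_n(n, mu, c)
    l = np.arange(1, n + 1)
    lhs = mu / np.sqrt(c) + np.cos((n + 1 - 2 * l) * t)
    rhs = 2 * np.sin((n - l) * t) * np.sin(l * t) / np.cos(t)
    return np.max(np.abs(lhs - rhs)), t

for n in (10, 100, 1000):
    err, t = lemma_error(n, mu=0.5, c=1.0)
    print(n, err, n * err)
```

Using $\mu/\sqrt{c} = -\cos n\theta/\cos\theta$ exactly, the difference of the two sides equals $-\tan\theta\,\sin((n+1-2l)\theta)$, so the error is bounded by $\tan\theta(n) < \tan(\pi/(n+1))$.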
Acknowledgement. We would like to thank the Swedish Research Council, the Royal Institute of Technology, the Knut and Alice Wallenberg foundation, the Japan Society for the Promotion of Science, the Albert Einstein Institute, the Sonderforschungsbereich “Raum-Zeit-Materie”, the IHES, the ESF Scientific Programme MISGAM, and the Marie Curie Research Training Network ENIGMA for financial support resp. hospitality. In addition, we are thankful for the constructive remarks of the referees.
References

[ASS06] Assem, I., Simson, D., Skowronski, A.: Elements of the Representation Theory of Associative Algebras. LMS Student Texts 65, Cambridge: Cambridge University Press, 2006
[Ber78] Bergman, G.M.: The diamond lemma for ring theory. Adv. Math. 29, 178–218 (1978)
[BHSS91] Bordemann, M., Hoppe, J., Schaller, P., Schlichenmaier, M.: gl(∞) and geometric quantization. Commun. Math. Phys. 138, 209–244 (1991)
[BKL05] Bak, D., Kim, S., Lee, K.: All higher genus BPS membranes in the plane wave background. JHEP 0506, 035 (2005)
[BMS94] Bordemann, M., Meinrenken, E., Schlichenmaier, M.: Toeplitz Quant. of Kähler manifolds and gl(N), N → ∞ limits. Commun. Math. Phys. 165, 281–296 (1994)
[FFZ89] Fairlie, D., Fletcher, P., Zachos, C.: Trigonometric structure constants for new infinite algebras. Phys. Lett. B 218, 203 (1989)
[GH82] Hoppe, J.: Quantum theory of a massless relativistic surface. Ph.D. Thesis (Advisor: J. Goldstone), MIT. http://www.aei.mpg.de/~hoppe/, 1982
[FH94] Harary, F.: Graph Theory. Reading MA: Addison-Wesley, 1969
[Hir76] Hirsch, M.W.: Differential topology. New York: Springer, 1976
[Hof02] Hofer, L.: Surfaces de Riemann compactes. Master’s thesis, Université de Haute-Alsace Mulhouse, http://laurent.hofer.free.fr/data/master_hofer_2002.pdf, 2002
[Hof07] Hofer, L.: Aspects algébriques et quantification des surfaces minimales. Ph.D. thesis, Université de Haute-Alsace de Mulhouse, http://laurent.hofer.free.fr/data/these_hofer_2007.pdf, June 2007
[Hop89/88] Hoppe, J.: Diffeomorphism groups, quantization, and SU(∞). Int. J. of Mod. Phys. A 4(19), 5235–5248 (1989); Diff_A T², and the curvature of some infinite dimensional manifolds. Phys. Lett. B 215, 706–710 (1988)
[KL92] Klimek, S., Lesniewski, A.: Quantum Riemann surfaces I. The unit disc. Commun. Math. Phys. 146, 103–122 (1992); Quantum Riemann surfaces II. The discrete series. Lett. Math. Phys. 24, 125–139 (1992)
[Mad92] Madore, J.: The fuzzy sphere. Class. Quant. Grav. 9, 69–88 (1992)
[NN99] Natsume, T., Nest, R.: Topological approach to quantum surfaces. Commun. Math. Phys. 202, 65–87 (1999)
[Now97] Nowak, C.: Über Sternprodukte auf nichtregulären Poissonmannigfaltigkeiten (Ph.D. Thesis, Freiburg University); Star Products for integrable Poisson Structures on R³. http://arxiv.org/abs/q-alg/9708012, 1997
[Shi04] Shimada, H.: Membrane topology and matrix regularization. Nucl. Phys. B 685, 297–320 (2004)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 288, 431–502 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0773-9
Communications in
Mathematical Physics
Stability, Convergence to Self-Similarity and Elastic Limit for the Boltzmann Equation for Inelastic Hard Spheres S. Mischler, C. Mouhot CEREMADE, Université Paris IX-Dauphine, Place du Maréchal de Lattre de Tassigny, 75775 Paris, France. E-mail:
[email protected];
[email protected] Received: 12 February 2008 / Accepted: 5 January 2009 Published online: 4 March 2009 – © Springer-Verlag 2009
Abstract: We consider the spatially homogeneous Boltzmann equation for inelastic hard spheres, in the framework of so-called constant normal restitution coefficients $\alpha \in [0, 1]$. In the physical regime of a small inelasticity (that is $\alpha \in [\alpha_*, 1)$ for some constructive $\alpha_* \in [0, 1)$) we prove uniqueness of the self-similar profile for given values of the restitution coefficient $\alpha \in [\alpha_*, 1)$, the mass and the momentum; therefore we deduce the uniqueness of the self-similar solution (up to a time translation). Moreover, if the initial datum lies in $L^1_3$, and under some smallness condition on $(1 - \alpha_*)$ depending on the mass, energy and $L^1_3$ norm of this initial datum, we prove time asymptotic convergence (with polynomial rate) of the solution towards the self-similar solution (the so-called homogeneous cooling state). These uniqueness, stability and convergence results are expressed in the self-similar variables and then translate into corresponding results for the original Boltzmann equation. The proofs are based on the identification of a suitable elastic limit rescaling, and the construction of a smooth path of self-similar profiles connecting to a particular Maxwellian equilibrium in the elastic limit, together with tools from perturbative theory of linear operators. Some universal quantities, such as the “quasi-elastic self-similar temperature” and the rate of convergence towards self-similarity at first order in terms of $(1 - \alpha)$, are obtained from our study. These results provide a positive answer and a mathematical proof of the Ernst-Brito conjecture [16] in the case of inelastic hard spheres with small inelasticity.
1. Introduction and Main Results

1.1. The model. We consider the spatially homogeneous Boltzmann equation for hard spheres undergoing inelastic collisions with a constant normal restitution coefficient $\alpha \in [0, 1)$ (see [8,17,23,24]). More precisely, the gas is described by the distribution density of particles $f = f_t = f(t, v) \ge 0$ with velocity $v \in \mathbb{R}^N$ ($N \ge 2$) at time $t \ge 0$, and it satisfies the evolution equation
$$\frac{\partial f}{\partial t} = Q_\alpha(f, f) \quad\text{in } (0, +\infty) \times \mathbb{R}^N, \tag{1.1}$$
$$f(0, \cdot) = f_{\rm in} \quad\text{in } \mathbb{R}^N. \tag{1.2}$$

The quadratic collision operator $Q_\alpha(f, f)$ models the interaction of particles by means of inelastic binary collisions (preserving mass and momentum but dissipating kinetic energy). We define the collision operator by its action on test functions, or observables. Taking $\psi = \psi(v)$ to be a suitable regular test function, we introduce the following weak formulation of the collision operator:
$$\int_{\mathbb{R}^N} Q_\alpha(g, f)\,\psi\, dv = \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\, g_*\, f\,(\psi' - \psi)\, d\sigma\, dv\, dv_*, \tag{1.3}$$
where we use the shorthand notations $f := f(v)$, $g_* := g(v_*)$, $\psi' := \psi(v')$, etc. Here and below $u = v - v_*$ denotes the relative velocity and $v'$, $v'_*$ denote the possible post-collisional velocities. These post-collisional velocities are functions of $v, v_*, \sigma$ depending on the collision mechanism, and therefore, in our case, depending on $\alpha$. They are defined by
$$v' = \frac{w}{2} + \frac{u'}{2}, \qquad v'_* = \frac{w}{2} - \frac{u'}{2}, \tag{1.4}$$
with
$$w = v + v_*, \qquad u' = \Big(\frac{1 - \alpha}{2}\Big)\, u + \Big(\frac{1 + \alpha}{2}\Big)\, |u|\,\sigma.$$
We also introduce the notation $\hat{x} = x/|x|$ for any $x \in \mathbb{R}^N$, $x \neq 0$. The function $b = b(\hat{u} \cdot \sigma)$ in (1.3) is (up to a multiplicative factor) the differential collisional cross-section. We assume that
$$b \text{ is Lipschitz, non-decreasing and convex on } (-1, 1), \tag{1.5}$$
and that
$$\exists\, b_m, b_M \in (0, \infty) \quad\text{s.t.}\quad \forall\, x \in [-1, 1],\ b_m \le b(x) \le b_M. \tag{1.6}$$
In the important case of “hard spheres”, the cross-section is given by (see [13,17])
$$b(x) = b_0\,(1 - x)^{-\frac{N-3}{2}}, \qquad b_0 \in (0, \infty), \tag{1.7}$$
so that it fulfills the above hypotheses (1.5), (1.6) when $N = 3$. These hypotheses are needed in the proof of moments estimates (see [23, Prop. 3.2] and [24, Prop. 3.1]).

We also define the symmetrized (or polar form of the) bilinear collisional operator $\tilde{Q}_\alpha$ by setting
$$\begin{cases} \displaystyle\int_{\mathbb{R}^N} \tilde{Q}_\alpha(g, h)\,\psi\, dv = \frac{1}{2}\int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\, g_*\, h\,\Delta_\psi\, d\sigma\, dv\, dv_*, \\ \text{with } \Delta_\psi = \psi' + \psi'_* - \psi - \psi_*. \end{cases} \tag{1.8}$$
Spatially Homogeneous Boltzmann Equation for Inelastic Hard Spheres
In other words, $\tilde{Q}_\alpha(g, h) = (Q_\alpha(g, h) + Q_\alpha(h, g))/2$. The formula (1.3) suggests the natural splitting $Q_\alpha = Q_\alpha^+ - Q_\alpha^-$ between gain and loss parts. The loss part $Q_\alpha^-$ can be defined in strong form noticing that
$$\langle Q_\alpha^-(g, f), \psi\rangle = \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\, g_*\, f\,\psi\, d\sigma\, dv\, dv_* =: \langle f\,L(g), \psi\rangle,$$
where $\langle\cdot, \cdot\rangle$ is the usual scalar product in $L^2$ and $L$ is the convolution operator
$$L(g)(v) = (b_0\,|\cdot| * g)(v) = b_0 \int_{\mathbb{R}^N} g(v_*)\,|v - v_*|\, dv_*, \quad\text{with } b_0 = \int_{S^{N-1}} b(\sigma_1)\, d\sigma. \tag{1.9}$$
In particular note that $L$ and $Q_\alpha^- = Q^-$ are indeed independent of the normal restitution coefficient $\alpha$.

The Boltzmann equation (1.1) is complemented with an initial datum (1.2) which satisfies
$$\begin{cases} 0 \le f_{\rm in} \in L^1(\mathbb{R}^N), \quad \rho(f_{\rm in}) := \displaystyle\int_{\mathbb{R}^N} f_{\rm in}\, dv = \rho \in (0, \infty), \\ \displaystyle\int_{\mathbb{R}^N} f_{\rm in}\, v\, dv = 0, \quad \mathcal{E}(f_{\rm in}) := \int_{\mathbb{R}^N} f_{\rm in}\,|v|^2\, dv < \infty. \end{cases} \tag{1.10}$$
As explained in [23,24], the operator (1.3) preserves mass and momentum, and so does the evolution equation:
$$\frac{d}{dt}\int_{\mathbb{R}^N} f_t \begin{pmatrix} 1 \\ v \end{pmatrix} dv = 0, \tag{1.11}$$
while kinetic energy is dissipated:
$$\frac{d}{dt}\mathcal{E}(f_t) = -(1 - \alpha^2)\, D_{\mathcal{E}}(f_t), \qquad \mathcal{E}(f_t) := \int_{\mathbb{R}^N} f_t\,|v|^2\, dv. \tag{1.12}$$
The energy dissipation functional is given by
$$D_{\mathcal{E}}(f) := b_1 \int_{\mathbb{R}^N \times \mathbb{R}^N} f\, f_*\,|u|^3\, dv\, dv_*, \tag{1.13}$$
where $b_1$ is (up to a multiplicative factor) the angular momentum defined by
$$b_1 := \frac{1}{8}\int_{S^{N-1}} \big(1 - (\hat{u} \cdot \sigma)\big)\, b(\hat{u} \cdot \sigma)\, d\sigma. \tag{1.14}$$
In order to establish (1.12) we have used (1.8) and the elementary computation
$$\Delta_{|\cdot|^2}(v, v_*, \sigma) = -\frac{1 - \alpha^2}{4}\,\big(1 - (\hat{u} \cdot \sigma)\big)\,|u|^2.$$
The study of the Cauchy theory and the cooling process of (1.1)-(1.2) was done in [23]. The equation is well-posed for instance in $L^1_3$: for $0 \le f_{\rm in} \in L^1_3$, there is a unique global solution in $C(\mathbb{R}_+; L^1_2) \cap L^\infty(\mathbb{R}_+; L^1_3)$ (see Subsect. 1.5 for the notation of functional spaces). This solution preserves mass, momentum and has a positive and decreasing kinetic energy. Moreover, as time goes to infinity, it satisfies:
$$\mathcal{E}(f_t) \to 0 \quad\text{and}\quad f(t, \cdot) \rightharpoonup \delta_{v=0} \ \text{ in } M^1(\mathbb{R}^N)\text{-weak *}, \tag{1.15}$$
where $M^1(\mathbb{R}^N)$ denotes the space of probability measures on $\mathbb{R}^N$.
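The elementary computation used to establish (1.12) can be checked directly from the collision rule (1.4). The sketch below draws random pre-collisional velocities and a random unit vector $\sigma$, computes $v'$ and $v'_*$, and compares the change in $|v|^2 + |v_*|^2$ with $-\frac{1-\alpha^2}{4}(1 - \hat{u}\cdot\sigma)|u|^2$. The function name `post_collisional`, the seed and the sample value of $\alpha$ are illustrative assumptions.

```python
import numpy as np

def post_collisional(v, v_star, sigma, alpha):
    """Post-collisional velocities (v', v'_*) from (1.4), for a unit vector sigma."""
    u = v - v_star
    w = v + v_star
    u_prime = 0.5 * (1 - alpha) * u + 0.5 * (1 + alpha) * np.linalg.norm(u) * sigma
    return 0.5 * (w + u_prime), 0.5 * (w - u_prime)

rng = np.random.default_rng(0)
alpha = 0.8
v, v_star = rng.normal(size=3), rng.normal(size=3)
sigma = rng.normal(size=3)
sigma /= np.linalg.norm(sigma)

vp, vp_star = post_collisional(v, v_star, sigma, alpha)
u = v - v_star
u_hat = u / np.linalg.norm(u)

energy_change = vp @ vp + vp_star @ vp_star - v @ v - v_star @ v_star
predicted = -(1 - alpha**2) / 4 * (1 - u_hat @ sigma) * (u @ u)
print("momentum defect:", np.linalg.norm(vp + vp_star - v - v_star))
print("energy change  :", energy_change, "predicted:", predicted)
```

Momentum is conserved because $v' + v'_* = w$, and the energy change is non-positive for every $\alpha \in [0, 1]$, vanishing in the elastic case $\alpha = 1$.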
1.2. Introduction of rescaled variables. Let us introduce some rescaled variables (which can be found in [8,15,24] for instance), in order to study more precisely the asymptotic behavior (1.15) of the solution. To any solution $f$ of the Boltzmann equation (1.1) we may associate, for any $\tau \in (0, \infty)$, the self-similar rescaled solution $g$ by the relation
$$g(t, v) = e^{-N\tau t}\, f\Big(\frac{e^{\tau t} - 1}{\tau},\ e^{-\tau t} v\Big).$$
Using the homogeneity property $Q_\alpha(g(\lambda\,\cdot), g(\lambda\,\cdot))(v) = \lambda^{-(N+1)}\, Q_\alpha(g, g)(\lambda v)$, it is straightforward that $g$ satisfies the evolution equation
$$\frac{\partial g}{\partial t} = Q_\alpha(g, g) - \tau\,\nabla_v \cdot (v g). \tag{1.16}$$
Any non-negative steady state $0 \le G = G(v)$ of (1.16), that is $G$ satisfying
$$Q_\alpha(G, G) - \tau\,\nabla_v \cdot (v G) = 0, \tag{1.17}$$
is called a self-similar profile. It induces a self-similar solution (or homogeneous cooling state) $F$ of the original equation (1.1) by setting
$$F(t, v) = (V_0 + \tau t)^N\, G\big((V_0 + \tau t) v\big), \tag{1.18}$$
for a given constant $V_0 \in (0, \infty)$. Reciprocally, let us consider a self-similar solution $F$ of the original equation (1.1). This means that $F$ is a solution of (1.1) with the specific shape
$$F(t, v) = V(t)^N\, G\big(V(t)\, v\big), \tag{1.19}$$
for some given non-negative distribution $G = G(v)$ and some $C^1$, positive, increasing time rescaling function $V(t)$. One can easily show (see for instance [24, Sect. 1.2]) that $V(t) = \tau t + V_0$ for some constants $\tau, V_0 > 0$ and $G$ satisfies (1.17) for the velocity rescaling parameter $\tau$.

For a given self-similar profile $G$, associated to a velocity rescaling parameter $\tau$ and with mass $\rho$ and energy $\mathcal{E}$, we may associate a new self-similar profile $\tilde{G}$, associated to a velocity rescaling parameter $\tilde\tau$ and with mass $\tilde\rho$, by setting
$$\tilde{G}(v) = K\, G(V v), \qquad V = \frac{\tilde\rho}{\rho}\,\frac{\tau}{\tilde\tau}, \qquad K = V^N\,\frac{\tilde\rho}{\rho}. \tag{1.20}$$
The energy of $\tilde{G}$ is then $\tilde{\mathcal{E}} = \frac{\rho}{\tilde\rho}\big(\frac{\tilde\tau}{\tau}\big)^2 \mathcal{E}$. We thus see that there exists a two real parameter family of self-similar profiles which can be parametrized either by $(\rho, \tau)$ or by $(\rho, \mathcal{E})$. For fixed mass, changing the velocity rescaling parameter $\tau$ in (1.17) corresponds to a change of the energy of the profile, or equivalently to a homothetic change of variable of the solution. Therefore it is no restriction to choose this constant arbitrarily. Also note that modifying $V_0$ just corresponds to a time translation in the self-similar solution $F$ defined by (1.18).
It follows from [24, Theorem 1.1] that for any inelastic parameter $\alpha \in (0, 1)$, any mass $\rho \in (0, \infty)$ and (thanks to the preceding discussion) any velocity rescaling parameter $\tau \in (0, \infty)$, there exists at least one positive and smooth self-similar profile $G$ with given mass $\rho$ and vanishing momentum:
$$\begin{cases} Q_\alpha(G, G) - \tau\,\nabla_v \cdot (v G) = 0 \quad\text{in } \mathbb{R}^N, \\ \displaystyle\int_{\mathbb{R}^N} G\, dv = \rho, \quad \int_{\mathbb{R}^N} G\, v\, dv = 0, \quad 0 < G \in \mathcal{S}(\mathbb{R}^N), \end{cases} \tag{1.21}$$
where $\mathcal{S}(\mathbb{R}^N)$ denotes the Schwartz space of $C^\infty$ functions decreasing at infinity faster than any polynomial. Furthermore, we recall that the particular self-similar profile $G$ built in [24, Theorem 1.1] satisfies
$$\forall\, v \in \mathbb{R}^N,\ |v| \ge 1, \qquad e^{-C_\alpha |v|} \le G(v) \le e^{-c_\alpha |v|},$$
for some $0 < c_\alpha \le C_\alpha < \infty$.

Finally, to any solution $g$ of the Boltzmann equation in self-similar variables (1.16) we may associate a solution $f$ to the evolution problem (1.1), defining $f$ by the relation
$$f(t, v) = (V_0 + \tau t)^N\, g\Big(\frac{\ln(V_0 + \tau t)}{\tau},\ (V_0 + \tau t) v\Big). \tag{1.22}$$
We emphasize that the construction of self-similar rescaling of this subsection is a priori no longer valid when $\alpha$ is not constant. Therefore other difficulties arise in this case, and we postpone their study to a forthcoming work [25].

1.3. Rescaled variables and elastic limit $\alpha \to 1$. We now set the value of $\tau$ as
$$\tau = \tau_\alpha = \rho\,(1 - \alpha), \tag{1.23}$$
and we denote by $G_\alpha$ a solution to the problem (1.21) for this choice $\tau_\alpha$. At a formal level, it is immediate that with this choice, in the elastic limit $\alpha \to 1$, Eq. (1.21) becomes
$$\begin{cases} Q_1(G_1, G_1) = 0 \quad\text{in } \mathbb{R}^N, \\ \displaystyle\int_{\mathbb{R}^N} G_1\, dv = \rho, \quad \int_{\mathbb{R}^N} G_1\, v\, dv = 0, \quad 0 \le G_1 \in \mathcal{S}(\mathbb{R}^N). \end{cases} \tag{1.24}$$
Moreover, multiplying the first equation of (1.21) by $|v|^2$, integrating in the velocity variable as in (1.12) and taking into account the additional term coming from the additional drift term in (1.21), one gets
$$2\,(1 - \alpha)\,\rho\,\mathcal{E}(G_\alpha) - (1 - \alpha^2)\, D_{\mathcal{E}}(G_\alpha) = 0. \tag{1.25}$$
Dividing the above equation by $2(1 - \alpha)$ and passing to the limit $\alpha \to 1$, one obtains
$$\rho\,\mathcal{E}(G_1) - D_{\mathcal{E}}(G_1) = 0. \tag{1.26}$$
It is straightforward (see Prop. 3.6 below) that the only function satisfying (1.24) and (1.26) is the Maxwellian function
$$\bar{G}_1 := M_{\bar\theta_1} = M_{\rho, 0, \bar\theta_1}, \tag{1.27}$$
where, for any $\rho, \theta > 0$, $u \in \mathbb{R}^N$, the function $M_{\rho, u, \theta}$ denotes the Maxwellian with mass $\rho$, momentum $u$ and temperature $\theta$ given by
$$M_{\rho, u, \theta}(v) := \frac{\rho}{(2\pi\theta)^{N/2}}\, e^{-\frac{|v - u|^2}{2\theta}}, \tag{1.28}$$
and where the temperature $\bar\theta_1 \in (0, \infty)$ here is given by (we recall that $b_1$ is defined in (1.14))
$$\bar\theta_1 = \frac{N^2}{8\, b_1^2}\left(\int_{\mathbb{R}^N} M_{1,0,1}(v)\,|v|^3\, dv\right)^{-2}. \tag{1.29}$$
For instance in dimension $N = 3$ we obtain $\bar\theta_1 = (9\pi)/(64\, b_1^2)$. Moreover, in the particular case of the hard-spheres cross-section (1.7) in dimension 3, we find $b_1 = b_0\,(4\pi)/3$ and therefore $\bar\theta_1 = 81/(1024\,\pi\,(b_0)^2)$.

1.4. Physical and mathematical motivation. For a detailed physical introduction to granular gases we refer to [9,13]. As can be seen from the references included in the latter, granular flows have become a subject of physical research on their own in the last decades, and for certain regimes of dilute and rapid flows, these studies are based on kinetic theory. By contrast, the mathematical kinetic theory of granular gases is rather young and began in the late 1990s. We refer to [23,24] for a (short) mathematical introduction to this theory and a (non-exhaustive) list of references. As explained in these papers, granular gases are composed of grains of macroscopic size with contact collisional interactions, when one does not consider other possible self-interaction mechanisms such as gravitation (for cosmic clouds, for instance) or electromagnetism (for “dusty plasmas”, for instance). Therefore the natural assumption about the binary interaction between grains is that of inelastic hard spheres, with no loss of “tangential relative velocity” (according to the impact direction) and a loss in “normal relative velocity”. This loss is quantified in some (normal) restitution coefficient. The latter is either assumed to be constant as a first approximation (as in this paper) or can be more intricate: for instance it is a function of the modulus $|v' - v|$ of the normal relative velocity in the case of “visco-elastic hard spheres” (see [9]), which shall be studied in the future in [25].
Simplified Boltzmann models like inelastic Maxwell molecules or pseudo inelastic hard spheres have been proposed (see [5]), for which existence, uniqueness and global stability of a self-similar profile have been shown (see [3,7]); see also [2] for similar results in the case of a thermal bath. However these models do not capture some crucial physical features of the cooling process of granular gases, like the tail behavior of the velocity distribution or the rate of decay of the temperature (the so-called Haff’s law). For (spatially homogeneous) inelastic hard spheres Boltzmann models, the existing mathematical works are:
• the paper [8] which shows a priori polynomial and exponential moment bounds on any possible self-similar profile (resp. stationary solution), whose existence is assumed, for freely cooling (resp. driven by a thermal bath) inelastic hard spheres with constant restitution coefficient;
• the paper [17] which shows existence of stationary solutions for inelastic hard spheres driven by a thermal bath, and improves the estimates on their tails of [8] into pointwise ones;
• the paper [23] which provides a Cauchy theory for freely cooling inelastic hard spheres with a broad family of collision kernels (including in particular restitution coefficients possibly depending on the relative velocity and/or the temperature), and studies whether the gas cools down in finite time or asymptotically, depending on the collision kernel;
• the paper [24] which shows, for freely cooling inelastic hard spheres with constant restitution coefficient, existence of self-similar profile(s) as well as propagation of regularity and damping with time of singularities.

In this paper we study the self-similarity properties of the Boltzmann equation for inelastic hard spheres. Therefore as a natural first step we consider a constant restitution coefficient $\alpha$ in order to have a self-similar scaling, which makes it possible to reduce the study of self-similar solutions to the study of stationary solutions for a rescaled equation. We also restrict to the case of a restitution coefficient $\alpha$ close to 1, that is, small inelasticity. There are several motivations from mathematics and physics for such a choice:
• the first reason is related to the regime of validity of kinetic theory: as explained in [9, Chap. 6] for instance, the more inelasticity, the more correlations between grains are created during the binary collisions, and therefore the molecular chaos assumption, which is at the basis of the validity of Boltzmann’s theory, suggests weak inelasticity to be the most effective;
• second, as emphasized in [9] again, the case of a restitution coefficient $\alpha$ close to 1 has been widely considered in physics or mathematical physics since it allows one to use expansions around the elastic case, and since conversely it is an interesting question to understand the connection of the inelastic case (dissipative at the microscopic level) to the elastic case (“hamiltonian” at the microscopic level);
• finally this case of a small inelasticity is reasonable from the viewpoint of applications, since it applies to interstellar dust clouds in astrophysics, or sands and dusts in earth-bound experiments, and more generally to visco-elastic hard spheres whose restitution coefficient is not constant but close to 1 on the average.

In this framework we shall show uniqueness and attractivity of self-similar solutions (in a suitable sense), and thus give a complete answer to the Ernst-Brito conjecture [16] (stated in [16] for the simplified inelastic Maxwell model) for inelastic hard spheres with a small inelasticity. Moreover we give precise results about the elastic limit and deduce some quantitative information about the weakly inelastic case.
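1.5. Notation. Throughout the paper we shall use the notation $\langle \cdot \rangle = \sqrt{1 + |\cdot|^2}$. We denote, for any $p \in [1,+\infty]$, $q \in \mathbb{R}$ and weight function $\omega : \mathbb{R}^N \to \mathbb{R}_+$, the weighted Lebesgue space

$$L^p_q(\omega) := \left\{ f : \mathbb{R}^N \to \mathbb{R} \ \text{measurable}; \ \|f\|_{L^p_q(\omega)} < +\infty \right\},$$

with, for $p < +\infty$,

$$\|f\|_{L^p_q(\omega)} = \left[ \int_{\mathbb{R}^N} |f(v)|^p \, \langle v \rangle^{pq} \, \omega(v) \, dv \right]^{1/p}$$

and, for $p = +\infty$,

$$\|f\|_{L^\infty_q(\omega)} = \sup_{v \in \mathbb{R}^N} |f(v)| \, \langle v \rangle^{q} \, \omega(v).$$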
We shall in particular use the exponential weight functions

$$m = m_{s,a}(v) := e^{-\zeta(|v|^2)} \quad \forall\, v \in \mathbb{R}^N, \tag{1.30}$$

where $\zeta(r) = a\, r^{s/2}$ for some $a \in (0,\infty)$, $s \in (0,1)$, or (for a mollified exponential weight function $m$) where $\zeta \in C^\infty$ is such that $\zeta(r) = a\, r^{s/2}$ for $r \ge 1$, for some $a \in (0,\infty)$, $s \in (0,1)$.

In the same way, the weighted Sobolev space $W^{k,p}_q(\omega)$ ($k \in \mathbb{N}$) is defined by the norm

$$\|f\|_{W^{k,p}_q(\omega)} = \Bigl[ \sum_{|s| \le k} \|\partial^s f\|^p_{L^p_q(\omega)} \Bigr]^{1/p},$$

and as usual in the case $p = 2$ we denote $H^k_q(\omega) = W^{k,2}_q(\omega)$. The weight $\omega$ shall be omitted when it is 1. Finally, for $g \in L^1_{2k}$, with $k \ge 0$, we introduce the following notation for the homogeneous moment of order $2k$:

$$\mathbf{m}_k(g) := \int_{\mathbb{R}^N} g \, |v|^{2k} \, dv,$$

and we also denote by $\rho(g) = \mathbf{m}_0(g)$ the mass of $g$, $E(g) = \mathbf{m}_1(g)$ the energy of $g$ and by $\theta(g) = E(g)/(\rho(g)\, N)$ the temperature associated to $g$ (when the distribution $g$ has zero mean). For any $\rho, E \in (0,\infty)$, $u \in \mathbb{R}^N$ we then introduce the subsets of $L^1$ of functions of given mass, mean velocity and energy

$$\mathcal{C}_{\rho,u} := \Bigl\{ h \in L^1_1; \ \int_{\mathbb{R}^N} h \, dv = \rho, \ \int_{\mathbb{R}^N} h \, v \, dv = \rho\, u \Bigr\},$$

$$\mathcal{C}_{\rho,u,E} := \Bigl\{ h \in L^1_2; \ \int_{\mathbb{R}^N} h \, dv = \rho, \ \int_{\mathbb{R}^N} h \, v \, dv = \rho\, u, \ \int_{\mathbb{R}^N} h \, |v|^2 \, dv = E \Bigr\}.$$
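As a quick consistency check of these definitions (a worked example added for illustration, using only standard Gaussian integrals), one can compute the mass, momentum, energy and temperature of the centered Maxwellian $M_{\rho,0,\theta}$ from (1.28):

```latex
% Moments of the centered Maxwellian M_{\rho,0,\theta} of (1.28).
% Each velocity coordinate is Gaussian with variance \theta, hence the factor N\theta.
\begin{align*}
\rho(M_{\rho,0,\theta}) &= \int_{\mathbb{R}^N} M_{\rho,0,\theta}(v)\,dv = \rho, \\
\int_{\mathbb{R}^N} M_{\rho,0,\theta}(v)\,v\,dv &= 0, \\
E(M_{\rho,0,\theta}) &= \int_{\mathbb{R}^N} M_{\rho,0,\theta}(v)\,|v|^2\,dv = \rho\,N\,\theta, \\
\theta(M_{\rho,0,\theta}) &= \frac{E(M_{\rho,0,\theta})}{\rho(M_{\rho,0,\theta})\,N} = \theta .
\end{align*}
```

In particular the temperature functional applied to a centered Maxwellian returns its temperature parameter, and $M_{\rho,0,\theta}$ has mass $\rho$, zero momentum and energy $\rho N \theta$.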
For any (smooth version of) exponential weight function $m$ we introduce the Banach space $\mathbb{L}^1(m^{-1}) = L^1(m^{-1}) \cap \mathcal{C}_{0,0}$.

1.6. Main results in self-similar variables. Our main result, that we state now, deals with the evolution equation in self-similar variables

$$\frac{\partial g}{\partial t} = Q_\alpha(g,g) - \tau_\alpha \nabla_v \cdot (v\, g), \qquad g(0,\cdot) = g_{in} \in \mathcal{C}_{\rho,0}, \qquad \tau_\alpha := \rho\,(1-\alpha), \tag{1.31}$$

and with the associated stationary equation for the self-similar profile,

$$Q_\alpha(G,G) - \tau_\alpha \nabla_v \cdot (v\, G) = 0, \qquad G \in \mathcal{C}_{\rho,0}, \qquad \tau_\alpha := \rho\,(1-\alpha). \tag{1.32}$$
Theorem 1.1. There is some constructive $\alpha_* \in (0,1)$ such that for any given mass $\rho \in (0,\infty)$, we have:

(i) For any $\tau > 0$ and $\alpha \in [\alpha_*, 1]$, Eq. (1.16) admits a unique non-negative stationary solution with mass $\rho$ and vanishing momentum. We denote by $\bar G_\alpha$ the self-similar profile obtained by fixing $\tau = \tau_\alpha := \rho\,(1-\alpha)$.
(ii) Let $\bar G_1 = M_{\rho,0,\bar\theta_1}$ be the Maxwellian distribution with mass $\rho$, momentum 0 and “quasi-elastic self-similar temperature” $\bar\theta_1$ defined in (1.29). The path of self-similar profiles $\alpha \mapsto \bar G_\alpha$ parametrized by the normal restitution coefficient is $C^1$ from $[\alpha_*, 1]$ into $W^{k,1} \cap L^1(e^{a |v|})$ for any $k \in \mathbb{N}$ and some $a \in (0,\infty)$.

(iii) For any $\alpha \in [\alpha_*, 1]$, the linearized collision operator

$$h \mapsto \mathcal{L}_\alpha h := 2\, \tilde Q_\alpha(\bar G_\alpha, h) - \tau_\alpha \nabla_v \cdot (v\, h) \tag{1.33}$$

is well-defined and closed on $\mathbb{L}^1(m^{-1})$ for any exponential weight function $m$ with exponent $s \in (0,1)$ (defined in (1.30)). Its spectrum decomposes into a part which lies in the half-plane $\{\mathrm{Re}\, \xi \le \bar\mu\}$ for some constructive $\bar\mu < 0$, and some remaining discrete eigenvalue $\mu_\alpha$. This eigenvalue is real negative and satisfies

$$\mu_\alpha = -\rho\,(1-\alpha) + O(1-\alpha)^2 \quad \text{when } \alpha \to 1. \tag{1.34}$$

The associated eigenspace has dimension 1 and, denoting by $\phi_\alpha = \phi_\alpha(v)$ the unique associated eigenfunction such that $\|\phi_\alpha\|_{L^1_2} = 1$ and $\phi_\alpha(0) < 0$, we have

$$\phi_\alpha \in \mathcal{S}(\mathbb{R}^N) \ \text{(with bounds of regularity independent of } \alpha\text{)} \quad \text{and} \quad \phi_\alpha \to \phi_1 := c_0 \left( |v|^2 - N \bar\theta_1 \right) \bar G_1 \ \text{as } \alpha \to 1, \tag{1.35}$$

where $c_0$ is the positive constant such that $\|\phi_1\|_{L^1_2} = 1$. Finally one has constructive decay estimates on the semigroup associated to this spectral decomposition in this Banach space (see the key Theorem 5.2 and the following point).

(iv) The self-similar profile $\bar G_\alpha$ is globally attractive on bounded subsets of $L^1_3$ under some additional smallness condition on $(1 - \alpha_*)$ in the following sense. For any $\rho, E_0, M_0 \in (0,\infty)$ there exist $\alpha_{**} \in (\alpha_*, 1)$, $C_* \in (0,\infty)$ and $\eta \in (0,1)$, such that for any initial datum satisfying

$$0 \le g_{in} \in L^1_3 \cap \mathcal{C}_{\rho,0,E_0}, \qquad \|g_{in}\|_{L^1_3} \le M_0,$$

the solution $g$ to (1.31) satisfies

$$\forall\, \alpha \in [\alpha_{**}, 1), \qquad \|g_t - \bar G_\alpha\|_{L^1_2} \le e^{(1-\eta)\, \mu_\alpha t}\, C_*. \tag{1.36}$$

(v) Moreover, under smoothness conditions on the initial datum one may prove a more precise asymptotic decomposition, and construct a Lyapunov functional for Eq. (1.31). More precisely, there exists $k_* \in \mathbb{N}$ and, for any exponential weight $m$ as defined in (1.30) and any $\rho, E_0, M_0 \in (0,\infty)$, there exist $\alpha_{**} \in (\alpha_*, 1)$ and a constructive functional $\mathcal{H} : H^{k_*} \cap L^1(m^{-1}) \to \mathbb{R}$ such that, first, for any initial datum $0 \le g_{in} \in H^{k_*} \cap L^1(m^{-1}) \cap \mathcal{C}_{\rho,0,E_0}$ satisfying $\|g_{in}\|_{H^{k_*} \cap L^1(m^{-1})} \le M_0$, the solution $g$ to (1.31) satisfies

$$g(t,\cdot) = \bar G_\alpha + c_\alpha(t)\, \phi_\alpha + r_\alpha(t,\cdot), \tag{1.37}$$

with $c_\alpha(t) \in \mathbb{R}$ and $r_\alpha(t,\cdot) \in L^1_2(\mathbb{R}^N)$ such that ($\mu_\alpha$ satisfies (1.34) above)

$$|c_\alpha(t)| \le C_*\, e^{\mu_\alpha t}, \qquad \|r_\alpha(t,\cdot)\|_{L^1_2} \le C_*\, e^{(3/2)\, \mu_\alpha t}. \tag{1.38}$$
Second, when the initial datum also satisfies the lower bound $g_{in} \ge M_0^{-1}\, e^{-M_0 |v|^8}$, the solution is such that $t \mapsto \mathcal{H}(g(t,\cdot))$ is strictly decreasing (except maybe when $g_t$ reaches the stationary state $\bar G_\alpha$, see the remark below).

Remark 1.2.
1) All the constants of this theorem are constructive and they can be made explicit. In particular the proofs do not use any compactness argument. Unless otherwise mentioned, these constants will depend on $b$, on the dimension $N$, and on some bounds on the initial datum but never on the inelasticity parameter $\alpha \in (0,1]$.
2) Theorem 1.1 establishes that Conjectures 1 and 2 in [24, Sect. 5] hold true at least for the weak inelasticity model ($\alpha$ close enough to 1).
3) In point (iv), the condition on the restitution coefficient $\alpha$ depends on the mass, temperature and $L^1_3$ norm of the initial distribution, but this dependence is not a perturbative condition of smallness of $(g_{in} - \bar G_\alpha)$. This is unexpected and relies on the so-called “entropy-entropy production” estimates which yield “superlinear” Gronwall-type estimates, and on the decoupling of the timescales of energy dissipation and entropy production.
4) In (1.38) one can prove $\|r_\alpha(t,\cdot)\|_{L^1_2} \le C_\zeta\, e^{\zeta \mu_\alpha t}$ for any $\zeta \in (1,2)$. We remark that here we do not claim a decay rate $e^{\bar\lambda t}$ on the remaining part when one “removes” from $g_t - \bar G_\alpha$ the projection on the energy eigenvalue, where $\bar\lambda < 0$ would be some constant independent of $\alpha$ related to the second non-zero eigenvalue of $\mathcal{L}_\alpha$. We do not know if such a decay rate holds, but it is unlikely in our opinion, due to the coupling effect of the bilinear term, which mixes the different parts of the spectral decomposition.
5) As a by-product the above result provides an alternative argument to the one of [24, Sect. 3] to show uniform (in time and $\alpha$) non-concentration bounds on the rescaled equation in the case of $\alpha$ close to 1 and a general initial datum $g_{in} \in L^1_3$ (whereas the proof of [24, Sect. 3] was valid for all $\alpha \in (0,1)$ but for some initial datum $g_{in} \in L^1_3 \cap L^p$, $p \in (1,\infty]$).
6) Our results show that no bifurcation occurs for the path of self-similar profiles for $\alpha$ close to 1. We do not know if some bifurcations occur for other values of the inelasticity parameter. Therefore we do not know if there is a continuous branch of self-similar profiles parametrized by $\alpha \in [0,1]$ (even if we know from [24] that self-similar profiles exist for all values of $\alpha$). The best we can say from the estimates we have proved on the profile together with the classical theory of topological degree (see [29] for instance) is that there is a set $K \subset [0,1] \times \mathcal{F}$ (where $\mathcal{F}$ is for instance the set of positive functions in the Schwartz space with given mass) which is compact, connected, and such that for any $\alpha \in [0,1]$, the intersection $K \cap (\{\alpha\} \times \mathcal{F})$ is not empty.
7) If $g_{in} \neq \bar G_\alpha$, it is likely that $g(t,\cdot)$ does not reach the stationary state $\bar G_\alpha$ in finite time, and thus $t \mapsto \mathcal{H}(g(t,\cdot))$ is a strictly decreasing function over $\mathbb{R}_+$. Unfortunately, we are not able to prove this fact. Nevertheless, it is worth mentioning that when $g_{in} \notin C^\infty$, so that $g_{in} \notin H^k$ for some $k$, we may adapt to the inelastic Boltzmann equation some results obtained in [28] for the elastic Boltzmann equation and we get that $g(t,\cdot) \notin H^k$ for any $t \ge 0$, from which we easily conclude that $g(t,\cdot) \neq \bar G_\alpha$ $\forall\, t \ge 0$ (because $\bar G_\alpha \in H^\ell$ for any $\ell \ge 0$).
1.7. Coming back to the original equation. When coming back to the original equation (1.1) with the help of (1.18) and (1.22), Theorem 1.1 translates into

Theorem 1.3. There is a constructive $\alpha_* \in (0,1)$ such that for any given mass $\rho \in (0,\infty)$, we have:

(i) For any $\alpha \in [\alpha_*, 1)$, up to a translation of time there exists a unique self-similar solution $\bar F_\alpha$ of Eq. (1.1) with mass $\rho$, and it is given by

$$\bar F_\alpha(t,v) = (1 + \tau_\alpha t)^N\, \bar G_\alpha\bigl((1 + \tau_\alpha t)\, v\bigr), \qquad \tau_\alpha = \rho\,(1-\alpha),$$

where $\bar G_\alpha$ was obtained in Theorem 1.1. More precisely, if $F_\alpha$ is a solution of (1.1) of the form (1.19) and with mass $\rho$, there exists $t_0 \in \mathbb{R}$ such that $F_\alpha(t,v) = \bar F_\alpha(t + t_0, v)$ for any $t \ge \max\{0, -t_0\}$ and any $v \in \mathbb{R}^N$.

(ii) The self-similar solution $\bar F_\alpha$ is globally attractive on bounded subsets of $L^1_3$ under some smallness condition on $(1 - \alpha_*)$ in the following sense. For any $\rho, E_0, M_0 \in (0,\infty)$ there exist $\alpha_{**} \in (\alpha_*, 1)$ and $\eta \in (0,1)$ such that for any $q \in \mathbb{N}$ there is $c_q \in (0,\infty)$ such that for any initial datum satisfying

$$0 \le f_{in} \in L^1_3 \cap \mathcal{C}_{\rho,0,E_0}, \qquad \|f_{in}\|_{L^1_3} \le M_0,$$

the solution $f(t,\cdot)$ to (1.1) satisfies

$$\forall\, q \in [0,\infty), \qquad \|f(t,\cdot) - \bar F_\alpha(t,\cdot)\|_{L^1(|v|^q)} \le c_q\, (1 + \tau_\alpha t)^{(1-\eta)\, \mu_\alpha/\tau_\alpha - q} = c_q\, (1 + \tau_\alpha t)^{-(1-\eta) - q + O(1-\alpha)}.$$

(iii) Moreover, there exists $k_* \in \mathbb{N}$ and, for any exponential weight $m$ as defined in (1.30) and any $\rho, E_0, M_0 \in (0,\infty)$, there exists $\alpha_{**} \in (\alpha_*, 1)$ such that, for any initial datum $0 \le f_{in} \in H^{k_*} \cap L^1(m^{-1}) \cap \mathcal{C}_{\rho,0,E_0}$ satisfying $\|f_{in}\|_{H^{k_*} \cap L^1(m^{-1})} \le M_0$, the solution $f$ to (1.1) satisfies

$$f(t,\cdot) = \bar F_\alpha(t,\cdot) + \tilde c_\alpha(t)\, \psi_\alpha(t,\cdot) + \tilde r_\alpha(t,\cdot), \tag{1.39}$$

where

$$\psi_\alpha(t,v) = (1 + \tau_\alpha t)^N\, \phi_\alpha\bigl((1 + \tau_\alpha t)\, v\bigr), \qquad \tilde c_\alpha(t) = c_\alpha\!\left( \frac{\ln(1 + \tau_\alpha t)}{\tau_\alpha} \right).$$

In this expansion, the different terms have the following asymptotic behaviors (for any given $q \ge 0$):

$$\|\bar F_\alpha(t,\cdot)\|_{L^1(|v|^q)} = (1 + \tau_\alpha t)^{-q}\, \|\bar G_\alpha\|_{L^1(|v|^q)}, \qquad \|\psi_\alpha(t,\cdot)\|_{L^1(|v|^q)} = (1 + \tau_\alpha t)^{-q}\, \|\phi_\alpha\|_{L^1(|v|^q)},$$

$$|\tilde c_\alpha(t)| \le C_*\, (1 + \tau_\alpha t)^{\mu_\alpha/\tau_\alpha} = C_*\, (1 + \tau_\alpha t)^{-1 + O(1-\alpha)},$$

$$\exists\, C_q > 0; \qquad \|\tilde r_\alpha\|_{L^1(|v|^q)} \le C_q\, (1 + \tau_\alpha t)^{(3/2)\, \mu_\alpha/\tau_\alpha - q} = C_q\, (1 + \tau_\alpha t)^{-(3/2) - q + O(1-\alpha)}.$$

Hence the leading term in the expansion (1.39) is, as expected, the self-similar solution, and the first order correction beyond self-similarity is given by the second term, that is the projection onto the eigenspace of the “energy eigenvalue”.
(iv) We can make Haff’s law precise on the asymptotic behavior of the granular temperature (see [24]) in the following way. Under the assumptions of point (iii), the solution $f = f(t,v)$ to (1.1) satisfies

$$E(f_t) = \frac{E(\bar G_\alpha)}{(1 + \tau_\alpha t)^2} + O\!\left( \frac{1}{(1 + \tau_\alpha t)^{3 + O(1-\alpha)}} \right). \tag{1.40}$$

(v) Under the assumptions of point (iii) the rescaling by the square root of the energy familiar to physicists is rigorously justified in the following sense: the solution $f = f(t,v)$ to (1.1) satisfies, for $t \to +\infty$,

$$E(f_t)^{N/2}\, f\bigl(t, E(f_t)^{1/2}\, v\bigr) \to E(\bar G_\alpha)^{N/2}\, \bar G_\alpha\bigl(E(\bar G_\alpha)^{1/2}\, v\bigr) \quad \text{in } L^1.$$

Remark 1.4. We see from this theorem that the convergence towards the self-similar solution is indeed faster than the convergence towards the Dirac mass (hence justifying its interest), but also that the speed of convergence towards this self-similar solution degenerates to 0 as $\alpha \to 1$ (because $\tau_\alpha \to 0$ when $\alpha \to 1$). This fact is surprising, since the self-similar solution converges towards a stationary Maxwellian distribution in the elastic limit, and the latter is known to be exponentially attractive for the elastic equation (see [27] for instance). As we shall see this is related to the fact that a bifurcation occurs in the spectrum of the linearized collision operator at $\alpha = 1$ (namely the eigenvalue corresponding to the kinetic energy vanishes at $\alpha = 1$ whereas it is non-zero for $\alpha \in [\alpha_*, 1)$). This remark may explain the fact that in the quasi-elastic limit considered (in dimension 1) in [10], it is proved that the rate of relaxation towards the self-similar solution is worse than any polynomial.

Proof of Theorem 1.3. Except for points (i) and (v) this theorem is an obvious translation of Theorem 1.1. In order to prove (i), one first remarks that for two given self-similar solutions $F$ and $\tilde F$ with same mass $\rho$, there holds

$$F(t,v) = (V_0 + A\, t)^N\, G_A\bigl((V_0 + A\, t)\, v\bigr), \qquad \tilde F(t,v) = (\tilde V_0 + \tilde A\, t)^N\, G_{\tilde A}\bigl((\tilde V_0 + \tilde A\, t)\, v\bigr),$$

and thus from (1.20) and Theorem 1.1,

$$G_{\tilde A}(v) = \left( \frac{A}{\tilde A} \right)^{N} G_A\!\left( \frac{A}{\tilde A}\, v \right).$$

We deduce

$$\tilde F(t,v) = \left[ \frac{A}{\tilde A}\, \bigl(\tilde V_0 + \tilde A\, t\bigr) \right]^{N} G_A\!\left( \frac{A}{\tilde A}\, \bigl(\tilde V_0 + \tilde A\, t\bigr)\, v \right) = F(t + t_0, v) \quad \text{with} \quad t_0 = \frac{\tilde V_0}{\tilde A} - \frac{V_0}{A}.$$
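In order to prove (v), we introduce the function $\xi(t) = E(\bar G_\alpha)^{1/2} / [E(f_t)^{1/2}\, (1 + \tau_\alpha t)]$ and we compute

$$\bigl\| E(f_t)^{N/2}\, f\bigl(t, E(f_t)^{1/2}\, \cdot\bigr) - E(\bar G_\alpha)^{N/2}\, \bar G_\alpha\bigl(E(\bar G_\alpha)^{1/2}\, \cdot\bigr) \bigr\|_{L^1} = \bigl\| g\bigl(\tau_\alpha^{-1} \ln(1 + \tau_\alpha t), \cdot\bigr) - \xi(t)^N\, \bar G_\alpha(\xi(t)\, \cdot) \bigr\|_{L^1}$$

$$\le \bigl\| g\bigl(\tau_\alpha^{-1} \ln(1 + \tau_\alpha t), \cdot\bigr) - \bar G_\alpha \bigr\|_{L^1} + |\xi(t)^N - 1|\, \|\bar G_\alpha\|_{L^1} + \xi(t)^N\, \|\bar G_\alpha(\xi(t)\, \cdot) - \bar G_\alpha\|_{L^1}.$$

Using now (1.34), (1.37), (1.38), (1.40) and the fact that $\bar G_\alpha$ is bounded in $W^{1,1}_1$ uniformly in $\alpha \in (\alpha_*, 1)$ from Theorem 1.1 (ii), we deduce

$$\bigl\| E(f_t)^{N/2}\, f\bigl(t, E(f_t)^{1/2}\, \cdot\bigr) - E(\bar G_\alpha)^{N/2}\, \bar G_\alpha\bigl(E(\bar G_\alpha)^{1/2}\, \cdot\bigr) \bigr\|_{L^1} \le C\, (1 + \tau_\alpha t)^{-1 + O(1-\alpha)},$$

for some constant $C \in (0,\infty)$ (which depends in particular on the upper bound on $\|\bar G_\alpha\|_{W^{1,1}_1}$), from which (v) follows.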
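The passage from the exponential rates of Theorem 1.1 to the polynomial rates of Theorem 1.3 can be spelled out as follows. This is only a sketch: it assumes, as the argument $\tau_\alpha^{-1} \ln(1 + \tau_\alpha t)$ appearing in the proof above suggests, that the self-similar time is $s = \tau_\alpha^{-1} \ln(1 + \tau_\alpha t)$ (the precise change of variables is the one of (1.18) and (1.22), not reproduced here; the letter $s$ is introduced only for this illustration).

```latex
% Exponential decay in self-similar time s becomes polynomial decay in the original time t.
\begin{align*}
s = \frac{\ln(1+\tau_\alpha t)}{\tau_\alpha}
  \quad &\Longrightarrow \quad
  e^{\mu_\alpha s} = (1+\tau_\alpha t)^{\mu_\alpha/\tau_\alpha}, \\
\frac{\mu_\alpha}{\tau_\alpha}
  &= \frac{-\rho\,(1-\alpha) + O\bigl((1-\alpha)^2\bigr)}{\rho\,(1-\alpha)}
   = -1 + O(1-\alpha)
  \qquad \text{by (1.34)}, \\
e^{(1-\eta)\,\mu_\alpha s} &= (1+\tau_\alpha t)^{-(1-\eta) + O(1-\alpha)}, \\
e^{(3/2)\,\mu_\alpha s} &= (1+\tau_\alpha t)^{-(3/2) + O(1-\alpha)} .
\end{align*}
```

The additional factor $(1 + \tau_\alpha t)^{-q}$ in the weighted norms $L^1(|v|^q)$ comes from the rescaling $v \mapsto (1 + \tau_\alpha t)\, v$ of the velocity variable.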
Remark 1.5. Let us emphasize that the temperature $\bar\theta_1$ of the limit Maxwellian $\bar G_1$ is “universal” in the sense that it only depends on the collisional cross-section $b$ (through its angular momentum), and not for instance on the density distribution. The temperature of the self-similar solution $\bar F_\alpha = \bar F_\alpha(t,v)$ associated to a self-similar profile $\bar G_\alpha$ decreases like

$$\theta(\bar F_\alpha(t,\cdot)) = \frac{\theta(\bar G_\alpha)}{(1 + \rho\,(1-\alpha)\, t)^2}.$$

Hence when $\alpha$ is close to 1 (small inelasticity) we obtain

$$\theta(\bar F_\alpha(t,\cdot)) \approx \frac{\bar\theta_1}{(1 + \rho\,(1-\alpha)\, t)^2}.$$

Therefore, as soon as the self-similar solutions correctly describe the asymptotic behavior (at least in the framework of point (ii) of Theorem 1.3), as conjectured by physicists, generic solutions satisfy

$$\theta(f_\alpha(t,\cdot)) \sim_{t \to \infty} \frac{\bar\theta_1}{\rho^2\, (1-\alpha)^2}\, t^{-2}$$

for an inelasticity coefficient $\alpha$ close to 1. We call $\bar\theta_1$ a “quasi-elastic self-similar temperature”. Remark that its definition as the temperature of $\bar G_1$ seems to depend on the choice of the scaling. However changing this scaling by some asymptotically equivalent one, as $\alpha \to 1$, would only add a factor which would then disappear when coming back to the solution to the original equation (1.1). Therefore a more “canonical” way to define this quasi-elastic self-similar temperature could be

$$\bar\theta_1 = \rho^2 \lim_{\alpha \to 1}\, (1-\alpha)^2 \lim_{t \to +\infty} \theta(f_\alpha(t,\cdot))\, t^2,$$

where $f_\alpha$ denotes a generic solution with mass $\rho$ to Eq. (1.1).
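The $t^{-2}$ decay of the temperature of the self-similar solution can be checked directly from the scaling form $\bar F_\alpha(t,v) = (1 + \tau_\alpha t)^N\, \bar G_\alpha((1 + \tau_\alpha t)\, v)$ of Theorem 1.3 (i). The following short computation is a sketch; the abbreviation $\lambda(t)$ is introduced here only for readability.

```latex
% Change of variables w = \lambda(t) v with \lambda(t) := 1 + \tau_\alpha t, so dv = \lambda^{-N} dw.
\begin{align*}
\rho(\bar F_\alpha(t,\cdot))
  &= \int_{\mathbb{R}^N} \lambda^N\, \bar G_\alpha(\lambda v)\,dv
   = \int_{\mathbb{R}^N} \bar G_\alpha(w)\,dw = \rho, \\
E(\bar F_\alpha(t,\cdot))
  &= \int_{\mathbb{R}^N} \lambda^N\, \bar G_\alpha(\lambda v)\,|v|^2\,dv
   = \lambda^{-2} \int_{\mathbb{R}^N} \bar G_\alpha(w)\,|w|^2\,dw
   = \frac{E(\bar G_\alpha)}{(1+\tau_\alpha t)^2}, \\
\theta(\bar F_\alpha(t,\cdot))
  &= \frac{E(\bar F_\alpha(t,\cdot))}{\rho\,N}
   = \frac{\theta(\bar G_\alpha)}{(1+\rho\,(1-\alpha)\,t)^2}
   \;\sim_{t\to\infty}\; \frac{\theta(\bar G_\alpha)}{\rho^2\,(1-\alpha)^2}\; t^{-2} .
\end{align*}
```

The mass is preserved by the rescaling while the energy picks up the factor $\lambda(t)^{-2}$, which is the algebraic content of Haff’s law for the self-similar solution.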
1.8. Method of proof and plan of the paper.
• The first main idea of our method is to consider the rescaled equations (1.16) and (1.17) with an inelasticity-dependent anti-drift coefficient $\tau_\alpha$ which exactly “compensates” the loss of elasticity of the collision operator (in the sense that it compensates its loss of kinetic energy). This scaling makes it possible, by some technical estimates, to prove bounds uniform in $\alpha$ for the family of self-similar profiles $G_\alpha$ solving Eq. (1.32).
• The second main idea is the decoupling of the variations along the “energy direction” and its “orthogonal direction”. This decoupling makes it possible to identify the limit of different objects as $\alpha \to 1$ (among them the limit of $G_\alpha$).
• The third main idea is to systematically use the well-developed theory on the elastic limit problem, once it has been identified thanks to the previous arguments. In particular we use the spectral study of the linearized problem and the entropy-entropy production inequalities for the elastic problem. This allows us to argue by a perturbative method. Let us emphasize that the perturbation is singular in the classical sense because of the addition of a (vanishing at the limit) first-order derivative operator, but also because of the gain of one more conserved quantity at the limit (which implies in particular at the linearized level that the “energy eigenvalue” $\mu_\alpha$ is negative for $\alpha \neq 1$ but converges to $\mu_1 = 0$ in the limit $\alpha \to 1$).

In Sect. 2, we use the regularity properties of the collision operator in order to establish on the one hand that the family $(G_\alpha)$ is bounded in $H^\infty \cap L^1(m^{-1})$ uniformly in the inelasticity parameter $\alpha$ (the key argument being the use of the entropy functional which provides a uniform lower bound on the energy of $G_\alpha$) and on the other hand that the difference of two self-similar profiles in any regular or weighted norm may be bounded by the difference of these profiles in $L^1$ norm (the key idea is a bootstrap argument). This last point shall allow us to deal with the loss of derivatives and weights in the operator norms used in the sequel of the paper.

In Sect. 3, we prove that $\alpha \mapsto Q^+_\alpha$ is Hölder continuous in the norm of its graph and is Hölder differentiable in a weaker norm. As a consequence we deduce that $G_\alpha \to \bar G_1$ when $\alpha \to 1$ with explicit “Hölder” rate, which (partially) proves point (ii) of Theorem 1.1. The cornerstone of the proof is the decoupling of the variation $G_\alpha - \bar G_1$ between the “energy direction” and its “orthogonal direction”.

In Sect. 4, we prove uniqueness of the profile $\bar G_\alpha$ for small inelasticity (point (i) of Theorem 1.1) by a variation around the implicit function theorem in infinite dimension. We also deduce that $\alpha \mapsto \bar G_\alpha$ is differentiable at $\alpha = 1$.

Section 5 is devoted to the study of the linearized operator $\mathcal{L}_\alpha$, and we are partially inspired by the method of [27]. We prove point (iii) of Theorem 1.1 and we end the proof of point (ii) of Theorem 1.1. We obtain information on the localization of the spectrum and we establish some decay estimates on the associated semigroup. Let us emphasize that for technical reasons we state our results in an $L^1$ framework (mainly because we are not able to generalize Lemma 5.8 to an $L^2$ framework), which makes the spectral analysis more intricate. The proof proceeds as follows (the cornerstone idea is again the decoupling of the variations in the “energy direction” and its “orthogonal direction”). First, we localize the essential spectrum inside the half plane $\Delta^c_{\bar\mu} = \{ z \in \mathbb{C},\ \Re e\, z \le \bar\mu < 0 \}$ with the help of Weyl’s theorem, the compactness properties of $\mathcal{L}_\alpha$ and the “rough” (Hölder type) convergence of $Q^+_\alpha(\bar G_\alpha, \cdot)$ to $Q^+_1(\bar G_1, \cdot)$ in the “good” norm of the graph. Second, we localize the discrete spectrum lying in $\Delta_{\bar\mu} = \{ z \in \mathbb{C},\ \Re e\, z \ge \bar\mu \}$ inside the disc $\{ z \in \mathbb{C},\ |z| \le C\,(1-\alpha) \}$, thanks to estimates on the resolvent of $\mathcal{L}_\alpha$. Third we establish that the spectrum $\Sigma(\mathcal{L}_\alpha)$ of $\mathcal{L}_\alpha$ satisfies $\Sigma(\mathcal{L}_\alpha) \cap \Delta_{\bar\mu} = \{\mu_\alpha\}$, where $\mu_\alpha$ has multiplicity 1 (the proof mainly takes advantage of
the “precise” convergence of $Q^+_\alpha(\bar G_\alpha, \cdot)$ to $Q^+_1(\bar G_1, \cdot)$ in weaker norm, together with a regularity estimate holding in the discrete eigenspace). Last we establish the expansion (1.34) using the energy equation associated to the eigenvalue $\mu_\alpha$. The decay properties of the linear semigroup are then deduced from resolvent estimates and the above localization of the spectrum.

Section 6 is devoted to the proof of points (iv) and (v) in Theorem 1.1, which is split in several steps. First we establish a “linearized asymptotic stability result” by decoupling the evolution equation (1.31) along the “energy direction” and its “orthogonal direction”, and using the semigroup decay estimates and the quadratic structure of the collision operator. Second we establish a “non-linear stability result” by decomposing the evolution equation (1.31) into two timescales, and using the energy dissipation equation along the “energy direction” and the entropy production method on its “orthogonal direction” (let us mention that this method follows closely the physical idea that for small inelasticity the “molecular” timescale of thermalization of the velocity distribution decouples from the “cooling” timescale of dissipation of energy). Third we prove the asymptotic decomposition and we exhibit a Lyapunov functional for smooth initial data (point (v)) by gathering (and slightly modifying) the two preceding steps. Fourth and last, we prove point (iv) for general initial data, gathering the previous arguments with the decomposition of solutions between a smooth part and a small remaining part as introduced in [28].

2. A Posteriori Estimates on the Self-Similar Profiles

In this section we prove various a posteriori regularity and decay estimates on the self-similar profiles (or on the difference of two self-similar profiles). The main issue here is to obtain uniform estimates as $\alpha \to 1$. This shall be useful in the sequel.

2.1. Uniform estimates on the self-similar profiles. For any $\alpha \in (0,1)$ we consider $\mathcal{G}_\alpha$, the set of all the self-similar profiles of the inelastic Boltzmann equation (1.1) with inelasticity coefficient $\alpha$, with given mass $\rho \in (0,+\infty)$ and finite energy. More precisely, we define $\mathcal{G}_\alpha$ as the following set of functions:

$$\mathcal{G}_\alpha := \left\{ 0 \le G \in L^1_2 \ \text{satisfying (1.32)} \right\}.$$

For some fixed $\alpha_0 \in (0,1)$, we also define $\mathcal{G} = \cup_{\alpha \in [\alpha_0, 1)}\, \mathcal{G}_\alpha$. The fact that for any $\alpha \in (0,1)$, $\mathcal{G}_\alpha$ is not empty was proved in [24], where a solution of (1.32) was built within the class of radially symmetric functions belonging to the Schwartz space. Here we show that any self-similar profile $G_\alpha \in \mathcal{G}$ belongs to the Schwartz space and that decay estimates, pointwise lower bound and regularity estimates can be made uniform with respect to the inelasticity coefficient $\alpha \in [\alpha_0, 1)$. Let us once again emphasize that the choice of the velocity rescaling parameter $\tau_\alpha = \rho\,(1-\alpha)$ in (1.32) is fundamental in order to get this uniformity in the limit $\alpha \to 1$. Let us also mention that our choice of scaling for Eq. (1.32) is mass invariant, that is $G$ with density $\rho(G)$ satisfies the equation if and only if $G/\rho(G)$ satisfies the equation with $\rho = 1$. Therefore all the estimates on the profiles are homogeneous in terms of the density $\rho$.
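The mass invariance stated above follows from the quadratic homogeneity of the collision operator. The verification below is a sketch that assumes only that $Q_\alpha$ is bilinear, so that $Q_\alpha(\rho\, \tilde G, \rho\, \tilde G) = \rho^2\, Q_\alpha(\tilde G, \tilde G)$; the notation $\tilde G := G/\rho$ is introduced here for illustration.

```latex
% Mass invariance of (1.32): write G = \rho \tilde G with \rho = \rho(G), and use \tau_\alpha = \rho (1-\alpha).
\begin{align*}
Q_\alpha(G,G) - \rho\,(1-\alpha)\,\nabla_v\cdot(v\,G)
  &= \rho^2\, Q_\alpha(\tilde G,\tilde G) - \rho\,(1-\alpha)\,\rho\,\nabla_v\cdot(v\,\tilde G) \\
  &= \rho^2 \Bigl[\, Q_\alpha(\tilde G,\tilde G) - (1-\alpha)\,\nabla_v\cdot(v\,\tilde G) \Bigr].
\end{align*}
```

Hence $G$ solves (1.32) with mass $\rho$ if and only if $\tilde G$ solves the same equation with mass 1, which is why the anti-drift coefficient is taken proportional to $\rho$.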
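Proposition 2.1. Let us fix $\alpha_0 \in (0,1)$. There exist $a_1, a_2, a_3, a_4 \in (0,\infty)$ and, for any $k \in \mathbb{N}$, there exists $C_k \in (0,\infty)$ such that

$$\forall\, \alpha \in [\alpha_0, 1), \ \forall\, G_\alpha \in \mathcal{G}_\alpha, \qquad \begin{cases} \|G_\alpha\|_{L^1(e^{a_1 |v|})} \le a_2, \quad \|G_\alpha\|_{H^k(\mathbb{R}^N)} \le C_k, \\ G_\alpha \ge a_3\, e^{-a_4 |v|^8}. \end{cases} \tag{2.1}$$

We first recall the following geometrical lemma extracted (in a slightly specified form) from [23, Lemma 2.3 & Lemma 4.4], that we shall use several times in the sequel.

Lemma 2.2. For any $\alpha \in (0,1]$ and $\sigma \in S^{N-1}$ we define

$$\phi^*_\alpha = \phi^*_{\alpha,v,\sigma} : \mathbb{R}^N \to \mathbb{R}^N, \ v_* \mapsto v', \qquad \phi_\alpha = \phi_{\alpha,v_*,\sigma} : \mathbb{R}^N \to \mathbb{R}^N, \ v \mapsto v',$$

and the Jacobian functions $J^*_\alpha = \det(D \phi^*_{\alpha,v,\sigma})$, $J_\alpha = \det(D \phi_{\alpha,v_*,\sigma})$, as well as the cone

$$\Omega_\delta = \Omega_{\delta,\sigma} = \left\{ u \in \mathbb{R}^N, \ \hat u \cdot \sigma > \delta - 1 \right\},$$

for any $\delta \in (0,2)$ and $\sigma \in S^{N-1}$. For any $\delta \in (0,2)$, $\phi^*_\alpha$ defines a $C^\infty$-diffeomorphism from $v + \Omega_\delta$ onto $v + \Omega_{\omega^*(\delta)}$ with $\omega^*(\delta) = 1 + \sqrt{\delta/2}$, and $\phi_\alpha$ defines a $C^\infty$-diffeomorphism from $v_* + \Omega_\delta$ onto $v_* + \Omega_{\omega_\alpha(\delta)}$ with

$$\omega_\alpha(\delta) = 1 + \frac{\delta - 1 + r_\alpha}{\left( 1 + 2\,(\delta - 1)\, r_\alpha + r_\alpha^2 \right)^{1/2}} \quad \text{and} \quad r_\alpha = \frac{1+\alpha}{3-\alpha}.$$

Moreover, there exists $C \in (0,\infty)$ such that with $C_\delta = C/\delta$, $C_\delta^{-1}\, |v - v_*| \le |\phi_\alpha(v) - v_*| \le 2\, |v - v_*|,$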
|φα−1 (v ) − φα−1 (v )| ≤ C δ |α − α| |v − v∗ |, 2 |Jα | ≤ Cδ , |Jα−1 | ≤ Cδ , |Jα−1 − Jα−1 | ≤ C δ |α
(2.2) (2.3) − α|
(2.4)
on $v_* + \Omega_\delta$, uniformly with respect to the parameters $\alpha, \alpha' \in [0,1]$, $\sigma \in S^{N-1}$ and $v_* \in \mathbb{R}^N$. The same estimate holds for $\phi^*_\alpha$ on $v + \Omega_\delta$. Finally, for any $\alpha, \alpha' \in [0,1]$, $\sigma \in S^{N-1}$, $v_* \in \mathbb{R}^N$ and $t \in [0,1]$, there holds

$$t\, \phi_\alpha^{-1} + (1-t)\, \phi_{\alpha'}^{-1} = \phi_{\alpha_t}^{-1} \tag{2.5}$$

for some $\alpha_t$ belonging to the segment with extremal points $\alpha$ and $\alpha'$. The same result holds for $\phi^*_\alpha$.

We will also need the following elementary result in order to estimate the convolution operator $L$ defined in (1.9).

Lemma 2.3. For any function $g \in L^1_3(\mathbb{R}^N)$ there exist some constants $c_1, c_2 \in (0,\infty)$ such that

$$c_1\, (1 + |v|) \le L(g) \le c_2\, (1 + |v|). \tag{2.6}$$

Moreover, if $g$ satisfies $E(g) \ge a_1\, \rho$ and $\mathbf{m}_{3/2}(g) \le a_2\, \rho$, for some constants $a_1, a_2 > 0$, we can choose $c_1 = C^{-1} \rho$, $c_2 = C \rho$ in (2.6) for some explicit constant $C > 0$ depending only on $a_1, a_2 > 0$.
Proof of Lemma 2.3. The upper bound in (2.6) is immediate. As for the lower bound, we have, on the one hand, by Jensen’s inequality,

$$\int_{\mathbb{R}^N} g_*\, |u| \, dv_* \ge \rho\, |v|. \tag{2.7}$$

On the other hand, by the triangle inequality,

$$\int_{\mathbb{R}^N} g_*\, |u| \, dv_* \ge \mathbf{m}_{1/2} - |v|\, \mathbf{m}_0.$$

By Hölder’s inequality we have $\mathbf{m}_{1/2} \ge E^2\, \mathbf{m}_{3/2}^{-1} \ge C_0\, \rho$ for some explicit constant $C_0 > 0$ depending only on $a_1, a_2$. As a consequence

$$\int_{\mathbb{R}^N} g_*\, |u| \, dv_* \ge \rho\, (C_0 - |v|). \tag{2.8}$$

These two lower bounds (2.7, 2.8) imply immediately that

$$\int_{\mathbb{R}^N} g_*\, |u| \, dv_* \ge C^{-1}\, \rho\, (1 + |v|)$$

for some explicit constant $C > 0$ depending only on $C_0$.
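For the reader who wants the last step spelled out, one possible way to combine (2.7) and (2.8) is to take a convex combination of the two lower bounds. The sketch below is an illustration rather than the authors’ argument: the interpolation weight $\lambda$ and the resulting value of $C$ are choices made here, and the value $C_0 = a_1^2/a_2$ is the one given by the Hölder step under the assumptions $E(g) \ge a_1 \rho$, $\mathbf{m}_{3/2}(g) \le a_2 \rho$.

```latex
% Combining (2.7) and (2.8) with the weight \lambda = (1 + C_0)/(2 + C_0) \in (0,1).
\begin{align*}
\mathbf{m}_{1/2} &\ge \frac{E^2}{\mathbf{m}_{3/2}} \ge \frac{(a_1 \rho)^2}{a_2\,\rho} = \frac{a_1^2}{a_2}\,\rho =: C_0\,\rho, \\
\int_{\mathbb{R}^N} g_*\,|u|\,dv_*
  &\ge \rho\,\max\{\,|v|,\; C_0 - |v|\,\}
   \ge \rho\,\bigl[\lambda\,|v| + (1-\lambda)\,(C_0 - |v|)\bigr] \\
  &= \rho\,\bigl[(2\lambda - 1)\,|v| + (1-\lambda)\,C_0\bigr]
   = \rho\,\frac{C_0}{2 + C_0}\,(1 + |v|).
\end{align*}
```

With this choice the lower bound in (2.6) holds with $c_1 = C^{-1} \rho$ and $C = (2 + C_0)/C_0$, which depends only on $a_1$ and $a_2$.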
Proof of Proposition 2.1. We split the proof into several steps. In Steps 1, 2 and 3, we establish the smoothness of any profile $G_\alpha \in \mathcal{G}$ as well as upper and lower bounds on its tail. In Steps 4, 5, 6, 7, 8 and 9, we show that these estimates actually are uniform with respect to the choice of the profile $G_\alpha \in \mathcal{G}_\alpha$ and $\alpha \in [\alpha_0, 1)$. Thanks to Steps 1, 2 and 3 the computations then performed are rigorously justified. We fix $\alpha \in [\alpha_0, 1)$ and $G_\alpha$ a solution of (1.32) for which we will establish the announced bounds. From now on we omit the subscript “$\alpha$” when no confusion is possible.

Step 1. Moment bounds. From [24, Prop. 3.1], by taking $g_{in} = G$ in the evolution equation (1.16), we get that $G \in L^1_k$ for any $k \in \mathbb{N}$.

Step 2. $L^2$ a posteriori bound. We aim to prove that $G \in L^2$. Let us fix $A > 0$ and let us introduce the $C^1$ function

$$\Lambda_A(x) := \frac{x^2}{2}\, \mathbf{1}_{x \le A} + \left( A\, x - \frac{A^2}{2} \right) \mathbf{1}_{x > A}.$$

We multiply Eq. (1.32) by $\Lambda'_A(G) = \min\{G, A\} =: T_A(G)$. Once again we shall omit the subscript “$A$” when no confusion is possible. After some straightforward computation we get

$$\int_{\mathbb{R}^N} \left[ T(G)\, G\, L(G) + \rho\,(1-\alpha)\, N\, \frac{T(G)^2}{2} \right] dv = \int_{\mathbb{R}^N} T(G)\, Q^+(G,G) \, dv.$$

Since $L(G) \ge c_1\, (1 + |v|)$ thanks to Lemma 2.3, and $\Lambda(G) \le G\, T(G)$, we have

$$c_1 \int_{\mathbb{R}^N} \Lambda(G)\, (1 + |v|) \, dv \le \int_{\mathbb{R}^N} T(G)\, G\, L(G) \, dv \le \int_{\mathbb{R}^N} T(G)\, Q^+(G,G) \, dv \le I_1 + I_2 + I_3 + I_4, \tag{2.9}$$
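where the terms $I_k$ are defined in the following way, splitting the collision kernel into some smooth and non-smooth parts. Let $\Theta : \mathbb{R} \to \mathbb{R}_+$ be an even $C^\infty$ function such that support $\Theta \subset (-1,1)$, and $\int_{\mathbb{R}} \Theta = 1$. Let $\tilde\Theta : \mathbb{R}^N \to \mathbb{R}_+$ be a radial $C^\infty$ function such that support $\tilde\Theta \subset B(0,1)$ and $\int_{\mathbb{R}^N} \tilde\Theta = 1$. Introduce the regularizing sequences

$$\Theta_m(z) = m\, \Theta(m z), \ z \in \mathbb{R}, \qquad \tilde\Theta_n(x) = n^N\, \tilde\Theta(n x), \ x \in \mathbb{R}^N.$$

As a convention, we shall use subscripts $S$ for “smooth” and $R$ for “remainder”. We denote $\Phi(u) := |u|$. First, we set

$$\Phi_{S,n} = \tilde\Theta_n * \left( \Phi\, \mathbf{1}_{A_n} \right), \qquad \Phi_{R,n} = \Phi - \Phi_{S,n},$$

where $A_n$ stands for the annulus $A_n = \left\{ x \in \mathbb{R}^N; \ \frac{2}{n} \le |x| \le n \right\}$. Similarly, we set

$$b_{S,m}(z) = \Theta_m * \left( b\, \mathbf{1}_{I_m} \right)(z), \qquad b_{R,m} = b - b_{S,m},$$

where $I_m$ stands for the interval $I_m = \left\{ x \in \mathbb{R}; \ -1 + \frac{2}{m} \le |x| \le 1 - \frac{2}{m} \right\}$ ($b$ is understood as a function defined on $\mathbb{R}$ with compact support in $[-1,1]$). We then define

$$I_1 = \int_{\mathbb{R}^N} T(G)\, Q^+_R(G,G) \, dv,$$

where $Q^+_R$ is the gain term associated to the cross-section $B_R := |u|\, b_{R,m}$,

$$I_2 = \int_{\mathbb{R}^N} T(G)\, Q^+_{RS}(G,G) \, dv,$$

where $Q^+_{RS}$ is the gain term associated to the cross-section $B_{RS} := \Phi_{R,n}\, b_{S,m}$,

$$I_3 = \int_{\mathbb{R}^N} T(G) \left[ \tilde Q^+_S(\chi(G), G) + \tilde Q^+_S(T(G), \chi(G)) \right] dv,$$

where $Q^+_S$ is the gain term associated to the smooth cross-section $B_S := \Phi_{S,n}\, b_{S,m}$ and $\chi(G) := G - T(G)$, and finally

$$I_4 = \int_{\mathbb{R}^N} T(G)\, Q^+_S(T(G), T(G)) \, dv.$$

We estimate each term separately. We omit the subscripts $m$ and $n$ when there is no confusion. For $I_1$ we proceed along the line of the proof of the estimate for the term $I^r$ in [23, Proof of Theorem 2.1]. Using Young’s inequality $x\, T(y) \le \Lambda(x) + \Lambda(y)$ we have

$$I_1 = \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} G\, G_*\, T(G')\, b_{R,m}\, |u| \, dv \, dv_* \, d\sigma$$

$$\le \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} G\, [\Lambda(G_*) + \Lambda(G')]\, b_{R,m}\, \mathbf{1}_{\hat u \cdot \sigma \le 0}\, |u| \, dv \, dv_* \, d\sigma + \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} G_*\, [\Lambda(G) + \Lambda(G')]\, b_{R,m}\, \mathbf{1}_{\hat u \cdot \sigma \ge 0}\, |u| \, dv \, dv_* \, d\sigma = I_{1,1} + \cdots + I_{1,4}.$$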
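We just deal with the term $I_{1,2}$; the others may be handled in a similar (or even simpler) way. Making the change of variables $v_* \to v' = \phi^*_\alpha(v_*)$ (for some fixed $v$, $\sigma$) and using the elementary inequality $|u| \le 4\,|v' - v|$, valid when $\sigma\cdot\hat u \le 0$, there holds
\[
\begin{aligned}
I_{1,2} &= \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} G\, \Lambda(G')\, b_{R,m}\, \mathbf{1}_{\hat u\cdot\sigma \le 0}\, |u| \, dv\, dv_*\, d\sigma \\
&\le 2^{N+2} \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} G\, \Lambda(G')\, b_{R,m}\, \mathbf{1}_{\hat u\cdot\sigma \le 0}\, |v - v'| \, dv\, dv'\, d\sigma \\
&\le 2^{N+2}\, \|b_{R,m}\|_{L^1}\, \|G\|_{L^1_1} \int_{\mathbb{R}^N} \Lambda(G)\,(1+|v|)\, dv.
\end{aligned}
\]
Since the same estimates hold for all the terms $I_{1,k}$, we obtain
\[ I_1 \le \varepsilon(m)\, \|G\|_{L^1_1} \int_{\mathbb{R}^N} \Lambda(G)\, \langle v \rangle\, dv \quad \text{with } \varepsilon(m) \to 0 \text{ as } m \to \infty. \tag{2.10} \]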
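For $I_2$ we proceed along the line of the proof of the estimate for the term I in [24, Proof of Prop. 2.5]. Using again Young's inequality $x\,T(y) \le \Lambda(x) + \Lambda(y)$ and the trivial estimate $\Phi_{R,n} \le C\, n^{-1} (|v|^2 + |v_*|^2)$ we get
\[
\begin{aligned}
I_2 &= \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} G\, G_*\, T(G')\, b_{S,m}\, \Phi_{R,n} \, dv\, dv_*\, d\sigma \\
&\le \frac{C}{n} \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} G\, |v|^2\, [\Lambda(G_*) + \Lambda(G')]\, b_{S,m} \, dv\, dv_*\, d\sigma \\
&\quad + \frac{C}{n} \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} G_*\, |v_*|^2\, [\Lambda(G) + \Lambda(G')]\, b_{S,m} \, dv\, dv_*\, d\sigma = I_{2,1} + \dots + I_{2,4}.
\end{aligned}
\]
Because of the truncation on $b$ of frontal and grazing collisions, both changes of variables $v \to v' = \phi_\alpha(v)$ (for fixed $v_*$, $\sigma$) and $v_* \to v' = \phi^*_\alpha(v_*)$ (for fixed $v$, $\sigma$) are allowed (and the Jacobian of their inverse is bounded). Hence, in a similar way as for the term $I_1$, we obtain
\[ I_2 \le \frac{C(m)}{n}\, \|G\|_{L^1_2} \int_{\mathbb{R}^N} \Lambda(G)\, dv. \tag{2.11} \]
For $I_3$, using again Young's inequality, plus $T(G) \le G$ and the fact that both changes of variables $v \to v' = \phi_\alpha(v)$ (for fixed $v_*$, $\sigma$) and $v_* \to v' = \phi^*_\alpha(v_*)$ (for fixed $v$, $\sigma$) are allowed, we have
\[
\begin{aligned}
I_3 &= \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} \big(T(G') + T(G'_*)\big)\, \big[ G\, \chi(G_*) + \chi(G)\, T(G_*) \big]\, b_{S,m}\, \Phi_{S,n} \, dv\, dv_*\, d\sigma \\
&\le C(n) \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} \Big\{ \chi(G_*)\, \big[\Lambda(G') + \Lambda(G'_*) + 2\,\Lambda(G)\big] \\
&\qquad\qquad + \chi(G)\, \big[\Lambda(G') + \Lambda(G'_*) + 2\,\Lambda(G_*)\big] \Big\}\, b_{S,m} \, dv\, dv_*\, d\sigma.
\end{aligned}
\]
We deduce as before
\[ I_3 \le C_{m,n}\, \|\chi(G)\|_{L^1} \int_{\mathbb{R}^N} \Lambda(G)\, dv \tag{2.12} \]
for some constant $C_{m,n} > 0$.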
Finally for $I_4$, we argue as in the proof of [24, Prop. 2.6] for the treatment of the term involving $Q^+_S$, and we get for some $\theta \in (0,1)$,
\[ I_4 \le C_{m,n}\, \|T(G)\|_{L^1}^{1+2\theta}\, \|T(G)\|_{L^2}^{2(1-\theta)}, \tag{2.13} \]
for some constant $C_{m,n} > 0$. Gathering (2.9), (2.10), (2.11), (2.12), (2.13) and taking $m$, next $n$ and finally $A \ge A(G)$ large enough, we may control the terms $I_1$, $I_2$ and $I_3$ by half of the left-hand side of (2.9) (for $I_3$ we use that $\|\chi_A(G)\|_{L^1} \to 0$ when $A \to \infty$). Note that the condition $A \ge A(G)$ depends on the distribution $G$ (by means of some non-concentration bound), but shall play no role since we shall take the limit $A \to +\infty$ in the end. We obtain
\[ \forall\, A \ge A(G), \qquad \frac{c_1}{2} \int_{\mathbb{R}^N} \Lambda_A(G)\,(1+|v|)\, dv \le C_{b,\rho,\mathcal{E}(G)}\, \|T_A(G)\|_{L^2}^{2(1-\theta)} \]
for some constant $C_{b,\rho,\mathcal{E}(G)} > 0$ depending on the cross-section $b$ and on the profile $G$ via its energy. Using that $T_A(G)^2/2 \le \Lambda_A(G)$ we deduce
\[ \forall\, A \ge A(G), \qquad \frac{c_1}{4}\, \|T_A(G)\|_{L^2}^{2\theta} \le C_{b,\rho,\mathcal{E}(G)}, \]
and we then conclude that $G \in L^2$ by passing to the limit $A \to \infty$ in the preceding estimate, with the bound
\[ \|G\|_{L^2} \le \left( \frac{4\, C_{b,\rho,\mathcal{E}(G)}}{c_1} \right)^{\frac{1}{2\theta}}. \tag{2.14} \]
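The passage from the weighted estimate to (2.14) is a short piece of algebra; the following is a hedged sketch of it, not the authors' wording. The abbreviations $X$ and $C$ are introduced here only for illustration, and $\Lambda_A$ is assumed to denote the convex function integrated on the left-hand side of (2.9).

```latex
% Sketch under assumed abbreviations (not in the original):
%   X := \|T_A(G)\|_{L^2},   C := C_{b,\rho,\mathcal{E}(G)}
\begin{align*}
\frac{c_1}{2}\int_{\mathbb{R}^N} \Lambda_A(G)\,(1+|v|)\,dv
   &\le C\,X^{2(1-\theta)}, \\
\frac{c_1}{4}\,X^{2}
   \;\le\; \frac{c_1}{2}\int_{\mathbb{R}^N} \Lambda_A(G)\,dv
   &\le C\,X^{2(1-\theta)}
   && \text{since } T_A(G)^2/2 \le \Lambda_A(G), \\
\frac{c_1}{4}\,X^{2\theta} &\le C
   && \text{after dividing by } X^{2(1-\theta)} > 0, \\
X &\le \Big(\frac{4\,C}{c_1}\Big)^{1/(2\theta)} .
\end{align*}
```

Since the right-hand side does not depend on $A$, letting $A \to \infty$ (monotone convergence of $T_A(G)$ to $G$) transfers the bound to $\|G\|_{L^2}$.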
Remark 2.4. Note that the $L^2$ bound (2.14) only depends on the distribution $G$ by means of the energy $\mathcal{E}(G)$ and the constant $c_1$. Therefore, thanks to Lemma 2.3, this bound only depends on a lower bound on the energy $\mathcal{E}(g)$ and an upper bound on the third moment $m_{3/2}(g)$.
Step 3. Smoothness and positivity. Thanks to [24, Theorem 1.3] and [8, Theorem 1], taking $g_{in} = G$ as an initial condition in (1.16) we have that $G$ belongs to the Schwartz space of $C^\infty$ functions decreasing faster than any polynomial, and that $G \ge a_1 e^{-a_2 |v|}$ for some constants $a_1, a_2 > 0$. So far the estimates in Step 3 may not be uniform in the elasticity coefficient $\alpha \in [\alpha_0, 1)$ and in the profile $G_\alpha$. The aim of the following steps is to prove that they actually are uniform. Note however that the estimates of the previous steps ensure that the following computations are rigorously justified.
Step 4. Upper bound on the energy using the energy dissipation term. We prove that
\[ \forall\, \alpha \in (0,1] \qquad \mathcal{E} \le \frac{4}{b_1^2}\, \rho. \tag{2.15} \]
From Eq. (1.25) on the energy of the profile $G$ there holds
\[ (1+\alpha)\, b_1 \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} G\, G_*\, |u|^3 \, dv\, dv_* = 2\,\rho \int_{\mathbb{R}^N} G\, |v|^2\, dv. \tag{2.16} \]
From Jensen's inequality
\[ \int_{\mathbb{R}^N} |u|^3\, G_*\, dv_* \ge \rho\, |v|^3, \]
and Hölder's inequality
\[ \int_{\mathbb{R}^N} |v|^3\, G\, dv \ge \rho^{-1/2} \left( \int_{\mathbb{R}^N} |v|^2\, G\, dv \right)^{3/2}, \]
we get $(1+\alpha)\, b_1\, \rho^{1/2}\, \mathcal{E}^{3/2} \le 2\,\rho\, \mathcal{E}$, from which the bound (2.15) follows.
Step 5. Lower bound on the energy using the entropy. We prove
\[ \forall\, \alpha \in (0,1] \qquad \mathcal{E} \ge \frac{N^2\, \alpha^4}{8\, b_2^2}\, \rho \quad \text{with } b_2 := \|b\|_{L^1}. \tag{2.17} \]
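Remark 2.5. The choice of scaling we have made for the evolution equation in self-similar variables becomes clear from this computation: it is chosen such that the energy of the self-similar profile neither blows up nor vanishes as $\alpha \to 1$. The restriction $\alpha \in [\alpha_0, 1)$, $\alpha_0 > 0$, is then made in order to get a uniform estimate from below on the energy.
By integrating the equation satisfied by $G$ against $\log G$ we find
\[ \int_{\mathbb{R}^N} Q(G,G)\, \log G \, dv - \rho\,(1-\alpha) \int_{\mathbb{R}^N} \log G \; \nabla_v \cdot (v\, G)\, dv = 0. \]
Then we write the first term as in [17, Sect. 1.4] to find
\[
\begin{aligned}
&\frac12 \int_{\mathbb{R}^{2N}\times S^{N-1}} G\, G_* \left( \log \frac{G'\, G'_*}{G\, G_*} - \frac{G'\, G'_*}{G\, G_*} + 1 \right) B \, dv\, dv_*\, d\sigma \\
&\quad + \frac12 \int_{\mathbb{R}^{2N}\times S^{N-1}} \big( G'\, G'_* - G\, G_* \big)\, B \, dv\, dv_*\, d\sigma + \rho\,(1-\alpha) \int_{\mathbb{R}^N} v \cdot \nabla_v G \, dv = 0.
\end{aligned}
\]
If we denote
\[ D_{H,\alpha}(g) = \frac12 \int_{\mathbb{R}^{2N}\times S^{N-1}} g\, g_* \left( \frac{g'\, g'_*}{g\, g_*} - \log \frac{g'\, g'_*}{g\, g_*} - 1 \right) B \, dv\, dv_*\, d\sigma \ge 0, \tag{2.18} \]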
(recall that in this formula the post-collisional velocities v , v∗ are computed according to the inelastic formula (1.4) with normal restitution coefficient α ∈ (0, 1]), we can write 1 − D H,α (G) + − 1 b G G ∗ |u| dv dv∗ − (1 − α) N ρ 2 = 0, (2.19) 2 α2 R2N and thus we get α2 1 N α2 2 N ρ2 + D H,α (G) ≥ ρ . G G ∗ |u| dv dv∗ = 1+α 1−α 2 R2N
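On the other hand, from the Cauchy-Schwarz inequality,
\[ \int_{\mathbb{R}^{2N}} G\, G_*\, |u| \, dv\, dv_* \le \left( \int_{\mathbb{R}^{2N}} G\, G_* \, dv\, dv_* \right)^{1/2} \left( \int_{\mathbb{R}^{2N}} G\, G_*\, |u|^2 \, dv\, dv_* \right)^{1/2} = \sqrt{2}\, \rho^{3/2}\, \mathcal{E}^{1/2}, \]
and then the bound (2.17) follows by gathering the two preceding estimates.
Step 6. Upper bound on (exponential) moments using Povzner inequality. There exist $A, C > 0$ such that
\[ \forall\, \alpha \in [0,1), \qquad \int_{\mathbb{R}^N} G(v)\, e^{A|v|}\, dv \le C\, \rho. \]
We refer to [8] where that bound is obtained as an immediate consequence of the following sharp moment estimates: there exists $X > 0$ such that
\[ \forall\, \alpha \in [0,1), \qquad m_k = \int_{\mathbb{R}^N} G\, |v|^k\, dv \le \Gamma(k + 1/2)\, X^{k/2}\, \rho. \tag{2.20} \]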
It is worth noticing that in [8] the Povzner inequality used in order to get (2.20) is uniform in the normal restitution coefficient $\alpha \in [0,1]$ and that the factor $\rho$ comes from our choice of the scaling variables (in which $\rho$ is involved).
Step 7. Uniform upper bound on the $L^2$ norm. From (2.17), (2.20) and Remark 2.4, the $L^2$ bound (2.14) is uniform in $\alpha \in [\alpha_0, 1)$ and $G \in \mathcal{G}_\alpha$.
Step 8. Smoothness. It is enough to show some uniform bounds from above and below on the energy together with uniform non-concentration bounds on the self-similar profiles in $\mathcal{G}_\alpha$, in the form of upper bounds on the $L^2$ norm for instance. Indeed the proofs of [24, Prop. 3.1, Prop. 3.2, Prop. 3.4, Theorem 3.5 and Theorem 3.6] then apply straightforwardly (in these proofs we did not use the part associated with the anti-drift in the semigroup). Therefore the uniform bounds on the $H^k$ norms for all $k \ge 0$ follow from these results.
Step 9. Pointwise lower bound. It is a consequence of the following lemma.
Lemma 2.6. Let $g \in C([0,\infty); L^1_3)$ be a solution of the rescaled equation (1.31) with inelasticity parameter $\alpha \in (0,1)$ and assume that for some $p > 1$ and $C, T \in (0,\infty)$,
\[ \sup_{[0,T]} \|g\|_{L^p \cap L^1_3} \le C. \]
(i) For any $t_1 \in (0,T)$ there exists $a_1 \in (0,\infty)$ (depending on $C$, $\rho$ and $t_1$ but not on $T$) such that
\[ \forall\, t \in [t_1, T],\ \forall\, v \in \mathbb{R}^N, \qquad g(t,v) \ge a_1^{-1}\, e^{-a_1 |v|^8}. \tag{2.21} \]
(ii) If furthermore $g_{in}$ satisfies
\[ g_{in}(v) \ge a_0^{-1}\, e^{-a_0 |v|^8}, \]
then (2.21) holds with $t_1 = 0$ and some constant $a_1 \in (0,\infty)$ (depending on $C$, $\rho$, $a_0$ but not on $T$).
Proof of Lemma 2.6. We only prove (i), the proof of (ii) being similar. Let us fix $t_1 \in (0,1)$. We closely follow the proof of the Maxwellian lower bound for the solutions of the elastic Boltzmann equation (see [11,30]), taking advantage of some technical results established in its extension to the solutions of the inelastic Boltzmann equation (see [24, Theorem 4.9]). The starting point is again the evolution equation satisfied by $g$ written in the form
\[ \partial_t g + \tau_\alpha\, v \cdot \nabla_v g + (\tau_\alpha N + C + C\,|v|)\, g = Q^+_\alpha(g,g) + (C + C\,|v| - L(g))\, g, \]
where the last term on the right-hand side is non-negative for some well-chosen numerical constant $C \in (0,\infty)$ thanks to Lemma 2.3, (2.20) and (2.17). Let us introduce the semigroup $U_t$ associated to the operator $\tau_\alpha\, v \cdot \nabla_v + \lambda(v)$, where $\lambda(v) := \tau_\alpha N + C + C\,|v|$, whose action is given by
\[ (U_t h)(v) = h(v\, e^{-\tau_\alpha t})\, \exp\left( - \int_0^t \lambda(v\, e^{-s})\, ds \right). \]
Thanks to the Duhamel formula, we have
\[ \forall\, t > 0,\ \forall\, \tau \ge 0, \qquad g(t+\tau, \cdot) \ge \int_0^t U_{t-s}\, Q^+\big(g(s+\tau,\cdot),\, g(s+\tau,\cdot)\big)\, ds. \tag{2.22} \]
Noticing that
\[ - \int_0^t \lambda(v\, e^{-s})\, ds \ge -\big(C\,|v| + \tau_\alpha N\, t + C\, t\big), \]
and repeating the arguments of Steps 2 and 3 in the proof of [24, Theorem 4.9], we get that
\[ \forall\, t \ge \tau, \qquad g(t, \cdot) \ge \eta\, \mathbf{1}_{B(0,\delta)}(v), \tag{2.23} \]
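with $\tau = \tau_1 = t_1/2$ and some constants $\eta = \eta_1 > 0$, $\delta = \delta_1 > 1$. Let us emphasize that here we make use of Lemma 4.6, Lemma 4.7 and Lemma 4.8 in [24], where the constants are uniform in $\alpha \in [\alpha_0, 1)$ thanks to the uniform $L^p \cap L^1_3$ estimates assumed on $g$. Now, on the one hand, from [24, Lemma 4.8], there exists $\kappa \in (0,\infty)$ such that $Q^+_\alpha(\mathbf{1}_{B(0,1)}, \mathbf{1}_{B(0,1)}) \ge \kappa\, \mathbf{1}_{B(0,\sqrt{5}/2)}$, which in turn implies
\[ \forall\, \delta > 0, \qquad Q^+_\alpha(\mathbf{1}_{B(0,\delta)}, \mathbf{1}_{B(0,\delta)}) \ge \kappa\, \delta^{-N-1}\, \mathbf{1}_{B(0, \sqrt{5}\,\delta/2)}. \tag{2.24} \]
On the other hand, there exists $\kappa' \in (0,\infty)$ such that
\[ \forall\, \delta > 0,\ \forall\, s \in [0,1], \qquad U_s(\mathbf{1}_{B(0,\delta)}) \ge \kappa'\, e^{-C\,\delta}\, \mathbf{1}_{B(0,\delta)}. \tag{2.25} \]
From (2.23) with $\eta = \eta_1$, $\delta = \delta_1$, and making use of (2.22), (2.24), (2.25), we get that (2.23) holds with
\[ \tau = \tau_2 = \tau_1 + \frac{t_1}{2^2}, \qquad \delta = \delta_2 = \frac{\sqrt{5}}{2}\, \delta_1 \qquad \text{and} \qquad \eta = \eta_2 = (\tau_2 - \tau_1)\, \kappa''\, \eta_1^2\, e^{-C' \delta_1}, \]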
where κ = κ κ and C depends on C and N . Iterating the argument we get that (2.23) √ k+1 5/2 and holds with τ = τk = τk−1 + t1 2−k = (1 − 2−k ) t1 , δ = δk+1 = ηk+1 = (κ t1 )1+2+···+2
η12 e−C
(δ
2−[k+2 (k−1)+···+2 1] ≥ A2 , √ 8 √ √ with A := κ t1 η1 e−C δ1 /2. In other words, using that 25 > 2, we have proved k−1
k
k +2 δk−1 +···+ 2
k−1 δ
1)
k−1
k+1
k
∀ t ≥ t1 , ∀ k ∈ N, g(t, v) ≥ A2 1 B(0,2k/8 δ1 ) (v), from which we easily conclude.
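The numerical fact used in the last step of this iteration can be checked by hand. The sketch below assumes, as in the first iteration step, that each application of (2.24) multiplies the radius of the ball by $\sqrt{5}/2$.

```latex
\[
\Big(\frac{\sqrt5}{2}\Big)^{8} = \Big(\frac54\Big)^{4} = \frac{625}{256} \approx 2.44 > 2
\qquad\Longrightarrow\qquad
\Big(\frac{\sqrt5}{2}\Big)^{k} > 2^{k/8} \quad \text{for all } k \ge 1 .
\]
```

So after $k$ steps the ball on which the lower bound holds contains $B(0, 2^{k/8}\,\delta_1)$, while the constant degrades doubly exponentially in $k$; comparing $|v| \sim 2^{k/8}$ with a constant of order $A^{2^k}$ is what produces a lower bound of the form $e^{-a|v|^8}$.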
2.2. Estimates on the difference of two self-similar profiles. In this subsection we take advantage of the mixing effects of the collision operator in order to show that the $L^1$ norm of the difference of two self-similar profiles (corresponding to the same inelasticity coefficient) indeed controls the $H^k \cap L^1(m^{-1})$ norm of their difference for any $k \in \mathbb{N}$ and for some exponential weight function $m$, uniformly in terms of $\alpha \in [\alpha_0, 1)$.
Proposition 2.7. For any $k > 0$, there is $m = \exp(-a\,|v|)$, $a \in (0,\infty)$, and $C_k > 0$ such that for any $\alpha \in [\alpha_0, 1)$ and any $G_\alpha, H_\alpha \in \mathcal{G}_\alpha$ there holds
\[ \|H_\alpha - G_\alpha\|_{H^k \cap L^1(m^{-1})} \le C_k\, \|H_\alpha - G_\alpha\|_{L^1}. \tag{2.26} \]
Proof of Proposition 2.7. We proceed in three steps. It is worth mentioning that all the constants in the proof are uniform in terms of the normal restitution coefficient $\alpha \in [\alpha_0, 1)$, as they only depend on the uniform bounds of Proposition 2.1 and some uniform bounds on the collision kernel.
Step 1. Control of the $L^1$ moments. We prove first that there exist $A, C \in (0,\infty)$ such that
\[ \forall\, \alpha \in [\alpha_0, 1), \qquad \int_{\mathbb{R}^N} |H_\alpha - G_\alpha|\, e^{A|v|}\, dv \le C \int_{\mathbb{R}^N} |H_\alpha - G_\alpha|\, dv. \]
Let us consider some normal restitution coefficient $\alpha \in [\alpha_0, 1)$ and two self-similar profiles $G, H \in \mathcal{G}_\alpha$ (here again, we omit the subscript $\alpha$ when there is no confusion). We denote $D = G - H$, $S = G + H$ and $\varphi = |v|^{2p}\, \mathrm{sgn}(D)$, $p \in \frac12 \mathbb{N}$, $p \ge 3/2$, where $\mathrm{sgn}(D)$ denotes the sign of $D$. The equation for $D$ reads
\[ 0 = Q_\alpha(G,G) - Q_\alpha(H,H) - \rho\,(1-\alpha)\, \nabla_v \cdot (v\, D) = 2\, \tilde Q_\alpha(D,S) - \rho\,(1-\alpha)\, \nabla_v \cdot (v\, D). \tag{2.27} \]
Multiplying Eq. (2.27) by $\varphi$, we get
\[
\begin{aligned}
0 &= \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} B\, D\, S_*\, \big( \varphi'_* + \varphi' - \varphi_* - \varphi \big) \, dv\, dv_*\, d\sigma - \rho\,(1-\alpha) \int_{\mathbb{R}^N} \nabla_v \cdot (v\, D)\, |v|^{2p}\, \mathrm{sgn}(D)\, dv \\
&\le \int_{\mathbb{R}^N\times\mathbb{R}^N} |u|\, |D|\, S_*\, K_p \, dv\, dv_* + 2 \int_{\mathbb{R}^N\times\mathbb{R}^N} |u|\, |D|\, S_*\, |v_*|^{2p} \, dv\, dv_* + \rho \int_{\mathbb{R}^N} |D|\; v \cdot \nabla(|v|^{2p})\, dv
\end{aligned}
\]
with
\[ K_p(v, v_*) := \int_{S^{N-1}} \big( |v'|^{2p} + |v'_*|^{2p} - |v|^{2p} - |v_*|^{2p} \big)\, b(\sigma \cdot \hat u)\, d\sigma. \]
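From [8, Corollary 3, Lemma 2], there holds
\[ K_p(v, v_*) \le \gamma_p\, \Sigma_p - (1 - \gamma_p)\, \big( |v|^{2p} + |v_*|^{2p} \big), \]
where $(\gamma_p)_{p = 3/2, 2, \dots}$ is a decreasing sequence of real numbers such that
\[ 0 < \gamma_p < \min\left\{ 1, \frac{4}{p+1} \right\}, \tag{2.28} \]
and $\Sigma_p$ is defined by
\[ \Sigma_p := \sum_{k=1}^{k_p} \binom{p}{k} \big( |v|^{2k}\, |v_*|^{2p-2k} + |v|^{2p-2k}\, |v_*|^{2k} \big), \]
where $k_p := [(p+1)/2]$ is the integer part of $(p+1)/2$ and $\binom{p}{k}$ stands for the binomial coefficient. As a consequence,
\[
\begin{aligned}
(1 - \gamma_{3/2}) \int_{\mathbb{R}^N\times\mathbb{R}^N} |v|^{2p}\, |u|\, S_*\, |D| \, dv\, dv_* &\le \gamma_p \int_{\mathbb{R}^N\times\mathbb{R}^N} |u|\, |D|\, S_*\, \Sigma_p \, dv\, dv_* \\
&\quad + 2 \int_{\mathbb{R}^N\times\mathbb{R}^N} |u|\, |D|\, S_*\, |v_*|^{2p} \, dv\, dv_* + 2\,\rho\, p \int_{\mathbb{R}^N} |D|\, |v|^{2p}\, dv.
\end{aligned}
\]
Using Lemma 2.3 in order to estimate $L(S)$ from below, the inequality $|u| \le |v| + |v_*|$ and introducing the notations
\[ d_k := \int_{\mathbb{R}^N} |D|\, |v|^{2k}\, dv, \qquad s_k := \int_{\mathbb{R}^N} S\, |v|^{2k}\, dv, \]
we get, for some numerical constant $C \in (0,\infty)$,
\[ \frac{\rho}{C}\, d_{p+1/2} \le \gamma_p\, S_p + d_0\, s_{p+1/2} + d_{1/2}\, s_p + 2\,\rho\, p\, d_p, \tag{2.29} \]
with
\[ S_p := \sum_{k=1}^{k_p} \binom{p}{k} \big( d_{k+1/2}\, s_{p-k} + d_k\, s_{p-k+1/2} + d_{p-k+1/2}\, s_k + d_{p-k}\, s_{k+1/2} \big). \]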
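From Proposition 2.1, or more precisely (2.20), we know that $s_k \le \rho\, \Gamma(k + 1/2)\, x^k$ for any $k \ge 1$ and for some $x \in (1,\infty)$. By Hölder's inequality, we also have
\[ d_p^{\,1 + \frac{1}{2p}} \le d_{p+\frac12}\; d_0^{\frac{1}{2p}}. \]
Repeating the proof of [8, Lemma 4], for any $a \ge 1$, there exists $A > 0$ such that
\[ S_p \le A\, \rho\, (d_0 + d_{1/2})\, \Gamma(a p + a/2 + 1)\, Z_p \]
with
\[ Z_p := \max_{k = 1, \dots, k_p} \big\{ \delta_{k+1/2}\, \sigma_{p-k},\ \delta_k\, \sigma_{p-k+1/2},\ \delta_{p-k+1/2}\, \sigma_k,\ \delta_{p-k}\, \sigma_{k+1/2} \big\}, \]
and
\[ \delta_k := \frac{d_k}{(d_0 + d_{1/2})\, \Gamma(a k + 1/2)}, \qquad \sigma_k := \frac{s_k}{\rho\, \Gamma(a k + 1/2)}. \]
We may then rewrite (2.29) as
\[ \Gamma(a p + 1/2)^{1/2p}\; \delta_p^{\,1 + 1/2p} \le A\, \gamma_p\, \frac{\Gamma(a p + a/2 + 1)}{\Gamma(a p + 1/2)}\, Z_p + (\sigma_{p+1/2} + \sigma_p) + 2\,\rho\, p\, \delta_p. \]
On the one hand, from (2.28), there exists $A'$ such that
\[ A\, \gamma_p\, \frac{\Gamma(a p + a/2 + 1)}{\Gamma(a p + 1/2)} \le A'\, p^{a/2 - 1/2} \qquad \forall\, p = 3/2, 2, \dots. \]
On the other hand, thanks to Stirling's formula $n! \sim n^n e^{-n} \sqrt{2\pi n}$ when $n \to \infty$ and the estimate (2.28), there exists $A'' > 0$ such that
\[ (1 - \gamma_p)\, \Gamma(a p + 1/2)^{1/2p} \ge A''\, p^{a/2} \qquad \forall\, p = 3/2, 2, \dots. \]
Therefore,
\[ p^{a/2}\; \delta_p^{\,1 + 1/2p} \le p^{a/2 - 1/2}\, Z_p + (\sigma_{p+1/2} + \delta_1\, \sigma_p) + 2\,\rho\, p\, \delta_p. \]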
We finally obtain $d_k \le x^k\, \Gamma(a k + 1/2)\, (d_0 + d_{1/2})$, and we easily conclude as in [8, Proof of Theorem 1] or in [23, Proof of Prop. 3.2, Step 2].
Step 2. Control of the $L^2$ norms. For $k = 0$, the propagation of the $L^2$ norm is immediate using the result [24, Cor. 2.3]. Indeed one just has to split the collision kernel as in [24, Sect. 2.4]. For the truncated and regularized part $Q^+_S$ (we use the notation introduced in Step 2 in the proof of Prop. 2.1), [24, Cor. 2.3] together with some basic interpolation yields the following control:
\[ \int_{\mathbb{R}^N} \big( Q^+_S(S,D) + Q^+_S(D,S) \big)\, D\, dv \le C\, \rho^{1+2\theta}\, \|D\|_{L^2}^{2-2\theta} \]
for some explicit $C > 0$ and $\theta \in (0,1)$. For the remaining term $Q^+_R$, we use the same control as in [24, Proof of Prop. 2.5] to get
\[ \int_{\mathbb{R}^N} \big( Q^+_R(S,D) + Q^+_R(D,S) \big)\, D\, dv \le \varepsilon\, \big( \|D\|_{L^1_2} + \|D\|_{L^2_{1/2}} \big)\, \|D\|_{L^2_{1/2}} \]
for some $\varepsilon$ which can be taken as small as wanted by the truncation. Gathering these estimates, we get
\[ \forall\, \varepsilon > 0, \qquad \int_{\mathbb{R}^N} \tilde Q^+(S,D)\, D\, dv \le \varepsilon\, \|D\|_{L^2_{1/2}}^2 + C_\varepsilon, \]
where $C_\varepsilon$ depends on weighted $L^1$ and $L^2$ norms of $S$, on $L^1$ norms of $D$ and on $\varepsilon$. Using Eq. (2.30) with $i = 0$, Lemma 2.3 to treat the term $L(D)$, and some elementary interpolation, we deduce that
\[ \|D\|_{L^2_{1/2}} \le C\, \|D\|_{L^1_2} \]
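for some constant $C > 0$, which concludes the proof for $k = 0$ using the previous step on the $L^1$ moments.
Step 3. Control of the $H^k$ norms. From the previous step and some interpolation, in order to conclude it is enough to prove (2.26) for any $k \in \mathbb{N}$ and $m \equiv 1$. We proceed by induction on $k$. For any $i \in \mathbb{N}^N$, the equation satisfied by $\partial^i D$ is
\[
\begin{aligned}
&\partial^i Q^+(S,D) + \partial^i Q^+(D,S) - \partial^i (L(D)\, S) - L(S)\, \partial^i D \\
&\qquad - \sum_{0 < i' \le i} \binom{i}{i'}\, \partial^{i'} L(S)\; \partial^{i-i'} D - \rho\,(1-\alpha)\, \partial^i\, \nabla \cdot (v\, D) = 0.
\end{aligned}
\]
We deduce that
\[
\begin{aligned}
C \int_{\mathbb{R}^N} (\partial^i D)^2\, (1+|v|)\, dv &\le \int_{\mathbb{R}^N} \big[ \partial^i Q^+(S,D) + \partial^i Q^+(D,S) \big]\, \partial^i D\, dv \\
&\quad - \sum_{0 \le i' \le i} \binom{i}{i'} \int_{\mathbb{R}^N} \partial^{i'} L(D)\; \partial^{i-i'} S\; \partial^i D\, dv \\
&\quad - \sum_{0 < i' \le i} \binom{i}{i'} \int_{\mathbb{R}^N} \partial^{i'} L(S)\; \partial^{i-i'} D\; \partial^i D\, dv,
\end{aligned}
\tag{2.30}
\]
dropping the non-positive term. The induction is initialized by Step 2. Let us assume the induction step $k \ge 0$ to be proved, and let us consider some $i \in \mathbb{N}^N$ such that $|i| = k + 1$. Using Eq. (2.30) and [24, Theorem 2.5] to estimate the gain term, we easily find
\[ \|\partial^i D\|_{L^2} \le C\, \big( \|D\|_{L^1_q} + \|D\|_{H^{k + (3-N)/2}_q} \big) \]
for some $q > 0$. Therefore we obtain by interpolation (since $(3-N)/2 < 1$ for $N \ge 2$), for another $q$ possibly larger:
\[ \|D\|_{H^{k+1}} \le C\, \big( \|D\|_{L^1_q} + \|D\|_{H^k} \big). \]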
This concludes the proof, using interpolation, the induction hypothesis k, and Step 1 on the L 1 moments. 3. The Elastic Limit α → 1 3.1. Dependency of the collision operator according to the inelasticity. In this subsection we show that the collision operator continuously depends on the inelasticity coefficient α ∈ [0, 1]. Since it is an unbounded operator, this continuous dependency is expressed in the norm of the graph of the operator or in some weaker norm. We first show that this dependency of the collision operator is Lipschitz, and even C 1,η for any
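$\eta \in (0,1)$, at the expense of a loss in the norm (in terms of derivatives and weight). Let us define the formal derivative of the collision operator according to $\alpha$ by
\[ Q'_\alpha(g,f) := \nabla_v \cdot \left( \int_{\mathbb{R}^N} \int_{S^{N-1}} g({}'v_*(\alpha))\, f({}'v(\alpha))\; \frac{b\, |u|}{4\,\alpha^2}\, \big( u - |u|\,\sigma \big)\, d\sigma\, dv_* \right) \]
or by duality
\[ \langle Q'_\alpha(g,f), \psi \rangle := \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} \int_{S^{N-1}} g_*\, f\; b\, |u|\; \frac{|u|\,\sigma - u}{4} \cdot \nabla\psi(v'_\alpha)\, d\sigma\, dv_*\, dv. \]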
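Proposition 3.1. Let us fix a smooth exponential weight function $m$ with exponent $s \in (0,1)$ (defined in (1.30)). Then
(i) For any $k, q \in \mathbb{N}$ there exists $C = C_{k,m} \in (0,\infty)$ such that for any smooth functions $f, g$ (say in $\mathcal{S}(\mathbb{R}^N)$) and any $\alpha \in [0,1]$ there holds
\[ \|Q^\pm_\alpha(g,f)\|_{W^{k,1}_q(m^{-1})} \le C_{k,m}\, \|f\|_{W^{k,1}_{q+1}(m^{-1})}\, \|g\|_{W^{k,1}_{q+1}(m^{-1})}, \tag{3.1} \]
\[ \|Q'_\alpha(g,f)\|_{W^{k,1}_q(m^{-1})} \le C_{k,m}\, \|f\|_{W^{k+1,1}_{q+2}(m^{-1})}\, \|g\|_{W^{k+1,1}_{q+2}(m^{-1})}. \tag{3.2} \]
(ii) Moreover, for any smooth functions $f, g$ and for any $\alpha, \alpha' \in [0,1]$, there holds
\[ \big\| Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) - (\alpha - \alpha')\, Q'_\alpha(g,f) \big\|_{W^{-2,1}_q(m^{-1})} \le |\alpha - \alpha'|^2\, \|f\|_{L^1_{q+3}(m^{-1})}\, \|g\|_{L^1_{q+3}(m^{-1})}. \tag{3.3} \]
(iii) As a consequence, there holds
\[ \big\| Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) \big\|_{W^{k,1}_q(m^{-1})} \le C\, |\alpha - \alpha'|\, \|f\|_{W^{2k+3,1}_{q+3}(m^{-1})}\, \|g\|_{W^{2k+3,1}_{q+3}(m^{-1})}, \tag{3.4} \]
and for any $\eta \in (1,2)$, there exist $k_\eta \in \mathbb{N}$, $q_\eta \in \mathbb{N}$ and $C_\eta \in (0,\infty)$ such that
\[ \big\| Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) - (\alpha - \alpha')\, Q'_\alpha(g,f) \big\|_{L^1(m^{-1})} \le C_\eta\, |\alpha - \alpha'|^\eta\, \|f\|_{W^{k_\eta,1}_{q_\eta}(m^{-1})}\, \|g\|_{W^{k_\eta,1}_{q_\eta}(m^{-1})}. \tag{3.5} \]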
Proof of Proposition 3.1. First, by classical convolution-like estimates (see for instance [28] in the elastic case, and [17] in the inelastic case, as well as the proof of Proposition 3.2 below) we easily have (3.1) and (3.2). Next, in order to prove (3.3) we proceed by duality. Let us consider $\varphi \in \mathcal{S}(\mathbb{R}^N)$ and define $\psi := \varphi\, \langle v \rangle^q\, m^{-1}$. We compute
\[
\begin{aligned}
I &:= \int_{\mathbb{R}^N} \big[ Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) - (\alpha - \alpha')\, Q'_\alpha(g,f) \big]\, \psi(v)\, dv \\
&= \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} |u|\, b\, g_*\, f \left[ \psi(v'_\alpha) - \psi(v'_{\alpha'}) - (\alpha - \alpha')\, \frac{|u|\,\sigma - u}{4} \cdot \nabla\psi(v'_\alpha) \right] dv\, dv_*\, d\sigma.
\end{aligned}
\]
Hence, if one denotes $\xi_{v,v_*,\sigma}(\alpha) := \psi(v'_\alpha)$ (for given fixed values of $v$, $v_*$, $\sigma$), we obtain (omitting the subscripts for clarity)
\[
\begin{aligned}
|I| &= \left| \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} |u|\, b\, g_*\, f\, \big[ \xi(\alpha) - \xi(\alpha') - (\alpha - \alpha')\, \xi'(\alpha) \big]\, dv\, dv_*\, d\sigma \right| \\
&\le (\alpha - \alpha')^2 \int_{\mathbb{R}^N\times\mathbb{R}^N\times S^{N-1}} |u|\, b\, |g_*|\, |f| \sup_{\alpha \in (0,1)} |\xi''(\alpha)|\, dv\, dv_*\, d\sigma.
\end{aligned}
\]
We then easily conclude that (3.3) holds using that $\langle v' \rangle^q\, (m')^{-1} \le C\, \langle v \rangle^q\, m^{-1}\, \langle v_* \rangle^q\, (m_*)^{-1}$ for some constant $C \in (0,\infty)$. Last, we prove (3.4) by using the following interpolation on $J = Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) - (\alpha - \alpha')\, Q'_\alpha(g,f)$:
\[ \|J\|_{W^{k,1}_q(m^{-1})} \le \|J\|^{1/2}_{W^{-2,1}_q(m^{-1})}\; \|J\|^{1/2}_{W^{2(k+1),1}_q(m^{-1})}, \]
and using (3.3) on the first term on the right-hand side, and (3.1), (3.2) on the second term on the right-hand side. It yields
\[ \big\| Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) - (\alpha - \alpha')\, Q'_\alpha(g,f) \big\|_{W^{k,1}_q(m^{-1})} \le C\, |\alpha - \alpha'|\, \|f\|_{W^{2k+3,1}_{q+3}(m^{-1})}\, \|g\|_{W^{2k+3,1}_{q+3}(m^{-1})}, \]
and (3.4) follows by using (3.2) again. Then the proof of (3.5) is done in the same way using suitable interpolation.
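The mechanism behind (3.3) and (3.4) is a second-order Taylor expansion in $\alpha$ followed by an interpolation. The sketch below is hedged: it assumes that the post-collisional velocity of formula (1.4) has the standard form written on the first line, and the remainder constant $1/2$ is the usual Taylor constant rather than something stated in the text.

```latex
% Assumed parametrization of the inelastic post-collisional velocity (formula (1.4)):
\begin{align*}
v'_\alpha &= \frac{v+v_*}{2} + \frac{1-\alpha}{4}\,u + \frac{1+\alpha}{4}\,|u|\,\sigma,
  \qquad \frac{d v'_\alpha}{d\alpha} = \frac{|u|\,\sigma - u}{4}, \\
\xi(\alpha) &:= \psi(v'_\alpha), \qquad
  \xi'(\alpha) = \frac{|u|\,\sigma - u}{4}\cdot\nabla\psi(v'_\alpha), \\
\big|\xi(\alpha) - \xi(\alpha') - (\alpha-\alpha')\,\xi'(\alpha)\big|
  &\le \frac{(\alpha-\alpha')^{2}}{2}\,\sup_{[0,1]}|\xi''| , \\
\|J\|_{W^{k,1}_q} &\le \|J\|_{W^{-2,1}_q}^{1/2}\,\|J\|_{W^{2(k+1),1}_q}^{1/2}
  \;\lesssim\; \big(|\alpha-\alpha'|^{2}\big)^{1/2} = |\alpha-\alpha'| .
\end{align*}
```

The interpolation exponents are $1/2$ because $k$ is the midpoint of $-2$ and $2k+2$; this is why a quadratic gain in a weak norm becomes a linear (Lipschitz) gain in the $W^{k,1}_q$ norm.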
We next state a mere (Hölder) continuity dependency on $\alpha$, which is however stronger than Proposition 3.1 in some sense, since it is written in the norm of the graph of the operator for one of the arguments.
Proposition 3.2. For any $\alpha, \alpha' \in (0,1]$, and any $g \in L^1_1(m^{-1})$, $f \in W^{1,1}_1(m^{-1})$, there holds
\[
\begin{cases}
\big\| Q^+_\alpha(g,f) - Q^+_{\alpha'}(g,f) \big\|_{L^1(m^{-1})} \le \varepsilon(\alpha - \alpha')\, \|f\|_{W^{1,1}_1(m^{-1})}\, \|g\|_{L^1_1(m^{-1})}, \\[4pt]
\big\| Q^+_\alpha(f,g) - Q^+_{\alpha'}(f,g) \big\|_{L^1(m^{-1})} \le \varepsilon(\alpha - \alpha')\, \|f\|_{W^{1,1}_1(m^{-1})}\, \|g\|_{L^1_1(m^{-1})},
\end{cases}
\tag{3.6}
\]
where ε(r ) = C r 3+4/s for some constant C (depending only on b). Proof of Proposition 3.2. For any given v, v∗ ∈ R N , w = v + v∗ = 0 and σ ∈ S N −1 we define χ ∈ [0, π/2], cos χ := |σ · w|. ˆ Let us fix δ ∈ (0, 1), R ∈ (1, ∞) and let us define θδ ∈ W 1,∞ (−1, 1) such that θδ (s) = 1 on (−1 + 2δ, 1 − 2δ), θδ (s) = 0 on (−1+δ, 1−δ)c , 0 ≤ θδ ≤ 1, |θδ (s)| ≤ 3/δ, R (u) = (|u|/R) with (x) = 1 on [0, 1], (x) = 1−x for x ∈ [1, 2] and (x) = 0 on [2, ∞), A(δ) := {σ ∈ S N −1 ; sin2 χ ≥ δ}, B(δ) := {σ ∈ S N −1 ; cos θ ∈ (−1 + 2δ, 1 − 2δ)c or sin2 χ ≤ δ}. We then split Q + in three terms, namely +,v +,r Q +α = Q +,a α + Qα + Qα , r where Q +,r ˆ R (u), where Q +,v α is defined by (1.3) with b replaced by b := b θδ (σ · u) α is defined by (1.3) with b replaced by bv := b 1 A(δ) (1 − R (u)), and where Q +,a α is
defined by (1.3) with b replaced by ba := b (1−θδ (σ · u)) ˆ R (u)+b (1− R (u)) 1 Ac (δ) . We split the proof into three steps. Step 1. Treatment of small angles. There exists a constant C ∈ (0, ∞) such that for any α ∈ (0, 1] and δ ∈ (0, 1) there holds +,a Q (ψ, ϕ) 1 −1 ≤ C δ ψ 1 −1 ϕ 1 −1 . α L (m ) L (m ) L (m ) 1
1
Indeed let us consider some ∈ L ∞ and let us proceed by duality. We estimate +,a −1 Q α (ψ, ϕ) (v) m (v) dv = |u| ba ψ∗ ϕ (m )−1 dv dv∗ dσ RN
≤
R N ×R N ×S N −1 ba L ∞ 1 N −1 )) L ∞ v,v∗ (L (S
ψ L 1 (m −1 ) ϕ L 1 (m −1 ) , 1
1
and we conclude using that ba L ∞ 1 N −1 )) ≤ C (δ + maxv,v∗ |B(δ)|é1) ≤ C δ. v,v∗ (L (S Step 2. Treatment of large relative velocities. There exists a constant C = Ca,s,b ∈ (0, ∞) such that for any α ∈ (0, 1] and δ ∈ (0, 1) there holds Q +,v α (ψ, ϕ) L 1 (m −1 ) ≤
C ψ L 1 (m −1 ) ϕ L 1 (m −1 ) . 1 1 R δ 2/s
(3.7)
We need the following lemma, which we state below and prove at the end of the subsection. Lemma 3.3. For any δ > 0 and α ∈ (0, 1), there holds σ ∈ S N −1 , sin2 χ ≥ δ implies m −1 (v ) ≤ m −k (v) m −k (v∗ ),
(3.8)
with k = (1 − δ/160)s/2 . In order to prove (3.7) we fix ∈ L ∞ and we argue by duality again. We estimate thanks to Lemma 3.3, +,v −1 Q α (ψ, ϕ) (v) m (v) dv = |u| bv ψ∗ ϕ (m )−1 dv dv∗ dσ RN R N ×R N ×S N −1 1 |u|2 bv ψ∗ ϕ (m)−k (m ∗ )−k ≤ R R N ×R N ×S N −1 ×dv dv∗ dσ 1 ≤ L ∞ ψ L 1 (m −k ) ϕ L 1 (m −k ) 2 2 R 1 ≤ L ∞ |.| m 1−k (.)2L ∞ ψ L 1 (m −1 ) ϕ L 1 (m −1 ) , 1 1 R from which we easily conclude since x → x m 1−k (x) is uniformly bounded by Ca,s (1 − k)−1/s , Ca,s ∈ (0, ∞). Step 3. The truncated operator. Let us prove that there exists a constant C ∈ (0, ∞) such that for any δ ∈ (0, 1), α, α ∈ (0, 1] and R ∈ (1, ∞) there holds +,r Q +,r α (g, f ) − Q α (g, f ) L 1 (m −1 ) ≤ C |α − α |
R2 R + δ δ3
g L 1 (m −1 ) f W 1,1 (m −1 ) .
We closely follow the proof of [23, Prop. 4.3]. We consider some ∈ L ∞ , f, g ∈ D(R N ), we proceed by duality and next conclude thanks to a density argument. We have +,r −1 I := [Q +,r dv α (g, f ) − Q α (g, f )] m RN = |u| R (u) bδ g∗ f (vα ) m −1 (vα ) − (vα ) m −1 (vα ) dv dv∗ dσ. R N ×R N ×S N −1
With the notations of Lemma 2.2 we perform the changes of variables v → vα = φα (v) and v → vα = φα (v) (for fixed v∗ and σ ) with jacobians Jα and Jα . Observing that without restriction we may assume α ≤ α and therefore Oα = v∗ + ωα (δ) ⊂ Oα = v∗ + ωα (δ) , since s → ωs (0) is an increasing function, we get I = g∗ (m −1 ) F(φα−1 ) Jα−1 dv dv∗ dσ R N ×S N −1 Oα \Oα
+ +
R N ×S N −1 Oα
R N ×S N −1 Oα
dv dv∗ dσ g∗ (m −1 ) F(φα−1 ) Jα−1 − Jα−1 g∗ (m −1 ) F(φα−1 ) − F(φα−1 ) Jα−1 dv dv∗ dσ
= I1 + I2 + I3 ,
− v∗ ). For the first term I1 we use with F(w) := |w − v∗ | R (w − v∗ ) f (w) bδ (σ · w the backward change of variables v → v = φα−1 (v ) (for fixed v∗ and σ ) and we get |u| R (u) f g∗ (vα ) m −1 (vα ) bδ 10≤u·σ I1 = ˆ ≤η dv∗ dv dσ R N ×S N −1 R N
with η := ωα−1 ◦ ωα (δ) ≤ C δ −3/2 |α − α | for some constant C ∈ (0, ∞). Since v → |v|s/2 is an increasing subadditive function, we also have |vα |s ≤ (|v|2 + |v∗ |2 )s/2 ≤ |v|s +|v∗ |s , which implies m(vα ) ≤ C m −1 m −1 ∗ for some constant C ∈ (0, ∞) (depending of ζ ). As a consequence, we obtain |I1 | ≤ C R δ −3/2 |α − α | b L ∞ L ∞ f L 1 (m −1 ) g L 1 (m −1 ) . For the term I2 , using the backward change of variable v → v = φα−1 (v ) (for some −1 −1 fixed v∗ and σ ) and using the bounds (2.4) on Jα and |Jα − Jα |, we obtain
|I1 | ≤ C R δ −3 |α − α | b L ∞ L ∞ f L 1 (m −1 ) g L 1 (m −1 ) . In order to estimate I3 , we introduce αt := (1 − t) α + t α and, thanks to (2.3)-(2.2), we get 1 ! ! C ! ! |I3 | ≤ |α−α | |g∗ | | | (m −1 ) |v −v|!∇w F(φα−1 (v ))!dv dv∗ dσ dt. t N N −1 δ Oαt 0 R ×S Using finally the backward change of variable v → v = φα−1 (v ) and the uniform bound t (2.4) on Jαt , t ∈ [0, 1], on v∗ + δ , we get 2 R R + 2 |α − α | bW 1,∞ L ∞ g L 1 (m −1 ) f W 1,1 (m −1 ) . |I3 | ≤ C δ δ
Gathering the estimates established in Steps 1, 2 and 3, we deduce the first inequality in (3.6). The second inequality in (3.6) is proved in a similar way (using symmetric changes of variable, allowed by the truncation). Proof of Lemma 3.3. We proceed √ in three steps.√ Step 1. Assume first that (2/ 5) |v∗ | ≤ |v| ≤ ( 5/2) |v∗ |. Using the fact that x → x s/2 is an increasing and subadditive function, there holds |v |s ≤ (|v|2 + |v∗ |2 )s/2 ≤ (9/4)s/2 |v∗ |s , and then by symmetry and because s ≤ 1, |v |s ≤
1 3 (9/4)s/2 (|v|s + |v∗ |s ) ≤ (|v|s + |v∗ |s ). 2 4
In that case, (3.8) holds with k = 3/4. Step 2. We shall first show that for any v, v∗ ∈ R N and σ ∈ S N −1 , there holds |v |2 , |v∗ |2 ≤ |v|2 + |v∗ |2 −
1+α sin2 χ |v + v∗ |2 . 8
(3.9)
We recall the formula
1+α 1+α v + v∗ 1 1 − α v + v∗ 1 1−α + u+ |u| σ , v∗ := + u− |u| σ . v := 2 2 2 2 2 2 2 2 Straightforward computations yield (denoting S = v + v∗ )
1+α |S|2 1 1 + α 2 2 1 − α 2 2 1−α 2 + |u| + |u| cos θ + (S · u) + |S| |u| cos χ . |v | ≤ 4 4 2 2 4 4 We deduce the bound from above |v |2 ≤
1+α |S|2 |u|2 1 − α + + |S| |u| + |S| |u| cos χ . 4 4 4 4
Then by applying twice Young’s inequality 1 1−α 1 1−α 1+α + + |u|2 + + |S| |u| cos χ , |v |2 ≤ |S|2 4 8 4 8 4 1 1−α 1 1−α 1+α 1+α 2 + |u|2 + ≤ |S|2 + + + |S| cos2 χ , 4 8 4 8 8 8 ≤
|S|2 |u|2 1 + α 2 + + |S| (cos2 χ − 1), 2 2 8
from which we deduce (3.9). √ √ Step 3. Assume that sin2 χ ≥ δ and that either (2/ 5) |v∗ | ≥ |v| or |v| ≥ ( 5/2) |v∗ |. In the first case, we have √ √ √ |v + v∗ | ≥ 1 − (2/ 5) |v∗ | + (2/ 5) |v∗ | − |v| ≥ 1 − (2/ 5) |v∗ |, which then implies
√ √ √ |v + v∗ | ≥ 1 − (2/ 5) ( 5/2) |v| ≥ 1 − (2/ 5) |v|.
The same inequalities are proved in a similar way in the second case. We deduce √ 1 1 − (2/ 5) (|v|2 + |v∗ |2 ). |v + v∗ |2 ≥ 2 We then deduce from (3.9) that |v |2 ≤ (1 − δ/160) (|v|2 + |v∗ |2 ) and we conclude that (3.8) holds as in Step 1. 3.2. Quantification of the elastic limit α → 1. We begin with a simple consequence of Proposition 3.1. Corollary 3.4. There exists k0 , q0 ∈ N such that for any ai ∈ (0, ∞) i = 1, 2, 3, there exists an explicit constant C ∈ (0, ∞) such that for any function g satisfying g H k0 ∩L 1 ≤ a1 , g ≥ a2 e−a3 |v| , 8
q0
there holds
! ! ! D H,α (g) − D H,1 (g)! ≤ C (1 − α),
where we recall that D H,α is defined in (2.18). Proof of Corollary 3.4. We write D H,α (g) − D H,1 (g) = b |u| gα g∗α − g g∗ dv dv∗ dσ (=: I1 ) + b |u| g g∗ log gα + log g∗α − log g − log g∗ dv dv∗ dσ (=: I2 ). For the first term, thanks to Proposition 3.1, we have |I1 | ≤ Q +α (g, g) − Q +1 (g, g) L 1 ≤ C (1 − α) g2
W33,1
.
For the second term, we write !" #!! ! |I2 | = 2 ! (Q +α (g, g) − Q +1 (g, g)) v8 , v−8 log g ! ≤ 2 Q +α (g, g) − Q +1 (g, g) L 1 v−8 log g L ∞ 8
≤ C (1 − α) g2
3,1 W11
≤C
(1 − α) a12
(| log g L ∞ | + | log a2 | + a3 )
(| log a1 | + | log a2 | + C a3 ) ,
3,1 for thanks to Proposition 3.1 and the bounded embedding H k0 ∩ L q10 ⊂ L ∞ ∩ W11 k0 , q0 large enough (see Proposition B.1). We conclude the proof gathering these two estimates.
Let us now recall the Csiszár-Kullback-Pinsker inequality (see [14,22]) and a “entropy-entropy production inequality” (the version we present here is established in [31]) that we will use several times in the sequel.
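Theorem 3.5. (i) For a given function $g \in L^1_2$, let us denote by $M[g]$ the Maxwellian function with the same mass, momentum and temperature as $g$. For any $0 \le g \in L^1_2(\mathbb{R}^N)$, there holds
\[ \|g - M[g]\|_{L^1}^2 \le 2\, \rho(g) \int_{\mathbb{R}^N} g\, \ln \frac{g}{M[g]}\, dv. \tag{3.10} \]
(ii) For any $\varepsilon > 0$ there exist $k_\varepsilon, q_\varepsilon \in \mathbb{N}$ and for any $A \in (0,\infty)$ there exists $C_\varepsilon = C_{\varepsilon,A} \in (0,\infty)$ such that for any $g \in H^{k_\varepsilon} \cap L^1_{q_\varepsilon}$ such that
\[ g(v) \ge A^{-1}\, e^{-A |v|^8}, \qquad \|g\|_{H^{k_\varepsilon} \cap L^1_{q_\varepsilon}} \le A, \]
there holds
\[ C_\varepsilon\, \rho(g)^{1-\varepsilon} \left( \int_{\mathbb{R}^N} g\, \ln \frac{g}{M[g]}\, dv \right)^{1+\varepsilon} \le D_{H,1}(g). \tag{3.11} \]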
We have then the following estimate on the distance between $G_\alpha$ and $\bar G_1$ for any self-similar profile $G_\alpha$.
Proposition 3.6. For any $\varepsilon > 0$ there exists $C_\varepsilon$ (independent of the mass $\rho$) such that
\[ \forall\, \alpha \in [\alpha_0, 1) \qquad \sup_{G_\alpha \in \mathcal{G}_\alpha} \|G_\alpha - \bar G_1\|_{L^1_2} \le C_\varepsilon\, \rho\, (1-\alpha)^{\frac{1}{2+\varepsilon}}, \tag{3.12} \]
where we recall that $\bar G_1$ is the Maxwellian function defined by (1.27)–(1.29).
Proof of Proposition 3.6. On the one hand, for any inelasticity coefficient $\alpha \in [\alpha_0, 1)$ and profile $G_\alpha$, there holds from (2.19) together with Corollary 3.4 and the uniform estimates of Proposition 2.1,
\[ D_{H,1}(G_\alpha) \le D_{H,\alpha}(G_\alpha) + \rho^2\, O(1-\alpha) \le \rho^2\, O(1-\alpha). \tag{3.13} \]
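On the other hand, introducing the Maxwellian function $M_\theta$ with the same mass, momentum and temperature as $G_\alpha$, that is $M_\theta$ given by (1.28) with $u = 0$ and $\theta = \mathcal{E}(G_\alpha)/\rho$, and gathering (3.13), (3.11), (3.10) with the uniform estimates of Proposition 2.1 and an interpolation inequality, we obtain that for any $q, \varepsilon > 0$ there exists $C_{q,\varepsilon}$ such that
\[ \forall\, \alpha \in [\alpha_0, 1) \qquad \|G_\alpha - M_\theta\|_{L^1_q}^{2+\varepsilon} \le C_{q,\varepsilon}\, \rho^{2+\varepsilon}\, (1-\alpha). \tag{3.14} \]
Next, from (2.16), we have
\[ b_1 \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} G_\alpha\, G_{\alpha *}\, |u|^3 \, dv\, dv_* - \rho \int_{\mathbb{R}^N} G_\alpha\, |v|^2\, dv = (1-\alpha)\, \frac{b_1}{2} \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} G_\alpha\, G_{\alpha *}\, |u|^3 \, dv\, dv_*, \]
and then
\[ |\Psi(\theta)| \le C_1\, \|G_\alpha - M_\theta\|_{L^1_3} + C_2\, \rho^2\, (1-\alpha), \tag{3.15} \]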
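where we have used that $G_\alpha$ and $M_\theta$ are bounded thanks to Proposition 2.1 and we have defined
\[ \Psi(\theta) = \rho \int_{\mathbb{R}^N} M_\theta\, |v|^2\, dv - b_1 \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} M_\theta\, M_{\theta *}\, |u|^3 \, dv\, dv_*. \tag{3.16} \]
By elementary changes of variables, this formula simplifies into $\Psi(\theta) = k_1\, \theta - k_2\, \theta^{3/2}$ with $k_1 = \rho^2 N$ and, using (A.3),
\[ k_2 = \rho^2\, b_1 \int_{\mathbb{R}^N\times\mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |u|^3 \, dv\, dv_* = 2^{3/2}\, \rho^2\, b_1\, m_{3/2}(M_{1,0,1}). \]
We next observe that $\Psi \in C^\infty(0,\infty)$ and is strictly concave. It is also obvious that the equation $\Psi(\theta) = 0$ for $\theta > 0$ has a unique solution, which is $\bar\theta_1$ defined in (1.29), and that we have $\Psi(\theta) \le \Psi'(\bar\theta_1)\, (\theta - \bar\theta_1) = -k_1\, (\theta - \bar\theta_1)/2$ as well as
\[ \Psi(\theta) = \theta\, \big[ k_1 - k_2\, \theta^{1/2} \big] = k_2\, \theta\, \big[ \bar\theta_1^{1/2} - \theta^{1/2} \big]. \tag{3.17} \]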
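The elementary properties of the function defined in (3.16) can be verified directly from its closed form $k_1\theta - k_2\theta^{3/2}$. In the sketch below, $\Psi$ is our label for that function (an assumption, since its name is not legible in this copy of the text).

```latex
\begin{align*}
\Psi(\theta) &= k_1\,\theta - k_2\,\theta^{3/2},
  & \Psi(\bar\theta_1) = 0,\ \bar\theta_1 > 0
  &\iff \bar\theta_1^{1/2} = \frac{k_1}{k_2}, \\
\Psi'(\theta) &= k_1 - \tfrac32\,k_2\,\theta^{1/2},
  & \Psi'(\bar\theta_1) &= k_1 - \tfrac32\,k_1 = -\frac{k_1}{2}, \\
\Psi''(\theta) &= -\tfrac34\,k_2\,\theta^{-1/2} < 0
  & &\Longrightarrow\ \Psi(\theta) \le \Psi'(\bar\theta_1)\,(\theta-\bar\theta_1)
     = -\frac{k_1}{2}\,(\theta - \bar\theta_1), \\
\Psi(\theta) &= \theta\,\big[k_1 - k_2\,\theta^{1/2}\big]
  = k_2\,\theta\,\big[\bar\theta_1^{1/2} - \theta^{1/2}\big].
\end{align*}
```

The tangent-line inequality is just concavity applied at the zero $\bar\theta_1$, and the factorization in the last line is what converts a bound on $|\Psi(\theta)|$ into a bound on $|\theta^{1/2} - \bar\theta_1^{1/2}|$ once $\theta$ is bounded from below.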
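Plugging this expression for $\Psi$ into (3.15) and using the lower bound (2.17) on the temperature $\theta$ and the estimate (3.14), we obtain that for any $\varepsilon > 0$ there is $C_\varepsilon \in (0,\infty)$ such that
\[ \forall\, \alpha \in (\alpha_0, 1) \qquad \big| \theta^{1/2} - \bar\theta_1^{1/2} \big|^{2+\varepsilon} \le C_\varepsilon\, (1-\alpha). \tag{3.18} \]
We have thus proved that the temperature of $G_\alpha$ converges (with rate) to the expected temperature $\bar\theta_1$. In order to come back to the norm of $G_\alpha - \bar G_1$, we first write, using the Cauchy-Schwarz inequality,
\[ \|G_\alpha - \bar G_1\|_{L^1_{-N}} \le \|G_\alpha - M_\theta\|_{L^1_{-N}} + \|M_\theta - \bar G_1\|_{L^1_{-N}} \le \|G_\alpha - M_\theta\|_{L^1} + C_N\, \|M_\theta - \bar G_1\|_{L^2}, \tag{3.19} \]
and we remark that
\[ \|M_\theta - \bar G_1\|_{L^2}^2 \le C\, \rho^2\, \big| \theta^{1/2} - \bar\theta_1^{1/2} \big|. \tag{3.20} \]
Gathering (3.19) with (3.20), (3.18) and (3.14) we deduce that for any $\varepsilon > 0$ there is $C_\varepsilon \in (0,\infty)$ such that
\[ \forall\, \alpha \in (\alpha_0, 1) \qquad \|G_\alpha - \bar G_1\|_{L^1_{-N}}^{2+\varepsilon} \le C_\varepsilon\, \rho^{2+\varepsilon}\, (1-\alpha), \]
and (3.12) follows by interpolation again.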
4. Uniqueness and Continuity of the Path of Self-Similar Profiles
4.1. The proof of uniqueness.
Theorem 4.1. There exists a constructive $\alpha_1 \in (0,1)$ such that the solution $G_\alpha$ of (1.32) is unique for any $\alpha \in [\alpha_1, 1]$. We denote by $\bar G_\alpha$ this unique self-similar profile.
This theorem is an immediate consequence of the following result.
Proposition 4.2. There is a constructive constant $\eta \in (0,1)$ such that
\[ \left. \begin{array}{l} G, H \in \mathcal{G}_\alpha,\ \alpha \in (1-\eta, 1) \\[2pt] \|G - \bar G_1\|_{L^1_2} \le \eta,\ \|H - \bar G_1\|_{L^1_2} \le \eta \end{array} \right\} \quad \text{implies} \quad G = H. \]
Proof of Theorem 4.1. Let us assume that Proposition 4.2 holds. Then Proposition 3.6 implies that there is some explicit $\varepsilon \in (0,1)$ such that for $\alpha \in (1-\varepsilon, 1]$ one has
\[ \sup_{G_\alpha \in \mathcal{G}_\alpha} \|G_\alpha - \bar G_1\|_{L^1_2} \le \eta, \]
where η is defined in the statement of Proposition 4.2. Up to reducing η, it is always possible to take η ≤ ε, and the proof is completed by applying Proposition 4.2. Proof of Proposition 4.2. Let us consider any exponential weight function m with s ∈ (0, 1), a ∈ (0, +∞), or with s = 1 and a ∈ (0, ∞) small enough. With the notations of Sub-sect. 1.5, let us also define O = C0,0,0 ∩ L1 (m −1 ) the subvector space of L1 (m −1 ) of functions with zero energy, ψ = C (|v|2 − N ) M1,0,1 such that E(ψ) = 1, and the following projection: : L1 (m −1 ) → O,
(g) = g − E(g) ψ.
Finally, let us introduce the following non-linear functional operator: : [0, 1) × (W11,1 (m −1 ) ∩ Cρ,0 ) → R × O, and (1, ·) : (L 11 (m −1 ) ∩ Cρ,0 ) → R × O, by setting (α, g) = (1 + α) DE (g) − 2 ρ E(g), Q α (g, g) − τα divv (v g) , where DE (g) is defined in (1.13). It is straightforward that (α, G α ) = 0 for any α ∈ [α0 , 1] and G α ∈ Gα , and that the equation (1, g) = (0, 0) has a unique solution, given by g = G¯ 1 = Mρ,0,θ¯1 defined in (1.27), (1.29).
The function $\Phi$ is nonlinear in terms of its first argument, and it is quadratic in terms of its second argument (more precisely, it is the sum of linear and quadratic terms in its second argument). Hence easy computations yield the following formal differential with respect to the second argument at the point $(1,\bar G_1)$:
\[ D_2\Phi(1,\bar G_1)\,h = A\,h := \Bigl( 4\,\tilde D_{\mathcal{E}}(\bar G_1,h) - 2\rho\,\mathcal{E}(h),\ 2\,\tilde Q_1(\bar G_1,h) \Bigr), \tag{4.1} \]
where $\tilde Q_\alpha$ is defined in (1.8) and
\[ \tilde D_{\mathcal{E}}(g,h) := b_1 \int_{\mathbb{R}^N \times \mathbb{R}^N} g\,h_*\,|u|^3\,dv\,dv_*. \]
Notice that we can remove the projection on the last argument in (4.1) since the elastic collision operator always has zero energy. Then we have

Lemma 4.3. The linear functional
\[ A : \mathbb{L}^1_1(m^{-1}) \to \mathbb{R} \times \mathcal{O}, \qquad h \mapsto A\,h = D_2\Phi(1,\bar G_1)\,h \]
is invertible: it is bijective with $A^{-1}$ bounded with explicit estimate.

Proof of Lemma 4.3. Since the spectrum of the linear operator $\mathcal{L}_1$ defined on $L^1(m^{-1})$ (with domain $L^1_1(m^{-1})$) includes $0$ as a discrete eigenvalue associated with the eigenspace
\[ \mathrm{Ker}\,\mathcal{L}_1 = \mathrm{Span}\{\bar G_1,\ v_1 \bar G_1,\ \dots,\ v_N \bar G_1,\ |v|^2 \bar G_1\} \]
by [27, Theorem 1.3], and since moreover $\mathcal{O} \cap \mathrm{Ker}\,\mathcal{L}_1 = \{0\}$, we deduce that it is invertible from $\mathcal{O} \cap \mathbb{L}^1_1(m^{-1})$ onto $\mathcal{O}$. Moreover the work [27, Sect. 4] provides explicit estimates on the norm of its inverse. We deduce immediately that $\mathcal{L}_1^{-1}$ maps $\mathcal{O}$ onto itself with explicit bound. For any $h \in \mathbb{L}^1(m^{-1})$, we decompose
\[ h = h_1\,\phi_1 + h^\perp, \quad \text{with} \quad h_1 := \frac{\mathcal{E}(h)}{\mathcal{E}(\phi_1)} \in \mathbb{R}, \quad h^\perp \in \mathcal{O}, \]
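where we recall that $\phi_1$ is defined in (1.35). Then, using the characterization (1.32) of $\bar G_1$,
\[ A\,h = \Bigl( b_1 \int_{\mathbb{R}^N \times \mathbb{R}^N} \bar G_1(v)\,h^\perp(v_*)\,|u|^3\,dv\,dv_* + h_1 \Bigl[ b_1 \int_{\mathbb{R}^N \times \mathbb{R}^N} |v|^2\,\bar G_1\,\bar G_{1*}\,|u|^3\,dv\,dv_* - 2\rho \int_{\mathbb{R}^N} \bar G_1\,|v|^4\,dv \Bigr],\ \mathcal{L}_1(h^\perp) \Bigr). \]
The claimed invertibility follows from the fact that $C^* = 2N\rho^2\bar\theta_1^2 \neq 0$. Indeed, from (A.2) and (A.4) there holds
\begin{align*}
C^* &:= b_1 \int_{\mathbb{R}^N \times \mathbb{R}^N} |v|^2\,\bar G_1\,\bar G_{1*}\,|u|^3\,dv\,dv_* - 2\rho \int_{\mathbb{R}^N} \bar G_1\,|v|^4\,dv \\
 &= \rho^2\,\bar\theta_1^2 \Bigl[ b_1\,\bar\theta_1^{1/2} \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\,(M_{1,0,1})_*\,|v|^2\,|u|^3\,dv\,dv_* - 2 \int_{\mathbb{R}^N} M_{1,0,1}\,|v|^4\,dv \Bigr] \\
 &= \rho^2\,\bar\theta_1^2 \Bigl[ b_1\,\bar\theta_1^{1/2}\,\sqrt{2}\,(2N+3)\,m_{3/2}(M_{1,0,1}) - 2N(N+2) \Bigr],
\end{align*}
and we conclude thanks to formula (1.29).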
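As a sanity check on the constant $N(N+2)$ appearing in the last line, the fourth moment of the normalized Maxwellian can be computed by hand. The sketch below assumes that $M_{1,0,1}(v) = (2\pi)^{-N/2} e^{-|v|^2/2}$ (unit mass, zero mean, unit temperature), which is how the notation of (1.27) is read here; it is not taken from the appendix formulas (A.2), (A.4).

```latex
% Assumption: M := M_{1,0,1}(v) = (2\pi)^{-N/2} e^{-|v|^2/2}, so the coordinates v_i are
% independent standard Gaussians: \int M v_i^2 dv = 1 and \int M v_i^4 dv = 3.
\begin{align*}
\int_{\mathbb{R}^N} M\,|v|^4\,dv
  &= \sum_{i=1}^N \int_{\mathbb{R}^N} M\,v_i^4\,dv
   + \sum_{i \neq j} \int_{\mathbb{R}^N} M\,v_i^2\,v_j^2\,dv \\
  &= 3N + N(N-1) = N(N+2),
\end{align*}
% hence 2 \int M |v|^4 dv = 2N(N+2), the second term in the bracket defining C^*.
```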
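Let us come back to the proof of Proposition 4.2. We write
\[ G_\alpha - H_\alpha = A^{-1}\bigl[ A\,G_\alpha - \Phi(\alpha,G_\alpha) + \Phi(\alpha,H_\alpha) - A\,H_\alpha \bigr] = A^{-1}(I_1, I_2) \tag{4.2} \]
with (recall that the bilinear operators $\tilde D_{\mathcal{E}}$ and $\tilde Q_\alpha$ are symmetric)
\[ \begin{cases} I_1 := 4\,\tilde D_{\mathcal{E}}(\bar G_1, G_\alpha - H_\alpha) - (1+\alpha)\,D(G_\alpha) + (1+\alpha)\,D(H_\alpha), \\ I_2 := I_{2,1} + I_{2,2}, \end{cases} \]
and
\[ \begin{cases} I_{2,1} := 2\,\tilde Q_1(\bar G_1, G_\alpha - H_\alpha) - Q_\alpha(G_\alpha,G_\alpha) + Q_\alpha(H_\alpha,H_\alpha), \\ I_{2,2} := \rho\,(1-\alpha)\,\nabla_v \cdot \bigl( v\,(H_\alpha - G_\alpha) \bigr). \end{cases} \]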
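On the one hand,
\[ I_1 = 2\,D\bigl( 2\bar G_1 - (G_\alpha + H_\alpha),\ G_\alpha - H_\alpha \bigr) + (1-\alpha)\,D\bigl( G_\alpha + H_\alpha,\ G_\alpha - H_\alpha \bigr), \]
so that
\begin{align*}
|I_1| &\le C_3 \Bigl( \|\bar G_1 - G_\alpha\|_{L^1_3} + \|\bar G_1 - H_\alpha\|_{L^1_3} + (1-\alpha)\,\|G_\alpha\|_{L^1_3} + (1-\alpha)\,\|H_\alpha\|_{L^1_3} \Bigr)\,\|G_\alpha - H_\alpha\|_{L^1_3} \\
 &\le \eta_1(\alpha)\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})} \tag{4.3}
\end{align*}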
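with $\eta_1(\alpha) \to 0$ when $\alpha \to 1$ (with explicit rate, for instance $\eta_1(\alpha) = C_1\,(1-\alpha)^{1/3}$) because of Propositions 2.1 and 3.6. On the other hand,
\begin{align*}
I_{2,1} ={}& Q_1(\bar G_1, G_\alpha - H_\alpha) - Q_\alpha(\bar G_1, G_\alpha - H_\alpha) + Q_1(G_\alpha - H_\alpha, \bar G_1) - Q_\alpha(G_\alpha - H_\alpha, \bar G_1) \\
 &+ Q_\alpha(\bar G_1 - G_\alpha, G_\alpha - H_\alpha) + Q_\alpha(G_\alpha - H_\alpha, \bar G_1 - H_\alpha).
\end{align*}
From Proposition 3.2 there holds
\begin{align*}
\|Q_1(\bar G_1, G_\alpha - H_\alpha) - Q_\alpha(\bar G_1, G_\alpha - H_\alpha)\|_{L^1(m^{-1})} &\le \varepsilon(\alpha)\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})}, \\
\|Q_1(G_\alpha - H_\alpha, \bar G_1) - Q_\alpha(G_\alpha - H_\alpha, \bar G_1)\|_{L^1(m^{-1})} &\le \varepsilon(\alpha)\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})},
\end{align*}
with $\varepsilon(\alpha) \to 0$ as $\alpha \to 1$ (again with explicit rate, for instance $\varepsilon(\alpha) = C_1\,(1-\alpha)^{1/12}$ if $s = 1/2$ in the formula of $m$). From elementary estimates in $L^1(m^{-1})$ we have
\[ \|Q_\alpha(\bar G_1 - G_\alpha, G_\alpha - H_\alpha) + Q_\alpha(G_\alpha - H_\alpha, \bar G_1 - H_\alpha)\|_{L^1(m^{-1})} \le C_4 \Bigl( \|G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} + \|H_\alpha - \bar G_1\|_{L^1_1(m^{-1})} \Bigr)\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})}. \]
Together with Proposition 3.6 we thus obtain
\[ \|I_{2,1}\|_{L^1(m^{-1})} \le \eta_2(\alpha)\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})} \tag{4.4} \]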
for some $\eta_2(\alpha) \to 0$ as $\alpha \to 1$. Here we can take for instance (when $s = 1/2$ in the formula of $m$) $\eta_2(\alpha) = C_2\,(1-\alpha)^{1/12}$ for some $C_2 \in (0,\infty)$ by picking a suitable $\varepsilon$ and interpolating. Finally from Proposition 2.7 there holds
\[ \|I_{2,2}\|_{L^1(m^{-1})} \le C_5\,(1-\alpha)\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})}. \tag{4.5} \]
Gathering (4.3), (4.4) and (4.5) we obtain from (4.2) and Lemma 4.3
\[ \|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})} \le \eta(\alpha)\,\|A^{-1}\|\,\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})} \]
for some function $\eta$ such that $\eta(\alpha) \to 0$ as $\alpha \to 1$ (with explicit rate). Hence choosing $\alpha_1$ close enough to $1$ we have $\eta(\alpha)\,\|A^{-1}\| \le 1/2$ for any $\alpha \in [\alpha_1,1)$. This implies $G_\alpha = H_\alpha$ and concludes the proof.

4.2. Differentiability of the map $\alpha \mapsto \bar G_\alpha$ at $\alpha = 1$.

Lemma 4.4. The map $[\alpha_1,1] \to L^1(m^{-1})$, $\alpha \mapsto \bar G_\alpha$ is continuous on $[\alpha_1,1]$ and differentiable at $\alpha = 1$. More precisely, there exists $\bar G_1' \in L^1(m^{-1})$ and for any $\eta \in (1,2)$ there exists a constructive $C_\eta \in (0,\infty)$ such that
\[ \|\bar G_\alpha - \bar G_1 - (1-\alpha)\,\bar G_1'\|_{L^1(m^{-1})} \le C_\eta\,(1-\alpha)^\eta \qquad \forall\, \alpha \in (\alpha_0,1). \tag{4.6} \]
Proof of Lemma 4.4. We split the proof into four steps.

Step 1. For the continuity we use a classical stability argument. Let us consider a sequence $(\alpha_n)_{n \ge 0}$ such that $\alpha_n \in [\alpha_1,1]$ and $\alpha_n \to \alpha$. From the uniform bound (2.1), we may extract a subsequence $(\bar G_{\alpha_n})$ which strongly converges in $L^1(m^{-1})$ to a function $G_\alpha$. Passing to the limit in Eq. (1.32) associated to the normal restitution coefficient $\alpha_n$ and written for $G_{\alpha_n}$, we deduce that $G_\alpha$ satisfies (1.32) associated to the normal restitution coefficient $\alpha$. From the uniqueness of the solution proved in Theorem 4.1, there holds $G_\alpha = \bar G_\alpha$ and thus the whole sequence $\bar G_{\alpha_n}$ converges to $\bar G_\alpha$.

Step 2. We next prove that there exists an explicit constant $C$ such that
\[ \forall\, \alpha \in [\alpha_1,1] \qquad \|\bar G_\alpha - \bar G_1\|_{L^1(m^{-1})} \le C\,(1-\alpha). \]
We write
\[ \bar G_\alpha - \bar G_1 = A^{-1}\bigl[ A\,\bar G_\alpha - \Phi(\alpha,\bar G_\alpha) + \Phi(1,\bar G_1) - A\,\bar G_1 \bigr] = A^{-1}(J_1, J_2) \]
with
\[ \begin{cases} J_1 := 4\,\tilde D_{\mathcal{E}}(\bar G_1, \bar G_\alpha - \bar G_1) + 2\,\tilde D_{\mathcal{E}}(\bar G_1, \bar G_1) - (1+\alpha)\,\tilde D_{\mathcal{E}}(\bar G_\alpha, \bar G_\alpha), \\ J_2 := J_{2,1} + J_{2,2}, \end{cases} \]
and
\[ \begin{cases} J_{2,1} := Q_1(\bar G_1, \bar G_\alpha) + Q_1(\bar G_\alpha, \bar G_1) - Q_\alpha(\bar G_\alpha, \bar G_\alpha), \\ J_{2,2} := \rho\,(1-\alpha)\,\nabla_v \cdot (v\,\bar G_\alpha). \end{cases} \tag{4.7} \]
On the one hand,
\[ J_1 = -2\,\tilde D_{\mathcal{E}}(\bar G_1 - \bar G_\alpha, \bar G_1 - \bar G_\alpha) + (1-\alpha)\,D(\bar G_\alpha, \bar G_\alpha) \]
so that
\[ |J_1| \le C\,\|\bar G_1 - \bar G_\alpha\|_{L^1_3}^2 + C\,(1-\alpha). \]
On the other hand,
\[ J_{2,1} = -Q_1(\bar G_1 - \bar G_\alpha, \bar G_1 - \bar G_\alpha) + Q_1(\bar G_\alpha, \bar G_\alpha) - Q_\alpha(\bar G_\alpha, \bar G_\alpha). \]
Hence using Propositions 2.7, 3.1, and the bound (2.1), we deduce
\[ |J_{2,1}| \le \|\bar G_\alpha - \bar G_1\|_{L^1_2}^2 + C\,(1-\alpha) \]
and we also have straightforwardly $J_{2,2} = O(1-\alpha)$. Gathering all these estimates, we thus obtain from (4.7),
\[ \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} \le \|A^{-1}\| \Bigl( \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})}^2 + C\,(1-\alpha) \Bigr). \]
Using then the explicit result of quantification of the elastic limit in Proposition 3.6, we have that for some $\alpha_2 \in [\alpha_1,1)$ close enough to $1$:
\[ \forall\, \alpha \in [\alpha_2,1] \qquad \|A^{-1}\|\,\|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} < \frac12, \]
and thus we get
\[ \forall\, \alpha \in [\alpha_2,1], \qquad \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} \le 2\,C\,\|A^{-1}\|\,(1-\alpha), \]
which implies the claimed estimate.

Step 3. In order to prove the differentiability we must slightly improve the estimate established in the preceding step. On the one hand we exhibit what should be the derivative of $\bar G_\alpha$ at $\alpha = 1$, and denote it by $R$. Formally differentiating Eq. (1.32) at $\alpha = 1$ we have
\[ Q_1'(\bar G_1, \bar G_1) + 2\,\tilde Q_1(R, \bar G_1) + \rho\,\nabla_v \cdot (v\,\bar G_1) = 0. \]
On the other hand, we may compute
\[ \bigl\langle Q_\alpha'(\bar G_1, \bar G_1),\ |\cdot|^2 \bigr\rangle = \frac14 \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\,\bar G_1\,\bar G_{1*}\,(|u|\,\sigma - u) \cdot (|u|\,\sigma)\,dv\,dv_*\,d\sigma = 2\,D_{\mathcal{E}}(\bar G_1). \tag{4.8} \]
Next, dividing Eq. (1.25) on the energy of $G_\alpha$ by $(1-\alpha)$ and formally differentiating the resulting expression we get
\[ 2\rho\,\mathcal{E}(R) - \tilde D_{\mathcal{E}}(\bar G_1, \bar G_1) - 4\,\tilde D_{\mathcal{E}}(R, \bar G_1) = 0. \]
We now rigorously define $R$ in the following way:
\[ \bar G_1' = R := A^{-1}\bigl( -\tilde D_{\mathcal{E}}(\bar G_1, \bar G_1),\ -F \bigr), \qquad F := Q_\alpha'(\bar G_1, \bar G_1) + \rho\,\nabla_v \cdot (v\,\bar G_1). \]
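Note that $R$ is well-defined since $\mathcal{E}(F) = 0$ because of (4.8) and the definition of $\bar G_1$.

Step 4. We finally come back to Step 2 and we shall construct a Taylor expansion of order 1. We want to estimate
\[ \bar G_\alpha - \bar G_1 + (\alpha-1)\,\bar G_1' = A^{-1}\Bigl( J_1 - (\alpha-1)\,\tilde D_{\mathcal{E}}(\bar G_1, \bar G_1),\ J_2 - (1-\alpha)\,F \Bigr). \]
On the one hand
\[ J_1 - (\alpha-1)\,\tilde D_{\mathcal{E}}(\bar G_1, \bar G_1) = -2\,\tilde D_{\mathcal{E}}(\bar G_1 - \bar G_\alpha, \bar G_1 - \bar G_\alpha) + (1-\alpha)\,\bigl[ D(\bar G_\alpha, \bar G_\alpha) - \tilde D_{\mathcal{E}}(\bar G_1, \bar G_1) \bigr], \]
so that we obtain straightforwardly
\[ \bigl| J_1 - (\alpha-1)\,\tilde D_{\mathcal{E}}(\bar G_1, \bar G_1) \bigr| \le C\,(1-\alpha)^2. \]
On the other hand, $J_2 - (1-\alpha)\,F := J_{2,1} + J_{2,2}$ with
\begin{align*}
J_{2,1} ={}& -Q_1(\bar G_1 - \bar G_\alpha, \bar G_1 - \bar G_\alpha) + Q_1(\bar G_\alpha - \bar G_1, \bar G_\alpha) - Q_\alpha(\bar G_\alpha - \bar G_1, \bar G_\alpha) \\
 &+ Q_1(\bar G_1, \bar G_\alpha - \bar G_1) - Q_\alpha(\bar G_1, \bar G_\alpha - \bar G_1) + (1-\alpha)\,\nabla_v \cdot \bigl( v\,(\bar G_\alpha - \bar G_1) \bigr)
\end{align*}
and
\[ J_{2,2} = Q_1(\bar G_1, \bar G_1) - Q_\alpha(\bar G_1, \bar G_1) - (1-\alpha)\,K. \]
It is clear from Proposition 3.1, the bound of Step 2, and some interpolation with the uniform bounds (2.1), that
\[ \|J_{2,1}\|_{L^1(m^{-1})},\ \|J_{2,2}\|_{L^1(m^{-1})} \le C_k\,(1-\alpha)^k \quad \text{for any } k \in (1,2). \]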
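The two absorption arguments used in this section (at the end of the proof of Proposition 4.2 and in Step 2 of the proof of Lemma 4.4) are elementary once the norms are abbreviated. In the sketch below the letters $a$, $x$, $y$ are shorthand introduced only for this illustration: $a := \|A^{-1}\|$, $x := \|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})}$ and $y := \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})}$, all assumed finite.

```latex
% Uniqueness (Proposition 4.2): a contraction-type inequality forces x = 0.
\[
x \le \eta(\alpha)\,a\,x, \quad \eta(\alpha)\,a \le \tfrac12
\;\Longrightarrow\; x \le \tfrac12\,x
\;\Longrightarrow\; x = 0 .
\]
% Step 2 (Lemma 4.4): the quadratic term is absorbed into the left-hand side.
\[
y \le a\,\bigl(y^2 + C\,(1-\alpha)\bigr), \quad a\,y < \tfrac12
\;\Longrightarrow\; a\,y^2 \le \tfrac12\,y
\;\Longrightarrow\; \tfrac12\,y \le a\,C\,(1-\alpha)
\;\Longrightarrow\; y \le 2\,C\,a\,(1-\alpha).
\]
```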
5. Study of the Spectrum and Semigroup of the Linearized Problem

In this section we shall obtain information on the geometry of the spectrum of the linearized rescaled inelastic collision operator (for a small inelasticity), as well as estimates on its resolvent and on the associated linear semigroup. We shall use the properties of the elastic linearized operator and some perturbation arguments again. In order to do so, one needs some common functional “ground” for the linearized operators in the limit of vanishing inelasticity. This common functional setting is given by the study [27] in which the spectral study of the elastic linearized operator is made in $L^1$ spaces with exponential weights $e^{a|v|^s}$, $a \in (0,+\infty)$, $s \in (0,1)$. We thus consider the operator
\[ g \mapsto Q_\alpha(g,g) - \tau_\alpha\,\nabla_v \cdot (v\,g) \]
and some fluctuations $h$ around the self-similar profile $\bar G_\alpha$: $g = \bar G_\alpha + h$ with $h \in L^1(m^{-1})$, where $m$ is a fixed smooth exponential weight function, as defined in (1.30). The corresponding linearized unbounded operator $\mathcal{L}_\alpha$ acting on $L^1(m^{-1})$ with
domain $\mathrm{dom}(\mathcal{L}_\alpha) = W^{1,1}_1(m^{-1})$ if $\alpha \neq 1$ and $\mathrm{dom}(\mathcal{L}_1) = L^1_1(m^{-1})$, is defined in (1.33) (it is straightforward to check that it is closed in this space). Since the equation in self-similar variables preserves mass and the zero momentum, the correct spectral study of $\mathcal{L}_\alpha$ requires restricting this operator to zero mean and centered distributions (which are preserved as well by $\mathcal{L}_\alpha$), and therefore we shall work in $\mathbb{L}^1(m^{-1})$. When restricted to this space, the operator $\mathcal{L}_\alpha$ is denoted by $\hat{\mathcal{L}}_\alpha$. We denote by $R(\hat{\mathcal{L}}_\alpha)$ the resolvent set of $\hat{\mathcal{L}}_\alpha$, and by $R_\alpha(\xi) = (\hat{\mathcal{L}}_\alpha - \xi)^{-1}$ its resolvent operator for any $\xi \in R(\hat{\mathcal{L}}_\alpha)$.

Let us recall that the linearized elastic hard spheres Boltzmann equation, the spectrum, and the asymptotic stability have been studied by many authors since the pioneering works by Hilbert [20], Carleman [12] and Grad [18], and we refer for instance to [27] for more references. The result established for $\mathcal{L}_1$ (and translated straightforwardly to $\hat{\mathcal{L}}_1$) in [27] is the following:

Theorem 5.1. (i) There exists a decreasing sequence of real discrete eigenvalues $(\mu_n)_{n \ge 1}$ (that is: eigenvalues isolated and with finite multiplicity) of $\hat{\mathcal{L}}_1$, with “energy” eigenvalue $\mu_1 = 0$ of multiplicity 1 and “energy” eigenvector $\phi_1$ (defined in (1.35)), $\mu_2 < 0$ and $\lim \mu_n = \mu_\infty \in (-\infty,0)$ such that the spectrum $\Sigma(\hat{\mathcal{L}}_1)$ of $\hat{\mathcal{L}}_1$ in $\mathbb{L}^1(m^{-1})$ is written
\[ \Sigma(\hat{\mathcal{L}}_1) = (-\infty, \mu_\infty] \cup \{\mu_n\}_{n \in \mathbb{N}}. \]
In particular, $\hat{\mathcal{L}}_1$ is onto from $\mathcal{O} \cap \mathbb{L}^1_1(m^{-1})$ onto $\mathcal{O}$.

(ii) The resolvent $R_1(\xi)$ has a sectorial property after “subtraction” of the “energy” eigenvalue, namely there is a constructive $\mu_2 < \lambda < 0$ such that
\[ \forall\, \xi \in A, \qquad \|R_1(\xi)\|_{\mathbb{L}^1(m^{-1})} \le a + \frac{b}{|\xi + \lambda|}, \]
with
\[ A = \Bigl\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Bigl[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Bigr] \ \text{and} \ \Re e\,\xi \le \frac{\lambda}{2} \Bigr\}. \]

(iii) The linear semigroup $S_1(t)$ associated to $\hat{\mathcal{L}}_1$ in $\mathbb{L}^1(m^{-1})$ is written
\[ \forall\, t \ge 0 \qquad S_1(t) = \Pi_1 + R_1(t), \]
where $\Pi_1$ is the projection on the eigenspace associated to $\mu_1$ and $R_1(t)$ is a semigroup which satisfies
\[ \forall\, t \ge 0 \qquad \|R_1(t)\|_{\mathbb{L}^1(m^{-1})} \le C\,e^{\mu_2 t} \]
with explicit constant $C$.

The main result proved in this section is a perturbation result which extends Theorem 5.1 in the following way. Let us define for any $x \in \mathbb{R}$ the half-plane $\Delta_x$ by
\[ \Delta_x = \{ \xi \in \mathbb{C},\ \Re e\,\xi \ge x \}. \]
Theorem 5.2. Let us fix $\bar\mu \in (\mu_2, 0)$, $k, q \in \mathbb{N}$ and $m$ a smooth exponential weight function with $s \in (0,1)$. Then there exists $\alpha_2 \in (\alpha_1,1)$ such that for any $\alpha \in [\alpha_2,1]$ the following holds:

(i) The spectrum $\Sigma(\hat{\mathcal{L}}_\alpha)$ of $\hat{\mathcal{L}}_\alpha$ in $W^{k,1}_q(m^{-1})$ is written
\[ \Sigma(\hat{\mathcal{L}}_\alpha) = E_\alpha \cup \{\mu_\alpha\}, \qquad E_\alpha \subset \Delta^c_{\bar\mu}, \]
where $\mu_\alpha$ is a 1-dimensional real eigenvalue which does not depend on the choice of the space $W^{k,1}_q(m^{-1})$ and satisfies (1.34).

(ii) The resolvent $R_\alpha(\xi)$ in $W^{k,1}_q(m^{-1})$ is holomorphic on a neighborhood of $\Delta_{\bar\mu} \setminus \{\mu_\alpha\}$ and there are explicit constants $C_1$, $C_2$ such that
\[ \sup_{z \in \mathbb{C},\ \Re e\,z = \bar\mu} \|R_\alpha(z)\|_{W^{k,1}_q(m^{-1}) \to W^{k,1}_q(m^{-1})} \le C_1 \]
and
\[ \|R_\alpha(\bar\mu + is)\|_{W^{k+1,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{C_2}{1 + |s|}. \]

(iii) The linear semigroup $S_\alpha(t)$ associated to $\hat{\mathcal{L}}_\alpha$ in $W^{k+2,1}_{q+2}(m^{-1})$, $k, q \in \mathbb{N}$, is written
\[ S_\alpha(t) = e^{\mu_\alpha t}\,\Pi_\alpha + R_\alpha(t), \]
where $\Pi_\alpha$ is the projection on the (1-dimensional) eigenspace associated to $\mu_\alpha$ and where $R_\alpha(t)$ is a semigroup which satisfies
\[ \|R_\alpha(t)\|_{W^{k+2,1}_{q+2}(m^{-1}) \to W^{k,1}_q(m^{-1})} \le C_k\,e^{\bar\mu t} \tag{5.1} \]
with explicit bounds.

Remark 5.3. Note that we do not claim that the resolvent $R_\alpha$ is sectorial for $\alpha < 1$. Indeed it is likely that it is not (because of the contribution of the drift term). Moreover, it is not clear to us how to perform the spectral study in a Hilbert functional setting $L^2(m^{-1})$ with convenient weight function $m$. In particular, we are not able to prove Proposition 3.2 in an $L^2$ framework. In such a situation the spectral study and the obtaining of constructive rate of decay on the semigroup become tricky. Let us emphasize also that (as most of the results established in this paper) this result is not an easy consequence of perturbation theory of the unbounded operator since the elastic limit $\alpha \to 1$ is strongly ill-behaved (for instance neither the “relative bound” nor the “operator gap” of [21] go to 0) because of the anti-drift term.
5.1. Recalls and improvements of technical tools from [27].

Proposition 5.4. In the statement of Theorem 5.1 one can replace everywhere $L^1(m^{-1})$ by $W^{k,1}_q(m^{-1})$, $k, q \in \mathbb{N}$.
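Let us first recall the key decomposition of $\hat{\mathcal{L}}_1$ in [27, Sect. 2] (re-written within the notation of this paper): Let $\mathbf{1}_E$ denote the usual indicator function of the set $E$, let $\Theta : \mathbb{R} \to \mathbb{R}_+$ be an even $C^\infty$ function with mass 1 and support included in $[-1,1]$ and $\tilde\Theta : \mathbb{R}^N \to \mathbb{R}_+$ a radial $C^\infty$ function with mass 1 and support included in $B(0,1)$. We define the following mollification functions ($\epsilon > 0$):
\[ \begin{cases} \Theta_\epsilon(x) = \epsilon^{-1}\,\Theta(\epsilon^{-1} x), & (x \in \mathbb{R}), \\ \tilde\Theta_\epsilon(x) = \epsilon^{-N}\,\tilde\Theta(\epsilon^{-1} x), & (x \in \mathbb{R}^N). \end{cases} \]
Then we consider the decompositions
\[ \mathcal{L}_1(g) = \mathcal{L}^c_1(g) - \mathcal{L}^\nu(g) \quad \text{with} \quad \mathcal{L}^\nu(g) := \nu\,g, \quad \nu = L(\bar G_1), \]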
where $\mathcal{L}^c_1$ splits between a “gain” part $\mathcal{L}^+_1$ (denoted so because it corresponds to the linearization of $Q^+$) and a convolution part $\mathcal{L}^*$ (not depending on $\alpha$) as
\[ \mathcal{L}^c_1(g) = \mathcal{L}^+_1(g) - \mathcal{L}^*(g) \quad \text{with} \quad \mathcal{L}^*(g) := \bar G_1\,[g * \Phi], \]
(we do not write the subscript 1 when there is no dependency on $\alpha$). Then for any $\delta \in (0,1)$ we set
\[ \mathcal{L}^+_{1,\delta}(g) = I_\delta(v) \int_{\mathbb{R}^N \times S^{N-1}} \Phi(|v - v_*|)\,b_\delta(\cos\theta)\,\bigl[ g'\,(\bar G_1)'_* + (\bar G_1)'\,g'_* \bigr]\,dv_*\,d\sigma, \]
where
\[ I_\delta = \tilde\Theta_\delta * \mathbf{1}_{\{|\cdot| \le \delta^{-1}\}}, \qquad b_\delta(z) = \bigl( \Theta_{\delta^2} * \mathbf{1}_{\{-1 + 2\delta^2 \le z \le 1 - 2\delta^2\}} \bigr)\,b(z). \]
This approximation induces $\mathcal{L}_{1,\delta} = \mathcal{L}^+_{1,\delta} - \mathcal{L}^* - \mathcal{L}^\nu$. Then the key result is that this approximation converges (in the norm of the graph) to the original linearized operator $\mathcal{L}_1$ as $\delta \to 0$, first in the small classical linearization space $L^2(\bar G_1^{-1})$ (this technical result was in fact mostly already included in Grad's results [18]), and second most importantly in the larger space $L^1(m^{-1})$. On the basis of this approximation result the spectrum is then proved to be the same in both functional spaces, and then the norms of the resolvents within these two functional spaces are related by an explicit control.

Hence the key elements of the proof which are to be extended are, on the one hand, the approximation argument (which has to be extended from an $L^1(m^{-1})$ setting to a $W^{k,1}_q(m^{-1})$ setting), and, on the other hand, the explicit control on the resolvent in the space $L^2(\bar G_1^{-1})$ provided by the self-adjointness structure of the collision operator in this space and the explicit estimates on the spectral gap (see [6]), which has to be extended to an $H^k(\bar G_1^{-1})$ setting. Then the rest of the proof of [27] would extend as well (up to minor technical modifications) to $W^{k,p}(m^{-1})$. Therefore for the first point let us prove

Proposition 5.5. For any $k, q \in \mathbb{N}$ and $g \in W^{k,1}_{q+1}(m^{-1})$, we have
\[ \bigl\| \bigl( \mathcal{L}^+_1 - \mathcal{L}^+_{1,\delta} \bigr)(g) \bigr\|_{W^{k,1}_q(m^{-1})} \le \varepsilon(\delta)\,\|g\|_{W^{k,1}_{q+1}(m^{-1})}, \]
where $\varepsilon(\delta) > 0$ is an explicit constant going to 0 as $\delta$ goes to 0.
Proof of Proposition 5.5. The case $k = q = 0$ is provided by Proposition 3.2. Then higher-order derivatives follow by differentiation, and the incorporation of a polynomial weight is trivial.

Concerning the second point let us prove

Proposition 5.6. The spectrum $\Sigma(\mathcal{L}_1)$ of $\mathcal{L}_1$ in $L^2(\bar G_1^{-1})$ is unchanged in any $H^k(\bar G_1^{-1})$, $k \in \mathbb{N}$. Moreover the control on the resolvent, which was (self-adjoint operator)
\[ \|R_1(\xi)\|_{L^2(\bar G_1^{-1})} \le \frac{1}{\mathrm{dist}(\xi, \Sigma(\mathcal{L}_1))} \]
in the space $L^2(\bar G_1^{-1})$, extends into
\[ \forall\, \xi \in A, \qquad \|R_1(\xi)\|_{H^k(\bar G_1^{-1})} \le \frac{C_k}{\mathrm{dist}(\xi, \Sigma(\mathcal{L}_1))}, \]
with
\[ A = \Bigl\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Bigl[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Bigr] \ \text{and} \ \Re e\,\xi \le \frac{\lambda}{2} \Bigr\}, \]
for any $k \in \mathbb{N}$ and some explicit constant $C_k > 0$.

Proof of Proposition 5.6. A quick way to prove the result is for instance the following. It is easy to prove by induction on $k \in \mathbb{N}$ the following estimate on the Dirichlet form:
\[ \sum_{|s| \le k} a_s\,\bigl\langle \nabla^s \mathcal{L}_1(g),\ \nabla^s g \bigr\rangle_{L^2(\bar G_1^{-1})} \le -\tau_k \Bigl( \sum_{|s| \le k} \bigl\| \nabla^s \bar\Pi g \bigr\|^2_{L^2(\bar G_1^{-1})} \Bigr) \]
for some explicit $\tau_k > 0$ and $a_s > 0$, $|s| \le k$, and where $\bar\Pi$ denotes the orthogonal projection in $L^2(\bar G_1^{-1})$ onto the functions with zero mass, momentum and energy. Therefore we deduce on $\hat{\mathcal{L}}_1$ that its semigroup satisfies
\[ \forall\, k \in \mathbb{N}, \qquad \bigl\| e^{t \hat{\mathcal{L}}_1} \bigr\|_{H^k(\bar G_1^{-1})} \le C_k \]
and that obviously the same is true on the stable subspace of functions with zero energy. Then by interpolation with the rate of decay of the semigroup for functions with zero energy in $L^2(\bar G_1^{-1})$, we deduce that
\[ \forall\, \varepsilon > 0,\ k \in \mathbb{N}, \qquad \bigl\| e^{t \hat{\mathcal{L}}_1}\,\Pi \bigr\|_{H^k(\bar G_1^{-1})} \le C_{\varepsilon,k}\,e^{-(\mu_2 - \varepsilon)\,t} \]
for some explicit $C_{\varepsilon,k} > 0$, and where $\Pi$ is the orthogonal projection in $L^2(\bar G_1^{-1})$ onto functions with zero energy. This implies on the resolvent that for any $k \in \mathbb{N}$,
\[ \forall\, \xi \in A, \qquad \|R_1(\xi)\|_{H^k(\bar G_1^{-1})} \le C_k, \]
with
\[ A = \Bigl\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Bigl[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Bigr] \ \text{and} \ \Re e\,\xi \le \frac{\lambda}{2} \Bigr\}, \]
for some explicit $C_k > 0$. Then the result follows by straightforward interpolation with the estimates on the resolvent in $L^2(\bar G_1^{-1})$.
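The passage from a semigroup decay estimate to a resolvent bound used in this proof rests on the standard Laplace-transform representation of the resolvent. The following is a generic sketch rather than the authors' computation: $X$ stands for any of the Banach spaces above, $T(t) = e^{t\mathcal{L}}$ for a strongly continuous semigroup on $X$ with generator $\mathcal{L}$, and $C$, $\kappa > 0$ are hypothetical constants such that $\|T(t)\|_X \le C e^{-\kappa t}$ on the relevant invariant subspace.

```latex
% Convention as in the text: R(\xi) = (\mathcal{L} - \xi)^{-1}.
% Assumption: \|T(t)\|_X \le C e^{-\kappa t} for all t \ge 0, with \kappa > 0.
% Then for every \xi with \Re e\,\xi > -\kappa:
\[
R(\xi) = -\int_0^\infty e^{-\xi t}\,T(t)\,dt ,
\]
% since (\mathcal{L} - \xi) \int_0^\infty e^{-\xi t} T(t)\,dt
%   = \int_0^\infty \frac{d}{dt}\bigl(e^{-\xi t} T(t)\bigr)\,dt = -\mathrm{Id}.
\[
\|R(\xi)\|_X \le \int_0^\infty e^{-(\Re e\,\xi)\,t}\,C\,e^{-\kappa t}\,dt
            = \frac{C}{\Re e\,\xi + \kappa} .
\]
```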
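We can then conclude with the following extension of point (ii) of Theorem 5.1:

Proposition 5.7. We have
\[ \forall\, \xi \in A, \qquad \|R_1(\xi)\|_{W^{k,1}_q(m^{-1})} \le a_{k,q} + \frac{b_{k,q}}{|\xi + \lambda|}, \]
with
\[ A = \Bigl\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Bigl[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Bigr] \ \text{and} \ \Re e\,\xi \le \frac{\lambda}{2} \Bigr\} \]
for any $k, q \in \mathbb{N}$ and some explicit constants $a_{k,q}, b_{k,q} > 0$.

5.2. Decomposition of $\hat{\mathcal{L}}_\alpha$ and technical estimates.

We fix once and for all some $\bar\mu \in (\mu_2, 0)$ and we split the proof of Theorem 5.2 into four steps, detailed in the following four subsections. Let us introduce the operator
\[ P_\alpha = \mathcal{L}_1 - \mathcal{L}_\alpha = \mathcal{L}^+_1 - \mathcal{L}^+_\alpha + \tau_\alpha\,\nabla_v \cdot (v\,\cdot). \]
Our first step in this subsection is to estimate the convergence to 0 of the first part of this operator in a suitable norm. Namely we prove

Lemma 5.8. (i) For any $k, q \in \mathbb{N}$, there exists $C = C_{k,q,m}$ such that
\[ \|\mathcal{L}^+_\alpha(g)\|_{W^{k,1}_q(m^{-1})} \le C\,\|g\|_{W^{k,1}_{q+1}(m^{-1})}, \qquad \|\mathcal{L}_\alpha(g)\|_{W^{k,1}_q(m^{-1})} \le C\,\|g\|_{W^{k+1,1}_{q+1}(m^{-1})}. \]

(ii) For any $k, q \in \mathbb{N}$, there is a constructive function $\varepsilon : (0,\infty) \to (0,\infty)$ satisfying $\varepsilon(\alpha) \to 0$ as $\alpha$ goes to 1 and such that for any $g \in W^{k,1}_{q+1}(m^{-1})$,
\[ \bigl\| \bigl( \mathcal{L}^+_\alpha - \mathcal{L}^+_1 \bigr)(g) \bigr\|_{W^{k,1}_q(m^{-1})} \le \varepsilon(\alpha)\,\|g\|_{W^{k,1}_{q+1}(m^{-1})}. \]

(iii) There exists $C \in (0,\infty)$ such that for any $g \in W^{3,1}_3(m^{-1})$, we have
\[ \|(\mathcal{L}_1 - \mathcal{L}_\alpha)(g)\|_{L^1(m^{-1})} \le C\,(1-\alpha)\,\|g\|_{W^{3,1}_3(m^{-1})}. \]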
Proof of Lemma 5.8. The case $k = q = 0$ is proved in Proposition 3.2. Then higher-order derivatives are obtained from the $L^1(m^{-1})$ estimates by straightforward differentiation, and the incorporation of polynomial weights is trivial.

Now let us consider some $\xi \in \mathbb{C}$ and let us define
\[ A_\delta = \mathcal{L}^+_{1,\delta} - \mathcal{L}^* \quad \text{and} \quad B_{\alpha,\delta}(\xi) = \nu + \xi + \mathcal{L}^+_{1,\delta} - \mathcal{L}^+_1 + P_\alpha. \]
(Recall that the approximation $\mathcal{L}^+_{1,\delta}$ is defined in the beginning of Subsect. 5.1.) It yields the decomposition $\mathcal{L}_\alpha - \xi = A_\delta - B_{\alpha,\delta}(\xi)$. Then we have
Lemma 5.9. Let us consider any $k, q \in \mathbb{N}$ and $\xi$ such that $\Re e\,\xi \ge -\min \nu$. Then

(i) For any $\delta > 0$, the operator $A_\delta : L^1 \to W^{\infty,1}_\infty(m^{-1})$ is a bounded linear operator (more precisely it maps functions of $L^1$ into $C^\infty$ functions with compact support).

(ii) There are some constructive $\delta^* > 0$ and $\alpha_2 \in (\alpha_1,1)$ (depending on a lower bound on $\mathrm{dist}(\xi, \nu(\mathbb{R}^N))$) such that for $\delta \in [0,\delta^*]$ and $\alpha \in [\alpha_2,1]$, the operator
\[ B_{\alpha,\delta} : W^{k+1,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1}) \]
is invertible.

(iii) The inverse operator $B_{\alpha,\delta}(\xi)^{-1}$ satisfies for $\delta \in [0,\delta^*]$ and $\alpha \in [\alpha_3,1]$:
\[ \bigl\| B_{\alpha,\delta}(\xi)^{-1} \bigr\|_{W^{k,1}_q(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{C_1}{\mathrm{dist}(\Re e\,\xi, \nu(\mathbb{R}^N))} \]
and
\[ \bigl\| B_{\alpha,\delta}(\xi)^{-1} \bigr\|_{W^{k+1,1}_q(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{C_2}{\mathrm{dist}(\xi, \nu(\mathbb{R}^N))} \]
for some explicit constants $C_1, C_2 > 0$ depending on $k$, $q$, $\delta^*$, $\alpha_2$ and a lower bound on $\mathrm{dist}(\Re e\,\xi, \nu(\mathbb{R}^N))$.

Proof of Lemma 5.9. For $\xi \in \nu(\mathbb{R}^N)^c$, the convergence to 0 of $\mathcal{L}^+_{1,\delta} - \mathcal{L}^+_1$ as $\delta \to 0$ was proved in [27, Prop. 4.1, Theorem 4.2] (this was done in $L^1_1(m^{-1}) \to L^1(m^{-1})$ in [27] and is extended to any $W^{k,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})$ by Proposition 5.5). We deduce as in [27] that for $\delta$ small enough (depending on a lower bound on the coercivity norm of $\nu + \xi$, that is on a lower bound on $\mathrm{dist}(\xi, \nu(\mathbb{R}^N))$), we have
\[ \bigl\| \bigl( \mathcal{L}^+_{1,\delta} - \mathcal{L}^+_1 \bigr)\,g \bigr\|_{W^{k,1}_q(m^{-1})} \le \frac12\,\|(\nu + \xi)\,g\|_{W^{k,1}_{q+1}}. \]
It was also proved that $A_\delta$ maps functions of $L^1$ into $C^\infty$ functions with compact support (with explicit estimates).

Let us now consider $B_{\alpha,\delta}(\xi)$ only in the case $k = q = 0$ (estimates for higher-order derivatives and weights are obtained by straightforward differentiation and computations). From Lemma 5.8 we have for $\alpha$ close enough to 1 (depending on a lower bound on $\mathrm{dist}(\xi, \nu(\mathbb{R}^N))$),
\[ \bigl\| \bigl( \mathcal{L}^+_1 - \mathcal{L}^+_\alpha \bigr)\,g \bigr\|_{L^1(m^{-1})} \le \frac12\,\|(\nu + \xi)\,g\|_{L^1(m^{-1})}. \]
By considering the semigroup on $L^1(m^{-1})$ of $B_{\alpha,\delta}(\xi)$ and computing the evolution of the norm in symmetric form using the formula for the differentiation of the complex modulus of a function
\[ \nabla |h| = \frac{\nabla h\,\bar h + h\,\nabla \bar h}{2\,|h|}, \]
it is easily seen that
\[ \bigl\| e^{B_{\alpha,\delta}(\xi)\,t}\,g \bigr\|_{L^1(m^{-1})} \ge \|(\nu + \Re e\,\xi)\,g\|_{L^1(m^{-1})} - \frac12\,\|(\nu + \Re e\,\xi)\,g\|_{L^1(m^{-1})}, \]
and therefore for $\alpha$ close enough to 1 (depending on a lower bound on $\mathrm{dist}(\xi, \nu(\mathbb{R}^N))$), we deduce that
\[ \bigl\| e^{B_{\alpha,\delta}(\xi)\,t}\,g \bigr\|_{L^1(m^{-1})} \ge \frac12\,\|(\nu + \Re e\,\xi)\,g\|_{L^1(m^{-1})}, \]
and thus that the operator is invertible with its inverse bounded by
\[ \bigl\| B_{\alpha,\delta}(\xi)^{-1} \bigr\|_{L^1(m^{-1})} \le \frac{2}{\mathrm{dist}(\Re e\,\xi, \nu(\mathbb{R}^N))}. \]
Moreover, computing separately the evolution of the $L^1(m^{-1})$ norm in non-symmetric form (thus keeping $\nu + \xi$ but creating a term of the form $O(1-\alpha)$ times a $W^{1,1}_1(m^{-1})$ norm) and the evolution of the $W^{1,1}_1(m^{-1})$ norm in symmetric form easily yields
\[ \bigl\| e^{B_{\alpha,\delta}(\xi)\,t}\,g \bigr\|_{W^{1,1}_1(m^{-1})} \ge \frac12\,\|(\nu + \xi)\,g\|_{L^1(m^{-1})} + \frac12\,\|(\nu + \xi)\,\nabla_v g\|_{L^1(m^{-1})}, \]
which implies the result, by dropping the second term.

5.3. Geometry of the essential spectrum and estimates on the eigenvalues.

First concerning the geometry of the spectrum, following the same strategy as in [27, Subsect. 3.2] we can prove

Proposition 5.10. Let us pick any $k, q \in \mathbb{N}$ and $m$ a smooth exponential weight function (as defined in (1.30)). Then for any $\alpha \in [\alpha_2,1]$ the spectrum of $\mathcal{L}_\alpha$ in $W^{k,1}_q(m^{-1})$ is composed of a part included in $\Delta^c_{\mu_\infty}$ containing all possible essential spectrum, and a remaining part included in $\Delta_{\mu_\infty}$ exclusively composed of discrete eigenvalues.

Proof of Proposition 5.10. We follow the same method as in the proof of [27, Prop. 3.4]. One uses the decomposition $\mathcal{L}_\alpha = A_\delta - B_{\alpha,\delta}(0)$, the compactness of the first part $A_\delta$ and the coercivity
\[ \|B_{\alpha,\delta}(0)\,g\|_{L^1(m^{-1})} \ge \|\nu\,g\|_{L^1(m^{-1})} - \varepsilon(\delta)\,\|\nu\,g\|_{L^1(m^{-1})} \]
of the second part (where $\varepsilon(\delta) \to 0$ as $\delta \to 0$). Then one applies Weyl's theorem and shows that (for any $\delta > 0$) $\Delta_{\mu_\infty + \varepsilon(\delta)}$ has to be a Fredholm set with indices $(0,0)$ (except possibly for a countable family of points) since $[a,+\infty)$ is included in the resolvent set for $a$ big enough.

Second concerning the discrete part of the spectrum, that is the isolated eigenvalues with finite multiplicity, following the same strategy as in [27, Proof of Prop. 3.5] we can prove

Proposition 5.11. Let us fix $\bar\mu \in (\mu_2, 0)$. Then for any $\alpha \in [\alpha_2,1]$ (where $\alpha_2$ is obtained from Lemma 5.9 for this choice of $\bar\mu$), for any $\mu \in \Delta_{\bar\mu}$ and $\phi \in W^{1,1}_1$ satisfying $\mathcal{L}_\alpha(\phi) = \mu\,\phi$ in $L^1$, we have
\[ \|\phi\|_{W^{k,1}(m^{-1})} \le C_{k,m}\,\|\phi\|_{L^1_2} \]
for any $k \in \mathbb{N}$ and $m = \exp(-a\,|v|^s)$, $a > 0$, $s \in (0,1)$, where the constant $C_{k,m}$ depends on $k$, $m$ and a lower bound on $\bar\mu - \mu$.
Proof of Proposition 5.11. Let us sketch the idea of the proof. We use the decomposition
\[ 0 = \mathcal{L}_\alpha \phi - \mu\,\phi = A_\delta\,\phi - B_{\alpha,\delta}(\mu)\,\phi \]
and the fact that for the choices made for $\mu$ and $\alpha$ in the assumptions (adjusting $\delta$ as in Lemma 5.9), $B_{\alpha,\delta}(\mu)$ is invertible in any $W^{k,1}(m^{-1})$ with explicit bound, and $A_\delta$ maps $L^1$ into $C^\infty$ functions with compact support.

Remark 5.12. An alternative proof could be to adapt the proof of Proposition 2.7.

5.4. Estimate on the resolvent and global stability of the spectrum.

Lemma 5.13. Let us pick $k, q \in \mathbb{N}$ and $m$ a smooth exponential weight function (as defined in (1.30)) and consider the operator $\mathcal{L}_\alpha$ in $W^{k,1}_q(m^{-1})$. Then

(i) For any $\xi \in R(\mathcal{L}_1)$, there is $\alpha_\xi \in [\alpha_2,1)$ such that $\xi \in R(\mathcal{L}_\alpha)$ for any $\alpha \in [\alpha_\xi,1]$.

(ii) More precisely, the resolvent $R_\alpha(\xi)$ satisfies the following two estimates for $\alpha \in [\alpha_2,1)$:
\[ \|R_\alpha(\xi)\|_{W^{k,1}_q(m^{-1})} \le \frac{C_1 + C_2\,\|R_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}}{1 - C_3\,(1-\alpha)\,\|R_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}}, \]
\[ \|R_\alpha(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{1}{\delta(\xi)}\,\frac{C_1' + C_2'\,\|R_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}}{1 - C_3'\,(1-\alpha)\,\|R_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}}, \]
with $\delta(\xi) := \mathrm{dist}(\xi, \nu(\mathbb{R}^N))$ and where the constants $C_i$, $C_i'$, $i = 1, 2, 3$, depend on a positive lower bound on $\mathrm{dist}(\Re e\,\xi, \nu(\mathbb{R}^N))$.

(iii) Finally, for any compact set $K \subset \rho(\mathcal{L}_1)$ there exist $\alpha_K \in [\alpha_2,1)$, $C_K \in (0,\infty)$ such that
\[ \forall\, \xi \in K,\ \alpha \in (\alpha_K,1] \qquad \|R_\alpha(\xi)\|_{W^{k,1}_q(m^{-1})} \le C_K, \]
\[ \forall\, \xi \in K,\ \alpha, \alpha' \in (\alpha_K,1] \qquad \|R_\alpha(\xi)\,h - R_{\alpha'}(\xi)\,h\|_{\mathbb{L}^1(m^{-1})} \le C_K\,(1-\alpha)\,\|h\|_{W^{3,1}_3}. \]

Proof of Lemma 5.13. We split the proof into three steps.

Step 1. Let us consider the following operator defined from $W^{k,1}_q(m^{-1})$ to $W^{k+1,1}_{q+1}(m^{-1})$ (which is seen to be well-defined at a glance):
\[ I_{\alpha,\delta}(\xi) := -B_{\alpha,\delta}(\xi)^{-1} + R_1(\xi)\,A_\delta\,B_{\alpha,\delta}(\xi)^{-1}. \]
Some straightforward computations show that
\[ (\mathcal{L}_\alpha - \xi)\,I_{\alpha,\delta}(\xi) = -A_\delta\,B_{\alpha,\delta}(\xi)^{-1} + \mathrm{Id} + \bigl[ \mathrm{Id} - P_\alpha R_1(\xi) \bigr]\,A_\delta\,B_{\alpha,\delta}(\xi)^{-1}, \]
which simplifies into
\[ (\mathcal{L}_\alpha - \xi)\,I_{\alpha,\delta}(\xi) =: J_{\alpha,\delta}(\xi) := \mathrm{Id} - P_\alpha R_1(\xi)\,A_\delta\,B_{\alpha,\delta}(\xi)^{-1} =: \mathrm{Id} - K_{\alpha,\delta}(\xi). \]
First using that
\[ \|P_\alpha h\|_{W^{k,1}_q(m^{-1})} \le C\,(1-\alpha)\,\|h\|_{W^{k+1,1}_{q+1}(m^{-1})}, \]
the control of $R_1(\xi)$ in $W^{k+1,1}_{q+1}(m^{-1})$ and the regularization property of $A_\delta$, we deduce that
\[ K_{\alpha,\delta}(\xi) = P_\alpha R_1(\xi)\,A_\delta\,B_{\alpha,\delta}(\xi)^{-1} = O(1-\alpha) \]
in the norm of bounded operators on $W^{k,1}_q(m^{-1})$, and therefore for $(1-\alpha)$ small enough (with explicit bound) we get that
\[ \|K_{\alpha,\delta}(\xi)\|_{W^{k,1}_q(m^{-1})} \le C_3\,(1-\alpha)\,\|R_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})} < 1 \]
and $\mathrm{Id} - K_{\alpha,\delta}(\xi)$ is invertible in $W^{k,1}_q(m^{-1})$. As a consequence
\[ (\mathcal{L}_\alpha - \xi)\,I_{\alpha,\delta}(\xi)\,\bigl( \mathrm{Id} - K_{\alpha,\delta}(\xi) \bigr)^{-1} = \mathrm{Id}_{W^{k,1}_q(m^{-1})} \]
and we have proved that $\mathcal{L}_\alpha - \xi$ admits a right-inverse, namely $I_{\alpha,\delta}(\xi)\,(\mathrm{Id} - K_{\alpha,\delta}(\xi))^{-1}$. This proves that the operator $\mathcal{L}_\alpha - \xi$ is onto.

Step 2. In order to show that $\mathcal{L}_\alpha - \xi$ is invertible and that we have identified the resolvent it remains to prove that it is one-to-one. Let us consider the eigenvalue equation $(\mathcal{L}_\alpha - \xi)\,h = 0$, which is written $(\mathcal{L}_1 - \xi)\,h = P_\alpha h$, from which we deduce (using Proposition 5.11 to get regularity bounds on $h$)
\begin{align*}
\|h\|_{W^{k,1}_q(m^{-1})} &\le \|R_1(\xi)\|_{W^{k,1}_q(m^{-1})}\,\|P_\alpha h\|_{W^{k,1}_q(m^{-1})} \\
 &\le C\,(1-\alpha)\,\|R_1(\xi)\|_{W^{k,1}_q(m^{-1})}\,\|h\|_{W^{k+1,1}_{q+1}(m^{-1})} \\
 &\le C\,(1-\alpha)\,\|R_1(\xi)\|_{W^{k,1}_q(m^{-1})}\,\|h\|_{W^{k,1}_q(m^{-1})}.
\end{align*}
Therefore for $(1-\alpha)$ small enough (depending on the norm of $R_1(\xi)$) we have that necessarily $h = 0$, and thus the operator $(\mathcal{L}_\alpha - \xi)$ is one-to-one. For $\alpha$ satisfying all the previous conditions, the operator $(\mathcal{L}_\alpha - \xi)$ is bijective from $W^{k+1,1}_{q+1}(m^{-1})$ to $W^{k,1}_q(m^{-1})$ and its inverse is given by
\[ R_\alpha(\xi) = I_{\alpha,\delta}(\xi)\,J_{\alpha,\delta}(\xi)^{-1}, \]
from which we get the desired bound on the resolvent thanks to the study of $B_{\alpha,\delta}(\xi)^{-1}$ in Lemma 5.9. At this point we have proved points (i), (ii) and the first estimate in (iii).

Step 3. The second estimate in point (iii) is obtained from the resolvent identity
\[ R_\alpha(\xi) - R_1(\xi) = R_\alpha(\xi)\,[\mathcal{L}_1 - \mathcal{L}_\alpha]\,R_1(\xi), \]
together with the previous estimates on the resolvent and point (iii) in Lemma 5.8.
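The algebra behind Steps 1 and 3 of this proof can be checked line by line from the decomposition $\mathcal{L}_\alpha - \xi = A_\delta - B_{\alpha,\delta}(\xi)$ and the definition $P_\alpha = \mathcal{L}_1 - \mathcal{L}_\alpha$. The sketch below only rearranges identities already stated in the text; the abbreviations $B := B_{\alpha,\delta}(\xi)$ and $K := K_{\alpha,\delta}(\xi)$ are introduced here for brevity, and the Neumann series is the standard (assumed, not quoted) way of inverting $\mathrm{Id} - K$ when $\|K\| < 1$.

```latex
% Shorthand (for this sketch only): B := B_{\alpha,\delta}(\xi), K := P_\alpha R_1(\xi) A_\delta B^{-1}.
\begin{align*}
(\mathcal{L}_\alpha - \xi)\,(-B^{-1})
  &= -(A_\delta - B)\,B^{-1} = -A_\delta B^{-1} + \mathrm{Id}, \\
(\mathcal{L}_\alpha - \xi)\,R_1(\xi)
  &= (\mathcal{L}_1 - \xi)\,R_1(\xi) - P_\alpha R_1(\xi) = \mathrm{Id} - P_\alpha R_1(\xi), \\
(\mathcal{L}_\alpha - \xi)\,I_{\alpha,\delta}(\xi)
  &= -A_\delta B^{-1} + \mathrm{Id} + \bigl[\mathrm{Id} - P_\alpha R_1(\xi)\bigr] A_\delta B^{-1}
   = \mathrm{Id} - K .
\end{align*}
% Inversion of Id - K when \|K\| < 1 (Neumann series):
\[
(\mathrm{Id} - K)^{-1} = \sum_{n \ge 0} K^n, \qquad
\bigl\|(\mathrm{Id} - K)^{-1}\bigr\| \le \frac{1}{1 - \|K\|} .
\]
% Resolvent identity of Step 3:
\[
R_\alpha(\xi) - R_1(\xi)
  = R_\alpha(\xi)\,\bigl[(\mathcal{L}_1 - \xi) - (\mathcal{L}_\alpha - \xi)\bigr]\,R_1(\xi)
  = R_\alpha(\xi)\,[\mathcal{L}_1 - \mathcal{L}_\alpha]\,R_1(\xi) .
\]
```

The bound $1/(1 - \|K\|)$ is the source of the denominators $1 - C_3\,(1-\alpha)\,\|R_1(\xi)\|$ in point (ii) of the lemma.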
Remark that this lemma proves the point (ii) in Theorem 5.2. Moreover, as a consequence of this estimate on the resolvent $R_\alpha(\xi)$, we may go one step further in the localization of the spectrum of $\hat{\mathcal{L}}_\alpha$ around 0.

Corollary 5.14. Let us fix $\bar\mu \in (\mu_2, 0)$. In any $W^{k,1}_q(m^{-1})$ there is some constant $C \in (0,\infty)$ such that
\[ \forall\, \alpha \in [\alpha_2,1], \qquad \Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\mu} \subset B(0, C\,(1-\alpha)). \]
Proof of Corollary 5.14. The proof follows from the estimates in point (ii) of Lemma 5.13, together with the fact that (Proposition 4.1 of [27] in $\mathbb{L}^1(m^{-1})$ extended to $W^{k,1}_q(m^{-1})$ by the previous discussion):
\[ \forall\, \xi \in \Delta_{\bar\mu}, \qquad \|R_1(\xi)\|_{W^{k,1}_q(m^{-1})} \le a + \frac{b}{|\xi|} \]
for some explicit constants $a, b > 0$. We get thus that $\|R_\alpha(\xi)\|_{W^{k,1}_q(m^{-1})} < \infty$ if $\xi \in \Delta_{\bar\mu}$ and $|\xi| \ge C\,(1-\alpha)$, which concludes the proof.

5.5. Fine study of spectrum close to 0.

Let us fix $r \in (0, |\bar\mu|]$ and let us choose any $\alpha_r \in [\alpha_2,1)$ such that $C\,(1-\alpha_r) < r$ (with the notations of Corollary 5.14) in such a way that $\Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\lambda} \subset B(0,r)$ for any $\alpha \in [\alpha_r,1]$. We may then define the spectral projection operator (see [21])
\[ \Pi_\alpha := -\frac{1}{2\pi i} \int_{S(0,r)} R_\alpha(\zeta)\,d\zeta \tag{5.2} \]
in any $W^{k,1}_q(m^{-1})$, with $S(0,r) := \{\xi \in \mathbb{C},\ |\xi| = r\}$. The operator $\Pi_\alpha$ is the projection operator on the sum of eigenspaces associated to eigenvalues lying in the half plane $\{\xi \in \mathbb{C},\ \Re e\,\xi \ge -r\}$, see [21]. In particular the operator $\Pi_1$ is the projection on the energy eigenline $\mathbb{R}\,\phi_1$, where we recall that $\phi_1$ is the energy eigenfunction defined by (1.35).

Lemma 5.15. The operator $\Pi_\alpha$ satisfies

(i) For any $k \in \mathbb{N}$ and any exponential weight function $m$ (as defined in (1.30)), it is well-defined and bounded in $W^{k,1}_q(m^{-1})$.

(ii) Moreover there is a constant $C > 0$ (depending on $m$) such that
\[ \forall\, \alpha, \alpha' \in [\alpha_r,1] \qquad \|\Pi_\alpha - \Pi_{\alpha'}\|_{W^{3,1}_3(m^{-1}) \to L^1(m^{-1})} \le C\,|\alpha' - \alpha|. \tag{5.3} \]

Proof of Lemma 5.15. It is a straightforward consequence of (5.2) and Lemma 5.13.

Corollary 5.16. There exists $\alpha_3 \in [\alpha_2,1)$ such that for any $\alpha \in [\alpha_3,1)$ there holds
\[ \Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\mu} = \{\mu_\alpha\} \]
and the eigenspace associated to $\mu_\alpha \in \mathbb{R}$ is 1-dimensional. This eigenvalue is called the energy eigenvalue. We may furthermore remark that Corollary 5.14 implies
\[ \forall\, \alpha \in [\alpha_3,1) \qquad |\mu_\alpha| \le C\,(1-\alpha). \tag{5.4} \]

Proof of Corollary 5.16. We already know that $\Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\mu}$ is entirely composed of discrete spectrum. Therefore we have to prove that it is of dimension 1. Indeed once this is proved, the fact that $\mu_\alpha \in \mathbb{R}$ is trivial since the operator is real, and the control (5.4) is trivial from Corollary 5.14. Let us define the space $X_\alpha := \Pi_\alpha(L^1(m^{-1})) + \Pi_1(L^1(m^{-1}))$ endowed with the norm $\|\cdot\|_{L^1(m^{-1})}$. From Proposition 5.11, there exists a constant $C_1 > 0$ such that
\[ \forall\, \psi \in X_\alpha, \qquad \|\psi\|_{W^{3,1}_3(m^{-1})} \le C_1\,\|\psi\|_{L^1(m^{-1})}. \]
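Thanks to the definition of $\Pi_\alpha$ and $\Pi_1$ and to Lemma 5.15, we then get
\begin{align*}
\|\Pi_\alpha - \Pi_1\|_{X_\alpha \to X_\alpha} &\le C_2 \sup_{\psi \in X_\alpha}\ \sup_{z \in S(0,r)} \frac{\|(R_\alpha(z) - R_1(z))\,\psi\|_{L^1(m^{-1})}}{\|\psi\|_{L^1(m^{-1})}} \\
 &\le C_2\,(1-\alpha) \sup_{\psi \in X_\alpha} \frac{\|\psi\|_{W^{3,1}_3(m^{-1})}}{\|\psi\|_{L^1(m^{-1})}} \\
 &\le C_2\,(1-\alpha) < 1,
\end{align*}
for $(1-\alpha)$ small enough. By classical operator theory (see for instance the arguments presented in [21, Chap. 1, paragraph 4.6] in order to prove [21, Lemma 4.10]) one deduces that
\[ \mathrm{dimension}(\Pi_\alpha) = \mathrm{dimension}(\Pi_1). \]
Since $\mathrm{dimension}(\Pi_1) = 1$ (as recalled in Theorem 5.1), this concludes the proof.

Let us introduce for any $\psi \in L^1$ the decomposition
\[ \psi = \Pi_1 \psi + \Pi_1^\perp \psi = (\pi_1 \psi)\,\phi_1 + \Pi_1^\perp \psi, \]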
where π1 ψ ∈ R is the coordinate of 1 ψ on R φ1 (defined thanks to the projection 1 ). For any α ∈ [α3 , 1) we denote by φα the unique eigenfunction associated to µα such that φα L 1 = 1 and π1 φα ≥ 0. 2 We can now establish a first order approximation of the eigenfunction φα . Lemma 5.17. For any k, q ∈ N and any exponential weight function m (as defined in (1.30)), there exists C such that ∀ α ∈ [α3 , 1]
φα − φ1 W k,1 (m −1 ) ≤ C (1 − α).
(5.5)
q
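Remark 5.18. We immediately deduce from Lemma 5.17 that $\phi_\alpha(0) < 0$ for $\alpha$ close enough to 1, and therefore this definition of $\phi_\alpha$ coincides with the definition in Theorem 1.1.

Proof of Lemma 5.17. On the one hand, from the normalization conditions, we have
\[ \|\phi_1 - \Pi_1 \phi_\alpha\|_{L^1_2} = |1 - \pi_1 \phi_\alpha| = \Big| \|\phi_\alpha\|_{L^1_2} - \|\Pi_1 \phi_\alpha\|_{L^1_2} \Big| \le \|\phi_\alpha - \Pi_1 \phi_\alpha\|_{L^1_2} = \|\Pi_1^\perp \phi_\alpha\|_{L^1_2}. \]
We then deduce
\[ \|\phi_1 - \phi_\alpha\|_{L^1_2} \le \|\phi_1 - \Pi_1 \phi_\alpha\|_{L^1_2} + \|\Pi_1 \phi_\alpha - \phi_\alpha\|_{L^1_2} \le 2\, \|\Pi_1^\perp \phi_\alpha\|_{L^1_2}. \tag{5.6} \]
On the other hand, the eigenfunction $\phi_\alpha$ satisfies
\[ \hat{\mathcal L}_1(\phi_\alpha) = [\hat{\mathcal L}_1(\phi_\alpha) - \hat{\mathcal L}_\alpha(\phi_\alpha)] + \mu_\alpha\, \phi_\alpha. \]
Recall that from Proposition 5.11 one has uniform bounds in $W^{\infty,1}_\infty(m^{-1})$ on $\phi_\alpha$ in terms of its $L^1_2$ norm, which has been fixed to 1, so that for any $\alpha \in [\alpha_3, 1]$, $\|\phi_\alpha\|_{W^{k,1}_q(m^{-1})} \le C$. Using Proposition 3.1 and Proposition 5.11 we get
\[ \|\hat{\mathcal L}_1 \phi_\alpha\|_{L^1(m^{-1})} = O(1-\alpha). \]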
Spatially Homogeneous Boltzmann Equation for Inelastic Hard Spheres
483
Using that $\hat{\mathcal L}_1$ is invertible from $\Pi_1^\perp L^1(m^{-1})$ to $L^1(m^{-1})$ we deduce that
\[ \|\Pi_1^\perp \phi_\alpha\|_{L^1(m^{-1})} = O(1-\alpha). \tag{5.7} \]
We conclude that (5.5) holds for the $L^1_2$ norm by gathering (5.6) and (5.7):
\[ \forall\, \alpha \in [\alpha_3, 1], \qquad \|\phi_\alpha - \phi_1\|_{L^1_2} \le C\,(1-\alpha). \]
Let us now consider the eigenfunctions α associated to µα for α ∈ [α3 , 1] such that π1 α > 0 with the normalization condition α W k,1 (m −1 ) = 1. Proceeding similarly as before (by working in the space W k,1 (m −1 )), we can get α − 1 W k,1 (m −1 ) = O(1 − α). Because the eigenspace associated to µα is of dimension 1, we have α = cα φα for some constant cα ∈ (0, ∞). Then |c1 − cα | = cα φα − c1 φα L 1 ≤ α − 1 L 1 + |c1 | φ1 − φα L 1 = O(1 − α). 2
2
We then easily conclude that (5.5) holds for any
W k,1 (m −1 )
2
norm.
We now use the linearized energy dissipation equation to get a second order expansion of the eigenvalue.

Lemma 5.19. For $\alpha \in [\alpha_3, 1]$, the eigenvalues $\mu_\alpha$ satisfy (with explicit bound)
\[ \mu_\alpha = -\rho\,(1-\alpha) + O\big((1-\alpha)^2\big). \]

Proof of Lemma 5.19. By integrating the eigenvalue equation $\hat{\mathcal L}_\alpha \phi_\alpha = \mu_\alpha \phi_\alpha$ against $|v|^2$ and dividing it by $(1-\alpha)$, we get
\[ \frac{\mu_\alpha}{1-\alpha}\, \mathcal E(\phi_\alpha) = 2\rho\, \mathcal E(\phi_\alpha) - 2\,(1+\alpha)\, \tilde D(\bar G_\alpha, \phi_\alpha). \]
Using the rate of convergence of $\bar G_\alpha \to \bar G_1$ and $\phi_\alpha \to \phi_1$ established in Lemma 4.4 and Lemma 5.17 we deduce that
\[ \frac{\mu_\alpha}{1-\alpha}\, \mathcal E(\phi_1) = 2\rho\, \mathcal E(\phi_1) - 4\, \tilde D(\bar G_1, \phi_1) + O(1-\alpha). \tag{5.8} \]
Then we compute, thanks to (A.1) and (A.2),
\[ \mathcal E(\phi_1) = 2\, N c_0\, \rho\, \bar\theta_1^2, \tag{5.9} \]
where $c_0$ is still the normalizing constant in (1.35) such that $\|\phi_1\|_{L^1_2} = 1$. Similarly, using (A.3), (A.4) and the relation (1.29), which links $b_1$ and $\bar\theta_1$, we find
\[ \tilde D(\bar G_1, \phi_1) = \frac{3}{2}\, N c_0\, \rho^2\, \bar\theta_1^2. \tag{5.10} \]
We conclude by gathering (5.8), (5.9) and (5.10).
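As a quick check of how the three identities combine, the following worked computation (a sketch that uses only (5.8), (5.9) and (5.10) as stated, with $\tilde D(\bar G_1,\phi_1) = \tfrac32 N c_0 \rho^2 \bar\theta_1^2$) shows where the coefficient $-\rho$ comes from. It assumes $c_0 \neq 0$, so that one may divide by $\mathcal E(\phi_1)$.

```latex
\begin{aligned}
\frac{\mu_\alpha}{1-\alpha}
  &= 2\rho \;-\; \frac{4\,\tilde D(\bar G_1,\phi_1)}{\mathcal E(\phi_1)} \;+\; O(1-\alpha) \\
  &= 2\rho \;-\; \frac{4\cdot\tfrac{3}{2}\,N c_0\,\rho^2\,\bar\theta_1^2}
                      {2\,N c_0\,\rho\,\bar\theta_1^2} \;+\; O(1-\alpha) \\
  &= 2\rho - 3\rho + O(1-\alpha) \;=\; -\rho + O(1-\alpha),
\end{aligned}
\qquad\text{hence}\qquad
\mu_\alpha = -\rho\,(1-\alpha) + O\big((1-\alpha)^2\big).
```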
5.6. The map α → G¯ α is C 1 . The fact that the path of self-similar profiles α → G¯ α is C 0 on [α3 , 1] and C 1 at α = 1 was already proved in Lemma 4.4. Therefore we have to prove that it is C 1 for α ∈ [α3 , 1). Let us define the functional (α, g) → (α, g) := Q α (g, g) − τα ∇v (v g). The map is C 1 from R × (W11,1 (m −1 ) ∩ Cρ,0 ) into L1 (m −1 ) and it is such that for any α ∈ [α1 , 1), the equation (α, g) = 0 has only one solution which is the profile G¯ α . Moreover, for any α ∈ [α3 , 1), the linearized operator D2 (α, G¯ α ) = Lα is invertible from W1,1 (m −1 ) into L1 (m −1 ) because of the spectral properties of Lα established in Theorem 5.2 (i) & (ii) (note that here there is no eigenvalue approaching 0 at α). Then using the same strategy as in Subsect. 4.2 based on the implicit function theorem we easily conclude that α → G¯ α is C 1 from [α3 , 1) into L 1 (m −1 ). That ends the proof of Theorem 1.1 (ii). 5.7. Decay estimate on the semigroup. We start with a lemma dealing with semigroups in Banach spaces. This result is a tool for deriving constructive decay rate on non sectorial semigroups, assuming some precise estimates on the resolvent of their generator. We do not try to prove such a decay rate for the semigroup in the norm of the ambiant Banach space but instead in a weaker norm (corresponding to the norm of the graph of some power of its generator), which shall be sufficient for our study of the linearized stability of (1.31). Lemma 5.20. Let A be a closed unbounded operator on a Banach space E with dense domain dom(A). We denote by S(t) the associated semigroup, by R(A) the associated resolvent set and by R = R(ξ ) the resolvent operator defined on R(A). Assume that we have a sequence of Banach spaces E 2 ⊂ E 1 ⊂ E 0 = E decreasing for inclusion (in most cases this sequence shall be provided by E k = dom(Ak ) endowed with the norm of the graph of Ak ). 
We assume on the operator that:
(i) the resolvent set $R(A)$ contains the half plane $\Delta_a$ for some $a \in \mathbb R$, together with the estimates
\[ \sup_{s \in \mathbb R} \|R(a + i s)\|_{E_0 \to E_0} \le C_1 \]
and
\[ \forall\, s \in \mathbb R, \qquad \|R(a + i s)\|_{E_1 \to E_0},\ \|R(a + i s)\|_{E_2 \to E_1} \le \frac{C_2}{1 + |s|} \]
for some constants $C_1, C_2 > 0$;
(ii) the semigroup $S(t)$ satisfies
\[ \forall\, t \ge 0, \qquad \|S(t)\|_{E_2 \to E_0} \le C_3\, e^{b t} \tag{5.11} \]
for some constants $C_3, b > 0$.
Then for any $a' > a$, there exists a constant $C_4$ depending only on $a, b, a', C_1, C_2, C_3$ such that
\[ \forall\, t \ge 0, \qquad \|S(t)\|_{E_2 \to E_0} \le C_4\, e^{a' t}. \tag{5.12} \]
Proof of Lemma 5.20. We split the proof into two parts.

Step 1. The first bound on the resolvent implies that for any $x \in E_0$,
\[ \|R(a + i s)\, x\|_{E_0} \to 0, \qquad |s| \to \infty. \]
Indeed we first consider $x \in \mathrm{dom}(A)$ and then argue by density (since the domain $\mathrm{dom}(A)$ is dense). When $x \in \mathrm{dom}(A)$ the result is proved by the relation
\[ R(z)\, x = z^{-1}\, [-\mathrm{Id} + R(z)\, A]\, x. \]

Step 2. Then consider the following integral of $R(z)x$ on a vertical segment with real part $a$ (for some $M > 0$):
\[ I_M(x) := \int_{a - iM}^{a + iM} e^{zt}\, R(z)\, x\, dz. \]
The function $z \mapsto R(z)$ is differentiable on this segment and we can perform an integration by parts:
\[ I_M(x) = \frac{e^{(a+iM)t}}{t}\, R(a + iM)\, x - \frac{e^{(a-iM)t}}{t}\, R(a - iM)\, x - \int_{a-iM}^{a+iM} \frac{e^{zt}}{t}\, R(z)^2\, x\, dz, \]
where we have used $R'(z) = R(z)^2$. Now we estimate the $E_0$ norm of this quantity:
\[ \|I_M(x)\|_{E_0} \le \Big\| \frac{e^{(a+iM)t}}{t}\, R(a + iM)\, x \Big\|_{E_0} + \Big\| \frac{e^{(a-iM)t}}{t}\, R(a - iM)\, x \Big\|_{E_0} + \frac{e^{at}}{t}\, C_2^2 \left( \int_{-\infty}^{+\infty} \frac{ds}{(1 + |s|)^2} \right) \|x\|_{E_2}. \]
Therefore the integral is semi-convergent and we can pass to the limit $M \to +\infty$ and use (see [4,32]) that
\[ S(t)\, x = \frac{1}{2 i \pi} \lim_{M \to \infty} I_M = \frac{1}{2 i \pi} \lim_{M \to \infty} \int_{a-iM}^{a+iM} e^{zt}\, R(z)\, x\, dz \]
to obtain (the two boundary terms go to 0 as $M \to +\infty$ from the first step)
\[ \|S(t)\, x\|_{E_0} \le C_2'\, \frac{e^{at}}{t}\, \|x\|_{E_2}, \qquad \text{with } C_2' = C_2^2 \int_{-\infty}^{+\infty} \frac{ds}{(1 + |s|)^2}. \tag{5.13} \]
Using (5.11) for $t \le 1$ and (5.13) for $t \ge 1$, we conclude that (5.12) holds with $C_4 = \max(C_2', C_3\, e^{b - a})$.

Proof of point (iii) in Theorem 5.2. The point (ii) of Theorem 5.2 was proved in Lemma 5.13 and it shows that the operator $\bar{\mathcal L}_\alpha = (\mathrm{Id} - \Pi_\alpha)\, \hat{\mathcal L}_\alpha$, together with the sequence of Banach spaces $E_i = W^{k+i,1}_i(m^{-1})$, $i = 0, 1, 2$, for any fixed $k \in \mathbb N$ and any exponential weight function $m$ (as defined in (1.30)), satisfies the assumption (i) of Lemma 5.20 for any $a \in (\mu_2, 0)$. Moreover it is trivial to prove that it satisfies the assumption (ii) of Lemma 5.20 for some explicit $b > 0$ from the decomposition $\mathcal L_\alpha = A_\delta - B_{\alpha,\delta}(\xi)$ already introduced.
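Returning to the last step of the proof of Lemma 5.20, the constant in (5.13) and the patching of the two time regimes can be made explicit. The short computation below is only a sketch: it assumes $b \ge a'$ (automatic when $a' \le 0 < b$, as in the application to Theorem 5.2) and writes $C_2'$ for the constant appearing in (5.13).

```latex
\begin{aligned}
&\int_{-\infty}^{+\infty}\frac{ds}{(1+|s|)^2}
   = 2\int_0^{+\infty}\frac{ds}{(1+s)^2}
   = 2\Big[-\frac{1}{1+s}\Big]_0^{+\infty} = 2,
   \qquad\text{so } C_2' = 2\,C_2^2; \\[4pt]
&t \ge 1:\quad \|S(t)x\|_{E_0} \le C_2'\,\frac{e^{at}}{t}\,\|x\|_{E_2}
   \le C_2'\,e^{a't}\,\|x\|_{E_2}
   \quad(\text{since } 1/t \le 1 \text{ and } a < a'); \\[4pt]
&0 \le t \le 1:\quad \|S(t)x\|_{E_0} \le C_3\,e^{bt}\,\|x\|_{E_2}
   = C_3\,e^{(b-a')t}\,e^{a't}\,\|x\|_{E_2}
   \le C_3\,e^{b-a'}\,e^{a't}\,\|x\|_{E_2}.
\end{aligned}
```

Since $e^{b-a'} \le e^{b-a}$, the choice $C_4 = \max(C_2', C_3\, e^{b-a})$ is also admissible.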
6. Convergence to the Self-Similar Profile

In this section, we consider the nonlinear rescaled equation (1.31) and we prove the convergence of its solutions to the self-similar profile. As a preliminary step let us recall some results from [24, Proposition 3.1, Theorem 3.5, Theorem 3.6] about propagation and appearance of moments and regularity.

Lemma 6.1. Let us consider $g_{in} \in L^1_3 \cap \mathcal C_{\rho,0}$ and the associated solution $g \in C([0,\infty); L^1_3)$ to the rescaled equation (1.31). Then
(i) For any exponential moment weight $m$ (as defined in (1.30)) with exponent $s \in (0, 1/2)$ and any time $t_0 \in (0,\infty)$, there exists a constant $M_1 = M_1(t_0)$ such that
\[ \sup_{[t_0,\infty)} \|g(t,\cdot)\|_{L^1(m^{-1})} \le M_1. \tag{6.1} \]
Moreover, if $g_{in} \in L^1(m^{-1})$ for some polynomial or exponential (with exponent $s \in (0,1)$) moment weight $m$, then (6.1) holds (for this weight $m$) with $t_0 = 0$ and some constant $M_1 = M_1(\|g_{in}\|_{L^1(m^{-1})})$.
For the following two points we now assume that for some constants $c_1, T \in (0,\infty)$ there holds
\[ \inf_{[0,T]} \mathcal E(g(t,\cdot)) \ge c_1, \tag{6.2} \]
and we state some smoothness properties of the solution $g$ which depend on $c_1$ but neither on $T$ nor on $\alpha$.
(ii) Assume (6.2). Then for any $k_0 \in \mathbb N$ there is $q_0 = q_0(k_0) \in \mathbb N$ such that if $\|g_{in}\|_{H^{k_0} \cap L^1_{q_0}} \le C_0$ holds, then for any $c_1 \in (0,\infty)$ there exists $C_1 = C_1(C_0, c_1) \in (0,\infty)$ such that for any time $T \in (0,\infty)$, we have
\[ \forall\, t \in [0,T], \qquad \|g(t,\cdot)\|_{H^{k_1}} \le C_1, \tag{6.3} \]
with $k_1 = 0$ if $k_0 = 0$ and $k_1 = k_0 - 1$ if $k_0 \in \mathbb N^*$.
(iii) Assume (6.2) and that $g_{in} \in L^2$, with $\|g_{in}\|_{L^2 \cap L^1_3} \le M_1 \in (0,\infty)$. Then there exists $\lambda \in (-\infty, 0)$ and, for any exponential weight function $m$ with exponent $s \in (0,1/2)$ and any $k \in \mathbb N$, there exists a constant $K$ (which depends on $\rho, c_1, M_1, k, m$) such that we may split $g = g^S + g^R$ with
\[ \forall\, t \in [0,T], \qquad \|g^S(t,\cdot)\|_{H^k \cap L^1(m^{-1})} \le K, \qquad \|g^R(t,\cdot)\|_{L^1_3} \le K\, e^{\lambda t}. \tag{6.4} \]
Remark 6.2. It is worth mentioning that these estimates are uniform with respect to the inelasticity parameter α ∈ (0, 1). Indeed, on the one hand, this was already the case for the moment estimate (6.1) in [24, Prop. 3.1]. On the other hand (6.3) and (6.4) from [24, Theorem 3.5, Theorem 3.6] were (partially) based on the use of the damping effect of the anti-drift term (whose coefficient was fixed to τ = 1). Here the damping effect of the anti-drift term vanishes (τα → 0) but it is replaced (as for the elastic Boltzmann equation) by the lower bound on the energy (6.2) which allows for a control from below on the convolution term L(g) appearing in the loss term of the collision operator (see Lemma 2.3), which is enough to conclude also in this case.
6.1. Local linearized asymptotic stability. Let us consider the nonlinear evolution equation (1.31) in $L^1(m^{-1}) \cap H^k$, and the associated equation on the fluctuation $h$ of a solution $g$ around the unique equilibrium $\bar G_\alpha$:
\[ g = \bar G_\alpha + h \qquad \text{and} \qquad \partial_t h = \mathcal L_\alpha h + Q_\alpha(h,h). \]
Let us start by stating an inequality that we shall need in the sequel.

Lemma 6.3. For any exponential weight function $m$ (as defined in (1.30)), there is a constant $C \in (0,\infty)$ such that for any $h \in W^{3,1}_3(m^{-1})$ and any $\alpha \in (0,1)$,
\[ \|\Pi_\alpha Q_\alpha(h,h)\|_{L^1(m^{-1})} \le C\,(1-\alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})}. \]

Proof of Lemma 6.3. We write
\[ \Pi_\alpha Q_\alpha(h,h) = \Pi_\alpha (Q_\alpha(h,h) - Q_1(h,h)) + (\Pi_\alpha - \Pi_1)\, Q_1(h,h). \]
On the one hand, from Lemma 5.15 (i) and (3.4), there is $C \in (0,\infty)$ such that
\[ \|\Pi_\alpha (Q_\alpha(h,h) - Q_1(h,h))\|_{L^1(m^{-1})} \le C\,(1-\alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})}. \]
On the other hand, from (5.3) and (3.1), we get
\[ \|(\Pi_\alpha - \Pi_1)\, Q_1(h,h)\|_{L^1(m^{-1})} \le C\,(1-\alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})}. \]
The proof of the lemma is immediate by gathering the two previous estimates.
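The decomposition used in the proof of Lemma 6.3 silently drops a third term, $\Pi_1 Q_1(h,h)$. A plausible justification is sketched below; it rests on the assumption (not restated in this section) that the coordinate $\pi_1 \psi$ on the energy eigenline is proportional to the energy $\int \psi\, |v|^2\, dv$, together with the conservation of energy by the elastic collision operator $Q_1$.

```latex
\begin{aligned}
\Pi_\alpha Q_\alpha(h,h)
  &= \Pi_\alpha\big(Q_\alpha(h,h) - Q_1(h,h)\big)
   + (\Pi_\alpha - \Pi_1)\,Q_1(h,h)
   + \Pi_1 Q_1(h,h), \\[4pt]
\Pi_1 Q_1(h,h)
  &= \pi_1\big(Q_1(h,h)\big)\,\phi_1 = 0
  \qquad\text{because}\qquad
  \int_{\mathbb R^N} Q_1(h,h)\,|v|^2\,dv = 0 .
\end{aligned}
```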
We now state a first local linearized stability result.

Proposition 6.4. For any $\alpha \in [\alpha_3, 1)$, the self-similar profile $\bar G_\alpha$ is locally asymptotically stable, with a domain of stability uniform with respect to $\alpha \in [\alpha_3, 1)$. More precisely, let us fix $\rho \in (0,\infty)$ and some exponential weight function $m$ as in (1.30). There are $k_1, q_1 \in \mathbb N^*$ such that for any $M_0 \in (0,\infty)$ there exist $C, \varepsilon \in (0,\infty)$ such that for any $\alpha \in [\alpha_3, 1]$ and any $g_{in} \in H^{k_1} \cap L^1(m^{-q_1})$ with mass $\rho$ and momentum 0 satisfying
\[ \|g_{in}\|_{H^{k_1} \cap L^1(m^{-q_1})} \le M_0, \qquad \|g_{in} - \bar G_\alpha\|_{L^1(m^{-1})} \le \varepsilon, \tag{6.5} \]
the solution $g$ to the rescaled equation (1.31) with initial datum $g_{in}$ satisfies
\[ \forall\, t \ge 0, \qquad \|\Pi_\alpha (g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, \|g_{in} - \bar G_\alpha\|_{L^1(m^{-1})}\, e^{\mu_\alpha t}, \tag{6.6} \]
\[ \forall\, t \ge 0, \qquad \|(\mathrm{Id} - \Pi_\alpha)(g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, \|g_{in} - \bar G_\alpha\|_{L^1(m^{-1})}\, e^{(3/2)\, \mu_\alpha t}. \tag{6.7} \]
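Proof of Proposition 6.4. Step 1. Let us first denote by $c_1$ the constant given in Step 5 of Proposition 2.1 such that
\[ \forall\, \alpha \in [\alpha_1, 1), \qquad \mathcal E(\bar G_\alpha) \ge 2\, c_1. \]
We may then fix $\varepsilon_0 \in (0,\infty)$ in such a way that
\[ \|g - \bar G_\alpha\|_{L^1(m^{-1})} \le \varepsilon_0 \quad \text{implies} \quad \mathcal E(g) \ge c_1, \tag{6.8} \]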
and define
\[ T_* := \sup \{ T;\ \mathcal E(g_t) \ge c_1 \ \ \forall\, t \in [0,T] \} \in (0,\infty]. \]
From Lemma 6.1 (i) & (ii), there exists $M \in (0,\infty)$ (depending on $\rho, c_1, k_1, q_1, M_0$) such that for any $T \in (0,\infty)$ there holds
\[ \sup_{t \in [0,T_*]} \|g\|_{H^{k_1} \cap L^1(m^{-q_1})} \le M. \tag{6.9} \]
Let us now consider the fluctuation $h_t = g_t - \bar G_\alpha$. Thanks to the mass and momentum conservations, it satisfies $h_t \in \mathcal C_{0,0}$ for all times, as well as the bound (6.9). We define the following decomposition of $h$: $h^1 = \Pi_\alpha h$ and $h^2 = (\mathrm{Id} - \Pi_\alpha)\, h =: \Pi_\alpha^\perp h$. Since the spectral projection $\Pi_\alpha$ commutes with the linearized operator $\mathcal L_\alpha$, the equation on $h^1$ is written
\[ \partial_t h^1 = \mu_\alpha\, h^1 + \Pi_\alpha Q_\alpha(h,h). \]
Multiplying that equation by $(\mathrm{sign}\, h^1)\, m^{-1}$ and integrating in the velocity variable, we deduce thanks to Lemma 6.3 and to (B.2), (6.9) that on $(0, T_*)$ the following holds:
\[ \frac{d}{dt} \|h^1\|_{L^1(m^{-1})} \le \mu_\alpha\, \|h^1\|_{L^1(m^{-1})} + C\,(1-\alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})} \le (1-\alpha) \Big[ C_1\, \|h^1\|^{3/2}_{L^1_2} + C_1\, \|h^2\|^{3/2}_{L^1_2} - C_2\, \|h^1\|_{L^1(m^{-1})} \Big], \tag{6.10} \]
for some constant $C_1$ depending on $M$ and the possible choice $C_2 = \rho/2$. For the second part $h^2$ we have the following equation:
\[ \partial_t h^2 = \Pi_\alpha^\perp \mathcal L_\alpha h^2 + \Pi_\alpha^\perp Q_\alpha(h,h). \]
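Since the linearized operator $\mathcal L_\alpha$ restricted to $\Pi_\alpha^\perp$ generates the semigroup $R_\alpha(t)$ defined in point (iii) of Theorem 5.2, the Duhamel formula reads
\[ h^2(t) = R_\alpha(t)\, h_{in} + \int_0^t R_\alpha(t-s)\, \Pi_\alpha^\perp Q_\alpha(h,h)(s)\, ds. \]
From (5.1) and (3.1) we have
\[ \|h^2(t)\|_{L^1(m^{-1})} \le C\, e^{\bar\mu t}\, \|h_{in}\|_{L^1(m^{-1})} + C \int_0^t e^{\bar\mu (t-s)}\, \|h(s)\|^2_{W^{2,1}_2(m^{-1})}\, ds. \]
We deduce
\[ \|h^2_t\|_{L^1(m^{-1})} \le C_3\, e^{\bar\mu t}\, \|h^2_{in}\|_{L^1(m^{-1})} + C_4 \int_0^t e^{\bar\mu (t-s)} \Big( \|h^1_s\|^{3/2}_{L^1(m^{-1})} + \|h^2_s\|^{3/2}_{L^1(m^{-1})} \Big)\, ds \tag{6.11} \]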
with $C_4$ depending on $M$ thanks to (B.2) and (6.9). It is then easy to show by comparison arguments from (6.10) and (6.11) that there are $0 < \varepsilon_2 \le \varepsilon_1 \le \varepsilon_0$ (one can take for instance $\varepsilon_1 \le \varepsilon_0/2$ satisfying $2\, C_1\, \varepsilon_1^{1/2} < C_2$ and $2\, C_4\, \varepsilon_1^{1/2} < 1/2$, and next $\varepsilon_2 \le \varepsilon_1$ satisfying $C_3\, \varepsilon_2 < \varepsilon_1/2$) such that
\[ \|h^1_{in}\|_{L^1(m^{-1})} + \|h^2_{in}\|_{L^1(m^{-1})} \le \varepsilon_2 \quad \text{implies} \quad \sup_{t \in [0,T_*]} \max \big\{ \|h^1_t\|_{L^1(m^{-1})},\ \|h^2_t\|_{L^1(m^{-1})} \big\} \le \varepsilon_1. \tag{6.12} \]
Gathering (6.8) and (6.12) we deduce that there exists $\varepsilon \in (0, \varepsilon_2)$ such that under condition (6.5) there holds $T_* = \infty$ as well as
\[ \sup_{t \in (0,\infty)} \|g - \bar G_\alpha\|_{L^1(m^{-1})} \le 2\, \varepsilon_1 \le \varepsilon_0. \]
Step 2. In a second step, coming back to (6.11) and to the integral version of (6.10) and setting $y(t) = \|h^1_t\|_{L^1(m^{-1})} + |\mu_\alpha|\, \|h^2_t\|_{L^1(m^{-1})}$, we obtain
\[ y(t) \le C_5\, e^{\mu_\alpha t}\, y(0) + C_6\, |\mu_\alpha| \int_0^t e^{\mu_\alpha (t-s)}\, y(s)^{3/2}\, ds. \tag{6.13} \]
Then we have the following variant of the Gronwall lemma whose proof is the same as the one of [27, Lemma 4.5] and is therefore skipped:

Lemma 6.5. Let $y = y(t)$ be a nonnegative continuous function on $\mathbb R_+$ such that for some constants $a, b, \theta, \mu > 0$,
\[ y(t) \le a\, e^{-\mu t}\, X + b \int_0^t e^{-\mu (t-s)}\, y(s)^{1+\theta}\, ds \]
(as compared to [27, Lemma 4.5], $X$ need not be $y(0)$). Then if $X$ and $b$ are small enough, we have
\[ y(t) \le C\, X\, e^{-\mu t} \]
for some explicit constant $C > 0$.

Thanks to the uniform smallness estimate on $y(t)$ we can apply the lemma with $\theta = 1/4$ for instance, and we get
\[ y(t) \le C_7\, y(0)\, e^{\mu_\alpha t}, \]
from which we deduce the estimate (6.6) for the $h^1$ part of $g - \bar G_\alpha$. Finally, we may insert that estimate on $h^1$ in (6.11) and we get
\[ \|h^2(t)\|_{L^1(m^{-1})} \le C_3\, \big( e^{\bar\mu t} + e^{(3/2)\, \mu_\alpha t} \big)\, \|h_{in}\|_{L^1(m^{-1})} + C_4 \int_0^t e^{\bar\mu (t-s)}\, \|h^2(s)\|^{3/2}_{L^1(m^{-1})}\, ds. \]
The same kind of computation yields $\|h^2_t\| \le C_8\, e^{(3/2)\, \mu_\alpha t}\, \|h(0)\|$, from which (6.7) follows.
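Since the proof of Lemma 6.5 is skipped, here is a sketch of one standard way to obtain such a bound by a continuity (bootstrap) argument. This is not claimed to be the argument of [27]; the auxiliary quantities $z$ and $Z$ and the explicit smallness condition are introduced here for illustration only, and $X > 0$ is assumed.

```latex
\begin{aligned}
&z(t) := e^{\mu t}\,y(t), \qquad Z(T) := \sup_{0\le t\le T} z(t), \\[2pt]
&z(t) \;\le\; aX + b\int_0^t e^{\mu s}\,y(s)^{1+\theta}\,ds
      \;=\; aX + b\int_0^t e^{-\theta\mu s}\,z(s)^{1+\theta}\,ds
      \;\le\; aX + \frac{b}{\theta\mu}\,Z(T)^{1+\theta}, \\[2pt]
&\text{if } 2^{1+\theta}\,b\,(aX)^{\theta} < \theta\mu \ \text{ and } \ Z(T)\le 2aX, \ \text{ then }\
  Z(T) \;\le\; aX\Big(1 + \frac{2^{1+\theta}\,b\,(aX)^{\theta}}{\theta\mu}\Big) \;<\; 2aX .
\end{aligned}
```

Because $Z(0) = y(0) \le aX$ and $Z$ is continuous and nondecreasing, the strict inequality prevents $Z$ from ever reaching $2aX$, so $y(t) \le 2a\, X\, e^{-\mu t}$ for all $t \ge 0$, that is, the conclusion holds with $C = 2a$ under this smallness condition.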
6.2. Nonlinear stability estimates. In this subsection we shall prove that when the inelasticity is small enough, depending on the size of the initial datum (but not on the distance between the initial datum and the self-similar profile), Eq. (1.31) is stable. This mainly relies on the fact that the entropy production timescale is much faster than the energy dissipation timescale as $\alpha \to 1$. This point is familiar to physicists (see for instance [9]), who distinguish, for granular gases with small inelasticity, the “molecular timescale” (the level where entropy production effects dominate) and the “cooling timescale” (much slower than the molecular timescale).

Proposition 6.6. Define $k_2 := \max\{k_0, k_1\}$, $q_2 := \max\{q_0, q_1, 3\}$, where $k_i$ and $q_i$ are defined in Theorem 3.5 and Corollary 3.4. For any $\rho, \mathcal E_0, M_0$ there exist $\alpha_4 \in [\alpha_3, 1)$, $c_1 \in (0,\infty)$ and, for any $\alpha \in [\alpha_4, 1]$, $\varphi = \varphi(\alpha)$ with $\varphi(\alpha) \to 0$ as $\alpha \to 1$ and $T = T(\alpha)$ (possibly blowing up as $\alpha \to 1$) such that for any initial datum $0 \le g_{in} \in L^1_{q_2} \cap H^{k_2} \cap \mathcal C_{\rho,0,\mathcal E_0}$ with
\[ \|g_{in}\|_{L^1_{q_2} \cap H^{k_2}} \le M_0, \]
the solution $g$ associated to the rescaled equation (1.31) satisfies
\[ \forall\, t \ge 0, \qquad \mathcal E(g_t) \ge c_1 \]
and, for all $\alpha' \in [\alpha_4, 1)$ and then all $\alpha \in [\alpha', 1]$,
\[ \forall\, t \ge T(\alpha'), \qquad \|g_t - \bar G_\alpha\|_{L^1_2} \le \varphi(\alpha'). \tag{6.14} \]

Proof of Proposition 6.6. Let us consider a solution $g \in C([0,\infty); L^1_{q_2} \cap H^{k_2})$ to the rescaled equation (1.31) with given initial datum $g_{in}$, whose existence has been established in [23,24]. We split the proof of the proposition into five steps.

Step 1. From the propagation and appearance of uniform moment bounds [24, Prop. 3.1, (iii)], which, it is worth noticing, have been obtained uniformly with respect to the elastic coefficient (see also [8]), there exists $C_1 \in (0,\infty)$ such that
\[ \sup_{t \ge 0} \|g\|_{L^1_{q_2}} \le C_1. \tag{6.15} \]
Let us define $c_1 := \min\{\mathcal E(\bar G_1), \mathcal E_0\}/4$, and
\[ T_* := \sup \{ T;\ \forall\, t \in [0,T],\ \mathcal E(g(t,\cdot)) \ge c_1 \}. \tag{6.16} \]
Next, from the equation on the evolution of energy
\[ \mathcal E'(t) = -(1-\alpha^2)\, b_1\, D_{\mathcal E}(g) + (1-\alpha)\, 2\rho\, \mathcal E \tag{6.17} \]
and (6.15), there holds
\[ |\mathcal E'(t)| \le C_2\,(1-\alpha) \qquad \forall\, t \ge 0 \]
(take for instance $C_2 = 2\, b_1\, C_1^2 + C_1$), from which we deduce that we necessarily have $T_* \ge C_3\,(1-\alpha)^{-1}$ (take for instance $C_3 = (3/4)\, \mathcal E_0 / C_2$).
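The lower bound on $T_*$ in Step 1 follows from a one-line integration of the energy bound. The computation below is a sketch that uses only $|\mathcal E'(t)| \le C_2 (1-\alpha)$, the initial energy $\mathcal E(0) = \mathcal E_0$ (since $g_{in} \in \mathcal C_{\rho,0,\mathcal E_0}$), and $c_1 \le \mathcal E_0/4$.

```latex
\begin{aligned}
\mathcal E(t) \;\ge\; \mathcal E_0 - C_2\,(1-\alpha)\,t
  \;\ge\; \frac{\mathcal E_0}{4} \;\ge\; c_1
\qquad\text{whenever}\qquad
t \;\le\; \frac{3}{4}\,\frac{\mathcal E_0}{C_2}\,(1-\alpha)^{-1} \;=\; C_3\,(1-\alpha)^{-1},
\end{aligned}
```

so the energy cannot fall below $c_1$ before the time $C_3 (1-\alpha)^{-1}$, which gives $T_* \ge C_3 (1-\alpha)^{-1}$.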
Step 2. From point (ii) of Lemma 6.1, we have for some constant $C_4 \in (0,\infty)$,
\[ \forall\, t \in [0, T_*], \qquad \|g_t\|_{H^{k_2}} \le C_4. \tag{6.18} \]
Moreover from Lemma 2.6, for any time $t_1 \in (0, T_*)$, there exists some constant $C_5 = C_5(\rho, C_4, t_1)$ such that
\[ \forall\, t \in [t_1, T_*], \qquad g(t,v) \ge C_5^{-1}\, e^{-C_5 |v|^8}. \tag{6.19} \]

Step 3. With the notations of Theorem 3.5, we compute the evolution of the relative entropy of $g(t,\cdot)$ with respect to the associated Maxwellian $M[g(t,\cdot)]$, and we obtain
\[ \frac{d}{dt} H(g|M[g]) = \frac{d}{dt} H(g) - \frac{d}{dt} \int_{\mathbb R^N} g \ln M[g] = \frac{d}{dt} H(g) - \frac{\rho N}{2\, \mathcal E}\, \frac{d}{dt} \mathcal E = -D_{H,\alpha}(g) + \frac{\rho N}{2\, \mathcal E}\,(1-\alpha^2)\, D_{\mathcal E}(g) - (1-\alpha)\, \rho^2 N. \]
Next from Lemma 3.4 and the estimates (6.15), (6.16), (6.18) and (6.19) we have
\[ \frac{d}{dt} H(g|M[g]) = -D_{H,1}(g,g) + O(1-\alpha) \quad \text{on } (t_1, T_*). \]
Then from (3.11), we are led to the following differential inequality on the relative entropy:
\[ \frac{d}{dt} H(g|M[g]) \le -C_6\, H(g|M[g])^2 + C_7\,(1-\alpha) \quad \text{on } (t_1, T_*). \]
By straightforward computations we deduce that, independently of the value of $H(g_{t_1}|M[g_{t_1}])$ (this “loss of memory” effect is typical of differential equations with superlinear damping terms), we have
\[ \forall\, t \in [t_1, T_*], \qquad H(g_t|M[g_t]) \le C_8\,(1-\alpha)^{1/2}\, \frac{1 + e^{-C_9 (1-\alpha)^{1/2} (t - t_1)}}{1 - e^{-C_9 (1-\alpha)^{1/2} (t - t_1)}} \]
for some explicit constants. As a conclusion, defining $t_2 := t_1 + C_9^{-1}\,(1-\alpha)^{-1/2}$ and choosing $\bar\alpha \in [\alpha_3, 1)$ in such a way that $t_2 < T_*$, we have for $\alpha \in [\bar\alpha, 1)$,
\[ \forall\, t \in [t_2, T_*], \qquad H(g(t)|M[g]) \le C_{10}\,(1-\alpha)^{1/2}. \]
Finally, using the Csiszár-Kullback-Pinsker inequality (3.10), as well as the Hölder inequality, we obtain under the same conditions on $\alpha$ and the time variable:
\[ \|g - M[g]\|_{L^1_3} \le C\, \|g - M[g]\|^{1/2}_{L^1}\, \|g\|^{1/2}_{L^1_6} \le C\, H(g|M[g])^{1/4} \le C\,(1-\alpha)^{1/8}. \tag{6.20} \]

Step 4. Now let us go back to the energy equation (6.17). First, with the help of the moment bound (6.15), one may write
\[ \mathcal E'(t) = 2\,(1-\alpha)\, [\rho\, \mathcal E - b_1\, D_{\mathcal E}(g) + O(1-\alpha)]. \]
Thanks to (6.20) we deduce
\[ \mathcal E'(t) = 2\,(1-\alpha)\, \big( \rho\, \mathcal E - b_1\, D_{\mathcal E}(M[g]) + O((1-\alpha)^{1/8}) \big) \quad \text{on } (t_2, T_*). \]
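Returning to the differential inequality of Step 3, the "straightforward computations" behind the bound on the relative entropy can be sketched as a comparison with an explicit solution of the associated Riccati equation. The shorthand $y$, $\varepsilon$, $A$, $B$ is introduced here for illustration only, and the identification of $C_8$, $C_9$, $C_{10}$ below is an assumption consistent with the displayed formula rather than the authors' stated choice.

```latex
\begin{aligned}
&y(t) := H(g_t|M[g_t]), \quad \varepsilon := 1-\alpha, \quad
  y' \le -C_6\,y^2 + C_7\,\varepsilon, \qquad
  A := \sqrt{C_7\,\varepsilon/C_6}, \quad B := \sqrt{C_6\,C_7\,\varepsilon}; \\[2pt]
&\bar y(t) := A\,\coth\!\big(B\,(t-t_1)\big)
  \ \text{ solves } \ \bar y' = -C_6\,\bar y^{\,2} + C_7\,\varepsilon
  \ \text{ and } \ \bar y(t)\to+\infty \ \text{ as } t\downarrow t_1; \\[2pt]
&\text{by comparison, whatever the value of } y(t_1): \quad
  y(t) \;\le\; \bar y(t)
  = \sqrt{\tfrac{C_7}{C_6}}\;\varepsilon^{1/2}\,
    \frac{1+e^{-2B(t-t_1)}}{1-e^{-2B(t-t_1)}}, \\[2pt]
&\text{i.e. } C_8 = \sqrt{C_7/C_6}, \quad C_9 = 2\sqrt{C_6\,C_7}; \qquad
  t \ge t_2 = t_1 + C_9^{-1}\varepsilon^{-1/2}
  \ \Longrightarrow\ y(t) \le C_8\,\frac{e+1}{e-1}\;\varepsilon^{1/2}.
\end{aligned}
```

The last line uses that $2B(t_2 - t_1) = 1$ and that $\coth$ is decreasing, which suggests the admissible value $C_{10} = C_8\,(e+1)/(e-1)$.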
Finally, thanks to (3.16), (3.17) and the relation E(g) = ρ N θ (g), we get on (t2 , T∗ ), 1/2 E (t) = (E(t), α) := (1 − α) [k3 E (E¯1 − E 1/2 ) + O((1 − α)1/8 )],
(6.21)
where E¯1 = ρ N θ¯1 with θ¯1 is the quasi-elastic self-similar temperature defined in (1.29). We may then choose α ∈ [α , 1) such that (c1 , α) > 0 for any α ∈ [α , 1). We conclude by maximum principle that T∗ = ∞ for α ∈ [α , 1). In particular, all the previous estimates on g are uniform on (t2 , ∞). Step 5. Thanks to (6.21) we easily get d (E − E¯1 )2 ≤ −(1 − α) [k5 (E − E¯1 )2 + O((1 − α)1/8 )], dt so that (for some constants a, b > 0) ∀ t ≥ t2 , |E(t) − E¯1 | ≤ |E(t2 ) − E¯1 | e−a (1−α) (t−t2 ) + b (1 − α)1/8 . Setting T (α) = max{t2 , c (1 − α)−1 } for some suitable constant c > 0, we then obtain |E − E¯1 | = O((1 − α)1/8 ) on [T (α), ∞).
(6.22)
In order to conclude that (6.14) holds, we write
\[ g(t) - \bar G_\alpha = (g(t) - M[g(t)]) + (M[g(t)] - \bar G_1) + (\bar G_1 - \bar G_\alpha), \]
and we estimate the first term thanks to (6.20), the second term thanks to (6.22) and the third term by (3.12).

6.3. Decomposition and Lyapunov functional for smooth initial datum. The proof of the global convergence (point (v) of Theorem 1.1) for smooth initial data only amounts to connecting the two previous results of Propositions 6.4 and 6.6 by choosing $\alpha$ such that $\varphi(\alpha) \le \varepsilon$, where $\varepsilon$ is the size of the attraction domain in Proposition 6.4 and $\varphi(\alpha)$ is defined in Proposition 6.6. More precisely, we state without proof the straightforward combination of Propositions 6.4 and 6.6.

Corollary 6.7. Let us fix an exponential weight function $m$ as in (1.30), with exponent $s \in (0,1)$. Then for any $\rho, \mathcal E_0, M_0$ there exist $C$ and $\alpha_5 \in [\alpha_4, 1)$ (depending on $\rho, \mathcal E_0, M_0, m$) such that for any $\alpha \in [\alpha_5, 1)$ and any initial datum $0 \le g_{in} \in L^1(m^{-q_2}) \cap H^{k_2}$ satisfying
\[ g_{in} \in \mathcal C_{\rho,0,\mathcal E_0}, \qquad \|g_{in}\|_{L^1(m^{-q_2}) \cap H^{k_2}} \le M_0, \]
the solution $g$ associated to the rescaled equation (1.31) satisfies
\[ \forall\, t \ge 0, \qquad \|\Pi_\alpha (g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, e^{\mu_\alpha t}, \]
\[ \forall\, t \ge 0, \qquad \|(\mathrm{Id} - \Pi_\alpha)(g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, e^{(3/2)\, \mu_\alpha t}. \]

Remark 6.8. Note that the constant $C$ in the rate of decay does not depend on $\alpha$. This comes from the fact that the size of the linearized stability domain is uniform as $\alpha$ goes to 1 in Proposition 6.4, which allows one, in Proposition 6.6, to pick a fixed $\alpha'$ such that $\varphi(\alpha')$ in the estimate (6.14) is less than this size; therefore the time $T(\alpha')$ required to enter this neighborhood does not blow up as $\alpha$ goes to 1.
As a by-product of the previous propositions, we state and prove a result which provides a partial answer to the question (important from the physical viewpoint) of finding Lyapunov functionals for this particle system. Let us define the required objects. We consider a fixed mass $\rho$ and some restitution coefficient $\alpha$ whose range will be specified below. At initial times, non-linear effects dominate and therefore we define
\[ \mathcal H_1(g) := H(g|M[g]) + \big( \mathcal E - \bar{\mathcal E}_\alpha \big)^2, \]
where $\bar{\mathcal E}_\alpha = \mathcal E(\bar G_\alpha)$ is the energy of the self-similar profile corresponding to $\alpha$ and the mass $\rho$. At eventual times, linearized effects dominate. Therefore we define (taking inspiration from the spectral study):
\[ \mathcal H_2(g) := \|h^1\|^2_{L^1(m^{-1})} + (1-\alpha) \int_0^{+\infty} \|R_\alpha(s)\, h^2\|^2_{L^2}\, ds, \]
with $h^1 = \Pi_\alpha h$, $h^2 = \Pi_\alpha^\perp h$ and $h = g - \bar G_\alpha$.

Proposition 6.9. There is $k_4 \in \mathbb N$ big enough (this value is specified in the proof) such that for any exponential weight function $m$ as defined in (1.30), any time $t_0 \in (0,\infty)$ and any $\rho, \mathcal E_0, M_0 \in (0,\infty)$, there exist $\kappa_* \in (0,\infty)$ and $\alpha_6 \in [\alpha_5, 1)$ such that for any $\alpha \in [\alpha_6, 1]$ and any initial datum $g_{in} \in H^{k_4} \cap L^1(m^{-1})$ satisfying
\[ g_{in} \in \mathcal C_{\rho,0,\mathcal E_0}, \qquad \|g_{in}\|_{H^{k_4} \cap L^1(m^{-1})} \le M_0, \qquad g_{in}(v) \ge M_0^{-1}\, e^{-M_0 |v|^8}, \]
the solution $g$ to the rescaled equation (1.31) with initial datum $g_{in}$ is such that the functional
\[ \mathcal H(g_t) = \mathcal H_1(g_t)\, \mathbf 1_{\mathcal H_1(g_t) \ge \kappa_*} + \mathcal H_2(g_t)\, \mathbf 1_{\mathcal H_1(g_t) \le \kappa_*} \]
for some constant M1 ∈ (0, ∞) (recall that α4 was adjusted in terms of ρ, E0 , M0 ). Coming back then to Steps 3 and 4 in the proof of Proposition 6.6, we obtain the two following differential equations on (t0 , ∞) d H (g|M[g]) ≤ −K 1 H (g|M[g])2 + O(1 − α) dt and
d E = 2 ρ (1 − α) K 2 E (E¯α1/2 − E 1/2 ) (E − E¯α ) + O((1 − α)1/8 ) , dt
for some constants K i ∈ (0, ∞). We easily deduce that for any κ ∈ (0, ∞) there exists ακ ∈ [α5 , ∞) such that d H1 (gt ) < 0 for any t ∈ (0, ∞) such that H1 (gt ) ≥ κ. dt
(6.23)
Step 2: Eventual times. Let us first remark that from point (iii) in Theorem 5.2 (iii) and the interpolation inequality (B.2), for any q ∈ N∗ there exists k, k ∈ N and Ci ∈ (0, ∞) such that Rα h 2 2 ≤ C1 Rα h 2 k,1 −q/2 L W (m ) µ¯ s 2 ≤ C2 e h k+2,1 −q/2 ≤ C3 eµ¯ s h H k ∩L 1 (m −q ) , (m
W2
)
so that, taking k4 big enough, the functional H2 (g(t, .)) is well-defined for any times t ∈ (0, ∞). First observe that from (6.10) there holds d 1 2 5/2 h L 1 (m −1 ) ≤ (1 − α) K 1 h L 1 (m −1 ) − K 2 h 1 2L 1 (m −1 ) . (6.24) dt Second, we compute (with the notation of Subsect. 5.7) +∞ +∞ 2 d ¯ ¯ (es Lα h 2 ) [es Lα (L¯ α h 2 +⊥ Rα (s) h 2t 2 ds = 2 α Q α (h, h))] ds dv. L dt 0 RN 0 On the one hand,
I1 = 2 = 0
+∞
0 +∞
RN
¯ ¯ (es Lα h 2 ) [es Lα L¯ α h 2 ] dsdv
d s L¯ α 2 2 e h L 2 ds = −h 2 2L 2 . ds
On the other hand, +∞ (Rα (s) h 2 ) [Rα (s) ⊥ I2 = 2 α Q α (h, h))] dsdv 0 RN +∞ ≤ 2 C12 Rα (s) h 2 W k1 ,1 (m −q/2 ) Rα (s) ⊥ α Q α (h, h)W k1 ,1 (m −q/2 ) ds 0+∞ ≤ C2 e2µ¯ s ds h 2 W k1 +1,1 (m −q/2 ) Q α (h, h)W k1 +1,1 (m −q/2 ) ≤
2 2 0 3/2 1/2 2 3/4 2 1/4 C3 h L 2 h H k3 ∩L 1 (m −1 ) h L 2 h H k3 ∩L 1 (m −1 ) ,
for some k3 ∈ N given by Proposition B.1. Taking k4 ≥ k3 , we then obtain +∞ 2 d 9/4 Rα (s) h 2t 2 ds ≤ K 3 h L 2 − h 2 2L 2 . L dt 0
(6.25)
Gathering (6.24) and (6.25) and using some interpolation again, we deduce that there exists κ ∈ (0, ∞) such that d H2 (gt ) < 0 for any t ∈ (0, ∞) such that h t L 1 ≤ κ . (6.26) dt Step 3. We conclude putting together (6.23) and (6.26), and using (3.10), (3.12) in order to prove that H1 (g) ≤ κ implies h t L 1 ≤ κ , for α ∈ [α6 , 1] for some α6 ∈ [α5 , 1).
6.4. Global stability for general initial datum. We first state and prove a regularity result on the iterated gain term, which is the inelastic collision operator version of the same result proved for the elastic collision operator in [1,26].

Lemma 6.10. There exists a constant $C$ such that for any $f, g, h \in L^1_2(\mathbb R^N)$ and any $\alpha \in (0,1]$ there holds
\[ \|Q^+_\alpha(f, Q^+_\alpha(g,h))\|_{L^3} \le C\, \|f\|_{L^1_2}\, \|g\|_{L^1_2}\, \|h\|_{L^1_2}. \tag{6.27} \]
Proof of Lemma 6.10. We follow [26, Lemma 2.1] and [1, Lemma 2.1] and we make use of the Carleman representation introduced in [24, Prop. 1.5]. Let us consider f, g, h ∈ L 12 (R N ) and φ ∈ L ∞ (R N ). We apply twice the weak formulation of the gain term Q + ( f, Q + (g, h))(v) φ(v) dv RN
+ Q (g, h)(v) f (v2 ) |v − v2 | φ(w2 ) dσ2 dv2 dv = RN RN S2
= g(v) h(v1 ) f (v2 ) |v − v1 | |v1 − v2 | φ(v2 ) dσ2 dσ1 dv dv1 dv2 RN RN RN
w2
with that for
S2
= V (v2 , v, σ2 ), v1 any given v, v∗ , σ ∈ w=
= V (v, v1 , σ1 ) R N , we define
S2
and therefore v2 = V (v2 , v1 , σ2 ). Recall
v + v∗ 1+e , u = v − v∗ , γ = , u = (1 − γ ) u + γ |u| σ 2 2
and then γ w u + = v + (|u| σ − u), 2 2 2 w u γ V∗ = V∗ (v, v∗ , σ ) = − = v∗ − (|u| σ − u). 2 2 2 V = V (v, v∗ , σ ) =
We denote by = (v, v1 , v2 ) the term between brackets in the last integral. Introducing the point w1 and the set Sv,v1 ,ε defined by w1 := (1 − γ /2) v + (γ /2) v1 , ! ! ! ! := z ∈ R N ; !|z − w1 | − (γ /2) |v − v1 |! ≤ ε/2 ,
Sv,v1 ,ε we get =
(2/γ )2 ε lim , ε = 1 Sv,v1 ,ε (v1 ) |v1 − v2 | φ(v2 ) dσ2 dv1 . (6.28) N 2 |v − v1 | ε→0 ε R S
Remarking that v2 = v2 + (γ /2) (|u 2 | σ2 − u 2 ) with u 2 = v1 − v2 , we observe that the integral term ε is very similar to the collision term Q + (here v2 (resp. v1 , σ2 , γ , v2 ) plays the role of v (resp. v1 , σ , β, v) in the gain term) and therefore we may give a Carleman representation of ε . The same computations as performed in [24, Prop. 1.5] yield 4 ε = 2 1S (v ) |v − v2 |−1 φ(v2 ) d E(v3 ) dv2 , γ R N E v ,v v,v1 ,ε 1 2 2 2
where E v2 ,v2 is the hyperplan orthogonal to the vector v2 − v2 and passing through the point v2 ,v2 = v2 + (1 − γ −1 ) (v2 − v2 ). Here v3 stands for the post collision velocity issued from v1 , that is v3 = V∗ (v2 , v1 , σ2 ), and then, thanks to the momentum conservation, v1 := v2 + v3 − v2 . We finally define v2 ,v2 the hyperplan orthogonal to the vector v2 − v2 and passing through the point v ,v = v2 + (1 − γ −1 ) (v2 − v2 ) and we 2 2 get ε =
4 1S (v ) |v − v2 |−1 φ(v2 ) d E(v1 )dv2 . γ 2 R N v ,v v,v1 ,ε 1 2 2
(6.29)
2
Now, arguing as in [1, Lemma 2.1], we see that the measure of the intersection ε between the plane v2 ,v2 and the thickened sphere Sv,v1 ,ε is bounded by π ε γ |v − v1 | and that v1 ∈ ε implies that v2 ∈ B ε with B ε := z ∈ R N ; |z|2 ≤ |v|2 + |v1 |2 + 2 ε (|v| + |v1 |) + ε2 |v2 |2 . Gathering these estimates with (6.28) and (6.29) we get φ(v2 ) (2/γ )4 1 = lim mes(ε ) dv2 |v − v1 | ε→0 ε R N |v2 − v2 | φ(v2 ) φ(v2 ) 24 π 24 π ε (v ) dv = 1 1 0 (v ) dv2 , ≤ 3 lim B 2 2 γ ε→0 R N |v2 − v2 | γ 3 R N |v2 − v2 | B 2 where we have defined B 0 := {z ∈ R N ; |z|2 ≤ |v|2 + |v1 |2 }. Using [1, Lemma 2.2] we may conclude as in the end of [1, Lemma 2.1] and therefore (6.27) follows. We second establish that the solution g of the rescaled equation (1.31) decomposes between a regular part and a small remaining part as it has been proved for the elastic Boltzmann equation in [28], and then partially extended to the inelastic Boltzmann equation in [24]. As compared to this last paper, this result relaxes the assumption on the initial datum to gin ∈ L 13 , but at the price of the hypothesis of a lower bound on the energy. Lemma 6.11. Consider gin ∈ L 13 and the associated solution g ∈ C([0, ∞); L 13 ) to the rescaled equation (1.31). Assume that for some constant ρ, c1 , M1 , T ∈ (0, ∞) there holds gin ∈ Cρ,0 ,
\[ \|g_{in}\|_{L^1_3} \le M_1, \qquad \forall\, t \in [0,T],\ \ \mathcal E(g(t,\cdot)) \ge c_1. \tag{6.30} \]
Then, there are $\alpha_7 \in [\alpha_6, 1)$ and $\lambda \in (-\infty, 0)$, and for any smooth exponential weight function $m$ (as defined in (1.30)) and any $k \in \mathbb N$, there exists a constant $K$ (which depends on $\rho, c_1, M_1, k, m$) such that for any $\alpha \in [\alpha_7, 1]$, we may split $g = g^S + g^R$ with
\[ \forall\, t \in [0,T], \qquad \|g^S(t,\cdot)\|_{H^k \cap L^1(m^{-1})} \le K, \qquad \|g^R(t,\cdot)\|_{L^1_3} \le K\, e^{\lambda t}. \tag{6.31} \]
Proof of Lemma 6.11. The starting point is to write the rescaled equation (1.31) in the following way ∂g + τα v · ∇v g + g = Q +α (g, g), ∂t with (t, v) := τα N + L(g(t, ·))(v). Introducing the linear semigroup t (Ut h)(v) = h(v e−τα t ) exp − (s, v) ds 0
and using the Duhamel formula, we have t gt = Ut gin + Ut−s Q +α (gs , gs ) ds. 0
We iterate that last identity and we obtain g = g1R + g1S with g1R
t
= Ut gin + Ut−s Q +α (gs , Us gin ) ds, 0 t s S g1 = Ut−s Q +α (gs , Us−u Q +α (gu , gu )) du ds. 0
0
On the one hand, the energy lower bound (6.30) and Lemma 2.3 imply that there exists a constant c2 ∈ (0, ∞) such that (Ut h)(v) ≤ e−c2 t (Vξt h)(v) with (Vξ h)(v) = h(ξ v) and ξt = e−τα t . On the other hand, straightforward homogeneity arguments leads to Q +α (g, Vξ h) = ξ −N −1 Vξ −1 Q +α (Vξ −1 g, h) and h ξ |.|q L p = ξ −q−N / p h |.|q L p for any functions g, h and positive real ξ . From these considerations we deduce that g1R (t) L 1 ≤ e(N τα −c2 ) t gin L 1 + e((N +1) τα −c2 ) t gin L 1 sup gs L 1 ≤ C e−(c2 /2) t , 1
s≥0
1
for some constant C and for any (1 − α) small enough. In the same way, we have t s g1S (t) L 3 ≤ e[(2N /3+1) τα −c2 ] (t−σ ) Q +α (Vξ −1 gs , Q +α (gσ , gσ )) L 3 dσ ds. 0
s−σ
0
Taking (1 − α) smaller if necessary and using Lemma 6.10, we obtain t s g1S (t) L 3 ≤ e−(c2 /2) (t−σ ) dσ ds sup gs 3L 1 , 0
0
s≥0
2
which ends the proof of (6.31) in the case k = 0, with the help of point (i) of Lemma 6.1. The general case k ∈ N∗ is then treated by following the strategy introduced in [28] and
using the result of appearance of regularity proved in [24] (and recalled in point (iii) of Lemma 6.1).

Third, we recall a classical $L^1$ stability result for the elastic Boltzmann equation which has been established in [24, Prop. 3.2] for the rescaled equation (1.31).

Lemma 6.12. Consider $0 \le g^1_{\rm in}, g^2_{\rm in} \in L^1_3 \cap C_{\rho,0}$ and the two associated solutions $g^1_t, g^2_t \in C([0,\infty); L^1_3) \cap L^\infty(0,\infty; L^1_3)$ to the rescaled equation (1.31). There exists $C_{\rm stab} \in (0,\infty)$ (only depending on $b$ and $\sup_{t \ge 0} \|g^1 + g^2\|_{L^1_3}$) such that
\[ \forall\, t \ge 0, \qquad \|g^2_t - g^1_t\|_{L^1_2} \le \|g^2_{\rm in} - g^1_{\rm in}\|_{L^1_2}\, e^{C_{\rm stab}\, t}. \]
Proof of point (iv) of Theorem 1.1. Let us consider $g_{\rm in} \in L^1_3 \cap C_{\rho,0,\mathcal{E}_{\rm in}}$ with $\|g_{\rm in}\|_{L^1_3} \le M_0$ for some fixed $\mathcal{E}_{\rm in}, M_0 \in (0,\infty)$ and $g$ the associated solution to the rescaled equation (1.31) which has been built in [24]. We know that there exists $M_1 \in (0,\infty)$ such that
\[ \sup_{t \in (0,\infty)} \|g(t,\cdot)\|_{L^1_3} \le M_1. \tag{6.32} \]
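Step 1. We define
\[ T_* := \sup \big\{ T \in (0,\infty), \ \mathcal{E}(g(t,\cdot)) \ge c_1 \ \ \forall\, t \in [0,T] \big\}, \qquad c_1 := \min\{\mathcal{E}_{\rm in}, \bar{\mathcal{E}}_1\}/2. \]
We shall prove that $T_* = +\infty$. We argue by contradiction, assuming that $T_* < \infty$. From the equation on the energy (6.17), the uniform estimate (6.32) and the definition of $T_*$ we have
\[ T_* \ge C_1\, (1-\alpha)^{-1} \qquad \text{and} \qquad \mathcal{E}'(T_*) \le 0. \tag{6.33} \]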
Thanks to Lemma 6.11, we may decompose $g = g^S + g^R$ on $(0, t_1)$, with $t_1 \in (0, T_*)$ to be fixed. At time $t_1$ we initiate a new flow starting from the smooth part of $g$. More precisely, we decompose $g = \tilde g^S + \tilde g^R$ on $(t_1, T_*)$, with $\tilde g^S(t_1) = [\rho/\rho(g^S(t_1))]\, g^S(t_1)$, $\tilde g^S$ solution (with mass $\rho$!) to Eq. (1.31) on $(t_1, T_*)$ and $\tilde g^R := g - \tilde g^S$. On the one hand, from (6.31) and Lemma 6.12 we have
\[ \|\tilde g^R(t)\|_{L^1_3} \le C\, e^{C_{\rm stab}\, (T_* - t_1) + \lambda\, t_1} \qquad \text{on } (t_1, T_*). \]
We choose $t_1 = \eta\, T_*$ with $\eta \in (0,1)$ in such a way that $C_{\rm stab}\, (1-\eta) + \lambda\, \eta = \lambda/2$. We have then proved
\[ \|\tilde g^R(t)\|_{L^1_3} \le C\, e^{(\lambda/2)\, C_1\, (1-\alpha)^{-1}} \qquad \text{on } (t_1, T_*). \tag{6.34} \]
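The choice of $\eta$ above amounts to solving one linear equation. The following minimal sketch (the function name `split_fraction` and the sample values of $C_{\rm stab}$ and $\lambda$ are illustrative assumptions, not quantities from the paper) computes $\eta = (C_{\rm stab} - \lambda/2)/(C_{\rm stab} - \lambda)$ and shows that it lies in $(1/2, 1)$ whenever $C_{\rm stab} > 0 > \lambda$.

```python
def split_fraction(c_stab: float, lam: float) -> float:
    """Return eta solving c_stab * (1 - eta) + lam * eta = lam / 2.

    Assumes c_stab > 0 (growth rate of the stability estimate) and
    lam < 0 (decay rate of the remainder), as in the text.
    """
    if c_stab <= 0 or lam >= 0:
        raise ValueError("expected c_stab > 0 and lam < 0")
    # Rearranging: eta * (lam - c_stab) = lam / 2 - c_stab.
    return (c_stab - lam / 2) / (c_stab - lam)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    eta = split_fraction(2.0, -1.0)
    print(eta, 2.0 * (1 - eta) + (-1.0) * eta)
```

Since the numerator exceeds half of the denominator and is smaller than the denominator, the restarting time $t_1 = \eta\, T_*$ always falls in the second half of $(0, T_*)$.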
On the other hand, following Step 3 in the proof of Proposition 6.6, we deduce a similar estimate as (6.20), namely
\[ \big\| \tilde g^S(T_*, \cdot) - M[\tilde g^S(T_*, \cdot)] \big\|_{L^1_3} = O\big((1-\alpha)^{1/8}\big) \tag{6.35} \]
for any $(1-\alpha)$ small enough chosen in such a way that the intermediate time $t_2$ defined in Step 3 of the proof of Proposition 6.6 satisfies $t_1 + t_2 \le T_*$. Gathering (6.34) and (6.35) we obtain
\[ \big\| g(T_*, \cdot) - M[g(T_*, \cdot)] \big\|_{L^1_3} = O\big((1-\alpha)^{1/8}\big). \]
Coming back to Eq. (6.17) on the energy and proceeding like in Step 4 in the proof of Proposition 6.6, we get
\[ \mathcal{E}'(T_*) \ge (1-\alpha) \Big[ k_3\, c_1\, \big( \bar{\mathcal{E}}_1^{1/2} - c_1^{1/2} \big) - C\, (1-\alpha)^{1/8} \Big] > 0 \]
for any $(1-\alpha)$ small enough. That is in contradiction with (6.33) and we conclude that $T_* = +\infty$.

Step 2. Thanks to the previous step, we have a uniform in time lower bound on the energy, and therefore we can run the decomposition theorem for all times. By applying the decomposition theorem as in Step 1 for a given time $t \in (0,\infty)$, starting a new flow at $t_1 = \eta\, t$, taking $[\rho/\rho(g^S(t_1))]\, g^S(t_1, \cdot)$ as initial datum, and then using Corollary 6.7 on the smooth part $\tilde g^S(s, \cdot)$, $s \in [t_1, t]$, we find that at time $t$, the solution $g_t$ decomposes as $\tilde g^S_t + \tilde g^R_t$, where $\tilde g^S_t$ approaches the self-similar profile with rate $C\, e^{\mu_\alpha (t - t_1)}$, that is $C\, e^{(1-\eta)\, \mu_\alpha\, t}$, and the remaining part $\tilde g^R$ goes to $0$ with rate $C\, e^{(\lambda/2)\, t}$. Since $|\lambda/2|$ is larger than $(1-\eta)\, |\mu_\alpha|$ for $(1-\alpha)$ small enough, this concludes the proof of (1.36).

A. Appendix: Moments of Gaussians

We state here some moments of tensor products of Gaussians.

Lemma A.1. The following identities hold:
\[ \int_{\mathbb{R}^N} M_{1,0,1}\, |v|^2\, dv = N, \tag{A.1} \]
\[ \int_{\mathbb{R}^N} M_{1,0,1}\, |v|^4\, dv = N(N+2), \tag{A.2} \]
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |u|^3\, dv\, dv_* = 2^{3/2} \int_{\mathbb{R}^N} M_{1,0,1}\, |v|^3\, dv, \tag{A.3} \]
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |v|^2\, |u|^3\, dv\, dv_* = \sqrt{2}\, (2N+3) \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^3\, dv. \tag{A.4} \]
Proof of Lemma A.1. The proof of (A.1) and (A.2) being straightforward and the proof of (A.3) being very similar to the proof of (A.4), we only prove (A.4). We first notice that
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |v|^2\, |u|^3\, dv\, dv_* = \frac12 \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, (|v|^2 + |v_*|^2)\, |u|^3\, dv\, dv_* \]
\[ = \frac14 \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, (|v + v_*|^2 + |v - v_*|^2)\, |u|^3\, dv\, dv_*. \]
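Making use of the change of variable $(v, v_*) \to (x = (v + v_*)/\sqrt{2},\ y = (v - v_*)/\sqrt{2})$, we then get
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |v|^2\, |u|^3\, dv\, dv_* = \sqrt{2} \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}(x)\, M_{1,0,1}(y)\, (|x|^2 + |y|^2)\, |y|^3\, dx\, dy \]
\[ = \sqrt{2}\, N \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^3\, dv + \sqrt{2} \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^5\, dv = \sqrt{2}\, (2N+3) \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^3\, dv. \]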
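The identities of Lemma A.1 can be cross-checked numerically. The sketch below assumes that $M_{1,0,1}$ is the normalized Gaussian with unit mass, zero mean and unit temperature, so that $\int M_{1,0,1} |v|^p\, dv = 2^{p/2}\, \Gamma((N+p)/2)/\Gamma(N/2)$; the function names are ours and purely illustrative. For (A.3) it uses that $u = v - v_*$ is Gaussian with covariance $2\,{\rm Id}$, and for (A.4) it evaluates the intermediate expression obtained after the change of variable in the proof.

```python
import math


def gaussian_abs_moment(p: float, n: int, variance: float = 1.0) -> float:
    """E|v|^p for a centered Gaussian in R^n with covariance variance * Id."""
    return (2.0 * variance) ** (p / 2) * math.gamma((n + p) / 2) / math.gamma(n / 2)


def check_lemma_a1(n: int) -> dict:
    """Return (left-hand side, right-hand side) pairs for (A.1)-(A.4)."""
    m2 = gaussian_abs_moment(2, n)
    m3 = gaussian_abs_moment(3, n)
    m4 = gaussian_abs_moment(4, n)
    m5 = gaussian_abs_moment(5, n)
    # (A.3): u = v - v_* has covariance 2 * Id.
    lhs_a3 = gaussian_abs_moment(3, n, variance=2.0)
    # (A.4): (1/4) * (2|x|^2 + 2|y|^2) * |sqrt(2) y|^3 with x, y independent.
    lhs_a4 = 0.25 * 2.0 * 2.0 ** 1.5 * (m2 * m3 + m5)
    return {
        "A.1": (m2, float(n)),
        "A.2": (m4, float(n * (n + 2))),
        "A.3": (lhs_a3, 2.0 ** 1.5 * m3),
        "A.4": (lhs_a4, math.sqrt(2.0) * (2 * n + 3) * m3),
    }


if __name__ == "__main__":
    for label, (lhs, rhs) in check_lemma_a1(3).items():
        print(label, lhs, rhs)
```

The check of (A.4) reduces to the recurrence $\int M |v|^5 = (N+3) \int M |v|^3$, which is the step left implicit in the last equality of the proof.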
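B. Appendix: Interpolation Inequalities

Lemma B.1. (i) For any $k, k^*, q, q^* \in \mathbb{Z}$ with $k \ge k^*$, $q \ge q^*$ and any $\theta \in (0,1)$ there is $C \in (0,\infty)$ such that for $h \in W^{k^{**},1}_{q^{**}}(m^{-1})$,
\[ \|h\|_{W^{k,1}_q(m^{-1})} \le C\, \|h\|^{1-\theta}_{W^{k^*,1}_{q^*}(m^{-1})}\, \|h\|^{\theta}_{W^{k^{**},1}_{q^{**}}(m^{-1})} \tag{B.1} \]
with $k^{**}, q^{**} \in \mathbb{Z}$ such that $k = (1-\theta)\, k^* + \theta\, k^{**}$, $q = (1-\theta)\, q^* + \theta\, q^{**}$.

(ii) For any $k, q \in \mathbb{N}^*$ and any exponential weight function $m$ as defined in (1.30), there exists $C \in (0,\infty)$ such that for any $h \in H^{k^\ddagger} \cap L^1(m^{-12})$ with $k^\ddagger := 8k + 7(1 + N/2)$,
\[ \|h\|_{W^{k,1}_q(m^{-1})} \le C\, \|h\|^{1/8}_{H^{k^\ddagger}}\, \|h\|^{1/8}_{L^1(m^{-12})}\, \|h\|^{3/4}_{L^1(m^{-1})}. \tag{B.2} \]
(The exponents $1/8$, $1/8$, $3/4$ are those produced by gathering (B.3) and (B.4) in the proof; they sum to one, as homogeneity in $h$ requires.)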
Proof of Lemma B.1. The inequality (B.1) in point (i) is a classical result from interpolation theory. Let us focus on point (ii). We prove the inequality (B.2) for $h \in \mathcal{S}(\mathbb{R}^N)$ and then argue by density. On the one hand, we observe that for any $\ell$ there exists $C$ such that
\[ \|h\|^2_{H^\ell} \le C\, \|h\|_{L^1}\, \|h\|_{H^{\ell^\dagger}}, \qquad \ell^\dagger := 2\ell + 1 + N/2. \]
Iterating that inequality twice, we get (for some related exponents $k^\dagger$, $k^\ddagger$)
\[ \|h\|_{H^{k^\dagger}} \le C\, \|h\|^{3/4}_{L^1}\, \|h\|^{1/4}_{H^{k^\ddagger}}. \tag{B.3} \]
On the other hand, using first the Cauchy-Schwarz inequality, plus the same argument as above and Hölder's inequality, we obtain
\[ \|h\|_{W^{k,1}_q(m^{-1})} \le C\, \|h\|_{H^k(m^{-3/2})} \le C\, \|h\|^{1/2}_{L^1(m^{-3})}\, \|h\|^{1/2}_{H^{k^\dagger}} \le C\, \|h\|^{1/8}_{L^1(m^{-12})}\, \|h\|^{3/8}_{L^1}\, \|h\|^{1/2}_{H^{k^\dagger}}. \tag{B.4} \]
We conclude by gathering (B.4) and (B.3).
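The bookkeeping behind the last step can be made explicit. The sketch below (all names are illustrative assumptions) substitutes the exponents of (B.3) into those of (B.4) with exact rational arithmetic, and checks that iterating $\ell \mapsto \ell^\dagger = 2\ell + 1 + N/2$ three times starting from $k$ gives $k^\ddagger = 8k + 7(1 + N/2)$.

```python
from fractions import Fraction


def substitute(outer: dict, name: str, inner: dict) -> dict:
    """Replace the factor `name` in a product of norms by a product `inner`.

    Each dict maps a norm label to the exponent it carries.
    """
    weight = outer[name]
    result = {key: value for key, value in outer.items() if key != name}
    for key, value in inner.items():
        result[key] = result.get(key, Fraction(0)) + weight * value
    return result


def combined_exponents() -> dict:
    """Exponents obtained by inserting (B.3) into the right-hand side of (B.4)."""
    b3 = {"L1": Fraction(3, 4), "H_k_ddagger": Fraction(1, 4)}
    b4 = {"L1_m12": Fraction(1, 8), "L1": Fraction(3, 8), "H_k_dagger": Fraction(1, 2)}
    return substitute(b4, "H_k_dagger", b3)


def iterate_dagger(ell, dim: int, times: int):
    """Apply ell -> 2 * ell + 1 + dim / 2 the given number of times."""
    value = Fraction(ell)
    for _ in range(times):
        value = 2 * value + 1 + Fraction(dim, 2)
    return value


if __name__ == "__main__":
    print(combined_exponents())
    print(iterate_dagger(2, 3, 3))
```

The three resulting exponents are $1/8$ on $L^1(m^{-12})$, $3/4$ on $L^1$ and $1/8$ on $H^{k^\ddagger}$; since $m^{-1} \ge 1$ for an exponential weight, the $L^1$ norm may then be replaced by the $L^1(m^{-1})$ norm.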
Acknowledgements We thank Alexander Bobylev, José Antonio Carrillo and Cédric Villani for fruitful discussions on the inelastic Boltzmann equation.
References 1. Abrahamsson, F.: Strong L 1 convergence to equilibrium without entropy conditions for the Boltzmann equation. Comm. Part. Diff. Eqs. 24, 1501–1535 (1999) 2. Bisi, M., Carrillo, J.A., Toscani, G.: Contractive metrics for a Boltzmann equation for granular gases: diffusive equilibria. J. Stat. Phys. 118(1–2), 301–331 (2005) 3. Bisi, M., Carrillo, J.A., Toscani, G.: Decay rates in probability metrics towards homogeneous cooling states for the inelastic Maxwell model. J. Stat. Phys. 124(2–4), 625–653 (2006) 4. Blake, M.D.: A spectral bound for asymptotically norm-continuous semigroups. J. Op. Th. 45, 111–130 (2001) 5. Bobylev, A.V., Carillo, J.A., Gamba, I.: On some properties of kinetic and hydrodynamics equations for inelastic interactions. J. Stat. Phys. 98, 743–773 (2000) 6. Baranger, C., Mouhot, C.: Explicit spectral gap estimates for the linearized Boltzmann and Landau operators with hard potentials. Rev. Matem. Iberoam. 21, 819–841 (2005) 7. Bobylev, A.V., Cercignani, C., Toscani, G.: Proof of an asymptotic property of self-similar solutions of the Boltzmann equation for granular materials. J. Stat. Phys. 111, 403–417 (2003) 8. Bobylev, A.V., Gamba, I., Panferov, V.: Moment inequalities and high-energy tails for the Boltzmann equations with inelastic interactions. J. Stat. Phys. 116, 1651–1682 (2004) 9. Brilliantov, N.V., Pöschel, T.: Kinetic Theory of Granular Gases. Oxford Graduate Texts. Oxford: Oxford University Press, 2004 10. Caglioti, E., Villani, C.: Homogeneous cooling states are not always good approximations to granular flows. Arch. Rat. Mech. Anal. 163, 329–343 (2002) 11. Carleman, T.: Sur la théorie de l’équation intégrodifférentielle de Boltzmann. Acta Math. 60, 91–146 (1932) 12. Carleman, T.: Problèmes mathématiques dans la théorie cinétique des gaz. Uppsala: Almqvist and Wiksells Boktryckeri Ab, 1957 13. Cercignani, C.: Recent developments in the mechanics of granular materials. 
In: Fisica matematica e ingegneria delle strutture, Bologna: Pitagora Editrice, 1995, pp. 119–132 14. Csiszár, I.: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Magyar Tud. Akad. Mat. Kutató Int. Közl. 8, 85–108 (1963) 15. Ernst, M.H., Brito, R.: Driven inelastic Maxwell molecules with high energy tails. Phys. Rev. E 65, 85–108 (2002) 16. Ernst, M.H., Brito, R.: Scaling solutions of inelastic Boltzmann equations with over-populated high energy tails. J. Stat. Phys. 109, 407–432 (2002) 17. Gamba, I., Panferov, V., Villani, C.: On the Boltzmann equation for diffusively excited granular media. Commun. Math. Phys. 246, 503–541 (2004) 18. Grad, H.: Asymptotic theory of the Boltzmann equation. II. In: Rarefied Gas Dynamics (Proc. 3rd Internat. Sympos., Palais de l’UNESCO, Paris, 1962), Vol. I, New York: Academic Press, 1963, pp 26–59 19. Haff, P.K.: Grain flow as a fluid-mechanical phenomenon. J. Fluid Mech. 134, 401–430 (1983) 20. Hilbert, D.: Grundzüge einer Allgemeinen Theorie der Linearen Integralgleichungen. Math. Ann. 72, (1912), New York: Chelsea Publ., 1953 21. Kato, T.: Perturbation Theory for Linear Operators. Berlin: Springer-Verlag, 1995 22. Kullback, S.: Information Theory and Statistics. New York: John Wiley, 1959 23. Mischler, S., Mouhot, C., Rodriguez Ricard, M.: Cooling process for inelastic Boltzmann equations for hard spheres, Part I: The Cauchy problem. J. Stat. Phys. 124, 655–702 (2006) 24. Mischler, S., Mouhot, C.: Cooling process for inelastic Boltzmann equations for hard spheres, Part II: Self-similar solutions and tail behavior. J. Stat. Phys. 124, 703–746 (2006) 25. Mischler, S., Mouhot, C.: Work in progress 26. Mischler, S., Wennberg, B.: On the spatially homogeneous Boltzmann equation. Ann. Inst. Henri Poincaré, Analyse Non linéaire 16, 467–501 (1999) 27. Mouhot, C.: Rate of convergence to equilibrium for the spatially homogeneous Boltzmann equation. Commun. Math. Phys. 
261, 629–672 (2006) 28. Mouhot, C., Villani, C.: Regularity theory for the spatially homogeneous Boltzmann equation with cutoff. Arch. Rat. Mech. Anal. 173, 169–212 (2004) 29. Nirenberg, L.: Topics in Nonlinear Functional Analysis. With a chapter by E. Zehnder. Notes by R. A. Artino. Lecture Notes, 1973–1974. New York: Courant Institute of Mathematical Sciences, New York University, 1974
30. Pulvirenti, A., Wennberg, B.: A Maxwellian lower bound for solutions to the Boltzmann equation. Commun. Math. Phys. 183, 145–160 (1997) 31. Villani, C.: Cercignani’s conjecture is sometimes true and always almost true. Commun. Math. Phys. 234, 455–490 (2003) 32. Yao, P.F.: On the inversion of the Laplace transform of C0 semigroups and its applications. SIAM J. Math. Anal. 26, 1331–1341 (1995)

Communicated by H. Spohn
Commun. Math. Phys. 288, 503–546 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0748-x
Communications in
Mathematical Physics
Geometric Optics and Boundary Layers for Nonlinear-Schrödinger Equations D. Chiron, F. Rousset Laboratoire J.A. Dieudonne, Université de Nice - Sophia Antipolis, Parc Valrose, 06108 Nice Cedex 02, France. E-mail:
[email protected];
[email protected] Received: 11 April 2008 / Accepted: 22 November 2008 Published online: 20 February 2009 – © Springer-Verlag 2009
Abstract: We justify supercritical geometric optics in small time for the defocusing semiclassical Nonlinear Schrödinger Equation for a large class of non-necessarily homogeneous nonlinearities. The case of a half-space with Neumann boundary condition is also studied.

1. Introduction

We consider the nonlinear Schrödinger equation in $\Omega \subset \mathbb{R}^d$,
\[ i\varepsilon\, \frac{\partial \Psi^\varepsilon}{\partial t} + \frac{\varepsilon^2}{2}\, \Delta \Psi^\varepsilon - \Psi^\varepsilon f(|\Psi^\varepsilon|^2) = 0, \qquad \Psi^\varepsilon : \mathbb{R}_+ \times \Omega \to \mathbb{C}, \tag{1} \]
with a highly oscillating initial datum under the form
\[ \Psi^\varepsilon_{|t=0} = \Psi^\varepsilon_0 = a^\varepsilon_0 \exp\Big( \frac{i}{\varepsilon}\, \varphi^\varepsilon_0 \Big), \tag{2} \]
where $\varphi^\varepsilon_0$ is real-valued. We are interested in the semiclassical limit $\varepsilon \to 0$. The nonlinear Schrödinger equation (1) appears, for instance, in optics, and also as a model for Bose-Einstein condensates, with $f(\rho) = \rho - 1$, in which case the equation is termed the Gross-Pitaevskii equation, or also with $f(\rho) = \rho^2$ (see [13]). Some more complicated nonlinearities are also used, especially in low dimensions, see [12]. At first, let us focus on the case $\Omega = \mathbb{R}^d$. To guess the formal limit when $\varepsilon$ goes to zero, it is classical to use the Madelung transform, i.e. to seek a solution of (1) under the form
\[ \Psi^\varepsilon = \sqrt{\rho^\varepsilon}\, \exp\Big( \frac{i}{\varepsilon}\, \varphi^\varepsilon \Big). \]
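By separating real and imaginary parts and by introducing $u^\varepsilon \equiv \nabla \varphi^\varepsilon$, this allows us to rewrite (1) as a hydrodynamical system,
\[ \begin{cases} \partial_t \rho^\varepsilon + \nabla \cdot (\rho^\varepsilon u^\varepsilon) = 0, \\ \partial_t u^\varepsilon + (u^\varepsilon \cdot \nabla)\, u^\varepsilon + \nabla f(\rho^\varepsilon) = \dfrac{\varepsilon^2}{2}\, \nabla \Big( \dfrac{\Delta \sqrt{\rho^\varepsilon}}{\sqrt{\rho^\varepsilon}} \Big). \end{cases} \tag{3} \]
The system (3) is a compressible Euler equation with an additional term in the right-hand side called quantum pressure. As $\varepsilon$ tends to $0$, the quantum pressure is formally negligible and (3) reduces to the (compressible) Euler equation,
\[ \begin{cases} \partial_t \rho + \nabla \cdot (\rho u) = 0, \\ \partial_t u + (u \cdot \nabla)\, u + \nabla \big( f(\rho) \big) = 0. \end{cases} \tag{4} \]
The justification of this formal computation has received much interest recently. The case of analytic data was solved in [7]. Then for data with Sobolev regularity and a defocusing nonlinearity, so that (4) is hyperbolic, it was noticed by Grenier [9] that it is more convenient to use the transformation
\[ \Psi^\varepsilon = a^\varepsilon \exp\Big( i\, \frac{\varphi^\varepsilon}{\varepsilon} \Big) \tag{5} \]
and to allow the amplitude $a^\varepsilon$ to be complex. By using an identification between $\mathbb{C}$ and $\mathbb{R}^2$, this allows us to rewrite (1) as
\[ \begin{cases} \partial_t a^\varepsilon + u^\varepsilon \cdot \nabla a^\varepsilon + \dfrac{a^\varepsilon}{2}\, \nabla \cdot u^\varepsilon = \dfrac{\varepsilon}{2}\, J\, \Delta a^\varepsilon, \\ \partial_t u^\varepsilon + (u^\varepsilon \cdot \nabla)\, u^\varepsilon + \nabla f(|a^\varepsilon|^2) = 0, \end{cases} \tag{6} \]
where $J$ is the matrix of complex multiplication by $i$:
\[ J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \]
When $\varepsilon = 0$, we find the system
\[ \begin{cases} \partial_t a + u \cdot \nabla a + \dfrac{a}{2}\, \nabla \cdot u = 0, \\ \partial_t u + (u \cdot \nabla)\, u + \nabla f(|a|^2) = 0, \end{cases} \tag{7} \]
which is another form of (4), since then $(\rho \equiv |a|^2, u)$ solves (4). The convergence of (6) towards (7), provided the initial conditions suitably converge, was rigorously established by Grenier [9] in the case $f(\rho) = \rho$ (which corresponds to the cubic defocusing NLS). More precisely, it was proven in [9] that there exists $T > 0$ independent of $\varepsilon$ such that the solution of (6) is uniformly bounded in $H^s$ on $[0,T]$. In terms of the unknown $\Psi^\varepsilon$ of (1), this gives that
\[ \sup_{\varepsilon \in (0,1]}\ \sup_{[0,T]}\ \Big\| \Psi^\varepsilon \exp\Big( -i\, \frac{\varphi}{\varepsilon} \Big) \Big\|_{H^s} < +\infty \]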
for every $s$, where $(a, u = \nabla \varphi)$ is the solution of (7). Furthermore, the justification of WKB expansions under the form
\[ \Psi^\varepsilon - \sum_{k=0}^{m} \varepsilon^k a^k\, e^{i \varphi/\varepsilon} = O(\varepsilon^m)\, e^{i \varphi/\varepsilon} \]
for every $m$ was performed in [9]. The main idea in the work of Grenier [9] is to use the symmetrizer
\[ S \equiv {\rm diag}\Big( 1,\ 1,\ \frac{1}{4 f'(|a|^2)},\ \dots,\ \frac{1}{4 f'(|a|^2)} \Big) \]
of the hyperbolic system (7) to get $H^s$ energy estimates which are uniform in $\varepsilon$ for the singularly perturbed system (6). The case of nonlinearities for which $f'$ vanishes at zero (for instance the case $f(\rho) = \rho^2$) was left open in [9]. The additional difficulty is that for such nonlinearities, the system (7) is only weakly hyperbolic at $a = 0$ and in particular the symmetrizer $S$ becomes singular at $a = 0$. In more recent works, see [1,14,19], it was proven that for every weak solution of (1) with $f(\rho) = \rho - 1$ or $f(\rho) = \rho$, the limits as $\varepsilon \to 0$,
\[ |\Psi^\varepsilon|^2 - \rho \to 0 \ \text{ in } L^\infty([0,T], L^2), \qquad \varepsilon\, {\rm Im}\big( \bar\Psi^\varepsilon \nabla \Psi^\varepsilon \big) - \rho u \to 0 \ \text{ in } L^\infty([0,T], L^1_{\rm loc}) \tag{8} \]
hold under some suitable assumption on the initial data. The approach used in these papers is completely different, and relies on the modulated energy method introduced in [4]. The advantage of this powerful approach is that it allows one to describe the limit of weak solutions and to handle general nonlinearities once the existence of a global weak solution in the energy space for (1) is known. Nevertheless, it does not give precise qualitative information on the solution of (1): for example, it does not allow one to prove that the solution remains smooth on an interval of time independent of $\varepsilon$ if the initial data are smooth, or to justify the WKB expansion up to arbitrary orders in smooth norms. In the work [2], the possibility of getting the same result as in [9] for pure power nonlinearities $f(\rho) = \rho^\sigma$ in the case $\Omega = \mathbb{R}^d$ was studied. It was first noticed that, thanks to the result of [15], the system
\[ \begin{cases} \partial_t a + \nabla \varphi \cdot \nabla a + \dfrac{a}{2}\, \Delta \varphi = 0, \\ \partial_t \varphi + \dfrac12\, |\nabla \varphi|^2 + f(|a|^2) = 0, \end{cases} \tag{9} \]
with the initial condition $(a, \varphi)_{|t=0} = (a_0, \varphi_0) \in H^\infty$ has a unique smooth maximal solution $(a, \varphi) \in C\big([0, T^*[, H^s(\mathbb{R}^d) \times H^{s-1}(\mathbb{R}^d)\big)$ for every $s$. It was then established:

Theorem 1 ([2]). Let $d \le 3$, $\sigma \in \mathbb{N}^*$ and initial data $a^\varepsilon_0$, $\varphi^\varepsilon_0 \equiv \varphi_0$ in $H^\infty$ such that, for some functions $(\varphi_0, a_0) \in H^\infty$, $\| a^\varepsilon_0 - a_0 \|_{H^s} = O(\varepsilon)$ for every $s \ge 0$. Then, there exists $T^* > 0$ such that (9) with $f(\rho) = \rho^\sigma$ has a smooth maximal solution $(a, \varphi) \in C([0, T^*[, H^\infty \times H^\infty)$. Moreover, there exists $T \in (0, T^*)$
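independent of $\varepsilon$, such that the solution of (1), (2) remains smooth on $[0,T]$ and verifies the estimate
\[ \sup_{\varepsilon \in (0,1]} \Big\| \Psi^\varepsilon \exp\Big( -i\, \frac{\varphi}{\varepsilon} \Big) \Big\|_{L^\infty([0,T], H^s)} < +\infty, \tag{10} \]
where
• if $\sigma = 1$, then $s \in \mathbb{N}$ is arbitrary,
• if $\sigma = 2$ and $d = 1$, then one can take $s = 2$,
• if $\sigma = 2$ and $2 \le d \le 3$, then one can take $s = 1$,
• if $\sigma \ge 3$, then one can take $s = \sigma$.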
As emphasized in [2], in some cases, the global existence of smooth solutions is already known for (1). For example, in the quintic case, $\sigma = 2$, global existence is known for $d \le 3$ (see [6] for the difficult critical case $d = 3$), so that only the bound (10) is interesting. Nevertheless, Theorem 1 may also be applied to cases where (1) is $H^1$ super-critical ($\sigma \ge 3$, $d = 3$ for example), and hence the fact that it is possible to construct a smooth solution on a time interval independent of $\varepsilon$ is already interesting. The main ingredient used in [2] is a subtle transformation of (1) into a perturbation of a quasilinear symmetric hyperbolic system with non smooth coefficients when $\sigma \ge 2$. The first aim of this paper is to prove that the estimate (10) holds true for every $s$, every dimension $d$ and every nonlinearity $f$ which satisfies the following assumption:
\[ \text{(A)} \qquad f \in C^\infty([0, +\infty)), \qquad f(0) = 0, \qquad f' > 0 \ \text{on } (0, +\infty), \qquad \exists\, n \in \mathbb{N}^*, \ f^{(n)}(0) \ne 0. \]
Note that we allow $f'$ to vanish at the origin. The assumption (A) takes into account in particular all the homogeneous polynomial nonlinearities $f(\rho) = \rho^\sigma$, but also nonlinearities under the form $f(\rho) = \rho^{\sigma_1} + \rho^{\sigma_2}$ or $f(\rho) = \dfrac{\rho^\sigma}{1 + \rho}$ for example. Our result reads:

Theorem 2. We assume (A), and consider an initial data (2) with $\varphi^\varepsilon_0$ real-valued, $a^\varepsilon_0$, $\varphi^\varepsilon_0$ in $H^\infty$ such that, for some real-valued functions $(\varphi_0, a_0) \in H^\infty$, we have for every $s$,
\[ \| a^\varepsilon_0 - a_0 \|_{H^s} = O(\varepsilon) \qquad \text{and} \qquad \| \varphi^\varepsilon_0 - \varphi_0 \|_{H^s} = O(\varepsilon). \]
Then, there exists $T^* > 0$ such that (7) with initial value $(a_0, \varphi_0)$ has a unique smooth maximal solution $(a, \varphi) \in C([0, T^*[, H^\infty \times H^\infty)$. Moreover, there exists $T \in (0, T^*]$ such that for every $\varepsilon \in (0,1)$, the solution $\Psi^\varepsilon$ to (1)–(2) exists at least on $[0,T]$ and satisfies for every $s$,
\[ \sup_{\varepsilon \in (0,1]} \Big\| \Psi^\varepsilon \exp\Big( -\frac{i}{\varepsilon}\, \varphi \Big) \Big\|_{L^\infty([0,T], H^s)} < +\infty. \]
More precisely, there exists $\varphi^\varepsilon = \varphi + O_{H^\infty}(\varepsilon)$ such that, for every $s$,
\[ \Big\| \Psi^\varepsilon \exp\Big( -\frac{i}{\varepsilon}\, \varphi^\varepsilon \Big) - a \Big\|_{L^\infty([0,T], H^s)} = O(\varepsilon). \tag{11} \]
Let us give a few comments on the statement of Theorem 2. At first, note that Theorem 2 contains a result of local existence of smooth solutions for (9) in the case of non necessarily homogeneous nonlinearities satisfying (A). Since $(a, \nabla \varphi)$ solves a compressible type Euler equation, the case of a homogeneous nonlinearity was studied in [15], and we thus give an extension of this result to smooth nonlinearities satisfying assumption (A). A precise statement of our result with the required regularity of the initial data is given in Theorem 4 below. The new difficulty when $f$ is not homogeneous is that the nonlinear symmetrization does not seem to allow one to transform the problem into a classical symmetric or symmetrizable hyperbolic system with smooth coefficients. The correction of order $\varepsilon$ that we have to add to the phase to get the estimate (11) is expected. Indeed, a perturbation of order $\varepsilon$ in the phase modifies the amplitude at the leading order. Our approach to prove Theorem 2 is completely different from the one of [2] and [9]. We do not work any more on the system (6) or any reformulation of (1) into a perturbation of a quasilinear symmetric hyperbolic system, but directly on the NLS equation (1). Basically, we first prove the linear stability for (1) in arbitrary Sobolev norms of a highly oscillating solution of the form $a\, e^{i \varphi/\varepsilon}$ and then use a fixed point argument to prove the nonlinear stability. The crucial estimate of linear stability of a highly oscillating solution is given in Lemma 1 and Theorem 3. This actually allows one to justify WKB expansions up to arbitrary orders (see Theorem 5). Since we deal in this paper with sufficiently smooth and in particular bounded solutions, the assumption (A) can be replaced by a local version where we assume that $f' > 0$ on $(0, \beta)$ with $\beta$ independent of $\varepsilon$ if the initial datum verifies $|a_0|^2 < \beta$.
Indeed, since $a_0$ takes its values in the (weak) hyperbolic region of the limit system (7), there still exists a local smooth solution of (7) defined on $[0,T]$ for some $T > 0$ and the stability argument leading to Theorem 2 still holds. Consequently, our result can also be applied to nonlinearities like $f(\rho) = \rho^{\sigma_1} - \rho^{\sigma_2}$ for every $\sigma_2 > \sigma_1$, provided $|a_0|^2 \le \beta \ll 1$. Note that when $\sigma_2$ is too large, the classical global existence result of weak solutions (see [8]) for (1) is not valid and hence it does not seem possible to use the modulated energy method of [1,14] to investigate the semiclassical limit. Finally, the last advantage of our approach is that it can be easily generalized to the case of a domain with boundary and to non-zero condition at infinity. This will be the aim of the second part of the paper. We shall restrict ourselves to a physical case, the Gross-Pitaevskii equation, i.e. $f(\rho) = \rho - 1$. The generalization to more general nonlinearities satisfying an assumption like (A) is rather straightforward. This simplifying assumption is only made to avoid the multiplication of difficulties. Again to avoid too many technicalities, we restrict ourselves to the simplest domain $\Omega = \mathbb{R}^d_+ = \mathbb{R}^{d-1} \times (0, +\infty)$. For $x \in \mathbb{R}^d_+$, we shall use the notation $x = (y, z)$, $y \in \mathbb{R}^{d-1}$, $z > 0$. We add to (1) the Neumann boundary condition
\[ \partial_z \Psi^\varepsilon(t, y, 0) = 0. \tag{12} \]
We also impose the following condition at infinity:
\[ \Psi^\varepsilon(t, x) \sim \exp\Big( -i t\, \frac{|u^\infty|^2}{2 \varepsilon} + i\, \frac{u^\infty \cdot x}{\varepsilon} \Big), \qquad |x| \to +\infty, \tag{13} \]
that we can write in hydrodynamical variables
\[ |\Psi^\varepsilon(t, x)|^2 \to 1, \qquad u^\varepsilon(t, x) \to u^\infty, \qquad |x| \to +\infty, \]
where $u^\infty$ is a constant vector. This condition appears naturally when we study a moving obstacle in the fluid. Indeed, if we start from (1) with the Neumann boundary condition on an obstacle moving at constant velocity and fluid at rest at infinity, then we can use the Galilean invariance of (1) to transform the problem into the study of (1) in a fixed domain but with the condition (13) at infinity. This problem with such boundary conditions is physically meaningful since it can be used to describe superfluids past an obstacle (we refer to [16] for example). The semiclassical limit $\varepsilon \to 0$ was already studied in [14] by using the modulated energy method. The limit (8) was proven with $(\rho, u)$ the solution of the compressible Euler equation with boundary condition $u \cdot n_{|\partial \Omega} = 0$, $n$ being the normal to the boundary. Note that the result of [14] is restricted to the two-dimensional case only in order to have a global solution in the energy space of (1). By using more recent results on the Cauchy problem [3], one can also get the result in the three-dimensional case at least when $u^\infty = 0$. Our aim here is to give a more precise description of the convergence which takes into account boundary layers. More precisely, since the solution of the Euler system (9) cannot match the Neumann boundary condition $\partial_z a(t, y, 0) = 0$, a boundary layer of weak amplitude $\varepsilon$ and of size $\varepsilon$ appears. Such layers are formally described for example in [16]. WKB expansions $\Psi^\varepsilon = a^\varepsilon\, e^{i \varphi^\varepsilon/\varepsilon}$ are thus to be sought under the form
\[ a^\varepsilon = a^0 + \sum_{k=1}^{m} \varepsilon^k \Big( a^k(t, x) + A^k\big(t, y, \tfrac{z}{\varepsilon}\big) \Big), \qquad \varphi^\varepsilon = \varphi^0 + \sum_{k=1}^{m} \varepsilon^k \Big( \varphi^k(t, x) + \Phi^k\big(t, y, \tfrac{z}{\varepsilon}\big) \Big), \tag{14} \]
where the profiles $A^k(t, y, Z)$, $\Phi^k(t, y, Z)$ are exponentially decreasing in the $Z$ variable and are chosen such that
\[ \partial_z a^k(t, y, 0) + \partial_Z A^{k+1}(t, y, 0) = 0, \qquad \partial_z \varphi^k(t, y, 0) + \partial_Z \Phi^{k+1}(t, y, 0) = 0, \]
so that the approximate WKB expansion $\Psi^{WKB} = a^\varepsilon \exp\big( \frac{i}{\varepsilon} \varphi^\varepsilon \big)$ matches the Neumann boundary condition (12). Our result (Theorem 6) is that under suitable assumptions on the initial conditions, we have the nonlinear stability of WKB expansions: in particular we have the existence of a smooth solution for (1), (12), (13) on a time interval independent of $\varepsilon$ and the estimate
\[ \big\| \Psi^\varepsilon\, e^{-i \varphi^\varepsilon/\varepsilon} - a^\varepsilon \big\|_{W^{1,\infty}} \lesssim \varepsilon. \tag{15} \]
Note that it is necessary to incorporate the boundary layer $\varepsilon A^1$ in order to get (15) since its gradient has amplitude one in $L^\infty$. The case of the Dirichlet boundary condition, which is also physically meaningful (we again refer to [16]), seems more complicated to handle since, as often in boundary layer theory in fluid mechanics, the boundary layers involved have amplitude one. This is left for future work.

The paper is organized as follows. In Sect. 2, we prove the linear stability in $H^s$ of an approximate WKB solution of (1) under the form $a^\varepsilon \exp\big( i \frac{\varphi^\varepsilon}{\varepsilon} \big)$ in the case $\Omega = \mathbb{R}^d$. This is the crucial part towards the proof of Theorem 2. Next, in Sect. 3, we give the construction of a WKB expansion up to arbitrary order and give the proof of the local existence of a smooth solution for the compressible Euler equation with a pressure law satisfying (A). In Sect. 4, we give the justification of WKB expansions at every order and recover Theorem 2 as a particular case. This part uses in a classical way the linear stability result and a fixed point argument. Finally, in Sect. 5, we study the problem in the half-space with Neumann boundary condition.
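2. Linear Stability

In this section, we consider a smooth WKB approximate solution $\Psi^a = a^\varepsilon \exp\big( i \frac{\varphi^\varepsilon}{\varepsilon} \big)$ of (1) such that
\[ NLS(\Psi^a) = R^\varepsilon \exp\Big( i\, \frac{\varphi^\varepsilon}{\varepsilon} \Big), \tag{16} \]
where
\[ NLS(\Psi) \equiv i \varepsilon\, \partial_t \Psi + \frac{\varepsilon^2}{2}\, \Delta \Psi - \Psi f(|\Psi|^2). \]
Moreover, we also set
\[ R_\varphi \equiv \partial_t \varphi^\varepsilon + \frac12\, |\nabla \varphi^\varepsilon|^2 + f(|a^\varepsilon|^2), \tag{17} \]
\[ R_a \equiv \partial_t a^\varepsilon + \nabla \varphi^\varepsilon \cdot \nabla a^\varepsilon + \frac12\, a^\varepsilon\, \Delta \varphi^\varepsilon, \tag{18} \]
so that
\[ R^\varepsilon = -a^\varepsilon R_\varphi + i \varepsilon\, R_a + \frac{\varepsilon^2}{2}\, \Delta a^\varepsilon. \]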
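The identity relating $R^\varepsilon$ to $R_\varphi$ and $R_a$ follows from expanding $NLS(a^\varepsilon e^{i \varphi^\varepsilon/\varepsilon})$ term by term, and it can be sanity-checked numerically. The sketch below works in one space dimension with centered finite differences; the sample amplitude, phase, nonlinearity $f(\rho) = \rho^2$ and the value $\varepsilon = 0.5$ are arbitrary illustrative choices, not data from the paper.

```python
import cmath
import math

EPS = 0.5  # sample value of the semiclassical parameter (assumption)
H = 1e-3   # finite-difference step


def f(rho: float) -> float:
    """Sample nonlinearity satisfying (A) (assumption): f(rho) = rho^2."""
    return rho * rho


def a(t: float, x: float) -> complex:
    """Sample smooth complex amplitude (assumption)."""
    return (1.0 + 0.5 * t) * math.exp(-x * x) * complex(1.0, 0.3 * x)


def phi(t: float, x: float) -> float:
    """Sample smooth real phase (assumption)."""
    return math.sin(x) * (1.0 + t)


def d_t(g, t, x):
    return (g(t + H, x) - g(t - H, x)) / (2 * H)


def d_x(g, t, x):
    return (g(t, x + H) - g(t, x - H)) / (2 * H)


def d_xx(g, t, x):
    return (g(t, x + H) - 2 * g(t, x) + g(t, x - H)) / (H * H)


def psi(t: float, x: float) -> complex:
    """WKB ansatz a * exp(i * phi / eps)."""
    return a(t, x) * cmath.exp(1j * phi(t, x) / EPS)


def nls(g, t, x) -> complex:
    """NLS operator i*eps*d_t + (eps^2/2)*d_xx - g*f(|g|^2) applied to g."""
    value = g(t, x)
    return 1j * EPS * d_t(g, t, x) + 0.5 * EPS ** 2 * d_xx(g, t, x) - value * f(abs(value) ** 2)


def residual(t: float, x: float) -> complex:
    """R^eps = -a * R_phi + i * eps * R_a + (eps^2 / 2) * d_xx a."""
    amp = a(t, x)
    r_phi = d_t(phi, t, x) + 0.5 * d_x(phi, t, x) ** 2 + f(abs(amp) ** 2)
    r_a = d_t(a, t, x) + d_x(phi, t, x) * d_x(a, t, x) + 0.5 * amp * d_xx(phi, t, x)
    return -amp * r_phi + 1j * EPS * r_a + 0.5 * EPS ** 2 * d_xx(a, t, x)


def mismatch(t: float, x: float) -> float:
    """|NLS(psi) * exp(-i * phi / eps) - R^eps|, up to discretization error."""
    return abs(nls(psi, t, x) * cmath.exp(-1j * phi(t, x) / EPS) - residual(t, x))


if __name__ == "__main__":
    print(mismatch(0.2, 0.3), mismatch(0.1, -0.7))
```

The reported mismatch is of the size of the finite-difference truncation error only, which is consistent with (16)–(18).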
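Looking for an exact solution of (1) under the form
\[ \Psi^\varepsilon = \Psi^a + w\, e^{i \varphi^\varepsilon/\varepsilon} = (a^\varepsilon + w)\, e^{i \varphi^\varepsilon/\varepsilon}, \]
we find that $w$ solves the nonlinear Schrödinger equation
\[ i \varepsilon \Big( \partial_t w + u^\varepsilon \cdot \nabla w + \frac12\, w\, \nabla \cdot u^\varepsilon \Big) + \frac{\varepsilon^2}{2}\, \Delta w - 2 (w, a^\varepsilon)\, f'(|a^\varepsilon|^2)\, a^\varepsilon = R_\varphi\, w - R^\varepsilon + Q^\varepsilon(w), \tag{19} \]
where $(\cdot, \cdot)$ stands for the real scalar product in $\mathbb{C} \simeq \mathbb{R}^2$, with $u^\varepsilon \equiv \nabla \varphi^\varepsilon$, and the nonlinear term $Q^\varepsilon(w)$ is defined by
\[ Q^\varepsilon(w) \equiv (a^\varepsilon + w) \Big( f(|a^\varepsilon + w|^2) - f(|a^\varepsilon|^2) \Big) - 2 (w, a^\varepsilon)\, f'(|a^\varepsilon|^2)\, a^\varepsilon. \tag{20} \]
Of course, $R^\varepsilon$ will be very small and $R_\varphi$ (and $R_a$) are to be thought of as small (at least $O(\varepsilon)$) for applications to nonlinear stability results. Nevertheless, in this section the exact form of these terms is not important. The way to construct an accurate WKB solution $\Psi^a$ will be explained in the next section.

Remark 1. If we work with a nonlinearity $f$ such that $f(A^2) = 0$ for some $A \in \mathbb{R}$, we can impose a non-zero condition at infinity such as $a_0 \in A + H^\infty$ and $\nabla \varphi_0 \in U^\infty + H^\infty$ for some constant vector $U^\infty \in \mathbb{R}^d$. Since we can still look for the perturbation $w$ in $H^s$, this does not affect the proofs.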
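Since we expect the correction term $w$ to be small, we shall only consider in this section the linearized equation
\[ i \varepsilon\, \frac{\partial w}{\partial t} + L^\varepsilon w = R_\varphi\, w + F^\varepsilon, \qquad x \in \mathbb{R}^d, \tag{21} \]
where the linear operator $L^\varepsilon$ is defined as
\[ L^\varepsilon(w) \equiv \frac{\varepsilon^2}{2}\, \Delta w + i \varepsilon\, u^\varepsilon \cdot \nabla w + \frac{i \varepsilon}{2}\, w\, \nabla \cdot u^\varepsilon - 2 f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, a^\varepsilon. \]
In this section, $F^\varepsilon$ is considered as a given source term. Of course, for the proof of Theorem 2, we shall apply the result of this section to
\[ F^\varepsilon = -R^\varepsilon + Q^\varepsilon(w). \tag{22} \]
Furthermore, let us emphasize that at this stage, $R_\varphi$ is seen as a multiplicative operator with no link with the vector field $u^\varepsilon$ appearing in $L^\varepsilon$, even though we will use this lemma with $u^\varepsilon = \nabla \varphi^\varepsilon$. We notice that $L^\varepsilon$ is formally self-adjoint, but only the first and last terms give rise to a nonnegative quadratic functional. Indeed, the quadratic form (in $H^1$) associated to the operator
\[ S^\varepsilon w \equiv -\frac{\varepsilon^2}{2}\, \Delta w + 2 f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, a^\varepsilon \]
is, since $f' \ge 0$,
\[ \int_{\mathbb{R}^d} \big( w, S^\varepsilon w \big) = \frac12 \int_{\mathbb{R}^d} \Big( \varepsilon^2 |\nabla w|^2 + 4 f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)^2 \Big) \ge 0. \]
It is then natural to consider the (squared) norm $\int_{\mathbb{R}^d} \big( w, S^\varepsilon(w) \big)$ as a good energy for the linearized equation (21). Consequently, we introduce the weighted norm
\[ N^\varepsilon(w) \equiv \frac12 \int_{\mathbb{R}^d} \Big( \varepsilon^2 |\nabla w|^2 + 4 f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)^2 + K \varepsilon^2 |w|^2 \Big) \]
for every $K > 0$ ($K$ will be chosen sufficiently large only in the next subsection). Our first result of this section is a linear stability result in the energy norm $N^\varepsilon(w)$.

Lemma 1. Assume that $u^\varepsilon : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$ and $a^\varepsilon : [0,T] \times \mathbb{R}^d \to \mathbb{C}$ are smooth and such that
\[ M \equiv \| \nabla_x u^\varepsilon \|_{L^\infty([0,T] \times \mathbb{R}^d)} + \| \nabla_x (\nabla \cdot u^\varepsilon) \|_{L^\infty([0,T] \times \mathbb{R}^d)} + \| |a^\varepsilon|^2 \|_{L^\infty([0,T] \times \mathbb{R}^d)} < +\infty. \]
Let $w \in C^1([0,T], H^2)$ be a solution of (21). Then, there exists $C_M$ depending only on $d$, $f$ and $M$ such that for every $\varepsilon \in (0,1]$, the solution $w$ of (21) satisfies the energy estimate
\[ \frac{d}{dt} N^\varepsilon(w(t)) \le C_M \Big( 1 + \frac{1}{\varepsilon}\, \| R_a(t) \|_{L^\infty} + \frac{1}{\varepsilon}\, \| R_\varphi(t) \|_{W^{1,\infty}} + \frac{1}{\varepsilon^2}\, \| R_\varphi(t) \|_{L^\infty} \Big) N^\varepsilon(w(t)) + \| F^\varepsilon(t) \|^2_{L^2} - \frac{4}{\varepsilon} \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, (a^\varepsilon, i F^\varepsilon) + \int_{\mathbb{R}^d} (\varepsilon \Delta w, i F^\varepsilon). \tag{23} \]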
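Note that it is very easy to get from (23) and the Gronwall inequality a classical estimate of linear stability. Indeed, assuming that $R_a = O_{L^\infty([0,T], L^\infty)}(\varepsilon)$ and $R_\varphi = O_{L^\infty([0,T], W^{1,\infty})}(\varepsilon^2)$ (which is true if $(a^\varepsilon, \varphi^\varepsilon)$ come from the WKB method), we infer from a crude estimate for the two last terms in (23) that for $0 \le t \le T$,
\[ \frac{d}{dt} N^\varepsilon(w(t)) \le C\, N^\varepsilon(w(t)) + \frac{1}{\varepsilon^2}\, \| F^\varepsilon(t) \|^2_{H^1}, \]
which gives for $0 \le t \le T$,
\[ N^\varepsilon(w(t)) \le e^{Ct} \Big( N^\varepsilon(w(0)) + \frac{1}{\varepsilon^2} \int_0^t \| F^\varepsilon(\tau) \|^2_{H^1}\, d\tau \Big), \]
which is a more classical result of linear stability in the energy norm $N^\varepsilon(w)$ since the amplification rate $C$ is independent of $\varepsilon$. Nevertheless, to get $H^s$ estimates and the best possible nonlinear results, it is important to have the special structure of the two last terms in (23). Modulated linearized functionals like $N^\varepsilon$ were also used in asymptotic problems in fluid mechanics, see [10] for example.

2.1. Proof of Lemma 1. The norms $L^\infty$, $W^{1,\infty}$, $L^2$, ... always stand for the norms in the $x$ variable. At first, since $S^\varepsilon$ is self-adjoint, we have
\[ \frac{d}{dt} \int_{\mathbb{R}^d} \big( S^\varepsilon w, w \big) = 2 \int_{\mathbb{R}^d} \big( S^\varepsilon w, \partial_t w \big) + 2 \int_{\mathbb{R}^d} \partial_t \big[ f'(|a^\varepsilon|^2) \big]\, (w, a^\varepsilon)^2 + 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, (w, \partial_t a^\varepsilon). \tag{24} \]
Next, we use (21) to express $\partial_t w$ as
\[ \partial_t w = -\frac{i}{\varepsilon}\, S^\varepsilon w - \Big( u^\varepsilon \cdot \nabla w + \frac12\, w\, \nabla \cdot u^\varepsilon \Big) - \frac{i}{\varepsilon}\, R_\varphi\, w - \frac{i}{\varepsilon}\, F^\varepsilon \]
to get
\[ 2 \int_{\mathbb{R}^d} \big( S^\varepsilon w, \partial_t w \big) = 2 \int_{\mathbb{R}^d} \Big( \frac{\varepsilon^2}{2}\, \Delta w - 2 f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, a^\varepsilon,\ u^\varepsilon \cdot \nabla w + \frac12\, w\, \nabla \cdot u^\varepsilon + \frac{i}{\varepsilon}\, R_\varphi\, w + \frac{i}{\varepsilon}\, F^\varepsilon \Big). \tag{25} \]
We shall now estimate the various terms in the right-hand side of (25). Integrating by parts, we get
\[ \int_{\mathbb{R}^d} \Big( \varepsilon^2 \Delta w,\ \frac{i}{\varepsilon}\, R_\varphi\, w \Big) = -\varepsilon \int_{\mathbb{R}^d} \big( \nabla w,\ i w\, \nabla R_\varphi \big) \le \varepsilon\, \| \nabla R_\varphi \|_{L^\infty}\, \| w \|_{L^2}\, \| \nabla w \|_{L^2} \le \frac{1}{\varepsilon}\, \| R_\varphi \|_{W^{1,\infty}}\, N^\varepsilon(w). \]
Note that we have used that $R_\varphi$ is real-valued and thus that $(\nabla w, i R_\varphi \nabla w) = 0$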
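for the first equality. We also easily obtain by integration by parts that
\[ \int_{\mathbb{R}^d} \big( \varepsilon^2 \Delta w,\ w\, \nabla \cdot u^\varepsilon \big) \le C \Big( \| \nabla \cdot u^\varepsilon \|_{L^\infty} + \| \nabla (\nabla \cdot u^\varepsilon) \|_{L^\infty} \Big) \Big( \varepsilon^2 \| \nabla w \|^2_{L^2} + \varepsilon^2 \| w \|^2_{L^2} \Big) \le C_M\, N^\varepsilon(w). \]
In the proof, $C_M$ is a harmless number which changes from line to line and which depends only on $M$. In particular, it is independent of $\varepsilon$. Moreover, we can also write for $k = 1, \dots, d$,
\[ \int_{\mathbb{R}^d} \big( \partial^2_{kk} w,\ u^\varepsilon \cdot \nabla w \big) = -\int_{\mathbb{R}^d} u^\varepsilon \cdot \nabla \Big( \frac{|\partial_k w|^2}{2} \Big) - \int_{\mathbb{R}^d} \big( \partial_k w,\ \partial_k u^\varepsilon \cdot \nabla w \big) = \int_{\mathbb{R}^d} \frac{|\partial_k w|^2}{2}\, \nabla \cdot u^\varepsilon - \int_{\mathbb{R}^d} \big( \partial_k w,\ \partial_k u^\varepsilon \cdot \nabla w \big), \]
and hence, we immediately infer
\[ \int_{\mathbb{R}^d} \big( \varepsilon^2 \Delta w,\ u^\varepsilon \cdot \nabla w \big) \le C_M\, N^\varepsilon(w). \]
Furthermore, from the inequality $2ab \le a^2 + b^2$, there holds
\[ -\frac{4}{\varepsilon} \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, \big( a^\varepsilon, i R_\varphi\, w \big) \le \frac{C_M}{\varepsilon^2}\, \| R_\varphi \|_{L^\infty} \int_{\mathbb{R}^d} 2\, f'(|a^\varepsilon|^2)^{1/2}\, |(w, a^\varepsilon)|\ \varepsilon |w| \]
\[ \le \frac{C_M}{\varepsilon^2}\, \| R_\varphi \|_{L^\infty} \int_{\mathbb{R}^d} \Big( f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)^2 + \varepsilon^2 |w|^2 \Big) \le \frac{C_M}{\varepsilon^2}\, \| R_\varphi \|_{L^\infty}\, N^\varepsilon(w). \tag{26} \]
Consequently, we can replace (25) in (24) and use the above estimates to get
\[ \frac{d}{dt} \int_{\mathbb{R}^d} \big( S^\varepsilon w, w \big) = 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\, (w, a^\varepsilon) \Big[ (w, \partial_t a^\varepsilon) - \Big( u^\varepsilon \cdot \nabla w + \frac12\, w\, \nabla \cdot u^\varepsilon,\ a^\varepsilon \Big) \Big] + 2 \int_{\mathbb{R}^d} \partial_t \big[ f'(|a^\varepsilon|^2) \big]\, (w, a^\varepsilon)^2 + E_1, \tag{27} \]
where $E_1$ satisfies the estimate
\[ E_1 \le C_M \Big( 1 + \frac{1}{\varepsilon}\, \| R_\varphi \|_{W^{1,\infty}} + \frac{1}{\varepsilon^2}\, \| R_\varphi \|_{L^\infty} \Big) N^\varepsilon(w) - \frac{4}{\varepsilon} \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\, (w, a^\varepsilon)\, (a^\varepsilon, i F^\varepsilon) + \int_{\mathbb{R}^d} (\varepsilon \Delta w, i F^\varepsilon). \tag{28} \]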
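To estimate the first integral in the right-hand side of (27), we use Eq. (18) to get
\[
\begin{aligned}
4\int_{\mathbb{R}^d}&f'(|a^\varepsilon|^2)(w,a^\varepsilon)\Big[(w,\partial_t a^\varepsilon) - \Big(u^\varepsilon\cdot\nabla w + \frac12\, w\,\nabla\cdot u^\varepsilon, a^\varepsilon\Big)\Big] \\
&= 4\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)\Big[(w,R_a) - u^\varepsilon\cdot\nabla(w,a^\varepsilon) - (w,a^\varepsilon)\nabla\cdot u^\varepsilon\Big] \\
&= 4\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)(w,R_a) - 2\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)\,u^\varepsilon\cdot\nabla\big((w,a^\varepsilon)^2\big) - 4\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)^2\,\nabla\cdot u^\varepsilon \\
&= 4\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)(w,R_a) + 2\int_{\mathbb{R}^d}(w,a^\varepsilon)^2\,u^\varepsilon\cdot\nabla\big(f'(|a^\varepsilon|^2)\big) - 2\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)^2\,\nabla\cdot u^\varepsilon.
\end{aligned}
\]
To get the last line, we have integrated by parts the second integral. Note that the last term is bounded by $C_M N^\varepsilon(w)$, and, as for (26), that the first integral is bounded by $\frac{C_M}{\varepsilon}\|R_a\|_{L^\infty}N^\varepsilon(w)$. Consequently, we can replace the above identity in (27) to get
\[
\frac{d}{dt}\int_{\mathbb{R}^d}\big(S^\varepsilon w,w\big) = \int_{\mathbb{R}^d}2(w,a^\varepsilon)^2\big(\partial_t + u^\varepsilon\cdot\nabla\big)\big(f'(|a^\varepsilon|^2)\big) + E_1 + E_2 =: I + E_1 + E_2, \tag{29}
\]
where $E_2$ is such that
\[
E_2 \le C_M\Big(1 + \frac{1}{\varepsilon}\|R_a\|_{L^\infty}\Big)N^\varepsilon(w). \tag{30}
\]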
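The integration by parts used to pass to the last line of the identity preceding (29) can be spelled out as follows. This is only a sketch of that single step; it assumes enough decay at infinity for the boundary terms to vanish and abbreviates $f' = f'(|a^\varepsilon|^2)$.

```latex
\begin{align*}
-2\int_{\mathbb{R}^d} f'\, u^\varepsilon\cdot\nabla\big((w,a^\varepsilon)^2\big)
  &= 2\int_{\mathbb{R}^d} (w,a^\varepsilon)^2\, \nabla\cdot\big(f' u^\varepsilon\big) \\
  &= 2\int_{\mathbb{R}^d} (w,a^\varepsilon)^2\, u^\varepsilon\cdot\nabla f'
   + 2\int_{\mathbb{R}^d} f'\,(w,a^\varepsilon)^2\, \nabla\cdot u^\varepsilon .
\end{align*}
```

Adding the remaining term $-4\int f'(w,a^\varepsilon)^2\,\nabla\cdot u^\varepsilon$ leaves the coefficient $-2$ in front of $\int f'(w,a^\varepsilon)^2\,\nabla\cdot u^\varepsilon$, as in the last line of the identity.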
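To estimate $I$, we use again Eq. (18) which gives
\[
\big(\partial_t + u^\varepsilon\cdot\nabla\big)\big(f'(|a^\varepsilon|^2)\big) = 2f''(|a^\varepsilon|^2)\big(a^\varepsilon, \partial_t a^\varepsilon + u^\varepsilon\cdot\nabla a^\varepsilon\big) = 2f''(|a^\varepsilon|^2)\Big(R_a - \frac12\, a^\varepsilon\,\nabla\cdot u^\varepsilon, a^\varepsilon\Big),
\]
and hence we find
\[
I \le C\int_{\mathbb{R}^d}|a^\varepsilon|^2\,|f''(|a^\varepsilon|^2)|\,(w,a^\varepsilon)^2 + 4\int_{\mathbb{R}^d}|a^\varepsilon|\,|f''(|a^\varepsilon|^2)|\,(w,a^\varepsilon)^2\,|R_a|.
\]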
To conclude, we shall use Assumption (A). By defining $n \in \mathbb{N}^*$ the first integer such that $f^{(n)}(0) \neq 0$, we see from Taylor expansion that
\[
f'(\rho) = \rho^{n-1}q(\rho) \tag{31}
\]
for some smooth positive function $q$ on $[0,+\infty)$. In particular, since $q > 0$, we have
\[
\rho \mapsto \frac{\rho f''(\rho)}{f'(\rho)} = n - 1 + \rho\,\frac{q'(\rho)}{q(\rho)} \in C^\infty([0,+\infty)),
\]
which implies
\[
\rho\,|f''(\rho)| \le C_M f'(\rho) \quad \text{for } 0 \le \rho \le M. \tag{32}
\]
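The identity behind (32) can be checked directly from (31). The computation below is a short sketch of that step; it uses nothing beyond $f'(\rho) = \rho^{n-1}q(\rho)$ with $q$ smooth and positive.

```latex
\begin{align*}
f''(\rho) &= (n-1)\rho^{n-2}q(\rho) + \rho^{n-1}q'(\rho), \\
\frac{\rho f''(\rho)}{f'(\rho)}
  &= \frac{(n-1)\rho^{n-1}q(\rho) + \rho^{n}q'(\rho)}{\rho^{n-1}q(\rho)}
   = n-1+\rho\,\frac{q'(\rho)}{q(\rho)} \qquad (\rho>0).
\end{align*}
```

The right-hand side extends continuously to $\rho = 0$, so it is bounded on any compact interval $[0,M]$, which is exactly what the bound (32) expresses.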
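This yields
\[
\int_{\mathbb{R}^d}|a^\varepsilon|^2\,|f''(|a^\varepsilon|^2)|\,(w,a^\varepsilon)^2 \le C_M\int_{\mathbb{R}^d}(w,a^\varepsilon)^2 f'(|a^\varepsilon|^2) \le C_M N^\varepsilon(w),
\]
where, again, $C_M$ depends only on $M$. In a similar way, we also obtain
\[
\begin{aligned}
\int_{\mathbb{R}^d}(w,a^\varepsilon)^2\,|a^\varepsilon|\,|f''(|a^\varepsilon|^2)|\,|R_a|
&\le \|R_a\|_{L^\infty}\int_{\mathbb{R}^d}|w|\cdot|(w,a^\varepsilon)|\cdot|a^\varepsilon|^2\,|f''(|a^\varepsilon|^2)| \\
&\le \frac{C_M}{\varepsilon}\|R_a\|_{L^\infty}\int_{\mathbb{R}^d}(\varepsilon|w|)\,|(w,a^\varepsilon)|\,f'(|a^\varepsilon|^2) \\
&\le \frac{C_M}{\varepsilon}\|R_a\|_{L^\infty}N^\varepsilon(w).
\end{aligned}
\]
Consequently, we have proven that
\[
I \le C_M\Big(1 + \frac{1}{\varepsilon}\|R_a\|_{L^\infty}\Big)N^\varepsilon(w). \tag{33}
\]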
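To get the result of Lemma 1, it remains to perform the $L^2$ estimate. Taking the $L^2$ scalar product of (21) with $iw$ and using that
\[
\Big(w, u^\varepsilon\cdot\nabla w + \frac12\, w\,\nabla\cdot u^\varepsilon\Big) = \frac12\nabla\cdot\big(|w|^2u^\varepsilon\big),
\]
we get
\[
\frac{d}{dt}\Big(\frac{\varepsilon^2}{2}\|w\|_{L^2}^2\Big) = \int_{\mathbb{R}^d}\varepsilon\,(F^\varepsilon, iw) + 2\varepsilon\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)(a^\varepsilon, iw).
\]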
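Note that we have once again used that $R_\varphi$ is real-valued and hence that $(R_\varphi w, iw) = 0$. The first integral is clearly bounded by $N^\varepsilon(w) + \|F^\varepsilon\|_{L^2}^2$ whereas for the second one, we have
\[
2\varepsilon\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(w,a^\varepsilon)(a^\varepsilon, iw) \le C_M\int_{\mathbb{R}^d}\big(f'(|a^\varepsilon|^2)(w,a^\varepsilon)^2 + \varepsilon^2|w|^2\big) \le C_M N^\varepsilon(w).
\]
As a consequence, we get
\[
\frac{d}{dt}\Big(\frac{\varepsilon^2}{2}\|w\|_{L^2}^2\Big) \le C_M N^\varepsilon(w) + \|F^\varepsilon\|_{L^2}^2. \tag{34}
\]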
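The pointwise bound used for the second integral before (34) is an instance of $2ab \le a^2 + b^2$. A minimal sketch, assuming $\|a^\varepsilon\|_{L^\infty} \le M$ so that $f'(|a^\varepsilon|^2)\,|a^\varepsilon|^2 \le C_M$, and writing $f' = f'(|a^\varepsilon|^2)$:

```latex
\begin{align*}
\big|2\varepsilon f'\,(w,a^\varepsilon)(a^\varepsilon,iw)\big|
  &\le 2\,\big(f'^{1/2}|(w,a^\varepsilon)|\big)\,\big(f'^{1/2}|a^\varepsilon|\,\varepsilon|w|\big) \\
  &\le f'\,(w,a^\varepsilon)^2 + f'\,|a^\varepsilon|^2\,\varepsilon^2|w|^2 \\
  &\le C_M\big(f'\,(w,a^\varepsilon)^2 + \varepsilon^2|w|^2\big).
\end{align*}
```

The first line uses $|(a^\varepsilon, iw)| \le |a^\varepsilon|\,|w|$; integrating over $\mathbb{R}^d$ gives the stated bound by $C_M N^\varepsilon(w)$.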
Finally, we can collect (28), (29), (30), (33) and (34) to get (23). This completes the proof.

2.2. Higher order estimates. Since our final aim is to prove Theorem 2 by a fixed point argument, we also need to have $H^s$ estimates for $s$ sufficiently large for the solution of the linear equation (21). This is the aim of the following. Note that the term $-2(w,a^\varepsilon)f'(|a^\varepsilon|^2)a^\varepsilon$ in (19) can be seen as a singular term with variable coefficients. Consequently, a crude way to get $H^s$ estimates is to apply $\varepsilon^{|\alpha|}\partial^\alpha$ to the equation,
the weight $\varepsilon^{|\alpha|}$ being used to compensate the singular commutator when we take the derivative of (19), and then to apply Lemma 1 to the resulting equation. Nevertheless, it is possible to avoid the loss of $\varepsilon^{|\alpha|}$ with more work by using more clever higher order modulated functionals. We set $N_1^\varepsilon \equiv N^\varepsilon$ and, if $s \in \mathbb{N}$, $s \ge 2$, we define the following weighted norm, where $\alpha \in \mathbb{N}^d$ are multi-indices
Nsε (w) ≡ N ε (∂ α w) + K ||Re w||2H s−2 |α|≤s−1
=
1 2 f (|a ε |2 )(∂ α w, a ε )2 + K ε2 ||w||2H s−1 ε ||∇w||2H s−1 + 2 d 2 |α|≤s−1 R (35) + ||Re w||2H s−2 .
In this section, we shall use that $a^\varepsilon = a^0 + \varepsilon a^r$ with $a^0$ real-valued and
\[
\sup_{\varepsilon\in(0,1]}\|a^r\|_{L^\infty([0,T],W^{s,\infty})} \le C.
\]
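Note that this allows us to write
\[
\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)^2 \ge \frac12\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(a^0)^2\,|\mathrm{Re}\,\partial^\alpha w|^2 - C\varepsilon^2\|\mathrm{Re}\,\partial^\alpha w\|_{L^2}^2,
\]
and hence by choosing $K$ sufficiently large ($K > C$) we get the lower bound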
α 1 ε ε Ns (w) ≥ N ∂x w + f (|a ε |2 )(a 0 )2 |Re ∂ α w|2 d x. (36) 2 Rd |α|≤s−1
|α|≤s−1
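Note that we also have the equivalence of norms:
\[
\varepsilon^2\|w\|_{H^s}^2 \le 2N_s^\varepsilon(w), \qquad N_s^\varepsilon(w) \le C\big(|a^\varepsilon|_{W^{s-1,\infty}}\big)\big(\|w\|_{H^s}^2 + \|\mathrm{Re}\,w\|_{H^{s-2}}^2\big). \tag{37}
\]
The main result of this section is: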
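Theorem 3. Let $0 < T < \infty$, $s \in \mathbb{N}^*$, $f$ satisfying (A) and $w \in C^1([0,T],H^s)$ a solution of (21) with $u^\varepsilon : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ and $a^\varepsilon : [0,T]\times\mathbb{R}^d \to \mathbb{C}$ such that
\[
M \equiv \sup_{0<\varepsilon<1}\Big(\|u^\varepsilon\|_{L^\infty([0,T],W^{s+1,\infty}(\mathbb{R}^d))} + \|a^\varepsilon\|_{L^\infty([0,T],W^{s,\infty}(\mathbb{R}^d))}\Big) < +\infty.
\]
Assume finally that, for some $a^0 \in L^\infty([0,T],W^{s,\infty}(\mathbb{R}^d))$ real-valued, $a^\varepsilon$ verifies
\[
a^\varepsilon = a^0 + O_{W^{s,\infty}}(\varepsilon) \tag{38}
\]
uniformly on $[0,T]$. Then, there exists $C$, depending only on $d$, $f$ and $M$, such that
\[
\frac{d}{dt}N_s^\varepsilon(w(t)) \le C\Big(1 + \frac{1}{\varepsilon}\|R_a(t)\|_{L^\infty} + \frac{1}{\varepsilon^2}\|R_\varphi(t)\|_{W^{s-1,\infty}}\Big)N_s^\varepsilon(w(t)) + C\|F^\varepsilon(t)\|_{H^s}^2 + \frac{C}{\varepsilon^2}\|\mathrm{Im}\,F^\varepsilon(t)\|_{H^{s-1}}^2.
\]

Remark 2. In view of (38), $a^\varepsilon$ is real up to $O(\varepsilon)$, hence, in the integral in the right-hand side of (23), the real and imaginary parts of $F^\varepsilon$ do not play the same role. This explains why the estimate is better for $\mathrm{Re}\,F^\varepsilon$ than for $\mathrm{Im}\,F^\varepsilon$. As a matter of fact, for $s = 1$, Theorem 3 follows immediately from Lemma 1 and (38).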
2.3. Proof of Theorem 3. We estimate separately the two terms in $N_s^\varepsilon(w)$, when $s \ge 2$ (otherwise, the result follows from Lemma 1 as we have seen). Let us set
(w) ≡ ||Re w||2H s−2 . Note that we have
(w) ≤ Nsε (w).
(39)
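In the proof, $C$ is a constant depending only on $d$, $f$ and $M$. We shall first prove that the quantity $\|\mathrm{Re}\,w\|_{H^{s-2}}^2$ introduced above satisfies
\[
\frac{d}{dt}\|\mathrm{Re}\,w\|_{H^{s-2}}^2 \le C\Big(1 + \frac{1}{\varepsilon^2}\|R_\varphi\|_{W^{s-2,\infty}}\Big)N_s^\varepsilon(w) + C\|F^\varepsilon\|_{H^{s-2}}^2 + \frac{C}{\varepsilon^2}\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-2}}^2. \tag{40}
\]
For $\alpha \in \mathbb{N}^d$, we have
\[
\partial_t\partial^\alpha w + u^\varepsilon\cdot\nabla\partial^\alpha w = \frac{i\varepsilon}{2}\Delta(\partial^\alpha w) - \frac{i}{\varepsilon}\partial^\alpha F^\varepsilon - \frac{i}{\varepsilon}\partial^\alpha\big(R_\varphi w\big) - \frac{2i}{\varepsilon}\partial^\alpha\big(f'(|a^\varepsilon|^2)(a^\varepsilon,w)a^\varepsilon\big) - \big[\partial^\alpha, u^\varepsilon\cdot\nabla\big]w - \frac12\partial^\alpha\big(w\,\nabla\cdot u^\varepsilon\big). \tag{41}
\]
Next, by taking the real part of (41), we get
\[
\partial_t\partial^\alpha\mathrm{Re}\,w + u^\varepsilon\cdot\nabla\partial^\alpha\mathrm{Re}\,w = -\big[\partial^\alpha, u^\varepsilon\cdot\nabla\big]\mathrm{Re}\,w - \frac12\partial^\alpha\big(\mathrm{Re}\,w\,\nabla\cdot u^\varepsilon\big) + \mathcal{R}^\varepsilon,
\]
where
\[
\mathcal{R}^\varepsilon = \mathrm{Re}\Big(\frac{i\varepsilon}{2}\Delta(\partial^\alpha w) - \frac{i}{\varepsilon}\partial^\alpha F^\varepsilon - \frac{i}{\varepsilon}\partial^\alpha\big(R_\varphi w\big) - \frac{2i}{\varepsilon}\partial^\alpha\big(f'(|a^\varepsilon|^2)(a^\varepsilon,w)a^\varepsilon\big)\Big). \tag{42}
\]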
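By using (38), we have $\mathrm{Im}\,\partial^\gamma a^\varepsilon = O(\varepsilon)$, $\forall\gamma$, $|\gamma| \le |\alpha|$ and
\[
|(\partial^\beta a^\varepsilon, \partial^\gamma w)| \le C_{\beta,\gamma}\big(|\mathrm{Re}\,\partial^\gamma w| + \varepsilon|\partial^\gamma w|\big) \tag{43}
\]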
for every β, γ . Consequently, we immediately obtain for every α, |α| ≤ s − 2, ||Rϕ ||W s−1,∞ 1 + ||Im F ε || H s−2 ||w|| +||Re w|| +ε||w|| ||Rε || L 2 ≤ C ε||w|| H s + s−2 s−2 s−2 H H H 2 ε ε 1 ||Rϕ ||W s−2,∞ 1 Nsε (w) 2 + ||Im F ε || H s−2 . ≤ C 1+ ε2 ε Consequently, the standard L 2 energy estimate for (42) gives ||Rϕ ||W s−1,∞ 1 d Nsε (w) + 2 ||ImF ε ||2H s−2 . ||Re ∂ α w||2L 2 ≤ C 1 + 2 dt ε ε
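Note that we have used that
\[
\int_{\mathbb{R}^d}\big(u^\varepsilon\cdot\nabla\partial^\alpha\mathrm{Re}\,w\big)\,\partial^\alpha\mathrm{Re}\,w = -\frac12\int_{\mathbb{R}^d}(\nabla\cdot u^\varepsilon)\,|\partial^\alpha\mathrm{Re}\,w|^2.
\]
Consequently, (40) is proven. The next step is to estimate $N^\varepsilon(\partial^\alpha w)$ for $|\alpha| \le s-1$. By applying $\partial^\alpha$ to (21), we get
\[
i\varepsilon\frac{\partial(\partial^\alpha w)}{\partial t} + L^\varepsilon\partial^\alpha w = R_\varphi\,\partial^\alpha w + \tilde F^\varepsilon, \tag{44}
\]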
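where $\tilde F^\varepsilon \equiv C^\alpha + D^\alpha + \partial^\alpha F^\varepsilon + [\partial^\alpha, R_\varphi]w$, with
\[
C^\alpha \equiv 2\,\partial^\alpha\big(f'(|a^\varepsilon|^2)a^\varepsilon(w,a^\varepsilon)\big) - 2f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)a^\varepsilon,
\]
\[
D^\alpha \equiv -i\varepsilon\big[\partial^\alpha, u^\varepsilon\cdot\nabla\big]w - \frac{i\varepsilon}{2}\big[\partial^\alpha, \nabla\cdot u^\varepsilon\big]w.
\]
To estimate $N^\varepsilon(\partial^\alpha w)$, we shall use Lemma 1. Towards this, we need to estimate the commutators in the right-hand side of (44). For $|\alpha| \le s-1$, the following estimates hold for $C^\alpha$ and $D^\alpha$:
\[
\|[\partial^\alpha, R_\varphi]w\|_{H^1}^2 \le C\|R_\varphi\|_{W^{s,\infty}}^2\|w\|_{H^s}^2 \le \frac{C}{\varepsilon^2}\|R_\varphi\|_{W^{s,\infty}}^2N_s^\varepsilon(w), \tag{45}
\]
\[
\|D^\alpha\|_{H^1}^2 \le C\varepsilon^2\|w\|_{H^s}^2 \le CN_s^\varepsilon(w), \tag{46}
\]
\[
\big\|\big(if'(|a^\varepsilon|^2)^{1/2}a^\varepsilon, D^\alpha\big)\big\|_{L^2}^2 \le C\varepsilon^2N_s^\varepsilon(w), \tag{47}
\]
\[
\|C^\alpha\|_{H^1}^2 \le CN_s^\varepsilon(w), \tag{48}
\]
\[
\|(ia^\varepsilon, C^\alpha)\|_{L^2}^2 \le C\varepsilon^2N_s^\varepsilon(w). \tag{49}
\]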
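The estimates (45) and (46) follow easily from (37). For (47), we note that
\[
\begin{aligned}
\frac{1}{\varepsilon}\big(ia^\varepsilon, D^\alpha\big) &= -\big(a^\varepsilon, [\partial^\alpha, u^\varepsilon\cdot\nabla]w\big) - \frac12\big(a^\varepsilon, [\partial^\alpha, \nabla\cdot u^\varepsilon]w\big) \\
&= -\sum_{\gamma<\alpha}\binom{\alpha}{\gamma}\,\partial^{\alpha-\gamma}u^\varepsilon\cdot\big(a^\varepsilon, \nabla\partial^\gamma w\big) - \frac12\sum_{\gamma<\alpha}\binom{\alpha}{\gamma}\,\partial^{\alpha-\gamma}\big(\nabla\cdot u^\varepsilon\big)\big(a^\varepsilon, \partial^\gamma w\big)
\end{aligned}
\]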
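since $u^\varepsilon$ is real. Next, we can use (38) and (43) again. In particular, in the above expansion, the terms $(a^\varepsilon, \partial^\gamma w)$ are bounded in $L^2$ by $\|\mathrm{Re}\,w\|_{H^{s-2}}^2 + \varepsilon^2\|w\|_{H^{s-2}}^2$ and thus by $N_s^\varepsilon(w)$. Similarly, the terms $(a^\varepsilon, \nabla\partial^\gamma w)$ are bounded in $L^2$ by $N_s^\varepsilon(w)$ if $|\gamma| \le s-3$. Consequently, we get
\[
\big\|\big(if'(|a^\varepsilon|^2)^{1/2}a^\varepsilon, D^\alpha\big)\big\|_{L^2}^2 \le C\Big(\sum_{|\beta|=s-1}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\beta w,a^\varepsilon)^2 + N_s^\varepsilon(w)\Big) \le CN_s^\varepsilon(w),
\]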
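which yields (47). Next, we turn to $C^\alpha$. The Leibniz formula gives
\[
C^\alpha = \sum_{\tilde\alpha<\alpha,\ \tilde\alpha+\beta+\lambda+\mu=\alpha} *\;\partial^\lambda\big(f'(|a^\varepsilon|^2)\big)\,\big(\partial^{\tilde\alpha}w, \partial^\beta a^\varepsilon\big)\,\partial^\mu a^\varepsilon, \tag{50}
\]
where $*$ is a real coefficient depending only on $\tilde\alpha$, $\beta$, $\lambda$ and $\mu$. Since $|\tilde\alpha| \le |\alpha|-1 \le s-2$, we can use again (38) through (43) to get that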
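\[
\|C^\alpha\|_{L^2}^2 \le C\big(\|\mathrm{Re}\,w\|_{H^{s-2}}^2 + \varepsilon^2\|w\|_{H^s}^2\big) \le CN_s^\varepsilon(w).
\]
Since $(ia^\varepsilon, \partial^\mu a^\varepsilon) = O(\varepsilon)$ thanks to (38), we also get (49). For the $H^1$ norm, the same argument yields
\[
\|C^\alpha\|_{H^1}^2 \le C\Big(\|\mathrm{Re}\,w\|_{H^{s-2}}^2 + \varepsilon^2\|w\|_{H^s}^2 + \sum_{|\gamma|=s-1,\ |\beta+\lambda+\mu|=1}\int_{\mathbb{R}^d}\big|\partial^\lambda\big(f'(|a^\varepsilon|^2)\big)\big(\partial^\gamma w, \partial^\beta a^\varepsilon\big)\partial^\mu a^\varepsilon\big|^2\Big).
\]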
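To estimate the last sum, we first consider the terms with $\beta = 0$. They are always bounded by
\[
C\int_{\mathbb{R}^d}\big(f'(|a^\varepsilon|^2) + |a^\varepsilon|^2\,|f''(|a^\varepsilon|^2)|\big)\big(\partial^\gamma w, a^\varepsilon\big)^2
\]
with $|\gamma| = s-1$ and hence, thanks to (32), they are bounded by
\[
C\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)\big(\partial^\gamma w, a^\varepsilon\big)^2,
\]
and hence by $N_s^\varepsilon(w)$. Next, we consider the terms with $|\beta| = 1$. Since then $\lambda = \mu = 0$, we have to estimate terms like
\[
T = \int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)\big(\partial^\gamma w, \partial^\beta a^\varepsilon\big)^2\,|a^\varepsilon|^2.
\]
By using again (38) and (43), we get
\[
T \le C\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)\,|a^0|^2\,|\mathrm{Re}\,\partial^\gamma w|^2 + C\varepsilon^2\|w\|_{H^{s-1}}^2,
\]
and hence, by using (36), we finally obtain $T \le CN_s^\varepsilon(w)$. Consequently, (48) is proven. This ends the estimates of the commutators. We are now able to establish:
\[
\frac{d}{dt}N^\varepsilon(\partial^\alpha w) \le C\Big(1 + \frac{1}{\varepsilon^2}\|R_\varphi\|_{W^{s-1,\infty}} + \frac{1}{\varepsilon}\|R_a\|_{L^\infty}\Big)N_s^\varepsilon(w) + \|F^\varepsilon\|_{H^s}^2 + \frac{C}{\varepsilon^2}\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-1}}^2. \tag{51}
\]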
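Indeed, from Lemma 1, we deduce
\[
\begin{aligned}
\frac{d}{dt}N^\varepsilon(\partial^\alpha w) \le{}& C\Big(1 + \frac{1}{\varepsilon}\|R_\varphi\|_{W^{1,\infty}} + \frac{1}{\varepsilon}\|R_a\|_{L^\infty} + \frac{1}{\varepsilon^2}\|R_\varphi\|_{L^\infty}\Big)N^\varepsilon(\partial^\alpha w) \\
&+ \|\tilde F^\varepsilon\|_{L^2}^2 + \frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, \tilde F^\varepsilon\big) - \int_{\mathbb{R}^d}\big(i\varepsilon\Delta\partial^\alpha w, \tilde F^\varepsilon\big).
\end{aligned} \tag{52}
\]
To estimate the right-hand side of (52), we first estimate $\|\tilde F^\varepsilon\|_{L^2}^2$. Combining (45) and (46) with (48), we infer
\[
\|\tilde F^\varepsilon\|_{L^2}^2 \le \|F^\varepsilon\|_{H^{s-1}}^2 + C\Big(1 + \frac{1}{\varepsilon^2}\|R_\varphi\|_{W^{s-1,\infty}}^2\Big)N_s^\varepsilon(w). \tag{53}
\]
Next, we turn to the term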
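\[
\frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, \tilde F^\varepsilon\big) = \frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, C^\alpha + D^\alpha + \partial^\alpha F^\varepsilon + [\partial^\alpha, R_\varphi]w\big),
\]
which splits as four integrals. For the first one, by (49) and Cauchy-Schwarz:
\[
\frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, C^\alpha\big) \le C\Big(\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)\big(\partial^\alpha w,a^\varepsilon\big)^2\Big)^{1/2}N_s^\varepsilon(w)^{1/2} \le CN_s^\varepsilon(w).
\]
For the second one, we use (47) and Cauchy-Schwarz, which gives
\[
\frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)^{1/2}(\partial^\alpha w,a^\varepsilon)\big(if'(|a^\varepsilon|^2)^{1/2}a^\varepsilon, D^\alpha\big) \le CN_s^\varepsilon(w).
\]
For the third integral, we simply write, using once again (38),
\[
\frac{1}{\varepsilon}\|(ia^\varepsilon, \partial^\alpha F^\varepsilon)\|_{L^2} \le C\|F^\varepsilon\|_{H^{s-1}} + \frac{C}{\varepsilon}\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-1}},
\]
which yields by Cauchy-Schwarz
\[
\frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, \partial^\alpha F^\varepsilon\big) \le CN_s^\varepsilon(w) + C\|F^\varepsilon\|_{H^{s-1}}^2 + \frac{C}{\varepsilon^2}\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-1}}^2.
\]
Finally, for the fourth integral, we have by (45),
\[
\frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, [\partial^\alpha, R_\varphi]w\big) \le \frac{C}{\varepsilon}\|R_\varphi\|_{W^{s-1,\infty}}N_s^\varepsilon(w).
\]
By summing these estimates, we find
\[
\frac{4}{\varepsilon}\int_{\mathbb{R}^d}f'(|a^\varepsilon|^2)(\partial^\alpha w,a^\varepsilon)\big(ia^\varepsilon, \tilde F^\varepsilon\big) \le C\Big(1 + \frac{1}{\varepsilon}\|R_\varphi\|_{W^{s-1,\infty}}\Big)N_s^\varepsilon(w) + C\|F^\varepsilon\|_{H^{s-1}}^2 + \frac{C}{\varepsilon^2}\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-1}}^2. \tag{54}
\]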
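Finally, we handle the term
\[
-\int_{\mathbb{R}^d}\big(i\varepsilon\Delta\partial^\alpha w, \tilde F^\varepsilon\big) = -\int_{\mathbb{R}^d}\big(i\varepsilon\Delta\partial^\alpha w, C^\alpha + D^\alpha + \partial^\alpha F^\varepsilon + [\partial^\alpha, R_\varphi]w\big).
\]
By using an integration by parts, we have
\[
\begin{aligned}
-\int_{\mathbb{R}^d}\big(i\varepsilon\Delta\partial^\alpha w, \tilde F^\varepsilon\big) &\le \|C^\alpha\|_{H^1}^2 + \|D^\alpha\|_{H^1}^2 + \|[\partial^\alpha, R_\varphi]w\|_{H^1}^2 + \|F^\varepsilon\|_{H^s}^2 + CN_s^\varepsilon(w) \\
&\le \|F^\varepsilon\|_{H^s}^2 + C\Big(1 + \frac{1}{\varepsilon^2}\|R_\varphi\|_{W^{s-1,\infty}}\Big)N_s^\varepsilon(w)
\end{aligned}
\]
thanks to (45), (46) and (48). Consequently, we can collect the last estimate and (52), (53), (54) to get (51). This ends the proof of Theorem 3.

3. Construction of WKB Expansions

In this section, we construct an approximate solution of (1) using a WKB expansion. The first step is to prove the local existence of smooth solutions of the limit hydrodynamical system.

3.1. Well-posedness of the limit system. We consider the system
\[
\begin{cases}
\partial_t a + u\cdot\nabla a + \dfrac12\, a\,\nabla\cdot u = 0 \\[4pt]
\partial_t u + u\cdot\nabla u + \nabla\big(f(a^2)\big) = 0,
\end{cases} \tag{55}
\]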
which is only weakly hyperbolic, with the pressure law $f$ satisfying assumption (A) and the initial condition $(a,u)|_{t=0} = (a_0,u_0)$.

Theorem 4. Assume that $f$ satisfies (A) and let $s > 2 + d/2$. Then, for every initial condition $(a_0,u_0) \in H^s\times H^s$ with $a_0$ real-valued, there exists $T > 0$ and a unique solution $(a,u)$ of (55) such that $(a,u) \in C([0,T],H^{s-1}\times H^s)\cap C^1([0,T],H^{s-2}\times H^{s-1})$.

Let us remark that if $n = 1$, then $f'(0) > 0$ and thus $f' > 0$ in $[0,+\infty)$ (by (A)). In this case, (55) is symmetrizable (with the symmetrizer $S = \mathrm{diag}\big(1, \frac{1}{4f'(a^2)}, \dots, \frac{1}{4f'(a^2)}\big)$ used in [9]) and the local existence and uniqueness for (55) follows easily.

Proof of Theorem 4. The first step is to rewrite the system by using more convenient unknowns. At first, we notice that thanks to (A), we can write $f$ under the form $f(\rho) = \rho^n\tilde f(\rho)$, with $\tilde f$ smooth on $[0,+\infty)$ and such that $\tilde f(0) \neq 0$. Next, since we have by assumption $f(0) = 0$ and $f'(\rho) > 0$ for $\rho \neq 0$, we also have that $f(\rho) > 0$ for $\rho > 0$. This implies that $\tilde f(\rho) > 0$ for $\rho \ge 0$. This allows us to define a smooth function $h$ on $\mathbb{R}$ by
\[
h(a) \equiv a\big(\tilde f(a^2)\big)^{\frac{1}{2n}}. \tag{56}
\]
Note that $h(a) \neq 0$ for $a \neq 0$. It is useful to notice that we can also write $h$ under the form
\[
h(a) = \mathrm{sgn}(a)\,f(a^2)^{\frac{1}{2n}},
\]
and hence that we have
\[
h(a)^{2n} = f(a^2), \quad a \in \mathbb{R}.
\]
Furthermore, since $f' > 0$ in $(0,+\infty)$ and $\tilde f(0) > 0$, we deduce that $h'(a) > 0$ for $a \neq 0$ and that $h'(0) = \tilde f(0)^{\frac{1}{2n}} > 0$, so that $h' > 0$ on $\mathbb{R}$. Thus $h$ is a smooth diffeomorphism from $\mathbb{R}$ to $h(\mathbb{R})$. In particular, this allows us to define a smooth positive function $c$ on $h(\mathbb{R})$ such that
\[
\frac12\, a\,h'(a) = h(a)\,c\big(h(a)\big), \quad \forall a \in \mathbb{R}.
\]
With this definition, $(h,u)$, with $h \equiv h(a)$, solves the system
\[
\begin{cases}
\partial_t h + u\cdot\nabla h + h\,c(h)\,\nabla\cdot u = 0 \\[4pt]
\partial_t u + u\cdot\nabla u + \nabla\big(h^{2n}\big) = 0.
\end{cases} \tag{57}
\]
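The passage from (55) to (57) is a chain-rule computation. The sketch below spells it out and then checks it on the homogeneous pressure law $f(\rho) = \rho^n$, which is chosen here purely as an illustration and is not singled out at this point of the text.

```latex
\begin{align*}
\partial_t h(a) + u\cdot\nabla h(a)
  &= h'(a)\big(\partial_t a + u\cdot\nabla a\big)
   = -\tfrac12\, a\,h'(a)\,\nabla\cdot u
   = -h(a)\,c\big(h(a)\big)\,\nabla\cdot u, \\
\nabla\big(f(a^2)\big) &= \nabla\big(h(a)^{2n}\big). \\[6pt]
\text{Illustration: } f(\rho)=\rho^n
  &\;\Longrightarrow\; \tilde f\equiv 1,\quad h(a)=a,\quad
  \tfrac12\,a = a\,c(a)\;\Longrightarrow\; c\equiv\tfrac12 .
\end{align*}
```

In this illustrative case (57) reduces to (55) itself, with $\nabla(h^{2n}) = \nabla(a^{2n}) = \nabla f(a^2)$, as expected.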
Since a is in H s if and only if h is in H s , we shall prove local existence of a smooth solution for the weakly hyperbolic system (57). As we shall see below, the nonlinear symmetrization method of [15] does not allow to reduce (57) to a symmetric or symmetrizable system with smooth coefficients except in the case where c(h) = c(h ˜ n ) for some smooth map c. ˜ Nevertheless, it will be still possible to use the same idea to prove the existence of an energy estimate with loss for the system (57). When we are in such a situation, the simplest way to construct a solution is to use the vanishing viscosity method. Indeed, this approximation method allows to preserve the nonlinear energy estimate verified by (57). We thus consider for > 0 the system ⎧ ⎪ ⎨ ∂t h + u · ∇h + h c(h )∇ · u = h (58)
⎪ ⎩ ∂t u + u · ∇u + ∇ h 2n = u . The local existence of smooth solutions for this parabolic system is very easy to obtain. Moreover, we note that h remains nonnegative if the initial datum (h )|t=0 is nonnegative. In the following, we shall only prove an H s energy estimate independent of for this system which ensures that the solution remains smooth on an interval of time independent of . The final step which consists in using the uniform bounds to pass to the limit when goes to zero to get a solution of (57) is very classical and hence will not be detailled. In the proof of the energy estimates, we shall omit the subscript for notational convenience. 1 As in the work of [15], we introduce the unknown H ≡ h n = a n f˜(a 2 ) 2 . Note that by definition of h, H is in H s as soon as a is in H s . We get for (H, u) the system ⎧
⎪ ⎨ ∂t H + u · ∇ H + n H c(h) ∇ · u = nh n−1 h = H − n(n − 1)h n−2 |∇h|2 ⎪ ⎩ ∂ u + u · ∇u + 2H ∇ H = u. t (59)
Note that it does not seem possible to get a classical hyperbolic symmetric system (in the case of zero viscosity) involving only $H$ and $u$ as in the case of homogeneous pressure laws considered in [15]. Indeed, the coefficient $c(h) = c(H^{1/n})$ is not (in general) a smooth function of $H$. Nevertheless, it will be possible to prove that the system with unknowns $(h,H,u)$, though only weakly hyperbolic (at zero viscosity), satisfies an energy estimate. We notice that the symmetrizer
\[
S \equiv \mathrm{diag}\Big(1, \frac{n}{2}\,c(h)\,I_d\Big),
\]
which is positive since $c(h)$ is positive, symmetrizes the first order part of (59). We shall first perform an $H^s$ energy estimate ($s > 2 + d/2$) on (59) but we have to track carefully the dependence on $h$ in the energy estimates. To prove our $H^s$ energy estimate, we shall make extensive use of the following classical (see [18] for example) tame estimates
\[
\|fg\|_{H^k} \le C_k\big(\|f\|_{L^\infty}\|g\|_{H^k} + \|f\|_{H^k}\|g\|_{L^\infty}\big), \tag{60}
\]
\[
\|\partial^\alpha(fg) - f\partial^\alpha g\|_{L^2} \le C_k\big(\|f\|_{H^k}\|g\|_{L^\infty} + \|\nabla f\|_{L^\infty}\|g\|_{H^{k-1}}\big), \quad |\alpha| \le k, \tag{61}
\]
\[
\|F(u)\|_{H^k} \le C(\|u\|_{L^\infty})\big(1 + \|u\|_{H^k}\big) \tag{62}
\]
if F is smooth and such that F(0) = 0. At first, we notice that (∂ α H, ∂ α u) for |α| ≤ s solves the system ⎧ α ∂ ∂ H + u · ∇∂ α H + nc(h) (∇ · u) ∂ α H = ∂ α H − n(n − 1)∂ α (h n−2 |∇h|2 ) ⎪ ⎨ t − [∂ α , u] · ∇ H − n[∂ α , H c(h)]∇ · u ⎪ ⎩ α ∂t ∂ u + u · ∇∂ α u + 2H ∇∂ α H = ∂ α u − [∂ α , u] · ∇u − [∂ α , 2H ]∇ H. By using (61) to estimate in L 2 the commutators in the right hand-side, we get in a classical way by integration by parts 1 n n |∂ α H |2 + c(h)|∂ α u|2 + |∇∂ α H |2 + c(h)|∇∂ α u|2 d 2 Rd 2 2 R 2 α α α ≤ C0 || (h, u) ||W 1,∞ ||V || H s + C + D + R ,
d dt
(63)
where V ≡ (H, u), C0 is a non-decreasing function depending only on f , s and d, and α
C ≡ −n
d R
(∂ α H ) [∂ α , H c(h)](∇ · u),
n c (h) (∇h · ∇)∂ α u · ∂ α u − n(n − 1) ∂ α h n−2 |∇h|2 ∂ α H, 2 Rd Rd n Rα ≡ c (h)∂t h|∂ α u|2 . 4 Rd
Dα ≡ −
We have singled out the three terms above since they are the ones involving h which must be estimated with care. Note that the estimate of C α will be crucial since this
term involves high order derivatives of h. Next, we can integrate (63) in time, sum the 1 estimates for |α| ≤ s and use that c(h) > 0, hence nc(h)/2 ≥ C1 (||h|| to obtain L∞ ) t ||∇V (τ )||2H s dτ ||V (t)||2H s + 0 t 2 ≤ C1 (||h|| L ∞ ) ||V (0)|| H s + C0 ||(h, u)(τ )||W 1,∞ ||V (τ )||2H s + C(τ ) + D(τ ) 0 + R(τ ) dτ , (64) with C≡
Cα , D ≡
|α|≤s
|α|≤s
Dα , R ≡
Rα .
|α|≤s
Estimate for $\mathcal{C}$. We claim that
\[
\mathcal{C} \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\big(\|V\|_{H^s}^2 + \|h\|_{H^{s-1}}^2\big). \tag{65}
\]
The crucial point is that this estimate only involves the $H^{s-1}$ norm of $h$. This will allow us to conclude by using that for the first equation in (59), the $H^{s-1}$ norm of $h$ is controlled by the $H^s$ norm of $u$. By using the commutator estimate (61), we have
\[
\begin{aligned}
\mathcal{C} &\le C\|H\|_{H^s}\big(\|Hc(h)\|_{H^s}\|\nabla\cdot u\|_{L^\infty} + \|\nabla(Hc(h))\|_{L^\infty}\|\nabla\cdot u\|_{H^{s-1}}\big) \\
&\le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\big(\|V\|_{H^s}^2 + \|H\|_{H^s}\|Hc(h)\|_{H^s}\big).
\end{aligned}
\]
To estimate the last term, we use that $H = h^n$, which yields $h\,\partial_iH = nH\,\partial_ih$, thus
\[
\partial_i\big(Hc(h)\big) = c(h)\,\partial_iH + c'(h)H\,\partial_ih = c(h)\,\partial_iH + \frac{1}{n}\,c'(h)\,h\,\partial_iH.
\]
Consequently, by (60), (62), we get
\[
\|Hc(h)\|_{H^s} \le C\|c(h)\nabla H\|_{H^{s-1}} + C\|c'(h)h\nabla H\|_{H^{s-1}} \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\big(\|H\|_{H^s} + \|h\|_{H^{s-1}}\big),
\]
and (65) follows.
1 C1 (||h|| L ∞ ) D ≤ ||∇V ||2H s + C0 (||h||W 1,∞ ) ||V ||2H s + ||∇h||2H s−1 . (66) 2 We have, on the one hand, α α c (h)∇h · ∇∂ u · ∂ u ≤ C0 (||h||W 1,∞ )||∇u|| H s ||u|| H s Rd
≤ C0 (||h||W 1,∞ )||∇V || H s ||V || H s .
On the other hand, for the second term (which vanishes if n = 1), after one integration by parts when |α| > 0, we get
α n−2 2 α ∂ h |∇h| ∂ H ≤ C||∇ H || H s ||h n−2 |∇h|2 || H s−1 n(n − 1) Rd
≤ C0 (||h||W 1,∞ ) ||∇ H || H s ||∇h|| H s−1 ,
and if α = 0, since H = h n and s ≥ 1, n−1 h n−2 |∇h|2 H = |∇ H |2 ≤ C||H ||2H s . n(n − 1) n Rd Rd Consequently, D ≤ C0 (||h||W 1,∞ ) ||∇V || H s ||V || H s + ||∇h|| H s−1 + C||V ||2H s , and (66) follows from the standard inequality, for a, b, θ > 0, ab ≤ θa 2 +
b2 4θ .
Estimate for R. We prove that C1 (||h|| L ∞ ) R ≤
1 ||∇V ||2H s + C0 (||(h, u)||W 1,∞ ) ||V ||2H s . 2
(67)
By using the first equation in (58) for h and an integration by parts, we find, as for the first term in D, n α 2 R ≤ C0 (||(h, u)||W 1,∞ ) ||V || H s + c (h)h|∂ α u|2 4 Rd n ≤ C0 (||(h, u)||W 1,∞ ) ||V ||2H s − c (h) (∇h · ∇)∂ α u · ∂ α u 4 Rd n − c (g)|∇h|2 |∂ α u|2 4 Rd
≤ C0 (||(h, u)||W 1,∞ ) ||V ||2H s + ||∇V || H s ||V || H s . 2
Then, (67) follows as above from the inequality $ab \le \theta a^2 + \frac{b^2}{4\theta}$. Summing (65), (66) and (67), inserting this into (64) and cancelling the terms involving $\|\nabla V\|_{H^s}^2$, we infer
||V (t)||2H s ≤ C1 (||h(t)|| L ∞ ) ||V (0)||2H s t + C0 (||(h, u)(τ )||W 1,∞ ) ||V (τ )||2H s + ||h(τ )||2H s−1 + ||∇h(τ )||2H s−1 dτ . (68) 0
t To close the estimate, it remains to evaluate ||h||2H s−1 and 0 ||∇h||2H s−1 . We use the standard H s−1 estimate for the convection diffusion equation (58) which yields, as for (63), for |α| ≤ s − 1,
d 1 |∂ α h|2 + |∂ α h|2 ≤ C0 || (h, u) ||W 1,∞ ||h||2H s−1 + ||h|| H s−1 ||u|| H s . dt 2 Rd Rd
Summing for |α| ≤ s − 1 and integrating in time, this yields t 1 ||∇h(τ )||2H s−1 dτ ||h(t)||2H s−1 + 2 0 t
1 ≤ ||h(0)||2H s−1 + C0 || (h, u)(τ ) ||W 1,∞ ||V (τ )||2H s + ||h(τ )||2H s−1 dτ. 2 0
(69)
Finally, we can combine (68) and (69), to get
\[
\|V(t)\|_{H^s}^2 + \|h(t)\|_{H^{s-1}}^2 \le C_0\big(\|(h,u)\|_{L^\infty([0,t],W^{1,\infty})}\big)\Big(\|V(0)\|_{H^s}^2 + \|h(0)\|_{H^{s-1}}^2 + \int_0^t\big(\|V(\tau)\|_{H^s}^2 + \|h(\tau)\|_{H^{s-1}}^2\big)\,d\tau\Big). \tag{70}
\]
Since $H^{s-1}$ is embedded in $W^{1,\infty}$ for $s > 2 + d/2$, we easily get by classical continuation arguments and the Gronwall lemma that the solution of (58) is defined on an interval of time $[0,T)$ independent of the viscosity parameter. Finally, (70) provides a uniform bound for $(h,H,u)$ in $H^{s-1}\times H^s\times H^s$, which allows us to prove in a classical way that the solutions of (58) converge towards a solution of (57). This ends the proof of the existence of a solution. To prove the uniqueness, it suffices to use the same method as above and perform an $L^2$ energy estimate on the system satisfied by $h_1 - h_2$, $u_1 - u_2$, $H_1 - H_2$. This is left to the reader.

3.2. WKB expansions. We now turn to the construction of WKB expansions up to arbitrary order. Let us first notice that in Theorem 4, if the initial datum $(a_0,u_0)$ is in $H^\infty\times H^\infty$, then the solution $(a,u)$ is in $C^0([0,T],H^{s-1}\times H^s)$ for every $s > 2 + d/2$, with $T$ independent of $s > 2 + d/2$. In other words, the existence time of the maximal solution in $H^\infty\times H^\infty$ is positive. This fact follows easily from (70) and the Gronwall inequality (since $H^{s-1} \subset W^{1,\infty}$).
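Lemma 2. Consider $\Psi_0^\varepsilon = a_0^\varepsilon e^{i\varphi_0^\varepsilon/\varepsilon}$ with $a_0^\varepsilon \in H^\infty$, $\varphi_0^\varepsilon \in H^\infty$ and assume that for some $m \in \mathbb{N}$, there exists an expansion
\[
a_0^\varepsilon = \sum_{k=0}^m\varepsilon^ka_0^k + \varepsilon^{m+1}\tilde a_0^\varepsilon, \qquad \varphi_0^\varepsilon = \sum_{k=0}^m\varepsilon^k\varphi_0^k + \varepsilon^{m+1}\tilde\varphi_0^\varepsilon \tag{71}
\]
with $a_0^0$ real-valued, $a_0^k, \varphi_0^k \in H^\infty$, and remainders $\tilde a_0^\varepsilon$, $\tilde\varphi_0^\varepsilon$ satisfying, for every $s$,
\[
\sup_{\varepsilon\in(0,1)}\big(\|\tilde a_0^\varepsilon\|_{H^s} + \|\tilde\varphi_0^\varepsilon\|_{H^s}\big) < +\infty. \tag{72}
\]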
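Let us denote $0 < T^* \le +\infty$ the existence time of the maximal smooth (i.e. $H^\infty\times H^\infty$) solution $(a^0,\varphi^0)$ for (55) with the initial condition $(a_0^0,\varphi_0^0)$. Then, there exists an approximate smooth solution of (1) on $[0,T^*)$ under the form $\Psi^a = a^\varepsilon e^{i\varphi^\varepsilon/\varepsilon}$, with $a^\varepsilon, \varphi^\varepsilon \in H^\infty$ and $a^\varepsilon$ complex-valued, solving
\[
\begin{cases}
\dfrac{\partial\varphi^\varepsilon}{\partial t} + f(|a^\varepsilon|^2) + \dfrac12|\nabla\varphi^\varepsilon|^2 = R_\varphi^m \\[6pt]
\dfrac{\partial a^\varepsilon}{\partial t} + \nabla\varphi^\varepsilon\cdot\nabla a^\varepsilon + \dfrac{a^\varepsilon}{2}\Delta\varphi^\varepsilon - i\dfrac{\varepsilon}{2}\Delta a^\varepsilon = R_a^m,
\end{cases} \tag{73}
\]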
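with the initial condition $(a^\varepsilon,\varphi^\varepsilon)_{/t=0} = (a_0^\varepsilon,\varphi_0^\varepsilon)$, and where, for every $s$ and $0 < T < T^*$,
\[
\sup_{[0,T]}\big(\|R_a^m\|_{H^s} + \|R_\varphi^m\|_{H^s}\big) \le C_{s,T}\,\varepsilon^{m+2}. \tag{74}
\]
Finally, for $0 < T < T^*$, $a^\varepsilon$ verifies (38): $a^\varepsilon - a^0 = O(\varepsilon)$ in $L^\infty([0,T],W^{s,\infty})$.

Note that $\Psi^a$ is indeed an approximate solution of (1) since
\[
i\varepsilon\frac{\partial\Psi^a}{\partial t} + \frac{\varepsilon^2}{2}\Delta\Psi^a - \Psi^af(|\Psi^a|^2) = \big(-i\varepsilon R_a^m + a^\varepsilon R_\varphi^m\big)\exp\Big(i\frac{\varphi^\varepsilon}{\varepsilon}\Big).
\]
By using the notation of Sect. 2, we have $R^\varepsilon = -i\varepsilon R_a^m + a^\varepsilon R_\varphi^m$, hence
\[
\sup_{[0,T]}\|R^\varepsilon\|_{H^s} \le C_s\,\varepsilon^{m+2}. \tag{75}
\]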
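Proof. As in [9], we look for expansions
\[
a^\varepsilon = \sum_{k=0}^m\varepsilon^ka^k + \varepsilon^{m+1}a^{m+1}, \qquad \varphi^\varepsilon = \sum_{k=0}^m\varepsilon^k\varphi^k + \varepsilon^{m+1}\varphi^{m+1}.
\]
This yields that $(a^0,\varphi^0)$ solves the nonlinear system
\[
\begin{cases}
\dfrac{\partial\varphi^0}{\partial t} + f(|a^0|^2) + \dfrac12|\nabla\varphi^0|^2 = 0 \\[6pt]
\dfrac{\partial a^0}{\partial t} + \nabla\varphi^0\cdot\nabla a^0 + \dfrac{a^0}{2}\Delta\varphi^0 = 0,
\end{cases} \tag{76}
\]
which is just (9), and that for $1 \le k \le m$, $(a^k,\varphi^k)$ solves the linear system
\[
\begin{cases}
\dfrac{\partial\varphi^k}{\partial t} + 2f'(|a^0|^2)(a^0,a^k) + \nabla\varphi^0\cdot\nabla\varphi^k = S_\varphi^k \\[6pt]
\dfrac{\partial a^k}{\partial t} + \nabla\varphi^0\cdot\nabla a^k + \nabla a^0\cdot\nabla\varphi^k + \dfrac{a^0}{2}\Delta\varphi^k + \dfrac{a^k}{2}\Delta\varphi^0 = S_a^k,
\end{cases} \tag{77}
\]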
where the source terms $(S_\varphi^k, S_a^k)$ depend only on $(a^j,\varphi^j)_{0\le j\le k-1}$, and $S_a^k$ is complex-valued.

We first solve (76) (that is (9)) with the initial condition $\varphi^0_{/t=0} = \varphi_0^0$, $a^0_{/t=0} = a_0^0$. By introducing $u^0 \equiv \nabla\varphi^0$ and by taking the gradient of the first equation of (76), we find
\[
\begin{cases}
\partial_ta^0 + u^0\cdot\nabla a^0 + \dfrac{a^0}{2}\nabla\cdot u^0 = 0 \\[6pt]
\partial_tu^0 + u^0\cdot\nabla u^0 + \nabla\big(f((a^0)^2)\big) = 0,
\end{cases} \tag{78}
\]
which is the compressible Euler type equation considered in the previous section. By using Theorem 4, we get the existence of a smooth solution $(a^0,u^0) \in H^{s-1}\times H^s$ for
every $s$ on $[0,T^*)$ (with $T^*$ independent of $s$), with $a^0$ real-valued. Finally, to get $\varphi^0$, it is natural to set
\[
\varphi^0(t,x) = \varphi_0^0(x) - \int_0^t\Big(f\big((a^0)^2\big) + \frac12|u^0|^2\Big)(\tau,x)\,d\tau,
\]
and the same argument as in [2] yields $u^0 = \nabla\varphi^0$.

We now turn to the resolution of (77). We solve it with the initial condition $(\varphi^k,a^k)_{/t=0} = (\varphi_0^k,a_0^k)$. By introducing again $u^k \equiv \nabla\varphi^k$, we can take the gradient in the first line of (77) to get
\[
\begin{cases}
\partial_ta^k + u^0\cdot\nabla a^k + \dfrac{a^0}{2}\nabla\cdot u^k + u^k\cdot\nabla a^0 + \dfrac{a^k}{2}\nabla\cdot u^0 = S_a^k, \\[6pt]
\partial_tu^k + u^0\cdot\nabla u^k + \nabla\big(a^0, f'((a^0)^2)a^k\big) + u^k\cdot\nabla u^0 = \nabla S_\varphi^k.
\end{cases} \tag{79}
\]
Again, since $f'\big((a^0)^2\big)$ can vanish, the symmetrization of this linear hyperbolic system requires some care. We thus set
\[
F^k(t,x) \equiv
\begin{cases}
\sqrt2\,f'\big((a^0)^2\big)^{\frac12}a^k & \text{if } n \text{ is odd} \\[6pt]
\sqrt2\,\dfrac{a^0}{\sqrt{(a^0)^2}}\,f'\big((a^0)^2\big)^{\frac12}a^k & \text{if } n \text{ is even.}
\end{cases}
\]
Note that in both cases, we have
\[
F^k(t,x) = \sqrt2\,g(a^0)\,a^k
\]
with $g$ smooth. Indeed, as we have seen, we can write $f'(\rho) = \rho^{n-1}q(\rho)$ with $q$ smooth and positive, and we have in both cases:
\[
g(a^0) = (a^0)^{n-1}\big(q((a^0)^2)\big)^{\frac12}. \tag{80}
\]
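The claim that both parities of $n$ lead to the same smooth function $g$ in (80) can be verified in two lines. The following sketch only uses $f'(\rho) = \rho^{n-1}q(\rho)$ with $q > 0$ and the fact that $a^0$ is real-valued.

```latex
\begin{align*}
f'\big((a^0)^2\big)^{1/2} &= |a^0|^{\,n-1}\, q\big((a^0)^2\big)^{1/2}, \\
n \text{ odd:}\quad & |a^0|^{\,n-1} = (a^0)^{n-1}
   \quad\text{since } n-1 \text{ is even}, \\
n \text{ even:}\quad & \frac{a^0}{|a^0|}\,|a^0|^{\,n-1} = a^0\,|a^0|^{\,n-2} = (a^0)^{n-1}
   \quad\text{since } n-2 \text{ is even}.
\end{align*}
```

In both cases the prefactor of $\sqrt2\,a^k$ is $(a^0)^{n-1}q((a^0)^2)^{1/2}$, a smooth function of $a^0$ even where $a^0$ vanishes.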
This is the natural generalization of the change of unknown used in [2]. Then, thanks to the equation on $a^0$, we get for $(F^k,u^k)$ the system
\[
\begin{cases}
\partial_tF^k + u^0\cdot\nabla F^k + \dfrac{1}{\sqrt2}\,a^0g(a^0)\,\nabla\cdot u^k + \sqrt2\,g(a^0)\,u^k\cdot\nabla a^0 + \dfrac{F^k}{2}\Big(1 + \dfrac{a^0g'(a^0)}{g(a^0)}\Big)\nabla\cdot u^0 = \sqrt2\,g(a^0)\,S_a^k \\[8pt]
\partial_tu^k + u^0\cdot\nabla u^k + \dfrac{1}{\sqrt2}\nabla\big(a^0g(a^0), F^k\big) + u^k\cdot\nabla u^0 = \nabla S_\varphi^k.
\end{cases}
\]
Note that the coefficient $\frac{a^0g'(a^0)}{g(a^0)}$ is smooth even when $a^0$ vanishes since $g$ is under the form (80). We have obtained a linear symmetric hyperbolic system with a zero order term and a source term $S^k$ depending only on $(a^j,\varphi^j)$ for $0 \le j < k$ under the form
\[
\partial_tU^k + \sum_{j=1}^dA_j(t,x)\,\partial_jU^k + L(t,x)\,U^k = S^k, \qquad U^k = \begin{pmatrix}F^k \\ u^k\end{pmatrix},
\]
where $A_j(t,x)$ are smooth, real and symmetric and the matrix $L$ is smooth. By the classical theory, there exists, on $[0,T^*)$, a smooth solution $(F^k,u^k)$ in $H^\infty\times H^\infty$ of this system. Once $u^k$ is built, we get $a^k$ by solving the transport equation for $a^k$ which is given by the first line of (79). Finally, we deduce the phase $\varphi^k$ by integrating in time the first line of (77). We obtain
\[
\varphi^k(t,x) = \varphi_0^k(x) - \int_0^t\Big(2f'(|a^0|^2)(a^0,a^k) + \nabla\varphi^0\cdot u^k - S_\varphi^k\Big)(\tau,x)\,d\tau.
\]
Finally, we choose in a similar way $(a^{m+1},\varphi^{m+1})$ that solve (77) with the initial condition $(a^{m+1},\varphi^{m+1})_{/t=0}$ given by the remainders $(\tilde a_0^\varepsilon,\tilde\varphi_0^\varepsilon)$ of order $\varepsilon^{m+1}$ in (71). Because of the assumption (72), we find that they are also uniformly bounded in $H^{s-1}\times H^s$ with respect to $\varepsilon$. This concludes the proof of Lemma 2.

4. Nonlinear Stability
In this section, we give the proof of Theorem 2. We shall actually prove directly a more precise version which states the existence of a WKB expansion to any order.

Theorem 5. Consider $\Psi_0^\varepsilon = a_0^\varepsilon e^{i\varphi_0^\varepsilon/\varepsilon}$ with $a_0^\varepsilon \in H^\infty$, $\varphi_0^\varepsilon \in H^\infty$ and assume that for some $m \in \mathbb{N}$, there exists an expansion (71) as in Lemma 2. We assume (A) and let $(a^\varepsilon,\varphi^\varepsilon)$ be the smooth approximate solution given by Lemma 2 which is smooth on $[0,T^*)$. Then,
• if $m = 0$, there exists $\varepsilon_0 > 0$ and $T \in (0,T^*)$ such that for every $\varepsilon \in (0,\varepsilon_0]$, the solution $\Psi^\varepsilon$ of (1) with initial data $\Psi_0^\varepsilon$ remains smooth on $[0,T]$ and satisfies for every $s \in \mathbb{N}$, the estimate
\[
\Big\|\Psi^\varepsilon\exp\Big(-\frac{i}{\varepsilon}\varphi^\varepsilon\Big) - a^\varepsilon\Big\|_{L^\infty([0,T],H^s)} \le C_s\,\varepsilon.
\]
• if $m \ge 1$, for every $T \in (0,T^*)$, there exists $\varepsilon_0(T) > 0$ such that for every $\varepsilon \in (0,\varepsilon_0(T)]$, the solution $\Psi^\varepsilon$ of (1) with initial data $\Psi_0^\varepsilon$ remains smooth on $[0,T]$ and satisfies for every $s \in \mathbb{N}$, the estimate
\[
\Big\|\Psi^\varepsilon\exp\Big(-\frac{i}{\varepsilon}\varphi^\varepsilon\Big) - a^\varepsilon\Big\|_{L^\infty([0,T],H^s)} \le C_{s,T}\,\varepsilon^{m+1}.
\]
Note that Theorem 2 is actually the special case $m = 0$ in Theorem 5.

Proof of Theorem 5. Let $s > d/2$. We take $(a^\varepsilon,\varphi^\varepsilon)$ the approximate solutions given by Lemma 2 and look for the solution of (1) under the form $\Psi^\varepsilon = (a^\varepsilon + w)e^{i\varphi^\varepsilon/\varepsilon}$. We get for $w$ Eq. (21) with $F^\varepsilon$ given by (22) and the initial condition $w_{/t=0} = 0$. For $s > d/2$, and every $\varepsilon > 0$, this semilinear equation is locally well-posed in $H^s$: we get very easily that there exists for some $T^\varepsilon > 0$ a unique maximal solution $w \in C([0,T^\varepsilon),H^s)$ of (21) (see [5] for example). We shall prove that $T^\varepsilon$ is bounded from below by some $T > 0$ if $m = 0$, and that $T^\varepsilon \ge T$ for every $T \in (0,T^*)$ for $\varepsilon$ sufficiently small if $m \ge 1$. Let us define
\[
\tau^\varepsilon \equiv \sup\big\{\tau \in (0,T^\varepsilon),\ \forall t \in [0,\tau],\ 2N_s^\varepsilon(w(t)) \le \varepsilon^{2m+4}\big\}.
\]
Note that $\tau^\varepsilon > 0$ since $w(0) = 0$ and that by Sobolev embedding, we have, for $t \le \tau^\varepsilon$,
\[
\|w(t)\|_{L^\infty}^2 \le K^2\varepsilon^{-2}N_s^\varepsilon(w(t)) \le K^2\varepsilon^{2m+2} \le K^2,
\]
for some $K$ independent of $\varepsilon$.
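We will apply Theorem 3 with $F^\varepsilon$ given by (22). To estimate $F^\varepsilon$, we use the following lemma:

Lemma 3. Let $R > 0$, $s > d/2$ and $w$ such that $\|w\|_{L^\infty} \le R$, and $F^\varepsilon$ given by (22). Then, for a constant $C$ depending only on $\|a^\varepsilon(t)\|_{W^{s+2,\infty}}$ and $R$, we have
\[
\|F^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-1}}^2 \le C\varepsilon^{2m+4} + C\varepsilon^{2m}N_s^\varepsilon(w) + C\Big(\frac{N_s^\varepsilon(w)}{\varepsilon^4} + \Big(\frac{N_s^\varepsilon(w)}{\varepsilon^4}\Big)^2\Big)N_s^\varepsilon(w).
\]
We postpone the proof of Lemma 3 to the end of the section. We can first easily end the proof of Theorem 5. Notice first that, by definition of $\Psi^a$, we have
\[
R_a = R_a^m + \frac{i\varepsilon}{2}\Delta a^\varepsilon = O_{H^k}(\varepsilon^{m+1}) + O_{H^k}(\varepsilon) = O_{H^k}(\varepsilon),
\]
for every $k$, uniformly for $0 \le t \le T$, hence
\[
\frac{1}{\varepsilon}\|R_a(t)\|_{W^{s-1,\infty}} \le C.
\]
Applying Theorem 3 and Lemma 3 with $R \equiv K$, we infer that for $0 \le t \le \tau^\varepsilon$,
\[
\frac{d}{dt}N_s^\varepsilon(w(t)) \le C\varepsilon^{2m+4} + C\varepsilon^{2m}N_s^\varepsilon(w(t)),
\]
which gives immediately, since $w_{/t=0} = 0$, that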
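\[
N_s^\varepsilon(w(t)) \le C\varepsilon^{2m+4}\big(e^{C\varepsilon^{2m}t} - 1\big) \le \frac12\,\varepsilon^{2m+4}
\]
in the following cases:
• for $m = 0$, $0 \le t \le T$ with $0 < T < T^*$ sufficiently small independent of $\varepsilon$,
• for $m \ge 1$, $T \in (0,T^*)$ is arbitrary, $0 \le t \le T$ and $\varepsilon \le \varepsilon_0(T)$ with $\varepsilon_0(T)$ sufficiently small.
As a consequence, $\tau^\varepsilon \ge T$ as desired and
\[
\|w\|_{L^\infty([0,T],H^s(\mathbb{R}^d))} \le C_{s,T}\,\varepsilon^{m+1}.
\]
It remains to prove Lemma 3.

Proof of Lemma 3. We recall that $F^\varepsilon$ is given by
\[
F^\varepsilon = R^\varepsilon + Q^\varepsilon(w) = R^\varepsilon + (a^\varepsilon + w)\big(f(|a^\varepsilon + w|^2) - f(|a^\varepsilon|^2)\big) - 2(w,a^\varepsilon)f'(|a^\varepsilon|^2)a^\varepsilon.
\]
As a first try, we could use the rough estimate
\[
Q^\varepsilon(w) = O(|w|^2) \quad \text{as } w \to 0,
\]
which would lead to
\[
\|Q^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\|\mathrm{Im}\,Q^\varepsilon\|_{H^{s-1}}^2 \le \frac{C}{\varepsilon^2}\|w\|_{H^s}^4 \le \frac{C}{\varepsilon^6}N_s^\varepsilon(w)^2,
\]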
which does not allow us to conclude in the proof of Theorem 5 for $m = 0$ and does not give the sharp result for the existence time if $m = 1$. To get the refined estimate of Lemma 3, the idea is then to use a Taylor expansion for $Q^\varepsilon$ with respect to $w$ up to second order, and write
\[
Q^\varepsilon(w) = |w|^2f'(|a^\varepsilon|^2)a^\varepsilon + 2f'(|a^\varepsilon|^2)(w,a^\varepsilon)w + 2a^\varepsilon f''(|a^\varepsilon|^2)(w,a^\varepsilon)^2 + G^\varepsilon(x,w),
\]
so that for fixed $x$, we have as $w \to 0$,
\[
G^\varepsilon(x,w) = O\big(|w|^3\big).
\]
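The second order expansion of $Q^\varepsilon$ can be recovered by expanding $f$ around $|a^\varepsilon|^2$. The sketch below writes $a = a^\varepsilon$ and $\delta = |a+w|^2 - |a|^2 = 2(w,a) + |w|^2$, abbreviates $f' = f'(|a|^2)$ and $f'' = f''(|a|^2)$, and collects every term of order three or higher in $w$ into $O(|w|^3)$.

```latex
\begin{align*}
f(|a|^2+\delta) - f(|a|^2)
  &= f'\,\delta + \tfrac12 f''\,\delta^2 + O(|\delta|^3)
   = 2f'\,(w,a) + f'\,|w|^2 + 2f''\,(w,a)^2 + O(|w|^3), \\
(a+w)\big(f(|a+w|^2) - f(|a|^2)\big)
  &= 2f'\,(w,a)\,a + f'\,|w|^2 a + 2f''\,(w,a)^2 a + 2f'\,(w,a)\,w + O(|w|^3), \\
Q^\varepsilon(w)
  &= |w|^2 f'\,a + 2f'\,(w,a)\,w + 2a f''\,(w,a)^2 + O(|w|^3).
\end{align*}
```

The last line is obtained by subtracting the linear term $2(w,a)f'a$, and the three quadratic terms are exactly those estimated separately in the proof.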
We turn now to estimate each term in F ε . Estimate for R ε = iε Ram − Rϕm a ε . Thanks to (75), we have || R ε ||2H s ≤ Cε2m+4 . Moreover, since Rϕm is real-valued and since, from (38), Im a ε = OW s,∞ (ε), we also have 1 || Im R ε ||2H s−1 ≤ Cε2m+4 ε2 thanks to (74). We have thus proven that || R ε ||2H s +
1 || Im R ε ||2H s−1 ≤ Cε2m+4 . ε2
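Estimate for $G^\varepsilon(x, w)$. The estimate relies on Lemma 5 in the Appendix. Indeed, it is clear from the Taylor formula that $G^\varepsilon$ may be written in the form
$$(\operatorname{Re} w)^2\, h_{11}(x, w(x)) + (\operatorname{Re} w)(\operatorname{Im} w)\, h_{12}(x, w(x)) + (\operatorname{Im} w)^2\, h_{22}(x, w(x)),$$
where $h_{11}, h_{12}, h_{22} : \mathbb{R}^d \times \mathbb{C} \to \mathbb{C}$ are of class $C^\infty$ and $\forall x \in \mathbb{R}^d$, $h_{11}(x, 0) = h_{12}(x, 0) = h_{22}(x, 0) = 0$. Moreover, $h_{11}$, $h_{12}$ and $h_{22}$ satisfy the hypotheses of Lemma 5 in the Appendix since $a^\varepsilon \in L^\infty([0, T], W^{s,\infty})$. As a consequence, if $\|w\|_{L^\infty} \le R$, then $\|G^\varepsilon\|_{H^s} \le C\|w\|_{H^s}^3$, which implies
$$\|G^\varepsilon(x, w(x))\|_{H^s}^2 + \frac{1}{\varepsilon^2}\|\operatorname{Im} G^\varepsilon(x, w(x))\|_{H^{s-1}}^2 \le \frac{2}{\varepsilon^2}\|G^\varepsilon(x, w(x))\|_{H^s}^2 \le \frac{C}{\varepsilon^8}\, N_s^\varepsilon(w)^3.$$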
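The estimate for the quadratic terms in $Q^\varepsilon(w)$ will rely crucially on the fact that $a^\varepsilon$ is real to first order and that $(w, a^\varepsilon)$ is estimated in $H^{s-1}$ by $N_s^\varepsilon(w)$ and not just by $\varepsilon^{-2} N_s^\varepsilon(w)$.

Estimate for $F_1^\varepsilon \equiv |w|^2 f'(|a^\varepsilon|^2)\, a^\varepsilon$. We have
$$\|F_1^\varepsilon\|_{H^s}^2 \le \frac{C}{\varepsilon^4}\, N_s^\varepsilon(w)^2,$$
and in view of (38), $\operatorname{Im} a^\varepsilon = O_{W^{s,\infty}}(\varepsilon)$, thus
$$\frac{1}{\varepsilon^2}\|\operatorname{Im} F_1^\varepsilon\|_{H^{s-1}}^2 \le C\,\| |w|^2 \|_{H^{s-1}}^2 \le \frac{C}{\varepsilon^4}\, N_s^\varepsilon(w)^2.$$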
Geometric Optics and Boundary Layers for Nonlinear-Schrödinger Equations
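Estimate for $F_2^\varepsilon \equiv 2 f'(|a^\varepsilon|^2)(w, a^\varepsilon)\, w$. We begin with the rough estimate
$$\|F_2^\varepsilon\|_{H^s}^2 \le \frac{C}{\varepsilon^4}\, N_s^\varepsilon(w)^2.$$
Moreover, one has
$$\|f'(|a^\varepsilon|^2)(w, a^\varepsilon)\|_{H^{s-1}}^2 \le C\, N_s^\varepsilon(w). \tag{81}$$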
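Indeed, let $\mu \in \mathbb{N}^d$ with $|\mu| \le s - 1$. Then,
$$\partial^\mu\left[f'(|a^\varepsilon|^2)(w, a^\varepsilon)\right] = \sum_{\alpha+\beta+\lambda=\mu} * \;\partial^\lambda\left[f'(|a^\varepsilon|^2)\right]\left(\partial^\alpha w, \partial^\beta a^\varepsilon\right),$$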
where ∗ is a coefficient depending only on α, β and λ. Since |µ| ≤ s − 1, the terms 1 (∂ α w, ∂ β a ε ) are bounded in L 2 by (w) 2 + ε||w|| H s−2 as soon as |α| ≤ s − 2. The term in the sum with |α| = s − 1 (hence µ = α and β = λ = 0) is f (|a ε |2 ) (∂ µ w, a ε ) and is bounded in L 2 by N ε (∂ µ w). Hence, (81) follows. As a consequence, by (60) and Sobolev embedding, we obtain
|| f (|a ε |2 )(w, a ε )w || H s−1 ≤ Cs || w || L ∞ || f (|a ε |2 )(w, a ε ) || H s−1 + || w || H s−1 ≤
C ε N (w). ε2 s
Consequently,
$$\|F_2^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\|\operatorname{Im} F_2^\varepsilon\|_{H^{s-1}}^2 \le \frac{C}{\varepsilon^4}\, N_s^\varepsilon(w)^2.$$
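Estimate for $F_3^\varepsilon \equiv 2 a^\varepsilon f''(|a^\varepsilon|^2)(w, a^\varepsilon)^2$. We find as for $F_1^\varepsilon$,
$$\|F_3^\varepsilon\|_{H^s}^2 \le \frac{C}{\varepsilon^4}\, N_s^\varepsilon(w)^2,$$
and once again in view of (38),
$$\frac{1}{\varepsilon^2}\|\operatorname{Im} F_3^\varepsilon\|_{H^{s-1}}^2 \le C\|w\|_{H^{s-1}}^4 \le \frac{C}{\varepsilon^4}\, N_s^\varepsilon(w)^2.$$
We conclude the proof of Lemma 3 by summing these estimates.

5. Geometric Optics in a Half-Space

In this section, we consider the Gross-Pitaevskii equation in a half-space in dimension $d \le 3$,
$$GP(\Psi^\varepsilon) \equiv i\varepsilon\,\partial_t \Psi^\varepsilon + \frac{\varepsilon^2}{2}\Delta\Psi^\varepsilon - \Psi^\varepsilon\left(|\Psi^\varepsilon|^2 - 1\right) = 0, \quad x \in \mathbb{R}^d_+ \equiv \mathbb{R}^{d-1} \times (0, +\infty). \tag{82}$$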
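We consider the Neumann boundary condition (12) on the boundary and the condition (13) at infinity, that is
$$\frac{\partial \Psi^\varepsilon}{\partial n}\Big|_{\partial\mathbb{R}^d_+} = \frac{\partial \Psi^\varepsilon}{\partial z}\Big|_{z=0} = 0 \quad\text{and}\quad \exp\left(\frac{i}{2\varepsilon}|u^\infty|^2 t - \frac{i}{\varepsilon}\, u^\infty \cdot x\right)\Psi^\varepsilon \to 1, \quad |x| \to +\infty,$$
using the notation $x = (y, z) \in \mathbb{R}^{d-1} \times (0, +\infty)$.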
5.1. Construction of the WKB expansion. In this section, we shall consider a smooth solution $(a, u)$, with $a$ real-valued, of
$$\begin{cases} \partial_t a + u \cdot \nabla a + \dfrac{1}{2}\, a\, \nabla \cdot u = 0 \\[4pt] \partial_t u + u \cdot \nabla u + \nabla(a^2) = 0, \end{cases} \tag{83}$$
with the boundary condition $u_d(t, y, 0) = 0$ and the condition at infinity
$$u(t, x) \to u^\infty, \quad a(t, x) \to 1 \quad \text{when } |x| \to +\infty.$$
Since we look for a real-valued $a$, this system is solved in [14] (Theorem 2). Given $s \in \mathbb{N}^*$, if the initial datum $a_0$ is positive and $(a_0 - 1, u_0 - u^\infty) \in H^s$, and under some compatibility conditions of sufficiently high order for $(a_0, u_0)$ on the boundary $\partial\mathbb{R}^d_+$, there exist $T_0 \in (0, +\infty)$ and a solution $(a, u)$ on $[0, T_0]$ with $(a - 1, u - u^\infty) \in C^0([0, T_0], H^s) \cap C^1([0, T_0], H^{s-1})$, such that
$$a(t, x) \ge \alpha > 0, \quad \forall t \in [0, T_0],\ \forall x \in \mathbb{R}^d_+ \tag{84}$$
for some $\alpha > 0$. We also define the phase $\varphi$ by
$$\varphi(t, x) \equiv \varphi_0(x) - \int_0^t \left(\frac{1}{2}|u|^2 + |a|^2 - 1\right)(\tau, x)\, d\tau.$$
In view of the condition (13) at infinity, $\varphi$ is not in $H^s$ but $\varphi(t, \cdot) - u^\infty \cdot x + \frac{t}{2}|u^\infty|^2 \in H^s$. As we have seen and as in [2], $u = \nabla\varphi$. The aim of this subsection is to prove the existence of the WKB expansion (which involves boundary layers since the solution of (83) does not match the Neumann boundary condition (12)) up to arbitrary order for (82), (12), (13), starting from a smooth $(a, u)$ which satisfies (84). We define the set of boundary layer profiles $S_{exp}$ as
$$S_{exp} = \left\{ A(t, y, Z) \in H^\infty(\mathbb{R}_+ \times \mathbb{R}^{d-1} \times \mathbb{R}_+),\ \forall k, \alpha, l,\ \exists \gamma > 0,\ |\partial_t^k \partial_y^\alpha \partial_Z^l A| \le C_{k,\alpha,l} \exp(-\gamma Z) \right\}.$$

Lemma 4. Let $s \in \mathbb{N}$ and $m \in \mathbb{N}^*$ be fixed. Then, there exists a smooth function $\Psi^{a,m} = a^\varepsilon e^{i\varphi^\varepsilon/\varepsilon}$ on $[0, T_m]$ satisfying the Neumann condition (12) and the condition (13) at infinity and such that $\Psi^{a,m}$ is an approximate solution of (82) on $[0, T_m]$:
$$GP(\Psi^{a,m}) = \varepsilon^m R^\varepsilon e^{i\varphi^\varepsilon/\varepsilon}, \tag{85}$$
where R ε can be written under the form
z z R ε = −a ε Rϕint,m (t, x) + Rϕ,m (t, y, ) + i ε Raint,m (t, x) + Ra,m (t, y, ) , ε ε
(86)
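with $R_\varphi^{int,m}$, $R_a^{int,m}$ smooth and uniformly bounded in $H^s$, and with the boundary layer parts of $R_a$ and $R_\varphi$, as functions of $(t, y, Z)$, in $S_{exp}$. Moreover, $a^\varepsilon$ is real-valued and $a^\varepsilon$, $\varphi^\varepsilon$ have smooth expansions of the form
$$a^\varepsilon = a + \sum_{k=1}^{m-1} \varepsilon^k \left( a^k(t, x) + A^k\left(t, y, \frac{z}{\varepsilon}\right) \right) + \varepsilon^m A^m\left(t, y, \frac{z}{\varepsilon}\right), \tag{87}$$
$$\varphi^\varepsilon = \varphi + \sum_{k=1}^{m-1} \varepsilon^k \left( \varphi^k(t, x) + \Phi^k\left(t, y, \frac{z}{\varepsilon}\right) \right) + \varepsilon^m \Phi^m\left(t, y, \frac{z}{\varepsilon}\right). \tag{88}$$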
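The boundary layer profiles $A^k(t, y, Z)$, $\Phi^k(t, y, Z)$ belong to $S_{exp}$ and are such that
$$\partial_Z A^1(t, y, 0) = -\partial_z a(t, y, 0), \quad \partial_Z \Phi^1(t, y, 0) = -\partial_z \varphi(t, y, 0),$$
$$\partial_Z A^k(t, y, 0) = -\partial_z a^{k-1}(t, y, 0), \quad \partial_Z \Phi^k(t, y, 0) = -\partial_z \varphi^{k-1}(t, y, 0) \quad \forall\, 2 \le k \le m. \tag{89}$$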
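Proof. Since $\Psi^{a,m} = a^\varepsilon \exp\left(i\varphi^\varepsilon/\varepsilon\right)$, we want to solve approximately
$$-a^\varepsilon\left(\partial_t \varphi^\varepsilon + \frac{1}{2}|\nabla\varphi^\varepsilon|^2 + |a^\varepsilon|^2 - 1\right) + i\varepsilon\left(\partial_t a^\varepsilon + \nabla\varphi^\varepsilon \cdot \nabla a^\varepsilon + \frac{1}{2}\, a^\varepsilon \Delta\varphi^\varepsilon\right) + \frac{\varepsilon^2}{2}\Delta a^\varepsilon = 0. \tag{90}$$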
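The passage from (82) to (90) is the usual computation for a wave function written as a real amplitude times a phase factor. As a hedged sanity check, the sketch below compares, in one space dimension, a finite-difference evaluation of $GP(\Psi)$ for $\Psi = a\, e^{i\varphi/\varepsilon}$ with the left-hand side of (90) multiplied by $e^{i\varphi/\varepsilon}$. The profiles `a` and `phi`, the value `EPS` and the sample points are hypothetical choices made only for this illustration.

```python
import cmath
import math

EPS = 0.5  # arbitrary test value of the small parameter

# Hypothetical smooth test profiles (one space dimension) with exact derivatives.
def a(t, x):      return 1.0 + 0.3 * math.sin(x) * math.cos(t)
def a_t(t, x):    return -0.3 * math.sin(x) * math.sin(t)
def a_x(t, x):    return 0.3 * math.cos(x) * math.cos(t)
def a_xx(t, x):   return -0.3 * math.sin(x) * math.cos(t)

def phi(t, x):    return 0.5 * x + 0.2 * math.cos(x + t)
def phi_t(t, x):  return -0.2 * math.sin(x + t)
def phi_x(t, x):  return 0.5 - 0.2 * math.sin(x + t)
def phi_xx(t, x): return -0.2 * math.cos(x + t)

def psi(t, x):
    """Wave function a * exp(i * phi / eps)."""
    return a(t, x) * cmath.exp(1j * phi(t, x) / EPS)

def gp_finite_difference(t, x, h=1e-4):
    """i eps d_t psi + (eps^2 / 2) d_xx psi - psi (|psi|^2 - 1), by central differences."""
    dt = (psi(t + h, x) - psi(t - h, x)) / (2.0 * h)
    dxx = (psi(t, x + h) - 2.0 * psi(t, x) + psi(t, x - h)) / h ** 2
    p = psi(t, x)
    return 1j * EPS * dt + 0.5 * EPS ** 2 * dxx - p * (abs(p) ** 2 - 1.0)

def gp_split(t, x):
    """Left-hand side of (90) times exp(i * phi / eps), using exact derivatives."""
    amp = a(t, x)
    phase_eq = -amp * (phi_t(t, x) + 0.5 * phi_x(t, x) ** 2 + amp ** 2 - 1.0)
    transport = EPS * (a_t(t, x) + phi_x(t, x) * a_x(t, x) + 0.5 * amp * phi_xx(t, x))
    dispersive = 0.5 * EPS ** 2 * a_xx(t, x)
    return (phase_eq + 1j * transport + dispersive) * cmath.exp(1j * phi(t, x) / EPS)

max_gap = max(abs(gp_finite_difference(t, x) - gp_split(t, x))
              for t in (0.0, 0.4, 1.1) for x in (-1.0, 0.3, 2.0))
print(f"largest discrepancy over the sample points: {max_gap:.2e}")
```

The discrepancy is at the level of the finite-difference error, which is consistent with the three groups of terms in (90): the phase equation, the transport equation at order $\varepsilon$, and the dispersive term at order $\varepsilon^2$.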
Since, in this section, we are looking for $a^\varepsilon$ real-valued, we can split the system (90) into
$$\begin{cases} \partial_t a^\varepsilon + \nabla\varphi^\varepsilon \cdot \nabla a^\varepsilon + \dfrac{1}{2}\, a^\varepsilon \Delta\varphi^\varepsilon = 0 \\[4pt] \partial_t \varphi^\varepsilon + \dfrac{1}{2}|\nabla\varphi^\varepsilon|^2 + (a^\varepsilon)^2 - 1 = \dfrac{\varepsilon^2}{2}\dfrac{\Delta a^\varepsilon}{a^\varepsilon} \end{cases} \quad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+. \tag{91}$$
Note that in this section, the division by $a^\varepsilon$ in the right-hand side of the second equation of (91) is not a problem since $a^0 = a$ satisfies (84) and hence does not vanish. We thus plug the expansions (87), (88) into (91) and cancel the powers of $\varepsilon$. To separate interior and boundary layer terms, we use the general theory of [11]. In particular, we use that for every smooth function $f$ and $V \in S_{exp}$, we have the expansion
$$f\big(u(t, x) + V(t, y, z/\varepsilon)\big) = f(u(t, x)) + f\big(u(t, y, 0) + V(t, y, z/\varepsilon)\big) - f(u(t, y, 0)) + \varepsilon R,$$
where $R \in S_{exp}$. This yields that the boundary layer part of $f(u(t, x) + V(t, y, z/\varepsilon))$ is given by $f(u(t, y, 0) + V(t, y, z/\varepsilon)) - f(u(t, y, 0))$. In the following, we use the notation $W_b = W(t, y, 0)$ for every $W(t, x)$. At first, the $\varepsilon^{-1}$ term in the equation only gives
$$a_b\, \partial_{ZZ} \Phi^1 = 0,$$
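and hence we have $\Phi^1 = 0$, since $a_b \ge \alpha > 0$ and $\Phi^1 \in S_{exp}$. Note that this is coherent with the fact that $u_d(t, y, 0) = (\partial_z \varphi)_b = 0$, so that we do not need a boundary layer to correct the boundary condition. The $\varepsilon^0$ term gives, as expected,
$$\begin{cases} \partial_t \varphi + \dfrac{1}{2}|\nabla\varphi|^2 + a^2 - 1 = 0 \\[4pt] \partial_t a + \nabla\varphi \cdot \nabla a + \dfrac{1}{2}\, a\, \Delta\varphi = 0 \end{cases} \quad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+ \tag{92}$$
for the interior part, and for the boundary layer terms, for $(t, y) \in \mathbb{R}_+ \times \mathbb{R}^{d-1}$,
$$a_b\, \partial_{ZZ} \Phi^2 = -(\partial_z \varphi)_b\, \partial_Z A^1 = 0 \quad \text{for } Z > 0,$$
since $(\partial_z \varphi)_b = u_d(t, y, 0) = 0$. Consequently, we also find $\Phi^2 = 0$. Next, the order $\varepsilon$ gives
$$\begin{cases} \partial_t a^1 + \nabla\varphi \cdot \nabla a^1 + \nabla\varphi^1 \cdot \nabla a + \dfrac{1}{2}\left(a\, \Delta\varphi^1 + a^1 \Delta\varphi\right) = 0 \\[4pt] \partial_t \varphi^1 + 2 a\, a^1 + \nabla\varphi \cdot \nabla\varphi^1 = 0 \end{cases} \quad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+ \tag{93}$$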
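in the interior and for the boundary layer terms
$$\begin{cases} \dfrac{1}{2}\, \partial_{ZZ} A^1 = A^1 \left(\partial_t \varphi + \dfrac{1}{2}|\nabla\varphi|^2 + a^2 - 1\right)_b + 2 a_b^2 A^1 = 2 a_b^2 A^1 \\[4pt] a_b\, \partial_{ZZ} \Phi^3 = G^3 \end{cases} \quad \text{for } Z > 0, \tag{94}$$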
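where $G^3 \in S_{exp}$ depends only on $(a, A^1, a^1)$ and $(\varphi, \varphi^1)$. Consequently, the boundary layer $A^1$ is given by
$$A^1 \equiv \frac{(\partial_z a)_b}{2 a_b}\, e^{-2 a_b Z}$$
in order to match (89). Finally, the $\varepsilon^k$, $k \ge 2$ terms give
$$\begin{cases} \partial_t \varphi^k + 2 a\, a^k + \nabla\varphi \cdot \nabla\varphi^k = S_\varphi^k \\[4pt] \partial_t a^k + \nabla\varphi \cdot \nabla a^k + \nabla a \cdot \nabla\varphi^k + \dfrac{a}{2}\Delta\varphi^k + \dfrac{a^k}{2}\Delta\varphi = S_a^k \end{cases} \quad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+ \tag{95}$$
and
$$\begin{cases} \partial_{ZZ} A^k = 4 a_b^2 A^k + F^k \\[4pt] \partial_{ZZ} \Phi^k = G^k \end{cases} \quad \text{for } Z > 0, \tag{96}$$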
where $S_\varphi^k$ and $S_a^k$ depend only on $(a, \varphi)$ and $(a^j, \varphi^j)_{1 \le j \le k-1}$; $F^k \in S_{exp}$ depends only on $(a, \varphi)$, $(a^j, \varphi^j, A^j, \Phi^j)_{1 \le j \le k-1}$ and $\Phi^k$; and $G^k \in S_{exp}$ depends on $(a, \varphi)$, $(a^j, \varphi^j, A^j, \Phi^j)_{1 \le j \le k-1}$. Therefore, if we want to solve these equations by induction, we have to determine first $\Phi^k$, then $(a^k, \varphi^k)$ and finally $A^k$.

To solve the cascade of equations by induction, we first determine $(a^1, \varphi^1)$. As before, we notice that $(a^1, u^1 \equiv \nabla\varphi^1)$ solves a symmetrizable hyperbolic system (there is no problem with the vacuum since we are in the same situation as in [9]). Since the condition at infinity is already absorbed by $(a, \varphi)$, one can look for $(a^1, u^1)$ in $H^s$. Moreover, we solve the system in $\mathbb{R}^d_+$ with the boundary condition $u_d^1(t, y, 0) = 0$, which is needed in order to match (89) since we have already found that $\Phi^2 = 0$. The existence of a smooth solution for this linear system with the boundary condition $u_d^1(t, y, 0) = 0$, which is maximal dissipative, and an initial condition satisfying suitable compatibility conditions can be obtained by the classical theory [17]. Then, one finds $\varphi^1$ by the formula
$$\varphi^1(t, x) = \varphi_0^1(x) - \int_0^t \left(2 a\, a^1 + u \cdot u^1\right)(\tau, x)\, d\tau.$$
Furthermore, since $F^2 \in S_{exp}$ and $a_b \ge \alpha > 0$, the first equation in (96) (with $k = 2$) has a unique solution $A^2 \in S_{exp}$. We have therefore found $(a^1, A^1, \varphi^1, \Phi^1, A^2, \Phi^2)$.

We now proceed by induction. Assume that, for some $m \ge 2$, we have determined $(a^j, \varphi^j)_{1 \le j \le m-1}$ and $(A^j, \Phi^j)_{1 \le j \le m}$. Then, we wish to solve (95) and (96) with $k = m + 1$. Since $G^{m+1}$ is already determined and $G^{m+1} \in S_{exp}$, the differential equation $\partial_{ZZ} \Phi^{m+1} = G^{m+1}$ has a unique solution in $S_{exp}$ and
$$\partial_Z \Phi^{m+1}(t, y, Z) = -\int_Z^{+\infty} \frac{G^{m+1}(t, y, \zeta)}{a_b(t, y)}\, d\zeta.$$
This determines the boundary condition for $u^{m+1} \equiv \nabla\varphi^{m+1}$. Indeed, to match (89) we shall need to impose
$$u_d^{m+1}(t, y, 0) = (\partial_z \varphi^{m+1})(t, y, 0) = -(\partial_Z \Phi^{m+1})(t, y, 0) = \int_0^{+\infty} \frac{G^{m+1}(t, y, \zeta)}{a_b(t, y)}\, d\zeta, \tag{97}$$
which is non-zero in general. We then solve (95) in the following way: $(a^{m+1}, u^{m+1} \equiv \nabla\varphi^{m+1})$ still solves a linear symmetrizable hyperbolic system, with source terms $S_\varphi^{m+1}$ and $S_a^{m+1}$ already known, with the maximal dissipative boundary condition (97). It then has a smooth solution by the above-mentioned theory. Then, we recover $\varphi^{m+1}$ as usual by
$$\varphi^{m+1}(t, x) \equiv \varphi_0^{m+1}(x) + \int_0^t \left(S_\varphi^{m+1} - 2 a\, a^{m+1} - u \cdot u^{m+1}\right)(\tau, x)\, d\tau.$$
Finally, the first equation in (96) (with k = m + 1) is a linear ODE for Am+1 , with source term F m+1 ∈ Sex p now determined, for which we can write down explicitly the unique exponentially decreasing solution satisfying ∂ Z Ak (t, y, 0) = −∂z a k (t, y, 0). Consequently, we have constructed an approximate solution of (91) such that ⎧
1 ε ε ⎪ ε ε ε m int,m −1 ,m ⎪ ∂ a R a + ∇ϕ · ∇a + ϕ = ε (t, x) + ε R (t, y, z/ε) ⎪ t a a ⎨ 2 ⎪
2 ε ⎪ ⎪ ⎩ ∂t ϕ ε + 1 |∇ϕ ε |2 + a ε 2 − 1 = ε a (t, x) + εm Rϕint,m (t, x) + Rϕ,m (t, y, z/ε) , 2 2 aε
D. Chiron, F. Rousset ,m
,m
where Raint,m (t, x), Rϕint,m (t, x) are smooth bounded functions and Ra , Rϕ ∈ Sex p . We can thus write the error R ε in the GP equation as
R ε (t, x) = εm −a ε Rϕint,m (t, x) + Rϕ,m (t, y, z/ε) + i ε Raint,m (t, x) + Ra,m (t, y, z/ε) . This ends the proof of Lemma 4.
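The first boundary layer profile found in the proof, $A^1 = \frac{(\partial_z a)_b}{2 a_b} e^{-2 a_b Z}$, can be checked directly: it should satisfy the first equation of (94), $\frac{1}{2}\partial_{ZZ} A^1 = 2 a_b^2 A^1$, together with the matching condition $\partial_Z A^1(t, y, 0) = -\partial_z a(t, y, 0)$ from (89). The sketch below does this with finite differences at a frozen point $(t, y)$; the boundary values `a_b` and `dz_a_b` are hypothetical numbers chosen only for the illustration.

```python
import math

def boundary_layer_A1(Z: float, a_b: float, dz_a_b: float) -> float:
    """A^1(Z) = (d_z a)_b / (2 a_b) * exp(-2 a_b Z) at a frozen point (t, y)."""
    return dz_a_b / (2.0 * a_b) * math.exp(-2.0 * a_b * Z)

def first_derivative(g, Z: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation of g'(Z)."""
    return (g(Z + h) - g(Z - h)) / (2.0 * h)

def second_derivative(g, Z: float, h: float = 1e-4) -> float:
    """Central finite-difference approximation of g''(Z)."""
    return (g(Z + h) - 2.0 * g(Z) + g(Z - h)) / h ** 2

a_b = 1.3      # hypothetical trace of a on the boundary (positive, as in (84))
dz_a_b = -0.7  # hypothetical trace of the normal derivative of a

def profile(Z: float) -> float:
    return boundary_layer_A1(Z, a_b, dz_a_b)

# Residual of (1/2) A'' = 2 a_b^2 A at a few values of the fast variable Z.
ode_residual = max(
    abs(0.5 * second_derivative(profile, Z) - 2.0 * a_b ** 2 * profile(Z))
    for Z in (0.0, 0.5, 1.0, 3.0)
)
# Residual of the matching condition d_Z A^1(0) = -(d_z a)_b.
neumann_residual = abs(first_derivative(profile, 0.0) + dz_a_b)

print(f"ODE residual:     {ode_residual:.2e}")
print(f"Neumann residual: {neumann_residual:.2e}")
```

With this choice the normal derivative of $a + \varepsilon A^1(t, y, z/\varepsilon)$ vanishes at $z = 0$, which is how the layer restores the Neumann condition at leading order.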
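5.2. Validity of the WKB expansion. We shall now prove the stability of the WKB expansion built in Lemma 4.

Theorem 6. Let $\Psi^{a,m} = a^\varepsilon e^{i\varphi^\varepsilon/\varepsilon}$ be a WKB expansion defined on $[0, T_m]$ given by Lemma 4. Then for $d \le 3$ and $m \ge 4$ there exists a unique smooth solution $\Psi^\varepsilon$, also defined on $[0, T_m]$, of (82), (12), (13) such that $\Psi^\varepsilon_{/t=0} = \Psi^{a,m}_{/t=0}$. Moreover, we have the estimate
$$\varepsilon\, \|\Psi^\varepsilon e^{-i\varphi^\varepsilon/\varepsilon} - a^\varepsilon\|_{H^1(\mathbb{R}^d_+)} + \varepsilon^3\, \|\Psi^\varepsilon e^{-i\varphi^\varepsilon/\varepsilon} - a^\varepsilon\|_{H^3(\mathbb{R}^d_+)} \le C_m\, \varepsilon^{m - \frac{1}{2}}, \quad \forall t \in [0, T_m],$$
and in particular
$$\|\Psi^\varepsilon e^{-i\varphi^\varepsilon/\varepsilon} - (a + \varepsilon A^1)\|_{W^{1,\infty}(\mathbb{R}^d_+)} \le C_m \max\{\varepsilon, \varepsilon^{m - \frac{7}{2}}\}. \tag{98}$$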
Remark 3. For simplicity, we have restricted ourselves to dimension $d \le 3$. Note however that it is possible to get $H^s$ estimates for every $s$. By contrast with Theorem 2, we emphasize that the initial condition in Theorem 6 is exactly the WKB approximate solution $\Psi^{a,m}$. In particular, this initial datum has to satisfy some compatibility condition on the boundary.

Proof. As in the proof of Theorem 5, we set
$$\Psi^\varepsilon = \Psi^{a,m} + w\, e^{i\varphi^\varepsilon/\varepsilon}$$
and we study the equation for $w$, i.e. (19). Note that we are now seeking a $w$ which tends to zero at infinity since the boundary condition at infinity is already absorbed in the WKB expansion. Again the first step is to get estimates for the linear equation (21) with the Neumann boundary condition
$$\partial_z w(t, y, 0) = 0. \tag{99}$$
As we can check in the proof of Lemma 1, in all the integrations by parts that are performed, the boundary terms vanish due to the Neumann boundary condition or the fact that $u_d^\varepsilon(t, y, 0) = 0$, and hence the proof of the $L^2$ stability will be almost the same as the one in the whole space. Nevertheless, we have to pay attention to the presence of boundary layer terms in the coefficients. At first, we note that since $\Phi^1 = 0$ and $\Phi^2 = 0$ in the WKB expansion, we still have that $M$ (which is defined in Lemma 1) is independent of $\varepsilon$. Indeed, for the worst term, which is $\nabla(\nabla \cdot u^\varepsilon)$, we have
$$\nabla(\nabla \cdot u^\varepsilon) = \partial_{ZZZ} \Phi^3 + \nabla\Delta\varphi + O_{L^\infty}(\varepsilon).$$
Next, keeping the definitions of $R_a$ and $R_\varphi$ given in (17), (18) and by construction of the WKB expansion, we have
$$\|R_a\|_{L^\infty} \le C\varepsilon^m. \tag{100}$$
Nevertheless, again by construction of the WKB expansion, we only have
$$R_\varphi = R_\varphi^m + \frac{\varepsilon^2}{2}\frac{\Delta a^\varepsilon}{a^\varepsilon},$$
and due to the presence of boundary layers in a ε , we can split Rϕ into z Rϕ = ε2 Rϕint (t, y, z) + ε Rϕ (t, y, ), ε
(101)
where Rϕint is smooth and bounded, whereas Rϕ ∈ Sex p , and we see that ε ||Rϕ || L ∞ =
O(ε), ε ||∇ Rϕ || L ∞ = O(1), hence the estimate (23) of Lemma 1 would be useless. Moreover, the fact that Rϕ belongs to Sex p does not seem to improve the estimates. The way to overcome this difficulty seems to incorporate this new singular term into the functional. Let us define the operator S+ε w = −
ε2 w + 2(w, a ε )a ε + ε Rϕ w, 2
our weighted norm in this section will be
ε (S+ε w, w) + K ε2 |w|2 d x N+ (w) =
1 = ε2 |∇w|2 + 4(w, a ε )2 + 2ε Rϕ |w|2 + 2K ε2 |w|2 d x. 2 Note that Rϕ has no sign, nevertheless, N+ε (w) can be bounded from below by a weighted H 1 norm if K is chosen sufficiently large. Indeed, since Rϕ belongs to Sex p we can write −γ z 2ε Rϕ |w|2 d x ≤ Cε e ε |w|2 d x
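and then use the one-dimensional Sobolev inequality
$$|w(t, y, z)|^2 \le C \left(\int_{\mathbb{R}_+} |w(t, y, \zeta)|^2\, d\zeta\right)^{\frac{1}{2}} \left(\int_{\mathbb{R}_+} |\partial_z w(t, y, \zeta)|^2\, d\zeta\right)^{\frac{1}{2}}$$
to get
$$\varepsilon \int e^{-\frac{\gamma z}{\varepsilon}} |w|^2\, dx \le C\varepsilon\, \|w\|_{L^2} \|\nabla w\|_{L^2} \int_{\mathbb{R}_+} e^{-\frac{\gamma z}{\varepsilon}}\, dz \le C\varepsilon^2\, \|w\|_{L^2} \|\nabla w\|_{L^2}.$$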
In particular, we have proven that 2 2ε Rϕ |w| d x ≤ Cε2 ||w|| L 2 ||∇w|| L 2 .
(102)
(103)
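The gain of one power of $\varepsilon$ in this bound comes from two elementary facts: the pointwise bound $|w(z)|^2 \le 2\|w\|_{L^2(\mathbb{R}_+)} \|\partial_z w\|_{L^2(\mathbb{R}_+)}$ for a decaying function, and $\int_0^{+\infty} e^{-\gamma z/\varepsilon}\, dz = \varepsilon/\gamma$. The sketch below illustrates both in one dimension only (the $y$ variables are frozen). The profile `w`, the values of `eps` and `gamma`, and the constant 2 are assumptions made for the illustration.

```python
import math

def trapezoid(g, lo: float, hi: float, n: int = 20000) -> float:
    """Composite trapezoidal rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h

def w(z: float) -> float:
    """Hypothetical real profile on the half-line, decaying at infinity."""
    return (1.0 + z) * math.exp(-z)

def dw(z: float) -> float:
    """Exact derivative of w."""
    return -z * math.exp(-z)

L = 40.0  # truncation of the half-line; the integrands are negligible beyond it
norm_w = math.sqrt(trapezoid(lambda z: w(z) ** 2, 0.0, L))
norm_dw = math.sqrt(trapezoid(lambda z: dw(z) ** 2, 0.0, L))

# One-dimensional Sobolev inequality: sup |w|^2 <= 2 ||w||_{L^2} ||w'||_{L^2}.
sup_sq = max(w(0.01 * i) ** 2 for i in range(4001))
sobolev_bound = 2.0 * norm_w * norm_dw

# Weighted integral against the boundary layer weight exp(-gamma z / eps).
eps, gamma = 0.05, 1.5
layer = eps * trapezoid(lambda z: math.exp(-gamma * z / eps) * w(z) ** 2, 0.0, L)
layer_bound = (eps ** 2 / gamma) * sobolev_bound

print(f"sup |w|^2 = {sup_sq:.4f}  <=  {sobolev_bound:.4f}")
print(f"eps * int exp(-gamma z / eps) |w|^2 = {layer:.6f}  <=  {layer_bound:.6f}")
```

The second printed bound scales like $\varepsilon^2$, which is the scaling used above to absorb the singular term into the weighted norm.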
This yields thanks to the Young inequality 1 2 2ε Rϕ |w| d x ≤ ε2 ||∇w||2L 2 + Cε2 ||w||2L 2 , 2
(104)
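where $C$ is independent of $\varepsilon$. Consequently, if $K$ is chosen such that $2K > C$, we get
$$N_+^\varepsilon(w) \ge C_0 \left( \varepsilon^2 \|w\|_{H^1}^2 + \int (w, a^\varepsilon)^2\, dx \right), \quad C_0 > 0.$$
Note that in this section, we have $a^\varepsilon = a + O(\varepsilon)$ with $a \ge \alpha$; this finally yields that $N_+^\varepsilon(w)$ is equivalent to the weighted norm
$$N_+^\varepsilon(w) \sim \varepsilon^2 \|w\|_{H^1}^2 + \|\operatorname{Re} w\|_{L^2}^2. \tag{105}$$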
The first step in the proof of Theorem 6 is to prove the equivalent of Lemma 1. We shall prove the estimate d ε N (w(t)) ≤ C N+ε (w(t)) dt + + ||F ε ||2L 2 +
4 (w, a ε )(ia ε , F ε ) − ε
(iεw, F ε ) −
(106) (i F ε , Rϕ w),
where C is independent of ε. Proof of (106). The proof follows the same lines as the proof of Lemma 1. At first, since S+ε is self adjoint, we have
ε d 2 S+ε w, ∂t w + 4(w, a ε )(w, ∂t a ε ) + 2ε ∂t Rϕ |w|2 d x. S+ w, w d x = dt
Since ∂t Rϕ ∈ Sex p , we can still use (102) to get ∂t Rϕ |w|2 ≤ C N+ε (w). 2ε
Next, as in the proof of Lemma 1, we use (21) to express ∂t w as ε2 Rϕint i Fε i 1 w− ∂t w = − S+ε w − u ε · ∇w + w ∇ · u ε − i ε 2 ε ε to get 2
iε2 Rϕint 1 w − u ε · ∇w + w ∇ · u ε − 2 ε Fε ε , S+ w d x. −i ε
∂t w, S+ε w d x = 2
(107)
Moreover, since Rϕint and Rϕ are real, we have the cancellation (i Rϕint w, Rϕ w) d x = 0.
Therefore, the only terms in the right-hand side of (107) which are not present in (25) are − (i F ε , Rϕ w) and 1 ε ε u · ∇w + w ∇ · u , ε Rϕ w . I = −2 2 To estimate I, we note that we have a bound on the second term by using again (102). It remains to estimate the first term. Integrating by parts and using that u εd (t, y, 0) = 0, we get ε −2 u · ∇w, ε Rϕ w = ε ∇ · Rϕ u ε |w|2 = ∇ · u ε ε Rϕ |w|2 + ε u ε · ∇ Rϕ |w|2 .
Again, the first term can be bounded thanks to (102). For the second one, we first notice that since u εd (t, y, 0) = 0 and Rϕ ∈ Sex p , we have γz ε u ε · ∇ Rϕ ≤ Cε |∇ y Rϕ | + |z∂z Rϕ | ≤ Cεe− ε . This finally yields I ≤ C N+ε (w), thanks to a new use of (102). The end of the proof of (106) is then exactly the same as the proof of Lemma 1, since all the integration by parts do not create boundary terms either because of the Neumann boundary condition or because u εd vanishes on the boundary. Higher order estimates. The estimates of higher order derivatives are more involved than in the whole space. There are two main reasons. The first one is that there is a new singular term ε Rϕ w which creates bad terms when we take the derivatives of the equation. The second reason is that to recover estimates on the normal derivatives, we need to use the equation which gives in particular that ε2 ∂z2 behaves like ε∂t and ε∇. This anisotropy in the weights does not seem to allow to construct high order functionals like Nsε (w) which allows to get H s estimates without additional loss of ε. Let us use the notation t = (0 , . . . , d ) = ∂t , ∇ y , p(z)∂z , where the weight p(z) is given by p(z) = z/(1 + z). Note that we can apply to the equation since w still satisfies the Neumann boundary condition. The use of is classical in hyperbolic characteristic initial boundary value problems (see [17] for example) The weighted norm that we shall estimate is Y+ε (w) ≡ N+ε (w) + N+ε (εw).
In dimension d ≤ 3, this is sufficient to get the nonlinear stability. We shall see in the proof why the use of d is necessary. We shall prove that d ε Y+ (w) ≤ C Y+ε (w) + X ε (F ε ) + X ε (εF ε ) dt
(108)
for some C > 0 independent of ε where we have set X ε (F) ≡ ||F||2H 1 +
||F||2L 2 ε
+
||Im F||2L 2 ε2
.
Proof of (108). As a preliminary, we shall rewrite (106) in a more convenient form. We can use that a ε = a + O(ε) with a real, perform an integration by parts and use (102) to get from (106) that d ε N (w(t)) ≤ C N+ε (w(t)) + X ε (F ε ), dt +
(109)
where X ε (F ε ) = ||F ε ||2H 1 +
||F||2L 2 ε
+
||Im F ε ||2L 2 ε2
.
To prove (108), we start with the estimate of N+ε (ε∂t w). When we apply ε∂t to (21), we find iε∂t + Lε ε∂t w = Rϕ ε∂t w + ε∂t F ε + C, (110) where the commutator C can be split into C = C1 + C2 + C3
(111)
with C1 ≡ ε ∂t Rϕ w, C2 ≡ 2ε (∂t a ε , w)a ε + (a ε , w)∂t a ε , 1 2 ε ε C3 ≡ −iε ∂t u · ∇w + ∂t (∇ · u ) w . 2 Consequently, we can apply (109) to (110) with the new source term ε∂t F ε + C to get d ε N (ε∂t w(t)) ≤ C N+ε (ε∂t w(t)) + X ε (ε∂t F ε ) + X ε (C). dt +
(112)
Thus it remains to estimate X ε (C). Let us begin with X ε (C1 ). Thanks to the expansion (101), we easily get X ε (C1 ) N+ε (w) + |∂t Rϕ |2 ε4 |w|2 + ε4 |∇w|2 ) + ε4 |∇∂t Rϕ |2 |w|2 N+ε (w).
(113)
Note that we could have a better estimate by using that Rϕ ∈ Sex p and (102). Next, we turn to the estimate of X ε (C2 ). By using that a ε = a + O(ε) with a real, we find X ε (C2 ) N+ε (w) + ε||Re w||2L 2 + ε2 ||∇w||2L 2 N ε (w).
(114)
Note that the above estimate was sharp. This is for the estimate of this commutator C2 that we had to choose the weight ε in front of the time derivative. Finally, we estimate X ε (C3 ) using that ∂t u εd vanishes on the boundary which implies that |∂t u εd | p(z). Thanks to this remark, we find X ε (C3 ) N+ε (w) + ε4 ||∇w||2L 2 Y+ε (w).
(115)
Note that this is for the control of this commutator that we are obliged to add the vector field p(z)∂z in the definition of the functional space. Consequently, the combination of (112), (113), (114) and (115) gives d ε N (ε∂t w(t)) Y+ε (w(t)) + X ε (ε∂t F ε ). dt +
(116)
The estimate of ε∇ y w follows exactly the same lines, and we also find d ε N+ ε∇ y w(t) Y+ε (w(t)) + X ε (ε∇ y F ε ). dt
(117)
The estimate of εd w = εp(z)∂z w requires some additional work since the vector field d does not commute with the Laplacian. By applying εd to (21), we get iε∂t + Lε εd w = Rϕ εd w + εd F ε + C + C4 , (118) where C is defined as in (111) above with ∂t replaced by d and C4 is given by ε3 ε3 [d , ]w = − (2 (∂z p) ∂zz w + (∂zz p) ∂z w) . 2 2 Next, we can apply (106) to get C4 ≡ −
d ε N (εd w(t)) N+ε (εd w(t)) + X ε (εd F ε ) + X ε (C) + ||C4 ||2H 1 dt + 4 ε ε iC4 , Rϕ εd w . + (εd w, a )(ia , C4 ) − ε Since one can easily check that X ε (C) still satisfies the bounds (113), (114), (115), we obtain d ε N (εd w(t)) Y+ε (w) + X ε (εd F ε ) + ||C4 ||2H 1 dt + 4 iC4 , Rϕ εd w . + (εd w, a ε )(ia ε , C4 ) − ε Next, we note that ||C4 ||2H 1 ε6 ||w||2H 3 and that
4 1 4 ε ε (ε w, a )(ia , C ) ε|∂z w| | p(z)C4 | ε2 N+ε (w) 2 || p∂zz w|| L 2 + ||∂z w|| L 2 d 4 ε ε 1
1
N+ε (w) 2 Y+ε (w) 2 . In a similar way, we also get iC4 , Rϕ εd w ε||∂z w|| L 2 || p C4 || L 2 Y+ε (w).
Consequently, we have proven that d ε N (εd w(t)) Y+ε (w) + X ε (εd F ε ) + ε6 ||w||2H 3 . dt +
(119)
To conclude, it remains to estimate ε6 ||w||2H 3 . As usual, this is done thanks to Eq. (19) and the standard regularity result for elliptic equations. We rewrite (19) as the equation ε2 w = G ε , ∂z w(t, y, 0) = 0,
(120)
where the source term enjoys the estimates ||G ε ||2L 2 ε2 ||w||2L 2 + ||w||2L 2 + ||F ε ||2L 2 , ||∇G ε ||2L 2 ε2 ||∇w||2L 2 + ||w||2H 1 + ||∇ F ε ||2L 2 . Consequently, we get from (120) by standard elliptic regularity that ε6 ||w||2H 3 Y+ε (w) + ||F ε ||2H 1 .
(121)
By replacing this last estimate in (119), we finally obtain d ε N (εd w(t)) Y+ε (w) + X ε (εd F ε ) + ||F ε ||2H 1 . dt +
(122)
To conclude, it suffices to sum the estimates (109), (116), (117) and (122) to get (108). The estimate (108) is sufficient to prove the nonlinear stability stated in Theorem 6 for d ≤ 3. Nevertheless, it is possible to prove by induction that for every s, d dt
m≤s
N+ε
m (ε) w X ε (ε)m F ε + N+ε (ε)m w . m≤s
Nonlinear stability.. Thanks to (108) and the Gronwall inequality, we get for 0 ≤ T ≤ Tm , where Tm is the existence time of the approximate solution given by Lemma 4, sup Y+ε (w) Y+ε (0) + T eγ T sup X ε (F ε ) + X ε (εF ε ) [0,T ]
[0,T ]
for some γ > 0 independent of ε. Combining this last estimate with (121), we get ε ε ε ε ε ε sup Z + (w) ≤ C Tm Y+ (0) + sup X (F ) + X (εF ) , (123) [0,T ]
[0,T ]
with Z +ε (w) ≡ Y+ε (w) + ε6 ||w||2H 3 . Thanks to this a priori estimate, one can easily prove by standard fixed point argument the existence of a unique solution of (19) with the Neumann condition ∂z w|z=0 = 0 on some interval of time [0, T ε ] ⊂ [0, Tm ] such that Z +ε (w) remains finite. By using that w/t=0 = 0 and the equation to compute the time derivative, we find Y+ε (w)/t=0 = N+ε (ε∂t w)/t=0 ≤ C Tm ε2m . Moreover, using that F ε = εm R ε + Q ε , we have thanks to (86) that sup X +ε (R ε ) + X +ε (R ε ) ≤ C Tm ε2m−1 . [0,Tm ]
Inserting this into (123) yields, for 0 ≤ t ≤ T ε , sup Z +ε (w) ≤ K Tm ε2m−1 + C Tm sup X ε (Q ε ) + X ε (εQ ε ) .
[0,T ]
[0,T ]
(124)
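We can thus define $\tau^\varepsilon \in (0, T_m]$ as the maximal time such that the solution $w$ of (19) satisfies $Z_+^\varepsilon(w(t)) \le 2 K_{T_m} \varepsilon^{2m-1}$ on $[0, \tau^\varepsilon]$. As in the proof of Theorem 5, we shall prove that for $\varepsilon$ sufficiently small, we have $\tau^\varepsilon = T_m$. Here, the expression of $Q^\varepsilon(w)$ is given by
$$Q^\varepsilon(w) = a^\varepsilon |w|^2 + 2(w, a^\varepsilon)\, w + w|w|^2.$$
To conclude, we need to bound the right-hand side of (124). To estimate the nonlinear term, we use that for $d \le 3$, we have $\|w\|_{L^\infty}^2 \lesssim \|\nabla^2 w\|_{L^2} \|w\|_{H^1}$, which gives
$$\|w\|_{L^\infty}^2 \lesssim \frac{Z_+^\varepsilon(w)}{\varepsilon^4} \lesssim \varepsilon^{2m-5}, \quad \forall t \in [0, \tau^\varepsilon).$$
We shall take $m$ such that $2m > 5$ in order to get $\|w\|_{L^\infty} \le 1$ for $t \in [0, \tau^\varepsilon)$. This implies
$$\|Q^\varepsilon\|_{H^1}^2 \lesssim \left(\|w\|_{L^\infty}^2 + \|w\|_{L^\infty}^4\right) \|w\|_{H^1}^2 \lesssim \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6}.$$
Next, since $H^1(\mathbb{R}^d) \subset L^4$ for $d \le 3$, we also have
$$\frac{\|Q^\varepsilon\|_{L^2}^2}{\varepsilon^2} \lesssim \frac{\|w\|_{H^1}^4}{\varepsilon^2}\left(1 + \|w\|_{L^\infty}^2\right) \lesssim \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6}.$$
Consequently, we have already proven that
$$X^\varepsilon(Q^\varepsilon) \lesssim \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6}. \tag{125}$$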
Next, we evaluate X ε (εQ ε ). At first, we write
ε2 ||Q ε ||2H 1 ε2 ||w||2H 1 ||w||2L ∞ + ||w||4L ∞ + ε2 ||w||2L 4 ||∇w||2L 4 1 + ||w||2L ∞ ) and by using for d ≤ 3, the Sobolev embedding H 1 ⊂ L 4 and the Gagliardo-Nirenberg inequality 1
3
||∇ f ||2L 4 || f || H2 1 ||∇ 2 f || L2 2 , we get for 0 ≤ t ≤ τ ε : ε2 ||Q ε ||2H 1
1 3 Z +ε (w)2 Z ε (w)2 + ε2 ||∇w||2H 1 ||w|| H2 1 ||∇ 2 w|| L2 2 + 6 . 4 ε ε
Finally, by similar arguments, we also have ||εQ ε ||2L 2 ε2
1
1
||w|| L2 4 ||w|| L2 4 ||w||2H 1 ||w||2H 1
Z +ε (w)2 . ε6
We have thus proven that X ε (εQ ε )
Z +ε (w)2 . ε6
(126)
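Consequently, inserting (125), (126) into (124), we get
$$\sup_{[0, \tau^\varepsilon]} Z_+^\varepsilon(w) \le K_{T_m} \varepsilon^{2m-1} + C_{T_m} \sup_{[0, \tau^\varepsilon]} \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6} \le K_{T_m} \varepsilon^{2m-1} + 2 K_{T_m} C_{T_m}\, \varepsilon^{2m-7} \sup_{[0, \tau^\varepsilon]} Z_+^\varepsilon(w).$$
By choosing $m \ge 4$, this allows us to get, for $\varepsilon$ sufficiently small, that $\tau^\varepsilon = T_m$ and that
$$\sup_{[0, T_m]} Z_+^\varepsilon(w) \le C\varepsilon^{2m-1}.$$
Finally, the estimate (98) follows by Sobolev embedding. This ends the proof of Theorem 6.

Acknowledgement. We thank Rémi Carles for useful comments about this work.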
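A. A Lemma about Composition in Sobolev Spaces

During the proof of Lemma 3, we have used a result about composition in Sobolev spaces. This result is very standard when $h$ does not depend on $x$ (see, for instance, [18]).

Lemma 5. Let $R > 0$, $s \in \mathbb{N}$ and $h = h(x, w) \in C^{s+1}(\mathbb{R}^d \times \mathbb{R}^2, \mathbb{R})$, satisfying $h(x, 0) = 0$ for all $x \in \mathbb{R}^d$. Assume moreover
$$A \equiv \sup\left\{ \|\partial_x^\alpha \partial_w^\beta h\|_{L^\infty(\mathbb{R}^d \times B_R)},\ \alpha \in \mathbb{N}^d,\ \beta \in \mathbb{N}^2,\ |\alpha| \le s,\ |\alpha| + |\beta| \le s + 1 \right\} < +\infty.$$
Then, there exists $C$, depending only on $A$, $s$ and $R$, such that, for any $w \in H^s(\mathbb{R}^d)$ satisfying $\|w\|_{L^\infty(\mathbb{R}^d)} \le R$, we have $h(x, w(x)) \in H^s(\mathbb{R}^d)$ and
$$\|h(x, w(x))\|_{H^s} \le C \|w\|_{H^s}.$$

Proof. The proof is by induction on $s \in \mathbb{N}$ and relies on the Gagliardo-Nirenberg inequality. If $s = 0$, it suffices to notice that since $h(x, 0) = 0$, then for $w \in B_R$, $|h(x, w)| \le A|w|$. Assume then the result for $s - 1 \in \mathbb{N}$. Let $\mu \in \mathbb{N}^d$ with $|\mu| = s$. One easily has
$$\partial^\mu \big(h(x, w(x))\big) = \sum * \left(\partial_x^\alpha \partial_w^{\beta+\gamma} h\right)(x, w(x)) \left(\partial^\beta w_1\right)^p \left(\partial^\gamma w_2\right)^q,$$
where $\alpha \in \mathbb{N}^d$, $\alpha \le \mu$, $\beta, \gamma \in \mathbb{N}^2$, $p, q \in \mathbb{N}^*$ depend on $\beta$ and $\gamma$, $|\alpha| + p|\beta| + q|\gamma| = s$, and $*$ is a coefficient depending only on $\mu$, $\alpha$, $\beta$ and $\gamma$. Furthermore, since $w \in H^s \cap L^\infty$, the Gagliardo-Nirenberg inequality yields, for $1 \le k \le s$,
$$\|w\|_{W^{k, \frac{2s}{k}}} \le C_{k,s}\, \|w\|_{H^s}^{\frac{k}{s}}\, \|w\|_{L^\infty}^{1 - \frac{k}{s}}.$$
As a consequence, by interpolation, if $w \in H^s \cap L^\infty$ and $\|w\|_{L^\infty} \le R$, then for $\gamma \in \mathbb{N}^d$, $|\gamma| \le s$, and $2 \le p \le \frac{2s}{|\gamma|}$,
$$\|\partial^\gamma w\|_{L^p} \le C_{s,p,R}\, \|w\|_{H^s}^{\frac{2}{p}}.$$
Therefore, in view of $|\alpha| + p|\beta| + q|\gamma| = s$, by the Hölder inequality, we can estimate the terms in $\partial^\mu (h(x, w(x)))$ for which $\alpha \ne \mu$ (thus $|\alpha| < s$) as
$$\left\| \left(\partial_x^\alpha \partial_w^{\beta+\gamma} h\right)(x, w(x)) \left(\partial^\beta w_1\right)^p \left(\partial^\gamma w_2\right)^q \right\|_{L^2} \le A\, \|\partial^\beta w_1\|^p_{L^{2\frac{s - |\alpha|}{|\beta|}}}\, \|\partial^\gamma w_2\|^q_{L^{2\frac{s - |\alpha|}{|\gamma|}}} \le C_{s,p,R}\, A\, \|w\|_{H^s}.$$
For the term for which $\alpha = \mu$, we note that since $h(x, 0) = 0$ for $x \in \mathbb{R}^d$, then $(\partial_x^\alpha h)(x, 0) = 0$ for any $x \in \mathbb{R}^d$, so that if $w \in B_R \subset \mathbb{R}^2$,
$$\left|(\partial_x^\alpha h)(x, w)\right| \le A|w|,$$
which implies
$$\|(\partial_x^\alpha h)(x, w(x))\|_{L^2} \le A \|w\|_{L^2} \le A \|w\|_{H^s}.$$
Combining these two estimates gives $\|\partial^\mu (h(x, w(x)))\|_{L^2} \le C_{s,p,R}\, A\, \|w\|_{H^s}$, and the proof of the lemma is complete.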
References
1. Alazard, T., Carles, R.: Loss of regularity for supercritical nonlinear Schrödinger equations. Math. Ann. 343(2), 397–420 (2009)
2. Alazard, T., Carles, R.: Supercritical geometric optics for nonlinear Schrödinger equations. Arch. Rat. Mech. Anal., to appear, doi:10.1007/s00205-008-0176-7, 2008
3. Anton, R.: Global existence for defocusing cubic NLS and Gross-Pitaevskii equations in exterior domains. J. Math. Pures Appl. (9) 89(4), 335–354 (2008)
4. Brenier, Y.: Convergence of the Vlasov-Poisson system to the incompressible Euler equations. Comm. Part. Diff. Eqs. 25(3-4), 737–754 (2000)
5. Cazenave, T.: Semilinear Schrödinger equations. Courant Lecture Notes in Mathematics, Vol. 10. New York: New York University, Courant Institute of Mathematical Sciences, 2003
6. Colliander, J., Keel, M., Staffilani, G., Takaoka, H., Tao, T.: Global well-posedness and scattering for the energy-critical nonlinear Schrödinger equation in R3. Ann. of Math. (2) 167(3), 767–865 (2008)
7. Gérard, P.: Remarques sur l’analyse semi-classique de l’équation de Schrödinger non linéaire. Séminaire sur les Equations aux Dérivées Partielles, Ecole Polytechnique, Palaiseau, 1992-1993, Exp. No. XIII, 13 pp.
8. Ginibre, J., Velo, G.: The global Cauchy problem for the nonlinear Schrödinger equation revisited. Ann. Inst. H. Poincaré Anal. Non Linéaire 2(4), 309–327 (1985)
9. Grenier, E.: Semiclassical limit of the nonlinear Schrödinger equation in small time. Proc. Amer. Math. Soc. 126(2), 523–530 (1998)
10. Grenier, E.: On the derivation of homogeneous hydrostatic equations. M2AN Math. Model. Numer. Anal. 33(5), 965–970 (1999)
11. Grenier, E., Guès, O.: Boundary layers of viscous perturbations of noncharacteristic quasilinear hyperbolic problems. J. Diff. Eqs. 143(1), 110–146 (1998)
12. Kivshar, Y.S., Luther-Davies, B.: Dark optical solitons: physics and applications. Physics Reports 298, 81–197 (1998)
13. Kolomeisky, E.B., Newman, T.J., Straley, J.P., Qi, X.: Low-Dimensional Bose Liquids: Beyond the Gross-Pitaevskii Approximation. Phys. Rev. Lett. 85, 1146–1149 (2000)
14. Lin, F., Zhang, P.: Semiclassical limit of the Gross-Pitaevskii equation in an exterior domain. Arch. Rat. Mech. Anal. 179(1), 79–107 (2006)
15. Makino, T., Ukai, S., Kawashima, S.: Sur la solution à support compact de l’équations d’Euler compressible. Japan J. Appl. Math. 3(2), 249–257 (1986)
16. Pham, C.-T., Nore, C., Brachet, M.-E.: Boundary layers and emitted excitations in nonlinear Schrödinger superflow past a disk. Phys. D 210(3-4), 203–226 (2005)
17. Rauch, J.: Symmetric positive systems with boundary characteristic of constant multiplicity. Trans. Amer. Math. Soc. 291(1), 167–187 (1985)
18. Taylor, M.: Partial Differential Equations. (III), Applied Mathematical Sciences, 117. New-York: Springer-Verlag, 1997
19. Zhang, P.: Semiclassical limit of nonlinear Schrödinger equation. II. J. Part. Diff. Eqs. 15(2), 83–96 (2002)

Communicated by I. M. Sigal
Commun. Math. Phys. 288, 547–613 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0743-2
Rough Solutions of the Einstein Constraints on Closed Manifolds without Near-CMC Conditions Michael Holst , Gabriel Nagy , Gantumur Tsogtgerel Department of Mathematics, University of California San Diego, La Jolla, CA 92093, USA. E-mail:
[email protected];
[email protected];
[email protected] Received: 12 April 2008 / Accepted: 15 October 2008 Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract: We consider the conformal decomposition of Einstein’s constraint equations introduced by Lichnerowicz and York, on a closed manifold. We establish existence of non-CMC weak solutions using a combination of a priori estimates for the individual Hamiltonian and momentum constraints, barrier constructions and fixed-point techniques for the Hamiltonian constraint, Riesz-Schauder theory for the momentum constraint, together with a topological fixed-point argument for the coupled system. Although we present general existence results for non-CMC weak solutions when the rescaled background metric is in any of the three Yamabe classes, an important new feature of the results we present for the positive Yamabe class is the absence of the near-CMC assumption, if the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields are sufficiently small, and if the energy density of matter is not identically zero. In this case, the mean extrinsic curvature can be taken to be an arbitrary smooth function without restrictions on the size of its spatial derivatives, so that it can be arbitrarily far from constant, giving what is apparently the first existence results for non-CMC solutions without the near-CMC assumption. Using a coupled topological fixed-point argument that avoids near-CMC conditions, we establish existence of coupled non-CMC weak solutions with (positive) conformal factor φ ∈ W s, p , where p ∈ (1, ∞) and s( p) ∈ (1 + 3/ p, ∞). In the CMC case, the regularity can be reduced to p ∈ (1, ∞) and s( p) ∈ (3/ p, ∞) ∩ [1, ∞). In the case of s = 2, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case p = 2, we reproduce the CMC existence results of Maxwell [33], but with a proof that goes through the same analysis framework that we use to obtain the non-CMC results. 
The non-CMC results on closed manifolds here extend the 1996 non-CMC result of Isenberg and Moncrief in three ways: (1) the near-CMC assumption is removed in the case of the positive Yamabe class; (2) regularity is extended down to the maximum allowed by the background metric and the matter; and (3) the result holds for all three Yamabe classes. This last extension was also accomplished recently by Allen, Clausen and Isenberg, although their result is restricted to the near-CMC case and to smoother background metrics and data.
M. Holst, G. Nagy, G. Tsogtgerel
Supported in part by NSF Awards 0715146, 0411723, and 0511766, and DOE Awards DE-FG02-05ER25707 and DE-FG02-04ER25620.
Supported in part by NSF Awards 0715146 and 0411723.
Contents
1. Introduction
2. Preliminary Material
3. Overview of the Main Results
4. Weak Solution Results for the Individual Constraints
5. Barriers for the Hamiltonian Constraint
6. Proof of the Main Results
7. Summary
A. Some Key Technical Tools and Some Supporting Results
1. Introduction In this article, we give an analysis of the coupled Hamiltonian and momentum constraints in the Einstein equations on a 3-dimensional closed manifold. We consider the equations with matter sources satisfying an energy condition implied by the dominant energy condition in the 4-dimensional spacetime; the unknowns are a Riemannian three-metric and a two-index symmetric tensor. The equations form an under-determined system; therefore, we focus entirely on a standard reformulation used in both mathematical and numerical general relativity, called the conformal method, introduced by Lichnerowicz and York [32,49,50]. The conformal method assumes that the unknown metric is known up to a scalar field called a conformal factor, and also assumes that the trace and a term proportional to the trace-free divergence-free part of the two-index symmetric tensor is known, leaving as unknown a term proportional to the traceless symmetrized derivative of a vector. Therefore, the new unknowns are a scalar and a vector field, transforming the original under-determined system for a metric and a symmetric tensor into a (potentially) well-posed elliptic system for a scalar and a vector field. See [5] for a recent review article. The question of existence of solutions to the Lichnerowicz-York conformally rescaled Einstein’s constraint equations, for an arbitrarily prescribed mean extrinsic curvature, has remained an open problem for more than thirty years. The rescaled equations, which are a coupled nonlinear elliptic system consisting of the scalar Hamiltonian constraint coupled to the vector momentum constraint, have been studied almost exclusively in the setting of constant mean extrinsic curvature, known as the CMC case. In the CMC case the equations decouple, and it has long been known how to establish existence of solutions. 
The case of CMC data on closed (compact without boundary) manifolds was completely resolved by several authors over the last twenty years, with the last remaining sub-cases resolved and all the CMC sub-cases on closed manifolds summarized by Isenberg in [25]. Over the last ten years, other CMC cases on different types of manifolds containing various kinds of matter fields were studied and partially or completely resolved; see the survey [5]. We take a moment to point out just some of the quite substantial number of works in this area, including: the original work on the Lichnerowicz equation [32]; the development of the conformal method [49–52]; the initial solution theory for the Hamiltonian constraint [39–41]; the thin sandwich alternative
Rough Solutions of the Einstein Constraints on Closed Manifolds
to the conformal method [4,37]; the complete classification of CMC initial data [25] and the few known non-CMC results [11,26,28]; various technical results on transverse-traceless tensors and the conformal Killing operator [6,8]; the more recent development of the conformal thin sandwich formulation [53]; initial data for black holes [7,9]; initial data for Kerr-like black holes [13,14]; initial data with trapped surface boundaries [15,34]; rough solution theory for CMC initial data [10,33,35]; and the gluing approach to generating initial data [12]. A survey of many of these results appears in [5]. On the other hand, the question of existence of solutions to the Einstein constraint equations for non-constant mean extrinsic curvature (the “non-CMC case”) has remained largely unanswered, with progress made only in the case that the mean extrinsic curvature is nearly constant (the “near-CMC case”), in the sense that the size of its spatial derivatives is sufficiently small. The near-CMC condition leaves the constraint equations coupled, but ensures the coupling is weak. In [26], Isenberg and Moncrief established the first existence (and uniqueness) result in the near-CMC case, for background metric having negative Ricci scalar. Their result was based on a fixed-point argument, together with the use of iteration barriers (sub- and super-solutions) which were shown to be bounded above and below by fixed positive constants, independent of the iteration. We note that both the fixed-point argument and the global barrier construction in [26] rely critically on the near-CMC assumption. All subsequent non-CMC existence results are based on the framework in [26] and are thus limited to the near-CMC case (see the survey [5], the non-existence results in [27], and also the newer existence results in [1] for non-negative Yamabe classes).
This article presents (together with the brief overview in [22]) the first non-CMC existence results for the Einstein constraints that do not require the near-CMC assumption. Two recent advances make this possible: a new topological fixed-point argument (established here and in [21]) and a new global super-solution construction for the Hamiltonian constraint (established here and in [22]) that are both free of near-CMC conditions. These two results allow us to establish existence of non-CMC solutions for conformal background metrics in the positive Yamabe class, with the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields sufficiently small, and with the matter energy density not identically zero. Our results here and in [21,22] can be viewed as reducing the remaining open questions of existence of non-CMC (weak and strong) solutions without near-CMC conditions to two more basic and clearly stated open problems: (1) existence of near-CMC-free global super-solutions for the Hamiltonian constraint equation when the background metric is in the non-positive Yamabe classes and for large data; and (2) existence of near-CMC-free global sub-solutions for the Hamiltonian constraint equation when the background metric is in the positive Yamabe class in vacuum (without matter). We will make some further comments about this later in the paper.
Our results in this article, which can be viewed as pushing forward the rough solutions program that was initiated by Maxwell in [33,35] (see also [10]), further extend the known solution theory for the Einstein constraint equations on closed manifolds in several directions:
(i) Far-from-CMC Weak Solutions: We establish the first existence results (Theorem 1) for the coupled Einstein constraints in the non-CMC setting without the near-CMC condition. In particular, if the rescaled background metric is in the positive Yamabe class, if the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields are sufficiently small, and if the energy density of matter is not identically zero, then we show existence of non-CMC solutions with mean extrinsic curvature arbitrarily far from constant. Two advances in the analysis of the Einstein constraint equations make this result possible: a topological fixed-point argument (Theorems 4 and 5) based on compactness arguments rather than k-contractions that is free of near-CMC conditions, and constructions of global barriers for the Hamiltonian constraint that are similarly free of the near-CMC condition (Lemmas 7, 8, 9, 13, and 14).
(ii) Near-CMC Weak Solutions: We establish existence results (Theorem 2) for non-CMC solutions to the coupled constraints under the near-CMC condition in the setting of weaker (rougher) solution spaces and for more general physical scenarios than appeared previously in [26,1]. In particular, we establish existence of weak solutions to the coupled Hamiltonian and momentum constraints on closed manifolds for all three Yamabe classes, with (positive) conformal factor $\phi \in W^{s,p}$, where $p \in (1, \infty)$ and $s(p) \in (1 + 3/p, \infty)$. These results are based on combining barriers, a priori estimates, and other results for the individual constraints together with a new type of topological fixed-point argument (Theorems 4 and 5), and are established in the presence of a weak background metric and data meeting very low regularity requirements.
(iii) CMC Weak Solutions: In the CMC case, we establish existence (Theorem 3) of weak solutions to the un-coupled Hamiltonian and momentum constraints on closed manifolds for all three Yamabe classes, with (positive) conformal factor $\phi \in W^{s,p}$, where $p \in (1, \infty)$ and $s(p) \in (3/p, \infty) \cap [1, \infty)$. In the case of $s = 2$, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case $p = 2$, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5). Again, these results are established in the presence of a weak background metric and with data meeting very low regularity requirements.
(iv) Barrier Constructions: We give constructions (Lemmas 9 and 13) of weak global sub- and super-solutions (barriers) for the Hamiltonian constraint equation which are free of the near-CMC condition. The constructions require the assumption that the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields are sufficiently small (required for the super-solution construction in Lemma 9) and that the energy density of matter is not identically zero (required for the sub-solution construction in Lemma 13, although we note this can be relaxed using the technique in [1]). While near-CMC-free sub-solutions are common in the literature, our near-CMC-free super-solution constructions appear to be the first results of this type.
(v) Supporting Technical Tools: We assemble a number of new supporting technical results in the body of the paper and in several appendices, including: topological fixed-point arguments designed for the Einstein constraints; construction and properties of general Sobolev classes $W^{s,p}$ and elliptic operators on closed manifolds with weak metrics; the development of a very weak solution theory for the momentum constraint; a priori $L^\infty$-estimates for weak $W^{1,2}$-solutions to the Hamiltonian constraint; Yamabe classification of non-smooth metrics in general Sobolev classes $W^{s,p}$; and an analysis of the connection between conformal rescaling and the near-CMC condition.
The results in this paper imply that the weakest differentiable solutions of the Einstein constraint equations we have found correspond to CMC and non-CMC hypersurfaces with physical spatial metric $h_{ab}$ satisfying
\[ h_{ab} \in W^{s,p}(M), \qquad p \in (1, \infty), \qquad s(p) \in \left(1 + \tfrac{3}{p}, \infty\right). \tag{1.1} \]
The curvature of such metrics can be computed in a distributional sense, following [17]. In the CMC case, the regularity can be reduced to
\[ h_{ab} \in W^{s,p}(M), \qquad p \in (1, \infty), \qquad s(p) \in \left(\tfrac{3}{p}, \infty\right) \cap [1, \infty). \tag{1.2} \]
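To make the index ranges in (1.1) and (1.2) concrete, the following worked reading evaluates the two thresholds for a few sample exponents. The sample values of $p$ and $s$ are chosen here purely for illustration; only the threshold conditions themselves come from the text.

```latex
% Thresholds read off from (1.1) and (1.2); the sample exponents are illustrative.
\begin{align*}
\text{non-CMC, (1.1):}\quad & s > 1 + \tfrac{3}{p}
  && p = 2 \;\Rightarrow\; s > \tfrac{5}{2}, \qquad
     s = 2 \;\Rightarrow\; p > 3, \\
\text{CMC, (1.2):}\quad & s > \tfrac{3}{p} \ \text{and}\ s \ge 1
  && p = 2 \;\Rightarrow\; s > \tfrac{3}{2}, \qquad
     s = 2 \;\Rightarrow\; p > \tfrac{3}{2}.
\end{align*}
% For p > 3 one has 3/p < 1, so in the CMC case the binding condition becomes s >= 1.
```

In both settings the threshold decreases as $p$ grows, and for $p \le 3$ the CMC range begins exactly one derivative below the non-CMC range; for $p > 3$ the floor $s \ge 1$ takes over in the CMC case.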
In the case s = 2, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case p = 2, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5). In this paper we do not include uniqueness statements on CMC solutions, or necessary and sufficient conditions for the existence of CMC solutions; however, we expect that the techniques used in the above mentioned works can be adapted to this setting without difficulty. There are several related motivations for establishing the extensions outlined above. First, as outlined in [5], new results for the non-CMC case, beyond the case analyzed in [1,26], are of great interest in both mathematical and numerical relativity. Non-CMC results that are free of the near-CMC assumption are of particular interest, since the existence of solutions in this case has been an open question for more than thirty years. Second, there is currently substantial research activity in rough solutions to the Einstein evolution equations, which rest on rough/weak solution results for the initial data [30]. Third, the approximation theory for Petrov-Galerkin-type methods (including finite element, wavelet, spectral, and other methods) for the constraints and similar systems previously developed in [20] establishes convergence of numerical solutions in very general physical situations, but rests on assumptions about the solution theory; the results in the present paper and in [21], help to complete this approximation theory framework. Similarly, very recent results on convergence of adaptive methods for the constraints in [23,24] rest in large part on the collection of results here and in [20,21]. An extended outline of the paper is as follows. In Sect. 2, we summarize the conformal decomposition of Einstein’s constraint equations introduced by Lichnerowicz and York, on a closed manifold. 
We describe the classical strong formulation of the resulting coupled elliptic system, and then define weak formulations of the constraint equations that will allow us to develop solution theories for the constraints in the spaces with the weakest possible regularity. After setting up the basic notation, we give an overview of our main results in Sect. 3, summarized in three existence theorems (Theorems 1, 2, and 3) for weak far-from-CMC, near-CMC, and CMC solutions to the coupled constraints, extending the known solution theory in several distinct ways as described above. We outline the two recent advances in the analysis of the Einstein constraint equations that make these results possible. The first advance is an abstract coupled topological fixed-point result (Theorems 4 and 5), the proof of which is based directly on compactness rather than on k-contractions. This gives an analysis framework for weak solutions to the constraint equations that is fundamentally free of the near-CMC assumption; the near-CMC assumption then only potentially arises in the construction of global barriers as part of the overall fixed-point argument. A result of this type also makes possible the new non-CMC results for the case of compact manifolds with boundary appearing in [21]. The second new advance is the construction
of global super-solutions for the Hamiltonian constraint that are also free of the near-CMC condition; we give an overview of the main ideas in the constructions, which are then derived rigorously in Sect. 5. In Sect. 4 we then develop the necessary results for the individual constraint equations in order to complete an existence argument for the coupled system based on the abstract fixed-point argument in Theorems 4 and 5. In particular, in Sect. 4.1, we first develop some basic technical results for the momentum constraint operator under weak assumptions on the problem data, including existence of weak solutions to the momentum constraint, given the conformal factor as data. In Sect. 4.2, we assume the existence of barriers (weak sub- and super-solutions) to the Hamiltonian constraint equation forming a nonempty positive bounded interval, and then derive several properties of the Hamiltonian constraint that are needed in the analysis of the coupled system. The results are established under weak assumptions on the problem data, and for any Yamabe class. Using order relations on appropriate Banach spaces, we then derive several such compatible weak global sub- and super-solutions in Sect. 5, based both on constants and on more complex non-constant constructions. While the sub-solutions are similar to those found previously in the literature, some of the super-solutions are new. In particular, we give two super-solution constructions that do not require the near-CMC condition. The first is constant, and requires that the scalar curvature be strictly globally positive. The second is based on a scaled solution to a Yamabe-type problem, and is valid for any background metric in the positive Yamabe class. In Sect. 6, we establish the main results by giving the proofs of Theorems 1, 2, and 3. In particular, using the topological fixed-point argument in Theorem 5, we combine the global barrier constructions in Sect. 5 with the individual constraint results in Sect.
4 to establish existence of weak non-CMC solutions. We summarize our results in Sect. 7. For ease of exposition, various supporting technical results are given in several appendices as follows: Appendix Sect. A.1 – topological fixed-point arguments; Appendix Sect. A.2 – ordered Banach spaces; Appendix Sect. A.3 – monotone increasing maps; Appendix Sect. A.4 – construction of fractional order Sobolev spaces of sections of vector bundles over closed manifolds; Appendix Sect. A.5 – a priori estimates for elliptic operators; Appendix Sect. A.6 – maximum principles on closed manifolds; Appendix Sect. A.7 – Yamabe classification of weak metrics; Appendix Sect. A.8 – conformal covariance of the Hamiltonian constraint; and Appendix Sect. A.9 – conformal rescaling and the near-CMC condition.
2. Preliminary Material
2.1. Notation and conventions. Let $M$ be an $n$-dimensional smooth closed manifold. We denote by $\pi : E \to M$ (or simply $E \to M$, or just $E$) a smooth vector bundle over $M$, where the manifold $M$ is called the base space, $E$ is called the total space, and $\pi$ is the bundle projection such that for any $x \in M$, $E_x = \pi^{-1}(x)$ is the fiber over $x$, which is a vector space of (fiber) dimension $m_x$. If all fibers $E_x$ have dimension $m_x = m$, we say the fiber dimension of $E$ is $m$. The manifold $M$ itself can be considered as the vector bundle $E = M \times \{0\}$ with fiber dimension $m = 0$. A section of the trivial vector bundle $E = M \times \mathbb{R}$ with fiber dimension $m = 1$ is simply a scalar function on $M$. Our primary interest is the case where
\[ E = T^r_s M = \underbrace{TM \otimes \cdots \otimes TM}_{r \text{ times}} \otimes \underbrace{T^*M \otimes \cdots \otimes T^*M}_{s \text{ times}}, \]
the $(r, s)$-tensor bundle with contravariant order $r$ and covariant order $s$, giving fiber dimension $m = n(r + s)$, where $TM$ is the tangent bundle, and $T^*M$ is the co-tangent bundle of $M$. A $C^k$ section of $\pi$ (or of $E$) is a $C^k$ map $\gamma : M \to E$ such that for each $x \in M$, $\pi(\gamma(x)) = x$. These $C^k$ sections form real Banach spaces $C^k(E)$ which arise naturally in the global linear analysis of partial differential equations on manifolds. Let $h_{ab} \in C^\infty(T^0_2 M)$ be a smooth Riemannian metric on $M$ (where by convention Latin indices denote abstract indices as e.g. in [48]), meaning that it is a symmetric, positive definite, covariant, smooth two-index tensor field on $M$. The combination $(M, h_{ab})$ is referred to as a (smooth) Riemannian manifold; we will relax the smoothness requirement on $h_{ab}$ below. For each $x \in M$, the metric $h_{ab}(x)$ defines a positive definite inner product on the tangent space $T_x M$ at $x$. Denote by $h^{ab}$ the inverse of $h_{ab}$, that is, $h_{ac} h^{bc} = \delta_a{}^b$, where $\delta_a{}^b : T_x M \to T_x M$ is the identity map. We use the convention that repeated indices, one upper-index and one sub-index, denote contraction. Indices on tensors will be raised and lowered with $h^{ab}$ and $h_{ab}$, respectively. For example, given the tensor $u^{ab}{}_c$ we denote $u_{abc} = h_{aa_1} h_{bb_1} u^{a_1 b_1}{}_c$, and $u^{abc} = h^{cc_1} u^{ab}{}_{c_1}$; notice that the order of the indices is important in the case that the tensor $u_{abc}$ or $u^{abc}$ is not symmetric. We say that a tensor is of type $m$ iff it can be transformed into a tensor $u_{a_1 \cdots a_m}$ by lowering appropriate indices (its vector bundle then has fiber dimension $mn$).
We now give a brief overview of $L^p$ and Sobolev spaces of sections of vector bundles over closed manifolds in order to introduce the notation used throughout the paper. An overview of the construction of fractional order Sobolev spaces of sections of vector bundles can be found in Appendix A.4, based on Besov spaces and partitions of unity.
The case of the sections of the trivial bundle of scalars can also be found in [19], and the case of tensors can also be found in [42]. Let $\nabla_a$ be the Levi-Civita connection associated with the metric $h_{ab}$, that is, the unique torsion-free connection satisfying $\nabla_a h_{bc} = 0$. Let $R_{abc}{}^d$ be the Riemann tensor of the connection $\nabla_a$, where the sign convention used in this article is $(\nabla_a \nabla_b - \nabla_b \nabla_a) v_c = R_{abc}{}^d v_d$. Denote by $R_{ab} := R_{acb}{}^c$ the Ricci tensor and by $R := R_{ab} h^{ab}$ the Ricci scalar curvature of this connection. Integration on $M$ can be defined with the volume form associated with the metric $h_{ab}$. Given an arbitrary tensor $u^{a_1 \cdots a_r}{}_{b_1 \cdots b_s}$ of type $m = r + s$, we define a real-valued function measuring its magnitude at any point $x \in M$ as
\[ |u| := \left( u_{a_1 \cdots b_s} u^{a_1 \cdots b_s} \right)^{1/2}. \tag{2.1} \]
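As a sanity check on definition (2.1), one can evaluate the pointwise magnitude of the metric itself. This small example is an illustration added here; it uses only the inverse-metric identity $h_{ac} h^{bc} = \delta_a{}^b$ from the notation paragraph.

```latex
% Pointwise magnitude of the metric under definition (2.1).
\begin{align*}
|h|^2 = h_{ab}\,h^{ab} = \delta_a{}^a = n ,
\qquad\text{so}\qquad |h| = \sqrt{n}\ \text{ at every } x \in M .
\end{align*}
```

In particular $|h| = \sqrt{3}$ on the 3-dimensional manifolds considered from Sect. 2.2 onward.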
A norm of an arbitrary tensor field $u^{a_1 \cdots a_r}{}_{b_1 \cdots b_s}$ on $M$ can then be defined for any $1 \le p < \infty$ and for $p = \infty$ respectively using (2.1) as follows:
\[ \|u\|_p := \left( \int_M |u|^p \, dx \right)^{1/p}, \qquad \|u\|_\infty := \operatorname*{ess\,sup}_{x \in M} |u|. \tag{2.2} \]
One way to construct the Lebesgue spaces $L^p(T^r_s M)$ of sections of the $(r, s)$-tensor bundle, for $1 \le p \le \infty$, is through the completion of $C^\infty(T^r_s M)$ with respect to the $L^p$-norm (2.2). The $L^p$ spaces are Banach spaces, and the case $p = 2$ is a Hilbert space with the inner product and norm given by
\[ (u, v) := \int_M u_{a_1 \cdots a_m} v^{a_1 \cdots a_m} \, dx, \qquad \|u\| := \sqrt{(u, u)} = \|u\|_2. \tag{2.3} \]
Denote covariant derivatives of tensor fields as $\nabla^k u^{a_1 \cdots a_m} := \nabla_{b_1} \cdots \nabla_{b_k} u^{a_1 \cdots a_m}$, where $k$ denotes the total number of derivatives represented by the tensor indices $(b_1, \ldots, b_k)$.
Another norm on $C^\infty(T^r_s M)$ is given for any non-negative integer $k$ and for any $1 \le p \le \infty$ as follows:
\[ \|u\|_{k,p} := \sum_{l=0}^{k} \|\nabla^l u\|_p. \tag{2.4} \]
The Sobolev spaces $W^{k,p}(T^r_s M)$ of sections of the $(r, s)$-tensor bundle can be defined as the completion of $C^\infty(T^r_s M)$ with respect to the $W^{k,p}$-norm (2.4). The Sobolev spaces $W^{k,p}$ are Banach spaces, and the case $p = 2$ is a Hilbert space. We have $L^p = W^{0,p}$ and $\|s\|_p = \|s\|_{0,p}$. See Appendix A.4 for a more careful construction that includes real order Sobolev spaces of sections of vector bundles. Let $C^\infty_+$ be the set of nonnegative smooth (scalar) functions on $M$. Then we can define the order cone
\[ W^{s,p}_+ := \left\{ \phi \in W^{s,p} : \langle \phi, \varphi \rangle \ge 0 \quad \forall\, \varphi \in C^\infty_+ \right\}, \tag{2.5} \]
with respect to which the Sobolev spaces $W^{s,p} = W^{s,p}(M)$ are ordered Banach spaces. Here $\langle \cdot, \cdot \rangle$ is the unique extension of the $L^2$-inner product to a bilinear form $W^{s,p} \otimes W^{-s,p'} \to \mathbb{R}$, with $\frac{1}{p} + \frac{1}{p'} = 1$. The order relation is then $\phi \ge \psi$ iff $\phi - \psi \in W^{s,p}_+$. We note that this order cone is normal only for $s = 0$. See Appendix A.2, where we review the main properties of ordered Banach spaces.
2.2. The Einstein constraint equations. We give a quick overview of the Einstein constraint equations in general relativity, and then define weak formulations that are fundamental to both solution theory and the development of approximation theory. Analogous material for the case of compact manifolds with boundary can be found in [21]. Let $(\mathcal{M}, g_{\mu\nu})$ be a 4-dimensional spacetime, that is, $\mathcal{M}$ is a 4-dimensional, smooth manifold, and $g_{\mu\nu}$ is a smooth, Lorentzian metric on $\mathcal{M}$ with signature $(-, +, +, +)$. Let $\nabla_\mu$ be the Levi-Civita connection associated with the metric $g_{\mu\nu}$. The Einstein equation is $G_{\mu\nu} = \kappa T_{\mu\nu}$, where $G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu}$ is the Einstein tensor, $T_{\mu\nu}$ is the stress-energy tensor, and $\kappa = 8\pi G/c^4$, with $G$ the gravitational constant and $c$ the speed of light. The Ricci tensor is $R_{\mu\nu} = R_{\mu\sigma\nu}{}^\sigma$ and $R = R_{\mu\nu} g^{\mu\nu}$ is the Ricci scalar, where $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$, that is, $g_{\mu\sigma} g^{\sigma\nu} = \delta_\mu{}^\nu$. The Riemann tensor is defined by $R_{\mu\nu\sigma}{}^\rho w_\rho = (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu) w_\sigma$, where $w_\mu$ is any 1-form on $\mathcal{M}$. The stress-energy tensor $T_{\mu\nu}$ is assumed to be symmetric and to satisfy the condition $\nabla_\mu T^{\mu\nu} = 0$ and the dominant energy condition, that is, the vector $-T^{\mu\nu} v_\nu$ is timelike and future-directed, where $v^\mu$ is any timelike and future-directed vector field. In this section Greek indices $\mu$, $\nu$, $\sigma$, $\rho$ denote abstract spacetime indices, that is, tensorial character on the 4-dimensional manifold $\mathcal{M}$. They are raised and lowered with $g^{\mu\nu}$ and $g_{\mu\nu}$, respectively. Latin indices $a$, $b$, $c$, $d$ will denote tensorial character on a 3-dimensional manifold.
The map $t : \mathcal{M} \to \mathbb{R}$ is a time function iff the function $t$ is differentiable and the vector field $-\nabla^\mu t$ is a timelike, future-directed vector field on $\mathcal{M}$. Introduce the hypersurface $M := \{x \in \mathcal{M} : t(x) = 0\}$, and denote by $n_\mu$ the unit 1-form orthogonal to $M$. By definition of $M$ the form $n_\mu$ can be expressed as $n_\mu = -\alpha \nabla_\mu t$, where $\alpha$, called the lapse function, is the positive function such that $n_\mu n_\nu g^{\mu\nu} = -1$. Let $\hat h_{\mu\nu}$ and $\hat k_{\mu\nu}$ be the first and second fundamental forms of $M$, that is,
\[ \hat h_{\mu\nu} := g_{\mu\nu} - n_\mu n_\nu, \qquad \hat k_{\mu\nu} := -\hat h_\mu{}^\sigma \nabla_\sigma n_\nu. \]
The Einstein constraint equations on $M$ are given by $(G_{\mu\nu} - \kappa T_{\mu\nu})\, n^\nu = 0$. A well known calculation allows us to express these equations involving tensors on $\mathcal{M}$ as equations involving intrinsic tensors on $M$. The result is the following equations:
\[ {}^3\hat R + \hat k^2 - \hat k_{ab} \hat k^{ab} - 2\kappa \hat\rho = 0, \tag{2.6} \]
\[ \hat D^a \hat k - \hat D_b \hat k^{ab} + \kappa \hat j^a = 0, \tag{2.7} \]
where the tensors $\hat h_{ab}$, $\hat k_{ab}$, $\hat j_a$ and $\hat\rho$ on a 3-dimensional manifold are the pull-backs on $M$ of the tensors $\hat h_{\mu\nu}$, $\hat k_{\mu\nu}$, $\hat j_\mu$ and $\hat\rho$ on the 4-dimensional manifold $\mathcal{M}$. We have introduced the energy density $\hat\rho := n_\mu n_\nu T^{\mu\nu}$ and the momentum current density $\hat j_\mu := -\hat h_{\mu\nu} n_\sigma T^{\nu\sigma}$. We have denoted by $\hat D_a$ the Levi-Civita connection associated to $\hat h_{ab}$, so $(M, \hat h_{ab})$ is a 3-dimensional Riemannian manifold, with $\hat h_{ab}$ having signature $(+, +, +)$, and we use the notation $\hat h^{ab}$ for the inverse of the metric $\hat h_{ab}$. Indices have been raised and lowered with $\hat h^{ab}$ and $\hat h_{ab}$, respectively. We have also denoted by ${}^3\hat R$ the Ricci scalar curvature of the metric $\hat h_{ab}$. Finally, recall that the constraint Eqs. (2.6)-(2.7) are indeed equations on $\hat h_{ab}$ and $\hat k_{ab}$ due to the matter fields satisfying the energy condition $-\hat\rho^2 + \hat j_a \hat j^a \le 0$ (with strict inequality holding at points on $M$ where $\hat\rho \ne 0$; see [48]), which is implied by the dominant energy condition on the stress-energy tensor $T^{\mu\nu}$ in spacetime.
2.3. Conformal transverse traceless decomposition. Let $\phi$ denote a positive scalar field on $M$, and decompose the extrinsic curvature tensor $\hat k_{ab} = \hat l_{ab} + \frac{1}{3} \hat h_{ab} \hat\tau$, where $\hat\tau := \hat k_{ab} \hat h^{ab}$ is the trace and then $\hat l_{ab}$ is the traceless part of the extrinsic curvature tensor. Then, introduce the following conformal re-scaling:
\[ \hat h_{ab} =: \phi^4 h_{ab}, \qquad \hat l^{ab} =: \phi^{-10} l^{ab}, \qquad \hat\tau =: \tau, \qquad \hat j^a =: \phi^{-10} j^a, \qquad \hat\rho =: \phi^{-8} \rho. \tag{2.8} \]
We have introduced the Riemannian metric $h_{ab}$ on the 3-dimensional manifold $M$, which determines the Levi-Civita connection $D_a$, and so we have that $D_a h_{bc} = 0$. We have also introduced the symmetric, traceless tensor $l_{ab}$, and the non-physical matter sources $j^a$ and $\rho$. The different powers of the conformal re-scaling above are carefully chosen so that the constraint Eqs. (2.6)-(2.7) transform into the following equations:
\[ -8\Delta\phi + {}^3R\,\phi + \tfrac{2}{3} \tau^2 \phi^5 - l_{ab} l^{ab} \phi^{-7} - 2\kappa\rho\,\phi^{-3} = 0, \tag{2.9} \]
\[ -D_b l^{ab} + \tfrac{2}{3} \phi^6 D^a \tau + \kappa j^a = 0, \tag{2.10} \]
where in the equations above, and from now on, indices of unhatted fields are raised and lowered with $h^{ab}$ and $h_{ab}$ respectively. We have also introduced the Laplace-Beltrami operator with respect to the metric $h_{ab}$, acting on smooth scalar fields; it is defined as follows:
\[ \Delta\phi := h^{ab} D_a D_b \phi. \tag{2.11} \]
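The passage from (2.6) to (2.9) can be sketched as follows. This is a reconstruction for the reader's convenience, not a quotation of the authors' computation. It takes as given the scalar-curvature identity ${}^3\hat R = \phi^{-5}({}^3R\,\phi - 8\Delta\phi)$ recorded in the discussion around (2.12), and otherwise uses only the trace-free decomposition of $\hat k_{ab}$ and the re-scaling (2.8).

```latex
\begin{align*}
% trace-free split of the extrinsic curvature (\hat l_{ab} is traceless and \hat h_{ab}\hat h^{ab} = 3)
\hat k_{ab}\hat k^{ab} &= \hat l_{ab}\hat l^{ab} + \tfrac13\hat\tau^2,
\qquad \hat k^2 = \hat\tau^2 = \tau^2, \\
% lowering two indices with \hat h_{ab} = \phi^4 h_{ab} contributes a factor \phi^8
\hat l_{ab}\hat l^{ab} &= \hat h_{ac}\hat h_{bd}\,\hat l^{cd}\hat l^{ab}
   = \phi^{8}\,\phi^{-20}\, l_{ab}l^{ab} = \phi^{-12}\, l_{ab}l^{ab}, \\
% insert into (2.6), together with \hat\rho = \phi^{-8}\rho
0 &= \phi^{-5}\bigl({}^3R\,\phi - 8\Delta\phi\bigr) + \tfrac23\tau^2
     - \phi^{-12}\, l_{ab}l^{ab} - 2\kappa\,\phi^{-8}\rho, \\
% multiply by \phi^5 to recover (2.9)
0 &= -8\Delta\phi + {}^3R\,\phi + \tfrac23\tau^2\phi^5
     - l_{ab}l^{ab}\,\phi^{-7} - 2\kappa\rho\,\phi^{-3}.
\end{align*}
```

The exponents $-7$ and $-3$ in (2.9) are thus simply $-12 + 5$ and $-8 + 5$, which is one way to read the choice of powers in (2.8).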
Equations (2.9)–(2.10) can be obtained by a straightforward albeit long computation. In order to perform this calculation it is useful to recall that both $\hat D_a$ and $D_a$ are connections on the manifold $M$, and so they differ by a tensor field $C_{ab}{}^c$, which can be computed explicitly in terms of $\phi$, and has the form
\[ C_{ab}{}^c = 4\,\delta_{(a}{}^c D_{b)} \ln(\phi) - 2 h_{ab} h^{cd} D_d \ln(\phi). \]
We remark that the power four on the re-scaling of the metric $\hat h_{ab}$ and $M$ being 3-dimensional imply that ${}^3\hat R = \phi^{-5}({}^3R\,\phi - 8\Delta\phi)$, or in other words, that $\phi$ satisfies the Yamabe-type problem:
\[ -8\Delta\phi + {}^3R\,\phi - {}^3\hat R\,\phi^5 = 0, \qquad \phi > 0, \tag{2.12} \]
where ${}^3\hat R$ represents the scalar curvature corresponding to the physical metric $\hat h_{ab} = \phi^4 h_{ab}$. Note that for any other power in the re-scaling, terms proportional to $h^{ab}(D_a \phi)(D_b \phi)/\phi^2$ appear in the transformation. The set of all metrics on a closed manifold can be classified into the three disjoint Yamabe classes $\mathcal{Y}^+(M)$, $\mathcal{Y}^0(M)$, and $\mathcal{Y}^-(M)$, corresponding to whether one can conformally transform the metric into a metric with strictly positive, zero, or strictly negative scalar curvature, respectively, cf. [31] (see also Appendix A.7). We note that the Yamabe problem is to determine, for a given metric $h_{ab}$, whether there exists a conformal transformation $\phi$ solving (2.12) such that ${}^3\hat R = \mathrm{const}$. Arguments similar to those above for $\phi$ force the power negative ten on the re-scaling of the tensor $\hat l^{ab}$ and $\hat j^a$, so terms proportional to $(D_a \phi)/\phi$ cancel out in (2.10). Finally, the ratio between the conformal re-scaling powers of $\hat\rho$ and $\hat j^a$ is chosen such that the inequality $-\rho^2 + h_{ab} j^a j^b \le 0$ implies the inequality $-\hat\rho^2 + \hat h_{ab} \hat j^a \hat j^b \le 0$. For a complete discussion of all possible choices of re-scaling powers, see Appendix A.9. There is one more step to convert the original constraint equations (2.6)-(2.7) into a determined elliptic system of equations. This step is the following: Decompose the symmetric, traceless tensor $l^{ab}$ into a divergence-free part $\sigma^{ab}$, and the symmetrized and traceless gradient of a vector, that is, $l^{ab} =: \sigma^{ab} + (\mathcal{L}w)^{ab}$, where $D_a \sigma^{ab} = 0$ and we have introduced the conformal Killing operator $\mathcal{L}$ acting on smooth vector fields and defined as follows:
\[ (\mathcal{L}w)^{ab} := D^a w^b + D^b w^a - \tfrac{2}{3} (D_c w^c)\, h^{ab}. \tag{2.13} \]
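The role of the exponents $4$ and $-10$ in (2.8) can be made explicit with two short computations. The sketch below is an illustration added here, under two stated assumptions that are not spelled out in the text: part (a) assumes the standard conformal transformation law for scalar curvature in dimension $n$, and part (b) assumes the convention $\hat D_a v^b = D_a v^b + C_{ac}{}^b v^c$ for the difference tensor $C_{ab}{}^c$.

```latex
% (a) Scalar curvature. Assumed standard law for \tilde h_{ab} = e^{2f} h_{ab} in dimension n:
%     \tilde R = e^{-2f}\bigl(R - 2(n-1)\Delta f - (n-2)(n-1)|Df|^2\bigr).
%     Take n = 3 and e^{2f} = \phi^4, i.e. f = 2\ln\phi.
\begin{align*}
\Delta f &= 2\Bigl(\frac{\Delta\phi}{\phi} - \frac{|D\phi|^2}{\phi^2}\Bigr),
\qquad |Df|^2 = 4\,\frac{|D\phi|^2}{\phi^2}, \\
{}^3\hat R &= \phi^{-4}\Bigl({}^3R - 8\,\frac{\Delta\phi}{\phi}
      + 8\,\frac{|D\phi|^2}{\phi^2} - 8\,\frac{|D\phi|^2}{\phi^2}\Bigr)
   = \phi^{-5}\bigl({}^3R\,\phi - 8\Delta\phi\bigr).
\end{align*}
% (b) Divergence of a symmetric traceless tensor T^{ab}, with the assumed convention
%     \hat D_b T^{ab} = D_b T^{ab} + C_{bc}{}^{a} T^{cb} + C_{bc}{}^{b} T^{ac}.
\begin{align*}
C_{bc}{}^{a}\, T^{cb} &= 4\,T^{ab} D_b\ln\phi, \qquad
C_{bc}{}^{b}\, T^{ac} = 6\,T^{ab} D_b\ln\phi, \\
\hat D_b\bigl(\phi^{-10} l^{ab}\bigr)
  &= D_b\bigl(\phi^{-10} l^{ab}\bigr) + 10\,\phi^{-10}\, l^{ab} D_b\ln\phi
   = \phi^{-10} D_b l^{ab}.
\end{align*}
```

In (a) the gradient terms cancel only because the conformal factor is $\phi^4$ in dimension three. In (b) the factor $10$ produced by $C_{ab}{}^c$ is cancelled exactly by differentiating $\phi^{-10}$. Combined with $\hat D^a \tau = \phi^{-4} D^a \tau$, part (b) turns (2.7) into (2.10) after multiplication by $\phi^{10}$.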
Therefore, the constraint Eqs. (2.6)-(2.7) are transformed by the conformal re-scaling into the following equations:
\[ -8\Delta\phi + {}^3R\,\phi + \tfrac{2}{3} \tau^2 \phi^5 - [\sigma_{ab} + (\mathcal{L}w)_{ab}][\sigma^{ab} + (\mathcal{L}w)^{ab}]\,\phi^{-7} - 2\kappa\rho\,\phi^{-3} = 0, \tag{2.14} \]
\[ -D_b (\mathcal{L}w)^{ab} + \tfrac{2}{3} \phi^6 D^a \tau + \kappa j^a = 0. \tag{2.15} \]
In the next section we interpret these equations as partial differential equations for the scalar field $\phi$ and the vector field $w^a$, while the rest of the fields are considered as given fields. Given a solution $\phi$ and $w^a$ of Eqs. (2.14)-(2.15), the physical metric $\hat h_{ab}$ and extrinsic curvature $\hat k^{ab}$ of the hypersurface $M$ are given by
\[ \hat h_{ab} = \phi^4 h_{ab}, \qquad \hat k^{ab} = \phi^{-10} [\sigma^{ab} + (\mathcal{L}w)^{ab}] + \tfrac{1}{3} \phi^{-4} \tau\, h^{ab}, \]
while the matter fields are given by Eq. (2.8). From this point forward, for simplicity we will denote the Levi-Civita connection of the metric $h_{ab}$ on the 3-dimensional manifold $M$ as $\nabla_a$ rather than $D_a$, and the Ricci scalar of $h_{ab}$ will be denoted by $R$ instead of ${}^3R$. Let $(M, h)$ be a 3-dimensional Riemannian manifold, where $M$ is a smooth, compact manifold without boundary, and $h \in C^\infty(T^0_2 M)$ is a positive definite metric. With the shorthands $C^\infty = C^\infty(M \times \mathbb{R})$ and $\mathbf{C}^\infty = C^\infty(TM)$, let $L : C^\infty \to C^\infty$ and $\mathbb{L} : \mathbf{C}^\infty \to \mathbf{C}^\infty$ be the operators with actions on $\phi \in C^\infty$ and $w \in \mathbf{C}^\infty$ given by
\[ L\phi := -\Delta\phi, \tag{2.16} \]
\[ (\mathbb{L}w)^a := -\nabla_b (\mathcal{L}w)^{ab}, \tag{2.17} \]
where $\Delta$ denotes the Laplace-Beltrami operator defined in (2.11), and where $\mathcal{L}$ denotes the conformal Killing operator defined in (2.13). We will also use the index-free notation $\mathbb{L}w$ and $\mathcal{L}w$. The freely specifiable functions of the problem are a scalar function $\tau$, interpreted as the trace of the physical extrinsic curvature; a symmetric, traceless, and divergence-free, contravariant, two index tensor $\sigma$; the non-physical energy density $\rho$ and the non-physical momentum current density vector $j$ subject to the requirement $-\rho^2 + j \cdot j \le 0$. The term non-physical refers here to a conformally rescaled field, while physical refers to a conformally non-rescaled term. The requirement on $\rho$ and $j$ mentioned above and the particular conformal rescaling used in the semi-decoupled decomposition imply that the same inequality is satisfied by the physical energy and momentum current densities. This is a necessary condition (although not sufficient) in order that the matter sources in spacetime satisfy the dominant energy condition. The definition of various energy conditions can be found in [48, p. 219]. Introduce the non-linear operators $F : C^\infty \times \mathbf{C}^\infty \to C^\infty$ and $\mathbf{F} : C^\infty \to \mathbf{C}^\infty$ given by $F(\phi, w) = a_\tau\phi^5 + a_R\phi - a_\rho\phi^{-3} - a_w\phi^{-7}$ and $\mathbf{F}(\phi) = b_\tau\phi^6 + b_j$, where the coefficient functions are defined as follows:

$$a_\tau := \frac{1}{12}\tau^2, \quad a_R := \frac{1}{8}R, \quad a_\rho := \frac{\kappa}{4}\rho, \quad a_w := \frac{1}{8}(\sigma + \mathcal{L}w)_{ab}(\sigma + \mathcal{L}w)^{ab}, \quad b_\tau^a := \frac{2}{3}\nabla^a\tau, \quad b_j^a := \kappa j^a. \tag{2.18}$$
Notice that the scalar coefficients $a_\tau$, $a_w$, and $a_\rho$ are non-negative, while there is no sign restriction on $a_R$. With these notations, the classical formulation (or the strong formulation) of the coupled Einstein constraint equations reads: Given the freely specifiable smooth functions $\tau$, $\sigma$, $\rho$, and $j$ in $M$, find a scalar field $\phi$ and a vector field $w$ in $M$ that solve the system

$$L\phi + F(\phi, w) = 0 \quad\text{and}\quad \mathbb{L}w + \mathbf{F}(\phi) = 0 \quad\text{in } M. \tag{2.19}$$
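To see how (2.19) packages (2.14)-(2.15), one can substitute the coefficients (2.18) and the operators (2.16)-(2.17). The sketch below writes $\mathbb{L}$ and $\mathbf{F}$ for the vector-valued (boldface) operators and $\mathcal{L}$ for the conformal Killing operator, and uses the conventions $D_a = \nabla_a$ and ${}^3R = R$ adopted above:

```latex
\begin{aligned}
8\,\bigl[L\phi + F(\phi,w)\bigr]
  &= -8\Delta\phi + 8a_\tau\phi^5 + 8a_R\phi - 8a_\rho\phi^{-3} - 8a_w\phi^{-7} \\
  &= -8\Delta\phi + \tfrac{2}{3}\tau^2\phi^5 + R\,\phi - 2\kappa\rho\,\phi^{-3}
     - (\sigma+\mathcal{L}w)_{ab}(\sigma+\mathcal{L}w)^{ab}\phi^{-7}, \\[4pt]
(\mathbb{L}w)^a + \mathbf{F}(\phi)^a
  &= -\nabla_b(\mathcal{L}w)^{ab} + \tfrac{2}{3}\phi^6\nabla^a\tau + \kappa j^a .
\end{aligned}
```

The first expression is the left-hand side of (2.14) and the second is the left-hand side of (2.15), so the two formulations agree up to the overall factor 8 in the Hamiltonian constraint.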
2.4. Formulation in Sobolev spaces. We now outline a formulation of the Einstein constraint equations that involves the weakest regularity of the equation coefficients such that the equation itself is well-defined. So in particular, the operators $L$ and $\mathbb{L}$ are no longer differential operators sending smooth sections to smooth sections. We shall employ Sobolev spaces to quantify smoothness, cf. Appendix A.4. Let $(M, h)$ be a 3-dimensional Riemannian manifold, where $M$ is a smooth, compact manifold without boundary, and with $p \in (\frac{3}{2}, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, 2]$, $h \in W^{s,p}(T^0_2 M)$ is a positive definite metric. Note that the restriction $s \le 2$ is only apparent, since $W^{t,p} \hookrightarrow W^{2,p}$ for any $t > 2$. In the formulation of the constraint equations we need to distinguish the cases $s > 2$ and $s \le 2$ at least notation-wise, and we choose to present in this subsection the case $s \le 2$ because this is the case that is considered in the core existence theory; the higher regularity is obtained by a standard bootstrapping technique. The general case is discussed in Sects. 4 and 6. Let us define $r = r(s, p) = \frac{3p}{3 + (2 - s)p}$, so that the continuous embedding $L^r \hookrightarrow W^{s-2,p}$ holds. Introduce the operators

$$A_L : W^{s,p} \to W^{s-2,p} \quad\text{and}\quad A_{\mathbb{L}} : \mathbf{W}^{1,2r} \to \mathbf{W}^{-1,2r},$$

as the unique extensions of the operators $L$ and $\mathbb{L}$ in Eqs. (2.16) and (2.17), respectively, cf. Lemma 31 in Appendix A.5. The boldface letters denote spaces of sections of the tangent bundle $TM$, e.g., $\mathbf{W}^{1,2r} = W^{1,2r}(TM)$. Fix the source functions

$$\tau \in L^{2r}, \quad \rho \in W_+^{s-2,p}, \quad \sigma \in L^{2r}, \quad j \in \mathbf{W}^{-1,2r}, \tag{2.20}$$
where σ is symmetric, traceless and divergence-free in the weak sense, the latter mean ing that σ, Lω = 0 for all ω ∈ W 1,(2r ) . Here (2r1 ) + 2r1 = 1, and ·, · denotes the
extension of the L 2 -inner product to W −1,2r ⊗ W 1,(2r ) . We say that the matter fields ρ and j satisfy the energy condition iff there exist sequences {ρn } ⊂ C ∞ and {j n } ⊂ C∞ , respectively converging to ρ and j in the appropriate topology, such that ρn2 − j n · j n 0. Given any function τ ∈ L 2r we have bτ ≡ 23 ∇τ ∈ W −1,2r . The assumptions τ ∈ L 2r and σ ∈ L 2r imply that for every w ∈ W 1,2r the functions aτ and aw belong to L r . For example, to see that aw ∈ L r , we proceed as 2 aw r = σ + Lw2r 2 σ 22r + Lw22r 2 σ 22r + cL w21,2r , where we used the boundedness Lw2r cL w1,2r . The assumption on the background metric implies that a R ∈ W s−2, p . Given any two functions u, v ∈ L ∞ , and t 0 and q ∈ [1, ∞], define the interval [u, v]t,q := {φ ∈ W t,q : u φ v} ⊂ W t,q , see Lemma 1 near the end of Sect. 3. We equip [u, v]t,q with the subspace topology of W t,q . We will write [u, v]q for [u, v]0,q , and [u, v] for [u, v]∞ . Now, assuming that φ− , φ+ ∈ W s, p and 0 < φ− φ+ < ∞, we introduce the non-linear operators f : [φ− , φ+ ]s, p × W 1,2r → W s−2, p ,
and

$$\mathbf{f} : [\phi_-, \phi_+]_{s,p} \to \mathbf{W}^{-1,2r},$$

by $f(\phi, w) = a_\tau\phi^5 + a_R\phi - a_\rho\phi^{-3} - a_w\phi^{-7}$ and $\mathbf{f}(\phi) = b_\tau\phi^6 + b_j$, where the pointwise multiplication by an element of $W^{s,p}$ defines a bounded linear map in $W^{s-2,p}$ and in $\mathbf{W}^{-1,2r}$, cf. Corollary 3(a) in Appendix A.4. Now, we can formulate the Einstein constraint equations in terms of the above defined operators: Find elements $\phi \in [\phi_-, \phi_+]_{s,p}$ and $w \in \mathbf{W}^{1,2r}$ that solve

$$A_L\phi + f(\phi, w) = 0, \tag{2.21}$$
$$A_{\mathbb{L}}w + \mathbf{f}(\phi) = 0. \tag{2.22}$$
In the following, we often treat the two equations separately. The Hamiltonian constraint equation is the following: Given a function $w \in \mathbf{W}^{1,2r}$, find an element $\phi \in [\phi_-, \phi_+]_{s,p}$ that solves

$$A_L\phi + f(\phi, w) = 0. \tag{2.23}$$

When the Hamiltonian constraint equation is under consideration, the function $w$ is referred to as the source. To indicate the dependence of the solution $\phi$ on the source $w$, sometimes we write $\phi = \phi_w$. Let us define the momentum constraint equation: Given $\phi \in W^{s,p}$ with $\phi > 0$, find an element $w \in \mathbf{W}^{1,2r}$ that solves

$$A_{\mathbb{L}}w + \mathbf{f}(\phi) = 0. \tag{2.24}$$
When the momentum constraint equation is under consideration, the function φ is referred to as the source. To indicate the dependence of the solution w on the source φ, sometimes we write w = wφ . 3. Overview of the Main Results In this section, we state our three main theorems (Theorems 1, 2, and 3 below) on the existence of far-from-CMC, near-CMC, and CMC solutions to the Einstein constraint equations, and give an outline of the overall structure of the argument that we build in the paper. The proofs of the main results appear in Sect. 6 toward the end of the paper, after we develop a number of supporting results in the body of the paper. After we give an overview of the basic abstract structure of the coupled nonlinear constraint problem, we prove two abstract topological fixed-point theorems (Theorems 4 and 5) that are the basis for our analysis of the coupled system; these arguments are also the basis for our results in [21] on existence of non-CMC solutions to the Einstein constraints on compact manifolds with boundary. After proving these abstract results, we give an overview of the technical results that must be established in the remainder of the paper in order to use the abstract results. Before stating the main theorems, let us make precise what we mean by near-CMC condition in this article. We say that the extrinsic mean curvature τ satisfies the nearCMC condition when the following inequality is satisfied: ∇τ z < inf |τ |,
(3.1)
M
√
√
min uv 6 where the constant = 2C3 if ρ, σ 2 ∈ L ∞ , and = 2C3 ( max uv ) otherwise, with the constant C > 0 as in Corollary 1 and the continuous functions u, v > 0 are as defined in
(5.14) or in (5.15) on page. Here C depends only on the Riemannian manifold (M, h ab ), and not mentioning (M, h ab ), u and v depend only on ρ, σ 2 , and τ . It is important to min uv note that we always have 0 < max uv 1, so that in any case the condition (3.1) is √
at least as strong as the same condition with the constant taken to be equal to $\sqrt{3}/(2C)$. The condition depends on the value of $z$, and that will be inserted through the context. Recall that the three Yamabe classes $Y^+(M)$, $Y^-(M)$ and $Y^0(M)$ are defined after Eq. (2.12). See Appendix A.7 for more details.

3.1. Theorem 1: Far-CMC weak solutions. Here is the first of our three main results. This result does not involve the near-CMC condition, which is one of the main contributions of this paper. The result is developed in the presence of a weak background metric $h_{ab} \in W^{s,p}$, for $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$, with the weakest possible assumptions on the data that allow for avoiding the near-CMC condition.

Theorem 1 (Far-CMC $W^{s,p}$ solutions, $p \in (1, \infty)$, $s \in (1 + \frac{3}{p}, \infty)$). Let $(M, h_{ab})$ be a 3-dimensional closed Riemannian manifold. Let $h_{ab} \in W^{s,p}$ admit no conformal Killing field and be in $Y^+(M)$, where $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$ are given. Select $q$ and $e$ to satisfy:
• $\frac{1}{q} \in (0, 1) \cap (0, \frac{s-1}{3}) \cap [\frac{3-p}{3p}, \frac{3+p}{3p}]$,
• $e \in (1 + \frac{3}{q}, \infty) \cap [s - 1, s] \cap [\frac{3}{q} + s - \frac{3}{p} - 1, \frac{3}{q} + s - \frac{3}{p}]$.
Assume that the data satisfies:
• $\tau \in W^{e-1,q}$ if $e \ge 2$, and $\tau \in W^{1,z}$ otherwise, with $z = \frac{3q}{3 + \max\{0, 2-e\}q}$,
• $\sigma \in W^{e-1,q}$, with $\|\sigma^2\|_\infty$ sufficiently small,
• $\rho \in W_+^{s-2,p} \cap L^\infty \setminus \{0\}$, with $\|\rho\|_\infty$ sufficiently small,
• $j \in \mathbf{W}^{e-2,q}$, with $\|j\|_{e-2,q}$ sufficiently small.
Then there exist $\phi \in W^{s,p}$ with $\phi > 0$ and $w \in \mathbf{W}^{e,q}$ solving the Einstein constraint equations.

Proof. The proof will be given in Sect. 6. See Fig. 1 for clarification of the conditions on $e$ and $q$.

Remark 1. The above result avoids the near-CMC condition (3.1); however, one should be aware of the various smallness conditions involved in the above theorem. More precisely, the mean curvature $\tau$ can be chosen to be an arbitrary function from a suitable function space, and afterwards, one has to choose $\sigma$, $\rho$, and $j$ satisfying smallness conditions that depend on the chosen $\tau$.
Nevertheless, the novelty of this result is that $\tau$ can be specified freely, whereas the condition (3.1) is not satisfied for arbitrary $\tau$.

3.2. Theorem 2: Near-CMC weak solutions. Here is the second of our three main results; this result requires the near-CMC condition, but still extends the known near-CMC results to situations with weaker assumptions on the metric and on the data. In particular, the result is developed in the presence of a weak background metric $h_{ab} \in W^{s,p}$, for $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$, and with the weakest possible assumptions on the data.

Theorem 2 (Near-CMC $W^{s,p}$ solutions, $p \in (1, \infty)$, $s \in (1 + \frac{3}{p}, \infty)$). Let $(M, h_{ab})$ be a 3-dimensional closed Riemannian manifold. Let $h_{ab} \in W^{s,p}$ admit no conformal Killing field, where $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$ are given. Select $q$, $e$ and $z$ to satisfy:
• $\frac{1}{q} \in (0, 1) \cap (0, \frac{s-1}{3}) \cap [\frac{3-p}{3p}, \frac{3+p}{3p}]$,
• $e \in (1 + \frac{3}{q}, \infty) \cap [s - 1, s] \cap [\frac{3}{q} + s - \frac{3}{p} - 1, \frac{3}{q} + s - \frac{3}{p}]$,
• $z = \frac{3q}{3 + \max\{0, 2-e\}q}$.
Assume that $\tau$ satisfies the near-CMC condition (3.1) with $z$ as above, and that the data satisfies:
• $\tau \in W^{e-1,q}$ if $e > 2$, and $\tau \in W^{1,z}$ if $e \le 2$,
• $\sigma \in W^{e-1,q}$,
• $\rho \in W_+^{s-2,p}$,
• $j \in \mathbf{W}^{e-2,q}$.
In addition, let one of the following sets of conditions hold:
(a) $h_{ab}$ is in $Y^-(M)$; the metric $h_{ab}$ is conformally equivalent to a metric with scalar curvature $(-\tau^2)$;
(b) $h_{ab}$ is in $Y^0(M)$ or in $Y^+(M)$; either $\rho \not\equiv 0$ and $\tau \not\equiv 0$, or $\tau \in L^\infty$ and $\inf_M \sigma^2$ is sufficiently large.
Then there exist $\phi \in W^{s,p}$ with $\phi > 0$ and $w \in \mathbf{W}^{e,q}$ solving the Einstein constraint equations.

Proof. The proof will be given in Sect. 6. See Fig. 1 for clarification of the conditions on $e$ and $q$.

Fig. 1. Range of $e$ and $q$ in Theorems 1 and 2, with $d = s - \frac{3}{p} > 1$
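The index conditions in Theorems 1 and 2 are easy to mis-read, so a small numerical check can help. The following is a minimal sketch, not part of the paper: the function names `z_exponent` and `admissible` are hypothetical, and the code only encodes the ranges for $1/q$ and $e$ and the formula for $z$ as stated in the two theorems, using exact rational arithmetic.

```python
from fractions import Fraction as Fr


def z_exponent(e, q):
    """Return z = 3q / (3 + max{0, 2 - e} q), as in Theorems 1 and 2."""
    e, q = Fr(e), Fr(q)
    return 3 * q / (3 + max(Fr(0), 2 - e) * q)


def admissible(p, s, q, e):
    """Check the stated ranges for p, s, 1/q and e in Theorems 1 and 2.

    Arguments may be ints, Fractions, or strings such as "7/4".
    """
    p, s, q, e = Fr(p), Fr(s), Fr(q), Fr(e)
    # Standing assumptions: p in (1, oo), s in (1 + 3/p, oo), and q > 0.
    if not (p > 1 and s > 1 + 3 / p and q > 0):
        return False
    inv_q = 1 / q
    # 1/q in (0,1) ∩ (0,(s-1)/3) ∩ [(3-p)/(3p), (3+p)/(3p)]
    ok_q = (
        0 < inv_q < 1
        and inv_q < (s - 1) / 3
        and (3 - p) / (3 * p) <= inv_q <= (3 + p) / (3 * p)
    )
    # e in (1+3/q, oo) ∩ [s-1, s] ∩ [3/q+s-3/p-1, 3/q+s-3/p]
    lower = 3 / q + s - 3 / p - 1
    ok_e = e > 1 + 3 / q and s - 1 <= e <= s and lower <= e <= lower + 1
    return ok_q and ok_e


# Example: p = 3, s = 5/2, q = 6, e = 7/4 lies in the admissible range,
# and since e < 2 the exponent z is strictly smaller than q.
print(admissible(3, "5/2", 6, "7/4"), z_exponent("7/4", 6))
```

For $e \ge 2$ the maximum in the denominator vanishes and $z = q$, which matches the case distinction for $\tau$ in the theorem statements.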
3.3. Theorem 3: CMC weak solutions. Here is the last of our three main results; it covers specifically the CMC case, and allows for lower regularity of the background metric than the non-CMC case. In particular, the result is developed with a weak background metric h ab ∈ W s, p , for p ∈ (1, ∞) and s ∈ ( 3p , ∞) ∩ [1, ∞). In the case of s = 2, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case p = 2, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5).
Theorem 3 (CMC $W^{s,p}$ solutions, $p \in (1, \infty)$, $s \in (\frac{3}{p}, \infty) \cap [1, \infty)$). Let $(M, h_{ab})$ be a 3-dimensional closed Riemannian manifold. Let $h_{ab} \in W^{s,p}$ admit no conformal Killing field, where $p \in (1, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, \infty)$ are given. With $d := s - \frac{3}{p}$, select $q$ and $e$ to satisfy:
• $\frac{1}{q} \in (0, 1) \cap [\frac{3-p}{3p}, \frac{3+p}{3p}] \cap [\frac{1-d}{3}, \frac{3+sp}{6p})$,
• $e \in [1, \infty) \cap [s - 1, s] \cap [\frac{3}{q} + d - 1, \frac{3}{q} + d] \cap (\frac{3}{q} + \frac{d}{2}, \infty)$.
Assume $\tau = \text{const}$ (CMC) and that the data satisfies:
• $\sigma \in W^{e-1,q}$,
• $\rho \in W_+^{s-2,p}$,
• $j \in \mathbf{W}^{e-2,q}$.
In addition, let one of the following sets of conditions hold:
(a) $h_{ab}$ is in $Y^-(M)$; $\tau \ne 0$;
(b) $h_{ab}$ is in $Y^+(M)$; $\rho \ne 0$ or $\sigma \ne 0$;
(c) $h_{ab}$ is in $Y^0(M)$; $\tau \ne 0$; $\rho \ne 0$ or $\sigma \ne 0$;
(d) $h_{ab}$ is in $Y^0(M)$; $\tau = \rho = \sigma = 0$; $j = 0$.
Then there exist $\phi \in W^{s,p}$ with $\phi > 0$ and $w \in \mathbf{W}^{e,q}$ solving the Einstein constraint equations.

Proof. The proof will be given in Sect. 6. See Fig. 2 for clarification of the conditions on $e$ and $q$.

Fig. 2. Range of $e$ and $q$ in Theorem 3. Recall that $d = s - \frac{3}{p} > 0$

3.4. A coupled topological fixed-point argument. In Theorems 4 and 5 below (see also [21]) we give some abstract fixed-point results which form the basic framework for our analysis of the coupled constraints. These topological fixed-point theorems will be the main tool by which we shall establish Theorems 1, 2, and 3 above. They have the important feature that the required properties of the abstract fixed-point operators $S$ and $T$ appearing in Theorems 4 and 5 below can be established in the case of the Einstein constraints without using the near-CMC condition; this is not the case for fixed-point arguments for the constraints based on k-contractions (cf. [1,26]), which require near-CMC conditions. The bulk of the paper then involves establishing the required properties of $S$ and $T$ without using the near-CMC condition, and finding suitable global barriers $\phi_-$ and $\phi_+$ for defining the required set $U$ that are similarly free of the near-CMC condition (when possible).

We now set up the basic abstract framework. Let $X$ and $Y$ be Banach spaces, let $f : X \times Y \to X^*$ and $\mathbf{f} : X \to Y^*$ be (generally nonlinear) operators, let $A_{\mathbb{L}} : Y \to Y^*$ be a linear invertible operator, and let $A_L : X \to X^*$ be a linear invertible operator satisfying the maximum principle, meaning that $A_L u \le A_L v \Rightarrow u \le v$. The order structure on $X$ for interpreting the maximum principle will be inherited from an ordered Banach space $Z$ (see Appendices A.2, A.3, and A.6, and also cf. [54]) through the compact embedding $X \hookrightarrow Z$, which will also make available compactness arguments. The coupled Hamiltonian and momentum constraints can be viewed abstractly as coupled operator equations of the form:

$$A_L\phi + f(\phi, w) = 0, \tag{3.2}$$
$$A_{\mathbb{L}}w + \mathbf{f}(\phi) = 0, \tag{3.3}$$
or equivalently as the coupled fixed-point equations

$$\phi = T(\phi, w), \tag{3.4}$$
$$w = S(\phi), \tag{3.5}$$

for appropriately defined fixed-point maps $T : X \times Y \to X$ and $S : X \to Y$. The obvious choice for $S$ is the Picard map for (3.3),

$$S(\phi) = -A_{\mathbb{L}}^{-1}\mathbf{f}(\phi), \tag{3.6}$$

which also happens to be the solution map for (3.3). On the other hand, there are a number of distinct possibilities for $T$, ranging from the solution map for (3.2), to the Picard map for (3.2), which inverts only the linear part of the operator in (3.2):

$$T(\phi, w) = -A_L^{-1}f(\phi, w). \tag{3.7}$$
Assume now that $T$ is as in (3.7), and (for fixed $w \in Y$) that $\phi_-$ and $\phi_+$ are sub- and super-solutions of the semi-linear operator equation (3.2) in the sense that

$$A_L\phi_- + f(\phi_-, w) \le 0, \qquad A_L\phi_+ + f(\phi_+, w) \ge 0.$$

The assumptions on $A_L$ imply (see Lemma 26 in Appendix A.3) that for fixed $w \in Y$, $\phi_-$ and $\phi_+$ are also sub- and super-solutions of the equivalent fixed-point equation:

$$\phi_- \le T(\phi_-, w), \qquad \phi_+ \ge T(\phi_+, w).$$
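The step from the operator inequalities to the fixed-point inequalities can be spelled out as follows. This is only a sketch of the argument behind the cited Lemma 26, assuming that $A_L$ is invertible and satisfies the maximum principle $A_L u \le A_L v \Rightarrow u \le v$ stated above, with $T$ the Picard map (3.7):

```latex
\begin{aligned}
A_L\phi + f(\phi,w) = 0
  &\iff \phi = -A_L^{-1}f(\phi,w) = T(\phi,w), \\
A_L\phi_- + f(\phi_-,w) \le 0
  &\iff A_L\phi_- \le -f(\phi_-,w) = A_L\,T(\phi_-,w)
  \;\Longrightarrow\; \phi_- \le T(\phi_-,w), \\
A_L\phi_+ + f(\phi_+,w) \ge 0
  &\iff A_L\,T(\phi_+,w) = -f(\phi_+,w) \le A_L\phi_+
  \;\Longrightarrow\; T(\phi_+,w) \le \phi_+ .
\end{aligned}
```

The first line is the equivalence of (3.2) with (3.4); the last two lines use the maximum principle once each.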
For developing results on fixed-point iterations in ordered Banach spaces, it is convenient to work with maps which are monotone increasing in $\phi$, for fixed $w \in Y$:

$$\phi_1 \le \phi_2 \;\Rightarrow\; T(\phi_1, w) \le T(\phi_2, w).$$
The map T that arises as the Picard map for a semi-linear problem will generally not be monotone increasing; however, if there exists a continuous linear monotone increasing
map $J : X \to X^*$, then one can always introduce a positive shift $s$ into the operator equation $A_L^s\phi + f^s(\phi, w) = 0$, with $A_L^s = A_L + sJ$ and $f^s(\phi, w) = f(\phi, w) - sJ\phi$. (Throughout this paper, the spaces we encounter for $X$ typically fit into a Gelfand triple $X \hookrightarrow H \hookrightarrow X^*$, where the “pivot” space $H$ is a Hilbert space, and the continuous map between $X$ and $X^*$ is a composition of the two inclusion maps.) Since $s > 0$ the shifted operator $A_L^s$ retains the maximum principle property of $A_L$, and if $s$ is chosen sufficiently large then $f^s$ is monotone decreasing in $\phi$. Under the additional condition on $J$ and $s$ that $A_L^s$ is invertible (see also [21]), the shifted Picard map $T^s(\phi, w) = -(A_L^s)^{-1}f^s(\phi, w)$ is now monotone increasing in $\phi$. We now give two abstract existence results for systems of the form (3.4)–(3.5).

Theorem 4 (Coupled Fixed-Point Principle A). Let $X$ and $Y$ be Banach spaces, and let $Z$ be a Banach space with compact embedding $X \hookrightarrow Z$. Let $U \subset Z$ be a non-empty, convex, closed, bounded subset, and let

$$S : U \to R(S) \subset Y, \qquad T : U \times R(S) \to U \cap X,$$
be continuous maps. Then there exist φ ∈ U ∩ X and w ∈ R(S) such that φ = T (φ, w) and w = S(φ). Proof. The proof will be through a standard variation of the Schauder Fixed-Point Theorem, reviewed as Theorem 9 in Appendix A.1. The proof is divided into several steps. Step 1. Construction of a non-empty, convex, closed, bounded subset U ⊂ Z . By assumption we have that U ⊂ Z is non-empty, convex (involving the vector space structure of Z ), closed (involving the topology on Z ), and bounded (involving the metric given by the norm on Z ). Step 2. Continuity of a mapping G : U ⊂ Z → U ∩ X ⊂ X . Define the composite operator G := T ◦ S : U ⊂ Z → U ∩ X ⊂ X. The mapping G is continuous, since it is a composition of the continuous operators S : U ⊂ Z → R(S) ⊂ Y and T : U ⊂ Z × R(S) → U ∩ X ⊂ X . Step 3. Compactness of a mapping F : U ⊂ Z → U ⊂ Z . The compact embedding assumption X → Z implies that the canonical injection operator i : X → Z is compact. Since the composition of compact and continuous operators is compact, we have the composition F := i ◦ G : U ⊂ Z → U ⊂ Z is compact. Step 4. Invoking the Schauder Theorem. Therefore, by a standard variant of the Schauder Theorem (see Theorem 9 in Appendix A.1), there exists a fixed-point φ ∈ U such that φ = F(φ) = T (φ, S(φ)). Since R(T ) = U ∩ X , in fact φ ∈ U ∩ X . We now take w = S(φ) ⊂ R(S) and we have the result.
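For orientation, the chain of maps used in Steps 2 to 4 can be written out explicitly. This is a restatement of the proof in display form, reading the composite $G = T \circ S$ as $\phi \mapsto T(\phi, S(\phi))$ and writing $i$ for the canonical injection:

```latex
\begin{aligned}
G &: U \subset Z \to U \cap X \subset X, & G(\phi) &:= T(\phi, S(\phi)) && \text{(continuous)}, \\
F &:= i \circ G : U \subset Z \to U \subset Z, & i &: X \hookrightarrow Z && \text{(compact)}, \\
\phi &= F(\phi) = T(\phi, S(\phi)), & w &:= S(\phi) \in R(S) && \text{(fixed point of } F\text{)}.
\end{aligned}
```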
The assumption in Theorem 4 that the mapping T is invariant on the non-empty, closed, convex, bounded subset U can be established using a priori estimates if T is the solution mapping, but if there are multiple fixed-points then continuity of T will not hold. Fixed-point theory for set-valued maps could still potentially be used (cf. [54]). On the other hand, if T is chosen to be the Picard map, then it is typically easier to establish continuity of T even with multiple fixed-points, but more difficult to establish the invariance property without additional conditions on T . In our setting, we wish to allow for non-uniqueness in the Hamiltonian constraint (for example see [21] for possible non-uniqueness in the case of compact manifolds with boundary), so will generally focus on the Picard map for the Hamiltonian constraint in our fixed-point framework for the coupled constraints. The following special case of Theorem 4 gives some simple sufficient conditions on T to establish the invariance using barriers in an ordered Banach space (for a review of ordered Banach spaces, see Appendix A.2 or [54]). Theorem 5 (Coupled Fixed-Point Principle B). Let X and Y be Banach spaces, and let Z be a real ordered Banach space having the compact embedding X → Z . Let [φ− , φ+ ] ⊂ Z be a nonempty interval which is closed in the topology of Z , and set U = [φ− , φ+ ] ∩ B M ⊂ Z , where B M is the closed ball of finite radius M > 0 in Z about the origin. Assume U is nonempty, and let the maps S : U → R(S) ⊂ Y,
T : U × R(S) → U ∩ X,
be continuous maps. Then there exist φ ∈ U ∩ X and w ∈ R(S) such that φ = T (φ, w) and w = S(φ). Proof. By choosing the set U to be the non-empty intersection of the interval [φ− , φ+ ] with a bounded set in Z , we have U bounded in Z . We also have that U is convex with respect to the vector space structure of Z , since it is the intersection of two convex sets [φ− , φ+ ] and B M . Since U is the intersection of the interval [φ− , φ+ ], which by assumption is closed in the topology of Z , with the closed ball B M in Z , U is also closed. In summary, we have that U is non-empty as a subset of Z , closed in the topology of Z , convex with respect to the vector space structure of Z , and bounded with respect to the metric (via normed) space structure of Z . Therefore, the assumptions of Theorem 4 hold and the result then follows. We make some final remarks about Theorems 4 and 5. If the ordered Banach space Z in Theorem 5 had a normal order cone, then the closed interval [φ− , φ+ ] would automatically be bounded in the norm of Z (see Lemma 20 in Appendix A.2 or [54] for this result). The interval by itself is also non-empty and closed by assumption, and trivially convex (see Appendix A.2), so that Theorem 5 would follow immediately from Theorem 4 by simply taking U = [φ− , φ+ ]. Second, the closed ball B M in Theorem 5 can be replaced with any non-empty, convex, closed, and bounded subset of Z having non-trivial intersection with the interval [φ− , φ+ ]. Third, in the case that T in Theorem 5 arises as the Picard map (3.7) of the semi-linear problem (3.2), we can always ensure that T is invariant on U in Theorem 5 by: (1) obtaining sub- and super-solutions to the semi-linear operator equation and using these for φ− and φ+ , since these will also be suband super-solutions for the fixed-point equation involving the Picard map; (2) introducing a shift into the nonlinearity to ensure T is monotone increasing; and (3) obtaining a priori norm bounds on Picard iterates. 
As noted earlier, (1) and (2) will ensure

$$\phi_- \le T(\phi_-, w) \le T(\phi, w) \le T(\phi_+, w) \le \phi_+, \tag{3.8}$$

for all $\phi \in [\phi_-, \phi_+]$ and $w \in R(S)$, whereas (3) ensures that

$$\|T(\phi, w)\|_X \le M, \quad \forall\phi \in [\phi_-, \phi_+], \; \forall w \in R(S), \tag{3.9}$$
which together ensure T : U × R(S) → U ∩ X , where U = [φ− , φ+ ] ∩ B M ⊂ Z . Again, if Z has a normal order cone structure, then ensuring (3.8) holds will automatically guarantee that (3.9) also holds, so it is not necessary to establish (3.9) separately in the case of a normal order cone. Finally, note that Theorem 5 also allows one to choose the solution map (or any other fixed-point map) for T together with a priori order cone and norm estimates to ensure the conditions (3.8) and (3.9) hold (as long as continuity for T can be shown). Even if a priori order-cone estimates cannot be shown to hold directly for this choice of T , as long as the map can be “bracketed” in the interval [φ− , φ+ ] by two auxiliary monotone increasing maps, then it can be shown that (3.8) holds. This allows one to use the Picard map even if it is not monotone increasing, without having to introduce the shift into the Picard map. The overall argument we use to prove the non-CMC results in Theorems 1, 2, and 3 using Theorems 4 and 5 involves the following steps: Step 1. The choice of function spaces. We will choose the spaces for use of Theorem 5 as follows: – X = W s, p , with p ∈ (1, ∞), and s( p) ∈ (1 + 3p , ∞). In the CMC case in Theorem 3, we can lower s to s( p) ∈ ( 3p , ∞) ∩ [1, ∞). – Y = W e,q , with e and q as given in the theorem statements. – Z = W s˜, p , s˜ ∈ ( 3p , s), so that X = W s, p → W s˜, p = Z is compact.
– $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p} = Z$, with $\phi_-$ and $\phi_+$ global barriers (sub- and super-solutions, respectively) for the Hamiltonian constraint equation which satisfy the compatibility condition: $0 < \phi_- \le \phi_+ < \infty$.
Step 2. Construction of the mapping $S$. Assuming the existence of “global” weak sub- and super-solutions $\phi_-$ and $\phi_+$, and assuming the fixed function $\phi \in U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p} = Z$ is taken as data in the momentum constraint, we establish continuity and related properties of the momentum constraint solution map $S : U \to R(S) \subset \mathbf{W}^{e,q} = Y$ (Sect. 4.1).
Step 3. Construction of the mapping $T$. Again assuming existence of “global” weak sub- and super-solutions $\phi_-$ and $\phi_+$, with fixed $w \in R(S) \subset \mathbf{W}^{e,q} = Y$ taken as data in the Hamiltonian constraint, we establish continuity and related properties of the Picard map $T : U \times R(S) \to U \cap W^{s,p}$. Invariance of $T$ on $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p}$ is established using a combination of a priori order cone bounds and norm bounds (Sect. 4.2).
Step 4. Barrier construction. Global weak sub- and super-solutions $\phi_-$ and $\phi_+$ for the Hamiltonian constraint are explicitly constructed to build a nonempty, convex, closed, and bounded subset $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p}$, which is a strictly positive interval. These include variations of known barrier constructions which require the near-CMC condition, and also some new barrier constructions which are free of the near-CMC condition (Sect. 5). Note: This is the only place in the argument where near-CMC conditions may potentially arise.
Step 5. Application of fixed-point theorem. The global barriers and continuity properties are used together with the abstract topological fixed-point result (Theorem 5) to establish existence of solutions $\phi \in U \cap W^{s,p}$ and $w \in \mathbf{W}^{e,q}$ to the coupled system: $w = S(\phi)$, $\phi = T(\phi, w)$ (Sect. 6).
Step 6. Bootstrap. The above application of a fixed-point theorem is actually performed for some low regularity spaces, i.e., for $s \le 2$ and $e \le 2$, and a bootstrap argument is then given to extend the results to the range of $s$ and $p$ given in the statement of the theorem (Sect. 6).

The ordered Banach space $Z$ plays a central role in Theorem 5. We will use $Z = W^{t,q}$, $t \ge 0$, $1 \le q \le \infty$, with order cone defined as in (2.5). Given such an order cone, one can define the closed interval

$$[\phi_-, \phi_+]_{t,q} = \{\phi \in W^{t,q} : \phi_- \le \phi \le \phi_+\} \subset W^{t,q},$$

which as noted earlier is denoted more simply as $[\phi_-, \phi_+]_q$ when $t = 0$, and as simply $[\phi_-, \phi_+]$ when $t = 0$, $q = \infty$. When $t = 0$, the $W^{t,q}$ order cone is normal for $1 \le q \le \infty$, meaning that closed intervals $[\phi_-, \phi_+]_q \subset L^q = W^{0,q}$ are automatically bounded in the metric given by the norm on $L^q$. If we consider the interval $U = [\phi_-, \phi_+]_{t,q} \subset W^{t,q} = Z$ defined using this order structure, it will be critically important to establish that $U$ is convex (with respect to the vector space structure of $Z$), closed (in the topology of $Z$), and (when possible) bounded (in the metric given by the norm on $Z$). It will also be important that $U$ be nonempty as a subset of $Z$; this will involve choosing compatible $\phi_-$ and $\phi_+$. Regarding convexity, closure, and boundedness, we have the following lemma.

Lemma 1 (Order cone intervals in $W^{t,q}$). For $t \ge 0$, $1 \le q \le \infty$, the set $U = [\phi_-, \phi_+]_{t,q} = \{\phi \in W^{t,q} : \phi_- \le \phi \le \phi_+\} \subset W^{t,q}$ is convex with respect to the vector space structure of $W^{t,q}$ and closed in the topology of $W^{t,q}$. For $t = 0$, $1 \le q \le \infty$, the set $U$ is also bounded with respect to the metric space structure of $L^q = W^{0,q}$.

Proof. That $U$ is convex for $t \ge 0$, $1 \le q \le \infty$, follows from the fact that any interval built using order cones is convex. That $U$ is closed in the case of $t = 0$, $1 \le q \le \infty$ follows from the fact that norm convergence in $L^q$ for $1 \le q \le \infty$ implies pointwise subsequential convergence almost everywhere (see Theorem 3.12 in [44]). That $U$ is bounded when $t = 0$, $1 \le q \le \infty$ follows from the fact that the order cone $L^q_+$ is normal (see Appendix A.2). What remains is to show that $U$ is closed in the case of $t > 0$, $1 \le q \le \infty$. The argument is as follows. Let $\{u_k\}_{k=1}^\infty$ be a Cauchy sequence in $U \subset W^{t,q} \subset L^q$, with $t > 0$, $1 \le q \le \infty$. From completeness of $W^{t,q}$ there exists $\lim_{k\to\infty} u_k = u \in W^{t,q}$. From the continuous embedding $W^{t,q} \hookrightarrow L^q$ for $t > 0$, we have that $\|u_k - u_l\|_q \le C\|u_k - u_l\|_{t,q}$, so that $u_k$ is also Cauchy in $L^q$. Moreover, the continuous embedding also implies that $u$ is also the limit of $u_k$ as a sequence in $L^q$. Since $[\phi_-, \phi_+]_{0,q}$ is closed in $L^q$, we have $u \in [\phi_-, \phi_+]_{0,q}$, and so $u \in U = [\phi_-, \phi_+]_{t,q} = [\phi_-, \phi_+]_{0,q} \cap W^{t,q}$.

Remark 2. We indicate now how the far-CMC result outlined in [22] can be recovered using Theorem 4 above. The framework is constructed by taking $X = W^{2,p}$, $Y = \mathbf{W}^{2,p}$, and $Z = L^\infty$, with $p > 3$, giving the compact embedding $W^{2,p} \hookrightarrow L^\infty$. The coefficients are assumed to satisfy $\tau \in W^{1,p}$ and $\sigma^2, j^a, \rho \in L^p$ as well as the assumptions for the construction of a near-CMC-free global super-solution (presented in [22] as Theorem 1, analogous to Lemma 9 in this paper), and for the construction of a near-CMC-free global
sub-solution (presented in [22] as Theorem 2, analogous to Lemma 13 in this paper). One then takes U = [φ− , φ+ ] ⊂ Z = L ∞ , where the compatible 0 < φ− φ+ are these near-CMC-free barriers. Since Z = L ∞ is an ordered Banach space with normal order cone, we have (by Lemma 1 in this paper) that U is non-empty, convex, closed and bounded as a subset of Z . The invariance of the Picard mapping on the interval [φ− , φ+ ] is proven using a monotone shift (cf. Lemma 4 in this paper) and a barrier argument (cf. Lemma 5 in this paper). The main result in [22] (stated in [22] as Theorem 4), now follows from Theorem 4 in this paper (stated in [22] as Lemma 1). 4. Weak Solution Results for the Individual Constraints 4.1. The momentum constraint and the solution map S. In this section we fix a particular scalar function φ ∈ W s, p with sp > 3, and consider separately the momentum constraint equation (2.24) to be solved for the vector valued function w. The result is a linear elliptic system of equations for this variable w = wφ . For convenience, we reformulate the problem here in a self-contained manner. Note that the problem (4.2) below is identical to (2.24) provided the functions bτ and b j are defined accordingly. Our goal is not only to develop some existence results for the momentum constraint, but also to derive the estimates for the momentum constraint solution map S that we will need later in our analysis of the coupled system. We note that a complete weak solution theory for the momentum constraint on compact manifolds with boundary, using both variational methods and Riesz-Schauder Theory, is developed in [21]. Let (M, h) be a 3-dimensional Riemannian manifold, where M is a smooth, compact manifold without boundary, and with p ∈ (1, ∞) and s ∈ ( 3p , ∞), h ∈ W s, p is a positive definite metric. 
With $q \in (1, \infty)$ and $e \in (2 - s, s] \cap [-s + \frac{3}{p} - 1 + \frac{3}{q}, \; s - \frac{3}{p} + \frac{3}{q}]$, introduce the bounded linear operator

$$A_{\mathbb{L}} : \mathbf{W}^{e,q} \to \mathbf{W}^{e-2,q},$$

as the unique extension of the operator $\mathbb{L}$ in (2.17), cf. Lemmata 31 and 32 in Appendix A.5. Fix the source terms $b_\tau, b_j \in \mathbf{W}^{e-2,q}$. Fix a function $\phi \in W^{s,p}$, and define

$$\mathbf{f}_\phi \in \mathbf{W}^{e-2,q}, \qquad \mathbf{f}_\phi := b_\tau\phi^6 + b_j. \tag{4.1}$$

We used the subscript $\phi$ in $\mathbf{f}_\phi$ to emphasize that $\phi$ is not a variable (but the “source”) of the problem. Note that the above conditions on $q$ and $e$ are sufficient for the pointwise multiplication by an element of $W^{s,p}$ to be a bounded map in $\mathbf{W}^{e-2,q}$, cf. Corollary 3(a) in Appendix A.4. The momentum constraint equation is the following: find an element $w \in \mathbf{W}^{e,q}$ that solves

$$A_{\mathbb{L}}w + \mathbf{f}_\phi = 0. \tag{4.2}$$

We sketch here a proof of existence of weak solutions of the momentum constraint equation (4.2).
Theorem 6 (Momentum constraint). Let $e$ and $q$ be as above. Then there exists a solution $w \in W^{e,q}$ to the momentum constraint equation (4.2) if and only if $f_\phi(v) = 0$ for all $v \in W^{2-e,q}$ satisfying $A_{\mathbb{L}}^* v = 0$. The solution is unique if and only if the kernel of $A_{\mathbb{L}}^*$ is trivial. Moreover, if a solution exists at all in $W^{e,q}$, then for any given closed linear space $K \subseteq W^{e,q}$ such that $W^{e,q} = \ker A_{\mathbb{L}} \oplus K$, there is a unique solution satisfying $w \in K$, and for this solution we have
$$\|w\|_{e,q} \le C \|f_\phi\|_{e-2,q}, \tag{4.3}$$
with some constant $C > 0$ not depending on $w$.

Proof. By Lemma 34 in Appendix A.5, the operator $A_{\mathbb{L}}$ is semi-Fredholm, and moreover, since $A_{\mathbb{L}}$ is formally self-adjoint, it is Fredholm. The formal self-adjointness also implies that when the metric is smooth, the index of $A_{\mathbb{L}}$ is zero, independent of $e$ and $q$. Now we can approximate the metric $h$ by smooth metrics so that $A_{\mathbb{L}}$ is sufficiently close to a Fredholm operator with index zero. Since the set of Fredholm operators with constant index is open, we conclude that the index of $A_{\mathbb{L}}$ is zero, and the theorem follows.

In the later sections we need to bound the coefficient $a_w$ in the Hamiltonian constraint equation, which can be obtained by using the following observation.

Corollary 1. Let $p \in (1, \infty)$ and $s \in (1 + \tfrac{3}{p}, \infty)$. In addition, let $q \in (3, \infty)$ and $e \in (1, s] \cap (1 + \tfrac{3}{q},\, s - \tfrac{3}{p} + \tfrac{3}{q}] \cap (1, 2]$, and with $z = \frac{3q}{3 + (2-e)q}$, let $b_\tau \in L^z$. Assume that the momentum constraint equation has a solution $w \in W^{e,q}$. Then, we have
$$\|\mathcal{L}w\|_\infty \le C \|\phi\|_\infty^6 \|b_\tau\|_z + C \|b_j\|_{e-2,q}, \tag{4.4}$$
with $C > 0$ not depending on $w$. Moreover, if the solution is unique, the norm $\|w\|_{e,q}$ can be bounded by the same expression.

Proof. Since the kernel of $A_{\mathbb{L}}$ is finite dimensional, we can write $W^{e,q} = \ker A_{\mathbb{L}} \oplus K$ with a closed linear space $K \subseteq W^{e,q}$. We have the splitting $w = w_0 + w_1$ such that $w_0 \in \ker A_{\mathbb{L}} = \ker \mathcal{L}$ and $w_1 \in K$, implying that
$$\|\mathcal{L}w\|_\infty = \|\mathcal{L}w_1\|_\infty \le c \|w_1\|_{1,\infty} \le c \|w_1\|_{e,q},$$
the latter inequality by $W^{e,q} \hookrightarrow W^{1,\infty}$. We note that demanding $W^{e,q} \hookrightarrow W^{1,\infty}$ gives us the lower bound $e > 1 + \tfrac{3}{q}$, and this in turn implies $s > 1 + \tfrac{3}{p}$ if the range of $e$ is to be nonempty. To complete the proof, we note that $w_1$ is also a solution of the momentum constraint, and taking into account $L^z \hookrightarrow W^{e-2,q}$, we apply Theorem 6 to bound the norm $\|w_1\|_{e,q}$. Note that the latter embedding requires $e \le 2$, and combining this with $e > 1 + \tfrac{3}{q}$, we need $q > 3$.

We now establish some properties of the momentum constraint solution map $S$ that we will need later for our analysis of the coupled system. Suppose that the conditions for Theorem 6 hold, so that the momentum constraint is uniquely solvable. Then for any fixed $\phi_+ \in W^{s,p}$ with $\phi_+ > 0$, there exists a mapping
$$S : [0, \phi_+] \cap W^{s,p} \to W^{e,q} \tag{4.5}$$
that sends the source φ to the corresponding solution w of the momentum constraint equation. Since the momentum constraint is linear, it follows easily that S is Lipschitz continuous as stated in the following lemma.
Lemma 2 (Properties of the map S). In addition to the conditions imposed in the beginning of this section, let $s \ge 1$. Let $e \in [1, 3]$ and $\tfrac{1}{q} \in (\tfrac{e-1}{2}\delta,\, 1 - \tfrac{3-e}{2}\delta)$, where $\delta = \max\{0, \tfrac{1}{p} - \tfrac{s-1}{3}\}$. Assume that the momentum constraint (4.2) is uniquely solvable in $W^{e,q}$. With some $\phi_+ \in W^{s,p}$ satisfying $\phi_+ > 0$, let $w_1$ and $w_2$ be the solutions to the momentum constraint with the source functions $\phi_1$ and $\phi_2$ from the set $[0, \phi_+] \cap W^{s,p}$, respectively. Then,
$$\|w_1 - w_2\|_{e,q} \le C \|\phi_+\|_\infty^5 \|b_\tau\|_{e-2,q} \|\phi_1 - \phi_2\|_{s,p}.$$

Proof. The functions $\phi_1$ and $\phi_2$ pointwise satisfy the following inequalities:
$$\phi_2^n - \phi_1^n = \Big(\sum_{j=0}^{n-1} \phi_2^j \phi_1^{n-1-j}\Big)(\phi_2 - \phi_1) \le n\,(\phi_+)^{n-1}\,|\phi_2 - \phi_1|, \tag{4.6}$$
$$-\big[\phi_2^{-n} - \phi_1^{-n}\big] = \frac{\phi_2^n - \phi_1^n}{(\phi_2 \phi_1)^n} \le n\,\frac{(\phi_+)^{n-1}}{(\phi_-)^{2n}}\,|\phi_2 - \phi_1|, \tag{4.7}$$
for any integer $n > 0$. Since Eq. (4.2) is linear, applying Theorem 6 with the right-hand side $f := f_{\phi_1} - f_{\phi_2}$, and by using Lemma 29 in the Appendix, we obtain
$$\|w_1 - w_2\|_{e,q} \lesssim \|b_\tau\|_{e-2,q} \|\phi_1^6 - \phi_2^6\|_{s,p} \lesssim 6 \|\phi_+\|_\infty^5 \|b_\tau\|_{e-2,q} \|\phi_1 - \phi_2\|_{s,p}.$$

4.2. The Hamiltonian constraint and the Picard map T. In this section we fix a particular function $a_w$ in an appropriate space and we then separately look for weak solutions of the Hamiltonian constraint equation (2.23). For convenience, we reformulate the problem here in a self-contained manner. Note that the problem (4.9) below is identical to (2.23), provided the functionals $a_\tau$ and $a_\rho$ are defined accordingly. Our goal here is primarily to establish some properties and derive some estimates for a Hamiltonian constraint fixed-point map $T$ that we will need later in our analysis of the coupled system, and also for the analysis of the Hamiltonian constraint alone in the CMC setting. We remark that a complete weak solution theory for the Hamiltonian constraint on compact manifolds with boundary, using both variational methods and fixed-point arguments based on monotone increasing maps, combined with sub- and super-solutions, is developed in [21].

Let $(M, h)$ be a 3-dimensional Riemannian manifold, where $M$ is a smooth, compact manifold without boundary, and with $p \in (1, \infty)$ and $s \in (\tfrac{3}{p}, \infty) \cap [1, \infty)$, $h \in W^{s,p}$ is a positive definite metric. Introduce the operator $A_L : W^{s,p} \to W^{s-2,p}$ as the unique extension of the Laplace-Beltrami operator $L = -\Delta$, cf. Lemma 31 in Appendix A.5. Fix the source functions
$$a_\tau, a_\rho, a_w \in W_+^{s-2,p}, \quad \text{and} \quad a_R = \tfrac{1}{8} R \in W^{s-2,p},$$
where $R$ is the scalar curvature of the metric $h$. (By Corollary 3(b) in Appendix A.4, we know $h_{ab} \in W^{s,p}$ implies $R \in W^{s-2,p}$.) Given any two functions $\phi_-, \phi_+ \in W^{s,p}$ with $0 < \phi_- \le \phi_+$, introduce the nonlinear operator $f_w : [\phi_-, \phi_+]_{s,p} \to W^{s-2,p}$,
$$f_w(\phi) = a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3} - a_w \phi^{-7}, \tag{4.8}$$
where the pointwise multiplication by an element of $W^{s,p}$ defines a bounded linear map in $W^{s-2,p}$ since $s - 2 \ge -s$ and $2(s - \tfrac{3}{p}) > 0 > 2 - 3$, cf. Corollary 3(a) in Appendix A.4. In case the coupled system is under consideration, the dependence of $f_w$ on $w$ is hidden in the fact that the coefficient $a_w$ depends on $w$, cf. (2.18). For generality, in the following we will view the operator $f_w$ as depending on $a_w$. We now formulate the Hamiltonian constraint equation as follows: find an element $\phi \in [\phi_-, \phi_+]_{s,p}$ solution of
$$A_L \phi + f_w(\phi) = 0. \tag{4.9}$$
To establish existence results for weak solutions to the Hamiltonian constraint equation using fixed-point arguments, we will rely on the existence of generalized (weak) sub- and super-solutions (sometimes called barriers) which will be derived later in Sect. 5. Let us recall the definition of sub- and super-solutions in the following, in a slightly generalized form that will be necessary in our study of the coupled system. A function $\phi_- \in (0, \infty) \cap W^{s,p}$ is called a sub-solution of (2.23) iff the function $\phi_-$ satisfies the inequality
$$A_L \phi_- + f_w(\phi_-) \le 0, \tag{4.10}$$
for some $a_w \in W^{s-2,p}$. A function $\phi_+ \in (0, \infty) \cap W^{s,p}$ is called a super-solution of (2.23) iff the function $\phi_+$ satisfies the inequality
$$A_L \phi_+ + f_w(\phi_+) \ge 0, \tag{4.11}$$
for some $a_w \in W^{s-2,p}$. We say a pair of sub- and super-solutions is compatible if they satisfy
$$0 < \phi_- \le \phi_+ < \infty, \tag{4.12}$$
so that the interval $[\phi_-, \phi_+] \cap W^{s,p}$ is both nonempty and bounded.

We now turn to the construction of the fixed-point mapping $T : U \times R(S) \to X$ for the Hamiltonian constraint and its properties. There are a number of possibilities for defining $T$; the requirements are (1) that every fixed-point of $T$ must be a solution to the Hamiltonian constraint; (2) $T$ must be a continuous map from its domain to its range; and (3) $T$ must be invariant on a non-empty, convex, closed, bounded subset $U$ of an ordered Banach space $Z$, with $X \hookrightarrow Z$ compact. It will be sufficient to define $T$ using a variation of the Picard iteration as follows. Due to the presence of the non-trivial kernel of the operator $A_L$, which is a consequence of working with a closed manifold, we must introduce a shift into the Hamiltonian constraint equation in order to construct $T$ with the required properties.
Lemma 3 (Properties of the map T). In the above described setting, assume that $p \in (\tfrac{3}{2}, \infty)$ and $s \in (\tfrac{3}{p}, \infty) \cap [1, 3]$. With $a_0 \in W_+^{s-2,p}$ satisfying $a_0 \ne 0$, and $\psi \in W_+^{s,p}$, let $a_s = a_0 + a_w \psi \in W^{s-2,p}$. Fix the functions $\phi_-, \phi_+ \in W^{s,p}$ such that $0 < \phi_- \le \phi_+$, and define the shifted operators
$$A_L^s : W^{s,p} \to W^{s-2,p}, \qquad A_L^s \phi := A_L \phi + a_s \phi, \tag{4.13}$$
$$f_w^s : [\phi_-, \phi_+]_{s,p} \to W^{s-2,p}, \qquad f_w^s(\phi) := f_w(\phi) - a_s \phi. \tag{4.14}$$
Let, for $\phi \in [\phi_-, \phi_+]_{s,p}$ and $a_w \in W^{s-2,p}$,
$$T^s(\phi, a_w) := -(A_L^s)^{-1} f_w^s(\phi). \tag{4.15}$$
Then, the map $T^s : [\phi_-, \phi_+]_{s,p} \times W^{s-2,p} \to W^{s,p}$ is continuous in both arguments. Moreover, there exist $\tilde{s} \in (\tfrac{3}{p}, s)$ and a constant $C$ such that
$$\|T^s(\phi, a_w)\|_{s,p} \le C \big(1 + \|a_w\|_{s-2,p}\big) \|\phi\|_{\tilde{s},p}, \tag{4.16}$$
for all $\phi \in [\phi_-, \phi_+]_{s,p}$ and $a_w \in W^{s-2,p}$.

Proof. In this proof, we denote by $C$ a generic constant that may have different values at its different occurrences. By applying Lemma 29 from the Appendix, for any $\tilde{s} \in (\tfrac{3}{p}, s]$, $s - 2 \in [-1, 1]$ and $\tfrac{1}{p} \in (\tfrac{s-1}{2}\delta,\, 1 - \tfrac{3-s}{2}\delta)$ with $\delta = \tfrac{1}{p} - \tfrac{\tilde{s}-1}{3}$, we have
$$\|f_w^s(\phi)\|_{s-2,p} \le C \Big[ \|a_\tau\|_{s-2,p} \|\phi_+\|_\infty^4 + \|a_\rho\|_{s-2,p} \|\phi_-^{-4}\|_\infty + \|a_w\|_{s-2,p} \big( \|\phi_-^{-8}\|_\infty + \|\psi\|_{\tilde{s},p} \big) + \|a_R + a_0\|_{s-2,p} \Big] \|\phi\|_{\tilde{s},p}.$$
Let us verify that $\tfrac{1}{p}$ is indeed in the prescribed range. First, we have $\delta = \tfrac{1}{p} - \tfrac{\tilde{s}}{3} + \tfrac{1}{3} < \tfrac{1}{3}$ since $\tfrac{\tilde{s}}{3} - \tfrac{1}{p} > 0$, and taking into account $s \ge 1$, we infer
$$1 - \tfrac{3-s}{2}\delta \ge 1 - \tfrac{3-1}{2} \cdot \tfrac{1}{3} = \tfrac{2}{3}.$$
This shows $\tfrac{1}{p} < 1 - \tfrac{3-s}{2}\delta$ for $p > \tfrac{3}{2}$, which is not sharp, but will be sufficient for our analysis. For the other bound, we need
$$\tfrac{1}{p} > \tfrac{s-1}{2}\delta = \tfrac{s-1}{2p} - \tfrac{(s-1)(\tilde{s}-1)}{6},$$
or in other words, $\tfrac{(s-1)(\tilde{s}-1)}{6} > \tfrac{s-3}{2p}$. Since $s \in [1, 3]$, it is possible to choose $\tilde{s} \in (\tfrac{3}{p}, s]$ satisfying this inequality. To finalize the proof of (4.16), we note that by Lemma 36 in Appendix A.6, the operator $A_L^s$ is invertible, since the function $a_s$ is positive, and that by Corollary 5 also in that appendix, the inverse $(A_L^s)^{-1} : W^{s-2,p} \to W^{s,p}$ is bounded. The continuity of the mapping $f_w^s : [\phi_-, \phi_+]_{s,p} \to W^{s-2,p}$ for any $a_w \in W^{s-2,p}$ is obtained similarly, and the continuity of $a_w \mapsto f_w(\phi)$ for fixed $\phi \in [\phi_-, \phi_+]_{s,p}$ is obvious. Being the composition of continuous maps, $(\phi, a_w) \mapsto T_w^s(\phi)$ is also continuous.

The following lemma shows that by choosing the shift sufficiently large, we can make the map $T^s$ monotone increasing. This result is important for ensuring that the Picard map $T$ for the Hamiltonian constraint is invariant on the interval $[\phi_-, \phi_+]$ defined by sub- and super-solutions. There is an obstruction that the scalar curvature should be continuous, which will be handled in the general case by conformally transforming the metric to a metric with continuous scalar curvature and using the conformal covariance of the Hamiltonian constraint, cf. Sect. 6.1.

Lemma 4 (Monotone increasing property of T). In addition to the conditions of Lemma 3, let $a_R$ be continuous and define the shift function $a_s$ by
$$a_s = \max\{1, a_R\} + 3\,\frac{\phi_+^2}{\phi_-^6}\, a_\rho + 5\,\phi_+^4\, a_\tau + 7\,\frac{\phi_+^6}{\phi_-^{14}}\, a_w.$$
Then, for any fixed $a_w \in W^{s-2,p}$, the map $\phi \mapsto T^s(\phi, a_w) : [\phi_-, \phi_+]_{s,p} \to W^{s,p}$ is monotone increasing.
Proof. The shifted operator $A_L^s$ satisfies the maximum principle, hence the inverse $(A_L^s)^{-1} : W^{s-2,p} \to W^{s,p}$ is monotone increasing. Now we will show that the operator $f_w^s$ is monotone decreasing. Given any two functions $\phi_2, \phi_1 \in [\phi_-, \phi_+]_{s,p}$ with $\phi_2 \ge \phi_1$, we have
$$f_w^s(\phi_2) - f_w^s(\phi_1) = f_w(\phi_2) - f_w(\phi_1) - a_s[\phi_2 - \phi_1] = a_\tau\big[\phi_2^5 - \phi_1^5\big] + a_R[\phi_2 - \phi_1] - a_s[\phi_2 - \phi_1] - a_\rho\big[\phi_2^{-3} - \phi_1^{-3}\big] - a_w\big[\phi_2^{-7} - \phi_1^{-7}\big].$$
The inequalities (4.6) and (4.7), the condition $0 < \phi_1 \le \phi_2$, and the choice of $a_s$ imply $f_w^s(\phi_2) - f_w^s(\phi_1) \le 0$, which establishes that $f_w^s$ is monotone decreasing. Both the operator $(A_L^s)^{-1}$ and the map $-f_w^s$ are monotone increasing, therefore the operator $T^s(\cdot, a_w)$ is also monotone increasing.

Lemma 5 (Barriers for T and the Hamiltonian constraint). Let the conditions of Lemma 4 hold, with $\phi_-$ and $\phi_+$ sub- and super-solutions of the Hamiltonian constraint equation (4.9), respectively. Then, we have $T^s(\phi_+, a_w) \le \phi_+$ and $T^s(\phi_-, a_w) \ge \phi_-$.

Proof. We have $\phi_+ - T^s(\phi_+, a_w) = (A_L^s)^{-1}\big[A_L^s \phi_+ + f_w^s(\phi_+)\big]$, which is nonnegative since $\phi_+$ is a super-solution and $(A_L^s)^{-1}$ is linear and monotone increasing. The proof of the other inequality is completely analogous.

Since we are no longer using normal order cones, our non-empty, convex, closed interval $[\phi_-, \phi_+]_{s,p}$ is not necessarily bounded as a subset of $W^{s,p}$. Therefore, we also need a priori bounds in the norm on $W^{s,p}$ to ensure the Picard iterates stay inside the intersection of the interval with the closed ball $B_M$ in $W^{s,p}$ of radius $M$, centered at the origin. We first establish a lemma to this effect that will be useful for both the non-CMC and CMC cases.

Lemma 6 (Invariance of T on the ball $B_M$). Let the conditions of Lemma 3 hold, and let $a_w \in W^{s-2,p}$. Then, for any $\tilde{s} \in (\tfrac{3}{p}, s]$ and for some $t \in (\tfrac{3}{p}, \tilde{s})$ there exists a closed ball $B_M \subset W^{\tilde{s},p}$ of radius $M = O\big([1 + \|a_w\|_{s-2,p}]^{\tilde{s}/(\tilde{s}-t)}\big)$, such that
$$\phi \in [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \quad \Rightarrow \quad T^s(\phi, a_w) \in B_M.$$
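Proof. From Lemma 3, there exist $t \in (\tfrac{3}{p}, \tilde{s})$ and $K > 0$ such that
$$\|T^s(\phi, a_w)\|_{\tilde{s},p} \le K \big(1 + \|a_w\|_{s-2,p}\big) \|\phi\|_{t,p}, \qquad \forall \phi \in [\phi_-, \phi_+]_{\tilde{s},p}.$$
For any $\varepsilon > 0$, the norm $\|\phi\|_{t,p}$ can be bounded by the interpolation estimate
$$\|\phi\|_{t,p} \le \varepsilon \|\phi\|_{\tilde{s},p} + C \varepsilon^{-t/(\tilde{s}-t)} \|\phi\|_p,$$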
where $C$ is a constant independent of $\varepsilon$. Since $\phi$ is bounded from above by $\phi_+$, $\|\phi\|_p$ is bounded uniformly, and now demanding that $\phi \in B_M$, we get
$$\|T^s(\phi, a_w)\|_{\tilde{s},p} \le K \big[1 + \|a_w\|_{s-2,p}\big] \big[ M\varepsilon + C \varepsilon^{-t/(\tilde{s}-t)} \big], \tag{4.17}$$
with a possibly different constant $C$. Choosing $\varepsilon$ such that $2\varepsilon K [1 + \|a_w\|_{s-2,p}] = 1$ and setting $M = 2KC[1 + \|a_w\|_{s-2,p}]\,\varepsilon^{-t/(\tilde{s}-t)}$, we can ensure that the right-hand side of (4.17) is bounded by $M$.

5. Barriers for the Hamiltonian Constraint

The results developed in Sect. 4.2 for a particular fixed-point map $T$ for analyzing the Hamiltonian constraint equation and the coupled system rely on the existence of generalized (weak) sub- and super-solutions, or barriers. There, the Hamiltonian constraint was studied in isolation from the momentum constraint, and these generalized barriers only needed to satisfy the conditions given at the beginning of Sect. 4.2 for a given fixed function $w$ appearing as a source term in the nonlinearity of the Hamiltonian constraint. Therefore, these types of barriers are sometimes referred to as local barriers, in that the coupling to the momentum constraint is ignored. In order to establish existence results for the coupled system in the non-CMC case, it will be critical that the sub- and super-solutions satisfy one additional property that now reflects the coupling, giving rise to the term global barriers. It will be useful now to define this global property precisely.

Definition 1. A sub-solution $\phi_-$ is called global iff it is a sub-solution of (2.23) for all vector fields $w_\phi$ solution of (2.24) with source function $\phi \in [\phi_-, \infty) \cap W^{s,p}$. A super-solution $\phi_+$ is called global iff it is a super-solution of (2.23) for all vector fields $w_\phi$ solution of (2.24) with source function $\phi \in (0, \phi_+] \cap W^{s,p}$. A pair $\phi_- \le \phi_+$ of sub- and super-solutions is called an admissible pair if $\phi_-$ and $\phi_+$ are sub- and super-solutions of (2.23) for all vector fields $w_\phi$ of (2.24) with source function $\phi \in [\phi_-, \phi_+] \cap W^{s,p}$.
It is obvious that if $\phi_-$ and $\phi_+$ are respectively global sub- and super-solutions, then the pair $\phi_-, \phi_+$ is admissible in the sense above, provided they satisfy the compatibility condition (4.12). Below we give a number of (local and global) sub- and super-solution constructions for closed manifolds; analogous constructions for compact manifolds with boundary are given in [21]. These constructions are based on generalizing known constant sub- and super-solution constructions given previously in the literature for closed manifolds. On one hand, the generalized global sub-solution constructions appearing here and in [21] do not require the near-CMC condition, inheriting this property from the known sub-solutions from the literature on which they are based. On the other hand, all previously known global super-solutions for the Hamiltonian constraint equation have required the near-CMC condition. Here and in [21,22], one of our primary interests is in developing existence results for weak (and strong) non-CMC solutions to the coupled system which are free of the near-CMC assumption. This assumption had appeared in two distinct places in all prior literature on this problem [1,26]; the first assumption appears in the construction of a fixed-point argument based on strict k-contractions, and the second assumption appears in the construction of global super-solutions. Here and in [21,22], an alternative fixed-point framework based on compactness arguments rather than k-contractions is used
to remove the first of these near-CMC assumptions. In this section, we give some new constructions of global super-solutions that are free of the near-CMC assumption, along with some compatible sub-solutions. These sub- and super-solution constructions are needed (without their global property) for the existence result for the Hamiltonian constraint (Theorem 3), and they are also needed (now with their global property) for the general fixed-point result for the coupled system (Theorem 5), leading to our two main non-CMC results (Theorems 1 and 2). The super-solutions in Lemmata 7(b) and 9 appear to be the first such near-CMC-free constructions, and provide the second key piece of the puzzle we need in order to establish non-CMC results through Theorem 5 without the near-CMC condition.

Throughout this section, we will assume that the background metric $h$ belongs to $W^{s,p}$ with $p \in (1, \infty)$ and $s \in (\tfrac{3}{p}, \infty) \cap (1, 2]$. Recall that $r = \frac{3p}{3 + (2-s)p}$, so that the continuous embedding $L^r \hookrightarrow W^{s-2,p}$ holds. Given a symmetric two-index tensor $\sigma \in L^{2r}$ and a vector field $w \in W^{1,2r}$, introduce the functions $a_\sigma = \tfrac{1}{8}\sigma^2 \in L^r$ and $a_{\mathcal{L}w} = \tfrac{1}{8}(\mathcal{L}w)^2 \in L^r$. Note that under these conditions $a_w$ belongs to $L^r \hookrightarrow W^{s-2,p}$, and that if $a_\sigma, a_{\mathcal{L}w} \in L^\infty$ we have the pointwise estimate
$$a_w^\wedge \le 2 a_\sigma^\wedge + 2 a_{\mathcal{L}w}^\wedge.$$
Here and in what follows, given any scalar function $u \in L^\infty$, we use the notation
$$u^\wedge := \operatorname{ess\,sup} u, \qquad u^\vee := \operatorname{ess\,inf} u.$$
In some places we will assume that when the vector field $w \in W^{1,2r}$ is given by the solution of the momentum constraint equation (2.24) (or (4.2)) with the source term $\phi \in W^{s,p}$,
$$a_{\mathcal{L}w}^\wedge \le k(\phi) := k_1 \|\phi\|_\infty^{12} + k_2, \tag{5.1}$$
with some positive constants $k_1$ and $k_2$. We can verify this assumption e.g. when the conditions of Corollary 1 are satisfied, since from Corollary 1 we would get
$$a_{\mathcal{L}w}^\wedge = \|\mathcal{L}w\|_\infty^2 \le C^2 \big( \|\phi\|_\infty^6 \|b_\tau\|_z + \|b_j\|_{e-2,q} \big)^2,$$
giving the bound (5.1) with the constants
$$k_1 = 2C^2 \|b_\tau\|_z^2, \quad \text{and} \quad k_2 = 2C^2 \|b_j\|_{e-2,q}^2. \tag{5.2}$$
5.1. Constant barriers. Now we will present some global sub- and super-solutions for the Hamiltonian constraint equation (2.23) which are constant functions. The proofs essentially follow the arguments in [21] for the case of compact manifolds with boundary.

Lemma 7 (Global super-solution). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Assume that the estimate (5.1) holds for the solution of the momentum constraint equation, and assume that $a_\rho, a_\sigma \in L^\infty$ and that $a_R$ is uniformly bounded from below. With the parameter $\varepsilon > 0$ to be chosen later, define the rational polynomial
$$q_\varepsilon(\chi) = (a_\tau^\vee - K_{1\varepsilon})\,\chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - K_{2\varepsilon}\,\chi^{-7},$$
where $K_{1\varepsilon} := (1 + \tfrac{1}{\varepsilon}) k_1$ and $K_{2\varepsilon} := (1 + \varepsilon) a_\sigma^\wedge + (1 + \tfrac{1}{\varepsilon}) k_2$. We distinguish the following two cases:
(a) In case $k_1 < a_\tau^\vee$, choose $\varepsilon > \frac{k_1}{a_\tau^\vee - k_1}$. If $q_\varepsilon$ has a root, let $\phi_+ = \phi_1(a_\tau^\vee - K_{1\varepsilon}, a_R^\vee, a_\rho^\wedge, K_{2\varepsilon})$ be the largest positive root of $q_\varepsilon$, and if $q_\varepsilon$ has no positive roots, let $\phi_+ = 1$. Now, the constant $\phi_+$ is a global super-solution of the Hamiltonian constraint equation (2.23).
(b) In case $k_1 \ge a_\tau^\vee$, choose $\varepsilon > 0$. In addition, assume that $a_R^\vee > 0$ and that both $a_\rho^\wedge$ and $K_{2\varepsilon}$ are sufficiently small, such that $q_\varepsilon$ has two positive roots. Then, the largest root $\phi_+ = \phi_2(a_\tau^\vee - K_{1\varepsilon}, a_R^\vee, a_\rho^\wedge, K_{2\varepsilon})$ of $q_\varepsilon$ is a super-solution of the Hamiltonian constraint equation (2.23).
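Proof. We look for a super-solution among the constant functions. Let $\chi$ be any positive constant. Then we have
$$f(\chi, w) = a_\tau \chi^5 + a_R \chi - a_\rho \chi^{-3} - a_w \chi^{-7} \ge a_\tau^\vee \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - a_w^\wedge \chi^{-7}.$$
Given any $\varepsilon > 0$, the inequality $2|\sigma_{ab}(\mathcal{L}w)^{ab}| \le \varepsilon \sigma^2 + \tfrac{1}{\varepsilon}(\mathcal{L}w)^2$ implies that
$$8 a_w = \sigma^2 + (\mathcal{L}w)^2 + 2\sigma_{ab}(\mathcal{L}w)^{ab} \le (1 + \varepsilon)\,\sigma^2 + (1 + \tfrac{1}{\varepsilon})\,(\mathcal{L}w)^2,$$
hence, taking into account (5.1), for any $w \in W^{1,2r}$ that is a solution of the momentum constraint equation (2.24) with any source term $\phi \in (0, \chi]$, the constant $a_w^\wedge$ must fulfill the inequality
$$a_w^\wedge \le (1 + \varepsilon) a_\sigma^\wedge + (1 + \tfrac{1}{\varepsilon}) a_{\mathcal{L}w}^\wedge \le K_{1\varepsilon} \|\phi\|_\infty^{12} + K_{2\varepsilon}. \tag{5.3}$$
Thus, for any constant $\chi > 0$ and all $\phi \in (0, \chi]$, it holds that
$$f(\chi, w_\phi) \ge a_\tau^\vee \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - \big[K_{1\varepsilon} \|\phi\|_\infty^{12} + K_{2\varepsilon}\big] \chi^{-7} \ge B_\varepsilon \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - K_{2\varepsilon} \chi^{-7},$$
where $B_\varepsilon := a_\tau^\vee - K_{1\varepsilon}$. Introduce the rational polynomial in $\chi$ given by
$$q_\varepsilon(\chi) := B_\varepsilon \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - K_{2\varepsilon} \chi^{-7}. \tag{5.4}$$
We calculate the first and second derivatives of $q_\varepsilon$ as
$$q_\varepsilon'(\chi) = 5 B_\varepsilon \chi^4 + a_R^\vee + 3 a_\rho^\wedge \chi^{-4} + 7 K_{2\varepsilon} \chi^{-8}, \qquad q_\varepsilon''(\chi) = 20 B_\varepsilon \chi^3 - 12 a_\rho^\wedge \chi^{-5} - 56 K_{2\varepsilon} \chi^{-9}. \tag{5.5}$$
Consider the case (a). In this case, because of the choice $\varepsilon > \frac{k_1}{a_\tau^\vee - k_1}$, we have $B_\varepsilon > 0$, and so $q_\varepsilon(\chi) > 0$ for sufficiently large $\chi$, and $q_\varepsilon$ is increasing. The function $q_\varepsilon$ has no positive root only if $a_\rho^\wedge = K_{2\varepsilon} = 0$. So if $q_\varepsilon$ has no positive root, $q_\varepsilon(\chi) \ge 0$ for all $\chi \ge 0$. If $q_\varepsilon$ has at least one positive root, denoting by $\phi_1$ the largest positive root, $q_\varepsilon(\chi) \ge 0$ for all $\chi \ge \phi_1$. Recalling now that any constant $\chi$ satisfies $A_L \chi = 0$, we conclude that
$$A_L \chi + f(\chi, w_\phi) \ge 0 \qquad \forall\, \chi \ge \phi_1,\ \forall\, \phi \in (0, \chi],$$
implying that $\phi_+$ is a global super-solution of the Hamiltonian constraint (2.21). For the case (b), since $B_\varepsilon < 0$ and $a_\rho^\wedge$ and $K_{2\varepsilon}$ are nonnegative, the first derivative $q_\varepsilon'(\chi)$ is strictly decreasing for $\chi > 0$, and since $q_\varepsilon'(\chi) > 0$ for sufficiently small $\chi > 0$ and $q_\varepsilon'(\chi) < 0$ for sufficiently large $\chi > 0$, the derivative $q_\varepsilon'$ has a unique positive root,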
at which the polynomial $q_\varepsilon$ attains its maximum over $(0, \infty)$. This maximum is positive if both $a_\rho^\wedge$ and $K_{2\varepsilon}$ are sufficiently small, and hence the polynomial $q_\varepsilon$ has two positive roots $\phi_1 \le \phi_2$. Similarly to the above we conclude that
$$A_L \chi + f(\chi, w_\phi) \ge 0 \qquad \forall\, \chi \in [\phi_1, \phi_2],\ \forall\, \phi \in (0, \chi],$$
implying that $\phi_+$ is a global super-solution of the Hamiltonian constraint (2.21).
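The constant in case (a) of Lemma 7 can be computed numerically. The following is a minimal sketch, not part of the original argument: it assumes that the essential bounds $a_\tau^\vee$, $a_R^\vee$, $a_\rho^\wedge$, $a_\sigma^\wedge$ and the constants $k_1$, $k_2$ are already known as plain numbers, and the helper names `q_eps` and `constant_super_solution` are hypothetical. With $y = \chi^4$, the product $\chi^7 q_\varepsilon(\chi)$ becomes the cubic $B_\varepsilon y^3 + a_R^\vee y^2 - a_\rho^\wedge y - K_{2\varepsilon}$, which for $B_\varepsilon > 0$ has at most one positive root by Descartes' rule of signs, so plain bisection is enough.

```python
def q_eps(chi, B, a_R, a_rho, K2):
    """q_eps(chi) = B chi^5 + a_R chi - a_rho chi^-3 - K2 chi^-7, cf. (5.4)."""
    return B * chi**5 + a_R * chi - a_rho * chi**-3 - K2 * chi**-7


def constant_super_solution(a_tau_min, a_R_min, a_rho_max, a_sigma_max, k1, k2, eps):
    """Constant phi_+ of Lemma 7(a); illustrative helper with assumed scalar inputs."""
    K1 = (1.0 + 1.0 / eps) * k1
    K2 = (1.0 + eps) * a_sigma_max + (1.0 + 1.0 / eps) * k2
    B = a_tau_min - K1
    if B <= 0:
        raise ValueError("case (a) needs eps > k1 / (a_tau_min - k1), i.e. B_eps > 0")

    # p(y) = chi^7 * q_eps(chi) with y = chi^4: a cubic with the same positive roots.
    def p(y):
        return ((B * y + a_R_min) * y - a_rho_max) * y - K2

    # Cauchy bound: all roots of p lie strictly below `hi` in modulus, so p(hi) > 0.
    hi = 1.0 + max(abs(a_R_min), a_rho_max, K2) / B
    lo = 0.0  # p(0) = -K2 <= 0
    for _ in range(200):  # bisection keeping p(lo) <= 0 < p(hi)
        mid = 0.5 * (lo + hi)
        if p(mid) > 0:
            hi = mid
        else:
            lo = mid
    if hi < 1e-12:
        # p > 0 on (0, infinity): no positive root, so Lemma 7(a) takes phi_+ = 1.
        return 1.0
    return hi ** 0.25
```

For instance, with $a_\tau^\vee = 1$, $a_\rho^\wedge = 16$ and all other inputs zero, $q_\varepsilon(\chi) = \chi^5 - 16\chi^{-3}$ and the routine returns approximately $\sqrt{2}$. The threshold `1e-12` used to detect the no-root case is an arbitrary choice made for this sketch.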
Case (a) of the above lemma has the condition $k_1 < a_\tau^\vee$, which is the near-CMC condition. This condition seems to be present in all non-CMC results to date. The above condition also requires that the extrinsic mean curvature $\tau$ is nowhere zero. Noting that there are solutions even for $\tau \equiv 0$ in some cases (cf. [25]), the condition $\inf \tau > 0$ appears as a rather strong restriction. We see that case (b) of the above lemma removes this restriction, in exchange for the smallness conditions on $\rho$, $j$, and $\sigma$. We also need the scalar curvature to be strictly positive, a condition that is relaxed in the next subsection to allow any metric in the positive Yamabe class. In the following lemma, we list some constant sub-solutions. They impose considerable restrictions on the allowable data, which is the main reason to consider non-constant sub-solutions in the next subsection.

Lemma 8 (Global sub-solution). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Assume that $a_\tau \in L^\infty$ and that $a_R$ is uniformly bounded from above. We distinguish the following three cases:
(a) If $a_R^\wedge < 0$, then the unique positive root of the polynomial $q(\chi) = a_\tau^\wedge \chi^4 + a_R^\wedge$ is a global sub-solution of (2.23).
(b) If $a_\rho^\vee > 0$, then the unique positive root of the polynomial $q_\rho(\chi) = a_\tau^\wedge \chi^8 + \max\{1, a_R^\wedge\}\,\chi^4 - a_\rho^\vee$ is a global sub-solution of (2.23).
(c) Let $\phi_+ > 0$ be a global super-solution of the Hamiltonian constraint. Let $a_\sigma^\vee > k(\phi_+)$, where $k$ is as in (5.1). Then, with some $\varepsilon \in (k(\phi_+)/a_\sigma^\vee, 1)$, the unique positive root $\phi_-$ of the polynomial $q_\sigma(\chi) = a_\tau^\wedge \chi^{12} + \max\{1, a_R^\wedge\}\,\chi^8 - K_\varepsilon$, where $K_\varepsilon := (1 - \varepsilon) a_\sigma^\vee - (\tfrac{1}{\varepsilon} - 1)\, k(\phi_+)$, is a global sub-solution of (2.23).

Proof. For the proof of (a,b), see e.g. [21]. We give a proof of (c) here. Let $\chi > 0$ be any constant function, and let $w \in W^{1,2r}$. Then we have
$$f(\chi, w) = a_\tau \chi^5 + a_R \chi - a_\rho \chi^{-3} - a_w \chi^{-7} \le a_\tau^\wedge \chi^5 + a_R^\wedge \chi - a_w^\vee \chi^{-7} \le a_\tau^\wedge \chi^5 + C\chi - a_w^\vee \chi^{-7}, \tag{5.6}$$
where we have used that $a_\rho$ is nonnegative, and introduced the constant $C = \max\{1, a_R^\wedge\}$. Given any $\varepsilon > 0$, the inequality $2|\sigma_{ab}(\mathcal{L}w)^{ab}| \le \varepsilon \sigma^2 + \tfrac{1}{\varepsilon}(\mathcal{L}w)^2$ implies that
$$8 a_w = \sigma^2 + (\mathcal{L}w)^2 + 2\sigma_{ab}(\mathcal{L}w)^{ab} \ge (1 - \varepsilon)\,\sigma^2 - (\tfrac{1}{\varepsilon} - 1)\,(\mathcal{L}w)^2,$$
hence, taking into account (5.1), for any $w \in W^{1,2r}$ that is a solution of the momentum constraint equation (2.24) with any source term $\phi \in (0, \phi_+]$, the constant $a_w^\vee$ must fulfill the inequality
$$a_w^\vee \ge (1 - \varepsilon) a_\sigma^\vee - (\tfrac{1}{\varepsilon} - 1) a_{\mathcal{L}w}^\wedge \ge (1 - \varepsilon) a_\sigma^\vee - (\tfrac{1}{\varepsilon} - 1) k(\phi_+) =: K_\varepsilon.$$
We use the above estimate in (5.6) to get, for any $w \in W^{1,2r}$ that is a solution of the momentum constraint equation (2.24) with any source term $\phi \in (0, \phi_+]$,
$$f(\chi, w) \le a_\tau^\wedge \chi^5 + C\chi - K_\varepsilon \chi^{-7}.$$
Because of the choice $k(\phi_+)/a_\sigma^\vee < \varepsilon < 1$, we have $K_\varepsilon > 0$. So with the unique positive root $\chi_*$ of $q_\sigma(\chi) := a_\tau^\wedge \chi^5 + C\chi - K_\varepsilon \chi^{-7}$, we have $q_\sigma(\chi) \le 0$ for any constant $\chi \in (0, \chi_*]$, establishing the proof.
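To see how constant barriers interact with the shifted Picard map of Sect. 4.2, it may help to look at a toy case in which all coefficients are constants and only constant functions are iterated. This is an illustrative sketch under those assumptions, not the construction used in the paper, and the function names are hypothetical. On constants $A_L \chi = 0$, so $A_L^s \chi = a_s \chi$ and (4.15) reduces to $T^s(\chi) = \chi - f_w(\chi)/a_s$, with the shift $a_s$ taken from Lemma 4.

```python
def f_w(chi, a_tau, a_R, a_rho, a_w):
    """Nonlinearity (4.8) evaluated at a positive constant chi."""
    return a_tau * chi**5 + a_R * chi - a_rho * chi**-3 - a_w * chi**-7


def shift(phi_minus, phi_plus, a_tau, a_R, a_rho, a_w):
    """Shift a_s of Lemma 4, specialised to constant coefficients (an assumption)."""
    return (max(1.0, a_R)
            + 3.0 * phi_plus**2 / phi_minus**6 * a_rho
            + 5.0 * phi_plus**4 * a_tau
            + 7.0 * phi_plus**6 / phi_minus**14 * a_w)


def picard_iterates(phi_minus, phi_plus, a_tau, a_R, a_rho, a_w, steps=3000):
    """Iterate chi -> T^s(chi) = chi - f_w(chi) / a_s, starting at the super-solution."""
    a_s = shift(phi_minus, phi_plus, a_tau, a_R, a_rho, a_w)
    chi = phi_plus
    iterates = [chi]
    for _ in range(steps):
        chi = chi - f_w(chi, a_tau, a_R, a_rho, a_w) / a_s
        iterates.append(chi)
    return iterates
```

With $a_\tau = a_\rho = 1$, $a_R = a_w = 0$, the constants $\phi_- = 0.5$ and $\phi_+ = 2$ are a sub- and a super-solution of $\chi^5 - \chi^{-3} = 0$, and `picard_iterates(0.5, 2.0, 1.0, 0.0, 1.0, 0.0)` produces a non-increasing sequence that stays in $[\phi_-, \phi_+]$ and approaches the solution $\chi = 1$, in line with Lemmas 4 and 5. The large shift makes the convergence slow, which is why the default number of steps is generous.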
5.2. Non-constant barriers. All global super-solutions found to date appear to require the near-CMC condition; Lemma 7(b) avoids the near-CMC condition, but it requires the scalar curvature to be strictly positive. The following lemma extends this result to arbitrary metrics in the positive Yamabe class $Y^+(M)$.

Lemma 9 (Global super-solution $h \in Y^+$). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in $Y^+(M)$. Assume there exist continuous positive functions $u, \Lambda \in W^{s,p}$ that together satisfy:
$$-\Delta u + \tfrac{1}{8} R u = \Lambda > 0, \qquad u > 0. \tag{5.7}$$
Let $0 < k_3 := u^\wedge / u^\vee < \infty$, which is a trivially satisfied Harnack-type inequality. Assume that the estimate (5.1) is satisfied for the solution of the momentum constraint equation for two positive constants $k_1$ and $k_2$, and assume that $a_\rho, a_\sigma \in L^\infty$. If the constants $a_\rho^\wedge$, $a_\sigma^\wedge$, and $k_2$ are sufficiently small, then
$$\phi_+ = \beta u, \qquad \beta = \left[ \frac{\Lambda^\vee}{2 k_1 k_3^{12} (u^\wedge)^5} \right]^{1/4} > 0, \tag{5.8}$$
is a positive global super-solution to the Hamiltonian constraint equation.

Proof. Taking $\phi = \beta u$ with a constant $\beta > 0$ in (5.7) gives
$$-\Delta \phi + a_R \phi = \beta\big({-\Delta u} + \tfrac{1}{8} R u\big) = \beta \Lambda. \tag{5.9}$$
Then for any $\varphi \in C_+^\infty$, by using (5.3) with $K_1 := 2k_1$ and $K_2 := 2a_\sigma^\wedge + 2k_2$, we infer
$$\langle A_L \phi + f(\phi, w), \varphi \rangle = \langle \nabla\phi, \nabla\varphi \rangle + \langle a_R \phi + a_\tau \phi^5 - a_\rho \phi^{-3} - a_w \phi^{-7}, \varphi \rangle$$
$$\ge \langle \beta\Lambda + a_\tau^\vee \phi^5 - [K_1 (\phi^\wedge)^{12} + K_2]\,\phi^{-7} - a_\rho^\wedge \phi^{-3}, \varphi \rangle$$
$$\ge \langle \beta\Lambda + [a_\tau^\vee - K_1 k_3^{12}]\,\phi^5 - K_2 \phi^{-7} - a_\rho^\wedge \phi^{-3}, \varphi \rangle$$
$$\ge \langle \beta\, G(\beta, K_2, a_\rho), \varphi \rangle,$$
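where
$$G(\beta, K_2, a_\rho) := \Lambda^\vee - K_1 k_3^{12} \beta^4 (u^\wedge)^5 - K_2 \beta^{-8} (u^\wedge)^{-7} - a_\rho^\wedge \beta^{-4} (u^\wedge)^{-3},$$
and where we have used the fact that $\phi^\wedge / \phi^\vee = u^\wedge / u^\vee = k_3$. Therefore, to ensure $\phi$ is a super-solution we must now pick arguments ensuring $G(\beta, K_2, a_\rho) \ge 0$. We first pick $\beta$ as in (5.8), giving
$$\tfrac{1}{2} \Lambda^\vee = \Lambda^\vee - K_1 k_3^{12} (u^\wedge)^5 \beta^4 > 0.$$
For this fixed $\beta$, we then pick $K_2$ and $a_\rho^\wedge$, each sufficiently small, so that
$$\tfrac{1}{2} \Lambda^\vee - K_2 \beta^{-8} (u^\wedge)^{-7} - a_\rho^\wedge \beta^{-4} (u^\wedge)^{-3} \ge 0.$$
The result then follows.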
Remark 3. We now make some remarks about the existence of a pair of positive functions $(u, \Lambda)$ which satisfy the hypotheses of Lemma 9. Let the background metric $h_{ab} \in W^{s,p}$ be in the positive Yamabe class. Then in Theorem 11 in Appendix A.7, for the sub-critical range $1 \le q < 5$ we establish the existence of a positive $u \in W^{s,p}$ and a constant $\mu_q > 0$ satisfying $-8\Delta u + R u = \mu_q u^q$. So the pair $(u, \tfrac{1}{8}\mu_q u^q)$ readily satisfies (5.7). In a sense the simplest construction of the near-CMC-free global super-solution in Lemma 9 arises by taking $q = 1$; one is then simply using the first eigenfunction of the conformal Laplacian to build the global super-solution. Alternatively, one can consider a solution to the Yamabe problem $-8\Delta u + R u = u^5$, $u > 0$, which exists for sufficiently smooth metrics in the positive Yamabe class, cf. [31]. This approach is taken for simplicity in [22]. In any case, note that the function $u > 0$ that satisfies (5.7) is the conformal factor which transforms $h_{ab}$ into a metric with scalar curvature $R_u = 8\Lambda u^{-5} > 0$.

We remark that without the near-CMC condition, the only potentially strictly positive term appearing in the nonlinearity of the Hamiltonian constraint is the term involving the scalar curvature $R$. Therefore, global super-solution constructions based on the approach in Lemma 9 are restricted to data in $Y^+(M)$. We extend this observation in the next lemma, which essentially says that in a nonpositive Yamabe class, there is no way to build a positive global super-solution without the near-CMC condition as long as we use a global estimate of type (5.1).

Lemma 10 (Near-CMC condition and $a_w$ bounds). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in a nonpositive Yamabe class, and let $a_\tau$ be continuous. Let $\phi_+ \in W^{s,p}$ with $\phi_+ > 0$ be a global super-solution to the Hamiltonian constraint equation. We assume that any vector field $w \in W^{1,2r}$ that is a solution of the momentum constraint equation with a source $\phi \le \phi_+$ satisfies the estimate
$$a_w \le K_1 \|\phi_+\|_\infty^{12} + K_2, \tag{5.10}$$
with some positive constants $K_1$ and $K_2$. Moreover, we assume that this estimate is sharp in the sense that for any $x \in M$ there exist an open neighborhood $U \ni x$ and a vector field $w \in W^{1,2r}$, a solution of the momentum constraint equation with a source $\phi \le \phi_+$, such that
$$a_w = K_1 \|\phi_+\|_\infty^{12} + K_2 \quad \text{in } U. \tag{5.11}$$
Then, we have $K_1 \le \sup_M a_\tau$.

Proof. Since the metric is in a nonpositive Yamabe class, there exists $\tilde\varphi \in W_+^{2-s,p}$ such that $\langle \nabla\phi_+, \nabla\tilde\varphi \rangle + \langle a_R \phi_+, \tilde\varphi \rangle \le 0$. The collection of all neighborhoods in (5.11) will form an open cover of $M$, and let $\{U_i\}$ be one of its finite subcovers. Let $\{\mu_i\}$ be a partition of unity subordinate to $\{U_i\}$. Then, by writing $\tilde\varphi = \sum_i \mu_i \tilde\varphi$, we can expand the expression $\langle \nabla\phi_+, \nabla\tilde\varphi \rangle + \langle a_R \phi_+, \tilde\varphi \rangle$ into a finite sum, which has at least one non-positive term. Without loss of generality, let us assume $\langle \nabla\phi_+, \nabla\varphi \rangle + \langle a_R \phi_+, \varphi \rangle \le 0$ with $\varphi = \mu_i \tilde\varphi$. With $w \in W^{1,2r}$ being a vector field that satisfies (5.11) with respect to $U := U_i$, we have
$$0 \le \langle \nabla\phi_+, \nabla\varphi \rangle + \langle a_R \phi_+ + a_\tau \phi_+^5 - a_w \phi_+^{-7} - a_\rho \phi_+^{-3}, \varphi \rangle$$
$$\le \langle a_\tau \phi_+^5 - a_w \phi_+^{-7} - a_\rho \phi_+^{-3}, \varphi \rangle$$
$$= \langle a_\tau \phi_+^5 - [K_1 (\phi_+^\wedge)^{12} + K_2]\,\phi_+^{-7} - a_\rho \phi_+^{-3}, \varphi \rangle$$
$$\le \langle [a_\tau - K_1 (\phi_+^\wedge / \phi_+)^{12}]\,\phi_+^5, \varphi \rangle.$$
Using partitions of unity we can make the support of $\varphi$ arbitrarily small, from which we conclude that $a_\tau \ge K_1 (\phi_+^\wedge / \phi_+)^{12} \ge K_1$ at some $x \in M$.

All of the subsequent barrier constructions below are more or less known. A number of the more technically sophisticated construction techniques we employ below were pioneered by Maxwell in [33]. For completeness, we first construct local super-solutions and then global super-solutions for the near-CMC case.

Lemma 11 (Local super-solution). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Let $a_\tau, a_\rho, a_w \in W_+^{s-2,p}$, and let one of the following conditions hold:
(a) The metric $h$ is in a non-negative Yamabe class, $a_\tau \ne 0$, and $a_\rho + a_w \ne 0$.
(b) The metric $h$ is in the positive Yamabe class, and $a_\rho + a_w \ne 0$.
(c) The metric $h$ is conformally equivalent to a metric with scalar curvature $(-a_\tau) \ne 0$, thus in particular the metric is in the negative Yamabe class.
Then, there is a positive (local) super-solution $\phi_+ \in W^{s,p}$ of the Hamiltonian constraint equation (2.23).

Proof. First we prove (a) and (b). Let $u \in W^{s,p}$ be a (weak) solution to $-\Delta u + \tfrac{1}{8} R u = \lambda u$, $u > 0$, with a constant $\lambda \ge 0$, which exists by Theorem 11 in Appendix A.7, and let $v \in W^{s,p}$ be the solution to
$$\langle u^2 \nabla v, \nabla\varphi \rangle + \langle \lambda u^2 v + a_\tau v, \varphi \rangle = \langle a_\rho + a_w, \varphi \rangle, \qquad \forall \varphi \in C^\infty. \tag{5.12}$$
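Since $a_\tau, a_\rho, a_w \in W_+^{s-2,p}$ with $sp > 3$, we have $v \in W^{s,p} \hookrightarrow L^\infty$, and since $\lambda u^2 + a_\tau \ne 0$ and $a_\rho + a_w \ne 0$, Lemma 35 (maximum principle) in Appendix A.6 implies that $v > 0$. Let us define $\phi = \beta u v \in W^{s,p}$ for a constant $\beta > 0$. Then for any $\varphi \in C_+^\infty$ we have
$$\langle A_L \phi + f(\phi, w), u\varphi \rangle = \langle \nabla\phi, \nabla(u\varphi) \rangle + \langle a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3} - a_w \phi^{-7}, u\varphi \rangle$$
$$= \beta \langle u^2 \nabla v, \nabla\varphi \rangle + \langle \beta\lambda u^2 v + a_\tau u \phi^5 - a_\rho u \phi^{-3} - a_w u \phi^{-7}, \varphi \rangle$$
$$= \langle a_\tau [\beta^5 u^6 v^5 - \beta v], \varphi \rangle + \langle a_\rho [\beta - \beta^{-3} u^{-2} v^{-3}], \varphi \rangle + \langle a_w [\beta - \beta^{-7} u^{-6} v^{-7}], \varphi \rangle,$$
where the second line is obtained by
$$\langle A_L \phi + a_R \phi, u\varphi \rangle = \beta \langle \nabla(uv), \nabla(u\varphi) \rangle + \tfrac{\beta}{8} \langle R u v, u\varphi \rangle = \beta \langle \nabla u, \nabla(u v \varphi) \rangle + \tfrac{\beta}{8} \langle R u, u v \varphi \rangle + \beta \langle u \nabla v, u \nabla\varphi \rangle = \beta \langle \lambda u, u v \varphi \rangle + \beta \langle u^2 \nabla v, \nabla\varphi \rangle, \tag{5.13}$$
and the third line is from (5.12). Now, choosing $\beta > 0$ sufficiently large, so that $\beta^4 u^6 v^5 - v \ge 0$, $1 - \beta^{-4} u^{-2} v^{-3} \ge 0$ and $1 - \beta^{-8} u^{-6} v^{-7} \ge 0$, we ensure that $\phi$ is a super-solution.

Now, let us consider (c). Let $u > 0$ be the conformal factor which transforms $h$ into a metric with scalar curvature $\lambda = -8a_\tau$, i.e., let $u \in W^{s,p}$ be a weak solution to $-\Delta u + \tfrac{1}{8} R u + a_\tau u^5 = 0$, $u > 0$. If $a_\rho = a_w = 0$, the Hamiltonian constraint equation reduces to the above equation and we can take $u$ as a super-solution (it is even a solution). So we can assume in the following that $a_\rho + a_w \ne 0$. Let $v \in W^{s,p}$ be the solution to
$$\langle u^2 \nabla v, \nabla\varphi \rangle + \langle a_\tau v, \varphi \rangle = \langle a_\rho + a_w, \varphi \rangle, \qquad \forall \varphi \in C^\infty.$$
Defining $\phi = \beta u v \in W^{s,p}$ for a constant $\beta > 0$, the rest of the proof proceeds superficially in the same way as the above case.

Lemma 12 (Near-CMC global super-solution). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Let $a_\tau, a_\rho \in W_+^{s-2,p}$ and $a_\sigma \in L_+^\infty$, and let one of the following conditions hold:
(a) The metric $h$ is in a non-negative Yamabe class, $a_\tau \ne 0$, and $a_\rho + a_\sigma \ne 0$. Let $u \in W^{s,p}$ and $v \in W^{s,p}$ be the solutions to
$$-\Delta u + \tfrac{1}{8} R u = \lambda u, \qquad -\nabla(u^2 \nabla v) + (\lambda u^2 + a_\tau) v = a_\rho + a_\sigma \tag{5.14}$$
with a constant $\lambda \ge 0$.
(b) The metric $h$ is conformally equivalent to a metric with scalar curvature $(-a_\tau) \ne 0$, thus in particular the metric is in the negative Yamabe class. Let $u \in W^{s,p}$ and $v \in W^{s,p}$ be the solutions to
$$-\Delta u + \tfrac{1}{8} R u + a_\tau u^5 = 0, \qquad -\nabla(u^2 \nabla v) + a_\tau v = a_\rho + a_\sigma. \tag{5.15}$$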
Assume that the estimate (5.1) holds for the momentum constraint equation, and let $k_1 < a_\tau^\vee \left( \frac{\min uv}{\max uv} \right)^{12}$. Then, for any sufficiently large constant $\beta > 0$, $\phi_+ = \beta u v$ is a global super-solution of the Hamiltonian constraint equation (2.23).

Proof. We give a proof of (a). The proof of (b) is similar. Proceeding as in the proof of the preceding lemma, for any $\varphi \in C_+^\infty$ we have

$\langle A_L \phi + f(\phi, w), u\varphi \rangle = \langle \nabla\phi, \nabla(u\varphi) \rangle + \langle a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3} - a_w \phi^{-7}, u\varphi \rangle$
$= \beta \langle u^2 \nabla v, \nabla\varphi \rangle + \langle \beta\lambda u^2 v + a_\tau u \phi^5 - a_\rho u \phi^{-3} - a_w u \phi^{-7}, \varphi \rangle$
$\geq \beta \langle u^2 \nabla v, \nabla\varphi \rangle + \langle \beta\lambda u^2 v + a_\tau u \phi^5 - a_\rho u \phi^{-3} - 2[a_\sigma + a_{\mathcal{L}w}] u \phi^{-7}, \varphi \rangle$
$= \langle a_\rho [\beta - \beta^{-3} u^{-2} v^{-3}], \varphi \rangle + \langle a_\sigma [\beta - 2\beta^{-7} u^{-6} v^{-7}], \varphi \rangle + \langle a_\tau [\beta^5 u^6 v^5 - \beta v] - 2 a_{\mathcal{L}w} u \phi^{-7}, \varphi \rangle.$

Then, choosing $\beta$ sufficiently large, and by using (5.1), with $\theta = uv$ we infer

$A_L \phi + f(\phi, w) \geq [a_\tau^\vee (\theta^\vee)^5 - 2k_1 (\theta^\wedge)^{12} (\theta^\vee)^{-7}]\beta^5 - p(\beta),$

where $p(\beta) = a_\tau (v^\wedge/u^\vee)\beta + 2k_2 (\theta^\vee)^{-7} \beta^{-7}$. Now, if we have $k_1 < \frac{1}{2} a_\tau^\vee (\theta^\vee/\theta^\wedge)^{12}$, then choosing $\beta$ large enough, we ensure that $\phi$ is a super-solution. If we proceeded as in the proof of Lemma 7, we could remove the factor $\frac{1}{2}$ from the condition $k_1 < \frac{1}{2} a_\tau^\vee (\theta^\vee/\theta^\wedge)^{12}$; however, we omit it for clarity.

We now also give some examples of non-constant global sub-solutions $\phi_-$ which are compatible with $\phi_+$ above in the sense that $0 < \phi_- \leq \phi_+$. Such a pair of compatible sub- and super-solutions is needed to establish existence of solutions to the individual Hamiltonian constraint (Theorem 3), and is needed again to establish existence of solutions to the coupled system (Theorems 1 and 2).

Lemma 13 (Global sub-solution $h \notin Y^-$, $\rho \not\equiv 0$). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in a non-negative Yamabe class. Let $a_\rho, a_\tau \in W_+^{s-2,p} \setminus \{0\}$. Then, there exists a positive scalar function $\phi_- \in W^{s,p}$ such that for any constant $\beta \in (0, 1]$, $\beta\phi_-$ is a global sub-solution of the Hamiltonian constraint equation.

Proof. Let $u \in W^{s,p}$ be a (weak) solution to $-\Delta u + \frac{1}{8} R u = \lambda u$, $u > 0$, with a constant $\lambda \geq 0$, which exists by Theorem 11 in Appendix A.7, and let $v \in W^{s,p}$ be the solution to

$\langle u^2 \nabla v, \nabla\varphi \rangle + \langle \lambda u^2 v + a_\tau v, \varphi \rangle = \langle a_\rho, \varphi \rangle, \quad \forall \varphi \in C^\infty.$ (5.16)
Since $a_\rho, a_\tau \in W_+^{s-2,p}$ with $sp > 3$, we have $v \in W^{s,p} \hookrightarrow L^\infty$, and Lemma 35 (maximum principle) in Appendix A.6 implies that $v > 0$. Let us define $\phi = \beta u v \in W^{s,p}$ for a constant $\beta > 0$. Then for any $\varphi \in C_+^\infty$ we have

$\langle A_L \phi + f(\phi, w), u\varphi \rangle \leq \langle A_L \phi, u\varphi \rangle + \langle a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3}, u\varphi \rangle$
$= \beta \langle u^2 \nabla v, \nabla\varphi \rangle + \langle \beta\lambda u^2 v + a_\tau u^6 (\beta v)^5 - a_\rho u^{-2} (\beta v)^{-3}, \varphi \rangle$
$= \beta \langle a_\rho [1 - u^{-2} v^{-3} \beta^{-4}], \varphi \rangle + \beta \langle a_\tau [u^6 v^5 \beta^4 - v], \varphi \rangle,$

where the second line is obtained by (5.13), and the third line is from (5.16). Now, choosing $\beta > 0$ sufficiently small, so that $1 - u^{-2} v^{-3} \beta^{-4} \leq 0$ and $u^6 (\beta v)^4 - 1 \leq 0$, we ensure that $\phi$ is a sub-solution.

The following lemma extends Lemma 8(a) to all reasonable metrics in the negative Yamabe class.

Lemma 14 (Global sub-solution $h \in Y^-$). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in $Y^-(M)$. In addition, let $a_\tau \in W^{s-2,p}$, and let the metric $h$ be conformally equivalent to a metric with scalar curvature $(-a_\tau)$. Then, there exists a positive scalar function $\phi_- \in W^{s,p}$ such that for any $\beta \in (0, 1]$, $\beta\phi_-$ is a global sub-solution of the Hamiltonian constraint equation.

Proof. Let $u > 0$ be the conformal factor which transforms $h$ into a metric with scalar curvature $\lambda = -8a_\tau$, i.e., let $u \in W^{s,p}$ be a weak solution to $-\Delta u + \frac{1}{8} R u + a_\tau u^5 = 0$, $u > 0$. Taking $\phi = \beta u$ with a constant $\beta > 0$, we have

$A_L \phi + f(\phi, w) \leq A_L \phi + a_\tau \phi^5 + a_R \phi = -\beta\Delta u + a_\tau (\beta u)^5 + \frac{\beta}{8} R u = \beta a_\tau u^5 (\beta^4 - 1).$

By choosing $\beta \in (0, 1]$, we get the sub-solution.
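The scaling computation in the proof of Lemma 14 can be checked pointwise. The following is a minimal sketch, not part of the original argument: it assumes that at a given point $u$ satisfies $-\Delta u + \frac{1}{8} R u = -a_\tau u^5$, so the linear part of the operator applied to $\beta u$ is $-\beta a_\tau u^5$. The function names `scaled_residual` and `closed_form` are hypothetical and chosen only for illustration.

```python
import math


def scaled_residual(beta, a_tau, u):
    """Pointwise value of -Δφ + Rφ/8 + a_τ φ^5 for φ = βu.

    Assumption: u solves -Δu + Ru/8 + a_τ u^5 = 0 at this point,
    so (-Δu + Ru/8) can be replaced by -a_τ u^5.
    """
    linear_part = beta * (-a_tau * u**5)    # β(-Δu + Ru/8)
    nonlinear_part = a_tau * (beta * u)**5  # a_τ (βu)^5
    return linear_part + nonlinear_part


def closed_form(beta, a_tau, u):
    """The closed form β a_τ u^5 (β^4 - 1) stated in the proof."""
    return beta * a_tau * u**5 * (beta**4 - 1.0)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 1.0):
        for a_tau in (0.0, 0.3, 2.0):
            for u in (0.5, 1.0, 1.7):
                r = scaled_residual(beta, a_tau, u)
                c = closed_form(beta, a_tau, u)
                print(beta, a_tau, u, r, math.isclose(r, c, abs_tol=1e-12))
```

For $\beta \in (0, 1]$, $a_\tau \geq 0$ and $u > 0$ the value is non-positive, which is the sub-solution property used in the proof.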
The following lemma shows that the additional condition on the metric appearing in Lemma 14 is indeed not restrictive. It is worth noting that this next result can be viewed as an apparently new non-existence result in the context of the non-CMC constraints, which is interesting in its own right. This result was first proved in [33] for the case of $p = 2$; we just need to reinterpret it here in our setting. It states that for there to be a (CMC or non-CMC) solution to the Hamiltonian constraint, the background metric $h_{ab}$ must be conformally equivalent to a metric with scalar curvature equal to $(-a_\tau)$.

Lemma 15 (Non-existence $h \in Y^-$). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in $Y^-(M)$. Let $a_\tau \in W^{s-2,p}$, and let there exist a solution to the Hamiltonian constraint equation. Then, the metric $h$ is conformally equivalent to a metric with scalar curvature $(-a_\tau)$.

Proof. It suffices to show that the equation

$-\Delta\psi + \frac{1}{8} R \psi + a_\tau \psi^5 = 0,$ (5.17)

has a solution $\psi > 0$. Since the above equation is just a Hamiltonian constraint equation with $a_\rho = a_w = 0$, Theorem 3 establishes the proof upon constructing sub- and super-solutions to (5.17). Let $\phi > 0$ be a solution to the (general) Hamiltonian constraint equation. Then, since both $a_\rho$ and $a_w$ are non-negative, we have $-\Delta\phi + \frac{1}{8} R \phi + a_\tau \phi^5 \geq 0$, which means that $\phi$ is a super-solution to (5.17).
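Let $u \in W^{s,p}$ be a solution to $-\Delta u + \frac{1}{8} R u = -\lambda u$, $u > 0$, with a constant $\lambda > 0$, which exists by Theorem 11 in Appendix A.7, and with a real parameter $\varepsilon$, let $v_\varepsilon \in W^{s,p}$ be the solution to

$\langle u^2 \nabla v_\varepsilon, \nabla\varphi \rangle + \langle \lambda u^2 v_\varepsilon, \varphi \rangle = \langle \lambda u^2 - a_\tau \varepsilon, \varphi \rangle, \quad \forall \varphi \in C^\infty.$

We have $v_\varepsilon \equiv 1$ for $\varepsilon = 0$, and we have $v_\varepsilon \in W^{s,p} \hookrightarrow L^\infty$, so as $\varepsilon$ goes to 0, $v_\varepsilon$ tends to 1 uniformly. Let us fix $\varepsilon > 0$ such that $v_\varepsilon \geq \frac{1}{2}$. By taking $\psi = \beta u v_\varepsilon$ with a constant $\beta > 0$, and using (5.13), it holds for any $\varphi \in C_+^\infty$ that

$\langle \nabla\psi, \nabla(u\varphi) \rangle + \langle \frac{1}{8} R \psi + a_\tau \psi^5, u\varphi \rangle = \beta \langle u^2 \nabla v_\varepsilon, \nabla\varphi \rangle + \langle a_\tau u^6 (\beta v_\varepsilon)^5 - \beta\lambda u^2 v_\varepsilon, \varphi \rangle$
$= \beta \langle a_\tau (u^6 v_\varepsilon^5 \beta^4 - \varepsilon), \varphi \rangle + \beta\lambda \langle u^2 (1 - 2v_\varepsilon), \varphi \rangle.$

Now, by choosing $\beta > 0$ small enough, we can ensure that $\psi$ is a sub-solution of (5.17).

5.3. A priori $L^\infty$ bounds on $W^{1,2}$ solutions.

We now establish some related a priori $L^\infty$-bounds on any $W^{1,2}$-solution to the Hamiltonian constraint equation. Although such results are standard for semi-linear scalar problems with monotone nonlinearities (for example, see [29]), the nonlinearity appearing in the Hamiltonian constraint becomes non-monotone when $R$ becomes negative. Nonetheless, we are able to obtain a priori $L^\infty$-bounds on solutions to the Hamiltonian constraint in all cases including the non-monotone case. See [21] for an analogue of this result in the case of compact manifolds with boundary; in that case a more general result is possible.

Lemma 16 (Pointwise a priori bounds). Let $\phi \in W^{1,2}$ be any non-constant positive solution of the Hamiltonian constraint equation (2.23).
(a) Let $a_{\tau R}^\vee := \operatorname{ess\,inf}(a_\tau + a_R) > 0$, and let $a_\rho^\wedge$ and $a_w^\wedge$ be finite. Then, $\phi$ satisfies the a priori bound

$\phi^4 \leq \max\left\{ 1, \frac{a_\rho^\wedge + a_w^\wedge}{a_{\tau R}^\vee} \right\}.$

(b) Let $a_\tau^\vee > 0$ and let $a_\rho^\wedge$ and $a_w^\wedge$ be finite. Then, $\phi$ satisfies the a priori bound

$\phi^4 \leq \max\left\{ 1, \frac{\sqrt{(a_R^\vee)^2 + a_\tau^\vee (a_\rho^\wedge + a_w^\wedge)} - a_R^\vee}{a_\tau^\vee} \right\}.$

(c) Let $a_{\rho w}^\vee := \operatorname{ess\,inf}(a_\rho + a_w) > 0$, and let $a_\tau^\wedge$ be finite. Then, $\phi$ satisfies the a priori bound

$\phi^4 \geq \frac{a_{\rho w}^\vee}{\max\{ a_{\rho w}^\vee, \, a_R^\wedge + a_\tau^\wedge \}}.$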
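As an illustration of how the bound in part (a) of Lemma 16 behaves, here is a minimal numerical sketch. It assumes the bound has the form $(\phi^\wedge)^4 = \max\{1, (a_\rho^\wedge + a_w^\wedge)/a_{\tau R}^\vee\}$, as in the proof of part (a), and it checks that the scalar lower estimate $a_{\tau R}^\vee \chi - \chi^{-3}(a_\rho^\wedge + a_w^\wedge)$ is non-negative for constants $\chi \geq \phi^\wedge$. The function names and the sample coefficient values are hypothetical.

```python
def upper_bound_a(a_tauR_min, a_rho_max, a_w_max):
    """Return phi_max with phi_max^4 = max{1, (a_rho_max + a_w_max) / a_tauR_min}.

    a_tauR_min stands for ess inf (a_tau + a_R) and must be positive.
    """
    if a_tauR_min <= 0:
        raise ValueError("part (a) requires ess inf (a_tau + a_R) > 0")
    return max(1.0, (a_rho_max + a_w_max) / a_tauR_min) ** 0.25


def lower_estimate(chi, a_tauR_min, a_rho_max, a_w_max):
    """Scalar lower estimate a_tauR_min * chi - chi^(-3) * (a_rho_max + a_w_max)."""
    return a_tauR_min * chi - chi ** -3 * (a_rho_max + a_w_max)


if __name__ == "__main__":
    coeffs = (0.5, 3.0, 1.0)  # illustrative values only
    phi_max = upper_bound_a(*coeffs)
    for factor in (1.0, 1.5, 3.0):
        chi = factor * phi_max
        print(chi, lower_estimate(chi, *coeffs))
```

The estimate vanishes at $\chi = \phi^\wedge$ when the maximum is attained by the ratio, and is positive for larger $\chi$, which is why no solution can exceed $\phi^\wedge$ on a set of positive measure.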
Proof. We will only prove (a) since the other cases can be proven similarly. Let $\chi \in W^{1,2}$ be any function with $\chi \geq 1$. Then for $\varphi \in C_+^\infty$ we have

$\langle f_w(\chi), \varphi \rangle \geq (\chi^\vee)^5 \langle a_\tau, \varphi \rangle + \chi^\vee \langle a_R, \varphi \rangle - (\chi^\vee)^{-3} (a_\rho, \varphi) - (\chi^\vee)^{-7} (a_w, \varphi)$
$\geq \left( a_{\tau R}^\vee \chi^\vee - (\chi^\vee)^{-3} [a_\rho^\wedge + a_w^\wedge] \right) \|\varphi\|_1.$

So we conclude that

$\langle f_w(\chi), \varphi \rangle \geq 0, \quad \forall \chi \geq \phi^\wedge, \ \chi \in W^{1,2}, \quad \forall \varphi \in C_+^\infty,$

where $(\phi^\wedge)^4 = \max\left\{ 1, \frac{a_\rho^\wedge + a_w^\wedge}{a_{\tau R}^\vee} \right\}$.

Now, suppose that $\phi \in W^{1,2}$ is a solution of the Hamiltonian constraint equation, such that $\phi \not\leq \phi^\wedge$. Denoting by $(\phi - \phi^\wedge)^+$ the positive part of $\phi - \phi^\wedge$ (cf. Appendix A.6), we have

$0 \geq -\langle f_w(\phi), (\phi - \phi^\wedge)^+ \rangle = (\nabla\phi, \nabla(\phi - \phi^\wedge)^+) = (\nabla(\phi - \phi^\wedge)^+, \nabla(\phi - \phi^\wedge)^+) \geq c \, \| (\phi - \phi^\wedge)^+ - \overline{(\phi - \phi^\wedge)^+} \|_2^2,$

where $c > 0$, and $\overline{(\phi - \phi^\wedge)^+}$ is the integral average of $(\phi - \phi^\wedge)^+$. This implies that $\phi$ is constant, leading to a contradiction.

6. Proof of the Main Results

It is convenient to prove Theorem 2 first, which is the most general of the three; the proofs of Theorem 1 and Theorem 3 involve minor modifications of the proof of Theorem 2.

6.1. Proof of Theorem 2. Our strategy will be to prove the theorem first for the case $s \leq 2$, and then to bootstrap to include the higher regularity cases.

Step 1. The choice of function spaces. We have the (reflexive) Banach spaces $X = W^{s,p}$ and $Y = W^{e,q}$, where $p, q \in (3, \infty)$, $s = s(p) \in (1 + \frac{3}{p}, 2]$, and $e = e(p, s, q) \in (1, s] \cap (1 + \frac{3}{q}, s - \frac{3}{p} + \frac{3}{q}]$. We have the ordered Banach space $Z = W^{\tilde{s},p}$ with the compact embedding $X = W^{s,p} \hookrightarrow W^{\tilde{s},p} = Z$, for $\tilde{s} \in (\frac{3}{p}, s)$. The interval $[\phi_-, \phi_+]_{\tilde{s},p}$ is nonempty (by compatibility of the barriers we will choose below), and by Lemma 1 at the end of Sect. 3 it is also convex with respect to the vector space structure of $W^{\tilde{s},p}$ and closed with respect to the norm topology of $W^{\tilde{s},p}$. We then take $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$ for sufficiently large $M$ (to be determined below), where $B_M$ is the closed ball in $Z = W^{\tilde{s},p}$ of radius $M$ about the origin, ensuring that $U$ is non-empty, convex, closed, and bounded as a subset of $Z = W^{\tilde{s},p}$.

Step 2. Construction of the mapping $S$. We have $b_j \in W^{e-2,q}$, and $b_\tau \in L^z$ with $z = \frac{3q}{3 + (2 - e)q}$ so that $L^z \hookrightarrow W^{e-2,q}$. Moreover, since the metric admits no conformal Killing field, by Lemma 6 the momentum constraint equation is uniquely solvable for any “source” $\phi \in [\phi_-, \phi_+]_{\tilde{s},p}$. The ranges for the exponents ensure that Lemma 2 holds, so that the momentum constraint solution map

$S : [\phi_-, \phi_+]_{\tilde{s},p} \to W^{e,q} = Y$

is continuous.
Step 3. Construction of the mapping $T$. Define $r = \frac{3p}{3 + (2 - s)p}$, so that the continuous embedding $L^r \hookrightarrow W^{s-2,p}$ holds. Since the pointwise multiplication is bounded on $L^{2r} \otimes L^{2r} \to L^r$, and $w \in W^{e,q} \hookrightarrow W^{1,2r}$, we have $a_w \in W^{s-2,p}$ by $\sigma \in L^{2r}$. The embeddings $W^{1,z} \hookrightarrow W^{e-1,q} \hookrightarrow L^{2r}$ also guarantee that $a_\tau = \frac{1}{12} \tau^2 \in W^{s-2,p}$. We have the scalar curvature $R \in W^{s-2,p}$, and these considerations show that the Hamiltonian constraint equation is well defined with $[\phi_-, \phi_+]_{s,p}$ as the space of solutions.

Suppose for the moment that the scalar curvature $R$ of the background metric $h$ is continuous, and by using the map $T^s$ introduced in Lemma 3, define the map $T$ by $T(\phi, w) = T^s(\phi, a_w)$, where $a_w$ is now considered as an expression depending on $w$. Then Lemma 3 implies that the map $T : [\phi_-, \phi_+]_{\tilde{s},p} \times W^{e,q} \to W^{s,p}$ is continuous for any reasonable shift $a_s$, which, by Lemma 4, can be chosen so that $T$ is monotone in the first variable. Combining the monotonicity with Lemma 5, we infer that the interval $[\phi_-, \phi_+]_{\tilde{s},p}$ is invariant under $T(\cdot, a_w)$ if $w \in S([\phi_-, \phi_+]_{\tilde{s},p})$. Since $L^z \hookrightarrow W^{e-2,q}$, from Theorem 6 we have

$\|w\|_{e,q} \leq C \|b_\tau \phi^6 + b_j\|_{e-2,q} \leq C \|\phi_+\|_\infty^6 \|b_\tau\|_z + C \|b_j\|_{e-2,q}$

for any $w \in S([\phi_-, \phi_+]_{\tilde{s},p})$. In view of Lemma 6, this shows that there exists a closed ball $B_M \subset W^{\tilde{s},p}$ such that

$\phi \in [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M, \ w \in S([\phi_-, \phi_+]_{\tilde{s},p} \cap B_M) \quad \Rightarrow \quad T(\phi, w) \in B_M.$

Under the conditions in the above displayed formula, from the invariance of the interval $[\phi_-, \phi_+]_{\tilde{s},p}$, we indeed have $T(\phi, w) \in U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$.

However, the scalar curvature of $h$ may not be continuous, and in general it is not clear how to introduce a shift so that the resulting operator is monotone. Nevertheless, we can conformally transform the metric into a metric with continuous scalar curvature, cf. Theorem 12, and by using the conformal covariance of the Hamiltonian constraint, we will be able to construct an appropriate mapping $T$. Let $\tilde{h} = \theta^4 h$ be a metric with continuous scalar curvature, where $\theta \in W^{s,p}$ is the (positive) conformal factor of the scaling. Let $\tilde{T}^s$ be the mapping introduced in Lemma 3, corresponding to the Hamiltonian constraint equation with the background metric $\tilde{h}$, and the coefficients $\tilde{a}_\tau = a_\tau$, and $\tilde{a}_\rho = \theta^{-8} a_\rho$. With $\tilde{a}_w = \theta^{-12} a_w$, this scaled Hamiltonian constraint equation has sub- and super-solutions $\theta^{-1}\phi_-$ and $\theta^{-1}\phi_+$, respectively, as long as $\phi_-$ and $\phi_+$ are sub- and super-solutions respectively of the original Hamiltonian constraint equation, cf. Appendix A.8. We choose the shift in $\tilde{T}^s$ so that it is monotone in $[\theta^{-1}\phi_-, \theta^{-1}\phi_+]_{\tilde{s},p}$. Then by the monotonicity and the above mentioned sub- and super-solution property under conformal scaling, for $w \in S([\phi_-, \phi_+]_{\tilde{s},p})$, $\tilde{T}^s(\cdot, \theta^{-12} a_w)$ is invariant on $[\theta^{-1}\phi_-, \theta^{-1}\phi_+]_{\tilde{s},p}$. Finally, we define

$T(\phi, w) = \theta \tilde{T}^s(\theta^{-1}\phi, \theta^{-12} a_w),$

where, as before, $a_w$ is considered as an expression depending on $w$. From the pointwise multiplication properties of $\theta$ and $\theta^{-1}$, the map $T : [\phi_-, \phi_+]_{\tilde{s},p} \times W^{e,q} \to W^{s,p}$ is continuous, and from the monotonicity and Lemma 6, $T(\cdot, w)$ is invariant on $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$ for $w \in S(U)$, where $M$ is taken to be sufficiently large. Moreover, if the fixed point equation

$\phi = \theta \tilde{T}^s(\theta^{-1}\phi, \theta^{-12} a_w),$
is satisfied, then $\theta^{-1}\phi$ is a solution to the scaled Hamiltonian constraint equation with $\tilde{a}_w = \theta^{-12} a_w$, and so by conformal covariance, $\phi$ is a solution to the original Hamiltonian constraint equation, cf. Appendix A.8.

Step 4. Barrier choices and application of the fixed point theorem. At this point, Theorem 5 implies the Main Theorem 2, provided that we have an admissible pair of barriers for the Hamiltonian constraint. The ranges for the exponents ensure through Corollary 1 that we can use the estimate (5.1); see the discussion following the estimate at the beginning of Sect. 5. We will separate into the two cases in the theorem, depending on which Yamabe class we are in:
(a) $h_{ab}$ is in $Y^-(M)$: We use the global constant super-solution from Lemma 7(a) or the non-constant super-solution from Lemma 12 depending on whether $\rho$ and $\sigma$ are both in $L^\infty$, and the global sub-solution from Lemma 14.
(b) $h_{ab}$ is in $Y^0(M)$ or in $Y^+$: We use the global constant super-solution from Lemma 7(a) or the non-constant super-solution from Lemma 12 depending on whether $\rho$ and $\sigma$ are both in $L^\infty$, and the global sub-solution from Lemma 13 or Lemma 8(c).
This concludes the proof for the case $s \leq 2$.

Step 5. Bootstrap. Now suppose that $s > 2$. First of all we need to show that the equations are well defined in the sense that the involved operators are bounded in appropriate spaces. All other conditions being obviously satisfied, we will show that $a_\tau \in W^{s-2,p}$, and $a_w \in W^{s-2,p}$ for any $w \in W^{e,q}$. Since $\tau$, $\sigma$ and $\mathcal{L}w$ belong to $W^{e-1,q}$, it suffices to show that the pointwise multiplication is bounded on $W^{e-1,q} \otimes W^{e-1,q} \to W^{s-2,p}$, and by employing Corollary 3(b) in the Appendix, we are done as long as $s - 2 \leq e - 1$ and $e - 1 \geq 0$, $s - 2 - \frac{3}{p} < 2(e - 1 - \frac{3}{q})$, and $s - 2 - \frac{3}{p} \leq e - 1 - \frac{3}{q}$. After a rearrangement these conditions read: $e \geq 1$, $e \geq s - 1$, $e > \frac{3}{q} + \frac{d}{2}$, and $e \geq \frac{3}{q} + d - 1$, with the shorthand $d = s - \frac{3}{p} > 1$, the latter inequality by the hypothesis of the theorem. We have $d - 1 > \frac{d}{2}$ for $d > 2$, and $1 \geq \frac{d}{2}$ for $d \leq 2$, meaning that the condition $e > \frac{3}{q} + \frac{d}{2}$ is implied by the hypotheses $e \geq \frac{3}{q} + d - 1$ and $e > 1 + \frac{3}{q}$. So we conclude that the constraint equations are well defined.

Next, we will treat the equations as equations defined with $s = e = 2$ and with $p$ and $q$ appropriately chosen. This is possible, since if the quadruple $(p, s, q, e)$ satisfies the hypotheses of the theorem, then $(\tilde{p}, \tilde{s} = 2, \tilde{q}, \tilde{e} = 2)$ satisfies the hypotheses too, provided that $2 - \frac{3}{\tilde{p}} \leq s - \frac{3}{p}$, and $1 < 2 - \frac{3}{\tilde{q}} \leq e - \frac{3}{q}$. Since the latter conditions reflect the Sobolev embeddings $W^{s,p} \hookrightarrow W^{2,\tilde{p}}$ and $W^{e,q} \hookrightarrow W^{2,\tilde{q}} \hookrightarrow W^{1,\infty}$, the coefficients of the equations can also be shown to satisfy sufficient conditions for posing the problem for $(\tilde{p}, 2, \tilde{q}, 2)$. Finally, we have $\tau \in W^{e-1,q} \hookrightarrow W^{1,\tilde{q}} = W^{1,z}$ since $z = \tilde{q}$ by $\tilde{e} = 2$ for this new formulation. Now, by the special case $s \leq 2$ of this theorem that is proven in the above steps, under the remaining hypotheses including the conditions on the metric and the near-CMC condition, we have $\phi \in W^{2,\tilde{p}}$ with $\phi > 0$ and $w \in W^{2,\tilde{q}}$ solving the coupled system.

To complete the proof we only need to show that these solutions indeed satisfy $\phi \in W^{s,p}$ and $w \in W^{e,q}$. Suppose that $\phi \in W^{s_1,p_1}$ and $w \in W^{e_1,q_1}$, with $1 < s_1 - \frac{3}{p_1} \leq s - \frac{3}{p}$, $1 < e_1 - \frac{3}{q_1} \leq e - \frac{3}{q}$, $\max\{2, s - 2\} \leq s_1 \leq s$, and $\max\{2, e - 2\} \leq e_1 \leq \min\{e, s\}$. Then we have $b_\tau \phi^6 + b_j \in W^{e-2,q}$, and so Corollary 5 from Appendix A.5 implies that $w \in W^{e,q}$. This implies that $a_w \in W^{s-2,p}$, and by employing Corollary 5 once again, we get $\phi \in W^{s,p}$. The proof is completed by induction.
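The exponent bookkeeping in the bootstrap step can be sanity-checked with exact rational arithmetic. The sketch below is illustrative only: it encodes just the three inequalities quoted in Step 5 (with $d = s - 3/p$), namely $d > 1$, $e \geq 3/q + d - 1$ and $e > 1 + 3/q$, and confirms on a small sample grid that they imply the strict condition $e > 3/q + d/2$. It does not encode the full hypotheses of Theorem 2, and the function names and grid values are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def d_of(p, s):
    """Shorthand d = s - 3/p used in the bootstrap step."""
    return s - F(3) / p


def hypotheses_hold(p, s, q, e):
    """The three inequalities quoted in Step 5 (not the full theorem hypotheses)."""
    d = d_of(p, s)
    return d > 1 and e >= F(3) / q + d - 1 and e > 1 + F(3) / q


def strict_condition_holds(p, s, q, e):
    """The strict multiplication condition e > 3/q + d/2."""
    return e > F(3) / q + d_of(p, s) / 2


def sample_grid():
    """A small illustrative grid of exponents (p, s, q, e)."""
    ps = [2, 3, 4, 6]
    ss = [F(5, 2), 3, 4]
    qs = [4, 6, 12]
    es = [F(3, 2), 2, F(5, 2), 3, 4]
    return list(product(ps, ss, qs, es))


if __name__ == "__main__":
    admissible = [x for x in sample_grid() if hypotheses_hold(*x)]
    print(len(admissible), all(strict_condition_holds(*x) for x in admissible))
```

The two cases in the text correspond to $d > 2$, where $d - 1 > d/2$, and $d \leq 2$, where $1 \geq d/2$; the grid contains points of both kinds.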
6.2. Proof of Theorem 1. The proof is identical to the proof of Theorem 2, except for the particular barriers used. In the proof of Theorem 2, the near-CMC condition is used to construct global barriers satisfying $0 < \phi_- \leq \phi_+ < \infty$, for all three Yamabe classes, and then the supporting results for the operators $S$ and $T$ established in Sect. 4.1 and Sect. 4.2 are used to reduce the proof to invoking Theorem 5. The construction of $\phi_+$ is in fact the only place in the proof of Theorem 2 that requires the near-CMC condition. Here, the proof is identical, except that the additional conditions made on the background metric $h_{ab}$ (that it be in $Y^+(M)$), and on the data (the smallness conditions on $\sigma$, $\rho$, and $j$) allow us to make use of the alternative construction of a global super-solution given in Lemma 9, together with the compatible global sub-solution given in Lemma 13, properly scaled for compatibility with the super-solution. Theorem 1 now follows from Theorem 5, without the use of near-CMC conditions.

6.3. Proof of Theorem 3. The CMC result in this theorem can be proved using the same analysis framework used for the proofs of the two non-CMC results in Theorem 1 and Theorem 2 above. Therefore, the proof follows the same general outline of the proof of Theorem 2, with slightly different spaces and supporting results. The main difference is that we can avoid having to construct “global” barriers and getting uniform bounds on the solution to the momentum constraint, since it is solved only once a priori and then is input as data into the nonlinearity of the Hamiltonian constraint. The case (d) follows from the Yamabe classification, cf. Appendix A.7. Since otherwise we can use the conformal covariance of the Hamiltonian constraint as in Sect. 6.1, for simplicity, assume that the scalar curvature of the background metric is continuous. Also assume that $s \leq 2$, and let us look at the hypotheses of Theorem 5.

We have the (reflexive) Banach spaces $X = W^{s,p}$ and $Y = W^{1,2r}$, where $p \in (\frac{3}{2}, \infty)$, $s = s(p) \in (\frac{3}{p}, \infty) \cap [1, 2]$, and $r = r(s, p) = \frac{3p}{3 + (2 - s)p}$. On the diagram in Fig. 2, for $s \leq 2$ the space $W^{1,2r}$ corresponds to the lower right corner of the shaded parallelogram, and so $W^{1,2r}$ contains all the spaces $W^{e,q}$ which are represented by the points in the shaded parallelogram. In fact, $W^{1,2r}$ is outside of this parallelogram, because of the strict inequality relating $e$ and $q$ in order to have the boundedness of the pointwise multiplication on $W^{e-1,q} \otimes W^{e-1,q} \to W^{s-2,p}$ by using Corollary 3(b). However, the conditions of Corollary 3(b) are not necessary conditions when some of the smoothness indices are integers; for example, in our case the pointwise multiplication is bounded on $L^{2r} \otimes L^{2r} \to L^r$, even though these spaces do not satisfy the conditions of the corollary. As a consequence, as we have seen e.g. in Sect. 2.4, the constraint equations are well defined for these spaces.

We have the ordered Banach space $Z = W^{\tilde{s},p}$ with the compact embedding $X = W^{s,p} \hookrightarrow W^{\tilde{s},p} = Z$, for $\tilde{s} \in (\frac{3}{p}, s)$. The interval $[\phi_-, \phi_+]_{\tilde{s},p}$ is nonempty (by compatibility of the barriers we will choose below), and by Lemma 1 at the end of Sect. 3 it is also convex with respect to the vector space structure of $W^{\tilde{s},p}$ and closed with respect to the norm topology of $W^{\tilde{s},p}$. We then take $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$ for sufficiently large $M$ (to be determined below), where $B_M$ is the closed ball in $Z = W^{\tilde{s},p}$ of radius $M$ about the origin, ensuring that $U$ is non-empty, convex, closed, and bounded as a subset of $Z = W^{\tilde{s},p}$.

We take as $T$ the shifted Picard mapping $T^s$ having as its fixed-point a solution to the Hamiltonian constraint, and we take $S(\phi) = w = -A_{\mathbb{L}}^{-1} b_j \in W^{1,2r}$, which is independent of $\phi$, since the momentum equation decouples from the Hamiltonian constraint in this case. The map $S$, which is constant as a function of $\phi$ due to the CMC de-coupling, is trivially continuous as a map $S : U \to W^{1,2r} = Y$. We now consider properties we have for $T$. By Lemma 3, $T : U \times R(S) \to W^{s,p} = X$ is a continuous map. By Lemma 4, $T$ is invariant on the closed interval $[\phi_-, \phi_+]_{\tilde{s},p}$, and by Lemma 6, $T$ is invariant on $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$. To summarize, $T$ is invariant on the non-empty, closed, convex, bounded set $U$.

Finally, Theorem 5 implies the Main Theorem 3, as long as we have an admissible pair of barriers for the Hamiltonian constraint. That is when we need to separate into the three remaining cases in the theorem, depending on which Yamabe class we are in:
(a) $h_{ab}$ is in $Y^-(M)$; $\tau \neq 0$: We take the super-solution from Lemma 11(c), and we take the sub-solution from Lemma 14. These lemmata require that the metric $h_{ab}$ is conformally equivalent to a metric with scalar curvature $(-a_\tau)$, and we shall verify this condition. By conformal invariance, it suffices to verify the condition for metrics with continuous and negative scalar curvature, meaning that we have to solve Eq. (5.17) with $R < 0$ continuous and $a_\tau > 0$ constant. Indeed, this equation has a positive solution $\psi \in W^{s,p}$ as the constants $\psi_- = \left( \frac{\min |R|}{8a_\tau} \right)^{1/4}$ and $\psi_+ = \left( \frac{\max |R|}{8a_\tau} \right)^{1/4}$ are respectively sub- and super-solutions of (5.17).
(b) $h_{ab}$ is in $Y^+(M)$; $\rho \neq 0$ or $\sigma \neq 0$: We take the super-solution from Lemma 11(b), and we take the sub-solution from Lemma 13. For the case $\rho = 0$ and $\sigma = 0$, a local sub-solution can easily be constructed following the approach in the proof of Lemma 13.
(c) $h_{ab}$ is in $Y^0(M)$; $\tau \neq 0$; $\rho \neq 0$ or $\sigma \neq 0$: We take the super-solution from Lemma 11(a), and we take the sub-solution from Lemma 13. The case $\rho = 0$ and $\sigma = 0$ is treated as above.

To complete the proof one can bootstrap as in Sect. 6.1.
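The constant barriers used in case (a) can be verified directly: for a constant $\psi$ the Laplacian term in (5.17) vanishes, so the residual is $\frac{1}{8} R \psi + a_\tau \psi^5$ pointwise. The following minimal sketch assumes $R < 0$ is sampled at finitely many points and $a_\tau > 0$ is constant; the sample values and function names are hypothetical.

```python
def constant_barriers(R_values, a_tau):
    """Constants psi_- and psi_+ with psi^4 = min|R|/(8 a_tau) and max|R|/(8 a_tau)."""
    if a_tau <= 0 or any(r >= 0 for r in R_values):
        raise ValueError("sketch assumes a_tau > 0 and R < 0 at every sample point")
    abs_R = [abs(r) for r in R_values]
    psi_minus = (min(abs_R) / (8.0 * a_tau)) ** 0.25
    psi_plus = (max(abs_R) / (8.0 * a_tau)) ** 0.25
    return psi_minus, psi_plus


def residual(psi, r, a_tau):
    """Residual of (5.17) for a constant psi, where the Laplacian term is zero."""
    return r * psi / 8.0 + a_tau * psi ** 5


if __name__ == "__main__":
    R_samples = [-1.0, -2.5, -4.0]  # illustrative negative curvature values
    a_tau = 0.5
    lo, hi = constant_barriers(R_samples, a_tau)
    print(lo, hi)
    print([residual(lo, r, a_tau) for r in R_samples])  # all <= 0: sub-solution
    print([residual(hi, r, a_tau) for r in R_samples])  # all >= 0: super-solution
```

Since $\min |R| \leq \max |R|$, the two constants are automatically ordered, $0 < \psi_- \leq \psi_+$, which is the compatibility needed to apply the sub- and super-solution argument.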
7. Summary We began in Sect. 2 by summarizing the conformal decomposition of Einstein’s constraint equations introduced by Lichnerowicz and York, on a closed manifold. After this setting up of the notation, we gave an overview of our main results in Sect. 3, represented by three new weak solution existence results for the Einstein constraint equations in the far-from-CMC, near-CMC, and CMC cases. In Sect. 4 we then developed the necessary results we need for the individual constraint equations in order to analyze the coupled system. In particular, in Sect. 4.1, we first developed some basic technical results for the momentum constraint operator under weak assumptions on the problem data. We also established the properties we need for the momentum constraint solution mapping S appearing in the analysis of the coupled system. In Sect. 4.2, we assumed the existence of barriers φ− and φ+ (weak sub- and super-solutions) to the Hamiltonian constraint equation, forming a nonempty positive bounded interval, and then established the properties we need for the Hamiltonian constraint Picard mapping T appearing in the analysis of the coupled system. We then derived several weak global sub- and super-solutions in Sect. 5, based both on constants and on more complex non-constant constructions. While the sub-solutions are similar to those found previously in the literature, some of the super-solutions were new. In particular, we gave two super-solution constructions that do not require the near-CMC condition. The first was constant, and requires that the
scalar curvature be strictly globally positive. The second was based on a scaled solution to a Yamabe-type problem, and is valid for any background metric in the positive Yamabe class. In Sect. 6, we proved the main results. In particular, using topological fixed-point arguments and global barrier constructions, we combined the results for the individual constraints and the global barriers to establish existence of coupled non-CMC weak solutions with (positive) conformal factor $\phi \in W^{s,p}$, where $p \in (1, \infty)$ and $s(p) \in (1 + \frac{3}{p}, \infty)$. In the CMC case, the regularity can be reduced to $p \in (1, \infty)$ and $s(p) \in (\frac{3}{p}, \infty) \cap [1, \infty)$. In the case of $s = 2$, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case $p = 2$, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5). We also assembled a number of new supporting technical results in the body of the paper and in several appendices, including: topological fixed-point arguments designed for the Einstein constraints; construction and properties of general Sobolev classes $W^{s,p}$ and elliptic operators on closed manifolds with weak metrics; the development of a very weak solution theory for the momentum constraint; a priori $L^\infty$-estimates for weak $W^{1,2}$-solutions to the Hamiltonian constraint; Yamabe classification of non-smooth metrics in general Sobolev classes $W^{s,p}$; and a discussion and analysis of conformal covariance and the connection between conformal rescaling and the near-CMC condition. An important feature of the results we presented here is the absence of the near-CMC assumption in the case of the rescaled background metric in the positive Yamabe class, as long as the freely specifiable part of the data given by the matter fields (if present) and the traceless-transverse part of the rescaled extrinsic curvature are taken to be sufficiently small.
In this case, the mean extrinsic curvature can be taken to be an arbitrary smooth function without restrictions on the size of its spatial derivatives, so that it can be arbitrarily far from constant. Under these conditions, we have the first existence result for non-CMC solutions without the near-CMC condition. The two advances in the analysis of the Einstein constraint equations that make these results possible were: A topological fixed-point theorem based on compactness arguments that is free of the near-CMC condition (Theorems 4 and 5 and in [21]), and a new construction of global super-solutions for the Hamiltonian constraint that is similarly free of the near-CMC condition (Lemma 7 and Lemma 9). We note that the near-CMC-free constructions based on scaled solutions to a Yamabe-like problem also work for compact manifolds with boundary and other cases; see e.g. [21]. Finally, we point out that our results here and in [21,22] can be viewed as reducing the remaining open questions of existence of non-CMC (weak and strong) solutions without near-CMC conditions to two more basic and clearly stated open problems: (1) Existence of near-CMC-free global super-solutions for the Hamiltonian constraint equation when the background metric is in the non-positive Yamabe classes and for large data; and (2) existence of near-CMC-free global sub-solutions for the Hamiltonian constraint equation when the background metric is in the positive Yamabe class in vacuum (without matter). However, an important new development, which occurred a few months after the first draft of this article was made available, is that Maxwell has now shown [36] how a related topological fixed-point argument can be constructed so that a global subsolution is not needed, as long as the global super-solution is available; this allows for the extension of the far-CMC results in this article to the vacuum case without having to solve problem (2).
Acknowledgement. The authors would like to thank Jim Isenberg, David Maxwell, and Daniel Pollack for many very insightful comments and suggestions about this work. We would like to thank David Maxwell in particular for his careful reading of earlier drafts of this work and for pointing out various errors. MH was supported in part by NSF Awards 0715146, 0411723, and 0511766, and DOE Awards DE-FG02-05ER25707 and DE-FG02-04ER25620. GN and GT were supported in part by NSF Awards 0715146 and 0411723.
A. Some Key Technical Tools and Some Supporting Results

A.1. Topological fixed-point theorems. In this Appendix, we give a brief review of some standard topological fixed-point theorems in Banach spaces that provide the framework for our analysis of the coupled constraint equations. The analysis framework that was developed earlier in [26] for analyzing the coupled constraints was based on k-contractive mappings, and as a result required the near-CMC condition in order to establish k-contractivity. All subsequent non-CMC results (see e.g. [1]) are based on the framework from [26], and as a result remain limited to the near-CMC case. Our interest here is in more general topological fixed-point arguments that will allow us to avoid the near-CMC condition.

Brouwer, Schauder, and Leray-Schauder Fixed-Point Theorems. To establish the main abstract results we will need, we first give a brief overview of some standard results on topological fixed-point arguments involving compactness.

Theorem 7 (Brouwer Theorem). Let $U \subset \mathbb{R}^n$ be a non-empty, convex, compact subset, with $n \geq 1$. If $T : U \to U$ is a continuous mapping, then there exists a fixed-point $u \in U$ such that $u = T(u)$.

Proof. See Proposition 2.6 in [54]; a short proof can be based on homotopy-invariance of topological degree.

Theorem 8 (Schauder Theorem). Let $X$ be a Banach space, and let $U \subset X$ be a non-empty, convex, compact subset. If $T : U \to U$ is a continuous operator, then there exists a fixed-point $u \in U$ such that $u = T(u)$.

Proof. This is a direct extension of the Brouwer Fixed-Point Theorem from $\mathbb{R}^n$ to $X$; see Corollary 2.13 in [54]. The short proof involves a simple finite-dimensional approximation algorithm and a limiting argument, extending the Brouwer Fixed-Point Theorem (itself generally having a more complicated proof) from $\mathbb{R}^n$ to $X$.

Theorem 9 (Schauder Theorem B). Let $X$ be a Banach space, and let $U \subset X$ be a non-empty, convex, closed, bounded subset. If $T : U \to U$ is a compact operator, then there exists a fixed-point $u \in U$ such that $u = T(u)$.

Proof. See Theorem 2.A in [54]; the proof is a simple consequence of Theorem 8 above.

A.2. Ordered Banach spaces. These notes follow the main ideas and definitions given in Chap. 7.1, p. 275, in [54], while some examples were taken from [2] and [16]. Let $X$ be a Banach space, and let $\mathbb{R}_+$ be the non-negative real numbers. A subset $C \subset X$ is a cone iff given any $x \in C$ and $a \in \mathbb{R}_+$ the element $ax \in C$. A subset $X_+ \subset X$ is an order cone iff the following properties hold:
(i) The set $X_+$ is non-empty, closed, and $X_+ \neq \{0\}$;
(ii) Given any $a, b \in \mathbb{R}_+$ and $x, \bar{x} \in X_+$, then $ax + b\bar{x} \in X_+$;
(iii) If $x \in X_+$ and $-x \in X_+$, then $x = 0$.

The second property above says that every order cone is in fact a cone, and that the set $X_+$ is convex. The space $X = \mathbb{R}^2$ is a convenient Banach space to picture non-trivial examples of cones and order cones, as can be seen in Fig. 3. A pair $X$, $X_+$ is called an ordered Banach space iff $X$ is a Banach space and $X_+ \subset X$ is an order cone. The reason for this name is that the order cone $X_+$ defines several relations on elements in $X$, called order relations, as follows:
\[
u \geq v \ \text{iff}\ u - v \in X_+, \qquad u \gg v \ \text{iff}\ u - v \in \operatorname{int}(X_+),
\]
\[
u > v \ \text{iff}\ u \geq v \ \text{and}\ u \neq v, \qquad u \not\geq v \ \text{iff}\ u \geq v \ \text{is false};
\]
finally the notations $u \leq v$, $u < v$, and $u \ll v$ are used to mean $v \geq u$, $v > u$, and $v \gg u$, respectively. A simple example of an ordered Banach space is $\mathbb{R}$ with the usual order. Another example can be constructed when this order on $\mathbb{R}$ is transported into $C^0(M)$, the set of continuous scalar-valued functions on a set $M \subset \mathbb{R}^n$, with $n \geq 1$. An order on $C^0(M)$ is the following: the functions $u, v \in C^0(M)$ satisfy $u \geq v$ iff $u(x) \geq v(x)$ for all $x \in M$. The following lemmas summarize the main properties of order relations in Banach spaces.

Lemma 17. Let $X$, $X_+$ be an ordered Banach space. Then, for all elements $u, v, w \in X$, the following hold: (i) $u \geq u$; (ii) If $u \geq v$ and $v \geq u$, then $u = v$; (iii) If $u \geq v$ and $v \geq w$, then $u \geq w$.

Proof. The property that $u - u = 0 \in X_+$ implies that $u \geq u$. If $u \geq v$ and $v \geq u$, then $u - v \in X_+$ and $-(u - v) \in X_+$, therefore $u - v = 0$. Finally, if $u \geq v$ and $v \geq w$, then $u - v \in X_+$ and $v - w \in X_+$, which means that $u - w = (u - v) + (v - w) \in X_+$.

Furthermore, the order relation is compatible with the vector space structure and with the limits of sequences.

Lemma 18. Let $X$, $X_+$ be an ordered Banach space. Then, for all $u, \hat{u}, v, \hat{v}, w \in X$, and $a, b \in \mathbb{R}$, the following hold: (i) If $u \geq v$ and $a \geq b \geq 0$, then $au \geq bv$; (ii) If $u \geq v$ and $\hat{u} \geq \hat{v}$, then $u + \hat{u} \geq v + \hat{v}$; (iii) If $u_n \geq v_n$ for all $n \in \mathbb{N}$, then $\lim_{n\to\infty} u_n \geq \lim_{n\to\infty} v_n$.

Proof. The first two properties are straightforward to prove, and we omit the details. The third property holds because the order cone is a closed set. Indeed, $u_n \geq v_n$ means that $u_n - v_n \in X_+$ for all $n \in \mathbb{N}$, and then $\lim_{n\to\infty}(u_n - v_n) \in X_+$ because $X_+$ is closed, so Property (iii) follows.

The remaining order relations have some other interesting properties.

Lemma 19. Let $X$, $X_+$ be an ordered Banach space. Then, for all $u, v, w \in X$, and $a \in \mathbb{R}$, the following hold: (i) If $u \gg v$ and $v \geq w$, then $u \gg w$; (ii) If $u \geq v$ and $v \gg w$, then $u \gg w$; (iii) If $u \gg v$ and $v \gg w$, then $u \gg w$; (iv) If $u \gg v$ and $a > 0$, then $au \gg av$.

The proof of Lemma 19 is similar to that of the previous lemma, and is not reproduced here.
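The order relations above can be illustrated in the simplest non-trivial setting. The following is a minimal Python sketch, assuming $X = \mathbb{R}^2$ with the order cone $\mathbb{R}^2_+$ of componentwise non-negative vectors; the helper names `geq`, `gg`, and `gt` are hypothetical and stand for $\geq$, $\gg$, and $>$.

```python
from typing import Tuple

Vector = Tuple[float, float]


def in_cone(x: Vector) -> bool:
    """Membership in the order cone R^2_+ (all components non-negative)."""
    return all(c >= 0.0 for c in x)


def in_interior(x: Vector) -> bool:
    """Membership in int(R^2_+) (all components strictly positive)."""
    return all(c > 0.0 for c in x)


def sub(u: Vector, v: Vector) -> Vector:
    """Componentwise difference u - v."""
    return (u[0] - v[0], u[1] - v[1])


def geq(u: Vector, v: Vector) -> bool:
    """u >= v iff u - v lies in the order cone."""
    return in_cone(sub(u, v))


def gg(u: Vector, v: Vector) -> bool:
    """u >> v iff u - v lies in the interior of the order cone."""
    return in_interior(sub(u, v))


def gt(u: Vector, v: Vector) -> bool:
    """u > v iff u >= v and u != v."""
    return geq(u, v) and u != v


u, v, w = (2.0, 3.0), (1.0, 3.0), (0.0, 1.0)

# u - v = (1, 0) lies on the boundary of the cone: u >= v and u > v, but not u >> v.
print(geq(u, v), gg(u, v), gt(u, v))

# Transitivity as in Lemma 17 (iii): u >= v and v >= w give u >= w.
print(geq(u, v) and geq(v, w), geq(u, w))

# The order is only partial: e1 and e2 are not comparable.
e1, e2 = (1.0, 0.0), (0.0, 1.0)
print(geq(e1, e2), geq(e2, e1))
```

The last line shows that, unlike the usual order on $\mathbb{R}$, the order induced by a cone is in general only a partial order.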
Given an ordered Banach space $X$, $X_+$, and two elements $u \geq v$, introduce the intervals
\[
[v, u] := \{w \in X : v \leq w \leq u\}, \qquad (v, u) := \{w \in X : v \ll w \ll u\}.
\]
Fig. 3. The shaded regions in the first picture represent an order cone, while the second picture represents a cone that is not an order cone. The shaded region between $u$ and $v$ in the third picture represents the closed interval $[v, u]$, constructed with the order cone $\mathbb{R}^2_+$, which is also represented by a shaded region
Analogously, introduce the intervals $[v, u)$ and $(v, u]$. See Fig. 3 for an example in $X = \mathbb{R}^2$. Useful order cones for solving PDE are those that define an order structure in the Banach space which is related to the norm and the notion of boundedness. These types of order cones are called normal. More precisely, an order cone $X_+$ in a Banach space $X$ is called a normal order cone iff there exists $0 < a \in \mathbb{R}$ such that for all $u, v \in X$ with $0 \leq v \leq u$ we have $\|v\| \leq a\|u\|$.

Lemma 20. If $X$, $X_+$ is an ordered Banach space with normal order cone $X_+$, then every closed interval in $X$ is bounded.

Proof. Let $w \in [v, u]$, then $v \leq w \leq u$, and so $0 \leq w - v \leq u - v$. Since the cone $X_+$ is normal, this implies that there exists $a > 0$ such that $\|w - v\| \leq a\|u - v\|$. Then, the inequalities $\|w\| \leq \|w - v\| + \|v\| \leq a\|u - v\| + \|v\|$, which hold for all $w \in [v, u]$, establish the lemma.

Not every order cone is normal. For example, consider the Sobolev spaces $W^{k,p}$ of scalar-valued functions on an $n$-dimensional, closed manifold $M$ (or a compact manifold with Lipschitz continuous boundary), where $k$ is a non-negative integer, and $p > 1$ is a real number. An order cone in $W^{k,p}$ is defined by translating the order on the real numbers, almost everywhere in $M$, that is,
\[
W^{k,p}_+ := \{u \in W^{k,p} : u \geq 0 \ \text{a.e. in } M\}.
\]
In the case $k = 0$, that is, $W^{0,p} = L^p$, the order cone above is a normal cone [2,54]. However, in the case $k \geq 1$ the cone above cannot be normal, since on the one hand, the cone definition involves information only on the values of $u(x)$ and not on its derivatives; on the other hand, the norm in $W^{k,p}$ contains information on both the values of $u(x)$ and its derivatives. In the case of a compact manifold with boundary, since there are no boundary conditions on $\partial M$ in the definition of $W^{k,p}$, there is no way to relate the values of a function in $M$ with the values of its derivatives. (In other words, there is no Poincaré inequality for elements in $W^{k,p}$, with $k \geq 1$.)

An order cone $X_+ \subset X$ is generating iff $\operatorname{Span}(X_+) = X$. An order cone $X_+ \subset X$ is called total iff $\operatorname{Span}(X_+)$ is dense in $X$. Total order cones are important because the order structure associated with them can be translated from the space $X$ into its dual space $X^*$.

Lemma 21. Let $X$, $X_+$ be an ordered Banach space. If $X_+$ is a total order cone, then an order cone in $X^*$ is given by the set $X_+^* \subset X^*$ defined as
\[
X_+^* := \{u^* \in X^* : u^*(v) \geq 0 \ \ \forall\, v \in X_+\}.
\]
Proof. We check the three properties in the definition of the order cone. The first property is satisfied because $X_+$ is an order cone, so there exists $v \neq 0$ in $X_+$, and then there exists $u^* \neq 0$ in $X^*$ such that $u^*(v) = 1 \geq 0$, so $X_+^*$ is non-empty. Trivially, $0 \in X_+^*$. Finally, $X_+^*$ is closed because the order relation for real numbers is used in its definition. The second property of an order cone is satisfied, because given any $u^*, v^* \in X_+^*$ and any non-negative $a, b \in \mathbb{R}$, then for all $u \in X_+$, $(au^* + bv^*)(u) = au^*(u) + bv^*(u) \geq 0$ holds since each term is non-negative. This implies that $(au^* + bv^*) \in X_+^*$. The third property is satisfied because the order cone $X_+$ is total. Suppose that $u^* \in X_+^*$ and $-u^* \in X_+^*$, then for all $u \in X_+$ it holds that $u^*(u) \geq 0$ and $-u^*(u) \geq 0$, which implies that $u^*(u) = 0$ for all $u \in X_+$. Therefore, $u^* \in X_+^{\circ} \subset X^*$, where the superscript $\circ$ in $X_+^{\circ}$ denotes the Banach annihilator of the set $X_+$, which is a subset of the space $X^*$. Therefore, we conclude that $u^* \in \overline{\operatorname{Span}(X_+)}^{\circ}$. Since the order cone is total, $\overline{\operatorname{Span}(X_+)} = X$, which implies $\overline{\operatorname{Span}(X_+)}^{\circ} = \{0\}$, so $u^* = 0$. This establishes the lemma.

An order cone $X_+$ in a Banach space $X$ is called a solid cone iff $X_+$ has non-empty interior. The following result asserts that every solid order cone is generating. We remark that the converse is not true. In the examples below we present function spaces frequently used in solving PDE with order cones having empty interior which are nevertheless generating.

Lemma 22. Let $X$, $X_+$ be an ordered Banach space. If $X_+$ is a solid cone, then $X_+$ is generating.

Proof. The cone $X_+$ has a non-empty interior, so there exists $x_0 \in \operatorname{int}(X_+)$ with $x_0 \neq 0$. This means that given any $x \in X$ there exists $0 < a \in \mathbb{R}$ small enough such that both $x_+ := x_0 + ax$ and $x_- := x_0 - ax$ belong to $\operatorname{int}(X_+)$. But then, $x = (x_+ - x_-)/(2a)$, so $x \in \operatorname{Span}(X_+)$. This establishes the lemma.

Here is a list of examples of several order cones used in function spaces. All these examples use order cones obtained from the usual order in $\mathbb{R}$. In particular, they refer to scalar-valued functions on an $n$-dimensional, closed manifold $M$ (or a compact manifold with Lipschitz boundary).
• Introduce on $C^k$ the cone $C^k_+ := \{u \in C^k : u(x) \geq 0 \ \forall x \in M\}$. This is an order cone for all non-negative integers $k$. The cone is a normal cone in the particular case $k = 0$. The cone is solid for all $k \geq 0$, therefore it is a generating cone.
• Introduce on $L^\infty$ the cone $L^\infty_+ := \{u \in L^\infty : u \geq 0 \ \text{a.e. in } M\}$. This is a normal order cone. It is a solid cone, therefore it is generating.
• Introduce on $W^{k,\infty}$ the cone $W^{k,\infty}_+ := \{u \in W^{k,\infty} : u \geq 0 \ \text{a.e. in } M\}$. This is an order cone. It is not normal for $k \geq 1$. The cone is solid, therefore it is generating.
• Introduce on $L^p$ the cone $L^p_+ := \{u \in L^p : u \geq 0 \ \text{a.e. in } M\}$. This is a normal order cone for every real number $p \geq 1$. The cone is not solid, however it is a generating cone.
• Introduce on $W^{k,p}$ the cone $W^{k,p}_+ := \{u \in W^{k,p} : u \geq 0 \ \text{a.e. in } M\}$. This is an order cone for every real number $p \geq 1$. The cone is not normal for $k \geq 1$. The cone is not solid for $kp \leq n$, and it is solid for $kp > n$. In both cases, the cone is generating.
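The failure of normality for cones in spaces whose norm involves derivatives can be seen numerically. The following is a minimal Python sketch under illustrative assumptions that are not part of the text: $M$ is the circle, parametrized by $[0, 1]$ with periodic functions, the space is $C^1$ with norm $\sup|u| + \sup|u'|$, and the family $v_k(x) = \tfrac12(1 + \sin(2\pi k x))$ is a hypothetical choice. Each $v_k$ satisfies $0 \leq v_k \leq u$ with $u \equiv 1$, yet the ratio of the $C^1$ norms grows roughly like $1 + \pi k$, so no constant $a$ with $\|v\| \leq a\|u\|$ can exist.

```python
import math
from typing import Callable, Tuple

Func = Callable[[float], float]


def sup_norm(f: Func, samples: int = 10001) -> float:
    """Approximate the sup norm on [0, 1] by sampling a uniform grid."""
    return max(abs(f(i / (samples - 1))) for i in range(samples))


def c1_norm(f: Func, df: Func, samples: int = 10001) -> float:
    """C^1 norm: sup of |f| plus sup of |f'| (derivative supplied analytically)."""
    return sup_norm(f, samples) + sup_norm(df, samples)


def make_v(k: int) -> Tuple[Func, Func]:
    """Return v_k(x) = (1 + sin(2 pi k x)) / 2 and its derivative."""
    def v(x: float) -> float:
        return 0.5 * (1.0 + math.sin(2.0 * math.pi * k * x))

    def dv(x: float) -> float:
        return math.pi * k * math.cos(2.0 * math.pi * k * x)

    return v, dv


def u(x: float) -> float:
    """The constant function 1, an upper bound for every v_k."""
    return 1.0


def du(x: float) -> float:
    return 0.0


ratios = []
for k in (1, 2, 4, 8):
    v, dv = make_v(k)
    ratio = c1_norm(v, dv) / c1_norm(u, du)
    ratios.append(ratio)
    # The sup norm of v_k stays bounded by 1, while the C^1 ratio grows with k.
    print(k, sup_norm(v), ratio)
```

In contrast, the sup norm alone satisfies $\sup|v_k| \leq \sup|u|$, which is consistent with the cone $C^0_+$ being normal.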
A key concept that becomes possible in ordered Banach spaces is that of an operator satisfying a maximum principle. We have not seen in the literature an approach to maximum principles on ordered Banach spaces in the generality we now present. Let $X$, $X_+$ and $Y$, $Y_+$ be ordered Banach spaces. An operator $A : D_A \subset X \to Y$ satisfies the maximum principle iff for every $u, v \in D_A$ such that $Au - Av \in Y_+$, we have $u - v \in X_+$. In the particular case that the operator $A$ is linear, it satisfies the maximum principle iff for all $u \in X$ such that $Au \in Y_+$, we have $u \in X_+$. The main example is the Laplace operator acting on scalar-valued functions defined on different domains. It is shown later on in this Appendix that the inverse of an operator that satisfies the maximum principle is monotone increasing. The following result gives a simple sufficient condition for an operator to satisfy the maximum principle. This result is useful for weak formulations of PDE.

Lemma 23. Let $X$, $X_+$ be an ordered Banach space, and $A : X \to X^*$ be a linear and coercive map. Assume that $X_+$ is a generating order cone, and that for all $u \in X$ such that $Au \in X_+^*$ there exists a decomposition $u = u_+ - u_-$ with $u_+, u_- \in X_+$ that also satisfies $Au_+(u_-) = 0$. Then, the operator $A$ satisfies the maximum principle.

Proof. Since the order cone $X_+$ is generating, the space $X^*$ is also an ordered Banach space. Denote its order cone by $X_+^*$. The assumption that the order cone $X_+$ is generating also implies that for any element $u \in X$ there exists a decomposition $u = u_+ - u_-$ with $u_+, u_- \in X_+$. By hypothesis, there exists at least one decomposition with the extra property that $Au_+(u_-) = 0$. Now, by definition of the order in the space $X^*$ we have that
\[
Au \in X_+^* \quad \Leftrightarrow \quad Au(\bar{u}) \geq 0 \quad \forall\, \bar{u} \in X_+ .
\]
Pick as a test function $\bar{u} = u_-$. Then,
\[
0 \leq Au(u_-) = A(u_+ - u_-)(u_-) = Au_+(u_-) - Au_-(u_-) = -Au_-(u_-),
\]
where the last equality comes from the condition $Au_+(u_-) = 0$. Therefore, we have
\[
Au_-(u_-) \leq 0 \quad \Rightarrow \quad u_- = 0,
\]
because $A$ is coercive. So we have shown that $u = u_+ \in X_+$. This establishes the lemma.

An example is the weak form of the shifted Laplace-Beltrami operator $-\Delta + s$ on scalar functions on a closed manifold $M$, where $s > 0$. Consider the case $X = W^{1,2}$, with $Y = X^* = W^{-1,2}$, and $X_+ = W^{1,2}_+$, while $Y_+ = W^{-1,2}_+$. The shifted Laplace operator in this case is given by $A : X \to X^*$ with action $Au(v) := (\nabla u, \nabla v) + s(u, v)$. It is not difficult to check that this operator satisfies the hypotheses of Lemma 23. Therefore, this operator satisfies the maximum principle, that is, $Au \in W^{-1,2}_+$ implies $u \in W^{1,2}_+$, that is, $u \geq 0$ a.e. in the manifold $M$.

A.3. Monotone increasing maps. Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces. An operator $F : X \to Y$ is monotone increasing iff for all $x, \bar{x} \in X$ such that $x - \bar{x} \in X_+$, we have $F(x) - F(\bar{x}) \in Y_+$. An operator $F : X \to Y$ is monotone decreasing iff for all $x, \bar{x} \in X$ such that $x - \bar{x} \in X_+$ it holds that $-[F(x) - F(\bar{x})] \in Y_+$. The main result for these types of maps is the following; it can be found as Theorem 7.A in [54], p. 283, and Corollary 7.18 on p. 284. We reproduce it here for completeness, without the proof.
Theorem 10 (Fixed point for increasing operators). Let $X$ be an ordered Banach space, with a normal order cone $X_+$. Let $T : [x_-, x_+] \subset X \to X$ be a monotone increasing, compact map. If $-[x_- - T(x_-)] \in X_+$ and $x_+ - T(x_+) \in X_+$, then the iterations
\[
x_{n+1} := T(x_n), \quad x_0 = x_-, \qquad \hat{x}_{n+1} := T(\hat{x}_n), \quad \hat{x}_0 = x_+,
\]
converge to $x$ and $\hat{x} \in [x_-, x_+]$, respectively, and the following estimate holds:
\[
x_- \leq x_n \leq x \leq \hat{x} \leq \hat{x}_n \leq x_+, \qquad \forall n \in \mathbb{N}. \tag{A.1}
\]
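The two-sided iteration in Theorem 10 can be illustrated in the simplest ordered Banach space. The following is a minimal Python sketch, assuming $X = \mathbb{R}$ with the usual order and the hypothetical map $T(x) = \sqrt{x + 2}$ on $[x_-, x_+] = [0, 4]$, which is monotone increasing and has the fixed point $x = 2$; the function name `monotone_iterations` is also an assumption made for illustration.

```python
import math
from typing import Callable, List, Tuple


def T(x: float) -> float:
    """A monotone increasing map on [0, 4]; its fixed point is x = 2."""
    return math.sqrt(x + 2.0)


def monotone_iterations(
    T: Callable[[float], float], x_minus: float, x_plus: float, steps: int
) -> Tuple[List[float], List[float]]:
    """Iterate T from the sub-solution x_minus and the super-solution x_plus."""
    # Hypotheses of the theorem in the scalar case:
    # x_minus <= T(x_minus) and T(x_plus) <= x_plus.
    if not (x_minus <= T(x_minus) and T(x_plus) <= x_plus):
        raise ValueError("x_minus / x_plus are not sub- / super-solutions")
    lower, upper = [x_minus], [x_plus]
    for _ in range(steps):
        lower.append(T(lower[-1]))  # x_{n+1} := T(x_n)
        upper.append(T(upper[-1]))  # \hat x_{n+1} := T(\hat x_n)
    return lower, upper


lower, upper = monotone_iterations(T, 0.0, 4.0, 40)

# The lower sequence increases, the upper sequence decreases,
# and both approach the fixed point 2, as in estimate (A.1).
for n in (0, 1, 2, 5, 40):
    print(n, lower[n], upper[n])
```

In this scalar example the two limits coincide; in general Theorem 10 only guarantees $x \leq \hat{x}$.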
We are interested in the following class of nonlinear problems: Find an element $x \in X$ which solves the equation
\[
Ax + F(x) = 0, \tag{A.2}
\]
where the principal part involves an invertible linear operator $A : X \to Y$ satisfying the maximum principle, and the non-principal part involves a nonlinear operator $F : X \to Y$ which has monotonicity properties. We now establish some basic results for this class of problems. The first two results relate linear, invertible operators that satisfy the maximum principle with monotone increasing (decreasing) operators.

Lemma 24. Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces. Let $A : X \to Y$ be a linear, invertible operator satisfying the maximum principle. Then, the inverse operator $A^{-1} : Y \to X$ is monotone increasing.

Proof. Let $y, \bar{y} \in Y$ be such that $y - \bar{y} \in Y_+$. Then,
\[
A A^{-1}(y - \bar{y}) \in Y_+ \ \Rightarrow \ A^{-1}(y - \bar{y}) \in X_+ \ \Leftrightarrow \ A^{-1}y - A^{-1}\bar{y} \in X_+ .
\]
This establishes that the operator $A^{-1}$ is monotone increasing.
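A finite-dimensional instance of Lemma 24 may help fix ideas. The following is a minimal Python sketch, assuming $X = Y = \mathbb{R}^2$ with the cone $\mathbb{R}^2_+$ and the hypothetical matrix $A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$ (a small discrete analogue of a shifted Laplacian). In this setting an invertible linear map satisfies the maximum principle exactly when its inverse has non-negative entries, and the inverse is then monotone increasing.

```python
from typing import Tuple

Vector = Tuple[float, float]
Matrix = Tuple[Vector, Vector]


def inverse_2x2(a: Matrix) -> Matrix:
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0.0:
        raise ValueError("matrix is singular")
    return ((s / det, -q / det), (-r / det, p / det))


def matvec(a: Matrix, x: Vector) -> Vector:
    """Matrix-vector product in R^2."""
    return (a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1])


def geq(u: Vector, v: Vector) -> bool:
    """u >= v in the order induced by the cone R^2_+."""
    return all(a - b >= 0.0 for a, b in zip(u, v))


A: Matrix = ((2.0, -1.0), (-1.0, 2.0))
A_inv = inverse_2x2(A)

# All entries of the inverse are non-negative, so A u >= 0 forces u >= 0.
print(A_inv)

# Monotonicity of the inverse: y >= y_bar implies A^{-1} y >= A^{-1} y_bar.
y, y_bar = (3.0, 1.0), (1.0, 1.0)
print(geq(y, y_bar), geq(matvec(A_inv, y), matvec(A_inv, y_bar)))
```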
Lemma 25. Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces. Let $A : X \to Y$ be a linear, invertible operator satisfying the maximum principle. Let $F : X \to Y$ be a monotone decreasing (increasing) operator. Then, the operator $T : X \to X$ given by $T := -A^{-1}F$ is monotone increasing (decreasing).

Proof. Assume first that the operator $F$ is monotone decreasing. So, given any $x, \bar{x} \in X$ such that $x - \bar{x} \in X_+$, the following implications hold:
\begin{align*}
x - \bar{x} \in X_+ \ &\Rightarrow \ -[F(x) - F(\bar{x})] \in Y_+ , \\
&\Leftrightarrow \ A\bigl(-A^{-1}[F(x) - F(\bar{x})]\bigr) \in Y_+ , \\
&\Rightarrow \ -A^{-1}[F(x) - F(\bar{x})] \in X_+ , \\
&\Leftrightarrow \ -[A^{-1}F(x) - A^{-1}F(\bar{x})] \in X_+ , \\
&\Leftrightarrow \ T(x) - T(\bar{x}) \in X_+ ,
\end{align*}
which establishes that the operator $T$ is monotone increasing. In the case that the operator $F$ is monotone increasing, the first line in the proof above changes into: $x - \bar{x} \in X_+$ implies that $F(x) - F(\bar{x}) \in Y_+$, and then all the remaining inclusions in the proof above are reversed. This establishes the lemma.
The next result translates the inequalities satisfied by sub- and super-solutions of the equation $Ax + F(x) = 0$ into inequalities for the operator $T = -A^{-1}F$.

Lemma 26. Assume the hypotheses of Lemma 25. If there exists an element $x_+ \in X$ such that $Ax_+ + F(x_+) \in Y_+$, then this element satisfies $x_+ - T(x_+) \in X_+$. If there exists an element $x_- \in X$ such that $-[Ax_- + F(x_-)] \in Y_+$, then this element satisfies $-[x_- - T(x_-)] \in X_+$.

Proof. The first statement in the lemma can be shown as follows:
\[
Ax_+ + F(x_+) \in Y_+ \ \Leftrightarrow \ A[x_+ + A^{-1}F(x_+)] \in Y_+ \ \Rightarrow \ x_+ + A^{-1}F(x_+) \in X_+ ,
\]
which then establishes that $x_+ - T(x_+) \in X_+$. In a similar way, the second statement in the lemma can be shown as follows:
\[
-[Ax_- + F(x_-)] \in Y_+ \ \Leftrightarrow \ A[-x_- - A^{-1}F(x_-)] \in Y_+ \ \Rightarrow \ -x_- - A^{-1}F(x_-) \in X_+ ,
\]
which then establishes that $-[x_- - T(x_-)] \in X_+$. This establishes the lemma.
For nonlinear problems of the form (A.2), one can use Theorem 10 for monotone nonlinearities to conclude the following.

Corollary 2. (Semi-linear equations with sub-/super-solutions) Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces where $X_+$ is a normal order cone. Let $A : X \to Y$ be a linear, invertible operator satisfying the maximum principle. Let $x_+, x_- \in X$ be elements such that $(x_+ - x_-) \in X_+$, and assume that the operator $F : [x_-, x_+] \subset X \to Y$ is monotone decreasing and compact. If the elements $x_-$ and $x_+$ satisfy the relations
\[
-[Ax_- + F(x_-)] \in Y_+ , \qquad Ax_+ + F(x_+) \in Y_+ , \tag{A.3}
\]
then there exists a solution $x \in [x_-, x_+] \subset X$ of the equation $Ax + F(x) = 0$.

Proof. Since the operator $A$ is invertible, we can rewrite the equation $Ax + F(x) = 0$ as a fixed-point equation,
\[
x = -A^{-1}F(x) =: T(x). \tag{A.4}
\]
By Lemma 25, we know that the map $T : X \to X$ is monotone increasing. Moreover, this operator $T$ is compact, since it is the composition of the continuous mapping $-A^{-1}$ and the compact map $F$. The elements $x_-$ and $x_+$ satisfy Eq. (A.3), therefore, by Lemma 26, they are also sub- and super-solutions for the fixed-point equation involving the map $T$. It follows from Theorem 10 that there exists an element $x \in X$ that solves the fixed-point equation (A.4), and this solution satisfies the bounds $x_- \leq x \leq x_+$.
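The sub-/super-solution argument can be tried on a toy problem. The following is a minimal Python sketch under assumptions chosen only for illustration: $X = Y = \mathbb{R}^2$ with the cone $\mathbb{R}^2_+$, the hypothetical matrix $A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$ with its explicit inverse, and the monotone decreasing nonlinearity $F(x)_i = -(c_i + \arctan x_i)$ with $c = (1, 0.5)$. The constant vectors $x_- = (0, 0)$ and $x_+ = (3, 3)$ then satisfy the relations (A.3), and iterating $T = -A^{-1}F$ from both ends brackets a solution of $Ax + F(x) = 0$.

```python
import math
from typing import Tuple

Vector = Tuple[float, float]
Matrix = Tuple[Vector, Vector]

A: Matrix = ((2.0, -1.0), (-1.0, 2.0))
A_INV: Matrix = ((2.0 / 3.0, 1.0 / 3.0), (1.0 / 3.0, 2.0 / 3.0))  # entrywise >= 0
C: Vector = (1.0, 0.5)


def matvec(m: Matrix, x: Vector) -> Vector:
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])


def F(x: Vector) -> Vector:
    """Monotone decreasing nonlinearity: each component decreases as x_i grows."""
    return (-(C[0] + math.atan(x[0])), -(C[1] + math.atan(x[1])))


def T(x: Vector) -> Vector:
    """Fixed-point map T = -A^{-1} F, monotone increasing by Lemma 25."""
    fx = F(x)
    return matvec(A_INV, (-fx[0], -fx[1]))


def residual(x: Vector) -> Vector:
    """Residual A x + F(x) of the semi-linear equation."""
    ax, fx = matvec(A, x), F(x)
    return (ax[0] + fx[0], ax[1] + fx[1])


x_minus: Vector = (0.0, 0.0)  # sub-solution:   A x + F(x) <= 0 componentwise
x_plus: Vector = (3.0, 3.0)   # super-solution: A x + F(x) >= 0 componentwise
assert all(r <= 0.0 for r in residual(x_minus))
assert all(r >= 0.0 for r in residual(x_plus))

lower, upper = x_minus, x_plus
for _ in range(200):
    lower, upper = T(lower), T(upper)

print(lower, upper, residual(lower))
```

In this example the two iterations meet at a single point, so the solution in $[x_-, x_+]$ is unique; the corollary itself only asserts existence.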
A.4. Sobolev spaces on closed manifolds. In this Appendix we recall some properties of Sobolev spaces of sections of vector bundles over closed manifolds. The following definition makes precise what we mean by fractional order Sobolev spaces. We expect that without much difficulty all the results in this paper can be modified to reflect other smoothness classes such as Bessel potential spaces or general Besov spaces.

Definition 2. For $s \geq 0$ and $1 \leq p \leq \infty$, we denote by $W^{s,p}(\mathbb{R}^n)$ the space of all distributions $u$ defined in $\mathbb{R}^n$, such that
(a) when $s = m$ is an integer,
\[
\|u\|_{m,p} = \sum_{|\nu| \leq m} \|\partial^\nu u\|_p < \infty,
\]
where $\|\cdot\|_p$ is the standard $L^p$-norm in $\mathbb{R}^n$;
(b) and when $s = m + \sigma$ with $m$ a (nonnegative) integer and $\sigma \in (0, 1)$,
\[
\|u\|_{s,p} = \|u\|_{m,p} + \sum_{|\nu| = m} \|\partial^\nu u\|_{\sigma,p} < \infty,
\]
where
\[
\|u\|_{\sigma,p} = \left( \int_{\mathbb{R}^n \times \mathbb{R}^n} \frac{|u(x) - u(y)|^p}{|x - y|^{n + \sigma p}} \, dx \, dy \right)^{1/p}, \qquad \text{for } 1 \leq p < \infty,
\]
and
\[
\|u\|_{\sigma,\infty} = \operatorname*{ess\,sup}_{x, y \in \mathbb{R}^n} \frac{|u(x) - u(y)|}{|x - y|^{\sigma}} .
\]
For $s < 0$ and $1 < p < \infty$, $W^{s,p}(\mathbb{R}^n)$ denotes the topological dual of $W^{-s,p'}(\mathbb{R}^n)$, where $\frac{1}{p} + \frac{1}{p'} = 1$.

These well known spaces are Banach spaces with the corresponding norms, and become Hilbert spaces when $p = 2$. We refer to [18,46] and references therein for further properties. Now we will define analogous spaces on closed manifolds. Let $M$ be an $n$-dimensional smooth closed manifold, and let $\{(U_i, \varphi_i)\}$ be a collection of charts such that $\{U_i\}$ forms a finite cover of $M$. Then for any distribution $u \in C_0^\infty(U_i)^*$, the pull-back $\varphi_i^*(u) \in C_0^\infty(\varphi_i(U_i))^*$ is defined by $\varphi_i^*(u)(v) = u(v \circ \varphi_i)$ for all $v \in C_0^\infty(\varphi_i(U_i))$. Extending $\varphi_i^*(u)$ by zero outside $\varphi_i(U_i)$, in the following we treat it as a distribution on $\mathbb{R}^n$. Let $\{\chi_i\}$ be a smooth partition of unity subordinate to $\{U_i\}$.

Definition 3. For $s \in \mathbb{R}$ and $p \in (1, \infty)$, we denote by $W^{s,p}(M)$ the space of all distributions $u$ defined in $M$, such that
\[
\|u\|_{s,p} = \sum_i \|\varphi_i^*(\chi_i u)\|_{s,p} < \infty, \tag{A.5}
\]
where the norm under the sum is the $W^{s,p}(\mathbb{R}^n)$-norm. In case $s \geq 0$, these Sobolev spaces can also be defined for $p = 1$ and $p = \infty$.
We collect the most basic properties of these spaces in the following lemma. Recall that a Riemannian metric on $M$ induces a volume form on $M$, so that $L^p$ spaces can be defined on $M$ (cf. [43]).

Lemma 27. Either let $s \geq 0$ and $p \in [1, \infty]$ or let $s < 0$ and $p \in (1, \infty)$. Then the space $W^{s,p}(M)$ is a Banach space. It is independent of the choice of the covering charts $\{(U_i, \varphi_i)\}$ and the partition of unity $\{\chi_i\}$. In particular, the different norms (A.5) are equivalent. Moreover, the following are true when $M$ is equipped with a smooth Riemannian metric.
(a) Let $\nabla$ be the Levi-Civita connection associated to the Riemannian metric. Then for any nonnegative integer $m$, the expression
\[
\sum_{i=0}^{m} \|\nabla^i u\|_p
\]
is an equivalent norm on $W^{m,p}(M)$. In particular, we have $W^{0,p}(M) = L^p(M)$.
(b) Identifying $C^\infty(M)$ as a subspace of distributions via the $L^2$-inner product, $C^\infty(M)$ is densely embedded in $W^{s,p}(M)$ for any $s \in \mathbb{R}$ and $p \in (1, \infty)$.
(c) Let $s \in \mathbb{R}$ and $p \in (1, \infty)$. Then the $L^2$-inner product on $C^\infty(M)$ extends uniquely to a continuous bilinear pairing $W^{s,p}(M) \otimes W^{-s,p'}(M) \to \mathbb{R}$, where $\frac{1}{p} + \frac{1}{p'} = 1$. Moreover, the pairing induces a topological isomorphism between $W^{-s,p'}(M)$ and the topological dual space of $W^{s,p}(M)$.

Proof. See for example [3,19,43,45].

A main goal of this subsection is to extend the previous lemma to the case when the Riemannian metric is not smooth. The following result will be of importance.

Lemma 28. Let $s_i \geq s$ with $s_1 + s_2 \geq 0$, and $1 \leq p, p_i \leq \infty$ ($i = 1, 2$) be real numbers satisfying
\[
s_i - s \geq n\left(\frac{1}{p_i} - \frac{1}{p}\right), \qquad s_1 + s_2 - s > n\left(\frac{1}{p_1} + \frac{1}{p_2} - \frac{1}{p}\right),
\]
where the strictness of the inequalities can be interchanged if $s \in \mathbb{N}_0$. In case $\min(s_1, s_2) < 0$, in addition let $1 < p, p_i < \infty$, and let
\[
s_1 + s_2 \geq n\left(\frac{1}{p_1} + \frac{1}{p_2} - 1\right).
\]
Then, the pointwise multiplication of functions extends uniquely to a continuous bilinear map $W^{s_1,p_1}(M) \otimes W^{s_2,p_2}(M) \to W^{s,p}(M)$.

Proof. A proof is given in [55] for the case $s \geq 0$, and by using a duality argument one can easily extend the proof to negative values of $s$.

Some important special cases are considered in the following corollary:
Corollary 3. (a) If $p \in (1, \infty)$ and $s \in (\frac{n}{p}, \infty)$, then $W^{s,p}$ is a Banach algebra. Moreover, if in addition $q \in (1, \infty)$ and $\sigma \in [-s, s]$ satisfy $\sigma - \frac{n}{q} \in [-n - s + \frac{n}{p}, s - \frac{n}{p}]$, then the pointwise multiplication is bounded as a map $W^{s,p} \otimes W^{\sigma,q} \to W^{\sigma,q}$. (b) Let $1 < p, q < \infty$ and $\sigma \leq s \geq 0$ satisfy $\sigma - \frac{n}{q} < 2(s - \frac{n}{p})$ and $\sigma - \frac{n}{q} \leq s - \frac{n}{p}$. Then the pointwise multiplication is bounded as a map $W^{s,p} \otimes W^{s,p} \to W^{\sigma,q}$.

The following lemma is proved in [33] for the case $p = q = 2$. With the help of Lemma 28, the proof can easily be adapted to the following general case.

Lemma 29. Let $p \in (1, \infty)$ and $s \in (\frac{n}{p}, \infty)$, and let $u \in W^{s,p}$. Let $\sigma \in [-1, 1]$ and $\frac{1}{q} \in (\frac{1+\sigma}{2}\delta, 1 - \frac{1-\sigma}{2}\delta)$, where $\delta = \frac{1}{p} - \frac{s-1}{n}$, and let $v \in W^{\sigma,q}$. Moreover, let $f : [\inf u, \sup u] \to \mathbb{R}$ be a smooth function. Then, we have
\[
\|v(f \circ u)\|_{\sigma,q} \leq C \|v\|_{\sigma,q} \bigl( \|f \circ u\|_\infty + \|f' \circ u\|_\infty \|u\|_{s,p} \bigr),
\]
where the constant $C$ does not depend on $u$, $v$ or $f$.

Proof. We consider the case $\sigma = 1$ first. Choosing a smooth Riemannian metric on $M$, we have
\begin{align*}
\|v(f \circ u)\|_{1,q} &\leq C \bigl( \|v(f \circ u)\|_q + \|\nabla[v(f \circ u)]\|_q \bigr) \\
&\leq C \bigl( \|v(f \circ u)\|_q + \|(\nabla v)(f \circ u)\|_q + \|v(f' \circ u)\nabla u\|_q \bigr) \\
&\leq C \bigl( \|v\|_q \|f \circ u\|_\infty + \|v\|_{1,q} \|f \circ u\|_\infty + \|f' \circ u\|_\infty \|v \nabla u\|_q \bigr).
\end{align*}
By Lemma 28, for $\frac{1}{q} \geq \delta$, the last term can be bounded as
\[
\|v \nabla u\|_q \leq C \|v\|_{1,q} \|\nabla u\|_{s-1,p} \leq C \|v\|_{1,q} \|u\|_{s,p},
\]
proving the lemma for the case $\sigma = 1$. By using duality one proves the case $\sigma = -1$ and $\frac{1}{q} \leq 1 - \delta$, and the lemma follows from interpolation.

Let $M$ be an $n$-dimensional smooth closed manifold, and let $E \to M$ be a smooth vector bundle over $M$. Analogously to Definition 3, we define the Sobolev space $W^{s,p}(E)$ of sections of $E$ by utilizing a finite trivializing cover of coordinate charts, a partition of unity subordinate to the cover, and the space $[W^{s,p}(\mathbb{R}^n)]^k$ of vector functions, where $k$ is the fiber dimension of $E$. Then, Lemma 27 holds for these spaces with obvious modifications. When there is no risk of confusion, we will omit the explicit specification of the vector bundle $E$ from the notation $W^{s,p}(E)$. In the following lemma we consider nonsmooth Riemannian structures on $E$ and nonsmooth volume forms on $M$.

Lemma 30. Let $\gamma \in (1, \infty)$ and $\alpha \in (\frac{n}{\gamma}, \infty)$. Fix on $M$ a volume form of class $W^{\alpha,\gamma}$, and on $E$ a Riemannian structure of class $W^{\alpha,\gamma}$.
(a) Let $p \in (1, \infty)$ and $s \leq \min\{\alpha, \alpha + n(\frac{1}{p} - \frac{1}{\gamma})\}$. Then identifying the space $C^\infty(E)$ of smooth sections of $E$ as a subspace of distributions via the $L^2$-inner product, $C^\infty(E)$ is densely embedded in $W^{s,p}(E)$.
(b) Let $s \in [-\alpha, \alpha]$, $p \in (1, \infty)$, and $s - \frac{n}{p} \in [-n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Then the $L^2$-inner product on $C^\infty(E)$ extends uniquely to a continuous bilinear pairing $W^{s,p}(E) \otimes W^{-s,p'}(E) \to \mathbb{R}$, where $\frac{1}{p} + \frac{1}{p'} = 1$. Moreover, the pairing induces a topological isomorphism $[W^{s,p}(E)]^* \cong W^{-s,p'}(E)$.
Proof. We will prove the lemma for scalar functions on $M$, i.e., for the trivial bundle $E = M \times \mathbb{R}$. The general case is only more technical. Fixing a smooth volume form on $M$ and denoting the associated $L^2$-inner product by $(\cdot, \cdot)_*$, the $L^2$-inner product associated to the nonsmooth volume form (and the nonsmooth metric on $M \times \mathbb{R}$) satisfies
\[
(u, v)_{L^2} = (hu, v)_*, \qquad u, v \in C^\infty(M),
\]
with some strictly positive function $h \in W^{\alpha,\gamma}$. From Lemma 28, we have that multiplication by $h$ is continuous on $W^{s,p}$ for $s \in [-\alpha, \alpha]$, $p \in (1, \infty)$, and $s - \frac{n}{p} \in [-n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Since $h > 0$, this operation is invertible, hence a homeomorphism on $W^{s,p}$. Now by using Lemma 27 we complete the proof.

Corollary 4. Let $\gamma \in (1, \infty)$ and $\alpha \in (\frac{n}{\gamma}, \infty)$. Fix on $M$ a volume form of class $W^{\alpha,\gamma}$, and on $E$ a Riemannian structure of class $W^{\alpha,\gamma}$. With $s \in [-\alpha, \alpha]$, $p \in (1, \infty)$, and $s - \frac{n}{p} \in [-n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$, let $A : L^p \to W^{s,p}$ be a bounded linear operator and let $A^*$ be its formal $L^2$-adjoint, i.e., let
\[
(Au, v)_{L^2} = (u, A^* v)_{L^2}, \qquad \text{for } u, v \in C^\infty(E).
\]
Then, $A^*$ extends uniquely to a bounded linear map $A^* : W^{-s,p'} \to L^{p'}$, and we have
\[
\langle Au, v \rangle = \langle u, A^* v \rangle, \qquad \text{for } u \in L^p(E),\ v \in W^{-s,p'}(E),
\]
where $\langle \cdot, \cdot \rangle$ denotes the extension of the $L^2$-inner product.

Proof. This is an application of Lemma 30.
A.5. Elliptic operators on closed manifolds. In this Appendix we will state a priori estimates for general elliptic operators in some Sobolev spaces. Let $M$ be an $n$-dimensional smooth closed manifold, and let $E \to M$ be a smooth vector bundle over $M$. Let $C^{-\infty}(E)$ be the topological dual of the space $C^\infty(E)$ of smooth sections of $E$. Then for $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, and $\gamma \in [1, \infty]$, we define $D_m^{\alpha,\gamma}(E)$ to be the space of differential operators $A : C^\infty(E) \to C^{-\infty}(E)$ that can be written in local coordinates (trivializing $E$) as
\[
A = \sum_{|\nu| \leq m} a^\nu \partial_\nu \quad \text{with} \quad a^\nu \in W^{\alpha - m + |\nu|, \gamma}(\mathbb{R}^n, \mathbb{R}^{k \times k}), \quad |\nu| \leq m,
\]
where $k$ is the fiber dimension of $E$. One can easily verify that if the metric of a Riemannian manifold is in $W^{\alpha,\gamma}$ with $\alpha\gamma > n$, then both the Laplace-Beltrami operator and the vector Laplacian defined in (2.17) are in the classes $D_2^{\alpha,\gamma}(M \times \mathbb{R})$ and $D_2^{\alpha,\gamma}(TM)$, respectively.

Lemma 31. Let $A$ be a differential operator of class $D_m^{\alpha,\gamma}(E)$. Then, $A$ can be extended to a bounded linear map
\[
A : W^{s,q}(E) \to W^{\sigma,q}(E),
\]
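for $q \in (1, \infty)$, $s \geq m - \alpha$, and $\sigma$ satisfying
\[
\sigma \leq \min\{s, \alpha\} - m, \qquad \sigma - \frac{n}{q} \leq \alpha - \frac{n}{\gamma} - m, \qquad \sigma < s - m + \alpha - \frac{n}{\gamma}, \qquad \text{and} \qquad s - \frac{n}{q} \geq m - n - \alpha + \frac{n}{\gamma}.
\]

Proof. This is a straightforward application of Lemma 28.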
The Laplace-Beltrami operator and vector Laplacian are elliptic operators. We now consider local a priori estimates for general elliptic operators. For any subset $U \subset M$, the $W^{s,p}(U)$-norm is denoted by $\|\cdot\|_{s,p,U}$.

Lemma 32. Let $A \in D_m^{\alpha,\gamma}(E)$ be an elliptic operator with $\alpha - \frac{n}{\gamma} > \max\{0, \frac{m-n}{2}\}$. Let $q \in (1, \infty)$, $s \in (m - \alpha, \alpha]$, and $s - \frac{n}{q} \in (m - n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Then for any $y \in M$, there exist a constant $c > 0$ and open neighborhoods $K \subset U \subset M$ of $y$ such that
\[
c\|\chi u\|_{s,q} \leq \|Au\|_{s-m,q} + \|u\|_{s-1,q,U}, \tag{A.6}
\]
for any $u \in W^{s,q}(E)$ and $\chi \in C_0^\infty(K)$ with $\chi \geq 0$.

Proof. We work in a local chart containing $y$, which trivializes $E$. Let $K$ be the open ball of radius $r$ centered at $y$ contained in the domain of the chart and extend the coefficients of $A$ outside $K$ so that the resulting operator is still in $D_m^{\alpha,\gamma}$, with appropriate vector fields over $\mathbb{R}^n$. We make the decomposition $A = L + R + B$, where $L$ is the highest order term of $A$ with coefficients frozen at $y$, and $R$ is what remains in the highest order terms, i.e.,
\[
L = \sum_{|\nu| = m} a^\nu(y) \partial_\nu, \qquad R = \sum_{|\nu| = m} [a^\nu - a^\nu(y)] \partial_\nu .
\]
Then $B = A - L - R$ consists of the lower order terms. From the theory of constant coefficient elliptic operators, we infer the existence of a constant $c > 0$ such that for any $u \in W^{s,q}(E)$ with $\operatorname{supp} u \subset K$,
\[
c\|u\|_{s,q} \leq \|Lu\|_{s-m,q} + \|u\|_{s-m,q} \leq \|Au\|_{s-m,q} + \|Ru\|_{s-m,q} + \|Bu\|_{s-m,q} + \|u\|_{s-m,q} .
\]
Since $\alpha > \frac{n}{\gamma}$, without loss of generality we can assume for $|\nu| = m$ that $a^\nu \in C^{0,h}$ for some $h > 0$, so
\[
\|Ru\|_{s-m,q} \leq C r^h \|u\|_{s,q},
\]
where $C$ is a constant depending only on $A$. By choosing $r$ so small that $C r^h \leq \frac{c}{2}$, we have
\[
\frac{c}{2}\|u\|_{s,q} \leq \|Au\|_{s-m,q} + \|Bu\|_{s-m,q} + \|u\|_{s-m,q} .
\]
Now we will work with the lower order term. Choose $\delta \in (0, \alpha - \frac{n}{\gamma})$ such that $\delta \leq \min\{1, s + \alpha - m, s - \frac{n}{q} + \alpha - \frac{n}{\gamma} + n - m\}$. We have $B \in D_{m-1}^{\alpha-1,\gamma}$, so by Lemma 31,
$B : W^{s-\delta,q} \to W^{s-m,q}$ is bounded. Then using a well known interpolation inequality, we get
\[
\|Bu\|_{s-m,q} \leq C\|u\|_{s-\delta,q} \leq C\varepsilon\|u\|_{s,q} + C\varepsilon^{-(m-\delta)/\delta}\|u\|_{s-m,q},
\]
for any $\varepsilon > 0$. Choosing $\varepsilon > 0$ sufficiently small, we conclude that
\[
c\|u\|_{s,q} \leq \|Au\|_{s-m,q} + \|u\|_{s-m,q}, \qquad \forall u \in W^{s,q}(E),\ \operatorname{supp} u \subset K .
\]
We apply this inequality to $\chi u$, and then observing that $[A, \chi]$ is in $D_{m-1}^{\alpha,\gamma}(M)$, we obtain (A.6).

We can easily globalize the above result as follows:

Corollary 5. Let the conditions of Lemma 32 hold. Then there exists a constant $c > 0$ such that
\[
c\|u\|_{s,q} \leq \|Au\|_{s-m,q} + \|u\|_{s-m,q}, \qquad \forall u \in W^{s,q}(E). \tag{A.7}
\]

Proof. We first cover $M$ by open neighborhoods $K$ by applying Lemma 32 to every point $y \in M$, and then choose a finite subcover of the resulting cover. Then a partition of unity argument gives (A.7) with the term $\|u\|_{s-m,q}$ replaced by $\|u\|_{s-1,q}$, and finally one can use an interpolation inequality to get the conclusion.

Let us recall the following well known results from functional analysis.

Lemma 33. Let $X$ and $Y$ be Banach spaces with continuous embedding $X \hookrightarrow Y$. Let $A : X \to Y$ be a continuous linear map. Then (a) A necessary and sufficient condition that the graph of $A$ be closed in $X \times Y$ is that there exists a constant $c > 0$ such that $c\|u\|_X \leq \|Au\|_Y + \|u\|_Y$ for all $u \in X$. (b) If in addition the embedding $X \hookrightarrow Y$ is compact then the range of $A$ is closed and the kernel of $A$ is finite-dimensional.

As an immediate consequence, we obtain the following result.

Lemma 34. Let $A \in D_m^{\alpha,\gamma}(E)$ be an elliptic operator with $\alpha - \frac{n}{\gamma} > \max\{0, \frac{m-n}{2}\}$. Let $q \in (1, \infty)$, $s \in (m - \alpha, \alpha]$, and $s - \frac{n}{q} \in (m - n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Then, the operator $A : W^{s,q}(E) \to W^{s-m,q}(E)$ is semi-Fredholm, i.e., its range is closed and the kernel is finite-dimensional.

A.6. Maximum principles on closed manifolds. In this Appendix, we present maximum principles for operators of the form $-\nabla \cdot (u\nabla)$ with positive function $u$, followed by a simple application. These types of results are well known, but nevertheless we state them here for completeness. It is convenient at times when working with barriers and maximum principle arguments to split real valued functions into positive and negative parts; we will use the following notation for these concepts:
\[
\phi^+ := \max\{\phi, 0\}, \qquad \phi^- := -\min\{\phi, 0\},
\]
whenever they make sense. In the proof of the following lemma we will use the fact that for $\phi \in W^{1,p}$, $\phi^+ \in W^{1,p}$ holds, and so $\phi^- \in W^{1,p}$, cf. [38].
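The splitting into positive and negative parts is a pointwise operation, and its elementary identities can be checked directly. The following is a minimal Python sketch on a few sample values; the function names `pos_part` and `neg_part` are hypothetical. It verifies $\phi = \phi^+ - \phi^-$, $|\phi| = \phi^+ + \phi^-$, and $\phi^+ \phi^- = 0$; the last identity also gives $-\phi\,\phi^- = (\phi^-)^2 \geq 0$.

```python
from typing import List


def pos_part(phi: float) -> float:
    """Positive part: max(phi, 0)."""
    return max(phi, 0.0)


def neg_part(phi: float) -> float:
    """Negative part: -min(phi, 0), which is non-negative."""
    return -min(phi, 0.0)


samples: List[float] = [-2.5, -1.0, 0.0, 0.5, 3.0]

for phi in samples:
    p, m = pos_part(phi), neg_part(phi)
    # phi = phi^+ - phi^-, |phi| = phi^+ + phi^-, and the two parts never overlap.
    print(phi, p, m, p - m == phi, p + m == abs(phi), p * m == 0.0)
```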
Lemma 35. Let p ∈ (1, ∞) and s ∈ ( np , ∞) ∩ [1, ∞), and let (M, h ab ) be an n-dimensional, smooth, closed manifold with a Riemannian metric h ab ∈ W s, p . Moreover, let u ∈ W s, p be a function with u > 0 and let f ∈ W s−2, p . Let φ ∈ W s, p be such that for all ϕ ∈ C+∞ .
u∇φ, ∇ϕ + f, φϕ 0,
(A.8)
(a) If $f \neq 0$ and $\langle f, \varphi\rangle \geq 0$ for all $\varphi \in C_+^{\infty}$, then $\phi \geq 0$. (b) If $M$ is connected and $\phi \geq 0$, then either $\phi \equiv 0$ or $\phi > 0$ everywhere.

Proof. For (a), we will follow the proof of [33, Lemma 2.9]. Since $\phi \in W^{1,n}$, we have $\phi^- \in W_+^{1,n}$ and $-\phi\phi^- \in W_+^{1,n}$. Note that $W^{1,n} \hookrightarrow (W^{s-2,p})^*$ by $n \geq 2$. Now, using the positivity of $f$ and the property (A.8), by density we get
\[
0 \geq \langle f, \phi\phi^-\rangle \geq -\langle u\nabla\phi, \nabla\phi^-\rangle = \langle u\nabla\phi^-, \nabla\phi^-\rangle,
\]
implying that $\phi^- = \mathrm{const}$. So if $\phi < 0$, it would have to be a negative constant. But property (A.8) gives $\langle f, \varphi\rangle \leq 0$ for all $\varphi \in C_+^{\infty}$, which, in combination with the positivity, implies $f = 0$. This contradicts the hypothesis $f \neq 0$ and proves (a).

Now we will prove (b). Since $\phi$ is continuous, the level set $\phi^{-1}(0) \subset M$ is closed. Following the proof of [35, Lemma 5.3], we apply the weak Harnack inequality [47, Theorem 5.2] to show that $\phi^{-1}(0)$ is also open. Then by connectedness of $M$ we will have the proof. The weak Harnack inequality [47, Theorem 5.2] can be applied to second order elliptic operators of the form
\[
L\phi = \partial_i(a^{ij}\partial_j\phi + a^i\phi) + b^j\partial_j\phi + a\phi,
\]
where $a^{ij}$ are continuous, $a^i, b^j \in L^{2t}$, and $a \in L^{t}$ for some $t > \frac{n}{2}$. The first term in (A.8) satisfies these conditions, and the second term can be cast into a form satisfying the conditions (details can be found in the proof of [35, Lemma 5.3]). Now suppose that $\phi(x) = 0$ for some $x \in M$, and let us work in local coordinates around $x$. Then the weak Harnack inequality says that for sufficiently small $R > 0$, and for some $p > t$,
\[
\|\phi\|_{L^p(B(x,2R))} \leq C R^{\frac{n}{p}} \inf_{B(x,R)} \phi,
\]
where $B(x,R)$ denotes the open ball of radius $R$ (in the background flat metric) centered at $x$, and $C$ is a constant that depends only on $t$, $p$, and the differential operator. Since $\phi(x) = 0$ and $\phi$ is nonnegative, the infimum is zero and the inequality implies that $\phi \equiv 0$ in a neighborhood of $x$. Hence the set $\phi^{-1}(0)$ is open.

Lemma 36. Let the hypotheses of Lemma 35 (b) hold, and define the operator $L : W^{s,p} \to W^{s-2,p}$ by
\[
\langle L\phi, \varphi\rangle = \langle u\nabla\phi, \nabla\varphi\rangle + \langle f, \phi\varphi\rangle, \qquad \phi \in W^{s,p},\ \varphi \in C^{\infty}.
\]
Then, $L$ is bounded and invertible.

Proof. By Lemma 34, the operator $L$ is semi-Fredholm, and moreover since $L$ is formally self-adjoint, it is Fredholm. It is well known that when the metric is smooth, the index of $L$ is zero independent of $s$ and $p$. We can approximate the metric $h$ by smooth metrics so that $L$ is arbitrarily close to a Fredholm operator with index zero. Since the level sets of index as a function on Fredholm operators are open, we conclude that the index of $L$ is zero. The injectivity of $L$ follows from Lemma 35(a), for if $\phi_1$ and $\phi_2$ are two solutions of $L\phi = g$, then the above lemma implies that $\phi_1 - \phi_2 \geq 0$ and $\phi_2 - \phi_1 \geq 0$.
A.7. The Yamabe classification of nonsmooth metrics. Let $M$ be a smooth, closed, connected $n$-dimensional Riemannian manifold with a smooth metric $h$, where we assume throughout this section that $n \geq 3$. With a positive scalar $\varphi$, let $\tilde h$ be related to $h$ by the conformal transformation $\tilde h = \varphi^{2^*-2}h$, where $2^* = \frac{2n}{n-2}$. We say that $\tilde h$ and $h$ are conformally equivalent, and this defines an equivalence relation on the space of metrics. The equivalence class containing $h$ will be denoted by $[h]$; e.g., $\tilde h \in [h]$. It is well known that any smooth Riemannian metric $h$ on a given closed connected manifold $M$ satisfies one and only one of the following three conditions:

$Y^+$: There is a metric in $[h]$ with strictly positive scalar curvature;
$Y^0$: There is a metric in $[h]$ with vanishing scalar curvature;
$Y^-$: There is a metric in $[h]$ with strictly negative scalar curvature.

These conditions define three disjoint classes in the space of metrics: they are referred to as the Yamabe classes. We will extend the above classification to metrics in the Sobolev spaces $W^{s,p}$ under rather mild conditions on $s$ and $p$. Since the case $p = 2$ is treated in [33] and the argument there easily extends to our slightly more general setting, we shall only sketch the proof here. Given a Riemannian metric $h \in W^{s,p}$, let us consider the functional $E : W^{1,2} \to \mathbb{R}$ defined by
\[
E(\varphi) = (a\nabla\varphi, \nabla\varphi) + \langle R, \varphi^2\rangle,
\]
where $a = 4\frac{n-1}{n-2}$. By Corollary 3, the pointwise multiplication is bounded on $W^{1,2} \otimes W^{1,2} \to W^{\sigma,q}$ for $\sigma \leq 1$ and $\sigma - \frac{n}{q} < 2 - n$. Putting $\sigma = 2 - s$ and $q = p'$, these conditions read as $2 - s - \frac{n}{p'} = 2 - n - s + \frac{n}{p} < 2 - n$, or $s - \frac{n}{p} > 0$, and $s \geq 1$.
So if $sp > n$ and $s \geq 1$, $\varphi^2 \in W^{2-s,p'}$ for $\varphi \in W^{1,2}$, meaning that the second term is bounded in $W^{1,2}$. By using the functional $E$, we define the quantity
\[
\mu_q = \mu_q(h) = \inf_{\varphi \in B_q} E(\varphi), \qquad \text{where } B_q = \{\varphi \in W^{1,2} : \|\varphi\|_q = 1\}.
\]
Under the conditions $sp > n$ and $s \geq 1$, one can show that $\mu_q$ is finite for $q \geq 2$, and moreover that $\mu_{2^*}$ is a conformal invariant, i.e., $\mu_{2^*}(h) = \mu_{2^*}(\tilde h)$ for any two metrics $\tilde h \in [h]$, now allowing $W^{s,p}$ functions for the conformal factor. We refer to $\mu_{2^*}(h)$ as the Yamabe invariant of the metric $h$, and we will see that the Yamabe classes correspond to the signs of the Yamabe invariant.

Theorem 11. Let $(M, h)$ be a smooth, closed, connected Riemannian manifold with dimension $n \geq 3$ and with a metric $h \in W^{s,p}$, where we assume $sp > n$ and $s \geq 1$. Let $q \in [2, 2^*)$. Then, there exists $\phi \in W^{s,p}$, $\phi > 0$ in $M$, such that
\[
-a\Delta\phi + R\phi = \mu_q\phi^{q-1}, \quad \text{and} \quad \|\phi\|_q = 1, \tag{A.9}
\]
where $\mu_q = \mu_q(h)$ is as defined above.

Proof. The above equation is the Euler-Lagrange equation for the functional $E$, so it suffices to show that $E$ attains its infimum $\mu_q$ over $B_q$ at a positive function $\phi \in W^{s,p}$. Let $\{\phi_i\} \subset B_q$ be a sequence satisfying $E(\phi_i) \to \mu_q$. From the continuity of the embedding $L^q \hookrightarrow L^2$, we have that $\{\phi_i\}$ is bounded in $L^2$. It is the content of [33, Lemma 3.1] that
\[
E(\varphi) \geq C_1\|\varphi\|_{1,2}^2 - C_2\|\varphi\|_2^2, \qquad \varphi \in W^{1,2},
\]
for metrics in $W^{s,2}$ with $s > \frac{n}{2}$. The proof works verbatim for our case, and since $\mu_q$ is finite, from this we conclude that $\{\phi_i\}$ is bounded in $W^{1,2}$. By the reflexivity of $W^{1,2}$ and the compactness of $W^{1,2} \hookrightarrow L^q$, there exist an element $\phi \in W^{1,2}$ and a subsequence $\{\phi_i'\} \subset \{\phi_i\}$ such that $\phi_i' \rightharpoonup \phi$ in $W^{1,2}$ and $\phi_i' \to \phi$ in $L^q$. The latter implies $\phi \in B_q$. It is not difficult to show that $E$ is weakly lower semi-continuous, and it follows that $E(\phi) = \mu_q$, so $\phi$ satisfies (A.9). Bootstrapping with Corollary 5 implies that $\phi \in W^{s,p} \hookrightarrow W^{1,n}$, so that $|\phi| \in W^{1,n}$. Since $E(|\phi|) = E(\phi)$, after replacing $\phi$ by $|\phi|$, we can assume that $\phi \geq 0$. Finally, bootstrapping again gives $\phi \in W^{s,p}$, and since $\phi \neq 0$ as $\phi \in B_q$, by Lemma 35 we have $\phi > 0$.

Under the conformal scaling $\tilde h = \varphi^{2^*-2}h$, the scalar curvature transforms as
\[
\tilde R = \varphi^{1-2^*}(-a\Delta\varphi + R\varphi),
\]
so assuming the conditions of the above theorem we infer that any given metric $h \in W^{s,p}$ can be transformed to the metric $\tilde h = \phi^{2^*-2}h$ with the continuous scalar curvature $\tilde R = \mu_q\phi^{q-2^*}$, where the conformal factor $\phi$ is as in the theorem. In other words, given any metric $h_{ab} \in W^{s,p}$, there exist continuous functions $\phi \in W^{s,p}$ with $\phi > 0$ and $\tilde R \in W^{s,p}$ having constant sign, such that
\[
-a\Delta\phi + R\phi = \tilde R\phi^{2^*-1}. \tag{A.10}
\]
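As a brief check of how (A.9) feeds into the transformation law for the scalar curvature, the substitution can be written out explicitly. This is only a sketch; it assumes that $2^*$ denotes the critical exponent $2n/(n-2)$ introduced at the start of this section.

```latex
\begin{align*}
\tilde R &= \phi^{1-2^*}\bigl(-a\Delta\phi + R\phi\bigr)
  && \text{(conformal scaling with factor } \phi\text{)}\\
         &= \phi^{1-2^*}\,\mu_q\,\phi^{q-1}
  && \text{(by (A.9))}\\
         &= \mu_q\,\phi^{q-2^*}.
\end{align*}
```

Since $\phi > 0$, the function $\tilde R$ has the same sign as the constant $\mu_q$ everywhere, and multiplying the first line by $\phi^{2^*-1}$ recovers (A.10).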
We will prove below that the conformal class of the metric $h$ completely determines the sign of $\tilde R$, giving rise to the Yamabe classification of metrics in $W^{s,p}$. In the class of smooth metrics there is a stronger result known as the Yamabe theorem: each conformal class of smooth metrics contains a metric with constant scalar curvature. The Yamabe theorem is a non-trivial extension of the above theorem to the critical case $q = 2^*$, and we see that for smooth metrics the sign of the Yamabe invariant determines which Yamabe class the metric is in. A proof of the Yamabe theorem requires more delicate techniques since we lose the compactness of the embedding $W^{1,2} \hookrightarrow L^q$, see e.g. [31] for a treatment of smooth metrics. As far as we know there has not appeared in the literature an explicit proof of the Yamabe theorem for nonsmooth metrics such as the ones considered in this paper, although it is generally expected to be true. We will not pursue this issue here; however, the following simpler result justifies the Yamabe classification of nonsmooth metrics.

Theorem 12. Let $(M, h)$ be a smooth, closed, connected Riemannian manifold with dimension $n \geq 3$ and with a metric $h \in W^{s,p}$, where we assume $sp > n$ and $s \geq 1$. Then, the following hold:
• $\mu_{2^*} > 0$ iff there is a metric in $[h]$ with continuous positive scalar curvature.
• $\mu_{2^*} = 0$ iff there is a metric in $[h]$ with vanishing scalar curvature.
• $\mu_{2^*} < 0$ iff there is a metric in $[h]$ with continuous negative scalar curvature.
In particular, two conformally equivalent metrics cannot have scalar curvatures with distinct signs.

Proof. We begin by proving that if there is a metric in $[h]$ with continuous scalar curvature of constant sign, then $\mu_{2^*}$ has the corresponding sign. Since $\mu_{2^*}$ is a conformal invariant, we can assume that the scalar curvature $R$ of $h$ is continuous and has constant sign. If $R < 0$, then $E(\varphi) < 0$ for constant test functions $\varphi = \mathrm{const}$ and there
is a constant function in $B_{2^*}$, so we have $\mu_{2^*} < 0$. If $R \geq 0$, then $E(\varphi) \geq 0$ for any $\varphi \in W^{1,2}$, so $\mu_{2^*} \geq 0$. Taking constant test functions, we infer that $R = 0$ implies $\mu_{2^*} = 0$. Now, if $R > 0$ then $E(\varphi)$ defines an equivalent norm on $W^{1,2}$, and we have $1 = \|\varphi\|_{2^*} \leq C\|\varphi\|_{1,2}$ for $\varphi \in B_{2^*}$, so $\mu_{2^*} > 0$.

Next, we will prove that there is a metric in $[h]$ with continuous scalar curvature with the same sign as that of $\mu_{2^*}$. To this end, for any $q \in [2, 2^*)$, we shall show that the sign of $\mu_{2^*}$ determines the sign of $\mu_q$, so that the proof is completed by Theorem 11. If $\mu_{2^*} < 0$, then $E(\varphi) < 0$ for some $\varphi \in B_{2^*}$, and since $E(k\varphi) = k^2E(\varphi)$ for $k \in \mathbb{R}$, there is some $k\varphi \in B_q$ such that $E(k\varphi) < 0$, so $\mu_q < 0$. If $\mu_{2^*} \geq 0$, then $E(\varphi) \geq 0$ for all $\varphi \in B_{2^*}$, and for any $\psi \in B_q$ there is $k$ such that $k\psi \in B_{2^*}$, so $\mu_q \geq 0$. All such $k$ are uniformly bounded since $k = 1/\|\psi\|_{2^*} \leq C/\|\psi\|_q = C$ by the continuity estimate $\|\cdot\|_q \leq C\|\cdot\|_{2^*}$. From this we have for all $\psi \in B_q$,
\[
E(\psi) = E(k\psi)/k^2 \geq \mu_{2^*}/k^2 \geq \mu_{2^*}/C^2,
\]
meaning that $\mu_{2^*} > 0$ implies $\mu_q > 0$. A similar scaling argument gives that if $\mu_{2^*} = 0$ then $\mu_q = 0$.

A.8. Conformal covariance of the Hamiltonian constraint. Let $M$ be a smooth, closed, connected $n$-dimensional manifold equipped with a Riemannian metric $h \in W^{s,p}$, where we assume throughout this section that $p \in (1,\infty)$, $s \in (\frac{n}{p},\infty) \cap [1,\infty)$ and that $n \geq 3$. We consider the Hamiltonian constraint
\[
H(\phi) := -\Delta\phi + \tfrac{1}{r(n-1)}R\phi + a_\tau\phi^{r+1} - a_w\phi^{-r-3} - a_\rho\phi^{-t} = 0,
\]
where $r = \frac{4}{n-2}$ and $t \in \mathbb{R}$ are constants, $R \in W^{s-2,p}$ is the scalar curvature of the metric $h$, and the other coefficients satisfy $a_\tau, a_w, a_\rho \in W_+^{s-2,p}$. In this Appendix, we will be interested in the transformation properties of $H$ under the conformal change $\tilde h = \theta^r h$ of the metric with the conformal factor $\theta \in W^{s,p}$ satisfying $\theta > 0$. To this end, we consider
\[
\tilde H(\psi) := -\tilde\Delta\psi + \tfrac{1}{r(n-1)}\tilde R\psi + \tilde a_\tau\psi^{r+1} - \tilde a_w\psi^{-r-3} - \tilde a_\rho\psi^{-t} = 0,
\]
where $\tilde\Delta$ is the Laplace-Beltrami operator associated to the metric $\tilde h$, $\tilde R \in W^{s-2,p}$ is the scalar curvature of $\tilde h$, and at the moment we do not impose any conditions on the remaining coefficients other than that they satisfy $\tilde a_\tau, \tilde a_w, \tilde a_\rho \in W_+^{s-2,p}$. One can derive the following relations:
\[
\tilde R = \theta^{-r}R - r(n-1)\theta^{-r-1}\Delta\theta, \qquad
\tilde\Delta\psi = \theta^{-r}\Delta\psi + 2\theta^{-r-1}\nabla^a\theta\nabla_a\psi.
\]
Combining these relations with $\Delta(\theta\psi) = \theta\Delta\psi + \psi\Delta\theta + 2\nabla^a\theta\nabla_a\psi$, we obtain
\[
-\tilde\Delta\psi + \tfrac{1}{r(n-1)}\tilde R\psi = \theta^{-r-1}\Bigl(-\Delta(\theta\psi) + \tfrac{1}{r(n-1)}R\theta\psi\Bigr),
\]
which in turn implies that
\[
\tilde H(\psi) = \theta^{-r-1}H(\theta\psi),
\]
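provided in the definition of $\tilde H$ that $\tilde a_\tau = a_\tau$, $\tilde a_w = \theta^{-2r-4}a_w$, and $\tilde a_\rho = \theta^{-t-r-1}a_\rho$. We have proved the following well known result.

Lemma 37. Assume the above setting, so in particular, $\tilde a_\tau = a_\tau$, $\tilde a_w = \theta^{-2r-4}a_w$, and $\tilde a_\rho = \theta^{-t-r-1}a_\rho$. Then we have
\begin{align*}
\tilde H(\psi) = 0 \quad &\Leftrightarrow \quad H(\theta\psi) = 0,\\
\tilde H(\psi) \geq 0 \quad &\Leftrightarrow \quad H(\theta\psi) \geq 0,\\
\tilde H(\psi) \leq 0 \quad &\Leftrightarrow \quad H(\theta\psi) \leq 0.
\end{align*}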
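The identity $\tilde H(\psi) = \theta^{-r-1}H(\theta\psi)$ behind Lemma 37 can be verified term by term. The following sketch uses only the two transformation rules for $\tilde R$ and $\tilde\Delta$ and the product rule for the Laplacian quoted above; no further assumptions are made.

```latex
\begin{align*}
-\tilde\Delta\psi + \tfrac{1}{r(n-1)}\tilde R\psi
 &= -\theta^{-r}\Delta\psi - 2\theta^{-r-1}\nabla^a\theta\nabla_a\psi
    + \tfrac{1}{r(n-1)}\theta^{-r}R\psi - \theta^{-r-1}\psi\Delta\theta \\
 &= \theta^{-r-1}\Bigl[-\bigl(\theta\Delta\psi + \psi\Delta\theta
    + 2\nabla^a\theta\nabla_a\psi\bigr) + \tfrac{1}{r(n-1)}R\,\theta\psi\Bigr] \\
 &= \theta^{-r-1}\Bigl[-\Delta(\theta\psi) + \tfrac{1}{r(n-1)}R\,\theta\psi\Bigr],\\[4pt]
\theta^{-r-1}\,a_\tau(\theta\psi)^{r+1} &= a_\tau\,\psi^{r+1},\\
\theta^{-r-1}\,a_w(\theta\psi)^{-r-3} &= \theta^{-2r-4}a_w\,\psi^{-r-3},\\
\theta^{-r-1}\,a_\rho(\theta\psi)^{-t} &= \theta^{-t-r-1}a_\rho\,\psi^{-t}.
\end{align*}
```

The last three lines show where the scalings of $\tilde a_\tau$, $\tilde a_w$, and $\tilde a_\rho$ come from. Because $\theta^{-r-1} > 0$, equalities and inequalities for $\tilde H(\psi)$ and $H(\theta\psi)$ are equivalent, which is the content of the lemma.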
A.9. General conformal rescaling and the near-CMC condition. In this article we focused on the standard conformal method to produce the particular coupled elliptic PDE system that we analyzed. Here we examine briefly other decompositions to see if it is possible to remove the near-CMC obstacle for non-CMC existence that still seems to remain for the non-positive Yamabe classes and for the positive Yamabe class with large data. The key question here is whether or not the standard conformal method essentially hard-wires the near-CMC assumption into the coupled system in order to get a domain of attraction for fixed-point iterations. If this is the case, then there remains the possibility that one can reverse-engineer a formulation, different from the conformal method, that gives a domain of attraction (preferably a contraction so that we also get uniqueness) without use of near-CMC conditions. Unfortunately, the answer appears to be negative, as we demonstrate below. In particular, it seems that the near-CMC obstacle is present in all possible formulations based on conformal transformations, if the estimate (5.1) is used. To begin, recall that the objects (M, hˆ ab , kˆab , ρ, ˆ jˆa ) form an n-dimensional initial data set for Einstein’s equations iff M is a n-dimensional smooth manifold, the tensor hˆ ab is a Riemannian metric on M, the tensor kˆab is a symmetric tensor field on M, the fields ρˆ and jˆa are a non-negative scalar and a tensor field on M, respectively, satisfying the condition −ρˆ 2 + jˆa jˆa < 0, and the following equations hold: Rˆ + kˆ 2 − kˆab kˆ ab − 2κ ρˆ = 0, −∇ˆ a kˆ ab + ∇ˆ b kˆ + κ jˆb = 0,
(A.11) (A.12)
where $\hat\nabla_a$ is the Levi-Civita connection of the metric $\hat h_{ab}$, the scalar field $\hat R$ is the Ricci scalar of the connection $\hat\nabla_a$, the scalar $\hat k = \hat k_{ab}\hat h^{ab}$ is the trace of the tensor $\hat k_{ab}$, and the constant $\kappa = 8\pi$ in units where both the gravitation constant $G$ and the speed of light $c$ have value one. The initial data set for Einstein's equations describes an instant of time in the physical world if we choose the number $n = 3$. Nevertheless, in the calculations that follow we keep the number $n$ as a general positive integer. Introduce the decomposition of the two-index tensor $\hat k^{ab}$ into trace-free and trace parts, as follows:
\[
\hat k^{ab} = \hat s^{ab} + \tfrac{1}{n}\hat k\,\hat h^{ab},
\]
where $\hat s_{ab}\hat h^{ab} = 0$. Introduce the following conformal rescaling:
\[
\hat h_{ab} = \phi^{r}h_{ab}, \qquad \hat s^{ab} = \phi^{s}s^{ab}, \qquad \hat k = \phi^{t}k, \tag{A.13}
\]
where the integers $r$, $s$, and $t$ are arbitrary, and we have introduced the Riemannian metric $h_{ab}$, a symmetric tensor $s^{ab}$, and a scalar field $k$. Introduce $\nabla_a$, the Levi-Civita connection of the metric $h_{ab}$, which satisfies the equation $\nabla_a h_{bc} = 0$, and denote by $R$ the Ricci scalar of this connection $\nabla_a$. The rescaling above induces the following equations:
\[
\hat h^{ab} = \phi^{-r}h^{ab}, \qquad \hat s_{ab} = \phi^{(2r+s)}s_{ab},
\]
where $\hat h^{ab}$ is the inverse tensor of $\hat h_{ab}$, and $h^{ab}$ is the inverse tensor of $h_{ab}$. We use the convention that indices in all other hatted tensors are raised and lowered with the tensors $\hat h^{ab}$ and $\hat h_{ab}$, respectively, while indices on unhatted tensors are raised and lowered with the tensors $h^{ab}$ and $h_{ab}$, respectively. For example:
\[
\hat s_{ab} = \hat h_{ac}\hat h_{bd}\hat s^{cd} = \phi^{r}h_{ac}\,\phi^{r}h_{bd}\,\phi^{s}s^{cd} = \phi^{(2r+s)}s_{ab}.
\]
The rescaling introduced in Eq. (A.13) implies that the tensor field $\hat k^{ab}$ transforms as follows:
\[
\hat k^{ab} = \phi^{s}s^{ab} + \tfrac{1}{n}\phi^{(t-r)}k\,h^{ab}
\quad \Leftrightarrow \quad
\hat k_{ab} = \phi^{(2r+s)}s_{ab} + \tfrac{1}{n}\phi^{(t+r)}k\,h_{ab}.
\]
The connections $\hat\nabla_a$ and $\nabla_a$ differ in a tensor field $C_{ab}{}^{c}$, in the sense that for any tensor field $v_a$, $\hat\nabla_a v_b = \nabla_a v_b - C_{ab}{}^{c}v_c$ holds. The tensor field $C_{ab}{}^{c}$ depends on the scalar field $\phi$ and the number $r$ as follows:
\[
C_{ab}{}^{c} = r\,\delta_{(a}{}^{c}\nabla_{b)}\ln(\phi) - \tfrac{r}{2}\,h_{ab}h^{cd}\nabla_d\ln(\phi). \tag{A.14}
\]
This expression implies the contractions
\[
h^{ab}C_{ab}{}^{c} = -\tfrac{r}{2}(n-2)\,h^{cd}\nabla_d\ln(\phi), \qquad
C_{ab}{}^{b} = \tfrac{nr}{2}\nabla_a\ln(\phi).
\]
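The two contractions of (A.14) follow in one line each. The sketch below assumes the usual symmetrization convention $\delta_{(a}{}^{c}\nabla_{b)} = \tfrac12(\delta_a{}^{c}\nabla_b + \delta_b{}^{c}\nabla_a)$, which the text does not state explicitly.

```latex
\begin{align*}
C_{ab}{}^{c} &= \tfrac{r}{2}\bigl(\delta_a{}^{c}\nabla_b\ln\phi
                + \delta_b{}^{c}\nabla_a\ln\phi\bigr)
                - \tfrac{r}{2}\,h_{ab}h^{cd}\nabla_d\ln\phi,\\
h^{ab}C_{ab}{}^{c} &= \tfrac{r}{2}\bigl(2\,h^{cd}\nabla_d\ln\phi\bigr)
                - \tfrac{r}{2}\,n\,h^{cd}\nabla_d\ln\phi
                = -\tfrac{r}{2}(n-2)\,h^{cd}\nabla_d\ln\phi,\\
C_{ab}{}^{b} &= \tfrac{r}{2}\bigl(\nabla_a\ln\phi + n\,\nabla_a\ln\phi\bigr)
                - \tfrac{r}{2}\,\nabla_a\ln\phi
                = \tfrac{nr}{2}\,\nabla_a\ln\phi.
\end{align*}
```

The factors of $n$ come from the traces $h^{ab}h_{ab} = n$ and $\delta_b{}^{b} = n$.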
Given any two connections $\hat\nabla_a$ and $\nabla_a$ related by a tensor field $C_{ab}{}^{c}$, the Riemann, Ricci, and Ricci scalar fields associated with these two connections are related by the following expressions:
\begin{align*}
\hat R_{abc}{}^{d} &= R_{abc}{}^{d} - 2\nabla_{[a}C_{b]c}{}^{d} + 2C_{c[a}{}^{e}C_{b]e}{}^{d},\\
\hat R_{ac} &= R_{ac} - \nabla_a C_{cb}{}^{b} + \nabla_b C_{ac}{}^{b} + C_{ca}{}^{e}C_{eb}{}^{b} - C_{cb}{}^{e}C_{ae}{}^{b},\\
\hat R &= \phi^{-r}\bigl[R - \nabla^a C_{ab}{}^{b} + \nabla_b(h^{ac}C_{ac}{}^{b}) + h^{ac}C_{ca}{}^{e}C_{eb}{}^{b} - h^{ac}C_{cb}{}^{e}C_{ae}{}^{b}\bigr],
\end{align*}
where indices between square brackets mean anti-symmetrization, that is, given any tensor $u_{ab}$ we define $u_{[ab]} := (u_{ab} - u_{ba})/2$. In the case that the tensor $C_{ab}{}^{c}$ is given by Eq. (A.14), the Ricci scalars $\hat R$ and $R$ satisfy the equation
\[
\hat R = \phi^{-(r+1)}\Bigl[\phi R - r(n-1)\Delta\phi - \frac{r(n-1)[r(n-2)-4](\nabla_a\phi)(\nabla^a\phi)}{4\phi}\Bigr].
\]
Introduce the Hamiltonian and momentum fields,
\[
\hat H := \hat R + \hat k^2 - \hat k_{ab}\hat k^{ab}, \qquad
\hat M_b := -\hat\nabla_a\hat k_b{}^{a} + \hat\nabla_b\hat k,
\]
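The Ricci scalar relation quoted above for the choice (A.14) can be cross-checked against the standard conformal transformation formula for the scalar curvature. The sketch below assumes the textbook form of that formula for $\hat h_{ab} = e^{2\omega}h_{ab}$ and sets $\omega = \tfrac{r}{2}\ln\phi$; the symbol $\omega$ is introduced here only for the check.

```latex
\begin{align*}
\hat h_{ab} &= e^{2\omega}h_{ab}, \qquad \omega = \tfrac{r}{2}\ln\phi,\\
\hat R &= e^{-2\omega}\bigl[R - 2(n-1)\Delta\omega
          - (n-2)(n-1)\nabla_a\omega\nabla^a\omega\bigr],\\
\Delta\omega &= \tfrac{r}{2}\Bigl(\frac{\Delta\phi}{\phi}
          - \frac{\nabla_a\phi\nabla^a\phi}{\phi^{2}}\Bigr),
\qquad
\nabla_a\omega\nabla^a\omega = \tfrac{r^{2}}{4}\,
          \frac{\nabla_a\phi\nabla^a\phi}{\phi^{2}},\\
\hat R &= \phi^{-r}\Bigl[R - r(n-1)\frac{\Delta\phi}{\phi}
          - \frac{r(n-1)\,[\,r(n-2)-4\,]}{4}\,
            \frac{\nabla_a\phi\nabla^a\phi}{\phi^{2}}\Bigr].
\end{align*}
```

Pulling one factor of $\phi^{-1}$ out of the bracket gives the form with prefactor $\phi^{-(r+1)}$. The gradient term vanishes exactly when $r(n-2) = 4$, which is why $r = 4/(n-2)$ is the distinguished choice of exponent.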
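then the conformal rescaling given in Eq. (A.13) implies the following equations:
\begin{align*}
\hat H &= \phi^{-(r+1)}\Bigl[\phi R - r(n-1)\Delta\phi - \frac{r}{4\phi}(n-1)[r(n-2)-4](\nabla_a\phi)(\nabla^a\phi)\Bigr]
 + \frac{n-1}{n}\phi^{2t}k^2 - \phi^{2(r+s)}s_{ab}s^{ab},\\
\hat M_b &= -\phi^{(r+s)}\nabla_a s_b{}^{a} + \frac{n-1}{n}\phi^{t}\nabla_b k
 - \Bigl(\frac{rn}{2} + r + s\Bigr)\phi^{(r+s)}s_b{}^{a}\nabla_a\ln(\phi)
 + \frac{n-1}{n}\,t\,\phi^{t}k\nabla_b\ln(\phi).
\end{align*}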
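The extrinsic-curvature part of the Hamiltonian field can be checked directly from the transformation of $\hat k^{ab}$ and $\hat k_{ab}$. This is a short sketch; it uses that $s_{ab}h^{ab} = 0$, which follows from $\hat s_{ab}\hat h^{ab} = 0$ and the rescaling (A.13).

```latex
\begin{align*}
\hat k_{ab}\hat k^{ab}
 &= \bigl(\phi^{2r+s}s_{ab} + \tfrac1n\phi^{t+r}k\,h_{ab}\bigr)
    \bigl(\phi^{s}s^{ab} + \tfrac1n\phi^{t-r}k\,h^{ab}\bigr)\\
 &= \phi^{2(r+s)}s_{ab}s^{ab} + \tfrac{1}{n^{2}}\phi^{2t}k^{2}\,h_{ab}h^{ab}
    && (\text{cross terms vanish since } s_{ab}h^{ab}=0)\\
 &= \phi^{2(r+s)}s_{ab}s^{ab} + \tfrac1n\phi^{2t}k^{2},\\
\hat k^{2} - \hat k_{ab}\hat k^{ab}
 &= \tfrac{n-1}{n}\phi^{2t}k^{2} - \phi^{2(r+s)}s_{ab}s^{ab}.
\end{align*}
```

Adding the expression for $\hat R$ then gives the stated formula for $\hat H$.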
It is convenient to reorder the terms in these equations in such a way that the equation for the Hamiltonian field is given by
\[
-r(n-1)\Delta\phi - \frac{r(n-1)[r(n-2)-4](\nabla_a\phi)(\nabla^a\phi)}{4\phi} + R\phi
+ \frac{(n-1)}{n}k^2\phi^{(2t+r+1)} - s_{ab}s^{ab}\phi^{(3r+2s+1)} = \phi^{(r+1)}\hat H,
\]
and the equation for the momentum field is given by
\[
-\nabla_a s_b{}^{a} - \Bigl[\frac{(n+2)}{2}r + s\Bigr]s_b{}^{a}\nabla_a\ln(\phi)
= \phi^{-(r+s)}\hat M_b - \frac{(n-1)}{n}\phi^{(t-r-s)}\nabla_b k - \frac{(n-1)}{n}\,t\,\phi^{(t-r-s-1)}k\nabla_b\phi.
\]
There are many interesting particular cases of the equations above. The first case is to keep the dimension $n \geq 3$ arbitrary, and choose:
\[
r = \tfrac{4}{n-2}, \qquad s = -\tfrac{(n+2)}{2}r, \qquad t = 0,
\]
then, introducing the number $2^* := 2n/(n-2)$, we conclude that the $n$-dimensional vacuum Einstein constraint equations ($\hat H = 0$, $\hat M_b = 0$) can be written as follows:
\begin{align*}
-\frac{4(n-1)}{(n-2)}\Delta\phi + R\phi + \frac{(n-1)}{n}k^2\phi^{(2^*-1)} - s_{ab}s^{ab}\phi^{-(2^*+1)} &= 0,\\
-\nabla_a s_b{}^{a} + \frac{(n-1)}{n}\phi^{2^*}\nabla_b k &= 0.
\end{align*}
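For the choice $r = 4/(n-2)$, $s = -\tfrac{n+2}{2}r$, $t = 0$, the coefficients and exponents in the reordered Hamiltonian and momentum equations simplify as follows. This is a bookkeeping sketch that only substitutes the three values into the general formulas.

```latex
\begin{align*}
r(n-2)-4 &= 0 && \text{(the gradient term drops out)},\\
r(n-1) &= \tfrac{4(n-1)}{n-2},\\
2t+r+1 &= \tfrac{n+2}{n-2} = 2^{*}-1,\\
3r+2s+1 &= 1-(n-1)\,r = -\tfrac{3n-2}{n-2} = -(2^{*}+1),\\
\tfrac{n+2}{2}\,r+s &= 0 && \text{(the } s_b{}^{a}\nabla_a\ln\phi \text{ term drops out)},\\
t-r-s &= \tfrac{n}{2}\,r = \tfrac{2n}{n-2} = 2^{*}.
\end{align*}
```

With $t = 0$ the last term of the momentum equation also vanishes, so after setting $\hat H = 0$ and $\hat M_b = 0$ only the terms displayed above survive.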
In the case that the manifold $M$ is 3-dimensional, we have the number $2^* = 6$, and the equation for the Hamiltonian field is given by
\[
-2r\Delta\phi - \frac{r(r-4)(\nabla_a\phi)(\nabla^a\phi)}{2\phi} + R\phi
+ \frac{2}{3}k^2\phi^{(2t+r+1)} - s_{ab}s^{ab}\phi^{(3r+2s+1)} = \phi^{(r+1)}\hat H, \tag{A.15}
\]
and the equation for the momentum field is given by
\[
-\nabla_a s_b{}^{a} - \Bigl(\frac{3r}{2} + r + s\Bigr)s_b{}^{a}\nabla_a\ln(\phi)
= \phi^{-(r+s)}\hat M_b - \frac{2}{3}\phi^{(t-r-s)}\nabla_b k - \frac{2}{3}\,t\,\phi^{(t-r-s-1)}k\nabla_b\phi. \tag{A.16}
\]
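As an illustration of how (A.15)-(A.16) specialize, one can fix $r = 4$ and $t = 0$ in dimension $n = 3$ and leave $s$ free. The two sample values of $s$ in the table are chosen here for illustration; the table only evaluates the exponents and coefficients that appear in the two equations.

```latex
\[
2r = 8, \qquad r(r-4) = 0, \qquad 2t+r+1 = 5 = 2^{*}-1,
\]
\[
\begin{array}{c|ccc}
s & 3r+2s+1 & \tfrac{3r}{2}+r+s & t-r-s \\ \hline
\text{general} & 13+2s & 10+s & -(4+s) \\
-10 & -7 = -(2^{*}+1) & 0 & 6 = 2^{*} \\
-4 & 5 = 2^{*}-1 & 6 & 0
\end{array}
\]
```

For $s = -10$ the coupling term $s_b{}^{a}\nabla_a\ln\phi$ disappears from the momentum equation, while for $s = -4$ both terms of the Hamiltonian equation carry the same power $\phi^{5}$.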
The semi-decoupling decomposition in the case of the vacuum Einstein constraint equations ($\hat H = 0$, $\hat M_b = 0$) is obtained from Eqs. (A.15)-(A.16) in the particular case of $r = 4$, $s = -10$, and $t = 0$, that is,
\begin{align*}
-8\Delta\phi + R\phi + \frac{2}{3}k^2\phi^{(2^*-1)} - s_{ab}s^{ab}\phi^{-(2^*+1)} &= 0,\\
-\nabla_a s_b{}^{a} + \frac{2}{3}\phi^{2^*}\nabla_b k &= 0.
\end{align*}
The conformally covariant decomposition, in the case of the vacuum Einstein constraint equations ($\hat H = 0$, $\hat M_b = 0$) and in the case that the transverse, traceless part of the tensor $k_{ab}$ vanishes, is obtained from Eqs. (A.15)-(A.16) with the particular choice of $r = 4$, $s = -4$, and $t = 0$, that is,
\begin{align*}
-8\Delta\phi + R\phi + \Bigl[\frac{2}{3}k^2 - s_{ab}s^{ab}\Bigr]\phi^{(2^*-1)} &= 0,\\
-\nabla_a s_b{}^{a} - 6\,s_b{}^{a}\nabla_a\ln(\phi) + \frac{2}{3}\nabla_b k &= 0.
\end{align*}
As a final example, it is interesting to write down the rescaled equations above in the case $r = 4$, $s = -10$, $t$ arbitrary:
\begin{align*}
-8\Delta\phi + R\phi + \frac{2}{3}\phi^{(2t+5)}k^2 - \phi^{-7}s_{ab}s^{ab} &= \phi^{5}\hat H,\\
-\nabla_a s_b{}^{a} &= \phi^{6}\hat M_b - \frac{2}{3}\phi^{(t+6)}\nabla_b k - \frac{2}{3}\,t\,\phi^{(t+5)}k\nabla_b\phi.
\end{align*}
Since the leading power in each equation scales exactly as the conformal method, the same argument leading to the negative result for the conformal method in Lemma 10 will apply here. Therefore, it appears that the different conformal rescalings produce coupled systems leading to precisely the same form of the near-CMC condition to establish non-CMC existence, in the case of both the non-positive Yamabe classes and the positive Yamabe class for large data.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
1. Allen, P., Clausen, A., Isenberg, J.: Near-constant mean curvature solutions of the Einstein constraint equations with non-negative Yamabe metrics. Class. Quant. Grav. 25, 075009 (2008)
2. Amann, H.: Fixed point equations and nonlinear eigenvalue problems in ordered Banach spaces. SIAM Review 18(4), 620–709 (1976)
3. Aubin, T.: Nonlinear Analysis on Manifolds. Monge-Ampère Equations. New York: Springer-Verlag, 1982
4. Bartnik, R., Fodor, G.: On the restricted validity of the thin sandwich conjecture. Phys. Rev. D 48(8), 3596–3599 (1993)
5. Bartnik, R., Isenberg, J.: The constraint equations. In: Chruściel, P., Friedrich, H. eds., The Einstein equations and large scale behavior of gravitational fields. Berlin: Birkhäuser, 2004, pp. 1–38
6. Beig, R.: TT-tensors and conformally flat structures on 3-manifolds. In: Chruściel, P.T. ed., Mathematics of Gravitation, Part 1, Volume 41. Warszawa: Banach Center Publications, Polish Academy of Sciences, Institute of Mathematics, 1997, pp. 109–118. Available at http://arXiv.org/abs/gr-qc/9606055v1, 1996
7. Beig, R.: Generalized Bowen-York initial data. In: Cotsakis, S., Gibbons, G. eds., Mathematical and Quantum Aspects of Relativity and Cosmology, Volume 537. Springer Lecture Note in Physics, Berlin: Springer, 2000, pp. 55–69
8. Beig, R., Ó Murchadha, N.: The momentum constraints of general relativity and spatial conformal isometries. Commun. Math. Phys. 176(3), 723–738 (1996)
9. Bowen, J., York, J.: Time-asymmetric initial data for black holes and black-hole collisions. Phys. Rev. D 21(8), 2047–2055 (1980)
10. Choquet-Bruhat, Y.: Einstein constraints on compact n-dimensional manifolds. Class. Quant. Grav. 21, S127–S151 (2004)
11. Choquet-Bruhat, Y., Isenberg, J., York, J.: Einstein constraint on asymptotically Euclidean manifolds. Phys. Rev. D 61, 084034 (2000)
12. Corvino, J.: Scalar curvature deformation and a gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214, 137–189 (2000)
13. Dain, S.: Initial data for a head on collision of two Kerr-like black holes with close limit. Phys. Rev. D 64(15), 124002 (2001)
14. Dain, S.: Initial data for two Kerr-like black holes. Phys. Rev. Lett. 87(12), 121102 (2001)
15. Dain, S.: Trapped surfaces as boundaries for the constraint equations. Class. Quant. Grav. 21(2), 555–573 (2004)
16. Du, Y.: Order structure and topological methods in nonlinear partial differential equations, Vol I. New Jersey, London, Singapore: World Scientific, 2006
17. Geroch, R., Traschen, J.: Strings and other distributional sources in general relativity. Phys. Rev. D 36(4), 1017–1031 (1987)
18. Grisvard, P.: Elliptic Problems in Nonsmooth Domains. Marshfield, MA: Pitman Publishing, 1985
19. Hebey, E.: Sobolev spaces on Riemannian manifolds, Volume 1635 of Lecture notes in mathematics. Berlin, New York: Springer, 1996
20. Holst, M.: Adaptive numerical treatment of elliptic systems on manifolds. Adv. Comp. Math. 15, 139–191 (2001)
21. Holst, M., Nagy, G., Tsogtgerel, G.: Rough solutions of the Einstein constraints on manifolds with boundary. Preprint, available at http://arXiv.org/abs/0712.0798v1[gr-qc], 2007
22. Holst, M., Nagy, G., Tsogtgerel, G.: Far-from-constant mean curvature solutions of Einstein’s constraint equations with positive Yamabe metrics. Phys. Rev. Lett. 100(16), 161101.1–161101.4 (2008)
23. Holst, M., Tsogtgerel, G.: Adaptive finite element approximation of nonlinear geometric PDE. Preprint
24. Holst, M., Tsogtgerel, G.: Convergent adaptive finite element approximation of the Einstein constraints. Preprint
25. Isenberg, J.: Constant mean curvature solution of the Einstein constraint equations on closed manifold. Class. Quant. Grav. 12, 2249–2274 (1995)
26. Isenberg, J., Moncrief, V.: A set of nonconstant mean curvature solution of the Einstein constraint equations on closed manifolds. Class. Quant. Grav. 13, 1819–1847 (1996)
27. Isenberg, J., Ó Murchadha, N.: Non CMC conformal data sets which do not produce solutions of the Einstein constraint equations. Class. Quant. Grav. 21, S233–S242 (2004)
28. Isenberg, J., Park, J.: Asymptotically hyperbolic non-constant mean curvature solutions of the Einstein constraint equations. Class. Quant. Grav. 14, A189–A201 (1997)
29. Jerome, J.: Consistency of semiconductor modeling: an existence/stability analysis for the stationary van Roosbroeck system. SIAM J. Appl. Math. 45(4), 565–590 (1985)
30. Klainerman, S., Rodnianski, I.: Improved local well posedness for quasilinear wave equations in dimension three. Duke Math. J. 117(1), 1–124 (2003)
31. Lee, J., Parker, T.: The Yamabe problem. Bull. Amer. Math. Soc. 17(1), 37–91 (1987)
32. Lichnerowicz, A.: L’integration des équations de la gravitation relativiste et le problème des n corps. J. Math. Pures Appl. 23, 37–63 (1944)
33. Maxwell, D.: Rough solutions of the Einstein constraint equations on compact manifolds. J. Hyp. Diff. Eqs. 2(2), 521–546 (2005)
34. Maxwell, D.: Solutions of the Einstein constraint equations with apparent horizon boundaries. Commun. Math. Phys. 253(3), 561–583 (2005)
35. Maxwell, D.: Rough solutions of the Einstein constraint equations. J. Reine Angew. Math. 590, 1–29 (2006)
36. Maxwell, D.: A class of solutions of the vacuum Einstein constraint equations with freely specified mean curvature. http://arXiv.org/abs/0804.0874v1[gr-qc], 2008
37. Misner, C., Thorne, K., Wheeler, J.: Gravitation. San Francisco, CA: W. H. Freeman and Company, 1970
38. Mitrović, D., Žubrinić, D.: Fundamentals of applied functional analysis, Volume 91 of Pitman monographs and surveys in pure and applied mathematics. Essex, UK: Addison Wesley Longman, 1998
39. Ó Murchadha, N., York, J.: Existence and uniqueness of solutions of the Hamiltonian constraint of general relativity on compact manifolds. J. Math. Phys. 14(11), 1551–1557 (1973)
40. Ó Murchadha, N., York, J.: Initial-value problem of general relativity I. General formulation and physical interpretation. Phys. Rev. D 10(2), 428–436 (1974)
41. Ó Murchadha, N., York, J.: Initial-value problem of general relativity II. Stability of solution of the initial-value equations. Phys. Rev. D 10(2), 437–446 (1974)
42. Palais, R.: Seminar on the Atiyah-Singer index theorem. Princeton, NJ: Princeton University Press, 1965
43. Rosenberg, S.: The Laplacian on a Riemannian Manifold. Cambridge: Cambridge University Press, 1997
44. Rudin, W.: Real & Complex Analysis. New York: McGraw-Hill, 1987
45. Schwarz, G.: Hodge decomposition – a method for solving boundary value problems. In: Lecture Notes in Mathematics, Volume 1607. Berlin-Heidelberg-New York: Springer Verlag, 1995
46. Triebel, H.: Theory of function spaces, Volume 78 of Monographs in Mathematics. Basel: Birkhäuser Verlag, 1983
47. Trudinger, N.: Linear elliptic operators with measurable coefficients. Ann. Scuola Norm. Sup. Pisa 27(3), 265–308 (1973)
48. Wald, R.: General Relativity. Chicago, IL: The University of Chicago Press, 1984
49. York, J.: Gravitational degrees of freedom and the initial-value problem. Phys. Rev. Lett. 26(26), 1656–1658 (1971)
50. York, J.: Role of conformal three-geometry in the dynamics of gravitation. Phys. Rev. Lett. 28(16), 1082–1085 (1972)
51. York, J.: Conformally invariant orthogonal decomposition of symmetric tensor on Riemannian manifolds and the initial-value problem of general relativity. J. Math. Phys. 14(4), 456–464 (1973)
52. York, J.: Covariant decompositions of symmetric tensors in the theory of gravitation. Ann. Inst. Henri Poincare A 21(4), 319–332 (1974)
53. York, J.: Conformal “thin-sandwich” data for the initial-value problem of general relativity. Phys. Rev. Lett. 82, 1350–1353 (1999)
54. Zeidler, E.: Nonlinear Functional Analysis and its Applications I, Fixed-Point Theorems. New York: Springer, 1986
55. Zolesio, J.L.: Multiplication dans les espaces de Besov. Proc. Royal Soc. Edinburgh (A) 78(2), 113–117 (1977)

Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 615–652 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0786-4
Communications in
Mathematical Physics
Isospectral Deformations of Eguchi-Hanson Spaces as Nonunital Spectral Triples Chen Yang Department of Mathematical Sciences, Durham University, Durham DH1 3LE, UK. E-mail:
[email protected] Received: 14 April 2008 / Accepted: 15 December 2008 Published online: 20 March 2009 – © Springer-Verlag 2009
Abstract: We study the isospectral deformations of the Eguchi-Hanson spaces along a torus isometric action in the noncompact noncommutative geometry. We concentrate on locality, smoothness and summability conditions of the nonunital spectral triples, and relate them to the geometric conditions to be noncommutative spin manifolds.

Contents
1. Introduction
2. Spin Geometry of Eguchi-Hanson Spaces
3. Smooth Algebras and Projective Modules
4. Nonunital Spectral Triples and Summability
5. Geometric Conditions
6. Conclusion
Supported by the Dorothy Hodgkin Scholarship.

1. Introduction As a generalization of Connes’ noncommutative differential geometry [1], noncompact noncommutative geometry is the study of nonunital spectral triples [2,3]. Various authors also consider the aspect of summability as in [4–6]. In the unital case, Connes provides a set of axioms for unital spectral triples so as to define compact noncommutative spin manifolds. See for example [1]. Rennie and Várilly explicitly reconstruct compact commutative spin manifolds from slightly modified axioms [7] and Connes thoroughly investigated this problem recently in [8]. As to the nonunital case, a complete generalization considering these axioms is not known yet. There are various nonunital examples [9,3,10], which may serve the purpose of testing the axioms or geometric conditions suggested. In this article, we obtain another nonunital example by isospectral deformation of Eguchi-Hanson (EH-) spaces [11]. They are geodesically complete Riemannian spin manifolds in the commutative geometry.

Isospectral deformation is a simple method to deform a commutative spectral triple. It traces back to the Moyal type of deformation from quantum mechanics. Rieffel’s insight is to consider Lie group actions on function spaces and hence explain the Moyal product between functions by oscillatory integrals over the group actions [12]. Apart from the well-known Moyal planes and noncommutative tori [13], this scheme allows more general deformations. Connes and Landi in [14] deform spheres and more general compact spin manifolds with isometry groups containing a two-torus. Connes and Dubois-Violette in [15] observe that this works equally well for noncompact spin manifolds. As in the appendix of [2], it is possible to fit such noncompact examples in the nonunital framework there. The deformation of EH-spaces we will consider in the following is obtained by these methods and serves as an example of a nonunital triple.

The Eguchi-Hanson spaces are of interest in both Riemannian geometry and physics. Geometrically, they are the simplest asymptotically locally Euclidean (ALE) spaces, for which a complete classification is provided by Kronheimer through the method of hyper-Kähler quotients [16]. This construction realizes the family of EH-spaces as a resolution of a singular conifold. In physics, where they first appeared, EH-spaces are known as gravitational instantons. Due to their hyper-Kähler structure, the ADHM construction [17], obtaining Yang-Mills instantons, is generalized on the EH-spaces in an elegant way [18,19]. The nonunital spectral triple from isospectral deformation of Eguchi-Hanson spaces may thus link various perspectives.
Our aim in this article is to concentrate on the locality, smoothness [2] and summability conditions of these triples and further see how they fit into the modified geometric conditions for nonunital spectral triples. The organization of the rest of the article is as follows. In Sect. 2, we describe the Eguchi-Hanson spaces in the spin geometry. In Sect. 3, we consider algebras of functions over EH-spaces, the deformation quantization of algebras, and representations of algebras as operators on the Hilbert space of spinors. We also obtain a projective module description of the spinor bundle. In Sect. 4, we define spectral triples of the deformed EH-spaces and study their summability. In Sect. 5, we discuss how the triple fits into the modified geometric conditions. We conclude in Sect. 6.
2. Spin Geometry of Eguchi-Hanson Spaces In this section, we first describe the metric and the Levi-Civita connection of the Eguchi-Hanson space, and then introduce its spinor bundle, the spin connection and the Dirac operator. Finally, we write down the torus action through parallel propagators on the spinor bundle.
2.1. Metrics, connections and torus isometric actions. The Eguchi-Hanson spaces were originally constructed as gravitational instantons [11]. Generalized by Gibbons and Hawking, they fall into a new category of solutions of Einstein’s equations, known as the multicenter solutions [20]. In local coordinates, the metric is
\[
ds^2 = \Delta^{-1}dr^2 + r^2\bigl[(\sigma_x^2 + \sigma_y^2) + \Delta\,\sigma_z^2\bigr], \tag{1}
\]
where $\Delta := \Delta(r) := 1 - a^4/r^4$ and $\{\sigma_x, \sigma_y, \sigma_z\}$ are the standard Cartan basis for the 3-sphere,
\[
\begin{cases}
\sigma_x = \tfrac12(-\cos\psi\,d\theta - \sin\theta\sin\psi\,d\phi),\\
\sigma_y = \tfrac12(\sin\psi\,d\theta - \sin\theta\cos\psi\,d\phi),\\
\sigma_z = \tfrac12(-d\psi - \cos\theta\,d\phi),
\end{cases}
\]
with $r \geq a$, $0 \leq \theta \leq \pi$, $0 \leq \phi < 2\pi$, $0 \leq \psi < 2\pi$.

Remark 1. The convention that the period of $\psi$ is $2\pi$ rather than $4\pi$ as in the original construction is suggested in [20] to remove the singularity at $r = a$, so that the manifold becomes geodesically complete.

The EH-space is diffeomorphic to the tangent bundle of a 2-sphere $T(S^2)$. Modulo a distortion of the metric, the base as a unit two sphere $S^2$ is parametrized by parameters $\phi$ and $\theta$, with $\theta = 0$ as the south pole and $\theta = \pi$ as the north pole. The angle $\phi$ parametrizes the circle defined by a constant $\theta$. Over each point, say $(\theta, \phi)$ on the 2-sphere, the tangent plane is parametrized by $(r, \psi)$. Here $r$ parametrizes the radial direction with $r = a$ at the origin of the plane. Circles of constant $r$ are parametrized by $\psi$. The identification of $\psi = \psi + 2\pi$ is the identification of the antipodal points on the circle of constant radius. Together with the metric, this implies that the space at large enough $r$ is asymptotic to $\mathbb{R}^4/\mathbb{Z}_2$, so that it is an ALE space.

The parameter $a$ in the metric (1) is a non-negative real number, and thus parametrizes a family of EH-spaces. When $a = 0$, the metric degenerates to the conifold $\mathbb{R}^4/\mathbb{Z}_2$ and the rest of the family is a resolution of the conifold. This appears as the simplest case in Kronheimer’s classification of ALE spaces [16]. We will only concentrate on the smooth case so that $a$ is assumed to be positive.

Choose the local coordinates $\{x_i\}$ with $x_1 = r$, $x_2 = \theta$, $x_3 = \phi$, $x_4 = \psi$. We will write the coordinates $(r, \theta, \phi, \psi)$ and $(x_1, x_2, x_3, x_4)$ interchangeably throughout the article, because the former give a clear geometric picture while the latter are convenient in tensorial expressions.
The corresponding basis of the tangent space $T_x(EH)$ at any point $x \in EH$ is $\partial_i := \partial/\partial x^i$, and the dual basis of the cotangent space $T_x^*(EH)$ is $\{dx^j\}$. The corresponding metric tensor $g_{ij}(x)\,dx^i \otimes dx^j$ can be written as entries of the matrix $G = (g_{ij})$ as
\[ G(x) = \frac14 \begin{pmatrix} 4\Delta^{-1} & 0 & 0 & 0 \\ 0 & r^2 & 0 & 0 \\ 0 & 0 & \rho & r^2\Delta\cos\theta \\ 0 & 0 & r^2\Delta\cos\theta & r^2\Delta \end{pmatrix}, \tag{2} \]
where $\rho := \rho(r, \theta) := (r^4 - a^4\cos^2\theta)/r^2$. We always assume Einstein's summation convention.
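As a sanity check on the passage from the line element (1) to the matrix (2), the following minimal sketch expands the line element numerically and compares it with the matrix entries. It assumes the coefficient of $\sigma_z^2$ is $r^2\Delta$, as the entries of (2) require; the function names and sample values are illustrative only and do not come from the article.

```python
import math


def sigma_forms(theta, psi):
    """Coefficients of sigma_x, sigma_y, sigma_z in the coframe (dr, dtheta, dphi, dpsi)."""
    sx = [0.0, -0.5 * math.cos(psi), -0.5 * math.sin(theta) * math.sin(psi), 0.0]
    sy = [0.0, 0.5 * math.sin(psi), -0.5 * math.sin(theta) * math.cos(psi), 0.0]
    sz = [0.0, 0.0, -0.5 * math.cos(theta), -0.5]
    return sx, sy, sz


def metric_from_line_element(r, theta, psi, a):
    """Expand Delta^{-1} dr^2 + r^2 (sx^2 + sy^2 + Delta sz^2) into a 4x4 matrix."""
    delta = 1.0 - a**4 / r**4
    dr = [1.0, 0.0, 0.0, 0.0]
    sx, sy, sz = sigma_forms(theta, psi)
    weighted = [(1.0 / delta, dr), (r * r, sx), (r * r, sy), (r * r * delta, sz)]
    return [[sum(w * f[i] * f[j] for w, f in weighted) for j in range(4)]
            for i in range(4)]


def metric_matrix(r, theta, a):
    """The matrix G of equation (2)."""
    delta = 1.0 - a**4 / r**4
    rho = (r**4 - a**4 * math.cos(theta) ** 2) / r**2
    off = r * r * delta * math.cos(theta)
    return [[1.0 / delta, 0.0, 0.0, 0.0],
            [0.0, r * r / 4, 0.0, 0.0],
            [0.0, 0.0, rho / 4, off / 4],
            [0.0, 0.0, off / 4, r * r * delta / 4]]


def max_abs_diff(A, B):
    """Largest entrywise difference between two square matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(len(A)) for j in range(len(A)))
```

The dependence on $\psi$ cancels in $\sigma_x^2 + \sigma_y^2$, which is why `metric_matrix` does not take $\psi$ as an argument.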
C. Yang
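In the same coordinate chart, the Christoffel symbols of the Levi-Civita connection of (1), defined by $\nabla_i \partial_j = \Gamma^k_{ij}\partial_k$, are explicitly
\[ \begin{aligned}
\Gamma^1_{11} &= -\frac{\Delta'}{2\Delta}, & \Gamma^1_{22} &= -\frac{r\Delta}{4}, & \Gamma^1_{33} &= -\frac{\Delta\rho^+}{4r}, & \Gamma^1_{34} &= -\frac{r\Delta^+\Delta\cos\theta}{4}, \\
\Gamma^1_{44} &= -\frac{r\Delta^+\Delta}{4}, & \Gamma^2_{12} &= \frac1r, & \Gamma^2_{33} &= -\frac{a^4\sin 2\theta}{2r^4}, & \Gamma^2_{34} &= \frac{\Delta\sin\theta}{2}, \\
\Gamma^3_{13} &= \frac1r, & \Gamma^3_{23} &= \frac{\cot\theta\,\Delta^+}{2}, & \Gamma^3_{24} &= -\frac{\Delta}{2\sin\theta}, & \Gamma^4_{13} &= \frac{2a^4\cos\theta}{r(r^4 - a^4)}, \\
\Gamma^4_{14} &= \frac{\Delta^+}{r\Delta}, & \Gamma^4_{23} &= -\frac{\rho^+}{2r^2\sin\theta}, & \Gamma^4_{24} &= \frac{\cot\theta\,\Delta}{2},
\end{aligned} \tag{3} \]
where
\[ \Delta^+ := \Delta^+(r) := 1 + \frac{a^4}{r^4}, \qquad \rho^+ := \rho^+(r, \theta) := \frac{r^4 + a^4\cos^2\theta}{r^2}, \qquad \Delta' := \frac{\partial\Delta}{\partial r}. \]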
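The closed forms in (3) can be cross-checked against the defining formula $\Gamma^k_{ij} = \tfrac12 g^{kl}(\partial_i g_{lj} + \partial_j g_{li} - \partial_l g_{ij})$. The sketch below is an illustration and not part of the original derivation. It differentiates the metric (2) by central finite differences, with a hand-computed inverse metric, and the helper names and step size are assumptions.

```python
import math


def metric(x, a):
    """Matrix G of equation (2) at x = (r, theta, phi, psi)."""
    r, th = x[0], x[1]
    d = 1.0 - a**4 / r**4
    rho = (r**4 - a**4 * math.cos(th) ** 2) / r**2
    off = r * r * d * math.cos(th) / 4
    return [[1.0 / d, 0.0, 0.0, 0.0], [0.0, r * r / 4, 0.0, 0.0],
            [0.0, 0.0, rho / 4, off], [0.0, 0.0, off, r * r * d / 4]]


def inverse_metric(x, a):
    """Inverse of G, worked out by hand from the 2x2 (phi, psi) block."""
    r, th = x[0], x[1]
    d = 1.0 - a**4 / r**4
    rho = (r**4 - a**4 * math.cos(th) ** 2) / r**2
    s, c = math.sin(th), math.cos(th)
    g33, g34 = 4 / (r * r * s * s), -4 * c / (r * r * s * s)
    g44 = 4 * rho / (r**4 * d * s * s)
    return [[d, 0.0, 0.0, 0.0], [0.0, 4 / (r * r), 0.0, 0.0],
            [0.0, 0.0, g33, g34], [0.0, 0.0, g34, g44]]


def d_metric(x, m, a, h=1e-5):
    """Central finite difference of G in the coordinate with 0-based index m."""
    xp, xm = list(x), list(x)
    xp[m] += h
    xm[m] -= h
    gp, gm = metric(xp, a), metric(xm, a)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(4)] for i in range(4)]


def christoffel(x, k, i, j, a):
    """Gamma^k_{ij} with 1-based indices, computed from the metric."""
    ginv = inverse_metric(x, a)
    dg = [d_metric(x, m, a) for m in range(4)]
    k, i, j = k - 1, i - 1, j - 1
    return 0.5 * sum(ginv[k][l] * (dg[i][l][j] + dg[j][l][i] - dg[l][i][j])
                     for l in range(4))


def closed_form(x, a):
    """The fifteen symbols listed in equation (3), keyed by (k, i, j)."""
    r, th = x[0], x[1]
    d, dp = 1.0 - a**4 / r**4, 1.0 + a**4 / r**4
    dprime = 4 * a**4 / r**5
    rho_p = (r**4 + a**4 * math.cos(th) ** 2) / r**2
    s, c = math.sin(th), math.cos(th)
    return {
        (1, 1, 1): -dprime / (2 * d), (1, 2, 2): -r * d / 4,
        (1, 3, 3): -d * rho_p / (4 * r), (1, 3, 4): -r * dp * d * c / 4,
        (1, 4, 4): -r * dp * d / 4, (2, 1, 2): 1 / r,
        (2, 3, 3): -a**4 * math.sin(2 * th) / (2 * r**4), (2, 3, 4): d * s / 2,
        (3, 1, 3): 1 / r, (3, 2, 3): dp * c / (2 * s), (3, 2, 4): -d / (2 * s),
        (4, 1, 3): 2 * a**4 * c / (r * (r**4 - a**4)), (4, 1, 4): dp / (r * d),
        (4, 2, 3): -rho_p / (2 * r * r * s), (4, 2, 4): d * c / (2 * s),
    }
```

Comparing `christoffel(x, k, i, j, a)` with `closed_form(x, a)[(k, i, j)]` at a generic point away from the poles gives agreement up to the finite-difference error.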
The identity $\Gamma^i_{jk} = \Gamma^i_{kj}$, implied by the torsion-free property of the connection, generates another set of symbols, and all the remaining Christoffel symbols vanish.

The isometry group of the metric (1) is $(U(1) \times SU(2))/\mathbb{Z}_2$. The Killing vector $\partial_\psi$ generates the group $U(1)/\mathbb{Z}_2$. Another Killing vector is $\partial_\phi$. Its action on the restriction of the space at $r = a$ is analogous to one of the three typical generators of the Lie algebra of the Lie group $SU(2)$ on a standard two-sphere. These are the two Killing vectors which define a torus action $\sigma$ on the Eguchi-Hanson space,
\[ \sigma : U(1) \times U(1) \longrightarrow \mathrm{Aut}(EH), \tag{4} \]
by $\sigma(\exp(i t_3 \partial_\phi), \exp(i t_4 \partial_\psi))(r, \theta, \phi, \psi) = (r, \theta, \phi + t_3, \psi + t_4)$, where $0 \le t_3 < 2\pi$, $0 \le t_4 < 2\pi$, for any point $(r, \theta, \phi, \psi) \in EH$. The isometric torus action will determine the isospectral deformation later.

2.2. The stereographic projection and orthonormal basis. We choose an orthonormal basis to trivialize the cotangent bundle of the EH-space and obtain the corresponding transition functions. Since the EH-space is topologically the same as $T(S^2)$, we may obtain another set of coordinates by taking the stereographic projection of the $S^2$ part, while keeping the coordinates on the tangent space unchanged. The EH-space (1) can be covered by two open neighbourhoods $U_N$ and $U_S$, where $U_N$ covers the whole space except at $\theta = \pi$ and $U_S$ covers the whole space except at $\theta = 0$. We may define the map $f_N : U_N \longrightarrow \mathbb{C} \times \mathbb{R}^2$ by taking a stereographic projection of the base two-sphere to $\mathbb{C}$, that is, $f_N(\phi, \theta, r, \psi) = (z; r, \psi)$. For the coordinate chart $U_S$, we similarly define the projection map $f_S : U_S \longrightarrow \mathbb{C} \times \mathbb{R}^2$ by $f_S(\phi, \theta, r, \psi) = (w; r, \psi)$, where
\[ z := \cot\frac{\theta}{2}\, e^{-i\phi}, \qquad w := \tan\frac{\theta}{2}\, e^{i\phi}. \]
For any point $x \in U_N \cap U_S$, the transition function from the coordinate chart $U_S$ to $U_N$ is $(w; r, \psi) = (1/z; r, \psi)$, and the transition function from $U_N$ to $U_S$ is $(z; r, \psi) = (1/w; r, \psi)$.
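The relation between the two stereographic coordinates can be illustrated numerically. The sketch below uses hypothetical helper names; it evaluates $z$ and $w$ at a point of the overlap and checks $w = 1/z$, together with the identity $(z\bar z - 1)/(z\bar z + 1) = \cos\theta$, which follows from $|z| = \cot(\theta/2)$.

```python
import cmath
import math


def chart_north(theta, phi):
    """z = cot(theta/2) exp(-i phi), valid away from theta = 0."""
    return cmath.exp(-1j * phi) / math.tan(theta / 2)


def chart_south(theta, phi):
    """w = tan(theta/2) exp(i phi), valid away from theta = pi."""
    return math.tan(theta / 2) * cmath.exp(1j * phi)


def cos_theta_from_z(z):
    """Recover cos(theta) from z via (|z|^2 - 1) / (|z|^2 + 1)."""
    n = abs(z) ** 2
    return (n - 1) / (n + 1)
```

For example, `chart_north(theta, phi) * chart_south(theta, phi)` evaluates to 1 up to rounding for any $0 < \theta < \pi$.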
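The restriction of metric (1) on the $U_N$ chart with coordinates $(z; r, \psi)$ is
\[ ds^2 = \Delta^{-1} dr^2 + \frac{r^2}{(1 + z\bar z)^2}\, dz\, d\bar z + \frac{r^2\Delta}{4}\left( d\psi + \frac{z\bar z - 1}{z\bar z + 1}\,\frac{i}{2}\Big(\frac{dz}{z} - \frac{d\bar z}{\bar z}\Big) \right)^2. \]
To obtain a local orthonormal basis of $T^*(EH)|_{U_N}$ we may simply define
\[ l := \frac{r}{\sqrt2\,(1 + z\bar z)}\, dz, \qquad m := \frac{\Delta^{-1/2}}{\sqrt2}\, dr + \frac{r\Delta^{1/2}}{4\sqrt2}\left( \frac{1 - z\bar z}{1 + z\bar z}\Big(\frac{dz}{z} - \frac{d\bar z}{\bar z}\Big) + 2i\, d\psi \right), \]
with their complex conjugates $\bar l, \bar m$, so that the metric tensor over $U_N$ is $ds^2 = l \otimes \bar l + \bar l \otimes l + m \otimes \bar m + \bar m \otimes m$. A real orthonormal frame $\{\vartheta^\alpha\}$ of $T^*(EH)|_{U_N}$ is thus defined by
\[ \vartheta^1 := \frac{1}{\sqrt2}(l + \bar l), \quad \vartheta^2 := -\frac{i}{\sqrt2}(l - \bar l), \quad \vartheta^3 := -\frac{i}{\sqrt2}(m - \bar m), \quad \vartheta^4 := \frac{1}{\sqrt2}(m + \bar m), \]
such that the metric on $U_N$ is diagonalized as $ds^2 = \delta_{\alpha\beta}\, \vartheta^\alpha \otimes \vartheta^\beta$. The coordinate transformations $\vartheta^\alpha = h^\alpha_i\, dx^i$ are determined by the matrix $H = (h^\alpha_i)$,
\[ H = \frac12 \begin{pmatrix} 0 & -r\cos\phi & -r\sin\theta\sin\phi & 0 \\ 0 & r\sin\phi & -r\sin\theta\cos\phi & 0 \\ 0 & 0 & r\Delta^{1/2}\cos\theta & r\Delta^{1/2} \\ \dfrac{2}{\Delta^{1/2}} & 0 & 0 & 0 \end{pmatrix}, \tag{5} \]
whose inverse $H^{-1} = (\tilde h^j_\beta)$ from $dx^j = \tilde h^j_\beta\, \vartheta^\beta$ is
\[ H^{-1} = 2 \begin{pmatrix} 0 & 0 & 0 & \dfrac{\Delta^{1/2}}{2} \\ -\dfrac{\cos\phi}{r} & \dfrac{\sin\phi}{r} & 0 & 0 \\ -\dfrac{\sin\phi}{r\sin\theta} & -\dfrac{\cos\phi}{r\sin\theta} & 0 & 0 \\ \dfrac{\cos\theta\sin\phi}{r\sin\theta} & \dfrac{\cos\theta\cos\phi}{r\sin\theta} & \dfrac{1}{r\Delta^{1/2}} & 0 \end{pmatrix}. \tag{6} \]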
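The matrices (5) and (6) can be checked against each other and against the metric (2): one expects $H H^{-1} = 1$ and $H^t H = G$, the latter being the statement $g_{ij} = \delta_{\alpha\beta} h^\alpha_i h^\beta_j$. The following is a minimal numerical sketch under those assumptions; the helper names are illustrative and not from the article.

```python
import math


def frame_matrix(r, theta, phi, a):
    """H = (h^alpha_i) of equation (5); rows are frame indices."""
    sq = math.sqrt(1.0 - a**4 / r**4)
    s, c = math.sin(theta), math.cos(theta)
    return [[0.0, -0.5 * r * math.cos(phi), -0.5 * r * s * math.sin(phi), 0.0],
            [0.0, 0.5 * r * math.sin(phi), -0.5 * r * s * math.cos(phi), 0.0],
            [0.0, 0.0, 0.5 * r * sq * c, 0.5 * r * sq],
            [1.0 / sq, 0.0, 0.0, 0.0]]


def inverse_frame_matrix(r, theta, phi, a):
    """H^{-1} of equation (6); rows are coordinate indices."""
    sq = math.sqrt(1.0 - a**4 / r**4)
    s, c = math.sin(theta), math.cos(theta)
    return [[0.0, 0.0, 0.0, sq],
            [-2 * math.cos(phi) / r, 2 * math.sin(phi) / r, 0.0, 0.0],
            [-2 * math.sin(phi) / (r * s), -2 * math.cos(phi) / (r * s), 0.0, 0.0],
            [2 * c * math.sin(phi) / (r * s), 2 * c * math.cos(phi) / (r * s),
             2 / (r * sq), 0.0]]


def metric_matrix(r, theta, a):
    """The matrix G of equation (2)."""
    d = 1.0 - a**4 / r**4
    rho = (r**4 - a**4 * math.cos(theta) ** 2) / r**2
    off = r * r * d * math.cos(theta) / 4
    return [[1.0 / d, 0.0, 0.0, 0.0], [0.0, r * r / 4, 0.0, 0.0],
            [0.0, 0.0, rho / 4, off], [0.0, 0.0, off, r * r * d / 4]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))
```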
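The above construction on the $U_N$ chart can be carried out in the same way on the $U_S$ coordinates. We denote orthonormal frames over $U_S$ by adding primes to $l, m, \vartheta^\alpha, x^j$, etc. Local frames $\{\vartheta^\alpha\}$ on $U_N$ define a local trivialization of the cotangent bundle, $F_N : T^*(EH)|_{U_N} \to U_N \times \mathbb{R}^4$, by $F_N(x; a_1\vartheta^1 + \cdots + a_4\vartheta^4) := (x; a_1, \ldots, a_4)$, where the $a_\alpha$'s are real-valued functions over $U_N$. In a similar way, the choice of local frames $\{\vartheta'^\alpha\}$ on $U_S$ defines a local trivialization of the cotangent bundle, $F_S : T^*(EH)|_{U_S} \longrightarrow U_S \times \mathbb{R}^4$. The transition functions $f^\beta_\alpha$ such that $\vartheta'^\beta = f^\beta_\alpha \vartheta^\alpha$ are elements of the matrix $F_{SN} := F_N \circ F_S^{-1}$,
\[ F_{SN} = \begin{pmatrix} -\cos 2\phi & \sin 2\phi & 0 & 0 \\ -\sin 2\phi & -\cos 2\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\dfrac{z^2 + \bar z^2}{2 z\bar z} & i\,\dfrac{z^2 - \bar z^2}{2 z\bar z} & 0 & 0 \\ -i\,\dfrac{z^2 - \bar z^2}{2 z\bar z} & -\dfrac{z^2 + \bar z^2}{2 z\bar z} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{7} \]
where the signs of the off-diagonal entries in the second form follow from $z = \cot(\theta/2)\, e^{-i\phi}$, for which $z^2 - \bar z^2 = -2i\, z\bar z \sin 2\phi$.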
The inverse transition function is given by the inverse of the matrix $F_{SN}$, $F_{NS} := F_S \circ F_N^{-1} = F_{SN}^{-1}$. The cotangent bundle is thus
\[ T^*(EH) = (U_N \times \mathbb{R}^4) \cup (U_S \times \mathbb{R}^4)/\sim, \tag{8} \]
where $(x; a_1, \ldots, a_4) \in U_N \times \mathbb{R}^4$ and $(x'; a'_1, \ldots, a'_4) \in U_S \times \mathbb{R}^4$ are defined to be equivalent if and only if $x = x'$ and $F_{NS}(a_1, \ldots, a_4)^t = (a'_1, \ldots, a'_4)^t$.

2.3. Spin structures and spinor bundles. Following a standard procedure from [21], we obtain the spinor bundle of the EH-space. In coordinate charts $\{U_N, U_S\}$, the frame bundle $P_{SO(4)}$ of the EH-space is the $SO(4)$-principal bundle with transition functions $F_{SN}$ in (7) and its inverse $F_{NS}$. Recall that the covering map of groups,
\[ \rho : Spin(4) \longrightarrow SO(4), \tag{9} \]
is defined by the adjoint representation of $Spin(4)$ as $\rho(w)x := w \cdot x \cdot w^{-1}$ for $x \in \mathbb{R}^4$, where $w = v_1 \cdots v_m \in Spin(4)$, $m$ is even and $v_i \in \mathbb{R}^4$ for $i = 1, \ldots, m$. Geometrically, $\rho(w) = \rho(v_1) \circ \cdots \circ \rho(v_m)$, where $\rho(v_i)$ is the reflection of the space $\mathbb{R}^4$ with respect to the hyperplane with normal vector $v_i$. Locally, the upper left block of the transition matrix (7) is a rotation in the plane spanned by $\{\vartheta^1, \vartheta^2\}$ through an angle $2\phi + \pi$. Such a rotation can be decomposed into two reflections, say $\rho(v_2) \circ \rho(v_1)$, with $v_1 := \vartheta^1$, $v_2 := -\sin\phi\, \vartheta^1 + \cos\phi\, \vartheta^2$.

Remark 2. Another choice is $\rho(-v_2) \circ \rho(v_1)$, which gives the same rotation as an element in $SO(2)$.

The element $v_2 \cdot v_1 \in Spin(4)$ is a lifting of $\rho(v_2) \circ \rho(v_1) \in SO(4)$ under the covering map (9). Thus, in the local coordinate chart $U_N$, $\widetilde F_{SN} := v_2 \cdot v_1$ in $Spin(4)$ defines a lifting of the action $F_{SN} \in SO(4)$ as in (7) under the double covering (9). To obtain a global lifting of the frame bundle, we consistently define the transition matrix $\widetilde F_{NS}$ as a lifting in the group $Spin(4)$ over $x' \in U_S$ by $\widetilde F_{NS} = -v'_2 \cdot v'_1$, where $v'_1 := \vartheta'^1$, $v'_2 := \sin\phi\, \vartheta'^1 + \cos\phi\, \vartheta'^2$. The following confirms the consistency of the liftings on the two coordinate charts.

Lemma 1. The transition functions $\{\widetilde F_{NS}, \widetilde F_{SN}\}$ satisfy the cocycle condition, $\widetilde F_{NS} \circ \widetilde F_{SN} = \widetilde F_{SN} \circ \widetilde F_{NS} = 1$.

Proof. Applying the transformation from the $\vartheta^\alpha$'s to the $\vartheta'^\beta$'s by (7), we have $\vartheta'^1 \cdot \vartheta'^2 = \vartheta^1 \cdot \vartheta^2$. Thus,
\[ \begin{aligned} \widetilde F_{NS} \circ \widetilde F_{SN} &= -v'_2 \cdot v'_1 \cdot v_2 \cdot v_1 \\ &= -(\sin\phi\, \vartheta'^1 + \cos\phi\, \vartheta'^2) \cdot \vartheta'^1 \cdot (-\sin\phi\, \vartheta^1 + \cos\phi\, \vartheta^2) \cdot \vartheta^1 \\ &= \sin^2\phi - \sin\phi\cos\phi\,(\vartheta^1 \cdot \vartheta^2 + \vartheta^2 \cdot \vartheta^1) - \cos^2\phi\, \vartheta^2 \cdot \vartheta^1 \cdot \vartheta^2 \cdot \vartheta^1 = 1, \end{aligned} \]
by using the identities $\vartheta^\alpha \cdot \vartheta^\alpha = -1$ and $\vartheta^\alpha \cdot \vartheta^\beta = -\vartheta^\beta \cdot \vartheta^\alpha$ for $\alpha \neq \beta$, for elements of the orthonormal basis $\{\vartheta^\alpha\}$ and likewise for $\{\vartheta'^\beta\}$. Similarly, $\widetilde F_{SN} \circ \widetilde F_{NS} = 1$.
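The decomposition of the rotation block of (7) into two reflections can be illustrated in the plane spanned by $\{\vartheta^1, \vartheta^2\}$. The sketch below models $\rho(v)$ as the ordinary reflection $x \mapsto x - 2(x \cdot v)v$, an assumption that agrees with the Clifford adjoint action on even products; the function names are illustrative.

```python
import math


def reflection(v):
    """Matrix of x -> x - 2 (x . v) v for a unit vector v in the plane."""
    return [[1 - 2 * v[0] * v[0], -2 * v[0] * v[1]],
            [-2 * v[1] * v[0], 1 - 2 * v[1] * v[1]]]


def compose(A, B):
    """Matrix product A B, i.e. apply B first and then A."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(angle):
    """Counterclockwise rotation by the given angle."""
    return [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


def rotation_from_reflections(phi, sign=1):
    """rho(sign * v2) composed with rho(v1), with v1, v2 as in the text."""
    v1 = (1.0, 0.0)
    v2 = (-sign * math.sin(phi), sign * math.cos(phi))
    return compose(reflection(v2), reflection(v1))
```

With `sign=-1` the same rotation is obtained, which is the content of Remark 2; the two choices differ only in their lift to $Spin(4)$.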
Therefore, the principal $Spin(4)$-bundle can be defined by
\[ P_{Spin(4)} := (U_N \times Spin(4) \cup U_S \times Spin(4))/\sim, \tag{10} \]
where $(x, \tilde g) \in U_N \times Spin(4)$ and $(x', \tilde g') \in U_S \times Spin(4)$ are defined to be equivalent if and only if $x = x'$ and $\tilde g' = \widetilde F_{NS}\, \tilde g$. The double covering of bundles (10) over the EH-space defines a spin structure on it. We will always assume this choice of spin structure.

The spinor bundle can be defined as an associated bundle with typical fiber $\mathbb{C}^4$ of the principal $Spin(4)$-bundle (10), by specifying a representation of $Spin(4)$ on $GL_{\mathbb{C}}(4)$. We know that locally, for any $x \in EH$, there exists a unique irreducible representation space $\Lambda$ of complex dimension 4 of the Clifford algebra $Cl(T_x^*(EH))$ through the Clifford action $c : Cl(T_x^*(EH)) \to \mathrm{End}(\Lambda)$. We define the representation of $Spin(4)$ in $\mathrm{End}(\Lambda)\ (\cong GL_{\mathbb{C}}(4))$ simply by the restriction of $c$ from the Clifford algebra, and obtain the spinor bundle $S$ with typical fiber $\Lambda$, with transition functions $\{c(\widetilde F_{NS}), c(\widetilde F_{SN})\}$ in the coordinate charts $\{U_N, U_S\}$.

With respect to the orthonormal basis, say $\{\vartheta^\alpha\}$ of $T^*(EH)|_{U_N}$, there exists a unitary frame $\{f_\alpha\}$ of the representation space $\Lambda \cong \mathbb{C}^4$, such that the Clifford representations $\gamma^\alpha := c(\vartheta^\alpha(x))$ for $\alpha = 1, \ldots, 4$ can be represented as constant matrices,
\[ \gamma^1 = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad \gamma^2 = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ -i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix}, \quad \gamma^3 = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad \gamma^4 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}. \tag{11} \]
There also exist frames $\{f'_\beta\}$ on the coordinate chart $U_S$ such that the representations $c(\vartheta'^\beta)$ are the same constant matrices $\gamma'^\beta$ as above. Under the chosen frames $\{f_\alpha\}$ and $\{f'_\beta\}$, we may represent the transition functions of the spinor bundle as follows. Define maps $P, Q : U_N \cap U_S \longrightarrow GL_{\mathbb{C}}(4)$ by
\[ P := c(\widetilde F_{SN}) = -\sin\phi\, \gamma^1\gamma^1 + \cos\phi\, \gamma^2\gamma^1 = \mathrm{diag}\Big( -\frac{i\bar z}{|z|},\ \frac{iz}{|z|},\ \frac{iz}{|z|},\ -\frac{i\bar z}{|z|} \Big), \tag{12} \]
\[ Q := c(\widetilde F_{NS}) = -\sin\phi\, \gamma'^1\gamma'^1 - \cos\phi\, \gamma'^2\gamma'^1 = \mathrm{diag}\Big( \frac{i\bar w}{|w|},\ -\frac{iw}{|w|},\ -\frac{iw}{|w|},\ \frac{i\bar w}{|w|} \Big); \tag{13} \]
here $\mathrm{diag}(a, b, c, d)$ stands for the diagonal matrix with diagonal elements $a, b, c, d$. The spinor bundle $S$ is thus
\[ S := (U_N \times \mathbb{C}^4 \cup U_S \times \mathbb{C}^4)/\sim, \tag{14} \]
where $(x; s_1, \ldots, s_4) \in U_N \times \mathbb{C}^4$ and $(x'; s'_1, \ldots, s'_4) \in U_S \times \mathbb{C}^4$ are defined to be equivalent if and only if $x = x'$ and $(s'_1, \ldots, s'_4)^t = Q(s_1, \ldots, s_4)^t$. One can easily see that the cocycle condition of the transition functions, $P \circ Q = Q \circ P = 1$, holds.

The chirality operator is defined by
\[ \chi := c(\vartheta^1)\, c(\vartheta^2)\, c(\vartheta^3)\, c(\vartheta^4) = \gamma^1\gamma^2\gamma^3\gamma^4 = \mathrm{diag}(-1, -1, 1, 1), \tag{15} \]
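such that $\chi^2 = 1$. The representation space $\Lambda = \Lambda^+ \oplus \Lambda^-$ is decomposed into the $\pm 1$-eigenspaces of the operator $\chi$, with $\dim_{\mathbb{C}} \Lambda^+ = \dim_{\mathbb{C}} \Lambda^- = 2$. This fiberwise splitting extends to the global decomposition of the spinor bundle into subbundles over the EH-space, $S = S^+ \oplus S^-$, with each of the complex subbundles $S^+$ and $S^-$ of rank 2. Therefore, any element $s \in S$ can be decomposed as $s = (s^+, s^-)^t$. The charge conjugation operator on the spinor bundle, $J : S \to S$, is defined by
\[ J \begin{pmatrix} s^+ \\ s^- \end{pmatrix} := \begin{pmatrix} -\bar s^- \\ \bar s^+ \end{pmatrix}. \tag{16} \]

2.4. Spin connections and Dirac operators of spinor bundles. Following the general procedure in [22], we can induce the spin connection $\nabla^S$ of the spinor bundle $S$ from the Levi-Civita connection of the EH-space. We will only work on the $U_N$ coordinate chart; the construction on $U_S$ is similar. In the orthonormal frame $\{\vartheta^\alpha\}$, the corresponding Levi-Civita connection on the cotangent bundle $T^*(EH)|_{U_N}$ can be expressed as $\nabla^{T^*EH} \vartheta^\beta = -\Gamma^\beta_{i\alpha}\, dx^i \otimes \vartheta^\alpha$. The metric compatibility of the Levi-Civita connection implies that $\Gamma^\alpha_{i\beta} = -\Gamma^\beta_{i\alpha}$. We may represent the $\Gamma^\beta_{i\alpha}$'s in terms of the Christoffel symbols $\Gamma^k_{ij}$ of $\nabla$ in the coordinate basis $\{dx^i\}$ (3) by
\[ \Gamma^\beta_{i\alpha} = \tilde h^j_\alpha \big( h^\beta_k\, \Gamma^k_{ij} - \partial_i h^\beta_j \big), \tag{17} \]
where the $h^\alpha_i$'s and $\tilde h^j_\beta$'s are the matrix entries of $H$ in (5) and $H^{-1}$ in (6), respectively. Modulo the anti-symmetry condition between the $\alpha$ and $\beta$ indices, all the nonvanishing symbols are
\[ \begin{aligned}
\Gamma^1_{23} &= \tfrac12 \Delta^{1/2}\sin\phi, & \Gamma^1_{24} &= -\tfrac12 \Delta^{1/2}\cos\phi, & \Gamma^3_{22} &= -\tfrac12 \Delta^{1/2}\cos\phi, \\
\Gamma^4_{22} &= -\tfrac12 \Delta^{1/2}\sin\phi, & \Gamma^1_{33} &= -\tfrac12 \Delta^{1/2}\sin\theta\cos\phi, & \Gamma^1_{34} &= -\tfrac12 \Delta^{1/2}\sin\theta\sin\phi, \\
\Gamma^1_{32} &= -1 - \tfrac12 \Delta^+\cos\theta, & \Gamma^3_{32} &= -\tfrac12 \Delta^{1/2}\sin\theta\sin\phi, & \Gamma^4_{32} &= \tfrac12 \Delta^{1/2}\sin\theta\cos\phi, \\
\Gamma^4_{33} &= -\tfrac12 \Delta^+\cos\theta, & \Gamma^1_{42} &= \tfrac12 \Delta, & \Gamma^4_{43} &= -\tfrac12 \Delta^+.
\end{aligned} \tag{18} \]
Here the sign of $\Gamma^3_{22}$ is the one obtained by evaluating (17) with the matrices (5) and (6). We define $\gamma_\alpha := \gamma^\alpha$; then the spin connection $\nabla^S : S \to S \otimes \Omega^1(EH)$ is
\[ \nabla^S := d - \frac14\, \Gamma^\beta_{i\alpha}\, dx^i \otimes \gamma^\alpha\gamma_\beta. \tag{19} \]
The covariant derivative $\nabla^S_i := \nabla^S(\partial_i)$, for $i = 1, \ldots, 4$, equals $\nabla^S_i = \partial_i - \omega_i$, where $\omega_i = \frac14 \Gamma^\beta_{i\alpha} \gamma^\alpha\gamma_\beta$. The Dirac operator $D : \Gamma(S) \to \Gamma(S)$ can be defined by
\[ D(\psi) := -i\, \gamma^j \nabla^S_j \psi, \qquad \forall \psi \in \Gamma(S), \tag{20} \]
where $\gamma^j := c(dx^j) = \tilde h^j_\beta \gamma^\beta$. We note that the compatibility of the spin connection with the spin structure implies that the Dirac operator commutes with the charge conjugation operator, i.e. $[D, J] = 0$.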
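The algebraic statements about the constant matrices (11) can be verified directly. The sketch below checks the Clifford relations $\gamma^\alpha\gamma^\beta + \gamma^\beta\gamma^\alpha = -2\delta^{\alpha\beta}$, the chirality matrix (15), and the cocycle condition $P \circ Q = 1$ for the maps in (12) and (13). The helper names are hypothetical, and $\gamma'^\beta$ is taken equal to $\gamma^\beta$, since both are the same constant matrices.

```python
import math

G1 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
G2 = [[0, 0, -1j, 0], [0, 0, 0, -1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]]
G3 = [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
G4 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
GAMMAS = [G1, G2, G3, G4]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(x, A, y, B):
    """Entrywise linear combination x*A + y*B."""
    return [[x * A[i][j] + y * B[i][j] for j in range(4)] for i in range(4)]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))


def clifford_relations_hold(tol=1e-12):
    """Check gamma^a gamma^b + gamma^b gamma^a = -2 delta^{ab} for all a, b."""
    for a, Ga in enumerate(GAMMAS):
        for b, Gb in enumerate(GAMMAS):
            anti = combine(1, matmul(Ga, Gb), 1, matmul(Gb, Ga))
            expected = combine(-2 if a == b else 0, IDENTITY, 0, IDENTITY)
            if max_abs_diff(anti, expected) > tol:
                return False
    return True


def chirality():
    """chi = gamma^1 gamma^2 gamma^3 gamma^4 as in (15)."""
    return matmul(matmul(G1, G2), matmul(G3, G4))


def transition_P(phi):
    """P = -sin(phi) gamma^1 gamma^1 + cos(phi) gamma^2 gamma^1 as in (12)."""
    return combine(-math.sin(phi), matmul(G1, G1), math.cos(phi), matmul(G2, G1))


def transition_Q(phi):
    """Q = -sin(phi) gamma^1 gamma^1 - cos(phi) gamma^2 gamma^1 as in (13)."""
    return combine(-math.sin(phi), matmul(G1, G1), -math.cos(phi), matmul(G2, G1))
```

Since $z/|z| = e^{-i\phi}$ on the overlap, the first diagonal entry of `transition_P(phi)` equals $-i e^{i\phi} = -i\bar z/|z|$.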
2.5. Torus actions on the spinor bundle. A torus action on the spinor bundle $S$ can be induced from the torus isometric action on a general Riemannian manifold [14,15]. In this subsection, we will represent such a torus action (4) through parallel transport of spinors along geodesics. Recall that the isometric action $\sigma$ is generated by the two Killing vectors $\partial_3 = \partial_\phi$ and $\partial_4 = \partial_\psi$. Let $c_k : \mathbb{R} \to EH$ be the geodesics obtained as integral curves of the Killing vector field $\partial_k$ for $k = 3, 4$. The equation of parallel transport with respect to the spin connection along any curve $c(t)$ is $\nabla^S_{c'(t)} \psi = 0$, where $c'(t) := dc(t)/dt$, for $\psi \in \Gamma(S)$. Substituting (19), we obtain
\[ \frac{d\psi}{dt} - A(c(t))\, \psi = 0, \qquad A(c(t)) := \frac14\, \Gamma^\beta_{i\alpha}\, dx^i(c'(t)) \otimes \gamma^\alpha\gamma_\beta. \tag{21} \]
When the curve is $c_3(t)$, the corresponding matrix $A(c_3(t))$ is
\[ A(c_3(t)) = \frac12 \begin{pmatrix} i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & -i(1 + \Delta^+\cos\theta) & -\Delta^{1/2}\sin\theta\, e^{i\phi} \\ 0 & 0 & \Delta^{1/2}\sin\theta\, e^{-i\phi} & i(1 + \Delta^+\cos\theta) \end{pmatrix}, \tag{22} \]
where $r$, $\theta$ and $\phi$ are understood as components of coordinates on the curve $c_3(t)$. When the curve is $c_4(t)$, the corresponding matrix $A(c_4(t))$ is
\[ A(c_4(t)) = \frac{i}{2}\, \mathrm{diag}\Big( -\frac{a^4}{r^4},\ \frac{a^4}{r^4},\ -1,\ 1 \Big), \tag{23} \]
where $r$ is understood as one of the components of coordinates on the curve $c_4(t)$. The corresponding parallel propagator is a map $P_{c(t)}(t_0, t_1) : \Gamma(S) \to \Gamma(S)$ defined by parallel transporting any section $\psi$ along the curve $c(t)$ with $t \in [t_0, t_1]$. The propagator can be represented by an iterated integration of Eq. (21). For the geodesics $c_k(t)$, $k = 3, 4$, the corresponding matrix is formally solved as
\[ P_{c_k(t)}(t_0, t_1) = \mathcal{P} \exp\Big( \int_{t_0}^{t_1} A_k(t)\, dt \Big), \tag{24} \]
where $\mathcal{P}$ is the path-ordering operator.

Let $\mathcal{H}$ be the Hilbert space completion with respect to the $L^2$-inner product on the space of $L^2$-integrable sections of the spinor bundle $S$. The parallel propagators (24) can be extended to families of operators $U_k(t_k - t_0) : \mathcal{H} \to \mathcal{H}$, parametrized by the real number $(t_k - t_0)$, by $U_k(t_k - t_0)(\psi)(x) := (P_{c_k(t)}(t_0, t_k)\psi)(x)$, $\forall \psi \in \mathcal{H}$, where we assume $x = c_k(t_0) \in EH$ for $k = 3, 4$. Without loss of generality, we may take $t_0 = 0$ so that the family of operators is parametrized by $t_k$. Since the spin connection is compatible with the metric of the EH-space, the pointwise inner product of the images of any two sections under parallel transport along the geodesics $c_k(t)$ remains unchanged. This further implies that their $L^2$-integrals remain the same. Furthermore, the operators $U_k(t_k)$ are unitary. Let $W_k$ be the self-adjoint operators on $\mathcal{H}$ which generate $U_k$ by $U_k(t_k) = e^{i t_k W_k}$, where $t_k \in \mathbb{R}$ for $k = 3, 4$.
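Along $c_4(t)$ the coordinate $r$ is constant, so the matrix (23) does not depend on $t$ and the path-ordered exponential (24) reduces to an ordinary exponential of a diagonal matrix. The following sketch, with illustrative names and taking (23) as given, evaluates this propagator. After one period $t = 2\pi$ the last two components acquire the factor $-1$, and they return to $1$ only after $4\pi$.

```python
import cmath
import math


def generator_c4(r, a):
    """Diagonal entries g of A(c4) = (i/2) diag(g), read off from (23)."""
    q = a**4 / r**4
    return [-q, q, -1.0, 1.0]


def propagator_c4(t, r, a):
    """Diagonal of exp(t A(c4)); valid because A(c4) is constant along c4."""
    return [cmath.exp(0.5j * g * t) for g in generator_c4(r, a)]


def is_unitary_diagonal(diagonal, tol=1e-12):
    """A diagonal matrix is unitary iff every entry has modulus one."""
    return all(abs(abs(u) - 1.0) < tol for u in diagonal)
```

For instance, `propagator_c4(2 * math.pi, r, a)[2]` equals $e^{-i\pi} = -1$ up to rounding for every admissible $r$ and $a$.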
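We may define a representation of the double cover $p : \widetilde{\mathbb{T}}^2 \to \mathbb{T}^2$ of the two-torus by $\widetilde V : \widetilde{\mathbb{T}}^2 \to \mathcal{L}(\mathcal{H})$ such that
\[ \widetilde V(\tilde t_3, \tilde t_4)\, \psi(x) := e^{i(\tilde t_3 W_3(x) + \tilde t_4 W_4(x))}\, \psi(x), \qquad \forall \psi \in \mathcal{H}. \tag{25} \]
This action covers the isometric action $\sigma$ of $\mathbb{T}^2$ from (4) in the sense that, for any $\tilde v \in \widetilde{\mathbb{T}}^2$, $p(\tilde v) = v$ implies $\widetilde V_{\tilde v}(f\psi) = \alpha_v(f)\, \widetilde V_{\tilde v}(\psi)$, $\forall \psi \in \mathcal{H}$, for any bounded continuous function $f \in C_b(EH)$ and the action $\alpha$ on $C_b(EH)$ defined by $\alpha_v(f)(x) := f(\sigma_{-v}(x))$. We assume the choice of the lifting in the double torus is always fixed and omit the tilde for notational simplicity from now on.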
3. Smooth Algebras and Projective Modules

We consider algebras of functions over the Eguchi-Hanson spaces, and their deformations as differential algebras. To obtain a $C^*$-norm on the deformed algebra, we consider representations of algebras as operators on the Hilbert space of spinors. Some algebras may be realized as smooth algebras [2]. We also find projective modules from the spinor bundle.

3.1. Algebras of smooth functions. We first summarize some related facts on topological algebras of complex-valued functions in [2]. For a noncompact Riemannian manifold $X$, let $C_c^\infty(X)$ be the space of smooth functions on $X$ of compact support, $C_0^\infty(X)$ be the space of smooth functions vanishing at infinity, and $C_b^\infty(X)$ be the space of smooth functions whose derivatives are bounded to all degrees. In some local coordinate charts with corresponding partition of unity, say $\mathcal{U} = \{U_a, h_a\}_{a \in A}$, we may define the family of seminorms on $C_b^\infty(X)$ by
\[ q_m^{\mathcal{U}}(f) := \sum_{a \in A}\ \sup_{|\alpha| \le m}\ \sup_{x \in U_a} |h_a(x)\, \partial^\alpha f(x)| \tag{26} \]
for any $f \in C_b^\infty(X)$, where $\alpha$ are multi-indices and $m$ is a non-negative integer. These seminorms restrict to $C_0^\infty(X)$ and $C_c^\infty(X)$. The natural topology induced by (26) is the topology of uniform convergence of all derivatives. We can show that two such families of seminorms defined by different coordinate charts are equivalent. Thus the topology defined does not depend on the choice of coordinates $\mathcal{U}$. We also note that the $q_0$ seminorm in the family is nothing but the supremum norm $\|\cdot\|_\infty$, which is a $C^*$-norm with the involution defined by the usual complex conjugation.

The algebras $C_b^\infty(X)$ and $C_0^\infty(X)$ are both Fréchet in the topology of uniform convergence of all derivatives, while the algebra $C_c^\infty(X)$ is not complete. However, $C_c^\infty(X)$ is complete in the inductive limit topology, obtained as the inductive limit of the topologies restricted to a family of algebras $C_c^\infty(K_n)$, where $\{K_n\}_{n \in \mathbb{N}}$ is an increasing family of compact subsets of $X$. The algebra $C_c^\infty(X)$ is dense in the Fréchet algebra $C_0^\infty(X)$.

To consider algebras of smooth functions on the Eguchi-Hanson spaces, we may use the coordinate charts $\mathcal{U} = \{U_N, U_S\}$ defined in Sect. 2.2, with a partition of unity $\{h_N, h_S\}$ subordinated to them. The family of seminorms (26) can be written as
\[ q_m^{\mathcal{U}}(f) = \sup_{|\alpha| \le m}\ \sup_{x \in U_N} |h_N(x)\, \partial^\alpha f(x)| + \sup_{|\alpha'| \le m}\ \sup_{x' \in U_S} |h_S(x')\, \partial^{\alpha'} f(x')|. \tag{27} \]
We obtain the corresponding topological algebras by taking $X = EH$.
3.2. Algebras of integrable functions. Apart from algebras of functions which can be represented as operators, there are algebras of functions which may define projective modules as representation spaces. Decay conditions at infinity and integrability conditions of functions become important when considering noncompact spaces. We consider the following algebras of integrable functions. The $(k, p)$th Sobolev norm of a function $f$, say in $C_b^\infty(EH)$, is given as
\[ \|f\|_{H_k^p} := \Big( \sum_{m=0}^{k} \int_{EH} |\nabla^m f|^p\, dVol \Big)^{1/p}, \tag{28} \]
where $k$ is a non-negative integer and $p$ is a positive integer. (We will not consider the case where $p$ is a real number.) We define the subspaces of $C_b^\infty(EH)$ which contain functions with finite Sobolev norm,
\[ C_k^p(EH) := \{ f \in C_b^\infty(EH) : \|f\|_{H_k^p} < \infty \}. \]
Let $H_k^p(EH)$ be the Banach space obtained by the completion of the algebra $C_k^p(EH)$ with respect to the Sobolev norm. In particular, $H_0^p(EH) \supset \cdots \supset H_k^p(EH) \supset H_{k+1}^p(EH) \supset \cdots$.

Remark 3. Notice that the algebra $C_c^\infty(EH)$ is contained in $H_k^p(EH)$ for any $k \in \mathbb{N}$. Completion of $C_c^\infty(EH)$ with respect to $\|\cdot\|_{H_k^p}$ gives us the Banach space $H_{k,0}^p(EH)$ such that $H_{k,0}^p(EH) \subset H_k^p(EH)$. The equality does not hold in general. However, in the circumstances of a complete Riemannian manifold with Ricci curvature bounded up to degree $k - 2$ and positive injectivity radius (which is satisfied by the EH-space), $H_{k,0}^p(EH) = H_k^p(EH)$ when $k \ge 2$ [23].

Lemma 2. For a fixed positive integer $p$, the intersection defined as
\[ C_p^\infty(EH) := \cap_k H_k^p(EH) \]
is a Fréchet algebra in the topology defined by the family of norms $\{\|\cdot\|_{H_k^p}\}_{k \in \mathbb{N}}$.

Proof. The topology is easily seen to be locally convex and metrizable. To show that it is complete, let $\{f_\beta\}$ be any Cauchy sequence in $C_p^\infty(EH)$; then there exists a limit $f_k^p$ of $\{f_\beta\}$ under the norm $\|\cdot\|_{H_k^p}$ in $H_k^p(EH)$ for each $k \in \mathbb{N}$. For any two indices $k_1, k_2$ such that $k_1 \le k_2$, the norm $\|\cdot\|_{H_{k_2}^p}$ is stronger than the norm $\|\cdot\|_{H_{k_1}^p}$. The Cauchy sequence $\{f_\beta\}$ with the limit $f_{k_2}^p$ in the norm $\|\cdot\|_{H_{k_2}^p}$ is also a Cauchy sequence with the limit $f_{k_1}^p$ in the norm $\|\cdot\|_{H_{k_1}^p}$. Uniqueness of the limit implies that $f_{k_2}^p = f_{k_1}^p$. Since $k_1, k_2$ are arbitrary, the limits $f_k^p$ for any $k \in \mathbb{N}$ agree. We denote the limit by $f$, so that the Cauchy sequence converges to $f \in C_p^\infty(EH)$ with respect to any of the norms. Thus the topology is complete and $C_p^\infty(EH)$ is a Fréchet algebra.

When $p = 2$, the Fréchet algebra $C_2^\infty(EH)$ belongs to the chain of continuous inclusions,
\[ C_c^\infty(EH) \hookrightarrow C_2^\infty(EH) \hookrightarrow C_0^\infty(EH), \tag{29} \]
with respect to their aforementioned topologies.
3.3. Deformation quantizations of differentiable Fréchet algebras. Rieffel's deformation quantization of a differentiable Fréchet algebra in [12] (Chapters 1, 2) can be summarized as follows. Let $A$ be a Fréchet algebra whose topology is defined by a family of seminorms $\{q_m\}$. We assume that there is an isometric action $\alpha$ of the vector space $V := \mathbb{R}^d$, considered as a $d$-dimensional Lie algebra, acting on $A$. We also assume that the algebra is smooth with respect to the action $\alpha$, i.e. $A = A^\infty$ in the notation of the reference. Under the choice of a basis $\{X_1, \ldots, X_d\}$ of the Lie algebra of $V$, the action $\alpha_{X_i}$ of $X_i$ defines a partial differentiation on $A$. One can define a new family of seminorms from $\{q_m\}$ by taking into account the action of $\alpha$. For any $f \in A$,
\[ \|f\|_{j,k} := \sum_{m \le j,\ |\mu| \le k} q_m(\delta^\mu f), \tag{30} \]
where $\mu$ are the multi-indices $(\mu_1, \ldots, \mu_d)$ and $\delta^\mu = \alpha_{X_1}^{\mu_1} \cdots \alpha_{X_d}^{\mu_d}$. The deformation quantization of the algebra $A$ can be carried out in three steps:

Step 1. Let $C_b(V \times V, A)$ be the space of bounded continuous functions from $V \times V$ to $A$. One can induce the family of seminorms $\{\|\cdot\|^C_{j,k}\}$ on the space $C_b(V \times V, A)$ by
\[ \|F\|^C_{j,k} := \sup_{w \in V \times V} \|F(w)\|_{j,k}, \tag{31} \]
for $F$ in $C_b(V \times V, A)$ and $\|\cdot\|_{j,k}$ on $A$ as in (30). Let $\tau$ be an action of $V \times V$ on the space $C_b(V \times V, A)$ defined by translation. That is, $\tau_{w_0}(F)(w) = F(w + w_0)$ for any $w_0, w \in V \times V$ and $F \in C_b(V \times V, A)$. The action $\tau$ is an isometric action with respect to the seminorms (31). We define $\mathcal{B}^A(V \times V)$ to be the maximal subalgebra on which $\tau$ is strongly continuous and whose elements are all smooth with respect to the action of $\tau$. In the same way as one induces the seminorms $\|\cdot\|_{j,k}$ of $A$ in (30) from the family $\{q_m\}$, one may induce a family of seminorms on $\mathcal{B}^A(V \times V)$ from (31) by taking into account the action of $\tau$. For any $F \in \mathcal{B}^A(V \times V)$, let
\[ \|F\|^{\mathcal{B}}_{j,k;l} := \sum_{(i,m) \le (j,k)}\ \sum_{|\nu| \le l} \|\delta^\nu F\|^C_{i,m}, \tag{32} \]
where $\nu$ are the multi-indices and $\delta^\nu$ denotes the partial differentiation operator associated to the action $\tau$ of $V \times V$.

Step 2. The following is the fundamental result of the deformation quantization of a differentiable algebra; see Proposition 1.6 in [12]. One can define an $A$-valued oscillatory integral over $V \times V$ of $F \in \mathcal{B}^A(V \times V)$ by
\[ \int_{V \times V} F(u, v)\, e(u \cdot v)\, du\, dv, \tag{33} \]
where $e(t) := \exp(2\pi i t)$ for $t \in \mathbb{R}$ and $u \cdot v$ is the natural inner product on $V$ considered as its own Lie algebra. It is shown to be $A$-valued by bounding the integral in the family of seminorms $\{\|\cdot\|_{j,k}\}$ on $A$. Specifically, for large enough $l$, there exists a constant $C_l$ such that
\[ \Big\| \int_{V \times V} F(u, v)\, e(u \cdot v)\, du\, dv \Big\|_{j,k} \le C_l\, \|F\|^{\mathcal{B}}_{j,k;l} < \infty, \]
where the seminorm $\|\cdot\|^{\mathcal{B}}_{j,k;l}$ is defined in (32).
Step 3. For any invertible map $J$ on $V$ and any two functions $f, g \in A$, one defines an element $F^{f,g} \in \mathcal{B}^A(V \times V)$ by
\[ F^{f,g}(u, v) := \alpha_{Ju}(f)\, \alpha_v(g) \in A, \qquad \forall (u, v) \in V \times V. \tag{34} \]
The deformed product $f \times_J g$ is thus defined by the integral (33) of $F^{f,g}(u, v)$ as
\[ f \times_J g := \int_V \int_V \alpha_{Ju}(f)\, \alpha_v(g)\, e(u \cdot v)\, du\, dv. \tag{35} \]
The algebra $A$ with its deformed product $\times_J$, together with its undeformed seminorms $\{\|\cdot\|_{j,k}\}$, defines the deformed Fréchet algebra $A_J$. This is called the deformation of the algebra $A$ (in the direction of $J$) as a differentiable Fréchet algebra.

In the following, we obtain deformation quantizations of various algebras of functions on EH-spaces. We may induce a torus action $\alpha$ on the algebra $C_b^\infty(EH)$, or similarly on the algebras $C_0^\infty(EH)$ and $C_2^\infty(EH)$, from the torus isometric action $\sigma$ of $v \in \mathbb{T}^2$ on the EH-space (4) by $\alpha_v f(x) = f(\sigma_{-v}(x))$ for any $f \in C_b^\infty(EH)$ and $x \in EH$. Under the choice of the covering $\{U_N, U_S\}$, the orbit of any point $x \in EH$ lies in the same coordinate chart as $x$. We assume that the partition of unity $h_N$ and $h_S$ only depends on the coordinate $\theta$, so that it is invariant under the torus action $\alpha$. One can easily show that the torus action $\alpha$ is isometric with respect to the family of seminorms (27). We also note that each of the Fréchet algebras $C_b^\infty(EH)$ and $C_0^\infty(EH)$ is already smooth with respect to the action $\alpha$. Thus, each of $C_b^\infty(EH)$ and $C_0^\infty(EH)$, with the isometric action $\alpha$ regarded as a periodic action of $V = \mathbb{R}^2$, appears exactly as the starting point $(A, \{q_m\})$ of Rieffel's deformation quantization. We can carry out Step 1 to Step 3 and obtain the product $\times_J$ on the respective algebras,
\[ f \times_J g := \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \alpha_{Ju}(f)\, \alpha_v(g)\, e(u \cdot v)\, du\, dv, \tag{36} \]
where the inner product $u \cdot v$ is the one on $\mathbb{R}^2$ and $J$ is a skew-symmetric linear operator on $\mathbb{R}^2$. In the following we assume
\[ J := \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} \]
for some $\theta \in \mathbb{R} \setminus \{0\}$, and denote $\times_J$ by $\times_\theta$. The algebra $C_b^\infty(EH)$ with its deformed product $\times_\theta$, together with its undeformed family of seminorms (27), defines the deformed Fréchet algebra $C_b^\infty(EH)_\theta$ as the deformation quantization of $C_b^\infty(EH)$. Similarly, $C_0^\infty(EH)_\theta$ is the deformation quantization of the algebra $C_0^\infty(EH)$.

For the Fréchet algebra $C_2^\infty(EH)$, the torus action $\alpha$ is isometric with respect to the family of norms $\{\|\cdot\|_{H_k^2}\}_{k \in \mathbb{N}}$, because it is isometric with respect to the Riemannian metric. We can similarly obtain the Fréchet algebra $C_2^\infty(EH)_\theta$ as the deformation quantization of the algebra $C_2^\infty(EH)$.

Remark 4. For any of the algebras in our example, the family of seminorms $\|\cdot\|_{j,k}$ induced from the $q_m$'s as in Step 1 is equivalent to the original family of seminorms. Indeed, the torus action is defined by the usual differentiation with respect to coordinates.

There follow some immediate observations.

Lemma 3. The algebra $C_2^\infty(EH)_\theta$ is an ideal of the algebra $C_b^\infty(EH)_\theta$.
Proof. Let $f \in C_2^\infty(EH)$ and $g \in C_b^\infty(EH)$. Considered as elements of the algebra $C_b^\infty(EH)$, they define $F^{f,g} \in \mathcal{B}^{C_b^\infty(EH)}(\mathbb{R}^2 \times \mathbb{R}^2)$ by (34). We claim that $F^{f,g}$ lies in $\mathcal{B}^{C_2^\infty(EH)}(\mathbb{R}^2 \times \mathbb{R}^2)$, so that its oscillatory integral, which is the product $f \times_\theta g$ by definition, will be finite in the family of seminorms on $C_2^\infty(EH)$ and hence $C_2^\infty(EH)$-valued. In fact,
\[ \begin{aligned} \int_{EH} |F^{f,g}(u, v)(x)|^2\, dVol(x) &= \int_{EH} |f(Ju + x)\, g(v + x)|^2\, dVol(x) \\ &\le \sup_{x \in EH} |g(x)|^2 \int_{EH} |f(Ju + x)|^2\, dVol(x) \\ &= \sup_{x \in EH} |g(x)|^2 \int_{EH} |f(x)|^2\, dVol(x) < \infty. \end{aligned} \]
The last equality is by the invariance of the volume form under the torus isometric action. The finiteness is because $g$ is a bounded function and $f \in C_2^\infty(EH)$. Higher orders can be shown as follows. For any non-negative integer $k$, we may expand $\nabla^k (f(Ju + x)\, g(v + x))$ by the Leibniz rule into a sum of terms of the form $\nabla^l f(Ju + x)\, \nabla^m g(v + x)$ with $l + m = k$. By the assumption that $\nabla^k f$ is $L^2$-integrable for any $k$ and $\nabla^l g$ is bounded for any $l$, each term in the sum is $L^2$-integrable. Thus $\nabla^k (f(Ju + x)\, g(v + x))$ is $L^2$-integrable for any $k$ and $F^{f,g}(u, v) \in C_2^\infty(EH)$ for any $(u, v) \in \mathbb{R}^2 \times \mathbb{R}^2$. As a result, the product $f \times_\theta g$ is $C_2^\infty(EH)$-valued and $C_2^\infty(EH)_\theta$ is an ideal.

The restriction of the product (36) of the algebra $C_b^\infty(EH)_\theta$ to the algebra $C_c^\infty(EH)$ gives the deformed algebra $C_c^\infty(EH)_\theta$. We see that it is closed as an algebra as follows. For any $f, g \in C_c^\infty(EH)$, the integral (36) vanishes outside the compact set $\mathrm{Orb}(\mathrm{supp}(f)) \cap \mathrm{Orb}(\mathrm{supp}(g))$, where $\mathrm{Orb}(U) := \{\alpha_{\mathbb{T}^2}(x) : x \in U \subset EH\}$. Therefore, $f \times_\theta g$ is of compact support and $C_c^\infty(EH)_\theta$ is thus closed.

We define the inductive limit topology of $C_c^\infty(EH)_\theta$ as follows. Let $\{K_j\}$ be an increasing family of compact sets of the EH-space such that $\cup_j K_j = EH$ and $\mathrm{Orb}(K_j) \subset K_j$ for each $j$. The product (36) defines a multiplication on each of the algebras $C_c^\infty(K_j)$. This defines a family of deformed algebras $\{C_c^\infty(K_j)_\theta\}$. For each $j$, the topology of $C_c^\infty(K_j)_\theta$ obtained from the restriction of the topology of uniform convergence to all degrees of $C_0^\infty(EH)_\theta$ is complete. This induces the strict inductive limit topology of $C_c^\infty(EH)_\theta$, which is locally convex and complete. Using definitions, we have

Lemma 4. $C_c^\infty(EH)_\theta$ is an ideal of the algebras $C_0^\infty(EH)_\theta$ and $C_b^\infty(EH)_\theta$.

Proof. For $f \in C_c^\infty(EH)_\theta$ and $g \in C_b^\infty(EH)_\theta$, the integral (36) vanishes outside the compact set $\mathrm{Orb}(\mathrm{supp}(f))$. Hence $f \times_\theta g$ is $C_c^\infty(EH)$-valued, so that $C_c^\infty(EH)_\theta$ is an ideal of the algebra $C_b^\infty(EH)_\theta$. The proof for the algebra $C_0^\infty(EH)_\theta$ is the same.

The torus action $\alpha$, as a compact action of an abelian group, defines a spectral decomposition of a function $f$ in the algebra $C_b^\infty(EH)$ or $C_0^\infty(EH)$ by
\[ f = \sum_s f_s, \qquad f_s(x) = e^{-i s_3 \phi}\, e^{-i s_4 \psi}\, h_s(r, \theta), \]
where $s = (s_3, s_4) \in \mathbb{Z}^2$, $f_s$ satisfies $\alpha_v f_s = e^{i s \cdot v} f_s$, $\forall v \in \mathbb{T}^2$, and the series converges in the topology of uniform convergence of all derivatives. Under the decomposition, the product (36) takes a simple form (Chapter 2, [12]). Let $f = \sum_r f_r$ and $g = \sum_s g_s$, in their respective decompositions, both be in the algebra $C_b^\infty(EH)$ (or $C_0^\infty(EH)$); then
\[ f \times_\theta g = \sum_{r,s} \sigma(r, s)\, f_r\, g_s, \tag{37} \]
where $\sigma(r, s) := e^{i\theta(r_4 s_3 - r_3 s_4)}$ and $r = (r_3, r_4)$, $s = (s_3, s_4) \in \mathbb{Z}^2$. The expression (37) can also be restricted to the algebra $C_c^\infty(EH)_\theta$.

Lemma 5. $C_0^\infty(EH)_\theta$ is an ideal of $C_b^\infty(EH)_\theta$.

Proof. For any $f \in C_0^\infty(EH)_\theta$ and $g \in C_b^\infty(EH)_\theta$, it suffices to show that $f \times_\theta g \in C_0^\infty(EH)_\theta$. For $g$ being zero, this is trivial. We thus assume that $g$ is nonzero. The convergence of the series (37) implies that for any $\varepsilon/2 > 0$, there exists an integer $N$ such that
\[ |f \times_\theta g(x)| < \Big| \sum_{|r|, |s| \le N} \sigma(r, s)\, f_r(x)\, g_s(x) \Big| + \frac{\varepsilon}{2}, \]
for any $x \in EH$, where $|r| := |r_3| + |r_4|$ and $|s| := |s_3| + |s_4|$. Since $f_r \in C_0^\infty(EH)$, for each $|r| \le N$ there exists a compact set $K(f_r) \subset EH$ such that
\[ |f_r(x)| < \frac{\varepsilon}{2C}, \qquad \forall x \in EH \setminus K(f_r), \]
for any fixed constant $C$. Therefore, for any $\varepsilon > 0$, we may choose $N$ and $K(f_r)$ as above and define the union of finitely many compact sets $K := \cup_{|r| \le N} K(f_r)$, so that $x \in EH \setminus K$ implies that
\[ |f \times_\theta g(x)| < \Big| \sum_{|r|, |s| \le N} \sigma(r, s)\, f_r(x)\, g_s(x) \Big| + \frac{\varepsilon}{2} < \frac{\varepsilon}{2C}\, A_N \sup_{x \in EH} |\sigma(r, s)\, g(x)| + \frac{\varepsilon}{2}, \]
where $A_N$ is a finite non-negative integer counting the number of indices $r$ and $s$ satisfying $|r|, |s| \le N$. If we fix the constant $C = \sup_{x \in EH} |\sigma(r, s)\, g(x)|\, A_N$, then the above inequalities give $|f \times_\theta g(x)| < \varepsilon$ whenever $x \in EH \setminus K$. Therefore $f \times_\theta g$ is $C_0(EH)$-valued. Higher orders can be shown by applying the Leibniz rule for the deformed product, so that $f \times_\theta g$ is $C_0^\infty(EH)$-valued and hence $C_0^\infty(EH)_\theta$ is an ideal of $C_b^\infty(EH)_\theta$.

We will end this subsection by introducing local algebras.

Definition 1 [2]. An algebra $A_c$ has local units if for every finite subset of elements $\{a_i\}_{i=1}^n \subset A_c$, there exists $\phi \in A_c$ such that for each $i$, $\phi a_i = a_i \phi = a_i$. Let $A$ be a Fréchet algebra such that $A_c \subset A$ is a dense ideal with local units; then $A$ is called a local algebra.

Lemma 6. The algebra $C_c^\infty(EH)_\theta$ has local units and the algebra $C_0^\infty(EH)_\theta$ is a local algebra.
Proof. For any finite set of elements $\{f_\beta\}_{\beta=1}^n \subset C_c^\infty(EH)_\theta$, there exists a compact set $K$ large enough to contain the union of supports $\cup_\beta\, \mathrm{supp}(f_\beta)$. Let $\phi$ be a function equal to 1 on $K$ and decaying to zero outside $K$ only with respect to the $r$-variable. The function $\phi$ thus defined satisfies $\phi = \phi_{(0,0)}$ in the spectral decomposition, so that $\phi \times_\theta f_\beta = f_\beta \times_\theta \phi = f_\beta$ for all $\beta$. Thus, $(C_c^\infty(EH), \times_\theta)$ is an algebra with local units. The fact that $C_c^\infty(EH)$ is dense in $C_0^\infty(EH)$ with respect to the topology of uniform convergence of all derivatives implies that $C_c^\infty(EH)_\theta$ is dense in $C_0^\infty(EH)_\theta$, since the family of seminorms is not deformed. $C_c^\infty(EH)_\theta$ is an ideal in $C_0^\infty(EH)_\theta$ by Lemma 4. Thus $C_0^\infty(EH)_\theta$ is a local algebra.

Lemma 3 of [2] says that there exists a local approximate unit $\{\phi_n\}_{n \ge 1}$ for a local algebra $(A_c \subset) A$. In this example, we choose a family of compact sets $K_0 \subset K_1 \subset \ldots$ in the EH-space, increasing in the $r$-direction. For instance, $K_n := \{x \in EH : r \le n\}$, $\forall n \in \mathbb{N}$. Let $\{\phi_n\}_{n \ge 1}$ be a family of functions depending only on $r$ and with compact support $K_n \subset \mathrm{supp}(\phi_n) \subset K_{n+1}$, such that $\phi_n$ is constant 1 on $K_n$ and decays to zero on $K_{n+1}$. In particular, $\phi_{n+1} \times_\theta \phi_n = \phi_n \times_\theta \phi_{n+1} = \phi_n \phi_{n+1} = \phi_n$. This gives a local approximate unit. It is not hard to see that each $\phi_n$ actually commutes with functions in the algebra $C_0^\infty(EH)_\theta$. Furthermore, the union of the algebras $\cup_n [C_0^\infty(EH)_\theta]_n$, where $[C_0^\infty(EH)_\theta]_n := \{f \in C_0^\infty(EH)_\theta : \phi_n \times_\theta f = f \times_\theta \phi_n = f\}$, is the algebra $C_c^\infty(EH)_\theta$.

3.4. Algebras of operators and deformations of $C^*$-algebras.

Definition 2 [2]. A $*$-algebra $\mathcal{A}$ is smooth if it is Fréchet and $*$-isomorphic to a proper dense subalgebra $i(\mathcal{A})$ of a $C^*$-algebra $A$ which is stable under the holomorphic functional calculus under a suitable representation.

Recall that the $q_0$ seminorm in the family (27) is the supremum norm $\|\cdot\|_\infty$, which defines $C^*$-norms on each of the algebras $C_b^\infty(EH)$ and $C_0^\infty(EH)$. The $C^*$-completion of the former is the algebra $C_b(EH)$ of bounded continuous functions. That $C_b^\infty(EH)$ is stable under the holomorphic functional calculus of $C_b(EH)$ implies that $C_b^\infty(EH)$ is a pre-$C^*$-algebra. The $C^*$-completion of $C_0^\infty(EH)$ is the algebra $C_0(EH)$ of continuous functions vanishing at infinity. As it is a nonunital Banach algebra, the holomorphic functional calculus is taken with respect to its unitization and with respect to holomorphic functions vanishing at 0. $C_0^\infty(EH)$ is stable under the holomorphic functional calculus of $C_0(EH)$ and hence is a pre-$C^*$-algebra. Similarly, the Fréchet algebra $C_2^\infty(EH)$ is also a pre-$C^*$-algebra of the $C^*$-completion $C_0(EH)$. We see that $C_0^\infty(EH)$, $C_2^\infty(EH)$ and $C_b^\infty(EH)$ are smooth algebras.

The deformation quantizations of $C_0^\infty(EH)$ and $C_b^\infty(EH)$ obtained before are deformations as differentiable Fréchet algebras. To realize the deformed algebras as pre-$C^*$-algebras of some deformed $C^*$-algebra, we may represent them as operators on some Hilbert space. Following the construction of [14,15], we may obtain their representations on the Hilbert space $\mathcal{H}$ of spinors, by using the torus isometric action.
Isospectral Deformations of Eguchi-Hanson Spaces as Nonunital Spectral Triples
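Let $C_*^\infty(EH)_\theta$ stand for any of the algebras $C_c^\infty(EH)_\theta$, $C_0^\infty(EH)_\theta$ or $C_b^\infty(EH)_\theta$. The operator representation of $C_*^\infty(EH)_\theta$ on the Hilbert space $\mathcal{H}$ is defined by
\[ L^\theta_f := \sum_{r\in\mathbb{Z}^2} M_{f_r} V^\theta_r, \tag{38} \]
where $M_{f_r}$ is the normal multiplication by $f_r$ and $V^\theta_r$ is a unitary operator obtained as the evaluation of the unitary operator (25) at $\tilde t_3 = \theta r_4$ and $\tilde t_4 = -\theta r_3$. That is,
\[ V^\theta_r := e^{i\theta(r_4 W_3 - r_3 W_4)}, \qquad r = (r_3, r_4)\in\mathbb{Z}^2. \tag{39} \]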
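As a consistency check on (38), the following sketch indicates why $f\mapsto L^\theta_f$ is multiplicative with respect to the deformed product. It rests on three hypotheses that are assumptions of this sketch rather than statements made at this point of the text: the deformed product is taken in the form $f\times_\theta g=\sum_{r,s}\sigma(r,s)f_rg_s$, the unitaries satisfy $V^\theta_r M_{h_s}=\sigma(r,s)\,M_{h_s}V^\theta_r$ for a component $h_s$ of degree $s$, and $V^\theta_rV^\theta_s=V^\theta_{r+s}$ (which holds if $W_3$ and $W_4$ commute). Convergence of the double sums is taken for granted.

```latex
\begin{align*}
L^\theta_f L^\theta_g
  &= \sum_{r,s\in\mathbb{Z}^2} M_{f_r} V^\theta_r\, M_{g_s} V^\theta_s \\
  &= \sum_{r,s\in\mathbb{Z}^2} \sigma(r,s)\, M_{f_r} M_{g_s}\, V^\theta_r V^\theta_s
     && \text{(assumed commutation relation)} \\
  &= \sum_{r,s\in\mathbb{Z}^2} \sigma(r,s)\, M_{f_r g_s}\, V^\theta_{r+s}
     && \text{(assumed } V^\theta_r V^\theta_s = V^\theta_{r+s}\text{)} \\
  &= \sum_{t\in\mathbb{Z}^2} M_{(f\times_\theta g)_t}\, V^\theta_t
   \;=\; L^\theta_{f\times_\theta g},
\end{align*}
% Last step: f_r g_s has degree r+s, so the degree-t component of the
% deformed product is (f \times_\theta g)_t = \sum_{r+s=t} \sigma(r,s) f_r g_s.
```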
Remark 5. Geometrically, $V^\theta_r$ is the action of parallel transporting any section by $-\theta r_3$ along the $\psi$ direction, followed by parallel transporting by $\theta r_4$ along the $\phi$ direction.

With the involution on $C_*^\infty(EH)_\theta$ defined by the complex conjugation of functions, we can use the property $(f^*)_r = (f_{-r})^*$ and $V^\theta_r h_s = h_s V^\theta_r\,\sigma(r,s)$ for any simple component $h_s$ from $\sum_s h_s$, to show that the representation (38) is a faithful ∗-representation of $C_*^\infty(EH)_\theta$. We may define the C*-norm of $C_*^\infty(EH)_\theta$ by the operator norm $\|\cdot\|_{op}$ of the representation on $\mathcal{H}$. The series of operators (38) converges uniformly in the operator norm. We denote the C*-completion of the algebra $C_b^\infty(EH)_\theta$ by $C_b(EH)_\theta$. It is a deformation of $C_b(EH)$ as a C*-algebra. One can also show that $C_b^\infty(EH)_\theta$ is stable under the holomorphic functional calculus of $C_b(EH)_\theta$ and hence a pre-C*-algebra. The C*-completion $C_0(EH)_\theta$ of the algebra $C_0^\infty(EH)_\theta$ defines a deformation of $C_0(EH)$ as a C*-algebra. This can also be realized as a pre-C*-algebra. For similar reasons, the Fréchet algebra $C_2^\infty(EH)_\theta$ can be realized as a pre-C*-algebra with the C*-completion $C_0(EH)_\theta$.

In the commutative case, one can show that $\|\cdot\|_{op}$ is bounded by the zeroth seminorm $q_0$ in the family of seminorms (26). Hence the C*-norm is weaker than the family of seminorms (26). To see that the same holds in the deformed case, we note that in Rieffel's construction, the deformed Fréchet algebras can be represented on the space of Schwartz functions associated with a natural inner product (p. 23 of [12]) and completed to C*-algebras. Furthermore, the corresponding C*-norm is shown to be weaker than the family of seminorms defining the Fréchet topology (Proposition 4.10 of [12]). We may induce a ∗-homomorphism from the C*-algebra represented on $\mathcal{H}$ to the C*-algebra represented on the space of Schwartz functions by the identity map of functions.
Since any ∗-homomorphism between C ∗ -algebras is norm-decreasing, we conclude that the C ∗ -norm on C∗∞ (E H )θ represented on H is also weaker than the family of seminorms (26) defining the topology of uniform convergence of all derivatives. Both of the algebras C0∞ (E H )θ and Cb∞ (E H )θ are smooth algebras. 3.5. Projective modules of spinor bundles. The link between vector bundles over compact space and projective modules is the Serre-Swan theorem [24]. It is generalized for vector bundles of finite type, of which there exists a finite number of open sets in the open cover of the base manifold such that the bundle is trivialized on each open set [25]. The smooth version of the result is as follows. Theorem 1. The category of complex vector bundles of finite type over X for any differentiable manifold X is equivalent to the category of finitely generated projective Cb∞ (X )-modules.
Remark 6. There exists an alternative version of the generalized Serre-Swan theorem [2] for vector bundles over noncompact manifolds, proved by using a certain compactification of the base manifolds. Since the simplest one-point compactification of the Eguchi-Hanson space gives an orbifold due to the $\mathbb{Z}_2$-identification, it is not straightforward to apply the construction there.

In the following, we will use Theorem 1 to find the projective module associated to the spinor bundle $S$ of the EH-space as defined in Sect. 2.3. In the coordinate charts $U_N$ and $U_S$ of the EH-space, we may choose a partition of unity $\{h_N, h_S\}$ such that both functions depend only on $\theta$ and $h_N$ ($h_S$, respectively) decays rapidly to zero at the "north pole" $N$ ("south pole" $S$, respectively). Recall that in the unitary basis $\{f_\alpha\}$ of $S_{U_N}$ and $\{f'_\beta\}$ of $S_{U_S}$, the transition functions $P^\alpha_\beta$ and $Q^\alpha_\beta$, such that $f_\beta = P^\alpha_\beta f'_\alpha$ and $f'_\beta = Q^\alpha_\beta f_\alpha$, are matrix entries of $P$ in (12) and $Q$ in (13), respectively. The idea is to extend the basis $\{f_\alpha\}$ on $U_N$ across $N$ and $\{f'_\alpha\}$ on $U_S$ across $S$, so that one can take the summation of both extended global sections to obtain a generating set of the space of smooth bounded sections of the spinor bundle, $\Gamma_b^\infty(S)$. To extend $\{f_\alpha\}$ across $N$, we may rescale it by the function $h_N$,
\[ F_\alpha := \begin{cases} f_\alpha h_N & \text{on } U_N, \\ 0 & \text{at } N, \end{cases} \tag{40} \]
so that the $F_\alpha$'s now decay to zero smoothly at $N$. Similarly, we may rescale the basis $\{f'_\alpha\}$ by the function $h_S$ by defining
\[ F'_\alpha := \begin{cases} f'_\alpha h_S & \text{on } U_S, \\ 0 & \text{at } S. \end{cases} \tag{41} \]
Note that on the intersection $U_N\cap U_S$, the transition function satisfies $P^\alpha_\beta h_N\to0$ whenever $h_N\to0$, and similarly $Q^\alpha_\beta h_S\to0$ whenever $h_S\to0$.

Lemma 7. The set of global sections $\{F_\alpha, F'_\alpha\}$, where $\alpha = 1,\dots,4$, is a generating set of the space of bounded smooth sections of the spinor bundle, $\Gamma_b^\infty(S)$.

Proof. The restriction $\{F_\alpha|_{U_N}\}$, where $\alpha = 1,\dots,4$, is a basis for $S_{U_N}$.
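Indeed, any section $\psi\in\Gamma_b^\infty(S)$ can be written as $\psi|_{U_N} = \psi^\alpha f_\alpha = a^\alpha f_\alpha h_N = a^\alpha F_\alpha|_{U_N}$, where $a^\alpha = \psi^\alpha/h_N$. Similarly, the restriction $\{F'_\alpha|_{U_S}\}$ gives a basis for $S_{U_S}$, since any section $\psi$ can be written as $\psi|_{U_S} = \psi'^\alpha f'_\alpha = b^\alpha f'_\alpha h_S = b^\alpha F'_\alpha|_{U_S}$, where $b^\alpha = \psi'^\alpha/h_S$. On the intersection,
\[ F_\alpha|_{U_N\cap U_S} = h_N P^\beta_\alpha F'_\beta\, h_S^{-1}, \qquad F'_\alpha|_{U_N\cap U_S} = h_S Q^\beta_\alpha F_\beta\, h_N^{-1}. \]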
Let $\{k_N, k_S\}$ be a new partition of unity, also depending only on $\theta$, such that $\mathrm{supp}(k_N)\subset U_N$ and $\mathrm{supp}(k_S)\subset U_S$. Furthermore, $k_N$ ($k_S$, respectively) is required to decay faster than $h_N$ around $N$ ($h_S$ around $S$, respectively). Therefore, $a^\alpha k_N\to0$ on $U_N$ whenever $h_N\to0$, and $b^\alpha k_S\to0$ on $U_S$ whenever $h_S\to0$. Thus, we can extend the coefficient functions $a^\alpha$ and $b^\alpha$ by zero,
\[ A^\alpha := \begin{cases} a^\alpha k_N & \text{on } U_N, \\ 0 & \text{at } N, \end{cases} \qquad B^\alpha := \begin{cases} b^\alpha k_S & \text{on } U_S, \\ 0 & \text{at } S, \end{cases} \]
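so that $\psi = A^\alpha F_\alpha + B^\alpha F'_\alpha$. In fact,
\[ A^\alpha F_\alpha + B^\alpha F'_\alpha = \begin{cases} \psi^\alpha k_N f_\alpha + \psi'^\alpha k_S f'_\alpha & \text{on } U_N\cap U_S, \\ \psi'^\alpha k_S f'_\alpha & \text{at } N, \\ \psi^\alpha k_N f_\alpha & \text{at } S, \end{cases} \;=\; \begin{cases} \psi^\alpha f_\alpha & \text{on } U_N, \\ \psi'^\alpha f'_\alpha & \text{on } U_S, \end{cases} \tag{42} \]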
which is the section $\psi$ in $\Gamma_b^\infty(S)$. Therefore, $\{F_\alpha, F'_\alpha\}$ with $\alpha = 1,\dots,4$ is a generating set of $\Gamma_b^\infty(S)$.

By construction, we may obtain a projection in $M_8(C_b^\infty(EH))$ corresponding to the spinor bundle $S$. Under the standard basis of the free $C_b^\infty(EH)$-module $C_b^\infty(EH)^8$, we define the matrix
\[ p := \begin{pmatrix} k_N\,1 & k_N\,P \\ k_S\,Q & k_S\,1 \end{pmatrix}, \tag{43} \]
where $P$ and $Q$ are $4\times4$ complex matrices from (12) and (13) and $1$ is the four by four identity matrix.

Proposition 1. $\Gamma_b^\infty(S)$ is a finitely generated projective $C_b^\infty(EH)$-module,
\[ C_b^\infty(EH)^8\, p \cong \Gamma_b^\infty(S). \tag{44} \]
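The idempotency of $p$ used in the proof can be checked blockwise. The sketch below assumes, beyond what is stated explicitly, that the transition matrices are mutually inverse on the overlap ($PQ = QP = 1$ wherever $k_Nk_S\neq0$) and that the scalar functions $k_N, k_S$ commute with the matrix entries (the undeformed case); the relation $k_N + k_S = 1$ is the partition-of-unity property. Self-adjointness of $p$ is not addressed here.

```latex
\begin{align*}
p^2
 &= \begin{pmatrix} k_N\,1 & k_N\,P \\ k_S\,Q & k_S\,1 \end{pmatrix}
    \begin{pmatrix} k_N\,1 & k_N\,P \\ k_S\,Q & k_S\,1 \end{pmatrix} \\
 &= \begin{pmatrix}
      k_N^2\,1 + k_N k_S\,PQ & k_N^2\,P + k_N k_S\,P \\
      k_S k_N\,Q + k_S^2\,Q  & k_S k_N\,QP + k_S^2\,1
    \end{pmatrix} \\
 &= \begin{pmatrix}
      k_N (k_N + k_S)\,1 & k_N (k_N + k_S)\,P \\
      k_S (k_N + k_S)\,Q & k_S (k_N + k_S)\,1
    \end{pmatrix}
    && \text{(using } PQ = QP = 1\text{)} \\
 &= p
    && \text{(using } k_N + k_S = 1\text{)}.
\end{align*}
```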
Proof. It is easy to check that $p^2 = p = p^*$. To show that (44) is an isomorphism, note that any section can be represented as an element in $C_b^\infty(EH)^8 p$ by construction. Conversely, the matrix $p$ maps any element $(t_1,\dots,t_4,t'_1,\dots,t'_4)$ of $C_b^\infty(EH)^8$ to
\[ \big( (t_1 + P^\beta_1 t'_\beta)k_N,\ (t_2 + P^\beta_2 t'_\beta)k_N,\ (t_3 + P^\beta_3 t'_\beta)k_N,\ (t_4 + P^\beta_4 t'_\beta)k_N,\ (t'_1 + Q^\beta_1 t_\beta)k_S,\ (t'_2 + Q^\beta_2 t_\beta)k_S,\ (t'_3 + Q^\beta_3 t_\beta)k_S,\ (t'_4 + Q^\beta_4 t_\beta)k_S \big). \]
Let $A^\alpha = (t_\alpha + P^\beta_\alpha t'_\beta)k_N$ and $B^\alpha = (t'_\alpha + Q^\beta_\alpha t_\beta)k_S$ for $\alpha = 1,\dots,4$; then the image gives a section in $\Gamma_b^\infty(S)$ in the form of (42). Therefore, (44) is an isomorphism.

Columns of the matrix $p = (p^\alpha_\beta)$ give a generating set of $\Gamma_b^\infty(S)$. We may define $P^k = (p^k_1,\dots,p^k_8)^t$ for $k = 1,\dots,8$; then any element $\psi\in C_b^\infty(EH)^8 p$ can be written as $\psi = \psi_k P^k$ for functions $\psi_k\in C_b^\infty(EH)$.

3.6. Smooth modules. In addition to the description of a vector bundle as a finitely generated projective module, the integrability conditions of the sections become vital when the base manifold is noncompact. The notion of smooth module [2] is proposed to integrate the two aspects. We will give the relevant background from the reference.

Let $\mathcal{A}_0$ be an ideal in a smooth unital algebra $\mathcal{A}_b$. Suppose that $\mathcal{A}_0$ is further a local algebra containing a dense subalgebra of local units $\mathcal{A}_c$. Assuming the topology on $\mathcal{A}_0$ is the one making it local and the topology on $\mathcal{A}_b$ is the one making it smooth, if the inclusion $i : \mathcal{A}_0\to\mathcal{A}_b$ is continuous, then $\mathcal{A}_0$ is a local ideal. It is further called essential if $\mathcal{A}_0 b = \{0\}$ for some $b\in\mathcal{A}_b$ implies $b = 0$.
Let $\mathcal{A}_0$ be a closed essential local ideal in a smooth unital algebra $\mathcal{A}_b$ and $p\in M_n(\mathcal{A}_b)$ be a projection. By pulling back the projective module $\mathcal{E}_b$ defined by $\mathcal{A}_b^n p$ through the inclusion map $i : \mathcal{A}_c\to\mathcal{A}_b$, one can define the $\mathcal{A}_b$-finite projective $\mathcal{A}_c$-module $\mathcal{E}_c$ by $\mathcal{A}_c^n p$. Similarly, one can define the $\mathcal{A}_b$-finite projective $\mathcal{A}_0$-module $\mathcal{E}_0$ by $\mathcal{A}_0^n p$. By using the Hermitian form on the projective modules, $(\xi,\eta) := \sum_k \xi_k^*\eta_k$, one may obtain the topology on $\mathcal{E}_c$ induced from the topology of inductive limit on $\mathcal{A}_c$, the Fréchet topology on $\mathcal{E}_0$ induced from the Fréchet topology on $\mathcal{A}_0$, and the Fréchet topology on $\mathcal{E}_b$ induced from the Fréchet topology on $\mathcal{A}_b$. Hence one has the following continuous inclusions of projective modules: $\mathcal{E}_c\to\mathcal{E}_0\to\mathcal{E}_b$.

Definition 3. A smooth $\mathcal{A}_b$-module $\mathcal{E}_2$ is a Fréchet space with a continuous action of $\mathcal{A}_b$ such that $\mathcal{E}_c\to\mathcal{E}_2\to\mathcal{E}_0$ as linear spaces, where the inclusions are all continuous.

Returning to our example, we may choose $\mathcal{A}_c$ as $C_c^\infty(EH)_\theta$, $\mathcal{A}_0$ as $C_0^\infty(EH)_\theta$, $\mathcal{A}_2$ as $C_2^\infty(EH)_\theta$, and $\mathcal{A}_b$ as $C_b^\infty(EH)_\theta$.

Proposition 2. Assuming that $C_c^\infty(EH)_\theta$ is the algebra of local units, the algebras $C_c^\infty(EH)_\theta$, $C_2^\infty(EH)_\theta$ and $C_0^\infty(EH)_\theta$ are all essential local ideals of $C_b^\infty(EH)_\theta$ under the topology of uniform convergence of all derivatives.

Proof. $C_0^\infty(EH)_\theta$ is an ideal of $C_b^\infty(EH)_\theta$ by Lemma 5. Since the topologies on $C_0^\infty(EH)_\theta$ and $C_b^\infty(EH)_\theta$ are both the topology of uniform convergence of all derivatives, the inclusion $C_0^\infty(EH)_\theta\to C_b^\infty(EH)_\theta$ is continuous. To show that the ideal $C_0^\infty(EH)_\theta$ is essential, we suppose that $f\in C_b^\infty(EH)_\theta$ satisfies $g\times_\theta f = 0$ for all $g\in C_0^\infty(EH)_\theta$. Taking $g = 1/r$, we get $g\times_\theta f = g\times f = 0$. This implies that $f = 0$, since $1/r$ is nowhere zero. Thus, $C_0^\infty(EH)_\theta$ is an essential ideal. $C_2^\infty(EH)_\theta$ is an ideal of $C_b^\infty(EH)_\theta$ by Lemma 3. Similar to the proof for $C_0^\infty(EH)_\theta$, $C_2^\infty(EH)_\theta$ is further an essential ideal. $C_c^\infty(EH)_\theta$ is an ideal of $C_b^\infty(EH)_\theta$ by Lemma 4.
$C_c^\infty(EH)_\theta$ carrying the topology of inductive limit is a local essential ideal, as is implied directly by Corollary 7 of [2].

With the differential topologies the same as their commutative restriction, there is a chain of continuous inclusions,
\[ C_c^\infty(EH)_\theta \to C_2^\infty(EH)_\theta \to C_0^\infty(EH)_\theta \to C_b^\infty(EH)_\theta. \tag{45} \]
One may define the following projective modules $C_c^\infty(EH)^8_\theta\, p$, $C_0^\infty(EH)^8_\theta\, p$ and $C_b^\infty(EH)^8_\theta\, p$ by the projection $p$ in the form of (43), now considered as an element in $M_8(C_b^\infty(EH)_\theta)$. It is not hard to see that $p^2 = p = p^*$ still holds in this case. The family of seminorms, say $\{Q_m\}$, on the projective modules is induced from the family of seminorms on the algebra, say $\{q_m\}$, by composing with the Hermitian form $(\cdot,\cdot)$ on the projective modules as $Q_m(\xi) := q_m((\xi,\xi))^{1/2}$ for any $\xi$ in the projective module. The topologies on the projective modules are defined by the induced family of seminorms. In this way, the chain of algebras (45) induces the chain of projective modules,
\[ C_c^\infty(EH)^8_\theta\, p \to C_2^\infty(EH)^8_\theta\, p \to C_0^\infty(EH)^8_\theta\, p \to C_b^\infty(EH)^8_\theta\, p. \]
Note that the action of $C_b^\infty(EH)_\theta$ on $C_2^\infty(EH)^8_\theta\, p$ is continuous. Indeed, if a sequence of elements $\{\xi_\beta\}$ in $C_2^\infty(EH)^8_\theta\, p$ satisfies $Q_m(\xi_\beta)\to0$ as $\beta\to\infty$, then for any $f\in C_b^\infty(EH)_\theta$,
\[ Q_m(\xi_\beta f)^2 = q_m((\xi_\beta f, \xi_\beta f)) = q_m(f^*(\xi_\beta,\xi_\beta)f) \le q_m(f^*)\,Q_m(\xi_\beta)^2\,q_m(f) \to 0, \]
where $q_m$ stands for $\|\cdot\|_{H^2_m}$ defined in (28). Therefore, we realize $C_2^\infty(EH)^8_\theta\, p$ as a smooth module.

4. Nonunital Spectral Triples and Summability

In this section, we define nonunital spectral triples and consider their summability. We also consider the regularity and measurability of the spectral triples of the isospectral deformations of EH-spaces.

Among normed ideals in the algebra of compact operators $\mathcal{K}(\mathcal{H})$ on a Hilbert space $\mathcal{H}$, the Dixmier trace ideal $\mathcal{L}^{1,\infty}(\mathcal{H})$ is the domain of a Dixmier trace $\mathrm{Tr}_\omega$, where $\omega$ is some functional on the space of bounded sequences. An operator $T\in\mathcal{L}^{1,\infty}(\mathcal{H})$ is measurable if its Dixmier trace is independent of $\omega$, and one denotes the Dixmier trace by $\mathrm{Tr}^+(T)$. See for example [22]. One may define $-\!\!\!\!\!\int T := \mathrm{Tr}^+(T)$ as the noncommutative integral of $T$.

Apart from the Dixmier trace ideal, the generalized Schatten ideals $\mathcal{L}^{p,\infty}(\mathcal{H})$ for $p > 1$ are the ideals in which $(p,\infty)$-summability is considered. They are related to $\mathcal{L}^{1,\infty}(\mathcal{H})$ in a similar fashion as various Sobolev spaces are linked. If the operator $T\in\mathcal{L}^{p,\infty}(\mathcal{H})$, then $T^p\in\mathcal{L}^{1,\infty}(\mathcal{H})$.

Rennie (Theorem 12, [5]) provides a measurability criterion for operators from local nonunital spectral triples. Within the locality framework, a generalized Connes trace theorem over a commutative geodesically complete Riemannian manifold is also given (Proposition 15, [5]). The Dixmier trace of such a measurable operator agrees with the Wodzicki residue of the operator [4]. Gayral and his coworkers [6] carry out a detailed study of the summability of the nonunital spectral triples from isospectral deformations. Their results are also of a local nature.

4.1. Nonunital spectral triples and local $(p,\infty)$-summability.

Definition 4 [2]. A nonunital spectral triple $(\mathcal{A},\mathcal{H},D)$ is given by
1. A representation $\pi : \mathcal{A}\to\mathcal{B}(\mathcal{H})$ of a local ∗-algebra $\mathcal{A}$, containing some algebra $\mathcal{A}_c$ of local units as a dense ideal, on the Hilbert space $\mathcal{H}$. $\mathcal{A}$ admits a suitable unitization $\mathcal{A}_b$.
2. A self-adjoint (unbounded, densely defined) operator $D : \mathrm{dom}\,D\to\mathcal{H}$ such that $[D,a]$ extends to a bounded operator on $\mathcal{H}$ for all $a\in\mathcal{A}_b$, and $a(D-\lambda)^{-1}$ is compact for $\lambda\notin\mathbb{R}$ and all $a\in\mathcal{A}$. This is the compact resolvent condition for nonunital triples.
We omit $\pi$ if no ambiguity arises. The spectral triple is even if there exists an operator $\chi = \chi^*$ such that $\chi^2 = 1$, $[\chi,a] = 0$ for all $a\in\mathcal{A}$ and $\chi D + D\chi = 0$. Otherwise, it is odd.

To obtain the nonunital spectral triple of the isospectral deformation of the EH-space, let $\mathcal{A}$ be the local ∗-algebra $C_0^\infty(EH)_\theta$ which contains the algebra of local units
$C_c^\infty(EH)_\theta$ as a dense ideal. The unitization $\mathcal{A}_b$ is chosen as $C_b^\infty(EH)_\theta$. The representation $\pi$ is defined by the representation $L^\theta_\bullet : C_b^\infty(EH)_\theta\to\mathcal{B}(\mathcal{H})$ from (38). The boundedness of $L^\theta_f$, where $f = \sum_r f_r$, can be seen as follows:
\[ \|L^\theta_f\|_{op} = \Big\|\sum_r M_{f_r}V^\theta_r\Big\|_{op} \le \sum_r \|M_{f_r}V^\theta_r\|_{op} \le \sum_r \|M_{f_r}\|_{op} \le \sum_r \|f_r\|_\infty < \infty, \]
where the summations are over $\mathbb{Z}^2$.

Let $D$ be the extension of the Dirac operator of the spinor bundle to the Hilbert space $\mathcal{H}$. Since the Eguchi-Hanson space is geodesically complete, the extended operator is self-adjoint. We will see in the next subsection that the operator $[D, L^\theta_f]$ is of degree 0 as a pseudodifferential operator and hence bounded.

The operator $\chi$ is chosen to be the chirality operator defined in (15), such that $\chi = \chi^*$ and $\chi^2 = 1$. Since $\chi$ can be realized as a fiberwise constant matrix operating on the spinor bundle, it commutes with any $L^\theta_f = \sum_r M_{f_r}V^\theta_r$. The identity $\chi D + D\chi = 0$ is that from the commutative geometry.

The data $(C_0^\infty(EH)_\theta, \mathcal{H}, D)$ will be a nonunital spectral triple once the compact resolvent condition is shown. Before that, we consider the following proposition.

Proposition 3. For any $f\in C_c^\infty(EH)_\theta$,
\[ L^\theta_f (D-\lambda)^{-1} \in \mathcal{L}^{4,\infty}(\mathcal{H}), \qquad \forall\lambda\notin\mathbb{R}. \tag{46} \]
Proof. The proof is a straightforward generalization of Proposition 15 of [5] and references therein. With respect to the local trivializations $\{U_N, U_S\}$ of the spinor bundle $S$ coming from the stereographic projection as before, we may show the summability of the operator (46) by showing the summability of the restrictions of the operator to each trivialization. Indeed, for any $f\in C_c^\infty(EH)_\theta$, the operator $L^\theta_f = \sum_r M_{f_r}V^\theta_r$ is defined by summations of normal multiplications by $f_r$ following parallel transport in the $\phi$ and $\psi$ directions, so that it is well-defined when restricted to either $U_N$ or $U_S$. We may choose the partition of unity $h_N, h_S$ as before so that each function $f$ can be decomposed as $f = f_N + f_S$ with $f_N\in C_c^\infty(U_N)$ and $f_S\in C_c^\infty(U_S)$. It suffices to show that
\[ L^\theta_f (D-\lambda)^{-1} \in \mathcal{L}^{4,\infty}(L^2(S_{U_N})), \qquad \forall f\in C_c^\infty(U_N), \tag{47} \]
and similarly for $U_S$. For any fixed $f\in C_c^\infty(U_N)_\theta$, we can find a positive constant $R > a$ large enough and a constant $\Theta > 0$ small enough such that the compact region defined by $W_{R,\Theta} := \{x\in U_N : r\le R,\ \theta\ge\Theta\}\subset U_N$ contains the compact support of $f$. Notice that with the restricted metric from the EH-space, the region $W_{R,\Theta}$ is a compact manifold with a boundary $\partial W_{R,\Theta}$ defined by $r = R$ and $\theta = \Theta$. We will fix $R$ and $\Theta$ from now on, write $W$ instead of $W_{R,\Theta}$, and denote the restriction of the spinor bundle $S$ to $W_{R,\Theta}$ by $S_W$. Because the integral curve starting through any point in $W$ along the $\phi$ or $\psi$ direction still lies within $W$, the action of $L^\theta_f$ can be restricted to sections of the subbundle $S_W$. To prove (47), it suffices to prove that $L^\theta_f(D-\lambda)^{-1}\in\mathcal{L}^{4,\infty}(L^2(S_W))$.
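Let $\widetilde{W} := W\cup_{\partial W}(-W)$ be the invertible double of the compact manifold $W$ with boundary $\partial W$, and let the corresponding spinor bundle be $\widetilde{S}\to\widetilde{W}$ and the corresponding Dirac operator be $D_I$. Applying Weyl's lemma [26] to $\widetilde{S}\to\widetilde{W}$ as a vector bundle over a compact manifold without boundary, we obtain $(D_I-\lambda)^{-1}\in\mathcal{L}^{4,\infty}(L^2(\widetilde{S}))$ for $\lambda\notin\mathbb{R}$. That is,
\[ \|(D_I-\lambda)^{-1}\|^{\widetilde{W}\to\widetilde{W}}_{4,\infty} < \infty, \qquad \forall\lambda\notin\mathbb{R}, \tag{48} \]
where the norm is the $(4,\infty)$-Schatten norm and we indicate the domain and image of operators as a superscript on the norms.

As to the action of $L^\theta_f$, we may extend the function $f\in C_c^\infty(W)$ to a function $\tilde f\in C_c^\infty(\widetilde{W})$ by zero. Correspondingly, we may extend the operator $L^\theta_f : L^2(W,S)\to L^2(W,S)$ to
\[ L^\theta_{\tilde f} : L^2(\widetilde{W},\widetilde{S}) \to L^2(\widetilde{W},\widetilde{S}). \]
Using the resolvent identity $[L^\theta_{\tilde f}, (D_I-\lambda)^{-1}] = (D_I-\lambda)^{-1}[D_I, L^\theta_{\tilde f}](D_I-\lambda)^{-1}$, we have
\[ L^\theta_{\tilde f}(D_I-\lambda)^{-1} = (D-\lambda)^{-1}L^\theta_{\tilde f} + (D-\lambda)^{-1}(D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I)(D_I-\lambda)^{-1}. \tag{49} \]
By composing $L^\theta_{\tilde f}$ with the restriction of sections of $L^2(\widetilde{W},\widetilde{S})$ to $L^2(W,S)$, we obtain an operator in the same notation, $L^\theta_{\tilde f}$, mapping from $L^2(\widetilde{W},\widetilde{S})$ to $L^2(W,S)$. Let $\iota : L^2(W,S)\to L^2(\widetilde{W},\widetilde{S})$ be the inclusion map. The composition of $\iota$ with the identity (49) then gives
\[ L^\theta_{\tilde f}(D_I-\lambda)^{-1}\iota = (D-\lambda)^{-1}L^\theta_{\tilde f}\,\iota + (D-\lambda)^{-1}(D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I)(D_I-\lambda)^{-1}\iota \tag{50} \]
as a map from $L^2(W,S)$ to itself. Applying (50), we obtain
\begin{align*}
\|L^\theta_f(D-\lambda)^{-1}\|^{W\to W}_{4,\infty}
 &= \|L^\theta_{\tilde f}(D_I-\lambda)^{-1}\iota\|^{W\to W}_{4,\infty} \\
 &= \|(D-\lambda)^{-1}L^\theta_{\tilde f}\,\iota + (D-\lambda)^{-1}(D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I)(D_I-\lambda)^{-1}\iota\|^{W\to W}_{4,\infty} \\
 &\le \|(D-\lambda)^{-1}L^\theta_{\tilde f}\,\iota\|^{W\to W}_{4,\infty} + \|(D-\lambda)^{-1}(D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I)(D_I-\lambda)^{-1}\iota\|^{W\to W}_{4,\infty}.
\end{align*}
We consider the two terms in the last line separately. Since the inclusion $\iota$ is an isometry, the first term is bounded as
\[ \|(D-\lambda)^{-1}L^\theta_{\tilde f}\,\iota\|^{W\to W}_{4,\infty} \le \|(D-\lambda)^{-1}L^\theta_{\tilde f}\|^{\widetilde{W}\to W}_{4,\infty} \le \|(D_I-\lambda)^{-1}\|^{\widetilde{W}\to\widetilde{W}}_{4,\infty}\,\|L^\theta_{\tilde f}\|^{\widetilde{W}\to\widetilde{W}}_{op} < \infty, \tag{51} \]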
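where $\|L^\theta_{\tilde f}\|^{\widetilde{W}\to\widetilde{W}}_{op} < \infty$ because $L^\theta_{\tilde f}$ is the trivial extension of the bounded operator $L^\theta_f$ from $L^2(W,S)$ to itself, and the finiteness of $\|(D_I-\lambda)^{-1}\|^{\widetilde{W}\to\widetilde{W}}_{4,\infty}$ is by (48). The second term is bounded as
\begin{align*}
\|(D-\lambda)^{-1}(D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I)(D_I-\lambda)^{-1}\iota\|^{W\to W}_{4,\infty}
 &\le \|(D-\lambda)^{-1}(D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I)(D_I-\lambda)^{-1}\|^{\widetilde{W}\to W}_{4,\infty} \\
 &\le \|(D-\lambda)^{-1}\|^{W\to W}_{op}\,\|D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I\|^{\widetilde{W}\to W}_{op}\,\|(D_I-\lambda)^{-1}\|^{\widetilde{W}\to\widetilde{W}}_{4,\infty} < \infty. \tag{52}
\end{align*}
Indeed, the finiteness of $\|(D-\lambda)^{-1}\|^{W\to W}_{op}$ follows from the fact that $(D-\lambda)^{-1}$ is a bounded operator on $S\to W$ as the restriction of the bounded operator on $L^2(S)$. For the finiteness of $\|D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I\|^{\widetilde{W}\to W}_{op}$, we have
\[ \|D L^\theta_{\tilde f} - L^\theta_{\tilde f}D_I\|^{\widetilde{W}\to W}_{op} = \|[D, L^\theta_f]\|^{W\to W}_{op} \le \|[D, L^\theta_f]\|^{EH\to EH}_{op} < \infty, \]
since $\tilde f$ extends $f$ by zero and the boundedness of $[D, L^\theta_f]$ will be shown in the next section. The finiteness of $\|(D_I-\lambda)^{-1}\|^{\widetilde{W}\to\widetilde{W}}_{4,\infty}$ is again by (48). Summation of the inequalities (51) and (52) implies that $\|L^\theta_f(D-\lambda)^{-1}\|^{W\to W}_{4,\infty} < \infty$.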
The proof for the coordinate patch U S is the same.
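The identity (49) used in the proof above follows from a short algebraic manipulation, sketched here for a generic bounded operator $L$ in place of $L^\theta_{\tilde f}$. Domain questions for the unbounded operators $D$ and $D_I$ are ignored in this sketch, which is an assumption on top of the text.

```latex
\begin{align*}
L\,(D_I-\lambda)^{-1} - (D-\lambda)^{-1}L
 &= (D-\lambda)^{-1}\Big[(D-\lambda)\,L - L\,(D_I-\lambda)\Big](D_I-\lambda)^{-1} \\
 &= (D-\lambda)^{-1}\big(D\,L - L\,D_I\big)(D_I-\lambda)^{-1},
\end{align*}
% The two \lambda L terms cancel inside the bracket. Moving
% (D-\lambda)^{-1} L to the right-hand side reproduces (49).
```

When $D = D_I$ this reduces to the usual commutator form of the resolvent identity quoted before (49).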
As pointed out by Rennie, Proposition 3 implies the compact resolvent condition.

Lemma 8. For any $f\in C_0^\infty(EH)_\theta$, $L^\theta_f(D-\lambda)^{-1}\in\mathcal{K}(\mathcal{H})$ with $\lambda\notin\mathbb{R}$.

Proof. Let $\{f_\beta\}$ be a sequence of functions in $C_c^\infty(EH)_\theta$ which converges to the function $f\in C_0^\infty(EH)_\theta$ in the topology of uniform convergence. Then $\{L^\theta_{f_\beta}\}$ converges to $L^\theta_f$ in the C*-operator norm, for the norm topology is weaker than the topology of uniform convergence. This further implies that the sequence of operators $\{L^\theta_{f_\beta}(D-\lambda)^{-1}\}$ converges to $L^\theta_f(D-\lambda)^{-1}$ in the operator norm. The $(4,\infty)$-summability of each $L^\theta_{f_\beta}(D-\lambda)^{-1}$ by (46) implies that they are all compact operators. As the norm limit of a sequence of compact operators, $L^\theta_f(D-\lambda)^{-1}$ is also compact.
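The convergence step in the proof of Lemma 8 can be made quantitative. The estimate below is a sketch that uses only the linearity of $f\mapsto L^\theta_f$ and the standard resolvent bound $\|(D-\lambda)^{-1}\|_{op}\le 1/|\mathrm{Im}\,\lambda|$ for a self-adjoint operator $D$; the latter is a general fact and not something specific to this paper.

```latex
\begin{align*}
\big\| L^\theta_{f_\beta}(D-\lambda)^{-1} - L^\theta_{f}(D-\lambda)^{-1} \big\|_{op}
 &= \big\| L^\theta_{f_\beta - f}\,(D-\lambda)^{-1} \big\|_{op} \\
 &\le \big\| L^\theta_{f_\beta - f} \big\|_{op}\,\big\| (D-\lambda)^{-1} \big\|_{op} \\
 &\le \frac{\big\| L^\theta_{f_\beta - f} \big\|_{op}}{|\mathrm{Im}\,\lambda|}
 \;\longrightarrow\; 0
 \qquad (\beta \to \infty,\ \lambda \notin \mathbb{R}).
\end{align*}
```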
In summary, the data (C0∞ (E H )θ , H, D) of the isospectral deformations of the Eguchi-Hanson spaces are even nonunital spectral triples as in Definition 4. Let ΩD (A) denote the algebra of operators generated by A and [D, A]. Definition 5 [5]. A (nonunital) spectral triple (A, H, D) is called local, if there exists a local approximate unit {φn } ⊂ Ac for A satisfying ΩD (Ac ) = ∪n ΩD (A)n , where ΩD (A)n := {ω ∈ ΩD (A) : φn ω = ωφn = ω}. For p ≥ 1, the local spectral triple is called local ( p, ∞)-summable if a (D −λ)−1 ∈ L p,∞ (H), λ ∈ / R, for any a ∈ Ac .
Local $(p,\infty)$-summability implies that (Proposition 10 [5])
\[ T(1+D^2)^{-s/2} \in \mathcal{L}^{p/s,\infty}(\mathcal{H}), \qquad 1\le\mathrm{Re}(s)\le p, \tag{53} \]
for any $T\in\mathcal{B}(\mathcal{H})$ such that $T\phi = \phi T = T$ for some $\phi\in\mathcal{A}_c$. If $\mathrm{Re}(s) > p$, the operator is of trace class. In considering the (local) summability of the spectral triples, we restrict to the spectral triple $(C_c^\infty(EH)_\theta, \mathcal{H}, D)$.

Lemma 9. The spectral triple $(C_c^\infty(EH)_\theta, \mathcal{H}, D)$ is local $(4,\infty)$-summable.

Proof. First we show that the spectral triple is local. We may choose the local approximate unit $\{\phi_n\}$ as defined in Sect. 3.3 so that each $\phi_n$ remains commutative. As operators, they act only by normal multiplication $M_{\phi_n}$ on spinors. Define $[C_c^\infty(EH)_\theta]_n$ to be the subalgebra of the operator algebra $C_c^\infty(EH)_\theta$ consisting of elements $L^\theta_f$ such that $L^\theta_f M_{\phi_n} = M_{\phi_n}L^\theta_f = L^\theta_f$; then $C_c^\infty(EH)_\theta = \cup_{n\in\mathbb{N}}[C_c^\infty(EH)_\theta]_n$. Thus $\Omega_D(C_c^\infty(EH)_\theta) = \Omega_D(\cup_{n\in\mathbb{N}}[C_c^\infty(EH)_\theta]_n) = \cup_{n\in\mathbb{N}}\Omega_D([C_c^\infty(EH)_\theta]_n)$. We claim that this equals $\cup_{n\in\mathbb{N}}[\Omega_D(C_c^\infty(EH)_\theta)]_n$, where $[\Omega_D(C_c^\infty(EH)_\theta)]_n := \{\omega\in\Omega_D(C_c^\infty(EH)_\theta) : \omega M_{\phi_n} = M_{\phi_n}\omega = \omega\}$. By the fact that the orbit of the torus action of any point $x\in K_n$ remains in $K_n$, $M_{\phi_n}L^\theta_f = L^\theta_f M_{\phi_n}$ whenever $\mathrm{supp}(f)\subset K_n$. That the Dirac operator preserves support implies $M_{\phi_n}[D, L^\theta_f] = [D, L^\theta_f]M_{\phi_n} = [D, L^\theta_f]$. This further gives that $\cup_{n\in\mathbb{N}}\Omega_D([C_c^\infty(EH)_\theta]_n)\subset\cup_{n\in\mathbb{N}}[\Omega_D(C_c^\infty(EH)_\theta)]_n$. The other direction is obvious. Therefore, $\Omega_D(C_c^\infty(EH)_\theta) = \cup_{n\in\mathbb{N}}[\Omega_D(C_c^\infty(EH)_\theta)]_n$, and the spectral triple is local. The local $(4,\infty)$-summability of the spectral triple $(C_c^\infty(EH)_\theta, \mathcal{H}, D)$ is implied by Proposition 3.

4.2. Regularity of spectral triples. For a given spectral triple $(\mathcal{A}, \mathcal{H}, D)$, we can define a derivation $\delta$ on the space of linear operators $L(\mathcal{H})$ on the Hilbert space by $\delta(T) := [|D|, T]$, $T\in L(\mathcal{H})$. A linear operator $T$ is in the domain of the derivation, $\mathrm{dom}\,\delta\subset L(\mathcal{H})$, if any $\psi\in\mathrm{dom}(|D|)$ implies $T(\psi)\in\mathrm{dom}(|D|)$. For any positive integer $k$, $T$ is in the domain of the $k$-th derivation, $\mathrm{dom}\,\delta^k\subset L(\mathcal{H})$, if $\delta^{k-1}(T)\in\mathrm{dom}\,\delta$, where $\delta^{k-1}(T) = [|D|,[|D|,\dots,[|D|,T]\dots]]$, with $k-1$ brackets. The intersection of the domains of all powers of $\delta$, $\mathrm{dom}^\infty\delta := \cap_{k\in\mathbb{N}}\,\mathrm{dom}\,\delta^k$, is the smooth domain of the derivation $\delta$. When $k = 0$, $\mathrm{dom}\,\delta^0$ is simply the space of bounded operators $\mathcal{B}(\mathcal{H})$. Therefore, an operator $T\in\mathrm{dom}\,\delta^k$ if $\delta^k(T)$ is a bounded operator.

Definition 6. A spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is regular if $\Omega_D(\mathcal{A})\subset\mathrm{dom}^\infty\delta$.
Before considering the regularity of the spectral triple, we collect some related properties of the operators $L^\theta_f$ and $D$ as pseudodifferential operators. The Dirac operator $D$ on the spinor bundle $S$ is a first-order differential operator with principal symbol $\sigma^{D}(x,\xi) = c(\xi_j\,dx^j)$, where $\xi$, as a section of the cotangent bundle $T^*(EH)$, has coordinates $(\xi_1,\dots,\xi_4)$ with respect to the basis $\{dx^i\}$ defined in the beginning of Sect. 2.1. The operator $D^2$ is a second-order differential operator with principal symbol
\[ \sigma^{D^2}(x,\xi) = g(\xi,\xi)\,1, \tag{54} \]
where $g$ is the induced metric tensor on the cotangent bundle from that on the tangent bundle (2).

Lemma 10. The principal symbol of the pseudodifferential operator $M_f$ is
\[ \sigma^{M_f}(x,\xi) = M_f(x) = \mathrm{diag}_4(f(x)), \tag{55} \]
where $\mathrm{diag}_r(g)$ denotes the $r\times r$ diagonal matrix with $g$ on the diagonal. The principal symbol of the pseudodifferential operator $L^\theta_f$ is
\[ \sigma^{L^\theta_f}(x,\xi) = \sum_{r=(r_3,r_4)} M_{f_r}(x)\,P^\theta(x)\,e^{i\theta(r_3\xi_4 - r_4\xi_3)}, \tag{56} \]
where the matrix-valued function $P^\theta(x) = P_{c_3}\circ P_{c_4}(x)$ is defined by the composition of parallel propagators along integral curves of $\partial_\phi$ and $\partial_\psi$.

Proof. Applying $M_f$, where $f = \sum_r f_r$, to the inverse Fourier transformation of a spinor $\psi$,
\[ M_f\,\frac{1}{(2\pi)^4}\int_{\mathbb{R}^4} e^{ix\cdot\xi}\,\hat\psi(\xi)\,d\xi = \frac{1}{(2\pi)^4}\int_{\mathbb{R}^4}\mathrm{diag}_4(f(x))\,e^{ix\cdot\xi}\,\hat\psi(\xi)\,d\xi, \]
we see that $M_f$ is an order zero classical pseudodifferential operator with principal symbol (55). From Remark 5, the pointwise evaluation of the operator $L^\theta_f$ is
\[ L^\theta_f\psi(x) = \sum_r M_{f_r}\,(P_{c_3}\circ P_{c_4})\big(\psi(x + (0,0,-\theta r_4,\theta r_3))\big), \tag{57} \]
where $c_4$ is the integral curve of the Killing field $\partial_\psi$ starting at $(x_1,x_2,x_3-\theta r_4,x_4+\theta r_3)$ and ending at $(x_1,x_2,x_3-\theta r_4,x_4)$, and $P_{c_4}$ is the parallel propagator with respect to the spin connection along $c_4$. It is evaluated at the point $(x_1,x_2,x_3-\theta r_4,x_4)$ as a four by four matrix. Similarly, $c_3$ is the integral curve of the Killing field $\partial_\phi$ starting at $(x_1,x_2,x_3-\theta r_4,x_4)$ and ending at $(x_1,x_2,x_3,x_4)$, and $P_{c_3}$ is the parallel propagator with respect to the spin connection along $c_3$ as defined by (24). In (57), their composition is evaluated at the point $(x_1,x_2,x_3,x_4)$ as a four by four matrix. Applying $L^\theta_f$ to the inverse Fourier transformation of $\psi$,
\[ L^\theta_f\psi(x) = \frac{1}{(2\pi)^4}\sum_r\int_{\mathbb{R}^4} M_{f_r}\,P_{c_3}P_{c_4}\,e^{i(x+(0,0,-\theta r_4,\theta r_3))\cdot\xi}\,\hat\psi(\xi)\,d\xi, \]
one obtains the symbol of $L^\theta_f$. With respect to the $\xi$ variable, the complete symbol is bounded by a constant and hence is of degree 0, and it can be chosen to be its principal symbol, which takes the form of (56).

Proposition 4. The spectral triple $(C_0^\infty(EH)_\theta, \mathcal{H}, D)$ is regular.

Proof. We write $L^\theta_f$ as $f$ for notational simplicity here. As indicated in the proof of Proposition 20 in [2], $f, [D,f]\in\mathrm{dom}^\infty\delta$ for any $f\in C_0^\infty(EH)_\theta$ if and only if $f, [D,f]\in\bigcap_{k,l\ge0}\mathrm{dom}\,L^kR^l$, where
\[ L(f) := (1+D^2)^{-1/2}[D^2, f], \qquad R(f) := [D^2, f](1+D^2)^{-1/2}, \]
for the reason that $|D| - (1+D^2)^{1/2}$ is bounded. The rest of the proof is a direct generalization of the standard method in the unital case, see for instance [22]. Denote $\mathrm{ad}(D^2)^m(\cdot) = [D^2,\dots,[D^2,\cdot]\dots]$, with $m$ brackets, so that
\[ L^k(f) = (1+D^2)^{-k/2}\,\mathrm{ad}(D^2)^k(f), \qquad R^l(f) = \mathrm{ad}(D^2)^l(f)\,(1+D^2)^{-l/2}, \]
where $k, l\in\mathbb{N}$. Their composition is $L^kR^l(f) = (1+D^2)^{-k/2}\,\mathrm{ad}(D^2)^{k+l}(f)\,(1+D^2)^{-l/2}$. The operator $\mathrm{ad}(D^2)(f) = [D^2, f]$ is of order at most 1, since the commutator of the principal symbols (54) and (56) vanishes. Similarly, the operator $\mathrm{ad}(D^2)^{k+l}(f)$ is of order at most $k+l$. This implies that the operator $L^kR^l(f)$ is of order at most zero and hence a bounded pseudodifferential operator on $\mathcal{H}$. This holds for any $k$ and $l$ in $\mathbb{N}$. Hence $f\in\bigcap_{k,l\ge0}\mathrm{dom}\,L^kR^l$ for any $f\in C_0^\infty(EH)_\theta$. Since $[D, M_f]$ is a bounded operator of degree 0 and $V^\theta_r$ is of degree 0, as seen from (56), $[D, L^\theta_f]$ is also a bounded operator of degree 0. The above proof holds if $f$ is replaced by $[D, L^\theta_f]$. Thus $[D, L^\theta_f]\in\bigcap_{k,l\ge0}\mathrm{dom}\,L^kR^l$ for any $f\in C_0^\infty(EH)_\theta$. Since $L^kR^l(T)\in\mathrm{dom}\,L^0R^0 = \mathcal{B}(\mathcal{H})$ for any $k, l$, where $T\in\mathcal{B}(\mathcal{H})$, is equivalent to $T\in\bigcap_{k,l\ge0}\mathrm{dom}\,L^kR^l$, we obtain $\Omega_D(C_0^\infty(EH)_\theta)\subset\mathrm{dom}^\infty\delta$. Hence the spectral triple is regular.

4.3. Measurability in the nonunital case. The following is the measurability criterion for operators from a local nonunital spectral triple [5].

Theorem 2. Let $(\mathcal{A}, \mathcal{H}, D)$ be a regular, local $(p,\infty)$-summable spectral triple with $p\ge1$. Suppose that $T\in\mathcal{B}(\mathcal{H})$ is such that $\psi T = T\psi = T$ for some $\psi\ge0$ in $\mathcal{A}_c$. If the limit
\[ \lim_{s\to\frac{p}{2}^+}\Big(s - \frac{p}{2}\Big)\,\mathrm{Trace}\big(T(1+D^2)^{-s}\big) \tag{58} \]
exists, then the operator $T(1+D^2)^{-p/2}$ is measurable and its Dixmier trace equals the limit up to a factor of $2/p$,
\[ \mathrm{Tr}^+\big(T(1+D^2)^{-p/2}\big) = \frac{2}{p}\,\lim_{s\to\frac{p}{2}^+}\Big(s - \frac{p}{2}\Big)\,\mathrm{Trace}\big(T(1+D^2)^{-s}\big). \tag{59} \]
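For orientation, the criterion of Theorem 2 can be specialized to the four-dimensional case relevant for the Eguchi-Hanson space. The display below simply substitutes $p = 4$ into (58) and (59); the symbol $c_T$ for the value of the limit is introduced here only for illustration and does not appear in the text.

```latex
\begin{align*}
c_T &:= \lim_{s\to 2^+} (s-2)\,\mathrm{Trace}\big(T(1+D^2)^{-s}\big)
      && \text{((58) with } p = 4\text{)}, \\
\mathrm{Tr}^+\big(T(1+D^2)^{-2}\big) &= \tfrac{2}{4}\,c_T = \tfrac{1}{2}\,c_T
      && \text{((59) with } p = 4\text{)}.
\end{align*}
```

So in dimension four the Dixmier trace of $T(1+D^2)^{-2}$ is one half of the residue-type limit at $s = 2$.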
As implied by [6], the operators $L^\theta_f(1+D^2)^{-2}$, for $f\in C_c^\infty(EH)_\theta$, from the spectral triple $(C_c^\infty(EH)_\theta, \mathcal{H}, D)$ satisfy the measurability criterion (58), and hence the Dixmier trace can be uniquely defined. We include these contents briefly for coherence.

Lemma 11. The limit
\[ \lim_{s\to2^+}(s-2)\,\mathrm{Trace}\big(L^\theta_f(1+D^2)^{-s}\big), \qquad \forall f\in C_c^\infty(EH)_\theta, \tag{60} \]
exists and the operator $L^\theta_f(1+D^2)^{-2}$ is measurable.

Proof. Since the spectral triple satisfies the local $(4,\infty)$-summability condition, (53) implies that $L^\theta_f(1+D^2)^{-s}$ for $s > 2$ is of trace class and so is $M_f(1+D^2)^{-s}$. Since both of them are of trace class, their traces agree by Corollary 3.10 of [6]. Thus it suffices to show that the limit
\[ \lim_{s\to2^+}(s-2)\,\mathrm{Trace}\big(M_f(1+D^2)^{-s}\big), \qquad \forall f\in C_c^\infty(EH), \]
exists. We may compute the trace of the operator by evaluating the corresponding kernels of operators. The kernel of $M_f$ is given by
\[ K_{M_f}(x,x') = \sum_r M_{f_r}\,\delta^g_x(x'), \]
where $\delta^g_x(x')$ is defined by requiring $\psi(x) = \int_{EH}\delta^g_x(x')\,\psi(x')\,dVol(x')$ for all $\psi\in L^2(S)$. For $s > 2$,
\begin{align*}
\mathrm{Trace}\big(M_f(1+D^2)^{-s}\big)
 &= \iint K_{M_f}(x,x')\,K_{(1+D^2)^{-s}}(x',x)\,dVol(x')\,dVol(x) \\
 &= \iint \mathrm{tr}\Big(\sum_{r\in\mathbb{Z}^2}\mathrm{diag}_4(f_r(x))\,\delta^g_x(x')\Big)\,K_{(1+D^2)^{-s}}(x',x)\,dVol(x')\,dVol(x) \\
 &= 4\int f(x)\,K_{(1+D^2)^{-s}}(x,x)\,dVol(x),
\end{align*}
where $\mathrm{tr}$ denotes the trace of a matrix. Applying the method of heat kernel expansion to the Laplace transform of the kernel, as in the proof of Theorem 6.1 of [6], we obtain
\begin{align*}
\lim_{s\to2^+}(s-2)\,\mathrm{Trace}\big(M_f(1+D^2)^{-s}\big)
 &= \lim_{s\to2^+}\frac{(s-2)\,\Gamma(s-2)}{\Gamma(s)}\,\frac{4}{(2\pi)^2}\int f(x)\,dVol(x) \\
 &= \frac{4}{(2\pi)^2}\int f(x)\,dVol(x) < \infty,
\end{align*}
and this equals $\lim_{s\to2^+}(s-2)\,\mathrm{Trace}\big(L^\theta_f(1+D^2)^{-s}\big)$. Since $f$ is of compact support, we can always find a function $\phi$ of value one on the compact support of $f$ and decaying to zero only with respect to the $r$ variable, so that $L^\theta_\phi = M_\phi$ and hence $L^\theta_f M_\phi = M_\phi L^\theta_f = L^\theta_f$ holds. By Theorem 2, the operator $L^\theta_f(1+D^2)^{-2}$ is measurable.
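The Gamma-function factor appearing in the limit in the proof of Lemma 11 can be evaluated explicitly. The short computation below uses only the functional equation $z\,\Gamma(z) = \Gamma(z+1)$ and is included as a worked step, not as part of the original argument.

```latex
\begin{align*}
\frac{(s-2)\,\Gamma(s-2)}{\Gamma(s)}
 &= \frac{\Gamma(s-1)}{\Gamma(s)}
    && \text{since } (s-2)\,\Gamma(s-2) = \Gamma(s-1), \\
 &= \frac{\Gamma(s-1)}{(s-1)\,\Gamma(s-1)}
    && \text{since } \Gamma(s) = (s-1)\,\Gamma(s-1), \\
 &= \frac{1}{s-1} \;\longrightarrow\; 1
    && \text{as } s \to 2^+ .
\end{align*}
```

The simple pole of $\Gamma(s-2)$ at $s = 2$ is therefore exactly cancelled by the factor $(s-2)$, which is why the limit is finite.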
Isospectral Deformations of Eguchi-Hanson Spaces as Nonunital Spectral Triples
Equation (59) further implies that the Dixmier trace is
\[ \operatorname{Tr}^+\bigl(L^\theta_f (1+D^2)^{-2}\bigr) = \frac{2}{(2\pi)^2} \int_{EH} f(x)\, dVol(x). \tag{61} \]
In the reduced commutative case the operator $M_f (1+D^2)^{-2}$ is measurable, and $\operatorname{Tr}^+(M_f (1+D^2)^{-2})$ equals the right-hand side of (61). The Connes trace theorem for the unital case (Theorem 7.18 [22]) implies that for a spectral triple $(\mathcal{A}, \mathcal{H}, D)$,
\[ \operatorname{Tr}^+\bigl(a (1+D^2)^{-p/2}\bigr) = \frac{1}{p (2\pi)^p} \operatorname{Wres}\bigl(a (1+D^2)^{-p/2}\bigr), \tag{62} \]
where $D$ is the Dirac operator of some $p$-dimensional spin manifold, $a (1+D^2)^{-p/2}$ is considered as an elliptic pseudodifferential operator on the complex spinor bundle $\mathcal{S}$ and Wres is the Wodzicki residue. Without a full understanding of (62) in the noncommutative nonunital case, a Wodzicki residue computation for $M_f (1+D^2)^{-2}$ with $f \in C_c^\infty(EH)$ shows
\[ \operatorname{Wres}\bigl(M_f (1+D^2)^{-2}\bigr) = 8 (2\pi)^2 \int_{EH} f(x)\, dVol(x), \quad f \in C_c^\infty(EH). \tag{63} \]
Comparing with (61), we see that (62) does hold when taking $a = f$ and $p = 4$. This also serves as an example of Proposition 15 [5], where a geodesically complete manifold is considered.

5. Geometric Conditions

In this section, we see how the spectral triples of the isospectral deformations of the EH-spaces fit into the proposed geometric conditions to construct noncompact noncommutative spin manifolds [3,9]. For nonunital spectral triples $(\mathcal{A}, \mathcal{H}, D)$ as in Definition 4, the geometric conditions are as follows.

(1) Metric dimension. There is a unique non-negative integer $p$, the metric dimension, for which $a (1+D^2)^{-1/2}$ belongs to the generalized Schatten ideal $\mathcal{L}^{p,\infty}(\mathcal{H})$ for $a \in \mathcal{A}$. Moreover, $\operatorname{Tr}^+(a (1+D^2)^{-p/2})$ is defined and not identically zero. This $p$ is even if and only if the spectral triple is even.

(2) Regularity. Bounded operators $a$ and $[D, a]$, for $a \in \mathcal{A}$, lie in the smooth domain of the derivation $\delta = [|D|, \cdot\,]$.

(3) Finiteness. The algebra $\mathcal{A}$ and its preferred unitization $\mathcal{A}_b$ are pre-$C^*$-algebras. There exists an ideal $\mathcal{A}_2$ of $\mathcal{A}_b$, which is also a pre-$C^*$-algebra with the same $C^*$-completion as $\mathcal{A}$, such that the subspace of smooth vectors in $\mathcal{H}$,
\[ \mathcal{H}^\infty := \bigcap_{m \in \mathbb{N}} \operatorname{dom}(D^m), \]
is an $\mathcal{A}_b$ finitely generated projective $\mathcal{A}_2$-module.

(4) Reality. There is an antiunitary operator $J$ on $\mathcal{H}$, such that $[a, J b^* J^{-1}] = 0$ for $a, b \in \mathcal{A}_b$. Thus $b \mapsto J b^* J^{-1}$ is a commuting representation on $\mathcal{H}$ of the opposite algebra $\mathcal{A}_b^\circ$. Moreover, for the metric dimension $p = 4$,
\[ J^2 = -1, \qquad J D = D J, \qquad J \chi = \chi J. \]
For other dimensions, we refer to the aforementioned references.
(5) First order. The bounded operator $[D, a]$ commutes with the opposite algebra representation: $[[D, a], J b^* J^{-1}] = 0$ for all $a, b \in \mathcal{A}_b$.

(6) Orientation. There is a Hochschild $p$-cycle $c$ on $\mathcal{A}_b$, with values in $\mathcal{A}_b \otimes \mathcal{A}_b^\circ$. The $p$-cycle is a finite sum of terms like $(a \otimes b^\circ) \otimes a_1 \otimes \cdots \otimes a_p$, and its natural representation $\pi_D(c)$ on $\mathcal{H}$ is defined by
\[ \pi_D\bigl((a_0 \otimes b_0^\circ) \otimes a_1 \otimes \cdots \otimes a_p\bigr) := a_0\, J b_0^* J^{-1}\, [D, a_1] \cdots [D, a_p]. \]
The volume form $\pi_D(c)$ solves the equation $\pi_D(c) = \chi$ in the even case and $\pi_D(c) = 1$ in the odd case.

5.1. Metric dimensions. One might show $p = 4$ for the triples $(C_0^\infty(EH)_\theta, \mathcal{H}, D)$ by considering the measurability of the operator $L^\theta_f (1+D^2)^{-2}$ for $f \in C_0^\infty(EH)_\theta$. However, the algebra $C_0^\infty(EH)_\theta$ is not integrable, and integrability is necessary for the computation of the Wodzicki residue [4] of the operator $L^\theta_f (1+D^2)^{-2}$. Thus $L^\theta_f (1+D^2)^{-2}$ may not be measurable. Nonetheless, Lemma 11 implies that the operators $L^\theta_f (1+D^2)^{-2}$ for $f \in C_c^\infty(EH)_\theta$ are measurable. The Dixmier trace is evaluated as
\[ \operatorname{Tr}^+\bigl(L^\theta_f (1+D^2)^{-2}\bigr) = \frac{2}{(2\pi)^2} \int f\, dVol, \]
which is finite and nonzero. We do not know whether this remains true for some general integrable algebras, for instance $C_2^\infty(EH)_\theta$, lying between $C_c^\infty(EH)_\theta$ and $C_0^\infty(EH)_\theta$.

5.2. Finiteness. By the construction of the ideal $C_2^\infty(EH)$ in Sect. 3.2, we see that the $C_b^\infty(EH)$ projective $C_2^\infty(EH)$-module $C_2^\infty(EH)^8 p$, with $p$ as in (43), is the smooth domain of the Dirac operator in $\mathcal{H}$. In the deformed case, we recall that $C_2^\infty(EH)_\theta^8\, p$ is a $C_b^\infty(EH)_\theta$ projective $C_2^\infty(EH)_\theta$-module. By matching generators, we have the isomorphism between the finitely generated projective modules, $C_2^\infty(EH)_\theta^8\, p \cong C_2^\infty(EH)^8 p$. Therefore, $\bigcap_{m \in \mathbb{N}} \operatorname{dom}(D^m) \cong C_2^\infty(EH)_\theta^8\, p$. From Sect. 3.4, the Fréchet algebra $C_2^\infty(EH)_\theta$ is a pre-$C^*$-algebra with the same $C^*$-completion $C_0(EH)_\theta$ as that of the algebra $C_0^\infty(EH)_\theta$. Hence the finiteness condition is satisfied.
As an application of a general construction for smooth projective modules in [2], we may define a $\mathbb{C}$-valued inner product on the projective module. Since the Hermitian form on the projective module $C_c^\infty(EH)_\theta^8\, p$ is $C_c^\infty(EH)_\theta$-valued, composing with the Dixmier trace, one may define an inner product on $C_c^\infty(EH)_\theta^8\, p$ by
\[ \tau(\xi, \eta) := \operatorname{Tr}^+\bigl(L^\theta_{(\xi|\eta)} (1+D^2)^{-2}\bigr) = \frac{2}{(2\pi)^2} \int (\xi|\eta)\, dVol, \]
where the equality is by (61). Here the image $(\xi|\eta) = \sum_k \xi_k^* \times_\theta \eta_k \in C_c^\infty(EH)_\theta$ is considered as a function in $C_c^\infty(EH)$.

One can further take the Hilbert space completion $\overline{C_c^\infty(EH)_\theta^8\, p}^{\,\tau}$ with respect to the inner product $\tau$. When restricted to the commutative case, the inner product is simply the $L^2$-inner product on the spinor bundle, and the Hilbert space $\overline{C_c^\infty(EH)^8 p}^{\,\tau}$ is the Hilbert space $\mathcal{H}$ appearing in the spectral triple.
5.3. Regularity. The regularity condition is implied by Proposition 4.
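5.4. Reality. The proof of the reality condition is based on the lecture notes [27]. With respect to the decomposition of the spinor bundle $\mathcal{S} = \mathcal{S}^+ \oplus \mathcal{S}^-$ as in Sect. 2.3, we have the corresponding Hilbert space completions under the inner product coming from the $L^2$-norms, and their sum is the Hilbert space completion of $\mathcal{S}$, $\mathcal{H} = \mathcal{H}^+ \oplus \mathcal{H}^-$. Any element $\psi \in \mathcal{H}$ can thus be decomposed as $\psi = (\psi^+, \psi^-)^t$. The operator $J$ defined on the spinor bundle (16) can be extended to the Hilbert space as an antiunitary operator $J : \mathcal{H} \to \mathcal{H}$ by
\[ J \begin{pmatrix} \psi^+ \\ \psi^- \end{pmatrix} := \begin{pmatrix} -\overline{\psi^-} \\ \overline{\psi^+} \end{pmatrix}, \]
satisfying $J^2 = -1$. We define the representation of the opposite algebra $\mathcal{A}_b^\circ$ of $\mathcal{A}_b = C_b^\infty(EH)_\theta$ on $\mathcal{H}$, $R^\theta_\bullet : \mathcal{A}_b^\circ \to B(\mathcal{H})$, by $R^\theta_h := J L^\theta_{h^*} J^{-1}$. Specifically, for $h = \sum_s h_s$, the representation is
\[ R^\theta_h = \sum_s J M_{h_s^*} V^\theta_{-s} J^{-1} = \sum_s M_{h_s} V^\theta_{-s}. \]
The commutativity of the operators $L^\theta_f$ and $R^\theta_h$, where $f = \sum_r f_r$, is seen as follows,
\begin{align*}
[L^\theta_f, R^\theta_h]
&= \sum_{r,s} \bigl( f_r V^\theta_r\, h_s V^\theta_{-s} - h_s V^\theta_{-s}\, f_r V^\theta_r \bigr) \\
&= \sum_{r,s} \bigl( f_r h_s\, \sigma(r,s)\, V^\theta_r V^\theta_{-s} - h_s f_r\, \sigma(-s,r)\, V^\theta_{-s} V^\theta_r \bigr) \\
&= \sum_{r,s} [f_r, h_s]\, \sigma(r,s)\, V^\theta_{r-s} = 0, \tag{64}
\end{align*}
where the identities $\sigma(r,s) = \sigma(-s,r)$ and $V^\theta_r V^\theta_{-s} = V^\theta_{-s} V^\theta_r = V^\theta_{r-s}$ are applied. As in the commutative case, $D J = J D$ and $J \chi = \chi J$, where $\chi$ is the chirality operator (15).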
5.5. First order. The proof of the first order condition is again from [27]. For any $f = \sum_r f_r$ and $h = \sum_s h_s$ in $C_b^\infty(EH)_\theta$, the first order property $[[D, f_r], h_s] = 0$ in the commutative case implies that
\[ [[D, L^\theta_f], R^\theta_h] = \sum_{r,s} \bigl[ [D, f_r]\, V^\theta_r,\; h_s V^\theta_{-s} \bigr] = \sum_{r,s} [[D, f_r], h_s]\, \sigma(r,s)\, V^\theta_{r-s} = 0. \]
5.6. Orientation. In Riemannian geometry, the volume form determines the orientation of a manifold. Translated to the spectral triple language, the volume form is replaced by a Hochschild cycle c which can be represented on H such that π D (c) = χ in the even case. For a detailed discussion we refer to [22].
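We may obtain a Hochschild 4-cycle of the spectral triple from the classical volume form of the Eguchi-Hanson space. We will only give the construction on the coordinate chart $U_N$; that for the other chart $U_S$ is similar, and the global construction can be obtained by a partition of unity. We will consider the commutative case first and then the deformed case.

Define a new set of coordinates by $u_1 = x_1$, $u_2 = x_2$, $u_3 = e^{i x_3}$, $u_4 = e^{i x_4}$, so that the transition of differential forms $dx^i = v^i_j\, du^j$ is given by the diagonal matrix $V = (v^i_j) := \mathrm{diag}(1, 1, -i/u_3, -i/u_4)$. Composing with $\vartheta^\alpha = h^\alpha_i\, dx^i$, where $h^\alpha_i$ are components of the matrix $H$ in (5), the transition of differential forms $\vartheta^\alpha = k^\alpha_i\, du^i$ is given by the matrix $K = (k^\alpha_i) := H V$. In components,
\[ k^\alpha_1 = h^\alpha_1, \quad k^\alpha_2 = h^\alpha_2, \quad k^\alpha_3 = h^\alpha_3\, \frac{-i}{u_3}, \quad k^\alpha_4 = h^\alpha_4\, \frac{-i}{u_4}, \quad \alpha = 1, \dots, 4. \tag{65} \]
Similarly, the transition $du^j = \tilde v^j_i\, dx^i$ is given by the inverse matrix $V^{-1} = (\tilde v^j_i)$ of $V$. Composing with $dx^j = \tilde h^j_\beta\, \vartheta^\beta$, where $\tilde h^j_\beta$ are elements of the inverse matrix $H^{-1}$ in (6), we obtain $du^i = \tilde k^i_\beta\, \vartheta^\beta$ with $\tilde k^i_\beta$ as the elements of the inverse matrix $K^{-1} = V^{-1} H^{-1}$. In components,
\[ \tilde k^1_\beta = \tilde h^1_\beta, \quad \tilde k^2_\beta = \tilde h^2_\beta, \quad \tilde k^3_\beta = i u_3\, \tilde h^3_\beta, \quad \tilde k^4_\beta = i u_4\, \tilde h^4_\beta, \quad \beta = 1, \dots, 4. \]
To avoid ambiguity, if the $u$-coordinates and $x$-coordinates appear in the same formula, we will distinguish them by marking the indices of the $u$-coordinates. By tensor transformations, we may obtain the Dirac operator satisfying $D(s) = -i \gamma^j \nabla^S_j s$ in the coordinates $\{u_i\}$ from (20) in the coordinates $\{x_i\}$ as
\begin{align*}
D ={}& -i\, \tilde h^1_\eta \gamma^\eta \Bigl( \partial_1 - \frac14 \Gamma^\beta_{1\alpha} \gamma^\alpha \gamma_\beta \Bigr) - i\, \tilde h^2_\eta \gamma^\eta \Bigl( \partial_2 - \frac14 \Gamma^\beta_{2\alpha} \gamma^\alpha \gamma_\beta \Bigr) \\
& + u_3\, \tilde h^3_\eta \gamma^\eta \Bigl( \partial_3 + \frac{i}{4 u_3} \Gamma^\beta_{3\alpha} \gamma^\alpha \gamma_\beta \Bigr) + u_4\, \tilde h^4_\eta \gamma^\eta \Bigl( \partial_4 + \frac{i}{4 u_4} \Gamma^\beta_{4\alpha} \gamma^\alpha \gamma_\beta \Bigr),
\end{align*}
where the $\Gamma^\beta_{i\alpha}$ are from (18) and the $\gamma_\alpha = \gamma^\alpha$ are from (11). The volume form of the Eguchi-Hanson space can be represented in the orthonormal basis on $U_N$ as
\begin{align*}
\vartheta^1 \wedge \vartheta^2 \wedge \vartheta^3 \wedge \vartheta^4
&= k^1_{i_1} du^{i_1} \wedge k^2_{i_2} du^{i_2} \wedge k^3_{i_3} du^{i_3} \wedge k^4_{i_4} du^{i_4} \\
&= k^4_{i_4} k^3_{i_3} k^2_{i_2} k^1_{i_1}\, du^{i_1} \wedge du^{i_2} \wedge du^{i_3} \wedge du^{i_4}. \tag{66}
\end{align*}
We may define a Hochschild 4-cycle $c_0$ in $C_4(\mathcal{A}_b, \mathcal{A}_b \otimes \mathcal{A}_b^\circ)$, with $\mathcal{A}_b = C_b^\infty(EH)$ and $\mathcal{A}_b^\circ$ as the opposite algebra of $\mathcal{A}_b$, by
\begin{align*}
c_0 := \frac{1}{4!} \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, & (k^{\sigma(4)}_{i_{\sigma(4)}} \otimes 1^\circ)(k^{\sigma(3)}_{i_{\sigma(3)}} \otimes 1^\circ)(k^{\sigma(2)}_{i_{\sigma(2)}} \otimes 1^\circ)(k^{\sigma(1)}_{i_{\sigma(1)}} \otimes 1^\circ) \\
& \otimes u^{i_{\sigma(1)}} \otimes u^{i_{\sigma(2)}} \otimes u^{i_{\sigma(3)}} \otimes u^{i_{\sigma(4)}}, \tag{67}
\end{align*}
where $\sigma$ is an element of the permutation group $S_4$ and $(-1)^{|\sigma|}$ indicates the sign of the permutation. On the $\mathcal{A}_b$-bimodule $\mathcal{A}_b \otimes \mathcal{A}_b^\circ$, $\mathcal{A}_b$ acts as $a' (a \otimes b^\circ) a'' := a' a a'' \otimes b^\circ$, for $a \otimes b^\circ \in \mathcal{A}_b \otimes \mathcal{A}_b^\circ$ and $a', a'' \in \mathcal{A}_b$.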
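Lemma 12. The Hochschild 4-chain (67) defines a Hochschild cycle. That is, $b(c_0) = 0$, where $b$ is the boundary operator of a Hochschild chain.

Proof. Recall that the Hochschild boundary operator $b$ acts on a simple $n$-chain $a = (a_0 \otimes b_0^\circ) \otimes a_1 \otimes \cdots \otimes a_n$ in $C_n(\mathcal{A}_b, \mathcal{A}_b \otimes \mathcal{A}_b^\circ)$ by
\begin{align*}
b(a) ={}& (a_0 \otimes b_0^\circ) a_1 \otimes a_2 \otimes \cdots \otimes a_n \\
& + \sum_{j=1}^{n-1} (-1)^j (a_0 \otimes b_0^\circ) \otimes a_1 \otimes \cdots \otimes a_j a_{j+1} \otimes \cdots \otimes a_n \\
& + (-1)^n a_n (a_0 \otimes b_0^\circ) \otimes a_1 \otimes \cdots \otimes a_{n-1}. \tag{68}
\end{align*}
Elements of $b(c_0)$ are of three types. The first type corresponds to the second line in (68),
\begin{align*}
(-1)^{|\sigma|} (-1)^j\, & (k^{\sigma(4)}_{i_{\sigma(4)}} \otimes 1^\circ)(k^{\sigma(3)}_{i_{\sigma(3)}} \otimes 1^\circ)(k^{\sigma(2)}_{i_{\sigma(2)}} \otimes 1^\circ)(k^{\sigma(1)}_{i_{\sigma(1)}} \otimes 1^\circ) \\
& \otimes u^{i_{\sigma(1)}} \otimes \cdots \otimes u^{i_{\sigma(j)}} u^{i_{\sigma(j+1)}} \otimes \cdots \otimes u^{i_{\sigma(4)}}.
\end{align*}
In the summation over all $\sigma \in S_4$, each such term is cancelled by a term from another permutation $\sigma'$, obtained by composing $\sigma$ with the transposition of $\sigma(j)$ and $\sigma(j+1)$, namely
\begin{align*}
(-1)^{|\sigma'|} (-1)^j\, & (k^{\sigma(4)}_{i_{\sigma(4)}} \otimes 1^\circ)(k^{\sigma(3)}_{i_{\sigma(3)}} \otimes 1^\circ)(k^{\sigma(2)}_{i_{\sigma(2)}} \otimes 1^\circ)(k^{\sigma(1)}_{i_{\sigma(1)}} \otimes 1^\circ) \\
& \otimes u^{i_{\sigma(1)}} \otimes \cdots \otimes u^{i_{\sigma(j+1)}} u^{i_{\sigma(j)}} \otimes \cdots \otimes u^{i_{\sigma(4)}}.
\end{align*}
Indeed, since $(-1)^{|\sigma'|} = -(-1)^{|\sigma|}$ and the elements in the first term from the bimodule are commuting, the summation of such a pair is
\begin{align*}
(-1)^{|\sigma|} (-1)^j\, & (k^{\sigma(4)}_{i_{\sigma(4)}} \otimes 1^\circ)(k^{\sigma(3)}_{i_{\sigma(3)}} \otimes 1^\circ)(k^{\sigma(2)}_{i_{\sigma(2)}} \otimes 1^\circ)(k^{\sigma(1)}_{i_{\sigma(1)}} \otimes 1^\circ) \\
& \otimes u^{i_{\sigma(1)}} \otimes \cdots \otimes \bigl( u^{i_{\sigma(j)}} u^{i_{\sigma(j+1)}} - u^{i_{\sigma(j+1)}} u^{i_{\sigma(j)}} \bigr) \otimes \cdots \otimes u^{i_{\sigma(4)}} = 0.
\end{align*}
It vanishes since $u^{i_{\sigma(j)}} u^{i_{\sigma(j+1)}} = u^{i_{\sigma(j+1)}} u^{i_{\sigma(j)}}$ as elements in $\mathcal{A}_b$.

The second type corresponds to the first line in (68). After the $\mathcal{A}_b$-bimodule action from the right, it is of the form
\[ k^{\sigma(4)}_{i_{\sigma(4)}} k^{\sigma(3)}_{i_{\sigma(3)}} k^{\sigma(2)}_{i_{\sigma(2)}} k^{\sigma(1)}_{i_{\sigma(1)}}\, u^{i_{\sigma(1)}} \otimes 1^\circ \otimes u^{i_{\sigma(2)}} \otimes u^{i_{\sigma(3)}} \otimes u^{i_{\sigma(4)}}. \]
The third type corresponds to the third line in (68). After the $\mathcal{A}_b$-bimodule action from the left, it is of the form
\[ u^{i_{\sigma(4)}}\, k^{\sigma(4)}_{i_{\sigma(4)}} k^{\sigma(3)}_{i_{\sigma(3)}} k^{\sigma(2)}_{i_{\sigma(2)}} k^{\sigma(1)}_{i_{\sigma(1)}} \otimes 1^\circ \otimes u^{i_{\sigma(1)}} \otimes u^{i_{\sigma(2)}} \otimes u^{i_{\sigma(3)}}. \]
By commutativity of $\mathcal{A}_b$, the terms of the second and the third type cancel in the summation over $\sigma$ exactly when the permutation $\sigma'$ differs from $\sigma$ by the cyclic shift from $(\sigma(1), \sigma(2), \sigma(3), \sigma(4))$ to $(\sigma(4), \sigma(1), \sigma(2), \sigma(3))$. Indeed, such $\sigma$ and $\sigma'$ are of opposite sign. Therefore, all three types cancel in the summation over $\sigma \in S_4$, and $b(c_0) = 0$. This shows that $c_0$ is a Hochschild 4-cycle.

We define the representation $\pi_D$ of the Hochschild cycle $c_0$ on the Hilbert space by
\[ \pi_D(a_0 \otimes b_0^\circ \otimes a_1 \otimes \cdots \otimes a_4) := M_{a_0} M_{b_0} [D, M_{a_1}] [D, M_{a_2}] [D, M_{a_3}] [D, M_{a_4}]. \]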
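The two cancellation arguments in the proof of Lemma 12 rest on sign facts about $S_4$: exchanging the entries $\sigma(j)$ and $\sigma(j+1)$ flips the sign of $\sigma$, and so does the cyclic shift $(\sigma(1), \sigma(2), \sigma(3), \sigma(4)) \mapsto (\sigma(4), \sigma(1), \sigma(2), \sigma(3))$. The following Python sketch checks both facts by brute force. The function names are illustrative and are not taken from the paper.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation in one-line notation, via its inversion count."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def swap_adjacent(perm, j):
    """Exchange the entries sigma(j) and sigma(j+1); j is 1-based."""
    p = list(perm)
    p[j - 1], p[j] = p[j], p[j - 1]
    return tuple(p)


def cyclic_shift(perm):
    """Map (s1, s2, s3, s4) to (s4, s1, s2, s3)."""
    return (perm[-1],) + tuple(perm[:-1])


S4 = list(permutations((1, 2, 3, 4)))

# Pairing used for terms of the first type.
swap_flips = all(
    sign(swap_adjacent(s, j)) == -sign(s) for s in S4 for j in (1, 2, 3)
)
# Pairing used for terms of the second and third type.
shift_flips = all(sign(cyclic_shift(s)) == -sign(s) for s in S4)
shift_is_bijection = sorted(cyclic_shift(s) for s in S4) == sorted(S4)

print(swap_flips, shift_flips, shift_is_bijection)  # prints: True True True
```

Since both maps are sign-reversing bijections of $S_4$, every term in the signed sum has a partner of opposite sign, which is what the proof uses.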
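Proposition 5. The operator $\pi_D(c_0) = \chi$.

Proof.
\begin{align*}
4!\, \pi_D(c_0)
&= \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, M_{k^{\sigma(4)}_{i_{\sigma(4)}}} M_{k^{\sigma(3)}_{i_{\sigma(3)}}} M_{k^{\sigma(2)}_{i_{\sigma(2)}}} M_{k^{\sigma(1)}_{i_{\sigma(1)}}}\, c(du^{i_{\sigma(1)}})\, c(du^{i_{\sigma(2)}})\, c(du^{i_{\sigma(3)}})\, c(du^{i_{\sigma(4)}}) \\
&= \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, M_{k^{\sigma(4)}_{i_{\sigma(4)}}} M_{k^{\sigma(3)}_{i_{\sigma(3)}}} M_{k^{\sigma(2)}_{i_{\sigma(2)}}} M_{k^{\sigma(1)}_{i_{\sigma(1)}}}\, \tilde k^{i_{\sigma(1)}}_{\alpha_1} \gamma^{\alpha_1}\, \tilde k^{i_{\sigma(2)}}_{\alpha_2} \gamma^{\alpha_2}\, \tilde k^{i_{\sigma(3)}}_{\alpha_3} \gamma^{\alpha_3}\, \tilde k^{i_{\sigma(4)}}_{\alpha_4} \gamma^{\alpha_4} \\
&= \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, \delta^{\sigma(1)}_{\alpha_1} \delta^{\sigma(2)}_{\alpha_2} \delta^{\sigma(3)}_{\alpha_3} \delta^{\sigma(4)}_{\alpha_4}\, \gamma^{\alpha_1} \gamma^{\alpha_2} \gamma^{\alpha_3} \gamma^{\alpha_4} \\
&= \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, \gamma^{\sigma(1)} \gamma^{\sigma(2)} \gamma^{\sigma(3)} \gamma^{\sigma(4)} = 4!\, \gamma^1 \gamma^2 \gamma^3 \gamma^4. \tag{69}
\end{align*}
Thus $\pi_D(c_0) = \chi$.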
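The last step of (69) uses only that the $\gamma^\alpha$ anticommute for distinct indices, so that the signed sum over $S_4$ collapses to $4!\, \gamma^1 \gamma^2 \gamma^3 \gamma^4$. A minimal Python sketch of this identity follows. It assumes a concrete Euclidean representation built from Pauli matrices, chosen here for illustration only; it need not coincide with the matrices of (11).

```python
from itertools import permutations


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def sign(perm):
    """Sign of a permutation in one-line notation."""
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1


# Pauli matrices and the 2x2 identity.
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]
id2 = [[1, 0], [0, 1]]

# Assumed representation with gamma^a gamma^b + gamma^b gamma^a = 2 delta^{ab}.
gamma = [kron(s1, s1), kron(s1, s2), kron(s1, s3), kron(s2, id2)]


def product(indices):
    """Ordered product gamma[i0] gamma[i1] ... of the listed matrices."""
    out = gamma[indices[0]]
    for idx in indices[1:]:
        out = matmul(out, gamma[idx])
    return out


# Signed sum over S_4 of gamma^{sigma(1)} ... gamma^{sigma(4)}.
total = [[0] * 4 for _ in range(4)]
for perm in permutations(range(4)):
    term = product(perm)
    sgn = sign(perm)
    for i in range(4):
        for j in range(4):
            total[i][j] += sgn * term[i][j]

chi = product((0, 1, 2, 3))  # gamma^1 gamma^2 gamma^3 gamma^4
expected = [[24 * entry for entry in row] for row in chi]
print(total == expected)  # prints: True
```

All entries are Gaussian integers, so the comparison is exact. Reordering each product into $\gamma^1 \gamma^2 \gamma^3 \gamma^4$ produces the sign of the permutation, which cancels the factor $(-1)^{|\sigma|}$ in each of the 24 terms.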
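Now we consider the noncommutative case. Let $\mathcal{A}_{b,\theta}$ be $C_b^\infty(EH)_\theta$ and $\mathcal{A}_{b,\theta}^\circ$ be the opposite algebra. On the $\mathcal{A}_{b,\theta}$-bimodule $\mathcal{A}_{b,\theta} \otimes \mathcal{A}_{b,\theta}^\circ$, $\mathcal{A}_{b,\theta}$ acts as $a' (a \otimes b^\circ) a'' := (a' \times_\theta a \times_\theta a'') \otimes b^\circ$, for $a \otimes b^\circ \in \mathcal{A}_{b,\theta} \otimes \mathcal{A}_{b,\theta}^\circ$ and $a', a'' \in \mathcal{A}_{b,\theta}$. The Hochschild 4-chain in $C_4(\mathcal{A}_{b,\theta}, \mathcal{A}_{b,\theta} \otimes \mathcal{A}_{b,\theta}^\circ)$ is defined by
\[ c := \frac{1}{4!} \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, K^{\sigma(4)}_{i_{\sigma(4)}} K^{\sigma(3)}_{i_{\sigma(3)}} K^{\sigma(2)}_{i_{\sigma(2)}} K^{\sigma(1)}_{i_{\sigma(1)}} \otimes u^{i_{\sigma(1)}} \otimes u^{i_{\sigma(2)}} \otimes u^{i_{\sigma(3)}} \otimes u^{i_{\sigma(4)}}, \tag{70} \]
where $K^j_i$ is the element of the bimodule $\mathcal{A}_{b,\theta} \otimes \mathcal{A}_{b,\theta}^\circ$ corresponding to $k^j_i$. They are chosen as
\begin{align*}
K^1_4 &:= \bigl( \Delta(u_1)^{-1/2} \otimes 1^\circ \bigr) \Bigl( \frac{-i}{u_4} \otimes 1^\circ \Bigr), &
K^2_1 &:= - \Bigl( \frac{u_1}{2} \otimes 1^\circ \Bigr) \mathrm{C}(u_3), \\
K^2_2 &:= \Bigl( \frac{u_1}{2} \otimes 1^\circ \Bigr) \mathrm{S}(u_3), &
K^3_1 &:= - \Bigl( \frac{u_1}{2} \sin u_2 \otimes 1^\circ \Bigr) \mathrm{S}(u_3), \\
K^3_2 &:= - \Bigl( \frac{u_1}{2} \sin u_2 \otimes 1^\circ \Bigr) \mathrm{C}(u_3), &
K^3_3 &:= \Bigl( \frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2 \otimes 1^\circ \Bigr) \Bigl( \frac{-i}{u_3} \otimes 1^\circ \Bigr), \\
K^4_3 &:= \Bigl( \frac{u_1}{2} \Delta(u_1)^{1/2} \otimes 1^\circ \Bigr) \Bigl( \frac{-i}{u_3} \otimes 1^\circ \Bigr),
\end{align*}
where
\begin{align*}
\Delta(u_1) &:= 1 - a^4 / u_1^2, \\
\mathrm{C}(u_3) &:= \frac12 \bigl( u_3^{1/2} \otimes (u_3^{1/2})^\circ + \bar u_3^{1/2} \otimes (\bar u_3^{1/2})^\circ \bigr), \\
\mathrm{S}(u_3) &:= \frac{1}{2i} \bigl( u_3^{1/2} \otimes (u_3^{1/2})^\circ - \bar u_3^{1/2} \otimes (\bar u_3^{1/2})^\circ \bigr).
\end{align*}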
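Remark 7. The choices of the $K^j_i$'s are based on the following observation. If $e^{i r \phi} \in \mathcal{A}_{b,\theta}$ is of spectral homogeneous degree $(-r, 0)$, then $e^{i \frac{r}{2} \phi} \otimes (e^{i \frac{r}{2} \phi})^\circ$, as an element in the $\mathcal{A}_{b,\theta}$-bimodule, satisfies
\[ e^{i s \psi} \bigl( e^{i \frac{r}{2} \phi} \otimes (e^{i \frac{r}{2} \phi})^\circ \bigr) = \bigl( e^{i \frac{r}{2} \phi} \otimes (e^{i \frac{r}{2} \phi})^\circ \bigr) e^{i s \psi}, \]
for any $e^{i s \psi}$ of spectral homogeneous degree $(0, -s)$ in the algebra $\mathcal{A}_{b,\theta}$. The same holds when $\phi$ and $\psi$ swap. In this way, all the $u_3$ appearing in the matrix $H$ of $K = H V$ can be "commutatized".

Lemma 13. The Hochschild 4-chain (70) is a Hochschild cycle in $Z_4(\mathcal{A}_{b,\theta}, \mathcal{A}_{b,\theta} \otimes \mathcal{A}_{b,\theta}^\circ)$. That is, $b(c) = 0$, where $b$ is the boundary operator of a Hochschild chain.

Proof. As in the commutative case, elements of $b(c)$ are of three types. The first type is
\begin{align*}
(-1)^{|\sigma|} (-1)^j\, & \bigl( K^{\sigma(4)}_{i_{\sigma(4)}} \times_\theta K^{\sigma(3)}_{i_{\sigma(3)}} \times_\theta K^{\sigma(2)}_{i_{\sigma(2)}} \times_\theta K^{\sigma(1)}_{i_{\sigma(1)}} \bigr) \\
& \otimes u^{i_{\sigma(1)}} \otimes \cdots \otimes u^{i_{\sigma(j)}} \times_\theta u^{i_{\sigma(j+1)}} \otimes \cdots \otimes u^{i_{\sigma(4)}}. \tag{71}
\end{align*}
Firstly, from Remark 7, we may observe that the noncommutative part of any $K^j_i$ has only contributions from terms like $\frac{-i}{u_i} \times_\theta \cdot$, for $i = 3, 4$. Secondly, any term containing the product $\frac{-i}{u_3} \times_\theta \frac{-i}{u_4}$ contains the product $u_4 \times_\theta u_3$, and their product is
\[ \Bigl( \frac{-i}{u_3} \times_\theta \frac{-i}{u_4} \Bigr) \bigl( u_4 \times_\theta u_3 \bigr) = e^{-i\theta}\, \frac{-i}{u_3} \frac{-i}{u_4}\; e^{i\theta}\, u_4 u_3 = -1. \]
This also holds when 3 and 4 swap. These observations imply that the noncommutativity factor coming from the first line of (71) always cancels with the noncommutativity factor coming from the second line. Therefore, it reduces to the commutative case. By the same matching of $\sigma$'s as in the proof of Lemma 12 for terms of the first type, the summation of all the terms of the first type is zero.

The second type is
\[ \bigl( K^{\sigma(4)}_{i_{\sigma(4)}} \times_\theta K^{\sigma(3)}_{i_{\sigma(3)}} \times_\theta K^{\sigma(2)}_{i_{\sigma(2)}} \times_\theta K^{\sigma(1)}_{i_{\sigma(1)}} \bigr) u^{i_{\sigma(1)}} \otimes u^{i_{\sigma(2)}} \otimes u^{i_{\sigma(3)}} \otimes u^{i_{\sigma(4)}}. \]
Notice that $K^{\sigma(1)}_{i_{\sigma(1)}}$ commutes with $u^{i_{\sigma(1)}}$. The third type is
\[ u^{i_{\sigma(4)}} \bigl( K^{\sigma(4)}_{i_{\sigma(4)}} \times_\theta K^{\sigma(3)}_{i_{\sigma(3)}} \times_\theta K^{\sigma(2)}_{i_{\sigma(2)}} \times_\theta K^{\sigma(1)}_{i_{\sigma(1)}} \bigr) \otimes u^{i_{\sigma(1)}} \otimes u^{i_{\sigma(2)}} \otimes u^{i_{\sigma(3)}}. \]
Notice that $u^{i_{\sigma(4)}}$ commutes with $K^{\sigma(4)}_{i_{\sigma(4)}}$. As in the commutative case, we may pair $\sigma$ and $\sigma'$ which are related by $\sigma'(1) = \sigma(4)$, $\sigma'(2) = \sigma(1)$, $\sigma'(3) = \sigma(2)$, $\sigma'(4) = \sigma(3)$, so that they are cancelled in the summation over $\sigma$. The three cases altogether give $b(c) = 0$, which completes the proof.

We represent the Hochschild cycle $c$ on the Hilbert space $\mathcal{H}$ by
\[ \pi_D(a_0 \otimes b_0^\circ \otimes a_1 \otimes \cdots \otimes a_4) := L^\theta_{a_0} R^\theta_{b_0} [D, L^\theta_{a_1}] [D, L^\theta_{a_2}] [D, L^\theta_{a_3}] [D, L^\theta_{a_4}] \tag{72} \]
for $a_0 \otimes b_0^\circ \otimes a_1 \otimes \cdots \otimes a_4 \in Z_4(\mathcal{A}_{b,\theta}, \mathcal{A}_{b,\theta} \otimes \mathcal{A}_{b,\theta}^\circ)$. We denote the coefficient $L^\theta_{a_0} R^\theta_{b_0}$ in (72) by $\tilde\pi_D(a_0 \otimes b_0^\circ)$. A straightforward fact follows.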
Lemma 14. For the bimodule elements $\mathrm{C}(u_3) = \frac12 \bigl( u_3^{1/2} \otimes (u_3^{1/2})^\circ + \bar u_3^{1/2} \otimes (\bar u_3^{1/2})^\circ \bigr)$ and $\mathrm{S}(u_3) = \frac{1}{2i} \bigl( u_3^{1/2} \otimes (u_3^{1/2})^\circ - \bar u_3^{1/2} \otimes (\bar u_3^{1/2})^\circ \bigr)$, we have $\tilde\pi_D(\mathrm{C}(u_3)) = M_{\cos\phi}$ and $\tilde\pi_D(\mathrm{S}(u_3)) = M_{\sin\phi}$.

Proposition 6. The operator $\pi_D(c) = \chi$.

Proof. By using the commutativity between the Dirac operator and $V^\theta_r$, we can write down the formula for the commutators:
\[ [D, L^\theta_{u_i}] = c(du^i), \qquad [D, L^\theta_{u_3}] = c(du^3)\, V^\theta_{(-1,0)}, \qquad [D, L^\theta_{u_4}] = c(du^4)\, V^\theta_{(0,-1)}, \]
where $i = 1, 2$. By Lemma 14, all the nonvanishing coefficients in the representation of the Hochschild cycle $c$ are
\begin{align*}
\tilde\pi_D(K^1_4) &= M_{\Delta(u_1)^{-1/2}}\, L^\theta_{-i/u_4}, &
\tilde\pi_D(K^2_1) &= - M_{\frac{u_1}{2}} M_{\cos\phi}, \\
\tilde\pi_D(K^2_2) &= M_{\frac{u_1}{2}} M_{\sin\phi}, &
\tilde\pi_D(K^3_1) &= - M_{\frac{u_1}{2} \sin u_2} M_{\sin\phi}, \\
\tilde\pi_D(K^3_2) &= - M_{\frac{u_1}{2} \sin u_2} M_{\cos\phi}, &
\tilde\pi_D(K^3_3) &= M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2}\, L^\theta_{-i/u_3}, \\
\tilde\pi_D(K^4_3) &= M_{\frac{u_1}{2} \Delta(u_1)^{1/2}}\, L^\theta_{-i/u_3}. \tag{73}
\end{align*}
The representation $\pi_D(c)$ is thus
\begin{align*}
\pi_D(c) = \frac{1}{4!} \sum_{\sigma \in S_4} (-1)^{|\sigma|}\, & \tilde\pi_D(K^{\sigma(4)}_{i_{\sigma(4)}})\, \tilde\pi_D(K^{\sigma(3)}_{i_{\sigma(3)}})\, \tilde\pi_D(K^{\sigma(2)}_{i_{\sigma(2)}})\, \tilde\pi_D(K^{\sigma(1)}_{i_{\sigma(1)}}) \\
& \cdot c(du^{i_{\sigma(1)}})\, U^\theta_{i_{\sigma(1)}}\, c(du^{i_{\sigma(2)}})\, U^\theta_{i_{\sigma(2)}}\, c(du^{i_{\sigma(3)}})\, U^\theta_{i_{\sigma(3)}}\, c(du^{i_{\sigma(4)}})\, U^\theta_{i_{\sigma(4)}}, \tag{74}
\end{align*}
where $U^\theta_{i_{\sigma(k)}}$ denotes the representation $V^\theta_{\deg(u^{i_{\sigma(k)}})}$ of the function $u^{i_{\sigma(k)}}$ by the formula (39), with deg standing for the spectral degree in $\mathbb{Z}^2$ of the function under consideration. Specifically, $U^\theta_1 = 1$, $U^\theta_2 = 1$, $U^\theta_3 = V^\theta_{(-1,0)}$ and $U^\theta_4 = V^\theta_{(0,-1)}$.

The goal is to show that any component in the summation (74) agrees with the component of the same indices in the summation (69). We demonstrate one term explicitly. Consider the component in (74) labeled by the indices $\sigma(4) = 1$, $\sigma(3) = 2$, $\sigma(2) = 3$, $\sigma(1) = 4$; $i_{\sigma(4)} = 4$, $i_{\sigma(3)} = 2$, $i_{\sigma(2)} = 3$, $i_{\sigma(1)} = 3$. That is,
\begin{align*}
& \tilde\pi_D(K^1_4)\, \tilde\pi_D(K^2_2)\, \tilde\pi_D(K^3_3)\, \tilde\pi_D(K^4_3)\, c(du^3) U^\theta_3\, c(du^3) U^\theta_3\, c(du^2) U^\theta_2\, c(du^4) U^\theta_4 \\
&= M_{\Delta(u_1)^{-1/2}} L^\theta_{-i/u_4}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2} L^\theta_{-i/u_3}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} L^\theta_{-i/u_3}\; U^\theta_3 U^\theta_3 U^\theta_2 U^\theta_4\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\Delta(u_1)^{-1/2}} L^\theta_{-i/u_4}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2} L^\theta_{-i/u_3}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} M_{-i/u_3} V^\theta_{(1,0)}\; V^\theta_{(-1,0)} V^\theta_{(-1,0)}\, 1\, V^\theta_{(0,-1)}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\Delta(u_1)^{-1/2}} L^\theta_{-i/u_4}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2} L^\theta_{-i/u_3}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} M_{-i/u_3}\; V^\theta_{(-1,0)} V^\theta_{(0,-1)}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\Delta(u_1)^{-1/2}} L^\theta_{-i/u_4}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} M_{-i/u_3} L^\theta_{-i/u_3}\; V^\theta_{(-1,0)} V^\theta_{(0,-1)}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\Delta(u_1)^{-1/2}} L^\theta_{-i/u_4}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} M_{-i/u_3} M_{-i/u_3} V^\theta_{(1,0)}\; V^\theta_{(-1,0)} V^\theta_{(0,-1)}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\Delta(u_1)^{-1/2}}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} M_{-i/u_3} M_{-i/u_3} L^\theta_{-i/u_4}\; V^\theta_{(0,-1)}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\Delta(u_1)^{-1/2}}\, M_{\frac{u_1}{2}} M_{\sin\phi}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2} \cos u_2}\, M_{\frac{u_1}{2} \Delta(u_1)^{1/2}} M_{-i/u_3} M_{-i/u_3} M_{-i/u_4}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\frac{i u_1^3}{8 u_3^2 u_4} \sin\phi\, \cos u_2\, \Delta(u_1)^{1/2}}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4).
\end{align*}
On the other hand, the corresponding component in the summation (69) is
\begin{align*}
& M_{k^1_4} M_{k^2_2} M_{k^3_3} M_{k^4_3}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{h^1_4 \frac{-i}{u_4}}\, M_{h^2_2}\, M_{h^3_3 \frac{-i}{u_3}}\, M_{h^4_3 \frac{-i}{u_3}}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\frac{1}{\Delta^{1/2}} \frac{-i}{u_4}}\, M_{\frac12 r \sin\phi}\, M_{\frac12 r \Delta^{1/2} \cos\theta\, \frac{-i}{u_3}}\, M_{\frac12 r \Delta^{1/2} \frac{-i}{u_3}}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\frac{1}{\Delta^{1/2}} \frac{-i}{u_4} \cdot \frac12 r \sin\phi \cdot \frac12 r \Delta^{1/2} \cos\theta\, \frac{-i}{u_3} \cdot \frac12 r \Delta^{1/2} \frac{-i}{u_3}}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\frac{i r^3}{8 u_3^2 u_4} \sin\phi\, \cos\theta\, \Delta^{1/2}}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4) \\
&= M_{\frac{i u_1^3}{8 u_3^2 u_4} \sin\phi\, \cos u_2\, \Delta(u_1)^{1/2}}\; c(du^3)\, c(du^3)\, c(du^2)\, c(du^4),
\end{align*}
where the $h^\alpha_i$'s are matrix elements in (5). We have shown that the two components agree. Other terms can be shown similarly. Therefore the two summations (74) and (69) are the same. Together with Proposition 5, the summation (74) equals $\chi$, and this completes the proof of the orientation condition.

6. Conclusion

We have obtained the nonunital spectral triples of the isospectral deformations of the Eguchi-Hanson spaces along torus isometric actions and studied analytical properties of the triple. We have also tested the proposed geometric conditions of a noncompact noncommutative geometry on this example. There are possible generalizations in the following directions. Firstly, we may further consider the Poincaré duality of nonunital spectral triples [28]. Secondly, we may take the conical singularity limit of EH-spaces and consider the spectral triple of the conifold. Thirdly, we may realize the spectral triple as a complex noncommutative geometry as defined in [29]. Finally, we may deform the EH-spaces, and possibly more general ALE-spaces, by using the hyper-Kähler quotient structures.

Acknowledgements. The author thanks Lucio Cirio and Giovanni Landi for their interest and comments, and Derek Harland for helpful discussions, without which the research would have taken much longer. The author wants to thank Adam Rennie for sharing his insights, encouragement and comments on the draft. Finally, the author thanks the referee for many corrections and helpful suggestions to improve the article.
References
1. Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
2. Rennie, A.: Smoothness and locality for nonunital spectral triples. K-Theory 28, 127–165 (2003)
3. Gayral, V., Gracia-Bondía, J.M., Iochum, B., Schücker, T., Várilly, J.C.: Moyal planes are spectral triples. Commun. Math. Phys. 246, 569–623 (2004)
4. Estrada, R., Gracia-Bondía, J.M., Várilly, J.C.: On summability of distributions and spectral geometry. Commun. Math. Phys. 191, 219–248 (1998)
5. Rennie, A.: Summability for nonunital spectral triples. K-Theory 31, 71–100 (2004)
6. Gayral, V., Iochum, B., Várilly, J.C.: Dixmier traces on noncompact isospectral deformations. J. Funct. Anal. 237, 507 (2006)
7. Rennie, A., Várilly, J.C.: Reconstruction of manifolds in noncommutative geometry, 2006, available at http://arXiv.org/abs/math/0610418v4[math.OA], 2006
8. Connes, A.: On the spectral characterization of manifolds, 2008, available at http://arXiv.org/abs/0810.2088v1[math.OA], 2008
9. Gracia-Bondía, J.M., Lizzi, F., Marmo, G., Vitale, P.: Infinitely many star products to play with. JHEP 0204, 026 (2002)
10. Gayral, V., Gracia-Bondía, J.M., Várilly, J.C.: Fourier analysis on the affine group, quantization and noncompact Connes geometries, 2007, available at http://arXiv.org/abs/0701.3511v2[hep.th], 2007
11. Eguchi, T., Hanson, A.J.: Asymptotically flat self-dual solutions to Euclidean gravity. Phys. Lett. 74B, 249–251 (1978)
12. Rieffel, M.A.: Deformation Quantization for Actions of R^d. Mem. Amer. Math. Soc. 506, Providence, RI: Amer. Math. Soc., 1993
13. Rieffel, M.A.: Non-commutative tori-a case study of non-commutative differentiable manifolds. Contemp. Math. 105, 191–211 (1990)
14. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. Commun. Math. Phys. 221, 141–159 (2001)
15. Connes, A., Dubois-Violette, M.: Noncommutative finite-dimensional manifolds. I. Spherical manifolds and related examples. Commun. Math. Phys. 230, 539–579 (2002)
16. Kronheimer, P.B.: The construction of ALE spaces as hyper-Kähler quotients. Differ. Geom. 28, 665–683 (1989)
17. Atiyah, M.F., Drinfeld, V.B., Hitchin, N.J., Manin, Y.I.: Construction of instantons. Phys. Lett. 63A(3), 185–187 (1978)
18. Kronheimer, P.B., Nakajima, H.: Yang-Mills instantons on ALE gravitational instantons. Math. Ann. 288, 263–307 (1990)
19. Nakajima, H.: Instantons on ALE spaces, quiver varieties, and Kac-Moody algebras. Duke Math. J. 76, 365–416 (1994)
20. Gibbons, G.W., Hawking, S.W.: Gravitational multi-instantons. Phys. Lett. 78, 430–432 (1978)
21. Lawson, H.B., Michelsohn, M.: Spin Geometry. Princeton, NJ: Princeton University Press, 1989
22. Gracia-Bondía, J.M., Várilly, J.C., Figueroa, H.: Elements of Noncommutative Geometry. Boston: Birkhäuser, 2001
23. Hebey, E.: Nonlinear Analysis on Manifolds: Sobolev Spaces and Inequalities. Providence, RI: American Mathematical Society Courant Institute of Mathematical Sciences, 1999
24. Swan, R.G.: Vector bundles and projective modules. Trans. Amer. Math. Soc. 105, 264–277 (1962)
25. Vaserstein, L.N.: Vector bundles and projective modules. Trans. Amer. Math. Soc. 294(2), 749–755 (1986)
26. Gilkey, P.B.: Invariant Theory, the Heat Equation, and the Atiyah-Singer Index Theorem. Singapore: CRC Press, 1994
27. Várilly, J.C.: Dirac operators and spectral geometry (notes taken by Witkowski, P.), 2006
28. Rennie, A.: Poincare duality and spinc structures for noncommutative manifolds, 2001, available at http://arXiv.org/abs/math-ph/0107013v1, 2001
29. Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory and (non-commutative) differential geometry. Commun. Math. Phys. 193, 527 (1998)

Communicated by A. Connes
Commun. Math. Phys. 288, 653–675 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0779-3
Communications in
Mathematical Physics
Deformed Macdonald-Ruijsenaars Operators and Super Macdonald Polynomials
A. N. Sergeev 1,2, A. P. Veselov 1,3
1 Department of Mathematical Sciences, Loughborough University, Loughborough LE11 3TU, UK. E-mail: [email protected]; [email protected]
2 Steklov Institute of Mathematics, Fontanka 27, St. Petersburg 191023, Russia
3 Landau Institute for Theoretical Physics, Moscow, Russia
Received: 16 April 2008 / Accepted: 24 November 2008 Published online: 21 March 2009 – © Springer-Verlag 2009
Abstract: It is shown that the deformed Macdonald-Ruijsenaars operators can be described as the restrictions on certain affine subvarieties of the usual Macdonald-Ruijsenaars operator in an infinite number of variables. The ideals of these varieties are shown to be generated by the Macdonald polynomials related to Young diagrams with special geometry. The super Macdonald polynomials and their shifted version are introduced; the combinatorial formulas for them are given.

1. Introduction

In this paper we investigate the properties of the deformed Macdonald-Ruijsenaars (MR) operators introduced in [1]
\[ M_{n,m,q,t} = \frac{1}{1-q} \sum_{i=1}^{n} A_i (T_{q,x_i} - 1) + \frac{1}{1-t} \sum_{j=1}^{m} B_j (T_{t,y_j} - 1), \tag{1} \]
where
\[ A_i = \prod_{k \ne i}^{n} \frac{x_i - t x_k}{x_i - x_k} \prod_{j=1}^{m} \frac{x_i - q y_j}{x_i - y_j}, \qquad B_j = \prod_{i=1}^{n} \frac{y_j - t x_i}{y_j - x_i} \prod_{l \ne j}^{m} \frac{y_j - q y_l}{y_j - y_l}, \]
and $T_{q,x_i}$, $T_{t,y_j}$ are the "shift" operators:
\begin{align*}
(T_{q,x_i} f)(x_1, \dots, x_i, \dots, x_n, y_1, \dots, y_m) &= f(x_1, \dots, q x_i, \dots, x_n, y_1, \dots, y_m), \\
(T_{t,y_j} f)(x_1, \dots, x_n, y_1, \dots, y_j, \dots, y_m) &= f(x_1, \dots, x_n, y_1, \dots, t y_j, \dots, y_m).
\end{align*}
More precisely, we generalise the results of our paper [2] by showing that the deformed MR operator can be described as the restriction of the usual Macdonald-Ruijsenaars
operator [3,4]
\[ M_{q,t} = \frac{1}{1-q} \sum_{i \ge 1} \prod_{j \ne i} \frac{z_i - t z_j}{z_i - z_j} \bigl( T_{q,z_i} - 1 \bigr) \tag{2} \]
for an infinite number of variables $z_i$ onto certain subvarieties $\Delta_{n,m,q,t}$. As well as our previous paper [2], this work is based on the theory of Macdonald polynomials [4] and shifted Macdonald polynomials developed by Knop, Sahi and Okounkov [5–8]. The paper [9] by B. Feigin, Jimbo, Miwa and Mukhin was very useful for us in understanding the role of special parameters in this problem. Another important relevant work is the paper [10] by Chalykh, who used a different technique to derive and investigate the deformed MR operator in the case $m = 1$, which was the first case when the deformed Calogero-Moser systems were discovered (see [11]).

The structure of the paper is the following. First we review some basic facts from the theory of Macdonald polynomials and Cherednik-Dunkl operators. The main results about deformed MR operators are proved in Sect. 5. We introduce the super Macdonald polynomials as the restriction of the usual Macdonald polynomials on $\Delta_{n,m,q,t}$. In Sect. 6 we define their shifted versions and show that for any shifted super Macdonald polynomial there exists a difference operator commuting with $M_{n,m,q,t}$ (a deformed version of the Harish-Chandra homomorphism). In the last section we present some combinatorial formulas for the super Macdonald polynomials and their shifted versions, generalising Okounkov's result [8].

2. Symmetric Functions and Macdonald Polynomials

In this section we recall some general facts about symmetric functions and Macdonald polynomials, mainly following Macdonald's classical book [4]. It will be convenient for us to use, instead of the parameters $q, t$ in Macdonald's notation for Macdonald polynomials, the parameters $q, t^{-1}$. Let $P_N = \mathbb{C}[x_1, \dots, x_N]$ be the polynomial algebra in $N$ independent variables and $\Lambda_N \subset P_N$ be the subalgebra of symmetric polynomials. A partition is any sequence $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_r, \dots)$
of nonnegative integers in decreasing order $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r \ge \cdots$ containing only finitely many nonzero terms. The number of nonzero terms in $\lambda$ is the length of $\lambda$, denoted by $l(\lambda)$. The sum $|\lambda| = \lambda_1 + \lambda_2 + \cdots$ is called the weight of $\lambda$. The set of all partitions of weight $N$ is denoted by $\mathcal{P}_N$. On this set there is a natural involution: in the standard diagrammatic representation [4] it corresponds to the transposition (reflection in the main diagonal). The image of a partition $\lambda$ under this involution is called the conjugate of $\lambda$ and denoted by $\lambda'$. This involution will play an essential role in our paper.

Partitions can be used to label the bases in the symmetric algebra $\Lambda_N$. There are the following two standard bases in $\Lambda_N$ which we are going to use: monomial symmetric polynomials $m_\lambda$, $\lambda \in \mathcal{P}_N$, which are defined by
\[ m_\lambda(x_1, \dots, x_N) = \sum x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N}, \]
summed over all distinct permutations $a$ of $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_N)$, and power sums
\[ p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots p_{\lambda_N}, \quad \text{where } p_k = x_1^k + x_2^k + \cdots + x_N^k. \]
It is well-known [4] that each of these sets of functions with $l(\lambda) \le N$ forms a basis in $\Lambda_N$.

We will need the following infinite dimensional versions of both $P_N$ and $\Lambda_N$. Let $M \le N$ and $\varphi_{N,M} : P_N \to P_M$ be the homomorphism which sends each of $x_{M+1}, \dots, x_N$ to zero and the other $x_i$ to themselves. It is clear that $\varphi_{N,M}(\Lambda_N) = \Lambda_M$, so we can consider the inverse limits in the category of graded algebras
\[ P = \varprojlim P_N, \qquad \Lambda = \varprojlim \Lambda_N. \]
This means that
\[ P = \bigoplus_{r=0}^{\infty} P^r, \quad P^r = \varprojlim P_N^r, \qquad \Lambda = \bigoplus_{r=0}^{\infty} \Lambda^r, \quad \Lambda^r = \varprojlim \Lambda_N^r, \]
where $P_N^r$, $\Lambda_N^r$ are the homogeneous components of $P_N$, $\Lambda_N$ of degree $r$. The elements of $\Lambda$ are called symmetric functions. Since for any partition $\lambda$, $\varphi_{N,M}(m_\lambda(x_1, \dots, x_N)) = m_\lambda(x_1, \dots, x_M)$ (and similarly for the power sums), we can define the symmetric functions $m_\lambda$, $p_\lambda$.

Another important example of symmetric functions is given by the Macdonald polynomials $P_\lambda(x, q, t)$. We give here their definition in the form most suitable for us. Recall that on the set of partitions $\mathcal{P}_N$ there is the following dominance partial ordering: we write $\mu \le \lambda$ if for all $i \ge 1$,
\[ \mu_1 + \mu_2 + \cdots + \mu_i \le \lambda_1 + \lambda_2 + \cdots + \lambda_i. \]
Consider the following Macdonald-Ruijsenaars operator (MR operator)
\[ M^{(N)}_{q,t} = \frac{1}{1-q} \sum_{i=1}^{N} \prod_{j \ne i} \frac{x_i - t x_j}{x_i - x_j} \bigl( T_{q,x_i} - 1 \bigr), \tag{3} \]
where $T_{q,x_i}$ is the shift operator
\[ (T_{q,x_i} f)(x_1, \dots, x_i, \dots, x_N) = f(x_1, \dots, q x_i, \dots, x_N). \]
This operator is related to the operators $D^1_N$ and $E_N$ from Macdonald's book [4] by the simple formulas
\[ M^{(N)}_{q,t} = \frac{t^{N-1}}{1-q} D^1_N(q, t^{-1}) - \frac{1 - t^N}{(1-q)(1-t)} = \frac{t^{-1}}{1-q} E_N(q, t^{-1}). \]
Our choice of the additional coefficient $\frac{1}{1-q}$ in formula (3) was motivated by the symmetric form of the deformed operator (1). We should note also that the operator (3) is
A. N. Sergeev, A. P. Veselov
related in a simple way to the trigonometric version of the operator $\hat S_1$ introduced by Ruijsenaars [3]. An important property of the MR operator is its stability under the change of $N$: the following diagram is commutative:
\[ \begin{array}{ccc} \Lambda_N & \xrightarrow{\; M^{(N)}_{q,t} \;} & \Lambda_N \\ \downarrow \varphi_{N,M} & & \downarrow \varphi_{N,M} \\ \Lambda_M & \xrightarrow{\; M^{(M)}_{q,t} \;} & \Lambda_M \end{array} \]
(see p. 321 in [4]). This allows us to define the MR operator $M_{q,t}$ on the space of symmetric functions $\Lambda$ as the inverse limit of $M^{(N)}_{q,t}$. Recall [4] that Macdonald polynomials $P_\lambda(x, q, t) \in \Lambda_N$ are uniquely defined for generic parameters $q, t$ and any partition $\lambda$, $l(\lambda) \le N$, by the following properties:
1) $P_\lambda(x, q, t) = m_\lambda + \sum_{\mu < \lambda} u_{\lambda\mu} m_\mu$, where $u_{\lambda\mu} = u_{\lambda\mu}(q, t) \in \mathbb{C}$.
2) $P_\lambda(x, q, t)$ is an eigenfunction of the MR operator $M^{(N)}_{q,t}$.
Indeed the operator $M^{(N)}_{q,t}$ has an upper triangular matrix in the monomial basis $m_\mu$:
\[ M^{(N)}_{q,t}(m_\lambda) = \sum_{\mu \le \lambda} c_{\lambda\mu} m_\mu, \]
where the coefficients $c_{\lambda\mu}$ can be described explicitly (see [4], p. 321). In particular
\[ c_{\lambda\lambda} = \frac{1}{1-q} \sum_{i=1}^{N} \bigl( q^{\lambda_i} - 1 \bigr) t^{i-1}. \]
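The triangularity statement and the formula for $c_{\lambda\lambda}$ can be sanity-checked numerically. The sketch below is not part of the original argument: it applies the operator (3) to a function at a rational point using exact arithmetic and compares with $c_{\lambda\lambda}$ for $\lambda = (1)$ and $\lambda = (2)$. The helper names (`mr_operator`, `c_diagonal`, `make_p_two`) are hypothetical, and the coefficient $u = (1+q)(t-1)/(t-q)$ of $m_{(1,1)}$ in $P_{(2)}$ is an assumption obtained from Macdonald's formula in [4] after the substitution $t \to t^{-1}$ used in this paper.

```python
from fractions import Fraction


def mr_operator(f, q, t):
    """Return x -> (M f)(x), with M the operator (3) acting on a function of N variables."""
    def applied(x):
        total = Fraction(0)
        for i in range(len(x)):
            coeff = Fraction(1)
            for j in range(len(x)):
                if j != i:
                    coeff *= (x[i] - t * x[j]) / (x[i] - x[j])
            shifted = list(x)
            shifted[i] = q * x[i]  # the shift T_{q, x_i}
            total += coeff * (f(shifted) - f(x))
        return total / (1 - q)
    return applied


def c_diagonal(lam, q, t):
    """Diagonal coefficient c_{lambda lambda}; lam is padded with zeros to N parts."""
    return sum((q ** part - 1) * t ** i for i, part in enumerate(lam)) / (1 - q)


def p_one(x):
    """P_(1) = m_(1) = x_1 + ... + x_N."""
    return sum(x)


def make_p_two(q, t):
    """P_(2) = m_(2) + u * m_(1,1); the value of u is an assumption (see lead-in)."""
    u = (1 + q) * (t - 1) / (t - q)

    def p_two(x):
        m2 = sum(a * a for a in x)
        m11 = sum(x[i] * x[j] for i in range(len(x)) for j in range(i + 1, len(x)))
        return m2 + u * m11
    return p_two


q, t = Fraction(2, 3), Fraction(5, 7)
point = [Fraction(3), Fraction(-1, 2), Fraction(7, 5)]
p_two = make_p_two(q, t)

# Both differences should vanish if P_(1) and P_(2) are eigenfunctions.
print(mr_operator(p_one, q, t)(point) - c_diagonal([1, 0, 0], q, t) * p_one(point))
print(mr_operator(p_two, q, t)(point) - c_diagonal([2, 0, 0], q, t) * p_two(point))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check; the evaluation point only needs pairwise distinct coordinates.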
For generic parameters $q, t$ the coefficients $c_{\lambda\lambda} \ne c_{\mu\mu}$ for all $\lambda \ne \mu$ with $|\lambda| = |\mu|$, so the operator $M^{(N)}_{q,t}$ is diagonalisable. We should note that the coefficients $u_{\lambda\mu}$ are rational functions of $q$ and $t$, which have singularities only if $q^a = t^b$ for some non-negative integers $a, b$ (not equal to zero simultaneously) [4]. Such parameters are called special; the Macdonald polynomials are well-defined for all non-special values of the parameters $q, t$. From the stability of the MR operators it follows that $\varphi_{N,M}(P_\lambda(x_1, \dots, x_N)) = P_\lambda(x_1, \dots, x_M)$, so we have correctly defined Macdonald symmetric functions $P_\lambda(x, q, t) \in \Lambda$, which are the eigenfunctions of the MR operator $M_{q,t}$.

3. Shifted Symmetric Functions and Shifted Macdonald Polynomials

We discuss now the so-called shifted Macdonald polynomials investigated by Knop, Sahi and Okounkov [5–8]. Let us denote by $\Lambda_{N,t}$ the algebra of polynomials $f(x_1, \dots, x_N)$ which are symmetric in the "shifted" variables $x_i t^{i-1}$. This algebra has the filtration by the degree of polynomials:
\[ (\Lambda_{N,t})_0 \subset (\Lambda_{N,t})_1 \subset \dots \subset (\Lambda_{N,t})_r \subset \dots. \]
We have the following shifted analog of power sums:
\[ p_r^*(x_1, \dots, x_N, t) = \sum_{i=1}^{N} \bigl( x_i^r - 1 \bigr) t^{r(i-1)}. \tag{4} \]
The polynomials $p_\lambda^*(x, t) = p_{\lambda_1}^*(x, t)\, p_{\lambda_2}^*(x, t) \cdots$, where $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_r, \dots)$ are partitions of length $l(\lambda) \le N$, form a basis in $\Lambda_{N,t}$. They are stable in the following sense. Let $M \le N$ and $\varphi^*_{N,M} : P_N \to P_M$ be the homomorphism which sends each of $x_{M+1}, \dots, x_N$ to 1 and leaves the remaining $x_i$ unchanged. Then $\varphi^*_{N,M}(p_\lambda^*(x_1, \dots, x_N)) = p_\lambda^*(x_1, \dots, x_M)$. Therefore $\varphi^*_{N,M}(\Lambda_{N,t}) = \Lambda_{M,t}$ and one can consider the inverse limit
\[ \Lambda_t = \varprojlim \Lambda_{N,t} \]
in the category of filtered algebras:
\[ \Lambda_t = \bigcup_{r=0}^{\infty} (\Lambda_t)_r, \qquad (\Lambda_t)_r = \varprojlim (\Lambda_{N,t})_r. \]
The algebra $\Lambda_t$ is called the algebra of shifted symmetric functions [7,8]. Let us introduce the following function on the set of partitions:
\[ H(\lambda, q, t) = t^{n(\lambda)} q^{n(\lambda')} \prod_{s \in \lambda} \bigl( q^{a(s)+1} - t^{l(s)} \bigr). \tag{5} \]
Here $a(s)$ and $l(s)$ are the arm and leg lengths respectively of a box $s = (i, j) \in \lambda$, which are defined as $a(s) = \lambda_i - j$, $l(s) = \lambda'_j - i$, and
\[ n(\lambda) = \sum_{i \ge 1} (i - 1) \lambda_i. \]
Recall (see [6–8]) that the shifted Macdonald polynomial $P_\lambda^*(x, q, t) \in \Lambda_t$ is the unique shifted symmetric function of degree $\deg P_\lambda^* = |\lambda|$ satisfying the following property: $P_\lambda^*(q^\lambda, q, t) = H(\lambda, q, t)$ and $P_\lambda^*(q^\mu, q, t) = 0$ unless $\lambda \subseteq \mu$ (Extra Vanishing Condition). Here and later throughout the paper by $P(q^\lambda)$ for a partition $\lambda = (\lambda_1, \dots, \lambda_N)$ we mean $P(q^{\lambda_1}, \dots, q^{\lambda_N}, 1, 1, \dots)$. We will need the following duality property of the shifted Macdonald polynomials proved by Okounkov [8]:
\[ P_\lambda^*(q^\mu, q, t) = \frac{H(\lambda, q, t)}{H(\lambda', t, q)}\, P_{\lambda'}^*(t^{\mu'}, t, q). \tag{6} \]
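As an illustration of definition (5), the following sketch computes conjugate partitions, arm and leg lengths, and $H(\lambda, q, t)$ in exact arithmetic. It is a minimal sketch under one stated assumption: the extracted text does not show clearly which of the two exponents carries the conjugate partition, and the code takes the reading $t^{n(\lambda)} q^{n(\lambda')}$. Under that reading the one-variable polynomial $(x-1)(x-q)$, which has degree 2, leading coefficient 1 and vanishes at $x = q^0, q^1$, takes the value $H((2), q, t)$ at $x = q^2$. The function names are hypothetical.

```python
from fractions import Fraction


def conjugate(lam):
    """Conjugate (transposed) partition of lam."""
    lam = [part for part in lam if part > 0]
    if not lam:
        return []
    return [sum(1 for part in lam if part > j) for j in range(lam[0])]


def n_stat(lam):
    """n(lambda) = sum over i >= 1 of (i - 1) * lambda_i."""
    return sum(i * part for i, part in enumerate(lam))


def H(lam, q, t):
    """H(lambda, q, t) as in (5), with the exponent placement assumed in the lead-in."""
    lam = [part for part in lam if part > 0]
    lam_c = conjugate(lam)
    value = t ** n_stat(lam) * q ** n_stat(lam_c)
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            arm = row - j            # a(s) = lambda_i - j
            leg = lam_c[j - 1] - i   # l(s) = lambda'_j - i
            value *= q ** (arm + 1) - t ** leg
    return value


q, t = Fraction(2, 3), Fraction(5, 7)
x = q ** 2
print(H([2], q, t) == (x - 1) * (x - q))
```

The same function evaluated on the conjugate partition with $q$ and $t$ swapped gives the denominator $H(\lambda', t, q)$ appearing in the duality formula (6).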
To show this consider the following conjugation homomorphism (cf. [2]):
\[ \bigl( \omega^*_{q,t}(f) \bigr)(t^{\lambda'}) = f(q^\lambda). \tag{7} \]
We claim that the conjugation homomorphism maps the algebra of shifted symmetric functions $\Lambda_t$ into the algebra $\Lambda_q$. Indeed computing the sum
\[ \sum_{(i,j) \in \lambda} q^{i-1} t^{j-1} \]
first along columns and then along the rows we come to the following equality
\[ \frac{1}{1-q} \sum_{j \ge 1} \bigl( q^{\lambda_j} - 1 \bigr) t^{j-1} = \frac{1}{1-t} \sum_{i \ge 1} \bigl( t^{\lambda'_i} - 1 \bigr) q^{i-1}, \tag{8} \]
which is equivalent to
\[ \omega^*_{q,t}\bigl( p_r^*(x, t) \bigr) = \frac{1 - q^r}{1 - t^r}\, p_r^*(x, q) \]
with $r = 1$. Replacing $q$ by $q^r$ and $t$ by $t^r$ we have this formula for all $r$. Now the claim follows from the fact that $p_\lambda^*(x, t)$ generate the algebra. Combining this with the definition of the shifted Macdonald polynomials we have the duality property (6).

4. Cherednik-Dunkl Operators and Harish-Chandra Homomorphism

In this section we present the basic facts about Cherednik-Dunkl operators. For more details we refer to [12,13]. Consider the operators $T_i$, $i = 1, \dots, N-1$,
\[ T_i = 1 + \frac{x_i - t x_{i+1}}{x_i - x_{i+1}} \bigl( \sigma_{i\,i+1} - 1 \bigr) \]
and $\omega = \sigma_{N\,N-1}\, \sigma_{N-1\,N-2} \cdots \sigma_{21}\, T_{q,x_1}$, where $\sigma_{ij}$ is acting on the function $f(x_1, \dots, x_N)$ by permutation of the $i$-th and $j$-th coordinates. By Cherednik-Dunkl operators we will mean the following difference operators:
\[ D_{i,N} = t^{1-N}\, T_i \cdots T_{N-1}\, \omega\, T_1^{-1} \cdots T_{i-1}^{-1}, \qquad i = 1, \dots, N. \tag{9} \]
The first important property of the Cherednik-Dunkl operators is that they commute with each other: [Di,N , D j,N ] = 0. This means that one can substitute them in any polynomial P in N variables without ordering problems.
The second property is that if one does this for a shifted symmetric polynomial $g \in \Lambda_{N,t}$, then the corresponding operator $g(D_{1,N}, \dots, D_{N,N})$ leaves the algebra of symmetric polynomials $\Lambda_N$ invariant:
\[ g(D_{1,N}, \dots, D_{N,N}) : \Lambda_N \to \Lambda_N. \]
The restriction of the operator $g(D_{1,N}, \dots, D_{N,N})$ on the algebra $\Lambda_N$ is given by some difference operator, which we will denote as $D^g_{N,q,t}$. One can check that if we apply this operation to the shifted power sum $p_1^*(x_1, \dots, x_N, t) = \sum_{i=1}^{N} (x_i - 1) t^{i-1}$ we arrive (up to a factor $(1-q)^{-1}$) at the Macdonald-Ruijsenaars operator (3). Thus the operators $D^g_{N,q,t}$ can be considered as the integrals of the corresponding quantum system, which is equivalent to the relativistic Calogero-Moser system introduced by Ruijsenaars [3]. The Macdonald polynomials are the joint eigenfunctions of all these operators: if $P_\lambda(x, q, t)$ is the Macdonald polynomial corresponding to a partition $\lambda$ of length at most $N$ then
\[ D^g_{N,t}\, P_\lambda(x, q, t) = g(q^{\lambda_1}, q^{\lambda_2}, \dots, q^{\lambda_N})\, P_\lambda(x, q, t). \tag{10} \]
This allows us to define a homomorphism (which is actually a monomorphism) $\chi : f \to D^f_{N,t}$ from the algebra $\Lambda_{N,t}$ to the algebra of difference operators. Let us denote by $\mathcal{D}(N, t)$ the image of $\chi$. The inverse homomorphism $\chi^{-1} : \mathcal{D}(N, t) \to \Lambda_{N,t}$ is called the Harish-Chandra isomorphism. It can be defined by its action on the Macdonald polynomials: the image of $D \in \mathcal{D}(N, t)$ is a polynomial $f = f_D \in \Lambda_{N,t}$ such that $D P_\lambda(x, q, t) = f(q^\lambda) P_\lambda(x, q, t)$. One can check that the Cherednik-Dunkl operators are stable: the diagram
\[ \begin{array}{ccc} P_N & \xrightarrow{\; D_{i,N} \;} & P_N \\ \downarrow \varphi_{N,M} & & \downarrow \varphi_{N,M} \\ P_M & \xrightarrow{\; D_{i,M} \;} & P_M \end{array} \]
is commutative for all $M \le N$ and $i \ge 1$. Similarly for any $f \in \Lambda_{N,t}$ and $g = \varphi^*_{N,M}(f)$, $M \le N$, the following diagram is commutative:
\[ \begin{array}{ccc} \Lambda_N & \xrightarrow{\; D^f_{N,t} \;} & \Lambda_N \\ \downarrow \varphi_{N,M} & & \downarrow \varphi_{N,M} \\ \Lambda_M & \xrightarrow{\; D^g_{M,t} \;} & \Lambda_M \end{array} \]
This allows us to define for any shifted symmetric function $f \in \Lambda_t$ a difference operator $D^f_t : \Lambda \to \Lambda$ and the infinite dimensional version of the homomorphism $\chi$. We will denote by $\mathcal{D}(t)$ the image of this homomorphism. The inverse (Harish-Chandra) homomorphism $\chi^{-1} : \mathcal{D}(t) \to \Lambda_t$ can be described by the relation
\[ D^f_t\, P_\lambda(x, q, t) = f(q^\lambda)\, P_\lambda(x, q, t), \]
where now $f \in \Lambda_t$ and $P_\lambda(x, q, t)$ are Macdonald polynomials.
5. Deformed Macdonald-Ruijsenaars Operator as a Restriction

The following algebra $\Lambda_{n,m,q,t}$ will play a central role in our construction. Let $P_{n,m} = \mathbb{C}[x_1, \dots, x_n, y_1, \dots, y_m]$ be the polynomial algebra in $n + m$ independent variables. Then $\Lambda_{n,m,q,t} \subset P_{n,m}$ is the subalgebra consisting of polynomials which are symmetric in $x_1, \dots, x_n$ and $y_1, \dots, y_m$ separately and satisfy the conditions
\[ T_{q,x_i}(f) = T_{t,y_j}(f) \tag{11} \]
on each hyperplane $x_i - y_j = 0$ for all $i = 1, \dots, n$ and $j = 1, \dots, m$. Assume from now on that $t$ and $q$ are not roots of unity and consider the following deformed Newton sums
\[ p_r(x, y, q, t) = \sum_{i=1}^{n} x_i^r + \frac{1 - q^r}{1 - t^r} \sum_{j=1}^{m} y_j^r, \tag{12} \]
which obviously belong to $\Lambda_{n,m,q,t}$ for all nonnegative integers $r$. We will prove later that if the parameters $q, t$ are non-special, then $\Lambda_{n,m;q,t}$ is generated by the deformed Newton sums $p_r(x, y, q, t)$ (see Theorem 5.8 below), but now let us start with the following result.

Theorem 5.1. The algebra $\Lambda_{n,m,q,t}$ is finitely generated if and only if $t^i q^j \ne 1$ for all $1 \le i \le n$, $1 \le j \le m$.

Proof. Consider the subalgebra $P(k) = \mathbb{C}[p_1, \dots, p_{n+m}]$ generated by the first $n + m$ deformed Newton sums (12). We need the following result about common zeros of these polynomials (cf. Prop. 4 in [1]).

Lemma 5.2. The system
\[ \begin{cases} x_1 + x_2 + \dots + x_n + \dfrac{1-q}{1-t}\,(x_{n+1} + x_{n+2} + \dots + x_{n+m}) = 0 \\[4pt] x_1^2 + x_2^2 + \dots + x_n^2 + \dfrac{1-q^2}{1-t^2}\,(x_{n+1}^2 + x_{n+2}^2 + \dots + x_{n+m}^2) = 0 \\ \qquad \vdots \\ x_1^{n+m} + x_2^{n+m} + \dots + x_n^{n+m} + \dfrac{1-q^{n+m}}{1-t^{n+m}}\,(x_{n+1}^{n+m} + x_{n+2}^{n+m} + \dots + x_{n+m}^{n+m}) = 0 \end{cases} \]
has a non-zero solution in $\mathbb{C}^{n+m}$ if and only if $t^i q^j = 1$ for some $1 \le i \le n$, $1 \le j \le m$.

Proof. Let us multiply the $k$-th equation by $1 - t^k$ and rewrite it as
\[ x_1^k + \dots + x_n^k + x_{n+1}^k + \dots + x_{n+m}^k = (t x_1)^k + \dots + (t x_n)^k + (q x_{n+1})^k + \dots + (q x_{n+m})^k. \]
Since this is true for all $k = 1, \dots, n + m$ this means that the set $x_1, \dots, x_n, x_{n+1}, \dots, x_{n+m}$ coincides up to a permutation with the set $t x_1, \dots, t x_n, q x_{n+1}, \dots, q x_{n+m}$.
Let us consider only the nonzero elements $x_i$, $i \in S \subset [1, \dots, n]$, and the nonzero elements $x_{n+j}$, $j \in T \subset [1, \dots, m]$. Then
\[ \prod_{i \in S} x_i \prod_{j \in T} x_{n+j} = t^{|S|} q^{|T|} \prod_{i \in S} x_i \prod_{j \in T} x_{n+j}. \]
Therefore $t^{|S|} q^{|T|} = 1$. Conversely suppose that $t^i q^j = 1$ for some $1 \le i \le n$, $1 \le j \le m$ and consider $x_1 = 1, x_2 = t^{-1}, \dots, x_i = t^{1-i}$, $x_{n+1} = q^{-1} t^{1-i}, \dots, x_{n+j} = q^{-j} t^{1-i}$ with every other variable set to zero. Then it is easy to verify that it is a solution of the system.

From the lemma and Nullstellensatz it follows that if $t^i q^j \ne 1$ for all $1 \le i \le n$, $1 \le j \le m$ the elements $x_i^{N_i}$ belong to the ideal generated by $p_1, \dots, p_{n+m}$ for some $N_i$ and all $i = 1, \dots, n + m$. By a standard result from commutative algebra (see Corollary 5.2 in [15]) it follows that the algebra $P_{n,m}$ is a finitely generated module over the subalgebra $P(k)$. Now using Proposition 7.8 from [15] we conclude that the algebra $\Lambda_{n,m,q,t}$ in this case is finitely generated. Conversely, assume that $t^i q^j = 1$ for some $1 \le i \le n$, $1 \le j \le m$ and consider the following homomorphism $\phi_{i,j} : \Lambda_{n,m;q,t} \to \mathbb{C}[u, v]$ sending a polynomial $f(x_1, \dots, x_n, y_1, \dots, y_m)$ into
\[ \phi(u, v) = f(u, t^{-1} u, \dots, t^{1-i} u, 0, \dots, 0, q^{j-1} t v, q^{j-2} t v, \dots, t v, 0, \dots, 0). \]
One can check that the image $\phi$ of any $f \in \Lambda_{n,m;q,t}$ satisfies the condition $\phi(u, u) = \phi(qu, qu)$ and therefore $\phi(u, u) = \mathrm{const}$, since by assumption $q$ is not a root of unity. Moreover one can show that any such function $\phi$ belongs to the image of $\phi_{i,j}$. The corresponding algebra consists of the polynomials of the form $\phi = (u - v) p(u, v) + c$, which is not finitely generated (cf. [1], p. 274). This completes the proof of the theorem.

Let us assume from now on that $q, t$ are generic. Since the algebra $\Lambda_{n,m,q,t}$ is finitely generated we can introduce an affine algebraic variety $\Delta_{n,m,q,t} = \mathrm{Spec}\, \Lambda_{n,m,q,t}$. Consider the following embedding of $\Delta_{n,m,q,t}$ into the infinite-dimensional Macdonald variety $\mathcal{M} = \mathrm{Spec}\, \Lambda$ (cf. [2]). Recall that $\Lambda$ is the algebra of symmetric functions in an infinite number of variables $z_1, z_2, \dots$, which is freely generated by the power sums $p_r(z) = z_1^r + z_2^r + \cdots$.
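The converse direction of Lemma 5.2 lends itself to a direct numerical check. The sketch below (helper names are hypothetical, not from the paper) builds the point $x_1 = 1, \dots, x_i = t^{1-i}$, $x_{n+1} = q^{-1}t^{1-i}, \dots, x_{n+j} = q^{-j}t^{1-i}$ and evaluates the first $n+m$ deformed Newton sums (12) in exact arithmetic, for parameters chosen so that $t^i q^j = 1$.

```python
from fractions import Fraction


def deformed_newton_sum(r, xs, ys, q, t):
    """p_r(x, y, q, t) from (12)."""
    return sum(x ** r for x in xs) + (1 - q ** r) / (1 - t ** r) * sum(y ** r for y in ys)


def lemma_solution(n, m, i, j, q, t):
    """The candidate non-zero solution from the proof of Lemma 5.2 (requires t**i * q**j == 1)."""
    xs = [t ** (-a) for a in range(i)] + [Fraction(0)] * (n - i)
    ys = [q ** (-b) * t ** (1 - i) for b in range(1, j + 1)] + [Fraction(0)] * (m - j)
    return xs, ys


# Example with n = m = 2 and t^2 q = 1.
q, t = Fraction(4), Fraction(1, 2)
xs, ys = lemma_solution(2, 2, 2, 1, q, t)
values = [deformed_newton_sum(k, xs, ys, q, t) for k in range(1, 5)]
print(values)
```

For parameters with $t^i q^j \ne 1$ the same point is in general no longer a common zero, which is the content of the "only if" part.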
Consider the following homomorphism $\varphi$ from $\Lambda$ to $\Lambda_{n,m,q,t}$ uniquely determined by the relations $\varphi(p_r(z)) = p_r(x, y, q, t)$. For generic $q, t$ this homomorphism is surjective and thus defines an embedding $\phi : \Delta_{n,m,q,t} \to \mathcal{M}$. We are going to show that the deformed MR operator (1) is the restriction of the usual MR operator on $\mathcal{M}$ onto the subvariety $\Delta_{n,m,q,t}$. We start with the following modification of Proposition 2.8 from paper [10] by Chalykh.
Proposition 5.3. The deformed MR operator preserves the algebra $\Lambda_{n,m,q,t}$:
\[ M_{n,m,q,t} : \Lambda_{n,m,q,t} \to \Lambda_{n,m,q,t}. \tag{13} \]
Proof. Let $f \in \Lambda_{n,m,q,t}$ and $g = M_{n,m,q,t}(f)$. Then we have
\[ g = \sum_{i=1}^{n} \frac{A_i}{1-q}\, f_i + \sum_{j=1}^{m} \frac{B_j}{1-t}\, f_{\bar j}, \]
where $A_i$, $B_j$ are the same as in (1) and
\[ f_i = T_{q,x_i}(f) - f, \qquad f_{\bar j} = T_{t,y_j}(f) - f, \qquad i = 1, \dots, n, \; j = 1, \dots, m. \]
Let us prove first that $g$ is a polynomial. It is clear that $g$ is a rational function, which is symmetric in $x_1, \dots, x_n$ and $y_1, \dots, y_m$, so, given the form of $A_i$ and $B_j$, it is enough to prove that $g$ has no singularities of the type $(x_1 - x_2)^{-1}$, $(y_1 - y_2)^{-1}$, $(x_1 - y_1)^{-1}$. Let us represent $g$ in the form
\[ g = \frac{1}{x_1 - x_2} \Bigl( (x_1 - x_2) \frac{A_1}{1-q} f_1 + (x_1 - x_2) \frac{A_2}{1-q} f_2 \Bigr) + g_1, \]
where $g_1$ is a rational function without singularities on the hyperplane $x_1 = x_2$. But on this hyperplane we have
\[ (x_1 - x_2) \frac{A_1}{1-q} f_1 + (x_1 - x_2) \frac{A_2}{1-q} f_2 = \Bigl( (x_1 - x_2) \frac{A_1}{1-q} + (x_1 - x_2) \frac{A_2}{1-q} \Bigr) f_1 = 0, \]
so we see that $g$ has no poles when $x_1 = x_2$. Similarly there are no singularities of type $(y_1 - y_2)^{-1}$. Now write $g$ in the form
\[ g = \frac{1}{x_1 - y_1} \Bigl( (x_1 - y_1) \frac{A_1}{1-q} f_1 + (x_1 - y_1) \frac{B_1}{1-t} f_{\bar 1} \Bigr) + g_2, \]
where $g_2$ is a rational function without poles when $x_1 = y_1$. On the hyperplane $x_1 = y_1$ we have
\[ (x_1 - y_1) \frac{A_1}{1-q} + (x_1 - y_1) \frac{B_1}{1-t} = 0, \]
therefore
\[ (x_1 - y_1) \frac{A_1}{1-q} f_1 + (x_1 - y_1) \frac{B_1}{1-t} f_{\bar 1} = \Bigl( (x_1 - y_1) \frac{A_1}{1-q} + (x_1 - y_1) \frac{B_1}{1-t} \Bigr) f_1 = 0. \]
We have used here the fact that when $x_1 = y_1$ we have $f_1 = f_{\bar 1}$ from the definition of the algebra $\Lambda_{n,m,q,t}$. Thus we have proved that $g$ is a polynomial. Now let us prove that $g \in \Lambda_{n,m,q,t}$. On the hyperplane $x_1 = y_1$ we have the following equalities:
\[ \bigl( T_{q,x_1} - T_{t,y_1} \bigr) A_i = 0, \; i \ne 1, \qquad T_{q,x_1} A_1 = 0, \qquad \bigl( T_{q,x_1} - T_{t,y_1} \bigr) B_j = 0, \; j \ne 1, \qquad T_{t,y_1} B_1 = 0, \]
\[ T_{t,y_1} \frac{A_1}{1-q} = T_{q,x_1} \frac{B_1}{1-t}, \]
and therefore
\[ \begin{aligned} \bigl( T_{q,x_1} - T_{t,y_1} \bigr) g &= \bigl( T_{q,x_1} - T_{t,y_1} \bigr) \Bigl( \frac{A_1}{1-q} f_1 + \frac{B_1}{1-t} f_{\bar 1} \Bigr) \\ &= T_{q,x_1}\Bigl( \frac{B_1}{1-t} \Bigr) T_{q,x_1} f_{\bar 1} - T_{t,y_1}\Bigl( \frac{A_1}{1-q} \Bigr) T_{t,y_1} f_1 \\ &= T_{q,x_1}\Bigl( \frac{B_1}{1-t} \Bigr) \bigl( T_{q,x_1} T_{t,y_1}(f) - T_{q,x_1} f \bigr) - T_{t,y_1}\Bigl( \frac{A_1}{1-q} \Bigr) \bigl( T_{t,y_1} T_{q,x_1}(f) - T_{t,y_1} f \bigr) = 0, \end{aligned} \]
since $f \in \Lambda_{n,m,q,t}$.
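A small numerical experiment can make the action of the deformed operator concrete. The sketch below is an illustration only: it assumes the coefficients $A_i = \prod_{s \ne i} \frac{x_i - t x_s}{x_i - x_s} \prod_{j} \frac{x_i - q y_j}{x_i - y_j}$ and $B_j = \prod_{r \ne j} \frac{y_j - q y_r}{y_j - y_r} \prod_{i} \frac{y_j - t x_i}{y_j - x_i}$ as they are written out later in the proof of Lemma 5.5, and it evaluates $g = M_{n,m,q,t}(f)$ pointwise for the first deformed Newton sum $f = p_1(x, y, q, t)$. Since $p_1$ is the image of $P_{(1)}$, whose MR eigenvalue is $-1$, one expects $g = -p_1$. The function names are hypothetical.

```python
from fractions import Fraction


def deformed_mr(f, q, t):
    """Return (xs, ys) -> (M_{n,m,q,t} f)(xs, ys), with A_i, B_j as assumed in the lead-in."""
    def applied(xs, ys):
        base = f(xs, ys)
        total = Fraction(0)
        for i, xi in enumerate(xs):
            a = Fraction(1)
            for s, xv in enumerate(xs):
                if s != i:
                    a *= (xi - t * xv) / (xi - xv)
            for y in ys:
                a *= (xi - q * y) / (xi - y)
            shifted = list(xs)
            shifted[i] = q * xi  # T_{q, x_i}
            total += a / (1 - q) * (f(shifted, ys) - base)
        for j, yj in enumerate(ys):
            b = Fraction(1)
            for r, yv in enumerate(ys):
                if r != j:
                    b *= (yj - q * yv) / (yj - yv)
            for x in xs:
                b *= (yj - t * x) / (yj - x)
            shifted = list(ys)
            shifted[j] = t * yj  # T_{t, y_j}
            total += b / (1 - t) * (f(xs, shifted) - base)
        return total
    return applied


q, t = Fraction(2, 3), Fraction(5, 7)


def p1(xs, ys):
    """First deformed Newton sum (12)."""
    return sum(xs) + (1 - q) / (1 - t) * sum(ys)


xs = [Fraction(3), Fraction(-1, 2)]
ys = [Fraction(7, 5), Fraction(2)]
print(deformed_mr(p1, q, t)(xs, ys) + p1(xs, ys))
```

The evaluation point must have pairwise distinct coordinates so that none of the denominators in $A_i$, $B_j$ vanishes.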
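Now we are ready to formulate our central result. Let $M_{q,t}$ be the usual MR operator in infinite dimension.

Theorem 5.4. The following diagram is commutative for all values of the parameters $q, t$:
\[ \begin{array}{ccc} \Lambda & \xrightarrow{\; M_{q,t} \;} & \Lambda \\ \downarrow \varphi & & \downarrow \varphi \\ \Lambda_{n,m,q,t} & \xrightarrow{\; M_{n,m,q,t} \;} & \Lambda_{n,m,q,t} \end{array} \tag{14} \]
In other words, the deformed MR operator (1) is the restriction of the operator $M_{q,t}$ onto the subvariety $\Delta_{n,m,q,t} \subset \mathcal{M}$.

Proof. Let us introduce the following function $\Pi \in \Lambda[[w_1, \dots, w_N]]$ which plays an important role in the theory of Macdonald polynomials (see [4]):
\[ \Pi = \prod_{j=1}^{N} \prod_{r=0}^{\infty} \prod_{i=1}^{\infty} \frac{1 - t^{-1} q^r z_i w_j}{1 - q^r z_i w_j}. \]
Lemma 5.5. The function $\Pi$ satisfies the following properties:
(i)
\[ M^z_{q,t}\, \Pi = M^w_{q,t}\, \Pi, \tag{15} \]
where the index $z$ (resp. $w$) indicates the action of the Macdonald operator $M_{q,t}$ on the $z$ (resp. $w$) variables,
(ii)
\[ \varphi(\Pi) = \prod_{l=1}^{N} \Bigl( \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w_l}{1 - q^r x_i w_l} \prod_{j=1}^{m} \bigl( 1 - t^{-1} y_j w_l \bigr) \Bigr), \tag{16} \]
(iii)
\[ \varphi\bigl( M^z_{q,t}\, \Pi \bigr) = M_{n,m,q,t}\, \varphi(\Pi). \tag{17} \]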
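Proof. The first part is well known, see Macdonald [4], formula (3.12) in Chap. 6. To prove the second one we note that since $\varphi$ is a homomorphism it is enough to consider the case $N = 1$ when we have only one variable $w$. Consider an auxiliary homomorphism $\tilde\varphi : \Lambda \to \Lambda_n \otimes \Lambda$ defined by
\[ \tilde\varphi(p_r(z)) = p_r(x) + \frac{1 - q^r}{1 - t^r}\, p_r(y) \]
with finite number $n$ of variables $x_i$ and infinite number of variables $y_i$. Define also the following automorphism $\sigma_{q,t} : \Lambda \to \Lambda$ by
\[ \sigma_{q,t}(p_r(y)) = \frac{1 - q^r}{1 - t^r}\, p_r(y). \tag{18} \]
We have (cf. [2])
\[ \begin{aligned} \tilde\varphi(\Pi) &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \sigma_{q,t}\Bigl( \prod_{r=0}^{\infty} \prod_{j \ge 1} \frac{1 - t^{-1} q^r y_j w}{1 - q^r y_j w} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \sigma_{q,t}\Bigl( \exp \log \prod_{r=0}^{\infty} \prod_{j \ge 1} \frac{1 - t^{-1} q^r y_j w}{1 - q^r y_j w} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \exp \sigma_{q,t}\Bigl( \log \prod_{r=0}^{\infty} \prod_{j \ge 1} \frac{1 - t^{-1} q^r y_j w}{1 - q^r y_j w} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \exp \sigma_{q,t}\Bigl( \sum_{r=0}^{\infty} \sum_{j \ge 1} \bigl( \log(1 - t^{-1} q^r y_j w) - \log(1 - q^r y_j w) \bigr) \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \exp \sigma_{q,t}\Bigl( \sum_{r=0}^{\infty} \sum_{j \ge 1} \sum_{s \ge 1} \frac{q^{rs} y_j^s w^s - t^{-s} q^{rs} y_j^s w^s}{s} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \exp \sigma_{q,t}\Bigl( \sum_{s \ge 1} \frac{1 - t^{-s}}{1 - q^s}\, p_s(y)\, \frac{w^s}{s} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \exp \Bigl( \sum_{s \ge 1} \frac{1 - t^{-s}}{1 - t^s}\, p_s(y)\, \frac{w^s}{s} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \exp \Bigl( - \sum_{s \ge 1} t^{-s} p_s(y)\, \frac{w^s}{s} \Bigr) \\ &= \prod_{r=0}^{\infty} \prod_{i=1}^{n} \frac{1 - t^{-1} q^r x_i w}{1 - q^r x_i w}\; \prod_{j=1}^{\infty} \bigl( 1 - t^{-1} y_j w \bigr). \end{aligned} \]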
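Now setting all the variables $y_i$ except the first $m$ to zero we have the second part of the lemma. Let us prove (iii). It is easy to check the following equalities:
\[ \varphi(\Pi)^{-1}\, T_{q,w_l}\, \varphi(\Pi) = \prod_{i=1}^{n} \frac{1 - x_i w_l}{1 - t^{-1} x_i w_l} \prod_{j=1}^{m} \frac{1 - t^{-1} q y_j w_l}{1 - t^{-1} y_j w_l}, \]
\[ \varphi(\Pi)^{-1}\, T_{q,x_i}\, \varphi(\Pi) = \prod_{l=1}^{N} \frac{1 - x_i w_l}{1 - t^{-1} x_i w_l}, \qquad \varphi(\Pi)^{-1}\, T_{t,y_j}\, \varphi(\Pi) = \prod_{l=1}^{N} \frac{1 - y_j w_l}{1 - t^{-1} y_j w_l}. \]
Now we see that (iii) is equivalent to the following equality:
\[ \sum_{l=1}^{N} C_l \Bigl( \prod_{i=1}^{n} \frac{1 - x_i w_l}{1 - t^{-1} x_i w_l} \prod_{j=1}^{m} \frac{1 - t^{-1} q y_j w_l}{1 - t^{-1} y_j w_l} - 1 \Bigr) = \sum_{i=1}^{n} A_i \Bigl( \prod_{l=1}^{N} \frac{1 - x_i w_l}{1 - t^{-1} x_i w_l} - 1 \Bigr) + \frac{1-q}{1-t} \sum_{j=1}^{m} B_j \Bigl( \prod_{l=1}^{N} \frac{1 - y_j w_l}{1 - t^{-1} y_j w_l} - 1 \Bigr), \tag{19} \]
where
\[ C_l = \prod_{k \ne l}^{N} \frac{w_l - t w_k}{w_l - w_k}, \qquad A_i = \prod_{s \ne i}^{n} \frac{x_i - t x_s}{x_i - x_s} \prod_{j=1}^{m} \frac{x_i - q y_j}{x_i - y_j}, \qquad B_j = \prod_{r \ne j}^{m} \frac{y_j - q y_r}{y_j - y_r} \prod_{i=1}^{n} \frac{y_j - t x_i}{y_j - x_i}. \]
From the partial fraction decomposition we have
\[ \prod_{i=1}^{n} \frac{1 - x_i w_l}{1 - t^{-1} x_i w_l} \prod_{j=1}^{m} \frac{1 - t^{-1} q y_j w_l}{1 - t^{-1} y_j w_l} - 1 = t^n q^m - 1 + (1 - t) \sum_{i=1}^{n} \frac{A_i}{1 - t^{-1} x_i w_l} + (1 - q) \sum_{j=1}^{m} \frac{B_j}{1 - t^{-1} y_j w_l}, \]
\[ \prod_{l=1}^{N} \frac{1 - x_i w_l}{1 - t^{-1} x_i w_l} - 1 = t^N - 1 + (1 - t) \sum_{l=1}^{N} \frac{C_l}{1 - t^{-1} x_i w_l}, \]
\[ \prod_{l=1}^{N} \frac{1 - y_j w_l}{1 - t^{-1} y_j w_l} - 1 = t^N - 1 + (1 - t) \sum_{l=1}^{N} \frac{C_l}{1 - t^{-1} y_j w_l}. \]
Substituting these identities into relation (19) we reduce it to the following equality:
\[ (t^n q^m - 1) \sum_{l=1}^{N} C_l = (t^N - 1) \sum_{i=1}^{n} A_i + \frac{1-q}{1-t}\, (t^N - 1) \sum_{j=1}^{m} B_j, \]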
Fig. 1. Fat hook (diagram labels: n, m)
which follows from the identities
\[ \sum_{l=1}^{N} C_l = \frac{t^N - 1}{t - 1}, \qquad \sum_{i=1}^{n} A_i + \frac{1-q}{1-t} \sum_{j=1}^{m} B_j = \frac{t^n q^m - 1}{t - 1}. \]
The lemma is proved.
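The two summation identities and the first partial fraction decomposition used in the proof of part (iii) are rational-function identities, so they can be spot-checked at a generic rational point. The following sketch does this in exact arithmetic; the function names are hypothetical, and the formulas for $C_l$, $A_i$, $B_j$ are taken as assumptions in the form $C_l = \prod_{k \ne l} \frac{w_l - t w_k}{w_l - w_k}$, $A_i = \prod_{s \ne i} \frac{x_i - t x_s}{x_i - x_s} \prod_j \frac{x_i - q y_j}{x_i - y_j}$, $B_j = \prod_{r \ne j} \frac{y_j - q y_r}{y_j - y_r} \prod_i \frac{y_j - t x_i}{y_j - x_i}$.

```python
from fractions import Fraction


def c_coeffs(ws, t):
    """C_l for each l."""
    out = []
    for l, wl in enumerate(ws):
        c = Fraction(1)
        for k, wk in enumerate(ws):
            if k != l:
                c *= (wl - t * wk) / (wl - wk)
        out.append(c)
    return out


def a_coeffs(xs, ys, q, t):
    """A_i for each i."""
    out = []
    for i, xi in enumerate(xs):
        a = Fraction(1)
        for s, xv in enumerate(xs):
            if s != i:
                a *= (xi - t * xv) / (xi - xv)
        for y in ys:
            a *= (xi - q * y) / (xi - y)
        out.append(a)
    return out


def b_coeffs(xs, ys, q, t):
    """B_j for each j."""
    out = []
    for j, yj in enumerate(ys):
        b = Fraction(1)
        for r, yv in enumerate(ys):
            if r != j:
                b *= (yj - q * yv) / (yj - yv)
        for x in xs:
            b *= (yj - t * x) / (yj - x)
        out.append(b)
    return out


def partial_fraction_gap(xs, ys, w, q, t):
    """Left side minus right side of the first partial fraction identity, for one value of w."""
    lhs = Fraction(1)
    for x in xs:
        lhs *= (1 - x * w) / (1 - x * w / t)
    for y in ys:
        lhs *= (1 - q * y * w / t) / (1 - y * w / t)
    lhs -= 1
    rhs = t ** len(xs) * q ** len(ys) - 1
    rhs += (1 - t) * sum(a / (1 - x * w / t) for a, x in zip(a_coeffs(xs, ys, q, t), xs))
    rhs += (1 - q) * sum(b / (1 - y * w / t) for b, y in zip(b_coeffs(xs, ys, q, t), ys))
    return lhs - rhs


q, t = Fraction(2, 3), Fraction(5, 7)
xs = [Fraction(3), Fraction(-1, 2)]
ys = [Fraction(7, 5), Fraction(2)]
ws = [Fraction(1, 3), Fraction(5), Fraction(-2)]

sum_c = sum(c_coeffs(ws, t))
sum_ab = sum(a_coeffs(xs, ys, q, t)) + (1 - q) / (1 - t) * sum(b_coeffs(xs, ys, q, t))
print(sum_c, sum_ab, partial_fraction_gap(xs, ys, Fraction(1, 4), q, t))
```

A check at one point is of course not a proof, but a mismatch would immediately flag a misread sign or index in the reconstructed formulas.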
To complete the proof of the theorem we need Macdonald's result that the coefficients $g_\lambda(z, q, t)$ in the expansion of the function
\[ \Pi = \prod_{j=1}^{N} \prod_{r=0}^{\infty} \prod_{i=1}^{\infty} \frac{1 - t^{-1} q^r z_i w_j}{1 - q^r z_i w_j} = \sum_{\lambda} g_\lambda(z, q, t)\, m_\lambda(w) \]
linearly generate $\Lambda$ when we increase the number of variables $w$ (see [4], VI, (2.10)).

Let us introduce the set of partitions $H_{n,m}$, which consists of the partitions $\lambda$ such that $\lambda_{n+1} \le m$ or, in other words, whose diagrams are contained in the fat $(n, m)$-hook (see Fig. 1). Its complement we will denote as $\bar H_{n,m}$.

Theorem 5.6. If $q, t$ are non-special then $\mathrm{Ker}\, \varphi$ is spanned by the Macdonald polynomials $P_\lambda(z, q, t)$ corresponding to the partitions which are not contained in the fat $(n, m)$-hook.

Proof. Consider the automorphism $\sigma_{q,t}$ of the algebra $\Lambda$ defined above by (18):
\[ \sigma_{q,t}(p_r) = \frac{1 - q^r}{1 - t^r}\, p_r. \]
Then using formula (6.19) from [4] (see p. 327) it is easy to verify that
\[ \sigma_{q,t}(P_\lambda(z, q, t)) = (-1)^{|\lambda|}\, \frac{H(\lambda, q, t)}{H(\lambda', t, q)}\, P_{\lambda'}(z, t, q). \tag{20} \]
Let now $x = (x_1, x_2, \dots)$, $y = (y_1, y_2, \dots)$ be two infinite sequences. Then we have (see [4], p. 345, formula (7.9'))
\[ P_\lambda(x, y, q, t) = \sum_{\mu \subset \lambda} P_{\lambda/\mu}(x, q, t)\, P_\mu(y, q, t), \tag{21} \]
where $P_{\lambda/\mu}(z, q, t)$ are the skew Macdonald functions defined in [4] (see Chap. 6, Sect. 7) and $\mu \subset \lambda$ means that $\mu_i \le \lambda_i$ (or equivalently the diagram of $\mu$ is a subset of the diagram of $\lambda$). If we apply the automorphism $\sigma_{q,t}$ acting in the $y$ variables to both sides of the formula (21) and put all the variables $x$ and $y$ except the first $n$ and $m$ respectively to zero, similarly to the proof of the second part of Lemma 5.5, we get
\[ \varphi(P_\lambda(z, q, t)) = \sum_{\mu \subset \lambda} (-1)^{|\mu|}\, P_{\lambda/\mu}(x, q, t)\, \frac{H(\mu, q, t)}{H(\mu', t, q)}\, P_{\mu'}(y, t, q). \tag{22} \]
Now let us assume that $\lambda$ is not contained in the fat $(n, m)$-hook, then $\lambda'_{m+1} > n$. We have two possibilities: $\mu'_{m+1} > 0$ or $\mu'_{m+1} = 0$. In the first case we have $P_{\mu'}(y_1, \dots, y_m, t, q) = 0$. In the second case we have $\lambda'_{m+1} - \mu'_{m+1} > n$, so according to [4] (p. 347, formula (7.15)) the skew function $P_{\lambda/\mu}(x_1, \dots, x_n, q, t) = 0$. Thus we have shown that the Macdonald polynomials $P_\lambda(z, q, t)$ with $\lambda \in \bar H_{n,m}$ belong to the kernel of $\varphi$. To prove that they actually generate the kernel let us consider the image of the Macdonald polynomials $P_\lambda(z, q, t)$ with $\lambda \in H_{n,m}$. From the formula (22) it follows that the leading term in lexicographic order of $\varphi(P_\lambda(z, q, t))$ has a form
\[ (-1)^{|\mu|}\, \frac{H(\mu, q, t)}{H(\mu', t, q)}\, x_1^{\lambda_1} \cdots x_n^{\lambda_n}\, y_1^{\mu'_1} \cdots y_m^{\mu'_m}, \]
where $\mu = (\lambda_{n+1}, \lambda_{n+2}, \dots)$. From the definition $\varphi(P_\lambda(z, q, t)) \in \Lambda_{n,m,q,t}$. It is clear that all these polynomials corresponding to the diagrams contained in the fat hook are linearly independent. The theorem is proved.

Note that we have also shown that for generic $q, t$ the restriction of the Macdonald polynomials on the subvariety $\Delta_{n,m,q,t} \subset \mathcal{M}$
\[ SP_\lambda(x, y, q, t) = \varphi(P_\lambda(z, q, t)), \qquad \lambda \in H_{n,m}, \tag{23} \]
forms a basis in $\Lambda_{n,m,q,t}$. By analogy with the Jack polynomials case (see e.g. [2]) we call $SP_\lambda(x, y, q, t)$ the super Macdonald polynomials.
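The smallest instances of Theorem 5.6 can be checked by hand or by machine. The sketch below is an illustration under stated assumptions, not the authors' computation: it expresses $P_{(2)} = m_{(2)} + u\, m_{(1,1)}$ and $P_{(1,1)} = m_{(1,1)}$ through power sums, applies $\varphi$ by substituting the deformed Newton sums (12), and evaluates the result at rational points. The value $u = (1+q)(t-1)/(t-q)$ is an assumption (Macdonald's coefficient from [4] after $t \to t^{-1}$), the degenerate hooks $(n, m) = (0, 1)$ and $(1, 0)$ are used for simplicity, and all function names are hypothetical.

```python
from fractions import Fraction


def deformed_newton_sum(r, xs, ys, q, t):
    """phi(p_r) = p_r(x, y, q, t) from (12)."""
    return sum(x ** r for x in xs) + (1 - q ** r) / (1 - t ** r) * sum(y ** r for y in ys)


def phi_P2(xs, ys, q, t):
    """phi(P_(2)), using m_(2) = p_2 and m_(1,1) = (p_1^2 - p_2) / 2."""
    u = (1 + q) * (t - 1) / (t - q)  # assumed coefficient of m_(1,1) in P_(2)
    p1 = deformed_newton_sum(1, xs, ys, q, t)
    p2 = deformed_newton_sum(2, xs, ys, q, t)
    return (1 - u / 2) * p2 + (u / 2) * p1 ** 2


def phi_P11(xs, ys, q, t):
    """phi(P_(1,1)) = phi(m_(1,1))."""
    p1 = deformed_newton_sum(1, xs, ys, q, t)
    p2 = deformed_newton_sum(2, xs, ys, q, t)
    return (p1 ** 2 - p2) / 2


q, t = Fraction(2, 3), Fraction(5, 7)
x, y = Fraction(3), Fraction(7, 5)

# (n, m) = (0, 1): the hook allows lambda_1 <= 1, so (2) lies outside and (1,1) inside.
print(phi_P2([], [y], q, t), phi_P11([], [y], q, t))
# (n, m) = (1, 0): the hook allows lambda_2 <= 0, so (1,1) lies outside and (2) inside.
print(phi_P11([x], [], q, t), phi_P2([x], [], q, t))
```

In both cases the polynomial indexed by the partition outside the hook evaluates to zero, while the one indexed by the partition inside the hook does not.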
Corollary 5.7. Let $f \in \Lambda_t$ be a shifted symmetric function and $M^f_{q,t} = \chi(f) \in \mathcal{D}_{q,t}$ be the corresponding difference operator commuting with the MR operator $M_{q,t}$. Then for generic $q, t$ there exists a difference operator $M^f_{n,m,q,t}$ commuting with the deformed MR operator (1) such that the following diagram is commutative:
\[ \begin{array}{ccc} \Lambda & \xrightarrow{\; M^f_{q,t} \;} & \Lambda \\ \downarrow \varphi & & \downarrow \varphi \\ \Lambda_{n,m,q,t} & \xrightarrow{\; M^f_{n,m,q,t} \;} & \Lambda_{n,m,q,t} \end{array} \]
All the operators $M^f_{n,m,q,t}$ commute with each other and the super Macdonald polynomials (23) are their joint eigenfunctions.
Remark. For special $q, t$ of the form $q = t^k$, where $k$ is a nonnegative integer, the homomorphism $\varphi$ can be passed through the finite dimension $N = n + mk$: $\varphi = \phi \circ \varphi_N$, where $\varphi_N : \Lambda \to \Lambda_N$ is the standard map (all $z_i$ except the first $N$ go to zero), and $\phi : \Lambda_N \to \Lambda_{n,m,q,t}$ is a homomorphism such that $\phi(z_i) = x_i$, $i = 1, \dots, n$, $\phi(z_{n+kl+j}) = t^{j-1} y_l$, $l = 1, \dots, m$, $j = 1, \dots, k$. When $m = 1$ the corresponding ideal (the kernel of $\phi$) was investigated by B. Feigin, Jimbo, Miwa and Mukhin in [9], who also described it in terms of Macdonald polynomials, but their description is much more complicated than in the generic parameters case. The case $k = 2$, $m = 2$ was studied in [14].

We investigate the homomorphism $f \to M^f_{n,m,q,t}$ in more detail in the next section, but here we finish with the following result mentioned at the beginning of this section. Recall that the parameters $q, t$ are called special if $q^a = t^b$ for some non-negative integers $a, b$ not equal to zero simultaneously.

Theorem 5.8. If the parameters $q, t$ are non-special then $\Lambda_{n,m,q,t}$ is generated by the deformed Newton sums $p_r(x, y, q, t)$.

The main idea is standard for this kind of result (see e.g. [1]). We show that the dimensions of the homogeneous components of the algebra $\Lambda_{n,m,q,t}$ and its subalgebra $N_{n,m,q,t}$ generated by the deformed Newton sums are the same and thus these two algebras coincide. More precisely, we prove that these dimensions coincide with the number of the corresponding Young diagrams contained in the fat $(n, m)$-hook (see Fig. 1 above).

Lemma 5.9. If $q, t$ are not special then the dimension of the homogeneous component of $\Lambda_{n,m;q,t}$ of degree $N$ is less than or equal to the number of partitions $\lambda$ of $N$ such that $\lambda_{n+1} \le m$.

Proof. For a given partition $\nu$ consider the set of all different partitions $\hat\nu$, which one can get from $\nu$ by eliminating at most one part of it (or one row in the Young diagram representation). We have the following obvious formula
\[ m_\nu(x_1, x_2, \dots, x_n) = \sum_{\hat\nu \cup (a) = \nu} x_1^a\, m_{\hat\nu}(x_2, \dots, x_n), \tag{24} \]
where $(a)$ denotes the row of length $a$. Any element $f \in \Lambda_{n,m,q,t}$ can be written in the form
\[ f = \sum_{\lambda, \mu} c(\lambda, \mu)\, m_\lambda(x_1, x_2, \dots, x_n)\, m_\mu(y_1, y_2, \dots, y_m). \]
Then from (11) and (24) we have the linear system
\[ \sum_{a + b = p} (q^a - t^b)\, c(\lambda \cup (a), \mu \cup (b)) = 0, \tag{25} \]
where $a, b, p$ are nonnegative integers and $\lambda, \mu$ are partitions such that $\lambda_n = 0$, $\mu_m = 0$. Consider the set $X_N(n, m)$ of pairs $\lambda, \mu$ of partitions such that $|\lambda| + |\mu| = N$, $\lambda_{n+1} = 0$, $\mu_{m+1} = 0$. Then we have the disjoint union
\[ X_N(n, m) = D_N(n, m) \cup R_N(n, m), \]
where D N (n, m) is the subset of pairs (λ, µ) of partitions such that λn+1 ≥ µ1 . The pairs from R N (n, m) will be called irregular. Note that the corresponding Young diagrams λ∪µ , where µ corresponds to the transposed Young diagram, are precisely those which are contained in the fat (n, m)-hook. We would like to show that one can express c(λ, µ) for all other partitions as linear combinations of those from D N (n, m). Consider any (λ, µ) ∈ R N (n, m) : λ = (λ1 , . . . , λn ), µ = (µ1 , . . . , µs ). Define k = {i | λi < µ1 } and µ(q) = µs−q+1 + · · · + µs for any integer 0 < q ≤ s. Introduce the following partial order on R N (n, m) : (λ, µ) ≺ (λ˜ , µ) ˜ if and only if k < k˜ or one of the following conditions is fulfilled: ˜ + 1), k = k˜ and for q = min{λn , λ˜ n }, µ(q + 1) < µ(q k = k˜ and for q = min{λn , λ˜ n }, µ(q + 1) = µ(q ˜ + 1) and λn < λ˜ n . We prove the lemma by induction in N (x) = {y ≺ x | y ∈ R N (n, m)}. Let (λ, µ) ∈ R N (n, m). Consider Eq. (25) where λˆ = (λ1 , . . . , λn−1 ) and µˆ = (µ1 , . . . , µs−λn −1 , µs−λn +1 , . . . , µs ) and p = λn + µs−λn . It is easy to see that the equation contains c(λ, µ) with a non-zero coefficient and (λ, µ) is maximal among all irregular pairs in this equation. Now let us prove the theorem. To show that Nn,m,q,t = n,m,q,t it is enough to prove that the dimension of the homogeneous component of degree N of Nn,m,q,t is not less than D N (n, m). To produce enough independent polynomials consider the super Macdonald polynomials (23), S Pλ (x, y, q, t) = ϕ(Pλ (z, q, t)), λ ∈ Hn,m . From the formula (22) it follows that the leading term of S Pλ (x, y, q, t) in lexicographic order has a form <λ1 −n>
\[ x_1^{\lambda_1} \cdots x_n^{\lambda_n}\, y_1^{\langle \lambda'_1 - n \rangle} \cdots y_m^{\langle \lambda'_m - n \rangle}, \]
where $\lambda'$ is the partition conjugate to $\lambda$ and $\langle x \rangle = \frac{x + |x|}{2} = \max(0, x)$. From the definition $\varphi(P_\lambda(z, q, t)) \in N_{n,m,q,t}$. It is clear that all these polynomials corresponding to the diagrams contained in the fat hook are linearly independent. This completes the proof of Theorem 5.8.

6. Shifted Super Macdonald Polynomials and Harish-Chandra Homomorphism

Let again $P_{n,m} = \mathbb{C}[x_1, \dots, x_n, y_1, \dots, y_m]$ be a polynomial algebra in $n + m$ independent variables. The following algebra $\Lambda^\natural_{n,m,q,t}$ can be considered as a shifted version of the algebra $\Lambda_{n,m,q,t}$. It consists of the polynomials $p(x_1, \dots, x_n, y_1, \dots, y_m)$, which are symmetric in $x_1, x_2 t, \dots, x_n t^{n-1}$ and $y_1, y_2 q, \dots, y_m q^{m-1}$ separately and satisfy the conditions
\[ T_{q,x_i}(f) = T_{t,y_j}(f) \tag{26} \]
on each hyperplane $x_i t^{i-1} - y_j q^{j-1} = 0$ for $i = 1, \dots, n$ and $j = 1, \dots, m$.
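The dimension count behind Theorem 5.8 is a purely combinatorial statement about partitions in the fat $(n, m)$-hook, and it can be explored with a few lines of code. The sketch below enumerates partitions of $N$ with $\lambda_{n+1} \le m$ and compares their number with the number of pairs $(\lambda, \mu)$ obtained by cutting a fat-hook diagram after row $n$ and transposing the lower part. The description of those pairs ($\lambda$ with at most $n$ parts, $\mu$ with at most $m$ parts, and $l(\mu) \le \lambda_n$) is my reading of the construction and should be treated as an assumption, as are the function names.

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def in_fat_hook(lam, n, m):
    """True if lambda_{n+1} <= m, i.e. the diagram fits in the fat (n, m)-hook."""
    return len(lam) <= n or lam[n] <= m


def fat_hook_count(N, n, m):
    """Number of partitions of N contained in the fat (n, m)-hook."""
    return sum(1 for lam in partitions(N) if in_fat_hook(lam, n, m))


def regular_pair_count(N, n, m):
    """Number of pairs (lam, mu), |lam| + |mu| = N, that glue to a fat-hook diagram."""
    count = 0
    for k in range(N + 1):
        for lam in partitions(k):
            if len(lam) > n:
                continue
            last = lam[n - 1] if n >= 1 and len(lam) == n else 0  # lambda_n
            for mu in partitions(N - k):
                if len(mu) <= m and (n == 0 or len(mu) <= last):
                    count += 1
    return count


print([fat_hook_count(N, 1, 1) for N in range(1, 7)])
```

For $n = m = 1$ the fat hook consists of ordinary hook-shaped diagrams, so the count for weight $N \ge 1$ is simply $N$.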
Now we are going to define the homomorphism $\varphi^\natural$ which is a shifted version of the homomorphism $\varphi$ from the previous section. Recall that $H_{n,m}$ denotes the set of partitions $\lambda$ whose diagrams are contained in the fat $(n, m)$-hook. Consider the following map $F : H_{n,m} \to \mathbb{C}^{n+m}$: $F(\lambda) = (p_1, \dots, p_n, q_1, \dots, q_m)$, where
\[ p_i = q^{\lambda_i}, \qquad q_j = t^{\mu'_j}\, t^n, \]
and $\mu = (\lambda_{n+1}, \lambda_{n+2}, \dots)$. The image $F(H_{n,m})$ is dense in $\mathbb{C}^{n+m}$ with respect to the Zariski topology. Indeed, this is clearly true already for the subset consisting of the corresponding partitions with $\lambda_n \ge m$. The homomorphism $\varphi^\natural : \Lambda_t \to \mathbb{C}[x_1, \dots, x_n, y_1, \dots, y_m]$ is defined by the relation
\[ \varphi^\natural(f)(p_1, \dots, p_n, q_1, \dots, q_m) = f(q^\lambda), \]
where $(p_1, \dots, p_n, q_1, \dots, q_m) \in F(H_{n,m})$ and $\lambda = F^{-1}(p_1, \dots, p_n, q_1, \dots, q_m)$. In other words, we consider the shifted symmetric function $f$ as a function on the partitions from the fat hook and re-write it in the new coordinates. The fact that as a result we will have a polynomial is not obvious.

Lemma 6.1. The image $\varphi^\natural(f)$ of a shifted symmetric function $f \in \Lambda_t$ is a polynomial. For the shifted power sums $p_r^*(z, t)$ it can be given by the following explicit formula:
\[ \varphi^\natural\bigl( p_r^*(z, t) \bigr) = \sum_{i=1}^{n} \bigl( x_i^r - 1 \bigr) t^{r(i-1)} + \frac{1 - q^r}{1 - t^r} \sum_{j=1}^{m} \bigl( y_j^r - t^{rn} \bigr) q^{r(j-1)}. \tag{27} \]
Proof. Assume that $z_i = q^{\lambda_i}$, where $\lambda \in H_{n,m}$. Then we have
\[ \varphi^\natural\bigl( p_r^*(z, t) \bigr) = \sum_{i \ge 1} \bigl( q^{r\lambda_i} - 1 \bigr) t^{r(i-1)} = \sum_{i=1}^{n} \bigl( q^{r\lambda_i} - 1 \bigr) t^{r(i-1)} + t^{rn} \sum_{i \ge 1} \bigl( q^{r\lambda_{n+i}} - 1 \bigr) t^{r(i-1)}. \]
Now using (8) we have
\[ \sum_{i \ge 1} \bigl( q^{r\lambda_{n+i}} - 1 \bigr) t^{r(i-1)} = \frac{1 - q^r}{1 - t^r} \sum_{j=1}^{m} \bigl( t^{r\mu'_j} - 1 \bigr) q^{r(j-1)}, \]
which proves the formula (27). Since the shifted sums generate $\Lambda_t$ this implies the first part of the lemma as well.

Theorem 6.2. If the parameters $q, t$ are generic then the image of the homomorphism $\varphi^\natural$ coincides with the algebra $\Lambda^\natural_{n,m,q,t}$ and the kernel of $\varphi^\natural$ is spanned by the shifted Macdonald polynomials $P_\lambda^*(z, q, t)$ corresponding to the Young diagrams which are not contained in the fat $(n, m)$-hook.
Proof. The first claim follows from Lemma 6.1 and Theorem 5.8. To prove the statement about the kernel consider a shifted Macdonald polynomial $P_\lambda^*(z, q, t)$ with $\lambda \in \bar H_{n,m}$. Let $\mu$ be a partition whose diagram is contained in the fat $(n, m)$-hook. Since this implies that the diagram of $\lambda$ is not a subset of the one of $\mu$, according to the Extra Vanishing Property of shifted Macdonald polynomials (see Sect. 3) we have $P_\lambda^*(q^\mu, q, t) = 0$. Thus we have shown that $P_\lambda^*(z, q, t)$ with $\lambda \in \bar H_{n,m}$ belong to the kernel of $\varphi^\natural$. To show that they generate the kernel one should note that
\[ \varphi^\natural(P_\lambda^*(z, q, t)) = \varphi(P_\lambda(z, q, t))(x_1, x_2 t, \dots, x_n t^{n-1}, y_1, y_2 q, \dots, y_m q^{m-1}) + \dots, \]
where dots mean the terms of degree less than $|\lambda|$. From Theorem 5.6 it follows that $\varphi^\natural(P_\lambda^*(z, q, t))$ with $\lambda \in H_{n,m}$ are linearly independent. The theorem is proved.

Corollary 6.3. For generic $q, t$ the functions
\[ SP_\lambda^*(x, y, q, t) = \varphi^\natural(P_\lambda^*(z, q, t)) \]
with $\lambda \in H_{n,m}$ form a basis in $\Lambda^\natural_{n,m,q,t}$.

We will call the polynomials $SP_\lambda^*(x, y, q, t)$ the shifted super Macdonald polynomials. We are going to show that to any such polynomial corresponds a quantum integral of the deformed MR system. Let us consider the algebra of difference operators in $n + m$ variables with rational coefficients belonging to $\mathbb{C}[x_1, \dots, x_n, y_1, \dots, y_m, (x_i - x_j)^{-1}, (x_i - y_l)^{-1}, (y_k - y_l)^{-1}]$, $1 \le i < j \le n$, $1 \le l < k \le m$. We denote it as $\mathcal{D}(n, m)$.

Theorem 6.4. For generic values of $q, t$ there exists a unique monomorphism $\psi : \Lambda^\natural_{n,m,q,t} \to \mathcal{D}(n, m)$ such that the following diagram is commutative:
\[ \begin{array}{ccc} \Lambda_t & \xrightarrow{\; \chi \;} & \mathcal{D}(q, t) \\ \downarrow \varphi^\natural & & \downarrow \mathrm{res} \\ \Lambda^\natural_{n,m,q,t} & \xrightarrow{\; \psi \;} & \mathcal{D}(n, m), \end{array} \]
where $\chi$ is the inverse Harish-Chandra homomorphism and $\mathrm{res}$ is the operation of restriction onto $\Delta_{n,m,q,t}$ described by Corollary 5.7.

Indeed let $f$ be a shifted symmetric function from $\Lambda_t$, and let $M^f_{q,t}$ and $M^f_{n,m,q,t} = \mathrm{res}(M^f_{q,t})$ be the same as in Corollary 5.7. We know that if $P_\lambda(z, q, t)$ is a Macdonald symmetric function then
\[ M^f_{n,m,q,t}\, \varphi(P_\lambda(z, q, t)) = f(q^\lambda)\, \varphi(P_\lambda(z, q, t)). \]
Therefore according to Theorem 5.6 $M^f_{n,m,q,t} \equiv 0$ if and only if $f(q^\lambda) = 0$ for any $\lambda$ with the diagram contained in the fat $(n, m)$-hook. Now from Theorem 6.2 it follows that $\mathrm{Ker}(\mathrm{res} \circ \chi) = \mathrm{Ker}\, \varphi^\natural$.
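The change of coordinates $F$ and the explicit formula (27) of Lemma 6.1 can be tested on concrete partitions. The following sketch (hypothetical helper names; the coordinates are taken as $x_i = q^{\lambda_i}$ and $y_j = t^{\mu'_j + n}$ with $\mu = (\lambda_{n+1}, \lambda_{n+2}, \dots)$, which is an assumption about where the conjugation sits) evaluates the shifted power sum at $q^\lambda$ directly and compares it with the right-hand side of (27) at $F(\lambda)$, in exact arithmetic.

```python
from fractions import Fraction


def fat_hook_coordinates(lam, n, m, q, t):
    """F(lambda) = (x_1..x_n, y_1..y_m) for a partition inside the fat (n, m)-hook."""
    rows = list(lam) + [0] * max(0, n - len(lam))
    mu = rows[n:]
    if any(part > m for part in mu):
        raise ValueError("partition is not contained in the fat (n, m)-hook")
    mu_conj = [sum(1 for part in mu if part > j) for j in range(m)]
    xs = [q ** part for part in rows[:n]]
    ys = [t ** (c + n) for c in mu_conj]
    return xs, ys


def shifted_power_sum_at(r, lam, q, t):
    """p_r^* evaluated at z_i = q^{lambda_i} (terms with lambda_i = 0 vanish)."""
    return sum((q ** (r * part) - 1) * t ** (r * i) for i, part in enumerate(lam))


def rhs_27(r, xs, ys, q, t):
    """Right-hand side of formula (27)."""
    n = len(xs)
    x_part = sum((x ** r - 1) * t ** (r * i) for i, x in enumerate(xs))
    y_part = sum((y ** r - t ** (r * n)) * q ** (r * j) for j, y in enumerate(ys))
    return x_part + (1 - q ** r) / (1 - t ** r) * y_part


q, t = Fraction(2, 3), Fraction(5, 7)
lam = (4, 2, 2, 1)           # lambda_2 = 2 <= 2, so lam lies in the fat (1, 2)-hook
xs, ys = fat_hook_coordinates(lam, 1, 2, q, t)
gaps = [shifted_power_sum_at(r, lam, q, t) - rhs_27(r, xs, ys, q, t) for r in (1, 2, 3)]
print(gaps)
```

The special case $n = 0$, $r = 1$ of this comparison is exactly the row/column identity (8).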
7. Combinatorial Formulas

In this section we give some combinatorial formulas for the super Macdonald polynomials and shifted super Macdonald polynomials generalising the results by Okounkov [8]. Let us recall his results. A tableau $T$ on $\lambda$ is called a reverse tableau if its entries decrease strictly downwards in each column and weakly rightwards in each row. By $T(s)$ we denote the entry in the box $s \in \lambda$. The following combinatorial formula for the shifted Macdonald polynomial was proven by Okounkov in [8]:
\[ P_\lambda^*(x_1, \dots, x_N, q, t) = \sum_{T} \psi_T(q, t) \prod_{s \in \lambda} \bigl( x_{T(s)} - q^{a'(s)} t^{l'(s)} \bigr) t^{T(s)-1}, \tag{28} \]
where $a'(s)$ and $l'(s)$ are defined for a box $s = (i, j)$ as $a'(s) = j - 1$, $l'(s) = i - 1$. Here the sum is taken over all reverse tableaux on $\lambda$ with entries in $\{1, \dots, N\}$ and $\psi_T(q, t)$ is the same weight as in the combinatorial formulas for the ordinary Macdonald polynomials (see [4], VI, (7.13')) interpreted in terms of reverse tableau:
\[ P_\lambda(x_1, \dots, x_N, q, t) = \sum_{T} \psi_T(q, t) \prod_{s \in \lambda} x_{T(s)}, \tag{29} \]
\[ P_{\lambda/\mu}(x_1, \dots, x_N, q, t) = \sum_{T} \psi_T(q, t) \prod_{s \in \lambda/\mu} x_{T(s)}. \tag{30} \]
In the last formula the sum is taken over all reverse tableaux of shape $\lambda/\mu$ with entries in $\{1, \dots, N\}$. Let us now consider a reverse bitableau $T$ of type $(n, m)$ and shape $\lambda$. We can view $T$ as a filling of the Young diagram $\lambda$ by symbols $1 < 2 < \dots < n < 1' < 2' < \dots < m'$ with entries decreasing weakly downwards in each column and rightwards in each row; additionally, entries $1, 2, \dots, n$ decrease strictly downwards in each column and entries $1', 2', \dots, m'$ decrease strictly rightwards in each row. Here is an example of a reverse bitableau of type (3, 2): 2 2 2 1
1 3 3 3 3 2 2 1
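The definition of a reverse bitableau can be checked by brute-force enumeration. The following is only a minimal sketch, not part of the paper: the function names and the integer encoding of the symbols (1..n for unprimed symbols, n+1..n+m standing for 1', ..., m') are assumptions made here for illustration.

```python
from itertools import product


def cells(shape):
    """Boxes (row, col), 0-indexed, of the Young diagram of a partition."""
    return [(i, j) for i, row_len in enumerate(shape) for j in range(row_len)]


def _box_ok(T, box, n):
    """Check the bitableau conditions between a box and its lower/right neighbours."""
    i, j = box
    here = T[box]
    below = T.get((i + 1, j))
    right = T.get((i, j + 1))
    if below is not None:
        if below > here:                 # entries decrease weakly downwards
            return False
        if below == here and here <= n:  # unprimed symbols: strictly downwards
            return False
    if right is not None:
        if right > here:                 # entries decrease weakly rightwards
            return False
        if right == here and here > n:   # primed symbols: strictly rightwards
            return False
    return True


def reverse_bitableaux(shape, n, m):
    """Yield all reverse bitableaux of type (n, m) on the given shape.

    Encoding (an assumption of this sketch): integers 1..n are the unprimed
    symbols and n+1..n+m stand for 1', ..., m', so every primed symbol is
    larger than every unprimed one.
    """
    boxes = cells(shape)
    for values in product(range(1, n + m + 1), repeat=len(boxes)):
        T = dict(zip(boxes, values))
        if all(_box_ok(T, box, n) for box in boxes):
            yield T


def count(shape, n, m):
    return sum(1 for _ in reverse_bitableaux(shape, n, m))
```

With m = 0 the enumeration reduces to ordinary reverse tableaux, so for instance `count((2, 1), 3, 0)` counts the reverse tableaux of shape (2, 1) with entries in {1, 2, 3}. The exhaustive search is exponential in the number of boxes and is only meant for small shapes.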
Let $T_1$ be the subtableau of $T$ containing all symbols $1', 2', \dots, m'$, let $\mu$ be its shape and $T_0 = T - T_1$. Note that the conjugate tableau $T_1'$ is a usual reverse tableau (if we ignore prime symbols) and $T_0$ is a usual reverse skew tableau of shape $\lambda/\mu$. In the rest of this section $x_{j'}$ will be denoted by $y_j$.
Theorem 7.1. For generic values of the parameters $q, t$ the super Macdonald polynomials can be written as
\[ SP_\lambda(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m, q, t) = \sum_T \psi_T(q,t) \prod_{s\in\lambda} x_{T(s)}, \tag{31} \]
where the sum is taken over all reverse bitableaux $T$ of type $(n, m)$ and shape $\lambda$ and
\[ \psi_T(q,t) = (-1)^{|\mu|}\, \psi_{T_1'}(t,q)\, \psi_{T_0}(q,t)\, \frac{H(\mu, q, t)}{H(\mu', t, q)} \]
with H (µ, q, t) defined by (5). The proof follows directly from the formulas (22),(29),(30). We are going to present now a combinatorial formula for the shifted super Macdonald polynomial. Theorem 7.2. The following formula holds: S Pλ∗ = x T (s) − q a (s) t l (s) (q, t; s)T (s)−1 , ψT (q, t)
(32)
s∈λ
T
where the sum is taken over the same set of reverse bitableaux as in the previous theorem, (q, t; s) = q if s ∈ T1 and (q, t; s) = t if s ∈ T0 . Proof. Let us consider the skew diagram λ/µ in the formula (29) and define ∗ x T (s) − q a (s) t l (s) t T (s)−1 . Pλ/µ (x, q, t) = ψT (q, t) s∈λ/µ
T
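Okounkov in [8] proved the following formula:
\[ P^*_\lambda(z_1, z_2, \dots, q, t) = \sum_{\mu\prec\lambda} \psi_{\lambda/\mu}(q,t) \prod_{s\in\lambda/\mu} \left( z_1 - q^{a'(s)} t^{l'(s)} \right) t^{|\mu|}\, P^*_\mu(z_2, z_3, \dots, q, t), \tag{33} \]
where $\mu \prec \lambda$ means $\lambda_{i+1} \le \mu_i \le \lambda_i$ and $\psi_{\lambda/\mu}(q,t)$ is the same coefficient as for the ordinary Macdonald polynomials [4]:
\[ P_\lambda(z_1, z_2, \dots, q, t) = \sum_{\mu\prec\lambda} \psi_{\lambda/\mu}(q,t)\, z_1^{|\lambda/\mu|}\, P_\mu(z_2, z_3, \dots, q, t). \]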
Strictly speaking Okounkov proved this for finitely many variables. To make sense of formula (33) in infinite dimension we embed the algebra t to C[z 1 ] ⊗ t by sending pk∗ to z 1k − 1 + t k pk∗ . Applying Okounkov’s formula n times we get ∗ Pλ∗ (z 1 , z 2 . . . , q, t) = Pλ/µ (z 1 , z 2 , . . . , z n , q, t)t n|µ| Pµ∗ (z n+1 , z n+2 . . . , q, t), µ⊂λ
which implies ϕ (Pλ∗ (z 1 , z 2 . . . , q, t)) =
µ⊂λ
∗ ∗ Pλ/µ (x1 , x2 , . . . , xn , q, t)t n|µ| ωq,t
×(Pµ∗ (z n+1 , z n+2 . . . , q, t)).
Now using the duality (6) we have ϕ (Pλ∗ (z 1 , z 2 . . . , q, t)) =
µ⊂λ
∗ Pλ/µ (x1 , x2 , . . . , xn , q, t)
H (µ, q, t) ∗ P H (µ , t, q) µ
×(y1 , y2 . . . , ym , t, q). But according to formula (28), Pµ∗ (y1 , y2 . . . , ym , t, q) = =
T1
T1
ψT1 (t, q) ψT1 (t, q)
s ∈µ
x T1 (s ) − t a (s ) q l (s ) q T1 (s )−1
x T1 (s) − q a (s) t l (s) q T1 (s)−1 .
s∈µ
Therefore ϕ (Pλ∗ (z 1 , z 2 . . . , q, t)) H (µ, q, t) a (s) l (s) ψ x t T (s)−1 t |µ| (q, t) (s) − q t = T T H (µ , t, q) s∈λ/µ T (s) l (s) a x T1 (s) − q q T1 (s)−1 , ×ψT1 (t, q) t s∈µ
and the theorem is proved.
8. Concluding Remarks
Haglund, Haiman and Loehr [16] recently proved a remarkable new combinatorial formula for Macdonald polynomials, previously proposed by Haglund. This formula is much more effective than Macdonald's original formula (29) and provides a new direct way to prove the main properties of Macdonald polynomials. A natural problem is to find an analogue of Haglund's formula for the super Macdonald polynomials. We would like to mention here recent work by Ram and Yip [17], where Haglund's formula was generalised to other Lie algebras. Another interesting open problem is to find generalisations of our results for the deformed analogues of the Macdonald operators related to other Lie superalgebras, in particular for the deformed Koornwinder operators (see their rational limit in [18]).
Acknowledgements. We are grateful to A. Okounkov for useful and stimulating discussions. Special thanks go to the anonymous referee for an excellent job, which helped us to improve the paper. This work has been partially supported by the EPSRC (grant EP/E004008/1), European Union through the FP6 Marie Curie RTN ENIGMA (Contract number MRTN-CT-2004-5652) and ESF programme MISGAM.
References
1. Sergeev, A.N., Veselov, A.P.: Deformed quantum Calogero-Moser problems and Lie superalgebras. Commun. Math. Phys. 245(2), 249–278 (2004)
2. Sergeev, A.N., Veselov, A.P.: Generalised discriminants, deformed Calogero-Moser-Sutherland operators and super-Jack polynomials. Adv. Math. 192(2), 341–375 (2005)
3. Ruijsenaars, S.N.M.: Complete integrability of relativistic Calogero-Moser systems and elliptic function identities. Commun. Math. Phys. 110(2), 191–213 (1987)
4. Macdonald, I.: Symmetric Functions and Hall Polynomials. 2nd edition, Oxford: Oxford Univ. Press, 1995
5. Knop, F., Sahi, S.: Difference equations and symmetric polynomials defined by their zeros. Internat. Math. Res. Notices 10, 473–486 (1996)
6. Knop, F.: Symmetric and non-symmetric quantum Capelli polynomials. Comment. Math. Helv. 72(1), 84–100 (1997)
7. Sahi, S.: Interpolation, integrality, and a generalization of Macdonald's polynomials. Internat. Math. Res. Notices 10, 457–471 (1996)
8. Okounkov, A.: (Shifted) Macdonald polynomials: q-integral representation and combinatorial formula. Compositio Math. 112(2), 147–182 (1998)
9. Feigin, B., Jimbo, M., Miwa, T., Mukhin, E.: Symmetric polynomials vanishing on the shifted diagonals and Macdonald polynomials. Int. Math. Res. Not. no. 18, 1015–1034 (2003)
10. Chalykh, O.A.: Macdonald polynomials and algebraic integrability. Adv. Math. 166(2), 193–259 (2002)
11. Veselov, A.P., Feigin, M.V., Chalykh, O.A.: New integrable deformations of quantum Calogero-Moser problem. Russ. Math. Surv. 51(3), 185–186 (1996)
12. Cherednik, I.: Double affine Hecke algebras and Macdonald's conjectures. Ann. of Math. (2) 141(1), 191–216 (1995)
13. Kirillov, A.N., Noumi, M.: Affine Hecke algebras and raising operators for Macdonald polynomials. Duke Math. J. 93(1), 1–39 (1998)
14. Kasatani, M., Miwa, T., Sergeev, A.N., Veselov, A.P.: Coincident root loci and Jack and Macdonald polynomials for special values of the parameters. In: Jack, Hall-Littlewood and Macdonald Polynomials, Contemp. Math., 417, Providence, RI: Amer. Math. Soc., 2006, pp. 207–225
15. Atiyah, M., Macdonald, I.G.: Introduction to Commutative Algebra. Reading, MA: Addison-Wesley, 1969
16. Haglund, J., Haiman, M., Loehr, N.: A combinatorial formula for Macdonald polynomials. J. Amer. Math. Soc. 18(3), 735–761 (2005)
17. Ram, A., Yip, M.: A combinatorial formula for Macdonald polynomials. http://arXiv.org/abs/0803.1146v1 [math.CO], 2008
18. Sergeev, A.N., Veselov, A.P.: BC-infinity Calogero-Moser operator and super Jacobi polynomials. http://arXiv.org/abs/0807.3858v2 [math-ph], 2008
Communicated by L. Takhtajan
Commun. Math. Phys. 288, 677–697 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0714-z
Communications in Mathematical Physics
Non-Kaehler Heterotic String Compactifications with Non-Zero Fluxes and Constant Dilaton
Marisa Fernández (1), Stefan Ivanov (2), Luis Ugarte (3), Raquel Villacampa (3)
(1) Universidad del País Vasco, Facultad de Ciencia y Tecnología, Departamento de Matemáticas, Apartado 644, 48080 Bilbao, Spain.
(2) University of Sofia "St. Kl. Ohridski", Faculty of Mathematics and Informatics, Blvd. James Bourchier 5, 1164 Sofia, Bulgaria.
(3) Departamento de Matemáticas - I.U.M.A., Universidad de Zaragoza, Campus Plaza San Francisco, 50009 Zaragoza, Spain.
Received: 21 April 2008 / Accepted: 7 October 2008 Published online: 22 January 2009 – © Springer-Verlag 2009
Abstract: We construct new explicit compact supersymmetric valid solutions with non-zero field strength, non-flat instanton and constant dilaton to the heterotic equations of motion in dimension six. We present balanced Hermitian structures on compact nilmanifolds in dimension six satisfying the heterotic supersymmetry equations with non-zero flux, non-flat instanton and constant dilaton which obey the three-form Bianchi identity with curvature term taken with respect to either the Levi-Civita, the (+)-connection or the Chern connection. Among them, all our solutions with respect to the (+)-connection on the compact nilmanifold $M_3$ satisfy the heterotic equations of motion.
Contents
1. Introduction. Field and Killing-Spinor Equations
2. The Supersymmetry Equations in Dimension 6
2.1 SU(3)-structures in d = 6
2.2 Proof of Theorem 1.1
2.3 Heterotic supersymmetry with constant dilaton
3. General Preliminaries
3.1 Six-dimensional balanced Hermitian nilmanifolds
4. The Iwasawa Manifold Revisited
4.1 Cardoso et al. abelian instanton
5. A Family of Balanced Hermitian Structures on the Lie Algebra $\mathfrak{h}_3$
6. Balanced Hermitian Structures on the Lie Algebras $\mathfrak{h}_2$, $\mathfrak{h}_4$ and $\mathfrak{h}_5$
7. The Space of Balanced Structures on $\mathfrak{h}_6$
8. Balanced Structures on the Lie Algebra $\mathfrak{h}^-_{19}$
References
M. Fernández, S. Ivanov, L. Ugarte, R. Villacampa
1. Introduction. Field and Killing-Spinor Equations
The bosonic fields of the ten-dimensional supergravity which arises as a low energy effective theory of the heterotic string are the spacetime metric $g$, the NS three-form field strength $H$, the dilaton $\phi$ and the gauge connection $A$ with curvature $F^A$. In this paper, we consider the bosonic geometry to be of the form $\mathbb{R}^{1,9-d} \times M^d$, where the bosonic fields are non-trivial only on $M^d$, $d \le 8$. We consider the two connections
\[ \nabla^{\pm} = \nabla^g \pm \frac{1}{2} H, \]
where $\nabla^g$ is the Levi-Civita connection of the Riemannian metric $g$. Both connections preserve the metric, $\nabla^{\pm} g = 0$, and have totally skew-symmetric torsion $\pm H$, respectively. The Green-Schwarz anomaly cancellation mechanism requires that the three-form Bianchi identity receives an $\alpha'$ correction of the form
\[ dH = \frac{\alpha'}{4}\, 8\pi^2 \left( p_1(M^d) - p_1(E) \right) = \frac{\alpha'}{4} \left( \mathrm{Tr}(R \wedge R) - \mathrm{Tr}(F^A \wedge F^A) \right), \tag{1.1} \]
where $p_1(M^d)$, $p_1(E)$ are the first Pontrjagin forms of $M^d$ with respect to a connection $\nabla$ with curvature $R$ and the vector bundle $E$ with connection $A$, respectively. A class of heterotic-string backgrounds for which the Bianchi identity of the three-form $H$ receives a correction of type (1.1) are those with (2,0) world-volume supersymmetry. Such models were considered in [31]. The target-space geometry of (2,0)-supersymmetric sigma models has been extensively investigated in [28,31,39]. Recently, there has been revived interest in these models [9,21–24] as string backgrounds and in connection with heterotic-string compactifications with fluxes [2–5,8,19,20,36]. In writing (1.1) there is a subtlety in the choice of connection $\nabla$ on $M^d$ since anomalies can be cancelled independently of the choice [29]. Different connections correspond to different regularization schemes in the two-dimensional worldsheet non-linear sigma model. Hence the background fields given for the particular choice of $\nabla$ must be related to those for a different choice by a field redefinition [38].
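Connections on $M^d$ proposed to investigate the anomaly cancellation (1.1) are $\nabla^g$ [23,39], $\nabla^+$ [9] and very recently [14], $\nabla^-$ [6,8,24,29,32,35], and the Chern connection $\nabla^c$ when $d = 6$ [5,19,20,36,39]. A heterotic geometry will preserve supersymmetry if and only if, in 10 dimensions, there exists at least one Majorana-Weyl spinor $\epsilon$ such that the supersymmetry variations of the fermionic fields vanish, i.e. the following Killing-spinor equations hold [39]:
\[ \begin{aligned} \delta_\lambda &= \nabla_m \epsilon = \left( \nabla^g_m + \frac{1}{4} H_{mnp} \Gamma^{np} \right) \epsilon = \nabla^+ \epsilon = 0, \\ \delta_\Psi &= \left( \Gamma^m \partial_m \phi - \frac{1}{12} H_{mnp} \Gamma^{mnp} \right) \epsilon = \left( d\phi - \frac{1}{2} H \right) \cdot \epsilon = 0, \\ \delta_\xi &= F^A_{mn} \Gamma^{mn} \epsilon = F^A \cdot \epsilon = 0, \end{aligned} \tag{1.2} \]
where $\lambda$, $\Psi$, $\xi$ are the gravitino, the dilatino and the gaugino fields, respectively, and $\cdot$ means Clifford action of forms on spinors. The bosonic part of the ten-dimensional supergravity action in the string frame is ([6], $R = R^-$)
\[ S = \frac{1}{2k^2} \int d^{10}x \sqrt{-g}\, e^{-2\phi} \left[ \mathrm{Scal}^g + 4 (\nabla^g \phi)^2 - \frac{1}{2} |H|^2 - \frac{\alpha'}{4} \left( \mathrm{Tr}\,|F^A|^2 - \mathrm{Tr}\,|R|^2 \right) \right]. \tag{1.3} \]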
Heterotic String Compactifications with Non-Zero Fluxes and Constant Dilaton
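The string frame field equations (the equations of motion induced from the action (1.3)) of the heterotic string up to two-loops [30] in sigma model perturbation theory are (we use the notations in [24])
\[ \begin{aligned} & \mathrm{Ric}^g_{ij} - \frac{1}{4} H_{imn} H_j{}^{mn} + 2 \nabla^g_i \nabla^g_j \phi - \frac{\alpha'}{4} \left[ (F^A)_{imab} (F^A)_j{}^{mab} - R_{imnq} R_j{}^{mnq} \right] = 0; \\ & \nabla^g_i \left( e^{-2\phi} H^i{}_{jk} \right) = 0; \\ & \nabla^+_i \left( e^{-2\phi} (F^A)^i{}_j \right) = 0. \end{aligned} \tag{1.4} \]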
The field equation of the dilaton $\phi$ is implied by the first two equations above. The first compact torsional solutions for the heterotic/type I string were obtained via duality from M-theory compactifications on K3 × K3 proposed in [13]. The metric was first written down on the orientifold limit in [13] and such backgrounds have since been studied (see [2,3] and references therein). The metric and the $H$-flux are derived by applying a chain of supergravity dualities and the resulting geometry in the heterotic theory is a $T^2$ bundle over a K3. A compact example solving (1.2) and (1.1) with non-zero field strength, constant dilaton and taking $R = R^+$ is constructed in [9] on the Iwasawa nilmanifold, which is a $T^2$ bundle over $T^4$. However, it has been pointed out in [23] that this example is not a valid solution due to a sign error in the torsional equation derived from the first two equations in (1.2), which leads to the opposite sign in the left hand side of (1.1). A compact example of a balanced 6-manifold with constant dilaton, non-trivial warp factor and torsion generated by the Chern-Simons term only was presented very recently in [14]. Compact examples in dimension six solving (1.2) and (1.1) with non-zero flux $H$ and non-constant dilaton were constructed by Li and Yau [36] for U(4) and U(5) principal bundles taking $R = R^c$, the curvature of the Chern connection, in (1.1). Non-Kaehler compact solutions of (1.2) and (1.1) on some torus bundles over a Calabi-Yau 4-manifold (K3 surfaces or complex torus) provided in [25] are presented by Fu and Yau [19,20] using the Chern connection in (1.1). It is confirmed in [5] that the examples of torus bundles over the complex torus cannot be solutions to (1.2) and (1.1) taken with respect to the curvature of the Chern connection $R = R^c$ with $\alpha' > 0$, while some torus bundles over K3 surfaces are valid solutions.
It is known [15,22] ([24] for dimension 6) that the equations of motion of type I supergravity are automatically satisfied with R = 0 if one imposes, in addition to the preserving supersymmetry equations (1.2), the three-form Bianchi identity (1.1) taking with respect to a flat connection on T M, R = 0. According to no-go (vanishing) theorems (a consequence of the equations of motion [15,17]; a consequence of the supersymmetry [33,34] for the SU(n)-case and [23] for the general case) there are no compact solutions with non-zero flux and non-constant dilaton satisfying simultaneously the supersymmetry equations (1.2) and the three-form Bianchi identity (1.1) with T r (R ∧ R) = 0. However, in the presence of a curvature term R the solution of the supersymmetry equations (1.2) and the anomaly cancellation condition (1.1) obey the second and the third equations of motion but do not always satisfy the Einstein equations of motion (the first equation in (1.4)). If R is an SU(3)-instanton then (1.2) and (1.1) imply (1.4). This can be seen from the considerations in the Appendix of [22]. We give a quadratic expression for R which is a necessary and sufficient condition in order that (1.2) and (1.1) imply (1.4) based on the properties of the special geometric structure induced from the first two equations in (1.2). More precisely, we prove in Sect. 2.2 the following:
Theorem 1.1. Let $(M, J, g, F^A, R)$ be a conformally balanced Hermitian manifold with Kähler form $F$ which solves the heterotic Killing spinor equations (1.2) and the anomaly cancellation (1.1).
a) The Einstein equations of motion (the first equation in (1.4)) are a consequence of the heterotic Killing spinor equations (1.2) and the anomaly cancellation (1.1) if and only if the following identity holds:
\[ \frac{1}{2} \left[ R_{msab} R_{pqab} + R_{mpab} R_{qsab} + R_{mqab} R_{spab} \right] F^{pq} J^s_n = R_{mpqr} R_n{}^{pqr}. \tag{1.5} \]
• If $R$ is a (1,1)-form, $J^s_m J^p_n R_{spab} = R_{mnab}$, then (1.5) is equivalent to
\[ R_{mjab} R_{klab} F^{kl} = 0. \tag{1.6} \]
In particular, the Einstein equations of motion with respect to either the Chern connection or the (–)-connection are a consequence of the heterotic Killing spinor equations (1.2) and the anomaly cancellation (1.1) if and only if (1.6) holds.
• If $R$ is an SU(3)-instanton then (1.5) holds.
b) If $R^-$ is an SU(3)-instanton, $\mathrm{Hol}(\nabla^+) \subset su(3)$ and the manifold is compact then the flux $H$ vanishes, the dilaton is constant and the manifold is a Calabi-Yau space.
As a consequence of Theorem 1.1, considering solutions involving the Chern connection, one may study stability of the tangent bundle. The main goal of this paper is to construct explicit compact valid solutions with non-zero field strength, non-flat instanton and constant dilaton to the heterotic equations of motion (1.4) in dimension six. We present compact nilmanifolds in dimension six satisfying the heterotic supersymmetry equations (1.2) with non-zero flux $H \neq 0$, non-flat instanton $F^A \neq 0$ and constant dilaton obeying the three-form Bianchi identity (1.1) with curvature term $R = R^g$, $R = R^+$ or $R = R^c$. Some of them are torus bundles over the complex torus but this does not violate the non-existence result in [5] since we use a different curvature term ($R^g$ or $R^+$) in (1.1). In particular, we present a valid solution on the Iwasawa manifold but with respect to a non-standard complex structure. We find compact valid solutions to (1.2) with non-zero flux, non-flat instanton and constant dilaton satisfying the anomaly cancellation condition (1.1) using the curvature $R^c$ of the Chern connection on an $S^1$ bundle over a 5-manifold which is a $T^2$ bundle over $T^3$. None of these manifolds admits a Kaehler metric, and they seem to be the first explicit compact valid supersymmetric heterotic solutions to (1.2) and (1.1) with non-zero flux, non-flat instanton and constant dilaton in dimension six. However, because of Theorem 1.1, the Einstein equations of motion (the first equation in (1.4)) are not satisfied in most cases.
Only the solutions constructed in Theorem 5.1 b), Theorem 5.2 b) on the compact nilmanifold $M_3 = \Gamma \backslash H(2,1) \times S^1$, where $H(2,1)$ is the 5-dimensional Heisenberg group and $\Gamma$ is a lattice, solve in addition the heterotic equations of motion (1.4) with non-zero fluxes and constant dilaton. It seems that these are the first compact supersymmetric solutions to the heterotic equations of motion with non-zero flux $H \neq 0$, non-flat instanton $F^A \neq 0$ and constant dilaton in dimension six. Our convention for the curvature is given in Sect. 3.
2. The Supersymmetry Equations in Dimension 6
Necessary and sufficient conditions to have a solution to the system of gravitino and dilatino equations (the first two equations in (1.2)) in dimension 6 were derived by
Strominger in [39] involving the notion of SU(n)-structure and then studied by many authors [2,3,5,8,9,19–24,32,36].
2.1. SU(3)-structures in d = 6. Let $(M, J, g)$ be an almost Hermitian 6-manifold with Riemannian metric $g$ and almost complex structure $J$, i.e. $(J, g)$ define a U(3)-structure. The Nijenhuis tensor $N$, the Kaehler form $F$ and the Lee form $\theta$ are defined by
\[ N(\cdot,\cdot) = [J\cdot, J\cdot] - [\cdot,\cdot] - J[J\cdot,\cdot] - J[\cdot, J\cdot], \qquad F(\cdot,\cdot) = g(\cdot, J\cdot), \qquad \theta(\cdot) = \delta F(J\cdot), \]
respectively, where $*$ is the Hodge operator and $\delta$ is the co-differential, $\delta = -{*}d{*}$. An SU(3)-structure is determined by an additional non-degenerate (3,0)-form $\Psi = \Psi^+ + i\Psi^-$, or equivalently by a non-trivial spinor. The subgroup of SO(6) fixing the forms $F$ and $\Psi$ simultaneously is SU(3). The Lie algebra of SU(3) is denoted $su(3)$. The failure of the holonomy group of the Levi-Civita connection to reduce to SU(3) can be measured by the intrinsic torsion $\tau$, which is identified with $\nabla^g F$ or $\nabla^g J$ and can be decomposed into five classes [10], $\tau \in W_1 \oplus \dots \oplus W_5$. The intrinsic torsion of a U(n)-structure belongs to the first four components described by Gray-Hervella [27]. The five components of an SU(3)-structure are first described by Chiossi-Salamon [10] (for interpretation in physics see [9]) and are determined by $dF$, $d\Psi^+$, $d\Psi^-$ as well as by $dF$ and $N$. The Hermitian manifolds belong to $W_3 \oplus W_4$. In the paper we are interested in the class $W_3$ of balanced Hermitian manifolds [37], which is characterized by the conditions $N = 0$, $\theta = 0$ or, equivalently, $N = 0$, $d{*}F = 0$. Necessary conditions to solve the gravitino equation (the first equation in (1.2)) are given in [18]. The presence of a parallel spinor in dimension 6 leads firstly to the reduction to U(3), i.e. the existence of an almost Hermitian structure, secondly to the existence of a linear connection preserving the almost Hermitian structure with torsion 3-form and thirdly to the reduction of the holonomy group of the torsion connection to SU(3), i.e. its Ricci 2-form has to be identically zero. It is shown in [18] that there exists a unique linear connection preserving an almost Hermitian structure having totally skew-symmetric torsion if and only if the Nijenhuis tensor is a 3-form, i.e. the intrinsic torsion $\tau \in W_1 \oplus W_3 \oplus W_4$. The torsion connection $\nabla^+$ with torsion $T$ is determined by
\[ \nabla^+ = \nabla^g + \frac{1}{2} T, \qquad T = J\,dF + N = -dF(J\cdot, J\cdot, J\cdot) + N. \]
Necessary and sufficient conditions to solve the gravitino equation (the first equation in (1.2)) in dimension 6 are given in [32]. Namely, there exists a unique linear connection with torsion 3-form which preserves the almost Hermitian structure whose holonomy is contained in SU(3) if and only if the first Chern class vanishes, $c_1(M, J) = 0$, and the SU(3)-structure $(M, g, F, \Psi^+, \Psi^-)$ satisfies the differential equations [32]
\[ d\Psi^+ = \theta \wedge \Psi^+ - \frac{1}{4} (N, \Psi^+) * F, \qquad d\Psi^- = \theta \wedge \Psi^- - \frac{1}{4} (N, \Psi^-) * F. \tag{2.1} \]
The torsion $T$ is given by $T = -{*}dF + {*}(\theta \wedge F) + \frac{1}{4}(N, \Psi^+)\Psi^+ + \frac{1}{4}(N, \Psi^-)\Psi^-$. Necessary and sufficient conditions to solve the gravitino and dilatino equations (the first two equations in (1.2)) are presented in [39]. The dilatino equation forces the almost complex structure to be integrable ($N = 0$) and the Lee form to be closed (for applications in physics the Lee form has to be exact) determined by the dilaton due to
$\theta = 2d\phi$ [39]. The three-form field strength is given by $H = T = -dF(J\cdot, J\cdot, J\cdot) = -{*}dF + {*}(2d\phi \wedge F)$. Solutions with constant dilaton are those with zero Lee form, $dF^{n-1} = 0$, i.e. balanced Hermitian manifolds. When the almost complex structure is integrable, $N = 0$, the torsion connection $\nabla^+$ is also known as the Bismut connection (we shall call it the Bismut-Strominger (B-S) connection) and was used by Bismut to prove a local index theorem for the Dolbeault operator on non-Kaehler Hermitian manifolds [7]. This formula was recently applied in string theory [3]. Vanishing theorems for the Dolbeault cohomology on compact non-Kaehler Hermitian manifolds were found in terms of the B-S connection [1,33,34]. In addition to these equations, the vanishing of the gaugino variation (the third equation in (1.2)) requires the non-zero 2-form $F^A$ to be of instanton type ([12,23,39]), namely a Donaldson-Uhlenbeck-Yau SU(3)-instanton, i.e. the gauge field $A$ is a connection on a holomorphic vector bundle with curvature 2-form $F^A \in su(3)$. The SU(3)-instanton condition can be written in local holomorphic coordinates in the form [12,39]
\[ F^A_{\alpha\beta} = F^A_{\bar\alpha\bar\beta} = 0, \qquad F^A_{\alpha\bar\beta} F^{\alpha\bar\beta} = 0. \]
2.2. Proof of Theorem 1.1. A consequence of the gravitino and dilatino equations (the first two equations in (1.2)) is the expression of the Ricci tensor $\mathrm{Ric}^+_{mn} = R^+_{imnj} g^{ij}$ of the (+)-connection established in [34], Proposition 3.1:
\[ \mathrm{Ric}^+_{mn} = -2 \nabla^+_m d\phi_n - \frac{1}{4} dT_{mspq} J^s_n F^{pq} = -2 \nabla^g_m d\phi_n + d\phi_s T^s_{mn} - \frac{1}{4} dT_{mspq} J^s_n F^{pq}. \tag{2.2} \]
The four-form $dT = dJdF$ is a (2,2)-form with respect to the complex structure $J$. Therefore, the last term in (2.2) is symmetric. On the other hand, the Ricci tensors of $\nabla^g$ and $\nabla^+$ are connected by (see e.g. [18])
\[ \mathrm{Ric}^g_{mn} = \mathrm{Ric}^+_{mn} + \frac{1}{4} T_{mpq} T_n{}^{pq} - \frac{1}{2} \nabla^+_s T^s_{mn}, \qquad \mathrm{Ric}^+_{mn} - \mathrm{Ric}^+_{nm} = \nabla^+_s T^s_{mn} = \nabla^g_s T^s_{mn}, \tag{2.3} \]
\[ \mathrm{Ric}^g_{mn} = \frac{1}{2} \left( \mathrm{Ric}^+_{mn} + \mathrm{Ric}^+_{nm} \right) + \frac{1}{4} T_{mpq} T_n{}^{pq}. \tag{2.4} \]
Substitute (2.2) into (2.4), insert the result into the first equation of (1.4) and use the anomaly cancellation (1.1) to conclude (1.5). If $R$ is a (1,1)-form then (1.6) is a consequence of (1.5). It is well known that the curvature of the Chern connection $R^c$ is always a (1,1)-form. When $\mathrm{Hol}(\nabla^+) \subset su(3)$ the curvature $R^-$ of the (–)-connection is also a (1,1)-form. This follows from the well known identity
\[ dT_{ijkl} = 2R^+_{ijkl} - 2R^-_{klij} \tag{2.5} \]
and the fact that $dT$ is a (2,2)-form. This completes the proof of a). The proof of b) is essentially contained in [33,34]. Indeed, if $\mathrm{Hol}(\nabla^+) \subset su(3)$ and $R^-$ is an SU(3)-instanton, (2.5) yields $dT_{ispq} F^{pq} = 0$, i.e. the manifold is almost strong in the terminology of [34]. Then Corollary 4.2 a) in [34] asserts that there are no holomorphic (3,0)-forms, which contradicts the result in [39] unless $T = d\phi = 0$. This completes the proof of Theorem 1.1.
2.3. Heterotic supersymmetry with constant dilaton. We look for a compact Hermitian 6-manifold $(M, J, g)$ which satisfies the following conditions:
a) Gravitino equation (the first equation in (1.2)): $\mathrm{Hol}(\nabla^+) \subset su(3)$, i.e.
\[ \sum_{i=1}^{6} (\Omega^+)^{E_i}_{JE_i} = 0, \tag{2.6} \]
where $\{E_1, \dots, E_6\}$ is an orthonormal basis on $M$.
b) Dilatino equation (the second equation in (1.2)) with constant dilaton: the Lee form $\theta = 2d\phi = 0$, i.e. $(M, J, g)$ is a balanced manifold.
c) Gaugino equation (the third equation in (1.2)): look for a Hermitian vector bundle $E$ of rank $r$ over $M$ equipped with an SU(3)-instanton, i.e. a connection $A$ with curvature 2-form $\Omega^A$ satisfying
\[ (\Omega^A)^i_j (JE_k, JE_l) = (\Omega^A)^i_j (E_k, E_l), \qquad \sum_{k=1}^{6} (\Omega^A)^i_j (E_k, JE_k) = 0. \tag{2.7} \]
d) Anomaly cancellation condition:
\[ dH = dT = \frac{\alpha'}{4}\, 8\pi^2 \left( p_1(M) - p_1(A) \right), \qquad \alpha' > 0. \tag{2.8} \]
3. General Preliminaries
For a linear connection $\nabla$, the connection 1-forms $\omega^i_j$ with respect to a fixed basis $E_1, \dots, E_6$ are $\omega^i_j(E_k) = g(\nabla_{E_k} E_j, E_i)$ since we write $\nabla_X E_j = \omega^1_j(X) E_1 + \dots + \omega^6_j(X) E_6$. The curvature 2-forms $\Omega^i_j$ of $\nabla$ are given in terms of the connection 1-forms $\omega^i_j$ by
\[ \Omega^i_j = d\omega^i_j + \omega^i_k \wedge \omega^k_j, \qquad \Omega_{ji} = d\omega_{ji} + \omega_{ki} \wedge \omega_{jk}, \qquad R^l_{ijk} = \Omega^l_k(E_i, E_j), \qquad R_{ijkl} = R^s_{ijk}\, g_{ls}, \]
and the first Pontrjagin class is represented by the 4-form
\[ p_1(\nabla) = \frac{1}{8\pi^2} \sum_{1 \le i < j \le 6} \Omega^i_j \wedge \Omega^i_j. \]
Let $(M, J, g)$ be a 6-dimensional Hermitian manifold. Consider the connections with torsion $\nabla^{\pm}$ given by $\nabla^{\pm} = \nabla^g \pm \frac{1}{2} T$ with torsion $T$ given by
\[ T = J\,dF = -{*}dF. \tag{3.1} \]
Notice that $\nabla^+$ is precisely the B-S connection of the Hermitian structure. The Chern connection $\nabla^c$ is defined by
\[ \nabla^c = \nabla^g + \frac{1}{2} C, \qquad C(\cdot,\cdot,\cdot) = dF(J\cdot,\cdot,\cdot). \]
Observe that the tensor field $C$ satisfies that $C(X,\cdot,\cdot) = (JX \lrcorner\, dF)(\cdot,\cdot)$ is a 2-form on $M$.
Let us suppose that $(J, g)$ is a left invariant Hermitian structure on a 6-dimensional Lie group $G$ and let $\{e^1, \dots, e^6\}$ be an orthonormal basis of left invariant 1-forms, that is, $g = e^1 \otimes e^1 + \dots + e^6 \otimes e^6$. Let
\[ d e^k = \sum_{1 \le i < j \le 6} a^k_{ij}\, e^i \wedge e^j, \qquad k = 1, \dots, 6, \]
be the structure equations in the basis $\{e^k\}$. Let us denote by $\{E_1, \dots, E_6\}$ the dual basis. Since $de^k(E_i, E_j) = -e^k([E_i, E_j])$, we have that the Levi-Civita connection 1-forms $(\omega^g)^i_j$ are
\[ (\omega^g)^i_j(E_k) = -\frac{1}{2} \left( g(E_i, [E_j, E_k]) - g(E_k, [E_i, E_j]) + g(E_j, [E_k, E_i]) \right) = \frac{1}{2} \left( a^i_{jk} - a^k_{ij} + a^j_{ki} \right). \tag{3.2} \]
The connection 1-forms $(\omega^{\pm})^i_j$ for the connections with torsion $\nabla^{\pm}$ are given by
\[ (\omega^{\pm})^i_j(E_k) = (\omega^g)^i_j(E_k) + \frac{1}{2} (T^{\mp})^i_j(E_k), \qquad (T^{\pm})^i_j(E_k) = T^{\pm}(E_i, E_j, E_k) = \mp dF(JE_i, JE_j, JE_k). \tag{3.3} \]
The connection 1-forms $(\omega^c)^i_j$ for the Chern connection $\nabla^c$ are determined by
\[ (\omega^c)^i_j(E_k) = (\omega^g)^i_j(E_k) + \frac{1}{2} C^i_j(E_k), \qquad C^i_j(E_k) = dF(JE_k, E_i, E_j). \tag{3.4} \]
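Formula (3.2) is easy to evaluate mechanically once the structure constants are known. The following is a minimal sketch under stated assumptions: the function names are hypothetical, indices run from 1 to 6, and the sample structure equations $de^5 = t(e^{13} - e^{24})$, $de^6 = t(e^{14} + e^{23})$ (all other $de^k = 0$) are chosen here purely for illustration. It computes $(\omega^g)^i_j(E_k)$ and lets one confirm that the result is skew in $(i, j)$ (metric compatibility) and torsion-free.

```python
from fractions import Fraction

DIM = 6  # indices run over 1..DIM; index 0 is unused padding


def structure_constants(d_e):
    """Build a[k][i][j] = a^k_{ij}, extended antisymmetrically in (i, j).

    d_e maps k to a dict {(i, j): coefficient} with i < j, encoding
    d e^k = sum_{i<j} a^k_{ij} e^i ^ e^j.
    """
    a = [[[Fraction(0)] * (DIM + 1) for _ in range(DIM + 1)] for _ in range(DIM + 1)]
    for k, two_form in d_e.items():
        for (i, j), coeff in two_form.items():
            a[k][i][j] += Fraction(coeff)
            a[k][j][i] -= Fraction(coeff)
    return a


def levi_civita(a):
    """omega[i][j][k] = (omega^g)^i_j(E_k) = (a^i_{jk} - a^k_{ij} + a^j_{ki}) / 2."""
    omega = [[[Fraction(0)] * (DIM + 1) for _ in range(DIM + 1)] for _ in range(DIM + 1)]
    for i in range(1, DIM + 1):
        for j in range(1, DIM + 1):
            for k in range(1, DIM + 1):
                omega[i][j][k] = (a[i][j][k] - a[k][i][j] + a[j][k][i]) / 2
    return omega


def sample_algebra(t=1):
    """Illustrative structure equations: de^5 = t(e^13 - e^24), de^6 = t(e^14 + e^23)."""
    return {5: {(1, 3): t, (2, 4): -t}, 6: {(1, 4): t, (2, 3): t}}


a = structure_constants(sample_algebra())
omega = levi_civita(a)
```

For example, `omega[5][3][1]` evaluates $(\omega^g)^5_3(E_1) = \frac{1}{2}(a^5_{31} - a^1_{53} + a^3_{15})$ for the sample algebra. Exact rational arithmetic is used so that the skew-symmetry and torsion checks can be done with equality rather than a tolerance.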
We shall focus on six-dimensional nilmanifolds $M = \Gamma \backslash G$ endowed with an invariant (integrable almost) complex structure $J$. According to Proposition 6.1 in [16], for invariant Hermitian metrics on compact nilmanifolds the balanced condition is equivalent to $\mathrm{Hol}(\nabla^+) \subset su(3)$. The equivalence of the conditions a) and b) in Subsect. 2.3 can also be derived from (2.1) and the fact, established in [16], that for any invariant Hermitian structure on a nilmanifold the (3,0)-form $\Psi = \Psi^+ + i\Psi^-$ is closed.
3.1. Six-dimensional balanced Hermitian nilmanifolds. Next we review the main results given in [40] concerning balanced $J$-Hermitian metrics on $M$ in order to apply them to the construction of solutions to Eqs. (2.6)-(2.8) above. First of all, if $(M, J)$ admits a balanced $J$-Hermitian metric (not necessarily invariant) then the Lie algebra $\mathfrak{g}$ of $G$ is isomorphic to $\mathfrak{h}_1, \dots, \mathfrak{h}_6$ or $\mathfrak{h}^-_{19}$, where $\mathfrak{h}_1 = (0, 0, 0, 0, 0, 0)$ is the abelian Lie algebra and
\[ \begin{aligned} \mathfrak{h}_2 &= (0, 0, 0, 0, 12, 34), & \mathfrak{h}_5 &= (0, 0, 0, 0, 13 + 42, 14 + 23), \\ \mathfrak{h}_3 &= (0, 0, 0, 0, 0, 12 + 34), & \mathfrak{h}_6 &= (0, 0, 0, 0, 12, 13), \\ \mathfrak{h}_4 &= (0, 0, 0, 0, 12, 14 + 23), & \mathfrak{h}^-_{19} &= (0, 0, 0, 12, 23, 14 - 35). \end{aligned} \]
Here $\mathfrak{h}_5$ is the Lie algebra underlying the Iwasawa manifold. For the canonical complex structure $J_0$ on $\mathfrak{h}_5$ there exists a complex basis $\{\omega^j\}_{j=1}^3$ of 1-forms of type (1,0) satisfying $d\omega^1 = d\omega^2 = 0$ and $d\omega^3 = \omega^{12}$.
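The tuples above list $de^1, \dots, de^6$, with "13 + 42" standing for $e^1 \wedge e^3 + e^4 \wedge e^2$. As a sanity check, each list defines a Lie algebra exactly when $d^2 = 0$ on the generators (the Jacobi identity). The following is a minimal sketch that parses this notation and verifies the condition; the helper names and the dictionary representation of forms (sorted index tuples mapped to integer coefficients) are assumptions of this sketch, not notation from the paper.

```python
import re


def sort_sign(indices):
    """Return (sign, sorted tuple) for a wedge of basis 1-forms; sign 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for end in range(len(idx) - 1, 0, -1):  # bubble sort, counting transpositions
        for p in range(end):
            if idx[p] > idx[p + 1]:
                idx[p], idx[p + 1] = idx[p + 1], idx[p]
                sign = -sign
    return sign, tuple(idx)


def wedge(a, b):
    """Wedge product of two forms stored as {sorted index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            sign, key = sort_sign(ia + ib)
            if sign:
                out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def parse_two_form(text):
    """'13+42' -> {(1, 3): 1, (2, 4): -1}; '0' -> {}."""
    form = {}
    for sgn, i, j in re.findall(r"([+-]?)(\d)(\d)", text):
        sign, key = sort_sign((int(i), int(j)))
        form[key] = form.get(key, 0) + (-sign if sgn == "-" else sign)
    return form


def algebra(spec):
    """Map generator index k (1-based) to the 2-form d e^k."""
    return {k: parse_two_form(s) for k, s in enumerate(spec, start=1)}


def d(form, dgen):
    """Exterior derivative, extended from the generators by the Leibniz rule."""
    out = {}
    for idx, c in form.items():
        for pos, k in enumerate(idx):
            term = wedge(wedge({idx[:pos]: 1}, dgen[k]), {idx[pos + 1:]: 1})
            for key, v in term.items():
                out[key] = out.get(key, 0) + (-1) ** pos * c * v
    return {k: v for k, v in out.items() if v != 0}


def d_squared_vanishes(dgen):
    return all(d(two_form, dgen) == {} for two_form in dgen.values())


ALGEBRAS = {
    "h1": ("0", "0", "0", "0", "0", "0"),
    "h2": ("0", "0", "0", "0", "12", "34"),
    "h3": ("0", "0", "0", "0", "0", "12+34"),
    "h4": ("0", "0", "0", "0", "12", "14+23"),
    "h5": ("0", "0", "0", "0", "13+42", "14+23"),
    "h6": ("0", "0", "0", "0", "12", "13"),
    "h19-": ("0", "0", "0", "12", "23", "14-35"),
}
```

Calling `d_squared_vanishes(algebra(ALGEBRAS[name]))` for each name checks the Jacobi identity. For the 2-step algebras this is immediate, since every $de^k$ is built from closed forms; only $\mathfrak{h}^-_{19}$ requires an actual cancellation.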
Since the Lie algebras $\mathfrak{h}_2, \dots, \mathfrak{h}_6$ are 2-step nilpotent, for any complex structure $J$ ($\neq J_0$ for $\mathfrak{h}_5$) there is a basis $\{\omega^j\}_{j=1}^3$ of (1,0)-forms such that
\[ d\omega^1 = d\omega^2 = 0, \qquad d\omega^3 = \rho\, \omega^{12} + \omega^{1\bar 1} + B\, \omega^{1\bar 2} + D\, \omega^{2\bar 2}, \tag{3.5} \]
where $B, D \in \mathbb{C}$, and $\rho = 0, 1$. In particular, $J$ is a nilpotent complex structure on $\mathfrak{h}_2, \dots, \mathfrak{h}_6$ in the sense of [11]. Recall that a complex structure $J$ on a $2n$-dimensional nilpotent Lie algebra $\mathfrak{g}$ is called nilpotent if there is a basis $\{\omega^j\}_{j=1}^n$ of (1,0)-forms satisfying $d\omega^1 = 0$ and
\[ d\omega^j \in \textstyle\bigwedge^2 \langle \omega^1, \dots, \omega^{j-1}, \bar\omega^1, \dots, \bar\omega^{j-1} \rangle \]
for $j = 2, \dots, n$. No complex structure on the Lie algebra $\mathfrak{h}^-_{19}$ is nilpotent, and there is a (1,0)-basis $\{\omega^j\}_{j=1}^3$ satisfying
\[ d\omega^1 = 0, \qquad d\omega^2 = E\, \omega^{13} + \omega^{1\bar 3}, \qquad d\omega^3 = C\, \omega^{1\bar 1} + ia\, \omega^{1\bar 2} - ia \bar E\, \omega^{2\bar 1}, \tag{3.6} \]
where $E \in \mathbb{C}$ with $|E| = 1$, $\bar C = C E$ and $a \in \mathbb{R} - \{0\}$. Now, the fundamental form $F$ of any invariant $J$-Hermitian structure is given in terms of the basis $\{\omega^j\}_{j=1}^3$ by
\[ 2F = i \left( r^2 \omega^{1\bar 1} + s^2 \omega^{2\bar 2} + t^2 \omega^{3\bar 3} \right) + u\, \omega^{1\bar 2} - \bar u\, \omega^{2\bar 1} + v\, \omega^{2\bar 3} - \bar v\, \omega^{3\bar 2} + z\, \omega^{1\bar 3} - \bar z\, \omega^{3\bar 1}, \tag{3.7} \]
where $r, s, t \in \mathbb{R} - \{0\}$ and $u, v, z \in \mathbb{C}$ must satisfy those restrictions coming from the positive definiteness of the associated metric $g(X, Y) = -F(X, JY)$. The following result gives necessary and sufficient conditions, in terms of the different coefficients involved, in order for the Hermitian structure to be balanced.
Proposition 3.1. [40]. In the notation above, we have:
(i) If $J$ is a nonnilpotent complex structure defined by (3.6), then $(J, F)$ is balanced if and only if
\[ z = -iuv/s^2 \qquad \text{and} \qquad C s^2 + a \bar E u + a \bar u = 0. \]
(ii) If $J$ is a nilpotent complex structure defined by (3.5), then $(J, F)$ is balanced if and only if
\[ s^2 t^2 - |v|^2 + D \left( r^2 t^2 - |z|^2 \right) = B \left( i t^2 \bar u - v \bar z \right). \]
4. The Iwasawa Manifold Revisited
Apart from the abelian Lie algebra, $\mathfrak{h}_5$ is the only 6-dimensional nilpotent Lie algebra which can be given a complex Lie algebra structure. The corresponding complex parallelizable nilmanifold is the well-known Iwasawa manifold. This manifold is studied in [9]; however, as pointed out in the introduction, this example is not a valid solution due to a sign error in the torsional equation. More generally, we show in Remark 4.1 that there are no valid solutions on the Iwasawa manifold with respect to the standard
M. Fernández, S. Ivanov, L. Ugarte, R. Villacampa
complex structure and any invariant compatible Hermitian metric, a fact which leads us to study general complex nilmanifolds in the subsequent sections.

The standard complex structure $J_0$ on $h_5$ is defined by the following complex structure equations:

$d\omega^1 = d\omega^2 = 0, \quad d\omega^3 = \omega^{12}.$

For any $t \neq 0$, let us consider $F$ given by

$F = \frac{i}{2}(\omega^{1\bar 1} + \omega^{2\bar 2} + t^2\omega^{3\bar 3}).$

It is easy to see that the Hermitian structure $(J_0, F)$ is balanced for any value of the parameter $t$. Notice that the Iwasawa manifold is a $T^2$ bundle over $T^4$, where the parameter $t$ scales the fiber. From a real point of view, let us consider the real basis of 1-forms $\{e^1, \dots, e^6\}$ given by $e^1 + i\,e^2 = \omega^1$, $e^3 + i\,e^4 = \omega^2$, $e^5 + i\,e^6 = t\,\omega^3$. In terms of this basis, the structure equations are

$de^1 = de^2 = de^3 = de^4 = 0, \quad de^5 = t\,e^{13} - t\,e^{24}, \quad de^6 = t\,e^{14} + t\,e^{23},$ (4.1)
the complex structure $J_0$ is given by $J_0e^1 = -e^2$, $J_0e^3 = -e^4$, $J_0e^5 = -e^6$, and the $J_0$-Hermitian metric $g = e^1\otimes e^1 + \dots + e^6\otimes e^6$ has the associated fundamental form $F = e^{12} + e^{34} + e^{56}$. The structure equations (4.1) give $dF = t\,e^{136} - t\,e^{145} - t\,e^{235} - t\,e^{246}$. Apply (3.1) to verify that the torsion $T$ of $\nabla^+$ satisfies

$T = -t\,e^{135} - t\,e^{146} - t\,e^{236} + t\,e^{245}, \quad dT = -4t^2 e^{1234}.$

All the curvature forms $(\Omega^c)^i_j$ of the Chern connection vanish. In view of (3.2) and (3.3), the non-zero curvature forms $(\Omega^g)^i_j$ and $(\Omega^\pm)^i_j$ for the Levi-Civita connection and the connections $\nabla^\pm$ are given by:

$(\Omega^g)^1_2 = \frac{t^2}{2}(e^{34} - e^{56}), \quad (\Omega^g)^1_3 = -\frac{t^2}{4}(3e^{13} - e^{24}), \quad (\Omega^g)^1_4 = -\frac{t^2}{4}(3e^{14} + e^{23}),$
$(\Omega^g)^1_5 = -(\Omega^g)^2_6 = \frac{t^2}{4}(e^{15} - e^{26}), \quad (\Omega^g)^1_6 = (\Omega^g)^2_5 = \frac{t^2}{4}(e^{16} + e^{25}),$
$(\Omega^g)^2_3 = -\frac{t^2}{4}(e^{14} + 3e^{23}), \quad (\Omega^g)^2_4 = \frac{t^2}{4}(e^{13} - 3e^{24}), \quad (\Omega^g)^3_4 = \frac{t^2}{2}(e^{12} - e^{56}),$
$(\Omega^g)^3_5 = -(\Omega^g)^4_6 = \frac{t^2}{4}(e^{35} - e^{46}), \quad (\Omega^g)^3_6 = (\Omega^g)^4_5 = \frac{t^2}{4}(e^{36} + e^{45}), \quad (\Omega^g)^5_6 = -\frac{t^2}{2}(e^{12} + e^{34});$

$(\Omega^+)^1_2 = 2t^2e^{34}, \quad (\Omega^+)^1_3 = (\Omega^+)^2_4 = -t^2(e^{13} + e^{24}), \quad (\Omega^+)^1_4 = -(\Omega^+)^2_3 = -t^2(e^{14} - e^{23}),$
$(\Omega^+)^3_4 = 2t^2e^{12}, \quad (\Omega^+)^5_6 = -2t^2(e^{12} + e^{34});$

$(\Omega^-)^1_2 = (\Omega^-)^3_4 = -2t^2e^{56}, \quad (\Omega^-)^1_3 = -(\Omega^-)^2_4 = -t^2(e^{13} - e^{24}), \quad (\Omega^-)^1_4 = (\Omega^-)^2_3 = -t^2(e^{14} + e^{23}).$

Clearly $\mathrm{Hol}(\nabla^+) \subset su(3)$ and the Pontrjagin classes of the four connections are then represented by

$p_1(\nabla^g) = \frac{t^4}{4\pi^2}e^{1234}, \quad p_1(\nabla^+) = 0, \quad p_1(\nabla^-) = \frac{t^4}{\pi^2}e^{1234}, \quad p_1(\nabla^c) = 0.$ (4.2)
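The values in (4.2) can be cross-checked mechanically. The following is a minimal sketch, not the authors' computation: it assumes the normalization $p_1 = \frac{1}{8\pi^2}\sum_{i<j}\Omega^i_j\wedge\Omega^i_j$ (which reproduces (4.2)), sets $t = 1$ because every curvature form scales as $t^2$, and only tracks the $e^{1234}$ component. The helper names `wedge`, `two_form` and `p1_coeff_e1234` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def wedge(a, b):
    """Wedge product of forms stored as {sorted index tuple: coefficient}."""
    out = {}
    for (ia, ca), (ib, cb) in product(a.items(), b.items()):
        idx = ia + ib
        if len(set(idx)) < len(idx):
            continue  # a repeated 1-form kills the term
        # sign of the permutation sorting idx, via inversion count
        inv = sum(1 for p in range(len(idx))
                  for q in range(p + 1, len(idx)) if idx[p] > idx[q])
        key = tuple(sorted(idx))
        out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def two_form(**terms):
    """two_form(e13=-1, e24=-1) -> {(1, 3): -1, (2, 4): -1}."""
    return {(int(k[1]), int(k[2])): Fraction(v) for k, v in terms.items()}

def p1_coeff_e1234(curv):
    """e^{1234} coefficient of sum_{i<j} Omega^i_j ^ Omega^i_j (units of t^4 / (8 pi^2))."""
    total = Fraction(0)
    for omega in curv.values():
        total += wedge(omega, omega).get((1, 2, 3, 4), 0)
    return total

# Curvature forms of the connections nabla^+ and nabla^- on the Iwasawa manifold at t = 1
plus = {
    (1, 2): two_form(e34=2),
    (1, 3): two_form(e13=-1, e24=-1), (2, 4): two_form(e13=-1, e24=-1),
    (1, 4): two_form(e14=-1, e23=1), (2, 3): two_form(e14=1, e23=-1),
    (3, 4): two_form(e12=2),
    (5, 6): two_form(e12=-2, e34=-2),
}
minus = {
    (1, 2): two_form(e56=-2), (3, 4): two_form(e56=-2),
    (1, 3): two_form(e13=-1, e24=1), (2, 4): two_form(e13=1, e24=-1),
    (1, 4): two_form(e14=-1, e23=-1), (2, 3): two_form(e14=-1, e23=-1),
}

coeff_plus = p1_coeff_e1234(plus)    # multiply by t^4 / (8 pi^2) to get p1
coeff_minus = p1_coeff_e1234(minus)
```

Under the stated normalization, a coefficient of 0 for `plus` corresponds to $p_1(\nabla^+) = 0$ and a coefficient of 8 for `minus` to $p_1(\nabla^-) = t^4/\pi^2\, e^{1234}$.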
4.1. Cardoso et al. abelian instanton. Cardoso et al. consider in [9] an abelian field strength configuration with (1,1)-form

$\mathcal{F} = i f\, dz_1\wedge d\bar z_1 - i f\, dz_2\wedge d\bar z_2 + e^{i\gamma}\sqrt{\tfrac14 - f^2}\; dz_1\wedge d\bar z_2 - e^{-i\gamma}\sqrt{\tfrac14 - f^2}\; dz_2\wedge d\bar z_1,$

where the function $f$ satisfies

$i\,\partial_{z_2} f + \partial_{z_1}\Big(e^{-i\gamma}\sqrt{\tfrac14 - f^2}\Big) = 0, \quad i\,\partial_{z_1} f + \partial_{z_2}\Big(e^{i\gamma}\sqrt{\tfrac14 - f^2}\Big) = 0.$

Under these conditions one gets

$\mathrm{Tr}\, F_A\wedge F_A = \mathcal{F}\wedge\mathcal{F} = -\frac12\, dz_1\wedge dz_2\wedge d\bar z_1\wedge d\bar z_2.$

Here $dz_1$ and $dz_2$ denote the (1,0)-forms at the level of the Lie group, which descend to the forms $\omega^1$ and $\omega^2$ on the compact nilmanifold. Therefore, on the Iwasawa manifold we have

$\mathrm{Tr}\, F_A\wedge F_A = \frac12\,\omega^{1\bar 1}\wedge\omega^{2\bar 2} = -2\,e^{1234}.$

Now, taking $A$ as one of these abelian instantons we have that

$dT = -4t^2e^{1234} = -16\pi^2t^2\,(p_1(\nabla^+) - p_1(A)),$ (4.3)
which is not a valid solution for any t (see [23] for details). Moreover, the whole space of complex structures compatible with the canonical metric obtained when t = 1 in (4.1) is studied in [9] where the authors proved that the behavior is the same as in (4.3).
Remark 4.1. It is not difficult to prove that any $J_0$-Hermitian invariant metric $g$ is equivalent to one in the 1-parameter family given above. Since $dT = -4t^2e^{1234}$, in view of (4.2) there is no way to find a satisfactory solution with $(J_0, g)$ as the underlying Hermitian structure. Indeed, it is proved in [5] that torus bundles over the complex torus cannot be solutions to (1.2) and (1.1) taken with respect to the curvature of the Chern connection $R = R^c$ with $\alpha > 0$. Since $p_1(\nabla^c) = 0$ and $dT = -4t^2e^{1234}$, we conclude from (1.1) that $p_1(A)$ cannot be a positive multiple of $e^{1234}$ for any SU(3)-instanton $A$ on the Iwasawa manifold, which is a torus bundle over the complex torus. Hence, because of (4.2), (1.1) cannot be satisfied for any $\alpha > 0$ either for $R = R^g$ or for $R = R^\pm$. Therefore, in order to find solutions we need to consider other compact nilmanifolds, or metrics and/or complex structures different from the canonical ones on the nilmanifold underlying the Iwasawa manifold. In the following sections we show many explicit solutions.

5. A Family of Balanced Hermitian Structures on the Lie Algebra $h_3$

In this section we construct explicit solutions on a compact nilmanifold corresponding to the Lie algebra $h_3$. First we recall [40] that, up to equivalence, there exist two complex structures $J^\pm$ on $h_3$, namely

$J^\pm: \quad d\omega^1 = d\omega^2 = 0, \quad d\omega^3 = \omega^{1\bar 1} \pm \omega^{2\bar 2},$

but only $J^-$ admits compatible balanced structures. Notice that the balanced condition for $J^-$ given in Proposition 3.1 (ii) reduces to $(r^2 - s^2)t^2 = |z|^2 - |v|^2$. For any $t \neq 0$, let us consider the balanced structure $F$ given by

$F = \frac{i}{2}(\omega^{1\bar 1} + \omega^{2\bar 2} + t^2\omega^{3\bar 3}),$

which corresponds to $r = s = 1$ and $u = v = z = 0$. From a real point of view, let us consider the basis of 1-forms $\{e^1, \dots, e^6\}$ given by $e^1 + i\,e^2 = \omega^1$, $e^3 + i\,e^4 = \omega^2$, $e^5 + i\,e^6 = t\,\omega^3$. In terms of this basis, we have the structure equations

$de^1 = de^2 = de^3 = de^4 = de^5 = 0, \quad de^6 = -2t\,e^{12} + 2t\,e^{34},$ (5.1)
and the complex structure $J = J^-$ is given by $Je^1 = -e^2$, $Je^3 = -e^4$, $Je^5 = -e^6$. The balanced $J$-Hermitian metric $g = e^1\otimes e^1 + \dots + e^6\otimes e^6$ has the associated fundamental form $F = e^{12} + e^{34} + e^{56}$. The structure equations (5.1) yield $dF = 2t(e^{12} - e^{34})\wedge e^5$. For the torsion $T$ of $\nabla^+$ we calculate using (3.1), (3.2) and (3.3) that

$T = -2t(e^{12} - e^{34})\wedge e^6, \quad dT = -8t^2e^{1234}, \quad \nabla^+T = 0.$
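The expressions for $dF$ and $dT$ follow from (5.1) by the Leibniz rule alone, so they are easy to confirm by machine. Below is a hedged sketch with hypothetical helpers `wedge` and `d` acting on invariant forms stored as dictionaries; it takes $T$ as given in the text rather than deriving it from (3.1), and the value of $t$ is an arbitrary nonzero rational.

```python
from fractions import Fraction
from itertools import product

def wedge(a, b):
    """Wedge product of forms stored as {sorted index tuple: coefficient}."""
    out = {}
    for (ia, ca), (ib, cb) in product(a.items(), b.items()):
        idx = ia + ib
        if len(set(idx)) < len(idx):
            continue
        inv = sum(1 for p in range(len(idx))
                  for q in range(p + 1, len(idx)) if idx[p] > idx[q])
        key = tuple(sorted(idx))
        out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def d(form, de):
    """Exterior derivative of an invariant form; de[i] is the 2-form d e^i."""
    out = {}
    for idx, c in form.items():
        for pos, i in enumerate(idx):
            left, right = {idx[:pos]: 1}, {idx[pos + 1:]: 1}
            term = wedge(wedge(left, de.get(i, {})), right)
            for k, v in term.items():
                # moving d past `pos` one-forms contributes (-1)^pos
                out[k] = out.get(k, 0) + (-1) ** pos * c * v
    return {k: v for k, v in out.items() if v != 0}

t = Fraction(3, 2)                                   # any nonzero value works
de = {6: {(1, 2): -2 * t, (3, 4): 2 * t}}            # structure equations (5.1)
F = {(1, 2): 1, (3, 4): 1, (5, 6): 1}                # fundamental form
T = {(1, 2, 6): -2 * t, (3, 4, 6): 2 * t}            # torsion as stated in the text

dF = d(F, de)   # expected: 2t e^{125} - 2t e^{345}
dT = d(T, de)   # expected: -8 t^2 e^{1234}
```

The same two helpers apply unchanged to the other structure equations in this paper, for instance (4.1), by replacing the dictionary `de`.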
A direct calculation applying (3.2) and (3.3) shows that the non-zero curvature forms $(\Omega^g)^i_j$ of the Levi-Civita connection $\nabla^g$ are given by

$(\Omega^g)^1_2 = -t^2(3e^{12} - 2e^{34}), \quad (\Omega^g)^1_3 = t^2e^{24}, \quad (\Omega^g)^1_4 = -t^2e^{23}, \quad (\Omega^g)^1_6 = t^2e^{16}, \quad (\Omega^g)^2_3 = -t^2e^{14},$
$(\Omega^g)^2_4 = t^2e^{13}, \quad (\Omega^g)^2_6 = t^2e^{26}, \quad (\Omega^g)^3_4 = t^2(2e^{12} - 3e^{34}), \quad (\Omega^g)^3_6 = t^2e^{36}, \quad (\Omega^g)^4_6 = t^2e^{46},$

and the non-zero curvature forms $(\Omega^+)^i_j$ of the connection $\nabla^+$ are

$(\Omega^+)^1_2 = -(\Omega^+)^3_4 = -4t^2(e^{12} - e^{34}).$ (5.2)

Therefore, (2.6) is satisfied and the Pontrjagin classes are represented by

$p_1(\nabla^g) = -\frac{3t^4}{\pi^2}e^{1234}, \quad p_1(\nabla^+) = -\frac{8t^4}{\pi^2}e^{1234}.$
Now, let us consider the new basis $\{f^1, \dots, f^6\}$ given by $f^i = e^i$, for $i = 1, \dots, 5$, and $f^6 = \frac{1}{t}e^6$. In terms of this basis, the structure equations (5.1) become

$df^1 = df^2 = df^3 = df^4 = df^5 = 0, \quad df^6 = -2f^{12} + 2f^{34},$

and the family $(J_t, g_t)$ of balanced Hermitian SU(3)-structures on $h_3$ is given by

$J_tf^1 = -f^2, \quad J_tf^2 = f^1, \quad J_tf^3 = -f^4, \quad J_tf^4 = f^3, \quad J_tf^5 = -t\,f^6, \quad J_tf^6 = \frac{1}{t}f^5,$
$g_t = f^1\otimes f^1 + \dots + f^5\otimes f^5 + t^2f^6\otimes f^6, \quad F_t = f^{12} + f^{34} + t\,f^{56}.$

Let us fix $t \neq 0$ and denote by $\nabla^+_t$ the connection corresponding to the balanced structure $(J_t, g_t)$ in the previous family. It follows from (5.2) that the non-zero curvature forms $(\Omega^+_t)^i_j$ of $\nabla^+_t$ are

$(\Omega^+_t)^1_2 = -(\Omega^+_t)^3_4 = -4t^2(f^{12} - f^{34}).$

Therefore, (2.6) and (2.7) are satisfied and $\nabla^+_t$ is an SU(3)-instanton with respect to any other balanced structure $(J_{t'}, g_{t'})$ in the family.

Let $H(2, 1)$ denote the 5-dimensional generalized Heisenberg group, and let $\Gamma$ be a lattice of maximal rank. The nilpotent Lie algebra $h_3$ is the Lie algebra underlying the compact nilmanifold $M_3 = \Gamma\backslash H(2, 1)\times S^1$.

Theorem 5.1. In the notation above, for each $t' \neq t$, we consider the SU(3)-instanton $\nabla^+_{t'}$. Then we have:

a) $dT = \frac{8\pi^2t^2}{3t^4 - 8t'^4}\,(p_1(\nabla^g_t) - p_1(\nabla^+_{t'}))$,

b) $dT = \frac{\pi^2t^2}{t^4 - t'^4}\,(p_1(\nabla^+_t) - p_1(\nabla^+_{t'}))$.

Hence, for any pair $(t, t')$ such that $8t'^4 < 3t^4$ we obtain explicit valid solutions to the heterotic supersymmetry equations (1.2) with non-zero flux $H = T$ and constant dilaton satisfying the three-form Bianchi identity (1.1) for the Levi-Civita connection and for the (+)-connection on the compact nilmanifold $M_3$. The compact manifold $(M_3, g, J, A = \nabla^+_{t'}, R(\nabla^+_t))$ described in b) solves the equations of motion (1.4).
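The two identities of Theorem 5.1 reduce to arithmetic on the coefficients of $e^{1234}$. The sketch below assumes the prefactors $8\pi^2t^2/(3t^4 - 8t'^4)$ and $\pi^2t^2/(t^4 - t'^4)$, together with $dT = -8t^2e^{1234}$, $p_1(\nabla^g_t) = -3t^4/\pi^2\,e^{1234}$ and $p_1(\nabla^+_t) = -8t^4/\pi^2\,e^{1234}$; powers of $\pi$ cancel and are left out. The function name is hypothetical.

```python
from fractions import Fraction

def theorem_5_1_sides(t, tp):
    """Return (dT, rhs_a, rhs_b) as coefficients of e^{1234}; tp plays the role of t'."""
    dT = -8 * t ** 2
    p1_g_t = -3 * t ** 4           # in units of 1/pi^2
    p1_plus_t = -8 * t ** 4
    p1_plus_tp = -8 * tp ** 4
    alpha_a = 8 * t ** 2 / (3 * t ** 4 - 8 * tp ** 4)   # in units of pi^2
    alpha_b = t ** 2 / (t ** 4 - tp ** 4)
    return dT, alpha_a * (p1_g_t - p1_plus_tp), alpha_b * (p1_plus_t - p1_plus_tp)

def prefactors_positive(t, tp):
    """The condition 8 t'^4 < 3 t^4, which makes both prefactors positive."""
    return 8 * tp ** 4 < 3 * t ** 4

sides = theorem_5_1_sides(Fraction(2), Fraction(1))
```

For $t = 2$, $t' = 1$ all three entries of `sides` coincide, and `prefactors_positive` confirms that the pair lies in the admissible range.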
Moreover, we can also use the abelian instanton $A$ given in Subsect. 4.1 to find more solutions. In fact, we can take $dz_1$ and $dz_2$ as (1,0)-forms at the level of the Lie group $H(2, 1)\times\mathbb{R}$ which descend to the forms $\omega^1$ and $\omega^2$ on the compact nilmanifold $M_3$.

Theorem 5.2. In the notation above and taking $A$ as the abelian SU(3)-instanton given in [9] we have:

a) $dT = \frac{32\pi^2t^2}{12t^4 - 1}\,(p_1(\nabla^g_t) - p_1(A))$,

b) $dT = \frac{32\pi^2t^2}{32t^4 - 1}\,(p_1(\nabla^+_t) - p_1(A))$.

Thus, for any $t$ such that $12t^4 > 1$ we obtain explicit valid solutions to the heterotic supersymmetry equations (1.2) with non-zero flux $H = T$ and constant dilaton satisfying the three-form Bianchi identity (1.1) for the Levi-Civita connection and for the (+)-connection on the compact nilmanifold $M_3$. The space $(M_3, g, J, A, R(\nabla^+_t))$ described in b) is a compact solution to the equations of motion (1.4).

Remark 5.3. A direct calculation for $\nabla^-$ and for the Chern connection $\nabla^c$ shows that

$p_1(\nabla^-) = 0, \quad p_1(\nabla^c) = 0.$

The nilmanifold $M_3$ is a torus bundle over a complex torus, therefore we can use the argument given in Remark 4.1 to conclude that the family above cannot provide any solution for the connections $\nabla^-$ and $\nabla^c$.

6. Balanced Hermitian Structures on the Lie Algebras $h_2$, $h_4$ and $h_5$

In this section we construct explicit solutions on compact nilmanifolds corresponding to the Lie algebras $h_2$, $h_4$ and $h_5$. Let us consider the complex structure equations

$d\omega^1 = d\omega^2 = 0, \quad d\omega^3 = \omega^{12} + \omega^{1\bar 1} + b\,\omega^{1\bar 2} - \omega^{2\bar 2},$
where $b \in \mathbb{R}$. According to [40, Prop. 13], the Lie algebras underlying this 1-parameter family of complex equations are:

$h_2$, for $b \in (-1, 1)$; $\quad h_4$, for $b = \pm 1$; $\quad h_5$, for any $b$ such that $b^2 > 1$. (6.1)

Notice that the latter condition defines a 1-parameter family of complex structures $J$ on the Iwasawa manifold which are not equivalent to the standard $J_0$. For any $t \neq 0$, let us consider $F$ given by

$F = \frac{i}{2}(\omega^{1\bar 1} + \omega^{2\bar 2} + t^2\omega^{3\bar 3}).$
Since $D = -1$, $r = s = 1$ and the coefficients $u, v, z$ in (3.7) vanish, it follows from Proposition 3.1 (ii) that all the Hermitian structures $(J, F)$ are balanced. Notice that the associated compact nilmanifolds are $T^2$ bundles over $T^4$ for any $b$, whereas the parameter $t$ scales the fiber. In terms of the real basis of 1-forms $\{e^1, \dots, e^6\}$ defined by $e^1 + i\,e^2 = \omega^1$, $e^3 + i\,e^4 = \omega^2$, $e^5 + i\,e^6 = t\,\omega^3$, the structure equations are

$de^1 = de^2 = de^3 = de^4 = 0,$
$de^5 = t(b + 1)e^{13} + t(b - 1)e^{24},$
$de^6 = -2t\,e^{12} - t(b - 1)e^{14} + t(b + 1)e^{23} + 2t\,e^{34},$ (6.2)
the complex structure $J$ is given by $Je^1 = -e^2$, $Je^3 = -e^4$, $Je^5 = -e^6$, and the balanced $J$-Hermitian metric $g = e^1\otimes e^1 + \dots + e^6\otimes e^6$ has the associated fundamental form $F = e^{12} + e^{34} + e^{56}$. Use (6.2) to get

$dF = 2t\,e^{125} + t(b + 1)e^{136} + t(b - 1)e^{145} - t(b + 1)e^{235} + t(b - 1)e^{246} - 2t\,e^{345}.$

Due to (3.1) the torsion $T$ of $\nabla^+$ satisfies

$T = -2t\,e^{126} + t(b - 1)e^{135} - t(b + 1)e^{146} + t(b - 1)e^{236} + t(b + 1)e^{245} + 2t\,e^{346}, \quad dT = -4t^2(b^2 + 3)e^{1234}.$

A direct calculation using (3.3) gives that the non-zero curvature forms $(\Omega^+)^i_j$ of the connection $\nabla^+$ are:

$(\Omega^+)^1_2 = -4t^2e^{12} - 2t^2(b - 1)e^{14} + 2t^2(b + 1)e^{23} + 6t^2e^{34} - 2t^2b^2e^{56},$
$(\Omega^+)^1_3 = (\Omega^+)^2_4 = -t^2(b^2 + b + 1)e^{13} - t^2(b^2 - b + 1)e^{24},$
$(\Omega^+)^1_4 = -(\Omega^+)^2_3 = -2t^2b\,e^{12} - t^2(b^2 - b + 1)e^{14} + t^2(b^2 + b + 1)e^{23} + 2t^2b\,e^{34} + 4t^2b\,e^{56},$
$(\Omega^+)^1_5 = (\Omega^+)^2_6 = t^2b\,e^{15} + t^2b\,e^{26} - 2t^2e^{46},$
$(\Omega^+)^1_6 = -(\Omega^+)^2_5 = -t^2b\,e^{16} + t^2b\,e^{25} + 2t^2e^{36},$
$(\Omega^+)^3_4 = 6t^2e^{12} + 2t^2(b - 1)e^{14} - 2t^2(b + 1)e^{23} - 4t^2e^{34} + 2t^2b^2e^{56},$
$(\Omega^+)^3_5 = (\Omega^+)^4_6 = -2t^2e^{26} + t^2b\,e^{35} - t^2b\,e^{46},$
$(\Omega^+)^3_6 = -(\Omega^+)^4_5 = 2t^2e^{16} + t^2b\,e^{36} + t^2b\,e^{45},$
$(\Omega^+)^5_6 = -(\Omega^+)^1_2 - (\Omega^+)^3_4 = -2t^2e^{12} - 2t^2e^{34}.$

Similarly, applying (3.2), we calculate that the non-zero curvature forms $(\Omega^g)^i_j$ of the Levi-Civita connection $\nabla^g$ are:

$(\Omega^g)^1_2 = -3t^2e^{12} - \frac32t^2(b - 1)e^{14} + \frac32t^2(b + 1)e^{23} - \frac{t^2}{2}(b^2 - 5)e^{34} - \frac{t^2}{2}(b^2 + 1)e^{56},$
$(\Omega^g)^1_3 = -\frac34t^2(b + 1)^2e^{13} - \frac{t^2}{4}(b^2 - 5)e^{24},$
$(\Omega^g)^1_4 = -\frac32t^2(b - 1)e^{12} - \frac34t^2(b - 1)^2e^{14} + \frac{t^2}{4}(b^2 - 5)e^{23} + \frac32t^2(b - 1)e^{34} + t^2b\,e^{56},$
$(\Omega^g)^1_5 = \frac{t^2}{4}(b + 1)^2e^{15} - \frac{t^2}{4}(b - 1)^2e^{26} + \frac{t^2}{2}(b - 1)e^{46},$
$(\Omega^g)^1_6 = \frac{t^2}{4}(b^2 - 2b + 5)e^{16} + \frac{t^2}{4}(b + 1)^2e^{25} + t^2e^{36} - \frac{t^2}{2}(b + 1)e^{45},$
$(\Omega^g)^2_3 = \frac32t^2(b + 1)e^{12} + \frac{t^2}{4}(b^2 - 5)e^{14} - \frac34t^2(b + 1)^2e^{23} - \frac32t^2(b + 1)e^{34} - t^2b\,e^{56},$
$(\Omega^g)^2_4 = -\frac{t^2}{4}(b^2 - 5)e^{13} - \frac34t^2(b - 1)^2e^{24},$
$(\Omega^g)^2_5 = \frac{t^2}{4}(b + 1)^2e^{16} + \frac{t^2}{4}(b - 1)^2e^{25} - \frac{t^2}{2}(b + 1)e^{36},$
$(\Omega^g)^2_6 = -\frac{t^2}{4}(b - 1)^2e^{15} + \frac{t^2}{4}(b^2 + 2b + 5)e^{26} + \frac{t^2}{2}(b - 1)e^{35} - t^2e^{46},$
$(\Omega^g)^3_4 = -\frac{t^2}{2}(b^2 - 5)e^{12} + \frac32t^2(b - 1)e^{14} - \frac32t^2(b + 1)e^{23} - 3t^2e^{34} + \frac{t^2}{2}(b^2 - 1)e^{56},$
$(\Omega^g)^3_5 = \frac{t^2}{2}(b - 1)e^{26} + \frac{t^2}{4}(b + 1)^2e^{35} + \frac{t^2}{4}(b^2 - 1)e^{46},$
$(\Omega^g)^3_6 = t^2e^{16} - \frac{t^2}{2}(b + 1)e^{25} + \frac{t^2}{4}(b^2 + 2b + 5)e^{36} - \frac{t^2}{4}(b^2 - 1)e^{45},$
$(\Omega^g)^4_5 = -\frac{t^2}{2}(b + 1)e^{16} - \frac{t^2}{4}(b^2 - 1)e^{36} + \frac{t^2}{4}(b - 1)^2e^{45},$
$(\Omega^g)^4_6 = \frac{t^2}{2}(b - 1)e^{15} - t^2e^{26} + \frac{t^2}{4}(b^2 - 1)e^{35} + \frac{t^2}{4}(b^2 - 2b + 5)e^{46},$
$(\Omega^g)^5_6 = -\frac{t^2}{2}(b^2 + 1)e^{12} + t^2b\,e^{14} - t^2b\,e^{23} + \frac{t^2}{2}(b^2 - 1)e^{34}.$

Hence, $\mathrm{Hol}(\nabla^+) \subset su(3)$ and the Pontrjagin classes of the connections $\nabla^g$ and $\nabla^+$ are represented by

$p_1(\nabla^g) = -\frac{t^4}{4\pi^2}(b^4 + 4b^2 + 11)e^{1234}, \quad p_1(\nabla^+) = -\frac{t^4}{\pi^2}(b^4 + 5b^2 + 10)e^{1234}.$
As we mentioned above, $h_5$ is the nilpotent Lie algebra underlying the Iwasawa manifold. Notice that $h_2$ is the Lie algebra of $H^3\times H^3$, where $H^3$ is the Heisenberg group. Let us denote by $M_2$, $M_4$, $M_5$ any compact nilmanifold whose underlying Lie algebra is isomorphic to $h_2$, $h_4$ or $h_5$, respectively. We can take $dz_1$ and $dz_2$ as (1,0)-forms at the level of the associated Lie group which descend to the forms $\omega^1$ and $\omega^2$ on $M_2$, $M_4$, $M_5$, so using again the abelian instanton given in Sect. 4 we get:

Theorem 6.1. In the notation above and taking $A$ as the abelian SU(3)-instanton given in [9] we have:

$dT = \frac{16\pi^2t^2(b^2 + 3)}{t^4(b^4 + 4b^2 + 11) - 1}\,(p_1(\nabla^g) - p_1(A)),$

$dT = \frac{16\pi^2t^2(b^2 + 3)}{4t^4(b^4 + 5b^2 + 10) - 1}\,(p_1(\nabla^+) - p_1(A)).$

For any $b \in \mathbb{R}$ we can choose $t \neq 0$ such that

$t^4(b^4 + 5b^2 + 10) > 1/4 \quad$ and $\quad t^4(b^4 + 4b^2 + 11) > 1,$
which, in view of (6.1), provides explicit valid solutions to the heterotic supersymmetry equations (1.2) with non-zero flux H = T and constant dilaton satisfying the three-form Bianchi identity (1.1) for the Levi-Civita connection and for the (+)-connection on the compact nilmanifolds M2 , M4 , M5 .
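As a sanity check on Theorem 6.1, both identities can be verified with exact rational arithmetic on the coefficients of $e^{1234}$. This is a sketch under stated assumptions: $p_1(A) = -\frac{1}{4\pi^2}e^{1234}$, which follows from $\mathrm{Tr}\,F_A\wedge F_A = -2e^{1234}$ with the normalization $p_1(A) = \frac{1}{8\pi^2}\mathrm{Tr}\,F_A\wedge F_A$, and the Pontrjagin representatives of $\nabla^g$ and $\nabla^+$ stated in this section. Powers of $\pi$ cancel and are omitted; the function names are hypothetical.

```python
from fractions import Fraction

def theorem_6_1_sides(b, t):
    """Return (dT, rhs_g, rhs_plus) as coefficients of e^{1234}."""
    dT = -4 * t ** 2 * (b ** 2 + 3)
    p1_g = -Fraction(1, 4) * t ** 4 * (b ** 4 + 4 * b ** 2 + 11)   # units of 1/pi^2
    p1_plus = -t ** 4 * (b ** 4 + 5 * b ** 2 + 10)
    p1_A = -Fraction(1, 4)
    alpha_g = 16 * t ** 2 * (b ** 2 + 3) / (t ** 4 * (b ** 4 + 4 * b ** 2 + 11) - 1)
    alpha_plus = 16 * t ** 2 * (b ** 2 + 3) / (4 * t ** 4 * (b ** 4 + 5 * b ** 2 + 10) - 1)
    return dT, alpha_g * (p1_g - p1_A), alpha_plus * (p1_plus - p1_A)

def t4_threshold(b):
    """Smallest bound on t^4 above which both prefactors are positive."""
    return max(Fraction(1, 4) / (b ** 4 + 5 * b ** 2 + 10),
               Fraction(1) / (b ** 4 + 4 * b ** 2 + 11))

checks = [theorem_6_1_sides(Fraction(b), Fraction(t))
          for b in (0, 1, 2) for t in (1, 3)]
```

Each triple in `checks` has three equal entries. For $b = 0$ the threshold is $t^4 > 1/11$, so the Levi-Civita condition is the binding one there.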
Remark 6.2. Finally, a direct calculation for $\nabla^-$ and for the Chern connection $\nabla^c$ shows that

$p_1(\nabla^-) = \frac{t^4}{\pi^2}(b^2 + 3)e^{1234}, \quad p_1(\nabla^c) = 0.$

Since $M_2$, $M_4$ and $M_5$ are torus bundles over a complex torus, the same argument as in Remark 4.1 shows that the family above cannot provide any satisfactory solution for the connections $\nabla^-$ and $\nabla^c$.

7. The Space of Balanced Structures on $h_6$

In this section we study the space of balanced Hermitian structures on the nilpotent Lie algebra $h_6$. The complex equations

$d\omega^1 = d\omega^2 = 0, \quad d\omega^3 = \omega^{12} - \omega^{2\bar 1},$

define a complex structure $J$ on $h_6$, and any complex structure on the Lie algebra $h_6$ is equivalent to $J$ [40, Cor. 15]. Moreover, it is easy to see that any $J$-balanced structure $F$ is equivalent to one of the form

$F = \frac{i}{2}(\omega^{1\bar 1} + \omega^{2\bar 2} + t^2\omega^{3\bar 3}),$

for some $t \neq 0$. From a real point of view, the whole space of balanced Hermitian structures on $h_6$ is described as follows. Let us consider the basis of 1-forms $\{e^1, \dots, e^6\}$ given by $e^1 + i\,e^2 = \omega^1$, $e^3 + i\,e^4 = \omega^2$, $e^5 + i\,e^6 = t\,\omega^3$. In terms of this basis, we have the structure equations

$de^1 = de^2 = de^3 = de^4 = 0, \quad de^5 = 2t\,e^{13}, \quad de^6 = 2t\,e^{14}.$ (7.1)

The complex structure $J$ is given by $Je^1 = -e^2$, $Je^3 = -e^4$, $Je^5 = -e^6$, and the $J$-Hermitian metric $g = e^1\otimes e^1 + \dots + e^6\otimes e^6$ has the associated fundamental form $F = e^{12} + e^{34} + e^{56}$. The structure equations (7.1) yield $dF = 2t(e^{136} - e^{145})$. Consequently, applying (3.1), we obtain that the torsion $T$ of $\nabla^+$ satisfies

$T = -2t(e^{236} - e^{245}), \quad dT = -8t^2e^{1234}.$

Using (3.3) we calculate that the non-zero curvature forms $(\Omega^+)^i_j$ for the connection $\nabla^+$ are given by:
$(\Omega^+)^1_2 = 2t^2(e^{34} + e^{56}), \quad (\Omega^+)^1_3 = (\Omega^+)^2_4 = -t^2(3e^{13} + e^{24}), \quad (\Omega^+)^1_4 = -(\Omega^+)^2_3 = -t^2(3e^{14} - e^{23}),$
$(\Omega^+)^1_5 = (\Omega^+)^2_6 = t^2(e^{15} - e^{26}), \quad (\Omega^+)^1_6 = -(\Omega^+)^2_5 = t^2(e^{16} + e^{25}), \quad (\Omega^+)^3_4 = 2t^2(e^{12} - e^{56}),$
$(\Omega^+)^3_5 = (\Omega^+)^4_6 = t^2(e^{35} + e^{46}), \quad (\Omega^+)^3_6 = -(\Omega^+)^4_5 = -t^2(e^{36} - e^{45}), \quad (\Omega^+)^5_6 = -2t^2(e^{12} + e^{34}),$

so (2.6) holds and the first Pontrjagin class is represented by

$p_1(\nabla^+) = -\frac{2t^4}{\pi^2}e^{1234}.$

Let us denote by $M_6$ any compact nilmanifold whose underlying Lie algebra is isomorphic to $h_6$. We can take $dz_1$ and $dz_2$ as (1,0)-forms at the level of the Lie group corresponding to $h_6$ which descend to the forms $\omega^1$ and $\omega^2$ on $M_6$, so using again the abelian instanton given in Sect. 4 we get:

Theorem 7.1. In the notation above and taking $A$ as the abelian SU(3)-instanton given in [9] we have:

$dT = \frac{32\pi^2t^2}{8t^4 - 1}\,(p_1(\nabla^+) - p_1(A)).$

Thus, for any $t$ such that $t^4 > \frac18$ we obtain explicit valid solutions to the heterotic supersymmetry equations (1.2) with non-zero flux $H = T$ and constant dilaton satisfying the three-form Bianchi identity (1.1) for the (+)-connection on the compact nilmanifold $M_6$.

Remark 7.2. The Pontrjagin classes of the Levi-Civita connection, $\nabla^-$ and the Chern connection are represented by

$p_1(\nabla^g) = 0, \quad p_1(\nabla^-) = \frac{2t^4}{\pi^2}e^{1234}, \quad p_1(\nabla^c) = 0.$
Since the nilmanifold $M_6$ is a torus bundle over a complex torus, the same argument as in Remark 4.1 shows that there is no way to find a satisfactory solution for the connections $\nabla^g$, $\nabla^-$ and $\nabla^c$ on the whole space of invariant balanced Hermitian structures on $M_6$.

8. Balanced Structures on the Lie Algebra $h^-_{19}$

In this section we construct compact valid solutions to (1.2) with non-zero flux and constant dilaton satisfying the anomaly cancellation condition (1.1) using the curvature $R^c$ of the Chern connection. Consider the complex structure equations

$d\omega^1 = 0, \quad d\omega^2 = \omega^{13} + \omega^{1\bar 3}, \quad d\omega^3 = i(\omega^{1\bar 2} - \omega^{2\bar 1}),$

which in view of (3.6) correspond to a complex structure $J$ on the 3-step nilpotent Lie algebra $h^-_{19}$. The associated real structure equations are

$de^1 = de^2 = de^5 = 0, \quad de^3 = 2e^{15}, \quad de^4 = 2e^{25}, \quad de^6 = 2(e^{13} + e^{24}),$ (8.1)
and the complex structure $J$ is given by $Je^1 = -e^2$, $Je^3 = -e^4$, $Je^5 = -e^6$. The fundamental form $F$ of the $J$-Hermitian metric $g = e^1\otimes e^1 + \dots + e^6\otimes e^6$ is given by $F = e^{12} + e^{34} + e^{56}$. It follows from Proposition 3.1 (i) that the structure $(J, g)$ is balanced. The structure equations (8.1) imply $dF = -2(e^{135} + e^{145} - e^{235} + e^{245})$. Apply (3.1) to verify that the torsion $T$ satisfies

$T = 2(e^{136} + e^{146} - e^{236} + e^{246}), \quad dT = -8(e^{1234} + e^{1256}).$

Using (3.2), (3.3) and (3.4) we obtain that the non-zero curvature forms $(\Omega^c)^i_j$ and $(\Omega^+)^i_j$ of the Chern connection and the (+)-connection are given by:

$(\Omega^c)^1_2 = -2e^{34} - 2e^{56}, \quad (\Omega^c)^1_3 = (\Omega^c)^2_4 = -e^{13} - e^{24}, \quad (\Omega^c)^1_4 = -(\Omega^c)^2_3 = 2e^{13} + e^{14} - e^{23} + 2e^{24},$
$(\Omega^c)^1_5 = (\Omega^c)^2_6 = e^{16} - e^{25}, \quad (\Omega^c)^1_6 = -(\Omega^c)^2_5 = -e^{15} - e^{26}, \quad (\Omega^c)^3_4 = -2e^{12} + 2e^{56},$
$(\Omega^c)^3_5 = (\Omega^c)^4_6 = -e^{36} + e^{45}, \quad (\Omega^c)^3_6 = -(\Omega^c)^4_5 = e^{35} + e^{46}, \quad (\Omega^c)^5_6 = -(\Omega^c)^1_2 - (\Omega^c)^3_4 = 2e^{12} + 2e^{34};$

$(\Omega^+)^1_2 = -2e^{34} + 2e^{56}, \quad (\Omega^+)^1_3 = (\Omega^+)^2_4 = -3e^{13} - 3e^{24}, \quad (\Omega^+)^1_4 = -(\Omega^+)^2_3 = -2e^{13} - e^{14} + e^{23} - 2e^{24},$
$(\Omega^+)^1_5 = (\Omega^+)^2_6 = -3e^{15} - 2e^{16} - e^{26}, \quad (\Omega^+)^1_6 = -(\Omega^+)^2_5 = -e^{16} + 3e^{25} + 2e^{26}, \quad (\Omega^+)^3_4 = -2e^{12} - 2e^{56},$
$(\Omega^+)^3_5 = (\Omega^+)^4_6 = e^{35} + 2e^{36} - e^{46}, \quad (\Omega^+)^3_6 = -(\Omega^+)^4_5 = -e^{36} - e^{45} - 2e^{46}, \quad (\Omega^+)^5_6 = -(\Omega^+)^1_2 - (\Omega^+)^3_4 = 2e^{12} + 2e^{34}.$

A direct calculation shows that the Pontrjagin classes are represented by

$p_1(\nabla^+) = -\frac{2}{\pi^2}(3e^{1234} + e^{1256}), \quad p_1(\nabla^c) = -\frac{2}{\pi^2}(e^{1234} + e^{1256}).$
Let $M_{19}$ be a compact nilmanifold corresponding to the Lie algebra $h^-_{19}$. From (8.1) we have that $M_{19}$ is an $S^1$-bundle over a compact 5-nilmanifold $N$, which is a $T^2$-bundle over $T^3$.

Lemma 8.1. For each $\lambda, \mu \in \mathbb{R}$, let $A_{\lambda,\mu}$ be the U(3)-connection on $M_{19}$ with respect to the structure $(J, g)$ defined by the connection forms

$(\sigma^{A_{\lambda,\mu}})^2_3 = (\sigma^{A_{\lambda,\mu}})^2_5 = (\sigma^{A_{\lambda,\mu}})^4_5 = -\lambda\,e^1 - \mu\,e^6, \quad (\sigma^{A_{\lambda,\mu}})^i_j = \lambda\,e^1 + \mu\,e^6,$

for $1 \le i < j \le 6$ such that $(i, j) \neq (2, 3), (2, 5), (4, 5)$. Then, $A_{\lambda,\mu}$ is an SU(3)-instanton and

$p_1(A_{\lambda,\mu}) = -\frac{15}{\pi^2}\mu^2e^{1234}.$

Proof. A direct calculation shows that the curvature forms $(\Omega^{A_{\lambda,\mu}})^i_j$ of the connection $A_{\lambda,\mu}$ are given by

$(\Omega^{A_{\lambda,\mu}})^2_3 = (\Omega^{A_{\lambda,\mu}})^2_5 = (\Omega^{A_{\lambda,\mu}})^4_5 = -2\mu(e^{13} + e^{24}), \quad (\Omega^{A_{\lambda,\mu}})^i_j = 2\mu(e^{13} + e^{24}),$

for $1 \le i < j \le 6$ such that $(i, j) \neq (2, 3), (2, 5), (4, 5)$. Now it is clear that $A_{\lambda,\mu}$ satisfies (2.7).

Theorem 8.2. Let $A_{\lambda,\mu}$ be the SU(3)-instanton above.

(i) If $\mu^2 = \frac{4}{15}$, then $dT = 4\pi^2(p_1(\nabla^+) - p_1(A_{\lambda,\mu}))$.
(ii) If µ = 0, then p1 (Aλ,0 ) = 0 and dT = 4π 2 ( p1 (∇ c ) − p1 (Aλ,0 )). Hence, we obtain explicit valid solutions to the heterotic supersymmetry equations (1.2) with non-zero flux H = T and constant dilaton satisfying the three-form Bianchi identity (1.1) for the Chern connection and the (+)-connection on the compact nilmanifold M19 . Remark 8.3. During the preparation of the paper we learned that a compact example solving (1.2) with non-zero flux, constant dilaton satisfying (1.1) with respect to a metric connection on the tangent bundle, and trivial instanton ( A = 0) on M3 is announced [26]. Acknowledgements. We would like to thank George Papadopoulos for very useful discussions. We also thank the referee for useful advice which improved the exposition. This work has been partially supported through grant MEC (Spain) MTM2005-08757-C04-02. S.I. is partially supported by the Contract 154/2008 with the University of Sofia ‘St.Kl.Ohridski‘. S.I. is a Senior Associate to the Abdus Salam ICTP, Trieste.
References 1. Alexandrov, B., Ivanov, S.: Vanishing theorems on Hermitian manifolds. Diff. Geom. Appl. 14(3), 251–265 (2001) 2. Becker, K., Becker, M., Dasgupta, K., Green, P.S.: Compactifications of Heterotic theory on Non-Kähler complex manifolds: I. JHEP 0304, 007 (2003) 3. Becker, K., Becker, M., Dasgupta, K., Green, P.S., Sharpe, E.: Compactifications of Heterotic Strings on Non-Kähler complex manifolds: II. Nucl. Phys. B 678, 19–100 (2004) 4. Becker, K., Becker, M., Dasgupta, K., Prokushkin, S.: Properties from heterotic vacua from superpotentials. Nucl. Phys. B 666, 144–174 (2003) 5. Becker, K., Becker, M., Fu, J.-X., Tseng, L.-S., Yau, S.-T.: Anomaly Cancellation and Smooth Non-Kahler Solutions in Heterotic String Theory. Nucl. Phys. B 751, 108–128 (2006) 6. Bergshoeff, E.A., de Roo, M.: The quartic effective action of the heterotic string and supersymmetry. Nucl. Phys. B 328, 439 (1989) 7. Bismut, J.-M.: A local index theorem for non-Kähler manifolds. Math. Ann. 284, 681–699 (1989) 8. Cardoso, G.L., Curio, G., Dall’Agata, G., Lust, D.: BPS Action and Superpotential for Heterotic String Compactifications with Fluxes. JHEP 0310, 004 (2003) 9. Cardoso, G.L., Curio, G., Dall’Agata, G., Lust, D., Manousselis, P., Zoupanos, G.: Non-Käeler string back-grounds and their five torsion classes. Nucl. Phys. B 652, 5–34 (2003)
10. Chiossi, S., Salamon, S.: The intrinsic torsion of SU(3) and G 2 -structures. In: Differential Geometry, (Valencia 2001), River Edge, NJ: World Sci. Publishing, 2002, pp. 115–133 11. Cordero, L.A., Fernández, M., Gray, A., Ugarte, L.: Compact nilmanifolds with nilpotent complex structure: Dolbeault cohomology. Trans. Amer. Math. Soc. 352, 5405–5433 (2000) 12. Corrigan, E., Devchand, C., Fairlie, D.B., Nuyts, J.: First-order equations for gauge fields in spaces of dimension greater than four. Nucl. Phys. B 214(3), 452–464 (1983) 13. Dasgupta, K., Rajesh, G., Sethi, S.: M theory, orientifolds and G-flux. JHEP 0211, 006 (2002) 14. Dasgupta, K., Firouzjahi, H., Gwyn, R.: On the warped heterotic axion. http://arXiv.org/abs/0803. 3828v2[hep-th], to appear in JHEP 15. de Wit, B., Smit, D.J., Hari Dass, N.D.: Residual supersimmetry of compactified D = 10 supergravity. Nucl. Phys. B 283, 165 (1987) 16. Fino, A., Parton, M., Salamon, S.: Families of strong KT structures in six dimensions. Comment. Math. Helv. 79, 317–340 (2004) 17. Freedman, D.Z., Gibbons, G.W., West, P.C.: Ten Into Four Won’t Go. Phys. Lett. B 124, 491 (1983) 18. Friedrich, Th., Ivanov, S.: Parallel spinors and connections with skew-symmetric torsion in string theory. Asian J. Math. 6, 303–335 (2002) 19. Fu, J.-X., Yau, S.-T.: Existence of Supersymmetric Hermitian Metrics with Torsion on Non-Kaehler Manifolds. http://arXiv.org/list/hep-th/0509028, 2005 20. Fu, J.-X., Yau, S.-T.: The theory of superstring with flux on non-Kähler manifolds and the complex Monge-Ampere equation. http://arXiv.org/list/hep-th/0604063, 2006 21. Gauntlett, J., Kim, N., Martelli, D., Waldram, D.: Fivebranes wrapped on SLAG three-cycles and related geometry. JHEP 0111, 018 (2001) 22. Gauntlett, J.P., Martelli, D., Pakis, S., Waldram, D.: G-Structures and Wrapped NS5-Branes. Commun. Math. Phys. 247, 421–445 (2004) 23. Gauntlett, J., Martelli, D., Waldram, D.: Superstrings with Intrinsic torsion. Phys. Rev. D 69, 086002 (2004) 24. 
Gillard, J., Papadopoulos, G., Tsimpis, D.: Anomaly, Fluxes and (2,0) Heterotic-String Compactifications. JHEP 0306, 035 (2003) 25. Goldstein, E., Prokushkin, S.: Geometric Model for Complex Non-Käehler Manifolds with SU(3) Structure. Commun. Math. Phys. 251, 65–78 (2004) 26. Grantcharov, G., Poon, Y.-S.: Talk of G. Grantcharov in the Workshop ’Special Geometries in Mathematical Physics’, Kulungsborn, March 30-April 4, 2008 27. Gray, A., Hervella, L.: The sixteen classes of almost Hermitian manifolds and their linear invariants. Ann. Mat. Pura Appl. (4) 123, 35–58 (1980) 28. Howe, P.S., Papadopoulos, G.: Ultraviolet behavior of two-dimensional supersymmetric non-linear sigma models. Nucl. Phys. B 289, 264 (1987) 29. Hull, C.M.: Anomalies, ambiquities and superstrings. Phys. Lett. B 167, 51 (1986) 30. Hull, C.M., Townsend, P.K.: The two loop beta function for sigma models with torsion. Phys. Lett. B 191, 115 (1987) 31. Hull, C.M., Witten, E.: Supersymmetric sigma models and the Heterotic String. Phys. Lett. B 160, 398 (1985) 32. Ivanov, P., Ivanov, S.: SU(3)-instantons and G 2 , Spin(7)-Heterotic string solitons. Commun. Math. Phys. 259, 79–102 (2005) 33. Ivanov, S., Papadopoulos, G.: A no-go theorem for string warped compactifications. Phys. Lett. B 497, 309–316 (2001) 34. Ivanov, S., Papadopoulos, G.: Vanishing Theorems and String Backgrounds. Class. Quant. Grav. 18, 1089–1110 (2001) 35. Kimura, T., Yi, P.: Comments on heterotic flux compactifications. JHEP 0607, 030 (2006) 36. Li, J., Yau, S.-T.: The Existence of Supersymmetric String Theory with Torsion. J. Diff. Geom. 70(1), 143–181 (2005) 37. Michelsohn, M.L.: On the existence of special metrics in complex geometry. Acta Math. 149(3-4), 261–295 (1982) 38. Sen, A.: (2, 0) supersymmetry and space-time supersymmetry in the heterotic string theory. Nucl. Phys. B 167, 289 (1986) 39. Strominger, A.: Superstrings with torsion. Nucl. Phys. B 274, 253 (1986) 40. 
Ugarte, L.: Hermitian structures on six-dimensional nilmanifolds. Transform. Groups 12, 175–202 (2007) Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 699–713 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0741-4
Communications in
Mathematical Physics
Track Billiards Leonid A. Bunimovich1 , Gianluigi Del Magno2 1 ABC Math Program and School of Mathematics, Georgia Institute of Technology,
Atlanta, GA 30332, USA. E-mail:
[email protected]
2 Max Planck Institute for the Physics of Complex Systems, 01187 Dresden,
Germany. E-mail:
[email protected] Received: 21 April 2008 / Accepted: 31 October 2008 Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract: We study a class of planar billiards having the remarkable property that their phase space consists up to a set of zero measure of two invariant sets formed by orbits moving in opposite directions. The tables of these billiards are tubular neighborhoods of differentiable Jordan curves that are unions of finitely many segments and arcs of circles. We prove that under proper conditions on the segments and the arcs, the billiards considered have non-zero Lyapunov exponents almost everywhere. These results are then extended to a similar class of 3-dimensional billiards. Interestingly, we find that for some track billiards, the mechanism generating hyperbolicity is not the defocusing one, which requires every infinitesimal beam of parallel rays to defocus after every reflection off of the focusing boundary.
1. Introduction There are rather few examples of hyperbolic billiards with several ergodic components, which are exactly described (for example, see [W2,B3,B-D1]). In this paper, we study a class of billiards whose phase space consists (up to a set of zero measure) of two invariant sets formed by orbits moving in opposite directions. The table of such a billiard is a tubular neighborhood of a differentiable Jordan curve γ composed of finitely many straight segments and arcs of circles. A simple example of one of these tables is obtained by cutting out a smaller stadium from a stadium (Fig. 2(a)). Since these regions resemble track fields, our billiards will be called track billiards. In this paper, we prove that all the Lyapunov exponents of a track billiard are nonzero almost everywhere provided that two conditions are satisfied: 1) the arcs of γ are sufficiently long or the width of the cross section of the track is sufficiently large, and 2) the straight segments of γ are sufficiently long. In addition, we show that a similar result remains valid for a class of 3-dimensional track billiards. There is no doubt that as a consequence of the hyperbolicity (non-zero Lyapunov exponents), each invariant
L. A. Bunimovich, G. Del Magno
set formed by orbits moving in one of the two possible directions is ergodic. We will address this problem in a future paper.

Interestingly, we find that for some track billiards, the mechanism generating hyperbolicity is not the defocusing one. This mechanism requires that after every reflection from the focusing part of the billiard boundary, every narrow beam of parallel rays must pass through a conjugate point, and become divergent before the next collision with the curved part of the boundary. It follows from our results that there is a class of track billiards that are hyperbolic, but do not have this property.

It is worth mentioning that track billiards are related to billiards in tubular regions, which model certain electronic devices used in nanotechnology. Although there are several works devoted to the study of the quantum properties of these billiards [E-S,G-J, C-D-F-K,V-P-R], not much attention has been dedicated to the study of their classical properties [H-P,P]. Our results may help fill this gap.

The paper is organized as follows. In Sect. 2, we review some basic facts concerning billiard systems, introduce track billiards, and state the main result of this paper. The last part of Sect. 2 contains some preliminary lemmas that are crucial for the proof of the hyperbolicity. In Sect. 3, we give the notions of focusing time and invariant cone field. Then, using a sort of generalized mirror formula for billiard trajectories crossing annular regions, we construct an eventually strictly invariant cone field for track billiards, whose existence implies hyperbolicity. Finally, in Sect. 4, the results obtained for 2-dimensional track billiards are extended to 3-dimensional track billiards.

2. Track Billiards

Let Q be a bounded domain of R2 with piecewise differentiable boundary.
The billiard in $Q$ is the dynamical system arising from the motion of a point-particle inside $Q$ obeying the following rules: the particle moves along straight lines at unit speed until it hits the boundary of $Q$; at that moment, the particle is reflected so that the angle of reflection equals the angle of incidence.
2.1. Definitions. The domain $Q \subset \mathbb{R}^2$ considered in this paper is a tubular neighborhood of a differentiable Jordan curve $\gamma$ that is a finite union of segments and arcs of circles. Equivalently, we can say that $Q$ is a union of finitely many building blocks of two types: circular guides and straight guides. A circular guide is a region of an annulus with circles of radii $r_1 > r_2 > 0$ contained inside a sector with central angle $0 < \alpha < 2\pi$ (see Fig. 1(a)). A straight guide is simply a rectangle (see Fig. 1(b)). The circular and straight guides must all have the same transverse width in order to fit together and form a domain $Q$. Furthermore, we will always assume that any two circular guides of $Q$ do
Fig. 1. The two types of guides considered in this paper: (a) circular guide; (b) straight guide
Track Billiards
Fig. 2. Two examples of tracks
not intersect, i.e., they are separated by a straight guide. We call $Q$ a track, because its shape resembles that of a track field. Two examples of tracks are depicted in Fig. 2.
For our purposes, the dynamics of a track billiard can be conveniently described by a discrete transformation called the billiard map, which is defined as follows. Let $M$ be the set of all vectors $(q, v) \in T_1\mathbb{R}^2$ such that $q \in \partial Q$ and $\langle v, n(q)\rangle \ge 0$, where $n(q)$ is the normal vector to $\partial Q$ at $q$ pointing inside $Q$. Here $\langle\cdot,\cdot\rangle$ is the standard dot product of $\mathbb{R}^2$. The set $M$ is easily seen to be a smooth manifold with boundary. Let $\pi : M \to Q$ be the canonical projection defined by $\pi(q, v) = q$ for $(q, v) \in M$. If we view $q$ and $v$ as the position and the velocity of the particle after a collision with $\partial Q$, then $M$ represents the collection of all possible post-collision states (collisions, for short) of the particle with $\partial Q$.
Fix an orientation of the boundary $\partial Q$. A set of local coordinates for $M$ is given by $M \ni x \mapsto (s(x), \theta(x))$, where $s$ is the arclength parameter along the oriented boundary $\partial Q$, and $0 \le \theta \le \pi$ is the angle that the velocity of the particle forms with the oriented tangent of $\partial Q$. To specify an element $x \in M$, we will use either the notation $x = (q, v)$ or $x = (s, \theta)$. We endow $M$ with the Riemannian metric $ds^2 + d\theta^2$ and the probability measure $d\mu = (2|\partial Q|)^{-1} \sin\theta \, ds \, d\theta$, where $|\partial Q|$ is the length of $\partial Q$.
Denote by $\partial M$ the set of all vectors $(q, v) \in M$ such that $\langle v, n(q)\rangle = 0$ or $q$ is the endpoint of a straight segment of $\partial Q$. Let $\operatorname{int} M = M \setminus \partial M$. For technical reasons, we define $T$ only on collisions belonging to the smooth manifold (without boundary) $\operatorname{int} M$. The billiard map $T : \operatorname{int} M \to M$ is the transformation given by $(q, v) \mapsto (q_1, v_1)$, where $(q, v)$ and $(q_1, v_1)$ are consecutive collisions of the particle. Let us denote by $S_1^+$ the union of $\partial M$ and the subset of $\operatorname{int} M$ where $T$ is not differentiable. It is easy to see that $S_1^+ = \partial M \cup T^{-1}\partial M$.
From the general results of [K-S], it follows that $S_1^+$ is a compact set consisting of finitely many smooth compact curves that can intersect each other only at their endpoints. If we define $S_1^- = M \setminus T(M \setminus S_1^+)$, then $T$ is a diffeomorphism from $M \setminus S_1^+$ to its image $M \setminus S_1^-$, and preserves the measure $\mu$ (see e.g. [C-F-S,K-S]).
The billiard dynamics is time-reversible. Indeed, the involution $J : M \to M$ defined by $J(s, \theta) = (s, \pi - \theta)$ for every $(s, \theta) \in M$ has the property that $J \circ T = T^{-1} \circ J$ everywhere on $M \setminus S_1^+$. Most of the time, we will use the notation $-A$ instead of $JA$, where $A$ is a subset of $M$.
For every $n > 1$, let us define $S_n^+ = S_1^+ \cup T^{-1}S_1^+ \cup \cdots \cup T^{-n+1}S_1^+$ and $S_n^- = S_1^- \cup T S_1^- \cup \cdots \cup T^{n-1}S_1^-$. By the time-reversibility of the billiard dynamics, we have $S_n^- = -S_n^+$ for every $n > 0$. Let $S_\infty^+ = \cup_{n>0} S_n^+$ and $S_\infty^- = \cup_{n>0} S_n^-$. Then $\tilde{M} = M \setminus (S_\infty^- \cup S_\infty^+)$ is the set where all iterates of $T$ are defined. Clearly, $\mu(S_\infty^+) = \mu(S_\infty^-) = 0$ and $\mu(\tilde{M}) = 1$.
2.2. Unidirectionality. We say that a billiard in a track $Q$ has the unidirectionality property if every billiard trajectory that is not contained in a cross section of $Q$ moves through every cross section of $Q$ in the same direction. We will prove this property not only for 2-dimensional tracks but also for a certain class of tubular domains of $\mathbb{R}^3$. In fact, the
unidirectionality of tracks will be derived from the unidirectionality of 3-dimensional tubular domains. It would be easy to extend our proof to tubular domains in any dimension. We have not done that in order to keep the length of the proof within reasonable limits. We allow the cross section of the tubular domains to be an arbitrary convex compact subset of $\mathbb{R}^2$. In Sect. 4, we will consider tubular domains of $\mathbb{R}^3$ with rectangular cross section.
Because of the generality of the current setting, before proving the unidirectionality property, we provide a precise definition of a tubular domain in $\mathbb{R}^3$. The 'skeleton' of the tubular neighborhood $\tilde{Q}$ is given by a regular Jordan curve $\varphi : S^1 \to \mathbb{R}^3$ parametrized by the arclength $s$. We assume that $\varphi$ is piecewise $C^2$. This means precisely that there exist $a_0 < b_0 = a_1 < b_1 < \cdots < b_{n-1} = a_n < b_n = a_0$ such that $S^1 = \cup_{1 \le i \le n} [a_i, b_i]$, and $\varphi$ is $C^2$ on each interval $[a_i, b_i]$. We also make the assumption that the curvature of $\varphi$ on each interval $[a_i, b_i]$ is either identically zero ($\varphi([a_i, b_i])$ is a straight segment) or is never equal to zero. In the latter case, for every $s \in [a_i, b_i]$, let $\{T(s), N(s), B(s)\}$ be the Frenet frame of $\varphi$, where $T(s), N(s), B(s)$ are the tangent, normal and binormal vectors of $\varphi$ at $\varphi(s)$, respectively (see for instance [Kl, Chap. 1]). In the former case, instead, we choose $N(s)$ and $B(s)$ to be some fixed vectors such that $\{T(s), N(s), B(s)\}$ forms an orthonormal basis of $\mathbb{R}^3$.
The cross section of $\tilde{Q}$ is given by a compact convex subset $\Sigma \subset \mathbb{R}^2$ whose boundary is a piecewise regular simple closed curve $\zeta : S^1 \to \mathbb{R}^2$ with $\|\zeta(\alpha)\| > 0$ for every $\alpha \in S^1 = [0, 2\pi)$. The tubular neighborhood of $\varphi$ with cross section $\Sigma$ is the domain $\tilde{Q}$ bounded by the surface $\psi(s, \alpha) = \varphi(s) + F(s)\zeta(\alpha)$ with $(s, \alpha) \in S^1 \times S^1$, where $F(s)$ is the $3 \times 2$ matrix with column vectors given by $N(s)$ and $B(s)$.
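The parametrization $\psi(s, \alpha) = \varphi(s) + F(s)\zeta(\alpha)$ can be explored numerically. The following is only a minimal sketch under stated assumptions: the skeleton is taken to be a planar circle of radius `R`, the cross section boundary an ellipse with semi-axes `A` and `B_AX` (all three values are hypothetical and chosen for illustration), and the partial derivatives of $\psi$ are approximated by central finite differences. It checks, for this planar example, that the vector $\partial_s\psi \wedge \partial_\alpha\psi$ is orthogonal to the tangent $T(s)$ of the skeleton.

```python
import math

R = 2.0               # radius of the skeleton circle (hypothetical value)
A, B_AX = 0.5, 0.3    # semi-axes of an elliptical cross section (hypothetical)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def frame(s):
    """Frenet frame (T, N, B) of the circle phi(s) = R (cos(s/R), sin(s/R), 0)."""
    c, sn = math.cos(s / R), math.sin(s / R)
    return (-sn, c, 0.0), (-c, -sn, 0.0), (0.0, 0.0, 1.0)


def psi(s, alpha):
    """Surface point phi(s) + zeta_1(alpha) N(s) + zeta_2(alpha) B(s)."""
    _, n_vec, b_vec = frame(s)
    phi = (R * math.cos(s / R), R * math.sin(s / R), 0.0)
    z1, z2 = A * math.cos(alpha), B_AX * math.sin(alpha)
    return tuple(phi[i] + z1 * n_vec[i] + z2 * b_vec[i] for i in range(3))


def surface_normal(s, alpha, h=1e-5):
    """Approximate d_s psi x d_alpha psi with central finite differences."""
    ds = tuple((p - m) / (2 * h)
               for p, m in zip(psi(s + h, alpha), psi(s - h, alpha)))
    da = tuple((p - m) / (2 * h)
               for p, m in zip(psi(s, alpha + h), psi(s, alpha - h)))
    return cross(ds, da)


def normal_dot_tangent(s, alpha):
    """Inner product of the (unnormalized) surface normal with T(s)."""
    t_vec, _, _ = frame(s)
    return dot(surface_normal(s, alpha), t_vec)
```

For a planar skeleton the torsion vanishes, so the returned inner product should be zero up to discretization error even though the cross section used here is not a disk.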
We suppose that $\tilde{Q}$ is not self-intersecting, namely, we require that the diameter of $\Sigma$ is sufficiently small so that the map $\Phi : S^1 \times \Sigma \to \tilde{Q}$ given by $(s, p) \mapsto \varphi(s) + F(s)p$ is a diffeomorphism. In particular, we assume that $\max_\alpha \|\zeta(\alpha)\| < (\max_s |\kappa(s)|)^{-1}$, where $\kappa$ is the curvature of $\varphi$.
Proposition 1. Consider a tubular neighborhood $\tilde{Q}$ of $\mathbb{R}^3$, and assume that its cross section is a circular disk or that each curve $\varphi([a_i, b_i])$ is planar. Then the billiard inside $\tilde{Q}$ has the unidirectionality property.
Proof. Let $s \in [a_i, b_i]$ and $\alpha \in [0, 2\pi)$. The vectors $\partial_s\psi = T(s) + F'(s)\zeta(\alpha)$ and $\partial_\alpha\psi = F(s)\zeta'(\alpha)$ span the tangent plane of $\partial\tilde{Q}$ at $\psi(s, \alpha)$. Consider the vector $\tilde{n}(s, \alpha) = \partial_s\psi \wedge \partial_\alpha\psi$. We recall that the Frenet equations read as follows: $T'(s) = \kappa(s)N(s)$, $N'(s) = -\kappa(s)T(s) + \tau(s)B(s)$ and $B'(s) = -\tau(s)N(s)$, where $\kappa$ and $\tau$ are the curvature and torsion of $\varphi$, respectively (see [Kl, Chap. 1]). A little lengthy but simple computation using the Frenet equations shows that
$$\tilde{n}(s, \alpha) = (1 - \zeta_1(\alpha)\kappa(s))\, T(s) \wedge F(s)\zeta'(\alpha) - \tau(s)\, F(s)J\zeta(\alpha) \wedge F(s)\zeta'(\alpha), \tag{1}$$
where $\zeta_1(\alpha)$ is the component of $\zeta(\alpha)$ along $N(s)$, and $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Since $\zeta$ is piecewise regular, it follows that $\tilde{n}(s, \alpha) \neq 0$, which in turn guarantees that $\tilde{n}(s, \alpha)$ is parallel to the normal line to $\partial\tilde{Q}$ through $\psi(s, \alpha)$. It is now easy to obtain $\langle \tilde{n}(s, \alpha), T(s)\rangle = -\tau(s)\langle \zeta(\alpha), \zeta'(\alpha)\rangle$. By hypothesis, we have $\tau(s) = 0$ ($\varphi$ is planar) or $\langle \zeta(\alpha), \zeta'(\alpha)\rangle = 0$ ($\zeta$ is a circle). Note that if $\varphi([a_i, b_i])$ is a straight segment, then $\tau(s) = 0$. We can therefore conclude that $\langle \tilde{n}(s, \alpha), T(s)\rangle = 0$, meaning that the plane orthogonal to $T(s)$ is orthogonal to the boundary of $\tilde{Q}$. It follows that the set $N = \{(q, v) \in M : \langle v, n(q)\rangle = 0\}$ is invariant. To obtain this conclusion, it is crucial that $\partial\tilde{Q}$ is not self-intersecting, otherwise the parametrization $\psi$ of $\partial\tilde{Q}$ ceases to be valid.
Consider now the billiard flow $t \mapsto (q(t), v(t))$ inside $\tilde{Q}$. Let $v_*(t) = \langle v(t), T_*(t)\rangle$, where $T_*(t) = T(s(t))$ and $s(t)$ is given by $(s(t), p(t)) = \Phi^{-1}(q(t))$.
Suppose that the particle motion is defined for $t \in (-\epsilon, \epsilon)$ with $\epsilon > 0$, and that during such an interval of time there is only one collision with $\partial\tilde{Q}$ at time $t = 0$. We claim that $v_*$ is continuous on $(-\epsilon, \epsilon)$. First, note that $T_*$ is continuous on $(-\epsilon, \epsilon)$, and that $v$ is continuous on $(-\epsilon, 0) \cup (0, \epsilon)$. Thus to prove the claim, all we have to do is to show that $\lim_{t \to 0^-} v_*(t)$ and $\lim_{t \to 0^+} v_*(t)$ exist and coincide. The existence of these limits is obvious. Now, by the reflection law, we have $v(0^+) = v(0^-) - 2\langle v(0^-), n(q)\rangle n(q)$. Since $\langle n(q(0)), T_*(0)\rangle = 0$ by previous results, we see that $\langle v(0^+), T_*(0)\rangle = \langle v(0^-), T_*(0)\rangle$, which completes the proof of the claim.
We are now in a position to prove that the billiard in $\tilde{Q}$ has the unidirectionality property. First, note that the invariance of $N$ implies that if $v_*(\bar{t}) = 0$ for some billiard orbit and some $\bar{t} \in \mathbb{R}$, then $v_*$ is identically zero along that orbit. Now, suppose that the billiard in $\tilde{Q}$ does not have the unidirectionality property. Then, we can find a billiard orbit such that $v_*(t_1)v_*(t_2) < 0$ for some $t_1 < t_2$. Since $v_*$ is continuous, there exists $t_1 < \bar{t} < t_2$ for which $v_*(\bar{t}) = 0$. By the previous observation, it follows that $v_* \equiv 0$, which contradicts $v_*(t_1)v_*(t_2) < 0$.
We now prove the unidirectionality property for billiards in 2-dimensional tracks.
Corollary 1. A billiard in a track $Q \subset \mathbb{R}^2$ has the unidirectionality property.
Proof. Embed $Q$ in $\mathbb{R}^3$, namely, identify $\mathbb{R}^2$ with some plane $P \subset \mathbb{R}^3$, and denote by $\tilde{Q}$ the tubular neighborhood of $\gamma$ in $\mathbb{R}^3$ with circular section $\Sigma$. Clearly, $Q = \tilde{Q} \cap P$. Next, choose the parametrization $\zeta$ of the circle $\partial\Sigma$ so that $F(s)\zeta(0) \in P$. In other words, we require the curve $F(s)\zeta(\alpha)$ to intersect $P$ for $\alpha = 0$. If $\gamma$ is a straight segment, then in order for the previous statement to make sense, the vector $N(s)$ has to be chosen so as to lie in $P$ (recall that $N$ and $B$ are arbitrarily selected if $\gamma$ is a straight segment). Now, let $\tilde{n}$ be as in the proof of Proposition 1.
The corollary will be proved once we show that $\tilde{n}(s, 0) = \lambda N(s)$ for some $\lambda \neq 0$. Indeed, from (1) and the fact that $\tau \equiv 0$, we obtain immediately that $\tilde{n}(s, 0) = -(1 - \zeta_1(0)\kappa(s))N(s)$. To complete the proof, just note that $|\zeta_1(0)\kappa(s)| < 1$ by assumption.
2.3. Main result. The map $T$ is called (nonuniformly) hyperbolic if all its Lyapunov exponents are non-zero almost everywhere on $\tilde{M}$.
Definition 1. We say that a circular guide is of type A if $\alpha \ge \pi$ (and no conditions on $r_1$ and $r_2$ are imposed), and is of type B if $r_2/r_1 < 1/2$ (and no conditions on $\alpha$ are imposed).
We now state the main result of this paper.
Theorem 1. Let $Q$ be a track, and suppose that each circular guide of $Q$ is of type A or B. Then the billiard map $T$ in $Q$ is hyperbolic provided that the straight guides of $Q$ are sufficiently long.
To our knowledge, all the recipes for designing hyperbolic billiard domains including focusing and dispersing curves in their boundaries require these curves to be placed sufficiently apart [B1,C-M,M1,W2,W3]. Since for guides of type A, there is no restriction on the distance between the outer and inner circles, Theorem 1 tells us that there do exist hyperbolic billiard domains that violate the condition on the separation between focusing and
Fig. 3. Consecutive collisions inside a circular guide
dispersing boundary components. This means that the mechanism generating hyperbolicity in these billiards is not the defocusing one, which requires that after a reflection off a focusing curve, an infinitesimal family of parallel trajectories must focus and defocus before the next collision with the boundary of the billiard table. However, we have to point out that circular guides are very special domains, because the billiards inside them are integrable. Also, note that we still need to put circular guides sufficiently far away from each other in order to obtain hyperbolicity. While writing this paper, we learned that Bussolari and Lenci also constructed hyperbolic billiards (different from track billiards) that violate the aforementioned separation condition [B-L].
2.4. Billiard dynamics in a circular guide. To prove Theorem 1, it is essential to investigate the billiard dynamics inside a circular guide. Consider a circular guide with outer and inner radii $r_1 = 1$ and $0 < r_2 = r < 1$, respectively. Note that by a proper rescaling, every circular guide can be transformed into such a guide. Denote by $M_1$ the set of all collisions $(q, v)$ such that $q$ belongs to the outer circle of the guide. Note that the ray $\{q + tv : t \ge 0\}$ emerging from $x = (q, v) \in M_1$ is tangent to the full circle containing the inner arc of the guide if and only if $\theta(x) \in \{\bar\theta, \pi - \bar\theta\}$, where $\bar\theta = \cos^{-1} r \in (0, \pi/2)$. Let $D_1 = M_1 \setminus \theta^{-1}(\{0, \pi, \bar\theta, \pi - \bar\theta\})$. We say that a collision $x \in D_1$ is 'leaving (the guide)' if the only point of intersection between the ray emerging from $x = (q, v)$ and the guide is $q$. We say that a collision $x \in D_1$ is 'entering (the guide)' if $-x$ is leaving the guide. For every $x \in D_1$, denote by $n_1(x) \ge 0$ the number of times that the particle with initial state $x$ hits the outer circle before leaving the guide.
We will focus our attention on the transformation $T_1 : D_1 \to D_1$ that maps a collision with the outer circle to the next collision with the same circle. More precisely, for every $(s, \theta) \in D_1$, define
$$T_1(s, \theta) = \begin{cases} (s + 2\delta(\theta), \theta) & \text{if } n_1(x) > 0, \\ (s, \theta) & \text{if } n_1(x) = 0, \end{cases} \tag{2}$$
where $2\delta(\theta)$ is the central angle of the sector bounded by the two consecutive collisions with the outer circle (see Fig. 3). For $\theta \in (0, \bar\theta) \cup (\pi - \bar\theta, \pi)$, it is trivial to check that $\delta(\theta) = \theta$. For $\theta \in (\bar\theta, \pi - \bar\theta)$ instead, we see from Fig. 3 that $\delta(\theta) = \theta - \phi(\theta)$, where $\phi(\theta)$ is the angle of the collision with the inner circle. The relation between $\theta$ and $\phi$ is provided by the conservation of the angular momentum of the particle measured from
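the center of the circular guide, which reads as $\cos\theta = r\cos\phi$. Putting all together, we obtain
$$\delta(\theta) = \begin{cases} \theta - \cos^{-1}\left(\dfrac{\cos\theta}{r}\right) & \text{if } \theta \in (\bar\theta, \pi - \bar\theta), \\ \theta & \text{if } \theta \in (0, \bar\theta) \cup (\pi - \bar\theta, \pi). \end{cases} \tag{3}$$
The function $\delta$ is differentiable on $(0, \pi) \setminus \{\bar\theta, \pi - \bar\theta\}$, and $\delta'(\theta) \to -\infty$ as $\theta \to \bar\theta^+$ or $\theta \to (\pi - \bar\theta)^-$. By abuse of notation, we define $\delta(x) = \delta(\theta(x))$ and $\delta'(x) = \delta'(\theta(x))$ for every $x \in D_1$. From (2), it follows that for every $x \in D_1$,
$$D_x T_1^{n_1(x)} = \begin{pmatrix} 1 & 2n_1(x)\delta'(x) \\ 0 & 1 \end{pmatrix}. \tag{4}$$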
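Formulas (3) and (4) are easy to experiment with. The sketch below is illustrative only: it works in the normalized guide ($r_1 = 1$, $r_2 = r$), the function names are hypothetical, and `T1_iterate` assumes that all `n` collisions occur before the particle leaves the guide, so the branch $n_1(x) = 0$ of (2) and the finite central angle are not modeled.

```python
import math


def delta(theta, r):
    """Half of the central angle between consecutive outer collisions, as in (3)."""
    theta_bar = math.acos(r)
    if theta_bar < theta < math.pi - theta_bar:
        # The ray hits the inner circle between two outer collisions.
        return theta - math.acos(math.cos(theta) / r)
    # The ray misses the inner circle.
    return theta


def delta_prime(theta, r):
    """Derivative of delta with respect to theta (away from the two break points)."""
    theta_bar = math.acos(r)
    if theta_bar < theta < math.pi - theta_bar:
        return 1.0 - math.sin(theta) / math.sqrt(r * r - math.cos(theta) ** 2)
    return 1.0


def T1_iterate(s, theta, r, n):
    """Apply the outer-circle map n times: s advances by 2*delta, theta is fixed."""
    return (s + 2 * n * delta(theta, r), theta)


def DT1_power(theta, r, n):
    """Matrix of the n-th iterate in the coordinates (s, theta), as in (4)."""
    return [[1.0, 2 * n * delta_prime(theta, r)], [0.0, 1.0]]
```

Comparing `delta_prime` with a finite-difference quotient of `delta` is a quick way to confirm the closed form of $\delta'$ used in the lemmas that follow.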
2.5. Preliminary lemmas. We now prove some facts that will play a crucial role in the proof of the hyperbolicity of track billiards. The goal here is to estimate the quantity $2n_1(x)\delta'(x)$ for $x \in D_1$.
Definition 2. Let $E_1 = \{x \in D_1 : x \text{ is entering and } T_1^{n_1(x)}x \text{ is leaving}\}$. The set $E_1$ can be partitioned as follows: $E_1 = E_0 \cup E_+ \cup E_-$, where
– $E_0 = \{x \in E_1 : n_1(x) = 0\}$,
– $E_+ = \{x \in E_1 \setminus E_0 : \theta(x) \in (\bar\theta, \pi - \bar\theta)\}$,
– $E_- = \{x \in E_1 \setminus E_0 : \theta(x) \in (0, \bar\theta) \cup (\pi - \bar\theta, \pi)\}$.
For every $x \in E_1$, define $\omega(x) = \alpha - 2n_1(x)\delta(x)$, and $\chi(x) = 2n_1(x)\delta'(x)$.
Remark 1. From the definition of $\omega(x)$, it follows that $0 \le \omega(x) < 2\delta(x)$ for every $x \in E_1$.
The next lemma is a trivial consequence of the definition of $E_0$ and the fact that $\delta'(x) = 1$ for all $x \in E_-$.
Lemma 1. If $x \in E_0 \cup E_-$, then $\chi(x) = 2n_1(x)$.
We now restrict our analysis to the circular guides of type A and B.
Lemma 2. Consider a circular guide of type A. There exists a function $\chi_A = \chi_A(r, \alpha)$ non-increasing in $\alpha$ such that $\chi(x) \le \chi_A$ for every $x \in E_+$.
Proof. By the symmetry of the guide, it is enough to prove the lemma for $x \in E_+$ such that $\theta(x) \in (\bar\theta, \pi/2)$. For such values of $\theta$, we have
$$\delta'(\theta) = 1 - \frac{\sin\theta}{\sqrt{r^2 - \cos^2\theta}} < 0$$
and
$$\delta''(\theta) = \frac{\cos\theta}{(r^2 - \cos^2\theta)^{3/2}}\,(1 - r^2) > 0.$$
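Since $\delta'(\theta) \to -\infty$ as $\theta \to \bar\theta^+$, we can find $\vartheta \in (\bar\theta, \pi/2)$ such that $\chi(x) < -3$ for every $x \in E_+$ with $\theta(x) \in (\bar\theta, \vartheta]$. We now consider the case $x \in E_+$ with $\theta(x) \in (\vartheta, \pi/2)$. It is trivial to see that for every $\theta \in (\vartheta, \pi/2)$, we have $\delta(\theta) = -\delta'(\theta)\Delta_\theta$, where $\Delta_\theta$ is the length of the segment lying on the $\theta$-axis whose endpoints are $\theta$ and the intersection point of the tangent of the graph of $\delta$ at $(\theta, \delta(\theta))$ with the $\theta$-axis. Since $\delta$ is strictly convex and $\delta(\pi/2) = 0$, it follows that $0 < \Delta_\theta < \pi/2 - \theta$ for every $\theta \in (\vartheta, \pi/2)$. Hence
$$\frac{\delta'(\theta)}{\delta(\theta)} = -\frac{1}{\Delta_\theta} < -\frac{2}{\pi - 2\theta} \quad \text{for } \theta \in (\vartheta, \pi/2). \tag{5}$$
Now, note that $\alpha - \omega(x) > 0$ because $n_1(x) > 0$. Combining the last observation, inequality (5) and Remark 1, we obtain
$$\chi(x) = (\alpha - \omega(x))\,\frac{\delta'(\theta(x))}{\delta(\theta(x))} < -2\,\frac{\alpha - \omega(x)}{\pi - 2\theta(x)} < -2\,\frac{\alpha - 2\delta(\theta(x))}{\pi - 2\theta(x)} \quad \text{for } \theta(x) \in (\vartheta, \pi/2).$$
Let $h(\alpha, \theta) = -2(\alpha - 2\delta(\theta))/(\pi - 2\theta)$. Since $\pi \le \alpha$ and $\delta(\theta) < \theta$, it is easy to see that $\partial_\alpha h < 0$ and $h(\alpha, \vartheta) < -2$. So $h$ is strictly decreasing, and therefore
$$\chi(x) < h(\alpha, \vartheta) < -2 \quad \text{for } \theta(x) \in (\vartheta, \pi/2).$$
To complete the proof, set $\chi_A = \max\{-3, h(\alpha, \vartheta)\}$, and observe that $\chi_A$ is a non-increasing function of $\alpha$.
Since $\delta'$ is strictly increasing for $\theta \in (\bar\theta, \pi/2)$ (see the proof of Lemma 2), we have $\delta'(x) \le \delta'(\pi/2) = 1 - 1/r$ for every $x \in D_1$ such that $\theta(x) \in (\bar\theta, \pi - \bar\theta)$. This simple fact proves immediately the following lemma, saying that a result similar to Lemma 2 holds true for circular guides of type B.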
Lemma 3. Consider a guide of type B, and let $\chi_B = \chi_B(r) = 2(1 - 1/r) < -2$. Then $\chi(x) \le 2n_1(x)(1 - 1/r) \le \chi_B$ for every $x \in E_+$.
Remark 2. It is precisely the fact that $|\chi(x)| > 2$, proved in the previous lemmas, which allows us to think of circular guides as optical devices having the property of focusing in a controlled way infinitesimal families of parallel rays entering the guide. In this sense, we can think of circular guides of type A and B as some sort of generalized absolutely focusing curves [B2,D].
3. Hyperbolicity
In this section, we prove that, under proper conditions concerning the circular guides and the distance between them, a track billiard admits an eventually strictly invariant cone field. By a well known result of Wojtkowski [W2], this property implies Theorem 1.
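The bound behind Lemma 3 can be checked numerically. The following sketch is only an illustration in the normalized guide ($r_1 = 1$, $r_2 = r$): it samples $\delta'$ on a uniform grid inside $(\bar\theta, \pi - \bar\theta)$ and compares the largest sampled value with $1 - 1/r$. The grid size and the function names are assumptions made for this example.

```python
import math


def delta_prime(theta, r):
    """delta'(theta) for theta in (theta_bar, pi - theta_bar), theta_bar = acos(r)."""
    return 1.0 - math.sin(theta) / math.sqrt(r * r - math.cos(theta) ** 2)


def chi_B(r):
    """The constant chi_B = 2 (1 - 1/r) of Lemma 3."""
    return 2.0 * (1.0 - 1.0 / r)


def max_delta_prime(r, samples=2000):
    """Largest value of delta' over interior grid points of (theta_bar, pi - theta_bar)."""
    theta_bar = math.acos(r)
    lo, hi = theta_bar, math.pi - theta_bar
    best = -math.inf
    for k in range(1, samples):
        theta = lo + (hi - lo) * k / samples
        best = max(best, delta_prime(theta, r))
    return best
```

With an even number of samples the grid contains the midpoint $\theta = \pi/2$, where the maximum $1 - 1/r$ is attained; for $r < 1/2$ this value is below $-1$, which is what makes $\chi_B < -2$.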
3.1. Focusing times. Recall that $M$ is the phase space of the billiard inside the track $Q$. Given a tangent vector $u \in T_xM$ at $x \in \operatorname{int} M$, let $s \mapsto \gamma(s) = (q(s), v(s)) \in \operatorname{int} M$ be a differentiable curve such that $\gamma(0) = x$ and $\gamma'(0) = u$. Next, define a family of lines $s \mapsto \gamma_+(s)$ by setting $\gamma_+(s) = \{q(s) + tv(s) : t \in \mathbb{R}\}$. Similarly, define a second family of lines $s \mapsto \gamma_-(s)$ by replacing $\gamma$ with $-\gamma$ in the definition of $\gamma_+$. In geometrical terms, $\gamma_-$ is obtained from $\gamma_+$ by reflecting its lines at $\partial Q$. All the lines of $\gamma_+$ (respectively $\gamma_-$) intersect in linear approximation at one point along the line $\gamma_+(0)$ (respectively $\gamma_-(0)$). This point is called a focal point of $u$. If $x = (s, \theta)$ and $u = (ds, d\theta)$, then the distances between $\pi(x)$ and the focal points of $u$ lying on $\gamma_+(0)$ and $\gamma_-(0)$ are, respectively, given by
$$f_+(u) = \frac{\sin\theta}{\kappa(s) + m(u)} \tag{6}$$
and
$$f_-(u) = \frac{\sin\theta}{\kappa(s) - m(u)}, \tag{7}$$
where $\kappa(s)$ is the curvature of $\partial Q$ at $s$ and $m(u) = d\theta/ds$ (see for example, [W2]). We conventionally assume that the curvature of the outer circle is positive, whereas the curvature of the inner circle is negative. The distances $f_+(u)$ and $f_-(u)$ are called forward and backward focusing times of $u$. By summing the reciprocals of $f_+(u)$ and $f_-(u)$, we obtain the well known Mirror Formula¹
$$\frac{1}{f_+(u)} + \frac{1}{f_-(u)} = \frac{2\kappa(s)}{\sin\theta}.$$
3.2. Fractional linear transformation.
Definition 3. Let $E$ be the set of all collisions $x \in M \setminus S_1^+$ entering a circular guide of $Q$. Also, for every $x \in E$, denote by $n(x) \ge 0$ the number of times that the particle with initial state $x$ hits the boundary of the circular guide before leaving it.
Following [W3], we now introduce a transformation describing the relation between the focusing times of an infinitesimal family of billiard trajectories at the entrance and at the exit of a circular guide. Let $x \in E$, and consider $0 \neq u \in T_xM$. Next, denote by $F_x$ the map from the real projective line $\mathbb{R} \cup \{\infty\}$ to itself given by $f_-(u) \mapsto f_+(D_xT^{n(x)}u)$. Using the Mirror Formula, one can deduce that $F_x$ is a linear fractional transformation (restricted to $\mathbb{R} \cup \{\infty\}$)
$$F_x(f) = \frac{a(x)f + b(x)}{c(x)f + d(x)} \quad \text{for } f \in \mathbb{R} \cup \{\infty\},$$
where $a(x), b(x), c(x), d(x)$ are real numbers satisfying $a(x)d(x) - b(x)c(x) < 0$. For $x \in E_1$, the analytic expression of $a(x), b(x), c(x), d(x)$ will be derived in the proof of Theorem 2. The inequality $a(x)d(x) - b(x)c(x) < 0$ implies that $dF_x/df$ is negative on $\mathbb{R}$. Therefore, the transformation $F_x$ has two fixed points $f_1(x)$ and $f_2(x)$ on the real line. We will always assume that $f_1(x) \ge f_2(x)$. The following lemma is an immediate consequence of the monotonicity of $F_x$.
¹ The convention on the signs of the focusing times adopted here is different from that used in [W2].
Lemma 4. For every $x \in E$, we have
$$f < f_2(x) \ \text{or} \ f > f_1(x) \iff f_2(x) < F_x(f) < f_1(x).$$
Definition 4. We call the focal length of a circular guide the number
$$\tilde{f} = \sup_{x \in E} f_1(x).$$
In the next theorem, we prove that the focal length of a circular guide of type A or B is always bounded above.
Theorem 2. Let $\tilde\chi = \chi_A$ for a guide of type A, and $\tilde\chi = \chi_B$ for a guide of type B. Then
$$\tilde{f} \le \frac{\tilde\chi}{\tilde\chi + 2}.$$
Proof. We first prove that $\sup_{x \in E_1} f_1(x) \le \tilde\chi/(2 + \tilde\chi)$. To this end, we need to compute the fixed points of $F_x$ for $x \in E_1$. Note that $n(x) = n_1(x)$ in this case. By (6) and (7), we have $f = f_-(u) = \sin\theta(x)(1 - m(u))^{-1}$ and $f_+(D_xT^{n_1(x)}u) = \sin\theta(x)(1 + m(D_xT^{n_1(x)}u))^{-1}$ for $0 \neq u \in T_xM$. It follows from (4) that $m(D_xT^{n_1(x)}u) = m(u)(1 + \chi(x)m(u))^{-1}$. A straightforward computation yields
$$F_x(f) = \frac{\sin\theta(x)(1 + \chi(x))\,f - \sin^2\theta(x)\,\chi(x)}{(2 + \chi(x))\,f - \sin\theta(x)(1 + \chi(x))}.$$
By Lemmas 1, 2 and 3, we know that $\chi(x) \ge 0$ for $x \in E_0 \cup E_-$, and $\chi(x) < -2$ for $x \in E_+$. It is then easy to check that the fixed points of $F_x$ are given by
$$f_1(x) = \begin{cases} \sin\theta(x) & \text{if } x \in E_0 \cup E_-, \\ \dfrac{\sin\theta(x)\chi(x)}{2 + \chi(x)} & \text{if } x \in E_+, \end{cases}$$
and
$$f_2(x) = \begin{cases} \dfrac{\sin\theta(x)\chi(x)}{2 + \chi(x)} & \text{if } x \in E_0 \cup E_-, \\ \sin\theta(x) & \text{if } x \in E_+. \end{cases}$$
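The fixed points of $F_x$ can be verified numerically. The sketch below is a hedged illustration for the normalized guide (curvature 1 on the outer circle): `F` uses the closed form obtained by substituting $m(u) = 1 - \sin\theta/f$ into the expressions from (6), (7) and (4), which gives the constant term $\sin^2\theta\,\chi$ in the numerator, while `F_via_slopes` performs the same substitution step by step. The function names and the sample values of $\theta$ and $\chi$ are hypothetical.

```python
import math


def F(f, theta, chi):
    """Closed form of the linear fractional map F_x for x in E_1."""
    s = math.sin(theta)
    return (s * (1 + chi) * f - s * s * chi) / ((2 + chi) * f - s * (1 + chi))


def F_via_slopes(f, theta, chi):
    """The same map, computed from the slopes m = d(theta)/ds (kappa = 1)."""
    s = math.sin(theta)
    m_in = 1.0 - s / f                    # invert f = s / (1 - m)
    m_out = m_in / (1.0 + chi * m_in)     # action of [[1, chi], [0, 1]] on the slope
    return s / (1.0 + m_out)              # forward focusing time at the exit


def fixed_points(theta, chi):
    """Return (f1, f2) with f1 >= f2, the two real fixed points of F."""
    s = math.sin(theta)
    a, b = s, s * chi / (2 + chi)
    return (max(a, b), min(a, b))


def determinant(theta, chi):
    """a*d - b*c for the coefficients of F; it equals -sin(theta)**2."""
    s = math.sin(theta)
    return (s * (1 + chi)) * (-s * (1 + chi)) - (-s * s * chi) * (2 + chi)
```

For $\chi < -2$ the larger fixed point is $\sin\theta\,\chi/(2 + \chi)$, and for $\chi \ge 0$ it is $\sin\theta$, in agreement with the case distinction above.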
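Observe that $\chi(x)(2 + \chi(x))^{-1} \ge 1$ for $x \in E_+$, and the function $z \mapsto z(z + 2)^{-1}$ is increasing for $z \in (-\infty, -2)$. Hence
$$\sup_{x \in E_1} f_1(x) \le \sup_{x \in E_+} \frac{\chi(x)}{2 + \chi(x)} \le \frac{\sup_{x \in E_+}\chi(x)}{2 + \sup_{x \in E_+}\chi(x)} \le \frac{\tilde\chi}{2 + \tilde\chi}. \tag{8}$$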
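The two elementary properties of $z \mapsto z/(z + 2)$ used in (8) can be verified directly. The following short computation is only a check, with $z$ standing for $\chi(x)$ on $E_+$, where $z < -2$:

```latex
\[
\frac{d}{dz}\,\frac{z}{z+2} \;=\; \frac{2}{(z+2)^{2}} \;>\; 0 \quad (z \neq -2),
\qquad
\frac{z}{z+2} - 1 \;=\; \frac{-2}{z+2} \;>\; 0 \quad (z < -2).
\]
```

The first identity gives the monotonicity on $(-\infty, -2)$, and the second gives $z/(z+2) > 1$ there; together with $\sin\theta(x) \le 1$, this is what the chain of inequalities in (8) relies on.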
We now consider the case x ∈ E\E 1 . This time, rather than computing directly the fixed points of Fx , we will try to reduce the current case to the one studied in the first part of this proof. This will be done by considering our guide as contained in a larger guide. In fact, a circular guide can always be embedded into a larger guide: the radii of the outer and inner circles of the larger guide coincide with those of the original guide,
Fig. 4. The solid and dotted curves denote the original and the enlarged guide, respectively, as described in the proof of Theorem 2
but if $\beta$ and $\alpha$ are the central angles of the larger and the original guide, respectively, then $\beta > \alpha$. In the rest of this proof, the symbols denoting the billiard transformation and related mathematical objects for the original guide will be used with the addition of a hat to denote their counterparts for the larger guide. Thus, for example, $\hat{M}$ denotes the set of all possible collisions for the larger guide. Note that the sets $M$ and $\hat{M}$ are subsets of the unit tangent bundle of $\mathbb{R}^2$. Accordingly, given $x \in M$ and $y \in \hat{M}$, we write $x = y$ if $x$ and $y$ coincide as tangent vectors of $\mathbb{R}^2$.
After this parenthesis on the notation, we can resume our proof. We embed the original guide into a larger guide so that there exists $y \in \hat{E}_1$ with $n(x) \le \hat{n}(y) \le n(x) + 2$ for which $\{x, \ldots, T^{n(x)}x\} \subset \{y, \ldots, \hat{T}^{\hat{n}(y)}y\}$. The condition $n(x) \le \hat{n}(y) \le n(x) + 2$ implies $\hat{T}y = x$ or $\hat{T}^{\hat{n}_1(y)-1}y = T^{n(x)}x$. We will study only the case $\hat{T}y = x$, because the case $y = x$ can be studied similarly. The case $\hat{T}y = x$ can be further split into two subcases: i) $\hat{T}y = x$ and $\hat{T}^{\hat{n}_1(y)-1}y = T^{n(x)}x$ and ii) $\hat{T}y = x$ and $\hat{T}^{\hat{n}_1(y)}y = T^{n(x)}x$. Again, we will only consider the second subcase, because the first can be studied similarly (as a matter of fact, its analysis is easier).
Hence, we assume that $\hat{T}y = x$ and $\hat{T}^{\hat{n}_1(y)}y = T^{n(x)}x$. Denote by $f_+$ the difference between the length of the segment connecting $\pi(x)$ with $\pi(y)$ and $f_1(x)$. Next, $f_-$ can be found by using the Mirror Formula $f_-^{-1} + f_+^{-1} = 2/\sin\hat\theta(y)$. Note that $\hat\theta(y) = \theta(x)$. By the definition of the map $F_x$, it follows that $\hat{F}_y(f_-) = F_x(f_1)$ (see Fig. 4), and so
$$\hat{F}_y(f_-) = f_1(x). \tag{9}$$
We want to show that $f_1(x) \le \hat{f}_1(y)$. To do this, we argue by contradiction. Suppose that $f_1(x) > \hat{f}_1(y)$. Since $y \in \hat{E}_1$, we know from the first part of this proof that the fixed points of $\hat{F}_y$ satisfy $\sin\hat\theta(y) = \hat{f}_2(y) < \hat{f}_1(y)$. Also, since $\hat\theta(y) \in (\bar\theta, \pi - \bar\theta)$, it is easy to check that the length of the segment connecting $\pi(x)$ with $\pi(y)$ is less than $\sin\hat\theta(y)$. Hence $f_+ < 0$. The Mirror Formula then implies that $0 < f_- < (\sin\hat\theta(y))/2 < \hat{f}_2(y)$. By Lemma 4, it follows that $\hat{f}_2(y) < \hat{F}_y(f_-) < \hat{f}_1(y)$, and so $\hat{f}_2(y) < f_1(x) < \hat{f}_1(y)$ by (9). The last inequality contradicts our assumption. Thus $f_1(x) \le \hat{f}_1(y)$.
By the first part of this proof, the right-hand side of the previous inequality is bounded above by $\tilde\chi(\beta)(2 + \tilde\chi(\beta))^{-1}$. Since $\tilde\chi(\beta)$ is non-increasing in $\beta$ (in fact, $\tilde\chi$ is independent of $\beta$ for a guide of type B), we conclude that $f_1(x) \le \tilde\chi(\alpha)(2 + \tilde\chi(\alpha))^{-1}$. This completes the proof.
3.3. Cone fields. A cone in a 2-dimensional space $V$ is a subset $C = \{aX_1 + bX_2 : ab \ge 0\}$, where $X_1$ and $X_2$ are two linearly independent vectors of $V$. Equivalently, we can say that the cone $C$ is a closed interval of the projective space $P(V)$, the space of the lines in $V$. The interior of $C$ is defined by $\operatorname{int} C = \{aX_1 + bX_2 : ab > 0\} \cup \{0\}$. Since the backward focusing time $f_-$ and the forward focusing time $f_+$ are both projective coordinates of $P(T_xM)$, the set $C = \{u \in T_xM : f_-(u) \in I\}$ (respectively, with $f_+(u)$ in place of $f_-(u)$) is a cone in $T_xM$ for every closed interval $I \subset \mathbb{R}$.
Let $\Omega$ be a subset of $\tilde{M}$ such that $\mu(\Omega) > 0$. Denote by $T_\Omega : \Omega \to \Omega$ the first return map on $\Omega$ induced by the billiard map $T$. Also, denote by $\mu_\Omega$ the probability measure on $\Omega$ obtained by normalizing the restriction of $\mu$ to $\Omega$. It is well known that the map $T_\Omega$ preserves $\mu_\Omega$.
Definition 5. A measurable cone field $C$ on $\Omega$ is a measurable map that associates to each $x \in \Omega$ a cone $C(x) \subset T_xM$. We say that $C$ is eventually strictly invariant if for every $x \in \Omega$, we have
1. $D_xT_\Omega C(x) \subset C(T_\Omega x)$,
2. there exists an integer $k(x) > 0$ such that $D_xT_\Omega^{k(x)}C(x) \subset \operatorname{int} C(T_\Omega^{k(x)}x)$.
Remark 3. By [W2], the existence of such a cone field (plus other properties, always satisfied by track billiards) implies that $T_\Omega$ is hyperbolic. Furthermore, if the set $\cup_{k \in \mathbb{Z}} T^k\Omega$ has full $\mu$-measure, then it is not difficult to see that $T$ is hyperbolic as well (see [W1]).
We now define an invariant cone field for circular track billiards. In the next subsection, we will show, relying on Lemmas 2 and 3 and Theorem 2, that this cone field is eventually strictly invariant provided that the straight guides of a track are sufficiently long. Let $\tilde{E} = E \cap \tilde{M}$ be the set of entering collisions with infinite positive and negative semi-orbits. We define a measurable cone field on $\tilde{E}$ as follows:
$$C(x) = \{u \in T_xM : f_-(u) \ge \tilde{f}(x)\} \quad \text{for all } x \in \tilde{E}, \tag{10}$$
where $\tilde{f}(x)$ is the focal length of the circular guide containing $\pi(x)$. The cone field $C$ is continuous on $\tilde{E}$ (and therefore measurable), because so are $f_-$ (as a function of $x$) and $\tilde{f}$.
3.4. Hyperbolicity. Let $Q$ be a track, and assume that its guides are ordered in such a way that the $i$th straight guide connects the $i$th and $(i+1)$th circular guides. The $(n+1)$th circular guide coincides with the first one so that there are exactly $n$ circular guides separated by $n$ straight guides. We also assume that each circular guide is of type A or B. For every $1 \le i \le n$, let $\tilde{f}_i$ and $l_i$ be the focal length of the $i$th circular guide and the length of the $i$th straight guide, respectively. We say that such a track $Q$ satisfies Condition H if the distance between any pair of consecutive circular guides of $Q$ is greater than the sum of the focal lengths of the two circular guides, i.e.,
$$l_i > \tilde{f}_i + \tilde{f}_{i+1} \quad \text{for each } i = 1, \ldots, n. \tag{H}$$
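Condition H is straightforward to test for a concrete track once focal lengths, or upper bounds for them, are available. The sketch below is a hedged illustration: `focal_bound_type_B` evaluates the bound $\chi_B/(\chi_B + 2)$ of Theorem 2 and multiplies it by the outer radius, which assumes that the bound for the normalized guide ($r_1 = 1$) scales linearly under the rescaling mentioned in Subsect. 2.4. Because it uses upper bounds rather than the focal lengths themselves, a positive answer is a sufficient check only. All function names are hypothetical.

```python
def focal_bound_type_B(r1, r2):
    """Upper bound for the focal length of a type B guide with radii r1 > r2 > 0.

    Assumption: lengths of the normalized guide (outer radius 1) are rescaled by r1.
    """
    r = r2 / r1
    if not 0.0 < r < 0.5:
        raise ValueError("a type B guide requires 0 < r2/r1 < 1/2")
    chi = 2.0 * (1.0 - 1.0 / r)          # chi_B of Lemma 3, always < -2 here
    return r1 * chi / (chi + 2.0)


def satisfies_condition_H(straight_lengths, focal_lengths):
    """Check l_i > f_i + f_{i+1} for all i, with indices taken cyclically.

    straight_lengths[i] is the length of the straight guide joining circular
    guides i and i+1; focal_lengths[i] is the focal length (or a bound) of guide i.
    """
    n = len(focal_lengths)
    if len(straight_lengths) != n:
        raise ValueError("expected one straight guide per circular guide")
    return all(
        straight_lengths[i] > focal_lengths[i] + focal_lengths[(i + 1) % n]
        for i in range(n)
    )
```

For instance, two type B guides with radii 1 and 0.25 each have bound 1.5, so under these assumptions any pair of straight guides longer than 3 passes the check.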
We can now give the precise formulation and the proof of Theorem 1, the main result of this paper.
Theorem 3. Suppose that a track $Q$ satisfies Condition H. Then the billiard map $T$ in $Q$ is hyperbolic.
Proof. By Remark 3, it is enough to prove that the cone field $C$ defined in (10) is eventually strictly invariant, and the set $\cup_{k \in \mathbb{Z}} T^k\tilde{E}$ has full $\mu$-measure.
Let $x \in \tilde{E}$, and consider $u \in C(x)$ with $u \neq 0$. By definition of $C(x)$, we have $f_-(u) > \tilde{f}(x) \ge f_1(x)$ so that Lemma 4 implies that $f_+(D_xT^{n(x)}u) < f_1(x) \le \tilde{f}(x)$. Now, note that $T^{n(x)}x$ is a collision leaving a circular guide, and that the piece of the orbit of $x$ between $x$ and $T_{\tilde{E}}x$ crosses a straight guide of length $l$. By Condition H, we then have $l > \tilde{f}(x) + \tilde{f}(T_{\tilde{E}}x)$, and hence
$$f_-(D_xT_{\tilde{E}}u) = l - f_+(D_xT^{n(x)}u) \ge l - \tilde{f}(x) > \tilde{f}(T_{\tilde{E}}x).$$
This means that $D_xT_{\tilde{E}}u \in \operatorname{int} C(T_{\tilde{E}}x)$, and we can conclude that $C$ is eventually strictly invariant with $k(x) = 1$ for every $x \in \tilde{E}$. It is clear that $\cup_{k \in \mathbb{Z}} T^k\tilde{E} = \tilde{M} \setminus N$ (for the definition of $N$, see Subsect. 2.2). Since $\mu(N) = 0$, it follows that $\cup_{k \in \mathbb{Z}} T^k\tilde{E}$ has full measure.
Remark 4. It is easy to check that the so-called Monza billiard considered in [V-P-R] satisfies Condition H. Note that its circular guides are of type B. Theorem 3 then assures that the Monza billiard is hyperbolic.
4. 3-Dimensional Track Billiards
In this last section, we extend Theorem 3 to billiards in 3-dimensional tracks, which are special 3-dimensional tubular domains with rectangular cross section.
A 3-dimensional cylindrical (straight) guide $\tilde{G}$ is the Cartesian product of a 2-dimensional circular (straight) guide $G$ and a closed interval $I \subset \mathbb{R}$. The guide $G$, when circular, is assumed to be of type A or B. We call the axis and the focal length of $\tilde{G}$ the line perpendicular to the plane containing $G$ and the focal length of $G$, respectively. The two rectangles of $\partial\tilde{G}$, each being the Cartesian product of an opening of $G$ and $I$, are called the ends of $\tilde{G}$.
We say that a cylindrical guide $\tilde{G}_1$ and a straight guide $\tilde{G}_2$ are glued together if there exists an isometry of $\mathbb{R}^3$ that identifies one end of $\tilde{G}_1$ with one end of $\tilde{G}_2$.
Definition 6. A 3-dimensional track is a finite chain of alternating 3-dimensional straight and cylindrical guides glued together. More precisely, a connected subset $\tilde{Q} \subset \mathbb{R}^3$ is called a 3-dimensional track if $\tilde{Q}$ is a union of 3-dimensional guides $\tilde{G}_1, \ldots, \tilde{G}_{2n+1}$ with $n > 1$ such that
1. $\tilde{G}_{2n+1} = \tilde{G}_1$,
2. $\tilde{G}_{2i-1}$ and $\tilde{G}_{2i}$ are a cylindrical and a straight guide, respectively, for each $i = 1, \ldots, n$,
3. $\tilde{G}_i$ and $\tilde{G}_{i+1}$ are glued together for each $i = 1, \ldots, 2n$.
Note that $\tilde{Q}$ must contain at least two cylindrical guides. An example of a 3-dimensional track is depicted in Fig. 5.
Fig. 5. A 3-dimensional track that satisfies Condition $\tilde{H}$
Remark 5. A 3-dimensional track is a tubular neighborhood, with rectangular cross section, of a curve that is a union of finitely many planar curves (straight segments and arcs of circles) (see Subsect. 2.2). From Proposition 1, it follows that 3-dimensional track billiards have the unidirectionality property.
If a 3-dimensional track has the property that the axes of the cylindrical guides are all parallel to each other, then the momentum of the particle along this direction is a first integral of motion, and the billiard is not completely hyperbolic. A billiard inside a 2-dimensional track is hyperbolic if it satisfies Condition H. We now introduce the 3-dimensional analogue of Condition H.
A 3-dimensional track $\tilde{Q} = \cup_{1 \le i \le 2n+1}\tilde{G}_i$ satisfies Condition $\tilde{H}$ if
1. the distance between the ends of each straight guide $\tilde{G}_{2i}$ is greater than the sum of the focal lengths of the cylindrical guides $\tilde{G}_{2i-1}$ and $\tilde{G}_{2i+1}$ for each $i = 1, \ldots, n$;
2. there are at least two cylindrical guides with orthogonal axes.
An example of a track satisfying Condition $\tilde{H}$ is shown in Fig. 5. Billiards in tracks satisfying Condition $\tilde{H}$ are closely related to certain hyperbolic semi-focusing cylindrical billiards [B-D1,B-D2], and are examples of twisted Cartesian products [W3]. Theorem 3 combined with the results of [B-D1] (or Theorem 17 of [W3]) implies that for a 3-dimensional track billiard satisfying Condition $\tilde{H}$, there exists an invariant cone field that is strictly invariant along every orbit connecting two cylindrical guides with orthogonal axes, thus proving the following theorem.
Theorem 4. If a 3-dimensional track $\tilde{Q}$ satisfies Condition $\tilde{H}$, then the billiard map in $\tilde{Q}$ is hyperbolic.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
[B1] Bunimovich, L.: A Theorem on Ergodicity of Two-Dimensional Hyperbolic Billiards. Commun. Math. Phys. 130, 599–621 (1990)
[B2] Bunimovich, L.: On absolutely focusing mirrors. In: Ergodic theory and related topics, III (Güstrow, 1990), Lect. Notes Math. 1514, Berlin-Heidelberg-New York: Springer-Verlag 1992, pp. 62–82
[B3] Bunimovich, L.: Mushrooms and other billiards with divided phase space, Chaos 11 (2001), 802–808
[B-D1] Bunimovich, L., Del Magno, G.: Semi-focusing billiards: hyperbolicity. Commun. Math. Phys. 262, 17–32 (2006)
[B-D2] Bunimovich, L., Del Magno, G.: Semi-focusing billiards: ergodicity. Erg. Th. Dynam. Sys. 28, 1377–1417 (2008)
[B-L] Bussolari, L., Lenci, M.: Hyperbolic billiards with nearly flat focusing boundaries. Physica D 237, 2272–2281 (2008)
[C-M] Chernov, N., Markarian, R.: Chaotic billiards. Mathematical Surveys and Monographs 127, Providence, RI: Amer. Math. Soc. 2006
[C-F-S] Cornfeld, I., Fomin, S., Sinai, Ya.: Ergodic theory. New York: Springer-Verlag, 1982
[C-D-F-K] Chenaud, B., Duclos, P., Freitas, P., Krejčiřík, D.: Geometrically induced discrete spectrum in circular tubes. Diff. Geom. Appl. 23, 95–105 (2005)
[D] Donnay, V.: Using integrability to produce chaos: billiards with positive entropy. Commun. Math. Phys. 141, 225–257 (1991)
[E-S] Exner, P., Šeba, P.: Bound states in curved quantum waveguides. J. Math. Phys. 30, 2574–2580 (1989)
[G-J] Goldstone, J., Jaffe, R.L.: Bound states in twisting tubes. Phys. Rev. B 45, 14100–14107 (1992)
[H-P] Horvat, M., Prosen, T.: Uni-directional transport properties of a serpent billiard. J. Phys. A: Math. Gen. 37, 3133–3145 (2004)
[K-S] Katok, A., Strelcyn, J.-M.: Invariant manifolds, entropy and billiards; smooth maps with singularities. Lect. Notes Math. 1222, New York: Springer, 1986
[Kl] Klingenberg, W.: A course in differential geometry. Graduate Texts in Mathematics 51, New York: Springer-Verlag, 1978
[M1] Markarian, R.: Non-uniformly hyperbolic billiards. Ann. Fac. Sci. Toulouse Math. 6(3), 223–257 (1994)
[P] Peirone, R.: Billiards in Tubular Neighborhoods of Manifolds of Codimension 1. Commun. Math. Phys. 207, 67–80 (1999)
[V-P-R] Veble, G., Prosen, T., Robnik, M.: Expanded boundary integral method and chaotic time-reversal doublets in quantum billiards. New J. Phys. 9, 15 (2007)
[W1] Wojtkowski, M.: Invariant families of cones and Lyapunov exponents. Erg. Th. Dynam. Syst. 5, 145–161 (1985)
[W2] Wojtkowski, M.: Principles for the design of billiards with nonvanishing Lyapunov exponents. Commun. Math. Phys. 105, 391–414 (1986)
[W3] Wojtkowski, M.: Design of hyperbolic billiards. Commun. Math. Phys. 273, 283–304 (2007)
Communicated by G. Gallavotti
Commun. Math. Phys. 288, 715–730 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0690-3
Communications in
Mathematical Physics
Sharp Bounds on the Critical Stability Radius for Relativistic Charged Spheres
Håkan Andréasson1,2
1 Mathematical Sciences, University of Gothenburg, S-41296 Göteborg, Sweden
2 Mathematical Sciences, Chalmers University of Technology, S-41296 Göteborg, Sweden. E-mail: [email protected]
Received: 22 April 2008 / Accepted: 28 July 2008
Published online: 27 November 2008 – © Springer-Verlag 2008
This work is dedicated to the memory of my father Dan Andréasson (1933–2008).
Abstract: In a recent paper by Giuliani and Rothman [17], the problem of finding a lower bound on the radius R of a charged sphere with mass M and charge Q < M is addressed. Such a bound is referred to as the critical stability radius. Equivalently, it can be formulated as the problem of finding an upper bound on M for given radius and charge. This problem has resulted in a number of papers in recent years but neither a transparent nor a general inequality similar to the case without charge, i.e., M ≤ 4R/9, has been found. In this paper we derive the surprisingly transparent inequality
$\sqrt{M} \le \frac{\sqrt{R}}{3} + \sqrt{\frac{R}{9} + \frac{Q^2}{3R}}$.
The inequality is shown to hold for any solution which satisfies p + 2p_T ≤ ρ, where p ≥ 0 and p_T are the radial and tangential pressures respectively and ρ ≥ 0 is the energy density. In addition we show that the inequality is sharp, in particular we show that sharpness is attained by infinitely thin shell solutions.
1. Introduction
Black holes for which the charge or angular momentum parameter equals the mass are called extremal black holes. They are very central in black hole thermodynamics due to their vanishing surface gravity and they represent the absolute zero state of black hole physics. It is quite generally believed that extremal black holes are disallowed by nature but a proof is missing. One possibility to obtain an extremal black hole is to produce one from the collapse of an already extremal object. Previous mainly numerical studies [7,13] have concluded that when Q < M collapse always takes place at a critical radius Rc outside the outer horizon, and as Q approaches M, this value approaches the horizon. This is similar to the non-charged case where the Buchdahl inequality implies that collapse will take place when R < 9M/4, i.e., Rc = 9M/4, cf. [11]. In the charged
case the critical value is expected to be smaller due to the Coulomb repulsion, and this is also shown to be the case below and in particular as Q → M the stability radius does approach the outer horizon. For more information on the relation of this topic to extremal black holes and black hole thermodynamics we refer to [7,14,17 and 12] and the references therein. The problem of finding a similar bound as the classical Buchdahl bound for charged objects has resulted in several papers; some of these are analytical, cf. [14,15,17,19,23 and 25], whereas others are numerical or use a mix of numerical and analytical arguments, cf. [7,13 and 16] to mention some of them. We refer the reader to the sources for the details of these studies but in none of them a transparent bound has been obtained (except in very special cases); on the contrary they have been quite involved and implicit. Moreover, most of these studies rely on the assumptions made by Buchdahl, i.e., the energy density is assumed to be non-increasing and the pressure to be isotropic. In this work we will show that
$\sqrt{m_g} \le \frac{\sqrt{r}}{3} + \sqrt{\frac{r}{9} + \frac{q^2}{3r}}$, (1)
given that q < r (which is a physically natural assumption, cf. the discussion below), and that p + 2p_T ≤ ρ, where p ≥ 0 and p_T are the radial and tangential pressures respectively and ρ ≥ 0 is the energy density. Here we have used lower case letters m_g, q and r to stress that the inequality holds anywhere inside the object. We refer to Eqs. (3) and (9) below for the exact definitions of these quantities. To the best of our knowledge this bound has not appeared in the literature before. In the non-charged case a general proof of the Buchdahl inequality 2m/r ≤ 8/9, in the case when p + 2p_T ≤ ρ, was first given in [1]. A completely different proof was then given by Stalker and Karageorgis [22] where also several other situations were considered, e.g. the isotropic case where p = p_T. The advantage of the method in [22] (which is related to the method by Bondi [9] which however is non-rigorous) compared to the method in [1] is that it is shorter and that it is more flexible in the sense that other assumptions than p + 2p_T ≤ ρ can be treated. On the other hand the result in [22] is weaker than the result in [1] in the sense that the latter method implies that the steady state that saturates the inequality is unique, it is an infinitely thin shell. Indeed, in [1] it is shown that given any steady state, the value of 2m/r for this state is strictly less than the value 2m/r of a state for which the matter has been slightly re-distributed and this monotonic property continues until an infinitely thin shell has been reached for which 2m/r = 8/9. The method in [22] also shows sharpness but only in the sense that there are steady states with 2m/r arbitrarily close to 8/9, leaving open the possibility that different kinds of steady states might share this feature.
Moreover, since the assumption p + 2p_T ≤ ρ is satisfied by solutions of the Einstein-Vlasov system it is natural to ask if there exist regular static solutions to the coupled system which can have 2m/r arbitrarily close to 8/9. This question is given an affirmative answer in [2], where in particular it is shown that arbitrarily thin shells which are regular solutions of the spherically symmetric Einstein-Vlasov system do exist. On the contrary, the matter quantities and the corresponding spacetimes constructed in [22] for showing sharpness cannot be realized by regular solutions of the Einstein-Vlasov system. The construction in [22] gives that a solution which nearly saturates the inequality 2m/r ≤ 8/9 satisfies p + 2p_T = ρ, and in addition p_T and ρ are discontinuous. Neither of these two properties can be realized by regular solutions of the (massive) Einstein-Vlasov system.
In the present work where we study charged objects we will adapt the method in [22] to show the inequality (1) and its sharpness. This again supports the claim above that this method is very flexible. We have not been able to carry out the strategy in [1] in this case. Had we succeeded, it would have given a more complete characterization, cf. the discussion above. However, we do show in Theorem 2 below that an infinitely thin shell solution (with properties specified in the theorem) saturates the inequality, although we cannot show that no other steady states can saturate it as well. We also mention that in [5] a numerical study of the coupled Einstein-Maxwell-Vlasov system is carried out which supports that there are arbitrarily thin shell solutions for this system which saturate the inequality (1). Before giving the outline of the paper let us also mention the works [8,10,18,24 and 20] on the uncharged Buchdahl inequality prior to the works [1 and 22]. These investigations also shared the aim of finding a bound on 2M/R without imposing the Buchdahl assumptions of a non-increasing energy density and isotropic pressure. See also the list of references in [1]. The outline of the paper is as follows. In the next section the Einstein equations will be given and some basic quantities will be introduced. In Sect. 3 the main results are stated and Sect. 4 is devoted to the proofs. In the final section we discuss our inequality in view of the bound derived in [17] for a constant energy density profile.
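As a quick numerical illustration of inequality (1), the following sketch (not part of the original paper; the function names are hypothetical) evaluates the right-hand side and checks two limiting cases: for q = 0 it reduces to the Buchdahl value 4r/9, and for q = r it gives m_g = r, the extremal value.

```python
import math


def max_gravitational_mass(r: float, q: float) -> float:
    """Largest m_g allowed by inequality (1) at area radius r with enclosed charge q.

    Hypothetical helper: it squares the right-hand side of (1).
    """
    if r <= 0 or q < 0 or q > r:
        raise ValueError("expected r > 0 and 0 <= q <= r")
    root = math.sqrt(r) / 3 + math.sqrt(r / 9 + q * q / (3 * r))
    return root * root


def metric_factor(r: float, m_g: float, q: float) -> float:
    """The quantity 1 - 2 m_g / r + q^2 / r^2, which must stay positive."""
    return 1 - 2 * m_g / r + (q / r) ** 2


if __name__ == "__main__":
    r = 9.0
    print(max_gravitational_mass(r, 0.0))  # uncharged case: equals 4r/9
    print(max_gravitational_mass(r, r))    # q = r: the bound equals r
    m = max_gravitational_mass(r, 4.5)
    print(metric_factor(r, m, 4.5) > 0)    # saturating mass stays outside the horizon
```

The last check mirrors the statement that the bound keeps the critical radius strictly outside the outer horizon whenever q < r.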
2. The Einstein Equations
We follow closely the set up in [17] but here we also allow the pressure to be anisotropic, i.e., the radial pressure p and the tangential pressure p_T need not be equal. We assume throughout the paper that p, the energy density ρ, and the charge density $j^0$ are non-negative. We study spherically symmetric mass and charge distributions and we write the metric in the form
$ds^2 = -e^{2\mu(r)}\,dt^2 + e^{2\lambda(r)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$,
where r ≥ 0, θ ∈ [0, π], ϕ ∈ [0, 2π]. It is well-known that the Reissner-Nordström solution for the charged spherically symmetric case gives
$e^{-2\lambda(r)} = 1 - \frac{2M}{r} + \frac{Q^2}{r^2} = e^{2\mu(r)}, \quad r \ge R$. (2)
Here R is the outer radius of the sphere and Q is the total charge. This solution is a vacuum solution. The purpose of this work is to investigate the behaviour of λ and µ when the matter and charge densities are non-zero for r < R. Before writing down the Einstein equations let us introduce some quantities following [17]. Let
$q(r) = 4\pi \int_0^r e^{(\lambda+\mu)(\eta)}\,\eta^2 j^0\,d\eta$, (3)
and
$m_i(r) = 4\pi \int_0^r \eta^2 \rho\,d\eta$, (4)
where q(r) is the charge within the sphere with area radius r and m_i(r) is the mass within this sphere. The subscript i is used to distinguish m_i from the gravitational mass m_g which is defined below. Let us also introduce the quantity
$F(r) = \int_0^r \frac{q^2(\eta)}{\eta^2}\,d\eta$.
The Einstein equations for λ and µ now read (cf. [7 and 17])
$\frac{1}{r^2} - \frac{e^{-2\lambda}}{r^2} + \frac{2\lambda_r e^{-2\lambda}}{r} = 8\pi\rho + \frac{q^2(r)}{r^4}$, (5)
and
$\frac{1}{r^2} - \frac{e^{-2\lambda}}{r^2} - \frac{2\mu_r e^{-2\lambda}}{r} = -8\pi p + \frac{q^2(r)}{r^4}$, (6)
where the subscript r denotes differentiation with respect to r. Equation (5) can be written as
$\frac{d(e^{-2\lambda} r)}{dr} = 1 - 8\pi r^2 \rho - \frac{q^2(r)}{r^2}$, (7)
so that
$e^{-2\lambda} = 1 - \frac{2m_i(r)}{r} - \frac{F(r)}{r}$. (8)
Requiring that (8) matches the exterior solution (2) at r = R gives
$1 - \frac{2M}{R} + \frac{Q^2}{R^2} = 1 - \frac{1}{R}\int_0^R \left(8\pi\rho\eta^2 + \frac{q^2}{\eta^2}\right) d\eta$
or
$M = \frac{1}{2}\int_0^R \left(8\pi\rho\eta^2 + \frac{q^2}{\eta^2}\right) d\eta + \frac{Q^2}{2R}$,
which defines the total gravitational mass M. In view of this relation we now define the gravitational mass m_g within a given area radius r by
$m_g(r) = m_i(r) + \frac{F(r)}{2} + \frac{q^2(r)}{2r}$. (9)
In terms of the gravitational mass we thus get
$e^{-2\lambda(r)} = 1 - \frac{2m_g(r)}{r} + \frac{q^2(r)}{r^2}$. (10)
Let us also write down the Tolman-Oppenheimer-Volkov equation which follows from the Einstein equations, cf. [7], but note that in our case p is allowed to be different from p_T which modifies the equation accordingly
$p_r = \frac{q q_r}{4\pi r^4} + \frac{2}{r}(p_T - p) - (\rho + p)\,e^{2\lambda}\left(\frac{m_g(r)}{r^2} + 4\pi r p - \frac{q^2}{r^3}\right)$. (11)
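The definitions (4), (8), (9) and (10) can be tried out numerically. The sketch below is only an illustration under loudly stated assumptions: the constant density RHO0, the values of R and Q, and the charge profile q(r) = Q (r/R)^3 are hypothetical choices prescribed by hand (q is not computed from (3), and no claim is made that these profiles solve the field equations). It evaluates m_i, F and m_g by quadrature and confirms that (8) and (10) give the same metric coefficient.

```python
import math

R = 1.0      # outer area radius (assumed value)
Q = 0.2      # total charge (assumed value)
RHO0 = 0.05  # constant energy density (assumed value)


def charge(r: float) -> float:
    """Assumed charge profile q(r) = Q (r/R)^3, prescribed directly."""
    return Q * (r / R) ** 3


def integrate(f, a: float, b: float, n: int = 2000) -> float:
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def m_i(r: float) -> float:
    """Eq. (4): m_i(r) = 4 pi * integral of eta^2 rho."""
    return integrate(lambda eta: 4 * math.pi * eta ** 2 * RHO0, 0.0, r)


def F(r: float) -> float:
    """F(r) = integral of q^2 / eta^2; the integrand tends to 0 at eta = 0 here."""
    return integrate(
        lambda eta: charge(eta) ** 2 / eta ** 2 if eta > 0 else 0.0, 0.0, r
    )


def m_g(r: float) -> float:
    """Eq. (9): gravitational mass within area radius r."""
    return m_i(r) + F(r) / 2 + charge(r) ** 2 / (2 * r)


def metric_from_8(r: float) -> float:
    return 1 - 2 * m_i(r) / r - F(r) / r


def metric_from_10(r: float) -> float:
    return 1 - 2 * m_g(r) / r + charge(r) ** 2 / r ** 2


if __name__ == "__main__":
    print(m_g(R))                               # total gravitational mass M
    print(metric_from_8(R), metric_from_10(R))  # the two forms agree
```

For these profiles the integrals are also available in closed form, m_i(R) = 4πRHO0 R³/3 and F(R) = Q²/(5R), which gives an independent check on the quadrature.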
3. Set Up and Main Results
The problem of finding an upper bound on the total gravitational mass that a sphere of area radius R with total charge Q can hold, or equivalently, to find the smallest radius Rc, referred to as the critical stability radius, for which a physically acceptable solution of the Einstein equations can be found, is formulated in [17] as follows: A physically acceptable solution should satisfy
ρ ≥ 0, p ≥ 0 and µ > −∞, (12)
0 ≤ Q < M, R > R+. (13)
Here $R_+ = M + \sqrt{M^2 - Q^2}$ is the outer horizon of a Reissner-Nordström black hole. The quantities m_g and q should satisfy
m_g(R) = M, q(R) = Q, (14)
$q \le m_g, \quad m_g + \sqrt{m_g^2 - q^2} < r$. (15)
We see immediately that these relations imply that q/r ≤ m_g/r < 1. We will in addition assume that the following condition holds:
p + 2p_T ≤ ρ. (16)
The condition (16) is likely to be satisfied for most realistic matter models, cf. [10], and in particular it holds for Vlasov matter, cf. [4] for more information on this matter model.
Remark. In [1 and 3] the following generalization of this condition was imposed, namely that
p + 2p_T ≤ Ωρ, for some Ω ≥ 0. (17)
However, in contrast to the non-charged case where a bound on M is given by a simple formula depending on Ω, the simplicity is completely lost in the charged case except when Ω = 1. Now, the case Ω = 1 should be considered as the principal case, cf. [10], and in the non-charged case it is when Ω = 1 that the classical bound 2m/r < 8/9 is recovered.
We are now almost ready to state our main result but first we define what we mean by a regular solution of the spherically symmetric Einstein equations. We say that Ψ := (µ, λ, ρ, p, p_T, $j^0$) is a regular solution if the matter quantities ρ, p, p_T and $j^0$ are bounded everywhere and C¹ except possibly at finitely many points, p has compact support and Eqs. (3), (5), (6) and (11) are satisfied (where the matter quantities are C¹) and the constraints (12) and (15) are satisfied.
Remark. Presently there is no existence theorem of static solutions of the Einstein-Vlasov-Maxwell system (this system is of particular interest in this context in view of Theorem 2 below since numerical simulations [5] indicate that arbitrarily thin shell solutions of this system do exist similarly to the uncharged case, cf. [2]). However, solutions are certainly known to exist for other matter models, see e.g. [21] where several examples of perfect fluid solutions are given.
Theorem 1. Let Ψ be a regular solution of the Einstein equations and assume that (16) holds. Then
$\sqrt{m_g(r)} \le \frac{\sqrt{r}}{3} + \sqrt{\frac{r}{9} + \frac{q^2(r)}{3r}}$. (18)
Moreover, the inequality is sharp in the subclass of regular solutions for which p_T ≥ 0.
Let us immediately make a consistency check that (18) ensures that the stability radius is strictly outside the outer horizon. Thus we wish to show that the inequality (18) implies that $e^{-2\lambda(r)} = 1 - \frac{2m_g}{r} + \frac{q^2}{r^2} > 0$, or equivalently that
$\frac{m_g}{r} < \frac{1}{2} + \frac{q^2}{2r^2}$.
In view of inequality (18) this holds if
$\left(\frac{1}{3} + \sqrt{\frac{1}{9} + \frac{q^2}{3r^2}}\right)^2 < \frac{1}{2} + \frac{q^2}{2r^2}$.
An elementary computation shows that this is true (the inequality reduces to $(1 - q^2/r^2)^2 > 0$) as long as
$\frac{q^2}{r^2} < 1$,
which always holds.
The proof of Theorem 1 relies on the method in [22] for the non-charged case. In the introduction we discussed the strength of this method but also its shortcomings; the question of uniqueness of the steady state that saturates the inequality (18) is left open, and the constructed steady states that nearly saturate the inequality cannot be solutions of the coupled Einstein-Vlasov system (these issues were answered in [1 and 2] respectively). Furthermore, it is not completely obvious from the construction in [22] that these solutions approach an infinitely thin shell. This point also carries over in our proof of sharpness in Theorem 1 and we therefore find it natural to include a proof of the fact that an infinitely thin shell does saturate (18), although we have not been able to adapt the strategy in [1] to show that no other steady state can have this property. Furthermore, the numerical study in [5] supports that one maximizer for the spherically symmetric Einstein-Vlasov-Maxwell system is an infinitely thin shell. We therefore investigate a sequence of regular shell solutions which approach an infinitely thin shell and adapt the method in [3]. More precisely, let Ψ_k be a sequence of regular solutions such that p_k, (p_T)_k, $j^0_k$ and ρ_k have support in [R_k, R]. Denote by M_k the total gravitational mass and by Q_k the total charge of the corresponding solution in the sequence and assume that Q := lim_{k→∞} Q_k and M = lim_{k→∞} M_k exist and that sup_k q_k/r < 1. Furthermore, assume that $\int_{R_k}^{R} r^2 p_k\,dr \to 0$ and $\int_{R_k}^{R} r^2 (2(p_T)_k - \rho_k)\,dr \to 0$ as k → ∞.
Theorem 2. Assume that $\{\Psi_k\}_{k=1}^{\infty}$ is a sequence of regular solutions with support in [R_k, R] with the properties specified above and assume that
$\lim_{k\to\infty} \frac{R_k}{R} = 1$. (19)
Then
$\sqrt{M} = \frac{\sqrt{R}}{3} + \sqrt{\frac{R}{9} + \frac{Q^2}{3R}}$. (20)
Remark. That sequences exist with these properties, in particular the property (19), has been proved for the (non-charged) Einstein-Vlasov system, cf. [2] (and [6] for a numerical study). The investigation carried out in [5] also supports that such sequences exist for the Einstein-Vlasov-Maxwell system.
4. Proofs
Proof of Theorem 1. As described above our method of proof is an adaptation of the method in [22] to the charged case. Let a regular solution be given and let us define
$m_\lambda(r) = m_i(r) + \frac{F(r)}{2} = m_g - \frac{q^2}{2r}$, (21)
and let
$x \equiv \frac{2m_\lambda}{r}, \quad y \equiv 8\pi r^2 p, \quad z \equiv \frac{q^2}{r^2}$.
Note that the conditions (12) and (15) imply that
x < 1, y ≥ 0, and z < 1. (22)
Indeed, the two latter bounds are immediate and the former follows since (15) gives that $q^2 > m_g^2 - (r - m_g)^2$ so that
$x = \frac{2m_\lambda}{r} = \frac{2m_g}{r} - \frac{q^2}{r^2} < \frac{2m_g}{r} - \frac{m_g^2 - (r - m_g)^2}{r^2} = 1$. (23)
Lemma 1. The variables (x, y, z) give rise to a parametric curve in [0, 1) × [0, ∞) × [0, 1) and satisfy the equations
$8\pi r^2 \rho = 2\dot{x} + x - z$, (24)
$8\pi r^2 p = y$, (25)
$8\pi r^2 p_T = \frac{x + y - z}{2(1 - x)}\,\dot{x} + \dot{y} - \dot{z} - z + \frac{(x + y - z)^2}{4(1 - x)}$, (26)
where the dots denote derivatives with respect to β := 2 log r.
Proof of Lemma 1. The proof is a straightforward computation using the Einstein equations (5) and (6) and the Tolman-Oppenheimer-Volkov equation (11).
Now let
$w(x, y, z) = \frac{(3(1 - x) + 1 + y - z)^2}{1 - x}$.
Differentiating with respect to β gives
$\dot{w} = \frac{4 - 3x + y - z}{(1 - x)^2}\left[(3x - 2 + y - z)\dot{x} + 2(1 - x)\dot{y} - 2(1 - x)\dot{z}\right]$. (27)
Now, using the expressions of the matter terms given in Lemma 1 the condition p + 2p_T ≤ ρ can be written
$(3x - 2 + y - z)\dot{x} + 2(1 - x)(\dot{y} - \dot{z}) \le -\frac{\alpha(x, y, z)}{2}$, (28)
where α = 3x² − 2x + (y − z)² + 2(y − z). From (27) and (28) it now follows that
$\dot{w} = \frac{4 - 3x + y - z}{(1 - x)^2}\left[(3x - 2 + y - z)\dot{x} + 2(1 - x)\dot{y} - 2(1 - x)\dot{z}\right] \le -\frac{4 - 3x + y - z}{2(1 - x)^2}\,\alpha(x, y, z)$. (29)
Since 0 ≤ x < 1, y ≥ 0 and 0 ≤ z < 1, it follows that w is decreasing whenever α > 0, which implies that
$w \le \max_{E} w(x, y, z)$, (30)
where E = {(x, y, z) : 0 ≤ x ≤ 1, y ≥ 0, 0 ≤ z ≤ 1 and α(x, y, z) ≤ 0}. To solve this optimization problem we introduce s = y − z and note that $\max_{E} w(x, y, z) = \max_{E'} w(x, s)$, where E′ = {(x, s) : 0 ≤ x ≤ 1, s ≥ −1 and α(x, s) ≤ 0}. It is straightforward to conclude that there are no stationary points in the interior of E′, so the maximum is attained at the boundary ∂E′ of E′. The Lagrange multiplier method leads to the following system of equations:
(1 − x)(6(1 + s) + 4(3x − 1)) − 2(1 + s)² = 0, (31)
x(3x − 2) + s(s + 2) = 0. (32)
From (32) we have that s² = −2s − x(3x − 2) which substituted into (31) results in the equation
(x + s)(1 − x) = 0. (33)
If x = −s we get from (32) that 4s² + 4s = 0 so that either s = 0 = x or s = −1 = −x. In the latter case we get w(1, −1) = 0, and the former case gives w(0, 0) = 16. We thus conclude that w ≤ 16 throughout the curve. Since p ≥ 0 it follows from the inequality w ≤ 16 that
$\left(3\left(1 - \frac{2m_g}{r} + \frac{q^2}{r^2}\right) + 1 - \frac{q^2}{r^2}\right)^2 \le 16\left(1 - \frac{2m_g}{r} + \frac{q^2}{r^2}\right)$. (34)
This is easily seen to be equivalent to
$\left(\frac{6m_g}{r} - \frac{2q^2}{r^2}\right)^2 \le \frac{16 m_g}{r}$. (35)
Taking the square root of both sides and rearranging leads to
$\left(\sqrt{m_g} - \frac{\sqrt{r}}{3} - \sqrt{\frac{r}{9} + \frac{q^2}{3r}}\right)\left(\sqrt{m_g} - \frac{\sqrt{r}}{3} + \sqrt{\frac{r}{9} + \frac{q^2}{3r}}\right) \le 0$. (36)
Since the second bracket is always non-negative and vanishes only if m_g = q = 0 we have
$\sqrt{m_g} - \frac{\sqrt{r}}{3} - \sqrt{\frac{r}{9} + \frac{q^2}{3r}} \le 0$, (37)
which is the first claim. To show sharpness we will construct a spacetime such that the corresponding curve from Lemma 1 intersects a small neighbourhood of (x_q, 0, z_q), where z_q < 1 is a given ratio q²/r² and x_q is the corresponding value of x when equality holds in (37), i.e.,
$x_q := \frac{4}{9} - \frac{z_q}{3} + \frac{4}{3}\sqrt{\frac{1}{9} + \frac{z_q}{3}}$.
We will construct such a spacetime by showing that there exists a curve x = x(τ), y = y(τ), z = z(τ); τ ∈ [0, ∞), which passes near (x_q, 0, z_q) and in addition has the following properties:
• (A1) $\frac{1}{\alpha}\frac{dw}{d\tau}$ is negative and locally integrable,
• (A2) x(0) = y(0) = z(0) = 0,
• (A3) 0 ≤ x(τ) < x_q, z(τ) ≤ z_q,
• (A4) y(τ) = 0 for all large enough τ, x(τ) → 0 and z(τ) → 0 as τ → ∞,
• (A5) the curve is C¹ except for finitely many points.
Below we will denote s = y − z as above and the curves (x(τ), y(τ), z(τ)) ∈ [0, 1) × [0, ∞) × [0, 1), and (x(τ), y(τ), s(τ)) ∈ [0, 1) × [0, ∞) × (−1, ∞) will be used interchangeably. Let us first see that if we have a curve which satisfies (A1)-(A5) a spacetime can be constructed. Indeed, let
$\kappa(\tau) = -\frac{1}{\alpha(x, s)}\frac{dw}{d\tau}\,\frac{2(1 - x)^2}{4 - 3x + s}$, (38)
and observe that κ is positive and locally integrable by (A1) and (A2). Next define
$\beta = \int \kappa\,d\tau$, (39)
and
$r = e^{\beta/2}$, (40)
and define the metric coefficients by
$\lambda = -\frac{1}{2}\log(1 - x)$, (41)
$\mu = -\int \frac{x + y}{4(1 - x)}\,\kappa\,d\tau$. (42)
It is straightforward to check that λ and µ solve the Einstein equations (5) and (6). The definition of κ now implies
$\dot{w} = \frac{1}{\kappa}\frac{dw}{d\tau} = -\frac{4 - 3x + s}{2(1 - x)^2}\,\alpha(x, s)$, (43)
where we recall that dots denote differentiation with respect to β = 2 log r. Using (27) we thus have
$(3x + s - 2)\dot{x} + 2(1 - x)\dot{s} = -\frac{\alpha(x, s)}{2}$, (44)
which is equivalent to the relation p + 2p_T = ρ in view of (28). We will now show that such a curve exists. Let us fix some small ε > 0 and define
$w_\epsilon(x, s) := \frac{((3 - 3\epsilon)(1 - x) + 1 + s)^2}{1 - x}$. (45)
Consider now the curve γ_ε in the (x, s)-plane defined by
$w_\epsilon(x, s) = \left(\epsilon\sqrt{1 + 3x} + 4(1 - \epsilon)\right)^2$. (46)
Define the corresponding curve in R³ by
(x, y, z) = (x, max(0, s), max(0, −s)), (47)
so that s = y − z. In Fig. 1 the curve γ_ε is depicted together with the curves γ0 and γ1. Note that γ0 is the curve w(x, s) = 16 which passes through (0, 0) and (1, −1). The curve γ1 is the curve α(x, s) = 0 which also passes through (0, 0) and (1, −1). The dotted line shows the line s = s_q := −z_q (for the choice z_q = 0.6) and the intersection of the curve γ0 with this line is the point (x_q, s_q). It is clear that for a sufficiently small ε > 0 the curve γ_ε intersects an arbitrarily small neighbourhood of (x_q, s_q). Let us denote the point of intersection of γ_ε and the line s = s_q by $(x_q^\epsilon, s_q)$. Let us now define Γ_ε := γ_ε + h_ε, where h_ε is the curve given by the equation
$\frac{ds}{dx} = \frac{2s}{s + x}$, such that $s(x_q^\epsilon) = s_q$. (48)
It is clear from the defining equation that h_ε ∈ {(x, s) : x ≥ 0, s ≤ 0, s + x > 0} and that the solutions approach the point (0, 0) for all admissible starting points $(x_q^\epsilon, s_q)$ (note that $x_q^\epsilon + s_q > 0$). The curve h_ε is depicted in Fig. 1. It remains to show that (A1)-(A5) are satisfied for the curve Γ_ε and that ρ, p and p_T are non-negative along the curve. It is obvious that Γ_ε satisfies (A2)-(A5). To see that it satisfies (A1) we first consider the first part of the curve, γ_ε, and note that α > 0 along γ_ε. This follows since γ_ε lies above γ1 and α = 0 along γ1 and
$\frac{\partial \alpha}{\partial s} = 2s + 2 > 0$, for s > −1.
Hence it is sufficient to show that dw/dτ < 0 to establish that
$\frac{1}{\alpha}\frac{dw}{d\tau} < 0$.
Fig. 1. The curves γ0, γ1, γ_ε and h_ε
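We differentiate (46) and obtain
$\frac{dw_\epsilon}{d\tau} = \frac{3\epsilon}{\sqrt{1 + 3x}}\,\frac{3(1 - \epsilon)(1 - x) + 1 + s}{\sqrt{1 - x}}\,\frac{dx}{d\tau}$. (49)
If we now differentiate (45) directly we get
$\frac{dw_\epsilon}{d\tau} = \frac{(3 - 3\epsilon)(1 - x) + 1 + s}{(1 - x)^2}\left[\left(-(3 - 3\epsilon)(1 - x) + 1 + s\right)\frac{dx}{d\tau} + 2(1 - x)\frac{ds}{d\tau}\right]$. (50)
Comparing (49) and (50) gives
$2(1 - x)\frac{ds}{d\tau} = \left(\frac{3\epsilon(1 - x)^{3/2}}{\sqrt{1 + 3x}} + 3(1 - \epsilon)(1 - x) - 1 - s\right)\frac{dx}{d\tau}$. (51)
Thus differentiating w along γ_ε, substituting for ds/dτ using (51), leads to
$\frac{dw}{d\tau} = \frac{3\epsilon(4 - 3x + s)}{1 - x}\,\frac{\sqrt{1 - x} - \sqrt{1 + 3x}}{\sqrt{1 + 3x}}\,\frac{dx}{d\tau}$. (52)
Since dx/dτ > 0 along γ_ε and since 0 ≤ x < 1 and s > −1 we get that
$\frac{dw}{d\tau} < 0$.
It remains to show that $\alpha^{-1}\,dw/d\tau$ is negative also along the curve h_ε. Here we have
$\frac{1}{\alpha}\frac{dw}{d\tau} = \frac{3(1 - x) + 1 + s}{(x(3x - 2) + s(s + 2))(1 - x)^2}\left[(3x - 2 + s) + 2(1 - x)\frac{ds}{dx}\right]\frac{dx}{d\tau} = \frac{3(1 - x) + 1 + s}{(x + s)(1 - x)^2}\,\frac{dx}{d\tau}$, (53)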
where we used (48) for ds/dx. Since dx/dτ < 0 along h_ε the claim follows since x + s > 0 along h_ε. Hence,
$\frac{1}{\alpha}\frac{dw}{d\tau} < 0$, (54)
along Γ_ε. The local integrability of this expression follows by inspection of the formulas above since 0 ≤ x ≤ x_q < 1. Thus condition (A1) holds along Γ_ε. Finally we show that ρ, p and p_T are non-negative along Γ_ε. Since y ≥ 0 along Γ_ε, cf. (47), it immediately follows that p ≥ 0. Since (44) implies that ρ = p + 2p_T we only need to show that p_T ≥ 0 along Γ_ε. First we consider the first part of Γ_ε, i.e., the curve γ_ε. From Lemma 1 and (51) we have
$8\pi r^2 p_T = \frac{x + s}{2(1 - x)}\,\dot{x} + \dot{s} - z + \frac{(x + s)^2}{4(1 - x)} = \left(\frac{3\epsilon}{2}\,\frac{\sqrt{1 - x}}{\sqrt{1 + 3x}} + 1 - \frac{3\epsilon}{2}\right)\frac{1}{\kappa}\frac{dx}{d\tau} + \left[-z + \frac{(x + s)^2}{4(1 - x)}\right]$. (55)
Since ε is small the first term is positive since dx/dτ > 0 along γ_ε. The term in square brackets can also be seen to be positive. Indeed, along the part of γ_ε where s ≥ 0, z = 0 and the claim is trivial so we focus on the part where s < 0. Here z = −s and we thus want to show that
$s + \frac{(x + s)^2}{4(1 - x)} > 0$ (56)
along the part of γ_ε where s < 0. We evaluate the left hand side of (56) along γ0 and show that it is positive there and then we conclude by continuity that this statement also holds along γ_ε for ε small. Along γ0 we have the relation
$x = \frac{4 + 3s}{9} + \frac{4}{3}\sqrt{\frac{1}{9} - \frac{s}{3}}$. (57)
A straightforward calculation now gives that along γ0,
$s + \frac{(x + s)^2}{4(1 - x)} = s + 4\left(\frac{1}{3} + \sqrt{\frac{1}{9} - \frac{s}{3}}\right)^2 \ge s + \frac{16}{9} > 0$.
Hence p_T > 0 also along γ_ε for a sufficiently small ε. Along h_ε it holds by construction, cf. (48) and Lemma 1, that ρ = 0 and thus p = p_T = 0. This completes the proof of Theorem 1.
Proof of Theorem 2. Let us begin with a few general facts. Consider a regular solution where the matter quantities are supported in [0, R]. We recall from Sect. 2 the following consequence of the matching condition:
$e^{-\lambda(r)} = e^{\mu(r)} = \sqrt{1 - \frac{2M}{r} + \frac{Q^2}{r^2}}, \quad r \ge R$, (58)
so that $e^{\mu + \lambda} = 1$ for r > R. Let us now derive an explicit expression for µ. The Einstein equation (6) can be written as
$\mu_r = \left(\frac{m_i}{r^2} + 4\pi r p - \frac{q^2}{2r^3} + \frac{F}{2r^2}\right)e^{2\lambda}$, (59)
so that
$\mu(r) = -\int_r^\infty \left(\frac{m_i}{\eta^2} + 4\pi\eta p - \frac{q^2}{2\eta^3} + \frac{F}{2\eta^2}\right)e^{2\lambda}\,d\eta$, (60)
since µ → 0 as r → ∞ in view of (58). We will also need an explicit formula for λ_r, and from (5) we have
$\lambda_r = \left(4\pi r\rho(r) - \frac{m_i(r)}{r^2} + \frac{q^2}{2r^3} - \frac{F}{2r^2}\right)e^{2\lambda}$. (61)
From the expressions of µ_r and λ_r we also obtain
$\mu(r) + \lambda(r) = -\int_r^\infty 4\pi\eta(\rho + p)\,e^{2\lambda}\,d\eta$, (62)
so that in particular
µ + λ ≤ 0. (63)
Now we derive our fundamental integral equation which is a consequence of the Tolman-Oppenheimer-Volkov equation. Let
$\psi = \left(m_g + 4\pi r^3 p - \frac{q^2}{r}\right)e^{\mu + \lambda}$.
Taking the derivative of ψ with respect to r, a straightforward calculation using the Tolman-Oppenheimer-Volkov equation (11) results in the following equation:
$\left(m_g + 4\pi r^3 p - \frac{q^2}{r}\right)e^{\mu + \lambda} = \int_0^r e^{\mu + \lambda}\left(4\pi\eta^2(\rho + p + 2p_T) + \frac{q^2}{\eta^2}\right)d\eta$. (64)
This equation must be satisfied by any spherically symmetric static solution of the Einstein-Maxwell system. Let us now consider our sequence of solutions. Since p_k(R) = 0, (m_g)_k(R) = M_k, q_k(R) = Q_k and $e^{(\mu_k + \lambda_k)(R)} = 1$ we get in view of (64) for r = R,
$M_k - \frac{Q_k^2}{R} = \int_{R_k}^{R} e^{\mu_k + \lambda_k}\left(4\pi\eta^2(\rho_k + p_k + 2(p_T)_k) + \frac{q_k^2}{\eta^2}\right)d\eta$. (65)
Here we also used the fact that the matter is supported in [R_k, R]. We split the right-hand side as follows:
$\int_{R_k}^{R} e^{\mu_k + \lambda_k}\left(4\pi\eta^2(\rho_k + p_k + 2(p_T)_k) + \frac{q_k^2}{\eta^2}\right)d\eta = \int_{R_k}^{R} e^{\mu_k + \lambda_k}\left(8\pi\eta^2\rho_k + \frac{q_k^2}{\eta^2}\right)d\eta + \int_{R_k}^{R} e^{\mu_k + \lambda_k}\,4\pi\eta^2\left(p_k + 2(p_T)_k - \rho_k\right)d\eta =: S_k + T_k$. (66)
By the mean value theorem we get that there is a ξ ∈ [R_k, R] such that
$S_k = 2e^{\mu_k(\xi)}\xi\int_{R_k}^{R} e^{\lambda_k}\left(4\pi\eta\rho_k + \frac{q_k^2}{2\eta^3}\right)d\eta$ (67)
$= 2e^{\mu_k(\xi)}\xi\int_{R_k}^{R}\left[-\frac{d(e^{-\lambda_k})}{d\eta}\right]d\eta + 2e^{\mu_k(\xi)}\xi\int_{R_k}^{R}\left(\frac{(m_i)_k(\eta)}{\eta^2} + \frac{F_k(\eta)}{2\eta^2}\right)e^{\lambda_k}\,d\eta =: S_k^1 + S_k^2$. (68)
Here we used Eq. (61) for λ_r. Now, since sup_k q_k/r is strictly less than one we obtain a uniform bound on λ_k from the inequality (1), cf. the consistency check after the formulation of Theorem 1. The same computation guarantees that (m_i)_k(r)/r + F_k(r)/(2r) < 1/2, thus it follows that
$0 \le S_k^2 \le C\log\frac{R}{R_k} \to 0$ as k → ∞.
Since µ_k + λ_k ≤ 0 by (63) and since ρ_k ≥ p_k + 2(p_T)_k ≥ 2(p_T)_k, it follows from the assumptions on the sequence that also T_k → 0 as k → ∞. For the term $S_k^1$ we get by (58),
$S_k^1 = -2e^{\mu_k(\xi)}\xi\int_{R_k}^{R}\frac{d}{d\eta}\left(e^{-\lambda_k}\right)d\eta = 2e^{\mu_k(\xi)}\xi\left(1 - \sqrt{1 - \frac{2M_k}{R} + \frac{Q_k^2}{R^2}}\right)$.
Note here that λ_k(R_k) = 0 due to the support condition of the matter terms. In view of (60), the assumptions on the sequence of solutions and the general bounds on m_i/r and q/r it follows that
$e^{\mu_k(\xi)} \to \sqrt{1 - \frac{2M}{R} + \frac{Q^2}{R^2}}$ as k → ∞,
so that
$\lim_{k\to\infty} S_k^1 = 2R\sqrt{1 - \frac{2M}{R} + \frac{Q^2}{R^2}}\left(1 - \sqrt{1 - \frac{2M}{R} + \frac{Q^2}{R^2}}\right)$.
In conclusion, from (65) we get in the limit k → ∞,
$M - \frac{Q^2}{R} = 2R\sqrt{1 - \frac{2M}{R} + \frac{Q^2}{R^2}}\left(1 - \sqrt{1 - \frac{2M}{R} + \frac{Q^2}{R^2}}\right)$.
After some algebra this relation can be written as
$M - \frac{Q^2}{R} = \left(3M - \frac{Q^2}{R}\right)\sqrt{1 - \frac{2M}{R} + \frac{Q^2}{R^2}}$. (69)
Squaring both sides one finds after some rearrangements
$\left(9M^2 - \frac{6MQ^2}{R} + \frac{Q^4}{R^2}\right)\left(\frac{2M}{R} - \frac{Q^2}{R^2}\right) = 4MR\left(\frac{2M}{R} - \frac{Q^2}{R^2}\right)$,
so that
$\left(3M - \frac{Q^2}{R}\right)^2 = 4MR$. (70)
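The algebra linking (69), (70) and the closed form (20) is easy to spot-check numerically. The following sketch is an illustration with hypothetical function names and arbitrarily chosen sample values of R and Q: it computes M from (20) and evaluates the residuals of (69) and (70), which should vanish up to rounding.

```python
import math


def saturating_mass(R: float, Q: float) -> float:
    """M from Eq. (20): sqrt(M) = sqrt(R)/3 + sqrt(R/9 + Q^2/(3R))."""
    root = math.sqrt(R) / 3 + math.sqrt(R / 9 + Q * Q / (3 * R))
    return root * root


def residual_70(R: float, Q: float) -> float:
    """Left-hand side minus right-hand side of (70)."""
    M = saturating_mass(R, Q)
    return (3 * M - Q * Q / R) ** 2 - 4 * M * R


def residual_69(R: float, Q: float) -> float:
    """Left-hand side minus right-hand side of (69)."""
    M = saturating_mass(R, Q)
    # max(..., 0.0) only guards against tiny negative rounding errors
    s = math.sqrt(max(1 - 2 * M / R + (Q / R) ** 2, 0.0))
    return (M - Q * Q / R) - (3 * M - Q * Q / R) * s


if __name__ == "__main__":
    for R, Q in [(1.0, 0.0), (1.0, 0.5), (2.0, 1.5)]:
        print(R, Q, residual_69(R, Q), residual_70(R, Q))
```

For Q = 0 the same function returns M = 4R/9, the uncharged Buchdahl value.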
We have thus arrived at the same expression (with equality instead of inequality) as (35) and we accordingly obtain √ √ R R Q2 M= + + , 3 9 3R which completes the proof of Theorem 2. 5. Final Remarks In [22] several different conditions on the relation between ρ, p and pT are investigated, e.g. the isotropic case where p = pT . We have not tried to consider other cases than p + pT ≤ ρ in this work although we believe that it can be done. We believe however that an equally transparent inequality as (1) is unlikely to be found under other conditions than p + pT ≤ ρ, cf. [22]. However, the following comparison with the non-charged case is interesting. The original Buchdahl inequality [11] was derived under the assumptions that ρ is non-increasing outwards and the pressure is isotropic and the steady state that saturates the inequality 2M/R ≤ 8/9 within this class of solutions is the one with constant energy density for which the pressure is infinite at the center. It is quite remarkable that exactly the same inequality holds much more generally [1], as long as p + pT ≤ ρ, and in particular that the steady state that saturates the inequality in this class is an infinitely thin shell which is drastically different from the constant energy density solution. One can now ask if there is a similar analogue in the charged case. In the work [17] by Giuliani and Rothman they find an explicit solution with constant energy density and constant charge density and they obtain for this solution an algebraic equation from which the values of the stability radius can be evaluated. It is in view of the discussion above therefore interesting to see whether these values are less, equal or greater than the values given by (1). In [17] the ratios R/M are displayed for different ratios of Q/R (or more precisely for different ratios Q/M, but the corresponding ratios Q/R can be deduced). 
It turns out that the critical stability radius given by the relation
\[ \sqrt{M} = \frac{\sqrt{R}}{3} + \sqrt{\frac{R}{9} + \frac{Q^2}{3R}} \]
is smaller than the corresponding ones found in [17], or alternatively, our relation admits a larger ratio $M/R$ for a given ratio $Q/R$.

Acknowledgement. I would like to thank the authors of [17] for their clearly written paper which got me interested in this topic.
References
1. Andréasson, H.: Sharp bounds on 2m/r of general spherically symmetric static objects. J. Diff. Eqs. 245(8), 2243–2266 (2008)
2. Andréasson, H.: On static shells and the Buchdahl inequality for the spherically symmetric Einstein-Vlasov system. Commun. Math. Phys. 274, 409–425 (2007)
3. Andréasson, H.: On the Buchdahl inequality for spherically symmetric static shells. Commun. Math. Phys. 274, 399–408 (2007)
4. Andréasson, H.: The Einstein-Vlasov system/Kinetic theory. Liv. Rev. Relativity 8, (2) [Online Article: http://www.livingreviews.org/ln-2005-2] (2005)
5. Andréasson, H., Eklund, M.: A numerical investigation of the steady states of the spherically symmetric Einstein-Vlasov-Maxwell system. In preparation
6. Andréasson, H., Rein, G.: On the steady states of the spherically symmetric Einstein-Vlasov system. Class. Quantum Grav. 24, 1809–1832 (2007)
7. Anninos, P., Rothman, T.: Instability of extremal relativistic charged spheres. Phys. Rev. D 62, 024003 (2001)
8. Baumgarte, T.W., Rendall, A.D.: Regularity of spherically symmetric static solutions of the Einstein equations. Class. Quantum Grav. 10, 327–332 (1993)
9. Bondi, H.: Massive spheres in general relativity. Proc. R. Soc. A 282, 303–317 (1964)
10. Bondi, H.: Anisotropic spheres in general relativity. Mon. Not. Roy. Astr. Soc. 259, 365 (1992)
11. Buchdahl, H.A.: General relativistic fluid spheres. Phys. Rev. 116, 1027–1034 (1959)
12. Böhmer, C.G., Harko, T.: Minimum mass-radius ratio for charged gravitational objects. Gen. Rel. Grav. 39, 757–775 (2007)
13. de Felice, F., Siming, L., Yunqiang, Y.: Relativistic charged spheres: II. Regularity and stability. Class. Quantum Grav. 16, 2669–2680 (1999)
14. Farrugia, C.J., Hajicek, P.: Commun. Math. Phys. 68, 291–299 (1979)
15. Fayos, F., Senovilla, J.M.M., Torres, R.: Spherically symmetric models for charged stars and voids. I. Charge bound. Class. Quantum Grav. 20, 2579–2594 (2003)
16. Ghezzi, C.R.: Relativistic structure, stability, and gravitational collapse of charged neutron stars. Phys. Rev. D 72, 104017 (2005)
17. Giuliani, A., Rothman, T.: Absolute stability limit for relativistic charged spheres. Gen. Rel. Gravitation 40(7), 1427–1447 (2008)
18. Guven, J., Ó Murchadha, N.: Bounds on 2m/R for static spherical objects. Phys. Rev. D 60, 084020 (1999)
19. Harko, T., Mak, M.K.: Anisotropic charged fluid spheres in D space-time dimensions. J. Math. Phys. 41, 4752–4764 (2000)
20. Ivanov, B.V.: Maximum bounds on the surface redshift of anisotropic stars. Phys. Rev. D 65, 104011 (2002)
21. Ivanov, B.V.: Static charged perfect fluid spheres in general relativity. Phys. Rev. D 65, 104001 (2002)
22. Karageorgis, P., Stalker, J.: Sharp bounds on 2m/r for static spherical objects. Class. Quantum Grav. 25, 195021 (2008)
23. Mak, M.K., Dobson, P.N., Harko, T.: Maximum mass-radius ratios for charged compact general relativistic objects. Europhys. Lett. 55, 310 (2001)
24. Mars, M., Mercè Martín-Prats, M., Senovilla, J.M.M.: The 2m ≤ r property of spherically symmetric static spacetimes. Phys. Lett. A 218, 147 (1996)
25. Yunqiang, Y., Siming, L.: Relativistic charged balls. Commun. Theor. Phys. 33, 571 (2000)

Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 731–744 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0778-4
Communications in
Mathematical Physics
Phase Transition in the 1d Random Field Ising Model with Long Range Interaction Marzio Cassandro1 , Enza Orlandi2 , Pierre Picco3 1 Dipartimento di Fisica, Universitá di Roma “La Sapienza”, P.le A. Moro,
00185 Roma, Italy. E-mail:
[email protected]
2 Dipartimento di Matematica, Universitá di Roma Tre, L.go S.Murialdo 1,
00146 Roma, Italy. E-mail:
[email protected]
3 LATP, CMI, UMR 6632, CNRS, Université de Provence,
39 rue Frederic Joliot Curie, 13453 Marseille Cedex 13, France. E-mail:
[email protected] Received: 22 April 2008 / Accepted: 22 December 2008 Published online: 17 March 2009 – © Springer-Verlag 2009
Abstract: We study one-dimensional Ising spin systems with ferromagnetic, long-range interaction decaying as $n^{-2+\alpha}$, $\alpha \in \big(\frac{1}{2}, \frac{\ln 3}{\ln 2} - 1\big)$, in the presence of external random fields. We assume that the random fields are given by a collection of symmetric, independent, identically distributed real random variables, gaussian or subgaussian. We show, for temperature and strength of the randomness (variance) small enough, with $\mathbb{P} = 1$ with respect to the random fields, that there are at least two distinct extremal Gibbs measures.

1. Introduction

It is well known that the one dimensional ferromagnetic Ising model exhibits a phase transition when the forces are sufficiently long range. A fundamental work on the subject is due to Dyson [9]. He proved, by comparison to a hierarchical model, that for a two body interaction $J(n) = \frac{\ln\ln(n+3)}{n^{2}+1}$, where $n$ denotes the distance, there is spontaneous magnetization at low enough temperature. On the other hand Rogers & Thompson [14] proved that the spontaneous magnetization vanishes for all temperatures when
\[ \lim_{N\to\infty} \frac{1}{[\ln N]^{1/2}} \sum_{n=1}^{N} n\,J(n) = 0. \]
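To get a feeling for the Rogers & Thompson criterion, the following minimal sketch evaluates the ratio $[\ln N]^{-1/2}\sum_{n=1}^{N} n J(n)$ at finite $N$ for two sample interactions. The names and the two sample decay rates are illustrative choices, and finite-$N$ values only suggest, but do not prove, the limiting behaviour.

```python
import math

def rt_ratio(J, N):
    """(ln N)^(-1/2) * sum_{n=1}^{N} n * J(n), the quantity in the criterion."""
    total = sum(n * J(n) for n in range(1, N + 1))
    return total / math.sqrt(math.log(N))

def inverse_square(n):
    # J(n) = n^{-2}: the sum grows like ln N, so the ratio grows like sqrt(ln N)
    return n ** -2.0

def faster_decay(n):
    # J(n) = n^{-2.5}: the sum converges, so the ratio tends to zero
    return n ** -2.5

for N in (10**2, 10**4, 10**6):
    print(N, rt_ratio(inverse_square, N), rt_ratio(faster_decay, N))
```

The criterion therefore applies to the faster decaying interaction but not to the inverse square one.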
Later, Fröhlich & Spencer [11] proved the existence of spontaneous magnetization when $J(n) = n^{-2}$. For the same model Aizenman, Chayes, Chayes, & Newman [1] proved the discontinuity of the magnetization at the critical temperature, the so-called Thouless effect. When $J(n) = n^{-2+\alpha}$, $\alpha < 0$, there is only one Gibbs state [6,7,15] and the free energy is analytic in the thermodynamic parameters, see [8]. More recently the notion of contours introduced in [11] was implemented in [5], by giving a graphical description of the spin configurations better suited for further generalizations. The case studied in [5] covers the regime $0 \le \alpha \le (\ln 3/\ln 2) - 1$. By applying Griffiths inequalities the existence of a phase transition in the full interval $0 \le \alpha < 1$ can be deduced either from [5] or from [11].

Supported by: GDRE 224 GREFI-MEFI, CNRS-INdAM. P.P. was also partially supported by INdAM program Professori Visitatori 2007; M.C. and E.O. were partially supported by Prin07: 20078XYHYS.

A natural extension of this analysis is its application to disordered systems. One of the simplest prototype models for disordered spin systems is obtained by adding random magnetic fields, say gaussian independent identically distributed with mean zero and finite variance. The problem of (lower) critical dimension for the $d$-dimensional Random Field Ising Model was very challenging at the end of the eighties since the physical literature predicted conflicting results. For finite range interaction the problem was rigorously solved by two complementary articles, Bricmont & Kupiainen [4] and Aizenman & Wehr [2]. In [4] a renormalization group argument was used to show that if $d \ge 3$ and the variance of the random magnetic field is small enough then almost surely there are at least two distinct Gibbs states (the plus and the minus Gibbs states). In [2] it was proved that for $d \le 2$, almost surely there is a unique Gibbs state. The guidelines of these proofs are suggested by a heuristic argument due to Imry & Ma [13]. In the long-range one-dimensional setting the Imry & Ma argument is the following: the deterministic cost to create a run of $-1$ in an interval of length $L$ with respect to the state made of $+1$ at each site is of order $L^{\alpha}$, while the cumulative effect of the random field inside this interval is just $L^{1/2}$. So when $0 \le \alpha \le 1/2$ the randomness is dominant and there is no phase transition. This has been proved by Aizenman & Wehr [2].
They show that the Gibbs state is unique for almost all realizations of the randomness. When $1/2 < \alpha < 1$, the above Imry & Ma argument suggests the existence of a phase transition, since the deterministic part is dominant with respect to the random part, as in the case of the three-dimensional random field Ising model. However a rigorous result in this direction was missing.

In this paper we study the random field one-dimensional Ising model with long range interaction $n^{-2+\alpha}$, $\alpha \in \big(\frac{1}{2}, \frac{\ln 3}{\ln 2} - 1\big) \approx \big(\frac{1}{2}, \frac{58}{100}\big)$. We assume that the random field $h[\omega] := \{h_i[\omega], i \in \mathbb{Z}\}$ is given by a collection of independent random variables, with mean zero and symmetrically distributed. We take $h_i[\omega] = \pm 1$ with $p = \frac{1}{2}$ and we introduce the strength parameter $\theta$. However one could take different distributions, for example the gaussian distribution with mean zero and variance $\theta^2$, or a subgaussian one. In fact all that is needed is $\mathbb{E}(e^{t h_1}) \le e^{c\theta^2 t^2}$ for some positive constant $c$ and for all $t \in \mathbb{R}$. We prove that for $\frac{1}{2} < \alpha < \frac{\ln 3}{\ln 2} - 1$ the situation is analogous to the three-dimensional short range random field Ising model. For temperature and variance of the randomness small enough, with $\mathbb{P} = 1$ with respect to the randomness, there exist at least two distinct infinite volume Gibbs states, namely the $\mu^+[\omega]$ and the $\mu^-[\omega]$ Gibbs states.

The proof is based on the representation of the system in terms of the contours as defined in [5]. A Peierls argument is obtained by using the lower bound of the deterministic part of the cost to erase a contour and controlling the contribution of the stochastic part. This control is done by applying an exponential Markov inequality and the so-called Yurinski's martingale difference sequences method. We do not need to use any coarse-grained contours as in [4], a fact that simplifies the proof. In the one dimensional case the contours can be described in terms of intervals and the Imry & Ma argument can be implemented.
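The Imry & Ma heuristic compares a deterministic cost of order $L^{\alpha}$ with a random fluctuation of order $L^{1/2}$. The sketch below estimates the growth exponent of the deterministic cost numerically under simplifying assumptions that are ours: $J(n) = n^{-(2-\alpha)}$ for all $n \ge 1$ (the large value of $J(1)$ is ignored), constant prefactors are dropped, and the sum over distances beyond `d_max` is replaced by an integral.

```python
import math

def flip_cost(L, alpha, d_max=200_000):
    """Cost (up to a constant factor) of a run of -1 of length L in a sea of +1.

    The number of (inside, outside) pairs at distance d is 2 * min(d, L).
    Requires 0 < alpha < 1 and d_max > L.
    """
    head = 0.0
    for d in range(1, d_max + 1):
        head += 2.0 * min(d, L) * d ** (alpha - 2.0)
    # integral approximation of the neglected terms with d > d_max
    tail = 2.0 * L * d_max ** (alpha - 1.0) / (1.0 - alpha)
    return head + tail

def growth_exponent(L, alpha):
    """Estimate of the exponent a in cost ~ L^a from the values at L and 2L."""
    return math.log2(flip_cost(2 * L, alpha) / flip_cost(L, alpha))

for alpha in (0.3, 0.55, 0.8):
    print(alpha, growth_exponent(1000, alpha), "random-field exponent: 0.5")
```

The estimated exponent is close to $\alpha$, so it exceeds the random-field exponent $1/2$ exactly in the regime $\alpha > 1/2$ studied here.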
Namely in our case bad configurations of the random magnetic field, the ones for which the naive Imry & Ma argument fails, are treated probabilistically. A kind of energy entropy argument is successfully used, see (3.14), to prove that they can be neglected. In
3 dimensions this specific energy entropy argument fails. The coarse grained contours in [4] allow one to control these bad contours on various length scales by using a renormalization group argument. As a by-product an estimate on the decay of the truncated two point correlation functions is given in [4]. Our method does not give any information on this decay. Therefore we do not think that it can be directly applied to give an alternative proof of the Bricmont & Kupiainen results [4]. For $\alpha \in [(\ln 3/\ln 2) - 1, 1)$ we still expect the same result to hold but we are not able to prove it. In this case the lower bound for the deterministic contribution to the cost of erasing a contour does not hold, see Lemma 2.3. Known correlation inequalities are not relevant to treat this range of values of $\alpha$, as in the case where the random field is absent.

2. Model, Notations and Main Results

2.1. The model and the main results. Let $(\Omega, \mathcal{B}, \mathbb{P})$ be a probability space on which we define $h \equiv \{h_i\}_{i\in\mathbb{Z}}$, a family of independent, identically distributed Bernoulli random variables with $\mathbb{P}[h_i = +1] = \mathbb{P}[h_i = -1] = 1/2$. The spin configuration space is $S \equiv \{-1,+1\}^{\mathbb{Z}}$. If $\sigma \in S$ and $i \in \mathbb{Z}$, $\sigma_i$ represents the value of the spin at site $i$. The pair interaction among spins is given by $J(|i-j|)$ defined as follows¹:
\[ J(n) = \begin{cases} J(1) \gg 1, & \\ \dfrac{1}{n^{2-\alpha}} & \text{if } n > 1, \end{cases} \qquad \text{with } \alpha \in [0,1). \]
For $\Lambda \subseteq \mathbb{Z}$ we set $S_\Lambda = \{-1,+1\}^{\Lambda}$; its elements are denoted by $\sigma_\Lambda$; if $\sigma \in S$, $\sigma_\Lambda$ denotes its restriction to $\Lambda$. Given $\Lambda \subset \mathbb{Z}$ finite and a realization of the magnetic fields, the Hamiltonian in the volume $\Lambda$, with $\tau = \pm 1$ boundary conditions, is the random variable on $(\Omega, \mathcal{B}, \mathbb{P})$ given by
\[ H^{\tau}(\sigma_\Lambda)[\omega] = H_0^{\tau}(\sigma_\Lambda) + \theta\, G(\sigma_\Lambda)[\omega], \tag{2.1} \]
where
\[ H_0^{\tau}(\sigma_\Lambda) := \frac{1}{2}\sum_{(i,j)\in\Lambda\times\Lambda} J(|i-j|)(1 - \sigma_i\sigma_j) + \sum_{i\in\Lambda}\sum_{j\in\Lambda^c} J(|i-j|)(1 - \tau\sigma_i), \tag{2.2} \]
and
\[ G(\sigma_\Lambda)[\omega] := -\sum_{i\in\Lambda} h_i[\omega]\,\sigma_i. \tag{2.3} \]
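The definitions (2.1), (2.2) and (2.3) can be turned into a small numerical sketch. The following is only an illustration under assumptions that are ours: the volume is $\{0, \dots, n-1\}$, the sum over the complement is truncated at `cutoff` sites on each side, and the value `J1 = 10.0` stands in for the unspecified large constant $J(1)$.

```python
def J(n, alpha, J1=10.0):
    """Pair interaction: J(1) = J1 (assumed large), J(n) = n^{-(2 - alpha)} for n > 1."""
    return J1 if n == 1 else n ** (alpha - 2.0)

def hamiltonian(sigma, h, tau, alpha, theta, cutoff=2000):
    """H^tau(sigma)[omega] = H_0^tau(sigma) + theta * G(sigma)[omega], see (2.1).

    sigma, h: lists of +1/-1 of equal length n, indexed by the sites 0..n-1.
    tau: +1 or -1, the homogeneous boundary condition.
    """
    n = len(sigma)
    # first sum in (2.2): ordered pairs inside the volume, with the factor 1/2
    bulk = 0.5 * sum(J(abs(i - j), alpha) * (1 - sigma[i] * sigma[j])
                     for i in range(n) for j in range(n) if i != j)
    # second sum in (2.2): interaction with the (truncated) boundary condition
    outside = list(range(-cutoff, 0)) + list(range(n, n + cutoff))
    boundary = sum(J(abs(i - j), alpha) * (1 - tau * sigma[i])
                   for i in range(n) for j in outside)
    # random field term (2.3)
    field = -sum(h[i] * sigma[i] for i in range(n))
    return bulk + boundary + theta * field
```

With all spins equal to $+1$ and $\tau = +1$ the deterministic part vanishes, so the value reduces to $-\theta\sum_i h_i$; flipping the sign of every $h_i$ changes only the sign of the field term.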
In the following we drop the $\omega$ from the notation. The corresponding Gibbs measure on the finite volume $\Lambda$, at inverse temperature $\beta > 0$ and $+$ boundary condition, is then a random variable with values in the space of probability measures on $S_\Lambda$ defined by
\[ \mu^{+}_{\Lambda}(\sigma_\Lambda) = \frac{1}{Z^{+}_{\Lambda}} \exp\{-\beta H^{+}(\sigma_\Lambda)\}, \qquad \sigma_\Lambda \in S_\Lambda, \tag{2.4} \]
where $Z^{+}_{\Lambda}$ is the normalization factor. Using FKG inequalities, one can construct with $\mathbb{P} = 1$ the infinite volume Gibbs measure $\mu^+[\omega]$ as the limit of local specifications with homogeneous plus boundary conditions along any deterministic sequence of increasing and absorbing finite volumes $\Lambda_n$. Of course the same holds with minus boundary conditions, see for example Theorem 7.2.2 in [3] or Theorem IV.6.5 in [10]. The main results are the following.

¹ The condition $J(1) \gg 1$ is essential to apply the results of [5], reported in Subsect. 2.2.
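Theorem 2.1. Let $\alpha \in \big(\frac{1}{2}, \frac{\ln 3}{\ln 2} - 1\big)$ and
\[ \zeta = \zeta(\alpha) = 1 - 2(2^{\alpha} - 1) > 0. \tag{2.5} \]
There exist positive $\theta_0 := \theta_0(\alpha) > 0$ and $\beta_0 := \beta_0(\alpha) > 0$ so that for $0 < \theta \le \theta_0$ and $\beta \ge \beta_0$ there exists $\Omega_1 \subset \Omega$ such that
\[ \mathbb{P}[\Omega_1] \ge 1 - e^{-\bar b/200}, \tag{2.6} \]
and for any $\omega \in \Omega_1$,
\[ \mu^{+}(\{\sigma_0 = -1\})[\omega] < e^{-\bar b/200}, \tag{2.7} \]
where
\[ \bar b = \min\Big(\frac{\beta\zeta}{4}, \frac{\zeta^2}{2^{10}\theta^2}\Big). \tag{2.8} \]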
Remark. Since the translation invariant, $\mathcal{B}$-measurable event $A \equiv \{\exists i \in \mathbb{Z} : \mu^{+}[\omega](\sigma_i = +1) > 1 - e^{-\bar b/200}\}$ has strictly positive probability, see (2.6) and (2.7), by ergodicity $\mathbb{P}[A] = 1$. Therefore almost surely the two extremal Gibbs states $\mu^{\pm}[\omega]$ are distinct. The proof of Theorem 2.1 is given in Sect. 3. In the next subsection we recall the definition of contours and in Sect. 4 we prove the main probabilistic estimate.

2.2. Geometrical description of the spin configurations. We will follow the geometrical description of the spin configuration presented in [5] and use the same notations. We will consider homogeneous boundary conditions, i.e. the spins in the boundary conditions are either all $+1$ or all $-1$. Actually we will restrict ourselves to $+$ boundary conditions and consider spin configurations $\sigma = \{\sigma_i, i \in \mathbb{Z}\} \in X_+$ so that $\sigma_i = +1$ for all $|i|$ large enough. In one dimension an interface at $(x, x+1)$ means $\sigma_x\sigma_{x+1} = -1$. Due to the above choice of the boundary conditions, any $\sigma \in X_+$ has a finite, even number of interfaces. The precise location of the interface is immaterial and this fact has been used to choose the interface points as follows: for all $x \in \mathbb{Z}$ so that $(x, x+1)$ is an interface, take the location of the interface to be a point inside the interval $[x + \frac{1}{2} - \frac{1}{100}, x + \frac{1}{2} + \frac{1}{100}]$, with the property that for any four distinct points $r_i$, $i = 1, \dots, 4$, $|r_1 - r_2| \ne |r_3 - r_4|$. This choice is done once and for all so that the interface between $x$ and $x+1$ is uniquely fixed. Draw from each one of these interface points two lines forming respectively an angle of $\frac{\pi}{4}$ and of $\frac{3}{4}\pi$ with the $\mathbb{Z}$ line. We have thus a bunch of growing ∨-lines, each one emanating from an interface point. Once two ∨-lines meet, they are frozen and stop their growth. The other two lines emanating from the same interface points are erased. The ∨-lines emanating from other points keep growing. The collision of the two lines is represented graphically by a triangle whose basis is the line joining the two interface points and whose sides are the two segments of the ∨-lines which meet. The choice made of the location of the interface points ensures that collisions occur one at a time so that the above definition is unambiguous. In general there might be triangles inside triangles. The endpoints of the triangles are suitably coupled pairs of interface points. The graphical representation just described maps each spin configuration in $X_+$ to a set of triangles.
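The construction above can be sketched as a short algorithm. Since all lines grow at the same speed, the first collision always happens between the two neighbouring interface points that are closest to each other; removing that pair and repeating yields all triangles. The sketch below makes two simplifying assumptions that are ours: interface points are placed exactly at the half-integers $x + \frac{1}{2}$ rather than at the perturbed locations used in the text, and ties between equal gaps are broken by taking the leftmost pair.

```python
def interfaces(minus_sites):
    """Sorted interface points of the configuration equal to -1 exactly on
    `minus_sites` (a finite collection of integers) and +1 elsewhere."""
    minus = set(minus_sites)
    points = set()
    for x in minus:
        if x - 1 not in minus:
            points.add(x - 0.5)   # interface between x - 1 and x
        if x + 1 not in minus:
            points.add(x + 0.5)   # interface between x and x + 1
    return sorted(points)

def triangles(minus_sites):
    """Bases (left, right) of the triangles, in the order in which they form."""
    pts = interfaces(minus_sites)
    result = []
    while pts:
        # closest pair of neighbouring points collides first (leftmost on ties)
        k = min(range(len(pts) - 1), key=lambda i: pts[i + 1] - pts[i])
        result.append((pts[k], pts[k + 1]))
        del pts[k:k + 2]          # both points are used up; the others keep growing
    return result

def mass(triangle):
    """Number of integer sites in the basis of the triangle."""
    left, right = triangle
    return int(right - left)

print(triangles({2, 3, 4, 10}))
print(triangles(set(range(1, 6)) | set(range(7, 21))))
```

In the second example the short run of $+1$ at site 6 produces a small triangle lying inside a large one, which illustrates the remark that there might be triangles inside triangles.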
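Notation. Triangles will usually be denoted by $T$, the collection of triangles constructed as above by $\{\underline{T}\}$, and we will write
\[ |T| = \text{cardinality of } T \cap \mathbb{Z} = \text{mass of } T, \]
and by $\mathrm{supp}(T) \subset \mathbb{R}$ the basis of the triangle. We have thus represented a configuration $\sigma \in X_+$ as a collection $\underline{T} = (T_1, \dots, T_n)$. The above construction defines a one to one map from $X_+$ onto $\{\underline{T}\}$. It is easy to see that a triangle configuration $\underline{T}$ belongs to $\{\underline{T}\}$ iff for any pair $T$ and $T'$ in $\underline{T}$,
\[ \mathrm{dist}(T, T') \ge \min\{|T|, |T'|\}. \tag{2.9} \]
We say that two collections of triangles $\underline{S}'$ and $\underline{S}$ are compatible, and we denote it by $\underline{S}' \simeq \underline{S}$, iff $\underline{S}' \cup \underline{S} \in \{\underline{T}\}$ (i.e. there exists a configuration in $X_+$ such that its corresponding collection of triangles is the collection made of all triangles that are in $\underline{S}'$ or in $\underline{S}$). By an abuse of notation, we write
\[ H_0^{+}(\underline{T}) = H_0^{+}(\sigma), \qquad G(\sigma(\underline{T}))[\omega] = G(\sigma)[\omega], \qquad \sigma \in X_+ \iff \underline{T} \in \{\underline{T}\}. \]
Definition 2.2. The energy difference. Given two compatible collections of triangles $\underline{S} \simeq \underline{T}$, we denote
\[ H^{+}(\underline{S}|\underline{T}) := H^{+}(\underline{S} \cup \underline{T}) - H^{+}(\underline{T}). \tag{2.10} \]
Let $\underline{T} = (T_1, \dots, T_n)$ with $|T_i| \le |T_{i+1}|$; then using (2.10) one has
\[ H^{+}(\underline{T}) = H^{+}(T_1|\underline{T} \setminus T_1) + H^{+}(\underline{T} \setminus T_1). \tag{2.11} \]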
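The following lemma proved in [5], see Lemma 2.1 there, gives a lower bound on the cost to "erase" triangles sequentially starting from the smallest ones.

Lemma 2.3. [5] For $\alpha \in (0, \frac{\ln 3}{\ln 2} - 1)$ and $\zeta := \zeta(\alpha)$ as defined in (2.5) one has
\[ H_0^{+}(T_1|\underline{T} \setminus T_1) \ge \zeta |T_1|^{\alpha}, \tag{2.12} \]
and by iteration, for any $1 \le i \le n$,
\[ H_0^{+}\big(\cup_{\ell=1}^{i} T_\ell \,\big|\, \underline{T} \setminus [\cup_{\ell=1}^{i} T_\ell]\big) \ge \zeta \sum_{\ell=1}^{i} |T_\ell|^{\alpha}. \tag{2.13} \]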
The estimate (2.13) involves contributions coming from the full set of triangles associated to a given spin configuration, starting from the triangle having the smallest mass. To implement a Peierls bound in our set up we need to "localize" the estimates to compute the weight of a triangle or of a finite set of triangles in a generic configuration. In order to do this [5] introduced the notion of contours as clusters of nearby triangles sufficiently far away from all other triangles.

Contours. A contour $\Gamma$ is a collection $\underline{T}$ of triangles related by a hierarchical network of connections controlled by a positive number $C$, see (2.14), under which all the triangles of a contour become mutually connected. We denote by $T(\Gamma)$ the triangle whose basis is the smallest interval which contains all the triangles of the contour. The right and left endpoints of $T(\Gamma) \cap \mathbb{Z}$ are denoted by $x_{\pm}(\Gamma)$. We denote by $|\Gamma|$ the mass of the contour $\Gamma$,
\[ |\Gamma| = \sum_{T\in\Gamma} |T|, \]
i.e. $|\Gamma|$ is the sum of the masses of all the triangles belonging to $\Gamma$. We denote by $R(\cdot)$ the algorithm which associates to any configuration $\underline{T}$ a configuration $\{\Gamma_j\}$ of contours with the following properties:

P.0 Let $R(\underline{T}) = (\Gamma_1, \dots, \Gamma_n)$, $\Gamma_i = \{T_{j,i}, 1 \le j \le k_i\}$, then $\underline{T} = \{T_{j,i}, 1 \le i \le n, 1 \le j \le k_i\}$.

P.1 Contours are well separated from each other. Any pair $\Gamma \ne \Gamma'$ verifies one of the following alternatives. $T(\Gamma) \cap T(\Gamma') = \emptyset$, i.e. $[x_-(\Gamma), x_+(\Gamma)] \cap [x_-(\Gamma'), x_+(\Gamma')] = \emptyset$, in which case
\[ \mathrm{dist}(\Gamma, \Gamma') := \min_{T\in\Gamma,\, T'\in\Gamma'} \mathrm{dist}(T, T') > C \min\{|\Gamma|^3, |\Gamma'|^3\}, \tag{2.14} \]
where $C$ is a positive number. If $T(\Gamma) \cap T(\Gamma') \ne \emptyset$, then either $T(\Gamma) \subset T(\Gamma')$ or $T(\Gamma') \subset T(\Gamma)$; moreover, supposing for instance that the former case is verified (in which case we call $\Gamma$ an inner contour), then for any triangle $T'_i \in \Gamma'$, either $T(\Gamma) \subset T'_i$ or $T(\Gamma) \cap T'_i = \emptyset$ and
\[ \mathrm{dist}(\Gamma, \Gamma') > C|\Gamma|^3, \quad \text{if } T(\Gamma) \subset T(\Gamma'). \tag{2.15} \]
P.2 Independence. Let for $k > 1$ $\{\underline{T}^{(1)}, \dots, \underline{T}^{(k)}\}$ be configurations of triangles and $R(\underline{T}^{(i)}) = \{\Gamma_j^{(i)}, j = 1, \dots, n_i\}$ be the contours of the configurations $\underline{T}^{(i)}$. Then if any distinct $\Gamma_j^{(i)}$ and $\Gamma_{j'}^{(i')}$ satisfies P.1,
\[ R(\underline{T}^{(1)}, \dots, \underline{T}^{(k)}) = \{\Gamma_j^{(i)}, j = 1, \dots, n_i;\ i = 1, \dots, k\}. \]
As proven in [5], the algorithm $R(\cdot)$ having properties P.0, P.1 and P.2 is unique and therefore there is a bijection between families of triangles and contours. Next we report the estimates proven in [5] which are essential for this paper.

Theorem 2.4. [5]. Let $\alpha \in (0, \frac{\ln 3}{\ln 2} - 1)$ and the constant $C$ in the definition of the contours, see (2.14), be so large that
\[ \sum_{m\ge 1} \frac{4m}{[Cm]^3} \le \frac{1}{2}, \tag{2.16} \]
where $[x]$ denotes the integer part of $x$. For any $\underline{T} \in \{\underline{T}\}$, let $\Gamma_0 \in R(\underline{T})$ be a contour, $\underline{S}^{(0)}$ the triangles in $\Gamma_0$ and $\zeta(\alpha)$ as in (2.5). Then
\[ H_0^{+}\big(\underline{S}^{(0)} \,\big|\, \underline{T} \setminus \underline{S}^{(0)}\big) \ge \frac{\zeta}{2} |\Gamma_0|^{\alpha}, \tag{2.17} \]
where
\[ |\Gamma_0|^{\alpha} := \sum_{T\in\Gamma_0} |T|^{\alpha}. \tag{2.18} \]
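Condition (2.16) is easy to explore numerically. The sketch below (the function name and the truncation are ours) evaluates a partial sum of the series; for integer $C$ the series equals $\frac{4}{C^3}\sum_{m\ge1} m^{-2}$, so the truncation error after $M$ terms is at most $4/(C^3 M)$.

```python
import math

def contour_sum(C, terms=100_000):
    """Partial sum of sum_{m >= 1} 4 m / [C m]^3, with [x] the integer part.

    Assumes C >= 1 so that [C m] >= 1 for every m >= 1.
    """
    return sum(4 * m / math.floor(C * m) ** 3 for m in range(1, terms + 1))

for C in (2, 3, 4):
    print(C, contour_sum(C), contour_sum(C) <= 0.5)
```

Among integer values, $C = 2$ violates (2.16) while $C = 3$ already satisfies it.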
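Theorem 2.5. [5]. For any $\gamma > 0$ there exists $C_0(\gamma)$ so that for $b \ge C_0(\gamma)$ and for all $m > 0$,
\[ \sum_{\Gamma \ni 0,\ |\Gamma| = m} w_b^{\gamma}(\Gamma) \le 2m\, e^{-b m^{\gamma}}, \tag{2.19} \]
where
\[ w_b^{\gamma}(\Gamma) := \prod_{T\in\Gamma} e^{-b|T|^{\gamma}}. \tag{2.20} \]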
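In the sequel, it is convenient to identify in each contour the families of triangles having the same mass.

Definition 2.6.
\[ \Gamma = \{\underline{T}^{(0)}, \underline{T}^{(1)}, \dots, \underline{T}^{(k_\Gamma)}\}, \]
where for $\ell = 0, \dots, k_\Gamma$, $\underline{T}^{(\ell)} := \{T_1^{(\ell)}, T_2^{(\ell)}, \dots, T_{n_\ell}^{(\ell)}\}$, and each triangle of the family $\underline{T}^{(\ell)}$ has the same mass, i.e. for all $i \in \{1, \dots, n_\ell\}$, $|T_i^{(\ell)}| = \Delta_\ell$ for $\Delta_\ell \in \mathbb{N}$. According to (2.18),
\[ |\Gamma|^{\rho} = \sum_{\ell=0}^{k_\Gamma} |\underline{T}^{(\ell)}|^{\rho}, \qquad |\underline{T}^{(\ell)}|^{\rho} = \sum_{T\in \underline{T}^{(\ell)}} |T|^{\rho} = n_\ell \Delta_\ell^{\rho}, \qquad \rho \in \mathbb{R}^{+}. \tag{2.21} \]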
3. Proof of Theorem 2.1

The proof of Theorem 2.1 is an immediate consequence of the following proposition and the Markov inequality.

Proposition 3.1. Let $\alpha \in (\frac{1}{2}, \frac{\ln 3}{\ln 2} - 1)$. There exist positive $\theta_0 := \theta_0(\alpha) > 0$ and $\beta_0 := \beta_0(\alpha) > 0$ so that for $0 < \theta \le \theta_0$ and $\beta \ge \beta_0$,
\[ \mathbb{E}\Big[\mu^{+}\Big(\frac{1 - \sigma_0}{2}\Big)\Big] \le e^{-\bar b/100}, \tag{3.1} \]
where $\bar b$ is the quantity defined in (2.8).

Proof. A necessary condition to have $\sigma_0 = -1$ is that the site zero is contained in the support of some contour $\Gamma$, so that
\[ \mu^{+}(\sigma_0 = -1) \le \mu^{+}(\{\exists \Gamma : 0 \in \Gamma\}) \le \sum_{\Gamma \ni 0} \mu^{+}(\Gamma). \tag{3.2} \]
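By definition, see (2.4),
\[ \mu^{+}(\Gamma)[\omega] := \frac{1}{Z^{+}[\omega]} \sum_{\underline{T} : \underline{T} \simeq \Gamma} e^{-\beta H^{+}(\underline{T} \cup \Gamma)[\omega]}, \tag{3.3} \]
where $\sum_{\underline{T} : \underline{T} \simeq \Gamma}$ means that the sum is over all families of triangles compatible with the contour $\Gamma$. Recalling (2.2) and (2.10), for any $j$ such that $0 \le j \le k_\Gamma$, we write for the deterministic part of the Hamiltonian
\[ H_0^{+}(\underline{T} \cup \Gamma) = H_0^{+}\big(\underline{T} \cup \Gamma \setminus (\cup_{\ell=0}^{j} \underline{T}^{(\ell)})\big) + H_0^{+}\big(\underline{T} \cup \Gamma \setminus (\cup_{\ell=0}^{j} \underline{T}^{(\ell)}) \,\big|\, (\cup_{\ell=0}^{j} \underline{T}^{(\ell)})\big). \tag{3.4} \]
Using estimate (2.17) and recalling notation (2.21),
\[ H_0^{+}\big(\underline{T} \cup \Gamma \setminus \cup_{\ell=0}^{j} \underline{T}^{(\ell)} \,\big|\, \cup_{\ell=0}^{j} \underline{T}^{(\ell)}\big) \ge \frac{\zeta}{2} \sum_{\ell=0}^{j} n_\ell |\Delta_\ell|^{\alpha}. \tag{3.5} \]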
µ+ ( ) ≤ e−β 2
j
=0 n | |
α
1 −β H + (T ∪ \(∪ j T ( ) ))+βθ G(σ (T ∪ ))[ω] 0 =0 e . (3.6) + Z T :T
We multiply and divide by j j + ( ) ( ) e−β H0 (T ∪ \(∪ =0 T ))+βθ G(σ (T ∪ \∪ =0 T ))[ω] ,
(3.7)
T :T
and reconstruct µ+ ( \ [∪ =0 T ( ) ]), observing that We get j
µ+ ( ) ≤ e−
βζ 2
j
=0
|T ( ) |α
T :T
1≤
T :T \∪ =0 T ( ) j
µ+ ( \ [∪ =0 T ( ) ])eβ F j [ω] , j
(3.8)
where
⎧ 1 ⎨ F j [ω] := ln β ⎩
j
e−β H0 (T ∪ \(∪ =0 T +
T :T
( ) )+βθ G(σ (T ∪ ))[ω]
⎫ ⎬
( ) ))+βθ G(σ (T ∪ \(∪ j T ( ) ))[ω] =0
⎭
j
e−β H0 (T ∪ \∪ =0 T +
T :T
1.
. (3.9)
In (3.8) we explicitly quantify the deterministic cost of the first smaller families of triangles $\{\underline{T}^{(0)}, \dots, \underline{T}^{(j)}\}$ and express the main random contribution $F_j[\omega]$ so that it is antisymmetric with respect to the sign exchange of the random field inside $\cup_{\ell=0}^{j} \underline{T}^{(\ell)}$, see (4.6). This observation allows one to estimate this random contribution in a very convenient way, see Lemma 3.2. To this aim we define for each $\Gamma$ the partition $\Omega = \cup_{j=-1}^{k_\Gamma} B_j$, where for $j \in \{0, \dots, k_\Gamma - 1\}$,
\[ B_j = B_j(\Gamma) := \Big\{\omega : F_j[\omega] \le \frac{\zeta}{4}\sum_{\ell=0}^{j} |\underline{T}^{(\ell)}|^{\alpha}, \text{ and } \forall i > j,\ F_i[\omega] > \frac{\zeta}{4}\sum_{\ell=0}^{i} |\underline{T}^{(\ell)}|^{\alpha}\Big\}, \tag{3.10} \]
\[ B_{k_\Gamma} = B_{k_\Gamma}(\Gamma) := \Big\{\omega : F_{k_\Gamma}[\omega] \le \frac{\zeta}{4}\sum_{\ell=0}^{k_\Gamma} |\underline{T}^{(\ell)}|^{\alpha}\Big\}, \tag{3.11} \]
and
\[ B_{-1} = B_{-1}(\Gamma) := \Big\{\omega : \forall i > -1;\ F_i[\omega] > \frac{\zeta}{4}\sum_{\ell=0}^{i} |\underline{T}^{(\ell)}|^{\alpha}\Big\}. \tag{3.12} \]
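The relevant properties of the partition are given in the following lemma, whose proof is given in Sect. 4.

Lemma 3.2. For $-1 \le j \le k_\Gamma$,
\[ \mathbb{E}\big[1_{B_j}\big] \le e^{-\frac{\zeta^2}{2^{10}\theta^2}\sum_{\ell=j+1}^{k_\Gamma} |\underline{T}^{(\ell)}|^{2\alpha-1}}, \tag{3.13} \]
with the convention that an empty sum is zero.

We then write
\[ \mu^{+}(\Gamma) = \sum_{j=-1}^{k_\Gamma} \mu^{+}(\Gamma) 1\{B_j\} \]
and apply to each $\mu^{+}(\Gamma)1\{B_j\}$ estimate (3.8). We obtain
\[ \mathbb{E}\big[\mu^{+}(\Gamma)\big] = \sum_{j=-1}^{k_\Gamma} \mathbb{E}\big[\mu^{+}(\Gamma) 1\{B_j\}\big] \le \sum_{j=-1}^{k_\Gamma} e^{-\frac{\beta\zeta}{4}\sum_{\ell=0}^{j} |\underline{T}^{(\ell)}|^{\alpha}}\, e^{-\frac{\zeta^2}{2^{10}\theta^2}\sum_{\ell=j+1}^{k_\Gamma} |\underline{T}^{(\ell)}|^{2\alpha-1}} \le (k_\Gamma + 1)\, e^{-\bar b \sum_{\ell=0}^{k_\Gamma} |\underline{T}^{(\ell)}|^{2\alpha-1}}, \tag{3.14} \]
where $\bar b := \min\big(\frac{\beta\zeta}{4}, \frac{\zeta^2}{2^{10}\theta^2}\big)$. Recalling (2.20), one has
\[ \mathbb{E}\big[\mu^{+}(\{0 \in \Gamma\})\big] \le \sum_{\Gamma \ni 0} (k_\Gamma + 1)\, w_{\bar b}^{2\alpha-1}(\Gamma) \le \sum_{m \ge 3} (m + 1) \sum_{0 \in \Gamma : |\Gamma| = m} w_{\bar b}^{2\alpha-1}(\Gamma). \tag{3.15} \]
Using (2.19), after a few lines of computation one gets (3.1).

Remark. The upper bound $\alpha < \frac{\ln 3}{\ln 2} - 1$ in Theorem 2.1 follows from Theorem 2.4, the lower bound $\alpha > \frac{1}{2}$ from Theorem 2.5 and (3.15).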
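4. Probabilistic Estimates

Let $h = h[\omega]$ be a realization of the random magnetic fields and $A \subset \mathbb{Z}$. Define
\[ (S_A h)_i = \begin{cases} -h_i, & \text{if } i \in A; \\ h_i, & \text{otherwise,} \end{cases} \tag{4.1} \]
and denote $h[S_A\omega] \equiv S_A h[\omega]$. In the following, to simplify notation, we set $S_T h = S_{\mathrm{supp}(T)} h$. Recalling (2.3), it is easy to see that
\[ G(\sigma(\underline{T} \cup \Gamma \setminus \underline{T}^{(0)}))[\omega] = G(\sigma(\underline{T} \cup \Gamma))[S_{\underline{T}^{(0)}}\omega]. \tag{4.2} \]
In general
\[ G(\sigma(\underline{T} \cup \Gamma \setminus \cup_{\ell=0}^{i} \underline{T}^{(\ell)}))[\omega] = G(\sigma(\underline{T} \cup \Gamma))[S_{D_i}\omega], \tag{4.3} \]
where
\[ D_i \subset \cup_{\ell=0}^{i} \mathrm{supp}(\underline{T}^{(\ell)}) \tag{4.4} \]
is the non-empty set so that
\[ S_{D_i} = S_{\underline{T}^{(i)}} S_{\underline{T}^{(i-1)}} \cdots S_{\underline{T}^{(1)}} S_{\underline{T}^{(0)}}. \tag{4.5} \]
When all the triangles in $(\underline{T}^{(\ell)}, \ell = 0, \dots, j)$ have disjoint supports, (4.4) becomes an equality. In general there are triangles inside triangles and in this case the inclusion in (4.4) is strict. By construction the $F_j[\omega]$ defined in (3.9) are such that
\[ F_j(h(D_j^c), h(D_j)) = -F_j(h(D_j^c), -h(D_j)), \qquad j \in \{0, \dots, k_\Gamma\}, \tag{4.6} \]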
where for a set $A \subset \mathbb{Z}$, we denote by $h(A) = \{h_i : i \in A\}$. Therefore one gets that $\mathbb{E}[F_j] = 0$.

Proof of Lemma 3.2. Set
\[ A_i := \frac{\zeta}{4}\sum_{\ell=0}^{i} |\underline{T}^{(\ell)}|^{\alpha}. \tag{4.7} \]
We have $\mathbb{P}[B_j] \le \mathbb{P}[\forall i > j;\ F_i[\omega] > A_i]$. Let $\lambda_i$ for $i = j+1, \dots, k_\Gamma$ be positive parameters; by an exponential Markov inequality we have
\[ \mathbb{P}[\forall i > j : F_i[\omega] \ge A_i] \le e^{-\sum_{\ell=j+1}^{k_\Gamma} \lambda_\ell A_\ell}\, \mathbb{E}\Big[e^{\sum_{\ell=j+1}^{k_\Gamma} \lambda_\ell F_\ell}\Big]. \tag{4.8} \]
Set
\[ F[\omega] := \sum_{i=j+1}^{k_\Gamma} \lambda_i F_i[\omega]. \tag{4.9} \]
It remains to estimate $\mathbb{E}[e^{F}]$. Note that $F[\omega]$ depends on all the random fields on $\Gamma$. Let $N$ be the number of sites in $\Gamma$. To avoid involved notations, we define a bijection $\Pi$ from $\Gamma$ to $\{1, \dots, N\}$ as follows: first pick up all the $n_0\Delta_0$ sites in $\mathrm{supp}(\underline{T}^{(0)})$ and put them consecutively in $N, \dots, N - n_0\Delta_0 + 1$ (keeping them in the same order as they are for definiteness). Then pick up the sites in $\mathrm{supp}(\underline{T}^{(1)})$ that are not in $\mathrm{supp}(\underline{T}^{(0)})$ and put them consecutively starting at $N - n_0\Delta_0$ until they are exhausted. Note that if no triangles of size $\Delta_0$ are within a triangle of size $\Delta_1$, $\Pi$ maps $\mathrm{supp}(\underline{T}^{(0)}) \cup \mathrm{supp}(\underline{T}^{(1)})$ onto $\{N, \dots, N - n_0\Delta_0 - n_1\Delta_1 + 1\}$; otherwise $\Pi$ maps $\mathrm{supp}(\underline{T}^{(0)}) \cup \mathrm{supp}(\underline{T}^{(1)})$ onto a proper subset of $\{N, \dots, N - n_0\Delta_0 - n_1\Delta_1 + 1\}$. One can iterate this procedure until all the sites of the support of $\Gamma$ are exhausted. As above, for all $1 \le j \le k_\Gamma - 1$, if all triangles considered are disjoint, $\Pi$ maps $\cup_{\ell=0}^{j+1} \mathrm{supp}(\underline{T}^{(\ell)})$ onto $\{N, N-1, \dots, N - M_{j+1} + 1\}$, where $M_{j+1} = \sum_{\ell=0}^{j+1} n_\ell\Delta_\ell$, otherwise onto a proper subset of it. Then one can pick up all the remaining sites of $\Gamma$ and continue as above. The $\Pi$ so defined induces a bijection from the random magnetic fields indexed by $\Gamma$ to a family of random variables $(h_1, \dots, h_N)$ by $(\Pi h)_{\Pi(i)} := h_i$, $\forall i \in \Gamma$. Using this bijection, one can work with the random variables $(h_i, 1 \le i \le N)$. Define the family of increasing $\sigma$-algebras
\[ (\emptyset, \Omega) = \Sigma_0 \subset \Sigma_1 = \sigma(h_1) \subset \Sigma_2 = \sigma(h_1, h_2) \subset \cdots \subset \Sigma_N = \sigma(h_1, h_2, \dots, h_N) \]
and $\Delta_k(F) = \mathbb{E}[F|\Sigma_k] - \mathbb{E}[F|\Sigma_{k-1}]$ the associated martingale difference sequences. We have
\[ \mathbb{E}[F|\Sigma_N] = F; \qquad \mathbb{E}[F|\Sigma_0] = \mathbb{E}[F] = 0, \qquad F = \sum_{k=1}^{N} \Delta_k(F). \]
Remark that
\[ \mathbb{E}[F_{j+1}|\Sigma_i] = 0 \qquad \forall i \in \{1, \dots, N - M_{j+1}\} \tag{4.10} \]
since by (4.6) $F_{j+1}(h(D_{j+1}^c), h(D_{j+1})) = -F_{j+1}(h(D_{j+1}^c), -h(D_{j+1}))$ and $h(\cup_{\ell=0}^{j+1} \underline{T}^{(\ell)}) \subset \{h_{N-M_{j+1}}, \dots, h_N\}$. Note also that
\[ \mathbb{E}[F|\Sigma_i] = 0 \qquad \forall i \in \{1, \dots, N - M_{k_\Gamma}\}, \tag{4.11} \]
and
\[ \mathbb{E}[e^{F}] = \mathbb{E}\Big[e^{\sum_{k=1}^{N-1} \Delta_k(F)}\, \mathbb{E}\big[e^{\Delta_N(F)}\,\big|\,\Sigma_{N-1}\big]\Big]. \]
With self explained notations, using the Jensen inequality one has IE[e N (F) | N −1 ] = e N (F) IP(dh N ) F(h
(4.12)
We then expand the exponential in the right-hand side of (4.12). All the odd powers but the constant one vanish. For the even power we recall (4.9) and by the Lipschitz continuity of each term with respect to (h 1 , . . . , h N ) we get k λi |Fi (h
≤ 2θ |h N − h˜ N |
k
λ .
(4.13)
= j+1 + Then estimating |h N − h˜ N | ≤ 2 and 2(2n−1) (2n!)−1 ≤ (n!)−1 to re-sum the series one gets
\[ \mathbb{E}\big[e^{\Delta_N}\,\big|\,\Sigma_{N-1}\big] \le e^{16\theta^2\big(\sum_{\ell=j+1}^{k_\Gamma} \lambda_\ell\big)^2}. \tag{4.14} \]
In the case of gaussian or subgaussian variable one just performs all the integration instead of using |h N − h˜ N | ≤ 2. This will modify the result by a constant different from 16. To iterate, one uses again the Jensen inequality obtaining IE[e N −1 | N −2 ] = e N −1 (F) IP(dh N −1 ) F(h
2 k 16θ 2 = j+1 λ . IE e N −1 (F) | N −2 ≤ e
(4.15)
2 k 2 Iterating one gets IE ek (F) |k−1 ≤ e4θ ( = j+1 λ ) for k ∈ {N , N − 1, . . . , N − M j+1 }. When k = N − M j+1 − 1, a new fact happens. Using (4.10) for i = N − M j+1 ≡ m and computing m (F) =
k
λi (IE[Fi |m ] − IE[Fi |m−1 ]),
(4.16)
i= j+1
one obtains that the term corresponding to i = j + 1 in the sum gives zero contribution. Therefore, in this case, one has k ≤ 4θ ˆ ˜ ˆ ˆ F(h , h , h ) − F(h , h , h ) IP( h ) λ . (4.17) <m m >m <m m >m >m = j+2
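Iterating this procedure one gets
\[ \mathbb{E}\big[e^{F}\big] \le \exp\Big\{16\theta^2\Big[\Big(\sum_{\ell=0}^{j+1} n_\ell\Delta_\ell\Big)\Big(\sum_{\ell=j+1}^{k_\Gamma} \lambda_\ell\Big)^2 + n_{j+2}\Delta_{j+2}\Big(\sum_{\ell=j+2}^{k_\Gamma} \lambda_\ell\Big)^2 + \cdots + n_{k_\Gamma}\Delta_{k_\Gamma}\lambda_{k_\Gamma}^2\Big]\Big\}. \tag{4.18} \]
The estimate (4.18) suggests to set, for $\ell = j+1, \dots, k_\Gamma$,
\[ \mu_\ell \equiv \sum_{n=\ell}^{k_\Gamma} \lambda_n, \tag{4.19} \]
and the constraints ($\lambda_i \ge 0$, $j+1 \le i \le k_\Gamma$) become $\mu_\ell$ decreasing with $\ell$. We write the first exponent of (4.8) in terms of the $\mu_\ell$, obtaining
\[ -\sum_{\ell=j+1}^{k_\Gamma} \lambda_\ell A_\ell = -\mu_{j+1}\frac{\zeta}{4}\Big(\sum_{\ell=0}^{j} n_\ell\Delta_\ell^{\alpha}\Big) - \frac{\zeta}{4}\sum_{\ell=j+1}^{k_\Gamma} \mu_\ell n_\ell\Delta_\ell^{\alpha}, \tag{4.20} \]
and for the exponent in (4.18) we obtain
\[ 16\theta^2(\mu_{j+1})^2\Big(\sum_{\ell=0}^{j} n_\ell\Delta_\ell\Big) + 16\theta^2\sum_{\ell=j+1}^{k_\Gamma} (\mu_\ell)^2 n_\ell\Delta_\ell. \tag{4.21} \]
Denote
\[ f(\mu_\ell) \equiv -\frac{\zeta}{4}\mu_\ell\Delta_\ell^{\alpha} + 16\theta^2\mu_\ell^2\Delta_\ell, \qquad \ell = j+1, \dots, k_\Gamma. \]
Choose $\mu_\ell \equiv \bar\mu_\ell$ where
\[ \bar\mu_\ell = \frac{\zeta}{4 \times 32\,\theta^2}\,\frac{1}{\Delta_\ell^{1-\alpha}} \tag{4.22} \]
is the minimizer of $f(\mu_\ell)$. Note that $\bar\mu_\ell$ is a decreasing function of $\ell$ and
\[ f(\bar\mu_\ell) = -\frac{\zeta^2\Delta_\ell^{2\alpha-1}}{2^{10}\theta^2}. \tag{4.23} \]
Therefore collecting together the last sum in (4.20) and the one in (4.21) we get
\[ -\frac{\zeta}{4}\sum_{\ell=j+1}^{k_\Gamma} \bar\mu_\ell n_\ell\Delta_\ell^{\alpha} + 16\theta^2\sum_{\ell=j+1}^{k_\Gamma} (\bar\mu_\ell)^2 n_\ell\Delta_\ell = -\frac{\zeta^2}{2^{10}\theta^2}\sum_{\ell=j+1}^{k_\Gamma} n_\ell\Delta_\ell^{2\alpha-1} = -\frac{\zeta^2}{2^{10}\theta^2}\sum_{\ell=j+1}^{k_\Gamma} |\underline{T}^{(\ell)}|^{2\alpha-1}. \tag{4.24} \]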
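The optimization leading to (4.22) and (4.23) is a one-variable quadratic minimization and can be checked numerically. In the sketch below the function names and the sample parameter values are ours; `delta` stands for the common mass of the triangles of a family.

```python
def f(mu, delta, zeta, theta, alpha):
    """f(mu) = -(zeta/4) mu delta^alpha + 16 theta^2 mu^2 delta."""
    return -(zeta / 4) * mu * delta ** alpha + 16 * theta ** 2 * mu ** 2 * delta

def mu_bar(delta, zeta, theta, alpha):
    """Claimed minimizer (4.22): zeta / (4 * 32 * theta^2) * delta^(alpha - 1)."""
    return zeta / (4 * 32 * theta ** 2) * delta ** (alpha - 1)

def f_min(delta, zeta, theta, alpha):
    """Claimed minimum (4.23): -zeta^2 delta^(2 alpha - 1) / (2^10 theta^2)."""
    return -zeta ** 2 * delta ** (2 * alpha - 1) / (2 ** 10 * theta ** 2)

# sample values chosen only for illustration
delta, zeta_val, theta, alpha = 7, 0.07, 0.1, 0.55
m = mu_bar(delta, zeta_val, theta, alpha)
print(m, f(m, delta, zeta_val, theta, alpha), f_min(delta, zeta_val, theta, alpha))
```

Since $f$ is a convex quadratic in $\mu$, its minimizer is $\mu = \frac{\zeta\Delta^{\alpha}}{4 \cdot 32\,\theta^2\Delta}$, in agreement with (4.22), and the minimizer decreases as the mass grows because $\alpha < 1$.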
Summing up (4.20) and (4.21), taking into account (4.22) and (4.24), we get
\[ -\bar\mu_{j+1}\frac{\zeta}{4}\Big(\sum_{\ell=0}^{j} n_\ell\Delta_\ell^{\alpha}\Big) + 16\theta^2(\bar\mu_{j+1})^2\Big(\sum_{\ell=0}^{j} n_\ell\Delta_\ell\Big) = -\sum_{\ell=0}^{j} n_\ell\Big[\frac{\zeta}{4}\Delta_\ell^{\alpha}\bar\mu_{j+1} - 16\theta^2\Delta_\ell(\bar\mu_{j+1})^2\Big]. \tag{4.25} \]
One can check easily that for all $0 \le \ell \le j$ one has
\[ \frac{\zeta}{4}\Delta_\ell^{\alpha}\bar\mu_{j+1} - 16\theta^2\Delta_\ell(\bar\mu_{j+1})^2 = \frac{\zeta^2}{2^{9}\theta^2}\,\frac{\Delta_\ell^{\alpha}}{\Delta_{j+1}^{1-\alpha}}\Big[1 - \frac{\Delta_\ell^{1-\alpha}}{2\Delta_{j+1}^{1-\alpha}}\Big] > 0, \tag{4.26} \]
since by construction $\Delta_\ell < \Delta_{j+1}$ for $0 \le \ell \le j$. By (4.8), (4.18), (4.24), and (4.25) one gets (3.13).

Acknowledgements. We are indebted to Errico Presutti for stimulating discussions and criticism. P.P. thanks the Mathematics Department of "Universitá degli Studi dell'Aquila" and Anna de Masi for hospitality. Enza Orlandi thanks the Institut Henri Poincaré - Centre Emile Borel (workshop Mécanique statistique, probabilités et systèmes de particules 2008) for hospitality. The authors thank the referees for useful comments.
References
1. Aizenman, M., Chayes, J., Chayes, L., Newman, C.: Discontinuity of the magnetization in one-dimensional $1/|x-y|^2$ percolation, Ising and Potts models. J. Stat. Phys. 50(1–2), 1–40 (1988)
2. Aizenman, M., Wehr, J.: Rounding of first order phase transitions in systems with quenched disorder. Commun. Math. Phys. 130, 489–528 (1990)
3. Bovier, A.: Statistical Mechanics of Disordered Systems. Cambridge Series in Statistical and Probabilistic Mathematics 18, Cambridge: Cambridge Univ. Press, 2006
4. Bricmont, J., Kupiainen, A.: Phase transition in the three-dimensional random field Ising model. Commun. Math. Phys. 116, 539–572 (1988)
5. Cassandro, M., Ferrari, P.A., Merola, I., Presutti, E.: Geometry of contours and Peierls estimates in d = 1 Ising models with long range interaction. J. Math. Phys. 46(5), 053305 (2005)
6. Dobrushin, R.: The description of a random field by means of conditional probabilities and conditions of its regularity. Theory Probability Appl. 13, 197–224 (1968)
7. Dobrushin, R.: The conditions of absence of phase transitions in one-dimensional classical systems. Matem. Sbornik 93(N1), 29–49 (1974)
8. Dobrushin, R.: Analyticity of correlation functions in one-dimensional classical systems with slowly decreasing potentials. Commun. Math. Phys. 32(N4), 269–289 (1973)
9. Dyson, F.J.: Existence of phase transition in a one-dimensional Ising ferromagnet. Commun. Math. Phys. 12, 91–107 (1969)
10. Ellis, R.S.: Entropy, Large Deviation and Statistical Mechanics. New York: Springer, 1988
11. Fröhlich, J., Spencer, T.: The phase transition in the one-dimensional Ising model with $1/r^2$ interaction energy. Commun. Math. Phys. 84, 87–101 (1982)
12. Gallavotti, G., Miracle Sole, S.: Statistical mechanics of Lattice Systems. Commun. Math. Phys. 5, 317–323 (1967)
13. Imry, Y., Ma, S.: Random field instability of the ordered state of continuous symmetry. Phys. Rev. Lett. 35, 1399–1401 (1975)
14. Rogers, J.B., Thompson, C.J.: Absence of long range order in one dimensional spin systems. J. Stat. Phys. 25, 669–678 (1981)
15. Ruelle, D.: Statistical mechanics of one-dimensional Lattice gas. Commun. Math. Phys. 9, 267–278 (1968)

Communicated by H. Spohn
Commun. Math. Phys. 288, 745–772 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0776-6
Communications in
Mathematical Physics
Spin Foam Perturbation Theory for Three-Dimensional Quantum Gravity
João Faria Martins1, Aleksandar Miković2
1 Centro de Matemática da Universidade do Porto, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre, 687, 4169-007 Porto, Portugal. E-mail: [email protected]
2 Departamento de Matemática, Universidade Lusófona de Humanidades e Tecnologia, Av do Campo Grande, 376, 1749-024 Lisboa, Portugal. E-mail: [email protected]
Received: 28 April 2008 / Accepted: 19 December 2008
Published online: 17 March 2009 – © Springer-Verlag 2009
Abstract: We formulate the spin foam perturbation theory for three-dimensional Euclidean Quantum Gravity with a cosmological constant. We analyse the perturbative expansion of the partition function in the dilute-gas limit and we argue that the Baez conjecture stating that the number of possible distinct topological classes of perturbative configurations is finite for the set of all triangulations of a manifold, is not true. However, the conjecture is true for a special class of triangulations which are based on subdivisions of certain 3-manifold cubulations. In this case we calculate the partition function and show that the dilute-gas correction vanishes for the simplest choice of the volume operator. By slightly modifying the dilute-gas limit, we obtain a nonvanishing correction which is related to the second order perturbative correction. By assuming that the dilute-gas limit coupling constant is a function of the cosmological constant, we obtain a value for the partition function which is independent of the choice of the volume operator.

Contents
1. Introduction
2. Perturbative Expansion for the PR Model
3. Quantum SU(2) Invariants of Links and Three-Manifolds
   3.1 Spin network calculus
   3.2 Generalised Heegaard diagrams
   3.3 The Chain-Mail invariant
   3.4 The Turaev-Viro invariant
   3.5 The Witten-Reshetikhin-Turaev invariant
4. Perturbative Expansion
   4.1 Graspings
   4.2 The first-order volume expectation value
   4.3 Higher-order corrections
5. Dilute Gas Limit
   5.1 Preliminary approach
       5.1.1 The problem with Assumption 1
   5.2 Exact calculation
   5.3 Dilute gas limit for $z_1 = 0$
   5.4 Relating $g$ and $\lambda$
   5.5 Proof of Lemma 5.2
6. Conclusions

Member of the Mathematical Physics Group, University of Lisbon.
1. Introduction

Spin foam state sum models can be understood as the path integrals for BF topological field theories [Ba1]. Since General Relativity in 3 and 4 dimensions can be represented as a perturbed BF theory, see [FK,M2], finding the corresponding Quantum Gravity theory requires a spin foam perturbation theory. Baez analysed spin foam perturbation theory from a general point of view in [Ba2], and he was able to show that, under certain reasonable assumptions, the perturbed spin foam state sum $Z$ can be calculated in the dilute gas limit. In this limit, the number $N$ of tetrahedra of a manifold triangulation tends to infinity and $\lambda$, the perturbation theory parameter, tends to zero, in such a way that the effective coupling constant $g = \lambda N$ is finite. By assuming that the number of topologically inequivalent classes of perturbed configurations at a given order of perturbation theory stays bounded when $N \to \infty$, Baez showed that the perturbation series

$$Z(M,\Delta) = Z_0(M) + \lambda Z_1(M,\Delta) + \lambda^2 Z_2(M,\Delta) + \cdots = \sum_{n=0}^{\infty} \lambda^n Z_n(M,\Delta), \tag{1}$$

where $M$ is the manifold and $\Delta$ is its triangulation, is dominated by the contributions from the dilute configurations in the dilute gas limit. The dilute configurations of order $n$ are the configurations where $n$ non-intersecting simplices carry a single perturbation insertion. Let $\bar{Z}_N = \sum_{n=0}^{N} \lambda^n Z_n(M,\Delta)$, then

$$\lim_{N\to\infty} \bar{Z}_N(M,\Delta) = e^{g z_1} Z_0(M), \tag{2}$$

where

$$z_1 = \lim_{N\to\infty} \frac{Z_1(M,\Delta)}{N Z_0(M)}$$

does not depend on the chosen manifold $M$.

In this paper, we study in detail the Baez approach on the example of three-dimensional (3d) Euclidean Quantum Gravity with a cosmological constant. In this case it is possible to construct the perturbative corrections explicitly, and we will show how to do it. Therefore one can check all the assumptions and the results from the general approach. We will show that the conjecture that there are only finitely many topological classes of perturbative configurations at a given order of the perturbation theory is not true. However, if the triangulations are restricted to those corresponding to special subdivisions of certain acceptable manifold cubulations, then the number of these topological classes is finite. In this case we show that the formula (2) is still valid and we calculate $z_1$. A surprising feature of 3d gravity is that $z_1 = 0$ and consequently the dilute gas limit has to be modified in order to obtain a nonzero contribution. We will show that for $g = \lambda^2 N$,

$$\lim_{N\to\infty} \bar{Z}_N(M,\Delta) = e^{g z_2} Z_0(M), \tag{3}$$

where

$$z_2 = \lim_{N\to\infty} \frac{Z_2(M,\Delta)}{N Z_0(M)}.$$
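The exponentiation behind limits of the form (2) and (3) can be illustrated numerically. The sketch below rests on a simplifying assumption that is not part of the text: only dilute configurations contribute, so that $Z_n/Z_0 = \binom{N}{n} z_1^n$ exactly. Under that assumption the truncated series equals $(1 + g z_1/N)^N$, which tends to $e^{g z_1}$. The function name `dilute_sum` and the sample values of `g` and `z1` are hypothetical.

```python
import math


def dilute_sum(N, g, z1):
    """Sum_{n=0}^{N} lambda^n * C(N, n) * z1^n with lambda = g / N.

    Toy model only: it assumes Z_n / Z_0 = C(N, n) * z1^n, i.e. that
    each of the n insertions sits on a distinct tetrahedron and
    contributes the same factor z1.
    """
    lam = g / N
    term = 1.0   # n = 0 term
    total = 1.0
    for n in range(N):
        # ratio of consecutive terms: C(N, n+1) / C(N, n) = (N - n) / (n + 1)
        term *= (N - n) / (n + 1) * lam * z1
        total += term
    return total


if __name__ == "__main__":
    g, z1 = 2.0, 0.3  # hypothetical sample values
    for N in (10, 100, 1000, 10000):
        print(N, dilute_sum(N, g, z1), math.exp(g * z1))
```

The terms are accumulated through their ratio rather than through explicit binomial coefficients, which keeps every intermediate quantity within floating-point range for large `N`.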
The value of $z_2$, which we conjecture to be non-zero, is independent of the triangulation and the manifold $M$. Recall that $N$ denotes the number of tetrahedra of the triangulation of $M$.

In order to construct the perturbative corrections $Z_n$, we will use the path-integral expression for the partition function of Euclidean 3d gravity with a cosmological constant $\Lambda$,

$$Z(M,\Lambda) = \int \mathcal{D}A\, \mathcal{D}B\, \exp\left( i \int_M \left( \mathrm{Tr}(B \wedge F) + \Lambda\, \epsilon_{abc}\, B^a \wedge B^b \wedge B^c \right) \right), \tag{4}$$

where $A$ is an SU(2) principal bundle connection, $F$ is the corresponding curvature 2-form, $B$ is a one-form taking values in the SU(2) Lie algebra and $\epsilon_{abc}$ are the structure constants. This path integral can be defined as a finite spin foam state sum when $\Lambda = 4\pi^2/r^2$, $r \in \mathbb{N}$, see [FK], and in this case it is given by the Turaev-Viro (TV) state sum [TV]. However, if $\Lambda \neq 4\pi^2/r^2$ then it is not obvious how to define $Z$. A natural approach is to use the generating functional technique [FK], and in [HS] the first order perturbation theory spin foam amplitudes were studied for the Ponzano-Regge (PR) model [PR]. However, the problem with the PR model is that it is not finite, so that the state sums $Z_n$ in (1) are not well defined. Since the TV model can be considered as a quantum group regularisation of the PR model, we are going to use the TV model to define the perturbation series (1). Physically this means expanding the path integral (4) by using $\lambda = \Lambda - 4\pi^2/r^2$ as the perturbation theory parameter instead of $\lambda = \Lambda$.

The TV model perturbation series can be constructed by using the PR model perturbation series and then replacing all the weights in the PR amplitudes with the corresponding quantum group spin network evaluations. The calculation of the corresponding state sums is substantially simplified if the Chain-Mail technique is used, see [R,BGM,FMM].

In Sect. 2 we review the PR perturbation theory. In Sect. 3 we review the Chain-Mail technique, while in Sect. 4 we define the perturbative corrections. In Sect. 5 we discuss the dilute gas limit, while in Sect. 6 we present our conclusions.

2. Perturbative Expansion for the PR Model

Given a triangulation $\Delta$ of $M$ with $N$ tetrahedra, let us associate to each edge $\epsilon$ of $\Delta$ a source current $J_\epsilon = J_\epsilon^a T_a$ which belongs to the Lie algebra su(2) with a basis $\{T_a \mid a = 1, 2, 3\}$. One can then write

$$Z = \exp\left( -\lambda \sum_{k=1}^{N} \partial_J^3(\tau_k) \right) Z(J) \Big|_{J=0}, \tag{5}$$

where $\partial_J^3(\tau_k)$ is a differential operator associated with the volume of a tetrahedron $\tau_k$ and $Z(J)$ is the generating functional, given by the Ponzano-Regge state sum with the $D^{(j_\epsilon)}(e^{J_\epsilon})$ insertions at the edges of the $6j$ spin networks, where $D^{(j)}(e^{J})$ is the matrix of the group element $e^{J}$ in the representation of spin $j$, see [FK,HS]. The operator $\partial_J^3$ can be chosen to be

$$\partial_J^3 = \frac{1}{4} \sum_{\epsilon,\mu,\nu} \epsilon^{abc}\, \frac{\partial}{\partial J_\epsilon^a} \frac{\partial}{\partial J_\mu^b} \frac{\partial}{\partial J_\nu^c},$$

where $\epsilon^{abc}$ is a totally antisymmetric tensor and $\epsilon, \mu, \nu$ are tetrahedron edges sharing a common vertex¹. Since

$$\frac{\partial}{\partial J_\epsilon^{a_1}} \cdots \frac{\partial}{\partial J_\epsilon^{a_p}}\, D^{(j_\epsilon)}(e^{J_\epsilon}) \Big|_{J=0} = T^{(j_\epsilon)}_{(a_1} \cdots T^{(j_\epsilon)}_{a_p)},$$

where

$$X_{(a_1 \cdots a_p)} = \frac{1}{p!} \sum_{\sigma \in S_p} X_{a_{\sigma(1)}} \cdots X_{a_{\sigma(p)}},$$

then the result of the action of $\partial_J^3$ on a tetrahedron's vertex will be given by the grasping insertion

$$\sum_{a,b,c} \epsilon^{abc}\, T_a^{(j)} T_b^{(k)} T_c^{(l)},$$

where $j$, $k$ and $l$ are the spins of the three edges sharing the vertex. By using

$$\left( T_a^{(j)} \right)_{\alpha\alpha'} = A_j\, C^{1jj}_{a\alpha\alpha'}, \qquad \epsilon_{abc} = C^{111}_{abc},$$

where $C^{jkl}_{\alpha\beta\gamma}$ is the intertwiner tensor for $\mathrm{Hom}(V_j \otimes V_k, V_l^*)$ ($3j$ symbol) and $A_j$ is a normalisation factor given by

$$A_j^2 = \frac{j(j+1)(2j+1)}{\theta(1,j,j)}, \tag{6}$$

one obtains

$$\sum_{a,b,c} \epsilon^{abc} \left( T_a^{(j)} \right)_{\alpha\alpha'} \left( T_b^{(k)} \right)_{\beta\beta'} \left( T_c^{(l)} \right)_{\gamma\gamma'} = A_j A_k A_l \sum_{a,b,c} C^{111}_{abc}\, C^{1jj}_{a\alpha\alpha'}\, C^{1kk}_{b\beta\beta'}\, C^{1ll}_{c\gamma\gamma'}.$$
This equation implies that the evaluation of a tetrahedral spin network with a grasping insertion is proportional to the evaluation of a spin network based on a tetrahedron graph with an additional trivalent vertex whose edges carry the spin one representations and connect the 3 edges carrying the spins $j$, $k$ and $l$, see Fig. 1. The $Z_n$ which follows from (5) will then be given by a sum of $\frac{(n+N-1)!}{n!\,(N-1)!}$ terms, where each term corresponds to the PR state sum with $n$ graspings distributed among the $N$ tetrahedra. The weight of a tetrahedron with $m$ graspings is given by an analogous evaluation of the SU(2) spin network from Fig. 1 with $m$ insertions.

¹ One can choose a more general expression for $\partial_J^3$, involving all possible triples of the edges, see [HS], but in this paper we will study the simplest possible choice.
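The number of terms quoted above for $Z_n$ is the number of ways of distributing $n$ identical graspings among $N$ tetrahedra. A minimal sketch that enumerates these distributions and compares the count with $(n+N-1)!/(n!(N-1)!)$ follows; the function name `grasping_distributions` is hypothetical and the tetrahedra are simply labelled $0, \dots, N-1$.

```python
from itertools import combinations_with_replacement
from math import comb


def grasping_distributions(n, N):
    """List all ways to distribute n identical graspings among N tetrahedra.

    Each distribution is returned as a tuple of occupation numbers
    (m_0, ..., m_{N-1}) with m_0 + ... + m_{N-1} = n.
    """
    distributions = []
    # A multiset of size n drawn from the N labels fixes one distribution.
    for choice in combinations_with_replacement(range(N), n):
        counts = [0] * N
        for k in choice:
            counts[k] += 1
        distributions.append(tuple(counts))
    return distributions


if __name__ == "__main__":
    n, N = 2, 3
    found = grasping_distributions(n, N)
    print(found)
    # (n + N - 1)! / (n! (N - 1)!) equals the binomial coefficient C(n + N - 1, n)
    print(len(found), comb(n + N - 1, n))
```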
Fig. 1. The evaluation of a tetrahedral spin network with a grasping insertion
In order to make all the PR state sums $Z_n$ finite, we will replace all the SU(2) spin networks associated with a $Z_n$ with the corresponding quantum SU(2) spin networks at a root of unity. In the following sections we will show how to do this.

3. Quantum SU(2) Invariants of Links and Three-Manifolds

We gather some well known facts about quantum SU(2) invariants which we will need in this paper.

3.1. Spin network calculus. Consider an integer parameter $r \geq 3$ (fixed throughout this article), and let $q = e^{i\pi/r}$. Define the quantum dimensions

$$\dim_q j = (-1)^{2j}\, \frac{q^{2j+1} - q^{-2j-1}}{q - q^{-1}},$$

where $j \in \{0, 1/2, \ldots, (r-2)/2\}$. If the edges of a trivalent framed graph $\Gamma$ embedded in $S^3$ are assigned spins $j_1, j_2, \ldots, j_n \in \{0, 1/2, \ldots, (r-2)/2\}$, then we can consider the value $\langle \Gamma; j_1, \ldots, j_n \rangle \in \mathbb{C}$ obtained by using the quantum spin network calculus at $q$; we will use the normalisation of [KL]. We can also consider the case in which the edges of $\Gamma$ are assigned a linear combination of spins, with multilinear dependence on the colourings of each edge of $\Gamma$. A very important linear combination of spins is the "$\Omega$-element" given by:

$$\Omega = \sum_{j=0}^{(r-2)/2} \dim_q(j)\, R_j,$$

where $R_j$ denotes the representation of spin $j$. A sample of the properties satisfied by the $\Omega$-element appears in [R,L,KL]. In Fig. 3 we display a special case of the Lickorish Encircling Lemma, of which we will make explicit use.

Define $\langle \bigcirc_0; \Omega \rangle = \eta$ and $\langle \bigcirc_1; \Omega \rangle = \kappa \sqrt{\eta}$, the evaluations of the 0- and 1-framed unknots coloured with the $\Omega$-element. Therefore we have that

$$\eta = \sum_{j=0}^{(r-2)/2} (\dim_q j)^2 = \frac{r}{2 \sin^2 \frac{\pi}{r}}.$$

On the other hand $\kappa = q^{\frac{-3-r^2}{2}}\, e^{-\frac{i\pi}{4}}$, and $\langle \bigcirc_{-1}; \Omega \rangle = \sqrt{\eta}\, \kappa^{-1}$; see [R].
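The closed form for $\eta$ can be checked numerically from the definition of the quantum dimensions. The following is a minimal sketch; the function names `quantum_dimension` and `eta` are hypothetical, and spins are passed as the integer $2j$ to avoid half-integer arithmetic.

```python
import cmath
import math


def quantum_dimension(two_j, r):
    """dim_q(j) for q = exp(i*pi/r), with the spin given as the integer 2j."""
    q = cmath.exp(1j * math.pi / r)
    n = two_j + 1
    value = (-1) ** two_j * (q ** n - q ** (-n)) / (q - 1 / q)
    # Numerator and denominator are both purely imaginary, so the ratio
    # is real up to rounding error.
    return value.real


def eta(r):
    """Sum of (dim_q j)^2 over j = 0, 1/2, ..., (r - 2)/2."""
    return sum(quantum_dimension(two_j, r) ** 2 for two_j in range(r - 1))


if __name__ == "__main__":
    for r in (3, 4, 5, 10):
        closed_form = r / (2 * math.sin(math.pi / r) ** 2)
        print(r, eta(r), closed_form)
```

For $r = 3$ the only spins are $j = 0$ and $j = 1/2$, with quantum dimensions $1$ and $-1$, so both sides equal $2$.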
3.2. Generalised Heegaard diagrams. Let $M$ be a closed oriented piecewise-linear 3-manifold. Choose a handle decomposition of $M$; see [RS,GS]. Let $H_-$ be the union of the 0- and 1-handles of $M$. Let also $H_+$ be the union of the 2- and 3-handles of $M$. Both $H_-$ and $H_+$ have natural orientations induced by the orientation of $M$. There exist two non-intersecting naturally defined framed links $m$ and $\epsilon$ in $H_-$; see [R]. The second one is given by the attaching regions of the 2-handles of $M$ in $\partial H_- = \partial H_+$, pushed slightly inside $H_-$. On the other hand, $m$ is given by the belt-spheres of the 1-handles of $M$, living in $\partial H_-$. The sets of curves $m$ and $\epsilon$ in $H_-$ have natural framings, parallel to $\partial H_-$. The triple $(H_-, m, \epsilon)$ will be called a generalised Heegaard diagram of the oriented closed 3-manifold $M$.

3.3. The Chain-Mail invariant. We now recall the definition of J. Roberts' Chain-Mail invariant of closed oriented 3-manifolds. This construction will play a fundamental role in this article. Let $M$ be a connected 3-dimensional closed oriented piecewise linear manifold. Consider a generalised Heegaard diagram $(H_-, m, \epsilon)$, associated with a handle decomposition of $M$. Give $H_-$ the orientation induced by the orientation of $M$. Let $\Phi : H_- \to S^3$ be an orientation preserving embedding. Then the image of the links $m$ and $\epsilon$ under $\Phi$ defines a link $\mathrm{CH}(H_-, m, \epsilon, \Phi)$ in $S^3$, called the "Chain-Mail Link". J. Roberts proved that the evaluation $\langle \mathrm{CH}(H_-, m, \epsilon, \Phi); \Omega \rangle$ of the Chain-Mail Link coloured with the $\Omega$-element is independent of the orientation preserving embedding $\Phi : H_- \to S^3$; see [R], Prop. 3.3. The Chain-Mail Invariant of $M$ is defined as:

$$Z_{\mathrm{CH}}(M) = \eta^{-n_0 - n_2} \langle \mathrm{CH}(H_-, m, \epsilon, \Phi); \Omega \rangle,$$

where $n_i$ is the number of $i$-handles of $M$. It is proved in [R] that this Chain-Mail invariant is independent of the chosen handle decomposition of $M$ and that it coincides with the Turaev-Viro invariant $Z_{\mathrm{TV}}(M)$ of $M$; see [TV].

3.4. The Turaev-Viro invariant. Let $M$ be a 3-dimensional closed connected oriented piecewise linear manifold. Consider a piecewise linear triangulation $\Delta$ of $M$. We can consider a handle decomposition of $M$ where each $i$-simplex of $M$ generates a $(3-i)$-handle of $M$; see for example [R]. Applying the Chain-Mail construction to this handle decomposition yields the following combinatorial picture for the calculation of the Chain-Mail invariant $Z_{\mathrm{CH}}(M)$, which, in this form, is called the Turaev-Viro invariant $Z_{\mathrm{TV}}$. A colouring of $M$ is an assignment of a spin $j \in \{0, 1/2, \ldots, (r-2)/2\}$ to each edge of $M$. Each colouring of a simplex $s$ gives rise to a weight $W(s) \in \mathbb{C}$, in the way shown in Fig. 2. Using the identity shown in Fig. 3 together with Fig. 4, it follows that:

$$Z_{\mathrm{CH}}(M) = \sum_{\text{colourings of } M}\ \prod_{\text{simplices } s \text{ of } M} W(s) = Z_{\mathrm{TV}}(M).$$

Note that we apply the Lickorish Encircling Lemma to the 0-framed unknot defined from each face of the triangulation of $M$; see Fig. 5. The last expression for $Z_{\mathrm{CH}}$ is the usual definition of the Turaev-Viro Invariant. For a complete proof of the fact that $Z_{\mathrm{TV}} = Z_{\mathrm{CH}}$, see [R].
Fig. 2. Weights associated with coloured simplices. All spin networks are given the blackboard framing
Fig. 3. Lickorish Encircling Lemma for the case of three strands. All networks are given the blackboard framing
Fig. 4. Local configuration of the Chain-Mail Link at the vicinity of a tetrahedron
Fig. 5. Applying Lickorish Encircling Lemma to the 0-framed link determined by a face of the triangulation of M
3.5. The Witten-Reshetikhin-Turaev invariant. The main references now are [L,RT,R]. Let $M$ be an oriented connected closed 3-manifold. Then $M$ can be presented by surgery on some framed link $L \subset S^3$, up to an orientation preserving diffeomorphism. Any framed graph in $M$ can be pushed away from the areas where the surgery is performed, and therefore any pair $(M, \Gamma)$, where $\Gamma$ is a trivalent framed graph in the oriented closed 3-manifold $M$, can be presented as a pair $(L, \Gamma)$, where $\Gamma$ is a framed trivalent graph in $S^3$, not intersecting $L$. The Witten-Reshetikhin-Turaev Invariant of a pair $(M, \Gamma)$, where the framed graph $\Gamma$ is coloured with the spins $j_1, \ldots, j_n$, is defined as:

$$Z_{\mathrm{WRT}}(M, \Gamma; j_1, \ldots, j_n) = \eta^{-\frac{\#L+1}{2}}\, \kappa^{-\sigma(L)}\, \langle L \cup \Gamma; \Omega, j_1, \ldots, j_n \rangle.$$

Here $\sigma(L)$ is the signature of the linking matrix of the framed link $L$, and $\#L$ is the number of components of $L$. This is an invariant of the pair $(M, \Gamma)$, up to an orientation preserving diffeomorphism. In contrast with the Turaev-Viro invariant, the Witten-Reshetikhin-Turaev invariant is sensitive to the orientation of $M$. If $M$ is an oriented 3-manifold, we represent the manifold with the reverse orientation by $\overline{M}$. With the normalisations that we are using, the Turaev-Walker theorem reads:

$$Z_{\mathrm{TV}}(M) = |Z_{\mathrm{WRT}}(M)|^2,$$

for any closed 3-manifold $M$; see [R,T]. Some other well known properties of the Witten-Reshetikhin-Turaev invariant are the following:

Theorem 3.1. We have:

$$Z_{\mathrm{WRT}}(S^3) = \eta^{-1/2},$$
$$Z_{\mathrm{WRT}}(S^2 \times S^1) = 1,$$
$$Z_{\mathrm{WRT}}(\overline{M}) = \overline{Z_{\mathrm{WRT}}(M)},$$
$$Z_{\mathrm{WRT}}\big( (P, \Gamma) \# (Q, \Gamma') \big) = Z_{\mathrm{WRT}}(P, \Gamma)\, Z_{\mathrm{WRT}}(Q, \Gamma')\, \eta^{\frac{1}{2}}.$$

Here $M$, $P$ and $Q$ are oriented closed connected 3-manifolds. In addition, $\Gamma$ and $\Gamma'$ are coloured graphs embedded in $P$ and $Q$. It is understood that the connected sum $P \# Q$ is performed away from $\Gamma$ and $\Gamma'$.

Given oriented closed connected 3-manifolds $P$ and $Q$, we define $P \#_n Q$ in the following way; see [BGM]. Remove $n$ 3-balls from $P$ and $Q$, and glue the resulting manifolds $P'$ and $Q'$ along their boundary in the obvious way, so that the final result is an oriented manifold. We denote it by:

$$P \#_n Q = P' \bigcup_{\partial P' = \partial Q'} Q'.$$
Fig. 6. The form of the graph G γ = G T ∪ Iv ⊂ ∂T defined from a grasping γ = (v), where v is a vertex of T. Note that Iv is placed inside the dual face to v, and thus it intersects G T in the dual edges of the edges incident to v
It is immediate that $P \#_1 Q = P \# Q$, and that $P \#_n Q$ is diffeomorphic to $(P \# Q) \# (S^1 \times S^2)^{\#(n-1)}$, if $n > 1$. By using Theorem 3.1 it follows that:

$$Z_{\mathrm{WRT}}(P \#_n Q, \Gamma \cup \Gamma'; j_1, \ldots, j_p, i_1, \ldots, i_m) = Z_{\mathrm{WRT}}(P, \Gamma; j_1, \ldots, j_p)\, Z_{\mathrm{WRT}}(Q, \Gamma'; i_1, \ldots, i_m)\, \eta^{\frac{n}{2}}. \tag{7}$$

Here $P$ and $Q$ are closed oriented 3-manifolds. In addition, $\Gamma$ and $\Gamma'$ are graphs in $P$ and $Q$, coloured with the spins $j_1, \ldots, j_p$ and $i_1, \ldots, i_m$, respectively. As before, it is implicit that the multiple connected sum $P \#_n Q$ is performed away from $\Gamma$ and $\Gamma'$.

4. Perturbative Expansion

In this section we are going to define the $Z_n$'s considered in the Introduction.

4.1. Graspings. Let $K$ be a simplicial complex whose geometric realisation $|K|$ is a piecewise-linear closed $p$-dimensional manifold. Recall that we can define the dual cell decomposition of $|K|$, where each $k$-simplex of $K$ generates a dual $(p-k)$-cell of the dual cell decomposition of $|K|$, see [TV, 3.3], for example. This is very easy to visualise in three dimensions.

Definition 4.1 ($n$-grasping). Let $T \subset \mathbb{R}^3$ be the standard tetrahedron. For a positive integer $n \in \mathbb{N}$, an $n$-grasping $\gamma$ is a sequence $(v_1 v_2 \ldots v_n)$, where $v_k$ is a vertex of $T$ for each $k = 1, 2, \ldots, n$.

Any 1-grasping $\gamma = (v)$, where $v$ is a vertex of $T$, naturally defines a trivalent graph $G_\gamma$ on the boundary $\partial T$ of $T$ (usually called a grasping itself), by doing the transition shown in Fig. 6. The graph $G_\gamma$ is therefore the union $G_T \cup I_v$, where $G_T$ is the dual graph to the 1-skeleton of the obvious triangulation of $\partial T$, and $I_v$ is homeomorphic to the graph $Y$ made from a trivalent vertex and three open ends.

We want to define, in an analogous fashion, a trivalent graph $G_\gamma \subset \partial T$ from an $n$-grasping $\gamma = (v_1 v_2 \ldots v_n)$. This is not possible unless further information is given. We want $G_\gamma$ to be the union of $G_T$ and a disjoint union $\bigsqcup_{i=1}^{n} Y_i$, where each graph $Y_i$ is homeomorphic to the graph $Y$. To describe $G_\gamma$ we need to specify where the ends of each $Y_i$ intersect $G_T$, as well as the crossing information. To this end we give the following definition:
Fig. 7. The graph $G(\gamma, O_\gamma)$ defined from a grasping $\gamma = (vvw)$ for two different space orderings of $\gamma$
Definition 4.2 (Space ordering of an $n$-grasping). Let again $T \subset \mathbb{R}^3$ be the standard tetrahedron. Let $G_T$ be the dual graph to the obvious triangulation of the boundary $\partial T$ of $T$. Any edge $e$ of $T$ therefore defines a dual edge $e^*$ of the graph $G_T \subset \partial T$. Let $\gamma = (v_1 \ldots v_n)$ be an $n$-grasping. A space pre-ordering of $\gamma$ is given by an assignment of a subset $O^i = \{x_1^i, x_2^i, x_3^i\} \subset G_T$ to each $i = 1, 2, \ldots, n$ such that:
1. For each $i$, $x_1^i$, $x_2^i$ and $x_3^i$ belong to different edges of $G_T$, and each of these points belongs to the dual edge of an edge incident to $v_i$.
2. $O^i \cap O^j = \emptyset$ if $i \neq j$.
A space ordering $O_\gamma$ of $\gamma$ is given by a space pre-ordering of $\gamma$ considered up to ambient isotopy of $\bigcup_{i=1}^{n} O^i$ inside $G_T$.

There exists therefore a unique space ordering of a 1-grasping $\gamma = (v)$. Let now $\gamma = (v_1 v_2 \ldots v_n)$ be an $n$-grasping with a certain space ordering $O_\gamma = \big\{ \{x_1^i, x_2^i, x_3^i\} \big\}_{i=1}^{n}$. We define an associated graph $G(\gamma, O)$ in $\partial T$, with crossing information, as being:

$$G(\gamma, O) = G_T \cup \bigcup_{i=1}^{n} I_{v_i}^i,$$

where:
1. $I_{v_i}^i$ is homeomorphic to $I_{v_i}$ (see above) for each $i = 1, \ldots, n$.
2. $I_{v_i}^i \cap G_T = \{x_1^i, x_2^i, x_3^i\}$ for each $i = 1, 2, \ldots, n$.
3. If $i < j$ then $I_{v_i}^i$ is placed above $I_{v_j}^j$, with respect to the boundary of $T$.
See Fig. 7 for the description of the graph $G(\gamma, O_\gamma)$ for two different space orderings of $\gamma = (vvw)$, where $v \neq w$.

Given a 1-grasping $\gamma = (v)$ living in the tetrahedron $T$, define $\sigma(v)$ as the set made from the three edges of $T$ incident to $v$. Given an $n$-grasping $\gamma = (v_1 \ldots v_n)$ living in $T$, define $\sigma(\gamma) = \bigcup_{i=1}^{n} \sigma(v_i)$.

4.2. The first-order volume expectation value. Let $M$ be a piecewise linear oriented closed 3-manifold (from now on called simply a 3-manifold). Consider a triangulation $\Delta$ of $M$. Let $M_0$, $M_1$, $M_2$ and $M_3$ be the sets of vertices, edges, triangles and tetrahedra of $M$.
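The purely combinatorial part of Definition 4.1 and of the edge sets $\sigma(\gamma)$ from Subsect. 4.1 can be made concrete in a few lines. The sketch below labels the vertices of the standard tetrahedron by $0, 1, 2, 3$ (a hypothetical labelling) and deliberately ignores the space ordering and crossing information of Definition 4.2; the names `n_graspings` and `sigma` are illustrative only.

```python
from itertools import combinations, product

VERTICES = (0, 1, 2, 3)  # hypothetical labels for the vertices of T
EDGES = [frozenset(pair) for pair in combinations(VERTICES, 2)]  # six edges of T


def n_graspings(n):
    """All n-graspings of T, i.e. all sequences (v_1 ... v_n) of vertices."""
    return list(product(VERTICES, repeat=n))


def sigma(grasping):
    """The set of edges of T incident to at least one vertex of the grasping."""
    return {edge for v in grasping for edge in EDGES if v in edge}


if __name__ == "__main__":
    print(len(n_graspings(1)), len(n_graspings(2)))  # 4 and 16 sequences
    print(sorted(tuple(sorted(e)) for e in sigma((0, 0, 1))))
```

A tetrahedron therefore carries $4^n$ sequences of this kind, and a 1-grasping always has $\sigma(\gamma)$ consisting of exactly three edges.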
A colouring of $M$ is an assignment of a spin $j \in \{0, 1/2, \ldots, (r-2)/2\}$ to each edge of $M$. Each colouring of a simplex $s$ of $M$ gives rise to a weight $W(s)$, in the way shown in Fig. 2, in exactly the same fashion as in the definition of the Turaev-Viro Invariant $Z_{\mathrm{TV}}$.

Consider a tetrahedron $T$ of $M$ (whose edges are coloured), with some $n$-grasping $\gamma = (v_1 \ldots v_n)$, provided with a space ordering $O$. Choose an orientation preserving embedding of $T$ into $S^3$, which is defined up to isotopy. Then the weight $W(T, \gamma, O)$ is defined as the factor $A(T, \gamma)$ (see below) times the evaluation of the spin network $G(\gamma, O) = G_T \cup \bigcup_{i=1}^{n} I_{v_i}^i \subset \partial T \subset S^3$, where $G_T$ has the colouring given by the colouring of $M$ (recall that each edge of $G_T$ is dual to a unique edge of $T$), and all edges of each $I_{v_i}^i$ are assigned the spin 1. Note that the graph $G(\gamma, O) \subset \partial T$ has a natural framing parallel to the boundary of $T$. If $\gamma = (v_1, \ldots, v_n)$ is an $n$-grasping, the factor $A(T, \gamma)$ is, by definition:

$$\prod_{i=1}^{n} A_{l_1^i} A_{l_2^i} A_{l_3^i},$$

where for each $i$ we let $l_1^i$, $l_2^i$ and $l_3^i$ denote the colourings of the three edges of $T$ incident to $v_i$ and the $A_l$'s are given by (6).

The first-order volume expectation value can be represented as

$$\langle V(M) \rangle = \int \mathcal{D}A\, \mathcal{D}B\, V(M)\, e^{i \int_M \mathrm{Tr}(B \wedge F)},$$

where $V(M) = \int_M \epsilon_{abc}\, B^a \wedge B^b \wedge B^c$ is the volume of $M$. It corresponds to $-i Z_1 = i \sum_{k=1}^{N} \partial_J^3(\tau_k)\, Z(J) \big|_{J=0}$, which can be defined as

$$i V_{\mathrm{TV}}(M, \Delta) = \frac{i}{4} \sum_{T \in M_3}\ \sum_{1\text{-graspings } \gamma \text{ of } T} v_{\mathrm{TV}}(M, T, \gamma), \tag{8}$$

where, by definition:

$$v_{\mathrm{TV}}(M, T, \gamma) = \sum_{\text{colourings of } M} W(T, \gamma, O) \prod_{T' \in M_3 \setminus \{T\}} W(T') \prod_{s \in M_0 \cup M_1 \cup M_2} W(s). \tag{9}$$

Recall that any 1-grasping $\gamma$ has a unique space ordering $O$. Let $v(M) = V_{\mathrm{TV}}(M, \Delta)/N$, where $N$ is the number of tetrahedra of $M$.

Theorem 4.3. For any triangulation $\Delta$ of $M$ we have $v(M) = 0$.

Proof. Each term $v_{\mathrm{TV}}(M, T, \gamma)$ can be presented in a Chain-Mail way. As in [R], consider the natural handle decomposition of $M$ for which each $i$-simplex of $M$ generates a $(3-i)$-handle of $M$. This handle decomposition is dual to the one where each $i$-simplex of $M$ is thickened to an $i$-handle of $M$. Let us consider the Chain-Mail formula

$$Z_{\mathrm{CH}}(M) = \eta^{-n_0 - n_2} \langle \mathrm{CH}(H_-, m, \epsilon, \Phi); \Omega \rangle,$$

for $Z_{\mathrm{TV}}(M)$, obtained from this handle decomposition of $M$; see 3.4 and [R]. Here $n_i$ is the number of $i$-handles of $M$, and therefore it equals the number of $(3-i)$-simplices of $M$. From the same argument that shows that $Z_{\mathrm{TV}}(M) = Z_{\mathrm{CH}}(M)$, it follows that:

$$v_{\mathrm{TV}}(M, T, \gamma) = \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-n_0 - n_2}\, \langle \mathrm{CH}(H_-, m, \epsilon_{T,\gamma}, (l_1 l_2 l_3) \cup Y_v, \Phi); \Omega, \Omega, a, b, c, 1 \rangle,$$

where:
1. All components of $m$ are coloured with $\Omega$.
2. Recall that each circle of the link $\epsilon$ corresponds to a certain edge of $M$. The components $l_1$, $l_2$ and $l_3$ of $\epsilon$ which correspond to the edges $e_1$, $e_2$ and $e_3$ incident to $v$, where $\gamma = (v)$, should be coloured with the spins $a, b, c \in \{0, 1/2, \ldots, (r-2)/2\}$, whereas the remaining components (which form the link $\epsilon_{T,\gamma}$) should be coloured with $\Omega$.
3. The component $Y_v$, where $\gamma = (v)$, is a trivalent vertex with three open ends, each of which is incident to either $l_1$, $l_2$ or $l_3$, with no repetitions, with framing parallel to the surface of $H_-$; see Fig. 8. The three edges of $Y_v$ are to be assigned the spin 1.
4. Finally, $\Phi$ is an orientation preserving embedding $H_- \to S^3$. As in the case where no graspings are present, the final result is independent of this choice; see [R, Proof of Prop. 3.3].

Fig. 8. Local configuration of the Chain-Mail Link at the vicinity of a tetrahedron $T$, for the case when $T$ has a grasping $\gamma = (v)$. All strands are coloured with $\Omega$, unless indicated

By cancelling some pairs of 0- and 1-handles, we can reduce the handle decomposition of $M$ to one with a single 0-handle. Similarly, by eliminating pairs of 2- and 3-handles, we can reduce the handle decomposition of $M$ to one having four 3-handles, each of which corresponds to one of the vertices of the triangulation of $M$ which are endpoints of the edges of $T$ incident to $v$, where $\gamma = (v)$, and so that the 2-handles corresponding to the three edges of $T$ incident to $v$ are still in the handle decomposition. The chain-mail link of the new handle decomposition of $M$ will then be $\mathrm{CH}(H_-', m', \epsilon', \Phi')$, where $m'$ is obtained from $m$ by removing some circles, and the same for $\epsilon'$. Let $n_i'$ be the number of $i$-handles of the new handle decomposition of $M$.
Fig. 9. The graph $R_\gamma$ inside $\partial\rho(S_\gamma)$

Let $\epsilon_{T,\gamma}' = \epsilon' \setminus \{l_1, l_2, l_3\}$. By the same argument as in [R, Proof of Theorem 3.4] it follows that:

$$v_{\mathrm{TV}}(M, T, \gamma) = \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-n_0' - n_2'}\, \langle \mathrm{CH}(H_-', m', \epsilon_{T,\gamma}', (l_1 l_2 l_3) \cup Y_v, \Phi'); \Omega, \Omega, a, b, c, 1 \rangle.$$

Given a compact 3-manifold with boundary $Q$ embedded in the oriented 3-manifold $M$, define $M \#_Q \overline{M}$ as the manifold obtained from $M$ and $\overline{M}$ by removing the interior of $Q$ from each of them and gluing the resulting manifolds along the identity map $\partial(M \setminus Q) \to \partial(\overline{M} \setminus Q)$.

Let $S_\gamma$ be the graph in $M$ made from the edges of $T$ incident to $\gamma = (v)$, together with their endpoints. Each edge of $S_\gamma$ will induce a 2-handle of $M$ and its four vertices will induce 3-handles of $M$. The union of these handles will be a regular neighbourhood $\rho(S_\gamma)$ of $S_\gamma$. Consider the graph $R_\gamma$ in $\partial\rho(S_\gamma)$ made from the attaching spheres of these 2-handles, with a Y-graph inserted, as in Fig. 9.

By using either the argument in [BGM, Proof of Theorem 1] or in [FMM, Proof of Lemma 3.3], the pair $\big( (m' \cup \epsilon_{T,\gamma}'), ((l_1 l_2 l_3) \cup Y_v) \big)$ is a surgery presentation of the manifold $M \#_{\rho(S_\gamma)} \overline{M}$, with the graph $R_\gamma \subset \partial\rho(S_\gamma)$ inserted in $\partial(M \setminus \rho(S_\gamma))$. The signature of the link $(m' \cup \epsilon_{T,\gamma}')$ is zero, given that it is a Kirby diagram for the manifold $(M \setminus \rho(S_\gamma)) \times I$; see [R, Proof of Theorem 3.7] or [BGM, Proof of Theorem 1]. Therefore it follows that:

$$v_{\mathrm{TV}}(M, T, \gamma) = \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-\frac{7}{2}}\, Z_{\mathrm{WRT}}(M \#_{\rho(S_\gamma)} \overline{M}, R_\gamma; a, b, c, 1).$$

We have used the calculation $\eta^{-n_0' - n_2' + \frac{1 + n_1' + n_2' - 3}{2}} = \eta^{-\frac{7}{2}}$, which follows from the fact that the Euler characteristic of a closed orientable 3-manifold is zero. Note that $n_0' = 1$ and $n_3' = 4$, by construction.
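The exponent bookkeeping in the last step can be verified mechanically. The sketch below assumes, as the text suggests, that the surgery link consists of the $n_1'$ circles of $m'$ together with the $n_2' - 3$ circles of $\epsilon_{T,\gamma}'$, that its signature vanishes, and that the normalisation $\eta^{-(\#L+1)/2}$ of Subsect. 3.5 is used. The function name `eta_exponent` and its parameter `graph_circles` are hypothetical.

```python
from fractions import Fraction


def eta_exponent(n0, n2, n3, graph_circles):
    """Exponent of eta left over after rewriting the chain-mail evaluation
    as a Witten-Reshetikhin-Turaev invariant.

    n1 is fixed by the vanishing Euler characteristic n0 - n1 + n2 - n3 = 0.
    graph_circles counts the 2-handle circles that belong to the graph
    rather than to the surgery link (three in the proof of Theorem 4.3).
    """
    n1 = n0 + n2 - n3
    link_components = n1 + n2 - graph_circles
    # <L u Gamma> = eta^{(#L + 1)/2} * Z_WRT when the signature is zero
    return Fraction(-n0 - n2) + Fraction(link_components + 1, 2)


if __name__ == "__main__":
    # n0' = 1 and n3' = 4; the result does not depend on n2'
    for n2 in (3, 10, 57):
        print(n2, eta_exponent(1, n2, 4, 3))
```

The printed exponent is $-7/2$ for every admissible $n_2'$, in agreement with the text. The same function with general arguments gives $-n_3/2 - \texttt{graph\_circles}/2$ whenever $n_0 = 1$.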
Fig. 10. The coloured graph $\hat{Y}$
Since $\rho(S_\gamma)$ is a closed 3-ball embedded in $M$ it follows that $M \#_{\rho(S_\gamma)} \overline{M} \cong M \# \overline{M}$. On the other hand, the graph $R_\gamma$ is obviously trivially embedded in $M \# \overline{M}$, in the sense that there exists an embedding $B^3 \to M$ sending the graph $\hat{Y}$ of Fig. 10 to $R_\gamma$. This leads to:

$$\begin{aligned}
v_{\mathrm{TV}}(M, T, \gamma) &= \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-\frac{7}{2}}\, Z_{\mathrm{WRT}}(M \#_{\rho(S_\gamma)} \overline{M}, R_\gamma; a, b, c, 1) \\
&= \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-\frac{7}{2}}\, Z_{\mathrm{WRT}}(M \# \overline{M} \# S^3, \hat{Y}; a, b, c, 1) \\
&= \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-\frac{7}{2}+1}\, Z_{\mathrm{TV}}(M)\, Z_{\mathrm{WRT}}(S^3, \hat{Y}; a, b, c, 1) \\
&= \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \eta^{-3}\, Z_{\mathrm{TV}}(M)\, \langle \hat{Y}; a, b, c, 1 \rangle.
\end{aligned}$$

Since the graph $\hat{Y}$ consists of three tadpole graphs, and the evaluation of a tadpole quantum spin network is zero, see [CFS, Theorem 3.7.1] or [KL, Chap. 9], then $\langle \hat{Y}; a, b, c, 1 \rangle = 0$, which implies $v(M) = 0$.

Given that $Z_1 = -V_{\mathrm{TV}}$, we have $Z_1(M, \Delta) = N z_1 Z_{\mathrm{TV}}(M) = 0$, where

$$z_1 = -\eta^{-3} \sum_{a,b,c} A_a A_b A_c \dim_q(a) \dim_q(b) \dim_q(c)\, \langle \hat{Y}; a, b, c, 1 \rangle = 0,$$

and $N$ is the number of tetrahedra of $M$.

4.3. Higher-order corrections. Let $M$ be a 3-manifold with a triangulation $\Delta$. Let $n$ be a positive integer. An $n$-grasping $G$ of $M$ is a set $T_G = \{T_1^G, \ldots, T_{m_G}^G\}$ of tetrahedra of $M$ (where $T_i^G \neq T_j^G$ if $i \neq j$), each of which is provided with a space ordered $n_i^G$-grasping $(\gamma_i^G, O_i^G)$, where $n_i^G > 0$, such that $n_1^G + n_2^G + \cdots + n_{m_G}^G = n$. The set $T_G$ is said to be an $n$-grasping support and the $n$-grasping $G$ of $M$ is said to be supported in $T_G$.
Recall the definition of the weights W (T, γ , O) ∈ C, where T is a coloured tetrahedra, with a space ordered grasping (γ , O) living in T . This appears in the beginning of Subsect. 4.2, to which we refer for the notation below. Define: 1 (n) VTV (M, ) = n 4 n-graspings G of M
mG
colourings of M
W (TiG , γiG , OiG )
T ∈M3 \TG
i=1
W (T )
W (s), (10)
s∈M0 ∪M1 ∪M2
which can also be written as (n)
VTV (M, ) =
n 1 4n
K =1
K
colourings of M
i=1
n−graspings supports T with K tetrahedra
n−graspings G of M supported in T
W (TiG , γiG , OiG )
W (T )
T ∈M3 \T
W (s). (11)
s∈M0 ∪M1 ∪M2
(n)
Observe that VTV is related to the expectation value of the n th power of the volume V = M abc B a ∧ B b ∧ B c : (n)
V n ≡ Vˆ n Z (J ) = i n n!VTV , 0
where Vˆ = i
N
3 k=1 ∂ J (τk )
is the volume operator. Furthermore
Z=
∞ n n i λ n=0
n!
V n =
∞
(n)
(−1)n λn VTV .
(12)
n=0
Let us analyse whether expressions (10) and (11) define a topological invariant. Consider the bottom term of (11): X (M, , K , G) =
K
colourings of M
i=1
W (TiG , γiG , OiG )
T ∈M3 \TG
W (T )
W (s).
s∈M0 ∪M1 ∪M2
K of M supported in the set with K It depends on an n-grasping G = {TiG , γiG , OiG }i=1 G K tetrahedra TG = {Ti }i=1 . As in the proof of Theorem 4.3, the value of X (M, , K , G) can be presented as the evaluation of a chain-mail link, with some additional 3-valent vertices inserted. Let TG1 be given by
TG1 =
K i=1
σ γiG ,
(13)
760
J. Faria Martins, A. Mikovi´c
see the end of Subsect. 4.1 for this notation. Let also
$$A(G,c) = \prod_{i=1}^{K} A(T_i^G, \gamma_i^G), \tag{14}$$
depending on a colouring $c$ of $M$; see Subsect. 4.2. We have
$$X(M,\Delta,K,G) = \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-n_0-n_2}\, \big\langle CH(m, T_G, L_G, \Phi);\ \Omega, c, 1 \big\rangle, \tag{15}$$
where:
1. All components of the link $m$ are coloured with $\Omega$.
2. The graph $L_G$ is made from the attaching regions of the 2-handles of $M$ which correspond to the edges of the triangulation of $M$ belonging to $T_G^1$, with the obvious colouring, with $n$ Y-graphs inserted in the obvious way, and coloured by the spin-1 representation.
3. The link $T_G$ is formed by the attaching regions of the 2-handles of $M$ corresponding to the remaining edges of $M$. These should be coloured with the $\Omega$-element.
4. We have put $\dim_q(c) = \prod_{e \in T_G^1} \dim_q c(e)$.
5. As usual, $\Phi$ is an embedding $H_- \to S^3$. The final result is independent of this choice.

We can now reduce the handle decomposition of $M$ to one with a unique 0-handle, and so that all $n_3$ 3-handles of it are dual to vertices of $M$ occurring as endpoints of edges in $T_G^1$. Moreover, we can suppose that the 2-handles of $M$ which are dual to the edges of $T_G^1$ are still in the new handle decomposition. Let $\rho(T_G^1)$ be a regular neighbourhood of $T_G^1$ in $M$. Similarly to the $n = 1$ case, the graph $L_G$ naturally projects to a graph $R_G$ in $\partial(\rho(T_G^1))$, with crossing information; see Fig. 11 for an example. By the same argument as in the proof of Theorem 4.3 it follows that:
$$X(M,\Delta,K,G) = \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-\frac{n_3}{2}-\frac{\#T_G^1}{2}}\, Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big). \tag{16}$$
In contrast to the case when $G$ is a 1-grasping, the expression (16) is not a priori a topological invariant of $M$. This is because there can exist several subsets of $M$ that are of the form $\rho(T_G^1)$, for some $n$-grasping $G$ of $M$ with $K$ tetrahedra, if we consider an arbitrary triangulation of $M$; we will go back to this later. This is the reason why a result similar to Theorem 4.3 does not immediately hold for $V_{TV}^{(n)}(M,\Delta)$ for $n > 1$.
Fig. 11. The graph RG in ∂(ρ(TG1 )). Here G is a grasping of M supported in the set TG1 = {T1 , T2 }, with γG1 = (v) and γG2 = (vw), and the space ordering shown. In this example, ρ(TG1 ) is obtained by thickening the solid edges of T1 and T2
Note that Eq. (16) simplifies to
$$X(M,\Delta,K,G) = \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-\frac{n_3}{2}-\frac{\#T_G^1}{2}}\, Z_{WRT}\big(M \# (S^3 \#_{\rho(T_G^1)} \overline{S^3}) \# \overline{M}, R_G\big) \tag{17}$$
$$= \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{1-\frac{n_3}{2}-\frac{\#T_G^1}{2}}\, Z_{TV}(M)\, Z_{WRT}\big(S^3 \#_{\rho(T_G^1)} \overline{S^3}, R_G\big), \tag{18}$$
whenever $T_G^1$ is confined to a closed ball contained in $M$. For fixed $n$, this happens whenever the triangulation of $M$ is fine enough.

5. Dilute Gas Limit

Let $M$ be a 3-manifold. Similarly to [Ba2], to eliminate the triangulation dependence of $V_{TV}^{(n)}(M,\Delta)$, we want to consider the limit
$$\lim_{|\Delta| \to 0} \frac{1}{N^n} V_{TV}^{(n)}(M,\Delta), \tag{19}$$
in a sense that still needs to be addressed. Here $N = N_\Delta$ denotes the number of tetrahedra of a triangulation $\Delta$ of $M$. The case considered in [Ba2] is the limit when the maximal diameter of each tetrahedron of a triangulation of $M$ tends to zero, called there the "Dilute Gas Limit."
5.1. Preliminary approach. Warning. There exists a gap in the argument below; see Assumption 1. It corresponds to Conjecture B on p. 8 of [Ba2]. In Subsect. 5.2 we explain how we can go around it by restricting the class of triangulations with which we work, so that all calculations are valid.

The number of $n$-grasping supports with $K$ tetrahedra in a triangulated manifold with $N$ tetrahedra is given by the number of cardinality $K$ subsets of the set of tetrahedra of $M$, in other words by $\frac{N!}{(N-K)!\,K!}$. On the other hand:
$$\frac{1}{N^n} V_{TV}^{(n)}(M,\Delta) = \frac{1}{4^n N^n} \sum_{K=1}^{n} \ \sum_{\substack{n\text{-grasping supports } \mathcal{T} \\ \text{with } K \text{ tetrahedra}}} \ \sum_{\substack{n\text{-graspings } G \text{ of } M \\ \text{supported in } \mathcal{T}}} X(M,\Delta,K,G), \tag{20}$$
where, according to Eq. (16),
$$X(M,\Delta,K,G) = \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-\frac{n_3}{2}-\frac{\#T_G^1}{2}}\, Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big).$$
Assumption 1. Fix a 3-manifold $M$ and a positive integer $n$. Suppose that there exists a positive constant $C = C(M,n) < +\infty$ for which we have that $|X(M,\Delta,K,G)| \le C$, for any triangulation $\Delta$ of $M$, any $K \in \{1,\dots,n\}$ and any $n$-grasping $G$ of $M$ with $K$ tetrahedra.

The number of $n$-graspings that can be supported in an arbitrary set with $K$ tetrahedra, with $1 \le K \le n$, is certainly bounded by a positive constant $C' < \infty$. As $|\Delta| \to 0$, the number of tetrahedra of $M$ goes to infinity. Therefore if $K < n$ then
$$\frac{1}{N^n} \sum_{\substack{n\text{-grasping supports } \mathcal{T} \\ \text{with } K \text{ tetrahedra}}} \ \sum_{\substack{n\text{-graspings } G \text{ of } M \\ \text{supported in } \mathcal{T}}} X(M,\Delta,K,G) \ \le\ \frac{N!}{N^n\,(N-K)!\,K!}\, C\, C' \to 0$$
if $N \to +\infty$.

Let us now consider $n$-graspings of $M$ living in $n$ tetrahedra. Such a grasping is called separated if the tetrahedra of its support are pairwise disjoint. It is complicated to determine the exact number of separated $n$-graspings with $n$ tetrahedra, because this number depends strongly on the local configuration of the chosen triangulation of $M$.

Restriction 2. Choose a positive integer $D = D(M)$. We consider only triangulations of $M$ such that any tetrahedron of $M$ intersects at most $D$ tetrahedra of $M$.

As we will see below in Subsect. 5.2, there exists a positive integer $D$ for which any 3-manifold $M$ has a triangulation such that any tetrahedron of it intersects at most $D$ tetrahedra, and triangulations like this can be chosen to be arbitrarily fine.
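The vanishing of the contributions with $K < n$ rests only on the growth rate of the binomial coefficient $\frac{N!}{(N-K)!\,K!}$ compared with $N^n$. The following minimal Python sketch tabulates this ratio; the function name `support_fraction` and the sample values of $N$, $K$ and $n$ are illustrative assumptions and do not come from the paper.

```python
from math import comb

def support_fraction(N: int, K: int, n: int) -> float:
    """Number of K-element subsets of N tetrahedra, divided by N**n."""
    return comb(N, K) / N**n

if __name__ == "__main__":
    # Illustrative values only: n = 3 graspings, supports with K = 2 and K = 3 tetrahedra.
    for N in (10, 100, 1000, 10000):
        print(N, support_fraction(N, K=2, n=3), support_fraction(N, K=3, n=3))
```

For $K < n$ the ratio decays like $N^{K-n}$, while for $K = n$ it approaches $1/n!$, which is why only supports with $n$ tetrahedra can survive in the limit.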
Restricting to this type of triangulations, the number of separated $n$-grasping supports is not smaller than $\frac{N(N-D)(N-2D)\cdots(N-nD+D)}{n!}$, whereas the number of $n$-grasping supports with $n$ tetrahedra is $\frac{N(N-1)(N-2)\cdots(N-n+1)}{n!}$. Going back to Eq. (20), the value of
$$\frac{1}{N^n} \sum_{\substack{n\text{-grasping supports } \mathcal{T} \\ \text{with } n \text{ tetrahedra}}} \ \sum_{\substack{n\text{-graspings } G \text{ of } M \\ \text{supported in } \mathcal{T}}} X(M,\Delta,n,G)$$
splits into the contribution of separated and non-separated $n$-graspings with $n$ tetrahedra. Since by Assumption 1 the set of possible values of $X(M,\Delta,n,G)$ is bounded, the contribution of non-separated configurations goes to zero as the number $N$ of tetrahedra of $M$ goes to $+\infty$. Therefore we have:
$$\lim_{N \to \infty} \frac{1}{N^n} V_{TV}^{(n)}(M,\Delta) = \lim_{N \to \infty} \frac{1}{4^n N^n} \sum_{\substack{\text{separated } n\text{-grasping supports } \mathcal{T} \\ \text{with } n \text{ tetrahedra}}} \ \sum_{\substack{n\text{-graspings } G \text{ of } M \\ \text{supported in } \mathcal{T}}} X(M,\Delta,n,G). \tag{21}$$
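The counting behind Eq. (21) can be checked numerically. The sketch below assumes only the two counts quoted in the text: the lower bound $N(N-D)\cdots(N-nD+D)/n!$ for separated supports and the total $N(N-1)\cdots(N-n+1)/n!$. It bounds the share of non-separated supports after division by $N^n$. The function names and the sample values of $N$, $n$ and $D$ are hypothetical.

```python
from math import factorial, prod

def separated_lower_bound(N: int, n: int, D: int) -> float:
    """Lower bound N(N-D)...(N-(n-1)D)/n! on the number of separated supports."""
    return prod(N - j * D for j in range(n)) / factorial(n)

def all_supports(N: int, n: int) -> float:
    """Number N(N-1)...(N-n+1)/n! of n-element supports."""
    return prod(N - j for j in range(n)) / factorial(n)

def non_separated_share(N: int, n: int, D: int) -> float:
    """Upper bound on (number of non-separated supports) / N**n."""
    return (all_supports(N, n) - separated_lower_bound(N, n, D)) / N**n

if __name__ == "__main__":
    # Illustrative values only: n = 3 graspings, overlap constant D = 50.
    for N in (10**3, 10**4, 10**5, 10**6):
        print(N, non_separated_share(N, n=3, D=50))
```

The bound decays like $1/N$, so under Assumption 1 the non-separated configurations do not contribute to the limit.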
Now, the value of $X(M,\Delta,n,G)$ is in fact independent of the chosen separated $n$-grasping $G = \{T_i, \gamma_i\}_{i=1}^{n}$ of $M$, supported in the set $T_G = \{T_i\}_{i=1}^{n}$ of $n$ non-intersecting tetrahedra of $M$; compare with Conjecture A on p. 8 of [Ba2]. Note that space orderings are not relevant in this case. Let us see why it is so. By Eq. (16) it follows that:
$$X(M,\Delta,n,G) = \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-\frac{7}{2}n}\, Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big), \tag{22}$$
since $n_3$ is the number of vertices of $M$ which are endpoints of edges in $T_G^1$, and in this case $T_G^1$ is, topologically, the disjoint union of $n$ Y-graphs, each with a unique trivalent vertex and three open ends (univalent vertices), so that $n_3 = 4n$ and $\#T_G^1 = 3n$. Given that any two embeddings of a disjoint union of $n$ Y-graphs into $M$ are isotopic, it thus follows that the value of $X(M,\Delta,n,G)$ is independent of the chosen separated $n$-grasping $G$ with $n$ tetrahedra. Therefore, by the same argument as in the proof of Theorem 4.3, and by using Eq. (7), it follows that (whenever $G$ is separated)
$$X(M,\Delta,n,G) = \eta^{-3n} \Big( \sum_{a,b,c} \dim_q(a)\,\dim_q(b)\,\dim_q(c)\, A_a A_b A_c\, \big\langle \hat{Y}; a, b, c, 1 \big\rangle \Big)^{n} Z_{TV}(M) = 0.$$
Here $\hat{Y}$ is the graph of Fig. 10.

There are exactly $4^n$ graspings supported on a set of $n$ non-intersecting tetrahedra. On the other hand, the number of separated $n$-grasping supports is certainly between $\frac{N(N-D)(N-2D)\cdots(N-nD+D)}{n!}$ and $\frac{N(N-1)(N-2)\cdots(N-n+1)}{n!}$. Putting everything together it
follows (should Assumption 1 hold true, and restricting to triangulations satisfying Restriction 2) that
$$\frac{1}{N^n} V_{TV}^{(n)}(M,\Delta) \longrightarrow 0,$$
whenever the number $N$ of tetrahedra of a triangulation of $M$ converges to $+\infty$. This finishes a preliminary analysis of the Dilute Gas Limit.

5.1.1. The problem with Assumption 1. Fix a positive integer $n$ and a 3-dimensional manifold $M$. We would like to prove that there exists a positive constant $C = C(M,n) < \infty$ such that $|X(M,\Delta,K,G)| \le C$, for any $n$-grasping $G$ of $M$ with $K$ tetrahedra, where $K = 1, 2, \dots, n$, and an arbitrary triangulation $\Delta$ of $M$. This is a difficult problem. The approach taken in [Ba2] was to conjecture that $X(M,\Delta,K,G)$ can only take a finite number of values, for fixed $M$ and $n$. However, this is very likely to be false in our case. Indeed, as we have seen above,
$$X(M,\Delta,K,G) = \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-\frac{n_3}{2}-\frac{\#T_G^1}{2}}\, Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big),$$
where $G = \{(T_i^G, \gamma_i^G, O_i^G)\}_{i=1}^{K}$, $T_G^1$ is defined in Eq. (13) and $A(G,c)$ is defined in Eq. (14). For fixed $K$ and $n$, where $K$ is large enough, considering the set of all triangulations of $M$, there can be an infinite number of isotopy classes of subsets of $M$ that are of the form $T_G^1$ for some $n$-grasping $G = \{(T_i, \gamma_i, O_i)\}_{i=1}^{K}$ with $K$ tetrahedra.

For example, consider a triangulation of the solid torus with $K$ tetrahedra, with a grasping in each, so that all edges of these $K$ tetrahedra are incident to some grasping. Then embed the solid torus into the manifold $M$ (there exists an infinite number of such embeddings) and extend the triangulation of the solid torus to a triangulation of $M$ (this can always be done). Therefore $X(M,\Delta,K,G)$ can almost certainly take an infinite number of values for fixed $n$ and $M$, from which we can assert that Conjecture A on p. 8 of [Ba2] is probably false in our particular case². This makes it difficult to give an upper bound for $X(M,\Delta,K,G)$. To fix this problem we will alter slightly the way of defining the limit (19), by restricting the class of allowable triangulations.
5.2. Exact calculation. Since they are easier to visualise, we will now switch to cubulations of 3-manifolds. Fix a closed 3-manifold $M$. A cubulation $\square$ of $M$ is a partition of it into 3-cubes, such that if two cubes intersect they do so in a common face, edge or vertex of each. Note that any 3-manifold can be cubulated. Any cubulation $\square$ of $M$ gives rise to a triangulation $\Delta_\square$ of it by taking the cones first of each face and then of each cube of $M$. Given a cubulation $\square$ of $M$ we therefore define
$$V_{TV}^{(n)}(M,\square) = V_{TV}^{(n)}(M,\Delta_\square).$$
Here $n$ is a positive integer.

² The number of topologically distinct possible classes for $(T_G^1, R_G)$ is infinite; however, there still exists the unlikely possibility that $Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big)$ may take only a finite number of values.
Fig. 12. Tri-valent and penta-valent cubulations of the disk D 2
The three dimensional cube can be naturally subdivided into 8 cubes. This will be called the baricentric subdivision. The baricentric subdivision of a cubulated manifold is obtained by doing the baricentric subdivision of each cube of it. Denote the v th baricentric subdivision of M by v . Fix a cubulation of M. We want to consider the limit: 1 (n) lim V (M, v ), (23) v→+∞ Nvn TV where Nv is the number of tetrahedra of the triangulation v of M. We want to use the calculation in 5.1. Therefore Assumption 1 and Restriction 2 need to be addressed. Their validity is, as we have seen, highly dependent on the local combinatorics of the chosen triangulations. Therefore, we make the following restriction on the cubulations with which we work. Definition 5.1 (Acceptable cubulation). The valence of an edge in a cubulated 3-manifold M is given by the number of cubes in which the edge is contained. A cubulation of the 3-manifold M is called acceptable if any edge of M has valence 3, 4 or 5, and the set of edges of order 3 and of order 5 match up to form 1-dimensional disjoint submanifolds 3 and 5 of M. It is proved in [CT] that any closed orientable 3-manifold has an acceptable cubulation. Note that if is acceptable then so is the baricentric subdivision of it; see below. Let us try to visualise an acceptable cubulation of M. The manifolds 3 and 5 are disjoint unions of circles S 1 . Let be a component of 3 . The cubical subcomplex ˆ of M made from the cubes of M which contain some simplex of , together with ˆ is cubulated as the product of the their faces, is diffeomorphic to D 2 × S 1 . Moreover 2 tri-valent cubulation of the disk D , shown in Fig. 12, with some cubulation of S 1 . An analogous picture holds if is a component of 5 , by using the penta-valent cubulation of the disk D 2 shown in Fig. 12. Then what is left of the cubulation of M is locally given by some portion of the natural cubulation of the 3-space (with vertices at Z × Z × Z.) 
From this picture, it is easy to see that if is an acceptable cubulation of M then so is the baricentric subdivision of it. Moreover, we can show that there exists a positive integer D such that, for any 3-manifold with an acceptable cubulation , then any tetrahedron of intersects at most D tetrahedra of . This proves that Restriction 2 will hold if we consider triangulations coming from taking the cone of acceptable cubulations. Looking at Assumption 1, let us now prove that given an acceptable cubulation of M, then X (M, v , K , G)
$$= \sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\,\eta^{-\frac{n_3}{2}-\frac{\#T_G^1}{2}}\, Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big)$$
can only take a finite number of values, for fixed $M$ and $n$. Here $v$ is an arbitrary positive integer, and $G = \{(T_i^G, \gamma_i^G, O_i^G)\}_{i=1}^{K}$ is an $n$-grasping of $M$ with $K$ tetrahedra, thus $K \in \{1, \dots, n\}$. Recall also that $T_G^1$ is given by Eq. (13) and $n_3$ denotes the number of vertices of the graph made out of the edges of $T_G^1$, together with their endpoints. In particular $\eta^{-\frac{n_3}{2}-\frac{\#T_G^1}{2}}$ can only take a finite number of values. On the other hand, the term
$$\sum_{\text{colourings } c \text{ of } T_G^1} A(G,c)\,\dim_q(c)\, Z_{WRT}\big(M \#_{\rho(T_G^1)} \overline{M}, R_G\big)$$
depends only on the isotopy class of the pair $(T_G^1, R_G)$ inside $M$. The following lemma shows that there exists only a finite number of possible isotopy classes of $T_G^1$ in $M$ for a fixed $n$. There exists also a finite (and depending only on $n$) number of possible configurations of the graph $R_G$, wrapping around $T_G^1$.

Lemma 5.2. Let $M$ be a 3-dimensional manifold with an acceptable cubulation $\square$. Let $Q$ be a fixed positive integer. There exists a finite number of possible isotopy classes of graphs in $M$ which can be constructed out of $Q$ edges of the triangulation $\Delta_v$ of $M$, where $v$ is arbitrary.

By using this lemma (proved in Subsect. 5.5), the same argument as in Subsect. 5.1 shows the following theorem:

Theorem 5.3. Let $M$ be an oriented closed 3-manifold. Let $\square$ be an acceptable cubulation of $M$. Let $n$ be a positive integer. If $N_v$ denotes the number of tetrahedra of the triangulation $\Delta_v$, then:
$$\lim_{v \to +\infty} \frac{1}{N_v^n} V_{TV}^{(n)}(M, \Delta_v) = 0.$$
Therefore in the dilute gas limit with $g = \lambda N$:
$$Z(M) = \sum_{n=0}^{\infty} (-1)^n \lambda^n V_{TV}^{(n)}(M) \to \sum_{n=0}^{\infty} \frac{g^n}{n!}\, z_1^n\, Z_{TV}(M) = e^{g z_1} Z_{TV}(M) = Z_{TV}(M),$$
since
$$z_1 = -\eta^{-3} \sum_{a,b,c} \dim_q(a)\,\dim_q(b)\,\dim_q(c)\, A_a A_b A_c\, \big\langle \hat{Y}; a, b, c, 1 \big\rangle = 0,$$
see the discussion in the Introduction.

5.3. Dilute gas limit for $z_1 = 0$. The fact that $z_1 = 0$ implies that the dilute gas limit partition function for $g = \lambda N$ is the same as the unperturbed one. In order to obtain a non-trivial partition function we need to take a different dilute gas limit. Let us now consider configurations where a single tetrahedron of the manifold $M$ contains a space ordered 2-grasping $G$. Let $T \subset S^3$ be the standard tetrahedron. Let $\rho(T)$ be a regular neighbourhood of $T$ in $S^3$. Let also $R_G \subset \partial\rho(T)$ be the associated graph with crossing information in the boundary of $\rho(T)$; see Subsect. 4.3.
Fig. 13. The linear combination of graphs Y2 . The two Y-graphs in the middle are coloured by the spin 1. The indices labeling the bottom Y-graph refer to framing coefficients
Fig. 14. The combination of graphs H2
In the general case when there are $p$ graspings in a single tetrahedron $T$ of $M$ one defines
$$z_p = \frac{1}{(-4)^p} \sum_{\substack{\text{space ordered} \\ p\text{-graspings } G \text{ of } T}} \ \sum_{\text{colourings } c \text{ of } T} A(G,c)\,\dim_q(c)\,\eta^{-4}\, Z_{WRT}\big(S^3 \#_{\rho(T)} \overline{S^3}, R_G\big); \tag{24}$$
see Eq. (18). Note there exists a unique way to embed an oriented tetrahedron in $M$, up to isotopy. From the proof of Theorem 4.3, for $p = 1$ this coincides with the previous definition of $z_1 = 0$. For $p = 2$, this reduces to
$$z_2 = \frac{\eta^{-3}}{4} \sum_{a,b,c} \dim_q(a)\,\dim_q(b)\,\dim_q(c)\, A_a A_b A_c\, \big\langle Y_2; a, b, c, 1 \big\rangle + \frac{3\eta^{-3}}{8} \sum_{a_1,a_2,a_3,a_4,e} \ \prod_{i=1}^{4} A_{a_i} \dim_q(a_i)\, A_e^2\, \dim_q(e)\, \big\langle H_2; a, b, c, d, e, 1 \big\rangle, \tag{25}$$
where $Y_2$ is the linear combination of graphs shown in Fig. 13, while $H_2$ is the linear combination of graphs appearing in Fig. 14. The constants $A_i$ are given by Eq. (6). The
$Y_2$ graphs correspond to the case when both graspings are associated with the same tetrahedron vertex, while the $H_2$ graphs correspond to the case when the graspings connect edges incident to two different tetrahedron vertices.

Let us consider an even order perturbative contribution
$$Z_{2n} = \frac{1}{(2n)!} \Big( \sum_{m=1}^{N} \hat{V}_m \Big)^{2n} Z(J) \Big|_{J=0},$$
where $\hat{V}_m = \partial_J^3(\tau_m)$. We write it as
$$Z_{2n} = \frac{1}{(2n)!} \Big\langle \Big( \sum_{m=1}^{N} V_m \Big)^{2n} \Big\rangle.$$
Then
$$Z_{2n} = \frac{1}{(2n)!} \sum_{p=1}^{2n} \ \sum_{1 \le m_1, \dots, m_p \le N} \ \sum_{k_1, \dots, k_p} \frac{(2n)!}{k_1! \cdots k_p!} \big\langle V_{m_1}^{k_1} \cdots V_{m_p}^{k_p} \big\rangle.$$
Let $N$ be the number of tetrahedra of $M$. When $N \to +\infty$ (in the sense described in Subsect. 5.2), the dominant contribution comes from the graspings supported by $n$ non-intersecting tetrahedra. This contribution arises from the $p = n$ terms in $Z_{2n}$ with $k_1 = k_2 = \cdots = k_n = 2$. By using the same technique as in Subsects. 5.1 and 5.2 we can show that
$$\sum_{1 \le m_1, \dots, m_n \le N} \frac{(2n)!}{2^n} \big\langle V_{m_1}^2 V_{m_2}^2 \cdots V_{m_n}^2 \big\rangle \approx \frac{(2n)!}{2^n}\, C_n^N\, z_2^n\, Z_0 \approx \frac{(2n)!}{2^n}\, \frac{N^n}{n!}\, z_2^n\, Z_0,$$
where $Z_0 = Z_{TV}(M)$. Therefore
$$Z_{2n} = \frac{1}{2^n\, n!}\, N^n z_2^n Z_0 + O(N^{n-1}),$$
as $N \to \infty$. Similarly
$$Z_{2n+1} = -\frac{1}{(2n+1)!} \sum_{p=1}^{2n+1} \ \sum_{1 \le m_1, \dots, m_p \le N} \ \sum_{k_1, \dots, k_p} \frac{(2n+1)!}{k_1! \cdots k_p!} \big\langle V_{m_1}^{k_1} \cdots V_{m_p}^{k_p} \big\rangle,$$
and the dominant contribution for $N$ large comes from the graspings supported by $n$ non-intersecting tetrahedra. This contribution corresponds to $p = n$ terms with $k_1 = \cdots = k_{n-1} = 2$, $k_n = 3$, and it is easy to show that
$$\sum_{1 \le m_1, \dots, m_n \le N} \frac{(2n+1)!}{2^{n-1}\, 3!} \big\langle V_{m_1}^2 \cdots V_{m_{n-1}}^2 V_{m_n}^3 \big\rangle \approx \frac{(2n+1)!}{2^{n-1}\, 3!}\, n\, C_n^N\, z_2^{n-1} z_3\, Z_0 \approx \frac{(2n+1)!}{2^{n-1}\, 3!\, (n-1)!}\, N^n z_2^{n-1} z_3\, Z_0,$$
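where $z_3$ is defined in Eq. (24). Therefore
$$Z_{2n+1} = -\frac{1}{3 \cdot 2^n\, (n-1)!}\, N^n z_2^{n-1} z_3\, Z_0 + O(N^{n-1}),$$
as $N \to \infty$. Let $g = \lambda^2 N$, and $\bar{Z}_N = \sum_{n=0}^{N} \lambda^n Z_n(M,\Delta)$, then
$$\bar{Z}_N \approx \sum_{n=0}^{N/2} \frac{g^n}{n!}\, z_2^n\, Z_0 - \sum_{n=1}^{N/2} \frac{\lambda\, g^n}{3\,(n-1)!}\, z_2^{n-1} z_3\, Z_0 \approx e^{g z_2} Z_0 - \frac{\lambda g}{3}\, z_3\, e^{g z_2} Z_0$$
for $N$ large. By taking the limit $N \to \infty$, $\lambda \to 0$ such that $g = \text{const}$, we obtain
$$\lim_{N \to \infty} \bar{Z}_N(\Delta, M) = e^{g z_2} Z_0(M). \tag{26}$$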
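The resummation step leading to Eq. (26) can be checked numerically. The sketch below sums the two series appearing in the approximation for $\bar{Z}_N$ and compares them with the closed form $e^{g z_2}\,(1 - \lambda g z_3/3)$, with $Z_0$ normalised to 1. The function names and all numerical values of $g$, $\lambda$, $z_2$ and $z_3$ are hypothetical placeholders, since the paper does not evaluate $z_2$ or $z_3$.

```python
from math import exp, factorial

def truncated_series(g: float, lam: float, z2: float, z3: float, terms: int = 60) -> float:
    """Partial sums of the even and odd order series, with Z_0 set to 1."""
    even = sum(g**n * z2**n / factorial(n) for n in range(terms))
    odd = sum(lam * g**n * z2**(n - 1) * z3 / (3 * factorial(n - 1))
              for n in range(1, terms))
    return even - odd

def closed_form(g: float, lam: float, z2: float, z3: float) -> float:
    """exp(g z2) * (1 - lam g z3 / 3), again with Z_0 set to 1."""
    return exp(g * z2) * (1.0 - lam * g * z3 / 3.0)

if __name__ == "__main__":
    # Hypothetical placeholder values, chosen only to exercise the formulas.
    g, lam, z2, z3 = 0.7, 0.01, 1.3, -0.4
    print(truncated_series(g, lam, z2, z3), closed_form(g, lam, z2, z3))
```

As $\lambda \to 0$ at fixed $g$ the correction proportional to $z_3$ drops out, which leaves the exponential factor of Eq. (26).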
One can conjecture that in the case $z_1 = z_2 = \cdots = z_{p-1} = 0$, $z_p \neq 0$, the limit $N \to \infty$, $\lambda \to 0$, such that $\lambda^p N$ is a non-zero constant, gives
$$\lim_{N \to \infty} \bar{Z}_N(\Delta, M) = e^{g z_p} Z_0(M), \tag{27}$$
where $g \propto \lambda^p N$.
5.4. Relating $g$ and $\lambda$. Note that the perturbed partition function (27) is a function of the parameter $g$, which takes values independently from the values of $\lambda$. One would like to find a relation between $\lambda$ and $g$ such that $Z(M,\lambda) = e^{g z_p(r)} Z_{TV}(M,r)$, where we have written explicitly the dependence of $z_p$ and $Z_{TV}$ on the integer $r$. If we assume that the path integral (4) for $\lambda = (2\pi/r)^2$ is equal to $Z_{TV}(M,r)$, it is natural to consider $\bar{\lambda} = \lambda - (2\pi/r)^2$ as the perturbative parameter. Let $g = f(\bar{\lambda})$, then for $\lambda = (2\pi/l)^2$, where $l$ is an integer different from $r$, we will require that
$$Z\big(M, (2\pi/l)^2\big) = e^{f(\bar{\lambda})\, z_p(r)}\, Z_{TV}(M,r) = Z_{TV}(M,l). \tag{28}$$
The relation (28) implies that
$$f\big((2\pi/l)^2 - (2\pi/r)^2\big) = \frac{1}{z_p(r)} \ln \frac{Z_{TV}(M,l)}{Z_{TV}(M,r)}. \tag{29}$$
When $2\pi/\sqrt{\lambda}$ is not an integer, a natural generalization of the relations (28) and (29) is
$$Z(M,\lambda) = Z_{TV}\Big(M, \frac{2\pi}{\sqrt{\lambda}}\Big) \tag{30}$$
and
$$f\big(\lambda - (2\pi/r)^2\big) = \frac{1}{z_p(r)} \ln \frac{Z_{TV}\big(M, \frac{2\pi}{\sqrt{\lambda}}\big)}{Z_{TV}(M,r)}. \tag{31}$$
Although $f$ depends on $z_p$ and the integer $r$, the value for $Z(M,\lambda)$ given by (30) is independent of $z_p$ and $r$.
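A small numerical sketch may help to see why (28) and (29) are consistent: solving (28) for $f$ and substituting back reproduces $Z_{TV}(M,l)$ whatever the value of $z_p(r)$. The function name `coupling_f` and all numbers below are hypothetical placeholders, not invariants of any actual manifold; the sketch also assumes that $z_p(r)$ is non-zero and that the ratio of partition functions is positive, so that the logarithm is real.

```python
from math import exp, log

def coupling_f(z_p_r: float, Z_l: float, Z_r: float) -> float:
    """Eq. (29): f = (1 / z_p(r)) * ln(Z_TV(M, l) / Z_TV(M, r))."""
    return log(Z_l / Z_r) / z_p_r

def perturbed_partition_function(f_value: float, z_p_r: float, Z_r: float) -> float:
    """Middle expression of Eq. (28): exp(f * z_p(r)) * Z_TV(M, r)."""
    return exp(f_value * z_p_r) * Z_r

if __name__ == "__main__":
    # Hypothetical placeholder values for z_p(r), Z_TV(M, r) and Z_TV(M, l).
    z_p_r, Z_r, Z_l = 0.8, 2.5, 4.0
    f_value = coupling_f(z_p_r, Z_l, Z_r)
    print(f_value, perturbed_partition_function(f_value, z_p_r, Z_r))
```

Changing `z_p_r` rescales `f_value` but leaves the reconstructed partition function unchanged, in line with the remark that (30) is independent of $z_p$ and $r$.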
Fig. 15. The cubulations C4 , C3 and C5 of R2
Fig. 16. Baricentrically subdividing the cubulations C3 and C5 of R2 yields C3 and C5
5.5. Proof of Lemma 5.2. Consider the cubulations $C_3$, $C_4$ and $C_5$ of $\mathbb{R}^2$ presented in Fig. 15. These have the property that they are invariant under baricentric subdivision; see Fig. 16. Taking the product with the cubulation of $\mathbb{R}$ with a vertex at each integer yields cubulations $C_3$, $C_4$ and $C_5$ of $\mathbb{R}^3$, which stay stable under baricentric subdivision. These cubulations of $\mathbb{R}^3$ have the property that, given a positive integer $Q$, there exists a finite number of isotopy classes of graphs in $\mathbb{R}^3$ which can be constructed out of $Q$ edges of the triangulations of $\mathbb{R}^3$ constructed by taking the cone of them.

Let $M$ be a 3-dimensional manifold with an acceptable cubulation $\square$. We can cover $M$ with a finite number of cubical subcomplexes, say $\{V_i\}$, each of which is isomorphic to a subcomplex of either $C_3$, $C_4$ or $C_5$. Moreover, we can choose each $V_i$ so that it is diffeomorphic to the 3-ball $D^3$. Suppose that $\Gamma$ is a graph (which we can suppose to be connected) made from $Q$ edges of the triangulation $\Delta_v$ of $M$, where $v$ is arbitrary. By making $v$ big enough, we can suppose that any such graph is contained in $V_i$ for some $i$. This means that $\Gamma$ is isomorphic to a graph with $Q$ edges either in $C_3$, $C_4$ or $C_5$, and there are only a finite number of isotopy classes of these.

6. Conclusions

Note that
$$V_{TV}^{(n)}(M,\Delta) = \frac{1}{4^n} \sum_{K=1}^{n} \sum_{G} n(K,G)\, X(M,\Delta,K,G),$$
where $n(K,G)$ is the number of times the configuration $G$ appears, up to isotopy, among configurations which correspond to $n$-graspings distributed among $K$ tetrahedra. Then
$$V_{TV}^{(n)}(M,\Delta) = \sum_{k=0}^{n} N^{n-k}\, v_{k+1}^{(n)}(M,\Delta),$$
where the coefficients $v_k^{(n)}$ are linear combinations of $X(M,\Delta,K,G)$. In general $v_k^{(n)}$ depends on the triangulation $\Delta$, except for $v_1^{(n)} = \frac{1}{n!} z_1^n Z_{TV}$. For the triangulations coming from the baricentric subdivisions of an acceptable cubulation, there is only a finite number of topologically distinct grasping configurations. Therefore the set of values of the $X$'s is finite and hence bounded, so that the $v_k^{(n)}$ are bounded as $N \to \infty$. In that case the dilute gas configurations which have the dominant contributions have two graspings in a single tetrahedron instead of one, because $z_1 = 0$ and $z_2 \neq 0$. The corresponding difference with respect to the Baez definition of the dilute gas limit is that the effective perturbation parameter $g$ changes from $\lambda N$ to $\lambda^2 N$.

We have not proved that $z_2$, which is given by (25), is different from zero. However, it is very unlikely that $z_2$ vanishes, since the evaluations of $Y_2$ and $H_2$ graphs are not apparently zero.

The result $z_1 = 0$ is a consequence of our choice of the volume operator $\hat{V}$. This is a natural choice since it contains only the non-coplanar triples of the tetrahedron edges. One can also include the coplanar triples, see [HS], and it is possible that in that case $z_1 \neq 0$. This would then give $Z = e^{g z_1} Z_0$ in the usual dilute gas limit.

Note that the proposed value for $Z(M,\lambda)$ when $2\pi/\sqrt{\lambda}$ is not an integer, given by (30), is independent of $z_p$. This means that it is independent of the type of dilute gas limit used. Since the value of $z_p$ depends on the choice of the volume operator, this also means that the value (30) is independent of the choice of the volume operator. Although the value (30) is independent of the triangulation of $M$, it is not a new topological invariant, because it is the same function as $Z_{TV}(M,r)$. The only difference is that $r = 2\pi/\sqrt{\lambda}$ takes a noninteger value.
However, (30) gives a definition of the 3d quantum gravity partition function when the value of the cosmological constant is an arbitrary positive number.

An interesting problem is to develop the PR model perturbation theory without using the quantum group regularization. The recent results on the PR model regularisation by using group integrals, see [BNG], suggest that such a perturbation theory could be developed. The corresponding perturbative series could then be summed by using the dilute gas techniques and the cubulation approach. The obtained result could then be compared to $Z_{TV}(M,r)$.

The techniques developed in this paper can be readily extended to the case of four-dimensional Euclidean Quantum Gravity with a cosmological constant, since then the classical action can be represented as the SO(5) BF theory action plus a perturbation quadratic in the $B$ field, see [M1,M2].

Acknowledgements. We would like to thank Laurent Freidel for helpful comments. J. Faria Martins was supported by the Centro de Matemática da Universidade do Porto www.fc.up.pt/cmup, financed by FCT through the programmes POCTI and POSI, with Portuguese and European Community structural funds, and by the research project POCTI/MAT/60352/2004, also financed by the FCT. A. Miković and J. Faria Martins were partially supported by the FCT grant PTDC/MAT/69635/2006.
References

[Ba1] Baez, J.: An introduction to spin foam models of Quantum Gravity and BF Theory. In: Geometry and Quantum Physics, Gausterer, H., Grosse, H., Pittner, L. (eds.) Lect. Notes Phys. 543, Berlin-Heidelberg-New York: Springer Verlag, 2000, pp. 25–94
[Ba2] Baez, J.: Spin foam perturbation theory. In: Diagrammatic Morphisms and Applications (San Francisco, CA, 2000), Contemp. Math. 318, Providence, RI: Amer. Math. Soc., 2003, pp. 9–21
[BNG] Barrett, J.W., Naish-Guzman, I.: The Ponzano-Regge Model. http://arXiv.org/abs/0803.3319v1 (gr-qc), 2008
[BGM] Barrett, J.W., Garcia-Islas, J.M., Faria Martins, J.: Observables in the Turaev-Viro and Crane-Yetter models. J. Math. Phys. 48(9), 093508 (2007)
[CFS] Carter, J.S., Flath, D.E., Saito, M.: The Classical and Quantum 6j-Symbols. Mathematical Notes 43, Princeton, NJ: Princeton University Press, 1995
[CT] Cooper, D., Thurston, W.P.: Triangulating 3 manifolds using 5 vertex link types. Topology 27(1), 23–25 (1988)
[FMM] Faria Martins, J., Miković, A.: Invariants of spin networks embedded in three-dimensional manifolds. Commun. Math. Phys. 279, 381–399 (2008)
[FK] Freidel, L., Krasnov, K.: Spin foam models and the classical action principle. Adv. Theor. Math. Phys. 2, 1183–1247 (1999)
[GS] Gompf, R.E., Stipsicz, A.I.: 4-Manifolds and Kirby Calculus. Graduate Studies in Mathematics, 20. Providence, RI: Amer. Math. Soc., 1999
[HS] Hackett, J., Speziale, S.: Grasping rules and semiclassical limit of the geometry in the Ponzano-Regge model. Class. Quant. Grav. 24, 1525–1545 (2007)
[KL] Kauffman, L.H., Lins, S.L.: Temperley-Lieb Recoupling Theory and Invariants of 3-Manifolds. Annals of Mathematics Studies, 134. Princeton, NJ: Princeton University Press, 1994
[L] Lickorish, W.B.R.: The skein method for three-manifold invariants. J. Knot Theory Ram. 2(2), 171–194 (1993)
[M1] Miković, A.: Quantum gravity as a deformed topological quantum field theory. J. Phys. Conf. Ser. 33, 266–270 (2006)
[M2] Miković, A.: Quantum gravity as a broken symmetry phase of a BF theory. SIGMA 2, 086 (2006) (5 pages)
[PR] Ponzano, G., Regge, T.: Semiclassical limit of Racah coefficients. In: Spectroscopic and Group Theoretical Methods in Physics, ed. Bloch, F. et al., Amsterdam: North-Holland, 1968
[RT] Reshetikhin, N., Turaev, V.G.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103(3), 547–597 (1991)
[R] Roberts, J.: Skein theory and Turaev-Viro invariants. Topology 34(4), 771–787 (1995)
[RS] Rourke, C.P., Sanderson, B.J.: Introduction to Piecewise-Linear Topology. Reprint. Springer Study Edition. Berlin-New York: Springer-Verlag, 1982
[T] Turaev, V.G.: Quantum Invariants of Knots and 3-Manifolds. de Gruyter Studies in Mathematics, 18. Berlin: Walter de Gruyter & Co., 1994
[TV] Turaev, V.G., Viro, O.Ya.: State sum invariants of 3-manifolds and quantum 6j-symbols. Topology 31(4), 865–902 (1992)
Communicated by A. Connes
Commun. Math. Phys. 288, 773–797 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0746-z
Communications in
Mathematical Physics
Hydrodynamic Limit for a Zero-Range Process in the Sierpinski Gasket Milton Jara FYMA, Université Catholique de Louvain, chemin du Cyclotron 2, B-1348 Louvain-la-Neuve, Belgium. E-mail:
[email protected] Received: 3 May 2008 / Accepted: 13 November 2008 Published online: 26 February 2009 – © Springer-Verlag 2009
Abstract: We consider a system of random walks on graph approximations of the Sierpinski gasket, coupled with a zero-range interaction. We prove that the hydrodynamic limit of this system is given by a nonlinear heat equation on the Sierpinski gasket.

1. Introduction

The Sierpinski gasket is a fractal in $\mathbb{R}^2$ constructed in the following way. Start with an equilateral triangle of side 1. Divide it into 4 equilateral triangles of side 1/2, and remove the central triangle. Repeat this procedure on each of the three remaining triangles. After $n$ steps, we are left with $3^n$ small triangles of side $1/2^n$. Since this sequence of triangles is decreasing, and each element in the sequence is compact, their intersection $K$ is a non-empty, compact subset of $\mathbb{R}^2$, which is known as the Sierpinski gasket. At each step of this construction, consider the boundary of the resulting set as a graph $\Gamma_n$ in $\mathbb{R}^2$, with $3(3^n + 1)/2$ vertices and $3^{n+1}$ bonds.

Consider now a system of particles evolving on this graph. The particles are attempting jumps to neighboring sites at rates that depend only on the number of particles sharing the same site. This is the so-called zero-range process. Such a system has been extensively studied in the usual lattice $\mathbb{Z}^d$ or its periodic version (see [9] and the references therein). The purpose of this article is to study the collective behavior of this system as the graph gets finer and finer, approximating in this way the Sierpinski gasket $K$. More precisely, we are interested in the hydrodynamic limit of the model, that is, the macroscopic evolution of the density of particles. It turns out that the hydrodynamic limit for this model is given by a nonlinear heat equation of the form $\partial_t u = \Delta\phi(u)$, where $\Delta$ is the Laplacian defined in $K$ and $\phi$ depends on the particular form of the interaction between particles.
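The vertex and bond counts quoted for the graph approximations can be reproduced with a short construction. The following Python sketch builds the level-$n$ graph on an integer grid (a sheared copy of the equilateral picture, which does not change the combinatorics) and counts its vertices and bonds. The function name `gasket_graph` and the coordinate convention are illustrative assumptions, not notation from the paper.

```python
def gasket_graph(n: int):
    """Vertices and bonds of the level-n graph approximation of the gasket.

    Each small triangle is stored by its lower-left corner (x, y) on an integer
    grid of side 2**n; its corners are (x, y), (x + 1, y) and (x, y + 1).
    """
    cells = [(0, 0)]
    size = 2**n
    while size > 1:
        size //= 2
        # Keep the three corner sub-triangles and drop the central one.
        cells = [(x + dx, y + dy)
                 for (x, y) in cells
                 for (dx, dy) in ((0, 0), (size, 0), (0, size))]
    vertices, bonds = set(), set()
    for (x, y) in cells:
        corners = [(x, y), (x + 1, y), (x, y + 1)]
        vertices.update(corners)
        for i in range(3):
            bonds.add(frozenset((corners[i], corners[(i + 1) % 3])))
    return vertices, bonds

if __name__ == "__main__":
    for level in range(6):
        vertices, bonds = gasket_graph(level)
        print(level, len(vertices), len(bonds))
```

For each level the printed counts can be compared with $3(3^n + 1)/2$ vertices and $3^{n+1}$ bonds.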
At first sight, this result does not appear surprising, since the hydrodynamic equation for the zero-range process in the usual lattice $\mathbb{Z}^d$ is given by the same nonlinear heat equation. However, the Sierpinski gasket $K$ is a fractal, and therefore it does not have a differentiable structure. For this reason, even the definition of the Laplacian $\Delta$ in $K$ is
subtle. In [1], Barlow and Perkins have defined a Brownian motion in $K$ as the scaling limit of the simple random walks defined in $\Gamma_n$. As a by-product, they have defined the Laplacian $\Delta$ in $K$ as the generator of this Brownian motion. A purely analytical definition can be found in the book by Kigami [6]. One remarkable fact is that the scaling limit is subdiffusive, in the sense that the time scaling factor, which is equal to $5^n$ in this case, is larger than the square $4^n$ of the space scaling factor $2^n$ of the graph. This fact leads to anomalous diffusion properties of the corresponding heat equation, and to the failure of the usual Gaussian estimates on the decay to equilibrium of solutions of the heat equation.

Although very detailed information about the fundamental solutions of $\partial_t u = \Delta u$ in $K$ has been obtained [7], to our knowledge, existence and uniqueness for the Cauchy problem for the hydrodynamic equation $\partial_t u = \Delta\phi(u)$ have not been considered in the literature. We provide in this article the existence and uniqueness results for the Cauchy problem in $K$ that are needed to make sense of the hydrodynamic limit for the zero-range process. Although collateral to our work on the zero-range process, these results could be of independent interest.

From the point of view of interacting particle systems, the zero-range process is an example of a process satisfying the gradient condition. Roughly speaking, the current of particles is the gradient of another local function (the interaction rate, in this case). Therefore, Fick's law is satisfied at a microscopic level. The classical method to deal with the hydrodynamic limit of gradient systems was introduced in [4] and it is based on the so-called one-block and two-blocks estimates. Unfortunately, the two-blocks estimate does not seem to hold for the Sierpinski gasket, at least not in a straightforward way.
In fact, this two-blocks estimate is based on the moving particle lemma, which roughly states that a particle can be moved from one site to another with a diffusive entropy cost. The Laplacian $\Delta$ satisfies a Poincaré inequality in $K$, and therefore, the spectral gap for the associated particle system is expected to be of the right order. However, in the Sierpinski gasket there are hot spots, that is, sites of the graph that have to be visited in order to connect two different regions of the graph. In particular, it is not true that the best strategy in order to transport a particle from one site to another is just to follow the shortest path, in contrast with the situation in the integer lattice.

Probably the simplest method to prove hydrodynamic limits of gradient systems is the $H_{-1}$-norm method, due to Chang and Yau [2] (see [3] for a more readable exposition). The advantage of this method is that it only requires the one-block estimate; its main drawback is that it works only for diffusive systems without a drift term. The idea of this method is very simple: let $u$, $v$ be two solutions of the hydrodynamic equation. Then, the time derivative of $\langle u - v, (-\Delta)^{-1}(u - v) \rangle$ is equal to $-2\langle u - v, \phi(u) - \phi(v) \rangle$ and in particular, this quantity is decreasing. Therefore, if $u_0 = v_0$, then $u_t = v_t$ for any $t > 0$. The idea is to prove that this relation holds also at a microscopic level. However, at the microscopic level and for non-transitive graphs, there is a correction term involving $\Delta G(x,x)$, where $G$ is the Green function associated to $\Delta$. For the Laplacian in subsets of $\mathbb{R}^d$, $G(x,x)$ may not be well defined, but anyway $\Delta G(x,x)$ can be defined, since the singularity of $G(x,x)$ at the diagonal is always of the same magnitude, and cancels, at least in a weak sense, when taking the Laplacian as the limit of averaged differences around $x$. This is not the situation in the Sierpinski gasket $K$.
We will see that, despite the fact that the Green function is continuous in K × K , the function G(x) = G(x, x) satisfies ∆G(x) = +∞, in a sense to be made precise later. This irregularity of the Green function poses an extra difficulty to the proof of the hydrodynamic limit. Usually, the one-block estimate allows us to replace a local function of the number of particles by averages over small boxes, when averaged with respect to a continuous test function. In our case,
we need to average it with respect to discrete approximations of ∆G(x), which we have seen that is not continuous. Therefore, the one-block estimate needs to be proved without averaging with respect to test functions. This has been accomplished only recently [5]. The proof, however, only works in dimension d < 2. Fortunately, the Sierpinski gasket satisfies the following condition: its spectral dimension d S = log 5/ log 3, is bigger than its Hausdorff dimension d H = log 3/ log 2. Under this condition, the so-called local one-block estimate still holds. This paper is organized as follows. In Sect. 2, we define the Laplacian operator in K and we review some aspects of functional analysis on K that will be needed in the sequel. In particular, we define with some detail the Green function G(x, y) associated to the Dirichlet Laplacian in K . Although most of the material of this section has been taken from [6,7], we have decided to include most of the proofs for the convenience of readers interested in interacting particle systems and not familiar with analysis on fractals. In Sect. 3 we define what we mean by a weak solution of the hydrodynamic equation ∂t u = ∆φ(u) and we prove existence and uniqueness of such solutions by considering a finite-difference numerical scheme to approximate those solutions. In Sect. 4 we introduce the zero-range process, the H−1 -norm method and we prove the hydrodynamic limit for the zero-range process, relying on the one-block estimate and suitable properties of the Green function G(x, y). In Sect. 5 we prove the one-block estimate and the lemmas needed for the derivation of the hydrodynamic limit. In the Appendix we study the behavior of ∆n G(x, x), where ∆n corresponds to the discrete approximation of ∆ defined in Γn . The results presented in the Appendix are not needed for the hydrodynamic limit. Our main tool is an iterative formula for ∆n G(x, y), known in the literature as the “near diagonal formula” [8]. 
We have included them here to stress that the example of the Sierpinski gasket is the worst possible case for the derivation of the hydrodynamic limit with the H−1 -norm method, in the following sense. Let Gn (x, y) be the Green function defined in Γn × Γn , associated to ∆n . Only using the relations ∆n Gn (x, y) = 3n δ(x, y) and Gn (x, y) ≥ 0, valid for any finite graph when replacing 3n by the cardinality of the graph, we can prove that ∆n Gn (x, x) ≤ 4·3n , where the number 4 should be replaced by the degree of the graph Γn at x in an arbitrary finite graph. But for the Sierpinski gasket, there exists a positive constant c such that ∆n Gn (x, x) ≥ 3n c for any x and any n big enough. In other words, aside from a multiplicative constant, the function ∆n Gn (x, x) diverges as n → ∞ at the fastest possible rate. By comparison, in the interval [0, 1], G(x, x) = x(1 − x) and therefore ∆n Gn (x, x) is uniformly bounded (in n and x). Since we are dealing with a graph on which the Green function has the worst possible behavior for the method we use, the ideas presented here could be adapted to treat general non-homogeneous graphs, like random graphs and trees, percolation clusters and disordered lattices. We have organized the paper in such a way that Sects. 3 and 4 are independent. Therefore, readers interested in the zero-range process can take for granted the existence and uniqueness results for the hydrodynamic equation and jump directly from Sect. 2 to Sect. 4.
2. The Sierpinski Gasket
Let $a_0 = (0, 0)$, $a_1 = (1/2, \sqrt{3}/2)$, $a_2 = (1, 0)$ be the vertices of an equilateral triangle of unit side in $\mathbb{R}^2$. Define $\varphi_i : \mathbb{R}^2 \to \mathbb{R}^2$ by taking $\varphi_i(x) = (x + a_i)/2$, $i = 0, 1, 2$. The Sierpinski gasket $K$ is defined as the unique non-empty compact subset $K$ of $\mathbb{R}^2$ such that
$$K = \bigcup_{i=0,1,2} \varphi_i(K).$$
A constructive definition of K is the following. Define Vn ⊆ R2 recursively by taking V0 = {a0 , a1 , a2 } and Vn+1 = ∪i ϕi (Vn ). Define V ∗ = ∪n Vn . Since Vn ⊆ Vn+1 , it is not hard to see that K = cls(V ∗ ), the closure of V ∗ under the usual topology of R2 . Consider Vn as the set of vertices of a non-oriented graph Γn = (E n , Vn ), and define inductively the set of bonds of Γn by taking E 0 = {a0 a1 , a1 a2 , a2 a0 } and E n+1 = {ϕi (x)ϕi (y); x y ∈ E n , i = 0, 1, 2}. We say that Γn is the n th discrete approximation of K . For x, y ∈ Vn , we say that x ∼n y if x y ∈ E n . We simply write x ∼ y when there is no risk of confusion. We say that x and y are neighbors in that case.
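The recursive construction of $V_n$ and $E_n$ can be sketched in a few lines. This is only an illustration: the names `phi` and `gasket_graph` are hypothetical, and the corners are written in the lattice basis $(a_2, a_1)$, an affine change of coordinates assumed here so that all vertex coordinates stay rational.

```python
from fractions import Fraction

# Corners a_0, a_1, a_2 in the lattice basis (a_2, a_1); this affine change of
# coordinates (an assumption of the sketch) commutes with the maps phi_i.
CORNERS = [(Fraction(0), Fraction(0)), (Fraction(0), Fraction(1)), (Fraction(1), Fraction(0))]

def phi(i, p):
    """Contraction phi_i(x) = (x + a_i) / 2."""
    a = CORNERS[i]
    return ((p[0] + a[0]) / 2, (p[1] + a[1]) / 2)

def gasket_graph(n):
    """Return (V_n, E_n); each bond is a frozenset holding its two endpoints."""
    vertices = set(CORNERS)
    edges = {frozenset(pair) for pair in
             [(CORNERS[0], CORNERS[1]), (CORNERS[1], CORNERS[2]), (CORNERS[2], CORNERS[0])]}
    for _ in range(n):
        # V_{n+1} is the union of the images phi_i(V_n); same for the bonds.
        vertices = {phi(i, p) for i in range(3) for p in vertices}
        edges = {frozenset(phi(i, p) for p in e) for i in range(3) for e in edges}
    return vertices, edges
```

Counting vertices and bonds for small $n$ gives a quick sanity check of the construction: the three corners should have two neighbors and every other vertex four.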
2.1. The Laplacian operator in K. The material from this section is essentially contained in [6]. For the reader's convenience we have included sketched proofs of the facts about the Laplacian $\Delta$ in $K$ that we will need in the sequel. Let $u : V^* \to \mathbb{R}$ be an arbitrary function. For each $n \ge 0$, we define
$$\mathcal{E}_n(u, u) = (5/3)^n \sum_{x \sim_n y} (u(y) - u(x))^2.$$
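Proposition 1. For each $n \ge 0$ and each $u : V^* \to \mathbb{R}$, $\mathcal{E}_{n+1}(u, u) \ge \mathcal{E}_n(u, u)$. Moreover, given $n \ge 0$, $u : V^* \to \mathbb{R}$, there exists a unique function $\bar u_n : V^* \to \mathbb{R}$ such that $\mathcal{E}_n(u, u) = \mathcal{E}_{n+p}(\bar u_n, \bar u_n)$ for all $p \ge 0$ and $\bar u_n(x) = u(x)$ for every $x \in V_n$.
Proof. For $\alpha = (\alpha_0, \alpha_1, \alpha_2)$, $\beta = (\beta_0, \beta_1, \beta_2)$, define the function
$$I(\alpha, \beta) = \sum_{i \ne j} (\alpha_i - \beta_j)^2 + \frac{1}{2} \sum_{i \ne j} (\beta_i - \beta_j)^2.$$
Let $\alpha \in \mathbb{R}^3$ be fixed. A simple computation shows that
$$\inf_{\beta \in \mathbb{R}^3} I(\alpha, \beta) = \frac{3}{5} \left[ (\alpha_0 - \alpha_1)^2 + (\alpha_1 - \alpha_2)^2 + (\alpha_2 - \alpha_0)^2 \right],$$
and that the infimum is attained at a single point $\beta$ satisfying $\beta_i = (2\sigma - \alpha_i)/5$, where $\sigma = \alpha_0 + \alpha_1 + \alpha_2$. The factor $3/5$ is exactly compensated by the extra factor $5/3$ in $\mathcal{E}_{n+1}$. The proof follows easily from this observation and a chaining argument.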
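The one-step minimization behind Proposition 1 can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and `beta[j]` is taken to be the value at the midpoint opposite the corner $a_j$, so that each corner is joined to the two midpoints with a different index.

```python
from fractions import Fraction

def level_one_energy(alpha, beta):
    """Unnormalised energy on the first subdivision of a triangle.
    alpha[i]: value at corner a_i; beta[j]: value at the midpoint opposite a_j."""
    corner_terms = sum((alpha[i] - beta[j]) ** 2
                       for i in range(3) for j in range(3) if i != j)
    inner_terms = sum((beta[i] - beta[j]) ** 2
                      for i in range(3) for j in range(i + 1, 3))
    return corner_terms + inner_terms

def level_zero_energy(alpha):
    """Energy of the three bonds of the original triangle."""
    return sum((alpha[i] - alpha[j]) ** 2 for i in range(3) for j in range(i + 1, 3))

def harmonic_midpoints(alpha):
    """Candidate minimiser beta_i = (2 sigma - alpha_i) / 5."""
    sigma = sum(alpha)
    return [Fraction(2 * sigma - a, 5) for a in alpha]
```

With these conventions, multiplying the minimal level-one energy by $5/3$ should return the level-zero energy, which is the energy-preservation statement of the proposition for a single cell.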
From the previous result, $\mathcal{E}(u, u) = \lim_n \mathcal{E}_n(u, u)$ is always well defined, although it can be infinite. Observe that $\#V_n = 3(3^n + 1)/2$. In particular, $\lim_n \#V_n / 3^n = 3/2$. This motivates the following definition. For each $n \ge 0$ define the positive measure $\mu_n$ in $K$ by
$$\mu_n(dx) = \frac{1}{3^n} \sum_{x \in V_n} \delta_x(dx),$$
where $\delta_x(dx)$ is the Dirac mass at $x$. It is not hard to see that $\mu_n$ converges in the vague topology to a measure $\mu$ in $K$ which coincides with a constant multiple of the Hausdorff measure in $K$. A simple summation by parts shows that for any $u : V^* \to \mathbb{R}$,
$$\mathcal{E}_n(u, u) = -\int u(x) \Delta_n u(x) \mu_n(dx),$$
where $\Delta_n$ is the discrete Laplacian in $V_n$:
$$\Delta_n u(x) = 5^n \sum_{y \in V_n : y \sim_n x} [u(y) - u(x)].$$
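The summation by parts identity can be verified exactly on a small approximation $\Gamma_n$. The sketch below is illustrative only: the helper names are hypothetical, the corners are written in rational lattice coordinates (an assumption that does not affect the graph structure), and the energy is summed over unordered bonds.

```python
from fractions import Fraction

CORNERS = [(Fraction(0), Fraction(0)), (Fraction(0), Fraction(1)), (Fraction(1), Fraction(0))]

def neighbours(n):
    """Adjacency lists of Gamma_n, built by applying the maps phi_i to the bonds."""
    bonds = [(CORNERS[0], CORNERS[1]), (CORNERS[1], CORNERS[2]), (CORNERS[2], CORNERS[0])]
    for _ in range(n):
        bonds = [tuple(((p[0] + a[0]) / 2, (p[1] + a[1]) / 2) for p in b)
                 for a in CORNERS for b in bonds]
    nbrs = {}
    for p, q in bonds:
        nbrs.setdefault(p, []).append(q)
        nbrs.setdefault(q, []).append(p)
    return nbrs

def energy(n, nbrs, u):
    """E_n(u, u): each bond is seen from both endpoints, hence the factor 1/2."""
    total = sum((u[y] - u[x]) ** 2 for x in nbrs for y in nbrs[x]) / 2
    return Fraction(5, 3) ** n * total

def laplacian(n, nbrs, u, x):
    """Discrete Laplacian Delta_n u(x)."""
    return 5 ** n * sum(u[y] - u[x] for y in nbrs[x])

def minus_u_laplacian_u(n, nbrs, u):
    """Right-hand side: minus the integral of u * Delta_n u against mu_n."""
    return -sum(u[x] * laplacian(n, nbrs, u, x) for x in nbrs) / Fraction(3) ** n
```

Both sides agree for any test function, for instance a polynomial in the vertex coordinates.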
Proposition 2. There exists a universal constant $c > 0$ such that for any $u : V^* \to \mathbb{R}$ with $\mathcal{E}(u, u) < +\infty$ we have
$$\sup_{x, y \in V^*,\ x \ne y} \frac{|u(y) - u(x)|}{|y - x|^{\alpha}} \le c\, \mathcal{E}(u, u)^{1/2},$$
where $\alpha = \log(5/3) / (2 \log 2)$.
Proof. For each $x, y \in V_n$, define
$$R_n(x, y) = \sup_{u : \mathcal{E}_n(u, u) \ne 0} \frac{|u(x) - u(y)|^2}{\mathcal{E}_n(u, u)}.$$
By definition, $|u(x) - u(y)| \le \sqrt{R_n(x, y)}\, \mathcal{E}_n(u, u)^{1/2}$. Notice that for any constants $a > 0$, $b \in \mathbb{R}$ we have $\mathcal{E}_n(au + b, au + b) = a^2 \mathcal{E}_n(u, u)$. Therefore,
$$R_n(x, y)^{-1} = \inf_{u(x) = 0,\ u(y) = 1} \mathcal{E}_n(u, u).$$
Assume that $x \sim_n y$. Considering the function $u(z) = \mathbf{1}(z = y)$, we see that $R_n(x, y)^{-1} \le 4 (5/3)^n$. By definition, $R_n(x, y)^{-1} \ge (5/3)^n$. Therefore, $|u(x) - u(y)| \le (3/5)^{n/2} \mathcal{E}_n(u, u)^{1/2} \le 2^{-\alpha n} \mathcal{E}_n(u, u)^{1/2}$, where $\alpha = \log(5/3) / (2 \log 2)$. Remember that $|x - y| = 2^{-n}$ for $x \sim_n y$. Therefore, we have proved the inequality when $x \sim_n y$ for some $n \ge 0$. Using the triangle inequality, we can extend this relation to arbitrary $x, y \in V^*$.
In particular, the previous proposition tells us that any function $u : V^* \to \mathbb{R}$ satisfying $\mathcal{E}(u, u) < +\infty$ is uniformly continuous. Therefore, $u$ can be continuously extended in a unique way to the set $K$. From now on, we consider $u$ as defined in $K$, and we assume that any $u : K \to \mathbb{R}$ with $\mathcal{E}(u, u) < +\infty$ is continuous. Notice that the points $a_i$, $i = 0, 1, 2$ are different from other points in $V^*$. In fact, the points $a_i$ have only two neighbors in $\Gamma_n$ while other points in $V_n$ have 4 neighbors. It is natural to define $V_0$ as the boundary of $K$. Define $H_1^0(K) = \{u : K \to \mathbb{R};\ \mathcal{E}(u, u) < +\infty,\ u(a_i) = 0\ \forall i\}$. For $u \in H_1^0(K)$, define $\|u\|_1 = \mathcal{E}(u, u)^{1/2}$. Notice that $\mathcal{E}(u, u) = 0$ if and only if $u$ is constant in $K$. Therefore, $\|\cdot\|_1$ is a norm. By Proposition 2, $H_1^0(K)$ is closed under this norm. It is easy to see that $(H_1^0(K), \|\cdot\|_1)$ is a Hilbert space, with inner product given by the polarization identity $\mathcal{E}(u, v) = (\mathcal{E}(u + v, u + v) - \mathcal{E}(u - v, u - v))/4$. Denote by $C_0(K)$ the set of continuous functions $u : K \to \mathbb{R}$ with $u(a_i) = 0$ for every $i$. We define $L^2(\mu)$ as the completion of $C_0(K)$ under the norm
$$\|u\|_0 = \left( \int u(x)^2 \mu(dx) \right)^{1/2}.$$
Proposition 3. The space $H_1^0(K)$ is dense in $L^2(\mu)$.
Proof. Notice that $H_1^0(K) \subseteq C_0(K) \subseteq L^2(\mu)$. It is enough to see that $H_1^0(K)$ is dense in $C_0(K)$. Take $u \in C_0(K)$, and define $\bar u_n$ as in Proposition 1. Notice that given any three numbers $\alpha_0, \alpha_1, \alpha_2$, $\beta_i = (2\sigma - \alpha_i)/5$ is always between the maximum and the minimum of the $\alpha_i$. Therefore, the function $\bar u_n$ satisfies a maximum principle, and for every $x \in V^*$, there is $y \in V_n$ such that
$$|u(x) - \bar u_n(x)| \le |u(x) - u(y)| + \sup_{x', y' \in V_n,\ x' \sim_n y'} |u(x') - u(y')|.$$
Therefore,
$$\sup_{x \in V^*} |u(x) - \bar u_n(x)| \le 2 \sup_{x, y \in V^*,\ |x - y| \le 2^{-n}} |u(x) - u(y)|,$$
which goes to 0 as $n \to \infty$.
Denote by $\langle u, v \rangle$ the inner product in $L^2(\mu)$. Define in an analogous way the spaces $L^2(\mu_n)$ and the inner products $\mathcal{E}_n(u, v)$, $\langle u, v \rangle_n$. Recall the identity
$$\mathcal{E}_n(u, u) = -\langle u, \Delta_n u \rangle_n.$$
By analogy with the case of the real line, we define the Dirichlet Laplacian $\Delta : D(\Delta) \subseteq L^2(\mu) \to L^2(\mu)$ as the unbounded operator given by
i) $D(\Delta) = \{u \in H_1^0(K);\ \exists c > 0 \text{ with } \mathcal{E}(u, v) \le c \|v\|_0\ \forall v \in H_1^0(K)\}$.
ii) For $u \in D(\Delta)$, $\Delta u = h$ if and only if $\mathcal{E}(u, v) = -\langle h, v \rangle$ for every $v \in H_1^0(K)$.
Notice that this definition of the Laplacian in $K$ is not constructive. At this point, even finding a single example of a function $u \in D(\Delta)$ is hard. Moreover, for a generic $u \in C_0(K)$, the approximations $\bar u_n$ of Proposition 1 are not in $D(\Delta)$. For this reason we extend the definition of the Laplacian in the following way. For $u \in L^2(\mu)$, define the dual norm
$$\|u\|_{-1}^2 = \sup_{v \in H_1^0(K)} \{2 \langle u, v \rangle - \mathcal{E}(v, v)\}.$$
By Proposition 2, for any $u \in H_1^0(K)$ we have $\|u\|_\infty \le c \|u\|_1$. Therefore, Friedrichs' inequality $\|u\|_0 \le c \|u\|_1$ holds, and in particular $\|u\|_{-1} < +\infty$ for every $u \in L^2(\mu)$. We denote by $H_{-1}(K)$ the closure of $L^2(\mu)$ under this norm. We extend the definition of $\Delta$ to the operator (still denoted by) $\Delta : H_1^0(K) \to H_{-1}(K)$ such that $\langle \Delta u, v \rangle = -\mathcal{E}(u, v)$ for all $u, v \in H_1^0(K)$. Notice that $\Delta u$ is now well defined as an element of $H_{-1}(K)$ for every $u \in H_1^0(K)$, and $\Delta$ is an isometry from $H_1^0(K)$ to $H_{-1}(K)$. Since $L^2(\mu)$ is dense in $H_{-1}(K)$, we conclude that $D(\Delta)$ is dense in $L^2(\mu)$. For any function $u \in C_0(K)$, we have
$$\Delta \bar u_n = \sum_{x \in V_n} u(x) \delta_x,$$
and therefore we have plenty of examples of functions $u$ for which $\Delta u$ can be evaluated (in $H_{-1}(K)$, of course). Since $H_1^0(K) \subseteq C_0(K)$, the set $M_0(K)$ of Radon measures in $K \setminus V_0$ is contained in $H_{-1}(K)$. We say that $\bar u_n$ is the harmonic continuation of $u|_{V_n}$. Notice that $\bar u_n$ converges to $u$ in $H_1^0(K)$, and therefore $\Delta \bar u_n$ converges to $\Delta u$ in $H_{-1}(K)$.
2.2. The carré du champ. The set $K$ does not admit a differentiable structure, since it is clear that no neighborhood of $K$ is diffeomorphic to an open set of $\mathbb{R}^d$. Therefore, the notion of a gradient $\nabla u$ for functions $u : K \to \mathbb{R}$ seems hopeless. Moreover, since the dimension of $K$ is not an integer, it is not clear how many components $\nabla u$ should have. What is remarkable is that the so-called carré du champ $|\nabla u|^2$ can be defined in a very simple way. We say that a set $T \subseteq K$ is a triangle if $T = K \cap \triangle(x_0, x_1, x_2)$ for some triangle $\triangle(x_0, x_1, x_2)$ with vertices mutually adjacent in $V_n$ for some $n \ge 0$ (that is, $x_i \sim_n x_j$ for $i \ne j$). Equivalently, $T$ is a triangle if $T$ is of the form $\varphi_{i_n} \circ \cdots \circ \varphi_{i_1}(K)$ for some sequence $\{i_1, \ldots, i_n\}$ in $\{0, 1, 2\}$. Fix a function $u \in H_1(K)$. For each triangle $T$, we define
$$\int_T d\mu_{[u,u]} = \lim_{n \to \infty} (5/3)^n \sum_{x, y \in T,\ x \sim_n y} (u(y) - u(x))^2.$$
By the proof of Proposition 1, this sum is increasing in $n$ and therefore the limit always exists. Moreover, the limit is always finite, since it is bounded by $\mathcal{E}(u, u)$. The set of triangles generates the Borel topology in $K$. It is also not hard to check the continuity at vacuum of the set function $\mu_{[u,u]}$. Therefore, we conclude that $\mu_{[u,u]}$ can be extended to a positive, finite measure in $K$. For two given functions $u, v$ in $H_1(K)$, we define the measure $\mu_{[u,v]}$ by polarization:
$$\mu_{[u,v]} = \frac{1}{4} \left( \mu_{[u+v,u+v]} - \mu_{[u-v,u-v]} \right).$$
It has been shown [10] that the measures $\mu_{[u,u]}$ are singular with respect to the Hausdorff measure $\mu$ for any $u \in H_1(K)$. However, it has been shown that these measures are not mutually singular: there exists a measure $\bar\mu$ in $K$ such that for any pair of functions $u, v$ in $H_1(K)$ we have $\mu_{[u,v]} = \Gamma(u, v) \bar\mu$ for some function $\Gamma(u, v)$ in $L^1(\bar\mu)$. In fact, the measure $\bar\mu$ can be chosen to be equal to $\mu_{[h_1,h_1]} + \mu_{[h_2,h_2]}$ for suitable harmonic functions $h_1, h_2$. Therefore, we can define $\nabla u \cdot \nabla v = \Gamma(u, v)$ to get the identity
$$\mathcal{E}(u, v) = \int \nabla u \cdot \nabla v \, d\bar\mu.$$
2.3. Harmonic functions and integration by parts. Normally, a function $h : K \to \mathbb{R}$ is said to be harmonic if $\Delta h = 0$. Notice that the only function $h$ in $H_1^0(K)$ for which $\Delta h = 0$ is $h = 0$, and we have only defined $\Delta h$ for functions in $H_1^0(K)$. Therefore, we need a definition of what we mean by a harmonic function. A function $h : V^* \to \mathbb{R}$ is said to be harmonic if $\Delta_n h(x) = 0$ for every $x \in V_n \setminus V_0$ and every $n \ge 1$. In that case, $\mathcal{E}(h, h) = \mathcal{E}_0(h, h)$ and by Proposition 1, $h$ can be uniquely extended to a continuous function $h : K \to \mathbb{R}$. Notice that $h$ is entirely determined by its values at the boundary $V_0$. We extend the definition of $\Delta$ as follows. For a continuous function $u$ not necessarily in $H_1^0(K)$, we say that $\Delta u = v$ if there exists a harmonic function $h$ such that $u - h \in H_1^0(K)$ and $\Delta(u - h) = v$. In this case we say that $u \in H_1(K)$. For a function $u : V_n \to \mathbb{R}$, we define the discrete Dirichlet Laplacian by $\Delta_n^D u(x) = \Delta_n u(x)$ if $x \notin V_0$ and $\Delta_n^D u(x) = 0$ if $x \in V_0$. Define the normal derivatives $\partial_n^i u$ by
$$\partial_n^i u = (5/3)^n \sum_{y \in V_n,\ y \sim_n a_i} [u(y) - u(a_i)].$$
We have the following (discrete) integration by parts formula:
$$\langle u, \Delta_n^D v \rangle_n = \langle v, \Delta_n^D u \rangle_n + \sum_{i=0,1,2} \left\{ u(a_i) \partial_n^i v - v(a_i) \partial_n^i u \right\}.$$
In order to obtain an analogous formula for the Dirichlet Laplacian, for $u \in H_1^0(K)$ we define $\partial^i u = \langle \Delta u, h_i \rangle$, where $h_i$ is the harmonic function with $h_i(a_j) = \delta_{ij}$. Since $\Delta$ is symmetric, $\langle u, \Delta v \rangle = \langle v, \Delta u \rangle$ for any pair of functions $u, v \in H_1^0(K)$. It is straightforward to check the identity
$$\langle u, \Delta v \rangle - \langle v, \Delta u \rangle = \sum_{i=0,1,2} \left\{ u(a_i) \partial^i v - v(a_i) \partial^i u \right\}$$
for any two functions $u, v \in H_1(K)$.
2.4. The Green function in K. Friedrichs' inequality tells us that the Dirichlet Laplacian has a positive spectral gap in $L^2(\mu)$. In particular, for every $u \in L^2(\mu)$, the equation
$$-\Delta w = u, \qquad w \in H_1^0(K) \qquad (1)$$
has a unique solution. Since the inclusion $H_1^0(K) \subseteq L^2(\mu)$ is compact, we conclude that the operator $(-\Delta)^{-1}$ is compact. We use the $-$ sign to emphasize that $\Delta$ is non-positive. In particular, there exist an orthonormal basis $\{v_i\}_i$ of $L^2(\mu)$ and a non-decreasing sequence $\{\lambda_i\}_i$ of positive numbers such that $-\Delta v_i = \lambda_i v_i$ for any $i$. The function $w$ can be written in terms of the orthonormal basis $\{v_i\}_i$:
$$w = \sum_{i \ge 1} \lambda_i^{-1} \langle u, v_i \rangle v_i.$$
Formally, we can obtain $w(x)$ by an integral formula:
$$w(x) = \int G(x, y) u(y) \mu(dy), \quad \text{where} \quad G(x, y) = \sum_{i \ge 1} \lambda_i^{-1} v_i(x) v_i(y)$$
is the Green function associated to the Dirichlet Laplacian $\Delta$ in $K$. A simple computation shows that the sum defining $G(x, y)$ is convergent in $L^2(\mu \otimes \mu)$ if $\sum_{i \ge 1} \lambda_i^{-2} < +\infty$. We will give a constructive definition of $G(x, y)$ that will allow us to prove finer properties of $G(x, y)$. Take the discrete Laplacian $\Delta_n$ in $V_n$ and define the Green function $G_n(x, y)$ at $x, y \in V_n$ as the solution of
$$\Delta_n G_n(x, y) = \begin{cases} 3^n \delta(x, y), & x \in V_n \setminus V_0 \\ 0, & x \in V_0, \end{cases}$$
where $\delta(x, y) = 1$ if $x = y$ and $\delta(x, y) = 0$ otherwise. Here the operator $\Delta_n$ acts on the first variable $x$. Notice that $G_n(x, y)$ is non-negative and $G_n(x, y) \le G_n(x, x)$ for any $x, y \in V_n$. We extend the definition of $G_n(x, y)$ to $K$ by taking the harmonic continuation of $G_n$ given by Proposition 1. A key observation is that for $y \in V_n$, $G_{n+1}(x, y) = G_n(x, y)$ for any $x \in V_{n+1}$. In fact, for $x \in V_{n+1}$ with $x \ne y$, $\Delta_{n+1} G_n(x, y) = 0$. For $x = y$, a simple computation shows that $G_n$ scales correctly, and therefore $\Delta_{n+1} G_n(y, y) = 3^{n+1}$. The following propositions show that it is sufficient to compute $G_1(x, y)$ to obtain $G_n(x, y)$ for every $n \ge 1$, $x, y \in V^*$:
Proposition 4. For any $i = 0, 1, 2$,
$$G_{n+1}(\varphi_i(x), \varphi_i(y)) = \frac{3}{5} G_n(x, y) + \sum_{j=0,1,2} h_j(y) G_1(\varphi_i(x), \varphi_i(a_j)).$$
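The level independence of the discrete Green functions can be checked on small graphs with exact rational arithmetic. The sketch below rests on assumptions that are not fixed by the text: $G_n(\cdot, y)$ is taken to vanish on $V_0$, and the sign convention is $-\Delta_n G_n(\cdot, y) = 3^n \delta(\cdot, y)$ on $V_n \setminus V_0$, which is the choice that makes $G_n$ non-negative. The function names are hypothetical.

```python
from fractions import Fraction

CORNERS = [(Fraction(0), Fraction(0)), (Fraction(0), Fraction(1)), (Fraction(1), Fraction(0))]

def neighbours(n):
    """Adjacency lists of Gamma_n in rational lattice coordinates."""
    bonds = [(CORNERS[0], CORNERS[1]), (CORNERS[1], CORNERS[2]), (CORNERS[2], CORNERS[0])]
    for _ in range(n):
        bonds = [tuple(((p[0] + a[0]) / 2, (p[1] + a[1]) / 2) for p in b)
                 for a in CORNERS for b in bonds]
    nbrs = {}
    for p, q in bonds:
        nbrs.setdefault(p, []).append(q)
        nbrs.setdefault(q, []).append(p)
    return nbrs

def green_function(n):
    """Solve -Delta_n G = 3^n * Id on V_n minus V_0, with G = 0 on V_0 (assumed)."""
    nbrs = neighbours(n)
    interior = sorted(x for x in nbrs if x not in CORNERS)
    index = {x: k for k, x in enumerate(interior)}
    m = len(interior)
    # Augmented matrix [A | 3^n I], where A is -Delta_n restricted to the interior.
    rows = [[Fraction(0)] * (2 * m) for _ in range(m)]
    for x in interior:
        k = index[x]
        rows[k][k] = Fraction(5 ** n * len(nbrs[x]))
        for y in nbrs[x]:
            if y in index:
                rows[k][index[y]] -= 5 ** n
        rows[k][m + k] = Fraction(3 ** n)
    # Gauss-Jordan elimination; the right-hand block ends up holding G.
    for c in range(m):
        pivot = next(r for r in range(c, m) if rows[r][c] != 0)
        rows[c], rows[pivot] = rows[pivot], rows[c]
        scale = 1 / rows[c][c]
        rows[c] = [v * scale for v in rows[c]]
        for r in range(m):
            if r != c and rows[r][c] != 0:
                factor = rows[r][c]
                rows[r] = [vr - factor * vc for vr, vc in zip(rows[r], rows[c])]
    return {(x, y): rows[index[x]][m + index[y]] for x in interior for y in interior}
```

Under these conventions, the values of `green_function(2)` on pairs of points of $V_1$ should coincide with those of `green_function(1)`, which is the key observation made before Proposition 4.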
The proof is simple; we refer to Sect. A for the argument. As a consequence of this relation and the previous discussion, $G_n(x, y)$ does not really depend on $n$. Therefore, for $x, y \in V^*$ we define $G(x, y) = G_n(x, y)$, where $n$ is such that $x, y \in V_n$. For fixed $y$, we see that $\mathcal{E}(G(\cdot, y), G(\cdot, y)) = G(y, y)$. In particular, $G(\cdot, y)$ is uniformly continuous and can be uniquely extended to $K$. In this way we cannot define $G(x, x)$ for $x \notin V^*$. We would like to prove that in fact $G(x, y)$ is uniformly continuous in $V^* \times V^*$. This is an immediate consequence of the following proposition:
Proposition 5. There exists a constant $c > 0$ such that $G(x, x) \le c$ for every $x \in V^*$.
Proof. Notice that $\sum_i h_i(x) = 1$ for every $x \in K$. In fact, $\sum_i h_i$ corresponds to the harmonic function with $h(a_i) = 1$ for $i = 0, 1, 2$, which is identically constant. Taking $x = y$ in Proposition 4, we see that
$$G(\varphi_i(x), \varphi_i(x)) = \frac{3}{5} G(x, x) + \sum_{j=0,1,2} h_j(x) G(\varphi_i(x), \varphi_i(a_j)).$$
Since every $y \in V_{n+1} \setminus V_n$ is equal to $\varphi_i(x)$ for some $i$ and some $x \in V_n \setminus V_{n-1}$, we see that
$$\sup_{y \in V_{n+1} \setminus V_n} G(y, y) \le \frac{3}{5} \sup_{x \in V_n \setminus V_{n-1}} G(x, x) + \sup_{j=0,1,2}\, \sup_{x \in K} G(x, \varphi_i(a_j)).$$
Setting $\beta_n = \sup_{x \in V_n \setminus V_{n-1}} G(x, x)$, we see that $\beta_{n+1} \le (3/5) \beta_n + c$ for some constant $c$ independent of $n$. A simple computation shows that $\beta_n$ is bounded in $n$.
Remember that for any function $u \in H_1^0(K)$, $\Delta \bar u_n$ converges to $\Delta u$ in $H_{-1}(K)$. Therefore, $\Delta G(x, y) = \delta_y(x)$ and $w(x) = \int G(x, y) u(y) \mu(dy)$ is the solution of Eq. (1), at least for functions $u \in C_0(K)$. By the continuity of $G(x, y)$, we conclude that $\int G(x, y) u(y) \mu(dy)$ solves (1) for $u \in L^2(\mu)$ as well.
3. The Nonlinear Heat Equation in K
Let $\phi : \mathbb{R}_+ \to \mathbb{R}_+$ be a smooth function. We will assume that there exists $\epsilon_0 > 0$ such that $\epsilon_0 \le \phi'(u) \le \epsilon_0^{-1}$ for any $u \in \mathbb{R}_+$. Fix some $T > 0$. We want to study the Cauchy problem
$$\begin{cases} \partial_t u = \Delta \phi(u) \\ u(t, a_i) = \alpha_i, \quad i = 0, 1, 2 \\ u(0, \cdot) = u_0(\cdot). \end{cases} \qquad (2)$$
More precisely, we want to obtain criteria for existence and uniqueness of solutions for this equation. Now we define what we understand by a weak solution of (2). We say that $u : [0, T] \times K \to \mathbb{R}$ is a weak solution of (2) if:
i) For almost every $t \in [0, T]$, $u(t, \cdot) \in H_1(K)$ and
$$\int_0^T \|u(t, \cdot)\|_1^2 \, dt < +\infty.$$
ii) For any function $G : [0, T] \to H_1^0(K)$, pointwise differentiable in $t$ and strongly differentiable in $H_{-1}(K)$ as a function of $[0, T]$,
$$\langle u_T, G_T \rangle - \langle u_0, G_0 \rangle - \int_0^T \left\{ \langle u_t, \partial_t G_t \rangle + \langle \phi(u_t), \Delta G_t \rangle \right\} dt = \sum_{i=0,1,2} \alpha_i \partial^i G_t.$$
We will start by considering a finite-difference scheme that approximates Eq. (2).
3.1. A discrete nonlinear equation. Take a function $u_0^n : V_n \to [0, \infty)$ such that $u_0^n(a_i) = \alpha_i$. Let us define $u^n(t, x) : [0, \infty) \times V_n \to [0, \infty)$ as the solution of the following system of ordinary differential equations:
$$\begin{cases} \frac{d}{dt} u^n(t, x) = \Delta_n \phi(u^n(t, x)) & \text{for } x \in V_n^0 \\ u^n(t, a_i) = \alpha_i & \text{for } i = 0, 1, 2 \\ u^n(0, x) = u_0^n(x). \end{cases}$$
By the maximum principle and Peano's theorem, $u^n(t, x)$ is well defined for any $t > 0$. We will prove existence of solutions for Eq. (2) in a proper sense by taking limits of these approximated solutions $u^n(t, x)$. To avoid overloading the notation, we will take $\alpha_i = 0$. Our arguments work for arbitrary $\alpha_i$ as well: just take into account the boundary terms when performing integrations by parts. Let us define the discrete norms
$$\|u\|_{0,n}^2 = \int u(x)^2 \mu_n(dx) = \frac{1}{3^n} \sum_{x \in V_n} u(x)^2, \qquad \|u\|_{1,n}^2 = \mathcal{E}_n(u, u) = \frac{5^n}{3^n} \sum_{x \sim_n y} (u(y) - u(x))^2.$$
Let us denote the function $u^n(t, \cdot)$ by $u_t^n$. It is not hard to see that $\|u_t^n\|_{0,n}$ is decreasing. In fact,
$$\frac{d}{dt} \|u_t^n\|_{0,n}^2 = 2 \langle u_t^n, \Delta_n \phi(u_t^n) \rangle_n = -2 \mathcal{E}_n(u_t^n, \phi(u_t^n)) \le -2 \epsilon_0 \|u_t^n\|_{1,n}^2.$$
Integrating this inequality between $t = 0$ and $t = T$, we see that
$$\|u_T^n\|_{0,n}^2 + 2 \epsilon_0 \int_0^T \|u_t^n\|_{1,n}^2 \, dt \le \|u_0\|_{0,n}^2. \qquad (3)$$
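The monotonicity of the discrete $L^2$ norm can be observed numerically. The sketch below is only an illustration under several assumptions that are not part of the text: the ODE system is replaced by an explicit Euler scheme with a small time step, the nonlinearity is the hypothetical choice $\phi(u) = u + u/(1+u)$ (so that $1 \le \phi' \le 2$ on $[0, \infty)$), and the boundary values are $\alpha_i = 0$.

```python
def neighbours(n):
    """Adjacency lists of Gamma_n; dyadic coordinates are exact as floats."""
    corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
    bonds = [(corners[0], corners[1]), (corners[1], corners[2]), (corners[2], corners[0])]
    for _ in range(n):
        bonds = [tuple(((p[0] + a[0]) / 2, (p[1] + a[1]) / 2) for p in b)
                 for a in corners for b in bonds]
    nbrs = {}
    for p, q in bonds:
        nbrs.setdefault(p, []).append(q)
        nbrs.setdefault(q, []).append(p)
    return nbrs, set(corners)

def phi(u):
    """Hypothetical nonlinearity with phi(0) = 0 and 1 <= phi'(u) <= 2 for u >= 0."""
    return u + u / (1.0 + u)

def euler_step(n, nbrs, boundary, u, dt):
    """One explicit Euler step of du/dt = Delta_n phi(u), with u fixed on V_0."""
    f = {x: phi(v) for x, v in u.items()}
    new = {}
    for x in u:
        if x in boundary:
            new[x] = u[x]
        else:
            new[x] = u[x] + dt * 5 ** n * sum(f[y] - f[x] for y in nbrs[x])
    return new

def squared_norm(n, u):
    """Discrete norm ||u||_{0,n}^2."""
    return sum(v * v for v in u.values()) / 3 ** n

def run(n=2, dt=1e-3, steps=200):
    nbrs, boundary = neighbours(n)
    u = {x: 0.0 if x in boundary else 1.0 + x[0] for x in nbrs}
    norms = [squared_norm(n, u)]
    for _ in range(steps):
        u = euler_step(n, nbrs, boundary, u, dt)
        norms.append(squared_norm(n, u))
    return u, norms
```

The time step matters here: the explicit scheme is only expected to reproduce the monotonicity when `dt` is small compared with $5^{-n}$, which is why the default is `dt=1e-3` for `n=2`.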
Let us fix some reference time $T > 0$. For a function $u : [0, T] \times K \to \mathbb{R}$, define
$$|||u|||_1^2 = \int_0^T \|u_t\|_1^2 \, dt,$$
and denote by $H_{1,T}^0(K)$ the Hilbert space obtained as the closure of the space $C([0, T], H_1^0(K))$ with respect to this norm, where $C([0, T], H_1^0(K))$ denotes the space of continuous paths in $H_1^0(K)$. Fix a continuous function $u_0 : K \to [0, \infty)$ with $u_0(a_i) = 0$ for $i = 0, 1, 2$. Consider $\bar u_t^n$, the harmonic continuation of $u_t^n$ into $K$. By (3), we have
$$\sup_n |||\bar u_t^n|||_1^2 \le \frac{\|u_0\|_\infty}{2 \epsilon_0}.$$
In particular, there is a subsequence $n'$ such that $u_t^{n'}$ converges to some function $u_t \in H_{1,T}(K)$, weakly with respect to the topology of $H_{1,T}(K)$.
Theorem 1. The function $u_t \in H_{1,T}(K)$ is a weak solution of (2).
Proof. Taking a second subsequence if necessary, we can assume that $\overline{\phi(u_t^n)}$ converges to some function $w_t$ as well, where $\overline{\phi(u_t^n)}$ is the harmonic continuation of $\phi(u_t^n)$. At this point, we need to justify the identity $w_t = \phi(u_t)$. For the usual Laplacian, defined on a bounded, open set $U \subseteq \mathbb{R}^d$, the argument is the following. If $u \in H_1^0(U)$, that is, if $\langle u, -\Delta u \rangle < +\infty$, then there exists a unique function $\nabla u : U \to \mathbb{R}^d$ such that $\langle u, \Delta G \rangle = -\langle \nabla u, \nabla G \rangle$ for any smooth function $G$. Since $\phi(u)$ is a smooth function of $u$, $\phi(u)$ also belongs to $H_1^0(U)$, and moreover $\nabla \phi(u) = \phi'(u) \nabla u$. The uniqueness of the weak gradient $\nabla u$ would allow us to conclude that $w_t = \phi(u_t)$. Recall the definition of the carré du champ $|\nabla u|^2$. A simple Taylor expansion shows that
$$\int_T d\mu_{[\phi(u),\phi(u)]} = \int_T \phi'(u) \, d\mu_{[u,u]},$$
as expected. Therefore, we can appeal to the uniqueness of the representation $\mu_{[u,u]} = \Gamma(u, u) \bar\mu$ to conclude that $w_t = \phi(u_t)$. Take a function $G_t \in H_{1,T}(K)$. Assume that $G_t$ is of class $C^1$ in time. By hypothesis,
$$\int_0^T \int_K \overline{\phi(u_t^n)}(x) \Delta G_t(x) \mu(dx) \, dt \xrightarrow{n \to \infty} \int_0^T \int_K \phi(u_t(x)) \Delta G_t(x) \mu(dx) \, dt.$$
Performing an integration by parts, we see that the left-hand side of the previous expression is equal to
$$\int_0^T \int_K G_t(x) \Delta \overline{\phi(u_t^n)}(x) \mu(dx) \, dt = \int_0^T \left\langle \frac{d}{dt} u_t^n, G_t \right\rangle_n dt = \langle u_T^n, G_T \rangle_n - \langle u_0, G_0 \rangle_n - \int_0^T \langle u_t^n, \partial_t G_t \rangle_n \, dt.$$
Remember that $\|u_T^n\|_{0,n}$ is also bounded by $\|u_0\|_\infty$. In particular, choosing a further subsequence if necessary, we can assume that $u_T^n$ converges weakly to $u_T$ in $L^2(\mu)$. Therefore, $\langle u_T^n, G_T \rangle_n$ converges to $\langle u_T, G_T \rangle$. By Friedrichs' inequality, weak convergence in $H_{1,T}(K)$ is stronger than weak convergence in $L^2(\mu(dx) \times dt)$. Therefore, we can pass to the limit in each of the terms on the right-hand side of the previous expression. We have therefore proved that
$$\langle u_T, G_T \rangle - \langle u_0, G_0 \rangle - \int_0^T \left\{ \langle u_t, \partial_t G_t \rangle + \langle \phi(u_t), \Delta G_t \rangle \right\} dt = 0 \qquad (4)$$
for any function $G_t$ smooth enough, which proves the theorem.
Theorem 2. Equation (2) has at most one weak solution.
Proof. The heuristic argument is very simple. Take two weak solutions $u_t$, $v_t$ of (2). Then,
$$\frac{d}{dt} \|u_t - v_t\|_{-1}^2 = 2 \langle (-\Delta)^{-1}(u_t - v_t), \Delta(\phi(u_t) - \phi(v_t)) \rangle = -2 \langle u_t - v_t, \phi(u_t) - \phi(v_t) \rangle \le 0. \qquad (5)$$
Therefore, if $u_0 = v_0$, we conclude that $u_t = v_t$ for any $t > 0$. Of course this heuristic computation needs to be justified. By (4), we have
$$\langle u_T - v_T, G_T \rangle = \int_0^T \left\{ \langle u_t - v_t, \partial_t G_t \rangle + \langle \phi(u_t) - \phi(v_t), \Delta G_t \rangle \right\} dt.$$
Let us take $G_t = (-\Delta)^{-1}(u_t - v_t)$ in the previous expression. That is, define
$$G_t(x) = \int_K G(x, y) (u_t(y) - v_t(y)) \mu(dy).$$
Putting this into the previous formula, we obtain immediately
$$\|u_T - v_T\|_{-1}^2 = -2 \int_0^T \langle u_t - v_t, \phi(u_t) - \phi(v_t) \rangle \, dt,$$
which is just the integral version of (5). But we still need to justify that $G_t$ can be taken as a test function. Since $u_t$ and $v_t$ are in $H_1(K)$ for almost every $t \in [0, T]$, for fixed $t$ the function $G_t$ is regular enough. The problem is that $G_t$ may not be differentiable in $t$. This problem is easily solved by taking an approximation of the identity $\gamma_\delta(t)$ with support in $[0, \delta]$ and defining
$$G_t^\delta = \int_0^\delta \gamma_\delta(s) G_{t+s} \, ds.$$
Now $G_t^\delta$ is differentiable, so it is an admissible test function. Taking $\delta \to 0$ we obtain the desired result.
4. Hydrodynamic Limit for the Zero-Range Process
4.1. The zero-range process. Let $g : \mathbb{N}_0 = \{0, 1, \ldots\} \to [0, \infty)$ be a function with $g(0) = 0$. The zero-range process in $V_n$ with interaction rate $g(\cdot)$ is defined as the continuous-time Markov chain $\xi_t$ in $\Omega^n = \mathbb{N}_0^{V_n}$ generated by the operator
$$L_{zr}^b f(\xi) = \sum_{x \in V_n} \sum_{y \sim_n x} g(\xi(x)) \left[ f(\xi^{x,y}) - f(\xi) \right],$$
where ξ is a generic element of Ω n , f : Ω n → R and ξ x,y is given by ⎧ ⎪ ⎨ξ(x) − 1, z = x x,y ξ (z) = ξ(y) + 1, z = y ⎪ ⎩ξ(z), z = x, y. Notice that the number of particles in this process is preserved by the dynamics. Therefore, for any fixed initial configuration, the state space is finite, and the previous process is well defined. This process has a family of invariant measures which we describe as follows. Define g(n)! = g(1) · · · g(n), g(0)! = 0. Assume that −1 φ ∗ = lim sup n g(n)! n→∞
is non-zero. This is the case if, for example, $\inf_{n \ge n_0} g(n) > 0$ for some $n_0$. For any $\phi < \phi^*$, define the uniform product measure $\bar\nu_\phi$ in $\Omega^n$ by
$$\bar\nu_\phi\{\xi;\ \xi(x) = k\} = \frac{1}{Z(\phi)} \frac{\phi^k}{g(k)!},$$
where $Z(\phi)$ is the normalization constant. Notice that, since $\phi < \phi^*$, the normalization constant $Z(\phi)$ is finite. It is not hard to see that the measure $\bar\nu_\phi$ is invariant under the evolution of $\xi_t$. Observe that $\phi = \int g(\xi(x)) \bar\nu_\phi(d\xi)$. Define the number of particles per site by $\rho(\phi) = \int \xi(x) \bar\nu_\phi(d\xi)$. Notice that the map $\phi \mapsto \rho(\phi)$ is strictly increasing, with $\rho(0) = 0$. Therefore, $\phi \mapsto \rho(\phi)$ is a bijection between $[0, \phi^*)$ and $[0, \rho^*)$, where
$$\rho^* = \lim_{\phi \uparrow \phi^*} \rho(\phi).$$
Denote by $\rho \mapsto \phi(\rho)$ the inverse mapping of $\rho(\phi)$. Since the number of particles per site is a more natural quantity than $\phi$, we define $\nu_\rho = \bar\nu_{\phi(\rho)}$ for $\rho \in [0, \rho^*)$. Now we will introduce a Dirichlet-type boundary condition into this process. Fix some numbers $\alpha_i \in [0, \rho^*)$, $i = 0, 1, 2$. Define the boundary operators by
$$L_{zr}^i f(\xi) = \sum_{y \sim_n a_i} \left\{ \phi(\alpha_i) \left[ f(\xi + \delta_y) - f(\xi) \right] + g(\xi(y)) \left[ f(\xi - \delta_y) - f(\xi) \right] \right\},$$
where the configurations $\xi \pm \delta_y$ are given by
$$(\xi \pm \delta_y)(z) = \begin{cases} \xi(z) \pm 1, & z = y \\ \xi(z), & z \ne y. \end{cases}$$
The zero-range process in $V_n$ with boundary conditions $\{\alpha_i\}_i$ is then defined as the continuous-time Markov process $\xi_t$ in $\Omega_n = \mathbb{N}_0^{V_n^0}$ generated by the operator
$$L_{zr} f(\xi) = \sum_{x \in V_n^0} \sum_{y \in V_n^0,\ y \sim_n x} g(\xi(x)) \left[ f(\xi^{x,y}) - f(\xi) \right] + \sum_{i=0,1,2} L_{zr}^i f(\xi).$$
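The one-site marginals of the product measures $\bar\nu_\phi$ introduced above are easy to tabulate numerically. The following sketch is an illustration only: the function names are hypothetical, the sum defining $Z(\phi)$ is truncated at `kmax`, and the empty-product convention $g(0)! = 1$ is assumed so that the weight of the empty site is well defined.

```python
import math

def marginal(g, phi, kmax=400):
    """Truncated one-site marginal k -> phi^k / (Z(phi) g(k)!), for k = 0..kmax.
    Assumes the empty-product convention g(0)! = 1."""
    weights = [1.0]
    for k in range(1, kmax + 1):
        # phi^k / g(k)! obtained recursively from the previous weight
        weights.append(weights[-1] * phi / g(k))
    z = sum(weights)
    return [w / z for w in weights]

def mean_rate(g, phi):
    """Expectation of g(xi(x)) under the marginal; should be close to phi."""
    return sum(p * g(k) for k, p in enumerate(marginal(g, phi)))

def density(g, phi):
    """Number of particles per site rho(phi)."""
    return sum(p * k for k, p in enumerate(marginal(g, phi)))

def independent_walkers(k):
    """g(k) = k: the marginal is Poisson with mean phi."""
    return float(k)

def queue(k):
    """g(k) = 1 for k >= 1: the marginal is geometric, and phi* = 1."""
    return 1.0 if k >= 1 else 0.0
```

For $g(k) = k$ one expects $\rho(\phi) = \phi$, and for $g(k) = \mathbf{1}(k \ge 1)$ one expects $\rho(\phi) = \phi/(1 - \phi)$ when $\phi < 1$; in both cases the mean of $g(\xi(x))$ returns $\phi$ up to truncation error.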
The difference between this process and the zero-range process defined previously is easy to understand. Inside Vn (that is, in Vn0 ), the dynamics is the same. Particles are coming from the boundary sites ai with intensity φ(αi ), which corresponds to have a density of particles αi at ai . Particles are also annihilated when they jump into the boundary sites ai , in order to keep the density of particles at ai fixed. Notice now that the number of particles is no longer fixed, since particles are coming in and out at the boundary sites. In order to have a process with amenable properties, we will impose some technical conditions on the interaction rate g(·) (see [9]). We say that g(·) satisfies (SG) condition if i) supk |g(k + 1) − g(k)| < +∞. ii) There exist k0 > 0 and a0 > 0 such that g(k + l) − g(k) ≥ a0 for any l > k0 .
Notice that in this case $\rho^* = +\infty$. This condition guarantees the existence of exponential moments for the occupation variables $\xi(x)$ under the invariant measures $\nu_\rho$. We say that $g(\cdot)$ satisfies (C) condition if $g(k + 1) \ge g(k)$ for any $k$. In this case, there exists a constant $\theta_0 > 0$ such that $\int \exp\{\theta_0 \xi(x)\} \, d\nu_\rho < +\infty$ for any $\rho \le \rho^*$. We will assume throughout this article that $g(\cdot)$ satisfies (SG) or (C). Condition (SG) guarantees the existence of a uniform spectral gap, which states that the magnitude of the first non-null eigenvalue of $L_{zr}^b$ with respect to $\nu_\rho$ is bounded below by a constant that does not depend on the density $\rho$. We expect this constant to be of order $5^{-n}$. In this context, at least in an obvious way, the moving particle lemma does not hold, preventing us from obtaining such a bound. Notice, however, that a simple computation shows that there exists a constant $c_0$, independent of $n$ and $\rho$, such that the spectral gap with respect to $\nu_\rho$ is bounded below by $c_0 6^{-n}$. Refined versions of the moving particle lemma can be used to improve the order of magnitude from $6^{-n}$ to $\nu^{-n}$ for some constant $5 < \nu < 6$, but it is our belief that the geometric constraints imposed on the particle motion are still not understood in this context. An interesting open problem consists in obtaining a bound for the spectral gap of the form $c 5^{-n}$ for a constant $c$ independent (at least) of $n$ and $\rho$ (under Condition (SG)). Condition (C) implies that the zero-range process is attractive, which means that, given two initial configurations $\xi$, $\xi'$ with $\xi(x) \le \xi'(x)$ for any $x$, there exists a joint process $(\xi_t, \xi'_t)$ such that $\xi_t$ is a zero-range process starting from $\xi$, $\xi'_t$ is a zero-range process starting from $\xi'$ and $\xi_t(x) \le \xi'_t(x)$ for any $x \in V_n$ and any $t > 0$. This property allows us to obtain moment bounds for $\xi_t(x)$ in terms of the invariant measures $\nu_\rho$. Now a simple path argument shows that there is exactly one invariant measure for the evolution of $\xi_t$.
Remarkably, this invariant measure is still of product form. Consider the harmonic function $h$ with $h(a_i) = \alpha_i$. It is not hard to see that the non-uniform product measure $\nu_h$ defined by
$$\nu_h\{\xi;\ \xi(x) = k\} = \frac{1}{Z(h(x))} \frac{h(x)^k}{g(k)!}$$
is invariant and ergodic for the evolution of $\xi_t$.
4.2. The $H_{-1}$-norm method: heuristics. Perhaps the simplest method to prove hydrodynamic limits for particle systems of gradient type is the $H_{-1}$-norm method introduced by Chang and Yau [2] (see [3] for a more comprehensible presentation). The main drawback of this method is that it only works for strictly diffusive systems. The other alternatives are the so-called entropy method [4] and relative entropy method [11]. As we discussed in the Introduction, the entropy method requires a path lemma that roughly states that we can move a particle from one site to another paying a diffusive cost. This is not true for the Sierpinski gasket, due to the presence of hot spots: points that connect two huge parts of the graph and that cannot be avoided in order to move a particle from one of these parts to the other. For example, if we want to transport a particle from a site in $\varphi_0(K)$ to another site in $\varphi_1(K)$, the particle has to pass by the point $\varphi_0(a_1)$, or by the two points $\varphi_0(a_2)$, $\varphi_1(a_2)$. The second alternative requires smoothness of the solutions of the hydrodynamic equation. Of course, since we do not have a differentiable structure in $K$, we do not expect the solutions of the hydrodynamic equation to be smooth. The $H_{-1}$ method is based on the heuristic argument leading to uniqueness of the hydrodynamic equation (2). Remember that the idea was to prove that, for two solutions
$u_t$, $v_t$ of (2), $\|u_t - v_t\|_{-1}^2$ is decreasing in time. The main point is that this inequality also holds at the microscopic level, that is, for two different versions $\xi_t^1$, $\xi_t^2$ of the zero-range process, or even between $u_t$ and $\xi_t$. Our task will be to turn this formal argument into a rigorous proof. Before doing that, we need some definitions. We recall the formula for the norm in $H_{-1}(K)$ in terms of the Green function $G$: for a function (or even a measure) $u : K \to \mathbb{R}$,
\[ \|u\|_{-1}^2 = \int_{K \times K} u(x)\,u(y)\,G(x,y)\,\nu(dx)\,\nu(dy). \]
More important for us will be the discrete version of this formula: for $u : V_n \to \mathbb{R}$ such that $u(a_i) = 0$,
\[ \|u\|_{-1,n}^2 = \frac{1}{3^{2n}} \sum_{x,y \in V_n} u(x)\,u(y)\,G(x,y). \]
Now we define what we understand by "convergence in $H_{-1}$". Let $\{\nu^n\}_n$ be a sequence of measures in $\Omega_n$. Let $u : K \to [0,\infty)$ be a given function. We say that $\nu^n$ converges to $u$ in the $H_{-1}$ sense if
\[ \lim_{n\to\infty} \int \|\xi - u\|_{-1,n}^2 \,\nu^n(d\xi) = 0. \]
A simple computation shows that the local equilibrium measures $\nu^n_{u(\cdot)}$, defined as the product measures in $\Omega_n$ with marginals
\[ \nu^n_{u(\cdot)}\{\xi;\ \xi(x) = k\} = \nu_{u(x)}\{\xi;\ \xi(x) = k\}, \]
converge to $u$ in the $H_{-1}$ sense. For two given measures $\nu$, $\nu'$ in $\Omega_n^{zr}$, we define the relative entropy of $\nu$ with respect to $\nu'$ by
\[ H(\nu|\nu') = \begin{cases} \displaystyle\int \frac{d\nu}{d\nu'} \log \frac{d\nu}{d\nu'} \, d\nu', & \text{if } \nu \ll \nu', \\ +\infty, & \text{otherwise.} \end{cases} \]
We also say that $\nu$ is stochastically dominated by $\nu'$ if there is a measure $\lambda$ in $\Omega_n \times \Omega_n$ such that
i) $\lambda(\xi, \Omega_n) = \nu(\xi)$,
ii) $\lambda(\Omega_n, \xi') = \nu'(\xi')$,
iii) $\lambda\{(\xi,\xi');\ \xi(x) \le \xi'(x) \text{ for any } x \in V_n^0\} = 1$.
Now we are ready to state the main result of this article.

Theorem 3. Let $\{\nu^n\}_n$ be a sequence of probability measures in $\Omega_n^{zr}$, converging in the $H_{-1}$ sense to some bounded function $u_0 : K \to [0,\infty)$. Assume that the hydrodynamic equation has a unique solution. Assume also the technical conditions:
i) Under (SG), there are positive constants $\kappa$, $\rho$ such that $H(\nu^n|\nu_\rho) \le \kappa 3^n$.
ii) Under (C), there are two positive constants $\rho < \rho'$ such that $\nu_\rho$ is stochastically dominated by $\nu^n$ and $\nu^n$ is stochastically dominated by $\nu_{\rho'}$ for any $n > 0$.
Then, for any $t > 0$, the distributions $\{\nu^n(t)\}_n$ at time $t$ of the rescaled process $\xi_t^n = \xi_{5^n t}$ in $\Omega_n$ starting from $\nu^n$ converge in the $H_{-1}$ sense to $u(t,\cdot)$, the solution of the hydrodynamic equation (2).

Under Condition ii), we have $H(\nu^n|\nu_\rho) \le H(\nu_{\rho'}|\nu_\rho)$ and therefore there is a constant $\kappa$ such that $H(\nu^n|\nu_\rho) \le \kappa 3^n$ for any $n$.
4.3. The $H_{-1}$-norm method: martingale representation.
In this section we will obtain a martingale representation for the $H_{-1}$-norm of $\xi_t^n$. Notice that, due to the time scaling, the generator of $\xi_t^n$ is equal to $5^n L_{zr}$. For any function $F : [0,T] \times \Omega_n^{zr} \to \mathbb{R}$, differentiable in time and linearly growing in $\xi$, Dynkin's formula states that
\[ M_t^{n,F} := F(t,\xi_t^n) - F(0,\xi_0^n) - \int_0^t \big(\partial_t + 5^n L_{zr}\big) F(s,\xi_s^n)\,ds \]
is a martingale. A long and tedious, but totally elementary computation shows that, in fact,
\[ \big(\partial_t + 5^n L_{zr}\big) \|\xi_s - u_s^n\|_{-1,n}^2 = -\frac{2}{3^n} \sum_{x \in V_n} F\big(\xi_s(x), u_s^n(x)\big) + \frac{1}{3^{2n}} \sum_{x \in V_n} g(\xi_s(x))\, \Delta_n G(x), \]
where $G(x) = G(x,x)$ and $F(\xi,u) = (g(\xi) - \phi(u))(\xi - u) - g(\xi)$. We conclude that
\[ M_t^n = \|\xi_t - u_t^n\|_{-1,n}^2 - \|\xi_0 - u_0^n\|_{-1,n}^2 + \int_0^t \Big\{ \frac{2}{3^n} \sum_{x \in V_n} F\big(\xi_s(x), u_s^n(x)\big) - \frac{1}{3^{2n}} \sum_{x \in V_n} g(\xi_s(x))\, \Delta_n G(x) \Big\}\, ds \]
is a martingale. Observe that $\int F(\xi(x), u)\, \nu_\rho(d\xi) \ge 0$ for any $\rho \in [0, \rho^*)$ and any $u \ge 0$. In particular, $\int F(\xi(x), u)\, \nu^n_{u(\cdot)}(d\xi) \ge 0$ for any $x \in V_n$. Since $M_0^n = 0$, we have that $E_n[M_t^n] = 0$ for any $t > 0$. Here and below, $P_n$ denotes the distribution of the process $\xi_t^n$ starting from $\nu^n$, and $E_n$ denotes expectation with respect to $P_n$. Taking the expectation with respect to $P_n$ of the previous identity, we can obtain an expression for $E_n \|\xi_t - u_t^n\|_{-1,n}^2$:
\[ E_n \|\xi_t - u_t^n\|_{-1,n}^2 = E_n \|\xi_0 - u_0^n\|_{-1,n}^2 - E_n \int_0^t \frac{2}{3^n} \sum_{x \in V_n} F\big(\xi_s(x), u_s^n(x)\big)\, ds + E_n \int_0^t \frac{1}{3^{2n}} \sum_{x \in V_n} g(\xi_s(x))\, \Delta_n G(x)\, ds. \]
Theorem 3 is an immediate consequence of the following two lemmas:

Lemma 1.
\[ \lim_{n\to\infty} E_n \int_0^t \frac{2}{3^n} \sum_{x \in V_n} F\big(\xi_s(x), u_s^n(x)\big)\, ds \ge 0. \]
Lemma 2.
\[ \lim_{n\to\infty} E_n \int_0^t \frac{1}{3^{2n}} \sum_{x \in V_n} g(\xi_s(x))\, \Delta_n G(x)\, ds \le 0. \]
In fact, from these two lemmas, we conclude that
\[ \limsup_{n\to\infty} E_n \|\xi_t - u_t^n\|_{-1,n}^2 \le \limsup_{n\to\infty} E_n \|\xi_0 - u_0^n\|_{-1,n}^2, \]
and convergence in the $H_{-1}$ sense follows at once. The argument behind the proof of these two lemmas is as follows. We will see that some sort of weak conservation of local equilibrium will allow us to replace in the previous expressions the functions $g(\xi_s(x))$ by $\phi(\xi_s^k(x))$, where the symbol $\xi_s^k(x)$ denotes the average of $\xi_s(y)$ over a small triangle containing $x$, paying a price that vanishes when $n \to \infty$ and then $k \to \infty$. In the same way we can substitute $F(\xi_s(x), u_s^n(x))$ by
\[ \big(\phi(\xi_s^k(x)) - \phi(u_s^n(x))\big)\big(\xi_s^k(x) - u_s^n(x)\big). \tag{6} \]
This replacement completes the proof of Lemma 1, since the function in (6) is always nonnegative. The proof of Lemma 2 is more subtle. Notice the huge factor $1/3^n$ in front of the average in Lemma 2. Since $G(x,y)$ satisfies $\sup_y \|G(\cdot,y)\|_1^2 < +\infty$, we could guess that $G(\cdot)$ is in $H_1(K)$. In that case, we should have $\Delta_n G(x)/3^n \to 0$ as $n \to \infty$ in some convenient sense. It turns out that this is not the case. The function $G(x)$ is extremely irregular, and in fact it can be proved that for each $x \in V^*$ fixed, $\Delta_n G(x)/3^n \to 3/7$ as $n \to \infty$. Therefore, the irregularity of $G(x,x)$ compensates exactly the factor $1/3^n$, and we need another argument to conclude the proof. Replacing $g(\xi_s(x))$ by $\phi(\xi_s^k(x))$ we will be able to perform an integration by parts in a small triangle of size $k$, therefore gaining a factor $3^k$ that will save the day at the end. We will devote the following two sections to the proof of each one of these two lemmas.

5. The One-Block Estimate
The replacement mentioned in the previous section is known in the literature of interacting particle systems as the one-block estimate. Before stating the one-block estimate in a precise way, we need some definitions. Fix two integers $n \ge l \ge 0$. For $x \in V_n \setminus V_{n-l}$, we define $T_n^l(x)$ as the set of points in $V_n$ contained in the triangle $\Delta(x_0, x_1, x_2)$, which contains $x$ and has vertices in $V_{n-l}$. In an equivalent way, $T_n^l(x) = T \cap V_n$, where $T$ is the unique triangle of the form $\varphi_{i_{n-l}} \circ \cdots \circ \varphi_{i_1}(K)$ containing $x$. For $x \in V_{n-l}$, there are two possible choices for $T_n^l(x)$, given by two triangles intersecting exactly at $x$. Rotating the graph in such a way that both triangles lie on the upper half-plane, we choose $T_n^l(x)$ as the triangle at the right of $x$. The exact choice in this case is not important; the point is to choose each triangle the same number of times.

Theorem 4. (Local one-block) Let us define the average number of particles $\xi_s^k(x)$ by
\[ \xi_s^k(x) = \frac{1}{|V_k|} \sum_{y \in T_n^k(x)} \xi_s(y), \]
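where $|V_k|$ denotes the cardinality of $V_k$ (and also of $T_n^k(x)$). Then,
\[ \lim_{k\to\infty} \limsup_{n\to\infty} \sup_{x \in V_n} E_n \Big| \int_0^t \big\{ g(\xi_s(x)) - \phi\big(\xi_s^k(x)\big) \big\}\, ds \Big| = 0, \tag{7} \]
\[ \lim_{k\to\infty} \limsup_{n\to\infty} \sup_{x \in V_n} E_n \Big| \int_0^t \big\{ \xi_s(x)\, g(\xi_s(x)) - \phi\big(\xi_s^k(x)\big)\big(1 + \xi_s^k(x)\big) \big\}\, ds \Big| = 0. \tag{8} \]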
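To make the sets $T_n^k(x)$ and the block average $\xi^k(x)$ concrete, here is a minimal Python sketch. It is an illustration only and not part of the proof: it assumes that $V_m$ is built from the three corners by the contractions $\varphi_i(x) = (x + a_i)/2$, and it stores points in integer barycentric coordinates (a choice made here for exactness). The function names `vertices`, `cells` and `block_average` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def apply_map(points, scale, i):
    """Apply phi_i(x) = (x + a_i)/2 to points whose integer coordinates sum to `scale`.

    The image is returned in coordinates that sum to 2 * scale.
    """
    out = set()
    for p in points:
        q = list(p)
        q[i] += scale
        out.add(tuple(q))
    return out


def vertices(m):
    """V_m in integer barycentric coordinates (each point sums to 2**m)."""
    pts, scale = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}, 1
    for _ in range(m):
        pts = set().union(*(apply_map(pts, scale, i) for i in range(3)))
        scale *= 2
    return pts


def cells(n, k):
    """All sets phi_{i_{n-k}} o ... o phi_{i_1}(V_k), i.e. the possible T_n^k(x)."""
    base = vertices(k)
    result = []
    for word in product(range(3), repeat=n - k):
        pts, scale = base, 2 ** k
        for i in word:  # innermost map first
            pts = apply_map(pts, scale, i)
            scale *= 2
        result.append(pts)
    return result


def block_average(xi, cell):
    """Average of a configuration xi (dict: point -> particle count) over one cell."""
    return Fraction(sum(xi[y] for y in cell), len(cell))


n, k = 3, 1
Vn = vertices(n)
xi = {p: p[0] % 3 for p in Vn}  # an arbitrary deterministic configuration
for cell in cells(n, k)[:3]:
    print(len(cell), block_average(xi, cell))
```

Each cell has $|V_k| = (3^{k+1}+3)/2$ points and there are $3^{n-k}$ cells, so the averaging window stays fixed in size while the number of sites grows like $3^n$.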
The word "local" comes from the fact that in the usual version of the one-block estimate, the arguments in the integrals are averaged against a smooth test function. This local version of the one-block estimate was introduced in [5]. As stated in [5], this local one-block estimate is available only in dimension $d < 2$. The Sierpinski gasket $K$ has Hausdorff dimension $d_H = \log 3 / \log 2 < 2$. But this is not really the point. The local one-block estimate holds each time the scaling of the process is faster than the scaling of the number of points. In our present situation, the process scales like $5^n$ and the number of points scales like $3^n$, so the local one-block estimate will hold.

5.1. Proof of the one-block estimate.
We will take the proof of Theorem 4 from [5]. In that paper the case in which condition (SG) is satisfied is treated in detail, so here we focus on condition (C). Our first step is to introduce a cut-off function that prevents us from having too many particles at site $x$. Take $a > \rho'$. Then,
\[ \begin{aligned} E_n \Big| \int_0^t \big\{ g(\xi_s(x)) - \phi\big(\xi_s^k(x)\big) \big\} \mathbf{1}(\xi_s^k(x) \ge a)\, ds \Big| &\le E_n \int_0^t \big\{ g(\xi_s(x)) + \phi\big(\xi_s^k(x)\big) \big\} \mathbf{1}(\xi_s^k(x) \ge a)\, ds \\ &\le E_{\rho'} \int_0^t \big\{ g(\xi_s(x)) + \phi\big(\xi_s^k(x)\big) \big\} \mathbf{1}(\xi_s^k(x) \ge a)\, ds \\ &\le t \int \big\{ g(\xi(x)) + \phi\big(\xi^k(x)\big) \big\} \mathbf{1}(\xi^k(x) \ge a)\, \nu_{\rho'}(d\xi) \\ &\le t \Big( 2 \int \big\{ g(\xi(x))^2 + \phi\big(\xi^k(x)\big)^2 \big\}\, \nu_{\rho'}(d\xi)\; \nu_{\rho'}\big(\xi^k(x) \ge a\big) \Big)^{1/2}. \end{aligned} \]
The expectation in the last line is bounded in $k$. Moreover, by the law of large numbers, $\xi^k(x)$ converges to $\rho'$ in probability as $k \to \infty$, and the probability in the last line goes to 0 as $k \to \infty$. Notice that this convergence is uniform in $x$. Therefore, we can introduce the indicator function $\mathbf{1}(\xi_s^k(x) \le a)$ in (7). By assumption, the entropy density $H(\nu^n|\nu_\rho)/3^n$ is uniformly bounded in $n$ by a constant $\kappa < +\infty$. A simple computation shows that the same is true for $H(\nu^n|\nu_h)$, the entropy with respect to the invariant measure of the process. It is well known that the entropy of a jump process with respect to the invariant measure is decreasing in time. Fix some reference time $T > 0$. Denote by $P_{inv}$ the law of the process $\xi_t$ up to time $T$, sped up by $5^n$ and starting from the invariant measure $\nu_h$. Then, there is another constant $\bar\kappa$, depending only on $\kappa$ and $T$, such that $H(P_n|P_{inv})/3^n \le \bar\kappa$ for any $n > 0$. To simplify the notation, let us define $V_k(\xi,x)$ by
\[ V_k(\xi,x) = \big\{ g(\xi(x)) - \phi\big(\xi^k(x)\big) \big\} \mathbf{1}(\xi^k(x) \le a). \]
We have omitted in the notation the dependence of $V_k$ on $n$ and $a$. By the entropy inequality,
\[ E_n \Big| \int_0^T V_k(\xi_s,x)\, ds \Big| \le \frac{\bar\kappa}{\gamma} + \frac{1}{\gamma 3^n} \log E_{inv} \Big[ \exp\Big\{ \gamma 3^n \Big| \int_0^T V_k(\xi_s,x)\, ds \Big| \Big\} \Big]. \]
Using the elementary inequality $e^{|x|} \le e^x + e^{-x}$, we can get rid of the modulus in the previous expression. Therefore, the limit in (7) will be obtained if we prove that
\[ \lim_{k\to\infty} \limsup_{n\to\infty} \frac{1}{\gamma 3^n} \log E_{inv} \Big[ \exp\Big\{ \pm \gamma 3^n \int_0^T V_k(\xi_s,x)\, ds \Big\} \Big] = 0. \]
By the Feynman-Kac formula, the logarithm of this expectation is bounded by the largest eigenvalue of the operator $5^n L \pm \gamma 3^n V_k$, where the term $V_k$ is understood as a multiplication operator. For simplicity, we will consider just the "+" sign in front of $V_k$. By the variational formula for the largest eigenvalue of an operator in $L^2(\nu_h)$, the previous expression is bounded by
\[ T \sup_f \Big\{ \langle V_k, f \rangle - \frac{1}{\gamma} \Big(\frac{5}{3}\Big)^n \big\langle \sqrt{f}, -L_{zr} \sqrt{f} \big\rangle \Big\}, \tag{9} \]
where the supremum is over all the densities $f$ with respect to $\nu_h$, and the inner product is with respect to $\nu_h$ as well. Notice first that $V_k$ depends on $\xi(y)$ only through its values for $y \in T_n^k(x)$. Taking the conditional expectation of $f$ with respect to $\mathcal{F}(T_n^k(x))$, the $\sigma$-algebra generated by $\{\xi(y);\ y \in T_n^k(x)\}$, and due to the fact that $\nu_h$ is a product measure, we can restrict the previous supremum to densities in the configuration space $\mathbb{N}_0^{T_n^k(x)}$, which is homeomorphic to $\mathbb{N}_0^{V_k}$. By positivity of $\langle \sqrt{f}, -L_{zr} \sqrt{f} \rangle$, we can also replace $L_{zr}$ by the generator of a zero-range process restricted to the triangle $T_n^k(x)$. We will denote this generator simply by $L$, since no risk of confusion will arise from the fact that $L$ depends on $n$, $k$ and $x$. In this way, we have reduced the initial problem to a problem on a finite graph, which in our case is equal to $V_k$. Moreover, due to the presence of the indicator function $\mathbf{1}(\xi^k(x) \le a)$ in the definition of $V_k$, we can restrict ourselves to a finite state space, namely $\{\xi \in \mathbb{N}_0^{V_k};\ \xi^k \le a\}$. Notice that at this point, $\xi^k = \xi^k(x)$ does not depend on $x \in V_k$. Since now the supremum is over a compact set, we can exchange the supremum and the limit as $n \to \infty$ to obtain
\[ \limsup_{n\to\infty} \sup_f \Big\{ \langle V_k, f \rangle - \frac{1}{\gamma} \Big(\frac{5}{3}\Big)^n \big\langle \sqrt{f}, -L \sqrt{f} \big\rangle \Big\} = \sup_{\langle \sqrt{f}, -L\sqrt{f} \rangle = 0} \langle V_k, f \rangle. \]
Now we have to identify the densities for which $\langle \sqrt{f}, -L\sqrt{f} \rangle = 0$. A simple computation shows that, in this case, $f$ is constant over the sets
\[ \Omega_{k,l} = \Big\{ \xi \in \mathbb{N}_0^{V_k};\ \sum_{x \in V_k} \xi(x) = l \Big\}. \]
The restriction $\xi^k \le a$ imposes $l \le a|V_k|$. Let us define the measures $\nu_{k,l}$ by taking $\nu_{k,l}(\cdot) = \nu_\rho(\cdot \,|\, \xi^k = l/|V_k|)$. Notice that these measures do not depend on the value of $\rho$, and they are also exchangeable. Then, the previous supremum is equal to
\[ \sup_{l \le a|V_k|} \int V_k(\xi,x)\, \nu_{k,l}(d\xi). \]
For $l \le a|V_k|$, the indicator function $\mathbf{1}(\xi^k \le a)$ is identically equal to 1. Therefore, we are left with
\[ \sup_{l \le a|V_k|} \int \big\{ g(\xi(x)) - \phi(l/|V_k|) \big\}\, d\nu_{k,l}. \]
But this last quantity goes to 0 as $k \to \infty$ by the equivalence of ensembles, which states that the canonical measures $\nu_{k,l}$ approach the grand canonical measures $\nu_{l/|V_k|}$, uniformly in compact subsets of the real line. The case in which we take the "−" sign in front of $V_k$ is totally analogous. In this way we have finished the proof of (7). The proof of this theorem when we take $\xi(x) g(\xi(x))$ instead of $g(\xi(x))$ is totally analogous.

Remark 1. The same estimate remains true if we consider functions of the form $g(\xi_t^n(x)) F(t)$, where $F : [0,T] \to \mathbb{R}$ is bounded. In that case the limit is uniform over sets of the form $\{\sup_{t \in [0,T]} |F(t)| \le K\}$. It is enough to replace the variational formula in (9) by $\int_0^T \lambda_n(t)\, dt$, where $\lambda_n(t)$ is the largest eigenvalue of the operator $5^n L \pm \gamma 3^n V_{k,t}$.

5.2. Proofs of Lemma 1 and Lemma 2.
Now we are in position to prove Lemma 1. By the one-block estimate, we can write
\[ E_n \int_0^t \frac{2}{3^n} \sum_{x \in V_n} F\big(\xi_s^n(x), u_s^n(x)\big)\, ds = E_n \int_0^t \frac{2}{3^n} \sum_{x \in V_n} \big( \phi(\xi_s^k(x)) - \phi(u_s^n(x)) \big)\big( \xi_s^k(x) - u_s^n(x) \big)\, ds \]
plus a rest that vanishes as $n \to \infty$ and then $k \to \infty$. But this last term is always nonnegative, which proves Lemma 1. In order to prove Lemma 2, we start by proving that $\Delta_n G(x)/3^n$ is uniformly bounded. Remember that $G(x,x) \ge G(x,y)$ for any $x, y \in K$. Since $\Delta_n G(x,x) = -3^n$, we conclude that $G(x,x) \le G(x,y) + (3/5)^n$ for $x, y$ in $V_n$ with $y \sim_n x$. Therefore,
\[ \begin{aligned} G(x,y) &\le G(x,x) \le G(x,y) + (3/5)^n, \\ G(x,y) &\le G(y,y) \le G(x,y) + (3/5)^n, \end{aligned} \]
where we have obtained the second line by interchanging the roles of $x$ and $y$. We conclude that $|G(x) - G(y)| \le (3/5)^n$ for $x \sim_n y$, and $|\Delta_n G(x)/3^n| \le 4$. Now we are able to use the one-block estimate to rewrite the expectation in Lemma 2 as
\[ E_n \int_0^t \frac{1}{3^{2n}} \sum_{x \in V_n} g(\xi_s(x))\, \Delta_n G(x)\, ds = E_n \int_0^t \frac{1}{3^n} \sum_{x \in V_n} \phi\big(\xi_s^k(x)\big)\, \Delta_n G(x)/3^n \, ds \]
plus a rest that vanishes as $n \to \infty$ and then $k \to \infty$. Notice that the function $\phi(\xi_s^k(x))$ is constant in $T_n^k(x)$. Therefore, we can integrate by parts (in this discrete context just a summation) the function $\Delta_n G(x)$ in $T_n^k(x)$ to obtain that the previous expression is equal to
\[ E_n \int_0^t \frac{1}{3^n} \sum_{x \in V_n} \frac{1}{|V_k|} \sum_{i=0,1,2} \phi\big(\xi_s^k(x)\big)\, \partial_{n,k}^i G(x)\, ds, \]
where the symbols $\partial_{n,k}^i G(x)$ denote the outer normal derivative of $G(x)$ computed on the three vertices $a_{n,k}^i(x)$ of the triangle $T_n^k(x)$, defined by
\[ \partial_{n,k}^i G(x) = (5/3)^n \sum_{\substack{y \sim_n a_{n,k}^i(x) \\ y \notin T_n^k(x)}} \big\{ G(y) - G(a_{n,k}^i(x)) \big\}. \]
Notice that the ordering of the three vertices $a_{n,k}^i(x)$ is not relevant here. We have also gained a factor $|V_k|^{-1}$ in this integral. Now we just need to prove that the normal derivatives of $G(x)$ defined in this way are uniformly bounded. But the same arguments used to bound $\Delta_n G(x)$ can be repeated here, to get a bound of the form $|\partial_{n,k}^i G(x)| \le 2$ for any $x \in V_{n-k}$ and any $k \le n$.

A. The Behavior of the Green Function at the Diagonal
In this Appendix we study the behavior of the function $G(x)$. We want to compute $\Delta_n G(x)/3^n$ for $x \in V_n$ and we want to obtain its asymptotic behavior. The idea is to obtain an iterative formula for $\Delta_{n+1} G(x)$ in terms of the values of $G(x,y)$ for sites $y$ in the neighborhood of $x$. This iterative scheme is known as the "near diagonal formula" and was investigated in [8]. We are especially interested in Eq. (11) which, to our knowledge, was not obtained before. In Fig. A we have taken a point $x_0 \in V_n$ and we have named $x_i$, $x_i'$, $i = 1, 2$ each of the four neighbors of $x_0$ in $V_n$. We have also drawn points $y_i$, $y_i'$, $i = 0, 1, 2$, which correspond to the points in $V_{n+1}$ belonging to the two triangles of side $2^{-n}$ in $V_n$, meeting at $x_0$. We claim that $G(y_i, x_j)$ can be computed in terms of the six numbers $G_{ij}^n = G(x_i, x_j)$ (there are 9 combinations for $i, j$, but remember that $G_{ij}^n = G_{ji}^n$). In fact, let us construct $G(y, y_0)$ for $y \in V_{n+1}$, for example. Defining
\[ G_0(y, y_0) = \begin{cases} 3/10, & y = y_0, \\ 1/10, & y = y_1, y_2, \\ 0, & \text{otherwise,} \end{cases} \]
we see that $\Delta_{n+1} (3/5)^{n+1} G_0(y, y_0) = -3^{n+1} \delta(y, y_0)$, except for $y = x_i$, $i = 0, 1, 2$. Therefore, the true Green function $G(y, y_0)$ is a linear combination of $G_0(y, y_0)$ and $G(y, x_i)$, $i = 0, 1, 2$. In fact, $\Delta_{n+1} (3/5)^{n+1} G_0(x_i, y_0) = 3^{n+1} h^0(x_i)$, where
\[ h^0(x_i) = \begin{cases} 1/5, & i = 0, \\ 2/5, & i = 1, 2. \end{cases} \]
Fig. A. A neighborhood of x0
Therefore, we have the following formula for $G(y, y_0)$:
\[ G(y, y_0) = \Big(\frac{3}{5}\Big)^{n+1} G_0(y, y_0) + \sum_{i=0,1,2} h^0(x_i)\, G(y, x_i). \]
Notice, as well, that $G(y_i, x_j)$ can be computed from $G_{ij}^n$ by harmonic continuation:
\[ G(y_i, x_j) = \frac{1}{5} \Big\{ 2 \sum_{k=0,1,2} G_{kj}^n - G_{ij}^n \Big\}. \]
Combining these two formulas, we can obtain the values of $G(y_i, y_j)$ in terms of the numbers $\{G_{ij}^n\}$ for any $i, j$. We give the formulas for $G(y_0, y_0)$, $G(y_1, y_0)$; the other formulas can be obtained by cyclic permutations of $\{0, 1, 2\}$:
\[ \begin{aligned} G(y_0, y_0) &= \frac{3}{10} \Big(\frac{3}{5}\Big)^{n+1} + \frac{1}{25} \big\{ G_{00}^n + 4G_{11}^n + 4G_{22}^n + 4G_{01}^n + 8G_{12}^n + 4G_{20}^n \big\}, \\ G(y_0, y_1) &= \frac{1}{10} \Big(\frac{3}{5}\Big)^{n+1} + \frac{1}{25} \big\{ 2G_{00}^n + 2G_{11}^n + 4G_{22}^n + 5G_{01}^n + 6G_{12}^n + 6G_{20}^n \big\}. \end{aligned} \]
Of course, similar formulas hold for the left-side points in Fig. A. As an application of these formulas, we can try to compute $\Delta_{n+1} G(x_0)$ in terms of $\Delta_n G(x_0)$:
\[ 5^{n+1} \big\{ G(y_1, y_1) + G(y_2, y_2) - 2G_{00}^n \big\} = 3^{n+1} \cdot \frac{3}{5} + 5^{n-1} \big\{ 5(G_{11}^n + G_{22}^n - 2G_{00}^n) + 12(G_{01}^n + G_{02}^n - 2G_{00}^n) + 8(G_{12}^n - G_{00}^n) \big\}. \tag{10} \]
Adding the symmetric term coming from the left-hand side of Fig. A, we see that
\[ \Delta_{n+1} G(x_0) = 3^{n+1} \cdot \frac{6}{5} + \Delta_n G(x_0) + \frac{12}{5} \Delta_n G(x_0, x_0) + \frac{8}{5} \cdot 5^n \big\{ G(x_1, x_2) + G(x_1', x_2') - 2G(x_0, x_0) \big\}. \]
In this expression, there is a new term appearing. Let us define
\[ \Gamma_n(x_0) = 5^n \big\{ G(x_1, x_2) + G(x_1', x_2') - 2G(x_0, x_0) \big\}. \]
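Since $\Delta_n G(x_0, x_0) = -3^n$, we have the formula
\[ \Delta_{n+1} G(x_0) = 3^{n+1} \cdot \frac{2}{5} + \Delta_n G(x_0) + \frac{8}{5} \Gamma_n(x_0). \]
Now let us compute $\Gamma_{n+1}(x_0)$:
\[ \Gamma_{n+1}(x_0) = -3^{n+1} \cdot \frac{1}{5} + \frac{2}{5} \Delta_n G(x_0) + \Gamma_n(x_0). \]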
This establishes a linear recurrence formula for the pair $\{\Delta_n G(x_0), \Gamma_n(x_0)\}$. Due to the factor $3^{n+1}$, it is natural to define $a_n = \Delta_n G(x_0)/3^n$, $b_n = \Gamma_n(x_0)/3^n$. For $a_n$, $b_n$, the recurrence formula reads
\[ a_{n+1} = \frac{2}{5} + \frac{a_n}{3} + \frac{8 b_n}{15}, \qquad b_{n+1} = -\frac{1}{5} + \frac{2 a_n}{15} + \frac{b_n}{3}. \]
This recursion formula can be written in vectorial terms as $\mathbf{a}_{n+1} = w + M \mathbf{a}_n$, with
\[ M = \begin{pmatrix} 1/3 & 8/15 \\ 2/15 & 1/3 \end{pmatrix}. \]
The eigenvalues of this matrix are $\lambda_1 = 3/5$, $\lambda_2 = 1/15$. In particular, for any initial value of $a_n$, $b_n$ (remember that $x_0 \in V_n$, so the sequence does not start at $n = 1$) we have convergence to a unique fixed point, given by $\mathbf{a} = (I - M)^{-1} w$. In our case, $w = (2/5, -1/5)$, and $\mathbf{a} = (3/7, -3/14)$. In particular,
\[ \lim_{n\to\infty} \Delta_n G(x_0)/3^n = 3/7. \tag{11} \]
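The eigenvalues and the fixed point quoted above can be checked by exact rational arithmetic. The following Python sketch is only a sanity check of the linear algebra, not part of the argument; the helper names `step`, `fixed_point` and `char_poly` are hypothetical, and the starting value $(0,0)$ used in the iteration is an arbitrary choice.

```python
from fractions import Fraction as Fr

# Matrix M and vector w of the recursion a_{n+1} = w + M a_n for (a_n, b_n).
M = ((Fr(1, 3), Fr(8, 15)),
     (Fr(2, 15), Fr(1, 3)))
w = (Fr(2, 5), Fr(-1, 5))


def step(a, b):
    """One step of the recursion (a, b) -> w + M (a, b)."""
    return (w[0] + M[0][0] * a + M[0][1] * b,
            w[1] + M[1][0] * a + M[1][1] * b)


def fixed_point():
    """Solve (I - M) a = w exactly with Cramer's rule."""
    p, q = 1 - M[0][0], -M[0][1]
    r, s = -M[1][0], 1 - M[1][1]
    det = p * s - q * r
    return ((w[0] * s - q * w[1]) / det, (p * w[1] - r * w[0]) / det)


def char_poly(lam):
    """det(M - lam * I); it vanishes exactly at the eigenvalues of M."""
    return (M[0][0] - lam) * (M[1][1] - lam) - M[0][1] * M[1][0]


print(fixed_point())
print(char_poly(Fr(3, 5)), char_poly(Fr(1, 15)))

a, b = Fr(0), Fr(0)  # arbitrary starting point
for _ in range(60):
    a, b = step(a, b)
print(float(a), float(b))
```

Because both eigenvalues lie strictly inside the unit disc, the iteration contracts towards the fixed point at the geometric rate $3/5$ whatever the starting values are.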
Notice that the limit is positive and does not depend on $x_0$. Remember that $\Delta_n G(x_0, x_0)/3^n = -1$. This remarkable fact shows the high irregularity of the function $G(x)$. In contrast to it, for the unit interval $[0, 1]$, $G(x) = x(1 - x)$, so $G(x)$ is smooth and concave. In order to obtain a recursive formula for the partial derivatives of $G(x)$, we focus our attention on the right side of Fig. A. Let us take a look at Eq. (10). We see that most of the work has already been done. In fact, defining
\[ \alpha_n = (5/3)^n (G_{11}^n + G_{22}^n - 2G_{00}^n), \quad \beta_n = (5/3)^n (G_{12}^n - G_{00}^n), \quad \gamma_n = (5/3)^n (G_{01}^n + G_{20}^n - 2G_{00}^n), \]
we obtain the following recursion formulas:
\[ \alpha_{n+1} = \frac{3}{5} + \frac{1}{3}\alpha_n + \frac{8}{15}\beta_n + \frac{4}{5}\gamma_n, \qquad \beta_{n+1} = \frac{1}{10} + \frac{2}{15}\alpha_n + \frac{1}{3}\beta_n + \frac{2}{5}\gamma_n, \qquad \gamma_{n+1} = \gamma_n. \]
In particular, we obtain the same linear recursion as before, but now with a different vector $w = (3/5 + 4\gamma/5,\ 1/10 + 2\gamma/5)$. Again, we have convergence of $(\alpha_n, \beta_n)$ to the solution of $\mathbf{a} = w + M\mathbf{a}$. In our case, $\lim_n \alpha_n = \alpha$, with $\alpha = 17/14 + 2\gamma$, where $\gamma = \gamma_n$ for some suitable $n$. Notice that we recover our previous computation $\Delta_n G(x)/3^n \to 3/7$ by noticing that the Laplacian at $x$ is the sum of the two partial derivatives at $x$, and that the corresponding $\gamma$'s add up to $-1$ in that case. We see that the behavior of $\Delta_n G(x)$ is the worst possible, in the sense that on the one hand we have the trivial bound $\Delta_n G(x)/3^n \le c_1$ for any $n$ and $c_1 = 4$, and on the other hand we have the lower bound $\Delta_n G(x)/3^n \ge c_2$ for any $c_2 < 3/7$ and any $n$ large enough.

Acknowledgements. M.J. was supported by the Belgian Interuniversity Attraction Poles Program P6/02, through the network NOSY (Nonlinear systems, stochastic processes and statistical mechanics). M.J. would like to thank the anonymous referees for pointing out relevant references and for helping to improve the presentation of this article.
References
1. Barlow, M.T., Perkins, E.A.: Brownian motion on the Sierpiński gasket. Probab. Theory Related Fields 79(4), 543–623 (1988)
2. Chang, C.C., Yau, H.-T.: Fluctuations of one-dimensional Ginzburg-Landau models in nonequilibrium. Commun. Math. Phys. 145(2), 209–234 (1992)
3. Gravner, J., Quastel, J.: Internal DLA and the Stefan problem. Ann. Probab. 28(4), 1528–1562 (2000)
4. Guo, M.Z., Papanicolaou, G.C., Varadhan, S.R.S.: Nonlinear diffusion limit for a system with nearest neighbor interactions. Commun. Math. Phys. 118(1), 31–59 (1988)
5. Jara, M.D., Landim, C., Sethuraman, S.: Nonequilibrium fluctuations for a tagged particle in mean-zero one-dimensional zero-range processes. Probab. Theory Relat. Fields, to appear, doi:10.1007/s00440-008-0178-2, 2008
6. Kigami, J.: Analysis on fractals, Volume 143 of Cambridge Tracts in Mathematics. Cambridge: Cambridge University Press, 2001
7. Kigami, J.: Measurable Riemannian geometry on the Sierpinski gasket: the Kusuoka measure and the Gaussian heat kernel estimate. Math. Ann. 340(4), 781–804 (2008)
8. Kigami, J., Sheldon, D.R., Strichartz, R.S.: Green's functions on fractals. Fractals 8(4), 385–402 (2000)
9. Kipnis, C., Landim, C.: Scaling limits of interacting particle systems, Volume 320 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 1999
10. Kusuoka, S.: Dirichlet forms on fractals and products of random matrices. Publ. Res. Inst. Math. Sci. 25(4), 659–680 (1989)
11. Yau, H.-T.: Relative entropy and hydrodynamics of Ginzburg-Landau models. Lett. Math. Phys. 22(1), 63–80 (1991)

Communicated by H. Spohn
Commun. Math. Phys. 288, 799–800 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0797-1
Communications in
Mathematical Physics
Erratum
Moduli Spaces of Self-Dual Connections over Asymptotically Locally Flat Gravitational Instantons Gábor Etesi1 , Marcos Jardim2 1 Department of Geometry, Mathematical Institute, Faculty of Science,
Budapest University of Technology and Economics, Egry J. u. 1, H ép., H-1111 Budapest, Hungary. E-mail:
[email protected];
[email protected] 2 Instituto de Matemática, Estatística e Computação Científica, Universidade Estadual de Campinas, C.P. 6065, 13083-859, Campinas, SP, Brazil. E-mail:
[email protected] Received: 1 October 2008 / Accepted: 20 October 2008 Published online: 28 March 2009 – © Springer-Verlag 2009
Commun. Math Phys. 280, 285–313 (2008)
The online version of the original article can be found under doi:10.1007/s00220-008-0466-9.

As was pointed out by U. Bunke, there is an error in the formulation and proof of Lemma 2.1 in our paper [2]. Hereby we would like to correct it. For the sake of clarity we present the whole correctly formulated lemma and its proof.

Lemma 2.1. Fix $0 < \rho < \varepsilon$ and let $\nabla_{A_\rho} = d + A_\rho$ and $\nabla_{B_\rho} = d + B_\rho$ be two smooth SU(2) connections in a fixed smooth gauge on the trivial SU(2) bundle $E|_{\partial M_\rho}$. Then there is a constant $c_1 = c_1(B_\rho) > 0$, depending on $\rho$ only through $B_\rho$, such that
\[ |\tau_{\partial M_\rho}(A_\rho) - \tau_{\partial M_\rho}(B_\rho)| \le c_1 \|A_\rho - B_\rho\|_{L^2_{1,B_\rho}(\partial M_\rho)}, \]
that is, the Chern–Simons functional is continuous in the $L^2_{1,B_\rho}$ norm. Moreover, for each $\rho$, $\tau_{\partial M_\rho}(A_\rho)$ is constant on the path connected components of the character variety $\chi(\partial M_\rho)$.

Proof. The first observation follows from the identity
\[ \tau_{\partial M_\rho}(A_\rho) - \tau_{\partial M_\rho}(B_\rho) = -\frac{1}{8\pi^2} \int_{\partial M_\rho} \mathrm{tr}\Big( (F_{A_\rho} + F_{B_\rho}) \wedge (A_\rho - B_\rho) - \frac{1}{3} (A_\rho - B_\rho) \wedge (A_\rho - B_\rho) \wedge (A_\rho - B_\rho) \Big), \]
which implies that there is a constant $c_0 = c_0(\rho, B_\rho)$ such that
\[ |\tau_{\partial M_\rho}(A_\rho) - \tau_{\partial M_\rho}(B_\rho)| \le c_0 \|A_\rho - B_\rho\|_{L^{3/2}_{1,B_\rho}(\partial M_\rho)}, \]
that is, the Chern–Simons functional is continuous in the $L^{3/2}_{1,B_\rho}$ norm. A standard application of Hölder's inequality on $(\partial M_\rho, \tilde g|_{\partial M_\rho})$ then yields
\[ \|A_\rho - B_\rho\|_{L^{3/2}_{1,B_\rho}(\partial M_\rho)} \le \sqrt{2}\, \big( \mathrm{Vol}_{\tilde g|_{\partial M_\rho}}(\partial M_\rho) \big)^{1/6}\, \|A_\rho - B_\rho\|_{L^2_{1,B_\rho}(\partial M_\rho)}. \]
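The Hölder step can be spelled out as follows. This is a sketch under assumptions that the erratum does not state explicitly: $V$ denotes the volume of $(\partial M_\rho, \tilde g|_{\partial M_\rho})$, the $L^{3/2}_{1,B}$ norm of a 1-form $a$ is taken to be $\|a\|_{L^{3/2}} + \|\nabla_B a\|_{L^{3/2}}$, and the $L^2_{1,B}$ norm is taken to be $(\|a\|_{L^2}^2 + \|\nabla_B a\|_{L^2}^2)^{1/2}$. With other equivalent conventions only the numerical constant changes.

```latex
% Assumption: f is a measurable function on a manifold of finite volume V.
\begin{align*}
\int |f|^{3/2} \cdot 1
  &\le \Big( \int |f|^{2} \Big)^{3/4} \Big( \int 1 \Big)^{1/4}
  && \text{(H\"older with exponents $4/3$ and $4$)} \\
\|f\|_{L^{3/2}}
  &\le V^{1/6}\, \|f\|_{L^{2}}
  && \text{(raise both sides to the power $2/3$)} \\
\|a\|_{L^{3/2}} + \|\nabla_B a\|_{L^{3/2}}
  &\le V^{1/6} \big( \|a\|_{L^{2}} + \|\nabla_B a\|_{L^{2}} \big)
   \le \sqrt{2}\, V^{1/6} \big( \|a\|_{L^{2}}^{2} + \|\nabla_B a\|_{L^{2}}^{2} \big)^{1/2}
  && \text{(since $x + y \le \sqrt{2}\sqrt{x^2 + y^2}$)}
\end{align*}
```

Under these conventions the exponent $1/6 = 2/3 - 1/2$ and the factor $\sqrt{2}$ both appear as in the displayed inequality.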
The metric locally looks like $\tilde g|_{\partial M_\rho \cap U_\varepsilon^*} = \rho^2 \varphi (du^2 + dv^2) + \rho^4 (d\tau^2 + 2 h_{\tau,u}\, d\tau\, du + \dots)$ with $\varphi$ and $h_{\tau,u}$, etc. being bounded functions of $(u, v, \rho)$ and $(u, v, \rho, \tau)$ respectively; hence the metric coefficients as well as the volume of $(\partial M_\rho, \tilde g|_{\partial M_\rho})$ are bounded functions of $\rho$, and consequently we can suppose that $c_1$ does not depend explicitly on $\rho$. Concerning the second part, assume $\nabla_{A_\rho}$ and $\nabla_{B_\rho}$ are two smooth, flat connections belonging to the same path connected component of $\chi(\partial M_\rho)$. Then there is a continuous path $\nabla_{A_\rho^t}$ with $t \in [0, 1]$ of flat connections connecting the given flat connections. Out of this we construct a connection $\nabla_A$ on $\partial M_\rho \times [0, 1]$ given by $A := A_\rho^t + 0 \cdot dt$. Clearly, this connection is flat, i.e., $F_A = 0$. The Chern–Simons theorem [1] implies that
\[ \tau_{\partial M_\rho}(A_\rho) - \tau_{\partial M_\rho}(B_\rho) = -\frac{1}{8\pi^2} \int_{\partial M_\rho \times [0,1]} \mathrm{tr}(F_A \wedge F_A) = 0, \]
concluding the proof.
This lemma is used in the estimates on p. 293 and p. 299 in [2]. In these estimates, the original (incorrect) L 2 norm of the su(2) valued 1-form Aρ − ρ should be replaced simply by its L 21,ρ norm dictated by the corrected Lemma 2.1 presented here. This replacement is only of a technical nature and does not affect any of the main results in [2]. Finally, we would like to thank U. Bunke for pointing out this technical gap.

References
1. Chern, S., Simons, J.: Characteristic forms and geometric invariants. Ann. Math. 99, 48–69 (1974)
2. Etesi, G., Jardim, M.: Moduli spaces of self-dual connections over asymptotically locally flat gravitational instantons. Commun. Math. Phys. 280, 285–313 (2008)

Communicated by G.W. Gibbons
Commun. Math. Phys. 288, 801–819 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0784-6
Communications in
Mathematical Physics
Chronological Spacetimes without Lightlike Lines are Stably Causal E. Minguzzi Dipartimento di Matematica Applicata, Università degli Studi di Firenze, Via S. Marta 3, I-50139 Firenze, Italy. E-mail:
[email protected] Received: 26 April 2008 / Accepted: 23 December 2008 Published online: 25 March 2009 – © Springer-Verlag 2009
Abstract: The statement of the title is proved. It implies that under physically reasonable conditions, spacetimes which are free from singularities are necessarily stably causal and hence admit a time function. Read as a singularity theorem it states that if there is some form of causality violation on spacetime then either it is the worst possible, namely violation of chronology, or there is a singularity. The analogous result: “Non-totally vicious spacetimes without lightlike rays are globally hyperbolic” is also proved, and its physical consequences are explored. 1. Introduction While the local structure of spacetime is fairly simple to describe, there are still a number of open problems concerning the causal behavior of the spacetime manifold in the large. About three decades ago Geroch and Horowitz in the conclusions of their review “Global structure of spacetimes” [8] identified the problem of giving good physical reasons for assuming stable causality as one of the most important questions concerning the global aspects of general relativity together with the proof of the cosmic censorship conjecture. Indeed, if stable causality holds, then the spacetime does not suffer any pathological behavior connected with the presence of almost closed causal curves, and, more importantly, it admits a (non-unique) time function [9], that is a function which is continuous and increases on every causal curve. In order to understand the role of stable causality it is useful to recall that most conformally invariant properties can be ordered in the so-called causal ladder of spacetimes (see Fig. 1). If the real Universe were represented by a globally hyperbolic manifold (the top of the ladder) then a number of mathematically and physically nice properties would hold. 
The problem is that, though there is evidence that the spacetime manifold evolves according to the Einstein equations, it is not clear whether the evolution from physically reasonable Cauchy data would introduce naked singularities and would eventually produce a non-globally hyperbolic spacetime. If so, the Cauchy data would be insufficient
Fig. 1. The causal ladder displaying the new levels considered in Sect. 3. Penrose’s infinite ladder between A-causality and A∞ -causality is omitted [16], as well as the levels of weak distinction and feeble distinction [19]. For the placement of the non-imprisonment properties the reader is referred to [18]. The arrow C ⇒ D means that C implies D and there are examples which show that C differs from D. Stable causality implies K -causality, but it is not known if they coincide. The implications climbing the ladder express the geometrical content of the theorems proved in this work
for the determination of the spacetime geometry and one would have to take into account the information coming from infinity. However, Penrose gave arguments which support the view that the manifold so developed would actually be globally hyperbolic [23] (strong cosmic censorship). Some other authors claim that one should only expect that the non-predictable behavior due to singularities be confined behind horizons (weak cosmic censorship). Other authors note that there are not even compelling reasons for excluding chronology violating regions; in fact, in some cases they allow one to keep the spacetime nonsingular even in the presence of trapped surfaces [21]. From this point of view chronology violating sets should not be discarded a priori; instead they should be considered on the same footing as naked singularities, a physical possibility which hopefully remains hidden behind a horizon. These considerations show that the class of mathematically reasonable spacetimes is rather large, and therefore physicists look for physical arguments which allow one to get as close as possible to global hyperbolicity. In short, physicists look for results which allow one to climb the causal ladder.
The first step would be to justify the chronology property. Actually this assumption is philosophically satisfactory because its violation would raise issues related to the free will of the generic observer. However, the notion of free will is not modeled in general relativity, therefore it becomes reasonable to search for other physical mechanisms, perhaps based on quantum mechanics, which prevent the formation or stability of chronology violating sets. The idea that such a mechanism should indeed exist and that starting from well behaved initial conditions closed timelike curves can not form has been referred to by Hawking as the chronology protection conjecture [10]. As I commented above there is no general consensus on its validity and the evidence coming from classical general relativity is under investigation [13,28,29,32]. It is natural to separate the remainder of the causal ladder in two parts. That going from chronology up to stable causality (causality, distinction, strong causality belong to it), and that going from stable causality up to global hyperbolicity (passing through causal continuity and causal simplicity). While the former part deals with each time more demanding conditions conceived to avoid almost closed causal curves, the latter part presents each time more demanding conditions in order to reduce the effects of points at infinity on spacetime. The problem of climbing the causal ladder from chronology up to stable causality will be considered and solved in this work. It has received less attention than the latter problem, that is, that of going from stable causality up to global hyperbolicity which is indeed more closely related to the strong cosmic censorship conjecture [23]. I am going to prove that chronology plus the absence of lightlike lines implies stable causality (Theorem 6). The theorem is formulated so that every mentioned property is conformally invariant. It is therefore a theorem on the causal structure of spacetime. 
In this respect it is important to use the weaker assumption of absence of lightlike lines instead of the more common null convergence, null genericity and null completeness conditions, though these have a more direct physical meaning. If we regard the null convergence and the null genericity conditions as physically reasonable we can say that under physically reasonable conditions null completeness implies the absence of lightlike lines (see Sect. 2) and hence, under chronology, it also implies stable causality. Thus the theorem can be physically interpreted by saying that under chronology, the absence of singularities implies stable causality and hence the existence of a time function. It is the first result of this form which reduces the existence of a time function to considerably less demanding properties. Moreover, note that in the previous statement the required absence of singularities is more precisely only a null completeness requirement: the spacetime manifold could still be timelike incomplete in a way compatible with the singularity theorems (I shall say more on that in Sects. 4 and 6). Recall that if stable causality holds then the spacetime is free from almost closed causal curves or other more complex forms of causality violation. Stated in a more precise way, stable causality implies K-causality [27], which assures that it is impossible to obtain a closed chain of events pairwise related by suitable closures and compositions of the usual causal relation $J^+$. The theorem can then be regarded as a singularity theorem. Indeed, rewritten in the form
"non-stably causal spacetimes either are non-chronological or admit lightlike lines"
it receives the following physical interpretation:
"if there is a form of causality violation on spacetime then either it is the worst possible, namely violation of chronology, or the spacetime is singular."
Regarded in this way the theorem clarifies the influence of causality violations on singularities.
In fact, if the violation of chronology is regarded as a sort of singularity then the theorem states that if there is no time function then the spacetime is singular in this broader sense.
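The logical shape of the main result and of its contrapositive, as described above, can be summarized schematically. The display below is only an informal restatement; the labels inside `\text{...}` are shorthand introduced here for illustration and are not notation used elsewhere in the paper.

```latex
% Schematic form of the main theorem (Theorem 6) and of its contrapositive.
% The \text{...} labels are informal shorthand, not the paper's notation.
\[
  \text{chronological} \;\wedge\; \text{no lightlike lines}
  \;\Longrightarrow\; \text{stably causal}
  \;\Longrightarrow\; \text{existence of a time function},
\]
\[
  \text{not stably causal}
  \;\Longrightarrow\;
  \text{not chronological} \;\vee\; \text{existence of a lightlike line}.
\]
```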
I refer the reader to [16,20] for most of the conventions used in this work. In particular, I denote with (M, g) a C r spacetime (connected, time-oriented Lorentzian manifold), r ∈ {3, . . . , ∞} of arbitrary dimension n ≥ 2 and signature (−, +, . . . , +). On M × M the usual product topology is defined. For convenience and generality I often use the causal relations on M × M in place of the more widespread point based relations I + (x), J + (x), E + (x) (and past versions). All the causal curves that we shall consider are future directed (thus also the past rays). The subset symbol ⊂ is reflexive, X ⊂ X . The limit curve theorem will be repeatedly used. The reader is referred to [15] for a sufficiently strong formulation which generalizes that contained in [3]. 2. Absence of Lightlike Lines In this section I consider the property of absence of lightlike lines and comment on its physical meaning. Two spacetimes belonging to the same conformal class (M, g) share the same lightlike geodesics up to reparametrizations, and the condition of maximality for the lightlike geodesic γ reads “there is no pair of events x, z ∈ γ , (x, z) ∈ I + ”, which makes no mention of the full metric structure and hence is independent of the representative of the conformal class. Thus, it is convenient to give the following conformally invariant definition, Definition 1. A lightlike line is an achronal inextendible causal curve. The definition implies, by achronality, that the causal curve is a lightlike geodesic and that it maximizes the Lorentzian length between any of its points. It is well known that [22, Chap. 10, Prop. 48] Proposition 1. If an inextendible lightlike geodesic admits a pair of conjugate events then it is not a lightlike line. It can be proved that the notion of conjugate points along a lightlike geodesic is conformally invariant [20], thus the previous proposition relates two conformally invariant properties. 
In particular note that the requirement "every lightlike geodesic has a pair of conjugate points" is stronger than the absence of lightlike lines, as shown for instance by 1+1 Minkowski spacetime with $x = 0$ and $x = 1$ identified. From the point of view of Lorentzian geometry any statement should be formulated so as to make its conformal invariance clear. For physical reasons some authors prefer to mention physically motivated but non-conformally invariant conditions. The consequence, however, is that several results have been formulated in an unnecessarily weak form, as the assumptions of the theorems are not really used.

Definition 2. An inextendible lightlike geodesic $\gamma$ of the spacetime $(M, g)$ satisfies the generic condition if at some $x \in \gamma$ the tangent vector $n$ to the curve is a generic vector, that is, $n^c n^d n_{[a} R_{b]cd[e} n_{f]} \neq 0$. A spacetime satisfies the null generic condition if every inextendible lightlike geodesic satisfies the generic condition.

A spacetime can be generic only if its dimension satisfies $n \ge 3$ (see [3, Cor. 2.10]). The precise sense in which the null generic condition is generic is clarified by [3, Prop. 2.15]. It is usually assumed on physical grounds that if a lightlike geodesic does not satisfy it, then an arbitrarily small perturbation of the metric along the geodesic path would make it true.
Definition 3. The spacetime (M, g) satisfies the timelike convergence condition if R(v, v) ≥ 0 for all timelike, and hence also for all lightlike, vectors v. The spacetime (M, g) satisfies the null convergence condition if R(v, v) ≥ 0 for all lightlike vectors v (cf. [11, p. 95] [3, Def. 12.8]). The null convergence condition is a consequence of the positivity of the energy density [11]. Definition 4. A spacetime (M, g) is null geodesically complete if every inextendible lightlike geodesic is complete. Proposition 2. In a spacetime (M, g) of dimension dim M ≥ 3, which satisfies the null convergence condition, the null generic condition and which is null geodesically complete, every inextendible lightlike geodesic admits a pair of conjugate events. In particular (M, g) does not have lightlike lines. Proof. It follows from the existence of some pair of conjugate points in the lightlike geodesics according to [11, Prop. 4.4.5] [3, Prop. 12.17]. This proposition has been improved by Tipler [30,31] and Chicone and Ehrlich [6] (see also Borde [5]) by weakening the null convergence condition to the averaged null convergence condition. This possibility is important because many quantum fields on spacetime determine a stress-energy tensor and hence a Ricci tensor which does not comply with the null convergence condition while it satisfies the averaged null convergence condition. Proposition 2 implies that the condition of absence of lightlike lines is quite reasonable from a physical point of view at least if the spacetime is assumed to be non-singular (see also the discussion in [11, Sect. 4.4]) or just null geodesically complete. In the next sections I will prove that the assumption of absence of lightlike lines has the effect of identifying the levels of the causal ladder between chronology and stable causality. In this respect the hard part will come with the inclusion of stable causality. 
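The way Proposition 2 feeds into the main theorem can be laid out as a chain of implications. This is a sketch assembled from Propositions 1 and 2 and from the statement of Theorem 6 announced in the Introduction; the `\text{...}` labels are informal shorthand introduced here.

```latex
% Chain of implications behind the physical interpretation (dim M >= 3).
% Labels are informal shorthand; each arrow cites the statement it relies on.
\[
  \left.
  \begin{array}{l}
    \text{null convergence condition}\\
    \text{null generic condition}\\
    \text{null geodesic completeness}
  \end{array}
  \right\}
  \;\overset{\text{Prop. 2}}{\Longrightarrow}\;
  \text{every inextendible lightlike geodesic has conjugate events}
  \;\overset{\text{Prop. 1}}{\Longrightarrow}\;
  \text{no lightlike lines},
\]
\[
  \text{chronology} \;\wedge\; \text{no lightlike lines}
  \;\overset{\text{Thm. 6}}{\Longrightarrow}\;
  \text{stable causality}.
\]
```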
A key role will be played by the property of K -causality introduced by Sorkin and Woolgar [27], and for the last step by a new property which I study in the next section. 3. Compact Stable Causality Recall that a non-total imprisoning spacetime is a spacetime for which there is no futureinextendible causal curve totally imprisoned in a compact set (future non-total imprisonment is equivalent to past non-total imprisonment [2,18]). It is known that every relatively compact open set in a non-total imprisoning spacetime [18] is stably causal when regarded as a spacetime with the induced metric [2]. Actually, this property characterizes non-total imprisonment, indeed we have Theorem 1. A spacetime (M, g) is non-total imprisoning iff for every relatively compact open set B, (B, g| B ) is stably causal. Proof. The implication to the right was proved by Beem [2]. To the left, assume (M, g) has a compact subset C in which some curve γ is future imprisoned. In [18] I proved that there is a lightlike line η contained in C such that η ⊂ Ω f (η), where Ω f (η) is the set of accumulation points in the future of η (in analogy with the set of ω-limit points of dynamical systems). Let B be a relatively compact open set such that C ⊂ B. Take
$q \in \eta$ and, given a convex neighborhood $U \ni q$, $U \subset B$, take $p \in \eta \cap J^-_{(U, g|_U)}(q)$. Take $g' > g$ in $B$ ($g'$ need not be defined on $B^C$), then $p \in I^-_{(U, g'|_U)}(q)$, but recall that $p \in \Omega_f(\eta)$ is an accumulation point for the future-inextendible $g'$-timelike curve given by the portion of $\eta$ which starts from $q$. Thus, since $I^-_{(U, g'|_U)}(q)$ is open, it is possible to construct a closed $g'$-timelike curve contained in $B$. The argument holds for any choice of $g'$, thus it is not true that for every relatively compact open set $B$, $(B, g|_B)$ is stably causal.
Note that non-total imprisonment is a quite weak property (it is implied by weak distinction [18]). A related problem is that of establishing whether, given an arbitrary compact set on spacetime, the light cones can be widened on it without introducing closed causal curves in the whole spacetime. If this is possible the spacetime satisfies a condition which is stronger than non-total imprisonment. We can define a new property.

Definition 5. A spacetime $(M, g)$ is compactly stably causal if for every relatively compact open set $B$ there is a metric $g_B \ge g$ such that $g_B > g$ on $B$, $g_B = g$ on $B^C$ and $(M, g_B)$ is causal.

Remark 1. There are some equivalent definitions, for instance: $(M, g)$ is compactly stably causal if for every compact set $C$ there is $g_C \ge g$ such that $g_C > g$ on $C$ and $(M, g_C)$ is causal. In order to prove the equivalence one has to take appropriate convex combinations of metrics with smooth coefficients.

Some natural questions arise, among them the placement of compact stable causality in the causal ladder of spacetimes. Before considering this question let me recall some notation and terminology [16]. Following Woodhouse [1,33] I denote with $A^+$ the closure of the causal relation, that is $A^+ = \bar J^+$, where, as usual for a subset of $M \times M$, the closure is with respect to the topology of $M \times M$. A spacetime is $A^\infty$-causal if there is no finite cyclic chain of distinct $A^+$-related events. This property is equivalent to the antisymmetry of the relation $A^{+\infty} = \bigcup_{i=1}^{+\infty} (A^+)^i$, which is the smallest transitive relation containing $A^+$. Analogously, a spacetime is $\bar A^\infty$-causal if the relation $\overline{A^{+\infty}}$ is antisymmetric. The relation $K^+$ is the smallest closed and transitive relation containing $J^+$, and the spacetime is K-causal if the relation $K^+$ is antisymmetric [27]. It is known that stable causality implies K-causality, although it is not known if these two conditions coincide [17]. We have

Theorem 2. K-causality implies $\bar A^\infty$-causality.

Proof.
Since $J^+ \subset K^+$, any causal relation obtained from $J^+$ by taking closures or by making the relation transitive through the replacement $R^+ \to \bigcup_{i=1}^{+\infty} (R^+)^i$ is still contained in $K^+$. Since $\overline{A^{+\infty}}$ has this form, $\overline{A^{+\infty}} \subset K^+$, thus K-causality implies $\bar A^\infty$-causality.

Remark 2. Given a relation $R^+$, the two idempotent operations given by (a) closure: $R^+ \to \bar R^+$, and (b) transitivization: $R^+ \to R^{+\infty} = \bigcup_{i=1}^{+\infty} (R^+)^i$, once alternately applied to $J^+$ generate a chain of relations all contained in $K^+$ whose first members are $J^+$, $A^+$, $A^{+\infty}$, $\overline{A^{+\infty}}$, .... By demanding the antisymmetry one obtains a ladder of causal properties whose first members are causality, A-causality, $A^\infty$-causality and $\bar A^\infty$-causality, all necessarily weaker than K-causality. If at a certain point two adjacent relations coincide then they coincide with $K^+$ as they are both closed and transitive and they are certainly
the smallest relations with this property. In this case the mentioned ladder of relations terminates where this coincidence occurs. As we shall see, the mentioned first levels are all different but it is not known if from some point on the levels would start to coincide, that is, if after a finite number of operations of closure and transitivization one would get $K^+$ and K-causality. Examples support the view that this coincidence occurs at a level which increases with the dimensionality of the spacetime.

Lemma 1. Let $\circ$ denote the composition of relations, then $J^+ \circ A^+ \subset A^+$ and $A^+ \circ J^+ \subset A^+$.

Proof. Let us consider the latter case, the former being analogous. Let $(x, y) \in J^+$ and $(y, z) \in A^+$, and let $\gamma_n$ be a sequence of causal curves of endpoints $(y_n, z_n) \to (y, z)$. Take $x_k \in I^-(x)$, $x_k \to x$, so that $x_k \ll y$ and, for sufficiently large $n$, $x_k \ll y_n \le z_n$; thus $(x_k, z_{n(k)}) \in I^+$ and in the limit $(x, z) \in A^+$.

Remark 3. In the next proof and in the proof of Lemma 3 we shall consider a sequence $\sigma_n$ of $g_n$-causal curves, where the metrics in the sequence $g_n$ may differ, $g \le g_{n+1} \le g_n$, and $g_n \to g$ pointwise. In this circumstance it is possible to apply the usual limit curve theorem [15, Theorem 3.1], originally formulated for the case $g_n = g$, provided the following idea is taken into account (see also [15, Corollary 2.9 and Remark 2.10]). Any such sequence $\sigma_n$ is also, for any chosen $k$ and for sufficiently large $n$, a sequence of $g_k$-causal curves. Let the curve $\sigma_n$ be parametrized with respect to the arc-length of a complete Riemannian metric $h$ on $M$. Let us start with $k = 1$. Under the assumptions of the limit curve theorem [15, Theorem 3.1] for the spacetime $(M, g_1)$ it is possible to infer the existence of a subsequence $\sigma_s$ which converges uniformly on compact subsets to a parametrized $g_1$-causal curve $\sigma$ (whether it is inextendible or not depends on the case).
Choosing $k > 1$, this same subsequence $\sigma_s$ is made of $g_k$-causal curves provided $s$ is taken sufficiently large, thus by the same limit curve theorem there is a further subsequence $\sigma_r$ which converges uniformly on compact subsets to a $g_k$-causal curve $\sigma'$. But clearly the parametrized curves $\sigma$ and $\sigma'$ are the same because the sequence $\sigma_r$ converges uniformly on compact subsets to both of them. Thus $\sigma$ is $g_k$-causal for every $k > 1$ and hence it is $g$-causal, as $g_k \to g$. In conclusion, it is possible to apply the limit curve theorem [15, Theorem 3.1] suitably generalized to include the case in which the converging sequence is made of curves which are causal with respect to different metrics.

Theorem 3. $\bar A^\infty$-causality implies compact stable causality.

Proof. In this proof, where some different metrics are introduced, the relations $J^+$, $A^+$, $A^{+\infty}$, and $\overline{A^{+\infty}}$ with no subscript are always understood with respect to the metric $g$. Suppose $(M, g)$ is $\bar A^\infty$-causal but non-compactly stably causal, then there is a relatively compact open set $B$ such that for every $g' \ge g$, $g' > g$ on $B$, $g' = g$ on $B^C$, $(M, g')$ is not causal. Let $g_n$ be a sequence of metrics $g_n \ge g$, $g_n > g$ on $B$, $g_n = g$ on $B^C$, $g_{n+1} \le g_n$, and $g_n \to g$ pointwise on the appropriate tensor bundle. For every choice of $n$, $(M, g_n)$ is not causal, and since $(M, g)$ is causal there must be a closed $g_n$-causal curve $\gamma_n$ intersecting $B$ (see Fig. 2). Let $p_n^0 \in \gamma_n \cap B$ and parametrize the curves with respect to a complete Riemannian metric $h$ so that $p_n^0 = \gamma_n(0)$ and the domain of the curves is $\mathbb{R}$ (that is, following the parametrization the curve winds over its own image).

Assume an infinite number of $\gamma_n$ is entirely contained in $\bar B$. Beem [2] has shown that there would be an inextendible $g$-causal limit curve contained in $\bar B$ in contradiction with the non-total imprisoning property of the spacetime (recall that A-causality implies
Fig. 2. The argument of the proof that $\bar A^\infty$-causality implies compact stable causality
distinction which implies the non-total imprisoning property). Thus without loss of generality we can assume that none of the $\gamma_n$ is entirely contained in $\bar B$. We conclude that $\gamma_n$ intersects $\dot B$ at least once to enter $B^C$. Without loss of generality we can also assume that $p_n^0 \to p^0 \in \bar B$.

Using the limit curve theorem [15, Theorem 3.1], through $p^0$ there passes a future inextendible (hence its $h$-length parameter has domain $(-\infty, +\infty)$) $g$-causal curve $\gamma^0$ which cannot pass through $p^0$ twice as it would imply a violation of causality for $(M, g)$. In particular, since $(M, g)$ is non-partial imprisoning, it escapes $\bar B$ at a last point $q^0 \in \dot B$ never to reenter $\bar B$. Let $\gamma_n^0$ be a subsequence of $\gamma_n$ which converges to $\gamma^0$ uniformly on compact subsets and let $s^0$ be the value of the parameter such that $q^0 = \gamma^0(s^0)$. Since $\gamma_n^0(s^0 + 2) \to \gamma^0(s^0 + 2) \notin \bar B$, pass to a subsequence denoted in the same way so that $\gamma_n^0(s^0 + 2) \notin \bar B$. Let $(\bar s_n^0, t_n^1) \ni s^0 + 2$ be the largest open connected interval so that $\gamma_n^0((\bar s_n^0, t_n^1)) \subset (\bar B)^C$. Define $\bar q_n^0, p_n^1 \in \dot B$ as $\bar q_n^0 = \gamma_n^0(\bar s_n^0)$ and $p_n^1 = \gamma_n^0(t_n^1)$. Let $p^1 \in \dot B$ be an accumulation point for $p_n^1$; without loss of generality we can assume $p_n^1 \to p^1$. Note that the segment $\gamma_n^0|_{[\bar s_n^0, t_n^1]}$ is entirely contained in $B^C$ and hence it is $g$-causal. Since $\bar s_n^0 \in [0, s^0 + 2]$, without loss of generality we can assume $\bar s_n^0 \to \bar s^0$ for some $\bar s^0$. Now, $\bar s^0 \le s^0$; indeed if $\bar s^0 > s^0$ then $\bar q_n^0 \in \bar B$ converges to $\gamma^0(\bar s^0)$, a point that does not belong to $\bar B$, which is impossible. In particular, it is possible to find a sequence $s_n^0$, $\bar s_n^0 < s_n^0 < s^0 + 2$, such that $s_n^0 \to s^0$. Then $q_n^0 = \gamma_n^0(s_n^0) \notin \bar B$ converges to $q^0$, and the $g$-causal sequence of curves $\gamma_n^0|_{[s_n^0, t_n^1]}$ has endpoints $(q_n^0, p_n^1) \in J^+$ such that $(q_n^0, p_n^1) \to (q^0, p^1)$, i.e. $(q^0, p^1) \in A^+$. Note that $(p^0, q^0) \in J^+$ as both points belong to $\gamma^0$, hence, by Lemma 1, $(p^0, p^1) \in A^+$.
The limit curve theorem [15] implies that $t_n^1 \to +\infty$; indeed otherwise we can assume that $t_n^1$ converges to some finite $t^1 \ge s^0 + 2$, so that $p^1$ would belong to the prolongation of $\gamma^0$, $p^1 = \gamma^0(t^1)$, which is impossible since $q^0 = \gamma^0(s^0)$ is the last point of $\gamma^0$ in $\bar B$. There is no compact set containing all the segments $\gamma_n^0|_{[\bar s_n^0, t_n^1]}$ because $\gamma^0$ escapes
every compact set never to return and for every $k > 0$, $\gamma_n^0(s_n^0 + k) \to \gamma^0(s^0 + k)$ because $t_n^1 \to +\infty$. As a consequence the pair $(p^0, p^1) \in A^+$ can be regarded as the limit of the pairs of endpoints of $g$-causal segments which are not all contained in a compact set. (In order to construct these segments take $\bar p_k^0 \in I^-(p^0)$, $\bar p_k^0 \to p^0$, so that $q^0 \in I^+(\bar p_k^0)$, and hence, since $I^+$ is open, $q_{n(k)}^0 \in I^+(\bar p_k^0)$ for a sufficiently large $n(k)$. Next follow the $g$-causal segment $\gamma_n^0|_{[s_n^0, t_n^1]}$ which is not all contained in a compact set; finally redefine the parametrization of the sequence $\bar p_k^0$ and pass if necessary to a subsequence so that $(\bar p_n^0, q_n^0) \in I^+$ and hence $(\bar p_n^0, p_n^1) \in J^+$ with $(\bar p_n^0, p_n^1) \to (p^0, p^1)$.) In particular, $p^0 \neq p^1$ since the spacetime is strongly causal.

Now, translate all the parametrizations of $\gamma_n^0$ so that $t_n^1$ gets replaced by 0. Repeat the previous steps where now $p^1$ plays the role of $p^0$ and the found sequence $\gamma_n^1$ is a reparametrized subsequence of $\gamma_n^0$. Continue in this way, defining at each step analogous subsequences and events so that $p^k \in \bar B$, $(p^k, p^{k+1}) \in A^+$, $p^k \neq p^{k+1}$, and for each $k$ there is a sequence of $g$-causal curves, not all contained in a compact set, so that the endpoints of the sequence converge to $(p^k, p^{k+1})$. Note that for every pair of positive integers $a < b$, $p^a \neq p^b$, otherwise there would be a closed chain of $A^+$-related events in contradiction with $A^\infty$-causality, and $(p^a, p^b) \in A^{+\infty}$.

Since $\bar B \times \bar B$ is compact, there is a subsequence denoted $(p^{k_s}, p^{k_s+1})$ such that $(p^{k_s}, p^{k_s+1}) \to (x, z)$ as $s \to +\infty$. Moreover, $x \neq z$ because otherwise for every relatively compact causally convex neighborhood $U \ni x$, for sufficiently large $s$, $p^{k_s}, p^{k_s+1} \in U$, and the sequence of $g$-causal curves not all contained in a compact set, whose endpoints converge to $(p^{k_s}, p^{k_s+1})$, would contradict the causal convexity of $U$. Since $A^+$ is closed, $(x, z) \in A^+$ and $x \neq z$.
Since $p^{k_s}$ is a subsequence of $p^k$, for every $s$, $k_s + 1 \le k_{s+1}$, thus $(p^{k_s+1}, p^{k_{s+1}}) \in A^{+\infty}$ and in the limit $s \to +\infty$, $(z, x) \in \overline{A^{+\infty}}$. As a consequence $(M, g)$ is not $\bar A^\infty$-causal, which is the desired contradiction.

Theorem 4. Compact stable causality implies $A^\infty$-causality.

Proof. Assume the spacetime is compactly stably causal, and suppose it is not $A^\infty$-causal; then there is a finite closed chain of $A^+$-related events $(x_i, x_{i+1}) \in A^+$, $i = 1, \ldots, n$, $x_{n+1} = x_1$. Consider a relatively compact open set $B$ which contains all $x_i$, $i = 1, \ldots, n$, and let $g_B \ge g$, $g_B > g$ on $B$, $g_B = g$ on $B^C$. We want to prove that $A^+ \cap (B \times B) \subset J^+_{(M, g_B)}$, from which it follows that $(M, g_B)$ is not causal whatever the choice of $g_B$, and hence $(M, g)$ is not compactly stably causal, the desired contradiction. Let $(y, z) \in A^+$, $y, z \in B$; then by the limit curve theorem either $(y, z) \in J^+ \subset J^+_{(M, g_B)}$ or there are a future inextendible $g$-causal curve $\sigma^y$ starting from $y$, and a past inextendible $g$-causal curve $\sigma^z$ ending at $z$, such that for every $y' \in \sigma^y \setminus \{y\}$ and $z' \in \sigma^z \setminus \{z\}$, $(y', z') \in A^+$. At least a segment of $\sigma^y$ near $y$ is timelike for $(M, g_B)$ and analogously for $\sigma^z$, thus $(y, y') \in I^+_{(M, g_B)}$ and $(z', z) \in I^+_{(M, g_B)}$; finally, since $(y', z') \in A^+ \subset \bar J^+_{(M, g_B)}$, we have $(y, z) \in I^+_{(M, g_B)}$.

Remark 4. All the properties of the previous theorems differ. In [16] I gave an example of non-K-causal $A^\infty$-causal spacetime. A closer inspection proves that it is actually non-$\bar A^\infty$-causal but compactly stably causal. Moreover, it is possible to construct an example, similar to that of [16], which is $\bar A^\infty$-causal but non-K-causal (simply repeat the figure of [16] three times vertically, and then identify the holes cyclically). The
Fig. 3. A A∞ -causal but non-compactly stably causal spacetime. In order to construct the spacetime start from R × S 1 × R of coordinates (t, θ, z), θ ∈ [0, 1], and metric g = −dt 2 + dθ 2 + dz 2 , remove two spacelike surfaces and identify, after a translation by an irrational number, two spacelike surfaces as done in the figure . The coordinates (x, y) have been introduced on the identified surfaces so as to make the identification clear. The spacetime is non-orientable but this feature is not essential. The spacetime is non-compactly stably causal since any enlargement of the metric on K gives closed causal curves. Thanks to the translation by an irrational number there cannot be closed chains of A+ related events
properties A∞ -causality and compact stable causality differ because of the spacetime example of Fig. 3. A consequence of these examples is the perhaps surprising fact that compact stable causality differs from stable causality (see again the example of [16]). This fact means that the behavior of the light cones near infinity is important in order to determine if a spacetime is properly compactly stably causal or not.
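The results of this section can be collected into a single chain. The display below is a summary sketch based on Theorems 2, 3 and 4, on the known implication from stable causality to K-causality, and on Remark 2. Here $\bar A^\infty$-causality denotes antisymmetry of the closure $\overline{A^{+\infty}}$ and $A^\infty$-causality denotes antisymmetry of $A^{+\infty}$. According to Remark 4 and Fig. 3, none of the three middle arrows can be reversed.

```latex
% Nested relations generated from J^+ by alternating closure and transitivization
\[
  J^+ \subset A^+ = \bar J^+ \subset A^{+\infty}
  \subset \overline{A^{+\infty}} \subset \cdots \subset K^+ .
\]
% Resulting ladder of causality conditions (antisymmetry of each relation)
\[
  \text{stable causality}
  \Longrightarrow \text{$K$-causality}
  \overset{\text{Thm. 2}}{\Longrightarrow} \text{$\bar A^\infty$-causality}
  \overset{\text{Thm. 3}}{\Longrightarrow} \text{compact stable causality}
  \overset{\text{Thm. 4}}{\Longrightarrow} \text{$A^\infty$-causality}
  \Longrightarrow \text{$A$-causality}
  \Longrightarrow \text{causality}.
\]
```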
4. The Proof and Some Physical Considerations

I start with a result due to Hawking [12] [11, Prop. 6.4.6] (he proved it with the stronger but inessential assumption that every inextendible lightlike geodesic admits a pair of conjugate points).

Lemma 2. A chronological spacetime without lightlike lines is strongly causal.

Proof. Recall that a spacetime is strongly causal if for every $x \in M$, and for every neighborhood $U \ni x$, there exists a neighborhood $V \subset U$, $x \in V$, such that any future-directed causal curve with endpoints at $V$ is entirely contained in $U$ (see for instance [20, Lemma 3.22]). Thus if $(M, g)$ were not strongly causal there would be a point $x$, a neighborhood $U \ni x$, and a sequence of causal curves $\gamma_n$ of starting event $x_n$, ending event $z_n$, such that $x_n \to x$, $z_n \to x$, and the curves $\gamma_n$ are not entirely contained in $U$. Hence there are the conditions required by the limit curve theorem [15, Theorem 3.1, case (2)], which implies the existence of a lightlike line passing through $x$, a contradiction.

A fundamental step in the proof is

Theorem 5. If a spacetime does not have lightlike lines then the relation $A^+ = \bar J^+$ is transitive, that is $K^+ = A^+$. Moreover, if the spacetime is also chronological then the spacetime is K-causal.

Proof. Let us prove the transitivity of $A^+$. Take two pairs $(x, y) \in A^+$ and $(y, z) \in A^+$ and two sequences of causal curves $\sigma_n$ of endpoints $(x_n, y_n) \to (x, y)$, and $\gamma_n$ of endpoints $(y_n, z_n) \to (y, z)$. Apply the limit curve theorem [15] to both sequences, and
consider first the case in which the limit curve in both cases does not connect the limit points. By the limit curve theorem, $\sigma_n$ has a limit curve $\sigma$ which is a past inextendible causal curve ending at $y$. Analogously $\gamma_n$ has a limit curve $\gamma$ which is a future inextendible causal curve starting from $y$. The inextendible curve $\gamma \circ \sigma$ cannot be a lightlike line, thus there are points $x' \in \sigma \setminus \{y\}$, $z' \in \gamma \setminus \{y\}$ such that $(x', z') \in I^+$ and (pass to a subsequence) points $x'_n \in \sigma_n$, $x'_n \to x'$, and $z'_n \in \gamma_n$, $z'_n \to z'$. Since $I^+$ is open, for sufficiently large $n$, $(x'_n, z'_n) \in I^+$, hence $(x_n, z_n) \in I^+$, and finally $(x, z) \in \bar I^+ = A^+$. If both limit curves join the limit points then clearly $(x, z) \in J^+ \subset A^+$. If, say, $\sigma$ joins $x$ to $y$ but $\gamma$ does not join $y$ to $z$, take $x_k \in I^-(x)$, $x_k \to x$, so that $x_k \ll y$ and, for sufficiently large $n(k)$, $x_k \ll y_{n(k)} \le z_{n(k)}$; thus in the limit $(x, z) \in A^+$. The remaining case is analogous. Thus $A^+$ is closed and transitive, hence $A^+ = K^+$.

Assume $(M, g)$ is chronological, then by Lemma 2 $(M, g)$ is strongly causal. The relation $A^+$ is antisymmetric; indeed, let $(x, y) \in A^+$ and $(y, x) \in A^+$, $x \neq y$, and let $\sigma_n$ of endpoints $(x_n, y_n)$ and $\gamma_n$ of endpoints $(y_n, z_n)$ be sequences of causal curves whose endpoints converge to the initial pairs, $(x_n, y_n) \to (x, y)$, $(y_n, z_n) \to (y, x)$. Then we repeat the argument used above, that is, we apply the limit curve theorem to the accumulation point $y$. Call $\sigma$ the limit causal curve for $\sigma_n$ and analogously let $\gamma$ be the limit causal curve for $\gamma_n$. If $\sigma$ connects $x$ to $y$ and $\gamma$ connects $y$ to $x$ then there is a closed causal curve on spacetime, a contradiction. Let $U \ni x$, $V \ni y$ be two disjoint causally convex neighborhoods. If $\sigma$ connects $x$ to $y$ but $\gamma$ does not connect $y$ to $x$, then it is possible to argue as above, i.e. take $x_k \in I^-(x)$, $x_k \to x$, then for sufficiently large $n$, which we can choose so that $n(k) > k$, $y_{n(k)} \in I^+(x_k) \cap V$, from which it follows that there is a sequence of causal curves of endpoints $x_k$, $z_{n(k)}$, intersecting $V$.
But (xk , z n(k) ) → (x, x) thus strong causality is violated at x. The case in which γ connects y to x is analogous. The remaining case is that in which σ is past-inextendible and γ is future-inextendible. Then γ ◦ σ is an inextendible causal curve which by assumption is not a lightlike line. Moreover, since strong causality holds, this curve is not partially imprisoned in any compact set, thus using the same argument as above (i.e. taking advantage of the chronality of γ ◦ σ ) it follows that there is a sequence of causal curves of endpoints xn , z n not all contained in a compact set. Again there is a contradiction with the strong causality at x. Clearly, if we could prove that K -causality is equivalent to stable causality then the main theorem would follow. Seifert [24], even before the introduction of K -causality, gave an argument which would have implied the equivalence. Unfortunately, he only sketched the proof and a recent more detailed study [17] has shown that those arguments were inconclusive. Fortunately, however, it is possible to circumvent this difficulty, and avoid a direct proof of the equivalence between stable causality and K -causality, by working on compact stable causality. Indeed, the previous result will be used in the following weaker form: Corollary 1. A chronological spacetime without lightlike lines is compactly stably causal. Now, the idea is to consider the property “(M, g) is compactly stably causal and does not admit lightlike lines” to show that it is invariant under enlargement of the light cones over compact sets (see Lemma 4). Then it is possible to enlarge the light cones in a sequence of compact sets that cover M so as to obtain a causal spacetime with strictly larger light cones (Theorem 6). Lemma 3. On (M, g) let B be a relatively compact open set, let gn be a sequence of metrics gn ≥ g, gn > g on B, gn = g on B C , gn+1 ≤ gn , and gn → g pointwisely on
the appropriate tensor bundle. If $(M, g)$ does not have lightlike lines then all but a finite number of $(M, g_n)$ do not have lightlike lines.

Proof. If not we can, passing to a subsequence, assume that all $(M, g_n)$ have lightlike lines. Denote $\gamma_n$ a respective sequence of lightlike lines and assume there is one, say $\gamma_{\bar n}$, which does not intersect $B$. Since $g_{\bar n}$ and $g$ coincide outside $B$, $\gamma_{\bar n}$ is a $g$-causal curve. Also it is $g$-achronal because if there are two points $p, q \in \gamma_{\bar n}$ such that $(p, q) \in I^+_g$ then, as $g \le g_{\bar n}$, $(p, q) \in I^+_{g_{\bar n}}$, which is impossible because $\gamma_{\bar n}$ is a lightlike line on $(M, g_{\bar n})$. But $\gamma_{\bar n}$ cannot be $g$-achronal as it would be a lightlike line of $(M, g)$, thus the overall contradiction proves that all $\gamma_n$ intersect $B$. Without loss of generality we can assume (pass to a subsequence if necessary) that there are $x_n \in B \cap \gamma_n$, and $x \in \bar B$ such that $x_n \to x$. By the limit curve theorem [15] there is an inextendible $g$-causal curve $\eta$ passing through $x$. If $\eta$ is not $g$-achronal there are $y, z \in \eta$ such that $(y, z) \in I^+_g \subset I^+_{g_n}$ for every $n$. But since $y$ and $z$ are limit points of the sequence $\gamma_n$ and $I^+_g$ ($\subset I^+_{g_n}$) is open, some of the curves $\gamma_n$ are not lightlike lines. The contradiction proves that $\eta$ is not only $g$-causal but also $g$-achronal, thus it is a lightlike line. Again this is impossible, thus the assumption that an infinite number of $(M, g_n)$ does admit lightlike lines has led to a contradiction.

Lemma 4. If $(M, g)$ is compactly stably causal and without lightlike lines then for every relatively compact open set $B$ it is possible to find a metric $g_B \ge g$ such that $g_B > g$ on $B$, $g_B = g$ outside $B$, and $(M, g_B)$ is compactly stably causal and without lightlike lines.

Proof. Since $(M, g)$ is compactly stably causal we can find $\tilde g_B$ such that $\tilde g_B > g$ on $B$, $\tilde g_B = g$ outside $B$ and $(M, \tilde g_B)$ is causal. Define $g_n = (1 - \tfrac{1}{n})\, g + \tfrac{1}{n}\, \tilde g_B$ so that $g \le g_n \le \tilde g_B$ satisfies the assumptions of the previous lemma.
Thus there is a certain element of the sequence, denote it $g_B$, such that $(M, g_B)$ does not have lightlike lines and, since $g_B \le \tilde g_B$, $(M, g_B)$ is causal. But every causal spacetime without lightlike lines is compactly stably causal (Corollary 1), thus the thesis.

Theorem 6. If $(M, g)$ is chronological and without lightlike lines then it is stably causal.

Proof. Let $h$ be an auxiliary complete Riemannian metric, $x_0 \in M$, and let $B_k = B(x_0, k)$ be the open balls of $h$-radius $k$ centered at $x_0$. Define $g_1 = g$. By the previous lemma it is possible to find a metric $g_2 > g_1$ on $B_2$, $g_2 = g_1$ outside $B_2$, such that $(M, g_2)$ is compactly stably causal and without lightlike lines. Next repeat the argument for the relatively compact open set $B_3$ with respect to the spacetime $(M, g_2)$: there is a metric $g_3 > g_2$ on $B_3$, $g_3 = g_2$ ($= g$) outside $B_3$, such that $(M, g_3)$ is compactly stably causal and without lightlike lines. Continue in this way and find a sequence of metrics $g_{k+1} \ge g_k \ge g$, $g_{k+1} > g_k$ on $B_{k+1}$. The open sets $A_1 = B_2$, $A_k = B_{k+1} \setminus \bar B_{k-1}$ for $k \ge 2$, cover $M$. Let $\{\chi_k\}$ be a partition of unity so that the support of $\chi_k$ is contained in $A_k$, and define $\tilde g = \sum_{k=1}^{+\infty} \chi_k\, g_{k+2}$ (the sum has at most two non-vanishing terms at each point); then $\tilde g > g$, moreover at $x \in B_k$, $\tilde g(x) \le g_{k+2}(x)$, because for $n > k$, $\chi_n(x) = 0$ (see Fig. 4). But $(M, \tilde g)$ is causal because otherwise there is a closed $\tilde g$-causal curve $\sigma$ whose image, being compact, is entirely contained in $B_s$ for some $s$. Since $\tilde g \le g_{s+2}$ on $B_s$, this curve is $g_{s+2}$-causal, which contradicts the (compact stable) causality of $(M, g_{s+2})$. Thus since $(M, \tilde g)$ is causal and $\tilde g > g$, $(M, g)$ is stably causal.

Remark 5. This result is sharp in the sense that causal continuity cannot replace stable causality in the statement of the theorem. Indeed, the 1+1 spacetime $\mathbb{R} \times S^1$ of coordinates $(t, \theta)$, $\theta \in [0, 2]$, metric $ds^2 = -dt^2 + d\theta^2$ with the timelike segment $\theta = 1$,
Fig. 4. The construction of the metric $\tilde g > g$ and of the causal spacetime $(M, \tilde g)$ in the proof of Theorem 6
0 ≤ t ≤ 1, removed does not have lightlike lines, is chronological, and thus stably causal (t is a time function) but it is not reflective and hence it is not causally continuous. Analogously, chronology can not be weakened to non-total viciousness indeed, for instance, the spacetime of Fig. 5 is non-totally vicious, does not have lightlike lines but is not even chronological. Nevertheless, it is possible to relax slightly the chronology condition by asking, for instance, that the chronology violating set be confined in a compact set or even more weakly to have a compact boundary (see the next section). Recall that a time function t : M → R is a continuous function which increases on every causal curve, that is, if γ : B → M is a causal curve, b1 < b2 implies t (γ (b1 )) < t (γ (b2 )). Hawking proved, improving previous results by Geroch [7], that stable causality holds if and only if the spacetime admits a time function [9,11]. Actually the time function can be chosen smooth with timelike gradient [4] (see also [25]). Thus a corollary of Theorem 6 is Theorem 7. If (M, g) is chronological and without lightlike lines then it admits a time function (which can be chosen smooth with timelike gradient). Recall also that if t is a time function then Fa = { p : t ( p) > a} is an open future set and F˙a = { p : t ( p) = a}. In particular, Sa = F˙a is an acausal boundary (hence edgeless), that is, Sa is a partial Cauchy hypersurface [11]. The great advantage of Theorem 7, is that it allows to considerably weaken the causality and boundary conditions underlying most singularity theorems. These theorems assume geodesic completeness along with other conditions (which often imply the absence of lightlike lines) and derive from them some contradiction. Among the additional conditions most singularity theorems assume some of the following: (a) global hyperbolicity, (b) a partial Cauchy hypersurface, (c) a compact achronal edgeless set, (d) a trapped set. 
Often these global assumptions are made without any further justification, in fact Senovilla in his review [26, pp. 803-8] expressed the opinion that these boundary assumptions may represent the main weak point of singularity theorems. Fortunately, Theorem 7 shows that in some respect the additional conditions are often redundant, indeed if they include chronology and they imply the absence of lightlike lines (as in Hawking and Penrose’s singularity theorem) then they also justify the presence of a foliation of partial Cauchy hypersurfaces. Thus Theorem 7 can be used to weaken the global assumptions made in singularity theorems.
E. Minguzzi
Fig. 5. The figure displays 1+1 Minkowski spacetime with two spacelike slices identified and a triangle removed. If the angle at the top of the triangle is small enough there are no past lightlike rays
4.1. Absence of lightlike rays. In this section I am going to consider the implications of the absence of lightlike rays. Recall that a future ray is a future-inextendible causal curve which is achronal. Past rays are defined analogously. Choosing a point c ∈ (a, b) in a lightlike line γ : (a, b) → M, the portion $\gamma|_{[c,b)}$ is a lightlike future ray while $\gamma|_{(a,c]}$ is a lightlike past ray, thus
Lemma 5. The absence of lightlike future (or past) rays implies the absence of lightlike lines.
Thus, assuming the absence of lightlike future rays one expects to obtain a stronger property than stable causality. Indeed, we have (see also the related result [29, Prop. 4])
Theorem 8. If (M, g) is chronological and without future lightlike rays then it is globally hyperbolic (and the only TIP is M). An analogous past version also holds.
Proof. Since there are no future rays then there are no lightlike lines and the spacetime is stably causal and admits a time function t. Let p ≤ q, we have to prove that $C = J^-(q) \cap J^+(p)$ is compact. Take $r \in I^+(q)$ so that a = t(r) > t(q), and consider the partial Cauchy surface $S_a$. Since $C \subset I^-(r)$, all the points in C stay in the past set $P_a = \{x : t(x) < a\}$. The set $H^-(S_a)$ is generated by future lightlike rays (as $S_a$ is edgeless) and since by assumption there is no future lightlike ray, $H^-(S_a)$ is empty. Thus $C \subset P_a \subset D^-(S_a) \subset D(S_a)$, the last set being globally hyperbolic. Note that no causal curve from p can escape $D(S_a)$ and hence $P_a$ to return to q, as t is a time function. Hence $C = J^-_{D(S_a)}(q) \cap J^+_{D(S_a)}(p)$ is compact. Finally, (M, g) has no TIP but M because the boundary of any TIP is generated by future lightlike rays.
Note that in Theorem 8 chronology can not be weakened to non-total viciousness, i.e. to the condition $C \neq M$, where C is the chronology violating set. Indeed, Fig. 5 gives a counterexample (past case).
Nevertheless, if one replaces the absence of future lightlike rays with the absence of lightlike rays then the proof of Theorem 12 will show that a non-totally vicious spacetime is chronological (by showing that $\dot C$, if non-empty, contains a lightlike ray), and thus one has:
Theorem 9. If (M, g) is non-totally vicious and without lightlike rays then it is globally hyperbolic (and there are no TIP or TIF but M).
Chronological Spacetimes without Lightlike Lines are Stably Causal
4.2. Physical considerations. Theorem 8 can be used as a singularity theorem though the null convergence condition is not enough to guarantee that a future-complete future-inextendible (affinely parametrized) lightlike geodesic γ : [a, +∞) → M admits a pair of conjugate points. A sufficient condition is Tipler's [29, Prop. 1]
$$\lim_{s\to+\infty}\Big[(s-a)\int_s^{+\infty} R_{cd}\, n^c n^d \, ds'\Big] > 1, \qquad (1)$$
where $n^c$ is the tangent vector to γ at γ(s). Weaker conditions were also considered by Borde [5]. These conditions physically state that the energy density should not drop off too sharply. The assumption is reasonable in those cases where the universe is contracting (or taking the past version, expanding) as one would expect the energy density to increase rather than decrease. Thus we get the following singularity theorem (past version)
Theorem 10. The following conditions cannot all hold:
(i) (M, g) is past null geodesically complete,
(ii) (M, g) is chronological,
(iii) (M, g) is non-globally hyperbolic,
(iv) some energy condition which implies the presence of conjugate points in past-complete past-inextendible lightlike geodesics (e.g.
$$\lim_{s\to-\infty}\Big[(b-s)\int_{-\infty}^{s} R_{cd}\, n^c n^d \, ds'\Big] > 1$$
holds on any past-inextendible lightlike geodesic γ : (−∞, b) → M).
The nice feature of this theorem is that there is essentially no boundary assumption and the causality conditions are quite weak. There is no assumption on the existence of partial Cauchy surfaces or trapped sets. Of course, the strongest assumption which must be physically justified is made in (iv) but the local expansion of the Universe, together with the cosmic background radiation, seems to support it. Then the theorem states that under the said energy conditions the spacetime is either globally hyperbolic or has singularities. Used in conjunction with Penrose's (1965), and Hawking and Penrose's (1970) singularity theorems [11] it allows one to characterize quite precisely what a spacetime looks like if it contains trapped surfaces and it is still null geodesically complete. We have
Theorem 11. Let (M, g) be a spacetime of dimension greater than 2. If
(i) (M, g) is null geodesically complete,
(ii) (M, g) is chronological,
(iii) there is a closed future trapped surface,
(iv) the timelike convergence, the generic condition, together with some energy condition which implies the presence of conjugate points in past-complete past-inextendible lightlike geodesics (e.g.
$$\lim_{s\to-\infty}\Big[(b-s)\int_{-\infty}^{s} R_{cd}\, n^c n^d \, ds'\Big] > 1$$
holds on any past-inextendible lightlike geodesic γ : (−∞, b) → M),
then the spacetime is globally hyperbolic with compact space slices and has an incomplete timelike line.
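As an illustration of the energy condition in (iv), the following worked computation evaluates the limit for a power-law profile. The profile, the constant k > 0 and the exponent p > 1 are assumptions chosen only for illustration; they are not taken from the text. The computation suggests how the condition expresses that the energy density "should not drop off too sharply": a decay like $(b-s)^{-2}$ is the borderline case.

```latex
% Assumed profile along \gamma : (-\infty, b) \to M, valid for s' sufficiently negative:
%   R_{cd} n^c n^d (s') = k / (b - s')^p,   with k > 0 and p > 1 (hypothetical constants).
\begin{align*}
\int_{-\infty}^{s} \frac{k}{(b-s')^{p}}\, ds'
  &= \frac{k}{(p-1)\,(b-s)^{p-1}}, \\
(b-s)\int_{-\infty}^{s} \frac{k}{(b-s')^{p}}\, ds'
  &= \frac{k}{p-1}\,(b-s)^{2-p}, \\
\lim_{s\to-\infty}\Big[(b-s)\int_{-\infty}^{s} \frac{k}{(b-s')^{p}}\, ds'\Big]
  &= \begin{cases}
       +\infty, & 1 < p < 2,\\
       k,       & p = 2,\\
       0,       & p > 2.
     \end{cases}
\end{align*}
% Under this assumption the condition "> 1" holds for 1 < p < 2, holds for p = 2
% exactly when k > 1, and fails for p > 2.
```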
Proof. The conditions (i), (ii) and (iv) imply (v): the spacetime is globally hyperbolic (Theorem 10). The Cauchy hypersurfaces are either compact or non-compact. In the latter case (iii) and (v) imply, by the Penrose singularity theorem, that the spacetime is null geodesically incomplete. Thus (vi): the Cauchy hypersurfaces are compact. The proof of the Hawking-Penrose theorem implies that (i), (ii), (iii) or (vi), and (iv) imply that there is an incomplete timelike line.
Since the existence of trapped surfaces is a quite natural consequence of general relativity if matter is concentrated enough, Theorem 8 supports the global hyperbolicity of the spacetime (and a closed space) provided it is null geodesically complete. Since the conditions are quite reasonable one concludes that the spacetime is either null geodesically incomplete or timelike geodesically incomplete (or both). Finally I would like to stress that the assumption of null geodesic completeness does not lead to a spacetime picture which contradicts observations. Thus Theorems 8 and 6 may have a “positive” role in proving the good causal property of spacetime rather than being used only to prove its singularity. As a matter of fact they can be used to do both (Theorem 11).
5. The Non-Chronological Case
So far we have studied the consequence of the absence of lightlike lines under the assumption of chronology. Let us consider the other possibility, namely non-chronological spacetimes. Denote with C the chronology violating set, with $C_\alpha$, $C = \bigcup_\alpha C_\alpha$, its (open) components and with $B^k_\alpha$ the (closed) components of the respective boundaries $\dot C_\alpha = \bigcup_k B^k_\alpha$. The next result joins two theorems, one by Kriele [14, Theorem 4] who improved previous results by Tipler [29] and the other by the author [15].
Theorem 12. A non-chronological spacetime without lightlike lines is either totally vicious (i.e.
C = M) or it has a non-empty chronology violating set C, the boundaries $\dot C_\alpha$ of the components $C_\alpha$ are disjoint and the components $B^k_\alpha$ of those boundaries are all non-compact. In particular non-totally vicious spacetimes without lightlike lines are non-compact.
For the proof that the sets $\dot C_\alpha$ are disjoint I refer the reader to [15]. Instead, I elaborate on Kriele’s argument by giving a slightly different proof that the boundaries $B^k_\alpha$ are non-compact. Indeed, I can give a shorter proof thanks to the limit curve theorem contained in [15] and to the results on totally imprisoned curves contained in [18]. Recall that in the chronology violating set C, Carter’s equivalence relation p ∼ q iff $p \ll q \ll p$ gives rise to open equivalence classes, moreover, since C is open, if $x \in \dot C$ it cannot be x ∈ C. We denote by $\Omega_f(\eta) = \bigcap_{t\in\mathbb{R}} \overline{\eta_{[t,+\infty)}}$ the set of accumulation points in the future of the causal curve η, and analogously in the past case. This set is always closed, moreover, it is non-empty iff the curve is partially future imprisoned in a compact set [18].
Proof. Assume that $B^k_\alpha \subset \dot C_\alpha$ is compact and let $x \in B^k_\alpha$. Let $x_n \in C_\alpha$ be such that $x_n \to x$, and let $U \ni x$ be a convex set. There are closed timelike curves $\sigma_n \subset C_\alpha$ of starting and ending point $x_n$, which are necessarily not entirely contained in U (every convex set is causal). Let z = x, then by the limit curve theorem [15] (point 2) there are two cases (corresponding to 0 < b < +∞, or b = +∞ in that reference).
Fig. 6. If (M, g) has a non-empty chronology violating set and has no lightlike line, $(N, g|_N)$, with N any component of the shaded region $M \setminus \bar C$, may admit lightlike lines (e.g. the causal curves $\gamma_1$ or $\gamma_2$)
In the first case there is a closed continuous causal curve $\gamma \in \bar C_\alpha$ passing through x. It must be achronal since if p, q ∈ γ, $p \ll q$, then $x \le p \ll q \le x$ and hence $x \ll x$ which implies x ∈ C, a contradiction. Thus γ is a geodesic with no discontinuity in the tangent vectors at x. It can be extended to a lightlike line γ by making infinite rounds over γ (note that in this case $\Omega_f(\gamma) = \Omega_p(\gamma) = \gamma$). In the second case there are a future inextendible continuous causal curve $\gamma^x \subset \bar C_\alpha$ starting at x and a past inextendible continuous causal curve $\gamma^z \subset \bar C_\alpha$ ending at x. If $\gamma^x \cap I^+(x) \neq \emptyset$ and $\gamma^z \cap I^-(x) \neq \emptyset$ then for sufficiently large n, since $I^+$ is open, it would be possible to complete a segment of $\gamma_n$ to a closed timelike curve passing through x, hence x ∈ C, a contradiction. Thus $\gamma^x$ or $\gamma^z$, say $\gamma^x$, is a lightlike ray. In particular $\gamma^x$ being a lightlike ray is achronal and hence can not enter $C_\alpha$, thus $\gamma^x \subset B^k_\alpha$. Now, since $B^k_\alpha$ is compact and $B^k_\alpha \cap C = \emptyset$, results on totally imprisoned causal curves can be applied [18, Theorem 3.6]. In particular there is a minimal non-empty closed achronal set $\Omega \subset \Omega_f(\gamma^x) \subset B^k_\alpha$ such that through each point of Ω there passes one and only one lightlike line; this line is entirely contained in Ω and for every line α ⊂ Ω, $\Omega_f(\alpha) = \Omega_p(\alpha) = \Omega$. Just the existence of a lightlike line suffices to conclude the proof that the boundaries $B^k_\alpha$ are non-compact.
The last statement in a slightly weaker form has been first obtained by Tipler [29, Theorem 7]. It follows from the observation that a compact spacetime has a non-empty chronology violating set C (see [11, Prop. 6.4.2]) thus either C = M or $\dot C$ is non-empty and compact in contradiction with the absence of lightlike lines.
These results restrict the possible chronology violation in spacetimes without lightlike lines, for instance they state that the chronology violation must extend to infinity.
In principle this fact does not mean that a chronology violating region can not develop from regular data. For this to be the case stronger global assumptions than the only absence of lightlike lines should be assumed [13,29]. Instead of trying to remove chronology violating sets altogether from the spacetime, it is natural to consider what Theorem 6 may say in the cases of chronology violation. The idea is that if (M, g) has a non-empty chronology violating set but $M \neq \bar C$, then the spacetime $(N, g|_N)$, where N is any connected component of $M \setminus \bar C$, has empty chronology violating set. However, even if (M, g) does not have lightlike lines, $(N, g|_N)$ may have lightlike lines (see Fig. 6). This may happen because a lightlike line γ for $(N, g|_N)$ is not inextendible in M, and thus once extended it may enter the chronology violating set
(the geodesic γ2 in the figure). Another possibility is that while γ is also inextendible in M, the enlargement of the spacetime enlarges the set of timelike curves and hence the possibilities that γ is not a line (the geodesic γ1 in the figure). Thus it is not possible to infer from the absence of lightlike lines for (M, g) the same property for (N , g| N ). Actually, neither the converse is true, the Misner spacetime (with region I = N , see Fig. 32 of [11]) does not have lightlike lines but its analytic extension (I+II), where II is the chronology violating set for I+II, does admit a lightlike line given by the Misner boundary. There is therefore no immediate way to apply Theorem 6 to the non-chronological case apart from that of motivating on physical grounds that some component N does not have lightlike lines. 6. Conclusions A proof has been given that chronological spacetimes without lightlike lines are stably causal, and that non-totally vicious spacetimes without lightlike rays are globally hyperbolic (together with some other variations). The properties: (i) chronology, (ii) null convergence condition and (iii) null generic condition, are quite reasonable from a physical point of view, moreover, for our purposes (ii) can be weakened to the averaged null convergence condition. Assuming (i), (ii) and (iii), the result of the title of this work translates into the physical statement that null geodesically complete spacetimes are stably causal and therefore admit a time function. Since the existence of some partial Cauchy surface is assumed in most singularity theorems, this result can be used to weaken the assumptions of those theorems. This result may also prove important when applied to the study of the real Universe. Indeed, let us recall that Hawking’s and Hawking and Penrose’s theorems [11] suggest the existence of an incomplete causal curve which however could well be timelike. 
In other words our Universe may perhaps be geodesically null complete but timelike incomplete, in which case the main theorem could be applied in the “positive” way to infer the existence of a time function for the Universe. In fact Theorem 11 shows that the assumption of null geodesic completeness leads to consequences that do not contradict physical observations. Penrose’s singularity theorem seems to go against this conclusion as it predicts null incompleteness in those cases in which closed trapped surfaces form. It must be remarked, however, that Penrose’s theorem assumes the existence of a non-compact Cauchy hypersurface, thus (i) it assumes the existence of a time function and hence it cannot be used to dismiss the conclusion that a time function exists and (ii) for spacetimes with compact slices its conclusions do not hold. Moreover, if the space slices are compact, one can extract further information from the proof of Penrose’s theorem [22, Theorem 14.61]. The result is that, roughly speaking, in such spacetimes black holes do not exist. Closed trapped surfaces may form and locally they may resemble black holes but the global behavior would be quite different. Indeed, their horizons would finally join and swallow the whole spacetime. Thus, without an “exterior”, the “interior” could not be distinguished from a usual spacetime. In conclusion the theorems of this work can be used physically, either in the “negative” way, to prove the existence of singularities or of chronology violating regions, or in the “positive” way to argue for the existence of a time function or of global hyperbolicity. In either case they throw new light on the existence and role of time at cosmological scales. Acknowledgement. This work has been partially supported by GNFM of INDAM and by MIUR under project PRIN 2005 from Università di Camerino.
References
1. Akolia, G.M., Joshi, P., Vyas, U.: On almost causality. J. Math. Phys. 22, 1243–1247 (1981)
2. Beem, J.K.: Conformal changes and geodesic completeness. Commun. Math. Phys. 49, 179–186 (1976)
3. Beem, J.K., Ehrlich, P.E., Easley, K.L.: Global Lorentzian Geometry. New York: Marcel Dekker Inc., 1996
4. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257, 43–50 (2005)
5. Borde, A.: Geodesic focusing, energy conditions and singularities. Class. Quant. Grav. 4, 343–356 (1987)
6. Chicone, C., Ehrlich, P.: Line integration of Ricci curvature and conjugate points in Lorentzian and Riemannian manifolds. Manus. Math. 31, 297–316 (1980)
7. Geroch, R.: Domain of dependence. J. Math. Phys. 11, 437–449 (1970)
8. Geroch, R., Horowitz, G.T.: Global structure of spacetimes. In: Vol. General relativity: An Einstein centenary survey, Cambridge: Cambridge University Press, 1979, pp. 212–292
9. Hawking, S.W.: The existence of cosmic time functions. Proc. Roy. Soc. London, Series A 308, 433–435 (1968)
10. Hawking, S.W.: Chronology protection conjecture. Phys. Rev. D 46, 603–611 (1992)
11. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge: Cambridge University Press, 1973
12. Hawking, S.W., Penrose, R.: The singularities of gravitational collapse and cosmology. Proc. Roy. Soc. Lond. A 314, 529–548 (1970)
13. Krasnikov, S.: No time machines in classical general relativity. Class. Quant. Grav. 19, 4109–4129 (2002)
14. Kriele, M.: The structure of chronology violating sets with compact closure. Class. Quant. Grav. 6, 1607–1611 (1989)
15. Minguzzi, E.: Limit curve theorems in Lorentzian geometry. J. Math. Phys. 49, 092501 (2008)
16. Minguzzi, E.: The causal ladder and the strength of K-causality. I. Class. Quant. Grav. 25, 015009 (2008)
17. Minguzzi, E.: The causal ladder and the strength of K-causality. II. Class. Quant. Grav. 25, 015010 (2008)
18. Minguzzi, E.: Non-imprisonment conditions on spacetime. J. Math. Phys. 49, 062503 (2008)
19. Minguzzi, E.: Weak distinction and the optimal definition of causal continuity. Class. Quant. Grav. 25, 075015 (2008)
20. Minguzzi, E., Sánchez, M.: The causal hierarchy of spacetimes. In: H. Baum, D. Alekseevsky (eds.), Recent developments in pseudo-Riemannian geometry, ESI Lect. Math. Phys., Zurich: Eur. Math. Soc. Publ. House, 2008, pp. 299–358
21. Newman, R.P.A.C.: Black holes without singularities. Gen. Rel. Grav. 21, 981–995 (1989)
22. O’Neill, B.: Semi-Riemannian Geometry. San Diego: Academic Press, 1983
23. Penrose, R.: Singularities and time-asymmetry. In: General relativity: An Einstein centenary survey, Cambridge: Cambridge University Press, 1979, pp. 581–638
24. Seifert, H.: The causal boundary of space-times. Gen. Rel. Grav. 1, 247–259 (1971)
25. Seifert, H.J.: Smoothing and extending cosmic time functions. Gen. Rel. Grav. 8, 815–831 (1977)
26. Senovilla, J.M.M.: Singularity theorems and their consequences. Gen. Rel. Grav. 30, 701–848 (1998)
27. Sorkin, R.D., Woolgar, E.: A causal order for spacetimes with $C^0$ Lorentzian metrics: proof of compactness of the space of causal curves. Class. Quant. Grav. 13, 1971–1993 (1996)
28. Thorne, K.: Closed Timelike Curves. In: General Relativity and Gravitation, Bristol: Inst. of Phys. Publ., 1993, pp. 295–315
29. Tipler, F.J.: Singularities and causality violation. Ann. Phys. 108, 1–36 (1977)
30. Tipler, F.J.: General relativity and conjugate ordinary differential equations. J. Diff. Eq. 30, 165–174 (1978)
31. Tipler, F.J.: On the nature of singularities in general relativity. Phys. Rev. D 15, 942–945 (1978)
32. Visser, M.: Lorentzian Wormholes. New York: Springer-Verlag, 1996
33. Woodhouse, N.M.J.: The differentiable and causal structures of space-time. J. Math. Phys. 14, 495–501 (1973)
Communicated by G.W. Gibbons
Commun. Math. Phys. 288, 821–846 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0777-5
Communications in
Mathematical Physics
Krein Signatures for the Faddeev-Takhtajan Eigenvalue Problem Jared C. Bronski, Mathew A. Johnson Department of Mathematics, University of Illinois Urbana-Champaign, 1409 W. Green St., Urbana, IL 61801, USA. E-mail:
[email protected] Received: 7 May 2008 / Accepted: 8 December 2008 Published online: 14 March 2009 – © Springer-Verlag 2009
Abstract: One of the difficulties in analyzing eigenvalue problems that arise in connection with integrable systems is that they are frequently non-self-adjoint, making it difficult to determine where the spectrum lies. In this paper, we consider the problem of locating and counting the discrete eigenvalues associated with the Faddeev-Takhtajan eigenvalue problem, for which the sine-Gordon equation is the isospectral flow. In particular we show that for potentials having either zero topological charge or topological charge ±1, and satisfying certain monotonicity conditions, the point spectrum lies on the unit circle and is simple. Furthermore, we give an exact count of the number of eigenvalues. This result is an analog of that of Klaus and Shaw for the Zakharov-Shabat eigenvalue problem. We also relate our results, as well as those of Klaus and Shaw, to the Krein stability theory for J-unitary matrices. In particular we show that the eigenvalue problem associated to the sine-Gordon equation has a J-unitary structure, and under the above conditions the point eigenvalues have a definite Krein signature, and are thus simple and lie on the unit circle.
1. Introduction The sine-Gordon equation arises as a model for many systems. In physics, the sine-Gordon equation models the dynamics of Josephson junctions [18] and has been studied as a model for field theory [3]. It has been studied in atmospheric sciences as a model for a rotating baroclinic fluid [8]. It has been proposed as a model for DNA dynamics [15,17,23] (see also the work of Cuenda, Sánchez and Quintero [5]). Various perturbed sine-Gordon models have been extensively studied since they exhibit complicated dynamics and chaotic behavior [2,10,19], and the sine-Gordon equation also plays a role in the geometry of surfaces of constant negative curvature [21].
J. C. Bronski, M. A. Johnson
The sine-Gordon equation is known to be integrable [20] and is the isospectral flow for a 2 × 2 non-self-adjoint eigenvalue problem. If we define the characteristic coordinates $\chi = \frac{x+t}{\sqrt{2}}$, $\eta = \frac{x-t}{\sqrt{2}}$, then the sine-Gordon equation takes the form
$$\psi_{\chi\eta} = \sin(\psi).$$
The evolution of this problem in η, a Goursat problem, is integrable via the inverse scattering transform and the associated eigenvalue problem for which this is the isospectral flow is the well studied Zakharov-Shabat system
$$\phi_{1,\chi} = -i\zeta\phi_1 + q\phi_2, \qquad \phi_{2,\chi} = i\zeta\phi_2 - q^{*}\phi_1, \qquad (1)$$
where $2q := -i\psi_\chi$, $\zeta \in \mathbb{C}$ is a spectral parameter, and ∗ denotes complex conjugation. Note that this is the same eigenvalue problem associated with the non-linear Schrödinger equation on R. In most contexts the initial value problem in characteristic coordinates is not a natural one. Instead it is most natural to consider the initial-value problem in the laboratory coordinates:
$$\psi_{xx} - \psi_{tt} = \sin\psi, \qquad \psi(x, 0) = u(x), \qquad \psi_t(x, 0) = v(x). \qquad (2)$$
In the laboratory coordinates there is a different eigenvalue problem for which the sine-Gordon evolution is the isospectral flow, which is due to Faddeev and Takhtajan [20,24] and independently by Kaup [11] (see also [6,7]) which takes the following form:
$$\Phi_x = \frac{1}{4}\Big(z - \frac{1}{z}\Big)\cos\Big(\frac{u(x)}{2}\Big)\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}\Phi - \frac{v(x)}{4}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\Phi + \frac{1}{4}\Big(z + \frac{1}{z}\Big)\sin\Big(\frac{u(x)}{2}\Big)\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}\Phi, \qquad (3)$$
where again z denotes the spectral parameter. This eigenvalue problem is somewhat non-standard since the eigenvalue parameter enters non-linearly; it is what is generally known as a quadratic operator pencil (or, in the terminology of Gohberg and Krein [9], a quadratic operator bundle). If one is interested in solving the PDE in the laboratory coordinates, one must understand the forward and inverse scattering of this problem. It is the forward scattering problem for this system with v(x) = 0 which we consider in this paper. We are primarily motivated by two recent results. The first is of Klaus and Shaw [13,14], who proved the following result for the Zakharov-Shabat eigenvalue problem: if the potential $q \in L^1(\mathbb{R})$ is real valued with a single extremum point, then all the discrete eigenvalues ζ lie on the imaginary axis and are simple. We often refer to such a potential as a Klaus-Shaw potential. Further, they were able to derive an exact count of the number of discrete eigenvalues of (1) in terms of the $L^1$ norm of the potential q. We would also like to mention the paper of Klaus and Mityagin [12], who have proved many results about how and where imaginary eigenvalues of higher multiplicity can occur, and where along the real axis complex eigenvalues can emerge. We will say a bit more about this paper in the conclusions section.
The second is a recent result of Buckingham and Miller [4], who have constructed the analog of reflectionless potentials for the eigenvalue problem (3). In particular they have shown that if v(x) = 0 and u(x) is defined by
$$\sin\Big(\frac{u(x)}{2}\Big) = \operatorname{sech}(x), \qquad \cos\Big(\frac{u(x)}{2}\Big) = \tanh(x),$$
then the spectral problem (3) is hypergeometric and admits an integral representation of Euler type. The point spectrum for this problem can be explicitly computed: it lies entirely on the unit circle and is simple. This potential is the analog of the Satsuma-Yajima potentials for the Zakharov-Shabat eigenvalue problem or the Bargmann reflectionless potentials for the Schrödinger eigenvalue problem. It is interesting to note that the potential u(x) constructed by Buckingham and Miller is related to the Gudermannian function gd(x), which arises in the theory of Mercator projections, via u(x) = π − 2 gd(x). To motivate our results we note a few facts. Purely imaginary eigenvalues of the Zakharov-Shabat eigenvalue problem correspond to stationary solitons. The phase of the potential q in the Zakharov-Shabat eigenvalue problem is related to the momentum of the initial pulse, with real data corresponding to an initially stationary pulse. Thus the Klaus-Shaw result says that, under the monotonicity condition on the potential, a pulse with no initial momentum gives rise to solitons with no momentum. For the sine-Gordon problem in laboratory coordinates the analog of data with no initial momentum is, of course, v(x) = 0. At the level of the eigenvalue problem (3) stationary solitons correspond to point spectrum on the unit circle [6]. These considerations suggest that for initially stationary data, v(x) = 0 and u(x) satisfying certain monotonicity conditions, the discrete spectrum of Eq. (3) should lie on the unit circle. The fact that the Buckingham-Miller potential is monotone and has point spectrum that is confined to the unit circle tends to support this idea. In this paper we prove such a result.
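The relation between the Buckingham-Miller potential and the Gudermannian function quoted above can be checked numerically. The following is a minimal sketch, assuming the standard closed form gd(x) = 2 arctan(tanh(x/2)); the function names `gd`, `u_bm` and `max_identity_error` are hypothetical helpers introduced only for this illustration.

```python
import math


def gd(x):
    """Gudermannian function, using the closed form gd(x) = 2*atan(tanh(x/2))."""
    return 2.0 * math.atan(math.tanh(x / 2.0))


def u_bm(x):
    """Potential u(x) = pi - 2*gd(x), as quoted in the text."""
    return math.pi - 2.0 * gd(x)


def max_identity_error(points):
    """Largest deviation from sin(u/2) = sech(x) and cos(u/2) = tanh(x)."""
    worst = 0.0
    for x in points:
        half = u_bm(x) / 2.0
        sech = 1.0 / math.cosh(x)
        worst = max(
            worst,
            abs(math.sin(half) - sech),
            abs(math.cos(half) - math.tanh(x)),
        )
    return worst


# Sample the identities on a grid in [-8, 8].
points = [k / 10.0 for k in range(-80, 81)]
print("max identity error:", max_identity_error(points))

# Far-field values of u, in units of 2*pi (left and right ends).
print("u(-40)/(2 pi):", u_bm(-40.0) / (2.0 * math.pi))
print("u(+40)/(2 pi):", u_bm(40.0) / (2.0 * math.pi))
```

On this grid the two identities hold to rounding error, and the far-field values indicate that u tends to 2π on the left and to 0 on the right, so the profile is monotone decreasing between two multiples of 2π.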
At this point it is worthwhile to introduce a bit of terminology. The potential u(x) is assumed to satisfy the following asymptotics:
$$\lim_{x\to\pm\infty} u(x) = 2\pi k_\pm.$$
Following Faddeev and Takhtajan [6] we define the topological charge of the potential u(x) to be $Q_{top} = k_+ - k_- = \frac{1}{2\pi}\int_{-\infty}^{\infty} u_x(x)\,dx$. Potentials with topological charge $Q_{top} = 0$ are generally referred to as breathers, while potentials with non-zero topological charge are referred to as kinks. In this paper we will deal only with breathers and kinks with topological charge $Q_{top} = \pm 1$ (simple kinks). We will not consider potentials of higher topological charge ($|Q_{top}| > 1$) in this paper. Notice that the Buckingham-Miller potential is a simple kink.
2. Preliminaries
2.1. Notation. In order to make the notation simpler, we define the following matrices:
$$\tau_1 := \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, \qquad \tau_2 := \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \qquad \tau_3 := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
These are related to the usual Pauli matrices via a (cyclic) permutation and multiplication by i: in particular $\tau_1 = -i\sigma_3$, $\tau_2 = i\sigma_1$, $\tau_3 = i\sigma_2$. Note that the $\tau_i$ satisfy the commutation relations
$$\tau_i^{-1} = \tau_i^{\dagger} = -\tau_i, \qquad \tau_i\tau_j = \epsilon_{ijk}\tau_k - \delta_{ij} I,$$
and
$$\tau_i\tau_j\tau_i = \begin{cases} \tau_j, & \text{if } i \neq j; \\ -\tau_j, & \text{if } i = j. \end{cases}$$
Using this notation, the eigenvalue problem for which the sine-Gordon equation (in laboratory coordinates) is the isospectral flow is given by
$$\Phi_x = \frac{1}{4}\Big(z - \frac{1}{z}\Big)\cos\Big(\frac{u}{2}\Big)\tau_1\Phi + \frac{1}{4}\Big(z + \frac{1}{z}\Big)\sin\Big(\frac{u}{2}\Big)\tau_2\Phi - \frac{v}{4}\tau_3\Phi$$
on $L^2(\mathbb{R})$, where u = u(x), $\Phi = (\phi_1, \phi_2)^T$ and z ∈ C is the spectral parameter. We refer to this as the symmetric gauge due to the relatively symmetric way z and $\frac{1}{z}$ appear in the eigenvalue problem. This eigenvalue problem can be written in a number of different forms which are related to this form via different gauge transformations, and in the literature one will often see these different forms. We prefer this form for several reasons. One is that it appears that the Buckingham-Miller potential only leads to a hypergeometric equation in this gauge. This gauge also makes the J-unitary structure of the spectral problem most obvious (see Lemma 4). Occasionally it will be more convenient to use a different gauge. In this case we will so state. Throughout this paper we will be assuming that v(x) = 0, and will work with the problem
$$\Phi_x = \frac{1}{4}\Big(z - \frac{1}{z}\Big)\cos\Big(\frac{u}{2}\Big)\tau_1\Phi + \frac{1}{4}\Big(z + \frac{1}{z}\Big)\sin\Big(\frac{u}{2}\Big)\tau_2\Phi. \qquad (4)$$
By a standard Weyl sequence argument the essential spectrum of (4) is the real axis. Moreover, since the eigenvalues are symmetric about the real axis (see Proposition 1), we define z to be in the point spectrum of (4) if Im(z) > 0 and there exists a non-trivial $L^2$ solution Φ of (4). Throughout this paper, we consider only z in the upper half plane, which suffices due to the symmetries of the problem. Since we are concerned only with the forward scattering problem (4) all of the analysis in this paper is done at t = 0, with ψ(x, 0) = u(x).
As is usual in these problems the time evolution of the spectral data is quite straightforward and will not be considered here. Moreover, since all of our results concern the case of stationary initial data (see Remark 1 below), we assume throughout $\psi_t(x, 0) = v(x) = 0$. Finally, we make standard assumptions on all potentials u: u(x) → 0 mod 2π as |x| → ∞ fast enough so that $\sin\frac{u}{2} \in L^1(\mathbb{R})$, and (for simplicity) $u \in C^1(\mathbb{R})$: when our results allow for more general assumptions, we will outline the extension of the proofs. Notice that $\sin\frac{u}{2} \in L^1$ implies $1 - \cos\frac{u}{2} \in L^1$. Note that there is a difference in the structure of the Jost solutions of (4) between the cases where $Q_{top}$ is even and $Q_{top}$ is odd. When k is odd (and positive), the Jost solutions have the asymptotics
$$\Psi(x, z) \sim \exp\Big(-\frac{1}{4}\Big(z - \frac{1}{z}\Big)\int_x^0 \cos\Big(\frac{u}{2}\Big)dy\ \tau_1\Big)\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{as } x \to -\infty,$$
$$\Phi(x, z) \sim \exp\Big(\frac{1}{4}\Big(z - \frac{1}{z}\Big)\int_0^x \cos\Big(\frac{u}{2}\Big)dy\ \tau_1\Big)\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{as } x \to +\infty, \qquad (5)$$
while in the case k is even (and non-negative), the Jost solutions satisfy the asymptotics
\[
\begin{aligned}
\Psi(x,z) &\sim \exp\left(-\frac{1}{4}\left(z-\frac{1}{z}\right)\int_x^0 \cos\frac{u}{2}\,dy\;\tau_1\right)\begin{pmatrix}1\\0\end{pmatrix} \quad\text{as } x\to-\infty,\\
\Phi(x,z) &\sim \exp\left(\frac{1}{4}\left(z-\frac{1}{z}\right)\int_0^x \cos\frac{u}{2}\,dy\;\tau_1\right)\begin{pmatrix}0\\1\end{pmatrix} \quad\text{as } x\to+\infty.
\end{aligned}
\tag{6}
\]
Similar expressions hold for the case when k is negative. Thus in the case of even topological charge the eigenvalues correspond to a heteroclinic connection, while in the case of odd topological charge the eigenvalues correspond to a homoclinic connection.

2.2. Symmetries and Signatures.

To begin we derive the symmetries of the eigenvalue problem (4) under the assumption that v(x) = 0. The symmetry group of the discrete spectrum is Z_2 × Z_2 × Z_2, corresponding to reflection across the real and imaginary axes as well as the unit circle.

Proposition 1. Suppose Φ is an eigenfunction of (4) corresponding to an eigenvalue z. Then w = 1/z is an eigenvalue with eigenfunction Ψ = τ_2 Φ, w = −z is an eigenvalue with eigenfunction Ψ = τ_3 Φ, and w = z̄ is an eigenvalue with eigenfunction Ψ = τ_3 Φ̄.

Proof. Defining Ψ by Φ = τ_2 Ψ, we get the following equation for Ψ:
\[
\Psi_x = -\frac{1}{4}\left(z-\frac{1}{z}\right)\cos\frac{u}{2}\,\tau_1\Psi + \frac{1}{4}\left(z+\frac{1}{z}\right)\sin\frac{u}{2}\,\tau_2\Psi .
\]
Letting w = 1/z then gives the equation
\[
\Psi_x = \frac{1}{4}\left(w-\frac{1}{w}\right)\cos\frac{u}{2}\,\tau_1\Psi + \frac{1}{4}\left(w+\frac{1}{w}\right)\sin\frac{u}{2}\,\tau_2\Psi ,
\]
which is the original eigenvalue problem. Thus if z is an eigenvalue with associated eigenfunction Φ, then 1/z is an eigenvalue with corresponding eigenfunction Ψ = τ_2 Φ. Similarly, defining Φ = τ_3 Ψ, we get
\[
\Psi_x = -\frac{1}{4}\left(z-\frac{1}{z}\right)\cos\frac{u}{2}\,\tau_1\Psi - \frac{1}{4}\left(z+\frac{1}{z}\right)\sin\frac{u}{2}\,\tau_2\Psi ,
\]
so that w = −z is an eigenvalue with eigenfunction Ψ = τ_3 Φ. Finally, conjugating the original eigenvalue equation gives
\[
\bar\Phi_x = -\frac{1}{4}\left(\bar z-\frac{1}{\bar z}\right)\cos\frac{u}{2}\,\tau_1\bar\Phi - \frac{1}{4}\left(\bar z+\frac{1}{\bar z}\right)\sin\frac{u}{2}\,\tau_2\bar\Phi ,
\]
where the overbar denotes complex conjugation. It follows that w = z̄ is an eigenvalue with eigenvector Ψ = τ_3 Φ̄.

Note that when z lies on the unit circle, it follows that the functions φ_1 and iφ_2 can be chosen to be real, where φ_j is the j-th component of Φ.

Remark 1. In the case v(x) ≠ 0 we lose the z → 1/z symmetry, but the other two symmetries persist. Our main results, which show confinement of the spectrum to the unit circle, rely on v(x) = 0. Numerical evidence shows that for non-zero v(x) the point spectrum moves off the unit circle.
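The three symmetries of Proposition 1 are purely algebraic statements about the coefficient matrix of (4), so they can be checked numerically at a single point. The sketch below is only an illustration: the explicit matrices `TAU1`, `TAU2`, `TAU3` are an assumption (this excerpt does not display them), chosen as anti-Hermitian, quaternion-like matrices because that choice is compatible with the component equation (16) and with the identity |Φ|² − i⟨Φ, τ_1Φ⟩ = 2|φ_2|² used later. The helper names `coeff` and `conj_by` are hypothetical.

```python
import numpy as np

# Assumed representation (not stated in this excerpt): anti-Hermitian,
# quaternion-like matrices satisfying tau1 @ tau2 == tau3.
TAU1 = np.array([[-1j, 0], [0, 1j]])
TAU2 = np.array([[0, 1j], [1j, 0]])
TAU3 = np.array([[0, 1], [-1, 0]], dtype=complex)


def coeff(z, u):
    """Coefficient matrix L(z) in Phi_x = L(z) Phi for v = 0 at a point where u(x) = u."""
    return (0.25 * (z - 1 / z) * np.cos(u / 2) * TAU1
            + 0.25 * (z + 1 / z) * np.sin(u / 2) * TAU2)


def conj_by(T, L):
    """Coefficient matrix T L T^{-1} seen by the transformed function Psi = T Phi."""
    return T @ L @ np.linalg.inv(T)


z, u = 0.7 + 0.4j, 1.3  # arbitrary test values

# z -> 1/z with Psi = tau2 Phi
sym_inverse = np.allclose(conj_by(TAU2, coeff(z, u)), coeff(1 / z, u))
# z -> -z with Psi = tau3 Phi
sym_negate = np.allclose(conj_by(TAU3, coeff(z, u)), coeff(-z, u))
# z -> conj(z) with Psi = tau3 conj(Phi)
sym_conj = np.allclose(conj_by(TAU3, np.conj(coeff(z, u))), coeff(np.conj(z), u))

print(sym_inverse, sym_negate, sym_conj)
```

Because the check is pointwise in x, it says nothing about decay of the eigenfunctions; it only confirms that each transformation maps solutions of (4) at z to solutions at the reflected spectral parameter.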
J. C. Bronski, M. A. Johnson
Corollary 1. If u is an odd function, then (4) has no eigenvalues on the unit circle.

Proof. Let z be an eigenvalue of (4) on the unit circle with corresponding eigenfunction Φ(x). By Proposition 1, φ_1 and iφ_2 can be chosen to be real. A simple calculation shows that −iτ_2 Φ(−x) is also an eigenfunction corresponding to z. Hence, there exists κ ∈ C such that Φ(x) = −iκτ_2 Φ(−x). If we write Φ = (φ_1, φ_2)^T, then Proposition 1 implies that κ ∈ iR. However, we also have φ_1(x) = κφ_2(−x) = κ²φ_1(x), so that κ² = 1, which is a contradiction.

Our results, as well as those of Klaus and Shaw, can be interpreted in terms of the classical theory of Krein signatures for J-unitary operators due to Krein and collaborators. To illustrate this connection, we begin by recalling the definition of the classical Krein signature. While we try to give a self-contained summary here, the interested reader is encouraged to consult the text of Yakubovich and Starzhinskii [22] and references therein. In particular, Chapter 3 of Volume I is most relevant.

Let J be a skew form satisfying J† = −J, J² = −I. An operator M is said to be J-unitary if it satisfies the relation
\[
M^\dagger J M = J, \tag{7}
\]
or equivalently if M leaves invariant the bilinear form
\[
[x, y] = \langle Jx, y\rangle .
\]
Notice that if M is real then it belongs to the symplectic group SP(n, R). Relation (7) implies that the spectrum spec(M) is invariant under reflection across the unit circle: λ ∈ spec(M) ⇒ λ̄^{-1} ∈ spec(M). The obvious question is whether the eigenvalues actually lie on the unit circle and, if so, whether they remain there under perturbation. This and many other questions were considered by Krein and collaborators. The basic results are as follows: if v is an eigenvector of M and one defines the Krein signature κ to be the (real) quantity κ = −i⟨v, Jv⟩, then the following results hold:
– If |λ| ≠ 1, then κ = 0.
– If M has a non-diagonal Jordan block form, then there exists an eigenvector v with vanishing Krein signature κ = 0.
Thus, if one can show that all eigenvectors have non-zero Krein signature, then the Jordan block is actually diagonal, and hence the algebraic and geometric multiplicities of the eigenvalue are equal. It can further be shown that eigenvalues can only leave the unit circle by colliding with an eigenvalue of opposite Krein signature. In particular, see Chap. 3, Sects. 1–3 of Volume I of Yakubovich and Starzhinskii [22] for details.

To put our calculation and that of Klaus-Shaw into a common framework we introduce a slight generalization of the classical Krein construction. Suppose that M is an operator satisfying the following commutation relation:
\[
U M = f(M^\dagger)\, U, \tag{8}
\]
where f is a meromorphic function and U some non-singular operator. It seems simplest to assume that f is an automorphism of the extended complex plane, and thus a Möbius transformation. All examples that we are aware of are of this form.
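The first of the classical facts recalled above (an eigenvalue off the unit circle has zero Krein signature) can be illustrated on 2×2 real matrices, where J-unitary simply means determinant one. The following is a minimal sketch, not taken from the paper: the matrices `elliptic` and `hyperbolic` and the helper names are illustrative choices, and the signature is computed as κ = −i⟨v, Jv⟩ with the inner product conjugate-linear in its first argument (an assumed convention).

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # J^T = -J and J @ J = -I


def is_j_unitary(M):
    """Check the defining relation M^dagger J M = J."""
    return np.allclose(M.conj().T @ J @ M, J)


def krein_data(M):
    """Return (eigenvalue, Krein signature) pairs, with kappa = -i <v, J v>."""
    eigvals, eigvecs = np.linalg.eig(M)
    data = []
    for lam, v in zip(eigvals, eigvecs.T):
        v = v / np.linalg.norm(v)
        # np.vdot conjugates its first argument, so this is v^dagger J v.
        kappa = (-1j * np.vdot(v, J @ v)).real
        data.append((lam, kappa))
    return data


theta = 0.7
# Rotation: eigenvalues exp(+-i theta) on the unit circle.
elliptic = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
# Hyperbolic stretch: eigenvalues 2 and 1/2, off the unit circle.
hyperbolic = np.diag([2.0, 0.5])

for name, M in [("elliptic", elliptic), ("hyperbolic", hyperbolic)]:
    for lam, kappa in krein_data(M):
        print(name, "|lambda| =", round(abs(lam), 6), "kappa =", round(kappa, 6))
```

In the elliptic case the two eigenvalues carry signatures of opposite sign, which is the configuration in which, according to the theory recalled above, a collision could move eigenvalues off the circle.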
Since M ∼ f(M†) it follows that λ ∈ spec(M) implies that f(λ̄) ∈ spec(M). We assume that there exists a curve γ of fixed points of the map λ → f(λ̄):
\[
\gamma = \{\lambda \mid \lambda = f(\bar\lambda)\}.
\]
For instance, for f(z) = z, f(z) = −z, f(z) = 1/z the corresponding curves are given by the real axis, the imaginary axis, and the unit circle respectively. Note that generically γ is co-dimension 2: it is only for special choices of f that γ is a curve. For f a Möbius transformation, γ is a circle or (in the degenerate case) a line. Since the relation λ = f(λ̄) over-determines the curve γ, one expects that there is a consistency condition which must hold. This is the result of the next lemma:

Lemma 1. Suppose f is analytic and λ = f(λ̄) along a curve γ. Then |f′(λ̄)| = 1 for λ ∈ γ.

Proof. It is convenient to let z = λ̄ so that the right-hand side is holomorphic. Then we have the following expressions for dy/dx:
\[
\frac{dy}{dx} = \frac{1 - g_x}{g_y} = \frac{-h_x}{1 + h_y},
\]
where g, h are the real and imaginary parts respectively of f. From the Cauchy-Riemann equations and the equation above we get |f′(z)|² = g_x² + h_x² = 1.

Given that the spectrum of M is invariant under the map λ → f(λ̄), and that γ is the set of fixed points of this map, it seems natural to ask when the spectrum is actually contained in γ. A sufficient condition is given by the following lemma:

Lemma 2. Define a generalized Krein signature as follows: suppose M satisfies the commutation relation (8) for some bounded operator U, and v is an L² eigenvector of M with eigenvalue λ. Then the generalized Krein signature is given by κ = ⟨v, Uv⟩. A non-vanishing Krein signature κ ≠ 0 implies that the eigenvalue lies along the symmetry curve λ = f(λ̄).

Proof. It is easy to see that
\[
\lambda\kappa = \langle v, U M v\rangle = \langle \bar f(M) v, U v\rangle = f(\bar\lambda)\,\kappa,
\]
and thus either λ = f(λ̄) or κ = 0.
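As a quick worked check of Lemma 1 on the three examples named above, one can compute the derivative directly on each symmetry curve. This is only an illustration of the consistency condition, written out under the lemma's own notation:

```latex
\begin{align*}
f(z) &= z:      & \gamma &= \mathbb{R},       & |f'(\bar\lambda)| &= 1, \\
f(z) &= -z:     & \gamma &= i\mathbb{R},      & |f'(\bar\lambda)| &= |-1| = 1, \\
f(z) &= 1/z:    & \gamma &= \{|\lambda| = 1\}, & |f'(\bar\lambda)| &= \left|-\bar\lambda^{-2}\right| = |\lambda|^{-2} = 1 .
\end{align*}
```

In the third case the condition holds only on the curve itself: away from the unit circle |f′| ≠ 1, which is consistent with γ being exactly the fixed-point set of λ → 1/λ̄.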
Remark 2. Note that the Krein signature depends on both the eigenvector v and the symmetry as determined by U and f in (8). The relationship (8) generalizes a number of classes of operators. If U is positive definite and f (z) = z, the matrix is self-adjoint under the inner product induced by U and the spectrum always lies on the symmetry curve (the real axis). More generally, if f (z) is any analytic function and U is positive definite, then M is normal under the inner product induced by U. Finally, if f (z) = z −1 and U = J, then the matrix M is J-unitary. By Lemma 2 if U induces a definite inner product the eigenvalues always lie on the symmetry curve γ , so it is the case where U is not definite which is most interesting to us.
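The first special case in Remark 2 (U positive definite and f(z) = z) is easy to test on matrices. The sketch below is an illustration under assumptions of its own: it builds a random Hermitian positive definite `U` and a Hermitian `H`, sets M = U⁻¹H so that relation (8) holds with f(z) = z, and confirms that the spectrum is real. The sizes and the random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hermitian positive definite U (Gram matrix plus a multiple of the identity).
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U = X.conj().T @ X + n * np.eye(n)

# Arbitrary Hermitian H; then M = U^{-1} H satisfies U M = H = M^dagger U.
Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = Y + Y.conj().T
M = np.linalg.solve(U, H)

# Relation (8) with f(z) = z.
relation_holds = np.allclose(U @ M, M.conj().T @ U)

# M is self-adjoint in the inner product induced by U, so its spectrum is real.
eigvals = np.linalg.eigvals(M)
max_imag = np.max(np.abs(eigvals.imag))

print(relation_holds, max_imag)
```

Since U is definite, every eigenvector has κ = ⟨v, Uv⟩ > 0, so Lemma 2 forces every eigenvalue onto the symmetry curve, here the real axis; the interesting case for the paper is the indefinite one, where this argument no longer applies automatically.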
Example 1. The Zakharov-Shabat eigenvalue problem with a real potential is given by
\[
M = \begin{pmatrix} i\frac{d}{dx} & -iq(x) \\ -iq(x) & -i\frac{d}{dx} \end{pmatrix}
\]
and satisfies two such commutation relations of form (8). The first has
\[
U = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad f(z) = -z,
\]
corresponding to symmetry of the spectrum under reflection across the imaginary axis. The corresponding Krein signature is given by
\[
\kappa = \int \left(\varphi_1^* \varphi_2 + \varphi_2^* \varphi_1\right) dx .
\]
This is the quantity which Klaus and Shaw study in their papers: non-vanishing of this quantity implies that the point spectrum is a subset of γ = iR. There is a second relation of the form (8) with
\[
U = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad f(z) = z,
\]
corresponding to the symmetry of the spectrum under reflection across the real axis. Note that the generalized Krein signature as defined here is only defined for eigenvalues corresponding to L² eigenfunctions: it is not convergent for eigenfunctions associated to the essential spectrum.

The next lemma is important for understanding the Klaus-Shaw calculation for the Zakharov-Shabat problem, as well as our calculation for the eigenvalue problem (4): it shows that an eigenvector of non-zero Krein signature is not part of a Jordan chain. This is a minor modification of a standard result for J-unitary matrices: compare with Lemma I in Chap. 3.1 of Yakubovich and Starzhinskii [22].

Lemma 3. Suppose that λ ∈ Γ is an eigenvalue of M and v the corresponding eigenvector. If the corresponding Krein signature κ = ⟨v, Uv⟩ is non-zero then v belongs to a trivial Jordan block: there does not exist w such that (M − λI)w = v.

Proof. This follows from a calculation. Suppose that there does exist such a vector w: Mw = λw + v. Then a straightforward calculation shows that f(M) satisfies f(M)w = f(λ)w + f′(λ)v. A further straightforward calculation shows that
\[
\lambda\langle w, Uv\rangle = \langle w, U M v\rangle = \langle w, f(M^\dagger) U v\rangle
= \langle \bar f(M) w, U v\rangle = f(\bar\lambda)\langle w, Uv\rangle + f'(\bar\lambda)\langle v, Uv\rangle .
\]
Thus we have the equality
\[
\left(\lambda - f(\bar\lambda)\right)\langle w, Uv\rangle = f'(\bar\lambda)\langle v, Uv\rangle = f'(\bar\lambda)\,\kappa .
\]
By Lemma 1 we know that f′(λ̄) ≠ 0 and (λ − f(λ̄)) = 0, and thus κ, the Krein signature of the eigenvector, must vanish.

Lemma 3 connects with the Klaus-Shaw calculation in the following way: as mentioned in Example 1, the Zakharov-Shabat eigenvalue problem satisfies a commutation relation of form (8) with f(z) = −z and U = τ_2, with the associated generalized Krein signature
\[
\kappa = \int \left(\varphi_1^* \varphi_2 + \varphi_2^* \varphi_1\right) dx .
\]
In this situation the symmetry curve is given by λ = −λ̄, i.e. the imaginary axis. Klaus and Shaw first established that for real, monomodal potentials this Krein signature is non-vanishing, and thus the point eigenvalues must lie on the imaginary axis. Further, by Lemma 3 the fact that the Krein signature is non-zero shows that all Jordan blocks have size one and thus the algebraic and geometric multiplicities must be the same. However, for second order ordinary differential equation eigenvalue problems, such as the Zakharov-Shabat eigenvalue problem, the geometric multiplicity can never be larger than one: an eigenvalue of geometric multiplicity higher than one would imply the existence of two linearly independent exponentially decaying solutions. We know from the asymptotic behavior of the Jost solutions that these problems have a one dimensional eigenspace of growing solutions and a one dimensional eigenspace of decaying solutions. Thus, the definiteness of the Krein signature also proves the eigenvalues on the imaginary axis are necessarily simple.

Our goal is to apply the same ideas to the Faddeev-Takhtajan eigenvalue problem (4). The first obstacle to be overcome is the nonlinear way in which the spectral parameter enters: again we have a quadratic pencil problem rather than a standard linear eigenvalue problem. However, this can be overcome by the somewhat standard trick of doubling the size of the system: see, for example, Chap. V, Sect. 12 of Gohberg and Krein [9]. We begin by defining the operators A and B on L²(dx; C²) as
\[
A := \frac{1}{4}\left(\cos\frac{u}{2}\,\tau_1 + \sin\frac{u}{2}\,\tau_2\right), \qquad
B := \frac{1}{4}\left(-\cos\frac{u}{2}\,\tau_1 + \sin\frac{u}{2}\,\tau_2\right),
\]
and noting that (4) can be written as Φ_x = zAΦ + (1/z)BΦ (recall v(x) = 0). Defining Ψ = zΦ, we get the following equivalent problem in which the eigenvalue parameter enters linearly:
\[
M \begin{pmatrix} \Phi \\ \Psi \end{pmatrix} :=
\begin{pmatrix} 0 & I \\ -A^{-1}B & A^{-1}\partial_x \end{pmatrix}
\begin{pmatrix} \Phi \\ \Psi \end{pmatrix}
= z \begin{pmatrix} \Phi \\ \Psi \end{pmatrix}. \tag{9}
\]
Next, we would like to derive a relation of the form (8) for the eigenvalue problem (4). We are particularly interested in the symmetry under reflection across the unit circle, and thus would like to find a relation of this form with f(z) = 1/z. That such a relation exists is the content of the next lemma.
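The doubling step leading to (9) can be illustrated in finite dimensions. In the sketch below, which is an illustration rather than a discretization of (4), random matrices `A`, `B` and `D` stand in for the operators A, B and ∂_x (all three are hypothetical stand-ins), and the code checks that every eigenpair of the doubled matrix solves the original pencil Dφ = zAφ + (1/z)Bφ with second component ψ = zφ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Hypothetical finite-dimensional stand-ins; A and B are kept well conditioned.
A = 3.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
B = np.eye(n) + 0.3 * rng.standard_normal((n, n))
D = rng.standard_normal((n, n))  # plays the role of d/dx

A_inv = np.linalg.inv(A)

# Doubled (linearized) operator, same block layout as in (9).
M = np.block([[np.zeros((n, n)), np.eye(n)],
              [-A_inv @ B, A_inv @ D]])

eigvals, eigvecs = np.linalg.eig(M)

residuals = []
for z, w in zip(eigvals, eigvecs.T):
    phi, psi = w[:n], w[n:]
    # Pencil residual: D phi - z A phi - (1/z) B phi should vanish.
    pencil = D @ phi - z * (A @ phi) - (B @ phi) / z
    # The second block must equal z times the first.
    link = psi - z * phi
    residuals.append(max(np.linalg.norm(pencil), np.linalg.norm(link)))

max_residual = max(residuals)
print(max_residual)
```

The price of linearity in z is the doubled dimension: the 2n eigenvalues of `M` are exactly the roots of the quadratic pencil, which is why the Krein-type relation in the next lemma is formulated on the doubled space.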
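Lemma 4. The operator M defined by (9) satisfies
\[
M^\dagger U M = U, \tag{10}
\]
where U is given by
\[
U = \begin{pmatrix}
0 & 0 & -\sin\frac{u(x)}{2} & -\cos\frac{u(x)}{2} \\
0 & 0 & \cos\frac{u(x)}{2} & -\sin\frac{u(x)}{2} \\
\sin\frac{u(x)}{2} & -\cos\frac{u(x)}{2} & 0 & 0 \\
\cos\frac{u(x)}{2} & \sin\frac{u(x)}{2} & 0 & 0
\end{pmatrix}.
\]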
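The proof of Lemma 4 rests on a handful of pointwise matrix identities, which can be sanity-checked numerically before reading it. The sketch below is an illustration under a stated assumption: the explicit matrices `TAU1`, `TAU2` are not displayed in this excerpt and are chosen here as anti-Hermitian matrices compatible with the component equation (16); the function name `lemma4_checks` is hypothetical.

```python
import numpy as np

# Assumed representation of tau_1, tau_2 (not displayed in this excerpt).
TAU1 = np.array([[-1j, 0], [0, 1j]])
TAU2 = np.array([[0, 1j], [1j, 0]])


def rot(theta):
    """Rotation matrix through theta radians."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])


def lemma4_checks(u):
    """Pointwise identities behind Lemma 4 at a point where u(x) = u."""
    c, s = np.cos(u / 2), np.sin(u / 2)
    A = 0.25 * (c * TAU1 + s * TAU2)
    B = 0.25 * (-c * TAU1 + s * TAU2)
    A_inv = np.linalg.inv(A)
    R = rot((u + np.pi) / 2)  # the choice theta = (u + pi)/2
    zero = np.zeros((2, 2))
    U = np.block([[zero, R], [-R.T, zero]])
    U_stated = np.array([[0, 0, -s, -c],
                         [0, 0, c, -s],
                         [s, -c, 0, 0],
                         [c, s, 0, 0]])
    return {
        "minus_Ainv_B_is_rotation_by_minus_u": np.allclose(-A_inv @ B, rot(-u)),
        "lower_left_block_condition": np.allclose(-R @ A_inv @ B, -R.T),
        "R_Ainv_is_constant_4_tau2": np.allclose(R @ A_inv, 4 * TAU2),
        "U_matches_stated_matrix": np.allclose(U, U_stated),
        "U_squared_is_minus_identity": np.allclose(U @ U, -np.eye(4)),
        "U_is_unitary": np.allclose(U.T @ U, np.eye(4)),
    }


for u in (0.3, 1.7, 4.0):
    print(u, lemma4_checks(u))
```

These checks cover only the algebraic part of the argument; the vanishing of the lower right-hand block additionally uses that ∂_x is skew-adjoint on L², which a pointwise test cannot see.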
Proof. We first look for an operator U on L²(dx; C⁴) of the form
\[
U := \begin{pmatrix} 0 & R \\ -R^\dagger & 0 \end{pmatrix}
\]
for some operator R. By a direct calculation, we have that
\[
M^\dagger U M = \begin{pmatrix} 0 & B^\dagger A^{-\dagger} R^\dagger \\ -R A^{-1} B & R A^{-1}\partial_x + \partial_x A^{-\dagger} R^\dagger \end{pmatrix}.
\]
Considering the lower left-hand block we must require that −RA^{-1}B = −R†. Note that this implies the analogous condition on the upper right-hand block, B†A^{-†}R† = R. An easy calculation shows that −A^{-1}B = cos(u)I + sin(u)τ_3, which is a rotation matrix through angle −u. This suggests choosing R in the form of a rotation matrix. If we denote a rotation matrix through θ radians by R(θ), then assuming R = R(θ) for some function θ, the condition −RA^{-1}B = −R† is equivalent to R(θ)R(−u) = R(π − θ). Thus the condition −RA^{-1}B = −R† holds if we choose θ = (u + π)/2. With this choice, we have RA^{-1} = 4τ_2, which is a constant. Hence
\[
R A^{-1}\partial_x + \partial_x A^{-\dagger} R^\dagger = 4\tau_2 \partial_x - 4\partial_x \tau_2 = 0,
\]
and M has a U-unitary structure.

Notice that U satisfies U² = −I and U†U = I so that (10) is in the classical form (7). It is worth emphasizing that M considered as an (unbounded) operator on L²(dx; C⁴) has a J-unitary structure. Note that this is very different from the way in which (for instance) the Schrödinger eigenvalue problem has a Hamiltonian structure. In the latter case, for fixed values of the eigenvalue parameter λ the ordinary differential equation has a Hamiltonian structure and thus the fundamental solution matrix satisfies a relation of the form (7); here it is the operator itself which satisfies this relation. To further muddy the waters it is also true that the ordinary differential equations given by (4) have a Hamiltonian structure when z lies on the unit circle, and thus the fundamental solution matrix also satisfies a relation of the form (7). We will exploit this fact in Sect. 4.

The Krein signature κ associated with this U-unitary structure is given by
\[
\kappa = \left\langle \begin{pmatrix} \Phi \\ \Psi \end{pmatrix}, U \begin{pmatrix} \Phi \\ \Psi \end{pmatrix} \right\rangle_{L^2(dx;\,\mathbb{C}^4)},
\]
where Φ and Ψ satisfy (9). Pulling this back to the original variable Φ gives the following expression for the Krein signature:
\[
\kappa = 2ir\left( \sin\theta \int_{\mathbb{R}} \sin\frac{u}{2}\,|\Phi|^2\,dx - i\cos\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx \right), \tag{11}
\]
where z = r exp(iθ). Therefore Lemma 2 implies that for L² eigenfunctions
\[
\sin\theta \int_{\mathbb{R}} \sin\frac{u}{2}\,|\Phi|^2\,dx - i\cos\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx \neq 0
\]
implies |z| = 1.

Remark 3. A major qualitative difference between the Krein signature given in (11) and that given by Klaus and Shaw for the Zakharov-Shabat eigenvalue problem is that the signature (11) explicitly depends on the argument of the eigenvalue. This explicit dependence on the spectral parameter introduces some additional complications, which can nevertheless be dealt with.

Note that the other spectral symmetries of (4) (reflection across the real and imaginary axes) have associated Krein signatures. For instance, the relation
\[
\tilde U M = -M^\dagger \tilde U
\]
is associated to the symmetry under reflection across the imaginary axis: M is Ũ-antihermitian. The associated Krein signature is given by
\[
\kappa_{im} = \frac{i}{2}\left(r - \frac{1}{r}\right)\int \cos\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle\,dx - \frac{i}{2}\left(r + \frac{1}{r}\right)\int \sin\frac{u}{2}\,\langle\Phi, \tau_1\Phi\rangle\,dx .
\]
A non-zero κ_im implies that the eigenvalue lies on the imaginary axis, and thus corresponds to a kink. We have been unable to derive any condition on u which would guarantee that κ_im ≠ 0. Physically such a condition would correspond to having only counterpropagating kink/anti-kink pairs, with no breather component. While such a situation is certainly possible it would seem rather unusual, and thus it may not be surprising that no simple condition guaranteeing a non-zero κ_im has come to light.

It is also worth noting that these Krein signatures can be derived directly from the eigenvalue problem (4), and that each of them results from integrating a flux associated to one of the Pauli matrices. There are four such fluxes: three are associated to spectral symmetries of Eq. (4) and lead to Krein signatures associated to these symmetries. The fourth can be integrated to yield an identity which is true for any eigenfunction in the point spectrum. Indeed, it is not difficult to show that the following identities hold:
\[
\langle\Phi, \tau_2\Phi\rangle_x = -\frac{1}{2}\,\mathrm{Re}\left(z - \frac{1}{z}\right)\cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle - \frac{i}{2}\,\mathrm{Im}\left(z + \frac{1}{z}\right)\sin\frac{u}{2}\,|\Phi|^2, \tag{12}
\]
\[
\langle\Phi, \tau_3\Phi\rangle_x = \frac{1}{2}\,\mathrm{Re}\left(z - \frac{1}{z}\right)\cos\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle - \frac{1}{2}\,\mathrm{Re}\left(z + \frac{1}{z}\right)\sin\frac{u}{2}\,\langle\Phi, \tau_1\Phi\rangle, \tag{13}
\]
\[
\left(|\Phi|^2\right)_x = \frac{i}{2}\,\mathrm{Im}\left(z - \frac{1}{z}\right)\sin\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle + \frac{i}{2}\left(z + \frac{1}{z}\right)\cos\frac{u}{2}\,\langle\Phi, \tau_1\Phi\rangle, \tag{14}
\]
\[
i\,\langle\Phi, \tau_1\Phi\rangle_x = \frac{1}{2}\,\mathrm{Im}\left(z - \frac{1}{z}\right)\cos\frac{u}{2}\,|\Phi|^2 + \frac{i}{2}\,\mathrm{Re}\left(z + \frac{1}{z}\right)\sin\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle. \tag{15}
\]
Assuming z is an eigenvalue, integrating (12) over all of R yields (r − 1/r)κ = 0, the Krein signature condition associated with reflections across the unit circle. Arguing similarly with (13) and (15) gives cos(θ)κ_im = 0 and sin(θ)κ_re = 0 respectively, the Krein signature conditions associated with reflections across the imaginary and real axes. Finally, integrating (14) gives the identity
\[
\sin(\theta)\int \sin\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle\,dx + i\cos(\theta)\int \cos\frac{u}{2}\,\langle\Phi, \tau_1\Phi\rangle\,dx = 0,
\]
which holds for any point eigenvalue. Several of these identities will play a role in proving our main results.

3. Main Results

We are now in a position to establish our main results. Having derived the Krein signature associated with the spectral symmetry of reflection across the unit circle, we will now prove that, under certain conditions on the potential u(x), the Krein signature is non-zero and thus the eigenvalues actually lie on the unit circle. We consider two cases: the case of kink-like initial data (topological charge Q_top = ±1), and the case of breather-like initial data (topological charge Q_top = 0). The former is somewhat easier so we consider it first.

Theorem 1. Let u(x) be a monotone C¹ function such that sin(u/2) ∈ L¹, satisfying the asymptotic condition u(x) → 0 as x → −∞ and having topological charge Q_top = ±1. Then the discrete spectrum of (4) lies on the unit circle.

Proof. We first prove this theorem in the case Q_top = 1. Note that from (11), it suffices to prove
\[
\sin\theta \int_{\mathbb{R}} \sin\frac{u}{2}\,|\Phi|^2\,dx - i\cos\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx \neq 0
\]
for any eigenvalue z = r exp(iθ) and corresponding L² eigenfunction Φ. Note that if cos θ = 0, i.e. if z ∈ iR, then the inequality above is clearly true since the left-hand side is manifestly positive. Thus the only possible eigenvalues on the imaginary axis are at the intersection with the unit circle: at ±i. Moreover, when z = ±i the off-diagonal terms vanish and one can solve (4) in closed form and see directly that this always corresponds to a bound state. Hence, z = ±i is always an eigenvalue in this case. We now assume cos θ ≠ 0. To motivate the calculation that follows we recall that Φ = (φ_1, φ_2)^T and note that our assumptions on u imply that φ_2 generically grows as x → ±∞. Our goal, then, is to express the Krein signature solely in terms of φ_2, since
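it is the vanishing of this component of Φ as x → ∞ that defines an eigenvalue. We therefore consider the φ_2 equation in (4):
\[
\varphi_{2,x} = \frac{i}{4}\left(z + \frac{1}{z}\right)\sin\frac{u}{2}\,\varphi_1 + \frac{i}{4}\left(z - \frac{1}{z}\right)\cos\frac{u}{2}\,\varphi_2 . \tag{16}
\]
Note that from the exponential boundedness of the eigenfunctions of (4) (see the Appendix), we know cot(u/2) d/dx |φ_2|² and csc(u/2)|φ_2|² are integrable on R. Thus, multiplying (16) by cot(u/2) φ_2^*, adding the resulting equation to its conjugate and integrating gives
\[
\begin{aligned}
\int_{\mathbb{R}} \cot\frac{u}{2}\,\frac{d}{dx}|\varphi_2|^2\,dx
={}& -\frac{1}{2}\left(r + \frac{1}{r}\right)\sin\theta \int_{\mathbb{R}} \frac{\cos^2\frac{u}{2}}{\sin\frac{u}{2}}\,|\varphi_2|^2\,dx \\
& - \frac{i}{4}\left(r + \frac{1}{r}\right)\cos\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx \\
& + \frac{i}{4}\left(r - \frac{1}{r}\right)\sin\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle\,dx .
\end{aligned}
\]
As was mentioned previously, integrating (13) over R yields the identity
\[
i\left(r - \frac{1}{r}\right)\int_{\mathbb{R}} \cos\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle\,dx = i\left(r + \frac{1}{r}\right)\int_{\mathbb{R}} \sin\frac{u}{2}\,\langle\Phi, \tau_1\Phi\rangle\,dx
\]
since cos θ ≠ 0. Hence,
\[
\begin{aligned}
\left(r + \frac{1}{r}\right)\int_{\mathbb{R}} \left( \sin\theta\,\sin\frac{u}{2}\,|\Phi|^2 - i\cos\theta\,\cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle \right) dx
={}& 4\int_{\mathbb{R}} \cot\frac{u}{2}\,\frac{d}{dx}|\varphi_2|^2\,dx \\
& + 2\left(r + \frac{1}{r}\right)\sin\theta \int_{\mathbb{R}} \frac{\cos^2\frac{u}{2}}{\sin\frac{u}{2}}\,|\varphi_2|^2\,dx \\
& + \left(r + \frac{1}{r}\right)\sin\theta \int_{\mathbb{R}} \sin\frac{u}{2}\left(|\Phi|^2 - i\langle\Phi, \tau_1\Phi\rangle\right) dx .
\end{aligned}
\]
Integrating by parts, we see
\[
\int_{\mathbb{R}} \cot\frac{u}{2}\,\frac{d}{dx}|\varphi_2|^2\,dx = \int_{\mathbb{R}} \frac{u_x}{2\sin^2\frac{u}{2}}\,|\varphi_2|^2\,dx, \tag{17}
\]
which is positive by our assumptions on the potential u. Note that there are no boundary terms by the exponential boundedness results for φ_2. Since |Φ|² − i⟨Φ, τ_1Φ⟩ = 2|φ_2|² we find the following expression for the Krein signature:
\[
\kappa = 2\int_{\mathbb{R}} \frac{u_x}{\sin^2\frac{u}{2}}\,|\varphi_2|^2\,dx + 2\left(r + \frac{1}{r}\right)\sin\theta \int_{\mathbb{R}} \frac{|\varphi_2|^2}{\sin\frac{u}{2}}\,dx .
\]
Thus if u_x > 0 and u ∈ [0, 2π] then the Krein signature given by (11) must always be positive at an eigenvalue, and therefore the point spectrum is confined to the unit circle. One can easily verify that for data with topological charge Q_top = −1 the two components φ_1, φ_2 exchange roles and the same proof holds mutatis mutandis.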
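The remark in the proof that z = ±i always gives a bound state can be made concrete on the standard single-kink profile. The following worked example is a sketch under two assumptions that are not stated in this excerpt: the explicit potential u(x) = 4 arctan(eˣ), and the representation τ_1 = diag(−i, i) (with the opposite sign convention the roles of φ_1 and φ_2 below are exchanged).

```latex
\begin{align*}
u(x) &= 4\arctan(e^{x}): &
\cos\frac{u}{2} &= \frac{1 - e^{2x}}{1 + e^{2x}} = -\tanh x, &
\sin\frac{u}{2} &= \frac{2e^{x}}{1 + e^{2x}} = \operatorname{sech} x \in L^{1}(\mathbb{R}), \\
z &= i: &
z - \frac{1}{z} &= 2i, &
z + \frac{1}{z} &= 0 ,
\end{align*}
so that (4) decouples into
\begin{align*}
\varphi_{1,x} &= \tfrac{1}{2}\cos\tfrac{u}{2}\,\varphi_1 = -\tfrac{1}{2}\tanh(x)\,\varphi_1, &
\varphi_{2,x} &= -\tfrac{1}{2}\cos\tfrac{u}{2}\,\varphi_2 = \tfrac{1}{2}\tanh(x)\,\varphi_2 ,
\end{align*}
with the decaying solution
\begin{align*}
\varphi_1(x) &= (\cosh x)^{-1/2}, \qquad \varphi_2 \equiv 0, \qquad
\int_{\mathbb{R}} |\varphi_1|^2\,dx = \int_{\mathbb{R}} \operatorname{sech} x\,dx = \pi .
\end{align*}
```

Here u is monotone, tends to 0 as x → −∞ and to 2π as x → +∞, so the hypotheses of Theorem 1 with Q_top = 1 are met, and the eigenvalue z = i indeed sits on the unit circle; at z = −i the same computation gives φ_2 = (cosh x)^{-1/2} and φ_1 ≡ 0.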
Remark 4. Theorem 1 is easily seen to hold under the more general hypothesis that u(x) is a piecewise C¹ function with u_x ≥ 0 a.e., having a finite set of jump discontinuities {x_j}_{j=1}^n such that cot(u/2)|_{x_j^-}^{x_j^+} ≥ 0, and satisfying sin(u/2) ∈ L¹ and the asymptotic conditions cos(u/2) → 1 as x → −∞ and cos(u/2) → −1 as x → ∞. The only change in the proof is that the integration by parts formula in (17) is replaced by
\[
\int_{\mathbb{R}} \cot\frac{u}{2}\,\frac{d}{dx}|\varphi_2|^2\,dx = \int_{\mathbb{R}} \frac{u_x}{2\sin^2\left(\frac{u}{2}\right)}\,|\varphi_2|^2\,dx + \sum_j \cot\frac{u}{2}\Big|_{x_j^-}^{x_j^+}\,|\varphi_2|^2(x_j) .
\]
We also believe this theorem holds for u ∈ BV(R) with sin(u/2) ∈ L¹ and µ defined via u_x dx = dµ being a positive measure satisfying µ((−∞, ∞)) = 2π, but we have not shown this.

Next, we consider the case where u(x) is a stationary breather type potential with one critical point on the real line (i.e. a Klaus-Shaw potential). Note that by translation invariance we may assume the critical point occurs at x = 0. Using essentially the same ideas as in the kink case we derive the following result for this class of potentials. Note that this case is most similar to the calculation of Klaus and Shaw for the Zakharov-Shabat eigenvalue problem.

Theorem 2. Let u(x) be a non-negative C¹ function with a single critical point at x = 0 such that sin(u/2) ∈ L¹, sign(u_x) = −sign(x), and u(x) → 0 as x → ±∞ (in other words, Q_top = 0). Furthermore, define u_0 := u(0) and assume 0 < u_0 < π. Then the discrete spectrum of (4) lies in the sector
\[
\left\{ z = r\exp(i\theta) : 0 < \theta < \frac{u_0}{2} \right\}.
\]
Moreover, all the eigenvalues z = r exp(iθ) with θ ≤ (π − u_0)/2 lie on the unit circle. In particular, if u_0 ≤ π/2, then all the eigenvalues of the Faddeev-Takhtajan eigenvalue problem lie on the unit circle.

Proof. Recall that, due to the spectral symmetries, we need only consider eigenvalues in the intersection of the first quadrant with the closed unit disk. To begin, note that if z = r exp(iθ) is an eigenvalue of (4), integrating (15) over R yields the identity
\[
\sin\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,|\Phi|^2\,dx = -i\cos\theta \int_{\mathbb{R}} \sin\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx .
\]
In particular, since cos(u/2) > 0 by hypothesis, there can be no eigenvalues on the imaginary axis and we get the Rayleigh quotient type relation
\[
\tan\theta = \frac{-i\int_{\mathbb{R}} \sin\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx}{\int_{\mathbb{R}} \cos\frac{u}{2}\,|\Phi|^2\,dx} < \frac{\sin\left(\frac{u_0}{2}\right)}{\cos\left(\frac{u_0}{2}\right)}, \tag{18}
\]
where the latter follows from the Cauchy-Schwarz inequality and the observation that Φ and τ_3Φ cannot be proportional to each other on the whole axis. The asymptotics of the Jost solutions imply that φ_2 will generically blow up as x → −∞ while φ_1 will generically blow up as x → ∞. Thus, following Klaus and Shaw, we will derive a formula for the Krein signature involving φ_2 on the negative half-line and φ_1 on the positive half-line.
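A calculation parallel to the one given in the proof of Theorem 1 gives the following identity for the contribution to the Krein signature from the negative half-line:
\[
\begin{aligned}
\left(r + \frac{1}{r}\right)\int_{-\infty}^{0} \left( \sin\theta\,\sin\frac{u}{2}\,|\Phi|^2 - i\cos\theta\,\cos\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle \right) dx
={}& \int_{-\infty}^{0} \left( 4\cot\frac{u}{2}\,\frac{d}{dx}|\varphi_2|^2 + 2\left(r + \frac{1}{r}\right)\sin\theta\,\frac{\cos^2\frac{u}{2}}{\sin\frac{u}{2}}\,|\varphi_2|^2 \right) dx \\
& + \int_{-\infty}^{0} \left( -i\left(r - \frac{1}{r}\right)\sin\theta\,\cos\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle + \left(r + \frac{1}{r}\right)\sin\theta\,\sin\frac{u}{2}\,|\Phi|^2 \right) dx .
\end{aligned}
\]
Note that all the integrals above are well defined by the exponential boundedness of the Jost solutions. For convenience we define
\[
I_- = \int_{-\infty}^{0} \left( \sin(\theta)\sin\left(\frac{u}{2}\right)|\Phi|^2 - i\cos(\theta)\cos\left(\frac{u}{2}\right)\langle\Phi, \tau_3\Phi\rangle \right) dx .
\]
Integrating (13) over (−∞, 0) gives us
\[
2\langle\Phi(0), \tau_3\Phi(0)\rangle = \left(r - \frac{1}{r}\right)\cos\theta \int_{-\infty}^{0} \cos\frac{u}{2}\,\langle\Phi, \tau_2\Phi\rangle\,dx - \left(r + \frac{1}{r}\right)\cos\theta \int_{-\infty}^{0} \sin\frac{u}{2}\,\langle\Phi, \tau_1\Phi\rangle\,dx .
\]
An integration by parts gives
\[
\int_{-\infty}^{0} \cot\frac{u}{2}\,\frac{d}{dx}|\varphi_2|^2\,dx = \int_{-\infty}^{0} \frac{u_x}{2\sin^2\frac{u}{2}}\,|\varphi_2|^2\,dx + \cot\frac{u_0}{2}\,|\varphi_2(0)|^2, \tag{19}
\]
where again there are no boundary terms at −∞ due to the exponential boundedness of φ_2. Therefore, we find that I_−, the contribution to the Krein signature from the left half-line, satisfies the equation
\[
\left(r + \frac{1}{r}\right) I_- - 4\cot\frac{u_0}{2}\,|\varphi_2(0)|^2 + 2i\tan\theta\,\langle\Phi(0), \tau_3\Phi(0)\rangle
= 2\int_{-\infty}^{0} \left( \frac{\left(r + \frac{1}{r}\right)\sin\theta}{\sin\frac{u}{2}}\,|\varphi_2|^2 + \frac{u_x}{\sin^2\frac{u}{2}}\,|\varphi_2|^2 \right) dx . \tag{20}
\]
In particular, notice that the right-hand side of (20) is positive by our assumption on the potential u. Similarly defining
\[
I_+ = \int_{0}^{\infty} \left( \sin(\theta)\sin\left(\frac{u}{2}\right)|\Phi|^2 - i\cos(\theta)\cos\left(\frac{u}{2}\right)\langle\Phi, \tau_3\Phi\rangle \right) dx
\]
and working with the φ_1 equation on [0, ∞) gives
\[
\left(r + \frac{1}{r}\right) I_+ - 4\cot\frac{u_0}{2}\,|\varphi_1(0)|^2 + 2i\tan\theta\,\langle\Phi(0), \tau_3\Phi(0)\rangle
= 2\int_{0}^{\infty} \left( \frac{\left(r + \frac{1}{r}\right)\sin\theta}{\sin\frac{u}{2}}\,|\varphi_1|^2 - \frac{u_x}{\sin^2\frac{u}{2}}\,|\varphi_1|^2 \right) dx .
\]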
Putting these results together, we see that I_− + I_+ > 0 if
\[
\cot\frac{u_0}{2}\,|\Phi(0)|^2 - i\tan\theta\,\langle\Phi(0), \tau_3\Phi(0)\rangle \geq 0 .
\]
Since we have
\[
\cot\frac{u_0}{2}\,|\Phi(0)|^2 - i\tan\theta\,\langle\Phi(0), \tau_3\Phi(0)\rangle \geq \left( \cot\frac{u_0}{2} - \tan\theta \right)|\Phi(0)|^2 ,
\]
we see that the Krein signature is positive if 0 ≤ θ ≤ (π − u_0)/2, which completes the proof.
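The two angular bounds in Theorem 2 are easy to confuse, so a small calculator can help when trying specific amplitudes. The sketch below only encodes the arithmetic of the theorem's statement for the first quadrant; the function name `theorem2_sectors` and its return convention are hypothetical.

```python
import math


def theorem2_sectors(u0):
    """Angular bounds from Theorem 2 for a potential with maximum u0 in (0, pi).

    Returns (spectral_bound, krein_bound, gap):
      spectral_bound: eigenvalues in the first quadrant satisfy theta < u0 / 2;
      krein_bound:    eigenvalues with theta <= (pi - u0) / 2 lie on the unit circle;
      gap:            the interval of theta not covered by the Krein estimate,
                      or None if the two bounds overlap.
    """
    if not 0.0 < u0 < math.pi:
        raise ValueError("Theorem 2 assumes 0 < u0 < pi")
    spectral_bound = u0 / 2
    krein_bound = (math.pi - u0) / 2
    gap = (krein_bound, spectral_bound) if spectral_bound > krein_bound else None
    return spectral_bound, krein_bound, gap


# u0 = pi/3: the Krein estimate covers the whole admissible sector.
print(theorem2_sectors(math.pi / 3))
# u0 = 2*pi/3: an uncovered window of angles remains.
print(theorem2_sectors(2 * math.pi / 3))
```

For u_0 ≤ π/2 the returned gap is `None`, matching the statement that all eigenvalues then lie on the unit circle; for larger amplitudes the window (π − u_0)/2 < θ < u_0/2 is exactly where the signature argument gives no information.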
Remark 5. Theorem 2 can easily be seen to hold under the more general hypothesis that u(x) is a piecewise C¹ function with u_x ≥ 0 for a.e. x < 0 and u_x ≤ 0 for a.e. x > 0, having a finite number of jump discontinuities {x_j}_{j=1}^n with sign(cot(u/2)|_{x_j^-}^{x_j^+}) = −sign(x_j), and satisfying ‖u‖_{L^∞} < π, sin(u/2) ∈ L¹, and the asymptotic condition lim_{x→±∞} cos(u/2) = 1. As in the previous theorem the only change is the addition of some boundary terms to the integration by parts in (19), which are of the correct sign by hypothesis, together with a similar correction to the integration by parts in deriving the formula for I_+. We also believe this should be true for functions of bounded variation, u ∈ BV(R), with sin(u/2) ∈ L¹ and µ defined via u_x dx = dµ being a signed measure of the form µ = µ_left − µ_right with supp µ_left ⊂ (−∞, 0], supp µ_right ⊂ (0, ∞) and µ_left((−∞, ∞)) = µ_right((−∞, ∞)) < π, but we have not shown this.

Before we move on we point out an interesting corollary.

Remark 6. Suppose Φ is an eigenvector of (4) corresponding to an eigenvalue on the unit circle. Then Φ = (φ_1, φ_2)^T satisfies
\[
\int_{\mathbb{R}} \varphi_1^* \varphi_{1,x}\,dx = \int_{\mathbb{R}} \varphi_2^* \varphi_{2,x}\,dx .
\]
Proof. From (4) we see that if z is an eigenvalue, any corresponding eigenfunction Φ satisfies
\[
4\int_{\mathbb{R}} \left( \varphi_1^* \varphi_{1,x} - \varphi_2^* \varphi_{2,x} \right) dx = -i\left(z - \frac{1}{z}\right)\int_{\mathbb{R}} \cos\frac{u}{2}\,|\Phi|^2\,dx + i\left(z + \frac{1}{z}\right)\int_{\mathbb{R}} \sin\frac{u}{2}\,\langle\Phi, \tau_3\Phi\rangle\,dx .
\]
Since we know r = 1, the right-hand side is purely real while, by integration by parts, the left-hand side is purely imaginary.

This result has a nice physical intuition: eigenvalues on the unit circle correspond to stationary breathers. This zero momentum condition is a reflection in the spectral domain of the fact that such solutions correspond to stationary breathers.

Theorem 2 guarantees that the point eigenvalues lie on the unit circle if u_0 < π/2. In the next section we will improve this to allow higher amplitude potentials (u_0 < π) using a proof based on the argument principle. Finally, to conclude this section, we show that under the hypotheses of Theorems 1 and 2 the point eigenvalues are necessarily simple.
Theorem 3. If u satisfies the hypothesis of Theorem 1 (or Remark 4), then the eigenvalues of (4) are simple. If u satisfies the hypothesis of Theorem 2 (or Remark 5) then the eigenvalues satisfying 0 ≤ arg(z) ≤ (π − u_0)/2 are simple.

Proof. This proof follows that given for Klaus and Shaw's analogous result for the Zakharov-Shabat system (see [13]). As noted in the previous section this is to be expected from the positivity of the Krein signature: by Lemma 3, a definite Krein signature guarantees that the corresponding eigenvalue has the same algebraic and geometric multiplicities which, for equations of second order, implies simplicity. We define the Wronskian of Ψ and Φ to be W(Ψ, Φ) = ψ_1φ_2 − ψ_2φ_1, where Ψ and Φ are the Jost solutions defined in (5) or (6), depending of course on the value of Q_top. We say z is a double eigenvalue of (4) if Ẇ(Ψ, Φ)(x, z) = 0, where ȧ denotes differentiation of a with respect to z, so that z is a double zero of the Wronskian. We now derive an expression for Ẇ(Ψ, Φ)(x, z) using the eigenvalue problem (4). If v is an L² eigenfunction corresponding to an eigenvalue z of (4), then it must be a multiple of both Φ and Ψ, and hence there exists a non-zero constant C such that Φ = CΨ. Then if W(z) := W(Ψ, Φ)(x, z), which is independent of x, we have
\[
\dot W(z) = W(\dot\Psi, \Phi) + W(\Psi, \dot\Phi) = C\,W(\dot\Psi, \Psi) + \frac{1}{C}\,W(\Phi, \dot\Phi) .
\]
Now, the fundamental theorem of calculus implies
\[
W(\dot\Psi, \Psi)(x, z) - W(\dot\Psi, \Psi)(-x, z) = \int_{-x}^{x} \left( \dot\psi_1 \psi_2 - \dot\psi_2 \psi_1 \right)_t\,dt .
\]
Using r = 1 in (4), a tedious calculation yields
\[
\left( \dot\psi_1 \psi_2 - \dot\psi_2 \psi_1 \right)_t = \frac{1}{2z}\left( \sin\theta\,\sin\frac{u}{2}\left( \psi_1^2 - \psi_2^2 \right) - 2i\cos\theta\,\cos\frac{u}{2}\,\psi_1\psi_2 \right).
\]
Since Ψ and Φ decay exponentially in their respective directions, it follows that
\[
\lim_{x\to\infty} W(\dot\Psi, \Psi)(x, z) = 0 \quad\text{and}\quad \lim_{x\to-\infty} W(\dot\Phi, \Phi)(x, z) = 0 .
\]
Therefore, if z is an eigenvalue of (4),
\[
\begin{aligned}
\dot W(z) &= \lim_{x\to\infty} \left( C\,W(\dot\Psi, \Psi)(-x, z) + \frac{1}{C}\,W(\Phi, \dot\Phi)(x, z) \right) \\
&= C \lim_{x\to\infty} W(\dot\Psi, \Psi)(-x, z) \\
&= -\frac{C}{2z}\left( \sin\theta \int_{\mathbb{R}} \sin\frac{u}{2}\left( \psi_1^2 - \psi_2^2 \right) dt - 2i\cos\theta \int_{\mathbb{R}} \cos\frac{u}{2}\,\psi_1\psi_2\,dt \right).
\end{aligned}
\]
Since Proposition 1 implies the functions φ_1 and iφ_2 can be chosen to be real, we have that
\[
\dot W(z) = -\frac{C}{2z}\,\kappa,
\]
where κ is the Krein signature. Hence z is not a double eigenvalue of (4) if κ ≠ 0.
Thus, if u(x) satisfies the hypothesis of Theorem 1, all the eigenvalues lie on the unit circle and are simple. For the Klaus-Shaw potentials discussed in Theorem 2, Theorem 3 implies that all the eigenvalues with 0 < arg(z) ≤ (π − u_0)/2 lie on the unit circle and are simple. Obviously the same holds in the lower half-plane by reflection. Notice that these theorems contain no information about eigenvalues in the sector arg(z) ∈ ((π − u_0)/2, π/2). This is a reflection of the fact that we do not have a definite Krein signature estimate there. In the next section we present a more topological argument which improves the result of Theorem 2. In particular, we show that (assuming u_0 < π) all of the eigenvalues in the sector arg(z) ∈ ((π − u_0)/2, π/2) are simple and lie on the unit circle.

4. Counting Eigenvalues

In the previous section, we proved that if u satisfies the hypothesis of Theorem 2 with ‖u‖_{L^∞(R)} ≤ π/2, then all eigenvalues of (4) lie on the unit circle. In the case where π/2 < u_0 < π, however, there is a sector given by
\[
S := \left\{ z = r\exp(i\theta) : \frac{\pi - u_0}{2} < \theta < \frac{u_0}{2} \right\},
\]
where eigenvalues could a priori live off of the unit circle. The main goal of this section is to improve the result of Theorem 2 and to derive a count on the number of eigenvalues. The strategy we follow is this: we give a lower bound on the number of eigenvalues on the unit circle that intersect the upper half-plane, and an upper bound on the number of eigenvalues in the upper half-plane. Under the hypotheses of Theorem 2 the lower bound is the same as the upper bound. Thus all of the eigenvalues must lie on the unit circle, and are simple. To establish this we need the following lemma, which guarantees that eigenvalues in S lie in a bounded region.

Lemma 5. Let u satisfy the hypothesis of Theorem 2. Then the point eigenvalues in the upper half-plane are confined to a compact subset.

Proof. This is a situation where it is convenient to work in a gauge other than the symmetric gauge.
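To this end we define a new variable Ψ which is related to Φ via the gauge transformation
\[
\Psi = \begin{pmatrix} \cos\frac{u}{4} & -\sin\frac{u}{4} \\ \sin\frac{u}{4} & \cos\frac{u}{4} \end{pmatrix} \Phi,
\]
where Φ is a solution of (4). Then Ψ is a solution of
\[
\Psi_x = \frac{iz}{4}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\Psi + \frac{u_x}{4}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\Psi + \frac{i}{4z}\begin{pmatrix} -\cos(u) & \sin(u) \\ \sin(u) & \cos(u) \end{pmatrix}\Psi .
\]
Define
\[
\Theta(z, x) := \int_0^x \left( \frac{iz}{4} - \frac{i}{4z}\cos(u) \right) dy
\]
and define v_1 and v_2 by ψ_1(x) = v_1(x) exp(Θ(z, x)) and ψ_2(x) = v_2(x) exp(−Θ(z, x)). Then v_1 and v_2 satisfy the integral equations
\[
\begin{aligned}
v_1(x) &= 1 + \int_{-\infty}^{x} \left( \frac{u_x}{4} + \frac{iz}{4}\sin(u) \right)(s)\,e^{-2\Theta(z,s)}\,v_2(s)\,ds, \\
v_2(x) &= \int_{-\infty}^{x} \left( -\frac{u_x}{4} + \frac{iz}{4}\sin(u) \right)(t)\,e^{2\Theta(z,t)}\,v_1(t)\,dt,
\end{aligned}
\]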
Fig. 1. Eigenvalue bounds: For any potential in L¹ the eigenvalues cannot exist in the light shaded region {Im(z) ≥ C} ∪ {Im(z^{-1})^{-1} ≤ C}. For potentials of Klaus-Shaw type no eigenvalues can exist in the black sector, and any eigenvalues in the dark grey sector must be confined to the unit circle (denoted in white). When the single maximum has height u_0 < π/2 the two sectors overlap and all eigenvalues must lie on the unit circle (left). When u_0 ∈ (π/2, π) the two sectors do not overlap and a priori there can exist eigenvalues in the white region (right). When u_0 ≥ π we lose all control over the locations of the eigenvalues and there is no longer any sector in which the eigenvalues are guaranteed to be on the unit circle.
or, equivalently, $v_1(x) = 1 + T(v_1)$, where $T : L^\infty \to L^\infty$ is the linear operator obtained by substituting the second integral equation into the first,
\[ T(\varphi)(x) = \int_{-\infty}^{x} \int_{-\infty}^{0} f_+(s;z)\, f_-(t+s;z)\, \exp\bigl( 2\Theta(z,t+s) - 2\Theta(z,s) \bigr)\, \varphi(t+s)\, dt\, ds, \]
with $f_\pm(x;z) := \left( \pm\frac{u_x}{4} + \frac{iz}{4}\sin(u) \right)(x)$. It is straightforward to check that we have the estimate
\[ \|T\|_{L^\infty \to L^\infty} \le C\,\bigl|\operatorname{Im}(z^{-1})\bigr|^{-1} = C\,\frac{a^2 + b^2}{b}, \]
where $z = a + ib$ and $C$ is a constant depending on the $L^1$ norm of the functions $\sin\frac{u}{2}$ and $u_x$. Note that the assumption that $u$ has a single extremum implies that it is a function of bounded variation, and $\|u_x\|_{L^1}$ can be replaced by $\|u\|_{BV}$ with no change in the argument. An appeal to the contraction mapping theorem shows that if $|\operatorname{Im}(z^{-1})|^{-1}$ is sufficiently small then $v_1(x)$ must be uniformly close to 1 and thus $z$ cannot be an eigenvalue, as the boundary conditions require vanishing of $v_1(x)$ as $x \to \infty$. Finally, by applying the transformation $z \mapsto \frac{1}{z}$, it follows that there is an upper bound on $\operatorname{Im}(z)$ for eigenvalues of (4) (see Fig. 1).
We now turn to the problem of counting the number of discrete eigenvalues associated with (4) for a given potential $u$. As mentioned in the Introduction, Klaus and Shaw were able to derive an exact count of the number of discrete eigenvalues of (1) in terms of the $L^1$ norm of the potential $q$ (see [14]). In this section, we derive an analogous result for the eigenvalue problem (4): we show the number of discrete eigenvalues is determined by the $L^1$ norm of $\sin\frac{u}{2}$. Since we are mainly concerned with extending the result of Theorem 2, unless otherwise stated we assume throughout this section that $u \in C^1$ is non-negative and compactly supported in the interval $[-d, d]$. It is straightforward to extend this to the hypothesis of Theorem 2 (and, more generally, to those mentioned in Remark 5) by limiting arguments, which we do not detail here: the arguments do not substantially differ from those of Klaus and Shaw for the Zakharov-Shabat eigenvalue problem [14].
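As a rough numerical illustration of how the count depends on the $L^1$ norm of $\sin\frac{u}{2}$, the following sketch evaluates $I = \int_{-d}^{d} \sin(u/2)\,dx$ for a sample potential and reads off the integer $N$ that appears in Theorems 4 and 5 of this section (the largest non-negative integer with $I > (2N-1)\pi$). The cos² bump, the function names and the Simpson quadrature are illustrative assumptions and are not taken from the paper; the sketch only evaluates the threshold formula and does not compute eigenvalues of (4).

```python
import math


def sin_half_integral(u, d, steps=2000):
    """Composite Simpson approximation of I = int_{-d}^{d} sin(u(x)/2) dx."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = 2.0 * d / steps
    total = math.sin(u(-d) / 2.0) + math.sin(u(d) / 2.0)
    for j in range(1, steps):
        weight = 4.0 if j % 2 else 2.0
        total += weight * math.sin(u(-d + j * h) / 2.0)
    return total * h / 3.0


def eigenvalue_count(integral):
    """Largest non-negative integer N such that integral > (2N - 1) * pi."""
    n = 0
    while integral > (2 * (n + 1) - 1) * math.pi:
        n += 1
    return n


def bump(u0, d):
    """Hypothetical C^1 single-hump potential of height u0 supported in [-d, d]."""
    def u(x):
        if abs(x) >= d:
            return 0.0
        return u0 * math.cos(math.pi * x / (2.0 * d)) ** 2
    return u


# Sample values (assumptions): height 0.9*pi < pi, half-width d = 10.
d = 10.0
I = sin_half_integral(bump(0.9 * math.pi, d), d)
print("I / pi =", I / math.pi, "N =", eigenvalue_count(I))
```

Widening the support or raising the height increases $I$, and each time $I$ passes an odd multiple of $\pi$ the value of $N$ goes up by one.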
J. C. Bronski, M. A. Johnson
We define $M(z;u) = [M_{i,j}(z;u)]_{i,j}$ to be the transfer matrix across the support of the potential: in other words $M(z;u) = (\Phi_1(d)\ \Phi_2(d))$, where $\Phi_{1,2}(x)$ are solutions to (3) satisfying
\[ \Phi_1(-d) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad \Phi_2(-d) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]
Obviously $z$ is an eigenvalue if and only if
\[ \begin{pmatrix} 0 \\ 1 \end{pmatrix} \propto M(z;u) \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \]
or, equivalently, $M_{11}(z;u) = 0$. Note that standard arguments show that $M_{11}(z;u)$ is analytic in the upper half-plane. For eigenvalue parameter $z = e^{i\theta}$ on the unit circle it is also convenient to introduce the Prüfer variables $\rho$ and $\eta$, which are defined via (4):
\[ \Phi_1(x) = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = \begin{pmatrix} \rho\cos\eta \\ -i\rho\sin\eta \end{pmatrix}. \]
It is easy to check that for a fixed $\theta$, $\rho(x;\theta)$ and $\eta(x;\theta)$ satisfy the coupled system of differential equations
\[ \begin{aligned} -2\eta' &= \cos\theta\,\sin\left(\frac{u}{2}\right) + \sin\theta\,\cos\left(\frac{u}{2}\right)\sin(2\eta), \\ 2\rho' &= \sin\theta\,\cos\left(\frac{u}{2}\right)\rho\cos(2\eta), \end{aligned} \tag{21} \]
subject to the boundary conditions $\eta(-d;\theta) = 0$ and $\rho(-d;\theta) = 1$. As usual, the eigenvalue condition can be translated to a condition on the Prüfer angle variable. Indeed, $\exp(i\theta)$ is an eigenvalue of (4) if and only if $\eta(d;\theta) = \frac{2k-1}{2}\pi$ for some $k \in \mathbb{Z}$ (since then the boundary condition $\phi_1(d) = 0$ is satisfied).
Theorem 4. Let $u \in C^1(\mathbb{R})$ have compact support, and let $N$ be the largest non-negative integer such that $\left| \int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx \right| > (2N-1)\pi$. Then there exist at least $N$ eigenvalues of (4) on the unit circle in the open positive quadrant. In particular, if $\left| \int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx \right| > \pi$, then there exists at least one eigenvalue on the unit circle.
Proof. The main observation here is that Eq. (21) can be explicitly solved when $z = 1$ and $z = i$. If $\theta = 0$, then (21) reduces to $-2\eta' = \sin\frac{u}{2}$ and hence
\[ -2\int_{-d}^{d} \eta'\, dx = -2\eta(d;0) = \int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx. \]
Defining $I := \int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx$, we have $\eta(d;0) = -\frac{1}{2}I$. Similarly, letting $\theta = \frac{\pi}{2}$ in (21) gives
\[ -2\eta' = \cos\left(\frac{u}{2}\right)\sin(2\eta). \]
Since the right-hand side is Lipschitz, the initial condition $\eta\left(-d;\frac{\pi}{2}\right) = 0$ implies $\eta\left(d;\frac{\pi}{2}\right) = 0$. Thus, if $|I| > (2N-1)\pi$ it follows from the continuity of $\eta(d;\cdot)$ that there exist $0 < \theta_1 < \theta_2 < \cdots < \theta_N < \frac{\pi}{2}$ such that
\[ |\eta(d;\theta_k)| = (2(N-k)+1)\frac{\pi}{2} \]
for each $k = 1, 2, \ldots, N$, and the stated lower bound holds.
Remark 7. It is worthwhile casting Theorem 4 in a geometric light. For each $z$ on the unit circle the transfer matrix $M(z;u)$ can be associated to a real symplectic matrix $\tilde{M}(z;u)$ via
\[ \tilde{M}(z;u) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} M(z;u) \begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix}. \]
There is a Maslov index [16] $\mu_{\tilde{M}}$ associated with this one parameter family of symplectic matrices $\tilde{M}(z;u)$. The change in the Prüfer angle between $\theta = 0$ and $\theta = \frac{\pi}{2}$ is related to the Maslov index $\mu$ of the curve $\tilde{M}(z;u)$ relative to the vertical subspace via
\[ \mu = \frac{\eta\left(d;\frac{\pi}{2}\right) - \eta(d;0)}{\pi}. \]
The Maslov index is a signed sum over intersections of the curve $\tilde{M}(z;u)$ with the Maslov cycle, with the sign of each intersection being related to the Krein signature. A similar picture holds in the case of topological charge 1, without the factor of $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. This reflects the difference in boundary conditions between even and odd topological charge.
The geometric interpretation makes it clear that in the case where the Krein signature is of fixed sign ($\|u\|_\infty < \frac{\pi}{2}$) the count given in Theorem 4 should be exact. This is the content of the next lemma. In the case where $u$ satisfies the hypothesis of Theorem 2 with $\frac{\pi}{2} < u_0 < \pi$, this counting scheme offers no improvement. However, if we know all the discrete eigenvalues of (4) lie on the unit circle and are simple, Theorem 4 produces an exact count.
Lemma 6. Suppose $u$ is a compactly supported potential satisfying the hypothesis of Theorem 2 with $\|u\|_{L^\infty(\mathbb{R})} \le \frac{\pi}{2}$. If $N$ is defined as in Theorem 4, then there exist exactly $N$ eigenvalues of (4) in the open positive quadrant, all of which live on the unit circle and are simple.
Proof. We will prove monotonicity of $\eta(d;\theta)$ with respect to $\theta$ at an eigenvalue. By definition, $\tan\eta = i\frac{\phi_2}{\phi_1}$. Differentiating this with respect to $\theta$ and using the relation $\cos\eta = \frac{\phi_1}{\rho}$ we get
\[ \dot{\eta}(d;\theta) = -\frac{i}{|\Phi(d)|^2}\left( \dot{\phi}_1(d)\phi_2(d) - \phi_1(d)\dot{\phi}_2(d) \right), \tag{22} \]
where $\dot{f} := \frac{df}{d\theta}$. Using (4) to integrate (22) and noting that $\dot{\phi}_1(-d) = \dot{\phi}_2(-d) = 0$ from the boundary conditions, we have
\[ \dot{\eta}(d;\theta) = \frac{1}{2|\Phi(d)|^2}\left( \sin\theta \int_{-d}^{d} \sin\left(\frac{u}{2}\right)(\phi_1^2 - \phi_2^2)\, dx - 2i\cos\theta \int_{-d}^{d} \cos\left(\frac{u}{2}\right)\phi_1\phi_2\, dx \right), \]
and hence $2|\Phi(d)|^2\,\dot{\eta}(d;\theta) = \kappa$, which is always positive at an eigenvalue by Theorem 2. This rules out multiple crossings of $\eta(d;\theta) = \frac{2k-1}{2}\pi$ and hence there are exactly $N$ eigenvalues of (4), all of which live on the unit circle and are simple by Theorems 2 and 3.
When $u$ satisfies the hypothesis of Theorem 2 with $\|u\|_{L^\infty} > \frac{\pi}{2}$ we no longer have control of the Krein signature for all eigenvalues of (4) in the open positive quadrant. It follows that the count given in Theorem 4 gives a priori a lower bound on the number of eigenvalues in the upper sector. We now obtain an upper bound on the number of eigenvalues by a homotopy argument: we allow the strength of the potential to vary and count the eigenvalues as they emerge from the continuous spectrum. To this end, we consider a potential $u$ satisfying the hypothesis of Theorem 2 with $\frac{\pi}{2} < \|u\|_{L^\infty} < \pi$. For each such $u$, define a one parameter family of potentials $u_a(x) := a\,u(x)$ for $a \in [0,1]$. For small enough values of $a$, Theorems 2 and 3 imply the discrete eigenvalues lie on the unit circle and are simple. The total number of eigenvalues in the upper half-plane is given by
\[ N_{UHP} = \frac{1}{2\pi i}\int_{\Gamma} \frac{M_{11}'}{M_{11}}\, dz, \]
where $\Gamma$ is the contour denoted in Eq. (4). By the usual arguments $N_{UHP}$ is piecewise constant and increases whenever an eigenvalue crosses the contour $\Gamma$. Lemma 5 shows that the eigenvalues must be bounded away from the origin and are constrained to lie on the unit circle in a sector containing the real axis. Thus the only place where eigenvalues can cross the contour $\Gamma$ is at $z = \pm 1$. As noted before the equation is easy to solve at $z = \pm 1$, with
\[ M_{11}(\pm 1; u_a) = \cos\left(\frac{1}{2}I(a)\right), \]
where $I(a) := \int_{-d}^{d} \sin\left(\frac{u_a}{2}\right) dy$. This implies that an upper bound on the total number of eigenvalues of (4) in the open positive quadrant is given by the total number of zeroes of the function
\[ F(a) := \cos\left(\frac{1}{2}I(a)\right) \]
on $[0,1)$. Since $I(0) = 0$ and $I$ is an increasing function of $a$, we immediately see that if $N$ is the largest non-negative integer such that $I(1) > (2N-1)\pi$, then there exists a total of at most $N$ discrete eigenvalues of (4) in the open positive quadrant. Thus, we see the lower bound given by Theorem 4 is also an upper bound on the total number of eigenvalues of (4) in the open positive quadrant. By noticing that Theorem 4 does not respect the multiplicities of the eigenvalues on the unit circle, this proves the following improvement of Theorem 2.
Theorem 5. Let $u$ be a compactly supported potential satisfying the hypothesis of Theorem 2. Let $N$ be the largest non-negative integer such that $\int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx > (2N-1)\pi$. Then there exist exactly $N$ discrete eigenvalues of (4) in the open positive quadrant, all of which lie on the unit circle and are simple.
As mentioned earlier it is not difficult to extend this from the compactly supported case to the case when $\sin(\frac{u}{2}) \in L^1(\mathbb{R})$: the basic idea is to check that the whole-line
boundary value problem satisfied by the Prüfer angle has a unique solution. This does not differ substantially from the calculation of Klaus and Shaw [14].
Note that if $u$ satisfies the hypothesis of Theorem 2 then $\pi$ is the threshold $L^1$ norm for $\sin\frac{u}{2}$ for the existence of discrete eigenvalues for (4). It is not difficult to check that this threshold persists for a more general class of potentials.
Remark 8. Let $u \in C^1(\mathbb{R})$ have compact support in $[-d,d]$, be of fixed sign, and satisfy $\|u\|_{L^\infty} \le \pi$. If
\[ \left| \int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx \right| \le \pi, \]
then there do not exist any eigenvalues of (4) on the unit circle.
There is an analogous count on the number of eigenvalues for potentials satisfying the hypotheses of Theorem 1. This case is somewhat simpler since we have positivity of the Krein signature under the monotonicity assumption without any additional assumptions. The proof is the same modulo some changes in the boundary conditions on the Prüfer angle. The statement of the analogous theorem for simple kink potentials is as follows.
Theorem 6. Let $u$ have compactly supported derivative and satisfy the hypotheses of Theorem 1. Let $N$ be the largest integer such that $\int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx > 2N\pi$. Then there exist exactly $2N+1$ eigenvalues of (4) in the open upper half plane, all of which live on the unit circle and are simple.
Remark 9. Note that if $u$ is not monotone, Theorem 6 gives a lower bound on the number of eigenvalues of (4) in the open upper half plane.
Proof. First, notice that if $u$ satisfies the hypothesis of Theorem 1, then $\exp(i\theta)$ is an eigenvalue of (4) if $\eta(d;\theta) = k\pi$ for some $k \in \mathbb{Z}$. Next, notice that if $\int_{-d}^{d} \sin\left(\frac{u}{2}\right) dx > 2N\pi$, then $\eta(-d;0) > N\pi$. Since $\eta(d;0) = 0$ as in the proof of Theorem 4, we know there exist $\theta_1, \theta_2, \ldots, \theta_N \in \left(0, \frac{\pi}{2}\right)$ such that $z_k = e^{i\theta_k}$ is an eigenvalue of (4) for each $k = 1, 2, \ldots, N$. By Theorems 1 and 3, and the proof of Lemma 6, we see the set $\{z_k\}_{k=1}^{N}$ consists of all the eigenvalues of (4) in the open positive quadrant. The theorem follows by recalling that $i$ is always an eigenvalue of (4) for kink-like data.
5. Conclusions
In this paper we have proven a Klaus-Shaw type theorem for the Sine-Gordon eigenvalue problem for kink-like potentials with topological charge $\pm 1$ (under the assumption of monotonicity) and for breather-like potentials under the assumption that the potential $u$ has a single maximum of height less than $\pi$. Note that this implies that $\sin(u/2)$ has a single maximum. The main analytical difficulty in dealing with the case where the height of the maximum is greater than $\pi$ is that we are no longer able to show that the eigenvalues emerge from the essential spectrum at $z = \pm 1$, so a priori eigenvalues can emerge from anywhere along the real axis with (potentially) arbitrary multiplicity. Using the techniques of this paper it is still straightforward to establish a lower bound (though not an upper bound) for the number of eigenvalues on the unit circle, but we have little or no information about the number of eigenvalues off of the unit circle.
Tentative numerical experiments have indicated that the first result is probably tight: monotone kinks of higher topological charge and non-monotone kinks of topological charge $\pm 1$ frequently have point eigenvalues which do not lie on the unit circle. Similar experiments on breather-like potentials suggest that this result may be improved. In particular, for breather-like potentials with a single maximum we have not observed point spectrum off of the unit circle until the height of the maximum reaches $2\pi$. Geometrically there is some further evidence to support this: the monodromy matrix at $z = 1$ has the property that the winding number is strictly increasing for Klaus-Shaw potentials of height less than $2\pi$. We are currently investigating whether an improvement of the theorem along these lines is possible.
It is worth noting that there is no interesting extension of these results to the periodic case. It is easy to compute that the Floquet discriminant of the problem always lies in the interval $[-2, +2]$ when $\lambda$ lies on the real axis. If the Floquet discriminant has a critical point on the interior $(-2, 2)$ then by a simple analyticity argument the problem must have spectrum lying off of the real axis. Thus the only case in which the eigenvalue problem has spectrum confined to the union of the real axis and the unit circle is when all of the critical points of the Floquet discriminant on the real axis are double points. In this case the potential is necessarily a finite gap solution, and can be constructed by algebro-geometric methods [1].
We would also like to briefly discuss the more recent work of Klaus and Mityagin [12] on the eigenvalues of the Zakharov-Shabat eigenvalue problem with a real potential. Klaus and Mityagin were able to prove a number of very nice results on the Zakharov-Shabat eigenvalue problem. Klaus and Mityagin basically study three questions: how eigenvalues emerge from the essential spectrum at $z = 0$ onto the imaginary axis, and conversely how they are reabsorbed; how eigenvalues on the imaginary axis collide and split off into the complex plane; and how eigenvalues emerge from the essential spectrum at points other than $z = 0$. Many of the arguments employ a homotopy argument in the strength of the potential, similar to that employed in the proof of Theorem 5 in this paper, and require an estimate of the derivative of the eigenvalue with respect to the coupling constant of the potential. This is given by the ratio of two terms, one of which is exactly the Krein signature of the eigenvalue. Klaus and Mityagin also employ large coupling constant asymptotics to understand the asymptotic behavior of the eigenvalues. As but one example, they prove a very pretty theorem showing a semi-circle law for the percentage of eigenvalues which emerge from $z = 0$ vs. the percentage of eigenvalues which are absorbed into $z = 0$ in the large coupling constant limit (Theorem 2.6). It seems clear that these techniques could be applied to the Faddeev-Takhtajan eigenvalue problem. The large coupling constant limit is (modulo a rescaling) the same as the semiclassical limit of the sine-Gordon problem. This is a particularly interesting limit (Josephson junctions operate in this limit) so theorems of this nature would be particularly welcome.
Acknowledgement. The authors gratefully acknowledge useful conversations with Robbie Buckingham, Richard Kollár, and Peter Miller. The authors would also like to acknowledge support under NSF DMS0354462.
6. Appendix
In this section, we mention one of the more technical but standard results which are needed in making the above arguments rigorous. Namely, we prove that eigenfunctions of (4) are exponentially bounded as $x \to \infty$ when $u$ satisfies either the hypothesis of Theorem 1 or Theorem 2.
As mentioned in the preliminaries, for potentials $u$ satisfying the hypothesis of Theorem 1, we define the Jost solutions of (4) by the asymptotic properties (5) for $\operatorname{Im}(z) > 0$. Note that these solutions differ from the standard Jost solutions by a normalization at $\pm\infty$. From scattering theory, we know that up to constant multiples $\Psi$ and $\Phi$ are the unique solutions of (4) which are square integrable on $(-\infty, 0]$ and $[0, \infty)$, respectively. It follows that if $v$ is any eigenfunction of (4) corresponding to an eigenvalue $z$, then $v$ must be a multiple of both $\Phi(x,z)$ and $\Psi(x,z)$.
To show exponential boundedness of the Jost solutions, we begin by factoring off the asymptotic behavior at $\pm\infty$. To this end, fix an eigenvalue $z$ in the open upper half plane and define $\tilde{\Psi}(x) := \Omega(x)\Psi(x,z)$ and $\tilde{\Phi}(x) := \Omega(x)\Phi(x,z)$ where
\[ \Omega(x) = \exp\left( -\frac{1}{4}\left( z - \frac{1}{z} \right) \int_x^0 \cos\left(\frac{u}{2}\right) dy\, \tau_1 \right). \]
It follows that $\tilde{\Psi}$ and $\tilde{\Phi}$ are the unique solutions of the following system of integral equations:
\[ \tilde{\Psi}(x) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{1}{4}\left( z + \frac{1}{z} \right) \int_{-\infty}^{x} \sin\left(\frac{u(y)}{2}\right) \Omega(y)^{-1}\tau_2\,\Omega(y)\tilde{\Psi}(y)\, dy, \]
\[ \tilde{\Phi}(x) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \frac{1}{4}\left( z + \frac{1}{z} \right) \int_{x}^{\infty} \sin\left(\frac{u(y)}{2}\right) \Omega(y)^{-1}\tau_2\,\Omega(y)\tilde{\Phi}(y)\, dy. \]
The first of these is used to obtain bounds as $x \to -\infty$, while the second can be used to obtain similar bounds as $x \to \infty$. By standard arguments involving the contraction mapping principle, one can show that $\tilde{\psi}_1 \in L^\infty(-\infty, 0)$, i.e.
\[ |\psi_1(x,z)| \le C \exp\left( -\frac{\beta}{4} \int_x^0 \cos\left(\frac{u}{2}\right) dy \right) \]
for some $C > 0$, where $\beta := \operatorname{Im}\left( z - \frac{1}{z} \right)$. Substituting this into the above integral equations, and noting that $\sin\frac{u}{2}$ is increasing on $(-\infty, 0)$, yields
\[ \frac{|\psi_2(x,z)|}{\left| z + \frac{1}{z} \right|} \le C \sin\left(\frac{u}{2}\right) \exp\left( \frac{\beta}{4} \int_x^0 \cos\left(\frac{u}{2}\right) dz \right) \int_{-\infty}^{x} \exp\left( -\frac{3\beta}{4} \int_y^0 \cos\left(\frac{u}{2}\right) dz \right) dy \]
for $x < 0$. In particular, this shows that if $\Psi$ is an eigenvector of (4), then $\csc\left(\frac{u}{2}\right)\psi_2 \in L^1(-\infty, 0)$ for any potential $u$ satisfying the hypothesis of Theorem 1. Similar results hold for $\phi_1$ and $\phi_2$ as $x \to \infty$. By letting $x \to -x$ above, it follows that eigenfunctions of (4) must be bounded and decay exponentially as $|x| \to \infty$ in the case $Q_{top} = \pm 1$. These results are vital in showing convergence of integrals when we study potentials with $Q_{top} = \pm 1$, as well as proving that certain boundary terms arising from integration by parts vanish. Similar arguments imply analogous results when $u$ satisfies the hypothesis of Theorem 2.
References
1. Belokolos, E.D., Bobenko, A.I., Enol’skii, V.Z., Its, A.R., Matveev, V.B.: Algebro-Geometric Approach to Nonlinear Integrable Equations. Berlin: Springer-Verlag, 1994
2. Bishop, A.R., Flesch, R., Forest, M.G., McLaughlin, D.W., Overman, E.A. II.: Correlations between chaos in a perturbed sine-Gordon equation and a truncated model system. SIAM J. Math. Anal. 21(6), 1511–1536 (1990)
3. Blas, H., Carrion, H.L.: Solitons, kinks and extended hadron model based on the generalized sine-Gordon theory. J. High Energy Phys. 1, 027, 27 pp. (2007) (electronic)
4. Buckingham, R., Miller, P.D.: Exact solutions of semiclassical non-characteristic cauchy problems for the sine-Gordon equation. http://arXiv.org/abs/07053159v1[nlin.SI], 2007
5. Cuenda, S., Sánchez, A., Quintero, N.: Does the dynamics of sine-gordon solitons predict active regions of dna? http://arXiv.org/abs/0606028v1[q-bio.6N], 2006
6. Faddeev, L.D., Takhtajan, L.A.: Hamiltonian Methods in the Theory of Solitons. Springer Series in Soviet Mathematics. Berlin: Springer-Verlag, 1987, Translated from the Russian by A. G. Reyman [A. G. Reĭman]
7. Forest, M.G., McLaughlin, D.W.: Spectral theory for the periodic sine-Gordon equation: a concrete viewpoint. J. Math. Phys. 23(7), 1248–1277 (1982)
8. Gibbon, J.D., James, I.N., Moroz, I.M.: An example of soliton behaviour in a rotating baroclinic fluid. Proc. Roy. Soc. London Ser. A 367(1729), 219–237 (1979)
9. Gohberg, I.C., Kreĭn, M.G.: Introduction to the Theory of Linear Nonselfadjoint Operators. Translated from the Russian by A. Feinstein. Translations of Mathematical Monographs, Vol. 18. Providence, RI: Amer. Math. Soc., 1969
10. Goodman, R.H., Haberman, R.: Interaction of sine-Gordon kinks with defects: the two-bounce resonance. Phys. D 195(3-4), 303–323 (2004)
11. Kaup, D.J.: Method for solving the sine-gordon equation in laboratory coordinates. Studies in Appl. Math. 54(2), 165–179 (1975)
12. Klaus, M., Mityagin, B.: Coupling constant behavior of eigenvalues of Zakharov-Shabat systems. J. Math. Phys. 48(12), 123502 (2007)
13. Klaus, M., Shaw, J.K.: Purely imaginary eigenvalues of Zakharov-Shabat systems. Phys. Rev. E (3) 65(3), 036607 (2002)
14. Klaus, M., Shaw, J.K.: On the eigenvalues of Zakharov-Shabat systems. SIAM J. Math. Anal. 34(4), 759–773 (2003)
15. Lennholm, E., Hörnquist, M.: Revisiting Salerno’s sine-Gordon model of DNA: active regions and robustness. Physica D Nonlinear Phenomena 177, 233–241 (2003)
16. McDuff, D., Salamon, D.: Introduction to Symplectic Topology. Oxford Mathematical Monographs. New York: The Clarendon Press/Oxford University Press, Second edition, 1998
17. Salerno, M.: Discrete model for dna promoter dynamics. Phys. Rev. A 44(8), 5292–5297 (1991)
18. Scott, A.C.: Magnetic flux annihilation in a large Josephson junction. In: Stochastic Behavior in Classical and Quantum Hamiltonian Systems (Volta Memorial Conf., Como, 1977), Volume 93 of Lecture Notes in Phys., Berlin: Springer, 1979, pp. 167–200
19. Shlizerman, E., Rom-Kedar, V.: Hierarchy of bifurcations in the truncated and forced nonlinear Schrödinger model. Chaos 15(1), 013107 (2005)
20. Takhtajan, L.A., Faddeev, L.D.: Essentially nonlinear one-dimensional model of classical field theory. Theor. Math. Phys. 21, 1046 (1974)
21. Terng, C.-L., Uhlenbeck, K.: Geometry of solitons. Notices Amer. Math. Soc. 47(1), 17–25 (2000)
22. Yakubovich, V.A., Starzhinskii, V.M.: Linear Differential Equations with Periodic Coefficients I, II. New York: Wiley, 1975
23. Yomosa, S.: Soliton excitations in deoxyribonucleic acid (dna). Phys. Rev. A 27(4), 2120–2125 (1983)
24. Zaharov, V.E., Takhtajan, L.A., Faddeev, L.D.: A complete description of the solutions of the “sine-Gordon” equation. Dokl. Akad. Nauk SSSR 219, 1334–1337 (1974)
Communicated by L. Takhtajan
Commun. Math. Phys. 288, 847–886 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0738-z
Communications in
Mathematical Physics
Post-Newtonian Expansions for Perfect Fluids Todd A. Oliynyk School of Mathematical Sciences, Monash University, Melbourne, VIC 3800, Australia. E-mail:
[email protected] Received: 8 May 2008 / Accepted: 18 October 2008 Published online: 18 February 2009 – © Springer-Verlag 2009
Abstract: We prove the existence of a large class of dynamical solutions to the Einstein-Euler equations that have a first post-Newtonian expansion. The results here are based on the elliptic-hyperbolic formulation of the Einstein-Euler equations used in [15], which contains a singular parameter $\epsilon = v_T/c$, where $v_T$ is a characteristic velocity associated with the fluid and $c$ is the speed of light. As in [15], energy estimates on weighted Sobolev spaces are used to analyze the behavior of solutions to the Einstein-Euler equations in the limit $\epsilon \searrow 0$, and to demonstrate the validity of the first post-Newtonian expansion as an approximation.
1. Introduction
The Einstein-Euler equations, which govern a gravitating perfect fluid, are given by
\[ G^{ij} = \frac{8\pi G}{c^4} T^{ij} \quad\text{and}\quad \nabla_i T^{ij} = 0, \]
where $T^{ij} = (\rho + c^{-2}p)v^i v^j + p g^{ij}$, with $\rho$ the fluid density, $p$ the fluid pressure, $v$ the fluid four-velocity normalized by $v^i v_i = -c^2$, $c$ the speed of light, and $G$ the Newtonian gravitational constant. Defining
\[ \epsilon = \frac{v_T}{c}, \]
where $v_T$ is a typical speed associated with the fluid, the Einstein-Euler equations, upon suitable rescaling [15], can be written in the form
\[ G^{ij} = 2\epsilon^4 T^{ij} \quad\text{and}\quad \nabla_i T^{ij} = 0, \tag{1.1} \]
where
\[ T^{ij} = (\rho + \epsilon^2 p)v^i v^j + p g^{ij} \quad\text{and}\quad v^i v_i = -\frac{1}{\epsilon^2}. \]
In this formulation, the fluid four-velocity $v^i$, the fluid density $\rho$, the fluid pressure $p$, the metric $g_{ij}$, and the coordinates $(x^i)$, $i = 1, \ldots, 4$, are dimensionless. By assumption, the $(x^i)$ are global Cartesian coordinates on spacetime $M \cong \mathbb{R}^3 \times [0, T)$, where the $(x^I)$ $(I = 1, 2, 3)$ are spatial coordinates that cover $\mathbb{R}^3$, and $t = x^4/v_T$ is a Newtonian time coordinate that covers the interval $[0, T)$. By a choice of units, we can and will set $v_T = 1$.
Post-Newtonian expansions for the Einstein-Euler system refer to expansions of solutions to this system in the parameter $\epsilon$, about $\epsilon = 0$, where the lowest expansion term is governed by the Poisson-Euler equations of Newtonian gravity:
\[ \partial_t \overset{0}{\rho} + \partial_I\bigl(\overset{0}{\rho}\,\overset{0}{w}{}^I\bigr) = 0, \tag{1.2} \]
\[ \overset{0}{\rho}\bigl(\partial_t \overset{0}{w}{}^J + \overset{0}{w}{}^I \partial_I \overset{0}{w}{}^J\bigr) = -\bigl(\overset{0}{\rho}\,\partial^J \overset{0}{\Phi} + \partial^J \overset{0}{p}\bigr), \tag{1.3} \]
\[ \Delta \overset{0}{\Phi} = \overset{0}{\rho}. \tag{1.4} \]
Here $\overset{0}{\rho}$, $\overset{0}{p}$, and $\overset{0}{w}{}^J$ are the fluid density, pressure, and three velocity, respectively.
Formal calculational schemes for determining the post-Newtonian expansion coefficients and the equations they satisfy exist, and are in wide use by physicists [5,9]. In fact, these post-Newtonian computational schemes are one of the most important techniques in general relativity for calculating physical quantities for the purpose of comparing theory with experiment. For example, in gravitational wave astronomy, post-Newtonian expansions are used to calculate gravitational wave forms that are emitted during gravitational collapse [5].
It is important to stress that the formal post-Newtonian expansion schemes all implicitly rely on the assumption that the expansions exist and approximate solutions to general relativity. Therefore, to establish existence of such approximations, and to answer questions about their range of validity, a different approach must be taken to the problem. In [15], we took a first step in analyzing this problem by proving the existence of a wide class of one-parameter families of solutions to the Einstein-Euler equations that converged in a suitable sense to the Poisson-Euler equations in the limit $\epsilon \searrow 0$. We also remark that similar results were established, using a different method, by Alan Rendall [19] for the Einstein-Vlasov equations.
In this paper, we use the results of [15] to prove the existence of a large class of solutions to the Einstein-Euler equations that can be expanded in $\epsilon$ to the first post-Newtonian order. Moreover, we demonstrate the existence of convergent expansions in $\epsilon$ for solutions to the Einstein-Euler equations. These expansions are, in general, not of the post-Newtonian type since the expansion coefficients can depend on $\epsilon$. Nevertheless, the expansions are convergent, and therefore represent a kind of generalized post-Newtonian expansion.
We note that analogous expansions for the Vlasov-Maxwell equations and Vlasov-Nordström equations have been rigorously analyzed in [2–4].
The difficulty in analyzing the post-Newtonian expansions arises from the fact that the limit $\epsilon \searrow 0$ is singular. To analyze this limit, we follow the approach of [15], which requires that the metric $g_{ij}$ and the fluid velocity $v^i$ are replaced with new variables that are compatible with the limit $\epsilon \searrow 0$. The new gravitational variable is a density $\bar{u}^{ij}$ defined via the formula
\[ g^{ij} = \frac{\epsilon}{\sqrt{-\det(Q)}}\, Q^{ij}, \tag{1.5} \]
where
\[ (Q^{ij}) = \begin{pmatrix} \delta^{IJ} & 0 \\ 0 & 0 \end{pmatrix} + \epsilon^2 \begin{pmatrix} 4\bar{u}^{IJ} & 0 \\ 0 & -1 \end{pmatrix} + 4\epsilon^3 \begin{pmatrix} 0 & \bar{u}^{I4} \\ \bar{u}^{J4} & 0 \end{pmatrix} + 4\epsilon^4 \begin{pmatrix} 0 & 0 \\ 0 & \bar{u}^{44} \end{pmatrix}. \tag{1.6} \]
From this, it is not difficult to see that the density $\bar{u}^{ij}$ is equivalent to the metric $g_{ij}$ for $\epsilon > 0$, and is well defined at $\epsilon = 0$. For the fluid, a new velocity variable $w^i$ is defined by
\[ v^I = w^I \quad\text{and}\quad w^4 = \frac{v^4 - 1}{\epsilon}. \tag{1.7} \]
For technical reasons, we assume an isentropic equation of state
\[ p = K\rho^{(n+1)/n}, \tag{1.8} \]
where $K \in \mathbb{R}_{>0}$, $n \in \mathbb{N}$. This allows us to use a technique of Makino [14] to regularize the fluid equations by the use of the fluid density variable
\[ \rho = \frac{1}{(4Kn(n+1))^n}\,\alpha^{2n}. \tag{1.9} \]
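One way to see why the variable $\alpha$ in (1.9) is convenient: combining (1.8) and (1.9) gives $dp/d\rho = K\frac{n+1}{n}\rho^{1/n} = \alpha^2/(4n^2)$, so $\alpha$ is proportional to the Newtonian sound speed and vanishes together with $\rho$. The sketch below checks this identity numerically with a central difference. The function names and the sample values of $K$, $n$ and $\alpha$ are hypothetical and chosen only for illustration.

```python
def density_from_alpha(alpha, K, n):
    """Fluid density from the Makino variable, following (1.9)."""
    return alpha ** (2 * n) / (4.0 * K * n * (n + 1)) ** n


def pressure(rho, K, n):
    """Isentropic equation of state (1.8)."""
    return K * rho ** ((n + 1) / n)


def sound_speed_sq(alpha, K, n, rel_step=1e-6):
    """Central-difference estimate of dp/drho at the density given by alpha."""
    rho = density_from_alpha(alpha, K, n)
    dr = rel_step * rho
    return (pressure(rho + dr, K, n) - pressure(rho - dr, K, n)) / (2.0 * dr)


# Sample values (assumptions): K = 2, n = 3, alpha = 1.3.
alpha, K, n = 1.3, 2.0, 3
print("numerical dp/drho :", sound_speed_sq(alpha, K, n))
print("alpha^2 / (4 n^2) :", alpha ** 2 / (4.0 * n ** 2))
```

The two printed numbers should agree to within the accuracy of the finite difference.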
The resulting system can be put into a symmetric hyperbolic form that is regular across the fluid-vacuum interface. In this way, it is possible to construct solutions to the Einstein-Euler equations that represent compact gravitating fluid bodies (i.e. stars) both in the Newtonian and relativistic setting [14,18]. In the Newtonian setting, this is straightforward to see. Using (1.8) and (1.9), the Poisson-Euler equations (1.2)–(1.4) imply that
\[ \partial_t \overset{0}{\alpha} = -\overset{0}{w}{}^I \partial_I \overset{0}{\alpha} - \frac{\overset{0}{\alpha}}{2n}\,\partial_I \overset{0}{w}{}^I, \tag{1.10} \]
\[ \partial_t \overset{0}{w}{}^J = -\frac{\overset{0}{\alpha}}{2n}\,\partial^J \overset{0}{\alpha} - \overset{0}{w}{}^I \partial_I \overset{0}{w}{}^J - \partial^J \overset{0}{\Phi}, \tag{1.11} \]
\[ \Delta \overset{0}{\Phi} = \overset{0}{\rho} \qquad \bigl(\overset{0}{\rho} := (4Kn(n+1))^{-n}\,\overset{0}{\alpha}{}^{2n}\bigr), \tag{1.12} \]
which is readily seen to be regular even across regions where $\overset{0}{\alpha}$ vanishes. As discussed by Rendall [18], the type of fluid solutions obtained by the Makino method have freely falling boundaries and hence do not include static stars of finite radius, and consequently this method is far from ideal. However, in trying to understand the post-Newtonian expansions, these solutions are general enough to obtain a comprehensive understanding of the mathematical issues involved in the post-Newtonian expansions.
As in [15], our approach to the problem of post-Newtonian expansions is to use the gravitational and matter variables $\{\bar{u}^{ij}, w^i, \alpha\}$ along with a harmonic gauge to put the
Einstein-Euler equations into a singular (non-local) symmetric hyperbolic system of the form
\[ b^0(\epsilon W)\,\partial_t W = \frac{1}{\epsilon}\,c^I \partial_I W + b^I(\epsilon, W)\,\partial_I W + F(\epsilon, W). \tag{1.13} \]
Singular hyperbolic systems of this form have been extensively studied in the articles [6,11,12,20,21]. Especially relevant for our purposes is the paper [21]. There, a systematic procedure for constructing rigorous expansions to singular symmetric hyperbolic systems is developed (see also [11,12]). However, the techniques of [6,11,12,20,21] cannot be applied directly to our case. The reason for this is that the initial data for the system (1.13) must include a $1/r$ piece for the metric and cannot lie in the Sobolev space $H^k$. This problem was overcome in [15] by using a one parameter family $H^k_{\delta,\epsilon}$ of weighted Sobolev spaces that include $1/r$ type fall off for $\epsilon > 0$, and reduce to the standard Sobolev spaces $H^k$ in the limit $\epsilon \searrow 0$. We again use these weighted Sobolev spaces, this time to generalize the results of [21] so that we can apply them to the problem of generating rigorous post-Newtonian expansions.
The next theorem is the main result of this paper, and the proof can be found in Sect. 6. The definition of the spaces $H^k_\delta$, $H^k_{\delta,\epsilon}$, and $X_{T,s,k,\delta}$ can be found in Appendices A and B.
Theorem 1.1. Suppose $-1 < \delta < -1/2$, $s \ge 3$, $k \ge 3 + s$, $\overset{o}{\alpha}, \overset{o}{w}{}^I, z_4^{IJ} \in H^k_{\delta-1}$, $f \in H^{k-2}_{\delta-2}$, $\operatorname{supp} \overset{o}{\alpha} \subset B_R$, and let $T_0^M$ be the maximal existence time (see Proposition 3.7) for solutions to the Poisson-Euler-Makino equations (1.10)–(1.12) with initial data $\overset{0}{\alpha}(0) = \overset{o}{\alpha}$, $\overset{0}{w}{}^I(0) = \overset{o}{w}{}^I$. Then for any $T_0 < T_0^M$ there exists an $\epsilon_0 > 0$, and maps
: u¯ ij (t) − u¯ i j (0), ∂ I u¯ ij (t), ∂t u¯ ij (t) ∈ X T0 ,s,k,δ−1 0 < ≤ 0 ,
α (t), wi (t) ∈ X T0 ,s,k,δ−1 0 < ≤ 0 , 0
0
0
0
α(t), w I (t) ∈ X T0 .s,k,δ−1 , (t) ∈ X T0 ,s,k+2,δ with ∂t (t) ∈ X T0 ,s,k+1,δ−1 , q
q
q
q
u¯ i j (t)
: u¯ i j (t) − u¯ i j (0), ∂ I u¯ i j (t) ∈ X T0 ,s−q,k−q,δ−1 q = 1, 2,
q
q
α(t), wi (t) ∈ X T0 ,s−q,k−q,δ−1 q = 1, 2, q
u¯ ij (t) q
q
q
q
: u¯ ij (t) − u¯ ij (0), ∂ I u¯ ij (t) ∈ X T0 ,s−3,k−3,δ−1 (q, ) ∈ Z≥3 × (0, 0 ], q
α (t), wi (t) ∈ X T0 ,s−3,k−3,δ−1 (q, ) ∈ Z≥3 × (0, 0 ], such that ij
(i) the triple {¯u (t), α (t), wi (t)} determines, via formulas (1.5)–(1.9), a solution to the Einstein-Euler equations (1.1) in the harmonic gauge for 0 < ≤ 0 on the spacetime region (x I , t = x 4 ) ∈ D = R3 × [0, T0 ), (ii) ∂t u¯ I J (0) = 2 z4I J , ∂t2 u¯ I J (0) = 2 f I J , α (0) = α , and wI (0) = w I for 0 < ≤ o o 0 , 0
0
0
(iii) {α(t), w I (t), (t)} is the unique solution to the Poisson-Euler-Makino equations 0
0
(1.10)–(1.12) with initial data α(0) = α , w I (0) = w I , o
o
q
q
q
(iv) for q = 1, 2, {u¯ i j (t), α(t), wi (t)} satisfies a linear (non-local) symmetric hyper0
0
0
0
0
bolic system that only depends on {α(t), w I (t), (t)} if q = 1, and {α(t), w I (t), 0
1
1
1
(t), u¯ i j (t), α(t), wi (t)} if q = 2, q
q
ij
q
(v) for q ∈ Z≥3 , {u¯ (t), α (t), wi (t)} satisfies a linear (non-local) symmetric hyper0
p
0
0
p
p
bolic system that only depends on , {α(t), w I (t), (t)}, {u¯ i j (t), α(t), wi (t)} for p
p
p
p = 1, 2, and {u¯ i j (t), α (t), wi (t)} for p = 3, 4, . . . , q − 1, q
q
ij
q
(vi) {¯ui j , α (t), wi (t)} and {u¯ (t), α (t), wi (t)} for q ∈ Z≥3 , satisfy the following estimates: ¯uij (t) L 2 + ∂ I u¯ ij (t) H k + ∂t u¯ ij (t) H k + ∂t ∂ I u¯ ij (t) H k−1 δ
+ 2 ∂t2 u¯ ij (t) H k−1 1, α (t) H k + wi (t) H k + ∂t α (t) H k−1 + ∂t wi (t) H k−1 1, q
q
q
q
u¯ ij (t) L 2 + ∂ I u¯ ij (t) H k−3 + ∂t u¯ ij (t) H k−3 + ∂t ∂ I u¯ ij (t) H k−4 δ
q + ∂t2 u¯ ij (t) H k−4 1, q q α (t) H k−3 + wi (t) H k−3 2
q
q
+ ∂t α (t) H k−4 + ∂t wi (t) H k−4 1,
for all (t, ) ∈ [0, T0 ) × (0, 0 ], and (vii) {¯ui j , α (t), wi (t)} admits convergent expansions (uniform for 0 < ≤ 0 ) of the form j 0
u¯ ij = δ4i δ4 +
2
q
q u¯ i j +
q=1 0
q
q u¯ ij ,
q=3
ν ∂tν ∂ I u¯ ij = ν δ4i δ4 ∂tν ∂ I + j
∞
2
q
q+ν ∂tν ∂ I u¯ i j +
q=1 0
ν ∂tν u¯ ij = ν δ4i δ4 ∂tν + j
2
0
∂tν α = ∂tν α +
q
q+ν ∂tν u¯ i j +
q
q ∂tν α +
q=1 0
∂tν wi = ∂tν wi +
2 q=1
q
q+ν ∂tν ∂ I u¯ ij , ν = 0, 1,
q=3
q=1 2
∞
∞
q
q+ν ∂tν u¯ ij , ν = 1, 2,
q=3 ∞
q
q ∂tν α , ν = 0, 1,
q=3 q
q ∂tν wi +
∞
q
q ∂tν wi , ν = 0, 1,
q=3
where the first expansion is convergent in $C^0([0, T_0); L^2_\delta)$, and the rest are convergent in both $C^0([0, T_0); H^{k-4})$ and $C^0([0, T_0); H^{k-4}_{\delta-1,\epsilon})$.
Remark 1.2.
(a) For $q = 1, 2$, the equations satisfied by $\{\overset{q}{\bar u}{}^{ij}, \overset{q}{\alpha}, \overset{q}{w}{}^i\}$ are the ones obtained by directly substituting the expansions of Theorem 1.1 (vii) into the Einstein-Euler equations and collecting terms to order $\epsilon^2$, and therefore coincide with the standard first post-Newtonian expansions.
(b) The equations satisfied by $\{\overset{q}{\bar u}{}^{ij}_\epsilon, \overset{q}{\alpha}_\epsilon, \overset{q}{w}{}^i_\epsilon\}$ for $q \geq 3$ can be determined from the
equations satisfied by the W defined in the proof of Theorem 5.1. To facilitate comparisons of the approach taken in this paper with previous studies, we define the following -independent quantities: q q q ij ij k i j q = 1, 2, h = 4u¯ − 2ηk u¯ η where (ηi j ) = diag (1, 1, 1, −1). Then a straightforward calculation, using statement (vii) of Theorem 1.1 and formulas (1.5)–(1.6), shows that the metric gi j can be expanded as follows: 1 2 0 0 2 1 44 2 44 + O( 3 ), g44 = − 2 − 2 − h − 3 + h 1
2
g4I = 2 h 4I + 3 h 4I + O( 4 ), and 0
1 3 IJ
g I J = δ I J − 2 δ I J − h 2
2 0 2 IJ − δI J + h + O( 5 ). 4
It is worthwhile to note that higher order expansions in $\epsilon$ can be generated for the metric $g_{ij}$ using part (vii) of Theorem 1.1. These higher order terms will, in general, depend on $\epsilon$ in a non-analytic fashion, and therefore, without further analysis, the relation of these expansion terms to the standard post-Newtonian expansions is not clear.

2. Einstein-Euler Equations

In this section, we quickly review the formulation of the Einstein-Euler equations used in [15] to analyze the limit as $\epsilon \searrow 0$.

2.1. Reduced Einstein equations. As discussed in the introduction, we use a symmetric tensor density $\bar u^{ij}$ in place of the metric $g^{ij}$. For $\epsilon > 0$, $\bar u^{ij}$ completely determines the metric via the formula
$(g^{ij}) = \frac{1}{\sqrt{|\bar g|}} \begin{pmatrix} \bar g^{IJ} & \epsilon\, \bar g^{I4} \\ \epsilon\, \bar g^{4J} & \epsilon^2\, \bar g^{44} \end{pmatrix}$, (2.1)
where
$\bar g^{ij} := \eta^{ij} + 4\epsilon^2 \bar u^{ij}$, $|\bar g| := -\det(\bar g^{ij})$, (2.2)
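and
$(\eta^{ij}) = \begin{pmatrix} 1_{3\times 3} & 0 \\ 0 & -1 \end{pmatrix}$.
To fix the gauge, we let
$\bar\partial_k = \begin{cases} \partial_I & \text{if } k = I, \\ \epsilon\,\partial_t & \text{if } k = 4 \end{cases}$
and demand that
$\bar\partial_i \bar u^{ij} = 0$. (2.3)
For $\epsilon > 0$, this condition is easily seen to be equivalent to the harmonic gauge $\partial_k \mathfrak{g}^{kj} = 0$.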
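As a quick consistency check on (2.1)–(2.2), one can evaluate them on the trivial field $\bar u^{ij} = 0$. The sketch below assumes that the 44-component of the matrix in (2.1) carries the factor $\epsilon^2$ and that $\bar g^{ij}$ reduces to $\eta^{ij}$ when $\bar u^{ij}$ vanishes; it is an illustration, not part of the original argument.

```latex
% Flat-space check of (2.1)-(2.2), assuming \bar u^{ij} = 0:
\bar g^{ij} = \eta^{ij}, \qquad |\bar g| = -\det(\eta^{ij}) = 1,
\qquad
(g^{ij}) = \operatorname{diag}(1, 1, 1, -\epsilon^{2}),
\qquad
(g_{ij}) = \operatorname{diag}(1, 1, 1, -\epsilon^{-2}).
% Null curves of g_{ij} satisfy |dx/dt| = 1/\epsilon, so in the coordinates
% (x^I, t = x^4) the speed of light is 1/\epsilon.
```

In these coordinates the limit $\epsilon \searrow 0$ therefore corresponds to an infinite speed of light, which is the sense in which it is a Newtonian limit.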
√ Here gi j = − det(gk )g i j is the metric density in the coordinates (x i ). Next, defining ui j := u¯ i j , ij u := ∂¯k u¯ i j , k ij
u :=
(2.5) (2.6)
ij ij (u4 , u J , ui j )T , i j −1
(2.7)
(¯gi j ) := (¯g ) ,
Ai j := 2 21 g¯ k g¯ mn − g¯ km g¯ n g¯ i p g¯ jq − 21 g¯ i j g¯ pq ∂¯ p u¯ k ∂¯q u¯ mn ,
B i j := 4¯gk 2¯gn (i ∂¯m u¯ j )∂¯n u¯ km − 21 g¯ i j ∂¯m u¯ kn ∂¯n u¯ m − g¯ mn ∂¯m u¯ ik ∂¯n u¯ j , and
(2.4)
C i j := 4 ∂¯k u¯ i j ∂¯ u¯ k − ∂¯k u¯ i ∂¯ u¯ jk ,
(2.8) (2.9) (2.10)
(2.11)
the Einstein equations G i j = 2 4 T i j , in the harmonic gauge, can be written in first order form as 1 1 ij ij A4 (u)∂t ui j = C I ∂ I ui j + A I (u)∂ I ui j + F¯ 0 (u) + F¯ 1 (u, u) − (T i j , 0, 0)T , (2.12) where
⎛ 0 1 − 4u44 ⎝ A (u) = 0 δ I J + 4u I J 0 0 ⎛ ⎞ 0 δI J 0 C I = ⎝δ I J 0 0⎠ , 0 0 0 ⎛ 4I ⎞ 4u 4u I J 0 A I (u) = ⎝4u I J 0 0⎠ , 0 0 0 4
ij ij F¯ 0 (u) = (0, 0, u4 )T , ij F¯ 1 (u, u)
ij
(2.13)
(2.14)
(2.15) (2.16)
= (A + B + C , 0, 0) , ij
⎞ 0 0⎠ , 1
ij
T
(2.17)
and
with
1 ij 0 (T ) = 0
0
−1 ρ
+ Si j
(2.18)
0 |¯g |v I v 4 |¯g |v J v 4 −1 (|¯g | − 1)(v 4 )2 + ((v 4 )2 − 1) (ρ + 2 p)v I v J + |¯g |−1/2 p(δ I J + 4 u I J ) pv I v 4 + 4|¯g |−1/2 p u I 4 + |¯g | . pv J v 4 + 4|¯g |−1/2 p u4J p(v 4 )2 + |¯g |−1/2 p(−1 + 4 u44 )
(S i j ) = ρ
(2.19) Letting w = (α, wi )T ,
(2.20)
ij
(2.21)
we can decompose S i j as ij
S i j = S0 + S1 , where ij
S0 (u, w, u, w) 0 |g|w ¯ I (1 + w 4 ) , =ρ |g|w ¯ J (1 + w 4 ) −1 (|g| ¯ − 1)(1 + w 4 )2 + (1 + w 4 )2 − 1 (2.22) and ij
S1 (w, u, w) I J ρw w + pw I w J + |g| ¯ −1/2 p g¯ I J pw I (1 + w 4 ) + 4|g| ¯ −1/2 pu I 4 . = |g| ¯ pw J (1 + w 4 ) + 4|g| ¯ −1/2 pu J 4 p(1 + w 4 )2 + |g| ¯ −1/2 p(−1 + 4u44 ) (2.23) We will refer to the gauge fixed Einstein equation (2.12) as the reduced Einstein equations. Because of the matrix inversion (2.9) used to define the inverse density g¯ i j , the reduced Einstein equations will be well defined provided u ∈ V = { (r i j ) ∈ M4×4 | det(ηi j + 4r i j ) > 0 } . 2.2. Euler equations. In [15], we also showed that if we use the fluid variables (2.20), and choose initial data that satisfies 0 = N := vi v i + 1/ = g¯ 44 (1/ + w 4 )2 + 1/ + 2g¯ 4J (1 + w 4 )w J + g¯ I J w I w J , (2.24) then the Euler equations ∇i T i j = 0 are equivalent to the system a 4 ∂4 w = a I ∂ I w + b,
(2.25)
where v¯ I = v I , v¯ 4 =
v4 ,
(2.26)
1 g¯ i j = √ g¯ i j , (g¯ i j ) = (g¯ i j )−1 , |g| ¯ 1 1 (α)2 , q = α, h = 1+ 4n(n + 1) 2nh L ij = δ ij + 2 v¯ i v¯ j , v¯ j = g¯ i j v¯ i ,
(2.28) (2.29)
Mi j = g¯ i j + 2 v¯ i v¯ j ,
¯ ikj = 2 g¯ km (2¯gi g¯ j p − g¯ i j g¯ p )∂¯m u¯ p + 2(¯gp δ(ik ∂¯ j) u¯ p − 2¯g(i ∂¯ j) u¯ k ) , 2 q L 4j h (1 + w 4 ) a4 = , q L i4 Mi j (1 + w 4 ) 2 I −h w −q L Ij I , a = −q L iI −Mi j w I 2
and
(2.27)
(2.30) (2.31) (2.32) (2.33)
j −q L ij ¯ i v¯ b= . j −Mi j ¯ v¯ k v¯
(2.34)
k
We also note that
1 0 0 δi j
+ aˆ 4 (u, w), α I δj −w I − 2n + w I a(u, aI = ˆ w) + α aˆ I (u, w), α I − 2n δi −δi j w I a4 =
and
(2.35) (2.36)
0 α bˆ 1 (u, w) · uk
, b= + lp p −ηim 2η4 η4 p + ηp um − 2 ηp δ4i u4 − 2η4 ui bˆ 2 (u, w) · uk 4 (2.37)
where {aˆ 4 , a, ˆ aˆ I , bˆ 1 , bˆ 2 } are analytic in all their variables provided that u ∈ V, {aˆ 4 , a, ˆ I 4 I ˆ ˆ 0) = 0, b1 (0, 0) = 0, and aˆ } are symmetric, and aˆ (0, 0) = 0, aˆ (0, 0) = 0, a(0, bˆ 2 (0, 0) = 0. 3. Uniform Existence and the Zeroth Order Equations The combined systems (2.12) and (2.25) can be written as b0 (V, 2 U )∂t V =
1 I c ∂ I V + b I (V, U, V, 2 U )∂ I V 1 + f 0 (V, U, V, 2 U ) + f 1 (V, U, V, 2 U ) + g(V ), (3.1)
where U = (0, 0, u¯ i j , 0, 0)T , V =
o ij ij (u4 , u J , δui j , α, wi )T ,
,
b I (V, U, V, 2 U ) f 0 (V, U, V, 2 U ) f 1 (V, U, V, 2 U )
(3.2)
o
δui j = ui j − u¯ i j , o
0 A4 (u) 0 a 4 (u, w) I C 0 = , 0 0 I 0 A (u) , = 0 a I (w, u, w) ij ¯ (u) − S i j (u, w, u, w) F 0 0 , = b(u, w, u, w) ij ij ¯ = F 1 (u, u) − S1 (w, u, w) , 0
b0 (V, 2 U ) = cI
u¯ i j = u¯ i j |t=0 ,
(3.3) (3.4) (3.5) (3.6) (3.7) (3.8)
and j
g(V ) = (−δ4i δ4 ρ(α), 0, . . . , 0)T .
(3.9)
For initial data, we will often use the following notation: given a function z that depends on time t, we define z = z|t=0 . o
In addition to solving these equations, we must also solve constraint equations on the initial data to get a full solution to the Einstein-Euler equations. Letting
2 ij 2 k 2 k(i j) (3.10) G i j = g¯ k ∂¯k u¯ + 2 Ai j + B i j + C i j + g¯ i j ∂¯k u¯ − 2∂¯k u¯ g¯ , and defining C J = −1 (G 4J − T 4J ), C 4 = G 44 − T 44 , and H j = ∂¯i u¯ i j , the constraint equations to be solved on the initial hypersurface S0 = {(x I , 0) | (x I ) ∈ R3 } are: Cj = 0
(gravitational constraint equations),
(3.11)
H =0
(harmonic gauge condition),
(3.12)
j
and N =0
(fluid velocity normalization).
(3.13)
To fix a region on which both the evolution equations (3.1) and the constraint equations (3.11)–(3.13) are well defined, we note from (2.13), (2.35), and the invertibility of the Lorentz metric $(\eta^{ij})$ that there exists a constant $K_0 > 0$ such that
$-\det(\eta^{ij} + 4u^{ij}) > 1/16$, $1 + w^4 > 1/16$, (3.14)
$A^4(u) \geq \frac{1}{16}\,\mathbb{1}$, $a^4(u, w) \geq \frac{1}{16}\,\mathbb{1}$, (3.15)
and
$|A^4(u)| \leq 16$, $|a^4(u, w)| \leq 16$ (3.16)
for all $|u| \leq 2K_0$, $|w^i| \leq 2K_0$, $|\alpha| \leq 2K_0$. The choice of the bounds 1/16 and 16 is somewhat arbitrary; they can be replaced by $1/M$ and $M$ for any $M > 1$ without changing any of the arguments presented in the following sections. However, since we are interested in the limit $\epsilon \searrow 0$, we lose nothing by assuming $M = 16$.
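The existence of such a $K_0$ rests on the fact that every quantity in (3.14)–(3.16) equals 1 (or the identity matrix) at the origin. The following sketch makes this explicit, taking (2.13) and (2.35) at face value together with the stated property $\hat a^4(0,0) = 0$; the symbol $\mathbb{1}$ for the identity is notation introduced here.

```latex
% Values at u = 0, w = 0:
-\det(\eta^{ij}) = 1, \qquad 1 + w^{4}\big|_{w=0} = 1, \qquad
A^{4}(0) = \mathbb{1}, \qquad a^{4}(0,0) = \mathbb{1}.
% Each inequality in (3.14)-(3.16) is therefore strict at the origin,
\tfrac{1}{16} < 1 < 16,
% and, since all four quantities depend continuously on (u, w), the
% inequalities persist on a ball |u|, |w^i|, |\alpha| \le 2K_0 for some K_0 > 0.
```

The same argument works with $1/M$ and $M$ in place of 1/16 and 16 for any $M > 1$, which is why the particular choice $M = 16$ is immaterial.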
3.1. Newtonian initial data. In [15], we proved the following theorem, based on previous work by Lottermoser [13], concerning the existence of -analytic solutions to the constraints (3.11)–(3.13). Before we state the theorem, we note from (1.9), (1.8), and the weighted multiplication inequality (see [15] Lemma A.8 ) that if α ∈ Hδk (δ ≤ 0, k > 3/2) then ρ, p ∈ Hδk . Proposition 3.1. Suppose −1 < δ < 0, k > 3/2 + 1, R > 0 and (ρ, ˜ p, ˜ w˜ I , z˜4I J , z˜ I J ) ∈ k−2 2 k−1 k k (Hδ−2 ) × Hδ−1 × Hδ−1 × B R (Hδ ) . Then there exists an 0 > 0, an open neighk : borhood U of (ρ, ˜ p, ˜ w˜ I , z˜4I J , z˜ I J ), and analytic maps (−0 , 0 ) × U → Hδ−1 I J k I J I I J 4 I I J (, ρ, p, w , z4 , z ) → w , (−0 , 0 ) × U → Hδ : (, ρ, p, w , z4 , z ) → φ, (−0 , 0 )×U → Hδk : (, ρ, p, w I , z4I J , z I J ) → w I such that for each (ρ, p, w I , z4I J , ij z I J ) ∈ U , (, ρ, p, w I , w 4 , u¯ 4 , ∂¯4 u¯ i j ) is a solution to the three constraints C j = 0, H j = 0, and N = 0,
(3.17)
where IJ z w I , (¯u ) = w J φ −∂ K z K I z4I J ij , (∂t u¯ ) = −∂ K z K J −∂ K w K ij
(3.18) (3.19)
and w 4 = − 1 +
√
− g¯ 4J w J −
2 (¯g 4J w J )2 −¯g 44 ( 2 g¯ I J w I w J +1) . g¯ 44
(3.20)
Moreover, if we let φ0 = φ|=0 , w0I = w I |=0 , and w04 = w 4 |=0 , then φ0 , w0I , and w04 satisfy the equations φ0 = ρ, w0I = −∂ L z4L J + ρw I , and w04 = 0, respectively. In Sect. 5, we show that the analytic dependence of the initial data on implies that there exists a corresponding convergent expansion in for the solution generated from the initial data.
3.2. Uniform existence. To prove local existence of solutions to (3.1) on a uniform time interval independent of $\epsilon$, we use a non-local symmetric hyperbolic version of (3.1). This system is essentially the one used in [15] to derive uniform existence, convergence, and error estimates for the limit $\epsilon \searrow 0$ of solutions to (3.1). However, we employ a few refinements that can be used to simplify the proof in [15], and will also be useful for analyzing the higher order expansions in $\epsilon$. Letting $\chi_{\bar R} \in C_0^\infty$ be a cutoff function that satisfies $\chi_{\bar R}|_{B_{\bar R}} = 1$, $0 \leq \chi_{\bar R} \leq 1$, and $\operatorname{supp} \chi_{\bar R} \subset B_{2\bar R}$, we replace $g(V)$ in (3.1) with
$g(V) = (-\delta^i_4 \delta^j_4\, \chi_{\bar R}\, \rho(\alpha), 0, \ldots, 0)^T$, (3.21)
and, following [15], we define the Newtonian potential $\Phi$ by
$\Delta \Phi = \chi_{\bar R}\, \rho$. (3.22)
Before proceeding, we first recall the following inequalities from [15]: (a) If > 3/2, there exists a constant CSob such that · L ∞ ≤ CSob · Hη, ∀ ∈ [0, 0 ]. η,
(3.23)
(b) For 0 > 0 and η ≤ −3/2, · Hη, · H η
∀ ∈ [0, 0 ].
(3.24)
∀ ∈ [0, 0 ].
(3.25)
∀ ∈ [0, 0 ].
(3.26)
(c) For 0 > 0, and −2 ≤ η ≤ −3/2, · H · Hη, (d) For 0 > 0 and η ≥ −3/2, · L 2η · L 2η, (d) If 2 ≤ 1 , and η1 ≤ η2 , then · H 2 · H 1 . 2 ,
1 ,
Lemma 3.2. Suppose 0 > 0, −1 < η < −1/2, and > 3/2. Then the maps −→ Hη+2 : α −→ −1 χ R¯ ρ(α) : Hη−1, and +1 ∂ I ◦ : Hη−1, −→ Hη−1, : α −→ ∂ I (α)
are uniformly analytic1 for ∈ [0, 0 ]. 1 See Appendix A for a definition of the term uniformly analytic.
(3.27)
Proof. First we recall that for −1 < η < −1/2, the Laplacian : Hη+2 → Hη−2
(3.28)
is an isomorphism by Proposition 2.2 of [1]. Next, by assumption > 3/2, and hence α → ρ = (4K n(n + 1))−n α 2n ∈ Hη−1, is uniit follows that the map Hη−1, formly analytic for ∈ [0, 0 ] by Lemma A.7. Moreover, the linear map Hη−1, u → χ R¯ u ∈ Hη−2 is clearly well defined and uniformly bounded for ∈ [0, 0 ]. Since compositions of uniformly analytic maps are again uniformly analytic, we see that the map Hη−1, α → −1 χ R¯ ρ(α) ∈ Hη+2 is uniformly analytic of ∈ [0, 0 ]. +1 is a bounded linear Next, we recall that the differentiation Hη+2 u → ∂ I u ∈ Hη−1 +1 +1 map, and the imbedding Hη−1 ⊂ Hη−1, is well defined and uniformly bounded for ∈ [0, 0 ] by (3.24). Again using the fact that uniform analyticity is preserved under +1 is uniformly analytic for compositions, we get that the map ∂ I ◦ : Hη−1, → Hη−1, ∈ [0, 0 ]. Following [15], we use the Newtonian potential to define a new combined gravitational-matter variable W via the formula W = V − d,
(3.29)
where j
d := (0, δ4i δ4 ∂ J (α), 0, 0, 0).
(3.30)
Notice that the transformation (3.29) leaves the matter variables unaffected. Consequently, we can define W by ij
ij
W = (u4 , W I , δui j , α, wi )T , and treat or d as a function of W . In fact, by Lemma 3.2, W −→ d ∈ Hδ−1, Hδ−1,
(3.31)
defines a uniformly analytic map for ∈ [0, 0 ]. To formulate the evolution equation entirely in terms of W , we need the “time derivative” of the map. So we define ˙ (W, U, W, 2 U ) 2nχ R¯ α 2n−1 −1 4 −1 I := a (w, u, w)∂ I w + b(u, w, u, w) a (u, w) , (4K n(n + 1))n (3.32) ˙ = ∂t when where ((α, wi )T ) = α is a constant projection map. By construction, evaluated on a solution of the reduced Einstein-Euler equations. Lemma 3.3. Suppose R1 > 0, 0 > 0, −1 < η < 1/2, and > 3/2. Then there exists an R2 > 0 such that the maps ˙ : B R1 (Hη−1, ) × B R2 (Hη ) ˙ ) × B R2 (Hη ) −→ Hη+1 : (W, U, W˜ , U˜ ) −→ (W, U, W˜ , U˜ ) ×B R2 (Hη−1,
and ˙ : B R1 (Hη−1, ∂I ◦ ) × B R2 (Hη ) × B R2 (Hη−1, )
˙ U, W˜ , U˜ ) ×B R2 (Hη ) −→ Hη−1, : (W, U, W˜ , U˜ ) −→ ∂ I (W,
are uniformly analytic for ∈ [0, 0 ]. Proof. Fixing R1 > 0, 0 > 0, −1 < η < −1/2 and > 3/2, it follows directly from Lemmas A.2 and A.7 that there exists a R2 > 0 such that the map ) × B R2 (Hη ) × B R2 (Hη−1, ) × B R2 (Hη ) (W, U, W, 2 U ) B R1 (Hη−1, −1 −→ χ R¯ α 2n−1 a 4 (u, w)−1 a I (w, u, w)∂ I w + b(u, w, u, w) ∈ Hη−2
is uniformly analytic for ∈ [0, 0 ]. The rest of the proof now follows from the same arguments used in the proof of Lemma 3.2. To fit with the above notation, we define ˙ 0, 0, 0)T . ˙ = (0, δ4i δ ∂ I , d 4 j
Noting that b0 (V, 2 U ) = b0 (W, 2 U ) and b I (V, U, V, 2 ) = b I (W, U, W, 2 U ), (3.33) we write (3.1) as b0 (W, 2 U )∂t W =
1 I c ∂ I W + b I (W, U, W, 2 U )∂ I W +F0 (W, U, W, 2 U ) + F1 (W, U, W, 2 U ), (3.34)
where F0 (W, U, W, 2 U ) = f 0 (W + d(W ), U, (W + d(W )), 2 U ) ˙ −b0 (W, 2 W )d (W, U, W, 2 U ) + b I (W, U, W )∂ I d(W ) (3.35) and F1 (W, U, W, 2 U )) = f 1 (W + d(W ), U, (W + d(W )), 2 U ). (3.36) Proposition 3.4. Suppose −1 < δ < −1/2, 0 > 0, s ∈ N0 , R > 0, K 1 < K 0 / √ k , supp α ⊂ B , (2 0 CSob ), τ ≥ 2K 1 /CSob , R¯ > 16τ + R, k ≥ 3 + s, α , w I ∈ Hδ−1 R o
ij
o
o
ij
k . Let u ¯ , ∂t u¯ and w4 be the initial data constructed in z I J ∈ Hδk+1 , z4I J ∈ Hδ−1 o
o
o
Proposition 3.1, which, by choosing 0 ≤ 1 small enough, satisfies T K0 j ≤K 1 , and ¯uij H k+1 ≤ √ ∂t u¯ ij , ∂ I u¯ ij −δ4i δ4 ∂ I −1 ρ , 0, α , wi δ o o o o o 0 CSob o k Hδ−1,
for all ∈ (0, 0 ]. Then there exists a T > 0 independent of ∈ (0, 0 ], and maps
T ij ij ∈ X T ,s,k,δ−1 0 < ≤ 0 W = u4, , W I, , δuij , α , wi such that
(i) T ≥ T for 0 ≤ ≤ 0 , (ii) W is the unique solution to (3.34) with initial data T ij ij i i j −1 W (0) = ∂t u¯ , ∂ I u¯ − δ4 δ4 ∂ I ρ , 0, α , w , o
o
o
o
o
(iii) W (t) H k
δ−1,
≤ 2K 1 , ∂t W (t) H k−1 1, δ−1,
and max{ u¯ ij (t) L ∞ , α (t) L ∞ , wi (t) L ∞ } < 2K 0 for all (t, ) ∈ [0, T ] × (0, 0 ], (iv) if lim sup W (t) W 1,∞ < ∞, tT
and sup { u¯ ij (t) L ∞ , α (t) L ∞ , wi (t) L ∞ } < 2K 0 ,
0≤t
then the solution W (t) can be uniquely extended for some time T∗ > T , (v) for any time T˜ which is strictly less than the maximal existence time and for which sup { u¯ ij (t) L ∞ , α (t) L ∞ , wi (t) L ∞ } < 2K 0
0≤t≤T
holds, the support of α satisfies supp α (t) ⊂ B R¯ ∀ t ∈ [0, T˜ ], where R¯ := 16 sup0≤t≤T˜ wI (t) L ∞ + R, (vi) supp α (t) ⊂ B R¯ for all (t, ) ∈ [0, T ] × (0, 0 ], ij ij ij ij ij j ij (vii) ∂t u¯ = −1 u¯ 4, , and ∂ I u¯ = W I, + δ4i δ4 ∂ I (α ), where u¯ = u¯ + −1 δui j , o
ij
(viii) the triple {¯u , α , wi } determines, via the formulas (1.7), (1.9), (2.1), and (2.2), a solution to the full Einstein-Euler system (1.1) in the harmonic gauge (2.4) on the spacetime region D = R3 × [0, T ], and (ix) the conclusions (vii)-(viii) continue to hold on any region of the form D = R3 × [0, T˜ ] provided supp α (t) ⊂ B R¯ for all 0 ≤ t ≤ T˜ . Proof. (i)-(iv): Given the initial data satisfying T j ∂t u¯ ij , ∂ I u¯ ij −δ4i δ4 ∂ I −1 ρ , 0, α , wi o o o o k o
Hδ−1,
K0 ≤K 1 , and ¯uij H k+1 ≤ √ δ o 0 CSob
for all ∈ (0, 0 ], it is not difficult using the inequalities (3.23) and (3.24), and Lemmas 3.2, 3.3, and A.7 to verify that W (0) H k ≤ K 1 , ∂t W (0) H k−1 1, and the evoδ−1, δ−1, lution equation (3.34) satisfies the conditions (B.3)–(B.5). Therefore, it follows directly
from Theorem B.1 that there exists √ a time T > 0 independent of ∈ (0, 0 ] such that W (t) H k ≤ 2K 1 < 2K 0 /( 0 CSob ), and ∂t W (t) H k−1 1 for all 0 ≤ t ≤ T . δ−1, δ−1, This proves (i)-(iii). Statement (iv) also follows directly from Theorem B.1. (v)-(vi): Statement (v) follows from a slight modification of Lemma 7.2 in [15] while (vi) follows directly from (iii) and (v). (vii)-(ix): By (vi) we see that V (t) = W (t) + d(W (t)) satisfies (3.1) for (t, ) ∈ [0, T ] × (0, 0 ]. Then the same arguments used to prove (ii) and (iii) of Proposition 6.1 in [15] can be employed to prove the statements (vii)-(ix) of this proposition. 3.3. Zeroth order equation. In order to discuss equations satisfied by the zeroth and higher order expansions, we will first introduce some notation. To begin, we define p
0
1
p
p
0
1
p
p
0
1
p
U = (U , U , . . . , U ), W = (W , W , . . . , W ), p
0
1
p
X = ( X , X , . . . , X ), Y = (Y I , Y I , . . . , Y I ), and let F (U, W ) = F0 (W, U, W, 2 U ) + F1 (W, U, W, 2 U ), B (U, W, Y ) = b I (W, U, W, 2 U )Y I , and B0 (U, W, X ) = b0 ( 2 U, W )X. Proposition 3.5. Suppose > 3/2, R > 0, −1 < η < −1/2. Then there exists an 0 > 0 such that the maps ) −→ Hη−1, , F : B R (Hη ) × B R (Hη−1, ) × B R (Hη−1, ) −→ Hη−1, , B : B R (Hη ) × B R (Hη−1,
and B0 : B R (Hη ) × B R (Hη−1, ) × B R (Hδ−1, ) −→ Hη−1,
are uniformly analytic for ∈ [0, 0 ]. Proof. The proof follows directly from Lemmas 3.2, 3.3, A.7, and the fact that compositions of uniformly analytic functions are again analytic. Next, we define 1 dp |=0 F (U (), W (), p! d p p p−1 p p 1 dp |=0 B (U (), W (), Y ()), B( U , W, Y) = p! d p p p−1
p
F( U , W) =
and p p−2 p−1 p
B( U , W , X) =
1 dp |=0 B0 (U (), W (), X ()), p! d p
where U () =
p−1 q=0
q
q U , W () =
p−1
q
q W ,
X () =
q=0
p
q
q X , and Y () =
q=0
p
q
q Y .
q=0
Proposition 3.6. Suppose > 3/2, R > 0, −1 < η < −1/2. Then there exists an 0 > 0 such that the maps
p F : B R (Hη ) × (Hη ) p−2 × B R (Hη−1, ) × (Hη−1, ) p−1 −→ Hη−1, ,
p B : B R (Hη )×(Hη ) p−2 × B R (Hη−1, )×(Hη−1, ) p−1 ×(Hη−1, ) p −→ Hη−1, , and p
B 0 : B R (Hη )×(Hη ) p−3 × B R (Hη−1, )×(Hη−1, ) p−2 ×(Hη−1, ) p −→ Hη−1,
are uniformly analytic for ∈ [0, 0 ]. Moreover, there exists uniformly analytic maps
p F R, : B R (Hη ) × (Hη ) p−2 × B R (Hη−1, ) × (Hη−1, ) p−1 −→ Hη−1, ,
p B R, : B R (Hη )×(Hη ) p−2 × B R (Hη−1, )×(Hη−1, ) p−1 ×(Hη−1, ) p −→ Hη−1, , and
p B 0 R, : B R (Hη )×(Hη ) p−3 × B R (Hη−1, )×(Hη−1, ) p−2 ×(Hη−1, ) p −→ Hη−1, p−1
1
1
p
p
0
0
p
that are linear in the variables U , . . . , U , W , . . . , W , X , . . . , X , Y , . . . , Y , and ⎡ ⎤ p p p−1 p q q−1 q 1 ⎣ F (U (), W ()) − q F( U , W)⎦ = F R, ( U , W),
p+1
q=0
⎤ p p−1 p p q q−1 q q p 1 ⎣ q B( U , W, Y)⎦ = B R, (, U , W, Y), B (U (), W (), Y ()) − p+1 ⎡
q=0
and ⎡ 1 ⎣ 0 B (U (), W (), X ()) −
p+1
p
q q−2 q−1 q q 0
⎤
p
p−2 p−1 p
B ( U , W , X)⎦ = B 0 R, (, U , W , X).
q=0
Proof. The proof follows immediately from the Taylor expansions for F , B , and B0 which are uniformly analytic by Proposition 3.5.
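The remainder identities of Proposition 3.6 are instances of Taylor's formula with integral remainder. A minimal sketch for a scalar analytic function $f$ of $\epsilon$ follows; here $f$ is a hypothetical stand-in for a map such as $\epsilon \mapsto F_\epsilon(U(\epsilon), W(\epsilon))$, and it is assumed that the Banach-space valued case is handled in the same way.

```latex
% Taylor's formula with integral remainder, with coefficients
% \overset{q}{f} := \frac{1}{q!} f^{(q)}(0):
f(\epsilon) - \sum_{q=0}^{p} \epsilon^{q}\, \overset{q}{f}
  = \frac{\epsilon^{p+1}}{p!} \int_0^1 (1-s)^{p}\, f^{(p+1)}(s\epsilon)\, ds ,
% hence the rescaled remainder is again an analytic function of \epsilon:
\frac{1}{\epsilon^{p+1}} \Bigl[ f(\epsilon) - \sum_{q=0}^{p} \epsilon^{q}\, \overset{q}{f} \Bigr]
  = \frac{1}{p!} \int_0^1 (1-s)^{p}\, f^{(p+1)}(s\epsilon)\, ds .
```

The right-hand side plays the role of the maps with subscript $R$ in the proposition: it stays bounded as $\epsilon \searrow 0$ because the singular factor $\epsilon^{-(p+1)}$ has been absorbed.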
We note that from the definition of the above maps, it is clear that 0
p
0
p p−1
p
p
p
p−1
p
p
p−2 p−1 p−1
˜ U , W, Y ) and B 0 = X + B˜ 0 ( U , W , X ), B = b I (W )Y I + B(
(3.37)
where 0
0
0
0
0
b I (W ) := b I (W , 0, 0, 0) and B˜ = B˜ 0 = 0.
(3.38)
With our notation fixed, we are now ready to define the zeroth order equations: 0
0
0
0
0
0
1
∂t W = b I (W )∂ I W + F(W ) + c I ∂ I ω,
(3.39)
0
c I ∂ I W = 0,
(3.40)
0
W (0) = W (0) |=0 .
(3.41)
We showed in [15] that these equations are equivalent to the Poisson-Euler equations of Newtonian gravity. To see this, we first note that the Poisson-Euler-Makino system (1.10)–(1.12) is (non-local) symmetric hyperbolic, and thus we can use the results of Appendix B to obtain local existence of solutions. Proposition 3.7. Let k, s, δ, α , and w be as in Proposition 3.4. Then there exists a o
o
maximal time T0M > 0 and a unique solution 0
0
k−1 k α, w I ∈ C 0 ([0, T0M ), Hδ−1 ) ∩ C 1 ([0, T0 ), Hδ−1 ), 0
0
k+1 ∈ C 0 ([0, T0M ), Hδk+2 ) ∩ C 1 ([0, T0M ), Hδk+1 ), ∂t ∈ C 0 ([0, T0M ), Hδ−1 ) 0
0
to (1.10)–(1.12) satisfying α(0) = α and w I (0) = w I . Moreover, o
o
0
0
0
0
0 0
α, w I ∈ X T M ,s,k,δ−1 , ∈ X T M ,s,k+2,δ , ∂t = −∂ I −1 (ρ w I ) ∈ X T M ,s,k+1,δ−1 , 0
0
0
and 0
supp α(t) ⊂ B R(t) ∀ t ∈ [0, T0M ), 0
where $R(t) = R + t \sup_{0 \leq s \leq t} \|\overset{0}{w}{}^I(s)\|_{L^\infty}$.

Proof. From the weighted calculus inequalities of Appendix A (see also Appendix A of [15]), the Poisson-Euler-Makino system (1.10)–(1.12) satisfies the conditions required by Theorem B.1. Therefore all of the statements except for the estimate on the support of $\overset{0}{\alpha}(t)$ follow from this theorem. To prove the estimate on the support, we note that $\overset{0}{w}{}^I \in C^1([0, T_0^M), C_b^1(\mathbb{R}^3))$ by the Sobolev inequality (3.23). Therefore we can integrate the differential equation $dx^I/dt = \overset{0}{w}{}^I(t, x)$ to get a $C^1$ flow $\psi_t^I(x)$ that is defined for all $(t, x) \in [0, T_0^M) \times \mathbb{R}^3$ and satisfies $\psi_0 = 1_{\mathbb{R}^3}$. For each $x \in \mathbb{R}^3$, define $\overset{0}{\alpha}{}^x(t) = \overset{0}{\alpha}(t, \psi_t(x))$. The evolution equation (1.10) implies that
$\frac{d}{dt} \overset{0}{\alpha}{}^x(t) + \frac{1}{2n}\, \partial_I \overset{0}{w}{}^I(t, \psi_t(x))\, \overset{0}{\alpha}{}^x(t) = 0.$
By assumption, $\overset{0}{\alpha}{}^x(0) = \overset{0}{\alpha}(0, x) = 0$ for all $x \in E_R := \mathbb{R}^3 \setminus B_R$, and thus
$\overset{0}{\alpha}{}^x(t) = \overset{0}{\alpha}(t, \psi_t(x)) = 0$ for all $(t, x) \in [0, T_0^M) \times E_R$ (3.42)
by the above differential equation. Moreover,
$|\psi_t(x) - x| \leq \int_0^t |\partial_s \psi_s(x)|\, ds \leq \int_0^t |\overset{0}{w}{}^I(s, \psi_s(x))|\, ds \leq t \sup_{0 \leq s \leq t} \|\overset{0}{w}{}^I(s)\|_{L^\infty},$
and hence it follows from (3.42) that $\operatorname{supp} \overset{0}{\alpha}(t) \subset B_{R(t)}$, where $R(t) = R + t \sup_{0 \leq s \leq t} \|\overset{0}{w}{}^I(s)\|_{L^\infty}$.

Using this local existence theorem, the next proposition follows by straightforward computation.
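The step "by the above differential equation" in the proof of Proposition 3.7 can be made explicit. The sketch below solves the linear ordinary differential equation along a single characteristic; it assumes only that $\partial_I \overset{0}{w}{}^I$ is bounded and continuous along the flow, as provided by the $C_b^1$ regularity quoted in the proof.

```latex
% Linear ODE along the characteristic through x (notation as in the proof):
\frac{d}{dt}\,\overset{0}{\alpha}{}^{x}(t)
  + \frac{1}{2n}\,(\partial_I \overset{0}{w}{}^{I})(t,\psi_t(x))\;\overset{0}{\alpha}{}^{x}(t) = 0
\quad\Longrightarrow\quad
\overset{0}{\alpha}{}^{x}(t)
  = \overset{0}{\alpha}{}^{x}(0)\,
    \exp\!\Bigl(-\frac{1}{2n}\int_0^t (\partial_I \overset{0}{w}{}^{I})(s,\psi_s(x))\,ds\Bigr).
% In particular \overset{0}{\alpha}{}^{x}(0) = 0 forces \overset{0}{\alpha}{}^{x}(t) = 0 for all t.
```

Since the exponential factor never vanishes, the support of $\overset{0}{\alpha}(t)$ is exactly the image of the initial support under the flow $\psi_t$, and the displacement estimate on $|\psi_t(x) - x|$ then bounds its radius.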
0
Proposition 3.8. Let {α(t), w I (t), (t)} be the solution to the Makino-Euler-Poisson Eqs. (1.10)–(1.12) from Proposition 3.7, and define 0
j 0
0
0
W (t) = (0, −δ4i δ4 (t), 0, α(t), δ iI w I (t))T ∈ X T M ,s,k,δ−1 , and 0
1
ω(t) =
1 1 (ω4 i j (t), ω I i j (t), 0, 0, 0)T ,
where
1 1 0 j 0 j) 0 ω4 i j = δ4i δ4 ∂t ∈ X T M ,s,k+1,δ−1 and ω I i j = ∂ I −1 2ρδ (iJ δ4 w J ∈ X T M ,s,k+1,δ−1 . 0
0
0
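Then $\{\overset{0}{W}(t), \overset{1}{\omega}(t)\}$ defines a unique solution to the initial value problem (3.39)–(3.41) on the time interval $0 \leq t < T_0^M$.

4. First Order Expansion

By Proposition 3.1, the initial data $\underset{o}{\bar u}{}^{ij}_\epsilon$ is analytic in $\epsilon$ and there exists a convergent expansion in $H^{k+1}_\delta$ for $\underset{o}{\bar u}{}^{ij}_\epsilon$ of the form $\underset{o}{\bar u}{}^{ij}_\epsilon = \sum_{q=0}^{\infty} \epsilon^q\, \underset{o}{\overset{q}{\bar u}}{}^{ij}$ for $0 \leq \epsilon \leq \epsilon_0$. Consequently, $U_\epsilon$ can be expanded as $U_\epsilon = \sum_{q=0}^{\infty} \epsilon^q\, \overset{q}{U}$, where $\overset{q}{U} = (0, 0, \underset{o}{\overset{q}{\bar u}}{}^{ij}, 0, 0)^T$. Moreover, by Lemma 3.2 and the inequality (3.24), we can expand $W_\epsilon(0)$ as
$W_\epsilon(0) = \sum_{q=0}^{\infty} \epsilon^q\, \overset{q}{W}_0$ (4.1)
with the sum converging in $H^k_{\delta-1,\epsilon}$ uniformly for $0 \leq \epsilon \leq \epsilon_0$.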
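We define the second order remainder $\overset{2}{Z}_\epsilon$ by
$W_\epsilon = \overset{0}{W} + \epsilon\,(\overset{1}{\omega} + \overset{1}{W}_\epsilon) + \epsilon^2\, \overset{2}{Z}_\epsilon$, (4.2)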
1
with the first order expansion term W satisfying 1
1 0 1 0 0 0 1 0 1 0 1 1 1 1 I 1 1 ˜ U, W, Y) − B 0 (W, X) + F(U, W), c ∂ I W + b I ∂ I W + b I ∂ I ω + B( (4.3)
1
b 0 ∂t W = 1
1
1
W (0) = W − ω(0),
(4.4)
0
where 1
0
b 0 = b0 (0, W ), 0
0
0
0
0
0
0
0
U = U , W = W , X = ∂t W , Y = ∂ I W , and 1
0
1
1
1
0
1
W = (W , ω + W ), X = (∂t W , ∂t ω). Observe that 1
b 0 = 1, by Proposition 3.8. Substituting (4.2) in (3.34) yields 1
1
1
B¯ + b ∂t W + 0
=
0
0 2 b
1
2 1 − b 0 ∂t W + 2 b0 Z 2
1 1 2 2 1 I 0 1 c ∂ I W + c I ∂ I ω + c I ∂ I W + c I ∂ I Z + B¯ + 2 bI ∂ I Z + F ,
where b0 = b0 ( 2 U, W ), 1 0 1 1 ¯ B = B , U, W , ∂ I W + (∂ I ω + ∂ I W ) ,
(4.5) (4.6)
and 1
0
1 B¯ 0 = B 0 ( 2 U, W , ∂t W + ∂t ω).
(4.7)
2
Using (4.3)–(4.4), we then find that Z satisfies 2
b0 ∂t Z = 2
Z (0) =
2 2 1 I 2 c ∂ I Z + bI ∂ I Z + K¯ , 0 1 1 W (0) − W (0) − W (0) + ω(0)
2
(4.8)
,
(4.9)
where ⎡⎛ ⎞ 1 q−1 q 1 1 0 − b0 q 1 b 1 K¯ = ∂t W + 2 ⎣⎝ q B 0 W , X − B¯ 0 ⎠ 2 q=0 ⎛ ⎞ ⎛ ⎞⎤ 1 1 1 q q−1 q q q−1 q q + ⎝ B¯ − q B U , W, Y ⎠ + ⎝F − q F U , W ⎠⎦ , 2
q=0
(4.10)
q=0
and 1
0
1
1
0
1
1
U = (U , U ), Y = (∂ I W , ∂ I ω + ∂ I W ). Letting 1
0
1
1 ˜ = (∂t W , ∂t ω X + ∂t W ),
it follows from Proposition 3.6 that 2 1 1 1 1 2 1 1 1 2 ¯ ˜ ˜ K = L U, W, X, Y, Z + M , U, W, X, Y, Z
(4.11)
for analytic maps $L$ and $M$ with $L$ linear in $\overset{2}{Z}_\epsilon$. As we shall see in Theorem 4.2, when the initial data is chosen such that $\|\partial_t^2 W_\epsilon(0)\|_{H^{k-2}_{\delta-1}}$ remains bounded as $\epsilon \searrow 0$, the $\epsilon$ dependence can be removed from the first order expansion coefficient. This is accomplished by replacing (4.3)–(4.4) with a related, but different, $\epsilon$-independent version. To describe this system, we let
1
1 ij
1
1
1
W = (W 4 i j , W I , δ ui j , α, wi )T , and define projection operators by 1
1
1i j
1 ij
4 (W ) = (u4 ) and J (W ) = (W J ). Then the system that replaces (4.3)–(4.4) is: 1
0
1
1
1
1 0
0
1
0
1
0
1
1
0
1
1 2 ˜ U, W, Y) − B 0 (W, X) + F(U, W) + c I ∂ I ω, ∂t W = b I ∂ I W + b I ∂ I ω + B( (4.12) 1
W (0) = W − ω(0),
(4.13)
0
where 2
2
2
ω = (ω4 i j , ∂ I i j , 0, 0, 0)T , 1 0 1 0 1 0 1 0 1 1 2 J 0 ˜ U, W, Y) − B (W, X) + F(U, W) , ω4 = −∂ J B(
(4.14) (4.15)
and
2
1 0
1
0
1
0
1
1
0
1
˜ U, W, Y) − B 0 (W, X) + F(U, W) . = −4 B(
(4.16)
Existence of solutions to the initial value problem (4.12)–(4.13) is covered by the following proposition.

Proposition 4.1. Let $\delta$, $k$, $s$, $K_1$, $R$, $\bar R$, and $\tau$ be as in Proposition 3.4, $T_0^M$ be as in Proposition 3.7, and suppose $T_0 < T_0^M$. If $s$ and $\tau$ are chosen so that $s \geq 1$, and $16\tau > \max\{32K_1,\; T_0 \sup_{0 \leq t \leq T_0} \|\overset{0}{w}{}^I(t)\|_{L^\infty}\}$, then there exists a map
$\overset{1}{W} \in X_{T_0, s-1, k-1, \delta-1}$
such that $\overset{1}{W}(t)$ is the unique solution to the initial value problem (4.12)–(4.13), and
$\operatorname{supp} \overset{1}{\rho}(t) \subset B_{\bar R}$ for $0 \leq t < T_0$,
where $\overset{1}{\rho} = \frac{2n}{(4Kn(n+1))^n}\, \overset{0}{\alpha}{}^{2n-1}\, \overset{1}{\alpha}$. Moreover, if the initial data satisfies $c^I \partial_I \overset{1}{W}(0) = 0$, then
$c^I \partial_I \overset{1}{W}(t) = 0$ for $0 \leq t < T_0$, and $\overset{2}{\omega}_I, \overset{2}{\omega}_4 \in X_{T_0, s-1, k-1, \delta-1}$.

Proof. By construction, we have
1
k−1 W0 − ω(0) ∈ Hδ−1 .
(4.17)
Next, we observe that the map
0 1 0 0 −1 −1 −1 Hδ+1 × Hδ−1 × Hδ−1 × Hδ−1 × Hδ−1 (U, W, X, Y) 1 0 1 0 1 0 1 0 1 1 0 ˜ U, W, Y) − B (W, X) + F(U, W) ∈ H −1 −→ 4 B( δ−2
(4.18)
is analytic for > 3/2 + 1, which follows directly from the weighted estimates of Appendix A (see also Appendix A of [15]). It therefore follows that the system (4.12)– (4.16) satisfies all the hypotheses of Theorem B.1. Thus, there exists a unique solution 1
W ∈ X T0 ,s−1,k−1,δ−1
(4.19)
satisfying the initial value problem (4.12)–(4.13). Furthermore, from (3.28)–(4.18), it 2
2
is clear that ω I = ∂ I ∈ X T0 ,s−1,k,δ−1 . Note that we have used the linearity of the 1
system (4.12)–(4.16) in W to conclude that the solution can be continued as long as the coefficients are well defined, which is the case for 0 ≤ t ≤ T0 < T0M . By assumption, the initial data satisfies 1
c I ∂ I W (0) = 0,
(4.20)
while from Proposition 3.8 we have that 0
0
0
W 4 i j (t) = W I i j (t) = δ ui j (t) = 0,
(4.21)
and hence 0
0
0
j
0
u4 i j (t) = 0, u I i j (t) = δ4i δ4 ∂ I (t), and ui j (t) = 0.
(4.22)
0
From this it follows that b I has a block diagonal structure of the form 0 00 I , b = 0∗ and consequently 0 0 1 1 I I b ∂ I ω = 0, 4 b ∂ I W = 0, and J b ∂ I W = 0. 0
I
1
(4.23)
Next, calculation using (4.12), (4.14)–(4.15), and (4.23) shows that a straightforward 1 ∂t c I ∂ I W = 0, and hence 1
c I ∂ I W (t) = 0 for 0 ≤ t < T0 by (4.20). By the definition of the c I , this is equivalent to (since δ < 0) 1
1
W 4 (t) = 0 and ∂ I W I (t) = 0.
(4.24)
A short calculation using (4.12) and (4.24) then shows that 1
1
j
0
∂t δ ui j = ω4 i j = δ4i δ4 ∂t .
(4.25)
1
However, δ ui j (0) = 0 (see Proposition 3.1), and so integrating (4.25) yields 0 0 1i j i j δ u = δ4 δ4 (t) − (0) ,
(4.26)
and 0
1
j 0
1
ui j (t) = u¯ i j + δ ui j (t) = δ4i δ4 (t). 0
(4.27)
Also by (4.24), we have that 1
1
1
j 0
u4 i j (t) = ω4 i j (t) + W 4 i j (t) = δ4i δ4 (t),
(4.28)
while 1
1
j
1
1
u I i j (t) = ω I i j (t) + δ4i δ4 ∂ I (t) + W I i j (t),
(4.29)
where 1
1
= ρ.
(4.30) 1
We remark that in obtaining (4.30), we have used supp ρ(t) ⊂ B R¯ for 0 ≤ t < T0 , 1
which follows from the definition of ρ and Proposition 3.7. Using (4.21), (4.22), (4.24), (4.27), (4.28), and (4.29) together, we can write (4.15) as 1 2 ij 1 ij J i j ω 4 = ∂ ∂ t ω J + δ 4 δ 4 ∂ J ∂t . (4.31) Moreover, it follows from the evolution equation (4.12) that 1
1
2
j
1
∂t W I i j = −∂t ω I i j + ∂ I ω4 i j − δ4i δ4 ∂ I ∂t .
(4.32)
1 0 0 ω I i j = ∂ I −1 2ρδ (iJ δ j)4 w J ,
(4.33)
We also note that
by Proposition (3.8), and hence 1
∂t ∂[J W I ] i j = 0
(4.34)
1
by (4.32). However, ∂[J W I ] (0) = 0 by Proposition 3.1, and thus we get from (4.34) that 1
∂[J W I ] i j (t) = 0. This combined with (4.24) shows that (since δ < 0) 1
W I i j (t) = 0,
(4.35)
and hence 1
1
1
j
u I i j (t) = ω I i j (t) + δ4i δ4 ∂ I (t).
(4.36)
Using (4.21), (4.22), (4.28), (4.35), (4.36), and the evolution equation (4.12), a straight1
1
forward calculation then shows that the pair {α, w I } satisfy 0
1
α 1I α 0I 1 0 ∂I w − w I ∂I α − ∂ I w = 0, ∂t α − w ∂ I α − 2n 2n 1
0I
1
0
1
0
1
∂t w J − w I ∂ I w J −
(4.37)
1
1 α J1 α J0 1I 0 ∂ α − ∂ J − w ∂I w J − ∂ α = 0. 2n 2n
(4.38)
Also, we observe that 2 0 0 j 1 (i j) ω4 = 2δ J ∂4 ∂t ρ w J + δ4i δ4 ∂t ,
(4.39)
Post-Newtonian Expansions for Perfect Fluids
871
by (4.31) and (4.33), and that 1 0 ∂I w I ρ =
⎡ ⎤ 0 0 α 2n α 2n−1 1I 0 1 ⎣w ∂I w I ⎦ , ∂I α + (4K n(n + 1))n 2n
(4.40)
and 0 1 ∂I w I ρ =
0 2n α 2n−1 2n(2n − 1) 1 0I 0I 0I 1 0 0 2n−2 1 α∂ w + w ∂ α + w ∂ α α α. I I I (4K n(n+1))n (4K n(n+1))n (4.41)
But, by (1.10), we have 1
∂t ρ =
2n(2n − 1) 2n 2n − 1 1 0 I 0I 1 0 0 2n−1 1 ∂ + α + ∂ w w ∂ α α α, t I I (4K n(n + 1))n 2n (4K n(n + 1))n (4.42)
and therefore
1 0 0 1 1 ∂t ρ − ∂ I w I ρ + w I ρ = 0
by (4.37), (4.40), (4.41), and (4.42). It then follows from (4.30) that 1 1 0 0 1 ∂t = ∂ I −1 w I ρ + w I ρ , and hence 2
ω4 i j ∈ X T0 ,s−1,k,δ−1 by (4.19), (4.39), and Proposition 3.7. 0
1 ¯ τ , T , and W (t) be as in Proposition 3.4, {W (t), ω(t)} Theorem 4.2. Let δ, k, s, K 1 , R, R, as in Proposition 3.8, T0M as in Proposition 3.7, and suppose T0 < T0M . If s and τ are 0
chosen so that s ≥ 2, and 16τ > max{32K 1 , T0 sup0≤t≤T0 sup w I (t) L ∞ }, then for 0 > 0 small enough, (i) there exist constants K 2 , K 3 such that the solution W (t) (0 < ≤ 0 ) exists on the interval [0, T˜ ), where 1 K3 , ln T˜ = min T0 , K2 and obeys the bounds sup max{ u¯ (t) L ∞ , α (t) , wi (t) L ∞ } < 2K 0 ,
0≤t
sup W (t) W 1,∞ < ∞, supp ρ (t) ⊂ B R¯ ,
0≤t
(ii) and there exists maps 1
W ∈ X T0 ,s−1,k−1,δ−1 0 < ≤ 0 , 1
such that W is the unique solution to the initial value problem (4.3)–(4.4), and 0 1 1 W (t) − W (t) − ω(t) + W (t) H k−2 0 1 1 W (t) − W (t) − ω(t) + W (t) H k−2
δ−1,
e K2t 2
for all (t, ) ∈ [0, T˜ ) × (0, 0 ]. (iii) Moreover, if W (0) satisfies ∂t2 W (0) H k−2 1 for 0 ≤ ≤ 0 , then δ−1
0 1 1 W (t) − W (t) − ω(t) + W (t) H k−2 0 1 1 W (t) − W (t) − ω(t) + W (t) H k−2 e K 2 t 2 δ−1,
1
for all (t, ) ∈ [0, T˜ ) × (0, 0 ], where W ∈ X T0 ,s−1,k−1,δ−1 is the unique solution to the initial value problem (4.12)–(4.13). Proof. (i)-(ii): Fix T∗ < min{T, T0 }, and let 0
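\[ C_1 = \sup_{0\le t\le T_*} \|\overset{0}{W}(t)\|_{H^k_{\delta-1}} + \sup_{0\le t\le T_*} \|\partial_t \overset{0}{W}(t)\|_{H^{k-1}_{\delta-1}}, \]
\[ C_2 = \sup_{0\le t\le T_*} \|\overset{1}{\omega}(t)\|_{H^k_{\delta-1}} + \sup_{0\le t\le T_*} \|\partial_t \overset{1}{\omega}(t)\|_{H^{k-1}_{\delta-1}}, \]
and
\[ C_3 = \|\overset{1}{W}_0 - \overset{1}{\omega}(0)\|_{H^{k-1}_{\delta-1}}. \]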
Since K0 ¯uij H k+1 ≤ √ , δ o 0 CSob 1
and W satisfies the linear equation (4.3), it follows from the energy estimates√derived in the proof of Theorem B.1 that there exists a constant K 2 = K 2 (C1 , C2 , K 0 /( 0 CSob )) such that 1
W (t) H k−1 ≤ e K 2 T∗ C3 + K 2 ∀ (t, ) ∈ [0, T∗ ] × (0, 0 ]. δ−1,
(4.43)
Next, we observe that 0 ij ij ∞ ¯ (by (3.23)) u (t) L ≤ CSob ¯u H k+1 + W (t) − W (t) H k−2 δ o δ−1, 2 1 ≤ K 0 + CSob Z (t) H k−2 + W (t) H k−1 + C2 , (4.44) δ−1, δ−1, 2 1 W (t) W 1,∞ ≤ CSob 2 Z (t) H k−2 + W (t) H k−1 + C2 + C1 , (4.45) δ−1,
and
δ−1,
0 2 1 W (t) − W (t) W 1,∞ ≤ CSob 2 Z (t) H k−2 + W (t) H k−1 + C2 . δ−1,
δ−1,
(4.46) 2
2
Setting Z (t) = Z (t), we note that by construction there exists a constant C4 such that 2
2
Z (0) H k−2 ≤ C4 . Moreover, from the error equation (4.8), it is clear that Z satisfies δ−1, an equation to which Theorem B.1 applies. Therefore, for any K 3 > C4 (0 ≤ ≤ 0 ) 2
there exist constants $K_4$, $K_5$ such that $Z(t)$ satisfies an estimate of the form
\[ Z(t) \le e^{K_4 t}\,[C_4 + K_5] - K_5 \le K_3 \quad \text{for } 0 \le t < \tilde{T}, \tag{4.47} \]
where
\[ \tilde{T} = \min\Bigl\{ T_*, \; \frac{1}{K_4} \ln\frac{K_3 + K_5}{C_4 + K_5} \Bigr\}. \tag{4.48} \]
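The relation between the bound (4.47) and the time (4.48) can be checked numerically. The following is only a sketch with made-up constants (the values of `K3`, `K4`, `K5`, `C4` and `T_star` are hypothetical and not taken from the paper): the logarithmic time is exactly the instant at which the exponential bound reaches the threshold $K_3$.

```python
import math

def gronwall_bound(t, K4, K5, C4):
    """Right-hand side e^{K4 t} (C4 + K5) - K5 of the estimate at time t."""
    return math.exp(K4 * t) * (C4 + K5) - K5

def existence_time(T_star, K3, K4, K5, C4):
    """Time up to which the bound stays below K3, capped at T_star."""
    if not (K4 > 0 and C4 + K5 > 0 and K3 > C4):
        raise ValueError("expected K4 > 0, C4 + K5 > 0 and K3 > C4")
    t_log = math.log((K3 + K5) / (C4 + K5)) / K4
    return min(T_star, t_log)

# Hypothetical constants, chosen only for illustration.
K3, K4, K5, C4 = 4.0, 0.5, 1.0, 1.0
T_tilde = existence_time(T_star=10.0, K3=K3, K4=K4, K5=K5, C4=C4)
bound_at_T_tilde = gronwall_bound(T_tilde, K4, K5, C4)
print(T_tilde, bound_at_T_tilde)
```

When the cap `T_star` is the smaller of the two times, the bound at `T_star` lies strictly below `K3`.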
Statements (i) and (ii) now follow directly from Propositions 3.4 and 3.7, and the estimates (3.25), (4.44)–(4.46), (4.47), and (4.48), provided 0 is chosen small enough. (iii): To prove statement (iii), we first observe that it follows from the evolution equation (3.34) that the condition ∂t2 W (0) H k−2 1 for 0 < ≤ 0 is equivalent to the δ−1
condition 2
cI ∂
2
1
I W (0)
1
2
1
= 0. Then replacing W (t), and Z (t) in (4.2) with W (t), and
ω + Z (t), respectively, it is not difficult using Proposition 4.1 to show that the new error 2
term Z (t) will satisfy the same type of estimate as above. We emphasize that the key 1
2
1
property used to make this replacement is that W (t) and ω(t) satisfy c I ∂ I W (t) = 0 and 2
ω ∈ X T0 ,s−1,k−1,δ−1 . The proof of statement (iii) now follows as we are able to replace 1
1
W (t) with W (t) everywhere in the above estimates. 5. Higher Order Expansions and Convergence 0
1 ¯ and W (t) be as in Proposition 3.4, {W (t), ω(t)} Theorem 5.1. Let δ, k, s, K 1 , R, R, 1
as in Proposition 3.8, T0M as in Proposition 3.7, W (t) and τ as in Theorem 4.2, and
suppose T0 < T0M . If s ≥ 3, then for 0 small enough, there exists an infinite sequence of maps q
W ∈ X T0 ,s−2,k−2,δ−1 q ∈ Z≥2 such that q
(i) each W (t) satisfies a linear (non-local) symmetric hyperbolic system with initial q
q
0
1
r
data W (0) = W and coefficients depending on , W , ω, U for 0 ≤ r ≤ q, and 0
r
W for 1 ≤ r ≤ q − 1, (ii) q
q
q
q
W (t) H k−2 + ∂t W (t) H k−3 W (t) H k−2 + ∂t W H k−3 1 δ−1,
δ−1,
for all (t, , q) ∈ [0, T0 ) × (0, 0 ] × Z≥2 , and (iii) 0
1
1
W (t) = W (t) + (ω(t) + W ) +
∞
q
q W (t) (t, ) ∈ [0, T0 ) × (0, 0 ],
q=0
(5.1) k−3 Hδ−1, ) and C 0 ([0, T0 );
where the sum converges uniformly in C 0 ([0, T0 ); (iv) Moreover, if s − 2 ≥ p ≥ 1, and the initial data is chosen so that q+1
∂t
H k−3 ).
W (0) H k−(q+1) 1 q = 1, 2, . . . , p, δ−1
then there exist $\epsilon$-independent maps
\[ \overset{q}{W} \in X_{T_0,s-q,k-q,\delta-1} \quad \text{and} \quad \overset{q+1}{\omega} \in X_{T_0,s-q,k-q,\delta-1}, \qquad q = 1, 2, \ldots, p \]
such that q (iv.a) each W satisfies a -independent linear (non-local) symmetric hyperbolic r
r
system with coefficients depending only on U for 0 ≤ r ≤ q, ω for 0 ≤ r
r ≤ q + 1, and W for 0 ≤ r ≤ q − 1, and q
q
q
(iv.b) the terms W in the sum (5.1) can be replaced by ω + W for 1 ≤ q ≤ p k−(q+2) with the sum converging uniformly C 0 ([0, T0 ), Hδ−1, ) and C 0 ([0, T0 ); H k−(q+2) ). Proof. The proof of this theorem follows from a straightforward adaptation of the proof of Theorem 3 in [21]. We will only sketch the details. Following Schochet [21] (see also [11]), we consider the following iteration: m
m m+1 m m+1 1 I m+1 c ∂ I Z + b I ( Z )∂ I Z + L( Z ) + M( Z ), m+1 q m+1 Z (0) = q−2 W , m+1
b0 ( Z )∂t Z =
q=2
0
(5.2) (5.3)
where m
0
m
1
m
m
m
1 Z 1 = 0, W¯ = W + (ω + W ) + 2 Z , b I ( Z ) = b I (W¯ , U, W¯ , 2 U ), m 2 1 1 1 m+1 m m+1 0 0 2 ˜ ¯ b ( Z ) = b ( U, W ), L( Z ) = L U, W, X, Y, Z , and m M( Z )
1
1
1
˜ Y, = M , U, W, X,
m Z
.
Using the energy estimates of Theorem B.1 and the weighted Sobolev estimates in Appendix A (see also [15]), it is clear the arguments of Schochet can be generalized to show that m
m
Z (t) H k−2 + ∂t Z (t) H k−3 1, δ−1,
(5.4)
δ−1,
and m+2
m+1
m+1
m
m+2
Z (t) − Z (t) H k−3 Z (t) − Z (t) H k−3 + m W H k−3 δ−1,
δ−1,
0
δ−1,
(5.5)
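for all $(t,\epsilon) \in [0,T_0) \times (0,\epsilon_0]$. Therefore by (4.1), (5.4), (5.5), and the uniqueness of solutions to the evolution equation (5.2), we see that for $\epsilon_0$ small enough the sequence
\[ \overset{0}{W}(t) + \epsilon\bigl(\overset{1}{W}_\epsilon(t) + \overset{1}{\omega}(t)\bigr) + \epsilon^2 \overset{m}{Z}_\epsilon(t) \]
converges in $C^0([0,T_0), H^{k-3}_{\delta-1,\epsilon})$ to $W_\epsilon(t)$ for each $\epsilon \in (0,\epsilon_0]$. Therefore, defining
\[ \overset{m+1}{W}_\epsilon(t) = \frac{\overset{m+1}{Z}_\epsilon(t) - \overset{m}{Z}_\epsilon(t)}{\epsilon^{m-1}}, \]
we have that
\[ W_\epsilon(t) = \overset{0}{W}(t) + \epsilon\bigl(\overset{1}{W}_\epsilon(t) + \overset{1}{\omega}(t)\bigr) + \sum_{q=2}^{\infty} \epsilon^q \overset{q}{W}_\epsilon(t) \]
with the sum converging in $C^0([0,T_0), H^{k-3}_{\delta-1,\epsilon})$ for each $\epsilon \in (0,\epsilon_0]$. Moreover, because of the inequality (3.25), it follows that the sum converges uniformly in $C^0([0,T_0), H^{k-3})$ for $\epsilon \in (0,\epsilon_0]$. This completes the proof of statements (i)-(iii). The proof of statement (iv) also follows easily from the arguments used in the proof of Theorem 3 in [21].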
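The passage from the iterates $Z^m$ to the expansion coefficients $W^q$ in the proof is a telescoping argument: with $Z^1 = 0$ and $W^{m+1} = (Z^{m+1} - Z^m)/\epsilon^{m-1}$, the partial sums of $\sum_{q \ge 2} \epsilon^q W^q$ equal $\epsilon^2 Z^M$. A minimal scalar sketch of this bookkeeping follows; the sequence `iterate_Z` is a hypothetical stand-in for the iterates, not the solution of (5.2)–(5.3).

```python
def iterate_Z(m):
    """Hypothetical scalar stand-in for the m-th iterate, normalised so Z^1 = 0."""
    return 1.0 - 2.0 ** (-(m - 1))

def coefficient_W(q, eps):
    """W^q = (Z^q - Z^{q-1}) / eps^(q-2), i.e. the definition with m + 1 = q."""
    return (iterate_Z(q) - iterate_Z(q - 1)) / eps ** (q - 2)

def partial_sum(M, eps):
    """Partial sum of eps^q * W^q over q = 2, ..., M."""
    return sum(eps ** q * coefficient_W(q, eps) for q in range(2, M + 1))

eps, M = 0.1, 12
lhs = partial_sum(M, eps)          # truncated series
rhs = eps ** 2 * iterate_Z(M)      # eps^2 times the M-th iterate
print(lhs, rhs)
```

Convergence of the series is therefore the same statement as convergence of the iterates; the scalar example carries none of the functional-analytic content.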
Remark 5.2. The equations satisfied by the W from part (iv) of Theorem (5.1) are: q q−1 q q−1 q q 0 0 0 0 q q ∂t W = b I (W )∂ I W + b I (W )∂ I ω + B˜ U , W, Y − ∂t ω q
− B˜ 0
q−2 q−1 q−1 q q−1 q q+1 U , W , X + F U , W + c I ∂I ω ,
q
c I ∂ I W = 0, q
q
q
W (0) = W − ω(0), 0
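where
\[ \overset{q}{\mathbf{U}} = (\overset{0}{U}, \ldots, \overset{q}{U}), \qquad \overset{q}{\mathbf{W}} = (\overset{0}{W}, \overset{1}{\omega} + \overset{1}{W}, \ldots, \overset{q}{\omega} + \overset{q}{W}), \]
\[ \overset{q}{\mathbf{X}} = (\partial_t \overset{0}{W}, \partial_t \overset{1}{\omega} + \partial_t \overset{1}{W}, \ldots, \partial_t \overset{q}{\omega} + \partial_t \overset{q}{W}), \]
and
\[ \overset{q}{\mathbf{Y}} = (\partial_I \overset{0}{W}, \partial_I \overset{1}{\omega} + \partial_I \overset{1}{W}, \ldots, \partial_I \overset{q}{\omega} + \partial_I \overset{q}{W}). \]

6. The First Post-Newtonian Expansion

We are now ready to prove the main theorem that guarantees the existence of a large class of solutions to the Einstein-Euler equations that can be expanded to the first post-Newtonian order.

Proof of Theorem 1.1. Using the harmonic equations ∂t u¯ 44 = −∂ I u¯ 4I , and ∂t u¯ I 4 = −∂ I u¯ I J ,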
(6.1)
we can write the constraint equations (3.11) as ¯ i j , ∂ I ∂ J u¯ i j , ∂ I ∂t u¯ K L ) ¯u4k = δ4k ρ − δ kI ∂ L ∂t u L I + Q 4k 0 ( u 4j 4j + Q 1 ( 2 u¯ i j , ∂ I u¯ i j , ∂t u I J ) + Q 2 ( 2 u¯ i j , w, w)α 2 , 4j
4j
where Q 0 (y1 , y2 , y3 ) is bilinear in y1 and (y2 , y3 ), Q 1 (y1 , y2 , y3 ) is quadratic in y2 , y3 , 2 ¯ i j ∈ V. We can and the maps Q 4k ν (ν = 0, 1, 2) are analytic in all their variables for u also write the K L-components of the reduced Einstein equations (2.12) as ∂t2 u¯ K L =
1 2 (1− 2 u¯ 44 ) × ¯u K L + 2 3 u¯ I 4 ∂ I ∂t u¯ K L + 2 u¯ I J ∂ I J u¯ K L + 2 Q 0K L ( 2 u¯ i j , ∂ M u¯ i j , ∂t u¯ I J )
− 2 ρw K w L + pδ K L + 3 Q 1K L ( u¯ i j , 2 u¯ i j , w, w) , (6.2)
where Q 0K L (y1 , y2 , y3 ) is quadratic in (y2 , y3 ), Q 1K L = Q 2K L ( u¯ i j , 2 u¯ i j , w, w)α 2 + Q 3K L ( u¯ i j , 2 u¯ i j , w, w)w I w J , and all of the maps Q νK L (ν = 0, 1, 2, 3) are analytic in their arguments for 2 u¯ ∈ V. We now take ∂t u¯ I J (0) = 2 z4I J , α(0) = α , w I (0) = w I , f I J 0
0
as the prescribed initial data, and solve the non-linear elliptic system ¯ i j , ∂ I ∂ J u¯ i j , ∂ I ∂t u¯ K L ) ¯u4k = 4k := δ4k ρ − δ kI ∂ L ∂t u L I + Q 4k 0 ( u 4j 4j + Q 1 ( 2 u¯ i j , ∂ I u¯ i j , ∂t u I J ) + Q 2 ( 2 u¯ i j , w, w)α 2 , ¯u
KL
(6.3)
:= −2 u¯ ∂ I ∂t u¯ − u¯ ∂ I J u¯ −
+ 2 ρw K w L + pδ K L − 3 Q 1K L ( u¯ i j , 2 u¯ i j , w, w) + (1 − u¯ )f
=
KL
3 I4
KL
2 IJ
KL
2
Q 0K L ( 2 u¯ i j , ∂ M u¯ i j , ∂t u¯ I J ) 4 2 44 K L ,
(6.4)
to determine the initial data {¯ui j |t=0 , ∂t u¯ i j |t=0 } on S0 = {(x I , 0) | (x I ) ∈ R3 }. Note that w4 is determined by the fluid velocity normalization (3.13), which can be written as w4 =
1 f (w I , 2 u¯ i j ),
(6.5)
where f (y1 , y2 ) is analytic in a neighborhood of (0, 0) and f (y) = O(|y|2 ) as y → 0. Using the weighted multiplication inequality (see [15], Lemma A.8) and Lemma A.7, it is straightforward to verify that there exists an 0 > 0 such that i j (see (6.3)–(6.4)) defines an analytic map k−1 I IJ IJ ij k k × Hδ−1 × Hδ−1 ∈ (−0 , 0 ) × Hδ−1 , z4 , α , w , f , u¯ 0
k−1 ×Hδ−2
0
k−2 × Hδk −→ i j ∈ Hδ−2 ,
where 4i = δ4k ρ + 0() and K L = 0( 2 ) as 0. Writing (6.3)–(6.4) as
(6.6)
u¯ i j = −1 , z4I J , α , w I , f I J , u¯ i j , 0
0
k−2 it follows from (6.6) and the invertibility of the Laplacian : Hδk → Hδ−2 that we can use the analytic version of the implicit function theorem [8] to conclude that there exists k−2 k k k an open neighborhood U of any point in Hδ−1 × Hδ−1 × Hδ−1 × Hδ−2 , and analytic maps , z4I J , α , w I , f I J ∈ (−0 , 0 ) × U −→ u¯ i j ∈ Hδk 0
0
that solve Eqs. (6.3)–(6.4). Moreover, it follows from (6.6) that ¯uK L (0) H k
δ−1
2 ∀ ∈ [0, 0 ],
(6.7)
and hence ∂t2 u¯ K L (0) H k
2 ∀ ∈ [0, 0 ].
(6.8)
∂t u¯ K L (0) H k−1 2 ∀ ∈ [0, 0 ].
(6.9)
δ−2
Also, we note that by construction δ−1
Differentiating the harmonic conditions (6.1) with respect to t, and using (6.7)–(6.9), yields p
∂t u¯ 44 (0) H k− p 1 p = 0, . . . , 4, δ− p
(6.10)
and p
∂t u¯ 4J (0) H k− p p = 0, . . . , 3 δ− p
for all ∈ [0, 0 ].
(6.11)
Using (6.1), the Euler equations (2.25) can be written as −1 ∂t w = a 4 ( 2 u¯ i j , w) a I (w, 2 u¯ i j , w)∂ I w + b0 (∂ I u¯ i j , ∂t u¯ I 4 )
+ b1 w, 2 u¯ i j , w, ∂ I u¯ i j , ∂t u¯ I J , ∂ I u¯ i j , 2 ∂t u¯ I J ,
(6.12)
where the maps a 4 , a I , b0 , b1 are analytic in all their arguments for 2 u¯ ∈ V, and a 4 (0, 0) = 1, a I (0, 0, 0, 0) = 0, b0 (y1 , y2 ) is linear, and b4 (y1 , y2 , y3 , y4 , y5 , y6 , y7 ) is linear in (y4 , y5 , y6 , y7 ) and satisfies b4 (0, 0, 0, y4 , y5 , y6 , y7 ) = 0. Then differentiating (6.1), (6.2), and (6.12) with respect to t while using (6.7)–(6.11) shows that p
∂t u¯ K L (0) H k− p 1 p = 3, 4,
(6.13)
δ−2
p
∂t α (0) H k− p 1 p = 0, . . . , 3,
(6.14)
δ−1
p
∂t wi (0) H k− p 1 p = 0, . . . , 3,
(6.15)
δ−1
and ∂t4 u¯ 4J H k−4 1
(6.16)
δ−3
for all ∈ [0, 0 ]. We then find from the definition of W , the estimates (6.7)–(6.11), and (6.13)–(6.16), that ∂t3 W (0) H k− p 1 for p = 0, 1, 2, 3 and 0 ≤ ≤ 0 .
(6.17)
δ−1
Next, we observe that 1 ¯uij (t) L 2 = ¯uij (0) + −1 δuij (t) L 2 ¯uij (0) L 2 + δ u¯ ij (t) L 2 δ δ δ δ−1,
(6.18)
by (3.26) and (3.27), while for any 0 ≤ ≤ k, V (t) V (t) H
δ−1,
= W (t) + d(W (t)) H
δ−1,
W (t) H
δ−1,
(6.19)
by (3.25), (3.29), and (3.31). The proof of Theorem 1.1 now follows directly from Theorem 5.1 and the estimates (6.17)–(6.19).

7. Discussion

In this article, we have established the existence of a large class of dynamical solutions to the Einstein-Euler equations that have a first post-Newtonian expansion. Although this is an improvement over existing rigorous results [15,19], which only cover the Newtonian limit situation (i.e. the “zeroth” post-Newtonian expansion), the results of this paper are almost certainly not optimal. In general, one expects that with a suitable gauge choice, it should be possible to generate post-Newtonian expansions to at least the 2.5 post-Newtonian order, after which there are indications that the post-Newtonian expansions will break down. For a lucid discussion of this phenomenon see [17]. As remarked in [17], the choice of harmonic gauge may be the reason for not being able to reach the 2.5 post-Newtonian order. At the formal level, there exist other gauges
that perform better than the harmonic gauge for the post-Newtonian expansions. However, it remains to be seen if these other gauges are compatible with the singular hyperbolic energy estimates that are guaranteed to arise in the dynamical setting. We are presently investigating this problem. From the proof of Theorem 1.1 and the paper [15], it is clear that conditions of the form
\[ \|\partial_t^p W_\epsilon(0)\|_{H^{k-p}_{\delta-1}} \lesssim 1 \quad \text{as } \epsilon \searrow 0 \tag{7.1} \]
on the initial data play a crucial role in generating the post-Newtonian expansions. This leads to the question of what happens when one considers initial data that does not satisfy (7.1) for any $p \in \mathbb{Z}_{\ge 0}$. In [16], we address this question for the situation where
\[ \limsup_{\epsilon \searrow 0} \|\partial_t W_\epsilon(0)\|_{H^{k-1}_{\delta-1}} = \infty. \]
There we find that a Newtonian description is still appropriate for the motion of the matter, but the gravitational field no longer vanishes in the limit $\epsilon \searrow 0$. Instead, there exists high frequency gravitational radiation that is not small at the $\epsilon^0$ order, and this will necessarily affect the higher order expansions.

Acknowledgements. This work began while I was a junior scientist at the Albert-Einstein-Institute (AEI). I thank the AEI and the director Gerhard Huisken of the Geometric Analysis and Gravitation Group for supporting this research.
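A. Weighted Calculus Inequalities

In this section, we prove additional weighted calculus inequalities that are similar in spirit to those in Appendix A of [15]. We first recall from [15] the definition of the weighted Sobolev spaces. Let $V$ be a finite dimensional vector space with inner product $(\cdot|\cdot)$ and corresponding norm $|\cdot|$. For $u \in L^p_{\mathrm{loc}}(\mathbb{R}^n, V)$, $1 \le p \le \infty$, $\delta \in \mathbb{R}$, and $\epsilon \in \mathbb{R}_{\ge 0}$, the weighted $L^p$ norm of $u$ is defined by
\[ \|u\|_{L^p_{\delta,\epsilon}} := \begin{cases} \|\sigma_\epsilon^{-\delta-n/p} u\|_{L^p} & \text{if } 1 \le p < \infty \\ \|\sigma_\epsilon^{-\delta} u\|_{L^\infty} & \text{if } p = \infty \end{cases}, \tag{A.1} \]
where $\sigma_\epsilon(x) := \sqrt{1 + \tfrac{1}{4}|\epsilon x|^2}$. The weighted Sobolev norms are then defined by
\[ \|u\|_{W^{k,p}_{\delta,\epsilon}} := \begin{cases} \Bigl( \sum_{|\alpha| \le k} \|D^\alpha u\|^p_{L^p_{\delta-|\alpha|,\epsilon}} \Bigr)^{1/p} & \text{if } 1 \le p < \infty \\ \sum_{|\alpha| \le k} \|D^\alpha u\|_{L^\infty_{\delta-|\alpha|,\epsilon}} & \text{if } p = \infty \end{cases}, \tag{A.2} \]
where $k \in \mathbb{N}_0$, $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}_0^n$ is a multi-index and $D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$. Here
\[ \partial_i = \frac{\partial}{\partial x^i}, \]
where $(x^1, \ldots, x^n)$ are the standard Cartesian coordinates on $\mathbb{R}^n$. The weighted Sobolev spaces are then defined as
\[ W^{k,p}_{\delta,\epsilon} = \{\, u \in W^{k,p}_{\mathrm{loc}}(\mathbb{R}^n, V) \mid \|u\|_{W^{k,p}_{\delta,\epsilon}} < \infty \,\}. \]
We note that $W^{k,p}_{\delta,0}$ are the standard Sobolev spaces, and for $\epsilon > 0$ the $W^{k,p}_{\delta,\epsilon}$ are equivalent to the radially weighted Sobolev spaces [1,7]. For $p = 2$, we use the alternate notation $H^k_{\delta,\epsilon} := W^{k,2}_{\delta,\epsilon}$. The spaces $L^2_{\delta,\epsilon}$ and $H^k_{\delta,\epsilon}$ are Hilbert spaces with inner products
\[ \langle u|v \rangle_{L^2_{\delta,\epsilon}} := \int_{\mathbb{R}^n} (u|v)\, \sigma_\epsilon^{-2\delta-n} \, d^n x, \tag{A.3} \]
and
\[ \langle u|v \rangle_{H^k_{\delta,\epsilon}} := \sum_{|\alpha| \le k} \langle D^\alpha u | D^\alpha v \rangle_{L^2_{\delta-|\alpha|,\epsilon}}, \tag{A.4} \]
respectively. When $\epsilon = 1$, we will also use the notation $W^{k,p}_\delta = W^{k,p}_{\delta,1}$ and $H^k_\delta = H^k_{\delta,1}$.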
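To make the role of the parameter $\epsilon$ in the weighted norm (A.1) concrete, here is a small one-dimensional sketch ($n = 1$). It assumes the weight $\sigma_\epsilon(x) = \sqrt{1 + |\epsilon x|^2/4}$ and approximates the integral by a midpoint rule on a truncated interval; the function names, the test function and the grid parameters are illustrative choices, not part of the paper.

```python
import math

def sigma(x, eps):
    """Assumed weight sigma_eps(x) = sqrt(1 + |eps x|^2 / 4)."""
    return math.sqrt(1.0 + 0.25 * (eps * x) ** 2)

def weighted_lp_norm(u, p, delta, eps, a=-20.0, b=20.0, steps=4000, n=1):
    """Midpoint-rule value of || sigma_eps^(-delta - n/p) u ||_{L^p} on [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        weight = sigma(x, eps) ** (-delta - n / p)
        total += abs(weight * u(x)) ** p * h
    return total ** (1.0 / p)

def gaussian(x):
    return math.exp(-x * x)

# eps = 0 gives sigma = 1, so the weighted norm is the ordinary L^2 norm.
plain = weighted_lp_norm(gaussian, p=2, delta=-1.0, eps=0.0)
# For eps > 0 and -delta - n/p > 0 the weight grows at infinity.
weighted = weighted_lp_norm(gaussian, p=2, delta=-1.0, eps=1.0)
print(plain, weighted)
```

With $\delta = -1$ and $p = 2$ the exponent $-\delta - n/p$ equals $1/2$, so the weighted norm dominates the unweighted one. The case $\epsilon = 0$ reproduces the standard norm, in line with the remark that the $W^{k,p}_{\delta,0}$ are the standard Sobolev spaces.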
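Lemma A.1. Suppose $\epsilon_0 > 0$, $\delta_1 \ge \max\{\delta_2 + \delta_3, \delta_4 + \delta_5\}$, then
\[ \|uv\|_{H^k_{\delta_1,\epsilon}} \lesssim \|u\|_{L^\infty_{\delta_2,\epsilon}} \|v\|_{H^k_{\delta_3,\epsilon}} + \bigl( \|Du\|_{H^{k-1}_{\delta_4-1,\epsilon}} + \|u\|_{L^2_{\delta_4,\epsilon}} \bigr) \|v\|_{L^\infty_{\delta_5,\epsilon}} \]
for all $\epsilon \in [0,\epsilon_0]$, $u \in L^\infty_{\delta_2,\epsilon} \cap H^k_{\delta_4,\epsilon}$, and $v \in L^\infty_{\delta_5,\epsilon} \cap H^k_{\delta_3,\epsilon}$.

Proof. This follows directly from the inequality $\|uv\|_{H^k} \lesssim \|u\|_{L^\infty} \|v\|_{H^k} + \|Du\|_{H^{k-1}} \|v\|_{L^\infty}$ and Lemma A.4 of [15].

Lemma A.2. Suppose $\epsilon_0 > 0$, $\delta \le 0$, $-n/2 \le \lambda \le -n/2 + 1$, $\lambda \ge \delta$, $k > n/2$, and $f \in C_b^k(\mathbb{R}^L \times \mathbb{R}^N, \mathbb{M}_{M \times M})$ with $f(0,0) = 0$. Then there exists a polynomial $p(y_1, y_2, y_3)$ such that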
f (u, w)v H k f C k p u H k , w H k , v H k v H k δ,
for all ∈ [0, 0 ], u ∈
Hλk
λ
b
and w, v ∈
δ,
δ,
δ,
k . Hδ,
Proof. Since δ ≤ λ, it follows from Lemma A.1 that f (u, w)v H k f (u, w) L ∞ v H k δ, δ,
+ D( f (u, w)) H k−1 + f (u, w) H k v L ∞ . δ, λ−1,
λ,
Using Lemma A.9 of [15], we can write the above inequality as f (u, w)v H k f (u, w) L ∞ v H k δ, δ, k−1 Du H k−1 + f C k 1 + ( u L ∞ + w L ∞ ) b λ−1,
∞ + u H k + w H k−1 v L δ, . λ,
λ,
(A.5)
But k > n/2 and λ ≤ δ ≤ 0 implies that u L ∞ u H k , w L ∞ w H k , v L ∞ v L ∞ v H k , δ,
(A.6)
w H k w H k
(A.7)
λ
δ,
δ,
and λ,
δ,
by Eq. A.24 and Lemma A.7 of [15], while Du H k−1 + u L 2 u H k λ,
λ−1,
(A.8)
λ
follows from Lemma A.11 of [15] since −n/2 ≤ λ ≤ −n/2 + 1. The proof now follows directly from the inequalities (A.5)–(A.8). Lemma A.3. Suppose 0 > 0, δ ≤ 0, −n/2 ≤ λ ≤ −n/2 + 1, λ ≥ δ, k > n/2 + 1, and f ∈ Cbk (R L × R N , M M×M ) with f (0, 0) = 0. Then there exists a polynomial p(y1 , y2 ) such that
f C k p( u H k , w H k ) u H k + w H k v H k−1 [D α , f (u, w)]v L 2 δ−|α|
λ
b
δ,
λ
δ,
δ−1,
k , and v ∈ H k−1 . for all ∈ [0, 0 ] , 1 ≤ |α| ≤ k, u ∈ Hλk , w ∈ Hδ, δ−1,
Proof. The proof follows directly from Lemma A.9 of [15] and the inequalities (A.5)–(A.8).

Lemma A.4. Suppose $\epsilon_0 > 0$, $\delta \le 0$, $-n/2 \le \lambda \le -n/2 + 1$, $\lambda \ge \delta$, and $k > n/2$. Then there exists a constant $C > 0$ such that
\[ \|u_1 u_2\|_{H^k_\lambda} \le C \|u_1\|_{H^k_\lambda} \|u_2\|_{H^k_\lambda}, \quad \|u_1 v_1\|_{H^k_{\delta,\epsilon}} \le C \|u_1\|_{H^k_\lambda} \|v_1\|_{H^k_{\delta,\epsilon}}, \quad \text{and} \quad \|v_1 v_2\|_{H^k_{\delta,\epsilon}} \le C \|v_1\|_{H^k_{\delta,\epsilon}} \|v_2\|_{H^k_{\delta,\epsilon}} \]
for all $u_1, u_2 \in H^k_\lambda$, $v_1, v_2 \in H^k_{\delta,\epsilon}$, and $\epsilon \in [0,\epsilon_0]$.

Proof. The proof follows immediately from Lemma A.1 and the inequalities (A.6)–(A.8).

We now recall the definition of analytic maps between Banach spaces.

Definition A.5. Suppose $Y$ and $Z$ are Banach spaces, $U \subset Y$ is an open set, and $L_j(Y,Z)$ is the space of continuous, $j$-multilinear maps from $Y$ to $Z$ with norm
\[ \|F\|_{L_j(Y,Z)} = \sup\bigl\{ \|F(u_1, u_2, \ldots, u_j)\|_Z \;\big|\; u_j \in U \text{ and } \sup\{\|u_1\|_Y, \|u_2\|_Y, \ldots, \|u_j\|_Y\} \le 1 \bigr\}. \]
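Then a map $f : U \to Z$ is analytic in $U$, if for each $u_0 \in U$ there exists a $\rho > 0$, and a sequence of multilinear maps $f_j \in L_j(Y,Z)$ such that
\[ \sum_{j=0}^{\infty} \|f_j\|_{L_j(Y,Z)} \rho^j < \infty, \]
and
\[ f(u) = \sum_{j=0}^{\infty} f_j(u - u_0, \ldots, u - u_0) \tag{A.9} \]
for all $u \in U$ satisfying $\|u - u_0\|_Y < \rho$. The set of all analytic functions in $U$ will be denoted $C^\omega(U,Z)$.

In addition to analytic maps, we will need analytic maps that are uniformly analytic on the $H^k_{\delta,\epsilon}$ spaces as $\epsilon$ varies.

Definition A.6. Suppose $R > 0$, $Y$, $Z$ are Banach spaces, and $V \subset Y$ is open. Then a sequence of maps $f_\epsilon : B_R(H^{k_1}_{\delta_1,\epsilon}) \times V \to H^{k_2}_{\delta_2,\epsilon} \times Z$ will be called uniformly analytic for $\epsilon \in [0,\epsilon_0]$, if
(i) $f_\epsilon \in C^\omega(B_R(H^{k_1}_{\delta_1,\epsilon}) \times V; H^{k_2}_{\delta_2,\epsilon} \times Z)$ for $0 \le \epsilon \le \epsilon_0$, and
(ii) for each $v_0 \in V$ there exist constants $\rho, c_j > 0$, and a sequence of multilinear maps $f_{\epsilon,j} \in L_j(H^{k_1}_{\delta_1,\epsilon} \times Y, H^{k_2}_{\delta_2,\epsilon} \times Z)$ such that
\[ \|f_{\epsilon,j}\|_{L_j(H^{k_1}_{\delta_1,\epsilon} \times Y,\, H^{k_2}_{\delta_2,\epsilon} \times Z)} \le c_j \quad (0 \le \epsilon \le \epsilon_0), \qquad \sum_{j=0}^{\infty} c_j (\rho + R)^j < \infty, \]
and
\[ f_\epsilon(u,v) = \sum_{j=0}^{\infty} f_{\epsilon,j}\bigl((u, v - v_0), \ldots, (u, v - v_0)\bigr) \quad (0 \le \epsilon \le \epsilon_0) \]
for all $(u,v) \in H^{k_1}_{\delta_1,\epsilon} \times V$ satisfying $\|u\|_{H^{k_1}_{\delta_1,\epsilon}} < R$, and $\|v - v_0\|_Y < \rho$.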
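The next lemma shows how to construct a particular class of uniformly analytic functions.

Lemma A.7. Suppose $\epsilon_0 > 0$, $\delta \le 0$, $-n/2 \le \lambda \le -n/2 + 1$, $k > n/2$, $F \in C^\omega(B_{R_1}(\mathbb{R}) \times B_{R_2}(\mathbb{R}), \mathbb{R})$, $F(\cdot, 0) = 0$, and $C$ is the $\epsilon$-independent constant from Lemma A.4. Then for $0 \le \epsilon \le \epsilon_0$,
\[ F(u,v) = \sum_{p=0}^{\infty} \sum_{q=1}^{\infty} \frac{1}{q!\,p!}\, \partial_1^p \partial_2^q F(0,0)\, u^p v^q \]
defines a function of class $C^\omega(B_{\bar{R}_1}(H^k_\lambda) \times B_{\bar{R}_2}(H^k_{\delta,\epsilon}), H^k_{\delta,\epsilon})$, where $\bar{R}_1 = R_1/C$ and $\bar{R}_2 = R_2/C$.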
Proof. Using Lemma A.4, the proof follows from a slight modification of the proof of Proposition 3.6 from [10]. We note that the above lemma can be easily generalized to maps f ∈ C ω (B R (R N ) × B R (R M ), M M×M ). B. Symmetric Hyperbolic Equations The hyperbolic equations that we will consider are of the form b0 (u , w , v )∂t v = v |t=0 = v , o
1 j c ∂ j v + b j (, u , w , v )∂ j v + γ F(, u , w , v ), (B.1) (B.2)
where (i) the maps u = u (x) and w = w (t, x) are R L and R N valued, respectively, while the map v = v (t, x) is R M -valued, (ii) F is a (possibly non-local) map satisfying F(, u, w1 , v1 ) − F(, u, w2 , v2 ) H k ρ,0 ,k, w1 − w2 H k + v1 − v2 H k δ
δ
δ
(B.3) k ), and for all ∈ [0, 0 ], u ∈ Bρ (Hλ ), w1 , w2 , v1 , v2 ∈ Bρ (Hδ,
F(, u, w, v) H k p( u H , w H k , v H k ) w H k + v H k λ
δ,
δ,
δ,
δ,
δ,
(B.4) (iii) (iv) (v) (vi)
k , for all ∈ [0, ], u ∈ Hλ , and w, v ∈ Hδ, 0 j L N M b , b ∈ Cb (R × R × R , M M×M )( j = 1, . . . , n), b0 and b j are symmetric, the c j are constant symmetric matrices, and there exists a constant ω > 0 such that
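\[ b^0(\xi_1, \xi_2, \xi_3) \ge \omega \mathbb{1}_{M \times M} \quad \text{for all } (\xi_1, \xi_2, \xi_3) \in \mathbb{R}^L \times \mathbb{R}^N \times \mathbb{R}^M. \tag{B.5} \]
Let $[n/2]$ denote the largest integer with $[n/2] \le n/2$, $k_0 = [n/2] + 2$, and
\[ X_{T,s,k,\delta} = \bigcap_{\ell=0}^{s+1} C^\ell([0,T), H^{k-\ell}_\delta). \]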
Theorem B.1. Suppose 0 > 0, T > 0, s ∈ N0 , k = k0 + s, δ ≤ 0, −n/2 ≤ λ ≤ −n/2 + 1, v ∈ Hδk , u ∈ Hλk , w ∈ X T,s,k,δ , o
0 < ≤ 0 ,
and v H k ≤ C1 , w (t) H k + ∂t w (t) H k−1 ≤ C2 , u H k ≤ C3 , o
δ,
δ,
δ,
λ
for constants C1 , C2 , C3 , independent of (t, ) ∈ [0, T ) × (0, 0 ]. Then there exists a polynomial p(y1 , y2 , y3 ) and maps v ∈ X T ,s,k,δ
0 < ≤ 0 ,
such that for all ∈ (0, 0 ]: (i) v (t, x) is the unique solution in L ∞ ((0, T∗ ), Hδk ) ∩ Lip((0, T ), Hδk−1 ) to the initial value problem (B.1)–(B.2)), (ii) if lim suptT v W 1,∞ < ∞, then the solution v can be extended (uniquely) for time T∗ ∈ [T , T ), (iii) for any constant K 1 > C1 , γ C2 γ C2 v (t) H k ≤ exp (K 2 (1 + γ )t) v (0) H k + − ≤ K1 δ, δ, K 2 (1 + γ ) K 2 (1 + γ ) for all (t, ) ∈ [0, T˜ ) × (0, 0 ], where K 2 := p(C3 , C2 , K 1 ), and
˜ T ≥ T := min T,
1 ln K 2 (1 + γ )
K 1 K 2 (1 + γ ) + γ C2 C1 K 2 (1 + γ ) + γ C2
,
(iv) v (t) H k 1 for all (t, ) ∈ [0, T˜ ) × (0, 0 ], δ,
(v) and if c j ∂ j v H k−1 , then ∂t v (t) H k−1 1 for all (t, ) ∈ [0, T˜ ) × (0, 0 ]. o
δ,
δ,
Proof. We will only prove statements (iii)-(v) as (i)-(ii) follow from a slight modification of arguments in Appendix B of [15]2 . j Let vα = D α v , b0 = b0 (u , w , v ), b = b j (, u , w , v ), and F = F(, u , w , v ). Then from the evolution equation (B.1), we find that −1 1 (B.6) c j ∂ j v + bj ∂ j v + γ F . ∂t v = b0 Differentiating this yields b0 ∂t vα = where
1 j c ∂ j v + bj ∂ j v + f α ,
−1 −1 f α = b0 D α , b0 −1 c j + bj ∂ j vα + γ b0 D α b0 F .
(B.7)
(B.8)
Energy estimates (see Lemma 7.1 in [15]) then show that
d L ∞ vα 2 + f α 2 vα 2 , (B.9) |||vα |||20,δ, div b L ∞ + c + b L δ, L δ, L δ, dt 2 The only real difference is the proof of the convergence of the Galerkin approximations. For the non-local problem, one can use the global compact imbedding Hδk ⊂ Hη (k > , δ < η) to obtain convergence instead of the local compact H k (B R ) ⊂ H (B R ) (k > ) imbedding used in [15].
where div b = ∂t b0 + ∂ j b , c = (c1 , . . . , cn ), b = (b1 , . . . , bn ), and D α (·)|b0 D α (·). ||| · |||k,δ, := j
(B.10)
|α|≤k
Since b0 = b0 (u , w , v ), it follows from Lemma A.3 that (B.11) [D α , (b0 )−1 ( −1 c j + bj ))]∂ j vα L 2 δ,
p u H , w H k , v H k u H + w H k + v H k v H k λ
δ,
δ,
δ,
λ
δ,
δ,
(B.12) for some polynomial p(y1 , y2 , y3 ). Using this estimate along with (B.4) and Lemma A.2, we find that
(B.13) f α L 2 p( u H , w H k , v H k ) w H k + v H k λ
δ
δ,
δ,
δ,
δ,
for some polynomial p(y1 , y2 , y2 ). Combining the two estimates (B.9) and (B.13), and summing over α(0 ≤ |α| ≤ k) yields
d |||v |||k,δ, p( u H , w H k , v H k ) γ w H k + (1 + γ ) v H k . δ, δ, δ, δ, λ dt (B.14) But ω v H k ≤ |||v |||k,δ, ≤ b0 C 0 v H k , δ,
b
δ,
by (B.5), and so it follows from (B.14) and Gronwall’s inequality that for any constant K 1 > C1 , if we let K 2 = p(C3 , C2 , K 1 ), then γ C2 γ C2 − v (t) H k ≤ exp (K 2 (1 + γ )t) v (0) H k + δ, δ, K 2 (1 + γ ) K 2 (1 + γ ) (B.15) for all t such that v (t) H k ≤ K 1 . But v W 1,∞ v H k by Lemma A.7 of [15], δ, δ, and hence, by the continuation principle (ii), we see that K 1 K 2 (1 + γ ) + γ C2 1 ˜ T ≥ T := min T, . (B.16) ln K 2 (1 + γ ) C1 K 2 (1 + γ ) + γ C2 Next, differentiating (B.1) with respect to t, it is clear that ∂t v satisfies a linear equation of the same structure as (B.1), and therefore the same estimates used to derive (B.15) also show that there exists constants K 2 , K 3 such that ∂t v (t) H k ≤ e K 1 t ∂t v (0) H k−1 + K 3 ∀ (t, ) ∈ [0, T∗ ] × (0, 0 ]. δ,
δ,
The proof now follows from the estimates (B.15)–(B.17).
(B.17)
References
1. Bartnik, R.: The Mass of an Asymptotically Flat Manifold. Comm. Pure Appl. Math. 39, 661–693 (1986)
2. Bauer, S., Kunze, M.: The Darwin approximation of the relativistic Vlasov-Maxwell system. Ann. Henri Poincare 6, 283–308 (2005)
3. Bauer, S.: Post Newtonian approximation of the Vlasov-Nordström system. Comm. PDE 30, 957–985 (2005)
4. Bauer, S.: Post Newtonian dynamics at order 1.5 in the Vlasov-Maxwell system. Submitted to J. Nonlinear Sci., available at http://arXiv.org/abs/math-ph/0602031vI, 2006
5. Blanchet, L.: Gravitational Radiation from Post-Newtonian Sources and Inspiralling Compact Binaries. Living Rev. Relativity 9, 4. URL (cited on 02.05.2008): http://www.livingreviews.org/lrr-2006-4, 2006
6. Browning, G., Kreiss, H.O.: Problems with different time scales for nonlinear partial differential equations. SIAM J. Appl. Math. 42, 704–718 (1982)
7. Choquet-Bruhat, Y., Christodoulou, D.: Elliptic systems in Hs,δ spaces on manifolds which are Euclidean at infinity. Acta. Math. 146, 129–150 (1981)
8. Deimling, K.: Nonlinear functional analysis. Berlin: Springer-Verlag, 1998
9. Futamase, T., Itoh, Y.: The Post-Newtonian Approximation for Relativistic Compact Binaries. Living Rev. Relativity 10, 2. URL (cited on 02.05.2008): http://www.livingreviews.org/lrr-2007-2, 2007
10. Heilig, U.: On the Existence of Rotating Stars in General Relativity. Commun. Math. Phys. 166, 457–493 (1995)
11. Klainerman, S., Majda, A.: Compressible and incompressible fluids. Comm. Pure Appl. Math. 35, 629–651 (1982)
12. Kreiss, H.O.: Problems with different time scales for partial differential equations. Comm. Pure Appl. Math. 33, 399–439 (1980)
13. Lottermoser, M.: A convergent post-Newtonian approximation for the constraints in general relativity. Ann. Inst. Henri Poincaré 57, 279–317 (1992)
14. Makino, T.: On a local existence theorem for the evolution equation of gaseous stars. In: Patterns and Waves, edited by T. Nishida, M. Mimura, H. Fujii, Amsterdam: North-Holland, 1986
15. Oliynyk, T.A.: The Newtonian limit for perfect fluids. Commun. Math. Phys. 276, 131–188 (2007)
16. Oliynyk, T.A.: The fast Newtonian limit for perfect fluids. In preparation
17. Rendall, A.D.: On the definition of post-Newtonian approximations. Proc. R. Soc. Lond. A 438, 341–360 (1992)
18. Rendall, A.D.: The initial value problem for a class of general relativistic fluid bodies. J. Math. Phys. 33, 1047–1053 (1992)
19. Rendall, A.D.: The Newtonian limit for asymptotically flat solutions of the Vlasov-Einstein system. Commun. Math. Phys. 163, 89–112 (1994)
20. Schochet, S.: Symmetric hyperbolic systems with a large parameter. Comm. Part. Diff. Eqs. 11, 1627–1651 (1986)
21. Schochet, S.: Asymptotics for symmetric hyperbolic systems with a large parameter. J. Diff. Eqs. 75, 1–27 (1988)
22. Taylor, M.E.: Partial differential equations III, nonlinear equations. New York: Springer, 1996

Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 887–906 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0712-1
Communications in
Mathematical Physics
Diffusion at the Random Matrix Hard Edge José A. Ramírez1 , Brian Rider2 1 Department of Mathematics, Universidad de Costa Rica, San Jose 2060,
Costa Rica. E-mail:
[email protected]
2 Department of Mathematics, University of Colorado at Boulder, Boulder,
CO 80309, USA. E-mail:
[email protected] Received: 10 May 2008 / Accepted: 9 October 2008 Published online: 13 January 2009 – © Springer-Verlag 2009
Abstract: We show that the limiting minimal eigenvalue distributions for a natural generalization of Gaussian sample-covariance structures (beta ensembles) are described by the spectrum of a random diffusion generator. This generator may be mapped onto the “Stochastic Bessel Operator,” introduced and studied by A. Edelman and B. Sutton in [6] where the corresponding convergence was first conjectured. Here, by a Riccati transformation, we also obtain a second diffusion description of the limiting eigenvalues in terms of hitting laws. All this pertains to the so-called hard edge of random matrix theory and sits in complement to the recent work [15] of the authors and B. Virág on the general beta random matrix soft edge. In fact, the diffusion descriptions found on both sides are used below to prove there exists a transition between the soft and hard edge laws at all values of beta.

1. Introduction

The origins of random matrix theory can be traced to the introduction of Wishart’s ensembles, matrices of the form $X X^\dagger$ with rectangular $X$ comprised entirely of independent real or complex Gaussians of mean zero and mean-square one. The spectra of these objects are of fundamental importance in mathematical statistics (see the comprehensive text [14]), and continue to generate wide interest due to their relevance to such disparate areas as information theory [20], numerical analysis [7], and, along with their quaternion-entried counterparts, theoretical physics [27]. Here we consider scaling limits for Wishart-type eigenvalues at the hard edge. To explain, let $X$ be $n \times m$. If m n as n ↑ ∞ the minimal eigenvalues of the (non-negative) $X X^\dagger$ will feel the “hard” constraint at the origin, while if $m/n$ is strictly larger than one in the large dimensional limit, the minimal eigenvalues separate from zero and one has “soft” edge fluctuations (on which more below).
In fact, if m = n + a with fixed a as n ↑ ∞ one discovers an interesting family of limit laws indexed by a for the bottom of the spectrum.
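The hard-edge regime $m = n + a$ can be explored with a short experiment. The sketch below is not part of the original argument: it samples a real Wishart matrix $X X^T$ with $X$ of size $n \times (n+a)$ and computes its spectrum with a plain cyclic Jacobi iteration, so that only the standard library is needed. The helper names, the seed and the sizes are arbitrary illustrative choices. In this regime one expects the smallest eigenvalue to be of order $1/n$, which is why the rescaled quantity $n \lambda_{\min}$ is the natural one to look at.

```python
import math
import random

def wishart(n, a, rng):
    """A = X X^T for an n x (n + a) matrix X of i.i.d. standard real Gaussians."""
    m = n + a
    X = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    return [[sum(X[i][k] * X[j][k] for k in range(m)) for j in range(n)]
            for i in range(n)]

def jacobi_eigenvalues(A, max_sweeps=100, tol=1e-12):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    for _ in range(max_sweeps):
        off = math.sqrt(sum(A[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Angle with tan(2 theta) = 2 A_pq / (A_qq - A_pp), |theta| <= pi/4.
                diff = A[q][q] - A[p][p]
                sign = 1.0 if diff >= 0.0 else -1.0
                theta = 0.5 * math.atan2(sign * 2.0 * A[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))

rng = random.Random(0)
n, a = 8, 1
A = wishart(n, a, rng)
eigs = jacobi_eigenvalues(A)
trace = sum(A[i][i] for i in range(n))
scaled_min = n * eigs[0]  # the hard-edge scaling of the smallest eigenvalue
print(eigs[0], scaled_min)
```

Repeating the experiment over many seeds and increasing `n` with `a` fixed gives an empirical picture of the limit laws indexed by $a$ mentioned above.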
The known results at the hard edge have thus far been based on the explicit joint density for the Wishart eigenvalues $0 \le \lambda_0, \lambda_1, \ldots, \lambda_{n-1}$. In particular, when $X$ is $n \times (n+a)$ with integer $a > -1$, that density is
\[ P_{\beta,a}(\lambda_1, \ldots, \lambda_n) = \frac{1}{Z_{\beta,a}} \prod_{j<k} |\lambda_j - \lambda_k|^\beta \times \prod_{k=0}^{n-1} \lambda_k^{\frac{\beta}{2}(a+1)-1} e^{-\frac{\beta}{2}\lambda_k}, \tag{1.1} \]
with normalizer $Z_{\beta,a} < \infty$ and $\beta = 1$, 2, or 4 for real, complex, or quaternion Gaussian entries. More importantly, with these choices of $\beta$ all finite dimensional correlation functions of the eigenvalues are computable in terms of Laguerre polynomials (thus the common tag “Laguerre ensembles”). At $\beta = 2$ and all valid $a$, [22] proves the limiting distribution of the minimal eigenvalue is described by the Fredholm determinant of a kernel operator given in terms of Bessel functions, and based on this derives a second description of the limit law as a functional of the fifth Painlevé transcendent. Other work at the $\beta = 1, 2, 4$ hard edge includes [4,10], and [26]. These again rely on the underlying orthogonal polynomial structure (the first and third reference use Riemann-Hilbert methods to replace the exponential weight $e^{-(\beta/2)\lambda}$ in (1.1) with a more general $e^{-V(\lambda)}$ potential), and describe the eventual limit law through Fredholm determinants or Fredholm pfaffians.

While the distribution on $n$ points $\lambda_1, \ldots, \lambda_n \in \mathbb{R}_+$ defined by (1.1) makes sense for all $\beta > 0$ and $a > -1$, the orthogonal polynomial approach breaks down outside the standard triple in $\beta$. For some special choices of the parameters beyond $\beta = 1, 2, 4$, [9] was able to exploit the niceties of the exponential weight to obtain limit laws in terms of hypergeometric functions. Still, even the existence of the general $\beta$ hard edge limit law remained open until now.

Our approach, similar to that in [15], rests on the existence of tridiagonal matrix models for all $\beta$. Set for any $a > -1$ and $\beta > 0$,
\[ L_{\beta,a} = \frac{1}{\sqrt{\beta}} \begin{bmatrix} \chi_{(a+n)\beta} & & & & \\ \chi_{(n-1)\beta} & \chi_{(a+n-1)\beta} & & & \\ & \chi_{(n-2)\beta} & \ddots & & \\ & & \ddots & \chi_{(a+2)\beta} & \\ & & & \chi_{\beta} & \chi_{(a+1)\beta} \end{bmatrix} \tag{1.2} \]
in which each $\chi_r$ that appears is an independent $\chi$ random variable of the indicated index. (We suppress here the dimension parameter $n$ on the $n \times n$ random matrix $L_{\beta,a}$.) Then, as discovered by Dumitriu and Edelman [5], the eigenvalues of $L_{\beta,a} L_{\beta,a}^T$ have
Note that, when β = 1 or 2, the bidiagonal (1.2) may be arrived at by performing Householder transformations on the corresponding “full” Wishart ensemble; this fact was used previously in a random matrix context by Silverstein [16]. Viewing the n ↑ ∞ limit as giving rise to a continuum approximation to the discrete operators L β,a , an entry-wise expansion in the random χ variables led Edelman and Sutton to the following conjecture for the full β > 0 hard edge. Conjecture. (Edelman-Sutton [6]). Let sk denote the k th√smallest singular value of the bidiagonal operator L β,a . Then, as n ↑ ∞ the family { nsk } converges in law to the corresponding singular values of √ d a 1 Lβ,a = − x + √ + √ b (x) dx 2 x β
Diffusion at the Random Matrix Hard Edge
in which $x \mapsto b(x)$ is a Brownian motion. Here, $\mathfrak{L}_{\beta,a}$ is understood to act on functions $f \in L^2[0,1]$ subject to $f(1) = 0$ and $(\mathfrak{L}_{\beta,a} f)(0) = 0$. The random $\mathfrak{L}_{\beta,a}$ was tagged the Stochastic Bessel Operator in [6] on account of the zero-noise (β = ∞) version having singular values at the roots of $J_a$ (the Bessel function of the first kind).

Our main result establishes this conjecture, though we prefer to phrase matters in a different way, back in terms of eigenvalues of the symmetric ensembles $L_{\beta,a} L_{\beta,a}^T$, henceforth referred to as the (β, a)-Laguerre ensembles. Toward this, introduce the random operator of second order,
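$$
\mathfrak{G}_{\beta,a} = -\exp\Big[(a+1)x + \tfrac{2}{\sqrt{\beta}}\, b(x)\Big]\, \frac{d}{dx}\Big\{ \exp\Big[-ax - \tfrac{2}{\sqrt{\beta}}\, b(x)\Big]\, \frac{d}{dx} \Big\},
\tag{1.3}
$$
where again b(x) is a Brownian motion and a > −1, β > 0. Formal manipulations will take you from $\mathfrak{L}_{\beta,a}\mathfrak{L}_{\beta,a}^T$ to $\mathfrak{G}_{\beta,a}$, but the latter is better understood upon recognizing, in the spirit of the title, that $-\mathfrak{G}_{\beta,a}$ generates the diffusion with (random) speed and scale measures
$$
m(dx) = e^{-(a+1)x - \frac{2}{\sqrt{\beta}} b(x)}\, dx
\quad\text{and}\quad
s(dx) = e^{ax + \frac{2}{\sqrt{\beta}} b(x)}\, dx.
$$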
This motion may be built pathwise in the classical mode (see for example [12]), placing (1.3) on firm ground. The limiting spectral problem will require consideration of $\mathfrak{G}_{\beta,a}$ acting on the positive half-line with Dirichlet conditions at the origin, and this carries over into killing the underlying process when reaching that point. Even more conveniently, we may define eigenvalues/eigenvectors through the resolvent equation. That is, if we at first take the equation $\mathfrak{G}_{\beta,a}\psi = \lambda\psi$ to mean $\psi = \lambda \mathfrak{G}_{\beta,a}^{-1}\psi$, the speed and scale construction provides the explicit form of the inverse,
$$
(\mathfrak{G}_{\beta,a}^{-1}\psi)(x) \equiv \int_0^\infty \Big( \int_0^{x\wedge y} s(dz) \Big)\, \psi(y)\, m(dy).
\tag{1.4}
$$
Now $\mathfrak{G}_{\beta,a}^{-1}$ is plainly non-negative symmetric in $L^2[\mathbb{R}_+, m]$ and the Dirichlet condition at the origin is automatic for solutions of $\psi = \lambda \mathfrak{G}_{\beta,a}^{-1}\psi$. Lying slightly deeper, we will see that (almost surely) $\mathfrak{G}_{\beta,a}^{-1}$ maps $L^2[\mathbb{R}_+, m]$ into $C^{3/2-}$ and is in fact of trace class. We have:
Theorem 1. With probability one, when restricted to the positive half-line with Dirichlet conditions at the origin, $\mathfrak{G}_{\beta,a}$ has discrete spectrum comprised of simple eigenvalues $0 < \Lambda_0(\beta,a) < \Lambda_1(\beta,a) < \cdots \uparrow \infty$. Moreover, with now $0 < \lambda_0 < \lambda_1 < \cdots < \lambda_n$ the ordered (β, a)-Laguerre eigenvalues,
$$
\{ n\lambda_0, n\lambda_1, \ldots, n\lambda_k \} \Rightarrow \{ \Lambda_0(\beta,a), \Lambda_1(\beta,a), \ldots, \Lambda_k(\beta,a) \}
$$
(jointly in law) for any fixed k < ∞ as n ↑ ∞.

Remark. The Dirichlet condition for $\mathfrak{G}_{\beta,a}$ at x = 0 may be mapped to that at x = 1 for $\mathfrak{L}_{\beta,a}$ in the Edelman-Sutton conjecture. On the other hand, the process generated by $-\mathfrak{G}_{\beta,a}$ has a natural (or free) boundary at x = +∞, which carries certain advantages over the specified condition for $\mathfrak{L}_{\beta,a}$ at the corresponding point (x = 0) in the conjecture.
J. A. Ramírez, B. Rider
As a bit of amplification, differentiating with abandon one is led to
$$
-\mathfrak{G}_{\beta,a} = e^{x}\Big( \frac{d^2}{dx^2} - \big(a + \tfrac{2}{\sqrt{\beta}}\, b'(x)\big) \frac{d}{dx} \Big),
$$
along with the idea that the corresponding motion is just a Brownian motion with (shifted) white noise drift. In fact, modulo the multiplicative factor $e^x$ which effects a change of time, this is precisely the random diffusion introduced by Brox as a continuum analogue of Sinai’s walk [3]. Theorem 1 then draws a concrete connection between random matrix theory and the lifetime of this random process in a random environment, which has been the subject of continued investigation since its introduction (see [19] and the many references within, or the recent [2] for a spectral point of view).

Our second description of the hard edge is a corollary of the first, employing Riccati’s map to transform a solution of the suitably interpreted $\mathfrak{G}_{\beta,a}\psi(x,\lambda) = \lambda\psi(x,\lambda)$ for any fixed λ ≥ 0 into one of
$$
dp(x) = \tfrac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \Big( \big(a + \tfrac{2}{\beta}\big) p(x) - p^2(x) - \lambda e^{-x} \Big)\, dx,
\tag{1.5}
$$
understood in the sense of Itô. The point is: Sturm’s oscillation theorem implies that the eigenvalues of $\mathfrak{G}_{\beta,a}$ are counted by the zeros of $\psi(\cdot,\lambda)$, and those zeros correspond to places where $p \equiv \psi'/\psi$, which solves (1.5), hits −∞.

Theorem 2. Let $P_{x,c}$ denote the law of $p(\cdot) = p(\cdot\,; a, \beta, \lambda)$ starting from position c at time x. Let also $\nu_x(dc) = P_{x,+\infty}(\mathfrak{m} \in dc)$, where $\mathfrak{m}$ is the passage time of p to −∞. Then, $P(\Lambda_0(\beta,a) > \lambda) = \nu_0(\{\infty\})$ and, more generally,
$$
P(\Lambda_k(\beta,a) < \lambda) = \int_{\mathbb{R}^{k+1}} \nu_0(dx_1)\, \nu_{x_1}(dx_2) \cdots \nu_{x_k}(dx_{k+1}).
$$
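The passage from the eigenvalue equation to (1.5) can be sketched formally as follows. The sketch assumes, as is made precise in Sect. 3, that ψ is continuously differentiable and that ψ′ solves the Itô equation displayed on the first line; it is a worked check of the drift and noise terms, not a substitute for the argument given there.

```latex
\begin{align*}
d\psi'(x) &= \tfrac{2}{\sqrt{\beta}}\,\psi'(x)\,db(x)
            + \Big[\big(a+\tfrac{2}{\beta}\big)\psi'(x) - \lambda e^{-x}\psi(x)\Big]dx,
            \qquad d\psi(x) = \psi'(x)\,dx, \\
dp(x) &= d\Big(\frac{\psi'(x)}{\psi(x)}\Big)
       = \frac{d\psi'(x)}{\psi(x)} - \frac{\psi'(x)}{\psi(x)^{2}}\,d\psi(x)
       \quad \text{(no It\^o correction, since $\psi$ has finite variation)} \\
      &= \tfrac{2}{\sqrt{\beta}}\,p(x)\,db(x)
       + \Big[\big(a+\tfrac{2}{\beta}\big)p(x) - p^{2}(x) - \lambda e^{-x}\Big]dx .
\end{align*}
```

The quadratic term $-p^2$ comes entirely from the second summand of the quotient rule, which is the "price" of reducing the order of the equation.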
Theorem 1 is proved in Sect. 2, and Theorem 2 in Sect. 3. We conclude the introduction by describing a general transition between the hard edge laws just described and the form of the β > 0 soft edge laws established in [15].

Remark. It is natural to ask whether from Theorems 1 and 2 one might recover the Painlevé or hypergeometric descriptions of the hard edge from [22] or [9] respectively, and then go further by finding explicit formulas for the distributions at all β > 0. Thus far the answer is no, even when (a + 1) = 2/β and $\Lambda_0(\beta,a)$ is just an exponential random variable (as is easily seen from the joint density (1.1)). With that choice of parameters, the speed $m'(x) = e^{-(a+1)x - \frac{2}{\sqrt{\beta}} b(x)}$ of the $\mathfrak{G}_{\beta,a}$-diffusion turns out to be a martingale, but we do not see how to make use of this.
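The martingale property mentioned in the remark can be checked in one line. The computation below uses only the standard Gaussian identity $E[e^{-c\,b(x)}] = e^{c^2 x/2}$ for a standard Brownian motion b, applied with $c = 2/\sqrt{\beta}$:

```latex
\[
E\Big[e^{-(a+1)x - \frac{2}{\sqrt{\beta}}\, b(x)}\Big]
 = e^{-(a+1)x}\, e^{\frac{1}{2}\cdot\frac{4}{\beta}\, x}
 = e^{\left(\frac{2}{\beta} - (a+1)\right) x},
\]
```

so the mean is constant in x exactly when $a + 1 = 2/\beta$, in which case the speed density is the exponential martingale $\exp\{-c\,b(x) - c^2 x/2\}$.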
Soft edge and transition. The random matrix soft edge corresponds to the scaling limits of the maximal, rather than minimal, eigenvalues in the Laguerre ensembles. Historically, these laws were discovered first by Tracy and Widom ([21] and [23]) in the context of a different class of random matrices, the Gaussian Orthogonal, Unitary, and Symplectic ensembles. The latter are n × n real symmetric, complex Hermitian, or quaternion self-dual matrices with Gaussian entries. There is again an explicit joint spectral density, which takes the form of a constant multiple of $\prod_{i<j} |\lambda_i - \lambda_j|^{\beta}\, e^{-(\beta/4)\sum_i \lambda_i^2}$ with β = 1, 2 or 4 respectively. And once more, all correlations are given in terms of orthogonal polynomials (now Hermite polynomials). Tracy and Widom proved that the appropriately scaled largest eigenvalues have distribution functions described by Painlevé II (via a more basic formulation in terms of Fredholm determinants/pfaffians of an Airy kernel). Matters were later carried over to the Laguerre soft edge by a collection of authors. As before, one may consider the general “β-Hermite” laws. [5] provides a separate family of tridiagonal matrix models for these laws (though see also [24] for an earlier application at β = 1, 2), and in direct analogy with Theorems 1 and 2 the authors and B. Virág have previously proved:
Theorem. (Theorems 1.1 and 1.2 of [15]). The largest eigenvalues in either the β-Laguerre or β-Hermite ensembles have scaling limits given by the law of the top eigenvalues of the random Schrödinger operator $-\mathcal{H}_{\beta} = \frac{d^2}{dx^2} - x + \frac{2}{\sqrt{\beta}}\, b'(x)$. There is also a description of the limiting soft-edge eigenvalues equivalent to that in Theorem 2, with the diffusion
$$
dp(x) = \tfrac{2}{\sqrt{\beta}}\, db(x) + (\lambda + x - p^2(x))\, dx
\tag{1.6}
$$
in place of (1.5).

The first part of this result, the identification of $-\mathcal{H}_{\beta}$ at the soft edge, proves a different conjecture of Edelman and Sutton from [6]. For obvious reasons, we refer to the distribution of the top eigenvalue of $-\mathcal{H}_{\beta}$ as the general beta Tracy-Widom law, notated $TW_{\beta}$.

Returning to the (β, a)-Laguerre ensembles, it is well understood that if a tends to infinity with n so that $\lim_{n\to\infty} m/n = \lim_{n\to\infty} (n+a)/n > 1$, the limiting spectral measure is pulled away from the origin and one sees soft-edge behavior at both the minimal and maximal eigenvalues. Thus, one expects that by taking a → ∞ after n → ∞, the hard edge becomes a soft edge and creates a link between these families of distributions arising in random matrix theory. Borodin and Forrester [1] have shown that this is indeed the case in the classical β = 1, 2 and 4 settings. Their work rests on the aforementioned determinantal forms of the underlying distribution functions. Employing just the diffusions (1.5) and (1.6), our final result shows that this transition holds true at all β > 0.

Theorem 3. With $\Lambda_0(\beta,a)$ the limiting smallest eigenvalue in the (β, a)-ensemble and $TW_{\beta}$ the general beta Tracy-Widom law,
$$
\frac{\eta - \Lambda_0\big(\beta,\, 2\sqrt{\eta} - \frac{2}{\beta}\big)}{\eta^{2/3}} \Rightarrow TW_{\beta}
$$
as η → ∞.

Theorem 3 is proved in Sect. 4. In summary, together with [15] the present paper provides a complete picture of the extremal laws of random matrix theory, at all values of the natural parameters. This leaves apart the general β spectral bulk, which has recently been treated by Valko-Virág [25] (for the β-Hermite ensembles) and Killip-Stoiciu [13] (for the circular β ensembles, generalizing the eigenvalue laws for the Haar distributed unitary group).
2. Convergence of the Spectrum

The key is to prove the almost sure strong convergence of the resolvent operators, or really a similarity transformation of the sequence of $(L_{\beta,a} L_{\beta,a}^T)^{-1}$ matrices to a version of $\mathfrak{G}_{\beta,a}^{-1}$. As we will see, all these objects may be viewed as integral operators with well-behaved kernels, allowing for an efficient verification of the necessary compactness.

Outline. Set $M_{\beta,a} = S L_{\beta,a} S^{-1}$ where S is the anti-diagonal matrix of alternating signs $S_{ij} = (-1)^{i}\, \delta_{i+j-n-1}$. The spectrum is unchanged (we may work with $M_{\beta,a} M_{\beta,a}^T$ rather than $L_{\beta,a} L_{\beta,a}^T$), and we record
$$
M_{\beta,a} = \frac{1}{\sqrt{\beta}}
\begin{bmatrix}
\chi_{(a+1)\beta} & & & & \\
-\tilde{\chi}_{\beta} & \chi_{(a+2)\beta} & & & \\
 & -\tilde{\chi}_{2\beta} & \chi_{(a+3)\beta} & & \\
 & & \ddots & \ddots & \\
 & & & -\tilde{\chi}_{(n-1)\beta} & \chi_{(a+n)\beta}
\end{bmatrix},
$$
where the additional notation is intended to emphasize the independence of the processes along the main and lower diagonals. Wishing to track inverses, we first note the readily checked fact:

Lemma 4. For any lower bidiagonal matrix $B = [b_{i,j}]$ (that is, $b_{i,j} = 0$ if j > i or j < i − 1), the inverse, when it exists, is lower triangular and has the expression
$$
[B^{-1}]_{i,j} = \frac{(-1)^{i+j}}{b_{i,i}} \prod_{k=j}^{i-1} \frac{b_{k+1,k}}{b_{k,k}}
\quad \text{for } j \le i.
$$
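Next, observe that for any $A = [a_{i,j}] \in \mathbb{R}^{n\times n}$ there is a natural operator embedding into $L^2[0,1]$ which does not change the spectrum:
$$
(A f)(x) \equiv \sum_{j=1}^{n} a_{i,j}\, n \int_{x_{j-1}}^{x_j} f(x)\, dx
\quad \text{for } x_{i-1} \le x < x_i,
$$
where hereafter we define $x_i = i/n$, for i = 1, 2, . . . , n. Thus, moving attention to $(n M_{\beta,a} M_{\beta,a}^T)^{-1}$ (after introducing the appropriate hard-edge scaling), the action of $n^{-1/2} M_{\beta,a}^{-1}$ on $L^2[0,1]$ reads
$$
\big( (\sqrt{n}\, M_{\beta,a})^{-1} f \big)(x) = \sum_{j=1}^{\lfloor nx \rfloor} \frac{\sqrt{\beta n}}{\chi_{(\lfloor nx \rfloor + a)\beta}} \prod_{k=j}^{\lfloor nx \rfloor - 1} \frac{\tilde{\chi}_{k\beta}}{\chi_{(k+a)\beta}} \int_{x_{j-1}}^{x_j} f(x)\, dx.
$$
In other words, $n^{-1/2} M_{\beta,a}^{-1}$ is equated with the integral operator $K^n_{\beta,a}$ with (discrete) kernel
$$
k^n_{\beta,a}(x,y) = \frac{\sqrt{\beta n}}{\chi_{(i+a)\beta}} \exp\Big\{ \sum_{k=j}^{i-1} \big( \log \tilde{\chi}_{k\beta} - \log \chi_{(k+a)\beta} \big) \Big\}\, 1_L(x,y),
\tag{2.1}
$$
where $1_L = 1_{\{x_{i-1} \le x < x_i\}}\, 1_{\{x_{j-1} \le y < x_j\}}$ and i > j.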
With this set-up, the basic convergence result we need is the following.

Lemma 5. There is a Brownian motion b(·) such that for y < x lying in (0, 1],
$$
\frac{\sqrt{n\beta}}{\chi_{(\lfloor nx \rfloor + a)\beta}} \Rightarrow \frac{1}{\sqrt{x}}
\tag{2.2}
$$
and
$$
\sum_{k=\lfloor ny \rfloor}^{\lfloor nx \rfloor} \big( \log \tilde{\chi}_{k\beta} - \log \chi_{(k+a)\beta} \big) \Rightarrow (a/2) \log(y/x) + \int_y^x \frac{db_z}{\sqrt{\beta z}},
\tag{2.3}
$$
in law in the Skorohod topology. Moreover, there exist tight random constants $\kappa_n > 0$ and $\kappa'_n > 0$ which are independent of β so that
$$
\sup_{1 \le k \le n} \frac{\sqrt{k\beta}}{\chi_{(k+a)\beta}} \le \kappa_n
\tag{2.4}
$$
and, with $T(x) = \frac{1}{\beta} \log \frac{1}{x}$,
$$
\sum_{k=j}^{i-1} \big( \log \tilde{\chi}_{k\beta} - \log \chi_{(k+a)\beta} \big) - (a/2) \log(j/i) \le \kappa'_n \big( 1 + T^{3/4}(x_i) + T^{3/4}(x_j) \big)
\tag{2.5}
$$
for all 1 ≤ j < i ≤ n.

The first part of the lemma ((2.2) and (2.3) together) identifies the limiting operator $K_{\beta,a}$. Namely, for n ↑ ∞ it should be that $k^n_{\beta,a}(x,y)$ approaches
$$
k_{\beta,a}(x,y) \equiv x^{-\frac{1+a}{2}} \exp\Big[ \int_y^x \frac{db_z}{\sqrt{\beta z}} \Big]\, y^{a/2}\, 1_{y<x}.
\tag{2.6}
$$
The second part, or the bounds (2.4) and (2.5), provides the needed compactness and more. As we shall prove:

Lemma 6. $K_{\beta,a}$ is almost surely Hilbert-Schmidt. Also, there exists a probability space on which all $K^n_{\beta,a}$ and $K_{\beta,a}$ are defined, and such that any sequence of the operators $K^n_{\beta,a}$ contains a subsequence which converges to $K_{\beta,a}$ in Hilbert-Schmidt norm with probability one. In particular, for whatever n ↑ ∞ we can find an n′ ↑ ∞ along which
$$
\lim_{n' \uparrow \infty} \int_0^1 \!\! \int_0^1 \big| k^{n'}_{\beta,a}(x,y)(\omega) - k_{\beta,a}(x,y)(\omega) \big|^2\, dx\, dy = 0
$$
almost surely.

Granted this we may complete the proof of the main result.
Proof of Theorem 1. Working on the probability space promised in Lemma 6, the argument is reduced to a deterministic setting. Start with the scaled minimal (β, a)-Laguerre eigenvalue
$$
n\lambda_0(n) = \inf_{\|v\|_2 = 1} \big\langle v,\, n M_{\beta,a} M_{\beta,a}^T v \big\rangle
= \Big( \sup_{\|f\|_{L^2} = 1} \big\langle f,\, (K^n_{\beta,a})^T K^n_{\beta,a} f \big\rangle \Big)^{-1}
= \big\| (K^n_{\beta,a})^T K^n_{\beta,a} \big\|^{-1},
$$
where ‖·‖ is the $L^2 \to L^2$ operator norm, and the final equality holds simply because $(K^n_{\beta,a})^T K^n_{\beta,a}$ is non-negative symmetric. Assume for the moment that, as claimed, $K_{\beta,a}$ is almost surely Hilbert-Schmidt. Then $K_{\beta,a}^T K_{\beta,a}$ is non-negative symmetric and compact (trace class even) with a well defined maximal eigenvalue also equal to the norm $\| K_{\beta,a}^T K_{\beta,a} \|$. For short, we notate $\hat{\Lambda}_0 > \hat{\Lambda}_1 > \cdots$ the eigenvalues of $K_{\beta,a}^T K_{\beta,a} \equiv B$, and similarly write $\{ \hat{\Lambda}^n_k \}$ for the (decreasing) eigenvalues of $(K^n_{\beta,a})^T K^n_{\beta,a} \equiv B_n$. The simplicity of the limiting eigenvalues is also assumed here; it will be deduced from the differential form of the eigenvalue problem introduced in the next section, see (3.1) and the surrounding discussion.

Next, for whatever sequence n ↑ ∞, Lemma 6 allows a choice of subsequence along which $\| K^n_{\beta,a} - K_{\beta,a} \|_{HS} \to 0$. The same holds for the transposes, and hence $B_n$ converges strongly to B. It follows that the norms themselves converge along this subsequence: $\hat{\Lambda}^n_0 = \|B_n\| \to \|B\| = \hat{\Lambda}_0$ with probability one. But this is to say that for any sequence of the original eigenvalues $n\lambda_0(n)$, there exists a subsequence along which these points converge almost surely to $1/\hat{\Lambda}_0$. That is of course equivalent to the full convergence statement.

As to $n\lambda_1, n\lambda_2, \ldots$, we first show that the convergence (along perhaps a further subsequence) of the ground state eigenvectors is a by-product of the above. Define $\{f_n\}$ and f in $L^2[0,1]$ with unit norm by
$$
\langle f_n, B_n f_n \rangle = \hat{\Lambda}^n_0, \qquad \langle f, B f \rangle = \hat{\Lambda}_0.
$$
Remaining in the introduced setting, we have $\|B_n f_n\|_{L^2} \to \|B f\|_{L^2}$. Also, being uniformly bounded in $L^2$, $f_n$ has a weakly convergent subsequence: $f_n \rightharpoonup f_\infty$. Then, for any $\varphi \in L^2$,
$$
\langle \varphi, B_n f_n - B f_\infty \rangle = \langle \varphi, (B_n - B) f_n \rangle + \langle B\varphi, f_n - f_\infty \rangle
$$
tends to zero (the first term by norm convergence, the second by boundedness of B). Having weak convergence ($B_n f_n \rightharpoonup B f_\infty$) plus convergence of its norm, we conclude there is a strongly convergent subsequence of $\{B_n f_n\}$. Coupled with $B_n f_n = \hat{\Lambda}^n_0 f_n$ and $\hat{\Lambda}^n_0 \to \hat{\Lambda}_0$, this implies a strongly convergent subsequence for the $\{f_n\}$ themselves, which by continuity can only wind up at f.

Finally, place yourself along this sequence where $\|f_n - f\|_{L^2} \to 0$, and denote by $P_{f_n}$ the projection onto the orthogonal complement of $f_n$ in $L^2$. At once we find that $P_{f_n} B_n P_{f_n}$ converges strongly to $P_f B P_f$ (with obvious notation), and so $\hat{\Lambda}^n_1 = \|P_{f_n} B_n P_{f_n}\| \to \|P_f B P_f\| = \hat{\Lambda}_1$. The implication for $n\lambda_1$ is clear, and an induction argument extends the picture to the almost sure convergence of any finite number of Laguerre eigenvalues.
Reflect upon the fact that we have proved convergence (in law) of say $\{(n\lambda_k)^{-1}\}$ for k = 0, . . . , m to the top m eigenvalues of the integral operator $B = K_{\beta,a}^T K_{\beta,a}$ which we now write out. In particular, its spectral problem reads
$$
\begin{aligned}
f(x) &= \lambda\, x^{a/2} \int_x^1 e^{\int_x^y \frac{db_s}{\sqrt{\beta s}}}\, y^{-(a+1)} \int_0^y e^{\int_z^y \frac{db_s}{\sqrt{\beta s}}}\, z^{a/2} f(z)\, dz\, dy \\
&= \lambda \int_0^1 (xy)^{a/2} \Big( \int_{x \vee y}^1 e^{-2 \int_z^1 \frac{db_s}{\sqrt{\beta s}}}\, z^{-(a+1)}\, dz \Big)\, e^{\int_x^1 \frac{db_s}{\sqrt{\beta s}}}\, e^{\int_y^1 \frac{db_s}{\sqrt{\beta s}}}\, f(y)\, dy,
\end{aligned}
\tag{2.7}
$$
after an integration by parts. Again, we seek here an $f \in L^2[0,1]$, which inherits the continuity and vanishing of the kernel at x = 1.

To recover the advertised limit operator (1.4), make the substitution $g(x) = x^{-a/2}\, e^{-\int_x^1 \frac{db_s}{\sqrt{\beta s}}}\, f(x)$ in conjunction with the time change $\int_x^1 s^{-1/2}\, db_s = \hat{b}(\log(1/x))$ with a new Brownian motion $\hat{b}$ to express (2.7) in the equivalent way:
$$
g(x) = \lambda \int_0^1 \Big( \int_{x \vee y}^1 e^{-\frac{2}{\sqrt{\beta}} \hat{b}(\log 1/z)}\, z^{-(a+1)}\, dz \Big)\, g(y)\, y^{a}\, e^{\frac{2}{\sqrt{\beta}} \hat{b}(\log 1/y)}\, dy.
$$
With $f \in L^2[0,1]$, g resides in $L^2([0,1], m)$ for $m(dx) = x^{a}\, e^{\frac{2}{\sqrt{\beta}} \hat{b}(\log x^{-1})}\, dx$. Last, the change of variables $(x, y) \mapsto (e^{-x}, e^{-y})$ will produce the form of (1.4) quite exactly, along with transforming the Dirichlet condition at one (f(1) = g(1) = 0) into that at the origin ($\psi(0) \equiv (g \circ \exp)(0) = 0$).
Estimates. Before establishing Lemmas 5 and 6 we make the simple observation:

Proposition 7. For any constant C and a > −1, the integral operator on $L^2[0,1]$ with kernel
$$
k_C(x,y) = C \exp\Big\{ C (\log(1/x))^{3/4} + C (\log(1/y))^{3/4} \Big\}\, \frac{y^{a/2}}{x^{(a+1)/2}}\, 1_{y<x}
$$
is Hilbert-Schmidt.

Proof. The change of variables $x = e^{-s}$ and $y = e^{-t}$ employed just above produces
$$
\int_0^1 \!\! \int_0^1 |k_C(x,y)|^2\, dx\, dy = C^2 \int_0^\infty e^{2C s^{3/4} + a s} \int_s^\infty e^{2C t^{3/4} - (a+1) t}\, dt\, ds,
$$
and the latter is clearly finite if (and only if) a > −1.
Proof of Lemma 6. Now taking Lemma 5 for granted, we can find a subsequence over which we have the joint convergence in law,
$$
\begin{aligned}
\Big\{ \frac{\sqrt{n\beta}}{\chi_{(\lfloor nx \rfloor + a)\beta}},\ 0 < x \le 1 \Big\} &\Rightarrow \Big\{ \frac{1}{\sqrt{x}},\ 0 < x \le 1 \Big\}, \\
\Big\{ \sum_{k=\lfloor ny \rfloor}^{\lfloor nx \rfloor} \big( \log \tilde{\chi}_{k\beta} - \log \chi_{(k+a)\beta} \big),\ 0 < y \le x < 1 \Big\} &\Rightarrow \Big\{ (a/2) \log(y/x) + \int_y^x \frac{db_z}{\sqrt{\beta z}},\ 0 < y \le x < 1 \Big\}, \\
(\kappa_n, \kappa'_n) &\Rightarrow (\kappa, \kappa').
\end{aligned}
\tag{2.8}
$$
Then, Skorohod’s representation theorem (Theorem 1.8, Chapt. 2 of [8]) furnishes a probability space on which each of the above occurs with probability one. The first two items of (2.8) take place a.e. in (0, 1], and so on this new space
$$
P\Big( \lim_{n \uparrow \infty} k^n_{\beta,a}(x,y)(\omega) = k_{\beta,a}(x,y)(\omega) \ \text{for a.e. } (x,y) \in [0,1]^2 \Big) = 1.
\tag{2.9}
$$
That
$$
\int_0^1 \!\! \int_0^1 \big| k^n_{\beta,a}(x,y)(\omega) - k_{\beta,a}(x,y)(\omega) \big|^2\, dx\, dy \to 0 \quad \text{a.s.}
$$
will follow if we can supply an a.s. finite constant C(ω) such that
$$
\sup_{n > 0} k^n_{\beta,a}(x,y)(\omega) \le k_{C(\omega)}(x,y) \quad \text{and} \quad k_{\beta,a}(x,y)(\omega) \le k_{C(\omega)}(x,y)
\tag{2.10}
$$
for almost all x, y ∈ [0, 1] and ω. Again by Lemma 5, for each n,
$$
k^n_{\beta,a}(x,y)(\omega) \le \kappa_n(\omega)\, x_i^{-(a+1)/2}\, y_j^{a/2} \exp\Big\{ \kappa'_n(\omega) \big( 1 + T^{3/4}(x_i) + T^{3/4}(y_j) \big) \Big\}
\tag{2.11}
$$
holds, where $x \in [x_i, x_{i+1})$, $y \in [y_j, y_{j+1})$. But now we are allowed to assume that both $\kappa_n$ and $\kappa'_n$ converge, and thus are bounded almost surely by say $2(\kappa \vee \kappa')$ for sufficiently large n. The continuity of the functions $x \mapsto T(x)$ and $x \mapsto x^p$ (on (0, 1]) then enables us to fit the right hand side of (2.11) under a fixed $k_C$ independently of n.

For the limit kernel $k_{\beta,a}(x,y)(\omega)$, simply note that its exponent could have been expressed from the start as
$$
\int_y^x \frac{db_z}{\sqrt{\beta z}} = \frac{1}{\sqrt{\beta}} \Big[ \tilde{b}(\log(1/x)) - \tilde{b}(\log(1/y)) \Big].
$$
(The equality is in law with a different Brownian motion living on the same probability space.) By the law of the iterated logarithm, $\tilde{b}(a) \le c(\omega) (1 + [a \log\log(1+a)]^{1/2})$ for a random c(ω) and all a > 0, and certainly $[a \log\log(1+a)]^{1/2} \le c' a^{3/4}$ with a (non-random) c′ and all a large enough. Thus, the second half of (2.10) holds with $C(\omega) \le \beta^{-1/2} c'\, c(\omega)$, and the proof is complete.

Note here we immediately passed to a fixed subsequence and then chose a favorable probability space, while the statement of the lemma was worded with the convergence of the (β, a)-Laguerre eigenvalues in mind. That is, build all $k^n_{\beta,a}$, each tied to a (β, a)-Laguerre eigenvalue, on the same space as $k_{\beta,a}$ and then note for whatever n ↑ ∞ there is a subsequence along which everything above holds. Either way the upshot is the same.
Turning to the proof of Lemma 5, we record without proof the following facts.

Proposition 8. For $\chi_r$ a chi random variable of index r > 0,
$$
E[\chi_r^p] = 2^{p/2}\, \frac{\Gamma\big(\frac{r+p}{2}\big)}{\Gamma(r/2)}
\tag{2.12}
$$
for any p > −r. Also, as r → ∞,
$$
E[\log \chi_r] = \frac{1}{2} \log r - \frac{1}{2r} + O(1/r^2), \qquad \mathrm{Var}[\log \chi_r] = \frac{1}{2r} + O(1/r^2),
\tag{2.13}
$$
while $E[(\log \chi_r - E \log \chi_r)^{2m}] = O(1/r^m)$ for positive integer m.

Proposition 9. (After Theorem 1.3, Chapt. 7 of [8]) Let $y_{n,k}$ be a sequence of mean-zero processes starting at 0 with independent increments $\Delta y_{n,k}$. Assume,
$$
n\, E(\Delta y_{n,k})^2 = f(k/n) + o(1), \qquad n\, E(\Delta y_{n,k})^4 = o(1)
\tag{2.14}
$$
uniformly for k/n in compact sets of [0, T) with a continuous $f \in L^1_{loc}[0, T)$. Then $y_n(t) = y_{n,[nt]} \Rightarrow \int_0^t f^{1/2}(s)\, db(s)$ with a standard Brownian motion b (in the Skorohod topology).
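The facts in Proposition 8 can be checked numerically from (2.12) alone, since the mean and variance of $\log \chi_r$ are the first and second derivatives at p = 0 of $p \mapsto \log E[\chi_r^p]$. The following is a small sketch using only the Python standard library; the function names and the finite-difference step sizes are illustrative choices, not part of the paper.

```python
import math


def log_chi_moment(r, p):
    """log E[chi_r^p] from (2.12); requires r > 0 and p > -r."""
    return 0.5 * p * math.log(2.0) + math.lgamma(0.5 * (r + p)) - math.lgamma(0.5 * r)


def mean_log_chi(r, h=1e-3):
    """E[log chi_r] as a central first difference of the log-moment function at p = 0."""
    return (log_chi_moment(r, h) - log_chi_moment(r, -h)) / (2.0 * h)


def var_log_chi(r, h=1e-2):
    """Var[log chi_r] as a central second difference at p = 0 (log_chi_moment(r, 0) = 0)."""
    return (log_chi_moment(r, h) + log_chi_moment(r, -h)) / h**2
```

Comparing `mean_log_chi(r)` with `0.5 * math.log(r) - 1 / (2 * r)` and `var_log_chi(r)` with `1 / (2 * r)` for moderately large r shows discrepancies of order $1/r^2$, in line with (2.13).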
Proof of Lemma 5. Start with (2.3). By the first estimate of (2.13),
$$
\lim_{n \to \infty} \sum_{k=\lfloor ny \rfloor}^{\lfloor nx \rfloor} \big( E \log \tilde{\chi}_{k\beta} - E \log \chi_{(k+a)\beta} \big) = \frac{a}{2} \log(y/x)
$$
uniformly for y < x restricted to compact sets of (0, 1]. Thus, for (2.3) it is enough to demonstrate the weak convergence
$$
\sum_{k=[nx]}^{n} \big( \log \chi_{(k+c)\beta} - E \log \chi_{(k+c)\beta} \big) \Rightarrow \int_x^1 (2\beta z)^{-1/2}\, db(z),
\tag{2.15}
$$
where c is any fixed number. Indeed, the exponent of the discrete kernel is comprised of two such independent sums, and the promised limit will follow as $b_1 + b_2 = \sqrt{2}\, b_3$ in law for independent Brownian motions $b_1, b_2, b_3$. Now refer to Proposition 9 and view the processes on the left of (2.15) as starting from 0 at x = 1 and evolving toward x = 0 (or take t = 1 − x in the proposition). Then, the second estimate of (2.13) yields the first part of (2.14) with f(t) = 1/(2βt); the estimate right after (2.13) with m = 2 produces the second half of (2.14) as x is always > 0. This finishes the job.

The convergence (2.2) is easier. For any fixed x ∈ (0, 1], it is just an instance of the law of large numbers. The tightness required to ensure process level convergence is also elementary: via (2.12) one can obtain the increment bound
$$
E \Bigg( \sqrt{\frac{(r+1)\beta}{\chi^2_{(r+a+1)\beta}}} - \sqrt{\frac{r\beta}{\chi^2_{(r+a)\beta}}} \Bigg)^{2} = O(1/r^2)
$$
which more than suffices.

While here, we dispose of (2.4). First use the sum bound,
$$
P\Big( \sup_{1 \le k \le n} \frac{\sqrt{k\beta}}{\chi_{(k+a)\beta}} > M \Big) \le \sum_{k=1}^{n} P\Big( \frac{\chi_{(k+a)\beta}}{\sqrt{k\beta}} < \frac{1}{M} \Big).
$$
Then, employing the explicit density $P(\chi_r \in ds) = \frac{2^{1-r/2}}{\Gamma(r/2)}\, s^{r-1} e^{-s^2/2}\, ds$, one can perform a Laplace-type estimate to find the $k$th term on the right hand side is upper bounded by $C (\sqrt{e}/M)^k$ with C depending only on β. Since $\sum_{k=1}^{\infty} (\sqrt{e}/M)^k$ may be made arbitrarily small by choice of M, the desired tightness of the random variables $\sup_{k \le n} (\sqrt{k\beta}/\chi_{(k+a)\beta})$ follows.

The final piece, or (2.5), is the most elaborate but really comes down to reworking the standard proof of the upper bound in the law of the iterated logarithm. Define,
$$
A^n_x = \sum_{k=j}^{n-1} \big( \log \tilde{\chi}_{k\beta} - \log \chi_{(k+a)\beta} \big) - \frac{a}{2} \log(j/n)
$$
for $x \in [x_j, x_{j+1})$, and $h(x) = [2x \log\log x]^{1/2}$. We will in fact show that
$$
\sup_{1 \le j \le n-1} (A^n_{x_j} \vee 0) / h(T(x_j)) \ \text{are tight in distribution,}
\tag{2.16}
$$
where again $T(x) = \frac{1}{\beta} \log \frac{1}{x}$. This is stronger than what is claimed. Set
$$
Y^n_j \equiv \exp(A^n_{x_j}) = \prod_{k=j}^{n-1} \frac{\tilde{\chi}_{k\beta}}{\chi_{(k+a)\beta}} \Big( \frac{k+1}{k} \Big)^{a/2},
\quad \text{and} \quad
Z^n_j \equiv (Y^n_j)^{\lambda}\, E[(Y^n_j)^{\lambda}]^{-1}
$$
with a small positive λ (the precise conditions on λ follow shortly). The sequence $j \mapsto Z^n_{n-j}$ is a martingale for j = 1, 2, . . . with $E[Z^n_j] = 1$ for all j. Hence, by Doob’s inequality
$$
P\Big( \max_{\ell \le j \le n-1} Z^n_j \ge e^{\lambda b} \Big) \le e^{-\lambda b},
$$
or
$$
P\Big( \max_{\ell \le j \le n-1} \big( \lambda A^n_{x_j} - \log E[\exp(\lambda A^n_{x_j})] \big) \ge b \Big) \le e^{-\lambda b}
\tag{2.17}
$$
for b > 0. For the next move we need an estimate on the moment generating functions of $A^n_{x_j}$, the proof of which we will return to at the end of the section.

Claim 10. For all λ > 0 sufficiently small (λ < (β/2)[(1 + a) ∧ 1] will do),
$$
E\big[ e^{\lambda A^n_{x_j}} \big] = \exp\Big\{ \frac{\lambda^2}{2\beta} \log(1/x_j) + \epsilon_n(j) \Big\}
\tag{2.18}
$$
with $|\epsilon_n(j)| \le C$ for constant C = C(a, β).
Using (2.18) in (2.17), we have
$$
P\Big( \sup_{x \le t < 1} \Big( A^n_t - \frac{\lambda}{2\beta} \log(1/t) - \frac{1}{\lambda}\, \epsilon_n(nt) \Big) \ge b/\lambda \Big) \le e^{-\lambda b}
$$
with $\epsilon_n(t)$ understood via interpolation. Now choose θ > 1, a positive constant M and set $\lambda = M \theta^{-m} h(\theta^m)$, $b = M h(\theta^m)/2$. (To choose M large one must take θ large as well to respect the condition on λ set down in Claim 10.) The previous display will then imply
$$
P\Big( \sup_{\theta^{m-1} < T(t) \le \theta^{m}} A^n_t \ge (M+1)\, h(\theta^m) \Big) \le (m \log \theta)^{-M^2}.
\tag{2.19}
$$
Here we have used the uniform bound on $\epsilon_n(t)$ to fit $\lambda^{-1} \epsilon_n(t)$ under $h(\theta^m)$ by choice of θ and so M. In particular, $\lambda^{-1} = M^{-1} \theta^m h^{-1}(\theta^m) \le M^{-1} h(\theta^m)$ if log log θ > 1/2.

Finally return to the goal (2.16), re-expressed as seeking a bound of type
$$
P\Big( \sup_{0 < t < 1} [A^n_t]^{+} / h(T(t)) > N \Big) \le \varepsilon(N) \quad \text{where } \varepsilon(N) \downarrow 0 \text{ as } N \uparrow \infty.
$$
Note in addition that the supremum inside the probability over any truncated range x < t < 1 (for x > 0, rather than 0 < t < 1) poses no problem. Indeed, the process $t \mapsto A^n_t$ has already been shown to be convergent in that regime. On the other hand, the troublesome tail is bounded by
$$
\sum_{m=1}^{\infty} P\Big( \sup_{\theta^{m-1} < T(t) \le \theta^{m}} A^n_t / h(T(t)) \ge N \Big) \le \sum_{m=1}^{\infty} (m \log \theta)^{-(N-1)^2}
$$
with the aid of (2.19), completing the proof.
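Proof of Claim 10. By (2.12), the left-hand side of (2.18) equals
$$
\prod_{k=j}^{n-1} \frac{\Gamma\big(\frac{k\beta + \lambda}{2}\big)}{\Gamma\big(\frac{k\beta}{2}\big)}\, \frac{\Gamma\big(\frac{(k+a)\beta - \lambda}{2}\big)}{\Gamma\big(\frac{(k+a)\beta}{2}\big)} \Big( \frac{k+1}{k} \Big)^{\lambda a/2}.
$$
Taking logarithms, we must estimate the sum $\sum_{k=j}^{n-1} s_k$ where
$$
s_k = \log \Gamma\Big(\frac{k\beta + \lambda}{2}\Big) - \log \Gamma\Big(\frac{k\beta}{2}\Big) + \log \Gamma\Big(\frac{(k+a)\beta - \lambda}{2}\Big) - \log \Gamma\Big(\frac{(k+a)\beta}{2}\Big) + \frac{\lambda a}{2} \log\Big(1 + \frac{1}{k}\Big).
\tag{2.20}
$$
Introduce Stirling’s approximation in the form
$$
\Big| \log \Gamma(z) - \Big(z - \frac{1}{2}\Big) \log z + z - \frac{1}{2} \log 2\pi - \frac{1}{12 z} \Big| \le \frac{c}{z^2}.
$$
The $O(z^{-2})$ error term produces a constant multiple of $k^{-2}$ when applied in (2.20). Differences such as $(k\beta/2)^{-1} - ((k\beta - \lambda)/2)^{-1}$ and the like stemming from the 1/(12z) terms are similarly bounded. When summed, both contributions produce constants which are then absorbed into the $\epsilon_n$. Also, the constant and z-terms obviously cancel throughout the log-gamma expressions when the above estimate is applied in (2.20). Move to the terms of type (z − 1/2) log z. A bit of algebra will lead to
$$
\begin{aligned}
s_k ={}& \frac{k\beta - 1}{2} \log\Big[ \Big(1 + \frac{\lambda}{k\beta}\Big) \Big(1 - \frac{\lambda}{(k+a)\beta}\Big) \Big] + \frac{\lambda}{2} \log\Big[ \Big(1 + \frac{\lambda}{k\beta}\Big) \Big/ \Big(1 - \frac{\lambda}{(k+a)\beta}\Big) \Big] \\
&+ \frac{a\beta}{2} \log\Big(1 - \frac{\lambda}{(k+a)\beta}\Big) - \frac{\lambda}{2} \log\Big(1 + \frac{a}{k}\Big) + \frac{\lambda a}{2} \log\Big(1 + \frac{1}{k}\Big) + O\Big(\frac{1}{k^2}\Big).
\end{aligned}
$$
Since $|\log(1+s) - s| \le s^2$ for s > −1/2, we conclude $s_k = \frac{\lambda^2}{2\beta k} + O(k^{-2})$, which establishes the claim upon summation from j to n − 1.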
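Claim 10 lends itself to a direct numerical check, since the moment generating function is an explicit product of Gamma-function ratios. The sketch below, using only the Python standard library, evaluates the logarithm of that product and subtracts the leading term $\frac{\lambda^2}{2\beta}\log(n/j)$; the names `log_mgf_A` and `epsilon` and the parameter values in the example are illustrative assumptions.

```python
import math


def log_mgf_A(j, n, beta, a, lam):
    """log E[exp(lam * A^n_{x_j})], summing the terms s_k of (2.20) for k = j, ..., n-1.

    Requires 0 < lam < (1 + a) * beta so that every Gamma argument is positive.
    """
    total = 0.0
    for k in range(j, n):
        total += (
            math.lgamma(0.5 * (k * beta + lam)) - math.lgamma(0.5 * k * beta)
            + math.lgamma(0.5 * ((k + a) * beta - lam)) - math.lgamma(0.5 * (k + a) * beta)
            + 0.5 * lam * a * math.log1p(1.0 / k)
        )
    return total


def epsilon(j, n, beta, a, lam):
    """Remainder epsilon_n(j) of (2.18): log-mgf minus (lam^2 / (2 beta)) * log(1 / x_j)."""
    return log_mgf_A(j, n, beta, a, lam) - lam**2 / (2.0 * beta) * math.log(n / j)
```

For example, with β = 2, a = 1 and λ = 0.5 (which satisfies the smallness condition of the claim), the values `epsilon(j, n, 2.0, 1.0, 0.5)` stay bounded as n grows over several orders of magnitude, as the claim asserts.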
3. Riccati Map and a Second Diffusion

Riccati’s substitution takes a linear second order operator into one of first order, at the price of introducing a quadratic nonlinearity. Its use in the study of random spectra has a long history, dating back to Halperin [11]. To employ it here we must first recover the differential form of the eigenvalue problem from the established integrated version $\psi = \lambda \mathfrak{G}_{\beta,a}^{-1} \psi$, which reads in full:
$$
\begin{aligned}
\psi(x) &= \lambda \int_0^\infty \Big( \int_0^{x \wedge y} s(dz) \Big)\, \psi(y)\, m(dy) \\
&= \lambda \int_0^\infty \Big( \int_0^{x \wedge y} \exp\big[ az + \tfrac{2}{\sqrt{\beta}} b(z) \big]\, dz \Big)\, \psi(y) \exp\big[ -(a+1) y - \tfrac{2}{\sqrt{\beta}} b(y) \big]\, dy.
\end{aligned}
$$
Noting that any $f \in L^2[m]$ is also in $L^1[m]$, $\mathfrak{G}_{\beta,a}^{-1} f$ is easily seen to be differentiable after writing the right hand side as separate terms. This property is inherited by ψ, and we compute
$$
\psi'(x) = \lambda \exp\big[ ax + \tfrac{2}{\sqrt{\beta}} b(x) \big] \int_x^\infty \psi(y) \exp\big[ -(a+1) y - \tfrac{2}{\sqrt{\beta}} b(y) \big]\, dy,
$$
to find that ψ is actually in $C^{3/2-}$. Continue by taking (Itô) differentials to arrive at the system
$$
\begin{aligned}
d\psi'(x) &= \tfrac{2}{\sqrt{\beta}}\, \psi'(x)\, db(x) + \Big( \big(a + \tfrac{2}{\beta}\big) \psi'(x) - \lambda e^{-x} \psi(x) \Big)\, dx, \\
d\psi(x) &= \psi'(x)\, dx,
\end{aligned}
\tag{3.1}
$$
which is the appropriate way to interpret $\mathfrak{G}_{\beta,a} \psi = \lambda \psi$. Taken independently of the preceding developments, (3.1) has globally Lipschitz coefficients of linear growth, and as such defines (for fixed λ) a unique Markov process $x \mapsto (\psi(x), \psi'(x))$ for any specified $(\psi(0), \psi'(0))$ pair. As the uniqueness is pathwise we conclude along the way the simplicity of the corresponding eigenvalues: for given λ, any two $L^2$ solutions of $\psi = \lambda \mathfrak{G}_{\beta,a}^{-1} \psi$ vanishing at the origin must be constant multiples of one another.

Now bring in Riccati’s map, $p(x) = \psi'(x)/\psi(x)$, valid away from the zeros of ψ. Since ψ is continuously differentiable, we find from (3.1) and elementary calculus:
$$
dp(x) = \tfrac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \Big( \big(a + \tfrac{2}{\beta}\big) p(x) - p^2(x) - \lambda e^{-x} \Big)\, dx,
\tag{3.2}
$$
defining yet another Markov process for any fixed λ. The relevance of (3.2) in counting eigenvalues of $\mathfrak{G}_{\beta,a}$ is first understood through the truncated operator $\mathfrak{G}^L_{\beta,a}$, indicating $\mathfrak{G}_{\beta,a}$ restricted to [0, L] with Dirichlet conditions at both endpoints.

Lemma 11. Consider the unique diffusion p(x) = p(x; λ) started at +∞ at x = 0, and restarted at +∞ immediately after any passage to −∞. The number of eigenvalues of $\mathfrak{G}^L_{\beta,a}$ less than λ is equal in law to the number of explosions of p before x = L.

Proof. This is well understood, and our treatment here is much the same as in [15], Sect. 3. Take the sine-like solution of (3.1), that is, $\psi_0(x, \lambda)$ subject to $\psi_0(0, \lambda) = 0$ and $\psi_0'(0, \lambda) = 1$. Plainly, Λ is an eigenvalue of $\mathfrak{G}^L_{\beta,a}$ if and only if $\psi_0(L, \Lambda) = 0$. Regarding the ground state eigenvalue $\Lambda_0(L)$: if for any λ $\psi_0(x, \lambda) > 0$ for 0 < x ≤ L, then it must be that $\Lambda_0(L) > \lambda$, as an examination of (3.1) shows. That is, the event that $\{\Lambda_0(L) > \lambda\}$ is equal in law to the event $\{x \mapsto \psi_0(x, \lambda)$ has no roots before x = L$\}$. Continuing, additional zeros of the (almost surely continuous) function $\lambda \mapsto \psi_0(L, \lambda)$ (and so additional eigenvalues) only occur by increasing λ, whereupon all other roots (in the x-variable) move to the left. This equates the event that the $k$th eigenvalue of $\mathfrak{G}^L_{\beta,a}$ lies above a fixed λ and the event that $\psi_0(x, \lambda)$ has at most k − 1 roots on (0, L).

Now move to the p(x, λ) formed from $\psi_0(x, \lambda)$ and its derivative. By appealing again to uniqueness of solutions to (3.1), note $\psi_0$ and $\psi_0'$ cannot vanish simultaneously.
(In particular, the zeros of $\psi_0$ are isolated, and must be either finite in number or form a sequence tending to infinity.) Thus, at any root $\mathfrak{m}$ of $x \mapsto \psi_0(x, \lambda)$, including $\mathfrak{m} = 0$, an examination of signs shows that $\lim_{\varepsilon \downarrow 0} p(\mathfrak{m} + \varepsilon, \lambda) = +\infty$ and, when $\mathfrak{m} > 0$, $\lim_{\varepsilon \downarrow 0} p(\mathfrak{m} - \varepsilon, \lambda) = -\infty$. That is, counting roots of $\psi_0(\cdot, \lambda)$ is to count passages of the corresponding p(·, λ) to −∞, after subsequent re-starts at +∞.

To see that the p-picture stands on its own is to show that there is a unique solution of (3.2) starting from +∞. Replacing the $-\lambda e^{-x}$ term in the drift with any negative constant produces a homogeneous motion with an entrance boundary at +∞ (and which hits −∞ with probability one). This process (begun at +∞) may be constructed unambiguously via speed and scale, see again [12]. By successive dominations of the inhomogeneous p in the statement by such homogeneous versions over all short times, one may conclude the existence and uniqueness of the former.

Theorem 2 now follows by taking L → ∞ in Lemma 11 with the aid of the next fact.

Lemma 12. As L → ∞, the top k eigenvalues of $\mathfrak{G}^L_{\beta,a}$ converge to the top k eigenvalues of $\mathfrak{G}_{\beta,a}$ with probability one.

Proof. This again demonstrates the advantage of having explicit inverses. Now $(\mathfrak{G}^L_{\beta,a})^{-1}$ acts on $L^2([0, L], m)$ via
$$
\big( (\mathfrak{G}^L_{\beta,a})^{-1} f \big)(x) = \int_0^\infty s_L(x,y)\, f(y)\, m(dy),
$$
where
$$
s_L(x,y) = \Big[ \int_0^{x \wedge y} s(dz) \Big] \times \Bigg[ \frac{\int_{x \vee y}^{L} s(dz)}{\int_0^{L} s(dz)} \Bigg]\, 1_{\{x, y \in [0, L]\}}.
$$
Plainly, $s_L(x,y) \le \int_0^{x \wedge y} s(dz)$ and $\lim_{L \to \infty} s_L(x,y) = \int_0^{x \wedge y} s(dz)$ pointwise in x and y, almost surely. By dominated convergence we have in the same mode that
$$
\int_0^\infty \!\! \int_0^\infty f(x)\, s_L(x,y)\, g(y)\, m(dx)\, m(dy) \to \int_0^\infty \!\! \int_0^\infty f(x) \Big( \int_0^{x \wedge y} s(dz) \Big) g(y)\, m(dx)\, m(dy)
$$
for all $f, g \in L^2[\mathbb{R}_+, m]$, and
$$
\mathrm{tr}\, (\mathfrak{G}^L_{\beta,a})^{-1} = \int_0^L s_L(x,x)\, m(dx) \to \int_0^\infty \!\! \int_0^x s(dy)\, m(dx) = \mathrm{tr}\, \mathfrak{G}_{\beta,a}^{-1}.
$$
But these last two items imply convergence of $\mathfrak{G}^L_{\beta,a}$ to $\mathfrak{G}_{\beta,a}$ in trace norm (see [17], Theorem 2.20); the convergence of the eigenvalues then stems from the same style of argument used in the proof of Theorem 1.
4. The Hard-to-Soft Transition

Borodin-Forrester [1] discovered a transition between the hard and soft edge distributions at β = 1, 2, and 4. Their proof rests on the explicit Fredholm determinant or Fredholm pfaffian form of these laws. For example, at β = 2 one has that
$$
P(\Lambda_0(2, a) > \lambda) = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_0^{\lambda} dx_1 \cdots \int_0^{\lambda} dx_k\, \det\big[ K_{Bessel}(x_i, x_j) \big]_{i,j=1,\ldots,k},
\tag{4.1}
$$
while
$$
P(TW_2 < \lambda) = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_{\lambda}^{\infty} dx_1 \cdots \int_{\lambda}^{\infty} dx_k\, \det\big[ K_{Airy}(x_i, x_j) \big]_{i,j=1,\ldots,k}.
\tag{4.2}
$$
Here,
$$
K_{Bessel}(x,y) := \frac{J_a(\sqrt{x})\, \sqrt{y}\, J_a'(\sqrt{y}) - \sqrt{x}\, J_a'(\sqrt{x})\, J_a(\sqrt{y})}{x - y}
$$
with $J_a$ the usual Bessel function of the first kind, which is replaced by the Airy function in
$$
K_{Airy}(x,y) := \frac{\mathrm{Ai}(x)\, \mathrm{Ai}'(y) - \mathrm{Ai}'(x)\, \mathrm{Ai}(y)}{x - y}.
$$
For β = 1 or 4 the determinants in (4.1) and (4.2) are replaced by quaternion determinants (or, equivalently, pfaffians), but are comprised of the same class of functions. Further, it is a fact that, suitably scaled, $J_a$ goes over into the Airy function as $a \to \infty$, and the analysis of [1] demonstrates that one may pass this limit inside the various multiple integrals in (4.1) and its analogues.

By a much different method, employing the Riccati correspondence, we show the same type of phenomena holds at all β > 0. From Theorem 2, the event $\{\Lambda_0(\beta,a) > \lambda\}$ is equivalent in law to the process
\[
dp(x) = \tfrac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \Bigl( \bigl(a + \tfrac{2}{\beta}\bigr) p(x) - p^2(x) - \lambda e^{-x} \Bigr) dx
\]
never hitting $-\infty$, while from [15] we know that the probability of the event $\{TW_\beta < \mu\}$ equals the chance that a separate motion $q$ given by
\[
dq(x) = \tfrac{2}{\sqrt{\beta}}\, db(x) + \bigl(x + \mu - q^2(x)\bigr)\, dx \tag{4.3}
\]
also never hits $-\infty$. (Both processes are begun at $+\infty$.) The question is then: with the scalings
\[
a = 2\sqrt{\eta} - \tfrac{2}{\beta} > -1 \quad\text{and}\quad \lambda = \eta - \eta^{2/3}\mu,
\]
does the chance of $p$-explosion go over into that of a $q$-explosion for large η? To understand the mechanism, set µ = 0 for a moment. This scaled $p$ solves
\[
dp(x) = \tfrac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \bigl(2\sqrt{\eta}\, p(x) - p^2(x) - \eta e^{-x}\bigr)\, dx,
\]
and obviously $\mathfrak{p} = p/\sqrt{\eta}$ explodes or not with $p$ while satisfying
\[
d\mathfrak{p}(x) = \tfrac{2}{\sqrt{\beta}}\, \mathfrak{p}(x)\, db(x) + \sqrt{\eta}\, \bigl(2\mathfrak{p}(x) - \mathfrak{p}^2(x) - e^{-x}\bigr)\, dx.
\]
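The passage from $p$ to the rescaled process $p/\sqrt{\eta}$ is a one-line substitution, and it can be sanity-checked numerically. The following is a minimal sketch (the function names and sample points are illustrative choices, not taken from the paper): it compares the drift of $p$ at µ = 0, divided by $\sqrt{\eta}$, with the drift stated for the rescaled process. The noise term is linear in $p$, so it rescales trivially and is not checked.

```python
import math

def drift_p(p, x, eta):
    # Drift of the unscaled process at mu = 0: 2*sqrt(eta)*p - p^2 - eta*exp(-x).
    return 2.0 * math.sqrt(eta) * p - p * p - eta * math.exp(-x)

def drift_scaled(q, x, eta):
    # Drift stated for the rescaled process p/sqrt(eta): sqrt(eta)*(2q - q^2 - exp(-x)).
    return math.sqrt(eta) * (2.0 * q - q * q - math.exp(-x))

def max_scaling_error(eta, points):
    # Dividing dp by sqrt(eta) must turn drift_p into drift_scaled at every point.
    worst = 0.0
    for q, x in points:
        lhs = drift_p(math.sqrt(eta) * q, x, eta) / math.sqrt(eta)
        rhs = drift_scaled(q, x, eta)
        worst = max(worst, abs(lhs - rhs) / (1.0 + abs(rhs)))
    return worst

# Hypothetical sample of (rescaled state, position) pairs.
POINTS = [(q, x) for q in (-2.0, 0.5, 1.0, 3.0) for x in (0.0, 0.7, 5.0)]
```

Calling `max_scaling_error(100.0, POINTS)` should return a value at the level of floating-point rounding, which is all the substitution claims.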
For η ↑ ∞, $\mathfrak{p}$ comes quickly to the place $\mathfrak{p} = 1$, and, if it manages to tunnel through this point in a short time, explosion is hard to avoid. Within this excursion from $1^+$ to $1^-$ in a small $x$-window, the $q$-motion emerges. To make this explicit we will use the following convergence criterion.

Proposition 13. (After Theorem 11.1.4 of [18]). Let $a(t,z)$ and $b(t,z)$ be continuous from $[0,\infty) \times \mathbb{R}$ into $\mathbb{R}$. For each $w \in \mathbb{R}$, let the solution of the martingale problem for $a$ and $b$ (diffusion and drift coefficients respectively) begun from $w$ at $t = s$ be unique. Denote this solution by $P_{s,w}$. Suppose next that there are $\{a_n\}$ and $\{b_n\}$ satisfying
\[
\sup_{n \ge 1}\, \sup_{t \le T}\, \sup_{|z| < M} \bigl( |a_n(t,z)| + |b_n(t,z)| \bigr) < \infty
\]
and
\[
\lim_{n \to \infty} \int_0^T \sup_{|z| < M} \bigl( |a_n(t,z) - a(t,z)| + |b_n(t,z) - b(t,z)| \bigr)\, dt = 0
\]
for all $T > 0$ and $M > 0$. Then, if $P^n_{s,w}$ is a solution of the martingale problem for $a_n$ and $b_n$ starting from $(s,w)$, $P^n_{s,w} \to P_{s,w}$.
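Proof of Theorem 3. Restoring a generic value of µ we write
\[
d\mathfrak{p}(x) = \tfrac{2}{\sqrt{\beta}}\, \mathfrak{p}(x)\, db(x) + \eta^{1/2} \Bigl( 2\mathfrak{p}(x) - \mathfrak{p}^2(x) - (1 - \eta^{-1/3}\mu)\, e^{-x} \Bigr)\, dx. \tag{4.4}
\]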
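Here $\mathfrak{p}(0) = +\infty$, while to utilize the proposition it is convenient to move the starting point to a finite place. Certainly, $P_{+\infty}(\mathfrak{p} \text{ never explodes}) \ge P_{1+\varepsilon}(\mathfrak{p} \text{ never explodes})$ for whatever $\varepsilon > 0$. Also,
\[
P_{+\infty}(\mathfrak{p} \text{ never explodes}) \le P_{+\infty}(\mathfrak{p} \text{ never explodes},\ \mathfrak{m}_{1+\varepsilon} \le \delta) + P_{+\infty}(\mathfrak{m}_{1+\varepsilon} \ge \delta),
\]
where $\mathfrak{m}_c$ is the first passage to the point $c$ and $\delta > 0$. By the Markov property and monotonicity, the first term on the right is less than the $(P_{\delta,1+\varepsilon})$-probability of no explosion. We wish to bound the second term from above for large η, and to that end note that $P_{+\infty}(\mathfrak{m}_a \le \mathfrak{m}^{\delta}_a) = 1$, where $\mathfrak{m}^{\delta}_a$ is the passage time of the homogeneous process $\mathfrak{p}^{\delta}$ in which the appearance of $e^{-x}$ in the $\mathfrak{p}$ drift is replaced by $e^{-\delta}$. (The obvious coupling is used.) Hence,
\[
P_{+\infty}(\mathfrak{m}_{1+\varepsilon} > \delta) \le \frac{1}{\delta}\, E_{+\infty}\bigl[\mathfrak{m}^{\delta}_{1+\varepsilon}\bigr] = \frac{1}{\delta} \int_{1+\varepsilon}^{\infty} \int_{1+\varepsilon}^{x} s(dy)\, m(dx) \tag{4.5}
\]
for $m(dx)$ and $s(dx)$ the speed and scale measures of $\mathfrak{p}^{\delta}$:
\[
m(dx) = \frac{\beta}{2}\, \frac{1}{x^2}\, e^{-\sqrt{\eta}\,\psi(x)}\, dx, \qquad s(dx) = e^{\sqrt{\eta}\,\psi(x)}\, dx, \qquad \psi(x) = \frac{\beta}{2} \Bigl( x - 2 \ln x - c_{\eta,\delta}\, \frac{1}{x} \Bigr),
\]
and $c_{\eta,\delta} = (1 - \mu\eta^{-1/3})\, e^{-\delta}$. Next choose
\[
\varepsilon = \varepsilon(\eta) = M \eta^{-1/6}, \qquad \delta = \delta(\eta) = \frac{1}{K}\, \eta^{-1/3}, \tag{4.6}
\]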
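where $K \ge 1$ and $M \ge \sqrt{|\mu| + 2}$. These last precautions imply that $\psi(x)$ is increasing for $x > 1 + \varepsilon$. Then an exercise in stationary phase allows the continuation of (4.5) as
\[
P_{+\infty}\bigl(\mathfrak{m}_{1+\varepsilon(\eta)} > \delta(\eta)\bigr) \le K \eta^{1/3} \int_{1+M\eta^{-1/6}}^{\infty} \frac{1}{x^2} \int_{1+M\eta^{-1/6}}^{x} e^{-\sqrt{\eta}\,[\psi(x) - \psi(y)]}\, dy\, dx \le C\, \frac{K}{M},
\]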
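for η ↑ ∞ and a constant $C$ depending only on β, the inner integral concentrating at the upper limit $y = x$. In summary, for $\mathfrak{p}$ paths we have that
\[
P_{0,1+\varepsilon(\eta)}(\mathfrak{m}_{-\infty} = \infty) \le P_{0,+\infty}(\mathfrak{m}_{-\infty} = \infty) \le P_{\delta(\eta),1+\varepsilon(\eta)}(\mathfrak{m}_{-\infty} = \infty) + C\, \frac{K}{M} \tag{4.7}
\]
holds for all large η. Now bring in
\[
q_\eta(x) = \eta^{1/6} \bigl( \mathfrak{p}(\eta^{-1/3} x) - 1 \bigr),
\]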
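and note that, when $\mathfrak{p}$ begins at $(0, 1+\varepsilon(\eta))$, $q_\eta$ begins at $(0, M)$, and when $\mathfrak{p}$ begins at $(\delta(\eta), 1+\varepsilon(\eta))$, $q_\eta$ begins at $(K^{-1}, M)$. Further, $q_\eta$ hits $-\infty$ if and only if $\mathfrak{p}$ does, and a substitution in (4.4) shows that $q_\eta$ satisfies the Itô equation
\[
dq_\eta(x) = \tfrac{2}{\sqrt{\beta}} \bigl( 1 + \eta^{-1/6} q_\eta(x) \bigr)\, d\hat{b}(x) + \Bigl( -q_\eta^2(x) + \eta^{1/3} \bigl( 1 - (1 - \eta^{-1/3}\mu)\, e^{-\eta^{-1/3} x} \bigr) \Bigr)\, dx
\]
with a new Brownian motion $\hat{b}(x) = \eta^{1/6}\, b(\eta^{-1/3} x)$. Given unique strong solutions in both instances, Proposition 13 easily applies with
\[
a_\eta(t,z) = \tfrac{2}{\beta}\, \bigl[ 1 + \eta^{-1/6} z \bigr]^2 \quad\text{and}\quad b_\eta(t,z) = -z^2 + \eta^{1/3} \bigl( 1 - (1 - \eta^{-1/3}\mu)\, e^{-\eta^{-1/3} t} \bigr),
\]
the $q_\eta$-coefficients, and $a(t,z) = 2/\beta$ and $b(t,z) = -z^2 + \mu + t$, the $q$-coefficients (recall (4.3)). That is to say, $\lim_{\eta \to \infty} E_{x,c}[\phi(q_\eta)] = E_{x,c}[\phi(q)]$ for all bounded continuous functions of the path, and, by approximation we also find, via (4.6) and (4.7), that
\[
P_{0,M}(q \text{ never explodes}) \le \liminf_{\eta \to \infty} P_{0,\infty}(\mathfrak{p} \text{ never explodes}) \le \limsup_{\eta \to \infty} P_{0,\infty}(\mathfrak{p} \text{ never explodes}) \le P_{K^{-1},M}(q \text{ never explodes}) + C\, \frac{K}{M}.
\]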
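The hypothesis of Proposition 13 that matters here is that the coefficients of the rescaled motion converge locally uniformly to those of (4.3). A minimal numerical sketch of that convergence follows; the grid, the window sizes `T` and `M`, and the function names are illustrative assumptions, and a grid maximum only approximates the supremum.

```python
import math

def a_eta(z, beta, eta):
    # Diffusion coefficient of the rescaled motion: (2/beta) * (1 + eta^(-1/6) z)^2.
    return (2.0 / beta) * (1.0 + eta ** (-1.0 / 6.0) * z) ** 2

def b_eta(t, z, mu, eta):
    # Drift of the rescaled motion:
    # -z^2 + eta^(1/3) * (1 - (1 - eta^(-1/3) mu) * exp(-eta^(-1/3) t)).
    h = eta ** (-1.0 / 3.0)
    # 1 - (1 - h*mu)*exp(-h*t), rewritten with expm1 to avoid cancellation for small h.
    bracket = -math.expm1(-h * t) + h * mu * math.exp(-h * t)
    return -z * z + bracket / h

def coefficient_gap(mu, beta, eta, T=2.0, M=3.0, steps=40):
    # Grid approximation of sup over 0 <= t <= T, |z| <= M of
    # |a_eta - a| + |b_eta - b| with a = 2/beta and b = -z^2 + mu + t.
    worst = 0.0
    for i in range(steps + 1):
        t = T * i / steps
        for j in range(steps + 1):
            z = -M + 2.0 * M * j / steps
            gap_a = abs(a_eta(z, beta, eta) - 2.0 / beta)
            gap_b = abs(b_eta(t, z, mu, eta) - (-z * z + mu + t))
            worst = max(worst, gap_a + gap_b)
    return worst
```

Evaluating `coefficient_gap(1.0, 2.0, eta)` for increasing `eta` should show the gap shrinking, at the rate $\eta^{-1/6}$ dictated by the diffusion coefficient.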
Note while $q \mapsto \mathfrak{m}_{-\infty}(q)$ is not continuous, $q \mapsto \mathfrak{m}_{-L}(q)$ is for any $L$ finite (outside a set of measure zero). It follows that we have the distributional convergence of $\mathfrak{m}_{-L}(q_\eta)$ to $\mathfrak{m}_{-L}(q)$. The approximation required above is then to show that $\lim_{L \to \infty} \mathfrak{m}_{-L}(q) = \mathfrak{m}_{-\infty}(q)$ holds in probability, with the same limit taking place uniformly in η when $q_\eta$ replaces $q$. That all processes involved have exit barriers at $-\infty$ makes this routine. To finish the proof, let $M$ and then $K$ tend to infinity. The $q$-law is continuous in its initial time, and that $\lim_{M \to \infty} P_{c,M} = P_{c,\infty}$ is a byproduct of $+\infty$ being an entrance point.

Acknowledgements. We thank P. Forrester for pointing out the transition problem to us, and also M. Krishnapur and T. Kurtz for helpful input. The work of the second author was supported in part by NSF grants DMS-0505680 and DMS-0645756.
References
1. Borodin, A., Forrester, P.J.: Increasing subsequences and the hard-to-soft transition in matrix ensembles. J. Phys. A: Math. Gen. 36(12), 2963–2982 (2003)
2. Bovier, A., Faggionato, A.: Spectral analysis of Sinai's walk for small eigenvalues. Ann. Probab. 36(1), 198–254 (2008)
3. Brox, T.: A one-dimensional diffusion process in a Wiener medium. Ann. Probab. 14(4), 1206–1218 (1986)
4. Deift, P., Gioev, D., Kriecherbauer, T., Vanlessen, M.: Universality for orthogonal and symplectic Laguerre-type ensembles. J. Stat. Phys. 129(5–6), 949–1053 (2007)
5. Dumitriu, I., Edelman, A.: Matrix models for beta ensembles. J. Math. Phys. 43(11), 5830–5847 (2002)
6. Edelman, A., Sutton, B.: From random matrices to stochastic operators. J. Stat. Phys. 127(6), 1121–1165 (2007)
7. Edelman, A.: Eigenvalues and condition numbers of random matrices. SIAM J. Matrix Anal. Appl. 9, 543–560 (1988)
8. Ethier, S., Kurtz, T.: Markov Processes. Wiley Series in Probability and Statistics. New York: John Wiley & Sons, 1986
9. Forrester, P.J.: Exact results and universal asymptotics in the Laguerre random matrix ensemble. J. Math. Phys. 35(5), 2519–2551 (1994)
10. Forrester, P.J.: Hard and soft edge spacing distributions for random matrix ensembles with orthogonal and symplectic symmetry. Nonlinearity 19, 2989–3002 (2006)
11. Halperin, B.I.: Green's functions for a particle in a one-dimensional random potential. Phys. Rev. (2) 139, A104–A117 (1965)
12. Itô, K., McKean, H.P.: Diffusion Processes and their Sample Paths. Berlin-Heidelberg-New York: Springer-Verlag, 1974
13. Killip, R., Stoiciu, M.: Eigenvalue Statistics for CMV Matrices: From Poisson to Clock via Circular Beta Ensembles. To appear, Duke Math. J., available at http://arxiv.org/abs/math-ph/0608002, 2006
14. Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley Series in Probability and Statistics, 1982
15. Ramírez, J., Rider, B., Virág, B.: Beta ensembles, stochastic Airy spectrum and a diffusion. Preprint, available at http://arXiv.org/abs/math/0607331v3, 2007
16. Silverstein, J.: The smallest eigenvalue of a large dimensional Wishart matrix. Ann. Probab. 13(4), 1364–1368 (1985)
17. Simon, B.: Trace Ideals and their Applications. Cambridge-New York: Cambridge University Press, 1974
18. Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Berlin-New York: Springer-Verlag, 1997
19. Talet, M.: Annealed tail estimates for a Brownian motion in a drifted Brownian potential. Ann. Probab. 35(1), 32–67 (2007)
20. Telatar, E.: Capacity of multi-antenna Gaussian channels. European Trans. Telecom. 10(6), 585–596 (1999)
21. Tracy, C., Widom, H.: Level spacing distributions and the Airy kernel. Commun. Math. Phys. 159(1), 151–174 (1994)
22. Tracy, C., Widom, H.: Level spacing distributions and the Bessel kernel. Commun. Math. Phys. 161(2), 289–309 (1994)
23. Tracy, C., Widom, H.: On orthogonal and symplectic matrix ensembles. Commun. Math. Phys. 177(3), 727–754 (1996)
24. Trotter, H.F.: Eigenvalue distributions of large Hermitian matrices; Wigner's semicircle law and a theorem of Kac, Murdock, and Szegö. Adv. in Math. 54(1), 67–82 (1984)
25. Valko, B., Virág, B.: Continuum limits of random matrices and the Brownian carousel. Preprint, available at http://arxiv.org/abs/0712.2000v3, 2008
26. Vanlessen, M.: Strong asymptotics of Laguerre-type orthogonal polynomials and applications in random matrix theory. Constr. Approx. 25(2), 125–175 (2007)
27. Verbaarschot, J.: Spectrum of the QCD Dirac operator and chiral random matrix theory. Phys. Rev. Lett. 72, 2531–2533 (1994)

Communicated by H. Spohn
Commun. Math. Phys. 288, 907–918 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0667-2
Communications in
Mathematical Physics
On the Spectrum and Lyapunov Exponent of Limit Periodic Schrödinger Operators Artur Avila CNRS UMR 7599, Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie–Boîte Courrier 188, 75252 Paris Cedex 05, France Received: 12 May 2008 / Accepted: 24 July 2008 Published online: 27 November 2008 – © Springer-Verlag 2008
Abstract: We exhibit a dense set of limit periodic potentials for which the corresponding one-dimensional Schrödinger operator has a positive Lyapunov exponent for all energies and a spectrum of zero Lebesgue measure. No example with those properties was previously known, even in the larger class of ergodic potentials. We also conclude that the generic limit periodic potential has a spectrum of zero Lebesgue measure.

1. Introduction

This work is motivated by a question in the theory of one-dimensional ergodic Schrödinger operators. Those are bounded self-adjoint operators of $\ell^2(\mathbb{Z})$ given by
\[
(Hu)_n = u_{n+1} + u_{n-1} + v(f^n(x))\, u_n, \tag{1.1}
\]
where $f : X \to X$ is an invertible measurable transformation preserving an ergodic probability measure µ and $v : X \to \mathbb{R}$ is a bounded measurable function, called the potential. One is interested in the behavior for µ-almost every $x$. In this case, the spectrum is µ-almost surely independent of $x$. The Lyapunov exponent is defined as
\[
L(E) = \lim_{n \to \infty} \frac{1}{n} \int \ln \bigl\| A^{(E)}_n(x) \bigr\|\, d\mu(x), \tag{1.2}
\]
where $A^{(E)}_n$ is the $n$-step transfer matrix of the Schrödinger equation $Hu = Eu$.

Here we will give first examples of ergodic potentials with a spectrum of zero Lebesgue measure such that the Lyapunov exponent is positive throughout the spectrum. This answers a question raised by Barry Simon (Conjecture 8.7 of [S]).

Current address: IMPA, Estrada Dona Castorina 110, Rio de Janeiro, 22460-320, Brazil. E-mail:
[email protected];
[email protected]
The example we will construct will belong to the class of limit periodic potentials. Those arise from continuous potentials over a minimal translation of a Cantor group (see Sect. 2 for a discussion of those notions). In our approach, we fix the underlying dynamics and vary the potential: it turns out that a dense set of such potentials provides counterexamples. It is actually possible to incorporate a coupling parameter in our construction. Here is a precise version that can be obtained from our technique:

Theorem 1.1. Let $f : X \to X$ be a minimal translation of a Cantor group. For a dense set of $v \in C^0(X, \mathbb{R})$ and for every $\lambda \ne 0$, the Schrödinger operator with potential $\lambda v$ has a spectrum of zero Lebesgue measure, and the Lyapunov exponent is a continuous positive function of the energy.

Our result implies, by continuity of the spectrum, that a generic potential over a minimal translation of a Cantor group has a spectrum of zero Lebesgue measure.

Corollary 1.2. Let $f : X \to X$ be a minimal translation of a Cantor group. For generic $v \in C^0(X, \mathbb{R})$, and for every $\lambda \ne 0$, the Schrödinger operator with potential $\lambda v$ has a spectrum of zero Lebesgue measure (and the Lyapunov exponent is a continuous function of the energy which vanishes over the spectrum).

The statements about the Lyapunov exponent in the generic context are rather obvious consequences of upper semicontinuity and density of periodic potentials. They highlight however that the generic approach is too rough and that care must be taken in the proof of Theorem 1.1 in order not to lose the Lyapunov exponent.

Remark 1.1. Lebesgue measure zero can be strengthened to Hausdorff dimension zero in both Theorem 1.1 and Corollary 1.2. It suffices to replace the 10th power by an arbitrarily large one in (2), Lemma 3.2, without qualitative impact in the proof, and to replace (3), Lemma 3.3, by the covering estimate which the argument is giving. This remark was prompted by a recent result (based on a different method) of Last-Shamis about the Hausdorff dimension of the spectrum of the critical almost Mathieu operator.

2. Preliminaries

2.1. From limit periodic sequences to Cantor groups. Limit periodic potentials are discussed in depth in [AS]. Here we will restrict ourselves to some basic facts used in this paper. Let σ be the shift operator on $\ell^\infty(\mathbb{Z})$, that is, $(\sigma(x))_n = x_{n+1}$. Let $\mathrm{orb}(x) = \{\sigma^k(x),\ k \in \mathbb{Z}\}$. We say that $x$ is periodic if $\mathrm{orb}(x)$ is finite. We say that $x$ is limit periodic if it belongs to the closure, in $\ell^\infty(\mathbb{Z})$, of the set of periodic sequences. If $x$ is limit periodic, we let $\mathrm{hull}(x)$ be the closure of $\mathrm{orb}(x)$ in $\ell^\infty(\mathbb{Z})$. It is easy to see that every $y \in \mathrm{hull}(x)$ is limit periodic.

Lemma 2.1. If $x$ is limit periodic then $\mathrm{hull}(x)$ is compact and it has a unique topological group structure with identity $x$ such that $\mathbb{Z} \to \mathrm{hull}(x)$, $k \mapsto \sigma^k(x)$ is a homomorphism. Moreover, the group structure is Abelian and there exist arbitrarily small compact open neighborhoods of $x$ in $\mathrm{hull}(x)$ which are finite index subgroups.
Proof. Recall that a metric space is called totally bounded if for every $\epsilon > 0$ it is contained in the $\epsilon$-neighborhood of a finite set. It is easy to see that a totally bounded subset of a complete metric space has compact closure. If $x$ is limit periodic then $\mathrm{orb}(x)$ is totally bounded: indeed if $p$ is periodic and $\|x - p\|_\infty < \epsilon$ then $\mathrm{orb}(x)$ is contained in the $\epsilon$-neighborhood of $\mathrm{orb}(p)$. Since $\ell^\infty(\mathbb{Z})$ is a Banach space, $\mathrm{hull}(x)$ is compact. Clearly there exists a unique (cyclic) group structure on $\mathrm{orb}(x)$ such that the map $\mathbb{Z} \to \mathrm{orb}(x)$, $k \mapsto \sigma^k(x)$ is a homomorphism. Let us show that the group structure is uniformly continuous. We have
\[
\begin{aligned}
\|\sigma^{k+l}(x) - \sigma^{k'+l'}(x)\|_\infty &= \|\sigma^{k-k'}(x) - \sigma^{l'-l}(x)\|_\infty \\
&\le \|\sigma^{k-k'}(x) - x\|_\infty + \|x - \sigma^{l'-l}(x)\|_\infty \\
&= \|\sigma^{k}(x) - \sigma^{k'}(x)\|_\infty + \|\sigma^{l}(x) - \sigma^{l'}(x)\|_\infty,
\end{aligned} \tag{2.1}
\]
where the inequality is just the triangle inequality and the equalities follow from the fact that σ is an isometry of $\ell^\infty(\mathbb{Z})$. Thus if $y, z, y', z' \in \mathrm{orb}(x)$ then $\|y \cdot z - y' \cdot z'\|_\infty \le \|y - y'\|_\infty + \|z - z'\|_\infty$, which shows the uniform (even Lipschitz) continuity. By uniform continuity, the group structure on $\mathrm{orb}(x)$ has a unique continuous extension to $\mathrm{hull}(x)$. Since the group structure on $\mathrm{orb}(x)$ is Abelian, its extension is still Abelian. For the last statement, fix $\epsilon > 0$ and let $p$ be periodic with $\|x - p\|_\infty < \epsilon/2$. Let $k$ be such that $\sigma^k(p) = p$. Clearly the closure $\mathrm{hull}_k(x)$ of $\{\sigma^{kn}(x),\ n \in \mathbb{Z}\}$ is a compact subgroup of $\mathrm{hull}(x)$ of index at most $k$. Since $\mathrm{hull}(x)$ is the union of finitely many disjoint translates of $\mathrm{hull}_k(x)$, it follows that $\mathrm{hull}_k(x)$ is also open. Since σ is an isometry, $\mathrm{hull}_k(x)$ is contained in the $\epsilon/2$-ball around $p$, and hence it is contained in the $\epsilon$-ball around $x$.
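To make the definition of limit periodicity concrete, here is a minimal sketch with one explicit sequence; the sequence itself is an illustrative choice, not one used in the paper. Take $x_n = \sum_{k \ge 1} 2^{-k} \cos(2\pi n / 2^k)$. Its truncation at level $m$ is periodic with period $2^m$ and lies within $2^{-m}$ of $x$ in the sup norm, so $x$ is a uniform limit of periodic sequences. In the code a deep truncation (`depth=30`) stands in for the infinite sum, and the sup norm is taken over a finite window only.

```python
import math

def truncation(n, m):
    # p^(m)_n = sum_{k=1}^{m} 2^(-k) * cos(2*pi*n / 2^k); periodic in n with period 2^m.
    return sum(2.0 ** -k * math.cos(2.0 * math.pi * n / 2 ** k) for k in range(1, m + 1))

def sup_distance(m, depth=30, window=range(-200, 201)):
    # Sup-norm distance, over a finite window, between the level-m truncation and a
    # deep truncation that stands in for the full limit periodic sequence x.
    return max(abs(truncation(n, depth) - truncation(n, m)) for n in window)

def period_defect(m, window=range(-50, 51)):
    # How far the level-m truncation is from being exactly 2^m-periodic (rounding only).
    return max(abs(truncation(n + 2 ** m, m) - truncation(n, m)) for n in window)
```

For each `m`, `sup_distance(m)` stays below `2.0 ** -m`, which is the geometric tail of the series, while `period_defect(m)` is at rounding level.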
By the previous lemma, $\mathrm{hull}(x)$ is compact and totally disconnected, so it is either finite (if and only if $x$ is periodic) or it is a Cantor set. If $x$ is limit periodic but not periodic, we see that every $y$ in $\mathrm{hull}(x)$ (which is also a limit periodic sequence) is of the form $y_n = v(f^n(y))$, where $f$ is a minimal translation of a Cantor group ($f = \sigma|\mathrm{hull}(x)$) and $v$ is continuous ($v(w) = w_0$).

2.2. From Cantor groups to limit periodic sequences. Let us now consider a Cantor group $X$ and let $t \in X$. Let $f : X \to X$ be the translation by $t$. We say that $f$ is minimal if $\{f^n(y),\ n \in \mathbb{Z}\}$ is dense in $X$ for every $y \in X$. This is equivalent to $\{t^n,\ n \in \mathbb{Z}\}$ being dense in $X$. In this case, since there exists a dense cyclic subgroup, we conclude that $X$ is actually Abelian. Let $v : X \to \mathbb{R}$ be any continuous function. Let $\phi : X \to \ell^\infty(\mathbb{Z})$, $\phi(x) = (v(f^n(x)))_{n \in \mathbb{Z}}$.

Lemma 2.2. For every $x \in X$, $\phi(x)$ is limit periodic and $\phi(X) = \mathrm{hull}(\phi(x))$.

Proof. It is enough to show that $\phi(x)$ is limit periodic, since $\phi(X)$ is compact and $\mathrm{orb}(\phi(x))$ is the image under $\phi$ of the set $\{f^n(x),\ n \in \mathbb{Z}\}$, which is dense in $X$. Given $\delta > 0$ we must find a periodic sequence $p$ such that $\|\phi(x) - p\|_\infty \le \delta$. Choose a compact open neighborhood $W$ of the identity of $X$ which is so small that if $y \in W$ then $|v(y \cdot z) - v(z)| \le \delta$ for every $z \in X$. Introduce a metric $d$ on $X$, compatible with the topology. Let $\epsilon > 0$ be such that if $y, z \in X$ are such that $y \in W$ and $z \notin W$ then $d(y,z) > \epsilon$. Choose $m > 0$ such that $t^m$
is so close to the identity that for every $y \in X$, $d(y, f^m(y)) < \epsilon$. Then by induction on $|k|$, $t^{mk} \in W$ for every $k \in \mathbb{Z}$. It follows that the closure of $\{t^{km},\ k \in \mathbb{Z}\}$ is a compact subgroup of $X$ contained in $W$. Clearly it has index at most $m$. Let $p \in \ell^\infty(\mathbb{Z})$ be given by $p_i = v(f^j(x))$, where $0 \le j \le m-1$ is such that $i \equiv j \bmod m$. Then $|\phi(x)_i - p_i| = |v(f^i(x)) - v(f^j(x))| = |v(y \cdot z) - v(z)|$, where $z = f^j(x)$ and $y = t^{i-j}$. Since $i \equiv j \bmod m$, $t^{i-j} \in W$, and by the choice of $W$ we have $|\phi(x)_i - p_i| \le \delta$. It follows that $\|\phi(x) - p\|_\infty \le \delta$ as desired.
Remark 2.1. By the proof above, there exist arbitrarily small compact subgroups of finite index of $X$ (such subgroups are automatically open as before).

2.3. Limit periodic Schrödinger operators. Given $f : X \to X$ a minimal translation of a Cantor group and $v : X \to \mathbb{R}$ a continuous function, we define for every $x \in X$ a Schrödinger operator $H = H_{f,v,x}$ by (1.1). A formal solution of $Hu = Eu$ satisfies
\[
A^{(E,f,v)}_n(x) \begin{pmatrix} u_0 \\ u_{-1} \end{pmatrix} = \begin{pmatrix} u_n \\ u_{n-1} \end{pmatrix}, \tag{2.2}
\]
where
\[
A^{(E)}_n(x) = A^{(E,f,v)}_n(x) = S_{n-1} \cdots S_0 \quad\text{where}\quad S_i = \begin{pmatrix} E - v(f^i(x)) & -1 \\ 1 & 0 \end{pmatrix}. \tag{2.3}
\]
The $A^{(E)}_n(x)$ are thus in $\mathrm{SL}(2, \mathbb{R})$, and are called the $n$-step transfer matrices. The Lyapunov exponent $L(E) = L(E, f, v)$ is defined by (1.2), where we take µ the Haar probability measure on $X$ (this is the only possible choice actually, since minimal translations of Cantor groups are uniquely ergodic). (The limit in (1.2) exists by subadditivity, which also shows that lim may be replaced by inf.)

Remark 2.2. By subadditivity, $\frac{1}{2^k} \int \ln \|A^{(E)}_{2^k}(x)\|\, d\mu(x)$ is a decreasing sequence converging to $L(E)$. Allowing $E$ to take values in $\mathbb{C}$, we conclude that $E \mapsto L(E)$ is the real part of a subharmonic function.

Lemma 2.3. If $n \ge 2$, for every non-zero vector $z \in \mathbb{R}^2$, the derivative (with respect to $E$) of the argument of $A^{(E,f,v)}_n(x)\, z$ is strictly negative.

Proof. Let $\rho_n(E, x, z)$ be the derivative (with respect to $E$) of the argument of $A^{(E,f,v)}_n(x)\, z$. It is easy to see that $\rho_1(E, x, z)$ is strictly negative whenever $z$ is not vertical, and it is zero if $z$ is vertical. By the chain rule, for $n \ge 2$,
\[
\rho_n(E, x, z) = \sum_{i=1}^{n} \kappa_i\, \rho_1\bigl(E, f^{i-1}(x), A^{(E,f,v)}_{i-1}(x)\, z\bigr),
\]
where the $\kappa_i$ are strictly positive (since $A^{(E,f,v)}_{n-i}(f^i(x)) \in \mathrm{SL}(2, \mathbb{R})$ and hence preserves orientation). Since either $z$ or $A^{(E,f,v)}_1(x)\, z$ is non-vertical, the result follows.
2.3.1. Let us endow the space $\mathcal{H}$ of bounded self-adjoint operators of $\ell^2(\mathbb{Z})$ with the norm $\|\Phi\| = \sup_{\|u\|_2 = 1} \|\Phi(u)\|_2$, and the space of compact subsets of $\mathbb{R}$ with the Caratheodory metric ($d(A, B)$ is the infimum of all $r$ such that $A$ is contained in the $r$-neighborhood of $B$ and $B$ is contained in the $r$-neighborhood of $A$). With respect to those metrics, it is easy to see that the spectrum is a 1-Lipschitz function of $\Phi \in \mathcal{H}$. Since the map $C^0(X, \mathbb{R}) \to \mathcal{H}$, $v \mapsto H_{f,v,x}$ is also 1-Lipschitz, we conclude that the spectrum of $H_{f,v,x}$ is a 1-Lipschitz function of $v \in C^0(X, \mathbb{R})$. It also follows that the spectrum of $H_{f,v,x}$ depends continuously on $x$. Since $H_{f,v,x}$ and $H_{f,v,f(x)}$ have obviously the same spectrum, and $f$ is minimal, we conclude that the spectrum is actually $x$-independent. We will denote it $\Sigma(f, v)$.

2.3.2. We say that $v$ is periodic (of period $n \ge 1$) if $v(f^n(x)) = v(x)$ for every $x \in X$. If $v$ is a periodic potential, then it is locally constant, hence for any compact subgroup $Y \subset X$ contained in a sufficiently small neighborhood of id, the function $v$ is defined over $X/Y$. If $v \in C^0(X, \mathbb{R})$ and $Y \subset X$ is a compact subgroup of finite index, then we can define another potential $v^Y$ by convolution with $Y$: $v^Y(x) = \int_Y v(y \cdot x)\, d\mu_Y$, where $\mu_Y$ is the Haar measure on $Y$. The potential $v^Y$ is then periodic. Since there are compact subgroups with finite index contained in arbitrarily small neighborhoods of id, this shows that the set of periodic potentials is dense in $C^0(X, \mathbb{R})$.

2.3.3. If $v$ is $n$-periodic then $\operatorname{tr} A^{(E,f,v)}_n(x)$ is $x$-independent and denoted $\psi(E)$. Then $L(E, f, v)$ is the logarithm of the spectral radius of $A^{(E,f,v)}_n(x)$, for any $x \in X$. This shows that the Lyapunov exponent is a continuous function of both the potential and the energy when one restricts considerations to potentials of period $n$.

2.3.4. We will need some basic facts on the spectrum of periodic potentials, see [AMS], Sect. 3, for a discussion with further references. If $v$ is periodic of period $n$ the spectrum $\Sigma(f, v)$ of $H$ is the set of $E \in \mathbb{R}$ such that $|\psi(E)| \le 2$. Thus for periodic potentials, we have $\Sigma(f, v) = \{E \in \mathbb{R},\ L(E, f, v) = 0\}$. The function ψ is a polynomial of degree $n$. It can be shown that ψ has $n$ distinct real roots and its critical values do not belong to $(-2, 2)$, moreover, $E$ is a critical point of ψ with $\psi(E) = \pm 2$ if and only if $A^{(E,v,f)}_n(x) = \pm\,\mathrm{id}$. From this one derives a number of consequences about the structure of periodic spectra:
(1) The set of all $E$ such that $|\psi(E)| < 2$ has $n$ connected components whose closures are called bands.
(2) If $E$ is in the boundary of some band, we obviously have $\operatorname{tr} A^{(E,f,v)}_n(x) = \pm 2$.
(3) Conversely, if $\operatorname{tr} A^{(E,f,v)}_n(x) = \pm 2$, $E$ is in the boundary of some band, thus the spectrum is the union of the bands.
(4) If two different bands intersect then their common boundary point satisfies $A^{(E,f,v)}_n(x) = \pm\,\mathrm{id}$.

2.3.5. We will need some simple estimates on the Lebesgue measure of the bands and of the spectrum.

Lemma 2.4. Let $v$ be a periodic potential of period $n$.
(1) The measure of each band is at most $\frac{2\pi}{n}$.
(2) Let $C \ge 1$ be such that for every $E$ in the union of bands, there exists $x \in X$ and $k \ge 1$ such that $\|A^{(E,f,v)}_k(x)\| \ge C$. Then the total measure of the spectrum is at most $\frac{4\pi n}{C}$.
Proof. If $E$ belongs to some band, $A^{(E,f,v)}_n(x)$ is conjugate in $\mathrm{SL}(2, \mathbb{R})$ to a rotation: there exists $B^{(E)}(x) \in \mathrm{SL}(2, \mathbb{R})$ such that $B^{(E)}(x)\, A^{(E,f,v)}_n(x)\, B^{(E)}(x)^{-1} \in \mathrm{SO}(2, \mathbb{R})$. This matrix is not unique, since $R\, B^{(E)}(x)$ has the same property for $R \in \mathrm{SO}(2, \mathbb{R})$, but this is the only ambiguity. In particular, the Hilbert-Schmidt norm squared $\|B^{(E)}(x)\|^2_{\mathrm{HS}}$ (the sum of the squares of the entries of the matrix of $B^{(E)}(x)$) is a well defined function $b^{(E)}(x)$, which obviously satisfies $b^{(E)}(f^n(x)) = b^{(E)}(x)$. This allows us to define an $x$-independent function $\hat{b}(E)$ which is zero if $E$ does not belong to a band and for $E$ in a band is given by
\[
\hat{b}(E) = \frac{1}{4\pi n} \sum_{i=0}^{n-1} b^{(E)}(f^i(x)). \tag{2.4}
\]
It turns out that $\hat{b}(E)$ is related to the integrated density of states by the formula $N(E) = \int_{-\infty}^{E} \hat{b}(E')\, dE'$. As a consequence, we conclude that for any band $I \subset \Sigma(f, v)$, $\int_I \hat{b}(E)\, dE = \frac{1}{n}$ (in particular $\int_{\mathbb{R}} \hat{b}(E)\, dE = 1$). See [AD2], Sect. 2.4.1 for a discussion of this point of view on the integrated density of states.

The first statement is then an immediate consequence of $\hat{b}(E) \ge \frac{1}{2\pi}$ which in turn comes from the estimate $\|B\|^2_{\mathrm{HS}} \ge 2$, $B \in \mathrm{SL}(2, \mathbb{R})$.

For the second estimate, it is enough to show that for every $E$ in a band we have $\hat{b}(E) \ge \frac{C}{4\pi n}$. Notice that
\[
\begin{aligned}
B^{(E)}(f^k(x))\, A^{(E,f,v)}_k(x)\, A^{(E,f,v)}_n(x)\, & A^{(E,f,v)}_k(x)^{-1}\, B^{(E)}(f^k(x))^{-1} \\
&= B^{(E)}(f^k(x))\, A^{(E,f,v)}_n(f^k(x))\, B^{(E)}(f^k(x))^{-1} \in \mathrm{SO}(2, \mathbb{R}).
\end{aligned} \tag{2.5}
\]
Thus $B^{(E)}(f^k(x))\, A^{(E,f,v)}_k(x)$ conjugates $A^{(E,f,v)}_n(x)$ to a rotation so it coincides with $R\, B^{(E)}(x)$ for some $R \in \mathrm{SO}(2, \mathbb{R})$. Thus
\[
C \le \|A^{(E,f,v)}_k(x)\| \le \|B^{(E)}(f^k(x))^{-1}\|\, \|B^{(E)}(x)\|, \tag{2.6}
\]
and there exists $y \in X$ (either $y = x$ or $y = f^k(x)$) such that $C \le \|B^{(E)}(y)\|^2 \le b^{(E)}(y)$. It follows that $\hat{b}(E) \ge \frac{C}{4\pi n}$.
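The band description of Sect. 2.3.4 and the first bound of Lemma 2.4 can be observed on a small example. The following is a rough numerical sketch, not the paper's argument: the period-3 potential values, the energy window $[-6, 6]$ and the grid size are arbitrary illustrative choices, and band lengths are only resolved up to the grid spacing.

```python
import math

def discriminant(E, values):
    # psi(E) = tr(S_{n-1} ... S_0) for the periodic potential with the given values,
    # where S_i = [[E - v_i, -1], [1, 0]]; the running product is [[a, b], [c, d]].
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for v in values:
        a, b, c, d = (E - v) * a - c, (E - v) * b - d, a, b
    return a + d

def band_lengths(values, lo=-6.0, hi=6.0, steps=60000):
    # Approximate the bands as maximal runs of grid points where |psi(E)| < 2.
    h = (hi - lo) / steps
    lengths, run = [], 0
    for i in range(steps + 1):
        if abs(discriminant(lo + i * h, values)) < 2.0:
            run += 1
        elif run:
            lengths.append(run * h)
            run = 0
    if run:
        lengths.append(run * h)
    return lengths

# Hypothetical period-3 potential, listed over one period.
SAMPLE = [1.0, -0.5, 0.25]
```

For `SAMPLE`, `band_lengths(SAMPLE)` finds three bands, each shorter than $2\pi/3$, in line with item (1) of Lemma 2.4.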
2.3.6. We conclude with a weak continuity result for the Lyapunov exponent.

Lemma 2.5. Let $v^{(n)} \in C^0(X, \mathbb{R})$ be a sequence converging uniformly to $v \in C^0(X, \mathbb{R})$. Then $L(E, f, v^{(n)}) \to L(E, f, v)$ in $L^1_{\mathrm{loc}}$.

Proof. This follows from the proof of Lemma 1 of [AD1]. Indeed for every compact interval $I \subset \mathbb{R}$, there exists a continuous function $g : I \to \mathbb{R}$, non-vanishing in $\operatorname{int} I$, such that
\[
\lim_{n \to \infty} \int_I \max\{L(E, f, v^{(n)}) - L(E, f, v),\ 0\}\, g(E)\, dE = 0 \tag{2.7}
\]
and
\[
\lim_{n \to \infty} \int_I \min\{L(E, f, v^{(n)}) - L(E, f, v),\ 0\}\, g(E)\, dE = 0 \tag{2.8}
\]
(see the last two equations in p. 396 of [AD1]). The result follows.
3. Proof of Theorem 1.1

Fix some Cantor group $X$, and let $f : X \to X$ be a minimal translation. Then the homomorphism $\mathbb{Z} \to X$, $n \mapsto f^n(\mathrm{id})$ is injective with dense image. For simplicity of notation, we identify the integers with its image under this homomorphism. For a given potential $w \in C^0(X, \mathbb{R})$ and $n \ge 1$, we write $L(E, w) = L(E, f, w)$ for the Lyapunov exponent with energy $E$ corresponding to the potential $w$. Since $X$ is Cantor, there exists a decreasing sequence of Cantor subgroups $X_k \subset X$ with finite index such that $\cap X_k = \{0\}$. Let $P_k$ be the set of potentials which are defined on $X/X_k$. Potentials in $P_k$ are $n_k$-periodic where $n_k$ is the index of $X_k$. If $w \in C^0(X, \mathbb{R})$ is a periodic potential, then it belongs to some $P_k$. Let $P = \cup P_k$ be the set of periodic potentials (which is a dense subset of $C^0(X, \mathbb{R})$, see Sect. 2.3.2). For $n \ge 1$, we write $A^{(E,w)}_n(x) = A^{(E,f,w)}_n(x)$ for the $n$-step transfer matrix associated with the potential $w$ at $x$. We also let $A^{(E,w)}_n = A^{(E,w)}_n(0)$. The spectrum will be denoted by $\Sigma(w) = \Sigma(f, w)$.

We will actually work with finite families $W$ of periodic potentials. Here we allow for multiplicity of elements, so the number of elements in $W$, denoted by $\#W$, may be larger than the number of distinct elements of $W$. For simplicity of notation, we will often treat $W$ as a set (writing for instance $W \subset P$). We write $L(E, W) = \frac{1}{\#W} \sum_{w \in W} L(E, w)$. (More formally, and generally, one could work with probability measures with compact support contained in $P_k$ for all $k$ sufficiently large.) The core of the construction is contained in the following two lemmas.

Lemma 3.1. Let $B$ be an open ball in $C^0(X, \mathbb{R})$, let $W \subset P \cap B$ be a finite family of potentials, and let $M \ge 1$. Then there exists a sequence $W^n \subset P \cap B$ such that
(1) $L(E, \lambda W^n) > 0$ whenever $M^{-1} \le |\lambda| \le M$, $E \in \mathbb{R}$,
(2) $L(E, \lambda W^n) \to L(E, \lambda W)$ uniformly on compacts (as functions of $(E, \lambda) \in \mathbb{R}^2$).

Lemma 3.2. Let $B$ be an open ball in $C^0(X, \mathbb{R})$ and let $W \subset P \cap B$ be a finite family of potentials. Then for every $K$ sufficiently large, there exists $W^K \subset P_K \cap B$ such that
(1) $L(E, \lambda W^K) \to L(E, \lambda W)$ uniformly on compacts (as functions of $(E, \lambda) \in \mathbb{R}^2$).
(2) The diameter of $W^K$ is at most $n_K^{-10}$.
(3) For every $\lambda \in \mathbb{R}$, if $\inf_{E \in \mathbb{R}} L(E, \lambda W) \ge \delta\, \#W\, n_k$ then for every $w \in W^K$, $\Sigma(\lambda w)$ has Lebesgue measure at most $e^{-\delta n_K / 2}$.

Before proving the lemmas, let us conclude the proof of Theorem 1.1. First we combine both lemmas:

Lemma 3.3. Let $B \subset C^0(X, \mathbb{R})$ be an open ball and let $W \subset P \cap B$ be a finite family of potentials. Then for every $M \ge 1$, there exist $\delta > 0$, an open ball $B'$ with closure contained in $B$, with diameter at most $M^{-1}$ and $W' \subset P \cap B'$ such that
(1) $|L(E, \lambda W) - L(E, \lambda W')| < M^{-1}$ for $|E|, |\lambda| \le M$.
(2) $L(E, \lambda W') > \delta$ for every $M^{-1} \le |\lambda| \le M$ and $E \in \mathbb{R}$.
(3) For every $w \in B'$ and $M^{-1} \le |\lambda| \le M$ the Lebesgue measure of $\Sigma(\lambda w)$ is at most $M^{-1}$.

Proof. First apply Lemma 3.1 to find some $\tilde{W} \subset P \cap B$ such that $L(E, \lambda \tilde{W}) > 0$ for every $E \in \mathbb{R}$ and $M^{-1} \le |\lambda| \le M$ (it is easy to see that
\[
L(E, \lambda w) \ge 1 \quad\text{if}\quad |E| \ge \|\lambda w\| + 4, \tag{3.1}
\]
so this is really a statement about bounded energies which follows from Lemma 3.1), and $|L(E, \lambda \tilde{W}) - L(E, \lambda W)| < M^{-1}/4$ for every $|E|, |\lambda| \le M$. By continuity of the Lyapunov exponent for periodic potentials (Sect. 2.3.3) and compactness (and (3.1) to take care of large energies), we conclude that there exists $\delta > 0$ such that $L(E, \lambda \tilde{W}) > 2\delta$ for every $E \in \mathbb{R}$ and $M^{-1} \le |\lambda| \le M$. Let us now apply Lemma 3.2 to $W = \tilde{W}$ and let $W' = W^K$ for $K$ large. Then $W'$ is contained in a ball $B' \subset B$ with diameter $n_K^{-10} < M^{-1}$ centered around some $w' \in W'$. Both estimates on $L(E, \lambda W')$ are clear from the statement of Lemma 3.2 (using again (3.1) for large $|E|$). To estimate the measure of $\Sigma(\lambda w)$ for $w \in B'$, we notice that $\Sigma(\lambda w)$ is contained in a $M n_K^{-10}$ neighborhood of $\Sigma(\lambda w')$ (by 1-Lipschitz continuity of the spectrum, see Sect. 2.3.1). Using that $\Sigma(\lambda w')$ has at most $n_K$ connected components and has measure at most $e^{-\delta (\#\tilde{W} n_k)^{-1} n_K / 2}$, the result follows.

Given an open ball $B_0 \subset C^0(X, \mathbb{R})$ and $W_0 \subset P \cap B_0$, and $\epsilon_1 > 0$, we can proceed by induction, applying the previous lemma, to define, for every $i \ge 1$, open balls $B_i$ with $\overline{B_i} \subset B_{i-1}$, finite families of periodic potentials $W_i \subset P \cap B_i$, and constants $0 < \delta_i < 1$ and $\epsilon_{i+1} = \min\{\epsilon_i, \delta_i\}/10$ such that
(1) $L(E, \lambda W_i) \ge \delta_i$ for $E \in \mathbb{R}$ and $\epsilon_i \le |\lambda| \le \epsilon_i^{-1}$.
(2) $|L(E, \lambda W_i) - L(E, \lambda W_{i-1})| < \epsilon_i$ for $|E|, |\lambda| \le \epsilon_i^{-1}$.
(3) For every $w \in B_i$ and $\epsilon_i \le |\lambda| \le \epsilon_i^{-1}$, $\Sigma(\lambda w)$ has measure at most $\epsilon_i$.
Then the common element $w_\infty$ of all the $B_i$ is such that $\Sigma(\lambda w_\infty)$ has zero Lebesgue measure for every $\lambda \ne 0$. Notice that $L(E, \lambda W_i)$ converges uniformly on compacts to a continuous function, positive if $\lambda \ne 0$, which by general considerations must coincide with $L(E, \lambda w_\infty)$. Indeed, if $w_n \to w$ then $L(E, w_n) \to L(E, w)$ in $L^1_{\mathrm{loc}}$ by Lemma 2.5. So $L(E, \lambda w_\infty)$ coincides almost everywhere with $\lim L(E, \lambda W_i)$. Since $E \mapsto L(E, \lambda w_\infty)$ is the real part of a subharmonic function (see Remark 2.2) and $E \mapsto \lim L(E, \lambda W_i)$ is continuous, they coincide everywhere. Since $B_0$ was arbitrary, the denseness claim of Theorem 1.1 follows.

3.1. Proof of Lemma 3.1. Let $k$ be such that $W \subset P_k$. For every $K > k$, choose $N_1(K) > 0$ such that if $|E| \le K$, $|\lambda| \le K$, $w \in W$ and $w' \in P_K$ are such that $w'$ is $\frac{2 n_k + 1}{N_1(K)}$ close to $w$ then $|L(E, \lambda w') - L(E, \lambda w)| < \frac{1}{K}$. Here we use the continuity of the Lyapunov exponent for periodic potentials, see Sect. 2.3.3. For $w \in W$, $K > k$, $1 \le j \le 2 n_k + 1$, we define potentials $w^{K,j} \in P_K$ by
\[
w^{K,j}(i) = w(i),\ 0 \le i \le n_K - 2 \quad\text{and}\quad w^{K,j}(n_K - 1) = w(n_K - 1) + \frac{j}{N_1(K)}. \tag{3.2}
\]
(This uniquely defines $w^{K,j}$ by periodicity.)

Claim 3.4. For every $\lambda \ne 0$, $K > k$ there exists $1 \le j \le 2 n_k + 1$ such that $\Sigma(\lambda w^{K,j})$ has exactly $n_K$ components.

Proof. Recall that for every $w' \in P_m$, there exist exactly $2 n_m$ values of $E$ such that $\operatorname{tr} A^{(E,w')}_{n_m} = \pm 2$, if one counts the exceptional energies such that $A^{(E,w')}_{n_m} = \pm\,\mathrm{id}$ with multiplicity 2, see Sect. 2.3.4. For each $j$ such that $\Sigma(\lambda w^{K,j})$ does not have exactly $n_K$ components, there exists at least one energy $E_j \in \Sigma(\lambda w^{K,j})$ with $A^{(E_j, \lambda w^{K,j})}_{n_K} = \pm\,\mathrm{id}$. Then
\[
A^{(E_j, \lambda w)}_{n_K} = \pm \begin{pmatrix} 1 & \frac{-\lambda j}{N_1(K)} \\ 0 & 1 \end{pmatrix}. \tag{3.3}
\]
But since $w$ is $n_k$-periodic, this means that
\[
A^{(E_j, \lambda w)}_{n_k} = \pm \begin{pmatrix} 1 & \frac{-\lambda j n_k}{N_1(K)\, n_K} \\ 0 & 1 \end{pmatrix}. \tag{3.4}
\]
This implies, in particular, that $A^{(E_j, \lambda w)}_{n_k} \ne A^{(E_{j'}, \lambda w)}_{n_k}$ for $j \ne j'$, thus we must also have $E_j \ne E_{j'}$ for $j \ne j'$. But there can be at most $2 n_k$ values of $E$ such that $\operatorname{tr} A^{(E, \lambda w)}_{n_k} = \pm 2$. Thus there must be some $1 \le j \le 2 n_k + 1$ such that $\Sigma(\lambda w^{K,j})$ has exactly $n_K$ connected components.
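The mechanism behind (3.3) and (3.4) is that changing the potential at the last site of a period multiplies the transfer matrix on the left by a shear, and that a shear has an explicit root. A minimal sketch of these two facts follows; the helper names and the numerical values are illustrative, and the sketch only checks the factorization, not the sign convention obtained after moving the shear to the other side of the identity.

```python
def mat_mul(a, b):
    # Product of two 2x2 matrices stored as nested tuples.
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]))

def shear(s):
    # Upper triangular unipotent matrix [[1, s], [0, 1]].
    return ((1.0, s), (0.0, 1.0))

def transfer_matrix(E, values):
    # S_{n-1} ... S_0 with S_i = [[E - values[i], -1], [1, 0]].
    A = ((1.0, 0.0), (0.0, 1.0))
    for v in values:
        A = mat_mul(((E - v, -1.0), (1.0, 0.0)), A)
    return A

def max_entry_gap(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

def last_site_gap(E, lam, w, shift):
    # Compare the transfer matrix of lam * (w with its last value raised by `shift`)
    # with shear(-lam * shift) times the transfer matrix of lam * w.
    perturbed = [lam * v for v in w[:-1]] + [lam * (w[-1] + shift)]
    lhs = transfer_matrix(E, perturbed)
    rhs = mat_mul(shear(-lam * shift), transfer_matrix(E, [lam * v for v in w]))
    return max_entry_gap(lhs, rhs)

def shear_power_gap(s, r):
    # shear(s) raised to the r-th power should equal shear(r * s).
    P = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(r):
        P = mat_mul(shear(s), P)
    return max_entry_gap(P, shear(r * s))
```

Both `last_site_gap` and `shear_power_gap` return values at rounding level, which is the algebra that lets one pass from a statement over the long period $n_K$ to one over the short period $n_k$.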
By the previous claim and compactness, there exists $\delta = \delta(W, K, M) > 0$ such that for $w \in W$ and $M^{-1} \le |\lambda| \le M$, there exists $1 \le j = j(K, \lambda, w) \le 2 n_k + 1$ such that $\Sigma(\lambda w^{K,j})$ has $n_K$ components and the measure of the smallest gap is at least δ. Choose an integer $N_2(K)$ with $N_2(K) > \frac{4\pi M}{\delta n_K}$. For $0 \le l \le N_2(K)$ and $w$ as above, let $w^{K,j,l} \in P_K$ be given by $w^{K,j,l} = w^{K,j} + \frac{4\pi M l}{n_K N_2(K)}$.

Claim 3.5. For every $M^{-1} \le |\lambda| \le M$, $w \in W$, $K > k$,
\[
\bigcap_{0 \le l \le N_2(K)} \Sigma\bigl(\lambda w^{K, j(K,\lambda,w), l}\bigr) = \emptyset. \tag{3.5}
\]

Proof. Each of the connected components of $\Sigma(\lambda w^{K,j})$ has measure at most $\frac{2\pi}{n_K}$, see Lemma 2.4. Since $N_2(K) > \frac{4\pi M}{\delta n_K}$, for every $E$ there exists at least some $l$ with $0 \le l \le N_2(K)$ such that $E - \lambda \frac{4\pi M l}{n_K N_2(K)} \notin \Sigma(\lambda w^{K,j})$, that is, $E \notin \Sigma(\lambda w^{K,j,l})$, which gives the result.
Let $W^K$ be the family obtained by collecting the $w^{K,j,l}$ for different $w \in W$, $1 \le j \le 2n_k + 1$ and $0 \le l \le N_2(K)$. By the second claim, $L(E, \lambda W^K) > 0$ for every $M^{-1} \le |\lambda| \le M$ and $E \in \mathbb{R}$ (since $L(E, \lambda w) > 0$ if $E \notin \Sigma(\lambda w)$, see Sect. 2.3.4). To conclude, it is enough to show that
$$\max_{1 \le j \le 2n_k+1} \; \max_{0 \le l \le N_2(K)} |L(E, \lambda w^{K,j,l}) - L(E, \lambda w)| \to 0 \tag{3.6}$$
uniformly on compacts of $(E, \lambda) \in \mathbb{R}^2$. Write
$$|L(E, \lambda w^{K,j,l}) - L(E, \lambda w)| \le \left|L(E, \lambda w^{K,j,l}) - L\!\left(E - \lambda \tfrac{4\pi M l}{n_K N_2(K)}, \lambda w\right)\right| + \left|L\!\left(E - \lambda \tfrac{4\pi M l}{n_K N_2(K)}, \lambda w\right) - L(E, \lambda w)\right|. \tag{3.7}$$
Then the first term in the right-hand side is smaller than $K^{-1}$ provided $K \ge |E| + 4\pi M^2$ (by the choice of $N_1(K)$), while the second term in the right-hand side is bounded by
$$\max_{w \in W} \; \sup_{|t| \le \frac{4\pi M^2}{n_K}} |L(E + t, \lambda w) - L(E, \lambda w)|,$$
which converges to zero uniformly on compacts of $(E, \lambda) \in \mathbb{R}^2$ as $K \to \infty$ (by continuity of the Lyapunov exponent for periodic potentials, Sect. 2.3.3).
3.2. Proof of Lemma 3.2. Assume that $W \subset P_k$, $n_k \ge 2$, and let $K > k$ be large. Order the elements $w^1, \ldots, w^m$ of $W$. Let $r = [n_K / m n_k]$. First consider a potential $w \in P_K$ obtained as follows. It is enough to define $w(l)$ for $0 \le l \le n_K - 1$. Let $I_j = [j n_k, (j+1) n_k - 1] \subset \mathbb{Z}$ and let $0 = j_0 < j_1 < \ldots < j_{m-1} < j_m = n_K / n_k$ be a sequence such that $j_{i+1} - j_i - r \in \{0, 1\}$. Given $0 \le l \le n_K - 1$, let $j$ be such that $l \in I_j$, let $i$ be such that $j_{i-1} \le j < j_i$ and let $w(l) = w^i(l)$.

For any sequence $t = (t_1, \ldots, t_m)$ with $t_i \in \{0, \ldots, r-1\}$, let $w^t \in P_K$ be the potential defined as follows. Let $0 \le l \le n_K - 1$, and let $j$ be such that $l \in I_j$. If $j = j_i - 1$ for some $1 \le i \le m$, we let $w^t(l) = w(l) + r^{-20} t_i$. Otherwise we let $w^t(l) = w(l)$. Let $W^K$ be the family consisting of all the $w^t$. The claimed diameter estimate is obvious for large $K$.

Let us show that $L(E, \lambda W^K) \to L(E, \lambda W)$ uniformly on compacts. It is enough to restrict ourselves to compact subsets of $(E, \lambda) \in \mathbb{R} \times (\mathbb{R} \setminus \{0\})$, since it is easy to see that $L(E, \lambda w) - L(E, 0) \to 0$ uniformly as $\lambda w \to 0$. For fixed $E$ and $\lambda$, we write
$$A^{(E, \lambda w^t)}_{n_K} = C^{(t_m, m)} B^{(m)} \cdots C^{(t_1, 1)} B^{(1)}, \tag{3.8}$$
where $C^{(t_i, i)} = A^{(E - \lambda r^{-20} t_i, \lambda w^i)}_{n_k}$ and $B^{(i)} = (A^{(E, \lambda w^i)}_{n_k})^{j_i - j_{i-1} - 1}$. Notice that, for $E$ and $\lambda$ in a compact set, the norm of the $C^{(t_i, i)}$-type matrices stays bounded as $r$ grows, while the $B^{(i)}$ matrices may get large.

Find some cutoff $(\ln \ln r)^{-m} \le c \le (\ln \ln \ln r)^m / (\ln \ln r)^m$ such that if $\|B^{(i)}\| < e^{cr}$ then $\|B^{(i)}\| < e^{(\ln \ln \ln r)^{-1} c r} < e^{(\ln \ln \ln r)^{-1} c n_K}$. To see that this is possible, notice that the union of the $m$ intervals $(\ln \ln \|B^{(i)}\| - \ln r, \; \ln \ln \|B^{(i)}\| - \ln r + \ln \ln \ln \ln r]$, $1 \le i \le m$, must omit at least one point in $[-m \ln \ln \ln r, \; -m \ln \ln \ln r + m \ln \ln \ln \ln r]$, which can be taken as $\ln c$.

Call $i$ good if $\|B^{(i)}\| \ge e^{cr}$. If no $B^{(i)}$ is good, then $L(E, \lambda W) \le c \frac{r}{r-1}$ and $L(E, \lambda W^K) \le c \frac{rm}{n_K} + O(1/r)$. In particular $L(E, \lambda W^K)$ and $L(E, \lambda W)$ are close, since $c = o(1)$ with respect to $r$. So we can assume that there exists at least one good $B^{(i)}$.

Let $i_1 < \ldots < i_d$ be the list of all good $i$. Write $A^{(E, \lambda w^t)}_{n_K}(0) = \hat{C}^{(d)} \hat{B}^{(d)} \cdots \hat{C}^{(1)} \hat{B}^{(1)}$, where for $1 \le j \le d$ we let $\hat{C}^{(j)} = C^{(t_{i_j}, i_j)}$ and $\hat{B}^{(j)} = B^{(i_j)} D^{(j)}$, where we denote $D^{(j)} = C^{(t_{i_j - 1}, i_j - 1)} B^{(i_j - 1)} \cdots C^{(t_{i_{j-1}+1}, i_{j-1}+1)} B^{(i_{j-1}+1)}$ (denoting also $i_0 = 0$). By the choice of the cutoff, we have $\|D^{(j)}\| \le e^{cr/2}$ for $r$ large (uniformly on compacts of $(E, \lambda) \in \mathbb{R}^2$), so $\|\hat{B}^{(j)}\| \ge e^{cr/2}$.
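The construction of the block boundaries $j_i$ and of the perturbed potentials $w^t$ at the start of this proof can be sketched as follows. This is only an illustration under stated assumptions: each $n_k$-periodic potential in $W$ is represented by one period (a list of length $n_k$), $n_k$ is assumed to divide $n_K$, the extra blocks of length $r+1$ are placed first (the text allows any admissible placement), and the function names `block_boundaries` and `build_wt` are hypothetical. Exact rationals are used because $r^{-20}$ is far below floating-point resolution relative to typical potential values.

```python
from fractions import Fraction

def block_boundaries(q, m):
    """Return j_0 = 0 < j_1 < ... < j_m = q with j_{i+1} - j_i - r in {0, 1},
    where r = q // m (assumes q >= m, so every block is non-empty)."""
    r, extra = divmod(q, m)
    js = [0]
    for i in range(m):
        # The first `extra` blocks get length r + 1, the rest length r.
        js.append(js[-1] + r + (1 if i < extra else 0))
    return js

def build_wt(W, n_k, n_K, t):
    """Values w^t(0), ..., w^t(n_K - 1) for the family W and the sequence t."""
    m = len(W)
    q = n_K // n_k                  # number of length-n_k intervals I_j
    r = q // m                      # r = [n_K / (m n_k)]
    js = block_boundaries(q, m)
    shift = Fraction(1, r ** 20)    # the factor r^{-20}
    values = []
    for l in range(n_K):
        j = l // n_k                # l lies in I_j
        i = next(i for i in range(1, m + 1) if js[i - 1] <= j < js[i])
        value = Fraction(W[i - 1][l % n_k])     # w(l) = w^i(l)
        if j == js[i] - 1:          # last interval of the i-th block
            value += shift * t[i - 1]
        values.append(value)
    return values

# Toy data: two potentials of period 2, n_K = 12, so q = 6 and r = 3.
W = [[0, 1], [2, 3]]
wt = build_wt(W, n_k=2, n_K=12, t=(1, 2))
print(block_boundaries(6, 2))
print([float(x) for x in wt])
```

In this toy case only the intervals $I_2$ and $I_5$ are perturbed, which matches the role of the matrices $C^{(t_i, i)}$ in (3.8): one perturbed period at the end of each block, preceded by $j_i - j_{i-1} - 1$ unperturbed periods.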
Claim 3.6. As $r$ grows,
$$\frac{1}{n_K} \sum_{j=1}^{d} \ln \|\hat{B}^{(j)}\| \to L(E, \lambda W) \tag{3.9}$$
uniformly on compacts of $E$ and $\lambda$.

Proof. Notice that this is equivalent to showing that
$$\frac{1}{n_K} \sum_{i=1}^{m} \ln \|B^{(i)}\| \to L(E, \lambda W) \tag{3.10}$$
(uniformly), which in turn is equivalent to
$$\frac{1}{m} \sum_{i=1}^{m} \frac{1}{n_k (j_i - j_{i-1} - 1)} \ln \|B^{(i)}\| \to L(E, \lambda W) \tag{3.11}$$
(uniformly). Thus it is enough to show that
$$\frac{1}{n_k (j_i - j_{i-1} - 1)} \ln \|B^{(i)}\| \to L(E, \lambda w^i) \tag{3.12}$$
(uniformly). But $B^{(i)}$ is just the $j_i - j_{i-1} - 1$ iterate of the matrix $A^{(E, \lambda w^i)}_{n_k}$, whose spectral radius is precisely the exponential of $n_k L(E, \lambda w^i)$. But it is easy to see that $\|T^n\|^{1/n}$ converges to the spectral radius of $T$ uniformly on compacts of $T \in SL(2, \mathbb{R})$. This gives (3.12) and the result.
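The last step of this proof uses the convergence of $\|T^n\|^{1/n}$ to the spectral radius of $T$. A small numerical illustration for one hyperbolic matrix in $SL(2, \mathbb{R})$ is sketched below; the sample matrix and the helper names (`op_norm`, `spectral_radius`, `growth_rate`) are assumptions made for this example only, and a single matrix of course does not demonstrate the uniformity on compacts that the proof needs.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(a):
    """Operator (largest singular value) norm of a real 2x2 matrix."""
    frob2 = sum(x * x for row in a for x in row)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)
    return math.sqrt((frob2 + math.sqrt(disc)) / 2.0)

def spectral_radius(a):
    """Largest modulus of an eigenvalue of a real 2x2 matrix."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        root = math.sqrt(disc)
        return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))
    return math.sqrt(det)   # complex conjugate pair: |lambda|^2 = det

def growth_rate(a, n):
    """The quantity ||a^n||^(1/n) for n >= 1."""
    power = a
    for _ in range(n - 1):
        power = matmul(power, a)
    return op_norm(power) ** (1.0 / n)

# Sample matrix with determinant 1 and trace 4 (hyperbolic).
T = [[3.0, 1.0], [2.0, 1.0]]
rho = spectral_radius(T)
for n in (1, 5, 20, 40):
    print(n, growth_rate(T, n) - rho)
```

The printed differences shrink as $n$ grows, in line with the statement used to obtain (3.12).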
For every $t$, we have the obvious upper bound
$$L(E, \lambda w^t) \le \frac{1}{n_K} \sum_{j=1}^{d} \ln \|\hat{B}^{(j)}\| + O(1/r), \tag{3.13}$$
and we will now be concerned with bounding $L(E, \lambda w^t)$ from below, not for all $t$, but for a majority of them. Let $s_j$ be the most contracted direction of $\hat{B}^{(j)}$ and let $u_j$ be the image under $\hat{B}^{(j)}$ of the most expanded direction. Let us say that $t$ is $j$-nice, $1 \le j \le d$, if the absolute value of the angle between $\hat{C}^{(j)} u_j$ and $s_{j+1}$ is at least $r^{-70}$ (with the convention that $j + 1 = 1$ for $j = d$).

Claim 3.7. Let $r$ be sufficiently large, and let $t$ be $j$-nice. If $z$ is a non-zero vector making an angle at least $r^{-80}$ with $s_j$, then $z' = \hat{C}^{(j)} \hat{B}^{(j)} z$ makes an angle at least $r^{-80}$ with $s_{j+1}$ and $\|z'\| \ge \|\hat{B}^{(j)}\| r^{-100} \|z\|$.

Proof. Let $0 \le \theta \le \pi/2$ be the angle between $z$ and $s_j$, and let $0 \le \theta' \le \pi/2$ be the angle between $z'' = \hat{B}^{(j)} z$ and $u_j$. The orthogonal projection of $z''$ on $u_j$ has norm $\|z\| \|\hat{B}^{(j)}\| \sin \theta$. Since $\hat{C}^{(j)}$ stays bounded as $r$ grows, we conclude that $\|z'\| \ge \|\hat{B}^{(j)}\| r^{-100} \|z\|$. On the other hand, $\tan \theta \tan \theta' = \|\hat{B}^{(j)}\|^{-2}$. Since $\|\hat{B}^{(j)}\| \ge e^{cr/2} \ge r^{400}$ for $r$ large, it follows that $\theta' < r^{-100}$. The boundedness of $\hat{C}^{(j)}$ again implies that the angle between $z'$ and $\hat{C}^{(j)} u_j$ is at most $r^{-90}$. Since $t$ is $j$-nice, $z'$ makes an angle at least $r^{-80}$ with $s_{j+1}$.
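The identity $\tan \theta \tan \theta' = \|\hat{B}^{(j)}\|^{-2}$ and the projection formula used in the proof of Claim 3.7 can be checked numerically in the model case of a diagonal matrix $\operatorname{diag}(\mu, \mu^{-1})$ with $\mu > 1$. Reducing to this case is an assumption of the sketch (a matrix in $SL(2, \mathbb{R})$ has this form up to rotations on both sides); the function names and sample values are illustrative.

```python
import math

def angle_between_lines(x, y):
    """Angle in [0, pi/2] between the lines spanned by plane vectors x and y."""
    cross = abs(x[0] * y[1] - x[1] * y[0])
    inner = abs(x[0] * y[0] + x[1] * y[1])
    return math.atan2(cross, inner)

def claim_3_7_quantities(mu, theta):
    """Model case B = diag(mu, 1/mu), mu > 1: the most contracted direction is
    s = (0, 1) and the image of the most expanded direction is along u = (1, 0)."""
    s = (0.0, 1.0)
    u = (1.0, 0.0)
    z = (math.sin(theta), math.cos(theta))   # unit vector at angle theta from s
    bz = (mu * z[0], z[1] / mu)              # the vector B z
    theta_check = angle_between_lines(z, s)
    theta_prime = angle_between_lines(bz, u)
    projection = abs(bz[0])                  # norm of the projection of B z on u
    return theta_check, theta_prime, projection

mu, theta = 50.0, 0.01
theta_check, theta_prime, projection = claim_3_7_quantities(mu, theta)
print(math.tan(theta) * math.tan(theta_prime), mu ** -2)
print(projection, mu * math.sin(theta))
```

Both printed pairs agree up to rounding: the first is the product of tangents against $\|B\|^{-2}$, the second is the projection norm against $\|z\| \|B\| \sin \theta$ with $\|z\| = 1$.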
It follows that if $t$ is very nice in the sense that it is $j$-nice for every $1 \le j \le d$, then if $z$ is a non-zero vector making an angle at least $r^{-80}$ with $s_1$ then $z' = A^{(E, \lambda w^t)}_{n_K} z$ also makes an angle at least $r^{-80}$ with $s_1$, and moreover $\|z'\| / \|z\| \ge \prod_{j=1}^{d} r^{-100} \|\hat{B}^{(j)}\|$. By (3.9) and (3.13), it follows that $L(E, \lambda w^t) - L(E, \lambda W) \to 0$ as $r$ grows, at least for very nice $t$. To conclude the estimate on the Lyapunov exponent, it is thus enough to show that most $t$ are nice, in the sense that for every $\epsilon > 0$, for every $r$ sufficiently large, the set of $t \in \{0, \ldots, r-1\}^m$ which are not very nice has at most $\epsilon r^m$ elements. A more precise estimate is provided below.
Claim 3.8. For every $r$ sufficiently large, the set of $t$ which are not very nice has at most $m r^{m-1}$ elements.

Proof. We will show in fact that, for every $1 \le j \le d$, if for every $1 \le k \le m$ with $k \ne i_j$ one chooses $t_k \in \{0, \ldots, r-1\}$, there exists at most one "exceptional" $t_{i_j} \in \{0, \ldots, r-1\}$ such that $t = (t_1, \ldots, t_m)$ is not $j$-nice. Thus the set of $t$ which are not $j$-nice has at most $r^{m-1}$ elements and the estimate follows.

Once $t_k$ is fixed for $1 \le k \le m$ with $k \ne i_j$, both $u_j$ and $s_{j+1}$ become determined, but $\hat{C}^{(j)} = C^{(t_{i_j}, i_j)} = A^{(E - \lambda r^{-20} t_{i_j}, \lambda w^{i_j})}_{n_k}$ depends on $t_{i_j}$. Since $n_k \ge 2$, we can apply Lemma 2.3 to conclude that for any non-zero vector $z \in \mathbb{R}^2$, the derivative of the argument of the vector $A^{(E', \lambda w^{i_j})}_{n_k} z$ as a function of $E'$ is strictly negative, and hence bounded away from zero and infinity, uniformly on $z$ and on compacts of $(E', \lambda) \in \mathbb{R}^2$, and independently of $r$. If $r$ is sufficiently large, we conclude that for every $0 \le l \le r - 2$, there exists a rotation $R_l$ of angle $\theta$ with $r^{-21} < \theta < r^{-19}$ such that $C^{(l+1, i_j)} u_j = R_l C^{(l, i_j)} u_j$. It immediately follows that there exists at most one choice of $0 \le t_{i_j} \le r - 1$ such that $C^{(t_{i_j}, i_j)} u_j$ has angle at most $r^{-90}$ with $s_{j+1}$, as desired.
We now estimate the measure of the spectrum. Let $w^i \in W$ be such that $L(E, \lambda w^i) \ge \delta n_k m$. Then $\|A^{(E, \lambda w^t)}_{(r-1) n_k}(j_{i-1} n_k)\| \ge e^{\delta m (r-1) n_k^2}$. Since $E$ is arbitrary, we can apply Lemma 2.4 to conclude that the measure of the spectrum is at most $4\pi n_K e^{-\delta m (r-1) n_k} \le e^{-\delta n_K / 2}$ for $r$ large. The result follows. □
Acknowledgements. Conjecture 8.7 of [S] was brought to the attention of the author by Svetlana Jitomirskaya. This work was carried out during visits to Caltech and UC Irvine. This research was partially conducted during the period the author served as a Clay Research Fellow. We are grateful to the referee for several suggestions which led to significant changes in the presentation.
References
[AD1] Avila, A., Damanik, D.: Generic singular spectrum for ergodic Schrödinger operators. Duke Math. J. 130, 393–400 (2005)
[AD2] Avila, A., Damanik, D.: Absolute continuity of the integrated density of states for the almost Mathieu operator with non-critical coupling. Invent. Math. 172, 439–453 (2008)
[AMS] Avron, J., van Mouche, P., Simon, B.: On the measure of the spectrum of the almost Mathieu operator. Commun. Math. Phys. 132, 117–142 (1990)
[AS] Avron, J., Simon, B.: Almost periodic Schrödinger operators. I. Limit periodic potentials. Commun. Math. Phys. 82, 101–120 (1982)
[S] Simon, B.: Equilibrium measures and capacities in spectral theory. Inverse Probl. Imaging 1(4), 713–772 (2007)
Communicated by B. Simon
Commun. Math. Phys. 288, 919–942 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0745-0
Communications in
Mathematical Physics
Isometric Embeddings into the Minkowski Space and New Quasi-Local Mass Mu-Tao Wang1 , Shing-Tung Yau2 1 Department of Mathematics, Columbia University, New York,
NY 10027, USA. E-mail:
[email protected]
2 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
Received: 17 May 2008 / Accepted: 17 October 2008 Published online: 27 February 2009 – © The Author(s) 2009
Abstract: The definition of quasi-local mass for a bounded space-like region $\Omega$ in space-time is essential in several major unsettled problems in general relativity. The quasi-local mass is expected to be a type of flux integral on the boundary two-surface $\Sigma = \partial\Omega$ and should be independent of whichever space-like region $\Sigma$ bounds. An important idea which is related to the Hamiltonian formulation of general relativity is to consider a reference surface in a flat ambient space with the same first fundamental form and derive the quasi-local mass from the difference of the extrinsic geometries. This approach has been taken by Brown-York [4,5] and Liu-Yau [16,17] (see also related works [3,6,9,12,14,15,28,32]) to define such notions using the isometric embedding theorem into the Euclidean three space. However, there exist surfaces in the Minkowski space whose quasi-local mass is strictly positive [19]. It appears that the momentum information needs to be accounted for to reconcile the difference. In order to fully capture this information, we use isometric embeddings into the Minkowski space as references. In this article, we first prove an existence and uniqueness theorem for such isometric embeddings. We then solve the boundary value problem for Jang's [13] equation as a procedure to recognize such a surface in the Minkowski space. In doing so, we discover a new expression of quasi-local mass for a large class of "admissible" surfaces (see Theorem A and Remark 1.1). The new mass is positive when the ambient space-time satisfies the dominant energy condition and vanishes on surfaces in the Minkowski space. It also has the nice asymptotic behavior at spatial infinity and null infinity. Some of these results were announced in [29].

© 2009 The authors. Reproduction of this article for non-commercial purposes by any means is permitted.

1. Introduction

1.1. Dominant energy condition and positive mass theorem. Let $N$ be a space-time, i.e. a four-manifold with a Lorentzian metric $g_{\alpha\beta}$ of signature $(- + + +)$ that satisfies the
Einstein equation:
$$R_{\alpha\beta} - \frac{s}{2} g_{\alpha\beta} = 8\pi G T_{\alpha\beta},$$
where $R_{\alpha\beta}$ and $s$ are the Ricci curvature and the Ricci scalar curvature of $g_{\alpha\beta}$, respectively. $G$ is the gravitational constant and $T_{\alpha\beta}$ is the energy-momentum tensor of matter density. The metric $g_{\alpha\beta}$ defines space-like, time-like and null vectors on the tangent space of $N$ accordingly. A "dominant energy condition", which corresponds to a positivity condition on the matter density $T_{\alpha\beta}$, is expected to be satisfied on any realistic space-time. It means the following: for any time-like vector $e_0$, $T(e_0, e_0) \ge 0$ and $T(e_0, \cdot)$ is a non-space-like covector. We shall assume throughout this article that the space-time $N$ satisfies the dominant energy condition.

Consider a space-like hypersurface $(M, g_{ij}, p_{ij})$ in $N$ where $g_{ij}$ is the induced (Riemannian) metric and $p_{ij}$ is the second fundamental form with respect to the future-directed time-like unit normal vector field of $M$. The dominant energy condition together with the compatibility conditions for submanifolds imply
$$\mu \ge |J|, \tag{1.1}$$
where
$$\mu = \frac{1}{2}\left(R - p_{ij} p^{ij} + (p_k^k)^2\right) \quad \text{and} \quad J^i = D_j\left(p^{ij} - p_k^k g^{ij}\right).$$
Here $R$ is the scalar curvature of $M$. An important special case is when $p_{ij} = 0$ (time-symmetric case) and the dominant energy condition implies that the scalar curvature of $M$ is non-negative. The positive mass theorem proved by Schoen-Yau [22–24] (later a different proof by Witten [30]) states:

Theorem 1.1. Let $(M, g_{ij}, p_{ij})$ be a complete three manifold that satisfies (1.1). Suppose $M$ is asymptotically flat: i.e. there exists a compact set $K \subset M$ such that $M \setminus K$ is diffeomorphic to a union of complements of balls in $\mathbb{R}^3$ (called ends) such that $g_{ij} = \delta_{ij} + a_{ij}$ with $a_{ij} = O(\frac{1}{r})$, $\partial_k(a_{ij}) = O(\frac{1}{r^2})$, $\partial_l \partial_k(a_{ij}) = O(\frac{1}{r^3})$, and $p_{ij} = O(\frac{1}{r^2})$, $\partial_k(p_{ij}) = O(\frac{1}{r^3})$ on each end of $M \setminus K$. Then the ADM mass (Arnowitt-Deser-Misner) of each end of $M$ is positive, i.e.
$$E \ge |P|, \tag{1.2}$$
where
$$E = \lim_{r \to \infty} \frac{1}{16\pi G} \int_{S_r} (\partial_j g_{ij} - \partial_i g_{jj}) \, d\Omega^i$$
is the total energy and
$$P_k = \lim_{r \to \infty} \frac{1}{16\pi G} \int_{S_r} 2 (p_{ik} - \delta_{ik} p_{jj}) \, d\Omega^i$$
is the total momentum. Here $S_r$ is a coordinate sphere of radius $r$ on an end.

We notice that the conclusion of the theorem is equivalent to the four-vector $(E, P_1, P_2, P_3)$ being future-directed time-like, i.e.
$$E \ge 0 \quad \text{and} \quad -E^2 + P_1^2 + P_2^2 + P_3^2 \le 0.$$
The asymptotically flat condition can be considered a gauge condition to assure that $M$ can be compared to the flat space $\mathbb{R}^3$. The essence of the positive mass theorem is that positive local matter density (1.1) measured pointwise should imply positive total energy momentum (1.2) measured at infinity. In contrast, the "quasi-local mass" corresponds to the measurement of mass of in-between scales.

1.2. Two-surfaces in space-time and quasi-local notion of mass. Let $N$ be a time-oriented space-time. Denote the Lorentzian metric on $N$ by $\langle \cdot, \cdot \rangle$ and covariant derivative by $\nabla^N$. Let $\Sigma$ be a closed space-like two-surface embedded in $N$. Denote the induced Riemannian metric on $\Sigma$ by $\sigma$ and the gradient and Laplace operator of $\sigma$ by $\nabla$ and $\Delta$, respectively.

Given any two tangent vectors $X$ and $Y$ of $\Sigma$, the second fundamental form of $\Sigma$ in $N$ is given by $\mathrm{II}(X, Y) = (\nabla^N_X Y)^\perp$, where $(\cdot)^\perp$ denotes the projection onto the normal bundle of $\Sigma$. The mean curvature vector is the trace of the second fundamental form, or $H = \operatorname{tr}_\Sigma \mathrm{II} = \sum_{a=1}^{2} \mathrm{II}(e_a, e_a)$, where $\{e_1, e_2\}$ is an orthonormal basis of the tangent bundle of $\Sigma$.

The normal bundle is of rank two with structure group $SO(1, 1)$ and the induced metric on the normal bundle is of signature $(-, +)$. Since the Lie algebra of $SO(1, 1)$ is isomorphic to $\mathbb{R}$, the connection form of the normal bundle is a genuine 1-form that depends on the choice of the normal frames. The curvature of the normal bundle is then given by an exact 2-form which reflects the fact that any $SO(1, 1)$ bundle is topologically trivial. Connections of different choices of normal frames differ by an exact form. We define

Definition 1.1. Let $e_3$ be a space-like unit normal along $\Sigma$; the connection form determined by $e_3$ is defined to be
$$\alpha_{e_3}(X) = \langle \nabla^N_X e_3, e_4 \rangle, \tag{1.3}$$
where $e_4$ is the future-directed time-like unit normal that is orthogonal to $e_3$. When $\Sigma$ bounds a space-like hypersurface $\Omega$ with $\partial\Omega = \Sigma$, we choose $e_3$ to be the space-like outward unit normal with respect to $\Omega$. The connection form is then denoted by $\alpha_\Omega$.

Suppose $\Sigma$ bounds a space-like hypersurface $\Omega$ in $N$; the definition of quasi-local mass $m$ asks that (see [7,8]):
(1) $m \ge 0$ under the dominant energy condition.
(2) $m = 0$ if and only if $\Sigma$ is in the Minkowski spacetime.
(3) The limit of $m$ on large coordinate spheres of asymptotically flat (null) hypersurfaces should approach the ADM (Bondi) mass.
The quasi-local mass is supposed to be closely related to the formation of black holes according to the hoop conjecture of Thorne. Various definitions for the quasi-local mass have been proposed (see for example the review article by Szabados [27]). In this article, we shall focus on quasi-local mass defined by the following comparison principle: anchor the intrinsic geometry (the induced metric) by isometric embeddings and compare other extrinsic geometries. An important feature that we expect is that the definition should be a flux type integral on $\Sigma$ and it should depend only on the fact that $\Sigma$ bounds a space-like hypersurface $\Omega$, but does not depend on which specific $\Omega$ it bounds.
1.3. Prior results. We recall the solution of Weyl's isometric embedding problem by Nirenberg [18] and, independently, Pogorelov [21]:

Theorem 1.2. Let $\Sigma$ be a closed surface with a Riemannian metric of positive Gauss curvature. Then there exists an isometric embedding $i : \Sigma \to \mathbb{R}^3$ that is unique up to Euclidean rigid motions.

In particular, the mean curvature of the isometric embedding is uniquely determined by the metric. Through a Hamilton-Jacobi analysis of Einstein's action, Brown and York [4,5] introduced

Definition 1.2. Suppose a two-surface $\Sigma$ bounds a space-like region $\Omega$ in a space-time $N$. Let $k$ be the mean curvature of $\Sigma$ with respect to the outward normal of $\Omega$. Assume the induced metric on $\Sigma$ has positive Gauss curvature and denote by $k_0$ the mean curvature of the isometric embedding of $\Sigma$ into $\mathbb{R}^3$. The Brown-York mass is defined to be:
$$\frac{1}{8\pi G} \int_\Sigma (k_0 - k).$$

Liu and Yau [16,17] (see also Kijowski [14]) defined

Definition 1.3. Suppose $\Sigma$ is an embedded two-surface that bounds a space-like region in a space-time $N$. Assume $\Sigma$ has positive Gauss curvature. The Liu-Yau mass is defined to be
$$\frac{1}{8\pi G} \int_\Sigma (k_0 - |H|),$$
where $|H|$ is the Lorentzian norm of the mean curvature vector.

The Brown-York and Liu-Yau mass are proved to be positive by Shi-Tam [26] in the time-symmetric case, and Liu-Yau [16,17], respectively.

Theorem 1.3 [26]. Suppose $\Omega$ has non-negative scalar curvature and $k > 0$. Then the Brown-York mass of $\Sigma$ is non-negative and it equals zero if and only if $\Omega$ is flat.

Theorem 1.4 [16,17]. Suppose $N$ satisfies the dominant energy condition and the mean curvature vector of $\Sigma$ is space-like. The Liu-Yau mass is non-negative and it equals zero only if $N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$.

However, Ó Murchadha, Szabados, and Tod [19] found examples of surfaces in the Minkowski space which satisfy the assumptions but whose Liu-Yau mass, as well as Brown-York mass, are strictly positive.
It seems that the missing momentum information $p_{ij}$ is responsible for this inconsistency: the Euclidean space can be considered as a totally geodesic space-like hypersurface in the Minkowski space with the second fundamental form $p_{ij} = 0$, and in both the Brown-York and Liu-Yau case the reference is taken to be the isometric embedding into $\mathbb{R}^3$. In order to capture the information of $p_{ij}$, we need to take the reference surface to be a general isometric embedding into the Minkowski space. However, an intrinsic difficulty for this embedding problem is that there are four unknowns (the coordinate functions in $\mathbb{R}^{3,1}$) but only three equations (for the first fundamental form). An ellipticity condition in replacement of the positive Gauss curvature condition is also needed to guarantee the uniqueness of the solution. We are able to achieve these in this article, and indeed, the extra unknown (corresponding to the time function) allows us to identify a canonical gauge in the physical space $N$ and define a quasi-local mass expression. We refer to our paper [29] in which this expression was derived from the more physical point of view, i.e. the Hamilton-Jacobi analysis of the gravitational action.

1.4. Results and organization. We first state the key comparison theorem:

Theorem A. Let $N$ be a space-time that satisfies the dominant energy condition. Suppose $i : \Sigma \to N$ is a closed embedded space-like two-surface in $N$ with space-like mean curvature vector $H$. Let $i_0 : \Sigma \to \mathbb{R}^{3,1}$ be an isometric embedding into the Minkowski space and let $\tau$ denote the restriction of the time function $t$ on $i_0(\Sigma)$. Let $\bar{e}_4$ be the future-directed time-like unit normal along $i(\Sigma)$ such that
$$\langle H, \bar{e}_4 \rangle = \frac{-\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}}$$
and $\bar{e}_3$ be the space-like unit normal along $i(\Sigma)$ with $\langle \bar{e}_3, \bar{e}_4 \rangle = 0$ and $\langle H, \bar{e}_3 \rangle < 0$. Let $\hat{\Sigma}$ be the projection of $i_0(\Sigma)$ onto $\mathbb{R}^3 = \{t = 0\} \subset \mathbb{R}^{3,1}$ and $\hat{k}$ be the mean curvature of $\hat{\Sigma}$ in $\mathbb{R}^3$. If $\tau$ is admissible (see Definition 5.1), then
$$\int_{\hat{\Sigma}} \hat{k} \, dv_{\hat{\Sigma}} - \int_\Sigma \left[ -\sqrt{1 + |\nabla\tau|^2} \, \langle H, \bar{e}_3 \rangle - \alpha_{\bar{e}_3}(\nabla\tau) \right] dv_\Sigma \tag{1.4}$$
is non-negative.

Indeed, we show
$$\int_{\hat{\Sigma}} \hat{k} \, dv_{\hat{\Sigma}} = \int_\Sigma \left[ -\sqrt{1 + |\nabla\tau|^2} \, \langle H_0, \breve{e}_3 \rangle - \alpha_{\breve{e}_3}(\nabla\tau) \right] dv_\Sigma \tag{1.5}$$
(see Eq. (3.4)), where $H_0$ is the mean curvature vector of $i_0(\Sigma)$ in $\mathbb{R}^{3,1}$ and $\breve{e}_3$ is the space-like unit normal along $i_0(\Sigma)$ in $\mathbb{R}^{3,1}$ that is orthogonal to the time direction. The expression (1.4) naturally arises as the surface term in the Hamiltonian of gravitational action (see Remark 2.1). When the reference isometric embedding lies in an $\mathbb{R}^3$ with $\tau = 0$, it recovers the Liu-Yau mass.

Remark 1.1. If the Gauss curvature of $\Sigma$ is positive, an isometric embedding into an $\mathbb{R}^3$ with $\tau = 0$ is admissible (see Corollary 5.3 and the preceding remark). In general, when the Gauss curvature is close to being positive, an isometric embedding with small enough $\tau$ is admissible.

Remark 1.2. We learned the expression in (1.5) from Gibbons' paper [10]. Indeed, we are motivated by [10] to study the projection of a space-like two-surface in the Minkowski space.

The new quasi-local mass is defined to be the infimum of the expression (1.4) over all such isometric embeddings (see Definition 5.2). We prove that such embeddings are parametrized by the admissible $\tau$.

Theorem B. Given a metric $\sigma$ and a function $\tau$ on $S^2$ such that the condition (3.1) holds, there exists a unique space-like isometric embedding $i_0 : S^2 \to \mathbb{R}^{3,1}$ with the induced metric $\sigma$ and the function $\tau$ as the time function.
In Sect. 2, we study the expression $-\sqrt{1 + |\nabla\tau|^2} \, \langle H, e_3 \rangle - \alpha_{e_3}(\nabla\tau)$ for surfaces in space-time. We consider it as a generalized mean curvature and study the variation of the total integral. The gauge $\bar{e}_3, \bar{e}_4$ in Theorem A indeed minimizes the total integral (see Proposition 2.1). In Sect. 3, we prove Theorem B and study the total mean curvature of the projected surface. In particular, we prove equality (1.5). In Sect. 4, we study the boundary problem of Jang's equation and calculate the boundary terms. This is an important step in proving Theorem A. In Sect. 5, we define the new quasi-local mass and prove the positivity. In particular, Theorem A is proved. We emphasize that though the proof involves solving Jang's equation, the results depend only on the solvability but not on the specific solution. The Euler-Lagrange equation of the new quasi-local mass among all admissible $\tau$'s is derived in Sect. 6.

2. A Generalization of Mean Curvature

Definition 2.1. Suppose $i : \Sigma \to N$ is an embedded space-like two-surface. Given a smooth function $\tau$ on $\Sigma$ and a space-like normal $e_3$, the generalized mean curvature associated with these data is defined to be
$$h(\Sigma, i, \tau, e_3) = -\sqrt{1 + |\nabla\tau|^2} \, \langle H, e_3 \rangle - \alpha_{e_3}(\nabla\tau),$$
where $H$ is the mean curvature vector of $\Sigma$ in $N$ and $\alpha_{e_3}$ is the connection form (see Definition 1.1) of the normal bundle of $\Sigma$ in $N$ determined by $e_3$ and the future-directed time-like unit normal $e_4$ orthogonal to $e_3$.

Remark 2.1. In the case when $\Sigma$ bounds a space-like region $\Omega$ and $e_3$ is the outward unit normal of $\Omega$, the mean curvature vector is $H = \langle H, e_3 \rangle e_3 - \langle H, e_4 \rangle e_4$. We can reflect $H$ along the light cone of the normal bundle to get $J = \langle H, e_4 \rangle e_3 - \langle H, e_3 \rangle e_4$. Denote the tangent vector on $\Sigma$ dual to the one-form $\alpha_{e_3}$ by $V$, then the expression (3) in [29] is $J - V$, where $k = -\langle H, e_3 \rangle$ and $p = -\langle H, e_4 \rangle$. We have
$$h(\Sigma, i, \tau, e_3) = -\left\langle J - V, \sqrt{1 + |\nabla\tau|^2} \, e_4 - \nabla\tau \right\rangle.$$
Notice that $\sqrt{1 + |\nabla\tau|^2} \, e_4 - \nabla\tau$ is again a future-directed unit time-like vector along $\Sigma$.

Fix a base frame $\{\hat{e}_3, \hat{e}_4\}$ for the normal bundle; any other frame $\{e_3, e_4\}$ can be expressed as $e_3 = \cosh\phi \, \hat{e}_3 - \sinh\phi \, \hat{e}_4$, $e_4 = -\sinh\phi \, \hat{e}_3 + \cosh\phi \, \hat{e}_4$ for some $\phi$. We compute the integral
$$\int_\Sigma h(\Sigma, i, \tau, e_3) \, dv_\Sigma = \int_\Sigma \left[ \sqrt{1 + |\nabla\tau|^2} \left( \cosh\phi \, \langle \nabla^N_{e_a} \hat{e}_3, e_a \rangle - \sinh\phi \, \langle \nabla^N_{e_a} \hat{e}_4, e_a \rangle \right) - \alpha_{\hat{e}_3}(\nabla\tau) - \nabla\tau \cdot \nabla\phi \right] dv_\Sigma \tag{2.1}$$
and consider this expression as a functional of $\phi$.
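Suppose the mean curvature vector of $\Sigma$ is space-like; we may choose a base frame with
$$\hat{e}_3 = -\frac{H}{|H|} \tag{2.2}$$
and $\hat{e}_4$ the future-directed time-like unit normal that is orthogonal to $\hat{e}_3$. This choice makes $\langle \nabla^N_{e_a} \hat{e}_4, e_a \rangle = 0$. Integrating by parts, the functional becomes
$$\int_\Sigma \left( \sqrt{1 + |\nabla\tau|^2} \, \cosh\phi \, |H| - \alpha_{\hat{e}_3}(\nabla\tau) + \phi \, \Delta\tau \right) dv_\Sigma. \tag{2.3}$$
As $|H|$ is positive, this is clearly a convex functional of $\phi$ which achieves the minimum as
$$\sinh\phi = \frac{-\Delta\tau}{|H| \sqrt{1 + |\nabla\tau|^2}}. \tag{2.4}$$
We notice that the minimum is achieved by $\bar{e}_4$ such that the expression
$$|H| \sinh\phi = \langle H, \bar{e}_4 \rangle = \frac{-\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}} \tag{2.5}$$
depends only on $\tau$; this is taken as the characterizing property of $\bar{e}_4$ in [29].

Definition 2.2. Given an isometric embedding $i : \Sigma \to N$ into a space-time with space-like mean curvature vector $H$. Denote
$$\mathfrak{H}(\Sigma, i, \tau) = \int_\Sigma \left[ \sqrt{(\Delta\tau)^2 + |H|^2 (1 + |\nabla\tau|^2)} - \nabla\tau \cdot \nabla\phi - \alpha_{\hat{e}_3}(\nabla\tau) \right] dv_\Sigma,$$
where $\phi$ is defined by (2.4) and $\alpha_{\hat{e}_3}$ is the connection one-form on $\Sigma$ associated with $\hat{e}_3$ in Eq. (2.2).

In terms of the frame $\bar{e}_3, \bar{e}_4$, where $\bar{e}_4$ is given by Eq. (2.5) and $\bar{e}_3$ is the space-like unit normal with $\langle \bar{e}_3, \bar{e}_4 \rangle = 0$, we have
$$\mathfrak{H}(\Sigma, i, \tau) = \int_\Sigma h(\Sigma, i, \tau, \bar{e}_3) \, dv_\Sigma = \int_\Sigma \left[ -\sqrt{1 + |\nabla\tau|^2} \, \langle H, \bar{e}_3 \rangle - \alpha_{\bar{e}_3}(\nabla\tau) \right] dv_\Sigma.$$

Proposition 2.1. If the mean curvature vector of the embedding $i : \Sigma \to N$ is space-like and $e_3$ is any space-like unit normal such that $\langle H, e_3 \rangle < 0$, then
$$\int_\Sigma h(\Sigma, i, \tau, e_3) \, dv_\Sigma \ge \mathfrak{H}(\Sigma, i, \tau).$$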
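The minimization behind (2.3)–(2.5) is pointwise: at each point the $\phi$-dependent part of the integrand in (2.3) has the form $a \cosh\phi + b\,\phi$ with $a = |H| \sqrt{1 + |\nabla\tau|^2} > 0$ and $b = \Delta\tau$. The Python sketch below checks numerically, for one assumed pair of sample values $a, b$ (not taken from the paper), that $\sinh\phi = -b/a$ is the minimizer and that $a \cosh\phi = \sqrt{a^2 + b^2}$ there, which is the square-root term appearing in Definition 2.2. The function names are illustrative.

```python
import math

def integrand(phi, a, b):
    """phi-dependent part of the integrand in (2.3): a*cosh(phi) + b*phi,
    with a = |H| sqrt(1 + |grad tau|^2) > 0 and b = Laplacian of tau."""
    return a * math.cosh(phi) + b * phi

def minimizer(a, b):
    """Critical point of the convex function above: sinh(phi) = -b / a."""
    return math.asinh(-b / a)

# Assumed sample values at a single point of the surface.
a, b = 2.0, 0.7
phi_star = minimizer(a, b)

# The derivative a*sinh(phi) + b vanishes at phi_star ...
derivative = a * math.sinh(phi_star) + b
# ... nearby values of phi give a larger integrand ...
nearby = [integrand(phi_star + h, a, b) for h in (-0.1, -0.01, 0.01, 0.1)]
# ... and a*cosh(phi_star) equals sqrt(a^2 + b^2).
print(derivative, min(nearby) - integrand(phi_star, a, b))
print(a * math.cosh(phi_star), math.sqrt(a * a + b * b))
```

Convexity in $\phi$ (the second derivative is $a \cosh\phi > 0$) is what makes this critical point the global minimum, as stated after (2.3).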
3. Isometric Embeddings into the Minkowski Space

3.1. Existence and uniqueness theorem. Let $\Sigma$ be a two-surface diffeomorphic to $S^2$. We fix a Riemannian metric $\sigma$ on $\Sigma$, $\sigma = \sigma_{ab} \, du^a du^b$, in local coordinates $u^1, u^2$. Denote the gradient, the Hessian, and the Laplace operator with respect to the metric $\sigma$ by $\nabla$, $\nabla^2$, and $\Delta$, respectively. We consider the isometric embedding problem of $(\Sigma, \sigma)$ into the Minkowski space $\mathbb{R}^{3,1}$ with prescribed mean curvature in a fixed time direction. Let $\langle \cdot, \cdot \rangle$ denote the standard Lorentzian metric on $\mathbb{R}^{3,1}$ and $T_0$ be a constant unit time-like vector in $\mathbb{R}^{3,1}$; we have the following existence and uniqueness theorem:
Theorem 3.1. Let $\lambda$ be a function on $\Sigma$ with $\int_\Sigma \lambda \, dv_\Sigma = 0$. Let $\tau$ be a potential function of $\lambda$, i.e. $\Delta\tau = \lambda$. Suppose
$$K + (1 + |\nabla\tau|^2)^{-1} \det(\nabla^2 \tau) > 0, \tag{3.1}$$
where $K$ is the Gauss curvature of $\sigma$ and $\det(\nabla^2 \tau)$ is the determinant of the Hessian of $\tau$. Then there exists a unique space-like embedding $X : \Sigma \to \mathbb{R}^{3,1}$ with the induced metric $\sigma$ and such that the mean curvature vector $H_0$ of the embedding satisfies
$$\langle H_0, T_0 \rangle = -\lambda. \tag{3.2}$$

Proof. We prove the uniqueness part first. Let $X_i : \Sigma \to \mathbb{R}^{3,1}$, $i = 1, 2$ be two isometric embeddings that satisfy (3.2). Since the mean curvature vector of the embedding $X_i$ is $\Delta X_i$, this implies $\langle \Delta(X_1 - X_2), T_0 \rangle = 0$, or $\langle X_1 - X_2, T_0 \rangle$ is a constant on $\Sigma$. Denote $\tau_i = -\langle X_i, T_0 \rangle$; we thus have $d\tau_1 = d\tau_2$. Now consider the projection $\hat{X}_i : \Sigma \to \mathbb{R}^3$ onto the orthogonal complement of $T_0$; $\hat{X}_i = X_i - \tau_i T_0$. The Gauss curvature of the embedding $\hat{X}_i$ can be computed as
$$\hat{K}_i = (1 + |\nabla\tau_i|^2)^{-1} \left[ K + (1 + |\nabla\tau_i|^2)^{-1} \det(\nabla^2 \tau_i) \right], \tag{3.3}$$
which is positive by the assumption. We compute the induced metric on the image of the embedding: $\langle d\hat{X}_i, d\hat{X}_i \rangle = \langle dX_i, dX_i \rangle + d\tau_i^2$. Since we assume $\langle dX_1, dX_1 \rangle = \langle dX_2, dX_2 \rangle = \sigma$, the $\hat{X}_i$'s are embeddings into $\mathbb{R}^3$ with the same induced metrics of positive Gauss curvature. By Theorem 1.2, $\hat{X}_1$ and $\hat{X}_2$ are congruent in $\mathbb{R}^3$. Since $\tau_1$ and $\tau_2$ are different by a constant, $X_i$, as the graphs of $\tau_i$ over $\hat{X}_i$, are congruent in $\mathbb{R}^{3,1}$.

We turn to the existence part. We start with the metric $\sigma$ and the function $\lambda$ and solve for $\tau$ in $\Delta\tau = \lambda$. The Gauss curvature $\hat{K}$ of the new metric $\hat{\sigma} = \sigma + d\tau^2$ is again given by (3.3). Theorem 1.2 gives an embedding $\hat{X} : \Sigma \to \mathbb{R}^3$ with the induced metric $\hat{\sigma}$. Now $X = \hat{X} + \tau T_0$ is the desired isometric embedding into $\mathbb{R}^{3,1}$ that satisfies (3.2).
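The metric identity $\langle d\hat{X}, d\hat{X} \rangle = \langle dX, dX \rangle + d\tau^2$ used in the proof is a pointwise statement about vectors in $\mathbb{R}^{3,1}$ and their projections along $T_0$. A minimal check in exact integer arithmetic is sketched below, assuming coordinates in which the Lorentzian metric is $\operatorname{diag}(1, 1, 1, -1)$ and $T_0 = (0, 0, 0, 1)$; the sample vectors and the function names are illustrative choices, not data from the paper.

```python
def minkowski(x, y):
    """Lorentzian inner product of signature (+, +, +, -) on R^{3,1}."""
    return x[0] * y[0] + x[1] * y[1] + x[2] * y[2] - x[3] * y[3]

T0 = (0, 0, 0, 1)   # assumed unit time-like vector: <T0, T0> = -1

def project(v):
    """For a tangent vector v = dX(.), return (d hat X, d tau), where
    tau = -<X, T0> and hat X = X - tau * T0."""
    dtau = -minkowski(v, T0)
    v_hat = tuple(vi - dtau * ti for vi, ti in zip(v, T0))
    return v_hat, dtau

# Two sample space-like vectors, standing in for dX applied to a tangent basis.
v = (1, 2, 3, 1)
w = (2, 0, 1, -1)
v_hat, dtau_v = project(v)
w_hat, dtau_w = project(w)

# hat-sigma = sigma + d tau (x) d tau, evaluated on (v, v) and on (v, w).
print(minkowski(v_hat, v_hat), minkowski(v, v) + dtau_v ** 2)
print(minkowski(v_hat, w_hat), minkowski(v, w) + dtau_v * dtau_w)
```

The projected vectors have vanishing time component, so their Lorentzian inner product coincides with the Euclidean one in $\mathbb{R}^3$, which is why $\hat{\sigma} = \sigma + d\tau^2$ is a Riemannian metric to which Theorem 1.2 can be applied.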
The existence theorem can be formulated in terms of $\tau$ as the mean curvature vector is given by $H_0 = \Delta X$.

Corollary 3.1. (Theorem B) Given a metric $\sigma$ and a function $\tau$ on $S^2$ such that the condition (3.1) holds, there exists a unique space-like isometric embedding $i_0 : S^2 \to \mathbb{R}^{3,1}$ with the induced metric $\sigma$ and the function $\tau$ as the time function.

3.2. Total mean curvature of the projection. In this section, we compute the total mean curvature $\int_{\hat{\Sigma}} \hat{k} \, dv_{\hat{\Sigma}}$ of the projection $\hat{\Sigma}$ in $\mathbb{R}^3$ in terms of the geometry of $\Sigma$ in $\mathbb{R}^{3,1}$. Suppose $X : \Sigma \to \mathbb{R}^{3,1}$ is the embedding and $\tau = -\langle X, T_0 \rangle$ is the restriction of the time function associated with $T_0$. The outward unit normal $\hat{\nu}$ of $\hat{\Sigma}$ in $\mathbb{R}^3$ and $T_0$ form an orthonormal basis for the normal bundle of $\hat{\Sigma}$ in $\mathbb{R}^{3,1}$. Extend $\hat{\nu}$ along $T_0$ by parallel translation and denote it by $\breve{e}_3$. We have

Proposition 3.1.
$$\int_{\hat{\Sigma}} \hat{k} \, dv_{\hat{\Sigma}} = \int_\Sigma \left[ -\langle H_0, \breve{e}_3 \rangle \sqrt{1 + |\nabla\tau|^2} - \alpha_{\breve{e}_3}(\nabla\tau) \right] dv_\Sigma. \tag{3.4}$$
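Proof. Denote by $\nabla^{\mathbb{R}^{3,1}}$ the flat connection associated with the Lorentzian metric on $\mathbb{R}^{3,1}$. Take an orthonormal basis $\hat{e}_a$, $a = 1, 2$ for the tangent space of $\hat{\Sigma}$ and compute
$$\hat{k} = \langle \nabla^{\mathbb{R}^{3,1}}_{\hat{e}_a} \hat{\nu}, \hat{e}_a \rangle = \langle \nabla^{\mathbb{R}^{3,1}}_{\hat{e}_a} \hat{\nu}, \hat{e}_a \rangle + \langle \nabla^{\mathbb{R}^{3,1}}_{\hat{\nu}} \hat{\nu}, \hat{\nu} \rangle - \langle \nabla^{\mathbb{R}^{3,1}}_{T_0} \hat{\nu}, T_0 \rangle,$$
because the last two terms are both zero. Therefore $\hat{k} = g^{\alpha\beta} \langle \nabla^{\mathbb{R}^{3,1}}_{e_\alpha} \hat{\nu}, e_\beta \rangle$ for any orthonormal frame $e_\alpha$ of $\mathbb{R}^{3,1}$, where $g^{\alpha\beta}$ is the inverse of $g_{\alpha\beta} = \langle e_\alpha, e_\beta \rangle$.

Now $\breve{e}_3 = \hat{\nu}$ may be considered as a space-like normal vector field along $\Sigma$. Pick an orthonormal basis $\{e_1, e_2\}$ tangent to $\Sigma$. Let $\breve{e}_4 = \frac{1}{\sqrt{1 + |\nabla\tau|^2}} (T_0 - T_0^\top)$ be the future-directed unit normal vector in the direction of the normal part of $T_0$. It is not hard to see that $T_0^\top = -\nabla\tau$. $\{\breve{e}_3, \breve{e}_4\}$ form an orthonormal basis for the normal bundle of $\Sigma$. We derive
$$\hat{k} = \langle \nabla^{\mathbb{R}^{3,1}}_{e_a} \breve{e}_3, e_a \rangle - \langle \nabla^{\mathbb{R}^{3,1}}_{\breve{e}_4} \breve{e}_3, \breve{e}_4 \rangle = -\langle H_0, \breve{e}_3 \rangle - \frac{1}{\sqrt{1 + |\nabla\tau|^2}} \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} \breve{e}_3, \breve{e}_4 \rangle \tag{3.5}$$
because $\hat{\nu}$ is extended along $T_0$ by parallel translation. The area forms of $\Sigma$ and $\hat{\Sigma}$ are related by $dv_\Sigma = \frac{1}{\sqrt{1 + |\nabla\tau|^2}} \, dv_{\hat{\Sigma}}$. Integrating Eq. (3.5) over $\hat{\Sigma}$, we obtain (3.4).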
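Suppose the mean curvature vector $H_0$ of $\Sigma$ in $\mathbb{R}^{3,1}$ is space-like. Let $e_3^{H_0} = \frac{-H_0}{|H_0|}$ be the unit vector in the direction of $H_0$ and $e_4^{H_0}$ be the future-directed time-like unit normal vector with $\langle e_3^{H_0}, e_4^{H_0} \rangle = 0$. Suppose that
$$e_3^{H_0} = \cosh\theta \, \breve{e}_3 + \sinh\theta \, \breve{e}_4, \quad \text{and} \quad e_4^{H_0} = \sinh\theta \, \breve{e}_3 + \cosh\theta \, \breve{e}_4.$$
Since $\Delta\tau = -\langle H_0, T_0 \rangle$ and $T_0 = \sqrt{1 + |\nabla\tau|^2} \, \breve{e}_4 - \nabla\tau$, we derive
$$\sinh\theta = \frac{-\Delta\tau}{|H_0| \sqrt{1 + |\nabla\tau|^2}}. \tag{3.6}$$
These imply the following relations:
$$\breve{e}_3 = \cosh\theta \, e_3^{H_0} - \sinh\theta \, e_4^{H_0}, \quad \text{and} \quad \breve{e}_4 = -\sinh\theta \, e_3^{H_0} + \cosh\theta \, e_4^{H_0}.$$
The integrand on the right-hand side of (3.4) becomes
$$|H_0| \cosh\theta \sqrt{1 + |\nabla\tau|^2} - \nabla\theta \cdot \nabla\tau - \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} e_3^{H_0}, e_4^{H_0} \rangle.$$
Therefore we have

Proposition 3.2. When the mean curvature vector of $\Sigma$ in $\mathbb{R}^{3,1}$ is space-like, $\int_{\hat{\Sigma}} \hat{k} \, dv_{\hat{\Sigma}}$ is equal to
$$\int_\Sigma \left[ \sqrt{(\Delta\tau)^2 + |H_0|^2 (1 + |\nabla\tau|^2)} - \nabla\theta \cdot \nabla\tau - \alpha_{e_3^{H_0}}(\nabla\tau) \right] dv_\Sigma, \tag{3.7}$$
where $\theta$ is given by (3.6) and $\alpha_{e_3^{H_0}}$ is the one-form on $\Sigma$ defined by $\alpha_{e_3^{H_0}}(X) = \langle \nabla^{\mathbb{R}^{3,1}}_X e_3^{H_0}, e_4^{H_0} \rangle$.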
4. Jang's Equation and Boundary Information

4.1. Jang's equation. Jang's equation was proposed by Jang [13] in an attempt to solve the positive energy conjecture. Schoen and Yau came up with different geometric interpretations, studied the equation in full, and applied them to their proof [24] of the positive mass theorem. Another important contribution of Schoen and Yau's work in [24] is to understand the precise connection between the solvability of Jang's equation and the existence of black holes. This leads to the later works on the existence of black holes due to condensation of matter and boundary effect [25,31].

Given an initial data set $(\Omega, g_{ij}, p_{ij})$, where $p_{ij}$ is a symmetric two-tensor that represents the second fundamental form of $\Omega$ with respect to a future-directed time-like normal $e_4$ in a space-time $N$, we consider the Riemannian product $\Omega \times \mathbb{R}$ and extend $p_{ij}$ by parallel translation along the $\mathbb{R}$ direction to a symmetric tensor $P(\cdot, \cdot)$ on $\Omega \times \mathbb{R}$. Such an extension makes $P(\cdot, v) = 0$, where $v$ denotes the downward unit vector in the $\mathbb{R}$ direction.

Jang's equation asks for a hypersurface $\tilde{\Omega}$ in $\Omega \times \mathbb{R}$, defined as the graph of a function $f$ over $\Omega$, such that the mean curvature of $\tilde{\Omega}$ in $\Omega \times \mathbb{R}$ is the same as the trace of the restriction of $P$ to $\tilde{\Omega}$. In terms of local coordinates $x^i$ on $\Omega$, the equation takes the form
$$\sum_{i,j=1}^{3} \left( g^{ij} - \frac{f^i f^j}{1 + |Df|^2} \right) \left( \frac{D_i D_j f}{(1 + |Df|^2)^{1/2}} - p_{ij} \right) = 0, \tag{4.1}$$
where $Df = g^{ij} \frac{\partial f}{\partial x^j} \frac{\partial}{\partial x^i}$ is the gradient of $f$, $|Df|^2 = g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial f}{\partial x^j}$, and $D_i D_j f = \frac{\partial^2 f}{\partial x^i \partial x^j} - \Gamma^k_{ij} \frac{\partial f}{\partial x^k}$ is the Hessian of $f$.
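Pick an orthonormal basis $\{\tilde e_\alpha\}_{\alpha=1,\dots,4}$ for the tangent space of $\Omega\times\mathbb{R}$ along $\tilde\Omega$ such that $\{\tilde e_i\}_{i=1,\dots,3}$ is tangent to $\tilde\Omega$ and $\tilde e_4$ is the downward unit normal, then Jang's equation is
\[ \sum_{i=1}^{3}\langle\tilde\nabla_{\tilde e_i}\tilde e_4, \tilde e_i\rangle = \sum_{i=1}^{3} P(\tilde e_i,\tilde e_i); \tag{4.2} \]
here and throughout this section $\tilde\nabla$ is the Levi-Civita connection on the product space $\Omega\times\mathbb{R}$.
4.2. Boundary calculations. Let $\tau$ be a smooth function on $\Sigma = \partial\Omega$. We consider a solution $f$ of Jang's equation in $\Omega\times\mathbb{R}$ that satisfies the Dirichlet boundary condition $f = \tau$ on $\Sigma$. Denote the graph of $\tau$ over $\Sigma$ by $\tilde\Sigma$ and the graph of $f$ over $\Omega$ by $\tilde\Omega$, so that $\partial\tilde\Omega = \tilde\Sigma$. We choose orthonormal frames $\{e_1, e_2\}$ and $\{\tilde e_1, \tilde e_2\}$ for $T\Sigma$ and $T\tilde\Sigma$, respectively. Let $e_3$ be the outward normal of $\Sigma$ that is tangent to $\Omega$. We also choose $\tilde e_3, \tilde e_4$ for the normal bundle of $\tilde\Sigma$ in $\Omega\times\mathbb{R}$ such that $\tilde e_3$ is tangent to the graph $\tilde\Omega$ and $\tilde e_4$ is a downward unit normal vector of $\tilde\Omega$ in $\Omega\times\mathbb{R}$. $\{e_1, e_2, e_3, v\}$ forms an orthonormal basis for the tangent space of $\Omega\times\mathbb{R}$, and so does $\{\tilde e_\alpha\}_{\alpha=1,\dots,4}$. All these frames are extended along the $\mathbb{R}$ direction by parallel translation. Along $\Sigma$, we have $Df = \nabla\tau + f_3 e_3$,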
where f 3 = e3 ( f ) is the normal derivative of f . e˜3 and e˜4 can be written down explicitly:
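\[ \tilde e_3 = \frac{1}{\sqrt{1+|Df|^2}}\Big(\sqrt{1+|\nabla\tau|^2}\,e_3 - \frac{f_3}{\sqrt{1+|\nabla\tau|^2}}\,(v+\nabla\tau)\Big) \quad\text{and}\quad \tilde e_4 = \frac{1}{\sqrt{1+|Df|^2}}\,(v+Df). \tag{4.3} \]
We check that $\tilde e_3$ and $\tilde e_4$ are orthogonal to $e_a - e_a(\tau)v$ for $a = 1, 2$. Simple calculations yield
\[ \langle e_3,\tilde e_3\rangle = \frac{\sqrt{1+|\nabla\tau|^2}}{\sqrt{1+|Df|^2}}, \quad\text{and}\quad \langle e_3,\tilde e_4\rangle = \frac{f_3}{\sqrt{1+|Df|^2}}. \tag{4.4} \]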
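The frame in (4.3) and the inner products in (4.4) can be checked numerically. The following is a minimal sketch, assuming coordinates in the orthonormal basis $(e_1, e_2, e_3, v)$ and arbitrary sample values for $\nabla\tau$ and $f_3$; the function names are illustrative and not from the paper.

```python
import math

def jang_normal_frame(grad_tau, f3):
    """Return (e3_tilde, e4_tilde) of (4.3) as 4-tuples in the basis (e1, e2, e3, v)."""
    t1, t2 = grad_tau
    a = 1.0 + t1 * t1 + t2 * t2      # 1 + |grad tau|^2
    b = a + f3 * f3                  # 1 + |Df|^2, because Df = grad tau + f3 * e3
    sa, sb = math.sqrt(a), math.sqrt(b)
    e3 = (0.0, 0.0, 1.0, 0.0)
    v_plus_grad_tau = (t1, t2, 0.0, 1.0)
    v_plus_df = (t1, t2, f3, 1.0)
    e3_tilde = tuple((sa * x - (f3 / sa) * w) / sb for x, w in zip(e3, v_plus_grad_tau))
    e4_tilde = tuple(w / sb for w in v_plus_df)
    return e3_tilde, e4_tilde

def dot(x, y):
    """Euclidean inner product on the product space Omega x R."""
    return sum(p * q for p, q in zip(x, y))

# Sample data (assumed values, chosen only for illustration).
grad_tau, f3 = (0.3, -0.7), 1.9
e3t, e4t = jang_normal_frame(grad_tau, f3)
a = 1.0 + sum(t * t for t in grad_tau)
b = a + f3 * f3
# Tangent vectors of the graph of tau: e_a - e_a(tau) v.
tangents = [(1.0, 0.0, 0.0, -grad_tau[0]), (0.0, 1.0, 0.0, -grad_tau[1])]
```

With these values one can confirm that the two vectors are orthonormal, that they are orthogonal to both tangent vectors, and that their $e_3$-components equal the right-hand sides of (4.4).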
e˜ e˜3 , e˜a be the mean curvature of
with respect to
. We are particularly Let k˜ = ∇ a
: interested in the following expression on
e˜ e˜4 , e˜3 + P(e˜4 , e˜3 ). k˜ − ∇ 4
(4.5)
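Theorem 4.1. Let $i: \Sigma \to N$ be a space-like embedding. Given any smooth function $\tau$ on $\Sigma$ and any space-like hypersurface $\Omega$ with $\partial\Omega = \Sigma$, suppose the Dirichlet problem of Jang's equation (4.1) over $\Omega$ subject to the boundary condition $f = \tau$ on $\Sigma$ is solvable. Then there exists a space-like unit normal $e_3'$ along $\Sigma$ in $N$ such that the expression (4.5) at $\tilde q \in \tilde\Sigma$ is equal to
\[ -\langle H, e_3'\rangle - (1+|\nabla\tau|^2)^{-1/2}\,\alpha_{e_3'}(\nabla\tau) \]
at $q \in \Sigma$, where $\tilde q = (q, \tau(q)) \in \tilde\Sigma$. In particular
\[ \int_{\tilde\Sigma}\Big[\tilde k - \langle\tilde\nabla_{\tilde e_4}\tilde e_4, \tilde e_3\rangle + P(\tilde e_4,\tilde e_3)\Big]\, dv_{\tilde\Sigma} = \int_\Sigma\Big[-\sqrt{1+|\nabla\tau|^2}\,\langle H, e_3'\rangle - \alpha_{e_3'}(\nabla\tau)\Big]\, dv_\Sigma. \tag{4.6} \]
Let $e_3$ be the outward unit normal of $\Sigma$ that is tangent to $\Omega$ and $e_4$ the future-directed time-like normal of $\Omega$ in $N$; $e_3'$ is given by $e_3' = \cosh\phi\, e_3 + \sinh\phi\, e_4$, where
\[ \sinh\phi = \frac{-f_3}{\sqrt{1+|\nabla\tau|^2}}. \tag{4.7} \]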
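Proof. The proof is through a sequence of calculations using the product structure of $\Omega\times\mathbb{R}$ and Jang's equation. It also relies on the fact that $P$, $\{\tilde e_\alpha\}_{\alpha=1}^4$, and $\{e_1, e_2, e_3, v\}$ are all parallel in the direction of $v$. We first prove the following identity:
\[ \tilde k - \langle\tilde\nabla_{\tilde e_4}\tilde e_4,\tilde e_3\rangle + P(\tilde e_3,\tilde e_4) = \langle\tilde\nabla_{e_a}\tilde e_3, e_a\rangle + \frac{\langle e_3,\tilde e_4\rangle}{\langle e_3,\tilde e_3\rangle}\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle - \frac{\langle e_3,\tilde e_4\rangle}{\langle e_3,\tilde e_3\rangle}P(e_a,e_a) + \frac{1}{\langle e_3,\tilde e_3\rangle}P(e_3, \tilde e_4 - \langle\tilde e_4, e_3\rangle e_3). \tag{4.8} \]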
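We compute the terms $\langle\tilde\nabla_{e_a}\tilde e_3, e_a\rangle$ and $\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle$ in the following:
\[ \sum_{a=1}^{2}\langle\tilde\nabla_{e_a}\tilde e_3, e_a\rangle = \sum_{i=1}^{3}\langle\tilde\nabla_{e_i}\tilde e_3, e_i\rangle - \langle\tilde\nabla_{e_3}\tilde e_3, e_3\rangle = \sum_{\alpha=1}^{4}\langle\tilde\nabla_{\tilde e_\alpha}\tilde e_3, \tilde e_\alpha\rangle - \langle\tilde\nabla_{e_3}\tilde e_3, e_3\rangle, \]
as $\{e_1, e_2, e_3, v\}$ and $\{\tilde e_\alpha\}_{\alpha=1,\dots,4}$ are both orthonormal frames for the tangent space of $\Omega\times\mathbb{R}$ and $\tilde\nabla_v\tilde e_3 = 0$. Notice that $\langle\tilde\nabla_{\tilde e_3}\tilde e_3, \tilde e_3\rangle = 0$ and thus we obtain
\[ \sum_{a=1}^{2}\langle\tilde\nabla_{e_a}\tilde e_3, e_a\rangle = \tilde k + \langle\tilde\nabla_{\tilde e_4}\tilde e_3, \tilde e_4\rangle - \langle\tilde\nabla_{e_3}\tilde e_3, e_3\rangle. \tag{4.9} \]
On the other hand,
\[ \sum_{a=1}^{2}\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle = \sum_{i=1}^{3}\langle\tilde\nabla_{e_i}\tilde e_4, e_i\rangle - \langle\tilde\nabla_{e_3}\tilde e_4, e_3\rangle = \sum_{i=1}^{3}\langle\tilde\nabla_{\tilde e_i}\tilde e_4, \tilde e_i\rangle - \langle\tilde\nabla_{e_3}\tilde e_4, e_3\rangle. \]
Applying Jang's equation (4.2), we obtain
\[ \sum_{a=1}^{2}\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle = \sum_{i=1}^{3}P(\tilde e_i,\tilde e_i) - \langle\tilde\nabla_{e_3}\tilde e_4, e_3\rangle. \]
Furthermore, we derive
\[ \sum_{i=1}^{3}P(\tilde e_i,\tilde e_i) = \sum_{i=1}^{3}P(e_i,e_i) + \frac{\langle e_3,\tilde e_3\rangle}{\langle e_3,\tilde e_4\rangle}P(\tilde e_3,\tilde e_4) - \frac{1}{\langle e_3,\tilde e_4\rangle}P(e_3,\tilde e_4), \]
using
\[ \sum_{i=1}^{3}P(\tilde e_i,\tilde e_i) = \sum_{\alpha=1}^{4}P(\tilde e_\alpha,\tilde e_\alpha) - P(\tilde e_4,\tilde e_4) = \sum_{i=1}^{3}P(e_i,e_i) - P(\tilde e_4,\tilde e_4) \]
and $\langle e_3,\tilde e_4\rangle P(\tilde e_4,\tilde e_4) = P(e_3 - \langle e_3,\tilde e_3\rangle\tilde e_3, \tilde e_4)$. Therefore, we arrive at
\[ \sum_{a=1}^{2}\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle = \sum_{a=1}^{2}P(e_a,e_a) + \frac{\langle e_3,\tilde e_3\rangle}{\langle e_3,\tilde e_4\rangle}P(\tilde e_3,\tilde e_4) - \frac{1}{\langle e_3,\tilde e_4\rangle}P(e_3, \tilde e_4 - \langle\tilde e_4,e_3\rangle e_3) - \langle\tilde\nabla_{e_3}\tilde e_4, e_3\rangle. \tag{4.10} \]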
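Combining (4.9) and (4.10) yields (4.8).
Let $k = \langle\tilde\nabla_{e_a}e_3, e_a\rangle$ be the mean curvature of $\Sigma$ (as the boundary of $\Omega$) with respect to $e_3$. As $\langle e_3,\tilde e_a\rangle = 0$ for $a = 1, 2$, we have $e_3 = \langle e_3,\tilde e_3\rangle\tilde e_3 + \langle e_3,\tilde e_4\rangle\tilde e_4$. Plug this into the expression for $k$, and we obtain
\[ k = \langle e_3,\tilde e_3\rangle\langle\tilde\nabla_{e_a}\tilde e_3, e_a\rangle + \langle e_3,\tilde e_4\rangle\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle - \langle e_3,\tilde e_3\rangle\, e_a(\langle e_a,\tilde e_3\rangle) - \langle e_3,\tilde e_4\rangle\, e_a(\langle e_a,\tilde e_4\rangle). \]
From here we solve for $\langle\tilde\nabla_{e_a}\tilde e_3, e_a\rangle + \frac{\langle e_3,\tilde e_4\rangle}{\langle e_3,\tilde e_3\rangle}\langle\tilde\nabla_{e_a}\tilde e_4, e_a\rangle$ and substitute into (4.8) to obtain
\[ \tilde k - \langle\tilde\nabla_{\tilde e_4}\tilde e_4,\tilde e_3\rangle + P(\tilde e_3,\tilde e_4) = \frac{1}{\langle e_3,\tilde e_3\rangle}k - \frac{\langle e_3,\tilde e_4\rangle}{\langle e_3,\tilde e_3\rangle}P(e_a,e_a) + \frac{1}{\langle e_3,\tilde e_3\rangle}P(e_3,\tilde e_4 - \langle\tilde e_4,e_3\rangle e_3) + e_a(\langle e_a,\tilde e_3\rangle) + \frac{\langle e_3,\tilde e_4\rangle}{\langle e_3,\tilde e_3\rangle}e_a(\langle e_a,\tilde e_4\rangle). \tag{4.11} \]
We calculate the right-hand side of (4.11) using (4.3) and
\[ P(e_3, \tilde e_4 - \langle\tilde e_4,e_3\rangle e_3) = \frac{1}{\sqrt{1+|Df|^2}}P(e_3,\nabla\tau). \]
The last two terms can also be calculated using (4.3):
\[ e_a(\langle e_a,\tilde e_3\rangle) + \frac{\langle e_3,\tilde e_4\rangle}{\langle e_3,\tilde e_3\rangle}e_a(\langle e_a,\tilde e_4\rangle) = \operatorname{div}_\Sigma\Big(\frac{-f_3\nabla\tau}{\sqrt{(1+|Df|^2)(1+|\nabla\tau|^2)}}\Big) + \frac{f_3}{\sqrt{1+|\nabla\tau|^2}}\operatorname{div}_\Sigma\Big(\frac{\nabla\tau}{\sqrt{1+|Df|^2}}\Big) = -\frac{1}{\sqrt{1+|Df|^2}}\nabla\tau\cdot\nabla\Big(\frac{f_3}{\sqrt{1+|\nabla\tau|^2}}\Big). \]
Recalling the definition of $\phi$ from (4.7), this is equal to
\[ \frac{\nabla\tau\cdot\nabla\phi}{\sqrt{1+|\nabla\tau|^2}}. \]
The right-hand side of (4.11) is therefore
\[ (1+|\nabla\tau|^2)^{-1/2}\Big[\sqrt{1+|Df|^2}\,k - f_3P(e_a,e_a) + P(e_3,\nabla\tau) + \nabla\tau\cdot\nabla\phi\Big]. \tag{4.12} \]
This is an expression on $\Sigma$ that depends on the functions $\tau$ and $f_3$ on $\Sigma$. Recall that the symmetric tensor $P$ originates from the second fundamental form of $\Omega$ with respect to the future-directed unit time-like normal $e_4$ in the space-time $N$. Rewrite the expression (4.12) in terms of $e_3$ and $e_4$:
\[ \tilde k - \langle\tilde\nabla_{\tilde e_4}\tilde e_4,\tilde e_3\rangle + P(\tilde e_3,\tilde e_4) = (1+|\nabla\tau|^2)^{-1/2}\Big[\sqrt{1+|Df|^2}\,\langle\nabla^N_{e_a}e_3, e_a\rangle - f_3\langle\nabla^N_{e_a}e_4, e_a\rangle - \alpha_{e_3}(\nabla\tau) + \nabla\tau\cdot\nabla\phi\Big]. \tag{4.13} \]
On the other hand, with the orthonormal frame $e_3', e_4'$ given by $e_3' = \cosh\phi\, e_3 + \sinh\phi\, e_4$, $e_4' = \sinh\phi\, e_3 + \cosh\phi\, e_4$, we compute
\[ \langle\nabla^N_{e_a}e_3', e_a\rangle - (1+|\nabla\tau|^2)^{-1/2}\alpha_{e_3'}(\nabla\tau) = \cosh\phi\,\langle\nabla^N_{e_a}e_3, e_a\rangle + \sinh\phi\,\langle\nabla^N_{e_a}e_4, e_a\rangle - (1+|\nabla\tau|^2)^{-1/2}\big(\alpha_{e_3}(\nabla\tau) - \nabla\tau\cdot\nabla\phi\big). \]
Plugging in the expressions for $\cosh\phi$ and $\sinh\phi$, we recover the right-hand side of (4.13).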
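The last step uses that the coefficients in (4.13) are exactly $\cosh\phi$ and $\sinh\phi$ for the boost angle of (4.7), since $|Df|^2 = |\nabla\tau|^2 + f_3^2$ along the boundary. A small numerical sketch of this, and of the fact that the boosted pair stays orthonormal for the Lorentzian metric on the normal plane, is given below; the helper names and the sample values are assumptions made for illustration.

```python
import math

def boost_angle(grad_tau_sq, f3):
    """phi defined by sinh(phi) = -f3 / sqrt(1 + |grad tau|^2), as in (4.7)."""
    return math.asinh(-f3 / math.sqrt(1.0 + grad_tau_sq))

def lorentz(x, y):
    """Inner product on the normal plane with e3 space-like and e4 time-like."""
    return x[0] * y[0] - x[1] * y[1]

def boosted_frame(phi):
    """Components of e3', e4' in the basis (e3, e4)."""
    e3_prime = (math.cosh(phi), math.sinh(phi))
    e4_prime = (math.sinh(phi), math.cosh(phi))
    return e3_prime, e4_prime

# Sample boundary data (assumed values).
grad_tau_sq, f3 = 0.58, 1.9
phi = boost_angle(grad_tau_sq, f3)
e3p, e4p = boosted_frame(phi)
df_sq = grad_tau_sq + f3 * f3          # |Df|^2 = |grad tau|^2 + f3^2 on the boundary
coeff_e3 = math.sqrt(1.0 + df_sq) / math.sqrt(1.0 + grad_tau_sq)   # multiplies <nabla e3, e_a> in (4.13)
coeff_e4 = -f3 / math.sqrt(1.0 + grad_tau_sq)                      # multiplies <nabla e4, e_a> in (4.13)
```

Here `coeff_e3` agrees with `math.cosh(phi)` and `coeff_e4` with `math.sinh(phi)`, which is the substitution the proof refers to.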
4.3. Boundary gradient estimate. In this section, we demonstrate a sufficient condition for Jang's equation to be solvable. As most estimates are derived in Schoen-Yau's original paper [24] for the asymptotically flat case, it suffices to control the boundary gradient of the solution.
Theorem 4.2. The normal derivative of a solution of the Dirichlet problem of Jang's equation is bounded if $k > |\mathrm{tr}_{\tilde\Sigma}P|$.
Proof. We consider the operator
\[ Q(f) = \Big(g^{ij} - \frac{f^if^j}{1+|Df|^2}\Big)\frac{D_iD_jf}{(1+|Df|^2)^{1/2}} - \mathrm{tr}_{\tilde\Omega}P, \]
where $\tilde\Omega$ is the graph of $f$ over $\Omega$. The point is to construct sub and super solutions of this operator with the prescribed boundary condition. Denote by $d$ the distance function to $\partial\Omega$. We extend the boundary data $\tau$ to the interior of $\Omega$, still denoted by $\tau$. Consider a test function $f = \psi(d) + \tau$ as the one in (14.11) of [11], where $\psi(d) = \frac{1}{\nu}\log(1+\kappa d)$ with $\kappa, \nu > 0$. In particular $\psi'' = -\nu(\psi')^2 < 0$, $\psi' > 0$, and $\psi'(d) \to \infty$ as $\kappa \to \infty$. We compute $D_iD_jf = \psi'' d_id_j + \psi' D_iD_jd + D_iD_j\tau$. Therefore,
\[ \Big(g^{ij} - \frac{f^if^j}{1+|Df|^2}\Big)\frac{D_iD_jf}{(1+|Df|^2)^{1/2}} = \psi''\Big(g^{ij} - \frac{f^if^j}{1+|Df|^2}\Big)\frac{d_id_j}{(1+|Df|^2)^{1/2}} + \psi'\Big(g^{ij} - \frac{f^if^j}{1+|Df|^2}\Big)\frac{D_iD_jd}{(1+|Df|^2)^{1/2}} + \Big(g^{ij} - \frac{f^if^j}{1+|Df|^2}\Big)\frac{D_iD_j\tau}{(1+|Df|^2)^{1/2}}. \]
Applying
\[ \frac{1}{1+|Df|^2}\,g^{ij} \le g^{ij} - \frac{f^if^j}{1+|Df|^2} \le g^{ij} \]
and $|Dd| = 1$, we derive that the first term is bounded above by $\frac{\psi''}{(1+|Df|^2)^{3/2}}$ and the third term is bounded above by $\frac{|D^2\tau|}{(1+|Df|^2)^{1/2}}$. The second term is
\[ \psi'\Big(g^{ij} - \frac{f^if^j}{1+|Df|^2}\Big)\frac{D_iD_jd}{(1+|Df|^2)^{1/2}} = \psi'\frac{\Delta d}{(1+|Df|^2)^{1/2}} - \frac{\psi'}{(1+|Df|^2)^{3/2}}f^if^jD_iD_jd. \]
We compute
\[ f^if^jD_iD_jd = (\psi'd^i + \tau^i)(\psi'd^j + \tau^j)D_iD_jd = \tau^i\tau^jD_iD_jd, \]
where we used the identity $d^iD_iD_jd = 0$. Therefore, $Q(f)$ is bounded from above by
\[ \frac{\psi'' + (1+|Df|^2)|D^2\tau| - \psi'\tau^i\tau^jD_iD_jd}{(1+|Df|^2)^{3/2}} + \frac{\psi'\Delta d}{(1+|Df|^2)^{1/2}} - \mathrm{tr}_{\tilde\Omega}P. \]
We recall that $\tilde\Omega$ is the graph of $\psi(d) + \tau$ over $\Omega$. Let $\Omega_a = \{d \le a\}\cap\Omega$ and $\widetilde{\partial\Omega_a}$ be the graph of $\psi(d) + \tau$ over $\partial\Omega_a$. We have
\[ \mathrm{tr}_{\tilde\Omega}P = \mathrm{tr}_{\widetilde{\partial\Omega_a}}P + \frac{P(Dd,Dd)}{1 + (\psi' + D\tau\cdot Dd)^2}. \]
Therefore $Q(f)$ is bounded from above by
\[ \frac{\psi'' + (1+|Df|^2)|D^2\tau| - \psi'\tau^i\tau^jD_iD_jd}{(1+|Df|^2)^{3/2}} + \frac{\psi'\Delta d}{(1+|Df|^2)^{1/2}} - \mathrm{tr}_{\widetilde{\partial\Omega_a}}P - \frac{P(Dd,Dd)}{1 + (\psi' + D\tau\cdot Dd)^2}. \]
When $\tau = 0$, this recovers formula (5.11) in [31]. In general, we recall $Df = \psi'Dd + D\tau$ and
\[ |Df|^2 \ge \theta(\psi')^2 - \frac{\theta}{1-\theta}|D\tau|^2 \]
for any positive $\theta < 1$. We notice that $\Delta d$ approaches $-k$, where $k$ is the mean curvature of $\partial\Omega$ in $\Omega$. However, $\mathrm{tr}_{\widetilde{\partial\Omega_a}}P$ approaches $\mathrm{tr}_{\tilde\Sigma}P$. Thus a sub and a super solution exist when $k \ge |\mathrm{tr}_{\tilde\Sigma}P|$.
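The barrier function in the proof is explicit, so its stated properties can be confirmed directly. The following minimal sketch, with assumed sample values for $\kappa$, $\nu$ and $d$, checks the ODE $\psi'' = -\nu(\psi')^2$ for $\psi(d) = \frac{1}{\nu}\log(1+\kappa d)$ and compares the closed-form derivative with a central finite difference.

```python
import math

def psi(d, kappa, nu):
    """Barrier profile psi(d) = log(1 + kappa * d) / nu."""
    return math.log(1.0 + kappa * d) / nu

def dpsi(d, kappa, nu):
    """First derivative: kappa / (nu * (1 + kappa * d))."""
    return kappa / (nu * (1.0 + kappa * d))

def d2psi(d, kappa, nu):
    """Second derivative: -kappa^2 / (nu * (1 + kappa * d)^2)."""
    return -kappa ** 2 / (nu * (1.0 + kappa * d) ** 2)

def central_difference(fn, d, h=1e-6):
    """Numerical derivative used only as an independent cross-check."""
    return (fn(d + h) - fn(d - h)) / (2.0 * h)

# Sample parameters (assumed values).
kappa, nu, d = 50.0, 2.0, 0.01
numeric_first = central_difference(lambda s: psi(s, kappa, nu), d)
```

At the boundary `dpsi(0.0, kappa, nu)` equals `kappa / nu`, so the boundary slope of the barrier can be made as large as needed by increasing `kappa`.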
5. New Quasi-Local Mass and the Positivity
First we define an admissible time function for a surface in space-time.
Definition 5.1. Given a space-like embedding $i: \Sigma \to N$, a smooth function $\tau$ on $\Sigma$ is said to be admissible if:
(1) $K + (1+|\nabla\tau|^2)^{-1}\det(\nabla^2\tau) > 0$.
(2) $\Sigma$ bounds an embedded space-like three-manifold $\Omega$ in $N$ such that Jang's equation (4.1) with the Dirichlet boundary data $\tau$ is solvable on $\Omega$.
(3) The generalized mean curvature $h(\Sigma, i, \tau, e_3') > 0$ for the space-like unit normal $e_3'$ given by (4.7), which is determined by Jang's equation.
We are now ready to define the quasi-local mass.
Definition 5.2. Given a space-like embedding $i: \Sigma \to N$, suppose the set of admissible functions is non-empty. The quasi-local mass is defined to be the infimum of $H(\Sigma, i_0, \tau) - H(\Sigma, i, \tau)$ among all admissible $\tau$, where $H$ is given by Definition 2.2 and $i_0$ is the unique space-like isometric embedding of $\Sigma$ into $\mathbb{R}^{3,1}$ associated with $\tau$ given by Theorem B.
The proof of the positivity of quasi-local mass is based on the following theorem, which can be considered as a total mean curvature comparison theorem for solutions of Jang's equation.
Theorem 5.1. Suppose $\Omega$ is a Riemannian three-manifold with boundary $\Sigma$ and suppose there exists a vector field $X$ on $\Omega$ such that
\[ R \ge 2|X|^2 - 2\operatorname{div}X \tag{5.1} \]
in $\Omega$, where $R$ is the scalar curvature of $\Omega$, and
\[ k > \langle X, \nu\rangle \tag{5.2} \]
on $\Sigma$, where $\nu$ is the outward normal of $\Sigma$ and $k$ is the mean curvature of $\Sigma$ with respect to $\nu$. Suppose the Gauss curvature of $\Sigma$ is positive and $k_0$ is the mean curvature of the isometric embedding of $\Sigma$ into $\mathbb{R}^3$. Then
\[ \int_\Sigma k_0\, dv_\Sigma \ge \int_\Sigma \big(k - \langle X,\nu\rangle\big)\, dv_\Sigma. \]
Remark 5.1. When $X = 0$, the theorem was proved by Shi-Tam [26]. By the calculation in Schoen-Yau [23], the condition (5.1) holds for any solution of Jang's equation over an initial data set that satisfies the dominant energy condition. The vector field $X$ is the dual of $\langle\tilde\nabla_{\tilde e_4}\tilde e_4, \cdot\rangle - P(\tilde e_4, \cdot)$ in the notation of Sect. 4. In this case, Liu-Yau [16] essentially proved the theorem by conformally changing the metric to zero scalar curvature. The proof of Theorem 6.2 in [28] gives a direct proof without conformal change in a slightly different setting.
Proof. The idea of the proof is similar to the one by Shi-Tam. Consider the isometric embedding of $\Sigma$ into $\mathbb{R}^3$ and denote the region inside the image $\Sigma_0$ by $\Omega_0$. We then glue together $\Omega$ and $\mathbb{R}^3\setminus\Omega_0$ along the identification of $\Sigma$ and $\Sigma_0$. Write the metric on $\mathbb{R}^3\setminus\Omega_0$ in the form $dr^2 + g_r$, where $r$ is the distance function to $\Sigma_0$ and $g_r$ is the induced metric on the level set $\Sigma_r$ of $r$. Applying Bartnik's [2] quasi-spherical construction, we consider a new metric on $\mathbb{R}^3\setminus\Omega_0$ of the form $u^2dr^2 + g_r$ with zero scalar curvature and $u = \frac{k_0}{k - \langle X,\nu\rangle}$ at $r = 0$. $u$ then satisfies a parabolic equation and the solution gives an asymptotically flat metric on $\mathbb{R}^3\setminus\Omega_0$. Denote by $\tilde M$ the space $\Omega\cup(\mathbb{R}^3\setminus\Omega_0)$ with the new metric $u^2dr^2 + g_r$ on $\mathbb{R}^3\setminus\Omega_0$. The initial condition on $u$ implies the mean curvature of $\Sigma_0$ with respect to this new metric is $k - \langle X,\nu\rangle$. We still have the monotonicity formula, i.e. $\frac{d}{dr}\int_{\Sigma_r}k_0(r)\big(1 - \frac{1}{u}\big)\,dv_r \le 0$, where $k_0(r)$ is the mean curvature of $\Sigma_r$ in $\mathbb{R}^3$. Therefore, it remains to prove the positivity of the total mass of $\tilde M$. In the following, we prove a Lichnerowicz formula for such a manifold and the existence of harmonic spinors asymptotic to constant spinors. According to the standard Lichnerowicz formula, on $\Omega$ we have
\[ \int_\Omega\Big(|\nabla\psi|^2 + \frac14R|\psi|^2 - |D\psi|^2\Big) = \int_{\partial\Omega}\langle\psi, \nabla_\nu\psi + c(\nu)\cdot D\psi\rangle, \tag{5.3} \]
where $\psi$ is a spinor, $c(\cdot)$ is the Clifford multiplication, $\nabla$ is the spin connection, and $D$ is the Dirac operator. Integrating by parts, we obtain
\[ \int_{\partial\Omega}\langle X,\nu\rangle|\psi|^2 = \int_\Omega\big(\operatorname{div}X\,|\psi|^2 + X(|\psi|^2)\big). \]
Formula (5.3) is equivalent to
\[ \int_\Omega\Big(|\nabla\psi|^2 + \frac14(R + 2\operatorname{div}X)|\psi|^2 + \frac12X(|\psi|^2) - |D\psi|^2\Big) = \int_{\partial\Omega}\langle\psi, \nabla_\nu\psi + c(\nu)\cdot D\psi\rangle + \frac12\int_{\partial\Omega}\langle X,\nu\rangle|\psi|^2. \tag{5.4} \]
The boundary term can be rearranged as
\[ \int_{\partial\Omega}\Big(-\langle\psi, D^\partial\psi\rangle - \frac12\big(k - \langle X,\nu\rangle\big)|\psi|^2\Big), \]
where $-D^\partial\psi = \sum_{a=1}^{2}c(\nu)\cdot c(e_a)\cdot\nabla^\partial_{e_a}\psi$ for an orthonormal basis $e_1, e_2$ for the tangent bundle of $\Sigma$. Let $\tilde M_r \subset \tilde M$ be the region with $\partial\tilde M_r = \Sigma_r$. On $\tilde M_r\setminus\Omega$, we have
\[ \int_{\tilde M_r\setminus\Omega}\big(|\nabla\psi|^2 - |D\psi|^2\big) = \int_{\partial\Omega}\Big(\langle\psi, D^\partial\psi\rangle + \frac12\big(k - \langle X,\nu\rangle\big)|\psi|^2\Big) + \int_{\Sigma_r}\langle\nabla_{\nu_r}\psi + c(\nu_r)\cdot D\psi, \psi\rangle. \]
Adding these up, we obtain
\[ \int_{\tilde M_r}\Big(|\nabla\psi|^2 + \frac14(R + 2\operatorname{div}X)|\psi|^2 + \frac12X(|\psi|^2)\Big) = \int_{\tilde M_r}|D\psi|^2 + \int_{\Sigma_r}\langle\nabla_\nu\psi + c(\nu)\cdot D\psi, \psi\rangle. \tag{5.5} \]
We claim the left-hand side of (5.5) is always greater than or equal to $\frac12\int_{\tilde M_r}|\nabla\psi|^2$. This follows from the inequality $|\nabla\psi|^2 + |X|^2|\psi|^2 + X(|\psi|^2) \ge 0$, as
\[ X(|\psi|^2) = \langle\nabla_X\psi, \psi\rangle + \langle\psi, \nabla_X\psi\rangle \ge -2|\nabla_X\psi||\psi|. \]
Thus if we can solve the harmonic spinor equation $D\psi = 0$ we obtain
\[ \lim_{r\to\infty}\int_{\Sigma_r}\langle\nabla_\nu\psi + c(\nu)\cdot D\psi, \psi\rangle \ge 0, \]
and it is known that the limit expression for a constant spinor gives the total mass of $\tilde M$. Equation (5.5) also implies the following coercive estimate for spinors of compact support:
\[ \int_{\tilde M_r}|D\psi|^2 \ge \frac12\int_{\tilde M_r}|\nabla\psi|^2, \]
which is enough to establish the existence of harmonic spinors that are asymptotic to constant spinors at infinity.
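Theorem 5.2. Given an embedding $i: \Sigma \to N$ into a space-time that satisfies the dominant energy condition, suppose $\tau$ is admissible, then we have
\[ \int_\Sigma h(\Sigma, i_0, \tau, \breve e_3)\, dv_\Sigma \ge \int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma, \]
where $i_0: \Sigma \to \mathbb{R}^{3,1}$ is the isometric embedding into the Minkowski space given by Theorem B.
Proof. Since $\tau$ is admissible, by (2) and (3) of Definition 5.1, $\Sigma$ bounds a space-like hypersurface $\Omega$ such that Jang's equation over $\Omega$ with boundary value $\tau$ on $\Sigma$ is solvable and the generalized mean curvature $h(\Sigma, i, \tau, e_3')$ is positive. It follows from Theorem 4.1 that
\[ \tilde k - \langle\tilde\nabla_{\tilde e_4}\tilde e_4, \tilde e_3\rangle + P(\tilde e_4,\tilde e_3) > 0 \]
on $\tilde\Sigma$, the graph of $\tau$ over $\Sigma$. Take $X$ to be the vector field on $\tilde\Omega$ dual to $\langle\tilde\nabla_{\tilde e_4}\tilde e_4, \cdot\rangle - P(\tilde e_4, \cdot)$; we see that $\tilde\Omega$ satisfies the assumption of Theorem 5.1 by Remark 5.1. We can take the projection of $i_0$ onto the standard $\mathbb{R}^3$ slice determined by $t = 0$ and denote the image by $\hat\Sigma$. The induced metric on $\hat\Sigma$ is then isometric to the metric on the boundary surface $\tilde\Sigma$ of $\tilde\Omega$. Therefore, by Theorem 5.1 we have
\[ \int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} \ge \int_{\tilde\Sigma}\big(\tilde k - \langle X, \tilde e_3\rangle\big)\, dv_{\tilde\Sigma}. \tag{5.6} \]
The theorem follows from Eq. (3.4) and Eq. (4.6).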
We recall the statement of Theorem A and give the proof:
Theorem A. Let $N$ be a space-time that satisfies the dominant energy condition. Suppose $i: \Sigma \to N$ is a closed embedded space-like two-surface in $N$ with space-like mean curvature vector $H$. Let $i_0: \Sigma \to \mathbb{R}^{3,1}$ be an isometric embedding into the Minkowski space and let $\tau$ denote the restriction of the time function $t$ on $i_0(\Sigma)$. Let $\bar e_4$ be the future-directed time-like unit normal along $i(\Sigma)$ such that
\[ \langle H, \bar e_4\rangle = \frac{-\Delta\tau}{\sqrt{1+|\nabla\tau|^2}} \]
and $\bar e_3$ be the space-like unit normal along $\Sigma$ with $\langle\bar e_3, \bar e_4\rangle = 0$ and $\langle H, \bar e_3\rangle < 0$. Let $\hat\Sigma$ be the projection of $i_0(\Sigma)$ onto $\mathbb{R}^3 = \{t = 0\} \subset \mathbb{R}^{3,1}$ and $\hat k$ be the mean curvature of $\hat\Sigma$ in $\mathbb{R}^3$. If $\tau$ is admissible (see Definition 5.1), then
\[ \int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} - \int_\Sigma\Big[-\sqrt{1+|\nabla\tau|^2}\,\langle H, \bar e_3\rangle - \alpha_{\bar e_3}(\nabla\tau)\Big]\, dv_\Sigma \]
is non-negative.
Proof. Because $\tau$ is admissible and $i_0$ is the unique isometric embedding into $\mathbb{R}^{3,1}$ associated with $\tau$, by Eq. (5.6) and Eq. (4.6), we have
\[ \int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} \ge \int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma. \tag{5.7} \]
By Proposition 2.1, $\int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma \ge H(\Sigma, i, \tau)$. Theorem A now follows from Definition 2.2.
Rewriting the integrals, we obtain:
Corollary 5.1. Given an embedding $i: \Sigma \to N$ into a space-time that satisfies the dominant energy condition, suppose the mean curvature vector of $\Sigma$ in $N$ is space-like and $\tau$ is admissible, then $H(\Sigma, i_0, \tau) \ge H(\Sigma, i, \tau)$, where $i_0: \Sigma \to \mathbb{R}^{3,1}$ is the isometric embedding into the Minkowski space given by Theorem B.
Proof. We have from Eq. (3.4) and Eq. (3.7),
\[ \int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} = \int_\Sigma h(\Sigma, i_0, \tau, \breve e_3)\, dv_\Sigma = H(\Sigma, i_0, \tau) \]
and
\[ \int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma \ge H(\Sigma, i, \tau) \]
from Proposition 2.1.
Corollary 5.2. Under the assumption of Theorem 5.1, if the set of admissible $\tau$ is non-empty, then the quasi-local mass is non-negative. It is zero if the embedding $i: \Sigma \to N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$.
Proof. The first part follows from the previous corollary. If $i: \Sigma \to N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$, we can take the isometric embedding $i_0: \Sigma \to \mathbb{R}^{3,1}$; the restriction of the time function $\tau$ will be admissible and all the inequalities become equalities by the uniqueness of $\bar e_4$.
By the boundary gradient estimate of Jang's equation, a constant function is admissible if $\Sigma$ has positive Gauss curvature and the mean curvature vector of $\Sigma$ in $N$ is space-like.
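Corollary 5.3. Under the assumption of Theorem 5.1, and supposing $\Sigma$ has positive Gauss curvature, then the quasi-local mass is non-negative. It is zero if the embedding $i: \Sigma \to N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$.
Suppose the minimum is achieved at some $\tau$; we can consider the isometric embedding determined by $\tau$ and define a quasi-local energy momentum vector. This is particularly useful when we have a family of surfaces $\Sigma_s \to N$; we find the optimal isometric embedding into $\mathbb{R}^{3,1}$ and apply the procedure to get a family of future-directed time-like vectors in $\mathbb{R}^{3,1}$.
6. The Equation of the Optimal Isometric Embedding
6.1. Variation of total mean curvature. Let $\Sigma$ be an orientable closed embedded hypersurface in $\mathbb{R}^{n+1}$. Denote the outward normal by $\nu$ and the mean curvature with respect to $\nu$ by $H$. We study how the total mean curvature $\int_\Sigma H\, dv_\Sigma$ changes with respect to the induced metric $\sigma_{ij}$. We fix a local coordinate system $u^i$ on $\Sigma$. The variational field $\delta X = Y$ can be decomposed into the tangential and normal part
\[ Y = a^k\frac{\partial X}{\partial u^k} + b\nu. \]
We compute the variation of the induced metric $\sigma_{ij} = \langle\frac{\partial X}{\partial u^i}, \frac{\partial X}{\partial u^j}\rangle$,
\[ \delta\sigma_{ij} = \Big\langle\frac{\partial}{\partial u^i}\Big(a^k\frac{\partial X}{\partial u^k} + b\nu\Big), \frac{\partial X}{\partial u^j}\Big\rangle + \Big\langle\frac{\partial X}{\partial u^i}, \frac{\partial}{\partial u^j}\Big(a^k\frac{\partial X}{\partial u^k} + b\nu\Big)\Big\rangle = \frac{\partial a^k}{\partial u^i}\sigma_{kj} + a^k\Upsilon_{ikj} + bh_{ij} + \frac{\partial a^k}{\partial u^j}\sigma_{ik} + a^k\Upsilon_{jki} + bh_{ij}, \tag{6.1} \]
where $\Upsilon_{ikj} = \langle\frac{\partial^2X}{\partial u^i\partial u^k}, \frac{\partial X}{\partial u^j}\rangle$ is the Christoffel symbol of $\sigma_{ij}$ and $h_{ij} = \langle\frac{\partial\nu}{\partial u^i}, \frac{\partial X}{\partial u^j}\rangle$ is the second fundamental form. Denote by $\nabla_i$ the covariant derivative with respect to $\frac{\partial}{\partial u^i}$. We solve for
\[ bh_{ij} = \frac12\big(\delta\sigma_{ij} - \nabla_ia^k\sigma_{kj} - \nabla_ja^k\sigma_{ik}\big). \tag{6.2} \]
Next we compute the variation of the mean curvature $H = -\sigma^{ij}\langle\frac{\partial^2X}{\partial u^i\partial u^j}, \nu\rangle$,
\[ \delta H = h_{ij}\,\delta\sigma^{ij} - \sigma^{ij}\Big\langle\frac{\partial^2Y}{\partial u^i\partial u^j}, \nu\Big\rangle - \sigma^{ij}\Big\langle\frac{\partial^2X}{\partial u^i\partial u^j}, \delta\nu\Big\rangle. \tag{6.3} \]
We derive $\delta\sigma^{ij} = -\sigma^{ik}\,\delta\sigma_{kl}\,\sigma^{lj}$, and
\[ \delta\nu = \Big(a^kh_{lk} - \frac{\partial b}{\partial u^l}\Big)\sigma^{lj}\frac{\partial X}{\partial u^j}. \]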
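On the other hand,
\[ \Big\langle\frac{\partial^2Y}{\partial u^i\partial u^j}, \nu\Big\rangle = \Big\langle\frac{\partial^2}{\partial u^i\partial u^j}\Big(a^k\frac{\partial X}{\partial u^k} + b\nu\Big), \nu\Big\rangle = \Big\langle\frac{\partial}{\partial u^i}\Big(\frac{\partial a^k}{\partial u^j}\frac{\partial X}{\partial u^k} + a^k\frac{\partial^2X}{\partial u^j\partial u^k} + \frac{\partial b}{\partial u^j}\nu + b\frac{\partial\nu}{\partial u^j}\Big), \nu\Big\rangle. \]
Substitute in
\[ \frac{\partial^2X}{\partial u^j\partial u^k} = \Upsilon^l_{jk}\frac{\partial X}{\partial u^l} - h_{jk}\nu \quad\text{and}\quad \frac{\partial\nu}{\partial u^j} = h_{jl}\sigma^{lk}\frac{\partial X}{\partial u^k}, \]
and we obtain
\[ \Big\langle\frac{\partial^2Y}{\partial u^i\partial u^j}, \nu\Big\rangle = \Big\langle\frac{\partial}{\partial u^i}\Big(\big(\nabla_ja^k + bh_{jl}\sigma^{lk}\big)\frac{\partial X}{\partial u^k} + \Big(\frac{\partial b}{\partial u^j} - a^kh_{jk}\Big)\nu\Big), \nu\Big\rangle = -h_{ik}\big(\nabla_ja^k + bh_{jl}\sigma^{lk}\big) + \frac{\partial}{\partial u^i}\Big(\frac{\partial b}{\partial u^j} - a^lh_{jl}\Big). \]
Plug these into (6.3) and we arrive at
\[ \delta H = -\sigma^{ik}\sigma^{lj}h_{ij}\,\delta\sigma_{kl} - \Delta b + \sigma^{ij}h_{ik}\nabla_ja^k + b\,\sigma^{ij}\sigma^{lk}h_{ik}h_{jl} + \sigma^{ij}\nabla_i(a^kh_{jk}). \]
We plug (6.2) into this equation and obtain
\[ \delta H = -\frac12\sigma^{ik}\sigma^{lj}h_{ij}\,\delta\sigma_{kl} - \Delta b + \sigma^{ij}\nabla_i(a^kh_{jk}). \]
Proposition 6.1. Let $\Sigma$ be a closed embedded hypersurface in $\mathbb{R}^{n+1}$. The variation of the total mean curvature with respect to a metric deformation is
\[ \delta\int_\Sigma H\, dv_\Sigma = \frac12\int_\Sigma\big(H\sigma^{ij} - \sigma^{ik}\sigma^{jl}h_{kl}\big)\,\delta\sigma_{ij}\, dv_\Sigma. \]
Corollary 6.1. If $X_s: \Sigma \to \mathbb{R}^{n+1}$ is a smooth family of isometric embeddings of a compact $n$-manifold $\Sigma$, then the total mean curvature is a constant.
6.2. The variational equation. In this section, $\sigma_{ab}$ denotes the metric on a two-surface $\Sigma$ which satisfies the assumption in Theorem A. Recall the metric on the projection $\hat\Sigma$ is $\hat\sigma_{ab} = \sigma_{ab} + \tau_a\tau_b$. The metric $\sigma_{ab}$ is fixed for the isometric embedding and thus $\delta\hat\sigma_{ab} = \delta(\tau_a\tau_b)$. The quasi-local mass expression we try to minimize is
\[ \int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} - \int_\Sigma\Big[\sqrt{1+|\nabla\tau|^2}\,\cosh\theta\,|H| - \nabla\tau\cdot\nabla\theta - V\cdot\nabla\tau\Big]\, dv_\Sigma, \]
where $\sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1+|\nabla\tau|^2}}$, and $V$ is the tangent vector on $\Sigma$ that is dual to the connection one-form $\alpha_{\hat e_3}$ determined by $\hat e_3 = -\frac{H}{|H|}$. This is an expression that is determined by $\sigma_{ab}$ and the mean curvature vector $H$.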
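We can take $X = (x(u^a), y(u^a), z(u^a), \tau(u^a)): \Sigma \to \mathbb{R}^{3,1}$ and $\hat X = (x(u^a), y(u^a), z(u^a)): \hat\Sigma \to \mathbb{R}^3$. Notice that $dv_{\hat\Sigma} = \sqrt{\det\hat\sigma_{ab}}\,du^1du^2$ and $dv_\Sigma = \sqrt{\det\sigma_{ab}}\,du^1du^2$. Recall from the last section that the variation of $\int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma}$ is
\[ \delta\int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} = \int_{\hat\Sigma}\big(\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\tau_a(\delta\tau)_b\, dv_{\hat\Sigma}. \]
Integrate by parts and recall that the tensor $\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}$ is divergence free on $\hat\Sigma$; we obtain
\[ \delta\int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} = -\int_{\hat\Sigma}\big(\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\hat\nabla_b\hat\nabla_a\tau\;\delta\tau\, dv_{\hat\Sigma}. \tag{6.4} \]
The relation between the Hessians of $\tau$ on $\Sigma$ and $\hat\Sigma$ is given by
\[ \hat\nabla_b\hat\nabla_a\tau = \frac{1}{1+|\nabla\tau|^2}\nabla_b\nabla_a\tau. \]
Since we also have $\frac{\det\hat\sigma}{\det\sigma} = 1+|\nabla\tau|^2$,
\[ \delta\int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} = -\int_\Sigma\big(\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\frac{\nabla_b\nabla_a\tau}{\sqrt{1+|\nabla\tau|^2}}\,\delta\tau\, dv_\Sigma. \]
Now
\[ \delta\int_\Sigma\Big[\sqrt{1+|\nabla\tau|^2}\,\cosh\theta\,|H| - \nabla\tau\cdot\nabla\theta\Big]\, dv_\Sigma = \int_\Sigma\Big[(1+|\nabla\tau|^2)^{-1/2}\,\nabla\tau\cdot\nabla\delta\tau\,\cosh\theta\,|H| + \sqrt{1+|\nabla\tau|^2}\,\sinh\theta\,\delta\theta\,|H|\Big]\, dv_\Sigma - \int_\Sigma\big(\nabla\delta\tau\cdot\nabla\theta + \nabla\tau\cdot\nabla\delta\theta\big)\, dv_\Sigma. \]
Substitute $\sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1+|\nabla\tau|^2}}$ and integrate by parts; we obtain
\[ \delta\int_\Sigma\Big[\sqrt{1+|\nabla\tau|^2}\,\cosh\theta\,|H| - \nabla\tau\cdot\nabla\theta\Big]\, dv_\Sigma = \int_\Sigma\Big[\frac{\nabla\tau}{\sqrt{1+|\nabla\tau|^2}}\cosh\theta\,|H| - \nabla\theta\Big]\cdot\nabla\delta\tau\, dv_\Sigma. \]
Proposition 6.2. The variation of $\int_{\hat\Sigma}\hat k\, dv_{\hat\Sigma} - \int_\Sigma\big[\sqrt{1+|\nabla\tau|^2}\cosh\theta\,|H| - \nabla\tau\cdot\nabla\theta - V\cdot\nabla\tau\big]\, dv_\Sigma$ with respect to $\tau$ is
\[ \int_\Sigma\Big[-\big(\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\frac{\nabla_b\nabla_a\tau}{\sqrt{1+|\nabla\tau|^2}} + \operatorname{div}_\Sigma\Big(\frac{\nabla\tau}{\sqrt{1+|\nabla\tau|^2}}\cosh\theta\,|H| - \nabla\theta - V\Big)\Big]\cdot\delta\tau\, dv_\Sigma. \tag{6.5} \]
Therefore, the equation for the minimizing isometric embedding is
\[ -\big(\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\frac{\nabla_b\nabla_a\tau}{\sqrt{1+|\nabla\tau|^2}} + \operatorname{div}_\Sigma\Big(\frac{\nabla\tau}{\sqrt{1+|\nabla\tau|^2}}\cosh\theta\,|H| - \nabla\theta - V\Big) = 0 \tag{6.6} \]
with $\sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1+|\nabla\tau|^2}}$.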
Acknowledgements. We wish to thank Richard Hamilton for helpful discussions on isometric embeddings and Melissa Liu for her interest and reading of an earlier version of this article. The first author would like to thank Naqing Xie for pointing out several typos in an earlier version.
References 1. Bartnik, R.: New definition of quasilocal mass. Phys. Rev. Lett. 62(20), 2346–2348 (1989) 2. Bartnik, R.: Quasi-spherical metrics and prescribed scalar curvature. J. Diff. Geom. 37, 31–71 (1993) 3. Booth, I.S., Mann, R.B.: Moving observers, nonorthogonal boundaries, and quasilocal energy. Phys. Rev. D. 59, 064021 (1999) 4. Brown, J.D., York, J.W.: Quasilocal energy in general relativity. In: Mathematical aspects of classical field theory (Seattle, WA, 1991), Contemp. Math. 132, Providence, RI: Amer. Math. Soc., 1992, pp. 129–142 5. Brown, J.D., York, J.W.: Quasilocal energy and conserved charges derived from the gravitational action. Phys. Rev. D (3) 47(4), 1407–1419 (1993) 6. Brown, J.D., Lau, S.R., York, J.W.: Energy of isolated systems at retarded times as the null limit of quasilocal energy. Phys. Rev. D (3) 55(4), 1977–1984 (1997) 7. Christodoulou, D., Yau, S.-T.: Some remarks on the quasi-local mass. In: Mathematics and general relativity (Santa Cruz, CA, 1986), Contemp. Math. 71, Providence, RI: Amer. Math. Soc., 1988, pp. 9–14 8. Eardley, M.: Global problems in numerical relativity. In Sources of gravitational radiation, Cambridge: Cambridge Univ. Press, 1979, pp. 127–138 9. Epp, R.J.: Angular momentum and an invariant quasilocal energy in general relativity. Phys. Rev. D 62(12), 124108 (2000) 10. Gibbons, G.W.: Collapsing shells and the isoperimetric inequality for black holes. Class. Quant. Gravi. 14(10), 2905–2915 (1997) 11. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Second edition. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 224, Berlin: Springer-Verlag, 1983 12. Hawking, S.W., Horowitz, G.T.: The gravitational Hamiltonian, action, entropy and surface terms. Class. Quant. Grav. 13(6), 1487–1498 (1996) 13. Jang, P.S.: On the positivity of energy in general relativity. J. Math. Phys. 19(5), 1152–1155 (1978) 14. 
Kijowski, J.: A simple derivation of canonical structure and quasi-local Hamiltonians in general relativity. Gen. Relativity Gravitation 29(3), 307–343 (1997) 15. Lau, S.R.: New variables, the gravitational action and boosted quasilocal stress-energy-momentum. Class. Quant. Grav. 13(6), 1509–1540 (1996) 16. Liu, C.-C.M., Yau, S.-T.: Positivity of quasilocal mass. Phys. Rev. Lett. 90(23), 231102 (2003) 17. Liu, C.-C.M., Yau, S.-T.: Positivity of quasilocal mass II. J. Amer. Math. Soc. 19(1), 181–204 (2006) 18. Nirenberg, L.: The Weyl and Minkowski problems in differential geometry in the large. Comm. Pure Appl. Math. 6, 337–394 (1953) 19. Ó Murchadha, N., Szabados, L.B., Tod, K.P.: Comment on “Positivity of quasilocal mass”. Phys. Rev. Lett 92, 259001 (2004) 20. Penrose, R.: Some unsolved problems in classical general relativity. In: Seminar on Differential Geometry, Ann. of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 631–668 21. Pogorelov, A.V.: Regularity of a convex surface with given Gaussian curvature (in Russian). Mat. Sbornik N.S. 31(73), 88–103 (1952) 22. Schoen, R., Yau, S.-T.: Positivity of the total mass of a general space-time. Phys. Rev. Lett. 43(20), 1457–1459 (1979) 23. Schoen, R., Yau, S.-T.: On the proof of the positive mass conjecture in general relativity. Commun. Math. Phys. 65(1), 45–76 (1979)
24. Schoen, R., Yau, S.-T.: Proof of the positive mass theorem II. Commun. Math. Phys. 79(2), 231–260 (1981) 25. Schoen, R., Yau, S.-T.: The existence of a black hole due to condensation of matter. Commun. Math. Phys. 90, 575–579 (1983) 26. Shi, Y., Tam, L.-F.: Positive mass theorem and the boundary behavior of compact manifolds with nonnegative scalar curvature. J. Diff. Geom. 62(1), 79–125 (2002) 27. Szabados, L.B.: Quasi-local energy-momentum and angular momentum in GR: a review article. Living Rev. Relativity 7, 4 (2004) 28. Wang, M.-T., Yau, S.-T.: A generalization of Liu-Yau’s quasi-local mass. Comm. Anal. Geom. 15(2), 249–282 (2007) 29. Wang, M.-T., Yau, S.-T.: Quasilocal mass in general relativity. Phys. Rev. Lett. 102, 021101 (2009). http://arXiv.org/abs/0804.1174v3[gr-qc] 30. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80(3), 381–402 (1981) 31. Yau, S.-T.: Geometry of three manifolds and existence of black hole due to boundary effect. Adv. Theor. Math. Phys. 5(4), 755–767 (2001) 32. Zhang, X.: A new quasi-local mass and positivity. Acta Mathematica Sinica (English Series) 24(6), 881–890 (2008) Communicated by G. W. Gibbons
Commun. Math. Phys. 288, 943–961 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0783-7
Communications in
Mathematical Physics
Reconstruction of Random Colourings Allan Sly Statistics Department, University of California, Berkeley, CA 94720, USA. E-mail:
[email protected] Received: 25 May 2008 / Accepted: 18 December 2008 Published online: 20 March 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract: Reconstruction problems have been studied in a number of contexts including biology, information theory and statistical physics. We consider the reconstruction problem for random $k$-colourings on the $\Delta$-ary tree for large $k$. Bhatnagar et al. [2] showed non-reconstruction when $\Delta \le \frac12 k\log k - o(k\log k)$. We tighten this result and show non-reconstruction when $\Delta \le k[\log k + \log\log k + 1 - \log 2 - o(1)]$, which is very close to the best known bound establishing reconstruction, which is $\Delta \ge k[\log k + \log\log k + 1 + o(1)]$.
1. Introduction
Determining the reconstruction threshold of a Markov random field has been of interest in a number of areas including biology, information theory and statistical physics. Reconstruction thresholds on trees are believed to determine the dynamical phase transitions in many constraint satisfaction problems including random K-SAT and random colourings on random graphs. It is thought that at this point the space of solutions splits into exponentially many clusters. The properties of the space of solutions of these problems are of interest to physicists, probabilists and theoretical computer scientists.
It is known [18,20,21] that reconstruction holds when the number of colours satisfies $k[\log k + \log\log k + 1 + o(1)] \le \Delta$. This bound is given by the analysis of a naive reconstruction algorithm which reconstructs the root only when it is known with absolute certainty given the leaves. The problem of finding good bounds for when non-reconstruction holds is more difficult: it requires showing that the spins on the root and the leaves are asymptotically independent. The best previous rigorous result was that $\Delta \le \frac12 k\log k - o(k\log k)$ implies non-reconstruction [2]. We improve this to $\Delta \le k[\log k + \log\log k + 1 - \log 2 - o(1)]$. Even at a heuristic level no non-reconstruction bound as good as ours was known.
Supported by NSF grants DMS-0528488 and DMS-0548249.
1.1. Definitions. We begin by giving a general description of broadcast models on trees and the reconstruction problem. The broadcast model on a tree T is a model in which information is sent from the root ρ across the edges, which act as noisy channels, to the leaves of T . For some given finite set of characters C a configuration on T is an element of C T , that is an assignment of a character C to each vertex. The broadcast model is a probability distribution on configurations defined as follows. Some |C| × |C| probability transition matrix M is chosen as the noisy channel on each edge. The spin σρ is chosen from C according to some initial distribution and is then propagated along the edges of the tree according to the transition matrix M. That is if vertex u is the parent of v in the tree then the spin at v is defined according to the probabilities P(σv = j|σu = i) = Mi,j . We will focus on the colouring model with |C| = k which is given by the transition 1i=j matrix Mi,j = k−1 . Broadcast models and in particular colourings can also be considered as Gibbs measures on trees. Given a finite set of colours k and a graph T = (V, E), a k-colouring is an assignment of a colour to each vertex so that adjacent vertices have different colours. The random k-colouring model is then the uniform probability distribution on valid kcolourings of the graph. It is a Gibbs measure or Markov random field on the space of configurations σ ∈ {1, . . . , k}V given by 1 P(σ ) = 1σu =σv , Z (u,v)∈E
where $Z$ is a normalizing constant given by the number of colourings of $T$. On an infinite tree more than one Gibbs measure may exist; the broadcast colouring model corresponds to the free Gibbs measure. We will restrict our attention to $\Delta$-ary trees, that is, the infinite rooted tree where every vertex has $\Delta$ offspring. Let $L(n)$ denote the spins at distance $n$ from the root.

Definition 1. We say that a model is reconstructible on a tree $T$ if
\[ \limsup_n \sum_{i,L} \bigl| P(\sigma_\rho = i, L(n) = L) - P(\sigma_\rho = i) P(L(n) = L) \bigr| > 0, \]
where the sum is over all $i \in C$ and all configurations $L$ on the vertices at distance $n$ from the root. When the limsup is 0 we will say the model has non-reconstruction on $T$.

Non-reconstruction is equivalent to the mutual information between $\sigma_\rho = L(0)$ and $L(n)$ going to 0 as $n$ goes to infinity, and also to $\{L(n)\}_{n=1}^{\infty}$ having a trivial tail sigma-field. More equivalent formulations are given in [17] Prop. 2.1. As increasing $\Delta$ only increases the information on the root, we can define $\Delta^*(k)$ to be the reconstruction threshold, that is, the smallest $\Delta$ such that reconstruction holds on the $\Delta$-ary tree. In contrast to reconstruction consider the uniqueness property of a model.

Definition 2. We say that a model has uniqueness on a tree $T$ if
\[ \limsup_n \sup_{L,L'} \sum_{i \in C} \bigl| P(\sigma_\rho = i \mid L(n) = L) - P(\sigma_\rho = i \mid L(n) = L') \bigr| = 0, \]
where the supremum is over all configurations $L, L'$ on the vertices at distance $n$ from the root.
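For very small trees the sum in Definition 1 can be evaluated exactly by enumerating leaf configurations. The following is only an illustrative sketch: the function names are hypothetical, colours are indexed from 0, the root is taken uniform on the $k$ colours, and the enumeration is exponential in the number of leaves, so it says nothing about the limit $n \to \infty$.

```python
from itertools import product

def leaf_likelihood(colour, leaves, k, delta, depth):
    """P(leaf colours = leaves | subtree root has the given colour), colouring channel."""
    if depth == 0:
        # At depth 0 the "leaf level" is the vertex itself.
        return 1.0 if leaves[0] == colour else 0.0
    size = len(leaves) // delta
    prob = 1.0
    for j in range(delta):
        block = leaves[j * size:(j + 1) * size]
        # Each child is uniform over the k-1 colours different from its parent.
        prob *= sum(leaf_likelihood(c, block, k, delta, depth - 1)
                    for c in range(k) if c != colour) / (k - 1)
    return prob

def reconstruction_sum(k, delta, n):
    """Sum over (i, L) of |P(root=i, L(n)=L) - P(root=i) P(L(n)=L)| on the delta-ary tree."""
    total = 0.0
    for leaves in product(range(k), repeat=delta ** n):
        joint = [leaf_likelihood(i, leaves, k, delta, n) / k for i in range(k)]
        p_leaves = sum(joint)
        total += sum(abs(joint[i] - p_leaves / k) for i in range(k))
    return total
```

For instance, `reconstruction_sum(3, 2, 1)` evaluates to 1 (up to rounding). Since $L(n+1)$ depends on the root only through $L(n)$, the quantity cannot increase with $n$.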
Reconstruction of Random Colourings
Reconstruction implies non-uniqueness and is a strictly stronger condition. Essentially, non-uniqueness says that there is some configuration on the leaves which provides information on the root, while reconstruction says that a typical configuration on the leaves provides information on the root.

1.2. Background. For some parameterized collection of models the key question in studying reconstruction is finding which models have reconstruction, which typically involves finding a threshold. This problem naturally arises in biology, information theory and statistical physics and involves the trade-off between increasing numbers of leaves with increasingly noisy information as the distance from the root to the leaves increases. The simplest collection of models is the binary symmetric channel, which is defined on two characters with
\[ M = \begin{pmatrix} 1-\epsilon & \epsilon \\ \epsilon & 1-\epsilon \end{pmatrix} \]
for $0 < \epsilon < \frac12$, which corresponds to the ferromagnetic Ising model on the tree with no external field. It was shown in [3] and [7] that this channel has reconstruction if and only if $\Delta(1 - 2\epsilon)^2 > 1$.

The broadcast model is a natural model for the evolution of characters of DNA. In phylogenetic reconstruction the goal is to reconstruct the ancestral tree of a collection of species given their genetic data. Daskalakis, Mossel and Roch [5,16] proved the conjecture of Mike Steel that the number of samples required for phylogenetic reconstruction undergoes a phase transition at the reconstruction threshold for the binary symmetric channel.

Exact reconstruction thresholds have only been calculated in the binary symmetric model and binary asymmetric models with sufficiently small asymmetry [4]. In both these cases the threshold corresponds to the Kesten-Stigum bound [10]. The Kesten-Stigum bound shows that reconstruction holds whenever $\Delta \lambda_2(M)^2 > 1$, where $\lambda_2(M)$ denotes the second largest eigenvalue of $M$.
In fact when $\Delta \lambda_2(M)^2 > 1$, it is possible to asymptotically reconstruct the root from just knowing the number of times each character appears on the leaves (census reconstruction), without using the information on their positions on the leaves. Mossel [15,17] showed that the Kesten-Stigum bound is not the bound for reconstruction in the binary asymmetric model with sufficiently large asymmetry or in the Potts model with sufficiently many characters.

It was shown in [9] that $k$-colourings have uniqueness on $\Delta$-ary trees if and only if $k \ge \Delta + 2$, which therefore also establishes non-reconstruction in this regime. Exactly finding the threshold for reconstruction is difficult, so most attention has been focused on finding its asymptotics as the number of colours and the degree go to infinity. Recently [2] greatly improved this bound, showing that $\Delta^*(k) \ge (\frac12 + o(1)) k \log k$. On the other hand [18] showed that when $\Delta \ge (1 + o(1)) k \log k$ then with high probability in $k$ the spin of the root is exactly determined by the leaves and so reconstruction is possible. With a more detailed analysis this argument can be improved to show reconstruction when $k[\log k + \log\log k + 1 + o(1)] \le \Delta$, as was shown in [20,21]. This is a large improvement on the Kesten-Stigum bound, which implies reconstruction when $\Delta > (k - 1)^2$. In related work Mézard and Montanari [14] found a variational principle which establishes bounds on reconstruction for colourings but which is asymptotically weaker than Lemma 7. Our results establish extremely tight bounds on $\Delta^*(k)$, with the upper and lower bounds differing by just $(\log 2 + o(1)) k$ rather than $\frac12 k \log k$ previously.
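To see how far the Kesten-Stigum degree $(k-1)^2$ lies from the $k \log k$ scale, one can check the second eigenvalue of the colouring matrix directly and evaluate the leading terms of the bounds quoted above. This is a rough sketch: the function names are hypothetical, and the $o(1)$ terms are simply set to zero, which is an assumption made for illustration only.

```python
import math

def colouring_matrix(k):
    """Transition matrix M[i][j] = 1/(k-1) for i != j and 0 on the diagonal."""
    return [[0.0 if i == j else 1.0 / (k - 1) for j in range(k)] for i in range(k)]

def apply(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def second_eigenvalue(k):
    """Verify that e_1 - e_2 is an eigenvector of M and return its eigenvalue."""
    v = [1.0, -1.0] + [0.0] * (k - 2)
    mv = apply(colouring_matrix(k), v)
    lam = mv[0] / v[0]
    assert all(abs(a - lam * b) < 1e-12 for a, b in zip(mv, v))
    return lam  # equals -1/(k-1)

def kesten_stigum_degree(k):
    """Smallest integer degree Delta with Delta * lambda_2^2 > 1, i.e. Delta > (k-1)^2."""
    return (k - 1) ** 2 + 1

def sharp_window(k):
    """Leading terms k[log k + log log k + 1 - log 2] and k[log k + log log k + 1]."""
    base = math.log(k) + math.log(math.log(k)) + 1
    return k * (base - math.log(2)), k * base
```

For $k = 1000$ the window returned by `sharp_window` sits near $10^4$, while `kesten_stigum_degree(1000)` is close to $10^6$.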
A. Sly
Theorem 1. The $k$-colouring model has reconstruction threshold $\Delta^*(k)$ satisfying
\[ \Delta^*(k) \le k[\log k + \log\log k + 1 + o(1)] \]
and
\[ \Delta^*(k) \ge k[\log k + \log\log k + 1 - \log 2 - o(1)]. \]

1.3. Applications to Statistical Physics. The reconstruction threshold on trees is believed to play a critical role in the dynamical phase transitions in certain glassy systems given by random constraint satisfaction problems. Important examples include random K-SAT and random colourings on random graphs. We will briefly describe what is conjectured by physicists about such systems [11,21], generally without rigorous proof, and why understanding the reconstruction threshold for colourings plays an important role in such systems.

The Erdős-Rényi random graph $G(n, p)$ is a random graph on $n$ vertices where every pair of vertices is connected with probability $p$. To maintain constant average degree $\Delta$ we let $p = \Delta/n$. The $k$-colouring model on $G(n, \Delta/n)$ or random $\Delta$-regular graphs undergoes several phase transitions as $\Delta$ grows. If we consider the space of solutions to the random colouring model, where two colourings are adjacent if they differ at no more than $o(n)$ vertices, then for the smallest values of $\Delta$ the space of solutions forms a large connected component. Above the clustering transition $\Delta_d$ the space of solutions breaks into exponentially many disconnected clusters and has no giant component with a constant fraction of the probability. This replica symmetry breaking transition is believed [11,12] to occur at $\Delta_d = k[\log k + \log\log k + \alpha + o(1)]$. In a recent remarkable result, [1] rigorously proved that when $(1 + o(1)) k \log k \le \Delta \le (2 - o(1)) k \log k$, the space of solutions indeed breaks into exponentially many small clusters. A second transition occurs when most clusters have frozen spins, that is, vertices which have the same colour in every colouring in the cluster.
This phase transition is believed to occur at $\Delta_r = k[\log k + \log\log k + 1 + o(1)]$ [20,21] and is the best upper bound known for $\Delta_d$. Two more transitions are believed to occur: condensation, where the size of the clusters is given by a Poisson-Dirichlet process, and the colouring threshold, beyond which no more colourings are possible. These transitions are conjectured to occur at $\Delta_c = 2k \log k - \log k - 2\log 2 + o(1)$ and $\Delta_s = 2k \log k - \log k - 1 + o(1)$ respectively [21]. Similar results are also expected to hold for K-SAT and other random constraint satisfiability problems [11].

Both random regular and Erdős-Rényi random graphs are locally tree-like. Asymptotically in a random regular graph the neighbourhood of a random vertex is a regular tree, and for Erdős-Rényi random graphs it is a Galton-Watson branching process tree with Poisson offspring distribution. It is conjectured [11] that the reconstruction threshold on the corresponding tree is exactly the clustering threshold $\Delta_d$ on the random graph. As such, rigorous estimates of the reconstruction problem can be seen as part of a larger program of understanding glassy phases in constraint satisfaction problems.

The clustering threshold is also believed to play an important role in the efficiency of MCMC algorithms for finding and sampling from colourings of the graphs. MCMC algorithms are believed to be efficient up to the clustering threshold but experience an exponential slowdown beyond it [11]. This is to be expected since a local MCMC algorithm cannot move between clusters, each of which has exponentially small probability. Rigorous proofs of rapid mixing of MCMC algorithms, such as the Glauber dynamics, fall a long way behind. For random regular graphs, results of [6] imply rapid mixing when
$k \ge 1.49\Delta$, well below the reconstruction threshold and even the uniqueness threshold. Even less is known for Erdős-Rényi random graphs, as almost all MCMC results are given in terms of the maximum degree, which in this case grows with $n$. Polynomial time mixing of the Glauber dynamics has been shown [19] for a constant number of colours in terms of $\Delta$.

1.4. Open Problem. If the probability that the leaves uniquely determine the spin at the root does not go to 0 as $n$ goes to infinity then the model has reconstruction. It is natural to ask whether this is a necessary condition for reconstruction. When $k = 5$ and $\Delta = 14$ it was shown in [14] using a variational principle that reconstruction holds but the probability that the leaves fix the root goes to 0. However, this is the only case in which the variational principle gives an upper bound on the number of colours required for reconstruction which is better than the bound of the leaves fixing the root. It remains open to determine whether, for large numbers of colours and high degree, this is exactly the reconstruction threshold. Numerical results of [21] suggest this is in fact not the case and there are two separate thresholds. Answering this question would be of significant interest.

2. Proofs

We introduce the notation we use in the proofs. We denote the colours by $C = \{1, \dots, k\}$ and let $T$ be the $\Delta$-ary tree rooted at $\rho$. Let $u_1, \dots, u_\Delta$ be the children of $\rho$ and let $T_j$ denote the subtree of descendants of $u_j$. Let $P(\sigma)$ denote the free measure on colourings on the $\Delta$-ary tree. Let $L(n)$ denote the spins at distance $n$ from $\rho$ and let $L_j(n)$ denote the spins on level $n$ in the subtree $T_j$. We let $E^i$ and $P^i$ denote the expectation and probability with respect to the measure conditioned to have $i$ at the root. For a random variable $U$, a function of $\sigma$, we will let $\mathcal{L}(U)$ denote the law of $U$ and $\mathcal{L}^i(U)$ denote its conditional law with respect to the measure conditioned to have $i$ at the root.
For a configuration $L$ on the spins at distance $n$ from $\rho$ define the deterministic function $f_n$ as $f_n(i, L) = P(\sigma_\rho = i \mid L(n) = L)$. By the recursive nature of the tree we also have that $f_n(i, L) = P(\sigma_{u_j} = i \mid L_j(n) = L)$. Now define $X_i(n) = X_i$ by $X_i(n) = f_n(i, L(n))$. These random variables are a deterministic function of the random configuration $L(n)$ of the leaves which gives the marginal probability that the root is in state $i$. By symmetry the $X_i$ are exchangeable. Now we define two distributions
\[ X^+ = X^+(n) = \mathcal{L}^1\bigl(f_n(1, L(n))\bigr) \quad\text{and}\quad X^- = X^-(n) = \mathcal{L}^2\bigl(f_n(1, L(n))\bigr). \]
We will establish non-reconstruction by showing that the distributions $X^+$ and $X^-$ both converge to $\frac1k$ as $n$ goes to infinity. By symmetry we have
\[ \mathcal{L}^{i_1}\bigl(f_n(i_2, L(n))\bigr) \stackrel{d}{=} \begin{cases} X^+ & i_1 = i_2, \\ X^- & \text{otherwise,} \end{cases} \]
and the set $\{f_n(i, L(n)) : 2 \le i \le k\}$ is conditionally exchangeable when conditioned on the event $\sigma_\rho = 1$. Moreover, they are conditionally exchangeable given $\sigma_\rho = 1$ and the value of $f_n(1, L(n))$. Now define $Y_{ij} = Y_{ij}(n) = f_n(i, L_j(n))$. This is equal to the probability that $\sigma_{u_j} = i$, given the random configuration $L_j(n)$ on the spins on level $n$ in the subtree $T_j$. The following proposition follows immediately from the symmetries of the model.

Proposition 1. The $Y_{ij}$ satisfy the following properties:
– The random vectors $Y_j = (Y_{1j}, \dots, Y_{kj})$ are conditionally independent given $\sigma_\rho$ for $j = 1, \dots, \Delta$.
– Conditional on $\sigma_{u_j}$ the random variable $Y_{\sigma_{u_j} j}$ is equal in distribution to $X^+(n)$, while for $i \neq \sigma_{u_j}$ the random variables $Y_{ij}$ are equal in distribution to $X^-(n)$.
– Further, for fixed $j$, given $\sigma_{u_j}$ and $Y_{\sigma_{u_j} j}$ the random variables $\{Y_{ij}\}_{i \neq \sigma_{u_j}}$ are conditionally exchangeable over $i \neq \sigma_{u_j}$.

We make use of these symmetries to simplify the analysis. Given the standard Gibbs measure recursions on trees we have that
\[ f_{n+1}(1, L(n+1)) = \frac{\prod_{j=1}^{\Delta} \bigl(1 - f_n(1, L_j(n))\bigr)}{\sum_{i=1}^{k} \prod_{j=1}^{\Delta} \bigl(1 - f_n(i, L_j(n))\bigr)} \]
and so
\[ X_1(n+1) = \frac{Z_1}{\sum_{i=1}^{k} Z_i}, \qquad\text{where}\qquad Z_i = \prod_{j=1}^{\Delta} (1 - Y_{ij}). \]
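The recursion above translates directly into a short procedure for the posterior of the root given a leaf configuration. The sketch below is illustrative only: `root_posterior` is a hypothetical name, colours are indexed from 0, the leaves are passed as a flat list of length $\Delta^n$ in left-to-right order, and $\Delta \ge 2$ is assumed so that a list of length one is unambiguously depth 0.

```python
def root_posterior(leaves, k, delta):
    """Posterior of the root colour given the leaf colours, via Z_i = prod_j (1 - Y_ij)."""
    if len(leaves) == 1:
        # Depth 0: the vertex is observed directly.
        return [1.0 if i == leaves[0] else 0.0 for i in range(k)]
    size = len(leaves) // delta
    # Y_j: posterior of child u_j given the leaves of its own subtree.
    children = [root_posterior(leaves[j * size:(j + 1) * size], k, delta)
                for j in range(delta)]
    z = []
    for i in range(k):
        z_i = 1.0
        for y in children:
            z_i *= 1.0 - y[i]
        z.append(z_i)
    total = sum(z)
    if total == 0.0:
        raise ValueError("leaf configuration is not compatible with any colouring")
    return [z_i / total for z_i in z]
```

With three colours and a binary tree, `root_posterior([1, 2], 3, 2)` returns `[1.0, 0.0, 0.0]`: the two children use colours 1 and 2, so the root is forced to be colour 0.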
We let $x_n$ and $z_n$ denote $E^1 X_1(n) = E X^+(n)$ and $E^1 (X_1(n) - \frac1k)^2 = E(X^+(n) - \frac1k)^2$ respectively. These quantities, in particular $x_n$, play a major role in our analysis. The following lemma, which can be viewed as the analogue of Lemma 1 of [4], allows us to relate the first and second moments of $X^+$.

Lemma 1. We have that
\[ x_n = E X^+ = E^1 \sum_{i=1}^{k} X_i(n)^2 = E \sum_{i=1}^{k} (X_i(n))^2, \]
and
\[ x_n - \frac1k = E X^+ - \frac1k = E \sum_{i=1}^{k} \Bigl(X_i(n) - \frac1k\Bigr)^2 \ge E\Bigl(X^+ - \frac1k\Bigr)^2 = z_n. \]
Proof. From the definition of conditional probabilities and of $f_n$ and the fact that $P(\sigma_\rho = 1) = \frac1k$ we have that
\begin{align*}
E^1 f_n(1, L(n)) &= \sum_L f_n(1, L) P(L(n) = L \mid \sigma_\rho = 1) \\
&= \sum_L f_n(1, L) \frac{P(L(n) = L, \sigma_\rho = 1)}{P(\sigma_\rho = 1)} \\
&= k \sum_L P(L(n) = L) f_n(1, L)^2 \\
&= k E (X_1(n))^2 \\
&= E \sum_{i=1}^{k} (X_i(n))^2.
\end{align*}
By symmetry for any $i_1, i_2 \in C$,
\[ E^{i_1} \sum_{i=1}^{k} (X_i(n))^2 = E^{i_2} \sum_{i=1}^{k} (X_i(n))^2, \]
and so
\[ E \sum_{i=1}^{k} (X_i(n))^2 = \frac1k \sum_{i'=1}^{k} E^{i'} \sum_{i=1}^{k} (X_i(n))^2 = E^1 \sum_{i=1}^{k} (X_i(n))^2. \]
Finally we have that
\[ E \sum_{i=1}^{k} \Bigl(X_i(n) - \frac1k\Bigr)^2 = E \sum_{i=1}^{k} (X_i(n))^2 - \frac2k E \sum_{i=1}^{k} X_i(n) + \frac{k}{k^2} = E X^+ - \frac1k, \]
which completes the proof.

Corollary 1. We have that $x_n \ge \frac1k$ and that $\lim_n x_n = \frac1k$ implies non-reconstruction.

Proof. We have that $x_n \ge z_n + \frac1k \ge \frac1k$. If $x_n$ converges to $\frac1k$ then
\[ E \sum_{i=1}^{k} \Bigl(X_i(n) - \frac1k\Bigr)^2 \to 0, \]
which implies non-reconstruction.
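The identity $x_n = E \sum_i (X_i(n))^2$ of Lemma 1 can be confirmed by exact enumeration on a tiny tree. This is only a sanity-check sketch under stated assumptions: the function names are hypothetical, colours are indexed from 0 (so colour 0 plays the role of colour 1 in the text), and the cost grows like $k^{\Delta^n}$.

```python
from itertools import product

def likelihood(colour, leaves, k, delta, depth):
    """P(leaf colours = leaves | subtree root has the given colour)."""
    if depth == 0:
        return 1.0 if leaves[0] == colour else 0.0
    size = len(leaves) // delta
    prob = 1.0
    for j in range(delta):
        block = leaves[j * size:(j + 1) * size]
        prob *= sum(likelihood(c, block, k, delta, depth - 1)
                    for c in range(k) if c != colour) / (k - 1)
    return prob

def lemma1_sides(k, delta, n):
    """Return (E^1 X_1(n), E sum_i X_i(n)^2), both computed exactly."""
    x_n = 0.0
    second_moment = 0.0
    for leaves in product(range(k), repeat=delta ** n):
        like = [likelihood(i, leaves, k, delta, n) for i in range(k)]
        p_leaves = sum(like) / k          # uniform root
        if p_leaves == 0.0:
            continue                       # impossible leaf configuration
        post = [l / (k * p_leaves) for l in like]   # X_i evaluated at these leaves
        x_n += like[0] * post[0]                     # expectation given root colour 0
        second_moment += p_leaves * sum(p * p for p in post)
    return x_n, second_moment
```

For $k = 3$, $\Delta = 2$, $n = 1$ both sides equal $3/4$, which is also consistent with $x_n \ge \frac1k$ from Corollary 1.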
2.1. Non-reconstruction. Our analysis is split into two phases, the first when $x_n$ is close to 1 and the second when $x_n$ is close to $\frac1k$.

Lemma 2. Suppose that $\beta < 1 - \log 2$. Then for sufficiently large $k$ if $\Delta < k[\log k + \log\log k + \beta]$ then
\[ \limsup_n x_n \le \frac2k. \]
Proof. We fix the colour of the root to be 1 and let $\mathcal{F}$ denote the sigma-algebra generated by $\{\sigma_{u_j} : 1 \le j \le \Delta\}$, the colours of the neighbours of the root. For $1 \le i \le k$ let $b_i = \#\{j : \sigma_{u_j} = i\}$, the number of times each colour appears amongst the neighbours of the root. Of course $b_1 = 0$ since the neighbours of the root cannot be 1. For $1 \le i \le k$ define
\[ U_i = \prod_{1 \le j \le \Delta : \sigma_{u_j} = i} (1 - Y_{ij}). \]
Note that with this definition $U_1 = 1$. We will use the symmetries and exchangeability of the model to reduce the problem to considering a random variable only involving the $U_i$. Conditional on $\mathcal{F}$, the $U_i$ are independent and are distributed as the product of $b_i$ independent copies of $(1 - X^+(n))$, and $0 \le U_i \le 1$ for all $i$. Fix an $\ell$ with $2 \le \ell \le k$. Let $W_1$ and $W_\ell$ be defined by
\[ W_1 = \prod_{1 \le j \le \Delta : \sigma_{u_j} \neq \ell} (1 - Y_{1j}), \qquad W_\ell = \prod_{1 \le j \le \Delta : \sigma_{u_j} \neq \ell} (1 - Y_{\ell j}), \]
so $Z_\ell = W_\ell U_\ell$. Note that for $j \in \{1 \le j \le \Delta : \sigma_{u_j} \neq \ell\}$ we have that $\sigma_{u_j} \notin \{1, \ell\}$, since none of the $\sigma_{u_j}$ are 1. So by Proposition 1, conditional on $\mathcal{F}$ and $\sigma_{u_j} \notin \{1, \ell\}$, we have that $Y_{1j}$ and $Y_{\ell j}$ are conditionally exchangeable and so $W_1$ and $W_\ell$ are conditionally exchangeable. We will analyse the effect of swapping $W_1$ with $W_\ell$. Recall that $Z_i = \prod_{j=1}^{\Delta} (1 - Y_{ij})$, so define
\[ \tilde Z_\ell = W_1 U_\ell = W_1 \prod_{1 \le j \le \Delta : \sigma_{u_j} = \ell} (1 - Y_{\ell j}), \]
and
\[ \tilde Z_1 = W_\ell \prod_{1 \le j \le \Delta : \sigma_{u_j} = \ell} (1 - Y_{1j}), \]
and for $i \notin \{1, \ell\}$, $\tilde Z_i = Z_i$. Proposition 1 noted that $Y_j = \{Y_{1j}, \dots, Y_{kj}\}$ are conditionally independent given $\mathcal{F}$, and for each $j$, given $\sigma_{u_j}$ and $Y_{\sigma_{u_j} j}$, the random variables $\{Y_{ij} : i \neq \sigma_{u_j}\}$ are conditionally exchangeable. It follows that
\[ (W_1, W_\ell, Z_1, \dots, Z_k, U_1, \dots, U_k, \sigma_1, \dots, \sigma_\Delta) \stackrel{d}{=} (W_\ell, W_1, \tilde Z_1, \dots, \tilde Z_k, U_1, \dots, U_k, \sigma_1, \dots, \sigma_\Delta), \tag{1} \]
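where we denote equality as in distributions of random vectors, since this just swaps the $Y_{1j}$'s with the $Y_{\ell j}$'s, which are conditionally exchangeable given all the other random variables. Since $0 \le U_\ell \le 1$, and
\[ \sum_{i=2}^{k} Z_i - \sum_{i=2}^{k} \tilde Z_i = Z_\ell - \tilde Z_\ell = U_\ell (W_\ell - W_1), \]
it follows that $(W_1 - W_\ell)$ has the same sign as
\[ \Bigl(W_1 + \sum_{i=2}^{k} Z_i\Bigr) - \Bigl(W_\ell + \sum_{i=2}^{k} \tilde Z_i\Bigr) = (W_1 - W_\ell)(1 - U_\ell), \]
and so
\[ \frac{1}{W_1 + \sum_{i=2}^{k} Z_i} - \frac{1}{W_\ell + \sum_{i=2}^{k} \tilde Z_i} \]
has the opposite sign as $W_1 - W_\ell$. Applying the equality in distribution of Eq. (1) we have that
\begin{align*}
E^1\Bigl[\frac{W_\ell - W_1}{W_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr]
&= \frac12 E^1\Bigl[\frac{W_\ell - W_1}{W_1 + \sum_{i=2}^{k} Z_i} + \frac{W_1 - W_\ell}{W_\ell + \sum_{i=2}^{k} \tilde Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] \\
&= \frac12 E^1\Bigl[(W_\ell - W_1)\Bigl(\frac{1}{W_1 + \sum_{i=2}^{k} Z_i} - \frac{1}{W_\ell + \sum_{i=2}^{k} \tilde Z_i}\Bigr) \Bigm| \mathcal{F}, \{U_i\}\Bigr] \\
&\ge 0,
\end{align*}
where the first equality follows using equality in distributions of the random vectors and the inequality follows from the two terms of the product having the same sign. Since $0 \le Z_1 \le W_1 \le 1$ we have that
\begin{align*}
E^1\Bigl[\frac{Z_1}{Z_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr]
&\le E^1\Bigl[\frac{W_1}{W_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] \\
&\le E^1\Bigl[\frac{W_\ell}{W_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] \\
&\le E^1\Bigl[\frac{W_\ell}{Z_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr],
\end{align*}
and so since $Z_\ell = U_\ell W_\ell$ and we are conditioning on $U_\ell$,
\[ E^1\Bigl[\frac{U_\ell Z_1}{Z_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] \le E^1\Bigl[\frac{Z_\ell}{Z_1 + \sum_{i=2}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr]. \]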
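Recall that $\ell \ge 2$ is arbitrary, so the previous equation holds for all $2 \le \ell \le k$ simultaneously. Summing over all values of $\ell$ we get that
\[ E^1\Bigl[\frac{Z_1 \bigl(1 + \sum_{\ell=2}^{k} U_\ell\bigr)}{\sum_{i=1}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] \le E^1\Bigl[\frac{\sum_{\ell=1}^{k} Z_\ell}{\sum_{i=1}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] = 1, \]
and hence, since we are conditioning on the $U_i$,
\[ E^1[X_1(n+1) \mid \mathcal{F}, \{U_i\}] = E^1\Bigl[\frac{Z_1}{\sum_{i=1}^{k} Z_i} \Bigm| \mathcal{F}, \{U_i\}\Bigr] \le \frac{1}{1 + \sum_{i=2}^{k} U_i}. \]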
We now estimate the expected value of the right-hand side of the previous equation. Using the fact that $\frac{1}{1+x} = \int_0^1 s^x \, ds$ we have that
\[ \frac{1}{1 + \sum_{i=2}^{k} U_i} = \int_0^1 \prod_{i=2}^{k} s^{U_i} \, ds. \]
As $s^u$ is convex as a function of $u$ we have that $s^u \le s^0 (1 - u) + s^1 u$ when $0 \le u \le 1$, and so since $0 \le U_i \le 1$ we have that $E^1 s^{U_i} \le (1 - E^1 U_i) + s E^1 U_i = 1 - (1 - s) E^1 U_i$. Since conditional on $\mathcal{F}$ the $U_i$ are independent and are distributed as the product of $b_i$ independent copies of $(1 - X^+(n))$, we have that
\begin{align*}
E^1[X_1(n+1) \mid \mathcal{F}] &\le \int_0^1 \prod_{i=2}^{k} \bigl(1 - (1 - s) E^1[U_i \mid \mathcal{F}]\bigr) \, ds \\
&= \int_0^1 \prod_{i=2}^{k} \bigl(1 - (1 - s)(1 - x_n)^{b_i}\bigr) \, ds.
\end{align*}
Now the colours $\sigma_{u_j}$ are chosen independently and uniformly from the set $\{2, \dots, k\}$, so $(b_2, \dots, b_k)$ has a multinomial distribution. Let $\beta < \beta^* < 1 - \log 2$ and let $\tilde b_i$ be iid random variables distributed as Poisson($D$), where $D = \log k + \log\log k + \beta^*$. By Lemma 4 we can couple the $b$'s and $\tilde b$'s so that $(b_2, \dots, b_k) \le (\tilde b_2, \dots, \tilde b_k)$ whenever $\sum_{j=2}^{k} \tilde b_j \ge \Delta$. It follows that
\begin{align*}
x_{n+1} = E^1 X_1(n+1) &\le E\, 1_{\{\sum_{j=2}^{k} \tilde b_j < \Delta\}} + \int_0^1 E \prod_{i=2}^{k} \bigl(1 - (1 - s)(1 - x_n)^{\tilde b_i}\bigr) \, ds \\
&\le p + \int_0^1 \bigl(1 - (1 - s)\exp(-x_n D)\bigr)^{k-1} \, ds \\
&\le p + \int_0^1 \exp\bigl(-(1 - s)(k - 1)\exp(-x_n D)\bigr) \, ds \\
&= p + \frac{1 - \exp\bigl(-(k - 1)\exp(-x_n D)\bigr)}{(k - 1)\exp(-x_n D)},
\end{align*}
where $p = P(\mathrm{Poisson}((k - 1)D) < \Delta)$. Now $p = \exp\bigl(-\Omega(k/\sqrt{\Delta})\bigr) = o(k^{-1})$ and the function
\[ g(y) = p + \frac{1 - \exp\bigl(-(k - 1)\exp(-yD)\bigr)}{(k - 1)\exp(-yD)} \]
is increasing in $y$, so the result follows by Lemma 3.
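The role of the condition $\beta^* < 1 - \log 2$ can be seen by iterating the map $g$ numerically from $y_0 = 1$. The sketch below is an illustration rather than part of the proof: the function names are hypothetical, and the error term $p$ is set to zero by default, which is an assumption (the text only gives $p = o(k^{-1})$).

```python
import math

def g(y, k, beta_star, p=0.0):
    """g(y) = p + (1 - exp(-a)) / a with a = (k-1) exp(-y D), D = log k + log log k + beta*."""
    d = math.log(k) + math.log(math.log(k)) + beta_star
    a = (k - 1) * math.exp(-y * d)
    # -expm1(-a) equals 1 - exp(-a), computed stably when a is small.
    return p - math.expm1(-a) / a

def iterate_g(k, beta_star, steps=500, p=0.0):
    """Iterate y_{n+1} = g(y_n) starting from y_0 = 1 and return the final value."""
    y = 1.0
    for _ in range(steps):
        y = g(y, k, beta_star, p)
    return y
```

With $k = 1000$ and $\beta^* = 0$ the iterates fall below $2/k$, as in the conclusion of Lemma 3, whereas with $\beta^* = 1$ they stall at a fixed point close to 1.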
Lemma 3. Let $y_0, y_1, \dots$ be a sequence of positive real numbers such that $y_0 = 1$ and $y_{n+1} = g(y_n)$, where
\[ g(y_n) = p + \frac{1 - \exp\bigl(-(k - 1)\exp(-y_n D)\bigr)}{(k - 1)\exp(-y_n D)}, \]
$D = \log k + \log\log k + \beta^*$, $\beta^* < 1 - \log 2$ and $p = o(k^{-1})$. Then for large enough $k$,
\[ \limsup_n y_n < \frac2k. \]

Proof. Since $\frac{d}{dx} \frac{1 - e^{-x}}{x} \Big|_{x=0} = -\frac12$ we can find $\epsilon, \delta > 0$ such that when $0 < x < \delta$,
\[ \frac{1 - e^{-x}}{x} < 1 - \Bigl(\frac12 - \epsilon\Bigr) x. \]
Assuming our choice of $\epsilon$ is sufficiently small we can also choose $r' > r > 0$ such that $(\frac12 - \epsilon) e^{-\beta^*} > e^{-1}(1 + r')$. Now for large enough $k$, $(k - 1)\exp(-D) = \frac{(k-1)e^{-\beta^*}}{k \log k} < \delta$, and so using the fact that $r < r'$ and $p = o(k^{-1})$,
\[ y_1 = g(1) \le p + 1 - \Bigl(\frac12 - \epsilon\Bigr) \frac{(k - 1)e^{-\beta^*}}{k \log k} \le 1 + p - \frac{(1 + r)e^{-1}}{\log k} \le 1 - \frac{e^{-1}}{\log k}, \]
provided $k$ is sufficiently large. Now since $g$ is a continuous increasing function and $y_1 < y_0$ it follows that the sequence $y_i$ is decreasing. Suppose that $(k - 1)\exp(-y_i D) < \delta$. Then
\[ y_{i+1} \le p + 1 - \Bigl(\frac12 - \epsilon\Bigr)(k - 1)\exp(-y_i D), \]
and so for $k$ sufficiently large
\begin{align*}
1 - y_{i+1} &\ge \Bigl(\frac12 - \epsilon\Bigr)(k - 1)\exp(-y_i D) - p \\
&\ge \Bigl(\frac12 - \epsilon\Bigr) \frac{(k - 1)e^{-\beta^*}}{k \log k} \exp\bigl((1 - y_i)\log k\bigr) - p \\
&\ge \frac{(1 + r')e^{-1}}{\log k} \exp\bigl((1 - y_i)\log k\bigr) - p \\
&\ge (1 + r')(1 - y_i) - p \\
&\ge (1 + r)(1 - y_i),
\end{align*}
where the second to last inequality uses the fact that $e^x \ge ex$ and the final inequality uses the fact that $1 - y_i \ge \frac{e^{-1}}{\log k}$, while $p = o(k^{-1})$. It follows that $y_i$ decreases until for some $i$, $(k - 1)\exp(-y_i D) \ge \delta$. Now let $\frac{1 - e^{-\delta}}{\delta} = \alpha < \alpha' < \alpha'' < 1$ for some $\alpha', \alpha''$. When $k$ is large enough then
\[ y_{i+1} \le p + \frac{1 - e^{-\delta}}{\delta} \le \alpha'. \]
Then for $k$ large enough, $\exp(-y_{i+1} D) \ge \exp(-\alpha' D) \ge \exp(-\alpha'' \log k) = k^{-\alpha''}$. It follows that
\[ y_{i+2} \le p + \frac{1}{(k - 1)\exp(-y_{i+1} D)} \le 2k^{\alpha'' - 1}. \]
Finally we have $\exp(-y_{i+2} D) \ge \exp(-2k^{\alpha'' - 1} D) \ge \frac23$ and so
\[ y_{i+3} \le p + \frac{1}{(k - 1)\exp(-y_{i+2} D)} < 2k^{-1} \]
when $k$ is large enough, which completes the proof.

In the preceding lemma we note that the requirement that $\beta^* < 1 - \log 2$ comes from the fact that $x < \frac12 e^{x - \beta^*}$ for all $x$ when $\beta^* < 1 - \log 2$.

Lemma 4. Suppose that $(b_1, \dots, b_k)$ has the multinomial distribution $M(n, \frac1k, \frac1k, \dots, \frac1k)$. Let $\tilde b_j$ be iid random variables distributed as Poisson($D$). We can couple the $b$'s and $\tilde b$'s so that $(b_1, \dots, b_k) \le (\tilde b_1, \dots, \tilde b_k)$ (respectively $\ge$) whenever $\sum_{j=1}^{k} \tilde b_j \ge n$ (respectively $\le$).

Proof. Since the $\tilde b_j$ are independent and Poisson, conditional on the sum $N = \sum_{j=1}^{k} \tilde b_j$, the distribution of $(\tilde b_1, \dots, \tilde b_k)$ is multinomial $M(N, \frac1k, \frac1k, \dots, \frac1k)$ (see [13] Prop. 6.2.1). Now if $n \le m$ then two multinomial distributions $A$ and $B$ distributed as $M(n, \frac1k, \frac1k, \dots, \frac1k)$ and $M(m, \frac1k, \frac1k, \dots, \frac1k)$ respectively can be trivially coupled so that $A \le B$, which completes the proof.

Janson and Mossel [8] studied “robust reconstruction”, the question of when reconstruction is possible from a very noisy copy of the leaves. They found that the threshold for robust reconstruction is exactly the Kesten-Stigum bound. Lemma 2 establishes that the leaves provide very little information about the spin at a vertex a long distance from the leaves. So as information over long distances is very noisy, the results of [8] suggest that reconstruction would only be possible after the Kesten-Stigum bound whereas, in our context, $\Delta$ is much less than $\lambda_2(M)^{-2}$. As such, only crude bounds are needed to establish the following lemma.

Lemma 5. For sufficiently large $k$ if $\Delta \le 2k \log k$ and if $x_n \le \frac2k$ then
\[ x_{n+1} - \frac1k \le \frac12 \Bigl(x_n - \frac1k\Bigr). \]
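Proof. Using the identity
\[ \frac{1}{s + r} = \frac1s - \frac{r}{s^2} + \frac{r^2}{s^2} \cdot \frac{1}{s + r} \]
and taking $s = E^1 \sum_{i=1}^{k} Z_i$ and $r = \sum_{i=1}^{k} (Z_i - E^1 Z_i)$ we have that
\begin{align*}
x_{n+1} - \frac1k &= E^1\, \frac{Z_1 - \frac1k \sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} Z_i} \\
&= \frac{E^1\bigl[Z_1 - \frac1k \sum_{i=1}^{k} Z_i\bigr]}{E^1 \sum_{i=1}^{k} Z_i} - E^1\, \frac{\bigl(Z_1 - \frac1k \sum_{i=1}^{k} Z_i\bigr) \sum_{i=1}^{k} (Z_i - E^1 Z_i)}{\bigl(E^1 \sum_{i=1}^{k} Z_i\bigr)^2} \\
&\quad + E^1\, \frac{Z_1 - \frac1k \sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} Z_i} \cdot \frac{\bigl(\sum_{i=1}^{k} (Z_i - E^1 Z_i)\bigr)^2}{\bigl(E^1 \sum_{i=1}^{k} Z_i\bigr)^2}.
\end{align*}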
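Now by Lemma 6,
\[ \frac{E^1\bigl[Z_1 - \frac1k \sum_{i=1}^{k} Z_i\bigr]}{E^1 \sum_{i=1}^{k} Z_i} \le \frac{1 + \frac{2\Delta}{k}\bigl(x_n - \frac1k\bigr) - \frac1k - \frac{k-1}{k}\bigl(1 - \frac{2\Delta}{k^2}\bigl(x_n - \frac1k\bigr)\bigr)}{1 + (k - 1)\bigl(1 - \frac{2\Delta}{k^2}\bigl(x_n - \frac1k\bigr)\bigr)} \le \frac{3\Delta}{k^2}\Bigl(x_n - \frac1k\Bigr). \tag{2} \]
Using the inequality $\frac12 (a^2 + b^2) \ge ab$ we have that
\begin{align*}
&-\Bigl(Z_1 - \frac1k \sum_{i=1}^{k} Z_i\Bigr) \sum_{i=1}^{k} (Z_i - E^1 Z_i) \\
&\quad = -\Bigl[(Z_1 - E^1 Z_1) + \Bigl(E^1 Z_1 - \frac1k \sum_{i=1}^{k} E^1 Z_i\Bigr) - \frac1k \sum_{i=1}^{k} (Z_i - E^1 Z_i)\Bigr] \cdot \sum_{i=1}^{k} (Z_i - E^1 Z_i) \\
&\quad \le \frac12 \bigl(Z_1 - E^1 Z_1\bigr)^2 + \Bigl(\frac12 + \frac1k\Bigr) \Bigl(\sum_{i=1}^{k} (Z_i - E^1 Z_i)\Bigr)^2 - \Bigl(E^1 Z_1 - \frac1k \sum_{i=1}^{k} E^1 Z_i\Bigr) \sum_{i=1}^{k} (Z_i - E^1 Z_i),
\end{align*}
so by Lemma 6 we have that
\[ E^1\Bigl[-\Bigl(Z_1 - \frac1k \sum_{i=1}^{k} Z_i\Bigr) \sum_{i=1}^{k} (Z_i - E^1 Z_i)\Bigr] \le \Bigl(\frac{k-1}{k}\Bigr)^{2\Delta} \Bigl(\frac{4\Delta}{k} + 4\Delta\Bigr) \Bigl(x_n - \frac1k\Bigr) \]
and
\[ E^1\Bigl[-\frac{\bigl(Z_1 - \frac1k \sum_{i=1}^{k} Z_i\bigr) \sum_{i=1}^{k} (Z_i - E^1 Z_i)}{\bigl(E^1 \sum_{i=1}^{k} Z_i\bigr)^2}\Bigr] \le \frac{\bigl(\frac{4\Delta}{k} + 4\Delta\bigr)\bigl(x_n - \frac1k\bigr)}{\bigl[1 + (k - 1)\bigl(1 - \frac{2\Delta}{k^2}\bigl(x_n - \frac1k\bigr)\bigr)\bigr]^2} \le \frac{5\Delta}{k^2}\Bigl(x_n - \frac1k\Bigr). \tag{3} \]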
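Finally since $0 \le \frac{Z_1}{\sum_{i=1}^{k} Z_i} \le 1$ we have that $\Bigl|\frac{Z_1 - \frac1k \sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} Z_i}\Bigr| \le 1$, and so
\[ E^1\, \frac{Z_1 - \frac1k \sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} Z_i} \cdot \frac{\bigl(\sum_{i=1}^{k} (Z_i - E^1 Z_i)\bigr)^2}{\bigl(E^1 \sum_{i=1}^{k} Z_i\bigr)^2} \le E^1\, \frac{\bigl(\sum_{i=1}^{k} (Z_i - E^1 Z_i)\bigr)^2}{\bigl(E^1 \sum_{i=1}^{k} Z_i\bigr)^2} \le \frac{5\Delta}{k^2}\Bigl(x_n - \frac1k\Bigr). \tag{4} \]
Combining Eqs. (2), (3) and (4) we have that
\[ x_{n+1} - \frac1k \le \frac{13\Delta}{k^2}\Bigl(x_n - \frac1k\Bigr) \le \frac12 \Bigl(x_n - \frac1k\Bigr) \tag{5} \]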
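for large enough $k$, which completes the result.

Lemma 6. For sufficiently large $k$ if $\Delta \le 2k \log k$ and if $x_n \le \frac2k$ then the following all hold:
\[ \Bigl(\frac{k-1}{k}\Bigr)^{\Delta} \le E^1 Z_1 \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta} \Bigl(1 + \frac{2\Delta}{k}\Bigl(x_n - \frac1k\Bigr)\Bigr), \tag{6} \]
and for $i \neq 1$,
\[ \Bigl(\frac{k-1}{k}\Bigr)^{\Delta} \Bigl(1 - \frac{2\Delta}{k^2}\Bigl(x_n - \frac1k\Bigr)\Bigr) \le E^1 Z_i \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta}, \tag{7} \]
\[ \mathrm{Var}^1 Z_1 \le \Bigl(\frac{k-1}{k}\Bigr)^{2\Delta} \frac{4\Delta}{k} \Bigl(x_n - \frac1k\Bigr), \tag{8} \]
\[ \mathrm{Var}^1 \Bigl(\sum_{i=1}^{k} Z_i\Bigr) \le \Bigl(\frac{k-1}{k}\Bigr)^{2\Delta} 4\Delta \Bigl(x_n - \frac1k\Bigr). \tag{9} \]

Proof. From Eq. (15) of Lemma 9 we have that
\[ E^1 Z_1 = \Bigl(\frac{k-1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr)\Bigr)^{\Delta}, \]
and since by Corollary 1, $x_n \ge \frac1k$, we have that
\[ E^1 Z_1 \ge \Bigl(\frac{k-1}{k}\Bigr)^{\Delta}. \]
Then since $\exp(x) = 1 + x + O(x^2)$ and $\frac{k\Delta}{(k-1)^2}\bigl(x_n - \frac1k\bigr)$ is small for large $k$,
\[ E^1 Z_1 \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta} \exp\Bigl(\frac{k\Delta}{(k-1)^2}\Bigl(x_n - \frac1k\Bigr)\Bigr) \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta} \Bigl(1 + \frac{2\Delta}{k}\Bigl(x_n - \frac1k\Bigr)\Bigr), \]
which establishes Eq. (6). Equations (7), (8) and (9) are established similarly using identities from Lemma 9.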
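The bounds (6) and (7) and the contraction factor $13\Delta/k^2$ of Eq. (5) are easy to spot-check numerically at the edge of the admissible range, using the closed forms (15) and (17) for the means. The sketch below checks particular parameter values only and is not a substitute for the "sufficiently large $k$" argument; the function names are hypothetical.

```python
import math

def check_lemma6_means(k, delta, x_n):
    """Return (Eq. (6) holds, Eq. (7) holds) for the given k, Delta and x_n."""
    d = x_n - 1.0 / k
    base = ((k - 1) / k) ** delta
    mean_z1 = ((k - 1) / k + d / (k - 1)) ** delta        # Eq. (15)
    mean_zi = ((k - 1) / k - d / (k - 1) ** 2) ** delta   # Eq. (17)
    eq6 = base <= mean_z1 <= base * (1 + 2 * delta / k * d)
    eq7 = base * (1 - 2 * delta / k ** 2 * d) <= mean_zi <= base
    return eq6, eq7

def contraction_factor(k, delta):
    """The factor 13 * Delta / k^2 from Eq. (5); Lemma 5 needs it to be at most 1/2."""
    return 13.0 * delta / k ** 2
```

For example, with $k = 1000$, $\Delta = \lfloor 2k \log k \rfloor$ and $x_n = 2/k$, both bounds hold and the contraction factor is below $\frac12$.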
2.2. Reconstruction. An upper bound on the reconstruction threshold $\Delta^*(k)$ is found by estimating the probability that the colour of the root is uniquely determined by the colours at the leaves. This method was described in [18] and used to a higher level of precision in [20,21]. We restate the result and give a full proof for completeness.

Lemma 7. Suppose that $\beta > 1$. Then for sufficiently large $k$ if $\Delta > k[\log k + \log\log k + \beta]$ then the colour of the root is uniquely determined by the colours at the leaves with probability at least $1 - \frac{1}{\log k}$, that is
\[ \inf_n P(X^+(n) = 1) > 1 - \frac{1}{\log k}. \]

Proof. Let $p_n$ be the probability that the leaves at distance $n$ determine the spin at the root, that is $p_n = P^1(X_1(n) = 1)$. We will show that when $k$ is large then $\liminf_n p_n$ is close to 1. Suppose we fix the colour of the root to be 1 and let $\mathcal{F}$ denote the sigma-algebra generated by $\{\sigma_{u_j} : 1 \le j \le \Delta\}$, the colours of the neighbours of the root. For $2 \le i \le k$ let $b_i = \#\{j : \sigma_{u_j} = i\}$, the number of times each colour appears in the neighbours of the root. Now each colour $\sigma_{u_j}$ is chosen uniformly from the set $\{2, \dots, k\}$, so $(b_2, \dots, b_k)$ has a multinomial distribution. Let $\beta > \beta^* > 1$ and let $\tilde b_i$ be iid random variables distributed as Poisson($D$), where $D = \log k + \log\log k + \beta^*$. By Lemma 4 we can couple the $b$'s and $\tilde b$'s so that $(b_2, \dots, b_k) \ge (\tilde b_2, \dots, \tilde b_k)$ whenever $\sum_{i=2}^{k} \tilde b_i \le \sum_{i=2}^{k} b_i = \Delta$.

If for each colour $2 \le i \le k$ there is some vertex $u_j$ such that the states of the leaves $L_j(n)$ fix the colour of $u_j$ to be $i$, then the leaves $L(n+1)$ fix the colour of $\rho$ to be 1. Conditional on $\mathcal{F}$ the probability that there is such a vertex $u_j$ for a given colour $i$ is at least $1 - (1 - p_n)^{b_i}$. Moreover these are conditionally independent given $\mathcal{F}$, so it follows that
\begin{align*}
p_{n+1} &\ge E^1 \prod_{i=2}^{k} \bigl(1 - (1 - p_n)^{b_i}\bigr) \\
&\ge E \prod_{i=2}^{k} \bigl(1 - (1 - p_n)^{\tilde b_i}\bigr) - s \\
&= \bigl(1 - \exp(-p_n D)\bigr)^{k-1} - s,
\end{align*}
where $s = P(\mathrm{Poisson}((k - 1)D) > \Delta) = o(k^{-1})$. Now $f(x) = (1 - \exp(-xD))^{k-1} - s$ is an increasing function in $x$ and hence when $k$ is large enough
\[ f\Bigl(1 - \frac{1}{\log k}\Bigr) = \Bigl(1 - \exp\Bigl(-\Bigl(1 - \frac{1}{\log k}\Bigr)(\log k + \log\log k + \beta^*)\Bigr)\Bigr)^{k-1} - s > 1 - \frac{1}{\log k}, \]
and since $p_0 = 1$,
\[ \inf_n p_n \ge 1 - \frac{1}{\log k}, \]
which completes the proof.
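The lower-bound recursion $p_{n+1} \ge f(p_n)$ can likewise be iterated from $p_0 = 1$ to see the contrast between $\beta^* > 1$ and smaller values. As before this is only a sketch: the function names are hypothetical and the error term $s$ is set to zero by default, which is an assumption (the text gives $s = o(k^{-1})$).

```python
import math

def f(x, k, beta_star, s=0.0):
    """f(x) = (1 - exp(-x D))^(k-1) - s with D = log k + log log k + beta*."""
    d = math.log(k) + math.log(math.log(k)) + beta_star
    return (1.0 - math.exp(-x * d)) ** (k - 1) - s

def iterate_f(k, beta_star, steps=500, s=0.0):
    """Iterate p_{n+1} = f(p_n) starting from p_0 = 1 and return the final value."""
    p = 1.0
    for _ in range(steps):
        p = f(p, k, beta_star, s)
    return p
```

With $k = 1000$ and $\beta^* = 1.5$ the iterates stay above $1 - \frac{1}{\log k}$, while with $\beta^* = 0$ this particular lower bound collapses to 0. The collapse only means the bound becomes uninformative, not that reconstruction fails.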
2.3. Main Theorem. Proof (Theorem 1). Combining Lemmas 2 and 5 establishes non-reconstruction when $\Delta \le k[\log k + \log\log k + 1 - \log 2 - o(1)]$. Lemma 7 shows that the root can be reconstructed correctly with probability at least $1 - \frac{1}{\log k}$, which establishes reconstruction when $\Delta \ge k[\log k + \log\log k + 1 + o(1)]$.

Remarks. For large $k$ the Poisson($\Delta$) distribution is concentrated around $\Delta$ with standard deviation $O(\sqrt{\Delta})$, which is significantly smaller than the error bounds in Theorem 1. With some minor modifications the bounds for $\Delta$-ary trees can be extended to Galton-Watson branching processes with offspring distribution Poisson($\Delta$). The reconstruction of Galton-Watson branching processes with offspring distribution Poisson($\Delta$) is of interest because, as noted before, it is believed to be related to the clustering phase transition for colourings on Erdős-Rényi random graphs.

To be more specific, for the proof of non-reconstruction we can again bound $x_n = E^1 X_1(n)$, where the expected value is taken over all possible trees. In Lemma 2 we repeat the same bounds on $x_n$, the only difference being that $\Delta$ is now random, which does not affect the results for large $k$. Then similar estimates can be made in Lemma 5 provided $\frac{\Delta}{k}\bigl(x_n - \frac1k\bigr)$ is very small. As $\Delta$ is concentrated around its expected value, the probability of this not holding is very small and this can be used to complete the proof of non-reconstruction.

When $\beta > \beta^* > 1$, with probability going to 1 as $k$ goes to infinity, the Galton-Watson branching process contains a subgraph which is a $(k[\log k + \log\log k + \beta^*])$-ary tree rooted at $\rho$. Reconstruction then follows from Lemma 7.

Acknowledgements. The author would like to thank Elchanan Mossel for his useful comments and advice and thank Dror Weitz, Nayantara Bhatnagar, Lenka Zdeborová, Florent Krząkała, Guilhem Semerjian and Dmitry Panchenko for useful discussions.
He would also like to thank the anonymous referees and associate editor for their careful reading of the paper and suggested improvements in the exposition. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
A. Appendix

In this appendix we calculate identities which are used in the proof of Lemma 6. Observe that since $E X^+(n) + (k - 1) E X^- = 1$ we have that $E X^+ - \frac1k = -(k - 1)\bigl(E X^- - \frac1k\bigr)$. We will show that the means and variances of the $Y_{ij}$ and $Z_i$ can all be calculated in terms of $x_n$ and $z_n$.

Lemma 8. We have the identities
\[ E^1 Y_{1j} = \frac1k - \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr), \tag{10} \]
\[ E^1 Y_{1j}^2 = \frac{1}{k^2} + \frac{k-2}{k(k-1)}\Bigl(x_n - \frac1k\Bigr) - \frac{1}{k-1} z_n. \tag{11} \]
For $2 \le i \le k$,
\[ E^1 Y_{ij} = \frac1k + \frac{1}{(k-1)^2}\Bigl(x_n - \frac1k\Bigr) \tag{12} \]
and
\[ E^1 Y_{ij}^2 = \frac{1}{k^2} + \frac{k^2 - 2k + 2}{k(k-1)^2}\Bigl(x_n - \frac1k\Bigr) + \frac{1}{(k-1)^2} z_n. \tag{13} \]
For any $1 \le i_1 < i_2 \le k$,
\[ \mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}) \le 0. \tag{14} \]
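Proof. When the root is conditioned to be 1, $\sigma_{u_j} \neq 1$ and so $Y_{1j}$ is distributed as $X^-$ and we have that
\[ E^1 Y_{1j} = E X^- = \frac1k - \frac{1}{k-1}\Bigl(E X^+ - \frac1k\Bigr) = \frac1k - \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr), \]
and
\begin{align*}
E^1 Y_{1j}^2 &= E(X^-)^2 \\
&= \frac{1}{k-1}\Bigl[E \sum_{i=1}^{k} (X_i)^2 - E(X^+)^2\Bigr] \\
&= \frac{1}{k-1}\bigl[E X^+ - E(X^+)^2\bigr] \\
&= \frac{1}{k-1}\Bigl[\frac{k-1}{k^2} + \frac{k-2}{k}\Bigl(E X^+ - \frac1k\Bigr) - E\Bigl(X^+ - \frac1k\Bigr)^2\Bigr] \\
&= \frac{1}{k^2} + \frac{k-2}{k(k-1)}\Bigl(x_n - \frac1k\Bigr) - \frac{1}{k-1} z_n,
\end{align*}
where the third equality follows from Lemma 1. For $2 \le i \le k$ we have that
\[ E^1 Y_{ij} = \frac{1}{k-1}\bigl[1 - E^1 Y_{1j}\bigr] = \frac{1}{k-1}\Bigl[1 - \frac1k + \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr)\Bigr] = \frac1k + \frac{1}{(k-1)^2}\Bigl(x_n - \frac1k\Bigr), \]
and again using Lemma 1,
\[ E^1 Y_{ij}^2 = \frac{1}{k-1}\Bigl[E^1 \sum_{i=1}^{k} (X_i)^2 - E^1 Y_{1j}^2\Bigr] = \frac{1}{k^2} + \frac{k^2 - 2k + 2}{k(k-1)^2}\Bigl(x_n - \frac1k\Bigr) + \frac{1}{(k-1)^2} z_n. \]
Also for $2 \le i \le k$,
\begin{align*}
E^1 Y_{1j} Y_{ij} &= \frac{1}{k-1} \sum_{i'=2}^{k} E^1 Y_{1j} Y_{i'j} \\
&= \frac{1}{k-1} E^1 Y_{1j}(1 - Y_{1j}) \\
&\le \frac{1}{k-1} E^1 Y_{1j}\, E^1(1 - Y_{1j}) \\
&= E^1 Y_{1j}\, E^1 Y_{ij},
\end{align*}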
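so $\mathrm{Cov}^1(Y_{1j}, Y_{ij}) \le 0$. Finally for $2 \le i_1 < i_2 \le k$,
\[ \mathrm{Var}^1(1 - Y_{1j}) = \sum_{i=2}^{k} \mathrm{Var}^1(Y_{ij}) + (k-1)(k-2)\,\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}), \]
and so
\begin{align*}
(k-1)(k-2)\,\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}) &= \mathrm{Var}^1(1 - Y_{1j}) - \sum_{i=2}^{k} \mathrm{Var}^1(Y_{ij}) \\
&= \mathrm{Var}(X^-) - \bigl((k-2)\mathrm{Var}(X^-) + \mathrm{Var}(X^+)\bigr) \le 0,
\end{align*}
so $\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}) \le 0$.

Using Lemma 8 we can calculate the means and covariances of the $Z_j$.

Lemma 9. We have the following results:
\[ E^1 Z_1 = \Bigl(\frac{k-1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr)\Bigr)^{\Delta}, \tag{15} \]
\[ E^1 Z_1^2 = \Bigl(\Bigl(\frac{k-1}{k}\Bigr)^2 + \frac{3k-2}{k(k-1)}\Bigl(x_n - \frac1k\Bigr) - \frac{1}{k-1} z_n\Bigr)^{\Delta}. \tag{16} \]
For each $2 \le i \le k$,
\[ E^1 Z_i = \Bigl(\frac{k-1}{k} - \frac{1}{(k-1)^2}\Bigl(x_n - \frac1k\Bigr)\Bigr)^{\Delta} \tag{17} \]
and
\[ E^1 Z_i^2 = \Bigl(\Bigl(\frac{k-1}{k}\Bigr)^2 + \frac{k^2 - 4k + 2}{k(k-1)^2}\Bigl(x_n - \frac1k\Bigr) + \frac{1}{(k-1)^2} z_n\Bigr)^{\Delta}. \tag{18} \]
For any $1 \le i_1 < i_2 \le k$,
\[ \mathrm{Cov}^1(Z_{i_1}, Z_{i_2}) \le 0. \tag{19} \]

Proof. By Eq. (10) we have that
\[ E^1 Z_1 = E^1 \prod_{j=1}^{\Delta} (1 - Y_{1j}) = \Bigl(1 - \Bigl[\frac1k - \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr)\Bigr]\Bigr)^{\Delta} = \Bigl(\frac{k-1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac1k\Bigr)\Bigr)^{\Delta}, \]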
which establishes Eq. (15). Equations (16), (17) and (18) follow similarly. Using Eq. (14) we have that for $1 \le i_1 < i_2 \le k$,
\begin{align*}
E^1 Z_{i_1} Z_{i_2} &= E^1 \prod_{j=1}^{\Delta} (1 - Y_{i_1 j})(1 - Y_{i_2 j}) \\
&\le \prod_{j=1}^{\Delta} E^1(1 - Y_{i_1 j})\, E^1(1 - Y_{i_2 j}) \\
&= E^1 Z_{i_1}\, E^1 Z_{i_2},
\end{align*}
which establishes Eq. (19).

References
1. Achlioptas, D., Coja-Oghlan, A.: Algorithmic barriers from phase transition. http://front.math.ucdavis.edu/0803.2122, 2008
2. Bhatnagar, N., Vera, J., Vigoda, E.: Reconstruction for colorings on trees. http://front.math.ucdavis.edu/0711.3664, 2007
3. Bleher, P.M., Ruiz, J., Zagrebnov, V.A.: On the purity of limiting Gibbs state for the Ising model on the Bethe lattice. J. Stat. Phys. 79, 473–482 (1995)
4. Borgs, C., Chayes, J.T., Mossel, E., Roch, S.: The Kesten-Stigum reconstruction bound is tight for roughly symmetric binary channels. In: FOCS 2006, Los Alamitos, CA: IEEE Computer Society, 2006, pp. 518–530
5. Daskalakis, C., Mossel, E., Roch, S.: Optimal phylogenetic reconstruction. In: STOC'06: Proceedings of the 38th Annual ACM Symposium on Theory of Computing, New York: ACM, 2006, pp. 159–168
6. Dyer, M., Frieze, A., Hayes, T.P., Vigoda, E.: Randomly coloring constant degree graphs. In: FOCS '04: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC: IEEE Computer Society, 2004, pp. 582–589
7. Evans, W., Kenyon, C., Peres, Y., Schulman, L.J.: Broadcasting on trees and the Ising model. Ann. Appl. Probab. 10(2), 410–433 (2000)
8. Janson, S., Mossel, E.: Robust reconstruction on trees is determined by the second eigenvalue. Ann. Probab. 32(3B), 2630–2649 (2004)
9. Jonasson, J.: Uniqueness of uniform random colorings of regular trees. Stat. Prob. Lett. 57, 243–248 (2002)
10. Kesten, H., Stigum, B.P.: Additional limit theorems for indecomposable multidimensional Galton-Watson processes. Ann. Math. Stat. 37, 1463–1481 (1966)
11. Krząkała, F., Montanari, A., Ricci-Tersenghi, F., Semerjian, G., Zdeborová, L.: Gibbs states and the set of solutions of random constraint satisfaction problems. Proc. Nat. Acad. Sci. 104, 10318–10323 (2007)
12. Krząkała, F., Pagnani, A., Weigt, M.: Threshold values, stability analysis, and high-q asymptotics for the coloring problem on random graphs. Phys. Rev. E 70(4), 046705 (2004)
13. Lange, K.: Applied probability. Springer Texts in Statistics. New York: Springer-Verlag, 2003
14. Mézard, M., Montanari, A.: Reconstruction on trees and spin glass transition. J. Stat. Phys. 124(6), 1317–1350 (2006)
15. Mossel, E.: Reconstruction on trees: beating the second eigenvalue. Ann. Appl. Prob. 11(1), 285–300 (2001)
16. Mossel, E.: Phase transitions in phylogeny. Trans. Amer. Math. Soc. 356(6), 2379–2404 (electronic), (2004)
17. Mossel, E.: Survey: information flow on trees. In: Graphs, morphisms and statistical physics, Volume 63 of DIMACS Ser. Discrete Math. Theoret. Comput. Sci., Providence, RI: Amer. Math. Soc., 2004, pp. 155–170
18. Mossel, E., Peres, Y.: Information flow on trees. Ann. Appl. Probab. 13, 817–844 (2003)
19. Mossel, E., Sly, A.: Gibbs rapidly samples colorings of G(n,d/n). http://arxiv.org/abs/0707.3241V2[math.PR], 2007
20. Semerjian, G.: On the freezing of variables in random constraint satisfaction problems. J. Stat. Phys. 130, 251 (2008)
21. Zdeborová, L., Krząkała, F.: Phase transitions in the coloring of random graphs. Phys. Rev. E 76, 031131 (2007)

Communicated by F. Toninelli
Commun. Math. Phys. 288, 963–1006 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0639-6
Communications in
Mathematical Physics
Heat Kernel on Homogeneous Bundles over Symmetric Spaces
Ivan G. Avramidi
Department of Mathematics, New Mexico Institute of Mining and Technology, Socorro, NM 87801, USA. E-mail: [email protected]
Received: 27 May 2008 / Accepted: 29 May 2008
Published online: 26 September 2008 – © Springer-Verlag 2008
Abstract: We consider Laplacians acting on sections of homogeneous vector bundles over symmetric spaces. By using an integral representation of the heat semi-group we find a formal solution for the heat kernel diagonal that gives a generating function for the whole sequence of heat invariants. We show explicitly that the obtained result correctly reproduces the first non-trivial heat kernel coefficient as well as the exact heat kernel diagonals on the two-dimensional sphere $S^2$ and the hyperbolic plane $H^2$. We argue that the obtained formal solution correctly reproduces the exact heat kernel diagonal after a suitable regularization and analytical continuation.

1. Introduction

The heat kernel is one of the most powerful tools in mathematical physics and geometric analysis (see, for example, the books [13,17,24,26,27] and reviews [2,12,14,18,31]). The short-time asymptotic expansion of the trace of the heat kernel determines the spectral asymptotics of the differential operator. The coefficients of this asymptotic expansion, called the heat invariants, are extensively used in geometric analysis, in particular, in spectral geometry and in proofs of index theorems [17,24]. There has been tremendous progress in the explicit calculation of spectral asymptotics in the last thirty years [2–5,23,30,33]. It seems that further progress in the study of spectral asymptotics can only be achieved by restricting oneself to operators and manifolds with a high level of symmetry, in particular, homogeneous spaces, which enables one to employ powerful algebraic methods. In some very special cases, such as group manifolds, spheres, rank-one symmetric spaces and split-rank symmetric spaces, it is possible to determine the spectrum of the Laplacian exactly and to obtain closed formulas for the heat kernel in terms of the root vectors and their multiplicities [1,18–20,22,26].
The complexity of the method crucially depends on the global structure of the symmetric space, most importantly its rank. Most of the results for symmetric spaces are obtained for rank-one symmetric spaces only [18].
It is well known that heat invariants are determined essentially by local geometry. They are polynomial invariants in the curvature with universal constants that do not depend on the global properties of the manifold [24]. It is this universal structure that we are interested in in this paper. Our goal is to compute the heat kernel asymptotics of the Laplacian acting on homogeneous vector bundles over symmetric spaces. Related problems in a more general context are discussed in [7,9,11].

2. Geometry of Symmetric Spaces

2.1. Twisted spin-tensor bundles. In this section we introduce the basic concepts and fix notation. Let $(M, g)$ be an $n$-dimensional Riemannian manifold without boundary. We assume that it is complete, simply connected, orientable and spin. We denote the local coordinates on $M$ by $x^\mu$, with Greek indices running over $1, \dots, n$. Let $e_a{}^\mu$ be a local orthonormal frame defining a basis for the tangent space $T_x M$ so that
\[ g^{\mu\nu} = \delta^{ab} e_a{}^\mu e_b{}^\nu. \tag{2.1} \]
We denote the frame indices by lower case Latin indices from the beginning of the alphabet, which also run over $1, \dots, n$. The frame indices are raised and lowered by the metric $\delta_{ab}$. Let $e^a{}_\mu$ be the matrix inverse to $e_a{}^\mu$, defining the dual basis in the cotangent space $T^*_x M$, so that
\[ g_{\mu\nu} = \delta_{ab} e^a{}_\mu e^b{}_\nu. \tag{2.2} \]
The Riemannian volume element is defined as usual by $d\mathrm{vol} = dx\,|g|^{1/2}$, where $|g| = \det g_{\mu\nu} = (\det e^a{}_\mu)^2$. The spin connection $\omega^{ab}{}_\mu$ is defined in terms of the orthonormal frame by
\[ \omega^{ab}{}_\mu = e^{a\nu} e^b{}_{\nu;\mu} = -e^a{}_{\nu;\mu} e^{b\nu} = e^{a\nu}\partial_{[\mu} e^b{}_{\nu]} - e^{b\nu}\partial_{[\mu} e^a{}_{\nu]} + e_{c\mu} e^{a\nu} e^{b\lambda}\partial_{[\lambda} e^c{}_{\nu]}, \tag{2.3} \]
where the semicolon denotes the usual Riemannian covariant derivative with the Levi-Civita connection. The curvature of the spin connection is
\[ R^a{}_{b\mu\nu} = \partial_\mu \omega^a{}_{b\nu} - \partial_\nu \omega^a{}_{b\mu} + \omega^a{}_{c\mu}\omega^c{}_{b\nu} - \omega^a{}_{c\nu}\omega^c{}_{b\mu}. \tag{2.4} \]
The Ricci tensor and the scalar curvature are defined by
\[ R_{\alpha\nu} = e_a{}^\mu e^b{}_\alpha R^a{}_{b\mu\nu}, \qquad R = g^{\mu\nu} R_{\mu\nu} = e_a{}^\mu e_b{}^\nu R^{ab}{}_{\mu\nu}. \tag{2.5} \]
Let T be a spin-tensor bundle realizing a representation Σ of the spin group Spin(n), the double covering of the group SO(n), with the fiber Λ. Let $\Sigma_{ab}$ be the generators of the orthogonal algebra $\mathcal{SO}(n)$, the Lie algebra of the orthogonal group SO(n), satisfying the following commutation relations:
\[ [\Sigma_{ab}, \Sigma_{cd}] = -\delta_{ac}\Sigma_{bd} + \delta_{bc}\Sigma_{ad} + \delta_{ad}\Sigma_{bc} - \delta_{bd}\Sigma_{ac}. \tag{2.6} \]
The spin connection induces a connection on the bundle T defining the covariant derivative of smooth sections $\varphi$ of the bundle T by
\[ \nabla_\mu \varphi = \left(\partial_\mu + \frac{1}{2}\omega^{ab}{}_\mu \Sigma_{ab}\right)\varphi. \tag{2.7} \]
The commutator of covariant derivatives defines the curvature of this connection via
\[ [\nabla_\mu, \nabla_\nu]\varphi = \frac{1}{2} R^{ab}{}_{\mu\nu}\Sigma_{ab}\,\varphi. \tag{2.8} \]
As usual, the orthonormal frame, $e^a{}_\mu$ and $e_a{}^\mu$, will be used to transform the coordinate (Greek) indices to the orthonormal (Latin) indices. The covariant derivative along the frame vectors is defined by $\nabla_a = e_a{}^\mu \nabla_\mu$. For example, with our notation, $\nabla_a \nabla_b T_{cd} = e_a{}^\mu e_b{}^\nu e_c{}^\alpha e_d{}^\beta \nabla_\mu \nabla_\nu T_{\alpha\beta}$. The metric $\delta_{ab}$ induces a positive definite fiber metric on tensor bundles. For Dirac spinors, the fiber metric is defined as follows. First, one defines the Dirac matrices, $\gamma_a$, as generators of the Clifford algebra (represented by $2^{[n/2]} \times 2^{[n/2]}$ complex matrices),
\[ \gamma_a \gamma_b + \gamma_b \gamma_a = 2\delta_{ab} I_S, \tag{2.9} \]
where $I_S$ is the identity matrix in the spinor representation. Then one defines the antisymmetrized products of Dirac matrices
\[ \gamma_{a_1 \dots a_k} = \gamma_{[a_1} \cdots \gamma_{a_k]}. \tag{2.10} \]
Then the matrices
\[ \Sigma_{ab} = \frac{1}{2}\gamma_{ab} \tag{2.11} \]
are the generators of the orthogonal algebra $\mathcal{SO}(n)$ in the spinor representation. The Hermitian conjugation of Dirac matrices defines a Hermitian matrix $\beta$ (see footnote 1) by
\[ \gamma_a^\dagger = \beta \gamma_a \beta^{-1}, \tag{2.12} \]
which defines a Hermitian inner product $\bar\psi\varphi = \psi^\dagger \beta \varphi$ in the vector space of spinors. We also find the following important relation:
\[ R^{ab}{}_{cd}\,\gamma_{ab}\gamma^{cd} = -2R^{ab}{}_{ab} I_S = -2R\, I_S, \tag{2.13} \]
where R is the scalar curvature. In the present paper we will further assume that M is a locally symmetric space with a Riemannian metric with the parallel curvature
\[ \nabla_\mu R_{\alpha\beta\gamma\delta} = 0, \tag{2.14} \]
which means, in particular, that the curvature satisfies the integrability constraints
\[ R^{fg}{}_{ea} R^e{}_{bcd} - R^{fg}{}_{eb} R^e{}_{acd} + R^{fg}{}_{ec} R^e{}_{dab} - R^{fg}{}_{ed} R^e{}_{cab} = 0. \tag{2.15} \]
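As a numerical illustration of the Clifford relation (2.9) and of the statement that $\Sigma_{ab} = \frac{1}{2}\gamma_{ab}$ generate the orthogonal algebra (2.6), the following sketch takes $n = 3$ and uses the Pauli matrices as Dirac matrices. The choice of $n$, the choice of representation and all helper names (`matmul`, `lincomb`, `commutator`, and so on) are assumptions made for this example only; they are not taken from the paper.

```python
from itertools import product

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def lincomb(terms, size):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    out = [[0j] * size for _ in range(size)]
    for coeff, M in terms:
        for i in range(size):
            for j in range(size):
                out[i][j] += coeff * M[i][j]
    return out

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def commutator(A, B):
    return lincomb([(1, matmul(A, B)), (-1, matmul(B, A))], len(A))

def delta(a, b):
    return 1 if a == b else 0

n = 3  # illustrative dimension; the Pauli matrices play the role of gamma_a
identity = [[1, 0], [0, 1]]
gamma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Sigma_ab = (1/2) gamma_ab with gamma_ab = gamma_[a gamma_b] = (1/2)[gamma_a, gamma_b]
Sigma = [[lincomb([(0.25, commutator(gamma[a], gamma[b]))], 2) for b in range(n)]
         for a in range(n)]

def clifford_relation_holds():
    """Check gamma_a gamma_b + gamma_b gamma_a = 2 delta_ab I, Eq. (2.9)."""
    return all(
        close(lincomb([(1, matmul(gamma[a], gamma[b])), (1, matmul(gamma[b], gamma[a]))], 2),
              lincomb([(2 * delta(a, b), identity)], 2))
        for a, b in product(range(n), repeat=2))

def orthogonal_algebra_holds():
    """Check the commutation relations (2.6) for all index combinations."""
    for a, b, c, d in product(range(n), repeat=4):
        lhs = commutator(Sigma[a][b], Sigma[c][d])
        rhs = lincomb([(-delta(a, c), Sigma[b][d]), (delta(b, c), Sigma[a][d]),
                       (delta(a, d), Sigma[b][c]), (-delta(b, d), Sigma[a][c])], 2)
        if not close(lhs, rhs):
            return False
    return True
```

Both checks are purely algebraic, so the same functions could be reused with any other set of matrices satisfying (2.9) by replacing `gamma`, `n` and the matrix size.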
Let $G_{YM}$ be a compact Lie group (called a gauge group). It naturally defines the principal fiber bundle over the manifold M with the structure group $G_{YM}$. We consider a representation of the structure group $G_{YM}$ and the associated vector bundle through this representation with the same structure group $G_{YM}$ whose typical fiber is a k-dimensional vector space W. Then for any spin-tensor bundle T we define the twisted spin-tensor bundle V via the twisted product of the bundles W and T. The fiber of the bundle V is $V = \Lambda \otimes W$ so that the sections of the bundle V are represented locally by k-tuples of spin-tensors.

Footnote 1: The Dirac matrices $\gamma_{ab}$ and the spinor metric $\beta$ should not be confused with the matrices $\gamma_{AB}$ and $\beta_{ij}$ defined below.
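Let $\mathcal{A}$ be a connection one-form on the bundle W (called Yang-Mills or gauge connection) taking values in the Lie algebra $\mathcal{G}_{YM}$ of the gauge group $G_{YM}$. Then the total connection on the bundle V is defined by
\[ \nabla_\mu \varphi = \left(\partial_\mu + \frac{1}{2}\omega^{ab}{}_\mu \Sigma_{ab} \otimes I_W + I_\Lambda \otimes \mathcal{A}_\mu\right)\varphi, \tag{2.16} \]
and the total curvature Ω of the bundle V is defined by
\[ [\nabla_\mu, \nabla_\nu]\varphi = \Omega_{\mu\nu}\varphi, \tag{2.17} \]
where
\[ \Omega_{\mu\nu} = \frac{1}{2} R^{ab}{}_{\mu\nu}\Sigma_{ab} + \mathcal{F}_{\mu\nu}, \tag{2.18} \]
and
\[ \mathcal{F}_{\mu\nu} = \partial_\mu \mathcal{A}_\nu - \partial_\nu \mathcal{A}_\mu + [\mathcal{A}_\mu, \mathcal{A}_\nu] \tag{2.19} \]
is the curvature of the Yang-Mills connection. We also consider the bundle of endomorphisms of the bundle V. The covariant derivative of sections of this bundle is defined by
\[ \nabla_\mu X = \left(\partial_\mu + \frac{1}{2}\omega^{ab}{}_\mu \Sigma_{ab}\right) X + [\mathcal{A}_\mu, X], \tag{2.20} \]
and the commutator of covariant derivatives is equal to
\[ [\nabla_\mu, \nabla_\nu] X = \frac{1}{2} R^{ab}{}_{\mu\nu}\Sigma_{ab} X + [\mathcal{F}_{\mu\nu}, X]. \tag{2.21} \]
In the following we will consider homogeneous vector bundles with parallel bundle curvature
\[ \nabla_\mu \mathcal{F}_{\alpha\beta} = 0, \tag{2.22} \]
which means that the curvature satisfies the integrability constraints
\[ [\mathcal{F}_{cd}, \mathcal{F}_{ab}] - R^f{}_{acd}\mathcal{F}_{fb} - R^f{}_{bcd}\mathcal{F}_{af} = 0. \tag{2.23} \]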
2.2. Normal coordinates. Let $x'$ be a fixed point in M and U be a sufficiently small coordinate patch containing the point $x'$. Then every point $x$ in U can be connected with the point $x'$ by a unique geodesic. We extend the local orthonormal frame $e_a{}^{\mu'}(x')$ at the point $x'$ to a local orthonormal frame $e_a{}^\mu(x)$ at the point $x$ by parallel transport
\[ e_a{}^\mu(x) = g^\mu{}_{\nu'}(x, x')\, e_a{}^{\nu'}(x'), \tag{2.24} \]
\[ e^a{}_\mu(x) = g_\mu{}^{\nu'}(x, x')\, e^a{}_{\nu'}(x'), \tag{2.25} \]
where $g^\mu{}_{\nu'}(x, x')$ is the operator of parallel transport of vectors along the geodesic from the point $x'$ to the point $x$. Of course, the frame $e_a{}^\mu$ depends on the fixed point $x'$ as a parameter. Here and everywhere below the coordinate indices of the tangent space at the point $x'$ are denoted by primed Greek letters. They are raised and lowered by the metric tensor $g_{\mu'\nu'}(x')$ at the point $x'$. The derivatives with respect to $x'$ will be denoted by primed Greek indices as well. The parameters of the geodesic connecting the points $x$ and $x'$, namely the unit tangent vector at the point $x'$ and the length of the geodesic (or, equivalently, the tangent vector at the point $x'$ with the norm equal to the length of the geodesic), provide the normal coordinate system for U. Let $d(x, x')$ be the geodesic distance between the points $x$ and $x'$ and $\sigma(x, x')$ be a two-point function defined by
\[ \sigma(x, x') = \frac{1}{2}[d(x, x')]^2. \tag{2.26} \]
Then the derivatives $\sigma_{;\mu}(x, x')$ and $\sigma_{;\nu'}(x, x')$ are the tangent vectors to the geodesic connecting the points $x$ and $x'$ at the points $x$ and $x'$ respectively, pointing in opposite directions; one is obtained from another by parallel transport
\[ \sigma_{;\mu} = -g_\mu{}^{\nu'}\sigma_{;\nu'}. \tag{2.27} \]
Here and everywhere below the semicolon denotes the covariant derivative. The operator of parallel transport satisfies the equation
\[ \sigma^{;\mu}\nabla_\mu g^\alpha{}_{\beta'} = 0, \tag{2.28} \]
with the initial conditions
\[ g^\alpha{}_{\beta'}\Big|_{x=x'} = \delta^\alpha{}_\beta. \tag{2.29} \]
It can be expressed in terms of the local parallel frame
\[ g^\mu{}_{\nu'}(x, x') = e_a{}^\mu(x)\, e^a{}_{\nu'}(x'), \qquad g_\mu{}^{\nu'}(x, x') = e^a{}_\mu(x)\, e_a{}^{\nu'}(x'). \tag{2.30} \]
Now, let us define the quantities
\[ y^a = e^a{}_\mu \sigma^{;\mu} = -e^a{}_{\mu'}\sigma^{;\mu'}, \tag{2.31} \]
so that
\[ \sigma^{;\mu} = e_a{}^\mu y^a \qquad \text{and} \qquad \sigma^{;\mu'} = -e_a{}^{\mu'} y^a. \tag{2.32} \]
Notice that $y^a = 0$ at $x = x'$. Further, we have
\[ \frac{\partial y^a}{\partial x^\nu} = -e^{a\mu'}\sigma_{;\nu\mu'}, \tag{2.33} \]
so that the Jacobian of the change of variables is
\[ \det\left(\frac{\partial y^a}{\partial x^\nu}\right) = |g|^{-1/2}(x')\,\det[-\sigma_{;\nu\mu'}(x, x')]. \tag{2.34} \]
The geometric parameters $y^a$ are nothing but the normal coordinates. By using the Van Vleck-Morette determinant defined by (see footnote 2)
\[ \Delta(x, x') = |g|^{-1/2}(x')\,|g|^{-1/2}(x)\,\det[-\sigma_{;\nu\mu'}(x, x')], \tag{2.35} \]
we can write the Riemannian volume element in the form
\[ d\mathrm{vol} = dy\,\Delta^{-1}(x, x'). \tag{2.36} \]
Footnote 2: Do not confuse it with the Laplacian ∆ defined below.

Let $\mathcal{P}(x, x')$ be the operator of parallel transport of sections of the bundle V from the point $x'$ to the point $x$. It satisfies the equation
\[ \sigma^{;\mu}\nabla_\mu \mathcal{P} = 0, \tag{2.37} \]
with the initial condition
\[ \mathcal{P}\Big|_{x=x'} = I_V. \tag{2.38} \]
Any spin-tensor $\varphi$ can be now expanded in the covariant Taylor series
\[ \varphi(x) = \mathcal{P}(x, x')\sum_{k=0}^{\infty}\frac{1}{k!}\left[\nabla_{(c_1}\cdots\nabla_{c_k)}\varphi\right](x')\, y^{c_1}\cdots y^{c_k}. \tag{2.39} \]
Therefrom it is clear, in particular, that the frame components of a parallel spin-tensor are simply constant. In symmetric spaces one can compute the Van Vleck-Morette determinant explicitly in terms of the curvature. Let K be an n × n matrix with the entries
\[ K^a{}_b = R^a{}_{cbd}\, y^c y^d. \tag{2.40} \]
Then [2,5,13]
\[ \frac{\partial y^a}{\partial x^\nu} = \left(\frac{\sqrt{K}}{\sin\sqrt{K}}\right)^a{}_b\, e^b{}_\nu, \tag{2.41} \]
and, therefore,
\[ \Delta(x, x') = \det{}_{TM}\left(\frac{\sqrt{K}}{\sin\sqrt{K}}\right). \tag{2.42} \]
Thus, the Riemannian volume element in symmetric spaces takes the following form:
\[ d\mathrm{vol} = dy\,\det{}_{TM}\left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right). \tag{2.43} \]
The matrix $(\sin\sqrt{K})/\sqrt{K}$ determines the orthonormal frame in normal coordinates, and the square of this matrix determines the metric tensor in normal coordinates,
\[ ds^2 = \left(\frac{\sin^2\sqrt{K}}{K}\right)_{ab} dy^a\, dy^b. \tag{2.44} \]
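The volume density in (2.43) can be checked numerically in the simplest case. The sketch below assumes a two-dimensional space of constant curvature `lam`, so that $K^a{}_b = \lambda(|y|^2\delta^a{}_b - y^a y_b)$, and evaluates $\sin\sqrt{K}/\sqrt{K}$ through its power series in K. The function names and the series truncation are choices made for this illustration. On the unit sphere the determinant should reduce to $\sin r / r$ with $r = |y|$, and for `lam = -1` to $\sinh r / r$, in agreement with the familiar polar-coordinate area elements.

```python
import math

def mat_mul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def sinc_sqrt(K, terms=40):
    """sin(sqrt(K))/sqrt(K) = sum_k (-1)^k K^k / (2k+1)!, evaluated as a matrix power series."""
    size = len(K)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, K)
        factor = -1.0 / ((2 * k) * (2 * k + 1))
        term = [[factor * t for t in row] for row in term]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def K_matrix(y, lam):
    """K^a_b = R^a_{cbd} y^c y^d for constant curvature lam in two dimensions."""
    r2 = sum(c * c for c in y)
    return [[lam * (r2 * (1.0 if a == b else 0.0) - y[a] * y[b]) for b in range(2)]
            for a in range(2)]

def volume_density(y, lam):
    """det(sin(sqrt(K))/sqrt(K)), the factor multiplying dy in Eq. (2.43)."""
    S = sinc_sqrt(K_matrix(y, lam))
    return S[0][0] * S[1][1] - S[0][1] * S[1][0]
```

For example, `volume_density((0.5, 1.2), 1.0)` can be compared with `math.sin(1.3) / 1.3`, since the point has geodesic radius 1.3.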
Let us define an endomorphism-valued 1-form Ã a by the equation
∇ν P = P A˜ a ea µ σ ;µ ν .
(2.45)
Then for bundles with parallel curvature over symmetric spaces one can find it explicitly [2,5,13]
\[ \tilde{\mathcal{A}}_a = -\mathcal{F}_{bc}\, y^c \left(\frac{I - \cos\sqrt{K}}{K}\right)^b{}_a. \tag{2.46} \]
This object determines the gauge connection in normal coordinates,
\[ \mathcal{A} = -\mathcal{F}_{bc}\, y^c \left(\frac{I - \cos\sqrt{K}}{K}\right)^b{}_a dy^a. \tag{2.47} \]
This means that all connections on a homogeneous bundle are essentially the same. In particular, the spin connection one-form in normal coordinates has the form
\[ \omega^a{}_b = -R^a{}_{bcd}\, y^d \left(\frac{I - \cos\sqrt{K}}{K}\right)^c{}_e dy^e. \tag{2.48} \]

Remark 1. Two remarks are in order here. First, strictly speaking, normal coordinates can be only defined locally, in geodesic balls of radius less than the injectivity radius of the manifold. However, for symmetric spaces normal coordinates cover the whole manifold except for a set of measure zero where they become singular [18]. This set is precisely the set of points conjugate to the fixed point $x'$ (where $\Delta^{-1}(x, x') = 0$) and of points that can be connected to the point $x'$ by multiple geodesics. In any case, this set is a set of measure zero and, as we will show below, it can be dealt with by some regularization technique. Thus, we will use the normal coordinates defined above for the whole manifold. Second, for compact manifolds (or for manifolds with compact submanifolds) the range of some normal coordinates is also compact, so that if one allows them to range over the whole real line R, then the corresponding compact submanifolds will be covered infinitely many times.

2.3. Curvature group of a symmetric space. We assumed that the manifold M is locally symmetric. Since we also assume that it is simply connected and complete, it is a globally symmetric space (or simply symmetric space) [32]. A symmetric space is said to be compact, non-compact or Euclidean if all sectional curvatures are positive, negative or zero. A generic symmetric space has the structure
\[ M = M_0 \times M_s, \tag{2.49} \]
where $M_0 = \mathbb{R}^{n_0}$ and $M_s$ is a semi-simple symmetric space; it is a product of a compact symmetric space $M_+$ and a non-compact symmetric space $M_-$,
\[ M_s = M_+ \times M_-. \tag{2.50} \]
Of course, the dimensions must satisfy the relation $n_0 + n_s = n$, where $n_s = \dim M_s$. Let $\Lambda_2$ be the vector space of 2-forms on M at a fixed point $x'$. It has the dimension $\dim \Lambda_2 = n(n-1)/2$, and the inner product in $\Lambda_2$ is defined by
\[ \langle X, Y\rangle = \frac{1}{2} X_{ab} Y^{ab}. \tag{2.51} \]
The Riemann curvature tensor naturally defines the curvature operator
\[ \mathrm{Riem}: \Lambda_2 \to \Lambda_2 \tag{2.52} \]
by
\[ (\mathrm{Riem}\, X)_{ab} = \frac{1}{2} R_{ab}{}^{cd} X_{cd}. \tag{2.53} \]
This operator is symmetric and has real eigenvalues which determine the principal sectional curvatures. Now, let Ker(Riem) and Im(Riem) be the kernel and the range of this operator and
\[ p = \dim \mathrm{Im}(\mathrm{Riem}) = \frac{n(n-1)}{2} - \dim \mathrm{Ker}(\mathrm{Riem}). \tag{2.54} \]
Further, let $\lambda_i$, $(i = 1, \dots, p)$, be the non-zero eigenvalues, and $E^i{}_{ab}$ be the corresponding orthonormal eigen-two-forms. Then the components of the curvature tensor can be presented in the form [10]
\[ R_{abcd} = \beta_{ik} E^i{}_{ab} E^k{}_{cd}, \tag{2.55} \]
where $\beta_{ik}$ is a symmetric, in fact, diagonal, nondegenerate p × p matrix,
\[ (\beta_{ik}) = \mathrm{diag}(\lambda_1, \dots, \lambda_p). \tag{2.56} \]
Of course, the zero eigenvalues of the curvature operator correspond to the flat subspace $M_0$, the positive ones correspond to the compact submanifold $M_+$ and the negative ones to the non-compact submanifold $M_-$. Therefore, Im(Riem) = $T_x M_s$. In the following the Latin indices from the middle of the alphabet will be used to denote tensors in Im(Riem); they should not be confused with the Latin indices from the beginning of the alphabet which denote tensors in M. They will be raised and lowered with the matrix $\beta_{ik}$ and its inverse
\[ (\beta^{ik}) = \mathrm{diag}(\lambda_1^{-1}, \dots, \lambda_p^{-1}). \tag{2.57} \]
Next, we define the traceless n × n matrices $D_i = (D^a{}_{ib})$, where
\[ D^a{}_{ib} = -\beta_{ik} E^k{}_{cb}\,\delta^{ca}. \tag{2.58} \]
Then
\[ R^a{}_{bcd} = -D^a{}_{ib} E^i{}_{cd}, \qquad R^a{}_b{}^c{}_d = \beta^{ik} D^a{}_{ib} D^c{}_{kd}, \tag{2.59} \]
\[ R^a{}_b = -\beta^{ik} D^a{}_{ic} D^c{}_{kb}, \qquad R = -\beta^{ik} D^a{}_{ic} D^c{}_{ka}. \tag{2.60} \]
Also, we have identically,
\[ D^a{}_{j[b} E^j{}_{cd]} = 0. \tag{2.61} \]
The matrices $D_i$ are known to be the generators of the holonomy algebra, $\mathcal{H}$, i.e. the Lie algebra of the restricted holonomy group, H,
\[ [D_i, D_k] = F^j{}_{ik} D_j, \tag{2.62} \]
where $F^j{}_{ik}$ are the structure constants of the holonomy group. The structure constants of the holonomy group define the p × p matrices $F_i$, by $(F_i)^j{}_k = F^j{}_{ik}$, which generate the adjoint representation of the holonomy algebra,
\[ [F_i, F_k] = F^j{}_{ik} F_j. \tag{2.63} \]
These commutation relations follow directly from the Jacobi identities
\[ F^i{}_{j[k} F^j{}_{ml]} = 0. \tag{2.64} \]
For symmetric spaces the introduced quantities satisfy additional algebraic constraints. The most important consequence of Eq. (2.15) is the equation [10]
\[ E^i{}_{ac} D^c{}_{kb} - E^i{}_{bc} D^c{}_{ka} = F^i{}_{kj} E^j{}_{ab}. \tag{2.65} \]
It is this equation that makes a generic Riemannian manifold a symmetric space. Now, by using Eqs. (2.62) and (2.65) one can prove the following:

Proposition 1. The matrix $\beta_{ik}$ is H-invariant and satisfies the equation
\[ \beta_{ik} F^k{}_{jl} + \beta_{lk} F^k{}_{ji} = 0. \tag{2.66} \]
This means that the matrices $F_i$ satisfy the transposition rule
\[ (F_i)^T = -\beta F_i \beta^{-1}, \tag{2.67} \]
which simply means that the adjoint and the coadjoint representations of the holonomy algebra $\mathcal{H}$ are equivalent. In particular, this means that the matrices $F_i$ are traceless. Such an algebra is called compact [16]. Further consequences of Eq. (2.65) are the identities
\[ D^a{}_{i[b} R_{c]ade} + D^a{}_{i[d} R_{e]abc} = 0, \tag{2.68} \]
\[ R^a{}_c D^c{}_{ib} = D^a{}_{ic} R^c{}_b. \tag{2.69} \]
This means, in particular, that the Ricci tensor matrix commutes with all matrices $D_i$ and is, therefore, an invariant matrix of the holonomy algebra. Thus,
\[ R^a{}_b = \frac{1}{n_s} h^a{}_b R, \tag{2.70} \]
where $h^a{}_b$ is a projection (a symmetric idempotent parallel tensor) to the subspace $T_x M_s$ of the tangent space of dimension $n_s$, that is,
\[ h_{ab} = h_{ba}, \qquad h^a{}_b h^b{}_c = h^a{}_c, \qquad h^a{}_a = n_s. \tag{2.71} \]
It is easy to see that the tensor $h_{ab}$ is nothing but the metric tensor on the semi-simple subspace $T_x M_s$. Since the curvature exists only in the semi-simple submanifold $M_s$, the components of the curvature tensor $R_{abcd}$, as well as the tensors $E^i{}_{ab}$, are non-zero only in the semi-simple subspace $T_x M_s$. Let
\[ q^a{}_b = \delta^a{}_b - h^a{}_b \tag{2.72} \]
be the projection tensor to the flat subspace $\mathbb{R}^{n_0}$ such that
\[ q_{ab} = q_{ba}, \qquad q^a{}_b q^b{}_c = q^a{}_c, \qquad q^a{}_a = n_0, \qquad q^a{}_b h^b{}_c = 0. \tag{2.73} \]
Then
\[ R_{abcd}\, q^a{}_e = R_{ab}\, q^a{}_e = E^i{}_{ab}\, q^a{}_e = D^a{}_{ib}\, q^b{}_e = D^a{}_{ib}\, q_a{}^e = 0. \tag{2.74} \]
Now, we introduce a new type of indices, the capital Latin indices, A, B, C, . . . , which split according to A = (a, i) and run from 1 to N = p + n. We define new quantities $C^A{}_{BC}$ by
\[ C^i{}_{ab} = E^i{}_{ab}, \qquad C^a{}_{ib} = -C^a{}_{bi} = D^a{}_{ib}, \qquad C^i{}_{kl} = F^i{}_{kl}, \tag{2.75} \]
all other components being zero. Let us also introduce rectangular p × n matrices $T_a$ by $(T_a)^j{}_c = E^j{}_{ac}$ and the n × p matrices $\bar{T}_a$ by $(\bar{T}_a)^b{}_i = -D^b{}_{ia}$. Then we can define N × N matrices $C_A = (C_a, C_i)$,
\[ C_a = \begin{pmatrix} 0 & \bar{T}_a \\ T_a & 0 \end{pmatrix}, \qquad C_i = \begin{pmatrix} D_i & 0 \\ 0 & F_i \end{pmatrix}, \tag{2.76} \]
so that $(C_A)^B{}_C = C^B{}_{AC}$.

Theorem 1. The quantities $C^A{}_{BC}$ satisfy the Jacobi identities
\[ C^A{}_{B[C}\, C^B{}_{DE]} = 0. \tag{2.77} \]
This means that the matrices $C_A$ satisfy the commutation relations
\[ [C_A, C_B] = C^C{}_{AB}\, C_C, \tag{2.78} \]
or, in more detail,
\[ [C_a, C_b] = E^i{}_{ab}\, C_i, \qquad [C_i, C_a] = D^b{}_{ia}\, C_b, \qquad [C_i, C_k] = F^j{}_{ik}\, C_j, \tag{2.79} \]
and generate the adjoint representation of a Lie algebra $\mathcal{G}$ with the structure constants $C^A{}_{BC}$.

Proof. This can be proved by using Eqs. (2.61), (2.62), (2.64) and (2.65) [10].

For the lack of a better name we call the algebra $\mathcal{G}$ the curvature algebra. As will be clear from the next section, it is a subalgebra of the total isometry algebra of the symmetric space. It should be clear that the holonomy algebra $\mathcal{H}$ is the subalgebra of the curvature algebra $\mathcal{G}$. The curvature algebra exists only in symmetric spaces; it is Eq. (2.65) that closes this algebra. Next, we define a symmetric nondegenerate N × N matrix
\[ (\gamma_{AB}) = \begin{pmatrix} \delta_{ab} & 0 \\ 0 & \beta_{ik} \end{pmatrix} = \mathrm{diag}(\underbrace{1, \dots, 1}_{n}, \lambda_1, \dots, \lambda_p). \tag{2.80} \]
This matrix and its inverse
\[ (\gamma^{AB}) = \begin{pmatrix} \delta^{ab} & 0 \\ 0 & \beta^{ik} \end{pmatrix} = \mathrm{diag}(\underbrace{1, \dots, 1}_{n}, \lambda_1^{-1}, \dots, \lambda_p^{-1}) \]
will be used to lower and to raise the capital Latin indices.
Finally, by using Eqs. (2.65) and (2.66) one can show the following:

Proposition 2. The matrix $\gamma_{AB}$ is G-invariant and satisfies the equation
\[ \gamma_{AB}\, C^B{}_{CD} + \gamma_{DB}\, C^B{}_{CA} = 0. \tag{2.81} \]
In matrix notation this equation takes the form
\[ (C_A)^T = -\gamma\, C_A\, \gamma^{-1}, \tag{2.82} \]
which means that the adjoint and the coadjoint representations of the curvature group are equivalent. In particular, the matrices $C_A$ are traceless. Thus the curvature algebra $\mathcal{G}$ is compact; it is a direct sum of two ideals,
\[ \mathcal{G} = \mathcal{G}_0 \oplus \mathcal{G}_s, \tag{2.83} \]
an Abelian center $\mathcal{G}_0$ of dimension $n_0$ and a semi-simple algebra $\mathcal{G}_s$ of dimension $p + n_s$. It is worth mentioning that although the holonomy algebra $\mathcal{H}$ is compact the (indefinite, in general) metric, $\beta_{ij}$, introduced above is not equal to the (positive definite) Cartan-Killing form, $\rho_{ij}$, defined by
\[ \mathrm{tr}_{TM}\, D_i D_k = D^a{}_{ib} D^b{}_{ka} = -\rho_{ik}, \tag{2.84} \]
so that
\[ \rho_{ik} = \mathrm{diag}(\lambda_1^2, \dots, \lambda_p^2), \tag{2.85} \]
and
\[ \beta^{ik}\rho_{ik} = R. \tag{2.86} \]
Similarly, the generators $F_i$ satisfy
\[ \mathrm{tr}_{\mathcal{H}}\, F_i F_k = F^j{}_{im} F^m{}_{kj} = -4\frac{R_H}{R}\rho_{ik}, \tag{2.87} \]
where
\[ R_H = -\frac{1}{4}\beta^{ik} F^j{}_{im} F^m{}_{kj}. \tag{2.88} \]
The Killing-Cartan form $\mathrm{tr}_{\mathcal{G}}\, C_A C_B$ for the curvature algebra $\mathcal{G}$ is defined by
\[ \mathrm{tr}_{\mathcal{G}}\, C_a C_b = -\frac{2}{n_s} h_{ab} R, \tag{2.89} \]
\[ \mathrm{tr}_{\mathcal{G}}\, C_i C_j = -\left(1 + 4\frac{R_H}{R}\right)\rho_{ij}, \tag{2.90} \]
\[ \mathrm{tr}_{\mathcal{G}}\, C_a C_i = 0. \tag{2.91} \]
Notice that it is degenerate and is not equal to the metric $\gamma_{AB}$.
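To make the construction of the curvature algebra concrete, the following sketch builds the structure constants (2.75) for one small example and checks the commutation relations (2.78), the transposition rule (2.82) and the traces (2.89) and (2.91) numerically. The example is an assumption made for illustration: the space $S^2 \times \mathbb{R}$ (or $H^2 \times \mathbb{R}$ for negative `lam`) with $n = 3$, $p = 1$, a single eigen-two-form $E^1{}_{12} = 1$ of eigenvalue `lam`, and vanishing $F^i{}_{kl}$. The function names and the array layout `C[B][A][Cc]` for $C^B{}_{AC}$ are likewise hypothetical.

```python
from itertools import product

def structure_constants(E, beta, F, n):
    """Return C[B][A][Cc] = C^B_{A Cc} following Eq. (2.75); frame indices come first."""
    p = len(E)
    N = n + p
    C = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    for i, a, b in product(range(p), range(n), range(n)):
        D_aib = -sum(beta[i][k] * E[k][a][b] for k in range(p))  # Eq. (2.58)
        C[n + i][a][b] = E[i][a][b]      # C^i_{ab} = E^i_{ab}
        C[a][n + i][b] = D_aib           # C^a_{ib} = D^a_{ib}
        C[a][b][n + i] = -D_aib          # C^a_{bi} = -D^a_{ib}
    for i, k, l in product(range(p), repeat=3):
        C[n + i][n + k][n + l] = F[i][k][l]  # C^i_{kl} = F^i_{kl}
    return C

def adjoint_matrices(C):
    """Matrices (C_A)^B_Cc = C^B_{A Cc}."""
    N = len(C)
    return [[[C[B][A][Cc] for Cc in range(N)] for B in range(N)] for A in range(N)]

def matmul(A, B):
    N = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def check_curvature_algebra(lam, tol=1e-12):
    """Checks for S^2 x R (lam > 0) or H^2 x R (lam < 0): n = 3, p = 1."""
    n = 3
    E = [[[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
    beta = [[lam]]
    F = [[[0.0]]]
    C = structure_constants(E, beta, F, n)
    M = adjoint_matrices(C)
    N = len(M)
    gamma = [1.0] * n + [lam]  # diagonal of the matrix in Eq. (2.80)

    commutators_ok = True
    for A, B in product(range(N), repeat=2):
        AB, BA = matmul(M[A], M[B]), matmul(M[B], M[A])
        for r, c in product(range(N), repeat=2):
            expected = sum(C[G][A][B] * M[G][r][c] for G in range(N))  # Eq. (2.78)
            if abs(AB[r][c] - BA[r][c] - expected) > tol:
                commutators_ok = False

    # Eq. (2.82): (C_A)^T = -gamma C_A gamma^{-1} for diagonal gamma
    transposition_ok = all(
        abs(M[A][c][r] + gamma[r] * M[A][r][c] / gamma[c]) < tol
        for A, r, c in product(range(N), repeat=3))

    killing = [[sum(M[A][r][c] * M[B][c][r] for r, c in product(range(N), repeat=2))
                for B in range(N)] for A in range(N)]
    return commutators_ok, transposition_ok, killing
```

In this example $n_s = 2$ and $R = 2\lambda$, so (2.89) predicts $\mathrm{tr}\, C_a C_b = -2\lambda\, h_{ab}$; the vanishing entry along the flat direction shows the degeneracy of the Killing-Cartan form mentioned in the text.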
2.4. Killing vector fields. We will use extensively the isometries of the symmetric space M. We follow the approach developed in [2,5,10,13]. The generators of isometries are the Killing vector fields ξ defined by the equation
\[ \nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0. \tag{2.92} \]
The integrability conditions for this equation are
\[ R_{\alpha\beta\mu[\lambda}\nabla_{\nu]}\xi^\mu + R_{\lambda\nu\mu[\beta}\nabla_{\alpha]}\xi^\mu = 0. \tag{2.93} \]
By differentiating this equation, commuting derivatives and using curvature identities we obtain
\[ \nabla_\mu \nabla_\nu \xi^\lambda = -R^\lambda{}_{\nu\alpha\mu}\,\xi^\alpha, \tag{2.94} \]
which means, in particular,
\[ \Delta \xi^\lambda = -R^\lambda{}_\alpha\, \xi^\alpha. \tag{2.95} \]
By induction we obtain
\[ \nabla_{\mu_{2k}} \cdots \nabla_{\mu_1}\xi^\lambda = (-1)^k R^\lambda{}_{\mu_1\alpha_1\mu_2} R^{\alpha_1}{}_{\mu_3\alpha_2\mu_4} \cdots R^{\alpha_{k-1}}{}_{\mu_{2k-1}\alpha_k\mu_{2k}}\,\xi^{\alpha_k}, \tag{2.96} \]
\[ \nabla_{\mu_{2k+1}} \cdots \nabla_{\mu_1}\xi^\lambda = (-1)^k R^\lambda{}_{\mu_1\alpha_1\mu_2} R^{\alpha_1}{}_{\mu_3\alpha_2\mu_4} \cdots R^{\alpha_{k-1}}{}_{\mu_{2k-1}\alpha_k\mu_{2k}}\,\nabla_{\mu_{2k+1}}\xi^{\alpha_k}. \tag{2.97} \]
These derivatives determine all coefficients of the covariant Taylor series (2.39) for the Killing vectors, and therefore, every Killing vector in a symmetric space has the form
\[ \xi^a(x) = \left(\cos\sqrt{K}\right)^a{}_b\, \xi^b(x') + \left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right)^a{}_b\, y^c\, \xi^b{}_{;c}(x'), \tag{2.98} \]
or
\[ \xi(x) = \left[\left(\sqrt{K}\cot\sqrt{K}\right)^a{}_b\, \xi^b(x') + \xi^a{}_{;c}(x')\, y^c\right]\frac{\partial}{\partial y^a}. \tag{2.99} \]
Thus, Killing vector fields at any point $x$ are determined by their values $\xi^a(x')$ and the values of their derivatives $\xi^a{}_{;c}(x')$ at the fixed point $x'$. Similarly we can obtain the derivatives of the Killing vectors,
\[ \xi^a{}_{;b}(x) = \xi^a{}_{;b}(x') - R^a{}_{bcd}\, y^d \left(\frac{1 - \cos\sqrt{K}}{K}\right)^c{}_e y^f\, \xi^e{}_{;f}(x') - R^a{}_{bcd}\, y^d \left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right)^c{}_e \xi^e(x'). \tag{2.100} \]
The set of all Killing vector fields forms a representation of the isometry algebra, the Lie algebra of the isometry group of the manifold M. We define two subspaces of the isometry algebra. One subspace is formed by Killing vectors satisfying the initial conditions
\[ \nabla_\mu \xi^\nu\Big|_{x=x'} = 0, \tag{2.101} \]
and another subspace is formed by the Killing vectors satisfying the initial conditions
\[ \xi^\nu\Big|_{x=x'} = 0. \tag{2.102} \]
We will call the Killing vectors from the first subspace translations and the Killing vectors from the second subspace rotations. However, this should not be understood literally. One can easily show that the initial values $\xi^a(x')$ are independent and, therefore, there are n such parameters. Thus, there are n linearly independent translations, which can be chosen in the form
\[ P_a = \left(\sqrt{K}\cot\sqrt{K}\right)^b{}_a \frac{\partial}{\partial y^b}, \tag{2.103} \]
so that
\[ e^b{}_\mu P_a{}^\mu\Big|_{x=x'} = \delta^b{}_a, \qquad P_a{}^\mu{}_{;\nu}\Big|_{x=x'} = 0. \tag{2.104} \]
It is worth pointing out that the nature of the lower index of the Killing vectors $P_a{}^\mu$ is different from the frame indices. This means, in particular, that the covariant derivative of $P_a{}^\mu$ does not include the spin connection associated with the lower index. In other words, $P_a{}^\mu$ are just n vectors and not the components of a (1, 1) tensor. On the other hand, the initial values of the derivatives $\xi^a{}_{;c}(x')$ are not independent because of the constraints (2.93). These constraints are valid only in the semi-simple subspace $T_x M_s$. However, in this subspace, due to the identity (2.68), it should be clear that there are p linearly independent rotations
\[ L_i = -D^b{}_{ia}\, y^a \frac{\partial}{\partial y^b}, \tag{2.105} \]
satisfying the initial conditions
\[ L_i{}^\mu\Big|_{x=x'} = 0, \qquad e^a{}_\mu e_b{}^\nu L_i{}^\mu{}_{;\nu}\Big|_{x=x'} = -D^a{}_{ib}. \tag{2.106} \]
More generally, by using (2.100) we also obtain
\[ e^a{}_\mu e_b{}^\nu P_e{}^\mu{}_{;\nu} = -R^a{}_{bcd}\, y^d \left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right)^c{}_e, \tag{2.107} \]
\[ e^a{}_\mu e_b{}^\nu L_i{}^\mu{}_{;\nu} = -D^a{}_{ib} + R^a{}_{bcd}\, y^d \left(\frac{1 - \cos\sqrt{K}}{K}\right)^c{}_e y^f D^e{}_{if}. \tag{2.108} \]
This means, in particular, that the derivatives of all Killing vectors have the form
\[ \xi_A{}^a{}_{;b} = -D^a{}_{ib}\, \eta_A{}^i, \tag{2.109} \]
where $\eta_A{}^i$ are defined by
\[ \eta_A{}^i = \alpha^{ij}\, \xi_A{}^a{}_{;b}\, D^b{}_{ja}, \tag{2.110} \]
and the matrix $\alpha^{ij} = (\rho_{ij})^{-1}$ is the inverse matrix of the Cartan-Killing form ρ defined by (2.84). Notice that
\[ \eta_a{}^i\Big|_{x=x'} = 0, \qquad \eta_j{}^i\Big|_{x=x'} = \delta^i{}_j. \tag{2.111} \]
Then, from Eq. (2.94) we also immediately obtain
\[ \eta_A{}^i{}_{;b} = -E^i{}_{ab}\, \xi_A{}^a. \tag{2.112} \]
By adding the trivial Killing vectors for flat subspaces we find that the dimension of the rotation subspace is equal to
\[ p + n_0 n_s + \frac{n_0(n_0 - 1)}{2}. \tag{2.113} \]
Here $n_0 n_s$ is the number of mixed rotations between $M_0$ and $M_s$ and $n_0(n_0 - 1)/2$ is the number of rotations of $M_0$. Since $p \le n_s(n_s - 1)/2$, the above number of rotations is less than or equal to $n(n - 1)/2$, as it should be (recall that $n = n_0 + n_s$). In the following we will need only the Killing vectors $P_a$ and $L_i$ defined above. We introduce the following notation $(\xi_A) = (P_a, L_i)$.

Theorem 2. The Killing vector fields $\xi_A$ satisfy the commutation relations
\[ [\xi_A, \xi_B] = C^C{}_{AB}\, \xi_C, \tag{2.114} \]
or, in more detail,
\[ [P_a, P_b] = E^i{}_{ab}\, L_i, \qquad [L_i, P_a] = D^b{}_{ia}\, P_b, \qquad [L_i, L_k] = F^j{}_{ik}\, L_j. \tag{2.115} \]

Proof. This can be proved by using the explicit form of the Killing vector fields obtained above [10].

Notice that they do not generate the complete isometry algebra of the symmetric space M but rather they form a representation of the curvature algebra $\mathcal{G}$ introduced in the previous section, which is a subalgebra of the total isometry algebra. It is clear that the Killing vector fields $L_i$ form a representation of the holonomy algebra $\mathcal{H}$, which is the isotropy algebra of the semi-simple submanifold $M_s$, and a subalgebra of the total isotropy algebra of the symmetric space M.

Proposition 3. There holds
\[ \xi_A{}^c{}_{;a}\, \xi_B{}^b{}_{;c} - \xi_B{}^c{}_{;a}\, \xi_A{}^b{}_{;c} = C^C{}_{AB}\, \xi_C{}^b{}_{;a} - R^b{}_{acd}\, \xi_A{}^c \xi_B{}^d, \tag{2.116} \]
and
\[ F^j{}_{ik}\, \eta_A{}^i \eta_B{}^k = C^C{}_{AB}\, \eta_C{}^j - E^j{}_{cd}\, \xi_A{}^c \xi_B{}^d. \tag{2.117} \]

Proof. By differentiating Eq. (2.114) and using (2.94) we obtain (2.116). Finally, by using (2.109) and the holonomy algebra (2.62) we obtain (2.117).

Now, we derive some bilinear identities that we will need in the present paper.

Proposition 4. The Killing vector fields satisfy the equation
\[ \gamma^{AB}\, \xi_A{}^\mu \xi_B{}^\nu = \delta^{ab} P_a{}^\mu P_b{}^\nu + \beta^{ik} L_i{}^\mu L_k{}^\nu = g^{\mu\nu}. \tag{2.118} \]

Proof. This can be proved by using the explicit form of the Killing vectors.

Proposition 5. There holds
\[ \gamma^{AB}\, \xi_A{}^\alpha\, \xi_B{}^\mu{}_{;\nu\lambda} = R^\alpha{}_{\lambda\nu}{}^\mu. \tag{2.119} \]
Proof. This follows from Eqs. (2.94) and (2.118).
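The bilinear identity (2.118) of Proposition 4 can be verified numerically in the simplest case. The sketch below assumes a two-dimensional space of constant curvature `lam` (so $n = 2$, $p = 1$, $E^1{}_{12} = 1$, $\beta_{11} = \lambda$) and works in the normal coordinates $y^a$, where the inverse metric follows from (2.44) as $(K/\sin^2\sqrt{K})^{ab}$. Since K has eigenvalue 0 along y and $\lambda|y|^2$ in the orthogonal direction, matrix functions of K are evaluated with the two corresponding projectors. All function and variable names are illustrative.

```python
import cmath

def killing_bilinear_and_inverse_metric(y, lam):
    """Return (lhs, rhs) of Eq. (2.118) at the point y != 0 in normal coordinates."""
    y1, y2 = y
    r2 = y1 * y1 + y2 * y2
    z = lam * r2                 # nonzero eigenvalue of K^a_b = lam (r^2 delta^a_b - y^a y_b)
    s = cmath.sqrt(z)            # complex square root also covers negative curvature

    # Projectors onto the direction of y (eigenvalue 0) and its orthogonal complement
    par = [[y1 * y1 / r2, y1 * y2 / r2], [y1 * y2 / r2, y2 * y2 / r2]]
    perp = [[(1.0 if a == b else 0.0) - par[a][b] for b in range(2)] for a in range(2)]

    cot_factor = (s * cmath.cos(s) / cmath.sin(s)).real   # sqrt(K) cot sqrt(K) on perp
    metric_factor = (z / cmath.sin(s) ** 2).real          # K / sin^2 sqrt(K) on perp

    # Translations, Eq. (2.103): P[m][a] is the y^m component of P_a
    P = [[par[m][a] + cot_factor * perp[m][a] for a in range(2)] for m in range(2)]
    # Rotation, Eq. (2.105), with D^a_{1b} = -lam * eps_{ab}
    L = [lam * y2, -lam * y1]

    lhs = [[sum(P[m][a] * P[v][a] for a in range(2)) + L[m] * L[v] / lam
            for v in range(2)] for m in range(2)]
    rhs = [[par[m][v] + metric_factor * perp[m][v] for v in range(2)] for m in range(2)]
    return lhs, rhs
```

The agreement of the two sides rests on the elementary identity $z\cot^2\sqrt{z} + z = z/\sin^2\sqrt{z}$ in the direction orthogonal to y; the point y must avoid the singular set where $\sin\sqrt{z} = 0$.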
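Proposition 6. There holds
\[ \gamma^{AB}\, \xi_A{}^\mu\, \xi_B{}^\nu{}_{;\beta} = 0, \tag{2.120} \]
\[ \gamma^{AB}\, \xi_A{}^\mu{}_{;\alpha}\, \xi_B{}^\nu{}_{;\beta} = R^\mu{}_\alpha{}^\nu{}_\beta. \tag{2.121} \]

Proof. Let
\[ \tau^{\mu\alpha}{}_\nu = \gamma^{AB}\, \xi_A{}^\mu\, \xi_B{}^\alpha{}_{;\nu} \tag{2.122} \]
and
\[ \theta^\mu{}_\alpha{}^\nu{}_\beta = \gamma^{AB}\, \xi_A{}^\mu{}_{;\alpha}\, \xi_B{}^\nu{}_{;\beta}. \]
We compute
\[ \nabla_\beta \tau_{\mu\alpha\nu} = \theta_{\mu\beta\nu\alpha} - R_{\mu\beta\nu\alpha}, \tag{2.123} \]
and
\[ \nabla_\gamma \nabla_\beta \tau_{\mu\alpha\nu} = R_{\mu\beta\gamma\rho}\, \tau^\rho{}_{\nu\alpha} + R_{\nu\alpha\gamma\rho}\, \tau^\rho{}_{\mu\beta}. \tag{2.124} \]
All higher derivatives of $\tau_{\mu\nu\alpha}$ are expressed linearly in terms of $\tau_{\mu\nu\alpha}$ and its first derivative $\nabla_\beta \tau_{\mu\alpha\nu}$ with coefficients polynomial in curvature. Let $x'$ be a fixed point. We will show that the tensor $\tau_{\mu\nu\alpha}$ together with all its covariant derivatives is equal to zero at $x = x'$. This will then mean that $\tau_{\mu\nu\alpha} = 0$ identically and, therefore, from Eq. (2.123) that $\theta^\mu{}_{\alpha\nu\beta} = R^\mu{}_{\alpha\nu\beta}$. We have
\[ \tau^{\mu\alpha}{}_\nu = \delta^{ab} P_a{}^\mu P_b{}^\alpha{}_{;\nu} + \beta^{ij} L_i{}^\mu L_j{}^\alpha{}_{;\nu}, \tag{2.125} \]
and
\[ \theta^\mu{}_\alpha{}^\nu{}_\beta = \delta^{ab} P_a{}^\mu{}_{;\alpha} P_b{}^\nu{}_{;\beta} + \beta^{ij} L_i{}^\mu{}_{;\alpha} L_j{}^\nu{}_{;\beta}. \tag{2.126} \]
Therefore,
\[ \tau^\mu{}_{\alpha\nu}\Big|_{x=x'} = 0 \qquad \text{and} \qquad \theta^\mu{}_{\alpha\nu\beta}\Big|_{x=x'} = R^\mu{}_{\alpha\nu\beta}. \tag{2.127} \]
Therefore,
\[ \nabla_\beta \tau^\mu{}_{\alpha\nu}\Big|_{x=x'} = 0. \tag{2.128} \]
Thus, by induction, all derivatives of $\tau_{\mu\nu\alpha}$ vanish, and, therefore, $\tau_{\mu\nu\alpha} = 0$ identically. This also proves (2.121) by making use of (2.123).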
Let i, j be non-negative integers. We define the tensors $X_{(i,j)}$ which are bilinear in Killing vectors by
\[ X_{(i,j)}{}^{\mu\nu}{}_{\alpha_1 \dots \alpha_i \beta_1 \dots \beta_j} = \gamma^{AB}\, \nabla_{\alpha_1} \cdots \nabla_{\alpha_i}\xi_A{}^\mu\; \nabla_{\beta_1} \cdots \nabla_{\beta_j}\xi_B{}^\nu. \tag{2.129} \]

Theorem 3.
1. The tensors $X_{(i,j)}$ are G-invariant and parallel, that is,
\[ \nabla_\lambda X_{(i,j)} = 0. \tag{2.130} \]
2. For even (i + j) the tensors $X_{(i,j)}$ are polynomial in the curvature tensor.
3. For odd (i + j) the tensors $X_{(i,j)}$ are identically equal to zero.
Proof. First of all, we notice that (1) follows from (2) and (3). There are three cases: a) both i = 2k and j = 2m are even, b) both i = 2k + 1 and j = 2m + 1 are odd, and c) i = 2k is even and j = 2m + 1 is odd. In the case (a), when both i and j are even, by using Eqs. (2.96) and (2.118) we immediately obtain a polynomial in the curvature. In the cases (b) and (c) by using Eqs. (2.96) and (2.97) we reduce it to the tensors $\gamma^{AB}\xi_A{}^\mu{}_{;\alpha}\xi_B{}^\nu{}_{;\beta}$ and $\gamma^{AB}\xi_A{}^\mu\xi_B{}^\nu{}_{;\beta}$. Now, by using Proposition 6 we prove the theorem.

Proposition 7. There holds
\[ \gamma^{AB}\, \xi_A{}^\mu\, \eta_B{}^i = 0, \tag{2.131} \]
\[ \gamma^{AB}\, \eta_A{}^i\, \eta_B{}^j = \beta^{ij}. \tag{2.132} \]
Proof. This follows from the definition of η A i (2.110) and Eqs. (2.120) and (2.121).
2.5. Homogeneous vector bundles. Equation (2.23) imposes strong constraints on the curvature of the homogeneous bundle W. We define
\[ \mathcal{B}_{ab} = \mathcal{F}_{cd}\, q^c{}_a q^d{}_b, \qquad \mathcal{E}_{ab} = \mathcal{F}_{cd}\, h^c{}_a h^d{}_b, \tag{2.133} \]
so that
\[ \mathcal{B}_{ab}\, h^a{}_c = 0, \qquad \mathcal{E}_{ab}\, q^a{}_c = 0. \tag{2.134} \]
Then, from Eq. (2.23) we obtain
\[ [\mathcal{B}_{ab}, \mathcal{B}_{cd}] = [\mathcal{B}_{ab}, \mathcal{E}_{cd}] = 0, \tag{2.135} \]
and
\[ [\mathcal{E}_{cd}, \mathcal{E}_{ab}] - R^f{}_{acd}\, \mathcal{E}_{fb} - R^f{}_{bcd}\, \mathcal{E}_{af} = 0. \tag{2.136} \]
This means that $\mathcal{B}_{ab}$ takes values in an Abelian ideal of the gauge algebra $\mathcal{G}_{YM}$ and $\mathcal{E}_{ab}$ takes values in the holonomy algebra. More precisely, Eq. (2.136) is only possible if the holonomy algebra $\mathcal{H}$ is an ideal of the gauge algebra $\mathcal{G}_{YM}$. Thus, the gauge group $G_{YM}$ must have a subgroup Z × H, where Z is an Abelian group and H is the holonomy group. We proceed in the following way. The matrices $D^a{}_{ib}$ provide a natural embedding of the holonomy algebra $\mathcal{H}$ in the orthogonal algebra $\mathcal{SO}(n)$ in the following sense. Let $X_{ab}$ be the generators of the orthogonal algebra $\mathcal{SO}(n)$ in some representation satisfying the commutation relations (2.6). Let $T_i$ be the matrices defined by
\[ T_i = -\frac{1}{2} D^a{}_{ib}\, X^b{}_a. \tag{2.137} \]

Proposition 8. The matrices $T_i$ satisfy the commutation relations
\[ [T_i, T_k] = F^j{}_{ik}\, T_j \tag{2.138} \]
and form a representation T of the holonomy algebra $\mathcal{H}$. This can be proved by taking into account the orthogonal algebra (2.6).
Thus $T_i$ are the generators of the gauge algebra $\mathcal{G}_{YM}$ realizing a representation T of the holonomy algebra $\mathcal{H}$. Since $\mathcal{B}_{ab}$ takes values in the Abelian ideal of the algebra of the gauge group we also have
\[ [\mathcal{B}_{ab}, T_j] = 0. \tag{2.139} \]
Then by using Eq. (2.65) one can show that (see footnote 3)
\[ \mathcal{E}_{ab} = \frac{1}{2} R^{cd}{}_{ab}\, X_{cd} = -E^i{}_{ab}\, T_i. \tag{2.140} \]
Footnote 3: We correct here a sign misprint in Eq. (3.24) in [10].

Proposition 9. The two-form
\[ \mathcal{F}_{ab} = -E^i{}_{ab}\, T_i + \mathcal{B}_{ab} = \frac{1}{2} R^{cd}{}_{ab}\, X_{cd} + \mathcal{B}_{ab} \tag{2.141} \]
satisfies the constraints (2.23), and, therefore, gives the curvature of the homogeneous bundle W.

Now, we consider the representation Σ of the orthogonal algebra defining the spin-tensor bundle T and define the matrices
\[ G_{ab} = \Sigma_{ab} \otimes I_X + I_\Sigma \otimes X_{ab}. \tag{2.142} \]
Obviously, these matrices are the generators of the orthogonal algebra in the product representation Σ ⊗ X. Next, the matrices
\[ Q_i = -\frac{1}{2} D^a{}_{ib}\, \Sigma^b{}_a \tag{2.143} \]
form a representation Q of the holonomy algebra $\mathcal{H}$, and the matrices
\[ \mathcal{R}_i = Q_i \otimes I_T + I_\Sigma \otimes T_i = -\frac{1}{2} D^a{}_{ib}\, G^b{}_a \tag{2.144} \]
are the generators of the holonomy algebra in the product representation $\mathcal{R} = Q \otimes T$. Then the total curvature, that is, the commutator of covariant derivatives, (2.18) of a twisted spin-tensor bundle V is
\[ \Omega_{ab} = -E^i{}_{ab}\, \mathcal{R}_i + \mathcal{B}_{ab} = \frac{1}{2} R^{cd}{}_{ab}\, G_{cd} + \mathcal{B}_{ab}. \tag{2.145} \]
Finally, we define the Casimir operators of the holonomy algebra in the representations Q, T and $\mathcal{R}$,
\[ T^2 = C_2(\mathcal{H}, T) = \beta^{ij}\, T_i T_j = \frac{1}{4} R^{abcd}\, X_{ab} X_{cd}, \tag{2.146} \]
\[ Q^2 = C_2(\mathcal{H}, Q) = \beta^{ij}\, Q_i Q_j = \frac{1}{4} R^{abcd}\, \Sigma_{ab} \Sigma_{cd}, \tag{2.147} \]
\[ \mathcal{R}^2 = C_2(\mathcal{H}, \mathcal{R}) = \beta^{ij}\, \mathcal{R}_i \mathcal{R}_j = \frac{1}{4} R^{abcd}\, G_{ab} G_{cd}. \tag{2.148} \]
They commute with all matrices $T_i$, $Q_i$ and $\mathcal{R}_i$ respectively.
I. G. Avramidi
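2.6. Twisted Lie derivatives. Let φ be a section of a twisted homogeneous spin-tensor bundle T. Let $\xi_A$ be the basis of Killing vector fields. Then the covariant (or generalized, or twisted) Lie derivative of φ along $\xi_A$ is defined by
$$\mathcal{L}_A \varphi = \mathcal{L}_{\xi_A} \varphi = \left(\nabla_{\xi_A} + S_A\right) \varphi, \tag{2.149}$$
where $\nabla_{\xi_A} = \xi_A{}^\mu \nabla_\mu$,
$$S_A = \eta_A{}^i \mathcal{R}_i = \frac{1}{2} \xi_A{}^a{}_{;b}\, G^b{}_a, \tag{2.150}$$
and $\eta_A{}^i$ are defined by (2.110). Note that
$$S_a q^a{}_b = 0. \tag{2.151}$$
Proposition 10. There hold
$$[\nabla_{\xi_A}, \nabla_{\xi_B}] \varphi = \left(C^C{}_{AB} \nabla_{\xi_C} - \mathcal{R}_{AB} + B_{AB}\right) \varphi, \tag{2.152}$$
$$\nabla_{\xi_A} S_B = \mathcal{R}_{AB}, \tag{2.153}$$
$$[S_A, S_B] = C^C{}_{AB} S_C - \mathcal{R}_{AB}, \tag{2.154}$$
where
$$\mathcal{R}_{AB} = \xi_A{}^a \xi_B{}^b E^i{}_{ab} \mathcal{R}_i = -\frac{1}{2} R^{cd}{}_{ab}\, \xi_A{}^a \xi_B{}^b G_{cd}, \tag{2.155}$$
$$B_{AB} = \xi_A{}^a \xi_B{}^b B_{ab}. \tag{2.156}$$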
Proof. By using the properties of the Killing vectors described in the previous section and Eq. (2.145) we obtain first (2.152). Next, by using Eqs. (2.112) we obtain (2.153), and, further, by using Eq. (2.117) we get (2.154).
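Notice that from the definition (2.133) we have
$$P_c{}^a L_i{}^b B_{ab} = L_i{}^a L_j{}^b B_{ab} = 0, \tag{2.157}$$
and
$$P_c{}^a P_d{}^b B_{ab} = B_{cd}. \tag{2.158}$$
This means that the matrix $B_{AB}$ has the form
$$B_{AB} = \begin{pmatrix} B_{ab} & 0 \\ 0 & 0 \end{pmatrix}, \tag{2.159}$$
and, therefore,
$$C^A{}_{BC} B_{AD} = \gamma^{BD} C^A{}_{BC} B_{DE} = 0. \tag{2.160}$$
We define the operator
$$\mathcal{L}^2 = \gamma^{AB} \mathcal{L}_A \mathcal{L}_B. \tag{2.161}$$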
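Theorem 4. The operators $\mathcal{L}_A$ and $\mathcal{L}^2$ satisfy the commutation relations
$$[\mathcal{L}_A, \mathcal{L}_B] = C^C{}_{AB} \mathcal{L}_C + B_{AB}, \tag{2.162}$$
or, in more detail,
$$[\mathcal{L}_a, \mathcal{L}_b] = E^i{}_{ab} \mathcal{L}_i + B_{ab}, \qquad [\mathcal{L}_i, \mathcal{L}_a] = D^b{}_{ia} \mathcal{L}_b, \qquad [\mathcal{L}_i, \mathcal{L}_j] = F^k{}_{ij} \mathcal{L}_k, \tag{2.163}$$
and
$$[\mathcal{L}_A, \mathcal{L}^2] = 2\gamma^{BC} B_{AB} \mathcal{L}_C. \tag{2.164}$$
Proof. This follows from
$$[\mathcal{L}_A, \mathcal{L}_B] = [\nabla_{\xi_A}, \nabla_{\xi_B}] + [\nabla_{\xi_A}, S_B] - [\nabla_{\xi_B}, S_A] + [S_A, S_B] \tag{2.165}$$
and Eqs. (2.152), (2.153), and (2.154). Equation (2.164) follows directly from (2.162).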
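The cancellation behind (2.162) can be written out term by term. The following worked step is only a sketch: it reads (2.153) as the commutator $[\nabla_{\xi_A}, S_B] = \mathcal{R}_{AB}$ and assumes that $\mathcal{R}_{AB}$, the quantity defined in (2.155), is antisymmetric in A and B.

```latex
\begin{align*}
[\mathcal{L}_A,\mathcal{L}_B]
 &= \underbrace{C^C{}_{AB}\nabla_{\xi_C} - \mathcal{R}_{AB} + B_{AB}}_{(2.152)}
  + \underbrace{\mathcal{R}_{AB}}_{[\nabla_{\xi_A},\,S_B]}
  - \underbrace{\mathcal{R}_{BA}}_{[\nabla_{\xi_B},\,S_A]}
  + \underbrace{C^C{}_{AB}S_C - \mathcal{R}_{AB}}_{(2.154)} \\
 &= C^C{}_{AB}\left(\nabla_{\xi_C}+S_C\right) + B_{AB}
  + (-1+1+1-1)\,\mathcal{R}_{AB} \\
 &= C^C{}_{AB}\,\mathcal{L}_C + B_{AB}.
\end{align*}
```

Under these assumptions all four curvature terms cancel, and only the structure-constant term and the Abelian term $B_{AB}$ survive.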
The operators $\mathcal{L}_A$ form an algebra that is a direct sum of a nilpotent ideal and a semisimple algebra. For lack of a better name we call this algebra a gauged curvature algebra and denote it by $\mathcal{G}_{\rm gauge}$.
Proposition 11. There hold
$$\gamma^{AB} \xi_A{}^\mu S_B = 0, \tag{2.166}$$
$$\gamma^{AB} \nabla_{\xi_A} S_B = 0, \tag{2.167}$$
$$\gamma^{AB} S_A S_B = \mathcal{R}^2. \tag{2.168}$$
Proof. This can be proved by using Eqs. (2.131), (2.153) and (2.132).
Theorem 5. The Laplacian Δ acting on sections of a twisted spin-tensor bundle V over a symmetric space has the form
$$\Delta = \mathcal{L}^2 - \mathcal{R}^2, \tag{2.169}$$
$$[\mathcal{L}_A, \Delta] = 2\gamma^{BC} B_{AB} \mathcal{L}_C. \tag{2.170}$$
Proof. We have
$$\gamma^{AB} \mathcal{L}_A \mathcal{L}_B = \gamma^{AB} \nabla_{\xi_A} \nabla_{\xi_B} + \gamma^{AB} S_A \nabla_{\xi_B} + \gamma^{AB} \nabla_{\xi_A} S_B + \gamma^{AB} S_A S_B. \tag{2.171}$$
Now, by using Eqs. (2.118) and (2.120) we get
$$\gamma^{AB} \nabla_{\xi_A} \nabla_{\xi_B} = \Delta. \tag{2.172}$$
Next, by using Eqs. (2.153), (2.167) and (2.168), we obtain (2.169). Equation (2.170) follows from the commutation relations (2.162).
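The last step can be spelled out as follows. This is a sketch under the assumption that the two cross terms in (2.171) are the ones killed by (2.166) and (2.167), with $\nabla_{\xi_A} S_B$ in (2.171) understood as an operator product; here $\mathcal{L}^2 = \gamma^{AB}\mathcal{L}_A\mathcal{L}_B$ and $\mathcal{R}^2$ is the Casimir operator of (2.148).

```latex
\begin{align*}
\gamma^{AB} S_A \nabla_{\xi_B}
  &= \left(\gamma^{AB}\xi_B{}^{\mu} S_A\right)\nabla_\mu = 0
  && \text{by (2.166)}, \\
\gamma^{AB} \nabla_{\xi_A} S_B
  &= \gamma^{AB}\left(\nabla_{\xi_A} S_B\right)
   + \left(\gamma^{AB}\xi_A{}^{\mu} S_B\right)\nabla_\mu = 0
  && \text{by (2.167) and (2.166)}, \\
\mathcal{L}^2
  &= \Delta + 0 + 0 + \mathcal{R}^2
  && \text{by (2.172) and (2.168)}.
\end{align*}
```

Rearranging the last line gives $\Delta = \mathcal{L}^2 - \mathcal{R}^2$, which is (2.169).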
2.7. Isometries and pullbacks. Let $\omega^i$ be the canonical coordinates on the holonomy group and $(k^A) = (p^a, \omega^i)$ be the canonical coordinates on the gauged curvature group. We fix a point $x'$ so that the basis Killing vector fields $\xi_A$ satisfy the initial conditions (2.104)-(2.106) and are given by (2.103)-(2.105). Let $\xi = \langle k, \xi\rangle = k^A \xi_A = p^a P_a + \omega^i L_i$ be a Killing vector field and let $\psi_t : M \to M$ be the one-parameter diffeomorphism (the isometry) generated by the vector field ξ. Let $\hat x = \psi_t(x)$, so that
$$\frac{d\hat x}{dt} = \xi(\hat x) \tag{2.173}$$
and
$$\hat x\big|_{t=0} = x. \tag{2.174}$$
The solution of this equation depends on the parameters t, p, ω, x and $x'$, that is,
$$\hat x = \hat x(t, p, \omega, x, x'). \tag{2.175}$$
We will be interested mainly in the case when the points x and $x'$ are close to each other. In fact, at the end of our calculations we will take the limit $x = x'$. In this case, as we will show below, the Jacobian
$$\det\left(\frac{\partial \hat x^\mu}{\partial p^a}\right) \neq 0 \tag{2.176}$$
is not equal to zero, and, therefore, the coordinates p can be used to parametrize the point $\hat x$, that is, Eq. (2.175) defines the function
$$p = p(t, \omega, \hat x, x, x'). \tag{2.177}$$
We will be interested in those trajectories that reach the point $x'$ at the time t = 1. So, we look at the values $\hat x(1, p, \omega, x, x')$ when the parameters p are varied. Then, as we will show below, there is always a value of the parameters p that we call $\bar p$ such that
$$\hat x(1, \bar p, \omega, x, x') = x'. \tag{2.178}$$
Thus, Eq. (2.178) defines a function $\bar p = \bar p(\omega, x, x')$. Therefore, the parameters $\bar p$ can be used to parameterize the point x. Of course,
$$\bar p(\omega, x, x') = p(1, \omega, x', x, x'). \tag{2.179}$$
Now, we choose the normal coordinates $y^a$ of the point x defined above and the normal coordinates $\hat y^a$ of the point $\hat x$ with the origin at $x'$, so that the normal coordinates $y'$ of the point $x'$ are equal to zero, $y'^a = 0$. Recall that the normal coordinates are equal to the components of the tangent vector at the point $x'$ to the geodesic connecting the point $x'$ and the current point, that is, $y^a = -e^a{}_{\mu'}(x')\,\sigma^{;\mu'}(x, x')$ and $\hat y^a = -e^a{}_{\mu'}(x')\,\sigma^{;\mu'}(\hat x, x')$. Then by taking into account Eqs. (2.103) and (2.105), Equation (2.173) becomes
$$\frac{d\hat y^a}{dt} = \left(K(\hat y) \cot K(\hat y)\right)^a{}_b\, p^b - \omega^i D^a{}_{ib}\, \hat y^b, \tag{2.180}$$
with the initial condition
$$\hat y^a\big|_{t=0} = y^a. \tag{2.181}$$
The solution of this equation defines a function $\hat y = \hat y(t, p, \omega, y)$.
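Proposition 12. The Taylor expansion of the solution $\hat y = \hat y(t, p, \omega, y)$ of Eq. (2.180) in t reads
$$\hat y^a = y^a + \left[\left(K(y) \cot K(y)\right)^a{}_b\, p^b - \omega^i D^a{}_{ib}\, y^b\right] t + O(t^2). \tag{2.182}$$
The Taylor expansion of the function $\hat y = \hat y(t, p, \omega, y)$ in p and y reads
$$\hat y^a = \left(\exp[-t D(\omega)]\right)^a{}_b\, y^b + \left(\frac{1 - \exp[-t D(\omega)]}{D(\omega)}\right)^a{}_b p^b + O(y^2, p^2, py). \tag{2.183}$$
There holds
$$\det\left(\frac{\partial \hat y^a}{\partial p^b}\right)\bigg|_{p=y=0,\,t=1} = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right). \tag{2.184}$$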
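Proof. Eq. (2.182) trivially follows from Eq. (2.180). Let us expand the function $\hat y(t, p, \omega, y)$ in Taylor series in p and y restricting ourselves to linear terms, that is,
$$\hat y^a = \hat y^a\big|_{p=y=0} + \frac{\partial \hat y^a}{\partial p^b}\bigg|_{p=y=0} p^b + \frac{\partial \hat y^a}{\partial y^b}\bigg|_{p=y=0} y^b + O(p^2, y^2, py). \tag{2.185}$$
First of all, for p = 0, Eq. (2.180) becomes
$$\frac{d\hat y^a}{dt} = -\omega^i D^a{}_{ib}\, \hat y^b. \tag{2.186}$$
The solution of this equation with the initial condition $\hat y = 0$ is trivial, therefore,
$$\hat y\big|_{p=y=0} = \hat y(t, 0, \omega, 0) = 0. \tag{2.187}$$
Next, by differentiating Eq. (2.186) with respect to $y^b$ and setting y = 0 we obtain the equation
$$\frac{d}{dt} \frac{\partial \hat y^a}{\partial y^b}\bigg|_{p=y=0} = -\omega^i D^a{}_{ic} \frac{\partial \hat y^c}{\partial y^b}\bigg|_{p=y=0}, \tag{2.188}$$
with the initial condition
$$\frac{\partial \hat y^a}{\partial y^b}\bigg|_{p=y=t=0} = \delta^a{}_b. \tag{2.189}$$
The solution of this equation is
$$\frac{\partial \hat y^a}{\partial y^b}\bigg|_{p=y=0} = \left(\exp[-t D(\omega)]\right)^a{}_b, \tag{2.190}$$
where $D(\omega) = \omega^i D_i$. Let
$$Z^a{}_b = \frac{\partial \hat y^a}{\partial p^b}\bigg|_{p=y=0}. \tag{2.191}$$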
Then by differentiating Eq. (2.180) with respect to $p^b$ and setting p = 0, we obtain
$$\frac{d}{dt} Z^a{}_b = \delta^a{}_b - \omega^i D^a{}_{ic} Z^c{}_b, \tag{2.192}$$
with the initial condition
$$Z^a{}_b\big|_{t=0} = 0. \tag{2.193}$$
The solution of this equation is
$$Z = \frac{1 - \exp[-t D(\omega)]}{D(\omega)}. \tag{2.194}$$
By substituting Eqs. (2.187), (2.190) and (2.194) in (2.185) we get the desired result (2.183). Finally, by taking into account that the matrix D(ω) is traceless, we find first $\det \exp[t D(\omega)] = 1$, and, then by using Eq. (2.194) we obtain (2.184).
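The step from (2.194) to (2.184) can be checked numerically. The sketch below is illustrative only: the 3x3 antisymmetric matrix `D` is a hypothetical stand-in for D(ω), and the helper names (`power_series`, `z_matrix`, `sinhc_half`, `det3`) are not from the paper. Both matrix functions are evaluated by their power series in pure Python.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power_series(d, coeff, terms=40):
    """Return sum_{k=0}^{terms-1} coeff(k) * d^k for a square matrix d."""
    n = len(d)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(terms):
        c = coeff(k)
        for i in range(n):
            for j in range(n):
                total[i][j] += c * power[i][j]
        power = mat_mul(power, d)
    return total

def z_matrix(d, t=1.0):
    """(1 - exp(-t d)) / d, written as sum_k t^(k+1) (-d)^k / (k+1)!."""
    return power_series(
        d, lambda k: (-1.0) ** k * t ** (k + 1) / math.factorial(k + 1))

def sinhc_half(d):
    """sinh(d/2) / (d/2), written as sum_j (d/2)^(2j) / (2j+1)!."""
    return power_series(
        d, lambda k: 0.0 if k % 2 else 0.5 ** k / math.factorial(k + 1))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical antisymmetric (hence traceless) sample standing in for D(omega).
D = [[0.0, 0.7, -0.3],
     [-0.7, 0.0, 1.1],
     [0.3, -1.1, 0.0]]

lhs = det3(z_matrix(D, 1.0))   # determinant of Z from (2.194) at t = 1
rhs = det3(sinhc_half(D))      # right-hand side of (2.184)
print(lhs, rhs)
```

The two printed numbers should agree to rounding error, which reflects the factorization $(1 - e^{-D})/D = e^{-D/2}\,\sinh(D/2)/(D/2)$ together with $\det e^{-D/2} = 1$ for traceless D.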
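The function $\hat y = \hat y(t, p, \omega, y)$ implicitly defines the function
$$p = p(t, \omega, \hat y, y). \tag{2.195}$$
The function $\bar p = \bar p(\omega, y)$ is now defined by the equation
$$\hat y(1, \bar p, \omega, y) = 0, \tag{2.196}$$
or
$$\bar p(\omega, y) = p(1, \omega, 0, y). \tag{2.197}$$
Proposition 13. The Taylor expansion of the function $\bar p(\omega, y)$ in y has the form
$$\bar p^a = -\left(D(\omega) \frac{\exp[-D(\omega)]}{1 - \exp[-D(\omega)]}\right)^a{}_b y^b + O(y^2). \tag{2.198}$$
Therefore,
$$\det\left(-\frac{\partial \bar p^a}{\partial y^b}\right)\bigg|_{y=0} = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1}. \tag{2.199}$$
Proof. We expand $\bar p$ in Taylor series in y,
$$\bar p^a = \bar p^a\big|_{y=0} + \frac{\partial \bar p^a}{\partial y^b}\bigg|_{y=0} y^b + O(y^2). \tag{2.200}$$
Next, by taking into account (2.187) we have
$$\bar p\big|_{y=0} = 0. \tag{2.201}$$
Further, by differentiating (2.196) with respect to $y^b$ and setting y = 0 we get
$$\frac{\partial \hat y^a}{\partial y^b}\bigg|_{p=y=0,\,t=1} + \frac{\partial \hat y^a}{\partial p^c}\bigg|_{p=y=0,\,t=1} \frac{\partial \bar p^c}{\partial y^b}\bigg|_{y=0} = 0, \tag{2.202}$$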
and, therefore,
$$\frac{\partial \bar p^a}{\partial y^b}\bigg|_{y=0} = -\left(D(\omega) \frac{\exp[-D(\omega)]}{1 - \exp[-D(\omega)]}\right)^a{}_b. \tag{2.203}$$
This leads to both (2.198) and (2.199).
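For completeness, the determinant (2.199) can be read off from (2.203) in two lines. The sketch below writes D for D(ω) and uses only the tracelessness of D already invoked in the proof of Proposition 12, so that $\det e^{-D/2} = 1$.

```latex
\begin{align*}
\det\left(-\frac{\partial \bar p^a}{\partial y^b}\right)\bigg|_{y=0}
 &= \det{}_{TM}\left(\frac{D\,e^{-D}}{1-e^{-D}}\right)
  = \det{}_{TM}\left(e^{-D/2}\,\frac{D/2}{\sinh(D/2)}\right) \\
 &= \det{}_{TM}\left(e^{-D/2}\right)\,
    \det{}_{TM}\left(\frac{\sinh(D/2)}{D/2}\right)^{-1}
  = \det{}_{TM}\left(\frac{\sinh(D/2)}{D/2}\right)^{-1}.
\end{align*}
```

The middle equality uses $1 - e^{-D} = 2\,e^{-D/2}\sinh(D/2)$.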
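Now, we define
$$\Lambda^{\hat\mu}{}_\nu = \frac{\partial \hat x^\mu}{\partial x^\nu}. \tag{2.204}$$
The pullback of the metric by the diffeomorphism $\psi_t$ is defined by
$$(\psi_t^* g)_{\mu\nu}(x) = \Lambda^{\hat\alpha}{}_\mu \Lambda^{\hat\beta}{}_\nu\, g_{\hat\alpha\hat\beta}(\hat x). \tag{2.205}$$
Since $\psi_t$ is an isometry, we have
$$(\psi_t^* g)_{\mu\nu}(x) = g_{\mu\nu}(x). \tag{2.206}$$
Therefore, the inverse matrix $\Lambda^{-1}$ is equal to
$$(\Lambda^{-1})^\mu{}_{\hat\alpha} = g^{\mu\nu}(x) \Lambda^{\hat\beta}{}_\nu\, g_{\hat\beta\hat\alpha}(\hat x). \tag{2.207}$$
Let $e_a{}^\mu$ and $e^a{}_\mu$ be a local orthonormal frame that is obtained by parallel transport along geodesics from a point $x'$. Then the action of the pullback $\psi_t^*$ on the orthonormal frame is
$$(\psi_t^* e^a)_\mu(x) = \Lambda^{\hat\alpha}{}_\mu\, e^a{}_{\hat\alpha}(\hat x). \tag{2.208}$$
Since $\psi_t$ is an isometry, we have
$$\delta_{ab} (\psi_t^* e^a)_\alpha(x) (\psi_t^* e^b)_\beta(x) = \delta_{ab}\, e^a{}_\alpha(x)\, e^b{}_\beta(x). \tag{2.209}$$
Therefore, the frames of 1-forms $e^a$ and $\psi_t^* e^a$ are related by an orthogonal transformation
$$(\psi_t^* e^a)(x) = O^a{}_b\, e^b(x), \tag{2.210}$$
where the matrix $O^a{}_b$ is defined by
$$O^a{}_b = e^a{}_{\hat\alpha}(\hat x)\, \Lambda^{\hat\alpha}{}_\mu\, e_b{}^\mu(x). \tag{2.211}$$
Proposition 14. For p = y = 0 the matrix O has the form
$$O\big|_{p=y=0} = \exp[-t D(\omega)]. \tag{2.212}$$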
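Proof. We use normal coordinates $\hat y^a$ and $y^a$. Then the matrix O takes the form
$$O^a{}_b = e^a{}_{\hat\alpha} \frac{\partial \hat x^\alpha}{\partial \hat y^c} \frac{\partial \hat y^c}{\partial y^d} \frac{\partial y^d}{\partial x^\mu}\, e_b{}^\mu. \tag{2.213}$$
Now, by using the Jacobian matrix (2.41) and recalling that $\hat y = 0$ for p = y = 0 we obtain
$$\frac{\partial y^a}{\partial x^\mu}\, e_b{}^\mu\bigg|_{p=y=0} = e^a{}_{\hat\alpha} \frac{\partial \hat x^\alpha}{\partial \hat y^b}\bigg|_{p=y=0} = \delta^a{}_b. \tag{2.214}$$
Therefore,
$$O^a{}_b\big|_{p=y=0} = \frac{\partial \hat y^a}{\partial y^b}\bigg|_{p=y=0}, \tag{2.215}$$
and, finally, (2.190) gives the desired result (2.212).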
Let φ be a section of the twisted spin-tensor bundle V. Let $V_x$ be the fiber at the point x and $V_{\hat x}$ be the fiber at the point $\hat x = \psi_t(x)$. The pullback of the diffeomorphism $\psi_t$ defines the map, that we call just the pullback,
$$\psi_t^* : C^\infty(V) \to C^\infty(V) \tag{2.216}$$
on smooth sections of the twisted spin-tensor bundle V. The pullback of tensor fields of type (p, q) is defined by
$$(\psi_t^* \varphi)^{\mu_1 \dots \mu_p}_{\nu_1 \dots \nu_q}(x) = \Lambda^{\hat\beta_1}{}_{\nu_1} \cdots \Lambda^{\hat\beta_q}{}_{\nu_q} (\Lambda^{-1})^{\mu_1}{}_{\hat\alpha_1} \cdots (\Lambda^{-1})^{\mu_p}{}_{\hat\alpha_p}\, \varphi^{\hat\alpha_1 \dots \hat\alpha_p}_{\hat\beta_1 \dots \hat\beta_q}(\hat x). \tag{2.217}$$
We define the twisted pullback (a combination of a proper pullback and a gauge transformation) of a tensor of type (p, q) by
$$(\psi_t^* \varphi)^{a_1 \dots a_p}_{b_1 \dots b_q}(x) = O^{c_1}{}_{b_1} \cdots O^{c_q}{}_{b_q}\, O_{d_1}{}^{a_1} \cdots O_{d_p}{}^{a_p}\, \varphi^{d_1 \dots d_p}_{c_1 \dots c_q}(\hat x). \tag{2.218}$$
Since the matrix O is orthogonal, it can be parametrized by
$$O = \exp \theta, \tag{2.219}$$
where $\theta_{ab}$ is an antisymmetric matrix. The orthogonal transformation of the frame pulled back causes the transformation of spinors
$$(\psi_t^* \varphi)(x) = \exp\left(-\frac{1}{4} \theta_{ab} \gamma^{ab}\right) \varphi(\hat x). \tag{2.220}$$
More generally, we have
Proposition 15. Let φ be a section of a twisted spin-tensor bundle V. Then
$$(\psi_t^* \varphi)(x) = \exp\left(-\frac{1}{2} \theta_{ab} G^{ab}\right) \varphi(\hat x). \tag{2.221}$$
In particular, for p = y = 0 (or $x = x'$),
$$(\psi_t^* \varphi)(x)\big|_{p=y=0} = \exp\left[t \mathcal{R}(\omega)\right] \varphi(x'), \tag{2.222}$$
where $\mathcal{R}(\omega) = \omega^i \mathcal{R}_i$.
Proof. First, from Eq. (2.212) we see that
$$\theta^a{}_b\big|_{p=y=0} = -t \omega^i D^a{}_{ib}. \tag{2.223}$$
Then, from the definition (2.144) of the matrices $\mathcal{R}_i$ we get (2.222).
It is not very difficult to check that the Lie derivatives are nothing but the generators of the pullback, that is,
$$\mathcal{L}_\xi \varphi = k^A \mathcal{L}_A \varphi = \frac{d}{dt} (\psi_t^* \varphi)\bigg|_{t=0}. \tag{2.224}$$
We will use this fundamental fact to compute the heat kernel diagonal below.

3. Heat Semigroup

3.1. Geometry of the curvature group. Let $G_{\rm gauge}$ be the gauged curvature group and H be its holonomy subgroup. Both these groups have compact algebras. However, while the holonomy group is always compact, the curvature group is, in general, a product of a nilpotent group, $G_0$, and a semi-simple group, $G_s$,
$$G_{\rm gauge} = G_0 \times G_s. \tag{3.1}$$
The semi-simple group $G_s$ is a product $G_s = G_+ \times G_-$ of a compact $G_+$ and a non-compact $G_-$ subgroup. Let $\xi_A$ be the basis Killing vectors, $k^A$ be the canonical coordinates on the curvature group G and $\xi(k) = k^A \xi_A$. The canonical coordinates are exactly the normal coordinates on the group defined above. Let $C_A$ be the generators of the curvature group in adjoint representation and $C(k) = k^A C_A$. In the following $\partial_M$ means the partial derivative $\partial/\partial k^M$ with respect to the canonical coordinates. We define the matrix $Y^A{}_M$ by the equation
$$\exp[-\xi(k)]\, \partial_M \exp[\xi(k)] = Y^A{}_M\, \xi_A, \tag{3.2}$$
which is well defined since the right hand side lies in the Lie algebra of the curvature group. This can be written in the form
$$\exp[-\xi(k)]\, \partial_M \exp[\xi(k)] = \exp[-{\rm Ad}_{\xi(k)}]\, \partial_M, \tag{3.3}$$
where the operator ${\rm Ad}_X$ is defined by ${\rm Ad}_X Z = [X, Z]$. This enables us to compute the matrix $Y = (Y^A{}_M)$ explicitly, namely,
$$Y = \frac{1 - \exp[-C(k)]}{C(k)}. \tag{3.4}$$
Let $X = (X_A{}^M) = Y^{-1}$ be the inverse matrix of Y. Then we define the 1-forms $Y^A$ and the vector fields $X_A$ on the group G by
$$Y^A = Y^A{}_M\, dk^M, \qquad X_A = X_A{}^M \partial_M. \tag{3.5}$$
Proposition 16. There holds
$$X_A \exp[\xi(k)] = \exp[\xi(k)]\, \xi_A. \tag{3.6}$$
Proof. This follows immediately from Eq. (3.2).
Next, by differentiating Eq. (3.2) with respect to $k^L$ and alternating the indices L and M we obtain
$$\partial_L Y^A{}_M - \partial_M Y^A{}_L = -C^A{}_{BC}\, Y^B{}_L Y^C{}_M, \tag{3.7}$$
which, of course, can also be written as
$$dY^A = -\frac{1}{2} C^A{}_{BC}\, Y^B \wedge Y^C. \tag{3.8}$$
Proposition 17. The vector fields $X_A$ satisfy the commutation relations
$$[X_A, X_B] = C^C{}_{AB} X_C. \tag{3.9}$$
Proof. This follows from Eq. (3.7).
The vector fields $X_A$ are nothing but the right-invariant vector fields. They form a representation of the curvature algebra. We will also need the following fundamental property of Lie groups.
Proposition 18. Let G be a Lie group with the structure constants $C^A{}_{BC}$, $C_A = (C^B{}_{AC})$ and $C(k) = C_A k^A$. Let $\gamma = (\gamma_{AB})$ be a symmetric non-degenerate matrix satisfying the equation
$$(C_A)^T = -\gamma\, C_A\, \gamma^{-1}. \tag{3.10}$$
Let $X = (X_A{}^M)$ be a matrix defined by
$$X = \frac{C(k)}{1 - \exp[-C(k)]}. \tag{3.11}$$
Then
$$(\det X)^{-1/2}\, \gamma^{AB} X_A{}^M \partial_M X_B{}^N \partial_N\, (\det X)^{1/2} = -\frac{1}{24} \gamma^{AB} C^C{}_{AD} C^D{}_{BC}. \tag{3.12}$$
Proof. It is easy to check that this equation holds at k = 0. Now, it can be proved by showing that it is a group invariant. For a detailed proof for semisimple groups see [18,20,25].
It is worth stressing that this equation holds not only on semisimple Lie groups but on any group with a compact Lie algebra, that is, when the structure constants $C^A{}_{BC}$ and the matrix $\gamma_{AB}$, used to define the metric $G_{MN}$ and the operator $X^2$, satisfy Eq. (2.81). Such algebras can have an Abelian center as in Eq. (2.83). Now, by using the right-invariant vector fields we define a metric on the curvature group G,
$$G_{MN} = \gamma_{AB} Y^A{}_M Y^B{}_N, \qquad G^{MN} = \gamma^{AB} X_A{}^M X_B{}^N. \tag{3.13}$$
This metric is bi-invariant and satisfies, in particular, the equation
$$\mathcal{L}_{X_A} G_{BC} = X_A{}^M \partial_M G_{BC} + G_{BM} \partial_C X_A{}^M + G_{MC} \partial_B X_A{}^M = 0. \tag{3.14}$$
This equation is proved by using Eqs. (2.81) and (3.9). This means that the vector fields $X_A$ are the Killing vector fields of the metric $G_{MN}$. One can easily show that this metric defines the following natural affine connection $\nabla^G$ on the group
$$\nabla^G_{X_C} X_A = -\frac{1}{2} C^B{}_{AC} X_B, \qquad \nabla^G_{X_C} Y^A = \frac{1}{2} C^A{}_{BC} Y^B, \tag{3.15}$$
with the scalar curvature
$$R_G = -\frac{1}{4} \gamma^{AB} C^C{}_{AD} C^D{}_{BC}. \tag{3.16}$$
Since the matrix C(k) is traceless we have $\det \exp[C(k)/2] = 1$, and, therefore, the volume element on the group is
$$|G|^{1/2} = (\det G_{MN})^{1/2} = |\gamma|^{1/2} \det{}_G\left(\frac{\sinh[C(k)/2]}{C(k)/2}\right), \tag{3.17}$$
where $|\gamma| = \det \gamma_{AB}$. Notice that this function is precisely the inverse Van Vleck-Morette determinant (2.42) on the group in normal coordinates. It is not difficult to see that
$$k^M Y^A{}_M = k^M X_M{}^A = k^A. \tag{3.18}$$
By differentiating this equation with respect to $k^B$ and contracting the indices A and B we obtain
$$k^M \partial_A X_M{}^A = N - X_A{}^A. \tag{3.19}$$
Now, by contracting Eq. (3.15) with $G^{BC}$ we obtain the zero-divergence condition for the right-invariant vector fields
$$|G|^{-1/2}\, \partial_M \left(|G|^{1/2} X_A{}^M\right) = 0. \tag{3.20}$$
Next, we define the Casimir operator
$$X^2 = C_2(G, X) = \gamma^{AB} X_A X_B. \tag{3.21}$$
By using Eq. (3.20) one can easily show that $X^2$ is an invariant differential operator that is nothing but the scalar Laplacian on the group G
$$X^2 = |G|^{-1/2}\, \partial_M\, |G|^{1/2} G^{MN} \partial_N = G^{MN} \nabla^G_M \nabla^G_N. \tag{3.22}$$
Then, by using Eqs. (2.81) and (2.78) one can show that the operator $X^2$ commutes with the operators $X_A$,
$$[X_A, X^2] = 0. \tag{3.23}$$
Since we will actually be working with the gauged curvature group, we introduce now the operators (covariant right-invariant vector fields) $J_A$ by
$$J_A = X_A - \frac{1}{2} B_{AB} k^B, \tag{3.24}$$
and the operator
$$J^2 = \gamma^{AB} J_A J_B. \tag{3.25}$$
Proposition 19. The operators $J_A$ and $J^2$ satisfy the commutation relations
$$[J_A, J_B] = C^C{}_{AB} J_C + B_{AB}, \tag{3.26}$$
and
$$[J_A, J^2] = 2 B_{AB} J^B. \tag{3.27}$$
Proof. By using Eqs. (2.157)-(2.160) we obtain
$$X_B{}^A B_{AM} = \gamma_{BN} \gamma^{AC} X_C{}^N B_{AM} = B_{BM}, \tag{3.28}$$
and, hence,
$$\gamma^{AB} X_B{}^M B_{AM} = 0, \tag{3.29}$$
and, further, by using (3.9) we obtain (3.26). By using Eqs. (3.28) we get (3.27).
Thus, the operators $J_A$ form a representation of the gauged curvature algebra. Now, let $\mathcal{L}_A$ be the operators of Lie derivatives satisfying the commutation relations (2.162) and $\mathcal{L}(k) = k^A \mathcal{L}_A$.
Proposition 20. There holds
$$J_A \exp[\mathcal{L}(k)] = \exp[\mathcal{L}(k)]\, \mathcal{L}_A, \tag{3.30}$$
and, therefore,
$$J^2 \exp[\mathcal{L}(k)] = \exp[\mathcal{L}(k)]\, \mathcal{L}^2. \tag{3.31}$$
Proof. Similarly to (3.3) we have
$$\exp[-\mathcal{L}(k)]\, \partial_M \exp[\mathcal{L}(k)] = \exp[-{\rm Ad}_{\mathcal{L}(k)}]\, \partial_M. \tag{3.32}$$
By using the commutation relations (2.162) and Eq. (2.160) we obtain
$$\exp[-\mathcal{L}(k)]\, \partial_M \exp[\mathcal{L}(k)] = Y^A{}_M \mathcal{L}_A + \frac{1}{2} B_{MN} k^N. \tag{3.33}$$
The statement of the proposition follows from the definition of the operators $J_A$, $J^2$ and $\mathcal{L}^2$.
3.2. Heat kernel on the curvature group. Let B be the matrix with the components $B = (\gamma^{AB} B_{BC})$. Let $k^A$ be the canonical coordinates on the curvature group G and A(t; k) be a function defined by
$$A(t; k) = \det{}_G\left(\frac{\sinh[C(k)/2 + tB]}{C(k)/2 + tB}\right)^{-1/2}. \tag{3.34}$$
By using Eqs. (3.28) one can rewrite this in the form
$$A(t; k) = \det{}_G\left(\frac{\sinh[C(k)/2]}{C(k)/2}\right)^{-1/2} \det{}_G\left(\frac{\sinh[tB]}{tB}\right)^{-1/2}. \tag{3.35}$$
Notice also that due to (2.159),
$$\det{}_G\left(\frac{\sinh[tB]}{tB}\right)^{-1/2} = \det{}_{TM}\left(\frac{\sinh[tB]}{tB}\right)^{-1/2}, \tag{3.36}$$
where B is now regarded as just the matrix $B = (B^a{}_b)$. Let Θ(t; k) be another function on the group G defined by
$$\Theta(t; k) = \frac{1}{2} \langle k, \gamma \hat\Theta k\rangle, \tag{3.37}$$
where $\hat\Theta$ is the matrix
$$\hat\Theta = tB \coth(tB) \tag{3.38}$$
and $\langle u, \gamma v\rangle = \gamma_{AB} u^A v^B$ is the inner product on the algebra G.
Theorem 6. Let Φ(t; k) be a function on the group G defined by
$$\Phi(t; k) = (4\pi t)^{-N/2} A(t; k) \exp\left(-\frac{\Theta(t; k)}{2t} + \frac{1}{6} R_G\, t\right). \tag{3.39}$$
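Then Φ(t; k) satisfies the equation
$$\partial_t \Phi = J^2 \Phi, \tag{3.40}$$
and the initial condition
$$\Phi(0; k) = |\gamma|^{-1/2} \delta(k). \tag{3.41}$$
Proof. We compute first
$$\partial_t \Theta = \frac{1}{t} \Theta - \frac{1}{2t} \langle k, \gamma \hat\Theta^2 k\rangle + \frac{t}{2} \langle k, \gamma B^2 k\rangle \tag{3.42}$$
and
$$\partial_t A = \frac{1}{2t} \left(N - {\rm tr}_G\, \hat\Theta\right) A. \tag{3.43}$$
Therefore,
$$\partial_t \Phi = \left[\frac{1}{6} R_G - \frac{1}{2t} {\rm tr}_G\, \hat\Theta + \frac{1}{4t^2} \langle k, \gamma \hat\Theta^2 k\rangle - \frac{1}{4} \langle k, \gamma B^2 k\rangle\right] \Phi. \tag{3.44}$$
Next, we have
$$J^2 = X^2 - \gamma^{AB} B_{AC} k^C X_B + \frac{1}{4} \gamma^{AB} B_{AC} B_{BD} k^C k^D. \tag{3.45}$$
By using Eqs. (3.28) and (2.160) and the anti-symmetry of the matrix $B_{AB}$ we show that
$$\gamma^{AB} B_{AC} k^C X_B \Theta = 0, \qquad \gamma^{AB} B_{AC} k^C X_B A = 0, \tag{3.46}$$
and, therefore,
$$\gamma^{AB} B_{AC} k^C X_B \Phi = 0. \tag{3.47}$$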
Thus,
$$J^2 \Phi = \left[A^{-1}(X^2 A) - \frac{1}{2t}(X^2 \Theta) + \frac{1}{4t^2} \gamma^{AB} (X_A \Theta)(X_B \Theta) - \frac{1}{t} A^{-1} \gamma^{AB} (X_B A)(X_A \Theta) - \frac{1}{4} \langle k, \gamma B^2 k\rangle\right] \Phi. \tag{3.48}$$
Further, by using (3.28) we get
$$\gamma^{AB} (X_A \Theta)(X_B \Theta) = \langle k, \gamma \hat\Theta^2 k\rangle, \tag{3.49}$$
$$X^2 \Theta = {\rm tr}_G\, X + {\rm tr}_G\, \hat\Theta - N. \tag{3.50}$$
Now, by using Eq. (3.20) in the form
$$A^2 \partial_M \left(A^{-2} X_B{}^M\right) = 0 \tag{3.51}$$
and Eqs. (2.160) and (3.19) we show that
$$A^{-1} \gamma^{AB} (X_A \Theta) X_B A = \frac{1}{2} \left(N - {\rm tr}_G\, X\right), \tag{3.52}$$
and by using Eq. (3.12) we obtain
$$A^{-1} X^2 A = \frac{1}{6} R_G. \tag{3.53}$$
Finally, substituting Eqs. (3.49)-(3.53) into Eq. (3.48) and comparing it with Eq. (3.44) we prove Eq. (3.40). The initial condition (3.41) follows easily from the well known property of the Gaussian. This completes the proof of the theorem.
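The final substitution can be displayed explicitly. The following is a sketch of that bookkeeping, assuming (3.48) is read as a sum of five terms multiplying Φ, with $\hat\Theta = tB\coth(tB)$ and $\langle u, \gamma v\rangle = \gamma_{AB}u^A v^B$ as above; the terms containing ${\rm tr}_G X$ and N cancel in pairs.

```latex
\begin{align*}
\Phi^{-1} J^2 \Phi
 &= \tfrac{1}{6} R_G
  - \tfrac{1}{2t}\left(\mathrm{tr}_G X + \mathrm{tr}_G \hat\Theta - N\right)
  + \tfrac{1}{4t^2}\langle k, \gamma \hat\Theta^2 k\rangle
  - \tfrac{1}{t}\cdot\tfrac{1}{2}\left(N - \mathrm{tr}_G X\right)
  - \tfrac{1}{4}\langle k, \gamma B^2 k\rangle \\
 &= \tfrac{1}{6} R_G
  - \tfrac{1}{2t}\,\mathrm{tr}_G \hat\Theta
  + \tfrac{1}{4t^2}\langle k, \gamma \hat\Theta^2 k\rangle
  - \tfrac{1}{4}\langle k, \gamma B^2 k\rangle
  = \Phi^{-1}\partial_t \Phi .
\end{align*}
```

The last equality is the content of (3.44).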
3.3. Regularization and analytical continuation. In the following we will complexify the gauged curvature group in the following sense. We extend the canonical coordinates $(k^A) = (p^a, \omega^i)$ to the whole complex Euclidean space $\mathbb{C}^N$. Then all group-theoretic functions introduced above become analytic functions of $k^A$ possibly with some poles on the real section $\mathbb{R}^N$ for compact groups. In fact, we replace the actual real slice $\mathbb{R}^N$ of $\mathbb{C}^N$ with an N-dimensional subspace $\mathbb{R}^N_{\rm reg}$ in $\mathbb{C}^N$ obtained by rotating the real section $\mathbb{R}^N$ counterclockwise in $\mathbb{C}^N$ by π/4. That is, we replace each coordinate $k^A$ by $e^{i\pi/4} k^A$. In the complex domain the group becomes non-compact. We call this procedure the decompactification. If the group is compact, or has a compact subgroup, then this plane will cover the original group infinitely many times.
Since the metric $(\gamma_{AB}) = {\rm diag}(\delta_{ab}, \beta_{ij})$ is not necessarily positive definite (actually, only the metric of the holonomy group $\beta_{ij}$ is non-definite), we analytically continue the function Φ(t; k) in the complex plane of t with a cut along the negative imaginary axis so that $-\pi/2 < \arg t < 3\pi/2$. Thus, the function Φ(t; k) defines an analytic function of t and $k^A$. For the purpose of the following exposition we shall consider t to be real negative, t < 0. This is needed in order to make all integrals convergent and well defined and to be able to do the analytical continuation.
As we will show below, the singularities occur only in the holonomy group. This means that there is no need to complexify the coordinates $p^a$. Thus, in the following we assume the coordinates $p^a$ to be real and the coordinates $\omega^i$ to be complex, more precisely, to take values in the p-dimensional subspace $\mathbb{R}^p_{\rm reg}$ of $\mathbb{C}^p$ obtained by rotating $\mathbb{R}^p$ counterclockwise by π/4 in $\mathbb{C}^p$. That is, we have $\mathbb{R}^N_{\rm reg} = \mathbb{R}^n \times \mathbb{R}^p_{\rm reg}$.
This procedure (that we call a regularization) with the nonstandard contour of integration is necessary for the convergence of the integrals below since we are treating both the compact and the non-compact symmetric spaces simultaneously. Remember that, in general, the nondegenerate diagonal matrix $\beta_{ij}$ is not positive definite. The space $\mathbb{R}^p_{\rm reg}$ is chosen in such a way to make the Gaussian exponent purely imaginary. Then the indefiniteness of the matrix β does not cause any problems. Moreover, the integrand does not have any singularities on these contours. The convergence of the integral is guaranteed by the exponential growth of the sine for imaginary argument. These integrals can be computed then in the following way. The coordinates $\omega^j$ corresponding to the compact directions are rotated further by another π/4 to an imaginary axis and the coordinates $\omega^j$ corresponding to the non-compact directions are rotated back to the real axis. Then, for t < 0 all the integrals below are well defined and convergent and define an analytic function of t in a complex plane with a cut along the negative imaginary axis.
3.4. Heat semigroup.
Theorem 7. The heat semigroup $\exp(t\mathcal{L}^2)$ can be represented in the form of the integral
$$\exp(t\mathcal{L}^2) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}(k)\, \Phi(t; k) \exp[\mathcal{L}(k)]. \tag{3.54}$$
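Proof. Let
$$\Psi(t) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}\, \Phi(t; k) \exp[\mathcal{L}(k)]. \tag{3.55}$$
By using the previous theorem we obtain
$$\partial_t \Psi(t) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2} \exp[\mathcal{L}(k)]\, J^2 \Phi(t; k). \tag{3.56}$$
Now, by integrating by parts we get
$$\partial_t \Psi(t) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}\, \Phi(t; k)\, J^2 \exp[\mathcal{L}(k)], \tag{3.57}$$
and, by using Eq. (3.31) we obtain
$$\partial_t \Psi(t) = \Psi(t)\, \mathcal{L}^2. \tag{3.58}$$
Finally, from the initial condition (3.41) for the function Φ(t; k) we get Ψ(0) = 1, and, therefore,
$$\Psi(t) = \exp(t\mathcal{L}^2). \tag{3.59}$$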
Theorem 8. Let Δ be the Laplacian acting on sections of a homogeneous twisted spin-tensor vector bundle over a symmetric space. Then the heat semigroup exp(tΔ) can be represented in the form of an integral
$$\exp(t\Delta) = (4\pi t)^{-N/2} \det{}_{TM}\left(\frac{\sinh(tB)}{tB}\right)^{-1/2} \exp\left(-t\mathcal{R}^2 + \frac{1}{6} R_G\, t\right) \int_{\mathbb{R}^N_{\rm reg}} dk\, |\gamma|^{1/2} \det{}_G\left(\frac{\sinh[C(k)/2]}{C(k)/2}\right)^{1/2} \exp\left(-\frac{1}{4t} \langle k, \gamma\, tB \coth(tB)\, k\rangle\right) \exp[\mathcal{L}(k)]. \tag{3.60}$$
Proof. By using Eq. (2.169) we obtain
$$\exp(t\Delta) = \exp\left(-t\mathcal{R}^2\right) \exp\left(t\mathcal{L}^2\right). \tag{3.61}$$
The statement of the theorem follows now from Eqs. (3.54), (3.39), (3.35)-(3.38) and (3.17).
4. Heat Kernel

4.1. Heat kernel diagonal and heat trace. The heat kernel diagonal on a homogeneous bundle over a symmetric space is parallel. In a parallel local frame it is just a constant matrix. The fiber trace of the heat kernel diagonal is just a constant. That is why it can be computed at any point in M. We fix a point $x'$ in M such that the Killing vectors satisfy the initial conditions (2.104)-(2.106) and are given by the explicit formulas above (2.103)-(2.105). We compute the heat kernel diagonal at the point $x'$. The heat kernel diagonal can be obtained by acting by the heat semigroup exp(tΔ) on the delta-function, [8,10]
$$U^{\rm diag}(t) = \exp(t\Delta)\, \delta(x, x')\big|_{x=x'} = \exp\left(-t\mathcal{R}^2\right) \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}\, \Phi(t; k) \exp[\mathcal{L}(k)]\, \delta(x, x')\big|_{x=x'}. \tag{4.1}$$
To be able to use this integral representation we need to compute the action of the isometries $\exp[\mathcal{L}(k)]$ on the delta-function.
Proposition 21. Let φ be a section of the twisted spin-tensor bundle V, $\mathcal{L}_A$ be the twisted Lie derivatives, $k^A = (p^a, \omega^i)$ be the canonical coordinates on the group and $\mathcal{L}(k) = k^A \mathcal{L}_A$. Let $\xi = k^A \xi_A$ be the Killing vector and $\psi_t$ be the corresponding one-parameter diffeomorphism. Then
$$\exp[\mathcal{L}(k)]\, \varphi(x) = \exp\left(-\frac{1}{2} \theta_{ab} G^{ab}\right) \varphi(\hat x)\bigg|_{t=1}, \tag{4.2}$$
where $\hat x = \psi_t(x)$ and the matrix θ is defined by (2.219). In particular, for p = 0 and $x = x'$,
$$\exp[\mathcal{L}(k)]\, \varphi(x)\big|_{p=0,\,x=x'} = \exp[\mathcal{R}(\omega)]\, \varphi(x). \tag{4.3}$$
Proof. This statement follows from Eqs. (2.221) and (2.222) and the fact that the Lie derivative is nothing but the generator of the pullback.
Proposition 22. Let $\omega^i$ be the canonical coordinates on the holonomy group H and $(k^A) = (p^a, \omega^i)$ be the natural splitting of the canonical coordinates on the curvature group G. Then
$$\exp[\mathcal{L}(k)]\, \delta(x, x')\big|_{x=x'} = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1} \exp[\mathcal{R}(\omega)]\, \delta(p). \tag{4.4}$$
Proof. Let $\hat x(t, p, \omega, x, x') = \psi_t(x)$. By making use of Eq. (4.2) we obtain
$$\exp[\mathcal{L}(k)]\, \delta(x, x')\big|_{x=x'} = \exp\left(-\frac{1}{2} \theta_{ab} G^{ab}\right) \delta(\hat x(1, p, \omega, x, x'), x')\bigg|_{x=x',\,t=1}. \tag{4.5}$$
Now we change the variables from $x^\mu$ to the normal coordinates $y^a$ to get
$$\delta(\hat x(1, p, \omega, x, x'), x')\big|_{x=x'} = |g|^{-1/2} \det\left(\frac{\partial y^a}{\partial x^\mu}\right) \delta(\hat y(1, p, \omega, y))\bigg|_{y=0}. \tag{4.6}$$
This delta-function picks the values of p that make $\hat y = 0$, which is exactly the function $\bar p = \bar p(\omega, y)$ defined by Eq. (2.196). By switching further to the variables p we obtain
$$\delta(\hat x(1, p, \omega, x, x'), x')\big|_{x=x'} = |g|^{-1/2} \det\left(\frac{\partial y^a}{\partial x^\mu}\right) \det\left(\frac{\partial \hat y^b}{\partial p^c}\right)^{-1} \delta(p - \bar p(\omega, y))\bigg|_{y=0,\,t=1}. \tag{4.7}$$
Now, by recalling from (2.201) that $\bar p|_{y=0} = 0$ and by using (2.41) and (2.184) we evaluate the Jacobians for p = y = 0 and t = 1 to get Eq. (4.4).
Remark 2. Some remarks are in order here. We implicitly assumed that there are no closed geodesics and that the equation of closed orbits of isometries
$$\hat y^a(1, \bar p, \omega, 0) = 0 \tag{4.8}$$
has a unique solution $\bar p = \bar p(\omega, 0) = 0$. On compact symmetric spaces this is not true: there are infinitely many closed geodesics and infinitely many closed orbits of isometries. However, these global solutions, which reflect the global topological structure of the manifold, will not affect our local analysis. In particular, they do not affect the asymptotics of the heat kernel. That is why we have neglected them here. This is reflected in the fact that the Jacobian in (4.4) can become singular when the coordinates of the holonomy group $\omega^i$ vary from −∞ to ∞. Note that the exact results for compact symmetric spaces can be obtained by an analytic continuation from the dual noncompact case when such closed geodesics are absent [18]. That is why we proposed above to complexify our holonomy group. If the coordinates $\omega^i$ are complex, taking values in the subspace $\mathbb{R}^p_{\rm reg}$ defined above, then Eq. (4.8) should have a unique solution and the Jacobian is an analytic function. It is worth stressing once again that the canonical coordinates cover the whole group except for a set of measure zero. Also a compact subgroup is covered infinitely many times. We will show below how this works in the case of the two-sphere, $S^2$. Now by using the above lemmas and the theorem we can compute the heat kernel diagonal. We define the matrix F(ω) by $F(\omega) = \omega^i F_i$.
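Theorem 9. The heat kernel diagonal of the Laplacian on twisted spin-vector bundles over a symmetric space has the form
$$U^{\rm diag}(t) = (4\pi t)^{-n/2} \det{}_{TM}\left(\frac{\sinh(tB)}{tB}\right)^{-1/2} \exp\left[\left(\frac{1}{8} R + \frac{1}{6} R_H - \mathcal{R}^2\right) t\right] \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi t)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac{1}{4t} \langle \omega, \beta\omega\rangle\right) \cosh[\mathcal{R}(\omega)]\, \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1/2}, \tag{4.9}$$
where $|\beta| = \det \beta_{ij}$ and $\langle \omega, \beta\omega\rangle = \beta_{ij} \omega^i \omega^j$.
Proof. First, we have dk = dp dω and |γ| = |β|. By using Eqs. (4.1) and (4.4) and integrating over p we obtain the heat kernel diagonal
$$U^{\rm diag}(t) = \int_{\mathbb{R}^p_{\rm reg}} d\omega\, |G|^{1/2}(0, \omega)\, \Phi(t; 0, \omega) \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1} \exp\left[\mathcal{R}(\omega) - t\mathcal{R}^2\right]. \tag{4.10}$$
Further, by using Eq. (2.76) we compute the determinants
$$\det{}_G\left(\frac{\sinh[C(\omega)/2]}{C(\omega)/2}\right) = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right) \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right). \tag{4.11}$$
Now, by using (2.159) we compute (3.37), $\Theta(t; 0, \omega) = \frac{1}{2} \langle \omega, \beta\omega\rangle$, and, finally, by using Eqs. (3.39), (3.35), (3.16) and (2.88) we get the result (4.9).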
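By using this theorem we can also compute the heat trace for compact manifolds,
$${\rm Tr}_{L^2} \exp(t\Delta) = \int_M d{\rm vol}\, (4\pi t)^{-n/2}\, {\rm tr}_V \det{}_{TM}\left(\frac{\sinh(tB)}{tB}\right)^{-1/2} \exp\left[\left(\frac{1}{8} R + \frac{1}{6} R_H - \mathcal{R}^2\right) t\right] \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi t)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac{1}{4t} \langle \omega, \beta\omega\rangle\right) \cosh[\mathcal{R}(\omega)]\, \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1/2}, \tag{4.12}$$
where ${\rm tr}_V$ is the fiber trace.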
4.2. Heat kernel asymptotics. It is well known that there is the following asymptotic expansion as t → 0 of the heat kernel diagonal [24]:
$$U^{\rm diag}(t) \sim (4\pi t)^{-n/2} \sum_{k=0}^{\infty} t^k a_k. \tag{4.13}$$
The coefficients $a_k$ are called the local heat kernel coefficients. On compact manifolds, there is a similar asymptotic expansion of the heat trace with the global heat invariants $A_k$ defined by
$$A_k = \int_M d{\rm vol}\; {\rm tr}_V\, a_k. \tag{4.14}$$
In symmetric spaces the heat invariants do not contain any additional information since the local heat kernel coefficients define the heat invariants $A_k$ up to a constant equal to the volume of the manifold,
$$A_k = {\rm vol}(M)\, {\rm tr}_V\, a_k. \tag{4.15}$$
We introduce a Gaussian average over the holonomy algebra by
$$\langle f(\omega)\rangle = \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac{1}{4} \langle \omega, \beta\omega\rangle\right) f(\omega). \tag{4.16}$$
Then we can write
$$U^{\rm diag}(t) = (4\pi t)^{-n/2} \det{}_{TM}\left(\frac{\sinh(tB)}{tB}\right)^{-1/2} \exp\left[\left(\frac{1}{8} R + \frac{1}{6} R_H - \mathcal{R}^2\right) t\right] \left\langle \cosh\left[\sqrt{t}\, \mathcal{R}(\omega)\right] \det{}_H\left(\frac{\sinh[\sqrt{t}\, F(\omega)/2]}{\sqrt{t}\, F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[\sqrt{t}\, D(\omega)/2]}{\sqrt{t}\, D(\omega)/2}\right)^{-1/2} \right\rangle. \tag{4.17}$$
This equation can be used now to generate all heat kernel coefficients $a_k$ for any locally symmetric space simply by expanding it in a power series in t. By using the standard Gaussian averages${}^{4}$
$$\langle \omega^{i_1} \cdots \omega^{i_{2k+1}}\rangle = 0, \tag{4.18}$$
$$\langle \omega^{i_1} \cdots \omega^{i_{2k}}\rangle = \frac{(2k)!}{k!}\, \beta^{(i_1 i_2} \cdots \beta^{i_{2k-1} i_{2k})}, \tag{4.19}$$
one can obtain now all heat kernel coefficients in terms of traces of various contractions of the matrices $D^a{}_{ib}$ and $F^j{}_{ik}$ with the matrix $\beta^{ik}$. All these quantities are curvature invariants and can be expressed directly in terms of the Riemann tensor.
${}^{4}$ We have corrected here a misprint in Eq. (4.68) of [10].
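The normalization in (4.19) can be sanity-checked in the simplest setting. The sketch below is a numerical illustration under simplifying assumptions that are not the paper's setup: a one-dimensional holonomy algebra (p = 1), a positive metric β, and integration over the real line instead of the rotated contour. The function names are hypothetical. In that case (4.19) reduces to the statement that the average of ω^{2k} equals (2k)!/k! times β^{-k}.

```python
import math

def gaussian_average(f, beta, half_width=60.0, steps=20000):
    """Approximate <f> = sqrt(beta / (4 pi)) * integral of exp(-beta w^2 / 4) f(w) dw
    with the composite trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps + 1):
        w = -half_width + n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * math.exp(-beta * w * w / 4.0) * f(w)
    return math.sqrt(beta / (4.0 * math.pi)) * h * total

def wick_moment(k, beta):
    """Right-hand side of (4.19) for p = 1: (2k)!/k! * beta^(-k)."""
    return math.factorial(2 * k) / math.factorial(k) * beta ** (-k)

beta = 0.5  # hypothetical positive value of the 1x1 metric
for k in range(4):
    numeric = gaussian_average(lambda w, k=k: w ** (2 * k), beta)
    print(k, numeric, wick_moment(k, beta))
```

For k = 1 this reproduces the second moment 2/β, the one-dimensional form of the average of $\omega^i\omega^j$ being $2\beta^{ij}$.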
There is an alternative representation of the Gaussian average in purely algebraic terms. Let b j and bk∗ be operators, called creation and annihilation operators, acting on a Hilbert space, that satisfy the following commutation relations: [b j , bk∗ ] = δk ,
[b j , bk ] = [b∗j , bk∗ ] = 0.
j
(4.20)
Let |0 be a unit vector in the Hilbert space, called the vacuum vector, that satisfies the equations b j |0 = 0|bk∗ = 0.
0|0 = 1,
(4.21)
Then the Gaussian average is nothing but the vacuum expectation value f (ω) = 0| f (b) expb∗ , βb∗ |0,
(4.22)
where $\langle b^*,\beta b^*\rangle = \beta^{jk} b^*_j b^*_k$. This should be computed by the so-called normal ordering, that is, by simply commuting the operators $b^j$ through the operators $b^*_k$ until they hit the vacuum vector giving zero. The remaining non-zero commutation terms precisely reproduce Eqs. (4.18), (4.19).

4.2.1. Calculation of the coefficient $a_1$. As an example let us calculate the lowest heat kernel coefficients: $a_0$ and $a_1$. Let $X$ be a matrix. Then by using
\[
\det\left(\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right)^{m} = \exp\left\{ m\,\mathrm{tr}\,\log\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right\} \tag{4.23}
\]
and [21]
\[
\log\frac{\sinh\sqrt{t}X}{\sqrt{t}X} = \sum_{k=1}^{\infty}\frac{2^{2k-1}B_{2k}}{k\,(2k)!}\,t^k X^{2k}, \tag{4.24}
\]
where $B_k$ are Bernoulli numbers, in particular,
\[
B_0 = 1, \qquad B_1 = -\frac{1}{2}, \qquad B_2 = \frac{1}{6}, \tag{4.25}
\]
we obtain
\[
\det\left(\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right)^{\pm 1/2} = 1 \pm \frac{1}{12}\,t\,\mathrm{tr}\,X^2 + O(t^2). \tag{4.26}
\]
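The series (4.24) and the leading coefficient used in (4.26) can be verified with exact rational arithmetic. The sketch below assumes a scalar argument $x$ in place of $\sqrt{t}X$ (so that $t^k X^{2k}$ becomes $x^{2k}$) and generates the Bernoulli numbers from the standard recurrence with the convention $B_1 = -1/2$; the helper names are hypothetical.

```python
from fractions import Fraction
import math

def bernoulli_numbers(n_max):
    """B_0, ..., B_{n_max} (convention B_1 = -1/2) from the recurrence
    sum_{j=0}^{n} C(n+1, j) B_j = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        acc = sum(math.comb(n + 1, j) * B[j] for j in range(n))
        B.append(-acc / (n + 1))
    return B

def series_coefficient(k, B):
    """Coefficient of x^{2k} in log(sinh(x)/x), as written in (4.24)."""
    return Fraction(2 ** (2 * k - 1)) * B[2 * k] / (k * math.factorial(2 * k))

def log_sinhc_series(x, terms=12):
    """Partial sum of (4.24) for a scalar x with |x| < pi."""
    B = bernoulli_numbers(2 * terms)
    return sum(float(series_coefficient(k, B)) * x ** (2 * k)
               for k in range(1, terms + 1))

B = bernoulli_numbers(4)
print(B[0], B[1], B[2])            # the values listed in (4.25)
print(series_coefficient(1, B))    # 1/6, so exp(+-(1/2) * x^2/6) = 1 +- x^2/12 + ...
print(log_sinhc_series(0.5), math.log(math.sinh(0.5) / 0.5))
```

The coefficient $1/6$ of $x^2$, multiplied by the exponent $\pm 1/2$, gives the factor $\pm 1/12$ in (4.26).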
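Now, by using Eq. (4.17) we obtain
\[
U^{\rm diag}(t) = (4\pi t)^{-n/2}\left[a_0 + t a_1 + O(t^2)\right], \tag{4.27}
\]
where
\[
a_0 = I, \qquad a_1 = \langle b_1\rangle, \tag{4.28}
\]
and
\[
b_1 = \left(\frac{1}{8}R + \frac{1}{6}R_H + \frac{1}{48}\,\mathrm{tr}\,F(\omega)^2 - \frac{1}{48}\,\mathrm{tr}\,D(\omega)^2\right) I - \mathcal{R}^2 + \frac{1}{2}\,\mathcal{R}(\omega)^2. \tag{4.29}
\]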
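Next, by using (4.19), in particular,
\[
\langle\omega^i\omega^j\rangle = 2\beta^{ij}, \tag{4.30}
\]
we obtain
\[
\langle\mathcal{R}(\omega)^2\rangle = 2\mathcal{R}^2, \tag{4.31}
\]
\[
\langle\mathrm{tr}\,F(\omega)^2\rangle = 2\,\mathrm{tr}\,F_i F^i = 2F^{j}{}_{il}F^{li}{}_{j} = -8R_H, \tag{4.32}
\]
\[
\langle\mathrm{tr}\,D(\omega)^2\rangle = 2\,\mathrm{tr}\,D_i D^i = 2D^{a}{}_{ib}D^{bi}{}_{a} = -2R, \tag{4.33}
\]
and, therefore,
\[
a_1 = \left(\frac{1}{8}R + \frac{1}{6}R_H - \frac{1}{6}R_H + \frac{1}{24}R\right) I - \mathcal{R}^2 + \mathcal{R}^2 = \frac{1}{6}R\,I. \tag{4.34}
\]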
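The bookkeeping behind (4.34) is short enough to replay with exact fractions. The sketch below simply substitutes the averages (4.31)-(4.33) into the coefficients of (4.29); the dictionary keys are labels invented for this illustration (`"Casimir"` stands for $\mathcal{R}^2$ and `"Rw2"` for $\mathcal{R}(\omega)^2$).

```python
from fractions import Fraction as Fr

# Coefficients of the terms of b_1 in (4.29).
b1 = {"R": Fr(1, 8), "R_H": Fr(1, 6), "trF2": Fr(1, 48),
      "trD2": Fr(-1, 48), "Casimir": Fr(-1), "Rw2": Fr(1, 2)}

# Gaussian averages (4.31)-(4.33), expressed in the basis (R, R_H, Casimir).
averages = {"trF2": {"R_H": Fr(-8)},
            "trD2": {"R": Fr(-2)},
            "Rw2": {"Casimir": Fr(2)}}

def average_b1(b1, averages):
    """Replace every omega-dependent term by its average and collect coefficients."""
    result = {"R": Fr(0), "R_H": Fr(0), "Casimir": Fr(0)}
    for name, coeff in b1.items():
        if name in averages:
            for target, factor in averages[name].items():
                result[target] += coeff * factor
        else:
            result[name] += coeff
    return result

a1 = average_b1(b1, averages)
print(a1)  # only the scalar curvature survives, with coefficient 1/6
```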
This confirms the well-known result for the coefficient $a_1$ [5,24].

4.3. Heat kernel on $S^2$ and $H^2$. Let us apply our result to a special case of a two-sphere $S^2$ of radius $r$, which is a compact symmetric space equal to the quotient of the isometry group, $SO(3)$, by the isotropy group, $SO(2)$,
\[
S^2 = SO(3)/SO(2). \tag{4.35}
\]
The two-sphere is too small to incorporate an additional Abelian field $B$; therefore, we set $B = 0$. Let $y^a$ be the normal coordinates defined above. On the 2-sphere of radius $r$ they range over $-r\pi \le y^a \le r\pi$. We define the polar coordinates $\rho$ and $\varphi$ by
\[
y^1 = \rho\cos\varphi, \qquad y^2 = \rho\sin\varphi, \tag{4.36}
\]
so that $0 \le \rho \le r\pi$ and $0 \le \varphi \le 2\pi$. The orthonormal frame of 1-forms is
\[
e^1 = d\rho, \qquad e^2 = r\sin\left(\frac{\rho}{r}\right) d\varphi, \tag{4.37}
\]
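which gives the spin connection 1-form
\[
\omega_{ab} = -\varepsilon_{ab}\cos\left(\frac{\rho}{r}\right) d\varphi, \tag{4.38}
\]
with $\varepsilon_{ab}$ being the antisymmetric Levi-Civita tensor, and the curvature
\[
R_{abcd} = \frac{1}{r^2}\,\varepsilon_{ab}\varepsilon_{cd} = \frac{1}{r^2}\left(\delta_{ac}\delta_{bd} - \delta_{ad}\delta_{bc}\right), \tag{4.39}
\]
\[
R_{ab} = \frac{1}{r^2}\,\delta_{ab}, \qquad R = \frac{2}{r^2}. \tag{4.40}
\]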
Since the holonomy group $SO(2)$ is one-dimensional, it is obviously Abelian, so all structure constants $F^i{}_{jk}$ are equal to zero, and therefore, the curvature of the holonomy group vanishes, $R_H = 0$. The metric of the holonomy group $\beta_{ij}$ is now just a constant, $\beta = 1/r^2$. The only generator of the holonomy group in the vector representation is
\[
D_{ab} = -\frac{1}{r^2}\,E_{ab} = -\frac{1}{r^2}\,\varepsilon_{ab}. \tag{4.41}
\]
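The irreducible representations of $SO(2)$ are parametrized by $\alpha$, which is either an integer, $\alpha = m$, or a half-integer, $\alpha = m + \frac{1}{2}$. Therefore, the generator $\mathcal{R}$ of the holonomy group and the Casimir operator $\mathcal{R}^2$ are
\[
\mathcal{R} = i\,\frac{\alpha}{r^2}, \tag{4.42}
\]
\[
\mathcal{R}^2 = \beta^{ij}\mathcal{R}_i\mathcal{R}_j = -\frac{\alpha^2}{r^2}. \tag{4.43}
\]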
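The extra factor $r^2$ here is due to the inverse metric $\beta^{-1} = r^2$ of the holonomy group. The Lie derivatives $\mathcal{L}_A$ are given by
\[
\mathcal{L}_1 = \cos\varphi\,\partial_\rho - \frac{\sin\varphi}{r}\cot\left(\frac{\rho}{r}\right)\partial_\varphi + i\,\frac{\sin\varphi}{r\sin(\rho/r)}\,\alpha, \tag{4.44}
\]
\[
\mathcal{L}_2 = \sin\varphi\,\partial_\rho + \frac{\cos\varphi}{r}\cot\left(\frac{\rho}{r}\right)\partial_\varphi - i\,\frac{\cos\varphi}{r\sin(\rho/r)}\,\alpha, \tag{4.45}
\]
\[
\mathcal{L}_3 = \frac{1}{r^2}\,\partial_\varphi, \tag{4.46}
\]
and form a representation of the $SO(3)$ algebra
\[
[\mathcal{L}_1,\mathcal{L}_2] = -\mathcal{L}_3, \qquad [\mathcal{L}_3,\mathcal{L}_1] = -\frac{1}{r^2}\,\mathcal{L}_2, \qquad [\mathcal{L}_3,\mathcal{L}_2] = \frac{1}{r^2}\,\mathcal{L}_1. \tag{4.47}
\]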
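The Laplacian is given by
\[
\Delta = \partial_\rho^2 + \frac{1}{r}\cot\left(\frac{\rho}{r}\right)\partial_\rho + \frac{1}{r^2\sin^2(\rho/r)}\left(\partial_\varphi - i\alpha\cos\frac{\rho}{r}\right)^2. \tag{4.48}
\]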
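The contour of integration over $\omega$ in (4.9) should be the real axis rotated counterclockwise by $\pi/4$. Since $S^2$ is compact, we rotate it further to the imaginary axis, compute the determinant
\[
\det{}_{TM}\left(\frac{\sinh[\omega D]}{\omega D}\right)^{-1/2} = \frac{\omega/(2r^2)}{\sin[\omega/(2r^2)]}, \tag{4.49}
\]
and rescale $\omega$ for $t < 0$ by $\omega \to r\sqrt{-t}\,\omega$ to obtain an analytic function of $t$,
\[
U^{\rm diag}(t) = \frac{1}{4\pi t}\exp\left[\left(\frac{1}{4}+\alpha^2\right)\frac{t}{r^2}\right]
\int_{-\infty}^{\infty}\frac{d\omega}{\sqrt{4\pi}}\exp\left(-\frac{\omega^2}{4}\right)
\frac{\omega\sqrt{-t}/(2r)}{\sinh\left[\omega\sqrt{-t}/(2r)\right]}\cosh\left[\alpha\omega\sqrt{-t}/r\right]. \tag{4.50}
\]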
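As a consistency check, the small-$t$ behaviour of (4.50) should reproduce $a_1 = R/6 = 1/(3r^2)$ from (4.34) and (4.40), independently of $\alpha$. The following sketch evaluates $4\pi t\,U^{\rm diag}(t)$ from (4.50) by Simpson quadrature for a small negative $t$ and compares it with $1 + t/(3r^2)$; the function name, the cutoff and the step count are assumptions made for this illustration only.

```python
import math

def s2_normalized_kernel(t, r=1.0, alpha=0.0, half_width=40.0, steps=20_000):
    """Return 4*pi*t*U_diag(t) as written in (4.50), which is stated for t < 0."""
    if t >= 0:
        raise ValueError("(4.50) is written for t < 0")
    s = math.sqrt(-t)
    h = 2.0 * half_width / steps          # composite Simpson rule, steps even
    total = 0.0
    for i in range(steps + 1):
        w = -half_width + i * h
        x = w * s / (2.0 * r)
        ratio = 1.0 if x == 0.0 else x / math.sinh(x)
        simpson = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += (simpson * math.exp(-w * w / 4.0) * ratio
                  * math.cosh(alpha * w * s / r))
    integral = total * h / 3.0 / math.sqrt(4.0 * math.pi)
    return math.exp((0.25 + alpha ** 2) * t / r ** 2) * integral

t = -0.01
for alpha in (0.0, 0.5, 1.0):
    print(alpha, s2_normalized_kernel(t, alpha=alpha), 1.0 + t / 3.0)
```

The two printed columns agree up to terms of order $t^2$, and the $\alpha$-dependence cancels at first order in $t$, as (4.34) requires.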
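If we had rotated the contour to the real axis instead, then we would have obtained, after rescaling $\omega \to r\sqrt{t}\,\omega$ for $t > 0$,
\[
U^{\rm diag}(t) = \frac{1}{4\pi t}\exp\left[\left(\frac{1}{4}+\alpha^2\right)\frac{t}{r^2}\right]
\fint_{-\infty}^{\infty}\frac{d\omega}{\sqrt{4\pi}}\exp\left(-\frac{\omega^2}{4}\right)
\frac{\omega\sqrt{t}/(2r)}{\sin\left[\omega\sqrt{t}/(2r)\right]}\cos\left[\alpha\omega\sqrt{t}/r\right], \tag{4.51}
\]
where $\fint$ denotes the Cauchy principal value of the integral. This can also be written as
\[
\begin{aligned}
U^{\rm diag}(t) = {}& \frac{1}{4\pi t}\exp\left[\left(\frac{1}{4}+\alpha^2\right)\frac{t}{r^2}\right]
\sum_{k=-\infty}^{\infty}(-1)^k\int_0^{2\pi r/\sqrt{t}}\frac{d\omega}{\sqrt{4\pi}}
\exp\left[-\frac{1}{4}\left(\omega+\frac{2\pi r}{\sqrt{t}}\,k\right)^2\right] \\
&\times\frac{\sqrt{t}}{2r}\,\frac{\omega+\frac{2\pi r}{\sqrt{t}}\,k}{\sin\left[\omega\sqrt{t}/(2r)\right]}
\cos\left[\alpha\omega\sqrt{t}/r\right].
\end{aligned} \tag{4.52}
\]
This is nothing but the sum over the closed geodesics of $S^2$. Note that the factor $\cos\left[\alpha\omega\sqrt{t}/r\right]$ is either periodic (for integer $\alpha$) or anti-periodic (for half-integer $\alpha$).

The non-compact symmetric space dual to the 2-sphere is the hyperbolic plane $H^2$ of pseudo-radius $a$. It is equal to the quotient of the isometry group, $SO(1,2)$, by the isotropy group, $SO(2)$,
\[
H^2 = SO(1,2)/SO(2). \tag{4.53}
\]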
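Let $y^a$ be the normal coordinates defined above. On $H^2$ they range over $-\infty \le y^a \le \infty$. We define the polar coordinates $u$ and $\varphi$ by
\[
y^1 = u\cos\varphi, \qquad y^2 = u\sin\varphi, \tag{4.54}
\]
so that $0 \le u \le \infty$ and $0 \le \varphi \le 2\pi$. The orthonormal frame of 1-forms is
\[
e^1 = du, \qquad e^2 = a\sinh\left(\frac{u}{a}\right) d\varphi, \tag{4.55}
\]
which gives the spin connection 1-form
\[
\omega_{ab} = -\varepsilon_{ab}\cosh\left(\frac{u}{a}\right) d\varphi, \tag{4.56}
\]
and the curvature
\[
R_{abcd} = -\frac{1}{a^2}\,\varepsilon_{ab}\varepsilon_{cd} = -\frac{1}{a^2}\left(\delta_{ac}\delta_{bd} - \delta_{ad}\delta_{bc}\right), \tag{4.57}
\]
\[
R_{ab} = -\frac{1}{a^2}\,\delta_{ab}, \qquad R = -\frac{2}{a^2}. \tag{4.58}
\]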
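The metric of the isotropy group $\beta_{ij}$ is just a constant, $\beta = -1/a^2$, and the only generator of the isotropy group in the vector representation is given by
\[
D_{ab} = \frac{1}{a^2}\,E_{ab} = \frac{1}{a^2}\,\varepsilon_{ab}. \tag{4.59}
\]
The Lie derivatives $\mathcal{L}_A$ are now
\[
\mathcal{L}_1 = \cos\varphi\,\partial_u - \frac{\sin\varphi}{a}\coth\left(\frac{u}{a}\right)\partial_\varphi + i\,\frac{\sin\varphi}{a\sinh(u/a)}\,\alpha, \tag{4.60}
\]
\[
\mathcal{L}_2 = \sin\varphi\,\partial_u + \frac{\cos\varphi}{a}\coth\left(\frac{u}{a}\right)\partial_\varphi - i\,\frac{\cos\varphi}{a\sinh(u/a)}\,\alpha, \tag{4.61}
\]
\[
\mathcal{L}_3 = -\frac{1}{a^2}\,\partial_\varphi, \tag{4.62}
\]
and form a representation of the $SO(1,2)$ algebra
\[
[\mathcal{L}_1,\mathcal{L}_2] = -\mathcal{L}_3, \qquad [\mathcal{L}_3,\mathcal{L}_1] = \frac{1}{a^2}\,\mathcal{L}_2, \qquad [\mathcal{L}_3,\mathcal{L}_2] = -\frac{1}{a^2}\,\mathcal{L}_1. \tag{4.63}
\]
The Laplacian is given by
\[
\Delta = \partial_u^2 + \frac{1}{a}\coth\left(\frac{u}{a}\right)\partial_u + \frac{1}{a^2\sinh^2(u/a)}\left(\partial_\varphi - i\alpha\cosh\frac{u}{a}\right)^2. \tag{4.64}
\]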
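The contour of integration over $\omega$ in (4.9) for the heat kernel should be the real axis rotated counterclockwise by $\pi/4$. Since $H^2$ is non-compact, we rotate it back to the real axis and rescale $\omega$ for $t > 0$ by $\omega \to a\sqrt{t}\,\omega$ to obtain the heat kernel diagonal for the Laplacian on $H^2$,
\[
U^{\rm diag}(t) = \frac{1}{4\pi t}\exp\left[-\left(\frac{1}{4}+\alpha^2\right)\frac{t}{a^2}\right]
\int_{-\infty}^{\infty}\frac{d\omega}{\sqrt{4\pi}}\exp\left(-\frac{\omega^2}{4}\right)
\frac{\omega\sqrt{t}/(2a)}{\sinh\left[\omega\sqrt{t}/(2a)\right]}\cosh\left[\alpha\omega\sqrt{t}/a\right]. \tag{4.65}
\]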
We see that the heat kernel in the compact case of the two-sphere, $S^2$, is related to the heat kernel in the non-compact case of the hyperboloid, $H^2$, by the analytical continuation, $a^2 \to -r^2$, or $a \to ir$, or, alternatively, by replacing $t \to -t$ (and $a = r$). One can go even further and compute the Plancherel (or Harish-Chandra) measure $\mu(\nu)$ in the case of $H^2$ and the spectrum in the case of $S^2$.

For $H^2$ we rescale the integration variable in (4.65) by $\omega \to \omega a/\sqrt{t}$, substitute
\[
\frac{a}{\sqrt{4\pi t}}\exp\left(-\frac{a^2}{4t}\,\omega^2\right) = \int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\exp\left(-\frac{t}{a^2}\,\nu^2 + i\omega\nu\right), \tag{4.66}
\]
integrate by parts over $\nu$, and use
\[
\int_{-\infty}^{\infty}\frac{d\omega}{2\pi i}\,e^{i\omega\nu}\,\frac{\cosh(\alpha\omega)}{\sinh(\omega/2)} = \frac{1}{2}\left\{\tanh[\pi(\nu+i\alpha)] + \tanh[\pi(\nu-i\alpha)]\right\} \tag{4.67}
\]
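(and the fact that $\alpha$ is a half-integer) to represent the heat kernel for $H^2$ in the form
\[
U^{\rm diag}(t) = \frac{1}{4\pi a^2}\int_{-\infty}^{\infty} d\nu\,\mu(\nu)\exp\left\{-\left(\frac{1}{4}+\alpha^2+\nu^2\right)\frac{t}{a^2}\right\}, \tag{4.68}
\]
where
\[
\mu(\nu) = \nu\tanh(\pi\nu) \qquad \text{for integer } \alpha = m, \tag{4.69}
\]
and
\[
\mu(\nu) = \nu\coth(\pi\nu) \qquad \text{for half-integer } \alpha = m + \frac{1}{2}. \tag{4.70}
\]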
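The Fourier-type identity (4.67) can be tested numerically in the range where the integral converges absolutely. The sketch below assumes $|\alpha| < 1/2$ (for other values the paper's formula is understood by analytic continuation, which this code does not attempt) and reads the integral as a principal value, so that only the even, regular sine part contributes; function names and quadrature parameters are illustrative.

```python
import cmath
import math

def lhs_4_67(nu, alpha, cutoff=120.0, steps=24_000):
    """(1/(2*pi*i)) * PV-integral of e^{i w nu} cosh(alpha w)/sinh(w/2), |alpha| < 1/2.
    The cosine part is odd and cancels in the principal value; the sine part
    is even, so we integrate it over [0, cutoff] and double the result."""
    def integrand(w):
        if w == 0.0:
            return 2.0 * nu                      # limit of sin(w nu)/sinh(w/2)
        return math.sin(w * nu) * math.cosh(alpha * w) / math.sinh(w / 2.0)
    h = cutoff / steps                           # composite Simpson rule
    total = 0.0
    for i in range(steps + 1):
        simpson = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += simpson * integrand(i * h)
    return 2.0 * (total * h / 3.0) / (2.0 * math.pi)

def rhs_4_67(nu, alpha):
    """Right-hand side of (4.67); the two tanh terms are complex conjugates."""
    value = 0.5 * (cmath.tanh(math.pi * complex(nu, alpha))
                   + cmath.tanh(math.pi * complex(nu, -alpha)))
    return value.real

for alpha in (0.0, 0.25, -0.1):
    print(alpha, lhs_4_67(0.7, alpha), rhs_4_67(0.7, alpha))
```

For $\alpha = 0$ both sides reduce to $\tanh(\pi\nu)$, which is the factor appearing in the measure for integer $\alpha$.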
For $S^2$ we proceed as follows. We cannot just substitute $a^2 \to -r^2$ in (4.68). Instead, first, we deform the contour of integration in (4.68) to the V-shaped contour that consists of two segments of straight lines, one going from $e^{i3\pi/4}\infty$ to $0$, and another going from $0$ to $e^{i\pi/4}\infty$. Then, after we replace $a^2 \to -r^2$, we can deform the contour further to go counterclockwise around the positive imaginary axis. Then we notice that the function $\mu(\nu)$ is a meromorphic function with simple poles on the imaginary axis at $\nu_k = i d_k$, where
\[
d_k = k + \frac{1}{2}, \qquad k = 0, \pm 1, \pm 2, \dots, \qquad \text{for integer } \alpha = m, \tag{4.71}
\]
and
\[
d_k = k, \qquad k = \pm 1, \pm 2, \dots, \qquad \text{for half-integer } \alpha = m + \frac{1}{2}. \tag{4.72}
\]
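Therefore, we can compute the integral by residue theory to get
\[
U^{\rm diag}(t) = \frac{1}{4\pi r^2}\sum_{k=0}^{\infty} d_k\exp(-\lambda_k t), \tag{4.73}
\]
where
\[
\lambda_k = \frac{1}{r^2}\left[\left(k+\frac{1}{2}\right)^2 - \frac{1}{4} - m^2\right] \qquad \text{for integer } \alpha = m, \tag{4.74}
\]
and
\[
\lambda_k = \frac{1}{r^2}\left[k^2 - \frac{1}{4} - \left(m+\frac{1}{2}\right)^2\right] \qquad \text{for half-integer } \alpha = m + \frac{1}{2}. \tag{4.75}
\]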
Our results for the heat kernel on the 2-sphere S 2 and the hyperbolic plane H 2 coincide with the exact heat kernel of scalar Laplacian (when R = α = 0) reported in [18] and obtained by completely different methods.
4.4. Index theorem. We can now apply this result for the calculation of the index of the Dirac operator on spinor bundle on compact manifolds,
\[
D = \gamma^\mu\nabla_\mu. \tag{4.76}
\]
Let the dimension $n$ of the manifold be even and
\[
\Gamma = \frac{1}{n!}\, i^{n(n-1)/2}\,\varepsilon^{a_1\dots a_n}\gamma_{[a_1}\cdots\gamma_{a_n]} \tag{4.77}
\]
be the chirality operator of the spinor representation so that
\[
\Gamma^2 = I_S \qquad \text{and} \qquad \Gamma\gamma_a = -\gamma_a\Gamma. \tag{4.78}
\]
Then the index of the Dirac operator is equal to
\[
\mathrm{Ind}\,(D) = \mathrm{Tr}_{L^2}\,\Gamma\exp(tD^2). \tag{4.79}
\]
In this case the generators $\mathcal{R}_i$ have the form
\[
\mathcal{R}_i = -\frac{1}{4}\,D^a{}_{ib}\,\gamma^b{}_a\otimes I_T + I_S\otimes T_i, \tag{4.80}
\]
and the Casimir operator of the holonomy group in the spinor representation is obtained by using (2.13),
\[
\mathcal{R}^2 = \frac{1}{8}\,R\,I_S + I_S\otimes T^2 - \frac{1}{2}\,E^j{}_{ab}\,\gamma^{ab}\otimes T_j. \tag{4.81}
\]
Now, by using Eqs. (2.11), (2.17), (2.13) and (2.141) we compute the square of the Dirac operator
\[
D^2 = \Delta - \frac{1}{4}\,R\,I_S - \frac{1}{2}\,E^i{}_{ab}\,T_i\gamma^{ab} + \frac{1}{2}\,\gamma^{ab}B_{ab}, \tag{4.82}
\]
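and, finally, the index
\[
\begin{aligned}
\mathrm{Ind}\,(D) = {}& \int_M d\mathrm{vol}\,(4\pi t)^{-n/2}\,\mathrm{tr}_V\,\Gamma\,\det{}_{TM}\left(\frac{\sinh(tB)}{tB}\right)^{-1/2} \\
&\times\exp\left\{\left(-\frac{1}{4}R + \frac{1}{6}R_H - T^2 + \frac{1}{2}B_{ab}\gamma^{ab}\right)t\right\} \\
&\times\int_{\mathbb{R}^p_{\rm reg}}\frac{d\omega}{(4\pi t)^{p/2}}\,|\beta|^{1/2}\exp\left\{-\frac{1}{4t}\langle\omega,\beta\omega\rangle\right\}
\cosh\left(-\frac{1}{4}\,\omega^i D^a{}_{ib}\gamma^b{}_a + \omega^i T_i\right) \\
&\times\det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right)^{1/2}\det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1/2}.
\end{aligned} \tag{4.83}
\]
Since the index does not depend on $t$, the right-hand side of this equation does not depend on $t$. By expanding it in an asymptotic power series in $t$, we see that the index is equal to
\[
\mathrm{Ind}\,(D) = (4\pi)^{-n/2}\int_M d\mathrm{vol}\;\mathrm{tr}_V\,\Gamma\,a_{n/2}. \tag{4.84}
\]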
5. Conclusion

We have continued the study of the heat kernel on homogeneous spaces initiated in [6–10]. In those papers we have developed a systematic technique for calculation of the heat kernel in two cases: a) a Laplacian on a vector bundle with a parallel curvature over a flat space [6,9], and b) a scalar Laplacian on manifolds with parallel curvature [8,10]. What was missing in that study was the case of a non-scalar Laplacian on vector bundles with parallel curvature over curved manifolds with parallel curvature.

In the present paper we considered the Laplacian on a homogeneous bundle and generalized the technique developed in [10] to compute the corresponding heat semigroup and the heat kernel. It is worth pointing out that our formal result applies to general symmetric spaces by making use of the regularization and the analytical continuation procedure described above. Of course, the heat kernel coefficients are just polynomials in the curvature and do not depend on this kind of analytical continuation (for more detail, see [10]).

As we mentioned above, due to the existence of multiple closed geodesics the obtained form of the heat kernel for compact symmetric spaces requires an additional regularization, which consists simply in an analytical continuation of the result from the complexified noncompact case. In any case, it gives a generating function for all heat invariants and reproduces correctly the whole asymptotic expansion of the heat kernel diagonal. However, since there are no closed geodesics on non-compact symmetric spaces, it seems that the analytical continuation of the obtained result for the heat kernel diagonal should give the exact result for the non-compact case, and, even more generally, for the general case too. We have seen in the example of the two-sphere that our method gives not just the asymptotic expansion of the heat kernel diagonal but, after an appropriate regularization, in fact, an exact result for the heat kernel diagonal.
References

1. Anderson, A., Camporesi, R.: Intertwining operators for solving differential equations with applications to symmetric spaces. Commun. Math. Phys. 130, 61–82 (1990)
2. Avramidi, I.G.: Covariant methods for the calculation of the effective action in quantum field theory and investigation of higher-derivative quantum gravity. PhD Thesis, Moscow State University (1987), http://arXiv.org/abs/hep-th/9510140, 1995
3. Avramidi, I.G.: Background field calculations in quantum field theory (vacuum polarization). Teor. Mat. Fiz. 79, 219–231 (1989)
4. Avramidi, I.G.: The covariant technique for calculation of the heat kernel asymptotic expansion. Phys. Lett. B 238, 92–97 (1990)
5. Avramidi, I.G.: A covariant technique for the calculation of the one-loop effective action. Nucl. Phys. B 355, 712–754 (1991); Erratum: Nucl. Phys. B 509, 557–558 (1998)
6. Avramidi, I.G.: A new algebraic approach for calculating the heat kernel in gauge theories. Phys. Lett. B 305, 27–34 (1993)
7. Avramidi, I.G.: Covariant methods for calculating the low-energy effective action in quantum field theory and quantum gravity. University of Greifswald (March, 1994), http://arXiv.org/abs/gr-qc/9403036, 1994
8. Avramidi, I.G.: The heat kernel on symmetric spaces via integrating over the group of isometries. Phys. Lett. B 336, 171–177 (1994)
9. Avramidi, I.G.: Covariant algebraic method for calculation of the low-energy heat kernel. J. Math. Phys. 36, 5055–5070 (1995); Erratum: J. Math. Phys. 39, 1720 (1998)
10. Avramidi, I.G.: A new algebraic approach for calculating the heat kernel in quantum gravity. J. Math. Phys. 37, 374–394 (1996)
11. Avramidi, I.G.: Covariant approximation schemes for calculation of the heat kernel in quantum field theory. In: Quantum Gravity, Eds. V.A. Berezin, V.A. Rubakov, D.V. Semikoz, Singapore: World Scientific, 1998, pp. 61–78
12. Avramidi, I.G.: Covariant techniques for computation of the heat kernel. Rev. Math. Phys. 11, 947–980 (1999)
13. Avramidi, I.G.: Heat Kernel and Quantum Gravity. Lecture Notes in Physics, Series Monographs, LNP:64, Berlin: Springer-Verlag, 2000
14. Avramidi, I.G.: Heat kernel approach in quantum field theory. Nucl. Phys. Proc. Suppl. 104, 3–32 (2002)
15. Avramidi, I.G.: Heat kernel asymptotics on symmetric spaces. Comm. Math. Anal. Conf. 1, 1–10 (2008)
16. Barut, A.O., Raszka, R.: Theory of Group Representations and Applications. Warszawa: PWN, 1977
17. Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. Berlin: Springer-Verlag, 1992
18. Camporesi, R.: Harmonic analysis and propagators on homogeneous spaces. Phys. Rep. 196, 1–134 (1990)
19. Dowker, J.S.: When is the “sum over classical paths” exact? J. Phys. A 3, 451–461 (1970)
20. Dowker, J.S.: Quantum mechanics on group space and Huygen’s principle. Ann. Phys. (USA) 62, 361–382 (1971)
21. Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher Transcendental Functions, vol. I. New York: McGraw-Hill, 1953
22. Fegan, H.D.: The fundamental solution of the heat equation on a compact Lie group. J. Diff. Geom. 18, 659–668 (1983)
23. Gilkey, P.B.: The spectral geometry of Riemannian manifold. J. Diff. Geom. 10, 601–618 (1975)
24. Gilkey, P.B.: Invariance Theory, the Heat Equation and the Atiyah-Singer Index Theorem. Boca Raton FL: CRC Press, 1995
25. Helgason, S.: Groups and Geometric Analysis: Integral Geometry, Invariant Differential Operators, and Spherical Functions. Mathematical Surveys and Monographs, Vol. 83, Providence, RI: Amer. Math. Soc., 2002, p. 270
26. Hurt, N.E.: Geometric Quantization in Action: Applications of Harmonic Analysis in Quantum Statistical Mechanics and Quantum Field Theory. Dordrecht: D. Reidel Publishing, Holland, 1983
27. Kirsten, K.: Spectral Functions in Mathematics and Physics. Boca Raton FL: CRC Press, 2001
28. Ruse, H., Walker, A.G., Willmore, T.J.: Harmonic Spaces. Roma: Edizioni Cremonese, 1961
29. Takeuchi, M.: Lie Groups II. In: Translations of Mathematical Monographs. Vol. 85, Providence, RI: Amer. Math. Soc., 1991, p. 167
30. Van de Ven, A.E.M.: Index free heat kernel coefficients. Class. Quant. Grav. 15, 2311–2344 (1998)
31. Vassilevich, D.V.: Heat kernel expansion: user’s manual. Phys. Rep. 388, 279–360 (2003)
32. Wolf, J.A.: Spaces of Constant Curvature. University of California, Berkeley, 1972
33. Yajima, S., Higasida, Y., Kawano, K., Kubota, S.-I., Kamo, Y., Tokuo, S.: Higher coefficients in asymptotic expansion of the heat kernel. Phys. Rep. Kumamoto Univ. 12(1), 39–62 (2004)

Communicated by A. Connes
Commun. Math. Phys. 288, 1007–1021 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0696-x
Communications in
Mathematical Physics
Reflectionless Herglotz Functions and Jacobi Matrices Alexei Poltoratski1, , Christian Remling2 1 Mathematics Department, Texas A&M University, College Station, TX 77843, USA.
E-mail:
[email protected]
2 Mathematics Department, University of Oklahoma, Norman, OK 73019, USA.
E-mail:
[email protected] Received: 28 May 2008 / Accepted: 18 August 2008 Published online: 16 December 2008 – © Springer-Verlag 2008
Abstract: We study several related aspects of reflectionless Jacobi matrices. First, we discuss the singular part of the corresponding spectral measures. We then show how to identify sets on which measures are reflectionless by looking at the logarithmic potentials of these measures.

1. Introduction

We study several aspects of reflectionless Jacobi matrices and Herglotz functions in this paper. This is part of a larger program; the (perhaps too ambitious) goal is to reach a systematic understanding of the absolutely continuous spectrum of Jacobi operators $J$ on $\ell^2(\mathbb{Z}_+)$,
\[
(Ju)(n) = a(n)u(n+1) + a(n-1)u(n-1) + b(n)u(n).
\]
We will always assume that the coefficients $a$, $b$ satisfy bounds of the form
\[
(C+1)^{-1} \le a(n) \le C+1, \qquad |b(n)| \le C,
\]
for some $C > 0$. Note that if $J$ has some absolutely continuous spectrum, then, by the decoupling argument of Dombrowski and Simon-Spencer [3,17], it actually suffices to assume that $a(n)$ is bounded above; the other two inequalities follow automatically.

Let us recall some definitions. A Herglotz function is a holomorphic mapping of $\mathbb{C}^+ = \{z \in \mathbb{C} : \mathrm{Im}\, z > 0\}$ to itself. We denote the set of Herglotz functions by $\mathcal{H}$. If $F \in \mathcal{H}$, then $F(t) \equiv \lim_{y\to 0+} F(t+iy)$ exists for (Lebesgue) almost every $t \in \mathbb{R}$. We call $F$ reflectionless (on $E \subset \mathbb{R}$) if
\[
\mathrm{Re}\, F(t) = 0 \quad \text{for almost every } t \in E. \tag{1.1}
\]

A. P.’s work is supported in part by NSF grant DMS 0800300.
We will also use the notation
\[
\mathcal{N}(E) = \{F \in \mathcal{H} : F \text{ reflectionless on } E\}.
\]
Herglotz functions have unique representations of the form
\[
F(z) = F_\mu(z) = a + bz + \int_{-\infty}^{\infty}\left(\frac{1}{t-z} - \frac{t}{t^2+1}\right) d\mu(t), \tag{1.2}
\]
with $a \in \mathbb{R}$, $b \ge 0$, and a (positive) Borel measure $\mu$ on $\mathbb{R}$, $\int_{\mathbb{R}} \frac{d\mu(t)}{t^2+1} < \infty$. We will call such a measure $\mu$ reflectionless (on $E$) if $F_\mu \in \mathcal{N}(E)$ for some choice of $a \in \mathbb{R}$, $b \ge 0$; for easier reference, it will also be convenient to introduce the notation
\[
\mathcal{R}(E) = \{\mu : \mu \text{ reflectionless on } E\}.
\]
We emphasize again that in particular $\int_{\mathbb{R}} \frac{d\mu(t)}{t^2+1} < \infty$ for all $\mu \in \mathcal{R}(E)$. Also, if $\mu \in \mathcal{R}(E)$, then $F_\mu$ will refer to the unique Herglotz function $F_\mu \in \mathcal{N}(E)$ that is associated with $\mu$ as in (1.2).

There are several reasons for being interested in the class $\mathcal{N}(E)$; here, our main motivation is provided by the following fact: Call a (whole line) Jacobi matrix $J$ reflectionless (on $E$) if $g_n \in \mathcal{N}(E)$ for all $n \in \mathbb{Z}$, where $g_n(z) = \langle\delta_n, (J-z)^{-1}\delta_n\rangle$ is the $n$th diagonal element of the resolvent of $J$ (also known as the Green function). Then [13, Theorem 1.4] says that all $\omega$ limit points of a Jacobi matrix $J$ with some absolutely continuous spectrum are reflectionless on $E = \Sigma_{ac}$; here, $\Sigma_{ac}$ denotes an essential support of the absolutely continuous part of the spectral measure $\rho$ of $J$. This is defined up to sets of Lebesgue measure zero; we can obtain a representative as $\Sigma_{ac} = \{t : d\rho/dt > 0\}$. Please see [13] for the details.

If $\mu \in \mathcal{R}(E)$, then $\chi_E\, dt \ll d\mu_{ac}$. Indeed, this follows immediately from (1.1) because $d\mu_{ac}(t) = (1/\pi)\,\mathrm{Im}\, F(t)\, dt$ and the boundary value of a Herglotz function cannot be zero on a set of positive measure. However, it is not so clear in general if $\mu$ can also have a singular part on $E$. See [4–6] for earlier work on this question. We have the following criterion. We say that a (positive) measure $\nu$ is supported by a (measurable) set $S$ if $\nu(S^c) = 0$; supports are not assumed to be closed in this paper.

Theorem 1.1. Let $\mu \in \mathcal{R}(E)$. Then:
(a) $\mu_s$, the singular part of $\mu$, is supported by
\[
\left\{ x \in \mathbb{R} : \lim_{h\to 0+}\frac{|E\cap(x-h,x+h)|}{2h} = 0 \right\}.
\]
(b) Let $\theta \in L^\infty(E)$ be an arbitrary bounded measurable function. Then $\mu_s$ is also supported by
\[
\{x \in \mathbb{R} : (H_E\theta)(x) \text{ exists}\}.
\]
Here, we define $H_E f$ as
\[
(H_E f)(x) = \lim_{y\to 0+}\int_E\left(\frac{t-x}{(t-x)^2+y^2} - \frac{t}{t^2+1}\right) f(t)\, dt,
\]
if the limit exists. This is closely related to the Hilbert transform
\[
(Hf)(x) = \lim_{y\to 0+}\int_{|t-x|>y}\frac{f(t)}{t-x}\, dt.
\]
For instance, if $f \in L^1(\mathbb{R})$, then both $H_E f$ and $H\chi_E f$ exist (Lebesgue) almost everywhere and define the same function (almost everywhere), up to an additive constant. Here, we are interested in the singular part of $\mu$, so sets of Lebesgue measure zero do matter, and we distinguish between the two transforms. However, for a bounded, integrable function, the difference between the integrals defining $H_E$ and $H$ stays bounded, so we also obtain the following variant of Theorem 1.1(b):
(c) Let $\theta \in L^\infty(E)$. Then $\mu_s$ is supported by
θ (t) χ E (t) dt < ∞ . x ∈ R : sup 0
x+1 x−1
χ E (t) dt = ∞ |t − x|
Our final result on the singular part of reflectionless measures is of a conditional nature. It says that if $\mu$ is also non-zero outside $E$, then this will only make it more difficult to produce a singular part on $E$. To be able to formulate this concisely, we introduce
\[
\mathcal{R}_0(E) = \left\{\mu \in \mathcal{R}(E) : \mu(E^c) = 0\right\}.
\]
Theorem 1.3. Let $E \subset \mathbb{R}$ be a closed set. Suppose that $\mu_s(E) = 0$ for $\mu \in \mathcal{R}_0(E)$. Then $\nu_s(E) = 0$ for all $\nu \in \mathcal{R}(E)$.
As our next topic, we would like to address the following question: Given a set $E$, how can we produce examples of measures that are reflectionless on $E$? Two quick answers are immediately available: As already mentioned above, [13, Theorem 1.4] says that if we start out with a Jacobi matrix with $\Sigma_{ac} \supset E$ and then take $\omega$ limit points, then we can be sure that these will be reflectionless on $E$. A different answer to our question was obtained in [9, Theorem 5.4] (see also [10]): the potential theoretic equilibrium measure is reflectionless on its support. These two results are not totally unrelated. The equilibrium measure frequently arises as the density of states measure of Jacobi matrices with some absolutely continuous spectrum. See, for example, [15]; the book [19] has a systematic exposition of similar methods in a slightly different setting.

We will prove a general result here that puts these facts in perspective. Let $\nu$ be a finite Borel measure on $\mathbb{R}$ of compact support. We introduce the following two functions:
\[
\gamma(z) = \int_{\mathbb{R}}\ln|t-z|\, d\nu(t) \quad (z \in \mathbb{C}), \qquad
g(z) = \int_{\mathbb{R}}\frac{d\nu(t)}{t-z} \quad (z \in \mathbb{C}^+).
\]
The integral defining the logarithmic potential $\gamma(z)$ could diverge to $-\infty$; however, for simplicity, we assume here that $\nu$ is such that $\gamma(z) > -\infty$ for all $z \in \mathbb{C}$. This assumption could be relaxed, but it holds in all cases of interest, so we do not bother. We will need the following definition.

Definition 1.1. Let $f : \mathbb{R} \to \mathbb{R}$ be a (Lebesgue) measurable function. We say that $f$ is approximately differentiable at $x \in \mathbb{R}$ if there exists $d \in \mathbb{R}$ so that
\[
\lim_{h\to 0+}\frac{1}{2h}\left|\left\{ y \in (-h,h) : \left|\frac{f(x+y)-f(x)}{y} - d\right| \ge \epsilon \right\}\right| = 0
\]
for all $\epsilon > 0$. In this case, we call $d$ the approximative derivative of $f$ at $x$, and we write $(D_{ap}f)(x) = d$. Please see [1,14,20] for much more on this and related topics.

Theorem 1.4. For almost all $x \in \mathbb{R}$, we have that
\[
(D_{ap}\gamma)(x) = -\mathrm{Re}\, g(x). \tag{1.3}
\]
In particular, $\gamma$ is approximately differentiable almost everywhere.

This is a development of [9, Theorem 5.4]. See also [10] for subsequent work inspired by the same result. Theorem 1.4 may be viewed as a result on interchanging limits. Indeed, $\mathrm{Re}\, g(x+iy) = -\partial_x\gamma(x+iy)$ for $y > 0$, so, for almost every $x \in \mathbb{R}$, $\mathrm{Re}\, g(x) = -\lim_{y\to 0+}\partial_x\gamma(x+iy)$. This raises the question of whether it is possible to perform these operations in the opposite order; in other words, can we first take the boundary value of $\gamma$ to obtain $\gamma(x)$ and then take the derivative? Theorem 1.4 provides an affirmative answer if the derivative is taken in the approximate sense. However, for us here, Theorem 1.4 is significant mainly because it identifies sets on which $g$ is reflectionless; in particular, this set will contain the points of constancy of $\gamma$. More precisely, we obtain the following:
Corollary 1.5. Let
\[
K = \left\{ c \in \mathbb{R} : \left|\gamma^{-1}(\{c\})\right| > 0 \right\}; \qquad C = \gamma^{-1}(K).
\]
Then $g \in \mathcal{N}(C)$.

Proof. $K$ is countable and thus $C$ is an at most countable union of sets $C_j$ of the form $C_j = \gamma^{-1}(\{c_j\})$. Almost every point of $C_j$ is a point of density, and at such points, clearly $D_{ap}\gamma = 0$. Theorem 1.4 now gives the corollary.

So, first of all, we recover the result from [9,10] that the equilibrium measure of a compact set $K \subset \mathbb{R}$ is reflectionless on $K$ (see also [12] for information on potential theory). More importantly, Theorem 1.4 and Corollary 1.5 may be used to identify additional examples of reflectionless measures, and they unify these results. We will not pursue this theme in detail here, but just make a few quick remarks on how to proceed. Introduce density of states measures $\nu$ as follows:
\[
d\nu(t) = \lim_{j\to\infty}\frac{1}{N_j}\sum_{n=1}^{N_j} d\|E(t)\delta_n\|^2.
\]
Here, $E$ denotes the spectral resolution of the Jacobi matrix $J$, and the limit is taken in the weak-$*$ sense and on a suitable subsequence $N_j \to \infty$. By the Banach-Alaoglu Theorem, such limits always exist; of course, $\nu$ could depend on the choice of the sequence $N_j$. We also remark that there are other ways to obtain $d\nu$: one can use the eigenvalues of truncated problems or Christoffel-Darboux kernels; see [16, Theorem 1.5] for further discussion of this.

By (a generalized version of) the Thouless formula, the logarithmic potential $\gamma$ will equal the (generalized) Lyapunov exponent, up to a constant. Therefore, Corollary 1.5 now says that $\nu$ will be reflectionless on sets of constancy of the Lyapunov exponent; from this in turn, it also follows that $\nu \in \mathcal{R}(\Sigma_{ac})$, $g \in \mathcal{N}(\Sigma_{ac})$, where $\Sigma_{ac}$ again denotes an essential support of the absolutely continuous part of the spectral measure of $J$.

The proofs of the results discussed in this introduction will be given in the following two sections, and we split the material in the obvious way: We will prove Theorems 1.1, 1.2, and 1.3 in Sect. 2, and Sect. 3 has the proof of Theorem 1.4.

2. The Singular Part of Reflectionless Measures

Please recall the notation $F_\mu$ introduced in (1.2). Also, if $f \ge 0$ is a Borel function, then, as expected, $f\mu$ will denote the measure $(f\mu)(A) = \int_A f\, d\mu$. The following result from [11] will be our main tool in this section.

Theorem 2.1. ([11])
\[
\lim_{y\to 0+}\frac{F_{f\mu}(x+iy)}{F_\mu(x+iy)} = f(x)
\]
for $\mu_s$-almost every $x \in \mathbb{R}$.
A clarifying comment is in order: Given $\nu$, the function $F_\nu$ is of course not completely determined yet (we don’t know $a$, $b$). This, however, is not an issue here; the statement from the theorem holds for all such functions. This follows because $|F_\mu(x+iy)| \to \infty$ as $y \to 0+$ for $\mu_s$-almost every $x \in \mathbb{R}$. In Theorem 2.1, we of course implicitly assume that $1/(t^2+1)$ is integrable for all measures involved here. See also [2,7] for further discussion of this theorem. We will also use the following consequence of Theorem 2.1.

Proposition 2.2. Suppose that $\rho = \rho_s$ and $\sigma \perp \rho$. Then
\[
\lim_{y\to 0+}\frac{F_\sigma(x+iy)}{F_\rho(x+iy)} = 0
\]
for $\rho$-almost every $x \in \mathbb{R}$.

Proof. Pick a Borel set $T \subset \mathbb{R}$ with $\rho(T^c) = \sigma(T) = 0$, and abbreviate $\rho + \sigma = \mu$. Then
\[
\frac{F_\sigma}{F_\rho} = \frac{F_{\chi_{T^c}\mu}}{F_{\chi_T\mu}} = \frac{F_{\chi_{T^c}\mu}}{F_\mu}\,\frac{F_\mu}{F_{\chi_T\mu}},
\]
and $F_{\chi_{T^c}\mu}/F_\mu \to \chi_{T^c}$ $\mu_s$-almost everywhere by Theorem 2.1. In particular, this ratio goes to zero $\rho$-almost everywhere. Similarly, $F_{\chi_T\mu}/F_\mu \to 1$ $\rho$-almost everywhere, so the proposition follows.

Proof of Theorem 1.1. Let $\mu \in \mathcal{R}(E)$. Write $F_\mu$ for the associated Herglotz function $F_\mu \in \mathcal{N}(E)$, as in (1.2), and let $\xi$ be the Krein function of $F_\mu$, that is,
\[
\xi(x) = \frac{1}{\pi}\lim_{y\to 0+}\mathrm{Im}\,\ln F_\mu(x+iy),
\]
where we take the logarithm with $0 < \mathrm{Im}\,\ln w < \pi$ for $w \in \mathbb{C}^+$. Since $\ln F_\mu$ is a Herglotz function, the limit defining $\xi$ exists almost everywhere and $0 \le \xi(x) \le 1$. If, conversely, a measurable function $\zeta$ with values in $[0,1]$ is given, then $\zeta$ is the Krein function of some Herglotz function $G$. We can in fact recover $\ln G$, up to an additive real constant, from $\zeta$, using the Herglotz representation of $\ln G$. Here, we make use of the fact that since $\ln G$ has bounded imaginary part, the associated measure is purely absolutely continuous.

The condition that $F_\mu \in \mathcal{N}(E)$ means that $\xi = 1/2$ (almost everywhere) on $E$. Given an arbitrary function $\theta \in L^\infty(\mathbb{R})$, with $-1 \le \theta \le 1$ and $\theta = 0$ on $E^c$, we can therefore introduce two new Krein functions $\xi_\pm$, as follows:
\[
\xi_\pm(x) = \xi(x) \pm \frac{1}{2}\,\theta(x).
\]
As just explained, this also defines two new Herglotz functions $F_\pm$, up to multiplicative constants. We fix these constants by demanding that $|F_\pm(i)| = |F_\mu(i)|$. Call the measures associated with these functions $\mu_+$ and $\mu_-$, respectively. Since $\xi = (\xi_+ + \xi_-)/2$, we then have that
\[
F_\mu^2 = F_{\mu_+}F_{\mu_-}. \tag{2.1}
\]
Our first aim is to show that
\[
\mu_s \ll \mu_\pm. \tag{2.2}
\]
Suppose this were wrong and write $\mu_s = g\mu_{+,s} + \nu$, with $\nu \perp \mu_{+,s}$, $\nu \ne 0$. We can then find a Borel set $T$ so that $\nu(T) > 0$, $\mu_{+,s}(T) = \nu(T^c) = |T| = 0$. Theorem 2.1 now shows that
\[
\frac{F_\nu(x+iy)}{F_{\mu_s}(x+iy)} = \frac{F_{\chi_T\mu_s}(x+iy)}{F_{\mu_s}(x+iy)} \to 1
\]
for $\mu_s$-almost every $x \in T$, and, similarly,
\[
\frac{F_\nu(x+iy)}{F_{\mu_+ +\nu}(x+iy)} = \frac{F_{\chi_T(\mu_+ +\nu)}(x+iy)}{F_{\mu_+ +\nu}(x+iy)} \to 1
\]
for $(\mu_{+,s}+\nu)$-almost every $x \in T$ and thus also for $\mu_s$-almost every $x \in T$. Put differently, this means that
\[
\frac{F_{\mu_+}(x+iy)}{F_\nu(x+iy)} \to 0
\]
for $\mu_s$-almost every $x \in T$. So on a set of positive $\mu_s$-measure,
\[
\frac{F_{\mu_+}(x+iy)}{F_{\mu_s}(x+iy)} = \frac{F_{\mu_+}}{F_\nu}\,\frac{F_\nu}{F_{\mu_s}} \to 0. \tag{2.3}
\]
We also have that for $\mu_s$-almost every $x \in \mathbb{R}$,
\[
\sup_{0<y<1}\left|\frac{F_{\mu_-}(x+iy)}{F_{\mu_s}(x+iy)}\right| < \infty. \tag{2.4}
\]
This follows quickly from Proposition 2.2 with $\rho = \mu_s$ if we write $\mu_- = h\mu_s + \sigma$, with $\sigma \perp \mu_s$. Indeed, $F_{\mu_-}/F_{\mu_s} \to h$ at $\mu_s$-almost every point by the proposition and Theorem 2.1, and $h < \infty$ $\mu_s$-almost everywhere. Finally, Theorem 2.1 also implies that
\[
\lim_{y\to 0+}\frac{F_{\mu_s}(x+iy)}{F_\mu(x+iy)} = 1
\]
for $\mu_s$-almost every $x \in \mathbb{R}$, and if this is combined with (2.3), (2.4), we obtain that
\[
\frac{F_{\mu_+}F_{\mu_-}}{F_\mu^2} \to 0
\]
on a set of positive $\mu_s$-measure, but by (2.1), this ratio is identically equal to one, so we reach a contradiction if (2.2) fails. Thus we can write
\[
\mu_s = f_\pm\mu_\pm = f_\pm\mu_{\pm,s},
\]
with $f_\pm \ge 0$ and in fact $0 < f_\pm < \infty$ at $\mu_s$-almost all points. By Theorem 2.1,
\[
\lim_{y\to 0+}\frac{F_{\mu_s}(x+iy)}{F_{\mu_\pm}(x+iy)} = f_\pm(x)
\]
for $\mu_{\pm,s}$-almost every $x \in \mathbb{R}$ and thus also $\mu_s$-almost everywhere. It follows that for $\mu_s$-almost every $x$,
\[
\lim_{y\to 0+}\frac{F_{\mu_+}(x+iy)}{F_{\mu_-}(x+iy)} \quad \text{exists and is positive;} \tag{2.5}
\]
in fact, this limit is equal to $f_-(x)/f_+(x)$ $\mu_s$-almost everywhere. By definition of $\xi_\pm$, we have that $\xi_+ - \xi_- = \theta$, so, if we introduce
\[
L(z) = \int_E\left(\frac{1}{t-z} - \frac{t}{t^2+1}\right)\theta(t)\, dt,
\]
then $F_{\mu_+}/F_{\mu_-} = e^L$. Since $-1 \le \theta \le 1$, we have that $\mathrm{Im}\, L(z) \in (-\pi,\pi)$ on $\mathbb{C}^+$, and thus (2.5) implies that for $\mu_s$-almost every $x \in \mathbb{R}$,
\[
L(x) \equiv \lim_{y\to 0+} L(x+iy) \text{ exists}, \qquad \mathrm{Im}\, L(x) = 0.
\]
In particular, if we take $\theta = \chi_E$, then
\[
\mathrm{Im}\, L(x+iy) = \int_E\frac{y}{(t-x)^2+y^2}\, dt \ge \frac{1}{2y}\,|E\cap(x-y,x+y)|,
\]
so part (a) of the theorem follows. Part (b) is now also immediate, from the fact that $\mathrm{Re}\, L(x+iy)$ approaches a finite limit as $y \to 0+$.

The set from part (a) of Theorem 1.1 contains $E^c$, so this result really addresses the question of whether $\mu$ can have a singular part on $E$. In particular, it says that no such singular part can be present for closed sets of the following type.

Definition 2.1. Call a Borel set $E \subset \mathbb{R}$ weakly homogeneous if
\[
\limsup_{h\to 0+}\frac{1}{2h}\,|E\cap(x-h,x+h)| > 0
\]
for all $x \in E$.

This condition is much weaker than the following, which is used to define homogeneous sets:
\[
\inf_{x\in E}\,\inf_{0<h\le 1}\frac{1}{2h}\,|E\cap(x-h,x+h)| > 0.
\]
From the work of Sodin-Yuditskii [18] it was previously known that if $E$ is a compact (strongly) homogeneous set and $\mu \in \mathcal{R}_0(E)$, then $\mu_s = 0$. By using Theorem 1.1, we can go considerably beyond this:

Corollary 2.3. Suppose that $E$ is a weakly homogeneous set. If $\mu \in \mathcal{R}(E)$, then $\mu_s(E) = 0$.
This is more general in two respects: E is only assumed to be weakly homogeneous (rather than homogeneous), and we can treat measures from $R(E)$, not just from $R_0(E)$. This latter improvement, of course, can also be obtained from the general principle that we formulated as Theorem 1.3.
We now move on to proving Theorem 1.2. This will follow quickly from the following known characterization of the point part of µ in terms of the Krein function ξ of $F_\mu$. See, for example, [8, p. 201]. We include a proof for the reader’s convenience.
Lemma 2.4. $\mu(\{x\}) > 0$ if and only if
$$\int_{x-1}^{x+1} \frac{|\xi(t) - \chi_{(x,\infty)}(t)|}{|t-x|}\, dt < \infty. \tag{2.6}$$
Proof. First of all, we can recover the point part as
$$\mu(\{x\}) = -i \lim_{y\to 0+} y\, F_\mu(x+iy);$$
this is well known and follows quickly from the dominated convergence theorem. So $\mu(\{x\}) > 0$ if and only if
$$\limsup_{y\to 0+} \left( \operatorname{Re} \ln F_\mu(x+iy) + \ln y \right) > -\infty. \tag{2.7}$$
To slightly simplify the notation, we will now assume that x = 0. In terms of the Krein function ξ, the expression from (2.7) equals
$$\int_{-1}^{1} \frac{t}{t^2+y^2}\, \xi(t)\, dt - \int_y^1 \frac{dt}{t} + O(1) \qquad (y \to 0+).$$
By monotone convergence,
$$\int_{-1}^{0} \frac{t}{t^2+y^2}\, \xi(t)\, dt \to \int_{-1}^{0} \frac{\xi(t)}{t}\, dt$$
(and, of course, this limit could equal −∞). Also,
$$\int_0^1 \frac{t}{t^2+y^2}\, \xi(t)\, dt - \int_y^1 \frac{dt}{t} = \int_y^1 \frac{t(\xi(t)-1)}{t^2+y^2}\, dt - \int_y^1 \frac{y^2}{t(t^2+y^2)}\, dt + \int_0^y \frac{t\,\xi(t)}{t^2+y^2}\, dt = \int_y^1 \frac{t(\xi(t)-1)}{t^2+y^2}\, dt + O(1),$$
and, by monotone convergence again,
$$\int_y^1 \frac{t(\xi(t)-1)}{t^2+y^2}\, dt \to \int_0^1 \frac{\xi(t)-1}{t}\, dt \ge -\infty.$$
These calculations have shown that (2.7), for x = 0, holds if and only if
$$\int_{-1}^{0} \frac{\xi(t)}{|t|}\, dt + \int_0^1 \frac{1-\xi(t)}{t}\, dt < \infty,$$
as asserted by the lemma.
Proof of Theorem 1.2. Suppose that condition (b) from Theorem 1.2 fails. Put
$$\xi(t) = \frac{1}{2}\, \chi_E(t) + \chi_{E^c \cap (x,\infty)}(t).$$
Let $F \in \mathcal{H}$ be the corresponding Herglotz function. Since ξ = 1/2 on E, we have that $F \in \mathcal{N}(E)$, but it is also clear that (2.6) holds, so the corresponding measure has a point mass at x. The converse is an immediate consequence of Theorem 1.1(c), with $\theta(t) = \operatorname{sgn}(t-x)$. Furthermore, we can also obtain this statement conveniently from Lemma 2.4, as follows: If $F_\mu \in \mathcal{N}(E)$, then ξ = 1/2 almost everywhere on E, so the integrand from (2.6) equals $1/(2|t-x|)$ on $E \cap (x-1, x+1)$ and thus (2.6) cannot hold if we have condition (b) from Theorem 1.2.
Proof of Theorem 1.3. Let $\nu \in R(E)$. We claim that if $\mu \in R_0(E)$, then we must have that
$$\lim_{y\to 0+} \frac{F_\mu(x+iy)}{F_\nu(x+iy)} = 0 \tag{2.8}$$
for $\nu_s$-almost every $x \in \mathbb{R}$. Indeed, $\mu_s = 0$, $\mu(E^c) = 0$ by assumption, and, as discussed above, the condition that $\nu \in R(E)$ forces the absolutely continuous part of ν to be equivalent to $\chi_E\, dt$ on E. So $\mu \ll \nu$, $\mu_s = 0$, and thus (2.8) follows immediately from Theorem 2.1. Starting from ν, we will now construct a measure $\mu \in R_0(E)$ for which (2.8) cannot hold at any point $x \in E$. This will prove that $\nu_s(E) = 0$, as claimed. We will again work with the Krein functions; the following simple monotonicity property is at the heart of the matter.
Lemma 2.5. For $\xi \in L^\infty(a,b)$, $0 \le \xi \le 1$, and $x \notin [a,b]$, define
$$I_x(\xi) = \int_a^b \frac{\xi(t)\, dt}{t-x}.$$
Let $c = \int_a^b \xi(t)\, dt$. Then $I_x(\xi) \le I_x(\chi_{(a,a+c)})$ for all $x \notin [a,b]$.
Proof. It suffices to prove this for step functions ξ because these are dense in $L^1$. So assume that $\xi = \sum_{j=1}^N s_j \chi_{I_j}$, with disjoint intervals $I_j$. If $(c, c+h)$ is such an interval of constancy of ξ and ξ = s on $(c, c+h)$, with 0 < s < 1, then, as an elementary argument shows, $I_x(\xi)$ will go up if we redefine ξ on $(c, c+h)$ as $\chi_{(c,c+sh)}$. Use this procedure on all intervals of constancy. Since $I_x(\xi)$ clearly also increases if we pass to the non-increasing rearrangement of ξ, we obtain the lemma.
Let ξ be the Krein function of $F_\nu$, and, motivated by Lemma 2.5, define $\xi_0$ as follows: $\xi_0 = 1/2$ on E, and if (a, b) is one of the bounded components of the open set $E^c$, set $\xi_0 = \chi_{(a,a+c)}$ on (a, b), where $c = \int_a^b \xi\, dt$, as in the lemma. If $E^c$ has unbounded components, put $\xi_0 = 1$ (say) on these. Notice that $\xi_0$ is the Krein function of a Herglotz function $F_\mu$ whose associated measure satisfies $\mu \in R_0(E)$. Indeed, µ is reflectionless on E because this property is equivalent to $\xi_0 = 1/2$ on E, and $\mu(E^c) = 0$ because $F_\mu(x) \equiv \lim_{y\to 0+} F_\mu(x+iy)$ exists and is real at all points of $E^c$, except possibly at the
jumps of $\xi_0$. However, these can’t be discrete points of µ either because in order for this to happen, $\xi_0$ would have to jump from 0 to 1, not the other way around, by Lemma 2.4.
Now fix $x \in E$ and look at $\ln |F_\mu/F_\nu|$. As $y \to 0+$,
$$\ln |F_\mu(x+iy)| - \ln |F_\nu(x+iy)| = \int_{y<|t-x|\le 1} \frac{\xi_0(t) - \xi(t)}{t-x}\, dt + O(1). \tag{2.9}$$
Since $\xi = \xi_0 = 1/2$ on E, this set doesn’t contribute to the integral. Moreover, by Lemma 2.5 and construction of $\xi_0$, those components of $E^c$ that are contained in the region of integration make non-negative contributions. This more or less finishes the proof except that there might also be up to four truncated components of $E^c$ contributing to the integral. Suppose for example that (a, b) is such a component and $x \le a < x + y < b \le x + 1$. Suppose also, for simplicity, that x = 0. We claim that then
$$\int_y^b \frac{\xi_0(t) - \xi(t)}{t}\, dt \ge -1.$$
This follows because Lemma 2.5 says that this integral will only become smaller if we replace the actual ξ on (y, b) by $\chi_{(y,y+h)}$, where again h is chosen so that the integral of ξ over (y, b) is left unchanged. A similar process was used to construct $\xi_0$, so after this replacement, $\xi_0$ and ξ are both characteristic functions of an interval, and the interval of $\xi_0$ is not smaller than the one corresponding to ξ, so the difference $\xi_0 - \xi$ is zero, except perhaps on an interval of at most the size of the truncated piece (a, y), and this is obviously ≤ y. Similar discussions of course apply to the other cases, so (2.9) is bounded below as $y \to 0+$ and (2.8) cannot hold.
3. Proof of Theorem 1.4
For a more streamlined presentation, we first of all isolate the following simple (but key) calculation.
Lemma 3.1. If $0 < |t| \le 2y$, then
$$\int_0^y \left| \ln \left| 1 - \frac{h}{t} \right| \right| dh \le 12\, y \ln \left( 1 + \frac{y}{|t|} \right).$$
Proof. We will treat explicitly only the case $0 < t \le 2y$ here; the case where t < 0 is similar, but easier. In the former case,
$$\int_0^y \left| \ln \left| 1 - \frac{h}{t} \right| \right| dh = t \int_{1-y/t}^{1} \left| \ln |s| \right| ds. \tag{3.1}$$
If $y/t \le 2$, then
$$(3.1) \le 2t \int_0^1 |\ln s|\, ds = 2t \le 4y.$$
Similarly, if $y/t > 2$, then
$$(3.1) = 2t + t \int_1^{y/t-1} \ln s\, ds \le 2t + y \ln(1 + y/t) \le 12\, y \ln(1 + y/t).$$
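As a sanity check on Lemma 3.1, the left-hand side can be evaluated in closed form through the substitution $s = 1 - h/t$ from (3.1). The sketch below is illustrative only: the helper names `G`, `lemma_lhs` and `lemma_rhs` are hypothetical, and the grid of test values is an arbitrary assumption.

```python
import math

def G(s):
    """G(s) = integral of |ln|u|| du from 0 to s, extended as an odd function."""
    if s < 0:
        return -G(-s)
    if s == 0:
        return 0.0
    if s <= 1:
        return s - s * math.log(s)       # integral of -ln u on (0, s)
    return s * math.log(s) - s + 2.0     # 1 + integral of ln u on (1, s)

def lemma_lhs(t, y):
    """int_0^y |ln|1 - h/t|| dh, computed via s = 1 - h/t as in (3.1)."""
    return abs(t) * abs(G(1.0) - G(1.0 - y / t))

def lemma_rhs(t, y):
    """Right-hand side of Lemma 3.1."""
    return 12.0 * y * math.log(1.0 + y / abs(t))

# Check the inequality on a grid with 0 < |t| <= 2y.
for y in (1e-3, 0.1, 1.0, 10.0):
    for r in (1e-6, 0.01, 0.3, 0.5, 1.0, 1.5, 2.0):
        for sign in (1.0, -1.0):
            t = sign * r * y
            assert lemma_lhs(t, y) <= lemma_rhs(t, y)
print("Lemma 3.1 bound holds on the sample grid")
```

On this grid the constant 12 is far from sharp, which is consistent with the proof, where it only serves to absorb the two cases into a single estimate.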
Proof of Theorem 1.4. We will prove here that for almost every $x \in \mathbb{R}$, the one-sided right approximate derivative $(D^+_{ap}\gamma)(x)$ exists and (1.3) holds. The same argument can then be used to establish the corresponding statement about the left derivative, and these two statements together will give the full claim. Here, one-sided derivatives are defined in the obvious way; for example, we say that $(D^+_{ap} f)(x) = d$ if for all $\epsilon > 0$,
$$\lim_{h\to 0+} \frac{1}{h} \left| \left\{ y \in (0,h) : \left| \frac{f(x+y) - f(x)}{y} - d \right| \ge \epsilon \right\} \right| = 0.$$
Our basic strategy is modelled on the proof of [9, Theorem 5.4]. The following statements hold at (Lebesgue) almost every point $x \in \mathbb{R}$; here, we write again $\nu_s$ for the singular part of ν, and we also use the notation $\nu(t) = \nu((-\infty, t))$.
• x is a Lebesgue point of $\nu'(t)$;
• $\lim_{y\to 0+} g(x+iy)$ exists;
• $\lim_{y\to 0+} \nu_s([x-y, x+y])/y = 0$;
• $\lim_{y\to 0+} \operatorname{Im} g_s(x+iy) = 0$, where $g_s(z) = \int_{\mathbb{R}} \frac{d\nu_s(t)}{t-z}$.
We will now show that if x has all these properties, then $(D^+_{ap}\gamma)(x)$ exists and (1.3) holds. So fix such an x. To simplify the notation, we will again assume that x = 0. The basic idea is to look at averages of
$$F(y) \equiv \operatorname{Re} g(iy) + \frac{\gamma(y) - \gamma(0)}{y}.$$
By the definitions of g and γ, we have that $F(y) = \int_{\mathbb{R}} \phi_y(t)\, d\nu(t)$, where
$$\phi(t) = \frac{t}{t^2+1} + \ln \left| 1 - \frac{1}{t} \right|, \qquad \phi_y(t) = \frac{1}{y}\, \phi\!\left( \frac{t}{y} \right). \tag{3.2}$$
Recall that we assumed that ν has a finite logarithmic potential γ(z) everywhere, so $\phi_y$ is in $L^1(d\nu)$. Moreover, since $\phi(t) = O(t^{-2})$ for large |t|, we have that $\phi, \phi_y \in L^1(\mathbb{R})$. For later use, we also observe that
$$\int_{-\infty}^{\infty} \phi(t)\, dt = 0. \tag{3.3}$$
To prove this, look at $\int_{-R}^{R} \phi$. Clearly, the first term from (3.2) is odd and thus doesn’t contribute to this integral, and
$$\int_{-R}^{R} \ln \left| 1 - \frac{1}{t} \right| dt = \int_{-R}^{R} \ln \left| \frac{t-1}{t} \right| dt = \int_{-R-1}^{-R} \ln |t|\, dt - \int_{R-1}^{R} \ln |t|\, dt \to 0 \qquad (R \to \infty),$$
so we obtain (3.3).
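The identity (3.3) and the decay $\phi(t) = O(t^{-2})$ can be illustrated numerically. The sketch below evaluates the truncated integral $\int_{-R}^{R}\phi$ through the shift identity displayed above, using the antiderivative $t \ln t - t$ of $\ln t$; the function names are hypothetical and chosen for this example only.

```python
import math

def phi(t):
    """phi(t) = t/(t^2+1) + ln|1 - 1/t| from (3.2); undefined at t = 0 and t = 1."""
    return t / (t * t + 1.0) + math.log(abs(1.0 - 1.0 / t))

def log_antiderivative(t):
    """Antiderivative t*ln(t) - t of ln(t), for t > 0."""
    return t * math.log(t) - t

def truncated_integral(R):
    """Integral of phi over [-R, R] for R > 1.

    The odd term t/(t^2+1) drops out; the logarithmic term equals
    int_R^{R+1} ln t dt - int_{R-1}^R ln t dt by the shift identity in the text.
    """
    A = log_antiderivative
    return (A(R + 1.0) - A(R)) - (A(R) - A(R - 1.0))

for R in (10.0, 100.0, 1000.0):
    print(f"R={R:7.1f}  truncated integral={truncated_integral(R):.3e}  R*integral={R * truncated_integral(R):.6f}")

# Decay of phi at infinity: t^2 * phi(t) approaches -1/2.
print(1000.0 ** 2 * phi(1000.0))
```

The truncated integral behaves like $1/R$, which matches the tail contribution expected from $\phi(t) \approx -1/(2t^2)$ at infinity.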
Suppose now that $B_y$ is a family of Borel sets with the following properties:
$$B_y \subset [\delta y, y], \qquad |B_y| \ge \delta y, \tag{3.4}$$
for some fixed (but arbitrary) 0 < δ < 1/2. Define
$$\psi_y(t) = \frac{1}{|B_y|} \int_{B_y} \phi_h(t)\, dh.$$
We now claim that
$$|\psi_y(t)| \lesssim \begin{cases} y^{-1} \ln(1 + y/|t|), & 0 < |t| \le 2y, \\ y/t^2, & |t| > 2y. \end{cases} \tag{3.5}$$
The constant implicit in (3.5) only depends on δ. Indeed, for $|t| \le 2y$ this follows immediately from Lemma 3.1 and the obvious bound $|t|/(t^2+y^2) \le 2/y$ (if $|t| \le 2y$). If, on the other hand, $h \le y < |t|/2$, then Taylor’s theorem shows that
$$|\phi_h(t)| = \frac{h^2}{|t|(t^2+h^2)} + O(h/t^2) \lesssim \frac{y}{t^2},$$
and the second bound from (3.5) follows. Next, (3.3) and the Fubini-Tonelli Theorem imply that
$$\int_{-\infty}^{\infty} \psi_y(t)\, dt = 0. \tag{3.6}$$
Our next goal is to show that
$$\lim_{y\to 0+} \int_{-\infty}^{\infty} \psi_y(t)\, d\nu(t) = 0.$$
We rewrite this as
$$\left| \int_{\mathbb{R}} \psi_y(t)\, d\nu(t) \right| \le \int_{\mathbb{R}} |\psi_y(t)|\, d\nu_s(t) + \left| \int_{\mathbb{R}} \psi_y(t)\, \nu'(t)\, dt \right|. \tag{3.7}$$
Our first step will be to show that the first integral on the right-hand side of (3.7) goes to zero as $y \to 0$. Start by considering the contributions coming from $|t| > 2y$: By (3.5),
$$\int_{|t|>2y} |\psi_y(t)|\, d\nu_s(t) \lesssim \int_{\mathbb{R}} \frac{y}{t^2+y^2}\, d\nu_s(t) = \operatorname{Im} g_s(iy) \to 0,$$
by our choice of x (= 0). Next, if $\epsilon > 0$ is given, we can find η > 0 so that if $h \le \eta$, then $\nu_s([-h,h]) \le \epsilon h$. If 2y < η, then this, (3.5), and the monotone convergence theorem imply that
$$\int_{|t|\le 2y} |\psi_y(t)|\, d\nu_s(t) = \sum_{n=0}^{\infty} \int_{2^{-n}y<|t|\le 2^{-n+1}y} |\psi_y(t)|\, d\nu_s(t) \lesssim \frac{1}{y} \sum_{n=0}^{\infty} \nu_s\!\left( [-2^{-n+1}y,\, 2^{-n+1}y] \right) \ln\left( 1 + 2^n \right) \lesssim \epsilon \sum_{n=0}^{\infty} (n+1)\, 2^{-n} = C\epsilon.$$
So, putting things together, we see that $\lim_{y\to 0+} \int_{\mathbb{R}} |\psi_y|\, d\nu_s = 0$, as desired.
As for the last integral from (3.7), we recall (3.6) to estimate this as follows:
$$\left| \int_{\mathbb{R}} \psi_y(t)\, \nu'(t)\, dt \right| \le \int_{\mathbb{R}} |\psi_y(t)|\, |\nu'(t) - \nu'(0)|\, dt.$$
Now a very similar argument works, so we will just give a sketch of this. First of all, for $|t| > 2y$, $|\psi_y(t)|$ is dominated by the Poisson kernel $y/(t^2+y^2)$, so this part goes to zero because x = 0 is a Lebesgue point of $\nu'$. For small |t|, on the other hand, we again have that the contributions coming from $|t| \approx 2^{-n}y$ will be $\lesssim \epsilon n 2^{-n}$, and the sum over n is still $\lesssim \epsilon$. Let us summarize: We have shown that
$$\lim_{y\to 0} \int \psi_y(t)\, d\nu(t) = 0.$$
By unwrapping the definitions, we see that this means that
$$\lim_{y\to 0} \left( \frac{1}{|B_y|} \int_{B_y} \frac{\gamma(h) - \gamma(0)}{h}\, dh + \frac{1}{|B_y|} \int_{B_y} \operatorname{Re} g(ih)\, dh \right) = 0.$$
Since g(ih) converges to g(0), by the choice of x = 0 again, the second term converges to Re g(0), so we can also say that
$$\lim_{y\to 0} \frac{1}{|B_y|} \int_{B_y} \left( \frac{\gamma(h) - \gamma(0)}{h} + \operatorname{Re} g(0) \right) dh = 0, \tag{3.8}$$
and this holds for any choice of sets $B_y$ as in (3.4). This implies that the (right) approximate derivative of γ at x = 0 exists and (1.3) holds. Indeed, if this were not true, then we could find $\delta, \epsilon > 0$ and a sequence of sets $A_n \subset [0, y_n]$, with $y_n \to 0$, such that $|A_n| \ge 3\delta y_n$ and
$$\left| \frac{\gamma(h) - \gamma(0)}{h} + \operatorname{Re} g(0) \right| \ge \epsilon$$
for all $h \in A_n$. But then we can also construct sets $B_n \subset [\delta y_n, y_n]$, $|B_n| \ge \delta y_n$, so that either
$$\frac{\gamma(h) - \gamma(0)}{h} + \operatorname{Re} g(0) \ge \epsilon$$
for all $h \in B_n$ or $\ldots \le -\epsilon$ for all $h \in B_n$. However, then (3.8) with $B_{y_n} = B_n$ leads to a contradiction, so we have to admit that the (one-sided) approximate derivative exists and (1.3) holds, as claimed.
References
1. Bruckner, A.M.: Differentiation of Real Functions. Lecture Notes in Mathematics 659, Berlin: Springer, 1978
2. Cima, J.A., Matheson, A.L., Ross, W.T.: The Cauchy Transform. Mathematical Surveys and Monographs 125, Providence, RI: Amer. Math. Soc., 2006
3. Dombrowski, J.: Quasitriangular matrices. Proc. Amer. Math. Soc. 69, 95–96 (1978)
4. Gesztesy, F., Krishna, M., Teschl, G.: On isospectral sets of Jacobi operators. Commun. Math. Phys. 181, 631–645 (1996)
5. Gesztesy, F., Yuditskii, P.: Spectral properties of a class of reflectionless Schrödinger operators. J. Funct. Anal. 241, 486–527 (2006)
6. Gesztesy, F., Zinchenko, M.: Local spectral properties of reflectionless Jacobi, CMV, and Schrödinger operators. http://arxiv.org/abs/0803.3177 [math.sp], 2008
7. Jaksic, V., Last, Y.: A new proof of Poltoratski’s theorem. J. Funct. Anal. 215, 103–110 (2004)
8. Martin, M., Putinar, M.: Lectures on Hyponormal Operators. Operator Theory: Advances and Applications 39, Basel: Birkhäuser Verlag, 1989
9. Melnikov, M., Poltoratski, A., Volberg, A.: Uniqueness theorems for Cauchy integrals. http://arxiv.org/abs/0704.0621 [math.sp], 2007
10. Nazarov, F., Volberg, A., Yuditskii, P.: Reflectionless measures with a point mass and singular continuous component. http://arxiv.org/abs/0711.0948 [math.sp], 2008
11. Poltoratski, A.: Boundary behavior of pseudocontinuable functions. St. Petersburg Math. J. 5, 389–406 (1994)
12. Ransford, T.: Potential Theory in the Complex Plane. London Mathematical Society Student Texts 28, Cambridge: Cambridge University Press, 1995
13. Remling, C.: The absolutely continuous spectrum of Jacobi matrices. http://arxiv.org/abs/0706.1101 [math.sp], 2007
14. Saks, S.: Theory of the Integral. Second revised edition, New York: Dover Publications, 1964
15. Simon, B.: Equilibrium measures and capacities in spectral theory. Inverse Probl. Imaging 1, 713–772 (2007)
16. Simon, B.: Weak convergence of CD kernels and applications. To appear in Duke Math. J.
17. Simon, B., Spencer, T.: Trace class perturbations and the absence of absolutely continuous spectra. Commun. Math. Phys. 125, 113–125 (1989)
18. Sodin, M., Yuditskii, P.: Almost periodic Jacobi matrices with homogeneous spectrum, infinite-dimensional Jacobi inversion, and Hardy spaces of character-automorphic functions. J. Geom. Anal. 7, 387–435 (1997)
19. Stahl, H., Totik, V.: General Orthogonal Polynomials. Encyclopedia of Mathematics and its Applications 43, Cambridge: Cambridge University Press, 1992
20. Thomson, B.S.: Real Functions. Lecture Notes in Mathematics 1170, Berlin: Springer-Verlag, 1985
Communicated by B. Simon
Commun. Math. Phys. 288, 1023–1059 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0754-z
Communications in
Mathematical Physics
On the Mean-Field Limit of Bosons with Coulomb Two-Body Interaction Jürg Fröhlich, Antti Knowles, Simon Schwarz Institute of Theoretical Physics, ETH Hönggerberg, CH-8093 Zürich, Switzerland. E-mail:
[email protected];
[email protected];
[email protected] Received: 28 May 2008 / Accepted: 2 December 2008 Published online: 28 February 2009 – © Springer-Verlag 2009
Abstract: In the mean-field limit the dynamics of a quantum Bose gas is described by a Hartree equation. We present a simple method for proving the convergence of the microscopic quantum dynamics to the Hartree dynamics when the number of particles becomes large and the strength of the two-body potential tends to 0 like the inverse of the particle number. Our method is applicable for a class of singular interaction potentials including the Coulomb potential. We prove and state our main result for the Heisenberg-picture dynamics of “observables”, thus avoiding the use of coherent states. Our formulation shows that the mean-field limit is a “semi-classical” limit.
1. Introduction
Whenever many particles interact by means of weak two-body potentials, one expects that the potential felt by any one particle is given by an average potential generated by the particle density. In this mean-field regime, one hopes to find that the emerging dynamics is simpler and less encumbered by tedious microscopic information than the original N-body dynamics. The mathematical study of such problems has quite a long history. In the context of classical mechanics, where the mean-field limit is described by the Vlasov equation, the problem was successfully studied by Braun and Hepp [3], as well as Neunzert [16]. The mean-field limit of quantum Bose gases was first addressed in the seminal paper [10] of Hepp. We refer to [6] for a short discussion of some subsequent results. The case with a Coulomb interaction potential was treated by Erdős and Yau in [6]. Recently, Rodnianski and Schlein [21] have derived explicit estimates for the rate of convergence to the mean-field limit, using the methods of [10] and [9]. A sharper bound on the rate of convergence in the case of a sufficiently regular interaction potential was derived by Schlein and Erdős [22], by using a new method inspired by Lieb-Robinson inequalities.
In [7,15], the mean-field limit (N → ∞) and the classical limit were studied simultaneously. A conceptually quite novel approach to studying mean-field limits was introduced in [8].
In that paper, the time evolution of quantum and corresponding “classical” observables is studied in the Heisenberg picture, and it is shown that “time evolution commutes with quantisation” up to terms that tend to 0 in the mean-field (“classical”) limit, which is an Egorov-type result. In this paper we present a new, simpler way of handling singular interaction potentials. It yields an Egorov-type formulation of convergence to the mean-field limit, thus obviating the need to consider particular (traditionally coherent) states as initial conditions. Another, technical, advantage of our method is that it requires no regularity (traditionally $H^1$- or $H^2$-regularity) when applied to coherent states.
Such kinds of results were first obtained by Egorov [5] for the semi-classical limit of a quantum system. Roughly, the statement is that time-evolution commutes with quantisation in the semi-classical limit. We sketch this in a simple example: Let us start with a classical Hamiltonian system of a finite number f of degrees of freedom. The classical algebra of observables $\mathfrak{A}$ is given by (some subalgebra of) the Abelian algebra of smooth functions on the phase space $\Gamma := \mathbb{R}^{2f}$. Let $H \in \mathfrak{A}$ be a Hamilton function. Together with the symplectic structure on Γ, H generates a symplectic flow $\phi^t$ on Γ. Now we define a quantisation map $\widehat{(\cdot)} : \mathfrak{A} \to \widehat{\mathfrak{A}}$, where $\widehat{\mathfrak{A}}$ is some subalgebra of $B(L^2(\mathbb{R}^f))$. For concreteness, let $\widehat{(\cdot)}$ be Weyl quantisation with deformation parameter $\hbar$. This implies that
$$\left[ \widehat{A}, \widehat{B} \right] = \frac{\hbar}{i}\, \widehat{\{A, B\}} + O(\hbar^2),$$
for $\hbar \to 0$. The quantised Hamilton function defines a 1-parameter group of automorphisms on $\widehat{\mathfrak{A}}$ through
$$\mathbf{A} \mapsto e^{it\widehat{H}/\hbar}\, \mathbf{A}\, e^{-it\widehat{H}/\hbar}, \qquad \mathbf{A} \in \widehat{\mathfrak{A}}.$$
An Egorov-type semi-classical result states that, for all $A \in \mathfrak{A}$ and $t \in \mathbb{R}$,
$$\widehat{(A \circ \phi^t)} = e^{it\widehat{H}/\hbar}\, \widehat{A}\, e^{-it\widehat{H}/\hbar} + R(t),$$
where $R(t) \to 0$ as $\hbar \to 0$.
This approach identifies the semi-classical limit as the converse of quantisation. In a similar fashion, we identify the mean-field limit as the converse of “second quantisation”. In this case the deformation parameter is not $\hbar$, but $N^{-1}$, a parameter proportional to the coupling constant. We consider the mean-field dynamics (given by the Hartree equation in the case of bosons), and view it as the Hamiltonian dynamics of a classical Hamiltonian system. We show that its quantisation describes N-body quantum mechanics, and that the “semi-classical” limit corresponding to $N^{-1} \to 0$ takes us back to the Hartree dynamics.
We sketch the key ideas behind our strategy.
(1) Use the Schwinger-Dyson expansion to construct the Heisenberg-picture dynamics of p-particle operators $e^{itH_N}\, \widehat{\mathrm{A}}_N(a^{(p)})\, e^{-itH_N}$ (in the notation of Sect. 3).
(2) Use Kato smoothing plus combinatorial estimates (counting of graphs) to prove convergence of the Schwinger-Dyson expansion on N-particle Hilbert space, uniformly in N and for small |t|. Diagrams containing l loops yield a contribution of order $N^{-l}$.
(3) Use Kato smoothing plus combinatorial estimates to prove convergence of the iterative solution of the Hartree equation, for small |t|.
(4) Show that the Wick quantisation of the series in (3) is equal to the series of tree diagrams in (2).
(5) Extend (2) and (3) to arbitrary times by using unitarity and conservation laws.
This paper is organised as follows. In Sect. 2 we show that the classical Newtonian mechanics of point particles is the second quantisation of Vlasov theory, the latter being the mean-field (or “classical”) limit of the former. The bulk of the paper is devoted to a rigorous analysis of the mean-field limit of Bose gases. In Sect. 3 we recall some important concepts of quantum many-body theory and introduce a general formalism which is convenient when dealing with quantum gases. Section 4 contains an implementation of Step (1) above. The convergence of the Schwinger-Dyson series for bounded interaction potentials is briefly discussed in Sect. 5. Section 6 implements Step (2) above. Steps (3), (4) and (5) are implemented in Sect. 7. Finally, Sect. 8 extends our results to more general interaction potentials as well as nonvanishing external potentials.
2. Mean-Field Limit in Classical Mechanics
In this section we consider the example of classical Newtonian mechanics to illustrate how the atomistic constitution of matter arises by quantisation of a continuum theory. The aim of this section is to give a brief and nonrigorous overview of some ideas that we shall develop in the context of quantum Bose gases, in full detail, in the following sections.
A classical gas is described as a continuous medium whose state is given by a nonnegative mass density $d\mu(x,v) = M f(x,v)\, dx\, dv$ on the “one-particle” phase space $\mathbb{R}^3 \times \mathbb{R}^3$. Here M is the mass of one “mole” of gas; µ(A) is the mass of gas in the phase space volume $A \subset \mathbb{R}^3 \times \mathbb{R}^3$. Let $\int dx\, dv\, f(x,v) = \nu < \infty$ denote the number of “moles” of the gas, so that the total mass of the gas is $\mu(\mathbb{R}^3 \times \mathbb{R}^3) = \nu M$.
An example of an equation of motion for f(x, v) is the Vlasov equation
$$\partial_t f_t(x,v) = -\left( v \cdot \nabla_x f_t \right)(x,v) + \frac{1}{m} \left( \nabla V_{\mathrm{eff}}[f_t] \cdot \nabla_v f_t \right)(x,v), \tag{2.1}$$
where m is a constant with the dimension of a mass, t denotes time, and
$$V_{\mathrm{eff}}[f](x) = V(x) + \int dy\, W(x-y) \int dv\, f(y,v).$$
Here V is the potential of external forces acting on the gas and W is a (two-body) potential describing self-interactions of the gas.
The Vlasov equation arises as the mean-field limit of a classical Hamiltonian system of n point particles of mass m, with trajectories $(x_i(t))_{i=1}^{n}$, moving in an external potential V and interacting through two-body forces with potential $N^{-1} W(x_i - x_j)$. Here N is the inverse coupling constant. We interpret N as “Avogadro’s number”, i.e. as the number of particles per “mole” of gas. Thus, M = mN and n = νN. More precisely, it is well-known (see [3,16]) that, under some technical assumptions on V and W,
$$f_t(x,v) = \operatorname*{w*-lim}_{n\to\infty}\ \frac{\nu}{n} \sum_{i=1}^{n} \delta(x - x_i(t))\, \delta(v - \dot{x}_i(t)) \tag{2.2}$$
exists for all times t and is the (unique) solution of (2.1), provided that this holds at time t = 0. Here, $f_t$ is viewed as an element of the dual space of continuous bounded functions. Note that n and N are, a priori, unrelated objects. While n is the number of particles in the classical Hamiltonian system, $N^{-1}$ is by definition the coupling constant. The mean-field limit is the limit $n \to \infty$ while keeping $n \propto N$; the proportionality constant is ν.
It is of interest to note that the Vlasov dynamics (2.1) may be interpreted as a Hamiltonian dynamics on an infinite-dimensional affine phase space $\Gamma_{\mathrm{Vlasov}}$. To see this, we write $f(x,v) = \bar{\alpha}(x,v)\, \alpha(x,v)$, where $\bar{\alpha}(x,v)$, $\alpha(x,v)$ are complex coordinates on $\Gamma_{\mathrm{Vlasov}}$. For our purposes it is enough to say that $\Gamma_{\mathrm{Vlasov}}$ is some dense subspace of $L^2(\mathbb{R}^6)$ (typically a weighted Sobolev space of index 1). On $\Gamma_{\mathrm{Vlasov}}$ we define a symplectic form through
$$\omega = i \int dx\, dv\ d\bar{\alpha}(x,v) \wedge d\alpha(x,v).$$
This yields a Poisson bracket which reads
$$\{\alpha(x,v), \alpha(y,w)\} = \{\bar{\alpha}(x,v), \bar{\alpha}(y,w)\} = 0, \qquad \{\alpha(x,v), \bar{\alpha}(y,w)\} = i\, \delta(x-y)\, \delta(v-w). \tag{2.3}$$
A Hamilton function H is defined on $\Gamma_{\mathrm{Vlasov}}$ through
$$H(\alpha) := i \int dx\, dv\ \bar{\alpha}(x,v) \left( -v \cdot \nabla_x + \frac{1}{m} \nabla V(x) \cdot \nabla_v \right) \alpha(x,v) + \frac{i}{m} \int dx\, dv\ \bar{\alpha}(x,v) \int dy\, dw\ \nabla W(x-y)\, |\alpha(y,w)|^2 \cdot \nabla_v \alpha(x,v). \tag{2.4}$$
Note that H is invariant under gauge transformations $\alpha \mapsto e^{-i\theta}\alpha$, $\bar{\alpha} \mapsto e^{i\theta}\bar{\alpha}$, which by Noether’s theorem implies that $\int |\alpha|^2\, dx\, dv = \int f\, dx\, dv$ is conserved. Let us abbreviate $K := -\nabla V/m$ and $F := -\nabla W/m$. After a short calculation using (2.3) we find that the Hamiltonian equation of motion $\dot{\alpha}_t(x,v) = \{H, \alpha_t(x,v)\}$ reads
$$\dot{\alpha}_t(x,v) = \left( -v \cdot \nabla_x - K(x) \cdot \nabla_v \right) \alpha_t(x,v) - \int dy\, dw\ F(x-y)\, |\alpha_t(y,w)|^2 \cdot \nabla_v \alpha_t(x,v) + \int dy\, dw\ F(x-y)\, \bar{\alpha}_t(y,w)\, \alpha_t(x,v) \cdot \nabla_w \alpha_t(y,w). \tag{2.5}$$
Also, $\bar{\alpha}_t$ satisfies the complex conjugate equation. Therefore,
$$\frac{d}{dt} |\alpha_t(x,v)|^2 = \left( -v \cdot \nabla_x - K(x) \cdot \nabla_v \right) |\alpha_t(x,v)|^2 - \int dy\, dw\ F(x-y)\, |\alpha_t(y,w)|^2 \cdot \nabla_v |\alpha_t(x,v)|^2$$
$$\qquad + |\alpha_t(x,v)|^2 \int dy\, dw\ F(x-y) \cdot \left[ \bar{\alpha}_t(y,w) \nabla_w \alpha_t(y,w) + \alpha_t(y,w) \nabla_w \bar{\alpha}_t(y,w) \right]. \tag{2.6}$$
We assume that
$$|\alpha(x,v)| = o(|(x,v)|^{-1}), \qquad (x,v) \to \infty. \tag{2.7}$$
We shall shortly see that this property is preserved under time-evolution. By integration by parts, we see that the second line of (2.6) vanishes, and we recover the Vlasov equation of motion (2.1) for $f = |\alpha|^2$.
We comment briefly on the existence and uniqueness of solutions to the Hamiltonian equation of motion (2.5). Following Braun and Hepp [3], we assume that K and F are bounded and continuously differentiable with bounded derivatives. We use polar coordinates $\alpha = \beta e^{i\varphi}$, where $\varphi \in \mathbb{R}$ and $\beta \ge 0$. Then the Hamiltonian equation of motion (2.5) reads
$$\dot{\beta}_t(x,v) = \left( -v \cdot \nabla_x - K(x) \cdot \nabla_v \right) \beta_t(x,v) - \int dy\, dw\ F(x-y)\, \beta_t^2(y,w) \cdot \nabla_v \beta_t(x,v), \tag{2.8a}$$
$$\dot{\varphi}_t(x,v) = \left( -v \cdot \nabla_x - K(x) \cdot \nabla_v \right) \varphi_t(x,v) - \int dy\, dw\ F(x-y)\, \beta_t^2(y,w) \cdot \nabla_v \varphi_t(x,v) + \int dy\, dw\ F(x-y)\, \beta_t^2(y,w) \cdot \nabla_w \varphi_t(y,w). \tag{2.8b}$$
We consider two cases.
(i) $\varphi = 0$. In this case α = β and the equations of motion (2.8) are equivalent to the Vlasov equation for $f = \beta^2$. The results of [3,16] then yield a global well-posedness result.
(ii) $\varphi \ne 0$. The equation of motion (2.8a) is independent of $\varphi$. Case (i) implies that it has a unique global solution. In order to solve the linear equation (2.8b), we apply a contraction mapping argument. Consider the space $X := \{\varphi \in C(\mathbb{R}^6) : \nabla\varphi \in L^\infty(\mathbb{R}^6)\}$. Using Sobolev inequalities one finds that X, equipped with the norm $\|\varphi\|_X := |\varphi(0)| + \|\nabla\varphi\|_\infty$, is a Banach space. We rewrite (2.8b) as an integral equation, and using standard methods show that, for small times, it has a unique solution. Using conservation of $\int dx\, dv\, \beta_t^2$ we iterate this procedure to find a global solution. We omit further details.
Note that, as shown in [3], the solution $\beta_t$ can be written using a flow $\phi^t$ on the one-particle phase space: $\beta_t(x,v) = \beta_0(\phi^{-t}(x,v))$. The flow $\phi^t(x,v) = (x(t), v(t))$ satisfies
$$\dot{x}(t) = v(t), \qquad \dot{v}(t) = K(x(t)) + \int dy\, dw\ \beta_t^2(y,w)\, F(x(t) - y).$$
Using conservation of $\int dx\, dv\, \beta_t^2$ we find that there is a constant C such that
$$|\phi^{-t}(x,v)| \le |(x,v)|(1+t) + C(1+t^2).$$
Therefore (2.7) holds for all times t provided that it holds at time t = 0.
The Hamiltonian formulation of Vlasov dynamics can serve as a starting point to recover the atomistic Hamiltonian mechanics of point particles by quantisation: Replace
$$\bar{\alpha}(x,v) \to \alpha_N^*(x,v), \qquad \alpha(x,v) \to \alpha_N(x,v),$$
where $\alpha_N^*$ and $\alpha_N$ are creation and annihilation operators acting on the bosonic Fock space $\mathcal{F}_+(L^2(\mathbb{R}^6))$; see Appendix A. They satisfy the canonical commutation relations (A.2); explicitly,
$$\left[ \alpha_N(x,v), \alpha_N(y,w) \right] = \left[ \alpha_N^*(x,v), \alpha_N^*(y,w) \right] = 0, \qquad \left[ \alpha_N(x,v), \alpha_N^*(y,w) \right] = \frac{1}{N}\, \delta(x-y)\, \delta(v-w). \tag{2.9}$$
Given a function A on $\Gamma_{\mathrm{Vlasov}}$ which is a polynomial in α and $\bar{\alpha}$, we define an operator $\widehat{A}_N$ on $\mathcal{F}_+$ by replacing $\alpha^\#$ with $\alpha_N^\#$ and Wick-ordering the resulting expression. We denote this quantisation map by $\widehat{(\cdot)}_N$. Here, $N^{-1}$ is the deformation parameter of the quantisation: We find that
$$\left[ \widehat{A}_N, \widehat{B}_N \right] = \frac{N^{-1}}{i}\, \widehat{\{A, B\}}_N + O(N^{-2}),$$
for $N \to \infty$. Here A and B are polynomial functions on $\Gamma_{\mathrm{Vlasov}}$. The dynamics of a state $\Phi \in \mathcal{F}_+$ is given by the Schrödinger equation
$$i N^{-1} \partial_t \Phi_t = \widehat{H}_N \Phi_t, \tag{2.10}$$
where $\widehat{H}_N$ is the quantisation of the Vlasov Hamiltonian H. In order to identify the dynamics given by (2.10) with the classical dynamics of point particles, we study wave functions $\Phi^{(n)}(x_1, v_1, \ldots, x_n, v_n)$ in the n-particle sector of $\mathcal{F}_+$, and interpret $\rho^{(n)} := |\Phi^{(n)}|^2$ as a probability density on the n-body classical phase space. If $\Omega \in \mathcal{F}_+$ denotes the vacuum vector annihilated by $\alpha_N(x,v)$ then
$$\Phi^{(n)} = \frac{N^{n/2}}{\sqrt{n!}} \int dx_1\, dv_1 \cdots dx_n\, dv_n\ \Phi^{(n)}(x_1, v_1, \ldots, x_n, v_n)\ \alpha_N^*(x_n, v_n) \cdots \alpha_N^*(x_1, v_1)\, \Omega.$$
It is a simple matter to check that (2.9) and (2.10) imply that
$$\partial_t \Phi_t^{(n)} = \sum_{i=1}^{n} \left( -v_i \cdot \nabla_{x_i} + \frac{1}{m} \nabla V(x_i) \cdot \nabla_{v_i} \right) \Phi_t^{(n)} + \frac{1}{N} \sum_{1\le i\ne j\le n} \frac{1}{m} \nabla W(x_i - x_j) \cdot \nabla_{v_i} \Phi_t^{(n)}.$$
Also, $\overline{\Phi_t^{(n)}}$ satisfies the same equation. Therefore,
$$\partial_t \rho_t^{(n)} = \sum_{i=1}^{n} \left( -v_i \cdot \nabla_{x_i} + \frac{1}{m} \nabla V(x_i) \cdot \nabla_{v_i} \right) \rho_t^{(n)} + \frac{1}{N} \sum_{1\le i\ne j\le n} \frac{1}{m} \nabla W(x_i - x_j) \cdot \nabla_{v_i} \rho_t^{(n)}.$$
This is the Liouville equation corresponding to the Hamiltonian equations of motion of n classical point particles,
$$\partial_t x_i = v_i, \qquad m\, \partial_t v_i = -\nabla V(x_i) - \frac{1}{N} \sum_{j\ne i} \nabla W(x_i - x_j).$$
Analogous results can be proven if $\alpha_N^*$ and $\alpha_N$ are chosen to be fermionic creation and annihilation operators obeying the canonical anti-commutation relations and acting on the fermionic Fock space $\mathcal{F}_-(L^2(\mathbb{R}^6))$.
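The role of the scaling n = νN in the equations of motion above can be illustrated with a small toy computation. The sketch below is a one-dimensional illustration under stated assumptions: the harmonic pair potential $W(r) = r^2/2$, the vanishing external potential and the particle positions are all hypothetical choices made only for this example, and `accelerations` is a helper name introduced here.

```python
import math

def accelerations(xs, N, grad_W, grad_V=lambda x: 0.0, m=1.0):
    """Right-hand side of m * dv_i/dt = -grad V(x_i) - (1/N) * sum_{j != i} grad W(x_i - x_j).

    xs: particle positions (one-dimensional toy model),
    N:  inverse coupling constant, not necessarily equal to len(xs).
    """
    n = len(xs)
    acc = []
    for i in range(n):
        pair = sum(grad_W(xs[i] - xs[j]) for j in range(n) if j != i)
        acc.append(-(grad_V(xs[i]) + pair / N) / m)
    return acc

# Toy choice W(r) = r^2 / 2, so grad W(r) = r (an assumption for illustration).
# With n = nu * N the force on particle i is -nu * (x_i - mean), independently of n.
nu = 2.0
for n in (10, 100):
    N = n / nu
    xs = [math.sin(3.0 * i) for i in range(n)]
    mean = sum(xs) / n
    acc = accelerations(xs, N, grad_W=lambda r: r)
    assert all(abs(a + nu * (x - mean)) < 1e-9 for a, x in zip(acc, xs))
print("per-particle force depends on n only through nu = n / N")
```

In this toy model the force on each particle stays of order one as n grows at fixed ν, which is the feature of the $N^{-1}$ coupling that makes a mean-field limit plausible.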
3. Quantum Gases: The Setup
Although our main results are restricted to bosons, all of the following rather general formalism remains unchanged for fermions. We therefore consider both bosonic and fermionic statistics throughout Sects. 3 – 6. Details on systems of fermions will appear elsewhere.
Throughout the following we consider the one-particle Hilbert space $\mathcal{H} := L^2(\mathbb{R}^3, dx)$. We refer the reader to Appendix A for our choice of notation and a short discussion of many-body quantum mechanics. In the following a central role is played by the p-particle operators, i.e. closed operators $a^{(p)}$ on $\mathcal{H}_\pm^{(p)} = P_\pm \mathcal{H}^{\otimes p}$, where $P_+$ and $P_-$ denote symmetrisation and anti-symmetrisation, respectively. When using second-quantised notation it is convenient to use the operator kernel of $a^{(p)}$. Here is what this means (see [18] for details): Let $\mathcal{S}(\mathbb{R}^d)$ be the usual Schwartz space of smooth functions of rapid decrease, and $\mathcal{S}'(\mathbb{R}^d)$ its topological dual. The nuclear theorem states that to every operator A on $L^2(\mathbb{R}^d)$, such that the map $(f, g) \mapsto \langle f, Ag \rangle$ is separately continuous on $\mathcal{S}(\mathbb{R}^d) \times \mathcal{S}(\mathbb{R}^d)$, there belongs a tempered distribution (“kernel”) $\tilde{A} \in \mathcal{S}'(\mathbb{R}^{2d})$, such that
$$\langle f, Ag \rangle = \tilde{A}(\bar{f} \otimes g).$$
In the following we identify $\tilde{A}$ with A. In the suggestive physicist’s notation we thus have
$$\left\langle f^{(p)}, a^{(p)} g^{(p)} \right\rangle = \int dx_1 \cdots dx_p\, dy_1 \cdots dy_p\ \overline{f^{(p)}(x_1, \ldots, x_p)}\ a^{(p)}(x_1, \ldots, x_p; y_1, \ldots, y_p)\ g^{(p)}(y_1, \ldots, y_p),$$
where $f, g \in \mathcal{S}(\mathbb{R}^{3p})$. It will be easy to verify that all p-particle operators that appear in the following satisfy the above condition; this is for instance the case for all bounded $a^{(p)} \in B(\mathcal{H}^{\otimes p})$.
Next, we define second quantisation $\widehat{\mathrm{A}}_N$. It maps a closed operator on $\mathcal{H}_\pm^{(p)}$ to a closed operator on $\mathcal{F}_\pm$ according to the formula
$$\widehat{\mathrm{A}}_N(a^{(p)}) := \int dx_1 \cdots dx_p\, dy_1 \cdots dy_p\ \widehat{\psi}_N^*(x_p) \cdots \widehat{\psi}_N^*(x_1)\ a^{(p)}(x_1, \ldots, x_p; y_1, \ldots, y_p)\ \widehat{\psi}_N(y_1) \cdots \widehat{\psi}_N(y_p). \tag{3.1}$$
Here $\widehat{\psi}_N^\# := \frac{1}{\sqrt{N}}\, \widehat{\psi}^\#$, where $\widehat{\psi}^\#$ is the usual creation or annihilation operator; see Appendix A.
In order to understand the action of $\widehat{\mathrm{A}}_N(a^{(p)})$ on $\mathcal{H}_\pm^{(n)}$, we write
$$\Phi^{(n)} = \frac{N^{n/2}}{\sqrt{n!}} \int dz_1 \cdots dz_n\ \Phi^{(n)}(z_1, \ldots, z_n)\ \widehat{\psi}_N^*(z_n) \cdots \widehat{\psi}_N^*(z_1)\, \Omega$$
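and apply $\widehat{A}_N(a^{(p)})$ to the right side. By using the (anti)commutation relations (A.2) to pull the $p$ annihilation operators $\hat\psi_N(y_i)$ through the $n$ creation operators $\hat\psi_N^*(z_i)$, and $\hat\psi_N(x)\Omega = 0$, we get the "first quantised" expression
\[
\widehat{A}_N(a^{(p)})\big|_{\mathcal{H}_\pm^{(n)}} =
\begin{cases}
\dfrac{p!}{N^p} \dbinom{n}{p} P_\pm \big(a^{(p)} \otimes 1^{(n-p)}\big) P_\pm, & n \ge p, \\[1ex]
0, & n < p.
\end{cases} \tag{3.2}
\]
This may be viewed as an alternative definition of $\widehat{A}_N(a^{(p)})$.
We define $\widehat{\mathfrak{A}}$ as the linear span of $\{\widehat{A}_N(a^{(p)}) : p \in \mathbb{N},\ a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})\}$. Then $\widehat{\mathfrak{A}}$ is a $*$-algebra of closable operators on $\mathcal{F}_\pm^0$ (see Appendix A). We list some of its important properties, whose straightforward proofs we omit.
(i) $\widehat{A}_N(a^{(p)})^* = \widehat{A}_N\big((a^{(p)})^*\big)$.
(ii) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ and $b^{(q)} \in \mathcal{B}(\mathcal{H}_\pm^{(q)})$, then
\[
\widehat{A}_N(a^{(p)}) \, \widehat{A}_N(b^{(q)}) = \sum_{r=0}^{\min(p,q)} \binom{p}{r} \binom{q}{r} \frac{r!}{N^r} \, \widehat{A}_N\big(a^{(p)} \bullet_r b^{(q)}\big), \tag{3.3}
\]
where
\[
a^{(p)} \bullet_r b^{(q)} := P_\pm \big(a^{(p)} \otimes 1^{(q-r)}\big) \big(1^{(p-r)} \otimes b^{(q)}\big) P_\pm \in \mathcal{B}(\mathcal{H}_\pm^{(p+q-r)}). \tag{3.4}
\]
(iii) The operator $\widehat{A}_N(a^{(p)})$ leaves the $n$-particle subspaces $\mathcal{H}_\pm^{(n)}$ invariant.
(iv) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ and $b \in \mathcal{B}(\mathcal{H})$ is invertible, then
\[
\Gamma(b^{-1}) \, \widehat{A}_N(a^{(p)}) \, \Gamma(b) = \widehat{A}_N\big((b^{-1})^{\otimes p} a^{(p)} b^{\otimes p}\big), \tag{3.5}
\]
where $\Gamma(b)$ is defined on $\mathcal{H}_\pm^{(n)}$ by $b^{\otimes n}$.
(v) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ then
\[
\Big\| \widehat{A}_N(a^{(p)})\big|_{\mathcal{H}_\pm^{(n)}} \Big\| \le \Big(\frac{n}{N}\Big)^p \|a^{(p)}\|. \tag{3.6}
\]
Of course, on an appropriate dense domain, (3.3) holds for unbounded operators $a^{(p)}$ and $b^{(q)}$ too. We introduce the notation
\[
\big[a^{(p)}, b^{(q)}\big]_r := a^{(p)} \bullet_r b^{(q)} - b^{(q)} \bullet_r a^{(p)}. \tag{3.7}
\]
Note that $[a^{(p)}, b^{(q)}]_0 = 0$. Thus,
\[
\big[\widehat{A}_N(a^{(p)}), \widehat{A}_N(b^{(q)})\big] = \sum_{r=1}^{\min(p,q)} \binom{p}{r} \binom{q}{r} \frac{r!}{N^r} \, \widehat{A}_N\big([a^{(p)}, b^{(q)}]_r\big). \tag{3.8}
\]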
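As a quick consistency check of (3.3) and (3.8), offered here only as an illustrative sketch and not as part of the original argument, consider the simplest case $p = q = 1$. Then $a \bullet_1 b = ab$ and $[a, b]_1 = ab - ba$, so that
\[
\widehat{A}_N(a)\,\widehat{A}_N(b) = \widehat{A}_N(a \bullet_0 b) + \frac{1}{N}\,\widehat{A}_N(ab),
\qquad
\big[\widehat{A}_N(a), \widehat{A}_N(b)\big] = \frac{1}{N}\,\widehat{A}_N\big([a, b]\big).
\]
This agrees with (3.2): on $\mathcal{H}_\pm^{(n)}$ one has $\widehat{A}_N(a) = \frac{1}{N}\sum_{i=1}^n a_i$, and therefore $[\widehat{A}_N(a), \widehat{A}_N(b)] = \frac{1}{N^2}\sum_{i=1}^n [a, b]_i = \frac{1}{N}\widehat{A}_N([a,b])$, where $[a,b]_i$ denotes the ordinary commutator $[a,b]$ acting on the $i$-th particle.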
We now move on to discuss dynamics. Take a one-particle Hamiltonian $h^{(1)} \equiv h$ of the form $h = -\Delta + v$, where $\Delta$ is the Laplacian over $\mathbb{R}^3$ and $v$ is some real function. We denote by $V$ the multiplication operator $v(x)$. Two-body interactions are described by a real, even function $w$ on $\mathbb{R}^3$. This induces a two-particle operator $W^{(2)} \equiv W$ on $\mathcal{H}^{\otimes 2}$, defined as the multiplication operator $w(x_1 - x_2)$. We define the Hamiltonian
\[
\widehat{H}_N := \widehat{A}_N(h) + \frac{1}{2} \widehat{A}_N(W). \tag{3.9}
\]
Under suitable assumptions on $v$ and $w$ that we make precise in the following sections, one shows that $\widehat{H}_N$ is a well-defined self-adjoint operator on $\mathcal{F}_\pm$. It is convenient to introduce $H_N := N \widehat{H}_N$. On $\mathcal{H}_\pm^{(n)}$ we have the "first quantised" expression
\[
H_N \big|_{\mathcal{H}_\pm^{(n)}} = \sum_{i=1}^n h_i + \frac{1}{N} \sum_{1 \le i < j \le n} W_{ij} =: H_0 + \frac{1}{N} W, \tag{3.10}
\]
in self-explanatory notation.

4. Schwinger-Dyson Expansion and Loop Counting

Without loss of generality, we assume throughout the following that $t \ge 0$.
Let $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ and $w$ be bounded, i.e. $w \in L^\infty(\mathbb{R}^3)$. Using the fundamental theorem of calculus and the fact that the unitary group $(e^{-itH_0})_{t \in \mathbb{R}}$ is strongly differentiable one finds
\[
\begin{aligned}
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} \big|_{\mathcal{H}_\pm^{(n)}}
&= \Big[ e^{isH_N} e^{-isH_0} \, e^{itH_0} \widehat{A}_N(a^{(p)}) e^{-itH_0} \, e^{isH_0} e^{-isH_N} \Big]_{s=t} \Big|_{\mathcal{H}_\pm^{(n)}} \\
&= \widehat{A}_N(a_t^{(p)}) \big|_{\mathcal{H}_\pm^{(n)}} + \int_0^t ds \; e^{isH_N} e^{-isH_0} \, \frac{iN}{2} \big[\widehat{A}_N(W_s), \widehat{A}_N(a_t^{(p)})\big] \, e^{isH_0} e^{-isH_N} \Big|_{\mathcal{H}_\pm^{(n)}},
\end{aligned}
\]
where $(\cdot)_t := \Gamma(e^{ith}) (\cdot) \Gamma(e^{-ith})$ denotes free time evolution. As an equation between operators defined on $\mathcal{F}_\pm^0$, this reads
\[
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} = \widehat{A}_N(a_t^{(p)}) + \int_0^t ds \; e^{isH_N} e^{-isH_0} \, \frac{iN}{2} \big[\widehat{A}_N(W_s), \widehat{A}_N(a_t^{(p)})\big] \, e^{isH_0} e^{-isH_N}. \tag{4.1}
\]
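Iteration of (4.1) yields the formal power series
\[
\sum_{k=0}^\infty \int_{\Delta^k(t)} d\underline{t} \; \frac{(iN)^k}{2^k} \Big[\widehat{A}_N(W_{t_k}), \dots \big[\widehat{A}_N(W_{t_1}), \widehat{A}_N(a_t^{(p)})\big] \dots \Big]. \tag{4.2}
\]
It is easy to see that, on $\mathcal{H}_\pm^{(n)}$, the $k$th term of (4.2) is bounded in norm by
\[
\frac{\big(t n^2 \|w\|_\infty / N\big)^k}{k!} \Big(\frac{n}{N}\Big)^p \|a^{(p)}\|. \tag{4.3}
\]
Therefore, on $\mathcal{H}_\pm^{(n)}$, the series (4.2) converges in norm for all times. Furthermore, (4.3) implies that the remainder term arising from the iteration of (4.1) vanishes for $k \to \infty$, so that (4.2) is equal to the left side of (4.1).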
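The following Python sketch illustrates the bound (4.3) numerically; it is an illustration only, and the function names and parameter values are hypothetical choices, not taken from the paper. Summing (4.3) over $k$ gives the closed form $\exp(t n^2 \|w\|_\infty / N)\,(n/N)^p \|a^{(p)}\|$, which is finite for every fixed $n$ but grows exponentially in $N$ when $n = \nu N$.

```python
from math import exp


def crude_sum(t, n, N, w_sup, p, a_norm, k_max=400):
    """Sum over k = 0..k_max of the bound (4.3), built up term by term."""
    x = t * n**2 * w_sup / N
    term = (n / N) ** p * a_norm  # k = 0 term
    total = term
    for k in range(1, k_max + 1):
        term *= x / k  # turns x^(k-1)/(k-1)! into x^k/k!
        total += term
    return total


def crude_closed_form(t, n, N, w_sup, p, a_norm):
    """Closed form of the full sum: an exponential series."""
    return exp(t * n**2 * w_sup / N) * (n / N) ** p * a_norm


if __name__ == "__main__":
    # Hypothetical values: nu = 1 (n = N), t = 1, sup-norm of w equal to 1.
    for N in (5, 10, 20):
        print(N, crude_sum(1.0, N, N, 1.0, 1, 1.0))
```

With $n = N$ the summed bound behaves like $e^{tN\|w\|_\infty}$, so it carries no information that is uniform in $N$.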
Fig. 4.1. Two terms of the product $\widehat{A}_N(a_t^{(p)}) \, \widehat{A}_N(W_s)$, represented as labelled diagrams. A tree term (left) produces a tree diagram. A loop term (right) produces a diagram with one loop
The mean-field limit is the limit $n = \nu N \to \infty$, where $\nu > 0$ is some constant. The above estimate is clearly inadequate to prove statements about the mean-field limit. In order to obtain estimates uniform in $N$, more care is needed. To see why the above estimate is so crude, consider the commutator
\[
\frac{iN}{2} \big[\widehat{A}_N(W_s), \widehat{A}_N(a_t^{(p)})\big] \Big|_{\mathcal{H}_\pm^{(n)}} = \frac{p!}{N^p} \binom{n}{p} \frac{i}{N} \, P_\pm \Big[ \sum_{1 \le i < j \le n} W_{ij,s} \,,\; a_t^{(p)} \otimes 1^{(n-p)} \Big] P_\pm .
\]
We see that most terms of the commutator vanish (namely, whenever $p < i < j$). Thus, for large $n$, the above estimates are highly wasteful. This can be remedied by more careful bookkeeping. We split the commutator into two terms: the tree terms, defined by $1 \le i \le p$ and $p + 1 \le j \le n$, and the loop terms, defined by $1 \le i < j \le p$. All other terms vanish. This splitting can also be inferred from (3.8).
The naming originates from a diagrammatic representation (see Fig. 4.1). A $p$-particle operator is represented as a wiggly vertical line to which are attached $p$ horizontal branches on the left and $p$ horizontal branches on the right. Each branch on the left represents a creation operator $\hat\psi_N^*(x_i)$, and each branch on the right an annihilation operator $\hat\psi_N(y_i)$. The product $\widehat{A}_N(a^{(p)}) \, \widehat{A}_N(b^{(q)})$ of two operators is given by the sum over all possible pairings of the annihilation operators in $\widehat{A}_N(a^{(p)})$ with the creation operators in $\widehat{A}_N(b^{(q)})$. Such a contraction is graphically represented as a horizontal line joining the corresponding branches. We consider diagrams that arise in this manner from the multiplication of a finite number of operators of the form $\widehat{A}_N(a^{(p)})$.
We now generalise this idea to a systematic scheme for the multiple commutators appearing in the Schwinger-Dyson expansion. To this end, we decompose the multiple commutator
\[
\frac{(iN)^k}{2^k} \Big[\widehat{A}_N(W_{t_k}), \dots \big[\widehat{A}_N(W_{t_1}), \widehat{A}_N(a_t^{(p)})\big] \dots \Big]
\]
into a sum of $2^k$ terms obtained by writing out each commutator. Each resulting term is a product of $k + 1$ second-quantised operators, which we furthermore decompose into a sum over all possible contractions for which $r > 0$ in (3.3) (at least one contraction for each multiplication). The restriction $r > 0$ follows from $[a^{(p)}, b^{(q)}]_0 = 0$. This is equivalent to saying that all diagrams are connected.
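The splitting of the pair sum $\sum_{1 \le i < j \le n}$ into tree, loop and vanishing terms described above can be illustrated by a small enumeration. The Python sketch below is only an illustration; the function name and the example values of $n$ and $p$ are hypothetical.

```python
from itertools import combinations


def classify_pairs(n, p):
    """Classify the pairs 1 <= i < j <= n in the commutator with a p-particle operator.

    tree:      1 <= i <= p and p + 1 <= j <= n
    loop:      1 <= i < j <= p
    vanishing: p < i < j (W_ij commutes with a^(p) tensor 1^(n-p))
    """
    counts = {"tree": 0, "loop": 0, "vanishing": 0}
    for i, j in combinations(range(1, n + 1), 2):  # yields i < j
        if j <= p:
            counts["loop"] += 1
        elif i <= p:
            counts["tree"] += 1
        else:
            counts["vanishing"] += 1
    return counts


if __name__ == "__main__":
    print(classify_pairs(10, 3))
```

The counts are $p(n-p)$ tree terms, $\binom{p}{2}$ loop terms and $\binom{n-p}{2}$ vanishing terms, so for fixed $p$ and large $n$ almost all pairs drop out.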
We call the resulting terms elementary. The idea is to classify all elementary terms according to their number of loops $l$. Write
\[
\frac{(iN)^k}{2^k} \Big[\widehat{A}_N(W_{t_k}), \dots \big[\widehat{A}_N(W_{t_1}), \widehat{A}_N(a_t^{(p)})\big] \dots \Big] = \sum_{l=0}^k \frac{1}{N^l} \, \widehat{A}_N\Big(G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})\Big), \tag{4.4}
\]
where $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})$ is a $(p + k - l)$-particle operator, equal to the sum of all elementary terms with $l$ loops. It is defined through the recursion relation (on $\mathcal{H}_\pm^{(p+k-l)}$)
\[
\begin{aligned}
G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})
&= i (p + k - l - 1) \Big[W_{t_k}, G^{(k-1,l)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]_1 + i \binom{p + k - l}{2} \Big[W_{t_k}, G^{(k-1,l-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]_2 \\
&= i P_\pm \sum_{i=1}^{p+k-l-1} \Big[W_{i\,p+k-l,\,t_k}, G^{(k-1,l)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \otimes 1\Big] P_\pm
+ i P_\pm \sum_{1 \le i < j \le p+k-l} \Big[W_{ij,t_k}, G^{(k-1,l-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big] P_\pm ,
\end{aligned} \tag{4.5}
\]
as well as $G^{(0,0)}_t(a^{(p)}) := a_t^{(p)}$. If $l < 0$, $l > k$, or $p + k - l > n$, then $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = 0$. The interpretation of the recursion relation is simple: a $(k, l)$-term arises from either a $(k - 1, l)$-term without adding a loop or from a $(k - 1, l - 1)$-term to which a loop is added. It is not hard to see, using induction on $k$ and the definition (4.5), that (4.4) holds.
It is often convenient to have an explicit formula for the decomposition into elementary terms:
\[
G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = \sum_{\alpha=1}^{c(p,k,l)} G^{(k,l)(\alpha)}_{t,t_1,\dots,t_k}(a^{(p)}),
\]
where $G^{(k,l)(\alpha)}_{t,t_1,\dots,t_k}(a^{(p)})$ is an elementary term, and $c(p, k, l)$ is the number of elementary terms in $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})$.
In order to establish a one-to-one correspondence between elementary terms and diagrams, we introduce a labelling scheme for diagrams. Consider an elementary term arising from a choice of contractions in the multiple commutator of order $k$, along with its diagram. We label all vertical lines $v$ with an index $i_v \in \mathbb{N}$ as follows. The vertical line of $a^{(p)}$ is labelled by 0. The vertical line of the first (i.e. innermost in the multiple commutator) interaction operator is labelled by 1, of the second by 2, and so on (see Fig. 4.2). Conversely, every elementary term is uniquely determined by its labelled diagram. We consequently use $\alpha = 1, \dots, c(p, k, l)$ to index either elementary terms or labelled diagrams.
Use the shorthand $\underline{t} = (t_1, \dots, t_k)$ and define
\[
G^{(k,l)}_t(a^{(p)}) := \int_{\Delta^k(t)} d\underline{t} \; G^{(k,l)}_{t,\underline{t}}(a^{(p)}). \tag{4.6}
\]
In summary, we have an expansion in terms of the number of loops $l$:
\[
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} = \sum_{k=0}^\infty \sum_{l=0}^k \frac{1}{N^l} \, \widehat{A}_N\Big(G^{(k,l)}_t(a^{(p)})\Big), \tag{4.7}
\]
which converges in norm on $\mathcal{H}_\pm^{(n)}$, $n \in \mathbb{N}$, for all times $t$.

Fig. 4.2. The labelled diagram corresponding to a one-loop elementary term in the commutator of order 4
5. Convergence for Bounded Interaction

For a bounded interaction potential, $\|w\|_\infty < \infty$, it is now straightforward to control the mean-field limit.
Lemma 5.1. We have the bound
\[
\Big\| G^{(k,l)}_{t,\underline{t}}(a^{(p)}) \Big\| \le c(p, k, l) \, \|w\|_\infty^k \, \|a^{(p)}\|. \tag{5.1}
\]
Furthermore,
\[
c(p, k, l) \le 2^k \binom{k}{l} (p + k - l)^l \, (p + k - 1) \cdots p. \tag{5.2}
\]
Proof. Assume first that $l = 0$. Then the number of labelled diagrams is clearly given by $2^k p \cdots (p + k - 1)$. Now if there are $l$ loops, we may choose to add them at any $l$ of the $k$ steps when computing the multiple commutator. Furthermore, each addition of a loop produces at most $p + k - l$ times more elementary terms than the addition of a tree branch. Combining these observations, we arrive at the claimed bound for $c(p, k, l)$. Alternatively, it is a simple exercise to show the claim, with $c(p, k, l)$ replaced by the bound (5.2), by induction on $k$.
Lemma 5.2. Let $\nu > 0$ and $t < (8 \nu \|w\|_\infty)^{-1}$. Then, on $\mathcal{H}_\pm^{(\nu N)}$, the Schwinger-Dyson series (4.7) converges in norm, uniformly in $N$.
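Proof. Recall that $p + k - l \le n$ for nonvanishing $\widehat{A}_N\big(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\big)\big|_{\mathcal{H}_\pm^{(n)}}$. Using the symbol $I\{A\}$, defined as 1 if $A$ is true and 0 if $A$ is false, we find
\[
\begin{aligned}
\Big\| \sum_{k=0}^\infty \sum_{l=0}^k \frac{1}{N^l} \int_{\Delta^k(t)} d\underline{t} \; \widehat{A}_N\Big(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\Big) \Big|_{\mathcal{H}_\pm^{(\nu N)}} \Big\|
&\le \sum_{k=0}^\infty \sum_{l=0}^k \frac{(p + k - l)^l}{N^l} \, I\{p + k - l \le \nu N\} \binom{k}{l} \binom{p + k - 1}{k} k! \, \nu^{p+k-l} \|a^{(p)}\| \, (2 \|w\|_\infty t)^k \frac{1}{k!} \\
&\le \sum_{k=0}^\infty (8 \nu \|w\|_\infty t)^k (2\nu)^p \|a^{(p)}\| \\
&= \frac{1}{1 - 8 \nu \|w\|_\infty t} \, (2\nu)^p \|a^{(p)}\| ,
\end{aligned}
\]
where we used that $\sum_{l=0}^k \binom{k}{l} = 2^k$, and in particular $\binom{k}{l} \le 2^k$.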
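The counting bound (5.2) and the geometric bound in the proof of Lemma 5.2 can be checked numerically. The Python sketch below rests on an assumption that is not spelled out in the text: it counts elementary terms by reading them off the second form of the recursion (4.5), where each commutator contributes 2 terms, the tree sum has $p+k-l-1$ summands and the loop sum has $\binom{p+k-l}{2}$ summands. All function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_terms(p, k, l):
    """Number of terms obtained by writing out the recursion (4.5) (assumed counting)."""
    if l < 0 or l > k:
        return 0
    if k == 0:
        return 1
    m = p + k - l  # particle number of G^(k,l)
    tree = 2 * (m - 1) * count_terms(p, k - 1, l)
    loop = 2 * comb(m, 2) * count_terms(p, k - 1, l - 1)
    return tree + loop


def rising_factorial(p, k):
    """p (p + 1) ... (p + k - 1)."""
    out = 1
    for j in range(k):
        out *= p + j
    return out


def bound_5_2(p, k, l):
    """Right side of (5.2)."""
    return 2**k * comb(k, l) * (p + k - l) ** l * rising_factorial(p, k)


def termwise_series(p, nu, w_sup, t, a_norm=1.0, k_max=60):
    """Truncated double sum from the proof of Lemma 5.2, after using
    (p + k - l)^l / N^l <= nu^l on the support of the indicator."""
    total = 0.0
    for k in range(k_max + 1):
        for l in range(k + 1):
            total += comb(k, l) * comb(p + k - 1, k) * nu ** (p + k) * (2 * w_sup * t) ** k * a_norm
    return total


def geometric_bound(p, nu, w_sup, t, a_norm=1.0):
    """Final bound (2 nu)^p ||a|| / (1 - 8 nu ||w|| t) of Lemma 5.2."""
    x = 8 * nu * w_sup * t
    if not 0 <= x < 1:
        raise ValueError("requires t < 1 / (8 nu ||w||)")
    return (2 * nu) ** p * a_norm / (1 - x)
```

For $l = 0$ the count reproduces $2^k p \cdots (p+k-1)$ from the proof of Lemma 5.1, and for the sampled parameters the recursive count stays below (5.2).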
In the spirit of semi-classical expansions, we can rewrite the Schwinger-Dyson series to get a "$1/N$-expansion", whereby all $l$-loop terms add up to an operator of order $O(N^{-l})$.
Lemma 5.3. Let $t < (8 \nu \|w\|_\infty)^{-1}$ and $L \in \mathbb{N}$. Then we have on $\mathcal{H}_\pm^{(\nu N)}$
\[
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} = \sum_{l=0}^{L-1} \frac{1}{N^l} \sum_{k=l}^\infty \widehat{A}_N\Big(G^{(k,l)}_t(a^{(p)})\Big) + O\Big(\frac{1}{N^L}\Big),
\]
where the sum converges uniformly in $N$.
Proof. Instead of the full Schwinger-Dyson expansion (4.2), we can stop the expansion whenever $L$ loops have been generated. More precisely, we iterate (4.1) and use (3.8) at each iteration to split the commutator into tree ($r = 1$) and loop ($r = 2$) terms. Whenever a term obtained in this fashion has accumulated $L$ loops, we stop expanding and put it into a remainder term. Thus all fully expanded terms are precisely those arising from diagrams containing up to $L - 1$ loops, and it is not hard to show that the remainder term is of order $N^{-L}$.
In view of later applications, we also give a proof using the fully expanded Schwinger-Dyson series. From Lemma 5.2 we know that the sum converges on $\mathcal{H}_\pm^{(\nu N)}$ in norm, uniformly in $N$, and can be reordered as
\[
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} = \sum_{l=0}^\infty \frac{1}{N^l} \sum_{k=l}^\infty \int_{\Delta^k(t)} d\underline{t} \; \widehat{A}_N\Big(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\Big),
\]
as an identity on $\mathcal{H}_\pm^{(\nu N)}$. Proceeding as above we find
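\[
\begin{aligned}
\Big\| \sum_{l=L}^\infty \frac{1}{N^l} \sum_{k=l}^\infty \int_{\Delta^k(t)} d\underline{t} \; \widehat{A}_N\Big(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\Big) \Big|_{\mathcal{H}_\pm^{(\nu N)}} \Big\|
&\le \frac{1}{N^L} \sum_{l=L}^\infty \sum_{k=l}^\infty \frac{(p + k - l)^l}{N^{l-L}} \, I\{p + k - l \le \nu N\} \\
&\qquad \times \binom{k}{l} \binom{p + k - 1}{k} k! \, \nu^{p+k-l} \|a^{(p)}\| \, (2 \|w\|_\infty t)^k \frac{1}{k!} \\
&\le \frac{1}{(\nu N)^L} \sum_{l=L}^\infty \sum_{k=l}^\infty (p + k - l)^L (8 \nu \|w\|_\infty t)^k (2\nu)^p \|a^{(p)}\| \\
&= \frac{1}{(\nu N)^L} \sum_{l=L}^\infty \sum_{k=0}^\infty (p + k)^L (8 \nu \|w\|_\infty t)^{k+l} (2\nu)^p \|a^{(p)}\| \\
&\le \frac{1}{(\nu N)^L} \sum_{l=L}^\infty (8 \nu \|w\|_\infty t)^l \, \frac{e^p L!}{(1 - 8 \nu \|w\|_\infty t)^{L+1}} \, (2\nu)^p \|a^{(p)}\| \\
&= \frac{1}{(\nu N)^L} \, \frac{e^p L! \, (8 \nu \|w\|_\infty t)^L}{(1 - 8 \nu \|w\|_\infty t)^{L+2}} \, (2\nu)^p \|a^{(p)}\| ,
\end{aligned}
\]
where we used that $\sum_{k=0}^\infty (p + k)^L x^k \le \frac{e^p L!}{(1 - x)^{L+1}}$.

6. Convergence for Coulomb Interaction

In this section we consider an interaction potential of the form
\[
w(x) = \kappa \, \frac{1}{|x|}, \tag{6.1}
\]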
where $\kappa \in \mathbb{R}$. We take the one-body Hamiltonian to be $h = -\Delta$, the nonrelativistic kinetic energy without external potentials. We assume this form of $h$ and $w$ throughout Sects. 6 and 7. In Sect. 8, we discuss some generalisations.
6.1. Kato smoothing. The non-relativistic dispersive nature of the free time evolution $e^{it\Delta}$ is essential for controlling singular potentials. The key tool for all of the following is the Kato smoothing estimate:
\[
\int_{\mathbb{R}} \big\| |x|^{-1} e^{it\Delta} \psi \big\|^2 \, dt \le \pi \|\psi\|^2, \tag{6.2}
\]
where $\psi \in L^2(\mathbb{R}^3)$. Estimate (6.2) follows from Kato's theory of smooth perturbations; see [20,23]. In Sect. 8 we provide a proof of (6.2) (without the sharp constant $\pi$), for a larger class of interaction potentials, using Strichartz estimates.
In order to avoid tedious discussions of operator domains in equations such as (4.1), we introduce a cutoff to make the interaction potential bounded. For $\varepsilon \ge 0$ set
\[
w^\varepsilon(x) := w(x) \, I\{|w(x)| \le \varepsilon^{-1}\},
\]
so that $\|w^\varepsilon\|_\infty \le \varepsilon^{-1}$. Now (6.2) implies, for $\varepsilon \ge 0$,
\[
\int_{\mathbb{R}} \big\| w^\varepsilon e^{it\Delta} \psi \big\|^2 \, dt \le \int_{\mathbb{R}} \big\| w \, e^{it\Delta} \psi \big\|^2 \, dt \le \pi \kappa^2 \|\psi\|^2. \tag{6.3}
\]
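An immediate consequence is the following lemma.
Lemma 6.1. Let $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$. Then
\[
\int_{\mathbb{R}} \big\| W_{ij}^\varepsilon \, e^{-itH_0} \Phi^{(n)} \big\|^2 \, dt \le \frac{\pi \kappa^2}{2} \, \|\Phi^{(n)}\|^2. \tag{6.4}
\]
Proof. By symmetry we may assume that $(i, j) = (1, 2)$. Choose centre of mass coordinates $X := (x_1 + x_2)/2$ and $\xi = x_2 - x_1$, set $\tilde\Phi^{(n)}(X, \xi, x_3, \dots, x_n) := \Phi^{(n)}(x_1, \dots, x_n)$, and write
\[
\int_{\mathbb{R}} \big\| W_{12}^\varepsilon \, e^{-itH_0} \Phi^{(n)} \big\|^2 \, dt = \int_{\mathbb{R}} \big\| w^\varepsilon(\xi) \, e^{2it\Delta_\xi} \tilde\Phi^{(n)} \big\|^2 \, dt,
\]
since $H_0 = -\Delta_1 - \Delta_2 = -\Delta_X/2 - 2\Delta_\xi$ and $[\Delta_X, w^\varepsilon(\xi)] = 0$. Therefore, by (6.3) and Fubini's theorem, we find
\[
\begin{aligned}
\int_{\mathbb{R}} \big\| W_{12}^\varepsilon \, e^{-itH_0} \Phi^{(n)} \big\|^2 \, dt
&= \int dX \, dx_3 \cdots dx_n \int dt \int d\xi \; \big| w^\varepsilon(\xi) \, e^{2it\Delta_\xi} \tilde\Phi^{(n)}(X, \xi, x_3, \dots, x_n) \big|^2 \\
&\le \frac{\pi \kappa^2}{2} \int dX \, dx_3 \cdots dx_n \int d\xi \; \big| \tilde\Phi^{(n)}(X, \xi, x_3, \dots, x_n) \big|^2 \\
&= \frac{\pi \kappa^2}{2} \, \|\Phi^{(n)}\|^2 .
\end{aligned}
\]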
By Cauchy-Schwarz we then find that
\[
\int_0^t \big\| W_{ij,s}^\varepsilon \Phi^{(n)} \big\| \, ds \le t^{1/2} \Big( \int_{\mathbb{R}} \big\| W_{ij}^\varepsilon \, e^{-isH_0} \Phi^{(n)} \big\|^2 \, ds \Big)^{1/2} \le \Big( \frac{\pi \kappa^2 t}{2} \Big)^{1/2} \|\Phi^{(n)}\|. \tag{6.5}
\]
By iteration, this implies that, for all elementary terms $\alpha$,
\[
\int_0^t dt_1 \cdots \int_0^t dt_k \; \Big\| G^{(k,l)(\alpha),\varepsilon}_{t,\underline{t}}(a^{(p)}) \, \Phi^{(p+k-l)} \Big\| \le \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\| \, \|\Phi^{(p+k-l)}\|, \tag{6.6}
\]
where the superscript $\varepsilon$ reminds us that $G^{(k,l)(\alpha),\varepsilon}_{t,\underline{t}}(a^{(p)})$ is computed with the regularised potential $w^\varepsilon$. Thus one finds
\[
\Big\| G^{(k,l),\varepsilon}_t(a^{(p)}) \Big\| \le c(p, k, l) \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\|,
\]
for all $\varepsilon \ge 0$.
Unfortunately, the above procedure does not recover the factor $1/k!$ arising from the time-integration over the $k$-simplex $\Delta^k(t)$, which is essential for our convergence estimates. First iterating (6.4) and then using Cauchy-Schwarz yields a factor $1/\sqrt{k!}$, which is still not good enough. A solution to this problem must circumvent the highly wasteful procedure of replacing the integral over $\Delta^k(t)$ with an integral over $[0, t]^k$. The key observation is that, in the sum over all labelled diagrams, each diagram appears of the order of $k!$ times with different labellings.
6.2. Graph counting. In order to make the above idea precise, we make use of graphs (related to the above diagrams) to index terms in our expansion of the multiple commutator
\[
\frac{(iN)^k}{2^k} \Big[\widehat{A}_N(W_{t_k}), \dots \big[\widehat{A}_N(W_{t_1}), \widehat{A}_N(a_t^{(p)})\big] \dots \Big]. \tag{6.7}
\]
The idea is to assign to each second quantised operator a vertex $v = 0, \dots, k$, and to represent each creation and annihilation operator with an incident edge. A pairing of an annihilation operator with a creation operator is represented by joining the corresponding edges. The vertex 0 has $2p$ edges and the vertices $1, \dots, k$ have 4 edges. We call the vertex 0 the root. The edges incident to each vertex $v$ are labelled using a pair $\lambda = (d, i)$, where $d = a, c$ is the direction ($a$ stands for "annihilation" and $c$ for "creation") and $i$ labels edges of the same direction; $i = 1, \dots, p$ if $v = 0$ and $i = 1, 2$ if $v = 1, \dots, k$. Thus, a labelled edge is of the form $\{(v_1, \lambda_1), (v_2, \lambda_2)\}$. Graphs $\mathcal{G}$ with such labelled edges are graphs over the vertex set $V(\mathcal{G}) = \{(v, \lambda)\}$. We denote the set of edges of a graph $\mathcal{G}$ (a set of unordered pairs of vertices in $V(\mathcal{G})$) by $E(\mathcal{G})$. The degree of each $(v, \lambda)$ is either 0 or 1; we call $(v, \lambda)$ an empty edge of $v$ if its degree is 0. We often speak of connecting two empty edges, as well as removing a nonempty edge; the definitions are self-explanatory.
We may drop the edge labelling of $\mathcal{G}$ to obtain a (multi)graph $\tilde{\mathcal{G}}$ over the vertex set $\{0, \dots, k\}$: Each edge $\{(v_1, \lambda_1), (v_2, \lambda_2)\} \in E(\mathcal{G})$ gives rise to the edge $\{v_1, v_2\} \in E(\tilde{\mathcal{G}})$. We understand a path in $\mathcal{G}$ to be a sequence of edges in $E(\mathcal{G})$ such that two consecutive edges are adjacent in the graph $\tilde{\mathcal{G}}$. This leads to the notions of connectedness of $\mathcal{G}$ and loops in $\mathcal{G}$.
The admissible graphs – i.e. graphs indexing a choice of pairings in the multiple commutator (6.7) – are generated by the following "growth process". We start with the empty graph $\mathcal{G}_0$, i.e. $E(\mathcal{G}_0) = \emptyset$. In a first step, we choose one or two empty edges of 1 of the same direction and connect each of them to an empty edge of 0 of opposite direction. Next, we choose one or two empty edges of 2 of the same direction and connect each of them to an empty edge of 0 or 1 of opposite direction. We continue in this manner for all vertices $3, \dots, k$. We summarise some key properties of admissible graphs $\mathcal{G}$.
(a) $\mathcal{G}$ is connected.
(b) The degree of each $(v, \lambda)$ is either 0 or 1.
(c) The labelled edge $\{(v_1, \lambda_1), (v_2, \lambda_2)\} \in E(\mathcal{G})$ only if $\lambda_1$ and $\lambda_2$ have opposite directions.
Property (c) implies that each graph $\mathcal{G}$ has a canonical directed representative, where each edge is ordered from the $a$-label to the $c$-label. See Fig. 6.1 for an example of such a graph.
We call a graph $\mathcal{G}$ of type $(p, k, l)$ whenever it is admissible and it contains $l$ loops. We denote by $\mathfrak{G}(p, k, l)$ the set of graphs of type $(p, k, l)$.
Fig. 6.1. An admissible graph of type $(p = 4, k = 7, l = 3)$

By definition of admissible graphs, each contraction in (6.7) corresponds to a unique admissible graph. A contraction consists of at least $k$ and at most $2k$ pairings. A contraction giving rise to a graph of type $(p, k, l)$ has $k + l$ pairings. The summand in (6.7) corresponding to any given $l$-loop contraction is given by an elementary term of the form
\[
\frac{(iN)^k}{2^k N^{k+l}} \, \widehat{A}_N\big(b^{(p+k-l)}\big), \tag{6.8}
\]
where the $(p + k - l)$-particle operator $b^{(p+k-l)}$ is of the form
\[
b^{(p+k-l)} = P_\pm \, W_{i_1 j_1, t_{v_1}} \cdots W_{i_r j_r, t_{v_r}} \big( a_t^{(p)} \otimes 1^{(k-l)} \big) W_{i_{r+1} j_{r+1}, t_{v_{r+1}}} \cdots W_{i_k j_k, t_{v_k}} \, P_\pm , \tag{6.9}
\]
for some $r = 0, \dots, k$. Indeed, the (anti)commutation relations (A.2) imply that each pairing produces a factor of $1/N$. Furthermore, the creation and annihilation operators of each summand corresponding to any given contraction are (by definition) Wick ordered, and one readily sees that the associated integral kernel corresponds to an operator of the form (6.9). Thus we recover the splitting (4.4), whereby $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})$ is a sum, indexed by all $l$-loop graphs, of elementary terms of the form (6.9).
As remarked above, we need to exploit the fact that many graphs have the same topological structure, i.e. can be identified after some permutation of the labels $\{1, \dots, k\}$ of the vertices corresponding to interaction operators. We therefore define an equivalence relation on the set of graphs: $\mathcal{G} \sim \mathcal{G}'$ if and only if there exists a permutation $\sigma \in S_k$ such that $\mathcal{G}' = R_\sigma(\mathcal{G})$. Here $R_\sigma(\mathcal{G})$ is the graph defined by
\[
\{(v_1, \lambda_1), (v_2, \lambda_2)\} \in E(R_\sigma(\mathcal{G})) \iff \{(\sigma(v_1), \lambda_1), (\sigma(v_2), \lambda_2)\} \in E(\mathcal{G}),
\]
where $\sigma(0) \equiv 0$. We call equivalence classes $[\mathcal{G}]$ graph structures, and denote the set of graph structures of admissible graphs of type $(p, k, l)$ by $\mathfrak{Q}(p, k, l)$. Note that, in general, $R_\sigma(\mathcal{G})$ need not be admissible if $\mathcal{G}$ is admissible. It is convenient to increase $\mathfrak{G}(p, k, l)$ to include all $R_\sigma(\mathcal{G})$, where $\sigma \in S_k$ and $\mathcal{G}$ is admissible. In order to keep track of the admissible graphs in this larger set, we introduce the symbol $i_{\mathcal{G}}$ which is by definition 1 if $\mathcal{G} \in \mathfrak{G}(p, k, l)$ is admissible and 0 otherwise. Because $R_\sigma(\mathcal{G}) \ne \mathcal{G}$ if $\sigma \ne \mathrm{id}$,
\[
|\mathfrak{G}(p, k, l)| = k! \, |\mathfrak{Q}(p, k, l)|. \tag{6.10}
\]
Our goal is to find an upper bound on the number of graph structures of type $(p, k, l)$, which is sharp enough to show convergence of the Schwinger-Dyson series (4.2). Let us start with tree graphs: $l = 0$. In this case the number of graph structures is equal to $2^k$ times the number of ordered trees$^1$ with $k + 1$ vertices, whose root has at most $2p$ children and whose other vertices have at most 3 children. The factor $2^k$ arises from the fact that each vertex $v = 1, \dots, k$ can use either of the two empty edges of compatible direction to connect to its parent.
We thus need some basic facts about ordered trees, which are covered in the following (more or less standard) combinatorial digression. For $x, t \in \mathbb{R}$ and $n \in \mathbb{N}$ define
\[
A_n(x, t) := \frac{x}{x + nt} \binom{x + nt}{n} \tag{6.11}
\]
as well as $A_0(x, t) := 1$. After some juggling with binomial coefficients one finds
\[
\sum_{k=0}^n A_k(x, t) \, A_{n-k}(y, t) = A_n(x + y, t); \tag{6.12}
\]
see [12] for details. Therefore
\[
\sum_{n_1 + \cdots + n_r = n} A_{n_1}(x_1, t) \cdots A_{n_r}(x_r, t) = A_n(x_1 + \cdots + x_r, t). \tag{6.13}
\]
Set
\[
C_n^m := A_n(1, m) = \frac{1}{1 + nm} \binom{1 + nm}{n} = \frac{1}{n(m - 1) + 1} \binom{nm}{n}, \tag{6.14}
\]
the $n$th $m$-ary Catalan number. Thus we have
\[
\sum_{n_1 + \cdots + n_r = n} C_{n_1}^m \cdots C_{n_r}^m = \frac{r}{r + nm} \binom{r + nm}{n}. \tag{6.15}
\]
In particular,
\[
\sum_{n_1 + \cdots + n_m = n - 1} C_{n_1}^m \cdots C_{n_m}^m = C_n^m. \tag{6.16}
\]
Define an $m$-tree to be an ordered tree such that each vertex has at most $m$ children. The number of $m$-trees with $n$ vertices is equal to $C_n^m$. This follows immediately from $C_0^m = 1$ and from (6.16), which expresses that all trees of order $n$ are obtained by adding $m$ (possibly empty) subtrees of combined order $n - 1$ to the root.
We may now compute $|\mathfrak{Q}(p, k, 0)|$. Since the root of the tree has at most $2p$ children, we may express $|\mathfrak{Q}(p, k, 0)|$ as the number of ordered forests comprising $2p$ (possibly empty) 3-trees whose combined order is equal to $k$. Therefore, by (6.15),
\[
|\mathfrak{Q}(p, k, 0)| = 2^k \sum_{n_1 + \cdots + n_{2p} = k} C_{n_1}^3 \cdots C_{n_{2p}}^3 = 2^k \, \frac{2p}{2p + 3k} \binom{2p + 3k}{k}. \tag{6.17}
\]
Next, we extend this result to all values of $l$ in the form of an upper bound on $|\mathfrak{Q}(p, k, l)|$.

$^1$ An ordered tree is a rooted tree in which the children of each vertex are ordered.
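The identities (6.12) and (6.14)–(6.17) can be verified for small integer arguments with exact arithmetic. The Python sketch below is an illustration under stated assumptions: it restricts $x$ and $t$ to positive integers so that `math.comb` applies, and it counts $m$-trees directly through the recursive decomposition "root plus $m$ possibly empty subtrees" that underlies (6.16). All function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


def A(n, x, t):
    """A_n(x, t) from (6.11), for positive integers x and t."""
    if n == 0:
        return Fraction(1)
    return Fraction(x, x + n * t) * comb(x + n * t, n)


def fuss_catalan(n, m):
    """The n-th m-ary Catalan number C_n^m = A_n(1, m) of (6.14)."""
    return A(n, 1, m)


@lru_cache(maxsize=None)
def forest_count(r, n, m):
    """Ordered forests of r (possibly empty) m-trees with n vertices in total."""
    if r == 0:
        return 1 if n == 0 else 0
    return sum(tree_count(j, m) * forest_count(r - 1, n - j, m) for j in range(n + 1))


@lru_cache(maxsize=None)
def tree_count(n, m):
    """m-trees with n vertices: a root carrying m (possibly empty) subtrees."""
    if n == 0:
        return 1
    return forest_count(m, n - 1, m)


def tree_structures(p, k):
    """|Q(p, k, 0)| as in (6.17): 2^k times forests of 2p three-trees of order k."""
    return 2**k * forest_count(2 * p, k, 3)


if __name__ == "__main__":
    print([tree_count(n, 3) for n in range(6)])
    print(tree_structures(1, 2))
```

The direct count and the closed forms agree for the sampled values, which is the content of (6.15)–(6.17).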
Lemma 6.2. Let $p, k, l \in \mathbb{N}$. Then
\[
|\mathfrak{Q}(p, k, l)| \le 2^k \binom{k}{l} \binom{2p + 3k}{k} (p + k - l)^l. \tag{6.18}
\]
Proof. The idea is to remove edges from $\mathcal{G} \in \mathfrak{G}(p, k, l)$ to obtain a tree graph, and then use the special case (6.17). In addition to the properties (a) – (c) above, we need the following property of $\mathfrak{G}(p, k, l)$:
(d) If $\mathcal{G} \in \mathfrak{G}(p, k, l)$ then there exists a subset $V \subset \{1, \dots, k\}$ of size $l$ and a choice of direction $\delta : V \to \{a, c\}$ such that, for each $v \in V$, both edges of $v$ with direction $\delta(v)$ are nonempty. Denote by $E(v) \subset E(\mathcal{G})$ the set consisting of the two above edges. We additionally require that removing one of the two edges of $E(v)$ from $\mathcal{G}$, for each $v \in V$, yields a tree graph, with the property that, for each $v \in V$, the remaining edge of $E(v)$ is contained in the unique path connecting $v$ to the root.
This is an immediate consequence of the growth process for admissible graphs. The set $V$ corresponds to the set of vertices whose addition produces two edges. Note that property (d) is independent of the representative and consequently holds also for non-admissible $\mathcal{G} \in \mathfrak{G}(p, k, l)$.
Before coming to our main argument, we note that a tree graph $\mathcal{T} \in \mathfrak{G}(p, k, 0)$ gives rise to a natural lexicographical order on the vertex set $\{1, \dots, k\}$. Let $v \in \{1, \dots, k\}$. There is a unique path that connects $v$ to the root. Denote by $0 = v_1, v_2, \dots, v_q = v$ the sequence of vertices along this path. For each $j = 1, \dots, q - 1$, let $\lambda_j$ be the label of the edge $\{v_j, v_{j+1}\}$ at $v_j$. We assign to $v$ the string $S(v) := (\lambda_1, \dots, \lambda_{q-1})$. Choose some (fixed) ordering of the sets of labels $\{\lambda\}$, for each $v$. Then the set of vertices $\{1, \dots, k\}$ is ordered according to the lexicographical order of the string $S(v)$.
We now start removing loops from a given graph $\mathcal{G} \in \mathfrak{G}(p, k, l)$. Define $R_1$ as the graph obtained from $\mathcal{G}$ by removing all edges in $\bigcup_{v \in V} E(v)$. By property (d) above, $R_1$ is a forest comprising $l$ trees. Define $T_1$ as the connected component of $R_1$ containing the root.
Now we claim that there is at least one $v \in V$ such that both edges of $E(v)$ are incident to a vertex of $T_1$. Indeed, were this not the case, we could choose for each $v \in V$ an edge in $E(v)$ that is not incident to any vertex of $T_1$. Call $R_1'$ the graph obtained by adding all such edges to $R_1$. Now, since no vertex in $V$ is in the connected component $T_1$ of $R_1$, it follows that no vertex in $V$ is in the connected component of the root in $R_1'$. This is a contradiction to property (d) which requires that $R_1'$ should be a (connected) tree.
Let us therefore consider the set $\tilde V$ of all $v \in V$ such that both edges of $E(v)$ are incident to a vertex of $T_1$. We have shown that $\tilde V \ne \emptyset$. For each choice of $v$ and $e$, where $v \in \tilde V$ and $e \in E(v)$, we get a forest of $l - 1$ trees by adding $e$ to the edge set of $R_1$. Then $v$ is in the same tree as the root, so that each such choice of $v$ and $e$ yields a string $S(v)$ as described above. We choose $v_1$ and $e(v_1)$ as the unique couple that yields the smallest string (note that different choices have different strings). Finally, set $\mathcal{G}_1$ equal to $\mathcal{G}$ from which $e(v_1)$ has been removed, and $V_1 := V \setminus \{v_1\}$.
We have thus obtained an $(l - 1)$-loop graph $\mathcal{G}_1$ and a set $V_1$ of size $l - 1$, which together satisfy the property (d). We may therefore repeat the above procedure. In this manner we obtain the sequences $v_1, \dots, v_l$ and $\mathcal{G}_1, \dots, \mathcal{G}_l$. Note that $\mathcal{G}_l$ is obtained by removing the edges $e(v_1), \dots, e(v_l)$ from $\mathcal{G}$, and is consequently a tree graph. Also, by construction, the sequence $v_1, \dots, v_l$ is increasing in the lexicographical order of $\mathcal{G}_l$.
Next, consider the tree graph $\mathcal{G}_l$. Each edge $e(v_j)$ connects the single empty edge of $v_j$ with direction $\delta(v_j)$ with an empty edge of opposite direction of a vertex $v$, where $v$ is smaller than $v_j$ in the lexicographical order of $\mathcal{G}_l$. It is easy to see that, for each $j$, there are at most $(p + k - l)$ such connections.
We have thus shown that we can obtain any $\mathcal{G} \in \mathfrak{G}(p, k, l)$ by choosing some tree $\mathcal{G}_l \in \mathfrak{G}(p, k, 0)$, choosing $l$ elements $v_j$ out of $\{1, \dots, k\}$, ordering them lexicographically (according to the order of $\mathcal{G}_l$) and choosing an edge out of at most $(p + k - l)$ possibilities for $v_1, \dots, v_l$. Thus,
\[
|\mathfrak{G}(p, k, l)| \le \binom{k}{l} (p + k - l)^l \, |\mathfrak{G}(p, k, 0)|.
\]
The claim then follows from (6.10) and (6.17).
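6.3. Proof of convergence. We are now armed with everything we need in order to estimate $\int_{\Delta^k(t)} d\underline{t} \; G^{(k,l)}_{t,\underline{t}}(a^{(p)})$. Recall that
\[
G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = \frac{i^k}{2^k} \sum_{\mathcal{G} \in \mathfrak{G}(p,k,l)} i_{\mathcal{G}} \, G^{(k,l)(\mathcal{G})}_{t,t_1,\dots,t_k}(a^{(p)}), \tag{6.19}
\]
where $G^{(k,l)(\mathcal{G})}_{t,t_1,\dots,t_k}(a^{(p)})$ is an elementary term of the form (6.9) indexed by the graph $\mathcal{G}$. We rewrite this using graph structures. Pick some choice $\mathcal{P} : \mathfrak{Q}(p, k, l) \to \mathfrak{G}(p, k, l)$ of representatives. Then we get
\[
\begin{aligned}
G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})
&= \frac{i^k}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \sum_{\mathcal{G} \in Q} i_{\mathcal{G}} \, G^{(k,l)(\mathcal{G})}_{t,t_1,\dots,t_k}(a^{(p)}) \\
&= \frac{i^k}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \sum_{\sigma \in S_k} i_{R_\sigma(\mathcal{P}(Q))} \, G^{(k,l)(R_\sigma(\mathcal{P}(Q)))}_{t,t_1,\dots,t_k}(a^{(p)}).
\end{aligned}
\]
Now, by definition of $R_\sigma$, we see that
\[
G^{(k,l)(R_\sigma(\mathcal{G}))}_{t,t_1,\dots,t_k}(a^{(p)}) = G^{(k,l)(\mathcal{G})}_{t,t_{\sigma(1)},\dots,t_{\sigma(k)}}(a^{(p)}).
\]
Thus,
\[
\begin{aligned}
\int_{\Delta^k(t)} d\underline{t} \; G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})
&= \frac{i^k}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \sum_{\sigma \in S_k} i_{R_\sigma(\mathcal{P}(Q))} \int_{\Delta^k(t)} d\underline{t} \; G^{(k,l)(\mathcal{P}(Q))}_{t,t_{\sigma(1)},\dots,t_{\sigma(k)}}(a^{(p)}) \\
&= \frac{i^k}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \int_{\Delta^k_Q(t)} d\underline{t} \; G^{(k,l)(\mathcal{P}(Q))}_{t,t_1,\dots,t_k}(a^{(p)}),
\end{aligned}
\]
where
\[
\Delta^k_Q(t) := \{(t_1, \dots, t_k) : \exists \sigma \in S_k : i_{R_\sigma(\mathcal{P}(Q))} = 1,\ (t_{\sigma(1)}, \dots, t_{\sigma(k)}) \in \Delta^k(t)\} \subset [0, t]^k
\]
is a union of disjoint simplices.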
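Therefore, (6.5) and (6.9) imply, for any $\Phi^{(p+k-l)} \in \mathcal{H}_\pm^{(p+k-l)}$, that
\[
\begin{aligned}
\Big\| \int_{\Delta^k(t)} d\underline{t} \; G^{(k,l)}_{t,\underline{t}}(a^{(p)}) \, \Phi^{(p+k-l)} \Big\|
&\le \frac{1}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \Big\| \int_{\Delta^k_Q(t)} d\underline{t} \; G^{(k,l)(\mathcal{P}(Q))}_{t,t_1,\dots,t_k}(a^{(p)}) \, \Phi^{(p+k-l)} \Big\| \\
&\le \frac{1}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \int_{[0,t]^k} d\underline{t} \; \Big\| G^{(k,l)(\mathcal{P}(Q))}_{t,t_1,\dots,t_k}(a^{(p)}) \, \Phi^{(p+k-l)} \Big\| \\
&\le \frac{1}{2^k} \sum_{Q \in \mathfrak{Q}(p,k,l)} \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\| \, \|\Phi^{(p+k-l)}\| \\
&\le \binom{2p + 3k}{k} \binom{k}{l} (p + k - l)^l \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\| \, \|\Phi^{(p+k-l)}\| ,
\end{aligned}
\]
where the last inequality follows from Lemma 6.2. Of course, the above treatment remains valid for regularised potentials. We summarise:
\[
\Big\| G^{(k,l),\varepsilon}_t(a^{(p)}) \Big\| \le \binom{2p + 3k}{k} \binom{k}{l} (p + k - l)^l \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\|, \tag{6.20}
\]
for all $\varepsilon \ge 0$.
Using (6.20) we may now proceed exactly as in the case of a bounded interaction potential. Let
\[
\rho(\kappa, \nu) := \frac{1}{128 \pi \kappa^2 \nu^2}. \tag{6.21}
\]
The removal of the cutoff and summary of the results are contained in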
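Lemma 6.3. Let $t < \rho(\kappa, \nu)$. Then we have on $\mathcal{H}_\pm^{(\nu N)}$
\[
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} = \sum_{k=0}^\infty \sum_{l=0}^k \frac{1}{N^l} \, \widehat{A}_N\Big(G^{(k,l)}_t(a^{(p)})\Big), \tag{6.22}
\]
in operator norm, uniformly in $N$. Furthermore, for $L \in \mathbb{N}$, we have the $1/N$-expansion
\[
e^{itH_N} \widehat{A}_N(a^{(p)}) e^{-itH_N} = \sum_{l=0}^{L-1} \frac{1}{N^l} \sum_{k=l}^\infty \widehat{A}_N\Big(G^{(k,l)}_t(a^{(p)})\Big) + O\Big(\frac{1}{N^L}\Big), \tag{6.23}
\]
where the sum converges on $\mathcal{H}_\pm^{(\nu N)}$ uniformly in $N$.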
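The origin of the threshold $\rho(\kappa, \nu)$ can be traced through the following bookkeeping, given here as an illustrative sketch rather than as the authors' own computation. On $\mathcal{H}_\pm^{(\nu N)}$, combining (3.6) with $n = \nu N$, the bound (6.20), the elementary estimates $\binom{2p+3k}{k} \le 2^{2p+3k}$ and $\sum_{l=0}^k \binom{k}{l} = 2^k$, and $(p+k-l)^l/N^l \le \nu^l$ for nonvanishing terms, one finds for the $k$th term of (6.22)
\[
\begin{aligned}
\sum_{l=0}^k \frac{1}{N^l} \Big\| \widehat{A}_N\Big(G^{(k,l)}_t(a^{(p)})\Big) \Big|_{\mathcal{H}_\pm^{(\nu N)}} \Big\|
&\le \sum_{l=0}^k \frac{(p+k-l)^l}{N^l} \, I\{p+k-l \le \nu N\} \, \nu^{p+k-l} \binom{2p+3k}{k} \binom{k}{l} \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\| \\
&\le 2^{2p+3k} \, 2^k \, \nu^{p+k} \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\|
= (4\nu)^p \big( 128 \pi \kappa^2 \nu^2 t \big)^{k/2} \|a^{(p)}\|
= (4\nu)^p \Big( \frac{t}{\rho(\kappa, \nu)} \Big)^{k/2} \|a^{(p)}\| .
\end{aligned}
\]
The right side is the general term of a geometric series in $(t/\rho)^{1/2}$, which converges precisely when $t < \rho(\kappa, \nu)$.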
Proof. Using (6.20) we may repeat the proof of Lemma 5.3 to the letter to prove the statements about convergence. Thus (6.22) holds for all $\varepsilon > 0$. The proof of (6.22) for $\varepsilon = 0$ follows by approximation and is banished to Appendix B.

7. Mean-Field Limit

In this section we identify the mean-field dynamics as the dynamics given by the Hartree equation.
7.1. Hartree equation. The Hartree equation reads
\[
i \partial_t \psi = h \psi + (w * |\psi|^2) \psi. \tag{7.1}
\]
It is the equation of motion of a classical Hamiltonian system with phase space $\Gamma := H^1(\mathbb{R}^3)$. Here $H^1(\mathbb{R}^3)$ is the usual Sobolev space of index one. In analogy to $\widehat{A}_N$ we define $A$ as the map from closed operators on $\mathcal{H}_+^{(p)}$ to functions on phase space, through
\[
A(a^{(p)})(\psi) := \langle \psi^{\otimes p}, a^{(p)} \psi^{\otimes p} \rangle = \int dx_1 \cdots dx_p \, dy_1 \cdots dy_p \; \bar\psi(x_p) \cdots \bar\psi(x_1) \, a^{(p)}(x_1, \dots, x_p; y_1, \dots, y_p) \, \psi(y_1) \cdots \psi(y_p).
\]
We define the space of "observables" $\mathfrak{A}$ as the linear hull of $\{A(a^{(p)}) : p \in \mathbb{N},\ a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})\}$.
The Hamilton function is given by
\[
H := A(h) + \frac{1}{2} A(W),
\]
i.e.
\[
H(\psi) = \int dx \, |\nabla \psi|^2 + \frac{1}{2} \int dx \, (w * |\psi|^2) |\psi|^2 = \langle \psi, h \psi \rangle + \frac{1}{2} \langle \psi^{\otimes 2}, W \psi^{\otimes 2} \rangle. \tag{7.2}
\]
Using the Hardy-Littlewood-Sobolev and Sobolev inequalities (see e.g. [13]) one sees that $H(\psi)$ is well-defined on $\Gamma$:
\[
\int dx \, dy \; \frac{|\psi(x)|^2 |\psi(y)|^2}{|x - y|} \lesssim \big\| |\psi|^2 \big\|_{6/5}^2 = \|\psi\|_{12/5}^4 \lesssim \|\psi\|_{H^1}^4,
\]
where the symbol $\lesssim$ means the left side is bounded by the right side multiplied by a positive constant that is independent of $\psi$.
The Hartree equation is equivalent to
\[
i \partial_t \psi = \partial_{\bar\psi} H(\psi).
\]
The symplectic form on $\Gamma$ is given by
\[
\omega = i \int dx \; d\bar\psi(x) \wedge d\psi(x),
\]
which induces a Poisson bracket given by
\[
\{\psi(x), \bar\psi(y)\} = i \delta(x - y), \qquad \{\psi(x), \psi(y)\} = \{\bar\psi(x), \bar\psi(y)\} = 0.
\]
For $A, B \in \mathfrak{A}$ we have that
\[
\{A, B\}(\psi) = i \int dx \, \Big[ \partial_\psi A(\psi) \, \partial_{\bar\psi} B(\psi) - \partial_\psi B(\psi) \, \partial_{\bar\psi} A(\psi) \Big].
\]
The "mass" function
\[
N(\psi) := \int dx \, |\psi|^2
\]
is the generator of the gauge transformations $\psi \mapsto e^{-i\theta} \psi$. By the gauge invariance of the Hamiltonian, $\{H, N\} = 0$, we conclude, at least formally, that $N$ is a conserved quantity. Similarly, the energy $H$ is formally conserved.
The space of observables $\mathfrak{A}$ has the following properties:
(i) $\overline{A(a^{(p)})} = A\big((a^{(p)})^*\big)$.
(ii) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$ and $b \in \mathcal{B}(\mathcal{H})$, then $A(a^{(p)})(b\psi) = A\big((b^*)^{\otimes p} a^{(p)} b^{\otimes p}\big)(\psi)$.
(iii) If $a^{(p)}$ and $b^{(q)}$ are $p$- and $q$-particle operators, respectively, then
\[
\big\{A(a^{(p)}), A(b^{(q)})\big\} = i p q \, A\big([a^{(p)}, b^{(q)}]_1\big). \tag{7.3}
\]
(iv) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$, then
\[
\big| A(a^{(p)})(\psi) \big| \le \|a^{(p)}\| \, \|\psi\|^{2p}. \tag{7.4}
\]
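As an illustrative check of property (iii), not taken from the original text, consider $p = q = 1$ with bounded one-particle operators $a$ and $b$. Then $A(a)(\psi) = \langle \psi, a \psi \rangle$, so that $\partial_{\bar\psi(x)} A(a)(\psi) = (a\psi)(x)$ and $\partial_{\psi(x)} A(a)(\psi) = \overline{(a^* \psi)(x)}$. Inserting this into the formula for the Poisson bracket on $\mathfrak{A}$ gives
\[
\{A(a), A(b)\}(\psi) = i \int dx \, \Big[ \overline{(a^*\psi)(x)} \, (b\psi)(x) - \overline{(b^*\psi)(x)} \, (a\psi)(x) \Big] = i \langle \psi, (ab - ba) \psi \rangle = i \, A([a, b])(\psi),
\]
which agrees with (7.3) because $[a, b]_1 = ab - ba$ when $p = q = 1$.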
The free time evolution
\[
\phi_0^t(\psi) := e^{-ith} \psi
\]
is the Hamiltonian flow corresponding to the free Hamilton function $A(h)$. We abbreviate the free time evolution of observables $A \in \mathfrak{A}$ by $A_t := A \circ \phi_0^t$. Thus, $A(a^{(p)})_t = A(a_t^{(p)})$.
In order to define the Hamiltonian flow on all of $L^2(\mathbb{R}^3)$, we rewrite the Hartree equation (7.1) with initial data $\psi(0) = \psi$ as an integral equation
\[
\psi(t) = e^{-ith} \psi - i \int_0^t ds \; e^{-i(t-s)h} (w * |\psi(s)|^2) \psi(s). \tag{7.5}
\]
Lemma 7.1. Let $\psi \in L^2(\mathbb{R}^3)$. Then (7.5) has a unique global solution $\psi(\cdot) \in C(\mathbb{R}; L^2(\mathbb{R}^3))$, which depends continuously on the initial data $\psi$. Furthermore, $\|\psi(t)\| = \|\psi\|$ for all $t$. Finally, we have a Schwinger-Dyson expansion for observables: Let $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$, $\nu > 0$ and $t < \rho(\kappa, \nu)$. Then
\[
\begin{aligned}
A(a^{(p)})(\psi(t))
&= \sum_{k=0}^\infty A\Big(G^{(k,0)}_t(a^{(p)})\Big)(\psi) \\
&= \sum_{k=0}^\infty \int_{\Delta^k(t)} d\underline{t} \; \frac{1}{2^k} \Big\{A(W_{t_k}), \dots \big\{A(W_{t_1}), A(a_t^{(p)})\big\} \dots \Big\}(\psi),
\end{aligned} \tag{7.6}
\]
uniformly in the ball $B_\nu := \{\psi \in L^2(\mathbb{R}^3) : \|\psi\|^2 \le \nu\}$.
Proof. The well-posedness of (7.5) is a well-known result; see for instance [4,24]. The remaining statements follow from a "tree expansion", which also yields an existence result. We first use the Schwinger-Dyson expansion to construct an evolution on the space of observables. We then show that this evolution stems from a Hamiltonian flow that satisfies the Hartree equation (7.5).
First, we generalise our class of "observables" to functions that are not gauge invariant, i.e. that correspond to bounded operators $a^{(q,p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}; \mathcal{H}_+^{(q)})$. We set $A(a^{(q,p)})(\psi) := \langle \psi^{\otimes q}, a^{(q,p)} \psi^{\otimes p} \rangle$, and denote by $\tilde{\mathfrak{A}}$ the linear hull of observables of the form $A(a^{(q,p)})$ with $a^{(q,p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}; \mathcal{H}_+^{(q)})$.
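It is convenient to introduce the abbreviations
\[
G := \{A(h), \,\cdot\,\}, \qquad D := \frac{1}{2} \{A(W), \,\cdot\,\}.
\]
Then $e^{Gt}$ is well-defined on $\tilde{\mathfrak{A}}$ through $(e^{Gt} A)(\psi) = A(e^{-ith} \psi)$, where $A \in \tilde{\mathfrak{A}}$. Note also that
\[
D_s := e^{Gs} D e^{-Gs} = \frac{1}{2} \{A(W_s), \,\cdot\,\}.
\]
Let $A \in \tilde{\mathfrak{A}}$. We use the Schwinger-Dyson series for $e^{(G+D)t}$ to define the flow $S(t)A$ through
\[
\begin{aligned}
S(t) A
&:= \sum_{k=0}^\infty \int_{\Delta^k(t)} d\underline{t} \; D_{t_k} \cdots D_{t_1} e^{Gt} A \\
&= \sum_{k=0}^\infty \int_{\Delta^k(t)} d\underline{t} \; \frac{1}{2^k} \Big\{A(W_{t_k}), \dots \big\{A(W_{t_1}), A_t\big\} \dots \Big\}.
\end{aligned} \tag{7.7}
\]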
Our first task is to show convergence of (7.7) for small times. Let $A = A(a^{(q,p)})$. As with (7.3) one finds, after a short computation, that
$$\frac{1}{2}\{A(W), A(a^{(q,p)})\} = A\Big(i\sum_{i=1}^{q} W_{i\,q+1}\,(a^{(q,p)} \otimes 1) - i\sum_{i=1}^{p} (a^{(q,p)} \otimes 1)\,W_{i\,p+1}\Big). \tag{7.8}$$
Thus we see that the nested Poisson brackets in (7.7) yield a “tree expansion” which may be described as follows. Define $T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})$ recursively through
$$T_t^{(0)}(a^{(q,p)}) := a_t^{(q,p)},$$
$$T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)}) := iP_+ \sum_{i=1}^{q+k-1} W_{i\,q+k,\,t_k}\Big(T^{(k-1)}_{t,t_1,\dots,t_{k-1}}(a^{(q,p)}) \otimes 1\Big)P_+ \;-\; iP_+ \sum_{i=1}^{p+k-1} \Big(T^{(k-1)}_{t,t_1,\dots,t_{k-1}}(a^{(q,p)}) \otimes 1\Big)W_{i\,p+k,\,t_k}\,P_+ .$$
Note that $T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})$ is an operator from $\mathcal{H}_+^{(p+k)}$ to $\mathcal{H}_+^{(q+k)}$. Moreover, (7.8) implies that
$$\frac{1}{2^k}\big\{A(W_{t_k}), \dots\big\{A(W_{t_1}), A(a_t^{(q,p)})\big\}\dots\big\} = A\big(T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})\big). \tag{7.9}$$
Also, by definition, we see that for gauge-invariant observables $a^{(p)}$ we have $T^{(k)}_{t,t_1,\dots,t_k}(a^{(p)}) = G^{(k,0)}_{t,t_1,\dots,t_k}(a^{(p)})$.

We may use the methods of Sect. 6 to obtain the desired estimate. One sees that $T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})$ is a sum of elementary terms, indexed by labelled ordered trees, whose root has degree at most $p + q$, and whose other vertices have at most 3 children. From (6.15) we find that there are
$$\frac{p+q}{p+q+3k}\binom{p+q+3k}{k}$$
unlabelled trees of this kind. Proceeding exactly as in Sect. 6 we find that
$$\int_{\Delta^k(t)} d\underline{t}\; \big\|T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})\,\Phi^{(p+k)}\big\| \le \binom{p+q+3k}{k}\Big(\frac{\pi\kappa^2 t}{2}\Big)^{k/2}\|a^{(q,p)}\|\,\|\Phi^{(p+k)}\|,$$
where $\Phi^{(p+k)} \in \mathcal{H}_+^{(p+k)}$. Let $\psi \in L^2(\mathbb{R}^3)$ with $\|\psi\|^2 \le \nu$. Then $|A(a^{(q,p)})(\psi)| \le \|a^{(q,p)}\|\,\|\psi\|^{p+q}$ implies
$$\Big|\frac{1}{2^k}\int_{\Delta^k(t)} d\underline{t}\; \big\{A(W_{t_k}), \dots\big\{A(W_{t_1}), A(a_t^{(q,p)})\big\}\dots\big\}(\psi)\Big| \le \binom{p+q+3k}{k}\Big(\frac{\pi\kappa^2 t}{2}\Big)^{k/2}\|a^{(q,p)}\|\,\nu^{k+(p+q)/2}. \tag{7.10}$$
Convergence of the Schwinger-Dyson series (7.7) for small times $t$ follows immediately. Thus, for small times $t$, the flow $S(t)$ is well-defined on $\widetilde{\mathfrak{A}}$, and it is easy to check that it satisfies the equation
$$S(t)A = e^{Gt}A + \int_0^t ds\; S(s)\,D\,e^{G(t-s)}A, \tag{7.11}$$
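for all $A \in \widetilde{\mathfrak{A}}$.

In order to establish a link with the Hartree equation (7.5), we consider $f \in L^2(\mathbb{R}^3)$ and define the function $F_f \in \widetilde{\mathfrak{A}}$ through $F_f(\psi) := \langle f, \psi\rangle$. Clearly, the mapping $f \mapsto (S(t)F_f)(\psi)$ is antilinear and (7.10) implies that it is bounded. Thus there exists a unique vector $\psi(t)$ such that $(S(t)F_f)(\psi) =: \langle f, \psi(t)\rangle$.

We now proceed to show that $(S(t)A)(\psi) = A(\psi(t))$ for all $A \in \widetilde{\mathfrak{A}}$. By definition, this is true for $A = F_f$. As a first step, we show that
$$S(t)(AB) = (S(t)A)(S(t)B), \tag{7.12}$$
where $A, B \in \widetilde{\mathfrak{A}}$. Write
$$S(t)(AB) = \sum_{k=0}^{\infty}\int_{\Delta^k(t)} d\underline{t}\; D_{t_k}\cdots D_{t_1}\,e^{Gt}(AB) = \sum_{k=0}^{\infty}\int_{\Delta^k(t)} d\underline{t}\; D_{t_k}\cdots D_{t_1}(A_t B_t),$$
where we used $e^{Gt}(AB) = (e^{Gt}A)(e^{Gt}B)$. We now claim that
$$\int_{\Delta^k(t)} d\underline{t}\; D_{t_k}\cdots D_{t_1}(A_t B_t) = \sum_{l+m=k}\int_{\Delta^l(t)} d\underline{t}\int_{\Delta^m(t)} d\underline{s}\; \big(D_{t_l}\cdots D_{t_1}A_t\big)\big(D_{s_m}\cdots D_{s_1}B_t\big), \tag{7.13}$$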
where the sum ranges over $l, m \ge 0$. This follows easily by induction on $k$ and using $D_s(AB) = A(D_sB) + (D_sA)B$. Then (7.12) follows immediately.

Next, we note that (7.12) implies that $(S(t)A)(\psi) = A(\psi(t))$, whenever $A$ is of the form $A = A(a^{(q,p)})$, where
$$a^{(q,p)} = \sum_j P_+\,\big|f_1^j \otimes \cdots \otimes f_q^j\big\rangle\big\langle g_1^j \otimes \cdots \otimes g_p^j\big|\,P_+, \tag{7.14}$$
where the sum is finite, and $f_i^j, g_i^j \in L^2(\mathbb{R}^3)$. Now each $a^{(q,p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}; \mathcal{H}_+^{(q)})$ can be written as the weak operator limit of a sequence $(a_n^{(q,p)})_{n\in\mathbb{N}}$ of operators of type (7.14). One sees immediately that
$$\lim_n A(a_n^{(q,p)})(\psi(t)) = A(a^{(q,p)})(\psi(t)).$$
On the other hand, uniform boundedness implies that $\sup_n \|a_n^{(q,p)}\| < \infty$, so that
$$\Big|\Big\langle \psi^{\otimes(q+k)},\, W_{i_1j_1,t_{v_1}}\cdots W_{i_rj_r,t_{v_r}}\big(a_n^{(q,p)} \otimes 1^{(k)}\big)W_{i_{r+1}j_{r+1},t_{v_{r+1}}}\cdots W_{i_kj_k,t_{v_k}}\,\psi^{\otimes(p+k)}\Big\rangle\Big|$$
$$\le \|a_n^{(q,p)}\|\;\big\|W_{i_rj_r,t_{v_r}}\cdots W_{i_1j_1,t_{v_1}}\,\psi^{\otimes(q+k)}\big\|\;\big\|W_{i_{r+1}j_{r+1},t_{v_{r+1}}}\cdots W_{i_kj_k,t_{v_k}}\,\psi^{\otimes(p+k)}\big\|$$
justifies the use of dominated convergence in
$$\lim_n \big(S(t)A(a_n^{(q,p)})\big)(\psi) = \big(S(t)A(a^{(q,p)})\big)(\psi).$$
We have thus shown that
$$(S(t)A)(\psi) = A(\psi(t)), \qquad \forall A \in \widetilde{\mathfrak{A}}. \tag{7.15}$$
Let us now return to (7.11). Setting $A = F_f$, we find that (7.11) implies
$$\langle f, \psi(t)\rangle = \langle f, e^{-ith}\psi\rangle + \int_0^t ds\; \Big(S(s)\,\frac{1}{2}\{A(W), (F_f)_{t-s}\}\Big)(\psi) = \langle f, e^{-ith}\psi\rangle + \int_0^t ds\; \frac{1}{2}\{A(W), (F_f)_{t-s}\}(\psi(s)),$$
where we used (7.15). Using (7.8) we thus find
$$\langle f, \psi(t)\rangle = \langle f, e^{-ith}\psi\rangle - i\int_0^t ds\; \big\langle (e^{ih(t-s)}f) \otimes \psi(s),\, W\,\psi(s) \otimes \psi(s)\big\rangle, \tag{7.16}$$
which is exactly the Hartree equation (7.5) projected onto $f$. We have thus shown that $\psi(t)$ as defined above solves the Hartree equation.

To show norm-conservation we abbreviate $F(s) := (w * |\psi(s)|^2)\psi(s)$ and write, using (7.5),
$$\|\psi(t)\|^2 - \|\psi\|^2 = i\int_0^t ds\; \Big(\langle F(s), e^{-ish}\psi\rangle - \langle e^{-ish}\psi, F(s)\rangle\Big) + \int_0^t ds\int_0^t dr\; \big\langle e^{ish}F(s), e^{irh}F(r)\big\rangle.$$
The last term is equal to
$$\int_0^t ds\int_0^s dr\; \Big(\big\langle e^{ish}F(s), e^{irh}F(r)\big\rangle + \big\langle e^{irh}F(r), e^{ish}F(s)\big\rangle\Big).$$
Therefore (7.5) implies that
$$\|\psi(t)\|^2 - \|\psi\|^2 = i\int_0^t ds\; \langle F(s), \psi(s)\rangle - i\int_0^t ds\; \langle \psi(s), F(s)\rangle = 0,$$
since $\langle F(s), \psi(s)\rangle \in \mathbb{R}$, as can be seen by explicit calculation. Thus we can iterate the above existence result for short times to obtain a global solution.

Furthermore, (7.16) implies that $\psi(t)$ is weakly continuous in $t$. Since the norm of $\psi(t)$ is conserved, $\psi(t)$ is strongly continuous in $t$. Similarly, the Schwinger-Dyson expansion (7.7) implies that the map $\psi \mapsto \psi(t)$ is weakly continuous for small times, uniformly in $\psi$ in compacts. Therefore, the map $\psi \mapsto \psi(t)$ is weakly continuous for all times $t$, and norm-conservation implies that it is strongly continuous.

7.2. Wick quantisation. In order to state our main result in a general setting, we briefly discuss how the many-body quantum mechanics of bosons can be viewed as a deformation quantisation of the (classical) Hartree theory. The deformation parameter (the analogue of $\hbar$ in the usual quantisation of classical theories) is $1/N$. We define quantisation as the linear map $\widehat{(\cdot)}_N : \mathfrak{A} \to \widehat{\mathfrak{A}}$ defined by the formal replacement $\psi(x) \mapsto \widehat{\psi}_N(x)$ and $\bar\psi(x) \mapsto \widehat{\psi}_N^*(x)$ followed by Wick ordering. In other words,
$$\widehat{(\cdot)}_N : A(a^{(p)}) \mapsto \widehat{A}_N(a^{(p)}).$$
Extending the definition of $\widehat{(\cdot)}_N$ to unbounded operators in the obvious way, we see that $\widehat{H}_N$ is the quantisation of $H$. Note that (3.3) and (7.3) imply, for $A, B \in \mathfrak{A}$,
$$\big[\widehat{A}_N, \widehat{B}_N\big] = \frac{N^{-1}}{i}\,\widehat{\{A, B\}}_N + O\Big(\frac{1}{N^2}\Big),$$
so that $1/N$ is indeed the deformation parameter of $\widehat{(\cdot)}_N$.

7.3. Mean-field limit: An Egorov-type result. Let $\phi^t$ denote the Hamiltonian flow of the Hartree equation on $L^2(\mathbb{R}^3)$. Introduce the short-hand notation
$$\alpha^t A := A \circ \phi^t, \quad A \in \mathfrak{A}, \qquad \widehat{\alpha}^t \widehat{A} := e^{itN\widehat{H}_N}\,\widehat{A}\,e^{-itN\widehat{H}_N}, \quad \widehat{A} \in \widehat{\mathfrak{A}}.$$
We may now state and prove our main result, which essentially says that, in the mean-field limit $n = \nu N \to \infty$, time evolution and quantisation commute.

Theorem 7.2. Let $A \in \mathfrak{A}$, $\nu > 0$, and $\varepsilon > 0$. Then there exists a function $A(t) \in \mathfrak{A}$ such that
$$\sup_{t\in\mathbb{R}}\big\|\alpha^t A - A(t)\big\|_{L^\infty(B_\nu)} \le \varepsilon,$$
as well as
$$\Big\|\Big(\widehat{\alpha}^t \widehat{A}_N - \widehat{A(t)}_N\Big)\Big|_{\mathcal{H}_+^{(\nu N)}}\Big\| \le \varepsilon + \frac{C(\varepsilon, \nu, t, A)}{N}.$$
Remark. The “intermediate function” $A(t)$ is required, since the full time evolution $\alpha^t$ does not leave $\mathfrak{A}$ invariant.

Proof. Most of the work has already been done in the previous sections. Without loss of generality take $A = A(a^{(p)})$ for some $p \in \mathbb{N}$ and $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$. Assume that $t < \rho(\kappa, \nu)$. Taking $L = 1$ in (6.23) we get
$$\widehat{\alpha}^t \widehat{A}_N(a^{(p)})\Big|_{\mathcal{H}_+^{(\nu N)}} = \sum_{k=0}^{\infty} \widehat{A}_N\big(G_t^{(k,0)}(a^{(p)})\big)\Big|_{\mathcal{H}_+^{(\nu N)}} + O\Big(\frac{1}{N}\Big). \tag{7.17}$$
Comparing this with (7.6) immediately yields
$$\widehat{\alpha}^t \widehat{A}_N(a^{(p)}) = \widehat{\big(\alpha^t A(a^{(p)})\big)}_N + O\Big(\frac{1}{N}\Big)$$
on $\mathcal{H}_+^{(\nu N)}$, where $\widehat{(\alpha^t A(a^{(p)}))}_N$ is defined through its norm-convergent power series. This is the statement of the theorem for short times. The extension to all times follows from an iteration argument. We postpone the details to the proof of Theorem 7.3 below. In its notation $A(t)$ is given by
$$A(t) = \sum_{k_1=0}^{K_1-1}\cdots\sum_{k_m=0}^{K_m-1} A\big(G_\tau^{(k_m,0)}\,G_\tau^{(k_{m-1},0)}\cdots G_\tau^{(k_1,0)}\,a^{(p)}\big).$$
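The result may also be expressed in terms of coherent states.

Theorem 7.3. Let $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$, $\psi \in L^2(\mathbb{R}^3)$ with $\|\psi\| = 1$, and $T > 0$. Then there exist constants $C, \beta > 0$, depending only on $p$, $T$ and $\kappa$, such that
$$\Big|\big\langle \psi^{\otimes N},\, e^{itH_N}\,\widehat{A}_N(a^{(p)})\,e^{-itH_N}\,\psi^{\otimes N}\big\rangle - \big\langle \psi(t)^{\otimes p},\, a^{(p)}\,\psi(t)^{\otimes p}\big\rangle\Big| \le \frac{C}{N^\beta}\,\|a^{(p)}\|, \qquad t \in [0, T]. \tag{7.18}$$
Here $\psi(t)$ is the solution to the Hartree equation (7.5) with initial data $\psi$.

Proof. Introduce a cutoff $K \in \mathbb{N}$ and write (in self-explanatory notation)
$$\widehat{\alpha}^\tau \widehat{A}_N(a^{(p)}) = \sum_{k=0}^{K-1} \widehat{A}_N\big(G_\tau^{(k,0)}(a^{(p)})\big) + \widehat{\alpha}^\tau_{\ge K}\,\widehat{A}_N(a^{(p)}) + \frac{1}{N}\,R_{N,\tau}(a^{(p)}), \tag{7.19}$$
$$\alpha^\tau A(a^{(p)}) = \sum_{k=0}^{K-1} A\big(G_\tau^{(k,0)}(a^{(p)})\big) + \alpha^\tau_{\ge K}\,A(a^{(p)}). \tag{7.20}$$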
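To avoid cluttering the notation, from now on we drop the parentheses of the linear map $G_\tau^{(k,0)}$. We iterate (7.19) $m$ times by applying it to its first term and get
$$(\widehat{\alpha}^\tau)^m \widehat{A}_N(a^{(p)}) = \sum_{k_1=0}^{K_1-1}\cdots\sum_{k_m=0}^{K_m-1} \widehat{A}_N\big(G_\tau^{(k_m,0)}\,G_\tau^{(k_{m-1},0)}\cdots G_\tau^{(k_1,0)}\,a^{(p)}\big)$$
$$\quad + (\widehat{\alpha}^\tau)^{m-1}\,\widehat{\alpha}^\tau_{\ge K_1}\widehat{A}_N(a^{(p)}) + \sum_{j=1}^{m-1}\sum_{k_1=0}^{K_1-1}\cdots\sum_{k_j=0}^{K_j-1} (\widehat{\alpha}^\tau)^{m-1-j}\,\widehat{\alpha}^\tau_{\ge K_{j+1}}\widehat{A}_N\big(G_\tau^{(k_j,0)}\,G_\tau^{(k_{j-1},0)}\cdots G_\tau^{(k_1,0)}\,a^{(p)}\big)$$
$$\quad + \frac{1}{N}(\widehat{\alpha}^\tau)^{m-1}R_{N,\tau}(a^{(p)}) + \frac{1}{N}\sum_{j=1}^{m-1}\sum_{k_1=0}^{K_1-1}\cdots\sum_{k_j=0}^{K_j-1} (\widehat{\alpha}^\tau)^{m-1-j}\,R_{N,\tau}\big(G_\tau^{(k_j,0)}\cdots G_\tau^{(k_1,0)}\,a^{(p)}\big). \tag{7.21}$$
A similar expression without the third line holds for $(\alpha^\tau)^m A(a^{(p)})$. In order to control this somewhat unpleasant expression, we abbreviate
$$x := \frac{\tau}{\rho(\kappa, 1)}.$$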
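Assume that $x < 1$. Then (6.20) and (6.23) imply the estimates, valid on $\mathcal{H}_+^{(N)}$,
$$\big\|G_\tau^{(k,0)}a^{(p)}\big\| \le 4^p\,\|a^{(p)}\|\,x^k, \qquad \big\|\widehat{\alpha}^\tau_{\ge K}\,\widehat{A}_N(a^{(p)})\big\| \le 4^p\,\|a^{(p)}\|\,\frac{x^K}{1-x}, \qquad \big\|R_{N,\tau}(a^{(p)})\big\| \le (4e)^p\,\|a^{(p)}\|\,\frac{x}{(1-x)^3}.$$
Furthermore, (7.6) implies that
$$\big\|\alpha^\tau_{\ge K}\,A(a^{(p)})\big\|_{L^\infty(B_1)} \le 4^p\,\|a^{(p)}\|\,\frac{x^K}{1-x}.$$
We also need
$$\Big|\big\langle \psi^{\otimes N}, \widehat{A}_N(a^{(p)})\,\psi^{\otimes N}\big\rangle - A(a^{(p)})(\psi)\Big| = \Big|\frac{N\cdots(N-p+1)}{N^p} - 1\Big|\,\big|A(a^{(p)})(\psi)\big| \le \sum_{j=1}^{p-1}\Big|\frac{N\cdots(N-j)}{N^{j+1}} - \frac{N\cdots(N-j+1)}{N^j}\Big|\,\|a^{(p)}\| \le \frac{p^2}{N}\,\|a^{(p)}\|. \tag{7.22}$$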
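The combinatorial factor in (7.22) can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `falling_ratio`, `wick_defect` and `check_bound` are hypothetical, and exact rational arithmetic is used so that the comparison with $p^2/N$ is free of rounding error.

```python
from fractions import Fraction


def falling_ratio(N: int, p: int) -> Fraction:
    """Return N(N-1)...(N-p+1) / N^p as an exact rational number."""
    r = Fraction(1)
    for j in range(p):
        r *= Fraction(N - j, N)
    return r


def wick_defect(N: int, p: int) -> Fraction:
    """Return |N(N-1)...(N-p+1)/N^p - 1|, the prefactor appearing in (7.22)."""
    return abs(falling_ratio(N, p) - 1)


def check_bound(N_max: int = 60, p_max: int = 8) -> bool:
    """Check wick_defect(N, p) <= p^2 / N for all 1 <= p <= min(p_max, N), N <= N_max."""
    for N in range(1, N_max + 1):
        for p in range(1, min(p_max, N) + 1):
            if wick_defect(N, p) > Fraction(p * p, N):
                return False
    return True
```

For instance, `wick_defect(10, 3)` is the exact deviation of $10\cdot 9\cdot 8/10^3$ from 1, which is well below the bound $9/10$.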
Armed with these estimates we may now complete the proof of Theorem 7.3. Suppose that $1/2 \le x < 1$. Then
$$\sum_{k_1=0}^{K_1-1}\cdots\sum_{k_m=0}^{K_m-1}\Big|\big\langle \psi^{\otimes N}, \widehat{A}_N\big(G_\tau^{(k_m,0)}G_\tau^{(k_{m-1},0)}\cdots G_\tau^{(k_1,0)}a^{(p)}\big)\psi^{\otimes N}\big\rangle - A\big(G_\tau^{(k_m,0)}G_\tau^{(k_{m-1},0)}\cdots G_\tau^{(k_1,0)}a^{(p)}\big)(\psi)\Big|$$
$$\le \frac{1}{N}\,(p + K_1 + \cdots + K_m)^2\;4^{m(p+K_1+\cdots+K_m)}\,\|a^{(p)}\|.$$
Similarly, the second line of (7.21) on $\mathcal{H}_+^{(N)}$ and its classical equivalent on $B_1$ are bounded by
$$\sum_{j=1}^{m} x^{K_j}\,4^{j(p+K_1+\cdots+K_{j-1})}\,\|a^{(p)}\|.$$
Finally, the last line of (7.21) on $\mathcal{H}_+^{(N)}$ is bounded by
$$\frac{1}{N}\sum_{j=1}^{m} 4^{(j+1)(p+K_1+\cdots+K_{j-1})}\,\|a^{(p)}\|.$$
Now pick $m$ large enough that $T \le m\tau$. Then it is easy to check that there exist $a_1, \dots, a_m$ such that setting
$$K_j = a_j \log N, \qquad j = 1, \dots, m,$$
implies that the three above expressions are all bounded by $C N^{-\beta}\|a^{(p)}\|$, for some $\beta > 0$. This remains of course true for all $m' \le m$. Since any time $t \le T$ can be reached by at most $m$ iterations with $1/2 \le x < 1$, the claim follows.

We conclude with a short discussion on density matrices. First we recall some standard results; see for instance [18]. Let $\Gamma \in \mathcal{L}^1$, where $\mathcal{L}^1$ is the space of trace class operators on some Hilbert space. Equipped with the norm $\|\Gamma\|_1 := \mathrm{Tr}\,|\Gamma|$, $\mathcal{L}^1$ is a Banach space. Its dual is equal to $\mathcal{B}$, the space of bounded operators, and the dual pairing is given by
$$\langle A, \Gamma\rangle = \mathrm{Tr}(A\Gamma), \qquad A \in \mathcal{B},\ \Gamma \in \mathcal{L}^1.$$
Therefore,
$$\|\Gamma\|_1 = \sup_{A\in\mathcal{B},\ \|A\|\le 1} |\mathrm{Tr}(A\Gamma)|. \tag{7.23}$$
Consider an $N$-particle density matrix $0 \le \Gamma_N \in \mathcal{L}^1(\mathcal{H}_+^{(N)})$ that satisfies $\mathrm{Tr}\,\Gamma_N = 1$ and is symmetric in the sense that $\Gamma_N P_+ = \Gamma_N$. Define the $p$-particle marginals
$$\Gamma_N^{(p)} := \mathrm{Tr}_{p+1,\dots,N}\,\Gamma_N,$$
where $\mathrm{Tr}_{p+1,\dots,N}$ denotes the partial trace over the coordinates $p+1, \dots, N$. Define furthermore
$$\Gamma_N(t) = e^{-itH_N}\,\Gamma_N\,e^{itH_N},$$
as well as the $p$-particle marginals $\Gamma_N^{(p)}(t)$ of $\Gamma_N(t)$. Noting that
$$\mathrm{Tr}\big(\widehat{A}_N(a^{(p)})\,\Gamma_N(t)\big) = \frac{p!}{N^p}\binom{N}{p}\,\mathrm{Tr}\big(a^{(p)}\,\Gamma_N^{(p)}(t)\big) = \mathrm{Tr}\big(a^{(p)}\,\Gamma_N^{(p)}(t)\big) + O\Big(\frac{1}{N}\Big),$$
we see that (7.23) and Theorem 7.3 imply

Corollary 7.4. Let $\psi \in \mathcal{H}$ with $\|\psi\| = 1$, and let $\psi(t)$ be the solution of (7.5) with initial data $\psi$. Set $\Gamma_N := (|\psi\rangle\langle\psi|)^{\otimes N}$. Then, for any $p \in \mathbb{N}$ and $T > 0$ there exist constants $C, \beta > 0$, depending only on $p$, $T$ and $\kappa$, such that
$$\big\|\Gamma_N^{(p)}(t) - (|\psi(t)\rangle\langle\psi(t)|)^{\otimes p}\big\|_1 \le \frac{C}{N^\beta}, \qquad t \in [0, T].$$

Remark. Actually it is enough for $\Gamma_N$ to factorise asymptotically. If $(\Gamma_N)_{N\in\mathbb{N}}$ is a sequence of symmetric density matrices satisfying
$$\lim_{N\to\infty}\big\|\Gamma_N^{(1)} - |\psi\rangle\langle\psi|\big\|_1 = 0,$$
then one finds
$$\lim_{N\to\infty}\big\|\Gamma_N^{(1)}(t) - |\psi(t)\rangle\langle\psi(t)|\big\|_1 = 0, \qquad t \in \mathbb{R}.$$
This is a straightforward corollary of the proof of Theorem 7.3. By an argument of Lieb and Seiringer (see the remark after Theorem 1 in [14]), this implies that
$$\lim_{N\to\infty}\big\|\Gamma_N^{(p)}(t) - (|\psi(t)\rangle\langle\psi(t)|)^{\otimes p}\big\|_1 = 0, \qquad t \in \mathbb{R},$$
for all $p$.

8. Some Generalisations

In this section we generalise our results to a larger class of interaction potentials, and allow an external potential. For this we need Strichartz estimates for Lorentz spaces. We start with a short summary of the relevant results (see [1,11]). For $1 \le q \le \infty$ and $0 < \theta < 1$ we define the real interpolation functor $(\cdot, \cdot)_{\theta,q}$ as follows. Let $A_0$ and $A_1$ be two Banach spaces contained in some larger Banach space $A$. Define the real interpolation norm
$$\|a\|_{(A_0,A_1)_{\theta,q}} := \begin{cases} \Big(\int_0^\infty \big(t^{-\theta}K(t,a)\big)^q\,dt/t\Big)^{1/q}, & q < \infty, \\ \sup_{t\ge 0}\, t^{-\theta}K(t,a), & q = \infty, \end{cases}$$
where
$$K(t, a) := \inf_{a = a_0 + a_1}\big(\|a_0\|_{A_0} + t\,\|a_1\|_{A_1}\big).$$
Define $(A_0, A_1)_{\theta,q}$ as the space of $a \in A$ such that $\|a\|_{(A_0,A_1)_{\theta,q}} < \infty$. Then $(A_0, A_1)_{\theta,q}$ is a Banach space. The Lorentz space $L^{p,q}(\mathbb{R}^3, dx) \equiv L^{p,q}$ is defined by interpolation as
$$L^{p,q} := (L^{p_0}, L^{p_1})_{\theta,q},$$
where $1 \le p_0, p_1 \le \infty$, $p_0 \ne p_1$, and
$$\frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}.$$
Lorentz spaces have the following properties that are of interest to us. First, $L^{p,p} = L^p$. Second, $L^{p,\infty} = L^p_w$, where $L^p_w$ is the weak $L^p$ space (see e.g. [1,19]). In particular, we have for the Coulomb potential in 3 dimensions
$$\frac{1}{|x|} \in L^{3,\infty}.$$
Finally, Lorentz spaces satisfy a general Hölder inequality (see [17]): Let $1 < p, p_1, p_2 < \infty$ and $1 \le q, q_1, q_2 \le \infty$ satisfy
$$\frac{1}{p_1} + \frac{1}{p_2} = \frac{1}{p}, \qquad \frac{1}{q_1} + \frac{1}{q_2} = \frac{1}{q}.$$
Then we have
$$\|fg\|_{L^{p,q}} \lesssim \|f\|_{L^{p_1,q_1}}\,\|g\|_{L^{p_2,q_2}}. \tag{8.1}$$
We need an endpoint homogeneous Strichartz estimate proved in [11]. For a map $f : \mathbb{R} \to L^{p,q}$ we define the space-time norm
$$\|f\|_{L^r_t L^{p,q}_x} := \Big(\int dt\; \|f(t)\|^r_{L^{p,q}}\Big)^{1/r}.$$
Then Theorem 10.1 of [11] implies that
$$\big\|e^{it\Delta}f\big\|_{L^r_t L^{p,2}_x} \lesssim \|f\|_{L^2}, \tag{8.2}$$
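whenever $2 \le r < \infty$ and
$$\frac{2}{r} + \frac{3}{p} = \frac{3}{2}.$$
We are now set for proving a generalisation of (6.2).

Lemma 8.1. Let $w \in L^3_w + L^\infty$. Then there is a constant $C = C(w) > 0$, such that
$$\int_0^1 \big\|w\,e^{it\Delta}\psi\big\|^2\,dt \le C\,\|\psi\|^2.$$

Proof. Let $w = w_1 + w_2$ with $w_1 \in L^\infty$ and $w_2 \in L^3_w$. Then
$$\big\|w\,e^{it\Delta}\psi\big\|_{L^2_tL^2_x} \le \big\|w_1\,e^{it\Delta}\psi\big\|_{L^2_tL^2_x} + \big\|w_2\,e^{it\Delta}\psi\big\|_{L^2_tL^2_x}.$$
The first term is bounded by $\|w_1\|_{L^\infty}\|\psi\|_{L^2}$. To bound the second we use (8.1) and (8.2) with $r = 2$ and $p = 6$ to get
$$\big\|w_2\,e^{it\Delta}\psi\big\|_{L^2_tL^2_x} \lesssim \|w_2\|_{L^{3,\infty}}\,\big\|e^{it\Delta}\psi\big\|_{L^2_tL^{6,2}_x} \lesssim \|w_2\|_{L^{3,\infty}}\,\|\psi\|_{L^2}.$$
Therefore,
$$\big\|w\,e^{it\Delta}\psi\big\|_{L^2_tL^2_x} \le \sqrt{C(w)}\,\|\psi\|_{L^2}.$$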
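The exponent bookkeeping in this proof is easy to get wrong, so here is a small sketch that checks it with exact fractions. It only verifies the arithmetic conditions stated with (8.1) and (8.2), not the inequalities themselves; the function names are hypothetical, and `None` is used as a stand-in for the exponent $\infty$.

```python
from fractions import Fraction


def inv(p):
    """Return 1/p as an exact fraction, with the convention 1/infinity = 0 (p=None)."""
    return Fraction(0) if p is None else Fraction(1, p)


def strichartz_admissible(r, p) -> bool:
    """Check the condition stated with (8.2): 2 <= r < infinity and 2/r + 3/p = 3/2."""
    return r is not None and r >= 2 and 2 * inv(r) + 3 * inv(p) == Fraction(3, 2)


def holder_target(p1, q1, p2, q2):
    """Return (1/p, 1/q) for the product space L^{p,q} in the Hoelder inequality (8.1)."""
    return inv(p1) + inv(p2), inv(q1) + inv(q2)


# Exponents used in the proof of Lemma 8.1: w_2 in L^{3,infinity}, e^{it Delta} psi in L^{6,2}.
proof_pair_ok = strichartz_admissible(2, 6)
product_space = holder_target(3, None, 6, 2)  # expected to be (1/2, 1/2), i.e. L^{2,2} = L^2
```

With $r = 2$, $p = 6$ the admissibility condition reads $1 + 1/2 = 3/2$, and the Hölder exponents give $1/3 + 1/6 = 1/2$ and $0 + 1/2 = 1/2$, so the product lands in $L^{2,2} = L^2$ as the proof requires.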
Now let us assume that $v, w \in L^\infty + L^3_w$. Set $H_0|_{\mathcal{H}_\pm^{(n)}} := \sum_{i=1}^{n} -\Delta_i$. Then the required generalisation of Lemma 6.1 is

Lemma 8.2. There exists a constant $C \equiv C(w, v)$ such that
$$\int_0^1 \big\|W_{ij}\,e^{-itH_0}\Phi^{(n)}\big\|^2\,dt \le C\,\|\Phi^{(n)}\|^2, \qquad \int_0^1 \big\|V_i\,e^{-itH_0}\Phi^{(n)}\big\|^2\,dt \le C\,\|\Phi^{(n)}\|^2,$$
where $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$.

Proof. The claim for $V$ follows immediately from Lemma 8.1. The estimate for $W$ follows similarly by using centre of mass coordinates.

Finally, we briefly discuss the changes to the combinatorics arising from an external potential. We classify the elementary terms according to the numbers $(k, l, m)$, where $k$ is the order of the multiple commutator, $l$ is the number of loops, and $m$ is the number of $V$-operators. Thus, instead of (4.5), we have the recursive definition
$$G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)}) = i(p+k-l-m-1)\Big[W_{t_k}, G^{(k-1,l,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]_1 + i\binom{p+k-l-m}{2}\Big[W_{t_k}, G^{(k-1,l-1,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]_2 + i(p+k-l-m)\Big[V_{t_k}, G^{(k-1,l,m-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]_1$$
$$= iP_\pm\sum_{i=1}^{p+k-l-m-1}\Big[W_{i\,p+k-l-m,\,t_k},\, G^{(k-1,l,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \otimes 1\Big]P_\pm + iP_\pm\sum_{1\le i<j\le p+k-l-m}\Big[W_{ij,t_k},\, G^{(k-1,l-1,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]P_\pm + iP_\pm\sum_{i=1}^{p+k-l-m}\Big[V_{i,t_k},\, G^{(k-1,l,m-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Big]P_\pm,$$
as well as $G_t^{(0,0,0)}(a^{(p)}) := a_t^{(p)}$. We also set $G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)}) = 0$ unless $0 \le l \le k - m$. It is again an easy exercise to show by induction on $k$ that
$$\frac{(iN)^k}{2^k}\Big[\widehat{A}_N(W_{t_k}), \dots\Big[\widehat{A}_N(W_{t_1}), \widehat{A}_N(a_t^{(p)})\Big]\dots\Big] = \sum_{l=0}^{k}\sum_{m=0}^{k-l}\frac{1}{N^l}\,\widehat{A}_N\Big(G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)})\Big).$$
Note that $G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)})$ is a $(p+k-l-m)$-particle operator.

The graphs of Sect. 6 have to be modified: Each vertex corresponding to a $V$-operator has one edge for each direction $d = a, c$ (see Fig. 8.1). Let us first consider tree graphs, $l = 0$. Take the set of trees without an external potential as in Sect. 6. By allowing each vertex $v = 1, \dots, k$ whose edges $(a, 2)$ and $(c, 2)$ are empty to stand for either an interaction potential $W$ or an external potential $V$, we count all trees with an external potential. Thus, for a given $m$, there are at most $\binom{k}{m}\,|\mathcal{G}(p, k, 0)|$ tree graphs contributing to $G^{(k,0,m)}_{t,t_1,\dots,t_k}(a^{(p)})$. If $l > 0$ we repeat the argument in the proof of Lemma 6.2, and find that the number of graph structures contributing to $G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)})$ is bounded by
$$2^k\binom{k}{m}\binom{k}{l}\binom{2p+3k}{k}\,(p+k-l-m)^l.$$

Fig. 8.1. An admissible graph of type ($p = 4$, $k = 7$, $l = 2$, $m = 2$)

Putting all this together, we find that
$$\big\|G_t^{(k,l,m)}(a^{(p)})\big\| \le \binom{k}{m}\binom{k}{l}\binom{2p+3k}{k}\,(p+k-l-m)^l\,(Ct)^{k/2}\,\|a^{(p)}\|.$$
Using the condition $p + k - l - m \le n$, it is then easy to see that all convergence estimates remain valid with the additional factor $2^k$. In summary, all of the results of Sects. 6 and 7 hold if $v, w \in L^3_w + L^\infty$.

A. Second Quantisation

We briefly summarise the main ingredients of many-body quantum mechanics and second quantisation. See for instance [2] for an extensive discussion. Let $\mathcal{H} = L^2(\mathbb{R}^d, dx)$ be the “one-particle Hilbert space”, where $d \in \mathbb{N}$. Many-body quantum mechanics is formulated on subspaces of the $n$-particle spaces $\mathcal{H}^{\otimes n}$. Let $P_\pm^{(n)} \equiv P_\pm$ be the orthogonal projector onto the symmetric/antisymmetric subspace of $\mathcal{H}^{\otimes n}$, i.e.
$$(P_\pm\Phi^{(n)})(x_1, \dots, x_n) := \frac{1}{n!}\sum_{\sigma\in S_n}(\pm 1)^{|\sigma|}\,\Phi^{(n)}(x_{\sigma(1)}, \dots, x_{\sigma(n)}),$$
where $|\sigma|$ denotes the number of transpositions in the permutation $\sigma$, and $\Phi^{(n)} \in \mathcal{H}^{\otimes n}$. We define the bosonic $n$-particle space as $\mathcal{H}_+^{(n)} := P_+\mathcal{H}^{\otimes n}$, and the fermionic $n$-particle space as $\mathcal{H}_-^{(n)} := P_-\mathcal{H}^{\otimes n}$. We adopt the usual convention that $\mathcal{H}^{\otimes 0} = \mathbb{C}$.
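The definition of $P_\pm$ can be illustrated on a finite-dimensional toy model. The sketch below is an assumption-laden illustration rather than anything from the paper: the one-particle space is replaced by $\mathbb{C}^d$ with basis labels $0, \dots, d-1$, an $n$-particle vector is stored as a dictionary from index tuples to coefficients, and the names `sign` and `project` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def sign(perm) -> int:
    """Sign of a permutation (given as a tuple), computed from its inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def project(phi: dict, n: int, d: int, bosonic: bool = True) -> dict:
    """Apply P_+ (bosonic=True) or P_- (bosonic=False) to an n-particle tensor.

    phi maps index tuples (x_1, ..., x_n), each x_i in range(d), to coefficients.
    Implements (P Phi)(x) = (1/n!) * sum over sigma of (+-1)^{|sigma|} Phi(x_sigma(1), ..., x_sigma(n)).
    """
    out = {}
    for x in product(range(d), repeat=n):
        total = Fraction(0)
        for sigma in permutations(range(n)):
            s = 1 if bosonic else sign(sigma)
            total += s * phi.get(tuple(x[i] for i in sigma), 0)
        total /= factorial(n)
        if total != 0:
            out[x] = total  # store only non-zero components
    return out
```

Applying `project` twice gives the same result as applying it once, as expected of a projector, and with $d = 2$ the antisymmetric projection of any three-particle vector vanishes, since three fermions cannot occupy two one-particle states.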
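We introduce the Fock space
$$\mathcal{F}_\pm(\mathcal{H}) \equiv \mathcal{F}_\pm := \bigoplus_{n=0}^{\infty}\mathcal{H}_\pm^{(n)}.$$
A state $\Phi \in \mathcal{F}_\pm$ is a sequence $\Phi = (\Phi^{(n)})_{n=0}^{\infty}$, where $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$. Equipped with the scalar product
$$\langle\Phi, \Psi\rangle = \sum_{n=0}^{\infty}\big\langle\Phi^{(n)}, \Psi^{(n)}\big\rangle,$$
$\mathcal{F}_\pm$ is a Hilbert space. The vector $\Omega := (1, 0, 0, \dots)$ is called the vacuum. By a slight abuse of notation, we denote a vector of the form $\Phi = (0, \dots, 0, \Phi^{(n)}, 0, \dots) \in \mathcal{F}_\pm$ by its non-vanishing $n$-particle component $\Phi^{(n)}$. Define also the subspace of vectors with a finite particle number
$$\mathcal{F}_\pm^0 := \{\Phi \in \mathcal{F}_\pm : \Phi^{(n)} = 0 \text{ for all but finitely many } n\}.$$
On $\mathcal{F}_\pm$ we have the usual creation and annihilation operators, $\widehat{\psi}^*$ and $\widehat{\psi}$, which map the one-particle space $\mathcal{H}$ into densely defined closable operators on $\mathcal{F}_\pm$. For $f \in \mathcal{H}$ and $\Phi \in \mathcal{F}_\pm$, they are defined by
$$\big(\widehat{\psi}^*(f)\Phi\big)^{(n)}(x_1, \dots, x_n) := \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\pm 1)^{i-1}f(x_i)\,\Phi^{(n-1)}(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n),$$
$$\big(\widehat{\psi}(f)\Phi\big)^{(n)}(x_1, \dots, x_n) := \sqrt{n+1}\int dy\;\bar f(y)\,\Phi^{(n+1)}(y, x_1, \dots, x_n).$$
It is not hard to see that $\widehat{\psi}(f)$ and $\widehat{\psi}^*(f)$ are adjoints of each other (see for instance [2] for details). Furthermore, they satisfy the canonical (anti)commutation relations
$$\big[\widehat{\psi}(f), \widehat{\psi}^*(g)\big]_\mp = \langle f, g\rangle\,1, \qquad \big[\widehat{\psi}^\#(f), \widehat{\psi}^\#(g)\big]_\mp = 0, \tag{A.1}$$
where $[A, B]_\mp := AB \mp BA$, and $\widehat{\psi}^\# = \widehat{\psi}^*$ or $\widehat{\psi}$. In order to simplify notation, we usually identify $c1$ with $c$, where $c \in \mathbb{C}$.

For our purposes, it is more natural to work with the rescaled creation and annihilation operators
$$\widehat{\psi}_N^\# := \frac{1}{\sqrt{N}}\,\widehat{\psi}^\#,$$
where $N > 0$. We also introduce the operator-valued distributions defined formally by
$$\widehat{\psi}_N^\#(x) := \widehat{\psi}_N^\#(\delta_x),$$
where $\delta_x$ is the delta function at $x$. The formal expression $\widehat{\psi}_N^\#(x)$ has a rigorous meaning as a densely defined sesquilinear form on $\mathcal{F}_\pm$ (see [19] for details). In particular one has that
$$\widehat{\psi}_N(f) = \int dx\;\bar f(x)\,\widehat{\psi}_N(x), \qquad \widehat{\psi}_N^*(f) = \int dx\; f(x)\,\widehat{\psi}_N^*(x).$$
Furthermore, the (anti)commutation relations (A.1) imply that
$$\big[\widehat{\psi}_N(x), \widehat{\psi}_N^*(y)\big]_\mp = \frac{1}{N}\,\delta(x - y), \qquad \big[\widehat{\psi}_N^\#(x), \widehat{\psi}_N^\#(y)\big]_\mp = 0. \tag{A.2}$$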
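B. The Limit $\varepsilon \to 0$ in Lemma 6.3

What remains is the justification of the equality in (6.22) for $\varepsilon = 0$. Our strategy is to show that both sides of (6.23) with $\varepsilon > 0$ converge strongly to the same expression with $\varepsilon = 0$.

We first show the strong convergence of $G_t^{(k,l),\varepsilon}(a^{(p)})$. Let $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$ and consider
$$\big\|(W^\varepsilon_{ij,s} - W_{ij,s})\Phi^{(n)}\big\| = \big\|I_{\{|W_{ij}| > \varepsilon^{-1}\}}\,W_{ij}\,e^{-isH_0}\Phi^{(n)}\big\| \le \big\|W_{ij}\,e^{-isH_0}\Phi^{(n)}\big\|.$$
Since the right side is in $L^1([0, t])$, we may use dominated convergence to conclude that
$$\lim_{\varepsilon\to 0}\int_0^t ds\;\big\|(W^\varepsilon_{ij,s} - W_{ij,s})\Phi^{(n)}\big\| = 0.$$
Now
$$\int_0^t ds\int_0^t ds'\;\big\|W^\varepsilon_{ij,s}W^\varepsilon_{i'j',s'}\Phi^{(n)} - W_{ij,s}W_{i'j',s'}\Phi^{(n)}\big\| \le \int_0^t ds\int_0^t ds'\;\big\|W^\varepsilon_{ij,s}W^\varepsilon_{i'j',s'}\Phi^{(n)} - W^\varepsilon_{ij,s}W_{i'j',s'}\Phi^{(n)}\big\| + \int_0^t ds\int_0^t ds'\;\big\|W^\varepsilon_{ij,s}W_{i'j',s'}\Phi^{(n)} - W_{ij,s}W_{i'j',s'}\Phi^{(n)}\big\|.$$
The first term is bounded by
$$\Big(\frac{\pi\kappa^2 t}{2}\Big)^{1/2}\int_0^t ds'\;\big\|W^\varepsilon_{i'j',s'}\Phi^{(n)} - W_{i'j',s'}\Phi^{(n)}\big\| \to 0, \qquad \varepsilon \to 0.$$
The integrand of the second term is bounded by $2\|W_{ij,s}W_{i'j',s'}\Phi^{(n)}\| \in L^1([0, t]^2)$, so that dominated convergence implies that the second term vanishes in the limit $\varepsilon \to 0$. A straightforward generalisation of this argument shows that
$$G_t^{(k,l),\varepsilon}(a^{(p)})\,\Phi^{(p+k-l)} \to G_t^{(k,l)}(a^{(p)})\,\Phi^{(p+k-l)},$$
as claimed. Since the series (6.22) converges uniformly in $\varepsilon$, we find that
$$\sum_{k=0}^{\infty}\sum_{l=0}^{k}\frac{1}{N^l}\,\widehat{A}_N\big(G_t^{(k,l),\varepsilon}(a^{(p)})\big)\Phi^{(n)} \to \sum_{k=0}^{\infty}\sum_{l=0}^{k}\frac{1}{N^l}\,\widehat{A}_N\big(G_t^{(k,l)}(a^{(p)})\big)\Phi^{(n)},$$
as $\varepsilon \to 0$.

Next, we show that $e^{-itH_N^\varepsilon}\Phi^{(n)} \to e^{-itH_N}\Phi^{(n)}$. This follows from strong resolvent convergence of $H_N^\varepsilon$ to $H_N$ as $\varepsilon \to 0$ by Trotter's theorem [18]. Let $W^\varepsilon := \frac{1}{N}\sum_{i<j}W^\varepsilon_{ij}$, and consider
$$\big\|(H_N^\varepsilon - i)^{-1}\Phi^{(n)} - (H_N - i)^{-1}\Phi^{(n)}\big\| = \big\|(H_N^\varepsilon - i)^{-1}(W - W^\varepsilon)(H_N - i)^{-1}\Phi^{(n)}\big\| \le \big\|(W - W^\varepsilon)(H_N - i)^{-1}\Phi^{(n)}\big\|.$$
Clearly $\Psi^{(n)} := (H_N - i)^{-1}\Phi^{(n)}$ is in the domain of $H_N$. By the Kato-Rellich theorem [19], $\Psi^{(n)}$ is in the domain of $W_{ij}$ for all $i, j$. Therefore,
$$\big\|(W_{ij} - W^\varepsilon_{ij})(H_N - i)^{-1}\Phi^{(n)}\big\| = \big\|I_{\{|W_{ij}| > \varepsilon^{-1}\}}\,W_{ij}\,\Psi^{(n)}\big\| \to 0$$
as $\varepsilon \to 0$. Therefore
$$e^{itH_N^\varepsilon}\,\widehat{A}_N(a^{(p)})\,e^{-itH_N^\varepsilon}\Phi^{(n)} \to e^{itH_N}\,\widehat{A}_N(a^{(p)})\,e^{-itH_N}\Phi^{(n)}$$
as $\varepsilon \to 0$, and the proof is complete.

Acknowledgements. We thank W. De Roeck, S. Graffi and A. Pizzo for useful discussions and encouragement. We should also like to thank a referee for pointing out Ref. [14] in connection with the remark following Corollary 7.4.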
References
1. Bergh, J., Löfström, J.: Interpolation Spaces, an Introduction. Berlin-Heidelberg-New York: Springer, 1976
2. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. Berlin-Heidelberg-New York: Springer, 2002
3. Brown, W., Hepp, K.: The Vlasov dynamics and its fluctuations in the 1/N limit of interacting classical particles. Commun. Math. Phys. 56, 101–113 (1977)
4. Chadam, J.M., Glassey, R.T.: Global existence of solutions to the Cauchy problem for time-dependent Hartree equations. J. Math. Phys. 16, 1122 (1975)
5. Egorov, Y.V.: The canonical transformations of pseudodifferential operators. Usp. Mat. Nauk 25, 235–236 (1969)
6. Erdős, L., Yau, H.-T.: Derivation of the nonlinear Schrödinger equation with Coulomb potential. Adv. Theor. Math. Phys. 5, 1169–1205 (2001)
7. Fröhlich, J., Graffi, S., Schwarz, S.: Mean-field and classical limit of many-body Schrödinger dynamics for bosons. Commun. Math. Phys. 271, 681–697 (2007)
8. Fröhlich, J., Knowles, A., Pizzo, A.: Atomism and Quantization. J. Phys. A 40, 3033–3045 (2007)
9. Ginibre, J., Velo, G.: The classical field limit of scattering theory for non-relativistic many-boson systems. I-II. Commun. Math. Phys. 66, 37–76 (1979); Commun. Math. Phys. 68, 45–68 (1979)
10. Hepp, K.: The classical limit for quantum mechanical correlation functions. Commun. Math. Phys. 35, 265–277 (1974)
11. Keel, M., Tao, T.: Endpoint Strichartz estimates. Amer. J. Math. 120, 955–980 (1998)
12. Knuth, D.E.: The Art of Computer Programming, Vol. 1, Reading, MA: Addison-Wesley, 1998
13. Lieb, E.H., Loss, M.: Analysis. Providence, RI: Amer. Math. Soc., 2001
14. Lieb, E.H., Seiringer, R.: Proof of Bose-Einstein condensation for dilute trapped gases. Phys. Rev. Lett. 88(17), 170409 (2002)
15. Narnhofer, H., Sewell, G.L.: Vlasov hydrodynamics of a quantum mechanical model. Commun. Math. Phys. 79, 9–24 (1981)
16. Neunzert, H.: Fluid Dyn. Trans. 9, 229 (1977); Neunzert, H.: Neuere qualitative und numerische Methoden in der Plasmaphysik. Paderborn: Vorlesungsmanuskript, 1975
17. O’Neil, R.: Convolution operators and L(p, q) spaces. Duke Math. J. 30, 129–142 (1963)
18. Reed, M., Simon, B.: Methods of Modern Mathematical Physics I: Functional Analysis. New York: Academic Press, 1980
19. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
20. Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV: Analysis of Operators. New York: Academic Press, 1978
21. Rodnianski, I., Schlein, B.: Quantum fluctuations and rate of convergence towards mean field dynamics. http://arXiv.org/abs/0711.3087v1[math-ph], 2007
22. Schlein, B., Erdős, L.: Quantum Dynamics with Mean Field Interactions: a New Approach. http://arXiv.org/abs/0804.3774v1 (2008)
23. Simon, B.: Best constants in some operator smoothness estimates. J. Func. Anal. 107, 66–71 (1992)
24. Zagatti, S.: The Cauchy problem for Hartree-Fock time-dependent equations. Ann. Inst. Henri Poincaré (A) 56, 357–374 (1992)

Communicated by H.-T. Yau
Commun. Math. Phys. 288, 1061–1088 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0718-8
Communications in
Mathematical Physics
Inverse Spectral Problems for Schrödinger Operators Hamid Hezari Department of Mathematics, Johns Hopkins University, Baltimore, MD 21218, USA. E-mail:
[email protected] Received: 29 May 2008 / Accepted: 17 October 2008 Published online: 13 January 2009 – © Springer-Verlag 2009
Abstract: In this article we find some explicit formulas for the semi-classical wave invariants at the bottom of the well of a Schrödinger operator. As an application of these new formulas for the wave invariants, we improve the inverse spectral results proved by Guillemin and Uribe in [GU]. They proved that under some symmetry assumptions on the potential $V(x)$, the Taylor expansion of $V(x)$ near a non-degenerate global minimum can be recovered from the knowledge of the low-lying eigenvalues of the associated Schrödinger operator in $\mathbb{R}^n$. We prove similar inverse spectral results using fewer symmetry assumptions. We also show that in dimension 1, no symmetry assumption is needed to recover the Taylor coefficients of $V(x)$.

1. Introduction and Statement of Results

In this article we study the inverse spectral problems for the semi-classical Schrödinger operator,
$$\hat P = -\frac{\hbar^2}{2}\Delta + V(x) \quad \text{on } L^2(\mathbb{R}^n), \tag{1}$$
associated to the Hamiltonian
$$P(x, \xi) = \frac{1}{2}\xi^2 + V(x).$$
Here the potential $V(x)$ in (1) satisfies
$$\begin{cases} V(x) \in C^\infty(\mathbb{R}^n), \\ V(x) \text{ has a unique non-degenerate global minimum at } x = 0 \text{ and } V(0) = 0, \\ \text{for some } \rho > 0,\ V^{-1}[0, \rho] \text{ is compact.} \end{cases} \tag{2}$$
Under these conditions, for sufficiently small $\hbar$, say $\hbar \in (0, h_0)$, and sufficiently small $\delta$, a classical fact tells us that the spectrum of $\hat P$ in the energy interval $[0, \delta]$ is finite. We denote these eigenvalues by $\{E_j(\hbar)\}_{j=0}^{m}$. We call these eigenvalues the low-lying eigenvalues of $\hat P$. We notice that Weyl's law reads
$$m = N_\hbar(\delta) = \#\{j;\ 0 \le E_j(\hbar) \le \delta\} = \frac{1}{(2\pi\hbar)^n}\Big(\int_{\frac{1}{2}\xi^2 + V(x) \le \delta} dx\,d\xi + o(1)\Big). \tag{3}$$
Recently in [GU], Guillemin and Uribe raised the question whether we can recover the Taylor coefficients of $V$ at $x = 0$ from the low-lying eigenvalues $E_j(\hbar)$. They also established that if we assume some symmetry conditions on $V$, namely $V(x) = f(x_1^2, \dots, x_n^2)$, then the 1-parameter family of low-lying eigenvalues, $\{E_j(\hbar) \mid \hbar \in (0, h_0)\}$, determines the Taylor coefficients of $V$ at $x = 0$. In this article we will attempt to recover as much of $V$ as possible from the family $E_j(\hbar)$, by establishing some new formulas for the wave invariants at the bottom of the potential (Theorem 1.1). Using these new expressions for the wave invariants, in Theorem 1.2 we improve the inverse spectral results of [GU] for a larger class of potentials.

A classical approach in studying this problem is to examine the asymptotic behavior as $\hbar \to 0$ of the truncated trace
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big), \tag{4}$$
where $\Theta \in C_0^\infty([0, \infty))$ is supported in $I = [0, \delta]$ and equals one in a neighborhood of 0. The asymptotic behavior of the truncated trace around the equilibrium point $(x, \xi) = (0, 0)$ has been extensively studied in the literature. It is known that (see for example [BPU]) for $t$ in a sufficiently small interval $(0, t_0)$, $\mathrm{Tr}(\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P})$ has an asymptotic expansion of the following form:
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big) \sim \sum_{j=0}^{\infty} a_j(t)\,\hbar^j, \qquad \hbar \to 0. \tag{5}$$
Throughout this paper when we refer to wave invariants at the bottom of the well, we mean the coefficients $a_j(t)$ in (5).

By applying an orthogonal change of variable, we can assume that $V$ is of the form
$$V(x) = \frac{1}{2}\sum_{k=1}^{n}\omega_k^2 x_k^2 + W(x), \qquad \omega_k > 0, \qquad W(x) = O(|x|^3), \quad |x| \to 0. \tag{6}$$
In addition to conditions in (2), we also assume that $\{\omega_k\}$ are linearly independent over $\mathbb{Q}$. Our first result finds explicit formulas for the wave invariants.
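Theorem 1.1. There exists $t_0$ such that for $0 < t < t_0$,

1. $$a_0(t) = \mathrm{Tr}\big(e^{-\frac{it}{\hbar}\hat H_0}\big) = \prod_{k=1}^{n}\frac{1}{2i\sin\frac{\omega_k t}{2}}, \qquad \text{where } \hat H_0 = -\frac{1}{2}\hbar^2\Delta + \frac{1}{2}\sum_{k=1}^{n}\omega_k^2 x_k^2.$$

2. For $j \ge 1$, the wave invariants $a_j(t)$ defined in (5) are given by
$$a_j(t) = a_0(t)\sum_{l=1}^{2j} i^{\,l(n-1)+\frac{n}{2}}\,e^{\frac{i\pi}{4}\mathrm{sgn}\,H_l}\int_0^t\int_0^{s_1}\cdots\int_0^{s_{l-1}} P_{l+j}\,b_l(0)\;ds_l \dots ds_1, \tag{7}$$
where for every $m$,
$$P_m b_l(0) = \frac{i^{-m}}{2^m\,m!}\,\big\langle H_l^{-1}\nabla, \nabla\big\rangle^m (b_l)(0),$$
$$b_l = \prod_{i=1}^{l} W\Big(\frac{\cos\omega_k s_i}{2}\,(z_{i+1}^k + z_i^k) - \frac{\sin\omega_k s_i}{\omega_k}\,\xi_i^k + \Big(\frac{\sin\omega_k(t - s_i) + \sin\omega_k s_i}{\sin\omega_k t}\Big)x^k\Big),$$
and $H_l^{-1}$ is the inverse matrix of the Hessian $H_l = \mathrm{Hess}\,\Psi_l(0)$, where
$$\Psi_l = \Psi_l(t, x, z_1, \dots, z_l, \xi_1, \dots, \xi_l) = \sum_{k=1}^{n}\Big\{\Big(-\omega_k\tan\frac{\omega_k t}{2}\Big)x_k^2 + \Big(\frac{\omega_k}{2}\cot\omega_k t\Big)(z_1^k)^2 + \sum_{i=1}^{l}(z_{i+1}^k - z_i^k)\,\xi_i^k\Big\}.$$
The Hessian of $\Psi_l$ is calculated with respect to every variable except $t$. Therefore the entries of the matrix $H_l^{-1}$ are functions of $t$. The matrix $H_l^{-1}$ is shown in (32).

3. The wave invariant $a_j(t)$ is a polynomial of degree $2j$ of the Taylor coefficients of $V$. The Taylor coefficients of highest order appearing in $a_j(t)$ are of order $2j + 2$. In fact these highest order Taylor coefficients appear in the linear term of the polynomial and
$$a_j(t) = \frac{a_0(t)\,t}{(2i)^{j+1}}\sum_{|\vec\alpha| = j+1}\frac{1}{\vec\alpha\,!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega}{2}t\Big)^{\vec\alpha} D^{2j+2}_{2\vec\alpha}V(0) + \{\text{a polynomial of Taylor coefficients of order} \le 2j + 1\}. \tag{8}$$

Notice that in (8), we have used the standard shorthand notations for multi-indices, i.e. $\vec\alpha = (\alpha_1, \dots, \alpha_n)$, $\vec\omega = (\omega_1, \dots, \omega_n)$, $|\vec\alpha| = \alpha_1 + \dots + \alpha_n$, $\vec\alpha\,! = \alpha_1! \cdots \alpha_n!$, $X^{\vec\alpha} = X_1^{\alpha_1} \cdots X_n^{\alpha_n}$, and $D^m_{\vec\alpha} = \frac{\partial^m}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$, with $m = |\vec\alpha|$.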
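The closed form for $a_0(t)$ in item 1 of Theorem 1.1 can be compared with the spectrum of the harmonic oscillator $\hat H_0$, whose eigenvalues are $\hbar\sum_k \omega_k(n_k + \frac{1}{2})$, so that the trace factorises into geometric series independent of $\hbar$. The sketch below is a numerical illustration under one explicit assumption: for real $t$ the series converges only in a distributional sense, so the time is replaced by a complex value `tau` with small negative imaginary part. The function names are hypothetical.

```python
import cmath


def a0_closed_form(omegas, tau):
    """Product over k of 1 / (2i sin(omega_k * tau / 2))."""
    result = 1
    for w in omegas:
        result *= 1 / (2j * cmath.sin(w * tau / 2))
    return result


def a0_spectral_sum(omegas, tau, n_max=4000):
    """Product over k of the truncated sum of exp(-i omega_k tau (n + 1/2)), n < n_max.

    Assumes Im(tau) < 0 so that each geometric series converges absolutely.
    """
    result = 1
    for w in omegas:
        result *= sum(cmath.exp(-1j * w * tau * (n + 0.5)) for n in range(n_max))
    return result


# Illustrative frequencies (rationally independent) and a regularised time.
omegas = [1.0, 2 ** 0.5]
tau = 0.7 - 0.05j
difference = abs(a0_spectral_sum(omegas, tau) - a0_closed_form(omegas, tau))
```

Each factor is the geometric series $\sum_{n\ge 0} q^{n+1/2} = q^{1/2}/(1-q)$ with $q = e^{-i\omega_k\tau}$, which equals $1/(2i\sin(\omega_k\tau/2))$; `difference` should therefore be at the level of floating-point rounding.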
Our second result improves the result of Guillemin and Uribe in [GU]. This theorem is actually a non-trivial corollary of Theorem 1.1.

Theorem 1.2. Let $V$ satisfy (2), (6), and be of the form
$$V(x) = f(x_1^2, \dots, x_n^2) + x_n^3\,g(x_1^2, \dots, x_n^2), \tag{9}$$
for some $f, g \in C^\infty(\mathbb{R}^n)$. Then the low-lying eigenvalues of $\hat P = -\frac{1}{2}\hbar^2\Delta + V$ determine $D^{|\vec\alpha|}_{\vec\alpha}V(0)$, $|\vec\alpha| = 2, 3$, and if $D^3_{3\vec e_n}V(0) := \frac{\partial^3 V}{\partial x_n^3}(0) \ne 0$, they determine all the Taylor coefficients of $V$ at $x = 0$.

One quick consequence of Theorem 1.2 is the following:
Corollary 1.3. If $n = 1$, and $V \in C^\infty(\mathbb{R})$ satisfies (2), then (with no symmetry assumptions) the low-lying eigenvalues determine $V''(0)$ and $V^{(3)}(0)$, and if $V^{(3)}(0) \ne 0$, then these eigenvalues determine all the Taylor coefficients of $V$ at $x = 0$.

Let us briefly sketch our main ideas for the proofs. First, because of a technical reason which arises in the proofs, we will need to replace the Hamiltonian $P$ by the following Hamiltonian $H$:
$$\begin{cases} H(x, \xi) = \frac{1}{2}\xi^2 + V_\hbar(x), \\ V_\hbar(x) = \frac{1}{2}\sum_{k=1}^{n}\omega_k^2 x_k^2 + W_\hbar(x), \\ W_\hbar(x) = \chi\big(\frac{x}{\hbar^{\frac{1}{2}-\varepsilon}}\big)\,W(x), \qquad \varepsilon > 0 \text{ sufficiently small,} \end{cases}$$
where the cut off $\chi \in C_0^\infty(\mathbb{R}^n)$ is supported in the unit ball $B_1(0)$ and equals one in $B_{\frac{1}{2}}(0)$. Then in two lemmas (Lemma 2.1 and Lemma 2.2) we show that for $t$ in a sufficiently small interval $(0, t_0)$, in the sense of tempered distributions we have
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big) = \mathrm{Tr}\big(e^{-\frac{it}{\hbar}\hat H}\big) + O(\hbar^\infty).$$
This reduces the problem to studying the asymptotic of $\mathrm{Tr}(e^{-\frac{it}{\hbar}\hat H})$. For this we use the construction of the kernel $k(t, x, y)$ of the propagator $U(t) = e^{-\frac{it}{\hbar}\hat H}$ found in [Z]. We find that
$$k(t, x, y) = C(t)\,e^{\frac{i}{\hbar}S(t,x,y)}\sum_{l=0}^{\infty} a_l(t, \hbar, x, y), \tag{10}$$
where
$$S(t, x, y) = \sum_{k=1}^{n}\frac{\omega_k}{\sin\omega_k t}\Big(\frac{1}{2}(\cos\omega_k t)(x_k^2 + y_k^2) - x_k y_k\Big), \tag{11}$$
$a_0 = 1$, and for $l \ge 1$,
$$a_l(t, \hbar, x, y) = \Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac{1}{i\hbar}\Big)^{l(n+1)}\int_0^t\cdots\int_0^{s_{l-1}}\underbrace{\int\cdots\int}_{2l} e^{\frac{i}{\hbar}\Phi_l}\;b_l(s, x, y, z, \xi)\;d^lz\,d^l\xi\,d^ls,$$
where
$$\Phi_l = \sum_{k=1}^{n}\Big\{\frac{\omega_k}{2}\cot\omega_k t\,(z_1^k)^2 + \sum_{i=1}^{l}(z_{i+1}^k - z_i^k)\,\xi_i^k\Big\},$$
and
$$b_l = \prod_{i=1}^{l} W\Big(\frac{\cos\omega_k s_i}{2}\,(z_{i+1}^k + z_i^k) - \frac{\sin\omega_k s_i}{\omega_k}\,\xi_i^k + \frac{\sin\omega_k(t - s_i)}{\sin\omega_k t}\,y^k + \frac{\sin\omega_k s_i}{\sin\omega_k t}\,x^k\Big).$$
Inverse Spectral Problems for Schrödinger Operators
Next we apply the expression in (10) for $k(t,x,y)$ to the formula $Tr(e^{-\frac{it}{\hbar}\hat H})=\int k(t,x,x)\,dx$. Then we obtain an infinite series of oscillatory integrals, each one corresponding to one $a_l$. Finally we apply the method of stationary phase to each oscillatory integral and we show that the resulting series is a valid asymptotic expansion. From the resulting asymptotic expansion we obtain the formulas (7).

Now let us compare our approach for the construction of $k(t,x,y)$ with the classical approach. In the classical approach (see for instance [DSj,D,R,BPU,U]), one constructs a WKB approximation for the kernel $k_P(t,x,y)$ of the operator $\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P}$, i.e.

$$k_P(t,x,y)=\int e^{\frac{i}{\hbar}(\varphi_P(t,x,\eta)-y.\eta)}\,b_P(t,x,y,\eta,\hbar)\,d\eta, \tag{12}$$
where $\varphi_P(t,x,\eta)$ satisfies the Hamilton-Jacobi equation (or eikonal equation in geometrical optics)

$$\partial_t\varphi_P(t,x,\eta)+P(x,\partial_x\varphi_P(t,x,\eta))=0,\qquad \varphi_P|_{t=0}=x.\eta,$$

and the function $b_P$ has an asymptotic expansion of the form

$$b_P(t,x,y,\eta,\hbar)\sim\sum_{j=0}^\infty b_{P,j}(t,x,y,\eta)\,\hbar^j.$$
The functions b P, j (t, x, y, η) are calculated from the so called transport equations. See for example [R,DSj,EZ] or Appendix A of the paper in hand for the details of the above construction. In this setting, when one integrates the kernel k P (t, x, y) on the diagonal and applies the stationary phase lemma to the given oscillatory integral, one obtains very complicated expressions for the wave invariants. Of course the classical calculations above show the existence of asymptotic formulas of the form (5) (which can be used to get Weyl-type estimates for the counting functions of the eigenvalues, see for example [BPU]). Unfortunately these formulas for the wave invariants are not helpful when trying to establish some inverse spectral results. Hence, one should look for more efficient methods to calculate the wave invariants a j (t). One approach is to use the semi-classical Birkhoff normal forms, which was used in the papers [Sj] and [ISjZ] and [GU]. The Birkhoff normal forms methods were also used by S. Zelditch in [Z4] to obtain positive inverse spectral results for real analytic domains with symmetries of an ellipse. Zelditch proved that for a real analytic plane domain with symmetries of an ellipse, the wave invariants at a bouncing ball orbit, which is preserved by the symmetries, determine the real analytic domain under isometries of the domain. Recently in [Z3], Zelditch improved his earlier result to the real analytic domains with only one mirror symmetry. His approach for this new result was different. He used a direct approach (Balian-Bloch trace formula) which involves Feynman-diagrammatic calculations of the stationary phase method to obtain a more explicit formula for the wave invariants at the bouncing ball orbit. Motivated by the work of Zelditch [Z3] mentioned above, our approach in this article is also somehow direct and involves combinatorial calculations of the stationary phase. 
Our formula in (10) for the kernel of the propagator, $U(t)=e^{-\frac{it}{\hbar}\hat H}$, is different from the WKB-expression in the sense that we only keep the quadratic part of the phase function, namely the phase function $S(t,x,y)$ in (11) of the propagator of the anisotropic oscillator, and we put the rest in the amplitude $\sum_{l=0}^\infty a_l(t,\hbar,x,y)$ in (10). The details of this construction are given in Sect. 2.2.

Remark 1.4. After the initial posting of this article, Guillemin and Colin de Verdière posted two articles (see [CG1], also [C]) in which they study inverse spectral problems of 1-dimensional semi-classical Schrödinger operators. One of the main results in [CG1] is our Corollary 1.3 in this paper.

2. Proofs of the Results

2.1. Two reductions. Because of some technical issues arising in the proof of Theorem 1.1, we will need to use the following two lemmas as reductions. In the following, we let $\chi\in C_0^\infty(\mathbb{R}^n)$ be a cut off which is supported in the unit ball $B_1(0)$ and equals one in $B_{\frac12}(0)$.
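Lemma 2.1. Let the Hamiltonians $P$ and $H$ be defined by

$$\begin{cases}P(x,\xi)=\frac12\xi^2+V(x),\\[2pt] V(x)=\frac12\sum_{k=1}^n\omega_k^2x_k^2+W(x),\\[2pt] W(x)=O(|x|^3),\ \text{as } x\to0,\end{cases}\qquad \begin{cases}H(x,\xi)=\frac12\xi^2+V_\hbar(x),\\[2pt] V_\hbar(x)=\frac12\sum_{k=1}^n\omega_k^2x_k^2+W_\hbar(x),\\[2pt] W_\hbar(x)=\chi\big(\frac{x}{\hbar^{\frac12-\varepsilon}}\big)W(x),\quad \varepsilon>0 \text{ sufficiently small},\end{cases} \tag{13}$$

and let $\hat P$ and $\hat H$ be the corresponding Weyl (or standard) quantizations. Then for $t$ in a sufficiently small interval $(0,t_0)$,

$$Tr(\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P})=Tr(\Theta(\hat H)e^{-\frac{it}{\hbar}\hat H})+O(\hbar^\infty).$$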
In other words, the wave invariants $a_j(t)$ will not change if we replace $P$ by $H$.

Proof. The proof is given in Appendix A.
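Next we use the following lemma to get rid of $\Theta(\hat H)$.

Lemma 2.2. Let $H$ be defined by (13). Then in the sense of tempered distributions

$$Tr(\Theta(\hat H)e^{-\frac{it}{\hbar}\hat H})=Tr(e^{-\frac{it}{\hbar}\hat H})+O(\hbar^\infty).$$

This means that if we sort the spectrum of $\hat H$ as $E_1(\hbar)<E_2(\hbar)\le\dots\le E_j(\hbar)\to+\infty$, then for every Schwartz function $\varphi(t)\in\mathcal{S}(\mathbb{R})$,

$$\big\langle Tr(e^{-\frac{it}{\hbar}\hat H})-Tr(\Theta(\hat H)e^{-\frac{it}{\hbar}\hat H}),\varphi(t)\big\rangle=\sum_{j=1}^\infty\big(1-\Theta(E_j(\hbar))\big)\hat\varphi\Big(\frac{E_j(\hbar)}{\hbar}\Big)=O(\hbar^\infty).$$

Proof. The proof is given in Appendix B.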
Because of the above lemmas, it is enough to study the asymptotic of $Tr(e^{-\frac{it}{\hbar}\hat H})$.
2.2. Construction of $k(t,x,y)$, the kernel of $U(t)=e^{-\frac{it}{\hbar}\hat H}$. In this section we follow the construction in [Z] to obtain an oscillatory integral representation of $k(t,x,y)$, the kernel of the propagator $e^{-\frac{it}{\hbar}\hat H}$. The reader should consult [Z] for many details. In that article Zelditch uses Dyson's expansion of the propagator to study the singularities of the kernel $k(t,x,y)$. But he does not consider the semi-classical setting $\hbar\to0$ in his calculations (i.e. $\hbar=1$). So we follow the same calculations but also track $\hbar$ carefully. The following important proposition gives a new semi-classical approximation to the propagator $U(t)$ near the bottom of the well. We will use $B(X)$ for the bounded functions on $X$ with bounded derivatives.

Proposition 2.3. Let $k(t,x,y)$ be the Schwartz kernel of the propagator $U(t)=e^{-\frac{it}{\hbar}\hat H}$. Then

(A) We have

$$k(t,x,y)=\prod_{k=1}^n\Big(\frac{\omega_k}{2\pi i\hbar\sin\omega_kt}\Big)^{\frac12}e^{\frac{i}{\hbar}S(t,x,y)}\sum_{l=0}^\infty a_l(t,\hbar,x,y), \tag{14}$$

where

$$S(t,x,y)=\sum_{k=1}^n\frac{\omega_k}{\sin\omega_kt}\Big(\frac12(\cos\omega_kt)(x_k^2+y_k^2)-x_ky_k\Big).$$

Also $a_0=1$ and for $l\ge1$,

$$a_l(t,\hbar,x,y)=\Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac{1}{i\hbar}\Big)^{l(n+1)}\int_0^t\int_0^{s_1}\cdots\int_0^{s_{l-1}}\int_{\mathbb{R}^{2ln}}e^{\frac{i}{\hbar}\Phi_l}\,b_l(s,x,y,\vec z,\vec\xi)\,d^lz\,d^l\xi\,d^ls, \tag{15}$$

where

$$\Phi_l=\sum_{k=1}^n\Big\{\frac{\omega_k}{2}\cot\omega_kt\,(z_1^k)^2+\sum_{i=1}^l(z_{i+1}^k-z_i^k)\xi_i^k\Big\}, \tag{16}$$

and

$$b_l=\prod_{i=1}^lW_\hbar\Big(\frac{\cos\omega_ks_i}{2}(z_{i+1}^k+z_i^k)-\frac{\sin\omega_ks_i}{\omega_k}\xi_i^k+\frac{\sin\omega_k(t-s_i)}{\sin\omega_kt}y^k+\frac{\sin\omega_ks_i}{\sin\omega_kt}x^k\Big),\qquad (z_{l+1}:=0). \tag{17}$$
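(B) We have $a_l\in B(\mathbb{R}^n_x\times\mathbb{R}^n_y)$. In fact for every $\vec\alpha$ and $\vec\beta$, there exists $k_0=k_0(\alpha,\beta)$ such that for every $0<\hbar\le h_0\le1$,

$$|\partial_x^{\vec\alpha}\partial_y^{\vec\beta}a_l(t,\hbar,x,y)|\le\frac{C_{\alpha,\beta,n}(t)^l}{l!}\,\|W^\hbar\|^l_{|\alpha|+|\beta|+k_0}\,\hbar^{\,l(\frac12-3\varepsilon)-\frac12(|\alpha|+|\beta|)}, \tag{18}$$

where

$$W^\hbar(x)=\frac{W_\hbar(\hbar^{\frac12}x)}{\hbar^{3(\frac12-\varepsilon)}}=\chi(\hbar^\varepsilon x)\frac{W(\hbar^{\frac12}x)}{\hbar^{3(\frac12-\varepsilon)}} \tag{19}$$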
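and $W^\hbar$ is uniformly in $B(\mathbb{R}^n_x)$; i.e. $W^\hbar$ is bounded with bounded derivatives and the bounds are independent of $\hbar$. Hence the sum $a(t,\hbar,x,y)=\sum_{l=0}^\infty a_l(t,\hbar,x,y)$ in (14) is uniformly convergent in $B(\mathbb{R}^n_x\times\mathbb{R}^n_y)$. In fact

$$|\partial_x^{\vec\alpha}\partial_y^{\vec\beta}a(t,\hbar,x,y)|\le\exp\Big\{\hbar^{\frac12-3\varepsilon}C_{\alpha,\beta,n}(t)\,\|W^\hbar\|_{|\alpha|+|\beta|+k_0}\Big\}\,\hbar^{-\frac12(|\alpha|+|\beta|)}.$$

Proof. Following [Z], we denote

$$\begin{cases}\hat H_0=-\frac12\hbar^2\Delta+\frac12\sum_{k=1}^n\omega_k^2x_k^2,\qquad\text{(Anisotropic Oscillator)}\\[2pt] \hat H=\hat H_0+W_\hbar(x)=-\frac12\hbar^2\Delta+V_\hbar(x),\end{cases}$$

and by $U_0(t)=e^{-\frac{it}{\hbar}\hat H_0}$ and $U(t)=e^{-\frac{it}{\hbar}\hat H}$ we mean their corresponding propagators. From

$$(i\hbar\partial_t-\hat H_0)U(t)=W_\hbar.U(t),$$

we obtain

$$U(t)=U_0(t)+\frac{1}{i\hbar}\int_0^tU_0(t-s).W_\hbar.U(s)\,ds.$$

By iteration we get the norm convergent Dyson expansion:

$$U(t)=U_0(t)+\sum_{l=1}^\infty\frac{1}{(i\hbar)^l}\int_0^t\cdots\int_0^{s_{l-1}}U_0(t)\,[U_0(s_1)^{-1}.W_\hbar.U_0(s_1)]\cdots[U_0(s_l)^{-1}.W_\hbar.U_0(s_l)]\,ds_l\dots ds_1. \tag{20}$$

It is well-known that for $t\neq\frac{m\pi}{\omega_k}$, the kernel of $U_0(t)$ is given by

$$k_0(t,x,y)=\prod_{k=1}^n\Big(\frac{\omega_k}{2\pi i\hbar\sin\omega_kt}\Big)^{\frac12}e^{\frac{i}{\hbar}S(t,x,y)}, \tag{21}$$

where

$$S(t,x,y)=\sum_{k=1}^n\frac{\omega_k}{\sin\omega_kt}\Big(\frac12(\cos\omega_kt)(x_k^2+y_k^2)-x_ky_k\Big).$$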
Then by taking kernels in (20) and after some change of variables (see [Z], pp. 8–9 and 18–19), we get (14). This finishes the proof of part (A) of the proposition.

Before proving part (B), let us mention a useful estimate from [Z]. The setting in [Z] is a non-semiclassical one, i.e. $\hbar=1$. In [Z] on pp. 17–18 the following estimate (for $\hbar=1$) is proved using integration by parts: there exist a positive integer $k_0=k_0(\alpha,\beta,n)$ and a continuous function $C_{\alpha,\beta,n}(t)$ such that

$$|\partial_x^{\vec\alpha}\partial_y^{\vec\beta}a_l(t,1,x,y)|\le\frac{1}{l!}\,C_{\alpha,\beta,n}(t)^l\,\|W_1\|^l_{|\alpha|+|\beta|+k_0},\qquad(W_1=W_\hbar|_{\hbar=1}). \tag{22}$$

The estimate (22) changes once $\hbar$ is taken into account in the calculations; the result is part (B) of the proposition. Let us prove this, namely the estimate (18). First, in (15),
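we apply the change of variables $x\to\hbar^{\frac12}x$, $y\to\hbar^{\frac12}y$, $\vec z\to\hbar^{\frac12}\vec z$ and $\vec\xi\to\hbar^{\frac12}\vec\xi$. This gives us $\hbar^{ln}$ in front of the integral. Then we replace $W_\hbar(\hbar^{\frac12}(\cdot))$ by $\hbar^{3(\frac12-\varepsilon)}W^\hbar(\cdot)$. After collecting all the powers of $\hbar$ in front of the integral we obtain

$$a_l(t,\hbar,\hbar^{\frac12}x,\hbar^{\frac12}y)=\Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac1i\Big)^{l(n+1)}\hbar^{\,l(\frac12-3\varepsilon)}\int_0^t\cdots\int_0^{s_{l-1}}\int_{\mathbb{R}^{2ln}}e^{i\Phi_l}\,b_l^\hbar(s,x,y,\vec z,\vec\xi)\,d^lz\,d^l\xi\,d^ls,$$

where

$$b_l^\hbar(s,x,y,\vec z,\vec\xi)=\prod_{i=1}^lW^\hbar\Big(\frac{\cos\omega_ks_i}{2}(z_{i+1}^k+z_i^k)-\frac{\sin\omega_ks_i}{\omega_k}\xi_i^k+\frac{\sin\omega_k(t-s_i)}{\sin\omega_kt}y^k+\frac{\sin\omega_ks_i}{\sin\omega_kt}x^k\Big).$$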
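The power counting behind the prefactor $\hbar^{l(\frac12-3\varepsilon)}$ can be checked mechanically. The sketch below is only an illustration: it assumes, as in (15), that $a_l$ carries $\hbar^{-l(n+1)}$, that the rescaling of the $2ln$ integration variables contributes $\hbar^{ln}$, and that each of the $l$ potential factors contributes $\hbar^{3(\frac12-\varepsilon)}$. Exponents are stored as pairs $(a,b)$ standing for $a+b\varepsilon$; the function name is hypothetical.

```python
from fractions import Fraction

def hbar_exponent(l, n):
    """Return (a, b) such that the total power of hbar is a + b*eps."""
    # (1/hbar)^{l(n+1)} from the prefactor of a_l
    prefactor = (Fraction(-l * (n + 1)), Fraction(0))
    # 2*l*n integration variables, each rescaled by hbar^{1/2}
    jacobian = (Fraction(l * n), Fraction(0))
    # l potential factors, each worth hbar^{3(1/2 - eps)}
    potential = (Fraction(3 * l, 2), Fraction(-3 * l))
    parts = (prefactor, jacobian, potential)
    return (sum(p[0] for p in parts), sum(p[1] for p in parts))
```

For every $l$ and $n$ the result is $(l/2,\,-3l)$, i.e. the exponent $l(\frac12-3\varepsilon)$, independently of the dimension $n$.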
Next we apply (22) to the above integral with $W_1$ replaced by $W^\hbar$, and we get (18). To finish the proof we have to show that for every positive integer $m$ we can find uniform bounds (i.e. independent of $\hbar$) for the $m$-th derivatives of the function $W^\hbar(x)$. Since $\chi(x)$ is supported in the unit ball, from the definition (19) we see that $W^\hbar$ is supported in $|x|<\hbar^{-\varepsilon}$. So from (19) it is enough to find uniform bounds in $\hbar$ for the $m$-th derivatives of the function $\frac{W(\hbar^{\frac12}x)}{\hbar^{3(\frac12-\varepsilon)}}$ in the ball $|x|<\hbar^{-\varepsilon}$. This is clear for $m\ge3$. For $m<3$, we use the order of vanishing of $W(x)$ at $x=0$. Since $W(x)=O(|x|^3)$ near $x=0$, the order of vanishing of $W$ at $x=0$ is 3. Therefore in the ball $|x|<\hbar^{-\varepsilon}$, the functions

$$\frac{W(\hbar^{\frac12}x)}{(\hbar^{\frac12}x)^3},\qquad\frac{(\partial^\alpha W)(\hbar^{\frac12}x)}{(\hbar^{\frac12}x)^2},\qquad\frac{(\partial^\alpha\partial^\beta W)(\hbar^{\frac12}x)}{\hbar^{\frac12}x},$$

are bounded functions with uniform bounds in $\hbar$, and the statement follows easily for $m<3$.
2.3. Trace of $U(t)$. In this section we show that the integral $Tr\,U(t)=\int k(t,x,x)\,dx$ is convergent as an oscillatory integral and using (14) we express $Tr\,U(t)$ as an infinite sum of oscillatory integrals with an appropriate $\hbar$-estimate for the remainder term.

First of all we review some standard facts. We know that the sum $Tr\,U(t)=\sum_je^{-\frac{itE_j(\hbar)}{\hbar}}$ is convergent in the sense of tempered distributions, i.e. $Tr\,U(t)\in\mathcal{S}'(\mathbb{R})$. This can be shown by Weyl's law in its high energy setting, which implies that for potentials of the form $V_\hbar(x)=\frac12\sum_{k=1}^n\omega_k^2x_k^2+W_\hbar(x)$, with $W_\hbar\in B(\mathbb{R}^n)$, for fixed $\hbar$, the $j$-th eigenvalue $E_j(\hbar)$ satisfies

$$E_j(\hbar)\sim C(n,\hbar)\,j^{\frac2n},\qquad j\to\infty. \tag{23}$$

Another way to define $Tr\,U(t)$ is to write it as the limit

$$Tr\,U(t)=\lim_{\tau\to0^+}Tr\,U(t-i\tau)=\lim_{\tau\to0^+}\sum_je^{-\frac{(it+\tau)E_j(\hbar)}{\hbar}}. \tag{24}$$
This time Weyl's law (23) implies that the sum $Tr\,U(t-i\tau)$ is absolutely uniformly convergent because of the rapidly decaying factor $e^{-\frac{\tau E_j(\hbar)}{\hbar}}$. As a result, $U(t-i\tau)$ is a trace class operator. It is clear that the kernel of $U(t-i\tau)$ is $k(t-i\tau,x,y)$, the analytic continuation of the kernel $k(t,x,y)$ of $U(t)$. Clearly $k(t-i\tau,x,y)$ is continuous in $x$ and $y$. So we can write $Tr\,U(t-i\tau)=\int k(t-i\tau,x,x)\,dx$. We notice that this integral is uniformly convergent. This is because up to a constant this integral equals

$$\int e^{\frac{i}{\hbar}S(t-i\tau,x,x)}\,a(t-i\tau,\hbar,x,x)\,dx,$$

and the exponential factor in the integral is rapidly decaying for $\tau>0$ as $|x|\to\infty$ and $a$ is a bounded function. More precisely

$$\Re\big(iS(t-i\tau,x,x)\big)=\sum_{k=1}^n\Re\Big(-i\omega_k\tan\Big(\frac{\omega_k(t-i\tau)}{2}\Big)\Big)x_k^2=\sum_{k=1}^n\frac{\omega_k(1-e^{2\tau\omega_k})}{|1+e^{\omega_k(it+\tau)}|^2}\,x_k^2,$$

and

$$\frac{\omega_k(1-e^{2\tau\omega_k})}{|1+e^{\omega_k(it+\tau)}|^2}<0.$$
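The closed form for the real part of the phase can be confirmed numerically. The following is a minimal sketch, assuming only the diagonal phase $S(t,x,x)=-\sum_k\omega_k\tan(\frac{\omega_kt}{2})x_k^2$ obtained from the formula for $S$ with $y=x$; it compares the coefficient of $x_k^2$ computed directly with complex arithmetic against the closed-form expression. The function names are hypothetical.

```python
import cmath
import math

def real_part_direct(omega, t, tau):
    """Re(-i * omega * tan(omega * (t - i*tau) / 2)), evaluated with complex arithmetic."""
    z = omega * (t - 1j * tau)
    return (-1j * omega * cmath.tan(z / 2)).real

def real_part_closed(omega, t, tau):
    """omega * (1 - e^{2 tau omega}) / |1 + e^{omega (i t + tau)}|^2."""
    w = cmath.exp(omega * (1j * t + tau))
    return omega * (1 - math.exp(2 * tau * omega)) / abs(1 + w) ** 2
```

For $\tau>0$ both functions return the same negative number, which is the Gaussian decay in $x$ used above.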
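The discussion above shows that the integral $\int k(t,x,x)\,dx$ can be defined by integrations by parts as follows: Since

$$\langle D_x\rangle^2e^{iS(t,x,x)}:=(1-\Delta)e^{iS(t,x,x)}=\Big(1+\Big|2\vec\omega\tan\Big(\frac{\vec\omega t}{2}\Big)x\Big|^2+2i\sum_{k=1}^n\omega_k\tan\Big(\frac{\omega_kt}{2}\Big)\Big)e^{iS(t,x,x)},$$

we can write

$$\int e^{\frac{i}{\hbar}S(t,x,x)}a(t,\hbar,x,x)\,dx=\hbar^{\frac n2}\int e^{iS(t,x,x)}a(t,\hbar,\sqrt\hbar x,\sqrt\hbar x)\,dx$$
$$=\hbar^{\frac n2}\int e^{iS(t,x,x)}\Big(\langle D_x\rangle^2\Big(1+\Big|2\vec\omega\tan\Big(\frac{\vec\omega t}{2}\Big)x\Big|^2+2i\sum_{k=1}^n\omega_k\tan\Big(\frac{\omega_kt}{2}\Big)\Big)^{-1}\Big)^{n_0}a(t,\hbar,\sqrt\hbar x,\sqrt\hbar x)\,dx. \tag{25}$$

If we assume $0<t<\min_{1\le k\le n}\{\frac{\pi}{2\omega_k}\}$, then by choosing $n_0>\frac n2$, and because $a(t,\hbar,x,y)\in B(\mathbb{R}^n_x\times\mathbb{R}^n_y)$, the integral becomes absolutely convergent. Finally, since by (18) the series $a(t,\hbar,x,y)=\sum_{l=0}^\infty a_l(t,\hbar,x,y)$ is absolutely uniformly convergent, we have

$$\int e^{\frac{i}{\hbar}S(t,x,x)}a(t,\hbar,x,x)\,dx=\sum_{l=0}^\infty\int e^{\frac{i}{\hbar}S(t,x,x)}a_l(t,\hbar,x,x)\,dx,$$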
and therefore we obtain an infinite sum of oscillatory integrals. The next step is to apply the stationary phase method to each integral above and then add the asymptotic expansions to obtain an asymptotic expansion for $Tr\,U(t)$. Because we have an infinite sum of asymptotic expansions, we have to establish that the resulting asymptotic for the trace is a valid approximation. Hence we have to find some appropriate $\hbar$-estimates for the remainder term of the series. For this we define

$$I_l(t,\hbar)=\hbar^{-\frac n2}\int e^{\frac{i}{\hbar}S(t,x,x)}a_l(t,\hbar,x,x)\,dx. \tag{26}$$

Hence by this notation, $Tr\,U(t)=\prod_{k=1}^n\big(\frac{\omega_k}{2\pi i\sin\omega_kt}\big)^{\frac12}\sum_{l=0}^\infty I_l(t,\hbar)$. Now we have the following crucial proposition.
Proposition 2.4. Let $0<\varepsilon<\frac16$, and $I_l(t,\hbar)$ be defined by (26). Then for all $m\ge1$,

$$Tr\,U(t)=\prod_{k=1}^n\Big(\frac{\omega_k}{2\pi i\sin\omega_kt}\Big)^{\frac12}\sum_{l=0}^{m-1}I_l(t,\hbar)+O(\hbar^{m(\frac12-3\varepsilon)}). \tag{27}$$

Proof. If in (26) we integrate by parts as we did in (25), and choose $n_0=[\frac n2]+1$, then using (18) we get

$$|I_l(t,\hbar)|\le C_n(t)\,\frac{C_n(t)^l\,\|W^\hbar\|^l_{2n_0+k_0}}{l!}\,\hbar^{\,l(\frac12-3\varepsilon)},$$

where $C_n(t)=\max_{|\alpha|+|\beta|\le2n_0}\{C_{\alpha,\beta,n}(t)\}$. We choose $\varepsilon>0$ such that $\frac12-3\varepsilon>0$, or $\varepsilon<\frac16$. Now it is clear that for every positive integer $m$, and every $0<\hbar\le h_0\le1$,

$$\Big|\sum_{l=m}^\infty I_l(t,\hbar)\Big|\le C_n(t)\,e^{\{C_n(t)\|W^\hbar\|_{2n_0+k_0}\}}\,\hbar^{\,m(\frac12-3\varepsilon)}. \tag{28}$$

Since by Proposition 2.3.B, $\sup_{0<\hbar\le1}\|W^\hbar\|_{2n_0+k_0}<\infty$, we get (27).
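The step from the termwise bound to the tail bound (28) is the elementary inequality $\sum_{l\ge m}\frac{c^l\hbar^{l\gamma}}{l!}\le e^c\hbar^{m\gamma}$ for $0<\hbar\le1$ and $\gamma>0$. The sketch below checks it numerically; here `c` is a stand-in for $C_n(t)\|W^\hbar\|_{2n_0+k_0}$ and `gamma` for $\frac12-3\varepsilon$, and both names are assumptions made for this illustration only.

```python
import math

def tail_sum(m, c, h, gamma, terms=200):
    """Sum over l >= m of c^l * h^(l*gamma) / l!, truncated after `terms` terms."""
    x = c * h ** gamma
    term = x ** m / math.factorial(m)   # first term of the tail, l = m
    total = 0.0
    for l in range(m, m + terms):
        total += term
        term *= x / (l + 1)             # pass from the l-th to the (l+1)-th term
    return total

def tail_bound(m, c, h, gamma):
    """Right-hand side e^c * h^(m*gamma), valid for 0 < h <= 1 and gamma > 0."""
    return math.exp(c) * h ** (m * gamma)
```

The bound is far from sharp, but it has the one property needed here: the tail is of order $\hbar^{m\gamma}$ with a constant independent of $\hbar$.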
Proposition 2.4 is very important because it enables us to add up all the asymptotic expansions obtained by applying the stationary phase method to each $I_l(t,\hbar)$.

2.4. Stationary phase calculations and the proof of Theorem 1.1.1-2. In this section we will apply the stationary phase method to each $I_l(t,\hbar)$ in (27). By (26), (14) and (15) we have

$$I_l(t,\hbar)=\Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac1i\Big)^{l(n+1)}\Big(\frac1\hbar\Big)^{l(n+1)+\frac n2}\int_0^t\cdots\int_0^{s_{l-1}}\int_{\mathbb{R}^{(2l+1)n}}e^{\frac{i}{\hbar}\Psi_l}\,b_l(s,x,x,\vec z,\vec\xi)\,d^ls\,d^lz\,d^l\xi\,dx, \tag{29}$$

where

$$\Psi_l=S(t,x,x)+\Phi_l=\sum_{k=1}^n\Big\{\Big(-\omega_k\tan\frac{\omega_k}{2}t\Big)x_k^2+\frac{\omega_k}{2}\cot\omega_kt\,(z_1^k)^2+\sum_{i=1}^l(z_{i+1}^k-z_i^k)\xi_i^k\Big\}. \tag{30}$$

It is easy to see that the only critical point of the phase function $\Psi_l$, given by (30), is at $(x,\vec z,\vec\xi)=0$. Next we calculate $H_l=\mathrm{Hess}\,\Psi_l(0)$ and $H_l^{-1}$. In the following we use the notation $D(\vec v)$ for the diagonal matrix $\mathrm{Diag}(v_1,\dots,v_n)$, where $\vec v=(v_1,\dots,v_n)$. From (30), we get
$$H_l=\begin{pmatrix}
D(-2\vec\omega\tan(\frac{\vec\omega t}{2}))_{n\times n} & 0 & 0\\[4pt]
0 & \begin{pmatrix}D(\vec\omega\cot\vec\omega t)_{n\times n}&0&\cdots&0\\0&0&&\vdots\\\vdots&&\ddots&\\0&\cdots&&0\end{pmatrix}_{ln\times ln} & \begin{pmatrix}-I&0&\cdots&0\\I&-I&&\vdots\\&\ddots&\ddots&0\\0&\cdots&I&-I\end{pmatrix}_{ln\times ln}\\[4pt]
0 & \begin{pmatrix}-I&I&\cdots&0\\0&-I&\ddots&\vdots\\\vdots&&\ddots&I\\0&\cdots&0&-I\end{pmatrix}_{ln\times ln} & 0
\end{pmatrix}_{(2l+1)n\times(2l+1)n}, \tag{31}$$

where $I=I_{n\times n}$ is the identity matrix of size $n\times n$. Since $H_l$ is of the form

$$H_l=\begin{pmatrix}K&0&0\\0&A&B\\0&B^T&0\end{pmatrix},\quad\text{the inverse matrix equals}\quad H_l^{-1}=\begin{pmatrix}K^{-1}&0&0\\0&0&(B^T)^{-1}\\0&B^{-1}&-B^{-1}A(B^T)^{-1}\end{pmatrix}.$$

A simple calculation shows that

$$H_l^{-1}=\begin{pmatrix}
D(\frac{-1}{2\vec\omega}\cot(\frac{\vec\omega t}{2})) & 0 & 0\\[4pt]
0 & 0 & \begin{pmatrix}-I&-I&\cdots&-I\\0&-I&\cdots&-I\\\vdots&&\ddots&\vdots\\0&\cdots&0&-I\end{pmatrix}\\[4pt]
0 & \begin{pmatrix}-I&0&\cdots&0\\-I&-I&&\vdots\\\vdots&&\ddots&0\\-I&-I&\cdots&-I\end{pmatrix} & \begin{pmatrix}-\Omega&-\Omega&\cdots&-\Omega\\-\Omega&-\Omega&\cdots&-\Omega\\\vdots&&&\vdots\\-\Omega&-\Omega&\cdots&-\Omega\end{pmatrix}
\end{pmatrix}, \tag{32}$$

where $\Omega=D(\vec\omega\cot\vec\omega t)$. It is also easy to see that

$$\det H_l=(-1)^{(l+1)n}\prod_{k=1}^n2\omega_k\tan\frac{\omega_kt}{2}. \tag{33}$$
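The inverse (32) and the determinant (33) can be sanity-checked numerically. The sketch below assembles the Hessian of the quadratic phase $\sum_k\{-\omega_k\tan(\frac{\omega_kt}{2})x_k^2+\frac{\omega_k}{2}\cot(\omega_kt)(z_1^k)^2+\sum_{i=1}^l(z_{i+1}^k-z_i^k)\xi_i^k\}$ with $z_{l+1}:=0$, using the variable ordering $(x,z_1,\dots,z_l,\xi_1,\dots,\xi_l)$, multiplies it by the claimed inverse, and evaluates the determinant by Gaussian elimination. The variable ordering and all function names are choices made for this illustration, not notation from the paper.

```python
import math

def index_maps(n, l):
    """Positions of x^k, z_i^k, xi_i^k in the ordering (x, z_1..z_l, xi_1..xi_l)."""
    x = lambda k: k
    z = lambda i, k: n + (i - 1) * n + k
    xi = lambda i, k: n + l * n + (i - 1) * n + k
    return x, z, xi

def hessian(omegas, t, l):
    """Hessian of the phase at the origin, built term by term."""
    n = len(omegas)
    size = (2 * l + 1) * n
    x, z, xi = index_maps(n, l)
    H = [[0.0] * size for _ in range(size)]
    for k, w in enumerate(omegas):
        H[x(k)][x(k)] = -2.0 * w * math.tan(w * t / 2)   # -w tan(wt/2) x_k^2
        H[z(1, k)][z(1, k)] = w / math.tan(w * t)         # (w/2) cot(wt) (z_1^k)^2
        for i in range(1, l + 1):                          # (z_{i+1}^k - z_i^k) xi_i^k
            H[z(i, k)][xi(i, k)] = H[xi(i, k)][z(i, k)] = -1.0
            if i < l:                                      # z_{l+1} := 0
                H[z(i + 1, k)][xi(i, k)] = H[xi(i, k)][z(i + 1, k)] = 1.0
    return H

def claimed_inverse(omegas, t, l):
    """The matrix displayed in (32)."""
    n = len(omegas)
    size = (2 * l + 1) * n
    x, z, xi = index_maps(n, l)
    G = [[0.0] * size for _ in range(size)]
    for k, w in enumerate(omegas):
        G[x(k)][x(k)] = -1.0 / (2 * w * math.tan(w * t / 2))
        big_omega = w / math.tan(w * t)
        for i in range(1, l + 1):
            for j in range(1, l + 1):
                G[xi(i, k)][xi(j, k)] = -big_omega        # block of -Omega entries
                if j >= i:                                 # triangular blocks of -I
                    G[z(i, k)][xi(j, k)] = G[xi(j, k)][z(i, k)] = -1.0
    return G

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    size, det = len(A), 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            det = -det
        det *= A[c][c]
        for r in range(c + 1, size):
            f = A[r][c] / A[c][c]
            for j in range(c, size):
                A[r][j] -= f * A[c][j]
    return det

def max_deviation_from_identity(P):
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(len(P)) for j in range(len(P)))
```

With, say, `omegas = [1.0, math.sqrt(2.0)]`, `t = 0.3` and `l = 2`, the product `matmul(hessian(...), claimed_inverse(...))` is the identity up to rounding, and `determinant(hessian(...))` agrees with $(-1)^{(l+1)n}\prod_k2\omega_k\tan\frac{\omega_kt}{2}$.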
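By applying the stationary phase lemma to (29) and plugging into (27) we obtain

$$Tr\,U(t)=\prod_{k=1}^n\Big(\frac{\omega_k}{\sin\omega_kt}\Big)^{\frac12}\sum_{l=0}^{m-1}\hbar^{-l}\,\frac{(-1)^{ln}e^{\frac{i\pi}{4}\operatorname{sgn}H_l}}{i^{\,l(n+1)+\frac n2}\sqrt{|\det H_l|}}\int_0^t\cdots\int_0^{s_{l-1}}\sum_{j=0}^\infty\hbar^jP_jb_l(0)\,ds_l\dots ds_1+O(\hbar^{m(\frac12-3\varepsilon)}), \tag{34}$$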
where

$$P_jb_l(x,\vec z,\vec\xi)=\frac{i^{-j}}{2^j.j!}\langle H_l^{-1}\nabla,\nabla\rangle^jb_l(x,\vec z,\vec\xi)=\frac{i^{-j}}{2^j.j!}\sum_{r_1,\dots,r_{2j}\in A_l}h_l^{r_1r_2}\cdots h_l^{r_{2j-1}r_{2j}}\frac{\partial^{2j}b_l(x,\vec z,\vec\xi)}{\partial r_1\dots\partial r_{2j}}, \tag{35}$$

where in the sum (35) the indices $r_1,\dots,r_{2j}$ run in the set $A_l=\{x^k,z_1^k,\dots,z_l^k,\xi_1^k,\dots,\xi_l^k\}_{k=1}^n$, and $h_l^{rr'}$ with $r,r'\in A_l$ corresponds to the $(r,r')$-th entry of the inverse Hessian $H_l^{-1}$.

We note that $P_jb_l(0)=0$ if $2j<3l$. This is true because of (17) and because $W(0)=0$, $\nabla W(0)=0$ and $\mathrm{Hess}\,W(0)=0$. This implies, first, there are not any negative powers of $\hbar$ in the expansion (as we were expecting). Second, the constant term (i.e. the 0-th wave invariant), which corresponds to the term $l=j=0$ in the sum, equals

$$a_0(t)=Tr\,U_0(t)=\prod_{k=1}^n\Big(\frac{\omega_k}{\sin\omega_kt}\Big)^{\frac12}\frac{i^{-\frac n2}e^{-\frac{i\pi n}{4}}}{\prod_{k=1}^n(2\omega_k\tan\frac{\omega_kt}{2})^{\frac12}}=\prod_{k=1}^n\frac{1}{2i\sin\frac{\omega_kt}{2}}.$$
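The closed form $\prod_k(2i\sin\frac{\omega_kt}{2})^{-1}$ for the trace of the oscillator propagator can be compared with the spectral sum directly. The sketch below uses the standard fact that the eigenvalues of the anisotropic oscillator are $\hbar\sum_k\omega_k(m_k+\frac12)$, $m_k\ge0$, so that the factors $e^{-\frac{i}{\hbar}(t-i\tau)E}$ do not depend on $\hbar$; a small $\tau>0$ is used to make the geometric series converge, as in the regularization by $t-i\tau$. The function names and the truncation parameter `terms` are assumptions of this illustration.

```python
import cmath

def trace_by_spectrum(omegas, t, tau, terms=2000):
    """Product over k of the truncated sums of exp(-i (t - i tau) omega_k (m + 1/2))."""
    total = 1.0 + 0.0j
    for w in omegas:
        total *= sum(cmath.exp(-1j * (t - 1j * tau) * w * (m + 0.5))
                     for m in range(terms))
    return total

def trace_closed_form(omegas, t, tau=0.0):
    """Product over k of 1 / (2 i sin(omega_k (t - i tau) / 2))."""
    result = 1.0 + 0.0j
    for w in omegas:
        result /= 2j * cmath.sin(w * (t - 1j * tau) / 2)
    return result
```

Letting `tau` tend to zero in `trace_closed_form` recovers the expression for $a_0(t)$.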
And third (using (33)), for $j\ge1$ the coefficient of $\hbar^j$ in (34) equals

$$a_j(t)=\Big(\prod_{k=1}^n\frac{1}{2i\sin\frac{\omega_kt}{2}}\Big)\sum_{l=1}^{2j}i^{\,l(n-1)+\frac n2}e^{\frac{i\pi}{4}\operatorname{sgn}H_l}\int_0^t\int_0^{s_1}\cdots\int_0^{s_{l-1}}P_{l+j}b_l(0)\,ds_l\dots ds_1.$$

The sum goes only up to $2j$ because if $l>2j$ then $2(l+j)<3l$ and $P_{l+j}b_l(0)=0$. This proves the first two parts of Theorem 1.1.
2.5. Calculations of the wave invariants and the proof of Theorem 1.1.3. In this section we calculate the wave invariants $a_j(t)$ from the formulas (7). First of all, let us investigate how the terms with highest order of derivatives appear in $a_j(t)$. Because $b_l$ is the product of $l$ copies of $W_\hbar$ functions, and because we have to put at least 3 derivatives on each $W_\hbar$ to obtain non-zero terms, the highest possible order of derivatives that can appear in $P_{j+l}b_l(0)$ is $2(j+l)-3(l-1)=2j-l+3$. This implies that, because in the sum (7) we have $1\le l\le2j$, the highest order of derivatives in $a_j(t)$ is $2j+2$ and those derivatives are produced by the term corresponding to $l=1$, i.e. $P_{j+1}b_1(0)$. The formula (7) also shows that $a_j(t)$ is a polynomial of degree $2j$. The term with the highest polynomial order is the one with $l=2j$, i.e. $P_{3j}b_{2j}(0)$ (which has the lowest order of derivatives) and the term $P_{j+1}b_1(0)$ is the linear term of the polynomial.

Now let us calculate $P_{j+1}b_1(x,\vec z,\vec\xi)$ and prove Theorem 1.1.3. By (35),

$$P_{j+1}b_1=\frac{i^{-(j+1)}}{2^{j+1}.(j+1)!}\sum_{r_1,\dots,r_{2j+2}\in A_1}h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}}\frac{\partial^{2j+2}b_1}{\partial r_1\dots\partial r_{2j+2}}, \tag{36}$$

where here by (17),

$$b_1=W_\hbar\Big(\frac{\cos\omega_ks}{2}z^k-\frac{\sin\omega_ks}{\omega_k}\xi^k+\Big(\frac{\sin\omega_k(t-s)+\sin\omega_ks}{\sin\omega_kt}\Big)x^k\Big).$$

Also by (32),

$$H_1^{-1}=\begin{pmatrix}D(\frac{-1}{2\vec\omega}\cot(\frac{\vec\omega t}{2}))&0&0\\0&0&-I\\0&-I&-D(\vec\omega\cot\vec\omega t)\end{pmatrix}.$$

Hence the only non-zero entries of $H_1^{-1}$ are the ones of the form $h_1^{x^kx^k}$, $h_1^{z^k\xi^k}=h_1^{\xi^kz^k}$, and $h_1^{\xi^k\xi^k}$. Now we let

$$\begin{cases}i_{x^kx^k}=\text{the number of times } h_1^{x^kx^k} \text{ appears in } h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}} \text{ in (36)},\\ i_{z^k\xi^k}=\text{the number of times } h_1^{z^k\xi^k} \text{ appears in } h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}} \text{ in (36)},\\ i_{\xi^k\xi^k}=\text{the number of times } h_1^{\xi^k\xi^k} \text{ appears in } h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}} \text{ in (36)}.\end{cases}$$
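By applying these notations to (36), and by (17) we get

$$P_{j+1}b_1=\frac{i^{-(j+1)}}{2^{j+1}(j+1)!}\sum_{\sum_{k=1}^n(i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k})=j+1}\frac{(j+1)!\;2^{\sum_{k=1}^ni_{z^k\xi^k}}}{\prod_{k=1}^ni_{x^kx^k}!\,i_{z^k\xi^k}!\,i_{\xi^k\xi^k}!}$$
$$\times\prod_{k=1}^n\Big(\frac{-\cot\frac{\omega_kt}{2}}{2\omega_k}\Big)^{i_{x^kx^k}}(-1)^{i_{z^k\xi^k}}(-\omega_k\cot\omega_kt)^{i_{\xi^k\xi^k}}$$
$$\times\prod_{k=1}^n\Big(\frac{\sin\omega_k(t-s)+\sin\omega_ks}{\sin\omega_kt}\Big)^{2i_{x^kx^k}}\Big(\frac{\cos\omega_ks}{2}\Big)^{i_{z^k\xi^k}}\Big(\frac{-\sin\omega_ks}{\omega_k}\Big)^{i_{z^k\xi^k}+2i_{\xi^k\xi^k}}\times D^{2j+2}_{2\alpha_1,\dots,2\alpha_n}W_\hbar,$$

where $\alpha_k=i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k}$, for $k=1,\dots,n$. Next we write the above big sum as

$$\sum_{\sum_{k=1}^n(i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k})=j+1}\frac{(j+1)!}{\prod_{k=1}^ni_{x^kx^k}!\,i_{z^k\xi^k}!\,i_{\xi^k\xi^k}!}\,(*)=\sum_{\sum\alpha_k=j+1}\frac{(j+1)!}{\prod\alpha_k!}\prod_{k=1}^n\sum_{i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k}=\alpha_k}\frac{\alpha_k!}{i_{x^kx^k}!\,i_{z^k\xi^k}!\,i_{\xi^k\xi^k}!}\,(*).$$

So the coefficient of $D^{2j+2}_{2\alpha_1,\dots,2\alpha_n}W_\hbar$ in $P_{j+1}b_1$ equals

$$\frac{i^{-(j+1)}(-1)^{j+1}}{2^{j+1}(\prod\alpha_k!)(\prod\omega_k^{\alpha_k})}\prod_{k=1}^n\Big(\frac12\cot\frac{\omega_kt}{2}\Big(\frac{\sin\omega_k(t-s)+\sin\omega_ks}{\sin\omega_kt}\Big)^2-\cos\omega_ks\sin\omega_ks+\cot\omega_kt\sin^2\omega_ks\Big)^{\alpha_k}.$$

Now we observe that the term in the parenthesis simplifies to

$$\frac12\cot\frac{\omega_kt}{2}\Big(\frac{\sin\omega_k(t-s)+\sin\omega_ks}{\sin\omega_kt}\Big)^2-\cos\omega_ks\sin\omega_ks+\cot\omega_kt\sin^2\omega_ks=\frac12\cot\frac{\omega_kt}{2}.$$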
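The simplification of the bracket to $\frac12\cot\frac{\omega_kt}{2}$ is a trigonometric identity in $s$ (it follows from $\sin\omega(t-s)+\sin\omega s=2\sin\frac{\omega t}{2}\cos(\frac{\omega t}{2}-\omega s)$), and in particular the result no longer depends on $s$. A quick numerical check, with hypothetical function names:

```python
import math

def bracket(omega, t, s):
    """Left-hand side: the s-dependent expression before simplification."""
    ratio = (math.sin(omega * (t - s)) + math.sin(omega * s)) / math.sin(omega * t)
    return (0.5 / math.tan(omega * t / 2) * ratio ** 2
            - math.cos(omega * s) * math.sin(omega * s)
            + math.sin(omega * s) ** 2 / math.tan(omega * t))

def simplified(omega, t):
    """Right-hand side: (1/2) cot(omega t / 2), independent of s."""
    return 0.5 / math.tan(omega * t / 2)
```

Because the value is independent of $s$, the $s$-integral over $[0,t]$ of the $l=1$ term simply contributes a factor $t$.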
So we get

$$P_{j+1}b_1=\frac{1}{(2i)^{j+1}}\sum_{|\vec\alpha|=j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}W_\hbar. \tag{37}$$
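Finally, by plugging $(x,\vec z,\vec\xi)=0$ into Eq. (37) and applying it to (7), we get (8). This finishes the proof of Theorem 1.1.3.

For future reference let us highlight the equation we just proved:

$$S_1:=\sum_{r_1,\dots,r_{2j+2}\in A_1}h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}}\frac{\partial^{2j+2}\mathcal{W}}{\partial r_1\dots\partial r_{2j+2}}=(j+1)!\sum_{|\vec\alpha|=j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}\mathcal{W}, \tag{38}$$

where

$$\mathcal{W}=W_\hbar\Big(\frac{\cos\omega_ks}{2}z^k-\frac{\sin\omega_ks}{\omega_k}\xi^k+\Big(\frac{\sin\omega_k(t-s)+\sin\omega_ks}{\sin\omega_kt}\Big)x^k\Big).$$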
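2.6. Calculations of $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$, and the proof of Theorem 1.2. Throughout this section we assume that $V$ is of the form (9). Hence, the only non-zero Taylor coefficients are of the form $D^{2j+2}_{2\vec\alpha}V(0)$, or $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)$, where $\vec e_n=(0,\dots,0,1)$. We notice that based on our discussion in the previous section, the Taylor coefficients of order $2j+1$ appear in $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$, and they are of the form $D^{2j+1}_{\vec\delta}V(0)\,D^3_{\vec\beta}V(0)$. Therefore we look for the coefficients of the data

$$D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0);\qquad|\vec\alpha|=j-1 \tag{39}$$

in the expansion of $a_j(t)$.

Proposition 2.5. In the expansion of $a_j(t)$, the coefficient of the data $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$, $|\vec\alpha|=j-1$, is

$$\frac{c_2(n)\,t}{(2i)^{j+2}\,\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\Big(\Big(\frac{1}{3\omega_n^2}\frac{2\alpha_n+5}{\alpha_n+1}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2+\frac{1}{9\omega_n^4}\Big). \tag{40}$$

Therefore

$$a_j(t)=\frac{c_1(n)\,t}{(2i)^{j+1}}\sum_{|\vec\alpha|=j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}V(0)$$
$$+\frac{c_2(n)\,t}{(2i)^{j+2}}\sum_{|\vec\alpha|=j-1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\Big(\Big(\frac{1}{3\omega_n^2}\frac{2\alpha_n+5}{\alpha_n+1}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2+\frac{1}{9\omega_n^4}\Big)D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$$
$$+\{\text{a polynomial of Taylor coefficients of order}\le2j\}. \tag{41}$$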
Proof. As we mentioned at the beginning of Sect. 2.6, the data $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$, $|\vec\alpha|=j-1$, appears first in $a_j(t)$ and it is a part of the term $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$. So let us calculate those terms in the expansion of $P_{j+2}b_2(0)$ which contain $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$. By (17), since here $l=2$, we have

$$b_2(s_1,s_2,x,x,z_1,z_2,\xi_1,\xi_2)=W_1W_2,\quad\text{where,}$$
$$W_1=W_\hbar\Big(\frac{\cos\omega_ks_1}{2}(z_1^k+z_2^k)-\frac{\sin\omega_ks_1}{\omega_k}\xi_1^k+\Big(\frac{\sin\omega_k(t-s_1)+\sin\omega_ks_1}{\sin\omega_kt}\Big)x^k\Big),$$
$$W_2=W_\hbar\Big(\frac{\cos\omega_ks_2}{2}z_2^k-\frac{\sin\omega_ks_2}{\omega_k}\xi_2^k+\Big(\frac{\sin\omega_k(t-s_2)+\sin\omega_ks_2}{\sin\omega_kt}\Big)x^k\Big). \tag{42}$$

Also from (32) we have

$$H_2^{-1}=\begin{pmatrix}D(\frac{-1}{2\vec\omega}\cot(\frac{\vec\omega t}{2}))&0&0&0&0\\0&0&0&-I&-I\\0&0&0&0&-I\\0&-I&0&-D(\vec\omega\cot(\vec\omega t))&-D(\vec\omega\cot(\vec\omega t))\\0&-I&-I&-D(\vec\omega\cot(\vec\omega t))&-D(\vec\omega\cot(\vec\omega t))\end{pmatrix}_{5n\times5n}. \tag{43}$$
i− j
S , 2 j . j! 2
where S2 is the following sum: r
h r21 r2 . . . h 22 j+3
r2 j+4
(W1 W2 )r1 ...r2 j+4 (0),
(44)
r1 ,...,r2 j+4 ∈A2
where A2 = {x k , z 1k , z 2k , ξ1k , ξ2k }nk=1 and for every r, r ∈ A2 , h rr 2 is the (r, r )-entry of −1 the matrix H2 in (43). We would like to separate out those terms in S2 which include 2 j+1 3 V (0). To do this, from the total number 2 j +4 derivatives that we want D2 α+3 en V (0)D3 en to apply to W1 W2 , we have to put 3 of them on W1 (or W2 ) and put 2 j + 1 of them on W2 (or W1 respectively). These combinations fit into one of the following two different forms: r r h r21 r2 . . . h 22 j+3 2 j+4 (W1 )r1 r2 r3 (W2 )r4 ,r5 ...r2 j+4 (0). (45) S21 = r1 ,...,r2 j+4 ∈A2
There are 2( j + 1)( j + 2) terms of this form in the expansion of S2 . r r h r21 r2 . . . h 22 j+3 2 j+4 (W1 )r1 r3 r5 (W2 )r2 ,r4 ,r6 ,r7 ...r2 j+4 (0). S22 = r1 ,...,r2 j+4 ∈A2
j +2 terms of this form in the expansion of S2 . 3 Now, we calculate the sums S21 and S22 . There are 23
(46)
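The two counts can be reproduced by brute force under one natural reading of the statement, which is an assumption of this sketch: the $2j+4$ derivative slots are grouped into $j+2$ pairs $(r_1,r_2),(r_3,r_4),\dots$, each pair sharing one entry of $H_2^{-1}$, and one counts the ways of choosing which 3 slots act on $W_1$. Either two of the chosen slots form a full pair (the form (45)) or all three lie in different pairs (the form (46)). The function name is hypothetical.

```python
from itertools import combinations
from math import comb

def count_placements(j):
    """Count choices of 3 slots (out of 2j+4) for W_1, split by pairing pattern."""
    slots = range(2 * j + 4)            # slots 2p and 2p+1 form the p-th pair
    with_full_pair = 0                  # pattern of (45): one full pair plus one extra slot
    all_distinct = 0                    # pattern of (46): three different pairs
    for chosen in combinations(slots, 3):
        pairs = {s // 2 for s in chosen}
        if len(pairs) == 2:
            with_full_pair += 1
        else:
            all_distinct += 1
    return with_full_pair, all_distinct
```

For each `j` the result equals `(2 * (j + 1) * (j + 2), 8 * comb(j + 2, 3))`, and the two numbers add up to $\binom{2j+4}{3}$, the total number of ways to choose the three slots.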
2.6.1. Calculation of $S_2^1$. We rewrite $S_2^1$ as

$$S_2^1=\sum_{r_1,\dots,r_4}h_2^{r_1r_2}h_2^{r_3r_4}\Big(\sum_{r_5,\dots,r_{2j+4}}h_2^{r_5r_6}\cdots h_2^{r_{2j+3}r_{2j+4}}(W_2)_{r_5\dots r_{2j+4}}\Big)_{r_4}(W_1)_{r_1r_2r_3}(0).$$

Then from the definition of $W_2$ in (42) and also from (43) it is clear that we can apply (38) to the sum in the big parenthesis above. Hence we get

$$S_2^1=(j+1)!\sum_{r_1,\dots,r_4}h_2^{r_1r_2}h_2^{r_3r_4}\Big(\sum_{|\vec\alpha|=j}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j}_{2\vec\alpha}W_2\Big)_{r_4}(W_1)_{r_1r_2r_3}(0). \tag{47}$$

This reduces the calculation of $S_2^1$ to calculating the small sum

$$A_2^1=\sum_{r_1,\dots,r_4}h_2^{r_1r_2}h_2^{r_3r_4}(\tilde W_2)_{r_4}(W_1)_{r_1r_2r_3}(0),\qquad(\tilde W_2=D^{2j}_{2\vec\alpha}W_2).$$

Computation of the sum $A_2^1$ is straightforward and we omit the details of this computation. Using Maple, we obtain

$$\int_0^t\int_0^{s_1}A_2^1\,ds_2\,ds_1=-\frac{t}{2\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)D^1_{\vec e_n}\tilde W_2\;D^3_{3\vec e_n}W_1(0).$$

If we plug this into (47), after a change of variable $\alpha_n\to\alpha_n+1$ in indices, we get

$$\int_0^t\int_0^{s_1}S_2^1\,ds_2\,ds_1=\sum_{|\vec\alpha|=j-1}\frac{(j+1)!}{\alpha_n+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\times\Big(-\frac{t}{2\omega_n^2}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0). \tag{48}$$

2.6.2. Calculation of $S_2^2$. We rewrite $S_2^2$ as

$$S_2^2=\sum_{r_1,\dots,r_6}h_2^{r_1r_2}h_2^{r_3r_4}h_2^{r_5r_6}\Big(\sum_{r_7,\dots,r_{2j+4}}h_2^{r_7r_8}\cdots h_2^{r_{2j+3}r_{2j+4}}(W_2)_{r_7\dots r_{2j+4}}\Big)_{r_2,r_4,r_6}(W_1)_{r_1r_3r_5}(0).$$

Again from (43) it is clear that we can apply (38) to the sum in the big parenthesis above. So

$$S_2^2=(j+1)!\sum_{r_1,\dots,r_6}h_2^{r_1r_2}h_2^{r_3r_4}h_2^{r_5r_6}\Big(\sum_{|\vec\alpha|=j-1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j-2}_{2\vec\alpha}W_2\Big)_{r_2,r_4,r_6}(W_1)_{r_1r_3r_5}(0). \tag{49}$$

So we need to compute

$$A_2^2=\sum_{r_1,\dots,r_6}h_2^{r_1r_2}h_2^{r_3r_4}h_2^{r_5r_6}(\tilde W_2)_{r_2,r_4,r_6}(W_1)_{r_1r_3r_5}(0),\qquad(\tilde W_2=D^{2j-2}_{2\vec\alpha}W_2).$$

Using Maple

$$\int_0^t\int_0^{s_1}A_2^2\,ds_2\,ds_1=\Big(-\frac{t}{2\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2-\frac{t}{12\omega_n^4}\Big)D^3_{3\vec e_n}\tilde W_2\;D^3_{3\vec e_n}W_1(0).$$

If we plug this into (49) we get

$$\int_0^t\int_0^{s_1}S_2^2\,ds_2\,ds_1=(j+1)!\sum_{|\vec\alpha|=j-1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\times\Big(-\frac{t}{2\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2-\frac{t}{12\omega_n^4}\Big)D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0). \tag{50}$$

On the other hand the part of the expansion of $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$ which contains the data $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$, equals

$$\frac{i^{-j}}{2^j.j!}\Big\{2(j+2)(j+1)\int_0^t\int_0^{s_1}S_2^1+2^3\binom{j+2}{3}\int_0^t\int_0^{s_1}S_2^2\Big\}.$$

Finally, by applying Eqs. (48) and (50) to this we obtain (41).
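Now using Proposition 2.5, we give a proof for Theorem 1.2.

Proof of Theorem 1.2. First of all, we prove that for all $\vec\alpha$, the functions $\big(\cot\frac{\vec\omega}{2}t\big)^{\vec\alpha}$ are linearly independent over $\mathbb{C}$. To show this we define

$$\vec{\cot}:(0,\pi)^n\longrightarrow\mathbb{R}^n,\qquad\vec{\cot}(x_1,\dots,x_n)=(\cot(x_1),\dots,\cot(x_n)).$$

Because $\omega_k$ are linearly independent over $\mathbb{Q}$, the set $\{(\frac{\omega_1}{2}t,\dots,\frac{\omega_n}{2}t)+\pi\mathbb{Z}^n;\ t\in\mathbb{R}\}\cap(0,\pi)^n$ is dense in $(0,\pi)^n$. Since $\vec{\cot}$ is a homeomorphism and is $\pi$-periodic, we conclude that the set $\{(\cot(\frac{\omega_1}{2}t),\dots,\cot(\frac{\omega_n}{2}t));\ t\in\mathbb{R}\}$ is dense in $\mathbb{R}^n$. Now assume

$$\sum_{\vec\alpha}c_{\vec\alpha}\Big(\cot\frac{\vec\omega}{2}t\Big)^{\vec\alpha}=0.$$

Since $\{(\cot(\frac{\omega_1}{2}t),\dots,\cot(\frac{\omega_n}{2}t));\ t\in\mathbb{R}\}$ is dense in $\mathbb{R}^n$, we get

$$\sum_{\vec\alpha}c_{\vec\alpha}\vec X^{\vec\alpha}=0,$$

for every $\vec X=(X_1,\dots,X_n)\in\mathbb{R}^n$. But the monomials $\vec X^{\vec\alpha}$ are linearly independent over $\mathbb{C}$. So $c_{\vec\alpha}=0$.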
Next we argue inductively to recover the Taylor coefficients of $V$ from the wave invariants. Since

$$a_0(t)=\prod_{k=1}^n\frac{1}{2i\sin\frac{\omega_kt}{2}},$$

we can recover $\prod_{k=1}^n\sin\frac{\omega_k}{2}t$, and therefore we can recover $\{\omega_k\}$ up to a permutation. This can be seen by Taylor expanding $\prod_{k=1}^n\sin\frac{\omega_k}{2}t$. We fix this permutation and we move on to recover the third order Taylor coefficient $D^3_{3\vec e_n}V(0)$. This term appears first in $a_1(t)$. By Proposition 2.5, we have

$$a_1(t)=c_1(n)\frac{t}{(2i)^2}\sum_{|\vec\alpha|=2}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega}{2}t\Big)^{\vec\alpha}D^4_{2\vec\alpha}V(0)+c_2(n)\frac{t}{(2i)^3}\Big(\frac{5}{3\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2+\frac{1}{9\omega_n^4}\Big)\big(D^3_{3\vec e_n}V(0)\big)^2+\{\text{a rational function of }\omega_k\}.$$

Now since the functions $\{(\cot\frac{\vec\omega}{2}t)^{\vec\alpha}\}_{|\vec\alpha|=2}$ and $\frac{5}{3\omega_n^2}\big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\big)^2+\frac{1}{9\omega_n^4}$ are linearly independent over $\mathbb{C}$, we can therefore recover the data $\{D^4_{2\vec\alpha}V(0)\}_{|\vec\alpha|=2}$ and $\{D^3_{3\vec e_n}V(0)\}^2$ from $a_1(t)$. So we have determined the third order term $D^3_{3\vec e_n}V(0)$ up to a minus sign from the first invariant $a_1(t)$. This choice of minus sign corresponds to a reflection. We fix this reflection and we move on to determine the higher order Taylor coefficients inductively.

Next we assume $D^3_{3\vec e_n}V(0)\neq0$ and that we know all the Taylor coefficients $D^m_{\vec\beta}V(0)$ with $m\le2j$. We wish to determine the data $\{D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\}_{|\vec\alpha|=j-1}$ and $\{D^{2j+2}_{2\vec\alpha}V(0)\}_{|\vec\alpha|=j+1}$, from the wave invariant $a_j(t)$. At this point we use Proposition 2.5, and to finish the proof of Theorem 1.2 we have to show that the set of functions

$$\Big\{\Big(\cot\frac{\vec\omega}{2}t\Big)^{\vec\alpha};\ |\vec\alpha|=j+1\Big\}\cup\Big\{\Big(\Big(\frac{1}{3\omega_n^2}\frac{2\alpha_n+5}{\alpha_n+1}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2+\frac{1}{9\omega_n^4}\Big)\Big(\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha};\ |\vec\alpha|=j-1\Big\},$$

are linearly independent over $\mathbb{C}$. But this is clear from our discussion at the beginning of the proof.

3. Appendix A

In this Appendix we prove Lemma 2.1.

Proof. First of all we would like to change the function $\Theta$ slightly by rescaling it. We choose $0<\tau<2\varepsilon$ so that $\hbar^{1-\tau}=o(\hbar^{1-2\varepsilon})$. Then we define

$$\Theta_\hbar(x):=\Theta\Big(\frac{x}{\hbar^{1-\tau}}\Big). \tag{51}$$
Thus $\Theta_\hbar\in C_0^\infty([0,\infty))$ is supported in the interval $I_\hbar=[0,\hbar^{1-\tau}\delta]$. In Appendix B, using the min-max principle we show that

$$Tr(\Theta(\hat H)e^{-\frac{it}{\hbar}\hat H})=Tr(\Theta_\hbar(\hat H)e^{-\frac{it}{\hbar}\hat H})+O(\hbar^\infty)=Tr(e^{-\frac{it}{\hbar}\hat H})+O(\hbar^\infty).$$

Hence to prove the lemma it is enough to show

$$Tr(\Theta_\hbar(\hat P)e^{-\frac{it}{\hbar}\hat P})=Tr(\Theta_\hbar(\hat H)e^{-\frac{it}{\hbar}\hat H})+O(\hbar^\infty).$$

To prove this identity we use the WKB construction of the kernel of the operators $\Theta_\hbar(\hat P)e^{-\frac{it}{\hbar}\hat P}$ and $\Theta_\hbar(\hat H)e^{-\frac{it}{\hbar}\hat H}$ and make a comparison between them.

3.1. WKB construction for $\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P}$. In [DSj], Chapter 10, a WKB construction is made for $\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P}$ for symbols $P$ in the symbol class $S_0^0(1)$ which are independent of $\hbar$ or of the form $P(x,\xi,\hbar)\sim P_0(x,\xi)+\hbar P_1(x,\xi)+\dots$, where $P_j\in S_0^0$ are independent of $\hbar$ (but not for symbols $H=H(x,\xi,\hbar)\in S^0_{\delta_0}$). It is shown that we can approximate $\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P}$ for small time $t$, say $t\in(-t_0,t_0)$, by a Fourier integral operator of the form

$$U_P(t)u(x)=(2\pi\hbar)^{-n}\int\!\!\int e^{i(\varphi_P(t,x,\eta)-y.\eta)/\hbar}\,b_P(t,x,y,\eta,\hbar)\,u(y)\,dy\,d\eta,$$

where $b_P\in C^\infty((-t_0,t_0);S(1))$ have uniformly compact support in $(x,y,\eta)$, and $\varphi_P$ is real, smooth and is defined near the support of $b_P$. The functions $\varphi_P$ and $b_P$ are found in such a way that for all $t\in(-t_0,t_0)$,

$$\|\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P}-U_P(t)\|_{tr}=O(\hbar^\infty).$$

Let us briefly review this construction, made in [DSj]. First of all, in Chapter 8, Theorem 8.7, it is proved that for every symbol $P\in S_0^0(1)$, we have $\Theta(\hat P)=Op^w_\hbar(a_P(x,\xi,\hbar))$ for some $a_P(x,\xi,\hbar)\in S_0^0(1)$, where here $\hat P$ and $Op^w_\hbar(a_P(x,\xi,\hbar))$ are respectively the Weyl quantization of $P$ and $a_P(x,\xi,\hbar)$. It is also shown that $a_P\sim a_{P,0}(x,\xi)+\hbar a_{P,1}(x,\xi)+\dots$ for some $a_{P,j}(x,\xi)\in S_0^0(1)$. The idea of proof is as follows. In Theorem 8.1 of [DSj] it is shown that if $\Theta\in C_0^\infty(\mathbb{R})$, and if $\tilde\Theta\in C_0^1(\mathbb{C})$ is an almost analytic extension of $\Theta$ (i.e. $\bar\partial\tilde\Theta(z)=O(|\Im z|^\infty)$), then

$$\Theta(\hat P)=\frac{-1}{\pi}\int_{\mathbb{C}}\frac{\bar\partial\tilde\Theta(z)}{z-\hat P}\,L(dz). \tag{52}$$
ˆ −1 = O p w (r (x, ξ, Then it is verified that for some symbol r (x, ξ, z; ), we have (z − P) z; )). By symbolic calculus, one can find a formal asymptotic expansion of the form r (x, ξ, z; ) ∼
q1 (x, ξ, z) 1 q2 (x, ξ, z) + + 2 + ..., 3 z−P (z − P) (z − P)5
Inverse Spectral Problems for Schrödinger Operators
1081
by formally solving $Op^w(r(x,\xi,z;\hbar))(z - \hat P) = (z - \hat P)\,Op^w(r(x,\xi,z;\hbar)) = 1$. We can see that $q_j(x,\xi,z)$ are polynomials in $z$ with smooth coefficients. Finally it is shown that $\Theta(\hat P) = Op^w(a_P(x,\xi,\hbar))$, where $a_P \in S^0_0$ is given by
\[ a_P(x,\xi,\hbar) = \frac{-1}{\pi}\int_{\mathbb{C}} \bar\partial\tilde\Theta(z)\, r(x,\xi,z;\hbar)\, L(dz). \]
By the above asymptotic expansion for $r(x,\xi,z;\hbar)$ one obtains an asymptotic expansion $a_P \sim a_{P,0} + \hbar a_{P,1} + \dots$, where
\[ a_{P,j} = \frac{-1}{\pi}\int_{\mathbb{C}} \bar\partial\tilde\Theta(z)\, \frac{q_j(x,\xi,z)}{(z-P)^{2j+1}}\, L(dz) = \frac{1}{(2j)!}\,\partial_t^{2j}\big(q_j(x,\xi,t)\Theta(t)\big)\big|_{t=P(x,\xi)}. \tag{53} \]
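Then, again in Chapter 10 of [DSj], it is shown that $\varphi_P(t,x,\eta)$ and $b_P(t,x,y,\eta,\hbar)$ satisfy
\[ \partial_t\varphi_P(t,x,\eta) + P(x,\partial_x\varphi_P(t,x,\eta)) = 0, \qquad \varphi_P|_{t=0} = x.\eta, \tag{54} \]
$b_P \sim b_{P,0} + \hbar b_{P,1} + \dots$, $b_{P,j} = b_{P,j}(t,x,y,\eta) \in C^\infty((-t_0,t_0); S^0_0(1))$, where
\[ \begin{cases} \partial_t b_{P,j} + \langle\partial_x\varphi_P, \partial_x b_{P,j}\rangle + \frac12\Delta_x\varphi_P\,.\,b_{P,j} = -\frac12\Delta_x b_{P,j-1}, & j \ge 0,\ (b_{P,-1} = 0), \\ b_{P,j}|_{t=0} = \psi(x,\eta)\,a_{P,j}\big(\tfrac{x+y}{2},\eta\big)\,\psi(y,\eta). \end{cases} \tag{55} \]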
In (55), $a_{P,j}$ is given by (53) and $\psi(x,\eta)$ is any $C_0^\infty$ function which equals 1 in a neighborhood of $P^{-1}(I)$, where $I = [0,\delta]$ is, as before, the range of our low-lying eigenvalues and where $\Theta$ is supported. There exists a similar construction for $\Theta_\hbar(\hat H)e^{-it\hat H/\hbar}$, except here $H \in S^0_{\delta_0}$.

3.2. WKB construction for $\Theta_\hbar(\hat H)e^{-it\hat H/\hbar}$. Since in (13), $H = H(x,\xi,\hbar) \in S^0_{\delta_0}$, with $\delta_0 = \frac12 - \varepsilon$, we cannot simply use the construction in [DSj] mentioned above. Here in two lemmas we show that the same construction works for the operator $\Theta_\hbar(\hat H)e^{-it\hat H/\hbar}$. We will closely follow the proofs in [DSj].

Lemma 3.1. 1) Let $\Theta_\hbar$ be given by (51) and $H \in S^0_{\delta_0}$ by (13). Then for some $a_H \in S^0_{\delta_0}$ we have $\Theta_\hbar(\hat H) = Op^w(a_H(x,\xi,\hbar))$. Moreover $a_H(x,\xi,\hbar) \sim a_{H,0}(x,\xi,\hbar) + \hbar a_{H,1}(x,\xi,\hbar) + \dots$, where $a_{H,j}(x,\xi,\hbar) \in S^0_{\delta_0}$ is given by
\[ a_{H,j} = \frac{-1}{\pi}\int_{\mathbb{C}} \bar\partial\tilde\Theta_\hbar(z)\, \frac{q_{H,j}(x,\xi,z,\hbar)}{(z-H)^{2j+1}}\, L(dz) = \frac{1}{(2j)!}\,\partial_t^{2j}\big(q_{H,j}(x,\xi,t,\hbar)\Theta_\hbar(t)\big)\big|_{t=H(x,\xi,\hbar)}. \tag{56} \]
2) Choose $c$ such that $0 < c < \min\{1,\omega_k^2\}_{k=1}^n \le \max\{1,\omega_k^2\}_{k=1}^n < \frac1c$. Let $\psi(x,\eta)$ be a function in $C_0^\infty(\mathbb{R}^{2n}) \cap S^0_{\delta_0}(\mathbb{R}^{2n})$ which is supported in the ball $\{x^2+\eta^2 < 4c^{-1}\hbar^{1-\tau}\delta\}$ and equals 1 in a neighborhood of $H^{-1}(I_\hbar)$, where $I_\hbar = [0,\hbar^{1-\tau}\delta]$ ($I_\hbar$ is where $\Theta_\hbar$ is supported). Then
\[ \Theta_\hbar(\hat H)u(x) = (2\pi\hbar)^{-n}\iint e^{i(x-y).\eta/\hbar}\,\psi(x,\eta)\,a_H\big(\tfrac{x+y}{2},\eta,\hbar\big)\,\psi(y,\eta)\,u(y)\,dy\,d\eta + K(\hbar)u(x), \tag{57} \]
where $\|K(\hbar)\|_{tr} = O(\hbar^\infty)$.

Proof of Lemma 3.1. Since $H \in S^0_{\delta_0}$ and $\delta_0 = \frac12 - \varepsilon < \frac12$, the symbolic calculus mentioned in the last section can be followed similarly to prove Lemma 3.1.1. It is also easy to check that in (56), $a_{H,j} \in S^0_{\delta_0}$. The second part of the lemma is stated in [DSj], Eq. 10.1, for the case $P \in S^0_0$. The same argument works for $H \in S^0_{\delta_0}$, precisely because the factor $\hbar^N$ on the right-hand side of the inequality in Proposition 9.5 of [DSj] changes to $\hbar^{N-\delta_0\alpha}$. Thus the discussion on pp. 115–116 still follows.

Lemma 3.2. There exists $t_0 > 0$ such that for every $t \in (-t_0,t_0)$, there exist functions $\varphi_H(t,x,\eta,\hbar)$ and $b_H(t,x,y,\eta,\hbar)$ such that the operator $U_H(t)$ defined by
\[ U_H(t)u(x) = (2\pi\hbar)^{-n}\iint e^{i(\varphi_H(t,x,\eta,\hbar) - y.\eta)/\hbar}\, b_H(t,x,y,\eta,\hbar)\, u(y)\, dy\, d\eta, \tag{58} \]
satisfies
\[ \|\Theta_\hbar(\hat H)e^{-it\hat H/\hbar} - U_H(t)\|_{tr} = O(\hbar^\infty). \]
Moreover, we can choose $\varphi_H$ and $b_H$ such that

1) $\varphi_H$ satisfies the eikonal equation
\[ \partial_t\varphi_H(t,x,\eta,\hbar) + H(x,\partial_x\varphi_H(t,x,\eta,\hbar)) = 0, \qquad \varphi_H|_{t=0} = x.\eta. \tag{59} \]
This equation can be solved in $(-t_0,t_0) \times \{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}$, where $C$ is an arbitrary constant. In fact $\varphi_H$ is independent of $\hbar$ in this domain. (Only the domain of $\varphi_H$ depends on $\hbar$. See (62).)

2) For all $t \in (-t_0,t_0)$, we have $b_H(t,x,y,\eta,\hbar) \in S^0_{\delta_0}$ with $\operatorname{supp} b_H \subset \{x^2+\eta^2,\ y^2+\eta^2 < C_1\hbar^{1-\tau}\delta\}$ for some constant $C_1$. Also $b_H$ has an asymptotic expansion of the form
\[ b_H \sim b_{H,0} + \hbar b_{H,1} + \dots, \qquad b_{H,j} = b_{H,j}(t,x,y,\eta,\hbar) \in C^\infty((-t_0,t_0); S^0_{\delta_0}(1)), \tag{60} \]
and the functions $b_{H,j}$ satisfy the transport equations
\[ \begin{cases} \partial_t b_{H,j} + \langle\partial_x\varphi_H, \partial_x b_{H,j}\rangle + \frac12\Delta_x\varphi_H\,.\,b_{H,j} = -\frac12\Delta_x b_{H,j-1}, & j \ge 0,\ (b_{H,-1} = 0), \\ b_{H,j}|_{t=0} = \psi(x,\eta)\,a_{H,j}\big(\tfrac{x+y}{2},\eta,\hbar\big)\,\psi(y,\eta), \end{cases} \tag{61} \]
where in (61) we let $\psi(x,\eta)$ be a function in $C_0^\infty(\mathbb{R}^{2n}) \cap S^0_{\delta_0}(\mathbb{R}^{2n})$ which is supported in the ball $\{x^2+\eta^2 < 4c^{-1}\hbar^{1-\tau}\delta\}$ and equals 1 in a neighborhood of $H^{-1}(I_\hbar)$, where $I_\hbar = [0,\hbar^{1-\tau}\delta]$. Here $c$ is defined in Lemma 3.1.2. Also in (61), the functions $a_{H,j}$ are defined by (56).
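3) For all $t \in (-t_0,t_0)$,
\[ \varphi_H(t,x,\eta,\hbar) = \varphi_P(t,x,\eta) \quad\text{on}\quad \{x^2+\eta^2,\ y^2+\eta^2 < C_1\hbar^{1-\tau}\delta\} \supset \operatorname{supp}(b_H(x,y,\eta,\hbar)). \tag{62} \]

4) For all $t \in (-t_0,t_0)$,
\[ b_{H,j}(t,x,y,\eta,\hbar) = b_{P,j}(t,x,y,\eta) \quad\text{on}\quad \{x^2+\eta^2,\ y^2+\eta^2 < c\hbar^{1-\tau}\delta\}. \tag{63} \]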
Proof of Lemma 3.2. First of all we assume $U_H(t)$ is given by (58) and we try to solve the equation
\[ \begin{cases} \big\|\big(\tfrac{\hbar}{i}\partial_t + \hat H\big)U_H(t)\big\|_{tr} = O(\hbar^\infty), \\ U_H(0) = \Theta_\hbar(\hat H) \end{cases} \]
for $\varphi_H$ and $b_H$, for small time $t$. Using (57), this leads us to
\[ \begin{cases} e^{-i\varphi_H/\hbar}\big(\tfrac{\hbar}{i}\partial_t + \hat H\big)\big(e^{i\varphi_H/\hbar}b_H\big) \in C^\infty((-t_0,t_0); S^{-\infty}_{\delta_0}(1)), \\ b_H|_{t=0} = \psi(x,\eta)\,a_H\big(\tfrac{x+y}{2},\eta,\hbar\big)\,\psi(y,\eta). \end{cases} \]
We choose the phase function $\varphi_H = \varphi_H(t,x,\eta,\hbar)$ to satisfy the eikonal equation (59). We show that this equation can be solved in a neighborhood of the support of $b_H$, for small time $t \in (-t_0,t_0)$ with $t_0$ independent of $\hbar$. Let us explain how to solve this equation. We let $(x(t,z,\eta;\hbar), \xi(t,z,\eta;\hbar))$ be the solution to the Hamilton equation
\[ \begin{cases} \partial_t x = \partial_\xi H(x,\xi,\hbar) = \xi, & x(0,z,\eta;\hbar) = z, \\ \partial_t\xi = -\partial_x H(x,\xi,\hbar) = -\partial_x V_\hbar(x), & \xi(0,z,\eta;\hbar) = \eta. \end{cases} \tag{64} \]
We can show that (see Sect. 4 of [Ch]) there exists $t_0$ independent of $\hbar$ such that for all $|t| \le t_0$ we have
\[ \begin{cases} |\partial_z x(t,z,\eta;\hbar) - I| \le \frac12, & |\partial_\eta x(t,z,\eta;\hbar)| \le \frac12, \\ |\partial_z\xi(t,z,\eta;\hbar)| \le \frac12, & |\partial_\eta\xi(t,z,\eta;\hbar) - I| \le \frac12. \end{cases} \tag{65} \]
We can choose $t_0$ independent of $\hbar$, precisely because in Eq. 4.4 of [Ch] we have a uniform bound in $\hbar$ for $\mathrm{Hess}(V_\hbar(x))$. Now, we define
\[ \lambda : (z,\eta) \longrightarrow (x(t,z,\eta;\hbar), \eta). \]
It is easy to see that $\lambda(0,0) = (0,0)$. This is because if $(z,\eta) = (0,0)$ then $H(x,\xi) = H(z,\eta) = 0$. By (13) and (2), and $W(x) = O(|x|^3)$, we can see that $H(x,\xi) = 0$ implies $(x(t,0,0;\hbar), \xi(t,0,0;\hbar)) = (0,0)$. On the other hand from (65) we have $\frac12 < |\partial_z x(t,z,\eta;\hbar)| < \frac32$. Therefore $\lambda$ is invertible in a neighborhood of the origin. We define the inverse function by
\[ \lambda^{-1}(x,\eta) = (z(t,x,\eta;\hbar), \eta), \]
which is defined in a neighborhood of $(x,\eta) = (0,0)$. Then we have
\[ \varphi_H(t,x,\eta,\hbar) = z(t,x,\eta;\hbar).\eta + \int_0^t \Big[\tfrac12|\xi(s,z(t,x,\eta;\hbar),\eta;\hbar)|^2 - V_\hbar\big(x(s,z(t,x,\eta;\hbar),\eta;\hbar)\big)\Big]\,ds. \tag{66} \]
A similar formula holds for $\varphi_P$ except in (64) $H$ should be replaced by $P$ and in (66) $V_\hbar$ by $V$. It is known that the eikonal equation for $\varphi_P$ can be solved near $\operatorname{supp} b_P$, for small time $t \in (-t_0,t_0)$. (Of course $t_0$ is independent of $\hbar$.) Now, we want to show that
\[ \varphi_H(t,x,\eta,\hbar) = \varphi_P(t,x,\eta) \quad\text{in}\quad (-t_0,t_0) \times \{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}. \tag{67} \]
Let $(x,\eta)$ be in $\{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}$. First, we show that $|z(t,x,\eta;\hbar)| < 8C^{\frac12}\hbar^{\frac{1-\tau}{2}}\delta^{\frac12}$. Because $z(t,0,0;\hbar) = 0$, by the Fundamental Theorem of Calculus we have
\[ |z(t,x,\eta;\hbar)| \le (|x|+|\eta|)\,\sup\{(|\partial_x| + |\partial_\eta|)(z(t,x,\eta;\hbar))\}. \]
From $x(t,z(t,x,\eta;\hbar),\eta;\hbar) = x$, we get $\partial_x z = (\partial_z x)^{-1}$ and $\partial_\eta z = -(\partial_z x)^{-1}\partial_\eta x$. Thus by (65), $|\partial_x z| + |\partial_\eta z| \le 4$. Hence $|z(t,x,\eta;\hbar)| < 4(|x|+|\eta|) < 8C^{\frac12}\hbar^{\frac{1-\tau}{2}}\delta^{\frac12}$. This implies that for all $|t| \le t_0$, $(x(s,z(t,x,\eta;\hbar),\eta;\hbar), \xi(s,z(t,x,\eta;\hbar),\eta;\hbar))$ will stay in a ball of radius $O(\hbar^{1-\tau})$ centered at the origin (this can be seen from the conservation of energy, i.e. $H(x,\xi) = H(z,\eta)$). On the other hand, by definition (13), $P$ and $H$ agree in the ball $\{x^2+\eta^2 < \frac14\hbar^{1-2\varepsilon}\}$ and $\tau < 2\varepsilon$. So for all $t,s \in (-t_0,t_0)$ and $(x,\eta) \in \{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}$ we have
\[ \begin{aligned} z_P(t,x,\eta) &= z(t,x,\eta;\hbar), \\ x_P(s,z_P(t,x,\eta),\eta) &= x(s,z(t,x,\eta;\hbar),\eta;\hbar), \\ \xi_P(s,z_P(t,x,\eta),\eta) &= \xi(s,z(t,x,\eta;\hbar),\eta;\hbar), \end{aligned} \tag{68} \]
where $z_P(t,x,\eta)$, $x_P(s,z_P(t,x,\eta),\eta)$ and $\xi_P(s,z_P(t,x,\eta),\eta)$ correspond to the Hamilton flow of $P$. Hence by (66) and a similar formula for $\varphi_P$, we have (67). This also shows that we can solve (59) in $(-t_0,t_0) \times \{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}$.

To find $b_H$ we assume it is of the form (60) and we search for functions $b_{H,j}$ such that
\[ e^{-i\varphi_H/\hbar}\big(\tfrac{\hbar}{i}\partial_t + \hat H\big)\big(e^{i\varphi_H/\hbar}b_H\big) \sim 0. \]
After some straightforward calculations and using the eikonal equation for $\varphi_H$ we obtain the so-called transport equations (61). We can solve the transport equations inductively (see [Ch]). In [Ch] it is shown that the solutions to the transport equation (61) are given by
\[ \begin{aligned} b_{H,0}(t,x,y,\eta,\hbar) &= J^{-\frac12}(t,x,\eta,\hbar)\, b_{H,0}(0,z(t,x,\eta;\hbar),y,\eta,\hbar), \\ b_{H,j}(t,x,y,\eta,\hbar) &= J^{-\frac12}(t,x,\eta,\hbar)\Big[ b_{H,j}(0,z(t,x,\eta;\hbar),y,\eta,\hbar) \\ &\qquad - \frac12\int_0^t J^{\frac12}(s,x,\eta,\hbar)\,\Delta_x b_{H,j-1}\big(s,x(s,z(t,x,\eta;\hbar),\eta;\hbar),y,\eta,\hbar\big)\,ds \Big], \end{aligned} \tag{69} \]
where $J(t,x,\eta,\hbar) = \det(\partial_x z(t,x,\eta;\hbar))^{-1}$. Now, we notice by the assumption on $\psi$, we have $\operatorname{supp}(b_{H,j}(0,x,y,\eta;\hbar)) \subset \{x^2+\eta^2,\ y^2+\eta^2 < 4c^{-1}\hbar^{1-\tau}\delta\}$. So by our previous discussion on $z(t,x,\eta,\hbar)$, we can argue inductively that for all $t \in (-t_0,t_0)$, $\operatorname{supp}(b_{H,j}) \subset \{x^2+\eta^2,\ y^2+\eta^2 < C_1\hbar^{1-\tau}\delta\}$ for some constant $C_1$. Since $b_{H,j}|_{t=0} \in S^0_{\delta_0}$, we can also see inductively from (69) that $b_{H,j} \in S^0_{\delta_0}$. Finally, Borel's Theorem produces a compactly supported amplitude $b_H \in S^0_{\delta_0}$ from the compactly supported functions $b_{H,j} \in S^0_{\delta_0}$. This finishes the proof of items 1, 2 and 3 of Lemma 3.2.

Now we give a proof for item 4 of Lemma 3.2. By choosing $C > C_1$, Eq. (62) is clearly true from (67). Next we prove that Eq. (63) holds. Using (53) and (56), and because $P$ and $H$ agree in the ball $\{x^2+\eta^2 < \frac14\hbar^{1-2\varepsilon}\}$, we observe that the functions $a_{P,j}(x,\eta)$ and $a_{H,j}(x,\xi,\hbar)$ agree in this ball. Therefore, because $\operatorname{supp}\psi(x,\eta) \subset \{x^2+\eta^2 < 4c^{-1}\hbar^{1-\tau}\delta\}$ and $\psi = 1$ in $\{x^2+\eta^2 < c\hbar^{1-\tau}\delta\}$, by (55) and (61),
\[ b_{H,j}(0,x,y,\eta,\hbar) = b_{P,j}(0,x,y,\eta) \quad\text{on}\quad \{(x,y,\eta);\ x^2+\eta^2,\ y^2+\eta^2 < c\hbar^{1-\tau}\delta\}. \]
This proves (63) only at $t = 0$. But by applying (68) to (69) and a similar formula for $b_P$, we get (63). This finishes the proof of Lemma 3.2.

To finish the proof of Lemma 2.1, we have to show that for $t$ sufficiently small
\[ Tr\,U_H(t) = Tr\,U_P(t) + O(\hbar^\infty), \]
or equivalently
\[ \iint e^{i(\varphi_H(t,x,\eta,\hbar) - x.\eta)/\hbar}\, b_H(t,x,x,\eta,\hbar)\,dx\,d\eta = \iint e^{i(\varphi_P(t,x,\eta) - x.\eta)/\hbar}\, b_P(t,x,x,\eta,\hbar)\,dx\,d\eta + O(\hbar^\infty). \]
By (62), the phase function $\varphi_H$ of the double integral on the left-hand side equals $\varphi_P$ on the support of the amplitude $b_H$, so $\varphi_H$ is independent of $\hbar$ in this domain. Now, if $t \in (0,t_0)$, where $t_0$ is smaller than the smallest non-zero period of the flows of $P$ and $H$ respectively in the energy balls,
\[ \{(x,\eta)\,|\,H(x,\eta) \le \delta\hbar^{1-\tau}C_1\} \subset \{(x,\eta)\,|\,P(x,\eta) \le \delta\}, \]
then for every such $t$, $(x,\eta) = (0,0)$ is the only critical point of the phase functions $\varphi_H(t,x,\eta,\hbar) - x.\eta$ and $\varphi_P(t,x,\eta) - x.\eta$ in these energy balls. Obviously both integrals in the equation above are convergent because their amplitudes are compactly supported. But the question is whether or not we can apply the stationary phase lemma to these integrals around their unique non-degenerate critical points. By Lemma 3.2 the phase functions $\varphi_H$ and $\varphi_P$ are independent of $\hbar$ on the support of their corresponding amplitudes. Hence $\varphi_H, \varphi_P \in S^0_0$ on $\operatorname{supp} b_H$ and $\operatorname{supp} b_P$ respectively. On the other hand $b_H(t,x,x,\eta,\hbar) \in S^0_{\delta_0}$, $\delta_0 < \frac12$; and $b_P(t,x,x,\eta,\hbar) \in S^0_0$. These facts can be used to get the required estimates for the remainder term in the stationary phase lemma (for an estimate for the remainder term of the stationary phase lemma, see for example Proposition 5.2 of [DSj]). Finally, by (62) and (63) it is obvious that the integrals above must have the same stationary phase expansions.
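The eikonal equation (59) and the flow bounds (65) can be checked by hand in a model case. The following is a minimal Python sketch, assuming the one-dimensional harmonic oscillator $H = \xi^2/2 + \omega^2x^2/2$ (that is, $W = 0$, which is an illustration only and not the general setting of Lemma 3.2). In that case the Hamilton flow is $x(t) = z\cos\omega t + (\eta/\omega)\sin\omega t$, and the closed-form phase $\varphi(t,x,\eta) = x\eta/\cos\omega t - \frac12(\omega x^2 + \eta^2/\omega)\tan\omega t$ solves the eikonal equation with $\varphi|_{t=0} = x\eta$. All function names are hypothetical.

```python
import math


def phi(t, x, eta, omega=1.0):
    """Closed-form phase for H = xi^2/2 + omega^2 x^2/2 (model case only)."""
    c = math.cos(omega * t)
    return x * eta / c - 0.5 * (omega * x * x + eta * eta / omega) * math.tan(omega * t)


def eikonal_residual(t, x, eta, omega=1.0, h=1e-5):
    """Central-difference value of d_t phi + H(x, d_x phi); should be close to 0."""
    dt = (phi(t + h, x, eta, omega) - phi(t - h, x, eta, omega)) / (2 * h)
    dx = (phi(t, x + h, eta, omega) - phi(t, x - h, eta, omega)) / (2 * h)
    return dt + 0.5 * dx * dx + 0.5 * omega ** 2 * x * x


def flow_jacobian(t, omega=1.0):
    """Return (d_z x, d_eta x, d_z xi, d_eta xi) for the linear oscillator flow."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return c, s / omega, -omega * s, c


def bounds_65_hold(t, omega=1.0):
    """Check the four inequalities of (65) at time t for the model flow."""
    dzx, detax, dzxi, detaxi = flow_jacobian(t, omega)
    return (abs(dzx - 1) <= 0.5 and abs(detax) <= 0.5
            and abs(dzxi) <= 0.5 and abs(detaxi - 1) <= 0.5)
```

In this model the bounds (65) hold for small times such as $t = 0.4$ and fail by $t = 1$, which illustrates why the construction is restricted to $|t| \le t_0$.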
4. Appendix B

In this Appendix we prove Lemma 2.2. In fact we prove that if $\Theta_\hbar$ is given by (51) then in the sense of tempered distributions
\[ Tr\big(\Theta_\hbar(\hat H)e^{-it\hat H/\hbar}\big) = Tr\big(e^{-it\hat H/\hbar}\big) + O(\hbar^\infty). \tag{70} \]
Proof of Lemma 2.2 follows similarly. We will use the min-max principle.

Min-max principle. Let $H$ be a self-adjoint operator that is bounded from below, i.e. $H \ge cI$, with purely discrete spectrum $\{E_j\}_{j=0}^\infty$. Then
\[ E_j = \sup_{\varphi_1,\dots,\varphi_{j-1}}\ \inf_{\substack{\psi \in D(H);\ \|\psi\| = 1 \\ \psi \in \mathrm{span}(\varphi_1,\dots,\varphi_{j-1})^\perp}} (\psi, H\psi). \tag{71} \]
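As before we put $\hat H = -\frac12\hbar^2\Delta + V_\hbar(x) = -\frac12\hbar^2\Delta + \frac12\sum_{k=1}^n\omega_k^2x_k^2 + W_\hbar(x)$, and $\hat H^0 = -\frac12\hbar^2\Delta + \frac12\sum_{k=1}^n\omega_k^2x_k^2$. Then if we let $C = \|W_\hbar(x)\|_{L^\infty(\mathbb{R}^n\times(0,h_0))}$, we have
\[ (\psi,\hat H^0\psi) - C \le (\psi,\hat H\psi) \le (\psi,\hat H^0\psi) + C, \]
and therefore by applying the min-max principle to the operators $\hat H$ and $\hat H^0$ we get
\[ E_j^0(\hbar) - C \le E_j(\hbar) \le E_j^0(\hbar) + C. \tag{72} \]
Notice we have explicit formulas for the eigenvalues $E_j^0(\hbar)$ of $\hat H^0$. They are given by the lattice points in the first quadrant of $\mathbb{R}^n$. More precisely
\[ \sigma(\hat H^0) = \Big\{E_\gamma^0(\hbar) = \hbar\sum_{k=1}^n\omega_k\big(\gamma_k + \tfrac12\big);\ \gamma_k \in \mathbb{Z}^{\ge 0}\Big\}. \]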
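Since in the sense of tempered distributions
\[ Tr\big(\Theta_\hbar(\hat H)e^{-it\hat H/\hbar}\big) = Tr\big(\chi_{[0,\delta\hbar^{1-\tau}]}(\hat H)e^{-it\hat H/\hbar}\big) + O(\hbar^\infty) \]
(see for example [Ca], Prop. 6), to prove (70), it is clearly enough to show that for every $\varphi$ in $\mathcal{S}(\mathbb{R})$
\[ \sum_{\{j;\ E_j(\hbar) > \delta\hbar^{1-\tau}\}} \hat\varphi\Big(\frac{E_j(\hbar)}{\hbar}\Big) = O(\hbar^\infty). \]
Since $\hat\varphi$ is in $\mathcal{S}(\mathbb{R})$, for every $p \ge 0$ there exists a constant $C_p$ such that $|\hat\varphi(x)| \le C_p|x|^{-p}$. Hence by (72),
\[ \hat\varphi\Big(\frac{E_j(\hbar)}{\hbar}\Big) \le C_p\Big|\frac{E_j(\hbar)}{\hbar}\Big|^{-p} \le C_p\Big|\frac{E_j^0(\hbar) - C}{\hbar}\Big|^{-p}. \]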
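The decay of this tail sum can be observed numerically in the simplest situation. The sketch below is an illustration under stated assumptions, not the general argument: it uses only the unperturbed oscillator levels $E_\gamma^0(\hbar)/\hbar = \sum_k\omega_k(\gamma_k+\frac12)$, takes the Gaussian $e^{-x^2}$ as a stand-in for a general Schwartz function $\hat\varphi$, and truncates the lattice at a hypothetical cutoff `gamma_max`. The values of `delta` and `tau` are arbitrary sample choices.

```python
import itertools
import math


def oscillator_levels(omegas, gamma_max):
    """Yield E/hbar = sum_k omega_k (gamma_k + 1/2) over the truncated lattice."""
    for gamma in itertools.product(range(gamma_max + 1), repeat=len(omegas)):
        yield sum(w * (g + 0.5) for w, g in zip(omegas, gamma))


def tail_sum(hbar, omegas, delta=1.0, tau=0.5, gamma_max=40):
    """Sum exp(-(E/hbar)^2) over levels with E > delta * hbar**(1 - tau)."""
    # Dividing the condition E > delta * hbar^(1 - tau) by hbar gives this threshold.
    threshold = delta * hbar ** (-tau)
    return sum(math.exp(-s * s)
               for s in oscillator_levels(omegas, gamma_max)
               if s > threshold)
```

As $\hbar$ decreases, the threshold $\delta\hbar^{-\tau}$ grows and the sum collapses faster than any power of $\hbar$, which is the behaviour the estimate for $A(\hbar)$ quantifies.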
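Again using (72) and because $C = \|W_\hbar(x)\|_{L^\infty(\mathbb{R}^n\times(0,h_0))} < A\hbar^{\frac32-3\varepsilon} < \frac{\delta}{4}\hbar^{1-\tau}$ we get
\[ \hat\varphi\Big(\frac{E_j(\hbar)}{\hbar}\Big) \le C_p\Big(\frac{\delta\hbar^{1-\tau} - C}{\delta\hbar^{1-\tau} - 2C}\Big)^p\Big|\frac{E_j^0(\hbar)}{\hbar}\Big|^{-p} < 2^pC_p\Big|\frac{E_j^0(\hbar)}{\hbar}\Big|^{-p}, \quad\text{for } E_j(\hbar) > \delta\hbar^{1-\tau}. \]
Now let $m$ be an arbitrary positive integer. So in order to prove the lemma it is enough to find a uniform bound for
\[ A(\hbar) := \hbar^{-m}\sum_{\{\gamma;\ \hbar\sum_k\omega_k(\gamma_k+\frac12) > \delta\hbar^{1-\tau} - C\}}\Big|\sum_{k=1}^n\omega_k\big(\gamma_k + \tfrac12\big)\Big|^{-p}. \]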
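By applying the geometric-arithmetic mean value inequality we get
\[ \begin{aligned} A(\hbar) &\le n^{-p}\hbar^{-m}\sum_{\{\gamma;\ \hbar\sum_k\omega_k(\gamma_k+\frac12) > \delta\hbar^{1-\tau} - C\}}\Big|\prod_{k=1}^n\omega_k\big(\gamma_k + \tfrac12\big)\Big|^{-p} \\ &\le n^{-p}\sum_{k=1}^n\Bigg\{\Bigg(\hbar^{-m}\sum_{\{\gamma_k\in\mathbb{Z}^{\ge 0};\ \hbar\omega_k(\gamma_k+\frac12) > \frac{\delta\hbar^{1-\tau} - C}{n}\}}\big|\omega_k\big(\gamma_k + \tfrac12\big)\big|^{-p}\Bigg) \times \prod_{k'\ne k}\Bigg(\sum_{\gamma_{k'}}\big|\omega_{k'}\big(\gamma_{k'} + \tfrac12\big)\big|^{-p}\Bigg)\Bigg\}. \end{aligned} \]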
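We claim for $p$ large enough there is a uniform bound for the sum on the right-hand side of the above inequality. It is clear that if $p \ge 2$ then the series $\sum_{\gamma_k}|\omega_k(\gamma_k + \frac12)|^{-p}$ is convergent. Also if for some $\gamma_k$ we have $\hbar\omega_k(\gamma_k + \frac12) > \frac{\delta\hbar^{1-\tau} - C}{n}$, then because $C = O(\hbar^{\frac32-3\varepsilon})$, for $\hbar$ small enough we have $\big(\omega_k(\gamma_k + \frac12)\big)^{1/\tau} > \big(\frac{\delta}{2n}\big)^{1/\tau}\hbar^{-1}$. Thus
\[ \hbar^{-m}\sum_{\{\gamma_k\in\mathbb{Z}^{\ge 0};\ \hbar\omega_k(\gamma_k+\frac12) > \frac{\delta\hbar^{1-\tau} - C}{n}\}}\big|\omega_k\big(\gamma_k + \tfrac12\big)\big|^{-p} \le \Big(\frac{2n}{\delta}\Big)^{m/\tau}\sum_{\gamma_k}\big|\omega_k\big(\gamma_k + \tfrac12\big)\big|^{\frac{m}{\tau} - p}. \]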
So if we choose $p > \max\{\frac{m}{\tau}, 2\}$, then the sum on the right-hand side is convergent and therefore we have a uniform bound for the sum on the left-hand side and hence for $A(\hbar)$. This finishes the proof of (70).

Acknowledgements. I am sincerely grateful to Steve Zelditch for introducing the problem and many helpful discussions and suggestions on the subject. I would also like to thank him for his great support and encouragement as I was writing this article.
References

[BPU] Brummelhuis, R., Paul, T., Uribe, A.: Spectral estimates around a critical level. Duke Math. J. 78(3), 477–530 (1995)
[C] Colin De Verdière, Y.: A semi-classical inverse problem II: reconstruction of the potential. http://arXiv.org/abs/:0802.1643, 2008
[Ca] Camus, B.: A semi-classical trace formula at a non-degenerate critical level. (English Summary) J. Funct. Anal. 208(2), 446–481 (2004)
[CG1] Colin De Verdière, Y., Guillemin, V.: A semi-classical inverse problem I: Taylor expansions. http://arXiv.org/abs/:0802.1605, 2008
[Ch] Chazarain, J.: Spectre d’un Hamiltonien quantique et mécanique classique. Comm. PDE 5, 595–644 (1980)
[D] Duistermaat, J.J.: Oscillatory integrals, Lagrange immersions and unfolding of singularities. Comm. Pure Appl. Math. 27, 207–281 (1974)
[DSj] Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Mathematical Society Lecture Note Series, 268. Cambridge: Cambridge University Press, 1999
[EZ] Evans, L.C., Zworski, M.: Lectures on semiclassical analysis, Lecture notes, available at http://math.berkeley.edu/~zworski/semiclassical.pdf
[GU] Guillemin, V., Uribe, A.: Some inverse spectral results for semi-classical Schrödinger operators. Math. Res. Lett. 14(4), 623–632 (2007)
[GPU] Guillemin, V., Paul, T., Uribe, A.: "Bottom of the well" semi-classical trace invariants. Math. Res. Lett. 14(4), 711–719 (2007)
[ISjZ] Iantchenko, A., Sjöstrand, J., Zworski, M.: Birkhoff normal forms in semi-classical inverse problems. Math. Res. Lett. 9(2-3), 337–362 (2002)
[PU] Paul, T., Uribe, A.: The semi-classical trace formula and propagation of wave packets. J. Funct. Anal. 132(1), 192–249 (1995)
[R] Robert, D.: Autour de l’approximation semi-classique. (French) [On semiclassical approximation] Progress in Mathematics, 68. Boston, MA: Birkhäuser Boston, Inc., 1987
[Sj] Sjöstrand, J.: Semi-excited states in nondegenerate potential wells. Asymptotic Anal. 6(1), 29–43 (1992)
[Sh] Shubin, M.A.: Pseudodifferential operators and spectral theory. Translated from the 1978 Russian original by Stig I. Andersson. Second edition. Berlin: Springer-Verlag, 2001
[U] Uribe, A.: Trace formulae. First Summer School in Analysis and Mathematical Physics (Cuernavaca Morelos, 1998), Contemp. Math. 260. Providence, RI: Amer. Math. Soc., 2000, pp. 61–90
[Z] Zelditch, S.: Reconstruction of singularities for solutions of Schrödinger’s equation. Commun. Math. Phys. 90(1), 1–26 (1983)
[Z1] Zelditch, S.: The inverse spectral problem. With an appendix by J. Sjöstrand and M. Zworski. In: Surv. Differ. Geom. IX, Somerville, MA: Int. Press, 2004, pp. 401–467
[Z2] Zelditch, S.: Inverse spectral problem for analytic domains. I. Balian-Bloch trace formula. Commun. Math. Phys. 248(2), 357–407 (2004)
[Z3] Zelditch, S.: Inverse spectral problem for analytic plane domains II: Z2-symmetric domains. To appear in Ann. Math. http://arXiv.org/abs/math.SP/0111078, 2001; available at http://annals.math.princeton.edu/issues/2006/FinalFiles/Zelditdi.pdf
[Z4] Zelditch, S.: Spectral determination of analytic bi-axisymmetric plane domains. Geom. Funct. Anal. 10(3), 628–677 (2000)

Communicated by B. Simon
Commun. Math. Phys. 288, 1089–1102 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0713-0
Communications in
Mathematical Physics
A Characterization of Dirac Morphisms

E. Loubeau1, R. Slobodeanu2

1 Département de Mathématiques, Université de Bretagne Occidentale, 6, Avenue Victor le Gorgeu, CS 93837, 29238 Brest Cedex 3, France. E-mail: [email protected]
2 Faculty of Physics, Bucharest University, 405 Atomiștilor Str., CP Mg-11, RO - 077125 Bucharest, Romania. E-mail: [email protected]

Received: 30 May 2008 / Accepted: 6 October 2008 Published online: 24 January 2009 – © Springer-Verlag 2009
Abstract: Relating the Dirac operators on the total space and on the base manifold of a horizontally conformal submersion, we characterize Dirac morphisms, i.e. maps which pull back (local) harmonic spinor fields onto (local) harmonic spinor fields.
1. Introduction

Introduced by Jacobi [11] in 1848, harmonic morphisms are maps which pull back local harmonic functions onto harmonic functions and, more recently, they were characterized by Fuglede [7] and Ishihara [10] as horizontally weakly conformal harmonic maps. Their dual nature of analytical and geometrical objects has led to a rich theory (cf. [3]) which has encouraged the study of various other morphisms, that is maps preserving germs of certain differential operators. The central role of the Dirac operator in differential geometry and mathematical physics called for this approach to be applied to harmonic spinors. Unlike previous cases, the first hurdle is to make sense of a notion of pull-back of spinors by a map. This requires the identification of the spinor bundles involved, necessarily restricting our investigation to horizontally conformal maps between Riemannian manifolds and even-dimensional targets (cf. Sect. 2). Combining a chain rule for the Dirac operator and a local existence lemma, we show that a horizontally conformal submersion between spin manifolds is a Dirac morphism if and only if its horizontal distribution is integrable and the mean curvature of the fibres is related to the dilation factor, in a manner reminiscent of the fundamental equation for harmonic morphisms. We conclude with some simple examples between Euclidean spaces and make explicit our results in the set-up of [13], which initially inspired our construction.

The second author benefited from a one-year fellowship of the Conseil Général du Finistère.
2. Pull-Back of a Spinor

Let $(M^m, g)$ be a spin Riemannian manifold, the two-sheeted covering $\rho : \mathrm{Spin}(m) \to \mathrm{SO}(m)$ induces a double cover $\chi : P_{\mathrm{Spin}(m)}M \to P_{\mathrm{SO}(m)}M$ of the bundle of positively oriented orthonormal frames by the principal $\mathrm{Spin}(m)$-bundle over $M$, such that $\chi(s\cdot g) = \chi(s)\cdot\rho(g)$, $\forall s \in P_{\mathrm{Spin}(m)}M$, $g \in \mathrm{Spin}(m)$. The associated bundle $Cl(M) = P_{\mathrm{SO}(m)}M \times_{cl_m} Cl_m$ is the Clifford bundle, where $Cl_m$ is the Clifford algebra and $cl_m$ the representation of $\mathrm{SO}(m)$ into $\mathrm{Aut}(Cl(\mathbb{R}^m))$, and the spinor bundle is $SM = P_{\mathrm{Spin}(m)}M \times_\gamma S_m$, with $\gamma$ the spinorial representation of $\mathrm{Spin}(m)$ on the Clifford module $S_m = \mathbb{C}^{2^{[m/2]}}$ (cf. [12]).

A spinor field is a (smooth) section of $SM$, $\Psi : U \subset M \to SM$, $\Psi(x) = [s_x, \psi(x)]$, where $s_x \in P_{\mathrm{Spin}(m)}M$ is a spinorial frame at $x \in M$ and $\psi : U \to S_m$, the equivalence class being defined by $[s,\psi] = [s\cdot g^{-1}, \gamma(g)\psi]$, for all $g \in \mathrm{Spin}(m)$. The covariant derivative is
\[ \nabla_{e_j}\Psi = \Big[s,\ d\psi(e_j) + \frac12\sum_{k<l}^m\Gamma_{jk}^l\, e_k\cdot e_l\cdot\psi\Big], \]
where $\chi(s) = \{e_1,\dots,e_m\}$ is an orthonormal frame of $TM$, "$\cdot$" is the Clifford multiplication and the symbols $\{\Gamma_{ij}^k\}_{i,j,k=1,\dots,m}$ are defined by $\nabla_{e_i}e_j = \Gamma_{ij}^k e_k$. The Dirac operator is the first-order differential operator defined by
\[ D^M : \Gamma(SM) \to \Gamma(SM), \qquad \Psi \mapsto D^M\Psi = \sum_{j=1}^m e_j\cdot\nabla_{e_j}\Psi. \]
The above constructions extend to any oriented Riemannian bundle and, as for orientation, given three vector bundles $E'$, $E''$ and $E = E' \oplus E''$ over $M$, a choice of spin-structure on any two of them uniquely determines a spin structure on the third ([12]).

Definition 1. A smooth map $\pi : (M^m, g) \to (N^n, h)$ between Riemannian manifolds is a horizontally conformal map if, at any point $x \in M$, $d\pi_x$ maps the horizontal space $\mathcal{H}_x = (\ker d\pi_x)^\perp$ conformally onto $T_{\pi(x)}N$, i.e. $d\pi_x$ is surjective and there exists a number $\lambda(x) \ne 0$ such that
\[ (\pi^*h)_x\big|_{\mathcal{H}_x\times\mathcal{H}_x} = \lambda^2(x)\, g_x\big|_{\mathcal{H}_x\times\mathcal{H}_x}. \]
The function $\lambda$ is the dilation of $\pi$ and the orthogonal complement of $\mathcal{H}_x$ is the vertical distribution $\mathcal{V}_x = \ker d\pi_x$. The mean curvatures of the distributions $\mathcal{H}$ and $\mathcal{V}$ are denoted $\mu^{\mathcal{H}}$ and $\mu^{\mathcal{V}}$ and $I^{\mathcal{H}}$ is the integrability tensor of $\mathcal{H}$. A frame $\{V_a, X_i^*\}_{i=1,\dots,n}^{a=1,\dots,m-n}$ of $TM$ will be called adapted if $V_a \in \mathcal{V}$, $a = 1,\dots,m-n$ and $\{X_i^*\}_{i=1,\dots,n}$ is the horizontal lift by $\pi$ of an orthonormal frame $\{X_i\}_{i=1,\dots,n}$ on $N$. Note that $\lambda \equiv 1$ corresponds to Riemannian submersions. We call the map $\pi : (M^m, g_1) \to (N^n, h)$, where $g_1 = \pi^*h + g^{\mathcal{V}}$, the associated Riemannian submersion of $\pi : (M^m, g) \to (N^n, h)$.
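Definition 1 can be tested numerically on a concrete map. The sketch below is an illustration that is not taken from the article: it assumes Euclidean metrics and the holomorphic map $\pi(z,w) = zw$ from $\mathbb{C}^2 = \mathbb{R}^4$ to $\mathbb{C} = \mathbb{R}^2$, which is a submersion away from the origin. With Euclidean metrics, the condition of Definition 1 is equivalent to $d\pi_x\,d\pi_x^T = \lambda^2(x)\,\mathrm{Id}$, and for this map $\lambda^2 = |z|^2 + |w|^2$. The helper names are hypothetical.

```python
def pi_map(p):
    """pi(z, w) = z * w on C^2 = R^4, written in real coordinates."""
    x1, y1, x2, y2 = p
    v = complex(x1, y1) * complex(x2, y2)
    return (v.real, v.imag)


def jacobian(f, p, h=1e-6):
    """2 x len(p) Jacobian of f at p by central differences."""
    cols = []
    for k in range(len(p)):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(f(plus), f(minus))])
    # cols[k] is the k-th column; transpose into a list of rows.
    return [[cols[k][r] for k in range(len(p))] for r in range(2)]


def gram(J):
    """J J^T; equals lambda^2 times the identity exactly when d(pi) is
    conformal on the horizontal space (ker d pi)^perp with dilation lambda."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in J] for r1 in J]
```

At the sample point $(z,w) = (1+2i,\ -0.5+0.3i)$ the matrix `gram(jacobian(pi_map, p))` should be close to $5.34\,\mathrm{Id}$, since $|z|^2 + |w|^2 = 5 + 0.34$. The target dimension here is $n = 2$, consistent with the even-dimensional assumption made below.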
For the remainder of the article, we will make the blanket assumption that the dimension $n$ of the manifold $N$ is even. Since a general submersion $\pi : (M^m, g) \to (N^n, h)$, between spin Riemannian manifolds, splits the tangent bundle $TM$ into $\mathcal{H} \oplus \mathcal{V}$, if $\mathcal{H}$ admits a spin structure, so does $\mathcal{V}$ and
\[ Cl(M) = Cl(\mathcal{H})\,\hat\otimes\,Cl(\mathcal{V}). \tag{1} \]
The spin structures $P_{\mathrm{Spin}(n)}\mathcal{H}$ and $P_{\mathrm{Spin}(m-n)}\mathcal{V}$ induce a spin structure $P_{\mathrm{Spin}(m)}M$ by prolongation of the principal bundle $P_{\mathrm{Spin}(n)\times\mathrm{Spin}(m-n)}M$ (cf. [9]). General properties of associated bundles of reduced principal bundles ([9, Theorem 3.1]) together with a dimension count yield the following isomorphisms of (associated) vector bundles
\[ SM = S\mathcal{H} \otimes S\mathcal{V}. \tag{2} \]
For any map π : M → N into a spin manifold, consider the pull-back spinor bundle π −1 S N = {(x, [s, ψ]) ∈ M × S N | S ([s, ψ]) = π(x)}, where S is the projection map of S N . If π is a Riemannian submersion, then the isomorphism π −1 S N = SH, due to the identification of orthonormal frames, simplifies (2) into (cf. [5]) S M = π −1 S N ⊗ SV.
(3)
Remark 1. When π is a Riemannian submersion with totally geodesic fibres, H complete and N connected then the fibres are isometric to a Riemannian manifold F. If N and F are spin manifolds, consider on M the induced spin structure and, via the isomorphism π −1 S N = SH, (2) reads (see [13]) S M = π −1 S N ⊗ S F. Remark 2. Since n is even, the Clifford algebra Cln possesses an irreducible complex module Sn of complex dimension 2n/2 , the complex spinor module. When restricted to Cl0n the spinor module decomposes into Sn = Sn+ ⊕ Sn− , the submodules of spinors of positive and negative chirality, characterized by the action of the volume element, once an orientation is given. In particular, the spin group Spin(n) ⊂ Cl0n acts on Sn+ and on Sn− (the spinor representations). Moreover, Sn+1 pulls back to Sn under the algebra isomorphism Cln = Cl0n+1 . In other words, we can regard Sn+1 as the spinor representation of Cln , provided we define the action of Cln on Sn+1 by v ⊗ σ → e0 · v · σ . When π : (M n+1 , g) → (N n , h) is a Riemannian submersion with one-dimensional fibres into a spin manifold N n , the manifold M n+1 inherits a natural spin structure, cf [14]. Moreover, S M = π −1 S N . These identifications justify the following definition. Definition 2. (Riemannian submersions). Let π : (M m , g) → (N n , h) be a Riemannian submersion between spin manifolds and endow the vertical bundle with the induced spin structure (if m = n + 1, we consider on M the natural spin structure inherited from N ).
Let = [s, ψ] be a (local) spinor field on N . Since a local spin frame s = {X i }i=1,...,n on N lifts to an adapted spin frame on M a=1,...,m−n s˜ = {X i∗ , Va }i=1,...,n ∈ PSpin(n)×Z2 Spin(m−n) M|π −1 U ,
where X i∗ is the horizontal lift of X i and {Va }a=1,...,m−n is an orthonormal frame of V, we define the pull-back of to be • If m − 1 = n, the section = [˜s , ψ ◦ π ] of the bundle π −1 S N , identified with S M. = (ψ ◦ π ) ⊗ α] in π −1 S N ⊗ SV, identified • If m − n ≥ 2, the section = [ s, ψ with S M, where α is a fixed (non-zero) section of SV. Remark 3. When m − n ≥ 2, this notion of pull-back depends on the choice of the section α. In general, there exists no such non-vanishing global section. Remark 4. Note that Clifford multiplication with this kind of spinor fields is given by = X = iψ, X∗ · ψ · ψ; V · ψ when m = n + 1, and by X ∗ · ((ψ ◦ π ) ⊗ α) = ((X · ψ) ◦ π ) ⊗ α; V · ((ψ ◦ π ) ⊗ α) = (ψ ◦ π ) ⊗ V · α, when m − n ≥ 2, where X ∗ is the horizontal lift of the vector field X ∈ (T N ), V is a unit vertical vector field and ψ is the conjugate of ψ with respect to the Z2 -graduation (see [13]). Remark 5. For a horizontally conformal submersion π : (M m , g) → (N n , h), we deform the metric on M into g1 = π ∗ h + g V and denote by (cf. [12]) ξγ : S M1 → S M, ξγ ([s, ψ]) = [ξ(s), ψ], where ξ(s) = {E i = λE i1 , Va } if s = {E i1 , Va }, the bundle isometry induced by the Spin-equivariant map ξ : PSpin(n)×Z2 Spin(m−n) M1 → PSpin(n)×Z2 Spin(m−n) M given by the natural correspondence between adapted orthogonal frames with respect to the two metrics: E i1 = λ−1 E i , Va1 = Va . The Clifford multiplication will be given by E i · = ξγ E i1 · 1 , Va · = ξγ (Va · 1 ) , where = ξγ ◦ 1 . Definition 3. (Horizontally conformal submersions) Let π : (M m , g) → (N n , h) be a horizontally conformal submersion between spin manifolds and endow the vertical bundle with the induced spin structure. Let = [s, ψ] be a (local) spinor field on N . The pull-back of is = ξγ ◦ 1 , where 1 is the pull-back of by the associated Riemannian submersion π : (M m , g1 ) → (N n , h) and ξγ the bundle isometry between S M1 and S M.
3. Dirac Morphisms with High Dimensional Fibres

Throughout this section $\pi$ has fibres of dimension at least two.

Definition 4. A horizontally conformal submersion $\pi : (M^m, g) \to (N^n, h)$ between spin manifolds is called a Dirac morphism if there exists a section $\alpha \in \Gamma(S\mathcal{V})$, $\nabla^{\mathcal{V}}$-parallel in horizontal directions with $D^{\mathcal{V}}\alpha - \frac{n}{2}\mu^{\mathcal{H}}\cdot\alpha = 0$, and if for any local harmonic spinor $\Psi$ defined on $U \subseteq N$, such that $\pi^{-1}(U) \ne \emptyset$, the pull-back of $\Psi$ (with respect to $\alpha$) $\tilde\Psi = \xi_\gamma\circ\tilde\Psi^1$ is a harmonic spinor on $\pi^{-1}(U) \subseteq M$, where $\tilde\Psi^1$ is the pull-back (with respect to $\alpha$) of $\Psi$ by the associated Riemannian submersion.

Remark 6. We assume, in Definition 4, the existence of the section $\alpha$ in order to construct pull-backs of spinors. Though these pull-backs will depend on the choice of $\alpha$ (if any), the two conditions on $\alpha$ will make the notion of Dirac morphism independent of the choice of such a section $\alpha$.

We first need some lemmas.

Lemma 1. (Chain rule) Let $\pi : (M^m, g) \to (N^n, h)$ be a horizontally conformal submersion of dilation $\lambda$ ($m - n \ge 2$) and $\psi$ a (local) spinor field on $N$. If $\tilde\psi$ is the pull-back of $\psi$ by $\pi$, with respect to some section $\alpha \in \Gamma(S\mathcal{V})$, then
\[ \begin{aligned} D^M\tilde\psi = {}& \lambda\,\widetilde{D^N\psi} - \frac12\Big[(m-n)\mu^{\mathcal{V}} + (n-1)\,\mathrm{grad}^{\mathcal{H}}(\ln\lambda)\Big]\cdot\tilde\psi \\ & + \sum_{i=1}^n E_i\cdot(\psi\circ\pi)\otimes\nabla^{\mathcal{V}}_{E_i}\alpha + \frac14 I^{\mathcal{H}}\cdot\tilde\psi \\ & + (\psi\circ\pi)\otimes\Big(D^{\mathcal{V}}\alpha - \frac{n}{2}\mu^{\mathcal{H}}\cdot\alpha\Big), \end{aligned} \tag{4} \]
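where $\{E_i\}_{i=1,\dots,n}$ is a local orthonormal horizontal frame on $M$ and $I^{\mathcal{H}}\cdot\tilde\psi$ denotes $\sum_{i<j=1}^n E_i\cdot E_j\cdot I^{\mathcal{H}}(E_i,E_j)\cdot\tilde\psi$ (the standard action of vector-valued 2-forms on spinor fields).

Proof. Let $\pi$ be a horizontally conformal submersion of dilation $\lambda$. Let $\{X_i\}_{i=1,\dots,n}$ be an orthonormal frame on $(N,h)$ and $\{V_a, X_i^*\}_{i=1,\dots,n}^{a=1,\dots,m-n}$ an orthonormal adapted frame on $(M,g_1)$, where $g_1 = \pi^*h + g^{\mathcal{V}}$. With respect to the metric $g$, $\{V_a, \lambda X_i^*\}_{i=1,\dots,n}^{a=1,\dots,m-n}$ is an orthonormal adapted frame. Denote by $\nabla$ and $\nabla^1$ the (spinorial) connections corresponding to $g$ and $g_1$, and note $E_i^1 = X_i^*$, $E_i = \lambda X_i^*$. As $D^M\tilde\Psi = [\tilde s, D^M\tilde\psi]$, for the pull-back spinor field $\tilde\psi$,
\[ \begin{aligned} D^M\tilde\psi &= \sum_{i=1}^n E_i\cdot\nabla_{E_i}\tilde\psi + \sum_{a=1}^{m-n} V_a\cdot\nabla_{V_a}\tilde\psi \\ &= \sum_{i=1}^n E_i\cdot E_i\big((\psi\circ\pi)\otimes\alpha\big) && (H_0) \\ &\quad + \frac14\sum_{i,j,k=1}^n E_i\cdot g(\nabla_{E_i}E_j, E_k)\,E_j\cdot E_k\cdot\tilde\psi && (H_1) \\ &\quad + \frac12\sum_{i,j,a=1}^{n,m-n} E_i\cdot g(\nabla_{E_i}E_j, V_a)\,E_j\cdot V_a\cdot\tilde\psi && (H_2) \\ &\quad + \frac14\sum_{i,a,b=1}^{n,m-n} E_i\cdot g(\nabla_{E_i}V_a, V_b)\,V_a\cdot V_b\cdot\tilde\psi && (H_3) \\ &\quad + \sum_{a=1}^{m-n} V_a\cdot V_a\big((\psi\circ\pi)\otimes\alpha\big) && (V_0) \\ &\quad + \frac14\sum_{i,j,a=1}^{n,m-n} V_a\cdot g(\nabla_{V_a}E_i, E_j)\,E_i\cdot E_j\cdot\tilde\psi && (V_1) \\ &\quad + \frac12\sum_{i,a,b=1}^{n,m-n} V_a\cdot g(\nabla_{V_a}E_i, V_b)\,E_i\cdot V_b\cdot\tilde\psi && (V_2) \\ &\quad + \frac14\sum_{a,b,c=1}^{m-n} V_a\cdot g(\nabla_{V_a}V_b, V_c)\,V_b\cdot V_c\cdot\tilde\psi && (V_3). \end{aligned} \]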
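Note that
\[ \widetilde{D^N\psi} = \xi_\gamma\big(\widetilde{D^N\psi}^1\big) = \xi_\gamma\Big(\big(X_i^*\cdot X_i^*(\psi\circ\pi) + \tfrac14\, g_1(\nabla^1_{X_i^*}X_j^*, X_k^*)\,X_i^*\cdot X_j^*\cdot X_k^*\cdot(\psi\circ\pi)\big)\otimes\alpha\Big), \]
where $g_1(\nabla^1_{X_i^*}X_j^*, X_k^*) = h(\nabla^N_{X_i}X_j, X_k)$.

The computation breaks down into five steps:

Step 1.
\[ (H_0) + (H_1) + (H_3) = \lambda\,\widetilde{D^N\psi} - \frac{n-1}{2}\,\mathrm{grad}^{\mathcal{H}}(\ln\lambda)\cdot\tilde\psi + \sum_{i=1}^n E_i\cdot(\psi\circ\pi)\otimes\nabla^{\mathcal{V}}_{E_i}\alpha. \tag{5} \]
As
\[ (H_0) + (H_1) = \sum_{i=1}^n E_i\cdot\big[E_i(\psi\circ\pi)\otimes\alpha + (\psi\circ\pi)\otimes E_i(\alpha)\big] + \frac14\sum_{i,j,k=1}^n E_i\cdot g(\nabla_{E_i}E_j, E_k)\,E_j\cdot E_k\cdot\tilde\psi, \]
in order to recognize the lift of $D^N\psi$, first observe that
\[ g(\nabla_{E_i}E_j, E_k) = g(\nabla_{\lambda X_i^*}\lambda X_j^*, \lambda X_k^*) = \lambda\, g_1(\nabla^1_{X_i^*}X_j^*, X_k^*) + X_k^*(\lambda)\delta_i^j - X_j^*(\lambda)\delta_i^k. \]
Whilst
\[ E_i\cdot E_j\cdot E_k\cdot\tilde\psi = \xi_\gamma\big(E_i^1\cdot E_j^1\cdot E_k^1\cdot\tilde\psi^1\big) = \xi_\gamma\big(X_i^*\cdot X_j^*\cdot X_k^*\cdot\tilde\psi^1\big), \]
and
\[ E_i\cdot E_i(\psi\circ\pi)\otimes\alpha = \lambda E_i\cdot E_i^1(\psi\circ\pi)\otimes\alpha = \lambda\,\xi_\gamma\big(X_i^*\cdot X_i^*(\psi\circ\pi)\otimes\alpha\big). \]
Hence
\[ (H_0) + (H_1) = \lambda\,\xi_\gamma\big(\widetilde{D^N\psi}^1\big) + \sum_{i=1}^n E_i\cdot(\psi\circ\pi)\otimes E_i(\alpha) + \xi_\gamma\Big(\frac14\sum_{i,j,k=1}^n\big(X_k^*(\lambda)\delta_i^j - X_j^*(\lambda)\delta_i^k\big)X_i^*\cdot X_j^*\cdot X_k^*\cdot\tilde\psi^1\Big). \]
The last term can be rewritten
\[ \frac14\sum_{i,j,k=1}^n\big(X_k^*(\lambda)\delta_i^j - X_j^*(\lambda)\delta_i^k\big)X_i^*\cdot X_j^*\cdot X_k^*\cdot\tilde\psi^1 = -\frac{n-1}{2}\,\mathrm{grad}^{\mathcal{H}_1}(\lambda)\cdot\tilde\psi^1, \]
and, since $\mathrm{grad}^{\mathcal{H}}(\lambda) = \lambda^2\,\mathrm{grad}^{\mathcal{H}_1}(\lambda)$, it becomes
\[ \xi_\gamma\Big(-\frac{n-1}{2}\,\mathrm{grad}^{\mathcal{H}_1}(\lambda)\cdot\tilde\psi^1\Big) = -\frac{n-1}{2}\,\lambda\,\mathrm{grad}^{\mathcal{H}_1}(\lambda)\cdot\tilde\psi = -\frac{n-1}{2}\,\mathrm{grad}^{\mathcal{H}}(\ln\lambda)\cdot\tilde\psi. \]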
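Summing up with $(H_3)$, we obtain (5).

Step 2.
\[ (V_2) = -\frac{m-n}{2}\,\mu^{\mathcal{V}}\cdot\tilde\psi. \tag{6} \]
As $g(\nabla_{V_a}E_j, V_b) = -g(E_j, \nabla_{V_a}V_b)$ and $\mathcal{V}$ is integrable
\[ \begin{aligned} (V_2) &= -\frac12\sum_{a,b=1}^{m-n} V_a\cdot(\nabla_{V_a}V_b)^{\mathcal{H}}\cdot V_b\cdot\tilde\psi \\ &= -\frac12\Big[\sum_{a<b} V_a\cdot(\nabla_{V_a}V_b)^{\mathcal{H}}\cdot V_b + \sum_{a>b=1}^{m-n} V_a\cdot(\nabla_{V_a}V_b)^{\mathcal{H}}\cdot V_b\Big]\cdot\tilde\psi - \frac12\sum_{a=1}^{m-n}(\nabla_{V_a}V_a)^{\mathcal{H}}\cdot\tilde\psi \\ &= -\frac12\sum_{a<b}^{m-n} V_a\cdot[V_a,V_b]^{\mathcal{H}}\cdot V_b\cdot\tilde\psi - \frac{m-n}{2}\,\mu^{\mathcal{V}}\cdot\tilde\psi \\ &= -\frac{m-n}{2}\,\mu^{\mathcal{V}}\cdot\tilde\psi. \end{aligned} \]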
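Step 3.
\[ (V_0) + (V_3) = (\psi\circ\pi)\otimes D^{\mathcal{V}}\alpha, \tag{7} \]
since
\[ (V_0) + (V_3) = \sum_{a=1}^{m-n}(\psi\circ\pi)\otimes V_a\cdot V_a(\alpha) + \frac14\sum_{a,b,c=1}^{m-n} V_a\cdot g(\nabla_{V_a}V_b, V_c)\,V_b\cdot V_c\cdot\tilde\psi = (\psi\circ\pi)\otimes V_a\cdot\nabla^{\mathcal{V}}_{V_a}\alpha. \]

Step 4.
\[ (H_2) = \frac12 I^{\mathcal{H}}\cdot\tilde\psi - \frac{n}{2}\,\mu^{\mathcal{H}}\cdot\tilde\psi. \tag{8} \]
As for Step 2, we have
\[ (H_2) = \frac12\sum_{i,j,a=1}^{n,m-n} E_i\cdot g(\nabla_{E_i}E_j, V_a)\,E_j\cdot V_a\cdot\tilde\psi = \frac12\sum_{i<j=1}^n E_i\cdot E_j\cdot I^{\mathcal{H}}(E_i,E_j)\cdot\tilde\psi - \frac{n}{2}\,\mu^{\mathcal{H}}\cdot\tilde\psi, \]
where the terms $i = j$ give $\mu^{\mathcal{H}}$.

Step 5.
\[ (V_1) = -\frac14 I^{\mathcal{H}}\cdot\tilde\psi, \tag{9} \]
since
\[ \begin{aligned} (V_1) &= \frac14\sum_{i,j,a=1}^{n,m-n} g(\nabla_{V_a}E_i, E_j)\,V_a\cdot E_i\cdot E_j\cdot\tilde\psi \\ &= \frac14\sum_{i,j,a=1}^{n,m-n}\big[g([V_a,E_i],E_j) - g(\nabla_{E_i}E_j, V_a)\big]V_a\cdot E_i\cdot E_j\cdot\tilde\psi \\ &= \frac14\sum_{i,j,a=1}^{n,m-n} g([V_a,E_i],E_j)\,V_a\cdot E_i\cdot E_j\cdot\tilde\psi - \frac12(H_2). \end{aligned} \]
But $g([V_a,E_i],E_j) = g([V_a,\lambda X_i^*],\lambda X_j^*) = \frac{V_a(\lambda)}{\lambda}\delta_i^j$, since $[V_a, X_i^*] \in \mathcal{V}$. Therefore
\[ \begin{aligned} (V_1) &= -\frac{n}{4}\sum_{a=1}^{m-n} V_a(\ln\lambda)\,V_a\cdot\tilde\psi - \frac12(H_2) \\ &= -\frac{n}{4}\,\mathrm{grad}^{\mathcal{V}}(\ln\lambda)\cdot\tilde\psi - \frac12(H_2) \\ &= -\frac{n}{4}\,\mu^{\mathcal{H}}\cdot\tilde\psi - \frac12(H_2) \\ &= -\frac14 I^{\mathcal{H}}\cdot\tilde\psi. \end{aligned} \]
Summing up these five steps yields the chain rule.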
A generalization of [1, Prop. 2.4] to vector-valued functions yields local existence of harmonic spinors with prescribed value.

Lemma 2. (Local existence). For any point $p \in M$ and $\psi_0 \in S_pM$, there exists an open neighbourhood $U$ of $p$ and a harmonic spinor $\psi : U \to SM$ such that $\psi(p) = \psi_0$.

Theorem 1. (Characterization for horizontally conformal submersions). Let $\pi : (M, g) \to (N, h)$ be a horizontally conformal submersion between spin manifolds and assume there exists a section $\alpha$ satisfying the conditions of Definition 4. Then $\pi$ is a Dirac morphism if and only if its horizontal distribution is integrable and
$$(m-n)\,\mu^{\mathcal{V}} + (n-1)\,\mathrm{grad}^{\mathcal{H}}(\ln\lambda) = 0, \qquad (10)$$
where µV is the mean curvature of the fibres. Proof. Let π be a horizontally conformal submersion with integrable horizontal distribution and such that (10) is satisfied. In this case, the Chain Rule (4) simplifies to Nψ + D ψ = λD M
n
E i · (ψ ◦ π ) ⊗ ∇ EVi α + (ψ ◦ π ) ⊗ D V α − n2 µH · α .
i=1
Let ψ be a local harmonic spinor and α a section of (SV), ∇ V -parallel in horizontal directions and with D V α − n2 µH · α = 0. From the above formula, it follows that = 0 and therefore π is a Dirac morphism. DMψ Conversely, suppose that π is a Dirac morphism. Then, by Definition 4, there exists a horizontally parallel section α ∈ (SV) satisfying D V α − n2 µH · α = 0, and, for a harmonic spinor field ψ on N , we have, according to (4), 0 = − 21 (m − n)µV + (n − 1)gradH (lnλ) · ψ (11) + 41
n
X i∗ · X ∗j · (ψ ◦ π ) ⊗ I H (X i∗ , X ∗j ) · α.
i< j=1
Putting X = (m − n)µV + (n − 1)gradH (lnλ) and V i j = I H (X i∗ , X ∗j ), (11) becomes + 0 = − 21 X · ψ
1 4
n
. X i∗ · X ∗j · V i j · ψ
i< j=1
at p ∈ M can be prescribed, the above equation implies As the value of ψ 0 = − 21 X +
1 4
n
X i∗ · X ∗j · V i j .
i< j=1
But since $V_{ij}$ is vertical, $X$ and $X_i^* \cdot X_j^* \cdot V_{ij}$ have different degrees, so necessarily $X = 0$ and $V_{ij} = 0$. Therefore, if a horizontally conformal submersion is a Dirac morphism, it must satisfy Eq. (10) and have integrable horizontal distribution.

Remark 7. (1) Note the analogy between Eq. (10) and the fundamental equation for harmonic morphisms in [3].
(2) Compare Formula (4) with [5, (4.26)] and [13, 1.1.1].
(3) If the fibres are totally geodesic, the integrability of the horizontal distribution makes the section $\alpha$ “basic transversally harmonic”, as introduced in [8].

Corollary 1. A Riemannian submersion $\pi : (M^m, g) \to (N^n, h)$ between spin manifolds is a Dirac morphism if and only if its fibres are minimal and its horizontal distribution is integrable.

Recall that if $\pi$ is a Riemannian submersion then $\mu^{\mathcal{H}} = 0$.

Remark 8. If the dilation function $\lambda$ is a projectable function (i.e. $V(\lambda) = 0$), the conformal invariance of the Dirac operator ([12]) allows a correspondence between harmonic spinors of the spaces involved in the commutative diagram below.
$$\begin{array}{ccc}
(M, \lambda^2\pi^*h + g^{\mathcal{V}}) & \xrightarrow{\ 1_M\ } & (M, \pi^*h + g^{\mathcal{V}}) \\
{\scriptstyle \pi}\downarrow & \searrow & \downarrow{\scriptstyle \pi} \\
(N, \tilde\lambda^2 h) & \xrightarrow{\ 1_N\ } & (N, h)
\end{array}$$
4. Dirac Morphisms with One-Dimensional Fibres In this section m = n + 1. Definition 5. A horizontally conformal submersion π : (M n+1 , g) → (N n , h) between spin manifolds is called a Dirac morphism if for any local harmonic spinor defined on U ⊂ N , such that π −1 (U ) = ∅, the pullback = ξγ ◦ 1 is a harmonic spinor on π −1 (U ) ⊂ M, where 1 is the pullback of by the associated Riemannian submersion. Lemma 3. (Chain Rule). Let π : (M n+1 , g) → (N n , h) be a horizontally conformal submersion of dilation λ and ψ a (local) spinor field on N , then Nψ − = λD DMψ
1 1 V + IH · ψ . µ + (n − 1)gradH (lnλ) + nµH · ψ 2 4
(12)
Proof. Take {X i }i=1,...,n an orthonormal frame on (N n , h) and {V, E i = λX i∗ }i=1,...,n an adapted frame on (M m , g). Let be a (local) spinor field on N and
its pullback by π . The proof is similar to the proof of Lemma 1, except that (H0 ) = E i · E i (ψ ◦ π ), (V0 ) = 0 and the terms (H3 ), (V3 ) do not appear. as V · ψ = iµH 2 ψ = i ψ. Note that µH · ψ
Theorem 2. A horizontally conformal submersion π : (M n+1 , g) → (N n , h) between spin manifolds is a Dirac morphism if and only if its horizontal distribution is integrable and minimal, and µV + (n − 1)gradH (lnλ) = 0, where µV is the mean curvature of the fibres. Proof. The argument is similar to the one of Theorem 1, except that X = µV + (n − 1)gradH (ln λ) + nµH . Observe that X = 0 if and only if µV + (n − 1)gradH (ln λ) = 0 and µH = 0, as they belong to orthogonal distributions. Corollary 2. A Riemannian submersion π : (M n+1 , g) → (N n , h) between spin manifolds is a Dirac morphism if and only if its horizontal distribution is integrable and the fibres are minimal. Remark 9. Suppose that π is a Riemannian submersion. (1) The Chain Rule (12) gives us the formula of [14] (where the fibres are minimal) DM = DN −
1 V 1 ∗ X j · A X ∗j V · , µ · − i 2 4 n
(13)
j=1
where A denotes the second O’Neill tensor of π . (2) If π is a Dirac morphism, any pullback spinor field is parallel along the fibres and its metric Lie derivative ([6]) vanishes. Remark 10. When the fibres are circles, a similar chain rule was obtained by Ammann in [2]. 5. Examples Example 1. (Dirac morphisms from 3 to 2-dimensional Euclidean spaces). With respect to the irreducible representation γ of Cl2 given by Pauli matrices, a spinor field on ψ+ 2 2 2 , in a global spin frame (e1 = (R , , standard ) will be ψ : R −→ C , ψ = ψ− ∂ ∂ 2 2 ∂ y1 , e2 = ∂ y2 ) on R , and the Dirac operator on R , D
R2
∂ ∂ = γ (e1 ) + γ (e2 ) = ∂ y1 ∂ y2
∂ 0 − ∂z . ∂ ∂z 0
= ψ ◦π : Consider a Riemannian submersion π : R3 −→ R2 , the pull-back of ψ, ψ R3 −→ C2 is a spinor field on R3 with respect to the frame {e1∗ , e2∗ , v}, v ∈ Ker dπ . Supposing H integrable, this spin adapted frame can be chosen to be { ∂∂x1 , ∂∂x2 , ∂∂x3 }, and with the Pauli representation, the Dirac operator on R3 is ∂ ∂ + i i ∂∂x3 ∂ R3 ∂ x ∂ x 2 1 = D = iσk . − ∂∂x2 + i ∂∂x1 −i ∂∂x3 ∂ xk
Let ψ be an arbitrary harmonic spinor on R2 (so ψ + is a holomorphic function), then + is also harmonic if and only if ψ ∂π1 ∂π2 = , ∂ x1 ∂ x2
∂π1 ∂π2 =− , ∂ x2 ∂ x1
∂π1 ∂π2 = = 0. ∂ x3 ∂ x3
− involves a change of sign in the first two equalities (i.e. The analogous question for ψ π must be anti-holomorphic with respect to x1 + i x2 ). These conditions are exactly the harmonicity of π , which is equivalent to the minimality of the fibres. Example 2. (Dirac morphisms from 4 to 2-dimensional Euclidean spaces). A spinor field on (R4 , , standard ) can be seen as a C4 -valued function on R4 once a global spin frame is chosen. Using Pauli matrices, the Dirac operator on R4 can be described as ⎛ ⎞ ∂ ∂ ∂ ∂ 0 0 ∂ x3 + i ∂ x0 ∂ x1 − i ∂ x2 ⎜ ∂ ∂ ∂ ∂ ⎟ 0 0 ∂ 4 ⎜ ∂ x1 + i ∂ x2 − ∂ x3 + i ∂ x0 ⎟. D R = γk =⎜ ⎟ 0 0 ⎝ − ∂∂x3 + i ∂∂x0 − ∂∂x1 + i ∂∂x2 ⎠ ∂ xk 0 0 − ∂∂x1 − i ∂∂x2 ∂∂x3 + i ∂∂x0 Let π : R4 −→ R2 be a Riemannian submersion and ψ : R2 −→ C2 a spinor field of ψ is (ψ ◦ π ) ⊗ α : R4 −→ C4 . With on (R2 , , standar d ). Then the pull-back ψ ∗ ∗ respect to {e1 , e2 , v, w}, v, w ∈ Ker dπ , assuming H integrable and choosing a frame { ∂∂x2 , ∂∂x3 , ∂∂x0 , ∂∂x1 }, = ψ
⎛ + +⎞ ψ α + ψ ⎜ ψ −α− ⎟ − = ⎝ ψ + α − ⎠. ψ ψ −α+
The conditions of harmonicity and parallelism make α : R2 → C2 a harmonic spinor field with respect to the variables x0 , x1 (i.e. α + is holomorphic and α − antiholomorphic). Take a harmonic spinor ψ on R2 (i.e. ψ + is a holomorphic function), + is harmonic, for any α, if and only if π satisfies one can directly check that ψ ∂π1 ∂π1 ∂π2 ∂π2 = = = = 0, ∂ x0 ∂ x1 ∂ x0 ∂ x1
∂π1 ∂π2 = , ∂ x2 ∂ x3
∂π1 ∂π2 =− . ∂ x3 ∂ x2
− merely introduces a different sign. Again, this forces π The same question for ψ to be harmonic, i.e. its fibres are minimal. Example 3. (Moroianu’s projectable spinors). In [13], Moroianu considers a principal fibre bundle π : (M m , g) −→ N with compact structural group G, over a compact spin manifold (N n , h), such that π is a Riemannian submersion with totally geodesic fibres and the horizontal distribution H is a principal connection. Since its tangent space is trivial, G admits a canonical spin structure, and a spinor (ψ ◦ π ) ⊗ α is called projectable if α : G −→ Sm−n is a constant function with respect to the canonical frame of left-invariant vector fields. To have a D M -invariant notion of projectable spinor, it is necessary and sufficient to suppose G commutative ([13]).
Let X ∗ be the horizontal lift of a vector field X on N and, using [V, X ∗ ] = 0 for V ∈ V, since H is a principal connection, we have ∇ XV∗ α = X (α) + = =
1 2
m−n
g(∇ X ∗ Vb , Vc )Vb · Vc · α
b
= X (α),
since the fibres are totally geodesic. Therefore the condition of parallelism of α translates into constancy in horizontal directions. Moreover V
D α= =
m−n a=1 m−n
Va · ∇VVa α Va · Va (α) +
1 4
a=1
=
m−n a=1
m−n
Va · g(∇Va Vb , Vc ) Vb · Vc · α
a,b,c=1
Va · Va (α) +
3 4
m−n
g([Va , Vb ], Vc ) Va · Vb · Vc · α,
a
where, on a Lie group, g([V, W ], Z ) = g(V, [W, Z ]) if g is an invariant metric. Clearly, if α is a constant function and G is commutative (i.e. the structural constants vanish) then D V α = 0. Therefore a projectable spinor field is the pull-back of some spinor field on the base, with respect to α satisfying the two conditions required for Dirac morphisms. Hence, Theorem 1 says that a principal bundle over a spin manifold with commutative structural group is a Dirac morphism if and only if it is flat.
References
1. Alinhac, S., Gérard, P.: Opérateurs pseudo-différentiels et théorème de Nash-Moser. Paris: InterEditions, Meudon: Éditions du CNRS, 1991
2. Ammann, B.: The Dirac operator on collapsing $S^1$-bundles. Sémin. Théor. Spectr. Géom. 16, Saint-Martin-d’Hères: Univ. Grenoble I, 1998
3. Baird, P., Wood, J.C.: Harmonic morphisms between Riemannian manifolds. Oxford: Oxford Univ. Press, 2003
4. Bär, C., Gauduchon, P., Moroianu, A.: Generalized cylinders in semi-Riemannian and Spin geometry. Math. Zeit. 249, 545–580 (2005)
5. Bismut, J.-M., Cheeger, J.: η-invariants and their adiabatic limits. J. Amer. Math. Soc. 2, 33–70 (1989)
6. Bourguignon, J.-P., Gauduchon, P.: Spineurs, opérateurs de Dirac et variations de métriques. Commun. Math. Phys. 144, 581–599 (1992)
7. Fuglede, B.: Harmonic morphisms between Riemannian manifolds. Ann. Inst. Fourier (Grenoble) 28, 107–144 (1978)
8. Glazebrook, J.F., Kamber, F.: Transversal Dirac families in Riemannian foliations. Commun. Math. Phys. 140, 217–240 (1991)
9. Husemoller, D.: Fibre Bundles. Berlin-Heidelberg-New York: Springer-Verlag, 1994
10. Ishihara, T.: A mapping of Riemannian manifolds which preserves harmonic functions. J. Math. Kyoto Univ. 19, 215–229 (1979)
11. Jacobi, C.G.J.: Über eine Lösung der partiellen Differentialgleichung (V) = 0. J. Reine Angew. Math. 36, 113–134 (1848)
12. Lawson, H., Michelsohn, M.-L.: Spin Geometry. Princeton, NJ: Princeton University Press, 1989
13. Moroianu, A.: Opérateur de Dirac et submersions riemanniennes. Thèse de Doctorat, Ecole Polytechnique, 1996
14. Moroianu, A.: La première valeur propre de l’opérateur de Dirac sur les variétés kählériennes compactes. Commun. Math. Phys. 169, 373–384 (1995)

Communicated by A. Connes
Commun. Math. Phys. 288, 1103–1116 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0695-y
Communications in
Mathematical Physics
On Almost Randomizing Channels with a Short Kraus Decomposition

Guillaume Aubrun

Institut Camille Jordan, Université Claude Bernard Lyon 1, 43 boulevard du 11 novembre 1918, 69622 Villeurbanne cedex, France. E-mail: [email protected]

Received: 11 June 2008 / Accepted: 14 August 2008
Published online: 8 January 2009 – © Springer-Verlag 2008
Abstract: For large $d$, we study quantum channels on $\mathbb{C}^d$ obtained by selecting randomly $N$ independent Kraus operators according to a probability measure $\mu$ on the unitary group $\mathcal{U}(d)$. When $\mu$ is the Haar measure, we show that for $N \gtrsim d/\varepsilon^2$, such a channel is ε-randomizing with high probability, which means that it maps every state within distance $\varepsilon/d$ (in operator norm) of the maximally mixed state. This slightly improves on a result by Hayden, Leung, Shor and Winter by optimizing their discretization argument. Moreover, for general $\mu$, we obtain an ε-randomizing channel provided $N \gtrsim d(\log d)^6/\varepsilon^2$. For $d = 2^k$ ($k$ qubits), this includes Kraus operators obtained by tensoring $k$ random Pauli matrices. This leads to more efficient constructions of almost randomizing channels. The proof uses recent results on empirical processes in Banach spaces.
1. Introduction

The completely randomizing quantum channel on $\mathbb{C}^d$ maps every state to the maximally mixed state $\rho_*$. This channel is used to construct perfect encryption systems (see [1] for formal definitions). However it is a complex object in the following sense: any Kraus decomposition must involve at least $d^2$ operators. It has been shown by Hayden, Leung, Shor and Winter [13] that this “ideal” channel can be efficiently emulated by lower-complexity channels, leading to approximate encryption systems. The key point is the existence of good approximations with much shorter Kraus decompositions. More precisely, say that a quantum channel $\Phi$ on $\mathbb{C}^d$ is ε-randomizing if for any state $\rho$, $\|\Phi(\rho) - \rho_*\|_\infty \le \varepsilon/d$. The existence of ε-randomizing channels with $o(d^2)$ Kraus operators has several other implications [13], such as counterexamples to multiplicativity conjectures [18].
It has been proved in [13] that if $(U_i)$ denote independent random matrices Haar-distributed on the unitary group $\mathcal{U}(d)$, then the quantum channel
$$\Phi : \rho \mapsto \frac{1}{N}\sum_{i=1}^{N} U_i \rho U_i^\dagger \qquad (1)$$
is ε-randomizing with high probability provided $N \ge Cd\log d/\varepsilon^2$ for some constant $C$. The proof uses a discretization argument and the fact that the Haar measure satisfies subgaussian estimates. We show a simple trick that allows us to drop a $\log d$ factor: $\Phi$ is ε-randomizing when $N \ge Cd/\varepsilon^2$; this is our Theorem 1. The dependence is sharp in ε and $d$.

The Haar measure is a nice object from the theoretical point of view, but is often too complicated to implement for concrete situations. We consider here the broad class of isotropic measures. Let us say that a measure $\mu$ on $\mathcal{U}(d)$ is isotropic when $\int U\rho U^\dagger\, d\mu(U) = \rho_*$ for any state $\rho$. When $d = 2^k$, an important example of isotropic measure is given by assigning equal masses at $k$-fold tensor products of Pauli operators. The following question was asked in [13]: is the quantum channel $\Phi$ defined as (1) ε-randomizing when $(U_i)$ are distributed according to any isotropic probability measure on $\mathcal{U}(d)$? We answer this question positively when $N \ge Cd\log^6 d/\varepsilon^2$. This is our main result and appears as Theorem 2. Note that for non-Haar measures, previous results appearing in the literature [2,9,13] only involved the weaker trace-norm approximation $\|\Phi(\rho) - \rho_*\|_1 \le \varepsilon$.

As opposed to the Haar measure, the measure $\mu$ need not have subgaussian tails, and we need more sophisticated tools to prove Theorem 2. We use recent results on suprema of empirical processes in Banach spaces. After early work by Rudelson [16] and Guédon–Rudelson [12], a general sharp inequality was obtained by Guédon, Mendelson, Pajor and Tomczak-Jaegermann [11]. This inequality is valid in any Banach space with a sufficiently regular equivalent norm, such as $\ell_1^d$. The problem of ε-randomizing channels involves the supremum of an empirical process in the trace-class space $S_1^d$ (non-commutative analogue of $\ell_1^d$), which fits perfectly into this setting.

The paper is organized as follows. Section 2 contains background and precise statements of the theorems.
Theorem 1 (for Haar measure) is proved in Sect. 3. Theorem 2 (for a general measure) is proved in Sect. 4. An Appendix contains the needed facts about geometry and probability in Banach spaces.

2. Background and Presentation of Results

Throughout the paper, the letters $C$ and $c$ denote absolute constants whose value may change from occurrence to occurrence. We usually do not pay too much attention to the value of these constants.

2.1. Schatten classes. We write $\mathcal{M}(\mathbb{C}^d)$ for the space of complex $d \times d$ matrices. If $A \in \mathcal{M}(\mathbb{C}^d)$, let $s_1(A), \dots, s_d(A)$ denote the singular values of $A$ (defined as the square roots of the eigenvalues of $AA^\dagger$). For $1 \le p \le \infty$, the Schatten $p$-norm is defined as
$$\|A\|_p = \Big(\sum_{i=1}^{d} s_i(A)^p\Big)^{1/p}.$$
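For $p = \infty$, the definition should be understood as $\|A\|_\infty = \max s_i(A)$ and coincides with the usual operator norm. It is well-known (see [6], Sect. IV.2) that $(\mathcal{M}(\mathbb{C}^d), \|\cdot\|_p)$ is a complex normed space, denoted $S_p^d$ and called Schatten class. The space $S_p^d$ is the non-commutative analogue of the space $\ell_p^d$. We write $B(S_p^d)$ for the unit ball of $S_p^d$.

The Schatten 2-norm (sometimes called Hilbert–Schmidt or Frobenius norm) is a Hilbert space norm associated to the inner product $\langle A, B\rangle = \operatorname{Tr} A^\dagger B$. This Hermitian structure allows us to identify $\mathcal{M}(\mathbb{C}^d)$ with its dual space. Duality on Schatten norms holds as in the commutative case: if $p$ and $q$ are conjugate exponents (i.e. $1/p + 1/q = 1$), then the normed space dual to $S_p^d$ coincides with $S_q^d$.

2.2. Completely positive maps. We write $\mathcal{M}_{sa}(\mathbb{C}^d)$ (resp. $\mathcal{M}_+(\mathbb{C}^d)$) for the set of self-adjoint (resp. positive semi-definite) $d \times d$ matrices. A linear map $\Phi : \mathcal{M}(\mathbb{C}^d) \to \mathcal{M}(\mathbb{C}^d)$ is said to preserve positivity if $\Phi(\mathcal{M}_+(\mathbb{C}^d)) \subset \mathcal{M}_+(\mathbb{C}^d)$. Moreover, $\Phi$ is said to be completely positive if for any $k \in \mathbb{N}$, the map
$$\Phi \otimes \mathrm{Id}_{\mathcal{M}(\mathbb{C}^k)} : \mathcal{M}(\mathbb{C}^d \otimes \mathbb{C}^k) \to \mathcal{M}(\mathbb{C}^d \otimes \mathbb{C}^k)$$
preserves positivity. We use freely the canonical identification $\mathcal{M}(\mathbb{C}^d) \otimes \mathcal{M}(\mathbb{C}^k) \approx \mathcal{M}(\mathbb{C}^d \otimes \mathbb{C}^k)$.

If $(e_i)_{0 \le i \le d-1}$ denotes the canonical basis of $\mathbb{C}^d$, let $E_{ij} = |e_i\rangle\langle e_j|$. To $\Phi : \mathcal{M}(\mathbb{C}^d) \to \mathcal{M}(\mathbb{C}^d)$ we associate $A_\Phi \in \mathcal{M}(\mathbb{C}^d \otimes \mathbb{C}^d)$ defined as
$$A_\Phi = \sum_{i,j=0}^{d-1} E_{ij} \otimes \Phi(E_{ij}).$$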
The matrix $A_\Phi$ is called the Choi matrix of $\Phi$; it is well-known [8] that $\Phi$ is completely positive if and only if $A_\Phi$ is positive. Therefore, the set of completely positive operators on $\mathcal{M}(\mathbb{C}^d)$ is in one-to-one correspondence with $\mathcal{M}_+(\mathbb{C}^d \otimes \mathbb{C}^d)$. This correspondence is known as the Choi–Jamiołkowski isomorphism. The spectral decomposition of $A_\Phi$ implies now the following: any completely positive map $\Phi$ on $\mathcal{M}(\mathbb{C}^d)$ can be decomposed as
$$\Phi : X \mapsto \sum_{i=1}^{N} V_i X V_i^\dagger. \qquad (2)$$
Here $V_1, \dots, V_N$ are elements of $\mathcal{M}(\mathbb{C}^d)$. This decomposition is called a Kraus decomposition of $\Phi$ of length $N$. The minimal length of a Kraus decomposition of $\Phi$ (called Kraus rank) is equal to the rank of the Choi matrix $A_\Phi$. In particular it is always bounded by $d^2$.

2.3. States and the completely depolarizing channel. A state on $\mathbb{C}^d$ is an element of $\mathcal{M}_+(\mathbb{C}^d)$ with trace 1. We write $\mathcal{D}(\mathbb{C}^d)$ for the set of states; it is a compact convex set with (real) dimension $d^2 - 1$. If $x \in \mathbb{C}^d$ is a unit vector, we write $P_x = |x\rangle\langle x|$ for the associated rank one projector. The state $P_x$ is called a pure state, and it follows from spectral decomposition that any state is a convex combination of pure states. A central role is played by the maximally mixed state $\rho_* = \mathrm{Id}/d$ ($\rho_*$ is sometimes called the random state).
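A quantum channel $\Phi : \mathcal{M}(\mathbb{C}^d) \to \mathcal{M}(\mathbb{C}^d)$ is a completely positive map which preserves trace: for any $X \in \mathcal{M}(\mathbb{C}^d)$, $\operatorname{Tr}\Phi(X) = \operatorname{Tr} X$. Note that a quantum channel maps states to states. The trace-preserving condition is read on the Kraus decomposition (2) as
$$\sum_{i=1}^{N} V_i^\dagger V_i = \mathrm{Id}.$$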
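The trace-preserving condition can be checked numerically on a small case. The following is a minimal sketch in plain Python, with matrices stored as lists of rows; the helper names (`apply_kraus`, `is_trace_preserving`) and the two-operator example with weight `p` are illustrative assumptions, not taken from the paper.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def apply_kraus(kraus_ops, x):
    """Return sum_i V_i X V_i^dagger, the map written as in (2)."""
    n = len(x)
    out = [[0j] * n for _ in range(n)]
    for v in kraus_ops:
        term = matmul(matmul(v, x), dagger(v))
        for r in range(n):
            for c in range(n):
                out[r][c] += term[r][c]
    return out

def is_trace_preserving(kraus_ops, tol=1e-12):
    """Check sum_i V_i^dagger V_i = Id entrywise, up to tol."""
    n = len(kraus_ops[0])
    total = [[0j] * n for _ in range(n)]
    for v in kraus_ops:
        term = matmul(dagger(v), v)
        for r in range(n):
            for c in range(n):
                total[r][c] += term[r][c]
    return all(abs(total[r][c] - (1 if r == c else 0)) <= tol
               for r in range(n) for c in range(n))

# Hypothetical example on C^2: V_1 = sqrt(1-p) Id, V_2 = sqrt(p) sigma_1.
p = 0.3
V1 = [[math.sqrt(1 - p), 0], [0, math.sqrt(1 - p)]]
V2 = [[0, math.sqrt(p)], [math.sqrt(p), 0]]
X = [[1, 2 + 1j], [3, 4]]
print(is_trace_preserving([V1, V2]))
print(abs(trace(apply_kraus([V1, V2], X)) - trace(X)) < 1e-12)
```

Dropping one of the two operators breaks the identity $\sum_i V_i^\dagger V_i = \mathrm{Id}$, and the resulting map no longer preserves the trace.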
An example of quantum channel that plays a central role in quantum information theory is the (completely) randomizing channel (also called completely depolarizing channel) $R : \mathcal{M}(\mathbb{C}^d) \to \mathcal{M}(\mathbb{C}^d)$,
$$R : X \mapsto \operatorname{Tr} X \cdot \frac{\mathrm{Id}}{d}.$$
The randomizing channel maps every state to $\rho_*$. The Choi matrix of $R$ is $A_R = \frac{1}{d}\,\mathrm{Id}_{\mathbb{C}^d \otimes \mathbb{C}^d}$. Since $A_R$ has full rank, any Kraus decomposition of $R$ must have length (at least) $d^2$. An explicit decomposition can be written as follows: let $\omega = \exp(2i\pi/d)$ and $A$ and $B$ the matrices defined as
$$A(e_j) = e_{j+1 \bmod d}, \qquad B(e_j) = \omega^j e_j. \qquad (3)$$
For $1 \le j, k \le d$, define $V_{j,k}$ as the product $B^j A^k$. Note that $V_{j,k}$ belongs to the unitary group $\mathcal{U}(d)$. A routine calculation (see also Sect. 2.5) shows that for any $X \in \mathcal{M}(\mathbb{C}^d)$,
$$\frac{1}{d^2}\sum_{j,k=1}^{d} V_{j,k} X V_{j,k}^\dagger = \operatorname{Tr} X \cdot \frac{\mathrm{Id}}{d}.$$
This is a Kraus decomposition of the randomizing channel.

2.4. ε-randomizing channels. We are interested in approximating the randomizing channel $R$ by channels with low Kraus rank. Following Hayden, Leung, Shor and Winter [13], a quantum channel $\Phi$ is called ε-randomizing if for any state $\rho \in \mathcal{D}(\mathbb{C}^d)$,
$$\|\Phi(\rho) - \rho_*\|_\infty \le \frac{\varepsilon}{d}.$$
It is equivalent to say that the spectrum of $\Phi(\rho)$ is contained in $[(1-\varepsilon)/d, (1+\varepsilon)/d]$ for any state $\rho$. It has been proved in [13] that there exist ε-randomizing channels with Kraus rank equal to $Cd\log d/\varepsilon^2$ for some constant $C$. This is much smaller than $d^2$ (the Kraus rank of $R$). The construction is simple: generate independent random Kraus operators according to the Haar measure on $\mathcal{U}(d)$ and show that the induced quantum channel is ε-randomizing with nonzero probability. A key step in the proof is a discretization argument. We show that a simple trick improves the efficiency of the argument from [13] to prove the following theorem. A version of this theorem with a weaker dependence in ε appeared in the preprint [4].
Theorem 1 (Haar-generated ε-randomizing channels). Let $(U_i)_{1 \le i \le N}$ be independent random matrices Haar-distributed on the unitary group $\mathcal{U}(d)$. Let $\Phi : \mathcal{M}(\mathbb{C}^d) \to \mathcal{M}(\mathbb{C}^d)$ be the quantum channel defined by
$$\Phi(\rho) = \frac{1}{N}\sum_{i=1}^{N} U_i \rho U_i^\dagger.$$
Assume that $0 < \varepsilon < 1$ and $N \ge Cd/\varepsilon^2$. Then the channel $\Phi$ is ε-randomizing with nonzero probability.

As often with random constructions, we actually prove that the conclusion holds true with large probability: the probability of failure is exponentially small in $d$.

It is clear that the way $N$ depends on $d$ is optimal: if $\Phi$ is an ε-randomizing channel with $\varepsilon < 1$, its Kraus rank must be at least $d$. This is because for any pure state $P_x$, $\Phi(P_x)$ must have full rank. The dependence in ε is sharp for channels as constructed here, since Lemma 2 below is sharp. However, it is not clear whether families of ε-randomizing channels with a better dependence in ε can be found using a different construction, possibly partially deterministic.

One checks (using the value $c = 1/6$ from [13] in Lemma 3 and optimizing over the net size) that the constant in Theorem 1 can be chosen to be, e.g., $C = 150$. This is presumably far from optimal.

2.5. Isotropic measures on unitary matrices. Although the quantum channels constructed in Theorem 1 have minimal Kraus rank, it can be argued that Haar-distributed random matrices are hard to generate in real-life situations. We introduce a wide class of measures on $\mathcal{U}(d)$ that may replace the Haar measure.

Definition. We say that a probability measure $\mu$ on $\mathcal{U}(d)$ is isotropic if for any $X \in \mathcal{M}(\mathbb{C}^d)$,
$$\int_{\mathcal{U}(d)} U X U^\dagger\, d\mu(U) = \operatorname{Tr} X \cdot \frac{\mathrm{Id}}{d}.$$
Similarly, a $\mathcal{U}(d)$-valued random vector is called isotropic if its law is isotropic.

Lemma 1. Let $U = (U_{ij})$ be a $\mathcal{U}(d)$-valued random vector. The following assertions are equivalent:
(1) $U$ is isotropic.
(2) For any $X \in \mathcal{M}(\mathbb{C}^d)$, $\mathbb{E}|\operatorname{Tr} U X^\dagger|^2 = \frac{1}{d}\|X\|_2^2$.
(3) For any indices $i, j, k, l$, $\mathbb{E}\, U_{ij}\overline{U_{kl}} = \frac{1}{d}\delta_{i,k}\delta_{j,l}$.

Proof. Implications (3) ⇒ (1) and (3) ⇒ (2) are easily checked by expansion. For (1) ⇒ (3), simply take $X = |e_j\rangle\langle e_k|$. Identity (2) implies after polarization that for any $A, B \in \mathcal{M}(\mathbb{C}^d)$,
$$\mathbb{E}\,\overline{\operatorname{Tr}(UA^\dagger)}\,\operatorname{Tr}(UB^\dagger) = \frac{1}{d}\operatorname{Tr}(AB^\dagger),$$
from which (3) follows.
Condition (3) of the lemma means that the covariance matrix of $U$ (which is an element of $\mathcal{M}(\mathcal{M}(\mathbb{C}^d))$) is a multiple of the identity matrix. Of course the Haar measure is isotropic. Other examples are provided by discrete measures. Let $\mathcal{U} = \{U_1, \dots, U_{d^2}\}$ be a family of unitary matrices, which are mutually orthogonal in the following sense: if $i \ne j$, then $\operatorname{Tr} U_i^\dagger U_j = 0$. For example, one can take $\mathcal{U} = \{B^j A^k\}_{1 \le j,k \le d}$, with $A$, $B$ defined as in (3). Then the uniform probability measure on $\mathcal{U}$ is isotropic. Indeed, any $X \in \mathcal{M}(\mathbb{C}^d)$ can be decomposed as $X = \sum x_i U_i$ and condition (2) of Lemma 1 is easily checked. If we specialize to $d = 2$, we obtain a random Pauli operator: assign probability 1/4 to each of the following matrices to get an isotropic measure
$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\quad \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
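Isotropy of these two discrete families can be verified directly from the definition, by averaging $UXU^\dagger$ over the family and comparing with $\operatorname{Tr} X \cdot \mathrm{Id}/d$. The sketch below does this in plain Python; the function names (`clock_shift_family`, `isotropy_defect`) are illustrative choices, and the check is a numerical sanity test under floating-point arithmetic, not a proof.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def clock_shift_family(d):
    """Return the d^2 unitaries B^j A^k, 1 <= j, k <= d, with A, B as in (3)."""
    omega = cmath.exp(2j * cmath.pi / d)
    # A e_c = e_{c+1 mod d}: column c has its single 1 in row (c+1) mod d.
    shift = [[1 if r == (c + 1) % d else 0 for c in range(d)] for r in range(d)]
    # B e_r = omega^r e_r: diagonal matrix.
    clock = [[omega ** r if r == c else 0 for c in range(d)] for r in range(d)]
    identity = [[1 if r == c else 0 for c in range(d)] for r in range(d)]
    family = []
    b_pow = identity
    for _j in range(d):
        b_pow = matmul(b_pow, clock)
        a_pow = identity
        for _k in range(d):
            a_pow = matmul(a_pow, shift)
            family.append(matmul(b_pow, a_pow))
    return family

def isotropy_defect(unitaries, x):
    """Largest entry of |average of U X U^dagger - Tr X * Id/d|."""
    d = len(x)
    tr = sum(x[i][i] for i in range(d))
    avg = [[0j] * d for _ in range(d)]
    for u in unitaries:
        term = matmul(matmul(u, x), dagger(u))
        for r in range(d):
            for c in range(d):
                avg[r][c] += term[r][c] / len(unitaries)
    return max(abs(avg[r][c] - (tr / d if r == c else 0))
               for r in range(d) for c in range(d))

PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

print(isotropy_defect(PAULIS, [[1, 2 + 1j], [3j, -4]]))
print(isotropy_defect(clock_shift_family(3),
                      [[1, 2 + 1j, 0], [0.5, -1, 3j], [2, 1, 4]]))
```

Both printed defects should be zero up to rounding, whereas a family consisting of the identity alone gives a defect of order one for any non-scalar $X$.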
It is straightforward to check that isotropic vectors tensorize: if $X_1 \in \mathcal{U}(d_1)$ and $X_2 \in \mathcal{U}(d_2)$ are isotropic, so is $X_1 \otimes X_2 \in \mathcal{U}(d_1 d_2)$. If we work on $\mathcal{M}((\mathbb{C}^2)^{\otimes k})$, which corresponds to a set of $k$ qubits, a natural isotropic measure is therefore obtained by choosing independently a Pauli matrix on each qubit, i.e. assigning mass $1/4^k$ to the matrix $\sigma_{i_1} \otimes \cdots \otimes \sigma_{i_k}$ for any $(i_1, \dots, i_k) \in \{0, 1, 2, 3\}^k$.

2.6. ε-randomizing channels for an isotropic measure. We can now state our main theorem asserting that up to logarithmic terms, the Haar measure can be replaced in Theorem 1 by simpler notions of randomness. We first state our result.

Theorem 2 (General ε-randomizing channels). Let $\mu$ be an isotropic measure on the unitary group $\mathcal{U}(d)$. Let $(U_i)_{1 \le i \le N}$ be independent $\mu$-distributed random matrices, and $\Phi : \mathcal{M}(\mathbb{C}^d) \to \mathcal{M}(\mathbb{C}^d)$ be the quantum channel defined as
$$\Phi(\rho) = \frac{1}{N}\sum_{i=1}^{N} U_i \rho U_i^\dagger. \qquad (4)$$
Assume that $0 < \varepsilon < 1$ and $N \ge Cd(\log d)^6/\varepsilon^2$. Then the channel $\Phi$ is ε-randomizing with probability larger than $\frac{1}{2}$.

Theorem 2 applies in particular for a product of random Pauli matrices as described in the previous section. It is of interest for certain cryptographic applications to know that ε-randomizing channels can be realized using Pauli matrices. As opposed to Theorem 1, the conclusion of Theorem 2 is not proved to hold with exponentially large probability. The estimate $\frac{1}{2}$ on the probability can be replaced by any number smaller than 1, only affecting the value of the constant $C$.

Theorem 2 could be quickly deduced from a theorem appearing in [11]. However, the proof of [11] is rather intricate and uses Talagrand’s majorizing measures in a central way. We give here a proof of our theorem which uses the simpler Dudley integral instead, giving the same result. We however rely on an entropy lemma from [11], which appears as Lemma A5 in the Appendix.

The $\log^6 d$ appearing in Theorem 2 is certainly not optimal (see remarks at the end of the paper). However, some power of $\log d$ is needed, as shown by the next proposition.
Proposition. Let $A$, $B$ be defined as in (3) and $\mu$ be the uniform measure on the set $\{B^j A^k\}_{1 \le j,k \le d}$. Consider $(X_i)$ independent $\mu$-distributed random unitary matrices. If the quantum channel $\Phi$ defined as (4) is $\frac{1}{2}$-randomizing with probability larger than 1/2, then $N \ge cd\log d$.

Proof. We will rely on the following standard result in elementary probability theory known as the coupon collector’s problem (see [10], Chap. 1, Example 5.10): if we choose independently and uniformly random elements among a set of $d$ elements, the mean (and also the median) number of choices before getting all elements at least once is equivalent to $d\log d$ for large $d$.

In our case, recall that $\omega = \exp(2i\pi/d)$ and define $x_j$ as
$$x_j = \Big(\frac{1}{\sqrt d}, \frac{\omega^j}{\sqrt d}, \frac{\omega^{2j}}{\sqrt d}, \dots, \frac{\omega^{(d-1)j}}{\sqrt d}\Big).$$
Note that $\mathcal{B} = (x_j)_{0 \le j \le d-1}$ is an orthonormal basis of $\mathbb{C}^d$ and that $B^j A^k x_0 = x_j$. Consequently, if $U$ is $\mu$-distributed, the random state $U P_{x_0} U^\dagger$ equals $P_{x_j}$ with probability $1/d$. In the basis $\mathcal{B}$, the matrix $\Phi(P_{x_0})$ is diagonal. Note that if $\Phi$ is $\frac{1}{2}$-randomizing, then $\Phi(P_{x_0})$ must have full rank. The reduction to the coupon collector’s problem is now immediate.

3. Proof of Theorem 1: Haar-Distributed Unitary Operators

The scheme of the proof is similar to [13]. We need two lemmas from there. The first is a deviation inequality sometimes known as Bernstein’s inequality. The second is proved by a volumetric argument.

Lemma 2 (Lemma II.3 in [13]). Let $\varphi, \psi$ be pure states on $\mathbb{C}^d$ and $(U_i)_{1 \le i \le N}$ be independent Haar-distributed random unitary matrices. Then for every $0 < \delta < 1$,
$$\mathbb{P}\Big(\Big|\frac{1}{N}\sum_{i=1}^{N}\operatorname{Tr}(U_i\varphi U_i^\dagger\psi) - \frac{1}{d}\Big| \ge \frac{\delta}{d}\Big) \le 2\exp(-c\delta^2 N).$$
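Lemma 3 (Lemma II.4 in [13]). For $0 < \delta < 1$ there exists a set $\mathcal{N}$ of pure states on $\mathbb{C}^d$ with $|\mathcal{N}| \le (5/\delta)^{2d}$, such that for every pure state $\varphi$ on $\mathbb{C}^d$, there exists $\varphi_0 \in \mathcal{N}$ such that $\|\varphi - \varphi_0\|_1 \le \delta$. Such a set $\mathcal{N}$ is called a δ-net.

The improvement on the result of [13] will follow from the next lemma.

Lemma 4 (Computing norms on nets). Let $\Delta : \mathcal{B}(\mathbb{C}^d) \to \mathcal{B}(\mathbb{C}^d)$ be a Hermitian-preserving linear map. Let $A$ be the quantity
$$A = \sup_{\varphi \in \mathcal{D}(\mathbb{C}^d)} \|\Delta(\varphi)\|_\infty = \sup_{\varphi,\psi \in \mathcal{D}(\mathbb{C}^d)} |\operatorname{Tr}\psi\Delta(\varphi)|.$$
Let $0 < \delta < 1/2$ and $\mathcal{N}$ be a δ-net as provided by Lemma 3. We can estimate $A$ as follows:
$$A \le \frac{1}{1 - 2\delta} B,$$
where
$$B = \sup_{\varphi_0,\psi_0 \in \mathcal{N}} |\operatorname{Tr}\psi_0\Delta(\varphi_0)|.$$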
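Proof of Lemma 4. First note that for any self-adjoint operators $a, b \in \mathcal{B}(\mathbb{C}^d)$, we have
$$|\operatorname{Tr} b\Delta(a)| \le A\|a\|_1\|b\|_1. \qquad (5)$$
By a convexity argument, the supremum in $A$ can be restricted to pure states. Given pure states $\varphi, \psi \in \mathcal{D}(\mathbb{C}^d)$, let $\varphi_0, \psi_0 \in \mathcal{N}$ so that $\|\varphi - \varphi_0\|_1 \le \delta$ and $\|\psi - \psi_0\|_1 \le \delta$. Then
$$|\operatorname{Tr}\psi\Delta(\varphi)| \le |\operatorname{Tr}(\psi - \psi_0)\Delta(\varphi)| + |\operatorname{Tr}\psi_0\Delta(\varphi - \varphi_0)| + |\operatorname{Tr}\psi_0\Delta(\varphi_0)|.$$
Using (5) twice and taking the supremum over $\varphi, \psi$ gives $A \le \delta A + \delta A + B$, hence the result.

Proof of the theorem. Let $R$ be the randomizing channel. Fix a $\frac{1}{4}$-net $\mathcal{N}$ with $|\mathcal{N}| \le 20^{2d}$, as provided by Lemma 3. Let $\Delta = R - \Phi$ and $A$, $B$ as in Lemma 4. Here $A$ and $B$ are random quantities and it follows from Lemma 4 that
$$\mathbb{P}\Big(A \ge \frac{\varepsilon}{d}\Big) \le \mathbb{P}\Big(B \ge \frac{\varepsilon}{2d}\Big).$$
Using the union bound and Lemma 2, we get
$$\mathbb{P}\Big(B \ge \frac{\varepsilon}{2d}\Big) \le 20^{4d} \cdot 2\exp(-c\varepsilon^2 N/4).$$
This is less than $\frac{1}{2}$ provided $N \ge Cd/\varepsilon^2$ for some constant $C$.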
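To make the last step concrete, one can solve $20^{4d} \cdot 2\exp(-c\varepsilon^2 N/4) < \frac{1}{2}$ for $N$, which gives $N > \frac{4}{c\varepsilon^2}(4d\log 20 + \log 4)$. The sketch below evaluates this threshold; the function names are illustrative, and the constant `c` of Lemma 2 is left as an input (the call uses $c = 1/6$, the value quoted after Theorem 1, purely as an example). It uses the fixed $\frac{1}{4}$-net of the proof and does not optimize over the net size.

```python
import math

def log_failure_bound(d, eps, n_ops, c):
    """Logarithm of 20^(4d) * 2 * exp(-c * eps^2 * N / 4)."""
    return 4 * d * math.log(20) + math.log(2) - c * eps ** 2 * n_ops / 4

def min_kraus_count(d, eps, c):
    """Smallest integer N for which the union bound is below 1/2."""
    threshold = 4 * (4 * d * math.log(20) + math.log(4)) / (c * eps ** 2)
    return math.floor(threshold) + 1

# Example values (assumptions for illustration only).
d, eps, c = 4, 0.5, 1 / 6
n_min = min_kraus_count(d, eps, c)
print(n_min, n_min / (d / eps ** 2))
```

The ratio printed last plays the role of the constant $C$ for this particular choice of net; it grows linearly in $1/c$ and is independent of ε.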
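4. Proof of Theorem 2: General Unitary Operators

A Bernoulli random variable is a random variable $\varepsilon$ so that $\mathbb{P}(\varepsilon = 1) = \mathbb{P}(\varepsilon = -1) = 1/2$. Recall that $C$ denotes an absolute constant whose value may change from occurrence to occurrence. We will derive Theorem 2 from the following lemma.

Lemma 5. Let $U_1, \dots, U_N \in \mathcal{U}(d)$ be deterministic unitary operators and let $(\varepsilon_i)$ be a sequence of independent Bernoulli random variables. Then
$$\mathbb{E}_\varepsilon \sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\sum_{i=1}^{N}\varepsilon_i U_i\rho U_i^\dagger\Big\|_\infty \le C(\log d)^{5/2}\sqrt{\log N}\,\sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\sum_{i=1}^{N} U_i\rho U_i^\dagger\Big\|_\infty^{1/2}. \qquad (6)$$

Proof of Theorem 2 (assuming Lemma 5). Let $\mu$ be an isotropic measure on $\mathcal{U}(d)$ and $(U_i)$ be independent $\mu$-distributed random unitary matrices. Let $M$ be the random quantity
$$M = \sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\frac{1}{N}\sum_{i=1}^{N} U_i\rho U_i^\dagger - \frac{\mathrm{Id}}{d}\Big\|_\infty.$$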
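We are going to show that $\mathbb{E}M$ is small. The first step is a standard symmetrization argument. Let $(U_i')$ be independent copies of $(U_i)$ and $(\varepsilon_i)$ be a sequence of independent Bernoulli random variables. We make explicit as a subscript the random variables with respect to which the expectation is taken:
$$\begin{aligned}
\mathbb{E}M &\le \mathbb{E}_{U,U'} \sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\frac{1}{N}\sum_{i=1}^{N}\big(U_i\rho U_i^\dagger - U_i'\rho {U_i'}^\dagger\big)\Big\|_\infty \\
&= \mathbb{E}_{U,U',\varepsilon} \sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\frac{1}{N}\sum_{i=1}^{N}\varepsilon_i\big(U_i\rho U_i^\dagger - U_i'\rho {U_i'}^\dagger\big)\Big\|_\infty \\
&\le 2\,\mathbb{E}_{U,\varepsilon} \sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\frac{1}{N}\sum_{i=1}^{N}\varepsilon_i U_i\rho U_i^\dagger\Big\|_\infty.
\end{aligned}$$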
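The inequality of the first line is Jensen’s inequality for $\mathbb{E}_{U'}$, while the equality on the second line holds since the distribution of $\rho \mapsto U_i\rho U_i^\dagger - U_i'\rho {U_i'}^\dagger$ is symmetric (as a $\mathcal{M}(\mathcal{M}(\mathbb{C}^d), \mathcal{M}(\mathbb{C}^d))$-valued random vector). We then decouple the expectations using Lemma 5 for fixed $(U_i)$:
$$\begin{aligned}
\mathbb{E}M &\le \frac{C}{\sqrt N}(\log d)^{5/2}\sqrt{\log N}\;\mathbb{E}\sup_{\rho \in \mathcal{D}(\mathbb{C}^d)} \Big\|\frac{1}{N}\sum_{i=1}^{N} U_i\rho U_i^\dagger\Big\|_\infty^{1/2} \\
&\le \frac{C}{\sqrt N}(\log d)^{5/2}\sqrt{\log N}\;\mathbb{E}\sqrt{M + \frac{1}{d}} \\
&\le \frac{C}{\sqrt N}(\log d)^{5/2}\sqrt{\log N}\;\sqrt{\mathbb{E}M + \frac{1}{d}}.
\end{aligned}$$
Using the elementary implication
$$X \le \alpha\sqrt{X + \beta} \implies X \le \alpha^2 + \alpha\sqrt{\beta},$$
we find that $\mathbb{E}M \le \varepsilon/d$ provided $N \ge Cd\log^6 d/\varepsilon^2$.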
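For the reader's convenience, here is one way to verify the elementary implication and to see how it is used; this is a sketch added for illustration, with $\alpha = C N^{-1/2}(\log d)^{5/2}\sqrt{\log N}$ and $\beta = 1/d$ read off from the display above, and it stops short of the final bookkeeping on $\log N$.

```latex
% Squaring X <= alpha * sqrt(X + beta) (with X, alpha, beta >= 0) gives a
% quadratic inequality in X, whose larger root bounds X:
\[
X^2 - \alpha^2 X - \alpha^2\beta \le 0
\;\Longrightarrow\;
X \le \frac{\alpha^2 + \sqrt{\alpha^4 + 4\alpha^2\beta}}{2}
  \le \frac{\alpha^2 + \alpha^2 + 2\alpha\sqrt{\beta}}{2}
  = \alpha^2 + \alpha\sqrt{\beta}.
\]
% Applied with X = E M and beta = 1/d:
\[
\mathbb{E}M \le \alpha^2 + \frac{\alpha}{\sqrt d}.
\]
% If alpha <= eps / (2 sqrt d) and 0 < eps < 1, then
\[
\alpha^2 \le \frac{\varepsilon^2}{4d} \le \frac{\varepsilon}{2d},
\qquad
\frac{\alpha}{\sqrt d} \le \frac{\varepsilon}{2d},
\qquad\text{so}\qquad
\mathbb{E}M \le \frac{\varepsilon}{d}.
\]
% The requirement alpha <= eps / (2 sqrt d) reads
\[
\frac{N}{\log N} \ge \frac{4C^2\, d\,(\log d)^5}{\varepsilon^2}.
\]
```

The last condition is a sufficient one; it shows where the factor $(\log d)^5$ and the extra logarithm coming from $\log N$ originate.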
It remains to prove Lemma 5. We will use several standard concepts from geometry and probability in Banach spaces. All the relevant definitions and statements are postponed to the next section.

Proof of Lemma 5. Let $Z$ be the quantity appearing in the left-hand side of (6). By a convexity argument, the supremum is attained for an extremal $\rho$, i.e. a pure state $P_x = |x\rangle\langle x|$ for some unit vector $x$. Since the operator norm itself can be written as a supremum over unit vectors, we get
$$Z = \sup_{|x|=|y|=1}\Big|\sum_{i=1}^{N}\varepsilon_i|\langle y|U_i|x\rangle|^2\Big| = \sup_{|x|=|y|=1}\Big|\sum_{i=1}^{N}\varepsilon_i\big|\operatorname{Tr} U_i|x\rangle\langle y|\big|^2\Big| \le \sup_{A \in B(S_1^d)}\Big|\sum_{i=1}^{N}\varepsilon_i|\operatorname{Tr} U_i A|^2\Big|.$$
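The last inequality follows from the fact that $B(S_1^d) = \operatorname{conv}\{|x\rangle\langle y|,\ |x| = |y| = 1\}$. Let $\Phi : B(S_1^d) \to \mathbb{R}^N$ be defined as $\Phi(A) = (|\operatorname{Tr} U_1 A|^2, \dots, |\operatorname{Tr} U_N A|^2)$. We now apply Dudley’s inequality (Theorem A2 in the next section) with $K = \Phi(B(S_1^d))$ to estimate $\mathbb{E}Z$ using covering numbers. This yields
$$\mathbb{E}Z \le C\int_0^\infty \sqrt{\log N(\Phi(B(S_1^d)), |\cdot|, \varepsilon)}\,d\varepsilon,$$
where $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^N$. Define a distance $\delta$ on $B(S_1^d)$ as
$$\delta(A, B) = |\Phi(A) - \Phi(B)| = \Big(\sum_{i=1}^{N}\big(|\operatorname{Tr} U_i A|^2 - |\operatorname{Tr} U_i B|^2\big)^2\Big)^{1/2}.$$
We are led to the estimate
$$\mathbb{E}Z \le C\int_0^\infty \sqrt{\log N(B(S_1^d), \delta, \varepsilon)}\,d\varepsilon.$$
Using the inequality $\big||a|^2 - |b|^2\big| \le |a - b| \cdot |a + b|$, the metric $\delta$ can be upper bounded as follows:
$$\delta(A, B)^2 \le \sum_{i=1}^{N}|\operatorname{Tr} U_i(A + B)|^2\,\sup_{1 \le i \le N}|\operatorname{Tr} U_i(A - B)|^2.$$
Let us introduce a new semi-norm $|||\cdot|||$ on $\mathcal{M}(\mathbb{C}^d)$,
$$|||A||| = \sup_{1 \le i \le N}|\operatorname{Tr} U_i A|.$$
Let $\theta$ be the number defined by
$$\theta^2 := \sup_{A \in B(S_1^d)}\sum_{i=1}^{N}|\operatorname{Tr} U_i A|^2 = \sup_{\rho \in \mathcal{D}(\mathbb{C}^d)}\Big\|\sum_{i=1}^{N} U_i\rho U_i^\dagger\Big\|_\infty.$$
We get that for $A, B \in B(S_1^d)$, $\delta(A, B) \le 2\theta\,|||A - B|||$, and therefore
$$\mathbb{E}Z \le C\theta\int_0^\infty \sqrt{\log N(B(S_1^d), |||\cdot|||, \varepsilon)}\,d\varepsilon.$$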
It remains to bound this new entropy integral. We split it into three parts, for $\varepsilon_0$ to be determined. If $\varepsilon$ is large ($\varepsilon > 1$), since $\|U_i\|_\infty = 1$, we get that $|||\cdot||| \le \|\cdot\|_1$. This means that $N(B(S_1^d), |||\cdot|||, \varepsilon) = 1$ and the integrand is zero. If $\varepsilon$ is small ($0 < \varepsilon < \varepsilon_0$), we use the volumetric argument of Lemma A1,
\[
N(B(S_1^d), |||\cdot|||, \varepsilon) \le N(B(S_1^d), \|\cdot\|_1, \varepsilon) \le (3/\varepsilon)^{2d^2} .
\]
In the intermediate range ($\varepsilon_0 \le \varepsilon \le 1$), let $q = \log d$ and $p = 1 + 1/(\log d - 1)$ be the conjugate exponent. We are going to approximate the Schatten 1-norm by the Schatten $p$-norm. It is elementary to check that for $A \in M(\mathbf{C}^d)$, $\|A\|_q \le e\|A\|_\infty$. By dualizing,
\[
\|A\|_1 \le e\|A\|_p \ \Rightarrow\ N(B(S_1^d), |||\cdot|||, \varepsilon) \le N(B(S_p^d), |||\cdot|||, \varepsilon/e) .
\]
We are now in position to apply Lemma A5 to the space $E = S_p^d$. By Theorems A3 and A4, the 2-convexity constant of $S_p^d$ and the type 2 constant of $S_q^d$ (see the next section for definitions) are bounded as follows:
\[
T_2(S_q^d) \le \lambda(S_p^d) \le \sqrt{q-1} \le \sqrt{\log d} .
\]
Since $\|U_i\|_q \le e$, the inequality given by Lemma A5 is
\[
\sqrt{\log N(B(S_1^d), |||\cdot|||, \varepsilon)} \le \frac{C}{\varepsilon} (\log d)^{3/2} \sqrt{\log N} .
\]
We now gather all the estimations:
\[
\int_0^\infty \sqrt{\log N(B(S_1^d), |||\cdot|||, \varepsilon)}\, d\varepsilon \le \int_0^{\varepsilon_0} \sqrt{2d^2 \log(3/\varepsilon)}\, d\varepsilon + C(\log d)^{3/2} \sqrt{\log N} \int_{\varepsilon_0}^1 \frac{1}{\varepsilon}\, d\varepsilon .
\]
Choosing $\varepsilon_0 = 1/d$, an immediate computation shows that
\[
\int_0^\infty \sqrt{\log N(B(S_1^d), |||\cdot|||, \varepsilon)}\, d\varepsilon \le C(\log d)^{5/2} \sqrt{\log N} .
\]
This concludes the proof of the lemma.
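The "immediate computation" with $\varepsilon_0 = 1/d$ can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it reads the small-scale term as $\int_0^{1/d} \sqrt{2d^2 \log(3/\varepsilon)}\, d\varepsilon$, evaluates it by a plain midpoint rule, and compares it with $\sqrt{2\log(3d)}$; the intermediate-range factor $\int_{1/d}^1 d\varepsilon/\varepsilon$ is simply $\log d$. All function names are hypothetical.

```python
import math


def small_scale_term(d, steps=50_000):
    """Midpoint-rule value of the integral of sqrt(2 d^2 log(3/eps)) over (0, 1/d)."""
    eps0 = 1.0 / d
    h = eps0 / steps
    total = 0.0
    for k in range(steps):
        eps = (k + 0.5) * h  # midpoint of the k-th subinterval, never zero
        total += math.sqrt(2.0 * d * d * math.log(3.0 / eps))
    return total * h


def intermediate_factor(d):
    """Exact value of the integral of 1/eps over (1/d, 1)."""
    return math.log(d)


for d in (4, 16, 256):
    reference = math.sqrt(2.0 * math.log(3.0 * d))
    print(d, small_scale_term(d), reference, intermediate_factor(d))
```

The small-scale term stays of order $\sqrt{\log d}$, so the total is dominated by the intermediate range, which contributes the extra factor $\log d$ turning $(\log d)^{3/2}$ into $(\log d)^{5/2}$.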
Appendix: Geometry of Banach Spaces
In this last section, we gather several definitions and results from geometry and probability in Banach spaces. We denote by $(E, \|\cdot\|)$ a real or complex Banach space (actually, in our applications $E$ will be finite-dimensional). We denote by $(E^*, \|\cdot\|_*)$ the dual Banach space.
4.1. Covering numbers. Definition. If $(K, \delta)$ is a compact metric space, the covering number or entropy number $N(K, \delta, \varepsilon)$ is defined to be the smallest cardinality $M$ of a set $\{x_1, \dots, x_M\} \subset K$ so that
\[
K \subset \bigcup_{i=1}^M B(x_i, \varepsilon),
\]
where $B(x, \varepsilon) = \{y \in K \text{ s.t. } \delta(x,y) \le \varepsilon\}$. An especially important case is when $K$ is a subset of $\mathbf{R}^n$ and $\delta$ is induced by a norm. The next lemma is proved by a volumetric argument (see [14], Lemma 9.5).
Lemma A1. If $\|\cdot\|$ is a norm on $\mathbf{R}^n$ with unit ball $K$, then for every $\varepsilon > 0$, $N(K, \|\cdot\|, \varepsilon) \le (1 + 2/\varepsilon)^n$.
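The volumetric argument behind Lemma A1 can be made concrete in a small case. The sketch below is an illustration only: it works with a finite random sample of the Euclidean unit disk in $\mathbf{R}^2$ rather than the whole ball, and the helper names are hypothetical. A greedily chosen $\varepsilon$-separated subset is automatically an $\varepsilon$-net of the sample, and the disjoint balls of radius $\varepsilon/2$ around its points fit inside $(1+\varepsilon/2)K$, which forces its cardinality to be at most $(1+2/\varepsilon)^2$.

```python
import math
import random


def greedy_separated_subset(points, eps):
    """Keep each point that is more than eps away from every point kept so far."""
    kept = []
    for p in points:
        if all(math.dist(p, q) > eps for q in kept):
            kept.append(p)
    return kept


def sample_unit_disk(count, seed=0):
    """Rejection-sample `count` points from the Euclidean unit disk."""
    rng = random.Random(seed)
    pts = []
    while len(pts) < count:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            pts.append((x, y))
    return pts


points = sample_unit_disk(2000)
for eps in (1.0, 0.5, 0.25):
    net = greedy_separated_subset(points, eps)
    print(eps, len(net), (1.0 + 2.0 / eps) ** 2)
```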
The following theorem gives upper bounds on Bernoulli averages involving covering numbers. For a proof, see Lemma 4.5 and Theorem 11.17 in [14].
Theorem A2 (Dudley's Inequality). Let $(\varepsilon_i)$ be independent Bernoulli random variables and $K$ be a compact subset of $\mathbf{R}^n$. Denote by $(x_1, \dots, x_n)$ the coordinates of a vector $x \in \mathbf{R}^n$. Then for some absolute constant $C$,
\[
\mathbf{E} \max_{x\in K} \sum_{i=1}^n \varepsilon_i x_i \le C \int_0^\infty \sqrt{\log N(K, |\cdot|, \varepsilon)}\, d\varepsilon,
\]
where $|\cdot|$ denotes the Euclidean norm on $\mathbf{R}^n$.
4.2. 2-convexity. Definition. A Banach space $(E, \|\cdot\|)$ is said to be 2-convex with constant $\lambda$ if for any $y, z \in E$, we have
\[
\|y\|^2 + \lambda^{-2}\|z\|^2 \le \frac{1}{2}\left( \|y+z\|^2 + \|y-z\|^2 \right).
\]
The smallest such $\lambda$ is called the 2-convexity constant of $E$ and denoted by $\lambda(E)$.
We say shortly that "$E$ is 2-convex" while the usual terminology should be "$E$ has a modulus of convexity of power type 2". This should not be confused with the notion of 2-convexity for Banach lattices [15]. It follows from the parallelogram identity that a Hilbert space is 2-convex with constant 1. Other examples are $\ell_p$ and $S_p^d$ for $1 < p \le 2$. The next theorem has been proved by Ball, Carlen and Lieb [5], refining early work by Tomczak-Jaegermann [17].
Theorem A3. For $p \le 2$, the following inequality holds for $A, B \in M(\mathbf{C}^d)$:
\[
\|A\|_p^2 + (p-1)\|B\|_p^2 \le \frac{1}{2}\left( \|A+B\|_p^2 + \|A-B\|_p^2 \right).
\]
Therefore, $S_p^d$ is 2-convex with constant $1/\sqrt{p-1}$.
This property nicely dualizes. Indeed, it is easily checked (see [5], Lemma 5) that $E$ is 2-convex with constant $\lambda$ if and only if, for every $y, z \in E^*$,
\[
\|y\|_*^2 + \lambda^2\|z\|_*^2 \ge \frac{1}{2}\left( \|y+z\|_*^2 + \|y-z\|_*^2 \right).
\]
In this case, $E^*$ is said to be 2-smooth with constant $\lambda$.
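The 2-convexity inequality of Theorem A3 can be spot-checked on small matrices. The sketch below is a sanity check under restrictive assumptions, not a proof: it only samples real symmetric $2\times 2$ matrices (whose singular values are the absolute values of the eigenvalues, available in closed form) and exponents $1 \le p \le 2$, and it reads the inequality as $\|A\|_p^2 + (p-1)\|B\|_p^2 \le \tfrac12(\|A+B\|_p^2 + \|A-B\|_p^2)$. The helper names are hypothetical.

```python
import math
import random


def schatten_norm_sym2(m, p):
    """Schatten p-norm of the real symmetric matrix [[a, b], [b, c]], given as (a, b, c)."""
    a, b, c = m
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    eigenvalues = (mean + radius, mean - radius)
    return sum(abs(t) ** p for t in eigenvalues) ** (1.0 / p)


def convexity_gap(A, B, p):
    """Right-hand side minus left-hand side of the inequality; expected to be >= 0."""
    plus = tuple(x + y for x, y in zip(A, B))
    minus = tuple(x - y for x, y in zip(A, B))
    lhs = schatten_norm_sym2(A, p) ** 2 + (p - 1.0) * schatten_norm_sym2(B, p) ** 2
    rhs = 0.5 * (schatten_norm_sym2(plus, p) ** 2 + schatten_norm_sym2(minus, p) ** 2)
    return rhs - lhs


rng = random.Random(1)
worst = min(
    convexity_gap(
        tuple(rng.uniform(-1.0, 1.0) for _ in range(3)),
        tuple(rng.uniform(-1.0, 1.0) for _ in range(3)),
        rng.uniform(1.0, 2.0),
    )
    for _ in range(5000)
)
print(worst)
```

For $p = 2$ the gap vanishes identically, which is the parallelogram identity for the Hilbert-Schmidt norm.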
4.3. Type 2. Definition. A Banach space $(E, \|\cdot\|)$ is said to have type 2 if there exists a constant $T_2$ so that for any finite sequence $y_1, \dots, y_N$ of vectors of $E$, we have
\[
\left( \mathbf{E} \left\| \sum_{i=1}^N \varepsilon_i y_i \right\|^2 \right)^{1/2} \le T_2 \left( \sum_{i=1}^N \|y_i\|^2 \right)^{1/2} . \tag{7}
\]
The smallest possible $T_2$ is called the type 2 constant of $E$ and denoted $T_2(E)$. Here, the expectation $\mathbf{E}$ is taken with respect to a sequence $(\varepsilon_i)$ of independent Bernoulli random variables.
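As a concrete instance of the definition, a Euclidean (Hilbert) space has type 2 constant 1, and (7) is in fact an equality there because the cross terms $\mathbf{E}\,\varepsilon_i\varepsilon_j$ vanish for $i \ne j$. The sketch below checks this by exact enumeration of all sign patterns; the vectors are arbitrary random data chosen for illustration, and the function names are hypothetical.

```python
import itertools
import random


def rademacher_second_moment(vectors):
    """Exact E || sum_i eps_i y_i ||^2 (Euclidean norm), averaging over all sign patterns."""
    n = len(vectors)
    dim = len(vectors[0])
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        combo = [sum(e * v[k] for e, v in zip(signs, vectors)) for k in range(dim)]
        total += sum(c * c for c in combo)
    return total / 2 ** n


def sum_of_squared_norms(vectors):
    """Sum over i of ||y_i||^2 (Euclidean norm)."""
    return sum(c * c for v in vectors for c in v)


rng = random.Random(2)
ys = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]
print(rademacher_second_moment(ys), sum_of_squared_norms(ys))
```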
We already mentioned that if $E$ is 2-convex, then $E^*$ is 2-smooth. It is easily checked (by induction on the number of vectors involved) that a 2-smooth Banach space has type 2 with the same constant. We therefore have the inequality $T_2(E^*) \le \lambda(E)$. In particular, Theorem A3 implies the following result, first proved by Tomczak-Jaegermann [17] with a worse constant.
Theorem A4. If $q \ge 2$, then $S_q^d$ has type 2 with the estimate $T_2(S_q^d) \le \sqrt{q-1}$.
4.4. An entropy lemma. The following lemma plays a key role in our proof. It appears as Lemma 1 in [11].
Lemma A5. Let $E$ be a Banach space with unit ball $B(E)$. Assume that $E$ is 2-convex with constant $\lambda(E)$. Let $x_1, \dots, x_N$ be elements of $E^*$, and define a semi-norm $|||\cdot|||$ on $E$ as
\[
|||y||| = \max_{1\le i\le N} |x_i(y)| .
\]
Then for any $\varepsilon > 0$ we have for some absolute constant $C$,
\[
\varepsilon \sqrt{\log N(B(E), |||\cdot|||, \varepsilon)} \le C \lambda(E)^2\, T_2(E^*) \sqrt{\log N}\, \max_{1\le i\le N} \|x_i\|_{E^*} . \tag{8}
\]
The proof of Lemma A5 is based on a duality argument for covering numbers coming from [7]. A positive answer to the duality conjecture for covering numbers (see [3] for a statement of the conjecture and recent results) would imply that the inequality (8) is valid without the factor $\lambda(E)^2$. This would improve our estimate in Theorem 2 to $N \ge C d (\log d)^4/\varepsilon^2$.
Acknowledgement. I thank Andreas Winter for several e-mail exchanges on the topic, and I am very grateful to Alain Pajor for showing me that the results of [11] can be applied here.
References
1. Ambainis, A., Mosca, M., Tapp, A., de Wolf, R.: Private quantum channels. In: 41st Annual Symposium on Foundations of Computer Science (Redondo Beach, CA, 2000), New York: John Wiley/IEEE Comput. Soc. Press, 2000, pp. 547–553
2. Ambainis, A., Smith, A.: Small Pseudo-random Families of Matrices: Derandomizing Approximate Quantum Encryption. Proceedings of RANDOM'04, pp. 249–260
3. Artstein, S., Milman, V., Szarek, S., Tomczak-Jaegermann, N.: On convexified packing and entropy duality. Geom. Funct. Anal. 14(5), 1134–1141 (2004)
4. Aubrun, G.: http://arXiv.org/abs/0802.4193v1[quant-ph], 2008
5. Ball, K., Carlen, E., Lieb, E.: Sharp uniform convexity and smoothness inequalities for trace norms. Invent. Math. 115(3), 463–482 (1994)
6. Bhatia, R.: Matrix analysis. Graduate Texts in Mathematics 169. Berlin-Heidelberg-New York: Springer-Verlag, 1997
7. Bourgain, J., Pajor, A., Szarek, S., Tomczak-Jaegermann, N.: On the duality problem for entropy numbers of operators. In: Geometric aspects of functional analysis (1987–88), Lecture Notes in Math. 1376, Berlin-Heidelberg-New York: Springer, 1989, pp. 50–63
8. Choi, M.D.: Completely positive linear maps on complex matrices. Linear Algebra and Appl. 10, 285–290 (1975)
9. Dickinson, P., Nayak, A.: Approximate Randomization of Quantum States With Fewer Bits of Key. AIP Conference Proceedings 864, 18–36 (2006)
10. Durrett, R.: Probability. Theory and examples. The Wadsworth & Brooks/Cole Statistics/Probability Series, 1991
11. Guédon, O., Mendelson, S., Pajor, A., Tomczak-Jaegermann, N.: Majorizing measures and proportional subsets of bounded orthonormal systems. Preprint, 2008
12. Guédon, O., Rudelson, M.: L_p-moments of random vectors via majorizing measures. Adv. Math. 208(2), 798–823 (2007)
13. Hayden, P., Leung, D., Shor, P.W., Winter, A.: Randomizing quantum states: constructions and applications. Commun. Math. Phys. 250, 371–391 (2004)
14. Ledoux, M., Talagrand, M.: Probability in Banach spaces. Isoperimetry and processes. Ergebnisse der Mathematik und ihrer Grenzgebiete (3) 23, Berlin-Heidelberg: Springer-Verlag, 1991
15. Lindenstrauss, J., Tzafriri, L.: Classical Banach spaces. II. Function spaces. Ergebnisse der Mathematik und ihrer Grenzgebiete 97. Berlin-Heidelberg: Springer-Verlag, 1979
16. Rudelson, M.: Random vectors in the isotropic position. J. Funct. Anal. 164(1), 60–72 (1999)
17. Tomczak-Jaegermann, N.: The moduli of smoothness and convexity and the Rademacher averages of trace classes S_p (1 ≤ p < ∞). Studia Math. 50, 163–182 (1974)
18. Winter, A.: The maximum output p-norm of quantum channels is not multiplicative for any p > 2. http://arXiv.org/abs/0707.0402v3[quant-ph], 2008; Hayden, P., Winter, A.: Counterexamples to the maximal p-norm multiplicativity conjecture for all p > 1. Commun. Math. Phys. 284(1), 263–280 (2008)
Communicated by M. B. Ruskai
Commun. Math. Phys. 288, 1117–1135 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0780-x
Communications in
Mathematical Physics
Conformal Generally Covariant Quantum Field Theory: The Scalar Field and its Wick Products
Nicola Pinamonti
II. Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, D-22761 Hamburg, Germany. E-mail:
[email protected] Received: 11 June 2008 / Accepted: 8 January 2009 Published online: 24 March 2009 – © Springer-Verlag 2009
Abstract: In this paper we generalize the construction of generally covariant quantum theories given in [BFV03] to encompass the conformally covariant case. After introducing the abstract framework, we discuss the massless conformally coupled Klein Gordon field theory, showing that its quantization corresponds to a functor between two certain categories. At the abstract level, the ordinary fields can be thought of as natural transformations in the sense of category theory. We show that the Wick monomials without derivatives (Wick powers) can be interpreted as fields in this generalized sense, provided a non-trivial choice of the renormalization constants is made. A careful analysis shows that the transformation law of Wick powers is characterized by a weight, and it turns out that the sum of fields with different weights breaks the conformal covariance. At this point there is a difference from the previously given picture, due to the presence of a bigger group of covariance. It is furthermore shown that the construction does not depend upon the scale µ appearing in the Hadamard parametrix used to regularize the fields. Finally, we briefly discuss some further examples of more involved fields.
1. Introduction
The systematic analysis of quantization in terms of functors given by Brunetti, Fredenhagen and Verch [BFV03] opened an interesting new way to interpret quantum field theory on curved spacetimes. With these new ideas, the expectation values of fields in different spacetimes can be compared in a mathematically rigorous way. Some interesting new applications have been developed following this line of thinking; we recall here the work of Buchholz and Schlemmer [BS07] and Schlemmer and Verch [SV08], where the authors deal consistently with expectation values of fields in different spacetimes. Another interesting use of similar ideas can be found in the derivation of local energy bounds in curved spacetime as performed by Fewster [Fe07]. The use of these concepts plays a central role in the development of a perturbative theory of quantum gravity as well; to this end we would like to recall the interesting paper of Brunetti and Fredenhagen [BF06]. A central role in the analysis performed in [BFV03] is played by the study of the isometric embeddings between different spacetimes and their interplay with the quantization procedure. It was shown that the quantization of the massive Klein Gordon fields can be encompassed in the new scheme. Furthermore, the field itself and its Wick powers, as constructed by Hollands and Wald in [HW01,HW02,HW05], can be interpreted as generally covariant quantum fields. Here we would like to address the same problem in the case of field theories having a larger group of symmetry, namely the locally conformally covariant case. Hence, we introduce the notion of generally conformally covariant fields by enlarging the abstract setup presented in [BFV03]. The idea of considering more complicated morphisms than isometries appeared for the first time in the work of Brunetti [Br04]; we would like to follow a similar line of reasoning. While the extension of the covariance to the conformal covariance is expected to hold true at the level of canonical commutation relations, and hence at the level of the simple scalar field, the situation is expected to be different for the extended algebra of fields, namely the fields defined by means of a regularization. It usually happens that the regularization breaks the conformal covariance; technically speaking, this is due to the unavoidable presence of a length scale in the Hadamard parametrix used to regularize the fields. It is then an unexpected fact that, in the four dimensional case, despite the presence of this length scale and more generally despite the presence of quantum anomalies, a proper but large subset of the algebra of local fields contains locally conformally covariant fields.
We shall show that the Wick powers (the Wick monomials without derivatives) are contained in this subset, provided a non-trivial choice of renormalization freedom is made.1 At this point it seems interesting to remark that the requirement of being conformally covariant restricts the renormalization freedom usually present in the construction of these fields. This fact seems to be a peculiarity of the four dimensional case; it is in fact known that, for example, in the two dimensional case, the Wick powers ($\phi^n$) are not locally conformally covariant (they are not primary in the language of CFT [DMS97]). We shall furthermore comment on this restriction in a subsection devoted to the analysis of the extension of these results to general dimensions. Another interesting difference that arises in the case under investigation is that the transformation rules enjoyed by the Wick powers are characterized by the presence of a weight. Furthermore, the sum of Wick monomials with different weights breaks the conformal covariance. The analysis performed in this paper makes it possible to relate geometrically a larger class of spacetimes than in [BFV03], namely those that are locally connected by a conformal transformation. In this way it is possible, for example, to transplant observables (and states) from the de Sitter spacetime to the Minkowski one. This could be useful in the study of concepts like local equilibrium states [BOR02] in the case of conformally covariant theories as well. The paper is organized as follows: at first we introduce the notion of locally generally conformal covariant quantum fields. The example of the massless conformally coupled scalar Klein Gordon field is studied in the second section; we shall present the transformation rule of the fundamental solutions and of the Hadamard parametrix in particular.
The third section contains the analysis of the Wick powers in four dimensions and a subsection devoted to the discussion of the differences between this case and the case of spacetimes with general dimensions. Some final comments and some further non-trivial examples of more complicated fields are given in the fourth section. The appendix contains some technical computations used in the derivation of the results.
1 A detailed analysis of the renormalization freedom can be found in the work of Hollands and Wald [HW01,HW05].
1.1. Categorical formulation of locally conformally covariant field theory. We are going to enumerate the relevant categories that will be used later for the formulation of a conformal quantum field theory in terms of a functor between certain categories. Before doing so, we introduce some small modifications to the locally covariant picture of quantum field theory presented for the first time in [BFV03], in order to adapt the formalism to include the case of conformally invariant theories. The key observation is that a conformally invariant field theory should be invariant under a richer group of transformations, namely the local conformal transformations. It is interesting to notice that such transformations share many properties with isometries; in particular, the causal structure is preserved by such transformations, and this fact will play a central role later on. For a better formalization of these concepts we would like to introduce the notion of conformal embedding.
Definition 1.1. Consider two globally hyperbolic spacetimes $(M_1, g_1)$ and $(M_2, g_2)$. A map $\psi : M_1 \to M_2$ is called a conformal embedding if it is a diffeomorphism between $M_1$ and $\psi(M_1)$ and the push forward $\psi_*$ acts on the metric $g_1$ in the following way: $\psi_* g_1 = \Omega^{-2} g_2|_{\psi(M_1)}$, where $\Omega$ is a strictly positive smooth function on $\psi(M_1)$, called the conformal factor.
In the following we shall consider the case of a conformal embedding $\psi$ between two globally hyperbolic spacetimes $(M_1, g_1)$ and $(M_2, g_2)$ that preserves orientation and time orientation and such that the image $(\psi(M_1), g_2|_{\psi(M_1)})$ is also an open globally hyperbolic subset of $(M_2, g_2)$. We would like to remark that, under the given hypotheses, $\psi$ preserves the causal structures of the spacetime,2 mapping for example causal curves to causal curves and so on and so forth.
At this point it seems important to stress a difference between the conformal embeddings used in this paper and the conformal transformations that form the so called conformal group. The main difference arises because we are not simply considering coordinate transformations but general mappings between different spacetimes. For example, in the four dimensional Minkowski spacetime, the conformal transformations that can arise as coordinate transformations form a finite-dimensional group $SO(2,4)$, while much more freedom is allowed by conformal embeddings. The following action of weighted conformal transformations on test functions will play a distinguished role in the definition of the weight of the field.
Definition 1.2. Let $\psi$ be a conformal embedding between $(M_1, g_1)$ and $(M_2, g_2)$ with conformal factor $\Omega_\psi$. The weighted action on test functions $\psi_*^{(\lambda)}$ is the map from $C^\infty(M_1)$ to $C^\infty(\psi(M_1))$ such that
\[
\psi_*^{(\lambda)}(f)(x) := \Omega_\psi^{-\lambda}(x)\, f\circ\psi^{-1}(x),
\]
where $\lambda \in \mathbf{R}$ is called the weight of the map.
The previously given definition deserves some comments regarding its domain of definition and its inversion. While it is clear that $\psi_*^{(\lambda)}$ can also be thought of as acting on compactly supported smooth functions, $\psi_*^{(\lambda)} : C_0^\infty(M_1) \to C_0^\infty(M_2)$, this is no longer true for general smooth functions: $\psi(M_1)$ is in general a proper subset of $M_2$, hence a smooth function $f$ that is not compactly supported on $M_1$ is not mapped to a smooth function in $C^\infty(M_2)$. It is indeed impossible to extend uniquely $\psi_*^{(\lambda)}(f)$ on $M_2$ outside $\psi(M_1)$. Despite the presence of these domain problems we would like to notice that $\psi_*^{(\lambda)}$ is invertible either on $C_0^\infty(\psi(M_1))$ or on $C^\infty(\psi(M_1))$. The particular conformal embedding $\psi : (M, g) \to (M, g')$ such that every $p \in M$ is mapped to $\psi(p) = p$ is called a conformal transformation. Moreover, if the conformal factor $\Omega_\psi$ of a conformal transformation is a constant then it is called a rigid conformal transformation or rigid dilation.
2 See Appendix D of [Wa84] for more details.
We enumerate here the categories used later on; these definitions are very similar to those given in [BFV03]. For this reason we shall stress, case by case, the differences we have to implement in order to encompass also the conformal transformations in the framework.
CLoc: This is the category that encompasses all the geometric structures of the theory. The objects of CLoc are all the four dimensional oriented and time oriented globally hyperbolic spacetimes, while the morphisms are all the conformal embeddings $\psi : (M_1, g_1) \to (M_2, g_2)$ with the following additional properties, which are the same as previously given: (i) $(\psi(M_1), g_2|_{\psi(M_1)})$ is an open globally hyperbolic subset of $(M_2, g_2)$ and (ii) the morphisms preserve orientation and time orientation.3 The composition of morphisms is defined as the composition map of conformal embeddings in the usual way. The category CLoc is an extension of the category Loc given in [BFV03], in the sense that in CLoc there is a larger class of morphisms than in Loc.
Alg: There is no need to modify the category Alg introduced in [BFV03].
The objects of Alg are all the $C^*$-algebras built on a globally hyperbolic spacetime $(M, g)$, possessing the unit element, while their morphisms are the injective $*$-homomorphisms that preserve the unit; once again the composition descends from the usual composition map of $*$-homomorphisms.
TAlg: The definition of TAlg follows easily from the one of Alg; the difference is that the objects of this category are taken to be only $*$-algebras with unit, instead of $C^*$-algebras. There is no modification between this and the previously given definition.
Test$_\lambda$: The objects of this category are the sets of compactly supported smooth functions $C_0^\infty(M)$ on the spacetimes $(M, g)$. The morphisms are the weighted transformations $\psi_*^{(\lambda)} : M \to M'$ with a fixed $\lambda$, and their action is like the one presented in Definition 1.2.
It seems interesting to notice that the categories Alg and TAlg are defined in the same way as in [BFV03,Br04]; in a certain sense the algebraic formulation of quantum field theory is already suitable to describe conformal transformations. Furthermore the scaling transformations have already been considered as geometric morphisms in the work [Br04].
3 The requirement of global hyperbolicity for $\psi(M_1)$ is equivalent to the requirement of causal convexity of $\psi(M_1)$ in $M_2$. In other words every causal curve with endpoints in $\psi(M_1)$ has to lie inside $\psi(M_1)$ too.
1.2. Quantum Conformal Field theory as a functor and conformal fields as natural transformations. We are now in place to define the locally covariant conformal quantum field as a functor between the two categories CLoc and Alg, such that the objects of CLoc are mapped to the objects of Alg whereas the morphisms $\psi$ of CLoc are mapped into the morphisms $\alpha_\psi$ of Alg, in such a way that the following diagram commutes:
\[
\begin{array}{ccc}
(M, g) & \xrightarrow{\ \psi\ } & (M', g') \\
\Big\downarrow{\scriptstyle \mathscr{A}} & & \Big\downarrow{\scriptstyle \mathscr{A}} \\
\mathscr{A}(M, g) & \xrightarrow{\ \alpha_\psi\ } & \mathscr{A}(M', g')
\end{array}
\]
and the following composition property holds: $\alpha_{\psi'} \circ \alpha_\psi = \alpha_{\psi'\circ\psi}$, $\alpha_{I_M} = I_{\mathscr{A}(M)}$. The same construction can be repeated substituting the category Alg with TAlg. Despite the meaningfulness of the previously given definition and the presence of examples of the given framework, it is not at all clear if observables with a certain physical meaning in a spacetime are mapped to observables with the same meaning on the other spacetime. In general this is indeed not the case, and it is precisely because of this problem that the ordinary fields need to be introduced in an alternative way. In the picture we are going to introduce, they will assume the particular meaning of natural transformations between functors. To this end it is useful to consider the set of weighted test functions $\mathcal{D}_\lambda$ as a functor between CLoc and Test$_\lambda$. More precisely let us indicate by $\mathcal{D}_\lambda(M, g)$ the category whose elements are the sets of compactly supported smooth functions $C_0^\infty(M)$, and the morphisms $\alpha_\psi^\lambda$ between these sets are defined by means of the weighted action on test functions as defined in Definition 1.2. Clearly $\mathcal{D}$ can also be seen as a functor from the category CLoc to Test. We are now ready to introduce the notion of conformal quantum field as a natural transformation between two functors.
Definition 1.3. A field $\Phi^\lambda_{(M,g)}$ of weight $\lambda$ is a linear transformation between the functor that realizes the test functions $\mathcal{D}_{4-\lambda} : (M, g) \to \mathcal{D}_{4-\lambda}(M, g)$ and the functor that realizes the topological algebras $\mathscr{A} : (M, g) \to \mathscr{A}(M, g)$ such that the following diagram commutes:
\[
\begin{array}{ccc}
\mathcal{D}_{4-\lambda}(M, g) & \xrightarrow{\ \Phi^\lambda_{(M,g)}\ } & \mathscr{A}(M, g) \\
\Big\downarrow{\scriptstyle \psi_*^{(4-\lambda)}} & & \Big\downarrow{\scriptstyle \alpha_\psi^\lambda} \\
\mathcal{D}_{4-\lambda}(M', g') & \xrightarrow{\ \Phi^\lambda_{(M',g')}\ } & \mathscr{A}(M', g')
\end{array}
\]
The preceding definition can be written more explicitly by means of the following conformal covariance property:
\[
\alpha_\psi^\lambda\left( \Phi^\lambda_{(M,g)}(f) \right) = \Phi^\lambda_{(M',g')}\left( \psi_*^{(4-\lambda)}(f) \right),
\]
where $\psi_*^{(\lambda)}(f)$ is defined as a weighted transformation as given in Definition 1.2. We call $\lambda$ the weight of the field $\Phi^\lambda$.
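The pairing of weight $\lambda$ for the field with weight $4-\lambda$ for the test function can be motivated by a heuristic classical computation. The following is a sketch under explicit assumptions, not a statement from the text: $\varphi$ is taken to be an ordinary function transforming with weight $\lambda$, i.e. $\varphi' = \Omega^{-\lambda}\,\varphi\circ\psi^{-1}$ (a hypothetical classical analogue of the field), $f$ is a compactly supported test function on $M$, and the volume density is assumed to scale as $\sqrt{|g'|}\,\Omega^{-4} = \sqrt{|g|}$ under the embedding.

```latex
\begin{aligned}
\int_{\psi(M)} \varphi'(x')\,\psi_*^{(4-\lambda)}(f)(x')\,\sqrt{|g'|}\;d^4x'
 &= \int_{\psi(M)} \Omega^{-\lambda}\,(\varphi\circ\psi^{-1})\;
    \Omega^{-(4-\lambda)}\,(f\circ\psi^{-1})\;\sqrt{|g'|}\;d^4x' \\
 &= \int_{\psi(M)} \big((\varphi f)\circ\psi^{-1}\big)\;\Omega^{-4}\sqrt{|g'|}\;d^4x' \\
 &= \int_{M} \varphi(x)\,f(x)\,\sqrt{|g|}\;d^4x .
\end{aligned}
```

Under these assumptions the smeared quantity is unchanged by the embedding exactly when the two weights add up to 4, the spacetime dimension.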
The difference between the weight in the test functions and the weight in the fields can be understood taking into account the transformation rule enjoyed by the volume form. Under a conformal embedding $\psi : (M, g) \to (M', g')$,
\[
\sqrt{|g'|}(\psi(x))\, \Omega^{-4}(\psi(x)) = \sqrt{|g|}(x),
\]
where $g'$ stands for the determinant of the metric computed in a chart of $M'$ containing $\psi(x)$ and $g$ is for the determinant of $\psi_* g$ computed in the same chart. As a consequence of the given definitions, linear combinations of fields with different weights are not conformally covariant fields. Precisely at this point there is a great difference with what was addressed in [BFV03], where also the linear combinations of fields with different "weights" were taken into account. In two dimensional conformal field theory the fields that possess this property are called primary fields [DMS97]. Hence there is a relation between the conformal covariance studied here and the primarity addressed in ordinary CFT.
2. The Model: Free Conformal Invariant Scalar Field
In this section we present a model that shows the previously presented abstract structure. We shall consider the massless conformally coupled scalar field theory. Here and in the next sections we shall consider only the four dimensional case, because many of the presented results hold only in that case. Later on, we shall briefly discuss the difficulties that arise in generalizing the outcomes to other dimensions. Just to fix some notation let us recall that the classical equation of motion of the conformal Klein Gordon scalar field $\phi$ on a spacetime $(M, g)$ is
\[
P_g \phi = 0, \qquad P_g = -\Box_g + \frac{1}{6} R_g, \tag{1}
\]
where $\Box_g$ is the d'Alembert operator constructed out of the four dimensional metric $g$ and $R_g$ is the Ricci scalar of the metric $g$. We start our analysis with the study of the interplay between conformal transformations, the fundamental solutions and the microlocal spectral condition [Ra96,BFK96].
2.1. Conformal transformation of the fundamental solutions. Let us start recalling the transformation law satisfied by the operator $P_g$ under conformal embeddings.
Lemma 2.1. Let $\psi$ be a conformal embedding of $(M_1, g_1)$ into $(M_2, g_2)$, and consider the corresponding weighted transformations $\psi_*^{(3)}$ and $\psi_*^{(1)}$ of test functions, thought of as mappings from $C_0^\infty(M_1) \to C_0^\infty(\psi(M_1)) \subset C_0^\infty(M_2)$. The following equivalence holds for every $f$ in $C_0^\infty(M_1)$:
\[
P_{g_2}\, \psi_*^{(1)}(f) = \psi_*^{(3)}\left( P_{g_1}(f) \right). \tag{2}
\]
Proof. Because of the support properties of $f$ we know that the supports of the following smooth functions, $\psi_*^{(1)}(f)$ and $\psi_*^{(3)} \circ P_{g_1}(f)$, are contained in $\psi(M_1)$. Hence we can restrict our attention to the image of $M_1$ under $\psi$, namely to the spacetime $(\psi(M_1), g_1)$. Furthermore the conformal embedding $\psi$ becomes a standard conformal transformation if restricted to $\psi(M_1)$, and the proof of that proposition descends straightforwardly by means of a direct computation (a detailed analysis is contained in Appendix D of Wald's book [Wa84]).
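The weight shift from 1 to 3 in (2) can be seen in the simplest possible case. The sketch below is a toy check under strong simplifying assumptions that are not in the text: flat Minkowski space (so the curvature term vanishes), the rigid dilation $x \mapsto a x$ with constant conformal factor $\Omega = a$, and functions depending on time only, for which the wave operator reduces, up to a sign convention that affects both sides equally, to $d^2/dt^2$. The relation then becomes $\frac{d^2}{dt^2}\big[a^{-1} f(t/a)\big] = a^{-3} f''(t/a)$, checked here by central finite differences. All names are hypothetical.

```python
import math


def second_derivative(func, t, h=1e-3):
    """Central finite-difference approximation of the second derivative of func at t."""
    return (func(t + h) - 2.0 * func(t) + func(t - h)) / (h * h)


def weighted_pushforward(func, scale, weight):
    """Weighted action for the dilation t -> scale * t, with constant Omega = scale."""
    return lambda t: scale ** (-weight) * func(t / scale)


def profile(t):
    """A smooth, rapidly decaying profile standing in for a test function."""
    return math.exp(-t * t)


def profile_dd(t):
    """Exact second derivative of exp(-t^2)."""
    return (4.0 * t * t - 2.0) * math.exp(-t * t)


scale = 2.5
pushed_weight_one = weighted_pushforward(profile, scale, 1)
pushed_weight_three = weighted_pushforward(profile_dd, scale, 3)
for t in (-1.0, 0.0, 0.7, 3.0):
    # operator applied after the weight-1 action vs. weight-3 action applied after the operator
    print(t, second_derivative(pushed_weight_one, t), pushed_weight_three(t))
```

The two printed columns agree up to discretization error; a non-constant conformal factor would additionally bring in the curvature term of $P_g$, which this toy case does not probe.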
We can relax the hypotheses written above and use as test functions only the smooth functions. In this case the equivalence (2) works if restricted to the image $\psi(M_1) \subset M_2$. Another important aspect of the transformation law of the wave operator $P_g$ we would like to stress is its interplay with weighted test functions. Actually, because of the presence of the conformal factor in the transformation law of the operator defining the equations of motion, we have that $P_g$ maps test functions of weight 1 into test functions of weight 3. In a globally hyperbolic spacetime $(M, g)$, the advanced/retarded fundamental solutions $\Delta_\pm$ of the partial differential equation $P_g \phi = 0$ are the unique maps from $C_0^\infty(M)$ to $C^\infty(M)$ such that $P_g \Delta_\pm f = f$ and the supports of $\Delta_\pm f$ are contained in the causal future/past of the support of $f$ respectively, $\mathrm{supp}\, \Delta_\pm(f) \subset J^\pm(\mathrm{supp}\, f)$. For the issues regarding the uniqueness see [BGP07]. Let us study the transformation law enjoyed by the fundamental solutions under conformal embeddings and hence by the causal propagator.
Lemma 2.2. Let $\psi$ be a morphism in CLoc, hence $\psi$ is a conformal embedding $\psi : (M, g) \to (M', g')$, and let $\Delta_\pm$ and $\Delta'_\pm$ be the uniquely defined advanced/retarded fundamental solutions of $P_g$ and $P_{g'}$. Consider the following operators from $C_0^\infty(\psi(M))$ to $C^\infty(\psi(M))$:
\[
\Delta_\pm^\psi := \psi_*^{(1)} \circ \Delta_\pm \circ \left( \psi_*^{(3)} \right)^{-1},
\]
then $\Delta_\pm^\psi$ are the uniquely defined advanced/retarded fundamental solutions of $P_{g'}$ in $(\psi(M), g')$. Furthermore $\Delta_\pm^\psi = \chi(\psi(M))\, \Delta'_\pm|_{C_0^\infty(\psi(M))}$, where $\chi(\psi(M))$ is the characteristic function of $\psi(M)$.
Proof. $(\psi(M), g')$ is a globally hyperbolic subspace of $(M', g')$; then, in order to show that $\Delta_\pm^\psi$ are the advanced/retarded fundamental solutions of $P_{g'}$ in $(\psi(M), g')$, we have to check two properties: the first one is that $P_{g'} \Delta_\pm^\psi f' = f'$, and the other one is that $\mathrm{supp}\, \Delta_\pm^\psi(f') \subset J^\pm(\mathrm{supp}\, f')|_{\psi(M)}$ for every $f'$ in $C_0^\infty(\psi(M))$. First of all, consider the following chain of equalities valid in $\psi(M)$ for every $f' \in C_0^\infty(\psi(M))$ and $f = (\psi_*^{(3)})^{-1}(f')$:
\[
f' = \psi_*^{(3)}(f) = \psi_*^{(3)} \circ P_g(\Delta_\pm f) = P_{g'} \circ \psi_*^{(1)}(\Delta_\pm f) = P_{g'} \Delta_\pm^\psi \circ \psi_*^{(3)}(f) = P_{g'} \Delta_\pm^\psi(f') .
\]
The second step is to check that the support property is preserved by $\psi$. The properties of $\psi$ assure the validity of the following chain of inclusions:
\[
\mathrm{supp}\, \Delta_\pm^\psi f' = \psi(\mathrm{supp}\, \Delta_\pm f) \subset \psi(J^\pm(\mathrm{supp}\, f)) \subset J^\pm(\psi(\mathrm{supp}\, f))
\]
in $\psi(M)$. Furthermore, $\psi$ maps causal curves into causal curves preserving the orientation and from this the last inclusion descends.
The causal propagator $E$ is defined as the advanced minus retarded fundamental solution $E = \Delta_+ - \Delta_-$; it is a distribution on compactly supported smooth functions uniquely defined in a globally hyperbolic spacetime once $P_g$ is given. It can be seen as a map from $C_0^\infty(M)$ to $C^\infty(M)$, namely the set of solutions of $P_g \phi = 0$. Knowing the interplay between advanced, retarded fundamental solutions and conformal embeddings, we can derive straightforwardly the way in which the causal propagator $E$ transforms under conformal transformation, i.e.
Lemma 2.3. Let $\psi$ be a morphism in CLoc between the two elements $(M, g)$, $(M', g')$ of CLoc, then $\chi(\psi(M))\, E'(\psi_*^{(3)}(f)) = \psi_*^{(1)}(E(f))$ for any $f \in C_0^\infty(M)$.
The two point functions of Hadamard type play a distinguished role in the formulation of a quantum field theory in curved spacetime [KW91]. From the work of Radzikowski [Ra96] and Brunetti, Fredenhagen and Köhler [BFK96] we know that a Hadamard two-point function is characterized by the microlocal spectral condition. Hence we shall say that a two-point distribution $\omega_2$ is of Hadamard type if its antisymmetric part corresponds to the causal propagator and if it satisfies the microlocal spectral condition, which means that the wave front set of $\omega_2$ has a certain form:
\[
\mathrm{WF}(\omega_2) = \left\{ (x_1, k_1, x_2, k_2) \in T^*M^2 \setminus \{0\} \;\middle|\; (x_1, k_1) \sim (x_2, k_2),\ k_1 \in V_+ \right\}, \tag{3}
\]
where $(x_1, k_1) \sim (x_2, k_2)$ if a null geodesic $\gamma : [0, a] \to M$ exists such that $\gamma(0) = x_1$ and $\gamma(a) = x_2$ and $k_1$ is the cotangent, coparallel vector to the geodesic at $x_1$, while $k_2$ is equal to the parallel transport along $\gamma$ of $-k_1$ on $x_2$. The next preliminary task we have to accomplish is to give the transformation rule for the Hadamard two-point function under conformal embeddings. While we have already seen that the causal propagator satisfies a homogeneous transformation rule, we would like to see what happens to the symmetric part of an $\omega_2$ of Hadamard type.
Lemma 2.4. Let $\psi$ be a morphism in CLoc from $(M, g)$ to $(M', g')$ and $\omega_2$ a distribution on $C_0^\infty(M \times M)$ that satisfies the microlocal spectral condition; then, consider
\[
\omega_2^\psi(f, g) := \omega_2\left( \left(\psi_*^{(3)}\right)^{-1} f,\ \left(\psi_*^{(3)}\right)^{-1} g \right).
\]
$\omega_2^\psi$ is a distribution on $C_0^\infty(\psi(M)^2)$ and it satisfies the microlocal spectral condition on $(\psi(M), g')$.
Proof. Since $\psi_*^{(3)}$ is a smooth invertible map from $C_0^\infty(M)$ to $C_0^\infty(\psi(M))$, $\omega_2^\psi$ is a distribution. Let us analyze the wave front set of $\omega_2^\psi$ in $(\psi(M), g')$; since the definition of wave front set does not depend on the metric $g'$, we have simply to analyze the relation between $M$ and $\psi(M)$. Since $\psi_*^{(3)}$ is smooth and invertible, and since $\psi$ is a diffeomorphism, we can immediately conclude that $(x_1, k_1, x_2, k_2)$ is an element of $\mathrm{WF}(\omega_2^\psi)$ if and only if $(\psi^{-1}(x_1), \psi_*^{-1}(k_1), \psi^{-1}(x_2), \psi_*^{-1}(k_2)) \in \mathrm{WF}(\omega_2)$. Here $\psi_*^{-1} : T_{\psi(p)}\psi(M)^* \to T_p M^*$ is defined in the standard way. We have to show that $(x_1, k_1) \sim (x_2, k_2)$ in $(\psi(M), g')$. To this end we are seeking a future directed null geodesic $\gamma'$ in $\psi(M)$ whose extreme points are $x_1$ and $x_2$ and whose cotangent vector in $x_1$ is $k_1$ and in $x_2$ is $-k_2$. Notice that, having $(\psi^{-1}(x_1), \psi_*^{-1}(k_1)) \sim (\psi^{-1}(x_2), \psi_*^{-1}(k_2))$ in $(M, g)$, a future directed null geodesic $\gamma$ exists with such properties in $(M, g)$. Because of the properties of the conformal embedding, $k_1$ and $k_2$ are also null vectors in $(\psi(M), g')$. Since $\psi$ is an orientation and time orientation preserving conformal embedding, $\gamma' = \psi(\gamma)$ turns out to be also a future directed null geodesic in $\psi(M)$; furthermore, let $\lambda$ and $\lambda'$ be the affine parameters of $\gamma$ and of $\psi(\gamma)$, then $\frac{d\lambda'}{d\lambda} = c\,\Omega^2$, where $c$ is a constant and $\Omega$ is the conformal factor of $\psi$. Notice that if $\psi_*^{-1} k_1$ is a cotangent vector of $\gamma$ in $\psi^{-1}(x_1)$, $k_1$ has to be the cotangent vector of $\psi(\gamma)$ in $x_1$; the same also holds for $-k_2$ in $x_2$. Finally, since the orientation is preserved by $\psi$, the thesis turns out to be proved.
The singular structure of a Hadamard two-point function, called the Hadamard parametrix, is fixed [KW91]; to proceed with our analysis it will be useful to examine it in more detail. The Hadamard parametrix $H$ has the following expansion in a small geodesically convex neighborhood containing the points $x$ and $y$:
$$H(x, y) = \frac{1}{8\pi^2} \left( \frac{u(x, y)}{\sigma_\epsilon(x, y)} + v(x, y) \log \frac{\sigma_\epsilon(x, y)}{\mu^2} \right), \tag{4}$$
where $u$ and $v$ are certain smooth functions that depend only on the geometry of the spacetime $(M, g)$ once the equations of motion are chosen, and $\sigma_\epsilon = \sigma + i\epsilon(T(x) - T(y)) + \epsilon^2/2$, where $T$ is any time function [KW91] and $\sigma$ is half of the squared geodesic distance between $x$ and $y$, taken with sign. We shall give further details on the local construction of $u$ and $v$ in the Appendix. The Hadamard parametrix depends on the dimensional parameter $\mu$; we shall fix this parameter once and for all, for every spacetime in CLoc. Finally, we would like to analyze the difference of the singular structures in the sense of the following lemma.

Lemma 2.5. Let $\psi$ be a morphism in CLoc between the two elements $(M, g)$ and $(M', g')$. Let $H$ and $H'$ be the Hadamard parametrices on two geodesically complete neighborhoods, $O$ of $M$ and $O'$ of $\psi(M)$ respectively, such that $O' \subset \psi(O)$. Then
$$H\big( (\psi_*^{(3)})^{-1} f,\ (\psi_*^{(3)})^{-1} g \big) - H'(f, g) = \int_{O' \times O'} f(x)\, A(x, y)\, g(y)\, d\mu_{g'}(x)\, d\mu_{g'}(y),$$
where $A(x, y)$ is a smooth symmetric function on $O' \times O'$ and $f, g \in C_0^\infty(O')$. Furthermore, in general it is non-vanishing, and its coinciding point limit is
$$A(x, x) = \frac{1}{(12\pi)^2} \left( R_g\big(\psi^{-1}(x)\big) - \Omega_\psi^2(x)\, R_{g'}(x) \right),$$
where $\Omega_\psi$ is the conformal factor associated to $\psi$.

Proof. The distribution $H$ satisfies the microlocal spectral condition and its antisymmetric part corresponds to the causal propagator; hence, also because of the preceding lemma,
$$H^\psi(f, g) := H\big( (\psi_*^{(3)})^{-1} f,\ (\psi_*^{(3)})^{-1} g \big)$$
is of Hadamard type in $(\psi(M), g')$ too. From this property it is clear that $H^\psi - H'$ must be a smooth function. In Eq. (11) of the Appendix we show that $A(x, x)$ has precisely the given form; hence, since $A(x, y)$ is a smooth function, it cannot vanish in general. Finally, because of Lemma 2.3, the causal propagator in $(M, g)$ is mapped to the causal propagator in $(\psi(M), g')$. Since the antisymmetric part of $H$ corresponds to the causal propagator, it follows that the antisymmetric part of $A$ must vanish.

We would like to remark that $A(x, x)$ does not depend upon the dimensional parameter $\mu$ present in the short distance expansion of the Hadamard parametrix (4). Moreover, a change of the length scale $\mu$ does not affect the coinciding point limit of the $v$ coefficient.
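The numerical coefficient of $A(x, x)$ can be cross-checked against the Appendix. As a brief sketch, assuming only the overall prefactor $1/(8\pi^2)$ of the parametrix (4) and the factor $1/18$ that multiplies the curvature terms on the right-hand side of Eq. (11):

```latex
\frac{1}{8\pi^2}\cdot\frac{1}{18} \;=\; \frac{1}{144\,\pi^2} \;=\; \frac{1}{(12\pi)^2}.
```

This is the coefficient quoted for the coinciding point limit in Lemma 2.5.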
Proposition 2.1. Consider a normal neighborhood $O$ and two four-dimensional Hadamard parametrices $H$ and $H'$ defined on $O$ that differ by the length scales $\mu$ and $\mu'$. Then
$$\lim_{x \to y} \big( H(x, y) - H'(x, y) \big) = 0.$$

Proof. The difference $H(x, y) - H'(x, y)$ is a smooth function and it is proportional to $(\log \mu - \log \mu')\, v(x, y)$; hence the proposition follows from the analysis of the coinciding point limit of the $v$ coefficient performed in the Appendix, where it is shown that $v(x, x)$ vanishes.

This result does not hold in general for the coinciding point limit of the derivatives of fields, or in dimensions other than four, as we shall briefly discuss later. We would like to stress that this is an important issue for having conformally covariant Wick powers.
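The mechanism behind Proposition 2.1 can be spelled out directly from the expansion (4). The following is a sketch under the assumption that the two parametrices share the same coefficients $u$, $v$ and the same regularized distance $\sigma_\epsilon$, and differ only through the scale, $\mu$ for $H$ and $\mu'$ for $H'$:

```latex
\begin{aligned}
H(x,y) - H'(x,y)
  &= \frac{1}{8\pi^2}\, v(x,y)
     \left[ \log\frac{\sigma_\epsilon(x,y)}{\mu^2}
          - \log\frac{\sigma_\epsilon(x,y)}{\mu'^2} \right] \\
  &= \frac{1}{8\pi^2}\, v(x,y)\, \log\frac{\mu'^2}{\mu^2}
   = -\frac{1}{4\pi^2}\, (\log\mu - \log\mu')\, v(x,y), \\
\lim_{x\to y} \bigl( H(x,y) - H'(x,y) \bigr)
  &= -\frac{1}{4\pi^2}\, (\log\mu - \log\mu')\, v(y,y) = 0 .
\end{aligned}
```

The $u/\sigma_\epsilon$ terms cancel identically, so the whole $\mu$ dependence sits in the logarithmic term, and the last equality uses the four-dimensional fact $v(y, y) = 0$ established in the Appendix.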
2.2. Quantization as a functor. In [BFV03] it was shown that the quantization in terms of the $C^*$-algebras $A(M, g)$ generated by the Weyl operators of the Klein Gordon field corresponds to a functor $A$ from the category Loc of isometrically related manifolds to the category Alg. We would like to briefly show that, in the case of massless conformally coupled Klein Gordon fields, the functor $A$ can be extended to a functor between CLoc and Alg as described in Sect. 1.2. The difference between what we are considering here and the previously given picture [BFV03] is that in the definition of CLoc we have also admitted conformal embeddings as morphisms between the elements of Loc. Hence we simply have to check the covariance of $A$ with respect to the larger group of morphisms of CLoc. In the sense of the discussion presented in Sect. 1.2 we have to show that, if $\psi : (M, g) \to (M', g')$ is a conformal embedding in CLoc, there exists a corresponding morphism $\alpha_\psi : A(M, g) \to A(M', g')$ such that $A(\psi(M, g)) = \alpha_\psi(A(M, g))$. We shall skip many details that can be easily reconstructed from the results of [Di80,BFV03]. For our purpose it will be sufficient to know that the morphism $\alpha_\psi$ can be straightforwardly constructed once a symplectic map between the two symplectic spaces $(S(M, g), \sigma)$ and $(S(M', g'), \sigma')$ is given. To be more precise, let us analyze the construction of $(S(M, g), \sigma)$. Using the causal propagator and the differential operator defined above we can construct the set of wave functions $S$ as follows:
$$S(M, g) := E\big( C_0^\infty(M) \big).$$
$S(M, g)$ can be equipped with a symplectic form defined in the following way. Let $\varphi_f = E f$; then, since the spacetime $(M, g)$ is globally hyperbolic, consider the following non-degenerate symplectic form:
$$\sigma(\varphi_f, \varphi_g) = \int_\Sigma \left( \varphi_f\, \partial_a \varphi_g - \varphi_g\, \partial_a \varphi_f \right) n^a\, d\mu_\Sigma = \int_M f\, (E g)\, d\mu_g,$$
where $\Sigma$ is a Cauchy surface; moreover, $\sigma$ is independent of the particular Cauchy surface $\Sigma$ chosen. Furthermore, $n$ is the unit vector normal to $\Sigma$, $\mu_g$ is the volume element induced by the metric $g$, and $\mu_\Sigma$ is the volume element restricted to the hypersurface $\Sigma$. We already know that for every isometric embedding $\psi_0 : (M, g) \to (M', g')$ a symplectic map exists from $(S(M, g), \sigma)$ to $(S(M', g'), \sigma')$. A similar symplectic map exists
also for a conformal embedding $\psi : (M, g) \to (M', g')$. In fact, from the transformation properties of the causal propagator seen in Lemma 2.3, we have that for every $\varphi_1$ and $\varphi_2$ in $S(M, g)$,
$$\sigma'\big( \psi_*^{(1)}(\varphi_1),\ \psi_*^{(1)}(\varphi_2) \big) = \sigma(\varphi_1, \varphi_2).$$
It is now a simple task to construct the morphism $\alpha_\psi$ from $A(M, g)$ to $A(M', g')$ in the same way as in [BFV03]. Hence $A$ can be promoted to a conformally covariant functor.

3. Fields as Natural Transformations

In order to build more interesting examples it is important to have an algebra of local observables that encompasses more complicated objects, such as the powers of fields and the components of the stress tensor. Here we shall recall the construction of the algebra of fields as presented in the book [Wa94], and then show that the scalar field is a natural transformation between two functors.

3.1. The CCR algebra. We would like to follow the algebraic approach, so the starting point is the abstract $*$-algebra $A(M, g)$ generated by the identity $I$ and the smeared quantum fields $\varphi(f)$, where $f$ is a test function (a smooth compactly supported function, contained in the set denoted by $D(M)$). Furthermore, the abstract fields $\varphi(f)$ must satisfy the following requirements:
(i) $\varphi(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 \varphi(f_1) + \alpha_2 \varphi(f_2)$, where $\alpha_1, \alpha_2 \in \mathbb{C}$;
(ii) $\varphi(f)^* = \varphi(\bar f)$;
(iii) $\varphi(P_g f) = 0$;
(iv) $\varphi(f_1)\varphi(f_2) - \varphi(f_2)\varphi(f_1) = i E(f_1, f_2)\, I$,
where $E$ is the causal propagator of the massless conformally coupled Klein Gordon field, whose equation of motion is given by the operator $P_g$ in (1). The algebras $A(M, g)$, together with the algebraic morphisms, form a category TAlg. We would like to show that the abstract field $\varphi$ can be interpreted as a natural transformation between that category and Test3.

Proposition 3.1. $A$ is a functor between the two categories Test3 and TAlg. In fact, to every $(M, g)$ it is possible to associate $A(M, g)$, and for $\psi$ a conformal embedding between $(M, g)$ and $(M', g')$, $A(\psi)$ is defined as the morphism that acts on the fields in the following way:
$$\alpha_\psi\big( \varphi(f_1) \cdots \varphi(f_n) \big) := \varphi'\big( \psi_*^{(3)}(f_1) \big) \cdots \varphi'\big( \psi_*^{(3)}(f_n) \big), \tag{5}$$
where $\varphi$ and $\varphi'$ are the fields that generate $A(M, g)$ and $A(M', g')$ respectively.

The proof of the present proposition follows from the definitions given above, from the transformation rules of the causal propagator and from the composition rules of the morphisms between two algebras. Moreover, exploiting the definition of $A$ and $D$ and using (5) for one single field, we also have the following proposition:
Proposition 3.2. The scalar field $\varphi$ is a natural transformation between the category Test3 and TAlg, and hence it is a locally covariant conformal field of weight 1.

The difference in the weights between the field and the test functions can be understood from the heuristic representation of the field
$$\varphi(f) := \int_M \varphi(x)\, f(x)\, d\mu_g,$$
by considering the transformation rule enjoyed by the measure $\mu_g$ under conformal transformations.
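The weight counting can be made explicit with a short heuristic computation. The sketch below rests on a convention that is assumed here rather than restated from the text: an object of weight $\lambda$ is rescaled by $\Omega^{-\lambda}$ under $\psi$, and the four-dimensional volume element rescales as $d\mu_{g'} = \Omega^{4}\, d\mu_g$. Primed quantities live on $(M', g')$ and $x' = \psi(x)$.

```latex
\varphi'(x')\, f'(x')\, d\mu_{g'}(x')
  = \bigl(\Omega^{-1}\varphi(x)\bigr)\,
    \bigl(\Omega^{-3} f(x)\bigr)\,
    \bigl(\Omega^{4}\, d\mu_g(x)\bigr)
  = \varphi(x)\, f(x)\, d\mu_g(x),
\qquad -1 - 3 + 4 = 0 .
```

Under these assumptions the weight 1 of the field and the weight 3 of the test function add up to the spacetime dimension 4, so the powers of $\Omega$ cancel against the rescaling of the measure and the smeared field carries no residual conformal factor.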
3.2. Extension to the local algebra of fields and Wick monomials. As shown in [DF01,HW01], in order to study the Wick monomials we have to extend the algebra $A(M, g)$ to a bigger one, which we shall denote by $W(M, g)$. In this respect we follow the notation and construction introduced in [HW01], referring to that paper for technical details. Essentially, the normal ordered fields, when evaluated on states satisfying the microlocal spectral condition, turn out to be distributions with certain wave front sets. We can then smear them with more singular objects, namely compactly supported distributions characterized by a particular wave front set. Since the normal ordering prescription plays a distinguished role in this construction, we would like to recall its definition. The normal ordering with respect to the Hadamard singularity $H$ (where a unit of measure $\mu$ is chosen) is defined as follows:
$$:\varphi(x_1) \cdots \varphi(x_n):_H \; := \frac{1}{i^n} \frac{\delta^n}{\delta f(x_1) \cdots \delta f(x_n)} \exp\left( \frac{1}{2} H(f \otimes f) + i \varphi(f) \right) \bigg|_{f=0}. \tag{6}$$
The algebra $A(M, g)$ can now be enlarged by allowing the smearing by objects more singular than smooth functions in $C_0^\infty(M^n)$. In particular, let us consider the following set:
$$T^n(M) := \big\{ t \in D'(M^n) \;:\; t \text{ symmetric},\ \operatorname{supp}(t) \text{ compact},\ \mathrm{WF}(t) \cap (V_+ \cup V_-) = \emptyset \big\},$$
where $V_\pm$ are the forward or backward light cones in $T^*M$ whose tip $x$ is in $M$. The requirement on the wave front set of the elements of $T^n(M)$ is introduced in such a way that fields smeared by a distribution $t \in T^n(M)$ can be unambiguously tested on states satisfying the microlocal spectral condition. For a more complete analysis of the subject we refer to the papers [BF00,HW01]. The algebra $W(M, g)$ can now be defined as the $*$-algebra generated by the elements defined as in (6), smeared by $t \in T^n(M)$.

Remark. Combining the results in [BF00,HW01], it can be shown that the algebra constructed in that way is independent of the choice of the Hadamard two-point function $H$. In other words, substituting $H$ in the definition of the normal ordering with another two-point distribution with the same singular structure gives a set of generators of an isomorphic algebra. Part of this freedom is encoded in the choice of the unit length $\mu$. It is in any case possible to add a smooth symmetric function to $H$ without really changing the $*$-algebra $W(M, g)$.

We are now ready to study the Wick monomials, which are defined as the normal ordered products of fields smeared by some special test distributions. More precisely, suppose
there is a smooth function $f$ with compact support, $f \in C_0^\infty(M)$; then a Wick monomial $\varphi^n(f)$ of order $n$ can be defined as follows:
$$:\varphi^n:_H(f) := \int :\varphi(x_1) \cdots \varphi(x_n):_H \; t_f(x_1, \dots, x_n)\, d\mu_g(x_1) \cdots d\mu_g(x_n), \tag{7}$$
where $t_f(x_1, \dots, x_n)$ is $f(x_1) \Delta(x_1, \dots, x_n)$ and $\Delta$ is the diagonal distribution $\Delta(x_1, \dots, x_n) = \delta(x_1, x_2) \cdots \delta(x_{n-1}, x_n)$. The Wick powers defined in that way satisfy certain interesting properties; in particular, they turn out to be locally covariant fields in the sense of [BFV03]. Another important property exhibited by $:\varphi^k:_H$ is the almost homogeneous scaling under rigid dilations, where the non-homogeneous term is logarithmic in the scaling parameter. Hollands and Wald have used an axiomatic approach, i.e., they have promoted these and other physically motivated properties to a set of axioms that every reasonable definition of Wick powers should satisfy. In [HW01] they have furthermore shown that the previously given definition of $\varphi^k$ is the unique one that satisfies the axioms, up to the following renormalization freedom:
$$\tilde\varphi^k(x) = \varphi^k(x) + \sum_{i=1}^{k-2} C_i(x)\, \varphi^i(x), \tag{8}$$
where the $C_i(x)$ are classical fields depending on the parameters of the Lagrangian and on the metric tensor; furthermore, it is required that the $C_i$ scale homogeneously under rigid dilations while the total field $\varphi^k$ scales almost homogeneously, where the non-homogeneous term must be of logarithmic type in the scaling parameter. Hence, it is not possible to get rid of this non-homogeneous logarithmic scaling behavior by a suitable choice of the renormalization constants $C_i(x)$.
3.3. Wick monomials and conformal covariance. It is known that the Wick monomials previously defined are locally covariant quantum fields in the sense of the analysis performed in [BFV03]. Here we would like to see that these fields are also locally conformally covariant. Let us start our discussion by analyzing the simplest case, $\varphi^2(x)$. Here the freedom (8) consists of the following redefinition:
$$\varphi^2_\alpha(x) = \; :\varphi^2:_H(x) + \alpha R(x), \tag{9}$$
where $R$ is the scalar curvature and $\alpha$ is a constant. We would like to stress that this freedom is not included in the choice of a particular length scale $\mu$ in the Hadamard parametrix; in fact, as discussed in Proposition 2.1, a change of the length scale $\mu$ does not affect the expectation value of $:\varphi^2:_H$, while changing the parameter $\alpha$ in (9) modifies its expectation value. An interesting observation is that both $:\varphi^2:_H(x)$ and $\varphi^2_\alpha(x)$ scale homogeneously under rigid dilations, as can be seen from the transformation rules of the scalar curvature and the Hadamard singularity. Let $H_g$ be the Hadamard singularity in the spacetime $(M, g)$; usually, under a rigid scaling $\lambda$, it should transform in the following way:
$$\lambda^{-2} H_{\lambda^{-2} g}(x, y) = H_g(x, y) + v_g(x, y) \log \lambda^2;$$
notice that in the case under consideration $v_g(x, x) = 0$, as can be seen from the Appendix. Furthermore, $R_g$ transforms homogeneously under rigid rescaling too, $\lambda^{-2} R_{\lambda^{-2} g} = R_g$; hence the Wick monomial (9) transforms homogeneously under rigid dilations. The second step in the analysis consists of testing $\varphi^2_\alpha$ under local transformations. Let $\psi$ be a conformal transformation from $(M, g)$ to $(M, g')$; then, taking into account the transformation rule of the Hadamard singularity $H$ as given in the Appendix, we have
$$\varphi'^2_\alpha\big( \psi_*^{(2)}(f) \big) = \varphi^2_\alpha(f) - \left( \frac{1}{(12\pi)^2} + \alpha \right) \int_M \left( R_g - (\Omega^2 \circ \psi)\, R_{g'} \circ \psi \right) f\, d\mu_g,$$
where $\varphi^2_\alpha$ is the field on $(M, g)$ while $\varphi'^2_\alpha$ is the one on $(M, g')$. The particular choice $\alpha = -1/(12\pi)^2$ makes the field conformally covariant. We would like to see whether this is the case also for more involved fields. Namely, we shall look for a particular redefinition of the Wick monomials, by a suitable choice of the renormalization constants $C_i(x)$ in (8), that gets rid of the non-homogeneous behavior which is in general present in such cases. That this is possible is shown by the following theorem:

Theorem 3.1. Let $\varphi^k$ be a Wick power as given in (7). There is a non-trivial choice of the renormalization constants $C_i$ in (8) that makes $\varphi^k$ a conformal locally covariant field with weight $k$ in the sense of Definition 1.3.

Proof. The proof is constructive. Let us consider the following smooth function,
$$B(x, y) = \frac{1}{2(12\pi)^2} \big( R_g(x) + R_g(y) \big),$$
and then redefine the Wick monomials in the following way: $\varphi^k := \; :\varphi^k:_{H+B}$, where
$$:\varphi(x_1) \cdots \varphi(x_k):_{H+B} \; = \frac{1}{i^k} \frac{\delta^k}{\delta f(x_1) \cdots \delta f(x_k)} \exp\left( \frac{1}{2} (H + B)(f \otimes f) + i \varphi(f) \right) \bigg|_{f=0}.$$
The algebra generated using this new normal ordering is isomorphic to $W(M, g)$; the proof is similar to that of the independence of the state given in [HW01]. Furthermore, it can be shown that $:\varphi^k:_{H+B}$ is related to $:\varphi^k:_H$ by a choice of the renormalization constants as in (8). The difficult part is to show that the Wick monomials defined with respect to the new normal ordering satisfy the covariance condition with respect to the conformal embedding $\psi : (M, g) \to (M', g')$ in CLoc and its corresponding algebraic morphism $\alpha_\psi$ defined as in (5),
$$\alpha_\psi\big( :\varphi^k:_{H+B}(f) \big) - :\varphi'^k:_{H'+B'}\big( \psi_*^{(4-k)}(f) \big) = 0.$$
To this end, consider a general element $W$ of the Wick expansion of $:\varphi^k:_{H+B}(f)$,
$$W(x_1, \dots, x_k) := \int \varphi(x_1) \cdots \varphi(x_n)\, (H + B)(x_{n+1}, x_{n+2}) \cdots (H + B)(x_{k-1}, x_k)\; t_f(x_1, \dots, x_k)\, d\mu^1_g \cdots d\mu^k_g, \tag{10}$$
where $t_f(x_1, \dots, x_k) = f(x_1) \Delta(x_1, \dots, x_k)$. We would like to show that, on $\psi(M)^k$,
$$S(f) := \alpha_\psi\big( W(t_f) \big) - W'(t_{f'}) = 0,$$
where $W$ is as in (10) and $W'(x_1, \dots, x_k)$ is the corresponding term of the expansion of $:\varphi'^k:_{H'+B'}(f')$ on $(\psi(M), g')$ with $f' := \psi_*^{(4-k)}(f)$. First of all, notice that $\alpha_\psi$ has no action on $(H + B)$, while $\alpha_\psi(\varphi(x)) = \Omega^{-1}(\psi(x))\, \varphi'(\psi(x))$. Hence
$$S(f) := \int \varphi'(x_1) \cdots \varphi'(x_n) \times \Big[ \Omega^{-1}(x_{n+1}) \cdots \Omega^{-1}(x_k)\, (H + B)(x_{n+1}, x_{n+2}) \cdots (H + B)(x_{k-1}, x_k) - (H' + B')(x_{n+1}, x_{n+2}) \cdots (H' + B')(x_{k-1}, x_k) \Big] \times t_{f'}(x_1, \dots, x_k)\, d\mu^1_{g'} \cdots d\mu^k_{g'},$$
where we have used the fact that $f(x_1) \Delta(x_1, x_2) = f(x_2) \Delta(x_1, x_2)$. The proof can be concluded using the analysis presented in the Appendix (11): for $y$ in a geodesically convex neighborhood $O$ of the point $x$ in $\psi(M)$, we have that
$$\lim_{y \to x} \left[ \frac{1}{\Omega(x)}\, (H + B)\big( \psi^{-1}(x), \psi^{-1}(y) \big)\, \frac{1}{\Omega(y)} - (H' + B')(x, y) \right] = 0.$$
With this observation, the proof can be concluded.
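For $k = 2$ the effect of the function $B$ can be made explicit. The following is a sketch that uses only the definition of the normal ordering with respect to $H + B$, the symmetry of $B$, and the fact that $B(f \otimes f)$ is a number rather than an operator, so that the generating functional factorizes:

```latex
\begin{aligned}
\exp\!\Bigl( \tfrac12 (H+B)(f\otimes f) + i\varphi(f) \Bigr)
  &= \exp\!\Bigl( \tfrac12 B(f\otimes f) \Bigr)\,
     \exp\!\Bigl( \tfrac12 H(f\otimes f) + i\varphi(f) \Bigr), \\
{:}\varphi(x_1)\varphi(x_2){:}_{H+B}
  &= {:}\varphi(x_1)\varphi(x_2){:}_{H} \;-\; B(x_1,x_2), \\
B(x,x)
  &= \frac{1}{2(12\pi)^2}\bigl( R_g(x) + R_g(x) \bigr)
   = \frac{R_g(x)}{(12\pi)^2}, \\
{:}\varphi^2{:}_{H+B}(x)
  &= {:}\varphi^2{:}_{H}(x) \;-\; \frac{R_g(x)}{(12\pi)^2}.
\end{aligned}
```

The second line follows by differentiating the product twice at $f = 0$ and dividing by $i^2$; the first derivatives of the $B$-dependent factor vanish there. The result coincides with $\varphi^2_\alpha$ of (9) for $\alpha = -1/(12\pi)^2$, the value singled out in the discussion of $\varphi^2$.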
The function $B$ does not depend on the length scale $\mu$ present in the Hadamard parametrix. Hence, even if the regularization procedure depends on that length scale, it does not appear explicitly in the Wick powers. Once again, this is an unexpected result that permits the construction of an infinite series of conformally covariant fields in the four-dimensional case. On the other hand, for a general Wick monomial that also contains derivatives, the length scale $\mu$ becomes important, in the sense that it affects the expectation value of such a monomial.

3.4. Extension of the results to different dimensions. In this subsection we would like to emphasize the difficulties that appear in a possible generalization of the results found above to spacetimes with a general dimension $d$ different from four. We shall discuss some aspects of the two-dimensional case and we shall stress the differences with the four-dimensional case in particular. We start by recalling that, in analogy with (1), in a $d$-dimensional spacetime $(M_d, g_d)$ the conformally invariant fundamental scalar field $\varphi_d$ has to satisfy the following equation:
$$-\Box \varphi_d + \frac{d - 2}{4(d - 1)}\, R\, \varphi_d = 0,$$
where $\Box$ is the d'Alembert operator and $R$ the scalar curvature of $(M_d, g_d)$. Following the discussion presented above for the four-dimensional case, and Propositions 3.1 and 3.2 in particular, it is a straightforward task to construct the CCR
algebra of this field and to interpret it as a functor. Similarly, the conformal covariance of the microlocal spectral condition in $d$ dimensions can be shown to hold along the lines of Lemma 2.4. The difficulties arise in considering the extended algebra of fields $W_d$, as done in four dimensions in Subsect. 3.2. This becomes manifest in the analysis of the transformation rules enjoyed by the Wick powers $\varphi_d^k$ under conformal embeddings. To illustrate this fact and to highlight the difference, it is helpful to consider once again the particular field $\varphi_d^2$, and the Hadamard parametrix $H_d(x, y)$ in particular. In the even $d$-dimensional case, similarly to (4), the Hadamard parametrix takes the form
$$H_d(x, y) = C_d \left( \frac{u_d(x, y)}{\sigma_\epsilon^{d/2 - 1}} + v_d(x, y) \log \frac{\sigma_\epsilon}{\mu^2} \right),$$
where $u_d$ and $v_d$ are again smooth functions, $\sigma_\epsilon$ is half of the squared geodesic distance taken with sign and regularized as in (4), and $C_d$ is a dimension-dependent constant. For a detailed analysis of the Hadamard parametrix we refer to the paper [Mo03] and to the references therein. Notice that in the even-dimensional case the Hadamard parametrix contains a length scale in the logarithmic part, and this length scale breaks the conformal covariance already at the level of $\varphi^2$. To see this explicitly, consider two different Hadamard parametrices $H_d$ and $H'_d$ constructed with $\mu$ and $\mu'$ respectively; the difference between the two is simply the smooth function
$$2 C_d\, v_d(x, y) \log \frac{\mu'}{\mu},$$
and the change in the expectation value of $:\varphi_d^2:_{H_d}$ is $2 C_d\, v_d(x, x) \log(\mu'/\mu)$. As already discussed above, in four dimensions it happens that $v_4(x, x) = 0$ and hence a change of $\mu$ has no effect on $:\varphi_4^2:_{H_4}$. Unfortunately this is not a general fact, and usually $v_d(x, x) \neq 0$. This computation is particularly easy in two and six dimensions. Since $v_d(x, x) \neq 0$, the field $\varphi_d^2$ transforms non-homogeneously under rigid dilations, with a term logarithmic in the scaling parameter $\lambda$ appearing in the non-homogeneous part. Following the discussion of Hollands and Wald, it is then not possible to cancel this logarithmic term by a judicious choice of other renormalization constants. The same behavior is shown by the other Wick powers $\varphi_d^k$. On the other hand, in two-dimensional conformal field theories it is known that the fields $\varphi^k$ are only quasi-primary but not primary, and hence they cannot be thought of as natural transformations in the sense discussed in the present paper. As a final comment we would like to stress that the study of conformal covariance in general dimensions requires a detailed case by case analysis of the Hadamard coefficient $v_d$ that is beyond the scope of the present paper.

4. Final Comments

We have generalized the notion of generally covariant fields to encompass conformally covariant transformations. This was done by exploiting category theory in a similar way as in [BFV03]. We have furthermore analyzed the case of the conformally coupled massless Klein Gordon field, studying its Wick powers. In particular, we have shown that, by using the renormalization freedom in a suitable way, it is possible to get rid of the non-homogeneous part carried by the conformal transformation of those fields. In a certain sense the larger group of covariance reduces the renormalization freedom. The
situation presented here is different from the one given in [BFV03], due to the presence of the weights in front of the fields. It is indeed not possible to linearly combine fields with different weights without breaking the conformal covariance, unless position-dependent coupling constants are taken into account. Before concluding the discussion we would like to give some simple examples of other types of fields that fit into the presented framework. As an example of a conformally covariant field with non-constant couplings, consider
$$\lambda_1 :\varphi^4:_{H+B} + (W^2)^{1/2}\, \lambda_2 :\varphi^2:_{H+B} + W^2 \lambda_3,$$
where $\lambda_1, \lambda_2, \lambda_3$ are constants and $W^2$ is the square of the Weyl tensor $W_{abc}{}^{d}$, namely $W^2 = W_{abc}{}^{d}\, W^{abc}{}_{d}$. Such a field is a conformally covariant field in the sense of Definition 1.3 and its weight is 4. Other interesting cases arise when fields containing covariant derivatives are taken into account. Usually fields of that kind are more complicated, and it is difficult to draw general conclusions because of the presence of quantum anomalies, but also because of the non-homogeneous transformation rule enjoyed by the covariant derivatives. Nevertheless, in that case too it is possible to construct fields that are conformally covariant, provided a renormalization constant is suitably chosen. As an example of these fields consider − : ∇a ϕϕ : H +
Rg ∇a : ϕ 2 : H , 12
notice that their classical counterparts are quite trivial since they vanish. On the other hand, in that case too there is a renormalization freedom of the form (8); we can add to the field a homogeneously scaling constant $C$. If $C$ is chosen as $C(x) = -2\nabla_a v_1(x, x)$ (see footnote 4), that field vanishes also quantum mechanically and, even if it is a trivial field, it can be interpreted as a conformally covariant field in the sense of Definition 1.3.

Acknowledgement. I would like to thank Romeo Brunetti, Claudio Dappiaggi, Klaus Fredenhagen, Valter Moretti and Karl-Henning Rehren for useful discussions, suggestions and comments on the topic. This work has been supported by the German DFG Research Program SFB 676.
Footnote 4: For technical details we refer to [Mo03,HW05].

A. Some Technical Computations

A.1. Transport equations. The coefficients $u$ and $v$ given in the Hadamard parametrix (4) are symmetric smooth functions [Mo00] that satisfy the following relations:
$$2 \nabla\sigma(x, y) \cdot \nabla u(x, y) + (\Box_x \sigma - 4)\, u(x, y) = 0, \qquad -P_x v = 0.$$
Moreover, the coefficient $u$ is twice the square root of the van Vleck Morette determinant, $u = 2\Delta^{1/2}$; for the definition and details see [DB60,Fr75,Fu89,Ta89]. Furthermore, on a geodesically complete neighborhood, the function $v$ can be expanded as follows:
$$v = \sum_{n=0}^{p} v_n \sigma^n + O(\sigma^{p+1}).$$
We have truncated the series at some order $p$ because, in general, the whole series does not converge unless the coefficients of the metric are analytic functions. Furthermore,
the coefficients $v_n$ can be found using the following two recursive relations, valid for $n > 0$:
$$2\, g(x)(\nabla_x \sigma, \nabla_x v_0) + (\Box_x \sigma(x, y) - 2)\, v_0 = P_{g(x)} u(x, y),$$
$$2n\, g(x)(\nabla_x \sigma, \nabla_x v_n) + n\, (\Box_x \sigma(x, y) + 2n - 2)\, v_n = P_{g(x)} v_{n-1}(x, y).$$

A.2. Transformation laws for the Hadamard coefficients. Consider a conformal transformation $\psi : (M, g) \to (M, g')$ with conformal factor $\Omega$. Let $H$ and $H'$ be the Hadamard singularities, as given in (4), on $(M, g)$ and $(M, g')$ respectively. For $y$ in a geodesically complete neighborhood of the point $x$, we would like to compute the coinciding point limit of the subtraction
$$\frac{1}{\Omega(x)}\, H(x, y)\, \frac{1}{\Omega(y)} - H'(x, y).$$
Because of Lemma 2.4 we know that the subtraction is a smooth function, hence we can compute the following limit directly:
$$\lim_{y \to x} \left[ \frac{u(x, y)}{\Omega(x)\, \sigma(x, y)\, \Omega(y)} + \frac{v(x, y)}{\Omega(x)\, \Omega(y)} \log \sigma - \frac{u'}{\sigma'} - v' \log \sigma' \right] = \frac{R_g(x)}{18\, \Omega^2(x)} - \frac{R_{g'}(x)}{18}, \tag{11}$$
where we have used the following expansions around $x$. Let $\sigma^\mu = \nabla_x^\mu \sigma$ and $L_\mu := \nabla_\mu \log \Omega$; then we can write the Taylor expansion
$$\Omega(y) = \Omega(x) \left[ 1 - L_\mu \sigma^\mu + \frac{1}{2} \left( L_{\mu\nu} + L_\mu L_\nu \right) \sigma^\mu \sigma^\nu \right] + O(\sigma^{3/2}).$$
Furthermore, using the notation of the book of Fulling [Fu89],
$$\sigma'(x, y) = \Omega^2(x)\, \sigma(x, y) \left[ 1 - L_\mu \sigma^\mu - \frac{1}{12} \left( -2\sigma L_\mu L^\mu + \left( 8 L_\mu L_\nu + 4 L_{\mu\nu} \right) \sigma^\mu \sigma^\nu \right) \right] + O(\sigma^{5/2}),$$
and the short distance analysis of the van Vleck Morette determinant [DB60] gives
$$\Delta^{1/2} = 1 - \frac{1}{12} R_{\mu\nu} \sigma^\mu \sigma^\nu + O(\sigma^2). \tag{12}$$
Notice that, in the case under investigation, because of the expansion (12) and the recursive relations given before, $v_0(x, x) = v(x, x) = 0$. Plugging the expansions written above into the previous subtraction and knowing that $v(x, x) = 0$, (11) follows.

References

[BGP07] Bär, C., Ginoux, N., Pfäffle, F.: Wave equations on Lorentzian manifolds and quantization. Zuerich: Eur. Math. Soc., 2007, 194 p
[Br04] Brunetti, R.: Locally Covariant Quantum Field Theories. Contribution to the Proceedings of the Symposium “Rigorous Quantum Field Theory”, in honor of the 70th birthday of Prof. Jacques Bros (SPhT - CEA-Saclay, Paris, France, 19-21 July 2004). Progress in Mathematics 251, Basel-Boston: Birkhäuser, (2007), pp. 39–47
[BF00] Brunetti, R., Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds. Commun. Math. Phys. 208, 623 (2000)
[BF06] Brunetti, R., Fredenhagen, K.: Towards a background independent formulation of perturbative quantum gravity. Proceedings of Workshop on Mathematical and Physical Aspects of Quantum Gravity, available at http://arxiv.org/abs/gr-qc/06030793, 2006
[BFK96] Brunetti, R., Fredenhagen, K., Köhler, M.: The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes. Commun. Math. Phys. 180, 633 (1996)
[BFV03] Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle: A new paradigm for local quantum physics. Commun. Math. Phys. 237, 31 (2003)
[BOR02] Buchholz, D., Ojima, I., Roos, H.: Thermodynamic properties of non-equilibrium states in quantum field theory. Ann. Phys. 297, 219 (2002)
[BS07] Buchholz, D., Schlemmer, J.: Local temperature in curved spacetime. Class. Quant. Grav. 24, F25 (2007)
[DB60] DeWitt, B.S., Brehme, R.W.: Radiation damping in a gravitational field. Ann. Phys. 9, 220 (1960)
[DMS97] Di Francesco, P., Mathieu, P., Senechal, D.: Conformal Field Theory. New York: Springer, 1997, 890 p
[Di80] Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219 (1980)
[DF01] Duetsch, M., Fredenhagen, K.: Algebraic quantum field theory, perturbation theory, and the loop expansion. Commun. Math. Phys. 219, 5 (2001)
[Fe07] Fewster, C.J.: Quantum energy inequalities and local covariance. II: Categorical formulation. Gen. Rel. Grav. 39, 1855 (2007)
[Fr75] Friedlander, F.G.: The Wave Equation on a Curved Space-Time. Cambridge University Press, Cambridge, 1975
[Fu89] Fulling, S.A.: Aspects of Quantum Field Theory in Curved Space-Time. Cambridge University Press, Cambridge, 1989
[HW01] Hollands, S., Wald, R.M.: Local Wick polynomials and time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 223, 289 (2001)
[HW02] Hollands, S., Wald, R.M.: Existence of local covariant time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 231, 309 (2002)
[HW05] Hollands, S., Wald, R.M.: Conservation of the stress tensor in interacting quantum field theory in curved spacetimes. Rev. Math. Phys. 17, 227 (2005)
[KW91] Kay, B.S., Wald, R.M.: Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon. Phys. Rept. 207, 49 (1991)
[Mo00] Moretti, V.: Proof of the symmetry of the off-diagonal Hadamard/Seeley-DeWitt's coefficients in C(infinity) Lorentzian manifolds by a ‘local Wick rotation’. Commun. Math. Phys. 212, 165 (2000)
[Mo03] Moretti, V.: Comments on the stress-energy tensor operator in curved spacetime. Commun. Math. Phys. 232, 189 (2003)
[Ra96] Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529 (1996)
[SV08] Schlemmer, J., Verch, R.: Local thermal equilibrium states and quantum energy inequalities. Ann. Henri Poincaré 9, 945–978 (2008)
[Ta89] Tadaki, S.: Hadamard regularization and conformal transformation. Prog. Theor. Phys. 81, 891 (1989)
[Wa84] Wald, R.M.: General Relativity. University of Chicago Press, Chicago, 1984
[Wa94] Wald, R.M.: Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. University of Chicago Press, Chicago, 1994
Communicated by Y. Kawahigashi
Commun. Math. Phys. 288, 1137–1179 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0785-5
Communications in
Mathematical Physics
Spectral Extension of the Quantum Group Cotangent Bundle
Alexei P. Isaev1, Pavel Pyatov1,2,3
1 Bogoliubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research,
141980 Dubna, Moscow Region, Russia. E-mail:
[email protected];
[email protected]
2 Max Planck Institute for Mathematics, Vivatsgasse 7, D-53111 Bonn, Germany
3 Department of Mathematics, Higher School of Economics, Myasnitskaya 20, Moscow, Russia
Received: 11 June 2008 / Accepted: 24 November 2008 Published online: 28 March 2009 – © Springer-Verlag 2009
Abstract: The structure of a cotangent bundle is investigated for the quantum linear groups $GL_q(n)$ and $SL_q(n)$. Using a q-version of the Cayley-Hamilton theorem we construct an extension of the algebra of differential operators on $SL_q(n)$ (otherwise called the Heisenberg double) by spectral values of the matrix of right invariant vector fields. We consider two applications of the spectral extension. First, we describe the extended Heisenberg double in terms of a new set of generators, the Weyl partners of the spectral variables. Calculating defining relations in terms of these generators allows us to derive $SL_q(n)$ type dynamical R-matrices in a surprisingly simple way. Second, we calculate an evolution operator for the model of the q-deformed isotropic top introduced by A. Alekseev and L. Faddeev. The evolution operator is not uniquely defined and we present two possible expressions for it. The first one is a Riemann theta function in the spectral variables. The second one is an almost free motion evolution operator in terms of logarithms of the spectral variables. The relation between the two operators is given by a modular functional equation for the Riemann theta function.

1. Introduction

The notion of a Heisenberg double over a quantum group was formulated, and attracted substantial research interest, in the early 1990s [AF.91,S,SWZ.92,SWZ.93]. From the algebraic point of view it is a smash product algebra (see [M]) of the quantum group (or the quantized universal enveloping algebra) and its dual Hopf algebra (see [D.86,FRT]). In the differential geometric interpretation it may be viewed as an algebra of quantized differential operators over a group or, equivalently, as an algebra of quantized functions over a cotangent bundle of the group.
Since the group's cotangent bundle serves as a typical phase space for integrable classical dynamics, it is natural to assign the same role to the Heisenberg double over the quantum group for quantum physical models. As a test example, a model of the q-deformed isotropic top was suggested in [AF.91,AF.92]. A discrete time evolution in this model is given by a series of automorphisms of the Heisenberg double. It turns out, however, that finding an explicit expression for the model's evolution operator is not just a technical problem.¹ The automorphisms defining the evolution can by no means be treated as inner ones in the original algebra, and so for a proper realization of the q-top one needs an appropriate extension of the Heisenberg double.
Also stimulated by the invention of quantum groups were general studies of the algebras whose generators satisfy quadratic relations (see [PP] and references therein) and investigations of minor identities for matrices over noncommutative rings [GR.91,GR.92,KL]. These two lines of research meet in the theory of the so-called quantum matrix algebras [H,IOP.99], whose structure theory can be developed in full analogy with the usual matrix analysis. In particular, one can define quantum versions of the matrix trace and determinant [FRT], introduce notions of a spectrum and a power of the quantum matrix, and formulate the Cayley-Hamilton theorem (see [GPS.97,IOP.99,OP.05] and references therein). A remarkable fact about quantum matrix algebras is that their best known examples, the RTT algebra [FRT] and the reflection equation algebra [KS], serve as the building blocks for a construction of the quantum group differential geometry in general [SWZ.92] and so also for the Heisenberg double.
The aim of the present paper is to apply the structure results on quantum matrix algebras to the investigation of the dynamics of the isotropic q-top. Following [AF.91,S,SWZ.92] we begin with a definition of the Heisenberg double as a smash product algebra of a pair of quantum matrix algebras. These are the RTT algebra, playing the role of quantized functions over a group, and the reflection equation algebra, interpreted as quantized right invariant differential operators over a group.
We then consider a central extension of the reflection equation algebra by the spectrum of its generating matrix of quantized right invariant vector fields, and define a proper (non-central) extension of the whole Heisenberg double by these spectral variables. Finally, after the spectral extension is made, the evolution of the isotropic q-top becomes an inner automorphism of the Heisenberg double.² Constructing the evolution operator is then straightforward.
The paper is organized as follows. In the next section we recall some facts about the universal R-matrix and the R-matrix techniques. We mainly discuss the case of (numeric) R-matrices of type $GL_q(n)$. R-matrices of this type are used later for the description of the cotangent bundles (or the Heisenberg doubles) over quantum linear groups.
In Sect. 3 we introduce the RTT algebra and the reflection equation algebra and define their smash product algebra, the Heisenberg double. We describe the algebras of the two linear types, $GL_q(n)$ and $SL_q(n)$. For the reflection equation algebra we formulate in these cases the Cayley-Hamilton theorem and use it for the spectral extension of the Heisenberg double. This is the first main result of the paper (see Theorem 3.27). The Heisenberg double is initially defined in terms of the quantized right invariant vector fields. In order to demonstrate the left-right symmetry of the Heisenberg double, in Subsection 3.4 we describe it using quantized left invariant vector fields. We also derive explicit relations between the spectra of the matrices of left and right generators, see Corollaries 3.17 and 3.34. For clarity of presentation, some technical lemmas are moved from this subsection to Appendix B.

¹ This problem was suggested to the authors by L.D. Faddeev in the summer of 1996, during the Alushta conference “Nonlocal, nonrenormalizable field theories”.
² Strictly speaking, one has to extend the algebra by a formal power series in the spectral variables.
The spectral extension suggests yet another distinguished generating set for the Heisenberg double, namely, the one which satisfies the simplest possible (Weyl algebraic) relations with the spectral variables. In Subsection 3.5 we derive defining relations for this set, see Theorem 3.36. As expected, the relations involve dynamical R-matrices whose dynamical arguments are the spectral variables (see Corollary 3.37). The surprising facts are that the dynamical R-matrices come in pairs, and that they are derived by solving a simple system of (at most three) linear equations.
Section 4 is devoted to solving a dynamical problem for the isotropic q-top. This is our second main result. Noticing that an evolution operator of the model is not uniquely defined, we derive two different expressions for it. The first one is given in terms of the Riemann theta function whose matrix of periods is proportional to the Gram matrix of the lattice $A^*_{n-1}$, see relations (4.3.4), (4.3.5). This solution converges for $|q| < 1$, or for q a rational root of 1. The second solution, converging for arbitrary values of q, is given in terms of logarithms of the spectral variables, see (4.4.2), (4.4.3). The idea of the logarithmic substitution (which means passing from Weyl type to Heisenberg type commutation relations) was suggested to the authors by L.D. Faddeev (for argumentation see [F.94,F.95]). The evolution in the logarithmic variables reduces to an almost free motion. A relation between the two solutions is then given by a modular functional equation for the Riemann theta function (4.4.4).
Concluding the introduction, we would like to mention a number of open problems which, in our opinion, deserve further investigation. First of all, it is straightforward to formulate a problem of spectral extension for the Heisenberg doubles over orthogonal and symplectic quantum groups and over quantum linear supergroups.
Technical prerequisites for this were developed, respectively, in [OP.05] and [GPS.05,GPS.06]. It would also be interesting to construct spectral extensions for the cases of real quantum groups. We believe that a correct setting for this problem is suggested in [AF.92].
Another interesting problem is an extension of the modular double construction [F.95,F.99] (see also [KLS,GKL]) to the case of the Heisenberg double over a quantum group. A starting point for investigation here would be the modular functional relation (4.4.4) between the two evolution operators constructed in Section 4. The Riemann theta function standing in the denominator of this relation could be considered as an evolution operator for the modular dual Heisenberg double.
Finally, the observation that a ribbon element serves as a q-top evolution operator on the smash product algebra of a ribbon Hopf algebra with its dual Hopf algebra (see Example 4.3) could open a way for the spectral extension of a quasi-triangular Hopf algebra. A partial step in this direction is made in Appendix A, where the pairing of the quasi-triangular Hopf algebra with its dual Hopf algebra is extended to the set of spectral variables, see Corollary A.2.

2. R-Matrices

In this introductory section we collect some necessary information about R-matrices and the R-matrix technique.
2.1. Universal R-matrix. First, we recall a few basic notions from the theory of quasitriangular Hopf algebras [D.86,D.89] and ribbon Hopf algebras [RT] (for review see [ChP,KSch]).
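Let $\mathcal{A}$ be a Hopf $\mathbb{C}$-algebra supplied with a unit $1: \mathbb{C} \to \mathcal{A}$, a counit $\epsilon: \mathcal{A} \to \mathbb{C}$, a product $m: \mathcal{A} \otimes \mathcal{A} \to \mathcal{A}$, a coproduct $\Delta: \mathcal{A} \to \mathcal{A} \otimes \mathcal{A}$, and an antipode $S: \mathcal{A} \to \mathcal{A}$, subject to the standard axioms. A Hopf algebra $\mathcal{A}$ is called almost cocommutative if there exists an invertible element $\mathcal{R} \in \mathcal{A} \otimes \mathcal{A}$ that intertwines the coproduct $\Delta$ and the opposite coproduct $\Delta^{op}$ (in Sweedler's notation: $\Delta^{op}(x) = x_{(2)} \otimes x_{(1)}$ if $\Delta(x) = x_{(1)} \otimes x_{(2)}$):
\[ \mathcal{R}\, \Delta(x) = \Delta^{op}(x)\, \mathcal{R} \quad \forall\, x \in \mathcal{A}_{\mathcal{R}} . \tag{2.1.1} \]
In this case the element $\mathcal{R}$ is called a universal R-matrix, and the corresponding almost cocommutative Hopf algebra is denoted as $\mathcal{A}_{\mathcal{R}}$. The algebra $\mathcal{A}_{\mathcal{R}}$ is called quasi-triangular if additionally $\mathcal{R}$ satisfies the relations
\[ (\Delta \otimes \mathrm{id})\mathcal{R} = \mathcal{R}_{13} \mathcal{R}_{23} , \qquad (\mathrm{id} \otimes \Delta)\mathcal{R} = \mathcal{R}_{13} \mathcal{R}_{12} , \tag{2.1.2} \]
where $\mathcal{R}_{12} = \mathcal{R} \otimes 1$, $\mathcal{R}_{23} = 1 \otimes \mathcal{R}$, and $\mathcal{R}_{13} = \sum_i a_i \otimes 1 \otimes b_i$ for $\mathcal{R} = \sum_i a_i \otimes b_i$. Relations (2.1.1), (2.1.2) together imply the equality
\[ \mathcal{R}_{12} \mathcal{R}_{13} \mathcal{R}_{23} = \mathcal{R}_{23} \mathcal{R}_{13} \mathcal{R}_{12} , \tag{2.1.3} \]
which is called the Yang-Baxter equation.
In the almost cocommutative case the element $u := m(S \otimes \mathrm{id})(\mathcal{R}_{21}) \in \mathcal{A}_{\mathcal{R}}$ is invertible. In terms of $u$ the square of the antipode is expressed as
\[ S^2(x) = u\, x\, u^{-1} \quad \forall\, x \in \mathcal{A}_{\mathcal{R}} . \tag{2.1.4} \]
In the quasi-triangular case one has the following formulas:
\[ S(u) = m(\mathrm{id} \otimes S)(\mathcal{R}_{12}) , \quad u^{-1} = m(\mathrm{id} \otimes S^2)(\mathcal{R}_{21}) , \quad \Delta(u) = (\mathcal{R}_{21} \mathcal{R}_{12})^{-1}\, u \otimes u , \quad \mathcal{R}\, (u \otimes u) = (u \otimes u)\, \mathcal{R} . \tag{2.1.5} \]
The element $u S(u) = S(u) u$ (in the almost cocommutative case) belongs to the center of $\mathcal{A}_{\mathcal{R}}$. A central extension of the quasi-triangular Hopf algebra $\mathcal{A}_{\mathcal{R}}$ by a so-called ribbon element $\upsilon$ such that
\[ \upsilon^2 = u S(u) , \qquad \Delta(\upsilon) = (\mathcal{R}_{21} \mathcal{R}_{12})^{-1}\, \upsilon \otimes \upsilon \tag{2.1.6} \]
is called a ribbon Hopf algebra. The ribbon element also fulfills the relations
\[ \epsilon(\upsilon) = 1 , \qquad S(\upsilon) = \upsilon , \qquad \mathcal{R}\, (\upsilon \otimes \upsilon) = (\upsilon \otimes \upsilon)\, \mathcal{R} . \]
Throughout this paper our basic reference example of the quasi-triangular Hopf algebra $\mathcal{A}_{\mathcal{R}}$ is the quantized universal enveloping algebra $U_q(\mathfrak{g})$ of a complex Lie algebra $\mathfrak{g} = sl(n)$ [D.86,J.85,J.86].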
2.2. Braid groups and their R-matrix representations. In the rest of the section we introduce standard notation and recall basic results on R-matrix representations of the braid groups.
The braid group $B_k$ in Artin's presentation is given by a set of generators $\{\sigma_i\}_{i=1}^{k-1}$ and relations
\[ \sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} \quad \forall\, i = 1, 2, \dots, k-2 , \tag{2.2.1} \]
\[ \sigma_i \sigma_j = \sigma_j \sigma_i \quad \forall\, i, j : |i - j| > 1 . \tag{2.2.2} \]
Let $V$ be a finite dimensional $\mathbb{C}$-linear space. For any operator $X \in \mathrm{End}(V^{\otimes 2})$ and for all integers $i > 0$, $j > 0$ denote
\[ X_i := I^{\otimes (i-1)} \otimes X \otimes I^{\otimes (j-1)} \in \mathrm{End}(V^{\otimes (i+j)}) , \tag{2.2.3} \]
where $I \in \mathrm{Aut}(V)$ is the identity operator.³ We also use the notation $X_{ij}$ for an operator in $\mathrm{End}(V^{\otimes k})$, $1 \le i \ne j \le k$, acting as $X$ in the component spaces $V$ with labels $i$ and $j$ and as the identity in the rest. In this notation $X_{i\, i+1} \equiv X_i$.
An operator $R \in \mathrm{Aut}(V \otimes V)$ satisfying the equality
\[ R_1 R_2 R_1 = R_2 R_1 R_2 \tag{2.2.4} \]
is called an R-matrix. Any R-matrix generates representations $\rho_R$ of the braid groups $B_k$, $k = 2, 3, \dots$,
\[ \rho_R : B_k \to \mathrm{Aut}(V^{\otimes k}) , \qquad \rho_R(\sigma_i) = R_i , \quad 1 \le i \le k-1 . \]
By a slight abuse of notation we assign the same symbol $\rho_R$ to the R-matrix representations of the braid groups $B_k$ for different values of the index $k$. This should not cause problems, as the braid groups admit a series of monomorphisms commuting with $\rho_R$,
\[ B_k \to B_{k+1} : \quad \sigma_i \mapsto \sigma_i \quad \forall\, i = 1, \dots, k-1 . \tag{2.2.5} \]
Definition 2.1. An R-matrix $R$ is called skew invertible if there exists an operator $\Psi_R \in \mathrm{End}(V^{\otimes 2})$ such that
\[ \mathrm{Tr}_{(2)} R_{12} \Psi_{R\,23} = \mathrm{Tr}_{(2)} \Psi_{R\,12} R_{23} = P_{13} . \tag{2.2.6} \]
Here $\mathrm{Tr}_{(i)}$ denotes the trace operation in the $i$th space, and $P$ denotes the permutation operator: $P(u \otimes v) = v \otimes u$ $\forall\, u, v \in V$.
With any skew invertible R-matrix $R$ we associate a pair of operators $D_R, C_R \in \mathrm{End}(V)$,
\[ D_{R\,1} = \mathrm{Tr}_{(2)} \Psi_{R\,12} , \qquad C_{R\,2} = \mathrm{Tr}_{(1)} \Psi_{R\,12} , \tag{2.2.7} \]
which, by (2.2.6), satisfy the equalities
\[ \mathrm{Tr}_{(2)} R_{12} D_{R\,2} = I_1 , \qquad \mathrm{Tr}_{(1)} C_{R\,1} R_{12} = I_2 . \tag{2.2.8} \]
Further properties of the operators $D_R$ and $C_R$ are summarized below.

³ Strictly speaking, a proper notation for the l.h.s. of (2.2.3) would be, say, $X_i^{(i+j)}$. We use the shortened notation $X_i$ since the dependence on $j$ is not critical for our considerations. All formulas below make sense if the index $j$ is large enough. The minimal possible value of $j$ in each case is obvious from the context.
Proposition 2.2 [Is.04,O]. Let $R$ be a skew invertible R-matrix. The operators $D_R$ and $C_R$ (2.2.7) satisfy the equalities
\[ D_{R\,1} I_2 = \mathrm{Tr}_{(3)} D_{R\,3} R_2^{\pm 1} P_{12} R_2^{\mp 1} , \qquad C_{R\,3} I_2 = \mathrm{Tr}_{(1)} C_{R\,1} R_1^{\pm 1} P_{23} R_1^{\mp 1} , \]
\[ R_{12} D_{R\,1} D_{R\,2} = D_{R\,1} D_{R\,2} R_{12} , \qquad R_{12} C_{R\,1} C_{R\,2} = C_{R\,1} C_{R\,2} R_{12} . \tag{2.2.9} \]
Let $W$ be a $\mathbb{C}$-linear space. For any skew invertible R-matrix $R$ we define an R-trace map⁴
\[ \mathrm{Tr}_R : \mathrm{End}_W(V) \to W , \qquad Y \mapsto \mathrm{Tr}_R(Y) := \mathrm{Tr}(D_R Y) , \quad Y \in \mathrm{End}_W(V) . \]
The following properties of the R-trace are simple consequences of the relations given in Proposition 2.2.
Corollary 2.3. Let $R$ be a skew invertible R-matrix. For any operator $Y \in \mathrm{End}_W(V)$ the R-trace associated with $R$ satisfies the relations
\[ \mathrm{Tr}_{R\,(2)} \big( R_{12}^{-\varepsilon}\, Y_1\, R_{12}^{\varepsilon} \big) = I_1\, \mathrm{Tr}_R(Y) , \tag{2.2.10} \]
where $\varepsilon = \pm 1$ and the symbol $\mathrm{Tr}_{R\,(i)}$ denotes taking the R-trace in the $i$th space. For an element $x^{(k)} \in \mathbb{C}[B_k]$ denote $X_R^{(k)} := \rho_R(x^{(k)}) \in \mathrm{End}(V^{\otimes k})$. The following cyclic property
\[ \mathrm{Tr}_{R\,(1, \dots, k)} \big( X_R^{(k)}\, Y^{(k)} \big) = \mathrm{Tr}_{R\,(1, \dots, k)} \big( Y^{(k)}\, X_R^{(k)} \big) \]
is fulfilled for any $k \ge 1$ and $Y^{(k)} \in \mathrm{End}_W(V^{\otimes k})$, and for all $x^{(k)} \in \mathbb{C}[B_k]$.
Example 2.4. The permutation $P$: $P(u \otimes v) := v \otimes u$ $\forall\, u, v \in V$, is a skew invertible R-matrix. The identity operator $I^{\otimes 2}$ is an R-matrix which is not skew invertible.
Example 2.5. Assume that the quasi-triangular Hopf algebra $\mathcal{A}_{\mathcal{R}}$ admits a representation $\rho_V : \mathcal{A}_{\mathcal{R}} \to \mathrm{End}(V)$. As follows from the Yang-Baxter equation (2.1.3), the operator
\[ R := \eta\, P\, (\rho_V \otimes \rho_V)(\mathcal{R}) \tag{2.2.11} \]
satisfies relation (2.2.4). Here the scaling factor $\eta \in \mathbb{C} \setminus \{0\}$ is introduced for the sake of future convenience. The R-matrix (2.2.11) is skew invertible; its skew inverse matrix is given by the formula (see, e.g., [O], Sect. 4.1.2)
\[ \Psi_R = \eta^{-1} P\, (\rho_V \otimes \rho_V)\big( (\mathrm{id} \otimes S)\mathcal{R} \big) . \]
The matrices $D_R$ and $C_R$ associated with the R-matrix (2.2.11) are
\[ D_R = \eta^{-1} \rho_V(u) , \qquad C_R = \eta^{-1} \rho_V(S(u)) . \tag{2.2.12} \]
Both are invertible, and their properties (2.2.9) descend from (2.1.5).

⁴ This map is often called a quantum trace or, shortly, a q-trace. In our opinion, the name R-trace is more appropriate.
2.3. Hecke algebras and Hecke type R-matrix. An A-type Hecke algebra $H_k(q)$ is a quotient algebra of the group algebra $\mathbb{C}[B_k]$ (2.2.1), (2.2.2) by the relations
\[ (\sigma_i - q 1)(\sigma_i + q^{-1} 1) = 0 \quad \forall\, 1 \le i \le k-1 . \]
Under the following conditions on the parameter $q$:
\[ [k]: \quad i_q := (q^i - q^{-i})/(q - q^{-1}) \ne 0 \quad \forall\, i = 2, 3, \dots, k , \tag{2.3.1} \]
the algebra $H_k(q)$ is isomorphic to the group algebra of the symmetric group $\mathbb{C}[S_k]$ and, hence, semisimple. Its irreducible representations as well as its central idempotents are labeled by the set of partitions $\lambda \vdash k$. We are particularly interested in the series of idempotents corresponding to the one dimensional representations $\lambda = (1^k)$, $k = 1, 2, \dots$. These idempotents, which we denote as $a^{(k)}$, admit a recursive construction (see, e.g., [HIOPT], Sect. 1, or [GPS.97], Sect. 2.3, or [TW], Lemma 7.2):
\[ a^{(1)} = 1 , \qquad a^{(k)} = \frac{(k-1)_q}{k_q}\, a^{(k-1)} \Big( \frac{q^{k-1}}{(k-1)_q}\, 1 - \sigma_{k-1} \Big) a^{(k-1)} \tag{2.3.2} \]
\[ \phantom{a^{(k)}} = \frac{(k-1)_q}{k_q}\, a^{(k-1)\uparrow 1} \Big( \frac{q^{k-1}}{(k-1)_q}\, 1 - \sigma_1 \Big) a^{(k-1)\uparrow 1} \quad \forall\, k = 2, 3, \dots , \tag{2.3.3} \]
where we use the symbol $x^{(k)\uparrow 1} \in H_{k+1}(q)$ for the image of the element $x^{(k)} \in H_k(q)$ under the following algebra monomorphism (cf. (2.2.5)):
\[ H_k \to H_{k+1} : \quad \sigma_i \mapsto \sigma_{i+1} \quad \forall\, i = 1, \dots, k-1 . \]
The idempotents $a^{(k)}$ obey the relations
\[ a^{(k)} \sigma_i = \sigma_i a^{(k)} = -q^{-1} a^{(k)} \quad \forall\, i = 1, 2, \dots, k-1 , \tag{2.3.4} \]
\[ a^{(k)} a^{(i)\uparrow j} = a^{(i)\uparrow j} a^{(k)} = a^{(k)} , \quad \text{if } i + j \le k . \tag{2.3.5} \]
An R-matrix $R$ satisfying a quadratic minimal characteristic identity is called a Hecke type R-matrix. By an appropriate rescaling of $R$ one can always bring its characteristic identity to the form
\[ (R - q I)(R + q^{-1} I) = 0 . \tag{2.3.6} \]
In this case the corresponding representations $\rho_R$ become representations of the Hecke algebras $H_k(q)$,
\[ \rho_R : H_k(q) \to \mathrm{Aut}(V^{\otimes k}) , \qquad \rho_R(\sigma_i) = R_i , \quad 1 \le i \le k-1 . \tag{2.3.7} \]
We reserve a special notation for the R-matrix images of the idempotents $a^{(k)}$:
\[ A^{(k)} := \rho_R(a^{(k)}) , \qquad A^{(k)\uparrow 1} := \rho_R(a^{(k)\uparrow 1}) \quad \forall\, k \ge 1 . \tag{2.3.8} \]
We also put $A^{(0)} := 1$. The elements $A^{(k)}$ will be further referred to as k-antisymmetrizers.
Remark 2.6. The R-matrix analogues of relations (2.3.2)–(2.3.5) have been described in the literature (see [J.86,G]) even earlier than their algebraic prototypes.
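The recursion (2.3.2) can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes the Drinfeld-Jimbo matrix of formula (2.4.6) as a test Hecke type R-matrix, uses exact rational arithmetic, and all function names (`sample_r_matrix`, `antisymmetrizer`, and the small matrix helpers) are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from itertools import product


def identity(m):
    return [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def kron(a, b):
    # Kronecker product; the basis vector e_r (x) e_s has index r * len(b) + s
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def q_number(k, q):
    # k_q = (q^k - q^-k) / (q - q^-1), written as a finite sum so q = 1 is allowed
    return sum(q ** (k - 1 - 2 * s) for s in range(k))


def sample_r_matrix(n, q):
    # Assumed test case: the matrix of formula (2.4.6) on V (x) V, dim V = n
    R = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i, j in product(range(n), repeat=2):
        # q^{delta_ij} E_ij (x) E_ji sends e_j (x) e_i to e_i (x) e_j
        R[i * n + j][j * n + i] += q if i == j else Fraction(1)
        if i < j:
            # (q - 1/q) E_ii (x) E_jj is diagonal on e_i (x) e_j
            R[i * n + j][i * n + j] += q - 1 / q
    return R


def antisymmetrizer(R, n, k, q):
    # A^(k) on V^{(x) k}, built step by step from recursion (2.3.2)
    A = identity(n)  # A^(1) = I
    for m in range(2, k + 1):
        dim = n ** m
        A_prev = kron(A, identity(n))             # A^(m-1) on the first m-1 factors
        R_last = kron(identity(n ** (m - 2)), R)  # R_{m-1} on the last two factors
        c = q ** (m - 1) / q_number(m - 1, q)
        middle = [[(c if r == s else 0) - R_last[r][s] for s in range(dim)]
                  for r in range(dim)]
        norm = q_number(m - 1, q) / q_number(m, q)
        A = [[norm * x for x in row]
             for row in matmul(matmul(A_prev, middle), A_prev)]
    return A


q = Fraction(3, 2)
R = sample_r_matrix(2, q)
A2 = antisymmetrizer(R, 2, 2, q)
print(matmul(A2, A2) == A2)              # is A^(2) an idempotent?
print(sum(A2[i][i] for i in range(4)))   # trace of an idempotent equals its rank
```

For this test matrix one can compare the output with relation (2.3.4) and, for dim V = n, with the rank and vanishing conditions stated later in Definition 2.7. Exact fractions are used so that equalities are tested without rounding; the parameter value q = 3/2 is an arbitrary choice satisfying (2.3.1).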
2.4. $GL_q(n)$ type R-matrix.
Definition 2.7. Consider a Hecke type R-matrix $R$. Assume that the parameter $q$ in its characteristic identity (2.3.6) satisfies conditions $[n]$ (2.3.1), so that the antisymmetrizers $A^{(2)}, \dots, A^{(n)}$ are well defined. $R$ is called a $GL_q(n)$ type R-matrix if the two conditions
\[ A^{(n)} \Big( \frac{q^n}{n_q}\, I - R_n \Big) A^{(n)} = 0 \tag{2.4.1} \]
and
\[ \mathrm{rk}\, A^{(n)} = 1 \tag{2.4.2} \]
are fulfilled.
Remark 2.8. Assuming $(n+1)_q \ne 0$, the condition (2.4.1) is equivalent to $A^{(n+1)} = 0$. For generic values of $q$, assuming validity of (2.4.1), the condition (2.4.2) is equivalent to demanding skew invertibility of $R$ (see [G], Props. 3.6 and 3.10).
Proposition 2.9 [G,Is.04]. Let $R$ be a skew invertible R-matrix of the type $GL_q(n)$. Then $C_R$ and $D_R$ are invertible and the following relations are fulfilled:
\[ D_R C_R = C_R D_R = q^{-2n} I , \tag{2.4.3} \]
\[ \mathrm{Tr}_{R\,(k)} A^{(k)} = q^{-n}\, \frac{(n+1-k)_q}{k_q}\, A^{(k-1)} \quad \forall\, k = 1, 2, \dots, n , \tag{2.4.4} \]
\[ A^{(n)} \prod_{i=1}^{n} (D_R)_i = \prod_{i=1}^{n} (D_R)_i\, A^{(n)} = q^{-n^2} A^{(n)} . \tag{2.4.5} \]
Example 2.10. Consider the case where $\mathcal{A}_{\mathcal{R}}$ is the quantized universal enveloping algebra $U_q sl(n)$. Let $V$ be the vector representation of $U_q sl(n)$, $\dim V = n$. In this case formula (2.2.11) with the scaling factor chosen as $\eta = q^{1/n}$ gives the standard Drinfeld-Jimbo R-matrix $R^\circ$ of the $GL_q(n)$ type (see [KSch], Sect. 8.4.2):
\[ R^\circ = \sum_{i,j=1}^{n} q^{\delta_{ij}} E_{ij} \otimes E_{ji} + (q - q^{-1}) \sum_{i<j} E_{ii} \otimes E_{jj} . \tag{2.4.6} \]
Here $(E_{ij})_{kl} := \delta_{ik} \delta_{jl}$, $i, j = 1, \dots, n$, is the standard basis of $n \times n$ matrix units. Via the so-called twist procedure (for details see [R.90]) $R^\circ$ gives rise to a multiparametric family of $GL_q(n)$ type R-matrices,
\[ R^f := F R^\circ F^{-1} = \sum_{i,j=1}^{n} q^{\delta_{ij}}\, \frac{f_{ij}}{f_{ji}}\, E_{ij} \otimes E_{ji} + (q - q^{-1}) \sum_{i<j} E_{ii} \otimes E_{jj} , \quad \forall\, f_{ij} \in \mathbb{C} \setminus \{0\} . \tag{2.4.7} \]
Here $F := \sum_{i,j=1}^{n} f_{ij} E_{ii} \otimes E_{jj}$ is a twisting R-matrix. In what follows we use these particular R-matrices for illustration purposes. Their corresponding matrices $D_{R^\circ}$ and $D_{R^f}$ are
\[ D_{R^\circ} = D_{R^f} = \sum_{i=1}^{n} q^{2(i-n)-1} E_{ii} . \]
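The statements of Example 2.10 lend themselves to a direct numerical check. The sketch below is an illustration added under stated assumptions, not the authors' code: it builds $R^\circ$ from (2.4.6) in exact rational arithmetic and tests the braid relation (2.2.4), the Hecke condition (2.3.6), and the trace identities (2.2.8), with $D_R$ taken from the formula above and $C_R$ obtained from (2.4.3). The function names and the index convention (the basis vector $e_a \otimes e_b$ has index $a n + b$, counting from zero) are choices made for this example.

```python
from fractions import Fraction
from itertools import product


def identity(m):
    return [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def kron(a, b):
    # Kronecker product; the basis vector e_r (x) e_s has index r * len(b) + s
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def drinfeld_jimbo(n, q):
    # R° of (2.4.6) as an n^2 x n^2 matrix (rows: output, columns: input)
    R = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i, j in product(range(n), repeat=2):
        R[i * n + j][j * n + i] += q if i == j else Fraction(1)
        if i < j:
            R[i * n + j][i * n + j] += q - 1 / q
    return R


def permutation(n):
    # P(u (x) v) = v (x) u
    P = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i, j in product(range(n), repeat=2):
        P[i * n + j][j * n + i] = Fraction(1)
    return P


def checks(n, q):
    q = Fraction(q)
    R, I, N = drinfeld_jimbo(n, q), identity(n), n * n
    R1, R2 = kron(R, I), kron(I, R)
    braid = matmul(matmul(R1, R2), R1) == matmul(matmul(R2, R1), R2)
    minus = [[R[r][c] - (q if r == c else 0) for c in range(N)] for r in range(N)]
    plus = [[R[r][c] + (1 / q if r == c else 0) for c in range(N)] for r in range(N)]
    hecke = all(x == 0 for row in matmul(minus, plus) for x in row)
    # Diagonal entries of D_R = sum_i q^{2(i-n)-1} E_ii, with i = 1..n
    D = [q ** (2 * (i + 1 - n) - 1) for i in range(n)]
    # C_R fixed by D_R C_R = q^{-2n} I, relation (2.4.3)
    C = [q ** (-2 * n) / d for d in D]
    tr2 = [[sum(R[a * n + b][c * n + b] * D[b] for b in range(n))
            for c in range(n)] for a in range(n)]
    tr1 = [[sum(C[a] * R[a * n + b][a * n + d] for a in range(n))
            for d in range(n)] for b in range(n)]
    return {"braid (2.2.4)": braid, "Hecke (2.3.6)": hecke,
            "Tr_(2) R12 D2 = I1": tr2 == I, "Tr_(1) C1 R12 = I2": tr1 == I}


print(checks(2, Fraction(3, 2)))
print(drinfeld_jimbo(3, Fraction(1)) == permutation(3))  # value of R° at q = 1
```

The last line compares $R^\circ$ at $q = 1$ with the permutation operator $P$ of Example 2.4. Rational values of $q$ are used only so that all equalities are exact; nothing in the construction depends on that choice.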
Remark 2.11. Generally speaking, a $GL_q(n)$ type R-matrix can be realized in the tensor square of a space $V$ whose dimension is different from $n$. Examples of such R-matrices for any $\dim V \ge n$ are given in [G], Sect. 4. In what follows we do not assume any relation between the parameter $n$ in Definition 2.7 and the dimension of the space $V$, unless it is stated explicitly.

3. Quantized Functions on a Cotangent Bundle Over Matrix Group

In this section we recall the definition of a quantum group cotangent bundle and develop, in the linear cases $GL_q(n)$ and $SL_q(n)$, basic techniques for investigating its structure.

3.1. Quantized functions over matrix group (RTT algebra).
Definition 3.1 [D.86,FRT]. Let $R$ be a skew invertible R-matrix. An associative unital algebra generated by a set of matrix components $\|T^i_j\|_{i,j=1}^{\dim V}$ satisfying the relations
\[ R_{12} T_1 T_2 = T_1 T_2 R_{12} \tag{3.1.1} \]
is denoted as F[R] and called an RTT algebra.
The RTT algebra is endowed in a standard way with the coproduct and the counit
\[ \Delta(T^i_j) = \sum_k T^i_k \otimes T^k_j , \qquad \epsilon(T^i_j) = \delta^i_j . \tag{3.1.2} \]
Let us further extend the RTT algebra by a set of inverse matrix components $\|(T^{-1})^i_j\|_{i,j=1}^{\dim V}$:
\[ \sum_k T^i_k (T^{-1})^k_j = \sum_k (T^{-1})^i_k T^k_j = \delta^i_j\, 1 . \tag{3.1.3} \]
The extended algebra can be endowed with the antipode mapping
\[ S(T^i_j) = (T^{-1})^i_j , \]
so that (see [R.89])
\[ S^2(T)\, D_R = D_R\, T . \tag{3.1.4} \]
The resulting Hopf algebra is further denoted as FG[R].
Example 3.2. Consider the quasi-triangular Hopf algebra $\mathcal{A}_{\mathcal{R}}$ together with its representation $\rho_V$ (see Example 2.5). For any $x \in \mathcal{A}_{\mathcal{R}}$ denote by $\rho_V(x)^i_j$ the matrix of the operator $\rho_V(x)$ in a certain basis of the space $V$. Let $\mathcal{A}^*_{\mathcal{R}}$ be the dual Hopf algebra and let $\langle \cdot , \cdot \rangle$ denote a non-degenerate pairing between $\mathcal{A}_{\mathcal{R}}$ and $\mathcal{A}^*_{\mathcal{R}}$. Consider two matrices of linear functionals on $\mathcal{A}_{\mathcal{R}}$, $T^i_j$ and $(T^{-1})^i_j$, such that
\[ \langle T^i_j , x \rangle = \rho_V(x)^i_j , \qquad \langle (T^{-1})^i_j , x \rangle = \rho_V(S(x))^i_j \quad \forall\, x \in \mathcal{A}_{\mathcal{R}} . \tag{3.1.5} \]
It is easy to see that these functionals satisfy the conditions of Definition 3.1 (for details see, e.g., [B]): the numeric R-matrix $R$ in (3.1.1) is in this case given by (2.2.11), and relation (3.1.4) for the square of the antipode descends from (2.1.4). The functionals $T^i_j$ and $(T^{-1})^i_j$ generate a Hopf subalgebra in $\mathcal{A}^*_{\mathcal{R}}$.
In the case where $\mathcal{A}_{\mathcal{R}}$ is the universal enveloping algebra $U\mathfrak{g}$ of some Lie algebra $\mathfrak{g}$, the dual Hopf algebra $(U\mathfrak{g})^*$ can be treated as $\mathrm{Fun}(G) \equiv FG$, where $G$ is a formal group corresponding to $\mathfrak{g}$. Therefore, heuristically we can treat the RTT algebras FG[R] and F[R] as algebras of quantized functions over the matrix group and the matrix semigroup, respectively. Here the term matrix refers to the matrix form of the coproduct (3.1.2); the term quantized means that relations (3.1.1) in general define a noncommutative product.
In the rest of the subsection we describe a construction of the inverse matrix $T^{-1}$ for the RTT algebra associated with the $GL_q(n)$ type R-matrix. Consider the element
\[ \det{}_R T := \mathrm{Tr}_{(1, \dots, n)} \big( A^{(n)} T_1 T_2 \dots T_n \big) . \tag{3.1.6} \]
By the definition of the coproduct (3.1.2) and due to the rank 1 condition (2.4.2), the element $\det_R T$ is group-like,
\[ \Delta(\det{}_R T) = \det{}_R T \otimes \det{}_R T , \]
and it satisfies the relations
\[ A^{(n)} T_1 T_2 \dots T_n = T_1 T_2 \dots T_n A^{(n)} = A^{(n)} \det{}_R T . \]
Therefore, it is natural to call $\det_R T$ a determinant of the matrix $T$.
Proposition 3.3 [G]. Let $R$ be a skew invertible $GL_q(n)$ type R-matrix. The following relation is satisfied in the corresponding RTT algebra F[R]:
\[ (\det{}_R T)\, T = (O_R\, T\, O_R^{-1})\, \det{}_R T , \]
where $O_R, O_R^{-1} \in \mathrm{Aut}(V)$ are mutually inverse matrices:
\[ O_{R\,1} = n_q\, \mathrm{Tr}_{(2, \dots, n+1)} \big( P_1 P_2 \dots P_n A^{(n)} \big) , \qquad (O_R^{-1})_1 = n_q\, \mathrm{Tr}_{(2, \dots, n+1)} \big( A^{(n)} P_n \dots P_2 P_1 \big) \tag{3.1.7} \]
(recall that $P_i$ are permutation operators acting in the component spaces $V_i \otimes V_{i+1}$).
Corollary 3.4. Under the assumptions of Proposition 3.3, consider an extension of the RTT algebra F[R] by an element $(\det_R T)^{-1}$ subject to the relations
\[ (\det{}_R T)^{-1}\, T = (O_R^{-1}\, T\, O_R)\, (\det{}_R T)^{-1} , \qquad \det{}_R T\, (\det{}_R T)^{-1} = (\det{}_R T)^{-1} \det{}_R T = 1 . \]
In the extended algebra the inverse matrix $T^{-1}$ satisfying relations (3.1.3) is given by the formula
\[ (T^{-1})_1 = q^{n(n-1)}\, n_q\, \mathrm{Tr}_{R\,(2, \dots, n)} \big( T_2 \dots T_n A^{(n)} \big)\, (\det{}_R T)^{-1} . \]
The resulting Hopf algebra is called a $GL_q(n)$ type RTT algebra and denoted as $FGL_q(n)[R]$.
Assume additionally that for the R-matrix $R$ the corresponding matrix $O_R$ (3.1.7) is scalar: $O_R \propto I$. In this case $R$ is called an R-matrix of $SL_q(n)$ type. In the corresponding RTT algebra $FGL_q(n)[R]$ the element $\det_R T$ is central. The quotient of this algebra by the relation $\det_R T = 1$ is called the $SL_q(n)$ type RTT algebra and denoted as $FSL_q(n)[R]$.
Remark 3.5. For a skew invertible $GL_q(n)$ type R-matrix $R$ consider the system of equations
\[ R_{12} N_1 N_2 = N_1 N_2 R_{12} , \qquad N^n \propto O_R \quad \text{for some } N \in \mathrm{Aut}(V) . \]
Note that the consistency condition for these equations, $R_{12} O_{R\,1} O_{R\,2} = O_{R\,1} O_{R\,2} R_{12}$, is satisfied (see [OP.05]). From any solution $N$ of these equations one can construct the $SL_q(n)$ type R-matrix
\[ \tilde{R}_{12} := N_1 R_{12} N_1^{-1} = N_2^{-1} R_{12} N_2 . \tag{3.1.8} \]
Example 3.6. For the R-matrices described in Example 2.10 one has
\[ O_{R^\circ} = -I , \qquad O_{R^f} = - \sum_{i=1}^{n} \prod_{j \ne i} \big( f_{ji} / f_{ij} \big)\, E_{ii} . \]
So, $R^\circ$ is of $SL_q(n)$ type, while $R^f$ is of $SL_q(n)$ type only if $\prod_{j \ne i} (f_{ji}/f_{ij}) = 1$ $\forall\, i = 1, \dots, n$. Taking a diagonal $n$th root $\sqrt[n]{O_{R^f}}$ of the diagonal matrix $O_{R^f}$, one finds the $SL_q(n)$ type R-matrix associated with $R^f$: $\tilde{R}^f = R^{\tilde{f}}$, where $\tilde{f}_{ij} := \prod_{k \ne i,j} (f_{ij} f_{jk} f_{ki})^{1/n}$, so that $O_{R^{\tilde{f}}} = -I$.
3.2. Quantized right invariant vector fields (reflection equation algebra).
Definition 3.7 [KS]. Let $R$ be a skew invertible R-matrix. An associative unital algebra LG[R] generated by a set of matrix components $\|L^i_j\|_{i,j=1}^{\dim V}$ satisfying the relations
\[ L_1 R_{12} L_1 R_{12} = R_{12} L_1 R_{12} L_1 \tag{3.2.1} \]
is called a reflection equation algebra or, shortly, RE algebra.
The RE algebra LG[R] is naturally endowed with the structure of a left coadjoint FG[R]-comodule algebra
\[ \delta_\ell (L^i_j) = \sum_{k,m} T^i_k (T^{-1})^m_j \otimes L^k_m . \tag{3.2.2} \]
Example 3.8 [FRT]. In the notation of Examples 2.5, 3.2 consider the following $\mathcal{A}_{\mathcal{R}}$-valued matrices:
\[ L^{(+)\,i}_{\ \ j} = \langle \mathrm{id} \otimes T^i_j , \mathcal{R} \rangle , \qquad L^{(-)\,i}_{\ \ j} = \langle S(T^i_j) \otimes \mathrm{id} , \mathcal{R} \rangle = \langle T^i_j \otimes \mathrm{id} , \mathcal{R}^{-1} \rangle , \]
\[ ((L^{(+)})^{-1})^i_j = \langle \mathrm{id} \otimes T^i_j , \mathcal{R}^{-1} \rangle , \qquad ((L^{(-)})^{-1})^i_j = \langle T^i_j \otimes \mathrm{id} , \mathcal{R} \rangle . \tag{3.2.3} \]
As a consequence of the Yang-Baxter equation (2.1.3), the components of these matrices satisfy the relations
\[ R_{12} L^{(\pm)}_2 L^{(\pm)}_1 = L^{(\pm)}_2 L^{(\pm)}_1 R_{12} , \qquad R_{12} L^{(+)}_2 L^{(-)}_1 = L^{(-)}_2 L^{(+)}_1 R_{12} , \tag{3.2.4} \]
where $R$ is given by (2.2.11). By (2.1.2), the elements $((L^{(\pm)})^{\pm 1})^i_j$ generate a Hopf $\mathcal{A}_{\mathcal{R}}$-subalgebra:
\[ \Delta(L^{(\pm)\,i}_{\ \ j}) = \sum_k L^{(\pm)\,i}_{\ \ k} \otimes L^{(\pm)\,k}_{\ \ j} , \qquad \epsilon(L^{(\pm)\,i}_{\ \ j}) = \delta^i_j , \qquad S(L^{(\pm)\,i}_{\ \ j}) = ((L^{(\pm)})^{-1})^i_j . \]
Consider a composite matrix $L$ with components
\[ L^i_j := q^{(n - \frac{1}{n})} \sum_k ((L^{(-)})^{-1})^i_k\, L^{(+)\,k}_{\ \ j} = q^{(n - \frac{1}{n})} \langle \mathrm{id} \otimes T^i_j , \mathcal{R}_{21} \mathcal{R}_{12} \rangle , \tag{3.2.5} \]
where our choice of the numeric factor $q^{n - \frac{1}{n}}$ is justified in Appendix A. By (3.2.4), the components of $L$ satisfy the reflection equation (3.2.1), where $R$ is given by (2.2.11). Note that the $\mathcal{A}_{\mathcal{R}}$-subalgebra generated by the elements $L^i_j$ (3.2.5) does not carry a natural Hopf algebra structure. Instead, it carries the coadjoint comodule algebra structure (3.2.2) with respect to the Hopf $\mathcal{A}^*_{\mathcal{R}}$-subalgebra generated by the components of the matrices $T$ and $T^{-1}$ (3.1.5).
Let us comment on a geometric interpretation of the RE algebra. In [FRT] the matrices $L^{(\pm)}$ were used to develop an RTT type description of the quantized universal enveloping algebra $U_q \mathfrak{g}$. Consider the case $\mathfrak{g} = sl(n)$ and let $V$ be its vector representation. The corresponding $GL_q(n)$ type R-matrix $R$ is given in Example 2.10. Making a linear change of generators $L^i_j \to \ell^i_j$:
\[ L^i_j = \delta^i_j + (q - q^{-1})\, \ell^i_j , \tag{3.2.6} \]
and using the Hecke condition (2.3.6), the reflection equation (3.2.1), for $q^2 \ne 1$, can be equivalently rewritten as
\[ \ell_1 R_{12} \ell_1 R_{12} - R_{12} \ell_1 R_{12} \ell_1 = R_{12} \ell_1 - \ell_1 R_{12} . \tag{3.2.7} \]
In the “classical” limit $q \to 1$ the R-matrix (2.4.6) tends to the permutation and Eqs. (3.2.7) go into the commutation relations for the basis of generators of the Lie algebra $gl(n)$,
\[ [\ell_1 , \ell_2] = P_{12} (\ell_1 - \ell_2) . \tag{3.2.8} \]
Classically we can treat $\|\ell^i_j\|_{i,j=1}^{n}$ as a basis of right invariant vector fields on $GL(n)$. The transformation of these basic fields under left translation by a group element $t \in GL(n)$ is given by the formula (cf. (3.2.2))
\[ \delta_\ell(t) : \quad \ell^i_j \mapsto \sum_{k,m=1}^{n} t^i_k\, \ell^k_m\, (t^{-1})^m_j , \quad \text{where } t^i_j := \rho_V(t)^i_j . \]
Extrapolating this interpretation to the “quantum” case $q \ne 1$, we call $\|L^i_j\|_{i,j=1}^{n}$ a basis of quantized right invariant vector fields over the matrix group.
It is technically convenient to introduce the notation
\[ L_{\overline{1}} := L_1 , \quad L_{\overline{k+1}} := R_k L_{\overline{k}} R_k^{-1} , \qquad L_{\underline{1}} := L_1 , \quad L_{\underline{k+1}} := R_k^{-1} L_{\underline{k}} R_k \quad \forall\, k \ge 1 . \tag{3.2.9} \]
In terms of these R-copies $L_{\overline{k}}$, $L_{\underline{k}}$ of the matrix $L$ the reflection equation (3.2.1) can be equivalently written in any of the following forms:
\[ R_k L_{\overline{k}} L_{\overline{k+1}} = L_{\overline{k}} L_{\overline{k+1}} R_k , \qquad R_k L_{\underline{k+1}} L_{\underline{k}} = L_{\underline{k+1}} L_{\underline{k}} R_k \quad \forall\, k \ge 1 . \tag{3.2.10} \]
Taking into account the commutativity relations
\[ R_i L_{\overline{k}} = L_{\overline{k}} R_i , \qquad R_i L_{\underline{k}} = L_{\underline{k}} R_i \quad \forall\, i, k : k \ne i, i+1 , \tag{3.2.11} \]
one sees that the R-copies $L_{\overline{k}}$ ($L_{\underline{k}}$) of the matrix $L$ in the RE algebra LG[R] formally satisfy the same relations as the usual copies $T_k$ ($T_k^{-1}$) of the matrix $T$ ($T^{-1}$) in the RTT algebra FG[R].
Matrix monomials in the two different series of R-copies satisfy the relations
\[ L_{\overline{1}} L_{\overline{2}} \dots L_{\overline{k}} = L_{\underline{k}} \dots L_{\underline{2}} L_{\underline{1}} \quad \forall\, k \ge 1 . \tag{3.2.12} \]
For $k = 2$ the equality (3.2.12) is identical to the reflection equation (3.2.1). For $k > 2$ this equality follows by induction on $k$. Note that the monomials (3.2.12) transform covariantly under the left coadjoint coaction (3.2.2):
\[ \delta_\ell \big( L_{\overline{1}} \dots L_{\overline{k}} \big) = (T_1 \dots T_k \otimes 1)\, (1 \otimes L_{\overline{1}} \dots L_{\overline{k}})\, (S(T_1 \dots T_k) \otimes 1) . \tag{3.2.13} \]
The following proposition goes back to Theorem 14 of [FRT] (see also [Is.04], Prop. 5).
Proposition 3.9. Let $R$ be a skew invertible R-matrix. For an element $x^{(k)} \in \mathbb{C}[B_k]$ denote
\[ ch(x^{(k)}) := \mathrm{Tr}_{R\,(1 \dots k)} \big( X_R^{(k)} L_{\overline{1}} L_{\overline{2}} \dots L_{\overline{k}} \big) , \tag{3.2.14} \]
where $X_R^{(k)} := \rho_R(x^{(k)}) \in \mathrm{End}(V^{\otimes k})$. Consider the linear subspace $Ch[R] \subset LG[R]$ spanned by the unity and by the elements $ch(x^{(k)})$ $\forall\, k \ge 1$ and $\forall\, x^{(k)} \in \mathbb{C}[B_k]$. The space Ch[R] is a subalgebra of the center of the RE algebra LG[R]. It is called the characteristic subalgebra of the RE algebra LG[R]. The characteristic subalgebra is invariant with respect to the left FG[R] coadjoint coaction (3.2.2).
Proof. In the setting of quasi-triangular Hopf algebras these statements were proved in [D.89,R.89] (see Sect. 3 and Sect. 4 there, respectively). Below we prove the proposition in the RE algebra setting.
Consider an arbitrary element $ch(x^{(k)})$ of the characteristic subalgebra. We first prove the following version of formula (3.2.14):
\[ ch(x^{(k)})\, I_1 = \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( X_R^{(k)\uparrow 1} L_{\overline{2}} L_{\overline{3}} \dots L_{\overline{k+1}} \big) = \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( X_R^{(k)\uparrow 1} L_{\underline{k+1}} \dots L_{\underline{3}} L_{\underline{2}} \big) . \tag{3.2.15} \]
Here the first equality results from the calculation
\[ \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( X_R^{(k)\uparrow 1} L_{\overline{2}} \dots L_{\overline{k+1}} \big) = \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( X_R^{(k)\uparrow 1} R_1 \cdots R_k\, L_{\overline{1}} \dots L_{\overline{k}}\, R_k^{-1} \cdots R_1^{-1} \big) \]
\[ = \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( R_1 \cdots R_k\, ( X_R^{(k)} L_{\overline{1}} \dots L_{\overline{k}} )\, R_k^{-1} \cdots R_1^{-1} \big) = \dots = \mathrm{Tr}_{R\,(1, \dots, k)} \big( X_R^{(k)} L_{\overline{1}} L_{\overline{2}} \dots L_{\overline{k}} \big) , \]
where in the last line we applied (2.2.10) $k$ times. To prove the second equality in (3.2.15) we first use relation (3.2.12) and then perform similar transformations.
With the use of (3.2.15) and (3.2.12), checking the centrality of $ch(x^{(k)})$ is straightforward:
\[ L_1\, ch(x^{(k)}) = \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( X_R^{(k)\uparrow 1} L_{\overline{1}} L_{\overline{2}} L_{\overline{3}} \dots L_{\overline{k+1}} \big) = \mathrm{Tr}_{R\,(2, \dots, k+1)} \big( X_R^{(k)\uparrow 1} L_{\underline{k+1}} \dots L_{\underline{2}} L_{\underline{1}} \big) = ch(x^{(k)})\, L_1 . \]
The invariance of $ch(x^{(k)})$ under the left FG[R] coadjoint coaction follows immediately from (3.2.13) together with relation (3.1.4) for the square of the antipode.
Consider a series of elements of the RE algebra LG[R],
\[ p_i := \mathrm{Tr}_R (L^i) , \quad i = 1, 2, \dots . \tag{3.2.16} \]
Further on they are called power sums. The following calculation
\[ L_1\, p_i = \mathrm{Tr}_{R\,(2)} \big( L_1 R_{12} L_1^i R_{12}^{-1} \big) = \mathrm{Tr}_{R\,(2)} \big( R_{12}^{-1} L_1^i R_{12} L_1 \big) = p_i\, L_1 \]
proves the centrality of the power sums. Here in the first and the last equalities we use formula (2.2.10), and the second equality is a consequence of (3.2.1). Actually, the power sums belong to the characteristic subalgebra Ch[R]:
\[ p_i = ch(\sigma_{i-1} \dots \sigma_2 \sigma_1) , \]
which is verified by the following transformation:
\[ ch(\sigma_{i-1} \dots \sigma_2 \sigma_1) = \mathrm{Tr}_{R\,(1, \dots, i)} \big( L_{\overline{1}} \dots L_{\overline{i}}\, (R_{i-1} \dots R_1) \big) \]
\[ = \mathrm{Tr}_{R\,(1, \dots, i)} \big( L_{\overline{1}} \dots L_{\overline{i-1}}\, (R_{i-1} \dots R_1) L_1 (R_1^{-1} \dots R_{i-1}^{-1})\, (R_{i-1} \dots R_1) \big) \]
\[ = \mathrm{Tr}_{R\,(1, \dots, i-1)} \big( L_{\overline{1}} \dots L_{\overline{i-1}}\, \mathrm{Tr}_{R\,(i)}(R_{i-1})\, (R_{i-2} \dots R_1) L_1 \big) \]
\[ = \mathrm{Tr}_{R\,(1, \dots, i-2)} \big( L_{\overline{1}} \dots L_{\overline{i-2}}\, \mathrm{Tr}_{R\,(i-1)}(R_{i-2})\, (R_{i-3} \dots R_1) L_1^2 \big) = \dots = \mathrm{Tr}_R (L^i) = p_i . \]
Here we repeatedly expand the notation $L_{\overline{j}} = (R_{j-1} \dots R_1) L_1 (R_1^{-1} \dots R_{j-1}^{-1})$ for $j = i, \dots, 2$, and use (2.2.8).
Let $R$ be a skew invertible R-matrix of the Hecke type. Assuming that conditions $[k]$ (2.3.1) are fulfilled, consider a series of elements $a_i \in Ch[R]$, $i = 0, 1, \dots, k$, in the corresponding Hecke type RE algebra LG[R],
\[ a_0 := 1 , \qquad a_i := ch(a^{(i)}) = \mathrm{Tr}_{R\,(1, \dots, i)} \big( A^{(i)} L_{\overline{1}} \dots L_{\overline{i}} \big) \quad \forall\, 1 \le i \le k , \tag{3.2.17} \]
where the notations $a^{(i)}$, $A^{(i)}$ were explained in (2.3.2), (2.3.8). The elements $a_i$ are called elementary symmetric functions.
Definition 3.10. Let $R$ be a skew invertible $GL_q(n)$ type R-matrix. The central extension of the corresponding RE algebra LG[R] by an element $a_n^{-1}$: $a_n a_n^{-1} = 1$ is called the $GL_q(n)$ type RE algebra and denoted as $LGL_q(n)[R]$. The quotient of this algebra by the relation
\[ a_n = q^{-1}\, 1 \tag{3.2.18} \]
is called the $SL_q(n)$ type RE algebra and denoted as $LSL_q(n)[R]$.
Remark 3.11. The actual value of the numeric factor on the right-hand side of (3.2.18) is not relevant for the definition. Our choice allows avoiding numeric factors later in formula (4.1.1) (see the proof of Proposition 4.1). Consider a realization of the RE algebra $LSL_q(n)[R]$ as a subalgebra in the quasi-triangular Hopf algebra $A_R$ (see Example 3.8). In this case the condition (3.2.18) is consistent with the pairing $\langle\cdot,\cdot\rangle$ of the dual Hopf algebras $A_R$ and $A_R^*$ only for the chosen normalizations (3.2.5) for L and $\eta = q^{1/n}$ for R (2.2.11). This point is explained in Appendix A, see (A.3). (3.1.8)

Remark 3.12. The $GL_q(n)$ type R-matrix R and its $SL_q(n)$ partner R-matrix R define identical RE algebras.

In the theorem below we describe Cayley-Hamilton and Newton identities specific to the $GL_q(n)$ type and Hecke type RE algebras.

Theorem 3.13. Let R be a skew invertible R-matrix of the Hecke type. Assume that the conditions [k] (2.3.1) are fulfilled. Then in the corresponding RE algebra LG[R] the following Cayley-Hamilton-Newton identities [IOP.98,IOP.99]
$$i_q\,\mathrm{Tr}_{R(2,\dots,i)}\big(A^{(i)}L_{\overline 2}L_{\overline 3}\dots L_{\overline i}\big) = (-1)^{i+1}\sum_{j=0}^{i-1}(-q)^j a_j\,L_1^{\,i-j-1} \qquad \forall\, 2\le i\le k \tag{3.2.19}$$
take place. Multiplying these identities by $L_1$ from the left and taking the R-trace $\mathrm{Tr}_{R(1)}$ one obtains the Newton relations for the set of power sums $\{p_i\}_{i\ge1}$ and the set of elementary symmetric functions $\{a_i\}_{i\ge0}$ [GPS.97],
$$i_q\,a_i + (-1)^i\sum_{j=0}^{i-1}(-q)^j a_j\,p_{i-j} = 0 \qquad \forall\, 1\le i\le k. \tag{3.2.20}$$
Both sets $\{1, p_j\}_{j\ge1}$ and $\{a_j\}_{j\ge0}$ in this case generate the characteristic subalgebra Ch[R]. Assume additionally that R is an R-matrix of the $GL_q(n)$ type. Then the finite set $\{a_i\}_{i=0}^{n}$ generates the characteristic subalgebra of the RE algebra $LGL_q(n)[R]$, and the following Cayley-Hamilton identity is fulfilled [GPS.97]:
$$\sum_{i=0}^{n}(-q)^i a_i\,L^{n-i} = 0. \tag{3.2.21}$$
This identity leads, in particular, to the invertibility of the matrix L:
$$L^{-1} = q^{-1}a_n^{-1}\sum_{i=0}^{n-1}(-q)^{-i}a_{n-i-1}\,L^{i}.$$
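As an informal sanity check of the shape of (3.2.21) and of the inverse formula, the following minimal Python sketch replaces the RE algebra matrix L by an ordinary matrix with commuting rational entries whose eigenvalues are $q\mu_\alpha$, and sets $a_i = e_i(\mu)$ (the identification used later in (3.2.22)). The commutative stand-in, the numeric values and all helper names are assumptions of this illustration; it does not test the quantum statement itself.

```python
from fractions import Fraction
from itertools import combinations


def elem_sym(vals, i):
    """Elementary symmetric polynomial e_i(vals); e_0 = 1."""
    total = Fraction(0)
    for idx in combinations(range(len(vals)), i):
        term = Fraction(1)
        for k in idx:
            term *= vals[k]
        total += term
    return total


def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def matpow(A, p):
    out = identity(len(A))
    for _ in range(p):
        out = matmul(out, A)
    return out


def lincomb(terms, n):
    """Sum of coeff * matrix over (coeff, matrix) pairs."""
    out = [[Fraction(0)] * n for _ in range(n)]
    for coeff, mat in terms:
        for r in range(n):
            for c in range(n):
                out[r][c] += coeff * mat[r][c]
    return out


def check_cayley_hamilton():
    q = Fraction(2)                                  # hypothetical value of q
    mu = [Fraction(1), Fraction(3), Fraction(-2)]    # hypothetical spectral values
    n = len(mu)
    # Upper triangular stand-in for L with eigenvalues q*mu_alpha on the diagonal.
    L = [[q * mu[0], Fraction(1), Fraction(5)],
         [Fraction(0), q * mu[1], Fraction(-1)],
         [Fraction(0), Fraction(0), q * mu[2]]]
    a = [elem_sym(mu, i) for i in range(n + 1)]
    # Left-hand side of (3.2.21): sum_i (-q)^i a_i L^(n-i).
    ch = lincomb([((-q) ** i * a[i], matpow(L, n - i)) for i in range(n + 1)], n)
    # Inverse formula: q^-1 a_n^-1 sum_{i<n} (-q)^-i a_{n-i-1} L^i.
    L_inv = lincomb([((-q) ** (-i) * a[n - i - 1] / (q * a[n]), matpow(L, i))
                     for i in range(n)], n)
    return ch, matmul(L, L_inv), identity(n)
```

In this commutative setting (3.2.21) is the ordinary Cayley-Hamilton theorem for a matrix with eigenvalues $q\mu_\alpha$, and the inverse formula is obtained by factoring one power of L out of it.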
Remark 3.14. One can introduce generating functions a(x), p(x) for the elementary symmetric functions and for the power sums,
$$a(x) := \sum_{i\ge0}a_i\,x^i, \qquad p(x) := \sum_{i\ge1}p_i\,x^i.$$
The Newton relations (3.2.20) can be written as a finite difference equation for the generating functions:
$$a(qx)\,p(-x) = \frac{a(q^{-1}x) - a(qx)}{q - q^{-1}}.$$
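The equivalence between the Newton relations (3.2.20) and the finite difference equation for the generating functions can be checked coefficient by coefficient. The sketch below is only an illustration under stated assumptions: the $a_i$ are treated as commuting rational numbers, the $p_i$ are then defined recursively by (3.2.20), and the function names and sample values are hypothetical.

```python
from fractions import Fraction


def q_number(i, q):
    """q-number i_q = (q^i - q^-i) / (q - q^-1)."""
    return (q ** i - q ** (-i)) / (q - 1 / q)


def power_sums_from_newton(a, q):
    """Solve (3.2.20) recursively for p_1..p_k, given a_0 = 1, a_1, ..., a_k."""
    k = len(a) - 1
    p = [None] * (k + 1)  # p[0] is unused
    for i in range(1, k + 1):
        # i_q a_i + (-1)^i [ p_i + sum_{j=1}^{i-1} (-q)^j a_j p_{i-j} ] = 0
        tail = sum((-q) ** j * a[j] * p[i - j] for j in range(1, i))
        p[i] = -(-1) ** i * q_number(i, q) * a[i] - tail
    return p


def lhs_coeff(a, p, q, i):
    """Coefficient of x^i in a(qx) p(-x)."""
    return sum(a[j] * q ** j * p[i - j] * (-1) ** (i - j) for j in range(i))


def rhs_coeff(a, q, i):
    """Coefficient of x^i in (a(x/q) - a(qx)) / (q - q^-1)."""
    return a[i] * (q ** (-i) - q ** i) / (q - 1 / q)
```

For example, with `q = Fraction(3)` and any rational list `a` starting with 1, `lhs_coeff(a, p, q, i) == rhs_coeff(a, q, i)` for every order i up to the truncation order `len(a) - 1`.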
For the $GL_q(n)$ type RE algebra we now construct its central extension by roots of the characteristic polynomial (3.2.21).

Definition 3.15. Denote by $S_n$ a $\mathbb{C}$-algebra of polynomials in n pairwise commuting invertible indeterminates $\mu_\alpha^{\pm1}$ and their differences $(\mu_\alpha-\mu_\beta)^{\pm1}$, $\alpha,\beta = 1,\dots,n$, $\alpha\ne\beta$. Let R be a skew invertible R-matrix of the $GL_q(n)$ type, $LGL_q(n)[R]$ be the corresponding RE algebra, and Ch[R] be its characteristic subalgebra. Consider a monomorphism $\mathrm{Ch}[R]\to S_n$ defined on generators as⁵
$$a_i \mapsto e_i(\mu_1,\dots,\mu_n) := \sum_{1\le j_1<\dots<j_i\le n}\mu_{j_1}\mu_{j_2}\dots\mu_{j_i} \qquad \forall\, i = 0, 1, \dots, n, \tag{3.2.22}$$
where $e_i$ are the elementary symmetric functions of their arguments. The map (3.2.22) defines naturally a structure of, say, left Ch[R]-module on $S_n$. A central extension of the algebra $LGL_q(n)[R]$,
$$\overline{LGL_q(n)}[R] := LGL_q(n)[R]\underset{\mathrm{Ch}[R]}{\otimes}S_n:\qquad a_\alpha = e_\alpha(\mu_1,\dots,\mu_n),\quad L^i_j\,\mu_\alpha = \mu_\alpha\,L^i_j \qquad \forall\, i,j = 1,\dots,\dim V,\ \forall\,\alpha = 1,\dots,n, \tag{3.2.23}$$
is called a (semisimple) spectral completion of $LGL_q(n)[R]$. A quotient of this algebra by the relations
$$a_n = \prod_{\alpha=1}^{n}\mu_\alpha = q^{-1}$$
is called a (semisimple) spectral completion of $LSL_q(n)[R]$ and denoted $\overline{LSL_q(n)}[R]$. The variables $\mu_\alpha$ are called spectral variables.

Remark 3.16. Assuming that the spectral variables $\mu_\alpha$ are invariants of the coadjoint coaction, the algebra $\overline{LGL_q(n)}[R]$ ($\overline{LSL_q(n)}[R]$) inherits the structure of a left coadjoint $FGL_q(n)[R]$- ($FSL_q(n)[R]$-) comodule algebra.

Corollary 3.17. In the spectrally completed algebra $\overline{LGL_q(n)}[R]$ the characteristic identity (3.2.21) assumes a factorized form
$$\prod_{\alpha=1}^{n}(L - q\mu_\alpha I) = 0. \tag{3.2.24}$$

⁵ When defining the map (3.2.22) we implicitly assume the algebraic independence of the elements $a_i$, $i = 1,\dots,n$. Otherwise, we should impose the same algebraic conditions on the functions $e_i(\mu_1,\dots,\mu_n)$.
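One can construct a resolution of the matrix unity
$$P^\alpha := \prod_{\substack{\beta=1\\ \beta\ne\alpha}}^{n}\frac{L - q\mu_\beta I}{q(\mu_\alpha-\mu_\beta)}:\qquad P^\alpha P^\beta = \delta_{\alpha\beta}P^\alpha, \qquad \sum_{\alpha=1}^{n}P^\alpha = I, \tag{3.2.25}$$
so that
$$L\,P^\alpha = P^\alpha L = q\mu_\alpha P^\alpha. \tag{3.2.26}$$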
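The projectors (3.2.25) have the form of Lagrange interpolation polynomials in L. The following minimal sketch checks their defining properties for an ordinary rational matrix with distinct eigenvalues $q\mu_\alpha$; this commutative stand-in for the RE algebra matrix, the numeric values and the helper names are assumptions made only for illustration.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def scaled_shift(L, shift, scale):
    """Return (L - shift * I) / scale."""
    n = len(L)
    return [[(L[r][c] - (shift if r == c else 0)) / scale for c in range(n)]
            for r in range(n)]


def spectral_projectors(L, q, mu):
    """P^alpha = prod_{beta != alpha} (L - q mu_beta I) / (q (mu_alpha - mu_beta))."""
    n = len(mu)
    projectors = []
    for a in range(n):
        P = identity(n)
        for b in range(n):
            if b != a:
                P = matmul(P, scaled_shift(L, q * mu[b], q * (mu[a] - mu[b])))
        projectors.append(P)
    return projectors


q = Fraction(2)                                  # hypothetical value of q
mu = [Fraction(1), Fraction(3), Fraction(-2)]    # hypothetical spectral values
# Upper triangular stand-in for L with eigenvalues q*mu_alpha.
L = [[q * mu[0], Fraction(1), Fraction(5)],
     [Fraction(0), q * mu[1], Fraction(-1)],
     [Fraction(0), Fraction(0), q * mu[2]]]
P = spectral_projectors(L, q, mu)
```

With these data the matrices `P[a]` are mutually orthogonal idempotents summing to the identity, and `L` acts on the image of `P[a]` by the scalar `q * mu[a]`.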
Remark 3.18. In papers [GS.99,DM.01,DM.02,GS.04] the factorized form of the Cayley-Hamilton identity and the projectors $P^\alpha$ were used to construct explicitly quantized semisimple coadjoint orbits of GL(n) and line bundles over them.

3.3. Quantized differential operators over matrix group (Heisenberg double).

Definition 3.19 [AF.91,S]. Let R, T and L be as described in Definitions 3.1 and 3.7. A Heisenberg double (HD) algebra DG[R, γ] of the two algebras FG[R] and LG[R] is an associative unital algebra generated by the components of the matrices T and L subject additionally to the permutation relation
$$\gamma^2\,T_1L_2 = R_{12}L_1R_{12}T_1, \qquad\text{where } \gamma\in\mathbb{C}\setminus\{0\}. \tag{3.3.1}$$
The HD algebra carries structures of left and right FG[R]-comodule algebra, respectively,
$$\delta_\ell(T^i_j) = \sum_k T^i_k\otimes T^k_j, \qquad \delta_\ell(L^i_j) = \sum_{k,m}T^i_k\,(T^{-1})^m_j\otimes L^k_m; \tag{3.3.2}$$
$$\delta_r(T^i_j) = \sum_k T^i_k\otimes T^k_j, \qquad \delta_r(L^i_j) = L^i_j\otimes 1. \tag{3.3.3}$$
Example 3.20. The Heisenberg double is closely related to a smash product of two mutually dual Hopf algebras (see, e.g., [M]). Namely, given a pair $A_R$ and $A_R^*$, their smash product algebra $A_R\,\sharp\,A_R^*$ is the linear space $A_R\otimes A_R^*$ supplied with the multiplication
$$(x\,\sharp\,u)(y\,\sharp\,v) := \langle u_{(1)}, y_{(2)}\rangle\,(x\,y_{(1)}\,\sharp\,u_{(2)}v), \tag{3.3.4}$$
where x, y ∈ AR , u, v ∈ A∗R , and symbols (x u), (y v) denote elements of AR A∗R . Let us calculate in the settings of Examples 3.2, 3.8 the smash product of the elements i (T j 1) = T ji and (1 L (±) ij ) = L (±) ij , (+) (+) −1 (+) L 2 P12 R12 T1 , T1 L (+) 2 = L 2 T1 , L 2 T1 = η (−) (−) (−) −1 T1 L (−) 2 = L 2 T1 , L 2 T1 = η L 2 P12 R12 T1 ,
wherefrom it follows that the smash product of T1 and L 2 ∝ (L (−) −1 L (+) )2 is given by (3.3.1) with γ = η. However, we stress that in general one can keep γ independent of the normalization η of the R-matrix at the price of losing universality of formulas. Indeed, the multiplication in the smash product algebra is given by (3.3.4) universally for any pair of its elements, while the relation (3.3.1) in the HD algebra is written for the generators L and T only.
Now we consider a geometric interpretation of the HD algebra. Applying the substitution $L^i_j\to\ell^i_j$ (3.2.6) and taking the "classical" limit $q\to1$ in relation (3.3.1) we find
$$[T_1, \ell_2] = (P_{12} - \gamma I_{12})\,T_1, \tag{3.3.5}$$
where we used the Hecke condition (2.3.6) in the form $R^2 = I + (q-q^{-1})R$ and assumed additionally $R\xrightarrow{\,q\to1\,}P$ (which is true for the Drinfeld-Jimbo R-matrix (2.4.6)) and $\gamma\equiv\gamma(q) = 1 + (q-q^{-1})\frac{\gamma}{2} + o(q-q^{-1})$. The commutation relations (3.3.5), (3.2.8) are realized by operators
$$\ell^i_j = \sum_{k=1}^{n}g^i_k\frac{\partial}{\partial g^j_k}, \qquad T^i_j = |g|^{-\gamma}g^i_j, \qquad\text{where } g^i_j := \rho_V(g)^i_j,\ g\in GL(n),\ |g| := \det\|g^i_j\|.$$
These are, respectively, right invariant vector fields and properly normalized coordinate functions on GL(n). Together they generate an algebra of differential operators over GL(n)⁶. Extrapolating the classical picture we can treat DG[R, γ] as an algebra of quantized differential operators over the matrix group or, equivalently, as quantized functions over the cotangent bundle of a matrix group (see [AF.91,AF.92,SWZ.92,IP]). The form of the substitution (3.2.6) suggests that the quantized vector fields $L^i_j$ possess properties of finite difference operators rather than of differential operators. In particular, they do not satisfy the classical Leibniz rule when acting on functions (see (3.3.1)).

The next proposition describes an action of the characteristic subalgebra on quantized functions in the Hecke case.

Proposition 3.21. Let R be a skew invertible Hecke type R-matrix. Assume that the conditions [k] (2.3.1) are satisfied, so that the elementary symmetric functions $a_i\in\mathrm{Ch}[R]\subset DG[R,\gamma]$, $0\le i\le k$, (3.2.17) are well defined. Then the relations
$$\gamma^{2i}\,T\,a_i = a_i\,T - (q^2-1)\sum_{j=1}^{i}(-q)^{-j}a_{i-j}\,(L^jT) \qquad \forall\, 0\le i\le k \tag{3.3.6}$$
are fulfilled for the Hecke type HD algebra DG[R, γ].

Proof. For any operator $Y\in\mathrm{End}_W(V^{\otimes i})$, where W is an arbitrary $\mathbb{C}$-linear space, we denote
$$Y^{\uparrow1} := (P_1P_2\dots P_i)\,Y\,(P_1P_2\dots P_i)^{-1}. \tag{3.3.7}$$
For any R-matrix R we define series of operators $J_i$, $Z_i$,
$$J_1 := I, \qquad J_{i+1} := R_iJ_iR_i \quad \forall\, i\ge1, \qquad Z_i := \prod_{j=1}^{i}J_j. \tag{3.3.8}$$

⁶ Imposing conditions γ = 1/n, det T = 1, Tr ℓ = 0 one can make a reduction to a subalgebra of differential operators over SL(n).
Remark 3.22. The elements $J_j$, $1\le j\le i$, are R-matrix realizations of a remarkable set of Jucys-Murphy elements in the braid group $B_i$:
$$j_1 := 1, \qquad j_{j+1} := \sigma_j\,j_j\,\sigma_j \quad \forall\, j = 1,\dots,i-1.$$
These elements generate a commutative subgroup in $B_i$ and their product $z_i := \prod_{j=1}^{i}j_j$ is a central element in $B_i$. For their applications and for historical references see, e.g., [OP.01].

With these notations the permutation relations (3.2.1), (3.3.1) can be suitably written for arbitrary R-copies of the matrix L:
$$(L_{\overline i}J_i)(L_{\overline j}J_j) = (L_{\overline j}J_j)(L_{\overline i}J_i), \tag{3.3.9}$$
$$\gamma^2\,T_1\,(L_{\overline i}J_i)^{\uparrow1} = (L_{\overline{i+1}}J_{i+1})\,T_1 = R_i\,(L_{\overline i}J_i)\,R_i\,T_1 \qquad \forall\, i,j\ge1. \tag{3.3.10}$$
Here the second equality follows from the recursive definitions of $L_{\overline{i+1}}$ and $J_{i+1}$, while the first equality can be easily proved by induction on i. Next, we prepare a suitable expression for $a_i$ (3.2.17):
$$\begin{aligned}
a_i &= \mathrm{Tr}_{R(1,\dots,i)}\big(L_{\overline 1}\dots L_{\overline i}\,A^{(i)}\big) = q^{i(i-1)}\,\mathrm{Tr}_{R(1,\dots,i)}\big(L_{\overline 1}\dots L_{\overline i}\,Z_i\,A^{(i)}\big)\\
&= q^{i(i-1)}\,\mathrm{Tr}_{R(1,\dots,i)}\big((L_{\overline 1}J_1)\dots(L_{\overline i}J_i)\,A^{(i)}\big).
\end{aligned} \tag{3.3.11}$$
Here we substituted $Z_iA^{(i)} = q^{-i(i-1)}A^{(i)}$ in the first line and used the commutativity relation
$$L_{\overline i}\,J_j = J_j\,L_{\overline i} \qquad \forall\, i,j:\ i>j \tag{3.3.12}$$
in the second line. By relabelling the subscript indices of the R-traces we then recast (3.3.11) in the following form⁷
$$I_1\,a_i = q^{i(i-1)}\,\mathrm{Tr}_{R(2,\dots,i+1)}\Big(\big((L_{\overline 1}J_1)\dots(L_{\overline i}J_i)\,A^{(i)}\big)^{\uparrow1}\Big). \tag{3.3.13}$$

⁷ Notice a similarity of the formula (3.3.13) with the relation (2.2.10). The role of the R-matrices $R^{\pm\varepsilon}$ is now played by the permutation matrix P (see (3.3.7)).

Now we are ready to permute $T_1$ and $a_i$. Substituting expression (3.3.13) for $a_i$ and using relations (3.3.10) and (3.3.12) we calculate
$$\begin{aligned}
\gamma^{2i}\,T_1a_i = \gamma^{2i}\,T_1(I_1a_i) &= q^{i(i-1)}\gamma^{2i}\,\mathrm{Tr}_{R(2,\dots,i+1)}\Big(T_1\big((L_{\overline 1}J_1)\dots(L_{\overline i}J_i)\,A^{(i)}\big)^{\uparrow1}\Big)\\
&= q^{i(i-1)}\,\mathrm{Tr}_{R(2,\dots,i+1)}\big((L_{\overline 2}J_2)\dots(L_{\overline{i+1}}J_{i+1})\,A^{(i)\uparrow1}\big)\,T_1\\
&= q^{i(i-1)}\,\mathrm{Tr}_{R(2,\dots,i+1)}\big((L_{\overline 2}\dots L_{\overline{i+1}})\,Z_{i+1}\,A^{(i)\uparrow1}\big)\,T_1 .
\end{aligned}$$
To continue the calculation we need the following formula:
$$Z_{i+1}\,A^{(i)\uparrow1} = A^{(i)\uparrow1}\,Z_{i+1}\,A^{(i)\uparrow1} = q^{-i(i-1)}\Big(q^2A^{(i)\uparrow1} - q^{-i}(q^2-1)\,(i+1)_q\,A^{(i+1)}\Big),$$
which follows by a combination of the definitions (2.3.3), (2.3.8), (3.3.8), and relations (2.3.4), (2.3.5), (2.3.6). So we finish the calculation
$$\begin{aligned}
\gamma^{2i}\,T_1a_i &= \mathrm{Tr}_{R(2,\dots,i+1)}\Big((L_{\overline 2}\dots L_{\overline{i+1}})\big[q^2A^{(i)\uparrow1} - q^{-i}(q^2-1)\,(i+1)_q\,A^{(i+1)}\big]\Big)\,T_1\\
&= q^2\,a_iT_1 - (-q)^{-i}(q^2-1)\sum_{j=0}^{i}(-q)^{j}a_j\,(L^{i-j}T)_1\\
&= a_iT_1 - (q^2-1)\sum_{j=1}^{i}(-q)^{-j}a_{i-j}\,(L^jT)_1 .
\end{aligned} \tag{3.3.14}$$
Here we calculate the first summand in the second line taking into account the equality
$$(L_{\overline 2}\dots L_{\overline{i+1}})\,A^{(i)\uparrow1} = (R_1\dots R_i)(L_{\overline 1}\dots L_{\overline i})\,A^{(i)}\,(R_1\dots R_i)^{-1},$$
and using formula (2.2.10) i times. For the calculation of the second summand we use the Cayley-Hamilton-Newton identity (3.2.19) with i replaced by i + 1; the third line then follows by the change of summation index $j\to i-j$, the $j=0$ term combining with $q^2a_iT_1$. Thus (3.3.6) is proved.

Remark 3.23. For the set of power sums (3.2.16) the permutation relations with $T^i_j$ in the Hecke case read
$$\gamma^{2i}\,T\,p_i = p_i\,T + (q-q^{-1})^2\sum_{j=1}^{i-1}\frac{(2j)_q}{2_q}\,p_{i-j}\,(L^jT) + (q-q^{-1})\,\frac{(2i)_q}{2_q}\,(L^iT).$$
One can derive this formula by applying the R-trace $\mathrm{Tr}_{R(2)}$ to the equality $\gamma^{2i}\,T_1(L_2)^i = (RL_1R)^i\,T_1$ and taking into account the relations
$$(RL_1R)^i = R\,(L_1)^i\,R + (q-q^{-1})\sum_{j=1}^{i-1}R^{2j}(L_1)^{i-j}R\,(L_1)^j, \qquad R^{2j} = 2_q^{-1}\Big((q^{2j-1}+q^{-2j+1})\,I + (q^{2j}-q^{-2j})\,R\Big).$$
These relations, in turn, follow inductively from the Hecke condition (2.3.6) and the reflection equation (3.2.1). Note that in this case there is no need to impose restrictions (2.3.1) on q.

Proposition 3.24. Let R be a skew invertible $GL_q(n)$ type R-matrix. An extension of the corresponding HD algebra DG[R, γ] by the elements $(\det_RT)^{-1}$ and $(a_n)^{-1}$, satisfying the relations
$$\gamma^{2n}\,L\,(\det{}_RT)^{-1} = q^2\,(\det{}_RT)^{-1}\,(O_R\,L\,O_R^{-1}), \tag{3.3.15}$$
$$\gamma^{2n}\,(a_n)^{-1}\,T = q^2\,T\,(a_n)^{-1}, \tag{3.3.16}$$
in addition to those given in Definitions 3.4 and 3.10, is called the $GL_q(n)$ type HD algebra and denoted $DGL_q(n)[R,\gamma]$. Let R be a skew invertible $SL_q(n)$ type R-matrix. In the corresponding HD algebra DG[R, γ] let us restrict the parameters by the condition $\gamma^n = q$ and take a quotient by the relations $\det_RT = 1$ and $a_n = q^{-1}1$. The quotient algebra is called the $SL_q(n)$ type HD algebra and denoted $DSL_q(n)[R]$.
Remark 3.25. Notice the consistency of the $SL_q(n)$ reduction condition $\gamma^n = q$ with the parameter restrictions $\eta = q^{1/n}$ in Example 2.10 and $\gamma = \eta$ in Example 3.20.

Proof. Relations (3.3.15) and (3.3.16) should be consistent with the permutation relations for $\det_RT$ and $a_n$ in the algebra DG[R, γ]. Permutation relations for $a_n$ with T were in fact derived in the first line of the calculation (3.3.14) (put i = n and take into account that $A^{(n+1)} = 0$ in the $GL_q(n)$ case). The permutation relation for $\det_RT$ with L can be derived by the same method as for $\det_RT$ with T (see [G], Sect. 5, or [Is.04], calculation (3.5.39)). Given these results the consistency is obvious. In the $SL_q(n)$ case ($O_R\propto I$, $\gamma^n = q$) the elements $\det_RT$ and $a_n$ are central. Hence, $DSL_q(n)[R]$ is consistently defined.

Corollary 3.26. In the $GL_q(n)$ type HD algebra the elements of the characteristic subalgebra satisfy the following commutation relations with $\det_RT$:
$$\gamma^{2nk}\,\det{}_RT\;\mathrm{ch}(x^{(k)}) = q^{2k}\,\mathrm{ch}(x^{(k)})\,\det{}_RT \qquad \forall\, x^{(k)}\in H_k(q),\ k = 1, 2, \dots.$$

Proof. The proof is a direct calculation of the permutation of $\mathrm{ch}(x^{(k)})$ (3.2.14) and $\det_RT$ exploiting relations (3.3.15) and the properties of the matrix $O_R$ (3.1.7),
$$R_{12}\,O_{R\,1}O_{R\,2} = O_{R\,1}O_{R\,2}\,R_{12}, \qquad O_RD_R = D_RO_R.$$
The latter relations are proved in [OP.05], Sect. 5.3.

Theorem 3.27. Let R be a skew invertible $GL_q(n)$ ($SL_q(n)$) type R-matrix. An extension of the corresponding HD algebra $DGL_q(n)[R,\gamma]$ ($DSL_q(n)[R]$) by the algebra $S_n$ of polynomials in mutually commuting indeterminates $\mu_\alpha^{\pm1}$, $(\mu_\alpha-\mu_\beta)^{\pm1}$ satisfying relations (3.2.23) together with
$$\gamma^2\,(P^\beta T)\,\mu_\alpha = q^{2\delta_{\alpha\beta}}\,\mu_\alpha\,(P^\beta T) \qquad \forall\,\alpha,\beta = 1,\dots,n, \tag{3.3.17}$$
or, equivalently,
$$\gamma^2\,T\,\mu_\alpha = \mu_\alpha\,T + (q^2-1)\,\mu_\alpha\,(P^\alpha T),$$
is called a (semisimple) spectral completion of the $GL_q(n)$ ($SL_q(n)$) type HD algebra and denoted $\overline{DGL_q(n)}[R,\gamma]$ ($\overline{DSL_q(n)}[R]$).

Remark 3.28. To avoid problems with permutations of $(\mu_\alpha-\mu_\beta)^{-1}$ with $P^\sigma T$ one could assume invertibility of all elements $(\mu_\alpha - q^{2k}\mu_\beta)$ ∀ α ≠ β, k ∈ Z. Further on we will not make such permutations and so we do not impose the corresponding restrictions.

Remark 3.29. Assuming that the spectral variables $\mu_\alpha$ are invariants of both left and right coactions, the algebra LG L q (n)[R, γ ] (L S L q (n)[R]) inherits the structures of left and right $FGL_q(n)[R]$- ($FSL_q(n)[R]$-) comodule algebra (see Definition 3.19).

Remark 3.30. Note that relation (3.3.17) is typical for Weyl algebra generators. In fact there are many ways to combine from the elements $(P^\beta T)^i_j$ a set of n generators satisfying Weyl relations with the spectral variables $\mu_\alpha$. One such possibility is used later in Sect. 4.4.
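Proof. We have to check the consistency of relations (3.3.17), (3.3.6) with the conditions $a_i = e_i(\mu_1,\dots,\mu_n)\equiv e_i(\mu)$ for $1\le i\le n$. Denote $e_i(\mu_{/\alpha}) := e_i(\mu)|_{\mu_\alpha=0}$. We have
$$e_i(\mu) = e_i(\mu_{/\alpha}) + \mu_\alpha\,e_{i-1}(\mu_{/\alpha}) \quad\Rightarrow\quad e_i(\mu_{/\alpha}) = \sum_{j=0}^{i}(-\mu_\alpha)^j\,e_{i-j}(\mu). \tag{3.3.18}$$
Using relations (3.3.17), (3.3.18), (3.2.25) and (3.2.26) we calculate
$$\begin{aligned}
\gamma^{2i}\,T\,e_i(\mu) &= \gamma^{2i}\sum_{\alpha=1}^{n}(P^\alpha T)\Big(e_i(\mu_{/\alpha}) + \mu_\alpha\,e_{i-1}(\mu_{/\alpha})\Big)\\
&= \sum_{\alpha=1}^{n}\Big(e_i(\mu_{/\alpha}) + q^2\mu_\alpha\,e_{i-1}(\mu_{/\alpha})\Big)(P^\alpha T)\\
&= \sum_{\alpha=1}^{n}\Big(e_i(\mu) + (q^2-1)\,\mu_\alpha\sum_{j=0}^{i-1}(-\mu_\alpha)^j\,e_{i-j-1}(\mu)\Big)(P^\alpha T)\\
&= \sum_{\alpha=1}^{n}\Big(e_i(\mu) - (q^2-1)\sum_{j=1}^{i}(-L/q)^j\,e_{i-j}(\mu)\Big)(P^\alpha T)\\
&= e_i(\mu)\,T - (q^2-1)\sum_{j=1}^{i}(-q)^{-j}\,e_{i-j}(\mu)\,(L^jT),
\end{aligned}$$
which coincides with (3.3.6) under the identification $e_i(\mu) = a_i$.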
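The combinatorial identity (3.3.18) used in the proof can be checked directly on sample values. The sketch below is illustrative only; the integer values of μ and the function names are hypothetical.

```python
from itertools import combinations


def elem_sym(vals, i):
    """Elementary symmetric polynomial e_i(vals); e_0 = 1, e_i = 0 for i > len(vals)."""
    total = 0
    for idx in combinations(range(len(vals)), i):
        term = 1
        for k in idx:
            term *= vals[k]
        total += term
    return total


def check_reduced_identity(mu):
    """Check e_i(mu with mu_alpha = 0) == sum_{j=0}^{i} (-mu_alpha)^j e_{i-j}(mu)."""
    n = len(mu)
    for alpha in range(n):
        reduced = mu[:alpha] + mu[alpha + 1:]  # same e_i as setting mu_alpha = 0
        for i in range(n + 1):
            lhs = elem_sym(reduced, i)
            rhs = sum((-mu[alpha]) ** j * elem_sym(mu, i - j) for j in range(i + 1))
            if lhs != rhs:
                return False
    return True
```

The identity is the coefficient form of $E_{/\alpha}(t) = E(t)/(1+\mu_\alpha t)$ for the generating polynomial $E(t) = \prod_\beta(1+\mu_\beta t)$, which is why it also holds for i = n, where the left-hand side vanishes.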
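Corollary 3.31. In the completed $GL_q(n)$ type HD algebra $\overline{DGL_q(n)}[R,\gamma]$ the following permutation relations hold:
$$\gamma^{2n}\,\det{}_RT\;\mu_\alpha = q^2\,\mu_\alpha\,\det{}_RT \qquad \forall\,\alpha = 1, 2, \dots, n. \tag{3.3.19}$$

Proof. Using formulas (3.1.6), (3.2.25), (3.3.17) we can permute $\det_RT$ and $\mu_\alpha$:
$$\begin{aligned}
\gamma^{2n}\,\det{}_RT\;\mu_\alpha &= \gamma^{2n}\sum_{\beta_1,\dots,\beta_n=1}^{n}\mathrm{Tr}_{(1,\dots,n)}\Big(A^{(n)}(P^{\beta_1}T)_1\dots(P^{\beta_n}T)_n\Big)\,\mu_\alpha\\
&= \mu_\alpha\sum_{\beta_1,\dots,\beta_n=1}^{n}q^{\,2\sum_{j=1}^{n}\delta_{\alpha\beta_j}}\;\mathrm{Tr}_{(1,\dots,n)}\Big(A^{(n)}(P^{\beta_1}T)_1\dots(P^{\beta_n}T)_n\Big).
\end{aligned} \tag{3.3.20}$$
Assuming that
$$\mathrm{Tr}_{(1,\dots,n)}\Big(A^{(n)}(P^{\beta_1}T)_1\dots(P^{\beta_n}T)_n\Big) = 0 \quad\text{if there exists a pair } i\ne j:\ \beta_i = \beta_j, \tag{3.3.21}$$
we conclude that for any nonzero summand in (3.3.20) the indices $\beta_1,\dots,\beta_n$ are pairwise distinct, so the coefficient $q^{\,2\sum_{j=1}^{n}\delta_{\alpha\beta_j}}$ equals $q^2$, and therefore we can complete the calculation
$$\gamma^{2n}\,\det{}_RT\;\mu_\alpha = q^2\,\mu_\alpha\sum_{\beta_1,\dots,\beta_n=1}^{n}\mathrm{Tr}_{(1,\dots,n)}\Big(A^{(n)}(P^{\beta_1}T)_1\dots(P^{\beta_n}T)_n\Big) = q^2\,\mu_\alpha\,\det{}_RT .$$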
It remains to prove the assumption. First, we note that the conditions on $\beta_i$ in (3.3.21) state that there exists an integer σ, $1\le\sigma\le n$, such that $\sigma\ne\beta_i$ ∀ i. Therefore, any projector $P^{\beta_i}$ in (3.3.21) contains the factor $(L - q\mu_\sigma I)$. Using relations (3.3.10), (3.3.17) we can move all such factors to the left side of the expression. Thus we obtain
$$\text{left hand side of (3.3.21)}\ \propto\ \mathrm{Tr}_{(1,\dots,n)}\Big(A^{(n)}\Big\{\prod_{j=1}^{n}\big(L_{\overline j}J_j - q\mu_\sigma I\big)\Big\}\dots\Big). \tag{3.3.22}$$
Next, we note that the expression in braces is a symmetric function in a commuting set of matrices $L_{\overline j}J_j$ (see (3.3.9)) which, by the relations
$$R_i\,(L_{\overline i}J_i)(L_{\overline{i+1}}J_{i+1}) = (L_{\overline i}J_i)(L_{\overline{i+1}}J_{i+1})\,R_i, \qquad R_i\,(L_{\overline i}J_i + L_{\overline{i+1}}J_{i+1}) = (L_{\overline i}J_i + L_{\overline{i+1}}J_{i+1})\,R_i,$$
and by (3.2.11) together with the same formulas for $J_k$, commutes with $R_i$, $i = 1,\dots,n-1$, and so with $A^{(n)}$. Hence, using the relations $A^{(n)} = (A^{(n)})^2$ and $\mathrm{rk}\,A^{(n)} = 1$ we can separate a left factor
$$\kappa := \mathrm{Tr}_{(1,\dots,n)}\Big(A^{(n)}\prod_{j=1}^{n}\big(L_{\overline j}J_j - q\mu_\sigma I\big)\Big)$$
in (3.3.22). This factor we now calculate explicitly. Taking into account relations (2.4.5), (3.3.9), (3.3.12) and $A^{(n)}J_i = q^{-2(i-1)}A^{(n)}$ we transform the expression for κ:
$$\kappa = q^{n}\,\mathrm{Tr}_{R(1,\dots,n)}\Big(A^{(n)}\prod_{j=1}^{n}\big(L_{\overline j} - q^{2j-1}\mu_\sigma I\big)\Big). \tag{3.3.23}$$
Expanding this expression in powers of L and noticing that (2.4.4) gives
$$\mathrm{Tr}_{R(k+1,\dots,n)}\,A^{(n)} = q^{n(k-n)}\,\frac{k_q!\,(n-k)_q!}{n_q!}\,A^{(k)} =: q^{n(k-n)}\binom{n}{k}_q^{-1}A^{(k)},$$
we find that the k-th order monomials
$$\mathrm{Tr}_{R(1,\dots,n)}\big(A^{(n)}L_{\overline{i_1}}\dots L_{\overline{i_k}}\big) = \mathrm{Tr}_{R(1,\dots,n)}\big(A^{(n)}L_{\overline 1}\dots L_{\overline k}\big) = q^{n(k-n)}\binom{n}{k}_q^{-1}a_k$$
are equal to each other for any choice of indices $1\le i_1<\dots<i_k\le n$. Their corresponding coefficients in (3.3.23) sum up to
$$\sum_{1\le i_1<\dots<i_{n-k}\le n}(-q^{-1}\mu_\sigma)^{n-k}\,q^{\,2\sum_{r=1}^{n-k}i_r} = q^{n(n-k)}\binom{n}{k}_q(-\mu_\sigma)^{n-k},$$
and so we obtain
$$\kappa = q^{n}\sum_{k=0}^{n}a_k\,(-\mu_\sigma)^{n-k} = q^{n}\prod_{\alpha=1}^{n}(\mu_\alpha-\mu_\sigma) = 0,$$
where we took (3.2.23) into account.
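The summation over index subsets used in the last step of the proof is a q-binomial identity that can be tested numerically. The sketch below assumes the symmetric q-numbers $i_q = (q^i - q^{-i})/(q - q^{-1})$ and the q-binomial coefficient $\binom{n}{k}_q = n_q!/(k_q!\,(n-k)_q!)$; the function names and the sample value of q are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def q_number(i, q):
    """Symmetric q-number i_q = (q^i - q^-i) / (q - q^-1)."""
    return (q ** i - q ** (-i)) / (q - 1 / q)


def q_factorial(n, q):
    out = Fraction(1)
    for i in range(1, n + 1):
        out *= q_number(i, q)
    return out


def q_binomial(n, k, q):
    return q_factorial(n, q) / (q_factorial(k, q) * q_factorial(n - k, q))


def subset_sum(n, m, q):
    """Sum over 1 <= i_1 < ... < i_m <= n of q^(2 (i_1 + ... + i_m) - m)."""
    return sum(q ** (2 * sum(idx) - m) for idx in combinations(range(1, n + 1), m))


def check_subset_identity(n, q):
    """Check subset_sum(n, m) == q^(n m) * [n choose m]_q for all 0 <= m <= n."""
    return all(subset_sum(n, m, q) == q ** (n * m) * q_binomial(n, m, q)
               for m in range(n + 1))
```

With m = n - k this is the coefficient of $(-\mu_\sigma)^{n-k}$ quoted above, since $\binom{n}{n-k}_q = \binom{n}{k}_q$.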
3.4. Quantized left invariant vector fields. In the classical differential geometry of Lie groups one uses two global bases on tangent bundles: the bases of right and left invariant vector fields. In previous sections we discussed quantization of the right invariant vector fields only and defined the HD algebra DG[R, γ] in their terms. To demonstrate a left-right symmetry of the whole construction we now describe the HD algebra using a set of left invariant generators. We also find explicit relations between the spectra of left and right invariant vector fields in both the $GL_q(n)$ and the $SL_q(n)$ cases.

In the assumptions of Definition 3.19 consider a matrix M whose components belong to DG[R, γ]:
$$M^i_j := \sum_{k,m}(T^{-1})^i_k\,L^k_m\,T^m_j. \tag{3.4.1}$$
Taking into account the transformation properties of the matrix elements $M^i_j$ with respect to the left and right FG[R]-coactions (3.3.2) and (3.3.3),
$$\delta_\ell(M^i_j) = 1\otimes M^i_j, \qquad \delta_r(M^i_j) = \sum_{k,m}(T^{-1})^i_k\,T^m_j\otimes M^k_m, \tag{3.4.2}$$
we shall call them a basis of quantized left invariant vector fields over the matrix group. One can give the presentation of the HD algebra DG[R, γ] in terms of generators $T^i_j$ and $M^i_j$, and relations
$$\overset{*}{R}_{12}\,T_2T_1 = T_2T_1\,\overset{*}{R}_{12}, \qquad \overset{*}{R}_{12}\,M_1\,\overset{*}{R}_{12}\,M_1 = M_1\,\overset{*}{R}_{12}\,M_1\,\overset{*}{R}_{12}, \tag{3.4.3}$$
$$\gamma^{-2}\,M_2T_1 = T_1\,\overset{*}{R}_{12}\,M_1\,\overset{*}{R}_{12}, \tag{3.4.4}$$
where we denote
$$\overset{*}{R}_{12} := (PR^{-1}P)_{12} = (R_{21})^{-1}. \tag{3.4.5}$$
Necessary technical data about $\overset{*}{R}$ are collected in Lemma B.1 in Appendix B. By (3.4.3), the entries of the matrix M generate yet another RE subalgebra $LG[\overset{*}{R}]$ in the HD algebra DG[R, γ]. By (3.4.2), the subalgebra $LG[\overset{*}{R}]$ is a right coadjoint FG[R]-comodule algebra. We also notice a nontrivial but quite expected property of the quantized left and right invariant vector fields, their mutual commutativity, $M_1L_2 = L_2M_1$.

In the rest of this section we investigate the characteristic subalgebra $\mathrm{Ch}[\overset{*}{R}]\subset LG[\overset{*}{R}]$. In particular, we shall see that $\mathrm{Ch}[\overset{*}{R}] = \mathrm{Ch}[R]$ for the DG[R, γ]-subalgebras $LG[\overset{*}{R}]$ and LG[R].

It is suitable to introduce $\overset{*}{R}$-copies of the matrix M (cf. with (3.2.9)),
$$M_{\overset{*}{1}} := M_1, \qquad M_{\overset{*}{k+1}} := \overset{*}{R}_k\,M_{\overset{*}{k}}\,(\overset{*}{R}_k)^{-1}, \tag{3.4.6}$$
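and $\overset{*}{R}$-matrix realizations of the Jucys-Murphy elements (cf. with (3.3.8) and Remark 3.22),
$$\overset{*}{J}_1 := I, \qquad \overset{*}{J}_{k+1} := \overset{*}{R}_k\,\overset{*}{J}_k\,\overset{*}{R}_k \quad \forall\, k\ge1. \tag{3.4.7}$$
In their terms the relations (3.4.3), (3.4.4) can be written as (cf. with (3.2.10), (3.3.10))
$$\overset{*}{R}_k\,M_{\overset{*}{k}}\,M_{\overset{*}{k+1}} = M_{\overset{*}{k}}\,M_{\overset{*}{k+1}}\,\overset{*}{R}_k, \qquad \gamma^{-2}\,\big(M_{\overset{*}{k}}\,\overset{*}{J}_k\big)^{\uparrow1}\,T_1 = T_1\,M_{\overset{*}{k+1}}\,\overset{*}{J}_{k+1}. \tag{3.4.8}$$
We now assume that the R-matrix $\overset{*}{R}$ is skew invertible⁸ and introduce two generating sets in the characteristic subalgebra $\mathrm{Ch}[\overset{*}{R}]\subset LG[\overset{*}{R}]$: the power sums $\overset{*}{p}_k$,
$$\overset{*}{p}_k := \mathrm{Tr}_{\overset{*}{R}}(M^k), \qquad k = 1, 2, \dots,$$
and, assuming additionally that conditions [k] (2.3.1) are fulfilled, the elementary symmetric functions $\overset{*}{a}_i$,
$$\overset{*}{a}_0 := 1, \qquad \overset{*}{a}_i := \mathrm{Tr}_{\overset{*}{R}(1,\dots,i)}\Big(\overset{*}{A}{}^{(i)}\,M_{\overset{*}{1}}M_{\overset{*}{2}}\dots M_{\overset{*}{i}}\Big) \quad \forall\, 1\le i\le k. \tag{3.4.9}$$

Proposition 3.32. Let R be a skew invertible Hecke type R-matrix and $D_R$ be invertible. Assume that conditions [k] (2.3.1) are satisfied. Then for two sets of elements in $\mathrm{Ch}[R]\subset DG[R,\gamma]$, namely $a_i$ (3.2.17) and $\overset{*}{a}_i$ (3.4.9), the following relations are satisfied:
$$\overset{*}{a}_i = \gamma^{2i}\,a_i \qquad \forall\, 0\le i\le k.$$

Proof. We transform the expression (3.4.9) for $\overset{*}{a}_i$ in the following way:
$$\overset{*}{a}_i = \mathrm{Tr}_{\overset{*}{R}(1,\dots,i)}\Big(M_{\overset{*}{1}}\dots M_{\overset{*}{i}}\,\overset{*}{A}{}^{(i)}\Big) = q^{-i(i-1)}\,\mathrm{Tr}_{\overset{*}{R}(1,\dots,i)}\Big((M_{\overset{*}{1}}\overset{*}{J}_1)\dots(M_{\overset{*}{i}}\overset{*}{J}_i)\,\overset{*}{A}{}^{(i)}\Big).$$
Here we used the formulas
$$\overset{*}{J}_k\,\overset{*}{A}{}^{(i)} = q^{2(k-1)}\,\overset{*}{A}{}^{(i)} \quad \forall\, 1\le k\le i, \qquad\text{and}\qquad M_{\overset{*}{i}}\,\overset{*}{J}_k = \overset{*}{J}_k\,M_{\overset{*}{i}} \quad \forall\, 1\le k<i, \tag{3.4.10}$$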
which, in turn, follow from (B.3), (3.4.6), (3.4.7). Then we apply Lemma B.2 from Appendix B and use the relations (B.6) to move ∗ (i) A leftwards ∗ ∗ a i = q −i(i−1) γ i(i+1) Tr ∗ (1, . . . , i) Tr (i + 1, . . . , 2i) ϒ P(i) ϒ P(2i) (L T )i . . . (L T )1 ϒ (2i) A(i)↑i × ∗ R R R −1 (2i) (i) × (Ti . . . T1 ) ϒ P ϒ P . (∗)
Here matrices ϒ∗ are defined in (B.4). 8 This is indeed the case if R is skew invertible and D is invertible (see Lemma B.1 in Appendix B). R
∗
Next, we permute A(i)↑i with ϒ (2i) and cancel terms ϒ P(i) ϒ P(2i) on the left and ∗ (2i)
R
(i)
ϒ P ϒ P on the right. The latter cancellation exchange the R-traces TrR and Tr ∗ : R
∗
a i = q −i(i−1) γ i(i+1) Tr (1, . . . , i) Tr ∗ (i + 1, . . . , 2i) R R ∗ (i) (2i) (L T )i . . . (L T )1 A ϒ ∗ (Ti . . . T1 )−1 . R
In the resulting expression all the R-traces Tr ∗ (i + 1, . . . , 2i) can be evaluated with the help R
of Lemma B.3. So, we continue
∗ ∗ a i = q i(i−1) γ i(i+1) Tr (1, . . . , i) (L T )i . . . (L T )1 A(i) (Ti . . . T1 )−1 R i(i−1) i(i+1) =q γ TrR (1, . . . , i) (L T )1 . . . (L T )i (T1 . . . Ti )−1 A(i) .
Here in the first line we used formula (B.14) and (3.4.10); in the second line we applied (i) formula (B.2) and then, moved the two terms ϒ P , respectively, to the left and to the right and cancelled them under the R-traces TrR (1, . . . , i) . Finally, using repeatedly permutation relations (3.3.10) and then formula (3.3.11) we complete the transformation ∗
a i = q i(i−1) γ i(i+1) Tr (1, . . . , i) R ↑(i−2) ↑1 −1 −1 (i) (L T )1 . . . (L T )i−1 (L 1 J1 ) T1 (T1 . . . Ti−2 ) A · · · = q i(i−1) γ 2i TrR (1, . . . , i) (L 1 J1 ) . . . (L i Ji )A(i) = γ 2i ai . ∗
Remark 3.33. For the sets of power sums $p_i$ and $\overset{*}{p}_i$ one can prove the following recurrent relations:
$$\overset{*}{p}_i = \gamma^{2i}\,p_i - (q-q^{-1})\sum_{k=1}^{i-1}\gamma^{2k}\,p_k\,\overset{*}{p}_{i-k}.$$
Corollary 3.34. Let R be a skew invertible R-matrix of the $GL_q(n)$ type (in which case $D_R$ is invertible, see Proposition 2.9). Then for the matrix M (3.4.1) generating the RE algebra $LGL_q(n)[\overset{*}{R}]\subset DGL_q(n)[R,\gamma]$ the following Cayley-Hamilton identity is valid:
$$\sum_{i=0}^{n}(-1/q)^i\,\overset{*}{a}_i\,M^{n-i} = \sum_{i=0}^{n}(-\gamma^2/q)^i\,a_i\,M^{n-i} = 0.$$
In the spectrally completed algebra $\overline{LGL_q(n)}[\overset{*}{R}]\subset\overline{DGL_q(n)}[R,\gamma]$ this identity assumes a completely factorized form
$$\prod_{\alpha=1}^{n}\Big(M - \frac{\gamma^2\mu_\alpha}{q}\,I\Big) = 0. \tag{3.4.11}$$
With the factorized characteristic identity (3.4.11) one can construct yet another resolution of the matrix unity (cf. with (3.2.25)),
$$S^\alpha := \prod_{\substack{\beta=1\\ \beta\ne\alpha}}^{n}\frac{M - \gamma^2q^{-1}\mu_\beta I}{\gamma^2q^{-1}(\mu_\alpha-\mu_\beta)}:\qquad S^\alpha S^\beta = \delta_{\alpha\beta}S^\alpha, \qquad \sum_{\alpha=1}^{n}S^\alpha = I, \tag{3.4.12}$$
so that $M\,S^\alpha = S^\alpha M = \gamma^2q^{-1}\mu_\alpha S^\alpha$. The relation between the two sets of projectors $P^\alpha$ and $S^\alpha$ is explained in the following proposition.

Proposition 3.35. In the spectrally completed algebra $\overline{DGL_q(n)}[R,\gamma]$ ($\overline{DSL_q(n)}[R]$) one has
$$P^\alpha\,T\,S^\beta = \delta_{\alpha\beta}\,P^\alpha T \qquad\text{or, equivalently,}\qquad P^\alpha T = T\,S^\alpha. \tag{3.4.13}$$

Proof. Taking into account the relations $TM = LT$, (3.2.26) and (3.3.17) one finds
$$P^\alpha T\,M = P^\alpha L\,T = q\mu_\alpha\,(P^\alpha T) = (P^\alpha T)\,\gamma^2q^{-1}\mu_\alpha.$$
Hence, in view of (3.4.12),
$$P^\alpha\,T\,S^\beta = P^\alpha T\prod_{\sigma\ne\beta}\frac{\gamma^2q^{-1}(\mu_\alpha-\mu_\sigma)}{\gamma^2q^{-1}(\mu_\beta-\mu_\sigma)}.$$
In case α ≠ β the factor with σ = α in the product vanishes. In case α = β (and so, σ ≠ α) all the terms in the product are equal to 1. So, the relation above reduces to the first equality in (3.4.13).

3.5. Derivation of dynamical R-matrix. In [AF.91] A. Alekseev and L. Faddeev used the dynamical R-matrix in their construction of the Heisenberg double algebra. Namely, they observed an appearance of the classical dynamical r-matrix in the Poisson relations for certain classical variables and then, by postulating a quantum counterpart of those relations, they derived defining formulas (as in Definition 3.19) for the algebra DG[R, γ]. In this section we aim to explain the origin of the dynamical R-matrix in the context of the HD algebras. We show that the dynamical R-matrix R(µ)αβ appears in the permutation relations for the matrix components of the matrices
$$W^\alpha := P^\alpha T = T\,S^\alpha, \tag{3.5.1}$$
and the arguments of the dynamical R-matrix are just the spectral variables $\mu_\alpha$. In a sense, we solve an inverse problem to that considered in [AF.91].

Recall the definition of two projectors associated with the Hecke type R-matrix (see (2.3.6)),
$$A^{(2)} = \frac{qI - R_1}{q+q^{-1}}, \qquad S^{(2)} = \frac{q^{-1}I + R_1}{q+q^{-1}}. \tag{3.5.2}$$
These projectors, called the antisymmetrizer and the symmetrizer, serve for a suitable separation of the different eigenspaces of R.
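The projector properties of (3.5.2) follow from the Hecke condition alone and can be illustrated on a concrete matrix. The sketch below assumes the standard braid-form $4\times4$ R-matrix of $GL_q(2)$ type; it is not taken from the text above, and its normalization may differ from the paper's (2.4.6). The value of q and all names are hypothetical. The same matrix is also used to test the closed form for $R^{2j}$ quoted in Remark 3.23.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def lincomb(x, A, y, B):
    """Return x*A + y*B for square matrices A, B."""
    n = len(A)
    return [[x * A[r][c] + y * B[r][c] for c in range(n)] for r in range(n)]


def hecke_r_matrix(q):
    """Assumed 4x4 braid-form R-matrix of GL_q(2) type (standard form)."""
    lam = q - 1 / q
    o, z = Fraction(1), Fraction(0)
    return [[q, z, z, z],
            [z, lam, o, z],
            [z, o, z, z],
            [z, z, z, q]]


q = Fraction(2)                      # hypothetical value of q
R = hecke_r_matrix(q)
I4 = [[Fraction(int(r == c)) for c in range(4)] for r in range(4)]
two_q = q + 1 / q                    # the q-number 2_q

A2 = lincomb(q / two_q, I4, -1 / two_q, R)        # antisymmetrizer of (3.5.2)
S2 = lincomb(1 / (q * two_q), I4, 1 / two_q, R)   # symmetrizer of (3.5.2)


def even_power_closed_form(j):
    """Closed form for R^(2j) from Remark 3.23."""
    return lincomb((q ** (2 * j - 1) + q ** (1 - 2 * j)) / two_q, I4,
                   (q ** (2 * j) - q ** (-2 * j)) / two_q, R)
```

Here `A2` and `S2` are complementary idempotents, and R acts on their images by the eigenvalues $-q^{-1}$ and $q$ respectively, which is the sense in which they separate the eigenspaces of R.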
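Theorem 3.36. In the completed HD algebra $\overline{DGL_q(n)}[R,\gamma]$ ($\overline{DSL_q(n)}[R]$) the matrices $W^\alpha$ (3.5.1) satisfy the relations
$$S^{(2)}\big(W_1^\alpha W_2^\beta + W_1^\beta W_2^\alpha\big)A^{(2)} = A^{(2)}\big(W_1^\alpha W_2^\beta + W_1^\beta W_2^\alpha\big)S^{(2)} = 0 \qquad \forall\,\alpha,\beta, \tag{3.5.3}$$
$$S^{(2)}\Big((\mu_\beta - q^2\mu_\alpha)\,W_1^\alpha W_2^\beta + (\mu_\alpha - q^2\mu_\beta)\,W_1^\beta W_2^\alpha\Big)S^{(2)} = 0 \qquad \forall\,\alpha\ne\beta, \tag{3.5.4}$$
$$A^{(2)}\Big((\mu_\alpha - q^2\mu_\beta)\,W_1^\alpha W_2^\beta + (\mu_\beta - q^2\mu_\alpha)\,W_1^\beta W_2^\alpha - (q^4-1)\,\mu_\alpha\varphi_{\alpha\beta}\,W_1^\alpha W_2^\alpha - (q^4-1)\,\mu_\beta\varphi_{\beta\alpha}\,W_1^\beta W_2^\beta\Big)A^{(2)} = 0 \qquad \forall\,\alpha\ne\beta, \tag{3.5.5}$$
where
$$\varphi_{\alpha\beta} := \prod_{\sigma\ne\alpha,\beta}\frac{\mu_\sigma - q^2\mu_\alpha}{\mu_\sigma - \mu_\beta}.$$
Relations (3.5.3)–(3.5.4) and (3.3.17) (together with the appropriate conditions on the spectral variables $\mu_\alpha$) define the algebra $\overline{DGL_q(n)}[R,\gamma]$ ($\overline{DSL_q(n)}[R]$) in terms of generators $W^\alpha$, $\mu_\alpha$, α = 1, …, n.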
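The coefficients $\varphi_{\alpha\beta}$ are Lagrange basis polynomials for the nodes $\mu_\beta$, $\beta\ne\alpha$, evaluated at the point $q^2\mu_\alpha$. Consequently they sum to 1 over β ≠ α for each fixed α, which appears to be the identity invoked at the end of the proof of the theorem. The sketch below checks this on sample rational values; the values and the function names are hypothetical.

```python
from fractions import Fraction


def phi(mu, q, a, b):
    """phi_{ab} = prod over s not in {a, b} of (mu_s - q^2 mu_a) / (mu_s - mu_b)."""
    out = Fraction(1)
    for s in range(len(mu)):
        if s not in (a, b):
            out *= (mu[s] - q ** 2 * mu[a]) / (mu[s] - mu[b])
    return out


def phi_row_sum(mu, q, a):
    """Sum of phi_{ab} over all b different from a."""
    return sum(phi(mu, q, a, b) for b in range(len(mu)) if b != a)
```

For n = 2 the product is empty, so $\varphi_{\alpha\beta} = 1$ and the identity is trivial; for larger n it is the statement that the Lagrange interpolant of the constant function 1 is again 1.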
Proof. Consider the product W1α W2 , where α = β. With the help of (3.3.17) and (3.3.10) we can reorder terms of the product in a following way: β
(L 1 − qµβ )(L 2 J2 − q 3 µα ) αβ W , q 2 (µα − µβ )(µβ − q 2 µα ) 12 (L 1 − qµσ )(L J2 − qµσ ) 2 T1 T2 . := q 2 (µα − µσ )(µβ − µσ )
W1α W2 = αβ
W12
σ =α,β
αβ
Here factor W12 commutes with the R-matrix R12 , which follows by the same arguments as in the proof of Corollary 3.31, see below (3.3.22). We now extract symmetric and antisymmetric parts of the product using projectors (3.5.2), β
S (2) W1α W2 = S (2)
L 1 L 2 J2 + q 4 µα µβ I −
q 2 (µβ +µα ) (L 1 q+q −1
+ L 2 J2 ) +
µβ −q 2 µα (L 1 q+q −1
− L 2)
q 2 (µα − µβ )(µβ − q 2 µα )
αβ
W12 , (3.5.6)
A
(2)
β W1α W2
=A
(2)
L 1 L 2 J2 + q 4 µα µβ I −
µβ +q 4 µα (L 1 q+q −1
q 2 (µα
+ L 2 J2 ) +
q 2 (µβ −q 2 µα ) (L 1 q+q −1
− L 2)
− µβ )(µβ − q 2 µα )
αβ
W12 .
(3.5.7) Here we separated linear in L terms with the opposite symmetry properties R1 (L 1 + L 2 J2 ) = (L 1 + L 2 J2 )R1 ,
R1 (L 1 − L 2 ) = −(L 1 − L 2 )R1−1 ,
which was done by the use of relation q 3 µα L 1 + qµβ L 2 J2 =
q(I + R1−2 ) (µβ R12 + q 2 µα I )(L 1 + L 2 J2 ) (q + q −1 )2 + (q 2 µα − µβ )(L 1 − L 2 ) .
(3.5.8)
The symmetry properties (3.5.8) imply, in particular, that the only term contributing to the expressions $A^{(2)}W_1^\alpha W_2^\beta S^{(2)}$ and $S^{(2)}W_1^\alpha W_2^\beta A^{(2)}$ is the one proportional to $(L_1 - L_{\overline 2})$, while the terms $(L_1L_{\overline 2}J_2)$, I and $(L_1 + L_{\overline 2}J_2)$ contribute to $S^{(2)}W_1^\alpha W_2^\beta S^{(2)}$ and $A^{(2)}W_1^\alpha W_2^\beta A^{(2)}$. It is now straightforward to check that formulas (3.5.3) and (3.5.4) follow from relations (3.5.6), (3.5.7). To check formula (3.5.5) one also needs a similar expression for the product $W_1^\alpha W_2^\alpha$:
$$W_1^\alpha W_2^\alpha = \frac{L_1L_{\overline 2}J_2 + q^2\mu_\beta^2 I - q\mu_\beta\,(L_1 + L_{\overline 2}J_2)}{q^2\,\varphi_{\alpha\beta}\,(\mu_\alpha-\mu_\beta)(q^2\mu_\alpha-\mu_\beta)}\;W_{12}^{\alpha\beta},$$
and an analogous formula for $W_1^\beta W_2^\beta$. Here the factor $\varphi_{\alpha\beta}$ was defined in the statement of the theorem.

It remains to check that the defining relations for the algebras $\overline{DGL_q(n)}[R,\gamma]$ and $\overline{DSL_q(n)}[R]$ can be derived from (3.5.3)–(3.5.5) and (3.3.17). It is convenient to check the relations for the matrices T and LT:
$$R_1\,T_1T_2 = T_1T_2\,R_1, \qquad R_1\,(LT)_1(LT)_2 = (LT)_1(LT)_2\,R_1, \qquad \gamma^2\,T_1(LT)_2 = R_1\,(LT)_1T_2\,R_1. \tag{3.5.9}$$
For $GL_q(n)$ and $SL_q(n)$ types HD algebras, where T is invertible, these formulas imply (3.2.1) and (3.3.1). Substituting the expressions $T = \sum_{\alpha=1}^{n}W^\alpha$, $LT = q\sum_{\alpha=1}^{n}\mu_\alpha W^\alpha$, one can easily prove that the first two relations (3.5.9) follow from (3.5.3) and (3.3.17). Checking the last formula in (3.5.9) is also straightforward, although more lengthy. To this end one has to use the whole set of relations for W's and to take into account the identity $\sum_{\beta\ne\alpha}\varphi_{\alpha\beta} = 1$.

Corollary 3.37. Relations (3.5.3)–(3.5.5) can be equivalently written as
$$S^{(2)}\Big[W_1^\alpha W_2^\beta\,R_1 - \sum_{\alpha',\beta'=1}^{n}R_S(q;\mu)^{\alpha\beta}_{\alpha'\beta'}\,W_1^{\alpha'}W_2^{\beta'}\Big] = 0, \tag{3.5.10}$$
$$A^{(2)}\Big[W_1^\alpha W_2^\beta\,R_1 - \sum_{\alpha',\beta'=1}^{n}R_A(q;\mu)^{\alpha\beta}_{\alpha'\beta'}\,W_1^{\alpha'}W_2^{\beta'}\Big] = 0, \tag{3.5.11}$$
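where the $n^2 \times n^2$ matrix $R^S(q;\mu)$ has the following nonzero components:
$$(R^S)^{\alpha\alpha}_{\alpha\alpha} = q , \qquad (R^S)^{\alpha\beta}_{\alpha\beta} = - \frac{(q - q^{-1})\mu_\beta}{\mu_\alpha - \mu_\beta} , \qquad (R^S)^{\alpha\beta}_{\beta\alpha} = \frac{q^{-1}\mu_\alpha - q\mu_\beta}{\mu_\alpha - \mu_\beta} \quad \forall\, \alpha \neq \beta ,$$
and the $n^2 \times n^2$ matrix $R^A(q;\mu)$ has nonzero components at the same places as $R^S$, with values $R^A(q;\mu) = R^S(-q^{-1};\mu)$, and the additional nonzero components
$$(R^A)^{\alpha\alpha}_{\beta\alpha} = - (R^A)^{\alpha\alpha}_{\alpha\beta} = \frac{(q^4 - 1)\, \mu_\alpha \varphi_{\alpha\beta}}{q (\mu_\alpha - \mu_\beta)} \quad \forall\, \alpha \neq \beta .$$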
1166
A. P. Isaev, P. Pyatov
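Both matrices $R^{S/A}(q;\mu) \equiv R^{S/A}(\mu)$ satisfy the dynamical Yang-Baxter equation:
$$R(\mu)^{12}\, R(\nabla^1(\mu))^{23}\, R(\mu)^{12} = R(\nabla^1(\mu))^{23}\, R(\mu)^{12}\, R(\nabla^1(\mu))^{23} . \tag{3.5.12}$$
Here superscript labels denote endomorphism spaces for the spectral indices, e.g., $R(\mu)^{\alpha_1\alpha_2}_{\beta_1\beta_2} \equiv R(\mu)^{12}$, and $\nabla^1$ is a diagonal finite shift operator, $\nabla^1 = \mathrm{diag}\{\nabla^\alpha\}_{\alpha=1}^{n}$:
$$\nabla^\alpha(\mu_\beta) := q^{2\delta_{\alpha\beta}} \gamma^{-2} \mu_\beta . \tag{3.5.13}$$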
Proof. Apply the symmetrizer $S^{(2)}$ and the antisymmetrizer $A^{(2)}$ from the right to both sides of the equalities (3.5.10), (3.5.11). The resulting projections are easy to compare with (3.5.3)–(3.5.5). To prove the dynamical Yang-Baxter equation for the matrices $R^A(q;\mu)$ and $R^S(q;\mu)$ we consider, respectively, the following cubic terms:
$$A^{(3)} W_1^\alpha W_2^\beta W_3^\sigma \qquad \text{and} \qquad S^{(3)} W_1^\alpha W_2^\beta W_3^\sigma .$$
Here the 3-antisymmetrizer $A^{(3)} = \rho_R(a^{(3)})$ is the R-matrix realization of the idempotent $a^{(3)}$ (see (2.3.2), (2.3.3)), and the 3-symmetrizer $S^{(3)}$ is a similar projector which differs from $A^{(3)}$ by the substitution $q \leftrightarrow -q^{-1}$ in the formulas (2.3.2), (2.3.3). Now, applying the two equal operators $R_1 R_2 R_1$ and $R_2 R_1 R_2$ from the right to these terms and using relations (3.5.10), (3.5.11) and (3.3.17), one eventually proves (3.5.12) for $R^{A/S}(q;\mu)$.

Remark 3.38. The dynamical R-matrix $R^S(q;\mu)$ was constructed in [F.90,AF.91,Is.95]. A review on the dynamical Yang-Baxter equation and the dynamical R-matrices is given in [ES]. It is surprising that in our approach the dynamical R-matrices $R^{A/S}(q;\mu)$, being solutions of the nonlinear finite difference equation (3.5.12), are calculated by solving a system of (at most three) linear equations.

In conclusion of the section we comment on how relations (3.5.10) can be reduced to the dynamical quadratic relations considered in [F.90,AF.91]. Recall that a (Hecke type) quantum plane $V[R]$ is an algebra generated by the components of a vector $\{x_i\}_{i=1}^{\dim V}$ subject to the relations
$$x_{1|}\, x_{2|}\, A^{(2)} = 0 \quad \Leftrightarrow \quad x_{1|}\, x_{2|}\, S^{(2)} = x_{1|}\, x_{2|} . \tag{3.5.14}$$
In the tensor product algebra V[R] ⊗ DG L q (n)[R, γ ] consider a rectangular matrix V α iα := dim α = 1, . . . n , i = 1, . . . , dim V . j=1 x j ⊗ W ji , As a consequence of (3.5.10), (3.5.14) the matrix components of fulfill relations |1
|2
|1
|2
1| 2| R12 = R S (q; µ)12 1| 2| .
(3.5.15)
Assume additionally that i) dim V = n, and ii) the quantum plane admits a one dimensional representation χ : V[R] → C (note that both these conditions are satisfied for the R-matrices from Example 2.10). It is the square matrix χ () ∈ DG L q (n)[R, γ ] whose dynamical quadratic relations (3.5.15) were introduced in [F.90,AF.91] and also investigated in [HIOPT,FHIOPT].
4. Discrete Time Evolution on Quantum Group Cotangent Bundle

4.1. Automorphisms of the Heisenberg double algebra. In this section we investigate a sequence of automorphisms of the HD algebra $DG[R, \gamma]$. These automorphisms were introduced by A. Alekseev and L. Faddeev [AF.91,AF.92], who interpreted them as a discrete time evolution of a q-deformed quantum isotropic Euler top. The automorphisms $\theta^k : DG[R, \gamma] \to DG[R, \gamma]$ are given on generators by
$$\{T, L\} \xrightarrow{\ \theta^k\ } \{T(k), L(k)\} , \quad \forall\, k = 0, 1, 2, \dots , \qquad T(0) := T , \quad T(k+1) := L\, T(k) = L^{k+1} T , \quad L(k) := L . \tag{4.1.1}$$
It is easy to see (cf. (3.5.9)) that the map θ agrees with the defining relations (3.1.1), (3.2.1), (3.3.1) of the algebra DG[R, γ ]. Less obvious is its consistency with the S L q (n) type reduction conditions. Proposition 4.1. The map θ (4.1.1) defines an automorphism of the algebra D S L q (n)[R]. Proof. It is necessary to check that detR (L T ) = 1 in the S L q (n) case. To this end, we use formula
(L T )1 (L T )2 . . . (L T )k = γ −k(k−1) Z k L 1 L 2 . . . L k (T1 T2 . . . Tk ) to separate matrices L and T in the expression for detR (L T ). This formula follows from (3.3.1), (3.2.10), (3.2.11) and (3.3.8) by induction on k. The calculation of detR (L T ) proceeds as follows: det R L T := Tr (1, . . . , n) A(n) (L T )1 . . . (L T )n = γ n(n−1) Tr (1, . . . , n) A(n) Z (n) L 1 . . . L n T1 . . . Tn = (γ q)−n(n−1) Tr (1, . . . , n) A(n) L 1 . . . L n Tr (1, . . . , n) A(n) T1 . . . Tn = q n γ −n(n−1) TrR (1, . . . , n) A(n) L 1 . . . L n detR T
$$= \left(q \gamma^{-n}\right)^{n-1} q\, a_n \det\nolimits_R T , \tag{4.1.2}$$
and so, under the conditions $\det_R T = 1$, $a_n = q^{-1} 1$, $\gamma^n = q$, we have $\det_R (L T) = 1$. Here in the second line we substituted $A^{(n)} Z_n = q^{-n(n-1)} A^{(n)}$ and used the condition $\mathrm{rk}\, A^{(n)} = 1$; in the last line we applied (2.4.5) and the definitions of $\det_R T$ and $a_n$.

In what follows we will investigate the automorphisms (4.1.1) for HD algebras of the types $DGL_q(n)[R, \gamma]$ and $DSL_q(n)[R]$. A key point for their dynamical interpretation is the possibility to write down the following ansatz:
$$T(k+1) = L\, T(k) = (q a_n)^{1/n}\, \Theta\, T(k)\, \Theta^{-1} , \quad \text{where } \Theta \in Ch[R] . \tag{4.1.3}$$
Here the dynamical process, the evolution, is thought of as an inner HD algebra automorphism, and $\Theta$ plays the role of the evolution operator. As the evolution keeps $L$ unchanged, it is natural to assume that $\Theta$ belongs to the center of the RE algebra generated by the matrix $L$. More specifically, we will look for $\Theta$ as a formal power series in spectral variables $\mu_\alpha$, $\alpha = 1, \dots, n$, which we denote as $Ch[R]$. We also note that the condition $\Theta \in Ch[R]$ makes the ansatz manifestly covariant with respect to both left and right coactions (3.3.2), (3.3.3). The factor $(q a_n)^{1/n}$ in the ansatz (4.1.3) becomes trivial for the $SL_q(n)$ type HD algebra. In the $GL_q(n)$ case one adds this scaling factor to make the ansatz consistent with the evolution of $\det_R T$, see (4.1.2). One assumes the following relation for the newly introduced element $a_n^{1/n}$ (cf. (3.3.16)):
$$T\, a_n^{1/n} = \left(q \gamma^{-n}\right)^{2/n} a_n^{1/n}\, T . \tag{4.1.4}$$
Then, consistency of (4.1.3) and (4.1.2) results in commutativity of $\det_R T$ with $\Theta$:
$$\det\nolimits_R T \; \Theta = \Theta \; \det\nolimits_R T , \tag{4.1.5}$$
which again trivializes in the S L q (n) case. Remark 4.2. The action of the automorphisms θ k on T can be equally treated as multiplications by powers of the left invariant matrix M: T (k + 1) = T (k) M = T M k ,
M(k) = M .
The relation (3.4.1) between L and M would no more be valid if one would treat them as quantized right and left invariant Lie derivatives acting on the quantized external algebra of differential forms over the matrix group. In this case one would have a two-parametric series of automorphisms: θ (k,m)
{T, L , M} −→ {L k T M m , L , M} , ∀ k ≥ 0, m ≥ 0 . Example 4.3. Let us show that in the ribbon Hopf algebra setting the ribbon element υ ∈ AR generates the evolution (4.1.3) in the smash product algebra AR A∗R . For this we first have to specify pairing for the ribbon element. Using Definition (2.1.6) and relations (2.2.12), (2.4.3) and setting η = q 1/n as in Example 2.10 we calculate 1
T, υ 2 = T, u S(u) = ρV (u) ρV (S(u)) = η2 DR CR = q 2( n −n) I . Therefore, taking into account centrality of the ribbon element υ in AR , it is natural to define 1
T, υ = q ( n −n) I . Using this formula and relations (2.1.6), (3.2.5), (3.3.4) we now calculate conjugation of the matrix T with the ribbon element υ T υ −1 = (υ ⊗ id) id ⊗ T, (υ −1 ) T=(υ ⊗ id) id ⊗ T, (υ −1 ⊗ υ −1 )R21 R12 T = T, υ −1 id ⊗ T, R21 R12 T = L T . Note that the defining relations for the evolution operator (as a function of the spectral variables µα ) and for the ribbon element υ both admit multiple solutions. 9 Therefore, the problem of finding an explicit expression of the ribbon element υ in terms of spectral variables µα demands further investigations. 9 The ribbon element is defined modulo the central factor z ∈ A : z 2 = 1 , S(z) = z , (z) = 1 , (z) = R z ⊗ z . For the evolution operator (µ) different solutions are constructed in the next sections.
4.2. Equations for the evolution operator $\Theta$. Using the results of Sect. 3 it is straightforward to derive equations for $\Theta$. We consider in detail the evolution in the $SL_q(n)$ type HD algebra. In this case we assume
$$\Theta = \Theta(\mu_1, \dots, \mu_n) , \quad \text{where } a_n = \prod_{\alpha=1}^{n} \mu_\alpha = q^{-1} \text{ and } \gamma = q^{1/n} . \tag{4.2.1}$$
Applying from the left the projector $P^\alpha$ to both sides of (4.1.3) we obtain
$$q \mu_\alpha (P^\alpha T) = \Theta\, (P^\alpha T)\, \Theta^{-1} , \quad \forall\, \alpha = 1, \dots, n .$$
Multiplying this equality by $\Theta$ from the right and permuting $\Theta$ with $P^\alpha T$ in the left-hand side with the help of (3.3.17) we finally get
$$q \mu_\alpha\, \Theta(q^{-2/n} \mu_1, \dots, q^{2 - 2/n} \mu_\alpha, \dots, q^{-2/n} \mu_n)\, (P^\alpha T) = \Theta(\mu_1, \dots, \mu_n)\, (P^\alpha T) .$$
We state the result in the following proposition:

Proposition 4.4. For the Heisenberg double algebra $DSL_q(n)[R]$ the evolution operator $\Theta(\mu_\alpha)$ in (4.1.3), (4.2.1) is a solution of the equations
$$q \mu_\alpha\, \Theta(\nabla^\alpha(\mu_\beta)) = \Theta(\mu_\beta) \quad \forall\, \alpha = 1, \dots, n , \tag{4.2.2}$$
where $\nabla^\alpha$ are the finite shift operators introduced in (3.5.13). In the $SL_q(n)$ case their actions are
$$\nabla^\alpha(\mu_\beta) := q^{2 X_{\alpha\beta}} \mu_\beta , \qquad X_{\alpha\beta} := \delta_{\alpha\beta} - \frac{1}{n} \quad \forall\, \alpha, \beta = 1, \dots, n . \tag{4.2.3}$$
The $n \times n$ matrix $X$ is a Gram matrix for the set of vectors $e^*_\alpha \in \mathbb{Q}^n$, $\alpha = 1, \dots, n$:
$$e^*_\alpha := \frac{1}{n} (\underbrace{-1, \dots, -1}_{(\alpha - 1) \text{ times}}, n - 1, -1, \dots, -1) , \qquad X_{\alpha\beta} = \langle e^*_\alpha , e^*_\beta \rangle .$$
$X$ is positive semi-definite of rank $n - 1$ ($\sum_{\alpha=1}^{n} e^*_\alpha = 0$).

For the Heisenberg double algebra $DGL_q(n)[R, \gamma]$ the evolution operator is suitably parameterized by the variables $z := (q a_n)^{1/n}$ and $\nu_\alpha$, such that
$$\nu_\alpha := \mu_\alpha (q a_n)^{-1/n} , \qquad \prod_{\alpha=1}^{n} \nu_\alpha = q^{-1} , \qquad \nu_\alpha \det\nolimits_R T = \det\nolimits_R T\, \nu_\alpha \quad \forall\, \alpha .$$
The evolution equations for $\Theta(\nu_1, \dots, \nu_n; z)$ read
$$q \nu_\alpha\, \Theta(\nabla^\alpha(\nu_\beta); (q \gamma^{-n})^{2/n} z) = \Theta(\nu_\alpha; z) \quad \forall\, \alpha = 1, \dots, n , \tag{4.2.4}$$
where the shift operators $\nabla^\alpha$ are defined as in (4.2.3). Since $\prod_{\alpha=1}^{n} \nabla^\alpha = 1$, this system is consistent provided that $\Theta(\nu_\beta; q^2 \gamma^{-2n} z) = \Theta(\nu_\beta; z)$ (cf. (4.1.5)). Demanding that $\Theta$ does not actually depend on $z$ one reduces (4.2.4) to (4.2.2).

Proof. The $SL_q(n)$ case is already considered. Taking into account relations (4.1.4) and (3.3.19), a derivation of the evolution equations in the $GL_q(n)$ case is the same.

In the next two subsections we will construct particular solutions of the SL-type evolution equations (4.2.2).
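The Gram matrix statement in Proposition 4.4 is easy to confirm by exact arithmetic. The following is a minimal sketch, not part of the original argument: the helper names (`dual_vectors`, `gram`, `rank`, `check_gram`) are hypothetical, and the check is carried out only for the sample sizes chosen by the caller. It builds the vectors $e^*_\alpha$ over the rationals, compares their Gram matrix with $\delta_{\alpha\beta} - 1/n$, and computes the rank by Gaussian elimination. Positive semi-definiteness is automatic for any Gram matrix.

```python
from fractions import Fraction


def dual_vectors(n):
    """Rows e*_1..e*_n: (1/n) * (-1, ..., n-1 in position alpha, ..., -1)."""
    return [[Fraction(n - 1 if j == a else -1, n) for j in range(n)]
            for a in range(n)]


def gram(vectors):
    """Matrix of pairwise Euclidean scalar products."""
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]


def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def check_gram(n):
    """Return (X == delta - 1/n, sum of e* is zero, rank of X)."""
    e = dual_vectors(n)
    X = gram(e)
    expected = [[Fraction(int(a == b)) - Fraction(1, n) for b in range(n)]
                for a in range(n)]
    sums_to_zero = all(sum(v[j] for v in e) == 0 for j in range(n))
    return X == expected, sums_to_zero, rank(X)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, check_gram(n))
```

For each $n$ the expected triple is `(True, True, n - 1)`, in agreement with the rank statement of the proposition.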
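4.3. Solution in case $|q| < 1$. Let us look for the solution of (4.2.2) as a series in $\mu_\alpha$. Taking into account condition (4.2.1) we exclude one dependent variable, say $\mu_n$, from the expansion
$$\Theta(\mu_\alpha) = \sum_{\vec{k} \in \mathbb{Z}^{n-1}} c(\vec{k})\, \mu_1^{k_1} \mu_2^{k_2} \dots \mu_{n-1}^{k_{n-1}} , \tag{4.3.1}$$
where $\mathbb{Z}^{n-1} = \{(k_1, \dots, k_{n-1}) : k_i \in \mathbb{Z}\}$, and the coefficients $c(\vec{k})$ are $\mathbb{C}$-valued functions on $\mathbb{Z}^{n-1}$. Substitution of (4.3.1) into (4.2.2) gives conditions on the coefficients:
$$c(\vec{k} + \vec{\epsilon}_\alpha) = q^{\,1 + \frac{2}{n} \sum_{\beta=1}^{n-1} A^*_{\alpha\beta} k_\beta}\; c(\vec{k}) \quad \forall\, \alpha = 1, \dots, n - 1 . \tag{4.3.2}$$
Here $\vec{\epsilon}_\alpha := (\underbrace{0, \dots, 0}_{(\alpha-1) \text{ times}}, 1, 0, \dots, 0)$, and $A^*_{\alpha\beta} := n X_{\alpha\beta}$ is an $(n-1) \times (n-1)$ positive definite matrix. The general solution to (4.3.2) is
$$c(\vec{k}) = q^{\frac{1}{n} \left( (\vec{k}, A^* \vec{k}) + (\vec{1}, \vec{k}) \right)} , \tag{4.3.3}$$
where we choose the normalization $c(\vec{0}) = 1$ and use the notation $(\vec{k}, A^* \vec{k}) = \sum_{\alpha,\beta=1}^{n-1} k_\alpha A^*_{\alpha\beta} k_\beta$, and $\vec{1} = (1, \dots, 1)$, so that $(\vec{1}, \vec{k}) = \sum_{\alpha=1}^{n-1} k_\alpha$.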
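That the coefficients (4.3.3) satisfy the recursion (4.3.2) amounts to comparing exponents of $q$, which can be done exactly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, indices are zero-based, and only a finite box of lattice points is scanned. It computes the exponent $\frac{1}{n}\big((\vec{k}, A^*\vec{k}) + (\vec{1}, \vec{k})\big)$ with $A^*_{\alpha\beta} = n\delta_{\alpha\beta} - 1$ and checks that shifting $\vec{k}$ by $\vec{\epsilon}_\alpha$ raises it by $1 + \frac{2}{n} \sum_\beta A^*_{\alpha\beta} k_\beta$.

```python
from fractions import Fraction
from itertools import product


def a_star(n):
    """(n-1) x (n-1) matrix A*_{ab} = n * (delta_ab - 1/n) = n*delta_ab - 1."""
    return [[(n if a == b else 0) - 1 for b in range(n - 1)]
            for a in range(n - 1)]


def exponent(k, n):
    """Exponent of q in c(k) according to (4.3.3)."""
    A = a_star(n)
    quad = sum(k[a] * A[a][b] * k[b]
               for a in range(n - 1) for b in range(n - 1))
    return Fraction(quad + sum(k), n)


def recursion_step(k, alpha, n):
    """Exponent of q in c(k + eps_alpha) / c(k) predicted by (4.3.2)."""
    A = a_star(n)
    return 1 + Fraction(2 * sum(A[alpha][b] * k[b] for b in range(n - 1)), n)


def recursion_holds(n, box=3):
    """Check (4.3.2) for every k with entries in [-box, box]."""
    for k in product(range(-box, box + 1), repeat=n - 1):
        for alpha in range(n - 1):
            shifted = tuple(kb + (1 if b == alpha else 0)
                            for b, kb in enumerate(k))
            if exponent(shifted, n) - exponent(k, n) != recursion_step(k, alpha, n):
                return False
    return True


if __name__ == "__main__":
    print([recursion_holds(n) for n in (2, 3, 4)])
```

For $n = 2$ the exponent reduces to $k(k+1)/2$, which is the coefficient appearing in the $SL_q(2)$ case.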
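Remark 4.5. The matrix $A^*_{\alpha\beta}$ is a Gram matrix of a lattice $A^*_{n-1}$ dual to the root lattice $A_{n-1}$ (see [CS], Chap. 4, Sect. 6.6). The corresponding quadratic form $(\vec{k}, A^* \vec{k})$ is often referred to as Voronoi's principal form of the first type.

The ansatz (4.3.1) gives a particular solution (4.3.3) of the evolution equations; we denote it $\Theta^{(1)}$. Introducing a parameterization,
$$q = \exp(2\pi i \tau) , \quad q^{1/n} \mu_\alpha = \exp(2\pi i z_\alpha) : \ \sum_{\alpha=1}^{n} z_\alpha = 0 , \qquad \Omega_{\alpha\beta} = \frac{2\tau}{n} A^*_{\alpha\beta} = 2\tau \Big(\delta_{\alpha\beta} - \frac{1}{n}\Big) , \tag{4.3.4}$$
we can write $\Theta^{(1)}$ as a Riemann theta function $\theta(\vec{z}, \Omega)$ (see [Mum]):
$$\Theta^{(1)}(\mu_\alpha) = \theta(\vec{z}, \Omega) = \sum_{\vec{k} \in \mathbb{Z}^{n-1}} \exp\Big( \pi i\, (\vec{k}, \Omega \vec{k}) + 2\pi i\, (\vec{k}, \vec{z}) \Big) . \tag{4.3.5}$$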
Here τ is a modular parameter and is a matrix of periods. Expression (4.3.5) converges either if |q| < 1, or if q is a rational root of unity, in which case the series can be truncated. Remark 4.6. One can present formula (4.3.5) in a manifestly covariant form: τ + 2π i k , z . (1) ≡ (1) (z , A∗n−1 , τ ) = exp 2π i k, k n ∗ k∈A n−1
n−1 ∗ ∗ Here vectors k = n−1 α=1 kα eα label vertices of the lattice An−1 , and z = α=1 z α eα , where eα = α − n , α = 1, . . . , n − 1, (see the line below (4.3.2)) are basic vectors of the root lattice An−1 : eα∗ , eβ = δαβ .
In the simplest $SL_q(2)$ case the evolution operator $\Theta^{(1)}$ becomes the Jacobi theta function:
$$\Theta^{(1)}(\mu_1) = \sum_{k \in \mathbb{Z}} q^{\frac{1}{2} k(k+1)} \mu_1^k = \sum_{k \in \mathbb{Z}} \exp(\pi i k^2 \tau + 2\pi i k z_1) = \theta_3(z_1; q) ,$$
or, in a multiplicative form,
$$\Theta^{(1)}(\mu_1) = \prod_{n=1}^{\infty} (1 - q^n)(1 + q^n \mu_1)(1 + q^{n-1}/\mu_1) .$$
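The equality of the additive and multiplicative forms in the $SL_q(2)$ case is the Jacobi triple product identity, and it can be illustrated numerically for $|q| < 1$. The sketch below is only a sanity check under stated assumptions: the function names and the truncation parameters `terms` and `factors` are hypothetical choices, and the sample values of $q$ and $\mu_1$ are arbitrary.

```python
def theta_sum(q, mu, terms=60):
    """Truncated series: sum over k of q^{k(k+1)/2} * mu^k."""
    return sum(q ** (k * (k + 1) // 2) * mu ** k
               for k in range(-terms, terms + 1))


def theta_product(q, mu, factors=200):
    """Truncated product of (1 - q^m)(1 + q^m mu)(1 + q^{m-1}/mu) over m >= 1."""
    result = 1.0
    for m in range(1, factors + 1):
        result *= (1 - q ** m) * (1 + q ** m * mu) * (1 + q ** (m - 1) / mu)
    return result


if __name__ == "__main__":
    q, mu = 0.3, 0.8 + 0.5j
    print(abs(theta_sum(q, mu) - theta_product(q, mu)))
```

Both truncations converge rapidly for the sample value $q = 0.3$, so the printed difference is at the level of floating-point rounding.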
4.4. Solution for arbitrary $q$. In this section we derive yet another particular solution of the evolution equations (4.2.2), one which is well defined for arbitrary values of $q$. The idea of such a solution was proposed in L.D. Faddeev's lectures on two-dimensional integrable quantum field theory [F.94] (see also [F.95]). We use heuristic arguments inspired by considerations in [AF.91].

For the moment we assume $\dim V = n$, so that the range of the indices $\alpha$ and $i, j$ in the projectors $P^\alpha_{ij}$, $S^\alpha_{ij}$ is the same. Consider the following $n \times n$ matrices:
$$U_{ij} := \sum_{k=1}^{n} u_{ik}\, P^{\alpha = i}_{kj} , \qquad V_{ij} := \sum_{k=1}^{n} S^{\alpha = j}_{ik}\, v_{kj} ,$$
where the only restriction on the auxiliary parameters $u_{ij}$ and $v_{ij}$ is their commutativity with the spectral variables $\mu_\alpha$, $[u_{ij}, \mu_\alpha] = [v_{ij}, \mu_\alpha] = 0$ $\forall\, i, j, \alpha$. As a result of the Cayley-Hamilton identities (3.2.24), (3.4.11) we have the matrix relations
$$U L = q D U , \qquad M V = \gamma^2 q^{-1} V D , \qquad \text{where } D := \mathrm{diag}\{\mu_1, \dots, \mu_n\} .$$
Moreover, by (3.4.13), the matrix $Q := U T V$ is diagonal,
$$Q = \mathrm{diag}\{w_1, \dots, w_n\} , \qquad \text{where } w_i := (u P^i T v)_{ii} ,$$
and, by (3.3.17), $w_i$ satisfy the following permutation relations with $\mu_j$ and $z_j$:
$$w_i\, \mu_j = q^{2\delta_{ij}} \gamma^{-2} \mu_j\, w_i \quad \Leftrightarrow \quad w_i\, z_j = \big(z_j + 2\tau (\delta_{ij} - 1/n)\big)\, w_i , \tag{4.4.1}$$
where in the latter formula we used the $SL_q(n)$ type condition $\gamma = q^{1/n}$. Assuming invertibility of the matrices $U$ and $V$ we can write diagonal decompositions for the matrices $L$, $M$ and $T$,
$$L = q\, U^{-1} D U , \qquad M = \gamma^2 q^{-1} V D V^{-1} , \qquad T = U^{-1} Q V^{-1} ,$$
which after substitution into the ansatz (4.1.3) reduce the evolution equations to the following form:
$$q D Q = \Theta\, Q\, \Theta^{-1} \quad \Leftrightarrow \quad q \mu_i w_i = \Theta\, w_i\, \Theta^{-1} .$$
Taking into account (4.4.1) these equations clearly have the following solution:
$$\Theta^{(2)}(z_\alpha) := \exp\Big( - \frac{\pi i}{2\tau} \sum_{\beta=1}^{n} z_\beta^2 \Big) . \tag{4.4.2}$$
Now, it is easy to check that the function $\Theta^{(2)}$ fulfills the evolution equations (4.2.2) without the additional assumptions we made for the derivation. Written in the independent variables $\vec{z} = \{z_1, \dots, z_{n-1}\}$ it reads
$$\Theta^{(2)}(\vec{z}) = \exp\Big( - \frac{\pi i}{\tau} \sum_{1 \le \alpha \le \beta \le n-1} z_\alpha z_\beta \Big) = \exp\big( - \pi i\, (\vec{z}, \Omega^{-1} \vec{z}) \big) , \tag{4.4.3}$$
where the inverse matrix of periods is
$$\Omega^{-1}_{\alpha\beta} = \frac{1}{2\tau} (\delta_{\alpha\beta} + 1) = \frac{1}{2\tau} A_{\alpha\beta} ,$$
and $A_{\alpha\beta} = \langle e_\alpha, e_\beta \rangle$ is the Gram matrix for the root lattice $A_{n-1}$ (see Remark 4.6). Let us stress that the logarithmic change of variables $\mu_\alpha \to z_\alpha$ (4.3.4), which was rather superficial in the case of $\Theta^{(1)}$, is inevitable for the derivation of $\Theta^{(2)}$.

Finally, we comment on the relation between the two evolution operators $\Theta^{(1)} = \theta(\vec{z}, \Omega)$ and $\Theta^{(2)}$. The relation is based on a functional equation for the Riemann theta function:
$$\theta(\Omega^{-1} \vec{z}, -\Omega^{-1}) = \big(\det(\Omega / i)\big)^{1/2} \exp\big( \pi i\, (\vec{z}, \Omega^{-1} \vec{z}) \big)\, \theta(\vec{z}, \Omega) ,$$
which is a special case of a more general modular functional equation (for derivation and generalization see [Mum], Chap. 2, Sect. 5). With our particular matrix of periods (4.3.5) we find
$$\Theta^{(2)}(\vec{z}) = \frac{1}{\sqrt{n}} \Big( \frac{2\tau}{i} \Big)^{\frac{n-1}{2}} \frac{\theta(\vec{z}, \Omega)}{\theta(\Omega^{-1} \vec{z}, -\Omega^{-1})} . \tag{4.4.4}$$
Note that the theta function in the denominator, $\theta(\Omega^{-1} \vec{z}, -\Omega^{-1})$, commutes with the elements of $DSL_q(n)[R]$ and can be thought of as an evolution operator on a 'modular dual' quantum cotangent bundle [F.99].
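Two statements of this subsection lend themselves to a quick independent check: that the Gaussian (4.4.2) solves the evolution equations (4.2.2), and that the matrix $\frac{1}{2\tau}(\delta_{\alpha\beta} + 1)$ inverts the matrix of periods from (4.3.4). The sketch below is illustrative only and rests on assumptions. The function names are hypothetical. The shift $\nabla^\alpha$ is translated into the $z$-variables as $z_\beta \to z_\beta + 2\tau(\delta_{\alpha\beta} - 1/n)$, following (4.2.3) and (4.3.4). The sample point must satisfy $\sum_\beta z_\beta = 0$.

```python
import cmath
from fractions import Fraction


def theta2(z, tau):
    """Theta^(2)(z) = exp(-(pi i / (2 tau)) * sum_beta z_beta^2), cf. (4.4.2)."""
    return cmath.exp(-1j * cmath.pi * sum(zb * zb for zb in z) / (2 * tau))


def evolution_residual(z, tau, alpha):
    """Relative size of q*mu_alpha*Theta(shifted z) - Theta(z) for Theta^(2)."""
    n = len(z)
    # q^{1/n} mu_alpha = exp(2 pi i z_alpha), hence
    # q mu_alpha = exp(2 pi i (tau (1 - 1/n) + z_alpha)).
    q_mu = cmath.exp(2j * cmath.pi * (tau * (1 - 1 / n) + z[alpha]))
    shifted = [zb + 2 * tau * ((1 if b == alpha else 0) - 1 / n)
               for b, zb in enumerate(z)]
    reference = theta2(z, tau)
    return abs(q_mu * theta2(shifted, tau) - reference) / abs(reference)


def periods_inverse_ok(n):
    """Exact check that (delta - 1/n) * (delta + 1) is the identity of size n-1."""
    size = n - 1
    omega = [[Fraction(int(a == b)) - Fraction(1, n) for b in range(size)]
             for a in range(size)]          # Omega / (2 tau)
    inverse = [[Fraction(int(a == b)) + 1 for b in range(size)]
               for a in range(size)]        # 2 tau * Omega^{-1}
    return all(
        sum(omega[a][c] * inverse[c][b] for c in range(size)) == int(a == b)
        for a in range(size) for b in range(size)
    )


if __name__ == "__main__":
    z = [0.3 + 0.1j, -0.5 + 0.2j, 0.2 - 0.3j]   # components sum to zero
    tau = 0.4 + 0.7j
    print([evolution_residual(z, tau, a) for a in range(3)])
    print([periods_inverse_ok(n) for n in range(2, 7)])
```

The residuals are zero up to rounding for every $\alpha$, and no restriction such as $|q| < 1$ is imposed on the sample $\tau$ beyond $\tau \neq 0$.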
Appendix A. Pairing Between Spectral Variables and Quantized Functions Here we calculate pairing of the elementary symmetric functions ai (3.2.17) with the generators of quantized functions T ji . We assume that T and ai are realized respectively, as elements of dual quasi-triangular Hopf algebras A∗R and AR . We further extend this pairing also for the spectral variables µα .
For the calculation we use the formula
$$\langle T_1 , L_2 \rangle = \eta^{-2} q^{(n - \frac{1}{n})} R_{12}^2 , \tag{A.1}$$
which follows from the definitions (2.2.11), (3.1.5), (3.2.3), (3.2.5). Proposition A.1. Let ai (3.2.3), (3.2.5) and T (3.1.5) be elements of the dual quasi-triangular Hopf algebras, respectively, AR and A∗R . Assume that the R-matrix R (2.2.11) is G L q (n) type with scaling parameter η = q 1/n as in Example 2.10. Then n T, ai = q −3i/n n q−1 n q + q n+1 − q n−2i+1 I. (A.2) i q Proof. The calculation proceeds as follows: ↑1 T1 , ai = T1 , TrR (2, . . . , i + 1) A(i) L 1 . . . L i 3 = q i(n− n ) TrR (2, . . . , i + 1) A(i)↑1 (J2 . . . Ji+1 )(J1−1 . . . Ji−1 )↑1 3 = q i(n− n ) TrR (2, . . . , i + 1) R1 . . . Ri A(i) Ri . . . R1 3 = q i(n− n ) q −n (n + 1 − i)q i q−1 TrR (2, . . . , i) R1 . . . Ri−1 A(i−1) Ri−1 . . . R1
+ (q − q −1 ) q −(n+2)(i−1) n q−1 ni q I1 3 . . . = q i(n− n ) ni q q −in +(q − q −1 ) q −(n+2)(i−1) n q−1 (1 + q 2 +· · · + q 2(i−1) ) I1 . Here in the first line we substituted an expression for ai similar to (3.3.13). In the second 1 line we evaluated the pairing using formulas T1 , (L i )↑1 = η−2 q (n− n ) Ji+1 (Ji−1 )↑1 following from (A.1). In the third line we first, used the cyclic property of the R-trace to evaluate the term (J1−1 . . . Ji−1 )↑1 on (A(i) )↑1 , and then rearranged the product J2 . . . Ji+1 = † † Z i+1 = J2† . . . Ji+1 , where J1† := I, Jk+1 := Ri−k+1 Jk† Ri−k+1 , and evaluated the term † J2† . . . Ji† on (A(i) )↑1 . After that we recollected terms in the product: (A(i) )↑1 Ji+1 = −1 (i) −1 R1 . . . Ri A Ri . . . R1 . In the fourth line we substituted Ri = Ri + (q − q )I for one of Ri s and used formulas (2.2.8) and (2.2.10) to evaluate TrR (i + 1) . Then, in the summand which is proportional to (q − q −1 ) all the R-traces can be evaluated with the help of (2.4.4). Omission points in the fifth line stand for similar evaluations of TrR (i) . . . TrR (2) ; the resulting expression coincides with (A.2). For an relation (A.2) simplifies to T, an = q −1 I ,
(A.3)
which obviously agrees with (3.2.18). So, we have checked the consistency of the normalizations $q^{n - \frac{1}{n}}$ in (3.2.5) and $\eta = q^{1/n}$ for the Drinfeld-Jimbo R-matrices with the $SL_q(n)$ reduction condition (3.2.18).

Corollary A.2. Under the conditions of Proposition A.1 the pairing $\langle \cdot , \cdot \rangle$ can be extended to the spectral variables (3.2.22):
$$\langle T , \mu_\alpha \rangle = q^{(2\alpha + 2\delta_{\alpha n} - n - \frac{3}{n} - 1)}\, I , \qquad \alpha = 1, \dots, n . \tag{A.4}$$
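After the rescaling $\tilde{\mu}_\alpha := q^{\frac{3}{n} - 2\delta_{\alpha n}} \mu_\alpha$ the values in (A.4) become $q^{2\alpha - n - 1}$, and the corollary rests on the fact that the elementary symmetric functions of these values are the symmetric q-binomial coefficients. This can be confirmed exactly at a rational sample point. The sketch below assumes the convention $[k]_q = (q^k - q^{-k})/(q - q^{-1})$ for q-numbers; the function names and the sample value $q = 3/2$ are hypothetical choices.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def q_number(k, q):
    """Symmetric q-number [k]_q = (q^k - q^-k) / (q - q^-1)."""
    return (q ** k - q ** (-k)) / (q - 1 / q)


def q_binomial(n, i, q):
    """Symmetric q-binomial [n]! / ([i]! [n-i]!)."""
    numerator = prod((q_number(n - j, q) for j in range(i)), start=Fraction(1))
    denominator = prod((q_number(j + 1, q) for j in range(i)), start=Fraction(1))
    return numerator / denominator


def elementary_symmetric(values, i):
    """i-th elementary symmetric function of the given values."""
    return sum(prod(c, start=Fraction(1)) for c in combinations(values, i))


def pairing_values(n, q):
    """Scalars q^{2 alpha - n - 1}, alpha = 1..n, from the rescaled pairing."""
    return [q ** (2 * alpha - n - 1) for alpha in range(1, n + 1)]


if __name__ == "__main__":
    q = Fraction(3, 2)
    for n in range(1, 7):
        values = pairing_values(n, q)
        print(n, all(elementary_symmetric(values, i) == q_binomial(n, i, q)
                     for i in range(n + 1)))
```

Exact rational arithmetic avoids any tolerance in the comparison; the check covers only the sampled $n$ and the single sample value of $q$.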
Proof. Let us rescale the spectral variables: $\tilde{\mu}_\alpha := q^{\frac{3}{n} - 2\delta_{\alpha n}} \mu_\alpha$. For the rescaled variables (A.4) reads
$$\langle T , \tilde{\mu}_\alpha \rangle = q^{(2\alpha - n - 1)}\, I . \tag{A.5}$$
Using the q-binomial identity $q^i \binom{n-1}{i}_q + q^{i-n} \binom{n-1}{i-1}_q = \binom{n}{i}_q$, it is straightforward to derive from (A.5) the pairings of the elementary symmetric functions $e_i(\tilde{\mu})$ by induction on $n$: $\langle T , e_i(\tilde{\mu}) \rangle = \binom{n}{i}_q$. Using (3.3.18) it is then straightforward to derive the pairings for the elementary symmetric functions in the original spectral variables, $\langle T , e_i(\mu) \rangle$, which under the identification $a_i \to e_i(\mu)$ coincide with (A.2).

Appendix B. Three Lemmas for Subsection 3.4

Here we collect some technical results which are used for establishing the relation between the spectra of the left and right invariant vector fields.

Lemma B.1. a) If the R-matrix $R$ is skew invertible then the following four statements are equivalent: i) the matrix $D_R$ is invertible; ii) the matrix $C_R$ is invertible; iii) the R-matrix $R^{-1}$ is skew invertible; iv) the R-matrix $\overset{*}{R} := P R^{-1} P$ is skew invertible. One has
$$D_{\overset{*}{R}} = C_{R^{-1}} = (D_R)^{-1} , \qquad C_{\overset{*}{R}} = D_{R^{-1}} = (C_R)^{-1} . \tag{B.1}$$
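b) Let $R$ be the Hecke type R-matrix generating representations $\rho_R$ (2.3.7) of the algebras $H_k(q)$. Then the R-matrix $\overset{*}{R}$ is Hecke type as well, and $\rho_{\overset{*}{R}}$ are representations of the algebras $H_k(q^{-1})$. If additionally the parameter $q$ satisfies conditions [k] (2.3.1), so that the idempotent $a^{(k)}|_{q \leftrightarrow q^{-1}} \in H_k(q^{-1})$ (see (2.3.2)) is well defined, then
$$\overset{*}{A}{}^{(k)} := \rho_{\overset{*}{R}}\big(a^{(k)}|_{q \leftrightarrow q^{-1}}\big) = \Upsilon_P^{(k)} A^{(k)} \Upsilon_P^{(k)} , \tag{B.2}$$
$$\overset{*}{R}_i\, \overset{*}{A}{}^{(k)} = - q\, \overset{*}{A}{}^{(k)} , \quad \forall\, i = 1, \dots, k - 1 . \tag{B.3}$$
Here $\Upsilon_P^{(k)} = (\Upsilon_P^{(k)})^{-1}$ is a particular $R = P$ case of an operator $\Upsilon_R^{(k)} \in \mathrm{Aut}(V^{\otimes k})$, defined inductively for any R-matrix $R$,
$$\Upsilon_R^{(1)} := 1 , \qquad \Upsilon_R^{(k+1)} := (R_1 R_2 \dots R_k)\, \Upsilon_R^{(k)} = \Upsilon_R^{(k)}\, (R_k \dots R_2 R_1) \quad \forall\, k = 1, 2, \dots . \tag{B.4}$$
This operator performs reflection of the indices of the R-matrices,
$$R_i\, \Upsilon_R^{(k)} = \Upsilon_R^{(k)}\, R_{k-i} \quad \forall\, i, k : 1 \le i < k . \tag{B.5}$$
The particular element $\Upsilon_P^{(k)}$ enjoys also the relations
$$R_i\, \Upsilon_P^{(k)} = \Upsilon_P^{(k)}\, (P R P)_{k-i} \quad \forall\, i, k : 1 \le i < k , \ \forall\ \text{R-matrix } R , \tag{B.6}$$
$$M_i\, \Upsilon_P^{(k)} = \Upsilon_P^{(k)}\, M_{k-i+1} \quad \forall\, i, k : 1 \le i \le k , \ \forall\, M \in \mathrm{End}_W(V) , \tag{B.7}$$
where $W$ is an arbitrary $\mathbb{C}$-linear space.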
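In the special case $R = P$ the operators $R_i$ are adjacent transpositions of tensor factors, and the statements about $\Upsilon_P^{(k)}$ can be illustrated inside the symmetric group. The sketch below makes assumptions that are not in the text: permutations are stored as tuples, the helper names are hypothetical, and only the permutation case is tested, not a general Hecke type R-matrix. It builds $\Upsilon^{(k)}$ by both recursions in (B.4), confirms that the results coincide with the order-reversing permutation (which is the content of (B.7)), and checks the reflection property (B.5) together with $\Upsilon_P^{(k)} = (\Upsilon_P^{(k)})^{-1}$.

```python
def identity(k):
    return tuple(range(k))


def compose(p, r):
    """Composition 'p after r' of permutations stored as tuples."""
    return tuple(p[r[x]] for x in range(len(p)))


def swap(i, k):
    """Adjacent transposition of factors i and i+1 (1-based) among k factors."""
    s = list(range(k))
    s[i - 1], s[i] = s[i], s[i - 1]
    return tuple(s)


def word(indices, k):
    """Product of adjacent transpositions in the listed order."""
    result = identity(k)
    for i in indices:
        result = compose(result, swap(i, k))
    return result


def upsilon_left(k):
    """First form of (B.4): Upsilon^(j+1) = (R_1 ... R_j) Upsilon^(j)."""
    ups = identity(k)
    for j in range(1, k):
        ups = compose(word(range(1, j + 1), k), ups)
    return ups


def upsilon_right(k):
    """Second form of (B.4): Upsilon^(j+1) = Upsilon^(j) (R_j ... R_1)."""
    ups = identity(k)
    for j in range(1, k):
        ups = compose(ups, word(range(j, 0, -1), k))
    return ups


def reflection_holds(k):
    """Check R_i Upsilon = Upsilon R_{k-i} for 1 <= i < k, and Upsilon^2 = 1."""
    ups = upsilon_left(k)
    reflected = all(compose(swap(i, k), ups) == compose(ups, swap(k - i, k))
                    for i in range(1, k))
    return reflected and compose(ups, ups) == identity(k)


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, upsilon_left(k), upsilon_left(k) == upsilon_right(k),
              reflection_holds(k))
```

Because the identities checked here are invariant under reversing the order of products, the outcome does not depend on whether the tuples are read as a left or a right action on tensor factors.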
Proof. The first equality in both formulas (B.1) is proved in a more general setting in [OP.05], Lemma 3.6 c). The second equality is proved in [Is.04], Sect. 3.1, Proposition 2. Relations (B.5) and (B.6) for matrices ϒ R(k) , ϒ P(k) follow directly from (2.2.4) and from equalities R1 P2 P1 = P2 P1 R2 ,
R2 P1 P2 = P1 P2 R1 .
(B.8)
Equalities (B.7) are obvious. Relations (B.3) are byproducts of (B.2) and (2.3.4). The second equality in (B.2) follows from (2.3.2), (2.3.3), (B.6), and the Hecke relation for ∗
∗
R : R = P R P − (q − q −1 )I .
Lemma B.2. Let M be a matrix of left invariant vector fields for the Hecke type HD algebra DG[R, γ ]. For any i ≥ 1 and j ≥ 0 one has ∗
M∗ J 1 1
∗
∗
M∗ J 2 . . . M∗ J i 2
(B.9)
Ii+1,...,i+ j
i
= γ i(i+1) TrR (i + j + 1, . . . , 2i + j) (i+ j) (2i+ j) (2i+ j) (i+ j) (2i) ϒP ϒP , (L T )i . . . (L T )1 ϒ ∗ (Ti . . . T1 )−1 ϒ P ϒP R
where Ii+1,...,i+ j is the identity operator acting in the component spaces V with labels i + 1, . . . , i + j. Proof. Consider the following sequence of transformations: ∗ M∗ J 1 = M1 = (T −1 L T )1 = TrR (2) T1−1 R1 L 1 T1 = γ 2 TrR (2) L 2 T1−1 R −1 T1 1
∗ −1 −1 −1 2 = γ TrR (2) (L T )2 R1 T2 = γ TrR (2) P1 (L T )1 R 1 T1 P1 . 2
(B.10)
Here in the first line we transform the underlined expressions using (3.3.1) and (3.1.1), and in the last line we apply the definition (3.4.5). Relation (B.10) reproduces formula (B.9) for i = 1 and j = 0. By a repeated application of formula (cf. with (2.2.10)) TrR ( j + 1) (P j X P j ) = I j TrR ( j) (X )
∀ X ∈ End W (V ⊗ j ),
we can rewrite it as (B.9) with i = 1 and arbitrary j > 0, ∗ M∗ J 1 1
I2,... j+1 = γ TrR ( j + 2) (P j+1 . . . 2
∗
P1 )(L T )1 R 1 T1−1 (P1 . . .
(B.11)
P j+1 ) . (B.12)
In a similar way, for any value of i relations, (B.9) with j > 0 follow from that with j = 0 by a repeated application of (B.11). Therefore, it is enough to consider the case j = 0. ∗
Using relations (B.12) and (B.8) we can rewrite an expression M∗ J i in the following i
way
∗
∗
∗
∗
∗
M∗ J i = ( R i−1 . . . R 1 ) M1 ( R 1 . . . R i−1 ) i
∗
∗
∗
∗
∗
= γ TrR (i + 1) (Pi . . . P2 P1 )(L T )1 ( R i . . . R 2 R 1 R 2 . . . R i ) (T1 ) 2
−1
(P1 P2 . . . Pi ) . (B.13)
Now we are ready to prove formula (B.9) by induction on i. Assuming that (B.9) with j = 1 is valid for the product of (i − 1) factors we transform the product of i factors, ∗
∗
∗
M∗ J 2 . . . M∗ J i
M∗ J 1 1
2
i
(i) (2i−1) = γ (i−1)i TrR (i + 1, . . . 2i − 1) ϒ P ϒ P (L T )i−1 . . . (L T )1 ∗ (2i−2) −1 (2i−1) (i) × ϒ∗ (Ti−1 . . . T1 ) ϒ P ϒP M∗ J i . R
i
∗
Next, we apply formulas (B.6), (B.7) to move the last factor (M∗ J i ) in this expression i
left-wards. The result is
(i) (2i−1) (2i−2) (L T )i−1 . . . (L T )1 ϒ ∗ (Ti−2 . . . T1 )−1 = γ (i−1)i TrR (i + 1, . . . 2i − 1) ϒ P ϒ P R ⎞ ↑(i−2)
∗
T1−1 (M∗ J i )↑1
×
i
ϒ P(2i−1) ϒ P(i) ⎠ ,
where we have used identities (Ti−1 )−1 = (T1−1 )↑(i−2) ∗ ((M∗ J i )↑1 )↑(i−2)
to arrange the terms (Ti−1 )−1 and
i
and
∗ (M∗ J i )↑(i−1)
∗
(M∗ J i )↑(i−1) = i
in a suitable way.
i
Next, we use formula (3.4.8) for their permutation and then, in a similar way we move ∗
term (M∗ J ) to the left of all the terms (T∗ )−1 : ··· = γ
(i−1)i+2(i−1) ∗
× (M
∗ 2i−1
(i) (2i−1) (2i−2) TrR (i + 1, . . . 2i − 1) ϒ P ϒ P (L T )i−1 . . . (L T )1 ϒ ∗ R
J 2i−1 )(Ti−1 . . . T1 )−1 ϒ P(2i−1) ϒ P(i) .
Now we substitute the expression (B.13) for (M
∗ 2i−1
∗
J 2i−1 )
= γ i(i+1) TrR (i + 1, . . . 2i) ϒ P(i) ϒ P(2i−1) (L T )i−1 . . . (L T )1 ϒ (2i−2) (P2i−1 . . . P1 )(L T )1 ∗ R
∗
∗
∗
∗
∗
(2i−1)
× ( R 2i−1 . . . R 2 R 1 R 2 . . . R 2i−1 )(T1 )−1 (P1 . . . P2i−1 )(Ti−1 . . . T1 )−1 ϒ P
(i)
ϒP
and move the term (P2i−1 . . . P1 ) leftwards and the term (P1 . . . P2i−1 ) rightwards close to the terms ϒ P(2i−1) . Finally, using (B.4) we complete the calculation (i) (2i) (2i) (2i) (i) = γ i(i+1) TrR (i + 1, . . . 2i) ϒ P ϒ P (L T )i . . . (L T )1 ϒ ∗ (Ti . . . T1 )−1 ϒ P ϒ P . R
∗
Here we transformed terms containing R in the following way: (2i−2)↑1 ∗
ϒ∗
∗
∗
∗
∗
( R 2i−1 . . . R 2 R 1 R 2 . . . R 2i−1 )
R
(2i−2)↑1 ∗
R ∗
∗
∗
∗
∗
( R 1 . . . R 2i−2 R 2i−1 R 2i−2 . . . R 1 )
= ϒ∗
∗
(2i−2) ∗
= ( R 1 . . . R 2i−1 )ϒ ∗ R
∗
∗
∗
(2i−1)
( R 2i−2 . . . R 1 ) = ( R 1 . . . R 2i−1 )ϒ ∗ R
(2i)
= ϒ∗ . R
ϒ (k)
(B.4) associated with a skew invertible Lemma B.3. The operators Jk (3.3.8) and R-matrix R satisfy relations (2i) (i) 4 TrR (i + 1, . . . , 2i) ϒ R = ϒ R = (J1 J2 . . . Ji )2 . (B.14) Proof. Calculation proceeds as follows:
TrR (i + 1, . . . , 2i) ϒ R(2i) = TrR (i + 1, . . . , 2i − 1) ϒ R(2i−1) (TrR (2i) R2i−1 ) (R2i−2 . . . R1 ) (2i−1) = TrR (i + 1, . . . , 2i − 1) (R1 . . . Ri−1 ) ϒ R (Ri−1 . . . R1 ) (i)
. . . = (R1 . . . Ri−1 )i ϒ R (Ri−1 . . . R1 )(Ri−2 . . . R1 ) . . . (R2 R1 )R1 (i) 2 (i) 4 = (J1 J2 . . . Ji ) ϒ R = ϒR . Here in passing to the second line we calculated the R-trace TrR (2i) with the help of (2.2.8)
and then used (B.5) to move (i −1) R-matrices to the left of the term ϒ R(2i−1) . Expression in the third line results from similar calculations of the R-traces TrR (2i − 1) , . . . , TrR (i + 1) , consecutively. Equalities in the last line result from rearranging factors of the product (R1 . . . Ri−1 )i .
Acknowledgement. We are grateful to Ludwig Dmitrievich Faddeev for acquainting us with the problem of the dynamics of the isotropic q-top, and for numerous inspiring discussions and advice. We would like to thank Alexei Gorodentsev, Sergei Kuleshov, Andrey Levin, Dmitry Lebedev, Andrey Mudrov, Andrey Marshakov and Vyacheslav Spiridonov for their useful comments and conversations. We would also like to acknowledge the warm hospitality of the Max-Planck-Institut für Mathematik, where the writing of this paper was started in 2004 and finished in 2008. The work is supported by the Russian Foundation for Basic Research, grants No. 08-01-00392-a, and CNRS-RFBR No. 07-02-92166 and No. 09-01-93107.
References

[AF.91] Alekseev, A.Yu., Faddeev, L.D.: $(T^*G)_t$: a toy model for conformal field theory. Commun. Math. Phys. 141(2), 413–422 (1991)
[AF.92] Alekseev, A.Yu., Faddeev, L.D.: An involution and dynamics for the q deformed quantum top. Zap. Nauchn. Semin. LOMI 200, 3 (1992) (in Russian); English translation available at http://arxiv.org/abs/hep-th/9406196, 1994
[B] Burroughs, N.: Relating the approaches to quantized algebras and quantum groups. Commun. Math. Phys. 113, 91–117 (1990)
[ChP] Chari, V., Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press, 1994
[CS] Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices and Groups. Berlin-Heidelberg-New York: Springer-Verlag, 1993
[D.86] Drinfeld, V.G.: Quantum Groups. In: Proceedings of the Intern. Congress of Mathematics, Vol. 1 (Berkeley, 1986), p. 798. For the expanded version see J. Math. Sci. 41(2), 898–915 (1988) (translated from Zap. Nauch. Sem. LOMI 155, 18–49 (1986))
[D.89] Drinfeld, V.G.: On almost cocommutative Hopf algebras. (Russian) Algebra i Analiz 1(2), 30–46 (1989); English translation in: Leningrad Math. J. 1(2), 321–342 (1990)
[DM.01] Donin, J., Mudrov, A.: $U_q(sl(n))$-covariant quantization of symmetric coadjoint orbits via reflection equation algebra. Contemp. Math. 315, 61–79 (2002)
[DM.02] Donin, J., Mudrov, A.: Explicit equivariant quantization on coadjoint orbits of GL(n, C). Lett. Math. Phys. 62(1), 17–32 (2002)
[ES] Etingof, P., Schiffmann, O.: Lectures on the dynamical Yang-Baxter equations. In: Quantum groups and Lie theory (Durham 1999), London Math. Soc. LN series 290, Cambridge: Cambridge Univ. Press 2001
[F.90] Faddeev, L.D.: On the exchange matrix for WZNW model. Commun. Math. Phys. 132(1), 131–138 (1990)
[F.94] Faddeev, L.D.: Current-like variables in massive and massless integrable models. Lectures delivered at the International School of Physics 'Enrico Fermi'. Varenna, Italy, 1994; available at http://arxiv.org/abs/hep-th/9408041, 1994
[F.95] Faddeev, L.D.: Discrete Heisenberg-Weyl group and modular group. Lett. Math. Phys. 34(3), 249–254 (1995)
[F.99] Faddeev, L.D.: Modular double of a quantum group. In: Conférence Moshé Flato 1999, Quantization, Deformation, and Symmetries. Vol. I, Dordrecht: Kluwer Acad. Publ., 2000, pp. 149–156; available at http://arxiv.org/abs/math.QA/9912078, 1999
[FHIOPT] Furlan, P., Hadjiivanov, L.K., Isaev, A.P., Ogievetsky, O.V., Pyatov, P.N., Todorov, I.T.: Quantum matrix algebra for the SU(n) WZNW model. J. Phys. A: Math. Gen. 36, 5497–5530 (2003)
[FRT] Faddeev, L.D., Reshetikhin, N.Yu., Takhtajan, L.A.: Quantization of Lie groups and Lie algebras. (Russian) Algebra i Analiz 1(1), 178–206 (1989); English translation in: Leningrad Math. J. 1(1), 193–225 (1990)
[GKL] Gerasimov, A., Kharchev, S., Lebedev, D.: Representation theory and quantum integrability. Progr. Math. 237, Basel: Birkhäuser, 2005, pp. 133–156; available at http://arxiv.org/abs/math.QA/0402112, 2004
[G] Gurevich, D.I.: Algebraic aspects of the quantum Yang-Baxter equation. (Russian) Algebra i Analiz 2, 119–148 (1990); English translation in: Leningrad Math. J. 2, 801–828 (1991)
[GPS.97] Gurevich, D.I., Pyatov, P.N., Saponov, P.A.: Hecke symmetries and characteristic relations on reflection equation algebras. Lett. Math. Phys. 41, 255–264 (1997)
[GPS.05] Gurevich, D.I., Pyatov, P.N., Saponov, P.A.: Cayley-Hamilton Theorem for Quantum Matrix Algebras of GL(m|n) type. Algebra i Analiz 17(1), 160–182 (2005) (in Russian). English translation in: St. Petersburg Math. J. 17(1), 119–135 (2006)
[GPS.06] Gurevich, D.I., Pyatov, P.N., Saponov, P.A.: Quantum matrix algebras of the GL(m|n) type: the structure and spectral parameterization of the characteristic subalgebra. Teor. Matem. Fiz. 147(1), 14–46 (2006) (in Russian). English translation in: Theor. Math. Phys. 147(1), 460–485 (2006)
[GR.91] Gelfand, I.M., Retakh, V.S.: Determinants of matrices over noncommutative rings. Funct. Anal. Appl. 25, 91–102 (1991)
[GR.92] Gelfand, I.M., Retakh, V.S.: A theory of noncommutative determinants and characteristic functions of graphs. Funct. Anal. Appl. 26, 1–20 (1992); Publ. LACIM, Montreal: UQAM, 14, pp. 1–26
[GS.99] Gurevich, D., Saponov, P.: Quantum line bundles via Cayley-Hamilton identity. J. Phys. A: Math. Gen. 34(21), 4553–4569 (2001)
[GS.04] Gurevich, D., Saponov, P.: Geometry of non-commutative orbits related to Hecke symmetries. To appear in Contemp. Math.: Joseph Donin memorial volume; available at http://arxiv.org/abs/math.QA/0411579, 2004
[H] Hlavaty, L.: Quantized braided groups. J. Math. Phys. 35, 2560–2569 (1994)
[HIOPT] Hadjiivanov, L.K., Isaev, A.P., Ogievetsky, O.V., Pyatov, P.N., Todorov, I.T.: Hecke algebraic properties of dynamical R-matrices: application to related quantum matrix algebras. J. Math. Phys. 40(1), 427–448 (1999)
[I] Igusa, J.: Theta Functions. Grund. Math. Wiss. 194, Berlin-Heidelberg-New York: Springer-Verlag, 1972
[Is.95] Isaev, A.P.: Twisted Yang-Baxter equations for linear quantum (super) groups. J. Phys. A: Math. Gen. 29, 6903–6910 (1996)
[Is.04] Isaev, A.P.: Quantum groups and Yang-Baxter equations. MPIM Preprint 2004-132; available at http://www.mpim-bonn.mpg.de/Research/MPIM-Preprint-Series/
[IOP.98] Isaev, A.P., Ogievetsky, O.V., Pyatov, P.N.: Generalized Cayley-Hamilton-Newton identities. Czech. J. Phys. 48, 1369–1374 (1998)
[IOP.99] Isaev, A., Ogievetsky, O., Pyatov, P.: On quantum matrix algebras satisfying the Cayley-Hamilton-Newton identities. J. Phys. A: Math. Gen. 32, L115–L121 (1999)
[IP] Isaev, A.P., Pyatov, P.N.: Covariant differential complexes on quantum linear groups. J. Phys. A: Math. Gen. 28, 2227–2246 (1995)
[J.85] Jimbo, M.: A q-difference analogue of U(g) and the Yang-Baxter equation. Lett. Math. Phys. 10, 63–69 (1985)
[J.86] Jimbo, M.: A q-analogue of $U_q(gl(N+1))$, Hecke algebra and the Yang-Baxter equation. Lett. Math. Phys. 11, 247–252 (1986)
[KL] Krob, D., Leclerc, B.: Minor identities for quasi-determinants and quantum determinants. Commun. Math. Phys. 169(1), 1–23 (1995)
[KLS] Kharchev, S., Lebedev, D., Semenov-Tian-Shansky, M.: Unitary representations of $U_q(sl(2, R))$, the modular double and the multiparticle q-deformed Toda chains. Commun. Math. Phys. 225(3), 573–609 (2002)
[KS] Kulish, P.P., Sklyanin, E.K.: Algebraic structures related to reflection equations. J. Phys. A: Math. Gen. 25(22), 5963–5975 (1992)
[KSch] Klimyk, A., Schmüdgen, K.: Quantum Groups and their Representations. Berlin: Springer, 1997
[M] Montgomery, S.: Hopf Algebras and their Actions on Rings. CBMS Lecture Notes Vol. 82, Providence, RI: Amer. Math. Soc., 1993
[Mum] Mumford, D.: Tata Lectures on Theta. I. Progress in Mathematics, Vol. 28, Boston, MA: Birkhäuser Boston Inc., 1983
[O] Ogievetsky, O.: Uses of quantum spaces. In: Proc. of School Quantum symmetries in theoretical physics and mathematics (Bariloche, 2000), Contemp. Math. 294, Providence, RI: Amer. Math. Soc., 2002, pp. 161–232
[OP.01] Ogievetsky, O., Pyatov, P.: Lecture on Hecke algebras. In: Proc. of the International School "Symmetries and Integrable Systems" (Dubna, Russia, June 8–11, 1999), JINR, Dubna, D2,5-2000-218, pp. 39–88; MPIM Preprint 2001-40, available at http://www.mpim-bonn.mpg.de/Research/MPIM-Preprint-Series/
[OP.05] Ogievetsky, O., Pyatov, P.: Orthogonal and symplectic quantum matrix algebras and Cayley-Hamilton theorem for them. Preprint MPIM2005–53; http://arxiv.org/abs/math.QA/0511618, 2005
[PP] Polishchuk, A., Positselski, L.: Quadratic Algebras. University Lecture Series, 37. Providence, RI: Amer. Math. Soc., 2005
[R.89] Reshetikhin, N.Yu.: Quasitriangular Hopf algebras and invariants of tangles. (Russian) Algebra i Analiz 1(2), 169–188 (1989); English translation in: Leningrad Math. J. 1(2), 491–513 (1990)
[R.90] Reshetikhin, N.Yu.: Multiparameter quantum groups and twisted quasitriangular Hopf algebras. Lett. Math. Phys. 20, 331–335 (1990)
[RT] Reshetikhin, N.Yu., Turaev, V.G.: Ribbon graphs and their invariants derived from quantum groups. Commun. Math. Phys. 127(1), 1–26 (1990)
[S] Semenov-Tyan-Shanskii, M.A.: Poisson-Lie groups. The quantum duality principle and the twisted quantum double. (Russian) Teor. Mat. Fiz. 93(2), 302–329 (1992); English translation in: Theor. Math. Phys. 93(2), 1292–1307 (1992)
[SWZ.92] Schupp, P., Watts, P., Zumino, B.: Differential geometry on linear quantum groups. Lett. Math. Phys. 25(2), 139–147 (1992)
[SWZ.93] Schupp, P., Watts, P., Zumino, B.: Bicovariant quantum algebras and quantum Lie algebras. Commun. Math. Phys. 157(2), 305–329 (1993)
[TW] Tuba, I., Wenzl, H.: On braided tensor categories of type BCD. J. Reine Angew. Math. 581, 31–69 (2005)
Communicated by L. Takhtajan
Commun. Math. Phys. 288, 1181–1201 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0701-4
Communications in
Mathematical Physics
Stein’s Method and Characters of Compact Lie Groups Jason Fulman Department of Mathematics, University of Southern California, Los Angeles, CA 90089-2532, USA. E-mail:
[email protected] Received: 13 June 2008 / Accepted: 13 September 2008 Published online: 5 December 2008 – © Springer-Verlag 2008
Abstract: Stein’s method is used to study the trace of a random element from a compact Lie group or symmetric space. Central limit theorems are proved using very little information: character values on a single element and the decomposition of the square of the trace into irreducible components. This is illustrated for Lie groups of classical type and Dyson’s circular ensembles. The approach in this paper will be useful for the study of higher dimensional characters, where normal approximations need not hold.

1. Introduction

There is a large literature on the traces of random elements of compact Lie groups. One of the earliest results is due to Diaconis and Shahshahani [DS]. Using the method of moments, they show that if $g$ is random from the Haar measure of the unitary group $U(n, \mathbb{C})$, and $Z = X + iY$ is a standard complex normal with $X$ and $Y$ independent, mean 0 and variance $\frac{1}{2}$ normal variables, then for $j = 1, 2, \ldots$, the traces $\mathrm{Tr}(g^j)$ are independent and distributed as $\sqrt{j}\,Z$ asymptotically as $n \to \infty$. They give similar results for the orthogonal group $O(n, \mathbb{R})$ and the group of unitary symplectic matrices $USp(2n, \mathbb{C})$. The moment computations of [DS] use representation theory. It is worth noting that there are other approaches to their moment computations: [PV] uses a version of integration by parts (and also treats $SO(n, \mathbb{R})$), and [CoSz] uses an “extended Wick calculus” (and also treats symmetric spaces).

Concerning the error in the normal approximation in the [DS] results, Diaconis conjectured that for fixed $j$, it decreases exponentially or even superexponentially in $n$. Stein [St2] uses “Stein’s method” to show that $\mathrm{Tr}(g^k)$ on $O(n, \mathbb{R})$ is asymptotically normal with error $O(n^{-r})$ for any fixed $r$. Johansson [J] proved Diaconis’ conjecture for classical compact Lie groups using Toeplitz determinants and a very detailed analysis of characteristic functions.

The author received funding from NSF grant DMS-0503901.
One direction in which the [DS] results have been extended is the study of linear statistics of eigenvalues: see [J,DE,So] and the numerous references therein. There is also work by D’Aristotile, Diaconis, and Newman [DDN] on central limit theorems for linear functions such as $\mathrm{Tr}(Ag)$, where $A$ is a fixed $n \times n$ real matrix and $g$ is from the Haar measure of $O(n, \mathbb{R})$. In recent work, Meckes [Me2] refined Stein’s technique from [St2] to establish a sharp total variation distance error term (order $n^{-1}$) for the [DDN] result.

A natural goal is to prove limit theorems (with error terms) for the distribution of traces in other irreducible representations: i.e. $\chi^\tau(g)$, where $g$ is a random element of a compact Lie group and $\chi^\tau$ is the character of an irreducible representation $\tau$. This would have direct implications for Katz’s work [Ka] on exponential sums; see Sect. 4.7 of [KLR] for details. We do not attain this goal, but make a useful contribution to it. More precisely, the current paper presents a formulation of Stein’s method designed for the study of $\chi^\tau(g)$. In the case of normal approximation, we obtain $O(n^{-1})$ bounds for the error term using only two pieces of information:
• The value of the “character ratios” $\frac{\chi^\phi(\alpha)}{\dim(\phi)}$, where $\phi$ may be arbitrary but $\alpha$ is a single element of $G$ (typically chosen to be close to the identity)
• The decomposition of $\tau^2$ into irreducible representations.
In contrast, the method of moments approach requires knowing the multiplicity of the trivial representation in $\tau^k$ for all $k \ge 1$ (which could be tricky to compute) and does not give an immediate bound on the error. Johansson’s paper [J] gives sharper bounds when $\chi^\tau$ is the trace of an element from a classical compact Lie group, but requires knowledge of high order moments and deep analytical tools which might not extend to arbitrary representations $\tau$. Even the Stein’s method approaches of Stein [St2] and Meckes [Me2] use information about the distribution of matrix entries; very little is known about this for arbitrary $\tau$, whereas the main ingredient for our approach (character theory) is well-developed.

Let us explain our statement in the abstract that the methods of this paper will prove useful for approximation other than normal approximation. We use Stein’s method of exchangeable pairs, which involves the construction of a pair $(W, W')$ of exchangeable random variables. Our pair (which is somewhat different from those of Stein [St2] and Meckes [Me2]) satisfies the linearity condition that $E(W'|W)$ is proportional to $W$, and we find representation theoretic formulas for quantities such as $E(W' - W)^k$. These computations are completely general and apply to arbitrary distributional approximation. Stein’s method of exchangeable pairs is still quite undeveloped for continuous distributions other than the normal, but that is temporary and there are some results: see [Mn,Re] for the chi-squared distribution, [Lu] for the Gamma distribution, and [GoT] for the semicircle law. Closest to the current paper is [CFR], which develops error terms for exponential approximation using quantities like $E(W' - W)^k$ with $k$ small.

We remark that the bounds in our paper are all given in the Kolmogorov metric. Similar results can be proved in the slightly stronger total variation metric (see the remarks after Theorem 2.1). However we prefer to work in the Kolmogorov metric as it underscores the similarity with discrete settings such as [Fu], where total variation convergence does not occur. We also mention that all bounds obtained in this paper are given with explicit constants.

The organization of this paper is as follows. Section 2 gives background on Stein’s method and normal approximation. Section 3 develops general theory for the case that $G$ is a compact Lie group and $\chi^\tau$ an irreducible character. It treats the trace of random
elements of $O(n, \mathbb{R})$, $USp(2n, \mathbb{C})$, and $U(n, \mathbb{C})$ as examples. Section 4 extends the methods of Sect. 3 to study spherical functions of compact symmetric spaces. The symmetric space setting is natural from the viewpoint of random matrix theory [Dn,KaS]. After illustrating the technique on the sphere, we treat Dyson’s circular ensembles as examples, obtaining an error term.

2. Stein’s Method for Normal Approximation

In this section we briefly review Stein’s method for normal approximation, using the method of exchangeable pairs [St1]. For more details, one can consult the survey [RR] and the references therein.

Two random variables $W, W'$ are called an exchangeable pair if $(W, W')$ has the same distribution as $(W', W)$. As is typical in probability theory, let $E(A|B)$ denote the expected value of $A$ given $B$. The following result of Stein uses an exchangeable pair $(W, W')$ to prove a central limit theorem for $W$.

Theorem 2.1. ([St1]). Let $(W, W')$ be an exchangeable pair of real random variables such that $E(W^2) = 1$ and $E(W'|W) = (1 - a)W$ with $0 < a < 1$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right|
\le \frac{\sqrt{Var(E[(W' - W)^2|W])}}{a} + (2\pi)^{-\frac{1}{4}} \sqrt{\frac{1}{a} E|W' - W|^3}.
\]

Remarks. (1) There are variations of Theorem 2.1 (for instance Theorem 6 of [Me1]) which can be combined with our calculations to prove normal approximation in the total variation metric. However Theorem 2.1 is quite convenient for our purposes. (2) In recent work, Röllin [Rl] has given a version of Theorem 2.1 in which the exchangeability condition can be replaced by the slightly weaker condition that $W$ and $W'$ have the same law. Since exchangeability holds in our examples and may be useful for other applications involving Stein’s method, we adhere to using Theorem 2.1.

To apply Theorem 2.1, one needs bounds on $Var(E[(W' - W)^2|W])$ and $E|W' - W|^3$. The following lemmas are helpful for this purpose.

Lemma 2.2. Let $(W, W')$ be an exchangeable pair of random variables such that $E(W'|W) = (1 - a)W$ and $E(W^2) = 1$. Then $E(W' - W)^2 = 2a$.

Proof. Since $W$ and $W'$ have the same distribution,
\begin{align*}
E(W' - W)^2 &= E(E((W' - W)^2|W)) \\
&= E((W')^2) + E(W^2) - 2E(W E(W'|W)) \\
&= 2E(W^2) - 2E(W E(W'|W)) \\
&= 2E(W^2) - 2(1 - a)E(W^2) \\
&= 2a.
\end{align*}
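As an illustration of Lemma 2.2 outside the Lie group setting, the following sketch checks the identity exactly for a classical exchangeable pair that is not taken from this paper: $S$ is the sum of $n$ independent $\pm 1$ signs and $S'$ is obtained by resampling one uniformly chosen coordinate. The function name `check_rademacher_pair` is hypothetical. Because $S$ is not normalized, the lemma reads $E(S' - S)^2 = 2a\,E(S^2)$ with $a = 1/n$.

```python
from fractions import Fraction
from itertools import product


def check_rademacher_pair(n):
    """Exact check of Lemma 2.2 for a coordinate-resampling pair on {-1, +1}^n.

    S is the unnormalised coordinate sum and S' is obtained by resampling one
    uniformly chosen coordinate. Using S instead of W = S / sqrt(n) keeps every
    quantity rational, so the comparison below is exact.
    """
    a = Fraction(1, n)
    states = list(product((-1, 1), repeat=n))
    second_moment = Fraction(0)   # E(S^2)
    mean_sq_diff = Fraction(0)    # E(S' - S)^2
    linear = True                 # does E(S' | x) = (1 - a) S hold for every x?
    for x in states:
        s = sum(x)
        cond_mean = Fraction(0)   # E(S' | x)
        cond_sq = Fraction(0)     # E((S' - S)^2 | x)
        for i in range(n):
            for new in (-1, 1):
                s_new = s - x[i] + new
                weight = Fraction(1, 2 * n)
                cond_mean += weight * s_new
                cond_sq += weight * (s_new - s) ** 2
        linear = linear and cond_mean == (1 - a) * s
        second_moment += Fraction(s * s, len(states))
        mean_sq_diff += cond_sq / len(states)
    return a, linear, second_moment, mean_sq_diff


a, linear, second_moment, mean_sq_diff = check_rademacher_pair(4)
print(linear, mean_sq_diff == 2 * a * second_moment)
```

The enumeration is exponential in `n`, so this is only meant for small cases; its purpose is to make the linearity condition and the resulting second-moment identity concrete.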
Lemma 2.3 is a well known inequality (already used in the monograph [St1]) and useful because often the right-hand side is easier to compute or bound than the left-hand side. To make this paper as self-contained as possible, we include a proof. Here $x$ is an element of the state space $X$.

Lemma 2.3. $Var(E[(W' - W)^2|W]) \le Var(E[(W' - W)^2|x])$.

Proof. Jensen’s inequality states that if $g$ is a convex function, and $Z$ a random variable, then $g(E(Z)) \le E(g(Z))$. There is also a conditional version of Jensen’s inequality (Sect. 4.1 of [Du]) which states that for any $\sigma$-subalgebra $F$ of the $\sigma$-algebra of all subsets of $X$, $E(g(E(Z|F))) \le E(g(Z))$. The lemma follows by setting $g(t) = t^2$, $Z = E((W' - W)^2|x)$, and letting $F$ be the $\sigma$-algebra generated by the level sets of $W$.

3. Compact Lie Groups

This section uses Stein’s method to study the distribution of a fixed irreducible character $\chi^\tau$ of a compact Lie group $G$. Subsect. 3.1 develops general theory for the case that $\chi^\tau$ is real valued. This is applied to study the trace of a random element of $USp(2n, \mathbb{C})$ in Subsect. 3.2 and the trace of a random orthogonal matrix in Subsects. 3.3 and 3.4. Subsect. 3.5 indicates the relevant amendments for the complex setting and Subsect. 3.6 illustrates the theory for $U(n, \mathbb{C})$.

3.1. General theory (real case). Let $G$ be a compact Lie group and $\chi^\tau$ a non-trivial real-valued irreducible character of $G$. The random variable of interest to us is $W = \chi^\tau(g)$, where $g$ is chosen from the Haar measure of $G$. It follows from the orthogonality relations for irreducible characters of $G$ that $E(W) = 0$ and $E(W^2) = 1$. The following functional equation will be useful.

Lemma 3.1. ([He2], p. 392). Let $G$ be a compact Lie group and $\chi^\phi$ an irreducible character of $G$. Then
\[
\int_G \chi^\phi(h\alpha h^{-1} g)\, dh = \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g)
\]
for all $\alpha, g \in G$.

We now define a pair $(W, W')$ by letting $W = \chi^\tau(g)$, where $g$ is chosen from Haar measure, and $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed self-inverse conjugacy class of $G$. Exchangeability of $(W, W')$ follows since the conjugacy class of $\alpha$ is self-inverse. Moreover, since $\chi^\phi(\alpha^{-1}) = \overline{\chi^\phi(\alpha)}$, one has that $\chi^\phi(\alpha)$ is real for all irreducible representations $\phi$, a fact which will be used freely throughout this subsection.
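The functional equation of Lemma 3.1 has a direct finite-group analogue, with the Haar integral replaced by an average over the group. The sketch below is an illustration under that assumption, not part of the paper's argument: it verifies the identity by brute force for the symmetric group $S_4$, the 3-dimensional character "number of fixed points minus one", and $\alpha$ a transposition. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

N = 4
GROUP = list(permutations(range(N)))   # S_4 as tuples: p maps i to p[i]


def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(N))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * N
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def chi(p):
    """Character of the 3-dimensional standard representation: fixed points minus 1."""
    return sum(1 for i in range(N) if p[i] == i) - 1


def averaged_character(alpha, g):
    """(1/|G|) * sum over h of chi(h alpha h^{-1} g), the finite analogue of the integral."""
    total = sum(chi(compose(compose(compose(h, alpha), inverse(h)), g)) for h in GROUP)
    return Fraction(total, len(GROUP))


ALPHA = (1, 0, 2, 3)                   # a transposition; its conjugacy class is self-inverse
DIM = chi(tuple(range(N)))             # chi(identity) = 3


def functional_equation_holds():
    """Check averaged chi(h alpha h^{-1} g) == chi(alpha)/dim * chi(g) for every g."""
    return all(
        averaged_character(ALPHA, g) == Fraction(chi(ALPHA), DIM) * chi(g)
        for g in GROUP
    )


print(functional_equation_holds())
```

Here the character ratio $\chi(\alpha)/\dim$ equals $1/3$, which plays the role of $1 - a$ in the exchangeable pair construction that follows.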
Remark. Non-identity self-inverse conjugacy classes always exist. Indeed, if $G$ has rank $r$ then any maximal torus has $2^r - 1$ elements of order 2, and these are naturally self-inverse. Moreover if $G$ has all characters real valued (as is the case for symplectic and orthogonal groups), then all conjugacy classes are self-inverse, since class functions can be uniformly approximated by sums of characters and $\chi^\phi(\alpha^{-1}) = \overline{\chi^\phi(\alpha)}$.

The remaining results in this subsection show that the exchangeable pair $(W, W')$ has desirable properties.

Lemma 3.2.
\[
E(W'|W) = \frac{\chi^\tau(\alpha)}{\dim(\tau)}\, W.
\]

Proof. Applying Lemma 3.1 with $\phi = \tau$, one has that
\[
E(W'|g) = \int_{h \in G} \chi^\tau(h\alpha h^{-1} g)\, dh = \frac{\chi^\tau(\alpha)}{\dim(\tau)}\, \chi^\tau(g).
\]
The result follows since this depends on $g$ only through $W$.

Lemma 3.3.
\[
E(W' - W)^2 = 2\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right).
\]

Proof. This is immediate from Lemmas 2.2 and 3.2.
For the remainder of this subsection, if $\phi$ is an irreducible representation of $G$, we let $m_\phi(\tau^r)$ denote the multiplicity of $\phi$ in the $r$-fold tensor product of $\tau$ (which has character $(\chi^\tau)^r$).

Lemma 3.4.
\[
E[(W')^2|g] = \sum_\phi m_\phi(\tau^2)\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g),
\]
where the sum is over all irreducible representations of $G$.

Proof. Write $(W')^2 = \sum_\phi m_\phi(\tau^2) \chi^\phi(g')$, where $g' = \alpha g$. Lemma 3.1 gives that
\[
E[\chi^\phi(g')|g] = \int_G \chi^\phi(h\alpha h^{-1} g)\, dh = \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g),
\]
and the result follows.
Lemma 3.5 writes $Var(E[(W' - W)^2|g])$ as a sum of non-negative quantities.

Lemma 3.5.
\[
Var(E[(W' - W)^2|g]) = \sum_\phi^{*} m_\phi(\tau^2)^2 \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right)^2,
\]
where the star signifies that the sum is over all nontrivial irreducible representations of $G$.
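Proof. By Lemmas 3.2 and 3.4,
\begin{align*}
E((W' - W)^2|g) &= E[(W')^2|g] - 2W E(W'|g) + W^2 \\
&= E[(W')^2|g] + \left(1 - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right) W^2 \\
&= \sum_\phi m_\phi(\tau^2) \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right) \chi^\phi(g).
\end{align*}
The orthogonality relation for irreducible characters of $G$ gives that
\[
E[E((W' - W)^2|g)^2] = \sum_\phi m_\phi(\tau^2)^2 \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right)^2.
\]
Finally, note that
\[
Var(E[(W' - W)^2|g]) = E[E((W' - W)^2|g)^2] - (E(W' - W)^2)^2,
\]
and since the multiplicity of the trivial representation in $\tau^2$ is 1, the result follows from Lemma 3.3.

Lemma 3.6. Let $k$ be a positive integer.
(1) $E(W' - W)^k = \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{k-r})\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}$,
(2) $E(W' - W)^4 = \sum_\phi m_\phi(\tau^2)^2 \left[ 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) - 6\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right]$.

Proof. For the first assertion, note that
\[
E[(W' - W)^k|g] = \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \chi^\tau(g)^{k-r} E[(W')^r|g].
\]
Arguing as in Lemma 3.4 gives that this is equal to
\[
\sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \chi^\tau(g)^{k-r} \sum_\phi m_\phi(\tau^r)\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g).
\]
Thus $E(W' - W)^k$ is equal to
\begin{align*}
E(E[(W' - W)^k|g]) &= \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, \frac{\chi^\phi(\alpha)}{\dim(\phi)} \int_{g \in G} \chi^\tau(g)^{k-r} \chi^\phi(g) \\
&= \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{k-r})\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}.
\end{align*}
For the second assertion, note by the first assertion that
\[
E(W' - W)^4 = \sum_{r=0}^{4} (-1)^r \binom{4}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{4-r})\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}.
\]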
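If $\alpha$ is the identity element of $G$, then $W' = W$, which implies that
\[
0 = \sum_{r=0}^{4} (-1)^r \binom{4}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{4-r}).
\]
Thus for general $\alpha$,
\[
E(W' - W)^4 = -\sum_{r=0}^{4} (-1)^r \binom{4}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{4-r}) \left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right).
\]
Observe that the $r = 0, 4$ terms in this sum vanish, since the only contribution could come from the trivial representation, which contributes 0. The $r = 2$ term is
\[
-6 \sum_\phi m_\phi(\tau^2)^2 \left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right).
\]
The $r = 1, 3$ terms are equal and together contribute
\begin{align*}
8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) m_\tau(\tau^3)
&= 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) \int_{g \in G} \chi^\tau(g)^4 \\
&= 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) \int_{g \in G} \Big( \sum_\phi m_\phi(\tau^2) \chi^\phi(g) \Big)^2 \\
&= 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) \sum_\phi m_\phi(\tau^2)^2.
\end{align*}
This completes the proof.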
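The formulas of Lemmas 3.3, 3.5 and 3.6(2) only use character orthogonality and the functional equation, so they can be sanity-checked in a finite-group analogue. The sketch below is an illustration under that assumption (the finite setting, the choice of $S_4$ and all names are not from this paper): $\tau$ is the 3-dimensional standard representation of $S_4$, $\alpha$ is a uniformly random transposition, and the hard-coded table is the standard character table of $S_4$ indexed by cycle type. Brute-force moments are compared with the representation theoretic expressions in exact arithmetic.

```python
from fractions import Fraction
from itertools import permutations

N = 4
GROUP = list(permutations(range(N)))
IDENTITY = tuple(range(N))


def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(N))


def cycle_type(p):
    """Cycle lengths of p in decreasing order; this labels the conjugacy class."""
    seen, lengths = set(), []
    for start in range(N):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


# Character table of S_4, indexed by cycle type.
CHARACTERS = {
    "trivial":       {(1, 1, 1, 1): 1, (2, 1, 1): 1,  (2, 2): 1,  (3, 1): 1,  (4,): 1},
    "sign":          {(1, 1, 1, 1): 1, (2, 1, 1): -1, (2, 2): 1,  (3, 1): 1,  (4,): -1},
    "two_dim":       {(1, 1, 1, 1): 2, (2, 1, 1): 0,  (2, 2): 2,  (3, 1): -1, (4,): 0},
    "standard":      {(1, 1, 1, 1): 3, (2, 1, 1): 1,  (2, 2): -1, (3, 1): 0,  (4,): -1},
    "standard_sign": {(1, 1, 1, 1): 3, (2, 1, 1): -1, (2, 2): -1, (3, 1): 0,  (4,): 1},
}
TAU = "standard"
TRANSPOSITIONS = [g for g in GROUP if cycle_type(g) == (2, 1, 1)]


def chi(name, g):
    return CHARACTERS[name][cycle_type(g)]


def multiplicity(name, power):
    """m_phi(tau^power) as an inner product of characters (all characters are real)."""
    return Fraction(sum(chi(TAU, g) ** power * chi(name, g) for g in GROUP), len(GROUP))


def ratio(name):
    """Character ratio chi^phi(alpha) / dim(phi) for alpha a transposition."""
    return Fraction(chi(name, TRANSPOSITIONS[0]), chi(name, IDENTITY))


def direct_moment(k):
    """E(W' - W)^k by brute force over g and the random transposition alpha."""
    total = sum((chi(TAU, compose(t, g)) - chi(TAU, g)) ** k
                for g in GROUP for t in TRANSPOSITIONS)
    return Fraction(total, len(GROUP) * len(TRANSPOSITIONS))


def direct_variance():
    """Var(E[(W' - W)^2 | g]) by brute force."""
    cond = [Fraction(sum((chi(TAU, compose(t, g)) - chi(TAU, g)) ** 2
                         for t in TRANSPOSITIONS), len(TRANSPOSITIONS))
            for g in GROUP]
    mean = sum(cond) / len(cond)
    return sum(c * c for c in cond) / len(cond) - mean ** 2


def lemma_3_5():
    return sum(multiplicity(p, 2) ** 2 * (1 + ratio(p) - 2 * ratio(TAU)) ** 2
               for p in CHARACTERS if p != "trivial")


def lemma_3_6_part_2():
    return sum(multiplicity(p, 2) ** 2 * (8 * (1 - ratio(TAU)) - 6 * (1 - ratio(p)))
               for p in CHARACTERS)


print(direct_moment(2), direct_variance(), lemma_3_5(), direct_moment(4), lemma_3_6_part_2())
```

In this example $a = 1 - \chi^\tau(\alpha)/\dim(\tau) = 2/3$, so Lemma 3.3 predicts $E(W' - W)^2 = 4/3$; the variance and fourth moment come out as $5/9$ and $10/3$ from both the brute-force and the character-theoretic computations.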
The above lemmas are completely general. Specializing to normal approximation, one obtains the following result.

Theorem 3.7. Let $G$ be a compact Lie group and let $\tau$ be a non-trivial irreducible representation of $G$ whose character is real valued. Fix a non-identity element $\alpha$ with the property that $\alpha$ and $\alpha^{-1}$ are conjugate. Let $W = \chi^\tau(g)$, where $g$ is chosen from the Haar measure of $G$. Then for all real $x_0$,
\begin{align*}
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right|
&\le \sqrt{\sum_\phi^{*} m_\phi(\tau^2)^2 \left[ 2 - \frac{1}{a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right]^2} \\
&\quad + \left[ \frac{1}{\pi} \sum_\phi m_\phi(\tau^2)^2 \left( 8 - \frac{6}{a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right) \right]^{1/4}.
\end{align*}
Here $a = 1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}$, the first sum is over all non-trivial irreducible representations of $G$, and the second sum is over all irreducible representations of $G$.
Proof. One applies Theorem 2.1 to the exchangeable pair $(W, W')$ of this subsection. By Lemmas 2.3 and 3.5, the first term in Theorem 2.1 gives the first term in the theorem. To upper bound the second term in Theorem 2.1, note by the Cauchy-Schwarz inequality that
\[
E|W' - W|^3 \le \sqrt{E(W' - W)^2\, E(W' - W)^4}.
\]
Now use Lemma 3.3 and part 2 of Lemma 3.6.
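To make the two error terms of Theorem 3.7 concrete, the following sketch evaluates them numerically from a list of constituents of $\tau^2$ and their character ratios. As input it uses the data of the $USp(2n, \mathbb{C})$ example treated in Subsect. 3.2 (the decomposition stated in Lemma 3.8 and the element $\alpha$ with eigenvalues $x_1 = \cdots = x_{n-1} = 1$, $x_n = e^{i\theta}$) and compares with the closed forms quoted in the proof of Theorem 3.9. The function names and the triple-based input format are hypothetical choices made for this illustration; the dimensions are obtained by evaluating the characters at the identity.

```python
import math


def theorem_3_7_terms(a, components):
    """Return the two error terms of Theorem 3.7.

    `components` is a list of (multiplicity, ratio, is_trivial) triples, one per
    irreducible constituent phi of tau^2, where ratio = chi^phi(alpha) / dim(phi).
    """
    first = math.sqrt(sum(m * m * (2 - (1 - r) / a) ** 2
                          for m, r, trivial in components if not trivial))
    inner = sum(m * m * (8 - 6 * (1 - r) / a) for m, r, _ in components)
    # max(...) only guards against a tiny negative value caused by rounding.
    second = (max(inner, 0.0) / math.pi) ** 0.25
    return first, second


def usp_components(n, theta):
    """Constituents of tau^2 for USp(2n, C), n >= 2, following Lemma 3.8.

    alpha has eigenvalues x_1 = ... = x_{n-1} = 1 and x_n = exp(i * theta).
    """
    c = math.cos(theta)
    s = 2 * (n - 1) + 2 * c                      # sum_i (x_i + 1/x_i)
    p2 = 2 * (n - 1) + 2 * math.cos(2 * theta)   # sum_i (x_i^2 + 1/x_i^2)
    char_plus = 0.5 * s * s + 0.5 * p2           # second character in Lemma 3.8
    char_minus = 0.5 * s * s - 0.5 * p2 - 1      # third character in Lemma 3.8
    dim_plus = n * (2 * n + 1)                   # value of char_plus at theta = 0
    dim_minus = n * (2 * n - 1) - 1              # value of char_minus at theta = 0
    a = 1 - s / (2 * n)
    components = [
        (1, 1.0, True),
        (1, char_plus / dim_plus, False),
        (1, char_minus / dim_minus, False),
    ]
    return a, components


n, theta = 3, 0.7
a, components = usp_components(n, theta)
first, second = theorem_3_7_terms(a, components)
c = math.cos(theta)
print(first, 2 * math.sqrt(4 * c * c - 4 * c + 2) / (2 * n + 1))
print(second, (24 * (1 - c) / (math.pi * (2 * n + 1))) ** 0.25)
```

The same `theorem_3_7_terms` helper could be fed the data of Lemmas 3.10 and 3.12; only the list of constituents and the value of $a$ change.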
3.2. Example. $USp(2n, \mathbb{C})$. This subsection studies the distribution of $\chi^\tau(g)$, where $\tau$ is the $2n$-dimensional defining representation of $USp(2n, \mathbb{C})$. The only representation theoretic fact needed is Lemma 3.8, which is the $k = 2$ case of a formula from p. 200 of [Su] giving the decomposition of $\tau^k$ into irreducible representations. In its statement, we let $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}$ denote the eigenvalues of an element of $USp(2n, \mathbb{C})$.

Lemma 3.8. For $n \ge 2$, the square of the defining representation of the group $USp(2n, \mathbb{C})$ decomposes in a multiplicity free way as the sum of the following three irreducible representations:
• The trivial representation, with character 1
• The representation with character $\frac{1}{2}\left(\sum_i x_i + x_i^{-1}\right)^2 + \frac{1}{2}\sum_i (x_i^2 + x_i^{-2})$
• The representation with character $\frac{1}{2}\left(\sum_i x_i + x_i^{-1}\right)^2 - \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) - 1$.

Remark. Lemma 3.8 could also be easily guessed (and proved) by looking at the character formulas for $USp(2n, \mathbb{C})$ on p. 219 of [W].

Theorem 3.9. Let $g$ be chosen from the Haar measure of $USp(2n, \mathbb{C})$, where $n \ge 2$. Let $W(g)$ be the trace of $g$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right| \le \frac{\sqrt{2}}{n}.
\]

Proof. One applies Theorem 3.7, with $\tau$ the defining representation, and $\alpha$ an element of type $\{x_1^{\pm 1}, \ldots, x_n^{\pm 1}\}$, where $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$. Then $\alpha$ is conjugate to $\alpha^{-1}$ and one computes that $a = \frac{1 - \cos(\theta)}{n}$. Using Lemma 3.8, one calculates that the first error term in Theorem 3.7 is equal to $\frac{2\sqrt{4\cos(\theta)^2 - 4\cos(\theta) + 2}}{2n+1}$. One computes that the second error term is equal to $\left(\frac{24(1 - \cos(\theta))}{\pi(2n+1)}\right)^{1/4}$. Since these bounds hold for all $\theta$ and are continuous in $\theta$, the bounds hold in the limit that $\theta \to 0$. This gives an upper bound of
\[
\frac{2\sqrt{2}}{2n+1} \le \frac{\sqrt{2}}{n},
\]
as claimed.

3.3. Example. $SO(2n+1, \mathbb{R})$. We investigate the distribution of $\chi^\tau(g)$, where $\tau$ is the $(2n+1)$-dimensional defining representation of $SO(2n+1, \mathbb{R})$. The only ingredient from representation theory needed is Lemma 3.10, which is the $k = 2$ case of a formula from p. 204 of [Su] giving the decomposition of $\tau^k$ into irreducible representations (it is also easily obtained by inspecting the character formulas on p. 228 of [W]). In its statement, we let $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}, 1$ be the eigenvalues of an element of $SO(2n+1, \mathbb{R})$.
Lemma 3.10. For $n \ge 2$, the square of the defining representation of $SO(2n+1, \mathbb{R})$ decomposes in a multiplicity free way as the sum of the following three irreducible representations:
• The trivial representation, with character 1
• The representation with character $\frac{1}{2}\left(\sum_i x_i + x_i^{-1}\right)^2 + \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) + \sum_i (x_i + x_i^{-1})$
• The representation with character $\frac{1}{2}\left(\sum_i x_i + x_i^{-1}\right)^2 - \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) + \sum_i (x_i + x_i^{-1})$.

This leads to the following theorem.

Theorem 3.11. Let $g$ be chosen from the Haar measure of $SO(2n+1, \mathbb{R})$, where $n \ge 2$. Let $W(g)$ be the trace of $g$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right| \le \frac{\sqrt{2}}{n}.
\]

Proof. One applies Theorem 3.7, taking $\tau$ to be the defining representation, and $\alpha$ to be an element of type $\{x_1^{\pm 1}, \ldots, x_n^{\pm 1}, 1\}$, where $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$ (i.e. $\alpha$ is a rotation by $\theta$). Then $\alpha$ is conjugate to $\alpha^{-1}$ and $a = \frac{2(1 - \cos(\theta))}{2n+1}$. Using Lemma 3.10 one computes that in the $\theta \to 0$ limit the first error term in Theorem 3.7 is equal to $\frac{\sqrt{2}}{n}$. One calculates that the second error term is equal to $\left(\frac{12(2n+1)(1 - \cos(\theta))}{\pi n(2n+3)}\right)^{1/4}$. The proof of the theorem is completed by noting that this goes to 0 as $\theta \to 0$.

3.4. Example. $O(2n, \mathbb{R})$. We consider the distribution of $\chi^\tau(g)$, where $\tau$ is the $2n$-dimensional defining representation of $O(2n, \mathbb{R})$. The only representation theoretic information needed is Lemma 3.12, which is the $k = 2$ case of a result of Proctor [Pr] (and also not difficult to obtain from the character formulas on p. 228 of [W]). In its statement, we let $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}$ be the eigenvalues of an element of $O(2n, \mathbb{R})$.

Lemma 3.12. For $n \ge 2$, the square of the defining representation of $O(2n, \mathbb{R})$ decomposes in a multiplicity free way as the sum of the following three irreducible representations:
• The trivial representation, with character 1
• The representation with character $\frac{1}{2}\left(\sum_i x_i + x_i^{-1}\right)^2 - \frac{1}{2}\sum_i (x_i^2 + x_i^{-2})$
• The representation with character $\frac{1}{2}\left(\sum_i x_i + x_i^{-1}\right)^2 + \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) - 1$.

This leads to the following result.

Theorem 3.13. Let $g$ be chosen from the Haar measure of $O(2n, \mathbb{R})$, where $n \ge 2$. Let $W(g)$ be the trace of $g$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right| \le \frac{\sqrt{2}}{n-1}.
\]

Proof. Apply Theorem 3.7, with $\tau$ the defining representation of $O(2n, \mathbb{R})$. We take $\alpha$ to be an element of type $\{x_1^{\pm 1}, \ldots, x_n^{\pm 1}\}$, where $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$ (i.e. $\alpha$ is a rotation by $\theta$). Then $\alpha$ is conjugate to $\alpha^{-1}$ and $a = \frac{1 - \cos(\theta)}{n}$. Lemma 3.12 gives the
decomposition of $\tau^2$ into irreducibles, and from this one calculates the first error term in Theorem 3.7, and sees that in the $\theta \to 0$ limit it is equal to $\frac{\sqrt{8}}{2n-1}$. One computes that the second error term is equal to $\left(\frac{24n(1 - \cos(\theta))}{\pi(n+1)(2n-1)}\right)^{1/4}$. The proof of the theorem is completed by noting that this goes to 0 as $\theta \to 0$.

3.5. General theory (complex case). Let $G$ be a compact Lie group and $\tau$ be an irreducible representation of $G$ whose character is not real valued. The random variable of interest to us is $W = \frac{1}{\sqrt{2}}\left[\chi^\tau(g) + \overline{\chi^\tau(g)}\right]$, where $g$ is chosen from the Haar measure of $G$. It follows from the orthogonality relations for irreducible characters of $G$ that $E(W) = 0$ and $E(W^2) = 1$.

We now define a pair $(W, W')$ by letting $W$ be as above and $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed self-inverse conjugacy class of $G$. As in Subsect. 3.1, the pair $(W, W')$ is exchangeable and all $\chi^\phi(\alpha)$ are real. The remaining results in this subsection are proved by minor modifications of the arguments in Subsect. 3.1.

Lemma 3.14.
\[
E(W'|W) = \frac{\chi^\tau(\alpha)}{\dim(\tau)}\, W.
\]

Lemma 3.15.
\[
E(W' - W)^2 = 2\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right).
\]

Lemma 3.16.
\[
E[(W')^2|g] = \frac{1}{2} \sum_\phi m_\phi[(\tau + \overline{\tau})^2]\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g),
\]
where the sum is over all irreducible representations of $G$.

Lemma 3.17.
\[
Var(E[(W' - W)^2|g]) = \frac{1}{4} \sum_\phi^{*} m_\phi[(\tau + \overline{\tau})^2]^2 \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right)^2,
\]
where the star signifies that the sum is over all nontrivial irreducible representations of $G$.

Lemma 3.18. Let $k$ be a positive integer.
(1) $E(W' - W)^k$ is equal to
\[
\frac{1}{2^{k/2}} \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi[(\tau + \overline{\tau})^r]\, m_\phi[(\tau + \overline{\tau})^{k-r}]\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}.
\]
(2) $E(W' - W)^4 = \sum_\phi m_\phi[(\tau + \overline{\tau})^2]^2 \left[ 2\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) - \frac{3}{2}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right]$.

Finally, one obtains the following central limit theorem.
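Theorem 3.19. Let $G$ be a compact Lie group and $\tau$ an irreducible representation of $G$ whose character is not real valued. Let $\alpha \ne 1$ be such that $\alpha$ and $\alpha^{-1}$ are conjugate. Let $W = \frac{1}{\sqrt{2}}\left[\chi^\tau(g) + \overline{\chi^\tau(g)}\right]$, where $g$ is chosen from the Haar measure of $G$. Then for all real $x_0$,
\begin{align*}
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right|
&\le \frac{1}{2} \sqrt{\sum_\phi^{*} m_\phi[(\tau + \overline{\tau})^2]^2 \left[ 2 - \frac{1}{a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right]^2} \\
&\quad + \left[ \frac{1}{\pi} \sum_\phi m_\phi[(\tau + \overline{\tau})^2]^2 \left( 2 - \frac{3}{2a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right) \right]^{1/4}.
\end{align*}
Here $a = 1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}$, the first sum is over all non-trivial irreducible representations of $G$, and the second sum is over all irreducible representations of $G$.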
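As a numerical illustration of Theorem 3.19, the sketch below evaluates its two error terms for the $U(n, \mathbb{C})$ example of Subsect. 3.6, using the decomposition $2 + s_{(2)} + s_{(1,1)} + s_{(-2)} + s_{(-1,-1)} + 2s_{(1,0^{n-2},-1)}$ of $(\tau + \overline{\tau})^2$ given there and the element $\alpha$ with eigenvalues $1, \ldots, 1, e^{i\theta}, e^{-i\theta}$. The function names and input format are hypothetical. The character values are written through the power sums $p_1 = \mathrm{Tr}(\alpha)$ and $p_2 = \mathrm{Tr}(\alpha^2)$, which are real for this $\alpha$, via the standard identities $s_{(2)} = (p_1^2 + p_2)/2$, $s_{(1,1)} = (p_1^2 - p_2)/2$ and $s_{(1,0^{n-2},-1)} = |p_1|^2 - 1$.

```python
import math


def theorem_3_19_terms(a, components):
    """Return the two error terms of Theorem 3.19.

    `components` is a list of (multiplicity, ratio, is_trivial) triples, one per
    irreducible constituent phi of (tau + conj(tau))^2, with
    ratio = chi^phi(alpha) / dim(phi).
    """
    first = 0.5 * math.sqrt(sum(m * m * (2 - (1 - r) / a) ** 2
                                for m, r, trivial in components if not trivial))
    inner = sum(m * m * (2 - 1.5 * (1 - r) / a) for m, r, _ in components)
    # max(...) only guards against a tiny negative value caused by rounding.
    second = (max(inner, 0.0) / math.pi) ** 0.25
    return first, second


def unitary_components(n, theta):
    """Constituents of (tau + conj(tau))^2 for U(n, C), n >= 2.

    alpha has eigenvalues 1 (n - 2 times), exp(i * theta) and exp(-i * theta).
    """
    p1 = (n - 2) + 2 * math.cos(theta)        # Tr(alpha), real for this alpha
    p2 = (n - 2) + 2 * math.cos(2 * theta)    # Tr(alpha^2)
    sym = 0.5 * (p1 * p1 + p2)                # s_(2); s_(-2) takes the same value
    alt = 0.5 * (p1 * p1 - p2)                # s_(1,1); s_(-1,-1) takes the same value
    adj = p1 * p1 - 1                         # s_(1,0,...,0,-1)
    dim_sym = n * (n + 1) / 2
    dim_alt = n * (n - 1) / 2
    dim_adj = n * n - 1
    a = 1 - p1 / n
    components = [
        (2, 1.0, True),                        # trivial representation, multiplicity 2
        (1, sym / dim_sym, False),
        (1, alt / dim_alt, False),
        (1, sym / dim_sym, False),
        (1, alt / dim_alt, False),
        (2, adj / dim_adj, False),
    ]
    return a, components


n = 3
a, components = unitary_components(n, 0.7)
_, second = theorem_3_19_terms(a, components)
print(second, (12 * (2 * n - 1) * (1 - math.cos(0.7)) / (math.pi * (n * n - 1))) ** 0.25)

a_small, components_small = unitary_components(n, 1e-3)
first_small, _ = theorem_3_19_terms(a_small, components_small)
print(first_small, 2 * math.sqrt(n * n + 2) / (n * n - 1))
```

The first comparison holds for every $\theta$, while the second is a small-$\theta$ approximation to the limiting value of the first error term.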
3.6. Example. $U(n, \mathbb{C})$. Every element of $U(n, \mathbb{C})$ is conjugate to a diagonal matrix with entries $(x_1, \ldots, x_n)$ and the representation theory of $U(n, \mathbb{C})$ is well understood (see for instance [Bu]). The irreducible representations of $U(n, \mathbb{C})$ are parameterized by integer sequences $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. The corresponding character value on an element of type $\{x_1, \ldots, x_n\}$ is given by the Schur function $s_\lambda(x_1, \ldots, x_n)$. (The usual definition of Schur functions requires that $\lambda_n \ge 0$, so if $\lambda_n = -k < 0$, this should be interpreted as $(x_1 \cdots x_n)^{-k} s_{\lambda + (k)^n}$, where $\lambda + (k)^n$ is given by adding $k$ to each of $\lambda_1, \ldots, \lambda_n$.) The complex conjugate of a character with data $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ has data $-\lambda_n \ge -\lambda_{n-1} \ge \cdots \ge -\lambda_1$. Combining the above information with Theorem 3.19, one obtains the following result.

Theorem 3.20. Let $g$ be chosen from the Haar measure of $U(n, \mathbb{C})$, where $n \ge 2$. Let $W(g) = \frac{1}{\sqrt{2}}\left[\mathrm{Tr}(g) + \overline{\mathrm{Tr}(g)}\right]$, where $\mathrm{Tr}$ denotes trace. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right| \le \frac{2}{n-1}.
\]

Proof. One applies Theorem 3.19 with $\tau$ the $n$-dimensional defining representation. We take $\alpha$ to be an element of type $\{x_1, \ldots, x_n\}$ with $x_1 = \cdots = x_{n-2} = 1$, $x_{n-1} = e^{i\theta}$, and $x_n = e^{-i\theta}$. Then $\alpha$ and $\alpha^{-1}$ are conjugate and $a = \frac{2(1 - \cos(\theta))}{n}$. By the Pieri rule for multiplying Schur functions (p. 73 of [Mac]), the decomposition of the character of $(\tau + \overline{\tau})^2$ in terms of Schur functions is given by
\begin{align*}
(s_{(1)} + \overline{s_{(1)}})^2 &= s_{(1)} s_{(1)} + 2\, \frac{s_{(1)} s_{(1^{n-1})}}{x_1 \cdots x_n} + \overline{s_{(1)}}\, \overline{s_{(1)}} \\
&= s_{(2)} + s_{(1,1)} + 2\, \frac{s_{(1^n)} + s_{(2,1^{n-2})}}{x_1 \cdots x_n} + s_{(-2)} + s_{(-1,-1)} \\
&= 2 + s_{(2)} + s_{(1,1)} + s_{(-2)} + s_{(-1,-1)} + 2 s_{(1,0^{n-2},-1)}.
\end{align*}
One then computes that in the $\theta \to 0$ limit, the first error term in Theorem 3.19 is equal to $\frac{2\sqrt{n^2+2}}{n^2-1} \le \frac{2}{n-1}$. One computes that the second error term is equal to $\left(\frac{12(2n-1)(1 - \cos(\theta))}{\pi(n^2-1)}\right)^{1/4}$. The proof is completed by noting that this approaches 0 as $\theta \to 0$.

4. Compact Symmetric Spaces

This section extends the methods of Sect. 3 to study the distribution of a fixed spherical function $\omega_\tau$ on a random element of a compact symmetric space $G/K$. Subsect. 4.1 gives general theory for the case that $\omega_\tau$ is real valued. This is illustrated for the sphere in Subsect. 4.2, giving a different perspective on a result of [DF] and [Me1]. We note that since compact Lie groups can be viewed as symmetric spaces (see Subsect. 4.1), the examples in Subsects. 3.2, 3.3, and 3.4 give further examples. Our theorems should also prove useful for Jacobi-type ensembles arising from other root systems (see for instance [Vr]). Subsect. 4.3 indicates the changes needed to treat spherical functions $\omega_\tau$ which are not real valued, and Subsects. 4.4 and 4.5 study the trace of elements from Dyson’s circular orthogonal and circular symplectic ensembles as special cases (the circular unitary ensemble is equivalent to $U(n, \mathbb{C})$, so was already treated in Subsect. 3.6). Central limit theorems are known for the trace of an element from the circular ensembles (see [Ra,BF,CoSz]), but our approach gives an error term.
4.1. General theory (real case). To begin we recall some concepts about spherical functions of symmetric spaces. Standard references which contain more details are [He1], [He2], [Te], and [V]. Chapter 7 of [Mac] is also very helpful.

A Riemannian manifold $X$ is said to be a symmetric space if the geodesic symmetry $\sigma : X \to X$ with center at any point $x_0$ is an isometry. Then $X$ can be identified with $G/K$, where $G$ is a connected transitive Lie group of isometries of $X$, and $K$ is a compact group which up to finite index is given by $K = \{g \in G : g x_0 = x_0\}$. A function $\omega_\phi \in L^2(G/K)$ is called spherical if $\omega_\phi(1) = 1$ and the following functional equation is satisfied:
\[
\int_K \omega_\phi(xky)\, dk = \omega_\phi(x)\, \omega_\phi(y) \quad \forall x, y \in G.
\]
This equation implies that $\omega_\phi$ is $K$ bi-invariant (i.e. $\omega_\phi(k_1 g k_2) = \omega_\phi(g)$ for all $k_1, k_2$ in $K$), which justifies our writing $\omega_\phi(g)$ instead of $\omega_\phi(gK)$. One reason that spherical functions are important is that if $G/K$ is compact and $H_\phi$ is the $G$-invariant subspace of $L^2(G/K)$ generated by $\omega_\phi$, then $H_\phi$ is a finite dimensional irreducible representation of $G$ and $L^2(G/K)$ is a direct sum of all such $H_\phi$. We let $\dim(\phi)$ denote the dimension of $H_\phi$.

In reading this section it is useful to keep in mind that a compact Lie group $U$ can be viewed as a compact symmetric space. Indeed, one can take $G = U \times U$ and $K$ the diagonal subgroup of $U \times U$; then $G/K$ is identified with $U$ under the mapping $(u_1, u_2)K \mapsto u_1 u_2^{-1}$. The spherical functions $\omega_\phi$ of $G/K$ are indexed by irreducible representations $\phi$ of $U$ and are precisely the character ratios $\frac{\chi^\phi(u)}{\chi^\phi(1)}$; moreover, $\dim(H_\phi) = \chi^\phi(1)^2$.
Let $\omega_\tau$ be a non-trivial real valued spherical function of $G/K$. We are interested in the distribution of $\omega_\tau(g)$ (normalized to have variance 1). Here $gK$ is chosen from the “Haar measure” $\mu$ on $G/K$. This is induced from the Haar measure of $G$ using the projection map $G \to G/K$, and is invariant under the action of $G$. The following orthogonality relation will be used; see for instance p. 45 of [Kl] for a proof.

Lemma 4.1.
\[
\int_{G/K} \omega_\phi(g)\, \overline{\omega_\eta(g)} = \frac{\delta_{\phi,\eta}}{\dim(\phi)}.
\]

In particular, Lemma 4.1 implies that $W := [\dim(\tau)]^{1/2} \omega_\tau$ has mean 0 and variance 1.

The following lemma is immediate from the functional equation for $\omega_\phi$ and $K$ bi-invariance of $\omega_\phi$.

Lemma 4.2. Let $G/K$ be a compact symmetric space, and $\omega_\phi$ a spherical function of $G/K$. Then
\[
\int_{K \times K} \omega_\phi(k_1 \alpha k_2 g)\, dk_1\, dk_2 = \omega_\phi(\alpha)\, \omega_\phi(g)
\]
for all $\alpha, g \in G$.

We define the pair $(W, W')$ by letting $W = [\dim(\tau)]^{1/2} \omega_\tau(g)$, where $gK$ is from the “Haar measure” of $G/K$, and $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed double coset $K\alpha K \ne K$ which satisfies the property that $K\alpha K = K\alpha^{-1}K$. Since $K\alpha^{-1}K = (K\alpha K)^{-1}$, it follows that $(W, W')$ is exchangeable. Moreover the integral formula for spherical functions (p. 417 of [He2]) implies that all $\omega_\phi(\alpha)$ are real.

The analysis of the exchangeable pair $(W, W')$ can be carried out exactly as in Subsect. 3.1, using Lemmas 4.1 and 4.2 instead of the orthogonality relations for compact Lie groups and Lemma 3.1. Hence we simply record the results.

Lemma 4.3. $E(W'|W) = \omega_\tau(\alpha) W$.

Lemma 4.4. $E(W' - W)^2 = 2(1 - \omega_\tau(\alpha))$.

In the statements of the remaining results, we define the “multiplicity” $m_\phi(\tau^r)$ by the expansion
\[
[\omega_\tau(g)]^r = \sum_\phi \left(\frac{\dim(\phi)}{\dim(\tau)^r}\right)^{1/2} m_\phi(\tau^r)\, \omega_\phi(g).
\]
The numbers $m_\phi(\tau^r)$ are real and non-negative (argue as on p. 396 of [Mac] with sums replaced by integrals) but need not be integers. Note that by Lemma 4.1,
\[
m_\phi(\tau^r) = \left[\dim(\phi) \dim(\tau)^r\right]^{1/2} \int_{G/K} \omega_\tau(g)^r\, \overline{\omega_\phi(g)}.
\]
Lemma 4.5.
\[
E[(W')^2|gK] = \sum_\phi m_\phi(\tau^2)\, [\dim(\phi)]^{1/2}\, \omega_\phi(\alpha)\, \omega_\phi(g),
\]
where the sum is over all spherical functions of $G/K$.

Lemma 4.6.
\[
Var(E[(W' - W)^2|gK]) = \sum_\phi^{*} m_\phi(\tau^2)^2 \left(1 + \omega_\phi(\alpha) - 2\omega_\tau(\alpha)\right)^2,
\]
where the star signifies that the sum is over all nontrivial spherical functions of $G/K$.

Lemma 4.7. Let $k$ be a positive integer.
(1) $E(W' - W)^k = \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{k-r})\, \omega_\phi(\alpha)$.
(2) $E(W' - W)^4 = \sum_\phi m_\phi(\tau^2)^2 \left[ 8(1 - \omega_\tau(\alpha)) - 6(1 - \omega_\phi(\alpha)) \right]$.

Finally, one has the following central limit theorem.

Theorem 4.8. Let $G/K$ be a compact symmetric space and let $\omega_\tau$ be a non-trivial real-valued spherical function of $G/K$. Fix an element $\alpha \notin K$ such that $K\alpha K = K\alpha^{-1}K$. Let $W = [\dim(\tau)]^{1/2} \omega_\tau(g)$, where $gK$ is chosen from the “Haar measure” of $G/K$. Then for all real $x_0$,
\begin{align*}
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{x^2}{2}}\, dx \right|
&\le \sqrt{\sum_\phi^{*} m_\phi(\tau^2)^2 \left[ 2 - \frac{1}{a}\left(1 - \omega_\phi(\alpha)\right) \right]^2} \\
&\quad + \left[ \frac{1}{\pi} \sum_\phi m_\phi(\tau^2)^2 \left( 8 - \frac{6}{a}\left(1 - \omega_\phi(\alpha)\right) \right) \right]^{1/4}.
\end{align*}
Here $a = 1 - \omega_\tau(\alpha)$, the first sum is over all non-trivial spherical functions of $G/K$, and the second sum is over all spherical functions of $G/K$.

4.2. Example: the sphere. This subsection studies the unit sphere in $\mathbb{R}^n$, viewed as the symmetric space $SO(n, \mathbb{R})/SO(n-1, \mathbb{R})$. Chapter 9 of [V] is a good reference for the spherical functions of this symmetric space, and Chapter 4 of [DyM] is a very clear textbook treatment for the special case $n = 3$.

Letting $e_1, \ldots, e_n$ be the standard basis of $\mathbb{R}^n$ and embedding $SO(n-1, \mathbb{R})$ inside $SO(n, \mathbb{R})$ as the subgroup fixing $e_n$, then $K g_1 K = K g_2 K$ if and only if $g_1(e_n)$ and $g_2(e_n)$ have the same last coordinate. From this it is not difficult to check that $KgK = Kg^{-1}K$ for all $g$, and that the double cosets of $SO(n-1, \mathbb{R})$ in $SO(n, \mathbb{R})$ are parameterized by $x_n$, the final coordinate of a point $(x_1, \ldots, x_n)$ on the sphere. In what follows we let $x$ denote $x_n$.
From p. 461 of [V], the spherical functions $\omega_l$ are parameterized by integers l ≥ 0 and satisfy
\[ \omega_l(x) = \frac{l!\,(n-3)!}{(l+n-3)!}\, C_l^{\frac{n-2}{2}}(x). \]
Here $C_l^{\rho}$ is the Gegenbauer polynomial, defined by the generating function
\[ \sum_{l \ge 0} C_l^{\rho}(x)\, t^l = (1 - 2xt + t^2)^{-\rho}. \]
For instance,
\[ C_0^{\rho}(x) = 1, \quad C_1^{\rho}(x) = 2\rho x, \quad C_2^{\rho}(x) = -\rho + 2\rho(1+\rho)x^2 \]
and
\[ \omega_0(x) = 1, \quad \omega_1(x) = x, \quad \omega_2(x) = \frac{nx^2 - 1}{n-1}. \]
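As a sanity check on these closed forms, the sketch below evaluates the Gegenbauer polynomials in exact rational arithmetic and compares the normalized functions with the stated expressions for $\omega_1$ and $\omega_2$. It uses the standard three-term recurrence rather than the generating function (an assumption of this sketch, since the text defines $C_l^{\rho}$ only through the generating function), and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def gegenbauer(l, rho, x):
    """C_l^rho(x) via l*C_l = 2(l+rho-1)*x*C_{l-1} - (l+2*rho-2)*C_{l-2}."""
    c_prev, c_curr = Fraction(1), 2 * rho * x  # C_0 and C_1
    if l == 0:
        return c_prev
    for k in range(2, l + 1):
        c_prev, c_curr = c_curr, (2 * (k + rho - 1) * x * c_curr
                                  - (k + 2 * rho - 2) * c_prev) / k
    return c_curr


def omega(l, n, x):
    """Normalized spherical function of the sphere in R^n (needs n >= 3)."""
    rho = Fraction(n - 2, 2)
    norm = Fraction(factorial(l) * factorial(n - 3), factorial(l + n - 3))
    return norm * gegenbauer(l, rho, x)


n, x = 5, Fraction(1, 3)
assert omega(0, n, x) == 1
assert omega(1, n, x) == x
assert omega(2, n, x) == (n * x**2 - 1) / Fraction(n - 1)
# Linearization of omega_1^2 implied by the formula for omega_2.
assert x**2 == Fraction(1, n) * omega(0, n, x) + Fraction(n - 1, n) * omega(2, n, x)
# Each omega_l takes the value 1 at x = 1.
assert omega(3, n, Fraction(1)) == 1
```

The linearization $x^2 = \frac{1}{n}\omega_0(x) + \frac{n-1}{n}\omega_2(x)$ checked here is obtained by solving the formula for $\omega_2$ for $x^2$.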
By p. 462 of [V],
\[ \dim(l) = \frac{(2l+n-2)\,(n+l-3)!}{(n-2)!\; l!}. \]
We study the random variable $W(x) = [\dim(1)]^{1/2}\omega_1 = \sqrt{n}\,x$. In fact sharp (up to constants) normal approximations for W are known: see Diaconis and Freedman [DF] and also Meckes [Me1], who uses Stein’s method to obtain an error term of $\frac{2\sqrt{3}}{n-1}$ in total variation distance. Our viewpoint leads to the following result.
Theorem 4.9. Let $W = \sqrt{n}\,x$, where x is the last coordinate of a random point on the unit sphere in $R^n$. Then for all real $x_0$,
\[ \left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{2\sqrt{2}}{n-1}. \]
Proof. We apply Theorem 4.8 with $\tau = \omega_1$. Writing $\omega_1(x)^2$ as a linear combination of $\omega_0(x)$ and $\omega_2(x)$, one computes that $m_{(0)}(\tau^2) = 1$, $m_{(2)}(\tau^2) = \sqrt{\frac{2(n-1)}{n+2}}$, and that all other multiplicities in $\tau^2$ vanish. Letting α be less than one but close to 1, one computes that the first error term in Theorem 4.8 is exactly $\frac{2\sqrt{2}}{n-1}$. Letting α tend to 1 from below, the second error term in Theorem 4.8 vanishes and the result follows.
4.3. General theory (complex case). In this subsection G/K is a compact symmetric space and $\omega_\tau$ is a spherical function which is not real valued. The random variable we study is $W = \sqrt{\dim(\tau)/2}\,\bigl(\omega_\tau(g) + \overline{\omega_\tau(g)}\bigr)$. As in Subsect. 4.1, let $W' = W(\alpha g)$, where α is chosen uniformly at random from a fixed double coset $K\alpha K \ne K$ which satisfies the property that $K\alpha K = K\alpha^{-1}K$. Then $(W, W')$ is exchangeable and all $\omega_\phi(\alpha)$ are real. The remaining results in this subsection extend those of Subsect. 3.5, and are proved by nearly identical arguments, using Lemmas 4.1 and 4.2.
Lemma 4.10. $E(W' \mid W) = \omega_\tau(\alpha)\,W$.
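Lemma 4.11. $E(W'-W)^2 = 2(1 - \omega_\tau(\alpha))$.
For the remaining results in this subsection, we define the “multiplicity” $m_\phi[(\tau+\overline{\tau})^r]$ by the expansion
\[ (\omega_\tau + \overline{\omega_\tau})^r = \sum_{\phi} \left[ \frac{\dim(\phi)}{\dim(\tau)^r} \right]^{1/2} m_\phi[(\tau+\overline{\tau})^r]\, \omega_\phi. \]
Arguing as on p. 396 of [Mac], one has that the numbers $m_\phi[(\tau+\overline{\tau})^r]$ are real and non-negative (though not necessarily integers). Note that by Lemma 4.1,
\[ m_\phi[(\tau+\overline{\tau})^r] = \left[\dim(\phi)\dim(\tau)^r\right]^{1/2} \int_{G/K} \bigl(\omega_\tau(g) + \overline{\omega_\tau(g)}\bigr)^r\, \overline{\omega_\phi(g)}. \]
Lemma 4.12.
\[ E[(W')^2 \mid gK] = \frac{1}{2} \sum_{\phi} m_\phi[(\tau+\overline{\tau})^2]\, \dim(\phi)^{1/2}\, \omega_\phi(\alpha)\, \omega_\phi(g), \]
where the sum is over all spherical functions of G/K.
Lemma 4.13.
\[ \mathrm{Var}\bigl(E[(W'-W)^2 \mid gK]\bigr) = \frac{1}{4} \sum_{\phi}^{*} m_\phi[(\tau+\overline{\tau})^2]^2 \bigl(1 + \omega_\phi(\alpha) - 2\omega_\tau(\alpha)\bigr)^2, \]
where the star signifies that the sum is over all nontrivial spherical functions of G/K.
Lemma 4.14. Let k be a positive integer.
(1) $E(W'-W)^k$ is equal to
\[ \frac{1}{2^{k/2}} \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_{\phi} m_\phi[(\tau+\overline{\tau})^r]\, m_\phi[(\tau+\overline{\tau})^{k-r}]\, \omega_\phi(\alpha). \]
(2) $E(W'-W)^4$ is equal to
\[ \sum_{\phi} m_\phi[(\tau+\overline{\tau})^2]^2 \left[ 2(1-\omega_\tau(\alpha)) - \frac{3}{2}(1-\omega_\phi(\alpha)) \right]. \]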
Putting the pieces together, one has the following central limit theorem.
Theorem 4.15. Let G/K be a compact symmetric space and let $\omega_\tau$ be a spherical function of G/K which is not real valued. Fix an element $\alpha \notin K$ such that $K\alpha K = K\alpha^{-1}K$.
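Let $W = \sqrt{\dim(\tau)/2}\,\bigl(\omega_\tau(g) + \overline{\omega_\tau(g)}\bigr)$, where gK is chosen from the “Haar measure” of G/K. Then for all real $x_0$,
\[ \left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{1}{2} \sqrt{ \sum_{\phi}^{*} m_\phi[(\tau+\overline{\tau})^2]^2 \left[ 2 - \frac{1}{a}\bigl(1 - \omega_\phi(\alpha)\bigr) \right]^2 } + \left[ \frac{1}{\pi} \sum_{\phi} m_\phi[(\tau+\overline{\tau})^2]^2 \left( 2 - \frac{3}{2a}\bigl(1 - \omega_\phi(\alpha)\bigr) \right) \right]^{1/4}. \]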
Here $a = 1 - \omega_\tau(\alpha)$, the first sum is over all non-trivial spherical functions of G/K, and the second sum is over all spherical functions of G/K.
4.4. Example. U(n, C)/O(n, R). This symmetric space can be identified with the set of symmetric unitary matrices by the map $g \mapsto gg^T$ (see [Dn] or [Fo] for details), and the resulting matrix ensemble is known as Dyson’s circular orthogonal ensemble. For a thorough discussion of this ensemble, see the texts [Fo] or [Mt]. In particular, it is known that if a function f depends only on the eigenvalues $x_1, \ldots, x_n$ of a matrix from Dyson’s ensemble, then
\[ \int_{G/K} f = \frac{[\Gamma(3/2)]^n}{\Gamma(\frac{n}{2}+1)} \int_{T^n} f(x_1, \ldots, x_n) \prod_{1 \le i < j \le n} |x_i - x_j| \prod_{k=1}^{n} \frac{dx_k}{2\pi}. \]
In this integral, $T^n$ is the n-dimensional torus with coordinates $x_1, \ldots, x_n$, $x_i \in C$, $|x_i| = 1$.
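The normalizing constant can be checked numerically in the smallest case n = 2: integrating the constant function 1 should give total mass 1. The sketch below is a plain midpoint rule on the torus and assumes that $dx_k$ denotes the angular measure $d\theta_k$ with $x_k = e^{i\theta_k}$; the function name and grid size are illustrative choices, not part of the source.

```python
import cmath
import math


def coe_integral_n2(f, grid=400):
    """Midpoint-rule approximation of the integral of f over G/K for n = 2."""
    n = 2
    const = math.gamma(1.5) ** n / math.gamma(n / 2 + 1)
    h = 2 * math.pi / grid
    points = [cmath.exp(1j * (k + 0.5) * h) for k in range(grid)]
    total = 0.0
    for x1 in points:
        for x2 in points:
            total += f(x1, x2) * abs(x1 - x2)
    # Each factor dx_k / (2*pi) contributes h / (2*pi) = 1 / grid.
    return const * total / grid**2


total_mass = coe_integral_n2(lambda x1, x2: 1.0)
assert abs(total_mass - 1.0) < 1e-3
```

The weight $|x_1 - x_2|$ has a kink on the diagonal, so the rule converges only polynomially in the grid size; the tolerance is set accordingly.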
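It is convenient to let the inner product $\langle f, g \rangle$ of two functions be defined by
\[ \langle f, g \rangle = \frac{[\Gamma(3/2)]^n}{\Gamma(\frac{n}{2}+1)} \int_{T^n} f(x_1, \ldots, x_n)\, \overline{g(x_1, \ldots, x_n)} \prod_{1 \le i < j \le n} |x_i - x_j| \prod_{k=1}^{n} \frac{dx_k}{2\pi}. \]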
The spherical functions for this symmetric space are parameterized by integer sequences $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and are
\[ \omega_\lambda := \frac{P_\lambda(x_1, \ldots, x_n; 2)}{P_\lambda(1, \ldots, 1; 2)}, \]
the normalized Jack polynomials with parameter 2. An excellent reference for Jack polynomials is Sect. 6.10 of [Mac]. There one assumes that $\lambda_n \ge 0$, so if $\lambda_n = -k < 0$, $P_\lambda$ should be interpreted as $(x_1 \cdots x_n)^{-k} P_{\lambda+(k)^n}$, where $\lambda + (k)^n$ is given by adding k to each of $\lambda_1, \ldots, \lambda_n$.
To describe some useful combinatorial properties of Jack polynomials, we use the notation that if λ is a partition and s is a box of λ, then $l'(s)$, $l(s)$, $a(s)$, $a'(s)$ are respectively the number of squares in the diagram of λ to the north of s (in the same column), south of s (in the same column), east of s (in the same row), and west of s (in the same row). For example, the box marked s in the partition below
[Figure: Young diagram of a partition with one box marked s]
would have $l'(s) = 1$, $l(s) = 2$, $a'(s) = 1$, and $a(s) = 3$.
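A minimal sketch of how these four box statistics can be read off from a partition stored as a tuple of row lengths. The partition (5, 5, 3, 2) and the position of s (row 2, column 2) are hypothetical choices, picked only because they reproduce the values quoted above; they are not taken from the original diagram.

```python
def box_statistics(partition, row, col):
    """Return (l', l, a', a) for the box in the given 1-indexed row and column."""
    if not (1 <= row <= len(partition) and 1 <= col <= partition[row - 1]):
        raise ValueError("box is not in the diagram")
    coleg = row - 1                                           # boxes to the north
    leg = sum(1 for part in partition[row:] if part >= col)   # boxes to the south
    coarm = col - 1                                           # boxes to the west
    arm = partition[row - 1] - col                            # boxes to the east
    return coleg, leg, coarm, arm


# Hypothetical partition and box, chosen to match l' = 1, l = 2, a' = 1, a = 3.
assert box_statistics((5, 5, 3, 2), 2, 2) == (1, 2, 1, 3)
```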
Letting λ be a partition of n, and using this notation, two useful formulas are the “principal specialization formula” (p. 381 of [Mac])
\[ P_\lambda(1, \ldots, 1; 2) = \prod_{s \in \lambda} \frac{n + 2a'(s) - l'(s)}{2a(s) + l(s) + 1}, \]
and the formula
\[ \dim(\lambda) = \prod_{s \in \lambda} \frac{\bigl(n + 1 + 2a'(s) - l'(s)\bigr)\bigl(n + 2a'(s) - l'(s)\bigr)}{\bigl(2a(s) + l(s) + 2\bigr)\bigl(2a(s) + l(s) + 1\bigr)}, \]
which follows from the formula for $\langle P_\lambda, P_\lambda \rangle$ on p. 383 of [Mac] and the fact (Lemma 4.1) that
\[ \dim(\lambda) = \frac{P_\lambda(1, \ldots, 1; 2)^2}{\langle P_\lambda, P_\lambda \rangle}. \]
Theorem 4.16. Let $W = \frac{1}{2}\sqrt{1 + \frac{1}{n}}\,\bigl(Tr(g) + \overline{Tr(g)}\bigr)$, where g is random from Dyson’s circular orthogonal ensemble, Tr denotes trace, and n ≥ 2. Then for all real $x_0$,
\[ \left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{4}{n}. \]
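The two product formulas above are easy to evaluate mechanically. The sketch below does so in exact arithmetic and checks that, for λ = (1), the value of dim(λ) reproduces the constant in front of the trace in Theorem 4.16, using the general normalization $W = \sqrt{\dim(\tau)/2}\,(\omega_\tau + \overline{\omega_\tau})$ with $\omega_{(1)}(g) = Tr(g)/n$. Function names are illustrative, and n is passed explicitly as the number of variables.

```python
from fractions import Fraction


def boxes(partition):
    """Yield (a', l', a, l) for every box of the diagram of the partition."""
    for i, part in enumerate(partition):       # i boxes lie to the north
        for j in range(part):                  # j boxes lie to the west
            arm = part - j - 1
            leg = sum(1 for below in partition[i + 1:] if below > j)
            yield j, i, arm, leg


def principal_specialization(partition, n):
    """P_lambda(1, ..., 1; 2) in n variables."""
    result = Fraction(1)
    for coarm, coleg, arm, leg in boxes(partition):
        result *= Fraction(n + 2 * coarm - coleg, 2 * arm + leg + 1)
    return result


def dim_coe(partition, n):
    """dim(lambda) for U(n, C)/O(n, R) from the product formula."""
    result = Fraction(1)
    for coarm, coleg, arm, leg in boxes(partition):
        top = (n + 1 + 2 * coarm - coleg) * (n + 2 * coarm - coleg)
        bottom = (2 * arm + leg + 2) * (2 * arm + leg + 1)
        result *= Fraction(top, bottom)
    return result


for n in range(2, 8):
    assert principal_specialization((1,), n) == n
    assert dim_coe((1,), n) == Fraction(n * (n + 1), 2)
    # sqrt(dim/2) / n squared equals (1/4)(1 + 1/n), the constant of Theorem 4.16.
    assert dim_coe((1,), n) / (2 * n**2) == Fraction(1, 4) * (1 + Fraction(1, n))
```

For example, `dim_coe((2,), 4)` evaluates to 35 and `principal_specialization((2,), 4)` to 8.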
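Proof. Apply Theorem 4.15 to the spherical function $\tau = \omega_{(1)}(g) = \frac{Tr(g)}{n}$. To compute $m_\phi[(\tau+\overline{\tau})^2]$ for all φ, one has to decompose $(\omega_{(1)} + \overline{\omega_{(1)}})^2$ into spherical functions, which is equivalent to decomposing $(P_{(1)} + \overline{P_{(1)}})^2$ in terms of Jack polynomials. From the Pieri rule for Jack polynomials ([Mac], p. 340), one calculates that
\[ \begin{aligned} (P_{(1)} + \overline{P_{(1)}})^2 &= P_{(1)}P_{(1)} + 2\,\frac{P_{(1)}P_{(1^{n-1})}}{x_1 \cdots x_n} + \overline{P_{(1)}}\;\overline{P_{(1)}} \\ &= P_{(2)} + \frac{4}{3}P_{(1^2)} + 2\,\frac{\frac{2n}{n+1}P_{(1^n)} + P_{(2,1^{n-2})}}{x_1 \cdots x_n} + \overline{P_{(2)}} + \frac{4}{3}\overline{P_{(1^2)}} \\ &= \frac{4n}{n+1} + P_{(2)} + \frac{4}{3}P_{(1^2)} + 2P_{(1,0^{n-2},-1)} + \overline{P_{(2)}} + \frac{4}{3}\overline{P_{(1^2)}}. \end{aligned} \]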
We choose α to be an element of type $(1, \ldots, 1, e^{i\theta}, e^{-i\theta})$. Then $K\alpha K = K\alpha^{-1}K$ and $a = \frac{2(1-\cos(\theta))}{n}$. One computes that in the θ → 0 limit, the first error term of Theorem 4.15 is
\[ \frac{1}{n}\sqrt{\frac{8(n^3 + 2n^2 + 5n + 6)}{n^3 + 4n^2 + n - 6}} \le \frac{4}{n}. \]
The proof is completed by computing that the second error term is
\[ \left[ \frac{24(n+1)^2(1-\cos(\theta))}{\pi n^2 (n+3)} \right]^{1/4}, \]
which goes to 0 as θ → 0.
4.5. Example. U(2n, C)/USp(2n, C). This symmetric space corresponds to Dyson’s circular symplectic ensemble ([Dn]); see [Fo] or [Mt] for background on this ensemble. In particular, it is known that if f depends only on the eigenvalues $x_1, \ldots, x_n$ of a matrix from this ensemble, then
\[ \int_{G/K} f = \frac{2^n}{(2n)!} \int_{T^n} f(x_1, \ldots, x_n) \prod_{1 \le i < j \le n} |x_i - x_j|^4 \prod_{k=1}^{n} \frac{dx_k}{2\pi}, \]
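where $T^n$ is as in the previous example. We let the inner product $\langle f, g \rangle$ of two functions be defined by
\[ \langle f, g \rangle = \frac{2^n}{(2n)!} \int_{T^n} f(x_1, \ldots, x_n)\, \overline{g(x_1, \ldots, x_n)} \prod_{1 \le i < j \le n} |x_i - x_j|^4 \prod_{k=1}^{n} \frac{dx_k}{2\pi}. \]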
The spherical functions for this symmetric space are parameterized by integer sequences $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and are
\[ \omega_\lambda := \frac{P_\lambda(x_1, \ldots, x_n; \frac{1}{2})}{P_\lambda(1, \ldots, 1; \frac{1}{2})}, \]
the normalized Jack polynomials with parameter 1/2. As mentioned in the previous example, Jack polynomials are usually defined assuming that $\lambda_n \ge 0$, so if $\lambda_n = -k < 0$, $P_\lambda$ should be interpreted as $(x_1 \cdots x_n)^{-k} P_{\lambda+(k)^n}$, where $\lambda + (k)^n$ is given by adding k to each of $\lambda_1, \ldots, \lambda_n$. Letting λ be a partition of n and using the notation of the previous example, two useful formulas are the “principal specialization formula” (p. 381 of [Mac])
\[ P_\lambda(1, \ldots, 1; \tfrac{1}{2}) = \prod_{s \in \lambda} \frac{n + \frac{a'(s)}{2} - l'(s)}{\frac{a(s)}{2} + l(s) + 1}, \]
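and the formula
\[ \dim(H_\lambda) = \prod_{s \in \lambda} \frac{\bigl(n + \frac{a'(s)}{2} - l'(s)\bigr)\bigl(2n - 1 + a'(s) - 2l'(s)\bigr)}{\bigl(\frac{a(s)}{2} + l(s) + 1\bigr)\bigl(a(s) + 2l(s) + 1\bigr)}, \]
which follows from the formula for $\langle P_\lambda, P_\lambda \rangle$ on p. 383 of [Mac] and the fact (Lemma 4.1) that
\[ \dim(\lambda) = \frac{P_\lambda(1, \ldots, 1; \frac{1}{2})^2}{\langle P_\lambda, P_\lambda \rangle}. \]
Theorem 4.17. Let $W = \sqrt{1 - \frac{1}{2n}}\,\bigl(Tr(g) + \overline{Tr(g)}\bigr)$, where g is random from Dyson’s circular symplectic ensemble and n ≥ 2. Then for all real $x_0$,
\[ \left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{4}{n}. \]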
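Proof. Apply Theorem 4.15 to the spherical function $\tau = \omega_{(1)}(g) = \frac{Tr(g)}{n}$. To compute $m_\phi[(\tau+\overline{\tau})^2]$ for all φ, one has to decompose $(\omega_{(1)} + \overline{\omega_{(1)}})^2$ into spherical functions, which is equivalent to decomposing $(P_{(1)} + \overline{P_{(1)}})^2$ in terms of Jack polynomials. From the Pieri rule for Jack polynomials ([Mac], p. 340), one calculates that $(P_{(1)} + \overline{P_{(1)}})^2$ is equal to
\[ \begin{aligned} & P_{(1)}P_{(1)} + 2\,\frac{P_{(1)}P_{(1^{n-1})}}{x_1 \cdots x_n} + \overline{P_{(1)}}\;\overline{P_{(1)}} \\ &= P_{(2)} + \frac{2}{3}P_{(1^2)} + 2\,\frac{\frac{n}{2n-1}P_{(1^n)} + P_{(2,1^{n-2})}}{x_1 \cdots x_n} + \frac{2}{3}\overline{P_{(1^2)}} + \overline{P_{(2)}} \\ &= \frac{2n}{2n-1} + P_{(2)} + \frac{2}{3}P_{(1^2)} + 2P_{(1,0^{n-2},-1)} + \frac{2}{3}\overline{P_{(1^2)}} + \overline{P_{(2)}}. \end{aligned} \]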
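The constant term $\frac{2n}{2n-1}$ can be checked numerically for n = 2. Since every non-trivial spherical function integrates to zero over G/K, the integral of $(P_{(1)} + \overline{P_{(1)}})^2$ against the measure of this subsection should equal that constant. The sketch below assumes that $dx_k$ denotes the angular measure $d\theta_k$ with $x_k = e^{i\theta_k}$; function names and the grid size are illustrative.

```python
import cmath
import math


def cse_average_n2(f, grid=64):
    """Midpoint-rule integral of f over G/K for n = 2 (weight |x1 - x2|^4)."""
    n = 2
    const = 2**n / math.factorial(2 * n)
    h = 2 * math.pi / grid
    points = [cmath.exp(1j * (k + 0.5) * h) for k in range(grid)]
    total = 0.0
    for x1 in points:
        for x2 in points:
            total += f(x1, x2) * abs(x1 - x2) ** 4
    # Each factor dx_k / (2*pi) contributes h / (2*pi) = 1 / grid.
    return const * total / grid**2


def squared_trace_sum(x1, x2):
    """(P_(1) + conj(P_(1)))^2 evaluated at the point (x1, x2)."""
    p1 = x1 + x2
    return (p1 + p1.conjugate()).real ** 2


n = 2
assert abs(cse_average_n2(lambda x1, x2: 1.0) - 1.0) < 1e-6
assert abs(cse_average_n2(squared_trace_sum) - 2 * n / (2 * n - 1)) < 1e-6
```

Here the integrand is a trigonometric polynomial of low degree, so an equally spaced rule is exact up to rounding and a small grid suffices.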
Now take α to be an element of type $(1, \ldots, 1, e^{i\theta}, e^{-i\theta})$; then $K\alpha K = K\alpha^{-1}K$ and $a = \frac{2(1-\cos(\theta))}{n}$. One computes that in the θ → 0 limit, the first error term of Theorem 4.15 is
\[ \frac{1}{2n}\sqrt{\frac{8(4n^3 - 4n^2 + 5n - 3)}{4n^3 - 8n^2 + n + 3}} \le \frac{4}{n}. \]
The second error term is computed to be
\[ \left[ \frac{6(2n-1)(4n-5)(1-\cos(\theta))}{\pi n^2 (2n-3)} \right]^{1/4}, \]
which goes to 0 as θ → 0.
References
[BF] Baker, T., Forrester, P.: Finite-n fluctuation formulas for random matrices. J. Stat. Phys. 88, 1371–1386 (1997)
[Bu] Bump, D.: Lie Groups. Graduate Texts in Mathematics 225. New York: Springer-Verlag, 2004
[CFR] Chatterjee, S., Fulman, J., Roellin, A.: Exponential approximation by Stein’s method and spectral graph theory. http://arXiv.org/list/math.PR/0605552, 2008
[CoSz] Collins, B., Stolz, M.: Borel theorems for random matrices from the classical compact symmetric spaces. Ann. Probab. 36, 876–895 (2008)
[DDN] D’Aristotile, A., Diaconis, P., Newman, C.: Brownian motion and the classical groups. In: Probability, statistics and their applications: papers in honor of Rabi Bhattacharya. IMS Lecture Notes Ser. 41, 2003, pp. 97–116
[DE] Diaconis, P., Evans, S.: Linear functionals of eigenvalues of random matrices. Transac. Amer. Math. Soc. 353, 2615–2633 (2001)
[DF] Diaconis, P., Freedman, D.: A dozen de Finetti-style results in search of a theory. Ann. Inst. H. Poincaré Probab. Statist. 23, 397–423 (1987)
[DS] Diaconis, P., Shahshahani, M.: On the eigenvalues of random matrices. J. Appl. Probab. 31, 49–62 (1994)
[Dn] Dueñez, E.: Random matrix ensembles associated to compact symmetric spaces. Commun. Math. Phys. 244, 29–61 (2004)
[Du] Durrett, R.: Probability: Theory and examples, Second edition. Belmont, CA: Duxbury Press, 1996
[DyM] Dym, H., McKean, H.: Fourier series and integrals. New York - London: Academic Press, 1972
[Fo] Forrester, P.: Log-gases and random matrices. Book in preparation. Available at http://www.ms.unimelb.edu.au/~matpjf/matpjf.html, 2005
[Fu] Fulman, J.: Stein’s method and random character ratios. Transac. Amer. Math. Soc. 360, 3687–3730 (2008)
[GoT] Götze, F., Tikhomirov, A.N.: Limit theorems for spectra of random matrices with martingale structure. Theory Probab. Appl. 51, 42–64 (2007)
[He1] Helgason, S.: Differential geometry, Lie groups, and symmetric spaces. San Diego, CA: Academic Press, 1978
[He2] Helgason, S.: Groups and geometric analysis. Corrected reprint of the 1984 original, Providence, RI: Amer. Math. Soc., 2000
[J] Johansson, K.: On random matrices from the compact classical groups. Ann. of Math. 145, 519–545 (1997)
[Ka] Katz, N.: Exponential sums and differential equations. Ann. Math. Studies 124. Princeton, NJ: Princeton University Press, 1990
[KaS] Katz, N., Sarnak, P.: Zeroes of zeta functions and symmetry. Bull. Amer. Math. Soc. 36, 1–26 (1999)
[KLR] Keating, J.P., Linden, N., Rudnick, Z.: Random matrix theory, the exceptional Lie groups and L-functions. J. Phys. A 36, 2933–2944 (2003)
[Kl] Klyachko, A.: Random walks on symmetric spaces and inequalities for matrix spectra. Lin. Alg. Applic. 319, 37–59 (2000)
[Lu] Luk, H.M.: Stein’s method for the gamma distribution and related statistical applications, University of Southern California Ph.D. thesis, 1994
[Mac] Macdonald, I.: Symmetric functions and Hall polynomials. Second edition, New York: Oxford University Press, 1995
[Mn] Mann, B.: Stein’s method for $\chi^2$ of a multinomial. Unpublished manuscript, 1997
[Me1] Meckes, E.: On the approximate normality of eigenfunctions of the Laplacian. http://arXiv.org/abs/0705.1342V1[math.SP], (2007), to appear in Transac. Amer. Math. Soc.
[Me2] Meckes, E.: Linear functions on the classical matrix groups. Transac. Amer. Math. Soc. 360, 5355–5366 (2008)
[Mt] Mehta, M.: Random matrices. Third edition, Amsterdam: Elsevier/Academic Press, 2004
[PV] Pastur, L., Vasilchuk, V.: On the moments of traces of matrices of classical groups. Commun. Math. Phys. 252, 149–166 (2004)
[Pr] Proctor, R.: A Schensted algorithm which models tensor representations of the orthogonal group. Canad. J. Math. 42, 28–49 (1990)
[Ra] Rains, E.: Topics in probability on compact Lie groups, Harvard University Ph.D. thesis, 1997
[Re] Reinert, G.: Three general approaches to Stein’s method. In: An introduction to Stein’s method, Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap., vol. 4, 2005
[RR] Rinott, Y., Rotar, V.: Normal approximations by Stein’s method. Decis. Econ. Finance 23, 15–29 (2000)
[Rl] Röellin, A.: A note on the exchangeability condition in Stein’s method. http://arXiv.org/list/math.PR/0611050, 2006
[So] Soshnikov, A.: The central limit theorem for local linear statistics in classical compact groups and related combinatorial identities. Ann. Probab. 28, 1353–1370 (2000)
[St1] Stein, C.: Approximate computation of expectations. Institute of Mathematical Statistics Lecture Notes, vol. 7, 1986
[St2] Stein, C.: The accuracy of the normal approximation to the distribution of the traces of powers of random orthogonal matrices. Stanford University Statistics Department technical report no. 470, March 1995
[Su] Sundaram, S.: Tableaux in the representation theory of compact Lie groups. In: Invariant theory and tableaux. IMA Volumes in Mathematics, vol. 19, 1990, pp. 191–225
[Te] Terras, A.: Harmonic analysis on symmetric spaces and applications. Volumes I, II, N.Y.: Springer Verlag, 1985, 1988
[V] Vilenkin, N.J.: Special functions and the theory of group representations. Translations of Mathematics Monographs, Volume 2, Providence, RI: Amer. Math. Soc., 1968
[Vr] Vretare, L.: Formulas for elementary spherical functions and generalized Jacobi polynomials. SIAM J. Math. Anal. 15, 805–833 (1984)
[W] Weyl, H.: The classical groups. Fifteenth printing, Princeton, NJ: Princeton University Press, 1997
Communicated by S. Zelditch