Commun. Math. Phys. 264, 1–40 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1538-3
Communications in
Mathematical Physics
The Virtual Class of the Moduli Stack of Stable r-Spin Curves
Takuro Mochizuki 1,2
1 Department of Mathematics, Kyoto University, Kyoto, 606-8502, Japan. E-mail: [email protected]
2 Max-Planck Institute for Mathematics, Vivatsgasse 7, 53111 Bonn, Germany. E-mail: [email protected]
Received: 18 April 2004 / Accepted: 27 October 2005 Published online: 15 March 2006 – © Springer-Verlag 2006
Abstract: We recall the outline of the Seely-Singer-Witten construction of the virtual class on the moduli of stable r-spin curves. We prove that the obtained classes satisfy the axioms of Jarvis-Kimura-Vaintrob.
1. Introduction
1.1. Stable r-spin curves. Let n be a non-negative integer, r be a natural number and $m = (m_1, \ldots, m_n)$ be an n-tuple of integers. Let $M_{g,n}^{1/r,m}$ be the moduli stack of smooth n-pointed r-spin curves of genus g. Such a curve is a tuple $(C, p, F)$ of a smooth curve C of genus g, n points $p = \{p_1, \ldots, p_n\}$ contained in the curve C, and a line bundle F on C with an isomorphism $F^{\otimes r} \simeq \omega_C \otimes O(-m \cdot p)$, where we put $m \cdot p = \sum m_i \cdot p_i$. T. Jarvis constructed the moduli stack $\overline{M}_{g,n}^{1/r,m}$ of the stable r-spin curves, which gives the smooth compactification of the stack $M_{g,n}^{1/r,m}$ ([3 and 4]). Naively, a 'definition' of a stable r-spin curve is a tuple $(C, p, F)$ of an n-pointed stable curve $(C, p)$ and a torsion-free rank 1 sheaf F with a morphism $F^{\otimes r} \to \omega_C(-m \cdot p)$ satisfying some conditions. But the moduli stacks of such naive objects are not smooth. Hence he introduced the notion of the coherent net so that the moduli stacks become smooth. Thus the moduli stack $\overline{M}_{g,n}^{1/r,m}$ over the complex number field $\mathbb{C}$ is topologically an orbifold.
The nodal points of a stable r-spin curve $(C, p, F)$ are divided into two types by the local property of the r-spin structure F at the point: If F is locally free at the nodal point Q, then Q is called of Ramond type. Otherwise the point Q is called of Neveu-Schwarz type. By the definition, F around the point Q is described as follows: Let $\pi : \tilde{C} \to C$ be a normalization of the curve C at the point Q. Let $Q_j$ (j = 1, 2) denote the preimages of Q. We take coordinate neighbourhoods $B_i = \{z_i \in \mathbb{C} \mid |z_i| < 1,\ z_i(Q_i) = 0\}$ around the points $Q_i$.
• If Q is of Ramond type, then we have the locally free sheaf $\tilde{F}$ generated by the sections $(dz_i/z_i)^{1/r}$. The sheaf F is obtained from $\tilde{F}$ by the gluing $\varphi : \tilde{F}_{|Q_1} \to \tilde{F}_{|Q_2}$ such that $\varphi((dz_1/z_1)^{1/r})^{\otimes r} = -dz_2/z_2$. If Q is of Ramond type, the index at Q is defined to be (−1, −1).
• If Q is of Neveu-Schwarz type, then there is a pair of integers $(m_1, m_2)$ with the conditions $0 \le m_i \le r - 2$ and $m_1 + m_2 = r - 2$ such that the sheaf F is $\bigoplus_{i=1,2} \pi_* O_{B_i} \cdot (z_i^{m_i}\, dz_i)^{1/r}$ around the nodal point Q. The pair $(m_1, m_2)$ is called the index of Q. Also the number $m_i$ is called the index of $Q_i$.
1.2. The virtual class. In the papers [6 and 7], T. Jarvis, T. Kimura and A. Vaintrob introduced the substack $\overline{M}_{\Gamma}^{1/r}$ of $\overline{M}_{g,n}^{1/r,m}$ for any stable decorated graph $\Gamma$ of genus g with n tails marked by the tuple m, and they formulated the axioms of the virtual class of $\overline{M}_{\Gamma}^{1/r}$ (see Definition 4.1). The virtual class of $\overline{M}_{g,n}^{1/r,m}$ was originally introduced by Witten ([13]), when he generalized his famous conjecture on the intersection theory of the moduli stack of stable curves, although the existence of the nice moduli stack of r-spin stable curves was established by Jarvis later.
Remark 1.1. Jarvis, Kimura and Vaintrob also considered the case where one of the $m_i$ is −1. In this paper, we restrict our attention to the tuples of non-negative integers.
1.3. The purpose. The purpose of this paper is to recall the outline of the original construction of Witten, which we call the SSW-construction, and to show that the obtained virtual classes satisfy the axioms of Jarvis-Kimura-Vaintrob.
We recall the SSW-construction briefly. We denote the moduli stack of smooth (resp. stable) r-spin curves by $M$ (resp. $\overline{M}$). From the universal curve $C \to M$, we obtain the Hilbert space bundles $\mathcal{E}^0$ and $\mathcal{E}^1$ whose fibers on the curve $(C, p, F)$ are $L^2(C, F)$ and $L^2(C, F \otimes T^{0,1})$ respectively. We have the family of the closed operators $\bar\partial$. Then we extend such a family $\bar\partial : \mathcal{E}^0 \to \mathcal{E}^1$ to the family $\bar\partial : \overline{\mathcal{E}}^0 \to \overline{\mathcal{E}}^1$ on the stack $\overline{M}$ due to the argument of Seely-Singer [11]. (See Sect. 3.)
Then we take finite dimensional vector subbundles $E^i \subset \overline{\mathcal{E}}^i$ satisfying the following conditions:
• $\bar\partial(E^0) \subset E^1$, and the index of $\bar\partial : E^0 \to E^1$ is the same as the index of $\bar\partial : \overline{\mathcal{E}}^0 \to \overline{\mathcal{E}}^1$.
• $E^1$ contains the orthogonal complement of $\mathrm{Im}\,\bar\partial$.
Such $E^0 \to E^1$ is called a finite reduction. (See Definition 4.2 for a more precise statement.) Let $\rho_{E^1}$ denote the orthogonal projection of $\overline{\mathcal{E}}^1$ onto $E^1$. Let $\pi$ denote the natural projection of $E^0$ on $\overline{M}$. Witten gave the section $\phi$ of $\pi^* E^1$ over $E^0$, which is given by $\phi(s) = \bar\partial s + \rho_{E^1}(\bar{s}^{\,r-1})$ (see Subsubsect. 4.2.3), and he shows $\phi^{-1}(0) = \overline{M}$ (see Lemma 4.5). Hence we obtain the top Chern class $c_{top}(\pi^* E^1, \phi)$, which is contained in $H^*_{\overline{M}}(E^0)$. It can be shown that $(-1)^{d(r,m)} \pi_* c_{top}(\pi^* E^1, \phi)$ is independent of a choice of a finite reduction, which should be the virtual class.
We will see that the classes satisfy the axioms in Subsect. 4.4. One of the key points in the proof is to show 'vanishing'. Our simple idea for vanishing is explained in Subsubsect. 4.3.1. The author hopes that it is sufficiently clear for the readers.
Remark 1.2. The algebro-geometric construction of the virtual class was given by A. Polishchuk and A. Vaintrob in [10 and 9].
2. Stable r-Spin Curves
We recall the definitions and the results on the stable r-spin curves due to Jarvis ([3 and 4]) to fix the notation. We refer to the paper [6] as a convenient reference. We denote a tuple of points $p_1, \ldots, p_n$ (resp. integers $m_1, \ldots, m_n$) by p (resp. m). We denote the formal sum $\sum m_i \cdot p_i$ by $m \cdot p$.
Definition 2.1. Let $(X, p)$ be a nodal, n-pointed algebraic curve, and let K be a rank one torsion-free sheaf on X. A d-th root of K of type $m = (m_1, \ldots, m_n)$ is a pair $(E, b)$ of a rank one torsion-free sheaf E and an $O_X$-module homomorphism $b : E^{\otimes d} \to K(-m \cdot p)$ with the following properties:
• $d \cdot \deg E = \deg K - \sum m_i$.
• b is an isomorphism on the locus of X where E is locally free.
• For every point $p \in X$ where E is not free, the length of the cokernel of b at p is $d - 1$.
Jarvis used the following notion to obtain the smooth moduli space.
Definition 2.2. Let K be a rank 1, torsion-free sheaf on a nodal n-pointed curve $(X, p)$. A coherent net of r-th roots of K of type $m = (m_1, \ldots, m_n)$ consists of the following data:
• A rank 1 torsion-free sheaf $E_d$ on X for every divisor d of r.
• An $O_X$-module homomorphism $c_{d,d'} : E_d^{\otimes d/d'} \to E_{d'}$ for every pair of divisors $d'$ and $d$ of r such that $d'$ divides $d$.
These data are subject to the following restrictions:
1. $E_1 = K$ and $c_{1,1} = \mathrm{id}$.
2. For each divisor d of r and each divisor $d'$ of d, let $m' = (m'_1, \ldots, m'_n)$ be a tuple such that $m'_i$ is the unique non-negative integer less than $d/d'$, and congruent to $m_i$ mod $d'$. Then the homomorphism $c_{d,d'}$ makes $(E_d, c_{d,d'})$ into a $d/d'$-th root of $E_{d'}$ of type $m'$.
3. The homomorphisms $\{c_{d,d'}\}$ are compatible, i.e., $c_{d',d''} \circ c_{d,d'}^{\otimes d'/d''} = c_{d,d''}$ holds.
Then Jarvis defined stable r-spin curves.
Definition 2.3. An n-pointed r-spin curve of type $m = (m_1, \ldots, m_n)$ is defined to be an n-pointed nodal curve $(X, p)$ with a coherent net of r-th roots of $\omega_X$ of type m, where $\omega_X$ is the dualizing sheaf of X. An r-spin curve is called smooth if X is smooth, and it is called stable if X is stable.
The nodal points of a stable r-spin curve $(C, p, F)$ are divided into two types by the local property of the sheaf F at the point: If F is locally free at the nodal point Q, then Q is called of Ramond type. Otherwise Q is called of Neveu-Schwarz type.
To obtain the category of the r-spin curves, the 'morphisms' are considered as follows.
Definition 2.4. An isomorphism of r-spin curves $(X, p, \{E_d, c_{d,d'}\}) \xrightarrow{\ \sim\ } (X', p', \{E'_d, c'_{d,d'}\})$ of the same type m is defined to be a tuple $(\tau, \beta)$ of an isomorphism of pointed curves $\tau : (X, p) \to (X', p')$ and a family of isomorphisms $\beta_d : \tau^* E'_d \to E_d$ with $\beta_1$ being the canonical isomorphism $\tau^* \omega_{X'}(-m \cdot p') \to \omega_X(-m \cdot p)$, such that the $\beta_d$ are compatible with all the maps $c_{d,d'}$ and $\tau^* c'_{d,d'}$.
The foundational and important theorem of Jarvis is the following. Proposition 2.1 (Jarvis). The moduli functor of the stable n-pointed r-spin curves of genus g and type m is representable by a smooth proper Deligne-Mumford stack. Following Jarvis, we denote the stack of the stable n-pointed r-spin curves of genus 1/r,m 1/r,m 1/r g and type m by Mg,n . The disjoint union m,0≤mi
3. Seely-Singer Extension
3.1. Preliminary for the operators on the interval.
3.1.1. Some generality. For any real number $a \in \mathbb{R}$, let $L^2(a)$ denote the space of $L^2$-functions on the open interval (0, 1) with respect to the measure $R^{a+1}\, dR$. Let $K(R, \rho)$ be a continuous function on the product of the intervals (0, 1) × (0, 1). We put as follows:
\[ M_1(K) := \sup_R \int_0^1 |K(R, \rho)| \cdot (R/\rho)^{1+a}\, d\rho, \qquad M_2(K) := \sup_\rho \int_0^1 |K(R, \rho)|\, dR. \]
Let us consider the integral operator $\tilde{K}$ given as follows:
\[ \tilde{K} g(R) := \int_0^1 K(R, \rho) \cdot g(\rho)\, d\rho. \]
We recall the following lemma.
Lemma 3.1. Assume $M_i(K) < \infty$ for i = 1, 2. Then $\tilde{K}$ naturally induces the bounded operator $L^2(a) \to L^2(a)$. The operator norm is smaller than $\bigl(M_1(K) \cdot M_2(K)\bigr)^{1/2}$.
Proof. Let g be any element of $L^2(a)$. By the Cauchy-Schwarz inequality, the following inequality holds:
\[ |\tilde{K} g(R)|^2 \le \int_0^1 |K(R, \rho)| \cdot \rho^{-1-a}\, d\rho \times \int_0^1 |K(R, \rho)| \cdot |g|^2 \cdot \rho^{1+a}\, d\rho. \]
Hence we obtain the following inequality:
\[ \int |\tilde{K} g(R)|^2 \cdot R^{1+a}\, dR \le \int \Bigl( \int |K(R, \rho)| \cdot (R/\rho)^{1+a}\, d\rho \times \int |K(R, \rho)| \cdot |g|^2 \cdot \rho^{1+a}\, d\rho \Bigr) dR \]
\[ \le M_1(K) \times \int \Bigl( \int |K(R, \rho)|\, dR \Bigr) \cdot |g|^2 \cdot \rho^{1+a}\, d\rho \le M_1(K) \cdot M_2(K) \times \int |g|^2 \cdot \rho^{1+a}\, d\rho. \qquad (1) \]
Thus we are done.
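The bound of Lemma 3.1 can be checked numerically on a grid. The following is only a minimal sketch, not part of the argument: the function names, the midpoint discretization, and the sample kernel $(R/\rho)^k$ supported on $R < \rho$ are illustrative assumptions, and the kernel is assumed to be non-negative. For a non-negative kernel the discrete inequality holds exactly, because the proof above goes through with sums in place of integrals.

```python
import math

def sample_kernel(R, rho, k=1):
    """Hypothetical non-negative sample kernel: (R/rho)^k on R < rho, else 0."""
    return (R / rho) ** k if R < rho else 0.0

def schur_bound_check(kernel, g, a, n=200):
    """Return (||Kg||_a, sqrt(M1*M2) * ||g||_a) for a midpoint grid on (0, 1).

    The kernel is assumed to take non-negative values.
    """
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    K = [[kernel(R, rho) for rho in pts] for R in pts]
    # Discrete analogues of M1(K) = sup_R int K (R/rho)^(1+a) drho
    # and M2(K) = sup_rho int K dR.
    M1 = max(h * sum(K[i][j] * (pts[i] / pts[j]) ** (1 + a) for j in range(n))
             for i in range(n))
    M2 = max(h * sum(K[i][j] for i in range(n)) for j in range(n))
    gv = [g(rho) for rho in pts]
    Kg = [h * sum(K[i][j] * gv[j] for j in range(n)) for i in range(n)]

    def norm_a(f):
        # L^2 norm with respect to the measure R^(1+a) dR.
        return math.sqrt(h * sum(f[i] ** 2 * pts[i] ** (1 + a) for i in range(n)))

    return norm_a(Kg), math.sqrt(M1 * M2) * norm_a(gv)

lhs, rhs = schur_bound_check(sample_kernel, lambda rho: 1.0, a=0.5)
print(lhs <= rhs)
```

The weight a enters only through the measure $R^{1+a}\,dR$ and the factor $(R/\rho)^{1+a}$ in $M_1$, which is the same bookkeeping as in inequality (1).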
3.1.2. Some functions on the product of intervals (0, 1) × (0, 1). Let χP denote the characteristic function of the subset of (R, ρ) ∈ (0, 1)2 determined by the property P . For example, χ0
Proof. We have the following formula:
\[ \int_0^1 K^+_{\delta,k} \cdot (R/\rho)^{1+a}\, d\rho = \begin{cases} (-k-a)^{-1} \cdot \bigl( R^{k+1+a} - R \bigr) & (k + a \neq 0), \\ -R \cdot \log R & (k + a = 0). \end{cases} \qquad (2) \]
The claim immediately follows from (2).
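Lemma 3.3. Let k be any integer such that $k \ge -1$. Let $\delta$ and $\delta'$ be any positive numbers such that $0 < \delta' < \delta < 1$. Then there exists a positive constant C, which is independent of k, $\delta$ and $\delta'$, such that the following holds:
\[ M_2\bigl(K^+_{\delta,k}\bigr) \le C \cdot (1 + |k|)^{-1}, \qquad M_2\bigl(K^+_{\delta,k} - K^+_{\delta',k}\bigr) \le C \cdot |\delta - \delta'| \cdot (1 + |k|)^{-1}. \]
Proof. We have only to see the second inequality. We put $\rho' := \min\{\rho, \delta\}$. Then we have the following:
\[ \int_0^1 \bigl| K^+_{\delta,k} - K^+_{\delta',k} \bigr|\, dR = \int_{\delta'}^{\rho'} (R/\rho)^k\, dR = \begin{cases} (k+1)^{-1} \cdot \rho^{-k} \cdot \bigl( \rho'^{\,k+1} - \delta'^{\,k+1} \bigr) & (k + 1 > 0), \\ \rho \cdot \bigl( \log \rho' - \log \delta' \bigr) & (k + 1 = 0). \end{cases} \qquad (3) \]
The second inequality immediately follows from (3).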
Let k be any integer, and δ be any positive number. Let us consider the following functions: k − := R/ρ · χδ<ρ
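\[ M_1\bigl(K^-_{\delta,k} - K^-_{\delta',k}\bigr) \le C \cdot |\delta - \delta'| \cdot (1 + |k|)^{-1}. \]
Proof. We have only to see the second inequality. We put $R' := \min\{R, \delta\}$. Then we have the following equality:
\[ \int_0^1 \bigl| K^-_{\delta,k} - K^-_{\delta',k} \bigr| \cdot (R/\rho)^{1+a}\, d\rho = \begin{cases} (-k-a)^{-1} \cdot R^{-k-a+1} \cdot \bigl( R'^{\,-k-a} - \delta'^{\,-k-a} \bigr) & (k + a < 0), \\ R \cdot \bigl( \log R' - \log \delta' \bigr) & (k + a = 0). \end{cases} \qquad (4) \]
The second inequality immediately follows from the formula (4).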
Similarly, the following lemma can be shown elementarily.
Lemma 3.5. Let k be any integer such that $k \le 0$, and let $\delta$ be any positive number. Then there exists a positive constant C, which is independent of k and $\delta$, such that the following holds:
\[ M_2\bigl(K^-_{\delta,k}\bigr) \le C \cdot (1 + |k|)^{-1}. \]
In particular, we have $M_2\bigl(K^-_{\delta,k} - K^-_{\delta',k}\bigr) \le 2C \cdot (1 + |k|)^{-1}$.
3.1.3. Some operators on the interval (0, 1). Let a be a real number such that $-2 \le a < 2$. Let $L^2(a)$ denote the space of $L^2$-functions on the interval (0, 1) with respect to the measure $R^{1+a}\, dR$. Let us consider the integral operators $\tilde{K}^+_{\delta,k}$ and $\tilde{K}^-_{\delta,k}$ induced by $K^+_{\delta,k}$ and $K^-_{\delta,k}$.
Lemma 3.6.
• In the case $k \ge -a/2$, $\tilde{K}^+_{\delta,k}$ gives the bounded operator on $L^2(a)$. There exists a positive constant C, which is independent of a choice of any positive number $\delta$ and any integer k as above, such that the operator norm of $\tilde{K}^+_{\delta,k}$ is smaller than $C \cdot (1 + |k|)^{-1}$.
• In the case $k < -a/2$, $\tilde{K}^-_{\delta,k}$ gives the bounded operator on $L^2(a)$. There exists a positive constant C, which is independent of a choice of any positive number $\delta$ and any integer k as above, such that the operator norm of $\tilde{K}^-_{\delta,k}$ is smaller than $C \cdot (1 + |k|)^{-1}$.
Proof. In the case $k \ge -a/2$, we have $k \ge -a/2 > -1$, which implies $k \ge 0$. We also have $k + 1 + a \ge -a/2 + 1 + a = 1 + a/2 \ge 0$. Thus the first claim immediately follows from Lemma 3.1, Lemma 3.2 and Lemma 3.3. In the case $k < -a/2$, we have $k < -a/2 \le 1$, which implies $k \le 0$. If we have $k \le -1$, then $k + a < -1 + 1 = 0$. If we have $k = 0$, then we have $0 < -a/2$, which implies $a < 0$. Hence we have $k + a < 0 + 0 = 0$. Then the second claim immediately follows from Lemma 3.1, Lemma 3.4 and Lemma 3.5.
Similarly, the following lemma follows from Lemmas 3.1, 3.2, 3.3, 3.4 and 3.5.
Lemma 3.7.
• In the case $k \ge -a/2$, there exists a positive constant C, which is independent of a choice of k, $\delta$ and $\delta'$, such that the operator norm of $\tilde{K}^+_{\delta,k} - \tilde{K}^+_{\delta',k}$ is smaller than $C \cdot |\delta - \delta'|^{1/2} \cdot (1 + |k|)^{-1}$.
• In the case $k < -a/2$, there exists a positive constant C, which is independent of a choice of k, $\delta$ and $\delta'$, such that the operator norm of $\tilde{K}^-_{\delta,k} - \tilde{K}^-_{\delta',k}$ is smaller than $C \cdot |\delta - \delta'|^{1/2} \cdot (1 + |k|)^{-1}$.
3.2. Preliminary for the $\bar\partial$-operator on annuli.
3.2.1. The parametrices on annuli. For any non-negative number $\delta$, let $D(\delta)$ denote the annulus $\{z \in \mathbb{C} \mid \delta < |z| < 1\}$. We denote the bundle of (0, 1)-forms by $\Omega^{0,1}$. Let us consider the line bundle F with a generator e on the disc $D(0)$. Let us consider the hermitian metric $|e|_a^2 = R^a$ of F. Let us take a $C^\infty$-measure of $D(0)$ and a $C^\infty$-metric of $\Omega^{0,1}$ on $D(0)$. Then we have the $L^2$-spaces $L^2(D(\delta), F, a)$ and $L^2(D(\delta), F \otimes \Omega^{0,1}, a)$ for any $0 \le \delta < 1$. We denote the norm by $\|\cdot\|_a$. The number a is called the weight. In this subsection, we fix a $C^\infty$-measure of the disc $D(0)$ for simplicity.
Remark 3.1. When we consider the smooth metric of F and a singular measure of the form $\phi \cdot R^{1+a}\, dR \cdot d\alpha$, we obtain the same topological vector space $L^2(D(\delta), F, a)$. Here $\phi$ denotes a positive $C^\infty$-function on $D(\delta)$. It will be convenient to consider $\Omega^{1,1}$-valued metrics of a vector bundle V, which is a hermitian pairing $h : V \otimes V \to \Omega^{1,1}$ such that $h(v, v)$ is a positive (1, 1)-form. Then we can say that we consider the $\Omega^{1,1}$-valued metric of F which is of the form $R^a \cdot h_0$,
where $h_0$ denotes a $C^\infty$ $\Omega^{1,1}$-valued metric on $D(0)$. When we fix the coordinate, the weight, and the frame e, then an $\Omega^{1,1}$-valued metric corresponds to a positive function.
For any $L^2$-section f of $F \otimes \Omega^{0,1}$, we have the Fourier development $f = \sum f_k \cdot e^{\sqrt{-1} k \cdot \alpha}$. Here $f_k$ are elements of $L^2(a) = L^2\bigl((0, 1), R^{1+a}\, dR\bigr)$. We put as follows:
\[ Q_\delta(f) := \sum_{k \ge -a/2} \tilde{K}^+_{\delta,k}(f_k) \cdot e^{\sqrt{-1} k \cdot \alpha} + \sum_{k < -a/2} \tilde{K}^-_{\delta,k}(f_k) \cdot e^{\sqrt{-1} k \cdot \alpha}. \qquad (5) \]
Lemma 3.8.
• For any $0 \le \delta < 1$, $Q_\delta$ gives a compact operator of $L^2(D(\delta), F \otimes \Omega^{0,1}, a)$ to $L^2(D(\delta), F, a)$.
• The operator norms of $Q_\delta$ are dominated by a constant, which is independent of a choice of $\delta$.
• $\bar\partial Q_\delta(f) = f$.
Proof. The first and the second claims immediately follow from Lemma 3.6. The third claim can be checked by a formal calculation.
We have the natural inclusion $D(\delta) \subset D(0)$. It induces the isometric embeddings:
\[ \iota_\delta : L^2(D(\delta), F, a) \oplus L^2(D(\delta), F \otimes \Omega^{0,1}, a) \to L^2(D(0), F, a) \oplus L^2(D(0), F \otimes \Omega^{0,1}, a). \]
Then the operators $Q_\delta$ induce the family of the operators $\{Q_\delta \mid 0 \le \delta < 1\}$ of $L^2(D(0), F, a)$ to $L^2(D(0), F \otimes \Omega^{0,1}, a)$. Note we have $Q_0 = Q_0$.
Lemma 3.9. The family $\{Q_\delta \mid 0 \le \delta < 1\}$ is continuous with respect to the variable $\delta$ and the operator norms.
Proof. It immediately follows from Lemma 3.7.
Let us take a continuous family of diffeomorphisms $\eta_\delta : D(0) \to D(\delta)$, for example, $\eta_\delta(R, \alpha) = \bigl((1 - \delta) \cdot R + \delta, \alpha\bigr)$. We take the bundle map $\eta^* F \to F$ by $\eta^* e \mapsto e$. Then we obtain the linear map $\eta^* : C^\infty(D(\delta), F) \to C^\infty(D(0), F)$. Then there exists the unique positive function $g_\delta$ such that $g_\delta \cdot \eta^*$ preserves the hermitian products, and hence we obtain the unitary isomorphism $P_\delta : L^2(D(\delta), F, a) \to L^2(D(0), F, a)$. Similarly, we obtain the unitary isomorphism $P_\delta : L^2(D(\delta), F \otimes \Omega^{0,1}, a) \to L^2(D(0), F \otimes \Omega^{0,1}, a)$. Then we obtain the family of the isometric embeddings $U_\delta := \iota_\delta \circ P_\delta^{-1} : L^2(D(0), F \otimes \Omega^{0,1}, a) \to L^2(D(0), F \otimes \Omega^{0,1}, a)$. It is easy to see that $U_\delta$ strongly converges to $U_{\delta_0}$ when $\delta \to \delta_0$.
We obtain the family of the bounded operators $\tilde{Q}_\delta := U_\delta^* \circ Q_\delta \circ U_\delta$. Then we obtain the following lemma by the general argument due to Seely-Singer [11].
Lemma 3.10. The family $\{\tilde{Q}_\delta \mid 0 \le \delta < 1\}$ is continuous with respect to the parameter $\delta$ and the operator norms.
Proof. We see the continuity at $\delta = 0$. The others can be seen similarly. We know the continuity of the family $Q_\delta$ with respect to the parameter $\delta$ and the operator norms. Note the equality $Q_0 = \tilde{Q}_0$. Since it is compact, we can approximate $Q_0$ by an operator F of finite rank. Let $\epsilon$ be a positive number. By the continuity of the family $Q_\delta$ at $\delta = 0$, there are a small number $\delta_0$ and an operator F of finite rank such that the norm of the operator $F - Q_\delta$ is less than $\epsilon$ for any $\delta < \delta_0$. We have the equality
\[ \tilde{Q}_\delta - \tilde{Q}_0 = \bigl( U_\delta^* \cdot F \cdot U_\delta - F \bigr) + U_\delta^* \cdot \bigl( Q_\delta - F \bigr) \cdot U_\delta + \bigl( F - Q_0 \bigr). \]
Due to the strong continuity of the family $\{U_\delta\}$, there is a positive number $\delta_1$ such that the norms of the operators $U_\delta^* \cdot F \cdot U_\delta - F$ are less than $\epsilon$ for any $\delta < \delta_1$. On the other hand, since the multiplication by the operators $U$ and $U^*$ does not increase the norm, we obtain the desired continuity.
3.2.2. The value at the origin in the case $\delta = 0$. Let us consider the case $\delta = 0$. We have a straightforward generalization of Lemma 2 in [11]. We put $2 \cdot \mathbb{Z} := \{2 \cdot n \mid n \in \mathbb{Z}\}$. For any real number $a \in \mathbb{R} - 2 \cdot \mathbb{Z}$, we put as follows:
\[ \tilde{a} = \min\{k \mid k \in 2\mathbb{Z},\ k - a > 0\}, \qquad \bar{a} = \tilde{a} - a. \]
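As a small illustration of the two numbers just defined, the sketch below computes them with a floor function. The function names are hypothetical and not from the paper. It also checks the identity $\widetilde{(-a)} = 2 - \tilde{a}$ for a outside $2\mathbb{Z}$, which is the exponent bookkeeping behind the factor $z^{-\tilde{a}/2+1}$ used in Subsubsect. 3.2.3.

```python
import math

def a_tilde(a):
    """Smallest even integer k with k - a > 0; a must not be an even integer."""
    if a % 2 == 0:
        raise ValueError("a must lie outside 2Z")
    return 2 * math.floor(a / 2) + 2

def a_bar(a):
    """The difference a_tilde(a) - a, which lies in the open interval (0, 2)."""
    return a_tilde(a) - a

for a in (-1.5, -0.5, 0.5, 1.5):
    # For 0 < a < 2 the first value is 2, for -2 < a < 0 it is 0.
    print(a, a_tilde(a), a_bar(a))
```

The floor-based formula is just a closed form of the minimum in the definition; any equivalent search over even integers would do.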
Lemma 3.11. Let a be an element of $\mathbb{R} - 2 \cdot \mathbb{Z}$. Suppose that $u = g \cdot e$ and $\partial u / \partial \bar{z}$ are $L^2$ with respect to the norm $\|\cdot\|_a$. Then we have the value $A = (z^{\tilde{a}/2} g)(0)$. We also have the estimate $\bigl| u(R) - z^{-\tilde{a}/2} A \bigr| = O(R^{-a/2})$ when $R \to 0$.
Proof. We have only to see the case $-2 < a < 0$. Assume that f is contained in $L^2(a)$. It is easy to show the existence of a positive constant C, which is independent of R, f and k, such that the following holds:
\[ \Bigl| R^k \cdot \int_0^R \rho^{-k} f \Bigr| \le C \cdot R^{-a/2} \cdot \|f\| \quad (k \le 0), \qquad \Bigl| R^k \cdot \int_1^R \rho^{-k} f \Bigr| \le C \cdot R^{-a/2} \cdot \|f\| \quad (k > 0). \qquad (6) \]
Assume that u and $f = \partial u / \partial \bar{z}$ are $L^2$. Then the Fourier coefficient $u_k$ of u is of the following form: $u_k(R) = R^k \cdot \bigl( \int_1^R \rho^{-k} f_k + c_k \bigr)$ in the case $k > 0$, $u_k(R) = R^k \cdot \int_0^R \rho^{-k} f_k$ in the case $k < 0$, and $u_0(R) = \int_0^R f_0 + c_0$. Here $f_k$ denote the Fourier coefficients of f. Then the claim follows from the estimate (6).
Let a be a real number such that $0 < a < 2$. Let f be an $L^2$-section of $F \otimes \Omega^{0,1}$ with the weight a. Then we have the $L^2$-section $Q_0(f)$ of F. Since $\tilde{a} = 2$, we have the value $\bigl(z \cdot Q_0(f)\bigr)(0)$.
Lemma 3.12. We have the vanishing $\bigl(z \cdot Q_0(f)\bigr)(0) = 0$.
Proof. We have only to see that the value of $\tilde{K}^-_{\delta,-1}(f_{-1})$ at $R = 0$ is 0, which can be checked directly.
3.2.3. The r-spin bundle. Let $F(l)$ denote the r-spin bundle on $D(\delta)$ with the generator $e(l) := \bigl(z^l \cdot dz/z\bigr)^{1/r}$. We take $e^\vee(l) := e(l)^{-1} \otimes dz$ as a generator of the line bundle $L(l) := F(l)^{-1} \otimes \omega$. We denote the $\bar\partial$-operators of $F(l)$ and $L(l)$ by $\bar\partial$ and $\bar\partial^{dual}$ to distinguish them. Then we can apply the generality in Subsubsect. 3.2.1 to $\bar\partial$ and $\bar\partial^{dual}$. Since we have the naturally defined pairings $F(l) \otimes L(l) \otimes \Omega^{0,1} \to \Omega^{1,1}$ and $F(l) \otimes \Omega^{0,1} \otimes L(l) \to \Omega^{1,1}$, the $\Omega^{1,1}$-hermitian metrics of $F(l)$ and $F(l) \otimes \Omega^{0,1}$ induce the $\Omega^{1,1}$-hermitian metrics of $L(l) \otimes \Omega^{0,1}$ and $L(l)$ respectively. When the weight for $F(l)$ is a, we take $-a$ as the weight for $F(l)^{-1} \otimes \omega$. Then we obtain the family of parametrices for both of $\bar\partial$ and $\bar\partial^{dual}$ in the case $-2 < a < 2$.
(Formal adjointness). We have the natural pairing $F(l) \otimes \Omega^{0,1} \otimes F(l)^{-1} \otimes \omega \to \Omega^{1,1}$. It induces the following perfect pairings:
\[ L^2\bigl(D(\delta), F(l) \otimes \Omega^{0,1}, a\bigr) \otimes L^2\bigl(D(\delta), F(l)^{-1} \otimes \omega, -a\bigr) \to \mathbb{C}, \]
\[ L^2\bigl(D(\delta), F(l), a\bigr) \otimes L^2\bigl(D(\delta), F(l)^{-1} \otimes \omega \otimes \Omega^{0,1}, -a\bigr) \to \mathbb{C}. \]
We regard them as mutually dual spaces. If we fix an appropriate metric of $F(l)$ with weight a, then we obtain the anti-Hermitian isomorphisms:
\[ L^2\bigl(D(\delta), F(l) \otimes \Omega^{0,1}, a\bigr) \simeq L^2\bigl(D(\delta), F(l)^{-1} \otimes \omega, -a\bigr), \qquad L^2\bigl(D(\delta), F(l), a\bigr) \simeq L^2\bigl(D(\delta), F(l)^{-1} \otimes \omega \otimes \Omega^{0,1}, -a\bigr). \]
Via the isomorphisms, $\bar\partial$ and $-\bar\partial^{dual}$ are mutually formal adjoint.
Remark 3.2. We have the canonical isomorphism $L^2\bigl(D(\delta), F(0)^{-1} \otimes \omega(O), -1\bigr) \simeq L^2\bigl(D(\delta), F(0)^{-1} \otimes \omega, 1\bigr)$. Thus we can use $L^2\bigl(D(\delta), F(0)^{-1} \otimes \omega(O), -1\bigr)$ for our argument. It may be convenient for the reader to compare it with the argument in [11]. Anyway, it does not matter which we choose.
(Contribution of the origin to the integral). Let us consider the case $\delta = 0$. Let a be any real number satisfying $-2 < a < 2$ and $a \neq 0$. We take a and $-a$ as the weights of $F(l)$ and $F(l)^{-1} \otimes \omega$ respectively. Let $u = f \cdot e(l)$ and $v = g \cdot e^\vee(l)$ be $L^2$-sections of $F(l)$ and $F(l)^{-1} \otimes \omega$ respectively, such that $\bar\partial(u)$ and $\bar\partial(v)$ are also $L^2$. In that case, we have the values $\bigl(z^{\tilde{a}/2} \cdot f\bigr)(0)$ and $\bigl(z^{\widetilde{(-a)}/2} \cdot g\bigr)(0) = \bigl(z^{-\tilde{a}/2+1} \cdot g\bigr)(0)$ due to Lemma 3.11. Let us consider the naturally defined pairing $\langle u, v \rangle$, which is an $L^2$-section of $\omega$ such that $\bar\partial \langle u, v \rangle$ is $L^1$. We have the value $\bigl(z \cdot \langle u, v \rangle\bigr)(0)$, which is given by the product $\bigl(z^{\tilde{a}/2} \cdot f\bigr)(0) \times \bigl(z^{-\tilde{a}/2+1} \cdot g\bigr)(0)$. The integral $\int \bar\partial \langle u, v \rangle$ is naturally defined. It is easy to see that the contribution of the origin (the singular point) to the Stokes formula is given by the value $\bigl(z \cdot \langle u, v \rangle\bigr)(0)$. As a result, we obtain the following lemma.
Lemma 3.13. Let u and v be as above. The contribution of the origin to the Stokes formula for $\int \bar\partial \langle u, v \rangle$ is given by the product $\bigl(z^{\tilde{a}/2} \cdot f\bigr)(0) \times \bigl(z^{-\tilde{a}/2+1} \cdot g\bigr)(0)$.
3.3. Preliminary for the $\bar\partial$-operator around a nodal point.
3.3.1. Preliminary. Let us consider the nodal curve $X_0 = \{(z, w) \in \mathbb{C}^2 \mid zw = 0,\ |z| < 1,\ |w| < 1\}$. We put $X_0^z = X_0 \cap \{w = 0\}$ and $X_0^w = X_0 \cap \{z = 0\}$. Let O denote the origin (0, 0). We often use the real coordinates $z = R \cdot e^{\sqrt{-1}\alpha}$ and $w = S \cdot e^{\sqrt{-1}\beta}$. Let $F_0$ be an r-spin structure on the nodal curve. Recall that we have the following two cases:
1. Neveu-Schwarz case. The sheaf $F_0$ is a direct sum $O_{X_0^z} \cdot e(l)_z \oplus O_{X_0^w} \cdot e(r-l)_w$, where $e(l)_z$ and $e(r-l)_w$ satisfy $e(l)_z^r = z^l \cdot dz/z$ and $e(r-l)_w^r = w^{r-l} \cdot dw/w$. Recall that the index at the origin O is defined to be $(l-1, r-l-1)$.
2. Ramond case. The sheaf $F_0$ is locally free, and it is the quotient of $O_{X_0^z} \cdot e(0)_z \oplus O_{X_0^w} \cdot e(0)_w$ divided by the relation $(-1)^{1/r} e(0)_z = e(0)_w$ at the origin O. Here $e(0)_z$ and $e(0)_w$ satisfy $e(0)_z^r = dz/z$ and $e(0)_w^r = dw/w$. Recall that the index at the origin O is defined to be (−1, −1).
Let $(m_1, m_2)$ denote the index of the r-spin structure $F_0$ at the origin O. We put $e := \gcd(r, m_1 + 1)$, and we put as follows:
\[ X := \{(z, w, t) \mid zw = t^e,\ |z| < 1,\ |w| < 1\} \subset B^3 := \{(z, w, t) \mid |z| < 1,\ |w| < 1,\ |t| < 1\}. \]
We have the naturally defined morphism $X \to B := \{t \in \mathbb{C} \mid |t| < 1\}$. The fiber over t is given by $X_t$. Hence we obtain the family of the curves whose special fiber at $t = 0$ is the nodal curve $X_0$ above. Recall that we have a sheaf F on X, which is a universal deformation of $F_0$. We denote $F_{|X_t}$ by $F_t$. We divide X into $X^z \cup X^w$:
\[ X^z := \overline{\{(z, w, t) \in X \mid |z| > |w|\}}, \qquad X^w := \overline{\{(z, w, t) \in X \mid |z| < |w|\}}. \]
Here the overline denotes the closure in X. We put $X_t^z := X_t \cap X^z$ and $X_t^w := X_t \cap X^w$. For any t such that $|t| < 1$, we put $\delta(t) := |t|^{e/2}$. Then $X_t^z$ and $X_t^w$ are naturally isomorphic to $D(\delta(t))$. For simplicity, we often use $\delta$ instead of $\delta(t)$. We can also take a continuous family of diffeomorphisms $\eta_t^z : X_0^z \to X_t^z$ and $\eta_t^w : X_0^w \to X_t^w$.
3.3.2. The line bundles on curves. The sheaf F naturally induces the line bundles $F_t^z$ and $F_t^w$ on $X_t^z$ and $X_t^w$ respectively. It also induces the line bundles over $X_t^z$ and $X_t^w$, which are also denoted by $F_t^z$ and $F_t^w$ for simplicity. We can take the trivialization $e(m_1 + 1)_z := \bigl(z^{m_1+1} \cdot dz/z\bigr)^{1/r}$ of $F_t^z$. Similarly we can take the trivialization $e(m_2 + 1)_w := \bigl(w^{m_2+1} \cdot dw/w\bigr)^{1/r}$ of $F_t^w$. Clearly, the line bundles $F_t^z$ and $F_t^w$ with the generators naturally induce the r-spin structure $F_t$ on $X_t$ in the case $t \neq 0$.
The dual bundles are considered as follows: $L_t^z$ denotes $(F_t^z)^{-1} \otimes \omega$ on $X_t^z$, and the frame $e^\vee(m_1 + 1)_z$ is given by $e(m_1 + 1)_z^{-1} \otimes dz$. Note that the family of line bundles $\{L_t^z \mid |t| < 1\}$ naturally gives the line bundle on $X^z$, and $e^\vee(m_1 + 1)_z$ gives the frame over $X^z$. The dual bundle $L^w$ in the w-side is considered similarly.
3.3.3. The $L^2$-spaces. Here we introduce the following number for any integer l such that $0 \le l < r$:
\[ \epsilon(l) := \frac{2l - r}{r}. \qquad (7) \]
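The number defined in (7) is simple enough to tabulate exactly. The sketch below, with hypothetical function names, checks the two elementary properties recorded in Lemma 3.14 for small values of r, using exact rational arithmetic. It is an illustration only and not a substitute for the one-line proof.

```python
from fractions import Fraction

def weight(l, r):
    """The number (2l - r)/r of (7), as an exact fraction."""
    return Fraction(2 * l - r, r)

def check_lemma_3_14(r):
    """Check the value at l = 0, the open bounds for l != 0, and the antisymmetry."""
    assert weight(0, r) == -1
    for l in range(1, r):
        assert -1 < weight(l, r) < 1
        assert weight(l, r) + weight(r - l, r) == 0
    return True

print(all(check_lemma_3_14(r) for r in range(1, 30)))
```

In the Neveu-Schwarz case with index $(l-1, r-l-1)$ the two sides of the node carry the weights `weight(l, r)` and `weight(r - l, r)`, which cancel.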
The following lemma can be checked easily.
Lemma 3.14. Let $\epsilon(l) = (2l - r)/r$ be the number of (7).
• We have $-1 < \epsilon(l) < 1$ in the case $l \neq 0$, and $\epsilon(0) = -1$.
• We have $\epsilon(l) + \epsilon(r - l) = 0$.
The weights of $F_t^z$ and $L_t^z$ are defined to be $\epsilon(m_1 + 1)$ and $-\epsilon(m_1 + 1)$ respectively. Similarly, the weights of $F_t^w$ and $L_t^w$ are defined to be $\epsilon(m_2 + 1)$ and $-\epsilon(m_2 + 1)$ respectively.
Let us take positive functions $\phi^z$ and $\phi^w$ on $B^3$. Then the restrictions of $\phi^z$ to $X_t^z$ give the $\Omega^{1,1}$-valued metrics on $X_t^z$. (See Remark 3.1.) Similarly the positive function $\phi^w$ induces the $\Omega^{1,1}$-valued metrics on $X_t^w$.
Then we obtain the families of topological spaces $\{L^2(X_t^z, F^z, \epsilon(m_1 + 1)) \mid t \in B\}$, $\{L^2(X_t^w, F^w, \epsilon(m_2 + 1)) \mid t \in B\}$, etc. as in Subsubsect. 3.2.3. Note the following easy lemma.
Lemma 3.15. For any $t \neq 0$, the topological space $L^2(X_t^z, F^z, \epsilon(m_1 + 1)) \oplus L^2(X_t^w, F^w, \epsilon(m_2 + 1))$ is naturally isomorphic to those obtained from the $C^\infty$-metrics and the $C^\infty$-measure on $X_t$, which we denote by $L^2(X_t, F)$. We also have the natural isomorphism for others, similarly.
Remark 3.3. Due to Lemma 3.15, we can describe $L^2(X_t^z, F_t^z, \epsilon(m_1 + 1)) \oplus L^2(X_t^w, F_t^w, \epsilon(m_2 + 1))$ as $L^2(X_t, F_t)$. Therefore, we often denote $L^2(X_0^z, F_0^z, \epsilon(m_1 + 1)) \oplus L^2(X_0^w, F_0^w, \epsilon(m_2 + 1))$ by $L^2(X_0, F_0)$ for simplicity of the notation. We use the notation $L^2(X_t, F_t \otimes \Omega^{0,1})$, $L^2(X_t, L_t)$ and $L^2(X_t, L_t \otimes \Omega^{0,1})$ in a similar meaning.
Once we fix the metrics and measures, the diffeomorphisms $\eta_t^z : X_0^z \to X_t^z$ induce the unitary isomorphism $P_t^z : L^2(X_t^z, F^z, \epsilon(m_1 + 1)) \to L^2(X_0^z, F^z, \epsilon(m_1 + 1))$, as in Subsubsect. 3.2.1. Similarly we have $P_t^w : L^2(X_t^w, F^w, \epsilon(m_2 + 1)) \to L^2(X_0^w, F^w, \epsilon(m_2 + 1))$. Thus we have the isomorphisms $P_t^w \oplus P_t^z$ of $L^2(X_t, F_t)$ and $L^2(X_0, F_0)$. Clearly, we have the isomorphisms for others.
We obtain the family of parametrices $Q_t^z$ and $Q_t^w$ for the operators $\bar\partial$ and $\bar\partial^{dual}$ for the families $\{F_t^z \mid |t| < 1\}$ and $\{F_t^w \mid |t| < 1\}$ respectively. However, we remark that $Q_t^z \oplus Q_t^w$ does not give the parametrix for $F_t$ on $X_t$, because they are not glued at the cutting circle $X_t^z \cap X_t^w = \{(z, w, t) \mid |z| = |w|\}$. We modify them in the following subsubsections.
3.3.4. The matching condition at the cutting circle (Neveu-Schwarz case). Let us consider the Neveu-Schwarz case with the index $(l-1, r-l-1)$. On each curve $X_t$ ($t \neq 0$), we have the following relation:
\[ \frac{z^l\, dz}{z} = -t^{el} \cdot w^{-r} \cdot \frac{w^{r-l}\, dw}{w}. \]
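Hence we have the following relation on $X_t$:
\[ e(l)_z = (-1)^{1/r} \cdot t^{el/r} \cdot w^{-1} \cdot e(r-l)_w. \qquad (8) \]
Therefore we have the following relation on the circle $|z| = |w|$, for $t^e = \delta^2 \cdot e^{\sqrt{-1}\theta}$ and for $z = \delta \cdot e^{\sqrt{-1}\alpha}$ and $w = \delta \cdot e^{\sqrt{-1}\beta}$:
\[ e(l)_z = A \cdot e^{\sqrt{-1}\alpha} \cdot \delta^{\epsilon(l)} \cdot e(r-l)_w. \qquad (9) \]
Here A denotes the constant whose norm is 1, depending on $\theta$, l and r. Let $u(z) \cdot e(l)_z$ and $v(w) \cdot e(r-l)_w$ be $L^2_1$-sections on the curves $X_t^z$ and $X_t^w$ respectively, where the weights of $F_t^z$ and $F_t^w$ are $\epsilon(l)$ and $\epsilon(r-l)$ respectively. Note that $u(z) \cdot e(l)_z \oplus v(w) \cdot e(r-l)_w$ gives the $L^2_1$-section on $X_t$ if and only if we have the relation $u(z) \cdot e(l)_z = v(w) \cdot e(r-l)_w$ on the cutting circle $|z| = |w|$. The latter gives the matching condition:
\[ u(\delta, \alpha) \cdot A \cdot e^{\sqrt{-1}\alpha} \cdot \delta^{\epsilon(l)} = v(\delta, \beta). \qquad (10) \]
Here A is the constant in (9). For the Fourier expansions $u = \sum u_k(R) \cdot e^{\sqrt{-1}k \cdot \alpha}$ and $v = \sum v_k(S) \cdot e^{\sqrt{-1}k \cdot \beta}$, Eq. (10) can be reworded as follows:
\[ v_k(\delta) = A_k \cdot \delta^{\epsilon(l)} \cdot u_{-k-1}(\delta). \qquad (11) \]
Here we put $A_k := e^{-\sqrt{-1}\theta(l/r - 1 - k)} (-1)^{1/r}$, which is a constant whose norm is 1. Similarly, we also have the relation $u_k(\delta) = \delta^{\epsilon(r-l)} \cdot A_k \cdot v_{-k-1}(\delta)$.
Let us consider the matching condition of the following two functions on $X_t^z$ and $X_t^w$ respectively:
\[ Q_t\bigl(f \cdot e(l)_z\, d\bar{z}\bigr) + \sum_{k \le -\epsilon(l)/2 - 1} c_k \cdot R^k \cdot e^{\sqrt{-1}k\alpha} \cdot e(l)_z, \qquad Q_t\bigl(g \cdot e(r-l)_w\, d\bar{w}\bigr) + \sum_{k \le -\epsilon(r-l)/2 - 1} d_k \cdot S^k \cdot e^{\sqrt{-1}k\beta} \cdot e(r-l)_w. \]
Then we obtain the following equalities:
\[ c_k = A_k \cdot \delta^{-\epsilon(l) - 2k - 1} \int_1^\delta \sigma^{k+1} \cdot g_{-k-1} \cdot d\sigma \quad (k \le -\epsilon(l)/2 - 1), \]
\[ d_k = A_k \cdot \delta^{-\epsilon(r-l) - 2k - 1} \int_1^\delta \sigma^{k+1} \cdot f_{-k-1} \cdot d\sigma \quad (k \le -\epsilon(r-l)/2 - 1). \]
Thus we put as follows for any $t \neq 0$:
\[ R_t\bigl(g \cdot e(r-l)_w \cdot d\bar{w}\bigr) := \sum_{k \le -\epsilon(l)/2 - 1} \Bigl( A_k \cdot \delta^{-\epsilon(l) - 2k - 1} \int_1^\delta \sigma^{k+1} \cdot g_{-k-1} \cdot d\sigma \Bigr) \times R^k \cdot e^{\sqrt{-1}k\alpha} \cdot e(l)_z, \]
\[ R_t\bigl(f \cdot e(l)_z \cdot d\bar{z}\bigr) := \sum_{k \le -\epsilon(r-l)/2 - 1} \Bigl( A_k \cdot \delta^{-\epsilon(r-l) - 2k - 1} \int_1^\delta \sigma^{k+1} \cdot f_{-k-1} \cdot d\sigma \Bigr) \times S^k \cdot e^{\sqrt{-1}k\beta} \cdot e(r-l)_w. \qquad (12) \]
We put $R_0 = 0$.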
Lemma 3.16. For any t, $R_t$ gives a compact operator.
Proof. There exists a positive constant C such that the following inequality holds, for any positive number $\delta < 1$ and any integer $k < -1$:
\[ \Bigl( \int_\delta^1 r^{2k}\, r^{1+\epsilon(l)}\, dr \Bigr)^{1/2} \le C \cdot \delta^{k+1+\epsilon(l)/2}\, |k|^{-1/2}. \]
We also have a positive constant $C'$ such that the following inequality holds for any integer $k < -1$, a positive number $\delta < 1$ and any function g:
\[ \Bigl| \delta^{-\epsilon(l) - 2k - 1} \int_1^\delta \sigma^{k+1} \cdot g_{-k-1} \cdot d\sigma \Bigr| \le C' \cdot \delta^{k + 3/2 - (r-l)/r + (r-2l)/r - 2k - 1} \times \Bigl( \int |g_{-k-1}|^2 \cdot \sigma^{1+\epsilon(r-l)} \cdot d\sigma \Bigr)^{1/2} \cdot |k|^{-1/2} \]
\[ = C' \cdot \delta^{-k + 1/2 - l/r} \times \Bigl( \int |g_{-k-1}|^2 \cdot \sigma^{1+\epsilon(r-l)} \cdot d\sigma \Bigr)^{1/2} \cdot |k|^{-1/2}. \qquad (13) \]
Since we have the equality $\bigl(k + 1 + \epsilon(l)/2\bigr) + \bigl(-k + 1/2 - l/r\bigr) = 1$, the $L^2$-norm of the function $c_k \cdot r^k$ ($k < -1$) on the interval $(\delta, 1)$ with respect to the measure $r^{1+\epsilon(l)}\, dr$ is dominated by the following:
\[ C \cdot C' \cdot \delta\, |k|^{-1/2} \Bigl( \int |g_{-k-1}|^2 \cdot \sigma^{1+\epsilon(r-l)} \cdot d\sigma \Bigr)^{1/2}. \]
In the case $k = -1$, the $L^2$-norm of the function $c_{-1} \cdot r^{-1}$ is dominated by the following, up to some constant:
\[ \delta^{2 - 2l/r} \cdot \Bigl( \int_\delta^1 \sigma^{-2+2l/r}\, d\sigma \cdot \int |g_0|^2 \cdot \sigma^{1+\epsilon(r-l)}\, d\sigma \Bigr)^{1/2}. \]
It is dominated by $\delta^{2/r} \cdot \bigl( \int |g_0|^2 \sigma^{1+\epsilon(r-l)}\, d\sigma \bigr)^{1/2}$. Thus the first sum in (12) is absolutely convergent, which implies the compactness of the operators $R_t$.
3.3.5. Norm-continuous family of parametrices (in the Neveu-Schwarz case). Recall Lemma 3.15. For a pair $(f, g) \in L^2(X_t^z, F^z \otimes \Omega^{0,1}, \epsilon(l)) \oplus L^2(X_t^w, F^w \otimes \Omega^{0,1}, \epsilon(r-l))$, we put as follows:
\[ \tilde{Q}_t(f, g) := \bigl( Q_t(f) + R_t(g),\ Q_t(g) + R_t(f) \bigr) \in L^2(X_t^z, F^z, \epsilon(l)) \oplus L^2(X_t^w, F^w, \epsilon(r-l)). \]
It satisfies $\bar\partial \tilde{Q}_t(f, g) = (f, g)$ and the matching condition at the cutting circle. Then we obtain the parametrices $\tilde{Q}_t : L^2(X_t, F_t \otimes \Omega^{0,1}) \to L^2(X_t, F_t)$ for $\bar\partial$.
Recall that we have the isomorphism $P_t$ of $L^2(X_t, F)$ and $L^2(X_0^z, F_0)$ etc., once we take appropriate metrics. The family of the operators $\tilde{Q}_t$ induces the family of the operators of $L^2(X_t, F_t)$ to $L^2(X_0, F_0)$, which is also denoted by $\tilde{Q}_t$.
Lemma 3.17. The family of the operators $\{\tilde{Q}_t \mid |t| < 1\}$ is norm continuous, and $\tilde{Q}_0$ is the same as the operator constructed in Subsubsect. 3.2.1.
Proof. We have only to check the norm continuity of the family $R_t$. It immediately follows from the absolute convergence of the sums in (12). See the proof of Lemma 3.16.
Let us consider the parametrices for the operators $\bar\partial^{dual}$ of $L_t$. We have the following relation on the circle $|z| = |w|$ in $X_t$ ($t \neq 0$):
\[ e^\vee(l)_z = -A \cdot e^{\sqrt{-1}\alpha} \cdot \delta^{-\epsilon(l)} \cdot e^\vee(r-l)_w. \]
Then by the same construction as explained above, we obtain the norm continuous family of the parametrices for $\bar\partial^{dual}$:
\[ \tilde{Q}_t^{dual} : L^2(X_t, L_t \otimes \Omega^{0,1}) \to L^2(X_t, L_t). \]
And $\tilde{Q}_0^{dual}$ is the same as the operator constructed in Subsubsect. 3.2.1.
3.3.6. The Ramond case. Let us consider the Ramond case. In this case, it will be convenient to use the frames $z^{-1} \cdot e^\vee(0)_z = (dz/z)^{1-1/r}$ and $w^{-1} \cdot e^\vee(0)_w = (dw/w)^{1-1/r}$ as in [11]. Note that they are $L^2$-sections in the case $t = 0$, for the weights of $L_0^z$ and $L_0^w$ are 1. Clearly they give the $L^2$-sections in the case $t \neq 0$, too. We have the following relation:
\[ e(0)_z = (-1)^{1/r} e(0)_w, \qquad z^{-1} \cdot e^\vee(0)_z = (-1)^{1-1/r}\, w^{-1} \cdot e^\vee(0)_w. \qquad (14) \]
By a similar construction to that in the Neveu-Schwarz case, we obtain the continuous family of parametrices for the operators $\bar\partial$ and $\bar\partial^{dual}$:
\[ \tilde{Q}_t : L^2(X_t, F_t \otimes \Omega^{0,1}) \to L^2(X_t, F_t), \qquad \tilde{Q}_t^{dual} : L^2(X_t, L_t \otimes \Omega^{0,1}) \to L^2(X_t, L_t). \]
And $\tilde{Q}_0$ and $\tilde{Q}_0^{dual}$ are the same as the operators constructed in 3.2.1. See [11] for more detail in the Ramond case.
3.3.7. The matching condition at the origin and the vanishing. Let us consider the operators $\bar\partial$ and $\bar\partial^{dual}$ of $F_0$ and $L_0$. To make them mutually adjoint, we impose the matching conditions. Recall Lemmas 3.11, 3.12 and Subsubsect. 3.2.3.
Condition 3.1. The domain of $\bar\partial$ is the subspace of elements $(u, v)$ of $L^2(X_0^z, F_0^z, \epsilon(m_1 + 1)) \oplus L^2(X_0^w, F_0^w, \epsilon(m_2 + 1))$ with $L^2$-derivatives $\bar\partial u$ and $\bar\partial v$ such that the following holds:
• In the Neveu-Schwarz case such that $\epsilon(m_1 + 1) > 0$, we impose the condition $(z \cdot u)(0) = 0$.
• In the Neveu-Schwarz case such that $\epsilon(m_2 + 1) > 0$, we impose the condition $(w \cdot v)(0) = 0$.
• In the Ramond case, we impose the condition $u(0) = v(0)$ via the gluing $F^z_{0|O} \simeq F^w_{0|O}$.
dual The domain of ∂ is the subspace of elements (u, v) of L2 X0z , Lz0 , −(m1 + 1) ⊕ dual dual 2 L2 X0w , Lw u and ∂ v such that the following 0 , −(m2 +1) with L -derivatives ∂ holds: • In the Neveu-Schwarz case such that −(m1 + 1) > 0, we impose the condition (z · u)(0) = 0. • In the Neveu-Schwarz case such that −(m2 + 1) > 0, we impose the condition (w · v)(0) = 0. • In the Ramond case, note that we have the isomorphism Lz0 (O)|O Lw 0 (O)|O . We also have the values (z · u)(O) ∈ Lz0 (O)|O and (w · v)(O) ∈ Lw (O) |O . Hence we 0 impose the condition z · u(0) = w · v(0) via the isomorphism. Lemma 3.18. Assume that (u1 , v1 ) ∈ L2 X0 , F0 and (u2 , v2 ) ∈ L2 X0 , L0 are condual
tained in the domains of ∂ and ∂ , and impose the matching condition as above. Then the contribution of the nodal point (0, 0) to the Stokes formula for the integral ¯ 1 · u2 ) + ∂(v ¯ 1 · v2 ) vanishes. ∂(u Proof. The claim in the Neveu-Schwarz caseimmediately follows from Lemma 3.13. ¯ 1 · u2 ) and ∂(v ¯ 1 · v2 ) vanish. In fact, both of the integrals ∂(u −1 ⊗ Let us consider the Ramond case. We consider the isomorphisms Lz0 (O) F0z w −1 z w ⊗ ωXw (O). We have the isomorphism f : F˜ 0 | O −→ ωXz (O) and L0 (O) F0 z −1 −1 w F˜ 0 | O , and thus g : F0 ⊗ ω(0)z|O −→ F0w ⊗ ω(0)w |O . The isomorphism f ⊗ g : C −→ C is −1, where we use the natural identifications ωXz (O)|O CωXw (O)|O C by the residues. Let (u1 , v1 ) and (u2 , v2 ) be elements of the L2,1 -sections of F0 and L0 . Then the matching conditions and the consideration above gives us the equality Res(u1 · u2 ) + Res(v1 · v2 ) = 0. Thus the contributions are canceled. 3.3.8. Change of norms. In the discussion above, we divide the curve Xt into Xtz and Xtw by cutting at the circle |z| = |w| for any t. We also take a tuple of positive functions φ = (φ z , φ w ). (See the Subsubsect. 3.3.3.) Let Nt (φ) denote the norm considered in this case. If we take a continuous family of the cutting circles ct and the other tuple of positive functions φ , we obtain the norm Nt (φ ). Lemma 3.19. Assume that ct is contained in the region C −1 |z| < |w| < C|z| for some constant C > 1. Then the families of the norms Nt (φ) and Nt (φ ) are mutually bounded independently of t, i.e., there is a constant C > 1, such that C −1 Nt (φ)(u) ≤ Nt (φ )(u) ≤ C Nt (φ)(u). Proof. Let f · e(l)z = g · e(r − l)w be a section of Ft . From the relation of the sections e(l)z and e(r − l)w , we obtain the relation |f | · |t|el/r · |w|−1 = |g|. Since z · w = t e , we obtain |f |2 · |z|2l/r = |g|2 · |w|2(r−l)/r . On the other hand, we have the relations dz = −w−1 · z · dw and d z¯ = −w¯ −1 · z¯ · d w¯ on Xt . Thus we have the following inequality for
the integral on any subset D contained in the region (z, w, t) C −1 |z| < |w| < C|z| : |f |2 · |z|(l) · φ z · |dzd z¯ | ≤ |g|2 · |w|(r−l) · φ w · |dwd w| ¯ C −2 · D D ≤ C2 · |f |2 · |z|(l) · φ z · |dzd z¯ |. D
Hence the norms are mutually bounded.
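The Jacobian factor behind this comparison can be written out as a short check. The sketch below uses only the relation $z \cdot w = t^{e}$ on $X_t$ and the bounds defining the region; the weight factors and the functions $\varphi^z$, $\varphi^w$ are left aside, and it is assumed that $D$ lies in the region $C^{-1}|z| < |w| < C|z|$.

```latex
% On X_t the product z w = t^e is constant, so differentiating gives
\[
  w\,dz + z\,dw = 0
  \quad\Longrightarrow\quad
  dz = -\frac{z}{w}\,dw, \qquad
  d\bar z = -\frac{\bar z}{\bar w}\,d\bar w .
\]
% Hence the two area forms differ by the factor |z/w|^2:
\[
  |dz\,d\bar z| = \frac{|z|^{2}}{|w|^{2}}\,|dw\,d\bar w| .
\]
% On the region C^{-1}|z| < |w| < C|z| one has C^{-2} < |z|^2/|w|^2 < C^2, so
\[
  C^{-2}\,|dw\,d\bar w| \;\le\; |dz\,d\bar z| \;\le\; C^{2}\,|dw\,d\bar w| .
\]
```

Combined with the pointwise relation between $|f|$ and $|g|$ stated in the proof, this is what makes the two integrals comparable with constants depending only on $C$.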
3.3.9. Collapsing a curve. We continue to use the notation. Lemma 3.20. Let f · e(l) = g · e(r − l) be a section of Ft on Xt (t = 0). Then we have the following equality: z |f |2 · |z|(l) · dz · d z¯ = |g|2 · |w|(r−l) · · dw · d w¯ . (15) w δ 2 <|z|<δ δ<|w|<1 Let f · e(l) · d z¯ = g · e(r − l) · d w¯ be a section of Ft ⊗ 0,1 of Xt (t = 0). Then we have the following equality: 2 (l) z |f | · |z| · · dz · d z¯ = |g|2 · |w|(r−l) · dw · d w¯ . (16) 2 w δ <|z|<δ δ<|w|<1 −1 Note that w · z < 1 in the region. Proof. It can be checked by a direct calculation, by using the relation (8). L2 Xt , Ft and L2 Xt , Ft ⊗ 0,1 by the norms Nt (l), φ z We obtain the L2 -spaces and Nt (r − l), φ w on Xtz and Xtw respectively. We denote the induced norm by N1 . In
the case t = 0, we also have L2 -spaces L2 Xt , Ft and L2 Xt , Ft ⊗ 0,1 by using the isomorphism Xt −→ D(δ 2 ) and the norm N (l), φ z on D(δ 2 ). We denote the induced
norm by N0 . The spaces L2 Xt , Ft (resp. L2 Xt , Ft ⊗ 0,1 ) and L2 Xt , Ft (resp.
L2 Xt , Ft ⊗ 0,1 ) are isomorphic as topological spaces. The norm of the identity maps
Ft : L2 Xt , Ft −→ L2 Xt , Ft and Gt : L2 Xt , Ft ⊗ 0,1 −→ L2 Xt , Ft ⊗ 0,1 are less than 1 for 3.20. t by Lemma any 2 The families L Xt , Ft 0 < |t| < 1 and L2 Xt , Ft ⊗ 0,1 0 < |t| < 1 are
naturally extended to the families over the disc t |t| < 1 , by putting L2 X0 , Ft :=
L2 X0z , F0z and L2 X0 , F0 ⊗ 0,1 := L2 X0z , F0z ⊗ 0,1 . The morphism F0 :
L2 X0 , F0 −→ L2 X0 , F0 is given by the natural projection L2 X0z , F0z ⊕ L2 X0w ,
F0w −→ L2 X0z , F0z . The morphism G0 : L2 X0 , F0 ⊗ 0,1 −→ L2 X0 , F0 ⊗ 0,1 is given by the natural inclusion L2 X0z , F0z ⊗ 0,1 −→ L2 X0z , F0z ⊗ 0,1 ⊕ L2 X0w , F0w ⊗ 0,1 . We have the trivialization of the families L2 Xt , Ft |t| < 1 and L2 Xt , Ft ⊗ 0,1 |t| < 1 as in Subsubsect. 3.3.5. We also have the trivialization of the families 2 L Xt , Ft |t| < 1 and L2 Xt , Ft ⊗ 0,1 |t| < 1 by taking the diffeomor2 ) X . Then the family of the morphisms F is the family of the morphisms phisms D(δ t 0
2 Ft : L X0 , F0 −→ L2 X0 , F0 . Similarly, we obtain the family of the morphisms t : L2 X0 , F0 ⊗ 0,1 −→ L2 X0 , F0 ⊗ 0,1 . G
t |t| < 0 and G t |t| < 0 Lemma 3.21. The families of the bounded operators F are strongly continuous with respect to the variable t. Proof. Let us see the strong continuity of Ft . The continuity at t = 0 is easy to see. To show the continuity at t = 0, we have only to show that the induced morphisms
w
2 Ft : L X0 , F0 −→ L2 X0 , F0 converge to 0 strongly. For diffeomorphisms ηt : 1/2 = δ · ηt∗ |w|−1 are bounded indepenD(δ) −→ Xtw , the functions ηt∗ |z| · |w|−1 dently of δ, and they converge to 0 when δ → 0. Then we obtain the strong convergence t to 0 due to (15). of F Let us see the strong continuity of Gt. We have only to see at t = 0, and we have only t : L2 X0 , F0 ⊗ 0,1 −→ L2 X w , F0 ⊗ 0,1 conto show the induced morphisms G 0 verge to 0. For diffeomorphisms ηt : Xt −→ D(δ 2 ), we put T (t) := z ηt ∈ Xtw . Let g ˜ t , we have G ˜ t (g) = G ˜ t g|T (t) . be an element of L2 X0 , F0 ⊗ 0,1 . For the morphism G The functions |z| · |w|−1 are bounded on T (t) independently of t, and the sets T (t) go ˜ (g) to 0, namely the strong convergence to {0}. Hence we obtain the convergence of G
˜ to 0 due to (16). of G Similarly we have the family of topological vector spaces L2 Xt , Lt |t| < 1 and 2
L Xt , Lt ⊗ 0,1 |t| < 1 , and we have strongly continuous families of bounded
morphisms Ft : L2 Xt , Lt −→ L2 Xt , Lt and Gt : L2 Xt , Lt ⊗ 0,1 −→ L2 Xt , Lt ⊗ 0,1 . Finally we give a remark on the morphism ψt of L2 Xt , Lt −→ L2 Xt , Ft ⊗ 0,1 induced by the norms. We consider the family of the semi-norms N s (0 ≤ s ≤ 1) on the topological vector spaces L2 (Xt , Ft ) and L2 Xt , Ft ⊗ 0,1 given by Ns = (1 − s) · N0 + s · N1 . Here we regard N0 as semi-norms on L2 X0 , F0 and L2 X0 , F0 ⊗
0,1 instead of the norms on L2 X0 , F0 and L2 X0 , F0 ⊗ 0,1 in the case t = 0. In the case s = 0, they are the norms. They induce the anti-linear continuous maps (s) ψt : L2 Xt , Lt −→ L2 (Xt , Ft ⊗ 0,1 ). It is continuous with respect to t and s. (s) Note that ψt is not isomorphic in the case (t, s) = (0, 0). In that case, the image of
(0) (0) ψ0 is L2 X0 , F0 ⊗ 0,1 = L2 X0z , F0z ⊗ 0,1 , and the map ψ0 factors through z z L2 X0 , L0 . 3.4. The global family and its graph-continuity. 3.4.1. Preliminary. Let m be an n-tuple of integers m1 , . . . , mn such that −1 ≤ mi . 1/r,m 1/r,m We denote the moduli stack Mg,n (resp. Mg,n ) simply by M (resp. M) in this subsection. Let (C, p, F) be a stable r-spin curve over the complex number field C of type m. We denote mi also by m(Pi ). We denote the set {Pi ∈ p | mi = −1} by E. Let q = Q1 , . . . , Ql denote the set of nodal points of the curve C. We denote the normalization of C by π : C˜ −→ C. Let {Qi,1 , Qi,2 } denote the preimage ofQi via π . We put p := p π −1 (q). The r-spin structure F determines the index m Qi j at the point Qi j . := m(Pi ), m(Qi j ) Pi ∈ p, Qi j ∈ π −1 (q) . Hence we obtain the tuple of integers m is naturally induced. We put E := Qi j ∈ on C The r-spin structure F of type m −1 := E E . π (q) m(Qi j ) = −1 , and E
Let α = (α1 , . . . , αn ) be an element of Rn . For simplicity, we assume −2 < αi < 2 for any i. We denote αi also by α(Pi ). We assume α(P i ) = −1 for any point Pi ∈ E. For any point Qi j ∈ π −1 (q), we put α(Qi j ) := m(Qi j + 1) . Thus we obtain the tuple of the real numbers α = α(P˜i ), P˜i ∈ p˜ ∈ Rp. We put α ∨ (P˜i ) = −α(P˜i ) for any point P˜i ∈ p. Thus we obtain the tuple α ∨ ∈ Rp. Similarly we obtain the tuple α ∨ ∈ Rp. 3.4.2. The global family of the topological vector spaces over the moduli. We take an with the following conditions: 1,1 -valued hermitian metric (Remark 3.1) of F − • It is C ∞ on C p. ∞ 1,1 i ∈ • Around a point P p, it is of the form ψ · |z|α(Pi ) , where ψ denotes a C - hermitian metric, and z denotes a holomorphic coordinate such that z P˜i = 0. ⊗ 0,1 with a similar condition. Then We also take an 1,1 -hermitian metric of F we obtain the Hilbert spaces: 0 α , F, E (C,p,F ) := L2 C,
1 ⊗ 0,1 , α . F E (C,p,F ) := L2 C,
The underlying topological space is independent of a choice of 1,1 -hermitian metrics. 0 1 We also have the closable operator ∂ : E (C,p,F ) −→ E (C,p,F ). ⊗L⊗ 0,1 −→ 1,1 , the −1 ⊗ω . Since we have the natural pairing F We put L := F C 1,1 -hermitian metric of F induces the 1,1 -hermitian metrics of L ⊗ 0,1 . Similarly the 1,1 -metric of F ⊗ 0,1 induces the 1,1 -metric of L. Then we obtain the Hilbert spaces: ∨0 L, E (C,p,F ) := L2 C, α∨ ,
∨1 L ⊗ 0,1 , E (C,p,F ) := L2 C, α∨ .
The underlying topological vector spaces are independent of a choice of 1,1 -hermitian ∨0 ∨1 dual : E (C,p,F ) −→ E (C,p,F ). metrics. We also have the closable operator ∂ ∨1
0
We have the naturally defined perfect pairing of E and E . We also have the 1 ∨0 naturally defined perfect pairing of E and E . If we fix 1,1 -metrics, we obtain ∨1 0 ∨0 the anti-Hermitian isomorphisms ψ : E (C,p,F ) −→ E (C,p,F ) and ψ : E (C,p,F ) −→ 1
E (C,p,F ). They depend on choices of 1,1 -hermitian metrics. dual
are mutually formal adjoint. To 3.4.3. The adjointness. The operators ∂ and −∂ make them mutually adjoint, we have to impose the conditions to their domains. For the points P˜i ∈ π −1 (q), we impose the matching condition explained in Subsubsect. 3.3.7 (see Condition 3.1). For any points Pi ∈ p such that α(Pi ) ∈ 2 · Z, we consider the following conditions: 0 (Pi ) If u ∈ E (C,p,F ) is contained in the domain of ∂, we have the vanishing zα(Pi )/2 · u (Pi ) = 0. ∨0
(Pi , dual) If v ∈ E (C,p,F ) is contained in the domain of ∂ −α(P i )/2+1 · v (P ) = 0. z i
dual
, we have the vanishing
Lemma 3.22. For each point $P_i$, we impose one of the conditions $(P_i)$ or $(P_i, \mathrm{dual})$. Then $\bar\partial$ and $\bar\partial^{\mathrm{dual}}$ are mutually adjoint.
Proof. Similar to Lemma 3.18.
3.4.4. The preferred norms. Let (C, p, F) be a stable r-spin curve over the complex k number field C, which can be regarded as a point of M. We take a kmulti-disc B (T ) := z := (z1 , . . . , zk ) |zi | < T , i = 1, . . . , k and the etale map B (1) −→ M such that the point (0, . . . , 0) ∈ B k (1) corresponds to the curve (C, p, F) considered. We have the universal family q : C −→ B k and the universal r-spin structure F. We denote the fiber q −1 (z) by Cz . Let σj denote the j th section of q. For any point z ∈ B k (1), we i
i
have the corresponding topological vector spaces E z and E z . We would like to take a family of their norms with some good properties. In the following argument, we shrink B k (1) to B k (T ) for an appropriate T ≤ 1 without mentioning it, if we need to. We may assume that the divisors zi = 0 (i = 1, . . . , j ≤ k) correspond to the singularities of the map q. Then the morphism q is submersive on B ∗ j × B k−j , where B ∗ denotes the punctured disc. Let Qi be the corresponding nodal point of the curve C(0,... ,0) . Let (mi,1 , mi,2 ) be the index at Qi . We can take an open subset Ui of C around Qi and an embedding Ui −→ B k+2 (T ) = (z1 , . . . , zk , xi , yi ) such that the image is the closed subset determined by xi yi = zie , where e = gcd(r, mi,1 + 1). Let fi and gi be holomorphic functions on the open subset Ui such that fi = xi f˜i , gi = yi g˜ i where f˜i (0, 0) and g˜ i (0, 0) are not 0. Namely the functions fi and gi are close to xi and yi respectively. We consider the following subsets of Ui : y Uix := |fi | > |gi | , Ui := |fi | < |gi | , Wi := |fi | = |gi | . We may assume that the index mi,1 (resp. mi,2 ) corresponds to the divisor given by xi = 0 (resp. yi = 0). We have the trivialization e(mi,1 + 1)x (resp. e(mi,2 + 1)y ) of F y on Uix (resp. Ui ) given as in Subsubsect. 3.3.2. In the following construction, we shrink Ui without mentioning it, if we need to. xi yi Remark 3.4. Let us take positive φ kand φ functions on Ui . Then we obtain the family 1,1 of -hermitian metrics of Ui ∩ Cz z ∈ B (T ) . (See Remark 3.1.) We take a family of 1,1 -hermitian metrics of the family F˜ z z ∈ B k (T ) with the following conditions: • They are smooth on C − i Ui ∪ j σj . • Around the j th section σj , it is of the form |z|αj · ψ, where ψ denotes a family of C ∞ - 1,1 -hermitian metrics, and z denotes a relative coordinate of q : C −→ B around σj such that z(σj ) = 0. 
• On Ui , it is of the form given as in Remark 3.4, i.e., it is induced by positive functions φ xi and φ yi . z ⊗ 0,1 z ∈ We also take a family of 1,1 -hermitian metrics of the family F Cz i B k (T ) . They induce the family of Hilbert norms on the topological vector spaces E z ∨i
and E z (i = 0, 1). Definition 3.1. Such norms are called preferred norms.
Lemma 3.23. The preferred norms are mutually bounded. Proof. The change of positive functions φ xi and φ yi has no effect up to boundedness. A change of the holomorphic functions fi and gi satisfying the conditions above changes the cutting circles. Note that the circles Wi = {|fi | = |gi |} are contained in a region of the form {C −1 |xi | ≤ |yi | ≤ C|xi |} for some C > 1. Then we may apply the argument in Lemma 3.19. Thus any changes of the holomorphic functions fi and gi have no effect. Remark 3.5. A sum of preferred norms is not necessarily preferred, because the cutting circles are not same. But it is mutually bounded with any preferred norm. 3.4.5. The local trivializations and the bundle structure. We continue to use the setting in Subsubsect. 3.4.4. We take a preferred norm. Then we can take the trivializations of the families of the topological vector bundles, which we shall explain. 0, zj +1 , . . . , zk ) We denote the projection B k −→ B k−j , (z1 , . . . , zk ) −→ (0, . . . , by p. We can take a diffeomorphism η : C − i Wi p ∗ C|B k−j − i Wi|B k−j . We [δ, C] defined by χδ (a) = (C − δ) · C −1 a + δ. We use have the maps χδ : [0, C] −→ √ √ the real coordinate xi = Ri · e −1αi and yi = Si · e −1βi . Then we may assume the following: • A point (z1 , . . . , zk , xi , yi ) of Ui ∩ |xi | > |yi | is mapped to √ 0, . . . , 0, zj +1 , . . . , zk , χ −1 (Ri ) · e −1αi , 0 via η. • A point (z1 , . . . , zk , xi , yi ) of Ui ∩ |xi | < |yi | is mapped to √ 0, . . . , 0, zj +1 , . . . , zk , 0, χ −1 (Si ) · e −1βi via η. ∗ −1 We can lift the morphism P to the bundle map η : F η F such that it is of the form η∗ e(mi,1 + 1)x = e(mi,1 + 1)x on Uix , and of the form η∗ e(mi,2 + 1)y = y e(mi,2 + 1)y on Ui . Then there exist the positive functions φi (i = 0, 1) such that φi · η∗ gives the unii i i tary isomorphism z of η−1 E z = E p(z) to E z . It gives the local trivializations of the i families E z z ∈ B k . 
Similarly, we can take the local trivializations of the families ∨i Ez z ∈ Bk . Since the preferred metrics are mutually bounded, the change of the trivializations, raised by changes of preferred norms, are continuous. As a result, we obtain the bundle k ∨k structure of the families E and E . We can also take global families of 1,1 -hermitian k metrics of F and F ⊗ 0,1 , which induces the Hilbert norms of E over M which is mutually bounded with locally defined preferred norms. dual
3.4.6. Graph-continuity of the families of the operators $\bar\partial$ and $\bar\partial^{\mathrm{dual}}$. For the trivializations, the following is shown by the argument of Seely-Singer (see [11]).
Proposition 3.1. The families of the closed operators $\bar\partial_z$ and $\bar\partial_z^{\mathrm{dual}}$ are graph-continuous.
Proof. We have the anti-Hermitian isomorphisms $\overline{E}^{\vee 0} \simeq \overline{E}^{1}$ and $\overline{E}^{\vee 1} \simeq \overline{E}^{0}$ induced by the Hilbert norms. Thus we regard the operators $-\bar\partial_z^{\mathrm{dual}}$ as the closed operators $\overline{E}^{1} \longrightarrow \overline{E}^{0}$, which are the adjoints of $\bar\partial$. We put $\overline{E}_z := \overline{E}^{0}_z \oplus \overline{E}^{1}_z$, and then we have the family of the closed operators:
id ∂¯ Dz = : E z −→ E z . −∂¯ dual id The graph continuity of the operators ∂¯ and −∂¯ dual is equivalent to the norm-continuity of the inverse Dz−1 . We constructed a family of local parametrices for the families of the operators ∂¯ and dual ∂ around the nodal points or marked points in Subsect. 3.2 and 3.3. Hence we obtain and Q dual . the continuous family of the global parametrices, which we denote by Q Then the following holds:
T Q 0 Q id −∂¯ dual = id + (17) dual 0 dual T . −Q −Q ∂¯ id In other words, we have Dz Qz = I + Tz , where Tz is a compact operator. Note that Dz · Qz is defined, since the image of Qz is contained in the domain of Dz . Since the operators Tz vanish around the nodal points and the marked points, it is easy to see that they give the continuous family. The right-hand side of (17) is Fredholm. Since the operator Dz is invertible, the operator Qz is a Fredholm operator of E z to the domain of Dz with the index 0. We take a complement Vp(z) of the image Im(Qp(z) ) in the domain Dom(Dp(z) ). In particular, we focus around the point 0 := (0, . . . , 0) ∈ B k . Let S0 be a morphism of E 0 to Dom(D0 ) which gives the isomorphism Ker(Q0 ) −→ V0 and vanishes on Ker(Q0 )⊥ . Thus D0 (Q0 + S0 ) = id + T0 + D0 S0 is invertible, and we obtain the equality D0−1 = (id + T + D0 S0 )−1 · (Q0 + S0 ). We extend S0 to the continuous family of the operators on B k−j (T ) = {(0, . . . , 0, zj +1 , . . . , zk )} ⊂ B k (T ). We have the equality Dz · (Qz + Sp(z) ) = id + Tz + Dz Sp(z) . The right hand side is norm-continuous, and hence invertible on a neighbourhood of the point 0. Thus the operator Dz has the norm-continuous inverse. 4. The Virtual Class 4.1. The axioms of Jarvis-Kimura-Vaintrob. For any n-tuple of integers m, we put d(r, m) := −2r −1 · (2g − 2 − mi ) + 2. Let Gg,n denote the set of the decorated stable graph of genus g with n-tails marked with the n-tuple m of non-negative integers. 1/r 1/r Definition 4.1 ([6 and 7]). A family of the cohomology classes c ∈ H d(r,m) M ∈ Gg,n satisfying the following axioms, are called r-spin virtual classes. If the graph 1/r 1/r,m consists of one vertex and no edge, then c is denoted by cg,n . ∗ 1/r,m 1/r r Axiom 1. If the graph is connected, the equality c = holds, e∈E() le · i cg,n 1/r
where i denotes the inclusion morphism M
1/r,m
−→ Mg,n . If the graph is the 1/r 1/r (d) disjoint union of the graphs , then it is described as the product c = d c (d) . 1/r,m
Axiom 2 (Convexity). In the case π∗ F = 0 on some connected components of Mg,n , 1/r,m then the restriction of the cohomology class cg,n to the component is the Euler 1 ∨ class of the vector bundle (R π∗ F) .
Axiom 3 (Cutting edge). Let be a decorated graph and ˜ be the graph obtained by 1/r 1/r cutting all edges of . We have the morphism µ˜ : M˜ ×M M −→ M . 1/r
1/r
We also have the morphism p1 : M˜ ×M M −→ M˜ . Then the equality 1/r 1/r p1 ∗ µ˜ ∗ c = r |E()| · c ˜ holds. Axiom 4 (Descent Axiom). We have the universal i th section σi of the universal curve 1/r,m 1/r,m Cg,n −→ Mg,n . Thus we have the line bundle σi∗ F which we denote by ψ˜ i (m). Let r · δ i denote an n-tuple of integers which is r in the i-th position and 0 in all 1/r,m+rδ i 1/r,m = −ψ˜ i (m) · cg,n holds. others. Then the equality cg,n 1/r,m Axiom 4’ (Vanishing axiom). In the case mi = r − 1 for some i, the virtual class cg,n is 0. Axiom 5 (Forgetting tails). Let ˜ be a decorated stable graph with a tail ti marked by mi = 0, and let be the decorated stable graph obtained by removing the tail ti . We 1/r 1/r 1/r 1/r have the forgetting morphism π : M˜ −→ M . Then the equality c ˜ = π ∗ c holds. In the next subsection, we recall the construction due to Witten. 1/r,m
Remark 4.1. In [13], the class $c^{1/r,m}_{g,n}$ is called the top Chern class. Here we follow Jarvis-Kimura-Vaintrob [6].
Remark 4.2. Jarvis-Kimura-Vaintrob also considered the case where one of mi is −1. In this paper, we restrict our attention to the tuples of non-negative integers. Remark 4.3. In Axiom 3, if one of the nodes is of Ramond type, the right hand side is understood to be 0. Hence we have to show the vanishing of the right hand side in that case. Remark 4.4. In the case mi = r − 1, it is easy to see that ψ˜ i is 0 in the rational cohomology. Hence Axiom 4 would formally imply Axiom 4’, if we could extend the Seely-Singer-Witten construction in the case when one of mi is −1. 4.2. The construction of virtual classes. 4.2.1. A finite reduction. Let m denote a tuple of integers (m1 , . . . , mn ) such that mi ≥ −1. Let be a decorated stable graph of genus g with n-tails marked with the n-tuple m. We denote M by M for simplicity. Let α = (α1 , . . . , αn ) be an n-tuple of real numbers. We impose the following conditions for simplicity: Condition 4.1. • In the case mi = −1, the inequality −2 < αi < 0 holds. • In the case 0 ≤ mi ≤ r − 1, the inequality −2 < αi ≤ 1 holds. • In the case mi ≥ r, the inequality −2 < αi < 2 holds. 1/r,m
In the case that $\Gamma$ consists of one vertex and no edge, i.e., $M = \overline{M}^{1/r,m}_{g,n}$, we have the bundles $\overline{E}^{i}$ and $\overline{E}^{\vee i}$ of topological vector spaces and the family of closable operators over $M$ (see Subsect. 3.4). If $\Gamma$ is connected, then the bundles are obtained by the pull back via the morphism $M \longrightarrow \overline{M}^{1/r,m}_{g,n}$. If $\Gamma$ is a disjoint union of connected graphs $\Gamma^{(i)}$, the bundles are obtained as a direct sum of the pull backs via the projections $M \longrightarrow M_{\Gamma^{(i)}}$.
By taking an appropriate family of $\Omega^{1,1}$-hermitian metrics of the families $\{\tilde F_{(C,p,F)} \mid (C, p, F) \in M\}$ and $\{\tilde F_{(C,p,F)} \otimes \Omega^{0,1} \mid (C, p, F) \in M\}$, we obtain the anti-Hermitian isomorphism $\psi : \overline{E}^{\vee 0} \longrightarrow \overline{E}^{1}$. When we distinguish the dependence of $\overline{E}^{i}$ on $\alpha$ and $m$, we use the notation $\overline{E}^{i}(m, \alpha)$, and similarly for the others.
We take the domains of $\bar\partial$ and $\bar\partial^{\mathrm{dual}}$ such that they are mutually adjoint. For that purpose, we have only to impose one of the conditions $(P_i)$ or $(P_i, \mathrm{dual})$ for each point $P_i \in p$ such that $\alpha(P_i) \neq 0$.
Condition 4.2.
• We choose the condition $(P_i)$ (resp. $(P_i, \mathrm{dual})$) in the case $\alpha_i > 0$ (resp. $\alpha_i < 0$), unless otherwise mentioned.
• In the case $m_i = -1$, we impose neither $(P_i)$ nor $(P_i, \mathrm{dual})$.
Definition 4.2. Let us consider finite dimensional subbundles $E^{i} \subset \overline{E}^{i}$ ($i = 0, 1$) with the following conditions:
• The fiber $E^{0}_{(C,p,F)}$ over $(C, p, F) \in M$ is contained in the domain of the operator $\bar\partial$. The image $\bar\partial\bigl(E^{0}_{(C,p,F)}\bigr)$ is contained in $E^{1}_{(C,p,F)}$, and the equality $\mathrm{index}(\bar\partial) = \mathrm{index}\bigl(\bar\partial : E^{0}_{(C,p,F)} \longrightarrow E^{1}_{(C,p,F)}\bigr)$ holds.
• The fiber $E^{1}_{(C,p,F)}$ contains the orthogonal complement of $\bar\partial\bigl(\overline{E}^{0}_{(C,p,F)}\bigr)$.
• For any element $s$ of $E^{0}_{(C,p,F)}$, the $(r-1)$-th power $s^{r-1}$ gives an $L^2$-section of $L_{(C,p,F)}$ with the appropriate weight. (See Subsubsect. 3.4.2 for $L$.)
Such $E^{0} \longrightarrow E^{1}$ is called a finite reduction of $\bar\partial : \overline{E}^{0} \longrightarrow \overline{E}^{1}$.
Lemma 4.1. There is a finite reduction.
Proof. First, we take a finite dimensional subbundle $H$ of $\overline{E}^{\vee 1}$ satisfying the following conditions:
• The naturally defined morphisms $H_{(C,p,F)} \longrightarrow \overline{E}^{\vee 1}_{(C,p,F)} / \mathrm{Im}\, \bar\partial^{\mathrm{dual}}$ are surjective for any point $(C, p, F)$ of $M$.
• Any element $s$ of $H_{(C,p,F)}$ is a smooth section of $L_{(C,p,F)}$. Moreover, $s$ is constantly 0 on a neighbourhood of the marked points $P_i$ and the preimage of the nodal points.
Since the family of the operators $\bar\partial^{\mathrm{dual}}$ is graph-continuous, $(\bar\partial^{\mathrm{dual}})^{-1}(H)$ gives a subbundle of $\overline{E}^{\vee 0}$ because of the surjectivity of $H_{(C,p,F)} \longrightarrow \overline{E}^{\vee 1}_{(C,p,F)} / \mathrm{Im}(\bar\partial^{\mathrm{dual}})$. We put $E^{1} := \psi\bigl((\bar\partial^{\mathrm{dual}})^{-1}(H)\bigr)$, and then it is a subbundle of $\overline{E}^{1}$. Since $\psi\bigl(\mathrm{Ker}\, \bar\partial^{\mathrm{dual}}\bigr)$ is the orthogonal complement of $\mathrm{Im}(\bar\partial)$, $E^{1}$ contains the orthogonal complement of $\mathrm{Im}(\bar\partial)$. We put $E^{0} := \bar\partial^{-1}(E^{1})$. Since the family of the operators $\bar\partial$ is graph-continuous, $E^{0}$ is a subbundle of $\overline{E}^{0}$. Thus we obtain the finite dimensional bundles $E^{0}$ and $E^{1}$ and the morphism $\bar\partial|_{E^{0}} : E^{0} \longrightarrow E^{1}$.
The first and second conditions hold by our construction. Let us check the third condition, i.e. the $L^2$-properties around the singular points. Let $(C, p, F)$ be a stable r-spin curve. We use the notation in Subsect. 3.4. The singular points we have to consider are the following:
such that α(Pi ) = 0. 1. The points Pi ∈ p−E Recall that we put −2 < α(Pi ) < 0. 2. The marked points Pi ∈ E. dual −1 Let us consider Case 1. Let s be an element of ∂ (H )(C,p,F ). Then s is holo 1/r −α(P ) i · f · zm(Pi ) · dz · d z¯ , where morphic around Pi , and thus ψ(s) is of the form |z| z denotes a holomorphic coordinate such that z(Pi ) = 0 and f denotes a C ∞ -function. 1/r 0 be an element of E (C,p,F ), which is contained in the domain Let u = g · zm(Pi ) · dz of ∂ and satisfies ∂u = ψ(s). Then it is easy to see that g is of the form z · |z|−αi · f1 + f2 , where f1 denotes a continuous function and f2 denotes a holomorphic function. In the case 0 ≤ mi ≤ r − 1, we have assumed |αi | ≤ 1, and thus g is bounded. Thus it is easy to check that ur−1 gives an L2 -section of L(C,p,F ). Let us consider the case mi ≥ r. We have only to consider the case 1 − αi < 0, and we can ignore f2 . Note that we have 1 − αi > −1. Since f1 is continuous, we have only to check the L2 -integrability of the function |z|(r−1)·(1−αi ) × |z|mi with respect to the measure |z|−α · |dz · d z¯ |, which is easy to see. dual −1 Let us consider Case 2. Let s be an element of ∂ (H )(C,p,F ). Then it is holo 1/r −α(P ) i · z−1 · f · z−1 · dz · d z¯ , morphic around Pi , and thus ψ(s) is of the form |z| −1 1/r 0 ∞ be an element of E (C,p,F ), where f denotes a C -function. Let u = g · z · dz which is contained in the domain of ∂ and satisfies ∂u = ψ(s). Then it is easy to see that g is of the form z · z−1 · |z|−α(Pi ) · f1 + f2 , where f1 denotes a continuous function and f2 denotes a holomorphic function. Hence we have only to check the L2 -integrability of −1/r (r−1)/r = z−1 · dz · dz/z , with respect to the measure |z|−α(Pi ) · |dz · d z¯ |. dz/z −2 −α(P ) i Since the function |z| · |z| is integrable with respect to the measure |dz · d z¯ |, we obtain the desired L2 -property. 4.2.2. The inner product of ∂s and ψ(s r−1 ). Let us take a finite reduction E 0 −→ E 1 . 
Let $(C, p, F)$ be a point of $M$. Let $\pi : \tilde C \longrightarrow C$ denote the normalization of $C$. Let $s$ be an element of $E^{0}_{(C,p,F)}$. Since $\bar\partial s^{r}$ gives a section of $\Omega^{1,1}$ with some singularity, let us consider the Stokes formula for the integral $\int \bar\partial s^{r}$. We use the notation in Subsect. 3.4.
Lemma 4.2. In the case $P_i \in \tilde p - E$, the contribution of the point $P_i$ to the Stokes formula is 0.
Proof. For the preimage of the nodal points, the claim is checked in Lemma 3.18. We have only to consider the case $\alpha(P_i) \neq 0$ for any marked point $P_i$. The claim immediately follows from the form of $s$ around the points $P_i$, which has already been seen in the proof of Lemma 4.1.
Let us consider the contribution of the points $P_i \in E$. Because of the inequality $-2 < \alpha(P_i) < 0$ and Lemma 3.11, we have the value $s(P_i)$. We have the isomorphism $F^{\otimes r} \simeq \omega(P_i)$ around $P_i$, and we have the residue map $\omega(P_i) \longrightarrow \mathbf{C}$ at $P_i$. Hence we can regard $s^{r}(P_i)$ as a complex number.
Lemma 4.3. Let $s$ be as above. Then we have the following formula:
$$r \cdot \bigl\langle \bar\partial s, \psi\bigl(s^{r-1}\bigr) \bigr\rangle = \int_{\tilde C} \bar\partial s^{r} = \sum_{P_i \in E} s^{r}(P_i). \qquad (18)$$
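Proof. The second equality follows from the standard Stokes formula and the residue formula. By definition of the anti-Hermitian isomorphism $\psi$, we have the following equality:
$$\bigl\langle \bar\partial s, \psi\bigl(s^{r-1}\bigr) \bigr\rangle = \int_{\tilde C} s^{r-1} \cdot \bar\partial s = r^{-1} \cdot \int_{\tilde C} \bar\partial s^{r}.$$
Thus we obtain the first equality.
Let $E^{0} \longrightarrow E^{1}$ be a finite reduction. We define the morphism $\Phi : E^{0} \longrightarrow \mathbf{C}$ as follows:
$$\Phi(s) := \int_{\tilde C} \bar\partial s^{r} = \sum_{P_i \in E} s(P_i)^{r} = r \cdot \bigl\langle \bar\partial s, \psi(s^{r-1}) \bigr\rangle. \qquad (19)$$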
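The factor $r^{-1}$ relating the pairing to the integral of $\bar\partial s^{r}$ is the Leibniz rule for $\bar\partial$ applied to a power. A brief sketch, assuming (as the proof of Lemma 4.3 indicates) that the pairing with $\psi(v)$ is given by integrating the product with $v$ over $\tilde C$:

```latex
\[
  \bar\partial (s^{r}) = r\, s^{r-1}\,\bar\partial s
  \quad\Longrightarrow\quad
  \int_{\tilde C} s^{r-1}\,\bar\partial s
    = \frac{1}{r}\int_{\tilde C} \bar\partial (s^{r}) .
\]
% Example with r = 2: \bar\partial(s^2) = 2 s \bar\partial s, so the pairing
% <\bar\partial s, \psi(s)> is one half of \int \bar\partial(s^2).
```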
4.2.3. Witten map and the 0-set. Let $E^{0} \longrightarrow E^{1}$ be a finite reduction. We denote the orthogonal projection of $\overline{E}^{1}$ onto the subbundle $E^{1}$ by $\rho_{E^{1}}$. Let $s$ be an element of $E^{0}$. Since the $(r-1)$-th power $s^{r-1}$ satisfies the $L^2$-condition, it is an element of $\overline{E}^{\vee 0}$. We have the element $\psi(s^{r-1}) \in \overline{E}^{1}$. Hence we obtain the morphism $\phi_1 : E^{0} \longrightarrow E^{1}$ given by $\phi_1(s) := \rho_{E^{1}}\bigl(\psi(s^{r-1})\bigr)$. Then we put $\phi := \bar\partial|_{E^{0}} + \phi_1$, which we call the Witten map.
Let $\pi$ denote the natural projection of the vector bundle $E^{0}$ onto $M$. The map $\phi$ can be regarded as a section of the vector bundle $\pi^{*} E^{1}$ over $E^{0}$. We consider the restriction of the section $\phi$ to the subset $\Phi^{-1}(a) \subset E^{0}$ for a complex number $a \in \mathbf{C}$. The following lemma is one of the key ideas for our vanishing argument (the vanishing axiom and the Ramond case of Axiom 3).
Lemma 4.4. If an element $s \in \Phi^{-1}(a)$ satisfies $\phi(s) = 0$, then the complex number $a$ is a non-positive real number.
Proof. We have the equality $0 = \phi(s) = \phi_1(s) + \bar\partial s$. Taking the inner product with $\phi_1(s)$, we obtain the following equality:
$$0 = \bigl|\phi_1(s)\bigr|^{2} + \bigl\langle \bar\partial s, \rho_{E^{1}}\bigl(\psi(s^{r-1})\bigr) \bigr\rangle.$$
On the other hand, we have the following equalities:
$$\bigl\langle \bar\partial s, \rho_{E^{1}}\bigl(\psi(s^{r-1})\bigr) \bigr\rangle = \bigl\langle \rho_{E^{1}}(\bar\partial s), \psi(s^{r-1}) \bigr\rangle = \bigl\langle \bar\partial s, \psi(s^{r-1}) \bigr\rangle = r^{-1} \cdot \Phi(s) = r^{-1} \cdot a.$$
Thus $a$ is real and non-positive.
The following lemma, due to Witten, is critical for the definition of the virtual class.
Lemma 4.5 (Witten). Let $s$ be an element of the subset $\Phi^{-1}(0)$. The equality $\phi(s) = 0$ implies the equality $s = 0$.
Proof. Since we have the equality $\langle \bar\partial s, \psi(s^{r-1}) \rangle = 0$ for any element $s \in \Phi^{-1}(0)$, the following equality holds:
$$\bigl\langle \rho_{E^{1}} \circ \psi(s^{r-1}), \bar\partial s \bigr\rangle = \bigl\langle \psi(s^{r-1}), \bar\partial s \bigr\rangle = 0.$$
Hence the equality $\phi(s) = 0$ implies $\rho_{E^{1}} \circ \psi(s^{r-1}) = \bar\partial s = 0$. In particular, $s$ is holomorphic. Since $s^{r-1}$ is contained in $\mathrm{Ker}\, \bar\partial^{\mathrm{dual}}$, $\psi\bigl(s^{r-1}\bigr)$ is contained in $E^{1}$, by definition of finite reduction. Hence $\rho_{E^{1}} \circ \psi(s^{r-1}) = 0$ implies $\psi\bigl(s^{r-1}\bigr) = 0$, which implies $s = 0$.
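Lemmas 4.4 and 4.5 can both be read off from a single norm identity. The following is a sketch, writing $\Phi$ for the map defined in (19) (the symbol name is an assumption here) and using the relation $\langle \bar\partial s, \phi_1(s) \rangle = r^{-1} \Phi(s)$ obtained in the proof of Lemma 4.4:

```latex
\begin{align*}
  \|\phi(s)\|^{2}
    &= \|\bar\partial s + \phi_1(s)\|^{2} \\
    &= \|\bar\partial s\|^{2} + \|\phi_1(s)\|^{2}
       + 2\,\mathrm{Re}\,\langle \bar\partial s, \phi_1(s)\rangle \\
    &= \|\bar\partial s\|^{2} + \|\phi_1(s)\|^{2}
       + \frac{2}{r}\,\mathrm{Re}\,\Phi(s).
\end{align*}
% If \phi(s) = 0 and \Phi(s) = a, then
%   \|\bar\partial s\|^2 + \|\phi_1(s)\|^2 = -(2/r) Re(a) >= 0,
% so Re(a) <= 0; and when a = 0 both \bar\partial s and \phi_1(s) vanish.
```

The identity only gives $\mathrm{Re}\, a \le 0$; that $a$ is in fact real is the additional information provided by the argument of Lemma 4.4.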
Remark 4.5. Let x be a point of M, and let f1 , . . . , fd be an orthogonal frame of Ex1 . Then we have the formula for s ∈ Ex0 : φ1 (s) :=
d
s
r−1
· fi · fi .
i=1
The description may be useful to check the continuity of $\phi_{1,t}$ with respect to the parameter $t$, when a continuous family of vector bundles $E^{1}_{t}$ is given.
4.2.4. The construction of the cohomology class $c_\Gamma$. When the set $E = \{i \mid m_i = -1\}$ is empty, we have $\Phi^{-1}(0) = E^{0}$ by definition. Due to Lemma 4.5, the section $\phi$ on $E^{0}$ gives the top Chern class $c_{\mathrm{top}}(\pi^{*} E^{1}, \phi)$ of $\pi^{*} E^{1}$ with the support $M$. (See the appendix about the cohomology theory and the Chern class of orbispaces.) Since $E^{0}$ is smooth over $M$, we have the push forward morphism $\pi_{*} : H^{*}_{\mathrm{cpt}}(E^{0}) \longrightarrow H^{*-\mathrm{rank}\, E^{0}}(M)$. Hence we obtain the cohomology class $\pi_{*} c_{\mathrm{top}}(\pi^{*} E^{1}, \phi)$ in the cohomology group $H^{*}(M)$.
Lemma 4.6 (Witten). The cohomology class is independent of a choice of finite reductions. The cohomology class is independent of a choice of Hilbert norms, when we fix the weight $\alpha$.

Proof. The second claim is easy, because two norms are connected by a continuous family. Let us show the first claim. We have only to compare the classes obtained from two finite reductions $E^0 \longrightarrow E^1$ and $\tilde E^0 \longrightarrow \tilde E^1$ in the case $E^i \subset \tilde E^i$. The natural projections $E^0 \longrightarrow M$ and $\tilde E^0 \longrightarrow M$ are denoted by $\pi$ and $\tilde\pi$ respectively. We take the orthogonal complement $E'^1$ of the subbundle $E^1$ in $\tilde E^1$. We take the complement $E'^0$ of the subbundle $E^0$ in $\tilde E^0$ such that the restriction of $\bar\partial$ to $E'^0$ gives the isomorphism of $E'^0$ and $E'^1$. We have the sections $\phi_1 : E^0 \longrightarrow \pi^*E^1$ and $\tilde\phi_1 : \tilde E^0 \longrightarrow \tilde\pi^*\tilde E^1$, as in Subsubsect. 4.2.3. Then the sections $\phi = \bar\partial + \phi_1$ and $\tilde\phi = \bar\partial + \tilde\phi_1$ are used for the constructions respectively.

We have the section $\phi' : \tilde E^0 \longrightarrow \tilde\pi^*\tilde E^1$ defined by $\phi'(s_1, s_2) = \bar\partial(s_1 + s_2) + \phi_1(s_1)$ for $s = (s_1, s_2) \in E^0 \oplus E'^0$. It is easy to see that the cohomology classes obtained from the sections $\phi$ and $\phi'$ coincide. We have the map $a_t : \tilde E^0 \longrightarrow \tilde E^0$ defined as $a_t(s_1, s_2) = (s_1, t\cdot s_2)$. We also have the map $b_t : \tilde E^1 \longrightarrow \tilde E^1$ defined as $b_t(s_1, s_2) = (s_1, t\,s_2)$. Note that we have the following relation:
$$\bigl\langle \bar\partial s,\ b_t\circ\tilde\phi_1\circ a_t(s) \bigr\rangle = \bigl\langle b_t\circ\rho_{\tilde E^1}(\bar\partial s),\ \tilde\phi_1\circ a_t(s) \bigr\rangle = \bigl\langle b_t(\bar\partial s),\ \tilde\phi_1\circ a_t(s) \bigr\rangle = \bigl\langle \bar\partial(a_t(s)),\ \tilde\phi_1(a_t(s)) \bigr\rangle = r^{-1}\int \bar\partial\bigl(a_t(s)^r\bigr) = 0. \qquad (20)$$
We put $\phi^{(t)}(s) = \bar\partial(s) + b_t\,\tilde\phi_1(a_t(s))$ for $0 \le t \le 1$. It is easy to see that $\phi^{(t)}$ is a homotopy connecting the sections $\phi'$ and $\tilde\phi$. By the same argument as the proof of Lemma 4.5, the cohomology class is defined for any $0 \le t \le 1$, and thus it is constant.

Thus we obtain the well defined cohomology class $c_\Gamma := \pi_* c_{top}(\pi^*E^1, \phi) \in H^*\bigl(\overline{M}^{1/r}_\Gamma\bigr)$.
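The endpoints of the homotopy in the proof of Lemma 4.6 can be checked directly. The following sketch assumes that the complement of $E^0$ in $\tilde E^0$ and the orthogonal complement of $E^1$ in $\tilde E^1$ are written $E'^0$ and $E'^1$, that the intermediate section is written $\phi'$, and that $b_0$ is the orthogonal projection of $\tilde E^1$ onto $E^1$; these names are choices made for the sketch.

```latex
% Sketch: endpoints of \phi^{(t)}(s) = \bar\partial s + b_t \tilde\phi_1(a_t(s)),
% for s = (s_1, s_2) \in E^0 \oplus E'^0.
\begin{align*}
t = 0:\quad
\phi^{(0)}(s_1,s_2)
  &= \bar\partial(s_1+s_2) + b_0\,\rho_{\tilde E^1}\psi(s_1^{\,r-1})
   = \bar\partial(s_1+s_2) + \rho_{E^1}\psi(s_1^{\,r-1}) \\
  &= \bar\partial(s_1+s_2) + \phi_1(s_1) = \phi'(s_1,s_2), \\
t = 1:\quad
\phi^{(1)}(s)
  &= \bar\partial s + \tilde\phi_1(s) = \tilde\phi(s) .
\end{align*}
```

The step $b_0\,\rho_{\tilde E^1} = \rho_{E^1}$ uses that $E'^1$ is the orthogonal complement of $E^1$ in $\tilde E^1$.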
4.2.5. Independence of the cohomology classes from a weight $\alpha$. Let us see that the cohomology class $c_\Gamma$ obtained in Subsubsect. 4.2.4 is independent of a choice of a weight $\alpha$. (Recall Condition 4.1.) For that purpose, we have only to compare the cohomology classes for $\alpha$ and $(0, \ldots, 0)$. Hence we consider the family of the weights $\alpha(t) = \bigl(\alpha_i(t) \mid i = 1, \ldots, n\bigr)$ $(0 \le t \le 1)$, where we put $\alpha_i(t) := t\cdot\alpha_i$.

Let $\overline{E}^i(m, \alpha(t))$ and $\overline{E}^{\vee i}(m, \alpha(t))$ denote the vector bundles for the tuple $m$ with the weight $\alpha(t)$. We take a continuous family of 1,1-hermitian metrics for weights $\alpha(t)$. For each $t$, we obtain the cohomology classes $c(t) \in H^*(M)$ by the construction in Subsubsect. 4.2.4.

Lemma 4.7. The family of classes $\bigl\{ c(t) \mid 0 \le t \le 1 \bigr\}$ is constant. Namely, the cohomology class is independent of a choice of a weight $\alpha$.

Proof. Let $t_0$ be a real number such that $0 \le t_0 \le 1$. Let us take a sufficiently small positive number $\gamma$, and we put $a(t_0) := \max\{0, t_0 - \gamma\}$ and $b(t_0) := \min\{1, t_0 + \gamma\}$. Let us take $H \subset \overline{E}^{\vee 1}(m, \alpha(t_0))$ as in the proof of Lemma 4.1. We can regard it as a subbundle of $\overline{E}^{\vee 1}(m + r\cdot\delta_i, \alpha(t))$ for any $t \in [a(t_0), b(t_0)]$. It is easy to see that the subbundles $(\bar\partial^{\,dual})^{-1}H$ are independent of a choice of $t \in [a(t_0), b(t_0)]$, due to our choice of the domain of the operators $\bar\partial^{\,dual}$ (Condition 4.2). Since the family of the 1,1-metrics is continuous for $t$, $E^1_t := \psi\bigl((\bar\partial^{\,dual})^{-1}H\bigr)$ gives a continuous family of the vector bundles on $M$. The family of vector bundles $E^0_t := \bar\partial^{-1}(E^1_t)$ is also continuous due to our choice of the domain of $\bar\partial$. Then we can conclude that the family of the cohomology classes $c(t)$ is constant on $[a(t_0), b(t_0)]$. Therefore we obtain the desired constancy.

4.2.6. The definition of the virtual class. We have arrived at the following definition of the virtual class.

Definition 4.3. Let $m$ be an $n$-tuple of non-negative integers. Let us take a finite reduction for $\overline{M}^{1/r}_\Gamma$ with an appropriate weight $\alpha$ as in Condition 4.1, and we put as follows:
$$c^{1/r}_\Gamma := (-1)^{d(r,m)}\cdot\prod_{e\in E(\Gamma)}\frac{r}{l_e}\cdot c_\Gamma = (-1)^{d(r,m)}\cdot\prod_{e\in E(\Gamma)}\frac{r}{l_e}\cdot \pi_*\bigl(c_{top}(\pi^*E^1, \phi)\bigr) \in H^{d(r,m)}\bigl(\overline{M}^{1/r}_\Gamma\bigr).$$
It is called the virtual class of the moduli stack $\overline{M}^{1/r}_\Gamma$. Here we put $d(r, m) = 2\cdot r^{-1}\cdot\bigl(-2g + 2 + \sum m_i\bigr) + 2$.
4.2.7. Remark on the pull back. Although we discussed the construction of the cohomology class from the universal family over $\overline{M}^{1/r}_\Gamma$, the argument works for any locally trivializable orbispace. (See the appendix for locally trivializable orbispaces.) Namely, assume we have a continuous map $f : V \longrightarrow \overline{M}^{1/r}_\Gamma$ of locally trivializable orbispaces. Then we have the family of $r$-spin structures $f^*(C, p, F)$ over $V$, where $(C, p, F)$ denotes the universal $r$-spin curve over $\overline{M}^{1/r}_\Gamma$. When we fix an appropriate weight $\alpha$, the $r$-spin structure $f^*(C, p, F)$ induces bundles of topological vector spaces $\overline{E}^i_V$ and $\overline{E}^{\vee i}_V$ over $V$. They are isomorphic to the pull backs $f^*\overline{E}^i$ and $f^*\overline{E}^{\vee i}$ of the bundles of topological vector spaces over $\overline{M}^{1/r}_\Gamma$. We can take a finite reduction of $\bar\partial : \overline{E}^0_V \longrightarrow \overline{E}^1_V$, and then we obtain the cohomology class $c_V$ as in Subsubsect. 4.2.1–4.2.5, which is independent of choices of finite reductions, norms and weights. If we take a finite reduction $E^0 \longrightarrow E^1$ of $\bar\partial : \overline{E}^0 \longrightarrow \overline{E}^1$ over $\overline{M}^{1/r}_\Gamma$, then the pull back $f^*E^0 \longrightarrow f^*E^1$ gives a finite reduction of $\overline{E}^0_V \longrightarrow \overline{E}^1_V$.

Lemma 4.8. In this case, we have $c_V = f^* c_\Gamma$.
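Proof. Let $\pi : E^0 \longrightarrow \overline{M}^{1/r}_\Gamma$ and $\tilde\pi : f^*E^0 \longrightarrow V$ denote the natural projections. We have the morphism $f_E : f^*E^0 \longrightarrow E^0$. Let $\phi$ denote the Witten map $E^0 \longrightarrow \pi^*E^1$ for $\overline{M}^{1/r}_\Gamma$. Then $f_E^*\phi : f^*E^0 \longrightarrow \tilde\pi^* f^*E^1$ denotes the Witten map for $V$. We have $f_E^*\, c_{top}(\pi^*E^1, \phi) = c_{top}(\tilde\pi^* f^*E^1, f_E^*\phi)$. Then we obtain the following:
$$\tilde\pi_*\, c_{top}\bigl(\tilde\pi^* f^*E^1, f_E^*\phi\bigr) = \tilde\pi_*\, f_E^*\, c_{top}\bigl(\pi^*E^1, \phi\bigr) = f^*\pi_*\, c_{top}\bigl(\pi^*E^1, \phi\bigr).$$
It implies the claim.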
4.3. The argument for vanishing.

4.3.1. Some vanishing. Let us continue to use the notation in Subsubsect. 4.2.3, and let us consider the case that the set $E := \{ i \mid m_i = -1 \}$ is not necessarily empty. We consider the section $\tilde\phi : E^0 \longrightarrow \pi^*E^1 \oplus \mathbb{C}$, induced by the sections $\phi : E^0 \longrightarrow \pi^*E^1$ and $\Phi : E^0 \longrightarrow \mathbb{C}$. Here $\mathbb{C}$ denotes the trivial bundle on $E^0$. Due to Lemma 4.5, the 0-set of $\tilde\phi$ is $M$. Hence we obtain the cohomology class $\pi_* c_{top}(\pi^*E^1 \oplus \mathbb{C}, \tilde\phi)$ as in Subsubsect. 4.2.4. We perturb the section $\Phi$ to $\Phi_a := \Phi - a$ for $a \in \mathbb{C} \setminus \{ t \in \mathbb{R} \mid t \le 0 \}$, and we perturb the section $\tilde\phi = (\phi, \Phi)$ to $\tilde\phi_a := (\phi, \Phi_a)$. Then we know $\tilde\phi_a^{-1}(0) = \emptyset$ due to Lemma 4.4. Thus the class $\pi_* c_{top}(\pi^*E^1 \oplus \mathbb{C}, \tilde\phi)$ is trivial.

4.3.2. The vanishing axiom (the case $|E| = 1$). Let us consider the case $|E| = 1$, say $E = \{i\}$, i.e., $m_i = -1$ and $m_j \ge 0$ $(j \ne i)$. Let $M$ denote the moduli stack of stable $r$-spin curves of type $m$. We take a finite reduction $E^0 \longrightarrow E^1$ of $\bar\partial : \overline{E}^0(m, \alpha) \longrightarrow \overline{E}^1(m, \alpha)$ as in Subsubsect. 4.2.1 for the weights $\alpha$ as in Condition 4.1. Recall that we do not impose either $(P_i)$ or $(P_i, dual)$. We have the vanishing $\pi_* c_{top}(\pi^*E^1 \oplus \mathbb{C}, \tilde\phi) = 0$ as is shown in Subsubsect. 4.3.1.

Let us impose the condition $(P_i)$ to the domain of $\bar\partial$. To distinguish it, we denote the operator by $\bar\partial^{(1)}$. We put $\hat E^0 := (\bar\partial^{(1)})^{-1}(E^1)$. We may assume that it is a subbundle of $E^0$ such that $E^0/\hat E^0$ is of rank one. Let $\hat\pi : \hat E^0 \longrightarrow M$ denote the natural projection and $\iota : \hat E^0 \longrightarrow E^0$ denote the inclusion.

Lemma 4.9. We have $c_{top}(\mathbb{C}, \Phi) = r\cdot\iota_* 1$ in $H^*_{\Phi^{-1}(0)}(E^0)$.

Proof. We have $\Phi^{-1}(0) = \hat E^0$ with the multiplicity $r$, which gives the desired equality.
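The vanishing of Subsubsect. 4.3.1 and its interplay with Lemma 4.9 can be summarized in formulas. The following is a sketch only: $\tilde\phi$, $\Phi$, $\Phi_a$ and $\tilde\phi_a$ are names assumed here for the combined section, the function to $\mathbb{C}$, and their shifts by $a$; $\hat\phi$ is an assumed name for the restriction of $\phi$ to $\hat E^0$; and the product formula for top Chern classes is the one stated in the appendix.

```latex
% Sketch: the perturbed section has empty zero set, so the class vanishes;
% Lemma 4.9 then rewrites the class through the inclusion \iota.
\begin{align*}
\tilde\phi_a^{-1}(0)
  &= \{\, s \in E^0 \mid \phi(s) = 0,\ \Phi(s) = a \,\} = \emptyset
  \qquad \bigl(a \in \mathbb{C}\setminus\mathbb{R}_{\le 0},\ \text{Lemma 4.4}\bigr), \\
\pi_* c_{top}\bigl(\pi^*E^1 \oplus \mathbb{C},\ \tilde\phi\bigr)
  &= \pi_* c_{top}\bigl(\pi^*E^1 \oplus \mathbb{C},\ \tilde\phi_a\bigr) = 0, \\
c_{top}\bigl(\pi^*E^1 \oplus \mathbb{C},\ \tilde\phi\bigr)
  &= c_{top}\bigl(\pi^*E^1, \phi\bigr)\cdot c_{top}\bigl(\mathbb{C}, \Phi\bigr)
   = r\cdot \iota_*\,\iota^* c_{top}\bigl(\pi^*E^1, \phi\bigr)
   = r\cdot \iota_*\, c_{top}\bigl(\hat\pi^*E^1, \hat\phi\bigr).
\end{align*}
```

The second line uses the path $\tilde\phi_{ta}$ $(0 \le t \le 1)$: its zero set is $M$ at $t = 0$ and empty for $t > 0$, so the class with support does not change along the path.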
Let $\hat\phi$ denote the restriction of $\phi$ to $\hat E^0$.

Lemma 4.10. We have $\hat\pi_* c_{top}(\hat\pi^*E^1, \hat\phi) = 0$.

Proof. From Lemma 4.9, we have $r^{-1}\cdot c_{top}(\pi^*E^1 \oplus \mathbb{C}, \tilde\phi) = \iota_* c_{top}(\hat\pi^*E^1, \hat\phi)$. Then the claim follows from the vanishing of $\pi_* c_{top}(\pi^*E^1 \oplus \mathbb{C}, \tilde\phi)$.

Let us consider the tuple $m' = m + r\cdot\delta_i$, where $r\cdot\delta_i$ is given in Axiom 4 in Definition 4.1. Note $m'_i = m_i + r = r - 1$. The moduli stack of stable $r$-spin curves of type $m'$ is naturally isomorphic to $M$. We do not distinguish them. We put $\alpha' = \alpha + 2\cdot\delta_i$. Let $\overline{E}^i(m', \alpha')$ be the bundle of topological vector spaces for the tuple $m'$ with the weight $\alpha'$. We have the natural identification $\overline{E}^i(m, \alpha) \simeq \overline{E}^i(m', \alpha')$, and $\hat E^0 \longrightarrow E^1$ gives a finite reduction of $\overline{E}^0(m', \alpha') \longrightarrow \overline{E}^1(m', \alpha')$ under the identification. Hence the vanishing of the class $\hat\pi_* c_{top}(\hat\pi^*E^1, \hat\phi)$ implies the following vanishing axiom.

Proposition 4.1 (The vanishing axiom). The virtual class is trivial in the case where one of $m_j$ is $r - 1$.

4.3.3. Preliminary for the vanishing in the Ramond case in Axiom 3. Let $\Gamma$ be a decorated stable graph of genus $g$ with $n$ tails marked with the $n$-tuple $m$ of integers such that $m_i \ge -1$. We put $E = \{ i \mid m_i = -1 \}$. Let us consider the case that $E$ is indexed as $E = \{ (i, j) \mid i = 1, \ldots, w,\ j = 1, 2 \}$.

Let $(C, F)$ be the universal $r$-spin curve over $\overline{M}^{1/r}_\Gamma$. Let $\sigma_{i,j}$ denote the section of $C \longrightarrow \overline{M}^{1/r}_\Gamma$ corresponding to $(i, j) \in E$. Note that we have the canonical isomorphism $\rho_{i,j} : F^{\otimes r}|_{\sigma_{i,j}} \simeq \mathbb{C}$ given by the residue.

Let us take a finite reduction $E^0 \longrightarrow E^1$ for $\overline{M}^{1/r}_\Gamma$. Let $\pi : E^0 \longrightarrow \overline{M}^{1/r}_\Gamma$ denote the natural projection. We have the morphism $\Phi_i : E^0 \longrightarrow \mathbb{C}$ given by $\Phi_i(s) = \rho_{i,1}(s^r) + \rho_{i,2}(s^r)$. Note the equality $\Phi(s) = \sum_i \Phi_i(s)$. Let $\phi : E^0 \longrightarrow E^1$ denote the Witten map.

Lemma 4.11. We have $\phi^{-1}(0) \cap \bigcap_i \Phi_i^{-1}(0) = \overline{M}^{1/r}_\Gamma$.

Proof. It follows from $\bigcap_i \Phi_i^{-1}(0) \subset \Phi^{-1}(0)$ and Lemma 4.5.

The maps $\phi$ and $\Phi_i$ induce the section $\tilde\phi : E^0 \longrightarrow \pi^*E^1 \oplus \bigoplus_i \mathbb{C}$. Since the 0-set of $\tilde\phi$ is $\overline{M}^{1/r}_\Gamma$, we obtain the cohomology class $\pi_* c_{top}\bigl(\pi^*E^1 \oplus \bigoplus_i \mathbb{C}, \tilde\phi\bigr)$ in $H^*\bigl(\overline{M}^{1/r}_\Gamma\bigr)$.

Lemma 4.12. We have the vanishing $\pi_* c_{top}\bigl(\pi^*E^1 \oplus \bigoplus_i \mathbb{C}, \tilde\phi\bigr) = 0$.

Proof. Note that we have $\sum_i \Phi_i(s) = \Phi(s)$. Let $a$ be a complex number which is not a non-positive real number. Then the set $\phi^{-1}(0) \cap \Phi_1^{-1}(a) \cap \bigcap_{i=2}^{w} \Phi_i^{-1}(0) \subset \phi^{-1}(0) \cap \Phi^{-1}(a)$ is empty, due to Lemma 4.4. Hence we are done.

4.4. The main theorem.

4.4.1. The statement.

Theorem 4.1. The virtual classes $c^{1/r,m}_{g,n}$ obtained in Subsect. 4.2 satisfy the axioms of Jarvis, Kimura and Vaintrob (see Definition 4.1).
Axiom 4′ (the vanishing axiom) is proved in Proposition 4.1. Let us look at Axiom 2. Under the assumption in Axiom 2, the operators $\bar\partial$ are injective for any point $(C, p, F)$ of the component, and $R^1\pi_*F$ is a locally free sheaf on the component. Hence we can take $0 \longrightarrow R^1\pi_*F$ as a finite reduction. Therefore the claim in Axiom 2 holds. We will see that the other axioms are satisfied in the following subsubsections.

4.4.2. Axiom 1. Axiom 1 follows from the following two lemmas.

Lemma 4.13. If $\Gamma$ is a connected decorated stable graph of genus $g$ with $n$ tails marked with the $n$-tuple $m$, the virtual class $c^{1/r}_\Gamma$ is the same as $\prod_{e\in E(\Gamma)} (r/l_e)\cdot i^* c^{1/r,m}_{g,n}$, where $i$ denotes the natural morphism $\overline{M}^{1/r}_\Gamma \longrightarrow \overline{M}^{1/r,m}_{g,n}$.

Proof. It immediately follows from Lemma 4.8.
(i) Lemma 4.14. In the case when is a disjoint sum of connected graphs , we have c = i c (i) . 1/r
Proof. We take finite reductions E0 (i) −→ E1 (i) for M (i) . Let E 0 −→ E 1 denote the 1/r
direct sum of the pull backs of E0 (i) −→ E1 (i) via the projections (i) : M
−→
1/r 1/r M (i) . Since the bundle of topological vector spaces on M is a direct sum of the pull 1/r 1/r backs of the bundles over M (i) , E 0 −→ E 1 is a finite reduction for M . 1/r Let φ (i) : E0 (i) −→ E1 (i) denote the Witten map on M (i) , then the direct sum φ of 1/r the pull backs of φ (i) is the Witten map on M . 1/r 1/r Let π : E 0 −→ M and π (i) : E0 (i) −→ M (i) denote the natural projections. We (i) have the naturally defined morphism E : E 0 −→ E0 (i) . Then we have the following:
ctop π ∗ E 1 , φ = ctop π ∗ (i) ∗ E1 (i) , (i) ∗ φ (i) i
=
(i) ∗ ctop π (i) ∗ E1 (i) , φ (i) . E ctop π (i) ∗ E1 (i) , φ (i) =
i
The first equality follows from (π ∗ E 1 , φ) i π ∗ (i) ∗ E0 (i) , (i) ∗ φ (i) and the prod uct formula of the top Chern classes, and the third equality follows from E 0 E0 (i) and the Kunneth formula. Then we obtain the following: (i) π∗ ctop π (i) ∗ E1 (i) , φ (i) . π∗ ctop π ∗ E 1 , φ = i
Then the claim immediately follows.
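For orientation, the computation in the proof of Lemma 4.14 can be written out for two components. This is a sketch under stated assumptions: $\Gamma = \Gamma^{(1)} \sqcup \Gamma^{(2)}$, the maps $q_i : E^0 \longrightarrow E_0^{(i)}$ are hypothetical names for the naturally defined morphisms, and $\times$ denotes the external product coming from the Künneth formula.

```latex
% Sketch for two components; q_1, q_2 are hypothetical names.
\begin{align*}
E^0 &\simeq E_0^{(1)} \times E_0^{(2)}, \qquad
\phi = \bigl(q_1^*\phi^{(1)},\ q_2^*\phi^{(2)}\bigr), \\
c_{top}\bigl(\pi^*E^1, \phi\bigr)
  &= q_1^*\, c_{top}\bigl(\pi^{(1)*}E_1^{(1)}, \phi^{(1)}\bigr)\cdot
     q_2^*\, c_{top}\bigl(\pi^{(2)*}E_1^{(2)}, \phi^{(2)}\bigr), \\
\pi_*\, c_{top}\bigl(\pi^*E^1, \phi\bigr)
  &= \pi^{(1)}_* c_{top}\bigl(\pi^{(1)*}E_1^{(1)}, \phi^{(1)}\bigr)\times
     \pi^{(2)}_* c_{top}\bigl(\pi^{(2)*}E_1^{(2)}, \phi^{(2)}\bigr).
\end{align*}
```

The second line is the product formula for top Chern classes of a direct sum, and the third line is the compatibility of the push forward with products.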
4.4.3. Axiom 4. We have only to consider the case where the graph consists of one vertex and no edge. We compare the classes for the tuples $m = (m_1, \ldots, m_n)$ and $m + r\cdot\delta_i$. Recall that we have the natural isomorphism between $\overline{M}^{1/r,\,m+r\delta_i}_{g,n}$ and $\overline{M}^{1/r,m}_{g,n}$. We do not distinguish them. Let $\alpha(t)$ be the family of the weights given as follows:
$$\alpha_j(t) = 0 \quad (j \ne i), \qquad \alpha_i(t) = t.$$
Let $\overline{E}^i(m + r\cdot\delta_i, \alpha(t))$ and $\overline{E}^{\vee i}(m + r\cdot\delta_i, \alpha(t))$ denote the vector bundles for the tuple $m + r\cdot\delta_i$ with the weight $\alpha(t)$. For each $t$, we obtain the cohomology classes $c(t) \in H^*(M)$ by the construction in Subsubsect. 4.2.4. In the case $t = 0$, the cohomology class is the virtual class $c^{1/r,\,m+r\cdot\delta_i}_{g,n}$. Due to Lemma 4.7, the family of classes $\bigl\{ c(t) \mid 0 \le t < 2 \bigr\}$ is constant.

Let $\beta(t)$ $(0 < t \le 2)$ denote the weight such that $\beta_i(t) = t - 2$ and $\beta_j(t) = 0$ for $j \ne i$. Let $\overline{E}^i(m, \beta(t))$ and $\overline{E}^{\vee i}(m, \beta(t))$ denote the vector bundles for the tuple $m$ with the weight $\beta(t)$. We obtain the family of the cohomology classes $\bigl\{ c'(t) \mid 0 < t \le 2 \bigr\}$ of $H^*(M)$. In the case $t = 2$, the class is the same as the virtual class $c^{1/r,m}_{g,n}$. The family $\bigl\{ c'(t) \mid 0 < t \le 2 \bigr\}$ is constant due to Lemma 4.7. Hence we have only to compare $c(t)$ and $c'(t)$ for some $t \in (0, 2)$.

Let us fix an element $t \in (0, 2)$. It is easy to see the following equalities:
$$\overline{E}^i\bigl(m + r\cdot\delta_i, \alpha(t)\bigr) = \overline{E}^i\bigl(m, \beta(t)\bigr), \qquad \overline{E}^{\vee i}\bigl(m + r\cdot\delta_i, \alpha(t)\bigr) = \overline{E}^{\vee i}\bigl(m, \beta(t)\bigr).$$
We denote $\bar\partial$ and $\bar\partial^{\,dual}$ for the tuple $m + r\cdot\delta_i$ by $\bar\partial^{(1)}$ and $\bar\partial^{(1)\,dual}$ respectively. We denote $\bar\partial$ and $\bar\partial^{\,dual}$ for the tuple $m$ by $\bar\partial^{(2)}$ and $\bar\partial^{(2)\,dual}$ respectively. The difference of $\bar\partial^{(1)}$ and $\bar\partial^{(2)}$ and the difference of $\bar\partial^{(1)\,dual}$ and $\bar\partial^{(2)\,dual}$ are the choices of the domains. Recall that we impose the condition $(P_i)$ for the domains of $\bar\partial^{(1)}$. If we impose the condition $(P_i, dual)$, then we obtain the operators $\bar\partial^{(2)}$ and $\bar\partial^{(2)\,dual}$. We use the identification in the following argument without mention.

Let $H$ be a subbundle of $\overline{E}^{\vee 1}\bigl(m + r\cdot\delta_i, \alpha(t)\bigr)$ as in the proof of Lemma 4.1. We put $H_i := \bigl(\bar\partial^{(i)\,dual}\bigr)^{-1}(H)$ $(i = 1, 2)$. We have $H_2 \subset H_1$. We may assume that $H_2$ is a subbundle of $H_1$ such that the rank of $H_1/H_2$ is 1, by replacing $H$ with a bigger subbundle if we need it. For any point $(C, p, F) \in M$, we have the naturally defined morphism:
$$\psi(H_2)_{(C,p,F)} \longrightarrow \mathrm{Cok}\Bigl(\bar\partial^{(1)} : \overline{E}^0_{(C,p,F)} \longrightarrow \overline{E}^1_{(C,p,F)}\Bigr). \qquad (21)$$
Note that the cokernel of the map (21) is at most one dimensional, because $\psi(H_2)_{(C,p,F)} \longrightarrow \mathrm{Cok}\,\bar\partial^{(2)}$ is surjective.

Lemma 4.15. We may assume that the map (21) is surjective for each $(C, p, F)$, by replacing $H$ with a bigger subbundle if we need it.

Proof. Let $s$ be any section of $F \otimes \Omega^{0,1}$. It gives an element of the cokernel of (21). We remark that we can take a section $s$ of $F \otimes \Omega^{0,1}$ with the following conditions:
• The induced element in the cokernel of (21) is not 0.
• $s$ is $C^\infty$, and it vanishes around the singular points and the marked points of $(C, p)$.
If we add an element $\bar\partial\psi^{-1}(s)$ to $H_{(C,p,F)}$, then we obtain the desired property at $x$. It is also elementary to globalize the procedure.

We put $E^1_{t,i} := \psi(H_i)$ $(i = 1, 2)$, and $E^0_{t,i} := \bigl(\bar\partial^{(i)}\bigr)^{-1}\bigl(E^1_{t,i}\bigr)$. Note that $c(t)$ and $c'(t)$ are obtained from finite reductions $E^0_{t,1} \longrightarrow E^1_{t,1}$ and $E^0_{t,2} \longrightarrow E^1_{t,2}$ respectively.
(1) −1 1 (1) We put Eˆ t0 := ∂ (Et 2 ). Due to Lemma 4.15 and the graph-continuity of ∂ , 1 0 , and we have the Eˆ t0 is a subbundle of E m + r · δ i . It is also a subbundle of Et,2 following exact sequence: 0 −→ σi∗ F(m) −→ 0. 0 −→ Eˆ t0 −→ Et,2
(22)
Here σi denotes the i th section, and F(m) denotes the universal r-spin structure for the tuple m. The morphism E 0 −→ σi∗ F(m) gives a section ρ : E 0 −→ π ∗ σi∗ F(m). 1 . Let cˆ denote The construction in Subsubsect. 4.2.4 can be applied to Eˆ t0 −→ Et,2 the obtained cohomology class. By the same argument as the proof of Lemma 4.6, we can show that cˆ is the same as c (t) . 0 −→ E 1 be the Witten map. Let φˆ denote the restriction of φ to E ˆt . Let φ : Et,2 t,2 0 We denote the projections Et,2 and Eˆ t onto M by π and πˆ respectively. Let ι denote 0 . Then we have the following relation from the exact sequence the injection Eˆ t −→ Et,2 (22): 1 1 ), φˆ = ctop π ∗ σi∗ F, ρ · ctop π ∗ (Et,2 ), φ . ι∗ ctop πˆ ∗ (Et,2 Hence we obtain the following: 1 1 , φˆ = c1 σi∗ F · π∗ ctop π ∗ (Et,2 , φ) . πˆ ∗ ctop πˆ ∗ Et,2 It implies cˆ = ψ˜ i · c(t) . By noting the signature in Definition 4.3, we obtain the formula 1/r,m+r·δ i 1/r,m cg,n = −ψ˜ i (m) · cg,n . Thus we obtain Axiom 4. 4.4.4. Axiom 5. We consider the case that the nth point Pn is forgotten. Since we already know Axiom 4, we can assume that mi is less than r − 2 for each marked point. Let m denote a tuple such that mn = 0, and let m denote a tuple obtained from m by forgetting mn . Let M(m) and M(m ) denote the corresponding moduli stacks. i ∨i ¯ ∂¯ dual on M(m). We also We have the bundles E (m), E (m) and the operators ∂, i ∨i dual
on M(m ). Let πn : have the bundles E (m ), E (m ) and the operators ∂, ∂
th M(m) −→ M(m ) denote the morphism forgetting the n marked point. Let (C, p, F) be a point of M(m). The image of the point via πn is denoted by (C , p , F ). We have C = C or C = C ∪ P1 . We have the continuous map F(C,p,F ) : 0
0
E (C,p,F ) −→ E (C ,p ,F ), by forgetting the section on P1 . We have the continuous map 1
1
G(C,p,F ) : E (C ,p ,F ) −→ E (C,p,F ), by putting 0 on P1 . Then the families of the maps 0
1
F : E (m) −→ πn∗ E(m ) and G : πn∗ E (m ) −→ E 1 (m) are strongly continuous, due to Lemma 3.21. Lemma 4.16. We have the natural isomorphism of the L2 -cohomology groups of F˜ (resp. L) on C˜ and F˜ (resp. L ) on C˜ with the given weights (see Subsect. 3.4). Proof. Let us consider the case when the forgotten point Pn is a nodal point. It means that Pn is on the rational curve collapsed, which contains no other marked points and exactly two nodal points Q1 , Q2 . The indices m(Qi ) at the nodal points are less than r − 2, and thus one of the following holds: (i) m(Q1 ) + m(Q2 ) = r − 2, or (ii) m(Q1 ) = m(Q2 ) = −1. In both cases, the claims can be checked easily.
Let us consider the case that the point Pn is on the marked point. It means that Pn is on a rational curve P1 which contains exactly one other marked point Pi and exactly one other nodal point Q. As in Case 1, it is easy to check the claims. We take families of 1,1 -hermitian metrics F and F ⊗ 0,1 on M(m). It induces i the family of the Hilbert norms N1 for E (m). We also take a family of 1,1 -hermitian i metrics for M(m ), which induces the family of the Hilbert norms for E (m ). It also i induces the family of the semi-norms N0 on E (m), as in Subsubsect. 3.3.9. We put ∨0 1 Ns := (1 − s) · N0 + s · N1 . It induces the anti-linear maps ψ (s) : E −→ E . ∨1 We take a finite dimensional vector subbundle H of the bundle E (m ) as in the proof ∨ 1 ∨1 of Lemma 4.1. Since we have the strongly continuous map πn∗ E (m ) −→ E (m), we can regard πn∗ H as the subbundle of E
∨1
(m). dual −1 (m), which is given by ∂ (H ) on M(m). On dual −1 ∨0
(H ) the other hand, we have a subbundle H2 of E (m ), which is given by ∂
on M(m ). We take a subbundle H2 of E
∨0
Lemma 4.17. The restriction of the strongly continuous map E to H2 gives an isomorphism of H2 and πn∗ (H2 ). Proof. It follows from Lemma 4.16.
∨0
∨0 (m) −→ πn∗ E (m )
1 0 −1 1 Es ⊂ E (m) for 0 ≤ s ≤ 1. We put Es1 := ψ (s) H2 ⊂ E (m) and Es0 := ∂ 0
1
On the other hand, we have a finite reduction E 0 −→ E 1 for E (m ) −→ E (m ) over M(m ), which is obtained from H . Lemma 4.18. We can naturally identify ∂ : E00 −→ E01 and ∂ : πn∗ E 0 −→ πn∗ E 1 , via 0
0
1
1
the morphisms E (m) −→ πn∗ E (m ) and πn∗ E (m ) −→ E (m). Proof. It follows from Lemma 4.16 and Lemma 4.17.
We have the Witten maps φ (s) : Es0 −→ Es1 (0 < s ≤ 1) given by φ (s) (u) := (s) ∂u + ρEs1 ψ (ur−1 ) as in Subsubsect. 4.2.3. By our construction, φ (s) is continuous with respect to the parameter s. Let φ denote the Witten map for the finite reduction E 0 −→ E 1 . Under the identifications πn∗ E 0 E00 and πn∗ E 1 E01 , πn∗ φ gives the map E00 −→ E01 , which we denote by φ (0) . Lemma 4.19. The family φ (s) 0 ≤ s ≤ 1 is continuous. Proof. We have only to check at s = 0. It can be easily seen by using the description given in Remark 4.5. The maps φ (s) (0 ≤ s ≤ 1) induce the virtual class of M(m). The map φ (0) = πn∗ φ induces the pull back of the virtual class of M(m ) via πn . Thus we can conclude that the pull back of the virtual class of M(m ) is the same as the virtual class of M(m), due to Lemma 4.19. Thus we obtain Axiom 5.
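The role of the parameter range $0 < s \le 1$ in the argument for Axiom 5 can be illustrated by a one-line estimate. This is a sketch only, assuming that $N_1$ is a norm, that $N_0$ is a semi-norm, and that $N_s$ is the interpolation $(1 - s)\cdot N_0 + s\cdot N_1$ introduced above; if $N_0$, $N_1$ denote squared norms instead, the estimate has the same form.

```latex
% Sketch: N_s is positive definite for s > 0, but only a semi-norm at s = 0.
\begin{align*}
N_s(u) &= (1-s)\cdot N_0(u) + s\cdot N_1(u)
        \ \ge\ s\cdot N_1(u) \ >\ 0
        \qquad (u \ne 0,\ 0 < s \le 1), \\
N_0(u) &\ \ge\ 0 \qquad \text{with equality possible for } u \ne 0 .
\end{align*}
```

Under these assumptions the anti-linear maps $\psi^{(s)}$ are defined through a genuine norm only for $s > 0$, which is consistent with the endpoint $s = 0$ being handled separately through the identification with the pull back and the continuity statement of Lemma 4.19.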
4.4.5. Axiom 3. Let be a decorated stable graph of genus g with n-tails marked with the n-tuple m. We may assume that is connected. We consider the morphisms: 1/r
1/r
1/r
µ : M ×M M˜ −→ M ,
1/r
p : M ×M M˜ −→ M˜ .
We denote the universal curve over M and M˜ by C and C˜ respectively. We have the canonical map λ : p∗ C˜ −→ µ∗ C . Let F and F˜ denote the universal r-spin structures for C and C˜ . Then we have the canonical map : µ∗ F −→ λ∗ p ∗ F˜ . It is isomorphic outside the nodal points of Ramond type. On the nodal point of Ramond type, we have the matching condition. More precisely, let e denote the edge of which is of Ramond type. We have the cor1/r responding divisor De of µ∗ C , which is a nodal point of µ∗ C −→ M ×M M˜ . Then we have the line bundle Ie on De , and we can take a morphism λ∗ p ∗ F˜ −→ Ie such that the kernel of the morphism λ∗ p ∗ F˜ −→ e Ie is isomorphic to µ∗ F . 1/r
Let i : M
1/r,m
−→ Mg,n
i
be the natural morphism. Let E (m) and E
∨i
(m) be the
1/r,m bundle over Mg,n obtained from the universal r-spin structure as in Subsubsect. 3.4.2. 1/r,m The pull back of cg,n via the morphism i ◦ µ is obtained from a finite reduction for ∗ ∗ i ∗ ∗ ∨i
µ i E (m) and µ i E (m). (See Subsubsect. 4.2.7.) i ∨i 1/r,m ˜ and E ˜ (α) ˜ over M˜ . Here, the On the other hand, we have the bundles E ˜ (α) ˜ weight α˜ is given as follows: Let t be a tail of , which is obtained from a tail of , then ˜ which is obtained from an edge of , then we put we put α(t) ˜ = 0. Let t be a tail of , α(t) ˜ := (mt + 1), where mt denotes the index of the tail. We take the domains of ∂ 1/r,mv dual and ∂ as in Condition 4.2. We remark that E ˜ is a product of the stacks Mgv ,nv for ∨i ˜ and that E i˜ (α) ˜ and E ˜ (α) ˜ are direct sums of the pull backs of the the vertices v of , 1/r,mv
∨i
i
bundles of the topological vector spaces E v (α˜ v ) and E v (α˜ v ) over Mgv ,nv . 0
1
Let Ev0 −→ Ev1 be finite reductions of E v (α˜ v ) −→ E v (α˜ v ). Let E 0 −→ E 1 be a i ˜ and direct sum of the pull backs of Ev0 −→ Ev1 , which is a finite reduction for E ˜ (α) ∨i
˜ E ˜ (α). Lemma 4.20. Assume that all the edge of is of Neveu-Schwarz type. Then the induced 1/r cohomology class from E 0 −→ E 1 is the virtual class of M˜ . ˜ Then the claim follows from Proof. We remark that |(mi + 1)| < 1 for any tails of . Lemma 4.7. Let us consider the pull back p ∗ E 0 −→ p∗ E 1 . Then we have the canonical map −→ e Ie , which may be assumed to be surjective. Let Eˆ 0 denote the kernel.
p∗ E 0
0 1 Lemma 4.21. Eˆ 0 −→ p ∗ E 1 is a finite reduction of µ∗ i ∗ E −→ µ∗ i ∗ E . i
0
Proof. It is clear from our construction of E and the choice of the domains ∂ : E −→ 1 E .
Let t be a tail of ˜ obtained from an edge of of Ramond type. Then we have the map Rest : E 0 −→ C, by taking the residue of s r at the marked points corresponding to the tail t. Let te,i (i = 1, 2) denote the tails of ˜ corresponding to the edge e of of Ramond type. Then we obtain the map e (s) := Reste,1 (s r ) + Reste,2 (s r ). Then we have the top Chern class ctop C, e ∈ H ∗ −1 E 0 for each edge e of Ramond type. e (0)
On the other hand, we have the map pE : Eˆ 0 −→ E 0 of the orbifolds. Then we have the cohomology classes pE ∗ (1) ∈ H ∗ ˆ 0 (E 0 ), where 1 denotes the unit element of pE (E )
the cohomology ring of Eˆ 0 . The following lemma is obtained from our construction by a geometric consideration.
Lemma 4.22. We have pE Eˆ 0 = e −1 e (0), and the equality e ctop C, e = pE ∗ (1) in H ∗ ˆ 0 E 0 . pE (E )
Proof. For simplicity of the explanation, we give the argument in the case where consists of one edge of Ramond type. The general case can be discussed similarly. Let dv denote the rank of Ev0 , and let U (dv ) denote the dvth unitary group. We put V := v P (Ev0 ), where P (Ev0 ) denote the U (dv )-principal bundles associated to Ev0 . Note their isotopy groups are trivial, i.e. P (Ev0 ) are the usual manifolds. Then we have 1/r the morphism V −→ M . Let Eˇ 0 denote the pull back of E 0 . We have two subsets pE (Eˆ 0 ) ×E 0 Eˇ 0 and −1 (0) ×E 0 Eˇ 0 . We have only to show the coincidence of them. Let x be any point of V . We have only to compare the fibers of pE (Eˆ 0 ) ×E 0 Eˇ 0 and −1 (0) ×E 0 Eˇ 0 over x. The point x ∈ V naturally gives a stable r-spin curve (C, p, F) ˜ Let Pe,i denote the points corresponding to the tails te,i . Then we have the of type . ⊗r canonical isomorphism F|P C given by the residue. They induce the isomorphism e,i ρ : F|Pe,1 F|Pe,2 . 1/r 1/r M ˜ over M ˜ . Let pˇ denote Let Vˇ denote the fiber product of V and M × M
−1 ˇ to the the natural morphism V −→ V⊗. rThen the fiber pˇ (x) is naturally 0isomorphic set I := κ : F|Pe,1 F|Pe,2 κ = −ρ . Hence the fiber of pE (Eˆ ) ×E 0 Eˇ 0 over x is as follows: " 0 0 −1 s ∈ E(C, p,F ) κ(s|P1 ) = s|P2 = s ∈ E(C,p,F ) s ∈ e (0) . κ∈I
ˇ0 Hence we obtain −1 (0)×E 0 Eˇ 0 = pE (Eˆ 0 )× E 0 E . We obtain the equality of the cohomology classes, by comparing the induced v U (dv ) equivariant cohomology classes. (See the Appendix.) Let φ : E 0 −→ E 1 denote the Witten map as in Subsubsect. 4.2.3. Let π : E 0 −→ denote the projection. The maps φ and e induce the section φ˜ of π ∗ E 1 ⊕ ! ∗ 1 ˜ e C. Then we obtain the top Chern class ctop π E ⊕ e C, φ = e ctop (C, e ) · ∗ 1 ctop π E , φ . On the other hand, let φˆ denote the restriction of p∗ φ to Eˆ 0 . Let πEˆ : Eˆ 0 −→ 1/r,m M × M ˜ be the projection. Then we obtain the top Chern class ctop π ∗ p ∗ E 1 , φˆ . 1/r M˜
M
Eˆ
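Lemma 4.23. We have the following equality:
$$p_{E*}\, c_{top}\bigl(\pi_{\hat E}^*\, p^*E^1,\ \hat\phi\bigr) = c_{top}\Bigl(\pi^*E^1 \oplus \bigoplus_e \mathbb{C},\ \tilde\phi\Bigr).$$

Proof. It follows from Lemma 4.22 and the following equality:
$$p_{E*}\, c_{top}\bigl(\pi_{\hat E}^*\, p^*E^1,\ \hat\phi\bigr) = p_{E*}\, c_{top}\bigl(p_E^*\,\pi^*E^1,\ \hat\phi\bigr) = p_{E*}(1)\cdot c_{top}\bigl(\pi^*E^1, \phi\bigr).$$
Note we have the equality $p\circ\pi_{\hat E} = \pi\circ p_E$.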
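The last equality in the proof of Lemma 4.23 is an instance of the projection formula. The following sketch spells it out, assuming that $\hat\phi$ is the pull back $p_E^*\phi$ under the identification $\pi_{\hat E}^*\,p^*E^1 = p_E^*\,\pi^*E^1$ and that the top Chern class is compatible with pull back as stated in the appendix.

```latex
% Sketch: projection formula p_{E*}(p_E^* x) = p_{E*}(1) \cdot x,
% applied to x = c_{top}(\pi^* E^1, \phi).
\begin{align*}
c_{top}\bigl(p_E^*\,\pi^*E^1,\ p_E^*\phi\bigr)
  &= p_E^*\, c_{top}\bigl(\pi^*E^1, \phi\bigr), \\
p_{E*}\, p_E^*\, c_{top}\bigl(\pi^*E^1, \phi\bigr)
  &= p_{E*}(1)\cdot c_{top}\bigl(\pi^*E^1, \phi\bigr).
\end{align*}
```

Combined with Lemma 4.22, which identifies $p_{E*}(1)$ with the product of the classes $c_{top}(\mathbb{C}, \Phi_e)$ over the Ramond edges (the symbol $\Phi_e$ being an assumed name for the residue map of the edge $e$), this gives the statement of the lemma.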
1/r,m
Corollary 4.1. We have p∗ µ∗ c = 0, if one of the edge of is of Ramond type. Namely, Axiom 3 holds in the Ramond case. Proof. Due to Lemma 4.23, we have only to show the vanishing of π∗ ctop π ∗ E 1 ⊕ e C, φ , which follows from Lemma 4.12. Let us consider the case that all of the edge of is of Neveu-Schwarz type. Then 0 1 p∗ E 0 −→ p ∗ E 1 is a finite reduction of µ∗ i ∗ E (m) −→ µ∗ i ∗ E (m). Hence the induced 1/r, m 1/r,m cohomology class is the same as µ∗ i ∗ cg,n . Then we obtain the equality µ∗ c = ∗ 1/r,m · p r/ l c , due to Lemma 4.20. Then Axiom 3 in the Neveu-Schwarz e e∈E() ˜ case immediately follows. Therefore the proof of Theorem 4.1 is accomplished. 5. Appendix 5.1. Locally trivial orbispaces. In the following, orbispaces are assumed to be paracompact. For an orbispace V , [V ] denote the associated topological space. Let V be an orbispace and E be a vector bundle over V . Once we take a hermitian metric of E, we obtain the U (rE )-principal bundle P (E), where rE denotes the rank of E. We only consider the rational cohomology groups. Remark 5.1. We do not intend to develop the general theory.
Definition 5.1. Let $V$ be an orbispace. It is called locally trivial, if the isotropy groups of any points are finite abelian groups and independent of a choice of a point up to isomorphisms.

Lemma 5.1. If $V$ is a locally trivial orbispace, then there exist a topological space $V'$, a finite abelian group $A$ and a homomorphism $\rho : \pi_1(V') \longrightarrow \mathrm{Aut}(A)$ such that $V \simeq V'/T_\rho$. Here $\mathrm{Aut}(A)$ denotes the automorphism group of $A$, and $T_\rho$ denotes the torsor associated to $\rho$.

Let $E$ be a complex vector bundle over a locally trivial orbispace $V$. Then we have the vector bundle $E'$ over $V'$ and the $T_\rho$-action on $E'$. We do not distinguish $(V', E')$ and $(V, E)$ in the sequel, when $V$ is a locally trivial orbispace. Let $V$ be a locally trivial orbispace, and $E$ be a vector bundle over $V$. In that case, the following lemma can be checked easily.

Lemma 5.2. $P(E)$ is locally trivial.
5.2. Locally trivializable orbispaces.

Definition 5.2.
• An orbispace $V$ is called locally trivializable, if there is a pair $(L, P)$ of a compact Lie group $L$ and a principal $L$-bundle $P$ over $V$ such that $P$ is locally trivial.
• Let $E$ be a vector bundle on $V$. If the associated unitary principal bundle $P(E)$ is locally trivial as an orbispace, then $V$ is called s-locally trivializable.

If $V$ is a locally trivializable orbispace, it is convenient to consider the Borel construction $B(P, L) := P \times_L EL$ and the equivariant cohomology $H^*_L(P) := H^*\bigl(B(P, L)\bigr)$. Here $(L, P)$ denotes a pair of a compact Lie group $L$ and a principal $L$-bundle over $V$ as in Definition 5.2, and $EL$ denotes the universal bundle over a classifying space $BL$ of $L$. For example, a vector bundle $E$ over $V$ induces the $L$-equivariant vector bundle $\widetilde{E}$ over $P$. Hence we have the equivariant Chern class in $H^*_L(P)$. Since we consider the rational cohomology groups, we have the isomorphism $H^*_L(P) \simeq H^*([V])$.

If $(L_i, P_i)$ are pairs of compact Lie groups $L_i$ and $L_i$-principal bundles $P_i$ over $V$ as in Definition 5.2, we have the $L_1 \times L_2$-principal bundle $P_1 \times_V P_2$, which is a locally trivial orbispace. We have the natural isomorphisms:
$$H^*_{L_1}(P_1) \simeq H^*_{L_1 \times L_2}(P_1 \times_V P_2) \simeq H^*_{L_2}(P_2).$$
In this sense, the group $H^*_L(P)$ is independent of a choice of $(L, P)$, and hence we often denote it by $H^*(V)$, and the Chern class is given in this group. Similarly, we can consider the relative cohomology groups $H^*(V, S)$ $(S \subset V)$ and the cohomology groups with support $H^*_S(V)$.

Lemma 5.3. For any decorated graph $\Gamma$, the orbispace $\overline{M}^{1/r}_\Gamma$ is locally trivializable.

Proof. The claim can be reduced to the case $\overline{M}^{1/r,m}_{g,n}$. Since it is smooth, we have the principal bundle associated with the tangent bundle $T\overline{M}^{1/r,m}_{g,n}$. Since the isotropy group of a general point is finite abelian, the claim immediately follows.
5.3. Pull back. Let $f : V_1 \longrightarrow V_2$ be a morphism of s-locally trivializable orbispaces. Then we have vector bundles $E_i$ over $V_i$ such that $P(E_i)$ are locally trivial. We have the following morphisms:
$$P(E_1) \longleftarrow P(E_1) \times_{V_1} f^*P(E_2) \longrightarrow P(E_2).$$
Note that $P(E_1) \times_{V_1} f^*P(E_2)$ is locally trivial. Hence we have the naturally defined morphism:
$$H^*_{U(d_{E_2})}\bigl(P(E_2)\bigr) \longrightarrow H^*_{U(d_{E_1}) \times U(d_{E_2})}\bigl(P(E_1) \times_{V_1} f^*P(E_2)\bigr) \xleftarrow{\ \simeq\ } H^*_{U(d_{E_1})}\bigl(P(E_1)\bigr).$$
In other words, we have the naturally defined morphism $H^*(V_2) \longrightarrow H^*(V_1)$.
5.4. Thom isomorphism. Let $V$ be an s-locally trivializable orbispace, and let $P(E_1)$ be locally trivial for a vector bundle $E_1$ over $V$. Let $E$ be a vector bundle over $V$. Then $P(E_1) \times_V P(E)$ and $\hat E := P(E_1) \times_V P(E) \times_V E$ are also locally trivial, and the natural projection $\hat E \longrightarrow P(E_1) \times_V P(E)$ is a vector bundle. Then we have the Thom isomorphism $H^{*+2r_E}_{cpt}\bigl(B(\hat E, L)\bigr) \simeq H^*\bigl(B(P(E_1) \times_V P(E), L)\bigr)$, where $L$ denotes the group $U(d_{E_1}) \times U(d_E)$. In other words, we have the Thom isomorphism $\pi_* : H^{*+2r_E}_{cpt}(E) \simeq H^*(V)$. Similarly, we obtain the Gysin morphism by using the Borel construction. The functorial properties can be easily reduced to the ordinary case.

5.5. The top Chern class. Let $U(r)$ be the $r$th unitary group, and let $p : EU(r) \longrightarrow BU(r)$ denote the universal $U(r)$-principal bundle. We have the associated universal vector bundle $\mathcal{E}$ over $BU(r)$. Then we have the universal section $s : EU(r) \longrightarrow p^*\mathcal{E}$, whose 0-set is $BU(r)$.

Let $E$ be a vector bundle over a locally trivializable orbispace $V$ with a section $\phi$. Let $(L, P)$ be a pair of a compact Lie group $L$ and an $L$-principal bundle $P$ over $V$ as in Definition 5.2. Then we have the induced vector bundle $\widetilde{E}$ over $B(P, L)$ with the induced section $\widetilde\phi$. We can take a continuous map from $B(P, L)$ to $\mathcal{E}$ such that $\widetilde{E}$ is identified with the pull back of $p^*\mathcal{E}$ and $\widetilde\phi$ with the pull back of $s$. Hence we obtain the map $H^*_{BU(r)}(\mathcal{E}) \longrightarrow H^*_{\widetilde\phi^{-1}(0)}\bigl(B(P, L)\bigr) \simeq H^*_{\phi^{-1}(0)}(V)$. The image of the Thom class in $H^*_{BU(r)}(\mathcal{E})$ is called the top Chern class, and it is denoted by $c_{top}(E, \phi)$. The standard properties $c_{top}(E_1 \oplus E_2, \phi_1 \oplus \phi_2) = c_{top}(E_1, \phi_1)\cdot c_{top}(E_2, \phi_2)$ and $f^* c_{top}(E, \phi) = c_{top}(f^*E, f^*\phi)$ can be shown by reducing to the ordinary case.
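Two consequences of the definition are used repeatedly in the vanishing arguments of Section 4 and may be worth recording. This is a sketch under the assumption that $c_{top}(E, \phi)$ is regarded as a class with support in the zero set $\phi^{-1}(0)$, as in the construction above.

```latex
% Sketch: supports of top Chern classes.
\begin{align*}
\phi^{-1}(0) = \emptyset
  \ &\Longrightarrow\
  c_{top}(E, \phi) \in H^*_{\emptyset}(V) = H^*(V, V) = 0, \\
c_{top}(E_1 \oplus E_2,\ \phi_1 \oplus \phi_2)
  &= c_{top}(E_1, \phi_1)\cdot c_{top}(E_2, \phi_2)
  \ \in\ H^*_{\phi_1^{-1}(0)\,\cap\,\phi_2^{-1}(0)}(V).
\end{align*}
```

The first line is the mechanism behind the vanishing statements obtained by perturbing a section so that its zero set becomes empty; the second line records that the product of two classes with support is supported on the intersection of the two zero sets.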
Acknowledgement. The author is heartily grateful to Professor T. Jarvis for his interest, comments and encouragement. He also acknowledges Professors T. Kimura and Y. P. Lee for their kindness. He thanks the referees for their readings. The author wrote the original version of this paper at Osaka City University, the revision at Kyoto University, and the final version at Max-Planck Institute for Mathematics. The author expresses his gratitude to the institutions.
References
1. Chen, W.: A Homotopy theory of orbispaces. http://arxiv.org/list/math.AT/0102020, 2001
2. Fujita, H., Kuroda, S.T.: Functional Analysis I (in Japanese). Tokyo: Iwanami Shoten, 1983
3. Jarvis, T.: Geometry of the moduli of higher spin curves. Internat. J. Math. 11, 637–663 (2000)
4. Jarvis, T.: Torsion-free sheaves and moduli of generalized spin curves. Compositio Math. 110, 291–333 (1998)
5. Jarvis, T.: The Picard group of the moduli of higher spin curves. New York J. Math. 7, 23–47 (2001)
6. Jarvis, T., Kimura, T., Vaintrob, A.: Moduli spaces of higher spin curves and integrable hierarchies. Compositio Math. 126, 157–212 (2001)
7. Jarvis, T., Kimura, T., Vaintrob, A.: Gravitational descendents and the moduli space of higher spin curves. In: Advances in algebraic geometry motivated by physics (Lowell, MA, 2000). Contemp. Math. 276, pp. 167–177, 2001
8. Jarvis, T., Kimura, T., Vaintrob, A.: Spin Gromov-Witten Invariants. Commun. Math. Phys. 259, no. 3, 511–543 (2005)
9. Polishchuk, A.: Witten's top Chern class on the moduli space of higher spin curves. In: Frobenius manifolds, Aspects Math. E36. Wiesbaden: Vieweg, pp. 253–264, 2004
10. Polishchuk, A., Vaintrob, A.: Algebraic construction of Witten's top Chern class. In: Advances in algebraic geometry motivated by physics (Lowell, MA, 2000). Contemp. Math. 276, pp. 229–249, 2001
11. Seely, R., Singer, I.M.: Extending $\bar{\partial}$ to Singular Riemann Surfaces. J. Geom. Phys. 5, 121–136 (1989)
12. Witten, E.: Two dimensional gravity and intersection theory on the moduli space. Surveys in Diff. Geom. 1, 243–310 (1991)
13. Witten, E.: Algebraic geometry associated with matrix models of two dimensional gravity. In: Topological methods in modern mathematics (Stony Brook, NY, 1991). Houston, TX: Publish or Perish, pp. 235–269, 1993

Communicated by N.A. Nekrasov
Commun. Math. Phys. 264, 41–69 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1501-8
Communications in
Mathematical Physics
Nonassociative Tori and Applications to T-Duality
Peter Bouwknegt1, Keith Hannabuss2, Varghese Mathai3
1 Department of Theoretical Physics, Research School of Physical Sciences and Engineering, and Department of Mathematics, Mathematical Sciences Institute, The Australian National University, Canberra, ACT 0200, Australia. E-mail: [email protected]
2 Balliol College, Oxford OX1 3BJ, England. E-mail: [email protected]
3 Department of Pure Mathematics, University of Adelaide, Adelaide SA 5005, Australia. E-mail: [email protected]
Received: 17 December 2004 / Accepted: 19 September 2005 Published online: 9 March 2006 – © Springer-Verlag 2006
Abstract: In this paper, we initiate the study of $C^*$-algebras $A$ endowed with a twisted action of a locally compact abelian Lie group $G$, and we construct a twisted crossed product $A \rtimes G$, which is in general a nonassociative, noncommutative algebra. The duality properties of this twisted crossed product algebra are studied in detail, and are applied to T-duality in Type II string theory to obtain the T-dual of a general principal torus bundle with general H-flux, which we will argue to be a bundle of noncommutative, nonassociative tori. Nonassociativity is interpreted in the context of monoidal categories of modules. We also show that this construction of the T-dual includes the other special cases already analysed in a series of papers.

1. Introduction
Recent work has revealed the strong connections between T-duality in string theory and Takai duality for $C^*$-algebras (as for instance discussed in the introduction to [3, 28]), but for general H-fluxes $C^*$-algebras are no longer adequate. In this paper we present a generalisation which permits a very precise description of the general T-dual. Let T be a compact connected Abelian Lie group of rank $\ell$ with Lie algebra $\mathfrak{t}$ and let $\mathfrak{t}^*$ be the dual of $\mathfrak{t}$. Let $E \to M$ be a principal T-bundle with connection. By the Chern-Weil construction the space $\Omega(E)^T$ of T-invariant forms on $E$ is isomorphic to the space of forms on $M$ with values in $\wedge\mathfrak{t}^*$, i.e.
\[ \Omega^k(E)^T \cong \bigoplus_{p=0}^{k} \Omega^p(M, \wedge^{k-p}\mathfrak{t}^*), \tag{1.1} \]
and by a classical result of Chevalley and Koszul, the de Rham complex $(\Omega^\bullet(E), d)$ is chain homotopy equivalent to the complex $(\Omega^\bullet(M, \wedge^\bullet\mathfrak{t}^*), D)$ with a modified de Rham differential $D$, and hence the associated cohomologies are isomorphic. Furthermore, we
know that any class in $H^k(E)$ can be represented by a form in $\Omega^k(E)^T$. Thus, under this isomorphism, an $H$-flux in $H^3(E)$ can be considered as a 4-tuple $H = (H_3, H_2, H_1, H_0)$ with $H_p \in \Omega^p(M, \wedge^{3-p}\mathfrak{t}^*)$ for $p = 0, 1, 2, 3$, closed under the action of $D$ (cf. [18, 5] for more details).$^1$

T-duality for principal circle bundles was treated geometrically in [2, 3], and dimensional considerations force the $H_0$ and $H_1$ components of the H-flux to vanish in this case. The T-dual turns out to be another principal circle bundle with T-dual H-flux. The arguments were extended in [4] to principal T-bundles with H-flux satisfying the condition that the $H_0$ and $H_1$ components vanish. Then the T-dual turns out to be another principal T-bundle with T-dual H-flux having vanishing $H_0$ and $H_1$ components. The analysis in [28, 29] shows that if one considers principal T-bundles with H-flux satisfying the condition that just the $H_0$ component vanishes, then one arrives at the surprising conclusion that the T-dual bundle has to have noncommutative tori as fibres, provided the $H_1$ component is non-zero. The weaker condition in [28, 29] permits non-vanishing $H_1$, but still excludes non-zero $H_0$. In this paper we shall remove the last of these constraints, to allow a non-vanishing $H_0$ component. In this case, we arrive at the astonishing conclusion that the T-dual bundle has to have nonassociative tori as fibres, taking it even beyond the normal range of noncommutative geometry.

The key step is provided by a new explicit construction of a continuous trace algebra $B$ having a given Dixmier-Douady invariant, whose spectrum is the total space $E$ of the principal T-bundle, together with automorphisms $\beta_g$ for $g \in G = \mathfrak{t}$, which transform the spectrum in a way compatible with the $G$-action on $E$. The new features arise because in general $g \mapsto \beta_g$ is not a homomorphism but satisfies $\beta_x\beta_y = \mathrm{ad}(v(x, y))\beta_{xy}$, where $\mathrm{ad}(v)$ denotes conjugation by a unitary element of the multiplier algebra $MB$.

One expects the algebra associated with the T-dual to be the crossed product $B \rtimes_\beta G$, but the twisting forces us to take a suitably twisted crossed product $B \rtimes_{\beta,v} G$. Such (Leptin-)Busby-Smith twistings have long been known but, for non-trivial $H_0$, $v(x, y)$ is not a cocycle and that means that associativity fails in the twisted crossed product.

In Sect. 2 we give the relationship between the differential forms and the multicharacters on $G$ which will be used in our later constructions. Section 3 reviews the generalised Busby-Smith twisted crossed products. An example of a twisting is given in Sect. 4, together with a proof that it is the only type up to stability. This is followed by an example of a nonassociative generalisation of the compact operators. The theory of twisted induced representations is developed in Sect. 6 and then used to construct examples of algebras with given spectrum and Dixmier–Douady class in Sect. 7. In Sect. 8 it is shown that the twisted crossed product of a twisted induced algebra is isomorphic to a generalisation of the twisted compact operators. In Sect. 9, the double dual is shown to be the tensor product of the original algebra with the twisted compact operators, that is, Morita equivalent in this category to the original algebra. In Sect. 10, the mathematical results of the previous sections are used to justify the assertion that the T-dual to a general principal torus bundle with H-flux is a bundle of nonassociative tori. The final section outlines how associativity can be restored by working in a different category, an idea which will be explored in more detail in a subsequent paper.

$^1$ The conclusions in this paper are valid for integral classes $H \in H^3(E, \mathbb{Z})$ as well, since this paper deals exclusively with the introduction of the additional 'degree of freedom' $H_0$, which does not carry torsion, on top of established results which hold in the case of torsion $H$. Note that $H_0$ can be identified with the restriction of $H$ to a fibre. For simplicity we have chosen to formulate some of the results in terms of differential forms.
2. Differential Forms and Multicharacters
Let T be a compact connected Abelian Lie group of rank $\ell$, and $E \to M$ a principal T-bundle. In essence the action of T on $E$ provides a map from the Lie algebra $\mathfrak{t}$ to vector fields on $E$, which we write $X \mapsto \xi_X$, and then a $p$-form $f \in \Omega^p(E)^T$ in the fibre directions defines the antisymmetric multilinear form valued function on $M$,
\[ (\xi^* f)(X_1, X_2, \ldots, X_p) = f(\xi_{X_1}, \xi_{X_2}, \ldots, \xi_{X_p}). \tag{2.1} \]
For abelian Lie groups the form exponentiates to a multicharacter on $G = \mathfrak{t}$,
\[ \phi\bigl(\exp(X_1), \ldots, \exp(X_p)\bigr) = \exp\bigl(-2\pi i (\xi^* f)(X_1, \ldots, X_p)\bigr). \tag{2.2} \]
(A multicharacter is a character in each variable, and this property follows from the additivity of $\xi^* f$, and $\phi$ is also antisymmetric in the sense that even permutations of its variables leave it unchanged whilst odd permutations invert it.) The multicharacter property ensures that this is always a (Moore) cocycle in $Z^p(\mathfrak{t}, \mathbb{T})$, since, for example when $p = 3$,
\[ \phi(y, z, w)\phi(x, yz, w)\phi(x, y, z) = \phi(y, z, w)\phi(x, y, w)\phi(x, z, w)\phi(x, y, z) = \phi(xy, z, w)\phi(x, y, zw). \]
(It is known that every cohomology class in $H^2(\mathbb{R}^n, \mathbb{T})$ can be represented by an antisymmetric bicharacter of the form $\phi$ [23, 19], and for $p = 3$ it is certainly true that smooth cocycles are cohomologous to antisymmetric tricharacters.)

Example. Consider the torus bundle $\mathbb{T}^3$ over a point, with $H_0$ the class defined by $k$ times the volume form $dx^1 \wedge dx^2 \wedge dx^3$. The associated antisymmetric form on $a, b, c \in \mathfrak{t} = \mathbb{R}^3$ is then given by
\[ f(a, b, c) = k[a, b, c] \equiv k\, a \cdot (b \times c), \tag{2.3} \]
whence $\phi(a, b, c) = \exp(-2k\pi i [a, b, c])$.

Although we are mainly interested in the abelian groups $T = \mathbb{T}^n$, $G = \mathfrak{t} = \mathbb{R}^n$, and $N \cong \mathbb{Z}^n$ the kernel of the exponential map $\mathfrak{t} \to T$, the constructions which we present in Sections 3 to 8 are valid for general unimodular separable locally compact groups with a tricharacter $\phi$. For that reason we shall write the group composition multiplicatively, and be careful not to commute terms. However, this generalisation is less sweeping than may appear because the tricharacter $\phi$ on $G$ defines a homomorphism of each variable into the abelian group $\mathbb{T}$, and so must be lifted from a tricharacter on the abelianisation $G/[G, G]$. Finally we note that, by definition, the Dixmier-Douady class has components which are integral 3-forms, and that means, in particular, that the tricharacter $\phi$ constructed from $H_0$ is identically 1 on $N \times N \times N$, where $N$ is the kernel of the exponential map. We shall assume this to be true in the general case.
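As a numerical sanity check on the Example (a sketch only, not part of the argument), the tricharacter $\phi(a, b, c) = \exp(-2k\pi i[a, b, c])$ on $G = \mathbb{R}^3$ can be evaluated directly and the $p = 3$ cocycle identity tested on random vectors. The group is written additively here, and the value of `K_FLUX` and the helper names `triple`, `phi`, `add` and `cocycle_defect` are illustrative assumptions rather than notation from the text.

```python
import cmath
import random

K_FLUX = 2  # illustrative integer k multiplying the volume form (assumption)


def triple(a, b, c):
    """Scalar triple product [a, b, c] = a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def phi(a, b, c, k=K_FLUX):
    """phi(a, b, c) = exp(-2 k pi i [a, b, c]), the tricharacter of the Example."""
    return cmath.exp(-2j * cmath.pi * k * triple(a, b, c))


def add(a, b):
    """Composition in G = R^3, written additively in this sketch."""
    return tuple(s + t for s, t in zip(a, b))


def cocycle_defect(x, y, z, w):
    """Distance between the two sides of the p = 3 cocycle identity."""
    lhs = phi(y, z, w) * phi(x, add(y, z), w) * phi(x, y, z)
    rhs = phi(add(x, y), z, w) * phi(x, y, add(z, w))
    return abs(lhs - rhs)


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, z, w = (tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(4))
    print("cocycle defect:", cocycle_defect(x, y, z, w))
    print("phi on lattice vectors:", phi((1, 0, 0), (0, 1, 0), (0, 0, 1)))
```

For integer $k$ the same function evaluated on vectors of $N = \mathbb{Z}^3$ should return 1 up to rounding, in line with the requirement that $\phi$ be trivial on $N \times N \times N$.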
3. Generalised Busby-Smith Twisted Crossed Products
As noted above we now work with a general unimodular separable locally compact group $G$ and a closed subgroup $N$ on which the tricharacter $\phi$ has trivial restriction. For any group $G$ one can interpret $H^2(G, A)$ as classifying central extensions of $G$ by $A$, whilst, for $p > 3$, $H^p(G, A)$ is usually interpreted in terms of crossed modules [15, 16, 6, 20, 21, 34, 26], with elements of $H^3(G, A)$ known as MacLane–Whitehead obstructions [35, 27]. However, we shall see that these classes also arise in a $C^*$-algebraic context.

The $H$-field is usually linked to the equivariant Brauer group of a continuous trace $C^*$-algebra $A$ with spectrum $E$, on which a group $G$ acts as automorphisms so that the dual action on $E$ agrees with the bundle structure. However, the equivariant Brauer group can be described entirely in terms of cohomology classes in $H^p(M, H^{3-p}(G, \mathbb{T}))$ for $p = 1, 2, 3$ [11], leaving no room for $H$-fields with a component in $H^0(M, H^3(G, \mathbb{T}))$ (for example, any non-trivial $H$-field on $G$ considered as a principal $G$-bundle over a point). Since the representatives of $H^0$ are locally constant functions we shall concentrate our attention on $H^3(G, \mathbb{T})$.

This suggests that we must consider a wider class of algebras or actions. In fact, inner automorphisms automatically act trivially on the spectrum, so that only homomorphisms of $G$ to the outer automorphisms $\mathrm{Out}(A) = \mathrm{Aut}(A)/\mathrm{Inn}(A)$ are interesting. However, to work with these one needs a lifting $\alpha : G \to \mathrm{Aut}(A)$. The problem then is that $\alpha_x\alpha_y$ and $\alpha_{xy}$ can differ by an inner automorphism $\mathrm{ad}(u(x, y)) : a \mapsto u(x, y)\,a\,u(x, y)^{-1}$, that is, $\alpha_x\alpha_y = \mathrm{ad}(u(x, y))\alpha_{xy}$. We can take $u(x, y) = 1$ whenever $x$ or $y$ is the identity. This is almost precisely the data needed to define a Busby–Smith (or Leptin) twisted crossed product $A \rtimes_{\alpha,u} G$ of $A$ and $G$ [24, 7, 32]. Assuming that $u$ is a measurable function on $G \times G$, we can define a twisted convolution product and adjoint on $C_0(G, A)$ by
\[ (f * g)(x) = \int_G f(y)\,\alpha_y[g(y^{-1}x)]\,u(y, y^{-1}x)\,dy, \qquad f^*(x) = u(x, x^{-1})^{-1}\alpha_x[f(x^{-1})]^*, \]
and complete this to get a new algebra. The link with the algebraists' picture of $H^3(G, A)$ arises because $u$ is no longer a cocycle, since the condition linking $\alpha$ and $u$ tells us only that the adjoint actions of $u(x, y)u(xy, z)$ and $\alpha_x[u(y, z)]u(x, yz)$ coincide, so that one has a modified cocycle condition:
\[ \phi(x, y, z)\,u(x, y)\,u(xy, z) = \alpha_x[u(y, z)]\,u(x, yz) \tag{3.1} \]
for some central unitary element $\phi(x, y, z) \in UZ(A)$. It is easy to check that $\phi$ is a cocycle defining an element of $H^3(G, UZ(A))$. (Essentially the same argument is used in [8] to explain the origin of the Gauss anomaly and Jackiw's nonassociative anomaly in quantum field theory.) When $\phi$ is a tricharacter one can still form a twisted crossed product.

Proposition 3.1. When $\phi$ defined as above is an antisymmetric tricharacter the twisted crossed product $A \rtimes_{\alpha,u} G$ satisfies the $*$-algebra identity $(f * g)^* = g^* * f^*$, and is associative if and only if $\phi \equiv 1$. In fact, we have
\[ \begin{aligned} ((f * g) * h)(x) &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_y[h(y^{-1}x)]\,u(z, z^{-1}y)\,u(y, y^{-1}x)\,dy\,dz, \\ (f * (g * h))(x) &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_y[h(y^{-1}x)]\,\phi(z, z^{-1}y, y^{-1}x)\,u(z, z^{-1}y)\,u(y, y^{-1}x)\,dy\,dz. \end{aligned} \]

Proof. Using the modified cocycle identity, one calculates that
\[ (f * g)^*(x) = \int_G \phi(x, x^{-1}y, y^{-1})^*\,u(y, y^{-1})^*\,\alpha_y[g(y^{-1})]^*\,u(x, x^{-1}y)^*\,\alpha_x[f(x^{-1}y)]^*\,dy, \tag{3.2} \]
whilst
\[ (g^* * f^*)(x) = \int_G \phi(y, y^{-1}x, x^{-1}y)^*\,u(y, y^{-1})^*\,\alpha_y[g(y^{-1})]^*\,u(x, x^{-1}y)^*\,\alpha_x[f(x^{-1}y)]^*\,dy, \tag{3.3} \]
and for antisymmetric tricharacters both factors involving $\phi$ are 1. The twisted crossed product algebra $A \rtimes_{\alpha,u} G$ has
\[ \begin{aligned} ((f * g) * h)(x) &= \int_G (f * g)(y)\,\alpha_y[h(y^{-1}x)]\,u(y, y^{-1}x)\,dy \\ &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,u(z, z^{-1}y)\,\alpha_y[h(y^{-1}x)]\,u(y, y^{-1}x)\,dy\,dz \\ &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_z\alpha_{z^{-1}y}[h(y^{-1}x)]\,u(z, z^{-1}y)\,u(y, y^{-1}x)\,dy\,dz \\ &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_{z^{-1}y}[h(y^{-1}x)]\,u(z, z^{-1}y)\,u(y, y^{-1}x)\,dy\,dz, \end{aligned} \]
and, using the modified cocycle identity,
\[ \begin{aligned} (f * (g * h))(x) &= \int_G f(z)\,\alpha_z[(g * h)(z^{-1}x)]\,u(z, z^{-1}x)\,dz \\ &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_{z^{-1}y}[h(y^{-1}x)]\,u(z^{-1}y, y^{-1}x)\,u(z, z^{-1}y)\,dy\,dz \\ &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_{z^{-1}y}[h(y^{-1}x)]\,\alpha_z[u(z^{-1}y, y^{-1}x)]\,u(z, z^{-1}y)\,dy\,dz \\ &= \int_{G \times G} f(z)\,\alpha_z[g(z^{-1}y)]\,\alpha_{z^{-1}y}[h(y^{-1}x)]\,\phi(z, z^{-1}y, y^{-1}x)\,u(z, z^{-1}y)\,u(y, y^{-1}x)\,dy\,dz, \end{aligned} \]
so that the Busby-Smith twisted crossed product is nonassociative except in the case $\phi \equiv 1$.

Henceforth we shall always take $\phi$ to be an antisymmetric tricharacter. Usually Busby-Smith products are only defined when $\phi = 1$, but we shall see that much of the theory goes through without that assumption, so that this provides a means of constructing nonassociative from associative algebras. The nonassociativity becomes even more transparent when one considers a covariant representation $(U, \pi)$ of $(G, A)$ satisfying the conditions
\[ U(x)\pi(a)U(x)^{-1} = \pi(\alpha_x(a)), \qquad U(x)U(y) = \pi(u(x, y))U(xy). \tag{3.4} \]
These give
\[ \begin{aligned} U(x)[U(y)U(z)] &= U(x)[\pi(u(y, z))]U(yz) = \pi(u(x, yz))\,\pi(\alpha_x u(y, z))\,U(x(yz)), \\ [U(x)U(y)]U(z) &= \pi(u(x, y))U(xy)U(z) = \pi(u(x, y))\,\pi(u(xy, z))\,U((xy)z), \end{aligned} \]
so that
\[ \phi(x, y, z)\,U(x)[U(y)U(z)] = [U(x)U(y)]U(z). \tag{3.5} \]
In fact, $H^3(G)$ was already interpreted as defining a nonassociative structure in [16], and this has resurfaced in the physics literature [22, 13].

4. Generalised Packer–Raeburn Stabilisation
For a given antisymmetric tricharacter $\phi$ on $G$ there is a simple example of an algebra with twisting on which $G$ acts as automorphisms. It is derived from the imprimitivity algebra generated by multiplication and translation operators on $L^2(G)$. The right regular representation $\rho$ acts on $\psi \in L^2(G)$ by $(\rho(x)\psi)(v) = \psi(vx)$, and we define
\[ (u_\rho(y, z)\psi)(v) = \phi(v, y, z)\psi(v). \tag{4.1} \]
(We shall often be interested in the case when $G$ is a closed subgroup of a group $H$, with $\phi$ defined on $H \times H \times H$ and trivial on $G \times G \times G$. Then $u_\rho(y, z)$ can be defined as a multiplication operator on $L^2(G)$ for general $y, z \in H$, and the restriction of $u_\rho$ to $G \times G$ is identically 1.) Now
\[ \begin{aligned} \phi(x, y, z)\bigl(u_\rho(x, y)u_\rho(xy, z)u_\rho(x, yz)^{-1}\psi\bigr)(v) &= \phi(x, y, z)\phi(v, x, y)\phi(v, xy, z)\phi(v, x, yz)^{-1}\psi(v) \\ &= \phi(vx, y, z)\psi(v) \end{aligned} \]
and
\[ \bigl(\rho(x)u_\rho(y, z)\rho(x)^{-1}\psi\bigr)(v) = \bigl(u_\rho(y, z)\rho(x)^{-1}\psi\bigr)(vx) = \phi(vx, y, z)\bigl(\rho(x)^{-1}\psi\bigr)(vx) = \phi(vx, y, z)\psi(v), \tag{4.2} \]
so that, setting $\alpha_x = \mathrm{ad}(\rho(x))$, we get $\phi(x, y, z)u_\rho(x, y)u_\rho(xy, z)u_\rho(x, yz)^{-1} = \alpha_x[u_\rho(y, z)]$. This gives an explicit realisation of an algebra $C = C_0(G)$ with an action of $G$ by automorphisms and with the appropriate Busby-Smith obstruction. (When $G$ is non-compact $u_\rho(x, y)$ is in the multiplier algebra rather than the algebra itself.) The same idea can be extended to the right regular $\sigma$-representation on $L^2(G)$ given by $(\rho(x)\psi)(v) = \sigma(v, x)\psi(vx)$ for any Borel multiplier $\sigma$, and this makes no difference to the cocycle identity. Naturally, one can also work with the left regular representation $(\lambda(x)\psi)(v) = \psi(x^{-1}v)$, and
\[ (u_\lambda(y, z)\psi)(v) = \phi(v, y, z)^{-1}\psi(v). \tag{4.3} \]
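The operator identity $\phi(x, y, z)u_\rho(x, y)u_\rho(xy, z)u_\rho(x, yz)^{-1} = \alpha_x[u_\rho(y, z)]$ obtained from (4.1) and (4.2) can be illustrated on a finite model. The sketch below assumes, purely for illustration, that $G$ is replaced by the finite abelian group $(\mathbb{Z}/3)^3$ (written additively) with the tricharacter $\exp(-2\pi i \det[a, b, c]/3)$, and that functions on $G$ are stored as dictionaries; none of these choices, nor the helper names, come from the text.

```python
import cmath
import itertools
import random

N = 3  # illustrative modulus: (Z/3)^3 is a finite stand-in for G (assumption)
G = list(itertools.product(range(N), repeat=3))


def add(a, b):
    """Group composition in (Z/N)^3, written additively."""
    return tuple((s + t) % N for s, t in zip(a, b))


def neg(a):
    """Group inverse in (Z/N)^3."""
    return tuple((-s) % N for s in a)


def phi(a, b, c):
    """Antisymmetric tricharacter exp(-2 pi i det[a, b, c] / N)."""
    det = (a[0] * (b[1] * c[2] - b[2] * c[1])
           - a[1] * (b[0] * c[2] - b[2] * c[0])
           + a[2] * (b[0] * c[1] - b[1] * c[0]))
    return cmath.exp(-2j * cmath.pi * (det % N) / N)


def rho(x, psi):
    """Right regular representation: (rho(x) psi)(v) = psi(v + x)."""
    return {v: psi[add(v, x)] for v in G}


def u_rho(y, z, psi):
    """Multiplication operator of (4.1): (u(y, z) psi)(v) = phi(v, y, z) psi(v)."""
    return {v: phi(v, y, z) * psi[v] for v in G}


def twisted_side(x, y, z, psi):
    """phi(x,y,z) u(x,y) u(x+y,z) u(x,y+z)^{-1} applied to psi."""
    return {v: phi(x, y, z) * phi(v, x, y) * phi(v, add(x, y), z)
               / phi(v, x, add(y, z)) * psi[v] for v in G}


def conjugated_side(x, y, z, psi):
    """alpha_x[u(y, z)] = rho(x) u(y, z) rho(x)^{-1} applied to psi."""
    return rho(x, u_rho(y, z, rho(neg(x), psi)))


def max_gap(x, y, z, psi):
    """Largest pointwise difference between the two sides."""
    left, right = twisted_side(x, y, z, psi), conjugated_side(x, y, z, psi)
    return max(abs(left[v] - right[v]) for v in G)


if __name__ == "__main__":
    rng = random.Random(0)
    psi = {v: complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for v in G}
    print(max_gap((1, 0, 2), (0, 1, 1), (2, 2, 0), psi))
```

The two sides agree because the tricharacter property collapses the left-hand phase to $\phi(x, y, z)\phi(v, y, z) = \phi(vx, y, z)$, which is exactly the step used in the displayed computation.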
This is useful because it links directly to the formulation used by [30] to show that twisted crossed products defined by cocycles are stably equivalent to normal crossed products. In our case $u$ is not a cocycle, but there is nonetheless a nice generalisation of the Packer–Raeburn Theorem.

Theorem 4.1. Let $A, G, \alpha, u$ be as above. There exists a strongly continuous action $\beta$ of $G$ on $A \otimes K(L^2(G))$ and a twisting $u_\lambda$ such that $(\beta, u_\lambda)$ are exterior equivalent to $(\alpha \otimes \mathrm{id}, u \otimes 1)$, that is, there exists $v_s = (1 \otimes \lambda_s)(\mathrm{id} \otimes M(u(s, \cdot))^*)$ such that
\[ \beta_s = \mathrm{ad}(v_s)(\alpha_s \otimes \mathrm{id}), \qquad \mathrm{id} \otimes u_\lambda(s, t) = v_s\,\alpha_s(v_t)\,(u(s, t) \otimes 1)\,v_{st}^*. \tag{4.4} \]

Proof. The proof of Theorem 3.4 in [30] is still valid as far as the last line on p. 301. At that point the original argument uses the fact that $u$ is a cocycle to show that it has been untwisted by the exterior equivalence. In our case $u$ satisfies a modified cocycle identity, so that that last line (in the original notation) gives $\phi(s, t, r)^{-1}\xi(r) = (u_\lambda(s, t)\xi)(r)$.

One similarly obtains:

Corollary 4.2. In the situation of Theorem 4.1 one has
\[ (A \rtimes_{\alpha,u} G) \otimes K(L^2(G)) \cong (A \otimes K) \rtimes_{\beta,u_\lambda} G. \tag{4.5} \]
The Packer-Raeburn stabilisation trick is used to show that up to stabilisation by tensoring with compact operators $A \rtimes_{\alpha,u} G$ and $A \rtimes_{\beta,u_\lambda} G$ are isomorphic. The original idea derived from Quigg's generalisation of Takai duality for twisted crossed products, and their $\beta$ is just equivalent to Quigg's double dual $\hat{\hat{\alpha}}$. One may therefore obtain a duality theorem by the same procedure. We postpone discussion of this until Sect. 9 where we shall give a much more detailed account of duality for abelian groups.
5. Twisted Compact Operators
Before developing the theory further it is useful to give a very simple example of a nonassociative algebra, obtained by twisting the algebra of compact operators on $L^2(G)$ using the factor $\phi(x, y, z)$. We start with the Hilbert-Schmidt operators, realised as kernels $K(x, y)$ for $x, y \in G$, with the involution $K^*(x, y) = \overline{K(y, x)}$, and norm
\[ \|K\|_{HS}^2 = \int_{G \times G} |K(x, y)|^2\,dx\,dy, \tag{5.1} \]
and define the new multiplication
\[ (K_1 * K_2)(x, z) = \int_G \phi(x, y, z)K_1(x, y)K_2(y, z)\,dy. \tag{5.2} \]
This is consistent with the involution because
\[ \begin{aligned} (K_1 * K_2)^*(x, z) &= \overline{(K_1 * K_2)(z, x)} \\ &= \int_G \phi(z, y, x)^{-1}\,\overline{K_1(z, y)}\,\overline{K_2(y, x)}\,dy \\ &= \int_G \phi(x, y, z)K_2^*(x, y)K_1^*(y, z)\,dy \\ &= (K_2^* * K_1^*)(x, z). \end{aligned} \]
The fact that $\phi(x, y, x) = 1$ also means that we still have
\[ \|K\|_{HS}^2 = \int_G (K^* * K)(x, x)\,dx. \tag{5.3} \]
As usual, one can define a $C^*$-norm using the left regular representation, $\|K_1\| = \sup(\|K_1 * K_2\|_{HS}/\|K_2\|_{HS})$. By using the Cauchy-Schwarz inequality and by considering rank one projections one sees that this is equivalent to the ordinary operator norm. The twisted compact operators $K_\phi(L^2(G))$ are the completion of the Hilbert-Schmidt operators with respect to that norm. Unfortunately the new multiplication is not associative unless
\[ \phi(x, y, z)\phi(x, z, w) = \phi(x, y, w)\phi(y, z, w), \tag{5.4} \]
for all $x, y, z, w \in G$. In that case $\phi(x, y, z) = \phi(x, y, w)\phi(y, z, w)/\phi(x, z, w)$, but conversely, whenever $\phi(x, y, z)$ has the form $\psi(x, y)\psi(y, z)/\psi(x, z)$, for some function $\psi$, the algebra is associative. The twisted multiplication of kernels was used in [9] to study the hyperbolic quantum Hall effect, but in that two-dimensional situation the multiplication is automatically associative. Once one gets to three dimensions the analogous algebra is nonassociative.

Proposition 5.1. The group $G$ acts on the twisted algebra $K_\phi(L^2(G))$ with multiplication
\[ (K_1 * K_2)(x, z) = \int_G \phi(x, y, z)K_1(x, y)K_2(y, z)\,dy \tag{5.5} \]
by natural $*$-automorphisms
\[ \theta_x[K](z, w) = \phi(x, z, w)K(zx, wx), \tag{5.6} \]
and $\theta_x\theta_y = \mathrm{ad}(\sigma(x, y))\theta_{xy}$, where $\mathrm{ad}(\sigma(x, y))[K](z, w) = \phi(x, y, z)\phi(x, y, w)^{-1}K(z, w)$ comes from the multiplier $\sigma(x, y)(v) = \phi(x, y, v)$.
Proof. Using the tricharacter property of $\phi$, we see that
\[ \begin{aligned} (\theta_x[K_1] * \theta_x[K_2])(z, w) &= \int_G \phi(z, v, w)\phi(x, z, v)\phi(x, v, w)K_1(zx, vx)K_2(vx, wx)\,dv \\ &= \int_G \phi(zx, vx, wx)\phi(x, z, w)K_1(zx, vx)K_2(vx, wx)\,dv \\ &= \phi(x, z, w)(K_1 * K_2)(zx, wx) \\ &= \theta_x[K_1 * K_2](z, w). \end{aligned} \]
We also have
\[ \theta_x[K]^*(z, w) = \overline{\theta_x[K](w, z)} = \overline{\phi(x, z, w)K(w, z)} = \phi(x, w, z)K^*(w, z) = \theta_x[K^*](z, w). \tag{5.7} \]
Moreover,
\[ \begin{aligned} \theta_x\theta_y[K](z, w) &= \phi(x, z, w)(\theta_y[K])(zx, wx) \\ &= \phi(x, z, w)\phi(y, zx, wx)K(zxy, wxy) \\ &= \phi(x, y, z)\phi(x, y, w)^{-1}\theta_{xy}[K](z, w) \\ &= \mathrm{ad}(\sigma(x, y))[\theta_{xy}[K]](z, w), \end{aligned} \]
with $\sigma(x, y)(z, w) = \phi(x, y, z)\delta(z - w)$.
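The statements of this section can be tested on a finite toy model, which may help in following the phase bookkeeping. The sketch below assumes, for illustration only, that $G$ is replaced by $(\mathbb{Z}/3)^3$ written additively, with tricharacter $\exp(-2\pi i\det[a, b, c]/3)$ and integrals replaced by finite sums, so that kernels are dictionaries indexed by pairs of group elements. The names `star`, `adjoint`, `theta`, `ad_sigma` and `dist` are hypothetical helpers, not notation from the text.

```python
import cmath
import functools
import itertools
import random

N = 3  # illustrative modulus: (Z/3)^3 is a finite stand-in for G (assumption)
G = list(itertools.product(range(N), repeat=3))


def add(a, b):
    """Group composition in (Z/N)^3, written additively."""
    return tuple((s + t) % N for s, t in zip(a, b))


@functools.lru_cache(maxsize=None)
def phi(a, b, c):
    """Antisymmetric tricharacter exp(-2 pi i det[a, b, c] / N)."""
    det = (a[0] * (b[1] * c[2] - b[2] * c[1])
           - a[1] * (b[0] * c[2] - b[2] * c[0])
           + a[2] * (b[0] * c[1] - b[1] * c[0]))
    return cmath.exp(-2j * cmath.pi * (det % N) / N)


def star(K1, K2):
    """Twisted product (5.2), with the integral replaced by a sum over G."""
    return {(x, z): sum(phi(x, y, z) * K1[x, y] * K2[y, z] for y in G)
            for x in G for z in G}


def adjoint(K):
    """Involution K*(x, y) = conj(K(y, x))."""
    return {(x, y): K[y, x].conjugate() for x in G for y in G}


def theta(x, K):
    """Automorphism (5.6): theta_x[K](z, w) = phi(x, z, w) K(z + x, w + x)."""
    return {(z, w): phi(x, z, w) * K[add(z, x), add(w, x)]
            for z in G for w in G}


def ad_sigma(x, y, K):
    """ad(sigma(x, y))[K](z, w) = phi(x, y, z) phi(x, y, w)^{-1} K(z, w)."""
    return {(z, w): phi(x, y, z) / phi(x, y, w) * K[z, w]
            for z in G for w in G}


def dist(K1, K2):
    """Largest entrywise difference between two kernels."""
    return max(abs(K1[key] - K2[key]) for key in K1)


def random_kernel(rng):
    """Kernel with independent random complex entries."""
    return {(x, y): complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            for x in G for y in G}


if __name__ == "__main__":
    rng = random.Random(0)
    K1, K2, K3 = (random_kernel(rng) for _ in range(3))
    x = (1, 0, 2)
    print("associator size:", dist(star(star(K1, K2), K3), star(K1, star(K2, K3))))
    print("theta_x defect:", dist(star(theta(x, K1), theta(x, K2)), theta(x, star(K1, K2))))
```

In this model the associator of generic kernels is visibly non-zero, since (5.4) fails for this $\phi$, while compatibility with the involution, the automorphism property of $\theta_x$ and the relation $\theta_x\theta_y = \mathrm{ad}(\sigma(x, y))\theta_{xy}$ hold up to rounding.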
Note. There is also a left-handed version of this which uses the multiplication
\[ (K_1 * K_2)(x, z) = \int_G \phi(x, y, z)^{-1}K_1(x, y)K_2(y, z)\,dy \tag{5.8} \]
and automorphisms
\[ \tau_x[K](z, w) = \phi(x, z, w)K(x^{-1}z, x^{-1}w), \tag{5.9} \]
and it is this version which will appear later, in Sect. 9. Some further automorphisms of the original algebra $K_\phi(L^2(G))$ given by
\[ \gamma_x[K](z, w) = \phi(x, z, w)^{-1}K(x^{-1}z, x^{-1}w) \tag{5.10} \]
will be useful in Sect. 8.

When $G$ is a contractible group the twisted compact operators are just a deformation of the usual ones.

Proposition 5.2. When $G$ is a contractible group $K_\phi(L^2(G))$ is a continuous deformation of $K(L^2(G))$.

Proof. Let $\{c_t : t \in [0, 1]\}$ give a contraction of $G$ onto the identity, that is, $c_t : G \to G$ is continuous and satisfies $c_0(x) = x$ and $c_1(x)$ is the identity for all $x \in G$. We then define (for the right-handed version)
\[ (K_1 *_t K_2)(x, z) = \int_G \phi(x, c_t(y), z)K_1(x, y)K_2(y, z)\,dy, \tag{5.11} \]
so that at $t = 0$ we have the twisted and at $t = 1$ the untwisted product.
6. Twisted Induced Algebras
The introduction of $u$ and $\phi$ has far-reaching consequences, because almost all the standard procedures have to be deformed, and we shall now investigate these in more detail. Suppose that $A$ is a $C^*$-algebra on which $C(M)$ acts as double centralisers and the subgroup $N$ of $G$ acts by automorphisms $\alpha_r$ ($r \in N$), and that for each $s, t \in N$ there are unitaries $u(s, t)$ in the multiplier algebra $M(A)$ such that $\alpha_s\alpha_t = \mathrm{ad}(u(s, t))\alpha_{st}$, and also the modified cocycle condition (for a specified continuous tricharacter $\phi$)
\[ \alpha_r[u(s, t)]\,u(r, st) = \phi(r, s, t)\,u(r, s)\,u(rs, t). \tag{6.1} \]
Such algebras always exist, since we can take the algebra freely generated by a collection of symbols $\{u(s, t) : s, t \in N\}$ and define the automorphism $\alpha_r$ by the formula
\[ \alpha_r[u(s, t)] = \phi(r, s, t)\,u(r, s)\,u(rs, t)\,u(r, st)^{-1}. \tag{6.2} \]
We shall suppose that $u$ and $\phi$ extend to continuous functions on $G$ satisfying the same relations:
\[ \alpha_r[u(x, y)]\,u(r, xy) = \phi(r, x, y)\,u(r, x)\,u(rx, y), \tag{6.3} \]
for $r \in N$ and $x, y \in G$. (We are mainly interested in the case when $N$ is a maximal rank lattice in a vector group $G$, and then $\phi$ automatically extends, and in the interesting examples $u$ does too.) Normally one would induce an algebra admitting a $G$-action from that containing an $N$-action, but that will no longer work since the induced algebra is trivial. Instead we consider the $u$-induced algebra $B = u\text{-}\mathrm{ind}_N^G A$ described in the next result.

Proposition 6.1. The space $B = u\text{-}\mathrm{ind}_N^G A$ of functions $f \in C_0(G, A)$ which satisfy $f(rx) = \mathrm{ad}(u(r, x))^{-1}\alpha_r[f(x)]$ for all $x \in G$ and $r \in N$ is not trivial, and is closed under pointwise multiplication of functions $(f_1 f_2)(x) = f_1(x)f_2(x)$ and the involution $f^*(x) = f(x)^*$. The norm $\|f\| = \sup\|f(x)\|$ is a $C^*$-norm.

Proof. We first note that (using the cocycle condition and remembering that the adjoint action is unaffected by the central factor $\phi$) functions in the space satisfy
\[ \begin{aligned} f(rsx) &= \mathrm{ad}(u(r, sx))^{-1}\alpha_r[f(sx)] \\ &= \mathrm{ad}(u(r, sx))^{-1}\alpha_r\bigl([\mathrm{ad}(u(s, x))]^{-1}\alpha_s[f(x)]\bigr) \\ &= \mathrm{ad}(u(r, sx))^{-1}\alpha_r[\mathrm{ad}(u(s, x))]^{-1}\alpha_r\alpha_s[f(x)] \\ &= \mathrm{ad}(u(rs, x))^{-1}\mathrm{ad}(u(r, s))^{-1}\alpha_r\alpha_s[f(x)] \\ &= \mathrm{ad}(u(rs, x))^{-1}\alpha_{rs}[f(x)], \end{aligned} \]
showing consistency of the condition. Without $u$ this consistency check would fail and the induced algebra would be trivial, showing why one cannot use the normal induced algebra. We shall exhibit some useful explicit functions in the induced algebra in Proposition 6.4, but there is also a general construction which is useful. For $f$ a function in $C_0(G)$ and $a$ an element in $A$, we define an $A$-valued function $(f \Diamond a)$ on $G$ by
\[ (f \Diamond a)(x) = \int_N f(nx)\,\alpha_n^{-1}[\mathrm{ad}(u(n, x))[a]]\,dn. \tag{6.4} \]
Using the cocycle identity for $\mathrm{ad}(u)$ (the obstruction $\phi$ is central and so disappears in the adjoint action) we then check that
\[ \begin{aligned} (f \Diamond a)(rx) &= \int_N f(nrx)\,\alpha_n^{-1}[\mathrm{ad}(u(n, rx))[a]]\,dn \\ &= \int_N f(nrx)\,\alpha_n^{-1}\bigl[\mathrm{ad}(\alpha_n[u(r, x)])^{-1}\mathrm{ad}(u(n, r))\mathrm{ad}(u(nr, x))[a]\bigr]\,dn \\ &= \mathrm{ad}(u(r, x))^{-1}\int_N f(nrx)\,\alpha_n^{-1}[\mathrm{ad}(u(n, r))\mathrm{ad}(u(nr, x))[a]]\,dn \\ &= \mathrm{ad}(u(r, x))^{-1}\alpha_r\int_N f(nrx)\,\alpha_{nr}^{-1}[\mathrm{ad}(u(nr, x))[a]]\,dn \\ &= \mathrm{ad}(u(r, x))^{-1}\alpha_r[(f \Diamond a)(x)], \end{aligned} \]
showing that $(f \Diamond a)$ defines an element of $B$. Using the fact that $\alpha_r$ and $\mathrm{ad}(u(x, r))$ are automorphisms, we see that
\[ \begin{aligned} (f_1 f_2)(xr) = f_1(xr)f_2(xr) &= \mathrm{ad}(u(x, r))\alpha_r[f_1(x)]\,\mathrm{ad}(u(x, r))\alpha_r[f_2(x)] \\ &= \mathrm{ad}(u(x, r))\alpha_r[f_1(x)f_2(x)] \\ &= \mathrm{ad}(u(x, r))\alpha_r[(f_1 f_2)(x)], \end{aligned} \]
so that the space of $u$-induced functions is closed under the product. Finally, exploiting the unitarity of $u(r, x)$, we have
\[ f^*(rx) = f(rx)^* = \bigl[u(r, x)^{-1}\alpha_r[f(x)]u(r, x)\bigr]^* = u(r, x)^{-1}\alpha_r[f(x)]^*u(r, x) = u(r, x)^{-1}\alpha_r[f^*(x)]u(r, x), \]
so that the involution respects the constraint. Finally
\[ \|f^* f\| = \sup\|(f^* f)(x)\| = \sup\|f(x)^* f(x)\| = \sup\|f(x)\|^2 \tag{6.5} \]
gives a $C^*$-norm.

This shows that the induced space $B$ is actually a $*$-algebra (moreover, an associative algebra, since $A$ was associative). A $C^*$-norm can be defined much as in the usual case. Exploiting the ideas of the ordinary induced representations we can improve on this.

Theorem 6.2. If $A$ is a continuous trace algebra with spectrum $\hat{A}$ then $B$ is a continuous trace algebra with spectrum $\hat{B} = N\backslash(G \times \hat{A})$, where the action of $r \in N$ on $(x, \pi) \in G \times \hat{A}$ is defined by $r(x, \pi) = (rx, \pi \circ \alpha_r^{-1}\mathrm{ad}(u(r, x)))$.

Proof. We should start by checking that the above formula does indeed define an action, but that is essentially the same calculation just done to show that $f \Diamond a$ satisfies the equivariance condition for $B$. The rest of the proof follows the ideas in [33, 6.16 to 6.21], but with the actions on the other sides, and using our definition of $f \Diamond a$, and noting that in the proof of 6.18 one needs to restrict to neighbourhoods of both $s$ and $x$. This enables us to show that the representation $(x, \pi) : F \mapsto \pi(F(x))$ of $B$ defined by $(x, \pi) \in G \times \hat{A}$ is irreducible, because, for any $(x, a) \in G \times A$ we can use appropriate $f \Diamond a$ to find $F \in B$ such that $F(x) = a$, and then use irreducibility of $\pi$. The rest of the proof in [33] is independent of the twisting.

We next introduce automorphisms of the twisted induced algebra.

Proposition 6.3. For $y \in G$ and a function $f : G \to A$ define $\beta_y[f](x) = \mathrm{ad}(u(x, y))[f(xy)]$. Then $\beta_y$ preserves the subalgebra $B$ and defines a $*$-automorphism of it.

Proof.
\[ \begin{aligned} \beta_y[f](rx) &= \mathrm{ad}(u(rx, y))[f(rxy)] \\ &= \mathrm{ad}(u(rx, y))\,\mathrm{ad}(u(r, xy))^{-1}\alpha_r[f(xy)] \\ &= \mathrm{ad}(u(r, x))^{-1}\mathrm{ad}(\alpha_r[u(x, y)])\,\alpha_r[f(xy)] \\ &= \mathrm{ad}(u(r, x))^{-1}\alpha_r\bigl([\mathrm{ad}(u(x, y))][f(xy)]\bigr) \\ &= \mathrm{ad}(u(r, x))^{-1}\alpha_r[\beta_y[f](x)], \end{aligned} \]
showing that $\beta_y$ satisfies the equivariance condition. To see that these are automorphisms we need only note that
\[ \begin{aligned} (\beta_y[f_1]\beta_y[f_2])(x) &= \beta_y[f_1](x)\,\beta_y[f_2](x) \\ &= \mathrm{ad}(u(x, y))[f_1(xy)]\,\mathrm{ad}(u(x, y))[f_2(xy)] \\ &= \mathrm{ad}(u(x, y))[f_1(xy)f_2(xy)] \\ &= \mathrm{ad}(u(x, y))[(f_1 f_2)(xy)] = \beta_y[f_1 f_2](x), \end{aligned} \]
as required, and compatibility with the involution is also easily checked.
Naturally the map $y \mapsto \beta_y$ is not a homomorphism.

Proposition 6.4. The functions $v(y, z) : x \mapsto \phi(x, y, z)u(x, y)u(xy, z)u(x, yz)^{-1}$ lie in the multiplier algebra of $B$ and satisfy
\[ \beta_x[v(y, z)]\,v(x, yz) = \phi(x, y, z)\,v(x, y)\,v(xy, z). \tag{6.6} \]
The automorphisms defined by $\beta$ satisfy the relations
\[ \beta_y\beta_z = \mathrm{ad}(v(y, z))\beta_{yz}. \tag{6.7} \]

Proof. When $x \in N$ we can write $v(y, z)(x) = \alpha_x[u(y, z)]$, but otherwise $\alpha_x$ is undefined, and we cannot reduce $v(y, z)(x) = \phi(x, y, z)^{-1}\mathrm{ad}(u(x, y))\mathrm{ad}(u(xy, z))\mathrm{ad}(u(x, yz))^{-1}$. To check that $v(y, z)$ satisfies the equivariance condition for membership of $B$, we calculate
\[ \begin{aligned} u(rx, y)u(rxy, z)u(rx, yz)^{-1} &= \phi(r, x, y)^{-1}u(r, x)^{-1}\alpha_r[u(x, y)]\,u(r, xy)\,u(rxy, z)\,u(rx, yz)^{-1} \\ &= \phi(r, x, y)^{-1}\phi(r, xy, z)^{-1}u(r, x)^{-1}\alpha_r[u(x, y)]\,\alpha_r[u(xy, z)]\,u(r, xyz)\,u(rx, yz)^{-1} \\ &= \phi(r, x, y)^{-1}\phi(r, xy, z)^{-1}\phi(r, x, yz)\,u(r, x)^{-1}\alpha_r[u(x, y)u(xy, z)]\,\alpha_r[u(x, yz)]^{-1}u(r, x) \\ &= \phi(r, x, y)^{-1}\phi(r, xy, z)^{-1}\phi(r, x, yz)\,\mathrm{ad}(u(r, x))^{-1}\alpha_r[u(x, y)u(xy, z)u(x, yz)^{-1}] \\ &= \phi(r, y, z)^{-1}\mathrm{ad}(u(r, x))^{-1}\alpha_r[u(x, y)u(xy, z)u(x, yz)^{-1}]. \end{aligned} \]
From this we see that
\[ [v(y, z)](rx) = \phi(rx, y, z)\phi(r, y, z)^{-1}\mathrm{ad}(u(r, x))^{-1}\alpha_r[\phi(x, y, z)^{-1}v(y, z)(x)] = \mathrm{ad}(u(r, x))^{-1}\alpha_r[v(y, z)(x)]. \]
We cannot conclude that $v(y, z)$ lies in $B$ since it does not satisfy the analytic condition of vanishing outside compact sets, but it is certainly in the multiplier algebra. By making some cancellations, we obtain
\[ \begin{aligned} v(x, y)v(xy, z)v(x, yz)^{-1}(s) &= \frac{\phi(s, x, y)\phi(s, xy, z)}{\phi(s, x, yz)}\,u(s, x)u(sx, y)u(sxy, z)u(sx, yz)^{-1}u(s, x)^{-1} \\ &= \phi(s, y, z)\,\mathrm{ad}(u(s, x))[u(sx, y)u(sxy, z)u(sx, yz)^{-1}] \\ &= \phi(x, y, z)^{-1}\mathrm{ad}(u(s, x))[v(y, z)(sx)] \\ &= \phi(x, y, z)^{-1}\beta_x[v(y, z)](s). \end{aligned} \]
Finally we see that
\[ \begin{aligned} (\beta_y\beta_z[f])(x) &= \mathrm{ad}(u(x, y))[\beta_z[f](xy)] \\ &= \mathrm{ad}(u(x, y))\,\mathrm{ad}(u(xy, z))[f(xyz)] \\ &= \mathrm{ad}(u(x, y))\,\mathrm{ad}(u(xy, z))\,\mathrm{ad}(u(x, yz)^{-1})[\beta_{yz}f(x)] \\ &= \mathrm{ad}(v(y, z))(x)[\beta_{yz}f(x)]. \end{aligned} \]

Corollary 6.5. The action of $G$ on $\hat{B}$ defined by $\beta$ has orbit space $\hat{B}/G = \hat{A}/N$, so that $\beta$ defines the principal $G/N$-bundle $(G \times \hat{A})/N \to \hat{A}/N$.

Proof. The action of $y \in G$ sends each element of $\hat{B}$ to its composition with $\beta_y$. In the earlier notation we have
\[ (x, \pi)(\beta_y[f]) = \pi(\beta_y[f](x)) = \pi\bigl(\mathrm{ad}(u(x, y))[f(xy)]\bigr), \tag{6.8} \]
and, since inner automorphisms don't affect the class of a representation, this is equivalent to $(xy, \pi)$. For $r \in N$ this reduces to
\[ (x, \pi)(\beta_r[f]) = \pi\bigl(\mathrm{ad}(u(x, r)u(r, x)^{-1})\,\mathrm{ad}(u(r, x))\,\alpha_r[f(x)]\bigr), \tag{6.9} \]
which is equivalent to $r(x, \pi)$, showing that the subgroup $N$ stabilises the irreducible representations of $B$, so that we have a $G/N$ bundle, and that the orbit space is $\hat{A}/N$, as claimed.

7. Algebras with Prescribed Spectrum and Dixmier-Douady Class
Before tackling the general case it is useful to consider what happens for the principal bundle $G/N$ over a point. In that case only the component $H_0$ can be non-trivial, and we assume that it defines the tricharacter $\phi$ as before. In fact [10] gives a universal construction for a principal projective unitary bundle over a group, but we shall give an alternative description of the algebra as a twisted induced algebra. We shall induce from $N$ to $G$, and in order that the spectrum should be just $G/N$ we induce the algebra $K(L^2(N))$ of compact operators. That carries the
54
P. Bouwknegt, K. Hannabuss, V. Mathai
right regular representation ρ of N and twisting uρ described in Sect. 4. Although the restriction of uρ to N × N is 1, the extension to G × G gives the induced algebra a twist. 2 At this point it is instructive to consider why uρ -indG N (K(L (N))) has a non-vanishing Dixmier–Douady obstruction when the action of N on K(L2 (N)) is given by 2 αρ = adρ. One might expect that the induced algebra uρ -indG N (K(L (N)) simply acts on the induced Hilbert space of square-integrable functions ψ : G → L2 (N), which satisfy the equivariance condition ψ(rx) = uρ (r, x)−1 ρ(r)ψ(x),
(7.1)
so that there is no obstruction. However, this is incorrect because the suggested equivariance condition on $\psi$ is inconsistent when $u_\rho$ is not a cocycle:
$$\begin{aligned}
\psi(rsx) &= u_\rho(r,sx)^{-1}\rho(r)\psi(sx)\\
&= u_\rho(r,sx)^{-1}\rho(r)u_\rho(s,x)^{-1}\rho(s)\psi(x)\\
&= u_\rho(r,sx)^{-1}\alpha_r[u_\rho(s,x)]^{-1}\rho(r)\rho(s)\psi(x)\\
&= \phi(r,s,x)^{-1}u_\rho(rs,x)^{-1}u_\rho(r,s)^{-1}\rho(rs)\psi(x).
\end{aligned}$$
Now, as we noted in the last paragraph, $u_\rho(r,s) = 1$, but the presence of the argument $x \notin N$ means that $\phi(r,s,x) \neq 1$, and we end up with constraints
$$u_\rho(rs,x)^{-1}\rho(rs)\psi(x) = \psi(rsx) = \phi(r,s,x)^{-1}u_\rho(rs,x)^{-1}\rho(rs)\psi(x) \tag{7.2}$$
which can be satisfied only by $\psi = 0$.

Proposition 7.1. Let $\phi$ be the tricharacter of $G$ constructed from $H_0$, $u_\rho$, $\rho$ defined on $L^2(N)$ as in Sect. 4, and $\alpha_\rho = \mathrm{ad}\rho$. The algebra $u_\rho\text{-}\mathrm{ind}_N^G(\mathcal{K}(L^2(N)))$ has spectrum $G/N$, the $G$ action on the spectrum is transitive with stabiliser $N$, and the Dixmier-Douady class is described by the 3-form $H_0$.

Proof. We assume that $\phi$ is obtained from the class $f = H_0$ by the procedure described in Sect. 2. Choose an open set $F \subseteq G$ on which the projection $\pi$ to $G/N$ is one-one, and translates $F_i = Fx_i$ whose projections give a cover of $G/N$. These translates share the property that the projection to $G/N$ is injective, and so we may choose sections $\gamma_i : \pi(F_i) \to G$. The differences $\gamma_{ij}(v) = \gamma_i(v)\gamma_j(v)^{-1}$ lie in $N$. The restriction of the induced algebra to algebra-valued functions on $\pi(F_i) \subseteq G/N$ is Morita equivalent to $C(\pi(F_i))$ via the bimodule $X_i = L^2(\pi(F_i), L^2(N))$ (restriction to the subsets enables us to sidestep the earlier problem with $L^2(G, L^2(N))$). The actions are the obvious pointwise multiplicative actions,
$$(f\psi)(v) = f(\gamma_i(v))\psi(v) \tag{7.3}$$
and this is an imprimitivity bimodule in the sense of [33]. Over the intersection $\pi(F_i) \cap \pi(F_j)$ there is an equivalence of the two bimodules $X_i$ and $X_j$, given by the map
$$(g_{ij}\psi)(v) = u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\rho(\gamma_{ij}(v))\psi(v) \tag{7.4}$$
from $X_j$ to $X_i$. To see why this works we note that
$$\begin{aligned}
(g_{ij}f\psi)(v) &= u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\rho(\gamma_{ij}(v))f(\gamma_j(v))\psi(v)\\
&= \mathrm{ad}(u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1})\,\mathrm{ad}(\rho(\gamma_{ij}(v)))[f(\gamma_j(v))]\;u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\rho(\gamma_{ij}(v))\psi(v).
\end{aligned}$$
Since $\gamma_{ij}(v) \in N$ this simplifies to
$$f(\gamma_{ij}(v)\gamma_j(v))\,u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\rho(\gamma_{ij}(v))\psi(v) = (f g_{ij}\psi)(v). \tag{7.5}$$
To compute the obstruction we must compare $g_{ij}g_{jk}$ and $g_{ik}$ on $\pi(F_i) \cap \pi(F_j) \cap \pi(F_k)$. Now we have
$$\begin{aligned}
(g_{ij}g_{jk}\psi)(v) &= u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\rho(\gamma_{ij}(v))(g_{jk}\psi)(v)\\
&= u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\rho(\gamma_{ij}(v))\,u_\rho(\gamma_{jk}(v),\gamma_k(v))^{-1}\rho(\gamma_{jk}(v))\psi(v)\\
&= u_\rho(\gamma_{ij}(v),\gamma_j(v))^{-1}\,\mathrm{ad}(\rho(\gamma_{ij}(v)))[u_\rho(\gamma_{jk}(v),\gamma_k(v))^{-1}]\,\rho(\gamma_{ij}(v))\rho(\gamma_{jk}(v))\psi(v).
\end{aligned}$$
Applying the modified cocycle identity we have
$$(g_{ij}g_{jk}\psi)(v) = \phi(\gamma_{ij}(v),\gamma_{jk}(v),\gamma_k(v))^{-1}u_\rho(\gamma_{ik}(v),\gamma_k(v))^{-1}u_\rho(\gamma_{ij}(v),\gamma_{jk}(v))^{-1}\rho(\gamma_{ik}(v))\psi(v),$$
and finally, since $\gamma_{ij}(v), \gamma_{jk}(v) \in N$ we deduce that $u_\rho(\gamma_{ij}(v),\gamma_{jk}(v)) = 1$, giving
$$(g_{ij}g_{jk}\psi)(v) = \phi(\gamma_{ij}(v),\gamma_{jk}(v),\gamma_k(v))^{-1}(g_{ik}\psi)(v). \tag{7.6}$$
This shows that the Dixmier-Douady class can be described by the Čech cocycle
$$\phi_{ijk} = \phi(\gamma_{ij}(v),\gamma_{jk}(v),\gamma_k(v))^{-1} = \exp[2\pi i f(\gamma_{ij}(v),\gamma_{jk}(v),\gamma_k(v))] = \exp[2\pi i f(\gamma_i(v),\gamma_j(v),\gamma_k(v))],$$
where the antisymmetry of $f$ has been used in the final line. To find a form describing the de Rham cocycle we note that, since locally $\gamma_k(v)$ and $v$ are the same and the differences $\gamma_{ij}$ are in $N$,
$$\begin{aligned}
df(\gamma_{ij}(v),\gamma_{jk}(v),\gamma_k(v)) &= f(\gamma_{ij}(v),\gamma_{jk}(v),d\gamma_k(v))\\
&= f(\gamma_{ij}(v),\gamma_{jk}(v),dv)\\
&= f(\gamma_{ij}(v),\gamma_j(v),dv) - f(\gamma_{ij}(v),\gamma_k(v),dv),
\end{aligned}$$
giving an explicit expression as the difference of one-forms. Repeating this process twice we arrive at the de Rham form $f(dv,dv,dv)$ giving the class $H_0$. (The antisymmetry of $f$ compensates for the antisymmetry of the exterior product to give a non-vanishing answer.)

The same ideas can now be used to deal with the general case. However, since we want to use results for the untwisted case, we shall at this point restrict ourselves to the case of an abelian group $G$.

Theorem 7.2. Let $G$ be abelian and $A$ a continuous trace algebra with an action of $N \subset G$ by locally projectively unitary automorphisms. (This includes the assumption that the restriction of $u$ to $N \times N$ takes the constant value 1.) Both $u\text{-}\mathrm{ind}_N^G(A)$ and $\mathrm{ind}_N^G(A)$ are continuous trace algebras, and the difference of their Dixmier-Douady invariants $\delta(u\text{-}\mathrm{ind}_N^G(A)) - \delta(\mathrm{ind}_N^G(A))$ is the class defined by the integral 3-form $f$ on $G$ associated to the tricharacter $\phi$:
$$\delta(u\text{-}\mathrm{ind}_N^G(A)) - \delta(\mathrm{ind}_N^G(A)) = [f]. \tag{7.7}$$
Proof. We first note that by Theorem 4.1 $u$ is exterior equivalent to $u_\lambda$ which by (4.3) has trivial restriction to $N \times N$. Since $A$ has continuous trace, it is locally Morita equivalent to an algebra of compact operators, and all our calculations will be local. In fact we may cover $\hat{A}$ by open sets $\{U_\lambda\}$ and find Hilbert spaces $\mathcal{H}_\lambda$ such that the restriction $A_{U_\lambda}$ of $A$ to $U_\lambda$ is Morita equivalent to $C(U_\lambda, \mathcal{K}(\mathcal{H}_\lambda))$ via some bimodule $Y_\lambda$. Combining these with the bimodules $X_l$ used in the previous proof we take $X_{(l,\lambda)} = X_l \otimes Y_\lambda$ for the restriction of the algebra to $\pi(F_l) \times U_\lambda$. We may assume that the cover is fine enough that $\alpha_n(a)$ is equivalent to $\mathrm{ad}(\rho^\lambda_n)(a)$ for $a \in A_{U_\lambda}$, with $\rho^\lambda$ a $\theta^\lambda$-representation. We now define an equivalence $G^{(m,\mu)}_{(l,\lambda)}$ on the overlap of the sets $\pi(F_l) \times U_\lambda$ and $\pi(F_m) \times U_\mu$ by setting
$$(G^{(m,\mu)}_{(l,\lambda)}\psi)(v) = u(\gamma_{lm}(v),\gamma_m(v))^{-1}h_{\lambda\mu}\,\rho^\mu(\gamma_{lm}(v))\psi(v), \tag{7.8}$$
where hλµ describes the equivalences of Yλ and Yµ , (which are assumed to satisfy the ˇ relationship hλµ hµν = λµν hλν , where is a Cech cocycle describing the DixmierDouady class of A.) On overlaps the projective representations ρ λ are equivalent in the sense that ρ µ (n)hµν = hµν κµν (n)ρ ν (n), for some character κµν ∈ N. The adjoint actions of ρ µ and ρ ν are both equivalent to α. We now calculate that (G(l,λ) G(m,µ) ψ)(v) = u(γlm (v), γm (v))−1 hλµ ρ µ (γlm (v))u(γmn (v), (m,µ)
(n,ν)
γn (v))−1 hµν ρ ν (γmn (v))ψ(v) = u(γlm (v), γm (v))−1 αγlm )[u(γmn (v), γn (v))−1 ]hλµ hµν ×κµν (γlm )ρ ν (γlm (v))ρ ν (γmn (v))ψ(v). The first two terms combine as before to give u(γln (v), γn (v))−1 u(γlm (v), γmn (v))−1 = φ(γlm , γmn , γn )−1 u(γln (v), γn (v))−1 , (7.9) whilst the projective representions give ρ ν (γlm (v))ρ ν (γmn (v)) = θ ν (γlm (v), γmn (v))ρ ν (γln (v)), giving (m,µ)
(n,ν)
(G(l,λ) G(m,µ) ψ)(v) = φ(γlm , γmn , γn )−1 κµν (γlm )θ ν (γlm (v), γmn (v))λµν u(γln (v), γn (v))−1 hλν ρ ν (γln (v))ψ(v) = φ(γlm , γmn , γn )−1 κµν (γlm )θ ν (γlm (v), γmn (v))λµν G(l,λ) ψ(v), (n,ν)
(7.10)
from which we deduce the obstruction. All the factors except the first would be present for $\mathrm{ind}_N^G(A)$, so that the difference between the Dixmier-Douady obstructions for $u\text{-}\mathrm{ind}_N^G(A)$ and $\mathrm{ind}_N^G(A)$ is just given by $\phi^{-1}$, or as forms by $f$.

Our formula shows that the obstruction has four contributions: the MacLane-Whitehead obstruction $\phi^{-1}$, the Mackey obstruction $\theta$, the Phillips-Raeburn obstruction $\kappa$, and the Dixmier-Douady obstruction for $A$, corresponding to $H_0$, $H_1$, $H_2$ and $H_3$.

Corollary 7.3. Let $G$ be abelian and $E$ be a principal $G/N$-bundle over $M$ with prescribed Dixmier-Douady invariant associated with $H$. Then there is a $u$-induced algebra with spectrum $E$, and the correct action of $G$, having the Dixmier-Douady invariant $H$. This $u$-induced algebra is not necessarily unique.2

Proof. We know from [2, 4, 28] that there is an algebra $\mathrm{ind}_N^G(A)$ associated with the principal $G/N$-bundle $E$ and having Dixmier-Douady form $H - H_0$. This algebra is not necessarily unique, cf. [28]. With $f$ as in the theorem we see that $u\text{-}\mathrm{ind}_N^G(A)$ has Dixmier-Douady class described by the form $H$.

An alternative approach would be to note that a continuous trace algebra with Dixmier-Douady class given by $H$ can be constructed as the tensor product of two continuous trace algebras with classes $H - H_0$ and $H_0$, and then to apply [28] for the former and our earlier result for the class $H_0$.

Although we have shown how to construct an algebra with a given Dixmier-Douady class, it is natural to wonder whether one could also find another algebra with twisting given by an ordinary cocycle. The next result shows that this is not possible.

Theorem 7.4. Every system with the Dixmier-Douady invariant $H$ is described by automorphisms whose twisting gives the same tricharacter $\phi$.

Proof. By the general theory we know that any other system must be exterior equivalent to the $u$-induced system above, that is it is described by automorphisms $\lambda_x = \mathrm{ad}(W(x))\beta_x$, and $w(x,y) = W(x)\beta_x[W(y)]v(x,y)W(xy)^{-1}$ for some $MB$-valued function $W$ on $G$. Since the twistings are cohomologous they must define the same class $\phi$, but more explicitly we calculate that
$$\begin{aligned}
\lambda_x[w(y,z)]\,w(x,yz) &= \mathrm{ad}(W(x))\beta_x[W(y)\beta_y[W(z)]v(y,z)W(yz)^{-1}]\;W(x)\beta_x[W(yz)]v(x,yz)W(xyz)^{-1}\\
&= W(x)\beta_x[W(y)]\,\beta_x\beta_y[W(z)]\,\beta_x[v(y,z)]\,v(x,yz)W(xyz)^{-1}\\
&= \phi(x,y,z)\,W(x)\beta_x[W(y)]\,\mathrm{ad}(v(x,y))\beta_{xy}[W(z)]\,v(x,y)v(xy,z)W(xyz)^{-1}\\
&= \phi(x,y,z)\,W(x)\beta_x[W(y)]v(x,y)W(xy)^{-1}\;W(xy)\beta_{xy}[W(z)]v(xy,z)W(xyz)^{-1}\\
&= \phi(x,y,z)\,w(x,y)\,w(xy,z),
\end{aligned}$$
showing that the same cocycle $\phi$ arises.

2 The non-uniqueness reflects the fact that there may exist several liftings of the action of $N$ on $A$ to $G$. So, strictly speaking, $u$-ind does not define a functor, but $u\text{-}\mathrm{ind}_N^G(A)$ just indicates the induced algebra with a prescribed action of $G$.
8. The Twisted Crossed Product Algebra

In this section we can return to the case of a general group $G$. We have argued that the dual of a bundle described by $B$ with a twisted group action should be given by the twisted crossed product. For the induced algebra $u\text{-}\mathrm{ind}_N^G(A)$ its twisted crossed product with $G$ can be calculated explicitly, and the following result shows that it is a generalisation of the twisted compact operators in Sect. 5.

Theorem 8.1. The twisted crossed product $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G$ is isomorphic to the $*$-algebra of $A$-valued kernels on $G \times G$ satisfying
$$K_1(rz,rw) = \phi(r,z,w)^{-1}u(r,z)^{-1}\alpha_r[K_1(z,w)]u(r,w), \tag{8.1}$$
with $K^*(z,w) = K(w,z)^*$ and product
$$(K_1K_2)(z,w) = \int_G K_1(z,v)K_2(v,w)\phi(z,v,w)\,dv. \tag{8.2}$$

Proof. The twisted crossed product consists of functions $F_j : G \to B$ with twisted convolution
$$(F_1 * F_2)(x) = \int_G F_1(y)\beta_y[F_2(y^{-1}x)]v(y,y^{-1}x)\,dy. \tag{8.3}$$
Identifying $B$ with functions from $G$ to $A$, we may give the elements of the twisted crossed product a second argument in $G$ and, using the explicit form of the induced action and of $v$, get
$$(F_1 * F_2)(z,x) = \int_G F_1(z,y)\,\mathrm{ad}(u(z,y))[F_2(zy,y^{-1}x)]\,\phi(z,y,y^{-1}x)\,u(z,y)u(zy,y^{-1}x)u(z,x)^{-1}\,dy, \tag{8.4}$$
which can be rearranged as
$$(F_1 * F_2)(z,x)\,u(z,x) = \int_G \phi(z,y,y^{-1}x)\,F_1(z,y)u(z,y)\,F_2(zy,y^{-1}x)u(zy,y^{-1}x)\,dy. \tag{8.5}$$
We now define $K_1(z,zy) = F_1(z,y)u(z,y)$, $K_2(z,zy) = F_2(z,y)u(z,y)$ and $(K_1K_2)(z,zx) = (F_1 * F_2)(z,x)u(z,x)$ to obtain
$$(K_1K_2)(z,zx) = \int_G K_1(z,zy)K_2(zy,zx)\phi(z,y,y^{-1}x)\,dy. \tag{8.6}$$
Setting $w = zx$, $v = zy$ and exploiting the antisymmetry of $\phi$ the result follows. We readily check that
$$F^*(z,x)u(z,x) = v(x,x^{-1})^*\,\mathrm{ad}(u(z,x))[F(zx,x^{-1})]^*\,u(z,x) = u(zx,x^{-1})^*F(zx,x^{-1})^*,$$
from which it follows that
$$K^*(z,w) = K(w,z)^*. \tag{8.7}$$
The kernels inherit an equivariance condition from the inducing process,
$$\begin{aligned}
K_1(rz,rzy) &= F_1(rz,y)u(rz,y)\\
&= \mathrm{ad}(u(r,z))^{-1}\alpha_r[F_1(z,y)]\,u(rz,y)\\
&= u(r,z)^{-1}\alpha_r[K_1(z,zy)u(z,y)^{-1}]\,u(r,z)u(rz,y)\\
&= \phi(r,z,y)^{-1}u(r,z)^{-1}\alpha_r[K_1(z,zy)]\,u(r,zy).
\end{aligned}$$
Exploiting the antisymmetry of $\phi$ this gives
$$K_1(rz,rw) = \phi(r,z,w)^{-1}u(r,z)^{-1}\alpha_r[K_1(z,w)]u(r,w). \tag{8.8}$$
We note that since the original product $F_1 * F_2$ respected this equivariance condition, so does the product on kernels. The norm on the twisted crossed product can be defined from the left regular representation and so agrees with that on kernels.

The description of the crossed product algebra in Theorem 8.1 can be made more precise for a bundle over a point described by $A = \mathcal{K}(L^2(N))$ as in Proposition 7.1. However, this time it is more useful to take the left handed version of the automorphisms, that is
$$\alpha_r[k](s,t) = k(r^{-1}s, r^{-1}t), \tag{8.9}$$
and set $(u(s,t)\psi)(r) = \phi(r,s,t)^{-1}\psi(r)$.

Theorem 8.2. The nonassociative torus describing the dual of a bundle over a point is isomorphic to the algebra $A_\phi = \mathcal{K}_\phi(L^2(G)) \rtimes_{\gamma,u} N$, where $\gamma$ is defined by Eq. (5.10).

Proof. We abbreviate notation for the $\mathcal{K}(L^2(N))$-valued kernels by setting
$$K(z,w) : (s,t) \mapsto K(z,w;s,t), \tag{8.10}$$
so that the equivariance condition can be given explicitly as
$$K(rz,rw;s,t) = \phi(r,z,w)^{-1}\phi(r,z,s)\,K(z,w;r^{-1}s,r^{-1}t)\,\phi(r,w,t)^{-1}, \tag{8.11}$$
or, replacing the first two arguments, as
$$K(z,w;s,t) = \phi(r,z,w)^{-1}\phi(r,z,s)\,K(r^{-1}z,r^{-1}w;r^{-1}s,r^{-1}t)\,\phi(r,w,t)^{-1}. \tag{8.12}$$
Taking $r = s$ in this formula we get
$$K(z,w;s,t) = \phi(s,z,w)^{-1}K(s^{-1}z,s^{-1}w;1,s^{-1}t)\,\phi(w,s,t). \tag{8.13}$$
Writing $u(x,y)$ for the multiplier associated with the automorphism of twisted kernels $\gamma_x$ defined by Eq. (5.10), we have
$$K(z,w;s,t) = \gamma_s[K](z,w;1,s^{-1}t)\,u(s,t) = \gamma_s[K](z,w;1,s^{-1}t)\,u(s,s^{-1}t). \tag{8.14}$$
This shows that the kernels can be reconstructed from their values when the third argument is 1, and in this case the product formula can be simplified. The product
$$(K_1K_2)(z,w;r,t) = \int_{N\times G} K_1(z,v;r,s)K_2(v,w;s,t)\phi(z,v,w)\,ds\,dv, \tag{8.15}$$
reduces to
$$(K_1K_2)(z,w;1,t) = \int_{N\times G} K_1(z,v;1,s)K_2(v,w;s,t)\phi(z,v,w)\,ds\,dv. \tag{8.16}$$
The second kernel can be rewritten using the equivariance condition to give
$$(K_1K_2)(z,w;1,t) = \int_G \left\{\int_N K_1(z,v;1,s)\,\gamma_s[K_2](v,w;1,s^{-1}t)\,u(s,s^{-1}t)\,ds\right\}\phi(z,v,w)\,dv. \tag{8.17}$$
Identifying the kernel $K_j$ with the $\mathcal{K}_\phi(L^2(G))$-valued function $s \mapsto \{(z,w) \mapsto K_j(z,w;1,s)\}$ on $N$, this is just the twisted crossed product of the two functions. In other words we can identify the algebra with $\mathcal{K}_\phi(L^2(G)) \rtimes_{\gamma,u} N$.

In the general case of a bundle over $M$ one needs to take $A = C(M, \mathcal{K}(L^2(N)))$, but since the products of functions on $M$ are all taken pointwise, there is no essential change in the calculations. The only point for caution is that, in those cases where $\phi$ depends on $M$ (and consequently so also do $\gamma$ and $u$), one is really looking at an algebra of continuous sections of a bundle over $M$ rather than just functions. Observe that by Theorem 8.1, the nonassociative torus $A_\phi$ is canonically isomorphic to $u\text{-}\mathrm{ind}_N^G(\mathcal{K}(L^2(N))) \rtimes_{\beta,v} G$.

Theorem 8.3. The nonassociative torus describing the dual of a bundle over $M$ is isomorphic to the algebra $C(M, \mathcal{K}_\phi(L^2(G)) \rtimes_{\gamma,u} N)$.

When $N$ is trivial this gives the twisted algebra of kernels introduced in Sect. 5, though with $\phi$ replaced by its inverse. More generally it provides an extension of Green's Generalised Imprimitivity Theorem [17] to the case of these twisted induced algebras.

In contrast to the twisted crossed product algebra for the dual, the correspondence algebra is associative. Provided that there is no Mackey obstruction the correspondence space is associated with the algebra $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} N$.

Theorem 8.4. The algebra $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} N$ is associative.

Proof. By Propositions 3.1 and 6.4, $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} N$ is associative if and only if the restriction of $\phi$ to $N \times N \times N$ is 1, and we have seen that this is a consequence of the integrality of the form $H_0$. For non-trivial Mackey obstruction one replaces $N$ by the projection onto $G$ of the centre of the central extension defined by the multiplier [19, 14], and that algebra is similarly associative.
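The role of integrality in the proof of Theorem 8.4 can be checked numerically in a toy case. The following is only a minimal sketch under assumptions that are not taken from the paper: it takes $G = \mathbb{R}^3$, $N = \mathbb{Z}^3$, the illustrative form $f(x,y,z) = k\det[x,y,z]$ with the hypothetical value $k = 1$, and the sign convention $\phi = \exp(-2\pi i f)$. It checks that $\phi$ is 1 on lattice triples, is not 1 once an argument leaves the lattice, and satisfies the 3-cocycle identity (written additively).

```python
import cmath
from itertools import product

FLUX = 1  # hypothetical integer flux k; the illustrative form is f = k * det


def det3(x, y, z):
    """Determinant of the 3x3 matrix with rows x, y, z."""
    return (x[0] * (y[1] * z[2] - y[2] * z[1])
            - x[1] * (y[0] * z[2] - y[2] * z[0])
            + x[2] * (y[0] * z[1] - y[1] * z[0]))


def phi(x, y, z):
    """Tricharacter exp(-2*pi*i*f) with f = FLUX * det (assumed convention)."""
    return cmath.exp(-2j * cmath.pi * FLUX * det3(x, y, z))


def add(x, y):
    """Group law of R^3, written additively."""
    return tuple(a + b for a, b in zip(x, y))


def cocycle_defect(x, y, z, w):
    """Size of lhs - rhs in the 3-cocycle identity for phi."""
    lhs = phi(y, z, w) * phi(x, add(y, z), w) * phi(x, y, z)
    rhs = phi(add(x, y), z, w) * phi(x, y, add(z, w))
    return abs(lhs - rhs)


# Restriction to the lattice N = Z^3: phi should be 1 up to rounding.
lattice = list(product((-1, 0, 1), repeat=3))
worst_on_lattice = max(abs(phi(r, s, t) - 1)
                       for r in lattice for s in lattice for t in lattice)

# One argument off the lattice: phi is no longer trivial.
off_lattice = phi((0.5, 0, 0), (0, 1, 0), (0, 0, 1))
```

With these choices `worst_on_lattice` is at rounding level while `off_lattice` is numerically $-1$, which mirrors the contrast between the associative algebra over $N$ and the nonassociative one over $G$.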
9. The Dual Action

In this section we shall assume that $G$ is abelian. Following the development in Sect. 5, we may define automorphisms of the kernels by
$$\tau_x[K](z,w) = \phi(x,z,w)K(x^{-1}z,x^{-1}w), \tag{9.1}$$
and these satisfy $\tau_x\tau_y = \mathrm{ad}(\tilde{u}(x,y))^{-1}\tau_{xy}$, where
$$(\mathrm{ad}(\tilde{u}(x,y))[K])(z,w) = \phi(x,y,z)\phi(x,y,w)^{-1}K(z,w). \tag{9.2}$$
However, we are primarily interested in the case of abelian groups, and there is also a more useful action of the dual group $\hat{G}$ on the dual algebra. Its definition is motivated by the action $\hat{\beta}_\xi$ on $F \in u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G$ given by $\hat{\beta}_\xi[F](z,y) = \xi(y)F(z,y)$. Rewritten in terms of kernels this leads to the following idea.

Proposition 9.1. For $\xi \in \hat{G}$ define $\hat{\beta}_\xi$ by
$$\hat{\beta}_\xi[K](z,w) = \xi(z^{-1}w)K(z,w). \tag{9.3}$$
Then $\hat{\beta}_\xi$ is an automorphism of $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G$, and $\hat{\beta}_\xi\hat{\beta}_\eta = \hat{\beta}_{\xi\eta}$.

Proof. We must first show that $\hat{\beta}_\xi$ is well-defined, that is that it preserves the subspace of kernels $K$ satisfying the equivariance condition of Theorem 8.1. In fact we see that
$$\begin{aligned}
\hat{\beta}_\xi[K](rz,rw) &= \xi(z^{-1}w)K(rz,rw)\\
&= \xi(z^{-1}w)\phi(r,z,w)^{-1}u(r,z)^{-1}\alpha_r[K(z,w)]u(r,w)\\
&= \phi(r,z,w)^{-1}u(r,z)^{-1}\alpha_r[\hat{\beta}_\xi[K](z,w)]u(r,w),
\end{aligned}$$
which gives the required equivariance condition. It is also an automorphism since
$$\begin{aligned}
(\hat{\beta}_\xi[K_1]\,\hat{\beta}_\xi[K_2])(z,w) &= \int_G \xi(z^{-1}v)K_1(z,v)\,\xi(v^{-1}w)K_2(v,w)\,\phi(z,v,w)^{-1}\,dv\\
&= \xi(z^{-1}w)\int_G K_1(z,v)K_2(v,w)\phi(z,v,w)^{-1}\,dv\\
&= \xi(z^{-1}w)(K_1K_2)(z,w).
\end{aligned}$$
There is no twisting involved since it is easy to check that $\hat{\beta}_\xi\hat{\beta}_\eta = \hat{\beta}_{\xi\eta}$.
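The automorphism property in Proposition 9.1 rests only on the character identity $\xi(z^{-1}v)\xi(v^{-1}w) = \xi(z^{-1}w)$, which can be seen in a finite toy model. The sketch below is an illustration under stated assumptions, not the setting of the paper: it takes the finite group $G = (\mathbb{Z}_2)^3$, trivial $N$, scalar-valued kernels, and the hypothetical tricharacter $\phi(x,y,z) = (-1)^{\det[x,y,z]}$ (so $\phi = \phi^{-1}$). Integer-valued random kernels keep the comparison exact.

```python
import random
from itertools import product

G = list(product((0, 1), repeat=3))  # elements of the toy group (Z_2)^3


def det3(x, y, z):
    """Determinant of the 3x3 matrix with rows x, y, z."""
    return (x[0] * (y[1] * z[2] - y[2] * z[1])
            - x[1] * (y[0] * z[2] - y[2] * z[0])
            + x[2] * (y[0] * z[1] - y[1] * z[0]))


def phi(x, y, z):
    """Hypothetical sign-valued tricharacter on (Z_2)^3."""
    return -1 if det3(x, y, z) % 2 else 1


def chi(a, x):
    """Character of (Z_2)^3 labelled by a: x -> (-1)^(a.x)."""
    return -1 if sum(p * q for p, q in zip(a, x)) % 2 else 1


def diff(w, z):
    """The group element written z^{-1} w in the text (additive notation)."""
    return tuple((p - q) % 2 for p, q in zip(w, z))


def twisted_product(k1, k2):
    """Finite analogue of the phi-twisted kernel product."""
    return {(z, w): sum(k1[z, v] * k2[v, w] * phi(z, v, w) for v in G)
            for z in G for w in G}


def dual_action(a, k):
    """Finite analogue of (9.3): multiply K(z, w) by chi_a(z^{-1} w)."""
    return {(z, w): chi(a, diff(w, z)) * k[z, w] for (z, w) in k}


def random_kernel(rng):
    """Integer-valued kernel on G x G."""
    return {(z, w): rng.randint(-3, 3) for z in G for w in G}


rng = random.Random(0)
k1, k2 = random_kernel(rng), random_kernel(rng)
is_automorphism = all(
    twisted_product(dual_action(a, k1), dual_action(a, k2))
    == dual_action(a, twisted_product(k1, k2))
    for a in G
)
```

In this model `is_automorphism` comes out true for every character, even though the twisted product itself is not associative.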
We now form the crossed product of the twisted crossed product algebra with $\hat{G}$. The elements are functions from $\hat{G}$ to the algebra of kernels described in the last section, and so may be regarded as $A$-valued functions on $G \times G \times \hat{G}$, with multiplication
$$\begin{aligned}
(\hat{k}_1\hat{k}_2)(z,w,\xi) &= \int_{G\times\hat{G}} \hat{k}_1(z,v,\eta)\,\hat{\beta}_\eta[\hat{k}_2](v,w,\eta^{-1}\xi)\,\phi(z,v,w)^{-1}\,d\eta\,dv\\
&= \int_{G\times\hat{G}} \hat{k}_1(z,v,\eta)\,\eta(v^{-1}w)\,\hat{k}_2(v,w,\eta^{-1}\xi)\,\phi(z,v,w)^{-1}\,d\eta\,dv.
\end{aligned}$$
We shall denote the group-theoretic Fourier transform of $\hat{k}$ with respect to its third argument by $k$ (which is now a function on $G \times G \times G$):
$$k(z,w,x) = \int_{\hat{G}} \hat{k}(z,w,\xi)\,\xi(x)\,d\xi. \tag{9.4}$$
The multiplication obtained by Fourier transform is (assuming appropriate normalisation of the measures)
$$\begin{aligned}
(k_1k_2)(z,w,x) &= \int_{G\times\hat{G}\times\hat{G}} \xi(x)\,\hat{k}_1(z,v,\eta)\,\eta(v^{-1}w)\,\hat{k}_2(v,w,\eta^{-1}\xi)\,\phi(z,v,w)^{-1}\,d\xi\,d\eta\,dv\\
&= \int_{G\times\hat{G}\times\hat{G}} \hat{k}_1(z,v,\eta)\,\eta(v^{-1}wx)\,\hat{k}_2(v,w,\eta^{-1}\xi)\,(\eta^{-1}\xi)(x)\,\phi(z,v,w)^{-1}\,d\xi\,d\eta\,dv\\
&= \int_G k_1(z,v,v^{-1}wx)\,k_2(v,w,x)\,\phi(z,v,w)^{-1}\,dv.
\end{aligned}$$
We now introduce another transformation by
$$\tilde{k}(x;z,w) = \phi(x,z,w)^{-1}u(x,z)\,k(xz,xw,w^{-1})\,u(x,w)^{-1}. \tag{9.5}$$

Theorem 9.2. There is an isomorphism
$$(u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G) \rtimes_{\hat{\beta}} \hat{G} \cong u\text{-}\mathrm{ind}_N^G(A) \otimes \mathcal{K}_\phi(L^2(G)). \tag{9.6}$$

Proof. One important property which follows from the equivariance condition for the $A$-valued kernels is that
$$\begin{aligned}
\tilde{k}(rx;z,w) &= \phi(rx,z,w)^{-1}u(rx,z)\,k(rxz,rxw,w^{-1})\,u(rx,w)^{-1}\\
&= [\phi(rx,z,w)\phi(r,xz,xw)]^{-1}u(rx,z)u(r,xz)^{-1}\alpha_r[k(xz,xw,w^{-1})]\,u(r,xw)u(rx,w)^{-1}\\
&= [\phi(rx,z,w)\phi(r,xz,xw)]^{-1}\phi(r,x,z)^{-1}\phi(r,x,w)\,u(r,x)^{-1}\alpha_r[u(x,z)]\,\alpha_r[k(xz,xw,w^{-1})]\,\alpha_r[u(x,w)]^{-1}u(r,x)\\
&= [\phi(rx,z,w)\phi(r,xz,xw)]^{-1}\phi(r,z,x)\phi(r,x,w)\,\mathrm{ad}(u(r,x))^{-1}\alpha_r[u(x,z)\,k(xz,xw,w^{-1})\,u(x,w)^{-1}]\\
&= \mathrm{ad}(u(r,x))^{-1}\alpha_r[\tilde{k}(x;z,w)],
\end{aligned}$$
so that the kernels $\tilde{k}$ satisfy the induced algebra condition with respect to $x$, and so can be considered elements of the algebra induced from $N$ to $G$ by the $A$-valued kernels. Next we consider the product on these functions
$$\begin{aligned}
(\tilde{k}_1\tilde{k}_2)(x;z,w) &= \phi(x,z,w)^{-1}u(x,z)\,(k_1k_2)(xz,xw,w^{-1})\,u(x,w)^{-1}\\
&= \phi(x,z,w)^{-1}u(x,z)\int_G k_1(xz,xv,v^{-1})\,k_2(xv,xw,w^{-1})\,u(x,w)^{-1}\,\phi(xz,xv,xw)^{-1}\,dv\\
&= \int_G \tilde{k}_1(x;z,v)\,\tilde{k}_2(x;v,w)\,\phi(z,v,w)^{-1}\,dv.
\end{aligned}$$
Thus we have the pointwise product with respect to $x$, with the twisted multiplication of $\mathcal{K}_\phi(L^2(G))$ in the fibres.
We shall show in the final section that the twisted compact operators are in a certain sense Morita equivalent to the ordinary compact operators, so that this result provides a very precise analogue of the normal duality theorem.

To complete the argument we really need to know that the double dual action $\hat{\hat{\beta}}$ is equivalent to the original $\beta$ to within the action on the twisted compact operators. The double dual action is defined on $(u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G) \rtimes_{\hat{\beta}} \hat{G}$ by the same procedure used to obtain the dual action, that is multiplication by the pairing of the $\hat{G}$ variable and the group element:
$$(\hat{\hat{\beta}}_g[\hat{k}])(z,w,\xi) = \xi(g)\,\hat{k}(z,w,\xi). \tag{9.7}$$

Theorem 9.3. The double dual action of $G$ can be written as
$$(\hat{\hat{\beta}}_g[\tilde{k}])(x;z,w) = \tau_g[(v(g,g^{-1}z)^{-1}\beta_g[\tilde{k}]\,v(g,g^{-1}w))(x;z,w)]. \tag{9.8}$$

Proof. Fourier transforming the definition of the action we get
$$(\hat{\hat{\beta}}_g[k])(z,w,x) = \int_{\hat{G}} \xi(g)\xi(x)\,\hat{k}(z,w,\xi)\,d\xi = k(z,w,xg). \tag{9.9}$$
Using the same notation for the equivalent action on $\tilde{k}$,
$$\begin{aligned}
(\hat{\hat{\beta}}_g[\tilde{k}])(x;z,w) &= \phi(x,z,w)^{-1}u(x,z)\,(\hat{\hat{\beta}}_g[k])(xz,xw,w^{-1})\,u(x,w)^{-1}\\
&= \phi(x,z,w)^{-1}u(x,z)\,k(xz,xw,w^{-1}g)\,u(x,w)^{-1}\\
&= \phi(xg,g^{-1}z,g^{-1}w)\,\phi(x,z,w)^{-1}u(x,z)u(xg,g^{-1}z)^{-1}\,\tilde{k}(xg;g^{-1}z,g^{-1}w)\,u(xg,g^{-1}w)u(x,w)^{-1}.
\end{aligned}$$
In terms of the induced twisting we now have
$$u(x,z)u(xg,g^{-1}z)^{-1} = \phi(x,g,g^{-1}z)^{-1}v(g,g^{-1}z)(x)^{-1}u(x,g), \tag{9.10}$$
and substituting this (and the analogous expression in $w$), we arrive at
$$\begin{aligned}
(\hat{\hat{\beta}}_g[\tilde{k}])(x;z,w) &= \phi(xg,g^{-1}z,g^{-1}w)\,\phi(x,z,w)^{-1}\phi(x,g,g^{-1}z)^{-1}\phi(x,g,g^{-1}w)\\
&\quad\times v(g,g^{-1}z)(x)^{-1}u(x,g)\,\tilde{k}(xg;g^{-1}z,g^{-1}w)\,u(x,g)^{-1}v(g,g^{-1}w)(x)\\
&= \phi(g,z,w)\,v(g,g^{-1}z)(x)^{-1}\,\mathrm{ad}(u(x,g))[\tilde{k}(xg;g^{-1}z,g^{-1}w)]\,v(g,g^{-1}w)(x)\\
&= \phi(g,z,w)\,v(g,g^{-1}z)(x)^{-1}\,(\beta_g[\tilde{k}])(x;g^{-1}z,g^{-1}w)\,v(g,g^{-1}w)(x).
\end{aligned}$$
This can be rewritten in terms of the twisted action $\tau_g$ on kernels and the adjoint action of $v$ in the form
$$(\hat{\hat{\beta}}_g[\tilde{k}])(x;z,w) = \tau_g[(v(g,g^{-1}z)^{-1}\beta_g[\tilde{k}]\,v(g,g^{-1}w))(x;z,w)], \tag{9.11}$$
showing that to within an inner automorphism of the kernels one has $\beta_g \otimes \tau_g$, and up to an action on the twisted kernels one recovers $\beta_g$.

In the application to principal $\mathbb{T}^n$-bundles one takes $G = \mathbb{R}^n$. As this is contractible, Proposition 5.2 shows that $\mathcal{K}_\phi(L^2(G))$ is a deformation of $\mathcal{K}(L^2(G))$, so that this is a close substitute for the usual duality theorem.

10. Applications to T-Duality

In this section, we apply the mathematical results of the earlier sections to determine the T-dual of principal torus bundles with general H-flux, thus generalizing earlier results in [2–4, 28, 29].

Let $G = \mathbb{R}^n$ and $N = \mathbb{Z}^n$ in the setup of Corollary 7.3. Let $E \to M$ be a principal $\mathbb{T}^n$-bundle, and $H \in H^3(E)$ be an integral H-flux on $E$.3 Then we can identify $H = (H_3, H_2, H_1, H_0)$, where $H_p \in \Omega^p(M, \wedge^{3-p}\mathfrak{t})$, under the isomorphism (1.1), and closed under $D$. Then by the results in [28, 29], there is a continuous trace $C^*$-algebra $\mathrm{ind}_N^G(A)$ with spectrum equal to $E$ and with Dixmier-Douady invariant equal to $(H_3, H_2, H_1, 0)$, which has an action of $G$ that covers the given action of $G$ on $E$. This action is not necessarily unique. Then by Corollary 7.3, we know that there is another continuous trace $C^*$-algebra $u\text{-}\mathrm{ind}_N^G(A)$ with spectrum equal to $E$ and with Dixmier-Douady invariant equal to $H = (H_3, H_2, H_1, H_0)$, which has a twisted action of $G$ that covers the given action of $G$ on $E$. Our main definition in this section is:

Definition 10.1. The twisted crossed product $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G$ is defined to be the T-dual to the principal $\mathbb{T}^n$-bundle $E$ with H-flux $H$.

We justify this definition as follows. Firstly, the T-dual of $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G$ is the crossed product $(u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G) \rtimes_{\hat{\beta}} \hat{G}$, which by the twisted Takai duality Theorem 9.2 is isomorphic to $u\text{-}\mathrm{ind}_N^G(A) \otimes \mathcal{K}_\phi(L^2(G))$. That is, the T-dual of $u\text{-}\mathrm{ind}_N^G(A) \rtimes_{\beta,v} G$ is Morita equivalent to the continuous trace algebra $u\text{-}\mathrm{ind}_N^G(A)$, so that T-duality applied twice returns us to where we started, up to Morita equivalence.
In the special case when $H = H_0$, the fibre of this bundle over the point $z \in M$ is equal to the nonassociative torus $A_\phi$ of rank $n$ with tricharacter $\phi$ corresponding to $H_0$ (see Theorem 8.2). In the general case, but when $H_0$ is zero, the fibre is a stabilized noncommutative torus with invariant $H_1$ [28, 29], and when $H_0 = 0$ and $H_1 = 0$, the fibre is the stabilized algebra of continuous functions on a torus [2–4]. Thus we have the following theorem.

Theorem 10.2 (T-duality for principal torus bundles). Let $E \to M$ be a principal $\mathbb{T}^n$-bundle over $M$, and $H \in H^3(E)$ be an integral H-flux on $E$. Then $H = (H_3, H_2, H_1, H_0)$, where $H_p \in \Omega^p(M, \wedge^{3-p}\mathfrak{t})$. Let $c_1(E) \in H^2(M, \mathfrak{t})$ denote the first Chern class of $E$, which determines $E$ up to isomorphism. Then:

(1) If $H_0 = 0$ and $H_1 = 0$, then there is a canonical T-dual $\hat{E}$ which is a principal $\hat{\mathbb{T}}^n$-bundle over $M$ with first Chern class $c_1(\hat{E}) = H_2 \in H^2(M, \hat{\mathfrak{t}})$. $\hat{E}$ has a T-dual H-flux $\hat{H} = (\hat{H}_3, \hat{H}_2, 0, 0)$ given by $\hat{H}_3 = H_3$ and $\hat{H}_2 = c_1(E)$. T-duality is neatly encapsulated in the commutative diagram,
$$\begin{array}{ccccc}
 & & E \times_M \hat{E} & & \\
 & {}^{p}\swarrow & & \searrow^{\hat{p}} & \\
E & & & & \hat{E} \\
 & {}_{\pi}\searrow & & \swarrow_{\hat{\pi}} & \\
 & & M & &
\end{array} \tag{10.1}$$

Fig. 10.1. In the diagram, the fiber over $z \in M$ is the noncommutative torus $A_{f(z)}$, which is represented by a foliated torus, with foliation angle equal to $f(z)$.

(2) If $H_0 = 0$ and $H_1 \neq 0$, then the T-dual is a continuous field of (stabilized) noncommutative tori $A_f$ over $M$, where the fiber over the point $z \in M$ is equal to the rank $k$ noncommutative torus $A_{f(z)}$ (see Fig. 10.1 above). Here $f : M \to \mathbb{T}^{\binom{n}{2}}$ is a continuous map representing $H_1 \in [M, \mathbb{T}^{\binom{n}{2}}] \subset H^1(M, \wedge^2\hat{\mathfrak{t}})$. This map is not unique, but the nonuniqueness does not affect its K-theory.

(3) If $H_0 \neq 0$ and if $H = H_0$, then the T-dual is a bundle of nonassociative tori $A_\phi$ (cf. Theorem 8.2) over $M$, where $\phi$ is the tricharacter associated to $H_0$. For general $H$, the T-dual is a continuous field of algebras that contains both the noncommutative torus and the nonassociative torus, and moreover, the T-dual is not unique, but the nonuniqueness occurs exactly as in part (2) above.

3 The conclusions in this section are valid for integral classes $H \in H^3(E, \mathbb{Z})$ as well since the component $H_0$ does not carry torsion, and the remainder of the arguments is based on the results of [28, 29], which also hold for torsion $H$. For simplicity we state the results for differential forms only.

Part (1) was proved in [2, 3] when $n = 1$ and in [4] for general $n$. Part (2) was proved in [28] when $n = 2$ and in [29] for general $n$. Part (3) is what has been proved in this paper.

Particular, but important, cases of Theorem 10.2 above are the following.

1. The T-dual of the torus $\mathbb{T}^3$ with no background flux is the dual torus $\hat{\mathbb{T}}^3$. This remains true if the background flux is topologically trivial.

2. $(\mathbb{T}^3, k\,dx \wedge dy \wedge dz)$ considered as a trivial circle bundle over $\mathbb{T}^2$. The T-dual of $(\mathbb{T}^3, k\,dx \wedge dy \wedge dz)$ is the nilmanifold $(H_{\mathbb{R}}/H_{\mathbb{Z}}, 0)$, where $H_{\mathbb{R}}$ is the 3 dimensional Heisenberg group and $H_{\mathbb{Z}}$ the lattice in it defined by
$$H_{\mathbb{Z}} = \left\{ \begin{pmatrix} 1 & x & \tfrac{1}{k}z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} : x, y, z \in \mathbb{Z} \right\}. \tag{10.2}$$

3. $(\mathbb{T}^3, k\,dx \wedge dy \wedge dz)$ considered as a trivial $\mathbb{T}^2$-bundle over $\mathbb{T}$. The T-dual of $(\mathbb{T}^3, k\,dx \wedge dy \wedge dz)$ is a continuous field of stabilized noncommutative tori, $C^*(H_{\mathbb{Z}}) \otimes \mathcal{K}$, since $\int_{\mathbb{T}^2} k\,dx \wedge dy \wedge dz \neq 0$.

4. $(\mathbb{T}^3, k\,dx \wedge dy \wedge dz)$ considered as a trivial $\mathbb{T}^3$-bundle over a point. The T-dual of $(\mathbb{T}^3, k\,dx \wedge dy \wedge dz)$ is a nonassociative torus, $A_\phi$ (cf. Theorem 8.2), where $\phi$ is the tricharacter associated to $k\,dx \wedge dy \wedge dz$, since $\int_{\mathbb{T}^3} k\,dx \wedge dy \wedge dz \neq 0$.
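The lattice $H_{\mathbb{Z}}$ of (10.2) is closed under matrix multiplication even though its corner entry is only a multiple of $1/k$. The sketch below checks this for one pair of elements with exact rational arithmetic; the helper names and the sample value $k = 3$ are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def heis(x, y, z, k):
    """Element of H_Z as in (10.2): entries x, y and corner entry z/k."""
    return ((1, x, Fraction(z, k)), (0, 1, y), (0, 0, 1))


def matmul(a, b):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3))
        for i in range(3)
    )


def in_lattice(m, k):
    """True if m has the shape of (10.2) with integer x, y, z."""
    unitriangular = (all(m[i][i] == 1 for i in range(3))
                     and m[1][0] == m[2][0] == m[2][1] == 0)
    integral = all(Fraction(v).denominator == 1
                   for v in (m[0][1], m[1][2], k * m[0][2]))
    return unitriangular and integral


k = 3  # hypothetical flux value
a, b = heis(1, 2, 5, k), heis(-4, 7, 1, k)
ab, ba = matmul(a, b), matmul(b, a)
```

Multiplying out, the corner entry of the product is $(z + z' + k\,x\,y')/k$, so the product stays in the lattice, and the two orders of multiplication differ whenever $x\,y' \neq x'\,y$.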
We end with some speculations and open problems related to the results of the paper. In Sect. 11, we propose a natural definition of K-theory for the special nonassociative algebras that are considered in this paper. These are of the form $A \rtimes_{\beta,v} G$, where $A$ is a $C^*$-algebra admitting a twisted action of the abelian group $G = \mathbb{R}^n$. We expect an analogue of the Connes-Thom isomorphism theorem in K-theory to hold, showing that the K-theories of $A$ and $A \rtimes_{\beta,v} G$ are naturally isomorphic. This would then give further evidence that our definition of the T-dual of a principal torus bundle with H-flux is indeed correct. Finally, the K-theory of our special nonassociative algebras should be Morita invariant in our context, namely invariant under tensor product with twisted compact operators. Then the twisted Takai duality Theorem 9.2 would prove that T-duality applied twice returns us to the torus bundle with H-flux that we started out with.

It also remains to determine the topological invariants of continuous fields of noncommutative tori and bundles of nonassociative tori as in the paper. This would then enable one to give a more symmetric characterization of the T-dual, similar to part (1) of the theorem above. We have an explicit conjecture for this, the explanation for which is in [5], namely, for the continuous field of noncommutative tori $A_f$, there should be a "Chern class" invariant $c_1(A_f) = (H_2, H_1, 0)$ satisfying $dH_2 + c_1(E) \wedge H_1 = 0$. In this case, we can add the following to part (2) of the theorem above.

The T-dual $A_f$ is classified by its Chern class invariant $c_1(A_f) = (H_2, H_1, 0)$ satisfying $dH_2 + c_1(E) \wedge H_1 = 0$ and $dH_1 = 0$, and has T-dual H-flux $\hat{H} = (\hat{H}_3, \hat{H}_2, \hat{H}_1, 0)$, given by $\hat{H}_3 = H_3$, $\hat{H}_2 = c_1(E)$ and $\hat{H}_1 = 0$.

Similarly, for the bundle of nonassociative tori $A_\phi$ with tricharacter $\phi$ associated to $H_0$, there should also be a "Chern class" invariant $c_1(A_\phi) = (H_2, H_1, H_0)$ satisfying $dH_2 + c_1(E) \wedge H_1 = 0$ and $dH_1 + c_1(E) \wedge H_0 = 0$. In this case, we can add the following to part (3) of the theorem above.

The T-dual $A_\phi$ is classified by its Chern class invariant $c_1(A_\phi) = (H_2, H_1, H_0)$ satisfying $dH_2 + c_1(E) \wedge H_1 = 0$, $dH_1 + c_1(E) \wedge H_0 = 0$ and $dH_0 = 0$. It has T-dual H-flux $\hat{H} = (\hat{H}_3, \hat{H}_2, \hat{H}_1, \hat{H}_0)$ given by $\hat{H}_3 = H_3$, $\hat{H}_2 = c_1(E)$, $\hat{H}_1 = 0$ and $\hat{H}_0 = 0$.

What also remains to be done is T-duality for nonabelian principal bundles, where some of the ideas of this paper and [5] apply.
11. Nonassociative Algebras and Monoidal Categories – An Outlook

Although the nonassociativity of the crossed product algebra appears to present a serious amendment to the notion of duality, that is not really the case. The fact that the same obstruction $\phi$ appears throughout is a signal that one should rather work in the monoidal category of $C_0(G)$-modules in which the isomorphism $\Phi : (U \otimes V) \otimes W \to U \otimes (V \otimes W)$ is given by the action of $\phi \in C(G \times G \times G)$, the multiplier algebra of $C_0(G) \otimes C_0(G) \otimes C_0(G)$, [25, 12]. The cocycle identity for $\phi$ is equivalent to commutativity of the fundamental pentagonal diagram which ensures that all higher associators are consistent. By Fourier transforming we could identify the category as $\hat{G}$-modules rather than $C(G)$-modules, which fits more directly into the framework of the duality theorem.

The identity object is the trivial $\hat{G}$-module $1$ on $\mathbb{C}$, which certainly has the property that, for any $\hat{G}$-module $U$, $U \otimes 1$ and $1 \otimes U$ are naturally isomorphic to $U$. This is equivalent on $C(G)$ to evaluating a function at the identity. Because $\phi$ is trivial when an argument is set equal to the identity, the two obvious maps from $U \otimes (1 \otimes V) = [(U \otimes 1) \otimes V]$ to $U \otimes V$ are consistent. An algebra $A$ is a monoid in this category, and the identification $\Phi$ automatically takes care of the associativity.

We can also define a left $A$-module $M$ if one has a morphism $A \otimes M \to M$. A left $A$-module $M$ is said to be projective if given any surjective morphism of left $A$-modules $a : E \to N$ and any morphism of left $A$-modules $b : M \to N$, there is a morphism of left $A$-modules $c : M \to E$ such that $a \circ c = b$. If $A$ has a unit, then one can define the monoid $V(A)$ consisting of isomorphism classes of finitely generated projective left $A$-modules under the direct sum operation. Then $K_0(A)$ is defined as the Grothendieck group of $V(A)$. If $A$ does not have a unit, and $A^+$ denotes $A$ with a unit adjoined to it, then $K_0(A)$ is defined as the kernel of the canonical morphism $K_0(A^+) \to K_0(\mathbb{C}) \cong \mathbb{Z}$. This will be studied in detail in a subsequent paper.

An example of a monoid in the category of $\hat{G}$-modules is the algebra $\mathcal{K}_\phi(L^2(G))$ of twisted compact operators with the $\hat{G}$ action
$$(\xi \cdot K)(x,y) = \xi(xy^{-1})K(x,y),$$
and $L^2(G)$, which has the $\hat{G}$-action
$$(\xi \cdot \psi)(x) = \xi(x)\psi(x),$$
is a module with $K \otimes \psi \mapsto K * \psi$, where
$$(K * \psi)(x) = \int_G K(x,z)\psi(z)\,dz.$$
The $\hat{G}$ actions are compatible since
$$((\xi \cdot K) * (\xi \cdot \psi))(x) = \int_G \xi(xz^{-1})K(x,z)\,\xi(z)\psi(z)\,dz = (\xi \cdot (K * \psi))(x).$$
Then
$$(K_1 * (K_2 * \psi))(x) = \int_{G\times G} K_1(x,y)K_2(y,z)\psi(z)\,dy\,dz,$$
and the alternate bracketing $(K_1 * K_2) * \psi$ must be computed as the image of $\Phi(K_1 \otimes (K_2 \otimes \psi))$, giving
$$((K_1 * K_2) * \psi)(x) = \int_{G\times G} \phi(x,y,z)^{-1}K_1(x,y)K_2(y,z)\psi(z)\,dy\,dz,$$
consistent with the multiplication law on the twisted kernels.
One can alternately work with the $C_0(G)$ action rather than $\hat{G}$, but then the action on kernels requires a use of the coproduct $(\Delta f)(x, y) = f(xy)$, so that $(f \cdot K)(x, y) = f(xy^{-1}) K(x, y)$.

One can similarly define right $A$-modules, and also bimodules for two algebras $A_1$ and $A_2$. It is also possible to look at $A_1$-$A_2$-bimodules $X$ which have an action of a product group $\hat{G}_1 \times \hat{G}_2$ with maps $\Phi_1$ and $\Phi_2$ defining the associativity properties, and are in the category of $\hat{G}_1$-modules as left $A_1$-modules and in the category of $\hat{G}_2$-modules as right $A_2$-modules. Such a bimodule can be used to set up a Morita equivalence between left $A_2$-modules and left $A_1$-modules, by mapping a left $A_2$-module $M$ to the quotient of $X \otimes M$ by the equivalence relation $(x.b) \otimes \psi \sim \Phi_2(x \otimes (b.\psi))$, for $x \in X$, $b \in A_2$ and $\psi \in M$. This allows us to define Morita equivalence between algebras with different kinds of associativity. In particular, if we take $A_1 = K_\phi(L^2(G))$, $A_2 = K(L^2(G))$, with $X = K(L^2(G))$, equipped with the usual right multiplication action of $A_2$ and the left multiplication action of $A_1$ defined above, then we have Morita equivalence between the twisted and untwisted algebras.

Clearly this is only an outline of some of the ideas arising out of this new perspective on nonassociativity, and we shall explore these in more detail in the sequel to this paper. Since posting this paper on the arXiv, [1] has come to our attention, which also investigates some nonassociative algebras, albeit in a rather different context.

Acknowledgements. We would like to thank Edwin Beggs, Alan Carey and Ulrike Tillmann for useful comments. KCH would like to thank the University of Adelaide for hospitality during the initial stages of the project. PB and VM were financially supported by the Australian Research Council.
References

1. Akrami, S.E., Majid, S.: Braided cyclic cocycles and nonassociative geometry. J. Math. Phys. 45, 3883–3911 (2004)
2. Bouwknegt, P., Evslin, J., Mathai, V.: T-duality: Topology change from H-flux. Commun. Math. Phys. 249, 383–415 (2004)
3. Bouwknegt, P., Evslin, J., Mathai, V.: On the topology and H-flux of T-dual manifolds. Phys. Rev. Lett. 92, 181601 (2004)
4. Bouwknegt, P., Hannabuss, K.C., Mathai, V.: T-duality for principal torus bundles. J. High Energy Phys. 03, 018 (2004)
5. Bouwknegt, P., Hannabuss, K.C., Mathai, V.: T-duality for principal torus bundles and dimensionally reduced Gysin sequences. Adv. Theor. Math. Phys. 9(5), 749–773 (2005) [hep-th/0412268]
6. Brown, K.S.: Cohomology of groups. New York–Berlin: Springer Verlag, 1982
7. Busby, R.C., Smith, H.A.: Representations of twisted group algebras. Trans. Amer. Math. Soc. 149, 503–537 (1970)
8. Carey, A.L.: The origin of three-cocycles in quantum field theory. Phys. Lett. B194, 267–270 (1987)
9. Carey, A.L., Hannabuss, K.C., Mathai, V., McCann, P.: The quantum Hall effect in hyperbolic space. Commun. Math. Phys. 190, 629–673 (1998)
10. Carey, A.L., Mickelsson, J.: The universal gerbe, Dixmier–Douady class, and gauge theory. Lett. Math. Phys. 59, 47–60 (2002)
11. Crocker, D., Kumjian, A., Raeburn, I., Williams, D.: An equivariant Brauer group and actions of groups on C∗-algebras. J. Funct. Anal. 146, 151–184 (1997)
12. Chari, V., Pressley, A.: A guide to quantum groups. Cambridge: Cambridge University Press, 1994
13. Cornalba, L., Schiappa, R.: Nonassociative star product deformations for D-brane worldvolumes in curved backgrounds. Commun. Math. Phys. 225, 33–66 (2002)
14. Echterhoff, S., Rosenberg, J.: Fine structure of the Mackey machine for actions of abelian groups with constant Mackey obstruction. Pacific J. Math. 170, 17–52 (1995)
15. Eilenberg, S., MacLane, S.: Cohomology theory in abstract groups II. Ann. Math. 48, 326–341 (1947)
16. Eilenberg, S., MacLane, S.: Algebraic cohomology and loops. Duke Math. J. 14, 435–463 (1947)
17. Green, P.: The structure of imprimitivity algebras. J. Funct. Anal. 36, 88–104 (1980)
18. Greub, W., Halperin, S., Vanstone, R.: Connections, curvature, and cohomology. Vols II and III, New York: Academic Press, 1973
19. Hannabuss, K.C.: Representations of nilpotent locally compact groups. J. Funct. Anal. 34, 146–165 (1979)
20. Holt, D.F.: Cohomology groups H^n(G, M). J. Alg. 60, 307–320 (1979)
21. Huebschmann, J.: Crossed n-fold extensions of groups and cohomology. Comment. Math. Helv. 55, 302–313 (1980)
22. Jackiw, R.: Three-cocycles in mathematics and physics. Phys. Rev. Lett. 54, 159–162 (1985)
23. Kleppner, A.A.: Multipliers on abelian groups. Math. Ann. 158, 11–34 (1965)
24. Leptin, H.: Verallgemeinerte L^1-Algebren. Math. Ann. 159, 51–76 (1965)
25. MacLane, S.: Categories for the working mathematician. New York: Springer Verlag, 1971
26. MacLane, S.: Historical Note. J. Alg. 60, 319–320 (1979)
27. MacLane, S., Whitehead, J.H.C.: On the 3-type of a complex. Proc. Nat. Acad. Sci. U.S.A. 30, 41–48 (1950)
28. Mathai, V., Rosenberg, J.: T-duality for torus bundles with H-fluxes via noncommutative topology. Commun. Math. Phys. 253, 705–721 (2005)
29. Mathai, V., Rosenberg, J.: On mysteriously missing T-duals, H-flux and the T-duality group. http://arxiv.org/list/hep-th/0409073, 2004, to appear in “Proceedings of the XXXIII International Conference of Differential Geometric Methods in Mathematical Physics” (August 2005), editors Mo-Lin Ge and Weiping Zhang, World Scientific 2006 [hep-th/0409073]
30. Packer, J.A., Raeburn, I.: Twisted crossed products of C∗-algebras. Math. Proc. Camb. Phil. Soc. 106, 293–311 (1989)
31. Quigg, J.C.: Duality for reduced twisted crossed products of C∗-algebras. Indiana Univ. Math. J. 35, 549–571 (1986)
32. Raeburn, I., Sims, A., Williams, D.: Twisted actions and obstructions in group cohomology. In: Cuntz, J., Echterhoff, S. (eds.) C∗-algebras. Berlin: Springer Verlag, 2000
33. Raeburn, I., Williams, D.: Morita equivalence and continuous trace C∗-algebras. Mathematical Surveys and Monographs of the A.M.S. 60, Providence, RI: Amer. Math. Soc., 1998
34. Ratcliffe, J.G.: Crossed extensions. Trans. Amer. Math. Soc. 257, 73–89 (1980)
35. Whitehead, J.H.C.: Combinatorial homotopy II. Bull. Amer. Math. Soc. 55, 453–496 (1949)

Communicated by M.R. Douglas
Commun. Math. Phys. 264, 71–85 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1509-0
Communications in
Mathematical Physics
Homological Mirror Symmetry for Toric del Pezzo Surfaces
Kazushi Ueda
Research Institute for Mathematical Sciences, Kyoto University, Kyoto 606-8502, Japan. E-mail: [email protected]
Received: 21 January 2005 / Accepted: 10 August 2005 Published online: 15 February 2006 – © Springer-Verlag 2006
Abstract: We prove the homological mirror conjecture for toric del Pezzo surfaces. In this case, the mirror object is a regular function on an algebraic torus (C× )2 . We show that the derived Fukaya category of this mirror coincides with the derived category of coherent sheaves on the original manifold.
1. Introduction

Mirror symmetry started as a mysterious relationship between the complex geometry of a Calabi-Yau 3-fold and the symplectic geometry of another Calabi-Yau 3-fold called the mirror manifold. In 1994, Kontsevich [11] proposed a program to understand various mirror phenomena as a consequence of the following homological mirror conjecture: Calabi-Yau manifolds always come in pairs in such a way that the derived category of coherent sheaves on one manifold is equivalent as a triangulated category to the derived Fukaya category of the other.

Although mirror symmetry was first discovered for Calabi-Yau manifolds, there are also variants of these phenomena for other classes of manifolds. One such example is Givental’s theorem [8] giving integral representations of J-functions of toric Fano manifolds. See also [3, 19, 6, 10].

We can also formulate the homological mirror conjecture for toric Fano manifolds. Let $X$ be a toric Fano manifold of dimension $n$. Then the mirror partner is a regular function $W$ on an algebraic torus $(\mathbb{C}^\times)^n$ of the same dimension equipped with a symplectic structure (along with additional data called a grading). Here, the function $W$ is a Laurent polynomial whose Newton polytope is the convex hull of the generators of the one-dimensional cones of the fan of $X$. Coefficients of this polynomial do not matter as long as they are chosen
Supported by JSPS Fellowships for Young Scientists No.15-5561.
general enough. By Kouchnirenko [12], $W$ has exactly $\dim H^*(X, \mathbb{C})$ critical points. Take a regular value $t$ of $W$ and a distinguished basis of vanishing cycles in $W^{-1}(t)$. We also have to choose a grading and a spin structure on each of these vanishing cycles. The directed Fukaya category $\operatorname{Fuk}^{\to} W$ of $W$ (along with the choice of gradings and spin structures) is an $A_\infty$-category whose objects are vanishing cycles and whose morphisms are Floer complexes. Roughly speaking, the Floer complex between two vanishing cycles is the vector space spanned by the intersection points between them, and the compositions of morphisms are given by “counting polygons.” By Seidel [16], the derived category $D^b \operatorname{Fuk}^{\to} W$ of $\operatorname{Fuk}^{\to} W$ is independent of the choice of a distinguished basis of vanishing cycles. The following is the main result of this paper:

Theorem 1. The derived category of coherent sheaves on a toric del Pezzo surface $X$ is equivalent as a triangulated category to the derived category of the directed Fukaya category of the mirror $W$ of $X$;
\[ D^b \operatorname{coh}(X) \cong D^b \operatorname{Fuk}^{\to} W. \]

Our proof is based on an explicit computation. Theorem 1 extends the works of Seidel [17] and Auroux, Katzarkov and Orlov [1], where the cases of $\mathbb{P}^2$, $\mathbb{P}^1 \times \mathbb{P}^1$, and $\mathbb{P}^2$ blown up at one point are treated. See also the paper by Hori, Iqbal and Vafa [9], where homological mirror symmetry for Fano manifolds is discussed from a physics point of view.

Note added. After the submission of this paper, a preprint [2] appeared which contains the proof of homological mirror symmetry for del Pezzo surfaces which are not necessarily toric.
2. Derived Category of Coherent Sheaves

In this section we describe the structure of the bounded derived category $D^b \operatorname{coh}(Y)$ of coherent sheaves on $Y$, where $Y$ is the projective plane blown up at three points $p_1$, $p_2$ and $p_3$ in general position. Let $\varphi : Y \to \mathbb{P}^2$ be this blow-up and $E_1$, $E_2$ and $E_3$ be the exceptional divisors corresponding to $p_1$, $p_2$ and $p_3$ respectively. $Y$ is a toric surface, and the generators of the one-dimensional cones of its fan are drawn in Fig. 1.
\[ v_1 = (1, 0), \quad v_2 = (0, 1), \quad v_3 = (-1, -1), \quad v_4 = (-1, 0), \quad v_5 = (0, -1), \quad v_6 = (1, 1). \]

Fig. 1. The toric data of Y
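Definition 1.
1. An object $\mathcal{E}$ in a triangulated category is exceptional if
\[ \operatorname{Ext}^i(\mathcal{E}, \mathcal{E}) = \begin{cases} \mathbb{C} & \text{if } i = 0, \\ 0 & \text{otherwise.} \end{cases} \]
2. An ordered set of objects $(\mathcal{E}_i)_{i=1}^N$ in a triangulated category is an exceptional collection if each $\mathcal{E}_i$ is exceptional and $\operatorname{Ext}^k(\mathcal{E}_i, \mathcal{E}_j) = 0$ for any $i > j$ and for any $k$.

By combining theorems of Beilinson [4] and Orlov [13], we have the following generators of $D^b \operatorname{coh}(Y)$:

Theorem 2. Let $\mathcal{C} = (\mathcal{E}_1, \mathcal{E}_2, \mathcal{E}_3, \mathcal{E}_4, \mathcal{E}_5, \mathcal{E}_6)$, where
\[ \begin{aligned} \mathcal{E}_1 &= \mathcal{O}_{E_1}(-1)[-1], & \mathcal{E}_2 &= \mathcal{O}_{E_2}(-1)[-1], & \mathcal{E}_3 &= \mathcal{O}_{E_3}(-1)[-1], \\ \mathcal{E}_4 &= \varphi^* \mathcal{O}_{\mathbb{P}^2}(-1), & \mathcal{E}_5 &= \varphi^* \Omega_{\mathbb{P}^2}(1), & \mathcal{E}_6 &= \mathcal{O}_Y. \end{aligned} \tag{1} \]
Then $\mathcal{C}$ is an exceptional collection generating $D^b \operatorname{coh}(Y)$.

Here, $\mathcal{O}_{E_i}(-1)$ is the sheaf supported on $E_i$ and isomorphic to the tautological sheaf $\mathcal{O}_{\mathbb{P}^1}(-1)$ on $E_i \cong \mathbb{P}^1$, $[\bullet]$ is the shift operator in the derived category, $\varphi^*$ is the derived pull-back, $\mathcal{O}_{\mathbb{P}^2}(-1)$ is the tautological sheaf on $\mathbb{P}^2$, and $\Omega_{\mathbb{P}^2}(1)$ is the cotangent sheaf of $\mathbb{P}^2$ tensored with the hyperplane sheaf $\mathcal{O}_{\mathbb{P}^2}(1)$.

Proposition 1. All the non-zero Ext-groups within the exceptional collection $\mathcal{C}$ are
\[ \begin{aligned} \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_4) &= \mathbb{C} p_i, \\ \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_5) &= \operatorname{Ker}(V^\vee \to (\mathbb{C} p_i)^\vee), \\ \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_6) &= \mathbb{C}, \\ \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_5) &= \wedge^2 V^\vee, \\ \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_6) &= V^\vee, \\ \operatorname{Hom}(\mathcal{E}_5, \mathcal{E}_6) &= V, \end{aligned} \]
where $i = 1, 2, 3$, $V \cong \mathbb{C}^3$ is the three-dimensional vector space such that $\mathbb{P}^2 = \mathbb{P}(V)$, the check denotes the dual vector space, $\mathbb{C} p_i \subset V$ is the one-dimensional subspace corresponding to $p_i \in \mathbb{P}(V)$, and the map $V^\vee \to (\mathbb{C} p_i)^\vee$ is the dual of the inclusion $\mathbb{C} p_i \to V$. Compositions of morphisms are given by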
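\[ \begin{aligned} \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_4) \times \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_5) &\to \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_5), & (v, \omega) &\mapsto -\iota_v \omega, \\ \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_4) \times \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_6) &\to \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_6), & (v, \omega) &\mapsto \omega(v), \\ \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_5) \times \operatorname{Hom}(\mathcal{E}_5, \mathcal{E}_6) &\to \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_6), & (\omega, v) &\mapsto \omega(v), \\ \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_5) \times \operatorname{Hom}(\mathcal{E}_5, \mathcal{E}_6) &\to \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_6), & (\omega, v) &\mapsto \iota_v \omega, \end{aligned} \]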
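where $\iota_v : \wedge^2 V^\vee \to V^\vee$ for $v \in V$ is the interior product.

Proof. All the computations reduce to those on $\mathbb{P}^2$ in the following way: Let $\mathcal{E}$ and $\mathcal{F}$ be vector bundles on $\mathbb{P}^2$. Then
\[ \begin{aligned} \operatorname{RHom}(\varphi^* \mathcal{E}, \varphi^* \mathcal{F}) &= \operatorname{RHom}(\mathcal{E}, R\varphi_* \varphi^* \mathcal{F}) = \operatorname{RHom}(\mathcal{E}, \mathcal{F} \otimes R\varphi_* \mathcal{O}_Y) = \operatorname{RHom}(\mathcal{E}, \mathcal{F}), \\ \operatorname{RHom}(\varphi^* \mathcal{E}, \mathcal{O}_{E_i}(-1)) &= \operatorname{RHom}(\mathcal{E}, R\varphi_* \mathcal{O}_{E_i}(-1)) = 0, \end{aligned} \]
and
\[ \begin{aligned} \operatorname{RHom}(\mathcal{O}_{E_i}(-1)[-1], \varphi^* \mathcal{E}) &= \operatorname{RHom}(\{\mathcal{O}_Y \to \mathcal{O}_Y(E_i)\}, \varphi^* \mathcal{E}) \\ &= R\Gamma(\{\mathcal{O}_Y(-E_i) \to \mathcal{O}_Y\} \otimes \varphi^* \mathcal{E}) \\ &= R\Gamma(\{R\varphi_* \mathcal{O}_Y(-E_i) \to R\varphi_* \mathcal{O}_Y\} \otimes \mathcal{E}) \\ &= R\Gamma(\{\mathcal{I}_{p_i} \to \mathcal{O}_{\mathbb{P}^2}\} \otimes \mathcal{E}) \\ &= R\Gamma(\mathcal{O}_{p_i} \otimes \mathcal{E}) \\ &= \mathcal{E}|_{p_i}, \end{aligned} \]
where $\mathcal{I}_{p_i}$ is the ideal sheaf of $p_i$ and the last line is the fiber at $p_i$. Here, we have used the exact sequence
\[ 0 \to \mathcal{O}_Y \to \mathcal{O}_Y(E_i) \to \mathcal{O}_{E_i}(-1) \to 0. \]
We can further use the exact sequence
\[ 0 \to \Omega_{\mathbb{P}^2}(1) \to V^\vee \otimes \mathcal{O}_{\mathbb{P}^2} \to \mathcal{O}_{\mathbb{P}^2}(1) \to 0 \]
to reduce the computations involving $\Omega_{\mathbb{P}^2}(1)$ to those involving $\mathcal{O}_{\mathbb{P}^2}(i)$, $i \in \mathbb{Z}$:
\[ \begin{aligned} \operatorname{RHom}(\mathcal{O}_{E_i}(-1)[-1], \varphi^* \Omega_{\mathbb{P}^2}(1)) &= R\Gamma(\mathcal{O}_{p_i} \otimes \{V^\vee \otimes \mathcal{O}_{\mathbb{P}^2} \to \mathcal{O}_{\mathbb{P}^2}(1)\}) \\ &= \{V^\vee \to (\mathbb{C} p_i)^\vee\}, \\ \operatorname{RHom}(\varphi^* \mathcal{O}_{\mathbb{P}^2}(-1), \varphi^* \Omega_{\mathbb{P}^2}(1)) &= \operatorname{RHom}(\mathcal{O}_{\mathbb{P}^2}(-1), \{V^\vee \otimes \mathcal{O}_{\mathbb{P}^2} \to \mathcal{O}_{\mathbb{P}^2}(1)\}) \\ &= R\Gamma(\mathcal{O}_{\mathbb{P}^2}(1) \otimes \{V^\vee \otimes \mathcal{O}_{\mathbb{P}^2} \to \mathcal{O}_{\mathbb{P}^2}(1)\}) \\ &= R\Gamma(\{V^\vee \otimes \mathcal{O}_{\mathbb{P}^2}(1) \to \mathcal{O}_{\mathbb{P}^2}(2)\}) \\ &= \{V^\vee \otimes V^\vee \to \operatorname{Sym}^2 V^\vee\} \\ &= \wedge^2 V^\vee, \end{aligned} \]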
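\[ \begin{aligned} \operatorname{RHom}(\varphi^* \Omega_{\mathbb{P}^2}(1), \mathcal{O}_Y) &= \operatorname{RHom}(\{V^\vee \otimes \mathcal{O}_{\mathbb{P}^2} \to \mathcal{O}_{\mathbb{P}^2}(1)\}, \mathcal{O}_{\mathbb{P}^2}) \\ &= R\Gamma(\{\mathcal{O}_{\mathbb{P}^2}(-1) \to V \otimes \mathcal{O}_{\mathbb{P}^2}\}) \\ &= V. \end{aligned} \]
Compositions of morphisms can be easily read off from the above computations.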
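The composition rules of Proposition 1 can be checked mechanically in coordinates. The following is a minimal sketch, not part of the original argument: it assumes the points are the coordinate points (so that Hom(Ei, E4) is spanned by the i-th basis vector of V), stores 1-forms as coefficient triples and 2-forms as antisymmetric 3×3 matrices, and uses helper names (`wedge`, `interior`, `comp_i45`, and so on) that are purely illustrative. It recomputes a few entries of the composition tables and verifies that the two ways of composing Ei → E4 → E5 → E6 agree.

```python
def basis(i):
    """i-th standard basis vector of V (also used for the dual basis), i in {0, 1, 2}."""
    return tuple(1 if j == i else 0 for j in range(3))


def wedge(a, b):
    """2-form a ∧ b of two 1-forms, stored as the matrix w[j][k] = a_j b_k - a_k b_j."""
    return tuple(tuple(a[j] * b[k] - a[k] * b[j] for k in range(3)) for j in range(3))


def interior(v, w):
    """Interior product ι_v ω, i.e. the 1-form u -> ω(v, u)."""
    return tuple(sum(v[j] * w[j][k] for j in range(3)) for k in range(3))


def pair(alpha, v):
    """Evaluate the 1-form alpha on the vector v."""
    return sum(a * b for a, b in zip(alpha, v))


def comp_i45(v, w):
    """Hom(E_i, E_4) x Hom(E_4, E_5) -> Hom(E_i, E_5): (v, ω) -> -ι_v ω."""
    return tuple(-c for c in interior(v, w))


def comp_456(w, v):
    """Hom(E_4, E_5) x Hom(E_5, E_6) -> Hom(E_4, E_6): (ω, v) -> ι_v ω."""
    return interior(v, w)


def comp_to_E6(alpha, v):
    """Both compositions landing in Hom(E_i, E_6) are the pairing α(v), in units of e_i."""
    return pair(alpha, v)


X = [basis(i) for i in range(3)]  # x_1, x_2, x_3
TWO_FORMS = [wedge(X[0], X[1]), wedge(X[1], X[2]), wedge(X[2], X[0])]


def lands_in_kernel():
    """-ι_{x_i} ω must vanish on x_i, i.e. lie in Ker(V^∨ -> (C p_i)^∨)."""
    return all(comp_i45(X[i], w)[i] == 0 for i in range(3) for w in TWO_FORMS)


def associative():
    """Compare (v·ω)·u with v·(ω·u) for E_i -> E_4 -> E_5 -> E_6."""
    for i in range(3):
        for w in TWO_FORMS:
            for u in X:
                left = comp_to_E6(comp_i45(X[i], w), u)
                right = comp_to_E6(comp_456(w, u), X[i])
                if left != right:
                    return False
    return True
```

The sign in the first composition rule is what makes the check in `associative` succeed: both bracketings evaluate to −ω(v, u).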
3. Fukaya Category

First we recall the definition of an $A_\infty$-category. For a $\mathbb{Z}$-graded vector space $V$ and $i \in \mathbb{Z}$, $V[i]$ denotes the shift of $V$ by $i$; $V[i]^j = V^{i+j}$.

Definition 2. An $A_\infty$-category $\mathcal{A}$ consists of
1. the set of objects $\operatorname{Ob}(\mathcal{A})$,
2. for any $c_1, c_2 \in \operatorname{Ob}(\mathcal{A})$, a $\mathbb{Z}$-graded $\mathbb{C}$-vector space $\hom_{\mathcal{A}}(c_1, c_2)$ called the set of morphisms,
3. for any positive integer $k$ and for any set of objects $\{c_i\}_{i=0}^k$, the composition
\[ m_k : \hom_{\mathcal{A}}(c_0, c_1)[1] \otimes \cdots \otimes \hom_{\mathcal{A}}(c_{k-1}, c_k)[1] \longrightarrow \hom_{\mathcal{A}}(c_0, c_k)[1], \]
which is a linear map of degree 1,
such that for any positive integer $k$, any set of objects $\{c_i\}_{i=0}^k$ and any set of morphisms $\{a_i\}_{i=1}^k$, $a_i \in \hom_{\mathcal{A}}(c_{i-1}, c_i)$, the following $A_\infty$-relations hold:
\[ \sum_{i=0}^{k-1} \sum_{j=i+1}^{k} (-1)^{\deg a_1 + \cdots + \deg a_i}\, m_{i-j+k+1}\bigl(a_1 \otimes \cdots \otimes a_i \otimes m_{j-i}(a_{i+1} \otimes \cdots \otimes a_j) \otimes a_{j+1} \otimes \cdots \otimes a_k\bigr) = 0. \]
Here degrees are counted after shifts, i.e., if $a \in V^i$, then $\deg a = i - 1$ in $V[1]$.

Since $m_1^2 = 0$, we define $\operatorname{Hom}_{\mathcal{A}}(c_1, c_2) = H^0(\hom_{\mathcal{A}}(c_1, c_2), m_1)$.

The Fukaya category of Lagrangian submanifolds in a symplectic manifold is defined in [7]. We use the following adaptation for exact Morse fibrations by Seidel [16]. Let $W : Z \to \mathbb{C}$ be a regular function on an affine algebraic manifold $Z$ of complex dimension $n$ with a Kähler structure. Assume the following conditions:
– The Kähler metric is complete.
– The symplectic form $\omega$ of $Z$ is exact, i.e., there exists a one-form $\theta$ on $Z$ such that $\omega = d\theta$.
– At any critical point of $W$, the Hessian of $W$ is non-degenerate.
– All the critical values are distinct.
Such $W$ gives rise to an exact Morse fibration in the terminology of [16]. Using the Kähler structure, we can define the lift $\tilde{c}_p : [0, 1] \to Z$ of a path $c : [0, 1] \to \mathbb{C}$ starting from a point $p \in Z$ such that $W(p) = c(0)$ by using the horizontal distribution defined as the orthogonal complement of the tangent space along the fiber of $W$. Assume that the origin is a regular value of $W$ and fix an order $(p_i)_{i=1}^N$ on the set of critical points of $W$. A distinguished set $(c_i)_{i=1}^N$ of vanishing paths is a set of smooth paths $c_i : [0, 1] \to \mathbb{C}$ satisfying
1. $c_i(0) = 0$,
2. $c_i(1) = W(p_i)$,
3. $c_i$ has no self-intersection,
4. images of $c_i$ and $c_j$ intersect only at the origin,
5. $c_i'(0) \neq 0$, and $\arg c_{i+1}'(0) < \arg c_i'(0)$, $i = 1, \ldots, N - 1$, for some choice of the branch of $\arg(\bullet)$.
Given a distinguished set $(c_i)_{i=1}^N$ of vanishing paths, the corresponding vanishing cycles $(C_i)_{i=1}^N$ are defined by
\[ C_i = \{ p \in W^{-1}(0) \mid \lim_{t \to 1} (\tilde{c}_i)_p(t) = p_i \}. \]
They are Lagrangian submanifolds of $W^{-1}(0)$. The directed Fukaya category $\operatorname{Fuk}^{\to} W$ for $W$ is roughly an $A_\infty$-category whose objects are vanishing cycles and whose morphisms are Lagrangian intersection Floer complexes. To define a $\mathbb{Z}$-grading on the Floer complex, we need the concept of grading on Lagrangian submanifolds introduced by Kontsevich [11], which we now recall. See also Seidel [15].

Let $(M, \omega)$ be a symplectic manifold of dimension $2n$. An almost complex structure on $M$ is a section $J \in \Gamma(M, \operatorname{End}(TM))$ such that $J^2 = -\mathrm{id}$. $J$ is called $\omega$-compatible if $g(V_1, V_2) = \omega(V_1, J V_2)$ defines a Hermitian metric on the tangent bundle. Fix an $\omega$-compatible almost complex structure $J$ of $M$ and let $S$ be the principal $U(1)$-bundle associated to the complex line bundle $(\wedge^n(T^*M, J))^{\otimes 2}$. The fiber of $S$ at $p \in M$ is $(\wedge^n(T_p^*M, J))^{\otimes 2} / \mathbb{R}_{>0}$. We assume that the first Chern class of this complex line bundle vanishes, so that $S$ has a section. A grading of $M$ is a choice of a section of $S$ over $M$. Fix a grading on $M$.

Let $\operatorname{Lag}_M \to M$ be the Lagrangian Grassmannian bundle on $M$, whose fiber at $p \in M$ is the Grassmannian of Lagrangian subspaces in the symplectic vector space $T_pM$. Define $\det^2 : \operatorname{Lag}_M \to U(1)$ as follows: For a Lagrangian subspace $L \subset T_pM$, pick any basis $\{e_i\}_{i=1}^n$ of $L$ and take the square of their exterior product; $(e_1 \wedge e_2 \wedge \cdots \wedge e_n)^{\otimes 2} \in (\wedge^n(T_pM, J))^{\otimes 2}$. $\det^2(L)$ is the image of this element under the value at $p$ of the section chosen as the grading. A Lagrangian submanifold $L \subset M$ gives a canonical section $s_L : L \ni p \mapsto T_pL \in \operatorname{Lag}_M|_p$. Denote the composition of $s_L$ and $\det^2$ by $\phi_L$. A grading of a Lagrangian submanifold is a lift $\tilde{\phi}_L : L \to \mathbb{R}$ of $\phi_L$ to the universal cover $\mathbb{R}$ of $U(1) \cong \mathbb{R}/\mathbb{Z}$.

Now we define the Maslov index. A smooth path $\Lambda : [0, 1] \to \operatorname{Lag}(V, \omega)$ in the Lagrangian Grassmannian of a fixed symplectic vector space $(V, \omega)$ of dimension $2n$ is called crossingless if $\Lambda(0) \cap \Lambda(t) = \Lambda(0) \cap \Lambda(1)$ for all $t \in (0, 1]$. For a crossingless path $\Lambda(t)$, its differential $\Lambda'(0)$ at $t = 0$ gives an element of the tangent space $T_{\Lambda(0)} \operatorname{Lag}(V, \omega) \subset T_{\Lambda(0)} \operatorname{Gr}(n, V) \cong \operatorname{Hom}(\Lambda(0), V/\Lambda(0))$, where $\operatorname{Gr}(n, V)$ is the Grassmannian of $n$-dimensional subspaces in $V$. The composition of $\Lambda'(0)$ and $\omega$ defines a quadratic form $\Lambda(0) \ni v \mapsto \omega(v, \Lambda'(0)v)$ on $\Lambda(0)$, which descends to a quadratic form on $\Lambda(0)/\Lambda(0) \cap \Lambda(1)$ since it vanishes on $\Lambda(0) \cap \Lambda(1)$. The resulting form on $\Lambda(0)/\Lambda(0) \cap \Lambda(1)$ is called the crossing form [14].

For an intersection $p \in L_0 \cap L_1$ of two graded Lagrangian submanifolds $(L_0, \tilde{\phi}_{L_0})$ and $(L_1, \tilde{\phi}_{L_1})$, its Maslov index is defined as follows: Choose a crossingless path $\Lambda : [0, 1] \to \operatorname{Lag}_M|_p$ from $T_pL_0$ to $T_pL_1$ such that the corresponding crossing form at $t = 0$ is negative definite. There is a unique lift $\tilde{\alpha} : [0, 1] \to \mathbb{R}$ of the composition $\alpha$ of $\Lambda : [0, 1] \to \operatorname{Lag}_M|_p$ and $\det^2(p) : \operatorname{Lag}_M|_p \to U(1)$ such that $\tilde{\alpha}(0) = \tilde{\phi}_{L_0}(p)$, and the Maslov index $I(p)$ of the intersection $p$ of two graded Lagrangian submanifolds is defined by
\[ I(p) = \tilde{\phi}_{L_1}(p) - \tilde{\alpha}(1). \]
Now we come back to our exact Morse fibration $W$. A relative Maslov map is a section of the second tensor power $(\wedge^{n-1} \Omega_{Z/\mathbb{C}})^{\otimes 2}$ of the top exterior product of the relative cotangent bundle away from the critical points. Since 0 is a regular value, the restriction of the relative Maslov map to $W^{-1}(0)$ gives a grading of $W^{-1}(0)$. Assume that all the vanishing cycles can be graded, and fix a grading on each vanishing cycle.

We also need to choose a spin structure on each vanishing cycle in order to orient the moduli spaces of pseudoholomorphic maps [7]. Since each vanishing cycle is homeomorphic to a sphere, the choice of a spin structure is unique except for $\dim_{\mathbb{C}} W^{-1}(0) = 1$, where one has as many as $H^1(S^1, \mathbb{Z}/2\mathbb{Z}) = \mathbb{Z}/2\mathbb{Z}$ choices. We take the non-trivial spin structure in such a case. Let $C_i$ also denote the vanishing cycle $C_i$ endowed with the above grading and spin structure. We assume that vanishing cycles intersect each other transversally. This condition can always be met by moving vanishing cycles within their Hamiltonian isotopy classes if necessary.

Definition 3. Given a function $W$ on an affine Kähler manifold $Z$ together with a relative Maslov map and a choice of a distinguished basis of vanishing cycles $(C_i)_{i=1}^N$ with gradings and spin structures, its directed Fukaya category $\operatorname{Fuk}^{\to} W$ is an $A_\infty$-category such that
– the set of objects is the distinguished basis of vanishing cycles;
\[ \operatorname{Ob}(\operatorname{Fuk}^{\to} W) = (C_1, \ldots, C_N), \]
– the set of morphisms between $C_i$ and $C_j$ is the $\mathbb{Z}$-graded vector space
\[ \hom_{\operatorname{Fuk}^{\to} W}(C_i, C_j) = \begin{cases} 0 & i > j, \\ \mathbb{C} \cdot \mathrm{id}_{C_i} & i = j, \\ \bigoplus_{p \in C_i \cap C_j} \operatorname{span}_{\mathbb{C}}\{p\} & i < j, \end{cases} \]
where $\deg p = I(p)$ (the Maslov index), and
– for a positive integer $k$, a set of objects $(C_{i_0}, \ldots, C_{i_k})$ and morphisms $p_l \in C_{i_{l-1}} \cap C_{i_l}$ for $l = 1, \ldots, k$, the composition $m_k$ is given by
\[ m_k(p_1, \ldots, p_k) = \sum_{p_0 \in C_{i_0} \cap C_{i_k}} \# \overline{\mathcal{M}}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)\, p_0. \]

Here, $\overline{\mathcal{M}}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$ is the stable compactification of $\mathcal{M}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$ defined as follows: A disk with $k + 1$ marked points on the boundary is a pair $(D^2, (z_0, \ldots, z_k))$ of a closed unit disk $D^2 = \{z \in \mathbb{C} \mid |z| \le 1\}$ and an ordered set $(z_0, \ldots, z_k)$ of $k + 1$ points on the boundary respecting the cyclic order. Let $\partial_l D^2 \subset \partial D^2$ be the interval between $z_l$ and $z_{l+1}$, where we set $z_{k+1} = z_0$. Fix an almost complex structure $J$ on $W^{-1}(0)$. A smooth map $\varphi : D^2 \to W^{-1}(0)$ is called pseudoholomorphic if $d\varphi \circ J_{D^2} = J \circ d\varphi$, where $J_{D^2}$ is the canonical complex structure on $D^2$. $\mathcal{M}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$ is the moduli space of pairs $((D^2, (z_0, \ldots, z_k)), \varphi)$ such that
1. $(D^2, (z_0, \ldots, z_k))$ is a disk with $k + 1$ marked points on the boundary,
2. $\varphi : D^2 \to W^{-1}(0)$ is a pseudoholomorphic map,
3. $\varphi(\partial_l D^2) \subset C_{i_l}$ for $l = 0, \ldots, k$, and
4. $\varphi(z_l) = p_l$ for $l = 0, \ldots, k$.
Although $\overline{\mathcal{M}}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$ is not an honest manifold in general due to subtleties in the definition of the moduli space of pseudoholomorphic maps, it has a Kuranishi structure with corners and it makes sense to “count” numbers $\# \overline{\mathcal{M}}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$ of points. See [7] for details. These numbers are counted with signs determined by the orientation of the moduli space, which in turn is determined by the spin structures on vanishing cycles. $\# \overline{\mathcal{M}}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$ is zero if $\dim \overline{\mathcal{M}}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k) \neq 0$.

These moduli spaces must be compact so that it makes sense to count the numbers of points. To prove compactness of the moduli space, one needs the boundedness of the energy. Usually, this is achieved by introducing the Novikov ring as the coefficient ring of the Floer complex in order to control energies of pseudoholomorphic maps. This is not necessary in our case since vanishing cycles are exact Lagrangian submanifolds, i.e., $[\theta] = 0 \in H^1(C_i, \mathbb{R})$ for the primitive $\theta$ of $\omega$; $\omega = d\theta$. To see this, let $C$ be the vanishing cycle corresponding to a vanishing path $c : [0, 1] \to \mathbb{C}$. Then $C$ bounds the disk $D = \bigcup_{p \in C} \tilde{c}_p([0, 1])$; $\partial D = C$. Since $d\theta|_D = \omega|_D = 0$, $[\theta] = 0 \in H^1(\partial D, \mathbb{R})$. Therefore, there exists a function $K_i$ on each vanishing cycle such that $\theta|_{C_i} = dK_i$. Then for any $\varphi \in \mathcal{M}_{k+1}(C_{i_0}, \ldots, C_{i_k}; p_0, \ldots, p_k)$, the energy of $\varphi$ is
\[ E(\varphi) = \int_{D^2} \varphi^* \omega = \int_{\partial D^2} \varphi^* \theta = \sum_{l=0}^{k} \int_{\partial_l D^2} \varphi^* \theta = \sum_{l=0}^{k} \bigl( K_{i_l}(p_{l+1}) - K_{i_l}(p_l) \bigr) \]
and does not depend on the homotopy class of $\varphi$. This also implies the vanishing of $m_0$. Non-compactness of $W^{-1}(0)$ does not cause any problem either. Since $W^{-1}(0)$ is Stein, the maximum principle prevents pseudoholomorphic disks from running away to infinity, assuring the compactness of the moduli space.

Seidel proved that although the directed Fukaya category $\operatorname{Fuk}^{\to} W$ depends on the choice of the distinguished basis of vanishing cycles, different choices are related by mutations, hence its derived category is invariant.

Now we explain the mirror construction of toric Fano manifolds after Givental [8]. Given a fan of an $n$-dimensional toric Fano manifold, let $\{v_i\}_{i=1}^r$, $v_i = (v_{i1}, \ldots, v_{in}) \in \mathbb{Z}^n$, be the set of generators of its one-dimensional cones. Then the mirror object for this toric Fano manifold is the regular function
\[ W(x_1, \ldots, x_n) = \sum_{i=1}^{r} q_i\, x_1^{v_{i1}} \cdots x_n^{v_{in}} \tag{2} \]
on the algebraic torus $\operatorname{Spec} \mathbb{C}[x_i^{\pm 1}]_{i=1}^n$. Here, the $q_i$ are parameters corresponding to the deformation of symplectic structures on the toric Fano manifold. Therefore, the mirror of our $Y$ is the regular function
\[ W(x, y) = q_1 x + q_2 y + \frac{q_3}{xy} + \frac{q_4}{x} + \frac{q_5}{y} + q_6 xy \tag{3} \]
on the algebraic torus $(\mathbb{C}^\times)^2 = \operatorname{Spec} \mathbb{C}[x, x^{-1}, y, y^{-1}]$ (see Fig. 1). The Fukaya category does not depend on a general choice of the $q_i$. We have used $(q_1, q_2, q_3, q_4, q_5, q_6) = (1, 1, 1, 0.215, 0.25, 0.3)$ to draw the figures appearing below. Equip $(\mathbb{C}^\times)^2$ with the symplectic form
\[ \frac{d|x|}{|x|} \wedge d(\arg x) + \frac{d|y|}{|y|} \wedge d(\arg y) \]
and the relative Maslov map
\[ \left( \operatorname{Res} \frac{dx \wedge dy}{xy\, W(x, y)} \right)^{\otimes 2} = \left( \frac{dx}{\partial_y (xy\, W(x, y))} \right)^{\otimes 2}. \]
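As an illustrative cross-check, the passage from the fan of Y to the potential (3) can be written out in a few lines. The sketch below is an assumption-laden aid rather than part of the paper: it encodes (2) as a dictionary from exponent vectors to coefficients, uses the ray generators of Fig. 1 and the numerical values of the q_i quoted above, and introduces hypothetical helper names such as `mirror_potential` and `is_smooth_complete_fan`. It also computes twice the area of the Newton polygon, which for a sufficiently general Laurent polynomial in two variables is the critical-point count given by Kouchnirenko's theorem; here it equals 6 = dim H*(Y, C).

```python
import math

# Ray generators v_1, ..., v_6 of the fan of Y (Fig. 1).
RAYS = [(1, 0), (0, 1), (-1, -1), (-1, 0), (0, -1), (1, 1)]
# Coefficients q_1, ..., q_6 used for the figures in the text.
Q = [1, 1, 1, 0.215, 0.25, 0.3]


def mirror_potential(rays, q):
    """Formula (2): W = sum_i q_i x^{v_i}, stored as {exponent vector: coefficient}."""
    return {v: qi for v, qi in zip(rays, q)}


def evaluate(potential, x, y):
    """Evaluate the Laurent polynomial at a point (x, y) of the torus."""
    return sum(c * x ** a * y ** b for (a, b), c in potential.items())


def cyclic_order(rays):
    """Sort the rays counterclockwise by angle."""
    return sorted(rays, key=lambda v: math.atan2(v[1], v[0]))


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def is_smooth_complete_fan(rays):
    """Adjacent rays must form a Z-basis (det = 1) all the way around the origin."""
    ordered = cyclic_order(rays)
    n = len(ordered)
    return all(cross(ordered[i], ordered[(i + 1) % n]) == 1 for i in range(n))


def polygon_area(vertices):
    """Shoelace formula; vertices must be listed in cyclic order."""
    n = len(vertices)
    return abs(sum(cross(vertices[i], vertices[(i + 1) % n]) for i in range(n))) / 2


W = mirror_potential(RAYS, Q)
NORMALIZED_AREA = 2 * polygon_area(cyclic_order(RAYS))  # 2! * Area(Newton polygon)
```

With these conventions, `W[(-1, -1)]` is the coefficient q_3 of 1/(xy) and `W[(1, 1)]` is the coefficient q_6 of xy, matching (3) term by term.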
$W^{-1}(0)$ is an affine elliptic curve, which can be compactified by adding six points. We depict the critical values of $W$ and our choice of a distinguished set of vanishing paths in Fig. 2. Figure 3 shows the corresponding vanishing cycles. The opposite sides of the square in Fig. 3 are identified to form a two-dimensional torus. The open circles denote the points which are missing due to the non-compactness of $(\mathbb{C}^\times)^2$.

Fig. 2. Distinguished basis of vanishing paths

We have drawn Fig. 3 in the following way: First regard $W^{-1}(t)$ as a branched double cover of $\mathbb{C}^\times$ by the projection $\pi_t : W^{-1}(t) \ni (x, y) \mapsto y \in \mathbb{C}^\times$. For any $i = 1, \ldots, 6$, the branch points of $\pi_{c_i(t)}$ move in $\mathbb{C}^\times$ as we vary $t$, until two of them finally collide at $t = 1$. The vanishing cycle $C_i$ is the circle in $W^{-1}(0)$ over this trajectory of collision. This determines the vanishing cycle $C_i$ up to isotopy. These vanishing cycles can be straightened within their Hamiltonian isotopy classes. Here, being straight refers to the flat metric on the torus $\overline{W^{-1}(0)}$, where $\overline{\bullet}$ denotes the completion. Note that this flat metric on $\overline{W^{-1}(0)}$ has nothing to do with the Kähler metric on $W^{-1}(0)$ induced from that of $(\mathbb{C}^\times)^2$. This determines $C_i$ up to translation. This translational ambiguity can be fixed by imposing exactness, which requires the knowledge of the primitive $\theta$ of the symplectic form $\omega$ on $W^{-1}(0)$. We do not try to carry this out since translations of straight $C_i$'s do not alter the combinatorial structure of intersections, which is all we need in our computation of the Fukaya category below.

The grading on $W^{-1}(0)$ given by the relative Maslov map is the grading coming from the restriction of the second tensor power of the holomorphic 1-form on $W^{-1}(0)$. Then one can choose a grading $\tilde{\phi}_{C_i}$ on each vanishing cycle $C_i$ so that all the Maslov indices of $C_i \cap C_j$ are zero for $i < j$. To give a spin structure on $C_i$ is the same as to give a two-fold covering of $C_i$. We take the non-trivial cover such that two branches interchange on the black dots in Fig. 3. Different choices for spin structures lead to different categories, and it turns out that the above choice gives a category derived equivalent to the category of coherent sheaves on the del Pezzo surface $Y$. Let $C_i$ also denote the vanishing cycle $C_i$ with the above grading and spin structure. Since the Maslov indices of all the intersection points are zero, the sign for the counting of a triangle is $-1$ if the boundary of the triangle hits an odd number of these dots, and $+1$ otherwise (see Seidel [18]). Note that although the choice of spin structures is important, the choice of the positions of the black dots is irrelevant. The change of the signs caused by a change of the positions of the black dots can be absorbed by a redefinition of the signs of the basis for the Floer cohomologies. $m_k$ is non-zero only for $k = 2$ (polygons with more than three edges in Fig. 3 do not contribute since $\hom_{\operatorname{Fuk}^{\to} W}(C_i, C_j) = 0$ for $i > j$).

Theorem 3. There exists an isomorphism
\[ \phi_{ij} : \operatorname{Hom}_{D^b \operatorname{coh}(Y)}(\mathcal{E}_i, \mathcal{E}_j) \to \operatorname{Hom}_{\operatorname{Fuk}^{\to} W}(C_i, C_j) \]
as a $\mathbb{C}$-vector space for $i, j = 1, \ldots, 6$ such that the diagrams
\[ \begin{array}{ccc} \operatorname{Hom}_{D^b \operatorname{coh}(Y)}(\mathcal{E}_i, \mathcal{E}_j) \times \operatorname{Hom}_{D^b \operatorname{coh}(Y)}(\mathcal{E}_j, \mathcal{E}_k) & \longrightarrow & \operatorname{Hom}_{D^b \operatorname{coh}(Y)}(\mathcal{E}_i, \mathcal{E}_k) \\ \Big\downarrow {\scriptstyle \phi_{ij} \times \phi_{jk}} & & \Big\downarrow {\scriptstyle \phi_{ik}} \\ \operatorname{Hom}_{\operatorname{Fuk}^{\to} W}(C_i, C_j) \times \operatorname{Hom}_{\operatorname{Fuk}^{\to} W}(C_j, C_k) & \longrightarrow & \operatorname{Hom}_{\operatorname{Fuk}^{\to} W}(C_i, C_k) \end{array} \tag{4} \]
commute.

Fig. 3. Vanishing cycles in the fiber at the origin

Proof. We omit the lower suffix of $\operatorname{Hom}(\bullet, \bullet)$ since there is no danger of confusion. We construct the above $\phi_{ij}$'s explicitly. Since $m_1 = 0$, $\operatorname{Hom}(C_i, C_j)$ is isomorphic to $\hom(C_i, C_j)$ and spanned by intersection points of $C_i$ and $C_j$, to which we assign the basis of $\operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_j)$ as in Fig. 3. The notation for the bases of $\operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_j)$ is given in the Appendix. This defines the linear isomorphisms $\phi_{ij}$. The commutativity of (4) is verified by counting triangles. Let us illustrate this with a few examples.

Take $x_1 \in \operatorname{Hom}(\mathcal{E}_1, \mathcal{E}_4)$ and $x_1^\vee \in \operatorname{Hom}(\mathcal{E}_4, \mathcal{E}_6)$, whose composition is $e_1 \in \operatorname{Hom}(\mathcal{E}_1, \mathcal{E}_6)$. On the Fukaya side, we count the number of triangles whose edges are contained in $C_1 \cup C_4 \cup C_6$ and two of whose vertices are $x_1$ and $x_1^\vee$. Such a triangle exists uniquely, and its remaining vertex is $e_1$ (Fig. 4).

Next we try to compose $x_3 \in \operatorname{Hom}(C_3, C_4)$ and $x_1^\vee \in \operatorname{Hom}(C_4, C_6)$ in the Fukaya category. The only possibility would be the triangle with $e_3$ as the remaining vertex. However, this is not allowed because of the missing point (Fig. 5). This shows that the composition of $x_3$ and $x_1^\vee$ in the Fukaya category is zero, which matches the table in the Appendix calculated in $D^b \operatorname{coh}(Y)$.

Finally, as an example of counting with signs, we compute
\[ \operatorname{Hom}(C_1, C_4) \times \operatorname{Hom}(C_4, C_5) \ni (x_1, x_1^\vee \wedge x_2^\vee) \mapsto -x_2^\vee \in \operatorname{Hom}(C_1, C_5). \]
Fig. 4. Counting triangle

Fig. 5. No triangles

Fig. 6. Counting triangles with signs
Indeed, the triangle whose vertices are $x_1$, $x_1^\vee \wedge x_2^\vee$, and $x_2^\vee$ contains one black dot on its edge (Fig. 6). We can do a similar analysis for the rest of the triangles. The result perfectly agrees with the computations in $D^b \operatorname{coh}(Y)$.

Although the derived category of an $A_\infty$-category is usually defined using twisted complexes, we adopt the following definition in this paper since our $\operatorname{Fuk}^{\to} W$ satisfies $\hom^k_{\operatorname{Fuk}^{\to} W}(C_i, C_j) = 0$ for $k \neq 0$ and $m_k = 0$ for $k \neq 2$.
Definition 4. Let $A = \bigoplus_{i,j} \operatorname{Hom}_{\operatorname{Fuk}^{\to} W}(C_i, C_j)$ be the total morphism algebra. The derived Fukaya category $D^b \operatorname{Fuk}^{\to} W$ is the bounded derived category $D^b(\text{mod-}A)$ of the category of finite-dimensional right modules over the algebra $A$.

Now we can state our main theorem:

Theorem 4. There exists an equivalence of triangulated categories
\[ D^b \operatorname{coh}(Y) \cong D^b \operatorname{Fuk}^{\to} W. \]

Proof. From Theorem 3, we have
\[ A \cong \bigoplus_{i,j} \operatorname{Hom}_{D^b \operatorname{coh}(Y)}(\mathcal{E}_i, \mathcal{E}_j). \]
Theorem 4 follows immediately from the theorem of Bondal [5] that
\[ D^b \operatorname{coh}(Y) \cong D^b\Bigl(\text{mod-}\bigl(\textstyle\bigoplus_{i,j} \operatorname{Hom}(\mathcal{E}_i, \mathcal{E}_j)\bigr)\Bigr). \]

We can perform a similar analysis for all the other toric del Pezzo surfaces to obtain Theorem 1.

Acknowledgement. We thank O. Fujino, K. Fukaya, H. Iritani, T. Kawai, K. Saito and A. Takahashi for valuable discussions and comments.
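4. Appendix

Here we give the table of compositions of morphisms in the derived category of coherent sheaves on the projective plane P(V) blown up at p1 = [1 : 0 : 0], p2 = [0 : 1 : 0], p3 = [0 : 0 : 1]. We use the following notation: the Ei are defined in (1), $\{x_i\}_{i=1}^3$ is the basis of V, $\{x_i^\vee\}_{i=1}^3$ is the dual basis of V∨, and ei is the generator of Hom(Ei, E6) = C for i = 1, 2, 3. In each table, rows are labelled by basis elements of the first factor and columns by basis elements of the second factor.

Hom(E1, E4) × Hom(E4, E5) → Hom(E1, E5):
$$\begin{array}{c|ccc} & x_1^\vee\wedge x_2^\vee & x_2^\vee\wedge x_3^\vee & x_3^\vee\wedge x_1^\vee\\ \hline x_1 & -x_2^\vee & 0 & x_3^\vee \end{array}$$

Hom(E1, E4) × Hom(E4, E6) → Hom(E1, E6):
$$\begin{array}{c|ccc} & x_1^\vee & x_2^\vee & x_3^\vee\\ \hline x_1 & e_1 & 0 & 0 \end{array}$$

Hom(E1, E5) × Hom(E5, E6) → Hom(E1, E6):
$$\begin{array}{c|ccc} & x_1 & x_2 & x_3\\ \hline x_2^\vee & 0 & e_1 & 0\\ x_3^\vee & 0 & 0 & e_1 \end{array}$$

Hom(E2, E4) × Hom(E4, E5) → Hom(E2, E5):
$$\begin{array}{c|ccc} & x_1^\vee\wedge x_2^\vee & x_2^\vee\wedge x_3^\vee & x_3^\vee\wedge x_1^\vee\\ \hline x_2 & x_1^\vee & -x_3^\vee & 0 \end{array}$$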
Hom(E2, E4) × Hom(E4, E6) → Hom(E2, E6):
$$\begin{array}{c|ccc} & x_1^\vee & x_2^\vee & x_3^\vee\\ \hline x_2 & 0 & e_2 & 0 \end{array}$$

Hom(E2, E5) × Hom(E5, E6) → Hom(E2, E6):
$$\begin{array}{c|ccc} & x_1 & x_2 & x_3\\ \hline x_1^\vee & e_2 & 0 & 0\\ x_3^\vee & 0 & 0 & e_2 \end{array}$$

Hom(E3, E4) × Hom(E4, E5) → Hom(E3, E5):
$$\begin{array}{c|ccc} & x_1^\vee\wedge x_2^\vee & x_2^\vee\wedge x_3^\vee & x_3^\vee\wedge x_1^\vee\\ \hline x_3 & 0 & x_2^\vee & -x_1^\vee \end{array}$$

Hom(E3, E4) × Hom(E4, E6) → Hom(E3, E6):
$$\begin{array}{c|ccc} & x_1^\vee & x_2^\vee & x_3^\vee\\ \hline x_3 & 0 & 0 & e_3 \end{array}$$

Hom(E3, E5) × Hom(E5, E6) → Hom(E3, E6):
$$\begin{array}{c|ccc} & x_1 & x_2 & x_3\\ \hline x_1^\vee & e_3 & 0 & 0\\ x_2^\vee & 0 & e_3 & 0 \end{array}$$

Hom(E4, E5) × Hom(E5, E6) → Hom(E4, E6):
$$\begin{array}{c|ccc} & x_1 & x_2 & x_3\\ \hline x_1^\vee\wedge x_2^\vee & x_2^\vee & -x_1^\vee & 0\\ x_2^\vee\wedge x_3^\vee & 0 & x_3^\vee & -x_2^\vee\\ x_3^\vee\wedge x_1^\vee & -x_3^\vee & 0 & x_1^\vee \end{array}$$

References
1. Auroux, D., Katzarkov, L., Orlov, D.: Mirror symmetry for weighted projective planes and their noncommutative deformations. http://arxiv.org/list/math.AG/0404281, 2004
2. Auroux, D., Katzarkov, L., Orlov, D.: Mirror symmetry for Del Pezzo surfaces: Vanishing cycles and coherent sheaves. http://arxiv.org/list/math.AG/0506166, 2005
3. Batyrev, V.: Quantum cohomology rings of toric manifolds. Astérisque 218, 9, 34 (1993)
4. Beilinson, A.: Coherent sheaves on Pn and problems in linear algebra. Funkts. Anal. i. Prilozhen. 12, no. 3, 68–69 (1978)
5. Bondal, A.: Representation of associative algebras and coherent sheaves. Izv. Ross. Akad. Nauk Ser. Mat. 53, no. 1, 25–44 (1989)
6. Eguchi, T., Hori, K., Xiong, C.-S.: Gravitational quantum cohomology. Int. J. Mod. Phys. A 12, no. 9, 1743–1782 (1997)
7. Fukaya, K., Oh, Y.-G., Ohta, H., Ono, K.: Lagrangian intersection Floer theory. Preprint, available at http://www.math.kyoto-u.ac.jp/~fukaya, 2000
8. Givental, A.: Homological geometry and mirror symmetry. In: Proceedings of the International Congress of Mathematicians. Proceedings, Zurich 1994. Basel: Birkhäuser, 1994, pp. 472–480
9. Hori, K., Iqbal, A., Vafa, C.: D-Branes and mirror symmetry. http://arxiv.org/list/hep-th/0005247, 2000
10. Hori, K., Vafa, C.: Mirror symmetry. http://arxiv.org/list/hep-th/0002222, 2000
11. Kontsevich, M.: Homological algebra of mirror symmetry. In: Proceedings of the International Congress of Mathematicians. Proceedings, Zurich 1994. Basel: Birkhäuser, 1994, pp. 120–139
12. Kouchnirenko, A.: Polyèdres de Newton et nombres de Milnor. Invent. Math. 32, no. 1, 1–31 (1976)
13. Orlov, D.: Projective bundles, monoidal transformations and derived categories of coherent sheaves. Izv. Ross. Akad. Nauk Ser. Mat. 56, no. 4, 852–862 (1992)
14. Robbin, J., Salamon, D.: Maslov index for path. Topology 32, no. 4, 827–844 (1993)
15. Seidel, P.: Graded Lagrangian submanifolds. Bull. Soc. Math. France 128, no. 1, 103–149 (2000)
16. Seidel, P.: Vanishing cycles and mutations. In: European Congress of Mathematics, Vol. II. Proceedings, Barcelona 2000. Basel: Birkhäuser, 2001, pp. 65–85
17. Seidel, P.: More about vanishing cycles and mutations. In: Symplectic geometry and mirror symmetry. Proceedings, Seoul 2000. Singapore: World Sci. Publishing, 2001, pp. 429–465
18. Seidel, P.: Homological mirror symmetry for the quartic surface. http://arxiv.org/list/math.AG/0310414 (2003)
19. Witten, E.: Phases of N = 2 theories in two dimensions. Nucl. Phys. B 403, 159–222 (1993)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 264, 87–114 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1513-4
Communications in
Mathematical Physics
Drinfeld Twists and Algebraic Bethe Ansatz of the Supersymmetric Model Associated with $U_q(gl(m|n))$
Wen-Li Yang (1,2), Yao-Zhong Zhang (2), Shao-You Zhao (2,3)
1 Institute of Modern Physics, Northwest University, Xian 710069, P.R. China
2 Department of Mathematics, University of Queensland, Brisbane, QLD 4072, Australia
3 Department of Physics, Beijing Institute of Technology, Beijing 100081, China
Received: 1 March 2005 / Accepted: 22 August 2005 Published online: 31 January 2006 – © Springer-Verlag 2006
Abstract: We construct the Drinfeld twists (or factorizing F-matrices) of the supersymmetric model associated with the quantum superalgebra $U_q(gl(m|n))$, and obtain the completely symmetric representations of the creation operators of the model in the F-basis provided by the F-matrix. As an application of our general results, we present the explicit expressions of the Bethe vectors in the F-basis for the $U_q(gl(2|1))$-model (the quantum t-J model).

1. Introduction

It was realized in [1] that for the XXX or XXZ spin chain systems, there exists a non-degenerate lower-triangular F-matrix (the Drinfeld twist) [2] in terms of which the R-matrix of the system is factorized:
$$R_{12}(u_1,u_2) = F_{21}^{-1}(u_2,u_1)\,F_{12}(u_1,u_2), \tag{1.1}$$
where the R-matrix acts on the tensor space V ⊗ V with V being a 2-dimensional Uq (gl(2))-module. In the basis provided by the N-site F -matrix, i.e. the so-called F -basis, the entries of the monodromy matrices of the models appear in completely symmetric forms. As a result the Bethe vectors of the models are dramatically simplified and can be written down explicitly. These results enabled the authors in [3, 4] to compute form factors, correlation functions [5] and spontaneous magnetizations of the systems analytically and explicitly. The results of [1] were generalized to other systems. In [6], the Drinfeld twists associated with any finite-dimensional irreducible representations of Yangian Y [gl(2)] were investigated. In [7], Albert et al constructed the F -matrix of the rational gl(m) Heisenberg model, obtained a polarization free representation of its creation operators and resolved the hierarchy of its nested Bethe vectors. In [8, 9], the Drinfeld twists of the elliptic XYZ and Belavin models were constructed. Recently we have successfully constructed the Drinfeld twists for the rational gl(m|n) supersymmetric model and resolved
the hierarchy of its nested Bethe vectors in the F-basis [10, 11]. Quantum integrable models associated with Lie superalgebras [12–14] are physically important because they give strongly correlated fermion models of superconductivity (e.g. [15, 16]).

In this paper, we extend our results in [10, 11] to the quantum (or q-deformed) supersymmetric model associated with the quantum superalgebra $U_q(gl(m|n))$ (including the quantum supersymmetric t-J model as a special case). Such a generalization is non-trivial for the following reason. It is well known that the gl(m|n) rational model has gl(m|n) symmetry, which enables one to express the creation operators $C_i(u)$ in terms of the element $T_{m+n,m+n}(u)$ of the monodromy matrix T(u) and the generators of gl(m|n) by (anti)commutation relations [7, 11]. However, the corresponding quantum model is not $U_q(gl(m|n))$ invariant (unless appropriate boundary conditions are imposed). One of the consequences is that the creation operators $C_i(u)$ of the quantum model cannot be expressed in terms of $T_{m+n,m+n}(u)$ and the generators of $U_q(gl(m|n))$ by simple q-(anti)commutation relations. Indeed, it is found in this paper that extra quantum correction terms are needed, due to the non-trivial coproduct structure of the quantum superalgebra. Having found such a new recursive relation (3.37) and constructed the factorizing F-matrices of the quantum model, we obtain the symmetric representations of the creation operators of the monodromy matrix in the F-basis. These results make possible a complete resolution of the hierarchy of the nested Bethe vectors of the $U_q(gl(m|n))$ model. As an example, we give the explicit expressions of the Bethe vectors of the quantum t-J model associated with $U_q(gl(2|1))$.

The present paper is organized as follows. In Sect. 2, we introduce some basic notation on the quantum superalgebra $U_q(gl(m|n))$. In Sect. 3, we derive the recursive relation between the elements of the monodromy matrix and the generators of $U_q(gl(m|n))$. In Sect. 4, we construct the F-matrix of the $U_q(gl(m|n))$ model and its inverse. In Sect. 5, we obtain the symmetric representations of the creation operators in the F-basis. As an application of our general results, the hierarchy of the nested Bethe vectors of the $U_q(gl(2|1))$ model is resolved in Sect. 6. We conclude the paper with some discussion in Sect. 7. Some detailed technical derivations are given in Appendices A-B.

2. Quantum Superalgebra $U_q(gl(m|n))$

Let us fix two non-negative integers n, m such that n + m ≥ 2, a positive integer N (N ≥ 2), and a generic complex number η such that the q-deformation parameter, defined by $q = e^{\eta}$, is not a root of unity. Let V be a $Z_2$-graded (n + m)-dimensional vector space with the orthonormal basis $\{|i\rangle,\ i = 1,\ldots,n+m\}$. The $Z_2$-grading is chosen as $[1] = \cdots = [m] = 1$, $[m+1] = \cdots = [m+n] = 0$.

Definition 1. The quantum superalgebra $U_q(gl(m|n))$ is a $Z_2$-graded unital associative superalgebra generated by the generators $E^{i,i}$ ($i = 1,\ldots,n+m$) and $E^{j,j+1}$, $E^{j+1,j}$ ($j = 1,\ldots,n+m-1$) with the $Z_2$-grading $[E^{i,i}] = 0$, $[E^{j+1,j}] = [E^{j,j+1}] = [j] + [j+1]$, subject to the relations:
$$[E^{i,i}, E^{i',i'}] = 0, \qquad i, i' = 1,\ldots,n+m, \tag{2.1}$$
$$[E^{i,i}, E^{j,j+1}] = (\delta_{i,j}-\delta_{i,j+1})\,E^{j,j+1}, \qquad [E^{i,i}, E^{j+1,j}] = (\delta_{i,j+1}-\delta_{i,j})\,E^{j+1,j}, \tag{2.2}$$
$$[E^{j,j+1}, E^{j'+1,j'}] = (-1)^{[j]}\,\delta_{j,j'}\,\frac{q^{h^j}-q^{-h^j}}{q-q^{-1}}, \qquad j, j' = 1,\ldots,n+m-1, \tag{2.3}$$
and the Serre relations:
$$\begin{gathered}
(E^{m,m+1})^2 = (E^{m+1,m})^2 = 0,\\
[E^{j,j+1}, E^{j',j'+1}] = [E^{j+1,j}, E^{j'+1,j'}] = 0, \qquad |j-j'| \ge 2,\\
(E^{j,j+1})^2 E^{j\pm1,j\pm1+1} - (q+q^{-1})\,E^{j,j+1}E^{j\pm1,j\pm1+1}E^{j,j+1} + E^{j\pm1,j\pm1+1}(E^{j,j+1})^2 = 0, \qquad j \ne m,\\
(E^{j+1,j})^2 E^{j\pm1+1,j\pm1} - (q+q^{-1})\,E^{j+1,j}E^{j\pm1+1,j\pm1}E^{j+1,j} + E^{j\pm1+1,j\pm1}(E^{j+1,j})^2 = 0, \qquad j \ne m,
\end{gathered} \tag{2.4}$$
where $h^j = (-1)^{[j]}E^{j,j} - (-1)^{[j+1]}E^{j+1,j+1}$. In addition to the above Serre relations, there also exist extra Serre relations [17], which we omit. Here and throughout, we adopt the convention $[x, y] = xy - (-1)^{[x][y]}yx$ for $x, y \in U_q(gl(m|n))$. One can easily see that the $Z_2$-graded vector space V supplies the fundamental $U_q(gl(m|n))$-module, and the generators of $U_q(gl(m|n))$ are represented in this space by
$$\pi(E^{i,i}) = e_{i,i}, \qquad \pi(E^{j,j+1}) = e_{j,j+1}, \qquad \pi(E^{j+1,j}) = e_{j+1,j}, \tag{2.5}$$
where $e_{i,j} \in \mathrm{End}(V)$ is the elementary matrix with elements $(e_{i,j})_{lk} = \delta_{jk}\delta_{il}$. $U_q(gl(m|n))$ is a $Z_2$-graded triangular Hopf superalgebra endowed with $Z_2$-graded algebra homomorphisms, namely the coproduct $\Delta: U_q(gl(m|n)) \to U_q(gl(m|n)) \otimes U_q(gl(m|n))$ defined by
$$\Delta(E^{i,i}) = 1 \otimes E^{i,i} + E^{i,i} \otimes 1, \qquad i = 1,\ldots,n+m, \tag{2.6}$$
$$\Delta(E^{j,j+1}) = 1 \otimes E^{j,j+1} + E^{j,j+1} \otimes q^{h^j}, \tag{2.7}$$
$$\Delta(E^{j+1,j}) = q^{-h^j} \otimes E^{j+1,j} + E^{j+1,j} \otimes 1, \tag{2.8}$$
and the counit $\epsilon: U_q(gl(m|n)) \to \mathbb{C}$ defined by
$$\epsilon(E^{j,j+1}) = \epsilon(E^{j+1,j}) = \epsilon(E^{i,i}) = 0, \qquad \epsilon(1) = 1,$$
and a $Z_2$-graded algebra antiautomorphism (antipode) $S: U_q(gl(m|n)) \to U_q(gl(m|n))$ given by
$$S(E^{j,j+1}) = -E^{j,j+1}q^{-h^j}, \qquad S(E^{j+1,j}) = -q^{h^j}E^{j+1,j}, \qquad S(E^{i,i}) = -E^{i,i}.$$
Multiplication of tensor products is $Z_2$-graded:
$$(x \otimes y)(x' \otimes y') = (-1)^{[y][x']}\,xx' \otimes yy',$$
for homogeneous elements $x, y, x', y' \in U_q(gl(m|n))$, where $[x] \in Z_2$ denotes the grading of x. It should be pointed out that the antipode satisfies the following equation, for homogeneous elements $x, y \in U_q(gl(m|n))$,
$$S(xy) = (-1)^{[x][y]}S(y)S(x),$$
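and generalizes to inhomogeneous elements through linearity. The coproduct, counit and antipode satisfy the following relations for all $x \in U_q(gl(m|n))$:
$$(\Delta \otimes \mathrm{id})\Delta(x) = (\mathrm{id} \otimes \Delta)\Delta(x), \qquad (\epsilon \otimes \mathrm{id})\Delta(x) = x = (\mathrm{id} \otimes \epsilon)\Delta(x), \qquad m(S \otimes \mathrm{id})\Delta(x) = m(\mathrm{id} \otimes S)\Delta(x) = \epsilon(x), \tag{2.9}$$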
where m denotes the product of any two elements of $U_q(gl(m|n))$, i.e., $m(x \otimes y) = xy$ for $x, y \in U_q(gl(m|n))$. The generators $\{E^{j,j+1}\}$ ($\{E^{j+1,j}\}$) are the simple raising (lowering) generators of $U_q(gl(m|n))$ associated with the simple roots. Thanks to the Serre relations, the other generators associated with the non-simple roots (called the non-simple generators) can be uniquely constructed from the simple ones by the following relations:
$$E^{\alpha,\gamma} = E^{\alpha,\beta}E^{\beta,\gamma} - q^{-(-1)^{[\beta]}}E^{\beta,\gamma}E^{\alpha,\beta}, \qquad 1 \le \alpha < \beta < \gamma \le n+m, \tag{2.10}$$
$$E^{\gamma,\alpha} = E^{\gamma,\beta}E^{\beta,\alpha} - q^{(-1)^{[\beta]}}E^{\beta,\alpha}E^{\gamma,\beta}, \qquad 1 \le \alpha < \beta < \gamma \le n+m. \tag{2.11}$$
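The coproduct, counit and antipode of the non-simple generators can be obtained from those of the simple ones. Here, we give the coproduct of the non-simple generators, which will be used later.

Lemma 1. The coproduct of the non-simple generators is
$$\Delta(E^{\gamma,\gamma-l}) = q^{-\sum_{k=1}^{l}h^{\gamma-k}} \otimes E^{\gamma,\gamma-l} + E^{\gamma,\gamma-l} \otimes 1 + \sum_{i=1}^{l-1}\bigl(1-q^{2(-1)^{[\gamma-l+i]}}\bigr)\,q^{-\sum_{k=1}^{l-i}h^{\gamma-k}}E^{\gamma-l+i,\gamma-l} \otimes E^{\gamma,\gamma-l+i}, \qquad \gamma-l \ge 1 \ \text{and}\ l \ge 2, \tag{2.12}$$
$$\Delta(E^{\gamma,\gamma+l}) = 1 \otimes E^{\gamma,\gamma+l} + E^{\gamma,\gamma+l} \otimes q^{\sum_{k=0}^{l-1}h^{\gamma+k}} + \sum_{i=1}^{l-1}\bigl(1-q^{-2(-1)^{[\gamma+i]}}\bigr)\,E^{\gamma+i,\gamma+l} \otimes E^{\gamma,\gamma+i}\,q^{\sum_{k=i}^{l-1}h^{\gamma+k}}, \qquad \gamma+l \le n+m \ \text{and}\ l \ge 2. \tag{2.13}$$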
Proof. This lemma can be proved by induction using the coproducts of the simple generators (2.6)-(2.8), the definitions of the non-simple generators (2.10)-(2.11) and the fact that the coproduct is an algebra homomorphism, as well as (2.1)-(2.3) and the Serre relation (2.4).
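As a small sanity check on the conventions above, the following sketch verifies numerically that the elementary matrices of (2.5) satisfy the graded relations (2.2) and (2.3) in the fundamental representation. It assumes the grading $[i] = 1$ for $i \le m$ and $[i] = 0$ otherwise, and it uses the observation that $h^j$ has eigenvalues in $\{-1, 0, 1\}$ on V, so that $(q^{h^j}-q^{-h^j})/(q-q^{-1})$ reduces to $h^j$ there. All function names are hypothetical helpers introduced for illustration; the Serre relations are not checked.

```python
from itertools import product


def grade(i, m):
    """Z2-grading [i] of the basis label i (1-based): 1 for i <= m, else 0."""
    return 1 if i <= m else 0


def unit(i, j, d):
    """Elementary d x d matrix e_{i,j} (1-based labels) as a list of rows."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(d)] for r in range(d)]


def matmul(a, b):
    d = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(d)) for c in range(d)] for r in range(d)]


def lincomb(ca, a, cb, b):
    """Return ca*a + cb*b entrywise."""
    d = len(a)
    return [[ca * a[r][c] + cb * b[r][c] for c in range(d)] for r in range(d)]


def supercommutator(x, gx, y, gy):
    """[x, y] = xy - (-1)^{[x][y]} yx for homogeneous x, y with grades gx, gy."""
    return lincomb(1, matmul(x, y), -((-1) ** (gx * gy)), matmul(y, x))


def cartan(j, m, d):
    """h^j = (-1)^{[j]} e_{j,j} - (-1)^{[j+1]} e_{j+1,j+1} on V."""
    return lincomb((-1) ** grade(j, m), unit(j, j, d),
                   -((-1) ** grade(j + 1, m)), unit(j + 1, j + 1, d))


def check_2_2(m, n):
    """Check relation (2.2) for all i and j in the fundamental representation."""
    d = m + n
    zero = [[0] * d for _ in range(d)]
    for i, j in product(range(1, d + 1), range(1, d)):
        up, down = unit(j, j + 1, d), unit(j + 1, j, d)
        g = (grade(j, m) + grade(j + 1, m)) % 2
        coeff = (1 if i == j else 0) - (1 if i == j + 1 else 0)
        if supercommutator(unit(i, i, d), 0, up, g) != lincomb(coeff, up, 0, zero):
            return False
        if supercommutator(unit(i, i, d), 0, down, g) != lincomb(-coeff, down, 0, zero):
            return False
    return True


def check_2_3(m, n):
    """Check relation (2.3); on V the q-number of h^j equals h^j itself."""
    d = m + n
    for j, jp in product(range(1, d), repeat=2):
        gj = (grade(j, m) + grade(j + 1, m)) % 2
        gjp = (grade(jp, m) + grade(jp + 1, m)) % 2
        lhs = supercommutator(unit(j, j + 1, d), gj, unit(jp + 1, jp, d), gjp)
        h = cartan(j, m, d)
        sign = (-1) ** grade(j, m) if j == jp else 0
        if lhs != lincomb(sign, h, 0, h):
            return False
    return True
```

For instance, `check_2_2(2, 1)` and `check_2_3(2, 1)` exercise the $U_q(gl(2|1))$ case relevant to the t-J model.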
3. Recursive Relation between Monodromy Matrix Elements and $U_q(gl(m|n))$ Generators

Let $R \in \mathrm{End}(V \otimes V)$ be the R-matrix associated with the fundamental $U_q(gl(m|n))$-module V. The R-matrix depends on the difference of two spectral parameters $u_1$ and $u_2$ associated with two copies of V, and is, in the present grading, given by [13, 14, 18]
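$$\begin{aligned}
R_{12}(u_1,u_2) &= R_{12}(u_1-u_2)\\
&= c_{12}\sum_{i=1}^{m} e_{i,i} \otimes e_{i,i} + \sum_{i=m+1}^{m+n} e_{i,i} \otimes e_{i,i} + a_{12}\sum_{i \ne j=1}^{m+n} e_{i,i} \otimes e_{j,j}\\
&\quad + b^{-}_{12}\sum_{i>j=1}^{m+n}(-1)^{[j]}e_{i,j} \otimes e_{j,i} + b^{+}_{12}\sum_{j>i=1}^{m+n}(-1)^{[j]}e_{i,j} \otimes e_{j,i},
\end{aligned} \tag{3.1}$$
where
$$a_{12} = a(u_1,u_2) \equiv \frac{\sinh(u_1-u_2)}{\sinh(u_1-u_2+\eta)}, \qquad b^{\pm}_{12} = b^{\pm}(u_1,u_2) \equiv \frac{e^{\pm(u_1-u_2)}\sinh\eta}{\sinh(u_1-u_2+\eta)}, \tag{3.2}$$
$$c_{12} = c(u_1,u_2) \equiv \frac{\sinh(u_1-u_2-\eta)}{\sinh(u_1-u_2+\eta)}, \tag{3.3}$$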
and η is the so-called crossing parameter. One can easily check that the R-matrix satisfies the unitarity relation
$$R_{21}R_{12} = 1. \tag{3.4}$$
Let us introduce the (N+1)-fold tensor product space $V \otimes V \otimes \cdots \otimes V$ (N+1 factors), whose components are labelled by 0, 1, ..., N from left to right. As usual, the 0th space, denoted by $V_0$ ($V_i$ for the i-th space), corresponds to the auxiliary space, and the other N spaces constitute the quantum space $H = V \otimes V \otimes \cdots \otimes V$ (N factors). Moreover, with each factor space $V_i$, $i = 0,\ldots,N$, we associate a complex parameter $z_i$. The parameter associated with the 0th space is usually called the spectral parameter, which is set to $z_0 = u$ in this paper, and the other parameters are called the inhomogeneous parameters. In this paper we always assume that all the complex parameters u and $\{z_i \mid i = 1,\ldots,N\}$ are generic. Hereafter we adopt the standard notation: for any matrix $A \in \mathrm{End}(V)$, $A_j$ (or $A_{(j)}$) is an embedding operator in the tensor product space, which acts as A on the j-th space and as the identity on the other factor spaces; $R_{ij} = R_{ij}(z_i,z_j)$ is an embedding operator of the R-matrix in the tensor product space, which acts as the identity on the factor spaces except for the i-th and j-th ones. The R-matrix satisfies the graded Yang-Baxter equation (GYBE)
$$R_{12}R_{13}R_{23} = R_{23}R_{13}R_{12}. \tag{3.5}$$
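The unitarity relation (3.4) rests on elementary identities among the weight functions (3.2) and (3.3). The sketch below checks a few such scalar identities numerically, writing $u = u_1 - u_2$ so that exchanging the two spaces sends $u \to -u$. It is only an illustration: which identity governs which matrix block depends on the grading and permutation conventions, and the full matrix relation is not checked here. The function names and the sample values of u and η are arbitrary choices.

```python
import cmath


def a(u, eta):
    """Weight a(u) = sinh(u) / sinh(u + eta) from (3.2)."""
    return cmath.sinh(u) / cmath.sinh(u + eta)


def b(u, eta, sign):
    """Weights b^{+}(u) (sign=+1) and b^{-}(u) (sign=-1) from (3.2)."""
    return cmath.exp(sign * u) * cmath.sinh(eta) / cmath.sinh(u + eta)


def c(u, eta):
    """Weight c(u) = sinh(u - eta) / sinh(u + eta) from (3.3)."""
    return cmath.sinh(u - eta) / cmath.sinh(u + eta)


def unitarity_residuals(u, eta):
    """Absolute residuals of four scalar identities; all should be near zero."""
    return (
        abs(c(u, eta) * c(-u, eta) - 1),
        abs(a(u, eta) * a(-u, eta) + b(u, eta, +1) * b(-u, eta, +1) - 1),
        abs(a(u, eta) * a(-u, eta) + b(u, eta, -1) * b(-u, eta, -1) - 1),
        abs(a(u, eta) * b(-u, eta, +1) + b(u, eta, -1) * a(-u, eta)),
    )


# Generic complex sample point (arbitrary illustrative values).
RESIDUALS = unitarity_residuals(0.37 + 0.21j, 0.9 - 0.4j)
```

The first identity, $c(u)c(-u) = 1$, is the relation $c_{21}c_{12} = 1$ used later in the construction of the F-matrix.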
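In terms of the matrix elements defined by
$$R(u)\bigl(|i'\rangle \otimes |j'\rangle\bigr) = \sum_{i,j} R(u)^{ij}_{i'j'}\bigl(|i\rangle \otimes |j\rangle\bigr),$$
the GYBE reads
$$\sum_{i',j',k'} R(u_1-u_2)^{i'j'}_{ij}\,R(u_1-u_3)^{i''k'}_{i'k}\,R(u_2-u_3)^{j''k''}_{j'k'}\,(-1)^{[j']([i']+[i''])} = \sum_{i',j',k'} R(u_2-u_3)^{j'k'}_{jk}\,R(u_1-u_3)^{i'k''}_{ik'}\,R(u_1-u_2)^{i''j''}_{i'j'}\,(-1)^{[j']([i]+[i'])}.$$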
Besides the GYBE, the R-matrix satisfies the following relation:
$$R_{12}\,\Delta(x) = P_{12}\,\Delta(x)\,P_{12}^{-1}\,R_{12}, \tag{3.6}$$
where $P_{12}$ is the superpermutation operator, i.e., $P_{12}(|i\rangle \otimes |j\rangle) = (-1)^{[i][j]}|j\rangle \otimes |i\rangle$. Using the coproduct structure of $U_q(gl(m|n))$, one can define the action of $U_q(gl(m|n))$ on the (N+1)-fold tensor product space. For any $x \in U_q(gl(m|n))$, let us denote the action of x on the (N+1)-fold tensor product space by $(x)_{0\ldots N}$:
$$(x)_{0\ldots N} = \Delta^{(N)}(x) = (\mathrm{id} \otimes \Delta^{(N-1)})\Delta(x). \tag{3.7}$$
By a straightforward calculation, one has

Lemma 2.
$$(E^{i,i})_{0\ldots N} = \sum_{k=0}^{N} E^{i,i}_{(k)}, \qquad i = 1,\ldots,n+m, \tag{3.8}$$
$$(E^{j,j+1})_{0\ldots N} = \sum_{k=0}^{N} E^{j,j+1}_{(k)}\,q^{\sum_{i=k+1}^{N}h^{j}_{(i)}}, \qquad j = 1,\ldots,n+m-1, \tag{3.9}$$
$$(E^{j+1,j})_{0\ldots N} = \sum_{k=0}^{N} q^{-\sum_{i=0}^{k-1}h^{j}_{(i)}}\,E^{j+1,j}_{(k)}, \qquad j = 1,\ldots,n+m-1, \tag{3.10}$$
where $E^{i,j}_{(k)}$ is the embedding of $e_{i,j}$ in the tensor product space, which acts as $e_{i,j}$ on the k-th space and as the identity on the other factor spaces.

The actions of the non-simple generators can be obtained from those of the simple ones through (2.10) and (2.11). Let $S_{N+1}$ denote the permutation group of the N + 1 space labels (0, ..., N). The GYBE (3.5) and the unitarity relation (3.4) of the R-matrix allow one to introduce the following mapping.

Definition 2. One can define a mapping from $S_{N+1}$ to $\mathrm{End}(V_0 \otimes H)$ which associates in a unique way an element $R^{\sigma}_{0\ldots N} \in \mathrm{End}(V_0 \otimes H)$ to any element σ of the permutation group $S_{N+1}$. The mapping has the following composition law:
$$R^{\sigma\sigma'}_{0\ldots N} = P^{\sigma}R^{\sigma'}_{0\ldots N}(P^{\sigma})^{-1}R^{\sigma}_{0\ldots N} = R^{\sigma'}_{\sigma(0\ldots N)}R^{\sigma}_{0\ldots N}, \qquad \forall \sigma, \sigma' \in S_{N+1}, \tag{3.11}$$
where $P^{\sigma}$ is the $Z_2$-graded permutation operator on the tensor product space, i.e., $P^{\sigma}|i_0\rangle_{(0)}\cdots|i_N\rangle_{(N)} = |i_0\rangle_{(\sigma(0))}\cdots|i_N\rangle_{(\sigma(N))}$. For any elementary permutation $\sigma_j$ with $\sigma_j(0,\ldots,j,j+1,\ldots,N) = (0,\ldots,j+1,j,\ldots,N)$, $j = 0,\ldots,N-1$, the corresponding $R^{\sigma_j}_{0\ldots N}$ is
$$R^{\sigma_j}_{0\ldots N} = R_{j\,j+1}. \tag{3.12}$$
For any element $\sigma \in S_{N+1}$, the corresponding $R^{\sigma}_{0\ldots N}$ can be constructed through (3.11) and (3.12) as follows. Let σ be decomposed in a minimal way in terms of elementary permutations as $\sigma = \sigma_{\beta_1}\cdots\sigma_{\beta_p}$, where the positive integer p is the length of σ. The composition law enables one to obtain the expression of the associated $R^{\sigma}_{0\ldots N}$. The GYBE (3.5) and (3.4) guarantee the uniqueness of $R^{\sigma}_{0\ldots N}$. For the special element $\sigma_c$ of $S_{N+1}$,
$$\sigma_c = \sigma_0\sigma_1\cdots\sigma_{N-1}, \quad \text{namely,} \quad \sigma_c(0,1,\ldots,N) = (1,2,\ldots,N,0), \tag{3.13}$$
the associated $R^{\sigma_c}_{0\ldots N}$ is given by
$$T(u) \equiv T_0(u) = T_{0,1\ldots N}(u) = R^{\sigma_c}_{0\ldots N} = R_{0N}R_{0\,N-1}\cdots R_{01}. \tag{3.14}$$
Thus $R^{\sigma_c}_{0\ldots N}$ is the quantum monodromy matrix T(u) of the $U_q(gl(m|n))$ spin chain on an N-site lattice. By the GYBE, one may prove that the monodromy matrix satisfies the GYBE
$$R_{00'}(u-v)\,T_0(u)\,T_{0'}(v) = T_{0'}(v)\,T_0(u)\,R_{00'}(u-v). \tag{3.15}$$
Define the transfer matrix t(u)
$$t(u) = \mathrm{str}_0\,T(u), \tag{3.16}$$
where $\mathrm{str}_0$ denotes the supertrace over the auxiliary space. Then the Hamiltonian of our model is given by
$$H = \left.\frac{d\ln t(u)}{du}\right|_{u=0}. \tag{3.17}$$
This model is integrable thanks to the commutativity of the transfer matrix for different parameters,
$$[t(u), t(v)] = 0, \tag{3.18}$$
which can be verified by using the GYBE. The fundamental relation (3.6) and the co-associativity (2.9) of the coproduct of $U_q(gl(m|n))$ enable one to prove the following result, using a procedure similar to that in [1] for the non-super case.

Proposition 1. The mapping defined in Definition 2 satisfies the following relation:
$$R^{\sigma}_{0\ldots N}\,(x)_{0\ldots N} = P^{\sigma}(x)_{0\ldots N}(P^{\sigma})^{-1}R^{\sigma}_{0\ldots N}, \qquad \forall x \in U_q(gl(m|n)),\ \sigma \in S_{N+1}. \tag{3.19}$$
One may decompose the monodromy matrix T(u) in terms of the basis of $\mathrm{End}(V_0)$ as
$$T(u) = \sum_{i,j=1}^{n+m} T_{i,j}(u)\,E^{i,j}_{(0)} \equiv \sum_{i,j=1}^{n+m} T_{i,j}(u)\,e_{i,j}, \tag{3.20}$$
where the matrix elements $T_{i,j}(u)$ are operators acting on the quantum space H and have the $Z_2$-grading $[T_{i,j}(u)] = [e_{i,j}] = [i] + [j]$. Similarly, for the quantities defined in Lemma 2, we have the decomposition:
$$(E^{i,i})_{0\ldots N} = E^{i,i}_{(0)} + \sum_{k=1}^{N} E^{i,i}_{(k)} = e_{i,i} + (E^{i,i})_{1\ldots N}, \qquad i = 1,\ldots,n+m, \tag{3.21}$$
$$\begin{aligned}
(E^{j,j+1})_{0\ldots N} &= E^{j,j+1}_{(0)}\,q^{\sum_{k=1}^{N}h^{j}_{(k)}} + \sum_{k=1}^{N} E^{j,j+1}_{(k)}\,q^{\sum_{i=k+1}^{N}h^{j}_{(i)}}\\
&= e_{j,j+1}\,q^{(h^{j})_{1\ldots N}} + (E^{j,j+1})_{1\ldots N}, \qquad j = 1,\ldots,n+m-1,
\end{aligned} \tag{3.22}$$
$$\begin{aligned}
(E^{j+1,j})_{0\ldots N} &= E^{j+1,j}_{(0)} + q^{-h^{j}_{(0)}}\sum_{k=1}^{N} q^{-\sum_{i=1}^{k-1}h^{j}_{(i)}}\,E^{j+1,j}_{(k)}\\
&= e_{j+1,j} + q^{-h_j}(E^{j+1,j})_{1\ldots N}, \qquad j = 1,\ldots,n+m-1,
\end{aligned} \tag{3.23}$$
where $h_j = (-1)^{[j]}e_{j,j} - (-1)^{[j+1]}e_{j+1,j+1}$. Without confusion, hereafter we adopt the following convention:
$$e_{i,i'} = E^{i,i'}_{(0)}, \qquad i, i' = 1,\ldots,n+m, \tag{3.24}$$
$$h_j = h^{j}_{(0)} = (-1)^{[j]}e_{j,j} - (-1)^{[j+1]}e_{j+1,j+1}, \qquad j = 1,\ldots,n+m-1, \tag{3.25}$$
$$E_{i,i} = (E^{i,i})_{1\ldots N} = \sum_{k=1}^{N} E^{i,i}_{(k)}, \qquad i = 1,\ldots,n+m, \tag{3.26}$$
$$H_j = (-1)^{[j]}E_{j,j} - (-1)^{[j+1]}E_{j+1,j+1}, \qquad j = 1,\ldots,n+m-1, \tag{3.27}$$
$$E_{j,j+1} = (E^{j,j+1})_{1\ldots N} = \sum_{k=1}^{N} E^{j,j+1}_{(k)}\,q^{\sum_{i=k+1}^{N}h^{j}_{(i)}}, \qquad j = 1,\ldots,n+m-1, \tag{3.28}$$
$$E_{j+1,j} = (E^{j+1,j})_{1\ldots N} = \sum_{k=1}^{N} q^{-\sum_{i=1}^{k-1}h^{j}_{(i)}}\,E^{j+1,j}_{(k)}, \qquad j = 1,\ldots,n+m-1, \tag{3.29}$$
and a similar convention for the non-simple generators. The operators $\{E_{i,j}\}$ then act non-trivially on the quantum space H and trivially (i.e. as the identity) on the auxiliary space $V_0$. From (3.13), we have
$$P^{\sigma_c}(h^{j})_{0\ldots N}(P^{\sigma_c})^{-1} = h_j + H_j, \qquad j = 1,\ldots,n+m-1, \tag{3.30}$$
$$P^{\sigma_c}(E^{j,j+1})_{0\ldots N}(P^{\sigma_c})^{-1} = q^{h_j}E_{j,j+1} + e_{j,j+1}, \qquad j = 1,\ldots,n+m-1, \tag{3.31}$$
$$P^{\sigma_c}(E^{j+1,j})_{0\ldots N}(P^{\sigma_c})^{-1} = e_{j+1,j}\,q^{-H_j} + E_{j+1,j}, \qquad j = 1,\ldots,n+m-1. \tag{3.32}$$
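Using (3.21)-(3.23), (3.30)-(3.32) and Lemma 1, we have

Proposition 2.
$$(E^{\gamma,\gamma-l})_{0\ldots N} = q^{-\sum_{k=1}^{l}h_{\gamma-k}}E_{\gamma,\gamma-l} + e_{\gamma,\gamma-l} + \sum_{i=1}^{l-1}\bigl(1-q^{2(-1)^{[\gamma-l+i]}}\bigr)\,q^{-\sum_{k=1}^{l-i}h_{\gamma-k}}\,e_{\gamma-l+i,\gamma-l}\,E_{\gamma,\gamma-l+i}, \qquad \gamma-l \ge 1 \ \text{and}\ l \ge 2, \tag{3.33}$$
$$P^{\sigma_c}(E^{\gamma,\gamma-l})_{0\ldots N}(P^{\sigma_c})^{-1} = e_{\gamma,\gamma-l}\,q^{-\sum_{k=1}^{l}H_{\gamma-k}} + E_{\gamma,\gamma-l} + \sum_{i=1}^{l-1}\bigl(1-q^{2(-1)^{[\gamma-l+i]}}\bigr)\,e_{\gamma,\gamma-l+i}\,q^{-\sum_{k=1}^{l-i}H_{\gamma-k}}\,E_{\gamma-l+i,\gamma-l}, \qquad \gamma-l \ge 1 \ \text{and}\ l \ge 2, \tag{3.34}$$
$$(E^{\gamma,\gamma+l})_{0\ldots N} = E_{\gamma,\gamma+l} + e_{\gamma,\gamma+l}\,q^{\sum_{k=0}^{l-1}H_{\gamma+k}} + \sum_{i=1}^{l-1}\bigl(1-q^{-2(-1)^{[\gamma+i]}}\bigr)\,e_{\gamma+i,\gamma+l}\,E_{\gamma,\gamma+i}\,q^{\sum_{k=i}^{l-1}H_{\gamma+k}}, \qquad \gamma+l \le n+m \ \text{and}\ l \ge 2, \tag{3.35}$$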
$$P^{\sigma_c}(E^{\gamma,\gamma+l})_{0\ldots N}(P^{\sigma_c})^{-1} = e_{\gamma,\gamma+l} + q^{\sum_{k=0}^{l-1}h_{\gamma+k}}E_{\gamma,\gamma+l} + \sum_{i=1}^{l-1}\bigl(1-q^{-2(-1)^{[\gamma+i]}}\bigr)\,e_{\gamma,\gamma+i}\,q^{\sum_{k=i}^{l-1}h_{\gamma+k}}\,E_{\gamma+i,\gamma+l}, \qquad \gamma+l \le n+m \ \text{and}\ l \ge 2. \tag{3.36}$$
Substituting (3.35) and (3.36) into Proposition 1, we obtain our main result in this section:

Theorem 1. The matrix elements $T_{n+m,n+m-l}(u)$ ($l = 1,\ldots,n+m-1$) of the monodromy matrix can be expressed in terms of $T_{n+m,n+m}(u)$ and the generators of $U_q(gl(m|n))$ by the following recursive relation:
$$\begin{aligned}
T_{n+m,n+m-l}(u) &= q^{-(-1)^{[n+m]}}E_{n+m-l,n+m}\,T_{n+m,n+m}(u) - T_{n+m,n+m}(u)\,E_{n+m-l,n+m}\,q^{-\sum_{k=1}^{l}H_{n+m-k}}\\
&\quad - \sum_{\alpha=1}^{l-1}\bigl(1-q^{-2(-1)^{[n+m-\alpha]}}\bigr)\,T_{n+m,n+m-\alpha}(u)\,E_{n+m-l,n+m-\alpha}\,q^{-\sum_{k=\alpha+1}^{l}H_{n+m-k}}.
\end{aligned} \tag{3.37}$$
The proof of this theorem is relegated to Appendix A. We call the second term in the R.H.S. of (3.37) a quantum correction term, which vanishes in the rational limit (q → 1). Moreover, such a non-trivial correction term only occurs in the higher rank models (i.e., when n + m ≥ 3). In the rational limit q → 1, (3.37) reduces to the (anti)commutation relations used in [7, 11]. For some special values of m and n, the associated recursive relations in the present grading become:

• For the $U_q(gl(1|1))$ case:
$$T_{2,1}(u) = q^{-1}E_{1,2}\,T_{2,2}(u) - T_{2,2}(u)\,E_{1,2}\,q^{-H_1}. \tag{3.38}$$

• For the $U_q(gl(2|1))$ case, which corresponds to the quantum t-J model:
$$T_{3,2}(u) = q^{-1}E_{2,3}\,T_{3,3}(u) - T_{3,3}(u)\,E_{2,3}\,q^{-H_2}, \tag{3.39}$$
$$T_{3,1}(u) = q^{-1}E_{1,3}\,T_{3,3}(u) - T_{3,3}(u)\,E_{1,3}\,q^{-H_2-H_1} - (1-q^{2})\,T_{3,2}(u)\,E_{1,2}\,q^{-H_1}. \tag{3.40}$$

• For the $U_q(gl(2|2))$ case, which corresponds to the quantum EKS model [15]:
$$T_{4,3}(u) = q^{-1}E_{3,4}\,T_{4,4}(u) - T_{4,4}(u)\,E_{3,4}\,q^{-H_3}, \tag{3.41}$$
$$T_{4,2}(u) = q^{-1}E_{2,4}\,T_{4,4}(u) - T_{4,4}(u)\,E_{2,4}\,q^{-H_3-H_2} - (1-q^{-2})\,T_{4,3}(u)\,E_{2,3}\,q^{-H_2}, \tag{3.42}$$
$$T_{4,1}(u) = q^{-1}E_{1,4}\,T_{4,4}(u) - T_{4,4}(u)\,E_{1,4}\,q^{-H_3-H_2-H_1} - (1-q^{-2})\,T_{4,3}(u)\,E_{1,3}\,q^{-H_2-H_1} - (1-q^{2})\,T_{4,2}(u)\,E_{1,2}\,q^{-H_1}. \tag{3.43}$$
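The special cases above follow from (3.37) by pure index bookkeeping, which the following sketch makes explicit. Given m, n and l, it lists the exponent of q in the leading prefactor, the Cartan generators in the exponent of the second term, and, for each correction term, the exponent $-2(-1)^{[n+m-\alpha]}$ together with the indices of the operators involved. The grading $[i] = 1$ for $i \le m$ is taken from Sect. 2; the function name and the dictionary layout are illustrative choices only.

```python
def grade(i, m):
    """Z2-grading [i]: 1 for i <= m, else 0."""
    return 1 if i <= m else 0


def recursion_terms(m, n, l):
    """Index data of the recursion (3.37) for T_{n+m, n+m-l}(u).

    Returns a dict with:
      lhs          -- indices of the matrix element on the left-hand side
      lead_q_exp   -- exponent of q multiplying E_{n+m-l,n+m} T_{n+m,n+m}
      main_H       -- indices j of the H_j in the exponent of the second term
      corrections  -- one entry per alpha: a term
                      -(1 - q^{q_exp}) T_{T} E_{E} q^{-sum of H_j, j in H}
    """
    d = m + n
    if not 1 <= l <= d - 1:
        raise ValueError("l must satisfy 1 <= l <= n + m - 1")
    corrections = []
    for alpha in range(1, l):
        corrections.append({
            "T": (d, d - alpha),
            "E": (d - l, d - alpha),
            "q_exp": -2 * (-1) ** grade(d - alpha, m),
            "H": [d - k for k in range(alpha + 1, l + 1)],
        })
    return {
        "lhs": (d, d - l),
        "lead_q_exp": -((-1) ** grade(d, m)),
        "main_H": [d - k for k in range(1, l + 1)],
        "corrections": corrections,
    }
```

For example, `recursion_terms(2, 1, 2)` yields a single correction term with `q_exp = 2`, `T = (3, 2)`, `E = (1, 2)` and `H = [1]`, which is the term $-(1-q^{2})T_{3,2}(u)E_{1,2}q^{-H_1}$ of (3.40).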
4. Factorizing F-Matrices and Their Inverses

In this section, we construct the Drinfeld twists [2] (factorizing F-matrices) on the N-fold tensor product space (i.e. the quantum space H) associated with the quantum superalgebra $U_q(gl(m|n))$.

4.1. Factorizing F-matrix. Let $S_N$ be the permutation group associated with the indices (1, ..., N) and $R^{\sigma}_{1\ldots N}$ the N-site R-matrix associated with $\sigma \in S_N$. $R^{\sigma}_{1\ldots N}$ acts non-trivially on the quantum space H and trivially (i.e. as the identity) on the auxiliary space.

Definition 3. The F-matrix $F_{1\ldots N}(z_1,\ldots,z_N)$ is an operator in $\mathrm{End}(H)$ and satisfies the following three properties: I. lower-triangularity; II. non-degeneracy; III. factorization, namely,
$$F_{\sigma(1)\ldots\sigma(N)}(z_{\sigma(1)},\ldots,z_{\sigma(N)})\,R^{\sigma}_{1\ldots N} = F_{1\ldots N}(z_1,\ldots,z_N), \qquad \forall \sigma \in S_N. \tag{4.1}$$
Define the N-site F-matrix:
$$F_{1\ldots N} \equiv F_{1\ldots N}(z_1,\ldots,z_N) = \sum_{\sigma \in S_N}\ \sideset{}{^*}\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}=1}^{n+m}\ \prod_{j=1}^{N} P^{\alpha_{\sigma(j)}}_{\sigma(j)}\,S(\sigma,\alpha_\sigma)\,R^{\sigma}_{1\ldots N}, \tag{4.2}$$
where $P^{\alpha}_{i}$ is the embedding of the projection operator $P^{\alpha}$ in the i-th space with $(P^{\alpha})_{kl} = \delta_{kl}\delta_{k\alpha}$, the sum $\sum^*$ in (4.2) is over all non-decreasing sequences of the labels $\alpha_{\sigma(i)}$:
$$\begin{aligned}
\alpha_{\sigma(i+1)} &\ge \alpha_{\sigma(i)} \quad \text{if } \sigma(i+1) > \sigma(i),\\
\alpha_{\sigma(i+1)} &> \alpha_{\sigma(i)} \quad \text{if } \sigma(i+1) < \sigma(i),
\end{aligned} \tag{4.3}$$
and $S(\sigma,\alpha_\sigma)$ is a c-number function of σ, $\alpha_\sigma$ and the element $c_{ij}$ of the R-matrix, defined by
$$S(\sigma,\alpha_\sigma) \equiv \exp\left\{\frac{1}{2}\sum_{l>k=1}^{N}\bigl(1-(-1)^{[\alpha_{\sigma(k)}]}\bigr)\,\delta_{\alpha_{\sigma(k)},\alpha_{\sigma(l)}}\,\ln\bigl(1+c_{\sigma(k)\sigma(l)}\bigr)\right\}. \tag{4.4}$$
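The summation rule (4.3) and the factor (4.4) can be made concrete with a short sketch. The condition (4.3) says that the pairs $(\alpha_{\sigma(i)}, \sigma(i))$ increase strictly in lexicographic order, so every assignment of labels to sites should be admissible for exactly one permutation σ; the code counts this by brute force. For (4.4), since $\tfrac12(1-(-1)^{[\alpha]}) = [\alpha]$, the exponential reduces to a product of factors $1 + c_{\sigma(k)\sigma(l)}$ over pairs of positions carrying equal labels of grading 1. The function names are hypothetical, and the callable `c` stands in for the weights $c_{ij}$ of (3.3).

```python
from itertools import permutations, product


def admissible(sigma, alpha):
    """Test condition (4.3).

    sigma -- tuple with sigma[i] the (1-based) site at position i
    alpha -- tuple with alpha[s - 1] the label attached to site s
    """
    for i in range(len(sigma) - 1):
        lower, upper = alpha[sigma[i] - 1], alpha[sigma[i + 1] - 1]
        if sigma[i + 1] > sigma[i]:
            if not upper >= lower:
                return False
        elif not upper > lower:
            return False
    return True


def admissible_counts(N, d):
    """For each label assignment, count the permutations admissible for it."""
    sites = range(1, N + 1)
    return {
        alpha: sum(admissible(sigma, alpha) for sigma in permutations(sites))
        for alpha in product(range(1, d + 1), repeat=N)
    }


def S_factor(sigma, alpha, c, m):
    """Evaluate (4.4) as a product over equal labels of grading 1 (label <= m).

    c -- callable (i, j) -> c_{ij}; a stand-in for the weight (3.3)
    """
    value = 1.0
    for k in range(len(sigma)):
        for l in range(k + 1, len(sigma)):
            label_k, label_l = alpha[sigma[k] - 1], alpha[sigma[l] - 1]
            if label_k == label_l and label_k <= m:
                value *= 1 + c(sigma[k], sigma[l])
    return value
```

With N = 3 sites and n + m = 3 labels, `admissible_counts(3, 3)` has 27 entries, each equal to 1, in line with each basis vector appearing exactly once in the sum (4.2).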
Proposition 3. The F-matrix $F_{1\ldots N}$ given by (4.2)-(4.4) satisfies Properties I, II, III.

Proof. The definition (4.2) of $F_{1\ldots N}$ and the summation condition (4.3) imply that $F_{1\ldots N}$ is a lower-triangular matrix. Moreover, one can easily check that the F-matrix is non-degenerate because all diagonal elements are non-zero. We now prove that the F-matrix (4.2) satisfies Property III. Any given permutation $\sigma \in S_N$ can be decomposed into elementary ones of the group $S_N$ as $\sigma = \sigma_{i_1}\cdots\sigma_{i_k}$. By (3.11), we have, if Property III holds for any elementary permutation $\sigma_i$,
$$\begin{aligned}
F_{\sigma(1\ldots N)}R^{\sigma}_{1\ldots N} &= F_{\sigma_{i_1}\ldots\sigma_{i_k}(1\ldots N)}\,R^{\sigma_{i_k}}_{\sigma_{i_1}\ldots\sigma_{i_{k-1}}(1\ldots N)}\,R^{\sigma_{i_{k-1}}}_{\sigma_{i_1}\ldots\sigma_{i_{k-2}}(1\ldots N)}\cdots R^{\sigma_{i_1}}_{1\ldots N}\\
&= F_{\sigma_{i_1}\ldots\sigma_{i_{k-1}}(1\ldots N)}\,R^{\sigma_{i_{k-1}}}_{\sigma_{i_1}\ldots\sigma_{i_{k-2}}(1\ldots N)}\cdots R^{\sigma_{i_1}}_{1\ldots N}\\
&= \cdots = F_{\sigma_{i_1}(1\ldots N)}R^{\sigma_{i_1}}_{1\ldots N} = F_{1\ldots N}.
\end{aligned} \tag{4.5}$$
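For the elementary permutation $\sigma_i$, we have
$$\begin{aligned}
F_{\sigma_i(1\ldots N)}R^{\sigma_i}_{1\ldots N} &= \sum_{\sigma \in S_N}\ \sideset{}{^*}\sum_{\alpha_{\sigma_i\sigma(1)}\ldots\alpha_{\sigma_i\sigma(N)}}\ \prod_{j=1}^{N} P^{\alpha_{\sigma_i\sigma(j)}}_{\sigma_i\sigma(j)}\,S(\sigma_i\sigma,\alpha_{\sigma_i\sigma})\,R^{\sigma}_{\sigma_i(1\ldots N)}R^{\sigma_i}_{1\ldots N}\\
&= \sum_{\sigma \in S_N}\ \sideset{}{^*}\sum_{\alpha_{\sigma_i\sigma(1)}\ldots\alpha_{\sigma_i\sigma(N)}}\ \prod_{j=1}^{N} P^{\alpha_{\sigma_i\sigma(j)}}_{\sigma_i\sigma(j)}\,S(\sigma_i\sigma,\alpha_{\sigma_i\sigma})\,R^{\sigma_i\sigma}_{1\ldots N}\\
&= \sum_{\tilde\sigma \in S_N}\ \sideset{}{^{*(i)}}\sum_{\alpha_{\tilde\sigma(1)}\ldots\alpha_{\tilde\sigma(N)}}\ \prod_{j=1}^{N} P^{\alpha_{\tilde\sigma(j)}}_{\tilde\sigma(j)}\,S(\tilde\sigma,\alpha_{\tilde\sigma})\,R^{\tilde\sigma}_{1\ldots N},
\end{aligned} \tag{4.6}$$
where $\tilde\sigma = \sigma_i\sigma$, and the summation sequences of $\alpha_{\tilde\sigma}$ in $\sum^{*(i)}$ now have the form
$$\begin{aligned}
\alpha_{\tilde\sigma(j+1)} &\ge \alpha_{\tilde\sigma(j)} \quad \text{if } \sigma_i\tilde\sigma(j+1) > \sigma_i\tilde\sigma(j),\\
\alpha_{\tilde\sigma(j+1)} &> \alpha_{\tilde\sigma(j)} \quad \text{if } \sigma_i\tilde\sigma(j+1) < \sigma_i\tilde\sigma(j).
\end{aligned} \tag{4.7}$$
Comparing (4.7) with (4.3), we find that the only difference between them is the transposition factor $\sigma_i$ in the "if" conditions. For a given $\tilde\sigma \in S_N$ with $\tilde\sigma(j) = i$ and $\tilde\sigma(k) = i+1$, we now examine how the elementary transposition $\sigma_i$ affects the inequalities (4.7). If $|j-k| > 1$, then $\sigma_i$ does not affect the sequence of $\alpha_{\tilde\sigma}$ at all, that is, the sign of inequality ">" or "≥" between two neighboring root indices is unchanged under the action of $\sigma_i$. If $|j-k| = 1$, then in the summation sequences of $\alpha_{\tilde\sigma}$, when $\tilde\sigma(j+1) = i+1$ and $\tilde\sigma(j) = i$, the sign "≥" changes to ">", while when $\tilde\sigma(j+1) = i$ and $\tilde\sigma(j) = i+1$, ">" changes to "≥". Thus (4.3) and (4.6) differ only when equal labels $\alpha_{\tilde\sigma}$ appear. With the help of the relation $c_{21}c_{12} = 1$, one may prove that in this case the product $F_{\sigma_i(1\ldots N)}R^{\sigma_i}_{1\ldots N}$ still equals $F_{1\ldots N}$ (see [10] for a more detailed proof). Thus, we obtain
$$R^{\sigma}_{1\ldots N}(z_1,\ldots,z_N) = F^{-1}_{\sigma(1\ldots N)}(z_{\sigma(1)},\ldots,z_{\sigma(N)})\,F_{1\ldots N}(z_1,\ldots,z_N), \tag{4.8}$$
and the factorizing F-matrix $F_{1\ldots N}$ of $U_q(gl(m|n))$ is proved to satisfy all three properties.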
From the expression of the F -matrix, one knows that it has an even grading, i.e., [F1...N ] = 0.
(4.9)
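4.2. Inverse of the F-matrix. The non-degeneracy of the F-matrix implies that we can find the inverse matrix $F^{-1}_{1\ldots N}$. To do so, we first define
$$F^{*}_{1\ldots N} = \sum_{\sigma \in S_N}\ \sideset{}{^{**}}\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}=1}^{n+m} S(\sigma,\alpha_\sigma)\,R^{\sigma^{-1}}_{\sigma(1\ldots N)}\,\prod_{j=1}^{N} P^{\alpha_{\sigma(j)}}_{\sigma(j)}, \tag{4.10}$$
where the sum $\sum^{**}$ is taken over all possible $\alpha_i$ which satisfy the following non-increasing constraints:
$$\begin{aligned}
\alpha_{\sigma(i+1)} &\le \alpha_{\sigma(i)} \quad \text{if } \sigma(i+1) < \sigma(i),\\
\alpha_{\sigma(i+1)} &< \alpha_{\sigma(i)} \quad \text{if } \sigma(i+1) > \sigma(i).
\end{aligned} \tag{4.11}$$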
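Proposition 4. The inverse of the F-matrix is given by
$$F^{-1}_{1\ldots N} = F^{*}_{1\ldots N}\prod_{i<j}\Delta_{ij}^{-1}, \tag{4.12}$$
where
$$[\Delta_{ij}]^{\beta_i\beta_j}_{\alpha_i\alpha_j} = \delta_{\alpha_i\beta_i}\delta_{\alpha_j\beta_j}
\begin{cases}
\dfrac{\sinh(z_i-z_j)}{\sinh(z_i-z_j+\eta)} & \text{if } \alpha_i > \alpha_j,\\[2ex]
\dfrac{\sinh(z_j-z_i)}{\sinh(z_j-z_i+\eta)} & \text{if } \alpha_i < \alpha_j,\\[2ex]
1 & \text{if } \alpha_i = \alpha_j = m+1,\ldots,n+m,\\[1ex]
-\dfrac{4\sinh^2(z_i-z_j)\cosh^2\eta}{\sinh(z_i-z_j+\eta)\,\sinh(z_i-z_j-\eta)} & \text{if } \alpha_i = \alpha_j = 1,\ldots,m.
\end{cases} \tag{4.13}$$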
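Proof. We compute the product of $F_{1\ldots N}$ and $F^{*}_{1\ldots N}$. Substituting (4.2) and (4.10) into the product, we have
$$\begin{aligned}
F_{1\ldots N}F^{*}_{1\ldots N} &= \sum_{\sigma \in S_N}\sum_{\sigma' \in S_N}\ \sideset{}{^*}\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}}\ \sideset{}{^{**}}\sum_{\beta_{\sigma'(1)}\ldots\beta_{\sigma'(N)}} S(\sigma,\alpha_\sigma)\,S(\sigma',\beta_{\sigma'})\ \prod_{i=1}^{N} P^{\alpha_{\sigma(i)}}_{\sigma(i)}\ R^{\sigma}_{1\ldots N}\,R^{\sigma'^{-1}}_{\sigma'(1\ldots N)}\ \prod_{i=1}^{N} P^{\beta_{\sigma'(i)}}_{\sigma'(i)}\\
&= \sum_{\sigma \in S_N}\sum_{\sigma' \in S_N}\ \sideset{}{^*}\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}}\ \sideset{}{^{**}}\sum_{\beta_{\sigma'(1)}\ldots\beta_{\sigma'(N)}} S(\sigma,\alpha_\sigma)\,S(\sigma',\beta_{\sigma'})\ \prod_{i=1}^{N} P^{\alpha_{\sigma(i)}}_{\sigma(i)}\ R^{\sigma'^{-1}\sigma}_{\sigma'(1\ldots N)}\ \prod_{i=1}^{N} P^{\beta_{\sigma'(i)}}_{\sigma'(i)}.
\end{aligned} \tag{4.14}$$
To evaluate the R.H.S., we examine the matrix element of the R-matrix
$$\Bigl(R^{\sigma'^{-1}\sigma}_{\sigma'(1\ldots N)}\Bigr)^{\alpha_{\sigma(N)}\ldots\alpha_{\sigma(1)}}_{\beta_{\sigma'(N)}\ldots\beta_{\sigma'(1)}}. \tag{4.15}$$
Note that the sequence $\{\alpha_\sigma\}$ is non-decreasing and $\{\beta_{\sigma'}\}$ is non-increasing. Thus the non-vanishing condition of the matrix element (4.15) requires that $\alpha_\sigma$ and $\beta_{\sigma'}$ satisfy
$$\beta_{\sigma'(N)} = \alpha_{\sigma(1)}, \ \ldots, \ \beta_{\sigma'(1)} = \alpha_{\sigma(N)}. \tag{4.16}$$
One can verify [7] that (4.16) is fulfilled only if
$$\sigma'(N) = \sigma(1), \ \ldots, \ \sigma'(1) = \sigma(N). \tag{4.17}$$
Let $\bar\sigma$ be the maximal element of $S_N$, which reverses the site labels:
$$\bar\sigma(1,\ldots,N) = (N,\ldots,1). \tag{4.18}$$
Then from (4.17), we have
$$\sigma' = \sigma\bar\sigma. \tag{4.19}$$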
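Substituting (4.16) and (4.19) into (4.14), we have
$$F_{1\ldots N}F^{*}_{1\ldots N} = \sum_{\sigma \in S_N}\ \sideset{}{^*}\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}} S(\sigma,\alpha_\sigma)\,S(\sigma\bar\sigma,\alpha_{\sigma\bar\sigma})\ \prod_{i=1}^{N} P^{\alpha_{\sigma(i)}}_{\sigma(i)}\ R^{\bar\sigma}_{\sigma(N\ldots 1)}\ \prod_{i=1}^{N} P^{\alpha_{\sigma(i)}}_{\sigma(i)}. \tag{4.20}$$
The decomposition of $R^{\bar\sigma}$ in terms of elementary R-matrices is unique modulo the GYBE. One deduces from (4.20) that $FF^{*}$ is a diagonal matrix:
$$F_{1\ldots N}F^{*}_{1\ldots N} = \prod_{i<j}\Delta_{ij}. \tag{4.21}$$
Then (4.12) is a simple consequence of the above equation.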
5. Monodromy Matrix in the F-Basis

In the previous section, we have constructed the F-matrix and its inverse, which act on the quantum space H. The non-degeneracy of the F-matrix means that its column vectors also form a complete basis of H, which is called the F-basis. In this section, we study the generators of $U_q(gl(m|n))$ and the elements of the monodromy matrix in the F-basis.

5.1. $U_q(gl(m|n))$ generators in the F-basis. The Cartan generators $\{E^{i,i}\}$ (or $\{H^{j}\}$) and the simple generators $\{E^{j,j+1}\}$, $\{E^{j+1,j}\}$ of $U_q(gl(m|n))$ are realized on H by $\{E_{i,i}\}$ (or $\{H_j\}$), $\{E_{j,j+1}\}$ and $\{E_{j+1,j}\}$, respectively, as in (3.26)-(3.29). The other non-simple generators $\{E^{i,j}\}$ can be obtained from the simple ones by (2.10) and (2.11); we denote their realizations on H by $\{E_{i,j}\}$. Introduce the generators in the F-basis:
$$\tilde E_{i,j} = F_{1\ldots N}\,E_{i,j}\,F^{-1}_{1\ldots N}, \qquad i, j = 1,\ldots,n+m. \tag{5.1}$$

Theorem 2. In the F-basis the Cartan and the simple generators of $U_q(gl(m|n))$ are given by
$$\tilde E_{i,i} = E_{i,i} = \sum_{k=1}^{N} E^{i,i}_{(k)}, \qquad i = 1,\ldots,n+m, \tag{5.2}$$
$$\tilde E_{j,j+1} = \sum_{k=1}^{N} E^{j,j+1}_{(k)} \otimes_{\gamma \ne k} G^{j,j+1}_{(\gamma)}(k,\gamma), \qquad j = 1,\ldots,n+m-1, \tag{5.3}$$
$$\tilde E_{j+1,j} = \sum_{k=1}^{N} E^{j+1,j}_{(k)} \otimes_{\gamma \ne k} G^{j+1,j}_{(\gamma)}(k,\gamma), \qquad j = 1,\ldots,n+m-1. \tag{5.4}$$
Here the diagonal matrices $G^{\gamma,\gamma\pm1}_{(j)}(i,j)$ are:
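• For $1 < \gamma+1 \le m$,
$$\bigl(G^{\gamma,\gamma+1}_{(j)}(i,j)\bigr)_{kl} = \delta_{kl}\begin{cases} 2e^{-\eta}\cosh\eta, & k = \gamma,\\ (2a_{ij}\cosh\eta)^{-1}e^{\eta}, & k = \gamma+1,\\ 1, & \text{otherwise}, \end{cases} \tag{5.5}$$
$$\bigl(G^{\gamma+1,\gamma}_{(j)}(i,j)\bigr)_{kl} = \delta_{kl}\begin{cases} 2e^{-\eta}\cosh\eta, & k = \gamma+1,\\ (2a_{ji}\cosh\eta)^{-1}e^{\eta}, & k = \gamma,\\ 1, & \text{otherwise}, \end{cases} \tag{5.6}$$

• For $\gamma = m$,
$$\bigl(G^{\gamma,\gamma+1}_{(j)}(i,j)\bigr)_{kl} = \delta_{kl}\begin{cases} 2e^{-\eta}\cosh\eta, & k = \gamma,\\ e^{-\eta}, & k = \gamma+1,\\ 1, & \text{otherwise}, \end{cases} \tag{5.7}$$
$$\bigl(G^{\gamma+1,\gamma}_{(j)}(i,j)\bigr)_{kl} = \delta_{kl}\begin{cases} (2a_{ji}\cosh\eta)^{-1}e^{\eta}, & k = \gamma,\\ (a_{ji})^{-1}e^{\eta}, & k = \gamma+1,\\ 1, & \text{otherwise}, \end{cases} \tag{5.8}$$

• For $1+m \le \gamma < n+m$,
$$\bigl(G^{\gamma,\gamma+1}_{(j)}(i,j)\bigr)_{kl} = \delta_{kl}\begin{cases} (a_{ij})^{-1}e^{\eta}, & k = \gamma,\\ e^{-\eta}, & k = \gamma+1,\\ 1, & \text{otherwise}, \end{cases} \tag{5.9}$$
$$\bigl(G^{\gamma+1,\gamma}_{(j)}(i,j)\bigr)_{kl} = \delta_{kl}\begin{cases} (a_{ji})^{-1}e^{-\eta}, & k = \gamma+1,\\ e^{\eta}, & k = \gamma,\\ 1, & \text{otherwise}. \end{cases} \tag{5.10}$$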
Proof. Using (3.26)-(3.29), Proposition 1 and the composition law of R-matrices (3.11), one can prove the theorem. Here, without losing generality, we give the proof for the generator E˜ 1,2 as an example. From the expressions of F1...N and its inverse, we have E˜ 1,2 =
∗
∗∗
S(σ, ασ )S(σ , βσ )
σ,σ ∈SN ασ (1) ...ασ (N ) βσ (1) ...βσ (N )
×
N
−1
α
σ (i) σ Pσ (i) R1...N (E 1,2 )1...N Rσσ (1...N)
i=1
=
N
β
Pσ σ(i)(i)
i=1
∗
∗∗
−1 ij
i<j
S(σ, ασ )S(σ , βσ )
σ,σ ∈SN ασ (1) ...ασ (N ) βσ (1) ...βσ (N )
×
N i=1
α
−1
−1
σ (i) σ σ σ Pσ (i) [(P1...N (E 1,2 )1,...,N P1...N )]Rσσ (1...N)
N i=1
β
Pσ σ(i)(i)
i<j
−1 (5.11) ij
=
N σ,σ ∈SN k=1 α
σ (1) ×Pσ (1)
=1
1,2 E(σ (l)) q
∗
N
1 i=l+1 h(σ (i))
∗∗
S(σ, ασ )S(σ , βσ )
ασ (1) ...ασ (N ) βσ (1) ...βσ (N )
N α =1→2 β −1 ασ (N ) σ −1 σ σ (l) . . . Pσ (l)=k Rσ (1...N) Pσ σ(i)(i) ij , . . . Pσ (N) i=1
(5.12)
i<j
where in (5.11) we have used (3.19) and l stands for indices between 1 and N such that σ (l) = k. Here and below, 1 → 2 means that 1 is changed to 2; thus ασ (l) = 1 → 2 in (5.12) means that ασ (l) = 1 is replaced by ασ (l) = 2 on the site σ (l) = k. The ele −1
α
σ (1) σ between Pσ (1) ment of Rσσ (1...N) is denoted as
=1
α
=1→2
α
σ (l) . . . Pσ (l)=k
ασ (N ) ... 1→2 ... 1 −1 σ Rσσ (1...N) βσ (N ) ...βσ (1)
σ (N )
σ (l)=k
β
β
σ (N ) (N ) and Pσ σ(N) . . . Pσ σ(1)(1) . . . Pσ (N)
σ (1)
(5.13)
.
We call the sequence {ασ (l) } normal if it is arranged according to the rules in (4.3), otherwise, we call it abnormal. It is now convenient for us to discuss the non-vanishing condition of the R-matrix element (5.13). Comparing (5.13) with (4.15), we find that the difference between them lies in the k th site. Because the group label in the k th space has been changed, the sequence {ασ } is now an abnormal sequence. However, it can be permuted to the normal sequence by some permutation σˆ k . Namely, α1→2 in the abnormal sequence can be moved to a suitable position by using the permutation σˆ k according to rules in (4.3). (It is easy to verify that σˆ k is unique by using (4.3).) Thus, by procedure similar to that in the previous section, we find that when σ = σˆ k σ σ¯
and
βσ (N) = ασ (1) , . . . , βσ (1) = ασ (N) ,
(5.14)
the R-matrix element (5.13) is non-vanishing. ij Because the non-zero condition of the elementary R-matrix element Rij is i + j = i + j , the following R-matrix elements: ασ (N ) ... 1 ...1→2... 1 −1 σ Rσσ (1...N) βσ (N ) ...βσ (1)
σ (N )
σ (l)=k
σ (p)
σ (1)
,
with 1 ≤ p ≤ l are also non-vanishing. Therefore, (5.12) becomes E˜ 1,2 =
N
∗
S(σ, ασ )S(σˆ k σ, ασˆ k σ ) σ ∈SN k=1 ασ1 ...ασN N 1 α =1 ασ (l) =1→2 ασ (N ) 1,2 i=l+1 h(σ (i)) P σ (1) . . . Pσ (l)=k . . . Pσ (N) × E(σ σ (1) (l)) q + ...
(5.15)
1,2 + E(σ (p)) q + ... 1,2 +E(σ (1)) q σ¯ σ −1 σˆ k−1 σ k σ (N...1)
×Rσˆ
N
1 i=n+1 h(σ (i))
N
1 i=2 h(i)
N i=1
=
N
α
σ (1) Pσ (1)
ασˆ k σ (i) k σ (i)
Pσˆ
α
σ (1) Pσ (1)
=1
=1→2
α
σ (p) . . . Pσ (p)
α
=1→2
=1
α
=1
α
σ (l) σ (N ) . . . Pσ (l)=k . . . Pσ (N)
α
σ (l) σ (N ) . . . Pσ (l)=k . . . Pσ (N)
−1 ij
(5.16)
i<j
1,2 E(k) ⊗j =k G1,2 (j ) (k, j ),
(5.17)
k=1
where the index $p$ runs between 1 and $l$, and $\hat\sigma_k$ is the element of $S_N$ that permutes the first abnormal sequence in the square bracket of (5.16) to a normal sequence. Using a similar procedure, one can prove the theorem for the other generators.

The non-simple generators $\tilde E_{\gamma,\gamma\pm l}$ (for $l \ge 2$) can be obtained from the simple ones by (2.10) and (2.11). Some remarks are in order. Equation (5.2) implies that
\[
\tilde H_j = H_j, \qquad j = 1,\ldots,N, \tag{5.18}
\]
which will be used frequently. In the rational limit $\eta \to 0$ (or $q \to 1$), our results reduce to those in [11], and to those in [7] for the special case $m = 0$.
5.2. Creation operators in the F-basis. Among the matrix elements of the monodromy matrix $T_{i,j}(u)$, the operators $T_{n+m,n+m-l}(u)$ ($l = 1,\ldots,n+m-1$) are called creation operators [5] and are usually denoted by
\[
C_{n+m-l}(u) = T_{n+m,n+m-l}(u), \qquad l = 1,\ldots,n+m-1. \tag{5.19}
\]
In the F-basis, they become
\[
\tilde C_{n+m-l}(u) = F_{1\ldots N}\, C_{n+m-l}(u)\, F^{-1}_{1\ldots N}, \qquad l = 1,\ldots,n+m-1. \tag{5.20}
\]
Let us denote $T_{n+m,n+m}(u)$ by $D(u)$ and the corresponding operator in the F-basis by
\[
\tilde D(u) = F_{1\ldots N}\, D(u)\, F^{-1}_{1\ldots N}. \tag{5.21}
\]
Proposition 5. $\tilde D(u)$ is a diagonal matrix given by
\[
\tilde D(u) = \otimes_{i=1}^{N} \mathrm{diag}\,(a_{0i},\ldots,a_{0i},1)_{(i)}. \tag{5.22}
\]
Proof. From (3.20), we derive that
\[
D(u) P_0^{n+m} = T_{n+m,n+m}(u)\, e_{n+m,n+m} = P_0^{n+m}\, T_{0,1\ldots N}(u)\, P_0^{n+m}. \tag{5.23}
\]
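Acting with the F-matrix from the left on both sides of the above equation, we have
\begin{align*}
F_{1\ldots N} D(u) P_0^{n+m}
&= \sum_{\sigma\in S_N} {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}}}^{\!\!*} S(\sigma,\alpha_\sigma) \prod_{i=1}^{N} P^{\alpha_{\sigma(i)}}_{\sigma(i)}\, R^{\sigma}_{1\ldots N}\, P_0^{m+n}\, T_{0,1\ldots N}(u)\, P_0^{m+n} \\
&= \sum_{\sigma\in S_N} {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}}}^{\!\!*} S(\sigma,\alpha_\sigma) \prod_{i=1}^{N} P^{\alpha_{\sigma(i)}}_{\sigma(i)}\, P_0^{m+n}\, T_{0,\sigma(1\ldots N)}(u)\, P_0^{m+n}\, R^{\sigma}_{1\ldots N}. \tag{5.24}
\end{align*}
Following [7], we can split the sum $\sum^{*}$ according to the number of occurrences of the index $m+n$,
\begin{align*}
F_{1\ldots N} D(u) P_0^{n+m}
&= \sum_{\sigma\in S_N} \sum_{k=0}^{N} {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(N)}}}^{\!\!*} S(\sigma,\alpha_\sigma) \prod_{j=N-k+1}^{N} \delta_{\alpha_{\sigma(j)},m+n}\, P^{\alpha_{\sigma(j)}}_{\sigma(j)} \\
&\quad\times \prod_{j=1}^{N-k} P^{\alpha_{\sigma(j)}}_{\sigma(j)}\, P_0^{m+n}\, T_{0,\sigma(1\ldots N)}(u)\, P_0^{m+n}\, R^{\sigma}_{1\ldots N}. \tag{5.25}
\end{align*}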
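Consider the prefactor of $R^{\sigma}_{1\ldots N}$. We have
\begin{align*}
&\prod_{j=1}^{N-k} P^{\alpha_{\sigma(j)}}_{\sigma(j)} \prod_{j=N-k+1}^{N} P^{m+n}_{\sigma(j)}\; P_0^{m+n}\, T_{0,\sigma(1\ldots N)}(u)\, P_0^{m+n} \\
&\quad= \prod_{j=1}^{N-k} P^{\alpha_{\sigma(j)}}_{\sigma(j)} \prod_{j=N-k+1}^{N} \big(R_{0\,\sigma(j)}\big)^{m+n\;m+n}_{m+n\;m+n}\; P_0^{m+n}\, T_{0,\sigma(1\ldots N-k)}(u)\, P_0^{m+n} \prod_{j=N-k+1}^{N} P^{m+n}_{\sigma(j)} \\
&\quad= \prod_{j=1}^{N-k} P^{\alpha_{\sigma(j)}}_{\sigma(j)}\; P_0^{m+n}\, T_{0,\sigma(1\ldots N-k)}(u)\, P_0^{m+n} \prod_{j=N-k+1}^{N} P^{m+n}_{\sigma(j)} \\
&\quad= \prod_{i=1}^{N-k} \big(R_{0\,\sigma(i)}\big)^{m+n\;\alpha_{\sigma(i)}}_{m+n\;\alpha_{\sigma(i)}} \prod_{j=1}^{N-k} P^{\alpha_{\sigma(j)}}_{\sigma(j)} \prod_{j=N-k+1}^{N} P^{m+n}_{\sigma(j)}\; P_0^{n+m} \\
&\quad= \prod_{i=1}^{N-k} a_{0\,\sigma(i)} \prod_{j=1}^{N-k} P^{\alpha_{\sigma(j)}}_{\sigma(j)} \prod_{j=N-k+1}^{N} P^{m+n}_{\sigma(j)}\; P_0^{n+m}, \tag{5.26}
\end{align*}
where $a_{0i} = a(u, z_i)$. Substituting (5.26) into (5.25), we have
\[
F_{1\ldots N}\, D(u) = \otimes_{i=1}^{N} \mathrm{diag}\,(a_{0i},\ldots,a_{0i},1)_{(i)}\; F_{1\ldots N}. \tag{5.27}
\]
This completes the proof of the proposition.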
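As an illustration of Proposition 5, the diagonal of $\tilde D(u)$ can be tabulated numerically as a product over sites. The following is only a sketch and is not part of the paper: the function names are hypothetical, and the weight $a(u,z)=\sinh(u-z)/\sinh(u-z+\eta)$ is an assumption, chosen to agree with the explicit $U_q(gl(2|1))$ expressions quoted in this section.

```python
import math
from itertools import product


def a_weight(u, z, eta):
    # Assumed form of a(u, z); not taken verbatim from the text.
    return math.sinh(u - z) / math.sinh(u - z + eta)


def d_tilde_diagonal(u, zs, eta, dim):
    """Diagonal of the tensor product over sites i of diag(a_0i, ..., a_0i, 1).

    `dim` is the local dimension n + m. Entries are ordered
    lexicographically in the site labels (first site most significant).
    """
    site_diags = [[a_weight(u, z, eta)] * (dim - 1) + [1.0] for z in zs]
    diagonal = []
    for labels in product(range(dim), repeat=len(zs)):
        entry = 1.0
        for site, label in enumerate(labels):
            entry *= site_diags[site][label]
        diagonal.append(entry)
    return diagonal
```

For two sites and local dimension 3 this returns nine entries. The last one, where every site carries the label $n+m$, equals 1, in line with the invariance of the pseudo-vacuum.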
By means of the expressions of the generators of Uq (gl(m|n)) in the previous subsection, combining with Theorem 1 and Proposition 5, we have
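Theorem 3. In the F-basis the creation operators $C_{n+m-l}(u)$ ($l = 1,\ldots,n+m-1$) are given by
\begin{align*}
\tilde C_{n+m-l}(u) &= \Big[ q^{-(-1)^{[n+m]}}\, \tilde E_{n+m-l,n+m}\, \tilde D(u) - \tilde D(u)\, \tilde E_{n+m-l,n+m} \Big]\, q^{-\sum_{k=1}^{l} H_{n+m-k}} \\
&\quad - \sum_{\alpha=1}^{l-1} \big(1 - q^{-2(-1)^{[n+m-\alpha]}}\big)\, \tilde C_{n+m-\alpha}(u)\, \tilde E_{n+m-l,n+m-\alpha}\, q^{-\sum_{k=\alpha+1}^{l} H_{n+m-k}}. \tag{5.28}
\end{align*}
For some special values of $m$ and $n$, we have:
• For $m = 0$ and $n = 2$ (namely, the $U_q(gl(2))$ case), our general results reduce to those in [1].
• For the $U_q(gl(2|1))$ case,
\begin{align*}
\tilde C_2(u) &= \Big[ q^{-1} \tilde E_{2,3}\, \tilde D(u) - \tilde D(u)\, \tilde E_{2,3} \Big]\, q^{-H_2} \\
&= \sum_{i=1}^{N} \frac{e^{-(u-z_i)}\sinh\eta}{\sinh(u-z_i+\eta)}\, E^{2,3}_{(i)} \otimes_{j\neq i} \mathrm{diag}\left( \frac{\sinh(u-z_j)}{\sinh(u-z_j+\eta)},\; \frac{2\sinh(u-z_j)\cosh\eta}{\sinh(u-z_j+\eta)},\; 1 \right)_{(j)}, \tag{5.29}
\end{align*}
\begin{align*}
\tilde C_1(u) &= \Big[ q^{-1} \tilde E_{1,3}\, \tilde D(u) - \tilde D(u)\, \tilde E_{1,3} \Big]\, q^{-H_1-H_2} - (1-q^{2})\, \tilde C_2(u)\, \tilde E_{1,2}\, q^{-H_1} \\
&= \sum_{i=1}^{N} \frac{e^{-(u-z_i)}\sinh\eta}{\sinh(u-z_i+\eta)}\, E^{1,3}_{(i)} \otimes_{j\neq i} \mathrm{diag}\left( \frac{2\sinh(u-z_j)\cosh\eta}{\sinh(u-z_j+\eta)},\; \frac{\sinh(u-z_j)\sinh(z_i-z_j+\eta)}{\sinh(z_i-z_j)\sinh(u-z_j+\eta)},\; 1 \right)_{(j)} \\
&\quad + \sum_{i\neq j=1}^{N} \frac{e^{z_j-u}\sinh(u-z_i)\sinh^{2}\eta\,\big[e^{z_j-z_i} + 2\sinh(z_i-z_j)\big]}{\sinh(u-z_i+\eta)\sinh(u-z_j+\eta)\sinh(z_i-z_j)}\, E^{1,2}_{(i)} \otimes E^{2,3}_{(j)} \\
&\qquad \otimes_{k\neq i,j} \mathrm{diag}\left( \frac{2\sinh(u-z_k)\cosh\eta}{\sinh(u-z_k+\eta)},\; \frac{\sinh(u-z_k)\sinh(z_i-z_k+\eta)}{\sinh(z_i-z_k)\sinh(u-z_k+\eta)},\; 1 \right)_{(k)}. \tag{5.30}
\end{align*}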
6. Bethe Vectors in the F-Basis

The explicit expressions of the creation operators of the $U_q(gl(m|n))$ monodromy matrix in the F-basis, given in the previous section, enable us to resolve the hierarchy of the nested Bethe vectors of the model associated with $U_q(gl(m|n))$. Here, we take the quantum t-J model (i.e. the $U_q(gl(2|1))$-model) as an example to demonstrate the procedure. The generalization to the general $U_q(gl(m|n))$ case is straightforward. In the framework of the standard algebraic Bethe ansatz method [19], the Bethe vector of the quantum supersymmetric t-J model is given by
\[
\Phi_N = \sum_{d_1\ldots d_\alpha} \big(\Phi^{(1)}_\alpha\big)^{d_1\ldots d_\alpha}\, C_{d_1}(v_1)\ldots C_{d_\alpha}(v_\alpha)\,|vac\rangle, \tag{6.1}
\]
where $\alpha$ is a positive integer and $d_i = 1, 2$, $|vac\rangle$ is the pseudo-vacuum state
\[
|vac\rangle = \otimes_{i=1}^{N} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}_{(i)}, \tag{6.2}
\]
and $(\Phi^{(1)}_\alpha)^{d_1\ldots d_\alpha}$ are functions of the spectral parameters $v_j$, and are the vector components of the nested Bethe vector $\Phi^{(1)}_\alpha$. The nested Bethe vector $\Phi^{(1)}_\alpha$ is given by
\[
\Phi^{(1)}_\alpha = C^{(1)}(v^{(1)}_1)\, C^{(1)}(v^{(1)}_2) \cdots C^{(1)}(v^{(1)}_\beta)\,|vac\rangle^{(1)}, \tag{6.3}
\]
where $\beta$ is a positive integer and $|vac\rangle^{(1)}$ is the nested pseudo-vacuum state
\[
|vac\rangle^{(1)} = \otimes_{i=1}^{\alpha} \begin{pmatrix} 0 \\ 1 \end{pmatrix}_{(i)}. \tag{6.4}
\]
The creation operator of the nested $U_q(gl(2))$ system, $C^{(1)}(u)$, is the lower-triangular entry of the nested monodromy matrix $T^{(1)}(v^{(1)})$,
\begin{align*}
T^{(1)}(v^{(1)}) &= r_{0\alpha}(v^{(1)} - v_\alpha)\, r_{0\,\alpha-1}(v^{(1)} - v_{\alpha-1}) \ldots r_{01}(v^{(1)} - v_1) \\
&\equiv \begin{pmatrix} A^{(1)}(v^{(1)}) & B^{(1)}(v^{(1)}) \\ C^{(1)}(v^{(1)}) & D^{(1)}(v^{(1)}) \end{pmatrix}, \tag{6.5}
\end{align*}
and the nested R-matrix is
\[
r_{12}(u_1,u_2) \equiv r_{12}(u_1-u_2) =
\begin{pmatrix}
c_{12} & 0 & 0 & 0 \\
0 & a_{12} & -b^{+}_{12} & 0 \\
0 & -b^{-}_{12} & a_{12} & 0 \\
0 & 0 & 0 & c_{12}
\end{pmatrix}. \tag{6.6}
\]
Equation (6.5) implies that the nested system is a $U_q(gl(2))$ spin chain on an $\alpha$-site lattice and the corresponding inhomogeneous parameters are $\{v_i \,|\, i = 1,\ldots,\alpha\}$. So, in the following, we shall adopt the same convention for the nested system as that in previous sections, but the inhomogeneous parameters will be replaced by $\{v_i\}$.

Acting with the associated F-matrix on the pseudo-vacuum state (6.2), one finds that the pseudo-vacuum state is invariant. It is due to the fact that only terms with roots equal to 3 will produce non-zero results. Therefore, the $U_q(gl(2|1))$ Bethe vector (6.1) in the F-basis can be written as
\begin{align*}
\tilde\Phi_N(v_1,\ldots,v_\alpha) &\equiv F_{1\ldots N}\, \Phi_N(v_1,\ldots,v_\alpha) \\
&= \sum_{d_1\ldots d_\alpha} \big(\Phi^{(1)}_\alpha\big)^{d_1\ldots d_\alpha}\, \tilde C_{d_1}(v_1)\ldots \tilde C_{d_\alpha}(v_\alpha)\,|vac\rangle. \tag{6.7}
\end{align*}
The c-number coefficient $(\Phi^{(1)}_\alpha)^{d_1\ldots d_\alpha}$ has to be evaluated in the original basis, not in the F-basis.
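6.1. Nested Bethe vectors in the F-basis. Let us first compute the nested Bethe vectors in the F-basis. For the nested R-matrix (6.6), we define the $\alpha$-site $F$- and $F^{*}$-matrices by
\[
F^{(1)}_{1\ldots\alpha} = \sum_{\sigma\in S_\alpha} {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(\alpha)}=1}^{2}}^{\!\!*}\; \prod_{j=1}^{\alpha} P^{\alpha_{\sigma(j)}}_{\sigma(j)}\, \bar S^{(1)}(c,\sigma,\alpha_\sigma)\, r^{\sigma}_{1\ldots\alpha}, \tag{6.8}
\]
\[
F^{*(1)}_{1\ldots\alpha} = \sum_{\sigma\in S_\alpha} {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(\alpha)}=1}^{2}}^{\!\!**}\; \bar S^{(1)}(c,\sigma,\alpha_\sigma)\, r^{\sigma^{-1}}_{\sigma(1\ldots\alpha)} \prod_{j=1}^{\alpha} P^{\alpha_{\sigma(j)}}_{\sigma(j)}, \tag{6.9}
\]
respectively, where the c-number function $\bar S^{(1)}$ is given by
\[
\bar S^{(1)}(c,\sigma,\alpha_\sigma) \equiv \exp\Big\{ \sum_{l>k=1}^{\alpha} \delta_{\alpha_{\sigma(k)},\alpha_{\sigma(l)}} \ln\big(1 + c_{\sigma(k)\sigma(l)}\big) \Big\}. \tag{6.10}
\]
Therefore the inverse of the F-matrix can be represented in terms of the $F^{*}$-matrix as
\[
\big(F^{(1)}_{1\ldots\alpha}\big)^{-1} = F^{*(1)}_{1\ldots\alpha} \prod_{i<j} \big(\bar\Delta^{(1)}_{ij}\big)^{-1}, \tag{6.11}
\]
with
\[
\bar\Delta^{(1)}_{ij} = \mathrm{diag}\left( \frac{4\sinh^{2}(v_i-v_j)\cosh^{2}\eta}{\sinh(v_i-v_j+\eta)\sinh(v_i-v_j-\eta)},\; \frac{\sinh(v_j-v_i)}{\sinh(v_j-v_i+\eta)},\; \frac{\sinh(v_i-v_j)}{\sinh(v_i-v_j+\eta)},\; \frac{4\sinh^{2}(v_i-v_j)\cosh^{2}\eta}{\sinh(v_i-v_j+\eta)\sinh(v_i-v_j-\eta)} \right).
\]
With the help of the F-matrix (6.8) and its inverse, one may compute the $\alpha$-site $U_q(gl(2))$ generator $\tilde E_{1,2}$ in the F-basis,
\[
\tilde E_{1,2} = \sum_{i=1}^{\alpha} E^{1,2}_{(i)} \otimes_{j\neq i} \mathrm{diag}\left( 2e^{-\eta}\cosh\eta,\; \frac{e^{\eta}\sinh(v_i-v_j+\eta)}{2\sinh(v_i-v_j)\cosh\eta} \right)_{(j)}. \tag{6.12}
\]
Define $\tilde D^{(1)}(u) = F^{(1)}_{1\ldots\alpha}\, D^{(1)}(u)\, (F^{(1)}_{1\ldots\alpha})^{-1}$. From Proposition 5, we obtain
\[
\tilde D^{(1)}(u) = \otimes_{i=1}^{\alpha} \mathrm{diag}\left( \frac{\sinh(u-v_i)}{\sinh(u-v_i+\eta)},\; \frac{\sinh(u-v_i-\eta)}{\sinh(u-v_i+\eta)} \right)_{(i)}. \tag{6.13}
\]
The creation operator in the F-basis is then obtained with the help of the nested R-matrix (6.6) and Theorem 3 in the case of $m = 2$ and $n = 0$, and is given by
\begin{align*}
\tilde C^{(1)}(u) &= \big( q\, \tilde E_{1,2}\, \tilde D^{(1)}(u) - \tilde D^{(1)}(u)\, \tilde E_{1,2} \big)\, q^{-H_1} \\
&= -\sum_{i=1}^{\alpha} \frac{e^{-(u-v_i)}\sinh\eta}{\sinh(u-v_i+\eta)}\, E^{1,2}_{(i)} \otimes_{j\neq i} \mathrm{diag}\left( \frac{2\sinh(u-v_j)\cosh\eta}{\sinh(u-v_j+\eta)},\; \frac{\sinh(v_i-v_j+\eta)\sinh(u-v_j-\eta)}{2\sinh(v_i-v_j)\sinh(u-v_j+\eta)\cosh\eta} \right)_{(j)}. \tag{6.14}
\end{align*}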
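Applying $F^{(1)}_{1\ldots\alpha}$ to the nested Bethe vector $\Phi^{(1)}_\alpha$ (6.3), we obtain
\begin{align*}
\tilde\Phi^{(1)}_\alpha(v^{(1)}_1,\ldots,v^{(1)}_\beta) &\equiv F^{(1)}_{1\ldots\alpha}\, \Phi^{(1)}_\alpha(v^{(1)}_1,\ldots,v^{(1)}_\beta) \\
&= s(c)\, \tilde C^{(1)}(v^{(1)}_1)\, \tilde C^{(1)}(v^{(1)}_2) \ldots \tilde C^{(1)}(v^{(1)}_\beta)\,|vac\rangle^{(1)}, \tag{6.15}
\end{align*}
where we have used $F^{(1)}_{1\ldots\alpha}|vac\rangle^{(1)} = \prod_{i<j}(1+c_{ij})\,|vac\rangle^{(1)} \equiv s(c)\,|vac\rangle^{(1)}$. Substituting $\tilde C^{(1)}(v)$ into (6.15), we obtain
\begin{align*}
\tilde\Phi^{(1)}_\alpha(v^{(1)}_1,\ldots,v^{(1)}_\beta) &= s(c)\, \tilde C^{(1)}(v^{(1)}_1) \ldots \tilde C^{(1)}(v^{(1)}_\beta)\,|vac\rangle^{(1)} \\
&= s(c) \sum_{i_1<\ldots<i_\beta} B^{(1)}_\beta(v^{(1)}_1,\ldots,v^{(1)}_\beta \,|\, v_{i_1},\ldots,v_{i_\beta})\, E^{1,2}_{(i_1)} \ldots E^{1,2}_{(i_\beta)}\,|vac\rangle^{(1)}, \tag{6.16}
\end{align*}
where
\begin{align*}
B^{(1)}_\beta(v^{(1)}_1,\ldots,v^{(1)}_\beta \,|\, v_1,\ldots,v_\beta)
&= \sum_{\sigma\in S_\beta} \prod_{k=1}^{\beta} \left( -\frac{e^{-(v^{(1)}_k - v_{\sigma(k)})}\sinh\eta}{\sinh(v^{(1)}_k - v_{\sigma(k)} + \eta)} \right) \\
&\quad\times \prod^{\alpha}_{j\neq\sigma(k),\ldots,\sigma(\beta)} \frac{\sinh(v_{\sigma(k)} - v_j + \eta)\,\sinh(v^{(1)}_k - v_j - \eta)}{2\sinh(v_{\sigma(k)} - v_j)\,\sinh(v^{(1)}_k - v_j + \eta)\cosh\eta} \\
&\quad\times \prod_{l=k+1}^{\beta} \frac{2\cosh\eta\,\sinh(v^{(1)}_k - v_{\sigma(l)})}{\sinh(v^{(1)}_k - v_{\sigma(l)} + \eta)}. \tag{6.17}
\end{align*}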
6.2. Bethe vectors of the quantum supersymmetric t-J model in the F-basis. We now return to the Bethe vector (6.7) of the quantum supersymmetric t-J model. As is shown in Appendix B, the Bethe vector is invariant (modulo an overall factor) under the exchange of arbitrary spectral parameters:
\[
\tilde\Phi_N(v_{\sigma(1)},\ldots,v_{\sigma(\alpha)}) = \frac{1}{c^{\sigma}_{1\ldots\alpha}}\, \tilde\Phi_N(v_1,\ldots,v_\alpha), \qquad \sigma\in S_\alpha, \tag{6.18}
\]
where $c^{\sigma}_{1\ldots\alpha}$ has the decomposition law
\[
c^{\sigma\sigma'}_{1\ldots\alpha} = c^{\sigma}_{\sigma'(1\ldots\alpha)}\, c^{\sigma'}_{1\ldots\alpha}, \tag{6.19}
\]
and $c^{\sigma_i}_{1\ldots\alpha} = c_{i\,i+1} \equiv c(v_i, v_{i+1})$ for an elementary permutation $\sigma_i$. This result is a generalization of that in [20, 21]. This invariance enables us to concentrate on a particularly simple term in the sum (6.7) of the following form with $p_1$ number of $d_i = 1$ and $\alpha - p_1$ number of $d_j = 2$,
\[
\tilde C_1(v_1) \ldots \tilde C_1(v_{p_1})\, \tilde C_2(v_{p_1+1}) \ldots \tilde C_2(v_\alpha). \tag{6.20}
\]
In the F-basis, the commutation relation between $C_i(v)$ and $C_j(u)$, i.e. (B.4), becomes
\[
\tilde C_i(v)\, \tilde C_j(u) = -\frac{1}{a(u,v)}\, \tilde C_j(u)\, \tilde C_i(v) + \frac{b(u,v)}{a(u,v)}\, \tilde C_j(v)\, \tilde C_i(u). \tag{6.21}
\]
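Then using (6.21), all $\tilde C_1$'s in (6.20) can be moved to the right of all $\tilde C_2$'s, yielding
\begin{align*}
&\tilde C_1(v_1) \ldots \tilde C_1(v_{p_1})\, \tilde C_2(v_{p_1+1}) \ldots \tilde C_2(v_\alpha) \\
&\quad= g(v_1,\ldots,v_\alpha)\, \tilde C_2(v_{p_1+1}) \ldots \tilde C_2(v_\alpha)\, \tilde C_1(v_1) \ldots \tilde C_1(v_{p_1}) + \ldots, \tag{6.22}
\end{align*}
where $g(v_1,\ldots,v_\alpha) = \prod_{k=1}^{p_1} \prod_{l=p_1+1}^{\alpha} \big(-1/a(v_l,v_k)\big)$ is the contribution from the first term of (6.21) and "$\ldots$" stands for other terms contributed by the second term of (6.21). It is easy to see that these other terms have the form
\[
\tilde C_2(v_{\sigma(p_1+1)}) \ldots \tilde C_2(v_{\sigma(\alpha)})\, \tilde C_1(v_{\sigma(1)}) \ldots \tilde C_1(v_{\sigma(p_1)}), \tag{6.23}
\]
with $\sigma\in S_\alpha$. Substituting (6.22) into the Bethe vector (6.7), we obtain
\begin{align*}
\tilde\Phi^{p_1}_N(v_1,\ldots,v_\alpha) &= \big(\Phi^{(1)}_\alpha\big)^{11\ldots12\ldots2} \prod_{k=1}^{p_1} \prod_{l=p_1+1}^{\alpha} \left( -\frac{1}{a(v_l,v_k)} \right) \\
&\quad\times \tilde C_2(v_{p_1+1}) \ldots \tilde C_2(v_\alpha)\, \tilde C_1(v_1) \ldots \tilde C_1(v_{p_1})\,|vac\rangle + \ldots, \tag{6.24}
\end{align*}
where, here and below, we use the upper index $p_1$ to denote the Bethe vector corresponding to the quantum number $p_1$. All other terms in (6.24) (denoted as "$\ldots$") are to be obtained from the first term by the permutation (exchange) symmetry. Then we have
\begin{align*}
\tilde\Phi^{p_1}_N(v_1,\ldots,v_\alpha) &= \frac{1}{p_1!\,(\alpha-p_1)!} \sum_{\sigma\in S_\alpha} c^{\sigma}_{1\ldots\alpha}\, \big(\Phi^{(1),\sigma}_\alpha\big)^{11\ldots12\ldots2} \prod_{k=1}^{p_1} \prod_{l=p_1+1}^{\alpha} \left( -\frac{1}{a(v_{\sigma(l)},v_{\sigma(k)})} \right) \\
&\quad\times \tilde C_2(v_{\sigma(p_1+1)}) \ldots \tilde C_2(v_{\sigma(\alpha)})\, \tilde C_1(v_{\sigma(1)}) \ldots \tilde C_1(v_{\sigma(p_1)})\,|vac\rangle, \tag{6.25}
\end{align*}
where $(\Phi^{(1),\sigma}_\alpha)^{11\ldots12\ldots2} \equiv (\hat f_\sigma \Phi^{(1)}_\alpha)^{11\ldots12\ldots2}$ with $\hat f_\sigma$ defined by (B.1) in Appendix B.

We now show that $(\Phi^{(1)}_\alpha)^{1\ldots12\ldots2}$ in (6.25), which has to be evaluated in the original basis, is invariant (modulo an overall factor) under the action of the $U_q(gl(2))$ F-matrix, i.e.
\[
\big(\Phi^{(1)}_\alpha\big)^{11\ldots12\ldots2} = (t(c))^{-1}\, \big(\tilde\Phi^{(1)}_\alpha\big)^{11\ldots12\ldots2}, \tag{6.26}
\]
where the scalar factor $t(c)$ is
\[
t(c) = \prod_{j>i=1}^{p_1} (1 + \bar c_{ij}) \prod_{j>i=p_1+1}^{\alpha} (1 + \bar c_{ij}), \qquad \bar c_{ij} = c(v_i,v_j), \tag{6.27}
\]
so that it can be expressed in the form of (6.16).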
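Write the nested pseudo-vacuum vector in (6.4) as
\[
|vac\rangle^{(1)} \equiv |2\cdots2\rangle^{(1)}, \tag{6.28}
\]
where the number of 2's is $\alpha$. Then the nested Bethe vector (6.15) can be rewritten as
\[
\Phi^{(1)}_\alpha(v^{(1)}_1 \ldots v^{(1)}_{p_1}) \equiv |\Phi^{(1)}_\alpha\rangle = \sum_{d_1\ldots d_\alpha} \big(\Phi^{(1)}_\alpha\big)^{d_1\ldots d_\alpha}\, |d_1\ldots d_\alpha\rangle^{(1)}. \tag{6.29}
\]
Acting with the $U_q(gl(2))$ F-matrix $F^{(1)}_{1\ldots\alpha}$ from the left on the above equation, we have
\[
\tilde\Phi^{(1)}_\alpha(v^{(1)}_1 \ldots v^{(1)}_{p_1}) \equiv |\tilde\Phi^{(1)}_\alpha\rangle = F^{(1)}_{1\ldots\alpha}\,|\Phi^{(1)}_\alpha\rangle = \sum_{d_1\ldots d_\alpha} \big(\tilde\Phi^{(1)}_\alpha\big)^{d_1\ldots d_\alpha}\, |d_1\ldots d_\alpha\rangle^{(1)}. \tag{6.30}
\]
It follows that
\begin{align*}
\big(\tilde\Phi^{(1)}_\alpha\big)^{1\ldots12\ldots2} &= \langle 1\ldots12\ldots2|\tilde\Phi^{(1)}_\alpha\rangle = \langle 1\ldots12\ldots2|\,F^{(1)}_{1\ldots\alpha}\,|\Phi^{(1)}_\alpha\rangle \\
&= \langle 1\ldots12\ldots2| \sum_{\sigma\in S_\alpha} {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(\alpha)}}}^{\!\!*}\; \prod_{j=1}^{\alpha} P^{\alpha_{\sigma(j)}}_{\sigma(j)}\, \bar S^{(1)}(c,\sigma,\alpha_\sigma)\, r^{\sigma}_{1\ldots\alpha}\, |\Phi^{(1)}_\alpha\rangle \tag{6.31} \\
&= \langle 1\ldots12\ldots2| \left. {\sum_{\alpha_{\sigma(1)}\ldots\alpha_{\sigma(\alpha)}}}^{\!\!*}\; \prod_{j=1}^{\alpha} P^{\alpha_{\sigma(j)}}_{\sigma(j)}\, \bar S^{(1)}(c,\sigma,\alpha_\sigma) \right|_{\sigma=\mathrm{id}} |\Phi^{(1)}_\alpha\rangle \tag{6.32} \\
&= t(c)\, \langle 1\ldots12\ldots2|\Phi^{(1)}_\alpha\rangle = t(c)\, \big(\Phi^{(1)}_\alpha\big)^{1\ldots12\ldots2}, \tag{6.33}
\end{align*}
where the scalar factor $t(c)$ is given by (6.27). Summarizing, we propose the following form of the $U_q(gl(2|1))$ Bethe vector:
\begin{align*}
\tilde\Phi^{p_1}_N(v_1,\ldots,v_\alpha) &= \frac{1}{p_1!\,(\alpha-p_1)!} \sum_{\sigma\in S_\alpha} c^{\sigma}_{1\ldots\alpha} \prod_{i=1}^{p_1} \prod_{j=p_1+1}^{\alpha} \frac{2\sinh(v_{\sigma(i)} - v_{\sigma(j)})\cosh\eta}{\sinh(v_{\sigma(i)} - v_{\sigma(j)} + \eta)} \\
&\quad\times B^{(1)}_{p_1}(v^{(1)}_1,\ldots,v^{(1)}_{p_1} \,|\, v_{\sigma(1)},\ldots,v_{\sigma(p_1)}) \\
&\quad\times \prod_{k=1}^{p_1} \prod_{l=p_1+1}^{\alpha} \left( -\frac{1}{a(v_{\sigma(l)},v_{\sigma(k)})} \right) \tilde C_2(v_{\sigma(p_1+1)}) \ldots \tilde C_2(v_{\sigma(\alpha)}) \\
&\quad\times \tilde C_1(v_{\sigma(1)}) \ldots \tilde C_1(v_{\sigma(p_1)})\,|vac\rangle. \tag{6.34}
\end{align*}
Substituting (5.29) and (5.30) into the above relation, we finally have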
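Proposition 6. The nested Bethe vector of the quantum t-J model is given by
\begin{align*}
\tilde\Phi^{p_1}_N(v_1,\ldots,v_\alpha) &= \frac{1}{p_1!\,(\alpha-p_1)!} \sum_{i_1<\ldots<i_{p_1}}\; \sum_{i_{p_1+1}<\ldots<i_\alpha} B^{(1)}_{\alpha,p_1}(v_1,\ldots,v_\alpha;\, v^{(1)}_1,\ldots,v^{(1)}_{p_1} \,|\, z_{i_1},\ldots,z_{i_\alpha}) \\
&\quad\times \prod_{j=p_1+1}^{\alpha} E^{2,3}_{(i_j)} \prod_{j=1}^{p_1} E^{1,3}_{(i_j)}\,|vac\rangle, \tag{6.35}
\end{align*}
where $\{i_1, i_2, \ldots, i_{p_1}\} \cap \{i_{p_1+1}, i_{p_1+2}, \ldots, i_\alpha\} = \emptyset$ and
\begin{align*}
&B^{(1)}_{\alpha,p_1}(v_1,\ldots,v_\alpha;\, v^{(1)}_1,\ldots,v^{(1)}_{p_1} \,|\, z_{i_1},\ldots,z_{i_\alpha}) \\
&\quad= \sum_{\sigma\in S_\alpha} c^{\sigma}_{1\ldots\alpha} \prod_{i=1}^{p_1} \prod_{j=p_1+1}^{\alpha} \frac{2\sinh(v_{\sigma(i)} - v_{\sigma(j)})\cosh\eta}{\sinh(v_{\sigma(i)} - v_{\sigma(j)} + \eta)} \\
&\qquad\times \prod_{k=1}^{p_1} \prod_{l=p_1+1}^{\alpha} \left( -\frac{\sinh(v_{\sigma(l)} - z_{i_k})\,\sinh(v_{\sigma(l)} - v_{\sigma(k)} + \eta)}{\sinh(v_{\sigma(l)} - v_{\sigma(k)})\,\sinh(v_{\sigma(l)} - z_{i_k} + \eta)} \right) \\
&\qquad\times B^{*}_{\alpha-p_1}(v_{\sigma(p_1+1)},\ldots,v_{\sigma(\alpha)} \,|\, z_{i_{p_1+1}},\ldots,z_{i_\alpha}) \\
&\qquad\times B^{(1)}_{p_1}(v^{(1)}_1,\ldots,v^{(1)}_{p_1} \,|\, v_{\sigma(1)},\ldots,v_{\sigma(p_1)})\; B^{*}_{p_1}(v_{\sigma(1)},\ldots,v_{\sigma(p_1)} \,|\, z_{i_1},\ldots,z_{i_{p_1}}). \tag{6.36}
\end{align*}
Here the function $B^{*}_p(v_1,\ldots,v_p \,|\, z_1,\ldots,z_p)$ is given by
\[
B^{*}_p(v_1,\ldots,v_p \,|\, z_1,\ldots,z_p) = \sum_{\sigma\in S_p} \mathrm{sign}(\sigma) \prod_{k=1}^{p} \frac{e^{-(v_k - z_{\sigma(k)})}\sinh\eta}{\sinh(v_k - z_{\sigma(k)} + \eta)} \prod_{l=k+1}^{p} \frac{2\sinh(v_k - z_{\sigma(l)})\cosh\eta}{\sinh(v_k - z_{\sigma(l)} + \eta)}. \tag{6.37}
\]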
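For small $p$, the signed sum in (6.37) can be evaluated by brute force over all permutations. The following is a minimal sketch, not part of the paper: it transcribes (6.37) as read here, the function names are hypothetical, and the inputs are assumed to be real floats away from the poles of the summand.

```python
import math
from itertools import permutations


def permutation_sign(perm):
    # Sign of a permutation, computed from its number of inversions.
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def b_star(vs, zs, eta):
    """Brute-force evaluation of B*_p(v_1..v_p | z_1..z_p) as in (6.37)."""
    p = len(vs)
    total = 0.0
    for sigma in permutations(range(p)):
        term = float(permutation_sign(sigma))
        for k in range(p):
            zk = zs[sigma[k]]
            # Factor attached to the pair (v_k, z_sigma(k)).
            term *= (math.exp(-(vs[k] - zk)) * math.sinh(eta)
                     / math.sinh(vs[k] - zk + eta))
            # Factors for l = k+1, ..., p.
            for l in range(k + 1, p):
                zl = zs[sigma[l]]
                term *= (2.0 * math.sinh(vs[k] - zl) * math.cosh(eta)
                         / math.sinh(vs[k] - zl + eta))
        total += term
    return total
```

The cost grows as $p!$, so this is only meant for checking closed-form manipulations on a few sites.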
7. Discussions

We have constructed the factorizing F-matrices for the supersymmetric model associated with the quantum superalgebra $U_q(gl(m|n))$ with generic $m$ and $n$, which includes the quantum supersymmetric t-J model as a special case. We have obtained the completely symmetric representations for the creation operators of the model in the F-basis. Our results make possible a complete resolution of the hierarchy of its nested Bethe vectors. As an example, we have given the explicit expressions of the Bethe vectors of the quantum t-J model in the F-basis. Our results are new even for the special case $m = 0$, which gives the results of the general model associated with $U_q(gl(n))$. (For $m = 0$ and $n = 2$, our results reduce to those in [1].)

The authors of [22] solved the quantum inverse problem of the supersymmetric t-J model in the original basis. Namely, they reconstructed the local operators $(E^{i,j}_{(k)})$ in terms of operators figuring in the $gl(2|1)$ monodromy matrix. Their results should be generalizable to the $U_q(gl(2|1))$ case. Then, together with the results of the present paper in the F-basis, one should be able to get the exact representations of form factors and correlation functions of the quantum supersymmetric t-J model. These are under investigation and results will be reported elsewhere.

Acknowledgements. This work was financially supported by the Australian Research Council. S.-Y. Zhao has also been supported by the UQ Postdoctoral Research Fellowship.
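Appendix A: Proof of Theorem 1

Proposition 2 allows one to derive the following equations:
\begin{align*}
\Delta(E^{n+m-l,n+m})_{0\ldots N} &= E_{n+m-l,n+m} + e_{n+m-l,n+m}\, q^{\sum_{k=1}^{l} H_{n+m-k}} \\
&\quad + \sum_{\alpha=1}^{l-1} \big(1 - q^{-2(-1)^{[n+m-\alpha]}}\big)\, e_{n+m-\alpha,n+m}\, E_{n+m-l,n+m-\alpha}\, q^{\sum_{k=1}^{\alpha} H_{n+m-k}}, \\
&\qquad l = 1,\ldots,n+m-1, \tag{A.1}
\end{align*}
\begin{align*}
P^{\sigma_c}\, \Delta(E^{n+m-l,n+m})_{0\ldots N}\, (P^{\sigma_c})^{-1} &= e_{n+m-l,n+m} + q^{\sum_{k=1}^{l} h_{n+m-k}}\, E_{n+m-l,n+m} \\
&\quad + \sum_{\alpha=1}^{l-1} \big(1 - q^{-2(-1)^{[n+m-\alpha]}}\big)\, e_{n+m-l,n+m-\alpha}\, q^{\sum_{k=1}^{\alpha} h_{n+m-k}}\, E_{n+m-\alpha,n+m}, \\
&\qquad l = 1,\ldots,n+m-1. \tag{A.2}
\end{align*}
Taking $x = E^{n+m-l,n+m}$ and $\sigma = \sigma_c$ and using (3.14), (3.19) becomes
\[
T(u)\, \Delta(E^{n+m-l,n+m})_{0\ldots N} = P^{\sigma_c}\, \Delta(E^{n+m-l,n+m})_{0\ldots N}\, (P^{\sigma_c})^{-1}\, T(u). \tag{A.3}
\]
Substituting (3.20), (A.1) and (A.2) into the above equation, we have, for the L.H.S. of the resulting relation,
\begin{align*}
\text{L.H.S.} &= \sum_{i=1}^{n+m} (-1)^{([i]+[n+m-l])([n+m-l]+[n+m]+1)}\, e_{i,n+m}\, T_{i,n+m-l}(u)\, q^{\sum_{k=1}^{l} H_{n+m-k}} \\
&\quad + \sum_{i,j=1}^{n+m} e_{i,j}\, (-1)^{[i]+[j]}\, T_{i,j}(u)\, E_{n+m-l,n+m} \\
&\quad + \sum_{i=1}^{n+m} \sum_{\alpha=1}^{l-1} \big(1 - q^{-2(-1)^{[n+m-\alpha]}}\big)\, (-1)^{([i]+[n+m-\alpha])([n+m-\alpha]+[n+m]+1)} \\
&\qquad\times e_{i,n+m}\, T_{i,n+m-\alpha}(u)\, E_{n+m-l,n+m-\alpha}\, q^{\sum_{k=1}^{\alpha} H_{n+m-k}}. \tag{A.4}
\end{align*}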
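Similarly, for the R.H.S. of the resulting relation, we obtain
\begin{align*}
\text{R.H.S.} &= \sum_{i,j=1}^{n+m} (-1)^{([n+m-l]+[n+m]+1)([i]+[j])}\, q^{\sum_{k=1}^{l} h_{n+m-k}}\, e_{i,j}\, E_{n+m-l,n+m}\, T_{i,j}(u) \\
&\quad + \sum_{i=1}^{n+m} (-1)^{[n+m]+[i]}\, e_{n+m-l,i}\, T_{n+m,i}(u) \\
&\quad + \sum_{i=1}^{n+m} \sum_{\alpha=1}^{l-1} \big(1 - q^{-2(-1)^{[n+m-\alpha]}}\big)\, (-1)^{([n+m-\alpha]+[i])([n+m-\alpha]+[n+m]+1)} \\
&\qquad\times e_{n+m-l,n+m-\alpha}\, q^{\sum_{k=1}^{\alpha} h_{n+m-k}}\, e_{n+m-\alpha,i}\, E_{n+m-\alpha,n+m}\, T_{n+m-\alpha,i}(u). \tag{A.5}
\end{align*}
Comparing the coefficients of the $e_{n+m,n+m}$ term on both sides, we obtain
\begin{align*}
q^{-(-1)^{[n+m]}}\, E_{n+m-l,n+m}\, T_{n+m,n+m}(u) &= T_{n+m,n+m}(u)\, E_{n+m-l,n+m} + T_{n+m,n+m-l}(u)\, q^{\sum_{k=1}^{l} H_{n+m-k}} \\
&\quad + \sum_{\alpha=1}^{l-1} \big(1 - q^{-2(-1)^{[n+m-\alpha]}}\big)\, T_{n+m,n+m-\alpha}(u)\, E_{n+m-l,n+m-\alpha}\, q^{\sum_{k=1}^{\alpha} H_{n+m-k}}, \tag{A.6}
\end{align*}
which leads to the recursive relation (3.37).

Appendix B: The Exchange Symmetry of the Bethe Vector

For the Bethe vector $\Phi_N(v_1,\ldots,v_\alpha)$ of the quantum supersymmetric t-J model, we define the exchange operator $\hat f_\sigma = \hat f_{\sigma_{i_1}} \ldots \hat f_{\sigma_{i_k}}$ by
\[
\hat f_\sigma\, \Phi_N(v_1,v_2,\ldots,v_\alpha) = \Phi_N(v_{\sigma(1)},v_{\sigma(2)},\ldots,v_{\sigma(\alpha)}), \tag{B.1}
\]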
where $\sigma\in S_\alpha$, and $\{\sigma_i\}$ are the elementary permutations of $S_\alpha$. We first study the exchange symmetry for the elementary exchange operator $\hat f_{\sigma_i}$, which exchanges the parameters $v_i$ and $v_{i+1}$. Acting with $\hat f_{\sigma_i}$ on the Bethe vector of $U_q(gl(2|1))$ (6.7), we have
\begin{align*}
\hat f_{\sigma_i}\, \Phi_N(v_1,v_2,\ldots,v_\alpha) &= \Phi_N(v_1,\ldots,v_{i+1},v_i,\ldots,v_\alpha) \\
&= \sum_{d_1,\ldots,d_\alpha} \big(\Phi^{(1),\sigma_i}_\alpha\big)^{d_1\ldots d_\alpha}\, C_{d_1}(v_1) \ldots C_{d_i}(v_{i+1})\, C_{d_{i+1}}(v_i) \ldots C_{d_\alpha}(v_\alpha)\,|vac\rangle, \tag{B.2}
\end{align*}
where $\{(\Phi^{(1),\sigma_i}_\alpha)^{d_1\ldots d_\alpha}\}$ are the vector components of the nested Bethe vector $\Phi^{(1),\sigma_i}_\alpha$ constructed by the nested monodromy matrix
\[
T^{(1),\sigma_i}(u) = L^{(1)}_\alpha(u,v_\alpha) \ldots L^{(1)}_{i+1}(u,v_i)\, L^{(1)}_i(u,v_{i+1}) \ldots L^{(1)}_1(u,v_1), \tag{B.3}
\]
where the local L-operator is defined by $L^{(1)}_i(u,v) = r_{0i}(u,v)$.
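From the GYBE (3.15), one can derive the commutation relation between $C_i(u)$ and $C_j(v)$, which is given by
\[
C_i(u)\, C_j(v) = \sum_{k,l} \check r(u,v)^{kl}_{ij}\, C_k(v)\, C_l(u). \tag{B.4}
\]
Here the braided r-matrix is $\check r(u,v) \equiv P\, r(u,v)$, where $P$ permutes the tensor product spaces of the 2-dimensional $U_q(gl(2))$-module. Then, by (B.4), (B.2) becomes
\begin{align*}
\hat f_{\sigma_i}\, \Phi_N(v_1,v_2,\ldots,v_\alpha) &= \sum_{d_1,\ldots,d_\alpha} \big(\Phi^{(1),\sigma_i}_\alpha\big)^{d_1\ldots d_\alpha}\, C_{d_1}(v_1) \ldots \\
&\quad\times \big(\check r(v_{i+1},v_i)\big)^{kl}_{d_i d_{i+1}}\, C_k(v_i)\, C_l(v_{i+1}) \ldots C_{d_\alpha}(v_\alpha)\,|vac\rangle. \tag{B.5}
\end{align*}
We now compute the action of $(\check r(v_{i+1},v_i))^{kl}_{d_i d_{i+1}}$ on $(\Phi^{(1),\sigma_i}_\alpha)^{d_1\ldots d_\alpha}$. One checks that the $\check r$-matrix satisfies the YBE,
\[
\check r_{i\,i+1}(v_{i+1},v_i)\, L^{(1)}_{i+1}(u,v_i)\, L^{(1)}_i(u,v_{i+1}) = L^{(1)}_{i+1}(u,v_{i+1})\, L^{(1)}_i(u,v_i)\, \check r_{i\,i+1}(v_{i+1},v_i). \tag{B.6}
\]
Therefore, acting with $\check r$ on $T^{(1),\sigma_i}(u)$, we have
\[
\check r_{i\,i+1}(v_{i+1},v_i)\, T^{(1),\sigma_i}(u) = T^{(1)}(u)\, \check r_{i\,i+1}(v_{i+1},v_i). \tag{B.7}
\]
Thus, because
\[
\check r_{i\,i+1}(v_{i+1},v_i)\, v_2\otimes v_2 = c_{i\,i+1}(v_{i+1},v_i)\, v_2\otimes v_2 = \frac{1}{c_{i\,i+1}}\, v_2\otimes v_2,
\]
we obtain
\[
\sum_{d_i d_{i+1}} \big(\check r(v_{i+1},v_i)\big)^{kl}_{d_i d_{i+1}}\, \big(\Phi^{(1),\sigma_i}_\alpha\big)^{d_1\ldots d_i d_{i+1}\ldots d_\alpha} = \frac{1}{c_{i\,i+1}}\, \big(\Phi^{(1)}_\alpha\big)^{d_1\ldots kl\ldots d_\alpha}. \tag{B.8}
\]
Changing the indices $k, l$ to $d_i, d_{i+1}$, respectively, and substituting the above relation into (B.5), we obtain the exchange symmetric relation of the Bethe vector of $U_q(gl(2|1))$,
\[
\hat f_{\sigma_i}\, \Phi_N(v_1,v_2,\ldots,v_\alpha) = \frac{1}{c_{i\,i+1}}\, \Phi_N(v_1,v_2,\ldots,v_\alpha), \tag{B.9}
\]
for the elementary permutation operator $\sigma_i$. It follows that under the action of the exchange operator $\hat f_\sigma$,
\[
\hat f_\sigma\, \Phi_N(v_1,v_2,\ldots,v_\alpha) = \frac{1}{c^{\sigma}_{1\ldots\alpha}}\, \Phi_N(v_1,v_2,\ldots,v_\alpha), \tag{B.10}
\]
where $c^{\sigma}_{1\ldots\alpha}$ is defined in (6.19).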
References
1. Maillet, J.M., Sanchez de Santos, I.: Drinfel'd twists and algebraic Bethe ansatz. http://arxiv.org/qalg/9612012, 1996
2. Drinfeld, V.G.: Constant quasi-classical solutions of the Yang-Baxter quantum equation. Sov. Math. Dokl. 28, 667 (1983)
3. Kitanine, N., Maillet, J.M., Terras, V.: Form factors of the XXZ Heisenberg spin-1/2 finite chain. Nucl. Phys. B 554, 647 (1999)
4. Izergin, A.G., Kitanine, N., Maillet, J.M., Terras, V.: Spontaneous magnetization of the XXZ Heisenberg spin-1/2 chain. Nucl. Phys. B 554, 679 (1999)
5. Korepin, V.E., Bogoliubov, N.M., Izergin, A.G.: Quantum Inverse Scattering Method and correlation Function. Cambridge: Cambridge Univ. Press, 1993
6. Terras, V.: Drinfel'd twists and functional Bethe ansatz. Lett. Math. Phys. 48, 263 (1999)
7. Albert, T.-D., Boos, H., Flume, R., Ruhlig, K.: Resolution of the nested hierarchy for the rational sl(n) models. J. Phys. A 33, 4963 (2000)
8. Albert, T.-D., Boos, H., Flume, R., Poghossian, R.H., Ruhlig, K.: An F-twisted XYZ model. Lett. Math. Phys. 53, 201 (2000)
9. Albert, T.-D., Ruhlig, K.: Polarization free generators for the Belavin model. J. Phys. A 34, 1569 (2001)
10. Yang, W.-L., Zhang, Y.-Z., Zhao, S.-Y.: Drinfeld twists and algebraic Bethe Ansatz of the supersymmetric t-J model. JHEP 12, 038 (2004)
11. Zhao, S.-Y., Yang, W.-L., Zhang, Y.-Z.: Drinfeld twists and symmetric Bethe vectors of supersymmetric Fermion model. JSTAT, P04005 (2005)
12. Perk, J.H.H., Schultz, C.L.: New families of commuting transfer matrices in q-state vertex models. Phys. Lett. A84, 407 (1981)
13. Kulish, P.P., Sklyanin, E.K.: On solutions of the Yang-Baxter equation. J. Soviet Math. 19, 1596 (1982)
14. Kulish, P.P.: Integrable graded magnets. J. Soviet Math. 35, 2648 (1986)
15. Essler, F.H.L., Korepin, V.E., Schoutens, K.: New exactly solvable model of the strongly correlated electrons motivated by high-Tc superconductivity. Phys. Rev. Lett. 68, 2960 (1992)
16. Bracken, A.G., Gould, M.D., Links, J.R., Zhang, Y.-Z.: A new supersymmetric and exactly solvable model of correlated electrons. Phys. Rev. Lett. 74, 2768 (1995)
17. Yamane, H.: On defining relations of the affine Lie superalgebras and their quantized universal enveloping superalgebras. http://arxiv.org/abs/q-alg/9603015, 1996
18. Bazhanov, V.V., Shadrikov, A.G.: Trigonometric solutions of the triangle equations and simple Lie superalgebras. Theor. Math. Phys. 73, 1302 (1988)
19. Faddeev, L.D., Sklyanin, E.K., Takhtajan, L.A.: The quantum inverse scattering method. Theor. Math. Phys. 40, 688 (1979)
20. Takhtajan, L.A.: Quantum inverse scattering method and algebraized matrix Bethe-Ansatz. J. Sov. Math. 23, 2470 (1983)
21. de Vega, H.J.: Yang-Baxter algebras, integrable theories and quantum groups. Int. J. Mod. Phys. A 44, 2371 (1989)
22. Göhmann, F., Korepin, V.E.: Solution of the quantum inverse problem. J. Phys. A 33, 1199 (2000)

Communicated by L. Takhtajan
Commun. Math. Phys. 264, 115–144 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1541-8
Communications in
Mathematical Physics
2-Matrix versus Complex Matrix Model, Integrals over the Unitary Group as Triangular Integrals
B. Eynard (1), A. Prats Ferrer (2)
(1) Service de Physique Théorique de Saclay, CEA/DSM/SPhT - CNRS/SPM/URA 2306, 91191 Gif-sur-Yvette Cedex, France. E-mail: [email protected]
(2) Universitad Barcelona, Departament d’Estructura i constituents de la Matèria, Av. Diagonal 647, 08028 Barcelona, Spain. E-mail: [email protected]
Received: 2 March 2005 / Accepted: 20 September 2005 Published online: 9 March 2006 – © Springer-Verlag 2006
Abstract: We prove that the 2-hermitian matrix model and the complex-matrix model obey the same loop equations, and as a byproduct, we find a formula for Itzykson-Zuber type integrals over the unitary group. Integrals over U(n) are rewritten as gaussian integrals over triangular matrices and then computed explicitly. This formula is an efficient alternative to Shatashvili's earlier formula.

1. Introduction

It has long been noticed that the so-called "Two-Hermitian-Matrix-Model" (introduced in particular for quantum gravity [19, 9]) and the so-called "Complex-Matrix-Model" (used in particular for its applications to Laplacian growth models [27, 28], and string theory) share many similarities: they have the same leading large N expansion properties, and both are associated with ensembles of biorthogonal polynomials which formally have the same properties. Here, we make this correspondence more precise: we prove that both models have the same loop equations. The two models are not defined for the same weights; in fact, the set of weights for which one model is well defined has no intersection with the set of weights for which the other model is well defined. However, each model can be analytically continued to a larger set of weights, and in that sense, the two models coincide.

When written in terms of eigenvalues, this identification of the 2-hermitian-matrix-model and the complex-matrix-model has an interesting corollary: it gives a formula for computing integrals (of the Itzykson-Zuber type) over the unitary group, as gaussian integrals over triangular matrices. Therefore, we obtain a very explicit formula for all correlators of Shatashvili type [24]. In [24] S. Shatashvili found a formula for all U(n) correlation functions, but his formula still contains integrals, is not explicitly symmetric in all variables, and is very difficult to use for practical purposes, such as [5]. In the particular case of the 2-point correlation function, Morozov found a much simpler formula [23]. In [23] A. Morozov computed it for U(n) with n ≤ 3 and conjectured it for n > 3. Morozov's formula was later proven for all n in [5], and written in an even simpler form [12]. Here, we find a natural generalization of Morozov's formula. The formula we find here contains no integration; it gives the U(n) correlation functions as the sum of a finite number of terms, and is very efficient for effective computations. It also provides an alternative new proof of the Itzykson-Zuber formula. The derivations proposed in this article are elementary, and it would be interesting to put them in the more general framework of group representation theory [20, 16].

The main results presented in this paper are:
• Theorem 3.3 and in particular Remark 3.3, which state the equivalence between the Hermitian-2-matrix model and the complex-matrix model:
\[
\int_{H_n\times H_n} dM_1\, dM_2\; F(M_1,M_2)\; e^{-\mathrm{Tr}\left[\frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2\right]}
\equiv \int_{GL_n(\mathbb{C})} dZ\; F(Z,Z^\dagger)\; e^{-\mathrm{Tr}\left[\frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger\right]}. \tag{1.1}
\]
The definitions of each term and the meaning of that equality are explained in Sect. 3.3.
• Theorem 4.1, which allows one to compute U(n) integrals as triangular integrals:
\[
\int_{U(n)} dU\; F(X, UYU^\dagger)\; e^{-\mathrm{Tr}\, X U Y U^\dagger}
\;\propto\; \frac{\displaystyle\sum_{\sigma,\tau\in\Sigma(n)} (-1)^{\sigma} (-1)^{\tau}\, e^{-\mathrm{Tr}\, X_\sigma Y_\tau} \int_{T(n)} dT\; F(X_\sigma + T, Y_\tau + T^\dagger)\; e^{-\mathrm{Tr}\, T T^\dagger}}{\Delta(X)\,\Delta(Y)} \tag{1.2}
\]
for any polynomial invariant function F.
• Theorem 5.2, which gives a formula for computing triangular matrix gaussian integrals. We parametrize polynomial invariant functions by pairs of permutations (of some size R), and a basis is written $F_{\pi,\pi'}$. Theorem 5.2 gives the result of integration over triangular matrices:
\[
\frac{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, T T^\dagger}\, F_{\pi,\pi'}(\vec x, \vec y, X + T, Y + T^\dagger)}{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, T T^\dagger}}
= \Big( M^{(R)}(\vec x,\vec y,X_n,Y_n)\; M^{(R)}(\vec x,\vec y,X_{n-1},Y_{n-1}) \ldots M^{(R)}(\vec x,\vec y,X_1,Y_1) \Big)_{\pi,\pi'}, \tag{1.3}
\]
where $M^{(R)}(\vec x,\vec y,X_n,Y_n)$ is the matrix of size $R!$, indexed by pairs of permutations:
\[
M^{(R)}_{\pi,\rho}(\vec x,\vec y,X_n,Y_n) = \prod_{i=1}^{R} \left( \delta_{\pi(i),\rho(i)} + \frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)} \right). \tag{1.4}
\]
Theorem 5.3 shows that the matrices in Eq. (1.3) commute with one another and can be simultaneously diagonalized.
• Theorem 6.1, which gives a formula for computing correlation functions in terms of biorthogonal polynomials:
\[
\frac{\int_{H_n\times H_n} dM_1\, dM_2\; F_{\pi,\pi'}(\vec x,\vec y,M_1,M_2)\; e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1 M_2)}}{\int_{H_n\times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+\gamma M_1 M_2)}}
= \mathrm{Mdet}\Big( M^{(R)}(\vec x,\vec y,Q,P^{t}) \Big)_{\pi,\pi'}, \tag{1.5}
\]
where notations are explained in Sect. 6.2.

Outline:
• In Part 2 we give definitions of groups and measures.
• In Part 3, we prove the equivalence between the Hermitian-2-matrix model and the complex-matrix model, by showing that they have the same loop equations.
• In Part 4, we prove the identity between U(n) integrals and triangular integrals, and give some examples. In particular we rederive Itzykson-Zuber's formula and Morozov's formula.
• In Part 5, we compute the triangular integrals, by parametrizing polynomial invariant functions with pairs of permutations. In particular we compute explicitly all four point functions.
• In Part 6, we integrate over eigenvalues using biorthogonal polynomial techniques, and get expressions for correlation functions.
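The matrix $M^{(R)}$ of Eq. (1.4) and the commutation property attributed to Theorem 5.3 can be explored with exact rational arithmetic. The following is only a sketch under stated assumptions: it uses the reading of (1.4) given above, the function names are hypothetical, and the inputs are taken to be integers chosen so that no denominator vanishes.

```python
from fractions import Fraction
from itertools import permutations


def m_matrix(xs, ys, X, Y):
    """R! x R! matrix with entries
    prod_i ( delta_{pi(i),rho(i)} + 1 / ((x_i - X) (y_{pi(i)} - Y)) ),
    rows indexed by pi and columns by rho (integer inputs assumed)."""
    R = len(xs)
    perms = list(permutations(range(R)))
    rows = []
    for pi in perms:
        row = []
        for rho in perms:
            entry = Fraction(1)
            for i in range(R):
                delta = 1 if pi[i] == rho[i] else 0
                entry *= delta + Fraction(1, (xs[i] - X) * (ys[pi[i]] - Y))
            row.append(entry)
        rows.append(row)
    return rows


def matmul(A, B):
    # Plain exact matrix product for square matrices of equal size.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

For R = 2 one can check by hand that the two off-diagonal entries coincide and that the ratio of the off-diagonal entry to the difference of the diagonal entries is $1/((x_2-x_1)(y_2-y_1))$, independent of $(X, Y)$. Two such matrices built with different $(X, Y)$ therefore commute, which is the smallest instance of the statement.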
2. Definitions

2.1. Ensembles. Let
• $U(n)$ := group of $n\times n$ unitary matrices, with the normalized Haar measure.
• $H_n$ := group of $n\times n$ hermitian matrices, with the Lebesgue measure:
\[
dM := \prod_{i} dM_{ii} \prod_{i<j} d\mathrm{Re}M_{ij}\, d\mathrm{Im}M_{ij}. \tag{2.1}
\]
• $GL_n(\mathbb{C})$ := group of $n\times n$ complex matrices, with the Lebesgue measure:
\[
dZ := \prod_{i,j} d\mathrm{Re}Z_{ij}\, d\mathrm{Im}Z_{ij}. \tag{2.2}
\]
• $T_n$ := group of $n\times n$ strictly upper triangular complex matrices, with the Lebesgue measure:
\[
dT := \prod_{i<j} d\mathrm{Re}T_{ij}\, d\mathrm{Im}T_{ij}. \tag{2.3}
\]
• $D_n(\mathbb{R})$ := group of $n\times n$ real diagonal matrices, with the Lebesgue measure:
\[
dX := \prod_{i} dX_{ii}. \tag{2.4}
\]
• $D_n(\mathbb{C})$ := group of $n\times n$ complex diagonal matrices, with the Lebesgue measure:
\[
dX := \prod_{i} d\mathrm{Re}X_{ii}\, d\mathrm{Im}X_{ii}. \tag{2.5}
\]
• $\Sigma(n)$ := group of permutations of $n$ elements.
2.2. Vandermonde determinant. For any diagonal matrix $X = \mathrm{diag}(X_1,\dots,X_n) \in D_n(\mathbb{C})$, one writes:
\[ \Delta(X) := \prod_{i<j} (X_i - X_j) \tag{2.6} \]
and, for any permutation $\sigma \in \Sigma(n)$, we define the diagonal matrix:
\[ X_\sigma := \mathrm{diag}(X_{\sigma(1)},\dots,X_{\sigma(n)}). \tag{2.7} \]
Notice that
\[ \Delta(X_\sigma) = (-1)^{\sigma}\, \Delta(X). \tag{2.8} \]
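The sign rule of Eq. 2.8 can be checked by brute force for small $n$. The following is a minimal sketch in Python; the function names (`vandermonde`, `sign`, `check_sign_rule`) are illustrative choices and not notation from the text, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import permutations


def vandermonde(xs):
    """Delta(X) = prod_{i<j} (X_i - X_j), following the convention of Eq. (2.6)."""
    result = Fraction(1)
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            result *= xs[i] - xs[j]
    return result


def sign(perm):
    """Signature of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def check_sign_rule(xs):
    """True if Delta(X_sigma) == (-1)^sigma * Delta(X) for every permutation sigma."""
    base = vandermonde(xs)
    return all(
        vandermonde([xs[p] for p in perm]) == sign(perm) * base
        for perm in permutations(range(len(xs)))
    )


# Arbitrary sample eigenvalues (an assumption made for illustration only).
sample = [Fraction(2), Fraction(-1), Fraction(5), Fraction(3, 2)]
print(check_sign_rule(sample))
```

The check runs over all $n!$ permutations, so it is only meant for small sizes.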
2.3. Invariant functions.
Definition 2.1. $F(A,B)$ defined on $GL_n(\mathbb{C}) \times GL_n(\mathbb{C}) \to \mathbb{C}$ is an analytical invariant function if:
• $F$ is analytical in each variable,
• $\forall U \in GL^*_n(\mathbb{C})$, $F(UAU^{-1}, UBU^{-1}) = F(A,B)$.
Examples.
\[ F(A,B) = \prod_{t=1}^{p} \mathrm{Tr} \left( \prod_{r_t=1}^{R_t} (x_{t,r_t} - A)(y_{t,r_t} - B) \right), \tag{2.9} \]
\[ F(A,B) = e^{-\mathrm{Tr}\, V_1(A)}\, e^{-\mathrm{Tr}\, V_2(B)}. \tag{2.10} \]
Definition 2.2. Monomial invariant functions are functions of the form:
\[ F(A,B) = \prod_{t=1}^{p} \mathrm{Tr} \left( \prod_{r_t=1}^{R_t} A^{k_{t,r_t}} B^{l_{t,r_t}} \right), \tag{2.11} \]
where the $k_{t,r_t}$'s and $l_{t,r_t}$'s are integers such that $k_{t,r_t} + l_{t,r_t} > 0$. The total degree is
\[ \deg F := \sum_{t=1}^{p} \sum_{r_t=1}^{R_t} \left( k_{t,r_t} + l_{t,r_t} \right). \tag{2.12} \]
Definition 2.3. Polynomial invariant functions are finite complex linear combinations of monomial invariant functions.
Examples of polynomial invariant functions.
\[
\begin{aligned}
F(A,B) &= \mathrm{Tr}\, A^{k_1} B^{l_1} A^{k_2} B^{l_2}, \\
F(A,B) &= \left(1 + \mathrm{Tr}\, A^{k_1} B^{l_1}\right)\left(1 + \mathrm{Tr}\, A^{k_2} B^{l_2}\right),
\end{aligned}
\tag{2.13}
\]
\[ F(A,B) = \prod_{t=1}^{p} \det(x_t - A)^{k_t} \prod_{u=1}^{q} \det(y_u - B)^{l_u}, \tag{2.14} \]
\[ F(A,B) = \det(A \otimes 1 - 1 \otimes B). \tag{2.15} \]
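As an illustration of Definitions 2.1 and 2.2, the sketch below evaluates a monomial invariant function of the form of Eq. 2.11 on a pair of $2 \times 2$ matrices and checks that it is unchanged under simultaneous conjugation. The encoding of a monomial as a list of "words" (one list of exponent pairs $(k, l)$ per trace) and all function names are assumptions made for this example only.

```python
from fractions import Fraction


def to_fractions(rows):
    """Convert a nested list of integers into exact rationals."""
    return [[Fraction(v) for v in row] for row in rows]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, k):
    result = identity(len(a))
    for _ in range(k):
        result = matmul(result, a)
    return result


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def inverse_2x2(u):
    det = u[0][0] * u[1][1] - u[0][1] * u[1][0]
    return [[u[1][1] / det, -u[0][1] / det],
            [-u[1][0] / det, u[0][0] / det]]


def monomial_invariant(a, b, words):
    """prod_t Tr( prod_r A^k B^l ), with words[t] the list of (k, l) pairs of trace t."""
    value = Fraction(1)
    for word in words:
        product = identity(len(a))
        for k, l in word:
            product = matmul(product, matmul(matpow(a, k), matpow(b, l)))
        value *= trace(product)
    return value


def total_degree(words):
    """Total degree of Eq. (2.12): the sum of all exponents k + l."""
    return sum(k + l for word in words for k, l in word)


def conjugate(u, a):
    """U A U^{-1} for an invertible 2x2 matrix U."""
    return matmul(matmul(u, a), inverse_2x2(u))


# Sample data (arbitrary choices for illustration).
A = to_fractions([[1, 2], [3, 4]])
B = to_fractions([[0, 1], [5, -2]])
U = to_fractions([[2, 1], [1, 3]])          # invertible, det = 5
WORDS = [[(2, 1), (1, 3)], [(1, 1)]]        # Tr(A^2 B A B^3) * Tr(A B)

before = monomial_invariant(A, B, WORDS)
after = monomial_invariant(conjugate(U, A), conjugate(U, B), WORDS)
print(before == after, total_degree(WORDS))
```

The invariance follows from cyclicity of the trace, so the equality holds exactly in rational arithmetic.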
2.4. Decompositions.
2.4.1. Diagonalization. It is a standard result in algebra (see [22, 16, 20] for instance), that any hermitian matrix $M \in H_n$ can be written:
\[ M = U X U^\dagger, \tag{2.16} \]
where $U \in U(n)$ and $X \in D_n(\mathbb{R})$. The measure is then:
\[ dM = \tilde J_n\, \Delta^2(X)\, dU\, dX, \tag{2.17} \]
where the Jacobian is
\[ \tilde J_n = \frac{\pi^{\frac{n(n-1)}{2}}}{\prod_{k=0}^{n-1} k!}. \tag{2.18} \]
This decomposition is not unique. It is unique up to a permutation of eigenvalues, and up to multiplication of $U$ by a diagonal matrix whose elements are on the unit circle. In other words, $M = UXU^\dagger$ provides a mapping between $H_n$ and $U(n) \times D_n(\mathbb{R}) / (U(1)^n \times \Sigma(n))$.
2.4.2. Jordanization. A less standard result (see [22, 26, 20, 16] for instance), is that any complex matrix $Z \in GL_n(\mathbb{C})$ can be written:
\[ Z = U (X + T) U^\dagger, \tag{2.19} \]
where $U \in U(n)$, $T \in T_n$ and $X \in D_n(\mathbb{C})$. The measure is then:
\[ dZ = J_n\, |\Delta(X)|^2\, dU\, dT\, dX, \tag{2.20} \]
where the Jacobian is
\[ J_n = \frac{\left(\frac{\pi}{2}\right)^{\frac{n(n-1)}{2}}}{\prod_{k=0}^{n-1} k!}. \tag{2.21} \]
This decomposition is not unique. It is unique up to a permutation of eigenvalues, and up to multiplication of $U$ by a diagonal matrix whose elements are on the unit circle. In other words, $Z = U(X+T)U^\dagger$ provides a mapping between $GL_n(\mathbb{C})$ and $U(n) \times T_n \times D_n(\mathbb{C}) / (U(1)^n \times \Sigma(n))$.
3. Gaussian Matrix Integrals
In all that follows, we consider 3 complex numbers $\alpha_1$, $\alpha_2$ and $\gamma$, and we define
\[ \delta := \alpha_1 \alpha_2 - \gamma^2 \tag{3.1} \]
and assume that $\delta \neq 0$.
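3.1. Gaussian Hermitian Model. Consider the measure on $H_n \times H_n$:
\[ e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}\, dM_1\, dM_2. \tag{3.2} \]
Definition 3.1. The partition function is:
\[ Z_H(n,\gamma,\alpha_1,\alpha_2) := \int_{H_n \times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}. \tag{3.3} \]
Notice that the integral $Z_H$ is absolutely convergent only if
\[ \forall \phi \in \mathbb{R}, \qquad \mathrm{Re}\left( \alpha_1 e^{\phi} + \alpha_2 e^{-\phi} \pm 2\gamma \right) > 0, \tag{3.4} \]
which implies that $\mathrm{Re}\,\alpha_1 > 0$, $\mathrm{Re}\,\alpha_2 > 0$, $(\mathrm{Re}\,\gamma)^2 < \mathrm{Re}\,\alpha_1\, \mathrm{Re}\,\alpha_2$. An easy gaussian integral computation gives:
\[ Z_H = 2^n \left( \frac{\pi}{\sqrt{\delta}} \right)^{n^2}. \tag{3.5} \]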
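As a sanity check of Eq. 3.5, the case $n = 1$ can be worked out by hand. The sketch below assumes real $\alpha_1, \alpha_2, \gamma$ with a positive definite quadratic form, so that the standard two-dimensional gaussian formula applies directly; the variables $m_1, m_2$ are introduced here only for this check.

```latex
% n = 1: M_1 = m_1 and M_2 = m_2 are real numbers.
\[
Z_H(1,\gamma,\alpha_1,\alpha_2)
  = \int_{\mathbb{R}^2} dm_1\, dm_2\;
    \exp\!\left(-\frac12\,(m_1,\; m_2)
      \begin{pmatrix} \alpha_1 & \gamma \\ \gamma & \alpha_2 \end{pmatrix}
      \begin{pmatrix} m_1 \\ m_2 \end{pmatrix}\right)
  = \frac{2\pi}{\sqrt{\alpha_1\alpha_2-\gamma^2}}
  = 2\,\frac{\pi}{\sqrt{\delta}} ,
\]
% which is Eq. (3.5) at n = 1.
```

For general $n$ the same computation factorizes: each of the $n$ diagonal pairs $(M_{1,ii}, M_{2,ii})$ contributes $2\pi/\sqrt{\delta}$, and each real or imaginary part of the $n(n-1)/2$ off-diagonal pairs contributes $\pi/\sqrt{\delta}$, which reproduces $2^n (\pi/\sqrt{\delta})^{n^2}$.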
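Definition 3.2. The expectation value of an invariant function $F(A,B)$ is:
\[
\langle F \rangle_H := \frac{\int_{H_n \times H_n} dM_1\, dM_2\; F(M_1,M_2)\, e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}}{\int_{H_n \times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}}. \tag{3.6}
\]
Remark 3.1. It is clear, from Wick's theorem, that if $F$ is a monomial invariant function, then $\langle F \rangle_H$ is a polynomial in $\frac{\alpha_1}{\delta}$, $\frac{\alpha_2}{\delta}$ and $\frac{\gamma}{\delta}$, and can be analytically continued to every complex $\alpha_1, \alpha_2, \gamma$, provided that $\delta \neq 0$.
3.1.1. Gaussian Hermitian loop equations. Consider a monomial matrix valued function, of the form:
\[
f(A,B) = f_0(A,B) \prod_{t=1}^{p} \mathrm{Tr}\, f_t(A,B), \qquad \forall t = 0,\dots,p, \quad f_t(A,B) = \prod_{r_t=1}^{R_t} A^{k_{t,r_t}} B^{l_{t,r_t}}. \tag{3.7}
\]
Define
\[
G_0(A,B) := \prod_{u \neq 0} \mathrm{Tr}\, f_u(A,B), \qquad \text{and if } t \geq 1, \quad G_t(A,B) := \prod_{u \neq 0,t} \mathrm{Tr}\, f_u(A,B). \tag{3.8}
\]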
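Theorem 3.1. One has the "loop equations":
\[
\begin{aligned}
&\alpha_1 \big\langle G_0(M_1,M_2)\, \mathrm{Tr}\big( M_1 f_0(M_1,M_2) \big) \big\rangle_H + \gamma \big\langle G_0(M_1,M_2)\, \mathrm{Tr}\big( M_2 f_0(M_1,M_2) \big) \big\rangle_H \\
&\quad = \sum_{r=1}^{R_0} \sum_{j=0}^{k_{0,r}-1} \Big\langle G_0(M_1,M_2)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} M_1^{k_{0,u}} M_2^{l_{0,u}}\; M_1^{j} \Big)\, \mathrm{Tr}\Big( M_1^{k_{0,r}-j-1} M_2^{l_{0,r}} \prod_{u=r+1}^{R_0} M_1^{k_{0,u}} M_2^{l_{0,u}} \Big) \Big\rangle_H \\
&\qquad + \sum_{t=1}^{p} \sum_{r=1}^{R_t} \sum_{j=0}^{k_{t,r}-1} \Big\langle G_t(M_1,M_2)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} M_1^{k_{t,u}} M_2^{l_{t,u}}\; M_1^{j}\, f_0(M_1,M_2)\, M_1^{k_{t,r}-j-1} M_2^{l_{t,r}} \prod_{u=r+1}^{R_t} M_1^{k_{t,u}} M_2^{l_{t,u}} \Big) \Big\rangle_H
\end{aligned}
\tag{3.9}
\]
and
\[
\begin{aligned}
&\alpha_2 \big\langle G_0(M_1,M_2)\, \mathrm{Tr}\big( M_2 f_0(M_1,M_2) \big) \big\rangle_H + \gamma \big\langle G_0(M_1,M_2)\, \mathrm{Tr}\big( M_1 f_0(M_1,M_2) \big) \big\rangle_H \\
&\quad = \sum_{r=1}^{R_0} \sum_{j=0}^{l_{0,r}-1} \Big\langle G_0(M_1,M_2)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} M_1^{k_{0,u}} M_2^{l_{0,u}}\; M_1^{k_{0,r}} M_2^{j} \Big)\, \mathrm{Tr}\Big( M_2^{l_{0,r}-j-1} \prod_{u=r+1}^{R_0} M_1^{k_{0,u}} M_2^{l_{0,u}} \Big) \Big\rangle_H \\
&\qquad + \sum_{t=1}^{p} \sum_{r=1}^{R_t} \sum_{j=0}^{l_{t,r}-1} \Big\langle G_t(M_1,M_2)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} M_1^{k_{t,u}} M_2^{l_{t,u}}\; M_1^{k_{t,r}} M_2^{j}\, f_0(M_1,M_2)\, M_2^{l_{t,r}-j-1} \prod_{u=r+1}^{R_t} M_1^{k_{t,u}} M_2^{l_{t,u}} \Big) \Big\rangle_H .
\end{aligned}
\tag{3.10}
\]
Notice that the RHS is a linear combination of invariant polynomial functions of degree strictly lower than the LHS. Loop equations are a standard method for finding recursion relations among expectation values [9]; they were first studied by [25] for the 2-matrix model, and solved more explicitly by [14, 11, 13].
Proof. Write that the integral of a total derivative is zero:
\[
0 = \sum_{i} \int dM_1\, dM_2\; \frac{\partial}{\partial M_{1\,ii}} \left( f_{i,i}(M_1,M_2)\, e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \right), \tag{3.11}
\]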
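i.e.
\[
\begin{aligned}
&\sum_{i} \int dM_1\, dM_2\; \frac{\partial f_{i,i}(M_1,M_2)}{\partial M_{1\,ii}}\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \\
&\quad = \sum_{i} \int dM_1\, dM_2\; f_{i,i}(M_1,M_2)\, (\alpha_1 M_{1\,ii} + \gamma M_{2\,ii})\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}.
\end{aligned}
\tag{3.12}
\]
Similarly:
\[
\begin{aligned}
&\sum_{i<j} \int dM_1\, dM_2\; \frac{\partial f_{i,j}(M_1,M_2)}{\partial \mathrm{Re}M_{1\,ij}}\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \\
&\quad = \sum_{i<j} \int dM_1\, dM_2\; f_{i,j}(M_1,M_2) \left( \alpha_1 (M_{1\,ji} + M_{1\,ij}) + \gamma (M_{2\,ji} + M_{2\,ij}) \right) e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}
\end{aligned}
\tag{3.13}
\]
and
\[
\begin{aligned}
&\sum_{i<j} \int dM_1\, dM_2\; \frac{\partial f_{i,j}(M_1,M_2)}{\partial \mathrm{Im}M_{1\,ij}}\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \\
&\quad = i \sum_{i<j} \int dM_1\, dM_2\; f_{i,j}(M_1,M_2) \left( \alpha_1 (M_{1\,ji} - M_{1\,ij}) + \gamma (M_{2\,ji} - M_{2\,ij}) \right) e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}.
\end{aligned}
\tag{3.14}
\]
Taking 3.12 plus 3.13 minus $i$ 3.14, we get:
\[
\begin{aligned}
&\sum_{i} \int dM_1\, dM_2\; \frac{\partial f_{i,i}(M_1,M_2)}{\partial M_{1\,ii}}\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \\
&\quad + \frac12 \sum_{i<j} \int dM_1\, dM_2 \left[ \left( \frac{\partial}{\partial \mathrm{Re}M_{1\,ij}} - i \frac{\partial}{\partial \mathrm{Im}M_{1\,ij}} \right) f_{i,j}(M_1,M_2) \right] e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \\
&\quad + \frac12 \sum_{i<j} \int dM_1\, dM_2 \left[ \left( \frac{\partial}{\partial \mathrm{Re}M_{1\,ij}} + i \frac{\partial}{\partial \mathrm{Im}M_{1\,ij}} \right) f_{j,i}(M_1,M_2) \right] e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)} \\
&\quad = \int dM_1\, dM_2\; \mathrm{Tr}\big( f(M_1,M_2)(\alpha_1 M_1 + \gamma M_2) \big)\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)},
\end{aligned}
\tag{3.15}
\]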
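i.e. one can proceed as if all the $M_{1\,ij}$ were $n^2$ real independent variables, i.e., by abuse of notation we write:
\[
\sum_{i,j} \int dM_1\, dM_2\; \frac{\partial f_{i,j}(M_1,M_2)}{\partial M_{1\,ij}}\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}
= \int dM_1\, dM_2\; \mathrm{Tr}\big( f(M_1,M_2)(\alpha_1 M_1 + \gamma M_2) \big)\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} M_1^2 + \frac{\alpha_2}{2} M_2^2 + \gamma M_1 M_2 \right)}. \tag{3.16}
\]
Now, one can use the following rules:
• Split rule: if $f(M_1,M_2) = A M_1^k B$ (where $A$ and $B$ are matrices), one has:
\[
\sum_{i,j} \frac{\partial f(M_1,M_2)_{ij}}{\partial M_{1\,ij}} = \sum_{l=0}^{k-1} \mathrm{Tr}\left( A M_1^{k-1-l} \right) \mathrm{Tr}\left( M_1^{l} B \right). \tag{3.17}
\]
• Merge rule: if $f(M_1,M_2) = A\, \mathrm{Tr}(M_1^k B)$ (where $A$ and $B$ are matrices), one has:
\[
\sum_{i,j} \frac{\partial f(M_1,M_2)_{ij}}{\partial M_{1\,ij}} = \sum_{l=0}^{k-1} \mathrm{Tr}\left( A M_1^{k-1-l} B M_1^{l} \right). \tag{3.18}
\]
Then, if $A$ and $B$ depend on $M_1$, one has to use the chain rule. When one considers $f$ given by 3.7, one gets Eq. 3.9. Equation 3.10 is obtained by doing the same for $M_2$. We find again that $\langle F \rangle_H$ is a polynomial in $\frac{\alpha_1}{\delta}$, $\frac{\alpha_2}{\delta}$ and $\frac{\gamma}{\delta}$.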
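The split and merge rules of Eqs. 3.17 and 3.18 can be checked on small matrices. The sketch below treats all $n^2$ entries of $M_1$ as independent variables, as in Eq. 3.16, and takes $A$ and $B$ constant (independent of $M_1$). The derivative is obtained from a five-point finite-difference stencil with unit step, which is exact for polynomials of degree at most 4, so the check is limited to $k \le 4$. All function names and the sample matrices are choices made for this illustration.

```python
from fractions import Fraction


def to_fractions(rows):
    return [[Fraction(v) for v in row] for row in rows]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, k):
    result = identity(len(a))
    for _ in range(k):
        result = matmul(result, a)
    return result


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def derivative_at_zero(p):
    """Five-point stencil with unit step; exact for polynomials of degree <= 4."""
    return (-p(2) + 8 * p(1) - 8 * p(-1) + p(-2)) / Fraction(12)


def shifted(m, i, j, eps):
    """Copy of m with eps added to the entry (i, j)."""
    copy = [row[:] for row in m]
    copy[i][j] += eps
    return copy


def divergence(f, m):
    """sum_{i,j} d f(M)_{ij} / d M_{ij}, all entries of M treated as independent."""
    n = len(m)
    return sum(
        derivative_at_zero(lambda eps, i=i, j=j: f(shifted(m, i, j, eps))[i][j])
        for i in range(n)
        for j in range(n)
    )


def split_rule(a, b, m, k):
    """Both sides of Eq. (3.17) for f = A M^k B."""
    lhs = divergence(lambda x: matmul(matmul(a, matpow(x, k)), b), m)
    rhs = sum(
        trace(matmul(a, matpow(m, k - 1 - l))) * trace(matmul(matpow(m, l), b))
        for l in range(k)
    )
    return lhs, rhs


def merge_rule(a, b, m, k):
    """Both sides of Eq. (3.18) for f = A Tr(M^k B)."""
    def f(x):
        t = trace(matmul(matpow(x, k), b))
        return [[t * entry for entry in row] for row in a]

    lhs = divergence(f, m)
    rhs = sum(
        trace(matmul(matmul(a, matpow(m, k - 1 - l)), matmul(b, matpow(m, l))))
        for l in range(k)
    )
    return lhs, rhs


# Arbitrary 2x2 sample data.
A = to_fractions([[1, 2], [0, -1]])
B = to_fractions([[3, 1], [1, 2]])
M = to_fractions([[2, -1], [4, 3]])

print(split_rule(A, B, M, 3))
print(merge_rule(A, B, M, 3))
```

Each call returns the pair (left-hand side, right-hand side); the two entries coincide when the rule holds.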
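3.2. Gaussian complex model. Consider the measure on $GL_n(\mathbb{C})$:
\[ e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger \right)}\, dZ. \tag{3.19} \]
Definition 3.3. The partition function is:
\[ Z_C(n,\gamma,\alpha_1,\alpha_2) := \int_{GL_n(\mathbb{C})} dZ\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger \right)}. \tag{3.20} \]
Notice that the integral $Z_C$ is absolutely convergent only if
\[ \forall \theta \in \mathbb{R}, \qquad \mathrm{Re}\left( \alpha_1 e^{i\theta} + \alpha_2 e^{-i\theta} + 2\gamma \right) > 0. \tag{3.21} \]
One can see that with $\theta = \pi$, this condition can never be compatible with 3.4 (with $\phi = 0$). Therefore, if $Z_H$ is an absolutely convergent integral then $Z_C$ is not, and vice-versa. An easy gaussian integration gives (where $\delta = \alpha_1\alpha_2 - \gamma^2$):
\[ Z_C = \left( \frac{\pi}{\sqrt{-\delta}} \right)^{n^2}, \tag{3.22} \]
which can be analytically continued to every $\alpha_1, \alpha_2, \gamma$, provided that $\delta \neq 0$.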
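Definition 3.4. The expectation value of an invariant function $F(A,B)$ is:
\[
\langle F \rangle_C := \frac{\int_{GL_n(\mathbb{C})} dZ\; F(Z,Z^\dagger)\, e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger \right)}}{\int_{GL_n(\mathbb{C})} dZ\; e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger \right)}}. \tag{3.23}
\]
Remark 3.2. It is clear, from Wick's theorem, that if $F$ is a monomial invariant function, then $\langle F \rangle_C$ is a polynomial in $\frac{\alpha_1}{\delta}$, $\frac{\alpha_2}{\delta}$ and $\frac{\gamma}{\delta}$, and can be analytically continued to every complex $\alpha_1, \alpha_2, \gamma$, provided that $\delta \neq 0$.
3.2.1. Gaussian complex loop equations. Consider a monomial matrix valued function, of the form:
\[
f(Z,Z^\dagger) = f_0(Z,Z^\dagger) \prod_{t=1}^{p} \mathrm{Tr}\, f_t(Z,Z^\dagger), \qquad f_t(Z,Z^\dagger) = \prod_{r_t=1}^{R_t} Z^{k_{t,r_t}} Z^{\dagger\, l_{t,r_t}}, \tag{3.24}
\]
define:
\[
G_0(Z,Z^\dagger) := \prod_{u \neq 0} \mathrm{Tr}\, f_u(Z,Z^\dagger), \qquad \text{and if } t \geq 1, \quad G_t(Z,Z^\dagger) := \prod_{u \neq 0,t} \mathrm{Tr}\, f_u(Z,Z^\dagger). \tag{3.25}
\]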
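Theorem 3.2. One has the same loop equations as Theorem 3.1, replacing the subscript $H$ by $C$,
\[
\begin{aligned}
&\alpha_1 \big\langle G_0(Z,Z^\dagger)\, \mathrm{Tr}\big( Z f_0(Z,Z^\dagger) \big) \big\rangle_C + \gamma \big\langle G_0(Z,Z^\dagger)\, \mathrm{Tr}\big( Z^\dagger f_0(Z,Z^\dagger) \big) \big\rangle_C \\
&\quad = \sum_{r=1}^{R_0} \sum_{j=0}^{k_{0,r}-1} \Big\langle G_0(Z,Z^\dagger)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} Z^{k_{0,u}} Z^{\dagger\, l_{0,u}}\; Z^{j} \Big)\, \mathrm{Tr}\Big( Z^{k_{0,r}-j-1} Z^{\dagger\, l_{0,r}} \prod_{u=r+1}^{R_0} Z^{k_{0,u}} Z^{\dagger\, l_{0,u}} \Big) \Big\rangle_C \\
&\qquad + \sum_{t=1}^{p} \sum_{r=1}^{R_t} \sum_{j=0}^{k_{t,r}-1} \Big\langle G_t(Z,Z^\dagger)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} Z^{k_{t,u}} Z^{\dagger\, l_{t,u}}\; Z^{j}\, f_0(Z,Z^\dagger)\, Z^{k_{t,r}-j-1} Z^{\dagger\, l_{t,r}} \prod_{u=r+1}^{R_t} Z^{k_{t,u}} Z^{\dagger\, l_{t,u}} \Big) \Big\rangle_C
\end{aligned}
\tag{3.26}
\]
and
\[
\begin{aligned}
&\alpha_2 \big\langle G_0(Z,Z^\dagger)\, \mathrm{Tr}\big( Z^\dagger f_0(Z,Z^\dagger) \big) \big\rangle_C + \gamma \big\langle G_0(Z,Z^\dagger)\, \mathrm{Tr}\big( Z f_0(Z,Z^\dagger) \big) \big\rangle_C \\
&\quad = \sum_{r=1}^{R_0} \sum_{j=0}^{l_{0,r}-1} \Big\langle G_0(Z,Z^\dagger)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} Z^{k_{0,u}} Z^{\dagger\, l_{0,u}}\; Z^{k_{0,r}} Z^{\dagger\, j} \Big)\, \mathrm{Tr}\Big( Z^{\dagger\, l_{0,r}-j-1} \prod_{u=r+1}^{R_0} Z^{k_{0,u}} Z^{\dagger\, l_{0,u}} \Big) \Big\rangle_C \\
&\qquad + \sum_{t=1}^{p} \sum_{r=1}^{R_t} \sum_{j=0}^{l_{t,r}-1} \Big\langle G_t(Z,Z^\dagger)\, \mathrm{Tr}\Big( \prod_{u=1}^{r-1} Z^{k_{t,u}} Z^{\dagger\, l_{t,u}}\; Z^{k_{t,r}} Z^{\dagger\, j}\, f_0(Z,Z^\dagger)\, Z^{\dagger\, l_{t,r}-j-1} \prod_{u=r+1}^{R_t} Z^{k_{t,u}} Z^{\dagger\, l_{t,u}} \Big) \Big\rangle_C .
\end{aligned}
\tag{3.27}
\]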
Notice that the RHS is a linear combination of invariant polynomial functions of degree strictly lower than the LHS.
Proof. The proof is very similar to that of Theorem 3.1. Write that the integral of a total derivative is zero:
\[
0 = \int dZ\; \frac{\partial}{\partial \mathrm{Re}Z_{ij}} \left( f_{r,s}(Z,Z^\dagger)\, e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger \right)} \right) \tag{3.28}
\]
and
\[
0 = -i \int dZ\; \frac{\partial}{\partial \mathrm{Im}Z_{ij}} \left( f_{r,s}(Z,Z^\dagger)\, e^{-\mathrm{Tr}\left( \frac{\alpha_1}{2} Z^2 + \frac{\alpha_2}{2} Z^{\dagger 2} + \gamma Z Z^\dagger \right)} \right). \tag{3.29}
\]
Taking the sum of both lines, one can proceed as if all the $Z_{ij}$ and $Z^\dagger_{ij}$ were real independent variables, and from there, follow the proof of Theorem 3.1.
Remark 3.3. We see that the loop equations of both models are identical. It is clear from the above derivation that this is general, even for non-gaussian measures. When the measure is gaussian, the loop equations determine completely every expectation value, while for non-gaussian measures, the loop equations give recursion relations for expectation values, but do not give the initial conditions. Let us consider in particular the "semi-classical case" [3, 6], i.e. with a measure of the type
\[ d\mu(M_1,M_2) = e^{-\mathrm{Tr}\,[V_1(M_1)+V_2(M_2)+M_1M_2]}, \tag{3.30} \]
where $V_1$ and $V_2$ are rational functions. In that case, the initial conditions which allow one to determine all polynomial expectation values recursively are in one-to-one correspondence with homology classes of integration paths for pairs of eigenvalues [6]. Therefore, there exists a choice of integration path $\Gamma$ such that one can write:
\[
\int_{(H_n \times H_n)(\Gamma)} dM_1\, dM_2\; e^{-\mathrm{Tr}\,[V_1(M_1)+V_2(M_2)+M_1M_2]} \equiv \int_{GL_n(\mathbb{C})} dZ\; e^{-\mathrm{Tr}\,[V_1(Z)+V_2(Z^\dagger)+ZZ^\dagger]} \tag{3.31}
\]
and one can consider that this equality defines the RHS. Somehow, the complex matrix model is nothing but the analytical continuation of the 2-matrix model defined on some classes of contours.
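3.3. Relation between the two models.
Theorem 3.3. For any polynomial invariant function $F(A,B)$, one has:
\[ \langle F \rangle_H = \langle F \rangle_C . \tag{3.32} \]
Notice that $\langle F \rangle_H$ and $\langle F \rangle_C$ have been defined for different ranges of values of $\alpha_1$, $\alpha_2$, and $\gamma$, but, as we have explained above, both are polynomials of $\frac{\alpha_1}{\delta}$, $\frac{\alpha_2}{\delta}$ and $\frac{\gamma}{\delta}$ (and can be analytically continued to any $\alpha_1$, $\alpha_2$, and $\gamma$). Theorem 3.3 is thus an equality between polynomials.
Proof. It is sufficient to prove it for monomial invariant functions. The proof is clearly obtained from the loop equations, by recursion on $\deg F$. It is obviously true for $\deg F = 0$, i.e. $F = 1$. And the loop equations of both models are identical.
Definition 3.5. For any two given complex diagonal matrices $X$ and $Y$, and any polynomial invariant function $F$, define:
\[
\tilde W_F(X,Y) := \Delta^2(X)\, \Delta^2(Y) \int_{U(n)} dU\; F(X, UYU^\dagger)\, e^{-\gamma\, \mathrm{Tr}\, XUYU^\dagger}, \tag{3.33}
\]
\[
\omega_F(X,Y) := \Delta(X)\, \Delta(Y)\, \frac{\int_{T(n)} dT\; F(X+T, Y+T^\dagger)\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}}{\int_{T(n)} dT\; e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}}, \tag{3.34}
\]
which is a polynomial in all its variables $X_i$, $Y_j$, and a polynomial in $1/\gamma$, and:
\[
W_F(X,Y) := \frac{1}{n!^2} \sum_{\sigma} \sum_{\tau} \Delta(X_\sigma)\, \Delta(Y_\tau)\, e^{-\gamma\, \mathrm{Tr}\, X_\sigma Y_\tau} \int_{T(n)} dT\; F(X_\sigma + T, Y_\tau + T^\dagger)\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}. \tag{3.35}
\]
Theorem 3.4. For any polynomial invariant function $F(A,B)$, one has:
\[
\begin{aligned}
&\frac{\tilde J_n^2}{n!^2\, Z_H(n,\gamma,\alpha_1,\alpha_2)} \int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, \tilde W_F(X,Y) \\
&\quad = \frac{J_n}{n!\, Z_C(n,\gamma,\alpha_1,\alpha_2)} \int_{D_n(\mathbb{C})} dX\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, \bar X^2}\, W_F(X,\bar X).
\end{aligned}
\tag{3.36}
\]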
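Proof. Start from Theorem 3.3, diagonalize $M_1$ and $M_2$ on the hermitian side, and jordanize $Z$ on the complex side,
\[
\begin{aligned}
\langle F \rangle_H &= \frac{\tilde J_n^2}{n!^2\, Z_H(n,\gamma,\alpha_1,\alpha_2)} \int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, \tilde W_F(X,Y) \\
= \langle F \rangle_C &= \frac{J_n}{n!\, Z_C(n,\gamma,\alpha_1,\alpha_2)} \int_{D_n(\mathbb{C})} dX\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, \bar X^2}\, e^{-\gamma\, \mathrm{Tr}\, X \bar X}\, \omega_F(X,\bar X) \\
&= \frac{J_n}{n!\, Z_C(n,\gamma,\alpha_1,\alpha_2)} \int_{D_n(\mathbb{C})} dX\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, \bar X^2}\, e^{-\gamma\, \mathrm{Tr}\, X_\sigma \bar X_\tau}\, \omega_F(X_\sigma,\bar X_\tau) \\
&= \frac{J_n}{n!\, Z_C(n,\gamma,\alpha_1,\alpha_2)} \int_{D_n(\mathbb{C})} dX\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, \bar X^2}\, W_F(X,\bar X).
\end{aligned}
\tag{3.37}
\]
The equality in the first line is obtained by diagonalizing $M_1$ and $M_2$ (with Jacobian given in Eq. 2.17), the equality in the second line is obtained by Jordanizing $Z$ (with Jacobian given in Eq. 2.20), the equality between the second and third line holds for any pair of permutations $\sigma$ and $\tau$ (it can be proven with Lemma A.1 given in the Appendix), and the equality of the last line comes from the definition of $W_F$.
4. Unitary Group Integrals
Here is one of the most important theorems of this paper:
4.1. Unitary integrals and triangular integrals.
Theorem 4.1. For any invariant function $F(A,B)$ one has:
\[
\int_{U(n)} dU\; F(X, UYU^\dagger)\, e^{-\gamma\, \mathrm{Tr}\, XUYU^\dagger}
= \frac{c_n}{n!}\, \frac{\sum_{\sigma} \sum_{\tau} (-1)^{\sigma} (-1)^{\tau}\, e^{-\gamma\, \mathrm{Tr}\, X_\sigma Y_\tau} \int_{T(n)} F(X_\sigma + T, Y_\tau + T^\dagger)\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}\, dT}{\Delta(X)\, \Delta(Y)}, \tag{4.1}
\]
where
\[ c_n = \frac{\prod_{k=0}^{n-1} k!}{(-2\pi)^{\frac{n(n-1)}{2}}}, \tag{4.2} \]
i.e.
\[ \tilde W_F(X,Y) = n!\, c_n\, W_F(X,Y). \tag{4.3} \]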
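Proof. Using Lemma A.1 given in the Appendix, and using Theorem 3.4, we have:
\[
\begin{aligned}
&\int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, \tilde W_F(X,Y) \\
&\quad = \frac{n!^2\, Z_H\, J_n}{\tilde J_n^2\, n!\, Z_C} \int_{D_n(\mathbb{C})} dX\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, \bar X^2}\, W_F(X,\bar X) \\
&\quad = \frac{n!^2\, Z_H\, J_n}{\tilde J_n^2\, n!\, Z_C}\, \frac{1}{n!^2} \sum_{\sigma,\tau} \int_{D_n(\mathbb{C})} dX\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, \bar X^2}\, e^{-\gamma\, \mathrm{Tr}\, X_\sigma \bar X_\tau}\, \omega_F(X_\sigma, \bar X_\tau) \\
&\quad = \frac{n!\, Z_H\, J_n}{\tilde J_n^2\, Z_C}\, \frac{\left( \frac{\pi}{\sqrt{-\delta}} \right)^{n}}{\left( \frac{2\pi}{\sqrt{\delta}} \right)^{n}}\, \frac{1}{n!^2} \sum_{\sigma,\tau} \int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, e^{-\gamma\, \mathrm{Tr}\, X_\sigma Y_\tau}\, \omega_F(X_\sigma, Y_\tau) \\
&\quad = \frac{n!\, Z_H\, J_n}{\tilde J_n^2\, Z_C}\, \frac{\left( \frac{\pi}{\sqrt{-\delta}} \right)^{n}}{\left( \frac{2\pi}{\sqrt{\delta}} \right)^{n}} \int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, W_F(X,Y) \\
&\quad = n!\, c_n \int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, W_F(X,Y).
\end{aligned}
\tag{4.4}
\]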
Notice that if $f(A)$ and $g(B)$ are polynomial invariant functions, i.e. $f(UAU^{-1}) = f(A)$ for all $A$ and $U$ (resp. $g(UBU^{-1}) = g(B)$ for all $B$ and $U$), one has:
\[
W_{f(X)g(Y)F(X,Y)}(X,Y) = f(X)\, g(Y)\, W_F(X,Y), \qquad \tilde W_{f(X)g(Y)F(X,Y)}(X,Y) = f(X)\, g(Y)\, \tilde W_F(X,Y). \tag{4.5}
\]
Thus, for any symmetric polynomials $f(x_1,\dots,x_n)$ and $g(y_1,\dots,y_n)$, one has:
\[
0 = \int_{D_n(\mathbb{R}) \times D_n(\mathbb{R})} dX\, dY\; e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2}\, f(X)\, g(Y) \left( n!\, c_n\, W_F(X,Y) - \tilde W_F(X,Y) \right). \tag{4.6}
\]
Notice that $e^{-\frac{\alpha_1}{2} \mathrm{Tr}\, X^2}\, e^{-\frac{\alpha_2}{2} \mathrm{Tr}\, Y^2} \left( n!\, c_n\, W_F(X,Y) - \tilde W_F(X,Y) \right)$ is a continuous function, rapidly decreasing at $\infty$ in all variables, and it is symmetric in the $x$'s and in the $y$'s. Using the Stone–Weierstrass theorem (polynomials are dense in the space of continuous functions), one gets that $n!\, c_n\, W_F(X,Y) = \tilde W_F(X,Y)$.
4.2. Examples. Let us illustrate Theorem 4.1 on some simple examples and recover some classical results.
4.2.1. Harish-Chandra–Itzykson–Zuber's formula. We can use Theorem 4.1 to find a new proof of the famous Harish-Chandra–Itzykson–Zuber formula [18, 17, 20]. Indeed, consider $F(A,B) = 1$; Theorem 4.1 gives:
\[
\int_{U(n)} dU\; e^{-\gamma\, \mathrm{Tr}\, XUYU^\dagger}
= \frac{c_n}{n!}\, \frac{\sum_{\sigma,\tau} (-1)^{\sigma} (-1)^{\tau} \prod_{i} e^{-\gamma X_{\sigma_i} Y_{\tau_i}}}{\Delta(X)\, \Delta(Y)} \int_{T_n} dT\; e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}
= c_n \left( \frac{\pi}{\gamma} \right)^{\frac{n(n-1)}{2}} \frac{\det E}{\Delta(X)\, \Delta(Y)}, \tag{4.7}
\]
which is the famous Harish-Chandra–Itzykson–Zuber integral. Here $E$ is the matrix
\[ E_{ij} := e^{-\gamma X_i Y_j}. \tag{4.8} \]
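4.2.2. Morozov's formula. Consider $\mathrm{Tr}\, A^k B^l$ for any integers $k$ and $l$. It is in fact simpler to introduce a generating function:
\[
F(A,B) = \mathrm{Tr}\, \frac{1}{x-A}\, \frac{1}{y-B} = \sum_{k=0}^{\infty} \sum_{l=0}^{\infty} \frac{1}{x^{k+1}}\, \frac{1}{y^{l+1}}\, \mathrm{Tr}\, A^k B^l, \tag{4.9}
\]
which is to be understood as a formal power series in its large $x$ and large $y$ expansion. $F(A,B)$ is merely a convenient way of considering all polynomial invariant functions of type $\mathrm{Tr}\, A^k B^l$ at once. We have:
\[
\frac{1}{x-(X+T)} = \sum_{p=0}^{n-1} \frac{1}{x-X} \left( T\, \frac{1}{x-X} \right)^{p}, \tag{4.10}
\]
and thus:
\[
\begin{aligned}
&\frac{1}{\int_{T(n)} dT\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}} \int_{T(n)} dT\; \mathrm{Tr} \left( \frac{1}{x-(X+T)}\, \frac{1}{y-(Y+T^\dagger)} \right) e^{-\gamma\, \mathrm{Tr}\, TT^\dagger} \\
&\quad = \sum_{p=0}^{n-1} \sum_{q=0}^{n-1} \frac{1}{\int_{T(n)} dT\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}} \int_{T(n)} dT\; \mathrm{Tr} \left( \frac{1}{x-X} \left( T \frac{1}{x-X} \right)^{p} \frac{1}{y-Y} \left( T^\dagger \frac{1}{y-Y} \right)^{q} \right) e^{-\gamma\, \mathrm{Tr}\, TT^\dagger} \\
&\quad = \sum_{p=0}^{n-1} \sum_{q=0}^{n-1}\; \sum_{i_1 < i_2 < \dots < i_{p+1}}\; \sum_{j_1 < j_2 < \dots < j_{q+1}} \delta_{i_1,j_1}\, \delta_{i_{p+1},j_{q+1}} \prod_{k=1}^{p+1} \frac{1}{x - X_{i_k}} \prod_{l=1}^{q+1} \frac{1}{y - Y_{j_l}} \\
&\qquad \times \frac{\int_{T(n)} dT\; T_{i_1,i_2} T_{i_2,i_3} \cdots T_{i_p,i_{p+1}}\, T^\dagger_{j_{q+1},j_q} \cdots T^\dagger_{j_2,j_1}\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}}{\int_{T(n)} dT\; e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}}.
\end{aligned}
\tag{4.11}
\]
That last integral is non-vanishing only if $p = q$, and according to Wick's theorem, it is the sum of all possible pairings. Because of the ordering of the $i_k$'s and $j_l$'s, the only non-vanishing pairing is obtained for $i_k = j_k$ for all $k$. Therefore:
\[
\begin{aligned}
&\frac{1}{\int_{T(n)} dT\, e^{-\gamma\, \mathrm{Tr}\, TT^\dagger}} \int_{T(n)} dT\; \mathrm{Tr} \left( \frac{1}{x-(X+T)}\, \frac{1}{y-(Y+T^\dagger)} \right) e^{-\gamma\, \mathrm{Tr}\, TT^\dagger} \\
&\quad = \sum_{p=0}^{n-1}\; \sum_{i_1 < \dots < i_{p+1}} \frac{1}{\gamma^{p}} \prod_{k=1}^{p+1} \frac{1}{(x - X_{i_k})(y - Y_{i_k})}
= -\gamma + \gamma \prod_{i=1}^{n} \left( 1 + \frac{1}{\gamma}\, \frac{1}{(x - X_i)(y - Y_i)} \right),
\end{aligned}
\tag{4.12}
\]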
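and then Theorem 4.1 gives:
\[
\begin{aligned}
&\left( \frac{\gamma}{\pi} \right)^{\frac{n(n-1)}{2}} \int_{U(n)} dU\; \mathrm{Tr} \left( \frac{1}{x-X}\, U\, \frac{1}{y-Y}\, U^\dagger \right) e^{-\gamma\, \mathrm{Tr}\, XUYU^\dagger} \\
&\quad = \frac{c_n \gamma}{n!}\, \frac{\sum_{\sigma,\tau} (-1)^{\sigma} (-1)^{\tau} \left[ -\prod_{i} e^{-\gamma X_{\sigma_i} Y_{\tau_i}} + \prod_{i} \left( e^{-\gamma X_{\sigma_i} Y_{\tau_i}} + \frac{1}{\gamma}\, \frac{1}{x - X_{\sigma_i}}\, e^{-\gamma X_{\sigma_i} Y_{\tau_i}}\, \frac{1}{y - Y_{\tau_i}} \right) \right]}{\Delta(X)\, \Delta(Y)} \\
&\quad = \gamma\, c_n\, \frac{-\det E + \det\left( E + \frac{1}{\gamma}\, \frac{1}{x-X}\, E\, \frac{1}{y-Y} \right)}{\Delta(X)\, \Delta(Y)} \\
&\quad = \gamma\, c_n\, \frac{\det E}{\Delta(X)\, \Delta(Y)} \left( -1 + \det\left( 1 + \frac{1}{\gamma}\, \frac{1}{x-X}\, E\, \frac{1}{y-Y}\, E^{-1} \right) \right),
\end{aligned}
\tag{4.13}
\]
i.e.
\[
\frac{\int_{U(n)} dU\; \mathrm{Tr} \left( \frac{1}{x-X}\, U\, \frac{1}{y-Y}\, U^\dagger \right) e^{-\gamma\, \mathrm{Tr}\, XUYU^\dagger}}{\int_{U(n)} dU\; e^{-\gamma\, \mathrm{Tr}\, XUYU^\dagger}}
= \gamma \left( -1 + \det\left( 1 + \frac{1}{\gamma}\, \frac{1}{x-X}\, E\, \frac{1}{y-Y}\, E^{-1} \right) \right), \tag{4.14}
\]
which is identical (for $\gamma = -1$) to what was found in [5, 12], i.e. the compact version of Morozov's formula [23].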
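The last equality of Eq. 4.12 is a purely algebraic resummation: writing $u_i = \frac{1}{(x - X_i)(y - Y_i)}$, the sum over ordered subsets is a sum of elementary symmetric polynomials in the $u_i$. The sketch below checks this identity with exact rationals; the names `resolvent_weights`, `ordered_sum` and `product_form` are illustrative and the sample values are arbitrary.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def resolvent_weights(x, y, xs, ys):
    """u_i = 1 / ((x - X_i)(y - Y_i)) for diagonal X and Y."""
    return [1 / ((x - xi) * (y - yi)) for xi, yi in zip(xs, ys)]


def ordered_sum(u, gamma):
    """sum_{p=0}^{n-1} sum_{i_1<...<i_{p+1}} gamma^{-p} prod_k u_{i_k}."""
    total = Fraction(0)
    for p in range(len(u)):
        for subset in combinations(u, p + 1):
            total += prod(subset, start=Fraction(1)) / gamma ** p
    return total


def product_form(u, gamma):
    """-gamma + gamma * prod_i (1 + u_i / gamma)."""
    return -gamma + gamma * prod((1 + ui / gamma for ui in u), start=Fraction(1))


# Arbitrary sample data for n = 3.
GAMMA = Fraction(3, 2)
U_WEIGHTS = resolvent_weights(
    Fraction(7), Fraction(-3),
    [Fraction(1), Fraction(2), Fraction(4)],
    [Fraction(0), Fraction(5), Fraction(-1)],
)
print(ordered_sum(U_WEIGHTS, GAMMA) == product_form(U_WEIGHTS, GAMMA))
```

For $n = 2$ the left-hand side of Eq. 4.12 can also be read off directly: the trace equals $u_1 + u_2 + u_1 u_2\, |T_{12}|^2$, and the gaussian average of $|T_{12}|^2$ with weight $e^{-\gamma |T_{12}|^2}$ is $1/\gamma$, which gives $u_1 + u_2 + u_1 u_2/\gamma$, in agreement with both expressions above.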
5. Computation of Triangular Integrals
The goal of this section is to compute the triangular integral on the RHS of Theorem 4.1. Here, we consider $\gamma = 1$.
5.1. Parametrization of polynomial invariant functions.
Definition 5.1. Let $R$ be a positive integer. Let $\vec x = (x_1,\dots,x_R)$ and $\vec y = (y_1,\dots,y_R)$ be $2R$ complex numbers. Let $\pi$ and $\pi'$ be two permutations in $\Sigma(R)$. The permutation $\pi \pi'^{-1}$ is made of $p$ cycles $C_1,\dots,C_p$ of length $R_1,\dots,R_p$ which we note:
\[
C_k = \left( i_{k,1} \xrightarrow{\;\pi\;} j_{k,1} \xrightarrow{\;\pi'^{-1}\;} i_{k,2} \xrightarrow{\;\pi\;} j_{k,2} \xrightarrow{\;\pi'^{-1}\;} i_{k,3} \to \dots\; i_{k,R_k} \xrightarrow{\;\pi\;} j_{k,R_k} \xrightarrow{\;\pi'^{-1}\;} i_{k,1} \right). \tag{5.1}
\]
We define, for $(A,B) \in GL_n(\mathbb{C})^2$ in any dimension $n$:
\[
F_{\pi,\pi'}(\vec x,\vec y,A,B) := \prod_{k=1}^{p} \left( \delta_{R_k,1} + \mathrm{Tr} \prod_{l=1}^{R_k} \frac{1}{x_{i_{k,l}} - A}\, \frac{1}{y_{j_{k,l}} - B} \right). \tag{5.2}
\]
As explained above, this definition is to be understood as a formal power series in the large $x_i$ and $y_j$ expansions; it is merely a way of considering all polynomial invariant functions at once.
Examples: with $R = 2$, we have:
\[
\begin{aligned}
F_{(1)(2),(1)(2)}(x_1,x_2,y_1,y_2,A,B) &= \left( 1 + \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_1 - B} \right) \left( 1 + \mathrm{Tr}\, \frac{1}{x_2 - A}\, \frac{1}{y_2 - B} \right), \\
F_{(12),(12)}(x_1,x_2,y_1,y_2,A,B) &= \left( 1 + \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_2 - B} \right) \left( 1 + \mathrm{Tr}\, \frac{1}{x_2 - A}\, \frac{1}{y_1 - B} \right), \\
F_{(1)(2),(12)}(x_1,x_2,y_1,y_2,A,B) &= \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_1 - B}\, \frac{1}{x_2 - A}\, \frac{1}{y_2 - B}, \\
F_{(12),(1)(2)}(x_1,x_2,y_1,y_2,A,B) &= \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_2 - B}\, \frac{1}{x_2 - A}\, \frac{1}{y_1 - B}.
\end{aligned}
\tag{5.3}
\]
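Definition 5.2. Let $R$ be a positive integer, $\vec x = (x_1,\dots,x_R)$ and $\vec y = (y_1,\dots,y_R)$ be $2R$ complex numbers. Let $\pi$ and $\pi'$ be two permutations in $\Sigma(R)$. Let $n$ be an integer, and $X = \mathrm{diag}(X_1,\dots,X_n)$ and $Y = \mathrm{diag}(Y_1,\dots,Y_n)$ be two complex diagonal matrices of size $n$. We define:
\[ W^{(n)}_{\pi,\pi'}(\vec x,\vec y,X,Y) := 1 \qquad \text{if } n = 0 \text{ or } R = 0, \tag{5.4} \]
\[ W^{(n)}_{\pi,\pi'}(\vec x,\vec y,X,Y) := F_{\pi,\pi'}(\vec x,\vec y,X_1,Y_1) \qquad \text{if } n = 1, \tag{5.5} \]
and otherwise
\[
W^{(n)}_{\pi,\pi'}(\vec x,\vec y,X,Y) := \frac{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}\, F_{\pi,\pi'}(\vec x,\vec y,X+T,Y+T^\dagger)}{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}}. \tag{5.6}
\]
Here, $\frac{1}{x-(X+T)}$ is defined by:
\[
\left( \frac{1}{x-(X+T)} \right)_{i,j} := \frac{\delta_{ij}}{x - X_i} + \sum_{p=1}^{j-i}\; \sum_{i = i_0 < i_1 < \dots < i_p = j} \frac{1}{x - X_{i_0}}\, T_{i_0,i_1}\, \frac{1}{x - X_{i_1}}\, T_{i_1,i_2} \cdots \frac{1}{x - X_{i_{p-1}}}\, T_{i_{p-1},i_p}\, \frac{1}{x - X_{i_p}}. \tag{5.7}
\]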
5.2. Computation of triangular integrals of invariant functions. We are now going to find some recursion relation in $n$ for the $W$'s.
Theorem 5.1.
\[
W^{(n)}_{\pi,\pi'}(\vec x,\vec y,X,Y) = \sum_{\rho} M^{(R)}_{\pi,\rho}(\vec x,\vec y,X_n,Y_n)\; W^{(n-1)}_{\rho,\pi'}(\vec x,\vec y,\tilde X,\tilde Y), \tag{5.8}
\]
where $\tilde X := \mathrm{diag}(X_1,\dots,X_{n-1})$, $\tilde Y := \mathrm{diag}(Y_1,\dots,Y_{n-1})$, and:
\[
M^{(R)}_{\pi,\rho}(\vec x,\vec y,X_n,Y_n) = \prod_{i=1}^{R} \left( \delta_{\pi(i),\rho(i)} + \frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)} \right). \tag{5.9}
\]
Proof. If $T$ is a strictly upper triangular matrix of size $n$, we define $\tilde T$ the triangular matrix of size $n-1$, such that $\tilde T_{i,j} = T_{i,j}$ for all $i,j < n$, and $u$ the vector made of the last column of $T$, $u_k = T_{k,n}$:
\[
T = \begin{pmatrix} \tilde T & \begin{matrix} u_1 \\ \vdots \\ u_{n-1} \end{matrix} \\ 0 \;\cdots\; 0 & 0 \end{pmatrix}. \tag{5.10}
\]
We define $\left( \frac{1}{x - (\tilde X + \tilde T)} \right)_{i,j} := 0$ if $i = n$ or $j = n$. Notice that:
\[
\left( \frac{1}{x-(X+T)} \right)_{i,j} = \left( \frac{1}{x-(\tilde X+\tilde T)} \right)_{i,j} + \frac{\delta_{j,n}}{x - X_n} \sum_{k=1}^{n-1} \left( \frac{1}{x-(\tilde X+\tilde T)} \right)_{i,k} u_k + \frac{\delta_{i,n}\, \delta_{j,n}}{x - X_n} \tag{5.11}
\]
and
\[
\begin{aligned}
1 + \mathrm{Tr}\, \frac{1}{x-(X+T)}\, \frac{1}{y-(Y+T^\dagger)}
&= 1 + \mathrm{Tr}\, \frac{1}{x-(\tilde X+\tilde T)}\, \frac{1}{y-(\tilde Y+\tilde T^\dagger)} \\
&\quad + \frac{1}{(x - X_n)(y - Y_n)} \left( 1 + \sum_{k=1}^{n-1} \sum_{l=1}^{n-1} u_k\, \bar u_l \left( \frac{1}{y-(\tilde Y+\tilde T^\dagger)}\, \frac{1}{x-(\tilde X+\tilde T)} \right)_{l,k} \right).
\end{aligned}
\tag{5.12}
\]
Now, we integrate $u$ out, using Wick's theorem, i.e. take the sum over all possible pairings of a $u$ and a $\bar u$. The pairing $(u_k, \bar u_l)$ gives a factor $\delta_{k,l}$. Let us represent $W$ as a bivalent graph $G$, whose edges are pairs $(x_i, y_{\pi(i)})$, and whose vertices are pairs $(y_{\pi'(i)}, x_i)$. Relation Eq. 5.11 means that, for each edge $(x_i, y_{\pi(i)})$ of $G$, we can either:
– leave the edge untouched (first term in Eq. 5.11), with weight 1,
– remove the edge (second term in Eq. 5.11), with weight $\frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)}$,
– remove the vertex $(y_{\pi'(i)}, x_i)$ (third term in Eq. 5.11), with weight $\frac{1}{(x_i - X_n)(y_{\pi'(i)} - Y_n)}$, which means that either of the neighboring edges cannot stay untouched.
Then, we integrate $u$ out, i.e. we take the sum over all possible pairings, i.e. we draw new edges between vertices (those not removed), so that the final graph is bivalent. For each pairing, we get a new graph $G'$. The sum over possible pairings is thus the sum over bivalent graphs $G'$, whose vertices form a subset of the vertices of $G$, i.e.
\[ W^{(n)}_{G} = \sum_{G'} M_{G,G'}\, W^{(n-1)}_{G'}, \tag{5.13} \]
where the coefficient $M_{G,G'}$ is computed as follows:
– $M_{G,G'}$ receives a factor $1 + \frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)}$ for each edge $(x_i, y_{\pi(i)})$ of $G$ which is unchanged, i.e. which is an edge of $G'$ (1 if it was not removed, and $\frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)}$ if it was removed and drawn again).
– the weight of each edge $(x_i, y_{\pi(i)})$ of $G$, which is not an edge of $G'$, is $\frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)}$.
– the weight of removing a vertex is the same as the weight of creating a length 1 cycle at that vertex. In other words, if $G'$ has fewer vertices than $G$, consider $G''$ obtained from $G'$ by adding length 1 cycles at each missing vertex; one has $M_{G,G'} = M_{G,G''}$. The sum over $G'$ can thus be written as a sum over $G''$, where $G''$ has as many vertices as $G$, and all cycles of length 1 come together with a 1 added.
– relation Eq. 5.12 ensures that the previous rules apply also when $G$ has length 1 cycles.
To summarize, we have:
\[ W^{(n)}_{G} = \sum_{G'} M_{G,G'}\, W^{(n-1)}_{G'}, \tag{5.14} \]
where
\[
M_{G,G'} = \prod_{(x_i, y_{\pi(i)}) \in G'} \left( 1 + \frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)} \right) \times \prod_{(x_i, y_{\pi(i)}) \notin G'} \frac{1}{(x_i - X_n)(y_{\pi(i)} - Y_n)}; \tag{5.15}
\]
when $G$ and $G'$ are written in terms of pairs of permutations, it reduces to Eq. 5.9. This proof is graphically illustrated for $R = 2$ as follows:
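[Figure, Eq. (5.16): diagrammatic expansion for $R = 2$ of the graph made of two length 1 cycles. The diagrams are not recoverable from the source; the surviving labels are the weights $1 + \frac{1}{x - X_n}\frac{1}{y - Y_n}$, $1 + \frac{1}{x' - X_n}\frac{1}{y' - Y_n}$ and $\frac{1}{x' - X_n}\frac{1}{y' - Y_n}\frac{1}{x - X_n}\frac{1}{y - Y_n}$.]
and:
[Figure, Eq. (5.17): diagrammatic expansion for $R = 2$ of the graph made of one length 2 cycle. The diagrams are not recoverable from the source; the surviving labels are the weights $1 + \frac{1}{x - X_n}\frac{1}{y' - Y_n}$, $1 + \frac{1}{x' - X_n}\frac{1}{y - Y_n}$ and $\frac{1}{x - X_n}\frac{1}{y' - Y_n}\frac{1}{x' - X_n}\frac{1}{y - Y_n}$.]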
Remark 5.1. Notice that:
\[ M^{(R)}(\vec x,\vec y,X_n,Y_n) = M^{(R)}(\vec x,\vec y,X_n,Y_n)^t, \tag{5.18} \]
\[ M^{(R)}_{\pi,\pi'}(\vec x,\vec y,X_n,Y_n) = M^{(R)}_{\pi^{-1},\pi'^{-1}}(\vec y,\vec x,Y_n,X_n), \tag{5.19} \]
\[ M^{(R)}_{\pi\rho,\pi'\rho}(\vec x,\vec y,X_n,Y_n) = M^{(R)}_{\pi,\pi'}(\vec x_{\rho^{-1}},\vec y,X_n,Y_n). \tag{5.20} \]
Theorem 5.2.
\[
W^{(n)}_{\pi,\pi'}(\vec x,\vec y,X,Y) = \Big( M^{(R)}(\vec x,\vec y,X_n,Y_n)\, M^{(R)}(\vec x,\vec y,X_{n-1},Y_{n-1}) \cdots M^{(R)}(\vec x,\vec y,X_1,Y_1) \Big)_{\pi,\pi'}. \tag{5.21}
\]
Proof. For $n = 1$, we have
\[
W^{(1)}_{\pi,\pi'}(\vec x,\vec y,X_1,Y_1) = F_{\pi,\pi'}(\vec x,\vec y,X_1,Y_1) = M^{(R)}_{\pi,\pi'}(\vec x,\vec y,X_1,Y_1). \tag{5.22}
\]
The proof follows from recursion on $n$.
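The matrices $M^{(R)}$ of Eq. 5.9 are small enough to tabulate explicitly for $R = 2$ or $R = 3$. The sketch below builds them with exact rationals, with rows and columns indexed by permutations in the order produced by `itertools.permutations` and with 0-based indices, and then checks the symmetry of Eq. 5.18 together with the commutation property that Theorem 5.3 states for two different values of $(\xi, \eta)$. The function name `transfer_matrix` and the sample points are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def transfer_matrix(x, y, xi, eta):
    """M^{(R)}(x, y, xi, eta) of Eq. (5.9) as a nested list of size R! x R!."""
    size = len(x)
    perms = list(permutations(range(size)))
    a = [1 / (xv - xi) for xv in x]      # 1 / (x_i - xi)
    b = [1 / (yv - eta) for yv in y]     # 1 / (y_j - eta)
    matrix = []
    for pi in perms:
        row = []
        for rho in perms:
            entry = Fraction(1)
            for i in range(size):
                delta = 1 if pi[i] == rho[i] else 0
                entry *= delta + a[i] * b[pi[i]]
            row.append(entry)
        matrix.append(row)
    return matrix


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


# Arbitrary sample points for R = 3 (chosen away from the poles).
X_POINTS = [Fraction(3), Fraction(-2), Fraction(5)]
Y_POINTS = [Fraction(1), Fraction(4), Fraction(-7)]
M_FIRST = transfer_matrix(X_POINTS, Y_POINTS, Fraction(1, 2), Fraction(1, 3))
M_SECOND = transfer_matrix(X_POINTS, Y_POINTS, Fraction(-1, 4), Fraction(2, 5))

print(len(M_FIRST))                                               # R! rows
print(M_FIRST == transpose(M_FIRST))                              # Eq. (5.18)
print(matmul(M_FIRST, M_SECOND) == matmul(M_SECOND, M_FIRST))     # Theorem 5.3
```

By Theorem 5.2, the ordered product of such matrices evaluated at $(X_n, Y_n), \dots, (X_1, Y_1)$ gives the table of all $W^{(n)}_{\pi,\pi'}$ at once.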
Theorem 5.3. The matrices $M^{(R)}(\vec x,\vec y,\xi,\eta)$ commute among themselves:
\[
M^{(R)}(\vec x,\vec y,\xi,\eta)\, M^{(R)}(\vec x,\vec y,\xi',\eta') = M^{(R)}(\vec x,\vec y,\xi',\eta')\, M^{(R)}(\vec x,\vec y,\xi,\eta). \tag{5.23}
\]
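Proof. Let $n = 2$, $X = \mathrm{diag}(X_1,X_2)$ and $Y = \mathrm{diag}(Y_1,Y_2)$ be two diagonal matrices, and $\tilde X = \mathrm{diag}(X_2,X_1)$ and $\tilde Y = \mathrm{diag}(Y_2,Y_1)$. Let $T$ be a $2 \times 2$ strictly upper triangular matrix with non-vanishing element $T_{12}$. Let $U$ be the $2 \times 2$ matrix:
\[ U = \begin{pmatrix} \bar T_{12} & Y_2 - Y_1 \\ X_1 - X_2 & T_{12} \end{pmatrix}, \tag{5.24} \]
it satisfies:
\[ U (X + T) = (\tilde X + \bar T)\, U, \qquad U (Y + T^\dagger) = (\tilde Y + T^t)\, U. \tag{5.25} \]
If $U$ is invertible (which is true for almost every $T$), one has:
\[ F_{\pi,\rho}(\vec x,\vec y,X+T,Y+T^\dagger) = F_{\pi,\rho}(\vec x,\vec y,\tilde X+\bar T,\tilde Y+T^t) \tag{5.26} \]
for every $T$ (except a zero measure subset). Since the Jacobian $\left| \frac{\partial \bar T}{\partial T} \right| = 1$, one has:
\[
\frac{\int_{T(2)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}\, F_{\pi,\rho}(\vec x,\vec y,X+T,Y+T^\dagger)}{\int_{T(2)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}}
= \frac{\int_{T(2)} d\bar T\; e^{-\mathrm{Tr}\, \bar T \bar T^\dagger}\, F_{\pi,\rho}(\vec x,\vec y,\tilde X+\bar T,\tilde Y+T^t)}{\int_{T(2)} d\bar T\; e^{-\mathrm{Tr}\, \bar T \bar T^\dagger}}. \tag{5.27}
\]
Using Theorem 5.2 for $n = 2$, we have:
\[
M^{(R)}(\vec x,\vec y,X_1,Y_1)\, M^{(R)}(\vec x,\vec y,X_2,Y_2) = M^{(R)}(\vec x,\vec y,X_2,Y_2)\, M^{(R)}(\vec x,\vec y,X_1,Y_1). \tag{5.28}
\]
Corollary 1. Therefore, there exists an orthogonal matrix $U(\vec x,\vec y)$, independent of $\xi$ and $\eta$, such that:
\[ \Lambda(\vec x,\vec y,\xi,\eta) := U(\vec x,\vec y)\, M^{(R)}(\vec x,\vec y,\xi,\eta)\, U^t(\vec x,\vec y) \tag{5.29} \]
is a diagonal matrix
\[ \Lambda(\vec x,\vec y,\xi,\eta) = \mathrm{diag}\left( \Lambda_\pi(\vec x,\vec y,\xi,\eta) \right). \tag{5.30} \]
Notice that $\Lambda$ is a rational function of $\xi$ and $\eta$. Thus:
\[
\frac{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}\, F_{\pi,\pi'}(\vec x,\vec y,X+T,Y+T^\dagger)}{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}}
= \sum_{\rho} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y) \prod_{i=1}^{n} \Lambda_\rho(\vec x,\vec y,X_i,Y_i) \tag{5.31}
\]
and:
\[
\frac{\int_{U(n)} dU\; e^{-\mathrm{Tr}\, XUYU^\dagger}\, F_{\pi,\pi'}(\vec x,\vec y,X,UYU^\dagger)}{\int_{U(n)} dU\; e^{-\mathrm{Tr}\, XUYU^\dagger}}
= \sum_{\rho} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y)\, \frac{\det\left( e^{-X_i Y_j}\, \Lambda_\rho(\vec x,\vec y,X_i,Y_j) \right)}{\det\left( e^{-X_i Y_j} \right)}. \tag{5.32}
\]
Remark 5.2. If one defines the "Matricial determinant" as follows: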
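Definition 5.3. Let $M \in GL_n(GL_m(\mathbb{C}))$, i.e. for each $i = 1,\dots,n$, $j = 1,\dots,n$, $M_{i,j}$ is a square matrix of size $m$. We define:
\[
\mathrm{Mdet}(M) := \frac{1}{n!} \sum_{\sigma \in \Sigma(n)} \sum_{\tau \in \Sigma(n)} (-1)^{\sigma} (-1)^{\tau} \prod_{i=1}^{n} M_{\sigma(i),\tau(i)}, \tag{5.33}
\]
which is an $m \times m$ square matrix.
Then we have:
\[
\int_{U(n)} dU\; F_{\pi,\pi'}(\vec x,\vec y,X,UYU^\dagger)\, e^{-\mathrm{Tr}\, XUYU^\dagger}
= c_n\, (\pi)^{\frac{n(n-1)}{2}}\, \frac{\left( \mathrm{Mdet}\left( e^{-X_i Y_j}\, M^{(R)}(\vec x,\vec y,X_i,Y_j) \right) \right)_{\pi,\pi'}}{\Delta(X)\, \Delta(Y)}; \tag{5.34}
\]
if $R = 0$, one immediately recovers the Itzykson–Zuber formula, and if $R = 1$, one immediately recovers Morozov's formula.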
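Definition 5.3 can be turned into a short routine. The sketch below computes $\mathrm{Mdet}$ by the double sum over permutations of Eq. 5.33. Two points are assumptions of this illustration rather than statements of the text: the product over $i$ is taken in increasing order of $i$, which matters only when the blocks do not commute, and the names `mdet` and `diagonal_blocks` are hypothetical. For $m = 1$ the result is the ordinary determinant, and for commuting diagonal blocks it is the diagonal matrix of the componentwise determinants.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Signature of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def identity(m):
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]


def matmul(a, b):
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mdet(blocks):
    """Mdet of Eq. (5.33); blocks[i][j] is an m x m matrix (nested lists)."""
    n = len(blocks)
    m = len(blocks[0][0])
    total = [[Fraction(0)] * m for _ in range(m)]
    for sigma in permutations(range(n)):
        for tau in permutations(range(n)):
            term = identity(m)
            for i in range(n):  # assumed ordering: i = 1, ..., n from left to right
                term = matmul(term, blocks[sigma[i]][tau[i]])
            s = sign(sigma) * sign(tau)
            total = [[total[r][c] + s * term[r][c] for c in range(m)]
                     for r in range(m)]
    return [[entry / factorial(n) for entry in row] for row in total]


def diagonal_blocks(p, q):
    """Blocks diag(p_ij, q_ij) built from two ordinary n x n matrices p and q."""
    n = len(p)
    return [[[[Fraction(p[i][j]), Fraction(0)], [Fraction(0), Fraction(q[i][j])]]
             for j in range(n)] for i in range(n)]


scalar_blocks = [[[[Fraction(1)]], [[Fraction(2)]]],
                 [[[Fraction(3)]], [[Fraction(4)]]]]
print(mdet(scalar_blocks))
print(mdet(diagonal_blocks([[1, 2], [3, 4]], [[2, 0], [1, 5]])))
```

The cost grows like $(n!)^2$, so this direct form is only useful for very small $n$.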
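5.3. Examples.
• Example $R = 1$:
\[ M^{(1)}_{1,1}(x,y,\xi,\eta) = 1 + \frac{1}{x - \xi}\, \frac{1}{y - \eta}, \tag{5.35} \]
and thus:
\[
\frac{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger} \left( 1 + \mathrm{Tr}\, \frac{1}{x-(X+T)}\, \frac{1}{y-(Y+T^\dagger)} \right)}{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}} = \prod_{i=1}^{n} \left( 1 + \frac{1}{x - X_i}\, \frac{1}{y - Y_i} \right). \tag{5.36}
\]
• Example $R = 2$: We have:
\[
\begin{aligned}
F_{(1)(2),(1)(2)}(x_1,x_2,y_1,y_2,A,B) &= \left( 1 + \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_1 - B} \right) \left( 1 + \mathrm{Tr}\, \frac{1}{x_2 - A}\, \frac{1}{y_2 - B} \right), \\
F_{(12),(12)}(x_1,x_2,y_1,y_2,A,B) &= \left( 1 + \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_2 - B} \right) \left( 1 + \mathrm{Tr}\, \frac{1}{x_2 - A}\, \frac{1}{y_1 - B} \right), \\
F_{(1)(2),(12)}(x_1,x_2,y_1,y_2,A,B) &= \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_1 - B}\, \frac{1}{x_2 - A}\, \frac{1}{y_2 - B}, \\
F_{(12),(1)(2)}(x_1,x_2,y_1,y_2,A,B) &= \mathrm{Tr}\, \frac{1}{x_1 - A}\, \frac{1}{y_2 - B}\, \frac{1}{x_2 - A}\, \frac{1}{y_1 - B},
\end{aligned}
\tag{5.37}
\]
and
\[
\begin{aligned}
M^{(2)}_{(1)(2),(1)(2)}(x_1,x_2,y_1,y_2,\xi,\eta) &= \left( 1 + \frac{1}{x_1 - \xi}\, \frac{1}{y_1 - \eta} \right) \left( 1 + \frac{1}{x_2 - \xi}\, \frac{1}{y_2 - \eta} \right), \\
M^{(2)}_{(12),(12)}(x_1,x_2,y_1,y_2,\xi,\eta) &= \left( 1 + \frac{1}{x_1 - \xi}\, \frac{1}{y_2 - \eta} \right) \left( 1 + \frac{1}{x_2 - \xi}\, \frac{1}{y_1 - \eta} \right), \\
M^{(2)}_{(1)(2),(12)}(x_1,x_2,y_1,y_2,\xi,\eta) &= \frac{1}{x_1 - \xi}\, \frac{1}{y_1 - \eta}\, \frac{1}{x_2 - \xi}\, \frac{1}{y_2 - \eta}, \\
M^{(2)}_{(12),(1)(2)}(x_1,x_2,y_1,y_2,\xi,\eta) &= \frac{1}{x_1 - \xi}\, \frac{1}{y_2 - \eta}\, \frac{1}{x_2 - \xi}\, \frac{1}{y_1 - \eta},
\end{aligned}
\tag{5.38}
\]
i.e., the matrix $M^{(2)}(x_1,x_2,y_1,y_2,\xi,\eta)$ is:
\[
\begin{pmatrix}
\left( 1 + \frac{1}{x_1 - \xi}\frac{1}{y_1 - \eta} \right) \left( 1 + \frac{1}{x_2 - \xi}\frac{1}{y_2 - \eta} \right) & \frac{1}{x_1 - \xi}\frac{1}{y_1 - \eta}\frac{1}{x_2 - \xi}\frac{1}{y_2 - \eta} \\
\frac{1}{x_1 - \xi}\frac{1}{y_2 - \eta}\frac{1}{x_2 - \xi}\frac{1}{y_1 - \eta} & \left( 1 + \frac{1}{x_1 - \xi}\frac{1}{y_2 - \eta} \right) \left( 1 + \frac{1}{x_2 - \xi}\frac{1}{y_1 - \eta} \right)
\end{pmatrix},
\tag{5.39}
\]
i.e.
\[
M^{(2)}(x_1,x_2,y_1,y_2,\xi,\eta) = \left[ 1 + \frac12 \left( \frac{1}{x_1 - \xi} + \frac{1}{x_2 - \xi} \right) \left( \frac{1}{y_1 - \eta} + \frac{1}{y_2 - \eta} \right) \right] \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
+ \frac{1}{(x_1 - \xi)(x_2 - \xi)(y_1 - \eta)(y_2 - \eta)} \begin{pmatrix} 1 + S & 1 \\ 1 & 1 - S \end{pmatrix}, \tag{5.40}
\]
where
\[ S = \frac12 (x_1 - x_2)(y_1 - y_2). \tag{5.41} \]
Define the following orthogonal matrix ($U^{(2)}(x_1,x_2,y_1,y_2)\, U^{(2)}(x_1,x_2,y_1,y_2)^t = 1$):
\[
U^{(2)}(x_1,x_2,y_1,y_2) := \frac{1}{\sqrt{2\lambda(\lambda - S)}} \begin{pmatrix} 1 & \lambda - S \\ S - \lambda & 1 \end{pmatrix}, \qquad \text{where } \lambda = \sqrt{1 + S^2}; \tag{5.42}
\]
one has:
\[
M^{(2)}(x_1,x_2,y_1,y_2,\xi,\eta) = U^{(2)}(x_1,x_2,y_1,y_2)\; \Lambda^{(2)}(x_1,x_2,y_1,y_2,\xi,\eta)\; U^{(2)}(x_1,x_2,y_1,y_2)^t, \tag{5.43}
\]
where $\Lambda^{(2)}(x_1,x_2,y_1,y_2,\xi,\eta) = \mathrm{diag}\left( \Lambda_+(x_1,x_2,y_1,y_2,\xi,\eta),\, \Lambda_-(x_1,x_2,y_1,y_2,\xi,\eta) \right)$ with
\[
\Lambda_\pm(x_1,x_2,y_1,y_2,\xi,\eta) = 1 + \frac12 \left( \frac{1}{x_1 - \xi} + \frac{1}{x_2 - \xi} \right) \left( \frac{1}{y_1 - \eta} + \frac{1}{y_2 - \eta} \right) + \frac{1 \pm \lambda}{(x_1 - \xi)(x_2 - \xi)(y_1 - \eta)(y_2 - \eta)}. \tag{5.44}
\]
Eventually, one gets:
\[
\begin{aligned}
&\frac{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}\; \mathrm{Tr} \left( \frac{1}{x_1-(X+T)}\, \frac{1}{y_1-(Y+T^\dagger)}\, \frac{1}{x_2-(X+T)}\, \frac{1}{y_2-(Y+T^\dagger)} \right)}{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}} \\
&\quad = \frac{1}{2\lambda} \left( \prod_{i=1}^{n} \Lambda_+(x_1,x_2,y_1,y_2,X_i,Y_i) - \prod_{i=1}^{n} \Lambda_-(x_1,x_2,y_1,y_2,X_i,Y_i) \right),
\end{aligned}
\tag{5.45}
\]
\[
\begin{aligned}
&\frac{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger} \left( 1 + \mathrm{Tr}\, \frac{1}{x_1-(X+T)}\, \frac{1}{y_1-(Y+T^\dagger)} \right) \left( 1 + \mathrm{Tr}\, \frac{1}{x_2-(X+T)}\, \frac{1}{y_2-(Y+T^\dagger)} \right)}{\int_{T(n)} dT\; e^{-\mathrm{Tr}\, TT^\dagger}} \\
&\quad = \frac{1}{2\lambda} \left( (\lambda + S) \prod_{i=1}^{n} \Lambda_+(x_1,x_2,y_1,y_2,X_i,Y_i) + (\lambda - S) \prod_{i=1}^{n} \Lambda_-(x_1,x_2,y_1,y_2,X_i,Y_i) \right).
\end{aligned}
\tag{5.46}
\]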
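The $R = 2$ formulas can be checked numerically. The sketch below builds $M^{(2)}$ from Eq. 5.39 and $\Lambda_\pm$ from Eq. 5.44 in floating point, then compares the trace and determinant of $M^{(2)}$ with $\Lambda_+ + \Lambda_-$ and $\Lambda_+ \Lambda_-$, and verifies that the vector $(1, \lambda - S)$ read off from Eq. 5.42 is an eigenvector for $\Lambda_+$. It also evaluates the right-hand side of Eq. 5.45 at $n = 1$, where it should reduce to the off-diagonal entry of $M^{(2)}$. Function names and sample values are choices made for this illustration only.

```python
import math


def inverse_factors(x1, x2, y1, y2, xi, eta):
    """a_i = 1/(x_i - xi) and b_j = 1/(y_j - eta)."""
    return 1 / (x1 - xi), 1 / (x2 - xi), 1 / (y1 - eta), 1 / (y2 - eta)


def m2_matrix(x1, x2, y1, y2, xi, eta):
    """The 2x2 matrix of Eq. (5.39)."""
    a1, a2, b1, b2 = inverse_factors(x1, x2, y1, y2, xi, eta)
    return [
        [(1 + a1 * b1) * (1 + a2 * b2), a1 * b1 * a2 * b2],
        [a1 * b2 * a2 * b1, (1 + a1 * b2) * (1 + a2 * b1)],
    ]


def s_and_lambda(x1, x2, y1, y2):
    """S of Eq. (5.41) and lambda = sqrt(1 + S^2) of Eq. (5.42)."""
    s = 0.5 * (x1 - x2) * (y1 - y2)
    return s, math.sqrt(1 + s * s)


def eigenvalues(x1, x2, y1, y2, xi, eta):
    """(Lambda_+, Lambda_-) of Eq. (5.44)."""
    a1, a2, b1, b2 = inverse_factors(x1, x2, y1, y2, xi, eta)
    _, lam = s_and_lambda(x1, x2, y1, y2)
    common = 1 + 0.5 * (a1 + a2) * (b1 + b2)
    weight = a1 * a2 * b1 * b2
    return common + (1 + lam) * weight, common + (1 - lam) * weight


# Arbitrary sample point, away from the poles.
ARGS = (2.0, -1.0, 0.5, 3.0, 0.25, -0.75)
M2 = m2_matrix(*ARGS)
LAM_PLUS, LAM_MINUS = eigenvalues(*ARGS)
S, LAM = s_and_lambda(*ARGS[:4])

trace_m = M2[0][0] + M2[1][1]
det_m = M2[0][0] * M2[1][1] - M2[0][1] * M2[1][0]
print(math.isclose(trace_m, LAM_PLUS + LAM_MINUS, rel_tol=1e-9))
print(math.isclose(det_m, LAM_PLUS * LAM_MINUS, rel_tol=1e-9))

# Eigenvector check: M2 . (1, lambda - S) = Lambda_+ (1, lambda - S).
v = (1.0, LAM - S)
mv = (M2[0][0] * v[0] + M2[0][1] * v[1], M2[1][0] * v[0] + M2[1][1] * v[1])
print(all(math.isclose(mv[k], LAM_PLUS * v[k], rel_tol=1e-9) for k in range(2)))

# n = 1 instance of Eq. (5.45): (Lambda_+ - Lambda_-) / (2 lambda) = M2[0][1].
print(math.isclose((LAM_PLUS - LAM_MINUS) / (2 * LAM), M2[0][1], rel_tol=1e-9))
```

The checks use only the trace, the determinant and one eigenvector, so they do not depend on how the rows and columns of $U^{(2)}$ are laid out.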
6. Mixed Correlation Functions and Biorthogonal Polynomials
Let us consider two polynomial potentials $V_1(x)$ and $V_2(y)$. Our goal is to compute the following matrix expectation values:
\[
\frac{\int_{H_n \times H_n} dM_1\, dM_2\; F_{\pi,\pi'}(\vec x,\vec y,M_1,M_2)\, e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}{\int_{H_n \times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}. \tag{6.1}
\]
6.1. Biorthonormal polynomials. We recall here a few elementary notions about biorthogonal polynomials. More detailed descriptions can be found in particular in [22, 21, 5, 8, 7, 4]. We introduce two families of polynomials $p_n(x) = \frac{1}{\sqrt{h_n}}\, x^n + O(x^{n-1})$, $q_n(y) = \frac{1}{\sqrt{h_n}}\, y^n + O(y^{n-1})$, with the same leading coefficient $\frac{1}{\sqrt{h_n}}$, and orthonormal with respect to the pairing:
\[
(p_n, q_m) = \int dx\, dy\; p_n(x)\, q_m(y)\, e^{-(V_1(x)+V_2(y)+xy)} = \delta_{nm}. \tag{6.2}
\]
The integration path is a priori $\mathbb{R} \times \mathbb{R}$, but this condition can be relaxed (see [3, 7]). When they exist, the biorthonormal polynomials are uniquely determined. Since the biorthonormal polynomials form a basis, one can decompose $x\, p_n(x)$ onto the basis of $p_m(x)$ with $m \leq n+1$:
\[ x\, p_n(x) = \sum_{m=0}^{n+1} Q_{nm}\, p_m(x) \tag{6.3} \]
and similarly:
\[ y\, q_n(y) = \sum_{m=0}^{n+1} P_{nm}\, q_m(y). \tag{6.4} \]
$Q$ and $P$ are infinite matrices. In the case where $V_2$ (resp. $V_1$) is a polynomial, then $Q$ (resp. $P$) is a finite band matrix.
We also introduce the following $\infty \times n$ rectangular matrix:
\[
\Pi_{n-1} := \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \\ & 0 & \end{pmatrix}, \tag{6.5}
\]
which is the projector onto the $n$ first polynomials.
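6.2. Mixed correlation functions.
Theorem 6.1.
\[
\frac{\int_{H_n \times H_n} dM_1\, dM_2\; F_{\pi,\pi'}(\vec x,\vec y,M_1,M_2)\, e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}{\int_{H_n \times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}
= \sum_{\rho} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y)\, \det\left( \Pi^t_{n-1} : \Lambda_\rho(\vec x,\vec y,Q,P^t) : \Pi_{n-1} \right), \tag{6.6}
\]
where for any function of two variables $f(\xi,\eta)$, we define $: f(Q,P^t) :$ by putting the $Q$'s on the left of the $P$'s. This is always possible in this case because $\Lambda_\rho(\vec x,\vec y,\xi,\eta)$ is a rational function of $\xi$ and $\eta$.
Proof. It works as usual (see [21, 22]), by writing Vandermonde determinants as:
\[
\Delta(X) = \det\left( X_i^{\,j-1} \right) = \det\left( \sqrt{h_{j-1}}\; p_{j-1}(X_i) \right) = \sqrt{\prod_{i=0}^{n-1} h_i}\; \sum_{\sigma} (-1)^{\sigma} \prod_{i} p_{\sigma(i)}(X_i), \tag{6.7}
\]
\[
\Delta(Y) = \det\left( Y_i^{\,j-1} \right) = \det\left( \sqrt{h_{j-1}}\; q_{j-1}(Y_i) \right) = \sqrt{\prod_{i=0}^{n-1} h_i}\; \sum_{\tau} (-1)^{\tau} \prod_{i} q_{\tau(i)}(Y_i). \tag{6.8}
\]
Then, we use Eq. 5.32, i.e.
\[
\begin{aligned}
&\frac{1}{c_n\, \tilde J_n^2\, \pi^{\frac{n(n-1)}{2}}} \int_{H_n \times H_n} dM_1\, dM_2\; F_{\pi,\pi'}(\vec x,\vec y,M_1,M_2)\, e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)} \\
&\quad = \frac{1}{n!^2} \prod_{i=0}^{n-1} h_i \sum_{\rho \in \Sigma(R)} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y) \sum_{\sigma,\tau,\nu \in \Sigma(n)} (-1)^{\sigma\tau\nu} \\
&\qquad \times \int \prod_{i} \Lambda_\rho(\vec x,\vec y,X_i,Y_{\nu(i)})\, p_{\sigma(i)}(X_i)\, e^{-V_1(X_i)}\, q_{\tau\nu(i)}(Y_{\nu(i)})\, e^{-V_2(Y_{\nu(i)})}\, e^{-X_i Y_{\nu(i)}}\, dX_i\, dY_{\nu(i)} \\
&\quad = \frac{1}{n!^2} \prod_{i=0}^{n-1} h_i \sum_{\rho \in \Sigma(R)} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y) \sum_{\sigma,\tau,\nu \in \Sigma(n)} (-1)^{\sigma\tau\nu} \prod_{i=1}^{n} \left( : \Lambda_\rho(\vec x,\vec y,Q,P^t) : \right)_{\sigma(i),\tau\nu(i)} \\
&\quad = \prod_{i=0}^{n-1} h_i \sum_{\rho \in \Sigma(R)} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y) \sum_{\sigma \in \Sigma(n)} (-1)^{\sigma} \prod_{i=1}^{n} \left( : \Lambda_\rho(\vec x,\vec y,Q,P^t) : \right)_{i,\sigma(i)} \\
&\quad = \prod_{i=0}^{n-1} h_i \sum_{\rho \in \Sigma(R)} U_{\pi,\rho}(\vec x,\vec y)\, U_{\pi',\rho}(\vec x,\vec y)\, \det\left( \Pi^t_{n-1} : \Lambda_\rho(\vec x,\vec y,Q,P^t) : \Pi_{n-1} \right).
\end{aligned}
\tag{6.9}
\]
Or, using the matricial determinant defined in Def. 5.3:
\[
\frac{\int_{H_n \times H_n} dM_1\, dM_2\; F_{\pi,\pi'}(\vec x,\vec y,M_1,M_2)\, e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}{\int_{H_n \times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}
= \left( \mathrm{Mdet}\left( \Pi^t_{n-1} : M^{(R)}(\vec x,\vec y,Q,P^t) : \Pi_{n-1} \right) \right)_{\pi,\pi'}. \tag{6.10}
\]
Example. With $R = 1$, we find:
\[
\frac{\int_{H_n \times H_n} dM_1\, dM_2 \left( 1 + \mathrm{Tr}\, \frac{1}{x - M_1}\, \frac{1}{y - M_2} \right) e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}{\int_{H_n \times H_n} dM_1\, dM_2\; e^{-\mathrm{Tr}\,(V_1(M_1)+V_2(M_2)+M_1M_2)}}
= \det\left( \Pi^t_{n-1} \left( 1 + \frac{1}{x - Q}\, \frac{1}{y - P^t} \right) \Pi_{n-1} \right), \tag{6.11}
\]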
which is identical to what was found in [5].

7. Conclusions

In this article, we have shown that the hermitian 2-matrix model and the complex matrix model have the same loop equations. In the gaussian case, this implies that they are identical. When the weight is non-gaussian, the loop equations, which are recursion equations, determine all correlation functions once some initial conditions (moduli) are fixed. The generalization of the hermitian 2-matrix model to homology classes of contours (as in [7]) allows for arbitrary initial conditions, so for each set of initial conditions there exists a choice of homology class of contours for which the complex matrix model is identical to the hermitian 2-matrix model. Conversely, the initial conditions for the complex matrix model are not fully understood yet; they depend on how the complex matrix model is defined. If the complex matrix model is only a formal integral defined by its large n properties as in [27, 28], initial conditions are associated to filling fractions, and can thus be chosen arbitrarily. If the complex matrix model is defined as the result of a convergent integral for all n, it is not known yet how to find which homology class of contours it corresponds to.

This identification, through diagonalization of hermitian matrices and Jordanization of complex matrices, yields an identity between unitary group integrals and triangular matrix integrals, which seems to be a special case of the identification of GLn(C)/T(n) and the quotient of SU(n) by its Cartan subalgebra. The nature of that identification needs to be further understood, in particular in terms of characters of both groups, group representation theory, Weyl's character formula, or Harish-Chandra formulae.

The gaussian triangular matrix integrals are easily computed, and we thus get very explicit expressions for all expectation values of the type studied by Shatashvili [24]. In particular, we have provided a new proof of the Itzykson-Zuber-Harish-Chandra integral, as well as Morozov's integral. The key point in this computation is that the matrices M commute with one another. This fact seems to be related to integrability, and is very reminiscent of some Yang-Baxter relations and of the Bethe ansatz; it would be interesting to understand how. It would also be interesting to understand these formulae in the framework of Duistermaat-Heckman localization theories [10].

We have then been able to perform the integral over eigenvalues, in a way very similar to what was done in [5], i.e. in terms of n × n determinants. It would be interesting to rewrite these n × n determinants in terms of determinants of size independent of n, using kernels, as is known for non-mixed expectation values (see [1, 2, 15]).

Acknowledgements. The authors want to thank F. David, P. Di Francesco, M. Bergère, M. Bauer for stimulating discussions, and J.B. Zuber for carefully reading the manuscript. One of the authors (B.E.) wants to thank the European network Enigma (MRTN-CT-2004-5652). A. P-F. wants to thank the SPhT Saclay for its hospitality when part of this work was being conducted, and the support of CIRIT grant 2001FI-00387.
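Appendix A. Gaussian Integrals

Let $\delta = \alpha_1\alpha_2 - \gamma^2$.

Real integrals:
\[ \int_{\mathbb{R}\times\mathbb{R}} dx\,dy\; e^{-(\frac{\alpha_1}{2}x^2+\frac{\alpha_2}{2}y^2+\gamma xy)} = \frac{2\pi}{\sqrt{\delta}}, \tag{1.1} \]
\[ \frac{\int_{\mathbb{R}\times\mathbb{R}} dx\,dy\; x^k y^l\, e^{-(\frac{\alpha_1}{2}x^2+\frac{\alpha_2}{2}y^2+\gamma xy)}}{\int_{\mathbb{R}\times\mathbb{R}} dx\,dy\; e^{-(\frac{\alpha_1}{2}x^2+\frac{\alpha_2}{2}y^2+\gamma xy)}} =
\begin{cases}
0 & \text{if } k+l \text{ is odd} \\
\frac{\sqrt{\delta}}{2\pi}\left(-2\frac{\partial}{\partial\alpha_1}\right)^{\frac{k-l}{2}}\left(-\frac{\partial}{\partial\gamma}\right)^{l}\frac{2\pi}{\sqrt{\delta}} & \text{if } k \ge l \\
\frac{\sqrt{\delta}}{2\pi}\left(-2\frac{\partial}{\partial\alpha_2}\right)^{\frac{l-k}{2}}\left(-\frac{\partial}{\partial\gamma}\right)^{k}\frac{2\pi}{\sqrt{\delta}} & \text{if } k \le l,
\end{cases} \tag{1.2} \]
\[ \int_{D_n(\mathbb{R})\times D_n(\mathbb{R})} dX\,dY\; e^{-\mathrm{Tr}(\frac{\alpha_1}{2}X^2+\frac{\alpha_2}{2}Y^2+\gamma XY)} = \left(\frac{2\pi}{\sqrt{\delta}}\right)^{n}. \tag{1.3} \]

Complex integrals:
\[ \int_{\mathbb{C}} dx\; e^{-(\frac{\alpha_1}{2}x^2+\frac{\alpha_2}{2}\bar{x}^2+\gamma x\bar{x})} = \frac{\pi}{\sqrt{-\delta}}, \tag{1.4} \]
\[ \frac{\int_{\mathbb{C}} dx\; x^k \bar{x}^l\, e^{-(\frac{\alpha_1}{2}x^2+\frac{\alpha_2}{2}\bar{x}^2+\gamma x\bar{x})}}{\int_{\mathbb{C}} dx\; e^{-(\frac{\alpha_1}{2}x^2+\frac{\alpha_2}{2}\bar{x}^2+\gamma x\bar{x})}} =
\begin{cases}
0 & \text{if } k+l \text{ is odd} \\
\frac{\sqrt{-\delta}}{\pi}\left(-2\frac{\partial}{\partial\alpha_1}\right)^{\frac{k-l}{2}}\left(-\frac{\partial}{\partial\gamma}\right)^{l}\frac{\pi}{\sqrt{-\delta}} & \text{if } k \ge l \\
\frac{\sqrt{-\delta}}{\pi}\left(-2\frac{\partial}{\partial\alpha_2}\right)^{\frac{l-k}{2}}\left(-\frac{\partial}{\partial\gamma}\right)^{k}\frac{\pi}{\sqrt{-\delta}} & \text{if } k \le l,
\end{cases} \tag{1.5} \]
\[ \int_{D_n(\mathbb{C})} dX\; e^{-\mathrm{Tr}(\frac{\alpha_1}{2}X^2+\frac{\alpha_2}{2}\bar{X}^2+\gamma X\bar{X})} = \left(\frac{\pi}{\sqrt{-\delta}}\right)^{n}. \tag{1.6} \]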
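The real integrals (1.1) and (1.2) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names, the sample values $\alpha_1 = 2$, $\alpha_2 = 3$, $\gamma = 1$, and the midpoint-rule quadrature on a truncated square are all illustrative assumptions. For $k = l = 1$, formula (1.2) evaluates to $-\gamma/\delta$, and for $k = 2$, $l = 0$ to $\alpha_2/\delta$.

```python
import math

def weighted_integral(alpha1, alpha2, gamma, k=0, l=0, half_width=8.0, steps=200):
    """Midpoint-rule estimate of the integral of
    x^k y^l exp(-(alpha1/2 x^2 + alpha2/2 y^2 + gamma x y))
    over the square [-half_width, half_width]^2 (hypothetical helper)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        for j in range(steps):
            y = -half_width + (j + 0.5) * h
            weight = math.exp(-(0.5 * alpha1 * x * x + 0.5 * alpha2 * y * y + gamma * x * y))
            total += (x ** k) * (y ** l) * weight
    return total * h * h

def closed_form_normalisation(alpha1, alpha2, gamma):
    """Right-hand side of (1.1): 2*pi / sqrt(delta), delta = alpha1*alpha2 - gamma^2."""
    delta = alpha1 * alpha2 - gamma ** 2
    return 2.0 * math.pi / math.sqrt(delta)

def moment(alpha1, alpha2, gamma, k, l):
    """Left-hand side of (1.2): normalised expectation value of x^k y^l."""
    return (weighted_integral(alpha1, alpha2, gamma, k, l)
            / weighted_integral(alpha1, alpha2, gamma))

if __name__ == "__main__":
    a1, a2, g = 2.0, 3.0, 1.0
    delta = a1 * a2 - g ** 2
    print(weighted_integral(a1, a2, g), closed_form_normalisation(a1, a2, g))
    print(moment(a1, a2, g, 1, 1), -g / delta)   # k = l = 1 case of (1.2)
    print(moment(a1, a2, g, 2, 0), a2 / delta)   # k = 2, l = 0 case of (1.2)
```

The quadrature requires $\delta > 0$ and positive $\alpha_1$, $\alpha_2$ so that the integrand decays; the complex integrals (1.4) to (1.6) are not covered by this sketch.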
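Lemma A.1. Let $\omega(X, Y)$ be a polynomial in all its variables $X_1, \dots, X_n$ and $Y_1, \dots, Y_n$. One has:
\[ \frac{\int_{D_n(\mathbb{R})\times D_n(\mathbb{R})} dX\,dY\; \omega(X,Y)\, e^{-\mathrm{Tr}(\frac{\alpha_1}{2}X^2+\frac{\alpha_2}{2}Y^2+\gamma XY)}}{\int_{D_n(\mathbb{R})\times D_n(\mathbb{R})} dX\,dY\; e^{-\mathrm{Tr}(\frac{\alpha_1}{2}X^2+\frac{\alpha_2}{2}Y^2+\gamma XY)}}
= \frac{\int_{D_n(\mathbb{C})} dX\; \omega(X,\bar{X})\, e^{-\mathrm{Tr}(\frac{\alpha_1}{2}X^2+\frac{\alpha_2}{2}\bar{X}^2+\gamma X\bar{X})}}{\int_{D_n(\mathbb{C})} dX\; e^{-\mathrm{Tr}(\frac{\alpha_1}{2}X^2+\frac{\alpha_2}{2}\bar{X}^2+\gamma X\bar{X})}}. \tag{1.7} \]
Proof. Equations 1.2 and 1.5 show that it is true for n = 1. By decomposing ω into monomials, the integral decouples into a product of n = 1 type integrals.

Appendix B. Some Commutations

Theorem B.1. The matrix $M^{(R)}(x, y, \xi, \eta)$ commutes with the matrix $A(x, y)$ defined by:
\[ A_{\pi,\pi}(x,y) := \sum_i x_i\, y_{\pi(i)}, \qquad A_{\pi,\pi'}(x,y) := 1 \ \text{if } \pi\pi'^{-1} = \text{transposition}, \qquad A_{\pi,\pi'}(x,y) := 0 \ \text{otherwise}. \tag{2.1} \]
Proof.
\[ A(x,y) = \mathop{\mathrm{Res}}_{\xi\to\infty}\ \mathop{\mathrm{Res}}_{\eta\to\infty}\ \xi\,\eta\, M^{(R)}(x,y,\xi,\eta)\, d\xi\, d\eta. \tag{2.2} \]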
x , y) defined by: Theorem B.2. The matrices Aα,β ( α,β δπ(i),π (i) + x , y) := δβ,π(α) Aπ,π ( i=α
1 1 xα − xi yβ − yπ(i)
1 − δβ,π(α) (xα − xπ −1 (β) )(yβ − yπ(α) ) 1 1 δπ(i),π (i) + × xα − xi yβ − yπ(i) −1 +
i=α,π
(2.3)
(β)
commute together for all α, β. They also commute with M( x , y, ξ, η) and with A( x , y).
One has: x , y, ξ, η) = 1 + M(R) (
α,β
1 Aα,β ( x , y). (ξ − xα )(η − yβ )
(2.4)
Proof. Aα,β ( x , y) = Res Res M(R) ( x , y, ξ, η) dξ dη. ξ →xα η→yβ
(2.5)
References
1. Bergère, M.: Biorthogonal polynomials for potentials of two variables and external sources at the denominator. http://arxiv.org/list/hep-th/0404126, 2004
2. Bergère, M.: Biorthogonal polynomials for potentials of two variables and external sources at the denominator. 4th Eurogrid meeting, Les Houches (March 2004), and Capri meeting (Sept 2004), http://www.na.infn.it/congr/Bohr/, 2004
3. Bertola, M.: Bilinear semi–classical moment functionals and their integral representation. J. App. Theory 121, 71–99 (2003)
4. Bertola, M., Eynard, B.: The PDEs of biorthogonal polynomials arising in the 2-matrix model. http://arxiv.org/list/nlin.SI/0311033, 2003
5. Bertola, M., Eynard, B.: Mixed Correlation Functions of the Two-Matrix Model. J. Phys. A36, 7733–7750 (2003)
6. Bertola, M., Eynard, B., Harnad, J.: Semiclassical orthogonal polynomials, matrix models and isomonodromic tau functions. http://arxiv.org/list/nlin.SI/0410043, 2004
7. Bertola, M., Eynard, B., Harnad, J.: Differential systems for biorthogonal polynomials appearing in 2–matrix models and the associated Riemann–Hilbert problem. Commun. Math. Phys. 243, 193–240 (2003)
8. Bertola, M., Eynard, B., Harnad, J.: Duality, Biorthogonal Polynomials and Multi–Matrix Models. Commun. Math. Phys. 229, 73–120 (2002)
9. Di Francesco, P., Ginsparg, P., Zinn-Justin, J.: 2D Gravity and Random Matrices. Phys. Rep. 254, 1 (1995)
10. Duistermaat, J.J., Heckman, G.J.: Inv. Math. 69, 259 (1982)
11. Eynard, B.: Eigenvalue distribution of large random matrices, from one matrix to several coupled matrices. Nucl. Phys. B 506, 3, 633–664 (1997)
12. Eynard, B.: A short note about Morozov's formula. http://arxiv.org/list/math-ph/0406063, 2004
13. Eynard, B.: Master loop equations, free energy and correlations for the chain of matrices. JHEP 03101, 018 (2003)
14. Eynard, B.: Large N expansion of the 2-matrix model. JHEP 0301, 051 (2003)
15. Eynard, B., Mehta, M.L.: Matrices coupled in a chain: eigenvalue correlations. J. Phys. A: Math. Gen. 31, 4449 (1998)
16. Fulton, W., Harris, J.: Representation theory. Series Graduate Texts in Mathematics, Vol. 129, Berlin-Heidelberg-New York: Springer, 1991
17. Itzykson, C., Zuber, J.B.: The planar approximation (II). J. Math. Phys. 21, 411 (1980)
18. Harish-Chandra: Amer. J. Math. 80, 241 (1958)
19. Kazakov, V.A.: Ising model on a dynamical planar random lattice: exact solution. Phys. Lett. A 119, 140 (1986)
20. Knapp, A.W.: Representation theory of semisimple groups. Princeton, NJ: Princeton University Press, 1986
21. Mehta, M.L.: A method of integration over matrix variables. Commun. Math. Phys. 79, 327 (1981)
22. Mehta, M.L.: Random Matrices. 2nd edition, New York: Academic Press, 1991
23. Morozov, A.: Pair correlator in the Itzykson–Zuber Integral. Modern Phys. Lett. A 7, no. 37, 3503–3507 (1992)
24. Shatashvili, S.L.: Correlation Functions in the Itzykson–Zuber Model. Commun. Math. Phys. 154, 421–432 (1993)
25. Staudacher, M.: Combinatorial Solution of the Two–Matrix Model. Phys. Lett. B 305, 332–338 (1993)
26. Ginibre, J.: Statistical ensembles of complex, quaternion and real matrices. J. Math. Phys. 6, 440 (1965)
27. Kostov, I.K., Krichever, I., Mineev-Weinstein, M., Wiegmann, P., Zabrodin, A.: τ-function for analytic curves. In: Random matrices and their applications, MSRI publications, Vol. 40, Cambridge: Cambridge University Press, 2001, p. 285
28. Zabrodin, A.: Matrix models and growth processes: from viscous flows to the quantum Hall effect. Lectures given at the School Applications of Random Matrices in Physics, Les Houches, June 2004, http://arxiv.org/list/hep-th/0412219, 2004

Communicated by L. Takhtajan
Commun. Math. Phys. 264, 145–165 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1478-3
Communications in
Mathematical Physics
Infrared-Finite Algorithms in QED: The Groundstate of an Atom Interacting with the Quantized Radiation Field
Volker Bach1, Jürg Fröhlich2, Alessandro Pizzo3
1 FB Mathematik, Universität Mainz, 55099 Mainz, Germany. E-mail: [email protected]
2 Inst. f. Theoretische Physik, ETH Hönggerberg, 8093 Zürich, Switzerland. E-mail: [email protected]
3 Dept. de Math., Univ. Paris-Sud, 91405 Orsay Cedex, France. E-mail: [email protected]
Received: 7 March 2005 / Accepted: 13 June 2005 Published online: 28 February 2006 – © Springer-Verlag 2006
Abstract: In this paper, the groundstate of a nonrelativistic atom, minimally coupled to the quantized radiation field, and its groundstate energy are constructed by an iteration scheme inspired by [10]. This scheme successively removes an infrared cutoff in momentum space and yields a convergent algorithm enabling us to calculate the groundstate and the groundstate energy, to arbitrary order in the feinstructure constant α ∼ 1/137. In forthcoming papers, we will use our result to re-expand the groundstate and, eventually, scattering amplitudes in terms of bare quantities. I. Description of the Problem and Summary of Main Results The purpose of this paper is to develop an infrared-finite algorithm for the construction of the groundstate of a nonrelativistic atom interacting with the quantized radiation field. Using a scheme that goes back to Pizzo [10], we derive a convergent expansion for the groundstate and the groundstate energy, assuming the feinstructure constant α ∼ 1/137 to be a sufficiently small positive number. This is the first in a series of papers devoted to the construction of the groundstate and related quantities and their explicit representation as finite sums of contributions computable in terms of finitely many convergent integrals and an error term bounded above by const · α N , for any given N > 0. In further papers in this series, we will use the result of the present one to derive an infrared-finite algorithm yielding an explicit representation for the groundstate and a scheme for the calculation of the groundstate energy and of scattering amplitudes for Rayleigh scattering of light at an atom. As a corollary, we will present a mathematically precise justification of Bohr’s frequency condition. In this paper, an atom is described as a quantum-mechanical bound state consisting of a static, positively charged pointlike nucleus surrounded by electrons. 
also at IHES, Bures-sur-Yvette, France.

The electrons are described as nonrelativistic, pointlike quantum-mechanical particles with electric charge $-e$ and spin $\frac{1}{2}$, as originally proposed by Pauli. They are bound to the nucleus by the electrostatic Coulomb field and interact with the soft modes of the quantized electromagnetic field. To keep our exposition as simple as possible, we consider a hydrogen atom consisting of a single, static proton of charge $e$ surrounded by only one electron. The spin of the electron then turns out to be an inessential complication, and, for simplicity, we neglect the Zeeman coupling of the magnetic moment of the electron to the quantized magnetic field. It would, however, not be difficult to include the Zeeman term in our analysis. The Hilbert space of pure state vectors of the system just described is given by
\[ \mathcal{H} := \mathcal{H}_{el} \otimes \mathcal{F}, \tag{I.1} \]
where $\mathcal{H}_{el} = L^2(\mathbb{R}^3)$ is the Hilbert space appropriate for the description of a single electron, disregarding its spin, and $\mathcal{F}$ is the Fock space used to describe the states of the transverse modes of the quantized electromagnetic field, i.e., the photons. More explicitly,
\[ \mathcal{F} := \bigoplus_{N=0}^{\infty} \mathcal{F}^{(N)}, \qquad \mathcal{F}^{(0)} = \mathbb{C}\,\Omega, \tag{I.2} \]
where $\Omega$ is the vacuum vector, i.e., the state of the electromagnetic field without any excited modes, and
\[ \mathcal{F}^{(N)} := S_N \bigotimes_{j=1}^{N} \mathfrak{h}, \qquad N \ge 1, \tag{I.3} \]
where the Hilbert space $\mathfrak{h}$ of state vectors of a single photon is
\[ \mathfrak{h} := L^2[\mathbb{R}^3 \times \mathbb{Z}_2], \tag{I.4} \]
with photon momentum space given by $\mathbb{R}^3$, and $\mathbb{Z}_2$ accounting for the two independent transverse polarizations, or helicities, of a photon. In Eq. (I.3), $S_N$ denotes the orthogonal projection onto the subspace of $\bigotimes_{j=1}^{N} \mathfrak{h}$ of totally symmetric $N$-photon wave functions, in accordance with the fact that photons satisfy Bose-Einstein statistics. Thus, $\mathcal{F}^{(N)}$ is the subspace of $\mathcal{F}$ of state vectors for configurations of exactly $N$ photons. It is convenient to represent the Hilbert space $\mathcal{H}$ as the space of square-integrable wave functions on the electron position space $\mathbb{R}^3$ with values in the photon Fock space $\mathcal{F}$, i.e.,
\[ \mathcal{H} \cong L^2(\mathbb{R}^3; \mathcal{F}). \tag{I.5} \]
The dynamics of the system is generated by the Hamiltonian
\[ H := \big(-i\vec{\nabla}_x + \alpha^{3/2}\vec{A}(\alpha\vec{x})\big)^2 - V(\vec{x}) + \check{H}. \tag{I.6} \]
In Eq. (I.6), $\vec{\nabla}_x$ denotes the gradient with respect to the electron position $\vec{x} \in \mathbb{R}^3$, $\alpha \cong 1/137$ is the feinstructure constant, $\vec{A}(\vec{x})$ denotes the vector potential of the transverse modes of the quantized electromagnetic field in the Coulomb gauge,
\[ \vec{\nabla}_x \cdot \vec{A}(\vec{x}) = 0, \tag{I.7} \]
$V$ is the Coulomb potential of electrostatic attraction of the electron to the nucleus, and $\check{H}$ is the Hamiltonian of the quantized, free electromagnetic field. Throughout this paper, we work with dimensionless quantities: If $\vec{X}$ is the dimensionful position vector of the electron, we set
\[ \vec{x} := \frac{2}{r_{Bohr}}\,\vec{X}, \tag{I.8} \]
where the Bohr radius $r_{Bohr} = \frac{\hbar^2}{m e^2} = \frac{1}{m\alpha}$, in units where Planck's constant $\hbar = 1$, and the velocity of light $c = 1$. Here, $m$ denotes the electron's mass. Then the components of the momentum vector, $\vec{K}$, of a single photon have dimension (length)$^{-1}$, and we set
\[ \vec{k} := \frac{r_{Bohr}}{2\alpha}\,\vec{K}. \tag{I.9} \]
The variables $\vec{x}$ and $\vec{k}$ are dimensionless. We choose units for the energy such that
\[ 2 m \alpha^2 = 4\ \text{Rydberg} = 1. \tag{I.10} \]
With these conventions
\[ \vec{K}\cdot\vec{X} = \alpha\,\vec{k}\cdot\vec{x}, \qquad\text{and}\qquad V(\vec{x}) = \frac{1}{|\vec{x}|}. \tag{I.11} \]
Furthermore,
\[ \check{H} := \sum_{\lambda=\pm} \int d^3k\; a^*(\vec{k},\lambda)\, |\vec{k}|\, a(\vec{k},\lambda), \tag{I.12} \]
where $a^*(\vec{k},\lambda)$ and $a(\vec{k},\lambda)$ are the usual photon creation- and annihilation operators, which satisfy the canonical commutation relations
\[ [a^*(\vec{k},\lambda), a^*(\vec{k}',\lambda')] = [a(\vec{k},\lambda), a(\vec{k}',\lambda')] = 0, \tag{I.13} \]
\[ [a(\vec{k},\lambda), a^*(\vec{k}',\lambda')] = \delta_{\lambda\lambda'}\,\delta(\vec{k}-\vec{k}'), \tag{I.14} \]
\[ a(\vec{k},\lambda)\,\Omega = 0, \tag{I.15} \]
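for all $\vec{k}, \vec{k}' \in \mathbb{R}^3$ and $\lambda, \lambda' \in \mathbb{Z}_2 \equiv \{\pm\}$. The vector potential, $\vec{A}(\vec{x})$, in the Coulomb gauge is given by
\[ \vec{A}(\vec{x}) := \sum_{\lambda=\pm} \int \frac{d^3k\; \Lambda(\vec{k})}{(2\pi)^{3/2}\sqrt{2|\vec{k}|}} \Big\{ \vec{\varepsilon}(\vec{k},\lambda)\, e^{-i\vec{k}\cdot\vec{x}}\, a^*(\vec{k},\lambda) + \vec{\varepsilon}(\vec{k},\lambda)^*\, e^{i\vec{k}\cdot\vec{x}}\, a(\vec{k},\lambda) \Big\}, \tag{I.16} \]
where $\Lambda(\vec{k})$ is the characteristic function of the ball $\{\vec{k} \in \mathbb{R}^3 \mid |\vec{k}| \le \kappa\}$ (or a nonnegative, smooth approximation thereof), and $\vec{\varepsilon}(\vec{k},+)$, $\vec{\varepsilon}(\vec{k},-)$ are photon polarization vectors, i.e., two unit vectors in $\mathbb{C}\otimes\mathbb{R}^3$ satisfying
\[ \vec{\varepsilon}(\vec{k},\lambda)^* \cdot \vec{\varepsilon}(\vec{k},\mu) = \delta_{\lambda\mu}, \qquad \vec{k}\cdot\vec{\varepsilon}(\vec{k},\lambda) = 0, \tag{I.17} \]
for $\lambda, \mu = \pm$. The equation $\vec{k}\cdot\vec{\varepsilon}(\vec{k},\lambda) = 0$ expresses the Coulomb gauge condition. We note that, by a famous theorem due to Hopf, the vectors $\vec{\varepsilon}(\vec{k},\pm)$ cannot be chosen to be continuous functions of $\vec{k}/|\vec{k}|$; but this is of no concern for us.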
The function $\Lambda(\vec{k})$ insures that modes of the electromagnetic field corresponding to wave vectors $\vec{k}$ with $|\vec{k}| \ge \kappa$ do not interact with the electron; i.e., $\Lambda$ represents an ultraviolet cutoff that will be kept fixed throughout our analysis. The vector potential defined in (I.16) is thus cut off in the ultraviolet – the true vector potential of the electromagnetic field corresponds to the choice $\Lambda \equiv 1$. In our analysis of groundstates of light atoms and of low-energy Rayleigh scattering, an ultraviolet cutoff with $5 \le \kappa \le 10$ can be expected to yield reliable results. (We shall henceforth assume that $\kappa \ge 1$.) Of course, it would be desirable to carry out a mathematically precise analysis of ultraviolet renormalization (renormalization of the chemical potential and of the mass of an electron) and to construct the limiting dynamics, as $\kappa \to \infty$. But this problem is outside the scope of our paper. In the following, we simplify our notations by setting
\[ k := (\vec{k},\lambda), \qquad \omega(k) \equiv |k| := |\vec{k}|, \qquad\text{and}\qquad \int f(k)\,dk := \sum_{\lambda=\pm}\int f(\vec{k},\lambda)\, d^3k, \tag{I.18} \]
for any integrable (possibly vector-valued) function $f$.

Next, we summarize the main results of this paper. In Sect. II, we prove that the Hamiltonian $H$ introduced in Eq. (I.6) is selfadjoint on the domain of the operator $-\Delta_x + \check{H}$, provided the feinstructure constant $\alpha$, which we consider a perturbation parameter, is small, depending on the value of the ultraviolet cutoff $\kappa$. (Actually, it is not difficult to construct a canonical selfadjoint extension of $H$, for arbitrary values of $\kappa$ and $\alpha$, using Brownian motion and a Feynman-Kac formula for $\exp[-tH]$; see, e.g., [5].) If $\sigma(H)$ denotes the spectrum of $H$, we prove that
\[ E_{gs} := \inf \sigma(H) > -\infty \tag{I.19} \]
is a simple eigenvalue of $H$ corresponding to a groundstate eigenvector $\phi_{gs} \in \mathcal{H}$ of $H$, provided $\alpha > 0$ is sufficiently small (depending on $\kappa$). Furthermore,
\[ \sigma(H) = [E_{gs}, \infty), \tag{I.20} \]
in fact, $E_{gs}$ is a threshold for continuous spectrum of $H$ of infinite multiplicity. Stated without more precision, these results are not new. The first result of this type for a slightly simplified model appeared in [1] and for the model described above in [3]; a stronger result valid for arbitrary values of $\alpha$ and $\kappa$ was proven in [4, 9]. The uniqueness of the groundstate for arbitrary values of $\alpha$ was shown in the spinless case in [6], and for the spin-$\frac{1}{2}$ case, its two-fold degeneracy was shown in [7]. The point of our analysis is that we present an explicit inductive construction of $(E_{gs}, \phi_{gs})$, based on a decomposition of the terms in the Hamiltonian $H$ that describe the interactions between the electron and the photons into a sum of terms with the property that at least one photon creation- or annihilation operator is localized in a shell
\[ \big\{ k \;\big|\; \kappa\beta^{n+1} \le |k| \le \kappa\beta^{n} \big\}, \qquad \kappa \ge 1, \quad 0 < \beta < \kappa^{-1}, \quad n \in \mathbb{N}_0 = \{0, 1, 2, \dots\}, \tag{I.21} \]
of photon momentum space. Our inductive construction yields a sequence $(E_n, \phi_n)_{n=0}^{\infty}$ of approximate groundstate energies and groundstates constructed from Hamiltonians $H_n$ with an infrared cutoff at $|k| = \kappa\beta^n$. More precisely, $H_n$ derives from $H$ by replacing the cutoff $\Lambda$ in the definition (I.16) of the vector potential $\vec{A}(\vec{x})$ by the characteristic function of $\{\vec{k} \in \mathbb{R}^3 \mid \kappa\beta^n < |\vec{k}| \le \kappa\}$. We prove that
\[ \lim_{n\to\infty} E_n = E_{gs}, \qquad \lim_{n\to\infty} \phi_n = \phi_{gs}. \tag{I.22} \]
Our construction is inspired by and similar to one used by Pizzo [10] in an analysis of Nelson's model. The main new challenge we are coping with is that, while the interaction term in $H$ superficially seems to be marginal (in the sense of power counting) in the infrared, it is actually irrelevant in the infrared on the subspace of all those states where the electron is localized near the nucleus in position space. This will be made precise with the help of the identity
\[ i\,[H, \vec{x}] = 2\,\vec{v}, \tag{I.23} \]
where $\vec{v} := -i\vec{\nabla}_x + \alpha^{3/2}\vec{A}(\alpha\vec{x})$ is (two times) the electron velocity operator, the virial theorem, and using the result that, in the groundstate, the electron is (exponentially well) localized near the nucleus in position space.

In a further paper we will explore a rather remarkable consequence of the insight that, in a study of groundstates of atoms – and, for that matter, of low-energy Rayleigh scattering – the interaction term in the Hamiltonian $H$ behaves as if it were irrelevant in the infrared: If the constant $\beta$ in (I.21) is chosen to depend on the feinstructure constant $\alpha$ in such a way that $\lim_{\alpha\to 0}\beta = 0$, e.g., $\alpha \le \beta \le \sqrt{\alpha}$, for $\alpha \ll 1$, then
\[ |E_n - E_{gs}| \le O(\alpha^N), \tag{I.24} \]
\[ \|\phi_n - \phi_{gs}\| \le O(\alpha^N), \tag{I.25} \]
for any finite $N > 0$, provided $n \ge \mathrm{const}\, N$. We plan to use these estimates to prove that, for $\beta := \alpha$, $E_{gs} \equiv E_{gs}(\alpha)$ and $\phi_{gs} \equiv \phi_{gs}(\alpha)$ have expansions of the form
\[ E_{gs}(\alpha) = \varepsilon_0 + \sum_{k=1}^{2N} \varepsilon_k(\alpha)\, \alpha^{k/2} + o(\alpha^N), \tag{I.26} \]
\[ \phi_{gs}(\alpha) = \varphi_0 + \sum_{k=1}^{2N} \varphi_k(\alpha)\, \alpha^{k/2} + o(\alpha^N), \tag{I.27} \]
for arbitrary $N = 1, 2, 3, \dots$, where the coefficients $\varepsilon_k(\alpha)$ and $\varphi_k(\alpha)$ are smaller in magnitude than any positive power of $\alpha$, i.e., for any $\delta > 0$ and any $k \in \{1, 2, \dots, 2N\}$,
\[ \lim_{\alpha\to 0} \alpha^{\delta}\,\varepsilon_k(\alpha) = 0 \qquad\text{and}\qquad \lim_{\alpha\to 0} \alpha^{\delta}\,\|\varphi_k(\alpha)\| = 0. \tag{I.28} \]
The point of the expansions (I.26) and (I.27) is that $\varepsilon_k(\alpha)$ and $\varphi_k(\alpha)$ are computable in terms of finitely many convergent integrals, for any $0 \le k < \infty$. We expect that powers of $\ln[1/\alpha]$ ("infrared logarithms") appear in $\varepsilon_k(\alpha)$ and $\varphi_k(\alpha)$, which would not be an artefact of our algorithm, but an expression of infrared divergences in naive perturbation theory: The quantities $E_{gs}(\alpha)$ and $\phi_{gs}(\alpha)$ are not analytic nor even smooth at $\alpha = 0$, but their derivatives of sufficiently high order diverge, as $\alpha \to 0$.

II. Construction of the Ground State

In this section we prove our main result. We show that the infimum $E_{gs}$ of the spectrum of $H$, defined in (I.6), is an eigenvalue of multiplicity one with a corresponding eigenvector $\phi_{gs}$. Our proof does not use the explicit form of $V(\vec{x}) = |\vec{x}|^{-1}$ but only requires that $V: \mathbb{R}^3 \to \mathbb{R}$ is an infinitesimal perturbation of $-\Delta$, i.e., the following hypothesis.
Hypothesis 1. The form domain of $V$ includes the form domain $H^1(\mathbb{R}^3)$ of $-\Delta$ and, for any $\varepsilon > 0$, there exists a constant $b_\varepsilon < \infty$, such that
\[ \pm V \le \varepsilon\,(-\Delta) + b_\varepsilon \cdot \mathbf{1} \tag{II.1} \]
on $H^1(\mathbb{R}^3)$. Moreover, $\lim_{|x|\to\infty} V(x) = 0$, and
\[ e_{el} := \inf \sigma(H_{el}) < 0 \ \text{is an isolated eigenvalue of multiplicity one with corresponding normalized eigenvector } \varphi_{el} \in \mathcal{H}_{el}. \tag{II.2} \]
Equation (II.1) is a standard assumption which ensures the self-adjointness and semiboundedness of $H_{el} = -\Delta - V(x)$ (see [8, 12]). Moreover, note that if $H_{el}$ possesses a groundstate with $e_{el} < 0$ then it is automatically unique, by the Perron-Frobenius theorem (see [11]). We define $\Delta e_{el} := \inf[\sigma(H_{el} - e_{el}) \setminus \{0\}]$ to be the distance of the atomic groundstate energy $e_{el}$ to the first excited atomic energy level or the continuum threshold. We remark that Assumption (II.2) is fulfilled for $H_{el} = -\Delta - 1/|x|$; the degeneracy of the groundstate of the hydrogen atom is due to the spin of the electron.

Theorem II.1. Assume Hypotheses 1 and 2 (the latter stated below) and that $\alpha^3/\beta > 0$ is sufficiently small. Then the groundstate energy $E_{gs} = \inf\sigma(H)$ is a simple eigenvalue of $H$ corresponding to a normalized groundstate eigenvector $\phi_{gs} \in \mathcal{H}$ of $H$. Moreover, for all $n \in \mathbb{N}$, the groundstate energy $E_n = \inf\sigma(H_n)$ is a simple eigenvalue of $H_n$, and there exists a constant $C < \infty$, independent of $n$, such that
\[ \|\phi_{gs} - \phi_n\|^2 \le C\,\alpha\,\beta^{n}, \tag{II.3} \]
where $\phi_n \in \mathcal{H}$ is a normalized groundstate eigenvector of $H_n$.

Theorem II.1 is restated in Corollary II.13 in Sect. II.5. Note, however, that $H_n$ and $\phi_n$ in Theorem II.1 are replaced by $H_n \otimes \mathbf{1}^n_\infty + \mathbf{1}_n \otimes \check{H}^n_\infty$ and $\phi_n \otimes \Omega^n_\infty$, respectively, in Corollary II.13.

II.1. Notations. We introduce some more notation useful in the proof of Theorem II.1. Given an operator-valued function $F: \mathbb{R}^3 \times \mathbb{Z}_2 \to B(\mathcal{H}_{el})$, we write $a^*(F) := \int F(k) \otimes a^*(k)\, dk$ and $a(F) := \int F(k)^* \otimes a(k)\, dk$. This allows us to write the velocity operator $\vec{v}$ as
\[ \vec{v} := -i\vec{\nabla}_x + a^*(\vec{G}) + a(\vec{G}), \tag{II.4} \]
where $\vec{G}: \mathbb{R}^3 \times \mathbb{Z}_2 \to B(\mathcal{H}_{el})^3$ are the multiplication operators defined by
\[ \vec{G}(k) := \frac{\alpha^{3/2}\,\Lambda(\vec{k})}{(2\pi)^{3/2}\sqrt{2|k|}}\; e^{-i\alpha\vec{k}\cdot\vec{x}}\;\vec{\varepsilon}(k). \tag{II.5} \]
In terms of the velocity operator, the Hamiltonian assumes the simple form
\[ H = \vec{v}^{\,2} - V(x) + \check{H}. \tag{II.6} \]
Its significance for our construction of a groundstate of $H$ lies in the fact that (formally, at this point)
\[ 2\,\vec{v} = i\,[H, \vec{x}]. \tag{II.7} \]
Our construction of a groundstate of $H$ is inductive, as in [10]. Given a constant $\beta > 0$, with $\beta < \kappa^{-1}$, where $\kappa \ge 1$ is the UV-cutoff parameter in Eq. (I.16), we define a decreasing sequence $(\sigma_n)_{n=0}^{\infty}$ of energy scales by setting
\[ \sigma_n := \kappa\,\beta^{n}. \tag{II.8} \]
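Note, in passing, that, for all $n \in \mathbb{N}_0$, we have that $\sigma_{n+1} \le \kappa\beta < 1$. To cut the interaction into slices at ever lower energy scales, we introduce
\[ \vec{G}_n(k) := \mathbf{1}\big(\sigma_n \le |k|\big)\,\vec{G}(k), \qquad\text{and}\qquad \vec{G}^m_n(k) := \mathbf{1}\big(\sigma_n \le |k| < \sigma_m\big)\,\vec{G}(k), \tag{II.9} \]
for all $k \in \mathbb{R}^3 \times \mathbb{Z}_2$ and $m, n \in \mathbb{N}_0$, with $m < n$. Note that $\vec{G} = \sum_{n=0}^{\infty} \vec{G}^n_{n+1}$ and that $\vec{G}_n$ is the coupling function of the interaction, regularized in the infrared region. We also decompose the Fock space $\mathcal{F} = \mathcal{F}(\mathfrak{h})$ into energy scales by introducing
\[ \mathfrak{h}_n := L^2\big[K_n\big] \qquad\text{and}\qquad \mathfrak{h}^m_n := L^2\big[K^m_n\big], \tag{II.10} \]
where
\[ \text{for } 0 \le n \le \infty: \quad K_n := \big\{ (\vec{k},\tau) \in \mathbb{R}^3\times\mathbb{Z}_2 \;\big|\; \sigma_n \le \omega(k) \big\}, \tag{II.11} \]
\[ \text{for } 0 \le m < n \le \infty: \quad K^m_n := \big\{ (\vec{k},\tau) \in \mathbb{R}^3\times\mathbb{Z}_2 \;\big|\; \sigma_n \le \omega(k) < \sigma_m \big\}. \tag{II.12} \]
Note that $K^0_n \subset K_n$ is a proper subset. For integers $1 \le m < n < \ell \le \infty$, we have the disjoint decomposition $K_\ell = K_m \cup K^m_n \cup K^n_\ell$ and hence the orthogonal sum
\[ \mathfrak{h}_\ell \cong \mathfrak{h}_m \oplus \mathfrak{h}^m_n \oplus \mathfrak{h}^n_\ell, \tag{II.13} \]
which gives rise to the isomorphism
\[ \mathcal{F}_\ell \cong \mathcal{F}_m \otimes \mathcal{F}^m_n \otimes \mathcal{F}^n_\ell, \qquad\text{with}\quad \mathcal{F}_n := \mathcal{F}(\mathfrak{h}_n) \quad\text{and}\quad \mathcal{F}^m_n := \mathcal{F}(\mathfrak{h}^m_n). \tag{II.14} \]
In particular, for any $n \in \mathbb{N}$,
\[ \mathcal{F} = \mathcal{F}_\infty \cong \mathcal{F}_n \otimes \mathcal{F}^n_{n+1} \otimes \mathcal{F}^{n+1}_\infty. \tag{II.15} \]
Setting
\[ \mathcal{H}_n := \mathcal{H}_{el} \otimes \mathcal{F}_n \qquad\text{and}\qquad \mathcal{H}^m_n := \mathcal{H}_{el} \otimes \mathcal{F}^m_n, \tag{II.16} \]
and given energy scale indices $m, n \in \mathbb{N}_0$, $m < n$, we define the velocity operator $\vec{v}_n$, the field energy operators $\check{H}_n$, $\check{H}^m_n$, and the Hamiltonian $H_n$ at those scales by
\[ \vec{v}_n := -i\vec{\nabla}_x + a^*(\vec{G}_n) + a(\vec{G}_n), \tag{II.17} \]
\[ \check{H}_n := \int \mathbf{1}(\sigma_n \le |k|)\,\omega(k)\, a^*(k)a(k)\, dk, \tag{II.18} \]
\[ \check{H}^m_n := \int \mathbf{1}(\sigma_n \le |k| < \sigma_m)\,\omega(k)\, a^*(k)a(k)\, dk, \tag{II.19} \]
\[ H_n := \vec{v}_n^{\,2} - V(x) + \check{H}_n, \tag{II.20} \]
as operators on $\mathcal{H}_n$ and $\mathcal{H}^m_n$, respectively. We introduce the groundstate energy at scale $n$ and differences, for different scales $m < n$,
\[ E_n := \inf\sigma(H_n) \qquad\text{and}\qquad E^m_n := E_m - E_n. \tag{II.21} \]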
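To compare Hamiltonians $H_n$ and $H_{n+1}$ at successive energy scales, it is convenient to define the positive operators $H_n^+$ and $\tilde{H}_n^+$ on $\mathcal{H}_n$ and $\mathcal{H}_{n+1}$, respectively, by
\[ H_n^+ := H_n - E_n, \tag{II.22} \]
\[ \tilde{H}_n^+ := H_n^+ \otimes \mathbf{1}^n_{n+1} + \mathbf{1}_n \otimes \check{H}^n_{n+1}, \tag{II.23} \]
where we denote the identity operator on $\mathcal{H}_n$ and $\mathcal{F}^m_n$ by $\mathbf{1}_n$ and $\mathbf{1}^m_n$, respectively. One resolvent on $\mathcal{H}_{n+1}$ that we use in our estimates over and over again is the following one,
\[ \tilde{R}_n := \big(\tilde{H}_n^+ + \sigma_{n+1}\big)^{-1}. \tag{II.24} \]
Note that $0 \le \tilde{R}_n \le \sigma_{n+1}^{-1}$. We identify $\vec{v}_n$ with $\vec{v}_n \otimes \mathbf{1}^n_{n+1}$, which is acting on $\mathcal{H}_{n+1}$. Note that, for $n \in \mathbb{N}_0$,
\[ \vec{v}_{n+1} = \vec{v}_n + a^*(\vec{G}^n_{n+1}) + a(\vec{G}^n_{n+1}). \tag{II.25} \]
Similarly, given $\psi_n \in \mathcal{H}_n$, we define a vector
\[ \tilde{\psi}_n := \psi_n \otimes \Omega^n_{n+1} \in \mathcal{H}_{n+1}, \tag{II.26} \]
where $\Omega_n$ and $\Omega^m_n$ denote the vacuum vectors in $\mathcal{F}_n$ and $\mathcal{F}^m_n$, respectively. With these notations, we have that
\[ H_{n+1} = \tilde{H}_n^+ + W^n_{n+1} + E_n, \tag{II.27} \]
\begin{align}
W^n_{n+1} &:= (\vec{v}_{n+1})^2 - (\vec{v}_n)^2 \tag{II.28} \\
&= 2a^*(\vec{G}^n_{n+1})\cdot\vec{v}_n + 2\vec{v}_n\cdot a(\vec{G}^n_{n+1}) + \big(a^*(\vec{G}^n_{n+1}) + a(\vec{G}^n_{n+1})\big)^2 \notag \\
&= 2a^*(\vec{G}^n_{n+1})\cdot\vec{v}_n + 2\vec{v}_n\cdot a(\vec{G}^n_{n+1}) + a^*(\vec{G}^n_{n+1})\cdot a^*(\vec{G}^n_{n+1}) \notag \\
&\quad + a(\vec{G}^n_{n+1})\cdot a(\vec{G}^n_{n+1}) + 2a^*(\vec{G}^n_{n+1})\cdot a(\vec{G}^n_{n+1}) + \|\vec{G}^n_{n+1}\|^2, \notag
\end{align}
where $\|\vec{G}^n_{n+1}\|^2 := \int |\vec{G}^n_{n+1}(k)|^2\, dk$. In Eq. (II.28), we make use of the Coulomb gauge condition $\vec{\nabla}\cdot\vec{A}(x) = 0$, which implies that
\[ a^*(\vec{G}^n_{n+1})\cdot\vec{v}_n = \vec{v}_n\cdot a^*(\vec{G}^n_{n+1}) \qquad\text{and}\qquad a(\vec{G}^n_{n+1})\cdot\vec{v}_n = \vec{v}_n\cdot a(\vec{G}^n_{n+1}). \tag{II.29} \]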
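Finally, it is convenient to collect the various assumptions which enter the derivations of the estimates, below, in the following hypothesis:

Hypothesis 2. Assume that $\kappa \ge 1$ and that $\pm V \le \varepsilon(-\Delta) + b_\varepsilon$, for $\varepsilon > 0$, and set
\[ C_1 := 8(\kappa + b_{1/8}) \ge 8. \tag{II.30} \]
Assume furthermore that
\[ 0 \le \alpha \le \beta \le \beta_1 := \min\Big\{ \tfrac{1}{2},\; \kappa^{-1},\; \Delta e_{el}\,\kappa^{-1} \Big\}, \tag{II.31} \]
and set
\[ g^{(1)}_{\alpha,\beta} := \frac{C_1\,\alpha^{3/2}}{\beta^{1/2}}. \tag{II.32} \]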
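II.2. Some basic estimates. We make use of the notation
\[ \|b\|^2 := \int \|b(k)\|^2_{B(\mathcal{H}_{el})}\, dk, \tag{II.33} \]
for $b \in L^2[\mathbb{R}^3\times\mathbb{Z}_2; B(\mathcal{H}_{el})]$. Moreover, for $\vec{b} \in L^2[\mathbb{R}^3\times\mathbb{Z}_2; B(\mathcal{H}_{el})^3]$, we set
\[ \|\vec{b}(k)\|^2_{B(\mathcal{H}_{el})} := \sum_{\nu=1}^{3} \|b_\nu(k)\|^2_{B(\mathcal{H}_{el})} \qquad\text{and}\qquad \|\vec{b}\|^2 := \sum_{\nu=1}^{3} \|b_\nu\|^2. \tag{II.34} \]
Using this notation, we have, for all $n \in \mathbb{N}_0$ and all $k \in \mathbb{R}^3\times\mathbb{Z}_2$,
\[ \|\vec{G}^n_{n+1}(k)\|^2_{B(\mathcal{H}_{el})} \le \frac{3\,\alpha^3}{(2\pi)^3\, 2|k|}\; \mathbf{1}\big(\sigma_{n+1} \le \omega(k) < \sigma_n\big) \tag{II.35} \]
and
\[ \|\vec{G}^n_{n+1}\|^2 \le \frac{3\alpha^3}{2\pi^2}\int_0^{\sigma_n} k\, dk \le \frac{1}{5}\,\sigma_n^2\,\alpha^3, \tag{II.36} \]
\[ \|\omega^{-1/2}\,\vec{G}^n_{n+1}\|^2 \le \frac{3\alpha^3}{2\pi^2}\int_0^{\sigma_n} dk \le \frac{1}{5}\,\sigma_n\,\alpha^3. \tag{II.37} \]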
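The constants in (II.36) and (II.37) follow from (II.35) by summing over the two polarizations and integrating in polar coordinates, which gives the prefactor $2 \cdot 4\pi \cdot 3\alpha^3 / ((2\pi)^3 \cdot 2) = 3\alpha^3/(2\pi^2)$. The following is a minimal sketch of that arithmetic, not part of the original paper: the function name and the sample values $\alpha = 0.1$, $\sigma_n = 0.5$ are illustrative assumptions.

```python
import math

def shell_norm_bounds(alpha, sigma_n):
    """Evaluate the radial integrals in (II.36) and (II.37), with the lower
    integration limit dropped to 0 as in the text (hypothetical helper).

    Returns (prefactor, bound on ||G||^2, bound on ||omega^{-1/2} G||^2).
    """
    # Two polarizations, angular measure 4*pi, pointwise bound 3*alpha^3 / ((2*pi)^3 * 2|k|).
    prefactor = 2 * 4.0 * math.pi * 3.0 * alpha ** 3 / ((2.0 * math.pi) ** 3 * 2.0)
    norm_sq = prefactor * sigma_n ** 2 / 2.0   # integral of k dk from 0 to sigma_n
    weighted_norm_sq = prefactor * sigma_n     # integral of dk from 0 to sigma_n
    return prefactor, norm_sq, weighted_norm_sq

if __name__ == "__main__":
    alpha, sigma_n = 0.1, 0.5
    prefactor, norm_sq, weighted_norm_sq = shell_norm_bounds(alpha, sigma_n)
    print(prefactor / alpha ** 3)                       # numerically 3 / (2 pi^2)
    print(norm_sq <= sigma_n ** 2 * alpha ** 3 / 5.0)   # last inequality in (II.36)
    print(weighted_norm_sq <= sigma_n * alpha ** 3 / 5.0)  # last inequality in (II.37)
```

Both final inequalities reduce to the numerical facts $3/(4\pi^2) \le 1/5$ and $3/(2\pi^2) \le 1/5$, so they hold for every $\alpha \ge 0$ and $\sigma_n > 0$.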
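Lemma II.2. Let $f, g \in L^2[\mathbb{R}^3\times\mathbb{Z}_2; B(\mathcal{H}_{el})]$ be operator-valued functions obeying $\|(1+\omega^{-1})^{1/2} f\|,\ \|(1+\omega^{-1})^{1/2} g\| < \infty$. Then, for any $\rho > 0$,
\[ \big\| a(f)\,(\check{H}+\rho)^{-1/2} \big\| \le \|\omega^{-1/2} f\|, \tag{II.38} \]
\[ \big\| (\check{H}+\rho)^{-1/2}\, a(f) \big\| \le \|(\rho^{-1}+\omega^{-1})^{1/2} f\|, \tag{II.39} \]
\[ \big\| a(f)\,a(g)\,(\check{H}+\rho)^{-1} \big\| \le \|\omega^{-1/2} f\| \cdot \|\omega^{-1/2} g\|, \tag{II.40} \]
\[ \big\| (\check{H}+\rho)^{-1}\, a(g)\,a(f) \big\| \le \|(\rho^{-1}+\omega^{-1})^{1/2} f\| \cdot \|(\rho^{-1}+\omega^{-1})^{1/2} g\|. \tag{II.41} \]
The proof of Lemma II.2 can be found in many texts, see, e.g., [1, 2]. We remark that the bounds asserted in Lemma II.2 and (II.1) are the basic input to prove the self-adjointness of $H = \vec{v}^{\,2} - V(x) + \check{H}$ on the domain of $-\Delta + \check{H}$, provided $\alpha$ is not too large.

Lemma II.3. For all $0 \le \alpha \le \kappa^{-1/3}$, all $n \in \mathbb{N}_0$, and any $\rho \in (0, \sigma_n]$,
\[ \big\| (H_n^+ + \rho)^{-1/2}\, (\vec{v}_n)^2\, (H_n^+ + \rho)^{-1/2} \big\|_{\mathcal{H}_n} \le C_1\,\rho^{-1}, \tag{II.42} \]
\[ \big\| \tilde{R}_n^{1/2}\, (\vec{v}_n)^2\, \tilde{R}_n^{1/2} \big\|_{\mathcal{H}_{n+1}} \le C_1\,\sigma_{n+1}^{-1}, \tag{II.43} \]
where $\tilde{R}_n$ is defined in (II.24) and $C_1$ in (II.30).

The proof of Lemma II.3 is given in Appendix A. In the next lemma, we derive a quadratic form bound on the interaction $W^n_{n+1}$ relative to the energy scale $\sigma_{n+1}$ which proves that $W^n_{n+1}$ is a small perturbation of $\tilde{H}_n^+ + \sigma_{n+1}$, provided $g^{(1)}_{\alpha,\beta} := C_1\alpha^{3/2}\beta^{-1/2}$ is sufficiently small.

Lemma II.4. Assume Hypotheses 1 and 2. Then, for all $n \in \mathbb{N}_0$,
\[ \big\| \tilde{R}_n^{1/2}\, W^n_{n+1}\, \tilde{R}_n^{1/2} \big\|_{\mathcal{H}_{n+1}} \le g^{(1)}_{\alpha,\beta}. \tag{II.44} \]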
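Proof. By Eq. (II.28) and Lemmata II.2 and II.3, we have
\begin{align}
\big\| \tilde{R}_n^{1/2}\, W^n_{n+1}\, \tilde{R}_n^{1/2} \big\|_{\mathcal{H}_{n+1}}
&\le 4\, \big\| \tilde{R}_n^{1/2}\, \vec{v}_n \big\|_{\mathcal{H}_{n+1}} \cdot \big\| a(\vec{G}^n_{n+1})\, \tilde{R}_n^{1/2} \big\|_{\mathcal{H}_{n+1}} + \big\| \big(a^*(\vec{G}^n_{n+1}) + a(\vec{G}^n_{n+1})\big)\, \tilde{R}_n^{1/2} \big\|^2_{\mathcal{H}_{n+1}} \notag \\
&\le \frac{4\,C_1^{1/2}}{\sigma_{n+1}^{1/2}}\, \|\omega^{-1/2}\vec{G}^n_{n+1}\| + 4\, \|(\sigma_{n+1}^{-1} + \omega^{-1})^{1/2}\, \vec{G}^n_{n+1}\|^2 \notag \\
&\le \frac{4\,C_1^{1/2}\,\alpha^{3/2}\,\sigma_n^{1/2}}{\sqrt{5}\,\sigma_{n+1}^{1/2}} + \frac{8\,\alpha^3\,\sigma_n^2}{5\,\sigma_{n+1}}
\le \frac{4\,C_1^{1/2}}{\sqrt{5}}\,\frac{\alpha^{3/2}}{\beta^{1/2}} + \frac{8\kappa}{5}\,\frac{\alpha^{3/2}}{\beta^{1/2}} \notag \\
&\le \Big( \frac{4}{\sqrt{5\,C_1}} + \frac{1}{5} \Big)\, \frac{C_1\,\alpha^{3/2}}{\beta^{1/2}} \le \frac{C_1\,\alpha^{3/2}}{\beta^{1/2}}, \tag{II.45}
\end{align}
where we use Estimates (II.36) and (II.37), $\kappa\beta \le 1$, and $C_1 \ge 8\kappa \ge 8$.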
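Corollary II.5. Assume Hypotheses 1 and 2 and $0 \le g^{(1)}_{\alpha,\beta} = C_1\alpha^{3/2}\beta^{-1/2} \le 1$. Then, for all $n \in \mathbb{N}_0$,
\[ H_{n+1} \ge E_n - g^{(1)}_{\alpha,\beta}\,\sigma_{n+1} + \big[1 - g^{(1)}_{\alpha,\beta}\big]\,\tilde{H}_n^+. \tag{II.46} \]
Proof. The assertion follows directly from Lemma II.4:
\begin{align}
H_{n+1} &= E_n + \tilde{H}_n^+ + W^n_{n+1} \notag \\
&= E_n - \sigma_{n+1} + (\tilde{H}_n^+ + \sigma_{n+1})^{1/2}\, \big[ 1 + \tilde{R}_n^{1/2}\, W^n_{n+1}\, \tilde{R}_n^{1/2} \big]\, (\tilde{H}_n^+ + \sigma_{n+1})^{1/2} \notag \\
&\ge E_n - \sigma_{n+1} + \big[1 - g^{(1)}_{\alpha,\beta}\big]\,(\tilde{H}_n^+ + \sigma_{n+1}) \notag \\
&= E_n - g^{(1)}_{\alpha,\beta}\,\sigma_{n+1} + \big[1 - g^{(1)}_{\alpha,\beta}\big]\,\tilde{H}_n^+. \tag{II.47}
\end{align}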
II.3. Energy shifts and gaps. In this subsection we estimate the differences $E^n_{n+1} = E_n - E_{n+1}$ of the groundstate energies at scales $n$ and $n+1$, see (II.21). We inductively prove that $E_n = \inf\sigma(H_n)$ is an isolated eigenvalue of multiplicity one and derive a lower bound on the size of the gaps
\[ \mathrm{gap}_n := \inf\big\{ \sigma(H_n^+) \setminus \{0\} \big\} \qquad\text{and}\qquad \widetilde{\mathrm{gap}}_n := \inf\big\{ \sigma(\tilde{H}_n^+) \setminus \{0\} \big\} \tag{II.48} \]
of $H_n^+$ and $\tilde{H}_n^+$ above their lowest spectral value 0. We first derive a simple estimate on the difference $E^n_{n+1}$ of the groundstate energies of $H_n$ and $H_{n+1}$.

Lemma II.6. Assume Hypotheses 1 and 2 and $g^{(1)}_{\alpha,\beta} \le 1$. Then, for all $n \in \mathbb{N}_0$,
\[ |E^n_{n+1}| \le g^{(1)}_{\alpha,\beta}\,\sigma_{n+1}, \tag{II.49} \]
where $g^{(1)}_{\alpha,\beta} \ge 0$ is defined in (II.32).
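Proof. Given $\varepsilon > 0$, suppose that $\psi_n \in \mathcal{H}_n$ is an $\varepsilon$-approximate groundstate of $H_n$, i.e., $\langle \psi_n | H_n \psi_n \rangle \le E_n + \varepsilon$, and let $\tilde\psi_n := \psi_n \otimes \Omega^n_{n+1}$. The variational principle yields the upper bound
\[
\begin{aligned}
E_{n+1} \le \langle \tilde\psi_n | H_{n+1} \tilde\psi_n \rangle
&\le E_n + \varepsilon + \langle \tilde\psi_n | W^n_{n+1} \tilde\psi_n \rangle \\
&\le E_n + \varepsilon + \| G^n_{n+1} \|^2
 \le E_n + \varepsilon + \tfrac{1}{5}\, \alpha^3 \sigma_n^2 \\
&\le E_n + \varepsilon + g^{(1)}_{\alpha,\beta}\, \sigma_{n+1} ,
\end{aligned}
\tag{II.50}
\]
where we use that $C_1 \ge \kappa$, $\beta^n \le 1$, and $\alpha^3 \le \beta$ to derive the last inequality. In the limit $\varepsilon \to 0$, we obtain $E_{n+1} \le E_n + g^{(1)}_{\alpha,\beta}\, \sigma_{n+1}$ from (II.50). Conversely, Eq. (II.46) and $\tilde H_n^+ \ge 0$ imply that
\[
E_{n+1} \ge E_n - g^{(1)}_{\alpha,\beta}\, \sigma_{n+1} .
\tag{II.51}
\]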
In the next lemma, we prove that $E_n = \inf \sigma(H_n)$ is an isolated eigenvalue of multiplicity one and derive a lower bound on the size of the spectral gap above $E_n$. We use a variational argument which would not apply in the degenerate case. That could actually not be expected, because the interaction with the quantized radiation field could lift the degeneracy at any energy scale, and there is, a priori, no control on the resulting splitting.

Lemma II.7. Assume Hypotheses 1 and 2 and $g^{(1)}_{\alpha,\beta} \le 1/6$. Then, for all $n \in \mathbb{N}_0$, the groundstate energy $E_n = \inf \sigma(H_n)$ is a simple, isolated eigenvalue of $H_n$, with corresponding normalized eigenvector $\phi_n \in \mathcal{H}_n$, and
\[
\mathrm{gap}_0 = \min\{ e_{el}, \kappa \} ,
\tag{II.52}
\]
\[
\mathrm{gap}_{n+1} \ge [1 - 3 g^{(1)}_{\alpha,\beta}]\, \sigma_{n+1} ,
\tag{II.53}
\]
\[
\widetilde{\mathrm{gap}}_n = \sigma_{n+1} .
\tag{II.54}
\]
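Proof. We proceed by induction in $n \in \mathbb{N}_0$. Since $H_0 = H_{el} \otimes \mathbf{1}_0 + \mathbf{1}_{el} \otimes \check H_0$, we have that
\[
\sigma(H_0) = \sigma(H_{el}) + \sigma(\check H_0) = \sigma(H_{el}) + \big( \{0\} \cup [\kappa, \infty) \big) \subseteq e_{el} + \big( \{0\} \cup [\min\{e_{el}, \kappa\}, \infty) \big) ,
\tag{II.55}
\]
which implies (II.52), as a variational argument shows that $\sigma(H_0) \ni e_{el} + \min\{e_{el}, \kappa\}$. Clearly, $E_0 = e_{el}$ is an eigenvalue of $H_0$ of multiplicity one, the corresponding normalized eigenvector being $\phi_0 = \varphi_{el} \otimes \Omega_0$. A similar argument is used to demonstrate (II.53). Namely, due to (II.23), we have that
\[
\sigma(\tilde H_n^+) = \sigma(H_n^+) + \sigma(\check H^n_{n+1}) \subseteq \{0\} \cup [\mathrm{gap}_n, \infty) \cup [\sigma_{n+1}, \infty) ,
\tag{II.56}
\]
which implies that $\widetilde{\mathrm{gap}}_n \ge \min\{\mathrm{gap}_n, \sigma_{n+1}\}$. Since $\sigma(\tilde H_n^+) \ni \min\{\mathrm{gap}_n, \sigma_{n+1}\}$, we even conclude that
\[
\widetilde{\mathrm{gap}}_n = \min\{ \mathrm{gap}_n, \sigma_{n+1} \} ,
\tag{II.57}
\]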
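for all $n \in \mathbb{N}_0$. Since Hypothesis 2 implies that $\mathrm{gap}_0 = \min\{e_{el}, \kappa\} \ge \kappa\beta_1 \ge \kappa\beta = \sigma_1$, we have that $\widetilde{\mathrm{gap}}_0 = \sigma_1$.

Let us now suppose that $n \ge 1$ and $\mathrm{gap}_n \ge [1 - 3 g^{(1)}_{\alpha,\beta}]\, \sigma_n$. Since $3 g^{(1)}_{\alpha,\beta} \le 1/2$ and $\beta \le \beta_1 \le 1/2$, we have that $\mathrm{gap}_n \ge \sigma_{n+1}$, so $\widetilde{\mathrm{gap}}_n = \sigma_{n+1}$, which establishes (II.54).

To perform the induction step, we assume that $E_n$ is a simple, isolated eigenvalue of $H_n$, with corresponding normalized eigenvector $\phi_n \in \mathcal{H}_n$, and that $\widetilde{\mathrm{gap}}_n = \sigma_{n+1}$. We must prove that $E_{n+1}$ is an eigenvalue of $H_{n+1}$ of multiplicity one and that the gap above $E_{n+1}$, denoted by $\mathrm{gap}_{n+1}$, obeys (II.53). We observe that, due to (II.46) and (II.49),
\[
\begin{aligned}
X_{n+1} &:= \sup_{\psi_0 \in \mathcal{H}_{n+1} \setminus \{0\}} \ \inf_{\phi \perp \psi_0,\ \|\phi\| = 1} \langle \phi | H^+_{n+1} \phi \rangle
 \ge \inf_{\phi \perp \tilde\phi_n,\ \|\phi\| = 1} \langle \phi | H^+_{n+1} \phi \rangle \\
&\ge \Delta E^n_{n+1} - g^{(1)}_{\alpha,\beta}\, \sigma_{n+1} + [1 - g^{(1)}_{\alpha,\beta}]\, \widetilde{\mathrm{gap}}_n
 \ge [1 - 3 g^{(1)}_{\alpha,\beta}]\, \sigma_{n+1}
\end{aligned}
\tag{II.58}
\]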
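holds true. Since $X_{n+1} > 0$, the min-max principle implies that $E_{n+1}$ is an isolated point in the spectrum of $H_{n+1}$ of multiplicity at most one. The multiplicity of $E_{n+1}$ is at least one, however, since $E_{n+1}$ is, by definition, in the spectrum of $H_{n+1}$. Thus, $E_{n+1}$ is an eigenvalue of $H_{n+1}$ of multiplicity one, and $\mathrm{gap}_{n+1} = X_{n+1}$.

II.4. Groundstate projections.
Lemma II.7 enables us to write the projection $P_n = |\phi_n\rangle\langle\phi_n|$ onto the groundstate $\phi_n$ of $H_n$ as a contour integral
\[
P_n = \frac{-1}{2\pi i} \oint_{\Gamma_n} \frac{dz_n}{H_n^+ - z_n} ,
\tag{II.59}
\]
where $\Gamma_n := \{ \tfrac{1}{4} \sigma_n e^{i\vartheta} \in \mathbb{C} \mid \vartheta \in [0, 2\pi) \}$ and provided that $\tfrac{1}{4} < 1 - 3 g^{(1)}_{\alpha,\beta}$, for $n \in \mathbb{N}$, and $\tfrac{1}{4} < \min\{ e_{el}/\kappa, 1 \}$, for $n = 0$, as these inequalities ensure that $|z_n| < \mathrm{gap}_n$. For $\tfrac{1}{4} < 1 - 3 g^{(1)}_{\alpha,\beta}$ and $z_{n+1} \in \Gamma_{n+1}$, we also observe that $|z_{n+1}| < \widetilde{\mathrm{gap}}_n$ and thus
\[
\tilde P_n = \frac{-1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{\tilde H_n^+ - z_{n+1}} ,
\tag{II.60}
\]
where $\tilde P_n := |\tilde\phi_n\rangle\langle\tilde\phi_n| = P_n \otimes P_{\Omega^n_{n+1}}$ is the projection onto $\tilde\phi_n$. Note that, for $n \in \mathbb{N}_0$, we have that
\[
\mathrm{gap}_{n+1} - |z_{n+1}| - |\Delta E^n_{n+1}| \ge \Big( 1 - 4 g^{(1)}_{\alpha,\beta} - \frac{1}{4} \Big) \sigma_{n+1} .
\tag{II.61}
\]
Thus, if $g^{(1)}_{\alpha,\beta} < 1/6$ then we have that
\[
P_{n+1} = \frac{-1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{H_{n+1} - E_{n+1} - z_{n+1}}
 = \frac{-1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{H_{n+1} - E_n - z_{n+1}} ,
\tag{II.62}
\]
for all $n \in \mathbb{N}_0$, because in both integrals in (II.62) the only spectral point of $H_{n+1}$ encircled by the integration contours is $E_{n+1}$.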
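Lemma II.8. Assume Hypotheses 1 and 2 and $g^{(1)}_{\alpha,\beta} \le 1/6$. For all $n \in \mathbb{N}_0$, we have that
\[
\sup_{z \in \Gamma_{n+1}} \Big\| \frac{\tilde H_n^+ + \sigma_{n+1}}{\tilde H_n^+ - z} \Big\| \le 4 .
\tag{II.63}
\]
Proof. Equation (II.63) follows directly from the spectral theorem and $\widetilde{\mathrm{gap}}_n = \sigma_{n+1}$, namely
\[
\Big\| \frac{\tilde H_n^+ + \sigma_{n+1}}{\tilde H_n^+ - z} \Big\|
 \le \max\Big\{ 4 ,\ \sup_{r > 0} \frac{r + \widetilde{\mathrm{gap}}_n + \sigma_{n+1}}{r + \widetilde{\mathrm{gap}}_n - |z|} \Big\}
 \le \max\Big\{ 4 ,\ \sup_{r > 0} \frac{r + 2\sigma_{n+1}}{r + \tfrac{3}{4}\sigma_{n+1}} \Big\} = 4 .
\tag{II.64}
\]
Lemma II.8 establishes that $\tilde H_n^+ - z_{n+1}$ is always comparable to $\tilde H_n^+ + \sigma_{n+1}$, provided $|z_{n+1}| = \tfrac{1}{4}\sigma_{n+1}$. Since the interaction $W^n_{n+1}$ is a small perturbation of $\tilde H_n^+ + \sigma_{n+1}$, as proved in Lemma II.4, we obtain convergent series expansions for the resolvents $(H_{n+1} - E_{n+1} - z_{n+1})^{-1}$ and hence also for the groundstate projections $P_{n+1}$.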
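The bound (II.63) can be sanity-checked numerically. The following is only a sketch under stated assumptions: the spectrum of the operator is modelled by the set $\{0\} \cup [\sigma, 51\sigma]$ sampled on a grid, $z$ runs over the circle $|z| = \sigma/4$, and the function name `worst_ratio` and all grid parameters are illustrative choices, not taken from the paper.

```python
import cmath
import math


def worst_ratio(sigma, num_angles=360, num_lams=500, lam_span=50.0):
    """Grid estimate of sup |lam + sigma| / |lam - z|.

    lam samples the assumed spectrum {0} U [sigma, (1 + lam_span) * sigma];
    z samples the circle |z| = sigma / 4, which plays the role of the contour.
    """
    radius = sigma / 4.0
    lams = [0.0] + [sigma * (1.0 + lam_span * k / num_lams)
                    for k in range(num_lams + 1)]
    worst = 0.0
    for idx in range(num_angles):
        z = radius * cmath.exp(1j * 2.0 * math.pi * idx / num_angles)
        for lam in lams:
            worst = max(worst, (lam + sigma) / abs(lam - z))
    return worst


if __name__ == "__main__":
    for sigma in (1.0, 0.01):
        print(sigma, worst_ratio(sigma))
```

In this model the maximum is attained at the spectral value 0, where the ratio equals $\sigma / |z| = 4$; the points above the gap contribute at most $8/3$, in line with (II.64).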
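Lemma II.9. Assume Hypotheses 1 and 2 and $g^{(1)}_{\alpha,\beta} \le 1/8$. Then, for all $n \in \mathbb{N}_0$, the series expansion
\[
P_{n+1} = \frac{-1}{2\pi i} \sum_{\nu=0}^{\infty} (-1)^\nu \oint_{\Gamma_{n+1}} Y^{(\nu)}_{n+1}(z_{n+1})\, dz_{n+1} ,
\tag{II.65}
\]
\[
Y^{(\nu)}_{n+1}(z) := \frac{1}{\tilde H_n^+ - z} \Big[ W^n_{n+1}\, \frac{1}{\tilde H_n^+ - z} \Big]^{\nu} ,
\tag{II.66}
\]
is norm-convergent, with
\[
\sup_{|z| = \sigma_{n+1}/4} \big\| Y^{(\nu)}_{n+1}(z) \big\| \le \frac{4}{\sigma_{n+1}}\, \big( 4 g^{(1)}_{\alpha,\beta} \big)^{\nu} \le \frac{4}{\sigma_{n+1}}\, 2^{-\nu}
\tag{II.67}
\]
for all $\nu \in \mathbb{N}_0$.

Proof. We rewrite $Y^{(\nu)}_{n+1}(z)$ as
\[
Y^{(\nu)}_{n+1}(z) = \tilde R_n^{1/2}\, \frac{\tilde H_n^+ + \sigma_{n+1}}{\tilde H_n^+ - z}\, \Big\{ \tilde R_n^{1/2} W^n_{n+1} \tilde R_n^{1/2}\, \frac{\tilde H_n^+ + \sigma_{n+1}}{\tilde H_n^+ - z} \Big\}^{\nu}\, \tilde R_n^{1/2} .
\tag{II.68}
\]
The proof then follows from Lemmata II.4 and II.8.

We remark that, by (II.60), the 0th-order term in the series expansion for $P_{n+1}$ equals $\tilde P_n$, so that
\[
P_{n+1} - \tilde P_n = \frac{-1}{2\pi i} \sum_{\nu=1}^{\infty} (-1)^\nu \oint_{\Gamma_{n+1}} Y^{(\nu)}_{n+1}(z_{n+1})\, dz_{n+1} ,
\tag{II.69}
\]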
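and hence
\[
\big\| P_{n+1} - \tilde P_n \big\| \le 4 \sum_{\nu=1}^{\infty} \big( 4 g^{(1)}_{\alpha,\beta} \big)^{\nu} \le 32\, g^{(1)}_{\alpha,\beta} \le 32\, C_1\, \frac{\alpha^{3/2}}{\beta^{1/2}} ,
\tag{II.70}
\]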
which is of order $\alpha^{3/2} \beta^{-1/2}$. This yields an alternative proof of the fact that if $P_n$ has rank one, so do $\tilde P_n$ and hence also $P_{n+1}$, provided $\alpha^{3/2} \beta^{-1/2}$ is sufficiently small. Equation (II.70) is not sufficient, however, to prove that the sequence $(P_n)_{n=0}^{\infty}$ of projections converges. Indeed, this cannot be expected, in general, because the perturbation $W$ is marginal, in the sense of power counting. To prove convergence of $(P_n)_{n=0}^{\infty}$, we need refined estimates that take the particularities of the model into account; mere power counting cannot achieve this goal.

II.5. Convergence of the groundstate projections.
Our proof of the convergence of $(P_n)_{n=0}^{\infty}$ is based on a virial argument that exploits Identity (II.7) and uses the boundedness of $\langle \phi_n | (\vec x^{\,2} \otimes \mathbf{1}) \phi_n \rangle$ uniformly in $n$.

Theorem II.10. Assume Hypotheses 1 and 2. Then $\vec x \phi_n \in \mathrm{dom}(H_n)$, for all $n \in \mathbb{N}$,
\[
2\, \vec v_n \phi_n = i\, H_n^+\, \vec x\, \phi_n ,
\tag{II.71}
\]
and there exists a universal constant $C_2 < \infty$, depending only on $V$, such that
\[
\sup_{n \in \mathbb{N}_0} \big\| \vec x\, \phi_n \big\|_{\mathcal{H}_n} \le C_2 .
\tag{II.72}
\]
We remark that actually a much stronger estimate holds. Namely, from a Combes-Thomas argument or an Agmon estimate it follows that $e^{\gamma |\vec x|} \phi_n$ is uniformly bounded in norm, for some $\gamma > 0$. For our purpose, however, Theorem II.10 is sufficient. Its proof is given in Appendix A.

In the proof of convergence of the sequence $(P_n)_{n=0}^{\infty}$ of projections onto the groundstate, we make use of the following lemma.

Lemma II.11. Let $a, b \in \mathcal{H}$ be two normalized vectors in a Hilbert space $\mathcal{H}$, and denote the corresponding rank-one projections by $P_a := |a\rangle\langle a|$ and $P_b := |b\rangle\langle b|$. Then
\[
\| P_a - P_b \| = \sqrt{ 1 - |\langle a | b \rangle|^2 } = \sqrt{ \langle a | (P_a - P_b)\, a \rangle } .
\tag{II.73}
\]
Proof. We may assume $\{a, b\}$ to be linearly independent. $P_a - P_b$ is a rank-2 operator with matrix representation
\[
M = \begin{pmatrix} 1 & \langle a | b \rangle \\ -\langle b | a \rangle & -1 \end{pmatrix}
\tag{II.74}
\]
in the basis $\{a, b\}$. Since $M$ is traceless, it has eigenvalues $\pm\lambda$, for some $\lambda > 0$. Hence, (II.73) follows from $\| M \| = |\lambda| = \sqrt{ -\det[M] }$.
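Identity (II.73) is easy to test numerically in a finite-dimensional real Hilbert space. The sketch below is illustrative only: it takes two vectors in $\mathbb{R}^3$, builds the matrix of $P_a - P_b$, and estimates its operator norm by power iteration on the squared matrix. The helper names and the starting vector are assumptions of this example, not part of the paper.

```python
import math


def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def projector_difference(a, b):
    """Matrix of P_a - P_b for real unit vectors a and b."""
    dim = len(a)
    return [[a[i] * a[j] - b[i] * b[j] for j in range(dim)] for i in range(dim)]


def apply(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]


def operator_norm(M, start, iterations=50):
    """Norm of a symmetric matrix M, via power iteration on M*M."""
    x = normalize(start)
    lam_sq = 0.0
    for _ in range(iterations):
        y = apply(M, apply(M, x))
        lam_sq = math.sqrt(sum(t * t for t in y))
        if lam_sq < 1e-300:  # M vanishes on the iterate (e.g. a == b)
            return 0.0
        x = [t / lam_sq for t in y]
    return math.sqrt(lam_sq)


def lemma_values(a, b):
    """Return the three expressions that (II.73) asserts to be equal."""
    a, b = normalize(a), normalize(b)
    M = projector_difference(a, b)
    overlap = sum(s * t for s, t in zip(a, b))
    expectation = sum(s * t for s, t in zip(a, apply(M, a)))
    return (operator_norm(M, [1.0, 0.3, -0.7]),
            math.sqrt(1.0 - overlap ** 2),
            math.sqrt(expectation))


if __name__ == "__main__":
    print(lemma_values([1.0, 2.0, 2.0], [2.0, -1.0, 0.5]))
```

Power iteration is used here instead of the determinant formula so that the check does not rely on the argument of the proof itself.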
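Theorem II.12. Assume Hypotheses 1 and 2 and $g^{(1)}_{\alpha,\beta} \le 1/8$. Then, for all $n \in \mathbb{N}_0$,
\[
\big\| \tilde P_n - P_{n+1} \big\| \le C_3\, g^{(1)}_{\alpha,\beta}\, \beta^{n/2} ,
\tag{II.75}
\]
where $C_3 = \kappa^{1/2} (C_2 + 1)$.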
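Proof. In view of Lemma II.11, we compute
\[
\begin{aligned}
\langle \tilde\phi_n | (\tilde P_n - P_{n+1})\, \tilde\phi_n \rangle
&= \frac{-1}{2\pi i} \oint_{\Gamma_{n+1}} \Big\langle \tilde\phi_n \Big| \Big[ \frac{1}{\tilde H_n^+ - z_{n+1}} - \frac{1}{H_{n+1} - E_n - z_{n+1}} \Big] \tilde\phi_n \Big\rangle\, dz_{n+1} \\
&= \frac{1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{z_{n+1}} \Big\langle \tilde\phi_n \Big| W^n_{n+1}\, \frac{1}{H_{n+1} - E_n - z_{n+1}}\, \tilde\phi_n \Big\rangle \\
&= \frac{-1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{z_{n+1}^2} \big\langle \tilde\phi_n \big| W^n_{n+1} \tilde\phi_n \big\rangle
 + \frac{1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{z_{n+1}^2} \Big\langle \tilde\phi_n \Big| W^n_{n+1}\, \frac{1}{H_{n+1} - E_n - z_{n+1}}\, W^n_{n+1} \tilde\phi_n \Big\rangle \\
&= \frac{1}{2\pi i} \oint_{\Gamma_{n+1}} \frac{dz_{n+1}}{z_{n+1}^2} \Big\langle \tilde\phi_n \Big| W^n_{n+1}\, \frac{1}{H_{n+1} - E_n - z_{n+1}}\, W^n_{n+1} \tilde\phi_n \Big\rangle ,
\end{aligned}
\tag{II.76}
\]
using (II.62), the second resolvent equation, and the fact that the contour integral $\oint_{\Gamma_{n+1}} z^{-2}\, dz = 0$ vanishes.

Thus, with $\tilde Q_n := \tilde R_n^{-1} = \tilde H_n^+ + \sigma_{n+1}$, we have
\[
\big| \langle \tilde\phi_n | (\tilde P_n - P_{n+1})\, \tilde\phi_n \rangle \big|
 \le \sigma_{n+1}^{-1}\, \big\| \tilde R_n^{1/2} W^n_{n+1} \tilde\phi_n \big\|^2 \cdot \sup_{z_{n+1} \in \Gamma_{n+1}} \big\| \tilde Q_n^{1/2} (H_{n+1} - E_n - z_{n+1})^{-1} \tilde Q_n^{1/2} \big\| ,
\tag{II.77}
\]
and using a Neumann series expansion,
\[
\tilde Q_n^{1/2} (H_{n+1} - E_n - z_{n+1})^{-1} \tilde Q_n^{1/2}
 = \frac{\tilde H_n^+ + \sigma_{n+1}}{\tilde H_n^+ - z_{n+1}} \sum_{k=0}^{\infty} \Big\{ - \tilde R_n^{1/2} W^n_{n+1} \tilde R_n^{1/2}\, \frac{\tilde H_n^+ + \sigma_{n+1}}{\tilde H_n^+ - z_{n+1}} \Big\}^{k} ,
\tag{II.78}
\]
and Lemmata II.4 and II.8, we obtain the norm bound
\[
\big\| \tilde Q_n^{1/2} (H_{n+1} - E_n - z_{n+1})^{-1} \tilde Q_n^{1/2} \big\|
 \le 4 \sum_{k=0}^{\infty} \big( 4 g^{(1)}_{\alpha,\beta} \big)^{k} \le 4 \sum_{k=0}^{\infty} \Big( \frac{2}{3} \Big)^{k} = 12 .
\tag{II.79}
\]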
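The two geometric-series bounds used so far, $4 \sum_{\nu \ge 1} (4g)^\nu \le 32 g$ for $0 \le g \le 1/8$ in (II.70) and $4 \sum_{k \ge 0} (2/3)^k = 12$ in (II.79), can be confirmed with a few lines of arithmetic. The function names below are illustrative, and the infinite sums are truncated after a fixed number of terms, which is an approximation made for this sketch.

```python
def geometric_sum(q, start, terms=2000):
    """Truncated sum of q**nu for nu = start, ..., start + terms - 1, with 0 <= q < 1."""
    return sum(q ** nu for nu in range(start, start + terms))


def lhs_II70(g):
    """Left side of the middle inequality in (II.70): 4 * sum_{nu >= 1} (4 g)**nu."""
    return 4.0 * geometric_sum(4.0 * g, start=1)


def lhs_II79():
    """Last sum in (II.79): 4 * sum_{k >= 0} (2/3)**k."""
    return 4.0 * geometric_sum(2.0 / 3.0, start=0)


if __name__ == "__main__":
    for g in (0.0, 0.01, 0.05, 0.1, 0.125):
        print(g, lhs_II70(g), 32.0 * g)
    print(lhs_II79())
```

The closed form behind the first bound is $4 \cdot 4g / (1 - 4g) \le 32 g$, which holds because $1 - 4g \ge 1/2$ when $g \le 1/8$.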
Combining Lemma II.11, Eq. (II.76), and Eq. (II.79), we obtain the bound
\[
\big\| \tilde P_n - P_{n+1} \big\| \le \sqrt{12}\ \frac{ \big\| \tilde R_n^{1/2} W^n_{n+1} \tilde\phi_n \big\| }{ \sigma_{n+1}^{1/2} } .
\tag{II.80}
\]
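Appealing to (II.28), we write $\tilde R_n^{1/2} W^n_{n+1} \tilde\phi_n = \psi^{(1)} + \psi^{(2)} + 2\psi^{(3)} + 2\psi^{(4)}$, with
\[
\psi^{(1)} := \sigma_{n+1}^{-1/2}\, \| G^n_{n+1} \|^2\, \tilde\phi_n ,
\tag{II.81}
\]
\[
\psi^{(2)} := \tilde R_n^{1/2}\, a^*(G^n_{n+1}) \cdot a^*(G^n_{n+1})\, \tilde\phi_n ,
\tag{II.82}
\]
\[
\psi^{(3)} := \tilde R_n^{1/2}\, a^*(\tilde G^n_{n+1}) \cdot \vec v_n\, \tilde\phi_n
\tag{II.83}
\]
\[
\phantom{\psi^{(3)} :} = \tilde R_n^{1/2}\, \vec v_n \cdot a^*(\tilde G^n_{n+1})\, \tilde\phi_n ,
\tag{II.84}
\]
\[
\psi^{(4)} := \tilde R_n^{1/2}\, a^*(G^n_{0,n+1}) \cdot \vec v_n\, \tilde\phi_n ,
\tag{II.85}
\]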
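where $\tilde G^n_{n+1} := G^n_{n+1} - G^n_{0,n+1}$ and
\[
G^n_{0,n+1}(k) := G^n_{n+1}(k) \big|_{\vec x = 0} = \frac{ \alpha^{3/2}\, \mathbf{1}(|k| < \kappa) }{ 4 \pi^{3/2}\, \omega(k)^{1/2} }\ \varepsilon(k) .
\tag{II.86}
\]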
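Equation (II.84) derives from the transversality of the photon polarization. The contributions $\psi^{(1)}$ and $\psi^{(2)}$ can easily be estimated by
\[
\| \psi^{(1)} \| \le \sigma_{n+1}^{-1/2}\, \| G^n_{n+1} \|^2 \le \frac{1}{5}\, \alpha^3 \sigma_n^2\, \sigma_{n+1}^{-1/2} ,
\tag{II.87}
\]
\[
\| \psi^{(2)} \| \le \sigma_{n+1}^{-1/2}\, \big\| a^*(G^n_{n+1}) \cdot a^*(G^n_{n+1})\, \Omega^n_{n+1} \big\| \le 2\, \sigma_{n+1}^{-1/2}\, \| G^n_{n+1} \|^2 \le \frac{2}{5}\, \alpha^3 \sigma_n^2\, \sigma_{n+1}^{-1/2} .
\tag{II.88}
\]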
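To estimate $\psi^{(3)}$, we recall from (II.12) the definition of $K^n_{n+1} := \{ k \in \mathbb{R}^3 \times \mathbb{Z}_2 \mid \sigma_{n+1} \le \omega(k) < \sigma_n \}$. We observe that, for $k \in K^n_{n+1}$ and any $\psi \in \mathcal{H}_n$,
\[
\begin{aligned}
\tilde R_n^{1/2} \big[ \psi \otimes a^*(k)\, \Omega^n_{n+1} \big]
&= \big[ H_n^+ \otimes \mathbf{1} + \mathbf{1} \otimes (\check H^n_{n+1} + \sigma_{n+1}) \big]^{-1/2}\, \psi \otimes a^*(k)\, \Omega^n_{n+1} \\
&= \big[ (H_n^+ + \omega(k) + \sigma_{n+1})^{-1/2}\, \psi \big] \otimes a^*(k)\, \Omega^n_{n+1} .
\end{aligned}
\tag{II.89}
\]
Writing $\psi^{(3)}$ as an $\mathcal{F}_{n+1}$-valued function of the electron coordinate and using (II.89), we obtain
\[
\begin{aligned}
\big\| \psi^{(3)}(\vec x) \big\|^2_{\mathcal{F}_{n+1}}
&= \Big\| \int dk\ \tilde R_n^{1/2} \big[ \vec v_n \cdot \tilde G^n_{n+1}(k)\, \phi_n \big](\vec x) \otimes a^*(k)\, \Omega^n_{n+1} \Big\|^2_{\mathcal{F}_{n+1}} \\
&= \int dk\ \Big\| \big[ (H_n^+ + \omega(k) + \sigma_{n+1})^{-1/2}\, \vec v_n \cdot \tilde G^n_{n+1}(k)\, \phi_n \big](\vec x) \Big\|^2_{\mathcal{F}_n} \\
&\le C_1 \int \frac{dk}{\omega(k)}\ \Big\| \big[ \tilde G^n_{n+1}(k)\, \phi_n \big](\vec x) \Big\|^2_{\mathcal{F}_n} ,
\end{aligned}
\tag{II.90}
\]
where the last inequality follows from (II.42). Next, we use that, as a function of the two variables $\vec x \in \mathbb{R}^3$ and $k \in \mathbb{R}^3 \times \mathbb{Z}_2$,
\[
\big| \tilde G^n_{n+1}(k) \big| = \sqrt{ \frac{\alpha^3}{16 \pi^3\, \omega(k)} }\ \big| 1 - e^{-i \vec k \cdot \vec x} \big|
 \le \sqrt{ \frac{\alpha^3\, \omega(k)}{16 \pi^3} }\ |\vec x| = \omega(k)\, \big| G^n_{n+1}(k) \big|\, |\vec x| .
\tag{II.91}
\]
Therefore,
\[
\big\| \psi^{(3)} \big\|_{\mathcal{H}_{n+1}} \le C_1^{1/2}\, \big\| \omega^{1/2} G^n_{n+1} \big\|\, \big\| \vec x\, \phi_n \big\| \le \frac{1}{2}\, C_1^{1/2}\, C_2\, \alpha^{3/2}\, \sigma_n ,
\tag{II.92}
\]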
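where $C_2$ is the universal constant in (II.72).

It remains to estimate $\psi^{(4)}$, which is our key estimate. We use that $G^n_{0,n+1}(k)$, as an operator on $\mathcal{H}_{el}$, is a multiple of the identity, for each $k \in K^n_{n+1}$. Thus, by (II.89), we have
\[
\psi^{(4)} = \int dk\ G^n_{0,n+1}(k) \cdot \big[ (H_n^+ + \omega(k) + \sigma_{n+1})^{-1/2}\, \vec v_n \phi_n \big] \otimes a^*(k)\, \Omega^n_{n+1} ,
\tag{II.93}
\]
and then
\[
\begin{aligned}
\big\| \psi^{(4)} \big\|^2_{\mathcal{H}_{n+1}}
&\le \int dk\ \big| G^n_{0,n+1}(k) \big|^2\, \big\| (H_n^+ + \omega(k) + \sigma_{n+1})^{-1/2}\, \vec v_n \phi_n \big\|^2_{\mathcal{H}_n} \\
&\le \int dk\ \big| G^n_{0,n+1}(k) \big|^2\, \big\| (H_n^+ + \omega(k) + \sigma_{n+1})^{-1}\, \vec v_n \phi_n \big\|\ \big\| \vec v_n \phi_n \big\| ,
\end{aligned}
\tag{II.94}
\]
where the last estimate follows from Schwarz' inequality. We observe that, by (II.71),
\[
\big\| (H_n^+ + \omega(k) + \sigma_{n+1})^{-1}\, \vec v_n \phi_n \big\| \le \Big\| \frac{H_n^+}{H_n^+ + \omega(k) + \sigma_{n+1}}\, \vec x\, \phi_n \Big\| \le C_2 ,
\tag{II.95}
\]
where $C_2$ is the constant in (II.72). Inserting Estimate (II.95) in (II.94) and using that $\| \vec v_n \phi_n \| \le \sigma_{n+1}^{1/2}\, \| \vec v_n \tilde R_n^{1/2} \| \le C_1^{1/2}$, we arrive at
\[
\big\| \psi^{(4)} \big\| \le C_1^{1/4}\, C_2^{1/2}\, \big\| G^n_{n+1} \big\| \le \frac{1}{2}\, C_1^{1/4}\, C_2^{1/2}\, \alpha^{3/2}\, \sigma_n .
\tag{II.96}
\]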
Adding up Estimates (II.87), (II.88), (II.92), and (II.96), we obtain √ # 1/2 1/4 1/2 $ 3 2 C C2 3/2 3/2 C1 C2 3/2 n − Pn+1 ≤ 12 3 α σn + 1 P α α σ + σ n n 1/2 1/2 2 2 σn+1 5 σn+1 √ # 3 1/2 1/4 1/2 1/2 C1 C2 α 3/2 σn $ C C2 α 3/2 σn 3 12 α σn ≤ + + 1 5 β 1/2 β 1/2 β √ 1/2 3 12 α 3/2 σn 1/4 1/2 1/2 ≤ 1 + C 1 C 2 + C 1 C2 1/2 5 β C1 α 3/2 n/2 ≤ 3 (C2 + 1) β , (II.97) β 1/2 where we make use of C1 ≥ 8κ ≥ 8 to derive the last inequality. This yields (II.75).
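By Theorem II.12, the sequence $(\phi_n \otimes \Omega^n_\infty)_{n=0}^{\infty}$ converges to a groundstate $\phi_{gs} \in \mathcal{H}$. We summarize our main result in the following corollary.

Corollary II.13. Assume Hypotheses 1 and 2 and $g^{(1)}_{\alpha,\beta} \le 1/8$. Then the groundstate energy $E_{gs} = \inf \sigma(H)$ is a simple eigenvalue of $H$ corresponding to a groundstate eigenvector $\phi_{gs} \in \mathcal{H}$. Moreover,
\[
1 - \big| \langle \phi_{gs} | \phi_n \otimes \Omega^n_\infty \rangle \big|^2 \le 4\, C_3^2\, \big( g^{(1)}_{\alpha,\beta} \big)^2\, \beta^n ,
\tag{II.98}
\]
for all $n \in \mathbb{N}$.

Proof. It remains to show the uniqueness of the groundstate eigenvector $\phi_{gs} \in \mathcal{H}$. To this end, we assume that $\phi^{\#}_{gs} \in \mathcal{H}$ is a normalized groundstate eigenvector orthogonal to $\phi_{gs} = \lim_{n \to \infty} \phi_n \otimes \Omega^n_\infty$, i.e., $\langle \phi^{\#}_{gs} | \phi^{\#}_{gs} \rangle = 1$, $\langle \phi_{gs} | \phi^{\#}_{gs} \rangle = 0$, and $H \phi^{\#}_{gs} = E_{gs}\, \phi^{\#}_{gs}$.

We first remark that (II.36) and (II.37) generalize to $\| G^n_\infty \|^2 \le \tfrac{1}{5} \sigma_n^2 \alpha^3$ and $\| \omega^{-1/2} G^n_\infty \|^2 \le \tfrac{1}{5} \sigma_n \alpha^3$, respectively. Using this, we obtain
\[
\big\| (\hat H_n^+ + \sigma_{n+1})^{-1/2}\, (W^n_\infty + \Delta E^n_\infty)\, (\hat H_n^+ + \sigma_{n+1})^{-1/2} \big\| \le 2\, g^{(1)}_{\alpha,\beta}
\tag{II.99}
\]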
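similarly to the derivation of Lemmata II.4 and II.6, where we introduce
\[
\hat H_n^+ := (H_n - E_n) \otimes \mathbf{1}^n_\infty + \mathbf{1}_n \otimes \check H^n_\infty = H^+ - W^n_\infty - \Delta E^n_\infty .
\tag{II.100}
\]
Equations (II.99) and (II.100) imply
\[
\begin{aligned}
r &:= \big\| (H^+ + \sigma_{n+1})^{-1/2}\, (W^n_\infty + \Delta E^n_\infty)\, (H^+ + \sigma_{n+1})^{-1/2} \big\| \\
&\le 2\, g^{(1)}_{\alpha,\beta}\, \big\| (H^+ + \sigma_{n+1})^{-1/2}\, (\hat H_n^+ + \sigma_{n+1})^{1/2} \big\|^2 \le 2\, g^{(1)}_{\alpha,\beta}\, (1 + r) ,
\end{aligned}
\tag{II.101}
\]
which yields
\[
\big\| (H^+ + \sigma_{n+1})^{-1/2}\, (W^n_\infty + \Delta E^n_\infty)\, (H^+ + \sigma_{n+1})^{-1/2} \big\| = r \le 4\, g^{(1)}_{\alpha,\beta} ,
\tag{II.102}
\]
since $g^{(1)}_{\alpha,\beta} \le \tfrac{1}{4}$. Therefore,
\[
\begin{aligned}
\big| \langle \phi^{\#}_{gs} | (W^n_\infty + \Delta E^n_\infty)\, \phi^{\#}_{gs} \rangle \big|
&= \sigma_{n+1}\, \big| \langle \phi^{\#}_{gs} | (H^+ + \sigma_{n+1})^{-1/2}\, (W^n_\infty + \Delta E^n_\infty)\, (H^+ + \sigma_{n+1})^{-1/2}\, \phi^{\#}_{gs} \rangle \big| \\
&\le 4\, \sigma_{n+1}\, g^{(1)}_{\alpha,\beta} .
\end{aligned}
\tag{II.103}
\]
Secondly, we observe that, for $n \ge 1$, we have $H_n^+ \otimes \mathbf{1}^n_\infty \ge \mathrm{gap}_n\, P_n^\perp \otimes \mathbf{1}^n_\infty$. Hence,
\[
\frac{\sigma_n}{2}\, P_n^\perp \otimes \mathbf{1}^n_\infty \le (H_n - E_n) \otimes \mathbf{1}^n_\infty \le (H_n - E_n) \otimes \mathbf{1}^n_\infty + \mathbf{1}_n \otimes \check H^n_\infty = \hat H_n^+ = H^+ - W^n_\infty - \Delta E^n_\infty .
\tag{II.104}
\]
Sandwiching (II.104) with $\phi^{\#}_{gs}$ and using (II.103), we obtain
\[
\big\langle \phi^{\#}_{gs} \big| \big( P_n^\perp \otimes \mathbf{1}^n_\infty \big)\, \phi^{\#}_{gs} \big\rangle
 \le \frac{2}{\sigma_n}\, \big| \langle \phi^{\#}_{gs} | (W^n_\infty + \Delta E^n_\infty)\, \phi^{\#}_{gs} \rangle \big|
 \le 8\, g^{(1)}_{\alpha,\beta}\, \frac{\sigma_{n+1}}{\sigma_n} \le 8\, g^{(1)}_{\alpha,\beta}\, \beta .
\tag{II.105}
\]
Now let $\hat P_n := P_n \otimes P_{\Omega^n_\infty}$ and observe that
\[
\hat P_n^\perp = P_n^\perp \otimes \mathbf{1}^n_\infty + P_n \otimes P^\perp_{\Omega^n_\infty} \le P_n^\perp \otimes \mathbf{1}^n_\infty + \mathbf{1}_n \otimes N^n_\infty ,
\tag{II.106}
\]
where $N^n_\infty := \int_{K^n_\infty} dk\ a^*(k)\, a(k)$ is the number operator on the Fock space $\mathcal{F}^n_\infty$ of photons of momenta below $\sigma_n$.

Since $\hat P_n := P_n \otimes P_{\Omega^n_\infty}$ converges to the orthogonal projection $P_{gs} := |\phi_{gs}\rangle\langle\phi_{gs}|$ onto $\phi_{gs}$, as $n \to \infty$, and the right side of (II.105) is uniform in $n$, we conclude from (II.106) that
\[
\begin{aligned}
\big| \langle \phi^{\#}_{gs} | \phi_{gs} \rangle \big|^2
&= \lim_{n \to \infty} \langle \phi^{\#}_{gs} | \hat P_n\, \phi^{\#}_{gs} \rangle = 1 - \lim_{n \to \infty} \langle \phi^{\#}_{gs} | \hat P_n^\perp\, \phi^{\#}_{gs} \rangle \\
&\ge 1 - 8\, g^{(1)}_{\alpha,\beta}\, \beta - \limsup_{n \to \infty} \langle \phi^{\#}_{gs} | (\mathbf{1}_n \otimes N^n_\infty)\, \phi^{\#}_{gs} \rangle \\
&\ge 1 - \beta - \limsup_{n \to \infty} \langle \phi^{\#}_{gs} | (\mathbf{1}_n \otimes N^n_\infty)\, \phi^{\#}_{gs} \rangle ,
\end{aligned}
\tag{II.107}
\]
since $8 g^{(1)}_{\alpha,\beta} \le 1$.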
Finally, we invoke a soft-photon bound
\[
\langle \phi^{\#}_{gs} | (\mathbf{1}_{el} \otimes \check N)\, \phi^{\#}_{gs} \rangle < \infty ,
\tag{II.108}
\]
where $\check N := \int_{\mathbb{R}^3 \times \mathbb{Z}_2} dk\ a^*(k)\, a(k)$ is the (total) photon number operator. Estimate (II.108) asserts the finiteness of the photon number expectation. It holds for any groundstate of $H$ and can be derived as in [3]. The finiteness of $\langle \phi^{\#}_{gs} | (\mathbf{1}_{el} \otimes \check N)\, \phi^{\#}_{gs} \rangle$ implies that $\lim_{n \to \infty} \langle \phi^{\#}_{gs} | (\mathbf{1}_n \otimes N^n_\infty)\, \phi^{\#}_{gs} \rangle = 0$. Thus, by choosing $n$ sufficiently large in (II.107), we arrive at $| \langle \phi^{\#}_{gs} | \phi_{gs} \rangle |^2 \ge \tfrac{1}{2} (1 - \beta) > 0$, which is in contradiction to the assumed orthogonality of $\phi^{\#}_{gs}$ and $\phi_{gs}$.

Acknowledgement. We thank M. Shoufan and M. Könenberg for proofreading of the manuscript. Support by the EU network RTN 2-2001-065 is gratefully acknowledged.
A. Proofs of Lemma II.3 and Theorem II.10 Proof of Lemma II.3. We first note that (II.43) follows from (II.42) with ρ := σn+1 . To prove (II.42), we observe that Lemma II.2 implies the quadratic form estimate ∗ n 2 (Hˇ n + κ) ≤ 8κ α 3 (Hˇ n + κ) . (A.1) n ) + a(G n ) 2 ≤ 4 (κ −1 + ω−1 )1/2 G a (G This implies that n ) + a(G n ) 2 ≤ 2 ( vn )2 + 16κ α 3 (Hˇ n + κ) . − ≤ 2 ( vn )2 + 2 a ∗ (G
(A.2)
Since V is a small perturbation relative to −, we have that V ≤ 2ε( vn )2 + 16κεα 3 (Hˇ n + κ) + bε ,
(A.3)
for any ε > 0. It follows that, for any ε, µ > 0, vn )2 − µV + 16κεµα 3 (Hˇ n + κ) + µbε . ( vn )2 ≤ (1 + 2εµ)(
(A.4)
We choose ε := 1/8, µ := 2 and observe that then 16κεµα 3 = 4κα 3 ≤ 4, since α ≤ κ −1/3 , which yields vn )2 − V + Hˇ n + κ + b1/8 ≤ 4[Hn + κ + b1/8 ] ( vn )2 ≤ 4 ( ≤
C1 (Hn − En + ρ) , ρ
(A.5)
where the constant C1 is given by r + E + κ + b 1/8 n = 4 max En + κ + b1/8 , ρ . C1 := 4 ρ sup r +ρ r≥0
(A.6)
Since the groundstate energy eel of Hel is strictly negative, eel < 0, the variational principle with trial vector ϕel ⊗n , where ϕel ∈ Hel is the atomic groundstate, Hel ϕel = eel ϕel , yields n 2 ≤ eel + 1 κ 2 α 3 ≤ κ . En ≤ ϕel ⊗ n | Hn (ϕel ⊗ n ) = eel + G 5 Moreover, ρ ≤ σn ≤ κ, and thus C1 ≤ C1 = 8(κ + b1/8 ).
(A.7)
Proof of Theorem II.10. Fix n ∈ N0 . We first observe that, for any f ∈ C0∞ (R3 ; R), the vector f ( x )φn lies in the domain of Hn , and we have ( x ) φn = [Hn , f ( x )] φn = − f ( x ) − 2i ∇f x ) · vn φn . Hn+ f (
(A.8)
By Lemma II.3, vn φn is bounded by C1 , uniformly in n ∈ N. Thus, + H f ( ∞ (2C1 + 1) . x ) φn ≤ f ∞ + ∇f n
(A.9)
We choose a smooth characteristic function χ ∈ C0∞ (R3 ; [0, 1]) of the unit ball in R3 , i.e., χ ( x ) = 1, for | x | ≤ 1, and χ ( x ) = 0, for | x | ≥ 2, and define # x $ x Fr,R ( x ) := x 1 − χ χ , r R
(A.10)
x )| ≤ −eel /3, and where r ≥ 1 is chosen sufficiently large such that sup|x |≥r/2 |V ( R > r. One easily checks that
sup
3
R>r>1 j =1
r,R;j ∞ ≤ C2 , Fr,R;j ∞ + ∇F
(A.11)
where C2 < ∞ depends only on χ and its derivatives up to second order, and Fr,R =: (Fr,R;1 , Fr,R;2 , Fr,R;3 ). Note that Fr,R = 0 implies | x | ≥ r/2. Further note that Hn+ ≥ 2 n = |eel | − G n 2 , by (A.7). Hence −V ( x ) − En and −En ≥ −eel − G
n 2 ] Fr,R φn x ) + |eel | − G Fr,R φn Hn+ Fr,R φn ≥ Fr,R φn [−V ( |eel | Fr,R φn 2 , ≥ 3
(A.12)
n 2 ≤ |eel |/3. The Cauchyx )φn , using that (II.31) implies G where Fr,R φn := Fr,R ( Schwarz inequality, (A.9), and (A.11) thus yield Fr,R φn ≤
3 (2C1 + 1) C2 3 (Hn − En ) Fr,R φn ≤ . |eel | |eel |
(A.13)
x ) = xχ ( x /r)φn ( x ) + limR→∞ Fr,R ( x )φn ( x ), for a. e. x ∈ R3 , by We have xφn ( Lebesgue’s monotone convergence theorem, and we obtain the bound x φn ≤ χ( x /r) x φn + [1 − χ ( x /r)] x φn ≤ 2 r + 3 (2C1 + 1) C2 |eel |−1 . (A.14) Moreover, the closedness of Hn in combination with the limit R → ∞ implies that x φn belongs to its domain.
References
1. Bach, V., Fröhlich, J., Sigal, I.M.: Quantum electrodynamics of confined non-relativistic particles. Adv. in Math. 137, 299–395 (1998)
2. Bach, V., Fröhlich, J., Sigal, I.M.: Renormalization group analysis of spectral problems in quantum field theory. Adv. in Math. 137, 205–298 (1998)
3. Bach, V., Fröhlich, J., Sigal, I.M.: Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field. Commun. Math. Phys. 207(2), 249–290 (1999)
4. Griesemer, M., Lieb, E., Loss, M.: Ground states in nonrelativistic quantum electrodynamics. Invent. Math. 145, 557–595 (2001)
5. Hiroshima, F.: Functional integral representation of a model in QED. Rev. Math. Phys. 9(4), 489–530 (1997)
6. Hiroshima, F.: Ground states of a model in nonrelativistic quantum electrodynamics II. J. Math. Phys. 41(2), 661–674 (2000)
7. Hiroshima, F., Spohn, H.: Ground state degeneracy of the Pauli-Fierz Hamiltonian with spin. Adv. Theor. Math. Phys. 5(6), 1091–1104 (2001)
8. Kato, T.: Perturbation Theory of Linear Operators, Volume 132 of Grundlehren der mathematischen Wissenschaften, 2nd ed., Berlin-Heidelberg-New York: Springer-Verlag, 1976
9. Lieb, E., Loss, M.: Existence of atoms and molecules in non-relativistic quantum electrodynamics. Adv. Theor. Math. Phys. 7(4), 667–710 (2003)
10. Pizzo, A.: One-particle (improper) states in Nelson's massless model. Ann. H. Poincaré 4(3), 439–486 (2003)
11. Reed, M., Simon, B.: Methods of Modern Mathematical Physics: Analysis of Operators. Volume 4, 1st ed., San Diego: Academic Press, 1978
12. Reed, M., Simon, B.: Methods of Modern Mathematical Physics: II. Fourier Analysis and Self-Adjointness. Volume 2, 2nd ed., San Diego: Academic Press, 1980

Communicated by G. Gallavotti
Commun. Math. Phys. 264, 167–189 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1542-7
Communications in
Mathematical Physics
Approach to Equilibrium in a Microscopic Model of Friction
S. Caprino (1), C. Marchioro (2), M. Pulvirenti (2)
(1) Dipartimento di Matematica, Università di Roma 'Tor Vergata', via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]
(2) Dipartimento di Matematica, Università di Roma 'La Sapienza', P.le Aldo Moro 2, 00185 Roma, Italy. E-mail: [email protected]; [email protected]
Received: 10 March 2005 / Accepted: 2 November 2005
Published online: 11 March 2006 – © Springer-Verlag 2006
Abstract: We consider the time evolution of a disk under the action of a constant force and interacting with a free gas in the mean-field approximation. Letting $V_0 > 0$ be the initial velocity of the disk and $V_\infty > 0$ its equilibrium velocity, namely the one for which the external field is balanced by the friction force exerted by the background, we show that, if $V_\infty - V_0$ is positive and sufficiently small, then the disk reaches $V_\infty$ with the power law $t^{-(d+2)}$, $d = 1, 2, 3$ being the dimension of the physical space. The reason for this behavior is the long tail memory due to recollisions. Any Markovian approximation (or simply neglecting the recollisions) yields an exponential approach to equilibrium.

1. Introduction
Consider a solid body moving along the x-axis, under the action of a constant force $E$, immersed in a homogeneous fluid. Then its time evolution is given by:
\[
M \dot V(t) = -G(V) + E ,
\tag{1.1}
\]
where $V = V(t)$ is the (horizontal) velocity of the body, $M$ its total mass and $G$, the friction term, is usually determined on the basis of phenomenological considerations. Such a function, which for small $V$ often takes the familiar form $G(V) = \lambda V$ for a positive $\lambda$, summarizes all the complex interactions between the body and the medium. If we suppose the body initially at rest and $G(V)$ increasing, the solution $V = V(t)$ of Eq. (1.1) is increasing in time and converges exponentially to the limiting velocity $V_\infty$ which satisfies
\[
G(V_\infty) = E .
\tag{1.2}
\]
In the equilibrium situation the external force is perfectly compensated by the friction force and the body moves with constant velocity.
It would be desirable, of course, to give a microscopic explanation of these facts. The most natural way to pose the problem is to model the medium as an infinite particle system, interacting with an obstacle accelerated by a given field E. Obviously the behavior of the obstacle will depend on the obstacle-background interaction. We address the reader to the classical monograph [10] for heuristic considerations. With regard to rigorous results, we are aware of Ref. [2], where the background is modeled by a vibration field. In this case the obstacle reaches its limiting velocity with an exponential rate. On the other hand in Refs. [4–6] it is shown that a test particle, immersed in a medium of identical interacting particles, accelerates indefinitely whenever the interaction is smooth or moderately diverging. The simplest model to consider is a gas of free light particles elastically interacting with the body. This kind of interaction gives rise to a very irregular motion, with fluctuations which are very small if the ratio between the mass of the body and that of the gas particles is very large. However the averaged motion is expected to be regular and sufficient to give a correct description of the macroscopic behavior of the system. To avoid the difficulties connected with the computations of the averaged quantities, one can alternatively consider the gas in the mean-field approximation, that is the limit in which the mass of the particles constituting the free gas goes to zero, while the number of particles per unit volume diverges, in such a way that the mass density stays finite. Such a limit is well known for interacting particle systems with finite total mass (see [7, 8, 12, 13]) and one-dimensional systems with unbounded mass (see [3]). This is exactly what we do in the present paper, namely we study the time evolution of an obstacle elastically interacting with a free gas in a mean-field approximation. 
This model has been previously introduced in connection with the so-called piston problem (see [9] and also [11] and references quoted therein). We assume that the body has a particularly simple shape, namely we consider a cylinder with a negligible length. We prove that, if the initial velocity of the body is sufficiently close to the limiting velocity $V_\infty$ then, for large $t$:
\[
|V_\infty - V(t)| \approx \frac{C}{t^{d+2}} ,
\tag{1.3}
\]
where C is a positive constant depending on the medium and the shape of the obstacle and d = 1, 2, 3 is the dimension of the physical space. The law (1.3) is not exponential and hence the result is somehow surprising. The reason for this behavior is the appearance of recollisions between the gas particles and the obstacle. Indeed if the obstacle accelerates, it can hit a gas particle many times and this influences the friction force dramatically. In particular a gas particle which has collided quite early, can recollide after an arbitrarily large time. This creates a long tail memory which is responsible for the power law behavior. Neglecting the recollisions, namely assuming that the obstacle always hits new particles at a given thermal equilibrium, the friction force can be computed almost explicitly and the behavior is the one predicted by Eq. (1.1), that is exponential. We show that this approximation is not legitimate in our model. One can argue, however, that such a model is too poor to give realistic information: the background is schematized by a free gas while an interacting system, with good ergodic properties, could reasonably destroy the memory effects which are present in our context. Unfortunately such ergodic properties for Hamiltonian systems seem far to be proven. In any case the result of the present paper at least shows that, in the suitable time scale in which the thermalization of the medium is not yet effective, the approach to the limiting velocity is not exponential but obeys a power law. It is worth mentioning
that it was already known that the recollisions can produce a power-law decay. In fact the velocity-velocity correlation of a tagged particle of a one-dimensional free gas decays as $t^{-3}$. (See for instance Ref. [1]). We prove (1.3) for an obstacle of a particular shape and under the hypothesis that $V_\infty - V_0$ is sufficiently small, $V_0 < V_\infty$ being the initial velocity of the obstacle. It looks quite reasonable to conjecture that a power law holds under more general assumptions on both initial conditions and external field, however this has still to be proven (see Sect. 5). We conclude by outlining the plan of the paper. In the next section we establish and discuss the model which is heuristically justified in the Appendix. Section 3 is devoted to preliminary technicalities, while in Sect. 4 we give the proofs of the main Theorems 2.1 and 2.2. Section 5 is devoted to concluding remarks.

2. Model and Results
The body we consider is a disk of radius $R$ in dimension $d = 3$, a stick of length $2R$ for $d = 2$ and a point particle on the line for $d = 1$. We assume, for simplicity, that it has unit mass. The disk (or the stick) is constrained to stay orthogonal to the x-axis, with the center moving along the same axis. The thickness of the disk is assumed to be negligible; however, this assumption is not essential and it is made just for notational simplicity. The system is immersed in a perfect gas in equilibrium at inverse temperature proportional to $\beta$ and with constant density $\rho$. Moreover a constant force $E$ is acting on the disk. We are interested in the time asymptotics of the system, in particular we want to investigate whether and how the disk reaches a limiting velocity. We assume the perfect gas in the mean-field approximation. In other words the presence of the disk modifies the equilibrium of the gas, which starts to evolve according to the free Vlasov equation. Let $f = f(x, v; t)$, $(x, v) \in \mathbb{R}^d \times \mathbb{R}^d$, be the mass density in the phase space of the gas particles. It evolves according to:
\[
(\partial_t + v \cdot \nabla_x) f(x, v; t) = 0 , \quad \text{for } x \notin D(t) .
\tag{2.1}
\]
Here $D(t)$ denotes the $(d-1)$-dimensional circular surface of radius $R$: $D(t) = \{ y \in {\perp}(X(t)) \mid |y - X(t)|^2 < R^2 \}$. $X(t)$ denotes the position of the center of the disk at time $t$ and ${\perp}(X(t))$ the plane orthogonal to the x-axis at the point $X(t)$. Together with Eq. (2.1) we consider the boundary conditions. They express the continuity of $f$ along the trajectories with elastic reflection on $D(t)$. Defining $v' = (v'_x, v'_\perp)$ as
\[
v'_x = 2 V(t) - v_x , \quad v'_\perp = v_\perp ,
\tag{2.2}
\]
where $V(t) = \dot X(t)$ is the velocity of the disk and $v_x$ and $v_\perp$ the velocity components of the gas particles on the x-axis and the orthogonal plane respectively, we set
\[
f_+(x, v'; t) = f_-(x, v; t) , \quad \text{for } x \in D(t) ,
\tag{2.3}
\]
where
\[
f_\pm(x, v; t) = \lim_{\varepsilon \to 0^+} f(x \pm \varepsilon v, v; t \pm \varepsilon) , \quad \text{for } x \in D(t) .
\tag{2.4}
\]
Equation (2.3) describes both the continuity along the collisions from the right $V(t) > v_x$ and from the left $V(t) < v_x$. Coupled to Eq. (2.1) we consider the evolution equation for the disk:
\[
\dot X(t) = V(t) , \quad \dot V(t) = E - F(t) , \quad X(0) = 0 , \quad V(0) = V_0 ,
\tag{2.5}
\]
where $E > 0$ is a constant given field and
\[
F(t) = 2 \int_{D(t)} dx \int_{v_x < V(t)} dv\ (V(t) - v_x)^2 f_-(x, v; t) - 2 \int_{D(t)} dx \int_{v_x \ge V(t)} dv\ (V(t) - v_x)^2 f_-(x, v; t)
\tag{2.6}
\]
is the action of the gas on the disk. As initial state for the gas distribution we assume the thermal equilibrium, namely
\[
\lim_{\varepsilon \to 0^+} f(x + \varepsilon v, v; \varepsilon) = \rho \Big( \frac{\beta}{\pi} \Big)^{d/2} e^{-\beta v^2} ,
\tag{2.7}
\]
for $\beta > 0$. We incidentally remark that the results in the present paper hold for any initial datum of the form $\rho\, g(v^2)$, with $g$ integrably decreasing. Summarizing, we call a solution to the friction problem any pair $(f, V)$ where $V = V(t)$ solves, for almost all $t \in \mathbb{R}^+$, Eqs. (2.5), (2.6) and $f$ satisfies Eq. (2.8) below,
\[
\frac{d}{dt} f(x + v t, v; t) = 0 , \quad \text{a.e. } (x, v) ,
\tag{2.8}
\]
together with boundary conditions (2.3) and initial condition (2.7). Note that similar models have been introduced in Ref. [9]. Here we give a heuristic derivation of the model in the Appendix. We first observe that Eq. (2.1) can be solved by means of characteristics. More precisely, knowing the evolution of the disk $X(t)$, $V(t)$, we can trace back the time evolution of position and velocity of the gas particle $x(s, t; x, v)$, $v(s, t; x, v)$ at time $s \le t$, having position and velocity $x$, $v$ at time $t$. Such backward evolution is the free motion up to the last instant $\tau < t$ in which the particle hits the disk. On the surface of the disk we impose the elastic collision, namely:
\[
v_x(\tau^-) = 2 V(\tau) - v_x(\tau^+) , \quad v_\perp(\tau^+) = v_\perp(\tau^-) .
\]
Then we go backward in time, up to the last collision but one. We impose again the reflection condition, and so on. Note that if $x \in D(t)$ then $v$ has to be interpreted as a precollisional velocity, namely $v = \lim_{s \to t^-} v(s, t; x, v)$. At the end we obtain
\[
F(t) = 2 \rho \Big( \frac{\beta}{\pi} \Big)^{d/2} \Big[ \int_{D(t)} dx \int_{v_x < V(t)} dv\ (V(t) - v_x)^2 e^{-\beta v^2(0, t; x, v)} - \int_{D(t)} dx \int_{v_x \ge V(t)} dv\ (V(t) - v_x)^2 e^{-\beta v^2(0, t; x, v)} \Big] .
\tag{2.9}
\]
Note that to compute $F(t)$ we need to evaluate $v(0, t; x, v)$ and hence to know all the previous history $\{X(s), V(s), s < t\}$. On the other hand, if the light particle goes back without undergoing any collision, then $v(0, t; x, v) = v$. In this case we say, for obvious reasons, that the gas particle has no recollisions (the very last collision, namely the one at time $t$, is automatically taken into consideration because the gas particle is, at time $t$, on the surface of the disk). In absence of the recollisions the friction term is easily computed:
\[
F_0(V) = 2 \rho \Big( \frac{\beta}{\pi} \Big)^{d/2} \sigma_d \Big[ \int_{v_x < V} dv\ (V - v_x)^2 e^{-\beta v^2} - \int_{v_x \ge V} dv\ (V - v_x)^2 e^{-\beta v^2} \Big] ,
\tag{2.10}
\]
where $\sigma_d$ is the area of the disk. Let $V_\infty$ be the solution of
\[
F_0(V_\infty) = E .
\tag{2.11}
\]
We assume that $V_\infty \ge V_0 > 0$. We will show in Lemma 2.1 that $F_0$ is a positive, increasing and convex function in the interval $(0, V_\infty]$. Now we see that, neglecting recollisions, our problem becomes trivial. Indeed replacing $F$ by $F_0$ in Eq. (2.5) we have:
\[
\dot X(t) = V(t) , \quad \dot V(t) = E - F_0(V(t)) = K(t)\, (V_\infty - V(t)) ; \quad X(0) = 0 , \quad V(0) = V_0 ,
\tag{2.12}
\]
where
\[
K(t) = \frac{F_0(V_\infty) - F_0(V(t))}{V_\infty - V(t)} .
\]
The solution to Eq. (2.12) can be almost explicitly computed. We note that $V$ is increasing in time and converging to $V_\infty$. Furthermore a standard comparison argument shows that
\[
\gamma\, e^{-C_- t} \le V_\infty - V(t) \le \gamma\, e^{-C_+ t} ,
\tag{2.13}
\]
where
\[
\gamma = V_\infty - V_0 \quad \text{and} \quad C_+ = F_0'(V_0) \le C_- = F_0'(V_\infty) .
\tag{2.14}
\]
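The comparison estimate (2.13) can be illustrated on a toy problem. In the sketch below the friction law `F0(V) = V + V**2` is a hypothetical stand-in (positive, increasing and convex for $V > 0$), not the integral (2.10); the equation $\dot V = E - F_0(V)$ is integrated with a classical fourth-order Runge-Kutta scheme, and the two exponential envelopes with rates $C_\pm$ from (2.14) are checked along the trajectory. Step size, time horizon and tolerance are likewise assumptions of the example.

```python
import math


def F0(V):
    # Hypothetical friction law: positive, increasing, convex for V > 0.
    return V + V * V


def dF0(V):
    # Derivative of the hypothetical friction law.
    return 1.0 + 2.0 * V


def simulate(V0, V_inf, t_end=5.0, dt=0.01):
    """Integrate dV/dt = E - F0(V), with E = F0(V_inf), by classical RK4."""
    E = F0(V_inf)

    def rhs(V):
        return E - F0(V)

    steps = int(round(t_end / dt))
    V = V0
    trajectory = [(0.0, V0)]
    for i in range(1, steps + 1):
        k1 = rhs(V)
        k2 = rhs(V + 0.5 * dt * k1)
        k3 = rhs(V + 0.5 * dt * k2)
        k4 = rhs(V + dt * k3)
        V += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        trajectory.append((i * dt, V))
    return trajectory


def check_bounds(V0, V_inf, tol=1e-12):
    """True if gamma*exp(-C_minus*t) <= V_inf - V(t) <= gamma*exp(-C_plus*t) on the grid."""
    gamma = V_inf - V0
    C_plus, C_minus = dF0(V0), dF0(V_inf)
    for t, V in simulate(V0, V_inf):
        gap = V_inf - V
        if gap < gamma * math.exp(-C_minus * t) - tol:
            return False
        if gap > gamma * math.exp(-C_plus * t) + tol:
            return False
    return True


if __name__ == "__main__":
    print(check_bounds(0.5, 1.0))
```

For this toy law the gap $u = V_\infty - V$ obeys $\dot u = -u\,(3 - u)$ when $V_\infty = 1$, so the decay rate stays strictly between $C_+ = 2$ and $C_- = 3$, which is what the check observes.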
The Vlasov equation (2.1) is then solved by characteristics. The full problem (including recollisions) is considerably more difficult because it is not Markovian, since the friction term $F$ at time $t$ depends on the previous history $X(s)$, $V(s)$ with $s < t$. As we shall see, this long range memory affects the behavior of the system in a crucial way. For further convenience we rewrite the full friction term $F$ as:
\[
F(t) = F_0(V(t)) + r^+(t) + r^-(t) ,
\tag{2.15}
\]
where $r^+(t)$ and $r^-(t)$ are the contributions coming from right and left recollisions respectively. Explicitly:
\[
r^+(t) = 2 \rho \Big( \frac{\beta}{\pi} \Big)^{d/2} \int_{D(t)} dx \int_{v_x < V(t)} dv\ (V(t) - v_x)^2 \Big( e^{-\beta v^2(0, t; x, v)} - e^{-\beta v^2} \Big)
\tag{2.16}
\]
and
\[
r^-(t) = 2 \rho \Big( \frac{\beta}{\pi} \Big)^{d/2} \int_{D(t)} dx \int_{v_x \ge V(t)} dv\ (V(t) - v_x)^2 \Big( e^{-\beta v^2} - e^{-\beta v^2(0, t; x, v)} \Big) .
\tag{2.17}
\]
We note that $r^+(t)$ and $r^-(t)$ are both nonnegative, as follows from the collision law (2.2). It turns out that $r^-(t)$ slows down the disk, in spite of the fact that this term arises from the left recollisions. The reason is that, if the disk slows down, F0 includes many kinematically impossible left collisions which must be compensated. We consider as data of the problem the quantities ρ, β, R, V∞ (or equivalently E) and γ = V∞ − V0. We are now in a position to state the main result of the present paper.

Theorem 2.1. There exists γ0 = γ0(ρ, β, R, V∞) > 0 sufficiently small such that, for any γ ∈ (0, γ0), there exists at least one solution (V(t), f(t)) to problem (2.1)–(2.6). Moreover any solution (V(t), f(t)) satisfies, for any t ≥ 0:
$$V_\infty - V(t) \le e^{-C_+ t}\gamma + \frac{A_+}{(1+t)^{d+2}}\gamma^3, \tag{2.18}$$
for a suitable positive constant A+ independent of γ, and
$$V_\infty - V(t) \ge e^{-C_- t}\gamma. \tag{2.19}$$
The next theorem shows that bound (2.19) can be improved.

Theorem 2.2. Let γ ∈ (0, γ0). There exists a sufficiently large $\bar t$, depending on γ, such that any solution (f, V) to problem (2.1)–(2.6) satisfies for any t ≥ 0:
$$V_\infty - V(t) \ge e^{-C_- t}\gamma + \frac{A_-}{t^{d+2}}\gamma^4\,\chi(\{t > \bar t\}), \tag{2.20}$$
where A− is a positive constant, independent of γ, and χ({. . .}) is the characteristic function of {. . .}.

Note that the above theorems establish the power law approach to the stationary state. For the sake of concreteness we shall prove Thms. 2.1 and 2.2 for the three-dimensional case. The remaining cases d = 1, 2 follow by the same arguments with obvious modifications. Now we prove the announced properties of F0.

Lemma 2.1. F0 is a positive, increasing and convex function in (0, V∞].
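Proof. By (2.10), for a constant C > 0:
$$F_0(V) = C\int dv_\perp\, e^{-\beta v_\perp^2}\left[\int_{-\infty}^{V} dv_x\,(V - v_x)^2 e^{-\beta v_x^2} - \int_{V}^{+\infty} dv_x\,(V - v_x)^2 e^{-\beta v_x^2}\right]. \tag{2.21}$$
By the simple change of variables $v_x \to -v_x$ in the second integral (and absorbing the Gaussian integral over $v_\perp$ into C) we obtain
$$\begin{aligned}
F_0(V) &= C\left[\int_{-\infty}^{V} dv_x\,(V - v_x)^2 e^{-\beta v_x^2} - \int_{-\infty}^{-V} dv_x\,(V + v_x)^2 e^{-\beta v_x^2}\right] \\
&\ge C\int_{-\infty}^{-V} dv_x\left[(V - v_x)^2 - (V + v_x)^2\right] e^{-\beta v_x^2} \\
&= -4CV\int_{-\infty}^{-V} dv_x\, v_x\, e^{-\beta v_x^2} > 0.
\end{aligned} \tag{2.22}$$
Moreover
$$F_0'(V) = 2C\left[\int_{-\infty}^{V} dv_x\,(V - v_x) e^{-\beta v_x^2} - \int_{-\infty}^{-V} dv_x\,(V + v_x) e^{-\beta v_x^2}\right] \ge -4C\int_{-\infty}^{-V} dv_x\, v_x\, e^{-\beta v_x^2} > 0. \tag{2.23}$$
Finally
$$F_0''(V) = 2C\left[\int_{-\infty}^{V} dv_x\, e^{-\beta v_x^2} - \int_{-\infty}^{-V} dv_x\, e^{-\beta v_x^2}\right] > 0. \tag{2.24}$$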
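The three properties in Lemma 2.1 can be cross-checked numerically in the simplest setting. The sketch below is not part of the proof: it assumes d = 1, σ1 = 1 and ρ = β = 1 by default, uses closed forms of F0, F0′ and F0″ derived here from (2.21)–(2.24) via the error function, and compares F0 against a direct midpoint quadrature of the defining integral. The function names and the truncation of the velocity range are choices made for this illustration.

```python
import math


def F0(V, rho=1.0, beta=1.0):
    """Closed form of F0 for d = 1, sigma_1 = 1 (derived for this sketch)."""
    root = math.sqrt(beta)
    return 2.0 * rho * ((V * V + 0.5 / beta) * math.erf(root * V)
                        + V * math.exp(-beta * V * V) / math.sqrt(math.pi * beta))


def dF0(V, rho=1.0, beta=1.0):
    """First derivative of the closed form above."""
    root = math.sqrt(beta)
    return 4.0 * rho * (V * math.erf(root * V)
                        + math.exp(-beta * V * V) / math.sqrt(math.pi * beta))


def d2F0(V, rho=1.0, beta=1.0):
    """Second derivative: a positive multiple of erf(sqrt(beta) V), cf. (2.24)."""
    return 4.0 * rho * math.erf(math.sqrt(beta) * V)


def F0_quadrature(V, rho=1.0, beta=1.0, n=20000):
    """Midpoint rule for the defining integral, truncated to |v_x| <= 10/sqrt(beta)."""
    half_width = 10.0 / math.sqrt(beta)
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        u = -half_width + (i + 0.5) * h
        sign = 1.0 if u < V else -1.0  # v_x < V enters with +, v_x > V with -
        total += sign * (V - u) ** 2 * math.exp(-beta * u * u)
    return 2.0 * rho * math.sqrt(beta / math.pi) * total * h


def lemma_2_1_holds(V_max=2.0, steps=40):
    """Check positivity, monotonicity and convexity on a grid in (0, V_max]."""
    grid = [V_max * (k + 1) / steps for k in range(steps)]
    return all(F0(V) > 0 and dF0(V) > 0 and d2F0(V) > 0 for V in grid)


if __name__ == "__main__":
    print(lemma_2_1_holds(), abs(F0(0.5) - F0_quadrature(0.5)))
```

In this one-dimensional setting the closed form gives F0″(V) proportional to erf(√β V), which makes the convexity on V > 0 transparent.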
3. Preliminary Results

In what follows the symbol C will indicate any positive constant, possibly depending on β, V∞, ρ, R, but not on γ, which is our small parameter. Any such constant is explicitly computable. For any γ ∈ (0, γ0) with γ0 sufficiently small, we introduce an a.e. differentiable function t → W(t) ∈ [V0, V∞], with bounded derivative, such that W(0) = V0, $\lim_{t\to\infty} W(t) = V_\infty$ and satisfying the following properties:

(i) W is increasing over the interval [0, t0], with
$$t_0 = \frac{1}{2C_-}\log\frac{C_+}{\gamma}. \tag{3.1}$$
(ii) There exists a positive constant A+ such that, for any t ≥ 0:
$$V_\infty - W(t) \le e^{-C_+ t}\gamma + \frac{A_+}{(1+t)^5}\gamma^3 \tag{3.2}$$
and
$$V_\infty - W(t) \ge e^{-C_- t}\gamma. \tag{3.3}$$
The constant A+, independent of γ and γ0, will be fixed later on.
We collect in the following lemma some properties of the function W, which will be useful in the sequel. For 0 ≤ s < t, we set
$$\langle W\rangle_{s,t} = \frac{1}{t-s}\int_s^t W(\tau)\,d\tau \tag{3.4}$$
and
$$\langle W\rangle_{0,t} = \langle W\rangle_t. \tag{3.5}$$
Lemma 3.1. Suppose γ0 sufficiently small. Then:

i) For any t > 0 we have:
$$W(t) > \langle W\rangle_t. \tag{3.6}$$
ii) $t \to \langle W\rangle_t$ is a strictly increasing function.

iii) For any s ∈ (0, t),
$$\langle W\rangle_{s,t} > \langle W\rangle_t. \tag{3.7}$$
iv) For any t > 0, the following bound holds:
$$W(t) - \langle W\rangle_t \le \frac{C}{1+t}\,(\gamma + A_+\gamma^3). \tag{3.8}$$
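Proof. Proof of i). The result is true for t ≤ t0 because in this region W is increasing. For t ≥ t0 we have by (3.2) and (3.3):
$$\begin{aligned}
W(t) - \langle W\rangle_t &= \frac{1}{t}\int_0^t ds\,\left[(V_\infty - W(s)) - (V_\infty - W(t))\right] \\
&\ge \frac{\gamma}{t}\int_0^t ds\,\left(e^{-C_- s} - e^{-C_+ t}\right) - \gamma^3\frac{A_+}{(1+t)^5} \\
&= \gamma\left[\frac{1 - e^{-C_- t}}{C_- t} - e^{-C_+ t}\right] - \gamma^3\frac{A_+}{(1+t)^5},
\end{aligned} \tag{3.9}$$
which is positive, by choosing γ sufficiently small and consequently t0 sufficiently large.

Proof of ii).
$$\frac{d}{dt}\langle W\rangle_t = -\frac{1}{t^2}\int_0^t d\tau\, W(\tau) + \frac{1}{t}W(t) = \frac{1}{t}\left[W(t) - \langle W\rangle_t\right] > 0 \tag{3.10}$$
by i).

Proof of iii).
$$\begin{aligned}
\langle W\rangle_{s,t} - \langle W\rangle_t &= \frac{1}{t-s}\int_s^t W(\tau)\,d\tau - \frac{1}{t}\int_0^t W(\tau)\,d\tau \\
&= \left(\frac{1}{t-s} - \frac{1}{t}\right)\int_0^t W(\tau)\,d\tau - \frac{1}{t-s}\int_0^s W(\tau)\,d\tau \\
&= \frac{s}{t-s}\left[\frac{1}{t}\int_0^t W(\tau)\,d\tau - \frac{1}{s}\int_0^s W(\tau)\,d\tau\right] > 0
\end{aligned} \tag{3.11}$$
by ii).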
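Proof of iv). For t ≤ 1 we have
$$W(t) - \langle W\rangle_t \le \gamma \le \frac{2\gamma}{1+t}. \tag{3.12}$$
On the other hand by (3.2) we have, for t > 1,
$$\begin{aligned}
W(t) - \langle W\rangle_t &= \frac{1}{t}\int_0^t ds\,\left[V_\infty - W(s) - (V_\infty - W(t))\right] \\
&\le \frac{1}{t}\int_0^t ds\,\left(e^{-C_+ s}\gamma + \frac{A_+}{(1+s)^5}\gamma^3\right) \le \frac{C}{t}\,(\gamma + A_+\gamma^3) \\
&\le \frac{C}{1+t}\,(\gamma + A_+\gamma^3).
\end{aligned} \tag{3.13}$$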
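Items i)–iii) of Lemma 3.1 can be checked numerically on a concrete velocity profile. The Python sketch below is only an illustration: the profile W(t) = V∞ − γ e^{−ct} with V∞ = 1, γ = 0.1, c = 1 is a hypothetical example chosen here (it is increasing for all times, so it is a particularly simple instance), and the time averages are approximated by a midpoint rule.

```python
import math


def average(W, s, t, n=2000):
    """Midpoint-rule approximation of <W>_{s,t} = (1/(t-s)) * integral of W over [s, t]."""
    h = (t - s) / n
    return sum(W(s + (i + 0.5) * h) for i in range(n)) / n


def sample_W(t, V_inf=1.0, gamma=0.1, c=1.0):
    """Hypothetical profile W(t) = V_inf - gamma * exp(-c t), with W(0) = V_inf - gamma."""
    return V_inf - gamma * math.exp(-c * t)


def check_lemma_3_1(times=(0.5, 1.0, 2.0, 5.0, 10.0)):
    """Check i) W(t) > <W>_t, ii) <W>_t increasing, iii) <W>_{s,t} > <W>_t."""
    previous_mean = None
    for t in times:
        mean_t = average(sample_W, 0.0, t)
        if not sample_W(t) > mean_t:                                  # item i)
            return False
        if previous_mean is not None and not mean_t > previous_mean:  # item ii)
            return False
        for fraction in (0.1, 0.5, 0.9):                              # item iii)
            if not average(sample_W, fraction * t, t) > mean_t:
                return False
        previous_mean = mean_t
    return True


if __name__ == "__main__":
    print(check_lemma_3_1())
```

For this profile the averages are also available in closed form, ⟨W⟩_{s,t} = V∞ − γ (e^{−cs} − e^{−ct}) / (c (t − s)), which can serve as an independent check of the quadrature.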
4. Proofs

Proof of Theorem 2.1. The strategy in proving Theorem 2.1 is the following. For an assigned velocity W of the disk (with the properties stated in the previous section), we can solve the free Vlasov equation outside the disk moving with velocity W and compute the friction contribution due to the recollisions, namely $r_W^+$ and $r_W^-$ defined below. Then we solve Eq. (2.5) for the disk with assigned $r_W^+$ and $r_W^-$, finding a new velocity $V_W$. Obviously the solution of our problem is the fixed point of the map $W \to V_W$ (if any), so that our main goal is to infer for $V_W$ the same properties established in (3.1), (3.2) and (3.3) for W (see Proposition 4.1 below).

Let W be defined as in the previous section and $X(t) = \int_0^t W(\tau)\,d\tau$ be the position of the disk at time t. Consider the modified problem:
$$\frac{d}{dt}\left(V_\infty - V_W(t)\right) = -K(t)\left(V_\infty - V_W(t)\right) + r_W^+(t) + r_W^-(t), \tag{4.1}$$
where K(t) is the function introduced in Eq. (2.12) with $V_W(t)$ in place of V(t),
$$r_W^+(t) = 2\rho\left(\frac{\beta}{\pi}\right)^{3/2}\int_{D(t)} dx \int_{v_x \le W(t)} dv\,(v_x - W(t))^2\left(e^{-\beta v^2(0,t;x,v)} - e^{-\beta v^2}\right) \tag{4.2}$$
and
$$r_W^-(t) = 2\rho\left(\frac{\beta}{\pi}\right)^{3/2}\int_{D(t)} dx \int_{v_x \ge W(t)} dv\,(v_x - W(t))^2\left(e^{-\beta v^2} - e^{-\beta v^2(0,t;x,v)}\right). \tag{4.3}$$
The velocities of the light particles v(s, t; x, v), s < t, are computed according to the evolution X(s), W (s) of the disk and the law of elastic reflection (2.2). Moreover the dynamics of the system leads a fluid particle to have a finite number of collisions for almost all t, x ∈ D(t) and v. Finally we note that the tangential collisions, namely those for which there exists a time s < t such that x ∈ D(t), x(s, t; x, v) ∈ D(s) and vx (s, t; x, v) = W (s), constitute a zero (t, x, v) measure set. These claims are proven in Proposition A.1 in the Appendix and in the sequel the possibility of having infinitely
many or tangential collisions will be neglected. As a consequence Eq. (4.1) holds for a.e. t ∈ R+. In view of the fixed point argument we shall use in the sequel, we want to show that $V_W$ behaves like W. Preliminarily, however, we have to estimate $r_W^\pm$. To have a recollision from the right it is necessary that $v_x < W(t)$, x ∈ D(t), $v(0, t; x, v) \ne v$, and that a time s < t exists such that
$$v_x (t - s) = X(t) - X(s) = \int_s^t W(\tau)\,d\tau, \tag{4.4}$$
that is, $v_x = \langle W\rangle_{s,t}$ for some s ∈ (0, t), and
$$|x_\perp - v_\perp (t - s)| \le 2R. \tag{4.5}$$
Thus by Lemma 3.1 iii), for a recollision to happen it is necessary that
$$v_x \ge \langle W\rangle_t \quad\text{and}\quad |v_\perp| \le \frac{2R}{t-s}. \tag{4.6}$$
Lemma 4.1. Let A+ be the constant in (3.2). Then for any t ≥ 0,
$$r_W^+(t) \le C\,\frac{(\gamma + A_+\gamma^3)^3}{(1+t)^5}, \tag{4.7}$$
$$r_W^-(t) \le C\,\chi(\{t > t_0\})\left(\frac{\gamma + A_+\gamma^3}{(1+t)^5}\right)^3. \tag{4.8}$$
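Proof. We start by estimating $r_W^+(t)$. Recalling that, by (2.2), $v_\perp(0, t; x, v) = v_\perp$, from (4.2) and (4.6) it follows:
$$r_W^+(t) \le C\int_{\langle W\rangle_t}^{W(t)} dv_x\,(v_x - W(t))^2 \int dv_\perp\, e^{-\beta v_\perp^2}\,\chi\left(\left\{|v_\perp| < \frac{2R}{t - \bar s}\right\}\right), \tag{4.9}$$
where $\bar s$ is the maximal solution of Eq. (4.4), namely the first backward recollision time. For $v_x$ such that $\bar s < t/2$, we have:
$$\int dv_\perp\, e^{-\beta v_\perp^2}\,\chi\left(\left\{|v_\perp| < \frac{2R}{t - \bar s}\right\}\right) \le C\left(\frac{R}{1+t}\right)^2. \tag{4.10}$$
Therefore we have a first contribution to the estimate of $r_W^+(t)$ which is
$$C\left(\frac{1}{1+t}\right)^2\int_{\langle W\rangle_t}^{W(t)} dv_x\,(v_x - W(t))^2 \le C\left(\frac{1}{1+t}\right)^2\left(W(t) - \langle W\rangle_t\right)^3. \tag{4.11}$$
On the other hand, if $\bar s > t/2$, from (4.4) and (3.2) it follows:
$$\begin{aligned}
v_x &= W(t) - \frac{1}{t - \bar s}\int_{\bar s}^t d\tau\,(W(t) - W(\tau)) \\
&\ge W(t) - \frac{1}{t - \bar s}\int_{\bar s}^t d\tau\,(V_\infty - W(\tau)) \\
&\ge W(t) - \frac{1}{t - \bar s}\int_{\bar s}^t d\tau\left(e^{-C_+\tau}\gamma + \frac{A_+\gamma^3}{(1+\tau)^5}\right) \\
&\ge W(t) - \frac{e^{-C_+\bar s} - e^{-C_+ t}}{C_+(t - \bar s)}\,\gamma - C\,\frac{A_+\gamma^3}{(1+t)^5}.
\end{aligned} \tag{4.12}$$
Since
$$\frac{1 - e^{-C_+(t - \bar s)}}{C_+(t - \bar s)} < C,$$
it follows that
$$v_x \ge W(t) - C\left(\gamma e^{-C_+ t/2} + \frac{A_+\gamma^3}{(1+t)^5}\right) \ge W(t) - C\,\frac{\gamma + A_+\gamma^3}{(1+t)^5}.$$
Hence the second contribution to the estimate of $r_W^+(t)$ is
$$C\int_0^{+\infty} dv_x\,(v_x - W(t))^2\,\chi\left(\left\{W(t) - \frac{C(\gamma + A_+\gamma^3)}{(1+t)^5} \le v_x \le W(t)\right\}\right) \le C\left(\frac{\gamma + A_+\gamma^3}{(1+t)^5}\right)^3. \tag{4.13}$$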
Collecting estimates (4.11), (4.13) and using Lemma 3.1 iv), we finally obtain (4.7). For $r_W^-$ we prove a similar estimate. First we notice that, as long as W is increasing, $r_W^-(t) = 0$, and this justifies the characteristic function in Eq. (4.8). Moreover if $v(0, t; x, v) \ne v$, x ∈ D(t), and $v_x > W(t)$, there exists s < t such that
$$v_x = 2W(s) - v_x^* \tag{4.14}$$
for some $v_x^* > W(s)$. Hence
$$v_x \le 2W(s) - W(s) < V_\infty. \tag{4.15}$$
Thus from (4.3) we obtain:
$$r_W^-(t) \le C\int_{W(t)}^{V_\infty} (v_x - W(t))^2\,dv_x \le C\,(V_\infty - W(t))^3. \tag{4.16}$$
By using (3.2) we obtain:
$$r_W^-(t) \le C\left(\gamma e^{-C_+ t} + \frac{\gamma^3 A_+}{(1+t)^5}\right)^3. \tag{4.17}$$
We obtain (4.8) by observing that $e^{-C_+ t} \le \frac{C}{(1+t)^5}$.
Now we prove that the function $V_W(t)$ satisfying Eq. (4.1) enjoys, for γ suitably small, the same properties as the function W, with the same constant A+.

Proposition 4.1. Suppose γ sufficiently small. Then:

(i) $t \to V_W(t)$ is an a.e. differentiable function, increasing over the interval [0, t0] with
$$t_0 = \frac{1}{2C_-}\log\frac{C_+}{\gamma}.$$
(ii) For any t ≥ 0:
$$V_\infty - V_W(t) \ge e^{-C_- t}\gamma. \tag{4.18}$$
(iii) For any t ≥ 0:
$$V_\infty - V_W(t) < e^{-C_+ t}\gamma + \frac{A_+}{(1+t)^5}\gamma^3. \tag{4.19}$$
Proof. Since $r_W^+(t)$ and $r_W^-(t)$ are bounded, $V_W$ is a.e. differentiable with uniformly, essentially bounded derivative. Moreover from Eq. (4.1) and the Duhamel formula we have:
$$V_\infty - V_W(t) = \gamma\, e^{-\int_0^t K(\tau)\,d\tau} + \int_0^t ds\, e^{-\int_s^t K(\tau)\,d\tau}\left(r_W^+(s) + r_W^-(s)\right), \tag{4.20}$$
which shows, by the positivity of $r_W^+(t)$ and $r_W^-(t)$, that $V_W(t) < V_\infty$ for any t. Thus $K(t) < F_0'(V_\infty) = C_-$ and, again from (4.20), we get
$$V_\infty - V_W(t) \ge e^{-C_- t}\gamma, \tag{4.21}$$
which proves ii). Moreover $V_W(t) > V_0$ for any t > 0. Indeed, by (4.1):
$$\begin{aligned}
\frac{d}{dt}\left(V_W(t) - V_0\right) &= F_0(V_\infty) - F_0(V_W(t)) - r_W^+(t) - r_W^-(t) \\
&= \gamma\,\frac{F_0(V_\infty) - F_0(V_0)}{V_\infty - V_0} - \frac{F_0(V_W(t)) - F_0(V_0)}{V_W(t) - V_0}\left(V_W(t) - V_0\right) - r_W^+(t) - r_W^-(t).
\end{aligned} \tag{4.22}$$
By the properties of F0 and Lemma 4.1 we obtain, for γ sufficiently small:
$$\frac{d}{dt}\left(V_W(t) - V_0\right) > -\frac{F_0(V_W(t)) - F_0(V_0)}{V_W(t) - V_0}\left(V_W(t) - V_0\right). \tag{4.23}$$
This implies $V_W(t) > V_0$ for any t > 0 and consequently $K(t) > F_0'(V_0) = C_+$. Hence, by Eqs. (4.1), (4.21) and again Lemma 4.1 we have:
$$\begin{aligned}
\frac{d}{dt}\left(V_\infty - V_W(t)\right) &\le -C_+\left(V_\infty - V_W(t)\right) + r_W^+(t) + r_W^-(t) \\
&\le -C_+\gamma e^{-C_- t} + C\,\frac{(\gamma + A_+\gamma^3)^3}{(1+t)^5} \\
&\le -C_+\gamma e^{-C_- t} + \gamma^2
\end{aligned} \tag{4.24}$$
for γ sufficiently small, and this implies
$$\frac{d}{dt}\left(V_\infty - V_W(t)\right) < 0 \tag{4.25}$$
for t ∈ [0, t0], so that i) is proven.
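It remains to prove (iii). From Eq. (4.20) and Lemma 4.1 it follows:
$$\begin{aligned}
V_\infty - V_W(t) &\le e^{-C_+ t}\gamma + \int_0^t ds\, e^{-C_+(t-s)}\left(r_W^+(s) + r_W^-(s)\right) \\
&\le e^{-C_+ t}\gamma + C\,(\gamma + A_+\gamma^3)^3\int_0^t ds\,\frac{e^{-C_+(t-s)}}{(1+s)^5}.
\end{aligned} \tag{4.26}$$
Let us evaluate the integral:
$$\int_0^t \frac{e^{C_+ s}}{(1+s)^5}\,ds = \int_0^{t/2}(\cdot)\,ds + \int_{t/2}^t(\cdot)\,ds \le \frac{e^{C_+ t/2} - 1}{C_+} + \frac{2^5}{(2+t)^5}\,\frac{e^{C_+ t} - e^{C_+ t/2}}{C_+}. \tag{4.27}$$
Thus
$$\begin{aligned}
\int_0^t ds\,\frac{e^{-C_+(t-s)}}{(1+s)^5} &\le \frac{e^{-C_+ t/2} - e^{-C_+ t}}{C_+} + \frac{2^5}{(2+t)^5}\,\frac{1 - e^{-C_+ t/2}}{C_+} \\
&\le \frac{1}{C_+}\left[e^{-C_+ t/2} + \frac{2^5}{(2+t)^5}\right] \le \frac{C}{(1+t)^5}.
\end{aligned} \tag{4.28}$$
To conclude, there exists a constant $\bar C$ such that:
$$V_\infty - V_W(t) \le e^{-C_+ t}\gamma + \bar C\,(\gamma + A_+\gamma^3)^3\,\frac{1}{(1+t)^5}. \tag{4.29}$$
Therefore to obtain iii) it is sufficient that
$$\bar C\,(\gamma + A_+\gamma^3)^3 < A_+\gamma^3. \tag{4.30}$$
Inequality (4.30) is satisfied, for instance, by choosing $A_+ = 2\bar C$ (this fixes A+) and γ consequently small.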
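Two quantitative points of this argument lend themselves to a quick numerical check: the intermediate bound in (4.28) on the memory integral, and the smallness condition on γ implied by (4.30) once A+ = 2C̄. The Python sketch below is an illustration only. The rate c stands for C+, the values of c and C̄ are arbitrary samples, the integral is approximated by a midpoint rule, and the explicit threshold for γ is obtained here by rewriting (4.30) as (1 + A+γ²)³ < 2; none of these names appear in the text.

```python
import math


def memory_integral(t, c, n=20000):
    """Midpoint rule for the integral over [0, t] of exp(-c (t - s)) / (1 + s)^5 ds."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += math.exp(-c * (t - s)) / (1.0 + s) ** 5
    return total * h


def intermediate_bound(t, c):
    """Middle expression of (4.28): (exp(-c t / 2) + 2^5 / (2 + t)^5) / c."""
    return (math.exp(-0.5 * c * t) + 32.0 / (2.0 + t) ** 5) / c


def gamma_threshold(C_bar):
    """Largest gamma for which (4.30) holds when A_+ = 2 * C_bar.

    With A_+ = 2 * C_bar, (4.30) is equivalent to (1 + A_+ gamma^2)^3 < 2.
    """
    A_plus = 2.0 * C_bar
    return math.sqrt((2.0 ** (1.0 / 3.0) - 1.0) / A_plus)


def condition_4_30(gamma, C_bar):
    """Direct evaluation of inequality (4.30) with A_+ = 2 * C_bar."""
    A_plus = 2.0 * C_bar
    return C_bar * (gamma + A_plus * gamma ** 3) ** 3 < A_plus * gamma ** 3


if __name__ == "__main__":
    for t in (1.0, 5.0, 20.0):
        print(t, memory_integral(t, 1.0), intermediate_bound(t, 1.0))
    print(gamma_threshold(3.0))
```

The power-law factor 1/(1+s)^5 in the integrand is what survives at large times: the exponential kernel only averages it over a window of width of order 1/C+.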
We note that inequality (4.19) is strict even assuming (3.2) (which is not strict). This improvement in passing from W to $V_W$ will be used later on.

By Proposition 4.1 we can prove Theorem 2.1. We construct a sequence $\{V_n\}_{n=1}^\infty$ defined by
$$V_n = V_{V_{n-1}}, \qquad n \ge 2, \tag{4.31}$$
setting V1 = W, W being any function with properties (3.1), (3.2) and (3.3). By Proposition 4.1 such properties hold for the whole sequence (for suitable values of A+ and t0 independent of n). By compactness (the sequence is equibounded and equicontinuous), we can extract a subsequence converging to a limit point V = V(t). Let f(t), t ≥ 0, be the solution to Eq. (2.1) with boundary conditions (2.3) given by V(t); then the couple (f, V) solves problem (2.5), (2.6) for t ≥ 0. We will prove this by showing that the characteristics solving (2.1) with boundary conditions given by Vn(t) converge to characteristics solving (2.1) with boundary conditions given by V(t). In order to avoid too heavy notation we consider the one-dimensional case, the general one being an immediate transposition of it. For t > 0 and v given, consider the equation in τ < t:
$$X(t) - X(\tau) = v\,(t - \tau), \tag{4.32}$$
where $\dot X = V$. This is a right recollision condition of a fluid particle with the disk in the limit dynamics and from Lemma 3.1 (which holds for the limit velocity V(t) as well), we know that a necessary and sufficient condition for a solution to (4.32) to exist is:
$$\langle V\rangle_t < v < V(t), \tag{4.33}$$
since $v = \langle V\rangle_{\tau,t}$ is a continuous function of τ, such that $\langle V\rangle_{0,t} = \langle V\rangle_t$ and $\langle V\rangle_{t,t} = V(t)$. Let τ∗ be the maximal time for which (4.32) is verified. It is characterized by the condition
$$X(s) < X(t) - v\,(t - s), \qquad s \in (\tau_*, t). \tag{4.34}$$
Parallel to (4.32) and (4.34) we consider the equations
$$X_n(t) - X_n(\tau) = v\,(t - \tau), \tag{4.35}$$
$$X_n(s) < X_n(t) - v\,(t - s), \qquad s \in (\tau, t). \tag{4.36}$$
Since Vn is converging to V, choosing n large enough we have
$$\langle V_n\rangle_t < v < V_n(t), \tag{4.37}$$
so that a maximal solution does exist also in this case and we denote it by τn. By compactness $\tau_n \to \bar\tau$ (extracting a subsequence if necessary). We want to show that $\bar\tau = \tau_*$. In fact, by (4.35) and (4.36) we get in the limit n → ∞,
$$X(t) - X(\bar\tau) = v\,(t - \bar\tau), \tag{4.38}$$
$$X(s) \le X(t) - v\,(t - s), \qquad s \in (\bar\tau, t). \tag{4.39}$$
We exclude equality in Eq. (4.39) because it would correspond to a tangential collision, which is not considered because it is negligible. Therefore, if $\bar\tau \ne \tau_*$, $\bar\tau$ would be another maximal solution, in contrast with the uniqueness of τ∗. Thus, with regard to the first backward recollision from the right, we have proven that τn → τ∗. Whenever the trajectory of the fluid particle (x(s), v(s)), s ∈ (0, t), delivers k collisions at times $\tau^1, \dots, \tau^k$ in the limiting dynamics induced by V(t), the characteristics (xn(s), vn(s)) induced by Vn(t) perform the same number of collisions, for n sufficiently large, and the collision times $\tau_n^1, \dots, \tau_n^k$ converge to $\tau^1, \dots, \tau^k$ (up to extraction of a subsequence when necessary). This can be easily proven by iterating the above arguments. We are not considering infinitely many collisions, by Proposition A.1. The recollisions due to fluid particles coming from the left can be treated in the same way. We remark that, in two or three dimensions, we also have to exclude the null measure set of initial conditions (x, v) for which x(tk) ∈ ∂D(tk). Finally the convergence of the characteristics shows that $r^\pm_{V_n} \to r^\pm_V$, so that (f, V) is a solution to our friction problem.

To conclude the proof of Theorem 2.1, let us consider any solution (f, V) to problem (2.5), (2.6). By the continuity of V, there exists a time interval for which
$$V_\infty - V(t) < e^{-C_+ t}\gamma + \frac{A_+}{(1+t)^5}\gamma^3, \tag{4.40}$$
because it is obviously verified at time zero. Let T be the first time for which (4.40) is violated. The same arguments used to prove Proposition 4.1 (i) (replacing W by V) apply here to show that
$$V_\infty - V(t) \ge e^{-C_- t}\gamma \tag{4.41}$$
and, for t ∈ [0, min(t0, T)):
$$\frac{d}{dt}\left(V_\infty - V(t)\right) \le 0. \tag{4.42}$$
Proceeding as in the proof of Proposition 4.1 (iii), since V enjoys the same properties as W for t ∈ [0, T), we infer that (4.40) is still valid for t = T. Hence (4.40) holds globally in time. This concludes the proof of Theorem 2.1.

Proof of Theorem 2.2. Consider now a solution (f, V) to problem (2.1)–(2.6). The lower bound (2.20) will be obtained by considering the integration over the velocities v producing a single recollision in the past. This allows us to estimate explicitly v(0, t; x, v) in Eq. (4.2). To this end we introduce s0 > 0 defined as:
$$s_0 = \min\left\{s \in (0, t) : V(s) \ge \frac{V_0 + \langle V\rangle_{s,t}}{2}\right\}. \tag{4.43}$$
Such s0 does exist by continuity, since by Lemma 3.1 we have at time 0:
$$V(0) = V_0 < \frac{V_0 + \langle V\rangle_{0,t}}{2} = \frac{V_0 + \langle V\rangle_t}{2}, \tag{4.44}$$
while at time s = t,
$$V(t) > \frac{V_0 + \langle V\rangle_{t,t}}{2} = \frac{V_0 + V(t)}{2}. \tag{4.45}$$
The set $\{(x, v) \mid x \in D(t),\ \langle V\rangle_t \le v_x \le \langle V\rangle_{s_0,t}\}$ generates a subfamily of characteristics which had at most one recollision with the disk in the past. Indeed, consider a light particle which is to collide at x and let it go back up to the time s < t of the first recollision in the past. Then $v_x = \langle V\rangle_{s,t}$ for some s ≤ s0. Hence, denoting by $v_x(s^-)$ the x-component of the precollisional velocity, by (4.43) we have:
$$v_x(s^-) = -v_x + 2V(s) = -\langle V\rangle_{s,t} + 2V(s) \le V_0, \tag{4.46}$$
so that
$$v_x(0, t; x, v) = 2V(s) - v_x. \tag{4.47}$$
We now prove that s0 is bounded from above and from below, independently of γ, in the following way:
$$\frac{1}{C_-}\log\frac{3}{2} \le s_0 \le \frac{\log 4}{C_+}, \tag{4.48}$$
provided that t is sufficiently large independently of γ. Indeed s0 is the minimal solution of the equation:
$$V(s_0) = \frac{V_0 + \langle V\rangle_{s_0,t}}{2}, \tag{4.49}$$
which gives
$$V_\infty - V(s_0) = \frac{\gamma}{2} + \frac{V_\infty - \langle V\rangle_{s_0,t}}{2}. \tag{4.50}$$
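By the use of property (2.18) for V we obtain:
$$e^{-C_+ s_0}\gamma \ge \frac{\gamma}{2} + \frac{V_\infty - \langle V\rangle_{s_0,t}}{2} - \frac{A_+}{(1+s_0)^5}\gamma^3 \ge \frac{\gamma}{2} - \frac{A_+}{(1+s_0)^5}\gamma^3. \tag{4.51}$$
For γ small, $A_+\gamma^2 \le \frac14$, so that
$$e^{-C_+ s_0} \ge \frac14 \tag{4.52}$$
and we have proved the right bound in (4.48). To prove the left bound, again by (2.18) we have:
$$\begin{aligned}
V_\infty - \langle V\rangle_{s_0,t} &\le \frac{1}{t - s_0}\int_{s_0}^t d\tau\left(\gamma e^{-C_+\tau} + \frac{A_+\gamma^3}{(1+\tau)^5}\right) \\
&\le \frac{1}{t - s_0}\left[\gamma\,\frac{e^{-C_+ s_0} - e^{-C_+ t}}{C_+} + \gamma^3\frac{A_+}{4}\right].
\end{aligned} \tag{4.53}$$
Hence, for t sufficiently large independent of γ:
$$V_\infty - \langle V\rangle_{s_0,t} \le \frac{2\gamma}{t}\left[\frac{1}{C_+} + \frac{\gamma^2 A_+}{4}\right] \le \frac{\gamma}{3}. \tag{4.54}$$
Therefore (4.50) and (2.19) yield:
$$e^{-C_- s_0}\gamma \le \frac23\gamma, \tag{4.55}$$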
proving also the left bound in (4.48). Now we set
$$I(t) = \int_{D(t)} dx \int dv_\perp \int_{\langle V\rangle_t}^{\langle V\rangle_{s_0,t}} dv_x\,(v_x - V(t))^2\left[e^{-\beta v^2(0,t;x,v)} - e^{-\beta v^2}\right]. \tag{4.56}$$
Note that, on the basis of the same arguments leading to inequality (3.6), we can show that $V(t) > \langle V\rangle_{s_0,t}$. Hence
$$r^+(t) \ge C\,I(t). \tag{4.57}$$
For s ≤ s0 by (4.43) we get:
$$v_x^2 - v_x^2(0, t; x, v) = v_x^2 - (2V(s) - v_x)^2 = 4V(s)\left(\langle V\rangle_{s,t} - V(s)\right) \ge 2V(s)\left(\langle V\rangle_{s,t} - V_0\right). \tag{4.58}$$
The quantity in the right-hand side of (4.58) can be estimated from below in the following way:
$$2V(s)\left(\langle V\rangle_{s,t} - V_0\right) \ge V_0\,\gamma \tag{4.59}$$
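for t sufficiently large independent of γ. Indeed, by Lemma 3.1 (iii) and Eq. (2.18):
$$\begin{aligned}
\inf_{s \le t}\left(\langle V\rangle_{s,t} - V_0\right) &\ge \langle V\rangle_t - V_0 = \frac{1}{t}\int_0^t ds\,V(s) - V_0 \\
&\ge \frac{1}{t}\int_0^t ds\left[\gamma\,(1 - e^{-C_+ s}) - \frac{A_+}{(1+s)^5}\gamma^3\right] \\
&\ge \gamma\left(1 - \frac{1 - e^{-C_+ t}}{C_+ t}\right) - \frac{A_+}{4t}\gamma^3 \ge \frac{\gamma}{2}.
\end{aligned} \tag{4.60}$$
By these considerations, we have:
$$\begin{aligned}
I(t) &\ge \beta\int_{D(t)} dx \int dv_\perp \int_{\langle V\rangle_t}^{\langle V\rangle_{s_0,t}} dv_x\,(v_x - V(t))^2 e^{-\beta v^2}\left[v_x^2 - v_x^2(0, t; x, v)\right] \\
&\ge C\gamma\int_{\langle V\rangle_t}^{\langle V\rangle_{s_0,t}} dv_x\,(v_x - V(t))^2 e^{-\beta v_x^2}\int_{|v_\perp| < C/t} dv_\perp\, e^{-\beta v_\perp^2} \\
&\ge \frac{C\gamma}{t^2}\left[(V(t) - \langle V\rangle_t)^3 - (V(t) - \langle V\rangle_{s_0,t})^3\right] \\
&= \frac{C\gamma}{t^2}\left[(V(t) - \langle V\rangle_{s,t})^2\left(\langle V\rangle_{s_0,t} - \langle V\rangle_t\right)\right],
\end{aligned} \tag{4.61}$$
for some s ∈ (0, s0). We now estimate both differences appearing in (4.61), showing that they are both $O(\frac{1}{t})$. Using Eq. (3.11) we have:
$$\langle V\rangle_{s_0,t} - \langle V\rangle_t = \frac{s_0}{t - s_0}\left[\langle V\rangle_t - \langle V\rangle_{s_0}\right]. \tag{4.62}$$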
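By estimate (4.48) we know that, for γ small, t0 is much larger than s0 so that, by monotonicity, we have for any τ ∈ (0, s0):
$$V(\tau) < V(s_0) < \frac{V_0 + V_\infty}{2}, \tag{4.63}$$
after using Eq. (4.49). This implies that
$$\langle V\rangle_t - \langle V\rangle_{s_0} = \left(V_\infty - \langle V\rangle_{s_0}\right) - \left(V_\infty - \langle V\rangle_t\right) \ge \frac{\gamma}{2} - \frac{1}{t}\int_0^t (V_\infty - V(\tau))\,d\tau. \tag{4.64}$$
By (2.18):
$$\int_0^t (V_\infty - V(\tau))\,d\tau \le \gamma\int_0^t d\tau\, e^{-C_+\tau} + \gamma^3 A_+\int_0^\infty d\tau\,\frac{1}{(1+\tau)^5} \le \frac{\gamma}{C_+} + \frac{\gamma^3 A_+}{4}. \tag{4.65}$$
Hence we obtain:
$$\langle V\rangle_t - \langle V\rangle_{s_0} \ge \frac{\gamma}{2} - \frac{1}{t}\left[\frac{\gamma}{C_+} + \frac{\gamma^3 A_+}{4}\right] \ge \frac{\gamma}{4}, \tag{4.66}$$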
for t large independently of γ. Thus by (4.62) and (4.66) we arrive at:
$$\langle V\rangle_{s_0,t} - \langle V\rangle_t \ge C\,\frac{\gamma}{t}. \tag{4.67}$$
Let us now estimate the remaining term in (4.61):
$$V(t) - \langle V\rangle_{s,t} = \left(V_\infty - \langle V\rangle_{s,t}\right) - \left(V_\infty - V(t)\right). \tag{4.68}$$
Again by properties (2.18) and (2.19) we obtain:
$$\begin{aligned}
V(t) - \langle V\rangle_{s,t} &\ge \frac{\gamma}{t - s}\int_s^t d\tau\, e^{-C_-\tau} - \gamma e^{-C_+ t} - \frac{\gamma^3 A_+}{(1+t)^5} \\
&\ge \frac{\gamma}{t - s_0}\int_{s_0}^t d\tau\, e^{-C_-\tau} - \gamma e^{-C_+ t} - \frac{\gamma^3 A_+}{(1+t)^5},
\end{aligned} \tag{4.69}$$
because s ≤ s0 and $e^{-C_-\tau}$ is decreasing in τ. Consequently, for t sufficiently large independently of γ:
$$V(t) - \langle V\rangle_{s,t} \ge \gamma\,\frac{C}{t}. \tag{4.70}$$
Inserting estimates (4.67) and (4.70) in (4.61), by (4.57) we conclude that, for t sufficiently large, independently of γ,
$$r^+(t) \ge C\,\frac{\gamma^4}{t^5}. \tag{4.71}$$
Actually Eq. (4.71) holds (a fortiori) for any t > t0, provided that γ is sufficiently small, since t0 diverges as γ vanishes. For t ≥ 2t0, by virtue of (4.71) and the Duhamel formula:
$$\begin{aligned}
V_\infty - V(t) &\ge e^{-C_- t}\gamma + \int_0^t ds\, e^{-C_-(t-s)}\, r^+(s) \\
&\ge e^{-C_- t}\gamma + C\int_{t_0}^t ds\, e^{-C_-(t-s)}\,\frac{\gamma^4}{s^5}.
\end{aligned} \tag{4.72}$$
Now we have:
$$\int_{t_0}^t ds\,\frac{e^{-C_-(t-s)}}{s^5} \ge \frac{1 - e^{-C_-(t - t_0)}}{C_- t^5} \ge \frac{1 - e^{-C_- t_0}}{C_- t^5} = \frac{1 - \left(\frac{\gamma}{C_+}\right)^{1/2}}{C_- t^5} \ge \frac{C}{t^5}, \tag{4.73}$$
by (3.1), since t ≥ 2t0. Hence:
$$V_\infty - V(t) \ge e^{-C_- t}\gamma + C\,\frac{\gamma^4}{t^5}. \tag{4.74}$$
The last inequality fixes A− and the proof is finally complete.
5. Comments

In this paper we proved some significant and somewhat surprising effects of recollisions in a suitable microscopic model of friction. Our techniques are perturbative and work only when the parameter γ = V∞ − V0 is small. We did not prove uniqueness of the solution to problem (2.1)–(2.6). Such a property should follow from a rather detailed analysis of the entire recollision sequence. Such a deeper analysis would also improve the upper bound (2.18) as regards the γ-dependence. On the other hand we were able to outline the asymptotic behavior of the solution taking into explicit account one recollision only.

We emphasize that a small change in the model can cause a drastic change of the time asymptotics. For instance, assuming a lower bound on the vertical component of the gas particle velocity, namely |v⊥| > ε > 0, two consecutive collisions may happen in a time interval of length at most 2R/ε. This means that the memory effects are bounded in time and it can be proven that this implies an exponential decay.

In the present paper we have considered the case 0 < V0 ≤ V∞. Of course other cases can be studied, for instance V0 ≥ V∞ or 0 = E = V∞. Another physically interesting case is when the external field depends on the position of the disk. Unfortunately it seems hard to find a unified approach to all these cases: we analyzed the easiest one.

Let us briefly discuss the case V0 ≥ V∞ > 0 which, apparently, is symmetric to ours. Our techniques give the paradoxical result that the difference V(t) − V∞ becomes negative before vanishing as t → ∞. Indeed, let us suppose (by contradiction) that V(t) > V∞ for all times. Then V(t) − V∞ is decreasing in time (in particular r+ = 0). By the Duhamel formula:
$$\begin{aligned}
V(t) - V_\infty &= (V_0 - V_\infty)\,e^{-\int_0^t K(s)\,ds} - \int_0^t ds\, e^{-\int_s^t K(\tau)\,d\tau}\, r^-(s) \\
&\le (V_0 - V_\infty)\,e^{-F_0'(V_\infty)\,t} - \int_0^t ds\, e^{-F_0'(V_0)(t-s)}\, r^-(s).
\end{aligned} \tag{5.1}$$
Analogously to what we have proven in Thm. 2.2 we can find that, for small (V0 − V∞) and large t:
$$r^-(t) > C\,\frac{(V_0 - V_\infty)^4}{t^{d+2}}. \tag{5.2}$$
Therefore, for large t we find a contradiction because V (t) − V∞ becomes negative. Moreover the positivity of r ± (s) prevents V (t) − V∞ from becoming positive later on. Since there is a change of sign in the difference V (t) − V∞ , a detailed analysis of the asymptotics is delicate. Even more involved is the case in which E = 0. Again after some time V (t) becomes negative and the quantities in the square brackets in (2.16) and (2.17) are no more positive, while in our paper the positivity of r ± played an important role. More generally, the cases in which there is a change of sign of the velocity of the disk, for instance when V0 < 0 or when E is not constant, are beyond a straightforward application of the techniques used in the present paper. We did not make explicit the dependence on β of the constants, even if it is reasonable to believe that the long tail memory becomes irrelevant as β → ∞. Indeed, in the limiting case, all the gas particles are initially at rest and the recollisions are absent when |V∞ − V0 | is small. This is because the first collision yields an outgoing velocity larger than 2V0 > V∞ , so that the gas particles cannot be hit anymore. On the other hand, the
probability of finding a post-collisional velocity between V0 and V∞ is vanishing as β → 0. We incidentally observe that for E = 0 and 1/β = 0, the asymptotic behavior is not exponential, even neglecting recollisions. In fact in this case F0(V) = CV|V| and hence
$$V(t) = \frac{V_0}{1 + C V_0 t}. \tag{5.3}$$
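Formula (5.3) is the solution of dV/dt = −C V|V| with V(0) = V0 > 0, and it can be confirmed by a direct numerical integration. The Python sketch below does this with a classical Runge-Kutta step; the sample values V0 = 1, C = 2 and the function names are arbitrary choices made for this illustration.

```python
def exact_velocity(t, V0, C):
    """Closed-form solution (5.3): V(t) = V0 / (1 + C * V0 * t), for V0 > 0."""
    return V0 / (1.0 + C * V0 * t)


def integrate_quadratic_friction(V0, C, t_end, dt=1e-3):
    """Classical RK4 for dV/dt = -C * V * |V|, V(0) = V0."""

    def rhs(V):
        return -C * V * abs(V)

    V = V0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(V)
        k2 = rhs(V + 0.5 * dt * k1)
        k3 = rhs(V + 0.5 * dt * k2)
        k4 = rhs(V + dt * k3)
        V += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return V


if __name__ == "__main__":
    numeric = integrate_quadratic_friction(1.0, 2.0, 5.0)
    print(numeric, exact_velocity(5.0, 1.0, 2.0))
```

The decay is algebraic: t V(t) tends to 1/C as t grows, independently of V0.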
We finally remark that in the present paper we have essentially studied the asymptotic behavior of the motion of the solid body. It would also be interesting to understand the behavior of the Vlasov fluid. In particular one may ask whether the velocity distribution at a given point (say the origin) converges to the Maxwellian when t → ∞. This is not true in one dimension. Indeed a light particle with velocity v < −V∞, at a large time, has surely collided with the disk in the past, while in higher dimension the transversal velocity makes this event exceptional.

Appendix

We give a heuristic derivation of our model in the one-dimensional case. The case of a d-dimensional disk follows with minor modifications. Denoting by V and M the velocity and mass of the heavy particle and by v and m the velocity and mass of a gas particle, the law of elastic collision says that:
$$V' = V + \frac{2m}{M+m}\,(v - V); \qquad v' = V - \frac{M-m}{M+m}\,(v - V), \tag{A.1}$$
where V′ and v′ are the outgoing velocities. As usual in the mean field limit, we assume the mass of any light particle to be $m = \frac{1}{N} \ll M$, N being the total number of the gas particles, so that, by (A.1), we have:
$$V' \approx V + \frac{2}{NM}\,(v - V); \qquad v' \approx 2V - v. \tag{A.2}$$
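As a sanity check on (A.1) and (A.2), the following Python sketch verifies that the collision rule conserves momentum and kinetic energy, and that the mean field expressions are recovered when m = 1/N with N large. The function names and the numerical values are illustrative assumptions, not part of the derivation.

```python
def elastic_collision(V, v, M, m):
    """Outgoing velocities (V', v') of the heavy and light particle, as in (A.1)."""
    V_out = V + 2.0 * m / (M + m) * (v - V)
    v_out = V - (M - m) / (M + m) * (v - V)
    return V_out, v_out


def mean_field_collision(V, v, M, N):
    """Approximate outgoing velocities for m = 1/N << M, as in (A.2)."""
    return V + 2.0 / (N * M) * (v - V), 2.0 * V - v


def momentum(V, v, M, m):
    return M * V + m * v


def kinetic_energy(V, v, M, m):
    return 0.5 * M * V * V + 0.5 * m * v * v


if __name__ == "__main__":
    V, v, M, N = 1.0, -0.5, 2.0, 10 ** 6
    print(elastic_collision(V, v, M, 1.0 / N))
    print(mean_field_collision(V, v, M, N))
```

The heavy particle receives a kick of order 1/N per collision, which is why the sum over colliding particles in the next step of the derivation has a finite mean field limit.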
We now evaluate the variation ΔV of the velocity V of the heavy particle in the time interval [t, t + Δt]:
$$\Delta V = E\,\Delta t - \frac{1}{N}\sum_{j \in I^+(\Delta t)} \frac{2}{M}\,|v_j - V| + \frac{1}{N}\sum_{j \in I^-(\Delta t)} \frac{2}{M}\,|v_j - V| + h, \tag{A.3}$$
where h denotes a term o(Δt) and $I^\pm(\Delta t)$ denote the indices of the light particles which are colliding from the right ($v_j < V$) and from the left ($v_j \ge V$) respectively. We finally apply our mean-field hypothesis by setting:
$$\frac{1}{N}\sum_{j \in I^\pm(\Delta t)} \frac{2}{M}\,|v_j - V| = \frac{2}{M}\,\Delta t\int dv\,|v - V|^2 f^\pm(X, v, t). \tag{A.4}$$
Taking the limit Δt → 0, we obtain Eqs. (2.1)–(2.6). We also set M = 1, M being an irrelevant constant.
Proposition A.1. Consider the dynamics of the disk with given velocity W = W(t) and the fluid trajectories x(s, t; x, v), v(s, t; x, v) computed according to the evolution of the disk and the law of elastic reflection (2.2). Assume W differentiable for almost all t and such that
$$\operatorname*{ess\,sup}_{t \in \mathbb{R}^+}\left(|W(t)| + |\dot W(t)|\right) = L < +\infty. \tag{A.5}$$
Then the set of all t ∈ R+, x ∈ D(t), v ∈ R^d for which x(s, t; x, v), v(s, t; x, v), 0 ≤ s < t, delivers infinitely many backward collisions, or has a tangential collision, has vanishing Lebesgue measure.

Proof. We give the proof for the one-dimensional case, for notational simplicity. For a given T > 0, we shall prove that the set of (x, v) ∈ R² for which x(s, T; x, v), v(s, T; x, v), s < T, yields infinitely many collisions or has a tangential collision, has null measure. Then Proposition A.1 follows easily. We consider a partition I1, . . . , IN of the time interval [0, T) into intervals of the same length δ. Obviously N = T/δ. We denote by tk the middle point of Ik. We shall not consider the case in which tk is a collision time because it is a zero-measure event in (x, v). Consider the set
$$A_k^\delta = \{(x, v) \mid x(s_1, T; x, v) = X(s_1),\ x(s_2, T; x, v) = X(s_2)\ \text{for some } s_1, s_2 \in I_k\}. \tag{A.6}$$
Denote also by $R_T$ the set of all configurations at time T leading (backward) to infinitely many collisions. Then, to have an accumulation point of the collision times, we necessarily have two consecutive collisions falling in the same time interval Ik for some k. Hence
$$R_T \subset \bigcup_k A_k^\delta \tag{A.7}$$
for all δ > 0. We finally set, for s < T:
$$D^\delta(s) = \{(x, v) \mid |x(s, T; x, v) - X(s)| < 2L\delta,\ |v(s, T; x, v) - W(s)| < 2L\delta\}. \tag{A.8}$$
We shall prove that
$$A_k^\delta \subset D^\delta(t_k). \tag{A.9}$$
By Eq. (A.9) we easily conclude the proof. Indeed, by the invariance of the Lebesgue measure under the time evolution of the fluid particle flow, the map (x, v) → (x(s, T; x, v), v(s, T; x, v)) has unit Jacobian and hence
$$|D^\delta(t_k)| \le C\delta^2, \tag{A.10}$$
where |A| denotes the Lebesgue measure of the set A. Therefore
$$|R_T| \le \sum_{k=1}^N |A_k^\delta| \le \sum_{k=1}^N |D^\delta(t_k)| \le CN\delta^2 = C\delta. \tag{A.11}$$
By the arbitrariness of δ we conclude that $|R_T| = 0$. Moreover the set $Z_T$ of all (x, v) leading to a tangential collision, i.e.
$$x(s, T; x, v) = X(s); \qquad v(s, T; x, v) = W(s)$$
for some s ∈ Ik, k = 1, . . . , N, trivially satisfies:
$$Z_T \subset \bigcup_k D^\delta(t_k)$$
and hence has vanishing measure. To prove (A.9) let $(x, v) \in A_k^\delta$ and s1 and s2, with s1 < s2, be two consecutive collision instants in Ik. Then, if tk ∈ (s1, s2),
$$v(t_k)\,(s_2 - s_1) = \int_{s_1}^{s_2} W(s)\,ds, \tag{A.12}$$
and hence
$$v(t_k) = W(\bar s), \tag{A.13}$$
for some $\bar s \in (s_1, s_2)$. Here and in the sequel we shall use the shorthand notation v(s) = v(s, T; x, v), x(s) = x(s, T; x, v). Therefore
$$|v(t_k) - W(t_k)| = \left|\int_{\bar s}^{t_k} \dot W(s)\,ds\right| \le L\delta. \tag{A.14}$$
On the other hand if $t_k \in I_k \setminus (s_1, s_2)$, say for instance tk < s1 < s2, with no collision in (tk, s1), then we have simultaneously:
$$v(s_1^+) = W(\bar s) \tag{A.15}$$
for some $\bar s \in (s_1, s_2)$ and
$$v(s_1^+) = 2W(s_1) - v(t_k). \tag{A.16}$$
Hence
$$|v(t_k) - W(t_k)| = |2W(s_1) - W(\bar s) - W(t_k)| \le 2L\delta. \tag{A.17}$$
Finally, let s1 be the closest collision time to tk, say for instance, s1 < tk. Then
$$|X(t_k) - x(t_k)| \le \int_{s_1}^{t_k} ds\,|v(s_1^+) - W(s)| \le 2L\delta. \tag{A.18}$$
In fact if (x, v) develops at least two collisions in Ik, then $\sup_{s \in I_k} |v(s)| < 2L$.
We finally remark that it is possible to prove that infinitely many collisions in a finite time interval cannot occur, for all initial data (x, v), just for geometrical reasons, provided that W has bounded second derivative. However such an extra regularity property does not follow easily from our arguments.

Acknowledgements. Work performed under the auspices of the Italian Ministry of the University (MIUR) and the GNFM-INDAM. We thank E. Caglioti for stimulating conversations, G. Cavallaro for having pointed out an error in a previous version of the present work, and the referees for their careful reading of a previous version of the paper and their useful criticism.
References
1. Balakrishnan, V., Bena, I., Van den Broeck, C.: Velocity correlations, diffusion and stochasticity in a one-dimensional system. Phys. Rev. E 65, 031102(1,9) (2002)
2. Bruneau, L., De Bièvre, S.: A Hamiltonian model for linear friction in a homogeneous medium. Commun. Math. Phys. 229, 511–542 (2002)
3. Buttà, P., Caglioti, E., Marchioro, C.: On the long time behavior of infinitely extended systems of particles interacting via Kac Potentials. J. Stat. Phys. 108, 317–339 (2002)
4. Buttà, P., Caglioti, E., Marchioro, C.: On the motion of a charged particle interacting with an infinitely extended system. Commun. Math. Phys. 233, 545–569 (2003)
5. Buttà, P., Caglioti, E., Marchioro, C.: On the Violation of Ohm’s Law for Bounded Interaction: a One Dimensional System. Commun. Math. Phys. 249, 353–382 (2004)
6. Buttà, P., Manzo, F., Marchioro, C.: A simple Hamiltonian model of run-away particle with singular interaction. Math. Mod. Meth. Appl. Sci. 15, 753–766 (2005)
7. Braun, W., Hepp, K.: The Vlasov dynamics and its fluctuations in the 1/N limit of interacting classical particles. Commun. Math. Phys. 56, 101–113 (1977)
8. Dobrushin, R.L.: Vlasov equations. Soviet J. Funct. Anal. 13, 115–123 (1979)
9. Gruber, Ch., Piasecki, J.: Stationary motion of the adiabatic piston. Physica A 268, 412–423 (1999)
10. Landau, L.D., Lifshitz, E.M.: Physical Kinetics: Course in Theoretical Physics, 10. London: Pergamon Press, 1981
11. Lebowitz, J.L., Piasecki, J., Sinai, Ya.: Scaling dynamics of a massive piston in an ideal gas. In: Hard Ball Systems and the Lorentz Gas. Encycl. Math. Sci. 101, Berlin: Springer, 2000, pp. 217–227
12. Neunzert, H.: An introduction to the nonlinear Boltzmann-Vlasov equation. In: Kinetic theories and the Boltzmann equation (Montecatini, 1981), Lecture Notes in Math. 1048, Berlin: Springer, 1984, pp. 60–110
13. Spohn, H.: On the Vlasov Hierarchy. Math. Mod. Meth. Appl. Sci. 3, 445–455 (1981)

Communicated by H. Spohn
Commun. Math. Phys. 264, 191–225 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1519-6
Communications in
Mathematical Physics
Periodic Integrable Systems with Delta-Potentials
E. Emsiz, E.M. Opdam, J.V. Stokman
KdV Institute for Mathematics, Universiteit van Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands. E-mail: [email protected]; [email protected]; [email protected]
Received: 25 March 2005 / Accepted: 15 September 2005
Published online: 1 March 2006 – © Springer-Verlag 2006
Abstract: In this paper we study root system generalizations of the quantum Bose-gas on the circle with pair-wise delta-function interactions. The underlying symmetry structures are shown to be governed by the associated graded algebra of Cherednik’s (suitably filtered) degenerate double affine Hecke algebra, acting by Dunkl-type differential-reflection operators. We use Gutkin’s generalization of the equivalence between the impenetrable Bose-gas and the free Fermi-gas to derive the Bethe ansatz equations and the Bethe ansatz eigenfunctions.
1. Introduction
Given any affine root system $\Sigma$, Gutkin and Sutherland [10, 31] defined a quantum integrable system whose Hamiltonian $-\Delta + V$ has a potential $V$ expressible as a weighted sum of delta-functions at the affine root hyperplanes of $\Sigma$. For the affine root system of type A, the quantum integrable system essentially reduces to the quantum Bose-gas on the circle with pair-wise delta-function interactions, which has been the subject of intensive studies over the past 40 years. The special case of the impenetrable Bose-gas on the circle was exactly solved by relating the model to the free Fermi-gas on the circle (see Girardeau [9]). Soon afterwards fundamental progress was made for arbitrary pair-wise delta-function interactions by Lieb & Liniger [22], Yang [32] and Yang & Yang [33], leading to the derivation of the associated Bethe ansatz equations and Bethe ansatz eigenfunctions. Yang & Yang [33] showed that the solutions of the Bethe ansatz equations are controlled by a strictly convex master function. One of the aims of the present paper is to generalize these results to Gutkin’s and Sutherland’s quantum integrable systems associated to affine root systems.
Quantum Calogero-Moser systems are root system generalizations of quantum Bose-gases on the line or circle with long range pair-wise interactions. In special cases quantum Calogero-Moser systems naturally arose from harmonic analysis on symmetric spaces. A decisive role in the studies of quantum Calogero-Moser systems has been played by certain non-bosonic analogs of these systems, which are defined in terms of Dunkl-type commuting differential-reflection operators. Suitable degenerations of affine Hecke algebras naturally appear here as the fundamental objects governing the algebraic relations between the Dunkl-type operators and the natural Weyl group action.
In this paper we define Dunkl-type commuting differential-reflection operators associated to the root system generalizations of the quantum Bose-gas with delta-function interactions. We furthermore show that the Dunkl-type operators, together with the natural affine Weyl group action, realize a faithful representation of the associated graded algebra of Cherednik’s [3] (suitably filtered) degenerate double affine Hecke algebra. These results show that these quantum integrable systems naturally fit into the class of quantum Calogero-Moser integrable systems, a point of view which also has been advertised from the perspective of harmonic analysis in [15, Sect. 5].
The quantum integrable systems under consideration for affine root systems of classical type still have reasonable physical interpretations in terms of interacting one-dimensional quantum bosons. In these cases various results of the present paper can be found in the vast physics literature on this subject. We will give the precise connections to the literature in the main body of the text.
The knowledge on the quantum Bose-gas with pair-wise delta-function interactions still far exceeds the knowledge on its root system generalizations. In fact, an important feature of the quantum Bose-gas with pair-wise delta-function interactions is its realization as the restriction to a fixed particle sector of the quantum integrable field theory in 1 + 1 dimensions governed by the quantum nonlinear Schrödinger equation. This point of view has led to the study of this model by quantum inverse scattering methods.
With these methods a proof of full orthogonality of the Bethe eigenfunctions on a period box (with respect to Lebesgue measure) is derived in [5] and the quadratic norms of the Bethe eigenfunctions are evaluated in terms of the determinant of the Hessian of the master function (conjectured by Gaudin [8, Sect. 4.3.3] and proved by Korepin [20]). At this point we can only speculate on the generalizations of these results to arbitrary root systems. The quantum inverse scattering techniques are only in reach for classical root systems, in which case we have quantum field theories with (non)periodic integrable boundary conditions at our disposal, see [29]. In general it seems reasonable to expect that the Bethe eigenfunctions are orthogonal on a fundamental domain for the reflection representation of the affine Weyl group (with respect to Lebesgue measure), and that their quadratic norms are expressible in terms of the determinant of the Hessian of the master function at the associated spectral point.
The contents of the paper are as follows. Sections 2 and 3 are meant to introduce the quantum integrable systems and to state and clarify the results on the associated spectral problem. We first introduce in Sect. 2 the relevant notations on affine root systems. Following Gutkin [11] we formulate the spectral problem for the quantum integrable systems under consideration as an explicit boundary value problem. We state the main results on the boundary value problem (Bethe ansatz equations and Bethe ansatz eigenfunctions) and we introduce the associated master function. In Sect. 3 we formulate the analog of Girardeau’s equivalence between the impenetrable Bose-gas and the free Fermi-gas on the circle for the quantum integrable systems under consideration. In Sect. 4 we introduce Dunkl-type commuting differential-reflection operators and show that they realize, together with the natural affine Weyl group action, a faithful realization of the associated graded algebra H of Cherednik’s [3] (suitably filtered) degenerate double affine Hecke algebra. In Sect. 5 we show that Gutkin’s [11] integral-reflection operators, together with the ordinary directional derivatives, yield an equivalent realization of H. The equivalence is realized by Gutkin’s [11] propagation operator. We furthermore show that the Dunkl operators naturally act on a space of functions with higher order normal derivative jumps over the affine root hyperplanes. In Sect. 6 we return to the boundary value problem of Sect. 2. Using the Hecke-type algebra H we refine and clarify Gutkin’s [11] generalization of Girardeau’s equivalence between the boundary value problem for the impenetrable Bose-gas and the boundary value problem for the free Fermi-gas as formulated in Sect. 3. The results in this section entail that the boundary value problem is equivalent to a boundary value problem with trivial boundary value conditions, at the cost of having to deal with a non-standard affine Weyl group action. In Sect. 7 we study the reformulated boundary value problem, leading in Sect. 8 to the derivation of the Bethe ansatz equations. In Sect. 9 we study the master function and show how it leads to a natural parametrization of the solutions of the Bethe ansatz equations. In Sect. 10 the solutions of the Bethe ansatz equations are further analyzed. In Sect. 11 it is proved that the boundary value problem has solutions if and only if the associated spectral value is a regular solution of the Bethe ansatz equations. In the case of a root system of type A, this is known as the Pauli principle for the interacting bosons.
2. The Boundary Value Problem
In this section we recall Gutkin’s [11] reformulation of the spectral problem for periodic integrable systems with delta-potentials in terms of a concrete boundary value problem. We furthermore state the main results on the solutions of the boundary value problem and we detail the physical background.
In order to fix notations we start by recalling some well known facts on affine root systems, see e.g. [17] for a detailed exposition. Let $V$ be a Euclidean space of dimension $n$.
Let $\Sigma_0$ be a finite, irreducible crystallographic root system in the dual Euclidean space $V^*$. We denote $\langle\cdot,\cdot\rangle$ for the inner product on $V^*$ and $\|\cdot\|$ for the corresponding norm. The co-root of $\alpha\in\Sigma_0$ is the unique vector $\alpha^\vee\in V$ satisfying
\[ \xi(\alpha^\vee) = \frac{2\langle\xi,\alpha\rangle}{\|\alpha\|^2}, \qquad \forall\,\xi\in V^*. \]
We write $\Sigma_0^\vee=\{\alpha^\vee\}_{\alpha\in\Sigma_0}$ for the resulting co-root system in $V$. We fix a basis $I_0=\{a_1,\ldots,a_n\}$ for the root system $\Sigma_0$. Let $\Sigma_0=\Sigma_0^+\cup\Sigma_0^-$ be the corresponding decomposition in positive and negative roots. We denote $\rho\in V^*$ for the half sum of positive roots and $\varphi\in\Sigma_0^+$ for the highest root with respect to the basis $I_0$. The highest root $\varphi$ is a long root in $\Sigma_0$. We define the fundamental Weyl chamber in $V^*$ by
\[ V_+^* = \{\xi\in V^* \mid \xi(\alpha^\vee)>0 \quad \forall\,\alpha\in\Sigma_0^+\}. \tag{2.1} \]
Let $\widehat{V}$ be the vector space of affine linear functionals on $V$. Then $\widehat{V}\simeq V^*\oplus\mathbb{R}$ as vector spaces, where the second component is identified with the constant functions on $V$. The gradient map $D:\widehat{V}\to V^*$ is the projection onto $V^*$ along this decomposition. The subset $\Sigma=\Sigma_0+\mathbb{Z}\subset\widehat{V}$ is the affine root system associated to $\Sigma_0$. We extend the basis $I_0$ of $\Sigma_0$ to a basis $I=\{a_0=-\varphi+1,a_1,\ldots,a_n\}$ of the affine root system $\Sigma$. Observe that $D$ maps $\Sigma$ onto $\Sigma_0$. For a root $a\in\Sigma$,
\[ s_a(v) = v - a(v)\,Da^\vee, \qquad v\in V \]
defines the orthogonal reflection in the root hyperplane $V_a:=a^{-1}(0)$. The affine Weyl group $W$ associated to $\Sigma$ is the sub-group of the affine linear isomorphisms of $V$ generated by the orthogonal reflections $s_a$ ($a\in\Sigma$). The sub-group $W_0\subset W$ generated by the orthogonal reflections $s_\alpha$ ($\alpha\in\Sigma_0$) is the Weyl group associated to $\Sigma_0$. We denote $w_0$ for the longest Weyl group element in $W_0$. It is well known that $W$ (respectively $W_0$) is a Coxeter group with Coxeter generators the simple reflections $s_j=s_{a_j}$ for $j=0,\ldots,n$ (respectively $s_j$ for $j=1,\ldots,n$). A second important presentation of $W$ is given by
\[ W \simeq W_0\ltimes Q^\vee, \tag{2.2} \]
with $Q^\vee=\mathbb{Z}\Sigma_0^\vee\subset V$ the co-root lattice of $\Sigma_0$, acting by translations on $V$. The gradient map $D$ induces a surjective group homomorphism $D:W\to W_0$ by $D(s_a)=s_{Da}$ for $a\in\Sigma$. Alternatively, $Dw=v$ if $v\in W_0$ is the $W_0$-component of $w$ in the semi-direct product decomposition (2.2).
The space $\widehat{V}$ of affine linear functionals on $V$ is a $W$-module by $(wf)(v)=f(w^{-1}v)$ ($w\in W$, $f\in\widehat{V}$, $v\in V$). Observe that $V^*$ is $W_0$-stable, and
\[ s_\alpha(\xi) = \xi - \xi(\alpha^\vee)\,\alpha, \qquad \xi\in V^* \]
for roots $\alpha\in\Sigma_0$. Furthermore,
\[ s_\alpha(\Sigma_0)=\Sigma_0, \qquad s_a(\Sigma)=\Sigma \]
for $\alpha\in\Sigma_0$ and $a\in\Sigma$. The length of $w\in W$ is defined by $l(w)=\#\bigl(\Sigma^+\cap w^{-1}\Sigma^-\bigr)$, where $\Sigma^\pm$ denote the positive and negative affine roots with respect to the basis $I$. Alternatively, $l(w)$ is the minimal non-negative integer $r$ such that $w\in W$ can be written as a product of $r$ simple reflections. Such an expression $w=s_{j_1}s_{j_2}\cdots s_{j_{l(w)}}$ ($j_k\in\{0,\ldots,n\}$) is called reduced.
The weight lattice of $\Sigma_0$ is defined by $P=\{\lambda\in V^* \mid \lambda(\alpha^\vee)\in\mathbb{Z}\ \ \forall\,\alpha\in\Sigma_0\}$. Another convenient description is
\[ P = \{\lambda\in V^* \mid w\lambda(\varphi^\vee)\in\mathbb{Z} \quad \forall\,w\in W_0\}, \tag{2.3} \]
which follows from the fact that $Q^\vee$ is already spanned over $\mathbb{Z}$ by the short co-roots in $\Sigma_0^\vee$. We denote $P^+$ (respectively $P^{++}$) for the cone of dominant (respectively strictly dominant) weights with respect to the choice $\Sigma_0^+$ of positive roots in $\Sigma_0$. Recall that $P^{++}=\rho+P^+$.
We write $V_{irreg}=\bigcup_{a\in\Sigma^+}V_a$ for the irregular vectors in $V$ with respect to the affine root hyperplane arrangement $\{V_a \mid a\in\Sigma^+\}$. Its open, dense complement $V_{reg}:=V\setminus V_{irreg}$ is called the set of regular vectors in $V$. We denote $\mathcal{C}$ for the collection of connected components of $V_{reg}$. An element $C\in\mathcal{C}$ is called an alcove. The affine Weyl group $W$ acts simply transitively on $\mathcal{C}$. Explicitly, $V_{reg}=\bigsqcup_{w\in W}w(C_+)$ (disjoint union) with the fundamental alcove $C_+$ defined by $C_+=\{v\in V \mid a_j(v)>0\ (j=0,\ldots,n)\}$. We call a vector $v\in V_a$ ($a\in\Sigma^+$) sub-regular if it does not lie on any other root hyperplane $V_b$ ($a\neq b\in\Sigma^+$).
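As a sanity check on these definitions, the rank-one case can be worked out by machine. The sketch below is an illustration only and rests on a normalization chosen here, not taken from the text: $V=\mathbb{R}$, $\alpha(v)=2v$, hence $\alpha^\vee=1$, $Q^\vee=\mathbb{Z}$, $a_0(v)=1-2v$ and $C_+=(0,1/2)$. Under these assumptions it compares the root-counting definition of $l(w)$ with the minimal number of simple reflections needed to write $w$.

```python
from collections import deque

# Rank-one sketch (assumed normalization, not from the text):
# V = R, alpha(v) = 2v, alpha^vee = 1, so Q^vee = Z and C_+ = (0, 1/2).
# An affine root a(v) = 2*eps*v + m is stored as (eps, m), eps = +-1, m an integer.
# A group element w(v) = s*v + t is stored as (s, t), s = +-1, t an integer.

S1 = (-1, 0)       # s_1: v -> -v     (reflection in a_1 = alpha)
S0 = (-1, 1)       # s_0: v -> 1 - v  (reflection in a_0 = -alpha + 1)
IDENTITY = (1, 0)


def compose(w1, w2):
    """Return the composition w1 o w2 of two affine maps of the line."""
    s1, t1 = w1
    s2, t2 = w2
    return (s1 * s2, s1 * t2 + t1)


def is_positive(root):
    """A root is positive iff it takes positive values on C_+ = (0, 1/2)."""
    eps, m = root
    return m >= 0 if eps == 1 else m >= 1


def act_on_root(w, root):
    """(w a)(v) = a(w^{-1} v) for w = (s, t) and a = (eps, m)."""
    s, t = w
    eps, m = root
    return (eps * s, m - 2 * eps * s * t)


def length_by_roots(w, bound=40):
    """Count positive roots a with w a negative; only roots with m < bound
    are scanned, which is enough for the small translations used below."""
    count = 0
    for eps in (1, -1):
        for m in range(bound):
            a = (eps, m)
            if is_positive(a) and not is_positive(act_on_root(w, a)):
                count += 1
    return count


def word_lengths(max_len):
    """Breadth-first search: minimal word length in s_0, s_1 up to max_len."""
    dist = {IDENTITY: 0}
    queue = deque([IDENTITY])
    while queue:
        w = queue.popleft()
        if dist[w] == max_len:
            continue
        for s in (S0, S1):
            nxt = compose(w, s)
            if nxt not in dist:
                dist[nxt] = dist[w] + 1
                queue.append(nxt)
    return dist
```

In this model the translation $v\mapsto v+1$ equals $s_0s_1$ and has length 2, in line with the later remark that translations in $Q^\vee$ have even length.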
The symmetric algebra $S(V)$ is canonically a $W_0$-module algebra. Using the standard identification $S(V)\simeq P(V^*)$, where $P(V^*)$ is the algebra of real-valued polynomial functions on $V^*$, the $W_0$-module structure takes the form
\[ (wp)(\xi) = p(w^{-1}\xi), \qquad w\in W_0,\ \xi\in V^*. \]
We denote $S(V)^{W_0}$ and $P(V^*)^{W_0}$ for the subalgebra of $W_0$-invariants in $S(V)$ and $P(V^*)$, respectively. Let $\partial_v$ ($v\in V$) be the derivative in direction $v$,
\[ (\partial_v f)(u) = \frac{d}{dt}\Big|_{t=0} f(u+tv) \]
for $f$ continuously differentiable at $u\in V$. The assignment $v\mapsto\partial_v$ uniquely extends to an algebra isomorphism of $S(V)$ onto the algebra of constant coefficient differential operators on $V$ (say acting on $C^\infty(V)$). We denote $p(\partial)$ for the constant coefficient differential operator corresponding to $p\in S(V)\simeq P(V^*)$. For example, the $W_0$-invariant constant coefficient differential operator $p_2(\partial)$ associated to the polynomial $p_2(\cdot)=\|\cdot\|^2\in P(V^*)^{W_0}$ is the Laplacian $\Delta$ on $V$.
The quantum integrable system which we now define depends on certain coupling constants called multiplicity functions.
Definition 2.1. A multiplicity function $k$ is a $W$-invariant function $k:\Sigma\to\mathbb{R}$ satisfying $k(a)=k(Da)$ for all $a\in\Sigma$.
Unless stated explicitly otherwise, we fix a strictly positive multiplicity function $k:\Sigma\to\mathbb{R}_{>0}$. To simplify notations we write $k_a$ for the value of $k$ at the root $a\in\Sigma$. We define the quantum Hamiltonian $H_k$ by
\[ H_k = -\Delta + \sum_{a\in\Sigma} k_a\,\delta(a(\cdot)), \tag{2.4} \]
where $\delta$ is the Dirac delta-function. Here we interpret $H_k$ as a linear map $H_k:C(V)\to\mathcal{D}'(V)$, with $C(V)$ the complex-valued continuous functions on $V$ and $\mathcal{D}'(V)$ the space of distributions on $V$, as
\[ (H_kf)(\phi) := -\int_V f(v)\,\Delta\phi(v)\,dv + \sum_{a\in\Sigma}\frac{k_a}{\|Da^\vee\|}\int_{V_a} f(v)\phi(v)\,d_av \tag{2.5} \]
for a test function $\phi$, with $dv$ the Euclidean volume measure on $V$ and $d_av$ ($a\in\Sigma^+$) the corresponding volume measure on the root hyperplane $V_a$. The quantum Hamiltonian $H_k$ and the associated quantum physical system have been studied in e.g. [31, 10, 11]. A key step in these investigations is the reformulation of the spectral problem for $H_k$ in terms of an explicit boundary value problem for the Laplacian $\Delta$ on $V$, which we now proceed to recall.
Let $CB^1(V)$ be the space of complex valued continuous functions $f$ on $V$ whose restriction $f|_C$ to an alcove $C\in\mathcal{C}$ has a continuously differentiable extension to some open neighborhood $\widetilde{C}\supset\overline{C}$. Let $C^{1,(k)}(V)$ be the space of functions $f\in CB^1(V)$ which satisfy the derivative jump conditions
\[ \partial_{Da^\vee}f(v+0\,Da^\vee) - \partial_{Da^\vee}f(v-0\,Da^\vee) = 2k_a f(v) \tag{2.6} \]
for sub-regular vectors $v\in V_a$ ($a\in\Sigma^+$).
Proposition 2.2. For $f\in CB^1(V)$ and $E\in\mathbb{C}$ the following two statements are equivalent.
(i) $H_kf=Ef$ as distributions on $V$.
(ii) $f\in C^{1,(k)}(V)$ and $\Delta f|_{V_{reg}}=-Ef|_{V_{reg}}$ as distributions on $V_{reg}$.
A function $f\in CB^1(V)$ satisfying these equivalent conditions is smooth on $V_{reg}$.
Proof. The first part of the proposition follows from a straightforward application of Green’s identity (cf. the proof of [11, Thm. 2.7]). The last statement follows from the fact that the constant coefficient differential operator $\Delta+E$ on $V$ is (hypo)elliptic.
The quantum physical system with quantum Hamiltonian $H_k$ is known to be integrable. The common spectral problem for the associated quantum conserved integrals has been translated by Sutherland and Gutkin [31, 10] into the following boundary value problem.
Definition 2.3. Fix a spectral parameter $\lambda\in V_{\mathbb{C}}^*:=\mathbb{C}\otimes_{\mathbb{R}}V^*$. We denote $BVP_k(\lambda)$ for the space of functions $f\in C^{1,(k)}(V)$ solving (in the distributional sense) the system
\[ p(\partial)f|_{V_{reg}} = p(\lambda)f|_{V_{reg}} \qquad \forall\,p\in S(V)^{W_0} \tag{2.7} \]
of differential equations away from the root hyperplane configuration $\bigcup_{a\in\Sigma^+}V_a$.
Remark 2.4. Since $\Delta=p_2(\partial)$ is the Laplacian on $V$, Proposition 2.2 implies that a function $f\in BVP_k(\lambda)$ is smooth on $V_{reg}$ and satisfies the differential equations (2.7) in the strong sense. The fact that $f$ is an eigenfunction of all $W_0$-invariant constant coefficient differential operators on $V_{reg}$ in fact implies that $f|_C$ is the restriction of a (necessarily unique) analytic function on $V$ for all alcoves $C\in\mathcal{C}$, see [30].
The central theme of this paper is the study of the subspace $BVP_k(\lambda)^W\subset BVP_k(\lambda)$ of $W=W_0\ltimes Q^\vee$-invariant solutions, where $W$ acts on $BVP_k(\lambda)\subset C^{1,(k)}(V)$ by
\[ (wf)(v) = f(w^{-1}v) \tag{2.8} \]
for $w\in W$ and $v\in V$. Our focus on $W$-invariant solutions thus amounts to studying the bosonic (= $W_0$-invariant) theory of the quantum system under $Q^\vee$-periodicity constraints (or equivalently, we view the quantum system on the torus $V/Q^\vee$).
Example 2.5 (Free case $k\equiv 0$). A function $f\in BVP_0(\lambda)$ is a distributional solution of $(\Delta-p_2(\lambda))f=0$ on $V$, and the constant coefficient differential operator $\Delta-p_2(\lambda)$ is (hypo)elliptic, hence $f$ is smooth on $V$ (cf. Proposition 2.2). Combined with Remark 2.4 we conclude that a function $f\in BVP_0(\lambda)$ is analytic on $V$. Then $BVP_0(\lambda)^W$ ($\lambda\in V_{\mathbb{C}}^*$) are the common eigenspaces of the quantum conserved integrals for the free bosonic quantum integrable system on $V/Q^\vee$ associated to the Laplacian $\Delta$ on $V$. It is easy to show that $BVP_0(\lambda)^W$ is zero-dimensional unless $\lambda\in 2\pi iP$, in which case it is spanned by the plane wave
\[ \phi_\lambda^0 = \frac{1}{\#W_0}\sum_{w\in W_0}e^{w\lambda} \]
(cf. the analysis in the impenetrable case $k\equiv\infty$ in Sect. 3).
The quantum Hamiltonian (2.4) for $\Sigma_0$ of type $A_n$ takes the explicit form
\[ -\Delta + k\sum_{m\in\mathbb{Z}}\ \sum_{1\le i\ne j\le n+1}\delta(x_i-x_j+m). \]
Here we have embedded $V$ into $\mathbb{R}^{n+1}$ as the hyperplane defined by $x_1+\cdots+x_{n+1}=0$. The study of $W$-invariant solutions to the boundary value problem then essentially amounts to analyzing the spectral problem for the system describing $n+1$ quantum bosons on the circle with pair-wise repulsive delta-function interactions. In this special case the quantum system has been extensively studied in the physics literature, see e.g. [9, 22, 32, 33, 8, 18]. The upgrade to other classical root systems amounts to adding particular reflection terms to the physical model, see e.g. [29, 2, 8, 16, 19, 24].
We are now in a position to formulate the main results on the solution space of the boundary value problem. We call the spectral value $\lambda\in V_{\mathbb{C}}^*=V^*\oplus iV^*$ regular if its isotropy sub-group in $W_0$ is trivial (equivalently, $\lambda(\alpha^\vee)\ne 0$ for all $\alpha\in\Sigma_0$). We call $\lambda$ singular otherwise. Furthermore, $\lambda$ is called real (respectively purely imaginary) if $\lambda\in V^*$ (respectively $\lambda\in iV^*$). Define the c-function by
\[ c_k(\lambda) = \prod_{\alpha\in\Sigma_0^+}\frac{\lambda(\alpha^\vee)+k_\alpha}{\lambda(\alpha^\vee)} \tag{2.9} \]
as rational function of $\lambda\in V_{\mathbb{C}}^*$, cf. [8, 15].
Theorem 2.6. Let $\lambda\in V_{\mathbb{C}}^*$. The space $BVP_k(\lambda)^W$ of $W$-invariant solutions to the boundary value problem is one-dimensional or zero-dimensional. It is one-dimensional if and only if the spectral value $\lambda$ is a purely imaginary, regular solution of the Bethe ansatz equations
\[ e^{w\lambda(\varphi^\vee)} = \prod_{\alpha\in\Sigma_0^+}\left(\frac{w\lambda(\alpha^\vee)-k_\alpha}{w\lambda(\alpha^\vee)+k_\alpha}\right)^{\alpha(\varphi^\vee)} \qquad \forall\,w\in W_0. \tag{2.10} \]
If $BVP_k(\lambda)^W$ is one-dimensional, then there exists a unique $\phi_\lambda^k\in BVP_k(\lambda)^W$ normalized by $\phi_\lambda^k(0)=1$. The solution $\phi_\lambda^k$ is the unique $W$-invariant function satisfying
\[ \phi_\lambda^k(v) = \frac{1}{\#W_0}\sum_{w\in W_0}c_k(w\lambda)\,e^{w\lambda(v)}, \qquad v\in\overline{C_+}. \tag{2.11} \]
We give a reformulation of Theorem 2.6 in Sect. 3. The Bethe ansatz equations are derived in Sect. 8. The regularity constraint on $\lambda$ is proved in Sect. 11.
Remark 2.7. The Bethe ansatz equations (2.10) can be rewritten as
\[ e^{w\lambda(\varphi^\vee)} = \frac{w\lambda(\varphi^\vee)-k_\varphi}{w\lambda(\varphi^\vee)+k_\varphi}\ \prod_{\alpha\in\Sigma_0^+\cap s_\varphi\Sigma_0^-}\frac{w\lambda(\alpha^\vee)-k_\alpha}{w\lambda(\alpha^\vee)+k_\alpha}, \qquad \forall\,w\in W_0, \tag{2.12} \]
due to the fact that for $\alpha\in\Sigma_0^+$,
\[ \alpha(\varphi^\vee) = \begin{cases} 2 & \text{if } \alpha=\varphi,\\ 1 & \text{if } \alpha\in\Sigma_0^+\cap s_\varphi\Sigma_0^-\setminus\{\varphi\},\\ 0 & \text{if } \alpha\in\Sigma_0^+\cap s_\varphi\Sigma_0^+. \end{cases} \tag{2.13} \]
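The trichotomy (2.13) is easy to test by machine for small root systems. The following sketch is illustrative and makes choices that are not in the text: roots of types $A_2$, $A_3$ and $B_2$ are realized as integer vectors (identifying $V$ with $V^*$ through the standard inner product), a root is declared positive when its first nonzero coordinate is positive, and the highest root is found as the positive root with the largest pairing against a strictly dominant vector supplied by the caller.

```python
def roots_type_A(n):
    """Roots e_i - e_j (i != j) of type A_n as integer vectors in Z^(n+1)."""
    roots = []
    for i in range(n + 1):
        for j in range(n + 1):
            if i != j:
                v = [0] * (n + 1)
                v[i], v[j] = 1, -1
                roots.append(tuple(v))
    return roots


def roots_type_B2():
    """Roots +-e_1, +-e_2, +-e_1 +- e_2 of type B_2."""
    return [(1, 0), (-1, 0), (0, 1), (0, -1),
            (1, 1), (1, -1), (-1, 1), (-1, -1)]


def is_positive(root):
    """Assumed choice of positive system: first nonzero coordinate is positive."""
    return next(x for x in root if x != 0) > 0


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pair_with_coroot(alpha, phi):
    """alpha(phi^vee) = 2<alpha, phi> / <phi, phi>, an integer for roots."""
    numerator, denominator = 2 * dot(alpha, phi), dot(phi, phi)
    assert numerator % denominator == 0
    return numerator // denominator


def check_2_13(roots, dominant):
    """Return (highest root, True/False) where the flag says whether every
    positive root falls into the case of (2.13) predicted by s_phi."""
    positive = [r for r in roots if is_positive(r)]
    phi = max(positive, key=lambda r: dot(r, dominant))
    ok = True
    for alpha in positive:
        m = pair_with_coroot(alpha, phi)
        # s_phi(alpha) = alpha - alpha(phi^vee) * phi
        image = tuple(a - m * p for a, p in zip(alpha, phi))
        if alpha == phi:
            expected = 2
        elif not is_positive(image):   # alpha lies in s_phi(Sigma_0^-)
            expected = 1
        else:                          # alpha lies in s_phi(Sigma_0^+)
            expected = 0
        ok = ok and (m == expected)
    return phi, ok
```

For instance, `check_2_13(roots_type_B2(), (2, 1))` uses the long root $e_1+e_2$ as highest root, in agreement with the statement that $\varphi$ is a long root.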
A key role in the analysis of the Bethe ansatz equations (2.10) is played by the following master function.
Definition 2.8. The master function $S_k:P\times V^*\to\mathbb{R}$ is defined by
\[ S_k(\mu,\xi) = \frac12\|\xi\|^2 - 2\pi\langle\mu,\xi\rangle + \frac12\sum_{\alpha\in\Sigma_0}\|\alpha\|^2\int_0^{\xi(\alpha^\vee)}\arctan\Bigl(\frac{t}{k_\alpha}\Bigr)\,dt. \tag{2.14} \]
The master function $S_k$ enters into the description of the set $BAE_k$ of solutions $\lambda\in iV^*$ of the Bethe ansatz equations (2.10) in the following way.
Proposition 2.9. For $\mu\in P$ there exists a unique extremum $\widehat{\mu}_k\in V^*$ of the master function $S_k(\mu,\cdot)$. The assignment $\mu\mapsto i\widehat{\mu}_k$ defines a $W_0$-equivariant bijection $P\xrightarrow{\sim}BAE_k$.
The proof of Proposition 2.9, which hinges on the strict convexity of $S_k(\mu,\cdot)$ ($\mu\in P$), is given in Sect. 9. The regularity condition on the spectrum in Theorem 2.6 also turns out to be a consequence of the strict convexity of the master function $S_k(\mu,\cdot)$ ($\mu\in P$), see Sect. 11. The following proposition yields precise information on the location of the deformed weight $i\widehat{\mu}_k\in BAE_k$.
Proposition 2.10. For $\mu\in P^+$ and $\beta\in\Sigma_0^+$ we have
\[ \frac{2\pi\mu(\beta^\vee)}{1+\frac{h_k}{n}} \le \widehat{\mu}_k(\beta^\vee) \le 2\pi\mu(\beta^\vee), \tag{2.15} \]
where $h_k=2\sum_{\alpha\in\Sigma_0}k_\alpha^{-1}$. Furthermore, $\mu\in P^+$ if and only if $\widehat{\mu}_k\in V_+^*$.
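The content of Propositions 2.9 and 2.10 can be made concrete in rank one. The sketch below is a numerical illustration under assumptions made here, not taken from the text: $V^*\cong\mathbb{R}$ with $\alpha=2$, so that $\alpha^\vee=1$, $\|\alpha\|^2=4$, $n=1$ and $P=\mathbb{Z}$. With these conventions the stationarity condition for (2.14) becomes $\xi+4\arctan(\xi/k)=2\pi\mu$, which is also what (2.10) reduces to for $\lambda=i\xi$, and $h_k=4/k$. The function names are hypothetical.

```python
import math


def bethe_root(mu, k, tol=1e-13):
    """Solve xi + 4*atan(xi/k) = 2*pi*mu for xi by bisection (k > 0).

    The left-hand side is strictly increasing in xi, which is the rank-one
    shadow of the strict convexity of the master function."""
    target = 2.0 * math.pi * mu

    def g(xi):
        return xi + 4.0 * math.atan(xi / k) - target

    # g changes sign between 0 and 2*pi*mu, whatever the sign of mu.
    lo, hi = (0.0, target) if mu >= 0 else (target, 0.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def bounds_2_15(mu, k):
    """Lower and upper bound of (2.15) in rank one (n = 1, beta = alpha)."""
    n = 1
    h_k = 2.0 * (1.0 / k + 1.0 / k)   # sum over the two roots +alpha, -alpha
    upper = 2.0 * math.pi * mu
    return upper / (1.0 + h_k / n), upper
```

For large $k$ the root returned by `bethe_root(mu, k)` approaches $2\pi\mu$, the impenetrable limit, while for small $k$ it moves towards the lower bound.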
Proposition 2.10 is proved in Sect. 10. The lower bound in (2.15) shows how far away the spectral values $\widehat{\mu}_k\in V_+^*$ ($\mu\in P^{++}$) are from being singular.
The Bethe ansatz functions $\phi_\lambda^k$ and the necessity of the Bethe ansatz equations (2.10) on the allowed spectrum were obtained by Lieb and Liniger [22] for a root system $\Sigma_0$ of type $A_n$, and soon after generalized to a root system $\Sigma_0$ of type $D_n$ by Gaudin [7, 8] (see also [19]). For $\Sigma_0$ of type $A_n$, Yang and Yang [33] introduced the master function $S$ (also known as the Yang-Yang action) and derived the special case of Proposition 2.9 using its strict convexity.
In the physics literature the regularity of the spectral parameter $\lambda$ (see Theorem 2.6) is usually imposed as an additional requirement, since it automatically ensures that eigenstates admit a plane wave expansion within any alcove $C\in\mathcal{C}$. The regularity condition for a root system $\Sigma_0$ of type $A_n$ can be viewed as a Pauli type principle for the interacting quantum bosons, since it implies that the momenta of the quantum bosons are pair-wise different. An actual proof of the regularity of the spectrum was obtained by Izergin and Korepin [18] using quantum inverse scattering methods. In this derivation the regularity condition again follows from the strict convexity of the master function. Estimates for the momenta gaps of the quantum particles play a role in the study of the thermodynamical limit, see [22, 33]. See e.g. [8, Sect 4.3.2] for the exact analog of the estimates (2.15) for $\Sigma_0$ of type $A_n$.
It is believed [18] that quantum integrable systems governed by a strictly convex master function always have a regularity constraint on the spectrum, although as far as we know a conceptual understanding is lacking. We remark though that our derivation of the regularity constraint on the spectrum is in accordance with this point of view. A conceptual understanding of the partly fermionic nature of the quantum integrable system at hand is given in the next section.
3. Generalization of Girardeau’s Isomorphism
Let $C^\omega(V)$ be the space of complex valued, real analytic functions on $V$, which we consider as a $W$-module with respect to the usual action (2.8). Consider for $\lambda\in V_{\mathbb{C}}^*$ the space
\[ E(\lambda) = \{f\in C^\omega(V) \mid p(\partial)f=p(\lambda)f \quad \forall\,p\in S(V)_{\mathbb{C}}^{W_0}\}, \tag{3.1} \]
which is a $W$-submodule of $C^\omega(V)$. We observed in Example 2.5 that
\[ E(\lambda) = BVP_0(\lambda), \qquad \lambda\in V_{\mathbb{C}}^*. \tag{3.2} \]
In this section we give a convenient description of the solution space $BVP_k(\lambda)^W$ of the boundary value problem (Definition 2.3) in terms of the space of invariants in $E(\lambda)$ with respect to a $k$-dependent $W$-action by integral-reflection operators. We will view this result as a natural generalization of Girardeau’s [9] equivalence between the impenetrable quantum Bose-gas and the free quantum Fermi-gas on the circle to arbitrary root systems and to arbitrary multiplicity functions $k$.
We start by generalizing Girardeau’s [9] results on the impenetrable quantum Bose-gas on the circle to arbitrary affine root systems. Denote $E(\lambda)^{Q^\vee}$ for the subspace of $Q^\vee$-translation invariant functions in $E(\lambda)$.
Lemma 3.1. For $\lambda\in V_{\mathbb{C}}^*$, we have $E(\lambda)^{Q^\vee}=\{0\}$ unless $\lambda\in 2\pi iP$. For $\lambda\in 2\pi iP$ the space $E(\lambda)^{Q^\vee}$ is spanned by $e^\mu$ ($\mu\in W_0\lambda$).
Proof. By [30], a function $f\in E(\lambda)$ can be uniquely expressed as
\[ f(v) = \sum_{\mu\in W_0\lambda}p_\mu(v)\,e^{\mu(v)}, \]
where $p_\mu\in P(V)_{\mathbb{C}}$, see also Sect. 7. Such a nonzero function $f$ is $Q^\vee$-translation invariant iff
\[ p_\mu(v+\gamma)\,e^{\mu(\gamma)} = p_\mu(v) \tag{3.3} \]
for all $\mu\in W_0\lambda$, $v\in V$ and $\gamma\in Q^\vee$. This implies that $\lambda\in iV^*$ and that $p_\mu$ is bounded on $V$ for all $\mu\in W_0\lambda$. The latter condition implies that $p_\mu$ is constant for all $\mu\in W_0\lambda$. Returning to (3.3) with $p_\mu\in\mathbb{C}$, the $Q^\vee$-translation invariance of $f$ is equivalent to $\mu(Q^\vee)\subset 2\pi i\mathbb{Z}$ if $p_\mu\ne 0$. Hence $E(\lambda)^{Q^\vee}=\{0\}$ unless $\lambda\in 2\pi iP$, in which case $E(\lambda)^{Q^\vee}$ is spanned by $e^\mu$ ($\mu\in W_0\lambda$).
We denote $E(\lambda)^{-W}$ for the space of functions $f\in E(\lambda)$ satisfying $f(w^{-1}v)=(-1)^{l(w)}f(v)$ for all $w\in W$ and $v\in V$. Since translations $\mu\in Q^\vee\subset W$ have even length, $E(\lambda)^{-W}$ consists of $Q^\vee$-translation invariant functions. In particular, $E(\lambda)^{-W}$ is the solution space to the spectral problem for a free fermionic quantum integrable system on $V/Q^\vee$ associated to the Laplacian $\Delta$ on $V$.
Corollary 3.2. Let $\lambda\in V_{\mathbb{C}}^*$. The space $E(\lambda)^{-W}$ is zero-dimensional or one-dimensional. It is one-dimensional iff $\lambda$ is a regular element from $2\pi iP$, in which case $E(\lambda)^{-W}$ is spanned by
\[ \psi_\lambda^\infty = \frac{1}{\#W_0}\prod_{\alpha\in\Sigma_0^+}\lambda(\alpha^\vee)^{-1}\sum_{w\in W_0}(-1)^{l(w)}e^{w\lambda}. \tag{3.4} \]
Proof. Let $\lambda\in 2\pi iP$ and $f=\sum_{\mu\in W_0\lambda}c_\mu e^\mu\in E(\lambda)^{Q^\vee}$ with $c_\mu\in\mathbb{C}$, cf. Lemma 3.1. Then we have $f\in E(\lambda)^{-W}$ iff $c_{w\lambda}=(-1)^{l(w)}c_\lambda$ for all $w\in W_0$. For singular $\lambda$ this implies $c_\mu=0$ for all $\mu\in W_0\lambda$. For regular $\lambda$ we conclude that $f$ is a constant multiple of $\psi_\lambda^\infty\in E(\lambda)^{-W}$.
Following the analogy with Girardeau’s [9] analysis of the impenetrable quantum Bose-gas on the circle, we define now a linear map $G:C^\omega(V)\to C(V)^W$ by
\[ (Gf)(w^{-1}v) := f(v), \qquad w\in W,\ v\in\overline{C_+}. \tag{3.5} \]
The map $G$ is injective: for $g\in C(V)^W$ in the image of $G$, the function $G^{-1}g$ is the unique analytic continuation of $g|_{C_+}$ to $V$.
For $k\equiv\infty$ we interpret the boundary conditions (2.6) as $f|_{V_a}\equiv 0$ for all $a\in\Sigma^+$. The solution spaces $BVP_\infty(\lambda)^W$ of the associated boundary value problem (see Definition 2.3) can now be analyzed as follows.
Proposition 3.3. For $\lambda\in V_{\mathbb{C}}^*$ we have
(i) The map $G$ restricts to a linear isomorphism $G:E(\lambda)^{-W}\xrightarrow{\sim}BVP_\infty(\lambda)^W$.
(ii) The space $BVP_\infty(\lambda)^W$ is zero-dimensional or one-dimensional. It is one-dimensional iff $\lambda$ is a regular element from $2\pi iP$. In that case $BVP_\infty(\lambda)^W$ is spanned by $\phi_\lambda^\infty:=G(\psi_\lambda^\infty)$, which is the unique $W$-invariant function satisfying
\[ \phi_\lambda^\infty(v) = \frac{1}{\#W_0}\prod_{\alpha\in\Sigma_0^+}\lambda(\alpha^\vee)^{-1}\sum_{w\in W_0}(-1)^{l(w)}e^{w\lambda(v)}, \qquad v\in\overline{C_+}. \]
Proof. (i) A function $f\in E(\lambda)^{-W}$ vanishes on the root hyperplanes $V_a$ ($a\in\Sigma^+$), hence so does $g:=G(f)\in C(V)^W$. The function $g$ furthermore satisfies the differential equations (2.7), hence $g\in BVP_\infty(\lambda)^W$. For $g\in BVP_\infty(\lambda)^W$ we define $f=\widetilde{G}(g)\in C(V)^{-W}$ by $f(w^{-1}v):=(-1)^{l(w)}g(v)$ for $w\in W$ and $v\in\overline{C_+}$. This is well defined since $g$ vanishes on the root hyperplanes $V_a$ ($a\in\Sigma^+$). Since $f$ is $W$-alternating we have $f\in C^{1,(0)}(V)$. The function $f$ satisfies the differential equations (2.7), hence $f\in BVP_0(\lambda)^{-W}=E(\lambda)^{-W}$, where the last equality follows from (3.2). The proof is now completed by observing that $\widetilde{G}:BVP_\infty(\lambda)^W\to E(\lambda)^{-W}$ is the inverse of the map $G:E(\lambda)^{-W}\to BVP_\infty(\lambda)^W$.
(ii) This follows from (i) and Corollary 3.2.
For a root system $\Sigma_0$ of type A, Proposition 3.3 is due to Girardeau [9]. For the generalization of Proposition 3.3 to arbitrary multiplicity function $k$ it is convenient to reinterpret the space $E(\lambda)^{-W}$ as follows. Consider the integral operator
\[ I(a)f(v) = \int_0^{a(v)}f(v-t\,Da^\vee)\,dt \qquad (a\in\Sigma) \tag{3.6} \]
as a linear operator on $C(V)$. The integral operators $I(a)$ ($a\in I$) satisfy the braid relations of $\Sigma$ as well as the quadratic relations $I(a)^2=0$, cf. e.g. [13]. In particular, given a reduced expression $w=s_{i_1}s_{i_2}\cdots s_{i_{l(w)}}$ for $w\in W$, the operator $Q_\infty(w):=I(a_{i_1})I(a_{i_2})\cdots I(a_{i_{l(w)}})$ is well defined. Denote
\[ E(\lambda)^W_{Q_\infty} := \{f\in E(\lambda) \mid Q_\infty(w)f=0 \quad \forall\,w\in W\setminus\{e\}\}, \]
where $e\in W$ is the unit element of $W$. We now have the following simple observation.
Lemma 3.4. For $f\in C(V)$ and $b\in\Sigma$ we have $s_bf=-f$ if and only if $I(b)f=0$. In particular, $E(\lambda)^{-W}=E(\lambda)^W_{Q_\infty}$ for all $\lambda\in V_{\mathbb{C}}^*$.
Proof. It is immediate that $I(b)f=0$ if $s_bf=-f$. The converse follows from the fact that
\[ \partial_{Db^\vee}I(b)f = f+s_bf, \tag{3.7} \]
cf. [11, Lem. 2.1(iii)].
By Lemma 3.4, Proposition 3.3 (i) can be reformulated as the statement that the map $G$ restricts to an isomorphism
\[ G:E(\lambda)^W_{Q_\infty}\xrightarrow{\sim}BVP_\infty(\lambda)^W. \tag{3.8} \]
The isomorphism (3.8) can now be generalized to arbitrary multiplicity function $k$ as follows. In the terminology of Gutkin [13], the system of integral operators $\{k_bI(b)\}_{b\in\Sigma^+}$ is an operator calculus with respect to the affine Weyl group $W$ for arbitrary multiplicity function $k$. This implies that the assignment $s_a\mapsto Q_{k,a}$ ($a\in I$), with $Q_{k,a}$ the integral-reflection operators
\[ Q_{k,a}f(v) = f(s_av)+k_a\,I(a)f(v), \qquad a\in\Sigma,\ f\in C(V), \tag{3.9} \]
uniquely defines a $W$-action on $C(V)$, cf. [11, 13] or Sect. 5. Accordingly, we write $Q_k(w):=Q_{k,a_{i_1}}Q_{k,a_{i_2}}\cdots Q_{k,a_{i_r}}$ for $w=s_{i_1}s_{i_2}\cdots s_{i_r}\in W$. Note that
\[ Q_\infty(w) = \lim_{k\to\infty}k_w^{-1}Q_k(w), \qquad \forall\,w\in W, \]
where $k_w:=k_{a_{i_1}}k_{a_{i_2}}\cdots k_{a_{i_r}}$ for a reduced expression $w=s_{i_1}s_{i_2}\cdots s_{i_r}\in W$. The generalization of (3.8) for arbitrary multiplicity function $k$ is now the statement that the map $G$ restricts to a linear isomorphism
\[ G:E(\lambda)^W_{Q_k}\xrightarrow{\sim}BVP_k(\lambda)^W \tag{3.10} \]
for arbitrary positive multiplicity function $k$, where $E(\lambda)^W_{Q_k}$ is the subspace of $Q_k(W)$-invariant functions in $E(\lambda)$. The proof of (3.10) will be given in Sect. 6. With the isomorphism (3.10) at hand, Theorem 2.6 is equivalent to the following theorem.
Theorem 3.5. Let $\lambda\in V_{\mathbb{C}}^*$. The space $E(\lambda)^W_{Q_k}$ is one-dimensional or zero-dimensional. It is one-dimensional if and only if $\lambda$ is a purely imaginary, regular solution of the Bethe ansatz equations (2.10). If $E(\lambda)^W_{Q_k}$ is one-dimensional then
\[ \psi_\lambda^k(v) = \frac{1}{\#W_0}\sum_{w\in W_0}c_k(w\lambda)\,e^{w\lambda(v)}, \qquad \forall\,v\in V \tag{3.11} \]
is the unique function in $E(\lambda)^W_{Q_k}$ normalized by $\psi_\lambda^k(0)=1$.
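Theorem 3.5 can be probed numerically in rank one. The sketch below is an illustration under assumptions introduced here rather than in the text: $V=\mathbb{R}$, $\alpha(v)=2v$, $\alpha^\vee=1$, so that $a_1=\alpha$, $a_0=-\alpha+1$ and, for $\lambda=i\xi$, formula (3.11) becomes $\psi(v)=\cos(\xi v)+(k/\xi)\sin(\xi v)$. The operators (3.9) are evaluated with Simpson quadrature, and the Bethe ansatz equation is taken in the reduced form $\xi+4\arctan(\xi/k)=2\pi\mu$. All function names are hypothetical.

```python
import math

A1 = (1, 0)    # a_1(v) = 2v
A0 = (-1, 1)   # a_0(v) = 1 - 2v


def solve_bae(mu, k, iterations=200):
    """Bisection for xi + 4*atan(xi/k) = 2*pi*mu, assuming mu >= 1 and k > 0."""
    target = 2.0 * math.pi * mu
    lo, hi = 0.0, target
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + 4.0 * math.atan(mid / k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def psi(v, xi, k):
    """Rank-one form of (3.11) for lambda = i*xi with xi != 0."""
    return math.cos(xi * v) + (k / xi) * math.sin(xi * v)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even, b < a is allowed."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def Q(root, k, f):
    """Integral-reflection operator (3.9) for the affine root (eps, m),
    meaning a(v) = 2*eps*v + m and Da^vee = eps in this normalization."""
    eps, m = root

    def Qf(v):
        a_v = 2.0 * eps * v + m
        reflected = v - a_v * eps                      # s_a(v)
        integral = simpson(lambda t: f(v - t * eps), 0.0, a_v)
        return f(reflected) + k * integral

    return Qf


def max_defect(root, xi, k, points=(-0.7, 0.0, 0.1, 0.3, 0.9)):
    """Largest value of |Q_{k,a} psi - psi| over a few sample points."""
    f = lambda v: psi(v, xi, k)
    Qf = Q(root, k, f)
    return max(abs(Qf(v) - f(v)) for v in points)
```

In this model invariance under $Q_{k,a_1}$ holds for every $\xi\neq 0$, whereas invariance under $Q_{k,a_0}$ singles out the solutions of the Bethe ansatz equation; comparing `max_defect(A0, xi, k)` at a root returned by `solve_bae` and at a perturbed value of `xi` makes this visible.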
Theorem 3.5 is proved in Sect. 8 under the assumption that $\lambda$ is regular. The assertion that $\lambda$ is necessarily regular is proved in Sect. 11.
In order to reveal the full symmetry structures underlying the isomorphism (3.10), we will consider the upgrade of the map $G$ to a $k$-dependent linear isomorphism $T_k$ of $C(V)$ which intertwines the $Q_k(W)$-action with the usual $W$-action (2.8), and which acts as $G$ when applied to $Q_k(W)$-invariant functions. The map which does the job is Gutkin’s [11] propagation operator, defined by
\[ (T_kf)(w^{-1}v) = (Q_k(w)f)(v) \]
for $w\in W$ and $v\in\overline{C_+}$ (see Sect. 5 for details). The propagation operator $T_k$ now restricts to an isomorphism
\[ T_k:E(\lambda)\xrightarrow{\sim}BVP_k(\lambda) \tag{3.12} \]
for all $\lambda\in V_{\mathbb{C}}^*$ (cf. [11] and Theorem 6.3), which implies (3.10) by restricting to the subspaces of $W$-invariant functions.
We conclude this section by considering the limit to the impenetrable case $k\equiv\infty$. The Bethe ansatz equations (2.10) then reduce to
\[ e^{w\lambda(\varphi^\vee)} = 1 \qquad \forall\,w\in W_0, \]
which has $2\pi iP$ as purely imaginary solutions $\lambda$ (see (2.3)). Furthermore we have
\[ \lim_{k\to\infty}\widehat{\mu}_k = 2\pi\mu \tag{3.13} \]
for $\mu\in P^+$, which follows by taking the limit $k\to\infty$ in (2.15). For $\lambda=i\widehat{\mu}_k\in iV^*$ ($\mu\in P^{++}$) a regular solution to the Bethe ansatz equation, $\psi_\lambda^k\in E(\lambda)^W_{Q_k}$ (see (3.11)) can alternatively be written as
\[ \psi_\lambda^k = \frac{1}{\#W_0}\sum_{w\in W_0}Q_k(w)(e^\lambda), \]
see [15] or Sect. 7. It follows that
\[ \lim_{k\to\infty}k_{w_0}^{-1}\,\psi^k_{i\widehat{\mu}_k} = \frac{1}{\#W_0}Q_\infty(w_0)(e^{2\pi i\mu}) = \psi^\infty_{2\pi i\mu} \]
for $\mu\in P^{++}$, uniformly on compacta. Pulling the limits through the map $G$, we obtain
\[ \lim_{k\to\infty}k_{w_0}^{-1}\,\phi^k_{i\widehat{\mu}_k} = \phi^\infty_{2\pi i\mu} \]
for $\mu\in P^{++}$, uniformly on compacta.
4. Dunkl Operators and Hecke Algebras
It is well known that conserved integrals for quantum integrable systems of Calogero-Moser type can be conveniently expressed in terms of Dunkl-type operators, which are explicit commuting first-order differential-reflection operators, see e.g. [14, 6, 1]. The Dunkl operators, together with the usual Weyl group action (2.8), form a faithful representation of suitable degenerations of affine Hecke algebras, see [26, Cor. 2.9]. The exploration of these structures has been instrumental in solving the corresponding quantum integrable systems.
In this section we derive the Dunkl-type operators and the underlying Hecke algebra structures for the periodic quantum integrable systems with delta-potentials as introduced in Sect. 2. We initially define the Dunkl operators as explicit differential-reflection operators on the space $C^\infty(V_{\mathrm{reg}})$ of smooth functions on $V_{\mathrm{reg}}$. In Sect. 6 we obtain the key result that these Dunkl operators act on the solution space $\mathrm{BVP}_k(\lambda)$ to the boundary value problem. Together with the usual $W$-action (2.8), the space $\mathrm{BVP}_k(\lambda)$ then becomes a module over the associated graded algebra $H_k$ of Cherednik's [3] (suitably filtered) degenerate double affine Hecke algebra. On the other hand, we will show in Sect. 5 that the $W$-action $Q_k$ on $E(\lambda)$ together with the directional derivatives $\partial_v$ ($v \in V$) makes $E(\lambda)$ into an $H_k$-module. With these upgraded symmetry structures, Gutkin's propagation operator $T_k$ turns out to yield an isomorphism
\[ T_k : E(\lambda) \xrightarrow{\ \sim\ } \mathrm{BVP}_k(\lambda) \]
of $H_k$-modules for all $\lambda \in V_{\mathbb{C}}^*$. It is this particular isomorphism which is explored in Sect. 6 to (re-)prove and clarify crucial results on the boundary value problem (see Definition 2.3), as well as on the associated bosonic theory.

We denote $\chi : \mathbb{R}\setminus\{0\} \to \{0,1\}$ for the characteristic function of the interval $(-\infty, 0)$, so $\chi(x) = 1$ if $x < 0$ and $\chi(x) = 0$ if $x > 0$. For $a \in \Sigma$ the function $\chi_a(v) := \chi(a(v))$ ($v \in V_{\mathrm{reg}}$) defines a smooth function on $V_{\mathrm{reg}}$, which is constant on the alcoves $C$ of $V_{\mathrm{reg}}$. In fact, for $w \in W$ and $a \in \Sigma^+$ we have
\[ \chi_a|_{w^{-1}C_+} \equiv \begin{cases} 1 & \text{if } wa \in \Sigma^-, \\ 0 & \text{if } wa \in \Sigma^+, \end{cases} \tag{4.1} \]
hence $\chi_a$ is nonzero on a given alcove $w^{-1}C_+$ ($w \in W$) for only finitely many positive roots $a \in \Sigma^+$. The Dunkl-type operators
\[ D_v^k = \partial_v + \sum_{a\in\Sigma^+} k_a\, Da(v)\, \chi_a(\cdot)\, s_a \qquad (v \in V), \tag{4.2} \]
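thus define linear operators on $C^\infty(V_{\mathrm{reg}})$, which depend linearly on $v \in V$. For $f \in C^\infty(V_{\mathrm{reg}})$ and $w \in W$ we have by (4.1),
\[ (D_v^k f)\big|_{w^{-1}C_+} = \Bigl(\partial_v f + \sum_{a\in\Sigma^+\cap w^{-1}\Sigma^-} k_a\, Da(v)\, s_a f\Bigr)\Big|_{w^{-1}C_+}. \tag{4.3} \]
In particular, for the fundamental alcove $C_+$ we simply have
\[ D_v^k f|_{C_+} = \partial_v f|_{C_+}. \tag{4.4} \]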
The Dunkl operators $D_v^k$ ($v \in V$) and the $W$-action (2.8) on $C^\infty(V_{\mathrm{reg}})$ satisfy the following fundamental commutation relations.

Theorem 4.1. (i) We have the cross relation
\[ s_a D_v^k = D^k_{s_{Da}v}\, s_a + k_a\, Da(v), \qquad v \in V,\ a \in I. \]
(ii) The Dunkl operators $D_v^k$ ($v \in V$) pair-wise commute.
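As a sanity check on the cross relation, the following sketch works it out in the simplest finite rank-one analog rather than in the affine setting of the theorem. The assumptions (introduced here for illustration only) are: a single positive root $\alpha$ with reflection $s$, one multiplicity value $k$, the action $(sf)(x) = f(sx)$, and the operator $D_v = \partial_v + k\,\alpha(v)\,\chi_\alpha(\cdot)\,s$ acting on functions of a regular point $x$, so that $\alpha(x) \neq 0$.

```latex
% Rank-one sketch (assumed setting: one positive root \alpha, reflection s,
% (sf)(x) = f(sx), D_v = \partial_v + k\,\alpha(v)\,\chi_\alpha(\cdot)\,s).
\begin{align*}
(s D_v f)(x) &= (D_v f)(sx)
   = (\partial_v f)(sx) + k\,\alpha(v)\,\chi_\alpha(sx)\, f(x), \\
(D_{sv}\, s f)(x) &= \partial_{sv}(sf)(x) + k\,\alpha(sv)\,\chi_\alpha(x)\, f(x)
   = (\partial_v f)(sx) - k\,\alpha(v)\,\chi_\alpha(x)\, f(x).
\end{align*}
% Here \partial_{sv}(sf)(x) = \frac{d}{dt} f(sx + tv)\big|_{t=0} = (\partial_v f)(sx)
% and \alpha(sv) = -\alpha(v). Subtracting the two lines:
\[
(s D_v - D_{sv}\, s) f(x)
   = k\,\alpha(v)\,\bigl(\chi_\alpha(sx) + \chi_\alpha(x)\bigr) f(x)
   = k\,\alpha(v)\, f(x),
\]
% because \alpha(sx) = -\alpha(x), so exactly one of \alpha(x), \alpha(sx) is negative.
```

The last step is the rank-one instance of the identity $\chi_a + \chi_{-a} = 1$ on regular points, which is what produces the constant term $k_a\, Da(v)$ in the cross relation.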
Proof. (i) Fix $v \in V$ and $a \in I$. By a direct computation we have
\[ s_a D_v^k s_a = \partial_{s_{Da}v} + \sum_{b\in s_a\Sigma^+} k_b\, Db(s_{Da}v)\, \chi_b(\cdot)\, s_b. \]
Since $s_a\Sigma^+ = (\Sigma^+ \setminus \{a\}) \cup \{-a\}$ we obtain
\[ s_a D_v^k = D^k_{s_{Da}v}\, s_a - k_a\, Da(s_{Da}v)\bigl(\chi_a(\cdot) + \chi_{-a}(\cdot)\bigr) = D^k_{s_{Da}v}\, s_a + k_a\, Da(v), \]
which is the desired cross relation.

(ii) We derive the commutativity of the Dunkl operators $D_v^k$ ($v \in V$) as a direct consequence of (4.4) and the cross relation. Let $f \in C^\infty(V_{\mathrm{reg}})$ and $v, v' \in V$. We show by induction on the length $l(w)$ of $w \in W$ that
\[ [D_v^k, D_{v'}^k] f\big|_{w^{-1}C_+} = 0. \tag{4.5} \]
By (4.4), Eq. (4.5) is obviously valid for $w = e$ the unit element of $W$. To prove the induction step, it suffices to show that
\[ s_a [D_v^k, D_{v'}^k] = [D^k_{s_{Da}v}, D^k_{s_{Da}v'}]\, s_a \tag{4.6} \]
for all $a \in I$. For the proof of (4.6), first observe that
\[ s_a D_v^k D_{v'}^k - D^k_{s_{Da}v} D^k_{s_{Da}v'}\, s_a = k_a\bigl( Da(v')\, D_v^k + Da(v)\, D_{v'}^k - Da(v)\, Da(v')\, D^k_{Da^\vee} \bigr) \tag{4.7} \]
for all $a \in I$, which follows from applying the cross relation twice. Now (4.6) follows from the fact that the right hand side of (4.7) is symmetric in $v$ and $v'$.

By Theorem 4.1 (ii), the assignment $v \mapsto D_v^k$ uniquely extends to an algebra morphism $S(V)_{\mathbb{C}} \to \mathrm{End}(C^\infty(V_{\mathrm{reg}}))$. We denote $p(D^k)$ for the differential-reflection operator on $C^\infty(V_{\mathrm{reg}})$ associated to $p \in S(V)_{\mathbb{C}}$.

Theorem 4.2. (i) There exists a unique complex unital associative algebra $H_k = H_k(\Sigma)$ satisfying
(a) $H_k = S(V)_{\mathbb{C}} \otimes \mathbb{C}[W]$ as vector spaces, with $\mathbb{C}[W]$ the group algebra of $W$.
(b) The maps $p \mapsto p \otimes e$ and $w \mapsto 1 \otimes w$, with $e \in W$ the unit element of $W$, are algebra embeddings of $S(V)_{\mathbb{C}}$ and $\mathbb{C}[W]$ into $H_k$.
(c) The cross relations $s_a \cdot v - s_{Da}v \cdot s_a = k_a\, Da(v)$ hold in $H_k$ for $a \in I$ and $v \in V \subset S(V)_{\mathbb{C}}$. Here we have identified $S(V)_{\mathbb{C}}$ and $\mathbb{C}[W]$ with their images in $H_k$ through the algebra embeddings of (b).
(ii) The assignment $p \mapsto p(D^k)$ ($p \in S(V)_{\mathbb{C}}$), together with the $W$-action (2.8), defines a faithful representation $\pi_k : H_k \to \mathrm{End}(C^\infty(V_{\mathrm{reg}}))$.
Proof. Suppose that $\sum_{w\in W} p_w(D^k)\, w = 0$ as an endomorphism of $C^\infty(V_{\mathrm{reg}})$, with only finitely many of the $p_w \in S(V)_{\mathbb{C}}$ nonzero. We show that all $p_w$ are zero. Equation (4.4) implies
\[ \sum_{w\in W} \bigl(p_w(\partial)(wf)\bigr)\big|_{C_+} \equiv 0, \qquad f \in C^\infty(V_{\mathrm{reg}}). \tag{4.8} \]
Applying (4.8) to functions $f$ of the form $u^{-1}g$ with $u \in W$ and with $g \in C^\infty(V_{\mathrm{reg}})$ having support in the fundamental alcove $C_+$, we conclude that $p_u(\partial) = 0$ as a constant coefficient differential operator on smooth functions in some open ball $D \subset C_+$, hence $p_u = 0$.

The proof of the theorem is now standard: let $\widetilde{H}_k$ be the complex unital associative algebra generated by $v \in V$ and $s_a$ ($a \in I$) with defining relations as in (b) and (c) (so the vectors $v \in V$ pair-wise commute, the $s_a$ ($a \in I$) are involutions satisfying the Coxeter relations associated to $\Sigma$ and $I$, and the generators satisfy the cross relations from (c)). By Theorem 4.1 and by the paragraph preceding this theorem, the assignment $v \mapsto D_v^k$, together with the $W$-action (2.8), uniquely defines an algebra morphism $\pi_k : \widetilde{H}_k \to \mathrm{End}(C^\infty(V_{\mathrm{reg}}))$. By the previous paragraph and by the cross relations in $\widetilde{H}_k$ it follows that $\pi_k$ is injective and that $\widetilde{H}_k \simeq S(V)_{\mathbb{C}} \otimes \mathbb{C}[W]$ as vector spaces (the Poincaré-Birkhoff-Witt Theorem for $\widetilde{H}_k$). Both statements of the theorem are now immediately clear.

We use the notation $M_{\pi_k}$ to indicate that a subspace $M \subseteq C^\infty(V_{\mathrm{reg}})$ is a $W$-submodule or $H_k$-submodule of $C^\infty(V_{\mathrm{reg}})$ with respect to the $\pi_k$-action.

Remark 4.3. If the values $k_a$ of the multiplicity function $k$ are considered to be independent central variables in the definition of $H_k$, then $H_k$ is graded by imposing the degree of $w \in W$ to be zero and the degrees of $v \in V$ and $k_a$ to be one. As a graded algebra, $H_k$ is the associated graded algebra of Cherednik's [3] degenerate double affine Hecke algebra, considered as a filtered algebra by the same degree function (the only difference in the definition of the latter algebra is the cross relation (see Theorem 4.2 (c)), which now is of the form $s_a \cdot v - s_a v \cdot s_a = k_a\, Da(v)$ for $a \in I$, where $S(V)$ is considered as a $W$-module algebra with the action of $s_0$ defined by $s_0 v = s_\varphi(v) + 2\|\varphi\|^{-2}\varphi(v)\,1 \in S(V)$).

Lemma 4.4. The center $Z(H_k)$ of $H_k$ contains $S(V)^{W_0}_{\mathbb{C}}$.
Proof. Observe that the cross relations in $H_k$ (see Theorem 4.2(c)) imply
\[ s_a \cdot p - s_{Da}p \cdot s_a = -k_a\, \frac{s_{Da}p - p}{Da^\vee} \tag{4.9} \]
for $a \in I$ and $p \in S(V)_{\mathbb{C}}$. It follows from (4.9) that $S(V)^{W_0}_{\mathbb{C}} \subseteq Z(H_k)$.

Remark 4.5. Observe that the subalgebra $H_k^{(0)} \subset H_k$ generated by $W_0$ and $S(V)_{\mathbb{C}}$ is isomorphic to the degenerate affine Hecke algebra (also known as the graded Hecke algebra), see e.g. [15, 23]. By [23, Prop. 4.5] we have $Z(H_k^{(0)}) = S(V)^{W_0}_{\mathbb{C}}$.
For trivial multiplicity parameters $k \equiv 0$, the operator $p(D^0)$ ($p \in S(V)_{\mathbb{C}}$) on $C^\infty(V_{\mathrm{reg}})$ is the constant-coefficient differential operator $p(\partial)$ on $C^\infty(V_{\mathrm{reg}})$. We have the following striking fact when $p \in S(V)_{\mathbb{C}}$ is $W_0$-invariant.

Corollary 4.6. For $p \in S(V)^{W_0}_{\mathbb{C}}$ we have $p(D^k) = p(\partial)$ as operators on $C^\infty(V_{\mathrm{reg}})$.

Proof. Let $p \in S(V)^{W_0}_{\mathbb{C}}$ and $f \in C^\infty(V_{\mathrm{reg}})$. By (4.4) we have $p(D^k)f|_{C_+} = p(\partial)f|_{C_+}$. Let $w \in W$ and $v \in C_+$. By Lemma 4.4 applied twice (once with multiplicity function $k$, once with $k \equiv 0$), we have
\[ \bigl(p(D^k)f\bigr)(w^{-1}v) = \bigl(p(D^k)(wf)\bigr)(v) = \bigl(p(\partial)(wf)\bigr)(v) = \bigl(p(\partial)f\bigr)(w^{-1}v), \]
hence $p(D^k)f = p(\partial)f$.
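To see the cancellation behind Corollary 4.6 concretely, here is a minimal sketch in a finite rank-one analog. It assumes (for illustration only) that $V$ is one-dimensional with a single positive root $\alpha$ and reflection $s$, so $sv = -v$, that $(sf)(x) = f(sx)$, and that $D_v = \partial_v + c\,\chi_\alpha(\cdot)\,s$ with $c = k\,\alpha(v)$. The invariant polynomial is $p = v^2$.

```latex
% Rank-one sketch (assumed setting: dim V = 1, sv = -v, c = k\,\alpha(v),
% D_v = \partial_v + c\,\chi_\alpha(\cdot)\,s on regular points).
% Two elementary facts on V_reg, where \chi_\alpha is locally constant:
%   \partial_v(\chi_\alpha\, sf) = \chi_\alpha\,\partial_v(sf) = -\chi_\alpha\, s(\partial_v f),
%   s(\chi_\alpha\, sf) = \chi_{-\alpha}\, f.
\begin{align*}
D_v^2 f
 &= \partial_v\bigl(\partial_v f + c\,\chi_\alpha\, sf\bigr)
    + c\,\chi_\alpha\, s\bigl(\partial_v f + c\,\chi_\alpha\, sf\bigr) \\
 &= \partial_v^2 f - c\,\chi_\alpha\, s(\partial_v f)
    + c\,\chi_\alpha\, s(\partial_v f) + c^2\,\chi_\alpha\chi_{-\alpha}\, f \\
 &= \partial_v^2 f ,
\end{align*}
% since \chi_\alpha\chi_{-\alpha} = 0: a regular point cannot lie on both sides
% of the hyperplane \alpha = 0.
```

The first-order terms cancel because $s$ reverses the direction $v$, and the zeroth-order term vanishes because the two characteristic functions have disjoint supports. Both effects rely on $p$ being reflection-invariant.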
Remark 4.7. The Dunkl operators $D_v^k$, Theorem 4.1, Theorem 4.2 and Corollary 4.6 have their obvious analogs in the context of finite root systems. In that case, the Dunkl-type operators are
\[ \partial_v + \sum_{\alpha\in\Sigma_0^+} k_\alpha\, \alpha(v)\, \chi_\alpha(\cdot)\, s_\alpha, \qquad v \in V, \]
realizing, together with the $W_0$-action (2.8), an action of the degenerate affine Hecke algebra $H_k^{(0)}$ on the space of smooth functions on $V \setminus \bigcup_{\alpha\in\Sigma_0^+} V_\alpha$. For classical root systems these operators were constructed using solutions of classical Yang-Baxter equations and reflection equations in [28], [24] (type A) and [19]. This construction fits into Cherednik's [2] general framework relating root system analogs of r-matrices to (degenerate) affine Hecke algebras and Dunkl operators.

5. Integral-Reflection Operators and the Propagation Operator

Heckman and Opdam [15] clarified the role of the degenerate affine Hecke algebra $H_k^{(0)}$ in Gutkin's [11] work when the underlying root system is finite. It led to an explicit action of $H_k^{(0)}$ as directional derivatives and integral-reflection operators. In this section we extend these results to the present affine set-up. We show that Gutkin's [11] propagation operator intertwines this action with the action $\pi_k$ which is defined in the previous section in terms of Dunkl-type differential-reflection operators.

The integral-reflection operators $Q_{k,a}$ (see (3.9)) for $a \in \Sigma$ are endomorphisms of $C(V)$ satisfying
\[ w\, Q_{k,a}\, w^{-1} = Q_{k,w(a)}, \qquad w \in W,\ a \in \Sigma \tag{5.1} \]
with respect to the $W$-action (2.8) on $C(V)$. We furthermore have
\[ Q_{k,a}f|_{V_a} = f|_{V_a}, \qquad a \in \Sigma. \tag{5.2} \]
By [11, Thm. 2.3], the assignment
\[ s_a \mapsto Q_k(s_a) := Q_{k,a} \qquad (a \in I) \tag{5.3} \]
extends to a representation $Q_k$ of $W$ on $C(V)$. In particular, for $w \in W$ and any choice of decomposition $w = s_{j_1}s_{j_2}\cdots s_{j_r}$ as a product of simple reflections ($j_l \in \{0,\ldots,n\}$), we have
\[ Q_k(w) = Q_k(s_{j_1})\, Q_k(s_{j_2}) \cdots Q_k(s_{j_r}) \tag{5.4} \]
as operators on $C(V)$.

Definition 5.1. Gutkin's [11] propagation operator $T_k$ is the endomorphism of $C(V)$ defined by
\[ (T_k f)(w^{-1}v) = (Q_k(w)f)(v), \qquad v \in C_+,\ w \in W. \tag{5.5} \]
In particular, $T_0$ is the identity operator on $C(V)$.

A $W$-submodule $M \subseteq C(V)$ with respect to the $Q_k$-action will be denoted by $M_{Q_k}$. By construction the propagation operator $T_k : C(V)_{Q_k} \to C(V)_{\pi_k}$ is $W$-equivariant. In fact, by [11, Thm. 2.6] $T_k$ is an isomorphism of $W$-modules. Observe that the operators $Q_k(w)$ ($w \in W$) preserve the space $C^\infty(V)$ of complex valued, smooth functions on $V$. The following result is the affine analog of [15, Thm. 2.1] and [15, Cor. 2.3].

Theorem 5.2. The assignment $v \mapsto \partial_v$ ($v \in V$), together with the $W$-action (5.3) on $C^\infty(V)$, extends uniquely to a representation $Q_k : H_k \to \mathrm{End}(C^\infty(V))$.

Proof. It suffices to verify the cross relations (see Theorem 4.2(c)), which follow directly from [11, Lem. 2.1].

We will also use the notation $M_{Q_k}$ to indicate that a subspace $M \subseteq C^\infty(V)$ is an $H_k$-submodule with respect to the $Q_k$-action. Observe that $C^\omega(V)_{Q_k} \subseteq C^\infty(V)_{Q_k}$ as $H_k$-submodule. Consider the space $CB^\omega(V)$ of functions $f \in C(V)$ such that $f|_C$ is the restriction of a (necessarily unique) analytic function on $V$ for all alcoves $C \in \mathcal{C}$ (cf. Remark 2.4). Denote $C^{\omega,(k)}(V)$ for the space of functions $f \in CB^\omega(V)$ satisfying
\[ \bigl(\partial^r_{Db^\vee} f\bigr)(v + 0\,Db^\vee) - \bigl(\partial^r_{Db^\vee} f\bigr)(v - 0\,Db^\vee) = \bigl(1 - (-1)^r\bigr)\, k_b\, \bigl(\partial^{r-1}_{Db^\vee} f\bigr)(v + 0\,Db^\vee) \tag{5.6} \]
for $b \in \Sigma^+$, $v \in V_b$ sub-regular and $r \in \mathbb{Z}_{>0}$. A function $f \in C^{\omega,(k)}(V)$ automatically satisfies the jump conditions (5.6) for $b \in \Sigma^-$, $v \in V_b$ sub-regular and $r \in \mathbb{Z}_{>0}$, hence the space $C^{\omega,(k)}(V)$ is not dependent on the choice of positive roots $\Sigma^+$ in $\Sigma$.

We thus can and will interpret $C^{\omega,(k)}(V)_{\pi_k}$ and $CB^\omega(V)_{\pi_k}$ as $W$-submodules of $C^\infty(V_{\mathrm{reg}})_{\pi_k}$. Observe furthermore that $C^{\omega,(k)}(V)$ is a subspace of the space $C^{1,(k)}(V)$ used in the formulation of the boundary value problems (see Proposition 2.2 and Definition 2.3). Observe that the propagation operator $T_k$ restricts to a linear map $T_k : C^\omega(V) \to CB^\omega(V)$. We now obtain the following theorem.

Theorem 5.3. (i) $C^{\omega,(k)}(V)_{\pi_k} \subseteq C^\infty(V_{\mathrm{reg}})_{\pi_k}$ is an $H_k$-submodule.
(ii) The propagation operator $T_k$ restricts to an isomorphism
\[ T_k : C^\omega(V)_{Q_k} \xrightarrow{\ \sim\ } C^{\omega,(k)}(V)_{\pi_k} \]
of $H_k$-modules.
Proof. We first show that $T_k$ restricts to a linear isomorphism $T_k : C^\omega(V) \xrightarrow{\ \sim\ } C^{\omega,(k)}(V)$. For this we use the commutation relations
\[ s_a \cdot (Da^\vee)^r - (-1)^r (Da^\vee)^r \cdot s_a = \bigl(1 - (-1)^r\bigr)\, k_a\, (Da^\vee)^{r-1}, \qquad a \in I,\ r \in \mathbb{Z}_{>0} \tag{5.7} \]
in $H_k$, which follows from (4.9) applied to $p = (Da^\vee)^r \in S(V)_{\mathbb{C}}$. Let $\phi \in C^\omega(V)$ and denote $f = T_k\phi \in CB^\omega(V)$. We show that $f$ satisfies the derivative jumps (5.6) over sub-regular $v \in V_b$ ($b \in \Sigma^+$) for all $r \in \mathbb{Z}_{>0}$. In view of the $W$-equivariance of the propagation operator $T_k$, it suffices to derive the derivative jumps for $f$ over sub-regular vectors $v \in V_a \cap \overline{C_+}$ ($a \in I$). Fix $a \in I$, $v \in V_a \cap \overline{C_+}$ sub-regular and $r \in \mathbb{Z}_{>0}$. For $\epsilon > 0$ small we have $v + t\,Da^\vee = s_a(v - t\,Da^\vee) \in C_+$ for $0 < t < \epsilon$. Hence
\[ \bigl(\partial^r_{Da^\vee} f\bigr)(v + 0\,Da^\vee) = \bigl(\partial^r_{Da^\vee}\phi\bigr)(v) = \bigl(Q_k(s_a)(\partial^r_{Da^\vee}\phi)\bigr)(v), \tag{5.8} \]
where the second equality follows from (5.2). On the other hand,
\[ \bigl(\partial^r_{Da^\vee} f\bigr)(v - 0\,Da^\vee) = (-1)^r \bigl(\partial^r_{Da^\vee}(s_a f)\bigr)(v + 0\,Da^\vee) = (-1)^r \bigl(\partial^r_{Da^\vee}(Q_k(s_a)\phi)\bigr)(v). \tag{5.9} \]
Combining (5.8) and (5.9) now yields
\begin{align*}
\bigl(\partial^r_{Da^\vee} f\bigr)(v + 0\,Da^\vee) - \bigl(\partial^r_{Da^\vee} f\bigr)(v - 0\,Da^\vee)
 &= \Bigl(\bigl(Q_k(s_a)\,\partial^r_{Da^\vee} - (-1)^r\, \partial^r_{Da^\vee}\, Q_k(s_a)\bigr)\phi\Bigr)(v) \\
 &= \bigl(1 - (-1)^r\bigr)\, k_a\, \bigl(\partial^{r-1}_{Da^\vee}\phi\bigr)(v) \\
 &= \bigl(1 - (-1)^r\bigr)\, k_a\, \bigl(\partial^{r-1}_{Da^\vee} f\bigr)(v + 0\,Da^\vee),
\end{align*}
where the second equality follows from (the $Q_k$-image of) (5.7). Thus $f \in C^{\omega,(k)}(V)$.

The map $T_k : C^\omega(V) \to C^{\omega,(k)}(V)$ is clearly injective. We now proceed to prove surjectivity. Let $f \in C^{\omega,(k)}(V)$ and denote $\psi$ for the unique analytic function on $V$ satisfying $\psi|_{C_+} = f|_{C_+}$. The function $g := f - T_k\psi \in C^{\omega,(k)}(V)$ satisfies $g|_{C_+} \equiv 0$. Combined with the continuity of $g$ and the derivative jump conditions (5.6) for $g$, we obtain
\[ \bigl(\partial^r_{Da^\vee} g\bigr)(v - 0\,Da^\vee) = 0 \]
for $r \in \mathbb{Z}_{\geq 0}$, $a \in I$ and $v \in V_a \cap \overline{C_+}$ sub-regular. Since $g|_C$ ($C \in \mathcal{C}$) has an extension to an analytic function on the whole Euclidean space $V$, we conclude that $g|_C \equiv 0$ for the neighboring alcoves $C = s_aC_+$ ($a \in I$) of $C_+$. Continuing inductively we conclude that $g \equiv 0$ on $V$, hence $f = T_k\psi$.

It remains to show that the isomorphism
\[ T_k : C^\omega(V)_{Q_k} \xrightarrow{\ \sim\ } C^{\omega,(k)}(V)_{\pi_k} \]
of $W$-modules is in fact an isomorphism of $H_k$-modules. For this it suffices to show that
\[ T_k(\partial_v f)\big|_{V_{\mathrm{reg}}} = D_v^k\bigl(T_k f|_{V_{\mathrm{reg}}}\bigr) \tag{5.10} \]
for $v \in V$ and $f \in C^\omega(V)$. To prove (5.10) we use the commutation relation
\[ w \cdot v = (Dw)v \cdot w + \sum_{a\in\Sigma^+\cap w^{-1}\Sigma^-} k_a\, Da(v)\, w s_a \tag{5.11} \]
in $H_k$, which can be easily proved by induction on the length $l(w)$ of $w \in W$ using the cross relations in $H_k$ (see Theorem 4.2(c)). Fix $w \in W$ and $v' \in C_+$. By (5.11) and Theorem 5.2 we have
\begin{align*}
T_k(\partial_v f)(w^{-1}v') &= \bigl(Q_k(w)(\partial_v f)\bigr)(v') \\
 &= \bigl(\partial_{(Dw)v}(Q_k(w)f)\bigr)(v') + \sum_{a\in\Sigma^+\cap w^{-1}\Sigma^-} k_a\, Da(v)\, \bigl(Q_k(ws_a)f\bigr)(v') \\
 &= \bigl(\partial_v(T_k f)\bigr)(w^{-1}v') + \sum_{a\in\Sigma^+\cap w^{-1}\Sigma^-} k_a\, Da(v)\, (T_k f)(s_a w^{-1}v') \\
 &= \bigl(D_v^k(T_k f)\bigr)(w^{-1}v'),
\end{align*}
where the last equality follows from (4.3).
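The commutation relations (5.7) used in this proof can be checked by hand for small $r$ directly from the cross relation of Theorem 4.2(c). The sketch below uses the shorthand $x = Da^\vee$, $s = s_a$, $k = k_a$ (shorthand introduced here only), together with the standard facts $s_{Da}(Da^\vee) = -Da^\vee$ and $Da(Da^\vee) = 2$.

```latex
% Shorthand (for this check only): x = Da^\vee, s = s_a, k = k_a.
% Cross relation with v = x:  s x - s_{Da}(x) s = k\,Da(x), i.e.
\[ s\,x + x\,s = 2k \qquad (r = 1). \]
% r = 2: move s past x twice, using s x = 2k - x s each time.
\begin{align*}
s\,x^2 &= (2k - x\,s)\,x = 2k\,x - x\,(s\,x) = 2k\,x - x\,(2k - x\,s) = x^2\,s ,
\end{align*}
% so s x^2 - x^2 s = 0, matching (1 - (-1)^2) k x = 0.
% r = 3: combine the two previous steps.
\begin{align*}
s\,x^3 &= (s\,x^2)\,x = x^2\,(s\,x) = x^2\,(2k - x\,s) = 2k\,x^2 - x^3\,s ,
\end{align*}
% so s x^3 + x^3 s = 2k x^2, matching (1 - (-1)^3) k x^2.
```

The pattern alternates with the parity of $r$, which is why only odd-order normal derivatives of $f$ jump across the hyperplanes in (5.6).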
Remark 5.4. The assertion [11, Thm. 2.7] that, in Gutkin's notation, the propagation operator $T_k$ is an automorphism of the $W$-module $CB^\infty$, seems to be incorrect. In fact, the integral-operators $I(a)$ ($a \in \Sigma$) do not preserve $CB^\infty$, contrary to the claim in the proof of [11, Thm. 2.7]. In [11], this result is used to link $\mathrm{BVP}_k(\lambda)$ to $E(\lambda)$ (see (3.1)). We will show in Sect. 6 that Theorem 5.3(ii) suffices to provide this link.

Remark 5.5. Theorem 5.3 has an obvious analog in the context of finite root systems (compare with Remark 4.7). In the case of a finite root system of type A, the intertwining properties of the propagation operator with respect to the degenerate affine Hecke algebra actions were considered in [16] and the normal derivative jump conditions of higher order were considered in [12].

Corollary 5.6. Fix $v \in V$. The Dunkl operator $D_v^k$ is a linear operator on $C^{\omega,(k)}(V)$ satisfying $D_v^k T_k f = T_k \partial_v f$ for all $f \in C^\omega(V)$.

In the following proposition we relate the Dunkl operators $D_v^k$ to the quantum Hamiltonian $H_k$ (see (2.4) and (2.5)). Recall that $p_2(\partial) = \Delta$ for the $W_0$-invariant polynomial $p_2 = \|\cdot\|^2$ on $V^*$.

Proposition 5.7. For $f \in C^{\omega,(k)}(V)$ we have
\[ -p_2(D^k)f = H_k f \tag{5.12} \]
as distributions on $V$.

Proof. Fix $f \in C^{\omega,(k)}(V)$, then $p_2(D^k)f \in C^{\omega,(k)}(V) \subseteq C(V)$ and $p_2(D^k)f|_{V_{\mathrm{reg}}} = \Delta f|_{V_{\mathrm{reg}}}$ by Corollary 4.6. Furthermore, $f$ satisfies the first order normal derivative jumps (2.6) over the affine hyperplanes $V_a$ ($a \in \Sigma^+$). The identity (5.12) then follows from a standard argument using Green's identity, cf. (the proof of) Proposition 2.2.

By Proposition 5.7 it is justified to interpret the quantum Hamiltonian $H_k$ on $C^{\omega,(k)}(V)$ as the operator $-p_2(D^k)$ on $C^{\omega,(k)}(V)$. The complete integrability of the quantum system is then directly reflected by the commutativity of the Dunkl operators $D_v^k$ ($v \in V$). More precisely, the space $C^{\omega,(k)}(V)^W_{\pi_k}$ serves as an algebraic model for the Hilbert space of quantum states associated to the bosonic quantum system on $V/Q^\vee$ with Hamiltonian $H_k = -p_2(D^k)$. The pair-wise commuting operators $p(D^k)$ ($p \in S(V)^{W_0}_{\mathbb{C}}$) on $C^{\omega,(k)}(V)^W_{\pi_k}$ are the corresponding quantum conserved integrals.
6. The Boundary Value Problem Revisited

The operators $p(D^k)$ ($p \in S(V)^{W_0}_{\mathbb{C}}$) on $C^{\omega,(k)}(V)$ satisfy
\[ p(D^k)f|_{V_{\mathrm{reg}}} = p(\partial)f|_{V_{\mathrm{reg}}}, \qquad f \in C^{\omega,(k)}(V) \]
by Corollary 4.6. This key observation leads to an explicit connection between the spectral problem of the operators $p(D^k)$ ($p \in S(V)^{W_0}_{\mathbb{C}}$) and the boundary value problem as formulated in Definition 2.3. We will first do the analysis for the spectral problem of the quantum Hamiltonian $H_k$ (defined by (2.4) and (2.5)). For $E \in \mathbb{C}$ we write $E(E)$ for the space of functions $f \in C^\omega(V)$ satisfying $\Delta f = -Ef$ on $V$ (cf. Example 2.5). By Lemma 4.4, $E(E)_{Q_k} \subseteq C^\omega(V)_{Q_k}$ is an $H_k$-submodule. Denote $E_k(E)$ for the space of functions $f \in CB^\omega(V)$ satisfying $H_k f = Ef$ as distributions on $V$ (cf. Proposition 2.2).

Theorem 6.1. Fix $E \in \mathbb{C}$. (i) We have
\[ E_k(E) = \{ f \in C^{\omega,(k)}(V) \mid p_2(D^k)f = -Ef \}, \tag{6.1} \]
hence $E_k(E)_{\pi_k} \subseteq C^{\omega,(k)}(V)_{\pi_k}$ is an $H_k$-submodule.
(ii) The propagation operator $T_k$ restricts to an isomorphism
\[ T_k : E(E)_{Q_k} \xrightarrow{\ \sim\ } E_k(E)_{\pi_k} \]
of $H_k$-modules.

Proof. (i) We first show that $E_k(E) \subseteq C^{\omega,(k)}(V)$. Fix $f \in E_k(E)$. By Proposition 2.2, $f \in C^{1,(k)}(V) \cap CB^\omega(V)$ and $\Delta f|_{V_{\mathrm{reg}}} = -Ef|_{V_{\mathrm{reg}}}$. Let $\psi$ be the unique analytic function on $V$ satisfying $\psi|_{C_+} = f|_{C_+}$, then $\psi \in E(E)$. By Theorem 5.3 and Corollary 4.6 we conclude that $T_k\psi \in C^{\omega,(k)}(V)$ and $\Delta(T_k\psi)|_{V_{\mathrm{reg}}} = -E(T_k\psi)|_{V_{\mathrm{reg}}}$. Hence $g := f - T_k\psi \in C^{1,(k)}(V) \cap CB^\omega(V)$ satisfies $\Delta g|_{V_{\mathrm{reg}}} = -Eg|_{V_{\mathrm{reg}}}$ and has the additional property that $g|_{C_+} \equiv 0$. Fix $v \in V_a \cap \overline{C_+}$ ($a \in I$) sub-regular. The nontrivial normal derivative jump condition (2.6) for $g$ at $v$ trivializes since $g|_{C_+} \equiv 0$, hence $g$ is continuously differentiable in an open neighborhood $U$ of $v$. It follows that $g|_U$ is a distribution solution of the (hypo)elliptic constant coefficient differential operator $\Delta + E$ (cf. Example 2.5), hence $g|_U$ is smooth. Since $g|_{C_+} \equiv 0$, we conclude that
\[ \bigl(\partial^r_{Da^\vee} g\bigr)(v - 0\,Da^\vee) = \bigl(\partial^r_{Da^\vee} g\bigr)(v + 0\,Da^\vee) = 0, \qquad r \in \mathbb{Z}_{\geq 0}. \]
As in the proof of Theorem 5.3 we conclude that $g|_{s_aC_+} \equiv 0$ for $a \in I$ (alternatively, this is a direct consequence of Holmgren's Uniqueness Theorem). Continuing inductively, we conclude that $g \equiv 0$ on $V$. Hence $f = T_k\psi \in C^{\omega,(k)}(V)$. Formula (6.1) now follows from Proposition 5.7. Since $p_2(D^k) = \pi_k(p_2)$, Lemma 4.4 implies that $E_k(E)_{\pi_k} \subseteq C^{\omega,(k)}(V)_{\pi_k}$ is an $H_k$-submodule.
(ii) This follows directly from Theorem 5.3, (6.1) and the fact that $Q_k(p_2) = p_2(\partial) = \Delta$.

We now extend these results to the solution spaces $\mathrm{BVP}_k(\lambda)$ of the boundary value problem (Definition 2.3). For an $H_k$-module $M$ and $\lambda \in V_{\mathbb{C}}^*$ we define
\[ M_\lambda := \{ m \in M \mid p \cdot m = p(\lambda)\, m \quad \forall\, p \in S(V)^{W_0}_{\mathbb{C}} \}, \tag{6.2} \]
which is an $H_k$-submodule of $M$ in view of Lemma 4.4. By Remark 4.5 the module $M_\lambda$ consists of the vectors $m \in M$ transforming according to the central character $\lambda \in V_{\mathbb{C}}^*$ for the action of the center of the degenerate affine Hecke algebra $H_k^{(0)} \subseteq H_k$.

Corollary 6.2. Let $\lambda \in V_{\mathbb{C}}^*$. The space $\mathrm{BVP}_k(\lambda)$ is the $H_k$-submodule $C^{\omega,(k)}(V)_{\pi_k,\lambda}$ of $C^{\omega,(k)}(V)_{\pi_k}$.

Proof. By Corollary 4.6 and Theorem 5.3 we have
\[ C^{\omega,(k)}(V)_{\pi_k,\lambda} = \{ f \in C^{\omega,(k)}(V) \mid p(\partial)f|_{V_{\mathrm{reg}}} = p(\lambda)f|_{V_{\mathrm{reg}}} \quad \forall\, p \in S(V)^{W_0}_{\mathbb{C}} \}, \tag{6.3} \]
hence $C^{\omega,(k)}(V)_{\pi_k,\lambda} \subseteq \mathrm{BVP}_k(\lambda)$. By Proposition 2.2 and Remark 2.4 we have $\mathrm{BVP}_k(\lambda) \subseteq E_k(-p_2(\lambda))$. Theorem 6.1 and (6.3) now imply that $\mathrm{BVP}_k(\lambda) \subseteq C^{\omega,(k)}(V)_{\pi_k,\lambda}$.

Theorem 6.3. Let $\lambda \in V_{\mathbb{C}}^*$.
(i) The propagation operator $T_k$ restricts to an isomorphism $T_k : E(\lambda)_{Q_k} \xrightarrow{\ \sim\ } \mathrm{BVP}_k(\lambda)_{\pi_k}$ of left $H_k$-modules.
(ii) The map $G$ (3.5) restricts to an isomorphism $G : E(\lambda)^W_{Q_k} \xrightarrow{\ \sim\ } \mathrm{BVP}_k(\lambda)^W_{\pi_k}$.

Proof. (i) The restriction of the propagation operator $T_k$ to the $H_k$-module $E(\lambda)_{Q_k} = C^\omega(V)_{Q_k,\lambda}$ defines an isomorphism
\[ T_k : E(\lambda)_{Q_k} \xrightarrow{\ \sim\ } C^{\omega,(k)}(V)_{\pi_k,\lambda} \]
of $H_k$-modules in view of Theorem 5.3. Corollary 6.2 now completes the proof.
(ii) This follows from (i) and from the fact that the propagation map $T_k$ acts on $Q_k(W)$-invariant functions in the same way as the map $G$ (3.5).

As observed in Sect. 3, Theorem 6.3 (ii) can be used to reformulate the main results on the solution space $\mathrm{BVP}_k(\lambda)^W_{\pi_k}$ (see Theorem 2.6) to the boundary value problem in terms of the space of invariants $E(\lambda)^W_{Q_k}$, where $E(\lambda)$ now is the solution space to the boundary value problem with zero normal derivative jumps over sub-regular vectors. Theorem 3.5 is the resulting reformulation of Theorem 2.6. In order to prove Theorem 3.5 we analyze the space $E(\lambda)^W_{Q_k}$ in detail in the following sections.

7. Invariants in E(λ)

In this section we analyze the sub-space $E(\lambda)^{W_0}_{Q_k}$ of $W_0$-invariants of $E(\lambda)_{Q_k}$. First we recall some well known properties of the space $E(\lambda)$ from [30, 15]. For technical purposes it is convenient to introduce the following terminology.
Definition 7.1. Let $J$ be a subset of the simple roots $I_0$. The spectral parameter $\lambda \in V_{\mathbb{C}}^*$ is called $J$-standard if $\lambda \in V^* \oplus i\,\overline{V_+^*}$ and if the isotropy sub-group of $\lambda$ in $W_0$ is the standard parabolic sub-group $W_{0,J}$ generated by the simple reflections $s_\alpha$ ($\alpha \in J$).

Lemma 7.2. Let $\lambda \in V_{\mathbb{C}}^*$. The $W_0$-orbit of $\lambda$ contains a $J$-standard spectral parameter for some subset $J \subseteq I_0$.

Proof. Taking a $W_0$-translate of $\lambda$ we may assume that $\lambda = \mu + i\nu$ with $\mu \in V^*$ and $\nu \in \overline{V_+^*}$. The isotropy group of $\nu$ in $W_0$ is a standard parabolic sub-group $W_{0,K} \subset W_0$ for some subset $K \subseteq I_0$. Write $V^* = V_K^* \oplus (V_K^*)^\perp$ with $V_K^* = \mathrm{span}_{\mathbb{R}}\{\alpha \mid \alpha \in K\}$ and $(V_K^*)^\perp$ its orthocomplement in $V^*$. Set
\[ V^*_{K,+} = \{ \xi \in V_K^* \mid \xi(\alpha^\vee) > 0 \quad \forall\, \alpha \in K \}, \]
which we view as the fundamental chamber for the action of the standard parabolic subgroup $W_{0,K}$ on $V_K^*$. Taking a $W_{0,K}$-translate of $\lambda$ we may assume that $\lambda = \mu' + \mu'' + i\nu$ with $\mu' \in \overline{V^*_{K,+}}$, $\mu'' \in (V_K^*)^\perp$, and $\nu \in \overline{V_+^*}$ as before. The isotropy sub-group of $\lambda$ in $W_0$ then equals the isotropy sub-group of $\mu'$ in $W_{0,K}$, which is a standard parabolic sub-group $W_{0,J}$ for some subset $J \subseteq K$ since $\mu' \in \overline{V^*_{K,+}}$.

Observe that a $J$-standard spectral parameter $\lambda$ is regular if and only if $J = \emptyset$. Note furthermore that the module $E(\lambda)$ ($\lambda \in V_{\mathbb{C}}^*$) only depends on the orbit $W_0\lambda$. When analyzing the module $E(\lambda)$, we thus may assume without loss of generality that $\lambda$ is $J$-standard for some subset $J \subseteq I_0$. In particular, we will now assume this condition for the remainder of this section.

For $j \in \mathbb{Z}_{\geq 0}$ we denote $P^{(j)}(V)_{\mathbb{C}}$ (respectively $P^{(\leq j)}(V)_{\mathbb{C}}$) for the homogeneous polynomials $p \in P(V)_{\mathbb{C}}$ of degree $j$ (respectively the polynomials $p \in P(V)_{\mathbb{C}}$ of degree $\leq j$). The $W_0$-action (2.8) on $P(V)_{\mathbb{C}}$ respects the natural grading $P(V)_{\mathbb{C}} = \bigoplus_{j=0}^{\infty} P^{(j)}(V)_{\mathbb{C}}$. Furthermore,
\[ E_J(0) = \{ f \in P(V)_{\mathbb{C}} \mid p(\partial)f = p(0)f \quad \forall\, p \in S(V)^{W_{0,J}} \} \]
is a graded $W_{0,J}$-submodule of $P(V)_{\mathbb{C}}$, isomorphic to the regular representation of $W_{0,J}$ (see e.g. [30, Thm. 1.2] and references therein). We write $E_J^{(j)}(0) = E_J(0) \cap P^{(j)}(V)_{\mathbb{C}}$ and $E_J^{(\leq j)}(0) = E_J(0) \cap P^{(\leq j)}(V)_{\mathbb{C}}$. Denote by $W_0^J$ the minimal coset representatives of $W_0/W_{0,J}$. Steinberg [30] established the decomposition
\[ E(\lambda) = \bigoplus_{u\in W_0^J} u\bigl(E_J(0)\, e^\lambda\bigr). \tag{7.1} \]
It follows from (7.1) that $E(\lambda)$, viewed as a $W_0$-module by the action (2.8), is isomorphic to the regular representation of $W_0$. Furthermore, we have $E(\lambda) = \bigoplus_{j=0}^{\infty} E^{(j)}(\lambda)$ with $E^{(j)}(\lambda)$ the $W_0$-submodule
\[ E^{(j)}(\lambda) = \bigoplus_{u\in W_0^J} u\bigl(E_J^{(j)}(0)\, e^\lambda\bigr). \]
We denote $E^{(\leq j)}(\lambda) = \bigoplus_{r=0}^{j} E^{(r)}(\lambda)$.
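Representations of the finite group $W_0$ do not admit nontrivial continuous deformations, hence $E(\lambda)_{Q_k}$ is isomorphic to the regular representation of $W_0$ for an arbitrary multiplicity function $k$. In particular, $E(\lambda)^{W_0}_{Q_k}$ is one-dimensional for all spectral values $\lambda \in V_{\mathbb{C}}^*$. In fact, by (5.2) the function
\[ \psi_\lambda^k = \frac{1}{\#W_0} \sum_{w\in W_0} Q_k(w)\, e^\lambda \tag{7.2} \]
satisfies $\psi_\lambda^k(0) = 1$ and spans $E(\lambda)^{W_0}_{Q_k}$. On the other hand, by (7.1) there exist unique polynomials $p_u^\lambda \in E_J(0)$ ($u \in W_0^J$) such that
\[ \psi_\lambda^k(v) = \sum_{u\in W_0^J} p_u^\lambda(u^{-1}v)\, e^{u\lambda(v)}, \qquad v \in V. \tag{7.3} \]
By (7.1) we have
\[ E(\lambda) = \bigoplus_{w\in W_0} \mathbb{C}\, e^{w\lambda}, \qquad \lambda \in V_{\mathbb{C}}^* \text{ regular}, \tag{7.4} \]
and so the polynomials $p_w^\lambda$ ($w \in W_0$) are constants for regular $\lambda$. In fact, from e.g. [8] and [15, Sect. 2] we have
\[ \psi_\lambda^k(v) = \frac{1}{\#W_0} \sum_{w\in W_0} c_k(w\lambda)\, e^{w\lambda(v)}, \qquad \lambda \in V_{\mathbb{C}}^* \text{ regular}, \tag{7.5} \]
where the c-function $c_k$ is given by (2.9). In the remainder of the paper it will actually be more convenient to work with the regularized c-function
\[ \tilde{c}_k(\mu) := \prod_{\substack{\alpha\in\Sigma_0^+ \\ \mu(\alpha^\vee)\neq 0}} \frac{\mu(\alpha^\vee) + k_\alpha}{\mu(\alpha^\vee)}, \qquad \mu \in V_{\mathbb{C}}^*, \tag{7.6} \]
which is equal to $c_k(\mu)$ for regular $\mu$. We can then write
\[ p_w^\lambda = \frac{1}{\#W_0}\, \tilde{c}_k(w\lambda), \qquad \lambda \in V_{\mathbb{C}}^* \text{ regular}. \]
For singular $\lambda$ an explicit expression for $p_u^\lambda \in E_J(0)$ ($u \in W_0^J$) is not known. For our purposes it suffices to have explicit expressions for the highest and the next to highest homogeneous components of $p_u^\lambda$, which we will now proceed to derive.

We denote $\Sigma_{0,J} \subseteq \Sigma_0$ for the parabolic root sub-system associated to the simple roots $J \subseteq I_0$. We write $N_J$ for the cardinality of the corresponding set $\Sigma_{0,J}^+ := \Sigma_{0,J} \cap \Sigma_0^+$ of positive roots in $\Sigma_{0,J}$ and
\[ \delta_J = \frac{1}{2} \sum_{\alpha\in\Sigma_{0,J}^+} \alpha \in V^*. \]
Recall that the minimal coset representatives $W_0^J$ of $W_0/W_{0,J}$ can be characterized by $W_0^J = \{ u \in W_0 \mid u(\Sigma_{0,J}^+) \subseteq \Sigma_0^+ \}$. The following lemma now gives a derivational expression for $p_u^\lambda$ ($u \in W_0^J$).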
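Lemma 7.3. Let $\lambda \in V_{\mathbb{C}}^*$ be $J$-standard. For $u \in W_0^J$ we have
\[ p_u^\lambda = K_J^{-1}\, \frac{d^{N_J}}{dt^{N_J}}\bigg|_{t=0} \sum_{v\in W_{0,J}} d_u(t)\, e_{uv}(t)\, (-1)^{l(v)}\, e^{t v\delta_J} \]
with coefficients
\[ d_u(t) = \prod_{\alpha\in\Sigma_0^+\setminus u(\Sigma_{0,J}^+)} \bigl(u\delta_J(\alpha^\vee)\,t + u\lambda(\alpha^\vee)\bigr)^{-1}, \qquad e_{uv}(t) = \prod_{\alpha\in\Sigma_0^+} \bigl(uv\delta_J(\alpha^\vee)\,t + u\lambda(\alpha^\vee) + k_\alpha\bigr) \]
and with strictly positive constant $K_J = N_J!\,\#W_0 \prod_{\alpha\in\Sigma_{0,J}^+} \delta_J(\alpha^\vee)$.

Proof. By (7.2), $\psi_\mu^k(v')$ ($v' \in V$) depends analytically on the spectral parameter $\mu \in V_{\mathbb{C}}^*$. In particular, $\psi^k_{\lambda_t}(v')$ with $\lambda_t := \lambda + t\delta_J \in V_{\mathbb{C}}^*$ depends analytically on $t \in \mathbb{C}$, and we have the (point-wise) limit
\[ \lim_{t\to 0} \psi^k_{\lambda_t} = \psi^k_\lambda. \tag{7.7} \]
For $\epsilon > 0$ we write
\[ U_\epsilon^0 = \{ t \in \mathbb{C} \mid 0 < |t| < \epsilon \}, \qquad U_\epsilon = \{ t \in \mathbb{C} \mid |t| < \epsilon \}. \]
There exists an $\epsilon > 0$ such that $\lambda_t$ is regular for $t \in U_\epsilon^0$, hence
\[ \psi^k_{\lambda_t} = \frac{1}{\#W_0} \sum_{w\in W_0} \prod_{\alpha\in\Sigma_0^+} \frac{w\lambda_t(\alpha^\vee) + k_\alpha}{w\lambda_t(\alpha^\vee)}\; e^{w\lambda_t}, \qquad t \in U_\epsilon^0 \]
by (7.5). Splitting the sum into a double sum $w = uv$ with $u \in W_0^J$ and $v \in W_{0,J}$ and using
\begin{align*}
\prod_{\alpha\in\Sigma_0^+} uv\lambda_t(\alpha^\vee)
 &= (-1)^{l(u)+l(v)}\, t^{N_J} \prod_{\alpha\in\Sigma_{0,J}^+} \delta_J(\alpha^\vee) \prod_{\beta\in\Sigma_0^+\setminus\Sigma_{0,J}^+} \lambda_t(\beta^\vee) \\
 &= (-1)^{l(v)}\, t^{N_J} \prod_{\alpha\in\Sigma_{0,J}^+} \delta_J(\alpha^\vee) \prod_{\beta\in\Sigma_0^+\setminus u(\Sigma_{0,J}^+)} u\lambda_t(\beta^\vee),
\end{align*}
we obtain
\[ t^{N_J}\, \psi^k_{\lambda_t} = K_J^{-1}\, N_J! \sum_{u\in W_0^J} \sum_{v\in W_{0,J}} d_u(t)\, e_{uv}(t)\, (-1)^{l(v)}\, e^{t\,uv\delta_J + u\lambda} \tag{7.8} \]
as analytic functions in $t \in U_\epsilon$ (note that $d_u(t)$ is analytic at $t \in U_\epsilon$). By (7.7), $\psi_\lambda^k$ is the $N_J$th term in the power series expansion of (7.8) at $t = 0$, which yields the desired result.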
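Define the strictly positive constant $C_J^k$ by
\[ C_J^k = \frac{1}{\#W_0} \prod_{\alpha\in\Sigma_{0,J}^+} \frac{k_\alpha}{\delta_J(\alpha^\vee)}. \]
The highest and next to highest homogeneous terms of $p_u^\lambda \in E_J(0)$ ($u \in W_0^J$) can now be explicitly computed as follows.

Proposition 7.4. Let $\lambda \in V_{\mathbb{C}}^*$ be $J$-standard and $u \in W_0^J$.
(i) The highest homogeneous term $h_u^\lambda$ of $p_u^\lambda \in E_J(0)$ is of degree $N_J$ and is explicitly given by
\[ h_u^\lambda = C_J^k\, \tilde{c}_k(u\lambda) \prod_{\alpha\in\Sigma_{0,J}^+} \alpha. \]
(ii) Suppose that $\lambda$ is singular (i.e. $J \neq \emptyset$). The next to highest homogeneous term $n_u^\lambda$ of $p_u^\lambda \in E_J(0)$ is
\[ n_u^\lambda = \partial_{u^{-1}\rho^k_{u\lambda}}\, h_u^\lambda = C_J^k\, \tilde{c}_k(u\lambda) \sum_{\beta\in\Sigma_{0,J}^+} u\beta(\rho^k_{u\lambda}) \prod_{\alpha\in\Sigma_{0,J}^+\setminus\{\beta\}} \alpha \]
with
\[ \rho^k_\mu = \sum_{\alpha\in\Sigma_0^+} \frac{\alpha^\vee}{\mu(\alpha^\vee) + k_\alpha} \in V_{\mathbb{C}}. \tag{7.9} \]

Remark 7.5. The formula for $n_u^\lambda$ should be read as an identity between analytic functions in $k_\alpha > 0$ (the possible singularities are easily seen to be removable).

Proof. (i) Observe that $e_{uv}(0) = e_u(0)$ is independent of $v \in W_{0,J}$, and
\[ d_u(0)\, e_u(0) = \tilde{c}_k(u\lambda) \prod_{\alpha\in\Sigma_{0,J}^+} k_\alpha. \]
Combined with Lemma 7.3 we conclude that the highest homogeneous term $h_u^\lambda$ of $p_u^\lambda$ is given by
\begin{align*}
h_u^\lambda &= \frac{C_J^k}{N_J!}\, \tilde{c}_k(u\lambda)\, \frac{d^{N_J}}{dt^{N_J}}\bigg|_{t=0} \sum_{v\in W_{0,J}} (-1)^{l(v)}\, e^{t v\delta_J} \\
 &= \frac{C_J^k}{N_J!}\, \tilde{c}_k(u\lambda) \sum_{v\in W_{0,J}} (-1)^{l(v)}\, (v\delta_J)^{N_J}. \tag{7.10}
\end{align*}
On the other hand, by the Weyl denominator formula for $\Sigma_{0,J}$ we have
\[ \frac{d^{N_J}}{dt^{N_J}}\bigg|_{t=0} \sum_{v\in W_{0,J}} (-1)^{l(v)}\, e^{t v\delta_J} = \frac{d^{N_J}}{dt^{N_J}}\bigg|_{t=0}\, e^{t\delta_J} \prod_{\alpha\in\Sigma_{0,J}^+} \bigl(1 - e^{-t\alpha}\bigr) = N_J! \prod_{\alpha\in\Sigma_{0,J}^+} \alpha. \]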
Combined with the first equality in (7.10) we obtain the desired expression for $h_u^\lambda$.
(ii) The next to highest homogeneous term $n_u^\lambda$ of $p_u^\lambda$ is
\[ n_u^\lambda = \frac{N_J}{K_J} \Bigl( d_u'(0)\, e_u(0) \sum_{v\in W_{0,J}} (-1)^{l(v)}\, (v\delta_J)^{N_J-1} + d_u(0) \sum_{v\in W_{0,J}} (-1)^{l(v)}\, e_{uv}'(0)\, (v\delta_J)^{N_J-1} \Bigr) \]
in view of Lemma 7.3, where the prime denotes the $t$-derivative. The first $W_{0,J}$-sum in this expression is identically zero since it is a $W_{0,J}$-alternating polynomial of degree $< N_J$. By a direct calculation the remaining expression can be rewritten as
\[ n_u^\lambda = \frac{C_J^k}{(N_J-1)!}\, \tilde{c}_k(u\lambda) \sum_{v\in W_{0,J}} (-1)^{l(v)}\, (v\delta_J)(u^{-1}\rho^k_{u\lambda})\, (v\delta_J)^{N_J-1}. \]
The desired expression for $n_u^\lambda$ now follows from (7.10).
8. The Bethe Ansatz Equations

In this section we show that $E(\lambda)^W_{Q_k} \neq \{0\}$ implies that the spectral parameter $\lambda$ is a purely imaginary solution of the Bethe ansatz equations (2.10). From the results of the previous section it is clear that $E(\lambda)^W_{Q_k}$ is one-dimensional or zero-dimensional. In fact it is one-dimensional if and only if $Q_k(a_0)\psi_\lambda^k = \psi_\lambda^k$, in which case we have
\[ E(\lambda)^W_{Q_k} = E(\lambda)^{W_0}_{Q_k} = \mathrm{span}_{\mathbb{C}}\{\psi_\lambda^k\}. \]
It is convenient to reformulate these observations in terms of
\[ J_k = \partial_{\varphi^\vee}\, Q_k(a_0) + k_\varphi \tag{8.1} \]
(viewed as an operator on e.g. $C^\infty(V)$ or $E(\lambda)$), which satisfies the elementary commutation relations
\[ J_k\, \partial_v = \partial_{s_\varphi v}\, J_k, \qquad \forall\, v \in V \]
(the operator $J_k$ can be defined on the level of the algebra $H_k$ as the element $\varphi^\vee \cdot s_0 + k_\varphi \in H_k$, in which case it is the analog of the affine intertwiner from [4] and [27, Sect. 4]). The equality $Q_k(a_0)\psi_\lambda^k = \psi_\lambda^k$ clearly implies $J_k\psi_\lambda^k = (\partial_{\varphi^\vee} + k_\varphi)\psi_\lambda^k$.

Lemma 8.1. If $\lambda$ is regular, then $J_k\psi_\lambda^k = (\partial_{\varphi^\vee} + k_\varphi)\psi_\lambda^k$ implies $Q_k(a_0)\psi_\lambda^k = \psi_\lambda^k$.

Proof. By (7.4) we have a unique expansion
\[ Q_k(a_0)\psi_\lambda^k - \psi_\lambda^k = \sum_{w\in W_0} d_w\, e^{w\lambda} \]
with $d_w \in \mathbb{C}$. We conclude from the equality $J_k\psi_\lambda^k = (\partial_{\varphi^\vee} + k_\varphi)\psi_\lambda^k$ that $w\lambda(\varphi^\vee)\, d_w = 0$ for all $w \in W_0$. Since $\lambda$ is regular, this implies $d_w = 0$ for all $w \in W_0$.

For $p \in P(V)_{\mathbb{C}} \simeq S(V^*)_{\mathbb{C}}$ we write $p(\partial_\mu)$ for the associated constant coefficient differential operator acting on smooth functions in $\mu \in V_{\mathbb{C}}^*$.
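Lemma 8.2. Let $p \in P(V)_{\mathbb{C}} \simeq S(V^*)_{\mathbb{C}}$. For $w \in W_0$ we have
\begin{align*}
J_k\bigl(p(w^{-1}\cdot)\, e^{w\mu}\bigr)(v) &= -p(\partial_\mu)\Bigl( \bigl(\mu(w^{-1}\varphi^\vee) + k_\varphi\bigr)\, e^{\mu(w^{-1}\varphi^\vee)}\, e^{\mu(w^{-1}s_\varphi v)} \Bigr), \\
\bigl(\partial_{\varphi^\vee} + k_\varphi\bigr)\bigl(p(w^{-1}\cdot)\, e^{w\mu}\bigr)(v) &= p(\partial_\mu)\Bigl( \bigl(\mu(w^{-1}\varphi^\vee) + k_\varphi\bigr)\, e^{\mu(w^{-1}v)} \Bigr),
\end{align*}
where we view the left hand sides as functions in $v \in V$ and the right hand sides as functions in $\mu \in V_{\mathbb{C}}^*$. In particular,
\[ J_k\bigl(P^{(\leq j)}(V)_{\mathbb{C}}\, e^\mu\bigr) \subseteq P^{(\leq j)}(V)_{\mathbb{C}}\, e^{s_\varphi\mu}, \qquad \bigl(\partial_{\varphi^\vee} + k_\varphi\bigr)\bigl(P^{(\leq j)}(V)_{\mathbb{C}}\, e^\mu\bigr) \subseteq P^{(\leq j)}(V)_{\mathbb{C}}\, e^\mu \]
for $j \in \mathbb{Z}_{\geq 0}$ and $\mu \in V_{\mathbb{C}}^*$.

Proof. Observe that
\[ \bigl(p(w^{-1}\cdot)\, e^{w\mu}\bigr)(v) = p(\partial_\mu)\bigl(e^{\mu(w^{-1}v)}\bigr), \]
and $p(\partial_\mu)$ (acting on $\mu \in V_{\mathbb{C}}^*$) clearly commutes with $J_k$ and $(\partial_{\varphi^\vee} + k_\varphi)$ (which act on $v \in V$). Thus it suffices to prove the lemma for $p \equiv 1$, in which case the second formula is trivial. To prove the first formula for $p \equiv 1$ we may assume without loss of generality that $w = e$ is the unit element of $W_0$. Suppose that $\mu \in V_{\mathbb{C}}^*$ is regular. A direct computation using the definition (3.9) of $Q_k(a_0)$ as an integral-reflection operator yields
\[ Q_k(a_0)\, e^\mu = -\frac{k_\varphi}{\mu(\varphi^\vee)}\, e^\mu + \frac{\mu(\varphi^\vee) + k_\varphi}{\mu(\varphi^\vee)}\, e^{\mu(\varphi^\vee)}\, e^{s_\varphi\mu}, \]
hence
\[ J_k(e^\mu) = -\bigl(\mu(\varphi^\vee) + k_\varphi\bigr)\, e^{\mu(\varphi^\vee)}\, e^{s_\varphi\mu}. \]
In the latter formula the regularity constraint on $\mu$ can be removed by continuity.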
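The direct computation behind the formula for $Q_k(a_0)e^\mu$ can be sketched as follows. Since (3.9) is not restated in this section, the sketch assumes that the integral-reflection operator has the form $(Q_{k,a}f)(v) = f(s_a v) + k_a \int_0^{a(v)} f(v - t\,Da^\vee)\,dt$, that the affine simple root is $a_0 = 1 - \varphi$ (so $s_{a_0}v = s_\varphi v + \varphi^\vee$ and $Da_0^\vee = -\varphi^\vee$), and that $k_{a_0} = k_\varphi$. These are assumptions about conventions fixed earlier in the paper, chosen to be consistent with (5.2) and with the formulas of Lemma 8.2.

```latex
% Assumed conventions (not restated in this section):
%   (Q_{k,a} f)(v) = f(s_a v) + k_a \int_0^{a(v)} f(v - t\,Da^\vee)\,dt,
%   a_0 = 1 - \varphi,  s_{a_0} v = s_\varphi v + \varphi^\vee,  k_{a_0} = k_\varphi.
\begin{align*}
(Q_k(a_0)\, e^\mu)(v)
 &= e^{\mu(s_\varphi v + \varphi^\vee)}
    + k_\varphi \int_0^{1-\varphi(v)} e^{\mu(v + t\varphi^\vee)}\, dt \\
 &= e^{\mu(\varphi^\vee)}\, e^{s_\varphi\mu(v)}
    + \frac{k_\varphi}{\mu(\varphi^\vee)}\, e^{\mu(v)}
      \bigl( e^{(1-\varphi(v))\,\mu(\varphi^\vee)} - 1 \bigr) \\
 &= -\frac{k_\varphi}{\mu(\varphi^\vee)}\, e^{\mu(v)}
    + \frac{\mu(\varphi^\vee) + k_\varphi}{\mu(\varphi^\vee)}\,
      e^{\mu(\varphi^\vee)}\, e^{s_\varphi\mu(v)} ,
\end{align*}
% using \mu(v) - \varphi(v)\,\mu(\varphi^\vee) = \mu(s_\varphi v) = (s_\varphi\mu)(v).
% Applying \partial_{\varphi^\vee}, with \partial_{\varphi^\vee} e^{\mu} = \mu(\varphi^\vee) e^{\mu}
% and \partial_{\varphi^\vee} e^{s_\varphi\mu} = -\mu(\varphi^\vee) e^{s_\varphi\mu},
% and then adding k_\varphi e^{\mu}:
\begin{align*}
J_k(e^\mu)
 &= -k_\varphi\, e^\mu
    - \bigl(\mu(\varphi^\vee) + k_\varphi\bigr)\, e^{\mu(\varphi^\vee)}\, e^{s_\varphi\mu}
    + k_\varphi\, e^\mu
  = -\bigl(\mu(\varphi^\vee) + k_\varphi\bigr)\, e^{\mu(\varphi^\vee)}\, e^{s_\varphi\mu} .
\end{align*}
```

The division by $\mu(\varphi^\vee)$ is the only place where regularity of $\mu$ enters, and it cancels in $J_k(e^\mu)$, which is why the constraint can be dropped by continuity.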
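We denote $\pi_\lambda^{(j)} : E(\lambda) \to E^{(j)}(\lambda)$ for the projection onto $E^{(j)}(\lambda)$ along the decomposition $E(\lambda) = \bigoplus_{r=0}^{\infty} E^{(r)}(\lambda)$. Observe that
\[ \mathrm{Id}_{E(\lambda)} = \sum_{j=0}^{N_J} \pi_\lambda^{(j)} \tag{8.2} \]
if $\lambda$ is $J$-standard in view of Proposition 7.4 (i). In this section we consider the constraint on $\lambda$ such that
\[ \pi_\lambda^{(j)}\bigl(J_k\psi_\lambda^k\bigr) = \pi_\lambda^{(j)}\bigl((\partial_{\varphi^\vee} + k_\varphi)\psi_\lambda^k\bigr) \tag{8.3} \]
for the highest degree component $j = N_J$. The map $u \mapsto u^J$, where $u^J \in W_0^J$ is obtained from the unique decomposition
\[ s_\varphi u = u^J u_J, \qquad u^J \in W_0^J,\ u_J \in W_{0,J}, \tag{8.4} \]
defines an involution on $W_0^J$. Observe that
\[ (u^J)_J = (u_J)^{-1}, \qquad u \in W_0^J. \tag{8.5} \]
Recall that $\tilde{c}_k$ denotes the regularized c-function (7.6).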
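Lemma 8.3. Suppose that $\lambda \in V^*_{\mathbb{C}}$ is $J$-standard.
(i) Equation (8.3) for $j = N_J$ holds if and only if $\lambda$ satisfies the equations
$$c_k(s_\varphi u\lambda)\bigl(u\lambda(\varphi^\vee)-k_\varphi\bigr)\,e^{-u\lambda(\varphi^\vee)}\,(-1)^{l(u_J)} = c_k(u\lambda)\bigl(u\lambda(\varphi^\vee)+k_\varphi\bigr), \qquad \forall\, u \in W_0^J. \tag{8.6}$$
(ii) For $u \in W_0^J$ and for multiplicity functions $k$ such that $c_k(u\lambda) \neq 0$, we have
$$c_k(s_\varphi u\lambda) = (-1)^{l(u_J)}\,c_k(u\lambda) \prod_{\alpha \in \Sigma_0^+ \cap s_\varphi\Sigma_0^-} \frac{u\lambda(\alpha^\vee)-k_\alpha}{u\lambda(\alpha^\vee)+k_\alpha}.$$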
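Proof. (i) By (7.3), Lemma 8.2 and Proposition 7.4(i) we have
$$\pi_\lambda^{(N_J)}\bigl(J_k\psi_\lambda^k\bigr) = -C_J^k \sum_{u \in W_0^J} c_k(u\lambda)\bigl(u\lambda(\varphi^\vee)+k_\varphi\bigr)\,e^{u\lambda(\varphi^\vee)}\,e^{s_\varphi u\lambda} \prod_{\alpha \in \Sigma_0^{J,+}} s_\varphi u\alpha,$$
$$\pi_\lambda^{(N_J)}\bigl((\partial_{\varphi^\vee}+k_\varphi)\psi_\lambda^k\bigr) = C_J^k \sum_{u \in W_0^J} c_k(u\lambda)\bigl(u\lambda(\varphi^\vee)+k_\varphi\bigr)\,e^{u\lambda} \prod_{\alpha \in \Sigma_0^{J,+}} u\alpha. \tag{8.7}$$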
The proof now follows by equating the coefficients of euλ α∈ J,+ uα (u ∈ W0J ) in 0 (8.7) using (8.4). (ii) We first compare the denominators of ck (uλ) and ck (sϕ uλ) = ck (uJ λ). If µ ∈ VC∗ is regular then ∨ −1 uJ µ(α ∨ ) = uJ µ(α ∨ ) (uu−1 J µ(β )) α∈0+ \uJ 0J,+
α∈0+
= (−1)l(uJ )
α∈0+
= (−1)l(uJ )+1
J,+ β∈uu−1 J 0
∨ sϕ uu−1 J µ(α )
∨ −1 (uu−1 J µ(β ))
β∈u0J,+
∨ uu−1 J µ(α ).
α∈0+ \u0J,+
Taking the limit µ → λ we obtain uJ λ(α ∨ ) = (−1)l(uJ )+1 α∈0+ \uJ 0J,+
uλ(α ∨ ).
α∈0+ \u0J,+
A similar (and easier) computation leads to the comparative formula ∨ J uλ(β )−kβ u λ(α ∨ ) + kα = − uλ(β ∨ )+kβ + − J,+ + + α∈0 \uJ 0
β∈0 ∩sϕ 0
uλ(α ∨ )+kα
α∈0 \u0J,+
for the numerators of ck (uλ) and ck (uJ λ). Combining both formulas leads to the desired result. Recall from Sect. 2 that BAEk is the set of purely imaginary solutions of the Bethe ansatz equations (2.10).
Proposition 8.4. Suppose that $\lambda \in V^*_{\mathbb{C}}$ is $J$-standard. Equation (8.3) for $j = N_J$ holds if and only if $\lambda \in \mathrm{BAE}_k$.
Proof. We first show that $\lambda$ is purely imaginary if $\lambda$ satisfies Eq. (8.6). Let $\mu = u\lambda$ ($u \in W_0^J$) be the element in the $W_0$-orbit of $\lambda$ having its real part in $V_+^*$. Then $c_k(\mu) \neq 0$ since the multiplicity function $k$ is strictly positive, hence (8.6) and Lemma 8.3(ii) imply
$$e^{\mu(\varphi^\vee)} = \frac{\mu(\varphi^\vee)-k_\varphi}{\mu(\varphi^\vee)+k_\varphi} \prod_{\alpha \in \Sigma_0^+ \cap s_\varphi\Sigma_0^-} \frac{\mu(\alpha^\vee)-k_\alpha}{\mu(\alpha^\vee)+k_\alpha}. \tag{8.8}$$
The modulus of the left-hand side (respectively right-hand side) of (8.8) is $\ge 1$ (respectively $\le 1$) since the real part of $\mu$ is in $V_+^*$ and the multiplicity function $k$ is strictly positive. Thus $|e^{\mu(\varphi^\vee)}| = 1$, implying that $\mu(\varphi^\vee)$ is purely imaginary. Since $\varphi^\vee = \sum_{j=1}^{n} m_j a_j^\vee$ with $m_j$ strictly positive integers and since the real part of $\mu$ lies in $V_+^*$, we conclude that $\mu(a_j^\vee)$ is purely imaginary for all co-roots $a_j^\vee$ ($j = 1, \dots, n$). This implies $\mu \in iV^*$, hence $\lambda \in iV^*$. Combined with Lemma 8.3(i) it follows that $\lambda$ satisfies (8.3) for $j = N_J$ if and only if $\lambda$ is a purely imaginary solution of Eqs. (8.6). For purely imaginary $\lambda$ we have $c_k(u\lambda) \neq 0$ for all $u \in W_0^J$ due to the strict positivity of the multiplicity function $k$. The proof now follows from Lemma 8.3(ii) and Remark 2.7.
As an immediate result we obtain the following “regular part” of Theorem 3.5. Corollary 8.5. Suppose that λ ∈ VC∗ is regular. The space E(λ)W Q is zero-dimensional or one-dimensional. It is one-dimensional if and only if λ ∈ BAEk . In that case E(λ)W Q is spanned by ψλk (3.11). Proof. By the observations at the beginning of the section it suffices to show that E(λ)W Q = {0} iff λ ∈ BAEk . Since BAEk ⊂ iV ∗ is a W0 -invariant subset and E(λ)W Q only depends on the W0 -orbit of λ, we may assume without loss of generality that λ is ∅-standard. If E(λ)W Q = {0} then (8.3) holds, hence λ ∈ BAEk by Proposition 8.4. Conversely, suppose that λ ∈ BAEk . (0) Since λ is regular we have IdE(λ) = πλ by (8.2), hence Jk ψλk = ∂ϕ ∨ + kϕ ψλk by Proposition 8.4. By Lemma 8.1 this implies Qk (a0 )ψλk = ψλk , hence 0 = ψλk ∈ E(λ)W Q. 9. The Master Function In this section we prove Proposition 2.9, which yields a parametrization of the set BAEk of purely imaginary solutions of the Bethe ansatz equations (2.10) by the weight lattice P. We first rewrite the Bethe ansatz equations (2.10) in logarithmic form. By a direct computation using the elementary identity e−2i arctan(x) =
(1 − ix)/(1 + ix)   (x ∈ R)
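the Bethe ansatz equations (2.10) for $\lambda \in iV^*$ can be rewritten as
$$-i\lambda(w\varphi^\vee) + \sum_{\alpha \in \Sigma_0} \arctan\!\Bigl(\frac{-i\lambda(\alpha^\vee)}{k_\alpha}\Bigr)\,\alpha(w\varphi^\vee) = 0 \quad \text{modulo } 2\pi\mathbb{Z} \tag{9.1}$$
for all $w \in W_0$. On the other hand, for $\mu \in P$ the gradient of the master function $S_k(\mu,\cdot) : V^* \to \mathbb{R}$ (see (2.14)) is determined by
$$\bigl(\partial_\xi S_k(\mu,\cdot)\bigr)(\eta) = \Bigl\langle \eta - 2\pi\mu + \sum_{\alpha \in \Sigma_0} \arctan\!\Bigl(\frac{\eta(\alpha^\vee)}{k_\alpha}\Bigr)\alpha,\ \xi \Bigr\rangle, \qquad \xi, \eta \in V^*. \tag{9.2}$$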
Comparing (9.1) and (9.2) yields the following result.
Lemma 9.1. We have $\lambda \in \mathrm{BAE}_k$ if and only if $\lambda = i\eta$ with $\eta \in V^*$ an extremal vector of the master function $S_k(\mu,\cdot)$ for some $\mu \in P$.
Proof. $\Sigma_0$ is an irreducible root system in $V^*$, hence $\{w\varphi \mid w \in W_0\}$ spans $V^*$. Thus $\eta \in V^*$ is an extremal vector of $S_k(\mu,\cdot)$ if and only if $\bigl(\partial_{w\varphi} S_k(\mu,\cdot)\bigr)(\eta) = 0$ for all $w \in W_0$, which by (9.2) is equivalent to
$$\eta(w\varphi^\vee) + \sum_{\alpha \in \Sigma_0} \arctan\!\Bigl(\frac{\eta(\alpha^\vee)}{k_\alpha}\Bigr)\,\alpha(w\varphi^\vee) = 2\pi\mu(w\varphi^\vee)$$
for all $w \in W_0$. Comparing to (9.1), the proof now follows from (2.3).
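As a quick numerical sanity check of the elementary identity used to pass to the logarithmic form (9.1), the following sketch compares both sides of $e^{-2i\arctan(x)} = (1-ix)/(1+ix)$ at a few real sample points. It is not part of the original argument; the function names are ours.

```python
import cmath
import math


def lhs(x: float) -> complex:
    """Left-hand side exp(-2i * arctan(x))."""
    return cmath.exp(-2j * math.atan(x))


def rhs(x: float) -> complex:
    """Right-hand side (1 - ix) / (1 + ix)."""
    return (1 - 1j * x) / (1 + 1j * x)


def max_deviation(samples) -> float:
    """Largest absolute difference between both sides over the samples."""
    return max(abs(lhs(x) - rhs(x)) for x in samples)


if __name__ == "__main__":
    print(max_deviation([-50.0, -1.0, 0.0, 0.3, 2.0, 1000.0]))
```

Both sides have modulus one for real $x$, which is why a Bethe ansatz equation between such unimodular factors can be turned into an equation between real phases modulo $2\pi\mathbb{Z}$.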
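We thus need to analyze the extrema of the master function $S_k(\mu,\cdot)$ at a given weight $\mu \in P$. Observe that the Hessian $B_\xi^k : V^* \times V^* \to \mathbb{R}$ of $S_k(\mu,\cdot)$ at $\xi \in V^*$ is independent of $\mu$, and is given explicitly by
$$B_\xi^k(\eta,\eta') = \bigl(\partial_\eta\partial_{\eta'} S_k(\mu,\cdot)\bigr)(\xi) = \langle \eta,\eta' \rangle + \frac{1}{2}\sum_{\alpha \in \Sigma_0} \frac{k_\alpha \|\alpha\|^2\,\eta(\alpha^\vee)\,\eta'(\alpha^\vee)}{k_\alpha^2 + \xi(\alpha^\vee)^2}, \qquad \eta, \eta' \in V^*. \tag{9.3}$$
By the strict positivity of the multiplicity function $k$, it follows from (9.3) that the Hessian $B_\xi^k$ is positive definite for all $\xi \in V^*$, hence $S_k(\mu,\cdot)$ is strictly convex. Furthermore, for all $\mu \in P$,
$$S_k(\mu,\xi) \ge \frac{\|\xi\|^2}{2} - 2\pi\langle \mu,\xi \rangle \to \infty, \qquad \|\xi\| \to \infty,$$
hence $S_k(\mu,\cdot)$ has a unique extremum $\hat\mu_k \in V^*$, which is a global minimum. It now follows from (9.2) that $\hat\mu_k$ ($\mu \in P$) is uniquely determined by the equation
$$\hat\mu_k + \sigma^k_{\hat\mu_k} = 2\pi\mu \tag{9.4}$$
in $V^*$, where $\sigma_\lambda^k \in V^*$ ($\lambda \in V^*$) is defined by
$$\sigma_\lambda^k = \sum_{\alpha \in \Sigma_0} \arctan\!\Bigl(\frac{\lambda(\alpha^\vee)}{k_\alpha}\Bigr)\alpha.$$
Combined with Lemma 9.1 it now follows that the map $\mu \mapsto i\hat\mu_k$ is a bijection from the weight lattice $P$ onto $\mathrm{BAE}_k$. The $W_0$-equivariance of this map is immediate from the equivariance property
$$\bigl(\partial_{w\xi} S_k(w\mu,\cdot)\bigr)(w\eta) = \bigl(\partial_\xi S_k(\mu,\cdot)\bigr)(\eta), \qquad \forall\, w \in W_0,$$
for $\xi, \eta \in V^*$ and $\mu \in P$. This completes the proof of Proposition 2.9.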
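To make the parametrization concrete, here is a minimal sketch for the rank-one case, under assumptions that are ours and not the paper's: a root system $\{\pm\alpha\}$ with a single multiplicity $k > 0$, so that pairing (9.4) with $\alpha^\vee$ gives the scalar equation $t + 4\arctan(t/k) = 2\pi m$ for $t = \hat\mu_k(\alpha^\vee)$ and $m = \mu(\alpha^\vee) \in \mathbb{Z}$. The left side is strictly increasing in $t$, which is the rank-one shadow of the strict convexity of the master function, so plain bisection finds the unique solution. The helper names are hypothetical.

```python
import math


def sigma(t: float, k: float) -> float:
    """sigma^k_lambda(alpha^vee) in rank one, with t = lambda(alpha^vee).

    The two roots +alpha and -alpha each contribute
    arctan(t/k) * alpha(alpha^vee) = 2 * arctan(t/k).
    """
    return 4.0 * math.atan(t / k)


def deformed_weight(m: int, k: float, iterations: int = 200) -> float:
    """Solve t + sigma(t, k) = 2*pi*m by bisection on a bracketing interval."""
    target = 2.0 * math.pi * m
    lo, hi = -abs(target) - 1.0, abs(target) + 1.0  # f(lo) < target < f(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + sigma(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for m in (-2, -1, 0, 1, 2):
        print(m, deformed_weight(m, k=1.0))
```

In this toy case the solution has the same sign as $m$ and satisfies $|t| \le 2\pi |m|$, in line with the qualitative picture of a deformed weight lying between the origin and $2\pi\mu$.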
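10. Moment Gaps
In this section we prove Proposition 2.10, which yields estimates for the location of the deformed weight $\hat\mu = \hat\mu_k$ compared to the parametrizing weight $\mu \in P$. In view of (9.2) and Lemma 9.1, the deformed weight $\hat\mu \in V^*$ ($\mu \in P$) is the unique solution of (9.4). The following lemma establishes the necessary bounds for $\sigma_\lambda^k$.
Lemma 10.1. For $\lambda \in V_+^*$,
$$0 \le \sigma_\lambda^k(\beta^\vee) \le \frac{h_k}{n}\,\lambda(\beta^\vee), \qquad \forall\, \beta \in \Sigma_0^+,$$
with $h_k = 2\sum_{\alpha \in \Sigma_0} k_\alpha^{-1}$.
Proof. Fix $\lambda \in V_+^*$ and $\beta \in \Sigma_0^+$. Let $\Sigma_0^\beta$ be the set of roots $\alpha \in \Sigma_0$ satisfying $\alpha(\beta^\vee) > 0$, then
$$\sigma_\lambda^k(\beta^\vee) = \sum_{\alpha \in \Sigma_0^\beta} \Bigl(\arctan\!\Bigl(\frac{\lambda(\alpha^\vee)}{k_\alpha}\Bigr) - \arctan\!\Bigl(\frac{\lambda(s_\beta\alpha^\vee)}{k_\alpha}\Bigr)\Bigr)\alpha(\beta^\vee). \tag{10.1}$$
Each term in this sum is positive, hence $\sigma_\lambda^k(\beta^\vee) \ge 0$. For the second inequality, we use the estimate for $\alpha \in \Sigma_0^\beta$,
$$\arctan\!\Bigl(\frac{\lambda(\alpha^\vee)}{k_\alpha}\Bigr) - \arctan\!\Bigl(\frac{\lambda(s_\beta(\alpha^\vee))}{k_\alpha}\Bigr) = \int_{\lambda(s_\beta(\alpha^\vee))/k_\alpha}^{\lambda(\alpha^\vee)/k_\alpha} \frac{dx}{1+x^2} \le \frac{\lambda(\beta^\vee)\,\beta(\alpha^\vee)}{k_\alpha},$$
leading to
$$\sigma_\lambda^k(\beta^\vee) \le \lambda(\beta^\vee) \sum_{\alpha \in \Sigma_0^\beta} \frac{\beta(\alpha^\vee)\,\alpha(\beta^\vee)}{k_\alpha} = \frac{\lambda(\beta^\vee)}{2} \sum_{\alpha \in \Sigma_0} \frac{\beta(\alpha^\vee)\,\alpha(\beta^\vee)}{k_\alpha} \tag{10.2}$$
in view of (10.1). Now note that
$$\xi \mapsto \sum_{\alpha \in \Sigma_0} k_\alpha^{-1}\,\xi(\alpha^\vee)\,\alpha$$
defines a $W_0$-equivariant linear map $V^* \to V^*$. By Schur's lemma it equals $C_k\,\mathrm{Id}_{V^*}$ for some constant $C_k \in \mathbb{C}$. To determine $C_k$ explicitly we fix a basis $\{e_j\}_{j=1}^{n}$ of $V$ and we write $\{\epsilon_j\}_{j=1}^{n}$ for the corresponding dual basis of $V^*$. Then
$$C_k\,n = \sum_{j=1}^{n} \sum_{\alpha \in \Sigma_0} k_\alpha^{-1}\,\epsilon_j(\alpha^\vee)\,\alpha(e_j) = h_k$$
with $h_k = 2\sum_{\alpha \in \Sigma_0} k_\alpha^{-1}$. Combined with (10.2) we obtain $\sigma_\lambda^k(\beta^\vee) \le \frac{h_k}{n}\,\lambda(\beta^\vee)$.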
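The Schur's lemma step can be checked by hand in a small example. The sketch below assumes the root system $B_2$ in $\mathbb{R}^2$ with the standard inner product (used to identify $V$ with $V^*$), short roots $\pm e_1, \pm e_2$ with multiplicity `k_short` and long roots $\pm e_1 \pm e_2$ with multiplicity `k_long`; these choices and all helper names are illustrative assumptions. It builds the matrix of $\xi \mapsto \sum_\alpha k_\alpha^{-1}\xi(\alpha^\vee)\alpha$ in exact rational arithmetic and compares it with $(h_k/n)\,\mathrm{Id}$.

```python
from fractions import Fraction

SHORT_ROOTS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
LONG_ROOTS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def coroot(alpha):
    """Coroot 2*alpha/|alpha|^2 under the identification V = V^* = R^2."""
    norm_sq = sum(a * a for a in alpha)
    return tuple(Fraction(2 * a, norm_sq) for a in alpha)


def schur_matrix(roots_with_k):
    """Matrix of xi -> sum_alpha k_alpha^{-1} xi(alpha^vee) alpha."""
    n = len(roots_with_k[0][0])
    matrix = [[Fraction(0)] * n for _ in range(n)]
    for alpha, k in roots_with_k:
        cv = coroot(alpha)
        for i in range(n):
            for j in range(n):
                matrix[i][j] += Fraction(alpha[i]) * cv[j] / k
    return matrix


def h_k(roots_with_k):
    """h_k = 2 * sum over all roots of 1/k_alpha."""
    return 2 * sum(Fraction(1) / k for _, k in roots_with_k)


def b2_data(k_short, k_long):
    """All eight roots of B_2 paired with their multiplicities."""
    return [(a, k_short) for a in SHORT_ROOTS] + [(a, k_long) for a in LONG_ROOTS]


if __name__ == "__main__":
    data = b2_data(Fraction(1, 2), Fraction(3))
    print(schur_matrix(data), h_k(data) / 2)
```

For this data the matrix is the scalar $28/3$ times the identity, which equals $h_k/n$ with $n = 2$, even though the two root lengths carry different multiplicities.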
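Corollary 10.2. Let $\mu \in P$. We have $\hat\mu_k \in V_+^*$ if and only if $\mu \in P^+$.
Proof. Let $\mu \in P$ and suppose that $\hat\mu_k \in V_+^*$. Then for all $\beta \in \Sigma_0^+$,
$$2\pi\mu(\beta^\vee) = \hat\mu_k(\beta^\vee) + \sigma^k_{\hat\mu_k}(\beta^\vee) \ge 0$$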
by Lemma 10.1, hence µ ∈ P + . Conversely, suppose that µ ∈ P + and let w ∈ W0 such that w µk ∈ V+∗ . By Proposition 2.9 this implies w " µk ∈ V+∗ . By the previous paragraph we conclude that wµ ∈ P + . + On the other hand P ∩ W0 µ = {µ}, hence wµ = µ ∈ P + and µk = w " µk ∈ V+∗ . Proposition 2.10 is now a direct consequence of Corollary 10.2 and Lemma 10.1. 11. The Pauli Principle In this section we complete the proof of Theorem 3.5 (and hence also of Theorem 2.6). In view of Proposition 8.4 and Corollary 8.5 it suffices to show the following root system analog of the Pauli principle. Proposition 11.1. If λ ∈ BAEk is singular then E(λ)W Q = {0}. For the proof of Proposition 11.1 we may assume without loss of generality that λ ∈ BAEk is J -standard (in particular, λ ∈ iV+∗ ). We write VJ∗ ⊆ V ∗ for the real sub-space spanned by the subset J of simple roots. Its complement in V is defined by VJ⊥ = {v ∈ V | ξ(v) = 0
∀ ξ ∈ VJ∗ }.
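Observe that $V_J^\perp = V$ iff $J = \emptyset$ iff $\lambda$ is regular. Consider the linear map $K_\lambda^k : V \to V$ defined by
$$K_\lambda^k(v) = v + \sum_{\alpha \in \Sigma_0} \frac{k_\alpha\,\alpha(v)\,\alpha^\vee}{k_\alpha^2 - \lambda(\alpha^\vee)^2}, \qquad v \in V.$$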
Lemma 11.2. Let λ ∈ iV ∗ be a singular J -standard solution of the Bethe ansatz equations (2.10). Then λ satisfies the constraint (N −1) (N −1) Jk ψλk = πλ J (∂ϕ ∨ + kϕ )ψλk (11.1) πλ J iff Kλk (V ) ⊆ VJ⊥ . Proof. Fix a singular J -standard solution λ ∈ iV+∗ of the Bethe ansatz equations (2.10) (in particular J = ∅). By a similar computation as in the proof of Proposition 8.4 we obtain from (7.3), Lemma 8.2 and Proposition 7.4, (N −1) πλ J (∂ϕ ∨ + kϕ )ψλk = CJk ck (uλ) uβ(auλ )euλ uα, (NJ−1)
πλ
Jk ψλ = CJk
u∈W0J
u∈W0J
β∈0J,+
ck (uλ)e−u
J λ(ϕ ∨ )
β∈0J,+
α∈0J,+ \{β} Jλ
uβ(buJ λ )eu
uJ u J α
α∈0J,+ \{β}
with vectors aµ , bµ ∈ VC (µ ∈ VC∗ ) given by aµ = (µ(ϕ ∨ ) + kϕ )ρµk + ϕ ∨ , bµ = µ(ϕ ∨ ) − kϕ ρskϕ µ + ϕ ∨ − ϕ ∨ , where we have used the involution on W0J defined by (8.4), as well as (8.5). For u ∈ W0J we have uβ(buJ λ ) J uβ(buJ λ ) uJ uJ α = (−1)l(uJ ) u α uJ uJ β J,+ J,+ J,+ J,+ β∈0
α∈0
\{β}
β∈0
α∈0
=
1 (−1)l(uJ ) 2 J β∈0
= (−1)l(uJ )
uu−1 J β(buJ λ ) uJ β
uu−1 J β(buJ λ )
β∈0J,+
uJ α
α∈0J,+
uJ α.
α∈0J,+ \{β}
Consequently (11.1) is equivalent to ∨
ck (uλ)uβ(auλ ) = (−1)l(uJ ) ck (uJ λ)e−uλ(ϕ ) sϕ uβ(buλ ),
∀ u ∈ W0J , ∀ β ∈ 0J,+ .
Since λ is a solution of the Bethe ansatz equations (see (8.6) for the convenient equivalent form of the Bethe ansatz equations) this is equivalent to uλ(ϕ ∨ ) − kϕ auλ − uλ(ϕ ∨ ) + kϕ sϕ buλ ∈ u(VJ⊥ ), ∀ u ∈ W0J . (11.2) Note that (11.2) only depends on the coset uW0,J (u ∈ W0J ). Using the explicit expressions for auλ and buλ we can rewrite (11.2) as $ # ∨ )2 − k 2 −2k wλ(ϕ −1 k ϕ ϕ w ρwλ −w−1 sϕ ρskϕ wλ + w−1 ϕ ∨ ∈ VJ⊥ , ∀ w ∈ W0 . wλ(ϕ)2 − kϕ2 (11.3) We match (11.3) to the desired condition Kλk (V ) ⊆ VJ⊥ as follows. Since 0 is an irreducible root system in V ∗ , the condition Kλk (V ) ⊆ VJ⊥ is equivalent to Kλk (w −1 ϕ ∨ ) ∈ VJ⊥ for all w ∈ W0 , which in turn is equivalent to (11.3) if $ # wλ(ϕ ∨ )2 − kϕ2 − 2kϕ −1 k k −1 ∨ −1 k Kλ (w ϕ ) = w ρwλ − w sϕ ρsϕ wλ + w−1 ϕ ∨ wλ(ϕ)2 − kϕ2 (11.4) for all w ∈ W0 . To prove (11.4) we first observe that k −2 sϕ ρskϕ wλ = ρwλ
α∈0+ ∩sϕ 0−
kα α ∨ kα2 − wλ(α ∨ )2
by the explicit expression (7.9) for ρµk . Using (2.13) this can be rewritten as k − w −1 sϕ ρskϕ wλ = 2 w−1 ρwλ
kα α(ϕ ∨ )w −1 α ∨ kϕ w −1 ϕ ∨ + 2 . wλ(ϕ ∨ )2 − kϕ2 kα2 − wλ(α ∨ )2 + α∈0
The second term can be rewritten as 2
kα α(ϕ ∨ )w −1 α ∨ kα α(ϕ ∨ )w −1 α ∨ = kα2 − wλ(α ∨ )2 kα2 − wλ(α ∨ )2 + α∈0
α∈0
= =
kα α(w −1 ϕ ∨ ) kα2 − λ(α ∨ )2
α∈0 Kλk (w −1 ϕ ∨ ) − w −1 ϕ ∨ .
Combining the latter two formulas yields (11.4).
It follows from (9.3) that
$$B^k_{-i\lambda}(\eta_v, \eta_{v'}) = \langle K_\lambda^k(v), v' \rangle, \qquad v, v' \in V,$$
with $\eta_v = \langle v, \cdot \rangle \in V^*$ and $B^k_{-i\lambda}$ the Hessian of the master function $S_k$ at $-i\lambda \in V^*$. Since $B^k_{-i\lambda}$ is positive definite, $K_\lambda^k : V \xrightarrow{\ \sim\ } V$ is a linear isomorphism. Proposition 11.1 thus is an immediate consequence of Lemma 11.2.
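The positivity argument can be illustrated numerically. The sketch below assumes the root system $A_2$ realized in $\mathbb{R}^2$ with roots of squared length 2 (so each coroot coincides with its root under the identification of $V$ with $V^*$), a single multiplicity $k > 0$, and $\lambda = i\eta$ with $\eta$ real, so that $k^2 - \lambda(\alpha^\vee)^2 = k^2 + \eta(\alpha^\vee)^2 > 0$. These choices and the helper names are illustrative assumptions, not taken from the text.

```python
import math

# Positive roots of A_2 with squared length 2 (assumed realization).
A2_POSITIVE = [
    (math.sqrt(2.0), 0.0),
    (-math.sqrt(2.0) / 2.0, math.sqrt(6.0) / 2.0),
    (math.sqrt(2.0) / 2.0, math.sqrt(6.0) / 2.0),
]


def k_matrix(eta, k):
    """Matrix of K^k_lambda on V = R^2 for lambda = i*eta and multiplicity k."""
    K = [[1.0, 0.0], [0.0, 1.0]]
    for a in A2_POSITIVE:
        for sign in (1.0, -1.0):
            alpha = (sign * a[0], sign * a[1])
            # |alpha|^2 = 2, so the coroot alpha^vee is the same vector.
            pairing = eta[0] * alpha[0] + eta[1] * alpha[1]  # eta(alpha^vee)
            weight = k / (k * k + pairing * pairing)
            for i in range(2):
                for j in range(2):
                    K[i][j] += weight * alpha[i] * alpha[j]
    return K


def determinant(K):
    """Determinant of a 2x2 matrix."""
    return K[0][0] * K[1][1] - K[0][1] * K[1][0]


if __name__ == "__main__":
    # eta = (0, 1) is orthogonal to the first root, i.e. a singular direction.
    K = k_matrix((0.0, 1.0), 0.7)
    print(K, determinant(K))
```

Since $K_\lambda^k$ is the identity plus a sum of positive multiples of rank-one projections, its determinant exceeds 1 here, and $K_\lambda^k$ is invertible also when $\eta$ lies on a wall.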
Acknowledgements. This research was supported in part by a Pionier grant of the Netherlands Organization for Scientific Research (NWO) (Emsiz and Opdam (program coordinator)). Stokman was supported by the Royal Netherlands Academy of Arts and Sciences (KNAW) and by the Netherlands Organization for Scientific Research (NWO) in the VIDI-project “Symmetry and modularity in exactly solvable models”.
References
1. Buchstaber, V.M., Felder, G., Veselov, A.P.: Elliptic Dunkl operators, root systems, and functional equations. Duke Math. J. 76, 885–911 (1994)
2. Cherednik, I.: A unification of Knizhnik-Zamolodchikov and Dunkl operators via affine Hecke algebras. Invent. Math. 106, 411–431 (1991)
3. Cherednik, I.: Inverse Harish-Chandra transform and difference operators. Internat. Math. Res. Notices 1997, no. 15, 733–750
4. Cherednik, I.: Intertwining operators of double affine Hecke algebras. Selecta Math. (N.S.) 3, no. 4, 459–495 (1997)
5. Dorlas, T.C.: Orthogonality and completeness of the Bethe ansatz eigenstates of the nonlinear Schroedinger model. Commun. Math. Phys. 154, 347–376 (1993)
6. Dunkl, C.F.: Differential-difference operators associated to reflection groups. Trans. Amer. Math. Soc. 311, no. 1, 167–183 (1989)
7. Gaudin, M.: Boundary energy of a Bose gas in one dimension. Phys. Rev. A 4, no. 1, 386–394 (1971)
8. Gaudin, M.: La fonction d’Onde de Bethe. Collection du Commissariat à l’Énergie Atomique: Série Scientifique. Paris: Masson, 1983
9. Girardeau, M.: Relationship between systems of impenetrable bosons and fermions in one dimension. J. Math. Phys. 1, no. 6, 516–523 (1960)
10. Gutkin, E., Sutherland, B.: Completely integrable systems and groups generated by reflections. Proc. Natl. Acad. Sci. USA 76, no. 12, 6057–6059 (1979)
11. Gutkin, E.: Integrable systems with delta-potential. Duke Math. J. 49, no. 1, 1–21 (1982)
12. Gutkin, E.: Conservation laws for the nonlinear Schrödinger equation. Ann. Inst. Henri Poincaré 2, no. 1, 76–74 (1985)
13. Gutkin, E.: Operator calculi associated with reflection groups. Duke Math. J. 55, no. 1, 1–18 (1987)
14. Heckman, G.J.: A remark on the Dunkl differential-difference operators. In: Harmonic analysis on reductive groups (Brunswick, ME, 1989), Progr. Math., 101, Boston, MA: Birkhäuser Boston, 1991, pp 181–191
15. Heckman, G.J., Opdam, E.M.: Yang’s system of particles and Hecke algebras. Ann. of Math. (2) 145, no. 1, 139–173 (1997)
16. Hikami, K.: Notes on the δ-function interacting gas. Intertwining operator in the degenerate affine Hecke algebra. J. Phys. A: Math. Gen. 31, L85–L91 (1998)
17. Humphreys, J.E.: Reflection groups and Coxeter groups. Cambridge Studies in Adv. Math. 29, Cambridge: Cambridge Univ. Press (1990)
18. Izergin, A.G., Korepin, V.E.: The Pauli principle for one-dimensional bosons and the algebraic Bethe ansatz. Lett. Math. Phys. 6, 283–289 (1982)
19. Komori, Y., Hikami, K.: Nonlinear Schrödinger model with boundary, integrability and scattering matrix based on the degenerate affine Hecke algebra. Int. J. Mod. Phys. A 12, no. 3, 5397–5410 (1997)
20. Korepin, V.E.: Calculation of Bethe wave functions. Commun. Math. Phys. 86, 391–418 (1982)
21. Korepin, V.E., Faddeev, L.D.: Quantization of solitons. Theor. Math. Phys. 25, 1039–1049 (1975)
22. Lieb, E.H., Liniger, W.: Exact analysis of an interacting Bose gas. I. The general solution and the ground state. Phys. Rev. (2), 130, 1605–1616 (1963)
23. Lusztig, G.: Affine Hecke algebras and their graded version. J. Amer. Math. Soc. 2, no. 3, 599–635 (1989)
24. Murakami, S., Wadati, M.: Connection between Yangian symmetry and the quantum inverse scattering method. J. Phys. A: Math. Gen. 29, 7903–7915 (1996)
25. Olshanetsky, M.A., Perelomov, A.M.: Quantum integrable systems related to Lie algebras. Physics Reports 94, no. 6, 313–404 (1983)
26. Opdam, E.M.: Harmonic analysis for certain representations of graded Hecke algebras. Acta Math. 175, no. 1, 75–121 (1995)
27. Opdam, E.M.: Lecture notes on Dunkl operators for real and complex reflection groups. MSJ Memoirs, 8. Tokyo: Mathematical Society of Japan, 2000
28. Polychronakos, A.P.: Exchange operator formalism for integrable systems of particles. Phys. Rev. Lett. 69, 703–705 (1992)
29. Sklyanin, E.K.: Boundary conditions for integrable quantum systems. J. Phys. A 21, no. 10, 2375–2389 (1988)
30. Steinberg, R.: Differential equations invariant under finite reflection groups. Trans. Amer. Math. Soc. 112, 392–400 (1964)
31. Sutherland, B.: Nondiffractive scattering: Scattering from kaleidoscopes. J. Math. Phys. 21, no. 7, 1770–1775 (1980)
32. Yang, C.N.: Some exact results for the many-body problem in one dimension with repulsive delta-function interaction. Phys. Rev. Lett. 19, 1312–1315 (1967)
33. Yang, C.N., Yang, C.P.: Thermodynamics of a one-dimensional system of bosons with repulsive delta-function interaction. J. Math. Phys. 10, 1115–1122 (1969)
Communicated by L. Takhtajan
Commun. Math. Phys. 264, 227–253 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1527-6
Communications in
Mathematical Physics
Computation of Superpotentials for D-Branes
Paul S. Aspinwall¹,², Sheldon Katz³
¹ Department of Physics and SLAC, Stanford University, Stanford, CA 94305/94309, USA. E-mail: [email protected]
² Center for Geometry and Theoretical Physics, Box 90318, Duke University, Durham, NC 27708-0318, USA
³ Departments of Mathematics and Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Received: 30 March 2005 / Accepted: 13 August 2005 Published online: 23 February 2006 – © Springer-Verlag 2006
Abstract: We present a general method for the computation of tree-level superpotentials for the world-volume theory of B-type D-branes. This includes quiver gauge theories in the case that the D-brane is marginally stable. The technique involves analyzing the A∞ -structure inherent in the derived category of coherent sheaves. This effectively gives a practical method of computing correlation functions in holomorphic Chern–Simons theory. As an example, we give a more rigorous proof of previous results concerning 3-branes on certain singularities including conifolds. We also provide a new example.
1. Introduction Consider a type II superstring compactification on a Calabi–Yau threefold X. BPS D-branes that “wrap cycles” within X and fill the noncompact spacetime give rise to an effective N = 1, d = 4 supersymmetric gauge theory arising from the D-brane world-volume. As is well-known [1], if N irreducible D-branes wrap the same cycle, one obtains a model with U(N ) gauge symmetry. One may obtain more general supersymmetric gauge theories in the form of quiver gauge theories by considering marginally stable D-branes. Suppose a given D-brane (which may consist of multiple copies of some irreducible D-branes) is marginally stable with respect to decay into N1 copies of some (irreducible) D-brane plus N2 copies of another D-brane, etc., then one obtains a gauge theory with gauge group U(N1 ) × U(N2 ) × · · · . The fact that the given D-brane is marginally stable means that there will be massless open strings between the decay products. These strings give rise to massless chiral supermultiplets in bifundamental (N1 , N2 ) representations, etc. These N = 1, d = 4 supersymmetric gauge theories will, in general, have a nontrivial superpotential expressible as a function of the chiral superfields. The purpose of this paper is to describe a systematic and general method for the computation of this superpotential at tree level directly from the algebraic geometry of X.
There are two types of BPS D-branes on a Calabi–Yau threefold — the so-called A-type and B-type. The A-type D-branes are described by special Lagrangian cycles within X [2] and are, in principle, described completely by the language of the Fukaya category [3–5]. Having said that, the Fukaya category is extremely difficult to deal with explicitly for any example of a Calabi–Yau threefold. The other “B-type” D-branes are described by D(X), the derived category of coherent sheaves on X [6–9]. While, at first sight, the derived category may appear to be mathematically formidable, it is actually very useful for direct computations in any given example. In this paper we therefore focus on the problem of computing the superpotential for B-type D-branes. The easiest route for computing superpotentials for A-branes in many situations is by reducing to the B-brane case by using mirror symmetry. Various methods leading to proposals for superpotentials in several examples have already appeared in the literature [10–16]. Each of these papers has used a somewhat indirect approach to analyzing the superpotential. Here we will give a very direct method that can be applied for any collection of B-type D-branes on any Calabi–Yau manifold. It uses similar ideas to those used in demonstrations of homological mirror symmetry given in [17]. In a recent paper [18] this problem was studied in the Landau–Ginzburg phase of the B-model. It is believed that this should give the same result as in the Calabi–Yau phase, i.e., the case of interest here. The only worry would be that, in current understanding, the Landau–Ginzburg analysis appears to collapse the derived category somewhat by identifying objects under a shift of 2. This might change the results of computations of superpotentials in some cases. It has long been known [19] that the information concerning the superpotential is contained in a holomorphic Chern–Simons theory of the Calabi–Yau threefold in question. 
The propagator in this field theory appears to require a complete knowledge of the metric of X and so cannot be computed in general. Here we recast the holomorphic Chern–Simons theory in a form described purely by homological algebra and algebraic geometry. The logic of this argument is very similar to that used when arguing that B-branes, which are originally described by Dolbeault cohomology, are ultimately described by D(X). Indeed, all we need do is to supplement this argument by a product structure. The wedge product of Dolbeault cohomology becomes a “composition” product in D(X). Having done this change of language to algebraic geometry one can then use Čech cohomology and a knowledge of locally-free resolutions to perform a computation of the superpotential. Our method applies, in principle, to a computation involving any D-brane (i.e., any object in the derived category) on any Calabi–Yau threefold. The only obstacle in general to the computation is the stamina required to compute Čech cohomology when several patches are required, and dealing with potentially long locally-free resolutions. All the technical machinery of computing superpotentials is tied up in the A∞-algebra language (see, for example, [20, 21]). We therefore begin with a review of the required facts in Sect. 2. In Sect. 3 we discuss general features of the way that the superpotential is described by correlation functions in the topological B-model. An interesting result is that, although one can define a generalized superpotential on the “thickened” moduli space of the topological field theory, it essentially contains no more information than the physical superpotential. We also discuss the uniqueness of the superpotential computed by the topological field theory.
In Sect. 4 we show how holomorphic Chern–Simons theory can be restated in terms more appropriate to algebraic geometry and in Sect. 5 we compute some examples. We are able to verify some results concerning 3-branes on conifold singularities. We also compute a new result based on a 5-brane wrapping a particular $\mathbb{P}^1$.
2. A∞ Algebras and Categories
The superpotential in these N = 1 gauge theories is intimately related to the structure of an A∞ algebra (in the case of a single stable D-brane) or category (in the case of a quiver). This has been discussed in [14, 22–24]. We begin, therefore, with a review of A∞ algebras following [20, 21]. Let $V$ be a vector space with a $\mathbb{Z}$-grading and let $T(V)$ be the resulting graded tensor algebra
$$T(V) = \bigoplus_{n=1}^{\infty} V^{\otimes n}. \tag{1}$$
If $a \in V$, we will denote the grade of $a$ by $|a|$. By the usual abuse of notation we will often write $(-1)^a$ rather than $(-1)^{|a|}$. If $f$ and $g$ are operators of given degrees, we use the rule
$$(f \otimes g)(a \otimes b) = (-1)^{|g|\cdot|a|} f(a) \otimes g(b). \tag{2}$$
Now let $d$ be a derivative with degree 1, with respect to the grading, acting on $T(V)$ obeying the graded Leibniz rule
$$d(a \otimes b) = d(a) \otimes b + (-1)^a\, a \otimes d(b). \tag{3}$$
We also demand
$$d^2 = 0. \tag{4}$$
The Leibniz rule (3) means that $d$ is entirely determined by its restriction to $V$. Let us denote this restriction as $(d)_V$. One can then decompose
$$(d)_V = d_1 + d_2 + \cdots, \tag{5}$$
where
$$d_k : V \to V^{\otimes k}. \tag{6}$$
Let $V[1]$ denote the vector space $V$ with all the grades decreased by one and let $s : V \to V[1]$ be the obvious map of degree $-1$. We can now define our A∞ algebra $A$:
$$A = (V[1])^*, \tag{7}$$
together with its higher products
$$m_k : A^{\otimes k} \to A, \tag{8}$$
given by the dual of $s^{\otimes k} \cdot d_k \cdot s^{-1}$. The map $m_k$ thus has degree $2 - k$.¹
¹ The grades of vector spaces are negated upon dualizing. Thus, when a map between vector spaces is dualized, its direction is reversed but its degree remains the same.
The condition (4) then becomes equivalent to [25]
$$\sum_{r+s+t=n} (-1)^{r+st}\, m_u\bigl(1^{\otimes r} \otimes m_s \otimes 1^{\otimes t}\bigr) = 0, \tag{9}$$
for any $n > 0$, where $u = n + 1 - s$. One may view (9) as the defining relations for an A∞ algebra. It is easy to extend the idea of an A∞-algebra to an A∞-category [26]. Such a category consists of objects and morphisms in the usual way except that morphisms, under k-fold compositions, satisfy the relations (9). In particular, an A∞-category need not be a category in the usual sense since composition of morphisms need not be associative. Now suppose we have another graded vector space $U$ with its own differential $d$ acting on $T(U)$. It is natural to consider maps $g : T(U) \to T(V)$ which commute with $d$. We impose the condition
$$g(a \otimes b) = (-1)^{|g||a|}\, g(a) \otimes g(b), \tag{10}$$
so that such maps are defined completely by their restriction to $U$. If $B$ is the A∞-algebra constructed from $U$, such a $g$ gives rise to an “A∞-morphism” given by maps
$$f_k : A^{\otimes k} \to B, \tag{11}$$
constructed in the obvious way from $g$ above. The condition that $g$ commutes with $d$ then becomes
$$\sum_{r+s+t=n} (-1)^{r+st}\, f_u\bigl(1^{\otimes r} \otimes m_s \otimes 1^{\otimes t}\bigr) = \sum_{\substack{1 \le r \le n \\ i_1 + \cdots + i_r = n}} (-1)^q\, m_r\bigl(f_{i_1} \otimes f_{i_2} \otimes \cdots \otimes f_{i_r}\bigr), \tag{12}$$
for any $n > 0$ and $u = n + 1 - s$ again. The sign on the right is given by
$$q = (r-1)(i_1 - 1) + (r-2)(i_2 - 1) + \cdots + (i_{r-1} - 1). \tag{13}$$
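The bookkeeping in the defining relations (9) can be made explicit with a short enumeration. The sketch below lists, for a given $n$, every triple $(r, s, t)$ with $r + s + t = n$, $s \ge 1$, together with the sign $(-1)^{r+st}$ and the index $u = n + 1 - s$ of the outer product; the function name and the tuple layout are our own conventions.

```python
def stasheff_terms(n):
    """Terms (sign, r, s, t, u) of relation (9) for a given n >= 1.

    Each tuple stands for sign * m_u(1^{(x)r} (x) m_s (x) 1^{(x)t}).
    """
    terms = []
    for s in range(1, n + 1):
        for r in range(0, n - s + 1):
            t = n - r - s
            u = n + 1 - s
            sign = (-1) ** (r + s * t)
            terms.append((sign, r, s, t, u))
    return terms


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, stasheff_terms(n))
```

For $n = 1$ the single term is $m_1 m_1 = 0$; for $n = 2$ the three terms say that $m_1$ is a derivation of $m_2$; for $n = 3$ the six terms express that $m_2$ is associative up to the homotopy $m_3$.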
Note that $m_1 : A \to A$ is a degree one map satisfying $m_1 \cdot m_1 = 0$. It thus gives $A$ the structure of a graded differential complex, and we may take cohomology to yield $H^*(A)$. By choosing representatives of each cohomology class we may define an embedding
$$i : H^*(A) \to A. \tag{14}$$
Thanks to a theorem by Kadeishvili [27], we may define an A∞ structure on $H^*(A)$ such that
1. There is an A∞ morphism $f$ from $H^*(A)$ to $A$ with $f_1$ equal to the embedding $i$.
2. $m_1 = 0$.
Here, $m_1$ refers to the A∞ structure on $H^*(A)$. This A∞ structure is not unique, but it is unique up to A∞-isomorphisms, as will be discussed in a more general situation in Lemma 1 below. An A∞-algebra with $m_1 = 0$ is called a minimal A∞-algebra. It is quite easy to construct Kadeishvili’s A∞-structure in practice. A rather simple example of an A∞-algebra is given by $m_k = 0$ for $k \ge 3$. Such an algebra is called a differential graded algebra, or dga. In this paper, we will need to put an A∞ structure
on the cohomology of a dga, which may be done explicitly as follows. Let $m_1$ on $A$ be denoted $d$, and let $m_2(a \otimes b)$ be denoted $a \cdot b$. Putting $n = 2$ in (12), and using the fact that $m_1 = 0$ in $H^*(A)$, yields
$$i\,m_2 = (i \cdot i) + d f_2. \tag{15}$$
Since $d(i \cdot i) = 0$, we must define $m_2$ on $H^*(A)$ as the cohomology class of $i \cdot i$. We may also use this to define a choice of $f_2 : H^*(A)^{\otimes 2} \to A$ (up to an element in the kernel of $d$). Next, putting $n = 3$ in (12) yields²
$$i\,m_3 = f_2(1 \otimes m_2) - f_2(m_2 \otimes 1) + (i \cdot f_2) - (f_2 \cdot i) + d f_3. \tag{16}$$
Direct computation using (15) shows that $d\bigl(f_2(1 \otimes m_2) - f_2(m_2 \otimes 1) + (i \cdot f_2) - (f_2 \cdot i)\bigr) = 0$, so as before this defines $m_3$ and allows us to choose a definition for $f_3$. Clearly this process continues and defines all the products for the A∞ algebra on $H^*(A)$. This construction of the A∞ algebra on $H^*(A)$ may be rephrased following [26, 28] in a language which will also be useful to us. Suppose we define a projection $p : A \to H^*(A)$ such that $p \circ i = 1$ and furthermore assume that we have a map $H : A \to A$ of degree $-1$ such that $1 - i \circ p = dH + Hd$. Clearly, $m_2$ is defined as $p \circ (i \cdot i)$ as before. For $k > 2$, we then define
$$m_k = \sum_T \pm\, m_{k,T}, \tag{17}$$
where the sum is over all trees $T$ with $k$ branch tips at the top and one root. These trees look like Feynman diagrams of a $\phi^3$ field theory and are computed accordingly, with $m_2$ of $A$ acting as the cubic coupling and $H$ acting as the propagator. We refer to [26] for more details.
We say that two dga’s $A$ and $B$ are quasi-isomorphic if there is a homomorphism of dga’s $g : A \to B$ (i.e. preserving the respective products and commuting with the differentials) inducing an isomorphism on the respective cohomologies, which can then be identified. A simple extension of the “uniqueness up to A∞-isomorphism” part of the above construction proves the following.
Lemma 1. Suppose that $A$ and $B$ are quasi-isomorphic dga’s, determining A∞ structures on $H^*(A) \cong H^*(B)$ as above. Then these two A∞ algebras are A∞-isomorphic.
Before describing the simple proof, we recall that A∞-quasi-isomorphisms are simply A∞-morphisms which are quasi-isomorphisms, i.e. which induce isomorphisms between the respective $m_1$-cohomologies. We will need the result that A∞-quasi-isomorphisms have homotopy inverses, see [25] and the references therein. We also need the result that an A∞-morphism $f$ between minimal A∞-algebras is an isomorphism if and only if $f_1$ is an isomorphism. Let $f$ be an A∞-quasi-isomorphism from $H^*(A)$ to $A$ and $g$ be an A∞-quasi-isomorphism from $H^*(B)$ to $B$ as in Kadeishvili’s theorem. Let $\phi : A \to B$ be the given quasi-isomorphism of dga’s, which, viewing $A$ and $B$ as A∞-algebras, can be viewed as describing an A∞-quasi-isomorphism. Let $h$ be a homotopy inverse of $g$. Then $r = h \circ \phi \circ f$ is an A∞-morphism from $H^*(A)$ to $H^*(B)$. Here $\circ$ denotes the composition of A∞-morphisms, see e.g. [25]. But $r_1 : H^*(A) \to H^*(B)$ is an isomorphism by definition since $\phi$ is a quasi-isomorphism, while $H^*(A)$ and $H^*(B)$ are minimal A∞-algebras. Hence $r$ is an isomorphism by the discussion above.
² There appears to be a typo in [25] discarding too many terms.
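The tree sum (17) of Sect. 2 can be made concrete for a dga, where only the cubic vertex $m_2$ appears and so, following the $\phi^3$ analogy, we assume that only planar binary trees contribute. The sketch below enumerates those trees for $k$ inputs and writes out the corresponding unsigned expression: leaves carry the embedding $i$, every internal edge carries the propagator $H$, and the root carries the projection $p$. The signs in (17) are deliberately omitted, and the function names and string format are illustrative choices of ours.

```python
def binary_trees(k):
    """All planar binary rooted trees with k leaves, as nested tuples.

    A leaf is represented by the string "i".
    """
    if k == 1:
        return ["i"]
    trees = []
    for left_size in range(1, k):
        for left in binary_trees(left_size):
            for right in binary_trees(k - left_size):
                trees.append((left, right))
    return trees


def expression(tree, root=True):
    """Unsigned expression m_{k,T}: p at the root, H on internal edges."""
    if tree == "i":
        return "i"
    left, right = tree
    inner = f"m2({expression(left, False)}, {expression(right, False)})"
    return f"p({inner})" if root else f"H({inner})"


if __name__ == "__main__":
    for tree in binary_trees(3):
        print(expression(tree))
```

For $k = 3$ this produces the two contributions $p(m_2(i, H(m_2(i, i))))$ and $p(m_2(H(m_2(i, i)), i))$; in general the number of such trees with $k$ leaves is the Catalan number $C_{k-1}$.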
3. D-Branes and Superpotentials
To begin with, assume we have a D-brane that consists of a vector bundle $E \to X$. This was the case studied by Witten in [19]. The open strings in the B-model correspond to elements of Dolbeault cohomology $H^{0,q}_{\bar\partial}(X, \mathrm{End}(E))$. The vertex operators corresponding to $q = 0$ yield massless vector bosons in the uncompactified 4 dimensions which give rise to a gauge theory. The vector bundle $E$ is said to be simple if $\mathrm{Hom}(E, E) = \mathrm{End}(E) = \mathbb{C}$. In this case we have one vector boson and the gauge group is U(1). Similarly if $E$ is $(E_0)^{\oplus N}$, where $E_0$ is simple, then the gauge group is U(N). The vertex operators corresponding to $q = 1$ yield massless scalars and fermions in four dimensions coming from chiral supermultiplets. These transform in the adjoint representation of U(N). Let $A$ denote the Hilbert space of open string states. The effective D-brane world-volume theory contains a superpotential which is a holomorphic function of these chiral superfields. In [29] it was shown that this superpotential could be computed in terms of correlation functions in the associated topological quantum field theory. The result is as follows. The open strings are associated to local vertex operators $\psi_i$ in the topological field theory. These $\psi_i$’s may be viewed as a basis for $A$. To each such vertex operator, one may construct a 1-form operator
$$\psi_i^{(1)} = \frac{1}{\sqrt{2}}\Bigl[G^-_{-\frac{1}{2}} + \bar{G}^-_{-\frac{1}{2}},\ \psi_i\Bigr]. \tag{18}$$
These 1-form operators may be used to deform the topological field theory (at least to first order):
$$S \to S + \sum_i Z_i \int \psi_i^{(1)}, \tag{19}$$
where the $Z_i$ are complex numbers as far as the topological field theory is concerned. The $Z_i$ are (the scalar components of) chiral superfields in the effective world-volume theory. The deformations (19) correspond to giving vacuum expectation values to these fields. Thus, the chiral superfields are naturally dual to the vertex operators of the topological quantum field theory. Let $q_i \in \mathbb{Z}$ denote the ghost number of $\psi_i$. Then $\psi_i^{(1)}$ has ghost number $q_i - 1$. The only operators that can be used to deform the untwisted conformal field theory associated to the D-brane must have $q_i = 1$. We would like to extend our discussion to the “thickened” moduli space of [30]. In this picture, all ghost numbers are allowed. This gives rise to a generalized space of chiral superfields where $Z_i$ has a grade $1 - q_i$. We also have a generalized superpotential $W$ which is a function of all the $Z_i$’s. Always remember, though, that only the fields of grade zero are true chiral superfields. Following the conventions of [24], we define correlation functions for $k + 1$ open string vertex operators:
$$B_{i_0,i_1,\dots,i_k} = (-1)^{\zeta_1+\zeta_2+\cdots+\zeta_{k-1}} \Bigl\langle \psi_{i_0}\,\psi_{i_1}\,P\!\int\!\psi_{i_2}^{(1)} \int\!\psi_{i_3}^{(1)} \cdots \int\!\psi_{i_{k-1}}^{(1)}\ \psi_{i_k} \Bigr\rangle, \tag{20}$$
where we introduce the notation $\zeta_j = 1 - q_{i_j}$. The integrals in this correlation function are over segments of the boundary so as to preserve the path ordering. A choice of regulator needs to be made in order to fully define these correlation functions as was done
Computation of Superpotentials for D-Branes
in [24]. We will avoid making such a choice, which gives rise to ambiguities that we discuss at the end of this section. It was shown in [24] that these correlators satisfy the following cyclicity property
\[ B_{i_0,i_1,\ldots,i_k} = (-1)^{\zeta_k(\zeta_0+\zeta_1+\cdots+\zeta_{k-1})}\, B_{i_k,i_0,i_1,\ldots,i_{k-1}}. \tag{21} \]
In the case of N copies of a simple D-brane, the fields $Z_i$ naturally form N × N matrices. We may now write the superpotential
\[ W = \mathrm{Tr} \sum_{k=2}^{\infty}\ \sum_{i_0,i_1,\ldots,i_k} \frac{B_{i_0,i_1,\ldots,i_k}}{k+1}\, Z_{i_0} Z_{i_1} \cdots Z_{i_k}. \tag{22} \]
Note that this trace has a graded cyclicity property consistent with (21). The correlation functions of a topological quantum field theory are subject to various constraints due to sewing conditions as discussed in [31, 32] and, in particular, [33]. The open string “pair of pants” diagram associates a bilinear product of degree 0 to A. Anticipating the connection with A∞ algebras, we denote this
\[ m_2 : A \otimes A \to A. \tag{23} \]
If X is a Calabi–Yau threefold, there is also a “trace map” of degree −3,
\[ \gamma : A \to \mathbb{C}. \tag{24} \]
It follows that our desired correlation function may be written in the form
\[ B_{i_0,i_1,\ldots,i_k} = \gamma\bigl( m_2\bigl( m_k(\psi_{i_0}, \psi_{i_1}, \ldots, \psi_{i_{k-1}}),\ \psi_{i_k} \bigr) \bigr), \tag{25} \]
for maps of degree 2 − k
\[ m_k : A^{\otimes k} \to A. \tag{26} \]
It was shown in [24] that these products do indeed obey the conditions (9) and thus give A the structure of an A∞ algebra. Comparing this structure with the description of A∞ algebras in Sect. 2, it should be clear that the chiral superfields $Z_i$ play the role of generators of the space V. The shift by one comes from (18) and the dualizing comes from (19). Since the structure of the A∞ algebra is simpler to describe in terms of T(V), it should be enlightening to rephrase the above in this language. The degree −3 pairing $\gamma(m_2(-,-))$ is non-degenerate on A and simply corresponds to Serre duality. It naturally dualizes to produce a map
\[ \eta : \mathbb{C} \to V \otimes V, \tag{27} \]
of degree −1. If $\{Z_i\}$ is a homogeneous basis for V,
\[ \eta(1) = \sum_i Z_i \otimes \hat{Z}_i, \tag{28} \]
where the $\hat{Z}_i$ are viewed as the “Serre duals” of the $Z_i$.
P.S. Aspinwall, S. Katz
We write a basis of V as follows. Let $X_1, \ldots, X_n$ be a basis of the degree 0 part of V. These are therefore the true chiral superfields in the four-dimensional theory. Assume, for now, that E is simple, i.e., Hom(E, E) = C. In other words, there is a unique “identity” vertex operator for open strings beginning and ending on E. This is dual to an element denoted e ∈ V of degree 1. Serre duality can now be used to give a basis $\hat{X}_\alpha$ of the degree −1 part of V and a generator $\hat{e}$ of the degree −2 part of V, where
\[ \eta(1) = e \otimes \hat{e} + \hat{e} \otimes e + \sum_\alpha X_\alpha \otimes \hat{X}_\alpha + \sum_\alpha \hat{X}_\alpha \otimes X_\alpha. \tag{29} \]
Viewing the superpotential W as an element of T(V) and using (22), the higher products of the A∞ algebra can be rephrased in the language of Sect. 2 as the beautifully simple statement
\[ d\hat{Z}_i = \frac{\partial W}{\partial Z_i}. \tag{30} \]
Since T(V) is the non-commutative algebra generated by the $Z_i$, some care is needed in defining the partial derivative in (30). The recipe is as follows. The cyclic trace property (21) allows W to be written with any of the generators at the front. $\partial W/\partial Z_i$ is then defined as the sum of all the possible forms of W under the trace property with $Z_i$ at the front, with said $Z_i$ removed. Clearly this reduces to the usual definition of the derivative when the generators commute.

The identity vertex operator has special properties under the higher products as shown in [24]. Let $\psi_0$ be the identity operator. Then
\[ m_2(\psi_0, \psi_i) = m_2(\psi_i, \psi_0) = \psi_i, \qquad m_k(\psi_{i_1}, \psi_{i_2}, \ldots, \psi_0, \ldots) = 0 \quad \text{for } k > 2. \tag{31} \]
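The recipe for $\partial W/\partial Z_i$ described above can be made concrete in a few lines. The following is only a minimal sketch under a simplifying assumption that is not part of the text: all generators are taken to have degree 0, so that no graded signs appear. The function name `cyclic_derivative` and the encoding of W as a dictionary from cyclic words to coefficients are hypothetical choices made for illustration.

```python
from collections import defaultdict

def cyclic_derivative(potential, var):
    """Cyclic derivative of a potential written as a sum of traces of words.

    potential: dict mapping a cyclic word (tuple of generator names) to its
    coefficient.  ASSUMPTION: every generator has degree 0, so rotating a
    word under the trace costs no sign.
    Returns dW/dvar as a dict of ordinary (non-cyclic) words.
    """
    result = defaultdict(int)
    for word, coeff in potential.items():
        for k, gen in enumerate(word):
            if gen == var:
                # Rotate this occurrence of var to the front, then remove it.
                rest = word[k + 1:] + word[:k]
                result[rest] += coeff
    return {w: c for w, c in result.items() if c != 0}

# Illustrative potential of the shape Tr(BCAD - ACBD).
W = {("B", "C", "A", "D"): 1, ("A", "C", "B", "D"): -1}
dW_dA = cyclic_derivative(W, "A")      # D B C  -  C B D

# With a single generator the result is the ordinary derivative: d(X^3) = 3 X^2.
dX3 = cyclic_derivative({("X", "X", "X"): 1}, "X")
```

For genuinely graded generators each rotation would carry the sign dictated by (21); the sketch deliberately leaves that out.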
Carefully computing signs, it follows from (30) that
\[ W = \frac{1}{2} \sum_i \Bigl( (-1)^{Z_i}\, \hat{Z}_i \otimes e \otimes Z_i - \hat{Z}_i \otimes Z_i \otimes e \Bigr) + \text{terms not containing } e. \tag{32} \]
In (32) we have also dropped terms which can be deduced from the cyclicity property (21). So far we have discussed one simple D-brane. It is very easy to generalize to the case of a collection of D-branes
\[ E_1^{\oplus N_1} \oplus E_2^{\oplus N_2} \oplus \cdots, \tag{33} \]
forming a U(N_1) × U(N_2) × ⋯ quiver gauge theory, where each $E_j$ is simple. In order to form a quiver gauge theory free from tachyons or peculiar vector bosons we are required to impose [34, 35]
\[ \mathrm{Hom}(E_j, E_k) = 0 \quad \text{for } j \neq k. \tag{34} \]
This means that the only degree zero vertex operators in A remain multiples of identity maps $E_j \to E_j$. The effect of passing to a quiver gauge theory is that we must now think in terms of A∞ categories rather than algebras. This amounts to little more than bookkeeping as follows. The elements of A should be viewed as morphisms between D-branes and, as such, as elements of $H^{0,*}_{\bar\partial}(X, \mathrm{Hom}(E_i, E_j))$. All we need do is to rewrite (32) as
\[ W = \mathrm{Tr}\Bigl( \frac{1}{2} \sum_i \Bigl( (-1)^{Z_i}\, \hat{Z}_i \otimes e \otimes Z_i - \hat{Z}_i \otimes Z_i \otimes e \Bigr) + \text{terms not containing } e \Bigr), \tag{35} \]
where now the $Z_i$ are matrices. The “⊗” in (35) now implicitly includes matrix multiplication and the concept of composition of morphisms between different objects. The symbol e now refers to a square $N_j \times N_j$ matrix with entries dual to the identity operator of a given simple D-brane $E_j$. The composition of morphisms implied by the superpotential must begin and end on the same D-brane so that a trace may then be taken. This is equivalent to the statement that the superpotential is gauge invariant. Equation (30) remains valid for the quiver theory. One implicitly removes the trace and then naïvely applies the rule for differentiation we described above.

The general form of the superpotential can be further constrained. The only vertex operators appearing have grade 0, 1, 2 or 3. Thus, the $Z_i$ have grade 1, 0, −1 or −2, with e the only generator of grade 1. Now, de is of degree 2 and so must be a sum of terms in T(V) each with at least one e. The property of the identity element of an A∞ algebra thus implies
\[ de = -e \otimes e. \tag{36} \]
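Similarly, since $X_\alpha$ has degree 0, we must have
\[ dX_\alpha = X_\alpha \otimes e - e \otimes X_\alpha. \tag{37} \]
Since $\hat{X}_\alpha$ is of degree −1,
\[ d\hat{X}_\alpha = F(X_\beta) - \hat{X}_\alpha \otimes e - e \otimes \hat{X}_\alpha, \tag{38} \]
where $F(X_\beta)$ is an arbitrary function of the $X_\beta$'s. Finally $d\hat{e} = \partial W/\partial e$ and is completely determined by (35). The result is that
\[ W = \mathrm{Tr}\Bigl( W(X_\alpha) - \hat{e} \otimes e \otimes e + \sum_\alpha \bigl( \hat{X}_\alpha \otimes X_\alpha \otimes e - \hat{X}_\alpha \otimes e \otimes X_\alpha \bigr) \Bigr), \tag{39} \]
where $W(X_\alpha)$ is a completely arbitrary function of all the chiral superfields $X_\alpha$. This function is, of course, the physical superpotential. For any $Z_i \in V$, using (30) and (39) it is a simple matter to show
\[ d^2 Z_i = 0. \tag{40} \]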
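The claim $d^2 = 0$ can be checked mechanically for the generators e, $X_\alpha$ and $\hat{X}_\alpha$ from (36), (37) and (38). The sketch below is an illustration, not the computation of the text: it assumes the graded Leibniz rule $d(a \otimes b) = da \otimes b + (-1)^{|a|} a \otimes db$ on T(V), uses two degree-0 generators named `X1`, `X2` and one hatted generator `Xh1` (all hypothetical names), and picks an arbitrary sample polynomial for F. The generator $\hat{e}$ is left out.

```python
from collections import defaultdict

# Degrees of the generators (e has degree 1, the X's degree 0, X-hat degree -1).
DEG = {"e": 1, "X1": 0, "X2": 0, "Xh1": -1}

def add(p, q, scale=1):
    """Return p + scale*q for elements of T(V) stored as {word: coefficient}."""
    r = defaultdict(int, p)
    for w, c in q.items():
        r[w] += scale * c
    return {w: c for w, c in r.items() if c != 0}

def mul(p, q):
    """Tensor product in T(V): concatenate words, multiply coefficients."""
    r = defaultdict(int)
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            r[w1 + w2] += c1 * c2
    return {w: c for w, c in r.items() if c != 0}

def degree(word):
    return sum(DEG[g] for g in word)

def d(p, d_gen):
    """Extend d from generators to T(V) by the (assumed) graded Leibniz rule."""
    out = {}
    for w, c in p.items():
        for k, g in enumerate(w):
            sign = -1 if degree(w[:k]) % 2 else 1
            term = mul(mul({w[:k]: 1}, d_gen[g]), {w[k + 1:]: 1})
            out = add(out, term, c * sign)
    return out

# An arbitrary sample choice of F(X); any polynomial in X1, X2 should work.
F = {("X1", "X2"): 1, ("X2", "X2", "X1"): 3}
D_GEN = {
    "e": {("e", "e"): -1},                               # (36)
    "X1": {("X1", "e"): 1, ("e", "X1"): -1},             # (37)
    "X2": {("X2", "e"): 1, ("e", "X2"): -1},             # (37)
    "Xh1": add(F, {("Xh1", "e"): -1, ("e", "Xh1"): -1}), # (38)
}
d_squared = {g: d(D_GEN[g], D_GEN) for g in D_GEN}

# Control: flipping the sign in (36) breaks d^2 = 0 on X1.
BAD = dict(D_GEN, e={("e", "e"): 1})
bad_d_squared_X1 = d(BAD["X1"], BAD)
```

Every entry of `d_squared` comes out empty (zero) for this choice of F, in line with the statement that the relations hold for arbitrary $W(X_\alpha)$, while the control with the wrong sign in (36) does not vanish.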
There are therefore three remarkable properties of (39):
1. The A∞ relations are trivially satisfied. There is no need to go through the computation of [24].
2. The generalized superpotential associated to the thickened moduli space is determined completely by $W(X_\alpha)$, the physical superpotential on the physical moduli space.
3. The A∞ relations are satisfied for completely arbitrary $W(X_\alpha)$.
We should perhaps point out that much of the simplification we have found here is due to the constraint (34) for a physical quiver. Had we not imposed this, we could not have used the special properties of the identity operator. The A∞-morphisms are also simplified when passing to the dual language of the chiral superfields. An A∞-morphism from a theory with superfields $Z_i$ to a theory with superfields $Y_\alpha$ is simply an analytic map
\[ Y_\alpha = g_\alpha(Z_1, Z_2, \ldots). \tag{41} \]
The complicated expression (12) is restated as g commuting with d. If any $f_k$ is nonzero for k ≥ 2, this map of superfields is nonlinear. We will be using Kadeishvili’s theorem of Sect. 2 to compute the desired A∞ structure yielding the superpotential. It is important to note that this theorem only gives this structure up to an A∞-isomorphism. It follows that we will only be able to determine the superpotential up to a nonlinear change in superfields, where this nonlinear map is invertible and commutes with d.

It is not surprising that there is an ambiguity in the superpotential. From the four-dimensional field theory point of view, the topological B-model knows nothing about the kinetic term and so one is free to apply nonlinear redefinitions to the chiral superfields. From the point of view of the string worldsheet, contact terms arise from the vertex insertion points coalescing at the ends of the integration regions. Such contact terms are known to introduce ambiguities as in [36]. That these ambiguities exist is therefore not a surprise, but we have a very precise form of the ambiguity: the nonlinear redefinition of the superfields must commute with d. It would be interesting to find the physics behind this statement but we will not attempt to pursue this question here.

4. Holomorphic Chern–Simons Theory

In [19], it was shown how to exactly compute the correlation functions (20), at least for one D-brane E and for vertex operators in $H^{0,1}_{\bar\partial}(X, \mathrm{End}(E))$. One defines a holomorphic Chern–Simons theory with action
\[ S = \int_X \mathrm{Tr}\Bigl( A \wedge \bar\partial A + \frac{2}{3}\, A \wedge A \wedge A \Bigr) \wedge \Omega, \tag{42} \]
where the field A is a (0, 1)-form on X taking values in End(E), and Ω is a holomorphic (3, 0)-form on X. From this, the correlation functions are then computed as follows in the language of Sect. 3. Let A be the Hilbert space $H^{0,*}(X, \mathrm{End}(E))$. The trace map (24) is given by
\[ \gamma(a) = \int_X \mathrm{Tr}(a) \wedge \Omega, \tag{43} \]
while
\[ m_2(a, b) = a \wedge b, \tag{44} \]
where composition in End(E) is implicit. The computation of $m_k$ is then exactly as described by the tree construction at the end of Sect. 2. The propagator is, of course,
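the propagator of (42) which is given by $H = G\bar\partial^\dagger$, where G is the Green’s operator inverting the Laplacian. Since, for any differential form α, [28]
\[ \alpha = [\alpha]_{\mathrm{Harm}} + \bar\partial G \bar\partial^\dagger \alpha + G \bar\partial^\dagger \bar\partial \alpha, \tag{45} \]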
we have the following (which was also effectively noted in [14]):

Theorem 1. The correlation functions in the holomorphic Chern–Simons theory are associated with the A∞ algebra as computed in Sect. 2, where the dga is given by the Dolbeault complex of End(E)-valued (0, q)-forms together with the wedge product. The embedding, i, of $H^{0,*}(X, \mathrm{End}(E))$ into this complex is given by harmonic forms.

This formulation of holomorphic Chern–Simons theory is all very well but it is not very practical. Computing the propagator $G\bar\partial^\dagger$ would appear to require a knowledge of the metric on X. Naturally this is not in the spirit of the topological field theory. One generally expects all computations in the topological B-model to be cast in the language of algebraic geometry and thus not require detailed knowledge of X, such as its metric. The derived category program for B-branes [7] precisely does this translation to algebraic geometry as reviewed in [9]. We need to extend this argument to include product structures. What we will arrive at is an A∞-structure implicit in the derived category that has been discussed in [17, 37, 38]. Indeed, the equivalence we derive in this section was also described in these references.

The key idea is that we have three natural dga’s associated to three different cohomologies, all of which may equally be used to analyze the problem at hand. Suppose we have a holomorphic vector bundle B with a product µ : B ⊗ B → B. Let $\mathcal{B}$ be the locally-free sheaf of sections of B. In the case of interest, we will want B = End(E), and $\mathcal{B} = \mathcal{H}om(\mathcal{E}, \mathcal{E})$ with the product µ being given by composition. The useful dga’s are then:

1. The Dolbeault complex of (0, q)-forms valued in B:
\[ \cdots \xrightarrow{\bar\partial} \Gamma(\mathcal{A}^{0,q-1} \otimes B) \xrightarrow{\bar\partial} \Gamma(\mathcal{A}^{0,q} \otimes B) \xrightarrow{\bar\partial} \Gamma(\mathcal{A}^{0,q+1} \otimes B) \xrightarrow{\bar\partial} \cdots, \tag{46} \]
where Γ denotes global sections and $\mathcal{A}^{0,q}$ is the sheaf of $C^\infty$ (0, q)-forms on X. This yields Dolbeault cohomology groups $H^*_{\bar\partial}(X, B)$. The product is given by the wedge product combined with µ. Putting B = End(E), this is the description Witten originally used to formulate the B-model [19].

2. The Čech complex of Čech cochains associated to an open cover $\mathfrak{U}$ for the locally-free sheaf $\mathcal{B}$ of sections of B:
\[ \cdots \xrightarrow{\delta} \check{C}^{n-1}(\mathfrak{U}, \mathcal{B}) \xrightarrow{\delta} \check{C}^{n}(\mathfrak{U}, \mathcal{B}) \xrightarrow{\delta} \check{C}^{n+1}(\mathfrak{U}, \mathcal{B}) \xrightarrow{\delta} \cdots. \tag{47} \]
For sufficiently fine $\mathfrak{U}$, the cohomology of this complex yields the Čech cohomology groups $\check{H}^*(X, \mathcal{B})$. The product given by the cup product combined with µ yields the dga.

3. Given an injective resolution of $\mathcal{B}$:
\[ 0 \to \mathcal{B} \to \mathcal{I}^0 \xrightarrow{i_0} \mathcal{I}^1 \xrightarrow{i_1} \mathcal{I}^2 \xrightarrow{i_2} \cdots, \tag{48} \]
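we may apply the global section functor, Γ, to yield a complex
\[ \cdots \xrightarrow{\Gamma(i_{n-2})} \Gamma(\mathcal{I}^{n-1}) \xrightarrow{\Gamma(i_{n-1})} \Gamma(\mathcal{I}^{n}) \xrightarrow{\Gamma(i_{n})} \Gamma(\mathcal{I}^{n+1}) \xrightarrow{\Gamma(i_{n+1})} \cdots, \tag{49} \]
whose cohomology yields the sheaf cohomology groups $H^*(X, \mathcal{B})$. The resolution (48) extends µ naturally to a product:
\[ \mu : \mathcal{I}^p \otimes \mathcal{I}^q \to \mathcal{I}^{p+q}, \tag{50} \]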
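which gives a dga structure to (49).

There is a standard spectral sequence argument, as reviewed in [9], which shows that these three theories of cohomology are equivalent. For example, one may define the double complex
\[ E_0^{p,q} = \check{C}^p(\mathfrak{U}, \mathcal{B} \otimes \mathcal{A}^{0,q}). \tag{51} \]
To this we associate a single complex
\[ E^n = \bigoplus_{p+q=n} E_0^{p,q}, \tag{52} \]
with differential $d = \delta + (-1)^p \bar\partial$. The d-cohomology of $E^\bullet$ can be realized as the abutment of either of two spectral sequences. The first spectral sequence has an $E_1$-term obtained from the p-cohomology of (51):
\[ E_1^{p,q} = \check{H}^p(\mathcal{B} \otimes \mathcal{A}^{0,q}), \tag{53} \]
and the second spectral sequence has an $E_1$-term obtained from the q-cohomology of (51). Since the $\mathcal{A}^{0,q}$ are fine sheaves (i.e. admit partitions of unity), so are the $\mathcal{B} \otimes \mathcal{A}^{0,q}$. It follows that their higher cohomologies vanish and (53) reduces to
\[ E_1^{0,q} = \Gamma(\mathcal{B} \otimes \mathcal{A}^{0,q}) \tag{54} \]
with $E_1^{p,q} = 0$ for p > 0. Here the global sections of $\mathcal{A}^{0,q}$ are simply the global (0, q)-forms on X. This spectral sequence therefore degenerates at $E_2$ and the d-cohomology of $E^\bullet$ is isomorphic to the cohomology of $E_1^{0,\bullet}$. In other words, the chain map
\[
\begin{array}{ccccccccc}
\cdots & \xrightarrow{\bar\partial} & \Gamma(\mathcal{B} \otimes \mathcal{A}^{0,n-1}) & \xrightarrow{\bar\partial} & \Gamma(\mathcal{B} \otimes \mathcal{A}^{0,n}) & \xrightarrow{\bar\partial} & \Gamma(\mathcal{B} \otimes \mathcal{A}^{0,n+1}) & \xrightarrow{\bar\partial} & \cdots \\
 & & \big\downarrow{\scriptstyle \xi} & & \big\downarrow{\scriptstyle \xi} & & \big\downarrow{\scriptstyle \xi} & & \\
\cdots & \xrightarrow{d} & E^{n-1} & \xrightarrow{d} & E^{n} & \xrightarrow{d} & E^{n+1} & \xrightarrow{d} & \cdots
\end{array}
\tag{55}
\]
is a quasi-isomorphism. That is, ξ induces an isomorphism between the cohomology of the two complexes. Here, the ξ are the natural maps $\Gamma(\mathcal{B} \otimes \mathcal{A}^{0,k}) \to \check{C}^0(\mathfrak{U}, \mathcal{B} \otimes \mathcal{A}^{0,k}) \subset E^k$ expressing a global section in terms of the given open cover.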
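It is easy to see that ξ preserves the product structure between the two complexes too. Thus, ξ is a quasi-isomorphism of dga’s. Turning to the other spectral sequence, the fact that
\[ 0 \to \mathcal{O} \xrightarrow{\varepsilon} \mathcal{A}^{0,0} \xrightarrow{\bar\partial} \mathcal{A}^{0,1} \xrightarrow{\bar\partial} \mathcal{A}^{0,2} \xrightarrow{\bar\partial} \cdots \tag{56} \]
is exact (and remains exact upon tensoring with $\mathcal{B}$) for intersections in a suitably-chosen $\mathfrak{U}$ means that the q-cohomology of (51) is 0 unless q = 0, in which case we simply get $\check{C}^p(\mathfrak{U}, \mathcal{B})$. Thus this spectral sequence degenerates as well, and the cohomology of $\check{C}^\bullet(\mathfrak{U}, \mathcal{B})$ coincides with the d-cohomology of $E^\bullet$. More precisely, the chain map
\[
\begin{array}{ccccccccc}
\cdots & \xrightarrow{\delta} & \check{C}^{n-1}(\mathfrak{U}, \mathcal{B}) & \xrightarrow{\delta} & \check{C}^{n}(\mathfrak{U}, \mathcal{B}) & \xrightarrow{\delta} & \check{C}^{n+1}(\mathfrak{U}, \mathcal{B}) & \xrightarrow{\delta} & \cdots \\
 & & \big\downarrow{\scriptstyle \varepsilon} & & \big\downarrow{\scriptstyle \varepsilon} & & \big\downarrow{\scriptstyle \varepsilon} & & \\
\cdots & \xrightarrow{d} & E^{n-1} & \xrightarrow{d} & E^{n} & \xrightarrow{d} & E^{n+1} & \xrightarrow{d} & \cdots
\end{array}
\tag{57}
\]
gives another quasi-isomorphism of dga’s. An immediate consequence of this construction is Dolbeault’s theorem:
\[ H^{0,q}_{\bar\partial}(X, B) \cong \check{H}^q(X, \mathcal{B}). \tag{58} \]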
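We have done a little more than just prove this fact however. We have also given maps that induce this isomorphism and described how the natural product structures are also mapped. We may treat sheaf cohomology in a similar way. We use a double complex given by
\[ \tilde{E}_0^{p,q} = \check{C}^p(\mathfrak{U}, \mathcal{I}^q), \qquad d = \delta + (-1)^p i_q. \tag{59} \]
A quasi-isomorphism analogous to (57) again follows. Since the sheaves $\mathcal{I}^q$ are “flabby”, one may also show that [39]
\[ 0 \to \Gamma(\mathcal{I}^q) \xrightarrow{\rho} \check{C}^0(\mathfrak{U}, \mathcal{I}^q) \xrightarrow{\delta} \check{C}^1(\mathfrak{U}, \mathcal{I}^q) \xrightarrow{\delta} \check{C}^2(\mathfrak{U}, \mathcal{I}^q) \xrightarrow{\delta} \cdots \tag{60} \]
is exact. This gives rise to yet another quasi-isomorphism of dga’s:
\[
\begin{array}{ccccccccc}
\cdots & \xrightarrow{\Gamma(i_{n-2})} & \Gamma(\mathcal{I}^{n-1}) & \xrightarrow{\Gamma(i_{n-1})} & \Gamma(\mathcal{I}^{n}) & \xrightarrow{\Gamma(i_{n})} & \Gamma(\mathcal{I}^{n+1}) & \xrightarrow{\Gamma(i_{n+1})} & \cdots \\
 & & \big\downarrow{\scriptstyle \rho} & & \big\downarrow{\scriptstyle \rho} & & \big\downarrow{\scriptstyle \rho} & & \\
\cdots & \xrightarrow{d} & \tilde{E}^{n-1} & \xrightarrow{d} & \tilde{E}^{n} & \xrightarrow{d} & \tilde{E}^{n+1} & \xrightarrow{d} & \cdots,
\end{array}
\tag{61}
\]
where
\[ \tilde{E}^n = \bigoplus_{p+q=n} \tilde{E}_0^{p,q}. \tag{62} \]
Finally, note that the injective resolution (48) induces a quasi-isomorphism from (47) to the bottom complex of (61). Thus all of the complexes we have discussed are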
quasi-isomorphic to each other. By Lemma 1, all of the A∞-algebras we obtain are A∞-isomorphic to each other. In particular, combining this result with the discussion at the end of Sect. 3, we conclude that the superpotential of holomorphic Chern–Simons theory is independent of the metric up to field redefinitions, an expected property of the B-model.

One is therefore free to recast the formulation of the topological B-model into either Čech cohomology or sheaf cohomology. The idea that one may use sheaf cohomology leads inexorably to the appearance of the derived category D(X), as reviewed in [9]. So far we have restricted attention to a single D-brane that fills X. One extends this notion to any number of more general D-branes. The result is that a D-brane is a complex of coherent sheaves
\[ \mathcal{E}^\bullet = \Bigl( \cdots \xrightarrow{d_{n-2}} \mathcal{E}^{n-1} \xrightarrow{d_{n-1}} \mathcal{E}^{n} \xrightarrow{d_{n}} \mathcal{E}^{n+1} \xrightarrow{d_{n+1}} \cdots \Bigr). \tag{63} \]
For the analysis of open strings between $\mathcal{E}^\bullet$ and $\mathcal{F}^\bullet$ one replaces $\mathcal{B}$ in the above discussion by the sheaf $\mathcal{H}om(\mathcal{E}^m, \mathcal{F}^n)$. One needs to extend the notation to cope with these new complexes but this is an exercise only in bookkeeping and we will spare the reader the details. The Hilbert space of open strings from $\mathcal{E}^\bullet$ to $\mathcal{F}^\bullet$ is then given by “hyperext” groups:
\[ \bigoplus_n \mathrm{Ext}^n(\mathcal{E}^\bullet, \mathcal{F}^\bullet). \tag{64} \]

Let us review exactly how to compute the A∞ structure of the morphisms between objects in D(X). Each object in D(X) is quasi-isomorphic to a complex of injective sheaves. We may view this as an injective resolution of these objects. Without loss of generality therefore, we may assume that the D-branes are given as a complex of injective sheaves. Suppose, first, for simplicity, that we have only one D-brane $\mathcal{E}^\bullet$. The first row of (61) is then given by the complex with entries
\[ \bigoplus_p \mathrm{Hom}(\mathcal{E}^p, \mathcal{E}^{p+n}). \tag{65} \]
If we denote an element of this group by $\sum_p f_{n,p}$, where $f_{n,p} : \mathcal{E}^p \to \mathcal{E}^{p+n}$, then the differential for this complex is given by
\[ d_n f_{n,p} = d_{p+n} \circ f_{n,p} - (-1)^n f_{n,p+1} \circ d_p. \tag{66} \]
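The differential (66) is easy to exercise on a toy complex of finite-dimensional vector spaces. The sketch below is only an illustration: the complex (ranks 1, 2, 1 with the differentials `D`), the helper names, and the representation of maps as nested lists are all hypothetical. It builds df from a degree-0 map f and confirms that applying the differential twice gives zero.

```python
# Toy complex E^0 -> E^1 -> E^2 of ranks 1, 2, 1 (a hypothetical example).
DIMS = [1, 2, 1]
D = {0: [[1], [-1]], 1: [[1, 1]]}   # d_p : E^p -> E^{p+1}, chosen so d_1 d_0 = 0

def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hom_differential(f, n):
    """Apply (66) to a degree-n map f = {p: matrix of f_{n,p} : E^p -> E^{p+n}}.

    Returns the degree-(n+1) map with components
        d_{p+n} o f_{n,p}  -  (-1)^n  f_{n,p+1} o d_p,
    treating components that fall outside the complex as zero.
    """
    out = {}
    top = len(DIMS) - 1
    for p in range(len(DIMS)):
        q = p + n + 1                      # target position of the new component
        if not 0 <= q <= top:
            continue
        total = zeros(DIMS[q], DIMS[p])
        if p in f and (p + n) in D:
            term = matmul(D[p + n], f[p])
            total = [[x + y for x, y in zip(r, s)] for r, s in zip(total, term)]
        if (p + 1) in f and p in D:
            term = matmul(f[p + 1], D[p])
            sign = 1 if n % 2 else -1      # this is -(-1)^n
            total = [[x + sign * y for x, y in zip(r, s)] for r, s in zip(total, term)]
        out[p] = total
    return out

f = {0: [[2]], 1: [[1, 2], [3, 4]], 2: [[5]]}   # an arbitrary degree-0 map
df = hom_differential(f, 0)
ddf = hom_differential(df, 1)
```

The same function applies to maps of any degree n; only the parity of n enters through the sign.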
For several D-branes, we write $\mathcal{E}^\bullet = \mathcal{E}_1^\bullet \oplus \mathcal{E}_2^\bullet \oplus \cdots$. The spaces of Hom’s then break up into direct sums and we may relabel everything in terms of morphisms between the different objects $\mathcal{E}_1^\bullet$, $\mathcal{E}_2^\bullet$, etc. This complex, together with the obvious product structure given by composition, gives a dga. The cohomology of this complex gives the Hilbert spaces of the various open string states. The method of Sect. 2 may then be used to compute the higher products of the resulting A∞ category and thus we find the information required for the superpotential.

5. A Practical Method

In the last section we achieved our primary goal. We rephrased the question of how to compute the superpotential into a purely algebraic one. There is no need to know
the metric on X. Having said that, the answer we obtained cannot really be viewed as a practical method of computing the higher products. This is because it required finding an injective resolution for each sheaf involved. While the existence of injective resolutions is guaranteed (see [39] for example), an explicit construction is not usually forthcoming.

Instead we should use Čech cohomology as follows. In general, there is a spectral sequence given by [footnote 3]
\[ E_2^{p,q} = H^p(X, \mathcal{E}xt^q(\mathcal{E}, \mathcal{F})), \tag{67} \]
that converges to $\mathrm{Ext}^{p+q}(\mathcal{E}, \mathcal{F})$. If $\mathcal{E}$ is locally-free, then $\mathcal{E}xt^q(\mathcal{E}, \mathcal{F}) = 0$ for q > 0 and therefore
\[ \mathrm{Ext}^n(\mathcal{E}, \mathcal{F}) = H^n(X, \mathcal{H}om(\mathcal{E}, \mathcal{F})) = \check{H}^n(X, \mathcal{H}om(\mathcal{E}, \mathcal{F})). \tag{68} \]
Footnote 3: This “local to global” spectral sequence can be viewed as the Grothendieck spectral sequence (see [40] for example) applied to the composition of the functors Γ and $\mathcal{H}om(\mathcal{E}, -)$.

Now any coherent sheaf on a smooth X has a locally-free resolution and so we are free to represent any object of D(X) by a complex of locally-free sheaves. Unlike the case of injective representations, it is usually straightforward to compute a locally-free representation of a given object in D(X). So we proceed as follows. Suppose, again, for simplicity of notation, that we have a single D-brane which is represented by a complex $\mathcal{E}^\bullet$ of locally-free sheaves. We have a complex with entries denoted:
\[ \mathcal{H}om^q(\mathcal{E}^\bullet, \mathcal{E}^\bullet) = \bigoplus_m \mathcal{H}om(\mathcal{E}^m, \mathcal{E}^{m+q}), \tag{69} \]
and a differential $d_q$ given by (66). Now build a double complex with entries
\[ \bigoplus_{p+q=n} \check{C}^p\bigl(\mathfrak{U}, \mathcal{H}om^q(\mathcal{E}^\bullet, \mathcal{E}^\bullet)\bigr), \tag{70} \]
of degree n, and differential $d = \delta + (-1)^p d_q$, where $d_q$ is given by (66).

There is a natural product, given by the Čech cup product combined with composition of maps in the $\mathcal{H}om$ sheaves. Suppose
\[ a \in \check{C}^p\bigl(\mathfrak{U}, \mathcal{H}om^q(\mathcal{E}^\bullet, \mathcal{E}^\bullet)\bigr), \qquad b \in \check{C}^r\bigl(\mathfrak{U}, \mathcal{H}om^s(\mathcal{E}^\bullet, \mathcal{E}^\bullet)\bigr), \tag{71} \]
and let us denote the natural composition a · b. This composition fails to satisfy the required Leibniz rule and instead we define a product
\[ a \star b = (-1)^{qr}\, a \cdot b. \tag{72} \]
This new product gives us the structure of a dga. By the same methods that were employed above, this dga is again quasi-isomorphic to all those considered in Sect. 4. The presentation of the dga is actually perfectly practical to use, at least in relatively simple cases as we now demonstrate.

In order to compute Čech cohomology, we need an open cover of X that is sufficiently fine. That is, we need all the open sets, and all the intersections of the open sets, to have
trivial sheaf cohomology. A sufficient condition for this is that the open sets and their intersections be affine [39]. A space is affine if it can be written as the solution set of a system of algebraic equations in $\mathbb{C}^n$, for some n. For example, consider the projective space $\mathbb{P}^n$ with homogeneous coordinates $[z_0, z_1, \ldots, z_n]$. The usual patches $U_i$, isomorphic to $\mathbb{C}^n$, are defined by $z_i \neq 0$. Let
\[ U_{i_0 i_1 \ldots i_p} = U_{i_0} \cap U_{i_1} \cap \cdots \cap U_{i_p}. \tag{73} \]
The space $U_{i_0 i_1 \ldots i_p} \cong (\mathbb{C}^*)^p \times \mathbb{C}^{n-p}$ is isomorphic to the affine variety defined by $z_{i_0} z_{i_1} \cdots z_{i_p} = 1$ in $\mathbb{C}^{n+1}$. Thus this cover is good enough for our purposes. Note that any algebraic variety defined within this $\mathbb{P}^n$ can also use this cover.

Before giving some examples, let us fix notation for Čech cochains. In our examples, our sheaves $\mathcal{F}$ will be vector bundles which have been trivialized over each $U_i$, so that sections of $\mathcal{F}$ over $U_i$ can and will be identified with tuples of functions on $U_i$. As usual, when we change trivializations, we must multiply by an appropriate transition function. For the higher cochains we will make a notational choice in describing elements of $\mathcal{F}(U_{i_0,i_1,\ldots,i_p})$ since many different trivializations are possible in general. Our choice will consistently be to choose the trivialization over $U_{i_0}$. So $(f)_{i_0,i_1,\ldots,i_p}$ denotes a section of $\mathcal{F}$ over $U_{i_0,i_1,\ldots,i_p}$, expressed as a vector of functions using the given trivialization of $\mathcal{F}$ over $U_{i_0}$. As a special case, if a 0-cochain is a global section, i.e., a 0-cocycle, then we denote it simply by f, where f denotes its expression in the $U_0$ trivialization. For example, for the sheaf $\mathcal{O}(n)$ on $\mathbb{P}^1$, $(f)_{01}$ is a 1-cochain given by f in terms of variables for $U_0$. It will therefore be given by $(f)_{01} \cdot (z_0/z_1)^n$ in terms of variables in the patch $U_1$.

5.1. The conifold point of type (−1, −1). For the first example consider a 3-brane (i.e., a point-like object in the compact directions) on a conifold point obtained by contracting a curve $C \cong \mathbb{P}^1$ with normal bundle $\mathcal{O}_C(-1) \oplus \mathcal{O}_C(-1)$. As explained in [41], this 3-brane is marginally stable with respect to decay into $\mathcal{O}_C$ and $\mathcal{O}_C(-1)[1]$. Thus, if we considered N coincident 3-branes at this conifold point, we would have a U(N) × U(N) quiver gauge theory. The superpotential for this case is known. It is computed in [10, 11] by somewhat indirect means. This will provide a useful check for our method of computation. One may also regard our computation as a more rigorous proof of the result.

To produce a local model for this case, let X be the total space of the normal bundle $\mathcal{O}_C(-1) \oplus \mathcal{O}_C(-1)$. Thus we have a bundle map π : X → C. An affine open cover of X is then given by two patches: $U_0$, with coordinates $(x, y_1, y_2)$; and $U_1$, with coordinates $(w, z_1, z_2)$. The transition functions are obviously
\[ w = x^{-1}, \qquad z_1 = x y_1, \qquad z_2 = x y_2. \tag{74} \]
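The trivialization convention fixed before Sect. 5.1 can be exercised on the simplest case, $\mathcal{O}(n)$ on $\mathbb{P}^1$ with the two-patch cover. The sketch below is an illustration under stated assumptions: it takes the $U_0$ coordinate to be $x = z_1/z_0$ and the $U_1$ coordinate to be $w = 1/x$, so that a monomial $w^j$ in the $U_1$ trivialization reads $x^{n-j}$ in the $U_0$ trivialization. Since the Čech differential is diagonal on monomials, the cohomology dimensions can be found by counting exponents; the function name and the `window` cut-off are hypothetical.

```python
def cech_dims_O_n_on_P1(n, window=64):
    """Return (h0, h1) of O(n) on P^1 for the cover {U0, U1}, by monomial counting.

    Work with Laurent monomials x^m on U01, written in the U0 trivialization:
      - sections over U0 contribute exponents m >= 0,
      - sections over U1 contribute exponents m <= n  (ASSUMED convention above).
    h0 counts exponents present on both patches (the kernel of delta);
    h1 counts exponents reached by neither (the cokernel of delta).
    """
    if abs(n) >= window:
        raise ValueError("window too small for this n")
    exponents = range(-window, window + 1)
    from_u0 = {m for m in exponents if m >= 0}
    from_u1 = {m for m in exponents if m <= n}
    h0 = len(from_u0 & from_u1)
    h1 = len(set(exponents) - (from_u0 | from_u1))
    return h0, h1

dims = {n: cech_dims_O_n_on_P1(n) for n in (-4, -2, -1, 0, 2)}
```

The counts reproduce the familiar pattern: n + 1 global sections for n ≥ 0, and a first cohomology of dimension −n − 1 for n ≤ −2.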
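Now $\mathcal{O}_C$ is not a locally-free sheaf on X. Define $\mathcal{O}(1) = \pi^* \mathcal{O}_C(1)$. We then have an exact sequence
\[
0 \to \mathcal{O}(2)
\xrightarrow[\left(\begin{smallmatrix} -z_2 \\ z_1 \end{smallmatrix}\right)]{\left(\begin{smallmatrix} -y_2 \\ y_1 \end{smallmatrix}\right)}
\mathcal{O}(1) \oplus \mathcal{O}(1)
\xrightarrow[(z_1\ z_2)]{(y_1\ y_2)}
\mathcal{O} \to \mathcal{O}_C \to 0,
\tag{75}
\]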
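where we have given the explicit sheaf maps in both patches. This provides the locally-free resolution of $\mathcal{O}_C$, and thus of $\mathcal{O}_C(-1)[1]$ too, by tensoring the resolution by $\mathcal{O}(-1)$ and shifting one place to the left. $\mathrm{Ext}^1(\mathcal{O}_C(-1)[1], \mathcal{O}_C)$ and $\mathrm{Ext}^1(\mathcal{O}_C, \mathcal{O}_C(-1)[1])$ are both isomorphic to $\mathbb{C}^2$. Thus we have a quiver with two nodes, $\mathcal{O}_C(-1)[1]$ and $\mathcal{O}_C$, two arrows a and b from $\mathcal{O}_C(-1)[1]$ to $\mathcal{O}_C$, and two arrows c and d from $\mathcal{O}_C$ to $\mathcal{O}_C(-1)[1]$. (76)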
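Open strings correspond to maps which are d-closed. It turns out that in this example we may represent all the required d-closed maps by maps which are both δ-closed and d-closed, as we now see explicitly. The classes in $\mathrm{Ext}^1(\mathcal{O}_C(-1)[1], \mathcal{O}_C)$ are represented by elements of $\check{C}^0(\mathfrak{U}, \mathcal{H}om^1(\mathcal{O}_C(-1)[1], \mathcal{O}_C))$ as follows. In each diagram the rows are the complex $\mathcal{O}(1) \to \mathcal{O} \oplus \mathcal{O} \to \mathcal{O}(-1)$ representing $\mathcal{O}_C(-1)[1]$ and the complex $\mathcal{O}(2) \to \mathcal{O}(1) \oplus \mathcal{O}(1) \to \mathcal{O}$ representing $\mathcal{O}_C$, both with horizontal maps $\left(\begin{smallmatrix} -y_2 \\ y_1 \end{smallmatrix}\right)$ and $(y_1\ y_2)$; we list the maps between the rows. Using the notation described above, let one generator of this group, denoted a, be represented by
\[
a:\quad \mathcal{O}(1) \xrightarrow{\ 1\ } \mathcal{O}(2), \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ -\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)\ } \mathcal{O}(1) \oplus \mathcal{O}(1), \qquad
\mathcal{O}(-1) \xrightarrow{\ 1\ } \mathcal{O},
\tag{77}
\]
and b by
\[
b:\quad \mathcal{O}(1) \xrightarrow{\ x\ } \mathcal{O}(2), \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ -\left(\begin{smallmatrix} x & 0 \\ 0 & x \end{smallmatrix}\right)\ } \mathcal{O}(1) \oplus \mathcal{O}(1), \qquad
\mathcal{O}(-1) \xrightarrow{\ x\ } \mathcal{O}.
\tag{78}
\]
Next, the two generators of $\mathrm{Ext}^1(\mathcal{O}_C, \mathcal{O}_C(-1)[1])$ can be represented by elements of $\check{C}^1(\mathfrak{U}, \mathcal{H}om^0(\mathcal{O}_C, \mathcal{O}_C(-1)[1]))$. Let c be represented by
\[
c:\quad \mathcal{O}(2) \xrightarrow{\ \left(\begin{smallmatrix} 0 \\ -\frac{1}{x} \end{smallmatrix}\right)_{01}\ } \mathcal{O} \oplus \mathcal{O}, \qquad
\mathcal{O}(1) \oplus \mathcal{O}(1) \xrightarrow{\ \left(\frac{1}{x}\ \ 0\right)_{01}\ } \mathcal{O}(-1),
\tag{79}
\]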
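and d by
\[
d:\quad \mathcal{O}(2) \xrightarrow{\ \left(\begin{smallmatrix} \frac{1}{x} \\ 0 \end{smallmatrix}\right)_{01}\ } \mathcal{O} \oplus \mathcal{O}, \qquad
\mathcal{O}(1) \oplus \mathcal{O}(1) \xrightarrow{\ \left(0\ \ \frac{1}{x}\right)_{01}\ } \mathcal{O}(-1).
\tag{80}
\]
Finally, the generator of $\mathrm{Ext}^3(\mathcal{O}_C(-1)[1], \mathcal{O}_C(-1)[1])$ can be represented by a 1-cochain in $\check{C}^1(\mathfrak{U}, \mathcal{H}om^2(\mathcal{O}_C(-1)[1], \mathcal{O}_C(-1)[1]))$, a map between two copies of the complex $\mathcal{O}(1) \to \mathcal{O} \oplus \mathcal{O} \to \mathcal{O}(-1)$ whose only component is
\[
\mathcal{O}(1) \xrightarrow{\ \left(\frac{1}{x}\right)_{01}\ } \mathcal{O}(-1).
\tag{81}
\]
The composition c ⋆ a (the product of (72)) gives a map
\[
c \star a:\quad \mathcal{O}(1) \xrightarrow{\ \left(\begin{smallmatrix} 0 \\ -\frac{1}{x} \end{smallmatrix}\right)_{01}\ } \mathcal{O} \oplus \mathcal{O}, \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ \left(-\frac{1}{x}\ \ 0\right)_{01}\ } \mathcal{O}(-1).
\tag{82}
\]
This is exact. To be precise, c ⋆ a is a Čech coboundary of the map which is zero in patch 0 and in patch 1 given by the chain map
\[
\mathcal{O}(1) \xrightarrow{\ \left(\begin{smallmatrix} 0 \\ -1 \end{smallmatrix}\right)_{1}\ } \mathcal{O} \oplus \mathcal{O}, \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ (-1\ \ 0)_{1}\ } \mathcal{O}(-1).
\tag{83}
\]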
Thus, from (15), $m_2(c, a) = 0$ and $f_2(c, a)$ is given by minus (83). Now compose this with b to form $b \star f_2(c, a)$, given by the Čech 0-chain
\[
\mathcal{O}(1) \xrightarrow{\ \left(\begin{smallmatrix} 0 \\ -1 \end{smallmatrix}\right)_{1}\ } \mathcal{O}(1) \oplus \mathcal{O}(1), \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ (1\ \ 0)_{1}\ } \mathcal{O}.
\tag{84}
\]
This corresponds to one of the terms needed to compute $m_3(b, c, a)$. Similarly, a computation for $f_2(b, c) \star a$ yields
\[
\mathcal{O}(1) \xrightarrow{\ \left(\begin{smallmatrix} 0 \\ -1 \end{smallmatrix}\right)_{0}\ } \mathcal{O}(1) \oplus \mathcal{O}(1), \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ (1\ \ 0)_{0}\ } \mathcal{O}.
\tag{85}
\]
Remembering the rule (2), from (16) we see that $m_3(b, c, a)$ is equal to $-b \star f_2(c, a) - f_2(b, c) \star a$ and is thus given by the following globally defined map which represents a generator of $\mathrm{Ext}^2(\mathcal{O}_C(-1)[1], \mathcal{O}_C)$:
\[
\mathcal{O}(1) \xrightarrow{\ \left(\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right)\ } \mathcal{O}(1) \oplus \mathcal{O}(1), \qquad
\mathcal{O} \oplus \mathcal{O} \xrightarrow{\ (-1\ \ 0)\ } \mathcal{O}.
\tag{86}
\]
When composed with the generator d this gives the $\mathrm{Ext}^3$ of (81), but when composed with c it gives zero. Thus $m_3(b, c, a)$ is Serre dual to d. Denoting by A the N = 1 superfield dual to a etc., we thus have a term in the superpotential equal to Tr(BCAD). Composing the other way to find $m_3(a, c, b)$ gives a similar result except for a sign. There are no other higher products and so the total is, in agreement with [10]:
\[ W = \mathrm{Tr}(BCAD - ACBD). \]
One is also free to do nonlinear field redefinitions as discussed at the end of Sect. 3.
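The chain-map conditions and compositions used in this example can be spot-checked numerically. The sketch below is a sanity check at rational sample points, not a proof. It works in the $U_0$ trivialization and takes the component matrices of a, c, the generator d and $m_3(b, c, a)$ to be those jointly forced by the chain-map conditions and the composite (82); wherever the printed diagrams are hard to read, that choice of signs is an assumption. All function names are hypothetical.

```python
from fractions import Fraction

def mm(a, b):
    """Matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def madd(a, b):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(a, b)]

def is_zero(m):
    return all(v == 0 for row in m for v in row)

def check_conifold(x, y1, y2):
    d2 = [[-y2], [y1]]                                  # first map of (75), U0 patch
    d1 = [[y1, y2]]                                     # second map of (75)
    a3, a2, a1 = [[1]], [[-1, 0], [0, -1]], [[1]]       # components of a, as in (77)
    c2, c1 = [[0], [-1 / x]], [[1 / x, 0]]              # components of c on U01 (assumed signs)
    g2, g1 = [[1 / x], [0]], [[0, 1 / x]]               # components of the generator d, (80)
    m3_left = [[0], [1]]                                # O(1) -> O(1)+O(1) part of (86)
    return all([
        is_zero(mm(d1, d2)),                            # (75) is a complex
        is_zero(madd(mm(d2, a3), mm(a2, d2))),          # a is d-closed (degree 1)
        is_zero(madd(mm(d1, a2), mm(a1, d1))),
        mm(d1, c2) == mm(c1, d2),                       # c is d-closed (degree 0)
        mm(d1, g2) == mm(g1, d2),                       # the generator d is d-closed
        mm(c2, a3) == [[0], [-1 / x]],                  # c.a reproduces (82)
        mm(c1, a2) == [[-1 / x, 0]],
        mm(g1, m3_left) == [[1 / x]],                   # d composed with m3 gives (81)
        is_zero(mm(c1, m3_left)),                       # c composed with m3 gives zero
    ])

ok = all(check_conifold(Fraction(p, 3), Fraction(q, 5), Fraction(7, 2))
         for p in (1, 2, -4) for q in (3, -1))
```

Exact rational arithmetic is used so that the equalities are tested without rounding.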
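5.2. A $\mathbb{P}^1$ with higher obstructions. Our next example is a conifold-like point associated with an obstructed $\mathbb{P}^1$ with normal bundle $\mathcal{O} \oplus \mathcal{O}(-2)$. An example of such a $\mathbb{P}^1$ can be given explicitly in patches using the transition functions
\[ w = x^{-1}, \qquad z_1 = x^2 y_1 + x y_2^n, \qquad z_2 = y_2, \tag{87} \]
with n ≥ 2 (the n = 1 case can be identified with the resolved conifold after a change of variables). The quiver for a decay of a 3-brane into $\mathcal{O}_C$ and $\mathcal{O}_C(-1)[1]$ in this case has the two nodes $\mathcal{O}_C(-1)[1]$ and $\mathcal{O}_C$, arrows a and b from $\mathcal{O}_C(-1)[1]$ to $\mathcal{O}_C$, arrows c and d from $\mathcal{O}_C$ to $\mathcal{O}_C(-1)[1]$, a loop $\mathsf{x}$ at $\mathcal{O}_C$ and a loop $\mathsf{y}$ at $\mathcal{O}_C(-1)[1]$. (88)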
A locally-free resolution of $\mathcal{O}_C$ is given by
\[
\mathcal{O}
\xrightarrow{\left(\begin{smallmatrix} y_2 \\ -1 \\ x \end{smallmatrix}\right)}
\mathcal{O} \oplus \mathcal{O}(1) \oplus \mathcal{O}(1)
\xrightarrow{\left(\begin{smallmatrix} 1 & y_2 & 0 \\ -x & 0 & y_2 \\ -y_2^{\,n-1} & -s & -y_1 \end{smallmatrix}\right)}
\mathcal{O}(1) \oplus \mathcal{O}(1) \oplus \mathcal{O}
\xrightarrow{(s\ \ y_1\ \ y_2)}
\mathcal{O} \to \mathcal{O}_C,
\tag{89}
\]
where $s = x y_1 + y_2^n$. In constructing this resolution, the bundles $\mathcal{O}(n)$ which appear were chosen so that all maps appearing in the resolution remain holomorphic in the $U_1$ patch after changing coordinates and multiplying by the transition function $x^{-n}$ of $\mathcal{O}(n)$. For example, s is given as a section of $\mathcal{O}(-1)$. In $U_1$ coordinates, this becomes $z_1 = x s$, which is holomorphic. Note that the sections $s, y_1, y_2$ have been chosen to generate the ideal of all functions vanishing on C in both patches. In $U_0$, the sections $y_1$ and $y_2$ already suffice to generate the ideal. In $U_1$, these sections become $z_1$, $w z_1 - z_2^n$, $z_2$ respectively, and now $z_1$ and $z_2$ already suffice. Note in particular that it was necessary to include the section s, as $y_1, y_2$ would not have sufficed: in $U_1$ these get identified with $w z_1 - z_2^n$ and $z_2$, which fail to generate the ideal of the curve at $(w, z_1, z_2) = (0, 0, 0)$.
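That the maps in (89) compose to zero, and that $z_1 = x s$, can be spot-checked in the $U_0$ patch with exact rational arithmetic. The sketch below is a sanity check at sample points rather than a proof of exactness; the matrices are written in the arrangement for which consecutive maps compose to zero, which is an assumption about the layout of the printed resolution, and the function names are hypothetical.

```python
from fractions import Fraction

def mm(a, b):
    """Matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_zero(m):
    return all(v == 0 for row in m for v in row)

def resolution_89(x, y1, y2, n):
    """Maps of the resolution in the U0 patch (assumed arrangement of entries)."""
    s = x * y1 + y2 ** n
    d3 = [[y2], [-1], [x]]                      # O -> O + O(1) + O(1)
    d2 = [[1, y2, 0],
          [-x, 0, y2],
          [-y2 ** (n - 1), -s, -y1]]            # -> O(1) + O(1) + O
    d1 = [[s, y1, y2]]                          # -> O
    return d3, d2, d1, s

def check_89(x, y1, y2, n):
    d3, d2, d1, s = resolution_89(x, y1, y2, n)
    z1 = x ** 2 * y1 + x * y2 ** n              # transition function (87)
    return is_zero(mm(d2, d3)) and is_zero(mm(d1, d2)) and z1 == x * s

ok = all(check_89(Fraction(3, 2), Fraction(-2, 5), Fraction(7, 3), n)
         for n in (2, 3, 5))
```

Because the entries are polynomials, agreement at a handful of generic points is strong evidence, though the identities are also immediate by hand from $s = x y_1 + y_2^n$.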
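Define $\mathsf{x}$ to be the following generator of $\mathrm{Ext}^1(\mathcal{O}_C, \mathcal{O}_C) \cong \mathbb{C}$, a map of degree 1 from the resolution (89) to itself with components
\[
\mathsf{x}:\quad
\mathcal{O} \xrightarrow{\left(\begin{smallmatrix} 1 \\ 0 \\ 0 \end{smallmatrix}\right)} \mathcal{O} \oplus \mathcal{O}(1) \oplus \mathcal{O}(1), \qquad
\mathcal{O} \oplus \mathcal{O}(1) \oplus \mathcal{O}(1) \xrightarrow{\left(\begin{smallmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ y_2^{\,n-2} & 0 & 0 \end{smallmatrix}\right)} \mathcal{O}(1) \oplus \mathcal{O}(1) \oplus \mathcal{O}, \qquad
\mathcal{O}(1) \oplus \mathcal{O}(1) \oplus \mathcal{O} \xrightarrow{(0\ \ 0\ \ 1)} \mathcal{O}.
\tag{90}
\]
From now on, for brevity, let us refer to the sheaves in the locally-free resolution (89) as $F_i$, so that the resolution reads $F_3 \to F_2 \to F_1 \to F_0$. $\mathrm{Ext}^3(\mathcal{O}_C, \mathcal{O}_C)$ is represented by the 0-cochain
\[ F_3 \xrightarrow{\ 1\ } F_0, \tag{91} \]
or, equivalently, by a Čech 1-cochain on $U_{01}$ with components $F_3 \to F_1$ and $F_2 \to F_0$ (92); of its printed entries only 0 and $\frac{1}{x}$ are recoverable here. These two choices differ by a d-boundary. We will use the representative (91) to describe the A∞-algebra via Kadeishvili’s theorem. We compute $\mathsf{x} \star \mathsf{x}$ (the product of (72)) to be $J_{n-2}$, where $J_p \in \mathrm{Ext}^2(\mathcal{O}_C, \mathcal{O}_C)$ is defined as
\[
J_p:\quad
F_3 \xrightarrow{\left(\begin{smallmatrix} 0 \\ 0 \\ y_2^{\,p} \end{smallmatrix}\right)} F_1, \qquad
F_2 \xrightarrow{(y_2^{\,p}\ \ 0\ \ 0)} F_0.
\tag{93}
\]
But, if p ≥ 1 then
\[ J_p = dK_{p-1}, \tag{94} \]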
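where $K_p$ is given by
\[
K_p:\quad
F_3 \xrightarrow{\left(\begin{smallmatrix} 0 \\ 0 \\ 0 \end{smallmatrix}\right)} F_2, \qquad
F_2 \xrightarrow{\left(\begin{smallmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ y_2^{\,p} & 0 & 0 \end{smallmatrix}\right)} F_1, \qquad
F_1 \xrightarrow{(0\ \ 0\ \ 0)} F_0.
\tag{95}
\]
It is now easy to see that
\[ J_p = \mathsf{x} \star K_p + K_p \star \mathsf{x}. \tag{96} \]
Applying (12) and using the fact that $K_i \star K_j = 0$, it follows that we can choose
\[
f_k(\mathsf{x}, \mathsf{x}, \ldots, \mathsf{x}) = (-1)^{\frac{k(k-1)}{2}} K_{n-k-1}, \qquad
m_k(\mathsf{x}, \mathsf{x}, \ldots, \mathsf{x}) = 0 \qquad \text{for } 2 \le k < n,
\tag{97}
\]
and
\[ m_n(\mathsf{x}, \mathsf{x}, \ldots, \mathsf{x}) = -(-1)^{\frac{n(n-1)}{2}} J_0. \tag{98} \]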
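But $J_0$ composed with $\mathsf{x}$ is the generator of $\mathrm{Ext}^3$ given in (91), so we have a term in the superpotential equal to $-(-1)^{\frac{n(n-1)}{2}} X^{n+1}$. Similarly we obtain a contribution $-(-1)^{\frac{n(n-1)}{2}} Y^{n+1}$ to the superpotential. The next few arrows in (88) are given by the following maps between the complex $F_3(-1) \to F_2(-1) \to F_1(-1) \to F_0(-1)$ and the complex $F_3 \to F_2 \to F_1 \to F_0$, where $\mathbf{1}$ denotes an identity matrix:
\[
a:\quad F_3(-1) \xrightarrow{\ -1\ } F_3, \qquad F_2(-1) \xrightarrow{\ \mathbf{1}\ } F_2, \qquad F_1(-1) \xrightarrow{\ -\mathbf{1}\ } F_1, \qquad F_0(-1) \xrightarrow{\ 1\ } F_0,
\tag{99}
\]
\[
b:\quad F_3(-1) \xrightarrow{\ -x\ } F_3, \qquad F_2(-1) \xrightarrow{\ x \cdot \mathbf{1}\ } F_2, \qquad F_1(-1) \xrightarrow{\ -x \cdot \mathbf{1}\ } F_1, \qquad F_0(-1) \xrightarrow{\ x\ } F_0,
\tag{100}
\]
\[
c:\quad F_3 \xrightarrow{\ 0\ } F_2(-1), \qquad
F_2 \xrightarrow{\left(\begin{smallmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & \frac{1}{x} \end{smallmatrix}\right)_{01}} F_1(-1), \qquad
F_1 \xrightarrow{\left(1\ \ \frac{1}{x}\ \ 0\right)_{01}} F_0(-1).
\tag{101}
\]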
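A new feature appears when we try to write down the final map $\mathsf{d}$. Unlike the above cases we cannot use a single map with $\delta \mathsf{d} = d\,\mathsf{d} = 0$. Instead we need to write $\mathsf{d}$ as a sum f + h, where f is a cochain in $\check{C}^1(\mathfrak{U}, \mathcal{H}om^0(\mathcal{O}_C, \mathcal{O}_C(-1)[1]))$:
\[
f:\quad F_3 \xrightarrow{\ 0\ } F_2(-1), \qquad
F_2 \xrightarrow{\left(\begin{smallmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & -\frac{1}{x} & 0 \end{smallmatrix}\right)_{01}} F_1(-1), \qquad
F_1 \xrightarrow{\left(-\frac{1}{x}\ \ 0\ \ 0\right)_{01}} F_0(-1),
\tag{102}
\]
and h is a cochain in $\check{C}^0(\mathfrak{U}, \mathcal{H}om^1(\mathcal{O}_C, \mathcal{O}_C(-1)[1]))$:
\[
h:\quad F_3 \xrightarrow{\left(\begin{smallmatrix} 0 \\ 0 \\ -1 \end{smallmatrix}\right)_{1}} F_1(-1), \qquad
F_2 \xrightarrow{(1\ \ 0\ \ 0)_{1}} F_0(-1).
\tag{103}
\]
Then the total differential of $\mathsf{d}$ is $-df + \delta h = 0$, as required. A straightforward computation, whose details we omit, then yields
\[
W = \mathrm{Tr}\Bigl( -(-1)^{\frac{n(n-1)}{2}} X^{n+1} - (-1)^{\frac{n(n-1)}{2}} Y^{n+1} - XAC + XBD - YCA + YDB \Bigr),
\tag{104}
\]
in agreement with [12] for example.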
5.3. A new example of type (1, −3). Here we consider a 5-brane wrapping a $\mathbb{P}^1$ locally given by
\[ w = x^{-1}, \qquad z_1 = x^3 y_1 + y_2^2, \qquad z_2 = x^{-1} y_2. \tag{105} \]
This curve cannot be contracted and so we do not consider 3-brane decay in this case. There are 2 massless open strings beginning and ending on this $\mathbb{P}^1$ and the moduli space is again obstructed, as we see below. It already follows from [42] that the moduli space can be defined as the critical point locus of a superpotential-like function $XY^2$, but no claim was made there that this coincides with the physical superpotential. Our computations will show that this is indeed the physical superpotential. The equation $w^2 z_1 - z_2^2 = x y_1$ shows that $y_1$ can be identified with a global section of $\mathcal{O}(-1)$. Similarly, the last equation in (105) shows that $y_2$ can be identified with a section of $\mathcal{O}(1)$.
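A resolution of $\mathcal{O}_C$ yields the following complex of locally-free sheaves representing the D-brane:
$$\mathcal{O}(-3) \xrightarrow{\left(\begin{smallmatrix} y_2 \\ -x^3 \\ -1 \end{smallmatrix}\right)} \mathcal{O}(-2) \oplus \mathcal{O} \oplus \mathcal{O}(-1) \xrightarrow{\left(\begin{smallmatrix} x^3 & y_2 & 0 \\ y_2 & -y_1 & z_1 \\ -1 & 0 & -y_2 \end{smallmatrix}\right)} \mathcal{O}(1) \oplus \mathcal{O}(-1) \oplus \mathcal{O} \xrightarrow{(y_1 \;\; y_2 \;\; z_1)} \mathcal{O}. \tag{106}$$
In (106), we have used $z_1$ as an abbreviation for its expression $x^3 y_1 + y_2^2$ in the $U_0$ patch. Write the complex (106) as $F_3 \to F_2 \to F_1 \to F_0$. Define $\mathsf{x}$ and $\mathsf{y}$ to be the following generators of $\mathrm{Ext}^1(\mathcal{O}_C, \mathcal{O}_C) \cong \mathbb{C}^2$, given by their components $F_i \to F_{i-1}$:
$$\mathsf{x}: \quad F_3 \xrightarrow{\left(\begin{smallmatrix} 1 \\ 0 \\ 0 \end{smallmatrix}\right)} F_2, \qquad F_2 \xrightarrow{\left(\begin{smallmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{smallmatrix}\right)} F_1, \qquad F_1 \xrightarrow{(0 \;\; 1 \;\; 0)} F_0, \tag{107}$$
$$\mathsf{y}: \quad F_3 \xrightarrow{\left(\begin{smallmatrix} x \\ 0 \\ 0 \end{smallmatrix}\right)} F_2, \qquad F_2 \xrightarrow{\left(\begin{smallmatrix} 0 & x & 0 \\ -x & 0 & 0 \\ 0 & 0 & -x \end{smallmatrix}\right)} F_1, \qquad F_1 \xrightarrow{(0 \;\; x \;\; 0)} F_0. \tag{108}$$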
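In the $U_0$ patch, we can simply write $\mathsf{y} = x\mathsf{x}$. Note that the entries of the above matrices remain holomorphic in the $U_1$ patch, and that this requirement prevents us from multiplying the entries by any higher powers of $x$. We compute $\mathsf{x}\mathsf{x}$ to be
$$\mathsf{x}\mathsf{x}: \quad F_3 \xrightarrow{\left(\begin{smallmatrix} 0 \\ -1 \\ 0 \end{smallmatrix}\right)} F_1, \qquad F_2 \xrightarrow{(-1 \;\; 0 \;\; 0)} F_0. \tag{109}$$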
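The composition in (109), and the fact that the maps in (106) compose to zero, can be checked mechanically. The following is a minimal sketch under stated assumptions: it takes the matrix entries of (106) and (107) to read $d_3 = (y_2, -x^3, -1)^T$, $d_2$ with rows $(x^3, y_2, 0)$, $(y_2, -y_1, z_1)$, $(-1, 0, -y_2)$, and $d_1 = (y_1\; y_2\; z_1)$ with $z_1 = x^3 y_1 + y_2^2$, it works only in the $U_0$ patch, and it tests the polynomial identities at random integer points rather than symbolically. All function and variable names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows (exact integer arithmetic)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def differentials(x, y1, y2):
    """Maps of the complex (106) in the U0 patch, as read from the text (assumption)."""
    z1 = x**3 * y1 + y2**2
    d3 = [[y2], [-x**3], [-1]]                               # F3 -> F2
    d2 = [[x**3, y2, 0], [y2, -y1, z1], [-1, 0, -y2]]        # F2 -> F1
    d1 = [[y1, y2, z1]]                                      # F1 -> F0
    return d3, d2, d1


# Components of the generator x of (107), as read from the text (assumption).
X_32 = [[1], [0], [0]]                        # F3 -> F2
X_21 = [[0, 1, 0], [-1, 0, 0], [0, 0, -1]]    # F2 -> F1
X_10 = [[0, 1, 0]]                            # F1 -> F0


def is_zero(m):
    return all(entry == 0 for row in m for entry in row)


def complex_squares_to_zero(trials=20, seed=0):
    """Check d2*d3 = 0 and d1*d2 = 0 at random integer points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y1, y2 = (rng.randint(-9, 9) for _ in range(3))
        d3, d2, d1 = differentials(x, y1, y2)
        if not (is_zero(matmul(d2, d3)) and is_zero(matmul(d1, d2))):
            return False
    return True


# Components of the product x x, to be compared with (109).
XX_31 = matmul(X_21, X_32)   # F3 -> F1
XX_20 = matmul(X_10, X_21)   # F2 -> F0
```

With these readings, `XX_31` and `XX_20` reproduce the column $(0, -1, 0)^T$ and the row $(-1\; 0\; 0)$ of (109).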
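From this, we compute immediately that $\mathsf{y}\mathsf{x} = x(\mathsf{x}\mathsf{x})$ and $\mathsf{y}\mathsf{y} = x^2(\mathsf{x}\mathsf{x})$. We now note that $\mathsf{x}\mathsf{x}$ is exact, given by $d$ applied to
$$F_3 \xrightarrow{\left(\begin{smallmatrix} 0 \\ 0 \\ 0 \end{smallmatrix}\right)} F_2, \qquad F_2 \xrightarrow{\left(\begin{smallmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{smallmatrix}\right)} F_1, \qquad F_1 \xrightarrow{(0 \;\; 0 \;\; 1)} F_0. \tag{110}$$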
Since the nonzero entries of (110) are sections of $\mathcal{O}$, they are constants, hence holomorphic in the $U_1$ patch as well. This also explains why we cannot simply multiply (110) by $x$ or $x^2$ to conclude exactness of $\mathsf{y}\mathsf{x}$ and $\mathsf{y}\mathsf{y}$, as these are not holomorphic in the $U_1$ patch. In fact, it can be checked that $\mathsf{y}\mathsf{x}$ and $\mathsf{y}\mathsf{y}$ generate $\mathrm{Ext}^2(\mathcal{O}_C, \mathcal{O}_C)$. Next, we compute $\mathsf{x}\mathsf{x}\mathsf{x}$ to be
$$\mathsf{x}\mathsf{x}\mathsf{x}: \quad F_3 \xrightarrow{\;-1\;} F_0. \tag{111}$$
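We immediately compute
$$\mathsf{y}\mathsf{x}\mathsf{x} = x\,(\mathsf{x}\mathsf{x}\mathsf{x}), \qquad \mathsf{y}\mathsf{y}\mathsf{x} = x^2\,(\mathsf{x}\mathsf{x}\mathsf{x}), \qquad \mathsf{y}\mathsf{y}\mathsf{y} = x^3\,(\mathsf{x}\mathsf{x}\mathsf{x}). \tag{112}$$
The $d$-exactness of $\mathsf{x}\mathsf{x}$ and $d$-closedness of $\mathsf{x}$ and $\mathsf{y}$ imply that $\mathsf{x}\mathsf{x}\mathsf{y}$ and $\mathsf{x}\mathsf{x}\mathsf{x}$ are $d$-exact. Note that $\mathsf{y}\mathsf{y}\mathsf{y}$ is $d$ of
$$F_3 \xrightarrow{\left(\begin{smallmatrix} 0 \\ 0 \\ 0 \end{smallmatrix}\right)} F_1, \qquad F_2 \xrightarrow{(0 \;\; -1 \;\; 0)} F_0. \tag{113}$$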
It can be shown that $\mathsf{y}\mathsf{y}\mathsf{x}$ is not exact and generates $\mathrm{Ext}^3(\mathcal{O}_C, \mathcal{O}_C)$. This shows that $XY^2$ is the only cubic term in the superpotential. It is not hard to show inductively that all $m_k = f_k = 0$ for $k > 2$. Therefore, we have no higher terms in the superpotential and so
$$W = \mathrm{Tr}(XY^2), \tag{114}$$
as might have been expected from [42].

Acknowledgements. We wish to thank M. Douglas, S. Kachru, A. Lawrence, C. Lazaroiu, I. Melnikov, and E. Sharpe for useful conversations. P.S.A. is supported in part by NSF grant DMS-0301476, Stanford University, SLAC and the Packard Foundation. S.K. is supported in part by NSF grants DMS 02-96154, DMS 02-44412, and NSA grant MDA904-03-1-0050.
References
1. Witten, E.: Bound States Of Strings And p-Branes. Nucl. Phys. B460, 335–350 (1996)
2. Becker, K., Becker, M., Strominger, A.: Five-branes, Membranes and Nonperturbative String Theory. Nucl. Phys. B456, 130–152 (1995)
3. Fukaya, K.: Morse Homotopy, A∞-Category, and Floer Homologies. In: Proc. of the 1993 GARC Workshop on Geometry and Topology. Lecture Notes Ser. 18, Seoul: Seoul Nat. Univ., 1993, pp. 1–102
4. Fukaya, K., Seidel, P.: Floer Homology, A∞-Categories and Topological Field Theory. Lecture Notes in Pure and Appl. Math. 184, 9–32 (1997)
5. Fukaya, K., Oh, Y.-G., Ohta, H., Ono, K.: Lagrangian Intersection Floer Theory - Anomaly and Obstruction. http://www.kusm.kyoto-u.ac.jp/~fukaya/fukaya.html, 2000
6. Kontsevich, M.: Homological Algebra of Mirror Symmetry. In: Proceedings of the International Congress of Mathematicians. Basel-Boston: Birkhäuser, 1995, pp. 120–139
7. Douglas, M.R.: D-Branes, Categories and N=1 Supersymmetry. J. Math. Phys. 42, 2818–2843 (2001)
8. Aspinwall, P.S., Lawrence, A.E.: Derived Categories and Zero-Brane Stability. JHEP 08, 004 (2001)
9. Aspinwall, P.S.: D-Branes on Calabi–Yau Manifolds. http://arxiv.org/list/hep-th/0403166, 2004
10. Klebanov, I.R., Witten, E.: Superconformal Field Theory on Threebranes at a Calabi–Yau Singularity. Nucl. Phys. B536, 199–218 (1998)
11. Morrison, D.R., Plesser, M.R.: Non-Spherical Horizons. I. Adv. Theor. Math. Phys. 3, 1–81 (1999)
12. Cachazo, F., Katz, S., Vafa, C.: Geometric Transitions and N = 1 Quiver Theories. http://arxiv.org/list/hep-th/0108120, 2001
13. Cachazo, F., Fiol, B., Intriligator, K.A., Katz, S., Vafa, C.: A Geometric Unification of Dualities. Nucl. Phys. B628, 3–78 (2002)
14. Douglas, M.R., Govindarajan, S., Jayaraman, T., Tomasiello, A.: D-branes on Calabi-Yau Manifolds and Superpotentials. Commun. Math. Phys. 248, 85–118 (2004)
15. Feng, A., Hanany, A., He, Y.-H.: D-brane Gauge Theories from Toric Singularities and Toric Duality. Nucl. Phys. B595, 165–200 (2001)
16. Herbst, M., Lazaroiu, C.-I., Lerche, W.: D-brane Effective Action and Tachyon Condensation in Topological Minimal Models. JHEP 0503, 078 (2005)
17. Polishchuk, A.: A∞-Structures on an Elliptic Curve. Commun. Math. Phys. 247, 527–551 (2004)
18. Ashok, S.K., Dell’Aquila, E., Diaconescu, D.-E., Florea, B.: Obstructed D-branes in Landau–Ginzburg orbifolds. Adv. Theor. Math. Phys. 8, 427–472 (2004)
19. Witten, E.: Chern-Simons Gauge Theory as a String Theory. In: H. Hofer et al. (eds.), The Floer Memorial Volume. Basel-Boston: Birkhäuser, 1995, pp. 637–678
20. Gugenheim, V., Stasheff, J.: On Perturbations and A∞-Structures. Bull. Soc. Math. Belg. A38, 237–246 (1987)
21. Penkava, M., Schwarz, A.: A∞ Algebras and the Cohomology of Moduli Spaces. http://arxiv.org/list/hep-th/9408064, 1994
22. Lazaroiu, C.I.: String Field Theory and Brane Superpotentials. JHEP 10, 018 (2001)
23. Tomasiello, A.: A-infinity Structure and Superpotentials. JHEP 09, 030 (2001)
24. Herbst, M., Lazaroiu, C., Lerche, W.: Superpotentials, A-infinity Relations and WDVV Equations for Open Topological Strings. JHEP 0502, 071 (2005)
25. Keller, B.: Introduction to A-infinity Algebras and Modules. Homology Homotopy Appl. 3, 1–35 (2001)
26. Kontsevich, M., Soibelman, Y.: Homological Mirror Symmetry and Torus Fibrations. In: K. Fukaya et al. (eds.), Symplectic Geometry and Mirror Symmetry. River Edge, NJ: World Scientific, 2001, pp. 203–263
27. Kadeishvili, T.V.: The Algebraic Structure in the Homology of an A∞-Algebra. Soobshch. Akad. Nauk. Gruzin. SSR 108, 249–252 (1982)
28. Merkulov, S.: Strongly Homotopy Algebras of a Kähler Manifold. Internat. Math. Res. Notices (1999) 153–164
29. Brunner, I., Douglas, M.R., Lawrence, A., Römelsberger, C.: D-branes on the Quintic. JHEP 08, 015 (2000)
30. Witten, E.: Mirror Manifolds and Topological Field Theory. In: S.-T. Yau (ed.), Essays on Mirror Manifolds. Cambridge, MA: International Press, 1992
31. Dubrovin, B.: Geometry of 2D Topological Field Theories. In: Integrable Systems and Quantum Groups. Lecture Notes in Math. 1620, Berlin Heidelberg New York: Springer, 1996, pp. 120–348
32. Segal, G.: The Definition of Conformal Field Theory. In: Topology, Geometry and Quantum Field Theory. London Math. Soc. Lecture Note Ser. 308, Cambridge: London Math. Soc., 2004, pp. 421–577
33. Lazaroiu, C.I.: On the Structure of Open-Closed Topological Field Theory in Two Dimensions. Nucl. Phys. B603, 497–530 (2001)
34. Aspinwall, P.S., Melnikov, I.V.: D-Branes on Vanishing del Pezzo Surfaces. JHEP 0412, 042 (2004)
35. Herzog, C.P.: Seiberg Duality is an Exceptional Mutation. JHEP 0408, 064 (2004)
36. Kutasov, D.: Geometry on the Space of Conformal Field Theories and Contact Terms. Phys. Lett. B220, 153–158 (1989)
37. Polishchuk, A.: Homological Mirror Symmetry with Higher Products. In: Winter School on Mirror Symmetry, Vector Bundles and Lagrangian Submanifolds (Cambridge, MA, 1999), AMS/IP Stud. Adv. Math., Providence, RI: AMS, 2001, pp. 247–259
38. Polishchuk, A.: Extensions of Homogeneous Coordinate Rings to A∞-Algebras. Homology Homotopy Appl. 5, 407–421 (2003)
39. Hartshorne, R.: Algebraic Geometry. Graduate Texts in Mathematics 52, Berlin Heidelberg New York: Springer-Verlag, 1977
40. Weibel, C.A.: An Introduction to Homological Algebra. Cambridge Stud. in Adv. Math. 38, Cambridge: Cambridge Univ. Press, 1994
41. Aspinwall, P.S.: A Point’s Point of View of Stringy Geometry. JHEP 01, 002 (2003)
42. Katz, S.: Versal Deformations and Superpotentials for Rational Curves in Smooth Threefolds. Cont. Math. 312, 129–136 (2002)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 264, 255–282 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1485-4
Communications in
Mathematical Physics
Nonuniformly Expanding 1D Maps
Qiudong Wang (1), Lai-Sang Young (2)
(1) Dept. of Math., University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]
(2) Courant Institute of Mathematical Sciences, 251 Mercer St., New York, NY 10012, USA. E-mail: [email protected]
Received: 7 April 2005 / Accepted: 30 June 2005
Published online: 1 March 2006 – © Springer-Verlag 2006
Abstract: This paper attempts to make accessible a body of ideas surrounding the following result: Typical families of (possibly multimodal) 1-dimensional maps passing through “Misiurewicz points” have invariant densities for positive measure sets of parameters.

The research of both authors is partially supported by a grant from the NSF.

The simplest paradigms of chaotic behavior in dynamical systems are found in uniformly expanding and uniformly hyperbolic (or Anosov) maps. Allowing expanding and contracting behaviors to mix leads to a multitude of new possibilities. In spite of much progress, the analysis of most nonuniformly hyperbolic systems has remained hopelessly difficult. One-dimensional maps are an exception. The situation in 1 dimension is made tractable by the fact that the worst enemy of expansion is the critical set, i.e., the set on which $f'$ vanishes, and for typical 1D maps, this set is finite. It has been shown that by controlling the orbits starting from this finite set, the dynamics on the rest of the phase space can be tamed.

A nonuniform theory for 1D maps was developed in a series of papers in the late 1970s and 1980s ([J, M, CE, BC1, R, BC2, NS], and others). These ideas were later exploited in the study of attractors with a single direction of instability, beginning with the Hénon maps ([BC2, BY] etc.) and culminating recently in a general theory of rank-one attractors that can live in phase spaces of arbitrary dimensions [WY2]. In the course of these developments, some of the original 1D arguments have been extended and improved. This paper is written in response to numerous requests from the dynamical systems community to make more accessible a certain body of ideas in 1 dimension, both for its independent interest and as an introduction to the study of rank-one maps in higher dimensions.

The content of this paper can be summarized as follows: Let $I$ be a closed interval or a circle, and let $C^2(I, I)$ denote the set of $C^2$ maps from $I$ to itself. We seek to identify a reasonably large class of maps $\mathcal{G} \subset C^2(I, I)$ with controlled nonuniform expansion, and to give a description of its dynamical properties. This is carried out in the following 3 steps:
(1) First we identify a set $\mathcal{M} \subset C^2(I, I)$ defined by strong expanding conditions.
(2) Our class of “good maps” $\mathcal{G} \subset C^2(I, I)$ is obtained by relaxing these conditions. Maps in $\mathcal{G}$ are shown to have absolutely continuous invariant measures.
(3) We show that $\mathcal{G}$ is “large” in the sense that for every typical 1-parameter family $\{f_a\}$ passing through $\mathcal{M}$, the set $\{a : f_a \in \mathcal{G}\}$ has positive Lebesgue measure.

We first cite the main references directly related to (1)–(3): The class $\mathcal{M}$ is a slight generalization of the maps studied in [M]. In the special case of the quadratic family $f_a(x) = 1 - ax^2$, the existence of absolutely continuous invariant measures for a positive measure set of parameters is the well known theorem of Jakobson [J]; for other proofs of Jakobson’s theorem, see [BC1, BC2] and [R]. A key idea used in [BC1] and [BC2], namely the exponential growth of derivatives along critical orbits, was first introduced in [CE]. An analysis along the lines of (1)–(3) above for unimodal maps was carried out in [TTY]. A version of Jakobson’s theorem for multimodal maps is given in [T].

While the results of this paper as stated have not been published before, we do not claim that the ideas of the proofs are new. In outline, our proofs follow those in [BC1] and Sect. 2 of [BC2]. The generalization from $f_a(x) = 1 - ax^2$ to more general maps is along the lines of [TTY]. We have also borrowed heavily from [WY1] and especially [WY2], both in terms of setting and the way in which the arguments are carried out. More detailed references are given at the end of each section.

Organization of paper. The class $\mathcal{M}$ in (1) above is discussed in Sect. 1; the class $\mathcal{G}$ is introduced in Sect. 2. The result on invariant measures (Theorem 1) is stated and proved in Sect. 3. The result on positive measure sets of parameters (Theorem 2) is stated in Sect. 4.1 and proved in Sects. 4–7.

Part I. Dynamical Properties

1. The Class $\mathcal{M}$

1.1. Definition and expanding property. For $f \in C^2(I, I)$, let $C = C(f) = \{f' = 0\}$ denote the critical set of $f$, and let $C_\delta$ denote the $\delta$-neighborhood of $C$ in $I$. For $x \in I$, let $d(x, C) := \min_{\hat x \in C} |x - \hat x|$.

Definition 1.1. We say $f \in C^2(I, I)$ is in the class $\mathcal{M}$ if the following hold for some $\delta_0 > 0$:
(a) Outside of $C_{\delta_0}$: there exist $\lambda_0 > 0$, $M_0 \in \mathbb{Z}^+$ and $0 < c_0 \le 1$ such that
(i) for all $n \ge M_0$, if $x, f(x), \dots, f^{n-1}(x) \notin C_{\delta_0}$, then $|(f^n)'(x)| \ge e^{\lambda_0 n}$;
(ii) if $x, f(x), \dots, f^{n-1}(x) \notin C_{\delta_0}$ and $f^n(x) \in C_{\delta_0}$, any $n$, then $|(f^n)'(x)| \ge c_0 e^{\lambda_0 n}$.
(b) Inside $C_{\delta_0}$:
(i) $f''(x) \ne 0$ for all $x \in C_{\delta_0}$;
(ii) for all $\hat x \in C$ and $n > 0$, $d(f^n(\hat x), C) \ge \delta_0$;
(iii) for all $x \in C_{\delta_0} \setminus C$, there exists $p_0(x) > 0$ such that $f^j(x) \notin C_{\delta_0}$ for all $1 \le j < p_0(x)$ and $|(f^{p_0(x)})'(x)| \ge c_0^{-1} e^{\frac13 \lambda_0 p_0(x)}$.
Remark 1. The maps in $\mathcal{M}$ are among the simplest with nonuniform expansion: The phase space is divided into two regions, $C_{\delta_0}$ and $I \setminus C_{\delta_0}$. Condition (a) in Definition 1.1 says that on $I \setminus C_{\delta_0}$, $f$ is essentially uniformly expanding. (b)(iii) says that for $x \in C_{\delta_0} \setminus C$, even though $|f'(x)|$ is small, the orbit of $x$ does not return to $C_{\delta_0}$ again until its derivative has regained a definite amount of exponential growth; i.e., if $n$ is the first return time of $x \in C_{\delta_0}$ to $C_{\delta_0}$, then $|(f^n)'(x)| \ge e^{\frac13 \lambda_0 n}$. (To see this, use (b)(iii) followed by (a)(ii).)

Remark 2. We identify two properties of the critical orbits of $f \in \mathcal{M}$ that will serve as the basis of the generalization in Sect. 2. Let $\hat x \in C$.
(1) $d(f^n(\hat x), C) \ge \delta_0$ for all $n > 0$, i.e., (b)(ii) in Definition 1.1. (This condition is redundant and is included solely for emphasis; it follows from (b)(iii) together with the observation that $p_0(x) \to \infty$ as $d(x, C) \to 0$.)
(2) $|(f^n)'(f\hat x)| \ge c_0' e^{\lambda_0 n}$ for all $n > 0$, where $c_0' = (\max |f'|)^{-M_0}$. This follows from (b)(ii) and (a)(i).

We record for future use the following important fact about the behavior of $f \in \mathcal{M}$ outside of $C_\delta$ for arbitrary $\delta < \delta_0$:

Lemma 1.1. There exists $c_0'' > 0$ depending only on $f$ such that for all $\delta < \delta_0$ and $n > 0$:
(a) if $x, f(x), \dots, f^{n-1}(x) \notin C_\delta$, then $|(f^n)'(x)| \ge c_0'' \delta e^{\frac13 \lambda_0 n}$;
(b) if $x, f(x), \dots, f^{n-1}(x) \notin C_\delta$ and $f^n(x) \in C_{\delta_0}$, then $|(f^n)'(x)| \ge c_0'' e^{\frac13 \lambda_0 n}$.

Proof. Let $x$ be such that $f^i(x) \notin C_\delta$ for $i \in [0, n)$. We divide $[0, n]$ into maximal time intervals $[i, i + k]$ such that $f^{i+j}(x) \notin C_{\delta_0}$ for $0 < j < k$, and estimate $|(f^k)'(f^i(x))|$ as follows:
Case 1. $f^i(x), f^{i+k}(x) \in C_{\delta_0}$. Then $|(f^k)'(f^i(x))| \ge e^{\frac13 \lambda_0 k}$ by Definition 1.1(a)(ii) and (b)(iii).
Case 2. $f^i(x) \notin C_{\delta_0}$, $f^{i+k}(x) \in C_{\delta_0}$. The estimate is given by Definition 1.1(a)(ii).
Case 3. $f^i(x), f^{i+k}(x) \notin C_{\delta_0}$. If $k \ge M_0$, then $|(f^k)'(f^i(x))| > e^{\lambda_0 k}$ by Definition 1.1(a)(i). If $k < M_0$, we let $\hat k$ be the smallest integer $> k$ such that $f^{i+\hat k}(x) \in C_{\delta_0}$. Using Definition 1.1(a)(i) for $\hat k \ge M_0$ and Definition 1.1(a)(ii) for $\hat k < M_0$, we conclude that $|(f^k)'(f^i(x))| > c_0 (\max |f'(x)|)^{-M_0} e^{\lambda_0 k}$.
Case 4. $f^i(x) \in C_{\delta_0}$, $f^{i+k}(x) \notin C_{\delta_0}$. As in Case 3, with extra factor $(\min_{y \in C_{\delta_0}} |f''(y)|)\,\delta$.
Cases 3 and 4 are relevant only for part (a); each appears at most once in the estimate on $|(f^n)'(x)|$.
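As a concrete illustration of the two properties in Remark 2, consider the quadratic map $f(x) = 1 - 2x^2$, a member of the family $f_a(x) = 1 - ax^2$ mentioned in the introduction (Example 1 below asserts that $a = 2$ satisfies the relevant conditions). The following minimal sketch, with hypothetical function names, follows the orbit of the critical point $\hat x = 0$ and records both its distance to the critical set and the derivative $|(f^n)'(f\hat x)|$. It is a numerical illustration only, not part of the paper's argument.

```python
def f(x, a=2.0):
    """Quadratic family f_a(x) = 1 - a x^2."""
    return 1.0 - a * x * x


def df(x, a=2.0):
    """Derivative f_a'(x) = -2 a x; the critical set is C = {0}."""
    return -2.0 * a * x


def critical_orbit_data(n_max, a=2.0):
    """Return two lists indexed by n = 1..n_max:
    distances d(f^n(0), C) = |f^n(0)| and derivatives |(f^n)'(f(0))|."""
    distances, derivatives = [], []
    x = f(0.0, a)          # critical value f(x_hat)
    prod = 1.0
    for _ in range(n_max):
        distances.append(abs(x))    # d(f^n(0), C), with C = {0}
        prod *= abs(df(x, a))       # chain rule along the orbit of f(0)
        derivatives.append(prod)
        x = f(x, a)
    return distances, derivatives
```

For $a = 2$ the critical orbit is $0 \mapsto 1 \mapsto -1 \mapsto -1 \mapsto \cdots$, so the distances are all equal to 1 and the derivatives are $4^n$, i.e. exponential growth with rate $\ln 4$.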
In the interest of carrying as few constants around as possible, we write $c_1 = \min\{c_0, c_0', c_0''\}$.

1.2. Examples.

Example 1. Let $f \in C^3(I, I)$ be such that
(i) $S(f) < 0$, where $S(f)$ denotes the Schwarzian derivative of $f$,¹
(ii) $f''(\hat x) \ne 0$ for all $\hat x \in C$,
(iii) if $f^n(x) = x$, then $|(f^n)'(x)| > 1$, and
(iv) for all $\hat x \in C$, $\inf_{n>0} d(f^n(\hat x), C) > 0$.
Then $f \in \mathcal{M}$. For a proof of this fact, see Lemma 2.5 of [WY1]. We note that (i) and (ii) above are satisfied by all members of the quadratic family $f_a(x) = 1 - ax^2$, $a \in (0, 2]$, and (iii) and (iv) are satisfied by an uncountable number of $a$ including $a = 2$.

¹ We have elected to replace this condition by an explicit description of the dynamics in Definition 1.1 because (1) that is exactly what is used and (2) we have found that maps that arise in applications often do not have negative Schwarzian derivative.

Example 2. Another situation where maps in $\mathcal{M}$ arise naturally is through scaling. The following is a slight generalization of Lemma 5.3 in [WY3] and has the same proof: Let $f_a : S^1 \to S^1$ be given by $f_a(\theta) = \theta + a + L\Phi(\theta)$, where $a, L \in \mathbb{R}$ and $\Phi : S^1 \to S^1$ is an arbitrary function with nondegenerate critical points (and the right side is to be read mod 1). Then there exists $L_0 > 0$ such that for all $L \ge L_0$, there exists an $O(\frac{1}{L})$-dense set of $a$ for which $f_a \in \mathcal{M}$.

References: Maps of the type in Example 1 are introduced and studied in [M]. Maps of the type in Example 2 appear naturally in [WY3] and [WY4].

2. The Class $\mathcal{G}$: 3 Basic Properties

Condition (b)(ii) in Definition 1.1 severely limits the scope of $\mathcal{M}$ as a subset of $C^2(I, I)$. We now introduce in a neighborhood of each $f_0 \in \mathcal{M}$ an admissible set of perturbations $\mathcal{G}(f_0)$. Our set of “good maps” $\mathcal{G}$ is then defined to be $\bigcup_{f_0 \in \mathcal{M}} \mathcal{G}(f_0)$. Throughout this section, let $f_0 \in \mathcal{M}$ be fixed, and let $\delta_0, \lambda_0, c_1$ etc. be the constants in Sect. 1.1 associated with $f_0$.

2.1. Definition of $\mathcal{G}(f_0)$ and basic properties. For $\lambda, \alpha, \varepsilon > 0$ and $f \in C^2(I, I)$, we say $f \in \mathcal{G}(f_0; \lambda, \alpha, \varepsilon)$ if $\|f - f_0\|_{C^2} < \varepsilon$ and the following hold for all $\hat x \in C = C(f)$ and $n > 0$:
(G1) $d(f^n(\hat x), C) > \min\{\frac12 \delta_0, e^{-\alpha n}\}$;
(G2) $|(f^n)'(f(\hat x))| \ge c_1 e^{\lambda n}$.
Note that with $\lambda < \lambda_0$, (G1) and (G2) are relaxations of the conditions on critical orbits for $f_0$ (see Remark 2 in Sect. 1.1). The main result of this section is

Proposition 2.1. Given $f_0 \in \mathcal{M}$, $\lambda < \frac14 \lambda_0$ and $\alpha < \frac{1}{100} \lambda$, there exist $\delta = \delta(f_0, \lambda, \alpha)$ and $\varepsilon = \varepsilon(f_0, \lambda, \alpha, \delta) > 0$ such that (P1)–(P3) below hold for all $f \in \mathcal{G}(f_0; \lambda, \alpha, \varepsilon)$.

Here $\delta < \frac12 \delta_0$ is an auxiliary constant. For simplicity, we assume $\varepsilon$ is small enough that $d(f^n(\hat x), C) > \frac12 \delta_0$ for all $\hat x \in C$ and $1 \le n \le n_0$, where $n_0$ is a large integer satisfying $e^{-\alpha n_0} \ll \delta$. Consequently, (G1) can be violated only when $f^n(\hat x) \in C_\delta$. Precise requirements on $\delta$ and $\varepsilon$ will become clear in the proofs. In general, $\varepsilon \ll \delta \ll 1$. The arguments are perturbative; some of them will require that $\varepsilon$ be taken very close to 0. The set $\mathcal{G}(f_0)$ is defined to be the union of $\mathcal{G}(f_0; \lambda, \alpha, \varepsilon)$ as $(\lambda, \alpha, \varepsilon)$ ranges over all triples satisfying the conditions in Proposition 2.1. We now state (P1)–(P3), introducing some useful language along the way.
(P1) Outside of $C_\delta$.
(i) If $x, f(x), \dots, f^{n-1}(x) \notin C_\delta$, then $|(f^n)'(x)| \ge c_1 \delta e^{\frac14 \lambda_0 n}$;
(ii) if $x, f(x), \dots, f^{n-1}(x) \notin C_\delta$ and $f^n(x) \in C_{\delta_0}$, then $|(f^n)'(x)| \ge c_1 e^{\frac14 \lambda_0 n}$.

Let $\hat x \in C$, and let $C_\delta(\hat x) := (\hat x - \delta, \hat x + \delta)$. For $x \in C_\delta(\hat x) \setminus \{\hat x\}$, we define $p(x)$, the bound period of $x$, to be the largest integer such that $|f^i(x) - f^i(\hat x)| \le e^{-2\alpha i}$ for all $i < p(x)$.

(P2) Partial derivative recovery for $x \in C_\delta \setminus C$. For $x \in C_\delta(\hat x) \setminus \{\hat x\}$,
(i) $\dfrac{1}{3 \ln(\max |f'|)} \log \dfrac{1}{|x - \hat x|} \le p(x) \le \dfrac{3}{\lambda} \log \dfrac{1}{|x - \hat x|}$;
(ii) $|(f^{p(x)})'(x)| > e^{\frac{\lambda}{3} p(x)}$.

(P2) leads to the following general description of orbits:

Decomposition into “bound” and “free” states. For $x \in I$ such that $f^i(x) \notin C$ for all $i \ge 0$ (for example, $x = f(\hat x)$ for $\hat x \in C$), let $t_1 < t_1 + p_1 \le t_2 < t_2 + p_2 \le \cdots$ be defined as follows: $t_1$ is the smallest $j \ge 0$ such that $f^j(x) \in C_\delta$. For $k \ge 1$, let $p_k$ be the bound period of $f^{t_k}(x)$, and let $t_{k+1}$ be the smallest $j \ge t_k + p_k$ such that $f^j(x) \in C_\delta$. (Note that an orbit may return to $C_\delta$ during its bound periods, i.e. the $t_i$ are not the only return times to $C_\delta$.) This decomposes the orbit of $x$ into segments corresponding to time intervals $(t_k, t_k + p_k)$ and $[t_k + p_k, t_{k+1}]$, during which we describe the orbit of $x$ as being in “bound” and “free” states respectively; the $t_k$ are called times of free returns.

(P3) is about comparisons of derivatives for nearby orbits. To state what it means for two points to be close to each other, we introduce a partition $\mathcal{P}$ on $I$. First let $\mathcal{P}_0 = \{I_{\mu j}\}$ be the following partition on $(-\delta, \delta)$: Assume $\delta = e^{-\mu_*}$ for some $\mu_* \in \mathbb{Z}^+$. For $\mu \ge \mu_*$, let $I_\mu = (e^{-(\mu+1)}, e^{-\mu})$; for $\mu \le -\mu_*$, let $I_\mu$ be the reflection of $I_{-\mu}$ about 0. Each $I_\mu$ is further subdivided into $\mu^2$ subintervals of equal length called $I_{\mu j}$. For $\hat x \in C$, let $\mathcal{P}_0^{\hat x}$ be the partition on $C_\delta(\hat x)$ obtained by shifting the center of $\mathcal{P}_0$ from 0 to $\hat x$. The partition $\mathcal{P}$ is defined to be $\mathcal{P}_0^{\hat x}$ on $C_\delta(\hat x)$; on $I \setminus C_\delta$, its elements are intervals of length $\approx \delta$.

The following shorthand is used: We refer to $\pi \in \mathcal{P}$ corresponding to (translated) $I_{\mu j}$ intervals in $\mathcal{P}_0^{\hat x}$ simply as “$I_{\mu j}$”. For $\pi \in \mathcal{P}$, $\pi^+$ denotes the union of $\pi$ and the two elements of $\mathcal{P}$ adjacent to it. For an interval $\gamma \subset I$, we say $\gamma \approx \pi$ if $\pi \subset \gamma \subset \pi^+$. For practical purposes, $\pi^+$ intersecting $\partial C_\delta$ can be treated as “inside $C_\delta$” or “outside $C_\delta$”.² For $\gamma \subset I^+_{\mu j}$, we define the bound period of $\gamma$ to be $p(\gamma) = \min_{x \in I^+_{\mu j}} \{p(x)\}$. For $x, y \in I$, $[x, y]$ denotes the segment connecting $x$ and $y$.
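The partition $\mathcal{P}_0$ can be made concrete with a short sketch. It assumes that each $I_\mu$ is cut into $\mu^2$ equal pieces (so that $|I_{\mu j}| \approx e^{-|\mu|}/\mu^2$, consistent with the bound $|I^+_{\mu j}| \le \frac{3}{\mu^2} e^{-|\mu|}$ used in the proof of (P3)), it treats only the positive side $\mu \ge \mu_*$ (the negative side being obtained by reflection about 0), and it truncates at a finite `mu_max`, since the actual partition has infinitely many elements accumulating at 0. The function name and the truncation parameter are hypothetical.

```python
import math


def partition_P0(mu_star, mu_max):
    """Return {(mu, j): (left, right)} for the intervals I_{mu j} contained in
    (0, delta), delta = exp(-mu_star), with mu_star <= mu <= mu_max."""
    intervals = {}
    for mu in range(mu_star, mu_max + 1):
        left, right = math.exp(-(mu + 1)), math.exp(-mu)   # I_mu
        step = (right - left) / mu**2                      # assumed: mu^2 equal pieces
        for j in range(mu**2):
            intervals[(mu, j)] = (left + j * step, left + (j + 1) * step)
    return intervals
```

For example, `partition_P0(3, 4)` returns $9 + 16 = 25$ intervals whose lengths add up to $e^{-3} - e^{-5}$.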
We say $x$ and $y$ in $I$ have the same itinerary (with respect to $\mathcal{P}$) through time $n - 1$ if there exist $t_1 < t_1 + p_1 \le t_2 < t_2 + p_2 \le \cdots \le n$ such that for every $k$, $f^{t_k}[x, y] \subset \pi^+$ for some $\pi \subset C_\delta$, $p_k = p(f^{t_k}[x, y])$, and for all $i \in [0, n) \setminus \bigcup_k [t_k, t_k + p_k)$, $f^i[x, y] \subset \pi^+$ for some $\pi \in \mathcal{P}$ with $\pi \cap C_\delta = \emptyset$.

(P3) Distortion estimate. There exists $K_0 > 1$ (depending only on $f_0$ and on $\lambda$) such that if $x$ and $y$ have the same itinerary through time $n - 1$, then
$$\left| \frac{(f^n)'(x)}{(f^n)'(y)} \right| \le K_0.$$

(P1)–(P3) are proved in the next subsection. We finish by recording the following corollary of Proposition 2.1.

² In particular, if $\pi$ is one of the outermost $I_{\mu j}$ in $C_\delta$, then $\pi^+$ contains an interval of length $\delta$.

Corollary 2.1. There exists $K_1$ (depending only on $f_0$ and on $\lambda$) such that for all $x \in I$ with $f^i(x) \notin C$ for all $0 \le i < n$,
$$|(f^n)'(x)| > K_1^{-1}\, d(f^j(x), C)\, e^{\frac14 \lambda n},$$
where $j$ is the time of the last free return before $n$. The factor $d(f^j(x), C)$ may be replaced by $\delta$ if $f^n(x)$ is free.

Proof. Let $0 \le t_1 < t_1 + p_1 \le t_2 < t_2 + p_2 \le \cdots$ be as in the paragraph following (P2). The derivatives on time intervals $[t_k, t_k + p_k)$ and $[t_k + p_k, t_{k+1})$ are given by (P2)(ii) and (P1)(ii) respectively, provided these intervals are completed before time $n$. We assume $\delta$ is sufficiently small so that the constant $c_1$ in (P1)(ii) is absorbed into the exponential estimate from the preceding bound period. If $f^n(x)$ is in a bound period initiated at time $j$, then $|(f^{n-j})'(f^j(x))| \ge |f'(f^j(x))|\, K_0^{-1} c_1 e^{\lambda(n-j-1)}$; see (G2) and (P3). If $t_k + p_k \le n < t_{k+1}$ for some $k$, then the derivative between time $t_k + p_k$ and $n$ is given by (P1)(i).
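The bound period $p(x)$ defined before (P2) is easy to compute numerically. The sketch below does so for the map $f(x) = 1 - 2x^2$ with critical point $\hat x = 0$; the choice of map, the value $\alpha = 0.01$, the iteration cap and the function names are all illustrative assumptions and are not derived from Proposition 2.1. In line with (P2)(i), the computed bound period grows as $x$ approaches the critical point.

```python
import math


def f(x):
    """Illustrative map f(x) = 1 - 2 x^2 with critical point x_hat = 0."""
    return 1.0 - 2.0 * x * x


def bound_period(x, alpha=0.01, x_hat=0.0, max_iter=10_000):
    """Largest integer p such that |f^i(x) - f^i(x_hat)| <= exp(-2*alpha*i)
    for all i < p, i.e. the first index i at which the inequality fails."""
    u, v = x, x_hat
    for i in range(max_iter):
        if abs(u - v) > math.exp(-2.0 * alpha * i):
            return i
        u, v = f(u), f(v)
    return max_iter   # inequality never failed within the cap
```

Calling `bound_period` on a sequence such as `0.1, 1e-3, 1e-6` gives strictly increasing values, roughly proportional to $\log(1/|x - \hat x|)$.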
Remarks on the use of constants. In this article, $K, K_1, K_2, \dots$ are reserved for use as system constants, which in Part I are constants that are allowed to depend only on (1) $f_0$, by which we include also the constants in Sect. 1.1 associated with $f_0$, and (2) our choice of $\lambda$. The more important of these constants, such as $K_0$ in (P3), carry a subscript; all others are referred to by the generic name $K$. The value of $K$, therefore, may vary from expression to expression.

Notation. Where no ambiguity arises, i.e. when only one map $f$ is involved, we will sometimes write $x_i = f^i(x)$ for $i = 1, 2, \dots$.

2.2. Proofs of (P1)–(P3).

Proof of (P1). First we deduce from Lemma 1.1(a) that there exists $N = N(\delta)$ such that for all $y \in I$, if $y, f_0(y), \dots, f_0^{N-1}(y) \notin C_{\frac12 \delta}(f_0)$, then $|(f_0^N)'(y)| > e^{\frac{1}{3.5} \lambda_0 N}$. We then choose $\varepsilon$ small enough that $f$ is sufficiently close to $f_0$ for $N$ iterates in the sense below:
(i) if $x$ and $n$ are as in (P1) and $n \le N$, then $|(f^n)'(x) - (f_0^n)'(x)|$ is small enough that the conclusions of (P1) follow from Lemma 1.1;
(ii) if $f^i(y) \notin C_\delta$ for $0 \le i < N$, then $|(f^N)'(y)| > e^{\frac14 \lambda_0 N}$.
If $n$ in (P1) is $> N$, we let $k$ be such that $kN \le n < (k + 1)N$, and estimate $|(f^n)'(x)|$ by the chain rule, comparing $(f^N)'(f^{iN}(x))$ with $(f_0^N)'(f^{iN}(x))$ for $i \le k$ using (ii) above, and $(f^{n-kN})'(f^{kN}(x))$ with $(f_0^{n-kN})'(f^{kN}(x))$ using (i).
Lemma 2.1. The following holds if $\delta$ and $\varepsilon$ are sufficiently small and suitably related: Let $\hat x \in C$, and let $x \in C_\delta(\hat x)$. Then for all $y \in [\hat x, x]$ and $k < p(x)$,
$$\frac12 \le \frac{(f^k)'(y_1)}{(f^k)'(\hat x_1)} \le 2.$$

Proof. First,
$$\log \frac{(f^k)'(y_1)}{(f^k)'(\hat x_1)} \le \sum_{j=1}^{k} \frac{|f'(y_j) - f'(\hat x_j)|}{|f'(\hat x_j)|} \le K \sum_{j=1}^{k} \frac{|y_j - \hat x_j|}{d(\hat x_j, C)}.$$
We choose $h_0$ large enough that (i) $\sum_{j > h_0} e^{-\alpha j} \ll 1$ and (ii) $e^{-\alpha h_0} < \frac12 \delta_0$ (so that $d(\hat x_j, C) > e^{-\alpha j}$ for $j > h_0$). Next we choose $\delta$ small enough that $\delta \sum_{j=1}^{h_0} \frac{2}{\delta_0} (\max |f'|)^j \ll 1$. Finally, let $\varepsilon$ be small enough that $d(\hat x_j, C) > \frac12 \delta_0$ for all $j \le h_0$. Then
$$\sum_{j=1}^{k} \frac{|y_j - \hat x_j|}{d(\hat x_j, C)} < \sum_{j=1}^{h_0} \frac{2}{\delta_0} (\max |f'|)^j \delta + \sum_{j=h_0+1}^{k} \frac{e^{-2\alpha j}}{e^{-\alpha j}} \ll 1.$$
Proof of (P2). Suppose $|x - \hat x| = e^{-h}$. Then (G2) together with Lemma 2.1 implies that
$$|x_p - \hat x_p| \ge \tfrac12 |(f^{p-1})'(\hat x_1)|\,|x_1 - \hat x_1| \ge K^{-1} e^{\lambda(p-1)} (x - \hat x)^2.$$
From this we deduce that $p < \frac{3}{\lambda} h$, assuming that $h$ is sufficiently large (or $\delta$ is sufficiently small). The lower bound on $p$ is obtained by comparing the inequalities $|x_p - \hat x_p| < (\max |f'|)^{p-1} e^{-2h}$ and $|x_p - \hat x_p| \ge e^{-2\alpha p}$ (definition of bound period). To prove (P2)(ii), taking the square root of $|x_p - \hat x_p| < K |(f^{p-1})'(\hat x_1)| (x - \hat x)^2$, we obtain
$$K |(f^{p-1})'(\hat x_1)|^{\frac12} |x - \hat x| > e^{-\alpha p}. \tag{1}$$
Then writing $|(f^p)'(x)|$ as
$$|(f^p)'(x)| = |(f^{p-1})'(x_1)|\,|f'(x)| > \left( K^{-1} |(f^{p-1})'(\hat x_1)|^{\frac12} |x - \hat x| \right) \cdot |(f^{p-1})'(\hat x_1)|^{\frac12}$$
and substituting in (1), we see that $|(f^p)'(x)| > K^{-1} c_1^{\frac12} e^{\frac12 \lambda(p-1)} e^{-\alpha p}$, which we may assume is $> e^{\frac13 \lambda p}$ if $p$ is sufficiently large, or equivalently, $\delta$ is sufficiently small.

Proof of (P3). We write $\sigma_0 = [x, y]$, $\sigma_k = f^{t_k} \sigma_0$, and assume for definiteness that $\sigma_0 \subset C_\delta$ and $t_q + p_q \le n$, where $f^i \sigma_0 \cap C_\delta = \emptyset$ for all $t_q + p_q \le i < n$. (The proof for the case $t_q < n < t_q + p_q$ is contained in that for $n = t_q + p_q$.) Then
$$\log \frac{(f^n)'(x)}{(f^n)'(y)} \le \sum_{j=0}^{n-1} \frac{|f'(y_j) - f'(x_j)|}{|f'(y_j)|} \le K \sum_{k=1}^{q} (S_k' + S_k''),$$
where
$$S_k' = \sum_{j=t_k}^{t_k + p_k - 1} \frac{|y_j - x_j|}{d(y_j, C)} \quad \text{and} \quad S_k'' = \sum_{j=t_k + p_k}^{t_{k+1} - 1} \frac{|y_j - x_j|}{d(y_j, C)},$$
except for $S_q''$, which ends at index $n - 1$. Observe that for $k < q$, it follows from (P2)(ii) and (P1)(ii) that $|\sigma_{k+1}| \ge c_1 e^{\frac13 \lambda (t_{k+1} - t_k)} |\sigma_k|$, which we may assume is $\ge \tau |\sigma_k|$ for some $\tau > 1$ (the factor $c_1$ having been absorbed into the exponential assuming $\delta$ is sufficiently small).

I. Bound on $\sum_{k=1}^{q} S_k'$. First we estimate $S_k'$. Suppose $y_{t_k} \in C_\delta(\hat x)$. For $t_k < j < t_k + p_k$ we write
$$\frac{|y_j - x_j|}{d(y_j, C)} = \frac{|y_j - x_j|}{|y_j - \hat x_{j - t_k}|} \cdot \frac{|y_j - \hat x_{j - t_k}|}{d(y_j, C)}.$$
By Lemma 2.1 and the usual estimates near $\hat x$, the first factor on the right is
$$< K \frac{|y_{t_k+1} - x_{t_k+1}|}{|y_{t_k+1} - \hat x_1|} < K \frac{|f'(x_{t_k})|\,|y_{t_k} - x_{t_k}|}{|y_{t_k} - \hat x|^2} < K \frac{|\sigma_k|}{d(y_{t_k}, C)}.$$
Thus
$$S_k' \le K \frac{|\sigma_k|}{d(y_{t_k}, C)} \left( 1 + \sum_{j=t_k+1}^{t_k + p_k - 1} \frac{|y_j - \hat x_{j - t_k}|}{d(\hat x_{j - t_k}, C)} \right) \le K \frac{|\sigma_k|}{d(y_{t_k}, C)},$$
treating the sum inside the parentheses as in Lemma 2.1. Now let $\mathcal{K}_\mu = \{k \le q : \sigma_k \subset I^+_{\mu j} \text{ for some } j\}$. We claim that
$$\sum_{k \in \mathcal{K}_\mu} S_k' < K \sum_{k \in \mathcal{K}_\mu} \frac{|\sigma_k|}{e^{-|\mu|}} < K \frac{1}{\mu^2}.$$
The first inequality is from the estimate above. The second follows from (i) $|\sigma_{k+1}| \ge \tau |\sigma_k|$ and (ii) the term with the largest index is bounded above by $|I^+_{\mu j}|$, which is $\le \frac{3}{\mu^2} e^{-|\mu|}$. To finish, we sum over all $\mu$, $|\mu| \ge \log \frac{1}{\delta}$, to obtain $\sum S_k' < K$.

II. Bound on $\sum_{k=1}^{q} S_k''$. For $k < q$ and $t_k + p_k \le j \le t_{k+1} - 1$, we have, by (P1)(ii), $|\sigma_{k+1}| \ge c_1 e^{\lambda(t_{k+1} - j)} |x_j - y_j|$, so $S_k'' \le K \frac{|\sigma_{k+1}|}{\delta}$. This together with $|\sigma_{k+1}| \ge \tau |\sigma_k|$ gives $\sum_{k=0}^{q-1} S_k'' \le K \frac{|\sigma_q|}{\delta} < K$. If $n < t_{q+1}$, $S_q''$ needs to be estimated separately: Assume the orbit in question visits $C_{\delta_0}$ during the time interval $[t_q + p_q, n)$ (otherwise the estimate is trivial). Let $\hat n$ be the time of the last such visit. Then
$$S_q'' = \sum_{j=t_q + p_q}^{\hat n - 1} \frac{|y_j - x_j|}{d(y_j, C)} + \frac{|y_{\hat n} - x_{\hat n}|}{d(y_{\hat n}, C)} + \sum_{j=\hat n + 1}^{n-1} \frac{|y_j - x_j|}{d(y_j, C)}.$$
The first sum is estimated using $|y_{\hat n - 1} - x_{\hat n - 1}| > c_1 e^{\frac14 \lambda_0 (\hat n - 1 - j)} |y_j - x_j|$ for $t_q + p_q \le j < \hat n$ ((P1)(ii)). The last sum is estimated using $|y_{n-1} - x_{n-1}| > c_1 \delta_0 e^{\frac14 \lambda_0 (n - j - 1)} |y_j - x_j|$ for $\hat n < j < n - 1$ ((P1)(i) with $\delta_0$ in the place of $\delta$). It follows that
$$S_q'' \le K \frac{|y_{\hat n - 1} - x_{\hat n - 1}|}{\delta} + K \frac{|y_{\hat n} - x_{\hat n}|}{\delta} + K \frac{|y_{n-1} - x_{n-1}|}{\delta_0^2} < K.$$

References: Conditions of the type (G1) were first used in [BC1] and [BC2]. (G2) was introduced in [CE]. A version of the material in this section for $f_0(x) = 1 - 2x^2$ first appeared in [BC1] and Sect. 2 of [BC2].

3. Absolutely Continuous Invariant Measures

The goal of this section is to prove

Theorem 1. Every $f \in \mathcal{G}$ has an absolutely continuous invariant probability measure.
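For orientation, in the classical case $f(x) = 1 - 2x^2$ on $[-1, 1]$ (the map $f_0$ mentioned in the references above) the invariant density is known explicitly: it is $\rho(x) = \frac{1}{\pi \sqrt{1 - x^2}}$, with distribution function $\mu([-1, t]) = \frac12 + \frac{1}{\pi} \arcsin t$. This standard fact is not part of the argument of this paper; the sketch below, with hypothetical function names, simply checks the invariance relation $\mu(f^{-1}[-1, t]) = \mu([-1, t])$ on a grid, using $f^{-1}([-1, t]) = [-1, -s] \cup [s, 1]$ with $s = \sqrt{(1 - t)/2}$.

```python
import math


def mu_cdf(t):
    """mu([-1, t]) for the density 1 / (pi * sqrt(1 - x^2)) on [-1, 1]."""
    return 0.5 + math.asin(t) / math.pi


def mu_preimage(t):
    """mu(f^{-1}([-1, t])) for f(x) = 1 - 2 x^2, where
    f^{-1}([-1, t]) = [-1, -s] U [s, 1] and s = sqrt((1 - t) / 2)."""
    s = math.sqrt((1.0 - t) / 2.0)
    return mu_cdf(-s) + (1.0 - mu_cdf(s))


def max_invariance_defect(num=101):
    """Largest value of |mu([-1,t]) - mu(f^{-1}[-1,t])| over a uniform grid of t."""
    grid = [-1.0 + 2.0 * k / (num - 1) for k in range(num)]
    return max(abs(mu_cdf(t) - mu_preimage(t)) for t in grid)
```

The defect is zero up to floating-point rounding, which reflects the identity $\arcsin(\cos 2\theta) = \frac{\pi}{2} - 2\theta$ for $\theta \in [0, \frac{\pi}{2}]$.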
3.1. Growth of segments with bounded distortion. Let $f \in \mathcal{G}$. To prove that $f$ admits an absolutely continuous invariant probability measure (acipm), we need to show that the forward images of Lebesgue measure have certain regularity properties. Expansion is conducive to such regularity, but distortion bounds are also essential. Since (P3) guarantees uniform distortion bounds only on intervals with the same itinerary, we need to show that intervals of this type grow with sufficient regularity. This is the mission of the present subsection.

We begin by introducing what will be referred to as a canonical subdivision by itinerary on an interval $\omega \subset I$. This consists of an increasing sequence of partitions $\mathcal{Q}_0 < \mathcal{Q}_1 < \mathcal{Q}_2 < \cdots$ on $\omega$ defined as follows: Let us say an interval is “short” if it is strictly contained in an element of $\mathcal{P}$. $\mathcal{Q}_0$ is defined to be $\mathcal{P}|_\omega$ except that the end intervals are attached to their neighbors if they are short. We assume inductively that all $\hat\omega \in \mathcal{Q}_i$ are intervals and all points in $\hat\omega$ have the same itinerary through time $i$. To go from $\mathcal{Q}_i$ to $\mathcal{Q}_{i+1}$, we consider one $\hat\omega \in \mathcal{Q}_i$ at a time:
– If $f^{i+1}(\hat\omega)$ is in a bound period, then it is automatically put into $\mathcal{Q}_{i+1}$. (Observe that if $f^{i+1}(\hat\omega) \cap C_\delta \ne \emptyset$, then $f^{i+1}(\hat\omega) \subset I^+_{\mu' j'}$ for some $\mu', j'$, i.e. no cutting is needed during bound periods. This is an easy exercise.)
– If $f^{i+1}(\hat\omega)$ is not in a bound period, but all points in $\hat\omega$ have the same itinerary through time $i + 1$, we again put $\hat\omega \in \mathcal{Q}_{i+1}$.
– If neither of the last two cases holds, then we partition $\hat\omega$ into segments $\{\hat\omega'\}$ according to their itineraries through time $i + 1$, requiring that $f^{i+1}(\hat\omega') \approx \gamma$ for some $\gamma \in \mathcal{P}$ (i.e., no cuts are made that lead to short intervals). The resulting partition is $\mathcal{Q}_{i+1}|_{\hat\omega}$.

For $x \in \omega$ and $i \ge 0$, let $\mathcal{Q}_i(x)$ denote the element of $\mathcal{Q}_i$ containing $x$. We introduce the following stopping time on $\omega$: For $x \in \omega$, $S(x)$ is the smallest $i > 0$ such that $f^i(\mathcal{Q}_{i-1}(x))$ is not in a bound period and has length $> \delta$. The main result of this subsection is

Proposition 3.1. There exists $K_2 > 0$ such that for any $\omega \subset I$ with $\delta < |\omega| < 3\delta$,
$$|\{S > n\}| < e^{-K_2^{-1} n}\, |\omega| \quad \text{for } n > K_2 \log \delta^{-1}.$$
Here $|\cdot|$ denotes the Lebesgue measure of the set.

The proof of this proposition follows from a series of lemmas.

Lemma 3.1. Let $\omega \approx I_{\mu j}$. Then $|f^p(\omega)| > e^{-\frac{7\alpha}{\lambda} |\mu|}$, where $p = p(\omega)$.

Proof. It follows from Lemma 2.1 that for $I_{\mu j} \in \mathcal{P}^{\hat x}$,
$$|f^p(I_{\mu j})| = \frac{|f^p(I_{\mu j})|}{|f^p([\hat x, \hat x \pm e^{-|\mu|}])|}\, |f^p([\hat x, \hat x \pm e^{-|\mu|}])| \ge K^{-1} \frac{|f(I_{\mu j})|}{|f([\hat x, \hat x \pm e^{-|\mu|}])|}\, |f^p([\hat x, \hat x \pm e^{-|\mu|}])| \ge K^{-1} \frac{1}{\mu^2} e^{-2\alpha p}.$$
By (P2)(i), $p < \frac{3}{\lambda} |\mu|$. The lemma then follows assuming $\delta$ is sufficiently small.
For ω ≈ Iµj , we define the extended bound period of ω to be the largest n such that all points in ω have the same itinerary (in the sense of (P3)) for n − 1 iterates. The next lemma follows immediately from Corollary 2.1.
Q. Wang, L.-S. Young
Lemma 3.2. There exists $K$ such that the extended bound period of $I_{\mu j}$ is $< K|\mu|$.

Lemma 3.3. Assume $\delta$ is sufficiently small. Then for $\omega \approx I_{\mu_0 j_0}$,
\[ |\{x \in \omega : S(x) > n\}| < e^{-\frac12 K^{-1} n}\, |\omega| \quad \text{for all } n > K|\mu_0|, \]
where $K$ is the constant in Lemma 3.2.

Proof. Let $x \in \omega$ be such that $S(x) > n$. We define the essential return times and addresses of $x$ before $n$ as follows: Let $t$ be the extended bound period of $\omega$. Then either $S = t$, or $f^t(\omega) \subset C_\delta$ (more precisely $\cup I^+_{\mu j}$). In the latter case, we say $t_1 = t$ is the first essential return time of $x$, and if $f^{t_1}(Q_{t_1}(x)) \approx I_{\mu_1 j_1} \subset C_\delta$, then we say $I_{\mu_1 j_1}$ is its first essential return address. If $S$ has not been reached, we continue iterating. Let $t$ be the extended bound period of $f^{t_1}(Q_{t_1}(x))$. Then either $S(x) = t_1 + t$, or we define the second essential return time to be $t_2 = t_1 + t$ and second return address to be $I_{\mu_2 j_2}$ if $f^{t_2}(Q_{t_2}(x)) \approx I_{\mu_2 j_2} \subset C_\delta$, and so on.

Let $A_q = \{x \in \omega : S(x) > n,\ f^i(x)$ makes $q$ but not $q + 1$ essential returns before time $n\}$. Then $|\{S > n\}| = \sum_q |A_q|$. We write $A_q = \cup_R A_{q,R}$, where $A_{q,R} = \{x \in A_q$ : if $(\mu_1, \dots, \mu_q)$ are the $\mu$-coordinates of its first $q$ return addresses, then $|\mu_1| + |\mu_2| + \cdots + |\mu_q| = R\}$, and further decompose $A_{q,R}$ into intervals $\sigma$ consisting of points whose first $q$ return addresses are identical. Each such $\sigma$ is equal to $Q_{t_q}(x)$ for some $x$, since the extended bound period of $f^{t_q}(Q_{t_q}(x))$ is not completed before time $n$ (see above). Writing $Q_{t_k} = Q_{t_k}(x)$, we have
\begin{align*}
|\sigma| &= \frac{|Q_{t_q}|}{|Q_{t_{q-1}}|}\, \frac{|Q_{t_{q-1}}|}{|Q_{t_{q-2}}|} \cdots \frac{|Q_{t_1}|}{|\omega|}\, |\omega| \\
&\le K_0^q\, \frac{|f^{t_{q-1}+p_{q-1}}(Q_{t_q})|}{|f^{t_{q-1}+p_{q-1}}(Q_{t_{q-1}})|}\, \frac{|f^{t_{q-2}+p_{q-2}}(Q_{t_{q-1}})|}{|f^{t_{q-2}+p_{q-2}}(Q_{t_{q-2}})|} \cdots \frac{|f^{p(\omega)}(Q_{t_1})|}{|f^{p(\omega)}(\omega)|}\, |\omega| \\
&\le K_0^q c_1^{-q}\, \frac{|f^{t_q}(Q_{t_q})|}{|f^{t_{q-1}+p_{q-1}}(Q_{t_{q-1}})|}\, \frac{|f^{t_{q-1}}(Q_{t_{q-1}})|}{|f^{t_{q-2}+p_{q-2}}(Q_{t_{q-2}})|} \cdots \frac{|f^{t_1}(Q_{t_1})|}{|f^{p(\omega)}(\omega)|}\, |\omega| \\
&\le K^q\, \frac{e^{-|\mu_q|}}{e^{-\frac{7}{\lambda}\alpha|\mu_{q-1}|}}\, \frac{e^{-|\mu_{q-1}|}}{e^{-\frac{7}{\lambda}\alpha|\mu_{q-2}|}} \cdots \frac{e^{-|\mu_1|}}{e^{-\frac{7}{\lambda}\alpha|\mu_0|}}\, |\omega|.
\end{align*}
Here (P3) is used in the first inequality, (P1)(ii) is used in the second and Lemma 3.1 in the third. With $\alpha < \frac{1}{100}\lambda$, the estimate above gives
\[ |\sigma| < K^q e^{-\sum_{k=1}^{q} \frac{9}{10}|\mu_k| + \frac{1}{10}|\mu_0|}\, |\omega| = K^q e^{-\frac{9}{10}R + \frac{1}{10}|\mu_0|}\, |\omega| := |\sigma|_R. \]
(This estimate is valid only if $f^i(\omega)$ has completed its first bound period, which is not a problem since $n > K|\mu_0|$.) We estimate $|\{S > n\}|$ by
\[ |\{S > n\}| = \sum_{q,R} |A_{q,R}| \le \sum_R (\text{number of } \sigma \text{ in } \cup_q A_{q,R}) \cdot |\sigma|_R. \]
There are $\binom{R-1}{q}$ ways of decomposing $R$ into a sum of $q + 1$ integers. For a fixed $q$-tuple $(\mu_1, \dots, \mu_q)$, $\mu_i > 0$, we claim there are $\le 2^q \mu_1^2 \mu_2^2 \cdots \mu_q^2$ possibilities for $\sigma$ with the $\mu$-coordinates of their essential free return addresses being $(\pm\mu_1, \dots, \pm\mu_q)$.
Nonuniformly Expanding 1D Maps
This is because $f^{t_k}(Q_{t_k}(\sigma))$ is short enough that it can meet at most one $C_\delta(\hat x)$, which contains $\le 2\mu_k^2$ intervals of the form $I_{\pm\mu_k j}$. Furthermore, for $(\mu_1, \dots, \mu_q)$ with $|\mu_1| + |\mu_2| + \cdots + |\mu_q| = R$, we have $\mu_1^2 \mu_2^2 \cdots \mu_q^2 \le (\frac{R}{q})^{2q}$. There is one other piece of information that is crucial to us, namely that all bound periods are $\ge \Delta := K^{-1} \log \frac{1}{\delta}$. This means that for a given $R$, the only feasible $q$ are $\le \frac{R}{\Delta}$. For a fixed $R$, then, the number of $\sigma$ in $\cup_q A_{q,R}$ is
\[ \le \sum_{q \le R/\Delta} \binom{R-1}{q}\, 2^q \Big(\frac{R}{q}\Big)^{2q}, \]
which, by Stirling's formula, grows exponentially in $R$ at a rate that tends to 0 as $\frac{1}{\Delta} \to 0$, that is, as $\delta \to 0$. Calling the expression above $(1 + \eta(\delta))^R$, we have $\eta(\delta) \to 0$ as $\delta \to 0$.

To finish, we note that $n \le K(|\mu_0| + R)$, where $K$ is as in Lemma 3.2, since the essential bound period following the $q$th essential return expires before time $K(|\mu_0| + R)$. Thus
\[ |\{S > n\}| < \sum_{R \ge K^{-1}n - |\mu_0|} K^q (1 + \eta(\delta))^R e^{-\frac{9}{10}R + \frac{1}{10}|\mu_0|}\, |\omega| < e^{-\frac{4}{5}K^{-1}n + \frac{1}{10}|\mu_0|}\, |\omega| < e^{-\frac{1}{2}K^{-1}n}\, |\omega| \]
provided that $n > K|\mu_0|$.
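The counting step in the proof of Lemma 3.3 can be explored numerically. The sketch below assumes that the number of $\sigma$ for a given $R$ is bounded by the sum over $1 \le q \le R/\Delta$ of $\binom{R-1}{q} 2^q (R/q)^{2q}$, as suggested by the counting facts above, and computes its exponential growth rate in $R$. The function names and the sample values of $R$ and $\Delta$ are hypothetical; the computation is an illustration, not part of the proof.

```python
import math


def log_count_bound(R, Delta):
    """Logarithm of sum_{1 <= q <= R/Delta} C(R-1, q) * 2^q * (R/q)^(2q).

    Terms are combined in log space to avoid overflow.
    """
    qmax = int(R // Delta)
    if qmax < 1:
        raise ValueError("need R >= Delta so that at least one q is feasible")
    logs = [
        math.log(math.comb(R - 1, q)) + q * math.log(2) + 2 * q * math.log(R / q)
        for q in range(1, qmax + 1)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def growth_rate(R, Delta):
    """Exponential growth rate per unit R of the counting bound."""
    return log_count_bound(R, Delta) / R


# The rate shrinks as Delta grows (i.e. as delta -> 0), which is what allows
# the factor e^{-(9/10) R} to dominate the count.
for Delta in (10, 40, 100):
    print(Delta, growth_rate(400, Delta))
```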
Proof of Proposition 3.1. First there is the trivial case where for some $i > 0$, all points in $\omega$ have the same itinerary through time $i - 1$ and $|f^i(\omega)| > \delta$, so that $S|_\omega = i$. This case aside, let $t_0 \ge 0$ be the first time $Q_{t_0}$ contains more than one element. Clearly, $t_0 < K \log \frac{1}{\delta}$, and $|f^{t_0}(\omega)| > K^{-1}\delta$ by (P1). Let $n > 0$ be an arbitrary integer. For each $\omega' \in Q_{t_0}$ such that $f^{t_0}(\omega') \approx I_{\mu j}$, $|\mu| < K^{-1}n$, we have $|\omega' \cap \{S > t_0 + n\}| < K_0 e^{-\frac12 K^{-1} n} |\omega'|$ by Lemma 3.3, $K_0$ here being the distortion constant for $f^{t_0}$. The measure of the union of $\omega' \in Q_{t_0}$ with $|\mu| > K^{-1}n$ is $\le K e^{-K^{-1}n}$. It follows therefore that
\[ |\{x \in \omega : S(x) > t_0 + n\}| < K e^{-\frac12 K^{-1} n}\, |\omega| + K\delta^{-1} e^{-K^{-1}n}\, |\omega| < e^{-K_2^{-1}(t_0 + n)}\, |\omega|, \]
provided $K_2$ is sufficiently large and $n + t_0 > K_2 \log \delta^{-1}$.
3.2. Proof of Theorem 1. Let $m$ denote Lebesgue measure on $I$, and let $f_*^i(m)$ be the Borel measure with $f_*^i(m)(E) = m(f^{-i}(E))$. Fix $l := I_{\mu_0 j_0}$ for some $\mu_0, j_0$. For $n = 1, 2, \dots$, let
\[ \nu_n = \frac{1}{n} \sum_{i=0}^{n-1} f_*^i(m|_l). \]
Clearly, any limit point of νn in the weak∗ topology is f -invariant. As we will see, it suffices to show that a positive fraction of these measures is absolutely continuous with respect to m (written “<< m”). The next lemma helps us “catch” this fraction:
Lemma 3.4. There exist (i) an interval $L \subset I$, (ii) a number $c > 0$, (iii) a sequence of integers $n_1 < n_2 < \cdots$, and (iv) for each $i = 0, 1, 2, \dots$, a collection of subsegments $\{\omega_j^{(i)}\}$ of $l$, with the property that the following hold for each $i, j$:
(a) $f^i(\omega_j^{(i)}) = L$;
(b) $\frac{(f^i)'(x)}{(f^i)'(y)} < K_0$ for all $x, y \in \omega_j^{(i)}$;
(c) $\frac{1}{n_k} \sum_{i=0}^{n_k - 1} m(\cup_j \omega_j^{(i)}) \ge c\, m(l)$.
We first finish the proof assuming the conclusion of Lemma 3.4: Let
\[ \hat\nu_{n_k} = \frac{1}{n_k} \sum_{i=0}^{n_k - 1} f_*^i\big(m|_{\cup_j \omega_j^{(i)}}\big), \]
and let $\hat\nu$ be a limit point of $\hat\nu_{n_k}$. It follows from Lemma 3.4(c) that $\hat\nu(L) > c\, m(l) > 0$. From Lemma 3.4(a) and (b), we see that if $\rho_k$ is the density of $\hat\nu_{n_k}$, then $\rho_k(x)/\rho_k(y) \le K_0$ for all $x, y \in L$. These bounds are passed to the limit measure $\hat\nu$. In particular, $\hat\nu \ll m$. Now let $\nu$ be a limit point of $\nu_{n_k}$. Then $\nu \ge \hat\nu$, meaning $\nu - \hat\nu$ is a nonnegative measure. We decompose $\nu$ into $\nu_{ac} + \nu_\perp$, where $\nu_{ac} \ll m$ and $\nu_\perp$ is singular with respect to $m$. Since $f_*(\nu_{ac}) \ll m$ and $f_*(\nu_\perp) \perp m$, it follows that both $\nu_{ac}$ and $\nu_\perp$ are $f$-invariant. It remains to argue that $\nu_{ac}(I) > 0$, which is true since $\nu_{ac} \ge \hat\nu$ and $\hat\nu(L) > 0$.

Proof of Lemma 3.4. We introduce a sequence of stopping times $S_0 < S_1 < S_2 < \cdots$ on $l$ as follows: Let $S_0 = 0$ and $S_1 = S$, where $S$ is as defined in Sect. 3.1. For $k \ge 1$, let $x \in \omega$ be such that $S_k(x)$ has been defined. For definiteness, suppose $x \in \omega \in Q_{i-1}$ with $S_k|_\omega = i$. We define $S_{k+1}|_\omega = S_k + S|_{f^{S_k}(\omega)} \circ f^{S_k}$, i.e., $S_{k+1}(x)$ is defined to be the smallest $j > i$ such that $f^j(Q_{j-1}(x))$ is free and has length $> \delta$. Let $\tilde Q_{i-1} = \{\omega \in Q_{i-1}$ such that $S_k|_\omega = i$ for some $k\}$. It follows from Proposition 3.1 that for all such $\omega$,
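\[ \int_\omega (S_{k+1} - S_k)\, dm \le M|\omega| \]
for some $M$ possibly dependent on $\delta$ but independent of $\omega$, $k$ or $i$. Keeping $k$ fixed while summing over $\omega$ and $i$, we obtain $\int_l (S_{k+1} - S_k)\, dm \le M\, m(l)$. Summing over $k$ then gives
\[ \int_l S_k\, dm \le Mk\, m(l). \]
By Chebychev's Inequality,
\[ m\{S_{[\frac{1}{2M}N]} > N\} \le \frac{1}{2}\, m(l). \]
Hence
\[ \frac{1}{N} \sum_{i \le N} m(\cup\{\omega \in \tilde Q_{i-1}\}) \ge \frac{1}{4M}\, m(l). \tag{2} \]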
To finish, we partition $I$ into intervals $L_1, L_2, \dots, L_{3/\delta}$ of length $\frac13\delta$ each. For each $\omega \in \tilde Q_{i-1}$, since $|f^i(\omega)| > \delta$, there exists $n = \psi(\omega)$ such that $f^i(\omega) \supset L_n$. Let $\hat\omega = \omega \cap f^{-i}L_n$. By (P3), there exists $K$ such that $m(\hat\omega) > K^{-1} m(\omega)$. Together with (2), this implies that for each $N$, there exists $n = n(N)$ such that
\[ \frac{1}{N} \sum_{i \le N} m(\cup\{\omega \in \tilde Q_{i-1} : \psi(\omega) = n\}) \ge \frac{\delta}{12MK}\, m(l). \]
Let $n_*$ be such that $n_* = n(N)$ for infinitely many $N$. We let $L = L_{n_*}$, and for each $i$, let $\{\omega_j^{(i)}\} := \{\hat\omega : \omega \in \tilde Q_{i-1}, \psi(\omega) = n_*\}$.

References: A version of the material in Sect. 3.1 is used in [BC2] (in a different context). Section 3.2 follows the construction of SRB measures in [BY].

Part II. Parameter Issues

4. Admissible One-Parameter Families

4.1. Statement of Theorem 2. We say a 1-parameter family of 1D maps $\{f_a \in C^2(I, I), a \in (a_1, a_2)\}$ is $C^2$ if the map $(x, a) \mapsto f_a(x)$ is $C^2$. If $I$ is an interval, we assume $f_a(I)$ is contained in the interior of $I$; small modifications of some statements are needed otherwise. Assuming $a_1 < 0 < a_2$ and $f_0 \in M$, certain orbits of $f_0$ have natural continuations to $a$ near 0. For example:
(i) The continuation $a \mapsto \hat x(a)$ of every $\hat x \in C(f_0)$ is clearly well defined.
(ii) Let $\Lambda \subset I$ be a closed subset with the property that $f_0(\Lambda) \subset \Lambda$ and $\Lambda \cap C(f_0) = \emptyset$. Then $\Lambda$ has a natural continuation $a \mapsto \Lambda(a)$ to a small interval containing 0. Moreover, for each $x \in \Lambda$, $a \mapsto x(a)$ is differentiable. (For more detail, see Sect. 4.2.)

Definition 4.1. Let $\{f_a\}$ be a 1-parameter family with $f_0 \in M$. We say $\{f_a\}$ satisfies the parameter transversality condition (PT) at $f_0$ if for every $\hat x \in C(f_0)$ and $q = f_0(\hat x)$,
\[ \hat c(\hat x) := \frac{d}{da}\big(f_a(\hat x(a)) - q(a)\big)\Big|_{a=0} \neq 0. \]
The notation "$q(a)$" in the displayed formula above is to be interpreted as follows: Since $f_0 \in M$, $q$ is contained in a closed subset $\Lambda$ of the kind in (ii) above. By $q(a)$, we refer to the continuation of $q$ in the sense of (ii). As $a$ varies, $f_a(\hat x(a))$ moves with $a$, as does the set $\Lambda(a)$. Roughly speaking, (PT) stipulates that the two move at different speeds. In general, $\Lambda$ moves more slowly, and the trajectory of $a \mapsto f_a(\hat x(a))$ moves "through" $\Lambda$. This is why we think of (PT) as a transversality condition.

Theorem 2. For every $C^2$ family $\{f_a\}$ satisfying (PT) at $f_0 \in M$, the set $\{a : f_a \in G\}$ has positive Lebesgue measure.

The rest of this paper is devoted to the proof of Theorem 2. To simplify slightly the discussion, we assume from here on that $C(f_a) = C(f_0) := C$ for all $a$.
This is easily arranged via a-dependent changes of coordinates that do not affect the content of the theorem. As before, critical points will be denoted by “hats” (e.g., xˆ ∈ C, while x is an arbitrary point in I ).
Standing assumptions for Part II.
– $(a_1, a_2)$ is an interval with $a_1 < 0 < a_2$;
– $\{f_a, a \in (a_1, a_2)\}$ is a $C^2$ family with $C(f_a) = C(f_0)$ for all $a$;
– $f_0 \in M$, and (PT) is satisfied at $f_0$.
System constants for Part II are allowed to depend only on (i) $f_0$ (including the constants in Sect. 1.1 associated with $f_0$), (ii) the $C^2$ norm of the family $\{f_a\}$, (iii) $\hat c := \min_{\hat x \in C} |\hat c(\hat x)|$ and (iv) our choice of $\lambda$.
4.2. Alternate formulation of (PT). We begin with some simple facts about the symbolic dynamics of $f = f_0 \in M$. Let $J = \{J_1, \dots, J_q\}$ be the components of $I \setminus C$. For $x \in I$ such that $f^i x \notin C$ for all $i \ge 0$, let $\phi(x) = (\iota_i)_{i=0,1,\dots}$ be given by $\iota_i = k$ if $f^i x \in J_k$.

Lemma 4.1. For $f \in M$, there exists an increasing sequence of compact sets $\Lambda^{(n)}$ such that
(a) $\Lambda^{(n)} \cap C = \emptyset$, $f(\Lambda^{(n)}) \subset \Lambda^{(n)}$, and $f|_{\Lambda^{(n)}}$ is conjugate to a shift of finite type;
(b) $\cup_n \Lambda^{(n)}$ is dense in $I$;
(c) if $\inf_{i \ge 0} d(f^i(x), C) > 0$, then $x \in \Lambda^{(n)}$ for some $n$.

Proof. First we argue that $\cup_{i \ge 0} f^{-i}C$ is dense in $I$. If not, there would be an interval $\omega$ with the property that $\phi(x)$ is identical for all $x \in \omega$. Let $\omega$ be a maximal interval of this type. Then either (i) $f^{n+k}(\omega) \subset f^n(\omega)$ for some $n, k$, or (ii) $f^k(\omega)$, $k = 0, 1, \dots$, are pairwise disjoint. Case (i) cannot happen since it implies the presence of a periodic point $x$ with $|(f^k)'x| \le 1$. Case (ii) is equally absurd, for it implies the existence of $\{k_i\}$, where $f^{k_i}(\omega)$ are arbitrarily short. We leave it as an easy exercise to see that this is incompatible with the definition of $M$.

For definiteness, consider $I = S^1$. Let $l_n(\hat x)$ and $r_n(\hat x)$ be the two points in ∪0
0 f −i C is dense in $I$. Assertion (c) follows immediately from this construction.

To show that $f|_{\Lambda^{(n)}}$ is conjugate to a shift of finite type, let $J^{(n)} = \{J_i^{(n)}\}$ be the partition of $I$ by $\cup_{0 \le i \le n} f^{-i}C$. For $J_i^{(n)} = (l_n(\hat x), \hat x)$ or $(\hat x, r_n(\hat x))$, observe that by construction, $f(J_i^{(n)})$ is equal to the union of a finite number of elements of $J^{(n)}$. Let $\Lambda_i^{(n)} = \Lambda^{(n)} \cap J_i^{(n)}$. Then the alphabet of the shift in question is $\{\Lambda_i^{(n)} : \Lambda_i^{(n)} \neq \emptyset\}$, and the transition $\Lambda_i^{(n)} \to \Lambda_j^{(n)}$ is admissible if $f(\Lambda_i^{(n)}) \supset \Lambda_j^{(n)}$. Where $I$ is an interval, $\Lambda^{(n)}$ is as above but restricted to the interval $[z_n^1, z_n^2]$, where $z_n^1$ and $z_n^2$ are the two points in ∪0
Our next result guarantees that $q(a)$ in Definition 4.1 is well defined.

Corollary 4.1. For $f \in M$, let $q \in I$ be such that $\delta_1 := \inf_{n \ge 0} d(f^n(q), C) > 0$. Then for all $g$ with $\|g - f\|_{C^2} < \varepsilon$, where $\varepsilon = \varepsilon(\delta_1)$, there is a unique point $q_g \in I$ with $\phi_g(q_g) = \phi_f(q)$.
Proof. Fix $n$ large enough that for all $i \ge 0$, $f^i(q) \notin (l_n(\hat x), r_n(\hat x))$ for all $\hat x \in C$, and let $\Lambda = \Lambda^{(n)}$. Let $B = \cup_i \partial\bar\Lambda_i$, where $\bar\Lambda_i$ is the shortest interval containing $\Lambda_i$. Since $B$ is a finite set with $f(B) \subset B$, it consists of preperiodic points. From Lemma 1.1, the periodic points in question are repelling. Thus if $g$ is sufficiently near $f$, there is a unique set $B_g$ with $g(B_g) \subset B_g$ such that $g|_{B_g}$ is conjugate to $f|_B$. Using $B_g$, we recover a set $\Lambda_g$ on which $g$ is conjugate to $f|_\Lambda$. The uniqueness of $q_g$ follows from the expanding property of $g$ away from $C$ (Lemma 1.1).
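For concreteness, the coding $\phi$ used throughout this subsection can be computed as follows. This is a minimal Python sketch; the helper `itinerary` and the sample map $x \mapsto 4x(1-x)$ with $C = \{1/2\}$ are assumptions chosen for illustration and are not claimed to belong to $M$.

```python
from bisect import bisect_right


def itinerary(f, x, critical_points, n):
    """Return the first n symbols of phi(x).

    Symbol k means that f^i(x) lies in J_k, where the components of
    I minus C are numbered 1, 2, ... from left to right. Raises ValueError
    if the orbit lands on the critical set, where phi is undefined.
    """
    crit = sorted(critical_points)
    symbols = []
    for _ in range(n):
        if x in crit:
            raise ValueError("orbit meets the critical set; phi(x) is undefined")
        symbols.append(bisect_right(crit, x) + 1)   # index of the component containing x
        x = f(x)
    return symbols


# Example with the (hypothetical) map x -> 4x(1 - x) and C = {1/2}.
print(itinerary(lambda x: 4 * x * (1 - x), 0.1, [0.5], 4))
```

Two points have the same $\phi$-coding through time $n$ exactly when this function returns the same list for both, which is the condition appearing in Corollary 4.1 with $n \to \infty$.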
We return now to the $C^2$ family $\{f_a\}$ with $f_0 \in M$. Let $\hat x \in C$ be fixed, and let $q = f_0(\hat x)$. We write $F(x, a) := f_a(x)$, reserving $(\cdot)'$ for $x$-derivatives, i.e., $\partial_x F(x, a) = (f_a)'(x)$.

Proposition 4.1. There is an interval $\omega$ in $a$-space containing 0 in its interior on which $q(a)$ is defined, $a \mapsto q(a)$ is differentiable, and
\[ \frac{d}{da} q(a) = -\sum_{i=1}^{\infty} \frac{\partial_a F(f_a^{i-1}(q(a)), a)}{(f_a^i)'(q(a))}. \tag{3} \]
Proof. Let $\Lambda \ni q$ be as in the proof of Corollary 4.1. We assume $\omega$ is short enough that for all $a \in \omega$, $q(a)$ is well defined, $\Lambda(a) \cap C_{\frac12\delta_0} = \emptyset$, and that (P1) holds for $f_a$ outside of $C_{\frac12\delta_0}$ with uniform bounds. In the computations below, we suppress the dependence on $a$, writing $f = f_a$, $q = q(a)$, $\partial_a F(\cdot) = \partial_a F(\cdot, a)$, and so on.

Continuing to use the notation in Corollary 4.1, we let $\Lambda_{i_0, i_1, \dots, i_m} = \{x \in I : f^j(x) \in \Lambda_{i_j}, 0 \le j \le m\}$, and let $\Lambda_{i_0, i_1, \dots, i_m}(q)$ be the cylinder set containing $q$. For each $m$, choose $q^{(m)} \in \partial\bar\Lambda_{i_0, i_1, \dots, i_m}(q)$. Then $q^{(m)} \to q$. It suffices to show that as functions of $a$, $\frac{d}{da} q^{(m)}$ converges uniformly to the right side of (3). Let $p^{(m)} = f^m(q^{(m)})$. Differentiating, we obtain
\[ \frac{d}{da} p^{(m)} = \sum_{i=1}^{m} (f^{m-i})'(f^i(q^{(m)}))\, \partial_a F(f^{i-1}(q^{(m)})) + (f^m)'(q^{(m)})\, \frac{d}{da} q^{(m)}. \]
This gives
\[ \frac{d}{da} q^{(m)} = \frac{\frac{d}{da} p^{(m)}}{(f^m)'(q^{(m)})} - \sum_{i=1}^{m} \frac{\partial_a F(f^{i-1} q^{(m)})}{(f^i)'(q^{(m)})}. \tag{4} \]
We stress that all the action below takes place outside of $C_{\frac12\delta_0}$, where $|(f^n)'|$ grows exponentially (with prefactor $\frac12\delta_0 c_1$).

To estimate (4), observe that since $p^{(m)} \in B$ (see the proof of Corollary 4.1), and $B$ is a finite set, $\frac{d}{da} p^{(m)}$ is uniformly bounded for all $m$. With $|(f^m)'(q^{(m)})|$ growing exponentially, the first term on the right is exponentially small. It remains to check that the second term converges uniformly to the right side of (3). Since the tail of the sum in (3) decreases exponentially (uniformly bounded numerator and exponentially increasing denominator), it suffices to verify that
\begin{align*}
A &:= \left| \sum_{i=1}^{m} \frac{\partial_a F(f^{i-1}(q^{(m)}))}{(f^i)'(q^{(m)})} - \sum_{i=1}^{m} \frac{\partial_a F(f^{i-1}(q))}{(f^i)'(q)} \right| \\
&\le \sum_{i=1}^{m} \frac{|\partial_a F(f^{i-1}(q^{(m)})) - \partial_a F(f^{i-1}(q))|}{|(f^i)'(q^{(m)})|} + \sum_{i=1}^{m} \frac{|\partial_a F(f^{i-1}(q))|}{|(f^i)'(q^{(m)})|} \cdot \left| \frac{(f^i)'(q^{(m)})}{(f^i)'(q)} - 1 \right|
\end{align*}
converges to 0 uniformly. Consider the $i$th term in the first sum. By the expanding property of $f$ outside of $C_{\frac12\delta_0}$, the numerator is $< \text{const}\, e^{-\frac14\lambda_0(m-i)} |f^m(q^{(m)}) - f^m(q)|$, while the denominator is $> \text{const}\, e^{\frac14\lambda_0 i}$. The second sum is estimated similarly. Together they give $A < \text{const}\, m e^{-\frac14\lambda_0 m}$.
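Formula (3) can be sanity-checked on a case where the continuation is known in closed form. The sketch below uses the family $F(x, a) = a x (1 - x)$ and its fixed point $q(a) = 1 - 1/a$, which is repelling for $a > 3$ and has $dq/da = 1/a^2$. This family, the parameter value and the function names are assumptions made purely for illustration; no claim is made that the example satisfies the standing hypotheses on $f_0$.

```python
def dF_da(x, a):
    """Partial derivative in a of F(x, a) = a x (1 - x)."""
    return x * (1.0 - x)


def df_dx(x, a):
    """Derivative in x of f_a(x) = a x (1 - x)."""
    return a * (1.0 - 2.0 * x)


def dq_da_series(q, a, terms=60):
    """Truncated right-hand side of (3): -sum_i dF_da(f^{i-1} q) / (f^i)'(q)."""
    total = 0.0
    deriv = 1.0          # running value of (f_a^i)'(q)
    x = q                # running value of f_a^{i-1}(q)
    for _ in range(terms):
        deriv *= df_dx(x, a)
        total += dF_da(x, a) / deriv
        x = a * x * (1.0 - x)
    return -total


a = 3.8
q = 1.0 - 1.0 / a                      # repelling fixed point of f_a
print(dq_da_series(q, a), 1.0 / a**2)  # series value vs. closed-form derivative
```

The series converges geometrically here because $|(f_a)'(q)| = |2 - a| > 1$, mirroring the role of the expanding property in the proof.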
Let $\hat x_i(a) := f_a^i(\hat x(a))$, and recall the definition of $\hat c(\hat x)$ in Definition 4.1.

Corollary 4.2.
\[ \hat c(\hat x) = \frac{d\hat x_1}{da}(0) + \sum_{i=1}^{\infty} \frac{\partial_a F(\hat x_i(0), 0)}{(f_0^i)'(\hat x_1(0))}. \]
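We have thus obtained an equivalent formulation of (PT) that involves only $\partial_a F(\cdot, \cdot)$ and properties of $f_0$.

4.3. Comparability of x- and a-derivatives. For $f_0$, $\lambda$, $\alpha$ and $\varepsilon$ as in Proposition 2.1, we define
\[ G_N(f_0; \lambda, \alpha, \varepsilon) := \{f : \|f - f_0\|_{C^2} < \varepsilon \text{ and (G1), (G2) hold for all } \hat x \in C \text{ and } n \le N\}. \]
Proposition 4.2. Let $\lambda$, $\alpha$ and $\varepsilon$ be fixed. Then there exist $\hat\varepsilon > 0$ and $\hat i \in \mathbb{Z}^+$ such that the following holds for all $N \in \mathbb{Z}^+$: Let $\Delta_N \subset (-\hat\varepsilon, \hat\varepsilon)$ be such that $f_a \in G_N(f_0; \lambda, \alpha, \varepsilon)$ for all $a \in \Delta_N$. Then for every $a \in \Delta_N$ and $\hat x \in C$,
\[ \frac{1}{2} |\hat c(\hat x)| < \frac{|\frac{d}{da} \hat x_i(a)|}{|(f_a^{i-1})'(\hat x_1)|} < 2 |\hat c(\hat x)| \quad \text{for } \hat i < i \le N. \]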
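Proof. Writing
\[ \frac{d}{da} \hat x_i(a) = (f_a)'(\hat x_{i-1})\, \frac{d}{da} \hat x_{i-1}(a) + \partial_a F(\hat x_{i-1}, a), \]
we obtain inductively
\[ \frac{\frac{d}{da} \hat x_i(a)}{(f_a^{i-1})'(\hat x_1)} = \frac{d}{da} \hat x_1(a) + \sum_{j=1}^{i-1} \frac{\partial_a F(\hat x_j, a)}{(f_a^j)'(\hat x_1)}. \]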
Letting $I(a, i)$ denote the expression on the right side above, we choose $\hat i$ large enough that (i) $I(0, \hat i) \approx \hat c(\hat x)$ and (ii) for $i > \hat i$,
\[ \sum_{j=\hat i}^{i-1} \left| \frac{\partial_a F(\hat x_j, a)}{(f_a^j)'(\hat x_1)} \right| \ll |\hat c(\hat x)| \quad \text{uniformly for all } a \in J. \]
(i) makes sense because $\hat c(\hat x) \neq 0$ by (PT). (ii) is because $|\partial_a F(\hat x_j, a)| < K$ and $|(f_a^j)'(\hat x_1)| > c_1 e^{\lambda j}$ from (G2). Since only a finite number of iterates are involved, we may now shrink $\hat\varepsilon$ sufficiently so that $|I(a, \hat i) - I(0, \hat i)| \ll |\hat c(\hat x)|$ for all $a \in (-\hat\varepsilon, \hat\varepsilon)$.
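The recursion in the proof of Proposition 4.2 is easy to run numerically. The following Python sketch does so for the quadratic family $F(x, a) = 1 - a x^2$ with critical point $\hat x = 0$, so that $\hat x_1 = 1$ and $\frac{d}{da}\hat x_1 = 0$. The choice of family and all names below are assumptions for illustration only. At $a = 2$ the critical orbit is $0 \to 1 \to -1 \to -1 \to \cdots$, and the normalized derivatives can be summed by hand: they equal $\sum_{j=1}^{i-1} 4^{-j}$ and tend to $1/3$.

```python
def critical_orbit_derivatives(a, n):
    """For F(x, a) = 1 - a x^2 and critical point 0, return three lists
    indexed by i = 1..n (stored at position i - 1):
      xs     : x_i(a) = f_a^i(0)
      taus   : d x_i / da, via tau_i = f_a'(x_{i-1}) tau_{i-1} + dF/da(x_{i-1})
      ratios : tau_i / (f_a^{i-1})'(x_1)
    """
    x, tau, dprod = 1.0, 0.0, 1.0        # x_1 = 1, tau_1 = 0, (f^0)' = 1
    xs, taus, ratios = [x], [tau], [tau / dprod]
    for _ in range(n - 1):
        fx = -2.0 * a * x                # (f_a)'(x_{i-1})
        tau = fx * tau - x * x           # dF/da(x, a) = -x^2
        dprod *= fx                      # now (f_a^{i-1})'(x_1)
        x = 1.0 - a * x * x
        xs.append(x)
        taus.append(tau)
        ratios.append(tau / dprod)
    return xs, taus, ratios


_, _, ratios = critical_orbit_derivatives(2.0, 20)
print(ratios[-1])   # close to 1/3
```

In this example the ratio stabilizes after a few iterates, which is the phenomenon that the choice of $\hat i$ in the proof is designed to capture.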
References: The comparability of x- and a-derivatives is first used in [BC1]; the unimodal case of the material in Sects. 4.2 and 4.3 is in [TTY].

5. Evolution of Critical Curves

5.1. Logistics. Much of the rest of the analysis revolves around evolutions of the type
\[ a \mapsto \hat x_i(a), \quad i = 0, 1, 2, \dots, \quad \text{for } \hat x \in C,\ a \in J, \]
where $J$ is an interval in $(-\hat\varepsilon, \hat\varepsilon)$. Via the geometry of these curves, we seek to determine what fraction of $J$ corresponds to "bad" parameters, that is to say, what fraction of these curves comes too close to the critical set (thereby violating (G1)), or visits the critical set too often (thereby violating (G2)). Herein lies the dilemma: In order for the curves $a \mapsto \hat x_i(a)$ to have controlled geometry, the maps $f_a$ corresponding to the $a$'s involved must (individually) be good to start with. On the other hand, by studying curve segments corresponding to good parameters only, how are we to determine what fraction of parameters are bad? The following discussion motivates our answer to this logistical dilemma.

Properties of $f \in G_N(f_0; \lambda, \alpha, \varepsilon)$ up to time $\frac{1}{\alpha^*}N$, $\alpha^* := \frac{3}{\lambda}\alpha$. Assume $f \in G_N(f_0; \lambda, \alpha, \varepsilon)$, and consider $\hat x \in C$. Suppose for some $n \le \frac{1}{\alpha^*}N$, $\hat x_n \in C_\delta$ with $d(\hat x_n, C) > e^{-\alpha n}$. We claim that (P2) in Sect. 2.1 holds for this return even though $n$ may be greater than $N$. This is because the critical point that will guide $\hat x_n$ through its partial derivative recovery obeys (G1) and (G2) up to time $N$, and by the proof of (P2), the time it takes to complete this recovery is $< \frac{3}{\lambda}\alpha n \le N$. Indeed if we assume (G1) holds for $\hat x$ up to time $n$, $n \le \frac{1}{\alpha^*}N$, then on the time interval $[0, n]$, the orbit of $\hat x_1$ has the bound/free behavior described in Sect. 2.1. Moreover, by an argument identical to that for Corollary 2.1, we have $|(f^j)'(\hat x_1)| > K^{-1} e^{\frac14\lambda j}$ for $j \le n$.

We remark that beyond time $\frac{1}{\alpha^*}N$, the dynamical description of $\hat x$ in the last paragraph ceases to be valid as soon as a bound period $> N$ is encountered. Conversely, the behaviors of other critical orbits beyond time $N$ do not impact the properties of $\hat x$ up to time $\frac{1}{\alpha^*}N$. In view of the discussion above, we modify Proposition 4.2 slightly as follows:

Proposition 4.2’. In addition to the hypotheses in Proposition 4.2, we assume that for some $\hat x \in C$ and $n \in (N, \frac{1}{\alpha^*}N]$, (G1) holds for $\hat x$ up to time $n$. Then the conclusion of Proposition 4.2 holds for this $\hat x$ for all $i \le n$.
5.2. Duality between phase-space and parameter-space dynamics. Setting. Let $\lambda < \frac14\lambda_0$ be as before. To establish the above-mentioned duality, new upper bounds are imposed on $\alpha$ and $\varepsilon$ (or equivalently $\hat\varepsilon$). Let $\Delta_N = \{a \in (-\hat\varepsilon, \hat\varepsilon) : f_a \in G_N(f_0; \lambda, \alpha, \varepsilon)\}$. For the rest of Sect. 5, we fix $\hat x \in C$. All parameters considered are assumed to be in $\Delta_N$; all indices considered are assumed to be $\le \frac{1}{\alpha^*}N$, and (G1) is assumed to hold for $\hat x$ for all the indices in question. We use the notation $\tau_i(a) := \frac{d}{da}\hat x_i(a)$.

Our main results are (P1’)–(P3’), three properties of $a \mapsto \hat x_i(a)$ that are the analogs of (P1)–(P3) in Sect. 2.1. We state also two lemmas that lie at the heart of these properties. To avoid disrupting the flow of ideas, proofs are postponed to Sect. 5.3.

Lemma 5.1. Let $n > \hat i$, where $\hat i$ is as in Proposition 4.2. Then
\[ (1 - Ke^{-\frac14\lambda n})\, |(f_a^i)'(\hat x_n)| \le \frac{|\tau_{n+i}|}{|\tau_n|} \le (1 + Ke^{-\frac14\lambda n})\, |(f_a^i)'(\hat x_n)|. \]

(P1’) (Outside of $C_\delta$). There exists $i_0 \ge \hat i$ such that the following hold for $n \ge i_0$:
(i) If $\hat x_n$ is free, and $\hat x_{n+j} \notin C_\delta$ $\forall\, 0 \le j < j_0$, then $|\tau_{n+j}| > \frac12 c_1 \delta e^{\frac14\lambda_0 j} |\tau_n|$ for $j \le j_0$;
(ii) if in addition $\hat x_{n+j_0} \in C_{\delta_0}$, then $|\tau_{n+j_0}| > \frac12 c_1 e^{\frac14\lambda_0 j_0} |\tau_n|$.
The reader should think of $i_0$ as the time after which x- and a-derivatives are sufficiently close in the sense of Lemma 5.1. We assume $\hat x_i \notin C_{\frac12\delta_0}$ for all $i < i_0$.
Consider next an interval $\omega \subset \Delta_N$ with $\hat x_n(\omega) \subset I^+_{\mu j}$. To establish the desired relationship between phase-space and parameter-space dynamics during the bound period, we impose the following additional upper bound on $\alpha$: Let $L$ be a Lipschitz constant of the map $G : (x, a) \mapsto (f_a(x), a)$. We assume $\alpha$ is small enough that
\[ L^{\frac{3}{\lambda}\alpha} < e^{\frac18\lambda}. \tag{5} \]
For each $a \in \omega$, let $p_a$ denote the bound period of $I_{\mu j}$ with respect to $f_a$, and let $HD(\cdot, \cdot)$ denote the Hausdorff distance between two sets.

Lemma 5.2. Let $\omega$ and $\alpha$ be as above. Then the following hold for all $a \in \omega$:
\[ HD(\hat x_{n+j}(\omega), f_a^j(\hat x_n(\omega))) \ll e^{-2\alpha j} \quad \text{for all } j < p_a. \]
We define the bound period $\hat p_n(\omega)$ of $\hat x_n(\omega)$ in parameter-space dynamics to be $\hat p_n(\omega) := \min\{p_a : a \in \omega\}$.

(P2’) (Partial derivative recovery). Suppose $\hat x_n(\omega) \subset I^+_{\mu j}$, and let $\hat p = \hat p_n(\omega)$. Then
(a) $\frac{1}{3\ln(\max|f'|)} |\mu| \le \hat p \le \frac{3}{\lambda} |\mu|$;
(b) for $a, a' \in \omega$ and $j < \hat p$, $|\hat x_{n+j}(a) - \hat x_{n+j}(a')| < 2e^{-2\alpha j}$;
(c) $|\tau_{n+\hat p}(a)| > e^{\frac{\lambda \hat p}{4}} |\tau_n(a)|$ for all $a \in \omega$;
(d) if $\hat x_n(\omega) \approx I_{\mu j}$, then $|\hat x_{n+\hat p}(\omega)| \ge e^{-\frac{8\alpha}{\lambda}|\mu|}$.

To state (P3’), we divide each orbit in the time interval $[i_0, n]$ into bound and free periods as in Sect. 2.1, and say all $a \in \omega$ have the same itinerary up to time $n$ if (i) their bound and free periods coincide and (ii) whenever $\hat x_i(\omega)$ is free, it is $\subset \pi^+$ for some $\pi \in P$.
(P3’) (Global distortion). There exist $i_1 > i_0$ and $K_3 > 1$ such that if $n \ge i_1$ and all points in $\omega$ have the same itinerary through step $n - 1$, then for all $a, a' \in \omega$,
\[ K_3^{-1} < \frac{\tau_n(a)}{\tau_n(a')} < K_3. \]
Shrinking $\varepsilon$ if necessary, we assume from here on that $\hat x_i \notin C_{\frac12\delta_0}$ for all $i < i_1$.
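5.3. Proofs of (P1’)–(P3’). Proof of Lemma 5.1. Let $\xi = \hat x_n$. From $\tau_{n+i} = (f_a)'(\xi_{i-1})\tau_{n+i-1} + \partial_a F(\xi_{i-1}, a)$, we deduce inductively that
\begin{align*}
\tau_{n+i} &= (f_a^i)'(\xi)\tau_n + \sum_{j=0}^{i-1} (f_a^{i-j-1})'(\xi_{j+1})\, \partial_a F(\xi_j, a) \\
&= (f_a^i)'(\xi)\tau_n \left( 1 + \sum_{j=0}^{i-1} \frac{\partial_a F(\xi_j, a)}{(f_a^{j+1})'(\xi)\tau_n} \right).
\end{align*}
Proposition 4.2’ and Corollary 2.1 then give
\[ \sum_{j=0}^{i-1} \left| \frac{\partial_a F(\xi_j, a)}{(f_a^{j+1})'(\xi)\tau_n} \right| \le 2|\hat c(\hat x)|^{-1} \sum_{j=0}^{i-1} \left| \frac{\partial_a F(\xi_j, a)}{(f_a^{n+j})'(\hat x_1(a))} \right| \le Ke^{-\frac14\lambda n}. \]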
Proof of (P1’). (P1’) follows immediately from (P1) via Lemma 5.1. With i0 large enough, the x- and a-derivatives are as close as need be.
Proof of Lemma 5.2. It suffices to consider the end points of the segments to be compared. Suppose $\omega = [\bar a_1, \bar a_2]$. Then for $i = 1, 2$,
\[ |\hat x_{n+j}(\bar a_i) - f_a^j(\hat x_n(\bar a_i))| = |G^j(\hat x_n(\bar a_i), \bar a_i) - G^j(\hat x_n(\bar a_i), a)| \le L^j |\bar a_i - a| \le L^j |\omega|. \tag{6} \]
By Proposition 4.2 and Corollary 2.1, $|\omega| \le Ke^{-\frac14\lambda n}$. Also, $j \le p_a$, which, by (P2), is $\le \frac{3}{\lambda}\alpha n$. Thus $L^j |\omega| \ll e^{-2\alpha j}$ if $\alpha$ satisfies (5).
Proof of (P2’). (a) is true because pˆ = pa¯ for some a¯ ∈ ω. (b) and (d) are immediate consequences of Lemma 5.2, and (c) follows from Lemma 5.1.
Turning now to the setting of (P3’), we let $a, a' \in \omega$, and for some $k$ with $i_0 \le k < n$, let $\xi = \hat x_k(a)$ and $\xi' = \hat x_k(a')$.

Lemma 5.3. There exists $K > 0$ such that for all $i$ such that $k + i \le n$,
\[ \frac{(f_{a'}^i)'(\xi')}{(f_a^i)'(\xi)} \le \exp\left\{ K \sum_{j=0}^{i-1} \frac{|\xi_j' - \xi_j|}{d(\xi_j, C)} \right\}. \]
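Proof. First we write
\[ \log \frac{(f_{a'}^i)'(\xi')}{(f_a^i)'(\xi)} \le \sum_{j=0}^{i-1} \frac{|(f_{a'})'(\xi_j') - (f_a)'(\xi_j)|}{|(f_a)'(\xi_j)|} < K \sum_{j=0}^{i-1} \frac{|\xi_j' - \xi_j| + |a - a'|}{d(\xi_j, C)} = K \sum_{j=0}^{i-1} \frac{|\xi_j' - \xi_j|}{d(\xi_j, C)} \left( 1 + \frac{|a - a'|}{|\xi_j' - \xi_j|} \right). \]
Then we use Proposition 4.2’ and Corollary 2.1 to estimate the quantity in parenthesis: Assuming for definiteness that $a' < a$,
\[ |\xi_j' - \xi_j| = \int_{a'}^{a} |\tau_{k+j}(s)|\, ds \ge \frac{1}{2} |\hat c(\hat x)| \int_{a'}^{a} |(f_s^{k+j-1})'(\hat x_1(s))|\, ds \ge \frac{1}{2} |\hat c(\hat x)|\, e^{\frac14\lambda k} |a - a'|. \tag{7} \]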
Proof of (P3’). By Proposition 4.2’, it suffices to show
\[ \frac{(f_{a'}^{n-1})'(\hat x_1(a'))}{(f_a^{n-1})'(\hat x_1(a))} < K, \]
the proof of which follows closely that of (P3) in Sect. 2.2. Let $i_0 < t_1 < t_1 + p_1 \le t_2 < t_2 + p_2 \le \cdots$, where $t_i$ are free return times and $p_i$ are the associated bound periods. For definiteness we assume $t_q + p_q \le n \le t_{q+1}$. Then
\[ \log \frac{(f_{a'}^{n-1})'(\hat x_1(a'))}{(f_a^{n-1})'(\hat x_1(a))} \le \log \frac{(f_{a'}^{t_1 - 1})'(\hat x_1(a'))}{(f_a^{t_1 - 1})'(\hat x_1(a))} + \sum_{k=1}^{q} (S_k' + S_k''), \]
where
\[ S_k' = \log \frac{|(f_{a'}^{p_k})'(\hat x_{t_k}(a'))|}{|(f_a^{p_k})'(\hat x_{t_k}(a))|} \quad \text{and} \quad S_k'' = \log \frac{|(f_{a'}^{t_{k+1} - (t_k + p_k)})'(\hat x_{t_k + p_k}(a'))|}{|(f_a^{t_{k+1} - (t_k + p_k)})'(\hat x_{t_k + p_k}(a))|}, \]
except for $S_q''$, which ends at time $n - 1$ instead of $t_{q+1}$. Let $\sigma_k = [\hat x_{t_k}(a), \hat x_{t_k}(a')]$. It follows from (P2’)(c) and (P1’)(ii) that, for $k < q$, $|\sigma_{k+1}| \ge \frac12 c_1 e^{\frac14\lambda(t_{k+1} - t_k)} |\sigma_k|$, which we may assume is $\ge \hat\tau |\sigma_k|$ for some $\hat\tau > 1$ (the factor $\frac12 c_1$ is again absorbed into the exponential assuming $\delta$ is sufficiently small). In the proof of (P3), the derivative estimates in $S_k'$ and $S_k''$ are converted to distance estimates. This is exactly what we have done in Lemma 5.3. More precisely, if $\xi = \hat x_{t_k}(a)$ and $\xi' = \hat x_{t_k}(a')$, then by Lemma 5.3,
\[ S_k' \le K \sum_{j=0}^{p_k - 1} \frac{|\xi_j' - \xi_j|}{d(\xi_j, C)} \le K \sum_{j=0}^{p_k - 1} \frac{|f_a^j(\xi') - f_a^j(\xi)|}{d(\xi_j, C)} + K \sum_{j=0}^{p_k - 1} \frac{|f_{a'}^j(\xi') - f_a^j(\xi')|}{d(\xi_j, C)}. \]
The first sum on the right involves only the map $f_a$ and is $\le K \frac{|\sigma_k|}{d(\xi, C)}$ by the corresponding argument in the proof of (P3). For the second sum, we combine the estimate in (6) with the result of (7) to show that it is
\[ \le K \sum_{j=0}^{p_k - 1} (Le^\alpha)^j |a - a'| \le K (Le^\alpha)^{\frac{3}{\lambda}\alpha t_k} e^{-\frac14\lambda t_k} |\sigma_k| \ll |\sigma_k|. \]
After this, the rest of the proof is as before. The sums $S_k''$ (including $k = q$) are estimated similarly, the only difference being that the exponential growth of $|\xi_j' - \xi_j|$ here is derived from (P1’). It remains to treat the initial stretch. The estimate from time $i_0$ to time $t_1$ is identical to that for $S_k''$, and
\[ \log \frac{(f_{a'}^{i_0 - 1})'(\hat x_1(a'))}{(f_a^{i_0 - 1})'(\hat x_1(a))} \le K i_0 |a - a'|, \]
where $K$ is related to the $C^2$ norm of the 1-parameter family. Since $|a - a'| < Ke^{-\frac14\lambda i_1}$, we know that the contribution of the first $i_0$ iterates is $< 1$ if $i_1$ is sufficiently large. This completes the proof of (P3’).
Reference: A version of the material in Sects. 5.2 and 5.3 is contained in [BC1].
6. Two Parameter Estimates

6.1. Processes defined by critical orbits. Let $\lambda$, $\alpha$, $\varepsilon$ and $\hat\varepsilon$ be as in Sect. 5.2. Associated with each $\hat x \in C$ we now introduce the idea of a process $\{\gamma_i\}$ describing the dynamics of $a \mapsto \hat x_i(a)$ combined with a deletion process. The domain of definition of this process is $\Delta_0$, a subinterval of $(-\hat\varepsilon, \hat\varepsilon)$. For each $i = 0, 1, 2, \dots$ and $a \in \Delta_0$, $\gamma_i(a)$ is equal to either $\hat x_i(a)$ or $*$, the meaning of the latter being that the parameter in question will not be considered further, i.e., it is deleted. In particular, if $\gamma_i(a) = *$, then $\gamma_{i+k}(a) = *$ for all $k > 0$. Roughly speaking, we seek to identify a decreasing sequence of subsets $\{\gamma_i \neq *\}$ of $\Delta_0$ and a sequence of partitions $Q_i$ defined on $\{\gamma_i \neq *\}$ representing the canonical subdivision associated with $a \mapsto \hat x_i(a)$. This subdivision is carried out in a manner analogous to that in Sect. 3.1.

As the reader will recall, the construction in Sect. 3.1 relies on the decomposition of orbits into bound and free periods. We have seen in Sect. 5 that bound/free notions are well defined for $a \mapsto \hat x_i(a)$ under suitable circumstances. The idea is that whenever we are unable to guarantee these circumstances, we will delete the parameter. Later on, we will see that it is useful also to make deletions for other purposes (but will not concern ourselves with that for now). This is the motivation for the definition below.

For $N \le \infty$, we say $\{\gamma_i, i < N\}$ is a process associated with $\hat x$ if the following hold: Let $\Delta_0 \subset (-\hat\varepsilon, \hat\varepsilon)$ be an interval containing 0, and let $i_1$ be as in Sect. 5.2.
(1) (a) We assume $\gamma_i(\Delta_0) \cap C_{\frac12\delta_0} = \emptyset$ for all $i \le i_1$. In this time range, $\gamma_i$ has no meaning. We set $\gamma_i = \hat x_i$ and $Q_i = \{\Delta_0\}$. No deletions are permitted.
(b) A subdivision must occur before $\gamma_i(\Delta_0)$ meets $C_\delta$.
(2) At time $i > i_1$, we assume that for all $j < i$, $\gamma_j$ are defined, as are $Q_j$, representing a canonical subdivision. We assume also that the notion of bound/free makes sense on each $\omega \in Q_j$. Consider now $\omega \in Q_{i-1}$ (on which $\gamma_{i-1} \neq *$). We first put on it the canonical partition $Q_i$ as defined in Sect. 3.1. On each $\hat\omega \in Q_i|_\omega$, there are 2 options: we either let $\gamma_i = \hat x_i$ on all of $\hat\omega$, or we let it $= *$ on all of $\hat\omega$. The rules are as follows:
(a) We are free to set $\gamma_i = *$ or $\hat x_i$ on any $\hat\omega$ for which $\hat x_i(\hat\omega)$ is outside of $C_\delta$.
(b) If $\hat x_i(\hat\omega) \subset C_\delta$, the following conditions must be met if we wish to set $\gamma_i|_{\hat\omega} = \hat x_i$:
(i) $\hat x_i(\hat\omega) \cap \{d(\cdot, C) < e^{-\alpha i}\} = \emptyset$;
(ii) $\hat\omega \subset \Delta_{\alpha^* i}$, i.e., $f_a \in G_{\alpha^* i}(f_0; \lambda, \alpha, \varepsilon)$ for all $a \in \hat\omega$.
Finally, we set $\gamma_i = *$ on $\{\gamma_{i-1} = *\}$. This completes our definition at the $i$th step. Paragraph (2) is then repeated with $i + 1$ in the place of $i$.

We observe that the process above is well defined. It is clearly well defined initially. When $\gamma_i(\hat\omega) \subset C_\delta$, (2)(b) guarantees that for all $a \in \hat\omega$ for which $\gamma_i(a) \neq *$, the ensuing bound period is meaningful (see Sect. 5.1). Once this is taken care of, Sect. 5.2 gives the desired resemblance to phase-space dynamics until the next free return.

We remark that even though $\{\gamma_i\}$ is associated with a particular $\hat x \in C$, Condition (2)(b)(ii) demands good behavior of all critical orbits up to time $\alpha^* i$. The fact that this requirement is only up to time $\alpha^* i$, which is $\ll i$, is crucial for us. Observe also that on each $\Delta_0$, there is a maximal process, referring to the one in which the only deletions are those in (2)(b). All other processes are subordinate to this one, meaning they are defined by rules that demand further deletions.
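The absorbing nature of deletions can be illustrated on a finite grid of parameters. The Python sketch below is a caricature under explicit assumptions: it uses the quadratic family $F(x, a) = 1 - a x^2$ with critical point 0, applies only a rule modeled on (2)(b)(i), and ignores the partitions $Q_i$, the initial range $i \le i_1$ and condition (2)(b)(ii). The function name, the grid and the values of $\alpha$ and $\delta$ are hypothetical.

```python
import math


def surviving_parameters(params, n, alpha, delta):
    """Grid caricature of deletions for F(x, a) = 1 - a x^2.

    A parameter is deleted at the first step i at which the critical orbit
    x_i(a) = f_a^i(0) lies in (-delta, delta) with |x_i(a)| < exp(-alpha * i).
    Deleted parameters stay deleted (gamma_i = * is absorbing).
    Returns a list `alive` where alive[i] is the set of parameters with
    gamma_i != *.
    """
    x = {a: 0.0 for a in params}
    alive = [set(params)]
    for i in range(1, n + 1):
        current = set()
        for a in alive[-1]:
            x[a] = 1.0 - a * x[a] * x[a]
            if abs(x[a]) < delta and abs(x[a]) < math.exp(-alpha * i):
                continue                  # deleted at step i
            current.add(a)
        alive.append(current)
    return alive


params = [1.0] + [2.0 - k / 256 for k in range(64)]
alive = surviving_parameters(params, 50, alpha=0.05, delta=0.1)
print(len(alive[0]), len(alive[-1]))
```

For $a = 2$ the critical orbit is $0 \to 1 \to -1 \to -1 \to \cdots$ and is never deleted, while for $a = 1$ the critical point is periodic of period 2 and the parameter is removed at step 2.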
6.2. Deletions due to (G1). We fix $\hat x \in C$ and $\Delta_0$, and let $\{\gamma_i, i < n\}$ be a process associated with $\hat x$.

Lemma 6.1. There exists $K$ such that for all $\omega \in Q_{n-1}$, if $\omega_1$ is the part of $\omega$ deleted on account of (G1) at step $n$, then
\[ |\omega_1| < Ke^{-\frac12\alpha n} |\omega|. \]
Proof. Suppose $\omega_1 \neq \emptyset$. Let $j_0$ be the largest $j < n$ such that (i) $\omega \in Q_j$, (ii) $\gamma_j(\omega)$ is free and "long". (Such a $j_0$ exists by condition (1)(b) in Sect. 6.1; one may have to go back to the time when $\omega$ is created as a result of a subdivision.) There are two possibilities:

Case 1. $\gamma_{j_0}(\omega)$ is outside of $C_\delta$, i.e., $\gamma_{j_0}(\omega) \approx \pi$ for some $\pi$ with $\pi \cap C_\delta = \emptyset$. Then $|\gamma_{j_0}(\omega)| \ge \delta$, and $|\gamma_n(\omega)| > K^{-1}\delta$. Not knowing the location of $\gamma_n(\omega)$, we assume the worst-case scenario, namely that $\gamma_n(\omega)$ crosses entirely a forbidden region $\{d(\cdot, \hat y) < e^{-\alpha n}\}$ for some $\hat y \in C$. Thus the fraction of $\omega$ with $d(\hat x_n, C) < e^{-\alpha n}$ is $< 2e^{-\alpha n} \cdot K\delta^{-1}$, which we may assume is $< Ke^{-\frac12\alpha n}$ (see the paragraph following Proposition 2.2). Here (P3’) is used to transfer the ratio of lengths on $\gamma_n(\omega)$ back to $\omega$.

Case 2. $\gamma_{j_0}(\omega) \approx I_{\mu j}$. Let $p$ be the bound period initiated at time $j_0$. Observe first that $|\gamma_n(\omega)| > K^{-1} |\gamma_{j_0 + p}(\omega)| > K^{-1} e^{-\frac{1}{10}|\mu|}$: Since $\gamma_n(\omega)$ is free (otherwise there would be no deletion), $n \ge j_0 + p$. The first inequality follows from (P1’)(ii) combined (possibly) with (P2’)(c); the second follows from (P2’)(d). Observe also that by design, $|\mu| \le \alpha j_0 < \alpha n$, so the fraction of $\omega$ being estimated is again $< Ke^{-\alpha n} e^{\frac{1}{10}\alpha n} < Ke^{-\frac12\alpha n}$.
6.3. Deletions on account of (G2). We begin with an estimate on derivative growth in terms of the time an orbit spends in bound periods initiated at returns to $C_{\hat\delta}$ for arbitrary $\hat\delta < \delta$. Consider $f \in G_N(f_0; \lambda, \alpha, \varepsilon)$ and $n \le \frac{1}{\alpha^*}N$. Let $x \in I$ be such that $d(x_i, C) \ge \min\{\frac12\delta_0, e^{-\alpha i}\}$ for all $0 \le i < n$. By the reasoning in Sect. 5.1, the usual bound/free decomposition makes sense for the orbit of $x$ up to time $n$. Let $B(\hat\delta; n)$ denote the total number of $i$, $0 \le i \le n$, such that $x_i \in C_{\hat\delta}$ or it is in a bound period initiated from a visit to $C_{\hat\delta}$.

Lemma 6.2. Let $f$ and $x$ be as above. Given $\hat\delta \le \delta$ and $\sigma > 0$, if $B(\hat\delta; n) \le \sigma n$, then
\[ |(f^n)'(x)| > K^{-1} \hat\delta\, e^{[(1 - \sigma)\frac14\lambda_0 - \alpha] n}. \]
Proof. Consider first the case where $x_n$ is free. Let $\hat t_1 < \hat t_1 + \hat p_1 \le \hat t_2 < \hat t_2 + \hat p_2 \le \cdots \le \hat t_k + \hat p_k \le n$, where $\hat t_1, \dots, \hat t_k$ are the consecutive free return times to $\{d(\cdot, C) < \hat\delta\}$. Then
\[ (f^n)'(x) = (f^{n - \hat t_k - \hat p_k})'(x_{\hat t_k + \hat p_k}) \cdot (f^{\hat p_k})'(x_{\hat t_k}) \cdot (f^{\hat t_k - \hat t_{k-1} - \hat p_{k-1}})'(x_{\hat t_{k-1} + \hat p_{k-1}}) \cdots (f^{\hat t_1})'(x). \]
We use (P1)(i) for $(f^{n - \hat t_k - \hat p_k})'(x_{\hat t_k + \hat p_k})$, (P1)(ii) for $(f^{\hat t_i - \hat t_{i-1} - \hat p_{i-1}})'(x_{\hat t_{i-1} + \hat p_{i-1}})$, and the trivial estimate $|(f^{\hat p_i})'(x_{\hat t_i})| > c_1^{-1}$ for growth during bound periods (see (P2)(ii)). This gives $|(f^n)'(x)| > K^{-1} \hat\delta\, e^{\frac14\lambda_0(1 - \sigma)n}$ since $\hat p_1 + \cdots + \hat p_k \le \sigma n$ by assumption. The factor $e^{-\alpha n}$ is needed if $x_n$ is not free; see Corollary 2.1.

Corollary 6.1. Let the hypotheses be as in Lemma 6.2, with $x = \hat x_1$ for some $\hat x \in C$. We assume further that $d(\hat x_i, C) > \frac12\delta_0$ for all $i \le n_0$, where $n_0$ is sufficiently large depending on $\hat\delta$. Then $B(\hat\delta; n) < \sigma n$ implies $|(f^n)'(\hat x_1)| > c_1 e^{[(1 - \sigma)\frac14\lambda_0 - \alpha] n}$.

Proof. The factor $\hat\delta$ is absorbed into the initial growth if $n > n_0$.
For $f \in G$, it can be deduced from properties of the invariant measure that $\int \frac1n B(\hat\delta; n)\, d\mu$ decreases with $\hat\delta$. In light of the duality in Sect. 5.2, one may expect a similar phenomenon for $a \mapsto \gamma_i(a)$. We formulate below a large deviation estimate useful for estimating the measure of parameters deleted on account of (G2).

Let $\{\gamma_i, i < n\}$ be as in Sect. 6.2. For $a$ such that $\gamma_n(a) \neq *$, let $B(a, \hat\delta; n)$ be the number $B(\hat\delta; n)$ defined above with $f = f_a$ and $x = \hat x$.

Proposition 6.1. Given any $\sigma > 0$, there exist positive numbers $\hat\varepsilon_1 = \hat\varepsilon_1(\sigma)$ and $\hat\delta = \hat\delta(\sigma)$ such that
\[ |\{a \in \Delta_0 : \gamma_n \neq * \text{ and } B(a, \hat\delta; n) > \sigma n\}| < e^{-\hat\varepsilon_1 n} |\Delta_0|. \]

6.4. Large deviation estimate. We first state the analog of Lemma 3.3. Let $\hat\omega \in Q_{j_0}$ be such that $\gamma_{j_0}(\hat\omega) \neq *$ and is free. On $\hat\omega$ we define $\hat S$, a stopping time starting from $j_0$, as follows: We extend the process on $\hat\omega$ beyond time $j_0$, and for each $a \in \hat\omega$, let $k = k(a) > j_0$ be the first time when $\gamma_k(Q_{k-1}(a))$ is not in a bound period and has length $> \delta$. If such a $k$ exists, we set $\hat S(a) = k - j_0$. If $a$ is deleted before that happens, we set $\hat S(a) = 0$.
Lemma 6.3. Let $\hat\omega \in Q_{j_0}$ be such that $\gamma_{j_0}(\hat\omega)$ is free and $\approx I_{\mu j}$. Then
\[ |\{a \in \hat\omega : \hat S(a) > m\}| < e^{-\frac12 K^{-1} m} |\hat\omega| \quad \text{for all } m > K \log |\mu|. \]
The proof is entirely parallel to that of Lemma 3.3 in Sect. 3.1.

Proof of Proposition 6.1. We take a probabilistic viewpoint, with underlying probability space $(\Delta_0, P)$, $P$ being normalized Lebesgue measure on $\Delta_0$. Let $\hat\delta > 0$ be a small number to be determined. Let $n$ be fixed. The idea is to introduce $X_i$ dominated by certain exponential random variables such that $B(a) := B(a, \hat\delta, n) \le \sum X_i(a)$.

Step I. Formulation of problem as one involving $\sum X_i$. For each $a \in \Delta_0$, we define $t_0 < t_1 < \cdots$ and $S_1, S_2, \dots$ via the following algorithm, with the understanding that the algorithm terminates as soon as $\gamma_i(a) = *$ or time $n$ is reached. To get started, let $t_0$ be the smallest $j > 0$ such that $\gamma_j(Q_{j-1}(a)) \cap C_{\hat\delta} \neq \emptyset$.
(i) After $t_i$ is defined, we define $S_{i+1}$: If $Q_{t_i}(a) \cap C_{\hat\delta} = \emptyset$, set $S_{i+1} = 0$; if $Q_{t_i}(a) \subset C_{\hat\delta}$, let $S_{i+1} = \min(\hat S, n - t_i)$ where $\hat S$ is the stopping time above starting from $t_i$.
(ii) If $S_{i+1} = 0$, let $t_{i+1}$ be the smallest $j > t_i$ such that $\gamma_j(Q_{j-1}(a)) \cap C_{\hat\delta} \neq \emptyset$; define $t_{i+1}$ the same way if $S_{i+1} > 0$ except that $j$ is taken $\ge t_i + S_{i+1}$.

Suppose $t_i(a)$ is defined. Let $Q = Q_{t_i - 1}(a)$. Assuming $\hat\delta \ll \delta$, we claim:
(1) $\gamma_{t_i}(Q)$ is free;
(2) $|\gamma_{t_i}(Q)| > \hat\delta^{\frac{1}{10}}$;
(3) for all $a', a'' \in Q$, $\tau_{t_i}(a')/\tau_{t_i}(a'') < K$.
(1) is true because trajectories of critical curves in bound periods initiated outside of $C_{\hat\delta}$ cannot meet $C_{\hat\delta}$. If $S_i > 0$, it may happen that $t_i = t_{i-1} + S_i$, in which case $|\gamma_{t_i}(Q)| > \delta$ by definition. Otherwise we back up to time $t$ when $Q$ was first created as an element of some $Q_j$. Then $t_{i-1} + S_i \le t < t_i$, and $\gamma_t(Q) \cap C_{\hat\delta} = \emptyset$. If $\gamma_t(Q)$ is outside of $C_\delta$, then $|\gamma_{t_i}(Q)| > K^{-1}\delta$ by (P1'). If $\gamma_t(Q) \approx I_{\mu j}$ for some $I_{\mu j} \subset C_\delta \setminus C_{\hat\delta}$, then $|\gamma_{t_i}(Q)| > K^{-1} \hat\delta^{\frac{8\alpha}{\lambda}}$ by (P2')(d). In all cases, (2) holds assuming $\hat\delta \ll \delta$. (3) follows from (P3'). Because of (1)–(3), we think of $t_i$ as times of dynamical renewal.

In preparation for Step II, we organize some of the information from above as follows. Let $X_0 = 0$. For $i = 1, 2, \dots, n$, let $X_i : \Delta_0 \to \mathbb{Z}^+$ be such that $X_i(a) = S_i(a)$, where $S_i(a)$ is defined, 0 otherwise. Then $B \le \sum_{i \le n} X_i$. It suffices to show $P\{\sum_{i \le n} X_i > \sigma n\}$ decreases exponentially with $n$. We define the following $\sigma$-algebras on $\Delta_0$: Let $A_i$ be the set of $a$ for which $t_i$ is defined. Then $A_i \in F_i$, and for $a \in A_i$, the atom of $F_i$ containing $a$ is $Q_{t_i(a)-1}(a)$. For $a \notin A_i$, the atom of $F_i$ containing $a$ is $Q_k(a)$, where $k$ is the last step before the algorithm above is terminated. One verifies that $F_i$ so defined is a $\sigma$-algebra, that $F_0 < F_1 < \cdots < F_n$, and that $X_i$ is measurable with respect to $F_i$.

Step II. Large deviation estimate for $\sum_{1 \le i \le n} X_i$. First we compute the conditional distribution of $X_{i+1}$ given $F_i$, $i \ge 0$. Consider $Q \in F_i | A_i$. (On $Q \in F_i$ with $Q \cap A_i = \emptyset$, $X_{i+1} = 0$.) From (2) and (3) above, we have
(i) $P(X_{i+1} = 0 \mid Q) \ge 1 - K \hat\delta^{\frac{9}{10}}$.
For $I_{\mu j} \subset C_{\hat\delta}$, Lemma 6.3 together with (1) and (3) above give
(ii) $P(X_{i+1} > m \mid Q \cap \{\gamma_{t_i} \in I_{\mu j}\}) < K e^{-\frac12 K^{-1} m}$ if $m \ge K|\mu|$; no information otherwise.
Combining the last two estimates, we obtain for all $m \ge 0$,
\[ P(X_{i+1} > m \mid Q) < K \hat\delta^{-\frac{1}{10}} \min(\hat\delta, e^{-K^{-1} m}) + K \hat\delta^{\frac{9}{10}} e^{-\frac12 K^{-1} m}. \tag{8} \]
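A simple computation then gives $E[e^{\rho X_{i+1}} \mid Q] < \infty$ if $\rho < \frac12 K^{-1}$ (where $K$ is as in the exponents above). We note further that by decreasing $\hat\delta$ (keeping $\rho$ fixed), $E[e^{\rho X_{i+1}} \mid Q]$ can be made arbitrarily close to 1. Let $\eta > 0$ be a number to be determined shortly, and choose $\hat\delta = \hat\delta(\eta)$ sufficiently small that $E[e^{\rho X_{i+1}} \mid Q] < e^{\eta}$. Observing that the upper bound in (8) and hence that for $E[e^{\rho X_{i+1}} \mid Q]$ do not depend on $i$ or on $Q$, we conclude that with the choices of $\rho$, $\eta$ and $\hat\delta$ above, $E[e^{\rho X_{i+1}} \mid F_i] < e^{\eta}$ for every $i \ge 0$. To finish, we observe that
\[ E\big[e^{\rho \sum_{1 \le i \le n} X_i}\big] = E\big[E[e^{\rho \sum_{1 \le i \le n} X_i} \mid F_{n-1}]\big] = E\big[e^{\rho \sum_{1 \le i \le n-1} X_i}\, E[e^{\rho X_n} \mid F_{n-1}]\big] \le e^{\eta}\, E\big[e^{\rho \sum_{1 \le i \le n-1} X_i}\big], \]
giving inductively $E[e^{\rho \sum_{1 \le i \le n} X_i}] \le e^{n\eta}$. We arrive, therefore, at the estimate
\[ P\{B > \sigma n\} < P\Big\{\sum_{1 \le i \le n} X_i > \sigma n\Big\} < e^{\eta n - \rho \sigma n}. \]
This is $< e^{-\frac12 \rho \sigma n}$ if $\eta$ is chosen $< \frac14 \rho \sigma$.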
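The exponential-moment step that closes this proof can be illustrated numerically. The sketch below is not the construction in the text: it assumes a toy model in which the $X_i$ are independent and identically distributed, equal to 0 with probability 1 - p and geometric otherwise, with p playing the role of the small parameter. The names `step_pmf`, `exact_tail`, `mgf` and `chernoff_bound` and all numerical values are hypothetical.

```python
import math

def step_pmf(p, q, m):
    """P(X = k) for k = 0..m in the toy model: X = 0 with probability 1 - p,
    otherwise X is geometric on {1, 2, ...} with ratio q."""
    return [1.0 - p] + [p * (1.0 - q) * q ** (k - 1) for k in range(1, m + 1)]

def exact_tail(p, q, n, sigma):
    """Exact P(X_1 + ... + X_n > sigma * n) for i.i.d. copies of X.
    Only the mass on 0..floor(sigma * n) has to be tracked."""
    m = math.floor(sigma * n)
    pmf = step_pmf(p, q, m)
    dist = [1.0] + [0.0] * m          # law of the empty sum on 0..m
    for _ in range(n):
        new = [0.0] * (m + 1)
        for s, ps in enumerate(dist):
            for k in range(m + 1 - s):
                new[s + k] += ps * pmf[k]
        dist = new
    return 1.0 - sum(dist)

def mgf(p, q, rho):
    """E[exp(rho * X)] in the toy model; finite only if q * exp(rho) < 1."""
    assert q * math.exp(rho) < 1.0
    return (1.0 - p) + p * (1.0 - q) * math.exp(rho) / (1.0 - q * math.exp(rho))

def chernoff_bound(p, q, n, sigma, rho):
    """exp(eta * n - rho * sigma * n) with eta = log E[exp(rho * X)]."""
    return math.exp(-rho * sigma * n) * mgf(p, q, rho) ** n

if __name__ == "__main__":
    p, q, rho, sigma, n = 0.02, 0.5, 0.3, 0.5, 40
    print("eta            =", math.log(mgf(p, q, rho)))
    print("exact tail     =", exact_tail(p, q, n, sigma))
    print("Chernoff bound =", chernoff_bound(p, q, n, sigma, rho))
```

Making p smaller pushes the moment generating function toward 1, which mirrors the role of decreasing $\hat\delta$ with $\rho$ fixed in the argument above.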
References: A version of Sects. 6.2 and 6.3 is used in [BC2]; Sect. 6.4 is taken from [WY2].

7. Positive Measure Sets of Good Parameters

7.1. Preliminary definitions and choices.

1. We fix $\lambda \le \frac15 \lambda_0$.

2. Augmented versions of (G1) and (G2). For reasons to become clear, it will be advantageous to put our good maps "deeper inside" $G_N(f_0; \lambda, \alpha, \varepsilon)$. We say $\hat x \in C$ satisfies (G1)# and (G2)# up to time $N$ if for all $1 \le i \le N$,
(G1)# $d(\hat x_i, C) > \min(\frac12 \delta_0, 2e^{-\alpha i})$;
(G2)# $|(f^i)'(\hat x_1)| > 2c_1 e^{\lambda_1 i}$ where $\lambda_1 = \lambda + \frac{1}{100}\lambda_0$,
and say $f \in G_N^\#(f_0; \lambda, \alpha, \varepsilon)$ if all $\hat x \in C$ satisfy (G1)# and (G2)# up to time $N$. Clearly, $G_N^\#(f_0; \lambda, \alpha, \varepsilon) \subset G_N(f_0; \lambda, \alpha, \varepsilon)$. The proof of the following lemma is straightforward.

Lemma 7.1. There exists $K_4 > 1$ for which the following holds: If $f_{\hat a} \in G_N^\#(f_0; \lambda, \alpha, \varepsilon)$, then for all $n \le N$, $f_a \in G_n(f_0; \lambda, \alpha, \varepsilon)$ for all $a \in [\hat a - K_4^{-n}, \hat a + K_4^{-n}]$.

3. Choice of $\alpha$. In addition to $\alpha < \frac{1}{100}\lambda$, we impose two upper bounds on $\alpha$: The first is introduced in (5) in Sect. 5.2; the second is (9) in Sect. 7.2. With $\lambda$ and $\alpha$ fixed, $G_N(f_0; \lambda, \alpha, \varepsilon)$ and $G_N^\#(f_0; \lambda, \alpha, \varepsilon)$ will be abbreviated as $G_N$ and $G_N^\#$ from here on.
4. Choices of $\sigma$ and $\hat\delta$. We need $\sigma$ to be small enough that the exponent in Corollary 6.1, namely $(1-\sigma)\frac14 \lambda_0 - \alpha$, is $> \lambda_1$. (For example, $\sigma = \frac{1}{100}$ will do.) We then let $\hat\delta$ be given by Proposition 6.1 with $\frac12 \sigma$ in the place of $\sigma$.

5. The start-up interval $\Delta_0$. We choose $\Delta_0 \subset (-\hat\varepsilon, \hat\varepsilon)$ to contain 0 and to be short enough that for some $n_0$ sufficiently large, $d(\hat x_i, C) > \frac12 \delta_0$ for all $i \le n_0$, $\hat x \in C$ and $a \in \Delta_0$. A number of impositions on $n_0$ have been made; see, for example, Sects. 2.1 and 5.2, and Corollary 6.1. There will be more in the next two pages.

7.2. Inductive construction of $\Delta$. We seek to construct a sequence of sets $\Delta_0 \supset \Delta_{n_0} \supset \Delta_{2n_0} \supset \Delta_{2^2 n_0} \supset \cdots$ in parameter space with the properties that (i) for each $\ell$, $\{f_a, a \in \Delta_{2^\ell n_0}\} \subset G^\#_{2^\ell n_0}$ and (ii) $\Delta := \cap_{\ell \ge 0} \Delta_{2^\ell n_0}$ has positive Lebesgue measure. The rules of construction are detailed below; the measure estimate is given in Sect. 7.3.

Overview of procedure. Let $C := \{\hat x^1, \hat x^2, \dots, \hat x^q\}$. Associated with each $\hat x^k$, we define a process $\{\gamma_i^k, i < \infty\}$ in the sense of Sect. 6.1 with the property that for every $a$ such that $\gamma^k_{2^\ell n_0}(a) \neq *$, $\hat x^k(a)$ satisfies (G1)# and (G2)# up to time $2^\ell n_0$. We then let
\[ \Delta^k_{2^\ell n_0} := \{\gamma^k_{2^\ell n_0} \neq *\} \quad \text{and} \quad \Delta_{2^\ell n_0} := \cap_{1 \le k \le q} \Delta^k_{2^\ell n_0}. \]
It follows that $f_a \in G^\#_{2^\ell n_0}$ for every $a \in \Delta_{2^\ell n_0}$.

The processes $\gamma_i^k$ are updated in $N$-to-$2N$ cycles, $N = 2^\ell n_0$, $\ell = 1, 2, \dots$. Within each cycle, we first update each of the $q$ processes individually, i.e., extend $\gamma_i^k$ from $i = N$ to $i = 2N$. At the end of this updating, we reset some of the values of $\gamma^k_{2N}$ to $*$ to reflect the combined status of all $q$ processes before moving to the next cycle.

Remarks. It is absolutely essential to take inventory of the global picture at regular time intervals (as we do at times $2^\ell n_0$). Other than that, the precise order in which $\gamma_i^k$ is updated is unimportant. Also, the number "2" has little significance: all that is needed is a relation with $\alpha$ that gives (9) below.

Getting started. Let $n_1^k$ be the smallest $i > 0$ such that $\hat x_i^k(\Delta_0) \cap C_{\frac12 \delta_0} \neq \emptyset$. Then $n_1^k > n_0$, and $|\hat x^k_{n_1^k}(\Delta_0)| > \frac12 \delta_0$. Since $\delta \ll \frac12 \delta_0$, the first subdivision occurs at or before time $n_1^k$. Condition (1)(b) in the definition of a process in Sect. 6.1 is met. Recall that we have assumed $e^{-\alpha n_0} \ll \delta$, so that Lemma 6.1 applies to all deletions due to (G1)#.

Formal procedure from step $N = 2^\ell n_0$ to step $2N$. At time $N$, we are handed $q$ well defined processes $\{\gamma_i^k, i \le N\}$, $k = 1, 2, \dots, q$, with the property that for each $a \in \Delta_N^k = \{\gamma_N^k \neq *\}$, (i) $f_a \in G_{2\alpha^* N}$, and (ii) $\hat x^k(a)$ satisfies (G1)# and (G2)# up to time $N$. The procedure from $N$ to $2N$ consists of the following:

Step 1. The following is carried out for each of the processes $\{\gamma_i^k\}$.
1a. We extend $\gamma_i^k$ up to $i = 2N$, deleting all $\omega \in Q_i$ with $\gamma_i(\omega) \cap \{d(\cdot, C) < 2e^{-\alpha i}\} \neq \emptyset$. (Deletions corresponding to (2)(b)(ii) in Sect. 6.1 are not needed because (i) above already gives $f_a \in G_{2\alpha^* N}$ for every $a \in \{\gamma_N^k \neq *\}$.) We denote the resulting $\gamma^k_{2N}$ by $\gamma^k_{2N,\mathrm{tmp1}}$ as its values will be reset momentarily.
1b. On each $\omega \in Q^k_{2N}$, we let $\gamma^k_{2N,\mathrm{tmp2}} = *$ if $B(a, \hat\delta; 2N) > \sigma N$ for $a \in \omega$, and let it be $= \gamma^k_{2N,\mathrm{tmp1}}$ otherwise.

Step 2. On each $\omega \in Q^k_{2N}$, we let $\gamma^k_{2N} = \gamma^k_{2N,\mathrm{tmp2}}$ if $\omega \cap \Delta_N \neq \emptyset$, otherwise let it $= *$.

It remains to show that these steps lead to (i) and (ii) above with $N$ replaced by $2N$. For $a \in \Delta^k_{2N} = \{\gamma^k_{2N} \neq *\}$, Step 1a ensures that $\hat x^k(a)$ satisfies (G1)# up to time $2N$. By Step 1b, $B(a, \hat\delta; n) \le \sigma N < \sigma n$. Our choice of $\sigma$ (see Sect. 7.1) together with Corollary 6.1 then gives the lower bound for $|(f^{n-1})'(\hat x_1^k)|$ in (G2)#. It remains to prove $f_a \in G_{4\alpha^* N}$ for $a \in \Delta^k_{2N}$. Let $\omega$ be the element of $Q^k_{2N}$ containing $a$. Since $\omega$ survived the deletion in Step 2, there must be a point $\hat a \in \omega \cap \Delta_N$. By Lemma 7.1, it suffices to show that $\omega \subset [\hat a - K_4^{-4\alpha^* N}, \hat a + K_4^{-4\alpha^* N}]$. By Proposition 4.2', $|\omega| < 2(\hat c c_1)^{-1} e^{-2\lambda_1 N}$. (Observe that the hypotheses of Proposition 4.2' are met: $f_a \in G_{2\alpha^* N}$ for all $a \in \omega$, and $\hat x_i^k$ obeys (G1) up to time $2N$.) We conclude that $f_a \in G_{4\alpha^* N}$ for all $a \in \omega$ if
\[ 2(\hat c c_1)^{-1} e^{-2\lambda_1 N} < K_4^{-4\alpha^* N}. \tag{9} \]
This is one of the conditions imposed on $\alpha$ in Item 3 of Sect. 7.1.

7.3. Lower bound on measure of $\Delta$. For each $N$ and $k$, we now estimate the contribution to $\Delta_N \setminus \Delta_{2N}$ (not $\Delta^k_N \setminus \Delta^k_{2N}$!) due to deletions in $\gamma_i^k$ from $i = N$ to $i = 2N$. Step 1a: By Lemma 6.1, the total measure deleted is < N

It follows that
\[ |\Delta| \ge |\Delta_0| - \sum_{\ell} |\Delta_{2^\ell n_0} \setminus \Delta_{2^{\ell+1} n_0}| > \Big(1 - q \sum_{i=n_0}^{\infty} \big(K e^{-\frac12 \alpha i} + e^{-\hat\varepsilon i}\big)\Big) |\Delta_0|, \]
which is positive if $n_0$ is sufficiently large. This completes the proof of Theorem 2.
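Since both terms in the final sum are geometric, the positivity claim can be checked in closed form. The following sketch evaluates $q \sum_{i \ge n_0} (K e^{-\frac12 \alpha i} + e^{-\hat\varepsilon i})$ and searches for the first $n_0$ at which it drops below 1. The function names and the sample constants are assumptions chosen for illustration; they are not values from the text.

```python
import math

def deleted_fraction_bound(n0, q, K, alpha, eps_hat):
    """Closed form of q * sum_{i >= n0} (K * exp(-alpha * i / 2) + exp(-eps_hat * i)),
    using the geometric series sum_{i >= n0} r**i = r**n0 / (1 - r)."""
    tail_g1 = K * math.exp(-0.5 * alpha * n0) / (1.0 - math.exp(-0.5 * alpha))
    tail_g2 = math.exp(-eps_hat * n0) / (1.0 - math.exp(-eps_hat))
    return q * (tail_g1 + tail_g2)

def smallest_n0(q, K, alpha, eps_hat, limit=10**6):
    """First n0 for which the bound on the deleted fraction is below 1."""
    for n0 in range(1, limit):
        if deleted_fraction_bound(n0, q, K, alpha, eps_hat) < 1.0:
            return n0
    raise ValueError("no admissible n0 below the search limit")

if __name__ == "__main__":
    # Hypothetical constants: q critical points, K, alpha, eps_hat.
    q, K, alpha, eps_hat = 2, 10.0, 0.01, 0.005
    n0 = smallest_n0(q, K, alpha, eps_hat)
    print(n0, deleted_fraction_bound(n0, q, K, alpha, eps_hat))
```

The bound decays exponentially in $n_0$, so the threshold grows only logarithmically in $q$ and $K$ for fixed rates.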
References: This section follows [WY2], which, together with its precursor [WY1], contains the corresponding material for rank one attractors.

References
[BC1] Benedicks, M., Carleson, L.: On iterations of $1 - ax^2$ on $(-1, 1)$. Ann. Math. 122, 1–25 (1985)
[BC2] Benedicks, M., Carleson, L.: The dynamics of the Hénon map. Ann. Math. 133, 73–169 (1991)
[BY] Benedicks, M., Young, L.S.: Absolutely continuous invariant measures and random perturbation for certain one-dimensional maps. Ergod. Th. and Dynam. Sys. 12, 13–37 (1992)
[CE] Collet, P., Eckmann, J.P.: Positive Liapunov exponents and absolute continuity for maps of the interval. Ergod. Th. and Dynam. Sys. 3, 13–46 (1983)
[J] Jakobson, M.: Absolutely continuous invariant measures for one-parameter families of one-dimensional maps. Comm. Math. Phys. 81, 39–88 (1981)
[M] Misiurewicz, M.: Absolutely continuous invariant measures for certain maps of an interval. Publ. Math. IHES. 53, 17–51 (1981)
[NS] Nowicki, T., Van Strien, S.: Absolutely continuous invariant measures under a summability condition. Invent. Math. 105, 123–136 (1991)
[R] Rychlik, M.: Another proof of Jakobson's theorem and related results. Ergod. Th. and Dynam. Sys. 8, 83–109 (1988)
[T] Tsujii, M.: A proof of Benedicks-Carleson-Jakobson theorem. Tokyo J. Math. 16, 295–310 (1993)
[TTY] Thieullen, P., Tresser, C., Young, L.-S.: Positive exponent for generic 1-parameter families of unimodal maps. C.R. Acad. Sci. Paris, t. 315 Série I, 69–72 (1992); J Analyse 64, 121–172 (1994)
[WY1] Wang, Q., Young, L.-S.: Strange attractors with one direction of instability. Commun. Math. Phys. 218, 1–97 (2001)
[WY2] Wang, Q., Young, L.-S.: Toward a theory of rank one attractors. To appear in Ann. Math.
[WY3] Wang, Q., Young, L.-S.: From invariant curves to strange attractors. Commun. Math. Phys. 225, 275–304 (2002)
[WY4] Wang, Q., Young, L.-S.: Strange attractors in periodically-kicked limit cycles and Hopf bifurcations. Commun. Math. Phys. 240, 509–529 (2002)
Communicated by G. Gallavotti
Commun. Math. Phys. 264, 283–289 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1477-4
An Exchange Identity for Non-linear Fields
Arthur Jaffe, Christian Jäkel
Harvard University, Cambridge, MA 02138, USA
Received: 1 March 2005 / Accepted: 21 July 2005
Published online: 3 February 2006 – © Springer-Verlag 2006

Abstract: We establish a useful identity for intertwining a creation or annihilation operator with the heat kernel of a self-interacting bosonic field theory.

1. Background
Consider creation operators $a^*(f)$ and annihilation operators $a(h)$, both linear in their respective test functions $f, h \in L^2(\mathbb{R}, dx)$, acting on the Fock Hilbert space $\mathcal{H}$, and satisfying the canonical commutation relations
\[ [a(h), a^*(f)] = \langle \bar h, f \rangle_{L^2}. \tag{1.1} \]
The free field Hamiltonian $H_0$ acts on $\mathcal{H}$ and also on the one particle subspace $L^2(\mathbb{R}, dx)$, where one denotes its action by the operator $\omega = (-d^2/dx^2 + m^2)^{1/2}$. The time-zero field $\varphi(g)$ has the definition $\varphi(g) = a^*((2\omega)^{-1/2} g) + a((2\omega)^{-1/2} g)$. The operators $a^*$, $a$, and $H_0$ satisfy the relation
\[ e^{a(h)} e^{-\beta H_0} e^{a^*(f)} = e^{\langle \bar h, e^{-\beta\omega} f \rangle}\, e^{a^*(e^{-\beta\omega} f)}\, e^{-\beta H_0}\, e^{a(e^{-\beta\omega} h)}. \tag{1.2} \]
This identity (in case either $f = 0$ or $h = 0$) is known in the constructive field theory literature as a "pull-through" identity. The pull-through identity played a central role in the analysis of properties of heat kernels for field theories with interaction. It provided a fundamental ingredient in the analysis of the domain of the fields, in the proof of the cluster expansion, in the proof of the existence of a mass gap, and especially in the proof of the existence of an upper mass gap in weakly-coupled λP(ϕ) quantum field models, see [3, 4]. An introduction to this work can be found in [2, 5], but one must visit the original literature for details.

Current address: Theoretical Physics, Swiss Federal Institute of Technology Zurich (ETHZ), Switzerland.
The free-field pull-through identity provides a key step in the proof of the nuclearity property for the free field by Buchholz and Wichmann [1], and motivates finding the related identity (2.16) for a field theory with a P(ϕ) polynomial interaction.

2. The Main Result
In this paper we give a new identity similar to (1.2), but with $H_0$ replaced by the Hamiltonian $H$ for a non-linear field theory (with a spatial cutoff). Because of the non-linearity, the Hamiltonian on the left of the identity differs from the Hamiltonian on the right. Remarkably, we present a closed form for the relationship between the time-dependent Hamiltonians. At least one of the Hamiltonians must depend on time, so we allow both to do so (in a particular way) and denote the time-dependent Hamiltonians that arise by $H(s)$. One must replace the semigroup $e^{-\beta H}$ by the time-ordered exponential $T \exp\big(-\int_0^\beta H(s)\,ds\big)$, where we use the convention that time increases from left to right. Call the resulting identity that generalizes (1.2) an exchange identity. It has the structure
\[ e^{a(h)}\, T e^{-\int_0^\beta H_1(s)\,ds}\, e^{a^*(f)} = e^{\langle \bar h, e^{-\beta\omega} f \rangle}\, e^{a^*(e^{-\beta\omega} f)}\, T e^{-\int_0^\beta H_2(s)\,ds}\, e^{a(e^{-\beta\omega} h)}. \tag{2.1} \]
We give the explicit form of $H_1(s)$ and $H_2(s)$ in Theorem 2.1. In this paper we emphasize the algebraic structure of the exchange identity. We do not analyze the convergence of exponential series or the convergence of families of such series. We expect that most such questions in specific applications of interest can be addressed by the reader, hopefully without undue difficulty. In order to ensure stability we do assume that the basic interaction polynomial is bounded from below. In order to avoid infra-red problems we also assume that the mass of $H_0$ is strictly positive, or else we work with a twist field defined on a spatial circle.
All in all, the complete justification of Theorem 2.1, even for an elementary non-linearity, requires the introduction and removal of an ultra-violet cutoff, using for instance, a Feynman-Kac representation and estimates on path space to establish stability bounds and convergence of associated vectors and operators. See the methods in [2]. Once one establishes the basic stability bound in a particular example—uniform in the ultra-violet cutoff—details concerning convergence of vectors and operators, domains on which Theorem 2.1 applies, etc., will all fall into place. The case of complex functions f or h leads to non-hermitian Hamiltonians H1 or H2 . But these always arise as small non-hermitian perturbations of a self-adjoint Hamiltonian, so standard methods should apply. While these steps need to be carried out in particular examples, including such details here would obscure the simplicity of the presentation of our new identity. This elegant form of the exchange identity raises the question whether one might make progress toward finding other useful closed-form expressions in the solution of P(ϕ)2 quantum field theories. 2.1. Interactions. The usual interaction polynomial arises from a polynomial P(ξ ) and is defined as
\[ H_I(P, \lambda) = \int :\!P(\varphi(x))\!:\, \lambda(x)\, dx, \quad \text{where } P(\xi) = \xi^{2k} + \sum_{j=0}^{2k-1} c_j \xi^j, \tag{2.2} \]
where $:\!P(\varphi(x))\!:$ is the normal-ordered interaction-energy density, see for example [2]. We take the spatially dependent cutoff $0 \le \lambda(x)$ to be smooth and compactly supported. This cutoff defines an interesting class of polynomial interactions.

Let us now generalize this form of interaction, by assigning a spatially-dependent coupling constant $\lambda g_j(x)$ to the $j$th derivative $P^{(j)}$ of the polynomial $P$. Write
\[ H_I(P, \lambda \mathbf{g}) := \sum_{j=0}^{\infty} H_I(P^{(j)}, \lambda g_j). \tag{2.3} \]
The sum in (2.3) terminates with $j = 2k$, the degree of $P$. Consider now $\mathbf{g}$ as a vector of coupling constants, with components $g_j$.

Motivated by this form of interaction, define a vector space $C$ of sequences of complex-valued, bounded functions on $\mathbb{R}$. These vectors $\mathbf{f} \in C$ have components $f_j(x)$, $j \in \mathbb{Z}^+$. There is a natural scalar multiplication by smooth functions $\lambda(x)$,
\[ (\lambda \mathbf{f})_j(x) = \lambda(x) f_j(x). \tag{2.4} \]
There is also a natural imbedding $\iota : L^2(\mathbb{R}) \to C$ given by
\[ \iota(f) = \{0, f, 0, \dots\}. \tag{2.5} \]
In addition to multiplication (2.4) by scalars, the vector space $C$ is a commutative ring with the product $* : C \times C \to C$ defined by
\[ (\mathbf{f} * \mathbf{g})_j(x) = \sum_{k=0}^{j} f_k(x)\, g_{j-k}(x). \tag{2.6} \]
The identity in $C$ is the function
\[ \mathrm{Id} = \{1, 0, \dots\}, \tag{2.7} \]
and the $n$th $*$-power of $\iota(f)$ is
\[ \iota(f) * \iota(f) * \cdots * \iota(f) = \{0, 0, \dots, f(x)^n, \dots\}. \tag{2.8} \]
Also define $\iota(f)^0 = \mathrm{Id}$. In terms of these powers, there is a natural exponential imbedding $f \mapsto e^{\iota(f)}$ of functions into $C$, given by
\[ e^{\iota(f)} = \mathrm{Id} + \sum_{j=1}^{\infty} \frac{1}{j!}\, \iota(f)^j = \Big\{1, f(x), \frac{1}{2!} f(x)^2, \dots, \frac{1}{j!} f(x)^j, \dots\Big\}. \tag{2.9} \]
With this notation,
\[ e^{\iota(f)} * e^{\iota(g)} = e^{\iota(f+g)}, \qquad \big(e^{\iota(f)}\big)^{-1} = e^{\iota(-f)}, \qquad \text{and} \quad e^{\iota(0)} = \mathrm{Id}. \tag{2.10} \]
The special case $H_I(P, \lambda)$ of (2.2) corresponds to $\mathbf{g} = \mathrm{Id} = e^{\iota(0)}$. We use a bold-face letter $\mathbf{H}$ to denote a Hamiltonian determined by a polynomial $P$ (bounded from below) as well as its derivatives $P^{(j)}$ in the fashion (2.3) with
\[ \mathbf{g} = e^{\iota(g)}. \tag{2.11} \]
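The ring structure (2.6) and the exponential rules (2.10) can be checked mechanically on truncated sequences. The sketch below is illustrative only: it replaces each function by its value at one fixed point $x$, stored as an exact rational, and keeps the first N components, which is harmless because component $j$ of a $*$-product depends only on components up to $j$. The helper names (`star`, `iota`, `identity`, `star_power`, `exp_embed`) are hypothetical.

```python
from fractions import Fraction
from math import factorial

def star(f, g):
    """Product (2.6) on sequences truncated to the same length."""
    n = min(len(f), len(g))
    return [sum(f[k] * g[j - k] for k in range(j + 1)) for j in range(n)]

def identity(n):
    """Id = {1, 0, 0, ...} truncated to n components, as in (2.7)."""
    return [Fraction(1)] + [Fraction(0)] * (n - 1)

def iota(value, n):
    """iota(f) = {0, f, 0, ...} as in (2.5), for the value of f at one point (n >= 2)."""
    return [Fraction(0), Fraction(value)] + [Fraction(0)] * (n - 2)

def star_power(f, power):
    """The power-fold *-product of f with itself; power 0 gives Id."""
    result = identity(len(f))
    for _ in range(power):
        result = star(result, f)
    return result

def exp_embed(value, n):
    """The sequence {1, f, f^2/2!, ..., f^j/j!, ...} of (2.9), truncated."""
    return [Fraction(value) ** j / factorial(j) for j in range(n)]

if __name__ == "__main__":
    a, b, n = Fraction(2, 3), Fraction(-5, 7), 6
    print(star(exp_embed(a, n), exp_embed(b, n)) == exp_embed(a + b, n))
```

Exact rational arithmetic is used so that the identities hold as equalities and not merely up to rounding.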
In the following we find that perturbations of this type play a special role, especially when $g$ has the form $f_s + h_{\beta-s}$, where
\[ f_s(x) = \big((2\omega)^{-1/2} e^{-s\omega} f\big)(x). \tag{2.12} \]
(Note $f_0 = f$.) Therefore consider the time-dependent, total Hamiltonians at time $s$ of the form
\[ \mathbf{H} = \mathbf{H}(P, \lambda e^{\iota(g_s + h_{\beta-s})}) = H_0 + H_I(P, \lambda e^{\iota(g_s + h_{\beta-s})}), \tag{2.13} \]
(where the vacuum energy has not been renormalized to zero). An elementary pull-through identity has the form
\[ e^{a(h)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds} = T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_s + h_{\beta-s})})\,ds}\, e^{a(e^{-\beta\omega} h)}. \tag{2.14} \]
We establish this and related identities in the next section.

2.2. Exchange identities. The following generalization states how to exchange the position of the product of an exponential of a creation and an exponential of an annihilation operator.

Theorem 2.1. (Exchange Identity) As a formal identity,
\[ e^{a(h)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds}\, e^{a^*(f)} = e^{\langle \bar h, e^{-\beta\omega} f \rangle}\, e^{a^*(e^{-\beta\omega} f)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(f_s + g_s + h_{\beta-s})})\,ds}\, e^{a(e^{-\beta\omega} h)}. \tag{2.15} \]

Remark. The exchange identity (2.15) reduces to the pull-through identity (2.14) for $f = 0$. Furthermore, the special choice $\mathbf{g} = \mathrm{Id}$ gives
\[ e^{a(h)}\, e^{-\beta(H_0 + H_I(P, \lambda))}\, e^{a^*(f)} = e^{\langle \bar h, e^{-\beta\omega} f \rangle}\, e^{a^*(e^{-\beta\omega} f)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(f_s + h_{\beta-s})})\,ds}\, e^{a(e^{-\beta\omega} h)}. \tag{2.16} \]
This special case shows that if one begins with a time-independent interaction, the exchange identity gives rise to a time-dependent Hamiltonian. After the exchange, the perturbation of the original Hamiltonian involves perturbations of lower degree than $P$, and the coupling constant of the highest degree term is unchanged. Therefore the standard stability bounds of constructive quantum field theory (based on the Feynman-Kac formula) should yield the existence of the time-ordered exponential $T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(f_s + h_{\beta-s})})\,ds}$ of the time-dependent Hamiltonian.

Lemma 2.2. Let $t_1 \le t_2$. Consider the Hamiltonian $\mathbf{H}(P, \lambda e^{\iota(g_s)})$ and the time-ordered exponential
\[ R(t_2, t_1) = T e^{-\int_{t_1}^{t_2} \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds}, \tag{2.17} \]
with time increasing from left to right. Then $R(t_2, t_1)$ is the solution to the differential equation
\[ \frac{\partial}{\partial t_2} R(t_2, t_1) = -\mathbf{H}(P, \lambda e^{\iota(g_{t_2})})\, R(t_2, t_1), \quad \text{with } R(t, t) = I, \tag{2.18} \]
as well as the equation
\[ \frac{\partial}{\partial t_1} R(t_2, t_1) = R(t_2, t_1)\, \mathbf{H}(P, \lambda e^{\iota(g_{t_1})}), \quad \text{with } R(t, t) = I. \tag{2.19} \]
Proof. Assume that the time-ordered exponential (2.17) can be expanded according to usual perturbation series. Integrating the relation (2.18) gives
\begin{align*}
R(t_2, t_1) &= I - \int_{t_1}^{t_2} ds_1\, \mathbf{H}(s_1) R(s_1, t_1) \\
&= I - \int_{t_1}^{t_2} ds_1\, \mathbf{H}(s_1) + \int_{t_1}^{t_2} ds_1 \int_{t_1}^{s_1} ds_2\, \mathbf{H}(s_1) \mathbf{H}(s_2) R(s_2, t_1) \\
&= \cdots = \sum_{j=0}^{\infty} (-1)^j \int_{t_1 \le s_j \le \cdots \le s_2 \le s_1 \le t_2} ds_1 \cdots ds_j\, \mathbf{H}(s_1) \cdots \mathbf{H}(s_j) \\
&= T e^{-\int_{t_1}^{t_2} \mathbf{H}(s)\,ds}. \tag{2.20}
\end{align*}
This also shows that $R(t_2 + \epsilon, t_1) - R(t_2, t_1) \sim -\epsilon\, \mathbf{H}(t_2) R(t_2, t_1)$. One completes the proof that the time-ordered exponential satisfies the Eq. (2.18) by removing the regularization and establishing convergence of the approximation.
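As a finite-dimensional illustration of the ordering used in (2.18) and (2.20), one can approximate a time-ordered exponential by a product of short-time factors, with later times acting on the left. The sketch below makes several assumptions that are not in the text: the Hamiltonian is replaced by a hypothetical 2x2 real matrix function `H(s)`, the factors are first-order Euler steps, and the helper names are invented for the example.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def ordered_exponential(H, t1, t2, steps):
    """Approximate T exp(-int_{t1}^{t2} H(s) ds) by the ordered product of
    factors (I - dt * H(s_k)) at midpoints s_k, later times to the left,
    so that d/dt2 R = -H(t2) R as in (2.18)."""
    dt = (t2 - t1) / steps
    r = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(steps):
        s = t1 + (k + 0.5) * dt
        h = H(s)
        factor = [[(1.0 if i == j else 0.0) - dt * h[i][j] for j in range(2)]
                  for i in range(2)]
        r = mat_mul(factor, r)   # the newest factor multiplies from the left
    return r

if __name__ == "__main__":
    # Hypothetical matrix family; H(s) and H(s') do not commute for s != s'.
    H = lambda s: [[1.0, s], [s, 2.0]]
    whole = ordered_exponential(H, 0.0, 1.0, 2000)
    parts = mat_mul(ordered_exponential(H, 0.5, 1.0, 1000),
                    ordered_exponential(H, 0.0, 0.5, 1000))
    print(whole)
    print(parts)
```

The two printed matrices agree up to rounding, reflecting the composition rule $R(t_2, t_1) = R(t_2, s) R(s, t_1)$ that the ordered product satisfies by construction.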
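A similar iteration gives
\begin{align*}
R(t_2, t_1) &= I - \int_{t_1}^{t_2} ds_1\, R(t_2, s_1) \mathbf{H}(s_1) \\
&= I - \int_{t_1}^{t_2} ds_1\, \mathbf{H}(s_1) + \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2\, R(t_2, s_2) \mathbf{H}(s_2) \mathbf{H}(s_1) \\
&= \cdots = \sum_{j=0}^{\infty} (-1)^j \int_{t_1 \le s_1 \le s_2 \le \cdots \le s_j \le t_2} ds_1 \cdots ds_j\, \mathbf{H}(s_j) \cdots \mathbf{H}(s_1) \\
&= T e^{-\int_{t_1}^{t_2} \mathbf{H}(s)\,ds}, \tag{2.21}
\end{align*}
leading to (2.19).

Lemma 2.3. The interaction $H_I(P, \lambda e^{\iota(g_s)})$ satisfies
\[ H_I(P, \lambda e^{\iota(g_s)})\, e^{a^*(e^{-t\omega} f)} = e^{a^*(e^{-t\omega} f)}\, H_I(P, \lambda e^{\iota(f_t + g_s)}). \tag{2.22} \]
The corresponding relation for an annihilation operator is
\[ e^{a(e^{-t\omega} h)}\, H_I(P, \lambda e^{\iota(g_s)}) = H_I(P, \lambda e^{\iota(h_t + g_s)})\, e^{a(e^{-t\omega} h)}. \tag{2.23} \]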
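Proof. Denote $H_I(P, \lambda e^{\iota(g_s)})$ by $H_I(s)$. Then
\[ \big[H_I(s), e^{a^*(e^{-t\omega} f)}\big] = e^{a^*(e^{-t\omega} f)} \Big( e^{-a^*(e^{-t\omega} f)}\, H_I(s)\, e^{a^*(e^{-t\omega} f)} - H_I(s) \Big). \tag{2.24} \]
But
\[ \big[-a^*(e^{-t\omega} f), H_I(s)\big] = -\mathrm{Ad}_{a^*(e^{-t\omega} f)}(H_I(s)) = H_I(P, \lambda e^{\iota(g_s)} * \iota(f_t)). \tag{2.25} \]
Expanding the exponential $e^{-a^*} H_I(s)\, e^{a^*}$ in (2.24) as a series in $(-\mathrm{Ad}_{a^*})^j$, one obtains
\begin{align*}
\big[H_I(s), e^{a^*(e^{-t\omega} f)}\big] &= e^{a^*(e^{-t\omega} f)} \sum_{j=1}^{\infty} \frac{1}{j!} \big(-\mathrm{Ad}_{a^*(e^{-t\omega} f)}\big)^j (H_I(s)) \\
&= e^{a^*(e^{-t\omega} f)} \sum_{j=1}^{\infty} \frac{1}{j!}\, H_I(P, \lambda e^{\iota(g_s)} * \iota(f_t)^j) \\
&= e^{a^*(e^{-t\omega} f)} \Big( H_I(P, \lambda e^{\iota(g_s)} * e^{\iota(f_t)}) - H_I(P, \lambda e^{\iota(g_s)}) \Big) \\
&= e^{a^*(e^{-t\omega} f)} \Big( H_I(P, \lambda e^{\iota(g_s + f_t)}) - H_I(P, \lambda e^{\iota(g_s)}) \Big), \tag{2.26}
\end{align*}
where we use (2.10). Thus we obtain (2.22) as claimed. A similar argument establishes the corresponding relation (2.23).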
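Proof of Theorem 2.1. Let us begin by establishing the case $h = 0$, namely
\[ T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds}\, e^{a^*(f)} = e^{a^*(e^{-\beta\omega} f)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(f_s + g_s)})\,ds}. \tag{2.27} \]
Consider the function
\[ G(s') = R(\beta, s')\, e^{a^*(e^{-s'\omega} f)}\, S(s', 0), \tag{2.28} \]
where
\[ R(\beta, s') = T e^{-\int_{s'}^{\beta} \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds}, \quad \text{and} \quad S(s', 0) = T e^{-\int_0^{s'} \mathbf{H}(P, \lambda e^{\iota(g_s + f_s)})\,ds}. \tag{2.29} \]
The left and right sides of (2.27) equal respectively $G(0)$ and $G(\beta)$. We compute the derivative of $G(s)$ and show that it vanishes, proving (2.27). In fact using Lemma 2.2, along with the relation
\[ \frac{d}{ds} e^{a^*(e^{-s\omega} f)} = -a^*(\omega e^{-s\omega} f)\, e^{a^*(e^{-s\omega} f)} = -\big[H_0, a^*(e^{-s\omega} f)\big]\, e^{a^*(e^{-s\omega} f)} = -\big[H_0, e^{a^*(e^{-s\omega} f)}\big], \tag{2.30} \]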
we find that
\begin{align*}
\frac{d}{ds'} G(s') &= R(\beta, s') \Big( \mathbf{H}(P, \lambda e^{\iota(g_{s'})})\, e^{a^*(e^{-s'\omega} f)} + \frac{d}{ds'} e^{a^*(e^{-s'\omega} f)} - e^{a^*(e^{-s'\omega} f)}\, \mathbf{H}(P, \lambda e^{\iota(g_{s'} + f_{s'})}) \Big) S(s', 0) \\
&= R(\beta, s') \Big( H_I(P, \lambda e^{\iota(g_{s'})})\, e^{a^*(e^{-s'\omega} f)} - e^{a^*(e^{-s'\omega} f)}\, H_I(P, \lambda e^{\iota(g_{s'} + f_{s'})}) \Big) S(s', 0). \tag{2.31}
\end{align*}
Using Lemma 2.3, we infer that $dG(s)/ds = 0$ as claimed and (2.27) holds.

Next consider the case $f = 0$, which we analyze by taking the adjoint of the case established above, but with $\bar g$ in place of $g$ and $\bar h$ in place of $f$. This gives
\[ e^{a(h)}\, A e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds} = A e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(h_s + g_s)})\,ds}\, e^{a(e^{-\beta\omega} h)}, \tag{2.32} \]
where $A$ denotes anti-time ordering. Replacing $s$ by $\beta - s$ in the integrands is equivalent to the replacement of anti-time-ordering by time-ordering. Therefore,
\[ e^{a(h)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_{\beta-s})})\,ds} = T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(h_{\beta-s} + g_{\beta-s})})\,ds}\, e^{a(e^{-\beta\omega} h)}. \tag{2.33} \]
In order to combine the two expressions, replace $g_{\beta-s}$ by $g_s$ to yield,
\[ e^{a(h)}\, T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(g_s)})\,ds} = T e^{-\int_0^\beta \mathbf{H}(P, \lambda e^{\iota(h_{\beta-s} + g_s)})\,ds}\, e^{a(e^{-\beta\omega} h)}. \tag{2.34} \]
Multiply this identity on the right by $e^{a^*(f)}$. Then move this exponential to the left in the right-hand term: use the canonical commutation relations to commute $e^{a^*(f)}$ past $e^{a(e^{-\beta\omega} h)}$. Then apply the exchange identity (2.27) that was already proved. This yields (2.15) and completes the proof of the theorem.

References
1. Buchholz, D., Wichmann, E.H.: Causal independence and the energy-level density of states in local quantum field theory. Commun. Math. Phys. 106, 321–344 (1986)
2. Glimm, J., Jaffe, A.: Quantum Physics, Second Edition, Berlin-Heidelberg-New York: Springer Verlag, 1987, and Selected Papers, Volumes I and II, Basel-Boston: Birkhäuser Boston, 1985
3. Glimm, J., Jaffe, A.: The $\lambda\varphi^4_2$ quantum field theory without cut-offs, IV. Perturbations of the Hamiltonian. J. Math. Phys. 13, 1568–1584 (1972)
4. Glimm, J., Jaffe, A., Spencer, T.: The Wightman axioms and particle structure in the weakly coupled $P(\varphi)_2$ quantum field model. Ann. of Math. 100, 585–632 (1974)
5. Jaffe, A.: Constructive quantum field theory. In: Mathematical Physics, edited by T. Kibble, Singapore: World Scientific, 2000

Communicated by J.Z. Imbrie
Commun. Math. Phys. 264, 291–302 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1466-7
Hermitian Geometry and Complex Space-Time
A.H. Chamseddine
Center for Advanced Mathematical Sciences (CAMS) and Physics Department, American University of Beirut, Lebanon. E-mail: [email protected]
Received: 26 March 2005 / Accepted: 20 June 2005
Published online: 18 November 2005 – © Springer-Verlag 2005

Abstract: We consider a complex Hermitian manifold of complex dimensions four with a Hermitian metric and a Chern connection. It is shown that the action that determines the dynamics of the metric is unique, provided that the linearized Einstein action coupled to an antisymmetric tensor is obtained, in the limit when the imaginary coordinates vanish. The unique action is of the Chern-Simons type when expressed in terms of the Kähler form. The antisymmetric tensor field has gauge transformations coming from diffeomorphism invariance in the complex directions. The equations of motion must be supplemented by boundary conditions imposed on the Hermitian metric to give, in the limit of vanishing imaginary coordinates, the low-energy effective action for a curved metric coupled to an antisymmetric tensor.
1. Introduction
The idea of complexifying space-time in general relativity was put forward in the early sixties. It appeared in different but related lines of research. These include complexifying the four-dimensional manifold and equipping it with a holomorphic metric, asymptotically complex null surfaces and theory of twistors [1–7]. More recently, Witten [8] considered string propagation on complexified space-time where he presented some evidence that the imaginary part of the complex coordinates enters in the study of the high-energy behavior of scattering amplitudes [9]. In this string picture it is assumed that the imaginary parts of the coordinates are small at low-energies. At a fundamental level the complex coordinates $X^\mu$, $\mu = 1, \dots, d$, with complex conjugates $X^{\bar\mu} \equiv \overline{X^\mu}$, are described by the topological σ model action [8]
\[ I = \int d\sigma\, d\bar\sigma\; g_{\mu\bar\nu}\big(X(\sigma, \bar\sigma), \bar X(\sigma, \bar\sigma)\big)\, \partial_\sigma X^\mu\, \partial_{\bar\sigma} X^{\bar\nu}, \]
where the world-sheet coordinates are denoted by σ and σ , and where the background metric for the complex d-dimensional manifold M is Hermitian so that gµν = gνµ ,
gµν = gµ ν = 0.
Decomposing the metric into real and imaginary components gµν = Gµν + iBµν , the hermiticity condition implies that Gµν is symmetric and Bµν is antisymmetric. The low-energy effective string action is given by the Einstein-Hilbert action coupled to the field strength of the antisymmetric tensor. This can be related to the invariance µ µ µ µ of the sigma model under complex transformations X → X + ζ (X) , X → µ µ X +ζ X . A related phenomena was observed in noncommutative geometry [10], where the space-time coordinates are deformed and become noncommuting, [x µ , x ν ] = iθ µν [11]. Furthermore, it was found that in the effective action of open-string theory, the −1 does appear [12]. This was taken as a motiinverse of the combinations Gµν + Bµν vation to study the dynamics of a complex Hermitian metric on a real manifold [13], considered first by Einstein and Strauss [14]. In [13] it was shown that the invariant action constructed have the required behavior for the propagation of the fields Gµν and Bµν at the linearized level, but problems do arise when non-linear interactions are taken into account. This is due to the fact that there is no gauge symmetry to prevent the ghost components of Bµν from propagating. It is then important to address the question of whether it is possible to have consistent interactions in which the field Bµν appears explicitly in analogy with Gµν and not only through the combination of derivatives Hµνρ = ∂µ Bνρ + ∂ν Bρµ + ∂ρ Bµν . This suggests that the gauge parameters for the transformation Bµν → Bµν + ∂µ ν − ∂ν µ that keep Hµνρ invariant must be combined with the diffeomorphism parameters on the real manifold. For this to happen there must be diffeomorphism invariance of the Hermitian manifold M of complex dimensions d, with complex coordinates zµ = x µ + iy µ , µ = 1, . . . , d. The line element is then given by [15] ds 2 = 2gµν dzµ dzν , where we have denoted zµ = zµ . 
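The metric preserves its form under the infinitesimal transformations
$$z^\mu \to z^\mu - \zeta^\mu(z), \qquad z^{\bar\mu} \to z^{\bar\mu} - \zeta^{\bar\mu}(\bar z),$$
as can be seen from the transformations
$$\begin{aligned}
0 = \delta g_{\mu\nu} &= \partial_\mu\zeta^{\bar\lambda}\,g_{\bar\lambda\nu} + \partial_\nu\zeta^{\bar\lambda}\,g_{\mu\bar\lambda},\\
0 = \delta g_{\bar\mu\bar\nu} &= \partial_{\bar\mu}\zeta^{\lambda}\,g_{\lambda\bar\nu} + \partial_{\bar\nu}\zeta^{\lambda}\,g_{\bar\mu\lambda},\\
\delta g_{\mu\bar\nu} &= \partial_\mu\zeta^{\lambda}\,g_{\lambda\bar\nu} + \partial_{\bar\nu}\zeta^{\bar\lambda}\,g_{\mu\bar\lambda} + \zeta^\lambda\partial_\lambda g_{\mu\bar\nu} + \zeta^{\bar\lambda}\partial_{\bar\lambda}g_{\mu\bar\nu}.
\end{aligned}$$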
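It is instructive to express these transformations in terms of the fields $G_{\mu\nu}(x,y)$ and $B_{\mu\nu}(x,y)$ by writing
$$\zeta^\mu(z) = \alpha^\mu(x,y) + i\beta^\mu(x,y), \qquad \zeta^{\bar\mu}(\bar z) = \alpha^\mu(x,y) - i\beta^\mu(x,y).$$
The holomorphicity conditions on $\zeta^\mu$ and $\zeta^{\bar\mu}$ imply the relations
$$\partial^y_\mu\beta^\nu = \partial^x_\mu\alpha^\nu, \qquad \partial^y_\mu\alpha^\nu = -\partial^x_\mu\beta^\nu,$$
where we have denoted
$$\partial^y_\mu = \frac{\partial}{\partial y^\mu}, \qquad \partial^x_\mu = \frac{\partial}{\partial x^\mu}.$$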
The transformations of $G_{\mu\nu}(x,y)$ and $B_{\mu\nu}(x,y)$ are then given by
$$\begin{aligned}
\delta G_{\mu\nu}(x,y) &= \partial^x_\mu\alpha^\lambda G_{\lambda\nu} + \partial^x_\nu\alpha^\lambda G_{\mu\lambda} + \alpha^\lambda\partial^x_\lambda G_{\mu\nu} - \partial^x_\mu\beta^\lambda B_{\lambda\nu} + \partial^x_\nu\beta^\lambda B_{\mu\lambda} + \beta^\lambda\partial^y_\lambda G_{\mu\nu},\\
\delta B_{\mu\nu}(x,y) &= \partial^x_\mu\beta^\lambda G_{\lambda\nu} - \partial^x_\nu\beta^\lambda G_{\mu\lambda} + \alpha^\lambda\partial^x_\lambda B_{\mu\nu} + \partial^x_\mu\alpha^\lambda B_{\lambda\nu} + \partial^x_\nu\alpha^\lambda B_{\mu\lambda} + \beta^\lambda\partial^y_\lambda B_{\mu\nu}.
\end{aligned}$$
One readily recognizes that in the vicinity of small $y^\mu$ the fields $G_{\mu\nu}(x,0)$ and $B_{\mu\nu}(x,0)$ transform as symmetric and antisymmetric tensors with gauge parameters $\alpha^\mu(x)$ and $\beta^\mu(x)$, where
$$\alpha^\mu(x,y) = \alpha^\mu(x) - \partial^x_\nu\beta^\mu(x)\,y^\nu + O(y^2), \qquad \beta^\mu(x,y) = \beta^\mu(x) + \partial^x_\nu\alpha^\mu(x)\,y^\nu + O(y^2),$$
as implied by the holomorphicity conditions.

The purpose of this work is to investigate the dynamics of the Hermitian metric $g_{\mu\bar\nu}$ on a complex space-time of complex dimension four, such that in the limit of vanishing imaginary values of the coordinates the action reduces to that of a symmetric metric $G_{\mu\nu}$ and an antisymmetric field $B_{\mu\nu}$. The plan of this paper is as follows. In Section Two we summarize the essentials of Hermitian geometry. In Section Three we construct the most general action which gives, in the linearized limit, the correct equations of motion for a symmetric metric $G_{\mu\nu}$ and an antisymmetric field $B_{\mu\nu}$, and show that the action is unique. In Section Four we impose constraints on the torsion and curvature in the four-dimensional limit where the imaginary values of the coordinates vanish, and study the equations of motion. Section Five is the conclusion.

2. Hermitian Geometry

The Hermitian manifold $M$ of complex dimension $d$ is defined as a Riemannian manifold of real dimension $2d$ with Riemannian metric $g_{ij}$ and complex coordinates $z^i = \{z^\mu, z^{\bar\mu}\}$, where Latin indices $i, j, k, \dots$ run over the range $1, 2, \dots, d, \bar 1, \bar 2, \dots, \bar d$. The invariant line element is then [16]
$$ds^2 = g_{ij}\,dz^i dz^j,$$
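where the metric $g_{ij}$ is hybrid,
$$g_{ij} = \begin{pmatrix} 0 & g_{\mu\bar\nu} \\ g_{\nu\bar\mu} & 0 \end{pmatrix}.$$
It has also an integrable complex structure $F_i^{\ j}$ satisfying
$$F_i^{\ k}F_k^{\ j} = -\delta_i^j,$$
and with a vanishing Nijenhuis tensor
$$N_{ji}{}^{h} = F_j^{\ t}\left(\partial_t F_i^{\ h} - \partial_i F_t^{\ h}\right) - F_i^{\ t}\left(\partial_t F_j^{\ h} - \partial_j F_t^{\ h}\right).$$
The complex structure has components
$$F_i^{\ j} = \begin{pmatrix} i\delta_\mu^\nu & 0 \\ 0 & -i\delta_{\bar\mu}^{\bar\nu} \end{pmatrix}.$$
The affine connection with torsion $\Gamma^h_{ij}$ is introduced so that the following two conditions are satisfied:
$$\begin{aligned}
\nabla_k g_{ij} &= \partial_k g_{ij} - \Gamma^h_{ik}g_{hj} - \Gamma^h_{jk}g_{ih} = 0,\\
\nabla_k F_i^{\ j} &= \partial_k F_i^{\ j} - \Gamma^h_{ik}F_h^{\ j} + \Gamma^j_{hk}F_i^{\ h} = 0.
\end{aligned}$$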
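These conditions do not determine the affine connection uniquely and there exist several possibilities used in the literature. We shall adopt the Chern connection, which is the one most commonly used. It is defined by prescribing that the $(2d)^2$ linear differential forms
$$\omega^i_{\ j} = \Gamma^i_{jk}\,dz^k$$
be such that $\omega^\mu_{\ \nu}$ and $\omega^{\bar\mu}_{\ \bar\nu}$ are given by [15]
$$\omega^\mu_{\ \nu} = \Gamma^\mu_{\nu\rho}\,dz^\rho, \qquad \omega^{\bar\mu}_{\ \bar\nu} = \overline{\omega^\mu_{\ \nu}} = \Gamma^{\bar\mu}_{\bar\nu\bar\rho}\,dz^{\bar\rho},$$
with the remaining $(2d)^2$ forms set equal to zero. For $\omega^\mu_{\ \nu}$ to have a metrical connection the differential of the metric tensor $g$ must be given by
$$dg_{\mu\bar\nu} = \omega^\rho_{\ \mu}\,g_{\rho\bar\nu} + \omega^{\bar\rho}_{\ \bar\nu}\,g_{\mu\bar\rho},$$
from which we obtain
$$\partial_\lambda g_{\mu\bar\nu}\,dz^\lambda + \partial_{\bar\lambda}g_{\mu\bar\nu}\,dz^{\bar\lambda} = \Gamma^\rho_{\mu\lambda}g_{\rho\bar\nu}\,dz^\lambda + \Gamma^{\bar\rho}_{\bar\nu\bar\lambda}g_{\mu\bar\rho}\,dz^{\bar\lambda},$$
so that
$$\Gamma^\rho_{\mu\lambda} = g^{\bar\nu\rho}\,\partial_\lambda g_{\mu\bar\nu}, \qquad \Gamma^{\bar\rho}_{\bar\nu\bar\lambda} = g^{\bar\rho\mu}\,\partial_{\bar\lambda}g_{\mu\bar\nu},$$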
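where the inverse metric $g^{\bar\nu\mu}$ is defined by
$$g^{\bar\nu\mu}g_{\mu\bar\kappa} = \delta^{\bar\nu}_{\bar\kappa}.$$
The condition $\nabla_k F_i^{\ j} = 0$ is then automatically satisfied and the connection is metric. The torsion forms are defined by
$$\Theta^\mu \equiv -\frac12 T^\mu_{\nu\rho}\,dz^\nu\wedge dz^\rho = \omega^\mu_{\ \nu}\wedge dz^\nu = -\Gamma^\mu_{\nu\rho}\,dz^\nu\wedge dz^\rho,$$
which implies that
$$T^\mu_{\nu\rho} = \Gamma^\mu_{\nu\rho} - \Gamma^\mu_{\rho\nu} = g^{\bar\sigma\mu}\left(\partial_\rho g_{\nu\bar\sigma} - \partial_\nu g_{\rho\bar\sigma}\right).$$
The torsion form is related to the differential of the Hermitian form
$$F = \frac12 F_{ij}\,dz^i\wedge dz^j,$$
where $F_{ij} = F_i^{\ k}g_{kj} = -F_{ji}$ is antisymmetric and satisfies
$$F_{\mu\nu} = 0 = F_{\bar\mu\bar\nu}, \qquad F_{\mu\bar\nu} = ig_{\mu\bar\nu} = -F_{\bar\nu\mu},$$
so that
$$F = ig_{\mu\bar\nu}\,dz^\mu\wedge dz^{\bar\nu}.$$
The differential of $F$ is then
$$dF = \frac16 F_{ijk}\,dz^i\wedge dz^j\wedge dz^k,$$
so that
$$F_{ijk} = \partial_i F_{jk} + \partial_j F_{ki} + \partial_k F_{ij}.$$
The only non-vanishing components of this tensor are
$$\begin{aligned}
F_{\mu\nu\bar\rho} &= i\left(\partial_\mu g_{\nu\bar\rho} - \partial_\nu g_{\mu\bar\rho}\right) = -iT^\sigma_{\mu\nu}g_{\sigma\bar\rho} = -iT_{\mu\nu\bar\rho},\\
F_{\bar\mu\bar\nu\rho} &= -i\left(\partial_{\bar\mu}g_{\rho\bar\nu} - \partial_{\bar\nu}g_{\rho\bar\mu}\right) = iT^{\bar\sigma}_{\bar\mu\bar\nu}g_{\rho\bar\sigma} = iT_{\bar\mu\bar\nu\rho}.
\end{aligned}$$
The curvature tensor of the metric connection is constructed in the usual manner,
$$\Omega^j_{\ i} = d\omega^j_{\ i} - \omega^k_{\ i}\wedge\omega^j_{\ k},$$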
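with the only non-vanishing components $\Omega^\nu_{\ \mu}$ and $\Omega^{\bar\nu}_{\ \bar\mu}$. These are given by
$$\begin{aligned}
\Omega^\nu_{\ \mu} &= -R^\nu_{\ \mu\kappa\lambda}\,dz^\kappa\wedge dz^\lambda - R^\nu_{\ \mu\kappa\bar\lambda}\,dz^\kappa\wedge dz^{\bar\lambda}\\
&= \left(\partial_\kappa\Gamma^\nu_{\mu\lambda} - \Gamma^\rho_{\mu\kappa}\Gamma^\nu_{\rho\lambda}\right)dz^\kappa\wedge dz^\lambda - \partial_{\bar\lambda}\Gamma^\nu_{\mu\kappa}\,dz^\kappa\wedge dz^{\bar\lambda}.
\end{aligned}$$
Comparing both sides we obtain
$$R^\nu_{\ \mu\kappa\lambda} = \partial_\lambda\Gamma^\nu_{\mu\kappa} - \partial_\kappa\Gamma^\nu_{\mu\lambda} + \Gamma^\rho_{\mu\kappa}\Gamma^\nu_{\rho\lambda} - \Gamma^\rho_{\mu\lambda}\Gamma^\nu_{\rho\kappa}, \qquad R^\nu_{\ \mu\kappa\bar\lambda} = \partial_{\bar\lambda}\Gamma^\nu_{\mu\kappa}.$$
One can easily show that
$$R^\nu_{\ \mu\kappa\lambda} = 0, \qquad R^\nu_{\ \mu\kappa\bar\lambda} = g^{\bar\rho\nu}\,\partial_\kappa\partial_{\bar\lambda}g_{\mu\bar\rho} + \partial_{\bar\lambda}g^{\bar\rho\nu}\,\partial_\kappa g_{\mu\bar\rho}.$$
Transvecting the last relation with $g_{\nu\bar\sigma}$ we obtain
$$-R_{\mu\bar\sigma\kappa\bar\lambda} = \partial_\kappa\partial_{\bar\lambda}g_{\mu\bar\sigma} + g_{\nu\bar\sigma}\,\partial_{\bar\lambda}g^{\bar\rho\nu}\,\partial_\kappa g_{\mu\bar\rho}.$$
Therefore the only non-vanishing covariant components of the curvature tensor are
$$R_{\mu\bar\nu\kappa\bar\lambda}, \qquad R_{\bar\mu\nu\bar\kappa\lambda}, \qquad R_{\bar\mu\nu\kappa\bar\lambda}, \qquad R_{\mu\bar\nu\bar\kappa\lambda},$$
which are related by
$$R_{\mu\bar\nu\kappa\bar\lambda} = -R_{\bar\nu\mu\kappa\bar\lambda} = -R_{\mu\bar\nu\bar\lambda\kappa},$$
and satisfy the first Bianchi identity [15]
$$R^\nu_{\ \mu\kappa\bar\lambda} - R^\nu_{\ \kappa\mu\bar\lambda} = \nabla_{\bar\lambda}T_{\mu\kappa}{}^{\nu}.$$
The second Bianchi identity is given by
$$\nabla_\rho R_{\mu\bar\nu\kappa\bar\lambda} - \nabla_\kappa R_{\mu\bar\nu\rho\bar\lambda} = R_{\mu\bar\nu\sigma\bar\lambda}\,T_{\rho\kappa}{}^{\sigma},$$
together with the conjugate relations. There are three possible contractions of the curvature tensor, which are called the Ricci tensors:
$$R_{\mu\bar\nu} = -g^{\bar\lambda\kappa}R_{\mu\bar\lambda\kappa\bar\nu}, \qquad S_{\mu\bar\nu} = -g^{\bar\lambda\kappa}R_{\mu\bar\nu\kappa\bar\lambda}, \qquad T_{\mu\bar\nu} = -g^{\bar\lambda\kappa}R_{\kappa\bar\lambda\mu\bar\nu}.$$
Upon further contraction these result in two possible curvature scalars,
$$R = g^{\bar\nu\mu}R_{\mu\bar\nu}, \qquad S = g^{\bar\nu\mu}S_{\mu\bar\nu} = g^{\bar\nu\mu}T_{\mu\bar\nu}.$$
Note that when the torsion tensor vanishes, the manifold $M$ becomes Kähler. We shall not impose the Kähler condition as we are interested in Hermitian non-Kählerian geometry. We note that it is also possible to consider the Levi-Civita connection $\mathring\Gamma^k_{ij}$ and the associated Riemann curvature $K_{kij}{}^{h}$, where
$$\begin{aligned}
\mathring\Gamma^k_{ij} &= \frac12 g^{kl}\left(\partial_i g_{lj} + \partial_j g_{il} - \partial_l g_{ij}\right),\\
K_{kij}{}^{h} &= \partial_k\mathring\Gamma^h_{ij} - \partial_i\mathring\Gamma^h_{kj} + \mathring\Gamma^h_{kt}\mathring\Gamma^t_{ij} - \mathring\Gamma^h_{it}\mathring\Gamma^t_{kj}.
\end{aligned}$$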
The relation between the Chern connection and the Levi-Civita connection is given by
$$\Gamma^k_{ij} = \mathring\Gamma^k_{ij} + \frac12\left(T_{ij}{}^{k} - T_{i}{}^{k}{}_{j} - T_{j}{}^{k}{}_{i}\right).$$
The Ricci tensor and curvature scalar are $K_{ij} = K_{tij}{}^{t}$ and $K = g^{ij}K_{ij}$. Moreover $H_{kj} = K_{kji}{}^{t}F_t^{\ i}$ and $H = g^{kj}H_{kj}$. The two scalar curvatures $K$ and $H$ are related by [17]
$$K - H = \mathring\nabla^h F^{ij}\,\mathring\nabla_j F_{ih} - \mathring\nabla^k F_{ki}\,\mathring\nabla_h F^{hi} - 2F^{ji}\,\mathring\nabla_j\mathring\nabla^k F_{ki}.$$
There are also relations between curvatures of the Chern connection and those of the Levi-Civita connection, mainly [17]
$$\frac12 K = S - \nabla^\mu T_\mu - \nabla^{\bar\mu}T_{\bar\mu} - T_\mu T_{\bar\nu}\,g^{\bar\nu\mu},$$
where $T_\mu = T_{\mu\nu}{}^{\nu}$. There are two natural conditions that can be imposed on the torsion. The first is $T_\mu = 0$, which results in a semi-Kähler manifold. The other is when the torsion is complex analytic, so that $\nabla_{\bar\lambda}T_{\mu\kappa}{}^{\nu} = 0$, implying that the curvature tensor has the same symmetry properties as in the Kähler case. In this work we shall not impose any conditions on the torsion tensor.
3. An Invariant Action

We now specialize to the realistic case of a complexified four-dimensional space-time. To construct invariants up to second order in derivatives we write the following possible terms:
$$I = \int_{M_4} d^4z\,d^4\bar z\;g\left(aR + bS + c\,T_{\mu\nu\bar\kappa}T_{\bar\rho\bar\sigma\lambda}\,g^{\bar\rho\mu}g^{\bar\sigma\nu}g^{\bar\kappa\lambda} + d\,T_{\mu\nu\bar\kappa}T_{\bar\rho\bar\sigma\lambda}\,g^{\bar\rho\mu}g^{\bar\sigma\lambda}g^{\bar\kappa\nu} + e\right).$$
The density factor is $\left(\det g_{ij}\right)^{1/2} = \det g_{\mu\bar\nu} \equiv g$. We shall set the cosmological term to zero ($e = 0$). The above action can equivalently be written in terms of the Riemannian metric $g_{ij}$ in the form
$$I = \int_M d^4z\,d^4\bar z\,\left(\det g_{ij}\right)^{1/2}\left(a'K + b'H + c'F_{ijk}F^{ijk} + d'F_iF^i\right),$$
where $F_i = F_{ijk}F^{jk}$ and $a', b', c', d'$ are parameters linearly related to the parameters $a, b, c, d$. We shall now impose the requirement that the linearized action, in the limit $y \to 0$, gives the correct kinetic terms for $G_{\mu\nu}(x)$ and $B_{\mu\nu}(x)$. Therefore, writing
$$G_{\mu\nu}(x,y) = \eta_{\mu\nu} + h_{\mu\nu}(x), \qquad B_{\mu\nu}(x,y) = B_{\mu\nu}(x),$$
and keeping only quadratic terms in the action, we obtain, after integrating by parts, the quadratic $h_{\mu\nu}$ terms
$$I = \int d^4x\,d^4y\left(2c\,\partial^x_\kappa h_{\mu\nu}\partial^{x\kappa}h^{\mu\nu} + (a - 2c + d)\,\partial^{x\nu}h_{\mu\nu}\partial^x_\lambda h^{\mu\lambda} + (a - b + 2d)\,\partial^{x\nu}h_{\mu\nu}\partial^{x\mu}h^\lambda_{\ \lambda} + (d - b)\,\partial^x_\mu h^\nu_{\ \nu}\partial^{x\mu}h^\lambda_{\ \lambda}\right).$$
Comparing with the linearized Einstein action we obtain the following conditions:
$$2c = 1, \qquad a - 2c + d = -2, \qquad -a + b - 2d = 2, \qquad d - b = -1,$$
which are equivalent to
$$b = -a, \qquad c = \frac12, \qquad d = -1 - a.$$
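The equivalence between the four matching conditions and the one-parameter family $b = -a$, $c = 1/2$, $d = -1 - a$ can be checked mechanically. The following is only an illustrative sketch: the function names `conditions_hold` and `solution` are hypothetical helpers introduced here, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def conditions_hold(a, b, c, d):
    """Return True if (a, b, c, d) satisfy the four conditions quoted above."""
    return (
        2 * c == 1
        and a - 2 * c + d == -2
        and -a + b - 2 * d == 2
        and d - b == -1
    )


def solution(a):
    """One-parameter family b = -a, c = 1/2, d = -1 - a (a stays free)."""
    a = Fraction(a)
    return a, -a, Fraction(1, 2), -1 - a


# Sample a few values of the free parameter a.
checks = [conditions_hold(*solution(a)) for a in (-3, 0, Fraction(2, 7), 5)]
print(checks)
```

Only three of the four conditions are independent, which is why the parameter $a$ remains undetermined at this stage.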
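With this choice of coefficients, the quadratic $B$ contributions simplify to
$$\int d^4x\,d^4y\left(\partial^x_\mu B_{\nu\rho}\,\partial^{x\mu}B^{\nu\rho} - 2\,\partial^{x\mu}B_{\mu\lambda}\,\partial^x_\nu B^{\nu\lambda}\right),$$
which is identical to the term
$$\frac13\int d^4x\,d^4y\,H_{\mu\nu\rho}H^{\mu\nu\rho},$$
where $H_{\mu\nu\rho} = \partial^x_\mu B_{\nu\rho} + \partial^x_\nu B_{\rho\mu} + \partial^x_\rho B_{\mu\nu}$. The action can then be regrouped into the form
$$I = \int_M d^4z\,d^4\bar z\;g\left[a\left(R - S - T_{\mu\nu\bar\kappa}T_{\bar\rho\bar\sigma\lambda}\,g^{\bar\rho\mu}g^{\bar\sigma\lambda}g^{\bar\kappa\nu}\right) + \frac12 T_{\mu\nu\bar\kappa}T_{\bar\rho\bar\sigma\lambda}\left(g^{\bar\rho\mu}g^{\bar\sigma\nu}g^{\bar\kappa\lambda} - 2g^{\bar\rho\mu}g^{\bar\sigma\lambda}g^{\bar\kappa\nu}\right)\right].$$
Using the first Bianchi identity we have
$$\int_M d^4z\,d^4\bar z\;g\,(R - S) = \int_M d^4z\,d^4\bar z\;g\,g^{\bar\lambda\mu}\,\partial_{\bar\lambda}T_{\mu\nu}{}^{\nu} = \int_M d^4z\,d^4\bar z\;g\,T_{\mu\nu\bar\kappa}T_{\bar\rho\bar\sigma\lambda}\,g^{\bar\rho\mu}g^{\bar\sigma\lambda}g^{\bar\kappa\nu},$$
where we have integrated by parts and ignored a surface term. This implies that the group of terms with coefficient $a$ drops out, and the action becomes unique:
$$I = \frac12\int_M d^4z\,d^4\bar z\;g\,T_{\mu\nu\bar\kappa}T_{\bar\rho\bar\sigma\lambda}\left(g^{\bar\rho\mu}g^{\bar\sigma\nu}g^{\bar\kappa\lambda} - 2g^{\bar\rho\mu}g^{\bar\sigma\lambda}g^{\bar\kappa\nu}\right).$$
Substituting for the torsion tensor in terms of the metric $g_{\mu\bar\nu}$, the above action reduces to
$$I = \frac12\int_M d^4z\,d^4\bar z\;g\,X^{\bar\kappa\bar\lambda\bar\sigma\mu\nu\rho}\,\partial_\nu g_{\mu\bar\sigma}\,\partial_{\bar\lambda}g_{\rho\bar\kappa},$$
where
$$X^{\bar\kappa\bar\lambda\bar\sigma\mu\nu\rho} = g^{\bar\sigma\rho}\left(g^{\bar\kappa\mu}g^{\bar\lambda\nu} - g^{\bar\kappa\nu}g^{\bar\lambda\mu}\right) + g^{\bar\sigma\mu}\left(g^{\bar\kappa\nu}g^{\bar\lambda\rho} - g^{\bar\kappa\rho}g^{\bar\lambda\nu}\right) + g^{\bar\sigma\nu}\left(g^{\bar\kappa\rho}g^{\bar\lambda\mu} - g^{\bar\kappa\mu}g^{\bar\lambda\rho}\right),$$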
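which is completely antisymmetric in the indices $\mu\nu\rho$ and in $\bar\kappa\bar\lambda\bar\sigma$:
$$X^{\bar\kappa\bar\lambda\bar\sigma\mu\nu\rho} = X^{[\bar\kappa\bar\lambda\bar\sigma][\mu\nu\rho]}.$$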
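The antisymmetry of $X$ follows from the fact that the six-term expression is the determinant of the $3\times3$ block of the inverse metric with rows $\bar\kappa,\bar\lambda,\bar\sigma$ and columns $\mu,\nu,\rho$. A small numerical sanity check is sketched below. It is an illustration only: the array `ginv` is a random complex matrix standing in for $g^{\bar\nu\mu}$ (an assumption, since antisymmetry does not depend on any property of the metric), and the function names are hypothetical.

```python
import itertools
import random

random.seed(0)
d = 4

# Random complex d x d array used as a stand-in for the inverse metric;
# the first index plays the role of the barred index, the second the unbarred one.
ginv = [
    [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(d)]
    for _ in range(d)
]


def X(k, l, s, m, n, r):
    """Six-term combination of inverse-metric components defined in the text."""
    g = ginv
    return (
        g[s][r] * (g[k][m] * g[l][n] - g[k][n] * g[l][m])
        + g[s][m] * (g[k][n] * g[l][r] - g[k][r] * g[l][n])
        + g[s][n] * (g[k][r] * g[l][m] - g[k][m] * g[l][r])
    )


def max_antisymmetry_defect():
    """Largest |X + X(with two adjacent indices swapped)| over all index values."""
    worst = 0.0
    for k, l, s, m, n, r in itertools.product(range(d), repeat=6):
        x = X(k, l, s, m, n, r)
        worst = max(
            worst,
            abs(x + X(l, k, s, m, n, r)),  # swap first two barred indices
            abs(x + X(k, s, l, m, n, r)),  # swap last two barred indices
            abs(x + X(k, l, s, n, m, r)),  # swap first two unbarred indices
            abs(x + X(k, l, s, m, r, n)),  # swap last two unbarred indices
        )
    return worst


defect = max_antisymmetry_defect()
print(defect < 1e-12)
```

Adjacent transpositions generate all permutations, so checking them within each index group is enough to confirm complete antisymmetry up to rounding.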
This is remarkable because the simple requirement that the linearized action for $G_{\mu\nu}$ should be recovered determines the action uniquely. This form of the action is valid in all complex dimensions $d$; however, when $d = 4$ we can write
$$X^{\bar\kappa\bar\lambda\bar\sigma\mu\nu\rho} = -\frac1g\,\epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}g_{\tau\bar\eta},$$
and the action takes the very simple form
$$I = -\frac12\int_M d^4z\,d^4\bar z\;\epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\,g_{\tau\bar\eta}\,\partial_\mu g_{\nu\bar\sigma}\,\partial_{\bar\kappa}g_{\rho\bar\lambda}.$$
The above expression has the advantage that the action is a function of the metric $g_{\mu\bar\nu}$ only, and there is no need to introduce the inverse metric $g^{\bar\nu\mu}$. This suggests that the action could be expressed in terms of the Kähler form $F$. Indeed, we can write
$$I = \frac i2\int_M F\wedge\partial F\wedge\bar\partial F.$$
The equations of motion are given by
$$\epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\left(g_{\nu\bar\sigma}\,\partial_\mu\partial_{\bar\kappa}g_{\rho\bar\lambda} + \frac12\,\partial_\mu g_{\nu\bar\sigma}\,\partial_{\bar\kappa}g_{\rho\bar\lambda}\right) = 0.$$
Notice that the above equations are trivially satisfied when the metric $g_{\mu\bar\nu}$ is Kähler,
$$\partial_\mu g_{\nu\bar\rho} = \partial_\nu g_{\mu\bar\rho}, \qquad \partial_{\bar\sigma}g_{\nu\bar\rho} = \partial_{\bar\rho}g_{\nu\bar\sigma},$$
where these conditions are locally equivalent to $g_{\mu\bar\nu} = \partial_\mu\partial_{\bar\nu}K$ for some scalar function $K$.

4. Four Dimensional Limit with Vanishing Imaginary Part

To study the spectrum of the action we have to assume that although the coordinates are complex, the imaginary parts are small in low-energy experiments. The action is a function of the fields $G_{\mu\nu}(x,y)$ and $B_{\mu\nu}(x,y)$, which depend continuously on the coordinates $y^\mu$, implying a continuous spectrum with an infinite number of fields depending on $x^\mu$. To obtain a discrete spectrum, a physical assumption should be made that forces the imaginary coordinates to be small. One way, suggested by Witten [8], is to suppress the imaginary parts by constructing an orbifold space $M' = M/G$, where $G$ is the group of imaginary shifts $z^\mu \to z^\mu + i(2\pi k^\mu)$, where $k^\mu$ are real. To maintain invariance under general coordinate transformations we must have $k^\mu(x,y)$. It is not easy, however, to deal with such an orbifold in field theoretic considerations. To determine what is needed we proceed by first expressing the full action in terms of the fields $G_{\mu\nu}(x,y)$ and $B_{\mu\nu}(x,y)$. We write
$$\partial_\mu g_{\nu\bar\sigma}\,\partial_{\bar\kappa}g_{\rho\bar\lambda} = \frac14\left(A_{\kappa\lambda\sigma\mu\nu\rho} + iB_{\kappa\lambda\sigma\mu\nu\rho}\right),$$
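where
$$\begin{aligned}
A_{\kappa\lambda\sigma\mu\nu\rho} &= \left(\partial^x_\mu G_{\nu\sigma} + \partial^y_\mu B_{\nu\sigma}\right)\left(\partial^x_\kappa G_{\rho\lambda} - \partial^y_\kappa B_{\rho\lambda}\right) - \left(\partial^x_\mu B_{\nu\sigma} - \partial^y_\mu G_{\nu\sigma}\right)\left(\partial^x_\kappa B_{\rho\lambda} + \partial^y_\kappa G_{\rho\lambda}\right),\\
B_{\kappa\lambda\sigma\mu\nu\rho} &= \left(\partial^x_\mu G_{\nu\sigma} + \partial^y_\mu B_{\nu\sigma}\right)\left(\partial^x_\kappa B_{\rho\lambda} + \partial^y_\kappa G_{\rho\lambda}\right) + \left(\partial^x_\mu B_{\nu\sigma} - \partial^y_\mu G_{\nu\sigma}\right)\left(\partial^x_\kappa G_{\rho\lambda} - \partial^y_\kappa B_{\rho\lambda}\right).
\end{aligned}$$
The equations of motion split into real and imaginary parts. These are given by
$$\begin{aligned}
0 = \epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\Big(&G_{\nu\sigma}\left[\left(\partial^x_\mu\partial^x_\kappa + \partial^y_\mu\partial^y_\kappa\right)G_{\rho\lambda} - \left(\partial^x_\mu\partial^y_\kappa - \partial^y_\mu\partial^x_\kappa\right)B_{\rho\lambda}\right]\\
&- B_{\nu\sigma}\left[\left(\partial^x_\mu\partial^y_\kappa - \partial^y_\mu\partial^x_\kappa\right)G_{\rho\lambda} + \left(\partial^x_\mu\partial^x_\kappa + \partial^y_\mu\partial^y_\kappa\right)B_{\rho\lambda}\right] + \frac12 A_{\kappa\lambda\sigma\mu\nu\rho}\Big),\\
0 = \epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\Big(&G_{\nu\sigma}\left[\left(\partial^x_\mu\partial^x_\kappa + \partial^y_\mu\partial^y_\kappa\right)B_{\rho\lambda} + \left(\partial^x_\mu\partial^y_\kappa - \partial^y_\mu\partial^x_\kappa\right)G_{\rho\lambda}\right]\\
&+ B_{\nu\sigma}\left[\left(\partial^x_\mu\partial^x_\kappa + \partial^y_\mu\partial^y_\kappa\right)G_{\rho\lambda} - \left(\partial^x_\mu\partial^y_\kappa - \partial^y_\mu\partial^x_\kappa\right)B_{\rho\lambda}\right] + \frac12 B_{\kappa\lambda\sigma\mu\nu\rho}\Big).
\end{aligned}$$
We are interested in evaluating this action and the equations of motion for small values of the imaginary coordinates $y^\mu$. The above expressions contain terms which are at most quadratic in $\partial^y_\mu$ derivatives; it is then enough to expand the fields to second order in $y^\mu$ and take the limit $y \to 0$. We therefore write
$$\begin{aligned}
G_{\mu\nu}(x,y) &= G_{\mu\nu}(x) + G_{\mu\nu\rho}(x)\,y^\rho + \frac12 G_{\mu\nu\rho\sigma}(x)\,y^\rho y^\sigma + O(y^3),\\
B_{\mu\nu}(x,y) &= B_{\mu\nu}(x) + B_{\mu\nu\rho}(x)\,y^\rho + \frac12 B_{\mu\nu\rho\sigma}(x)\,y^\rho y^\sigma + O(y^3).
\end{aligned}$$
What is needed is a principle that determines the fields $G_{\mu\nu\rho}(x)$, $B_{\mu\nu\rho}(x)$, $G_{\mu\nu\rho\sigma}(x)$ and $B_{\mu\nu\rho\sigma}(x)$ and all higher terms as functions of $G_{\mu\nu}(x)$ and $B_{\mu\nu}(x)$. For our purposes it will be enough to determine the expansions only to second order. This can be achieved by imposing boundary conditions in the limit $y \to 0$ on the first and second derivatives of the Hermitian metric. The invariances of the string action given in the introduction suggest that the equations of motion in the $y \to 0$ limit reproduce the low-energy limit of the string equations
$$\begin{aligned}
0 &= G^{\eta\tau}\left(R(G) + \frac16 H_{\mu\nu\rho}H^{\mu\nu\rho}\right) - 2\left(R^{\eta\tau}(G) + \frac14 H^{\eta}{}_{\nu\rho}H^{\tau\nu\rho}\right),\\
0 &= \nabla^{\mu(G)}H_{\mu\eta\tau}.
\end{aligned}$$
In the absence of a principle that reduces the continuous spectrum, we shall impose the boundary conditions on the Hermitian metric $g_{\mu\bar\nu}(x,y)$ to be such that
$$\begin{aligned}
T_{\mu\nu\bar\rho}\big|_{y\to0} &= 2iB_{\mu\nu,\rho}(x),\\
\left(R_{\mu\bar\sigma\kappa\bar\lambda} - R_{\kappa\bar\sigma\mu\bar\lambda}\right)\big|_{y\to0} &= -2R_{\mu\kappa\sigma\lambda}(G) + i\left(\nabla^G_\lambda H_{\mu\kappa\sigma} - \nabla^G_\sigma H_{\mu\kappa\lambda}\right).
\end{aligned}$$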
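The solution of the torsion constraint gives, to lowest orders,
$$G_{\mu\nu\rho}(x) = \partial_\nu B_{\mu\rho}(x) + \partial_\mu B_{\nu\rho}(x), \qquad B_{\mu\nu\rho}(x) = -G_{\mu\rho,\nu}(x) + G_{\nu\rho,\mu}(x),$$
where all derivatives are with respect to $x^\mu$. Substituting these into the curvature constraints yields
$$\begin{aligned}
G_{\mu\sigma\kappa\lambda}(x) &= \partial_\sigma\partial_\lambda G_{\mu\kappa}(x) + \partial_\mu\partial_\lambda G_{\sigma\kappa}(x) + \partial_\sigma\partial_\kappa G_{\mu\lambda}(x) + \partial_\mu\partial_\kappa G_{\sigma\lambda}(x) - \partial_\kappa\partial_\lambda G_{\mu\sigma}(x) + O(\partial G, \partial B),\\
B_{\mu\sigma\kappa\lambda}(x) &= \partial_\sigma\partial_\lambda B_{\mu\kappa}(x) - \partial_\mu\partial_\lambda B_{\sigma\kappa}(x) + \partial_\sigma\partial_\kappa B_{\mu\lambda}(x) - \partial_\mu\partial_\kappa B_{\sigma\lambda}(x) - \partial_\kappa\partial_\lambda B_{\mu\sigma}(x) + O(\partial G, \partial B),
\end{aligned}$$
where $O(\partial G, \partial B)$ are terms of second order. To write the equations of motion in component form, we substitute the $G_{\mu\nu}(x,y)$ and $B_{\mu\nu}(x,y)$ expansions into $A_{\kappa\lambda\sigma\mu\nu\rho}$ and $B_{\kappa\lambda\sigma\mu\nu\rho}$, using the above solutions, to obtain
$$\begin{aligned}
A_{\kappa\lambda\sigma\mu\nu\rho} &= \Gamma_{\mu\nu\sigma}(G)\,\Gamma_{\kappa\lambda\rho}(G) - \left(\partial_\mu B_{\nu\sigma} + \partial_\sigma B_{\mu\nu} - \partial_\nu B_{\sigma\mu}\right)\left(\partial_\kappa B_{\rho\lambda} + \partial_\lambda B_{\rho\kappa} + \partial_\rho B_{\lambda\kappa}\right) + O(y),\\
B_{\kappa\lambda\sigma\mu\nu\rho} &= \Gamma_{\mu\nu\sigma}(G)\left(\partial_\kappa B_{\rho\lambda} + \partial_\lambda B_{\rho\kappa} + \partial_\rho B_{\lambda\kappa}\right) + \left(\partial_\mu B_{\nu\sigma} + \partial_\sigma B_{\mu\nu} - \partial_\nu B_{\sigma\mu}\right)\Gamma_{\kappa\lambda\rho}(G) + O(y),
\end{aligned}$$
where
$$\Gamma_{\mu\nu\sigma}(G) = \partial_\nu G_{\mu\sigma} + \partial_\mu G_{\nu\sigma} - \partial_\sigma G_{\mu\nu}.$$
In terms of components, the equations of motion take the form
$$\begin{aligned}
0 = \epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\Big(&G_{\nu\sigma}\left(\partial_\mu\partial_\kappa G_{\rho\lambda} + G_{\rho\lambda\mu\kappa} - \partial_\mu B_{\rho\lambda\kappa} + \partial_\kappa B_{\rho\lambda\mu}\right)\\
&- B_{\nu\sigma}\left(\partial_\mu\partial_\kappa B_{\rho\lambda} + B_{\rho\lambda\mu\kappa} + \partial_\mu G_{\rho\lambda\kappa} - \partial_\kappa G_{\rho\lambda\mu}\right) + \frac12 A_{\kappa\lambda\sigma\mu\nu\rho}\Big),\\
0 = \epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\Big(&G_{\nu\sigma}\left(\partial_\mu\partial_\kappa B_{\rho\lambda} + B_{\rho\lambda\mu\kappa} + \partial_\mu G_{\rho\lambda\kappa} - \partial_\kappa G_{\rho\lambda\mu}\right)\\
&- B_{\nu\sigma}\left(\partial_\mu\partial_\kappa G_{\rho\lambda} + G_{\rho\lambda\mu\kappa} - \partial_\mu B_{\rho\lambda\kappa} + \partial_\kappa B_{\rho\lambda\mu}\right) + \frac12 B_{\kappa\lambda\sigma\mu\nu\rho}\Big).
\end{aligned}$$
After substituting the solutions of the constraints these take the form
$$\begin{aligned}
\epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\left(G_{\nu\sigma}R_{\mu\lambda\rho\kappa}(G) - \frac14\,\partial_\sigma B_{\mu\nu}\,\partial_\rho B_{\kappa\lambda} - 2B_{\nu\sigma}\,\partial_\lambda\partial_\mu B_{\rho\kappa}\right) &= 0,\\
\epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}\left(G_{\nu\sigma}\,\partial_\lambda\partial_\mu B_{\rho\kappa} - B_{\nu\sigma}R_{\mu\lambda\rho\kappa}(G)\right) &= 0.
\end{aligned}$$
Using the identity
$$\epsilon^{\bar\kappa\bar\lambda\bar\sigma\bar\eta}\epsilon^{\mu\nu\rho\tau}G_{\nu\sigma} = 6\det G_{\mu\nu}\;G^{\mu[\kappa}G^{\lambda|\rho|}G^{\eta]\tau},$$
these equations reduce to the correct equations of motion, up to terms of the form $O(\partial G, \partial B)$ which were neglected in the derivation.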
5. Conclusions

In this work we have investigated the structure of a complexified space-time. The geometry is taken to be that of a Hermitian manifold with complex metric given by $g_{\mu\bar\nu}(z,\bar z) = G_{\mu\nu}(x,y) + iB_{\mu\nu}(x,y)$. After studying the properties of Hermitian geometry, we find that there is a unique action, up to boundary terms, that gives the correct linearized kinetic energies for $G_{\mu\nu}(x)$ and $B_{\mu\nu}(x)$ in the limit when the metric is restricted to depend only on the variables $x^\mu$. The unique action is of the Chern-Simons type when expressed in terms of the Kähler form. We have shown that the diffeomorphism invariance in the complex coordinates protects both fields $G_{\mu\nu}(x)$ and $B_{\mu\nu}(x)$, keeping them massless. The physical requirement that the imaginary parts of the coordinates are small at low energies must be imposed in such a way as to reduce the continuous spectrum of $G_{\mu\nu}(x,y)$ and $B_{\mu\nu}(x,y)$ to a discrete spectrum. In the absence of information about the spectrum arising at high energies, where the imaginary coordinates are expected to play a role, it is enough for our purposes to impose conditions on the first and second derivatives of the Hermitian metric, which allows us to solve for the lowest order terms in the expansion in terms of $y^\mu$. These constraints are imposed on the torsion and curvature of the Hermitian geometry in the limit $y^\mu \to 0$. We have solved the constraints and shown that the equations of motion for the Hermitian metric result in the low-energy string equations in the limit $y^\mu \to 0$. The results obtained so far give circumstantial evidence that space-time might be enlarged to become complex. Much more work is needed to determine the principle that restricts the form of the Hermitian metric to give a discrete spectrum and fixes the dependence on the imaginary coordinates to all orders. This will be necessary in order to understand the contributions of the imaginary parts of the coordinates at high energies.
One would expect that $B_{\mu\nu}(x)$ would also enter in the higher order terms of the action and not only through its derivatives, in analogy with the field $G_{\mu\nu}(x)$.

Acknowledgement. I would like to thank Jean-Pierre Bourguignon for pointing out reference [17] to me. Research supported in part by the National Science Foundation under Grant No. Phys-0313416.
References

1. Synge, J.L.: Proc. R. Irish. Acad. 62, 1 (1961)
2. Newman, E.: J. Math. Phys. 2, 324 (1961)
3. Penrose, R.: J. Math. Phys. 8, 345 (1967)
4. Penrose, R., Rindler, W.: Spinors and Space-Time. Cambridge: Cambridge University Press, 1986
5. Plebanski, J.F.: J. Math. Phys. 16, 2396 (1975)
6. Flaherty, E.J.: Hermitian and Kählerian Geometry in General Relativity. Lecture Notes in Physics, Volume 46, Heidelberg: Springer, 1976
7. Flaherty, E.J.: Complex Variables in Relativity. In: General Relativity and Gravitation, One Hundred Years after the Birth of Albert Einstein, A. Held (ed.), New York: Plenum, 1980
8. Witten, E.: Phys. Rev. Lett. 61, 670 (1988)
9. Gross, D., Mende, P.: Phys. Lett. B197, 129 (1987)
10. Connes, A.: Noncommutative Geometry. London-New York: Academic Press, 1994
11. Connes, A., Douglas, M., Schwarz, A.: JHEP 9802, 003 (1998)
12. Seiberg, N., Witten, E.: JHEP 9909, 032 (1999)
13. Chamseddine, A.H.: Commun. Math. Phys. 218, 283 (2001)
14. Einstein, A., Strauss, E.: Ann. Math. 47, 731 (1946)
15. Goldberg, S.I.: Ann. Math. 63, 64 (1956)
16. Yano, K.: Differential Geometry on Complex and Almost Complex Manifolds. New York: Pergamon Press, 1965
17. Gauduchon, P.: Math. Ann. 267, 495 (1984)
Communicated by A. Connes
Commun. Math. Phys. 264, 303–316 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1467-6
Communications in
Mathematical Physics
Quantum Leaks
Jens Marklof
School of Mathematics, University of Bristol, Bristol BS8 1TW, U.K. E-mail: [email protected]
Received: 28 March 2005 / Accepted: 2 May 2005 Published online: 18 November 2005 – © Springer-Verlag 2005
Abstract: We show that eigenfunctions of the Laplacian on certain non-compact domains with finite area may localize at infinity, provided there is no extreme level clustering, and thus rule out quantum unique ergodicity for such systems. The construction is elementary and based on ‘bouncing ball’ quasimodes whose discrepancy is proved to be significantly smaller than the mean level spacing.

1. Introduction

Consider a region $D$ in $\mathbb{R}^2$ with piecewise smooth boundary and finite area. The billiard flow on the unit cotangent bundle of $D$ is defined as the motion along straight lines with specular reflections at its boundary $\partial D$. The quantum states and energy levels of the flow are determined by the eigenvalue problem for the Dirichlet Laplacian,$^1$
$$(\Delta + \lambda)\varphi = 0, \qquad \varphi\big|_{\partial D} = 0, \tag{1.1}$$
where $\Delta = \partial_x^2 + \partial_y^2$. It is well known that the spectrum is discrete. The asymptotic distribution of the eigenvalues
$$0 < \lambda_1 \le \lambda_2 \le \dots \to \infty \tag{1.2}$$
is governed by Weyl’s law (cf. [30, 2, 3, 19, 4] and references therein)
$$\lim_{\lambda\to\infty}\frac{\#\{j : \lambda_j < \lambda\}}{\lambda} = \frac{\operatorname{Area}(D)}{4\pi}. \tag{1.3}$$
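As an illustration of Weyl’s law (1.3), one can count Dirichlet eigenvalues on a rectangle, where they are known in closed form. The sketch below is not part of the paper’s argument: the rectangle, the function names `dirichlet_count` and `weyl_ratio`, and the sample values of $\lambda$ are assumptions chosen for the example. The ratio of the exact count to the leading Weyl term should approach 1 from below as $\lambda$ grows.

```python
import math


def dirichlet_count(a, b, lam):
    """Count Dirichlet eigenvalues pi^2 (m^2/a^2 + n^2/b^2) < lam on an a-by-b rectangle."""
    count = 0
    m = 1
    while (math.pi * m / a) ** 2 < lam:
        rest = lam - (math.pi * m / a) ** 2
        # Number of integers n >= 1 with (pi n / b)^2 < rest.
        n_max = math.ceil(b * math.sqrt(rest) / math.pi) - 1
        count += max(n_max, 0)
        m += 1
    return count


def weyl_ratio(a, b, lam):
    """Exact count divided by the leading Weyl term Area * lam / (4 pi)."""
    return dirichlet_count(a, b, lam) / (a * b * lam / (4 * math.pi))


for lam in (1e4, 4e4):
    print(lam, weyl_ratio(1.0, 1.0, lam))
```

The deficit from 1 reflects the boundary correction, which is of relative size proportional to $\lambda^{-1/2}$.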
Research supported by an EPSRC Advanced Research Fellowship and EPSRC Research Grant GR/T28058/01.
$^1$ Our results can easily be adapted to the case of Neumann boundary conditions provided the spectrum of the Laplacian is discrete (which, in contrast to Dirichlet conditions, is not generally the case for non-compact regions with finite area).
The mean spacing between consecutive eigenvalues is therefore asymptotically constant. We denote by $\{\varphi_j\}_j$ an orthonormal basis of eigenfunctions, and consider the probability measure
$$d\nu_j = |\varphi_j(x,y)|^2\,dx\,dy \tag{1.4}$$
associated with the $j$th eigenstate. One of the central problems in quantum chaos is to classify all weak limits of $d\nu_j$ as $j \to \infty$. The quantum ergodicity theorem, due to Schnirelman, Zelditch and Colin de Verdière [29, 32, 10] (adapted for billiard flows on domains of the above type in [34]), asserts that, if the underlying dynamics is ergodic, there is a subsequence $\lambda_{j_1}, \lambda_{j_2}, \dots$ of full density$^2$ such that the corresponding eigenfunctions $\varphi_{j_i}$ ($i \to \infty$) become uniformly distributed on the unit cotangent bundle of $D$. This implies for instance that for any set $A \subset D$ with smooth boundary,
$$\lim_{i\to\infty}\int_A d\nu_{j_i} = \frac{\operatorname{Area}(A)}{\operatorname{Area}(D)}. \tag{1.5}$$
The proof of this theorem does not indicate whether in fact all eigenfunctions become uniformly distributed (a phenomenon called quantum unique ergodicity since there is only one possible quantum limit [27, 28]), or if there may exist sparse subsequences that have a singular limit, e.g., measures concentrated on periodic orbits of the billiard flow. Such exceptional subsequences have been observed in numerical experiments and are referred to as scars or bouncing ball modes. Following earlier results for quantum maps [11, 26, 20, 21], recent seminal contributions on the question of quantum unique ergodicity include the work of Faure, Nonnenmacher and De Bièvre [15, 14], who prove the existence of localized eigenstates for quantum cat maps, and Lindenstrauss’ proof [25] of quantum unique ergodicity in the case of Hecke eigenstates$^3$ of the Laplacian on compact arithmetic hyperbolic surfaces of congruence type.

In the present paper we show that for certain non-compact domains $D \subset \mathbb{R}^2$ with finite area the sequence of measures $d\nu_j$ is not tight,$^4$ provided there is no extreme clustering of eigenvalues. Hence there exist subsequences of eigenstates $\varphi_{j_i}$ that leak to infinity, and quantum unique ergodicity is not satisfied for such a system. Let $D$ be given by
$$D = \{(x,y) \in \mathbb{R}^2 : x > 0,\ 0 < y < f(x)\}, \tag{1.6}$$
where $f : (0,\infty) \to (0,\infty)$ is right-continuous and decreasing to 0 as $x \to \infty$. More specifically, we assume that $f$ is constant on the intervals $[a_i, a_{i+1})$, $i = 1, 2, 3, \dots$. Examples of such domains are displayed in Figs. 1–3. The condition
$$\sum_{i=1}^\infty \ell_i\delta_i < \infty, \qquad \text{with } \delta_i := f(a_i) \text{ and } \ell_i := a_{i+1} - a_i, \tag{1.7}$$
ensures $D$ has finite area.

$^2$ A subsequence $\{\lambda_{j_i}\}_i$ is of full density if $\lim_{\lambda\to\infty}\#\{i : \lambda_{j_i} < \lambda\}/\#\{j : \lambda_j < \lambda\} = 1$.
$^3$ Hecke eigenstates are simultaneous eigenfunctions of the Laplacian and all Hecke operators. If the spectrum of the Laplacian is simple, as conjectured e.g. for the modular surface, any eigenfunction of the Laplacian is a Hecke eigenstate.
$^4$ A sequence of probability measures $d\nu_j$ is tight if for any $\epsilon > 0$ there is a compact domain $K \subset D$ such that $\limsup_{j\to\infty}\int_{D-K}d\nu_j < \epsilon$.

To illustrate our main result, let us for example choose $\delta_i = i^{-(1+\sigma)}$ and $\ell_i = i^\rho$, where $\sigma > \rho > 0$ are arbitrary fixed constants. Theorem 1 in
Fig. 1. Leaky Sinai billiard
Fig. 2. Leaky Bunimovich billiard
Fig. 3. Leaky polygonal billiard
Sect. 3 implies that there is a constant $C > 0$ such that (at least) one of the following two statements is true:

There is a subsequence of eigenfunctions $\varphi_{j_i}$ ($i = 1, 2, \dots$) with eigenvalues $\lambda_{j_i} \in \pi^2 i^{2(1+\sigma)} + [-Ci^{-2\rho}, Ci^{-2\rho}]$ and some $c > 0$ such that for any compact $K \subset D$ we have
$$\liminf_{i\to\infty}\int_{D-K}d\nu_{j_i} > c. \tag{1.8}$$

The number of eigenvalues $\lambda_j$ in the interval $\pi^2 i^{2(1+\sigma)} + [-Ci^{-2\rho}, Ci^{-2\rho}]$ is unbounded as $i \to \infty$.

The first statement implies that eigenfunctions lose a positive proportion of mass. The second alternative implies extreme level clustering; this seems unlikely for a generic billiard of the above type, but cannot a priori be ruled out. To get a rough idea on whether to expect more level clustering than in the case of compact domains $D$, we show in Sect. 5 that the spectral counting function has the asymptotics (Theorem 2)
$$\#\{j : \lambda_j < \lambda\} = \frac{\operatorname{Area}(D)}{4\pi}\lambda - \frac{L(\lambda)}{4\pi}\sqrt\lambda + \frac{1}{2\pi}\sqrt\lambda\sum_{\substack{i=1\\ \delta_i\sqrt\lambda > \pi}}^\infty \ell_i\sum_{r=1}^\infty\frac1r J_1\!\left(2r\delta_i\sqrt\lambda\right) + O(\sqrt\lambda), \tag{1.9}$$
where
$$L(\lambda) = 2\sum_{\substack{i=1\\ \delta_i\sqrt\lambda > \pi}}^\infty \ell_i \tag{1.10}$$
is an ‘effective length’ of the boundary $\partial D$ and $J_1$ is the $J$-Bessel function. The fluctuations are therefore larger than in the compact case, where the error term is of order $O(\sqrt\lambda)$; cf. Sect. 5 for a more detailed discussion.

The proof of Theorem 1 is elementary and based on the construction of ‘bouncing ball’ quasimodes [17, 1, 31, 13, 6–8, 33, 18] (see also Bogomolny and Schmit’s recent work on eigenfunctions in pseudo-integrable billiards [5]). The non-compactness of the domain allows for quasimodes with discrepancy almost as small as $O(\mu^{-1})$, where $\mu$ is the quasi-eigenvalue. The best rigorous bound for the discrepancy in the compact case is $O(1)$, cf. [13]. Our construction is completely independent of the choice of $f$ on the interval $(0, a_1)$, and one may use this additional freedom to try and tune $f$ on $(0, a_1)$ in such a way that the billiard flow on $D$ is ergodic. It seems plausible that this is possible if the billiard flow on the restricted compact region $D_0 = \{(x,y) \in \mathbb{R}^2 : 0 < x < a_1,\ 0 < y < f(x)\}$ is ergodic (as in the examples displayed in Figs. 1 and 2), but to the best of my knowledge there are no rigorous results in this direction (see however [23, 24, 16] for proofs of ergodicity for different classes of non-compact domains). A further interesting class of examples are infinite pseudo-integrable billiards (Fig. 3) that are known to be ergodic$^5$ for almost all initial directions [12].

2. Quasimodes

A function $\psi \in H_0^1(D)$ is called a quasimode for $-\Delta$ with quasi-eigenvalue $\mu$ and discrepancy $\epsilon$, if
$$\|(\Delta + \mu)\psi\| \le \epsilon\|\psi\|, \qquad \psi\big|_{\partial D} = 0, \tag{2.1}$$
where $\|\cdot\|$ denotes the $L^2$ norm. A sequence of quasimodes $\{\psi_i\}_i$ with quasi-eigenvalues $\mu_i$ is of order $s$, if
$$\|(\Delta + \mu_i)\psi_i\| = O\!\left(\mu_i^{-s/2}\right)\|\psi_i\|. \tag{2.2}$$
We summarize a few important properties of quasimodes; more details can be found in [9, 22, 13, 33]. By expanding $\psi$ in an orthonormal basis of eigenfunctions, $\psi = \sum_j\langle\psi,\varphi_j\rangle\varphi_j$, it is easy to see that (2.1) implies
$$\sum_j|\langle\psi,\varphi_j\rangle|^2(\lambda_j - \mu)^2 \le \epsilon^2\|\psi\|^2 = \epsilon^2\sum_j|\langle\psi,\varphi_j\rangle|^2. \tag{2.3}$$
Hence $|\lambda_j - \mu| \le \epsilon$ for at least one $j$, i.e., there is at least one eigenvalue $\lambda_j$ in the interval $[\mu - \epsilon, \mu + \epsilon]$. Consider the larger interval $J = [\mu - b\epsilon, \mu + b\epsilon]$, $b > 1$. We have
$$\sum_{\lambda_j\notin J}|\langle\psi,\varphi_j\rangle|^2 \le (b\epsilon)^{-2}\sum_{\lambda_j\notin J}|\langle\psi,\varphi_j\rangle|^2(\lambda_j - \mu)^2 \le b^{-2}\|\psi\|^2. \tag{2.4}$$

$^5$ Since the modulus of the momentum components in both $x$- and $y$-directions are constants of motion, ergodicity is here understood with respect to a two-dimensional submanifold of the unit cotangent bundle.
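For a domain $A \subset D$ define
$$\|\psi\|_A = \left(\int_A|\psi(x,y)|^2\,dx\,dy\right)^{1/2}.$$
Triangle and Cauchy-Schwarz inequality imply
$$\begin{aligned}
\|\psi\|_A &\le \Big\|\sum_{\lambda_j\in J}\langle\psi,\varphi_j\rangle\varphi_j\Big\|_A + \Big\|\sum_{\lambda_j\notin J}\langle\psi,\varphi_j\rangle\varphi_j\Big\|_A\\
&\le \Big(\sum_{\lambda_j\in J}|\langle\psi,\varphi_j\rangle|^2\Big)^{1/2}\Big(\sum_{\lambda_j\in J}\|\varphi_j\|_A^2\Big)^{1/2} + \Big\|\sum_{\lambda_j\notin J}\langle\psi,\varphi_j\rangle\varphi_j\Big\|_A \qquad (2.5)\\
&\le \|\psi\|\Big(\sum_{\lambda_j\in J}\|\varphi_j\|_A^2\Big)^{1/2} + \Big(\sum_{\lambda_j\notin J}|\langle\psi,\varphi_j\rangle|^2\Big)^{1/2}, \qquad (2.6)
\end{aligned}$$
and hence, together with (2.4),
$$\Big(\sum_{\lambda_j\in J}\|\varphi_j\|_A^2\Big)^{1/2} \ge \frac{\|\psi\|_A}{\|\psi\|} - b^{-1}. \tag{2.7}$$
Now suppose that for a sequence of quasimodes $\psi_i$ with quasi-eigenvalue $\mu_i$ and discrepancy $\epsilon_i$
$$\text{the intervals } J_i = [\mu_i - b\epsilon_i, \mu_i + b\epsilon_i] \text{ each contain at most } k \text{ eigenvalues } \lambda_j. \tag{2.8}$$
Then, in each interval $J_i$ there is a $\lambda_{j_i}$ such that
$$\|\varphi_{j_i}\|_A \ge \frac{1}{\sqrt k}\left(\frac{\|\psi_i\|_A}{\|\psi_i\|} - b^{-1}\right). \tag{2.9}$$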
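The two spectral facts used above, namely that a quasimode with discrepancy $\epsilon$ forces an eigenvalue within distance $\epsilon$ of $\mu$, and the tail bound (2.4), can be illustrated in a finite-dimensional toy model. The sketch below is an assumption-laden illustration, not the paper’s setting: the operator is taken to be diagonal with randomly chosen eigenvalues `eigs`, the trial vector has coefficients `coeffs` concentrated near `mu`, and all names are hypothetical.

```python
import math
import random

random.seed(1)

# Toy spectrum and a trial vector written in the eigenbasis.
eigs = sorted(random.uniform(0.0, 100.0) for _ in range(50))
mu = 40.0
coeffs = [random.gauss(0.0, 1.0) * math.exp(-abs(lam - mu)) for lam in eigs]

norm2 = sum(c * c for c in coeffs)

# Discrepancy: eps = ||(H - mu) psi|| / ||psi|| for the diagonal operator H.
eps = math.sqrt(sum(c * c * (lam - mu) ** 2 for c, lam in zip(coeffs, eigs)) / norm2)

# Distance from mu to the nearest eigenvalue; (2.3) forces nearest <= eps.
nearest = min(abs(lam - mu) for lam in eigs)

# Relative mass outside J = [mu - b*eps, mu + b*eps]; (2.4) bounds it by b^-2.
b = 3.0
tail = sum(c * c for c, lam in zip(coeffs, eigs) if abs(lam - mu) > b * eps) / norm2

print(nearest <= eps, tail <= b ** -2)
```

Both inequalities are instances of Chebyshev-type estimates for the spectral weights $|\langle\psi,\varphi_j\rangle|^2$, so they hold for any choice of spectrum and coefficients.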
3. Leaky Domains

Let $f : (0,\infty) \to (0,\infty)$ be a right-continuous function, monotonically decreasing to 0 on the half-line $[a_1,\infty)$ (for some $a_1 > 0$), and $\int f(x)\,dx < \infty$. We are interested in the domain $D = \{(x,y) \in \mathbb{R}^2 : x > 0,\ 0 < y < f(x)\}$. In the following we will assume that $f$ is chosen so that
$$\int_{a_1}^\infty f(x)\,h\!\left(\pi^2 f(x)^{-2}\right)dx < \infty, \tag{3.1}$$
where $h : [0,\infty) \to [0,\infty)$ is a fixed increasing function bounded by $h(x) \le \sqrt x$. The central result is the following.$^6$

$^6$ The notation $A \ll B$ for two positive quantities $A$, $B$ means there is a constant $C > 0$ such that $A \le CB$. We write $A \asymp B$ if $A \ll B \ll A$.
Theorem 1. For any given decreasing function $\tau : [0,\infty) \to (0,\infty)$, and any infinite sequence of real numbers
$$0 < \mu_1 \le \mu_2 \le \dots \to \infty \tag{3.2}$$
satisfying
$$\sum_{i=1}^\infty\tau(\mu_i) < \infty, \tag{3.3}$$
there is a domain $D$ of the above type whose Dirichlet Laplacian has an infinite sequence of quasimodes $\psi_{i,m,n}$ with quasi-eigenvalues
$$\mu_{i,m,n} = n^2\mu_i + m^2\xi_i, \qquad i, m, n \in \mathbb{N}, \tag{3.4}$$
and
$$\xi_i \asymp \frac{h(\mu_i)^2}{\mu_i\,\tau(\mu_i)^2}, \tag{3.5}$$
so that
(i) $\|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(m\xi_i)\,\|\psi_{i,m,n}\|$,
(ii) $\langle\psi_{i,m,n},\psi_{i',m',n'}\rangle = 0$ for $i \ne i'$ or $n \ne n'$,
(iii) $|\langle\psi_{i,m,n},\psi_{i,m',n}\rangle| \ll \min\{0.001, |m - m'|^{-1}\}\,\|\psi_{i,m,n}\|\,\|\psi_{i,m',n}\|$ for $m \ne m'$,
(iv) for any compact set $K \subset D$,
$$\frac{\|\psi_{i,m,n}\|_{D-K}}{\|\psi_{i,m,n}\|} \to 1 \tag{3.6}$$
uniformly for all $m, n \in \mathbb{N}$ as $i \to \infty$.

Remark 1.1. Note that the set $\{\mu_{i,m,n} : i, m, n \in \mathbb{N}\}$ is a discrete subset of $\mathbb{R}_+$, with mean density
$$\lim_{\lambda\to\infty}\frac{\#\{(i,m,n) : \mu_{i,m,n} < \lambda\}}{\lambda} = \frac{C}{4\pi}, \tag{3.7}$$
where
$$C = \pi^2\sum_i\frac{1}{\sqrt{\mu_i\xi_i}} \le \operatorname{Area}(D). \tag{3.8}$$
This may either be verified directly, or concluded from the observation (cf. Sects. 4 and 6) that $\{\mu_{i,m,n}\}$ can be identified with the spectrum of the Dirichlet Laplacian on an infinite union of rectangles $D_i$ with sides $\ell_i = \pi\xi_i^{-1/2}$, $\delta_i = \pi\mu_i^{-1/2}$, and thus total area $C = \sum_i\operatorname{Area}(D_i)$. In this interpretation, (3.7) represents Weyl’s law (1.3).
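The identification in Remark 1.1 is a one-line computation: a rectangle with sides $\ell = \pi\xi^{-1/2}$ and $\delta = \pi\mu^{-1/2}$ has Dirichlet eigenvalues $\pi^2(m^2/\ell^2 + n^2/\delta^2) = m^2\xi + n^2\mu$ and area $\pi^2/\sqrt{\mu\xi}$. The sketch below checks this numerically; the sample values of $\mu_i$, $\xi_i$ and the function names are arbitrary choices made for illustration.

```python
import math


def rectangle_sides(mu_i, xi_i):
    """Sides (ell, delta) of the rectangle associated with mu_i and xi_i."""
    return math.pi / math.sqrt(xi_i), math.pi / math.sqrt(mu_i)


def rectangle_eigenvalue(ell, delta, m, n):
    """Dirichlet eigenvalue of the ell-by-delta rectangle for mode numbers (m, n)."""
    return math.pi ** 2 * ((m / ell) ** 2 + (n / delta) ** 2)


mu_i, xi_i = 250.0, 0.04  # arbitrary sample values
ell, delta = rectangle_sides(mu_i, xi_i)

matches = all(
    math.isclose(rectangle_eigenvalue(ell, delta, m, n), n * n * mu_i + m * m * xi_i)
    for m in range(1, 6)
    for n in range(1, 6)
)
area_matches = math.isclose(ell * delta, math.pi ** 2 / math.sqrt(mu_i * xi_i))
print(matches, area_matches)
```

Large $\mu_i$ together with small $\xi_i$ corresponds to a long, thin rectangle, which matches the geometry of the cusp-like pieces of $D$.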
Remark 1.2. If assumption (2.8) holds e.g. for the quasimodes $\psi_{i,1,1}$, Eqs. (2.9) and (3.6) imply there is an infinite sequence of eigenfunctions $\varphi_{j_i}$ with $\|\varphi_{j_i}\| = 1$, such that for any compact $K \subset D$,
$$\liminf_{i\to\infty}\|\varphi_{j_i}\|_{D-K} \ge \frac{1 - b^{-1}}{\sqrt k}. \tag{3.9}$$
That is, the eigenstates $\varphi_{j_i}$ lose a positive proportion of mass. It should be stressed that we have not ruled out the probably very remote possibility that assumption (2.8) with $\epsilon_i = O(m\xi_i)$ can never be satisfied for the domains $D$ considered in the theorem (an explicit construction of $D$ is given in Sect. 4). It would be interesting to see whether (2.8) can be established at least for generic choices of such $D$, i.e., generic choices of $\delta_i$. In Sect. 5 we will prove an upper bound for the error term in Weyl’s law, which in turn yields a rough estimate on possible level clustering.

Remark 1.3. For $m, n$ bounded as $i \to \infty$ the theorem establishes quasimodes with very small discrepancy,
$$\|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O\!\left(\frac{h(\mu_{i,m,n})^2}{\mu_{i,m,n}\,\tau(\mu_{i,m,n})^2}\right)\|\psi_{i,m,n}\|. \tag{3.10}$$
Since $h$ and $\tau$ can be arbitrarily slowly increasing/decreasing functions (respectively), this yields quasimodes of order arbitrarily close to 2; cf. Example 1.1 below. The number of such quasimodes with $\mu_{i,m,n} < \lambda$,
$$N_{\mathrm{bb}}(\lambda) = \#\{(i,m,n) : m, n = O(1),\ \mu_{i,m,n} < \lambda\} \asymp \#\{i : \mu_i < \lambda\}, \tag{3.11}$$
is determined by the restriction that
$$\int\tau(\lambda)\,dN_{\mathrm{bb}}(\lambda) < \infty. \tag{3.12}$$
Hence the higher the desired accuracy of quasimodes (achieved by choosing a sufficiently slowly decreasing $\tau$), the thinner the corresponding sequence of quasimodes becomes.

Remark 1.4. The theorem also implies that there can be sequences of quasimodes of order zero that have almost full density. ‘Order zero’ means that
$$\|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(1)\,\|\psi_{i,m,n}\|, \tag{3.13}$$
i.e., $m\xi_i \le C_1$ for some constant $C_1 > 0$. Since in view of (3.5) there is a constant $C_2 > 0$ such that $\xi_i\mu_i \ge C_2$, we have
$$\begin{aligned}
N_{\mathrm{BB}}(\lambda) &= \#\{(i,m,n) : \mu_{i,m,n} = n^2\mu_i + m^2\xi_i < \lambda,\ m\xi_i \le C_1\}\\
&\ge \#\left\{(i,m,n) : n^2 < \frac{\lambda}{\mu_i} - \frac{C_1^2}{C_2},\ m \le \frac{C_1}{\xi_i}\right\}\\
&\gg \sqrt\lambda\sum_{\mu_i<\lambda}\frac{\sqrt{\mu_i}\,\tau(\mu_i)^2}{h(\mu_i)^2}.
\end{aligned} \tag{3.14}$$
For suitable choices of h and τ this quantity can be arbitrarily √ close to a function λ, cf. (3.18). On the other hand, it is bounded from below by λ. This bound is attained in the case when ∞ √ µi τ (µi )2 i=1
h(µi )2
< ∞,
(3.15)
and coincides with the bound for compact domains, cf. [13]. Note that the heuristic approaches in [1, 31] predict a greater number of bouncing ball modes.

Example 1.1. Take $h(x) = x^{\beta}$ with $0 \le \beta < 1/2$. For any given infinite sequence of real numbers $\mu_i$ with
\[ \#\{j : \mu_j \le \lambda\} \asymp \lambda^{\alpha}, \tag{3.16} \]
there is a domain $D$ with $\int f(x)^{1-2\beta}\,dx < \infty$, so that the corresponding quasimodes $\psi_j$ have order $2 - 2\sigma$, for any fixed $\sigma > 2(\alpha + \beta)$. That is,
\[ \|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(m\,\mu_{i,m,n}^{-1+\sigma})\,\|\psi_{i,m,n}\|. \tag{3.17} \]
The fact that (3.16) implies (3.3) with $\tau(x) = x^{-\alpha'}$ ($\alpha' > \alpha$) is seen by summation by parts. In view of Weyl’s law (1.3) and the small discrepancy $O(\mu_{i,m,n}^{-1+\sigma})$ for bounded $m$, a failure of assumption (2.8) would imply an extreme clustering of eigenvalues. As we shall see in Sect. 5, the bounds on the error term in Weyl’s law worsen as $\sigma \to 0$, and hence clustering cannot be ruled out. An evaluation of the lower bound for the number of order-zero quasimodes in (3.14) yields
\[ N_{BB}(\lambda) \gg \lambda^{\theta}, \tag{3.18} \]
with $\theta = \max\{1 + \alpha - 2\alpha' - 2\beta,\ 1/2\}$. Note that $\theta$ can be arbitrarily close to 1 for suitable parameter choices.

Example 1.2. A second interesting choice that yields a domain $D$ with exponentially narrow cusps is $h(x) = \sqrt{x}/\log^{\gamma}(1 + x)$ with $\gamma > 0$. For any given infinite sequence of real numbers $\mu_i$ with
\[ \#\{j : \mu_j \le \lambda\} \asymp \log^{\alpha}\lambda, \tag{3.19} \]
there is a domain $D$ with $\int |\log f(x)|^{-\gamma}\,dx < \infty$, so that
\[ \|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(m \log^{-\sigma}\mu_{i,m,n})\,\|\psi_{i,m,n}\|, \tag{3.20} \]
for any fixed $\sigma < 2(\gamma - \alpha)$. Choose here $\tau(x) = \log^{-\alpha'} x$ with $\alpha' > \alpha$, and (3.3) can again be checked using summation by parts. In this case the number of order-zero quasimodes is bounded from below by
\[ N_{BB}(\lambda) \gg \sqrt{\lambda}. \tag{3.21} \]
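4. Proof of Theorem 1

We begin by constructing accurate quasimodes on the rectangle $[a, a + \ell] \times [0, \delta]$ with Dirichlet boundary conditions at $y = 0, \delta$. Let $\chi \in C_0^{\infty}(\mathbb{R})$ be a mollified characteristic function of the interval $[0, 1]$. That is, $0 \le \chi(x) \le 1$, $\chi(x) = 0$ for $x \notin [0, 1]$ and $\chi(x) = 1$ for $x \in [\epsilon, 1 - \epsilon]$ for some fixed, small $\epsilon > 0$. We assume also that $\chi'(x) = O(\epsilon^{-1})$ (such a choice is always possible). For $m, n \in \mathbb{N}$, $a \in \mathbb{R}$ and $\ell, \delta > 0$ put
\[ \psi_{m,n}(x, y) = \chi\!\left(\frac{x - a}{\ell}\right) \sin\!\left(\frac{\pi m (x - a)}{\ell}\right) \sin\!\left(\frac{\pi n y}{\delta}\right) \tag{4.1} \]
and
\[ \mu_{m,n} = \pi^2\left[\left(\frac{m}{\ell}\right)^2 + \left(\frac{n}{\delta}\right)^2\right]. \tag{4.2} \]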
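Straightforward differentiation yields
\[ (\Delta + \mu_{m,n})\psi_{m,n}(x, y) = \frac{1}{\ell^2}\left[2\pi m\,\chi'\!\left(\frac{x - a}{\ell}\right)\cos\!\left(\frac{\pi m (x - a)}{\ell}\right) + \chi''\!\left(\frac{x - a}{\ell}\right)\sin\!\left(\frac{\pi m (x - a)}{\ell}\right)\right]\sin\!\left(\frac{\pi n y}{\delta}\right), \tag{4.3} \]
and hence
\[ \|(\Delta + \mu_{m,n})\psi_{m,n}\|^2 = O_{\chi}\!\left(\frac{m^2\delta}{\ell^3}\right), \tag{4.4} \]
where the implied constant only depends on the choice of $\chi$. Because of this and
\[ \|\psi_{m,n}\|^2 = \frac{\ell\delta}{4}\,(1 + O(\epsilon)), \tag{4.5} \]
we obtain
\[ \|(\Delta + \mu_{m,n})\psi_{m,n}\| = O_{\chi}\!\left(\frac{m}{\ell^2}\right)\|\psi_{m,n}\|. \tag{4.6} \]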
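The step from (4.4) and (4.5) to (4.6) is a division of the two squared norms. The following worked line is a sketch of that arithmetic only; it assumes the notation $\ell$ for the side length of the rectangle and $\epsilon$ for the mollification width, and it absorbs the factor $(1 + O(\epsilon))^{-1}$ into the $\chi$-dependent constant.

```latex
\frac{\|(\Delta+\mu_{m,n})\psi_{m,n}\|^{2}}{\|\psi_{m,n}\|^{2}}
  = \frac{O_{\chi}\!\left(m^{2}\delta\,\ell^{-3}\right)}{\tfrac{\ell\delta}{4}\,\bigl(1+O(\epsilon)\bigr)}
  = O_{\chi}\!\left(\frac{m^{2}}{\ell^{4}}\right)
\quad\Longrightarrow\quad
\|(\Delta+\mu_{m,n})\psi_{m,n}\|
  = O_{\chi}\!\left(\frac{m}{\ell^{2}}\right)\|\psi_{m,n}\| .
```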
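Furthermore, for $n \ne n'$ we have $\langle \psi_{m,n}, \psi_{m',n'} \rangle = 0$, and for $n = n'$, $m \ne m'$,
\[
\begin{aligned}
\langle \psi_{m,n}, \psi_{m',n} \rangle
&= \frac{\delta}{2}\int_0^{\ell} \chi\!\left(\frac{x}{\ell}\right)^{2} \sin\!\left(\frac{\pi m x}{\ell}\right)\sin\!\left(\frac{\pi m' x}{\ell}\right) dx \\
&= \frac{\delta}{2}\int_0^{\ell} \left[\chi\!\left(\frac{x}{\ell}\right)^{2} - 1\right] \sin\!\left(\frac{\pi m x}{\ell}\right)\sin\!\left(\frac{\pi m' x}{\ell}\right) dx \\
&= \frac{\ell\delta}{4}\left[\int_0^{\epsilon} + \int_{1-\epsilon}^{1}\right] [\chi(x)^2 - 1]\,[\cos(\pi(m - m')x) - \cos(\pi(m + m')x)]\,dx \\
&= \frac{\ell\delta}{4}\,O(\epsilon).
\end{aligned}
\tag{4.7}
\]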
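On the other hand, using integration by parts, we have
\[ \int_0^{\epsilon} [\chi(x)^2 - 1]\cos(\pi(m - m')x)\,dx = \frac{1}{\pi(m - m')}\left\{ \Big[(\chi(x)^2 - 1)\sin(\pi(m - m')x)\Big]_0^{\epsilon} - \int_0^{\epsilon} 2\chi(x)\chi'(x)\sin(\pi(m - m')x)\,dx \right\}. \tag{4.8} \]
Since $\chi(\epsilon)^2 = 1$, $\sin(0) = 0$ the first term vanishes, and since $\chi'(x) = O(\epsilon^{-1})$ the integral is of $O(1)$. The analogous argument works for the remaining integrals. Hence
\[ |\langle \psi_{m,n}, \psi_{m',n} \rangle| \ll \min\left\{\epsilon, \frac{1}{|m - m'|}\right\} \|\psi_{m,n}\|\,\|\psi_{m',n}\|. \tag{4.9} \]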
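We will now give an explicit construction of $D$. The function $f$ is chosen constant on the intervals $[a_i, a_{i+1})$, $i = 1, 2, 3, \ldots$; set $\delta_i = f(a_i)$ and $\ell_i = a_{i+1} - a_i$. As quasimodes we take
\[ \psi_{i,m,n}(x, y) = \chi\!\left(\frac{x - a_i}{\ell_i}\right)\sin\!\left(\frac{\pi m (x - a_i)}{\ell_i}\right)\sin\!\left(\frac{\pi n y}{\delta_i}\right), \tag{4.10} \]
with quasi-eigenvalues
\[ \mu_{i,m,n} = \pi^2\left[\left(\frac{m}{\ell_i}\right)^2 + \left(\frac{n}{\delta_i}\right)^2\right]. \tag{4.11} \]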
By construction, these are completely localized in the rectangle $[a_i, a_{i+1}] \times [0, \delta_i]$ and hence satisfy requirement (iv) of the theorem. Setting $\mu_i = \pi^2\delta_i^{-2}$, every given sequence of $\mu_i$ having property (3.3) determines a sequence of $\delta_i$. Because of (4.6),
\[ \frac{\|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\|}{\|\psi_{i,m,n}\|} = O_{\chi}(m\,\ell_i^{-2}) = O_{\chi}(m\,\delta_i^{2} A_i^{-2}) = O_{\chi}(m\,\mu_i^{-1} A_i^{-2}). \tag{4.12} \]
To minimize the discrepancy, we would like to choose $A_i$ as large as possible. The choice $A_i = \tau(\mu_i)h(\mu_i)^{-1}$ yields condition (i) and determines $f$. Since
\[ \int_{a_1}^{\infty} f(x)\,h(\pi^2 f(x)^{-2})\,dx = \sum_i \ell_i\, f(a_i)\,h(\pi^2 f(a_i)^{-2}) = \sum_i A_i\, h(\pi^2\delta_i^{-2}) = \sum_i \tau(\mu_i) < \infty, \tag{4.13} \]
the function $f$ is in the required class satisfying (3.1). Condition (ii) is evident from (4.1), and (iii) from (4.9).
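5. Asymptotic Distribution of Eigenvalues

In view of condition (2.8) we would like to control the number of eigenvalues in small intervals. The following theorem illustrates that extreme level clustering cannot a priori be ruled out.

Theorem 2. The spectral counting function $N(\lambda) = \#\{j : \lambda_j < \lambda\}$ of the Dirichlet Laplacian for the domain $D$ (as in Sect. 4) satisfies
\[ N(\lambda) = \frac{\mathrm{Area}(D)}{4\pi}\,\lambda - \frac{L(\lambda)}{4\pi}\sqrt{\lambda} + \frac{1}{2\pi}\sqrt{\lambda} \sum_{\substack{i=1 \\ \delta_i\sqrt{\lambda} > \pi}}^{\infty} \ell_i \sum_{r=1}^{\infty} \frac{1}{r}\, J_1\!\left(2 r \delta_i \sqrt{\lambda}\right) + O(\sqrt{\lambda}), \tag{5.1} \]
where
\[ L(\lambda) = 2 \sum_{\substack{i=1 \\ \delta_i\sqrt{\lambda} > \pi}}^{\infty} \ell_i \tag{5.2} \]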
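and $J_1$ is the $J$-Bessel function.

Remark 2.1. The standard bound
\[ |J_1(x)| \ll x^{-1/2} \quad \text{for $x$ large} \tag{5.3} \]
implies that
\[ N(\lambda) = \frac{\mathrm{Area}(D)}{4\pi}\,\lambda + O(L(\lambda)\sqrt{\lambda}), \tag{5.4} \]
where
\[ L(\lambda) = 2\pi \sum_{\substack{i=1 \\ \mu_i < \lambda}}^{\infty} \frac{1}{\sqrt{\xi_i}} \ll \sum_{\substack{i=1 \\ \mu_i < \lambda}}^{\infty} \frac{\sqrt{\mu_i}\,\tau(\mu_i)}{h(\mu_i)}; \tag{5.5} \]
recall that $\mu_i = \pi^2/\delta_i^2$ and $\xi_i = \pi^2/\ell_i^2$. As the examples following Theorem 1 illustrate, a good quasimode discrepancy ($\xi_i$ small) is thus traded with an error bound in (5.4) approaching $o(\lambda)$. But as we shall see in the following section, cf. Eq. (6.7), the number of eigenvalues in the interval $[\lambda, \lambda + \sigma]$ with $\sigma < \sqrt{\lambda}$ is
\[ N(\lambda + \sigma) - N(\lambda) = \#\{(i, m, n) \in \mathbb{N}^3 : \lambda \le \mu_{i,m,n} < \lambda + \sigma\} + O(\sqrt{\lambda}), \tag{5.6} \]
with quasi-eigenvalues $\mu_{i,m,n}$ as in (3.4). That is, all extreme fluctuations beyond $O(\sqrt{\lambda})$ are due to the presence of bouncing ball quasimodes.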
6. Proof of Theorem 2

Consider the domains $D_i = \{(x, y) \in \mathbb{R}^2 : a_i < x < a_{i+1},\ 0 < y < f(x)\}$, where $i = 0, 1, 2, \ldots$ and $a_0 = 0$. Let $N_D^{(i)}(\lambda)$ be the spectral counting function for the Dirichlet Laplacian for $D_i$, and $N_N^{(i)}(\lambda)$ the counting function with Neumann conditions on the boundary lines $x = a_i$ and $x = a_{i+1}$ and Dirichlet conditions on the remaining boundary. Set
\[ N_D(\lambda) = \sum_{i=0}^{\infty} N_D^{(i)}(\lambda), \qquad N_N(\lambda) = \sum_{i=0}^{\infty} N_N^{(i)}(\lambda). \tag{6.1} \]
It is well known (‘Dirichlet-Neumann bracketing’ [2, 4]) that
\[ N_D(\lambda) \le N(\lambda) \le N_N(\lambda). \tag{6.2} \]
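For $i = 0$ the general error estimate in Weyl’s law for compact domains yields
\[ N_D^{(0)}(\lambda) = \frac{\mathrm{Area}(D_0)}{4\pi}\,\lambda + O(\sqrt{\lambda}), \qquad N_N^{(0)}(\lambda) = \frac{\mathrm{Area}(D_0)}{4\pi}\,\lambda + O(\sqrt{\lambda}). \tag{6.3} \]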
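For the remaining domains we have
\[ N_D'(\lambda) := \sum_{i=1}^{\infty} N_D^{(i)}(\lambda) = \#\{(m, n, i) \in \mathbb{N}^3 : n^2\mu_i + m^2\xi_i < \lambda\} \tag{6.4} \]
and
\[ N_N'(\lambda) := \sum_{i=1}^{\infty} N_N^{(i)}(\lambda) = N_D'(\lambda) + \#\{(n, i) \in \mathbb{N}^2 : n^2\mu_i < \lambda\}. \tag{6.5} \]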
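As a hedged numerical illustration of the count in (6.4) for one fixed index $i$, the sketch below counts the pairs $(m, n)$ with $n^2\mu + m^2\xi < \lambda$ by brute force and compares the result with the sum of $\sqrt{(\lambda - n^2\mu)/\xi}$ over the admissible $n$. Each row $n$ contributes an error of at most 1, so the two numbers differ by at most the number of rows. The function names and the sample values of $\mu$, $\xi$, $\lambda$ are illustrative assumptions, not taken from the paper.

```python
import math


def count_quasi_eigenvalues(mu: float, xi: float, lam: float) -> int:
    """Brute-force count of (m, n) in N^2 with n^2*mu + m^2*xi < lam."""
    count = 0
    n = 1
    while n * n * mu < lam:
        m = 1
        while n * n * mu + m * m * xi < lam:
            count += 1
            m += 1
        n += 1
    return count


def main_term(mu: float, xi: float, lam: float) -> float:
    """Sum over n >= 1 with n^2*mu < lam of sqrt((lam - n^2*mu) / xi)."""
    total = 0.0
    n = 1
    while n * n * mu < lam:
        total += math.sqrt((lam - n * n * mu) / xi)
        n += 1
    return total


def number_of_rows(mu: float, lam: float) -> int:
    """Number of n >= 1 with n^2*mu < lam (bounds the total rounding error)."""
    n = 0
    while (n + 1) ** 2 * mu < lam:
        n += 1
    return n
```

For a thin rectangle ($\xi$ much smaller than $\mu$) the number of rows is small compared with the count itself, which is the single-rectangle version of the $O(1)$-per-row error being summed over rows.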
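Note that
\[ N_N'(\lambda) - N_D'(\lambda) \le \sum_{\mu_i < \lambda} \sqrt{\frac{\lambda}{\mu_i}} = O(\sqrt{\lambda}), \tag{6.6} \]
since $\sum_i \mu_i^{-1/2} < \infty$, cf. (3.8). Therefore
\[ N(\lambda) = \frac{\mathrm{Area}(D_0)}{4\pi}\,\lambda + N_D'(\lambda) + O(\sqrt{\lambda}). \tag{6.7} \]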
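Now
\[ N_D'(\lambda) = \sum_{\substack{i,n=1 \\ n^2\mu_i < \lambda}}^{\infty} \left[\sqrt{\frac{\lambda - n^2\mu_i}{\xi_i}} + O(1)\right] = \sum_{\substack{i,n=1 \\ n^2\mu_i < \lambda}}^{\infty} \sqrt{\frac{\lambda - n^2\mu_i}{\xi_i}} + O(\sqrt{\lambda}), \tag{6.8} \]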
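recall the argument in (6.6). The main term is
\[
\begin{aligned}
\sum_{\substack{i,n=1 \\ n^2\mu_i < \lambda}}^{\infty} \sqrt{\frac{\lambda - n^2\mu_i}{\xi_i}}
&= \sqrt{\lambda} \sum_{\substack{i=1 \\ \mu_i < \lambda}}^{\infty} \frac{1}{\sqrt{\xi_i}} \sum_{n=1}^{\infty} F\!\left(n\sqrt{\frac{\mu_i}{\lambda}}\right) \\
&= \frac{1}{2}\sqrt{\lambda} \sum_{\substack{i=1 \\ \mu_i < \lambda}}^{\infty} \frac{1}{\sqrt{\xi_i}} \sum_{n=-\infty}^{\infty} F\!\left(n\sqrt{\frac{\mu_i}{\lambda}}\right) - \frac{1}{2}\sqrt{\lambda} \sum_{\mu_i < \lambda} \frac{1}{\sqrt{\xi_i}},
\end{aligned}
\tag{6.9}
\]
where $F(x) = \sqrt{\max\{1 - x^2, 0\}}$. The Poisson summation formula yields for the sum over $n$
\[ \sum_{n=-\infty}^{\infty} F\!\left(n\sqrt{\frac{\mu_i}{\lambda}}\right) = \sqrt{\frac{\lambda}{\mu_i}} \sum_{r=-\infty}^{\infty} \widehat{F}\!\left(r\sqrt{\frac{\lambda}{\mu_i}}\right), \tag{6.10} \]
where $\widehat{F}(0) = \pi/2$ and for $y \ne 0$,
\[ \widehat{F}(y) = \int_{-1}^{1} \sqrt{1 - x^2}\,\cos(2\pi x y)\,dx = \frac{1}{2y}\, J_1(2\pi y). \tag{6.11} \]
So
\[ \sum_{n=-\infty}^{\infty} F\!\left(n\sqrt{\frac{\mu_i}{\lambda}}\right) = \frac{\pi}{2}\sqrt{\frac{\lambda}{\mu_i}} + \sum_{r=1}^{\infty} \frac{1}{r}\, J_1\!\left(2\pi r\sqrt{\frac{\lambda}{\mu_i}}\right). \tag{6.12} \]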
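The Bessel identity in (6.11) can be checked numerically. The sketch below is an illustration under stated assumptions: `bessel_j1` is a hand-written power series for $J_1$ (adequate only for moderate arguments), and `fourier_transform_F` evaluates the integral with a midpoint rule after the substitution $x = \sin t$. Both helper names are hypothetical, and only the standard library is used.

```python
import math


def bessel_j1(x: float, terms: int = 60) -> float:
    """J_1(x) from its power series sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!)."""
    total = 0.0
    term = x / 2.0  # k = 0 term
    for k in range(terms):
        total += term
        # ratio of consecutive series terms
        term *= -((x / 2.0) ** 2) / ((k + 1) * (k + 2))
    return total


def fourier_transform_F(y: float, steps: int = 2000) -> float:
    """Integral of sqrt(1 - x^2) * cos(2*pi*x*y) over [-1, 1].

    With x = sin(t) the integrand becomes cos(t)^2 * cos(2*pi*y*sin(t)) on
    [-pi/2, pi/2], which is smooth and pi-periodic, so the midpoint rule
    converges very quickly.
    """
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        t = -math.pi / 2 + (j + 0.5) * h
        total += math.cos(t) ** 2 * math.cos(2 * math.pi * y * math.sin(t))
    return total * h
```

Comparing `fourier_transform_F(y)` with `bessel_j1(2 * math.pi * y) / (2 * y)` for a few values of `y` gives agreement to well below $10^{-9}$, and `fourier_transform_F(0.0)` returns $\pi/2$ up to rounding.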
The bound (5.3) proves the convergence of the series on the right hand side of (6.12). This concludes the proof of Theorem 2. Acknowledgement. I thank M. van den Berg, M. Degli Esposti, J. Keating, M. Lenci, Z. Rudnick and R. Schubert for stimulating discussions.
References
1. Bäcker, A., Schubert, R., Stifter, P.: On the number of bouncing ball modes in billiards. J. Phys. A 30, no. 19, 6783–6795 (1997)
2. van den Berg, M.: Dirichlet-Neumann bracketing for horn-shaped regions. J. Funct. Anal. 104, no. 1, 110–120 (1992)
3. van den Berg, M.: On the spectral counting function for the Dirichlet Laplacian. J. Funct. Anal. 107, no. 2, 352–361 (1992)
4. van den Berg, M., Lianantonakis, M.: Asymptotics for the spectrum of the Dirichlet Laplacian on horn-shaped regions. Indiana Univ. Math. J. 50, no. 1, 299–333 (2001)
5. Bogomolny, E., Schmit, C.: Structure of wave functions of pseudointegrable billiards. Phys. Rev. Lett. 92, 244102 (2004)
6. Burq, N., Zworski, M.: Geometric control in the presence of a black box. J. Amer. Math. Soc. 17, no. 2, 443–471 (2004)
7. Burq, N., Zworski, M.: Bouncing ball modes and quantum chaos. To appear in SIAM Review
8. Burq, N., Zworski, M.: Eigenfunctions for partially rectangular billiards. http://arxiv.org/PS_cache/math/pdf/0312098.pdf, 2003
9. Colin de Verdière, Y.: Quasi-modes sur les variétés Riemanniennes. Invent. Math. 43, no. 1, 15–52 (1977)
10. Colin de Verdière, Y.: Ergodicité et fonctions propres du laplacien. Commun. Math. Phys. 102, 497–502 (1985)
11. Degli Esposti, M., Graffi, S., Isola, S.: Classical limit of the quantized hyperbolic toral automorphisms. Commun. Math. Phys. 167, no. 3, 471–507 (1995)
12. Degli Esposti, M., Del Magno, G., Lenci, M.: Escape orbits and ergodicity in infinite step billiards. Nonlinearity 13, no. 4, 1275–1292 (2000)
13. Donnelly, H.G.: Quantum unique ergodicity. Proc. Amer. Math. Soc. 131, no. 9, 2945–2951 (2003)
14. Faure, F., Nonnenmacher, S.: On the maximal scarring for quantum cat map eigenstates. Commun. Math. Phys. 245, no. 1, 201–214 (2004)
15. Faure, F., Nonnenmacher, S., De Bièvre, S.: Scarred eigenstates for quantum cat maps of minimal periods. Commun. Math. Phys. 239, no. 3, 449–492 (2003)
16. Graffi, S., Lenci, M.: Localization in infinite billiards: a comparison between quantum and classical ergodicity. J. Stat. Phys. 116, 821–830 (2004)
17. Heller, E.J., O’Connor, P.W.: Quantum localization for a strongly classically chaotic system. Phys. Rev. Lett. 61, no. 20, 2288–2291 (1988)
18. Hillairet, L.: Weyl’s remainder on translation surfaces. Prepub. 333, ENS-Lyon, 2005
19. Ivrii, V.: Microlocal analysis and precise spectral asymptotics. Springer Monographs in Mathematics. Berlin: Springer-Verlag, 1998
20. Kurlberg, P., Rudnick, Z.: Hecke theory and equidistribution for the quantization of linear maps of the torus. Duke Math. J. 103, no. 1, 47–77 (2000)
21. Kurlberg, P., Rudnick, Z.: On quantum ergodicity for linear maps of the torus. Commun. Math. Phys. 222, no. 1, 201–227 (2001)
22. Lazutkin, V.F.: KAM theory and semiclassical approximations to eigenfunctions. With an addendum by A. I. Shnirelman. Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], 24. Berlin: Springer-Verlag, 1993
23. Lenci, M.: Semi-dispersing billiards with an infinite cusp. I. Commun. Math. Phys. 230, no. 1, 133–180 (2002)
24. Lenci, M.: Semidispersing billiards with an infinite cusp. II. Chaos 13, no. 1, 105–111 (2003)
25. Lindenstrauss, E.: Invariant measures and arithmetic quantum unique ergodicity. To appear in Annals of Math
26. Marklof, J., Rudnick, Z.: Quantum unique ergodicity for parabolic maps. Geom. Funct. Anal. 10, no. 6, 1554–1578 (2000)
27. Rudnick, Z., Sarnak, P.: The behaviour of eigenstates of arithmetic hyperbolic manifolds. Commun. Math. Phys. 161, no. 1, 195–213 (1994)
28. Sarnak, P.: Spectra of hyperbolic surfaces. Bull. Amer. Math. Soc. (N.S.) 40, no. 4, 441–478 (2003)
29. Schnirelman, A.I.: Ergodic properties of eigenfunctions. Uspehi Mat. Nauk 29, 181–182 (1974)
30. Simon, B.: Functional integration and quantum physics. In: Pure and Applied Mathematics, 86. New York-London: Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], 1979
31. Tanner, G.: How chaotic is the stadium billiard? A semiclassical analysis. J. Phys. A 30, no. 8, 2863–2888 (1997)
32. Zelditch, S.: Uniform distribution of eigenfunctions on compact hyperbolic surfaces. Duke Math. J. 55, 919–941 (1987)
33. Zelditch, S.: Note on quantum unique ergodicity. Proc. Amer. Math. Soc. 132, no. 6, 1869–1872 (2004)
34. Zelditch, S., Zworski, M.: Ergodicity of eigenfunctions for ergodic billiards. Commun. Math. Phys. 175, no. 3, 673–682 (1996)

Communicated by P. Sarnak
Commun. Math. Phys. 264, 317–334 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1546-3
Communications in
Mathematical Physics
On Computational Complexity of Siegel Julia Sets
I. Binder, M. Braverman, M. Yampolsky
Departments of Mathematics and Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
Received: 6 April 2005 / Accepted: 5 November 2005
Published online: 22 March 2006 – © Springer-Verlag 2006
Abstract: It has been previously shown by two of the authors that some polynomial Julia sets are algorithmically impossible to draw with arbitrary magnification. On the other hand, for a large class of examples the problem of drawing a picture has polynomial complexity. In this paper we demonstrate the existence of computable quadratic Julia sets whose computational complexity is arbitrarily high.

The first and third authors are partially supported by NSERC Discovery grants. The second author is partially supported by NSERC Postgraduate Scholarship.

1. Foreword

Let us informally say that a compact set in the plane is computable if one can program a computer to draw a picture of this set on the screen, with an arbitrary desired magnification. It was recently shown by the second and third authors, that some Julia sets are not computable [BY]. This in itself is quite surprising to dynamicists – Julia sets are among the “most drawn” objects in contemporary mathematics, and numerous algorithms exist to produce their pictures. In the cases when one has not been able to produce informative pictures (the dynamically pathological cases, like maps with a Cremer or a highly Liouville Siegel point) the feeling had been that this was due to the immense computational resources required by the known algorithms.

The next surprise came with the discovery by the authors of this paper in [BBY] that all Cremer quadratics (or more generally, rational maps without rotation domains) have computable Julia sets. The non-computable examples constructed in [BY] were Siegel quadratic polynomials, and one would expect the Cremer case to be at least as bad if not worse computationally.

The natural question to ask is then whether in those cases in which we know the Julia set is computable, but no good pictures exist, the computational complexity of such a set is indeed high. Here at least, our original intuition seems to be correct: it is shown in the present paper that there exist computable Siegel quadratic Julia sets with arbitrarily
high computational complexity. An irritating possibility still remains that some Cremer Julia sets are computationally easy (and we just do not go about trying to draw them in the right way). This, however, seems unlikely.

We note that the examples constructed in this paper are the first known cases of Julia sets which are not poly-time computable. The second author [Brv1] and independently Rettinger [Ret] have previously shown that hyperbolic Julia sets are poly-time computable. More recently the second author has shown [Brv2] that some Julia sets with parabolics are poly-time computable as well. The last result was yet another surprise, as the time complexity of all previously known algorithms for these Julia sets was exponential.

The structure of the paper is as follows. In §2.2 of the Introduction, having stated the principal definitions, we formulate the main result of the paper. In §2.4 we give a sketch of the argument. In §4 we prove several technical lemmas. The final §5 contains the proof of the Main Theorem.

2. Introduction

2.1. Computability of real sets. The reader is directed to [BY] for a more detailed discussion of the notion of computability of subsets of $\mathbb{R}^n$ as applied, in particular, to Julia sets. We recall the principal definitions here. The exposition below uses the concept of a Turing Machine. This is a standard model for a computer program employed by computer scientists. Readers unfamiliar with this concept should think instead of an algorithm written in their favorite programming language. These concepts are known to be equivalent.

Denote by $\mathbb{D}$ the set of the dyadic rationals, that is, rationals of the form $p/2^m$. We say that $\phi : \mathbb{N} \to \mathbb{D}$ is an oracle for a real number $x$, if $|x - \phi(n)| < 2^{-n}$ for all $n \in \mathbb{N}$. In other words, $\phi$ provides a good dyadic approximation for $x$. We say that a Turing Machine (further abbreviated as TM) $M^{\phi}$ is an oracle machine, if at every step of the computation $M$ is allowed to query the value $\phi(n)$ for any $n$.
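The oracle notion can be sketched in code. The block below is a minimal illustration, not the paper's formalism: `make_oracle` builds a dyadic oracle for a rational number (a real oracle would serve an arbitrary real $x$), and `square_machine` plays the role of an oracle machine for the assumed example function $f(x) = x^2$ on $[0, 1]$, querying the oracle once at a precision chosen from the requested output accuracy. Both names are hypothetical.

```python
from fractions import Fraction
from typing import Callable

Oracle = Callable[[int], Fraction]


def make_oracle(x: Fraction) -> Oracle:
    """Return phi with |x - phi(n)| < 2^-n, each phi(n) a dyadic rational."""
    def phi(n: int) -> Fraction:
        scale = 2 ** (n + 1)
        # Nearest multiple of 2^-(n+1): the error is at most 2^-(n+2) < 2^-n.
        return Fraction(round(x * scale), scale)
    return phi


def square_machine(phi: Oracle, m: int) -> Fraction:
    """Oracle machine for f(x) = x^2 on [0, 1]: returns dyadic y with |y - x^2| < 2^-m."""
    d = phi(m + 2)  # |d - x| < 2^-(m+2)
    # |d^2 - x^2| = |d - x| * |d + x| < 2^-(m+2) * 3 < 2^-m for x in [0, 1].
    return d * d
```

The design point is that the machine never sees $x$ itself, only finitely many dyadic approximations, and it decides how much input precision it needs from the output precision $m$.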
This definition allows us to define the computability of real functions on compact sets.

Definition 2.1. We say that a function $f : [a, b] \to [c, d]$ is computable, if there exists an oracle TM $M^{\phi}(m)$ such that if $\phi$ is an oracle for $x \in [a, b]$, then on input $m$, $M^{\phi}$ outputs a $y \in \mathbb{D}$ such that $|y - f(x)| < 2^{-m}$.

To understand this definition better, the reader without a Computer Science background should think of a computer program with an instruction READ real number x WITH PRECISION n(m). On the execution of this command, a dyadic rational $d$ is input from the keyboard. This number must not differ from $x$ by more than $2^{-n(m)}$ (but otherwise can be arbitrary). The algorithm then outputs $f(x)$ to precision $2^{-m}$.

It is worthwhile to note why the oracle mechanism is introduced. There are only countably many possible algorithms, and consequently only countably many computable real numbers which such algorithms can encode. Therefore, one wants to separate the hardness of encoding the real number $x$ from the hardness of computing the value of the function $f(x)$, having the access to the value of $x$.

Let $K \subset \mathbb{R}^k$ be a compact set. We say that a TM $M$ computes the set $K$ if it approximates $K$ in the Hausdorff metric. Recall that the Hausdorff metric is a metric on compact subsets of $\mathbb{R}^n$ defined by
\[ d_H(X, Y) = \inf\{\epsilon > 0 \mid X \subset U_{\epsilon}(Y) \text{ and } Y \subset U_{\epsilon}(X)\}, \tag{2.1} \]
where $U_{\epsilon}(S)$ is defined as the union of the set of $\epsilon$-balls with centers in $S$. We introduce a class $\mathcal{C}$ of sets which is dense in metric $d_H$ among the compact sets and which has a natural correspondence to binary strings. Namely $\mathcal{C}$ is the set of finite unions of dyadic balls:
\[ \mathcal{C} = \left\{ \bigcup_{i=1}^{n} B(d_i, r_i) \ \Big|\ \text{where } d_i \in \mathbb{D}^2,\ r_i \in \mathbb{D} \right\}. \]
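For finite point sets the Hausdorff metric reduces to a maximum of nearest-neighbour distances, which gives a concrete feel for the definition. The sketch below is an illustration for finite subsets of the plane only (not for general compact sets or for unions of balls); the function names are assumptions.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def directed_distance(xs: List[Point], ys: List[Point]) -> float:
    """Largest distance from a point of xs to its nearest point of ys."""
    return max(min(math.dist(x, y) for y in ys) for x in xs)


def hausdorff_distance(xs: Iterable[Point], ys: Iterable[Point]) -> float:
    """Hausdorff distance between two non-empty finite point sets.

    X lies in the eps-neighbourhood of Y for every eps exceeding the directed
    distance from X to Y (and symmetrically), so the infimum in the definition
    is the larger of the two directed distances.
    """
    xs, ys = list(xs), list(ys)
    return max(directed_distance(xs, ys), directed_distance(ys, xs))
```

For example, `hausdorff_distance([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0)])` evaluates to `1.0`: the second set is contained in the first, but the point `(1.0, 0.0)` is at distance 1 from it.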
Members of $\mathcal{C}$ can be encoded as binary strings in a natural way. We now define the notion of computability of subsets of $\mathbb{R}^n$ (see [Wei], and also [RW]).

Definition 2.2. We say that a compact set $K \subset \mathbb{R}^k$ is computable, if there exists a TM $M(d, n)$, where $d \in \mathbb{D}$, $n \in \mathbb{N}$ which outputs a value 1 if $\mathrm{dist}(d, K) < 2^{-n}$, the value 0 if $\mathrm{dist}(d, K) > 2 \cdot 2^{-n}$, and in the “in-between” case it halts and outputs either 0 or 1. In other words, it computes, in the classical sense, a function from the family $\mathcal{F}_K$ of functions of the form
\[ f(d, n) = \begin{cases} 1, & \text{if } \mathrm{dist}(d, K) < 2^{-n}, \\ 0, & \text{if } \mathrm{dist}(d, K) > 2 \cdot 2^{-n}, \\ 0 \text{ or } 1, & \text{otherwise.} \end{cases} \tag{2.2} \]

Theorem 2.1. For a compact $K \subset \mathbb{R}^k$ the following are equivalent:
(1) $K$ is computable as per Definition 2.2;
(2) there exists a TM $M(m)$, such that on input $m$, $M(m)$ outputs an encoding of $C_m \in \mathcal{C}$ such that $d_H(K, C_m) < 2^{-m}$ (global computability);
(3) the distance function $d_K(x) = \inf\{|x - y| \mid y \in K\}$ is computable as per Definition 2.1.

Note that in the case $k = 2$ computability means that $K$ can be drawn on a computer screen with arbitrarily good precision (if we imagine the screen as a lattice of pixels).

In the present paper we are interested in questions concerning the computability of the Julia set $J_c = J(f_c) = J(z^2 + c)$. Since there are uncountably many possible parameter values for $c$, we cannot expect for each $c$ to have a machine $M$ such that $M$ computes $J_c$ (recall that there are countably many TMs). On the other hand, it is reasonable to want $M$ to compute $J_c$ with an oracle access to $c$. Define the function $J : \mathbb{C} \to K^*$ ($K^*$ is the set of all compact subsets of $\mathbb{C}$) by $J(c) = J(f_c)$. In a complete analogy to Definition 2.1 we can define

Definition 2.3. We say that a function $\kappa : S \to K^*$ for some bounded set $S$ is computable, if there exists an oracle TM $M^{\phi}(d, n)$, where $\phi$ is an oracle for $x \in S$, which computes a function (2.2) of the family $\mathcal{F}_{\kappa(x)}$.
Equivalently, there exists an oracle TM $M^{\phi}(m)$ with $\phi$ again representing $x \in S$ such that on input $m$, $M^{\phi}$ outputs a $C \in \mathcal{C}$ such that $d_H(C, \kappa(x)) < 2^{-m}$. In the case of Julia sets:

Definition 2.4. We say that $J_c$ is computable if the function $J : d \mapsto J_d$ is computable on the set $\{c\}$.

We have the following (see [Brv1]):
Theorem 2.2. Suppose that a TM $M^{\phi}$ computes the function $J$ on a set $S \subset \mathbb{C}$. Then $J$ is continuous on $S$ in Hausdorff sense.

Proof. Let $c$ be any point in $S$, and let $\varepsilon = 2^{-k}$ be given. Let $\phi$ be an oracle for $c$ such that $|\phi(n) - c| < 2^{-(n+1)}$ for all $n$. We run $M^{\phi}(k + 1)$ with this oracle $\phi$. By the definition of $J$, it outputs a set $L$ which is a $2^{-(k+1)}$ approximation of $J_c$ in the Hausdorff metric. The computation is performed in a finite amount of time. Hence there is an $m$ such that $\phi$ is only queried with parameters not exceeding $m$. Then for any $x$ such that $|x - c| < 2^{-(m+1)}$, $\phi$ is a valid oracle for $x$ up to parameter value of $m$. In particular, we can create an oracle $\psi$ for $x$ that agrees with $\phi$ on $1, 2, \ldots, m$. If $x \in S$, then the execution of $M^{\psi}(k + 1)$ will be identical to the execution of $M^{\phi}(k + 1)$, and it will output $L$ which has to be an approximation of $J_x$. Thus we have
\[ d_H(J_c, J_x) \le d_H(J_c, L) + d_H(J_x, L) < 2^{-(k+1)} + 2^{-(k+1)} = 2^{-k}. \]
This is true for any $x \in B(c, 2^{-(m+1)}) \cap S$. Hence $J$ is continuous on $S$.
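The key step of the proof, that a halting computation reads only finitely many oracle values and so cannot tell $c$ from nearby parameters, can be imitated in a few lines. This is a toy sketch under stated assumptions: `toy_machine` stands in for $M^{\phi}(k)$ and merely squares one oracle value, and all names are hypothetical.

```python
from fractions import Fraction


class RecordingOracle:
    """Wraps an oracle and remembers the largest precision index queried."""

    def __init__(self, phi):
        self.phi = phi
        self.max_query = 0

    def __call__(self, n):
        self.max_query = max(self.max_query, n)
        return self.phi(n)


def oracle_for(x):
    """Oracle with |phi(n) - x| < 2^-(n+1), as assumed in the proof."""
    def phi(n):
        scale = 2 ** (n + 2)
        return Fraction(round(x * scale), scale)
    return phi


def toy_machine(phi, k):
    """Stand-in for M^phi(k): any procedure that makes finitely many queries."""
    return phi(k + 2) ** 2


def patched_oracle(phi_c, phi_x, m):
    """Oracle psi that agrees with phi_c up to index m and with phi_x afterwards."""
    return lambda n: phi_c(n) if n <= m else phi_x(n)
```

Running `toy_machine` on a `RecordingOracle` for $c$ yields the bound $m$ on the queried indices; for any $x$ within $2^{-(m+1)}$ of $c$ the patched oracle is a valid oracle for $x$, and the machine produces the same output on it.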
The second and third authors have demonstrated in [BY]:

Theorem 2.3. There exists a parameter value $c \in \mathbb{C}$ such that the Julia set of the quadratic polynomial $f_c(z) = z^2 + c$ is not computable.

The quadratic polynomials in Theorem 2.3 possess Siegel disks (see §2.3 below for the definitions of Siegel and Cremer points). It was further shown by the authors of the present paper in [BBY] that the absence of rotation domains, that is either Siegel disks or Herman rings, guarantees computability of the rational Julia set. This implies, in particular, that all Cremer quadratic Julia sets are computable – this despite the fact that no informative high resolution images of such sets have ever been produced. One expects, however, that such “bad” but still computable examples have high algorithmic complexity, which makes the computational cost of producing such a picture prohibitively high. We note that the second author [Brv1] and independently Rettinger [Ret] have shown:

Theorem 2.4. Hyperbolic Julia sets are computable in polynomial time. That is, if $J$ is the Julia set of a hyperbolic rational mapping $R$, then there exists a TM $M(d, n)$ which computes a function of the family (2.2) in time polynomial in the bit size of $d$ and the value of $n$.

It is worth noting that the same oracle TM $M^{\phi}(d, n)$ with the oracle representing the parameters of the rational mapping $R$, can be selected for all hyperbolic Julia sets of the same degree. Moreover, the asymptotics of the polynomial time bound depends only on $R$ but not on the input $(d, n)$.

2.2. Statement of the Main Theorem. On the other end of the complexity spectrum we expect to find “bad” but computable Siegel Julia sets and Cremer Julia sets. Indeed, in the present paper we show:

Theorem 2.5. There exist quadratic Siegel Julia sets of arbitrarily high computational complexity. More precisely, for any computable increasing function $h : \mathbb{N} \to \mathbb{N}$ there exists a computable Siegel parameter value $c \in \mathbb{C}$ such that:
• the Julia set $J_c$ is computable by an oracle TM;
• for any oracle TM $M^{\phi}(m)$ which computes the $2^{-m}$-approximations to $J_c$, there exists a sequence $\{m_i\}_{i=1}^{\infty}$ such that $M^{\phi}$ requires the time of at least $h(m_i)$ to compute the approximation $C_{m_i} \in \mathcal{C}$.

From this statement for global computational complexity immediately follows the corresponding local statement:

Corollary 2.6. There exist computable parameter values $c$ for which the Julia set $J_c$ is computable, and the complexity of the problem of computing a function (2.2) in the family $\mathcal{F}_{J_c}$ is arbitrarily high.

2.3. Siegel disks of quadratic maps. Let $R : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ be a rational map of the Riemann sphere. For a periodic point $z_0 = R^p(z_0)$ of period $p$ its multiplier is the quantity $\lambda = \lambda(z_0) = DR^p(z_0)$. We may speak of the multiplier of a periodic cycle, as it is the same for all points in the cycle by the Chain Rule. In the case when $|\lambda| \ne 1$, the dynamics in a sufficiently small neighborhood of the cycle is governed by the Mean Value Theorem: when $0 < |\lambda| < 1$, the cycle is attracting (super-attracting if $\lambda = 0$), if $|\lambda| > 1$ it is repelling. Both in the attracting and repelling cases, the dynamics can be locally linearized:
\[ \psi(R^p(z)) = \lambda \cdot \psi(z), \tag{2.3} \]
where $\psi$ is a conformal mapping of a small neighborhood of $z_0$ to a disk around 0. By a classical result of Fatou, a rational mapping has at most finitely many non-repelling periodic orbits. In the case when $\lambda = e^{2\pi i\theta}$, $\theta \in \mathbb{R}$, the simplest to study is the parabolic case when $\theta = n/m \in \mathbb{Q}$, so $\lambda$ is a root of unity. In this case $R^p$ is not locally linearizable; it is not hard to see that $z_0 \in J(R)$. In the complementary situation, two non-vacuous possibilities are considered: Cremer case, when $R^p$ is not linearizable, and Siegel case, when it is. In the latter case, the linearizing map $\psi$ from (2.3) conjugates the dynamics of $R^p$ on a neighborhood $U(z_0)$ to the irrational rotation by angle $\theta$ (the rotation angle) on a disk around the origin. The maximal such neighborhood of $z_0$ is called a Siegel disk.

Let us discuss in more detail the occurrence of Siegel disks in the quadratic family. For a number $\theta \in [0, 1)$ denote $[r_0, r_1, \ldots, r_n, \ldots]$, $r_i \in \mathbb{N} \cup \{\infty\}$ its possibly finite continued fraction expansion:
\[ [r_0, r_1, \ldots, r_n, \ldots] \equiv \cfrac{1}{r_0 + \cfrac{1}{r_1 + \cfrac{1}{\cdots + \cfrac{1}{r_n + \cdots}}}} \tag{2.4} \]
Such an expansion is defined uniquely if and only if $\theta \notin \mathbb{Q}$. In this case, the rational convergents $p_n/q_n = [r_0, \ldots, r_{n-1}]$ are the closest rational approximants of $\theta$ among the numbers with denominators not exceeding $q_n$. In fact, setting $\lambda = e^{2\pi i\theta}$, we have $|\lambda^h - 1| > |\lambda^{q_n} - 1|$ for all $0 < h < q_{n+1}$, $h \ne q_n$. The difference $|\lambda^{q_n} - 1|$ lies between $2/q_{n+1}$ and $2\pi/q_{n+1}$, therefore the rate of growth of the denominators $q_n$ describes how well $\theta$ may be approximated with rationals. We recall a theorem due to Brjuno (1972):
Theorem 2.7 ([Bru]). Let $R$ be an analytic map with a periodic point $z_0 \in \hat{\mathbb{C}}$. Suppose that the multiplier of $z_0$ is $\lambda = e^{2\pi i\theta}$, and
\[ B(\theta) = \sum_n \frac{\log(q_{n+1})}{q_n} < \infty. \tag{2.5} \]
Then $z_0$ is a Siegel point.

Note that a quadratic polynomial with a fixed Siegel disk with rotation angle $\theta$ after an affine change of coordinates can be written as
\[ P_{\theta}(z) = z^2 + e^{2\pi i\theta} z. \tag{2.6} \]
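The denominators $q_n$ and partial sums of the Brjuno series (2.5) are easy to compute from the digits. The sketch below is illustrative only: it assumes the convention $p_n/q_n = [r_0, \ldots, r_{n-1}]$ with $q_0 = 1$, starts the sum at $n = 0$ (the source does not state the starting index), and uses hypothetical function names.

```python
import math
from typing import List, Sequence


def convergent_denominators(digits: Sequence[int]) -> List[int]:
    """Denominators q_0, ..., q_N of the convergents of 1/(r_0 + 1/(r_1 + ...)),
    where digits = [r_0, ..., r_{N-1}] and q_n belongs to [r_0, ..., r_{n-1}]."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    qs = [q]
    for r in digits:
        q_prev, q = q, r * q + q_prev  # standard recurrence q_{n+1} = r_n q_n + q_{n-1}
        qs.append(q)
    return qs


def brjuno_partial_sum(digits: Sequence[int]) -> float:
    """Partial sum of log(q_{n+1}) / q_n over the available convergents."""
    qs = convergent_denominators(digits)
    return sum(math.log(qs[n + 1]) / qs[n] for n in range(len(qs) - 1))
```

For the golden mean digits `[1, 1, 1, ...]` the denominators are Fibonacci numbers and the partial sums settle quickly. Replacing a single digit by a large value $N$ adds a term of size roughly $\log(N)/q_n$, which is the kind of perturbation used in the outline of the construction in §2.4.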
In 1987 Yoccoz [Yoc] proved the following converse to Brjuno’s Theorem:

Theorem 2.8 ([Yoc]). Suppose that for $\theta \in [0, 1)$ the polynomial $P_{\theta}$ has a Siegel point at the origin. Then $B(\theta) < \infty$.

The numbers satisfying (2.5) are called Brjuno numbers; the set of all Brjuno numbers will be denoted $\mathcal{B}$. It is a full measure set which contains all Diophantine rotation numbers. In particular, the rotation numbers $[r_0, r_1, \ldots]$ of bounded type, that is with $\sup r_i < \infty$ are in $\mathcal{B}$. The sum of the series (2.5) is called the Brjuno function. For us a different characterization of $\mathcal{B}$ will be more useful. Inductively define $\theta_1 = \theta$ and $\theta_{n+1} = \{1/\theta_n\}$. In this way, $\theta_n = [r_{n-1}, r_n, r_{n+1}, \ldots]$. We define the Yoccoz’s Brjuno function as
\[ \Phi(\theta) = \sum_{n=1}^{\infty} \theta_1\theta_2 \cdots \theta_{n-1} \log\frac{1}{\theta_n}. \]
One can verify that $B(\theta) < \infty \Leftrightarrow \Phi(\theta) < \infty$. The value of the function $\Phi$ is related to the size of the Siegel disk in the following way.

Definition 2.5. Let $(U, u)$ be a simply-connected subdomain of $\mathbb{C}$ with a marked interior point. Consider the unique conformal isomorphism $\phi : \mathbb{D} \to U$ with $\phi(0) = u$, and $\phi'(0) > 0$. The conformal radius of $(U, u)$ is the value of the derivative $r(U, u) = \phi'(0)$.

Let $P_{\theta}$ be a quadratic polynomial with a Siegel disk $\Delta_{\theta} \ni 0$. The conformal radius of the Siegel disk $\Delta_{\theta}$ is $r(\theta) = r(\Delta_{\theta}, 0)$. For all other $\theta \in [0, \infty)$ we set $r(\theta) = 0$, and $\Delta_{\theta} = \{0\}$.

By the Koebe 1/4 Theorem of classical complex analysis (see e.g. [Ahl]), the radius of the largest Euclidean disk around $u$ which can be inscribed in $U$ is at least $r(U, u)/4$. We note that one has the following direct consequence of the Carathéodory Kernel Theorem (see e.g. [Pom]):

Proposition 2.9. The conformal radius of a quadratic Siegel disk varies continuously with respect to the Hausdorff distance on Julia sets.
Yoccoz [Yoc] has shown that the sum $\Phi(\theta) + \log r(\theta)$ is bounded below independently of $\theta \in \mathcal{B}$. Recently, Buff and Chéritat have greatly improved this result by showing that:

Theorem 2.10 ([BC]). The function $\theta \mapsto \Phi(\theta) + \log r(\theta)$ extends to $\mathbb{R}$ as a 1-periodic continuous function.

In [BBY] we obtain the following result on computability of quadratic Siegel disks:

Theorem 2.11. The following statements are equivalent:
(I) the Julia set $J(P_{\theta})$ is computable;
(II) the conformal radius $r(\theta)$ is computable;
(III) the inner radius $\inf_{z \in \partial\Delta_{\theta}} |z|$ is computable.

We note that when $\theta$ is not a Brjuno number, the quantities in (II) and (III) are each equal to zero, and the claim is simply that $J(P_{\theta})$ is computable in this case. We will make use of the following lemma which bounds the variation of the conformal radius under a perturbation of the domain. It is a direct consequence of the Koebe Theorem (see e.g. [RZ] for a proof).

Lemma 2.12. Let $U$ be a simply-connected subdomain of $\mathbb{C}$ containing the point 0 in the interior. Let $V \subset U$ be a subdomain of $U$. Assume that $\partial V \subset B_{\epsilon}(\partial U)$. Then
\[ 0 < r(U, 0) - r(V, 0) \le 4\sqrt{r(U, 0)\,\epsilon}. \]

2.4. Outline of the construction. We can now describe the idea of our construction. This outline is rather sketchy and suffers from obvious logical deficiencies, however, it presents the construction in a simple to understand form. Consider the oracle Turing machines $M^{\phi}$ with $\phi$ representing the parameter $\theta$ in $P_{\theta}$. Since there are only countably many Turing machines, we may order these machines in a sequence $M_1^{\phi}, M_2^{\phi}, \ldots$. We denote $S_i$ the domain on which $M_i^{\phi}$ computes $J(P_{\theta})$ properly. We thus have that for each $i$, the function $J : \theta \mapsto J(P_{\theta})$ is continuous on $S_i$.

Let us start with a machine $M_{n_1}^{\phi}$ which computes $J(P_{\theta_*})$ for $\theta_* = [1, 1, 1, \ldots]$. If any of the digits $r_i$ in this infinite continued fraction is changed to a sufficiently large $N \in \mathbb{N}$, the conformal radius of the Siegel disk will become small. For $N \to \infty$ the Siegel disk will implode and its center will become a parabolic fixed point in the Julia set (see [Do2]). If we are careful, we may select $i_1 > 1$ and $N_1 \gg 1$ in such a way that for $\theta_1$ given by the continued fraction where all digits are ones except $r_{i_1} = N_1$ we have
\[ r(\theta_*)(1 - 1/4) < r(\theta_1) < r(\theta_*)(1 - 1/8). \tag{2.7} \]
By the Koebe 1/4-Theorem, there exists $\ell_1 > 0$ such that the distance between the two Julia sets $d_H(J(P_{\theta_*}), J(P_{\theta_1})) > 2^{-\ell_1}$.
To ensure that the machine Mn1 will not be able to produce an accurate 2− 1 -approximation of J (Pθ1 ) faster than in the time h( 1 ) we simply select i1 > h( 1 ). This guarantees that the TM will have to read at least h( 1 ) digits of the oracle φ to distinguish the two Julia sets, which takes the time h( 1 ). φ To “fool” the machine Mn2 we then change a digit ri2 for i2 > i1 sufficiently far in the continued fraction of θ1 to a large N2 . In this way, we will obtain a Brjuno number θ2 for which φ
r(θ∗ )(1 − 1/4 − 1/8) < r(θ2 ) < r(θ∗ )(1 − 1/4).
(2.8)
Again, there exists 2 such that for any such Brjuno number, we have dH (J (Pθ1 ), J (Pθ2 )) > 2− 2 , and we choose i2 > h( 2 ). Continuing inductively, we arrive at the desired limiting Brjuno number θ∞ . To convince the reader that this construction is not artificial, and not due to the peculiarities of the selected computation model let us recast it somewhat informally as follows. It is possible by an arbitrarily small perturbation of the parameter θ to cause a detectable disturbance in the picture of J (Pθ ). To distinguish the picture of the new Julia set from the old one, in practice one needs to draw it with arbitrary precision arithmetic. That is, not only the input of the parameter (reading the oracle) will take a long time due to the number of significant digits, but also all the arithmetic manipulations with this parameter. Of course, the former consideration is already sufficient to prove the theorem. 3. Computing Noble Siegel Disks The primary goal of the present paper is to show that there are computationally hard yet computable Julia sets with Siegel disks. To establish this computability we need a computability result for noble Siegel disks. The term “noble” is applied in the literature to rotation numbers of the form [a0 , a1 , . . . , ak , 1, 1, 1, . . .]. The noblest of all is the golden mean γ∗ = [1, 1, 1, . . .]. Lemma 3.1. There is a Turing Machine M, which given a finite sequence of numbers [a0 , a1 , . . . , ak ] computes the conformal radius rγ for the noble number γ = [a0 , . . . , ak , 1, . . .]. The idea is to approximate the boundary of γ with the iterates of the critical point cγ = −e2π iγ /2. It is known that in this case the critical point itself is contained in the boundary. The renormalization theory for golden-mean Siegel disks (constructed in [McM]) implies that the boundary γ∗ is self-similar up to an exponentially small error. In particular, there exist constants C > 0 and λ > 1 such that dH ({Pγi∗ (cγ∗ ), i = 0, . . . 
, qn }, ∂γ∗ ) < Cλ−n . Below we derive a similar estimate for all noble Siegel disks with constructive constants C and λ. For this, we do not need to invoke the whole power of renormalization theory. Rather, we will use a theorem of Douady, Ghys, Herman, and Shishikura [Do1] which specifically applies to quadratic noble Siegel disks.
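The approximation idea above can be illustrated numerically. The following Python sketch (not the constructive argument of the proof) iterates the critical point $c_\gamma = -e^{2\pi i\gamma}/2$ under $P_\gamma(z) = z^2 + e^{2\pi i\gamma}z$ and collects the first $q_n$ iterates. The function names are hypothetical, the golden mean is assumed to be $(\sqrt{5}-1)/2$, $q_n$ is assumed to be the $n$-th continued fraction denominator, and floating-point arithmetic gives no certified error bound.

```python
import cmath
import math

# Assumption: the golden mean is gamma_* = [1, 1, 1, ...] = (sqrt(5) - 1) / 2.
GOLDEN_MEAN = (math.sqrt(5.0) - 1.0) / 2.0


def convergent_denominators(partial_quotients):
    """Return [q_0, q_1, ...] using q_k = a_k * q_{k-1} + q_{k-2}, q_{-1} = 0, q_0 = 1."""
    q_prev, q_curr = 0, 1
    denominators = [q_curr]
    for a in partial_quotients:
        q_prev, q_curr = q_curr, a * q_curr + q_prev
        denominators.append(q_curr)
    return denominators


def critical_orbit(gamma, count):
    """Return [c, P(c), ..., P^count(c)] for P(z) = z^2 + exp(2*pi*i*gamma) * z."""
    multiplier = cmath.exp(2j * math.pi * gamma)
    z = -multiplier / 2.0  # critical point: P'(z) = 2z + multiplier = 0
    orbit = [z]
    for _ in range(count):
        z = z * z + multiplier * z
        orbit.append(z)
    return orbit


if __name__ == "__main__":
    q = convergent_denominators([1] * 10)
    points = critical_orbit(GOLDEN_MEAN, q[-1])
    print(len(points), max(abs(z) for z in points))
```

Plotting the returned points for growing $q_n$ gives a rough picture of the Siegel disk boundary; the constructive constants $C$ and $\lambda$ still have to come from the estimates derived in the text.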
Complexity of Julia Sets
Noble (or, more generally, bounded type) Siegel quadratic Julia sets may be constructed by means of quasiconformal surgery on a Blaschke product,
\[ f_\gamma(z) = e^{2\pi i \tau(\gamma)} z^2 \, \frac{z-3}{1-3z}. \]
This map homeomorphically maps the unit circle $\mathbb{T}$ onto itself with a single (cubic) critical point at 1. The angle $\tau(\gamma)$ can be uniquely selected in such a way that the rotation number of the restriction $\rho(f_\gamma|_{\mathbb{T}}) = \gamma$. For each $n$, the points
\[ \{1, f_\gamma(1), f_\gamma^2(1), \ldots, f_\gamma^{q_{n+1}-1}(1)\} \]
form the nth dynamical partition of the unit circle. We have (cf. Theorem 3.1 of [dFdM]) the following: Theorem 3.2 (Universal real a priori bound). There exists an explicit constant B > 1 independent of γ and n such that the following holds. Any two adjacent intervals I and J of the nth dynamical partition of fγ are B-commensurable: B −1 |I | ≤ |J | ≤ B|I |. Let us now consider the mapping which identifies the critical orbits of fγ and Pγ by : fγi (1) → Pγi (cγ ). We have the following (Theorem 3.10 of [YZ]): Theorem 3.3 (Douady, Ghys, Herman, Shishikura). The mapping extends to a K-quasiconformal homeomorphism of the plane C which maps the unit disk D onto the Siegel disk γ . The constant K depends on B and a0 , . . . , ak in a constructive fashion. Elementary combinatorics implies that each interval of the nth dynamical partition contains at least two intervals of the (n + 2)nd dynamical partition. This in conjunction with Theorem 3.2 implies that the size of an interval of the (n + 2)nd dynamical partition of fγ is at most τ n where B τ= . B +1 We now complete the proof of Lemma 3.1. Denote Wn the connected component containing 0 of the domain obtained by removing from the plane a closed disk of radius 2Kτ n around each point of n = {Pγi (cγ ), i = 0, . . . , qn+2 }. By Theorem 3.3, distH (n , ∂γ ) < Kτ n , and we have Wn ⊂ γ and distH (∂γ , ∂Wn ) ≤ n = 2Kτ n .
I. Binder, M. Braverman, M. Yampolsky
Any constructive algorithm for producing the Riemann mapping of a planar region (e.g. that of [BB]) can be used to estimate the conformal radius r(Wn , 0) with precision n . Denote this estimate rn . Elementary estimates imply that the Julia set J (Pγ ) ⊂ B(0, 2). By Schwarz Lemma this implies r(γ , 0) < 2. By Lemma 2.12 we have √ |r(γ , 0) − rn | ≤ |r(γ , 0) − r(Wn , 0)| + |r(Wn , 0) − rn | < 4 n + n −→ 0, n→∞
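and the proof is complete.

4. Making Small Changes to Φ and r

For a number $\gamma = [a_1, a_2, \ldots] \in \mathbb{R} \setminus \mathbb{Q}$ we denote
\[ \alpha_i(\gamma) = \cfrac{1}{a_i + \cfrac{1}{a_{i+1} + \cfrac{1}{a_{i+2} + \cdots}}}, \]
so that
\[ \Phi(\gamma) = \sum_{n \ge 1} \alpha_1(\gamma)\alpha_2(\gamma)\cdots\alpha_{n-1}(\gamma) \log\frac{1}{\alpha_n(\gamma)}. \]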
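For numbers whose continued fraction ends in an infinite run of 1's, the sum defining $\Phi$ can be evaluated directly. The Python sketch below does this in floating point, assuming the tail $[1, 1, 1, \ldots]$ has the value $(\sqrt{5}-1)/2$ and summing the resulting geometric tail in closed form. The names `alphas` and `brjuno_sum` are hypothetical, and the first list entry is treated as $a_1$.

```python
import math

# Assumption: [1, 1, 1, ...] = 1 / (1 + 1 / (1 + ...)) = (sqrt(5) - 1) / 2.
GOLDEN_TAIL = (math.sqrt(5.0) - 1.0) / 2.0


def alphas(head):
    """alpha_1, ..., alpha_k for gamma = [head[0], ..., head[k-1], 1, 1, 1, ...]."""
    values = []
    tail = GOLDEN_TAIL
    for a in reversed(head):
        tail = 1.0 / (a + tail)  # alpha_i = 1 / (a_i + alpha_{i+1})
        values.append(tail)
    values.reverse()
    return values


def brjuno_sum(head):
    """Phi(gamma) = sum_{n>=1} alpha_1 ... alpha_{n-1} * log(1 / alpha_n)."""
    total = 0.0
    product = 1.0
    for alpha in alphas(head):
        total += product * math.log(1.0 / alpha)
        product *= alpha
    # Every remaining alpha equals GOLDEN_TAIL, so the rest is a geometric series.
    g = GOLDEN_TAIL
    total += product * math.log(1.0 / g) / (1.0 - g)
    return total


if __name__ == "__main__":
    print(brjuno_sum([]))             # golden mean
    print(brjuno_sum([1, 1, 50, 1]))  # one large partial quotient raises the sum
```

Replacing a single 1 by a large integer raises the value of the sum, which is the effect quantified in the lemmas that follow.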
We will show the following two lemmas.

Lemma 4.1. For any initial segment $I = (a_0, a_1, \ldots, a_n)$, write $\omega = [a_0, a_1, \ldots, a_n, 1, 1, 1, \ldots]$. Then for any $\varepsilon > 0$, there is an $m > 0$ and an integer $N$ such that if we write $\beta = [a_0, a_1, \ldots, a_n, 1, 1, \ldots, 1, N, 1, 1, \ldots]$, where the $N$ is located in the $(n+m)$th position, then
\[ \Phi(\omega) + \varepsilon < \Phi(\beta) < \Phi(\omega) + 2\varepsilon. \]

Lemma 4.2. For $\omega$ as above, for any $\varepsilon > 0$ there is an $m_0 > 0$, which can be computed from $(a_0, a_1, \ldots, a_n)$ and $\varepsilon$, such that for any $m \ge m_0$, and for any tail $I = [a_{n+m}, a_{n+m+1}, \ldots]$, if we denote $\beta^I = [a_1, a_2, \ldots, a_n, 1, 1, \ldots, 1, a_{n+m}, a_{n+m+1}, \ldots]$, then
\[ \Phi(\beta^I) > \Phi(\omega) - \varepsilon. \]

We first prove Lemma 4.1. Denote
\[ \Phi^-(\omega) = \Phi(\omega) - \alpha_0(\omega)\alpha_1(\omega)\cdots\alpha_{n+m-1}(\omega) \log\frac{1}{\alpha_{m+n}(\omega)}. \]
The value of the integer $m > 0$ is yet to be determined. Denote $\beta^N = (a_0, a_1, \ldots, a_n, 1, 1, \ldots, 1, N, 1, 1, \ldots)$. We will need the following estimates, which are proven by induction.
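Lemma 4.3. For any $N$, the following holds:
1. For $i \le n+m$ we have
\[ \left| \log\frac{\alpha_i(\beta^N)}{\alpha_i(\beta^{N+1})} \right| < \frac{2^{i-(n+m)}}{N}; \]
2. for $i < n+m$,
\[ \left| \log\frac{\alpha_i(\beta^N)}{\alpha_i(\beta^1)} \right| < 2^{i-(n+m)}; \]
3. for $i < n+m$,
\[ \left| \log\frac{\log\frac{1}{\alpha_i(\beta^N)}}{\log\frac{1}{\alpha_i(\beta^{N+1})}} \right| < 2^{i-(n+m)+1}; \]
4. for $i < n+m-1$,
\[ \left| \log\frac{\log\frac{1}{\alpha_i(\beta^N)}}{\log\frac{1}{\alpha_i(\beta^1)}} \right| < 2^{i-(n+m)+1}. \]
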
The estimates yield the following.

Lemma 4.4. For any $\omega$ of the form as in Lemma 4.1 and for any $\varepsilon > 0$, there is an $m_0 > 0$ such that for any $N$ and any $m \ge m_0$,
\[ |\Phi^-(\beta^N) - \Phi^-(\beta^1)| < \frac{\varepsilon}{4}. \]
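Proof. The sum in the expression for $\Phi(\beta^1)$ converges, hence there is an $m_1 > 1$ such that the tail of the sum satisfies $\sum_{i \ge n+m_1} \alpha_1 \alpha_2 \cdots \alpha_{i-1} \log\frac{1}{\alpha_i} < \frac{\varepsilon}{16}$. We will show how to choose $m_0 \ge m_1$ to satisfy the conclusion of the lemma. We bound the influence of the change from $\beta^1$ to $\beta^N$ using Lemma 4.3, Parts 2 and 4. The influence on each of the "head elements" ($i < n+m_1 < n+m-1$) is bounded by
\[ \left| \log\frac{\alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)}}{\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)}} \right| < \sum_{j=1}^{i-1} 2^{j-(n+m)} + 2^{i-(n+m)+1} < 2^{i-(n+m)+2} < 2^{m_1+2-m}. \]
By making $m$ sufficiently large (i.e. by choosing a sufficiently large $m_0$) we can ensure that
\[ 1 - \frac{\varepsilon}{16\,\Phi(\beta^1)} < \frac{\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)}}{\alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)}} < 1 + \frac{\varepsilon}{16\,\Phi(\beta^1)}, \]
hence
\[ \left| \alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)} - \alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)} \right| < \frac{\varepsilon}{16\,\Phi(\beta^1)}\, \alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)}. \]
Adding the inequality for $i = 1, 2, \ldots, n+m_1-1$ we obtain
\[ \left| \sum_{i=1}^{n+m_1-1} \alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)} - \sum_{i=1}^{n+m_1-1} \alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)} \right| < \frac{\varepsilon}{16\,\Phi(\beta^1)} \sum_{i=1}^{n+m_1-1} \alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)} < \frac{\varepsilon}{16\,\Phi(\beta^1)}\,\Phi(\beta^1) = \frac{\varepsilon}{16}. \]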
Hence the influence on the "head" of $\Phi^-$ is bounded by $\frac{\varepsilon}{16}$. To bound the influence on the "tail" we consider three kinds of terms $\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)}$: $n+m_1 \le i \le n+m-2$, $i = m+n-1$ and $i \ge m+n+1$ (recall that $i = n+m$ is not in $\Phi^-$).

For $n+m_1 \le i \le n+m-2$:
\[ \left| \log\frac{\alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)}}{\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)}} \right| < \sum_{j=1}^{i-1} 2^{j-(n+m)} + 2^{i-(n+m)+1} < 2^{i-(n+m)+2} \le 1. \]
Hence in this case each term can increase at most by a factor of $e$.

For $i = n+m-1$: Note that the change decreases $\log\frac{1}{\alpha_{n+m-1}}$, so that $\log\frac{1}{\alpha_{n+m-1}(\beta^N)} \le \log\frac{1}{\alpha_{n+m-1}(\beta^1)}$, hence we have
\[ \log\frac{\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)}}{\alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)}} \le \log\frac{\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)}{\alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)} < \sum_{j=1}^{n+m-2} 2^{j-(n+m)} < \frac{1}{2}. \]
Hence this term could increase by a factor of $\sqrt{e}$ at most.

For $i \ge n+m+1$: Note that $\alpha_j$ for $j > n+m$ are not affected by the change, and the change decreases $\alpha_{n+m}$, so that $\alpha_{n+m}(\beta^N) \le \alpha_{n+m}(\beta^1)$. Hence
\[ \log\frac{\alpha_1(\beta^N)\cdots\alpha_{i-1}(\beta^N)\log\frac{1}{\alpha_i(\beta^N)}}{\alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)}} = \log\frac{\alpha_1(\beta^N)\cdots\alpha_{n+m}(\beta^N)}{\alpha_1(\beta^1)\cdots\alpha_{n+m}(\beta^1)} \le \log\frac{\alpha_1(\beta^N)\cdots\alpha_{n+m-1}(\beta^N)}{\alpha_1(\beta^1)\cdots\alpha_{n+m-1}(\beta^1)} < \sum_{j=1}^{n+m-1} 2^{j-(n+m)} < 1. \]
So in this case each term could increase by a factor of $e$ at most.

We see that after the change each term of the tail could increase by a factor of $e$ at most. The value of the tail remains positive in the interval $(0, \frac{e\varepsilon}{16}]$, hence the change in the tail is bounded by $\frac{e\varepsilon}{16} < \frac{3\varepsilon}{16}$. So the total change in $\Phi^-$ is bounded by
\[ \text{change in the "head"} + \text{change in the "tail"} < \frac{\varepsilon}{16} + \frac{3\varepsilon}{16} = \frac{\varepsilon}{4}. \]
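Lemma 4.4 immediately yields:

Lemma 4.5. For any $\varepsilon$ and for the same $m_0(\varepsilon)$ as in Lemma 4.4, for any $m \ge m_0$ and $N$,
\[ |\Phi^-(\beta^N) - \Phi^-(\beta^{N+1})| < \frac{\varepsilon}{2}. \]

Denote $\Phi^1(\omega) = \alpha_0(\omega)\alpha_1(\omega)\cdots\alpha_{n+m-1}(\omega)\log\frac{1}{\alpha_{m+n}(\omega)} = \Phi(\omega) - \Phi^-(\omega)$. We are now ready to prove the following.

Lemma 4.6. For sufficiently large $m$, for any $N$,
\[ \Phi^1(\beta^{N+1}) - \Phi^1(\beta^N) < \frac{\varepsilon}{2}. \]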
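Proof. According to Lemma 4.3, Part 1 we have
\[ \left| \log\frac{\alpha_1(\beta^{N+1})\cdots\alpha_{n+m-1}(\beta^{N+1})}{\alpha_1(\beta^N)\cdots\alpha_{n+m-1}(\beta^N)} \right| < \sum_{i=1}^{n+m-1} \frac{2^{i-(n+m)}}{N} < \frac{1}{N}. \]
Hence $\alpha_1(\beta^{N+1})\cdots\alpha_{n+m-1}(\beta^{N+1}) < \alpha_1(\beta^N)\cdots\alpha_{n+m-1}(\beta^N)\,e^{1/N}$, and
\[ \Phi^1(\beta^{N+1}) < \Phi^1(\beta^N)\,e^{1/N}\,\frac{\log\frac{1}{\alpha_{n+m}(\beta^{N+1})}}{\log\frac{1}{\alpha_{n+m}(\beta^N)}} = \Phi^1(\beta^N)\,e^{1/N}\,\frac{\log(N+1+1/\phi)}{\log(N+1/\phi)}. \]
Hence
\[ \Phi^1(\beta^{N+1}) - \Phi^1(\beta^N) < \Phi^1(\beta^N)\left( e^{1/N}\,\frac{\log(N+1+1/\phi)}{\log(N+1/\phi)} - 1 \right) < \Phi^1(\beta^N)\left( \Big(1 + \frac{e}{N}\Big)\frac{\log(N+1+1/\phi)}{\log(N+1/\phi)} - 1 \right). \]
We make the following calculations. Denote $x = \frac{\log(N+1+1/\phi)}{\log(N+1/\phi)}$, then $(N+1/\phi)^x = N+1+1/\phi$ and
\[ (N+1/\phi)^{x-1} = 1 + \frac{1}{N+1/\phi} < e^{\frac{1}{N+1/\phi}}. \]
We have $N+1/\phi > e^{1/3}$, and so $x - 1 < \frac{3}{N+1/\phi} < \frac{3}{N}$, thus $x < 1 + \frac{3}{N}$. It is not hard to see that $\alpha_{k-1}\alpha_k < 1/2$ for all $k > 1$, and we have
\[ \Phi^1(\beta^N) = \alpha_1(\beta^N)\cdots\alpha_{n+m-1}(\beta^N)\log\frac{1}{\alpha_{n+m}(\beta^N)} < \left(\frac{1}{2}\right)^{(n+m-1)/2} \log(N+1/\phi). \]
Thus
\[ \Phi^1(\beta^{N+1}) - \Phi^1(\beta^N) < \Phi^1(\beta^N)\left( \Big(1 + \frac{e}{N}\Big)\frac{\log(N+1+1/\phi)}{\log(N+1/\phi)} - 1 \right) < \left(\frac{1}{2}\right)^{(n+m-1)/2} \log(N+1/\phi)\,\big((1+e/N)(1+3/N) - 1\big) < \left(\frac{1}{2}\right)^{(n+m-1)/2} \log(N+1/\phi)\,\frac{14}{N}. \]
Since $\frac{14}{N} \in o(1/\log(N+1/\phi))$, this expression can always be made less than $\frac{\varepsilon}{2}$ by choosing $m$ large enough.
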
Since $\Phi = \Phi^- + \Phi^1$, summing the inequalities in Lemmas 4.5 and 4.6 yields the following.

Lemma 4.7. For sufficiently large $m$, for any $N$,
\[ \Phi(\beta^{N+1}) - \Phi(\beta^N) < \varepsilon. \]

It is immediate from the formula of $\Phi(\beta^N)$ that:

Lemma 4.8.
\[ \lim_{N\to\infty} \Phi(\beta^N) = \infty. \]
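We are now ready to prove Lemma 4.1.

Proof (of Lemma 4.1). Choose $m$ large enough for Lemma 4.7 to hold. Increase $N$ by one at a time starting with $N = 1$. We know that $\Phi(\beta^1) = \Phi(\omega) < \Phi(\omega) + \varepsilon$, and by Lemma 4.8, there exists an $M$ with $\Phi(\beta^M) > \Phi(\omega) + \varepsilon$. Let $N$ be the smallest such $M$. Then $\Phi(\beta^{N-1}) \le \Phi(\omega) + \varepsilon$, and by Lemma 4.7,
\[ \Phi(\beta^N) < \Phi(\beta^{N-1}) + \varepsilon \le \Phi(\omega) + 2\varepsilon. \]
Hence
\[ \Phi(\omega) + \varepsilon < \Phi(\beta^N) < \Phi(\omega) + 2\varepsilon. \]
Choosing $\beta = \beta^N$ completes the proof.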
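The search for $N$ in this proof is a plain linear scan, and can be sketched in Python for a fixed $m$. This is a floating-point illustration only: the helper names are hypothetical, the tail of 1's is assumed to equal $(\sqrt{5}-1)/2$, and $m$ is simply assumed to be large enough for the step bound of Lemma 4.7 rather than derived from it.

```python
import math

# Assumption: [1, 1, 1, ...] = (sqrt(5) - 1) / 2.
G = (math.sqrt(5.0) - 1.0) / 2.0


def phi_sum(head):
    """Phi for [head..., 1, 1, 1, ...]; the first entry of head is treated as a_1."""
    alphas = []
    tail = G
    for a in reversed(head):
        tail = 1.0 / (a + tail)
        alphas.append(tail)
    total, product = 0.0, 1.0
    for alpha in reversed(alphas):
        total += product * math.log(1.0 / alpha)
        product *= alpha
    # Geometric tail contributed by the infinite run of 1's.
    return total + product * math.log(1.0 / G) / (1.0 - G)


def smallest_N(head, m, eps, max_N=10**6):
    """Smallest N with Phi(beta^N) > Phi(omega) + eps, N placed m entries after head."""
    base = phi_sum(head)
    for N in range(1, max_N + 1):
        beta = list(head) + [1] * (m - 1) + [N]
        if phi_sum(beta) > base + eps:
            return N
    raise ValueError("no suitable N found below max_N")


if __name__ == "__main__":
    head, m, eps = [1], 12, 0.01
    N = smallest_N(head, m, eps)
    print(N, phi_sum(head), phi_sum(head + [1] * (m - 1) + [N]))
```

If $m$ is chosen too small, consecutive values of $N$ can change the sum by more than $\varepsilon$, and the value found may overshoot $\Phi(\omega) + 2\varepsilon$; this is the reason the proof first fixes $m$ through Lemma 4.7.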
The second part of the following lemma follows by the same argument as Lemma 4.4 by taking $N \ge 1$ to be an arbitrary real number, not necessarily an integer. The first part is obvious, since the tail of $\omega$ has only 1's.

Lemma 4.9. For an $\omega = \beta^1$ as above, for any $\varepsilon > 0$ we can compute an $m_0 > 0$, such that for any $m \ge m_0$, and for any tail $I = [a_{n+m}, a_{n+m+1}, \ldots]$, if we denote $\beta^I = [a_1, a_2, \ldots, a_n, 1, 1, \ldots, 1, a_{n+m}, a_{n+m+1}, \ldots]$, then
\[ \sum_{i \ge n+m} \alpha_1(\beta^1)\alpha_2(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)} < \varepsilon, \]
and
\[ \left| \sum_{i=1}^{n+m-1} \left( \alpha_1(\beta^I)\cdots\alpha_{i-1}(\beta^I)\log\frac{1}{\alpha_i(\beta^I)} - \alpha_1(\beta^1)\cdots\alpha_{i-1}(\beta^1)\log\frac{1}{\alpha_i(\beta^1)} \right) \right| < \varepsilon. \]
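We can now prove Lemma 4.2.

Proof (of Lemma 4.2). Applying Lemma 4.9 with $\frac{\varepsilon}{2}$ instead of $\varepsilon$, we get
\[ \Phi(\beta^I) - \Phi(\omega) = \{\text{"head"}(\beta^I) - \text{"head"}(\omega)\} + \{\text{"tail"}(\beta^I) - \text{"tail"}(\omega)\} > -\frac{\varepsilon}{2} - \{\text{"tail"}(\omega)\} > -\frac{\varepsilon}{2} - \frac{\varepsilon}{2} = -\varepsilon. \]
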
We will need a computable version of Lemma 4.1 for modifying the conformal radius of the corresponding Julia set.

Lemma 4.10. For any given initial segment $I = (a_0, a_1, \ldots, a_n)$ and $m_0 > 0$, write $\omega = [a_0, a_1, \ldots, a_n, 1, 1, 1, \ldots]$. Then for any $\varepsilon > 0$, we can uniformly compute $m > m_0$ and an integer $N$ such that if we write $\beta = [a_0, a_1, \ldots, a_n, 1, 1, \ldots, 1, N, 1, 1, \ldots]$, where the $N$ is located in the $(n+m)$th position, we have
\[ r(\omega) - 2\varepsilon < r(\beta) < r(\omega) - \varepsilon, \tag{4.1} \]
and
\[ \Phi(\beta) > \Phi(\omega). \tag{4.2} \]
Proof. We first show that such $m$ and $N$ exist, and then give an algorithm to compute them. By Lemma 4.1 we can increase $\Phi(\omega)$ by any controlled amount by modifying one term arbitrarily far in the expansion. By Theorem 2.10, $f : \theta \mapsto \Phi(\theta) + \log r(\theta)$ extends to a continuous function. Hence for any $\varepsilon_0$ there is a $\delta$ such that $|f(x) - f(y)| < \varepsilon_0$ whenever $|x - y| < \delta$. In particular, there is an $m_1$ such that $|f(\beta) - f(\omega)| < \varepsilon_0$ whenever $m \ge m_1$. This means that if we choose $m$ large enough, a controlled increase of $\Phi$ closely corresponds to a controlled drop of $r$ by a corresponding amount, hence there are $m > m_0$ and $N$ such that (4.1) holds. (4.2) is satisfied almost automatically.

The only problem is to computably find such $m$ and $N$. To this end, we apply Lemma 3.1. It implies that for any specific $m$ and $N$ we can compute $r(\beta)$. This means that we can find the suitable $m$ and $N$ by enumerating all the pairs $(m, N)$ and exhaustively checking (4.1) and (4.2) for all of them. We know that eventually we will find a pair for which (4.1) and (4.2) hold.
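The exhaustive search in this proof can be organised by enumerating the pairs $(m, N)$ along anti-diagonals, so that every pair is reached after finitely many steps. The Python sketch below shows only this enumeration. The predicate `accepts` is a hypothetical stand-in for a check of (4.1) and (4.2), which in the setting of the paper would itself rely on approximations of $r(\beta)$ obtained through Lemma 3.1; all names are illustrative.

```python
from itertools import count, islice


def enumerate_pairs(m_min):
    """Yield every pair (m, N) with m > m_min and N >= 1 exactly once, by anti-diagonals."""
    for diagonal in count(2):
        for k in range(1, diagonal):
            yield (m_min + k, diagonal - k)


def first_pairs(m_min, how_many):
    """Return the first `how_many` pairs of the enumeration, for inspection."""
    return list(islice(enumerate_pairs(m_min), how_many))


def first_accepted_pair(m_min, accepts):
    """Return the first enumerated pair for which accepts(m, N) is true.

    Terminates only if some pair is accepted, which is what the existence
    part of the proof guarantees for the real predicate.
    """
    for m, N in enumerate_pairs(m_min):
        if accepts(m, N):
            return (m, N)


if __name__ == "__main__":
    print(first_pairs(3, 6))
    # Toy predicate standing in for the check of (4.1) and (4.2).
    print(first_accepted_pair(3, lambda m, N: m * N == 35))
```

Because existence is established first, the loop is an unbounded search that is known to halt, which is all that the computability claim needs.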
5. Proving the Main Theorem There are countably many oracle Turing Machines. Let us enumerate them in some arbiφ φ trary computable fashion M1 , M2 , . . . so that every machine appears infinitely many times in the enumeration. Recall that r(θ ) is the conformal radius of the Siegel disk associated with the polynomial Pθ (z) = z2 + e2πiθ z, or zero, if θ is not a Brjuno number. We will argue by induction. On each iteration i of the argument we shall maintain an initial segment Ii = [a0 , a1 , . . . , aNi ] an interval Hi = [li , ri ], and i = (Hi ) = ri − li such that the following properties are maintained: ri = r(γi ), where γi = [Ii , 1, 1, . . .],
(5.1)
for any β = [Ii , tNi +1 , tNi +2 , . . .] with r(β) ∈ [li , ri ],
(5.2)
and
φ
the machine Mi requires at least the time h(2 − log i + 1) to compute the imation to J (Pβ ), and
2i 2 -approx-
for i ≥ 1, (β) > (γi−1 ) − 2−(i−1) , for any β = [Ii , tNi +1 , tNi +2 , . . .].
(5.3)
Moreover, the intervals we construct are nested: [li , ri ] ⊂ [li−1 , ri−1 ], and the sequence Ii contains Ii−1 as the initial segment. The numbers 2 − log i form a strictly increasing sequence. For the basis of induction, set I0 = [1], r0 = r(γ0 ) < 2 (by the Schwarz Lemma) and l0 = r0 /2, where γ0 = [1, 1, 1, . . .]. Then for i = 0 condition (5.1) holds by definition and conditions (5.2) and (5.3) hold because they are empty.
The induction step. We now have the conditions (5.1), (5.2) and (5.3) for some i and would like to extend them to i + 1. φ φ
i Consider the machine Mi+1 . Set i+1 = 20 . Simulate Mi+1 on γi for at most
2
h(2 − log i+1 + 1) steps to compute Jγi with precision i+1 2 . The machine reads at most h(2 − log i+1 + 1) bits of the input, and we can compute m0 such that this run does not distinguish between γi and γ = [Ii , 1, 1, . . . , 1, Nm0 +1 , Nm0 +2 , . . .]. There are two cases: φ
Case 1. Mi+1 does not terminate in the assigned time, or does not output a proper set. In this case, we proceed by setting Ii+1 = [Ii , 1, . . . , 1] (with 1’s up to position m0 ), γi+1 = γi , ri+1 = ri , and li+1 = ri+1 − i+1 . By Lemma 4.2, we can choose sufficiently many 1’s in Ii+1 , so that for any β beginning with Ii+1 , we have (β) > (γi ) − 2−i . φ
Case 2. Mi+1 outputs a set S. Compute the conformal radius r(S). Schwarz Lemma implies that for any quadratic Siegel disk, r() < 2. Using the above consideration to bound the constant in Lemma 2.12, we know that for any √ Julia set J (Pω ) which is
2i+1 -accurately described by S, we have |r(ω) − r(S)| < 4 2 i+1 < 6 i+1 . Again, there are two cases (if both hold, it doesn’t matter which way to proceed):
Subcase 2a. ri − i+1 > r(S) + 8 i+1 . In this case we proceed by setting Ii+1 = [Ii , 1, . . . , 1] (with 1’s up to position m0 ), γi+1 = γi , ri+1 = ri , and li+1 = ri+1 − i+1 . Subcase 2b. li + 2 i+1 < r(S) − 8 i+1 . By Lemma 4.10, we can select γi+1 = [Ii+1 , 1, 1, . . .] by modifying γi at an arbitrarily far position, and set ri+1 = r(γi+1 ) so that (γi+1 ) > (γi ), ri+1 ≥ li + i+1 and [ri+1 − i+1 , ri+1 ] ∩ [r(S) − 8 i+1 , r(S) + 8 i+1 ] = ∅. The number ri+1 is computable since it is the conformal radius of a noble Siegel disk. Set li+1 = ri+1 − i+1 . We see that the induction is maintained for these parameters. In either subcase, by Lemma 4.2, we can add sufficiently many 1’s to Ii+1 , so that for any β beginning with Ii+1 , we have (β) > (γi ) − 2−i , and condition (5.3) is satisfied. Lemma 5.1. Denote γ = limi→∞ γi . Then the following equalities hold:
\[ \Phi(\gamma) = \lim_{i\to\infty} \Phi(\gamma_i) \]
and
\[ r(\gamma) = \lim_{i\to\infty} r(\gamma_i). \]
Proof. By the construction, the limit $\gamma = \lim \gamma_i$ exists. We also know that the sequence $r(\gamma_i) = r_i$ converges uniformly to some number $r$, and that the sequence $\Phi(\gamma_i)$ is monotone non-decreasing, and hence converges to a value $\psi$ (a priori we could have $\psi = \infty$). By the Carathéodory Kernel Theorem (see e.g. [Pom]), we have $r(\gamma) \ge r > 0$, so $\psi < \infty$. On the other hand, by the property we have maintained throughout the construction, we know that $\Phi(\gamma) > \Phi(\gamma_i) - 2^{-i}$ for all $i$. Hence $\Phi(\gamma) \ge \psi$. From [BC] we know that
\[ \psi + \log r = \lim\big(\Phi(\gamma_i) + \log r(\gamma_i)\big) = \Phi(\gamma) + \log r(\gamma). \tag{5.4} \]
Hence we must have $\psi = \Phi(\gamma)$ and $r = r(\gamma)$, which completes the proof.

The conformal radius $r(\gamma)$ is computable, since the convergence $r(\gamma_i) \to r(\gamma)$ is uniform. Thus $J_{P_\gamma}$ is also computable by Theorem 2.11. By construction, it satisfies all of the required properties. Note that the value $\gamma$ itself is also computable.

Acknowledgement. We would like to thank Giovanni Gallavotti for very helpful suggestions on the exposition.
References

[Ahl] Ahlfors, L.: Complex Analysis. New York: McGraw-Hill, (1953)
[BBY] Binder, I., Braverman, M., Yampolsky, M.: Filled Julia sets with empty interior are computable. http://arxiv/org/list/math.DS/0410580, 2004
[BB] Bishop, E., Bridges, D.S.: Constructive Analysis. Springer-Verlag, Berlin (1985)
[Brv1] Braverman, M.: Computational Complexity of Euclidean Sets: Hyperbolic Julia Sets are Poly-Time Computable. Thesis, University of Toronto, 2004, and Proc. CCA 2004, in ENTCS, Vol. 120, 17–30 (2005)
[Brv2] Braverman, M.: Parabolic Julia Sets are Polynomial Time Computable. http://arxiv.org/list/math.DS/0505036, 2005
[BY] Braverman, M., Yampolsky, M.: Non-computable Julia sets. J. Amer. Math. Soc., to appear
[Bru] Brjuno, A.D.: Analytic forms of differential equations. Trans. Mosc. Math. Soc. 25, 131–288 (1971)
[BC] Buff, X., Chéritat, A.: The Yoccoz Function Continuously Estimates the Size of Siegel Disks. Annals of Math., to appear
[dFdM] de Faria, E., de Melo, W.: Rigidity of critical circle mappings I. J. Eur. Math. Soc. (JEMS) 4(1), 339–392 (1999)
[Do1] Douady, A.: Disques de Siegel et anneaux de Herman. Sem. Bourbaki, Astérisque, 152-153, 151–172 (1987)
[Do2] Douady, A.: Does a Julia set depend continuously on the polynomial? In: Complex dynamical systems: The mathematics behind the Mandelbrot set and Julia sets. ed. Devaney, R.L. Proc. of Symposia in Applied Math., Vol. 49, Providence RI: Amer. Math. Soc. 1994, pp. 91–138
[MMY] Marmi, S., Moussa, P., Yoccoz, J.-C.: The Brjuno functions and their regularity properties. Commun. Math. Phys. 186, 265–293 (1997)
[McM] McMullen, C.T.: Self-similarity of Siegel disks and Hausdorff dimension of Julia sets. Acta Math. 180(2), 247–292 (1998)
[Mil] Milnor, J.: Dynamics in one complex variable. Introductory lectures. Braunschweig: Friedr. Vieweg & Sohn, 1999
[Pom] Pommerenke, C.: Boundary behavior of conformal maps. Springer-Verlag, Berlin Heidelberg New York (1992)
[RW] Rettinger, R., Weihrauch, K.: The Computational Complexity of Some Julia Sets. In: STOC'03, June 9-11, 2003, San Diego, California, New York: ACM Press, 2004, pp. 177–185
[Ret] Rettinger, R.: A Fast Algorithm for Julia Sets of Hyperbolic Rational Functions. Proc. of CCA 2004, in ENTCS, Vol. 120, 145–157 (2005)
[RZ] Rohde, S., Zinsmeister, M.: Variation of the conformal radius. J. Anal. Math. 92, 105–115 (2004)
[Sie] Siegel, C.: Iteration of analytic functions. Ann. of Math. (2) 43, 607–612 (1942)
[Wei] Weihrauch, K.: Computable Analysis. Springer, Berlin (2000)
[YZ] Yampolsky, M., Zakeri, S.: Mating Siegel quadratic polynomials. J. Amer. Math. Soc. 14, 25–78 (2000)
[Yoc] Yoccoz, J.-C.: Petits diviseurs en dimension 1. S.M.F., Astérisque 231 (1995)

Communicated by G. Gallavotti
Commun. Math. Phys. 264, 335–347 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1526-7
Communications in
Mathematical Physics
Nonlinear Instability for the Navier-Stokes Equations

Susan Friedlander1, Nataša Pavlović2, Roman Shvydkoy1

1 Department of Mathematics, Statistics and Computer Science, University of Illinois, Chicago, IL (m/c 249) 60607, USA. E-mail: [email protected]; [email protected]
2 Department of Mathematics, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected]
Received: 7 April 2005 / Accepted: 26 September 2005 Published online: 1 March 2006 – © Springer-Verlag 2006
Abstract: It is proved, using a bootstrap argument, that linear instability implies nonlinear instability for the incompressible Navier-Stokes equations in $L^p$ for all $p \in (1, \infty)$ and any finite or infinite domain in any dimension $n$.

1. Introduction

The stability/instability of a flow of viscous incompressible fluid governed by the Navier-Stokes equations is a classical subject with a very extensive literature over more than 100 years. Much of the classical literature has concerned the stability of relatively simple specific flows (e.g. Couette flows and Poiseuille flows), the spectrum of the Navier-Stokes equations linearized about such flows and the role of the critical Reynolds number delineating the linearly stable and unstable regimes. An elegant result for general bounded flows was proved by Serrin [18], who used energy methods to show that all flows are nonlinearly stable in the $L^2$ norm when the Reynolds number is less than a specific constant ($\pi\sqrt{3}$). Hence all steady flows that are sufficiently slow or sufficiently viscous are stable. However, for many physical situations the Reynolds number is much larger than $\pi\sqrt{3}$, often by many orders of magnitude, and observations indicate that such flows are unstable.

Linear instability has been confirmed in some specific examples by demonstrating the existence of a nonempty unstable spectrum for the linearized Navier-Stokes operator. For example, Meshalkin and Sinai [14] used Fourier series and continued fractions to show the existence of unstable eigenvalues in the case of so-called Kolmogorov flows (i.e. plane parallel shear flow with a sinusoidal profile). Sattinger [16] uses a Galerkin argument to prove that linear instability implies nonlinear instability for weak solutions in $L^2$ in the case of bounded domains.

In a book published in Russian in 1984 (and in English in 1989) Yudovich [21] obtained an important result relating linear stability/instability for the Navier-Stokes equations with nonlinear stability/instability (see also Henry [8]). These results were proved in the function space $L^q(\Omega)$ with $q \ge n$ in $n$ spatial dimensions. A fairly general abstract theorem of Friedlander et al. [5] can be
applied to the Navier-Stokes equations in a finite domain to prove nonlinear instability in $H^s$, $s > \frac{n}{2} + 1$, when the linearized operator has an unstable eigenvalue in $L^2$. In the present paper we extend the result that linear instability implies nonlinear instability for the Navier-Stokes equations to all $L^p$ spaces with $1 < p < \infty$ and both finite domains and $\mathbb{R}^n$. We note that our result includes nonlinear instability in the $L^2$ energy norm, which we claim is the natural norm in which to consider issues of stability and instability.

The technique we employ to prove our main result is a bootstrap argument. Such arguments have been previously employed by several authors to prove under certain restrictions that linear instability implies nonlinear instability for the 2-dimensional Euler equation (Bardos et al. [1], Friedlander and Vishik [20], Lin [13]). Because in general the spectrum of the Euler operator has a continuous component, unlike the Navier-Stokes operator in a finite domain whose spectrum is purely discrete, these nonlinear instability results for the Euler equation are much more limited than those presented here for the Navier-Stokes equations.

2. Notation and Formulation

We consider solutions to the Navier-Stokes equations
\[ \frac{\partial q}{\partial t} = -(q \cdot \nabla) q - \nabla p + R^{-1} \Delta q + f, \tag{2.1a} \]
\[ \nabla \cdot q = 0, \tag{2.1b} \]
where $q(x, t)$ denotes the $n$-dimensional velocity vector, $p(x, t)$ denotes the pressure and $f(x)$ is an external force vector. The dimensionless parameter $R$ is the Reynolds number defined as $R = \frac{VL}{\nu}$, where $V$ and $L$ are characteristic velocity and length scales of the system and $\nu$ is the viscosity of the fluid. In Sect. 3 we consider the system on the $n$-dimensional torus $\mathbb{T}^n$ and in a bounded domain $\Omega \subset \mathbb{R}^n$. In Sect. 4 we consider the system in $\mathbb{R}^n$. The results are valid in all dimensions $n$ although the most relevant physical cases are $n = 2$ and 3. We impose the standard boundary conditions on solutions of (2.1) for each type of domain: the no-slip condition $q|_{\partial\Omega} = 0$, in the case of $\Omega$; vanishing velocity $q(x) \to 0$, as $x \to \infty$, in the case of $\mathbb{R}^n$; and periodic boundary conditions in the case of the torus.

The results in Sects. 3 and 4 prove that spectral instability for the linearized Navier-Stokes equations implies nonlinear instability in $L^p$ for $1 < p < \infty$. In Sect. 5 we prove a result relating spectral stability with nonlinear stability in $L^p$ for $p > n$. Here and thereafter, for any $p \in [1, \infty)$, $L^p$ denotes the usual Lebesgue space, with norm denoted $\|\cdot\|_p$, intersected with the space of divergence free functions. We let $W^{s,p}$ stand for the Sobolev space in the same context with norm denoted $\|\cdot\|_{s,p}$.

We consider an arbitrary steady solution of (2.1),
\[ 0 = -(U_0 \cdot \nabla) U_0 - \nabla P_0 + R^{-1} \Delta U_0 + f, \tag{2.2a} \]
\[ \nabla \cdot U_0 = 0. \tag{2.2b} \]
We assume $U_0(x) \in C^\infty$ and $f(x) \in C^\infty$. To discuss stability of $U_0$ we rewrite the Navier-Stokes equations (2.1) in perturbation form with $q(x, t) = U_0(x) + v(x, t)$,
\[ \frac{\partial v}{\partial t} = -(U_0 \cdot \nabla) v - (v \cdot \nabla) U_0 + R^{-1} \Delta v - \nabla \cdot (v \otimes v) - \nabla p, \tag{2.3a} \]
\[ \nabla \cdot v = 0, \tag{2.3b} \]
\[ v|_{t=0} = v_0. \tag{2.3c} \]
Applying the Leray projector $P$ onto the space of divergence free functions, we write (2.3a) in the operator form:
\[ \frac{\partial v}{\partial t} = Av + N(v, v), \tag{2.4} \]
where
\[ Av = P[-(U_0 \cdot \nabla) v - (v \cdot \nabla) U_0 + R^{-1} \Delta v], \tag{2.5} \]
\[ N(v, v) = P[-\nabla \cdot (v \otimes v)]. \tag{2.6} \]
We note that the linear operator $A$ is a bounded perturbation, to lower order, of the Stokes operator $R^{-1} P \Delta$. The operator $A$ generates a strongly continuous semigroup in every Sobolev space $W^{s,p}$, which we denote by $e^{At}$:
\[ v(t) = e^{At} v_0, \quad v_0 \in W^{s,p} \tag{2.7} \]
(the case of a bounded domain is treated in [21], and in the case of $\mathbb{R}^n$ the statement can be proved with the use of the Fourier transform and the Hörmander-Mikhlin multiplier theorem).

We now define a suitable version of Lyapunov (nonlinear) stability for the Navier-Stokes equations.

Definition 2.1. Let $(X, Z)$ be a pair of Banach spaces. An equilibrium $U_0$ which is the solution of (2.2) is called $(X, Z)$ nonlinearly stable if, no matter how small $\rho > 0$, there exists $\delta > 0$ so that
\[ v_0 \in X \quad \text{and} \quad \|v_0\|_Z < \delta \tag{2.8} \]
imply the following two assertions:
(i) there exists a global in time solution to (2.3) such that $v(t) \in C([0, \infty); X)$;
(ii) $\|v(t)\|_Z < \rho$ for a.e. $t \in [0, \infty)$.
An equilibrium $U_0$ that is not stable in the above sense is called Lyapunov unstable. We will drop the reference to $(X, Z)$ where it does not lead to confusion.

We note that under this strong definition of stability, loss of existence of a solution to (2.4) is a particular case of instability. We remark that in the literature there are many definitions of a solution to the Navier-Stokes equations. These include "classical" solutions that are continuous functions of each argument (and very few such solutions are known), "weak" solutions defined via test functions by Leray [12] and "mild" solutions introduced by Kato-Fujita [10]. It is this last concept of existence that we will invoke because we utilize a "mild" integral representation of the solution to (2.4) via Duhamel's formula. We remark that to date local in time existence of mild solutions for the Navier-Stokes equations is proved only in $L^p$, $p \ge n$ (for $p > n$ by Fabes-Jones-Riviere [4] and for $p = n$ by Kato [9]). The existence of weak solutions has been proved in $L^2$ by Leray [12], in $L^p$ for all $2 \le p < \infty$ by C. Calderon [2], and for uniformly locally square integrable initial data by Lemarié [11]. For a survey of existence results see, for example, Temam [19] and Cannone [3].
We now state the main result of this paper:

Theorem 2.2. Let $1 < p < \infty$ be arbitrary. Suppose that the operator $A$ over $L^p$ has spectrum in the right half of the complex plane. Then the flow $U_0$ is $(L^q, L^p)$ nonlinearly unstable for any $q > \max\{p, n\}$.

The proof of this theorem essentially uses properties of the operator $A$ which are stated in Lemmas 3.1 and 3.2. The instability result is proved using a bootstrap argument which is presented in Sect. 3 in the case of finite domains and $\mathbb{T}^n$ and in Sect. 4 in the case of $\mathbb{R}^n$. Here we state a version of the Sobolev embedding theorem that we shall invoke in the proof of Theorem 2.2.

Proposition 2.3. Let $s > 0$, $1 < r_1 < \infty$, and $1 < r_2 < \infty$ satisfy
\[ \frac{1}{r_1} < 1 - \frac{s}{n}, \qquad r_2 \le r_1, \qquad \frac{1}{r_2} \le \frac{1}{r_1} + \frac{s}{n}. \tag{2.9} \]
Then
\[ \|f\|_{-s, r_1} \lesssim \|f\|_{r_2}. \tag{2.10} \]

Proof. Recall that for $s > 0$ and $1 < r < \infty$, $W^{-s,r}$ is defined as the dual space to $W_0^{s,r'}$, where $1/r + 1/r' = 1$. The inequalities (2.9) can be rewritten as
\[ s r_1' < n, \qquad r_1' \le r_2' \le \frac{n r_1'}{n - s r_1'}. \tag{2.11} \]
Thus, the standard Sobolev embedding theorem implies that
\[ \|f\|_{r_2'} \lesssim \|f\|_{s, r_1'}. \tag{2.12} \]
Applying (2.12) we obtain
\[ \|f\|_{-s, r_1} = \sup_{\|g\|_{s, r_1'} \le 1} \langle f, g \rangle \lesssim \sup_{\|g\|_{r_2'} \le 1} \langle f, g \rangle = \|f\|_{r_2}, \]
which proves the proposition.

3. Finite Domain

In this section we present a proof of Theorem 2.2 in the case of finite domains $\mathbb{T}^n$ and $\Omega \subset \mathbb{R}^n$. Let $\mu$ be the eigenvalue of $A$ with maximal positive real part, which we denote by $\lambda$, and let $\phi \in L^p$, with $\|\phi\|_p = 1$, be the corresponding eigenfunction. We note that in the case of a finite domain all eigenfunctions of $A$ are infinitely smooth. For a fixed $0 < \delta < \lambda$ we denote by $A_\delta$ the following operator:
\[ A_\delta = A - \lambda - \delta. \tag{3.1} \]
Now we state two auxiliary lemmas which hold both in the case of a finite and in the case of an infinite domain.
Lemma 3.1. For every $0 < \alpha < 1$ and $p > 1$ there exists a constant $M > 0$ such that for all $t > 0$ one has
\[ \|A_\delta^\alpha e^{A_\delta t}\|_{L^p \to L^p} \le \frac{M}{t^\alpha}. \tag{3.2} \]
This lemma holds generally for any bounded analytic semigroup (see [15]). The rescaling of $A$ given by (3.1) ensures that the semigroup $e^{A_\delta t}$ is bounded. The fact that it is analytic is proved by Yudovich [21] and Giga [6].

Lemma 3.2. For every $1/2 < \alpha < 1$ and $p > 1$ there exists a constant $C > 0$ such that
\[ \|A_\delta^{-\alpha} f\|_p \le C \|f\|_{-2\alpha, p}. \tag{3.3} \]
In the case of a bounded domain the lemma follows by duality from the papers of Giga [7] and Seeley [17]. On the torus and $\mathbb{R}^n$ one can check (3.3) directly using the Fourier transform and the integral representation for the fractional power of a generator [15].

We are now in a position to prove Theorem 2.2. Let us fix an arbitrary small $\varepsilon > 0$, and solve the Cauchy problem (2.3) with initial condition $v_0 = \varepsilon\phi$. We note that for such an initial condition, with $\varepsilon$ small enough, there exists a unique global in time classical solution to (2.3) (see, for example, [19]). Using Duhamel's formula we write the solution in the form
\[ v(t) = \varepsilon e^{t\mu} \phi + B(t), \tag{3.4} \]
where
\[ B(t) = \int_0^t e^{A(t-\tau)} N(v, v)(\tau)\, d\tau. \]
The main idea of the proof is to show that the bilinear term $B(t)$ grows at most like the square of the norm of $v(t)$ for as long as the latter is bounded by a constant multiple of $\varepsilon e^{\lambda t}$. The $L^q$-metric in which such control is possible has to satisfy the assumption $q > n$. Since this condition is not assumed for $p$ we will use $L^q$ as an auxiliary space, while our final instability result will be proved in $L^p$ as stated.

Lemma 3.3. Let $q > n$. Then there exists a constant $C > 0$ such that the following estimate holds
\[ \|B(t)\|_q \le C \int_0^t e^{(\lambda+\delta)(t-\tau)} \frac{1}{(t-\tau)^\alpha} \|v(\tau)\|_q^2 \, d\tau, \tag{3.5} \]
for some $1/2 < \alpha < 1$.

Proof. Indeed, for any $0 < \alpha < 1$, we can write
\[ B(t) = \int_0^t e^{(\lambda+\delta)(t-\tau)} A_\delta^\alpha e^{A_\delta (t-\tau)} A_\delta^{-\alpha} N(v, v)(\tau)\, d\tau. \]
Hence, by Lemma 3.1,
\[ \|B(t)\|_q \lesssim \int_0^t e^{(\lambda+\delta)(t-\tau)} \frac{1}{(t-\tau)^\alpha} \|A_\delta^{-\alpha} N(v, v)(\tau)\|_q \, d\tau. \]
By Lemma 3.2, we have A−α δ N(v, v)q N(v, v)−2α,q v ⊗ v1−2α,q , where the last inequality follows from the continuity of the Leray projection. We now choose α sufficiently close to 1 so that q > n/(2α − 1). This would fulfill the conditions of Proposition 2.3 with s = 2α − 1, r1 = q and r2 = q/2. Thus, 2 A−α δ N(v, v)q v ⊗ vq/2 vq .
(3.6)
Inserting this in the last estimate for B(t)q we finally obtain (3.5). Let us fix q > max{n, p}. So, in particular, (3.5) holds. For any Q > φq let T = T (Q) be the maximal time such that v(t)q ≤ Qeλt ,
∀ t ≤ T.
(3.7)
Notice that (3.7) holds for $t = 0$. Hence, $T > 0$ by continuity. In fact, we show that this critical time $T$ is sufficiently large for any choice of $Q$. First, let us observe that for any $t \le T$, by Lemma 3.3,
$$\|B(t)\|_q \le C Q^2 \epsilon^2 \int_0^t e^{(\lambda+\delta)(t-\tau)} \frac{1}{(t-\tau)^{\alpha}}\, e^{2\lambda\tau}\, d\tau.$$
Splitting the integral into two integrals over $[0, t-1]$ and $[t-1, t]$, one can show that it behaves asymptotically as $e^{2\lambda t}$. Hence, perhaps with a different $C > 0$ independent of $Q$ or $t$, we obtain the following estimate
$$\|B(t)\|_q \le C (Q\epsilon e^{\lambda t})^2, \quad \forall\, t \le T. \tag{3.8}$$
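The asymptotic claim for the integral can also be checked by a direct substitution instead of splitting the interval. The following is only a sketch: it assumes $0 < \delta < \lambda$ (the range of $\delta$ is not restated at this point of the text) and $0 < \alpha < 1$, and the variable $s$ and the explicit Gamma-function constant are introduced here for illustration.

```latex
% Sketch: substitute s = t - \tau.
% Assumptions (not restated in the text here): 0 < \delta < \lambda, 0 < \alpha < 1.
\begin{align*}
\int_0^t \frac{e^{(\lambda+\delta)(t-\tau)}}{(t-\tau)^{\alpha}}\, e^{2\lambda\tau}\, d\tau
  &= e^{2\lambda t} \int_0^t s^{-\alpha}\, e^{-(\lambda-\delta)s}\, ds \\
  &\le e^{2\lambda t} \int_0^\infty s^{-\alpha}\, e^{-(\lambda-\delta)s}\, ds
   = \frac{\Gamma(1-\alpha)}{(\lambda-\delta)^{1-\alpha}}\; e^{2\lambda t}.
\end{align*}
```

Under these assumptions the constant depends only on $\alpha$, $\lambda$ and $\delta$, which is consistent with the statement that the new $C$ is independent of $Q$ and $t$.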
Using (3.8) we now prove an estimate on the size of $T$.

Lemma 3.4. For any $Q > \|\phi\|_q$ one has the following inequality
$$\epsilon e^{\lambda T} \ge \frac{Q - \|\phi\|_q}{C Q^2}. \tag{3.9}$$
Proof. If $T = \infty$, the inequality is trivial. If $T < \infty$, then at time $t = T$ the inequality (3.7) turns into equality and we obtain, using (3.4) and (3.8),
$$Q\epsilon e^{\lambda T} = \|v(T)\|_q \le \epsilon e^{\lambda T} \|\phi\|_q + C (Q\epsilon e^{\lambda T})^2.$$
The lemma now easily follows.

Let $X_*$ denote the constant on the right-hand side of (3.9), i.e.
$$X_* = \frac{Q - \|\phi\|_q}{C Q^2}. \tag{3.10}$$
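The step summarized as "easily follows" in the proof of Lemma 3.4 amounts to a single division. As a sketch, write $\epsilon > 0$ for the amplitude of the initial perturbation, so that the threshold in (3.7) reads $Q\epsilon e^{\lambda T}$ (this reading of the notation is an assumption wherever the symbol is not printed), and take $T < \infty$:

```latex
% Sketch of the last step in the proof of Lemma 3.4 (case T < \infty).
% Assumption: \epsilon > 0 is the amplitude of the initial perturbation.
\begin{align*}
Q\,\epsilon e^{\lambda T}
  &\le \epsilon e^{\lambda T}\,\|\phi\|_q + C\,Q^2\,(\epsilon e^{\lambda T})^2
  && \text{(equality in (3.7) at $t = T$, then (3.8))} \\
Q &\le \|\phi\|_q + C\,Q^2\,\epsilon e^{\lambda T}
  && \text{(divide by $\epsilon e^{\lambda T} > 0$)} \\
\epsilon e^{\lambda T} &\ge \frac{Q - \|\phi\|_q}{C\,Q^2} = X_* .
\end{align*}
```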
In view of (3.9) there exists a time $t_* \le T$ such that $X_* = \epsilon e^{\lambda t_*}$. Since $q > p$ we trivially have
$$\|B(t)\|_p \le C' \|B(t)\|_q, \tag{3.11}$$
Nonlinear Instability for the Navier-Stokes Equations
for some $C' > 0$. So, by the triangle inequality applied to (3.4) we obtain, using (3.8), (3.11), and our assumption $\|\phi\|_p = 1$,
$$\|v(t_*)\|_p \ge X_* - C' C X_*^2 = X_* (1 - C' C X_*). \tag{3.12}$$
Since $C$ and $C'$ are independent of $Q$, we could choose $Q = Q_0$ at the beginning of the argument so close to $\|\phi\|_q$ that $X_* < 1/(2C'C)$. Then $\|v(t_*)\|_p \ge X_*/2 = c_0$. This finishes the proof of Theorem 2.2 in the case of a finite domain.

We remark that in the case of a finite domain our method proves a stronger result. Since the eigenfunction $\phi$ belongs to $C^\infty$, the size of the initial perturbation can be measured in the stronger metric of $C^\infty$ so that $\|v_0\|_{C^\infty} \le \epsilon$, whereas instability at the critical time $t_*$ is measured in the weak $L^p$-metric.
4. Infinite Domain

The case of $\mathbb{R}^n$ brings two main difficulties to the proof. First, we no longer have the inclusion $L^q \subset L^p$ to satisfy (3.11). Second, there may not be an exact smooth eigenfunction $\phi$ corresponding to $\mu \in \sigma(A)$, because the operator $A$ has a non-compact resolvent over $\mathbb{R}^n$.
4.1. Estimates for $B(t)$. In the case of $\mathbb{R}^n$ we replace the single estimate (3.11) with a sequence of recursive estimates improving the integrability exponent at each step. Let $L$ be the first integer such that $2^L p > n$. By Lemma 3.3, which is valid on $\mathbb{R}^n$ too, we have
$$\|B(t)\|_{2^L p} \le C \int_0^t e^{(\lambda+\delta)(t-\tau)} \frac{1}{(t-\tau)^{\alpha}}\, \|v(\tau)\|_{2^L p}^2\, d\tau, \tag{4.1}$$
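for some $1/2 < \alpha < 1$. On the other hand, for every $l = 0, \ldots, L-1$ one has, in place of (3.6),
$$\|A_\delta^{-\alpha} N(v,v)\|_{2^l p} \lesssim \|v \otimes v\|_{1-2\alpha,\,2^l p} \lesssim \|v \otimes v\|_{2^l p} \lesssim \|v\|_{2^{l+1} p}^2.$$
Thus, we obtain
$$\|B(t)\|_{2^l p} \le C \int_0^t e^{(\lambda+\delta)(t-\tau)} \frac{1}{(t-\tau)^{\alpha}}\, \|v(\tau)\|_{2^{l+1} p}^2\, d\tau. \tag{4.2}$$
We postpone the use of (4.1) and (4.2) till Lemma 4.3, where we show the analogue of (3.12) for the case of $\mathbb{R}^n$.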
4.2. Construction of approximate eigenfunctions. Suppose now that $\mu \in \sigma(A)$ lies on the boundary of the spectrum and has the greatest positive real part $\lambda$. In this case there exists a sequence of functions $\{f_m\}_{m=1}^{\infty} \subset L^p(\mathbb{R}^n)$ such that
$$\|f_m\|_p = 1, \qquad \lim_{m\to\infty} \|A f_m - \mu f_m\|_p = 0,$$
and as a consequence, for every $t > 0$,
$$\lim_{m\to\infty} \|e^{tA} f_m - e^{t\mu} f_m\|_p = 0.$$
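Lemma 4.1. There exists a sequence $\{\phi_m\}_{m=1}^{\infty} \subset L^p(\mathbb{R}^n)$ such that the following is true:
(i) $\|\phi_m\|_p = 1$, $m \in \mathbb{N}$;
(ii) For every $q > p$ there is a constant $M_q$ such that $\|\phi_m\|_q \le M_q$ holds for all $m \in \mathbb{N}$;
(iii) $\|e^{tA}\phi_m\|_p \ge \tfrac{1}{2} e^{t\lambda}$, for all $0 \le t \le m$;
(iv) $\|e^{tA}\phi_m\|_q \le 2\|\phi_m\|_q\, e^{t\lambda}$, for all $0 \le t \le m$ and $p \le q \le 2^L p$.

Proof. Let $\tilde\phi_m = e^{A} f_m$. Since $\|e^{A} f_m - e^{\mu} f_m\|_p \to 0$ we conclude that
$$c \le \|\tilde\phi_m\|_p \le C, \tag{4.3}$$
for all $m \in \mathbb{N}$. Denote $\phi_m = \tilde\phi_m \cdot \|\tilde\phi_m\|_p^{-1}$. Clearly, (i) is satisfied. To prove the other three statements we fix $s > 0$ such that $n = sp$. By the end-point Sobolev embedding theorem and (4.3) we have, for any $q > p$,
$$\|\phi_m\|_q \lesssim \|\tilde\phi_m\|_q = \|e^{A} f_m\|_q \lesssim \|e^{A} f_m\|_{s,p} \lesssim \|A_\delta^{s} e^{A} f_m\|_p \lesssim \|f_m\|_p = 1.$$
This proves (ii). Furthermore, we have
$$\|e^{tA}\phi_m - e^{t\mu}\phi_m\|_q \lesssim \|e^{A}(e^{tA} f_m - e^{t\mu} f_m)\|_{s,p} \lesssim \|A_\delta^{s} e^{A}(e^{tA} f_m - e^{t\mu} f_m)\|_p \lesssim \|e^{tA} f_m - e^{t\mu} f_m\|_p \to 0,$$
as $m \to \infty$ for each fixed $t > 0$ and $p \le q$. So, by choosing an appropriate subsequence, we achieve (iii) and (iv).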
4.3. Bootstrap argument. Let us fix an arbitrary $\epsilon > 0$ and find $m \in \mathbb{N}$ such that
$$\epsilon e^{\lambda m} > 1. \tag{4.4}$$
This $m$ will be fixed through the rest of the argument. We solve the Cauchy problem (2.3) with initial condition $v_0 = \epsilon\phi_m$. Lemma 4.1 shows that $\phi_m \in L^q$ uniformly in $m$ for all $q > p$. In particular, for any fixed $q > \max\{p, n\}$ there exists a mild solution in $Z = L^q$ for which the Duhamel formulation holds:
$$v(t) = \epsilon e^{At}\phi_m + B(t). \tag{4.5}$$
We note that failure of $v(t)$ to satisfy (4.5) for all $t > 0$, or to belong to $C([0,\infty), X)$, is regarded as instability by definition. We thus can assume in the rest of the argument that (4.5) holds for all $t > 0$ and $v \in C([0,\infty), X)$. In addition, since $\phi_m \in L^{2^L p}$ and $2^L p > n$, the solution $v(t)$ belongs to $L^{2^L p}$ at least for a certain initial period of time. Our subsequent estimates will show that, in fact, $v(t) \in L^{2^L p}$ over a time interval of the order $\log(1/\epsilon)$.

Let $Q > 2\|\phi_m\|_{2^L p}$ be arbitrary, and define $T = T(Q)$ to be the maximal time such that
$$\|v(t)\|_{2^L p} \le Q\epsilon e^{\lambda t}, \quad \text{for all } t \le T. \tag{4.6}$$
As in the previous section, the following inequality holds
$$\|B(t)\|_{2^L p} \le C (Q\epsilon e^{\lambda t})^2, \quad \forall\, t \le T. \tag{4.7}$$
Lemma 4.2. For any $Q > 2\|\phi_m\|_{2^L p}$ we have
$$\epsilon e^{\lambda T} \ge \min\left\{1;\ \frac{Q - 2\|\phi_m\|_{2^L p}}{C Q^2}\right\}, \tag{4.8}$$
where $C > 0$ is independent of $Q$.

Proof. If $T \ge m$, we appeal to (4.4). If $T < m$, then at $t = T$ the inequality (4.6) must turn into equality. Thus, in view of (4.7) we have
$$Q\epsilon e^{\lambda T} = \|v(T(Q))\|_{2^L p} \le 2\epsilon \|\phi_m\|_{2^L p}\, e^{\lambda T} + C (Q\epsilon e^{\lambda T})^2,$$
which implies (4.8).

We will choose $Q$ appropriately after the following key lemma.

Lemma 4.3. There are constants $C_2, \ldots, C_{2^{L+1}}$ and $2 \le K \le 2^{L+1}$ independent of $Q$ and $m$ such that for any $t \le \min\{T, m\}$ one has the following inequality
$$\|v(t)\|_p \ge \tfrac{1}{2} X - C_2 X^2 - \cdots - C_{K-1} X^{K-1} - C_K Q^K X^K - \cdots - C_{2^{L+1}} Q^{2^{L+1}} X^{2^{L+1}}, \tag{4.9}$$
where $X = \epsilon e^{\lambda t}$.
Proof. First we bound all the norms $\|v(t)\|_{2^l p}$, $l = 1, \ldots, L$, from above using the estimates on the nonlinear term (4.2), (4.7). We start with $l = L$ and invoke (4.7) to obtain
$$\|v(t)\|_{2^L p} \le 2\epsilon e^{\lambda t} \|\phi_m\|_{2^L p} + C (\epsilon e^{\lambda t})^2 Q^2 = C_1 X + C_2 Q^2 X^2,$$
for all $t \le T$. We note that our constants may change during the proof. By the previous inequality and (4.2) with $l = L - 1$ we obtain
$$\|v(t)\|_{2^{L-1} p} \le 2X \|\phi_m\|_{2^{L-1} p} + \int_0^t \frac{e^{(t-\tau)(\lambda+\delta)}}{(t-\tau)^{\alpha}} \left(C_1 \epsilon e^{\lambda\tau} + C_2 Q^2 \epsilon^2 e^{2\lambda\tau}\right)^2 d\tau \le C_1 X + C_2 X^2 + C_3 Q^3 X^3 + C_4 Q^4 X^4.$$
Here and hereafter we use the fact that for any $k \ge 2$ one has
$$\int_0^t e^{(t-\tau)(\lambda+\delta)} (t-\tau)^{-\alpha} e^{k\lambda\tau}\, d\tau \lesssim e^{k\lambda t}.$$
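The step from $l = L$ to $l = L - 1$ shows how the polynomial degree in $X$ doubles at each stage. The sketch below writes $X(\tau) = \epsilon e^{\lambda\tau}$, where $\epsilon$ is assumed to denote the amplitude of the initial perturbation, and renames constants freely, as the proof does; it only expands the square and applies the integral bound term by term.

```latex
% Sketch: degree doubling from l = L to l = L-1.
% Assumption: X(\tau) = \epsilon e^{\lambda\tau}; constants are renamed freely.
\begin{align*}
\bigl(C_1 X(\tau) + C_2 Q^2 X(\tau)^2\bigr)^2
  &= C_1^2\, X(\tau)^2 + 2 C_1 C_2\, Q^2 X(\tau)^3 + C_2^2\, Q^4 X(\tau)^4, \\
\int_0^t \frac{e^{(t-\tau)(\lambda+\delta)}}{(t-\tau)^{\alpha}}\, X(\tau)^k \, d\tau
  &= \epsilon^k \int_0^t \frac{e^{(t-\tau)(\lambda+\delta)}}{(t-\tau)^{\alpha}}\, e^{k\lambda\tau}\, d\tau
  \lesssim \epsilon^k e^{k\lambda t} = X(t)^k, \qquad k = 2, 3, 4.
\end{align*}
```

Each squaring doubles the top power of $X$, which is consistent with the exponent $2^{L+1}$ reached after the $L + 1$ steps $l = L, L-1, \ldots, 0$ in (4.9). How the powers of $Q$ are distributed among the renamed constants follows the text and is not tracked in this sketch.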
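By induction on $l$ we arrive at
$$\|v(t)\|_{2p} \le C_1 X + \cdots + C_{\tilde K - 1} X^{\tilde K - 1} + C_{\tilde K} Q^{\tilde K} X^{\tilde K} + \cdots + C_{2^L} Q^{2^L} X^{2^L},$$
and hence,
$$\|B(t)\|_p \le C_2 X^2 + \cdots + C_{K-1} X^{K-1} + C_K Q^K X^K + \cdots + C_{2^{L+1}} Q^{2^{L+1}} X^{2^{L+1}}.$$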
Finally, we use (iii) of Lemma 4.1 and the triangle inequality on (4.5) in the opposite direction to get (4.9).

We will choose $Q$ so that the right-hand side of (4.9) is bigger than an absolute constant at
$$X = X_* = \min\left\{1;\ \frac{Q - 2\|\phi_m\|_{2^L p}}{C Q^2}\right\}.$$
For this $X_*$, due to Lemma 4.2 and our initial assumption (4.4), there exists a $t_* \le \min\{T, m\}$ such that $X_* = \epsilon e^{t_*\lambda}$. Hence, Lemma 4.3 applies to obtain instability at time $t = t_*$. It is convenient to seek $Q$ in the form $Q = (2 + a\|\phi_m\|_{2^L p})\,\|\phi_m\|_{2^L p}$, where $0 < a < 1$. Then
$$\frac{Q - 2\|\phi_m\|_{2^L p}}{C Q^2} = \frac{a}{C (2 + a\|\phi_m\|_{2^L p})^2} \le \frac{a}{4C}.$$
Choosing $a < 4C$ we ensure that
$$\frac{Q - 2\|\phi_m\|_{2^L p}}{C Q^2} < 1$$
and hence,
$$X_* = \frac{Q - 2\|\phi_m\|_{2^L p}}{C Q^2}.$$
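The algebra behind this choice of $Q$ can be verified in a few lines. In the sketch below, $N$ is shorthand introduced here for $\|\phi_m\|_{2^L p}$; everything else is as in the text.

```latex
% Sketch: N := \|\phi_m\|_{2^L p} is shorthand introduced here; 0 < a < 1.
\begin{align*}
Q &= (2 + aN)\,N, \\
Q - 2N &= aN^2, \qquad Q^2 = (2 + aN)^2 N^2, \\
\frac{Q - 2N}{C Q^2} &= \frac{a}{C\,(2 + aN)^2} \le \frac{a}{4C}
  \qquad \text{since } aN \ge 0 .
\end{align*}
```

In particular, $a < 4C$ forces the quotient below 1, so the minimum defining $X_*$ is attained by the quotient rather than by 1.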
By the above estimate and (ii) of Lemma 4.1, we have
$$\frac{a}{C (2 + M_{2^L p})^2} \le X_* \le \frac{a}{4C},$$
or
$$\frac{a}{C'} \le X_* \le \frac{a}{c'}. \tag{4.10}$$
We notice that since $Q$ is bounded by a constant independent of $m$ and $a$, we can bound the minimum
$$\min\left\{1;\ \min_{2 \le k \le K-1} (C_k 4^k)^{-1/k};\ \min_{K \le k \le 2^{L+1}} (C_k Q^k 4^k)^{-1/k}\right\}$$
from below by some constant $c_0$ independent of $Q$. Let $a = \min\{4C, c' c_0/2\}$. Then from (4.10), we obtain $\tilde c_0 \le X_* \le c_0$. Thus, by (4.9),
$$\|v(t_*)\|_p \ge \tilde c_0 \left(\frac{1}{2} - \frac{1}{16} - \cdots\right) = c.$$
This finishes the proof.

We remark again that, as in the case of a finite domain, our method yields a slightly stronger result. Since $\phi_m \in W^{s,p}$ uniformly, we can measure the size of the initial perturbation in the metric of any Sobolev space $W^{s,p}$ for all $s > 0$.

5. Stability Result

Bootstrap techniques can also be used to prove that linear stability implies nonlinear stability for the Navier-Stokes equations in $L^q$ for $q > n$. In particular this reproves the classical stability theorem of Yudovich [21].

Theorem 5.1. Let $q > n$ be arbitrary. Assume the operator $A$ in $L^q$ has spectrum confined to the left half of the complex plane. Then the flow $U_0$ is $(L^q, L^q)$ nonlinearly stable. The result holds in $\mathbb{T}^n$ and $\Omega$, and in any spatial dimension $n$.

Proof. We recall that any analytic semigroup possesses the spectral mapping property. From the assumption that the spectrum of $A$ is confined to the left half plane we thus conclude that the exponential type of the semigroup $e^{At}$ is negative. Hence, there exists $\lambda > 0$ such that
$$\|e^{At} v_0\|_q \le M e^{-\lambda t} \|v_0\|_q, \tag{5.1}$$
for all $t > 0$ and $v_0 \in L^q$. From Duhamel's formula (3.4) with the initial condition replaced by $v_0$, and by an argument similar to that used in the proof of Lemma 3.3, we have
$$\|v(t)\|_q \le M e^{-\lambda t} \|v_0\|_q + C \int_0^t e^{-\lambda(t-\tau)} (t-\tau)^{-\alpha} \|v(\tau)\|_q^2\, d\tau. \tag{5.2}$$
Again let $T$ be the maximal time for which
$$\|v(t)\|_q \le 2M \|v_0\|_q\, e^{-\lambda t}, \quad t \le T. \tag{5.3}$$
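Combining (5.2) and (5.3) gives
$$\|v(t)\|_q \le M e^{-\lambda t} \|v_0\|_q + 4M^2 C e^{-2\lambda t} \|v_0\|_q^2 \le M e^{-\lambda t} \|v_0\|_q \,(1 + 4MC \|v_0\|_q),$$
for $t \le T$. We choose $\|v_0\|_q < (8MC)^{-1}$. Then the previous inequality implies that
$$\|v(t)\|_q \le \frac{3}{2} M \|v_0\|_q\, e^{-\lambda t}, \tag{5.4}$$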
for $t \le T$. Hence, the assumption of (5.3) implies the smaller bound of (5.4), which gives a contradiction with a maximal finite $T$. Thus, $T = \infty$ and the bound (5.3) holds for all $t \ge 0$. This bound implies the global existence of the solution to (2.1) and condition (ii) of Definition 2.1 for a sufficiently small choice of $\|v_0\|_q$.

Remark 5.2. The instability/stability results in this paper can be generalized to all the equations of motion that are augmented versions of the equations for incompressible, dissipative fluids described in operator form by an appropriate version of (2.4). This includes the magnetohydrodynamic equations for a dissipative electrically conducting fluid, the equations for an incompressible, stratified fluid with viscous and thermal dissipation and the so-called modified Navier-Stokes equations with $(-\Delta)$ replaced by $(-\Delta)^{\beta}$, where $\beta > 1/2$.

References
1. Bardos, C., Guo, Y., Strauss, W.: Stable and unstable ideal plane flows. Chinese Ann. Math. Ser. B 23(2), 149–164 (2002). Dedicated to the memory of Jacques-Louis Lions.
2. Calderón, C.P.: Existence of weak solutions for the Navier-Stokes equations with initial data in L^p. Trans. Amer. Math. Soc. 318(1), 179–200 (1990)
3. Cannone, M.: Harmonic analysis tools for solving the incompressible Navier-Stokes equations. Handbook of mathematical fluid dynamics, Vol. III. Amsterdam: North-Holland, 2004, pp. 161–244
4. Fabes, E.B., Jones, B.F., Rivière, N.M.: The initial value problem for the Navier-Stokes equations with data in L^p. Arch. Rat. Mech. Anal. 45, 222–240 (1972)
5. Friedlander, S., Strauss, W., Vishik, M.: Nonlinear instability in an ideal fluid. Ann. Inst. H. Poincaré Anal. Non Linéaire 14(2), 187–209 (1997)
6. Giga, Y.: Analyticity of the semigroup generated by the Stokes operator in L^r spaces. Math. Z. 178(3), 297–329 (1981)
7. Giga, Y.: The Stokes operator in L^r spaces. Proc. Japan Acad. Ser. A Math. Sci. 57(2), 85–89 (1981)
8. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Mathematics 840. New York: Springer-Verlag, 1981
9. Kato, T.: Strong L^p-solutions of the Navier-Stokes equation in R^m, with applications to weak solutions. Math. Z. 187(4), 471–480 (1984)
10. Kato, T., Fujita, H.: On the nonstationary Navier-Stokes system. Rend. Sem. Mat. Univ. Padova 32, 243–260 (1962)
11. Lemarié-Rieusset, P.G.: Solutions faibles d'énergie infinie pour les équations de Navier-Stokes dans R^3. C. R. Acad. Sci. Paris Sér. I Math. 328(12), 1133–1138 (1999)
12. Leray, J.: Essai sur le mouvement d'un liquide visqueux emplissant l'espace. Acta Math. 63, 193–248 (1934)
13. Lin, Z.: Nonlinear instability of ideal plane flows. Int. Math. Res. Not. 41, 2147–2178 (2004)
14. Mešalkin, L.D., Sinaĭ, Ja.G.: Investigation of the stability of a stationary solution of a system of equations for the plane movement of an incompressible viscous liquid. J. Appl. Math. Mech. 25, 1700–1705 (1961)
15. Pazy, A.: Semigroups of linear operators and applications to partial differential equations. New York: Springer-Verlag, 1983
16. Sattinger, D.H.: The mathematical problem of hydrodynamic stability. J. Math. Mech. 19(9), 797–817 (1970)
17. Seeley, R.: Interpolation in L^p with boundary conditions. Studia Math. 44, 47–60 (1972). Collection of articles honoring the completion by Antoni Zygmund of 50 years of scientific activity.
18. Serrin, J.: On the stability of viscous fluid motions. Arch. Rat. Mech. Anal. 3, 1–13 (1959)
19. Temam, R.: Some developments on Navier-Stokes equations in the second half of the 20th century. Development of mathematics 1950–2000. Basel: Birkhäuser, 2000, pp. 1049–1106
20. Vishik, M., Friedlander, S.: Nonlinear instability in two dimensional ideal fluids: the case of a dominant eigenvalue. Comm. Math. Phys. 243(2), 261–273 (2003)
21. Yudovich, V.I.: The linearization method in hydrodynamical stability theory. Translations of Mathematical Monographs, Vol. 74. Providence, RI: Amer. Math. Soc., 1989

Communicated by P. Constantin
Commun. Math. Phys. 264, 349–370 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1547-2
Communications in
Mathematical Physics
Connecting Solutions of the Lorentz Force Equation do Exist
E. Minguzzi^{1,2}, M. Sánchez^{3}
1 Departamento de Matemáticas, Plaza de la Merced 1–4, 37008 Salamanca, Spain. E-mail: [email protected]
2 INFN, Piazza dei Caprettari 70, 00186 Roma, Italy
3 Departamento de Geometría y Topología, Facultad de Ciencias, Avda. Fuentenueva s/n. 18071 Granada, Spain. E-mail: [email protected]
Received: 8 April 2005 / Accepted: 14 November 2005
Published online: 11 March 2006 – © Springer-Verlag 2006
Abstract: Recent results on the maximization of the charged-particle action $I_{x_0,x_1}$ in a globally hyperbolic spacetime are discussed and generalized. We focus on the maximization of $I_{x_0,x_1}$ over a given causal homotopy class $\mathcal{C}$ of curves connecting two causally related events $x_0 \le x_1$. Action $I_{x_0,x_1}$ is proved to admit a maximum on $\mathcal{C}$, and also one in the adherence of each timelike homotopy class $C$. Moreover, the maximum $\sigma_0$ on $\mathcal{C}$ is timelike if $\mathcal{C}$ contains a timelike curve (and the degree of differentiability of all the elements is at least $C^2$). In particular, this last result yields a complete Avez-Seifert type solution to the problem of connectedness through trajectories of charged particles in a globally hyperbolic spacetime endowed with an exact electromagnetic field: fixed any charge-to-mass ratio $q/m$, any two chronologically related events $x_0 \ll x_1$ can be connected by means of a timelike solution of the Lorentz force equation corresponding to $q/m$. The accuracy of the approach is stressed by many examples, including an explicit counterexample (valid for all $q/m \ne 0$) in the non-exact case. As a relevant previous step, new properties of the causal path space, causal homotopy classes and cut points on lightlike geodesics are studied.

Contents
1. Introduction
2. Causal Homotopy Classes
3. Connectedness Through Solutions to the LFE
4. Maximization of $I_{x_0,x_1}$ over Homotopy Classes
5. Existence of Timelike Local Maximizers
6. The Non-Exact Electromagnetic Field Case
7. Conclusions
Appendix: Causality of Kaluza-Klein Metrics
References
1. Introduction

Recently, there has been a renewed interest in the existence of solutions to the Lorentz force equation (LFE; Eq. (1) below) connecting two events $x_0 \ll x_1$ of a spacetime $M$, for an (exact) electromagnetic field, $F = d\omega$. Even though many, sometimes competitive, results have been obtained [5, 12, 13, 17, 11, 29, 30], and related mathematical problems studied [6, 16, 31], the full answer to the original problem has remained open:

Question (Q). Assume that $M$ is globally hyperbolic, fix a charge-to-mass ratio $q/m$, and let $x_0 \ll x_1$ be any two fixed chronologically related events: must a timelike solution of the corresponding LFE, connecting the two events, exist?

Our main aim is to give a complete (affirmative) answer to this question. Moreover, we will also study some properties of causal homotopy classes which are not only essential for (Q) but also interesting in their own right.

The existence of a length-maximizing causal geodesic $\sigma_0$ connecting $x_0$ and $x_1$, when $x_0 \le x_1$, is a well-known property since the works by Avez [3] and Seifert [39] (see Subsect. 3.1). For the LFE, one must solve two problems:
(A) Prove that the associated action functional $I_{x_0,x_1}$ admits local maximizers (or at least critical points).
(B) Show that one of these maximizers is timelike (at all the points).
In contrast with the Avez-Seifert geodesic case, this second point is now not trivial and becomes essential because, otherwise, the maximizer cannot be interpreted as a solution to the LFE (see Sect. 3).

Our progress can be summarized as follows:

(A) The existence of a maximizer of $I_{x_0,x_1}$ on each causal homotopy class $\mathcal{C}_{x_0,x_1}$ will be proved. In principle, a possible proof would follow steps analogous to the Avez-Seifert theorem, but new technicalities would appear (to prove the upper semi-continuity of $I_{x_0,x_1}$, or to try to reduce the space of connecting curves to a finite-dimensional one, as in Subsect. 2.3), which propagate to problem (B). Instead, we follow an approach based on Kaluza-Klein metrics as in [17, 29]. This approach reduces problem (A) to a problem on lightlike geodesics, and gives a simple geometrical interpretation for the functional $I_{x_0,x_1}$, which is somewhat reminiscent of the time of arrival functional in the Fermat principle of General Relativity [25, 35] (see Subsect. 4.2). In the present paper, this approach will be refined and clarified to obtain a maximizer of $I_{x_0,x_1}$ on each causal homotopy class $\mathcal{C}_{x_0,x_1}$ (and in the closure $\overline{C}_{x_0,x_1}$ of each timelike homotopy class).

(B) We must emphasize that when the maximizer in $\mathcal{C}_{x_0,x_1}$ is not timelike, it becomes a lightlike geodesic (Theorems 3.2, 4.2). This excludes the existence of non-timelike maximizers for generic pairs $x_0 \ll x_1$, but lightlike maximizers can exist for particular $x_0$, $x_1$ (Example 3.1). Nevertheless, as a main goal in the present paper, we prove that, in this case, the causal homotopy class of a lightlike maximizer can only contain lightlike pregeodesics (some of them, global maximizers in the class), that is, $x_0$ and $x_1$ are only causally but not chronologically related by curves in the class. Thus, as $x_0 \ll x_1$, we can choose a causal homotopy class which contains a timelike curve, and the required timelike maximizer is obtained. As shown by means of an explicit example (Remark 5.2), differentiability $C^2$ will be essential (this is the natural degree of differentiability for the LFE, even though the associated variational problem makes sense for $C^1$ differentiability).
Our main result is then the following:

Theorem 1.1. Let $(M, g)$ be a $C^2$ globally hyperbolic spacetime, and $F$ be an exact electromagnetic field on $M$ ($F = d\omega$ for some $C^2$ differential form). Choose $x_1 \in J^+(x_0)$, $x_1 \ne x_0$, and fix any causal homotopy class $\mathcal{C}_{x_0,x_1}$. For each charge-to-mass ratio $q/m \in \mathbb{R}$ there exists a future-directed causal curve $\sigma_0$ which connects $x_0$ and $x_1$ and maximizes the corresponding action functional $I_{x_0,x_1}$ on $\mathcal{C}_{x_0,x_1}$. This maximizer $\sigma_0$ is lightlike if and only if $\mathcal{C}_{x_0,x_1}$ only contains lightlike curves (necessarily geodesics, up to reparametrizations). In this case, $\sigma_0$ is a lightlike geodesic with no conjugate point to $x_0$ strictly before $x_1$, and $x_1$ will be conjugate if there exists a second curve in $\mathcal{C}_{x_0,x_1}$ which is not a reparametrization of $\sigma_0$. Otherwise, $\sigma_0$ is timelike, and its reparametrization with respect to proper time becomes a solution of the LFE for the charge-to-mass ratio $q/m$. In particular, given $q/m$, if $x_1 \in I^+(x_0)$ there exists at least one solution to the LFE which connects $x_0$ and $x_1$.

This work is organized as follows.
– In Sect. 2, causal homotopy classes are studied, especially in globally hyperbolic spacetimes. In Subsect. 2.1, the general framework is introduced, and some new general properties are given (Theorem 2.1, Corollary 2.1; compare with [7, Theorem 9.15]). In Subsect. 2.2, we introduce the notion of homotopic cut point, and prove that, for lightlike geodesics, this point becomes equivalent to the first conjugate point (Theorem 2.2, Remark 2.3). This result will be essential to solve problem (B) above, in Sect. 5. In Subsect. 2.3, first some properties of the standard approach for the timelike path space of a globally hyperbolic spacetime [41], [7, Ch. 10.2] are shown to be extendible to the causal case. More originally, Theorem 2.3 proves, in particular, that any geodesic limit of curves contained in a single timelike or causal homotopy class also belongs to this same causal homotopy class; this result will turn out to be essential to solve problem (A) above, in Sect. 4.
– In Sect. 3, the LFE and question (Q) are introduced (Subsect. 3.1), and the variational framework for the LFE is discussed (Subsect. 3.2). Even though this framework is well-known, we discuss it in some detail, because there are some related variational frameworks (widely studied in recent references) which may lead to confusion. In Subsect. 3.3, we summarize the known results, and give a counterexample which shows that the basic question (Q) was still open, in general.
– In Sect. 4, Problem (A) on the existence of local maximizers is solved, as explained above. Even though we follow the approach in [17, 29], the proof is rewritten completely. In fact, apart from some simplifications of these references, several new technicalities appear when causal homotopy classes are considered, in both the Kaluza-Klein fiber bundle (Subsect. 4.1) and the limit process on curves of the homotopy class (Subsect. 4.2, Lemma 4.1). The main result, Theorem 4.2, includes a refinement on the existence of local maximizers for the action $I_{x_0,x_1}$ in the closure of any timelike homotopy class (which remains valid for $C^1$ elements).
– In Sect. 5, Problem (B) is solved completely. In Subsect. 5.1, we give a result on the impossibility for a lightlike geodesic with conjugate points to be a local maximizer of actions as those for the LFE (Lemma 5.1, Remark 5.1). Then, the timelike character of the maximizer follows from the properties of causal homotopy classes in Sect. 2. In Subsect. 5.2, we give examples which show the accuracy of our results: (i) even though there are maximizers in the closure of any timelike homotopy class, these
maximizers can be lightlike in some of these classes, and one can ensure the existence of a timelike maximizer only in the whole causal homotopy class (which may contain more than one timelike homotopy class), but (ii) if the degree of differentiability were only $C^1$, such a timelike maximizer may not exist.
– In Sect. 6 we provide an example which shows that the results obtained do not admit further generalizations to the non-exact electromagnetic field case. Remarkably, in this example: (a) for a suitably chosen pair $x_0 \ll x_1$, no connecting solution of the LFE exists, whatever value of $q/m$ ($\ne 0$) is chosen (in particular, no timelike connecting solution exists for the related Eq. (5) considered in Subsect. 3.2), and (b) even though $F$ is non-exact, it is the curvature of a suitable bundle with fiber $S^1$.
– In Sect. 7 we give the conclusions.

Finally, in a short appendix a general result on global hyperbolicity for Kaluza-Klein metrics is given. This result makes our paper self-contained (compare with [14, 42] and [17, Lemma 5]), and it is provided not only for completeness, but also to incorporate the recent progress on the splitting of globally hyperbolic spacetimes in [9, 10]. In particular, this rules out the problem of differentiability in the parametrizations of the curves in the causal path space [7, Lemma 10.34].

2. Causal Homotopy Classes

2.1. General properties. Throughout this paper, $(M, g)$ will denote a $C^{r_0}$ spacetime (connected, time-oriented Lorentzian manifold), $r_0 \in \{2, \ldots, \infty\}$, of arbitrary dimension $n_0 \ge 2$ and signature $(+, -, \ldots, -)$. Nevertheless, for the maximization result of the action functional in Sect. 4, $r_0 = 1$ is enough. Without loss of generality, causal curves will be regarded as piecewise $C^{r_0}$ (piecewise smooth) and future-directed from now on¹.
Notice that the curves are regarded as parametrized (we will not be specifically interested in the space of all the unparametrized causal curves, in the spirit of Morse theory), and suitable reparametrizations will be chosen, if necessary. In principle, strong causality is our (minimal) ambient causal assumption for $(M, g)$, always assumed implicitly. But most of the results need global hyperbolicity, and sometimes this assumption will be imposed (explicitly in this section) for simplicity. For background results in this section, see for example [7, Ch. 9, 10] and [33, Ch. 10]; timelike homotopy classes have also been studied in different contexts (see for example, [40, 18, 38] and references therein); some of the difficulties circumvented in our approach are illustrated in the beginning of the proof of [34, Th. 6.5].

Two causal (resp. timelike) curves, $\gamma_i : [\lambda_0, \lambda_1] \to M$, $i = 0, 1$, with fixed extremes $x_j = \gamma_0(\lambda_j) = \gamma_1(\lambda_j)$, $j = 0, 1$, are causally homotopic if there exists a causal (resp. timelike) homotopy (with fixed extremes $x_0$, $x_1$) connecting $\gamma_0$, $\gamma_1$, i.e., a continuous map
$$H : [0, 1] \times [\lambda_0, \lambda_1] \to M, \qquad (\epsilon, \lambda) \mapsto \gamma_\epsilon(\lambda),$$
such that each longitudinal curve $\gamma_\epsilon$ is causal (resp. timelike) for all $\epsilon$. This divides the set of causal curves joining $x_0$ and $x_1$ into causal homotopy classes, each one containing none, one or more classes of timelike homotopy (see Example 2.1). In our notation a generic causal (resp. timelike) homotopy class will be denoted by $\mathcal{C}_{x_0,x_1}$ (resp. $C_{x_0,x_1}$). The following lemma to Theorem 2.1 solves the technical difficulty associated with our choice of parametrized curves.

¹ The reader can check that no more generality is obtained if the causal curves (in our at least $C^2$ spacetime) are regarded only as piecewise $C^1$ smooth. Even more, essentially the results in the present section can be extended if future-directed causal curves are regarded only as $C^0$ (i.e., a continuous curve $\lambda \mapsto \gamma(\lambda)$ which satisfies: for any open connected subset $U \subset M$, if $\lambda < \lambda'$ and $[\lambda, \lambda'] \subset \gamma^{-1}(U)$ then $\gamma(\lambda)$ and $\gamma(\lambda')$ can be joined by a piecewise smooth future-directed causal curve contained in $U$; such causal curves satisfy a Lipschitzian condition, see [34, p. 17]). For example, Lemma 2.1 (and, then, Theorem 2.1 and Corollary 2.1) or Theorem 2.2 hold obviously if the longitudinal curves of the causal homotopies are allowed to be only continuous causal curves.

Lemma 2.1. Consider two causally homotopic lightlike geodesics $\gamma_0, \gamma_1 : [\lambda_0, \lambda_1] \to M$ joining $x_0$ with $x_1$. If $\gamma_0$ and $\gamma_1$ maximize the time-separation (or "length") in their causal homotopy class $\mathcal{C}$ (that is, there is no timelike curve in $\mathcal{C}$), then there exists a causal homotopy connecting them through longitudinal geodesics.

Proof. By the hypothesis, all the longitudinal curves of the causal homotopy $H(\epsilon, \lambda)$ between $\gamma_0$ and $\gamma_1$ are necessarily lightlike pregeodesics [33, Prop. 10.46]. Let $\gamma_\epsilon : [\lambda_0, \lambda_1] \to M$ be the (unique) reparametrization as a geodesic of the longitudinal curve corresponding to $\epsilon \in [0, 1]$, and put $u_\epsilon = \gamma_\epsilon'(\lambda_0)$. The map $h : (\epsilon, \lambda) \mapsto \gamma_\epsilon(\lambda)$ will be continuous (and, thus, the required causal homotopy) if and only if the map
$$\epsilon \mapsto u_\epsilon, \qquad \epsilon \in [0, 1],$$
is continuous. Consider the tangent space at $\gamma(\lambda_0)$ and the canonical projection on the projective space $\pi : TM_{x_0} \to PTM_{x_0}$. The curve of directions $\epsilon \mapsto \pi(u_\epsilon)$ is continuous since $\pi(u_\epsilon) = \pi(\exp_{x_0}^{-1} H(\epsilon, \lambda))$, in any normal neighborhood of $x_0$, independently of $\lambda$. Since $\exp_{x_0}(u_\epsilon(\lambda_1 - \lambda_0)) = x_1$, the continuity of the curve $\epsilon \mapsto u_\epsilon$ follows directly from [7, Lemma 9.25] (the strong causality of $M$ is used there).
Theorem 2.1. Let $\gamma_0 : [\lambda_0, \lambda_1] \to M$ be a lightlike geodesic which connects two fixed points $x_0$, $x_1$ and maximizes the time-separation in its causal homotopy class. If there exists a distinct geodesic $\gamma_1$ in this class then $x_1$ is the first conjugate point of $x_0$ along $\gamma_0$ (and, then, along $\gamma_1$).

Proof. By the previous lemma, there exists a causal homotopy from $\gamma_0$ to $\gamma_1$ through lightlike geodesics. Thus, $\exp_{x_0}$ cannot be injective in any neighborhood of $(\lambda_1 - \lambda_0)\gamma_i'(\lambda_0)$, $i = 0, 1$, and $x_1$ is a conjugate point. Even more, it is the first one because, otherwise, $\gamma_0$ would not maximize in its causal homotopy class.
Remark 2.1. The converse implication may not hold: even though a variation of $\gamma_0$ through lightlike geodesics with variational vector field zero at the extremes will exist [33, Corollary 10.40], conjugate points are only "almost meeting points" of geodesics, that is, the longitudinal geodesics of the variation may not reach $x_1$.

Cut points on causal geodesics have been widely studied [7, Ch. 9]; recall that a lightlike geodesic ray maximizes the time-separation until its cut point.

Corollary 2.1. Let $\gamma : [0, b) \to M$ be a lightlike geodesic with a cut point $\gamma(\lambda_c)$, $\lambda_c \in (0, b)$. If $\gamma(\lambda_c)$ is not a conjugate point, then:
(1) No other lightlike geodesic which connects $\gamma(0)$ and $\gamma(\lambda_c)$ is causally homotopic to $\gamma$.
(2) If $(M, g)$ is globally hyperbolic, there exists at least one other lightlike geodesic $\hat\gamma$ (necessarily non-causally homotopic to $\gamma$) which connects $\gamma(0)$ and $\gamma(\lambda_c)$.

Proof. Assertion (1) is straightforward from Theorem 2.1. Then, (2) is a consequence of the well-known existence of a second connecting lightlike geodesic for any non-conjugate cut point on a lightlike geodesic (see [7, Theorem 9.15]).
Remark 2.2. The well-known behavior of lightlike geodesics of bidimensional de Sitter spacetime illustrates Corollary 2.1. No lightlike geodesic can have a conjugate point (because of dimension 2) but any such geodesic has a cut point, reached by a non-causally homotopic lightlike geodesic. By removing a point of this second geodesic, the necessity of the assumption of global hyperbolicity for (2) is stressed. 2.2. Homotopic cut points. For the following crucial result, we introduce an auxiliary concept: Definition 2.1. Let γ : [0, b) → M be a lightlike geodesic. The point γ (λc ), λc > 0 is the homotopic cut point along γ of γ (0), if λc is the first point such that, for any δ > 0, the restricted curve γ |[0,λc +δ] does not maximize the length in its causal homotopy class (that is, γ |[0,λc +δ] is causally homotopic to a timelike curve). Theorem 2.2. Let γ : [0, b) → M be a lightlike geodesic in a globally hyperbolic spacetime (M, g). The point γ (λc ) is the homotopic cut point along γ of γ (0) if and only if it is the first conjugate point to γ (0). Proof. (⇒). Assume that γ (λc ) is not the first conjugate point. As conjugate points on causal geodesics are discrete [7, Th. 10.77], we can choose λδ = λc + δ > λc such that no conjugate point γ (λ) appears for λ ∈ [0, λδ ]. By hypothesis on γ (λc ), there exists a timelike geodesic ρ from γ (0) to γ (λδ ) which is causally homotopic to γ |[0,λδ ] , and maximizes the time-separation in its causal homotopy class. We can assume that, for such a causal homotopy H (, λ), the longitudinal curves H are not lightlike pregeodesics close to γ (neither, in particular, equal to γ0 ); otherwise, γ (λδ ) would be conjugate to γ (0) along γ . We will need a variation h(, λ) of γ |[0,λδ ] which is piecewise C 2 (that is, continuous and C 2 on each closed rectangle corresponding to a suitable partition of the domain (, λ)), and satisfies the other technical properties of the following result, to be proved at the end. Lemma 2.2. 
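Curve $\gamma_0 = \gamma|_{[0,\lambda_\delta]}$ admits a piecewise $C^2$ causal homotopy
\[ h : [0,1] \times [0,\lambda_\delta] \to M, \qquad (\epsilon,\lambda) \mapsto \gamma_\epsilon(\lambda) \]
with fixed extremes, such that the longitudinal curves $\gamma_n := \gamma_{\epsilon_n}$ are causal for some sequence $\epsilon_n \searrow 0$, and the variational vector field $V = \partial_\epsilon|_0\,\gamma_\epsilon(\lambda)$ is not identically 0. In particular, V satisfies:
\[ V(0) = V(\lambda_\delta) = 0, \qquad V \not\equiv 0, \qquad g(V,\gamma') \equiv 0. \]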
We remark that, a priori, the longitudinal curves are causal only for the sequence {γn }, that is, the variation may be “non-admissible”, in the terminology of [7, Sect. 10.3] (recall also Remark 2.1).
Recall that, according to [7, Dfns. 10.47, 10.49, 10.54, 10.57, 10.59], V induces a class [V] which belongs to the domain X0(γ0) of the quotient index form Ī, i.e., the index form defined on the piecewise smooth sections (vanishing at the extremes) of the quotient bundle G(γ) defined by taking vector fields on γ modulo γ′. As each γn is causal (footnote 2), g(γ_ϵ′, γ_ϵ′) is non-decreasing in ϵ at ϵ = 0, and
\[ 0 \le \frac{1}{2}\int_0^{\lambda_\delta} \frac{\partial^2}{\partial\epsilon^2}\Big|_{0}\, g(\gamma_\epsilon'(\lambda),\gamma_\epsilon'(\lambda))\,d\lambda = I(V,V) = \bar I([V],[V]) \]
(see also [33, pp. 289-290]), in contradiction with [7, Th. 10.69].
(⇐). Obviously, the homotopic cut point of γ(0) cannot appear beyond the first conjugate point. But, by the implication already proved, it cannot appear before this point either.
Proof of Lemma 2.2. Fix a finite family of convex neighborhoods of M which covers the image of γ0, and choose 0 = λ0 < λ1 < · · · < λk < λk+1 = λδ such that γ([λi, λi+1]) is included in one of these neighborhoods, Ui, for all i = 0, . . . , k. Notice that, taking normal coordinates on each Ui, the tangent space at γ0(λi) can be identified with Rⁿ. Moreover, as
\[ \lim_{\epsilon\to 0} H_\epsilon(\lambda_i) = \gamma_0(\lambda_i) \]
and the set of directions in Rⁿ is compact, there exist a sequence $\{\epsilon^i_n\} \searrow 0$ and a $C^1$-curve $\alpha_i$ such that $\alpha_i(\epsilon^i_n) = H_{\epsilon^i_n}(\lambda_i)$ for all n. By taking each sequence $\{\epsilon^i_n\}$ as a subsequence of $\{\epsilon^{i-1}_n\}$, we can assume that all the sequences are equal, $\epsilon_n \equiv \epsilon^i_n$. Iterating the process, $\alpha_i$ can be chosen $C^r$, for any finite r > 1. Let us show that $\alpha'_{i_0}(0) \neq 0$ can also be assumed for some $i_0$. Indeed, otherwise, as the curves $H_\epsilon$ are different from γ0 for small ϵ, we can find another $C^2$ reparametrization of some $\alpha_{i_0}$ such that its velocity at 0 does not vanish, and consider this new parameter as the original transversal parameter of the homotopy H. Now, choose as variation $\gamma_\epsilon(\lambda)$ of γ0 (for small ϵ) the homotopy h defined as follows: $\gamma_\epsilon$ is the unique broken geodesic which joins $\alpha_i(\epsilon)$ and $\alpha_{i+1}(\epsilon)$ in Ui when λ varies between λi and λi+1. Let $v_i(\epsilon)$ be the tangent vector at λi of such a $\gamma_\epsilon$ restricted to [λi, λi+1]. As each pair of curves $\alpha_i$, $\alpha_{i+1}$ is $C^2$, the curve in TM which maps each ϵ to $v_i(\epsilon)$ is $C^2$ too (use [33, Lemma 5.9]). Thus, the homotopy h can be written as
\[ h(\epsilon,\lambda)|_{\lambda\in[\lambda_i,\lambda_{i+1}]} = \gamma_\epsilon(\lambda)|_{[\lambda_i,\lambda_{i+1}]} = \exp_{\alpha_i(\epsilon)}\big((\lambda-\lambda_i)\,v_i(\epsilon)\big) \]
for small ϵ, and all the required properties follow.
² With our sign convention (+, −, . . . , −), different from [7, 33].

Remark 2.3. The definition of homotopic cut point can obviously be extended to timelike geodesics and to geodesics in Riemannian manifolds, but the analogue of Theorem 2.2 would not hold, in general. In fact, it is easy to construct a Riemannian manifold (S, dl²) with a closed geodesic c : [0, 2] → S, c(0) = c(2), without conjugate points, such that its middle point c(1) is the cut point, and the two pieces of the geodesic, c0 = c|[0,1] and c1(λ) = c(2 − λ), ∀λ ∈ [0, 1], are homotopic with fixed extremes p0 = c(0), p1 = c(1). Concretely, let S be the following surface embedded in Euclidean space R³ (with natural coordinates (x, y, z)), with induced metric dl². S is obtained by gluing the semicylinder x² + y² = r², z ≤ 0, with a spherical cap of radius r, x² + y² + z² = r², z ≥ 0. The metric dl² is C¹ but can even be made smooth by suitably redefining the cap near the equator (this manifold will be used in the next examples; it can also be replaced by a paraboloid, with straightforward modifications). The required geodesic would then be c(λ) = (r cos πλ, r sin πλ, −1), with p0 = (r, 0, −1), p1 = (−r, 0, −1). This counterexample for Riemannian geodesics is extended to timelike geodesics in the following example, which also shows the possible existence of different timelike homotopy classes in a single causal one.

Example 2.1. Consider the Riemannian surface S and the points p0, p1 as above, and define the spacetime M = R × S, g = dt² − dl². Choose T > πr, and take z0 = (0, p0), z1 = (T, p1) ∈ M. The timelike geodesics σi(λ) = (Tλ, ci(πλ)), i = 0, 1, maximize the time-separation between z0 and z1, and do not have conjugate points. If T > 2 + πr then they are timelike homotopic, and if T = 2 + πr then they are causally homotopic, but not timelike homotopic (if S is smoothed as suggested above, the critical value T = 2 + πr must be replaced by T = L, where L satisfies: (i) any smooth homotopy in S between c0 and c1 contains a curve of length L, (ii) no L′ > L satisfies (i)).

2.3. Arc-connectedness of the closure of the classes. Now, let us point out some properties of causal homotopy classes in globally hyperbolic spacetimes, in relation to Uhlenbeck's study of the timelike case [41], carefully developed further by Beem et al. [7]. Our main aim is to show an appropriate sense of compactness of the piecewise geodesics in timelike homotopy classes (Theorem 2.3, Remark 2.4), which will be directly extendible to the fibered classes in Sect. 4. Throughout this subsection, (M, g) will be globally hyperbolic, and a temporal Cauchy function t : M → R (see the Appendix) is chosen to parametrize all the causal curves. Two points x0 < x1 will also be fixed and, as it is not restrictive to assume t(xi) = i, i = 0, 1, all the connecting causal curves will be parametrized in [0, 1]. Following [7, Sect.
10.2] (our basic reference throughout this subsection), there exist N > 0 and a partition 0 = t0 < t1 < · · · < tN−1 < tN = 1 of [0, 1] such that for any causal chain (z0, . . . , zN), zi ∈ M, zi < zi+1, with t(zi) = ti and z0 = x0, zN = x1, there exists one and only one maximal causal geodesic connecting zi with zi+1 (thus, (z0, . . . , zN) can be identified with the piecewise causal geodesic obtained by connecting zi with zi+1). Let Mx0 ,x1 be the space of such causal chains and Mx0 ,x1 the subset containing all the chronologically related chains (i.e., zi ≪ zi+1). Clearly, Mx0 ,x1 and Mx0 ,x1 are subsets of a product of N − 1 Cauchy hypersurfaces Si (the fixed extremes of the chain can be disregarded) and, then, inherit a topology. By using that the relation ≤ is closed on convex neighborhoods, it is straightforward to check:

Proposition 2.1. (1) The space of causal chains Mx0 ,x1 is compact. (2) The space of chronological chains Mx0 ,x1 is open in S1 × · · · × SN−1, and its closure is included in Mx0 ,x1 .

One can check, reasoning for Mx0 ,x1 as in [7, Prop. 10.36] for Mx0 ,x1 , that there exists a well-defined length non-decreasing retraction from the set Nx0 ,x1 of all the t-parametrized causal curves connecting x0 and x1 to Mx0 ,x1 (the topology of Nx0 ,x1 can be chosen as the topology associated to the uniform distance d, i.e., d(γ1, γ2) = Max{dR(γ1(t), γ2(t)) : t ∈ [0, 1]}, where dR is the distance associated to any auxiliary Riemannian metric). Given an arc-connected component Ux0 ,x1 (resp. Ux0 ,x1 ) of Mx0 ,x1 (resp. Mx0 ,x1 ), its piecewise smooth geodesics are timelike (resp. causally) homotopic. In fact, any
continuous curve (z0(ϵ), . . . , zN(ϵ)) in Ux0 ,x1 (resp. Ux0 ,x1 ) joining two given chains also yields the required homotopy between the associated piecewise geodesics. We will be interested just in the problem of the invariance of the arc-connected components under limits. More precisely, recall that, as Mx0 ,x1 is an open subset of a manifold, its connected and arc-connected components are equal. Moreover, the following property (which is false for open subsets of R^N, in general) holds:

Lemma 2.3. Any chain (z0,∞, . . . , zN,∞) in the closure of a (arc-)connected component Ux0 ,x1 (resp. Ux0 ,x1 ) of Mx0 ,x1 (resp. Mx0 ,x1 ) can be connected to any element of Ux0 ,x1 (resp. Ux0 ,x1 ) by means of an arc totally included in Ux0 ,x1 (resp. Ux0 ,x1 ), except one extreme.

Proof. We will reason for Ux0 ,x1 , the argument being analogous for Ux0 ,x1 . Consider any converging sequence of chronological chains in Ux0 ,x1 , {(z0,k, . . . , zN,k)}k → (z0,∞, . . . , zN,∞). Let Ni be the convex neighborhood which contains zi,∞, zi+1,∞, and lighten the notation by putting xk = zi,k, yk = zi+1,k, for k = 1, 2, . . . , ∞. The result follows immediately by applying the following claim recurrently for i = 0, . . . , N − 1:

Claim. Let xk ≪ yk, k ∈ N, and x∞ < y∞ be as above. If there exist a continuous curve α : [0, 1] → Ni and a decreasing sequence λk → 0 such that xk = α(λk) for large k, then there exists a continuous curve β : [0, 1] → Ni such that, for some ϵ ∈ (0, 1]:
\[ \alpha(\lambda) \ll \beta(\lambda), \qquad \beta(\lambda_k) = y_k, \qquad \forall \lambda, \lambda_k \in (0,\epsilon). \]
Notice that this claim is obvious in the particular case x∞ ≪ y∞. For the general case, let vk be the velocity at zero of the unique timelike geodesic (causal, when k = ∞), defined on [0, 1], from xk to yk. As the bundle of the future timelike cones on α is arc-connected, we can find a continuous timelike vector field V on α such that V(λk) = vk. The required curve is then β(λ) = exp_{α(λ)}(V(λ)).
Recall that, in the case of Ux0 ,x1 , the a priori excluded extreme will also belong to Ux0 ,x1 . Summing up: Theorem 2.3. Let (M, g) be globally hyperbolic. (1) The closure U x0 ,x1 of any connected component Ux0 ,x1 of Mx0 ,x1 is arc-connected. (2) Any arc-connected component Ux0 ,x1 of Mx0 ,x1 is closed. Remark 2.4. In particular, when {γk }k∈N is a sequence of causal geodesics in the same causal (resp. timelike) homotopy class, such that its initial velocities converge to the velocity of a geodesic γ0 , then γ0 belongs to the same causal homotopy class (resp. to the adherence of the timelike homotopy class – which is included in the same causal homotopy class of the sequence). 3. Connectedness Through Solutions to the LFE In the following sections we shall consider three kinds of manifolds: the spacetime M of dimension n0 ≥ 2, the Kaluza-Klein spacetime P of dimension n0 + 1 and (in some examples) a spacelike hypersurface S of dimension n0 − 1. We adopt the convention of denoting curves belonging to P with γ , curves belonging to M with σ or x, and curves belonging to S with c.
3.1. Avez-Seifert type problem for the LFE. Consider on M a fixed (exact) electromagnetic field F = dω, where ω is any differential 1-form. A point particle of rest mass m > 0 and electric charge q ∈ R, moving under F, has a timelike worldline which satisfies the Lorentz force equation (LFE) (cf. [32, Sect. 3.1], [20, Sect. 11.9] or [26, Sect. 23])
\[ D_s \frac{dx}{ds} = \frac{q}{m}\,\hat F(x)\!\left[\frac{dx}{ds}\right]. \tag{1} \]
Here the units are such that c = 1, x = x(s) is the world line of the particle parameterized with proper time, dx/ds is the velocity, D_s(dx/ds) is the covariant derivative of dx/ds along x(s) associated to the Levi-Civita connection of g, and F̂(x)[·] is the linear map on TxM defined by g(x)[v, F̂(x)[w]] = F(x)[v, w], for any v, w ∈ TxM. We remark that s must be the proper time parametrization; otherwise the constant q/m in front of F̂ cannot be interpreted as the charge-to-mass ratio of the particle. Recall that, for the LFE, the ratio q/m is fixed, but the individual values of q and m become irrelevant (footnote 3). As commented in the introduction, question (Q) becomes natural now. For the case q/m = 0, the LFE is just the geodesic equation, and the solution to question (Q) is well known [7, Theorem 3.18, Prop. 10.39], [33, Prop. 14.19], [19, Prop. 6.7.1]:

Theorem 3.1. (Avez [3], Seifert [39]). Let (M, g) be a globally hyperbolic spacetime, and x0 ≤ x1 two causally related events. Then, in each causal homotopy class Cx0,x1, there exists a causal geodesic σ which maximizes the length among the causal curves in the class. In particular, if x0 ≪ x1 the two points can be connected by means of a timelike geodesic (in fact, by one for each timelike homotopy class in Cx0,x1, as will be apparent below). If x1 ∈ E⁺(x0) = J⁺(x0)\I⁺(x0) then x0 and x1 can still be joined by a lightlike geodesic, but this case does not make sense for the LFE.
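As a small numerical illustration of the map F̂ appearing in the LFE (1), the following sketch works in 4-dimensional Minkowski spacetime with the sign convention (+, −, −, −). The field components, the sample vectors and all helper names are hypothetical, chosen only for illustration. The check is that F̂ = g⁻¹F satisfies the defining relation g(v, F̂w) = F(v, w), and that g(w, F̂w) = F(w, w) = 0, which is the reason why g(dx/ds, dx/ds) is preserved along solutions of (1).

```python
# Minkowski metric, signature (+, -, -, -); this matrix is its own inverse.
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matvec(m, v):
    """Apply the 4x4 matrix m to the vector v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def bilinear(m, v, w):
    """Evaluate the bilinear form m(v, w) = sum_{i,k} v^i m_{ik} w^k."""
    return sum(v[i] * m[i][k] * w[k] for i in range(4) for k in range(4))

def hat(f):
    """Matrix of F^ defined by g(v, F^ w) = F(v, w), i.e. F^ = g^{-1} F."""
    return [[sum(G[j][i] * f[i][k] for i in range(4)) for k in range(4)]
            for j in range(4)]

# Hypothetical antisymmetric components F_{ik} (illustrative values only).
F = [[0, 2, -1, 0.5], [-2, 0, 3, -4], [1, -3, 0, 1.5], [-0.5, 4, -1.5, 0]]
F_hat = hat(F)
v = [1.0, 0.3, -0.2, 0.7]
w = [2.0, -1.0, 0.4, 0.1]

lhs = bilinear(G, v, matvec(F_hat, w))        # g(v, F^ w)
rhs = bilinear(F, v, w)                       # F(v, w)
norm_rate = bilinear(G, w, matvec(F_hat, w))  # g(w, F^ w) = F(w, w)
```

The antisymmetry of F is what forces `norm_rate` to vanish, so a proper-time parametrization is consistent with (1).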
³ Sometimes the mass of the particle is assumed to be known, and the curve x is parametrized with r = s/m. In this case the LFE becomes equivalent to the system [36, Defs. 3.1.1 and 3.8.1] $D_r \frac{dx}{dr} = q\,\hat F(x)\big[\frac{dx}{dr}\big]$, $\big|\frac{dx}{dr}\big| = m$.

One can also ask about the connectedness of x0, x1 by means of a geodesic even if they are not causally related, as in the variational frameworks described below. Although this question has a geometrical interest (see for instance the survey [37]), it has neither a direct physical interpretation nor an equivalent for the LFE.

3.2. Related variational problems. Question (Q) can be approached as a variational one [26, Sect. 16]. Indeed, let Nx0,x1 be the set of all (piecewise) C¹ causal curves σ : [0, 1] → M from x0 to x1 > x0. Using F = dω, consider the functional Ix0,x1 on Nx0,x1,
\[ I_{x_0,x_1}[\sigma] = \int_\sigma \Big( ds + \frac{q}{m}\,\omega \Big) = \int_0^1 \Big( \sqrt{g(\sigma'(\lambda),\sigma'(\lambda))} + \frac{q}{m}\,\omega(\sigma'(\lambda)) \Big)\,d\lambda, \tag{2} \]
for all σ ∈ Nx0,x1. The functional is invariant under monotonic reparametrizations of σ; in this sense, when talking about critical points, we may refer to non-parametrized curves. The connecting timelike solutions of the LFE (1), if they exist, are critical points of this functional, as follows from a computation of the Euler-Lagrange equation. Conversely, every timelike extremal of this functional, once parametrized with respect to proper time, is a solution of the LFE. In the geodesic case q/m = 0, it is convenient to replace the functional Ix0,x1 by the "energy" functional
\[ E[\sigma] = \frac{1}{2}\int_0^1 g(\sigma'(\lambda),\sigma'(\lambda))\,d\lambda \tag{3} \]
for several reasons: (i) the critical curves of this functional are parametrized directly as geodesics, and (ii) the domain of the functional can be enlarged to include non-causal curves (making sense also for non-causally related x0, x1) and, then, non-causal connecting geodesics also become critical points. Moreover, the choice of extremes λ0 = 0, λ1 = 1 simplifies the domain of curves without loss of generality. Nevertheless, if one only knows that there exists a critical curve σ0 of (3), neither the causal character of σ0 nor (if the curve were timelike) the time of arrival would be known a priori. In the case of the LFE, one can also consider a functional, introduced in [8], which is related to the action functional and closer to (3):
\[ J_{x_0,x_1}[\sigma] = \int_0^1 \Big( \frac{1}{2}\,g(\sigma'(\lambda),\sigma'(\lambda)) + b\,\omega(\sigma'(\lambda)) \Big)\,d\lambda, \tag{4} \]
on the space of all the (absolutely continuous) curves, not necessarily causal, which connect x0 and x1 in the interval [0, 1]. Concretely, Bartolo and Antonacci et al. [2, 5] studied the connectedness of the whole spacetime by means of critical points (not necessarily causal) of this functional, and further results were obtained in subsequent references (see, for example, [6, 13, 11, 16, 31] or the detailed account in [30]). Remarkably, in [14, 15] the authors were able to prove, under global hyperbolicity, the existence of at least one (uncontrolled) value of q/m such that a timelike connecting solution of the associated LFE does exist. Indeed, a timelike critical point σ0 (if it exists) is a solution of the Lorentz force equation for some (uncontrolled) ratio q/m. This follows from the Euler-Lagrange equation for Jx0,x1,
\[ D_\lambda \sigma' = b\,\hat F(x)[\sigma']. \tag{5} \]
In particular, ds/dλ = C is a constant and, thus, σ0 would satisfy the LFE with charge-to-mass ratio q/m = b/C. However, C depends on the critical curve σ0 (C = ∫_{σ0} ds), which in turn depends on the coefficient b in an uncontrolled way. Summing up, the variational approach for the functional Jx0,x1, even though mathematically appropriate to study non-timelike curves, presents the following two limitations for our question (Q): (a) For chronologically related points, one cannot easily control the causal character of the critical points. (b) Even if the critical point is proved to be timelike, one cannot know a priori its charge-to-mass ratio, and in the end it could correspond to an unphysical value.
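The passage from (5) to the ratio q/m = b/C can be spelled out as follows. This is only a sketch, assuming a timelike critical curve σ0 of (4) and using that g(σ′, F̂(x)[σ′]) = F(σ′, σ′) = 0, so that g(σ′, σ′) = C² is constant along solutions of (5):

```latex
% Reparametrize by proper time, s = C\lambda, so that d/d\lambda = C\, d/ds:
\sigma' = C\,\frac{dx}{ds},
\qquad
D_\lambda \sigma' = C^{2}\, D_s \frac{dx}{ds}.
% Substituting in (5), D_\lambda \sigma' = b\,\hat F(x)[\sigma']:
C^{2}\, D_s \frac{dx}{ds} = b\,C\,\hat F(x)\!\left[\frac{dx}{ds}\right]
\;\Longrightarrow\;
D_s \frac{dx}{ds} = \frac{b}{C}\,\hat F(x)\!\left[\frac{dx}{ds}\right],
\qquad
C = \int_{\sigma_0} ds .
```

Comparing with (1) gives q/m = b/C; since C is the proper length of σ0 itself, the ratio is only known after the critical curve has been found.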
3.3. Known results on (Q), and a counterexample. An approach conceived to study directly the physical question (Q) started in [17, 29], where the authors considered the solutions to the LFE as projections of suitable lightlike geodesics for a Kaluza-Klein metric. The relation between the LFE and the projections of timelike geodesics for a higher dimensional Kaluza-Klein spacetime is well known [21–24, 27, 28]. However, it proved more useful for question (Q) to consider the solutions of the LFE as projections of lightlike geodesics in a Kaluza-Klein spacetime with a scale factor (for the additional dimension) proportional to the charge-to-mass ratio. Indeed this allowed to prove the existence of solutions having an a priori fixed charge-to-mass ratio in most cases. The results in [29] improve those in [17]; the best achieved result is then: Theorem 3.2. Let (M, g) be a globally hyperbolic spacetime, and F = dω be an electromagnetic field on M. Let x1 be an event in the chronological future of x0 and q/m ∈ R − {0} any charge-to-mass ratio. Then there exists a future-directed causal curve σ0 which connects x0 and x1 and maximizes the functional Ix0 ,x1 on Nx0 ,x1 . Moreover, σ0 is everywhere timelike or lightlike. In the former case, the reparametrization of σ0 with respect to proper time becomes a solution of the LFE (1); in the latter case, σ0 is a lightlike geodesic. Even though the hypotheses of this theorem are optimal, it is not completely satisfactory for our question (Q) because Theorem 3.2 does not forbid the maximizing curve σ0 to be a lightlike geodesic. We present below an example of such a situation, even in a simply connected (and contractible) spacetime. Summing up: if a connecting lightlike geodesic exists, Theorem 3.2 does not answer our question (Q). Example 3.1. Consider the 3-dimensional spacetime M = R × S, ds 2 = dt 2 − dl 2 , t ∈ R, in Example 2.1, Remark 2.3. 
Let F = dω be independent of t, with ω ≡ 0 in the cap of S and, on the cylinder, ω = Brµ(z)dθ, where θ is the angle, B ∈ R and µ(z) is a smooth monotone decreasing function such that µ(−2πr) = 1, µ(−πr) = 0. Notice that F is different from zero only in the set ℝ × R, where the ribbon R ⊂ S is defined by R = {q ∈ S : −2πr < z < −πr}. Let p ∈ S be defined by the R³ coordinates (r, 0, −3πr), and let x0 = (0, p), x1 = (2πr, p). The events x0 and x1 are connected by two lightlike geodesics σ1, σ2. Their projections c̄, c̃ differ only in the orientation of their parametrization, as their image is the circle x² + y² = r², z = −3πr. Let σL be a generic timelike connecting curve such that its projection cL on S has length L. Since σL is timelike, dl/dt < 1, and since it is also connecting we have L < 2πr. This inequality implies that any timelike connecting curve has a projection c completely contained in the region z ≤ −2πr (c belongs to the identity of the homotopy group of the cylindrical part of S, with base point p). In particular, timelike connecting curves cannot enter the region of the non-vanishing electromagnetic field. Thus, there are three causal homotopy classes: two classes Ci for the two isolated lightlike geodesics σi, i = 1, 2, and a further causal class C containing all the timelike connecting curves (this last class contains lightlike curves, which are not geodesics). Roughly speaking, the curves in C cannot be deformed to the lightlike geodesics σi since the projections would reach the cap, and hence the deformed curves would become non-causal. We shall now show that, for any |(q/m)B| > 1, the absolute maximum and minimum of Ix0,x1[σ] are reached at the geodesics σi. First, notice that the electromagnetic term of the action can be rewritten
\[ \frac{q}{m}\int_\sigma \omega = \frac{q}{m}\,B r \int_c \mu(z)\,d\theta, \tag{6} \]
which vanishes on C, and is equal to ±2π(q/m)Br on the geodesics σi. Thus, on the class C the action functional is equivalent to the length functional, which is bounded by the length 2πr of the maximizing geodesic (the particle at rest at p), as required.
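The comparison made in Example 3.1 can be checked with a few lines of arithmetic. The sketch below uses hypothetical sample values of r, B and q/m with |(q/m)B| > 1 (the function names are also illustrative): on the lightlike geodesics σi the length term vanishes and only the electromagnetic term ±2π(q/m)Br from (6) survives, while on the class C the electromagnetic term vanishes and the action lies between 0 and 2πr.

```python
import math

def action_on_lightlike_geodesic(sign, q_over_m, B, r):
    """I[sigma_i]: zero length term plus the term +-2*pi*(q/m)*B*r from (6)."""
    return sign * 2 * math.pi * q_over_m * B * r

def action_bounds_on_class_C(r):
    """On C the electromagnetic term vanishes, so 0 <= I <= 2*pi*r."""
    return 0.0, 2 * math.pi * r

# Hypothetical sample values satisfying |(q/m) B| > 1.
r, B, q_over_m = 1.0, 1.5, 1.0
I_plus = action_on_lightlike_geodesic(+1, q_over_m, B, r)
I_minus = action_on_lightlike_geodesic(-1, q_over_m, B, r)
low_C, high_C = action_bounds_on_class_C(r)
```

With these values the action on one lightlike geodesic exceeds every value attained on C and the action on the other lies below all of them, which is the situation that Theorem 3.2 cannot exclude.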
4. Maximization of Ix0,x1 over Homotopy Classes

First, let us introduce a trivial principal bundle P = M × R, Π : P → M, with structural group (R, +): b ∈ R, p = (x, y), p′ = pb = (x, y + b). Let y be the fiber coordinate and β be a positive dimensional constant. Given a potential 1-form ω on M and a connection ω̃ = dy + βω on P, consider the Kaluza-Klein metric
\[ \tilde g = g - a^{2}\,\tilde\omega^{2}, \tag{7} \]
and choose the scale factor a as $a = \beta^{-1}\,|q/m|$. The actual value of the dimensional constant β will have no role in our work (footnote 4). Fix x0 ≤ x1 and a causal homotopy class Cx0,x1 of Nx0,x1. We are looking for critical curves of (2) on Cx0,x1 (and, thus, on Nx0,x1).

⁴ It should be said that in the physical Kaluza-Klein theory one usually chooses $a = \beta^{-1}\sqrt{16\pi G}$, where G is the Newton constant, to obtain the correct coupling between gravity and electromagnetism. This choice is obviously incompatible with our constraint (even approximately, because both imply $\frac{1}{\sqrt{16\pi G}}\,\big|\frac{q}{m}\big| = 1$, and for realistic particles this coefficient is huge). However, note that the Kaluza-Klein spacetime is used by us only as a technical tool: given q/m we define a, and so a changes with the particle considered; the spacetime P is not meant to be a physical spacetime.

4.1. Causal homotopy classes in a K-K bundle. Let Cp0,x1 denote a (continuous) causal homotopy class of (piecewise C¹) curves on P, starting at some p0 ∈ Π⁻¹(x0) and ending in Π⁻¹(x1) where, now, the homotopy does not keep the second endpoint fixed.

Proposition 4.1. Fix x0 ≤ x1 and p0 ∈ Π⁻¹(x0). (1) If γ1, γ2 ∈ Cp0,x1 then σ1 = Π ∘ γ1 and σ2 = Π ∘ γ2 belong to the same class Cx0,x1. (2) If γ1 and γ2 project onto curves σ1 = Π ∘ γ1 and σ2 = Π ∘ γ2 which are causally homotopic, then γ1 and γ2 belong to the same Cp0,x1. Thus, the projection Π : P → M sends homotopy classes of type Cp0,x1 to homotopy classes of type Cx0,x1, and induces a bijective map between classes of type Cp0,x1 and Cx0,x1.

Proof. Assertion (1) is obvious. Notice also that it ensures that a map between homotopy classes is induced. The surjectivity of this map holds because, for any future-directed causal curve σ in M which connects x0 and x1, there exists a future-directed causal curve σ̃ in P (i.e., the horizontal lift) starting at p0 and projecting onto σ. For (2) (which ensures injectivity), consider first the case when the γi's project onto the same curve σ, that is,
\[ \gamma_1(\lambda) = (\sigma(\lambda), y_1(\lambda)), \qquad \gamma_2(\lambda) = (\sigma(\lambda), y_2(\lambda)), \qquad \forall\lambda\in[0,1]. \]
The map
\[ H(\epsilon,\lambda) = \big(\sigma(\lambda),\ \epsilon\,y_1(\lambda) + (1-\epsilon)\,y_2(\lambda)\big) \tag{8} \]
is a causal homotopy without the second point fixed, as required. If the projections σi are different, consider their horizontal lifts σ̃i starting at p0. Then each σ̃i is continuously causally homotopic to γi (from the previous case), and the lifting of the causal homotopy between σ1 and σ2 is clearly a causal homotopy between σ̃1 and σ̃2.
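The reason why the interpolation (8) stays causal is that, for a fixed causal σ′, the admissible fiber velocities y′ form an interval: by (7), the vector (σ′, y′) is causal exactly when a²(y′ + βω(σ′))² ≤ g(σ′, σ′). The following minimal sketch tests this pointwise at a single parameter value; the numbers and function names are hypothetical, chosen only for illustration.

```python
def is_causal(speed, omega_dot, y_dot, a, beta, tol=1e-12):
    """Causality of (sigma', y') for g~ = g - a^2 (dy + beta*omega)^2,
    where speed = |sigma'| and omega_dot = omega(sigma')."""
    return speed ** 2 - a ** 2 * (y_dot + beta * omega_dot) ** 2 >= -tol

def interpolate(eps, y1_dot, y2_dot):
    """Fiber velocity of the longitudinal curve H_eps in (8)."""
    return eps * y1_dot + (1 - eps) * y2_dot

# Hypothetical sample data at one value of the parameter lambda.
a, beta = 4.0, 0.5
speed, omega_dot = 0.8, -0.3
y1_dot, y2_dot = -0.05, 0.30   # both causal; the first one is lightlike

all_causal = all(
    is_causal(speed, omega_dot, interpolate(k / 10, y1_dot, y2_dot), a, beta)
    for k in range(11)
)
outside_interval = is_causal(speed, omega_dot, 0.5, a, beta)
```

Every convex combination of the two causal fiber velocities remains causal, whereas a value outside the interval (here y′ = 0.5) does not.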
Remark 4.1. An analogous result holds for timelike classes of type Cp0 ,x1 . In general, all the study of Subsect. 2.3 for the spaces containing piecewise geodesics Mx0 ,x1 , Mx0 ,x1 , and its arc-connected components Ux0 ,x1 , Ux0 ,x1 (constructed from causal and time path spaces from x0 to x1), can be extended to analogous spaces containing piecewise geodesics from p0 to Π⁻¹(x1), namely, Mp0 ,x1 , Mp0 ,x1 , and its arc-connected components Up0 ,x1 , Up0 ,x1 . Just take into account: (1) If t is a Cauchy temporal function on M then so is t̃ = t ∘ Π on P (see the Appendix). In particular, J⁺(p0) ∩ Π⁻¹(x1) is compact because, for any chosen Cauchy hypersurface S of M through x1, it can be written as (J⁺(p0) ∩ Π⁻¹(S)) ∩ Π⁻¹(x1) (the intersection between a compact and a closed subset). (2) Analogues of Theorem 2.3 and Remark 2.4 can be stated and, in particular, the limit of geodesics in a class Cp0 ,x1 (resp. Cp0 ,x1 ) also belongs to this class (resp. to the closure of this class).

4.2. A Fermat-type equivalent problem. Given σ : [0, 1] → M in Cx0,x1, consider the two lightlike lifts
\[ \tilde\sigma^{\pm}(\lambda) = (\sigma(\lambda), y^{\pm}(\lambda)), \qquad \lambda\in[0,1] \]
of σ in P starting at a fixed p0 = (x0, y0) ∈ Π⁻¹(x0). Explicitly, the requirement to be lightlike implies
\[ (y^{\pm})'(\lambda) = \mp\frac{1}{a}\,|\sigma'(\lambda)| - \beta\,\omega(\sigma'(\lambda)) \tag{9} \]
and, thus:
\[ \tilde\sigma^{\pm}(\lambda) = \Big(\sigma(\lambda),\ y_0 \mp \frac{1}{a}\int_0^{\lambda}\sqrt{g(\sigma'(\bar\lambda),\sigma'(\bar\lambda))}\,d\bar\lambda - \beta\int_0^{\lambda}\omega(\sigma'(\bar\lambda))\,d\bar\lambda\Big). \tag{10} \]
Then, the fiber coordinate $Y_1^{\pm}[\sigma]$ of the final point of this curve is, essentially, the action functional:
\[ Y_1^{\pm}[\sigma] = y^{\pm}(1) = y_0 \mp \frac{1}{a}\Big(\int_\sigma ds + (\pm|q/m|)\int_\sigma \omega\Big). \tag{11} \]
Comparing this expression with the one of Ix0,x1, it is obvious that a maximization on Cx0,x1 of Ix0,x1 relative to the ratio +|q/m| (resp. −|q/m|) corresponds to a minimization (resp. maximization) of $Y_1^{+}[\sigma]$ (resp. $Y_1^{-}[\sigma]$). Summing up:

Theorem 4.1. The curve σ0 is a maximum of Ix0,x1 on Cx0,x1 for q/m > 0 (resp. q/m < 0) if and only if σ0 is a minimum (resp. maximum) of the arrival coordinate
\[ Y_1^{+} : C_{x_0,x_1} \to \mathbb{R}, \quad \sigma \mapsto Y_1^{+}(\sigma) \qquad (\text{resp. } Y_1^{-} : C_{x_0,x_1} \to \mathbb{R}, \quad \sigma \mapsto Y_1^{-}(\sigma)). \]
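Formulas (9) and (11) can be cross-checked numerically in the simplest situation, where |σ′| and ω(σ′) are constant on [0, 1], so that the integrals in (10) reduce to products. The sketch below is only an illustration under that assumption; the sample values and function names are hypothetical. It checks that the lifts given by (9) are lightlike for the metric (7) and that integrating (9) reproduces the arrival coordinate (11) with a = β⁻¹|q/m|.

```python
def fiber_velocity(sign, speed, omega_dot, a, beta):
    """(y^+-)' from (9); sign = +1 for y^+, -1 for y^-; speed = |sigma'|."""
    return -sign * speed / a - beta * omega_dot

def kk_norm(speed, omega_dot, y_dot, a, beta):
    """g~(gamma', gamma') = g(sigma', sigma') - a^2 (y' + beta*omega(sigma'))^2."""
    return speed ** 2 - a ** 2 * (y_dot + beta * omega_dot) ** 2

def arrival_coordinate(sign, y0, proper_time, omega_integral, abs_q_over_m, beta):
    """Y_1^+- from (11), with the scale factor a = |q/m| / beta."""
    a = abs_q_over_m / beta
    return y0 - sign * (proper_time + sign * abs_q_over_m * omega_integral) / a

# Hypothetical sample data: constant |sigma'| and omega(sigma') on [0, 1].
beta, abs_q_over_m, y0 = 0.5, 2.0, 0.0
a = abs_q_over_m / beta
speed, omega_dot = 0.8, -0.3

results = {}
for sign in (+1, -1):
    y_dot = fiber_velocity(sign, speed, omega_dot, a, beta)
    results[sign] = (
        kk_norm(speed, omega_dot, y_dot, a, beta),   # should vanish
        y0 + y_dot,                                  # y^+-(1) by integrating (9)
        arrival_coordinate(sign, y0, speed, omega_dot, abs_q_over_m, beta),
    )
```

For both signs the first entry vanishes up to rounding and the last two entries agree, which is the content of the identity (11).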
4.3. The maximization result. The above variational principle reduces our problem to ensuring the existence of maxima or minima for $Y_1^{\pm}$. The following result yields two candidates.

Lemma 4.1. The set K containing the points in Π⁻¹(x1) reached by means of a causal curve in Cp0,x1 is a compact interval $K = \{x_1\} \times [\underline{y}_1, \bar{y}_1]$, $\underline{y}_1 \le \bar{y}_1$. Moreover, p0 is connectable with each of the two extremes of K by means of a lightlike geodesic γ± (necessarily without conjugate points before the endpoint) in Cp0,x1.

Proof. The arc-connectedness of K is straightforward from the existence of a causal homotopy between any two curves in Cp0,x1. For the compactness of K, recall that, as J⁺(p0) ∩ Π⁻¹(x1) is compact (Remark 4.1, Item (1)), any Cauchy sequence {(x1, yk)} in K will have a limit (x1, y∞) in Π⁻¹(x1). By the Avez-Seifert theorem, there exists a maximizing causal geodesic γk ∈ Up0,x1 ⊂ Cp0,x1 connecting p0 and each (x1, yk). Then, the limit curve γ0 of the sequence {γk} will be a causal geodesic in Cp0,x1 too, and it will cross the Cauchy hypersurface Π⁻¹(S) at some point, necessarily (x1, y∞). Thus, γ0 belongs to the same causal homotopy class (Remark 4.1, Item (2)). For the last assertion, notice that the extreme $\underline{y}_1$ (resp. $\bar{y}_1$) is connectable with p0 by means of a length-maximizing causal geodesic γ⁺ (resp. γ⁻). Moreover, γ± must be lightlike because, otherwise, an open neighborhood of the final point of γ± in Π⁻¹(x1) would lie in K. Finally, a conjugate point cannot exist because γ± is maximizing (see Theorem 2.2).
Lemma 4.2. Assume q/m > 0 (resp. < 0), and let γ(λ) = (σ(λ), y(λ)), λ ∈ [0, 1], be the lightlike geodesic in Cp0,x1 which connects p0 and $(x_1, \underline{y}_1)$ (resp. $(x_1, \bar{y}_1)$). Then, σ is a maximum of Ix0,x1 on Cx0,x1.

Proof. From Theorem 4.1, we only have to prove γ = σ̃⁺ (resp. γ = σ̃⁻) because in this case σ is obviously a minimum (resp. maximum) of $Y_1^{+}$ (resp. $Y_1^{-}$). As γ is a geodesic and ∂y a Killing field on P, we have the constant $\nu \equiv \tilde g(\gamma', \partial_y) = -a^{2}\big(y' + \beta\,\omega(\sigma')\big)$, that is:
\[ y' = -\frac{\nu}{a^{2}} - \beta\,\omega(\sigma'). \tag{12} \]
Moreover, as the extreme $(x_1, \underline{y}_1)$ is the minimum (resp. $(x_1, \bar{y}_1)$ the maximum) in K, then ν ≥ 0 (resp. ≤ 0); otherwise, the horizontal lift of σ would end beyond the extreme. As γ is lightlike, $\tilde g(\gamma', \gamma') \equiv 0$, thus
\[ y' = \varepsilon\,\frac{|\sigma'|}{a} - \beta\,\omega(\sigma'), \tag{13} \]
where ε equals 1 or −1. To specify ε, notice from (12), (13) and the sign of ν:
\[ |\sigma'| = +\frac{\nu}{a} \qquad \Big(\text{resp. } |\sigma'| = -\frac{\nu}{a}\Big). \tag{14} \]
Now, use (14) to check that the expression for $(y^{+})'$ (resp. $(y^{-})'$) in (9) coincides with the expression of y′ in (12), and the result follows.
Notice also that, in the proof, ν = 0 if and only if σ is a lightlike geodesic and γ is its horizontal lift [29]; otherwise σ is timelike.
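The determination of the sign ε in the proof of Lemma 4.2 can be made explicit by equating (12) and (13). The following lines are a sketch of that elementary step, using only those two formulas and |σ′| ≥ 0:

```latex
% Equate the two expressions (12) and (13) for y':
-\frac{\nu}{a^{2}} - \beta\,\omega(\sigma')
  = \varepsilon\,\frac{|\sigma'|}{a} - \beta\,\omega(\sigma')
\;\Longrightarrow\;
|\sigma'| = -\varepsilon\,\frac{\nu}{a}.
% Since |\sigma'| \ge 0, the sign of \nu fixes \varepsilon:
\nu \ge 0 \;\Rightarrow\; \varepsilon = -1,\quad
  y' = -\frac{|\sigma'|}{a} - \beta\,\omega(\sigma') = (y^{+})',
\qquad
\nu \le 0 \;\Rightarrow\; \varepsilon = +1,\quad
  y' = +\frac{|\sigma'|}{a} - \beta\,\omega(\sigma') = (y^{-})'.
```

In the borderline case ν = 0 one has |σ′| = 0, so both lifts coincide with the horizontal lift and the choice of ε is immaterial.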
Remark 4.2. Analogous results hold if K in Lemma 4.1 is taken as the closure of the set of points in Π⁻¹(x1) reachable by curves in a timelike homotopy class Cp0,x1. The lightlike geodesics γ± will lie in the closure C̄p0,x1, and will be causally homotopic to the curves in Cp0,x1. The analog of Lemma 4.2 states that if q/m > 0 (resp. q/m < 0), Π(γ⁺) (resp. Π(γ⁻)) maximizes Ix0,x1 on C̄x0,x1. This projection is either a null geodesic belonging to the boundary Ċx0,x1 or a timelike curve belonging to Cx0,x1. Summing up, the following generalization of Theorem 3.2 to causal homotopy classes is obtained:

Theorem 4.2. Let (M, g) be a globally hyperbolic spacetime, and F = dω be an electromagnetic field on M. Let x1 be an event in the causal future of x0 and fix any causal homotopy class Cx0,x1. For each q/m ∈ R − {0} there exists a future-directed causal curve σ0 which connects x0 and x1 and maximizes the functional Ix0,x1 on Cx0,x1. Moreover, σ0 is everywhere timelike or lightlike. In the former case, the reparametrization of σ0 with respect to proper time becomes a solution of the LFE (1) for the charge-to-mass ratio q/m; in the latter case, σ0 is a lightlike geodesic. Moreover, for any timelike homotopy class Cx0,x1 ⊂ Nx0,x1 there exists a maximizer in C̄x0,x1 which is either a timelike curve in Cx0,x1 or a lightlike geodesic in the boundary Ċx0,x1.

5. Existence of Timelike Local Maximizers

5.1. The existence result for causal homotopy classes. In the previous section the existence of maximizers, either timelike curves or lightlike geodesics, in each Cx0,x1 (C̄x0,x1) has been ensured. Here, we will prove that the maximizer on Cx0,x1 cannot be a lightlike geodesic if there exists a timelike curve in Cx0,x1. The proof is carried out in two steps. In the first one (Lemma 5.1), a maximizing lightlike geodesic σ0 is shown to be free of conjugate points (except at most the two extremes). The second step is to check that, if such a σ0 exists, all the other curves in Cx0,x1 must be lightlike.

Lemma 5.1. Let σ : [0, 1] → M be a lightlike geodesic such that σ(r), 0 < r < 1, is the first conjugate point to σ(0). Then there exists a smooth (C^{r0}) variation of σ through causal curves such that the functional Ix0,x1 is strictly bigger on the varied longitudinal curves.
Therefore, the maximum of Ix0,x1 on a causal homotopy class Cx0,x1 cannot be attained at a lightlike geodesic with a conjugate point to σ(0) = x0 before σ(1) = x1.

Proof. The integrand in the right-hand side of (2) will be written, for a general smooth variation through causal curves σv(λ), v ∈ [0, ϵ], ϵ > 0, as
\[ \tilde I \equiv \tilde I_v(\lambda) = \sqrt{\langle \sigma_v'(\lambda), \sigma_v'(\lambda)\rangle} + \frac{q}{m}\,\omega(\sigma_v'(\lambda)). \tag{15} \]
Let V(λ) be the variational field. Recall that for this variation, at v = 0:
\[ \langle \sigma_v'(\lambda), \sigma_v'(\lambda)\rangle \equiv 0, \qquad \partial_v \langle \sigma_v'(\lambda), \sigma_v'(\lambda)\rangle = 2\langle V'(\lambda), \sigma_v'(\lambda)\rangle \equiv 0. \]
But the second derivative of ⟨σv′(λ), σv′(λ)⟩ at v = 0 can be chosen nonnegative on [0, 1] and strictly positive in some interval (0, r + δ) as in [33, Prop. 10.48]. Moreover, this second derivative is equal for the associated variation σ−v(λ) with variational field −V. Then, by using a Taylor expansion of (15):
\[ \frac{d\tilde I}{dv}(\lambda)\Big|_{v=0} = \sqrt{\frac{\partial^{2}}{2\,\partial v^{2}}\langle \sigma_v'(\lambda), \sigma_v'(\lambda)\rangle\Big|_{v=0}} + \frac{q}{m}\,\partial_v\big(\omega(\sigma_v'(\lambda))\big)\Big|_{v=0}, \qquad \forall\lambda\in(0, r+\delta), \tag{16} \]
where the integral $\int_0^1$ of the first term is strictly positive and equal for the two variations σv(λ) and σ−v(λ). As the integral of the last term changes sign with V, the integral of (16) will be strictly positive for at least one of the two variations, as required.
Remark 5.1. Even though in Lemma 4.1 the obtained lightlike geodesic in the total space P cannot have a conjugate point, we have proved here directly the nonexistence of conjugate points for its projection on M. In fact, the proof shows that this is a general property for actions of type Ix0,x1, which contain a free particle term plus lower order terms in |σ′|. With Lemma 5.1 at hand, the last step follows by just applying the studied properties of causal homotopy classes.

Theorem 5.1. Under the hypotheses of Theorem 4.2, the maximizer σ0 of Ix0,x1 on Cx0,x1 is timelike if Cx0,x1 contains a timelike curve.

Proof. Assume by contradiction that σ0 is not timelike and, thus, it is a lightlike geodesic with no conjugate points before x1. By Theorem 2.2, σ0 must maximize the time separation in Cx0,x1, in contradiction with the existence of a timelike curve in Cx0,x1.
Theorems 4.2 and 5.1 prove directly our main result, Theorem 1.1.

5.2. A remarkable example. Lemma 5.1 does not forbid the existence of a lightlike geodesic $\sigma$ which maximizes the functional on the closure $\bar C_{x_0,x_1}$ of a timelike class. However, in that case the maximizer on $\mathcal{C}_{x_0,x_1} \supset \bar C_{x_0,x_1}$ does not coincide with $\sigma$, as the following example shows.

Let $\Sigma$ be a surface embedded in $\mathbb{R}^3$ obtained by gluing the spherical cap $x^2 + y^2 + z^2 = r^2$, $z > -\frac{\sqrt{3}}{2} r + \epsilon_z$, with a cylinder $x^2 + y^2 = r^2/4$, $z < -\frac{\sqrt{3}}{2} r - \epsilon_z$, by making a smooth transition in the points with coordinate $z \in [-\frac{\sqrt{3}}{2} r - \epsilon_z, -\frac{\sqrt{3}}{2} r + \epsilon_z]$, for some positive $\epsilon_z < \frac{\sqrt{3}}{2} r$. Notice that this transition can be made smooth and depending only on the azimuthal angle $\theta$ in a small interval $(\frac{5}{6}\pi - \epsilon_\theta, \frac{5}{6}\pi + \epsilon_\theta)$, $\epsilon_\theta < \pi/6$. Only the details of this surface included in the spherical cap with $\theta \le \pi/2 + \epsilon$, for some small positive $\epsilon < \pi/6$, will be relevant. Let $dl^2$ be the induced Riemannian metric on $\Sigma$, and fix $q = (r, 0, 0) \in \Sigma$. Consider the natural product (globally hyperbolic) spacetime $M = \mathbb{R} \times \Sigma$, $g = dt^2 - dl^2$, with natural projection $\pi : M \to \Sigma$, and the fixed events $x_0 = (0, q)$, $x_1 = (2\pi r, q)$.

(1) The timelike curve $\lambda \mapsto (2\pi r \lambda, q)$ fixes a timelike homotopy class $C_1$ ($:= C_{x_0,x_1}$). The connecting lightlike geodesic
\[ \sigma_0(\lambda) = (2\pi r \lambda, c_0(\lambda)), \qquad c_0(\lambda) = (r \cos 2\pi\lambda, r \sin 2\pi\lambda, 0), \qquad \lambda \in [0, 1], \]
lies in the boundary $\dot C_1$. In fact, $\sigma_0$ can be reached by approximating the part $c_0$ with a constant-speed parametrization $c_\alpha$ of $\Sigma \cap \Pi_\alpha$, where $\Pi_\alpha \subset \mathbb{R}^3$ is the plane through $q$, orthogonal to the plane $y = 0$, which makes an oriented positive angle $\alpha < \pi/2$ with the plane $z = 0$ ($c_\alpha$ is contained in the region $z > 0$ except at the tangent point $q$). However, by letting $\alpha < 0$ we can find a second timelike homotopy class $C_2$ such that $\sigma_0 \in \dot C_2$; of course, $C_1$ and $C_2$ are contained in the same causal homotopy class $\mathcal{C}$. Notice that $c_0$ passes through the antipodal point $-q = (-r, 0, 0)$, which is also a conjugate point of $q$; thus, $\sigma_0$ also contains a conjugate point.

Fix $q/m > 0$ (resp. $q/m < 0$), and let $F = B \pi^* \Omega = d\omega$ be on $M$, where $\Omega$ is the volume 2-form of $\Sigma$ (with the orientation induced by the outer normal in the spherical cap), and where $B : \Sigma \to \mathbb{R}$ is a non-negative (resp. non-positive) function, with $B \equiv \bar B > 0$ (resp. $< 0$) constant for $\theta \le \pi/2$, and monotonically decreasing (resp. increasing) to 0 for $\theta \in (\pi/2, \pi/2 + \epsilon]$. The charged-particle action $I_{x_0,x_1}$ is given by two contributions. The electromagnetic term reads
\[ \frac{q}{m} \int_\sigma \omega = \frac{q}{m} \int_R B\, \Omega, \tag{17} \]
where, without loss of generality, $\sigma(\lambda) = (2\pi r \lambda, c(\lambda))$ and $\partial R = c$. For a given length $L \le 2\pi r$ of $c$ this integral is maximized in $C_1$ by the circle $c_\alpha$ with length $L$, namely $c_L$. Indeed, the maximizer must be a circle in order to maximize the area, and it is tangent to $c_0$ since, otherwise, its enclosed surface $R$ would include regions where $B < \bar B$ (resp. $B > \bar B$). Thus
\[ \frac{q}{m} \int_\sigma \omega \le \frac{q}{m} \bar B\, A[c_L], \tag{18} \]
where $A[c_L]$ is the area contained in $c_L$, and the equality holds iff $c = c_L$ (up to a reparametrization with the same winding number). The contribution of the length of $\sigma$ in $I_{x_0,x_1}$ is:
\[ \int_\sigma ds = \int_0^{2\pi r} \sqrt{1 - \left( \frac{dl}{dt} \right)^2}\, dt \le 2\pi r \sqrt{1 - \left( \frac{l[c]}{2\pi r} \right)^2}, \tag{19} \]
where $l[c]$ is the length of $c = \pi \circ \sigma$, and the equality holds when the speed of $c$ is constant. We have then
\[ I_{x_0,x_1}[\sigma] \le 2\pi r \sqrt{1 - \left( \frac{l[c]}{2\pi r} \right)^2} + \frac{q}{m} \bar B\, A[c_{l[c]}], \tag{20} \]
where the equality holds iff $\pi \circ \sigma = c_{l[c]}$. But in terms of the angle $0 \le \alpha \le \pi/2$ with $c_\alpha = c_l$, we have $l = 2\pi r \cos\alpha$ and $A[c_l] = 2\pi r^2 (1 - \sin\alpha)$. Hence if $\frac{q}{m} \bar B r > 1$,
\[ I_{x_0,x_1}[\sigma] \le 2\pi r^2 \frac{q}{m} \bar B + 2\pi r \left( 1 - \frac{q}{m} \bar B r \right) \sin\alpha \le 2\pi r^2 \frac{q}{m} \bar B = I_{x_0,x_1}[\sigma_0], \tag{21} \]
and the equality holds iff $\alpha = 0$ and the projection of $\sigma$ is $c_0$ ($= c_{2\pi r}$), i.e. iff $\sigma = \sigma_0$. Nevertheless, even though $\sigma_0$ maximizes in $\bar C_1$, it does not maximize in $\bar C_2$ (nor in the causal homotopy class $\mathcal{C}$), in agreement with our results.
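The passage from (20) to (21) is a short computation that can be checked numerically. The following sketch is illustrative only: it evaluates the right-hand side of (20) on the family of circles $c_\alpha$, using $l = 2\pi r \cos\alpha$ and $A = 2\pi r^2(1 - \sin\alpha)$, and compares it with the middle expression of (21). The function names and the parameter `k` (a shorthand for $q/m$ times the constant value of $B$ on the cap) are labels introduced here, not notation from the paper.

```python
import math

def action_bound(alpha, r, k):
    """Right-hand side of (20) evaluated on the circle c_alpha.

    alpha : tilt angle of the cutting plane, 0 <= alpha <= pi/2
    r     : radius of the spherical cap
    k     : hypothetical shorthand for (q/m) times the constant value of B
    """
    cos_a = math.cos(alpha)
    area = 2 * math.pi * r**2 * (1 - math.sin(alpha))        # A[c_alpha]
    # l[c_alpha] / (2 pi r) = cos(alpha), so the length term is 2 pi r sqrt(1 - cos^2)
    proper_time = 2 * math.pi * r * math.sqrt(max(0.0, 1 - cos_a**2))
    return proper_time + k * area

def closed_form(alpha, r, k):
    """Middle expression of (21)."""
    return 2 * math.pi * r**2 * k + 2 * math.pi * r * (1 - k * r) * math.sin(alpha)

if __name__ == "__main__":
    r, k = 1.5, 1.0                      # k * r > 1, the regime assumed in (21)
    for i in range(0, 21, 5):
        a = i * math.pi / 40
        print(f"alpha={a:.3f}  bound={action_bound(a, r, k):.6f}")
```

With `k * r > 1` the printed values decrease as `alpha` grows, so the largest value occurs at `alpha = 0`, which is the case of the lightlike geodesic $\sigma_0$.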
Remark 5.2. Take the surface $\Sigma = S$ obtained by gluing the hemisphere $x^2 + y^2 + z^2 = r^2$, $z \ge 0$, with the semicylinder $x^2 + y^2 = r^2$, $z \le 0$. By using an appropriate differentiable structure, $S$ can be regarded as a $C^2$ manifold endowed with a $C^1$ metric. Thus, Christoffel symbols and geodesics make sense, but not conjugate points or Jacobi fields. Repeating the above procedure, the timelike homotopy class $C_1$ is defined as above, but $C_2$ will not exist and, then, $\mathcal{C} = \bar C_1$. Analogous $F$, $\omega$ make sense on $z \ge 0$ (and can be extended to $S$); thus, the action $I_{x_0,x_1}$ can be defined on $\mathcal{C}$. Then, the lightlike geodesic $\sigma_0$ maximizes in $\mathcal{C}$, and no timelike maximizer in $\mathcal{C}$ exists, even though $\mathcal{C}$ contains timelike curves. We conclude that the assumption made at the beginning of Section 2.1, that the degree of differentiability of $(M, g)$ is at least $C^2$, is needed for Theorem 5.1 to hold.
6. The Non-Exact Electromagnetic Field Case

Throughout the paper, $F$ has been not only a closed skew-symmetric 2-covariant tensor field (i.e., an 'electromagnetic field'), but also exact. This condition is scarcely restrictive: from the mathematical viewpoint, it is fulfilled in any contractible spacetime and, from the physical one, no known experimental evidence of non-exact electromagnetic fields on spacetime exists (magnetic monopoles have not been found). Nevertheless, it is interesting to study this case, in order to understand better our approach and the limits of the expected results. Recall:

(1) If $F$ is not exact the problem of maximizing the action becomes ill posed, since the electromagnetic potential is not globally defined. Thus, the stated variational problem of finding maximizing curves for the action does not make sense, but the existence problem for connecting solutions of the LFE is still perfectly meaningful.

(2) There are non-exact $F$ which can be studied by means of Kaluza-Klein metrics. For such an $F$, the total space $P$, whenever it exists, is necessarily a non-trivial principal bundle with fiber $U(1) \equiv S^1$. In fact, the necessary and sufficient condition for $\beta F$ to be the curvature 2-form of a real-valued connection $\tilde\omega = dy + \beta\omega$ over a $U(1)$ bundle $P$ on $M$ of fiber angle $y$ (and, introducing a metric (7), of fiber circumference $2\pi a$), is that the cohomology class $[\frac{\beta}{2\pi} F] \in H^2(M, \mathbb{R})$ is integer (more precisely, the Čech cohomology class canonically associated with $[\frac{\beta}{2\pi} F]$ belongs to the image of the morphism $\check H^2(M, \mathbb{Z}) \to \check H^2(M, \mathbb{R})$ induced by the inclusion $\mathbb{Z} \to \mathbb{R}$).⁵

The example below shows that Theorem 1.1 does not admit a generalization to the non-exact case, and it is definitive for several reasons: the spacetime and the field $F$ are very simple, $F$ can be described with a principal bundle, and the events $x_0$, $x_1$ found cannot be connected by a solution of the LFE, whatever value of $q/m$ is chosen.

Example 6.1.
Let the 3-dimensional spacetime $M$ be the product $M = \mathbb{R} \times S^2$, $t \in \mathbb{R}$, with natural projection $\pi : M \to S^2$, and metric $ds^2 = dt^2 - dl^2$, where $dl^2$ is the usual Riemannian metric of a sphere $S^2$ of radius $r$. Let $\Omega$ be one of the two associated volume 2-forms on $S^2$, with associated endomorphism field $\hat\Omega \equiv (\Omega^k_{\ i})$, and consider on $M$ the electromagnetic field⁶ $F = B \pi^* \Omega$. The LFE for a curve $\sigma(t) = (t, c(t))$ becomes
\[ D_t \left( \frac{c'}{\sqrt{1 - |c'|^2}} \right) = \frac{q}{m} B\, \hat\Omega[c'], \tag{22} \]
where $D$ is the Levi-Civita connection of $S^2$. Multiplying by $c' / \sqrt{1 - |c'|^2}$ one finds $|c'|^2 = \text{const}$. Thus, from the above equation:
\[ D_t c' = \frac{q}{m} B \sqrt{1 - |c'|^2}\; \hat\Omega[c'], \tag{23} \]
from which it follows that $c$ is a curve having constant velocity and constant curvature. Indeed, the equation above implies that the velocity $c'(t)$ rotates with angular velocity $\omega$, $|\omega| = |\frac{q}{m} B| \sqrt{1 - |c'|^2}$, with respect to a parallel transported frame along $c$. But the only constant curvature curves on $S^2$ are the circles, and $c$ cannot be a maximal circle (in this case $c$ would be a geodesic and, hence, the left-hand side of (23) would vanish, whereas the right-hand side would not). As a consequence, there is no solution $c$ of (23) which connects two opposite points $p$, $q$ on $S^2$, although for $t > \pi r$, $x_1 = (q, t)$ lies in the chronological future of $x_0 = (p, 0)$.

⁵ Remarkably, in this case the complex exponential associated to the charged particle action e a I , can be well defined. The problems for $I$ can therefore be circumvented in the quantum case [1].
⁶ Notice that the electric part of $F$ with respect to $\partial_t$ is 0, and the magnetic part corresponds, at a prerelativistic level, with a uniform magnetic field on S, see [4, Sect. 3] and references therein.

7. Conclusions

A full Avez-Seifert type result on the spacetime connectedness through solutions to the LFE has been proved. The hypotheses of this result (Theorem 1.1) are optimal because, on one hand, no more general hypothesis than $x_0 \ll x_1$ makes sense and, on the other, the generalization to non-exact electromagnetic fields is not possible. The proof is based on a purely geometric technique on an auxiliary Kaluza-Klein spacetime, and the consistency of the technique is proved by using the variational interpretation of the solutions to the LFE for exact electromagnetic fields. However, the proof itself is not variational, and the results previously obtained by means of variational methods do not have the accuracy and natural physical interpretation of those studied here.

In passing, new properties of timelike and causal homotopy classes, interesting in their own right, have been obtained. The careful study of the lightlike geodesics in such classes has become essential for our proof, and results such as Theorems 2.1 and 2.2 do not have analogues in the Riemannian case. Thus, the applications of timelike and causal classes in the present and previous papers, such as [18, 38, 40], would justify a further study, as a separate field.

Appendix: Causality of Kaluza-Klein Metrics

Let $(M, g)$ be a spacetime and $G$ a Lie group endowed with a positive definite Ad-invariant metric $h$. Let $\Pi : P \to M$ be a principal fiber bundle with structural group $G$ endowed with a Kaluza-Klein metric $\tilde g = \Pi^* g - \tilde\omega^* h$, where $\tilde\omega$ is some fixed 1-form connection on $P$.
Recall that, when $(M, g)$ is globally hyperbolic, it admits a Cauchy temporal function $t$; that is, $t$ is smooth with future-directed timelike gradient⁷ and each level $S_{t_0} = t^{-1}(t_0)$ is a Cauchy hypersurface [10]. Moreover, $M$ splits smoothly as a product $\mathbb{R} \times S$, where $S$ is a Cauchy hypersurface and the metric has no crossed terms between $\mathbb{R}$ and $S$.

Theorem 7.1. If $(M, g)$ is globally hyperbolic then $(P, \tilde g)$ is globally hyperbolic too and, for any Cauchy temporal function $t : M \to \mathbb{R}$, the composition $\tilde t = t \circ \Pi : P \to \mathbb{R}$ is Cauchy temporal on $P$. Therefore, $P$ splits smoothly as a product $\Pi^{-1}(S) \times \mathbb{R}$, where the metric $\tilde g$ has no crossed terms and $\Pi^{-1}(S)$ is a principal fiber bundle on $S$ with structural group $G$.

Proof. All the conclusions follow easily by proving that $\tilde t$ is a Cauchy temporal function. Clearly $\tilde g(\nabla \tilde t, \tilde V) = d\tilde t(\tilde V) = dt(d\Pi(\tilde V)) = g(\nabla t, d\Pi(\tilde V))$. Thus, $\nabla \tilde t$ is the horizontal lifting of $\nabla t$ and, in particular, $\tilde t$ is a temporal function. To check that each hypersurface $\tilde S_t = \tilde t^{-1}(t) = \Pi^{-1}(S_t)$ is Cauchy, let $\gamma$ be an inextendible timelike curve, which can be assumed to be reparametrized with $\tilde t$ without loss of generality. Assume by contradiction that $\gamma$ crosses the hypersurfaces $\tilde S_t$ for $t < t_0$ but not $\tilde S_{t_0}$ (an analogous reasoning holds for $t_0 < t$). The projection $\sigma = \Pi \circ \gamma$ is also timelike, but it is extendible through $t_0$ because $S_{t_0}$ is Cauchy. Then, consider a local trivialization $U \times G$ of the bundle $P$ with $\sigma(t_0) \in U$. In this trivialization we can write $\gamma(t) = (\sigma(t), \eta(t))$ and
\[ 0 < \tilde g(\gamma'(t), \gamma'(t)) = g(\sigma'(t), \sigma'(t)) - h(\tilde\omega(\eta'(t)), \tilde\omega(\eta'(t))). \tag{24} \]
That is, the $h$-length of $\eta(t)$ is bounded in $[t_0 - \epsilon, t_0)$ by the $g$-length of the extension of $\sigma$ to $[t_0 - \epsilon, t_0]$. Thus, as $h$ is complete, $\eta(t)$ is continuously extendible to $t_0$, and so is $\gamma$, a contradiction.
Remark 7.1. There are other causality properties of M (as being chronological, causal, strongly causal or stably causal) which are transferred to P . Acknowledgement. E.M. is supported by INFN, grant n◦ 9503/02, and M.S. is partially supported by MEC-FEDER grant n◦ MTM 2004-04934-C04-01.
⁷ This means that $t$ is temporal; in particular, $t$ is a time function, that is, a continuous function which grows on any future-directed causal curve.

References
1. Alvarez, O.: Topological quantization and cohomology. Commun. Math. Phys. 100, 279–309 (1985)
2. Antonacci, F., Giannoni, F., Magrone, P.: On the problem of the existence for connecting trajectories under the action of a gravitational and electromagnetic fields. Diff. Geom. Appl. 13, 1–17 (2000)
3. Avez, A.: Essais de géométrie Riemannienne hyperbolique globale. Application à la relativité générale. Ann. Inst. Fourier (Grenoble) 132, 105–190 (1963)
4. Barros, M., Cabrerizo, J.L., Fernández, M., Romero, A.: The Gauss-Landau-Hall problem on Riemannian surfaces. J. Math. Phys. 46, 11295-1–11295-15 (2005)
5. Bartolo, B.: Trajectories connecting two events of a Lorentzian manifold in the presence of a vector field. J. Differ. Eq. 153, 82–95 (1999)
6. Bartolo, R., Sánchez, M.: Remarks on some variational problems on non-complete manifolds. Nonlinear Anal. 47, 2887–2892 (2001)
7. Beem, J.K., Ehrlich, P.E., Easley, K.L.: Global Lorentzian Geometry. Marcel Dekker Inc., New York (1996)
8. Benci, V., Fortunato, D.: A new variational principle for the fundamental equations of classical physics. Found. Phys. 28, 333–352 (1998)
9. Bernal, A.N., Sánchez, M.: On smooth Cauchy hypersurfaces, Geroch's splitting theorem. Commun. Math. Phys. 243, 461–470 (2003)
10. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257(1), 43–50 (2005)
11. Caponio, E.: Timelike solutions to the Lorentz force equation in time-dependent electromagnetic and gravitational fields. J. Diff. Eq. 199, 115–142 (2004)
12. Caponio, E., Masiello, A.: Trajectories for relativistic particles under the action of an electromagnetic force in a stationary space-time. Nonlinear Anal. 50, 71–89 (2002)
13. Caponio, E., Masiello, A.: Trajectories of charged particles in a region of a stationary spacetime. Class. Quantum Grav. 19, 2229–2256 (2002)
14. Caponio, E., Masiello, A.: The Avez-Seifert theorem for the relativistic Lorentz force equation. J. Math. Phys. 45, 4134–4140 (2004)
15. Caponio, E., Masiello, A.: Causal properties of Kaluza-Klein metrics. Appl. Math. Lett. 17, 1371–1374 (2004)
16. Caponio, E., Masiello, A., Piccione, P.: Maslov index and Morse theory for the relativistic Lorentz force equation. Manus. Math. 113, 471–506 (2004)
17. Caponio, E., Minguzzi, E.: Solutions to the Lorentz force equation with fixed charge-to-mass ratio in globally hyperbolic spacetimes. J. Geom. Phys. 49, 176–186 (2004)
18. Galloway, G.J.: Closed timelike geodesics. Trans. Amer. Math. Soc. 285, 379–384 (1984)
19. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge University Press, Cambridge (1973)
20. Jackson, J.D.: Classical Electrodynamics. John Wiley & Sons, New York (1975)
21. Kaluza, T.F.E.: Zum Unitätsproblem der Physik. Sitzungsberichte der Preußische Akademie der Wissenschaften zu Berlin, Physikalisch-Mathematische Klasse 1, pp. 966–972 (1921)
22. Kerner, R.: Generalization of the Kaluza-Klein theory for an arbitrary non-abelian gauge group. Ann. Inst. H. Poincaré 9, 143–152 (1968)
23. Kerner, R., Martin, J., Mignemi, S., van Holten, J.-W.: Geodesic deviation in Kaluza-Klein theories. Phys. Rev. D 63, 027502 (2001)
24. Kovacs, D.: The geodesics equation in five-dimensional relativity theory of Kaluza-Klein. Gen. Relativ. Gravit. 16, 645–655 (1984)
25. Kovner, I.: Fermat principle in arbitrary gravitational fields. Astrophys. J. 351, 114–120 (1990)
26. Landau, L.D., Lifshitz, E.M.: The Classical Theory of Fields. Addison-Wesley Publishing Company, Reading, MA (1962)
27. Leibowitz, E., Rosen, N.: Five-dimensional relativity theory. Gen. Relativ. Gravit. 4, 449–474 (1973)
28. Lichnerowicz, A.: Théories relativistes de la gravitation et de l'électromagnetisme, Relativité Générale et théories unitaires. Masson, Paris (1955)
29. Minguzzi, E.: On the existence of maximizing curves for the charged-particle action. Class. Quantum Grav. 20, 4169–4175 (2003)
30. Minguzzi, E.: Comment on "The Avez-Seifert theorem for the relativistic Lorentz force equation" and other related works, Math-ph /0505006
31. Mirenghi, E., Tucci, M.: Stationary Lorentz manifolds and vector fields: existence of periodic trajectories. Nonlinear Anal. 50, 763–786 (2002)
32. Misner, C.W., Thorne, K.S., Wheeler, J.A.: Gravitation. Freeman, San Francisco, CA (1973)
33. O'Neill, B.: Semi-Riemannian Geometry. Academic Press, San Diego, CA (1983)
34. Penrose, R.: Techniques of Differential Topology in Relativity. CBSM-NSF Regional Conference Series in applied Mathematics. SIAM, Philadelphia (1972)
35. Perlick, V.: On Fermat's principle in general relativity: I. The general case. Class. Quantum Grav. 7, 1319–1331 (1990)
36. Sachs, R.K., Wu, H.: General Relativity for Mathematicians. Springer, Berlin (1977)
37. Sánchez, M.: Geodesic connectedness of semi-Riemannian manifolds. Nonlinear Anal. 47, 3085–3102 (2001)
38. Sánchez, M.: On causality and closed geodesics of compact Lorentzian manifolds and static spacetimes (2005). Diff. Geom. Appl. 24, 21–32 (2006)
39. Seifert, H.J.: Global connectivity by timelike geodesics. Z. Naturforsh. 22a, 1356–1360 (1967)
40. Smith, J.W.: Fundamental groups on a Lorentz manifold. Amer. J. Math. 82, 873–890 (1967)
41. Uhlenbeck, K.: A Morse theory for geodesics on a Lorentz manifold. Topology 14, 69–90 (1975)
42. Walschap, G.: Causality relations on a class of spacetimes. Gen. Rel. Grav. 27, 721–733 (1995)

Communicated by G.W. Gibbons
Commun. Math. Phys. 264, 371–389 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1468-5
Communications in
Mathematical Physics
Absolutely Continuous Spectra of Quantum Tree Graphs with Weak Disorder
Michael Aizenman¹, Robert Sims², Simone Warzel¹,*
¹ Departments of Mathematics and Physics, Princeton University, Princeton, NJ 08544, USA
² Department of Mathematics, University of California at Davis, Davis, CA 95616, USA
Received: 12 April 2005 / Accepted: 26 May 2005 Published online: 15 November 2005 – © The authors 2005
Abstract: We consider the Laplacian on a rooted metric tree graph with branching number K ≥ 2 and random edge lengths given by independent and identically distributed bounded variables. Our main result is the stability of the absolutely continuous spectrum for weak disorder. A useful tool in the discussion is a function which expresses a directional transmission amplitude to infinity and forms a generalization of the Weyl-Titchmarsh function to trees. The proof of the main result rests on upper bounds on the range of fluctuations of this quantity in the limit of weak disorder.

Contents
1. Introduction
2. An Outline of the Argument
3. A Lyapunov Exponent and Its Continuity
4. Fluctuation Bounds
5. Stability of the Weyl-Titchmarsh Function Under Weak Disorder
6. Extensions
A. Appendix: More on the Weyl-Titchmarsh Function on Tree Graphs
References
Copyright rests with the authors. Faithful reproduction of the article for non-commercial purpose is permitted.
* On leave from: Institut für Theoretische Physik, Universität Erlangen-Nürnberg, Germany

1. Introduction

A quantum graph (QG) is a metric graph with an associated Laplace-like operator acting on the $L^2$-space of the union of the graph edges. The spectral and dynamical properties of such operators have been of interest both because this model mimics situations realizable with quantum dots and wires, and because QGs may provide a simple setup elucidating issues which are also of relevance for Schrödinger operators and Laplacians on manifolds (see [10, 19, 11, 8] and references therein). Examples of such topics are the Gutzwiller trace formula and the transition associated with the spectral and dynamical localization due to disorder.

The main results of this work pertain to quantum tree graphs whose edge lengths are randomly stretched, but remain close to a common value. The goal is to present new results concerning the persistence of absolutely continuous spectra under weak disorder. A secondary goal is to demonstrate, in the QG context, a new technique for the proof of absolutely continuous spectrum which is also effective for discrete random Schrödinger operators on trees, as was proven in [2].

1.1. Random quantum trees and their spectra. A rooted metric tree graph $\mathbb{T}$ with branching number $K$ consists, for us, of a countably infinite set of vertices, one of which is labeled as the root, 0, and a set $\mathcal{E}$ of edges, each joining a pair of vertices, such that:
1. the graph is edge connected,
2. there are no closed loops,
3. each vertex has $K + 1$ edges except for the root, which has only one edge.
Each edge $e \in \mathcal{E}$ is assigned a positive finite length $L_e \in (0, \infty)$ and is parametrized by a variable with values in $[0, L_e]$. Thus, the union of the edges has the natural coordinates $l \in [0, L_e]$. The orientation for the latter is chosen so that $l$ increases away from the root, and we denote by $'$ the derivative with respect to those coordinates. Our discussion concerns the spectral properties of the Laplacian
\[ -\Delta_{\mathbb{T}}\, \psi_e = -\psi_e'', \tag{1.1} \]
which acts in the Hilbert space $L^2(\mathbb{T}) = \bigoplus_{e \in \mathcal{E}} L^2[0, L_e]$ of complex-valued square-integrable functions $\psi = \oplus_{e \in \mathcal{E}} \psi_e$ defined over the union of the graph edges. The Laplacian is rendered essentially self-adjoint through the imposition of boundary conditions (BC) on the functions in its domain; here we take these to be the Kirchhoff conditions at internal vertices and α-BC at the root. More precisely, the domain consists of functions such that $\psi_e \in H^2[0, L_e]$ for all $e \in \mathcal{E}$ and
1. at each vertex $\psi$ is continuous,
2. at internal vertices the net flux defined by the directional derivatives vanishes, i.e.,
\[ \psi_e'(L_e) = \sum_{f \in \mathcal{N}_e^+} \psi_f'(0), \tag{1.2} \]
where $\mathcal{N}_e^+$ is the collection of edges which are forward to $e$ as seen from the root,
3. at the root
\[ \cos(\alpha)\, \psi_0(0) - \sin(\alpha)\, \psi_0'(0) = 0 \tag{1.3} \]
with some $\alpha \in [0, \pi)$.
An extensive discussion of other boundary conditions which yield self-adjointness can be found in [7, 12]. Among those is the class of symmetric BC; the adaptation of the argument to this case is discussed in Sect. 6.

1.2. Statement of the main result. Our discussion will focus on the absolutely continuous (AC) component of the spectrum of the Laplacian on deformed metric trees. Before presenting the main result let us note the following fact, which may, for instance, be deduced from Theorem A.2 in Appendix A.
Proposition 1.1. The AC spectrum of $-\Delta_{\mathbb{T}}$ is independent of the boundary condition at the root, i.e., of $\alpha \in [0, \pi)$.

For the regular tree $\mathbb{T}$ with constant edge lengths $L \in (0, \infty)$ and branching number $K \in \mathbb{N}$ one has [21, 22]
\[ \sigma_{ac}(-\Delta_{\mathbb{T}}) = \bigcup_{n=0}^{\infty} \left[ \left( \frac{\pi n + \theta}{L} \right)^2, \left( \frac{\pi (n+1) - \theta}{L} \right)^2 \right], \tag{1.4} \]
where $\theta := \arctan\big( (K^{1/2} - K^{-1/2})/2 \big)$. In particular, this implies that the AC spectrum of $-\Delta_{\mathbb{T}}$ has band structure if $K \ge 2$. As an aside, we note that for $K \ge 2$ there occur infinitely degenerate eigenvalues in the band gaps [21, 22]. The main object of interest in this paper is the AC spectrum of the Laplacian on random deformations of $\mathbb{T}$.

Definition 1.1. A random deformation $\mathbb{T}(\lambda, \omega)$ of the regular rooted metric tree $\mathbb{T}$ is a rooted metric tree graph which has the same vertex set and neighboring relations as $\mathbb{T}$, but whose edge lengths are given by
\[ L_e(\lambda, \omega) := L \exp(\lambda\, \omega_e) \tag{1.5} \]
with a collection of real-valued, independent, and identically distributed (iid) bounded random variables $\omega = \{\omega_e\}_{e \in \mathcal{E}}$. The parameter $\lambda \in [0, 1]$ controls the strength of the disorder and $L > 0$ stands for the edge length of $\mathbb{T}$.

Our main result is

Theorem 1.1. For a random deformation, $\mathbb{T}(\lambda, \omega)$, of a regular tree graph $\mathbb{T}$ with branching number $K \ge 2$ the AC spectrum of $-\Delta_{\mathbb{T}(\lambda,\omega)}$ is continuous at $\lambda = 0$ in the sense that for any interval $I \subset \mathbb{R}$ and almost all $\omega$:
\[ \lim_{\lambda \to 0} \mathcal{L}\big( I \cap \sigma_{ac}(-\Delta_{\mathbb{T}(\lambda,\omega)}) \big) = \mathcal{L}\big( I \cap \sigma_{ac}(-\Delta_{\mathbb{T}}) \big), \tag{1.6} \]
where $\mathcal{L}(\cdot)$ denotes the Lebesgue measure.

Remarks 1.1.
(i) As is generally known by ergodicity arguments [3, 17, 1], and in our case also by the 0-1 law for the sigma-algebra of events measurable at infinity, which is applicable through Theorem A.2, for almost all $\omega$ the AC spectrum of $-\Delta_{\mathbb{T}(\lambda,\omega)}$ is given by a certain non-random set.
(ii) The assumption on the distribution of $\{\omega_e\}_{e \in \mathcal{E}}$ can be relaxed: the present proof readily extends to the class of random graphs where the distribution of these variables is stationary under the endomorphisms of the tree $\mathbb{T}$ and weakly correlated in the sense of [2, Def. 1.1].
(iii) To better appreciate the continuity asserted in Theorem 1.1, one may note that the analogous statement is not expected to be true in case the disorder is restricted to be radially symmetric, i.e., $\omega_e = a_{\mathrm{dist}(e,0)}$ with $\{a_n\}$ a collection of iid random variables. In this case, the AC spectrum coincides with that of a one-dimensional Sturm-Liouville operator. In view of related results about Anderson localization in one dimension [3, 17, 15, 16] one may expect (though we are not aware of a published proof) that also here localization sets in at any non-zero level of disorder.
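The band structure in (1.4) and the deformation rule (1.5) are easy to tabulate. The sketch below is an illustration only: the function names are labels introduced here, and the uniform distribution used for the $\omega_e$ is an assumption made for the sake of the example (the text requires only that they be iid and bounded). A useful consistency check follows from the definition of $\theta$ by elementary trigonometry: $\cos\theta = 2\sqrt{K}/(K+1)$, so at every band edge $|\cos(\sqrt{E}\,L)|$ equals that value.

```python
import math
import random

def theta(K):
    """theta := arctan((K**0.5 - K**-0.5) / 2), as in (1.4)."""
    return math.atan((math.sqrt(K) - 1 / math.sqrt(K)) / 2)

def ac_bands(K, L, n_bands):
    """First n_bands intervals [((pi n + theta)/L)^2, ((pi (n+1) - theta)/L)^2] of (1.4)."""
    th = theta(K)
    return [(((math.pi * n + th) / L) ** 2,
             ((math.pi * (n + 1) - th) / L) ** 2) for n in range(n_bands)]

def deformed_lengths(L, lam, n_edges, rng, bound=1.0):
    """Sample edge lengths L * exp(lam * omega_e), cf. (1.5).

    The omega_e are drawn iid uniformly from [-bound, bound]; this choice of
    law is an assumption for illustration only.
    """
    return [L * math.exp(lam * rng.uniform(-bound, bound)) for _ in range(n_edges)]

if __name__ == "__main__":
    for lo, hi in ac_bands(K=2, L=1.0, n_bands=3):
        print(f"band: [{lo:.4f}, {hi:.4f}]")
    print(deformed_lengths(L=1.0, lam=0.1, n_edges=5, rng=random.Random(0)))
```

For K = 1 the angle θ vanishes and consecutive intervals touch, which is the half-line case; for K ≥ 2 the intervals are separated by gaps.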
2. An Outline of the Argument

A generally useful tool for the study of the spectral and dynamical properties of any quantum graph is provided by the Green function. For tree graphs, we find it particularly useful to consider a related quantity, which is an extension of the Weyl-Titchmarsh function familiar from the context of Sturm-Liouville or Schrödinger operators on a line. Before outlining the main steps in the derivation of Theorem 1.1, we shall introduce this function and its key properties, first somewhat informally through its appearance in a scattering problem.

2.1. A scattering perspective. As noted by Miller and Derrida [14], one may obtain a scattering perspective on extended states by considering a setup in which a wire $W_x$ is attached to a tree graph $\mathbb{T}$ at an interior point $x$ of an edge. Particles of energy $E$ and decay rate $\eta$ are sent at a steady rate down this wire. In the corresponding steady state, the quantum amplitude $\psi$ for observing a particle at a point is given by a function satisfying $(-\Delta_{\mathbb{T} \cup W_x} - z)\, \psi = 0$, where $z = E + i\eta$ and $-\Delta_{\mathbb{T} \cup W_x}$ is a self-adjoint Laplacian on the union of the graph and the wire, defined with suitable BC for the three segments meeting at the point of contact. For the latter, we assume here that it will be appropriate to take the Kirchhoff conditions. As follows from Theorem 2.1 below, on the two subgraphs $\mathbb{T}_x^+$ and $\mathbb{T}_x^-$, produced by cutting $\mathbb{T}$ at $x$, the above differential equation has a square-integrable solution $\psi^+$, and correspondingly $\psi^-$, which is unique up to a multiplicative constant. Thus $\psi$ takes the form:
\[ \psi(y; z) = \begin{cases} e^{i\sqrt{z}(y-x)} + r(x; z)\, e^{-i\sqrt{z}(y-x)} & \text{along the wire}, \\ \psi^{\pm}(y; z) & \text{along the graph}, \end{cases} \tag{2.1} \]
where $r(x; z)$ is the reflection coefficient, and the three branches are linked through the Kirchhoff conditions:
\[ \psi^+(x; z) = \psi^-(x; z) = 1 + r(x; z), \qquad \frac{\partial}{\partial x} \psi^+(x; z) - \frac{\partial}{\partial x} \psi^-(x; z) = i\sqrt{z}\, \big( 1 - r(x; z) \big), \tag{2.2} \]
with the differentiation taken in the direction away from the root of $\mathbb{T}$. The above relations yield
\[ i\sqrt{z}\; \frac{1 - r(x; z)}{1 + r(x; z)} = R^+(x; z) + R^-(x; z), \tag{2.3} \]
where $R^{\pm} = \pm \big( \partial \psi^{\pm} / \partial x \big) / \psi^{\pm}$. From the scattering perspective the graph absorbs some of the current directed at it, i.e., conducts it to infinity, if and only if $|r(x; z)| < 1$. A simple consequence of (2.3) is the equivalence
\[ |r(x; E)| < 1 \iff \operatorname{Im} \big[ R^+(x; E) + R^-(x; E) \big] > 0. \tag{2.4} \]
As it turns out, $R$ also plays a direct role in the spectral theory of $-\Delta_{\mathbb{T}}$: the diagonal of its Green function is given by
\[ G_{\mathbb{T}}(x, x; z) = -\big[ R^+(x; z) + R^-(x; z) \big]^{-1}. \tag{2.5} \]
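Relation (2.3) can be solved for the reflection coefficient, which makes the equivalence (2.4) easy to probe numerically. The sketch below is illustrative and not part of the paper: the function names are introduced here, and the sum $R^+ + R^-$ is supplied by hand as the argument `R_sum`, since computing it requires solving the problem on the tree.

```python
import cmath

def reflection_coefficient(R_sum, z):
    """Solve i*sqrt(z)*(1 - r)/(1 + r) = R_sum for r, cf. (2.3).

    R_sum stands for R^+(x; z) + R^-(x; z) and is a caller-supplied value.
    cmath.sqrt returns the principal branch of the square root.
    """
    w = 1j * cmath.sqrt(z)
    return (w - R_sum) / (w + R_sum)

def green_diagonal(R_sum):
    """Diagonal of the Green function according to (2.5): G = -1 / R_sum."""
    return -1 / R_sum

if __name__ == "__main__":
    E = 2.0
    for R_sum in (0.7 + 0.0j, 0.3 + 0.5j):
        r = reflection_coefficient(R_sum, E)
        print(R_sum, abs(r), green_diagonal(R_sum).imag)
```

For real positive energy, a purely real `R_sum` gives a reflection coefficient of modulus one, while a positive imaginary part gives modulus strictly below one together with a positive imaginary part of the Green function diagonal.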
By the theorem of de la Vallée Poussin, the AC component of the spectral measure, associated with the function in (2.5), is $\pi^{-1} \operatorname{Im} G_{\mathbb{T}}(x, x; E + i0)\, dE$. Therefore, there is a relation between the occurrence of the AC spectrum, the ability of the graph to conduct current to infinity, and the non-vanishing of $\operatorname{Im} R^{\pm}(x; E)$. Let us note that the reflection coefficient for the version of the above experiment in which the particles are sent towards only the forward subtree $\mathbb{T}_x^+$ is given by a version of (2.3) with only $R^+(x; z)$ on the right side, and similarly for $\mathbb{T}_x^-$.

2.2. Tree extension of the Weyl-Titchmarsh function. We shall now follow the somewhat informal introduction above with a more careful definition of the functions $R^{\pm}$. For this purpose the following statement plays an important role.

Theorem 2.1. Let $G$ be a connected metric graph with a selected "open" vertex $u$ which has exactly one adjacent edge. Let $-\Delta_{G,u}$ be the symmetric Laplacian defined with self-adjoint BC on all vertices except the open vertex, where it is required that both $\psi(u) = 0$ and $\psi'(u) = 0$. Then:
(i) For any $z \in \mathbb{C}^+ := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$, the space of square-integrable solutions of $(-\Delta^*_{G,u} - z)\, \psi = 0$, with $-\Delta^*_{G,u}$ the adjoint operator, is one dimensional.
(ii) The solution $\psi(x; z)$ and its derivative $\psi'(x; z)$ do not vanish on any point which disconnects $G$.
(iii) Normalized so that $\psi(u; z) = 1$, both $\psi(x; z)$ and $\psi'(x; z)$ are analytic for $z \in \mathbb{C}^+$ and all $x \in G$.

We note that $-\Delta_{G,u}$ is not self-adjoint. The proof of this theorem is given in Appendix A. The following corollary is a relevant implication for trees. Throughout, we denote by $\psi^{\pm}(x; z|u)$ the functions described in Theorem 2.1 which correspond to the two subtrees, $\mathbb{T}_u^{\pm}$, into which $\mathbb{T}$ is split at $u$, with $u$ serving as the open vertex. We fix their normalization such that $\psi^{\pm}(u; z|u) = 1$.

Corollary 2.1. Along the edges of a metric tree $\mathbb{T}$, the ratio
\[ R^{\pm}(x; z) := \pm \frac{1}{\psi^{\pm}(x; z|u)}\, \frac{\partial}{\partial x} \psi^{\pm}(x; z|u) \tag{2.6} \]
does not depend on $u$ as long as $x$ stays in $\mathbb{T}_u^{\pm}$.

Definition 2.1. We shall refer to the above $R^{\pm}$ as the (generalized) Weyl-Titchmarsh (WT) functions.

These functions have a number of properties which are used in the proof of our main result. If not obvious, their derivation is given in Appendix A.
1. (Relation with the Green function). The generalized WT function may be related to the diagonal elements of the Green function which is defined on $\mathbb{T}_x^+$, with the $\alpha \neq 0$ BC at $x$, as
\[ R^+(x; z) = \cot\alpha - \frac{1}{G^{\alpha}_{\mathbb{T}_x^+}(x, x; z)}, \tag{2.7} \]
and similarly for $R^-$.
2. (Boundary values). The function has the Herglotz-Nevanlinna property [5]: it is analytic for $z \in \mathbb{C}^+$ with $\operatorname{Im} R^{\pm}(x; z) > 0$ when $\operatorname{Im} z > 0$. By a standard implication, for each $x$ the limit
\[ R^{\pm}(x; E + i0) := \lim_{\eta \downarrow 0} R^{\pm}(x; E + i\eta) \tag{2.8} \]
exists for Lebesgue almost every $E \in \mathbb{R}$.
3. (Evolution along the tree). The values $R_e^+(\cdot\,; z)$ at two opposite ends of an edge $e$ are related by a Möbius transformation, which integrates the Riccati equation:
\[ \frac{\partial}{\partial x} R^+(x; z) + z + R^+(x; z)^2 = 0. \tag{2.9} \]
Over each vertex $R^+(\cdot\,; z)$ is additive thanks to (1.2):
\[ R_e^+(L_e; z) = \sum_{f \in \mathcal{N}_e^+} R_f^+(0; z). \tag{2.10} \]
4. (Relation with the current). For each $u$, the quantity
\[ J^+(x, z|u) := \operatorname{Im} \Big[ \overline{\psi^+(x; z|u)}\, \frac{\partial}{\partial x} \psi^+(x; z|u) \Big] = |\psi^+(x; z|u)|^2\, \operatorname{Im} R^+(x; z) \ge 0 \tag{2.11} \]
represents a current. It is additive at the vertices and conserved along the edges for real $z$. For $z \in \mathbb{C}^+$ the current is decreasing in the direction away from the root:
\[ \frac{\partial}{\partial x} J^+(x; z|u) = -|\psi^+(x; z|u)|^2\, \operatorname{Im} z \le 0. \tag{2.12} \]
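The evolution rules in property 3 can be made concrete. The sketch below is illustrative only and rests on stated assumptions: it integrates the Riccati equation (2.9) in closed form by writing $R = \psi'/\psi$ with $\psi'' = -z\psi$, and then, for a regular tree on which all $K$ forward edges carry the same value, it solves the self-consistency condition $K R = R(L)$ implied by (2.10). Selecting the root with the larger imaginary part is a choice made here, suggested by the Herglotz-Nevanlinna property; the function names are labels introduced for this example.

```python
import cmath

def evolve_along_edge(R0, z, length):
    """Value at x = length of the solution of R' + z + R**2 = 0 with R(0) = R0.

    Uses R = psi'/psi with psi'' = -z psi, psi(0) = 1, psi'(0) = R0, so the
    map R0 -> R(length) is a Moebius transformation.
    """
    k = cmath.sqrt(z)
    c, s = cmath.cos(k * length), cmath.sin(k * length)
    return (R0 * c - k * s) / (c + R0 * s / k)

def regular_tree_R(z, K, L):
    """A solution of K * R = evolve_along_edge(R, z, L), chosen with larger Im R.

    Assumes sin(sqrt(z) * L) != 0, so that the condition is a genuine
    quadratic a R^2 + b R + d = 0.
    """
    k = cmath.sqrt(z)
    c, s = cmath.cos(k * L), cmath.sin(k * L)
    a, b, d = K * s / k, (K - 1) * c, k * s
    disc = cmath.sqrt(b * b - 4 * a * d)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    return max(roots, key=lambda R: R.imag)

if __name__ == "__main__":
    for E in (1.0, 9.0):
        print(E, regular_tree_R(E, K=2, L=1.0))
```

For real energies the discriminant of the quadratic equals $(K+1)^2 \cos^2(\sqrt{E}L) - 4K$, so the selected root has a strictly positive imaginary part exactly when $\cos^2(\sqrt{E}L) < 4K/(K+1)^2$, which is the band condition encoded in (1.4).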
At interior vertices the net current flux is zero. 2.3. The core of the argument. We now have the requisite tools to outline the proof of the persistence of the AC spectrum under weak disorder. A key element in our analysis is to show that for small (λ, η), the WT function R + (x; E + iη, λ, ω) does not depend much on ω. At each point its distribution is narrowly peaked around a value which may only depend on (λ, η), and the relative location of the point within the edge. By the rules of the evolution of R + along an edge, which are described above, it follows that for
$(\lambda, \eta) \to (0, 0)$ the limit of the “typical” value of $R_e^+(0; z, \lambda, \omega)$, or more precisely any accumulation point of such, obeys a Möbius evolution whose unique periodic solution is given by the WT function of the regular tree T. The continuity then readily follows, though some care is needed in the presentation of the argument. In this part, we employ the strategy which was presented in [2]. It should be appreciated that the asymptotic lack of dependence of $R^+(x; z, \lambda, \omega)$ on ω is not just a trivial consequence of the smallness of λ since this parameter affects an infinite number of random terms. As commented above, it is natural to expect the corresponding statement to fail when the disorder is radial, with ω given by radially symmetric but otherwise iid random variables. To streamline the notation, in various places the dependence of $\psi^+$ and $R^+$ on λ and ω will be suppressed.

The first statement establishing a reduction of fluctuations concerns $\operatorname{Im} R^+(x; z)$. For that the starting point is (2.11) by which $|\psi^+(x; z|0)|^2 \cdot \operatorname{Im} R^+(x; z)$ gives the flux at x of a conserved current. The current is injected at the root and at each vertex it is split among the forward directions. It is significant that the first factor takes a common value among the different forward directions, the second factor is independently distributed, and, furthermore, it has the same distribution as the total current $\operatorname{Im} R^+(0; z)$. It follows that
$$\frac{\frac{1}{K} \sum_{e \in N_0^+} \operatorname{Im} R_e^+(0; z, \lambda, \omega)}{\operatorname{Im} R_0^+(0; z, \lambda, \omega)} \le \frac{\big|\psi_0^+(0; z, \lambda, \omega|0)\big|^2}{K\, \big|\psi_f^+(0; z, \lambda, \omega|0)\big|^2}. \qquad (2.13)$$
This expresses current conservation/attrition, and for Im z = 0 holds as equality. Here $f \in N_0^+$ is an arbitrary edge forward to that of the root, and due to the particular normalization chosen (before Corollary 2.1) the numerator on the right side is actually one. Our argument proceeds by combining two essential observations:
1. By the Jensen inequality the expectation value of the logarithm of the left side of (2.13) is non-negative. The inequality can be strengthened to show that the above expectation value provides an upper bound on a positive quantity which expresses the relative width of the distribution of $\operatorname{Im} R_0^+(0; z, \lambda, \omega)$.
2. The expectation of the logarithm of the right side of (2.13) is a quantity which it is natural to regard as a Lyapunov exponent,
$$\gamma_\lambda(z) := -\mathbb{E}\left[\log\left(\sqrt{K}\, \left|\frac{\psi_f^+(0; z, \lambda, \cdot|0)}{\psi_0^+(0; z, \lambda, \cdot|0)}\right|\right)\right]. \qquad (2.14)$$
For λ = 0, this Lyapunov exponent vanishes for almost every $z \in \sigma_{ac}(-\Delta_T)$. Furthermore, the average of $\gamma_\lambda(E + i\eta)$ over any energy interval is continuous in (λ, η). The above mentioned improvement of the Jensen inequality is summarized in the following statement, which is a consequence of [2, Lemma 3.1 and Lemma D.2].

Lemma 2.1. Let $\{X_j\}_{j=1}^K$ be a collection of K ≥ 2 iid positive random variables, and X a variable of the same distribution. Then for any a ∈ (0, 1/2]:
$$\mathbb{E}\Big[\log\Big(\frac{1}{K} \sum_{j=1}^{K} X_j\Big)\Big] \ge \mathbb{E}[\log X] + \frac{a^2}{4}\, \delta(X, a)^2, \qquad (2.15)$$
where δ(X, a) is the relative a-width of X, which is defined below.

Definition 2.2. The relative a-width of the distribution of a positive random variable X, at a ∈ (0, 1/2], is
$$\delta(X, a) := 1 - \frac{\xi_-(X, a)}{\xi_+(X, a)} \qquad (2.16)$$
with $\xi_-(X, a) = \sup\{\xi : P(X < \xi) \le a\}$ and $\xi_+(X, a) = \inf\{\xi : P(X > \xi) \le a\}$.

A number of useful rules of estimates of the relative width of a distribution are compiled in [2, Appendix D]. We shall now turn to the two key properties of the Lyapunov exponent which were mentioned above.
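To make Definition 2.2 concrete, the following is a minimal sketch of the relative a-width for the empirical distribution of a finite sample. The function name `relative_width` and the use of a finite sample in place of the law of X are assumptions made for illustration only; they are not part of the paper. For a sample of size n with order statistics $x_{(1)} \le \dots \le x_{(n)}$ and $k = \lfloor a n \rfloor$, the definition gives $\xi_- = x_{(k+1)}$ and $\xi_+ = x_{(n-k)}$.

```python
import math


def relative_width(sample, a):
    """Empirical relative a-width: delta(X, a) = 1 - xi_minus / xi_plus.

    `sample` is a non-empty collection of positive numbers whose empirical
    distribution stands in for the law of X (an illustrative assumption).
    """
    if not 0 < a <= 0.5:
        raise ValueError("a must lie in (0, 1/2]")
    xs = sorted(sample)
    if not xs or xs[0] <= 0:
        raise ValueError("sample must be non-empty and positive")
    n = len(xs)
    # Largest integer k with k / n <= a; the tolerance guards float rounding.
    k = math.floor(a * n + 1e-12)
    xi_minus = xs[k]          # sup{xi : #(x_i < xi) <= k}
    xi_plus = xs[n - k - 1]   # inf{xi : #(x_i > xi) <= k}
    return 1.0 - xi_minus / xi_plus


if __name__ == "__main__":
    print(relative_width([1.0, 2.0, 3.0, 4.0], 0.25))
    print(relative_width([5.0, 5.0, 5.0], 0.3))
```

A sample concentrated at one value has zero width, while a widely spread sample has width close to one. For a finite sample at a = 1/2 the two quantiles can cross; the sketch does not special-case this.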
3. A Lyapunov Exponent and Its Continuity

We shall refer to $\gamma_\lambda(z)$ which is defined by (2.14) as the Lyapunov exponent of the randomly deformed tree T(λ, ω). The following theorem collects some of its properties. Of particular relevance is that the integral of $\gamma_\lambda(E + i\eta)$ over $E \in \sigma_{ac}(T)$ is small for small λ and η.

Theorem 3.1. The Lyapunov exponent $\gamma_\lambda(z)$ has the following properties:
(i) As a function of $z \in \mathbb{C}^+$, it is positive and harmonic with $\gamma_\lambda(i\eta)/\eta \to 0$ for η → ∞.
(ii) For λ = 0, it vanishes on the AC spectrum: $\gamma_0(E + i0) = 0$ for Lebesgue-almost all $E \in \sigma_{ac}(-\Delta_T)$.
(iii) For any $z \in \mathbb{C}^+$, $\gamma_\lambda(z + i\eta)$ is jointly continuous in (λ, η) ∈ R × [0, ∞).
(iv) For any $[a, b] \subset \sigma_{ac}(-\Delta_T)$:
$$\lim_{\substack{\lambda \to 0 \\ \eta \downarrow 0}} \int_a^b \gamma_\lambda(E + i\eta)\, dE = 0. \qquad (3.1)$$
Proof. (i) From (2.14) and (2.6) it follows that $\gamma_\lambda(z)$ is the negative of the real part of the Herglotz-Nevanlinna function
$$w_\lambda(z) := \log\sqrt{K} + \mathbb{E}\Big[\int_0^{L_0(\lambda)} R_0^+(l; z, \lambda)\, dl\Big], \qquad (3.2)$$
and hence it is harmonic. The positivity of $\gamma_\lambda(z)$ follows from (2.11) and the Jensen inequality, which yield
$$2\gamma_\lambda(z) \ge \mathbb{E}\Big[\log\frac{J_0^+(0; z|0)}{J_0^+(L_0(\lambda, \cdot); z|0)}\Big] > 0 \qquad (3.3)$$
due to the current loss (2.12) on every edge for $z \in \mathbb{C}^+$. The statement of asymptotics derives from (A.5) and the bound (A.2) in Appendix A.
(ii) The vanishing of $\gamma_0$ along $\sigma_{ac}(-\Delta_T)$ is a consequence of the Im z ↓ 0 limit of (2.13) and the fact that $R_e^+(0; z, 0)$ is independent of e, with $0 < \operatorname{Im} R_e^+(0; E + i0, 0) < \infty$ for Lebesgue-almost all $E \in \sigma_{ac}(-\Delta_T)$.
(iii) From (2.14) and (A.4) together with the dominated convergence theorem, which is applicable due to (A.5) and Theorem A.1, we conclude that the continuity of $\gamma_\lambda(z + i\eta)$ follows from that of $R_0(0; z + i\eta, \lambda, \omega)$. The latter is derived using the argument in the proof of Theorem A.1(iv).
(iv) By virtue of (ii) it suffices to prove that
$$\lim_{\lambda, \eta \to 0} \int_a^b \gamma_\lambda(E + i\eta)\, dE = \int_a^b \gamma_0(E + i0)\, dE. \qquad (3.4)$$
To do so, we note that the integrals in (3.4) can be associated with the (unique) Borel measure $\sigma_{(\lambda,\eta)}$ corresponding to the positive harmonic function $h_{(\lambda,\eta)}(z) = \gamma_\lambda(z + i\eta)$ (cf. (3.6) below). Since $w_\lambda(\cdot + i\eta)$ has the Herglotz-Nevanlinna property, the harmonic conjugate of $h_{(\lambda,\eta)} = -\operatorname{Re} w_\lambda(\cdot + i\eta)$ has a definite sign and hence locally integrable boundary values [5, Thm. 1.1]. Therefore, the measure $\sigma_{(\lambda,\eta)}$ is purely AC [5, Thm 3.1 & Corollary 1] for all $(\lambda, \eta) \in [0, 1]^2$ and given by
$$\sigma_{(\lambda,\eta)}([a, b]) = \int_a^b \gamma_\lambda(E + i\eta)\, dE. \qquad (3.5)$$
The assertion thus follows from (iii) and Lemma 3.1 below.
The last part of the preceding proof was based on the following general convergence result for sequences of harmonic functions. Recall (cf. [5, 9]) that every positive harmonic function $h : \mathbb{C}^+ \to (0, \infty)$ which satisfies $\lim_{\eta \to \infty} h(i\eta)/\eta = 0$ admits the representation
$$h(z) = \int_{\mathbb{R}} \frac{\operatorname{Im} z}{|E - z|^2}\, \sigma(dE) \qquad (3.6)$$
with some positive Borel measure σ on R with $\int_{\mathbb{R}} (E^2 + 1)^{-1} \sigma(dE) < \infty$.

Lemma 3.1. Let $h_n, h : \mathbb{C}^+ \to (0, \infty)$ be positive harmonic functions with $\lim_{\eta \to \infty} h_n(i\eta)/\eta = 0$ and similarly for h. Suppose that for all $z \in \mathbb{C}^+$,
$$\lim_{n \to \infty} h_n(z) = h(z). \qquad (3.7)$$
Then their associated Borel measures converge vaguely, $\lim_{n \to \infty} \sigma_n = \sigma$.

The proof is an immediate consequence of the representation (3.6) and [6, Prop. 4.1] (see also [17, Lemma 5.22]).

4. Fluctuation Bounds

Proceeding along the lines outlined in Subsect. 2.3, we shall now show that a small Lyapunov exponent $\gamma_\lambda(z)$ implies the sharpness of the distribution of both the imaginary part and the modulus of a certain linear function of $R_0^+(0; z, \lambda, \omega)$.

Theorem 4.1. For any λ ∈ R, $z \in \mathbb{C}^+$ and a ∈ (0, 1/2]:
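$$\delta\big(\operatorname{Im} R_0^+(0; z, \lambda, \cdot),\, a\big)^2 \le \frac{8}{a^2}\, \gamma_\lambda(z), \qquad (4.1)$$
$$\delta\Big(\Big|\cos\big(\sqrt{z} L_0(\lambda, \cdot)\big) + R_0(0; z, \lambda, \cdot)\, \frac{\sin\big(\sqrt{z} L_0(\lambda, \cdot)\big)}{\sqrt{z}}\Big|^2,\, a\Big)^2 \le 512\, \frac{(K + 1)^2}{a^2}\, \gamma_\lambda(z). \qquad (4.2)$$

Proof. The derivation of (4.1) starts from the relation
$$2\gamma_\lambda(z) \ge \mathbb{E}\Big[\log\Big(\frac{1}{K} \sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda, \cdot)\Big)\Big] - \mathbb{E}\big[\log \operatorname{Im} R_0^+(0; z, \lambda, \cdot)\big], \qquad (4.3)$$
which is obtained by taking the expectation of the logarithm of (2.13). Applying the improved Jensen inequality (2.15), and using the fact that $\operatorname{Im} R_f^+(0; z, \lambda, \omega)$ are iid for $f \in N_0^+$, the right side of (4.3) is bounded from below by $a^2\, \delta\big(\operatorname{Im} R_0^+(0; z, \lambda, \cdot), a\big)^2 / 4$. This implies (4.1).

The proof of (4.2) starts by observing that the quantity in its left side can be identified with the right side of (2.13):
$$\cos\big(\sqrt{z} L_0(\lambda, \omega)\big) + R_0(0; z, \lambda, \omega)\, \frac{\sin\big(\sqrt{z} L_0(\lambda, \omega)\big)}{\sqrt{z}} = \psi_0^+\big(L_0(\lambda, \omega); z, \lambda, \omega|0\big). \qquad (4.4)$$
This follows from (A.4) in Appendix A. Setting $X := J_0^+(L_0(\lambda, \cdot); z|0)/J_0^+(0; z|0)$ and using the definition of the current, the left side in (4.2) therefore equals
$$\delta\Big(X\, \frac{\operatorname{Im} R_0^+(0; z, \lambda, \cdot)}{\sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda, \cdot)},\, a\Big) \le \delta\Big(\frac{\operatorname{Im} R_0^+(0; z, \lambda, \cdot)}{\sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda, \cdot)},\, \frac{a}{2}\Big) + \delta\Big(X, \frac{a}{2}\Big), \qquad (4.5)$$
where the inequality results from the additivity of the relative width under multiplication [2, Lemma D.1]. This additivity and the invariance under inversion [2, Lemma D.1] ensures that the first term on the right side of (4.5) is bounded from above by
$$\delta\Big(\operatorname{Im} R_0^+(0; z, \lambda),\, \frac{a}{2(K + 1)}\Big) + \delta\Big(\sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda),\, \frac{aK}{2(K + 1)}\Big) \le 2\, \delta\Big(\operatorname{Im} R_0^+(0; z, \lambda),\, \frac{a}{2(K + 1)}\Big) \le \frac{8\sqrt{2}\,(K + 1)}{a}\, \sqrt{\gamma_\lambda(z)}. \qquad (4.6)$$
Here the first inequality results from the rules of addition of iid random variables [2, Lemma D.1]. The second one is a consequence of (4.1). The second term on the right side of (4.5) is bounded from above according to $\delta(X, a/2) \le 2\sqrt{\gamma_\lambda(z)/a}$. This follows from (3.3) and the simple bound
$$\delta(X, a)^2 \le \big(1 - \xi_-(X, a)\big)^2 \le -\ln \xi_-(X, a) \le -\frac{\mathbb{E}[\ln X]}{a}, \qquad (4.7)$$
valid for all random variables X taking values in (0, 1]. Combining the above estimates, we arrive at (4.2).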
5. Stability of the Weyl-Titchmarsh Function Under Weak Disorder

5.1. The main stability result. Our goal in this section is to show that the boundary values of the WT function are continuous at λ = 0 in a certain distributional sense as long as $E \in \sigma_{ac}(-\Delta_T)$. Here the distribution refers to the joint dependence on the energy and the randomness. The result to be derived is:

Theorem 5.1. Let $I \subset \sigma_{ac}(-\Delta_T)$ be an interval. Then the WT function converges in $\mathcal{L}_I \otimes \mathbb{P}$-measure, i.e., for all ε > 0:
$$\lim_{\lambda \to 0} \mathcal{L}_I \otimes \mathbb{P}\Big(\Big\{\big|R_0^+(0; E + i0, \lambda, \omega) - R_0^+(0; E + i0, 0)\big| > \varepsilon\Big\}\Big) = 0, \qquad (5.1)$$
where LI denotes the Lebesgue measure on I .
The above statement will be derived in this section by proving, in Theorem 5.2 which appears below, that for all ε > 0 and all sequences (λ, η) converging to zero,
$$\lim_{\lambda, \eta \to 0} \mathcal{L}_I \otimes \mathbb{P}\Big(\Big\{\big|R_0^+(0; E + i\eta, \lambda, \omega) - R_0^+(0; E + i0, 0)\big| > \varepsilon\Big\}\Big) = 0. \qquad (5.2)$$
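Before we delve into the proof of the statements which lead to Theorem 5.1 let us note that it implies our main claim.

Proof (of Theorem 1.1; assuming Thm. 5.1). Since $\sigma_{ac}(-\Delta_{T(\lambda,\omega)})$ coincides almost-surely with a non-random set, it suffices to show that
$$\lim_{\lambda \to 0} \mathbb{E}\Big[\mathcal{L}\big(I \cap \sigma_{ac}(-\Delta_{T(\lambda,\cdot)})\big)\Big] = \mathcal{L}(I). \qquad (5.3)$$
We start the proof of this relation by observing that
$$\mathcal{L}(I) \ge \mathbb{E}\Big[\mathcal{L}\big(I \cap \sigma_{ac}(-\Delta_{T(\lambda,\cdot)})\big)\Big] \ge \mathcal{L}_I \otimes \mathbb{P}\Big(\Big\{0 < \operatorname{Im} R_0^+(0; E + i0, \lambda, \omega) < \infty\Big\}\Big), \qquad (5.4)$$
where the second inequality is due to Theorem A.2. For any ε > 0 the set on the right side includes the collection of (E, ω) for which $\varepsilon < \operatorname{Im} R_0^+(0; E + i0, 0) < \infty$ and $|\operatorname{Im} R_0^+(0; E + i0, \lambda, \omega) - \operatorname{Im} R_0^+(0; E + i0, 0)| \le \varepsilon$. Accordingly, the right side of (5.4) is bounded below by the difference of
$$\mathcal{L}_I \otimes \mathbb{P}\Big(\Big\{\big|\operatorname{Im} R_0^+(0; E + i0, \lambda, \omega) - \operatorname{Im} R_0^+(0; E + i0, 0)\big| \le \varepsilon\Big\}\Big) \qquad (5.5)$$
and
$$\mathcal{L}_I\Big(\Big\{\operatorname{Im} R_0^+(0; E + i0, 0) \in [0, \varepsilon] \cup \{\infty\}\Big\}\Big). \qquad (5.6)$$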
As λ → 0 the measure in (5.5) converges to $\mathcal{L}(I)$ by Theorem 5.1. Moreover, as ε ↓ 0 the measure in (5.6) converges to zero.

5.2. Convergence in measure. In order to derive Theorem 5.1, we shall consider the distribution under the measure $\mathcal{L}_I \otimes \mathbb{P}$ of the joint values of E, $\{R_e^+(0; E + i\eta, \lambda, \omega)\}_{e \in \mathcal{E}}$, and $\{L_e(\lambda, \omega)\}_{e \in \mathcal{E}}$. In the following, $L_{\max}$ stands for some uniform upper bound on $L_e(\lambda, \omega)$, which exists due to the boundedness of the random variables. The setup is similar to that employed in [2].

Definition 5.1. Let $(\lambda, \eta) \in [0, 1]^2$ and $I \subset \sigma_{ac}(-\Delta_T)$. The Borel measure $\nu_{(\lambda,\eta)}$ on $I \times \mathbb{C}^{\mathcal{E}} \times [0, L_{\max}]^{\mathcal{E}}$ is the measure induced by $\mathcal{L}_I \otimes \mathbb{P}$ under the mapping
$$(E, \omega) \mapsto \Big(E,\, \big\{R_e^+(0; E + i\eta, \lambda, \omega)\big\}_{e \in \mathcal{E}},\, \{L_e(\lambda, \omega)\}_{e \in \mathcal{E}}\Big). \qquad (5.7)$$
Moreover, its E-conditional distribution on $\mathbb{C}^{\mathcal{E}} \times [0, L_{\max}]^{\mathcal{E}}$ is abbreviated by $\nu^E_{(\lambda,\eta)}$.

Remarks 5.1. (i) The above definition relies on the fact that one may identify the edge sets $\mathcal{E}$ of T(λ, ω) corresponding to different values of λ and/or ω.
(ii) In case (λ, η) = (0, 0) the measure $\nu_{(\lambda,\eta)}$ is a product of the Lebesgue measure and products of Dirac measures:
$$\nu_{(0,0)} = dE\, \bigotimes_{e \in \mathcal{E}} \delta_{R_0^+(0; E + i0, 0)}\, \bigotimes_{e \in \mathcal{E}} \delta_L. \qquad (5.8)$$
(iii) The family of finite measures $\nu_{(\lambda,\eta)}$ is tight. Indeed, the bound (A.2) in Appendix A and arguments as in [2, Prop. B.1 & Lemma B.1] show that
$$\inf_{t > 0}\, \sup_{(\lambda,\eta) \in [0,1]^2} \nu_{(\lambda,\eta)}\big(|R_e| > t\big) = 0 \qquad (5.9)$$
for all $e \in \mathcal{E}$. Accordingly, every sequence of measures $\nu_{(\lambda,\eta)}$ corresponding to (λ, η) → (0, 0) has weak accumulation points.

The issue now is to show that all of the above mentioned accumulation points coincide.

Theorem 5.2. In the sense of weak convergence:
$$\lim_{\lambda, \eta \to 0} \nu_{(\lambda,\eta)} = \nu_{(0,0)}. \qquad (5.10)$$
The proof of this theorem closely follows ideas in [2], and rests on the following two lemmas. We first show that all accumulation points of the sequence in (5.10) are supported on points satisfying the limiting recursion relation.

Lemma 5.1. Let ν be a (weak) accumulation point for the family of measures $\nu_{(\lambda,\eta)}$, with the parameters (λ, η) in [0, 1] × (0, 1] converging to (0, 0). Then
(i) The limiting recursion relation
$$\Big(\cos\big(\sqrt{E} L_e\big) + R_e\, \frac{\sin\big(\sqrt{E} L_e\big)}{\sqrt{E}}\Big) \sum_{f \in N_e^+} R_f = \cos\big(\sqrt{E} L_e\big)\, R_e - \sqrt{E}\, \sin\big(\sqrt{E} L_e\big) \qquad (5.11)$$
holds for ν-almost all $(E, R, L) \in I \times \mathbb{C}^{\mathcal{E}} \times [0, L_{\max}]^{\mathcal{E}}$.
(ii) The lengths are ν-almost surely constant, $L_e = L$ for all $e \in \mathcal{E}$.
(iii) The variables $\{R_e\}_{e \in \mathcal{E}}$ are identically distributed ν-almost surely.
(iv) For Lebesgue-almost all E ∈ I there exist I ∈ [0, ∞) and M ∈ [0, ∞) such that for all $e \in \mathcal{E}$
$$\operatorname{Im} R_e = I \quad \text{and} \quad \Big|\cos\big(\sqrt{E} L\big) + R_e\, \frac{\sin\big(\sqrt{E} L\big)}{\sqrt{E}}\Big| = M \qquad (5.12)$$
$\nu^E$-almost surely.

Proof. (i) The fact that the accumulation points obey the limiting recursion relation which is the (λ, η) = (0, 0) version of (A.3) and (2.10) is a consequence of the general principle proven in [2, Prop. 4.1].
(ii) This statement is implied by the pointwise convergence $\lim_{\lambda \to 0} L_e(\lambda, \omega) = L$.
(iii) The claim follows from the fact that all prelimit quantities are identically distributed.
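(iv) We fix $e \in \mathcal{E}$. Then Theorem 4.1 and Theorem 3.1 yield
$$\lim_{\lambda, \eta \to 0} \int_I \delta\big(\operatorname{Im} R_e^+(0; E + i\eta, \lambda, \cdot),\, a\big)\, dE = 0, \qquad (5.13)$$
$$\lim_{\lambda, \eta \to 0} \int_I \delta\Big(\Big|\cos\big(\sqrt{E + i\eta}\, L_e(\lambda, \cdot)\big) + R_e^+(0; E + i\eta, \lambda, \cdot)\, \frac{\sin\big(\sqrt{E + i\eta}\, L_e(\lambda, \cdot)\big)}{\sqrt{E + i\eta}}\Big|^2,\, a\Big)\, dE = 0 \qquad (5.14)$$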
for all a ∈ (0, 1/2]. By [2, Lemma D.4] this implies that both random variables in (5.12) are almost surely constant for Lebesgue-almost all E ∈ I. Since they are identically distributed for all $e \in \mathcal{E}$, the constants I and M are independent of e.

The explicit expression (1.4) shows that $\sin(\sqrt{E} L)/\sqrt{E} \neq 0$ for all $E \in \sigma_{ac}(-\Delta_T)$. Therefore, (5.12) asserts that the $R_e$-marginals of $\nu^E$ are supported on the intersection of a line with a circle, that is, on at most two points. Next, we show that this support contains only one point which coincides with $R_0(0, E + i0, 0)$.

Lemma 5.2. Assume the situation of Lemma 5.1. Then for Lebesgue-almost all E ∈ I:
(i) there exists $\Gamma \in \mathbb{C}$ with $\operatorname{Im} \Gamma \ge 0$ such that for all $e \in \mathcal{E}$,
$$R_e = \Gamma \qquad (5.15)$$
$\nu^E$-almost surely.
(ii) $\Gamma = R_0(0, E + i0, 0)$.

Proof. (i) By Lemma 5.1 there exist $\Gamma_\pm \in \mathbb{C}$ with $\operatorname{Im} \Gamma_\pm \ge 0$ such that the $R_e$-marginal of $\nu^E$ is supported on $\{\Gamma_+, \Gamma_-\}$ for all $e \in \mathcal{E}$. Suppose that $\Gamma_+ \neq \Gamma_-$. Then the distribution of $\sum_{f \in N_e^+} R_f$ is supported on at least three points. This follows by explicitly identifying three points $K\Gamma_\pm$ and $(K - 1)\Gamma_+ + \Gamma_-$ in the support. But this contradicts the limiting recursion relation (5.11), since then the distribution of the left side contains at least three points in its support but the distribution of the right side at most two.
(ii) Equation (5.11) with Γ substituted for all $R_e$, and $L_e = L$ for all edges e, is quadratic in Γ. For Lebesgue almost all $E \in \sigma_{ac}(-\Delta_T)$ this equation has a complex non-real solution, and in this case $R_0^+(0, E + i0, 0)$ is its only solution in the upper half plane.
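The quadratic mentioned in the proof of Lemma 5.2(ii) can be solved explicitly. The sketch below assumes E > 0 and writes Γ for the common value of all $R_e$ in (5.11) with $L_e = L$, which gives $K \frac{\sin(\sqrt{E}L)}{\sqrt{E}}\,\Gamma^2 + (K-1)\cos(\sqrt{E}L)\,\Gamma + \sqrt{E}\sin(\sqrt{E}L) = 0$. The function names are hypothetical and chosen only for this illustration. The discriminant is negative exactly when $|\cos(\sqrt{E}L)| < 2\sqrt{K}/(K+1)$, a condition derived here from the quadratic rather than quoted from the text.

```python
import cmath
import math


def regular_tree_wt(E, K, L):
    """Root Gamma with Im Gamma > 0 of the fixed-point form of (5.11).

    Returns None when both roots are real, i.e. when no solution lies
    strictly in the upper half plane. Assumes E > 0.
    """
    if E <= 0:
        raise ValueError("this sketch assumes E > 0")
    s = math.sqrt(E)
    c, sn = math.cos(s * L), math.sin(s * L)
    # (c + G*sn/s) * K*G = c*G - s*sn   <=>   A*G**2 + B*G + C = 0
    A = K * sn / s
    B = (K - 1) * c
    C = s * sn
    disc = B * B - 4 * A * C
    if A == 0 or disc >= 0:
        return None
    root = cmath.sqrt(disc)  # purely imaginary since disc < 0
    G = (-B + root) / (2 * A)
    # The two roots are complex conjugates; pick the one with Im > 0.
    return G if G.imag > 0 else (-B - root) / (2 * A)


def recursion_residual(G, E, K, L):
    """Left side minus right side of (5.11) with all R_e equal to G."""
    s = math.sqrt(E)
    c, sn = math.cos(s * L), math.sin(s * L)
    return (c + G * sn / s) * K * G - (c * G - s * sn)


if __name__ == "__main__":
    G = regular_tree_wt(2.0, 3, 0.7)
    print(G, abs(recursion_residual(G, 2.0, 3, 0.7)))
```

For energies where the function returns `None`, the quadratic has only real roots, which is consistent with such energies lying outside the set where a non-real solution exists.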
5.3. Section summary. Let us now note that the above lemmas imply the two theorems stated in this section: Proof of Theorem 5.2. Lemmas 5.1 and 5.2 jointly imply Theorem 5.2.
Proof of Theorem 5.1. As is discussed in [2], an application of Fatou’s lemma yields (5.1) from (5.2).
6. Extensions

6.1. More general vertex conditions. A variety of boundary conditions other than (1.2) lead to self-adjoint Laplacians on metric graphs [7, 12]. Of those, the argument presented here can be readily extended to the class of symmetric BC. These require at each vertex:
1. for some fixed β ∈ [0, π] the following is common to all the edges e adjacent to the vertex:
$$\cos(\beta)\, \psi + \sin(\beta)\, n_e \cdot \nabla \psi \qquad (6.1)$$
with $n_e \cdot \nabla$ the outward derivative,
2. for some fixed α ∈ [0, π] the sums over all edges adjacent to the vertex satisfy
$$\cos(\alpha) \sum_e \psi_e - \sin(\alpha) \sum_e n_e \cdot \nabla \psi = 0. \qquad (6.2)$$
The symmetric class includes the Kirchhoff BC (1.2), for which β = 0 and α = π/2. Our analysis extends to the general symmetric BC through a rotation which mixes the function $\psi^+$ and its derivative, where $\psi^+$ is defined as below Theorem 2.1 with the present boundary conditions. We denote:
$$\tilde\psi^+(x; z|0) := \cot(\beta)\, \psi^+(x; z|0) + \frac{\partial}{\partial x} \psi^+(x; z|0) = -\frac{\psi^+(x; z|0)}{\tilde R^+(x; z)}, \qquad (6.3)$$
and correspondingly
$$\tilde R^+(x; z) := -\big(\cot(\beta) + R^+(x; z)\big)^{-1}. \qquad (6.4)$$
Under the above boundary conditions it is the function $\tilde\psi^+(x; z|0)$ which takes a common value among the forward edges of any vertex. The current can be expressed in terms of the ‘rotated’ quantities as
$$J^+(x; z|0) \equiv \big|\psi^+(x; z|0)\big|^2 \operatorname{Im} R^+(x; z) = \big|\tilde\psi^+(x; z|0)\big|^2 \operatorname{Im} \tilde R^+(x; z). \qquad (6.5)$$
The argument, as it is outlined in Sect. 2.3, applies verbatim with $\psi^+(x; z|0)$ and $R^+(x; z)$ replaced by $\tilde\psi^+(x; z|0)$ and $\tilde R^+(x; z)$. In this context the relevant Lyapunov exponent is
$$\tilde\gamma_\lambda(z) := -\mathbb{E}\left[\log\left(\sqrt{K}\, \left|\frac{\tilde\psi_f^+(0; z, \lambda, \cdot|0)}{\tilde\psi_0^+(0; z, \lambda, \cdot|0)}\right|\right)\right], \qquad (6.6)$$
where f is an arbitrary edge forward to the edge emanating from the root. It follows from (6.3) that the above expression with $\tilde\psi^+$ yields the same value as with $\psi^+$, i.e., $\tilde\gamma_\lambda(z) = \gamma_\lambda(z)$, the latter being defined by (2.14).

6.2. Tree graphs with decorations. By gluing a copy of a finite metric graph G to every vertex of the tree T one obtains a metric graph $T_G$ which is referred to as a decorated tree. The Laplacian $-\Delta_{T_G}$ is rendered self-adjoint by imposing, for example, Kirchhoff BC. Such decorations provide a mechanism for the creation of gaps in the spectrum [20, 13]. The strategy presented here allows one to establish the stability of the AC spectrum under random deformations of a (uniformly) decorated tree even if G has loops. In deriving the fluctuation bounds in this case, in the sum on the left side of (2.13) one may omit the terms $\operatorname{Im} R_f^+$ which correspond to directions f into the decorating parts. These terms vanish for real z since the finite graph G does not conduct current to infinity.
A. Appendix: More on the Weyl-Titchmarsh Function on Tree Graphs This appendix is devoted to the WT functions R ± on general metric tree graphs T, presented in Definition 2.1. We start by proving Theorem 2.1 on which the definition relies. Basic properties of R ± are the topic of the second subsection. The third subsection deals with the Green function on T and its relation to the WT functions.
A.1. Uniqueness of square-integrable solutions on graphs with a dangling end. We will now give a proof of Theorem 2.1.

Proof. (i) That there is at least one non identically vanishing function in the kernel of the operator $-\Delta^*_{G,u} - z$ can be seen by elongating the dangling edge beyond u thereby creating a backward extension $G' \supset G$. We set
$$\psi(x; z) = (-\Delta_{G'} - z)^{-1} \varphi(x), \qquad (A.1)$$
where φ is some non identically vanishing function compactly supported on the elongation of the edge containing u and $-\Delta_{G'}$ is a self-adjoint Laplacian on $G'$. This function (A.1) does not vanish identically on G, since otherwise (by (A.4) below) it would be identically zero on the whole edge containing u and the support of φ. Suppose now there is another solution which is linearly independent of ψ(·; z). Since the solution space on the edge adjacent to u is two dimensional, one can linearly combine them to satisfy a self-adjoint BC (cf. (1.3)) at u. Thereby one produces an eigenfunction of a self-adjoint Laplacian $-\Delta_G$ with eigenvalue $z \in \mathbb{C}^+$. This contradicts the self-adjointness.
(ii) In fact, more generally for all x ∈ G which disconnect the graph, we have that $\cos(\alpha)\, \psi(x; z) - \sin(\alpha)\, \psi'(x; z) \neq 0$ for all α ∈ [0, π). Otherwise one would have found a square-integrable, non-trivial eigenfunction with eigenvalue $z \in \mathbb{C}^+$ of a restriction of $-\Delta^*_{G,u}$ to functions on that disconnected piece, which does not contain u. Since this Laplacian is rendered self-adjoint by imposing α-BC at x, this is a contradiction.
(iii) This is an immediate consequence of (A.1).
A.2. Basic properties of the Weyl-Titchmarsh functions. Following are some properties of $R^+(x; z)$ which are of relevance in the main part of the paper. Similar statements apply to $R^-$, with proofs differing only in the notation.

Theorem A.1. The WT function $R^+(x; z)$ has the following properties:
(i) $R^+(x; \cdot) : \mathbb{C}^+ \to \mathbb{C}^+$ is analytic for fixed x.
(ii) For each $e \in \mathcal{E}$ and all $z \in \mathbb{C}^+$:
$$\big|R_e^+(0; z)\big| \le \frac{2\sqrt{|z|}}{1 - \exp\big(-2 L_e \operatorname{Im}\sqrt{z}\big)}, \qquad (A.2)$$
and $|R_e^+(L_e; z)| \le 2K\sqrt{|z|}\, \big(1 - \exp(-2 L_e \operatorname{Im}\sqrt{z})\big)^{-1}$ due to (2.10).
(iii) Along any edge e the function obeys the Riccati equation (2.9). In particular its values are related by Möbius transformations:
$$R_e^+(l; z) = \frac{R_e^+(0; z) \cos\big(\sqrt{z}\, l\big) - \sqrt{z}\, \sin\big(\sqrt{z}\, l\big)}{\cos\big(\sqrt{z}\, l\big) + R_e^+(0; z) \sin\big(\sqrt{z}\, l\big)/\sqrt{z}} \qquad (A.3)$$
for all $l \in [0, L_e]$ and $z \in \mathbb{C}^+$.
(iv) Equipping the space $[L_{\min}, \infty)^{\mathcal{E}}$ with the uniform topology, $R_0^+(0; z)$ is a continuous function of $\{L_e\}_{e \in \mathcal{E}} \in [L_{\min}, \infty)^{\mathcal{E}}$ for all $z \in \mathbb{C}^+$.

Proof. (i) The first assertion follows from the analyticity of $\psi^+(x; z|0)$ and of its derivative (cf. Theorem 2.1). The Herglotz-Nevanlinna property is a consequence of (2.7).
(ii) This is an immediate consequence of (A.6) and Lemma A.1(i) below.
(iii) This assertion follows from the fact that $\psi_e^+(\cdot; z|0) \in H^2[0, L_e]$ is a solution of the free Schrödinger equation $-\psi'' = z\psi$ on the interval $[0, L_e]$ which, using the boundary conditions at l = 0, may be written as
$$\frac{\psi_e^+(l; z|0)}{\psi_e^+(0; z|0)} = \cos\big(\sqrt{z}\, l\big) + R_e^+(0; z)\, \frac{\sin\big(\sqrt{z}\, l\big)}{\sqrt{z}} \qquad (A.4)$$
for all $l \in [0, L_e]$.
(iv) Suppose the metric tree T is finite and has only N generations, i.e., the number of edges connecting any edge to the root is at most N. In this case, the continuity of $R_0(0; z)$ follows from the explicit evolution equations (A.3) and (2.10). Lemma A.1(iii) below shows that $R_0(0; z)$ may be uniformly approximated by its values on a finite tree provided $\operatorname{Im}\sqrt{z}$ is large enough. Hence $R_0(0; z)$ is continuous for those $z \in \mathbb{C}^+$. Since $R_0(0; z)$ is analytic in $z \in \mathbb{C}^+$, this implies continuity for all $z \in \mathbb{C}^+$.

Remark A.1. Another immediate consequence of (A.4) and its analog with 0 and $L_e$ interchanged, is the bound
$$\Big(1 + \frac{|R_e^+(L_e; z)|}{\sqrt{|z|}}\Big)^{-1} e^{-\sqrt{|z|} L_e} \le \left|\frac{\psi_e^+(L_e; z|0)}{\psi_e^+(0; z|0)}\right| \le e^{\sqrt{|z|} L_e} \Big(1 + \frac{|R_e^+(0; z)|}{\sqrt{|z|}}\Big) \qquad (A.5)$$
which shows that $\psi^+(\cdot; z|0)$ de- or increases at most exponentially on any edge e.

Instead of the WT function $R^+$, it is sometimes more convenient to consider its transform
$$m(x; z) := \frac{R^+(x; z) - i\sqrt{z}}{R^+(x; z) + i\sqrt{z}} \qquad (A.6)$$
which takes values in the complex unit disk. The evolution on the edges takes a particularly simple form for m. In fact, from (A.3) and (2.10) one obtains
$$m_e(0; z) = e^{2i\sqrt{z} L_e}\, m_e(L_e; z), \quad \text{and} \quad m_e(L_e; z) = g\Big(\sum_{f \in N_e^+} \frac{1 + m_f(0; z)}{1 - m_f(0; z)}\Big), \qquad (A.7)$$
where $g(\zeta) := \frac{\zeta - 1}{\zeta + 1}$. The next lemma collects some facts which are used in the proof of Theorem A.1.
Lemma A.1. Let $z \in \mathbb{C}^+$ and assume $L_e \ge L_{\min} > 0$ for all $e \in \mathcal{E}$. Then m(x; z) has the following properties:
(i) It satisfies: $|m_e(0; z)| \le \exp\big(-2 L_e \operatorname{Im}\sqrt{z}\big)$.
(ii) At the root the dependence on a particular value $m_e(L_e; z)$ is uniformly exponential in the sense that there exists a constant c < ∞ such that for all $\operatorname{Im}\sqrt{z}$ sufficiently large:
$$\left|\frac{\partial m_0(0; z)}{\partial m_e(L_e; z)}\right| \le c^N \exp\big(-2N L_{\min} \operatorname{Im}\sqrt{z}\big), \qquad (A.8)$$
where N is the number of vertices between the edge e and the root.
(iii) Let $m_0(0; z)$ and $\tilde m_0(0; z)$ correspond to metric tree graphs T and $\tilde T$ which coincide up to the Nth generation. Then for all $\operatorname{Im}\sqrt{z}$ sufficiently large:
$$\big|m_0(0; z) - \tilde m_0(0; z)\big| \le 2 K^{N+1} c^N \exp\big(-2N L_{\min} \operatorname{Im}\sqrt{z}\big). \qquad (A.9)$$

Proof. (i) This is an immediate consequence of the first evolution equation (A.7).
(ii) Using the chain rule this can be traced back to a straightforward differentiation of Eqs. (A.7). The edge and vertex terms are subsequently bounded with the help of (i).
(iii) We expand the difference into a telescopic sum of $K^{N+1}$ differences and use both (ii) and the fact that the values of m(·; z) and $\tilde m(\cdot; z)$ on the $K^{N+1}$ leaves in the Nth generation differ at most by a complex number of modulus 2.
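The edge evolution used above lends itself to a quick numerical sanity check. The following sketch, with hypothetical function names and arbitrarily chosen test values, evolves a value $R_e^+(0; z)$ along an edge with the Möbius map (A.3), estimates the residual of the Riccati equation (2.9) by a central difference, and compares the transform (A.6) at the two ends of the edge with the first relation in (A.7). It assumes the principal branch of $\sqrt{z}$, which has positive imaginary part for $z \in \mathbb{C}^+$.

```python
import cmath


def mobius_step(R0, z, l):
    """Evolve R along an edge by length l using the Moebius map (A.3)."""
    s = cmath.sqrt(z)  # principal branch: Im sqrt(z) > 0 for Im z > 0
    c, sn = cmath.cos(s * l), cmath.sin(s * l)
    return (R0 * c - s * sn) / (c + R0 * sn / s)


def disk_transform(R, z):
    """The transform (A.6): m = (R - i sqrt(z)) / (R + i sqrt(z))."""
    s = cmath.sqrt(z)
    return (R - 1j * s) / (R + 1j * s)


def riccati_residual(R0, z, l, h=1e-5):
    """Central-difference estimate of dR/dl + z + R**2 at position l."""
    dR = (mobius_step(R0, z, l + h) - mobius_step(R0, z, l - h)) / (2 * h)
    return dR + z + mobius_step(R0, z, l) ** 2


if __name__ == "__main__":
    z, R0, L = 1.0 + 0.5j, 0.3 + 0.7j, 0.8  # illustrative values only
    print(abs(riccati_residual(R0, z, 0.4)))
    m0 = disk_transform(R0, z)
    mL = disk_transform(mobius_step(R0, z, L), z)
    print(abs(m0 - cmath.exp(2j * cmath.sqrt(z) * L) * mL))
```

Both printed quantities should be close to zero, up to discretization and rounding error. The check covers only the evolution along a single edge; the vertex relation in (A.7) is not exercised here.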
A.3. The Green function on a tree graph. Analogously to one dimension [4, 3], the Green function of the Laplacian $-\Delta_T$ on a metric tree graph T can be constructed using two non-vanishing square-integrable functions. In fact, the following lemma is straightforward.

Lemma A.2. The Green function $G_T(u, x; z)$ of the Laplacian $-\Delta_T$ can be expressed as
$$G_T(u, x; z) = \frac{\big(\psi^+ \wedge \psi^-\big)(u, x; z|v)}{W(\psi^+, \psi^-)(u; z|v)}, \qquad (A.10)$$
independently of v, as long as $v \in T_u^+ \cap T_x^+$. Here
$$\big(\psi^+ \wedge \psi^-\big)(u, x; z|v) := \begin{cases} \psi^+(u; z|0)\, \psi^-(x; z|v) & \text{for } x \in T_u^-, \\ \psi^-(u; z|v)\, \psi^+(x; z|0) & \text{for } x \in T_u^+, \end{cases} \qquad (A.11)$$
and $W(\psi^+, \psi^-) := \psi^+\, \partial\psi^-/\partial x - \partial\psi^+/\partial x\, \psi^-$ is the Wronskian.

Remarks A.1. (i) The Wronskian is constant along any edge in $T_v^-$. In particular, this implies that $W(\psi^+, \psi^-) \neq 0$, since otherwise one could linearly combine $\psi^\pm$ to a square-integrable solution of $(-\Delta_T - z)\psi = 0$ on the whole tree.
(ii) The right side of (A.10) defines an integral kernel of the resolvent $(-\Delta_T - z)^{-1}$ which is jointly continuous in (u, x).
(iii) Setting $f_v^\pm(\cdot; z|\cdot) := \psi^\pm(\cdot; z|\cdot)/\psi^\pm(v; z|\cdot)$, where v is any point on the same edge as u, the Green function (A.10) can be rewritten in terms of WT functions:
$$G_T(u, x; z) = -\frac{\big(f_v^+ \wedge f_v^-\big)(u, x; z)}{R^+(v; z) + R^-(v; z)}. \qquad (A.12)$$
In particular, for u = x = v, we obtain (2.5). Moreover, at the root, we obtain
$$G_T(0, 0; z) = \big(\cot\alpha - R^+(0; z)\big)^{-1}, \qquad \alpha \neq 0, \qquad (A.13)$$
because $R^-(0; z) = -\cot\alpha$ due to the BC (1.3).

For a self-adjoint Sturm-Liouville, or more specifically, Schrödinger operator on the half-line, the WT function at the origin allows one to reconstruct the spectral measure and therefore contains all spectral information [4, 3]. Generally, this fails to hold for operators on tree graphs. However, the AC spectrum of $-\Delta_T$ can still be detected by the boundary value of $R^+(0; z)$:

Theorem A.2. The AC spectrum $\sigma_{ac}(-\Delta_T)$ of the Laplacian on a rooted metric tree graph T is concentrated on the set
$$\big\{E \in \mathbb{R} : 0 < \operatorname{Im} R_0^+(0; E + i0) < \infty\big\}. \qquad (A.14)$$

Proof. Pick any edge $e \in \mathcal{E}$ and let φ be a compactly supported function on e. A straightforward but tedious computation using (A.12) shows that for Lebesgue-almost all E ∈ R the AC density of the spectral measure associated with φ is given by
$$\lim_{\eta \downarrow 0} \operatorname{Im} \big\langle \varphi, (-\Delta_T - E - i\eta)^{-1} \varphi \big\rangle = \frac{\operatorname{Im} R_e^+(0; E + i0)\, g_\varphi^-(E) + \operatorname{Im} R_e^-(0; E + i0)\, g_\varphi^+(E)}{\big|R_e^+(0; E + i0) + R_e^-(0; E + i0)\big|^2}, \qquad (A.15)$$
where
$$g_\varphi^\pm(E) := \big\langle \varphi, \operatorname{Re} f^\pm(\cdot; E) \big\rangle^2 + \big\langle \varphi, \operatorname{Im} f^\pm(\cdot; E) \big\rangle^2 \qquad (A.16)$$
and $f^\pm(\cdot; E)$ is the solution of the Schrödinger equation $(-\Delta_T - E) f = 0$, which satisfies $df_f^\pm/dx\,(0; E) = \pm R_f^\pm(0; E + i0)$ at every edge f and is normalized to $f_e^\pm(0; E) = 1$. By the current conservation (2.12) along each edge and the positivity and additivity of the current at each vertex, we have
$$\operatorname{Im} R_e^+(0; E + i0) \le \big|f_0^+(0; E)\big|^2\, \operatorname{Im} R_0^+(0; E + i0) \qquad (A.17)$$
for Lebesgue-almost all E ∈ R. Similarly, by tracing the current flow on the backward tree emanating from e and using $\operatorname{Im} R_0^-(0; E + i0) = 0$, we obtain for Lebesgue-almost all E ∈ R,
$$\operatorname{Im} R_e^-(0; E + i0) = \sum_f \big|f_f^-(0; E)\big|^2\, \operatorname{Im} R_f^+(0; E + i0), \qquad (A.18)$$
where the sum extends over all edges $f \neq e$, which have the same distance to the root as e. From the above considerations we conclude that for any e and Lebesgue-almost all E ∈ R if $\operatorname{Im} R_0^+(0; E + i0) = 0$, then
1. $\operatorname{Im} R_e^+(0; E + i0) = 0$, and
2. $\operatorname{Im} R_e^-(0; E + i0) = 0$.
But this shows that $\sigma_{ac}(-\Delta_T)$ is indeed concentrated on the set in (A.14).
Acknowledgement. We are grateful to Uzy Smilansky for stimulating discussions concerning quantum graphs. We would also like to express thanks for the gracious hospitality enjoyed at the Weizmann Institute (MA) and the Department of Mathematics at UC Davis (SW). This work was supported in part by the Einstein Center for Theoretical Physics and the Minerva Center for Nonlinear Physics at the Weizmann Institute, by the US National Science Foundation, and by the Deutsche Forschungsgemeinschaft.
References
1. Acosta, V., Klein, A.: Analyticity of the density of states in the Anderson model in the Bethe lattice. J. Stat. Phys. 69, 277–305 (1992)
2. Aizenman, M., Sims, R., Warzel, S.: Stability of the absolutely continuous spectrum of random Schrödinger operators on tree graphs. http://arxiv.org/list/math-ph/0502006, 2005. To appear in Probab. Theor. Relat. Fields
3. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser, 1990
4. Coddington, E.A., Levinson, N.: Theory of ordinary differential equations. New York: McGraw-Hill, 1955
5. Duren, P.L.: Theory of H^p spaces. New York: Academic, 1970
6. Hupfer, T., Leschke, H., Müller, P., Warzel, S.: Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials. Rev. Math. Phys. 13, 1547–1581 (2001)
7. Kostrykin, V., Schrader, R.: Kirchhoff’s rule for quantum wires. J. Phys. A 32, 595–630 (1999)
8. Kostrykin, V., Schrader, R.: A random necklace model. Waves and random media 14, S75–S9032 (2004)
9. Kotani, S.: One-dimensional random Schrödinger operators and Herglotz functions. In: K. Ito (ed.), Taneguchi Symp. PMMP, Amsterdam: North Holland, 1985, pp. 219–250
10. Kottos, T., Smilansky, U.: Periodic orbit theory and spectral statistics for quantum graphs. Ann. Phys. 274, 76–124 (1999)
11. Kuchment, P.: Graph models for waves in thin structures. Waves and random media 12, R1–R24 (2002)
12. Kuchment, P.: Quantum graphs: I. Some basic structures. Waves and random media 14, S107–S128 (2004)
13. Kuchment, P.: Quantum graphs II. Some spectral properties of quantum and combinatorial graphs. preprint. J. Phys. A: Math. Gen. 38, 4887–4900 (2005)
14. Miller, J.D., Derrida, B.: Weak disorder expansion for the Anderson model on a tree. J. Stat. Phys. 75, 357–388 (1993)
15. Minami, N.: An extension of Kotani’s theorem to random generalized Sturm-Liouville operators. Commun. Math. Phys. 103, 387–402 (1986)
16. Minami, N.: An extension of Kotani’s theorem to random generalized Sturm-Liouville operators II. In: Stochastic processes in classical and quantum systems, Lecture Notes in Physics 262, Berlin-Heidelberg-New York: Springer, 1986, pp. 411–419
17. Pastur, L., Figotin, A.: Spectra of random and almost-periodic operators. Berlin: Springer, 1992
18. Reed, M., Simon, B.: Methods of modern mathematical physics IV: Analysis of operators. New York: Academic Press, 1978
19. Schanz, H., Smilansky, U.: Periodic-orbit theory of Anderson-localization on graphs. Phys. Rev. Lett. 84, 1427–1430 (2000)
20. Schenker, J.H., Aizenman, M.: The creation of spectral gaps by graph decorations. Lett. Math. Phys. 53, 253–262 (2000)
21. Sobolev, A.V., Solomyak, M.: Schrödinger operators on homogeneous metric trees: spectrum in gaps. Rev. Math. Phys. 14, 421–467 (2002)
22. Solomyak, M.: On the spectrum of the Laplacian on regular metric trees. Waves and random media 14, S155–S171 (2004)

Communicated by B. Simon
Commun. Math. Phys. 264, 391–410 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1469-4
Communications in
Mathematical Physics
Fermionic Quantization of Hopf Solitons
S. Krusch (1), J.M. Speight (2)
(1) Institute of Mathematics, University of Kent, Canterbury CT2 7NF, England. E-mail: [email protected]
(2) Department of Pure Mathematics, University of Leeds, Leeds LS2 9JT, England. E-mail: [email protected]
Received: 14 April 2005 / Accepted: 10 June 2005 Published online: 24 November 2005 – © Springer-Verlag 2005
Abstract: In this paper we show how to quantize Hopf solitons using the Finkelstein-Rubinstein approach. Hopf solitons can be quantized as fermions if their Hopf charge is odd. Symmetries of classical minimal energy configurations induce loops in configuration space which give rise to constraints on the wave function. These constraints depend on whether the given loop is contractible. Our method is to exploit the relationship between the configuration spaces of the Faddeev-Hopf and Skyrme models provided by the Hopf fibration. We then use recent results in the Skyrme model to determine whether loops are contractible. We discuss possible quantum ground states up to Hopf charge Q = 7.

1. Introduction

The possibility of knot-like solitons in a nonlinear field theory was first proposed by Faddeev in 1975, [10]. In 1997, interest in the model was revived by an article by Faddeev and Niemi [11]: the advent of larger computer power and a better understanding of the initial conditions led to a series of papers. In [15] axially symmetric configurations were studied extensively. Papers by Battye and Sutcliffe showed that for higher Hopf charge twisted, knotted and linked configurations occur [7, 8]. The most recent results are due to Hietarinta and Salo [16, 17]. Stable and metastable static solutions have now been explored up to Hopf charge Q = 8.

Quantization of Hopf solitons was first discussed in [15]. More recently Su described a collective coordinate quantization in [27] which was motivated by the collective coordinate quantization of Skyrmions in [1]. However, collective coordinate quantizations can be potentially misleading unless the topology of configuration space is examined carefully [5].

In this paper we describe the fermionic quantization of Hopf solitons following an old idea of Finkelstein and Rubinstein [12]. Solitons in scalar field theories can consistently be quantized as fermions provided the fundamental group of configuration space
S. Krusch, J.M. Speight
has a Z2 subgroup generated by a loop in which two identical solitons are exchanged. Loops in configuration space give rise to so-called Finkelstein-Rubinstein constraints which depend on whether the loop is contractible. The Skyrme model [28] was the main motivation for this approach; see [20] for further references. Symmetries of classical configurations induce loops in configuration space. After quantization these loops give rise to constraints on the wave function. Recently, a simple formula has been found to determine whether a loop in the configuration space of Skyrmions is contractible [20]. We shall exploit the fact that Skyrmions and Hopf solitons are related via the Hopf map to use Skyrmions as a tool to study Hopf solitons. This paper is organized as follows. In Sect. 2 we discuss the configuration space of Hopf solitons for general domains. The configuration space of Skyrmions can be related to Hopf solitons via the Hopf map which is a fibration. This mathematical structure enables us to prove that the Hopf map induces, in certain circumstances, an isomorphism between the fundamental groups of the Skyrme and Faddeev-Hopf configuration spaces. In Sect. 3 we summarize some known facts about Hopf solitons. In Sect. 4 we describe how to quantize a Hopf soliton as a fermion and calculate possible ground states in the Faddeev-Hopf model. In the following section, we discuss collective coordinate quantization in this context. We end with some concluding remarks.
2. The Topology of Configuration Space

Let $M$ be a compact, connected, oriented 3-manifold and $p_0 \in M$ be a marked point. The case of most interest is $M = S^3$, interpreted as the one point compactification of $\mathbb{R}^3$ with $p_0$ representing the boundary at infinity. The configuration space we seek to study is $(S^2)^M$, the space of based maps $M \to S^2$, that is, continuous maps sending the chosen point $p_0$ to a chosen point in $S^2$, $(0, 0, 1)$ say. We also define the space $\mathrm{Free}(M, S^2)$ of unbased maps $M \to S^2$ and similarly $(S^3)^M$ and $\mathrm{Free}(M, S^3)$, where the chosen point is $(1, 0) \in S^3 \subset \mathbb{C}^2$, say. All such spaces are given the compact open topology (equivalent to the $C^0$ topology). Our goal in this section is to relate the topology of $(S^2)^M$, the Faddeev-Hopf configuration space, to that of $(S^3)^M$, the standard Skyrme configuration space.

The connected components of $(S^2)^M$ were enumerated and classified by Pontrjagin [25]. Let $\mu$ be a generating 2-cocycle for $H^2(S^2; \mathbb{Z}) = \mathbb{Z}$. Then given $\phi \in (S^2)^M$ one has an associated 2-cocycle $\phi^*\mu \in H^2(M; \mathbb{Z})$ by pullback. No two maps $M \to S^2$ having noncohomologous 2-cocycles can be homotopic, and every 2-cocycle on $M$ is cohomologous to the pullback of $\mu$ by some map. Thus, the homotopy classes of maps $M \to S^2$ fall into disjoint families labelled by $H^2(M; \mathbb{Z})$. Within any such family, the classes are labelled by elements of $H^3(M; \mathbb{Z})/(2[\phi^*\mu] \cup H^1(M; \mathbb{Z}))$. Note that this group varies from family to family and that to compute it requires knowledge of the ring structure on $H^*(M; \mathbb{Z})$. The most important family is the one with $[\phi^*\mu] = 0$, the so-called algebraically inessential maps. Classes within this family are labelled by elements of $H^3(M; \mathbb{Z}) = \mathbb{Z}$, identified with the Hopf charge $Q$, which we would like to interpret as the soliton number of the configuration, that is, the excess of solitons over antisolitons. Let us denote the space of algebraically inessential maps by $(S^2)^M_* \subset (S^2)^M$. Note that these sets coincide if $H^2(M; \mathbb{Z}) = 0$, for example, when $M = S^3$. Configurations outside $(S^2)^M_*$ wrap some 2-cycle in $M$ nontrivially around $S^2$. They are bound to some topological defect in physical space and so are arguably not localized topological solitons at all. We shall not consider their physics in this paper.
Fermionic Quantization of Hopf Solitons
Our main tool will be the Hopf map $\pi : S^3 \to S^2$, most conveniently defined by identifying $S^3$ with the unit sphere in $\mathbb{C}^2$ and $S^2$ with $\mathbb{CP}^1$, for then

$$\pi : (z_1, z_2) \mapsto [z_1, z_2]. \tag{2.1}$$
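The Hopf map (2.1) can be made concrete with a minimal sketch. It assumes the identification of $[z_1, z_2]$ with the stereographic coordinate $W = z_1/z_2$ projected from $(0, 0, 1)$, which is the convention used in Sect. 3; the function name `hopf_map` and the sample points are illustrative only.

```python
import cmath
import math


def hopf_map(z1, z2):
    """Send (z1, z2) on the unit sphere in C^2 to a unit vector in R^3.

    Assumed convention: [z1, z2] <-> W = z1 / z2, with W the stereographic
    coordinate obtained by projecting from (0, 0, 1).
    """
    w = z1 * z2.conjugate()
    return (2.0 * w.real, 2.0 * w.imag, abs(z1) ** 2 - abs(z2) ** 2)


# The marked point (1, 0) is sent to the North pole (0, 0, 1).
north = hopf_map(1 + 0j, 0j)

# Two points on the same U(1) fibre have the same image.
z1, z2 = 0.6 + 0j, 0.8j
phase = cmath.exp(0.7j)
p = hopf_map(z1, z2)
q = hopf_map(phase * z1, phase * z2)
print(north, p, q)
```

Multiplying both coordinates by a common phase leaves the image unchanged, which is the statement that the fibres of the map are circles.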
Note that $\pi$ sends the marked point $(1, 0) \in S^3$ to the marked point $[1, 0] \in S^2$, corresponding to the North pole, $(0, 0, 1)$. The map $\pi$ is a fibration, that is, it has the homotopy lifting property with respect to all domains. A map $\phi : M \to S^2$ has a lift $\tilde\phi : M \to S^3$ (where $\pi \circ \tilde\phi = \phi$) if and only if $\phi^*\mu = 0$, that is, if and only if $\phi \in (S^2)^M_*$. The integer in $H^3(M; \mathbb{Z})$ labelling the class of $\phi$ is precisely the degree of $\tilde\phi : M \to S^3$, that is, the baryon number of the Skyrme configuration $\tilde\phi$. This was shown explicitly for $M = S^3$ in [24]. So, given a Skyrme configuration of degree $Q$, we may produce an algebraically inessential Hopf configuration of charge $Q$ by composition with the Hopf map. In this way we produce a map $\pi_* : (S^3)^M \to (S^2)^M_*$. To what extent does the topology of $(S^3)^M$ determine that of $(S^2)^M_*$?

Theorem 1. The map $\pi_* : (S^3)^M \to (S^2)^M_*$ induced by the Hopf fibration is a Serre fibration.

Proof. We must prove that the map has the homotopy lifting property with respect to all disks $D^k$ [23], that is, that the commutative diagram below left may be completed by a map $\tilde H$ along the diagonal. Here $H$ is a homotopy between two maps $f_0, f_1 : D^k \to (S^2)^M_*$ and $\tilde f_0$ is a lift of $f_0$. Using the identification of $g : X \to Y^Z$ with $\hat g : Z \times X \to Y$, we produce the commutative diagram below right. Now the homotopy $\hat H$ certainly does lift to $\hat{\tilde H}$ since $\pi$ is a fibration. From $\hat{\tilde H}$ we produce a map $\check H : D^k \times I \to \mathrm{Free}(M, S^3)$ by $(\check H(d, t))(p) = \hat{\tilde H}(p, d, t)$. A priori, this is not necessarily the lifted homotopy we seek, however, since there is no reason why it should respect the basing condition:

$$
\begin{array}{ccc}
D^k \times \{0\} & \xrightarrow{\ \tilde f_0\ } & (S^3)^M \\
\downarrow{\scriptstyle \iota} & \nearrow{\scriptstyle \tilde H} & \downarrow{\scriptstyle \pi_*} \\
D^k \times [0, 1] & \xrightarrow{\ H\ } & (S^2)^M_*
\end{array}
\qquad\qquad
\begin{array}{ccc}
M \times D^k \times \{0\} & \xrightarrow{\ \hat{\tilde f}_0\ } & S^3 \\
\downarrow{\scriptstyle \iota} & \nearrow{\scriptstyle \hat{\tilde H}} & \downarrow{\scriptstyle \pi} \\
M \times D^k \times [0, 1] & \xrightarrow{\ \hat H\ } & S^2
\end{array}
$$
Let $U \subset S^2$ be a small closed ball centred on $(0, 0, 1)$ and choose a local trivialization of the Hopf bundle $S^1 \to S^3 \xrightarrow{\pi} S^2$ over $U$. Then by continuity of $\hat H$ and compactness of $D^k \times [0, 1]$, there exists a closed ball $B \subset M$ centred on $p_0$ so that the restriction $\hat{\tilde H}| : B \times D^k \times I \to S^3$ takes values in $\pi^{-1}(U)$. We may write it, with respect to our local trivialization, as

$$\hat{\tilde H}|(p, d, t) = (\hat H(p, d, t), \lambda(p, d, t)),$$

where $\lambda : B \times D^k \times I \to S^1$. In this language, we are done if $\lambda| : \{p_0\} \times D^k \times I \to \{1\}$, for then the map $\check H$ does satisfy the basing condition. Note that we are free to change $\lambda$ to any continuous map $\lambda_*$ we please, provided we do not change it on $\partial B \times D^k \times I$, since this just shifts $\hat{\tilde H}$ along the fibres of $S^3$, which does not change $\pi \circ \hat{\tilde H}$, so that the altered
map is still a lift of $\hat H$, and is still continuous. Now since $\partial B \times D^k \times I$ deformation retracts to $S^2$ and $\pi_2(S^1) = 0$, $\lambda| : \partial B \times D^k \times I \to S^1$ is nullhomotopic and we may construct the required $\lambda_* : B \times D^k \times I \to S^1$ by applying the null homotopy radially in $B$.

Our main interest is to understand the fundamental group of each connected component of $(S^2)^M_*$. Given any map $\rho : X \to Y$, there is a natural homomorphism $\rho_* : \pi_1(X) \to \pi_1(Y)$ defined by composition of loops in $X$ with $\rho$. The fact that $\pi_*$, which we will henceforth denote $\rho$, is a Serre fibration allows us to obtain a short exact sequence relating $\pi_1((S^3)^M)$ and $\pi_1((S^2)^M_*)$. In the case $M = S^3$ this reduces to the statement that the homomorphism $\rho_*$ associated with $\rho$ is actually an isomorphism. We can therefore determine the homotopy class of a loop in the Hopf configuration space by lifting it to a loop in the Skyrme configuration space and applying known results.

Theorem 2. The map $\rho : (S^3)^M \to (S^2)^M_*$ obtained from the Hopf fibration induces a short exact sequence of groups

$$0 \to \pi_1((S^3)^M) \xrightarrow{\ \rho_*\ } \pi_1((S^2)^M_*) \to H^1(M; \mathbb{Z}) \to 0.$$
Proof. Given any Serre fibration $F \to E \xrightarrow{\rho} B$, where $F$, $E$, $B$ denote the fibre, total space and base, we have an induced long exact sequence of homotopy groups:

$$\cdots \to \pi_1(F) \to \pi_1(E) \xrightarrow{\ \rho_*\ } \pi_1(B) \to \pi_0(F) \to \pi_0(E) \xrightarrow{\ \rho_*\ } \pi_0(B) \to 0. \tag{2.2}$$
In the case at hand, $E = (S^3)^M$, $B = (S^2)^M_*$ and $F = (S^1)^M$. Using the identification $S^1 = U(1)$, we see that $F$ is a topological group, so all its connected components are homeomorphic. The components of $G^M$ for any Lie group $G$ are enumerated in [3] while $\pi_1(G^M)$ is constructed in [4]. The relevant results here are $\pi_0(F) = H^1(M; \mathbb{Z})$ and $\pi_1(F) = 0$. Note also that $\pi_0(E) = \pi_0(B) = H^3(M; \mathbb{Z}) = \mathbb{Z}$ by the theorems of Hopf and Pontrjagin. Substituting in (2.2) gives

$$0 \to \pi_1(E) \xrightarrow{\ \rho_*\ } \pi_1(B) \to H^1(M; \mathbb{Z}) \to \mathbb{Z} \xrightarrow{\ \rho_*\ } \mathbb{Z} \to 0. \tag{2.3}$$
By exactness, the second $\rho_*$ is surjective, and there are only two surjective homomorphisms $\mathbb{Z} \to \mathbb{Z}$ (namely $1 \mapsto 1$ and $1 \mapsto -1$), both of which are injective. So we see that the second $\rho_*$ is an isomorphism. Since the second $\rho_*$ has trivial kernel, the image of $H^1(M; \mathbb{Z})$ in $\mathbb{Z}$ is 0 by exactness, and the sequence truncates as was claimed.

We note in passing that this provides an algebraic proof that the Hopf map takes degree $Q$ Skyrme configurations to Hopf charge $Q$ (or $-Q$ if the orientation on $M$ or $S^3$ is swapped) Faddeev-Hopf configurations, since this is precisely the statement that $\rho_* : \pi_0((S^3)^M) \to \pi_0((S^2)^M_*)$ is an isomorphism. By identifying the Hopf degree of $\phi \in (S^2)^M_*$ with the degree of its lift $\tilde\phi \in (S^3)^M$, we adopt the standard convention that the Hopf map $\pi \in (S^2)^{S^3}_*$ itself has Hopf degree $+1$.

The short exact sequence does not tell us precisely what $\pi_1((S^2)^M_*)$ is in general. One useful class of domains (which includes $M = S^3$) where we do know the answer consists of those with finite fundamental group.

Corollary 3. If $\pi_1(M)$ is finite then $\rho_* : \pi_1((S^3)^M) \to \pi_1((S^2)^M_*)$ induced by the Hopf map is an isomorphism.
Proof. The result follows once we show that $H^1(M; \mathbb{Z}) = 0$. By the Universal Coefficient Theorem, $H^1(M; \mathbb{Z})$ is isomorphic to the free part of $H_1(M; \mathbb{Z})$, since $H_0(M; \mathbb{Z}) = \mathbb{Z}$ has no torsion. But $H_1(M; \mathbb{Z})$ is isomorphic to the abelianization of $\pi_1(M)$ which, being finite, can have no free part.

These results are useful because a lot is known about the topology of $(S^3)^M$ since it can be identified with the topological group $G^M$, where $G = SU(2)$. The canonical identification is given by

$$S^3 \to SU(2) : (z_1, z_2) \mapsto U = \begin{pmatrix} z_1 & -\bar z_2 \\ z_2 & \bar z_1 \end{pmatrix}. \tag{2.4}$$

This map is well-defined because $U^\dagger U = U U^\dagger = I_2$ and $|z_1|^2 + |z_2|^2 = 1$ implies that $\det U = 1$. Also note that $(1, 0) \mapsto I_2$. Since $(SU(2))^M$ is a topological group, all connected components of $(S^3)^M$ are homeomorphic, and the fundamental group is abelian. A loop in the identity component of $SU(2)^M$ based at the constant map $M \to \{I_2\}$ may be thought of as a map from $S^1 \wedge M$ to $SU(2)$, where $\wedge$ denotes smash product. If $M = S^3$ then $S^1 \wedge M = S^4$ and $\pi_4(SU(2)) = \mathbb{Z}_2$, so we have that $\pi_1((S^2)^{S^3}_*) = \pi_1(SU(2)^{S^3}) = \mathbb{Z}_2$ for all components. Using a similar argument for the vacuum sector $(S^2)^M_0$ of the Faddeev-Hopf model, we could very easily have shown that, for $M = S^3$, $\pi_1((S^2)^M_0) = \pi_4(S^2) = \mathbb{Z}_2$. Note that we have actually proved much more than this, however: the fundamental group of every connected component of the Faddeev-Hopf configuration space is $\mathbb{Z}_2$ and, crucially, the map from the Skyrme configuration space induced by the Hopf fibration is an isomorphism.

The above results will suffice for our purposes. In fact, one can say much more about the algebraic topology of $(S^2)^M$, with $M$ a general compact oriented 3-manifold. It turns out that all components of $(S^2)^M_*$ are homeomorphic, though the same fails to be true for the full space $(S^2)^M$.
Furthermore, it is possible to compute both the fundamental group and the whole real cohomology ring (including its cup product structure) of any component of $(S^2)^M$. These results are obtained [4] by exploiting a somewhat less obvious relationship between $(S^2)^M$ and the vacuum (degree 0) sector of $SU(2)^M$. Essentially, all Faddeev-Hopf configurations in a given sector may be obtained from a fixed map in that sector by acting on the codomain with some degree 0 Skyrme configuration. This gives natural maps from the vacuum sector of the Skyrme model to each sector of the Faddeev-Hopf model, which can be shown to have many topologically natural properties. The topological results we present here are not so powerful as those of [4], but they are also less technical and may be visualized rather concretely. Most importantly, they are particularly well-suited to the study of Finkelstein-Rubinstein quantization in the Faddeev-Hopf model.

3. The Faddeev-Hopf Model

From now on we consider only the case $M = S^3$, interpreted as the one point compactification of $\mathbb{R}^3$ with the point $p_0$ representing the boundary at infinity. The most extensively studied model of this kind is due to Faddeev [10] who suggested the following Lagrangian density:

$$L = \frac{1}{2}\, \partial_\mu \mathbf{n} \cdot \partial^\mu \mathbf{n} - \frac{\lambda}{4}\, (\partial_\mu \mathbf{n} \times \partial_\nu \mathbf{n}) \cdot (\partial^\mu \mathbf{n} \times \partial^\nu \mathbf{n}), \tag{3.1}$$
where the field $\mathbf{n} = (n_1, n_2, n_3)$ takes values on the 2-sphere, that is $|\mathbf{n}|^2 = 1$, $\lambda$ is a coupling constant, and the boundary condition is $\mathbf{n}(\infty) = (0, 0, 1)$. We have changed notation from $\phi$ to $\mathbf{n}$ for the field so as to fit in with the existing literature on the model. Note that the second term in (3.1) stabilizes the solitons against radial rescaling. As discussed in Sect. 2 the Hopf charge $Q$ can be identified with the degree of any lift of $\mathbf{n}$ to $\tilde{\mathbf{n}} : \mathbb{R}^3 \to S^3$. The energy $E$ of a static configuration of Hopf charge $Q$ is bounded below by

$$E \geq c\,|Q|^{3/4}, \tag{3.2}$$
where $c$ is a constant. For more details see [29, 30].

The Lagrangian of the model has $E(3) \times O(3)$ symmetry. Since spatial translations are rather trivial we will not discuss them any further. The target space $O(3)$ symmetry is broken to $O(2)$ symmetry by the boundary condition. Kundu and Rybakov showed in [21] that topologically nontrivial configurations admit at most an axial (one-parameter) symmetry. General configurations with axial symmetry are discussed in [15]. Special configurations with axial symmetry have been studied recently in [17] and can be described in the following way. Introduce toroidal coordinates $(\eta, \xi, \phi)$ on $\mathbb{R}^3$ defined by

$$x = a\,\frac{\sinh\eta \cos\phi}{\cosh\eta - \cos\xi}, \qquad y = a\,\frac{\sinh\eta \sin\phi}{\cosh\eta - \cos\xi}, \qquad z = a\,\frac{\sin\xi}{\cosh\eta - \cos\xi}. \tag{3.3}$$
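As an illustration of (3.3), the following sketch converts toroidal to Cartesian coordinates and probes the two limiting regimes of $\eta$. The function name `toroidal_to_cartesian` and the sample values are assumptions made for the example.

```python
import math


def toroidal_to_cartesian(eta, xi, phi, a=1.0):
    """Map toroidal coordinates (eta, xi, phi) to (x, y, z) as in (3.3)."""
    denom = math.cosh(eta) - math.cos(xi)
    x = a * math.sinh(eta) * math.cos(phi) / denom
    y = a * math.sinh(eta) * math.sin(phi) / denom
    z = a * math.sin(xi) / denom
    return x, y, z


# Large eta: the point lies close to the circle x^2 + y^2 = a^2, z = 0.
x_far, y_far, z_far = toroidal_to_cartesian(20.0, 1.0, 0.3, a=2.0)
rho_far = math.hypot(x_far, y_far)

# Small eta: the point lies close to the z-axis.
x_near, y_near, z_near = toroidal_to_cartesian(1e-6, 1.0, 0.3, a=2.0)
rho_near = math.hypot(x_near, y_near)
print(rho_far, z_far, rho_near, z_near)
```

On the z-axis limit the height reduces to $a \sin\xi / (1 - \cos\xi) = a \cot(\xi/2)$, so $\xi$ parametrizes the whole axis.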
These coordinates form a canonically oriented orthogonal system covering all of $\mathbb{R}^3$ except the circle $C = \{x^2 + y^2 = a^2, z = 0\}$ and the z-axis. Surfaces of constant $\eta \in (0, \infty)$ are tori of revolution about the z-axis, but with non-circular generating curves. As $\eta \to \infty$ these tori collapse to the circle $C$ and as $\eta \to 0$ they collapse to the z-axis. Each torus of constant $\eta$ is parametrized by the angular coordinates $(\phi, \xi)$; $\phi$ is the angle around the z-axis, $\xi$ is an angular coordinate around the not quite circular generating curve of the torus. The maps of interest are most easily written in terms of a complex stereographic coordinate $W$ on $S^2$. Projecting from $(0, 0, 1)$, so that $W = (n_1 + i n_2)/(1 - n_3)$, they take the form¹

$$W = f(\eta)\, e^{i(m\xi - n\phi)}, \tag{3.4}$$
where $f(\eta)$ satisfies the boundary conditions $f(0) = \infty$ and $f(\infty) = 0$. Inverting the stereographic projection yields

$$\mathbf{n} = \left( \frac{2f}{f^2 + 1} \cos(m\xi - n\phi),\ \frac{2f}{f^2 + 1} \sin(m\xi - n\phi),\ \frac{f^2 - 1}{f^2 + 1} \right). \tag{3.5}$$
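A quick numerical check of (3.5) is sketched below: the vector has unit length and its stereographic coordinate reproduces (3.4). The helper names `toroidal_ansatz` and `stereographic_w`, and the sample values of $f$, $\xi$, $\phi$, $m$, $n$, are illustrative assumptions.

```python
import cmath
import math


def toroidal_ansatz(f, xi, phi, m, n):
    """Unit vector of (3.5) for a profile value f = f(eta) and integers m, n."""
    theta = m * xi - n * phi
    s = f * f + 1.0
    return (2.0 * f * math.cos(theta) / s,
            2.0 * f * math.sin(theta) / s,
            (f * f - 1.0) / s)


def stereographic_w(vec):
    """Stereographic coordinate W = (n1 + i n2) / (1 - n3), projected from (0, 0, 1)."""
    n1, n2, n3 = vec
    return complex(n1, n2) / (1.0 - n3)


f, xi, phi, m, n = 0.7, 0.4, 1.1, 2, 1
vec = toroidal_ansatz(f, xi, phi, m, n)
w = stereographic_w(vec)
expected_w = f * cmath.exp(1j * (m * xi - n * phi))
print(vec, w, expected_w)
```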
This ansatz will be referred to as the toroidal ansatz. Here the word “ansatz” is used rather loosely for an approximation which is a good initial guess for the numerically calculated static solution. It is worth mentioning that the toroidal ansatz gives rise to exact solutions for the Lagrangian density $L = (H_{\mu\nu} H^{\mu\nu})^{3/4}$, where $H_{\mu\nu} = \mathbf{n} \cdot (\partial_\mu \mathbf{n} \times \partial_\nu \mathbf{n})$ [2]. Under rotation by $\alpha$ around the z-axis the toroidal coordinates change to $(\eta, \xi, \phi + \alpha)$, which rotates the vector $\mathbf{n}$ by $-n\alpha$ around the third axis in target space. Obviously, this rotation can be undone by a rotation around the third axis in target space.

¹ Note that we have changed the sign of $n$ in [17].
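There is an obvious lift of any map $\mathbb{R}^3 \to S^2$ within this ansatz to a Skyrme configuration $\mathbb{R}^3 \to S^3$, obtained as follows. For given $f$, $m$ and $n$, let

$$\tilde{\mathbf{n}} : (\eta, \phi, \xi) \mapsto (z_1, z_2) \in \mathbb{C}^2, \quad \text{where} \quad z_1 = \frac{f}{\sqrt{f^2 + 1}}\, e^{im\xi}, \qquad z_2 = \frac{1}{\sqrt{f^2 + 1}}\, e^{in\phi}. \tag{3.6}$$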
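The lift (3.6) can be checked numerically with a short sketch: the pair $(z_1, z_2)$ has unit norm and the ratio $z_1/z_2$ equals the stereographic coordinate $W$ of (3.4). The function name `lift` and the sample values are assumptions made for the example.

```python
import cmath
import math


def lift(f, xi, phi, m, n):
    """Return (z1, z2) of (3.6) for a profile value f = f(eta)."""
    norm = math.sqrt(f * f + 1.0)
    z1 = f / norm * cmath.exp(1j * m * xi)
    z2 = 1.0 / norm * cmath.exp(1j * n * phi)
    return z1, z2


f, xi, phi, m, n = 1.3, 0.9, 2.2, 3, 1
z1, z2 = lift(f, xi, phi, m, n)
norm_squared = abs(z1) ** 2 + abs(z2) ** 2
ratio = z1 / z2
expected_w = f * cmath.exp(1j * (m * xi - n * phi))
print(norm_squared, ratio, expected_w)
```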
Then $|z_1|^2 + |z_2|^2 = 1$ so that $\tilde{\mathbf{n}}$ is actually $S^3$-valued, and the composition of this map with the Hopf map is clearly $\mathbf{n}$, since the stereographic coordinate $W$ coincides with the inhomogeneous coordinate $W = z_1/z_2$ under the identification $S^2 \equiv \mathbb{CP}^1$. It is now straightforward to compute the degree of $\tilde{\mathbf{n}}$, and hence the Hopf degree of $\mathbf{n}$. Since the degree of $\tilde{\mathbf{n}}$ is a homotopy invariant, we may deform $f$ to any convenient function satisfying the boundary conditions, for example, $f(\eta) = \eta^{-1}$. In this case, $(2^{-1/2}, 2^{-1/2}) \in S^3$ is a regular value of $\tilde{\mathbf{n}}$ with precisely $|mn|$ preimages, namely the points with $\eta = \exp(im\xi) = \exp(in\phi) = 1$. At each of these preimages, the image of the canonically oriented coordinate frame under $d\tilde{\mathbf{n}}$ is

$$d\tilde{\mathbf{n}} : [\partial_\eta, \partial_\xi, \partial_\phi] \mapsto \left[ \left(-2^{-3/2}, 0, 2^{-3/2}, 0\right),\ \left(0, \frac{m}{\sqrt{2}}, 0, 0\right),\ \left(0, 0, 0, \frac{n}{\sqrt{2}}\right) \right],$$

where we have identified $\mathbb{C}^2 \cong \mathbb{R}^4$. The orientation of the image frame is given by the sign of the determinant

$$\det[d\tilde{\mathbf{n}}\,\partial_\eta,\ d\tilde{\mathbf{n}}\,\partial_\xi,\ d\tilde{\mathbf{n}}\,\partial_\phi,\ \tilde{\mathbf{n}}] = \frac{mn}{8}.$$
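The orientation count can be sketched in a few lines. The rows of the matrix below are the three frame vectors quoted above together with the regular value itself; only the sign of the determinant enters the degree, so the sketch makes no use of its magnitude. The helper names `det` and `hopf_charge` are illustrative, not taken from the paper.

```python
import math


def det(matrix):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def hopf_charge(m, n):
    """Signed count of the |mn| preimages of the regular value."""
    a = 2.0 ** -1.5
    b = 2.0 ** -0.5
    frame = [
        [-a, 0.0, a, 0.0],                    # image of d/d(eta)
        [0.0, m / math.sqrt(2.0), 0.0, 0.0],  # image of d/d(xi)
        [0.0, 0.0, 0.0, n / math.sqrt(2.0)],  # image of d/d(phi)
        [b, 0.0, b, 0.0],                     # the regular value itself
    ]
    d = det(frame)
    sign = (d > 0) - (d < 0)
    return sign * abs(m * n)


print(hopf_charge(2, 1), hopf_charge(1, -3), hopf_charge(-2, -2))
```

Every preimage carries the same sign, so the degree is that sign times the number of preimages.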
Hence each of the $|mn|$ preimages has multiplicity $+1$ if $mn > 0$ and $-1$ if $mn < 0$, so the Hopf charge of $\mathbf{n}$ is $mn$, in agreement with the calculation in [15].

Numerical evidence suggests that the energy minimals for $Q = 1$, 2 and 4 have axial symmetry. In general, minimals are more complicated, having knotted or linked structures with at most discrete symmetries. In principle any cyclic group $C_q$ is a possible discrete symmetry. However, in practice only the simplest nontrivial symmetry (the twofold symmetry $C_2$) seems to occur. Clearly no nonconstant smooth field configuration can be symmetric under a rotation in target space without a compensating spatial rotation. It is possible, however, for a configuration to be invariant under a spatial rotation without a compensating rotation in target space. For example, the axial configuration in (3.5) with even $n$ has a $C_2$ symmetry generated by spatial rotation by $\pi$ about the z-axis. We will discuss symmetries further in the next section when we calculate the constraints they impose on the wave function.

4. Finkelstein-Rubinstein Constraints

In this section we describe how to use ideas of Finkelstein and Rubinstein [12] to quantize a scalar field theory and obtain fermions. Quantization usually implies replacing the classical configuration space by wave functions on configuration space. However, if the configuration space is not simply connected it is possible to define wave functions on the universal cover of configuration space. As shown in Sect. 2, the fundamental group of each connected component of our configuration space $\mathcal{Q} = (S^2)^{S^3}_*$ is $\mathbb{Z}_2$. So the universal cover $\tilde{\mathcal{Q}}$ is a twofold cover. We will also assume that the topological charge is conserved
in the quantum theory, as it is in the classical theory, so the wave functions are defined on the covering space $\tilde{\mathcal{Q}}_Q$ of a component $\mathcal{Q}_Q$ of configuration space with fixed Hopf charge $Q$. We shall formally think of the quantum state of the model as being specified by a wave function $\Psi \in L^2(\tilde{\mathcal{Q}}_Q)$ with respect to some measure on $\tilde{\mathcal{Q}}_Q$. Let $\Pi : \tilde{\mathcal{Q}}_Q \to \tilde{\mathcal{Q}}_Q$ be the deck transformation, that is, the map which takes $p$ to the unique point in $\tilde{\mathcal{Q}}_Q$ which differs from $p$ but projects to the same point in $\mathcal{Q}_Q$. This induces a linear map $\Pi^* : L^2 \to L^2$ by pullback: $(\Pi^*\Psi)(p) := \Psi(\Pi(p))$. Since the states $\Pi^*\Psi$ and $\Psi$ are physically indistinguishable, we must have $\Psi(p) = e^{i\theta(p)}\, \Pi^*\Psi(p)$ for all $p \in \tilde{\mathcal{Q}}_Q$ and all $\Psi$. But $\Pi^*\Pi^* = 1$, so the only possibilities are $\Pi^*\Psi = \Psi$ or $\Pi^*\Psi = -\Psi$. In order to allow for fermionic solitons, we must consistently choose the latter possibility: our wave functions must always be odd under $\Pi$.

Spinoriality then arises as follows. Consider the loop in $\mathcal{Q}_Q$ defined by spatial rotation about a fixed axis through $2\pi$ of a fixed base configuration $\mathbf{n}$. Since $\pi_1(\mathcal{Q}_Q) = \mathbb{Z}_2$, this may not be contractible, and its contractibility is independent of the basepoint $\mathbf{n}$ chosen. If it is noncontractible, both lifts of the loop to $\tilde{\mathcal{Q}}_Q$ fail to close, but are rather paths connecting a $\Pi$-related pair of points (both of which project to $\mathbf{n}$). Having insisted on $\Pi$-oddness, therefore, we see that every allowable state in this sector acquires a minus sign under spatial rotation by $2\pi$, the hallmark of spinoriality. That this is equivalent to fermionicity (that is, odd exchange statistics) was proved by Finkelstein and Rubinstein in [12].

The question of whether Hopf solitons can be consistently quantized as fermions thus reduces to the question of whether $2\pi$ spatial rotation loops in $\mathcal{Q}_Q$ are noncontractible when $Q$ is odd and contractible when $Q$ is even. To answer this, we only need to determine the contractibility for a representative of each sector. Consider the loop $\gamma : [0, 1] \to (S^3)^{S^3}$ defined by $\hat\gamma(\eta, \xi, \phi, t) = \tilde{\mathbf{n}}(\eta, \xi, \phi + 2\pi t)$, where $\tilde{\mathbf{n}} : S^3 \to S^3$ is defined in (3.6), and we once again use the natural identification of $g : X \to Y^Z$ with $\hat g : Z \times X \to Y$. This is a $2\pi$ spatial rotation loop (about the z-axis) of the degree $Q = mn$ Skyrme configuration $\tilde{\mathbf{n}}$. Note that $\pi \circ \gamma : [0, 1] \to \mathcal{Q}_Q$ is also a $2\pi$ rotation loop, but in the Faddeev-Hopf configuration space. Corollary 3 states that $\pi \circ \gamma$ is contractible if and only if $\gamma$ is contractible, which is true if and only if the degree $Q$ is even, by work of Giulini [14].
Hence imposing $\Pi$-oddness on our quantum states does indeed produce a consistent fermionic quantization of Hopf solitons.

It is important to realize that, having imposed $\Pi$-oddness, every noncontractible loop in $\mathcal{Q}_Q$ must be associated with a sign flip in $\Psi : \tilde{\mathcal{Q}}_Q \to \mathbb{C}$, regardless of whether the loop is generated by a spatial rotation. Let $\mathbf{n}$ be a Hopf degree $Q \neq 0$ energy minimal of the Faddeev-Hopf model which is invariant under a simultaneous spatial rotation by $\alpha$ about some axis $\mathbf{e}$ and rotation by $\beta$ around the third axis in target space (the only axis compatible with the boundary conditions). Since for $Q \neq 0$ the maximal symmetry of a configuration is $O(2) \times O(2)$, only one spatial rotation axis $\mathbf{e}$ is possible for a given $\mathbf{n}$, and we may choose it, without loss of generality, to lie along the z-axis. Let us call such a combined transformation an $(\alpha, \beta)$-rotation. Then we may construct a loop $L(\alpha, \beta)_{\mathbf{n}}$ in $\mathcal{Q}_Q$ based at $\mathbf{n}$ which consists of rotation by $2t\alpha$ around the z-axis for time $t \in [0, \frac{1}{2}]$, followed by rotation by $(2t - 1)\beta$ around the third axis in target space for $t \in (\frac{1}{2}, 1]$. In this language, the fact that $\mathbf{n}$ has the specified symmetry is precisely the statement that $L(\alpha, \beta)_{\mathbf{n}}$ is a loop, i.e. closed. There are two points $p, \Pi(p) \in \tilde{\mathcal{Q}}_Q$ corresponding to $\mathbf{n}$, and any physical state must have $\Psi(\Pi(p)) = -\Psi(p)$. Now if $L(\alpha, \beta)_{\mathbf{n}}$ is noncontractible then $p$ and $\Pi(p)$ are connected by the lifts of $L(\alpha, \beta)_{\mathbf{n}}$, starting at $p$ and $\Pi(p)$, respectively. Hence, evaluated at the specific point $p$ (or $\Pi(p)$) we must have
$$e^{-i\alpha \hat L_3}\, e^{-i\beta \hat K_3}\, \Psi(p) = -\Psi(p), \tag{4.1}$$
for any allowed state, where $\hat L_3$ is the third component of the spin operator $\hat{\mathbf{L}}$ and $\hat K_3$ is the third (and only) component of the spin operator in target space (henceforth called isospin). If $L(\alpha, \beta)_{\mathbf{n}}$ were contractible, however, it would lift to a pair of closed loops in $\tilde{\mathcal{Q}}_Q$ based at $p$ and $\Pi(p)$, so that

$$e^{-i\alpha \hat L_3}\, e^{-i\beta \hat K_3}\, \Psi(p) = \Psi(p), \tag{4.2}$$

simply by continuity of $\Psi$. In the spirit of semiclassical quantization we assume that, at least for low lying states, the symmetry of the classical energy minimal is not broken by quantum effects. Thus we seek quantum states which are also invariant under $(\alpha, \beta)$-rotations, so that

$$e^{-i\alpha \hat L_3}\, e^{-i\beta \hat K_3}\, \Psi(x) = e^{i\theta(x)}\, \Psi(x), \tag{4.3}$$

for all $x \in \tilde{\mathcal{Q}}_Q$. But, assuming the $(\alpha, \beta)$-rotation generates a finite group, there must exist an integer $q$ such that $(e^{-i\alpha \hat J_3} e^{-i\beta \hat I_3})^q \equiv \mathbb{1}$, which implies, by continuity, that $\theta(x)$ must in fact be constant. But then $\theta(x) = \theta(p) = \pi$ if $L(\alpha, \beta)_{\mathbf{n}}$ is noncontractible by (4.1), or $\theta(x) \equiv 0$ if $L(\alpha, \beta)_{\mathbf{n}}$ is contractible, by (4.2). Hence, we obtain the so-called Finkelstein-Rubinstein constraints on symmetric quantum states:

$$e^{-i\alpha \hat L_3}\, e^{-i\beta \hat K_3}\, \psi = \begin{cases} \psi & \text{if the induced loop is contractible,} \\ -\psi & \text{otherwise.} \end{cases} \tag{4.4}$$

Equation (4.4) imposes constraints on the spin and isospin quantum numbers $L$, $L_3$ and $K_3$.

It is worth pausing here to discuss the relationship between body-fixed and space-fixed angular momentum. The Lagrangian of the Hopf model is invariant under an $SO(3) \times SO(3)$ symmetry group consisting of rotations in space and target space. For these symmetries we can define left and right actions which are generated by the space-fixed and body-fixed angular momenta $\mathbf{J}$ and $\mathbf{L}$ acting on space and by space-fixed and body-fixed angular momenta $\mathbf{I}$ and $\mathbf{K}$ acting on target space. The body-fixed and space-fixed angular momentum operators are related by rotations, which implies that $\mathbf{J}^2 = \mathbf{L}^2$. For rotations in target space only rotations around the third axis are compatible with the boundary conditions. This implies $I_3^2 = K_3^2$.
When the model is quantized the angular momentum operators $\hat{\mathbf{J}}^2 = \hat{\mathbf{L}}^2$, $\hat J_3$, $\hat L_3$, $\hat I_3$ and $\hat K_3$ form a set of commuting observables. The quantum wave function $\psi$ can then be labelled by the usual spin quantum numbers as follows: $\psi = |L, L_3, J_3, K_3, I_3\rangle$. Since the Finkelstein-Rubinstein constraints do not impose any restrictions on the values of $J_3$ and $I_3$, these values will often be suppressed and the wave function is given as $\psi = |L, L_3, K_3\rangle$. In order to make predictions, we are interested in states with given $J$ and $I_3$. Therefore, we have to consider states with quantum numbers $L = J$ and $K_3 = \pm I_3$. Then the Finkelstein-Rubinstein constraints have the following effect. By restricting the allowed quantum states for given $J$ and $I_3$ the degeneracy of the states is changed. In the extreme case that the degeneracy is zero, certain combinations of $J$ and $I_3$ get excluded.

We now return to our discussion of loops in configuration space and Finkelstein-Rubinstein constraints. Just as for $2\pi$ spatial rotation loops, we can use the isomorphism
$\pi_1((S^2)^{S^3}_*) \to \pi_1((S^3)^{S^3})$ induced by the Hopf fibration to calculate whether a given loop $L(\alpha, \beta)_{\mathbf{n}}$ is contractible. For every configuration $\mathbf{n}$ we can choose a configuration $\tilde{\mathbf{n}}$ in the configuration space $(S^3)^{S^3}$ of Skyrmions. Then $L(\alpha, \beta)_{\tilde{\mathbf{n}}}$ is a loop in $(S^3)^{S^3}$ which projects to the loop $L(\alpha, \beta)_{\mathbf{n}}$ in $(S^2)^{S^3}_*$ under $\pi$. The action of $SO(3)$ on the target space of $\tilde{\mathbf{n}}$, that is $S^3$, is now identified with the adjoint action of $SU(2)$ on itself. Once again, Corollary 3 shows that $L(\alpha, \beta)_{\mathbf{n}}$ is contractible if and only if $L(\alpha, \beta)_{\tilde{\mathbf{n}}}$ is contractible. Contractibility of the latter loop can be determined by means of an explicit formula recently derived for Skyrmions with discrete symmetries [20]. This states that the loop $L(\alpha, \beta)_{\tilde{\mathbf{n}}}$ is contractible if and only if

$$N = \frac{Q}{2\pi}\,(Q\alpha - \beta) \tag{4.5}$$
is even. Note that there is a slight subtlety with the choice of the sign of $\beta$.

We can immediately recover our earlier result that the $\Pi$-odd quantization is consistently fermionic from formula (4.5). To see this, note that every configuration is symmetric under $(2\pi, 0)$-rotation, and substituting $\alpha = 2\pi$, $\beta = 0$ into (4.5) shows that $N$ is odd if and only if $Q$ is odd. Hence the spin quantum numbers $L$ and $J$ are half integer if and only if $Q$ is odd. Similarly, considering the case $\alpha = 0$, $\beta = 2\pi$ (pure isorotation by $2\pi$) shows that the isospin quantum numbers $K_3$ and $I_3$ are also half integer if and only if $Q$ is odd.

New constraints on low-lying quantum states are obtained if we assume that they are invariant under the symmetry groups of the corresponding classical energy minimals. The Faddeev-Hopf model has received much less numerical attention than the Skyrme model, so our understanding of these minimals and their symmetries is comparatively limited. For this reason, we will discuss the Finkelstein-Rubinstein constraints for general symmetries first, then apply the analysis to those symmetries which have been observed in numerical experiments.

Since we are interested in symmetries which can be generated by loops in configuration space we disregard reflections and look only at subgroups of $T^2 = SO(2) \times SO(2)$. Note that $T^2$, and hence every subgroup of $T^2$, is abelian. This severely limits the symmetry groups possible, and accounts in part for the numerical observation that Hopf degree $Q$ minimals tend to possess far less symmetry than degree $Q$ Skyrmions. The symmetry group $G_{\mathbf{n}} < T^2$ of a configuration $\mathbf{n}$ is either continuous, in which case $G_{\mathbf{n}} \cong SO(2)$ corresponding to axial symmetry, or discrete, hence finite ($T^2$ is compact). Every finite abelian group is isomorphic to a product of finite cyclic groups of coprime order, so it suffices to understand the Finkelstein-Rubinstein constraints for q-fold cyclic symmetry $C_q$. First, we deal with axial symmetry.
Consider the axial configurations (3.4) with Hopf charge $Q = mn$. These are invariant under $(\alpha, n\alpha)$-rotations for all $\alpha \in \mathbb{R}$. Since the loop $L(\alpha, n\alpha)_{\mathbf{n}}$ exists for all $\alpha \in \mathbb{R}$ it is homotopic to the constant loop ($\alpha = 0$). So $L(\alpha, n\alpha)_{\mathbf{n}}$ is contractible and gives rise to the following constraint on wave functions:

$$e^{-i\alpha \hat L_3}\, e^{-in\alpha \hat K_3}\, \Psi = \Psi. \tag{4.6}$$

Since formula (4.6) is valid for all $\alpha$ we can expand the equation in $\alpha$. The first order term gives rise to the following constraint for the spin operators:

$$\hat L_3 + n \hat K_3 = 0. \tag{4.7}$$

Equation (4.7) implies for the spin quantum numbers that $L_3 = -n K_3$.
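The contractibility criterion (4.5) and the axial constraint (4.7) lend themselves to a small sketch. Angles are passed in units of $2\pi$ so that the arithmetic stays exact; the function names `fr_number`, `is_contractible` and `axial_l3` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction


def fr_number(charge, alpha_turns, beta_turns):
    """N of formula (4.5), with alpha and beta given in units of 2*pi."""
    value = charge * (charge * Fraction(alpha_turns) - Fraction(beta_turns))
    if value.denominator != 1:
        raise ValueError("N is not an integer for this (alpha, beta)")
    return value.numerator


def is_contractible(charge, alpha_turns, beta_turns):
    """The loop L(alpha, beta) is contractible exactly when N is even."""
    return fr_number(charge, alpha_turns, beta_turns) % 2 == 0


def axial_l3(n, k3):
    """Body-fixed spin projection L3 = -n K3 required by (4.7)."""
    return -n * Fraction(k3)


# A 2*pi spatial rotation (alpha = 2*pi, beta = 0) for charges 1..7.
for charge in range(1, 8):
    print(charge, is_contractible(charge, 1, 0))
```

For the $(2\pi, 0)$-rotation the loop comes out noncontractible exactly for odd $Q$, which is the fermionic consistency statement made above.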
If the axial symmetry is broken then the symmetry group must be isomorphic to a product of finite cyclic groups. Not every cyclic subgroup of $T^2$ is possible for a given $Q$, however, since the generator $(\alpha, \beta)$ of $C_q < T^2$ must satisfy Eq. (4.5), that is, $N$ must be an integer. There are precisely $q$ different $C_q$ subgroups of $T^2$ which are candidates for symmetry groups, generated by $(2\pi/q, 2k\pi/q)$, where $k = 0, 1, \ldots, q - 1$, since pure isorotation can never leave a nonconstant configuration invariant. Let us denote these groups $C_q^k$. To illustrate, let us assume that $q$ is prime so that $C_q$ is a finite field. Then formula (4.5) applied to the generator of $C_q^k$ implies that $Q(Q - k) = 0 \bmod q$ and hence $Q = 0 \bmod q$ or $Q = k \bmod q$ by the field property. Hence, unless $Q$ is a multiple of $q$, formula (4.5) rules out all possible $C_q$ symmetries except $C_q^{Q \bmod q}$. Similar criteria can be derived for $q$ not prime, but they are not so neat. Of particular interest given the current state of numerics is the case $q = 2$. The argument above shows that, for odd $Q$, only $C_2^1$ symmetry is possible, not $C_2^0$.

Given a candidate symmetry group $C_q^k$, formula (4.5) gives us a one-dimensional (hence irreducible) representation of $C_{\bar q}$, where $\bar q = q$ if $Q(k + 1)$ is even and $\bar q = 2q$ if $Q(k + 1)$ is odd, by mapping the generator $(2\pi/q, 2k\pi/q)$ to $(-1)^N$. This representation may also be thought of as a homomorphism $C_{\bar q} \to \mathbb{Z}_2 = \{1, -1\}$ and is thus necessarily trivial if $q$ is odd and $Q(k + 1)$ is even. We call this the Finkelstein-Rubinstein representation of $C_{\bar q}$. There is also a natural representation of $C_{\bar q}$ on the spin-isospin $L, K_3$ quantum state space, defined by the inclusion $C_q^k < SO(3) \times SO(2)$. A state with quantum numbers $L, K_3$ is thus compatible with $C_q^k$ symmetry if and only if the decomposition of the spin-isospin $L, K_3$ representation of $C_{\bar q}$ into irreducible representations contains a copy of the Finkelstein-Rubinstein representation.
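The integrality condition on $N$ and the resulting sign $(-1)^N$ can be tabulated with a short sketch. It uses $N = Q(Q - k)/q$, which is formula (4.5) evaluated on the generator $(2\pi/q, 2k\pi/q)$; the function names `allowed_k` and `fr_sign` are illustrative assumptions.

```python
def allowed_k(charge, q):
    """Values k in 0..q-1 for which N = Q (Q - k) / q is an integer."""
    return [k for k in range(q) if (charge * (charge - k)) % q == 0]


def fr_sign(charge, q, k):
    """(-1)^N for the generator (2*pi/q, 2*k*pi/q) of the group C_q^k."""
    numerator = charge * (charge - k)
    if numerator % q != 0:
        raise ValueError("C_q^k is not an admissible symmetry for this charge")
    return -1 if (numerator // q) % 2 else 1


# Twofold symmetries for charges 1..7.
for charge in range(1, 8):
    ks = allowed_k(charge, 2)
    print(charge, ks, [fr_sign(charge, 2, k) for k in ks])
```

For odd $Q$ only $k = 1$ survives when $q = 2$, matching the remark that odd charges admit $C_2^1$ but not $C_2^0$.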
Given that we consider only cyclic groups, in practice we need only check compatibility on the generator $(\alpha, \beta) = (2\pi/q, 2\pi k/q)$. Thus $L_3$, $K_3$ must satisfy

$$e^{-2\pi i (L_3 + k K_3)/q} = (-1)^N = e^{i\pi Q(Q - k)/q}, \tag{4.8}$$

$$\Leftrightarrow \quad L_3 + k K_3 = -\frac{1}{2}\, Q(Q - k) + q\ell, \tag{4.9}$$
where $\ell$ is an integer. A good candidate for the ground state in the charge $Q$ sector is the state with the lowest values of $L$ and $|K_3|$ (and hence $J$ and $|I_3|$) compatible in this way with the symmetries of the classical minimal.

To illustrate this symmetry analysis, we compute the quantum ground state for stable and metastable Hopf solitons of degrees $Q = 1, \ldots, 7$, using the classical solutions obtained numerically by Hietarinta et al. [17]. Only axial and $C_2$ symmetries ever arise for these solutions. In the $C_2$ case for even $Q$, we distinguish between the two possible groups $C_2^0$ and $C_2^1$ using the colour coding information in [17]. The results are presented in Table 1. The first entry is the Hopf number $Q$. A star indicates that the state is metastable, that is, the classical solution is not a global minimal. The next entry is the energy $E_Q$ which has been calculated in [17] and corresponds to $\lambda = 1/4$. The following entry gives the shape of the Hopf configuration. The entry “symmetry” shows which symmetry has been used to calculate the Finkelstein-Rubinstein constraints. Here $(n, m)$ corresponds to the axial symmetry of the corresponding toroidal ansatz (3.4). $C_2^0$ is generated by $\pi$ rotation in space whereas $C_2^1$ is generated by rotation by $\pi$ in space followed by rotation by $\pi$ in target space. As a word of caution, while axial symmetry has been checked numerically, the $C_2$ symmetry is obtained by inspection from the figures in [17] and [8]. For low $Q$ the symmetries are apparent. However, for higher Hopf charge, $Q > 4$, the
Table 1. Ground states and excited states for Q = 1, ..., 7

|Q|   E_Q     shape    symmetry   FR    ground state          excited state (1)     excited state (2)
1     135.2   unknot   (1, 1)      1    |1/2, −1/2, 1/2⟩      |3/2, −1/2, 1/2⟩      |3/2, −3/2, 3/2⟩
2     220.6   unknot   (2, 1)      1    |0, 0, 0⟩             |1, 0, 0⟩             |2, −2, 1⟩
2*    249.6   unknot   (1, 2)      1    |0, 0, 0⟩             |1, 0, 0⟩             |1, −1, 1⟩
3     308.9   unknot   C_2^1      −1    |1/2, 1/2, 1/2⟩       |3/2, 1/2, 1/2⟩       |1/2, −1/2, 3/2⟩
3*    311.3   unknot   (3, 1)      1    |3/2, −3/2, 1/2⟩      |5/2, −3/2, 1/2⟩      |9/2, −9/2, 3/2⟩
4     385.5   unknot   (2, 2)      1    |0, 0, 0⟩             |1, 0, 0⟩             |2, −2, 1⟩
4*    392.7   unknot   C_2^0       1    |0, 0, 0⟩             |1, 0, 0⟩             |0, 0, 1⟩
4*    405.0   unknot   (4, 1)      1    |0, 0, 0⟩             |1, 0, 0⟩             |4, −4, 1⟩
5     459.8   link     -           -    |1/2, ±1/2, 1/2⟩      |3/2, ±1/2, 1/2⟩      |1/2, ±1/2, 3/2⟩
5*    479.2   unknot   C_2^1       1    |1/2, −1/2, 1/2⟩      |3/2, −1/2, 1/2⟩      |1/2, 1/2, 3/2⟩
6     521.0   link     -           -    |0, 0, 0⟩             |1, 0, 0⟩             |0, 0, 1⟩
6*    536.2   link     -           -    |0, 0, 0⟩             |1, 0, 0⟩             |0, 0, 1⟩
7     589.0   knot     -           -    |1/2, ±1/2, 1/2⟩      |3/2, ±1/2, 1/2⟩      |1/2, ±1/2, 3/2⟩

symmetries are difficult to guess, if indeed they exist at all. Where no entry is given, the classical solution has no obvious symmetry and the only constraint applicable is that of consistent fermionicity. “FR” gives the Finkelstein-Rubinstein constraints $(-1)^N$, where $N$ is calculated with Eq. (4.5) for the generator of the discrete symmetries. Note that axial symmetry implies FR = 1. Then ground states are calculated as explained above. They are given in the form $|L, L_3, K_3\rangle$. The quantum numbers $J_3$ and $I_3$ are suppressed. Recall that $J = L$ and $|I_3| = |K_3|$. We have also included two excited states. “Excited state (1)” is obtained from the ground state by increasing $L$ by 1 and finding the lowest $K_3$ such that all constraints are satisfied. Similarly, “excited state (2)” is obtained by increasing $K_3$ by 1. Note that changing the sign of $\hat L_3$ and $\hat K_3$ in the constraints (4.4) given by a loop $L(\alpha, \beta)^n$ can be interpreted as constraints for the loop $L(-\alpha, -\beta)^n$. Since the fundamental group is $\mathbb{Z}_2$, the loop $L(\alpha, \beta)^n$ is contractible if and only if $L(-\alpha, -\beta)^n$ is contractible. Therefore, whenever $|L, L_3, K_3\rangle$ satisfies the constraints imposed by a symmetry, so does $|L, -L_3, -K_3\rangle$. In Table 1, we only display states with $K_3 \geq 0$. Since no constraints with FR = −1 occur for even Hopf charge $Q$, all the ground states are given by $|0, 0, 0\rangle$ and “excited states (1)” are $|1, 0, 0\rangle$. The influence of the Finkelstein-Rubinstein constraints can only be seen for “excited state (2)”. For odd $Q$ the Finkelstein-Rubinstein constraints influence the ground states and all the excited states. One might ask why the first and second excited states are expected to have spin and isospin one unit higher than the ground state, respectively. One reason is that this is consistent with the collective coordinate quantization of Hopf solitons, to which we turn in the next section.

5. Collective Coordinate Quantization

The simplest non-trivial quantitative application of our results is the collective coordinate quantization [27]. In this case the wave function is only non-vanishing on the
space of minimal energy configurations in a given sector, also called the moduli space. The effective Lagrangian $L_{\rm eff}$ in this approximation is obtained by restricting the full Lagrangian to fields which, at each fixed time, lie in the moduli space (see footnote 2). From $L_{\rm eff}$ one can construct an effective Hamiltonian and canonically quantize the system in the standard manner. For Hopf charge $Q = 1$ the reduced Hamiltonian is given in [27] using “$SU(2)$ notation”. The Lagrangian $L$ (3.1) can be split up into kinetic energy $T$ and potential energy $V$, namely $L = T - V$, where
\[
  T = \int_{\mathbb{R}^3} \Big( \frac{1}{2} |\partial_t \mathbf{n}|^2 + \frac{\lambda}{2} \sum_i |\partial_t \mathbf{n} \times \partial_i \mathbf{n}|^2 \Big), \tag{5.1}
\]
\[
  V = \int_{\mathbb{R}^3} \Big( \frac{1}{2} \sum_i |\partial_i \mathbf{n}|^2 + \frac{\lambda}{4} \sum_{i,j} |\partial_i \mathbf{n} \times \partial_j \mathbf{n}|^2 \Big). \tag{5.2}
\]
Footnote 2. As has been discussed in the Skyrme model [6, 9], this approximation breaks down if centrifugal effects are taken into account. This problem can be avoided by introducing a (sufficiently large) mass term for the vector $\mathbf{n}$ so that the fields decay fast enough at infinity.

Now let $M \subset Q_Q$ be the moduli space of charge $Q$ energy minimizers, and $\mathbf{n}(t)$ be a trajectory in $M$. Since $\mathbf{n}(t)$ is a critical point of $V$ for all $t$, $V$ must remain constant, $V[\mathbf{n}(t)] = M_0$ say, interpreted as the classical mass of the Hopf soliton. It follows that the effective Lagrangian is $L_{\rm eff} = T|_M - M_0$, so the reduced dynamics is determined purely by the kinetic energy restricted to $M$. This has a natural geometric interpretation: being quadratic in first time derivatives, $T$ defines a positive quadratic form and hence a unique Riemannian metric $\gamma$ on $M$, and the classical dynamics descending from $L_{\rm eff}$ is nothing other than geodesic motion in $(M, \gamma)$. Since the Faddeev-Hopf model is not of Bogomol'nyi type, $M$ is just the orbit of any energy minimizer under the symmetry group of the model, that is, all zero modes arise due to symmetry. The centre of mass motion decouples, so we may, without loss of generality, assume that the centre of mass is fixed at the origin, so that $M$ is the orbit of some minimizer $\mathbf{n}_0$ under $G = SO(3) \times SO(2)$, acting as described in Sect. 3. So $(M, \gamma)$ is a homogeneous space, diffeomorphic to $G/K$, where $K < G$ is the isotropy group of $\mathbf{n}_0$. It follows that $\gamma$ is uniquely determined by its value on $T_{\mathbf{n}_0} M$. Generically, as we have described, $K$ is discrete, so $M$ has dimension 4, and $\gamma$ is specified by 6 constants, which may be interpreted as the components of the Hopf soliton's inertia tensor. However, we shall concentrate on the case where $\mathbf{n}_0$ has axial symmetry. Then
\[
  K = \{ k(\alpha) = ([\mathrm{diag}(e^{i\alpha/2}, e^{-i\alpha/2})], e^{in\alpha}) : \alpha \in \mathbb{R} \} \tag{5.3}
\]
for some divisor $n$ of $Q$, where we have used the standard isomorphisms $SO(3) \equiv PU(2)$ and $SO(2) \equiv U(1)$ to identify $SO(3)$ matrices with projective equivalence classes of $U(2)$ matrices, and $SO(2)$ matrices with complex phases. Let $\theta_1, \theta_2, \theta_3$ be the usual basis of left invariant vector fields on $SO(3)$ and $\theta_4 = \partial_\xi$ on $SO(2) \equiv \{e^{i\xi} : \xi \in \mathbb{R}\}$. Let $\langle \cdots \rangle$ denote linear span. Then the Lie algebra of $G$ is $\mathfrak{g} = \langle \theta_1, \ldots, \theta_4 \rangle$, and the Lie algebra of $K$ is $\mathfrak{k} = \langle \theta_3 + n\theta_4 \rangle$. We may identify $T_{\mathbf{n}_0} M$ with the complementary space $\mathfrak{p} = \langle \theta_1, \theta_2, \theta_3 \rangle$. Note $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ since $n \neq 0$. So $\gamma$ is equivalent to a positive symmetric bilinear form $\bar\gamma : \mathfrak{p} \oplus \mathfrak{p} \to \mathbb{R}$, and this must be invariant under the adjoint action of $K$ on $\mathfrak{p}$. Relative to the basis $\{\theta_1, \theta_2, \theta_3\}$ this is
\[
  \mathrm{Ad}_{k(\alpha)} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{5.4}
\]
Let $\mathfrak{p}^*$ denote the dual space to $\mathfrak{p}$, so that $\bar\gamma \in \mathfrak{p}^* \odot \mathfrak{p}^*$, where $\odot$ denotes the symmetric tensor product. The induced action of $K$ on $\mathfrak{p}^* \odot \mathfrak{p}^*$ may be decomposed into irreducible representations, whence one finds that the dimension of the space of invariant symmetric bilinear forms on $\mathfrak{p} \oplus \mathfrak{p}$ is [19]
\[
  \frac{1}{2\pi} \int_0^{2\pi} \frac{1}{2} \Big[ \big(\mathrm{tr}\, \mathrm{Ad}_{k(\alpha)}\big)^2 + \mathrm{tr}\, \big(\mathrm{Ad}_{k(\alpha)}^2\big) \Big] \, d\alpha = 2. \tag{5.5}
\]
Hence there exist positive constants $a, b$ such that
\[
  \bar\gamma = a\,(\sigma_1^2 + \sigma_2^2) + b\,\sigma_3^2, \tag{5.6}
\]
where $\{\sigma_i\}$ are the one forms dual to $\{\theta_i\}$. Thus the metric $\gamma$ on $M$ is determined by just two constants. The static solution $\mathbf{n}_0$, and hence its classical mass $M_0$ and moments of inertia $a, b$, all depend parametrically on the coupling $\lambda$. In fact, this dependence is quite simple, as we shall now show. Let us temporarily denote all $\lambda$ dependence explicitly, so that $T_\lambda$, $V_\lambda$ are the kinetic and potential energy functionals at coupling $\lambda$, $\mathbf{n}_\lambda$ is the static solution, $M_0(\lambda)$ is its mass, and $a(\lambda)$, $b(\lambda)$ its moments of inertia. A simple rescaling of the integration variables in (5.2) shows that, for any fixed map $\mathbf{n} : \mathbb{R}^3 \to S^2$,
\[
  V_\lambda[\mathbf{n}(x)] \equiv \sqrt{\lambda}\, V_1[\mathbf{n}(\sqrt{\lambda}\, x)]. \tag{5.7}
\]
Hence, given an extremal $\mathbf{n}_*$ of $V_1$ (here and henceforth, the subscript $*$ will indicate that a quantity refers to the $\lambda = 1$ model), $\mathbf{n}_\lambda(x) = \mathbf{n}_*(\lambda^{-1/2} x)$ is an extremal of $V_\lambda$, and furthermore its energy is
\[
  M_0(\lambda) = V_\lambda[\mathbf{n}_\lambda] = \sqrt{\lambda}\, V_1[\mathbf{n}_*] = \sqrt{\lambda}\, M_*. \tag{5.8}
\]
So the classical soliton masses scale as $\lambda^{1/2}$. A similar argument works for the moments of inertia too. The coefficients $a(\lambda)$, $b(\lambda)$ are, by definition, twice the kinetic energies of the time-dependent fields, $\mathbf{n}^{(i)}_\lambda(x, t)$ say, obtained from $\mathbf{n}_\lambda$ by subjecting it to spatial rotation at unit angular velocity about the $x_i$-axis with $i = 1, 3$ respectively. Let $R_i(t)$ denote rotation through angle $t$ about the $x_i$-axis. Then
\[
  \mathbf{n}^{(i)}_\lambda(x, t) = \mathbf{n}_\lambda(R_i(t) x) = \mathbf{n}_*\big(\lambda^{-1/2} R_i(t) x\big) = \mathbf{n}_*\big(R_i(t) \lambda^{-1/2} x\big) = \mathbf{n}^{(i)}_*\big(\lambda^{-1/2} x, t\big), \tag{5.9}
\]
by linearity of $R_i$. Rescaling the integration variables in (5.1) as before, one sees that $T_\lambda[\mathbf{n}^{(i)}_\lambda] = \lambda^{3/2}\, T_1[\mathbf{n}^{(i)}_*]$, and so the moments of inertia scale as $\lambda^{3/2}$:
\[
  a(\lambda) = \lambda^{3/2} a_*, \qquad b(\lambda) = \lambda^{3/2} b_*. \tag{5.10}
\]
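The two scaling laws reduce to counting powers of $\lambda$ term by term. The sketch below is only an illustration of that bookkeeping, with a hypothetical helper name: under $x \to \lambda^{1/2} x$ the measure contributes $\lambda^{3/2}$, each spatial derivative contributes $\lambda^{-1/2}$, and time derivatives are untouched.

```python
from fractions import Fraction

def lambda_exponent(prefactor_power, n_space_derivs):
    """Power of lambda carried by one term of T or V after x -> lambda**(1/2) * x.

    prefactor_power: explicit power of lambda multiplying the term (0 or 1).
    n_space_derivs:  number of spatial derivatives appearing in the term.
    """
    return Fraction(prefactor_power) + Fraction(3, 2) - Fraction(n_space_derivs, 2)

# V in (5.2): |d_i n|^2 and lambda * |d_i n x d_j n|^2
potential_terms = [lambda_exponent(0, 2), lambda_exponent(1, 4)]
# T in (5.1): |d_t n|^2 and lambda * |d_t n x d_i n|^2
kinetic_terms = [lambda_exponent(0, 0), lambda_exponent(1, 2)]

print("V terms scale as lambda^", [str(e) for e in potential_terms])
print("T terms scale as lambda^", [str(e) for e in kinetic_terms])
```

Both terms of $V$ carry the same exponent $1/2$ and both terms of $T$ carry $3/2$, which is why a single rescaling relates the model at coupling $\lambda$ to the model at $\lambda = 1$.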
Note that neither of these arguments appealed to axial symmetry, so the same scaling behaviour applies to solitons with only discrete (for example, trivial) symmetry groups also. This includes the scaling behaviour of the moment of inertia associated with isorotation (where this no longer coincides with spatial rotation) because
\[
  \mathbf{n}^{(\mathrm{iso})}_\lambda(x, t) = R_3(t)\, \mathbf{n}_\lambda(x) = R_3(t)\, \mathbf{n}_*\big(\lambda^{-1/2} x\big) = \mathbf{n}^{(\mathrm{iso})}_*\big(\lambda^{-1/2} x, t\big). \tag{5.11}
\]
From now on, we will no longer denote the $\lambda$ dependence explicitly, but will retain the $*$ subscript for quantities associated with the $\lambda = 1$ model. We wish to quantize geodesic motion on $M$, which may be formulated as a Hamiltonian flow on $T^*M$, within the framework of Finkelstein and Rubinstein. As it stands, there is a problem with this, however. As shown above, the fundamental group of $Q_Q$, the topological sector containing $M$, is $\mathbb{Z}_2$, whereas $\pi_1(M) = \mathbb{Z}_{2n}$, where $n$ is the divisor of $Q$ appearing in (5.3). A proof of this is presented in the appendix. So $\pi_1(M) \neq \pi_1(Q_Q)$ unless $n = 1$, and this type of axial symmetry occurs only for $Q = 1$ and the metastable $Q = 2$ state, according to Hietarinta et al. [17]. Nevertheless, a fermionic collective coordinate approximation is still possible, the key point being that in all cases the $2\pi$ spatial rotation loop has order 2 in $\pi_1(M)$. It is slightly unfortunate that this is true independent of $Q$, that is, whether $Q$ is odd or even. For consistency we must thus choose bosonic quantization for $Q$ even; it is not imposed on us by the topology of $M$. This illustrates that collective coordinate quantization can be quite treacherous in the absence of a good understanding of the topology of the full configuration space. To construct the collective coordinate quantization it is convenient to exploit the $n$-fold covering map $\varpi : SO(3) \to G/K$ which maps $g \in SO(3)$ to the coset $(g, 1)K$, that is, the left coset of $K$ containing $(g, 1) \in G$. Note that $\varpi$ commutes with the natural $SO(3)$ left actions on $SO(3)$ and $M$. Geodesics in $(M, \gamma)$ are the images of geodesics in $(SO(3), \varpi^*\gamma)$, where the lifted metric $\varpi^*\gamma$ is precisely (5.6), but with $\sigma_i$ now interpreted as (global) left invariant one forms on $SO(3)$, rather than basis vectors in $\mathfrak{p}^*$.
The Hamiltonian generating geodesic flow in $(SO(3), \varpi^*\gamma)$ is
\[
  H = \frac{1}{2a}(L_1^2 + L_2^2) + \frac{1}{2b} L_3^2 = \frac{1}{2a}|L|^2 + \Big( \frac{1}{2b} - \frac{1}{2a} \Big) L_3^2, \tag{5.12}
\]
where $L_i : T^*SO(3) \to \mathbb{R}$ are the angular momenta corresponding to the vector fields $\theta_i$ (the components of the moment map for the Hamiltonian action of $SO(3)$ on $T^*SO(3)$). Their Poisson bracket algebra is well known: $\{L_1, L_2\} = L_3$ and cyclic permutations. We may now quantize in the usual way, replacing classical angular momenta by $\hat L_i$, self-adjoint linear operators on $L^2(SO(3))$, and Poisson brackets by commutators. Note that $\{\hat H, \hat L^2, \hat L_3\}$ is a compatible set of observables. In this set-up, we are thinking of the wavefunction as defined on the covering space, $\psi : SO(3) \to \mathbb{C}$; it is important to note that for $Q$ odd (even) only those functions which are double-valued (single-valued) under the projection make physical sense. The deck transformation group for $\varpi$ is generated by $\exp(2\pi\theta_3/n)$, so we find that the eigenvalues of $\hat L_3$ must be integer multiples of $n/2$. This conclusion may be reached another way. Note that $\theta_3 + n\theta_4 \in \mathfrak{k}$ vanishes on $M$, so the corresponding classical momenta are linearly dependent: $L_3 + nK_3 = 0$. Hence the quantum operators must satisfy $(\hat L_3 + n\hat K_3)\psi = 0$ on any physical state, and the conclusion follows because $\hat K_3$ has half-integer spectrum. Of course, this is nothing other than the FR constraint for axial symmetry (4.7). We may use the linear dependence of the third components of spin and isospin to rewrite $\hat H$ in terms of $\hat K_3$, or both $\hat L_3$ and $\hat K_3$ if we wish. A convenient way to write the quantum Hamiltonian is
\[
  \hat H = M_0 + \frac{1}{2a} \hat L^2 + \Big( \frac{1}{2b} - \frac{1}{2a} \Big) \hat L_3^2. \tag{5.13}
\]
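The constraint $L_3 + nK_3 = 0$ already fixes which low-lying states can appear for an axially symmetric soliton. The following sketch enumerates them; the function name is hypothetical, and it assumes (consistently with Table 1) that $K_3$ is a half-odd-integer for odd $Q$ and an integer for even $Q$, and that only $K_3 \geq 0$ is listed.

```python
from fractions import Fraction

def axial_states(Q, n):
    """Lowest (L, L3, K3) triples allowed by L3 + n*K3 = 0, as a sketch."""
    k3_min = Fraction(1, 2) if Q % 2 else Fraction(0)

    def lowest_spin(k3):
        # L3 is fixed by the constraint, and L can be no smaller than |L3|.
        l3 = -n * k3
        return abs(l3), l3, k3

    ground = lowest_spin(k3_min)
    # Excited state (1): raise L by one unit, keep the lowest allowed K3.
    excited1 = (ground[0] + 1, ground[1], ground[2])
    # Excited state (2): raise K3 by one unit, then take the lowest allowed L.
    excited2 = lowest_spin(k3_min + 1)
    return ground, excited1, excited2

for Q, n in [(1, 1), (2, 2), (3, 3)]:
    states = axial_states(Q, n)
    print(Q, n, [tuple(str(x) for x in s) for s in states])
```

For $(Q, n) = (1, 1)$, $(2, 2)$ and $(3, 3)$ this reproduces the rows of Table 1 with symmetry $(1, 1)$, $(2, 1)$ and $(3, 1)$ respectively.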
It is now trivial to express the quantum energy spectrum in terms of the quantum numbers $L$ and $K_3$:
\[
  E = \sqrt{\lambda}\, M_* + \frac{\hbar^2}{\lambda^{3/2}} \Big[ \frac{L(L+1)}{2a_*} + \Big( \frac{1}{2b_*} - \frac{1}{2a_*} \Big) n^2 K_3^2 \Big], \tag{5.14}
\]
where we have used the constraint $L_3 = -nK_3$ to eliminate $L_3$, and the scaling behaviour obtained in (5.8), (5.10) to render all $\lambda$ dependence explicit. Recall that $*$-subscript quantities refer to the $\lambda = 1$ soliton. As discussed in the previous section, the body-fixed and space-fixed angular momenta satisfy $\hat J^2 = \hat L^2$ and $\hat I_3^2 = \hat K_3^2$. Therefore, we can also express the energy in terms of the space-fixed angular momentum quantum numbers, which are the quantities measured in a physical experiment, by replacing $L(L+1)$ by $J(J+1)$ and $K_3^2$ by $I_3^2$ in formula (5.14). We would like to order these states by increasing energy. Clearly, this order depends on $n$ and the relative size of the constants $a_*$ and $b_*$. As discussed above, to determine these constants, one must compute the kinetic energy of time dependent fields $\mathbf{n}(t) = (\exp(t\theta_1), 1) \cdot \mathbf{n}_0$ and $\mathbf{n}(t) = (\exp(t\theta_3), 1) \cdot \mathbf{n}_0$ respectively, where $\cdot$ denotes the action of $G$ on $M$. This is computationally very expensive if one uses for $\mathbf{n}_0$ the genuine axially symmetric energy minimizers found in [17], since even to construct $\mathbf{n}_0$ requires one to solve nonlinear PDEs. Instead, we shall again exploit the Hopf fibration and assume that $\mathbf{n}_0$ is well approximated by the image under the Hopf map $\rho$ of a Skyrme configuration $U : \mathbb{R}^3 \to SU(2)$ within the rational map ansatz of Houghton, Manton and Sutcliffe [18]. This idea was introduced in [8] (see footnote 3). The rational map ansatz may be described as follows. Using $\exp : su(2) \to SU(2)$, one may identify $SU(2)$ with the closed ball of radius $\pi$ in $su(2) \equiv \mathbb{R}^3$. The entire boundary of this ball gets mapped to $-I_2$. Partition physical space $\mathbb{R}^3$ into concentric 2-spheres of radius $r \in [0, \infty)$. Choose a fixed holomorphic map $R : S^2 \to S^2 \subset \mathbb{R}^3$ of degree $Q$ and a smooth decreasing surjection $f : [0, \infty) \to (0, \pi]$ (the profile function). Then the corresponding degree $Q$ Skyrme configuration is
\[
  U(r, x_1, x_2) = \exp\big( f(r) R(x_1, x_2) \big), \tag{5.15}
\]
where $x_1, x_2$ is any coordinate system on $S^2$. With respect to stereographic coordinates $z$, $R$ on its domain and codomain, $R$ is the eponymous rational map $R(z)$. We may then write $U(r, z)$ more explicitly as
\[
  U(r, z) = \frac{1}{1 + |R|^2} \begin{pmatrix} e^{-if} + |R|^2 e^{if} & 2i\bar R \sin f \\ 2iR \sin f & e^{if} + |R|^2 e^{-if} \end{pmatrix}. \tag{5.16}
\]
The corresponding Faddeev-Hopf configuration $\pi \circ U$ can easily be calculated with Eqs. (2.1) and (2.4),
\[
  W(r, z) = \frac{|R(z)|^2 e^{if(r)} + e^{-if(r)}}{2iR(z) \sin f(r)}, \tag{5.17}
\]
where again we choose stereographic coordinates on $S^2$. The idea is to approximate the true energy minimizer $\mathbf{n}_0$ by a configuration of this form and minimize over all possible $R$ and $f$.

Footnote 3. Su has also discussed the rational map ansatz for Hopf solitons, using a different notation [26].

Table 2. Classical energy $M_*$ and moments of inertia $a_*$, $b_*$ of various axially symmetric solitons, at $\lambda = 1$, within the rational map ansatz. For comparison, we also quote the classical energies of the corresponding numerical solutions found in the literature ($M_*^H$: Hietarinta and Salo [17], $M_*^G$: Gladikowski and Hellmund [15], $M_*^B$: Battye and Sutcliffe [7]). Note that $M_*^H$ and $M_*^G$ have been inferred using the scaling rule (5.8)

Q     M_*^H    M_*      M_*^G    M_*^B    a_*       b_*
1     270.4    275.0    278.6    252.5    418.8     369.7
2     441.2    462.9    446.9    418.0    1265.0    1309.4
3*    622.6    665.5    -        590.5    3272.7    3556.1

In fact, to obtain axial symmetry, we must assume $R(z) = z^Q$ (note this assumes the divisor $n$ of $Q$ is simply $n = Q$, so our results apply only to $Q = 1, 2$ and the metastable $Q = 3^*, 4^*$ solitons). We then minimize the potential energy $V$ over all possible profile functions $f$. This yields a nonlinear second order ODE for $f(r)$ which is easily solved numerically. We may, without loss of generality, set $\lambda$ to unity. Having constructed our approximate energy minimizer, $W(r, z)$, we must compute the kinetic energy at $t = 0$ of
\[
  W(t, r, z) = \frac{|R(\tilde z(t, z))|^2 e^{if(r)} + e^{-if(r)}}{2iR(\tilde z(t, z)) \sin f(r)}, \tag{5.18}
\]
where
\[
  \tilde z(t, z) = \frac{z \cos(t/2) + i \sin(t/2)}{iz \sin(t/2) + \cos(t/2)} \qquad \text{and} \qquad \tilde z(t, z) = z e^{it}, \tag{5.19}
\]
yielding $a_*/2$ and $b_*/2$ respectively. The calculations are elementary, but lengthy, and all reduce to radial integrals of expressions involving $f(r)$ and $f'(r)$. The results for $Q = 1, 2$ and the metastable $Q = 3^*$ are summarized in Table 2. These data, along with formula (5.14), give the complete quantum energy spectrum for these solitons, at arbitrary coupling. To illustrate our approach we shall interpret the Hopf solitons as super heavy fermion states in the strongly coupled pure Higgs sector of the standard model, as advocated by Gipson and Tze [13]. To make contact with their work, we must take the unit of energy to be $e_0 = 300$ GeV, $\hbar = 1$, and the coupling constant to be $\lambda = \ln(m_H/e_0)/24\pi^2$, where $m_H$ is the Higgs mass. In this model, the Higgs sector is strongly coupled, so the Higgs mass assumes the rather large value $m_H \approx 1$ TeV, so that $\lambda \approx 0.005$. The unit of length is the Compton wavelength of a particle of rest energy $e_0$, namely $d_0 = \hbar c/e_0 \approx 0.66 \times 10^{-3}$ fm. Then the $Q = 1$ ground state represents what Gipson and Tze call a “smoke ring soliton” of energy 6.63 TeV, which is compatible with the lower bound of 5.5 TeV given in [13]. A sensible measure for the size of the Hopf soliton is the value of the radius in the rational map ansatz at which the profile function takes the value $\pi/2$. We find that our Hopf soliton has a radius of $0.08 \times 10^{-3}$ fm, which is comparable with the lower bound of $0.2 \times 10^{-3}$ fm in [13], where the radius is defined in a slightly different way. We display the groundstates and the first two excited states in the collective coordinate approximation in Table 3. The energies of the states are dominated by the classical contribution. As anticipated in Table 1, the groundstate has the lowest energy, followed by excited state (1) and excited state (2). The energy of the states increases with the Hopf charge $Q$. The size of the Hopf solitons also increases with the charge: $0.08 \times 10^{-3}$ fm for $Q = 1$, $0.09 \times 10^{-3}$ fm for $Q = 2$ and $0.13 \times 10^{-3}$ fm for $Q = 3$.
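As a quick numerical check of the units quoted above, the following sketch evaluates $\lambda$ and $d_0$. It assumes $m_H = 1$ TeV exactly and uses the standard value $\hbar c \approx 197.327$ MeV fm; the variable names are illustrative only.

```python
import math

HBAR_C_MEV_FM = 197.3269804   # hbar * c in MeV fm
e0_mev = 300.0e3              # unit of energy e0 = 300 GeV, expressed in MeV
m_higgs_mev = 1.0e6           # assumed Higgs mass m_H = 1 TeV, expressed in MeV

# Coupling constant lambda = ln(m_H / e0) / (24 pi^2)
coupling = math.log(m_higgs_mev / e0_mev) / (24 * math.pi ** 2)
# Unit of length d0 = hbar c / e0, in fm
d0_fm = HBAR_C_MEV_FM / e0_mev

print(f"lambda = {coupling:.4f}")
print(f"d0     = {d0_fm:.2e} fm")
```

With these inputs $\lambda$ comes out close to 0.005 and $d_0$ close to $0.66 \times 10^{-3}$ fm, in line with the values quoted in the text.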
Table 3. Groundstates and first excited states, and their energies, of super heavy smoke ring solitons in the collective coordinate approximation, using the rational map ansatz

Q     groundstate          E_0          excited state (1)    E_1          excited state (2)    E_2
1     |1/2, −1/2, 1/2⟩     6.63 TeV     |3/2, −1/2, 1/2⟩     9.67 TeV     |3/2, −3/2, 3/2⟩     9.93 TeV
2     |0, 0, 0⟩            9.82 TeV     |1, 0, 0⟩            10.49 TeV    |2, −2, 1⟩           11.79 TeV
3*    |3/2, −3/2, 1/2⟩     14.58 TeV    |5/2, −3/2, 1/2⟩     15.23 TeV    |9/2, −9/2, 3/2⟩     17.12 TeV

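The energies in Table 3 follow from formula (5.14) and the constants in Table 2. The sketch below evaluates that formula; it is an illustration rather than the authors' code, the names are hypothetical, and it assumes the rounded coupling $\lambda = 0.005$, $\hbar = 1$, $e_0 = 0.3$ TeV and $n = Q$ (the key 3 stands for the metastable $Q = 3^*$ soliton).

```python
import math

E0_TEV = 0.3      # unit of energy e0 = 300 GeV, in TeV
COUPLING = 0.005  # assumed rounded value of lambda
HBAR = 1.0

# (M_*, a_*, b_*) from Table 2
TABLE2 = {1: (275.0, 418.8, 369.7),
          2: (462.9, 1265.0, 1309.4),
          3: (665.5, 3272.7, 3556.1)}

def energy_tev(Q, L, K3, coupling=COUPLING):
    """Evaluate (5.14) for the state |L, -n*K3, K3> with n = Q, in TeV."""
    mass, a, b = TABLE2[Q]
    n = Q
    classical = math.sqrt(coupling) * mass
    rotational = (L * (L + 1) / (2 * a)
                  + (1 / (2 * b) - 1 / (2 * a)) * n ** 2 * K3 ** 2)
    return E0_TEV * (classical + HBAR ** 2 * rotational / coupling ** 1.5)

for Q, L, K3 in [(1, 0.5, 0.5), (1, 1.5, 0.5), (2, 0, 0), (2, 1, 0), (3, 1.5, 0.5)]:
    print(Q, L, K3, round(energy_tev(Q, L, K3), 2))
```

Under these assumptions the computed values agree with the entries of Table 3 to within rounding of the tabulated constants.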
Clearly, the relative size of the quantum excitation energy of an excited state to the ground state energy depends on the coupling $\lambda$. If $\lambda$ is small, as in the application above, the quantum corrections become significant. In an application where the solitons are taken to model real physical structures, whose energies and sizes are known experimentally (rather than hypothetical exotic matter states as in the current case), one would tune the energy and length scales independently so as to fit some reference data as well as possible. This amounts to tuning both $\lambda$ and the value of $\hbar$, which is why we retained explicit $\hbar$ dependence in Eq. (5.14). In the case of the Skyrme system as a model of nucleons, for example, one finds that $\hbar \approx 46.8$ in natural units [22]. Even if $\lambda$ is large, therefore, quantum corrections may still be significant, provided $\hbar/\lambda$ remains large. So the relative importance of quantum corrections depends strongly on the physical interpretation of the model under consideration.

6. Conclusion

We have described how to quantize Hopf solitons using the Finkelstein-Rubinstein construction and thereby demonstrated that Hopf solitons can be quantized as fermions when their Hopf charge $Q$ is odd. An important ingredient of the proof is the fact that the Hopf map $S^3 \to S^2$ induces a Serre fibration $(S^3)^M \to (S^2)^M_*$. Using this fibration we could show that the fundamental group of Skyrmions is isomorphic to the fundamental group of Hopf solitons, when physical space has finite fundamental group, and this isomorphism is induced by the Hopf map. This enabled us to use results which have been derived for the Skyrme model. In a semiclassical quantization we expect that classical symmetries are not broken by quantum effects. Then the symmetries of the classical configurations induce non-trivial constraints on the wave function. We calculated possible ground states of Hopf solitons for $Q = 1, \ldots, 7$ from the minimal energy configurations given in [17]. Since Hopf solitons do not have many symmetries, the constraints on the wave functions are quite weak. Often, only the degeneracy of a state changes, rather than the state being excluded completely. Excited states have been included to better illustrate the influence of the Finkelstein-Rubinstein constraints. In order to get quantitative predictions of the quantum energy spectrum of Hopf solitons, we resorted to a collective coordinate approximation. In general, naive collective coordinate quantization can give spurious results if the topology of the moduli space is incompatible with that of the full configuration space. We concentrated on the case where the moduli space consists of axially symmetric configurations, which provides a good example of this difficulty. As discussed in the previous section, such a moduli space allows for fermionic quantization for both odd and even Hopf charge. In order to describe the physics correctly, we have to impose bosonic quantization for even $Q$ and fermionic quantization for odd $Q$. In other words, we must impose some of the Finkelstein-Rubinstein constraints arising from the topology of the full configuration space “by hand” on the wave function on the moduli space. They do not arise from the topology of the moduli space itself. The Faddeev-Hopf model contains a single coupling constant $\lambda$. By simple rescaling arguments, we derived the scaling behaviour of the classical energy and moments of inertia of a soliton as $\lambda$ varies. This allowed us to find a formula for the quantum energy spectrum of axially symmetric solitons, within the collective coordinate approximation, with all $\lambda$ dependence explicit. The numerical constants $M_*$, $a_*$ and $b_*$ in this formula were approximated, for three such axially symmetric solitons, by constructing approximate energy minimizers within the rational map ansatz. Our aim in this paper was to illustrate the general approach of fermionic soliton quantization within the Faddeev-Hopf model. This can now be applied to a variety of physical models that admit Hopf solitons.

Acknowledgements. The authors wish to thank D. Auckly and P. M. Sutcliffe for fruitful discussions. S. K. acknowledges an EPSRC Research fellowship GR/S29478/01.
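Appendix: The Fundamental Group of the Moduli Space

We wish to compute the fundamental group of $M$, the orbit of a configuration $\mathbf{n} : \mathbb{R}^3 \to S^2$ under $G = SO(3) \times SO(2)$, when $\mathbf{n}$ is invariant under the axial symmetry group $K = \{(R_3(\alpha), e^{in\alpha}) : \alpha \in \mathbb{R}\} < G$, where $R_3(\alpha)$ denotes rotation through $\alpha$ about the $x_3$ axis. Since $M \cong G/K$ and $p : G \to G/K$ is a fibration, we have the associated homotopy exact sequence
\[
  \begin{array}{rl}
    & K \xrightarrow{\ \iota\ } G \xrightarrow{\ p\ } G/K \\
    \Rightarrow & \pi_1(K) \xrightarrow{\ \iota_*\ } \pi_1(G) \xrightarrow{\ p_*\ } \pi_1(M) \to \pi_0(K) \\
    & \mathbb{Z} \xrightarrow{\ \iota_*\ } \mathbb{Z}_2 \oplus \mathbb{Z} \xrightarrow{\ p_*\ } \pi_1(M) \to 0.
  \end{array}
\]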
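The group-theoretic step used next, namely the quotient of $\pi_1(G) = \mathbb{Z}_2 \oplus \mathbb{Z}$ by the subgroup generated by $1 \oplus n$, can be checked by brute force. The sketch below is only an illustration with a hypothetical function name; it reduces each element to a representative $0 \oplus b'$ with $0 \leq b' < 2n$.

```python
def quotient_order_and_rotation(n):
    """Brute-force Z_2 + Z modulo the subgroup generated by (1, n).

    Returns (number of cosets, class of (0, 1), class of (1, 0),
    order of the class of (1, 0)), with classes labelled by b' in 0..2n-1.
    """
    def representative(a, b):
        # Subtracting a*(1, n) kills the Z_2 component; since 2*(1, n) = (0, 2n),
        # the Z component may then be reduced modulo 2n.
        return (b - a * n) % (2 * n)

    cosets = {representative(a, b) for a in (0, 1) for b in range(-4 * n, 4 * n)}
    generator = representative(0, 1)
    rotation = representative(1, 0)   # class of the 2*pi spatial rotation loop
    # Smallest m >= 1 such that m copies of the rotation class give the identity.
    order = next(m for m in range(1, 2 * n + 1) if (m * rotation) % (2 * n) == 0)
    return len(cosets), generator, rotation, order

for n in (1, 2, 3):
    print(n, quotient_order_and_rotation(n))
```

For each $n$ tried, the quotient has $2n$ elements and the class of $1 \oplus 0$ is $n$ times the generator, so it has order 2.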
Hence $p_*$ surjects, so $\pi_1(M) \equiv \pi_1(G)/\ker p_*$ by the Isomorphism Theorem. But $\ker p_*$ is, by exactness, the image of $\pi_1(K)$ under inclusion, clearly the infinite cyclic group generated by $1 \oplus n \in \pi_1(G)$. This group has precisely $2n$ cosets in $\pi_1(G)$, labelled by the elements $0 \oplus 0, 0 \oplus 1, \ldots, 0 \oplus (2n-1)$, for example. Let us denote the coset $g + \ker p_*$ by $[g]$. It follows immediately that the quotient group $\pi_1(G)/\ker p_*$ is cyclic of order $2n$, generated by $[0 \oplus 1]$. Note also that the $2\pi$ spatial rotation loop lies in $1 \oplus 0 \in \pi_1(G)$, which projects to $[0 \oplus n] = n[0 \oplus 1]$ in $\pi_1(G)/\ker p_*$, since $1 \oplus 0 = 0 \oplus n - 1 \oplus n$. Hence the $2\pi$ spatial rotation loop in $M$ is noncontractible of order 2, independent of $n$ (and $Q$).

References

1. Adkins, G.S., Nappi, C.R., Witten, E.: Static properties of nucleons in the Skyrme model. Nucl. Phys. B228, 552 (1983)
2. Aratyn, H., Ferreira, L.A., Zimerman, A.H.: Exact static soliton solutions of 3+1 dimensional integrable theory with nonzero Hopf numbers. Phys. Rev. Lett. 83, 1723–1726 (1999)
3. Auckly, D., Kapitanski, L.: Holonomy and Skyrme's model. Commun. Math. Phys. 240, 97–122 (2003)
4. Auckly, D., Speight, J.M.: Fermionic quantization and configuration spaces for the Skyrme and Faddeev-Hopf models. http://arxiv.org/list/, 2004
5. Balachandran, A.P., Marmo, G., Skagerstam, B.S., Stern, A.: Classical Topology and Quantum States. Chap. 13.4, Singapore: World Scientific, 1991
6. Bander, M., Hayot, F.: Instability of rotating chiral solitons. Phys. Rev. D 30, 1837 (1984)
7. Battye, R.A., Sutcliffe, P.M.: Knots as stable soliton solutions in a three-dimensional classical field theory. Phys. Rev. Lett. 81, 4798 (1998)
8. Battye, R.A., Sutcliffe, P.M.: Solitons, links and knots. Proc. Roy. Soc. Lond. A455, 4305 (1999)
9. Braaten, E., Ralston, J.P.: Limitations of a semiclassical treatment of the Skyrmion. Phys. Rev. D 31, 598 (1985)
10. Faddeev, L.D.: Quantisation of solitons. Preprint IAS Print-75-QS70, Princeton, 1975
11. Faddeev, L.D., Niemi, A.J.: Knots and particles. Nature 387, 58 (1997)
12. Finkelstein, D., Rubinstein, J.: Connection between spin, statistics, and kinks. J. Math. Phys. 9, 1762 (1968)
13. Gipson, J.M., Tze, H.C.: Possible heavy solitons in the strongly coupled Higgs sector. Nucl. Phys. B183, 524 (1980)
14. Giulini, D.: On the Possibility of Spinorial Quantization in the Skyrme Model. Mod. Phys. Lett. A 8, 1917–1924 (1993)
15. Gladikowski, J., Hellmund, M.: Static solitons with non-zero Hopf number. Phys. Rev. D56, 5194–5199 (1997)
16. Hietarinta, J., Salo, P.: Faddeev-Hopf knots: Dynamics of linked un-knots. Phys. Lett. B451, 60–67 (1999)
17. Hietarinta, J., Salo, P.: Ground state in the Faddeev-Skyrme model. Phys. Rev. D62, 081701 (2000)
18. Houghton, C.J., Manton, N.S., Sutcliffe, P.M.: Rational maps, monopoles and skyrmions. Nucl. Phys. B510, 507 (1998)
19. Jones, H.F.: Groups, representations and physics. Bristol, UK: Adam Hilger, 1990, p. 80
20. Krusch, S.: Homotopy of rational maps and the quantization of Skyrmions. Ann. Phys. 304, 103–127 (2003)
21. Kundu, A., Rybakov, Y.P.: Closed vortex type solitons with Hopf index. J. Phys. A15, 269–275 (1982)
22. Leese, R.A., Manton, N.S., Schroers, B.J.: Attractive channel Skyrmions and the Deuteron. Nucl. Phys. B 442, 228 (1995)
23. McCleary, J.: User's guide to spectral sequences. Delaware: Publish or Perish, 1985, p. 102
24. Meissner, U.G.: Toroidal solitons with unit Hopf charge. Phys. Lett. B 154, 190–192 (1985)
25. Pontryagin, L.: A classification of mappings of the 3-dimensional complex into the 2-dimensional sphere. Mat. Sbornik N.S. 9:51, 331–363 (1941)
26. Su, W.C.M.: Faddeev-Skyrme model and rational maps. Chin. J. Phys. 40, 516 (2002)
27. Su, W.-C.: Semiclassical quantization of Hopf solitons. Phys. Lett. B525, 201–204 (2002)
28. Skyrme, T.H.R.: A nonlinear field theory. Proc. Roy. Soc. Lond. A260, 127 (1961)
29. Vakulenko, A.F., Kapitansky, L.V.: Stability of solitons in S(2) in the nonlinear sigma model. Sov. Phys. Dokl. 24, 433–434 (1979)
30. Ward, R.S.: Hopf solitons on $S^3$ and $R^3$. Nonlinearity 12, 241 (1999)

Communicated by G.W. Gibbons
Commun. Math. Phys. 264, 411–426 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1550-7
Communications in
Mathematical Physics
On Fermion Grading Symmetry for Quasi-Local Systems

Hajime Moriya
Department of Mathematics, Graduate School of Science, Hokkaido University, Kita 10, Nishi 8, Kita-Ku, Sapporo, Hokkaido, 060-0810, Japan. E-mail: [email protected]

Received: 15 April 2005 / Accepted: 24 November 2005
Published online: 15 March 2006 – © Springer-Verlag 2006
Abstract: We discuss fermion grading symmetry for quasi-local systems with graded commutation relations. We introduce a criterion of spontaneous symmetry breaking (SSB) for general quasi-local systems. It is formulated based on the idea that each pair of distinct phases (appearing in spontaneous symmetry breaking) should be disjoint not only for the total system but also for every complementary outside system of a local region specified by the given quasi-local structure. Under a completely model independent setting, we show the absence of SSB for fermion grading symmetry in the above sense. We obtain some structural results for equilibrium states of lattice systems. If there existed an even KMS state for some even dynamics that decomposes into noneven KMS states, then those noneven states would inevitably violate our local thermal stability condition.

1. Introduction

The univalence super-selection rule that forbids the superposition of two states whose total angular momenta are integers and half-integers is regarded as a natural law [28], see also e.g. § 6.1 of [8], III.1 of [16], § 2.2 of [27]. However, if we take a more fundamental standpoint, there are subtle points in deciding whether a (conserved) quantity satisfies the superselection rule, see e.g. [6]. In particular, if the number of degrees of freedom is infinite, as is usually considered in statistical mechanics and quantum field theory, it is not obvious that a symmetry assumed for the kinematics leads to its preservation at the state level, that is, to the absence of spontaneous symmetry breaking. In this note we try to justify the univalence superselection, i.e. unbroken symmetry of fermion grading transformations that multiply fermion fields by −1. We clarify how fermion grading symmetry is different from other symmetries and is hardly broken. We shall review some relevant results.
First, if a state is invariant under some asymptotically abelian group of automorphisms like space-translations, then fermion grading symmetry is perfectly preserved. That is, any such state has zero expectation value for every odd element [18, 23]. (See also e.g. 7.1.6 of [25], Exam. 5.2.21 of [14]. The same statement for quantum field theory is given in [15].) We also refer to [22], which discusses (possible) forms of symmetry breaking of fermion grading transformations for dynamics that commute with some asymptotically abelian group of automorphisms. But the status of broken and unbroken symmetry of fermion grading is not given there. It seems not unreasonable to expect unbroken symmetry of fermion grading irrespective of such translation invariance assumptions. It has been shown, however, that non-factor quasi-free states of the CAR algebra have odd elements in their centers and give an example of the breakdown of fermion grading symmetry, though this example is rather technical and does not come from a physical model [20]. We note that two mutually disjoint noneven states in the factor decomposition of each non-factor quasi-free state have a common state restriction outside of some local region. It can be said that those noneven states are not macroscopically distinguishable. We are led to consider that the conventional criterion of spontaneous symmetry breaking based on the center merely for the total system is too weak to be an appropriate formulation for general quasi-local systems. We introduce a more demanding criterion of SSB for general quasi-local systems, which turns out to be equivalent to the usual one for tensor-product systems. A pair of states is said to be disjoint with respect to the given quasi-local structure if for every local region, their state restrictions to its complementary outside system induce disjoint GNS representations (Definition 1). Using this notion, we propose a criterion of spontaneous symmetry breaking (Definition 2). We show the absence of spontaneous symmetry breaking in the above sense for fermion grading symmetry for general graded quasi-local systems that encompass lattice and continuous systems (Proposition 1).
This proposition may be similar to the following statement in [24]: no odd element exists in the observables at infinity [19].

We study temperature states (Gibbs states and KMS states) of lattice systems with graded commutation relations. For every even Gibbs state, we have a grading preserving isomorphism from its center onto that of its state restriction to the complementary outside system of each local region (Proposition 3). For now, we cannot provide a definite answer as to whether fermion grading symmetry is perfectly preserved or not for temperature states of those lattice systems. We only claim that if a KMS state breaks the fermion grading symmetry, then it is not thermodynamically stable. More precisely, suppose that the odd part of the center of an even KMS state for even dynamics is not empty. Then in the factor decomposition of its perturbed state by a local Hamiltonian multiplied by the inverse temperature, there are noneven KMS states that violate the local thermal stability condition (a minimum free energy condition for open systems) with respect to the perturbed dynamics acting trivially on the specified local region (Proposition 5). We give a remark upon our choice of the local thermal stability condition. In [10] we introduced two versions of local thermal stability, LTS-M and LTS-P. We make use of the latter, which will be simply called LTS here. (See the Appendix for the details.) Though we neither have an example of such breaking nor disprove its existence, we may say that the violation of the univalence superselection rule, if it occurs, is pathological from a thermodynamical viewpoint.

2. Notation and Some Known Results

We recall the definition of quasi-local C∗-systems. (For references, we refer e.g. to § 2 of [24], § 2.6 of [14], and § 7.1 of [25].) Let F be a directed set with a partial order relation ≥ and an orthogonal relation ⊥ satisfying the following conditions:
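a) If $\alpha \le \beta$ and $\beta \perp \gamma$, then $\alpha \perp \gamma$.
b) For each $\alpha, \beta \in F$, there exists a unique upper bound $\alpha \vee \beta \in F$ which satisfies $\gamma \ge \alpha \vee \beta$ for any $\gamma \in F$ such that $\gamma \ge \alpha$ and $\gamma \ge \beta$.
c) For each $\alpha \in F$, there exists a unique $\alpha^c$ in $F$ satisfying $\alpha^c \perp \alpha$ and $\alpha^c \ge \beta$ for any $\beta \in F$ such that $\beta \perp \alpha$.

We consider a C∗-algebra $\mathcal{A}$ furnished with the following structure. Let $\{\mathcal{A}_\alpha ;\ \alpha \in F\}$ be a family of C∗-subalgebras of $\mathcal{A}$ with the index set $F$. Let $\Theta$ be an involutive ∗-automorphism that determines the grading on $\mathcal{A}$ as

$$ \mathcal{A}^e := \{A \in \mathcal{A} \mid \Theta(A) = A\}, \qquad \mathcal{A}^o := \{A \in \mathcal{A} \mid \Theta(A) = -A\}. \tag{1} $$

These $\mathcal{A}^e$ and $\mathcal{A}^o$ are called the even and the odd parts of $\mathcal{A}$. For $\alpha \in F$,

$$ \mathcal{A}^e_\alpha := \mathcal{A}^e \cap \mathcal{A}_\alpha, \qquad \mathcal{A}^o_\alpha := \mathcal{A}^o \cap \mathcal{A}_\alpha. \tag{2} $$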
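The above grading structure is referred to as fermion grading (by the condition L4 defined below). For a given state $\omega$ on $\mathcal{A}$, its restriction to $\mathcal{A}_\alpha$ is denoted $\omega_\alpha$. If a state takes zero for all odd elements, it is called even. Let $F_{\mathrm{loc}}$ be a subset of $F$ corresponding to the set of indices of all local subsystems and set $\mathcal{A}_{\mathrm{loc}} := \bigcup_{\alpha \in F_{\mathrm{loc}}} \mathcal{A}_\alpha$. We assume L1, L2, L3, L4 as follows:

L1. $\mathcal{A}_{\mathrm{loc}} \cap \mathcal{A}_\delta$ is norm-dense in $\mathcal{A}_\delta$ for any $\delta \in F$.
L2. If $\alpha \ge \beta$, then $\mathcal{A}_\alpha \supset \mathcal{A}_\beta$.
L3. $\Theta(\mathcal{A}_\alpha) = \mathcal{A}_\alpha$ for all $\alpha \in F$.
L4. For $\alpha \perp \beta$ the following graded commutation relations hold:

$$ [\mathcal{A}^e_\alpha, \mathcal{A}^e_\beta] = 0, \qquad [\mathcal{A}^e_\alpha, \mathcal{A}^o_\beta] = [\mathcal{A}^o_\alpha, \mathcal{A}^e_\beta] = 0, \qquad \{\mathcal{A}^o_\alpha, \mathcal{A}^o_\beta\} = 0, $$

where $[A, B] = AB - BA$ is the commutator and $\{A, B\} = AB + BA$ is the anticommutator.

Our $F_{\mathrm{loc}}$ may correspond to the set of all bounded open subsets of a space(-time) region or the set of all finite subsets of a lattice. Regarding the condition c), $\alpha^c$ indicates the complement of $\alpha$ in the total region. We state L1 in this form because it is needed in the proof of Proposition 1. For $A \in \mathcal{A}$ (and also for $A \in \mathcal{A}_\alpha$ due to condition L3), we have the following unique decomposition:

$$ A = A_+ + A_-, \qquad A_+ := \tfrac{1}{2}\bigl(A + \Theta(A)\bigr) \in \mathcal{A}^e\ (\mathcal{A}^e_\alpha), \qquad A_- := \tfrac{1}{2}\bigl(A - \Theta(A)\bigr) \in \mathcal{A}^o\ (\mathcal{A}^o_\alpha). \tag{3} $$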
In order to ensure that the fermion grading involution $\Theta$ acts non-trivially on $\mathcal{A}$, we may assume, for example, that $\mathcal{A}^o_\alpha$ is not empty (that is, it contains a nonzero element) for all $\alpha \in F$. However, all our results below obviously hold in the trivial cases where fermions are absent or rare.
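As a concrete illustration of the even-odd decomposition (3), the following minimal Python sketch works in the smallest nontrivial case, a single fermion mode represented on 2×2 matrices. The matrix representation, the realization of the grading involution as conjugation by v = a*a − aa* (the single-site case of the element v_i used in Section 4), and all function names are assumptions made for this illustration only.

```python
# Single fermion mode on 2x2 matrices (an illustrative assumption):
# a is the annihilation operator, a_dag its adjoint, and a a_dag + a_dag a = 1.
a = [[0, 1], [0, 0]]
a_dag = [[0, 0], [1, 0]]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def half_combination(x, y, sign):
    """Return (x + sign * y) / 2 entrywise."""
    n = len(x)
    return [[(x[i][j] + sign * y[i][j]) / 2 for j in range(n)] for i in range(n)]


# v = a_dag a - a a_dag is a self-adjoint unitary; the grading is A -> v A v.
number_op = matmul(a_dag, a)
hole_op = matmul(a, a_dag)
v = [[number_op[i][j] - hole_op[i][j] for j in range(2)] for i in range(2)]


def theta(x):
    """Grading involution, realized here as conjugation by v."""
    return matmul(matmul(v, x), v)


def even_odd_parts(x):
    """Decomposition (3): A_+ = (A + Theta(A))/2 and A_- = (A - Theta(A))/2."""
    tx = theta(x)
    return half_combination(x, tx, +1), half_combination(x, tx, -1)


A = [[1, 2], [3, 4]]
A_plus, A_minus = even_odd_parts(A)
print("even part:", A_plus)
print("odd part:", A_minus)
```

In this representation the even part is the diagonal part of the matrix and the odd part is the off-diagonal part, so the annihilation and creation operators themselves are odd.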
3. A Criterion of Spontaneous Symmetry Breaking Appropriate for General Quasi-Local Systems and Fermion Grading Symmetry

A pair of states will be called disjoint with each other if their GNS representations are disjoint, see e.g. § 2.4.4 and § 4.2.2 of [14]. We shall employ the following more demanding condition for disjointness of two states.

Definition 1. Let $\omega_1$ and $\omega_2$ be states of a quasi-local system $(\mathcal{A}, \{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}})$. If for every $\gamma \in F_{\mathrm{loc}}$, their restrictions to the complementary outside system of $\gamma$, i.e., $\omega_{1\gamma^c}$ and $\omega_{2\gamma^c}$, are disjoint with each other, then $\omega_1$ and $\omega_2$ are said to be disjoint with respect to the quasi-local structure $\{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}}$.

We shall give a criterion of spontaneous symmetry breaking based on Definition 1 as follows. Let $G$ be a group and $\tau_g$ ($g \in G$) be its action of ∗-automorphisms on a quasi-local system $(\mathcal{A}, \{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}})$. Suppose that $\tau_g$ commutes with a given (Hamiltonian) dynamics for every $g \in G$. Let $\Lambda$ denote some set of physical states (e.g. the set of all ground states or all equilibrium states at some temperature for the given dynamics), and $\Lambda^G$ denote the set of all $G$-invariant states in $\Lambda$. Let $\omega$ be an extremal point in $\Lambda^G$. Suppose that $\omega$ has a factor state decomposition in $\Lambda$ in the form of $\omega = \int d\mu(g)\, \omega_g$ with $\omega_g := \tau_g^* \omega_0\ (= \omega_0 \circ \tau_g)$, where $\omega_0$ is a factor state in $\Lambda$ (but not in $\Lambda^G$) and so is each $\omega_g$, and $\mu$ denotes some probability measure on $G$. With the above setting, we define the following.

Definition 2. If for each $g \ne g'$ of $G$ a pair of factor states $\omega_g$ and $\omega_{g'}$ are disjoint with respect to the given quasi-local structure, then it is said that the $G$-symmetry is macroscopically broken.

Let $\omega$ be a state of a quasi-local system $(\mathcal{A}, \{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}})$. It is said that $\omega$ satisfies the cluster property (with respect to the quasi-local structure) if for any given $\varepsilon > 0$ and any $A \in \mathcal{A}$ there exists an $\alpha \in F_{\mathrm{loc}}$ such that

$$ |\omega(AB) - \omega(A)\omega(B)| < \varepsilon \|B\| \tag{4} $$

for all $B \in \bigcup_{\beta \perp \alpha} \mathcal{A}_\beta$.
It is shown in [24] and Theorem 2.6.5 of [14] that every factor state satisfies this cluster property. However, the converse does not always hold; non-factor quasi-free states of the CAR algebra satisfy the cluster property with respect to the quasi-local (lattice) structure used for their construction, see [20] for details. The following proposition asserts that fermion grading symmetry cannot be broken in the sense of Definition 2. A remarkable thing is that it makes no reference to the dynamics. We are using essentially no more than the canonical anticommutation relations (CAR) for its proof. (The idea of the proof comes from our study on state correlation for composite fermion systems done in [12, 21].)

Proposition 1. Let $\omega$ be a state of a quasi-local system $(\mathcal{A}, \{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}})$ and let $\Theta$ denote the fermion grading involution of $\mathcal{A}$. Suppose that $\omega$ satisfies the cluster property with respect to the quasi-local structure. Then $\omega$ and $\omega\Theta\ (= \omega \circ \Theta)$ cannot be disjoint with respect to the quasi-local structure $\{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}}$. Accordingly spontaneous symmetry breaking in the sense of Definition 2 does not exist for fermion grading symmetry.

Proof. Suppose that $\omega$ and $\omega\Theta$ are disjoint with respect to the quasi-local structure $\{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}}$. Then $\omega$ and $\omega\Theta$ restricted to $\mathcal{A}_{\alpha^c}$ are disjoint for each $\alpha \in F_{\mathrm{loc}}$. Hence it follows that

$$ \|\omega_{\alpha^c} - (\omega\Theta)_{\alpha^c}\| = 2. \tag{5} $$

This is equivalent to the existence of an odd element $A_- \in \mathcal{A}^o_{\alpha^c}$ such that $\|A_-\| \le 1$ and

$$ |\omega(A_-) - \omega(\Theta(A_-))| = |\omega(A_-) - \omega(-A_-)| = 2|\omega(A_-)| = 2, \quad \text{namely,} \quad |\omega(A_-)| = 1. \tag{6} $$

By (3) and L1, we have that $\mathcal{A}_{\mathrm{loc}} \cap \mathcal{A}^e_\delta$ is norm dense in $\mathcal{A}^e_\delta$ and so is $\mathcal{A}_{\mathrm{loc}} \cap \mathcal{A}^o_\delta$ in $\mathcal{A}^o_\delta$ for any $\delta \in F$. Hence from (6), we have some $A_-$ in $\mathcal{A}^o_\gamma$ for some $\gamma \in F_{\mathrm{loc}}$ such that $\gamma \le \alpha^c$, $\|A_-\| \le 1$ and

$$ |\omega(A_-)| > 0.999. \tag{7} $$
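(We use a sloppy notation for $A_-$ in the above; $A_-$ in (6) belonging to $\mathcal{A}^o_{\alpha^c}$ is approximated by $A_-$ in (7) belonging to $\mathcal{A}^o_\gamma$.) By the decomposition of $A_-$ into hermitian elements $A_- = \tfrac{1}{2}(A_- + A_-^*) - i \cdot \tfrac{i}{2}(A_- - A_-^*)$, we have

$$ \bigl| \omega\bigl(\tfrac{1}{2}(A_- + A_-^*)\bigr) - i\, \omega\bigl(\tfrac{i}{2}(A_- - A_-^*)\bigr) \bigr| > 0.999. $$

Since $(A_- + A_-^*)$ and $i(A_- - A_-^*)$ are both self-adjoint, we have

$$ \omega\bigl(\tfrac{1}{2}(A_- + A_-^*)\bigr)^2 + \omega\bigl(\tfrac{i}{2}(A_- - A_-^*)\bigr)^2 > 0.999^2. $$

Hence we have

$$ \bigl| \omega\bigl(\tfrac{1}{2}(A_- + A_-^*)\bigr) \bigr| > \frac{0.999}{\sqrt{2}} \quad \text{or} \quad \bigl| \omega\bigl(\tfrac{i}{2}(A_- - A_-^*)\bigr) \bigr| > \frac{0.999}{\sqrt{2}}. \tag{8} $$

From (8), $\|\tfrac{1}{2}(A_- + A_-^*)\| \le 1$ and $\|\tfrac{i}{2}(A_- - A_-^*)\| \le 1$, we can choose $A_- = A_-^* \in \mathcal{A}^o_\gamma$ (by adjusting $\pm 1$) such that $\|A_-\| \le 1$ and

$$ \omega(A_-) > \frac{0.999}{\sqrt{2}}. \tag{9} $$

By the cluster property assumption (4) on $\omega$, for a sufficiently small $\varepsilon > 0$ and the above specified $A_- \in \mathcal{A}^o_\gamma$ there exists an $\alpha' \in F_{\mathrm{loc}}$ such that

$$ |\omega(A_- B) - \omega(A_-)\omega(B)| < \varepsilon \|B\| \tag{10} $$

for all $B \in \bigcup_{\beta \perp \alpha'} \mathcal{A}_\beta$. By (5) with $\alpha = \gamma \vee \alpha'$, the same argument leading to (9) implies that there exists $B_- = B_-^* \in \mathcal{A}^o_\zeta$ such that $\zeta \perp (\gamma \vee \alpha')$, $\|B_-\| \le 1$ and

$$ \omega(B_-) > \frac{0.999}{\sqrt{2}}. \tag{11} $$

Substituting the above $B_-$ for $B$ in (10), and using (9) and (11), we have

$$ |\mathrm{Im}\,\omega(A_- B_-)| < \varepsilon, \qquad \mathrm{Re}\,\omega(A_- B_-) > \frac{0.999^2}{2} - \varepsilon. \tag{12} $$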
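Due to $A_- = A_-^* \in \mathcal{A}^o_\gamma$, $B_- = B_-^* \in \mathcal{A}^o_\zeta$, and $\gamma \perp \zeta$ (which follows from $\gamma \le \gamma \vee \alpha'$, $\zeta \perp (\gamma \vee \alpha')$ and the condition a)), $A_- B_-$ is skew-self-adjoint, i.e. $(A_- B_-)^* = B_- A_- = -A_- B_-$. Therefore $\omega(A_- B_-)$ is a purely imaginary number, which however contradicts (12) for sufficiently small $\varepsilon$. Thus we have shown that $\omega$ and $\omega\Theta$ cannot be disjoint with respect to $\{\mathcal{A}_\alpha\}_{\alpha \in F_{\mathrm{loc}}}$. Since any factor state satisfies the cluster property, the possibility of SSB of Definition 2 for the symmetry $\Theta$ is negated.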
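The algebraic step that closes the proof (the product of two anticommuting self-adjoint odd elements is skew-self-adjoint, so its expectation value is purely imaginary) can be checked numerically in a small model. The sketch below is only an illustration under stated assumptions: it uses a Jordan-Wigner matrix representation of two fermion modes, and the names `kron`, `expectation`, `A_minus`, `B_minus` and the test vector are hypothetical choices, not objects from the paper.

```python
def kron(x, y):
    """Kronecker (tensor) product of two square matrices as nested lists."""
    n, m = len(x), len(y)
    return [[x[i // m][j // m] * y[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(x):
    """Conjugate transpose."""
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]


def expectation(vec, x):
    """Vector-state expectation <vec, x vec> / <vec, vec> for nonzero vec."""
    n = len(vec)
    xv = [sum(x[i][k] * vec[k] for k in range(n)) for i in range(n)]
    numerator = sum(vec[i].conjugate() * xv[i] for i in range(n))
    norm_squared = sum(abs(c) ** 2 for c in vec)
    return numerator / norm_squared


SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
ID = [[1, 0], [0, 1]]

# Jordan-Wigner representation of two modes (an assumption of this sketch):
# A_minus = a_1 + a_1^* and B_minus = a_2 + a_2^* are self-adjoint odd
# elements localized on two disjoint sites.
A_minus = kron(SX, ID)
B_minus = kron(SZ, SX)

AB = matmul(A_minus, B_minus)
BA = matmul(B_minus, A_minus)

# Arbitrary (non-normalized) vector defining a vector state.
state = [1 + 0j, 1j, 2 + 0j, 0.5 - 1j]
value = expectation(state, AB)
print("omega(A_- B_-) =", value)
```

The printed expectation value has vanishing real part, which is the property that is incompatible with the lower bound on the real part in (12).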
4. On the Centers of Temperature States of Lattice Systems

From now on, we consider lattice fermion systems [11] and also the lattice systems with graded commutation relations [7] satisfying the translation uniformity to be specified. Take $\mathbb{Z}^\nu$, the $\nu$($\in \mathbb{N}$)-dimensional cubic integer lattice. Let $F_{\mathrm{loc}}$ be the set of all finite subsets of the lattice. We assume that there is a finite number of degrees of freedom (spins) on each site of the lattice. For general graded lattice systems, we further assume that the subalgebra $\mathcal{A}_{\{i\}}$ on each site $i$ of the lattice is isomorphic to a $d \times d$ full matrix algebra, $d \in \mathbb{N}$ being independent of $i$. Hence for each $I \in F_{\mathrm{loc}}$, $\mathcal{A}_I$ is isomorphic to a $d^{|I|} \times d^{|I|}$ full matrix algebra, and $\mathcal{A}$ is a UHF algebra of type $d^\infty$ by Lemma 2.1 of [7]. (As an example of such systems, $\mathcal{A}_{\{i\}}$ is generated by fermion operators $a_i$, $a_i^*$, and spin operators represented by the Pauli matrices $\sigma_i^x$, $\sigma_i^y$, $\sigma_i^z$ which are even elements commuting with all fermion operators.) We denote the conditional expectation of the tracial state from $\mathcal{A}$ onto $\mathcal{A}_J$ by $E_J$. The interaction among sites is determined by the potential $\Phi$, a map from $F_{\mathrm{loc}}$ to $\mathcal{A}$ satisfying the following conditions:

($\Phi$-a) $\Phi(I) \in \mathcal{A}_I$, $\Phi(\emptyset) = 0$.
($\Phi$-b) $\Phi(I)^* = \Phi(I)$.
($\Phi$-c) $\Theta(\Phi(I)) = \Phi(I)$.
($\Phi$-d) $E_J(\Phi(I)) = 0$ if $J \subset I$ and $J \ne I$.
($\Phi$-e) For each fixed $I \in F_{\mathrm{loc}}$, the net $\{H_J(I)\}_J$ with $H_J(I) := \sum_K \{\Phi(K);\ K \cap I \ne \emptyset,\ K \subset J\}$ is a Cauchy net for $J \in F_{\mathrm{loc}}$ in the norm topology converging to a local Hamiltonian $H(I) \in \mathcal{A}$.
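The index bookkeeping in the last condition, which says that the finite-volume Hamiltonian H_J(I) collects the potential terms on those sets K inside J that meet I, can be sketched as follows. This is a minimal illustration assuming a hypothetical finite-range potential on the one-dimensional lattice whose supports are single sites and nearest-neighbour pairs; the function names and the choice of regions are not taken from the paper.

```python
def nearest_neighbour_supports(sites):
    """Supports of a hypothetical potential: single sites and adjacent pairs."""
    sites = sorted(sites)
    singles = [frozenset([i]) for i in sites]
    pairs = [frozenset([i, i + 1]) for i in sites if i + 1 in sites]
    return singles + pairs


def contributing_sets(region_i, region_j, supports):
    """Sets K with K meeting I and K contained in J, as in the sum for H_J(I)."""
    i_set, j_set = frozenset(region_i), frozenset(region_j)
    return [k for k in supports if k & i_set and k <= j_set]


I_region = [2]
J_small = range(0, 5)
J_large = range(-3, 8)

terms_small = contributing_sets(I_region, J_small,
                                nearest_neighbour_supports(J_small))
terms_large = contributing_sets(I_region, J_large,
                                nearest_neighbour_supports(J_large))

print(sorted(sorted(k) for k in terms_small))
print(sorted(sorted(k) for k in terms_large))
```

For a finite-range potential such as this one the list of contributing sets stops changing once J is large enough, so the net of finite-volume Hamiltonians is eventually constant and the Cauchy condition holds trivially; for long-range potentials the condition is a genuine decay requirement.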
Let $P$ denote the real vector space of all $\Phi$ satisfying all the above conditions. The set of all ∗-derivations on the domain $\mathcal{A}_{\mathrm{loc}}$ commuting with $\Theta$ is denoted $D(\mathcal{A}_{\mathrm{loc}})$. There exists a bijective real linear map from $\Phi \in P$ to $\delta \in D(\mathcal{A}_{\mathrm{loc}})$ for the lattice fermion systems (Theorem 5.13 of [11]), and similarly for the graded lattice systems (Theorem 4.2 of [7]). The connection between $\delta \in D(\mathcal{A}_{\mathrm{loc}})$ and its corresponding $\Phi \in P$ is given by

$$ \delta(A) = i[H(I), A], \qquad A \in \mathcal{A}_I \tag{13} $$

for every $I \in F_{\mathrm{loc}}$, where the local Hamiltonian $H(I)$ is determined by ($\Phi$-e) for this $\Phi$. The condition ($\Phi$-d) is called the standardness, which serves to fix ambiguous terms (such as scalars) irrelevant to the dynamics given by (13). We remark that any product state, for example the Fock state, can be used in place of the tracial state for $E_J$ to obtain a similar one-to-one correspondence between $\delta$ and $\Phi$. Furthermore, characterizations of equilibrium states, such as LTS, Gibbs (and also the variational principle for translation invariant states), have all been shown to be independent of the choice of those product states [7]. The above-mentioned Gibbs condition was defined for the quantum spin lattice systems [9], and then extended to the lattice fermion systems in § 7.3 of [11], and to the
graded lattice systems under consideration [7]. Let $\Omega$ be a cyclic and separating vector of a von Neumann algebra $M$ on $\mathcal{H}$ and $\Delta$ denote the modular operator for $(M, \Omega)$, see [26]. The state $\omega$ on $M$ given by $\omega(A) = (\Omega, A\Omega)$ for $A \in M$ satisfies the KMS condition for the modular automorphism group $\sigma_t := \mathrm{Ad}(\Delta^{it})$, $t \in \mathbb{R}$, at the inverse temperature $\beta = -1$ and is called the modular state with respect to $\sigma_t$. The following definition works for any lattice system under consideration.

Definition 3. Let $\varphi$ be a state of $\mathcal{A}$ and $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$ be its GNS triplet. It is said that $\varphi$ satisfies the Gibbs condition for $\delta \in D(\mathcal{A}_{\mathrm{loc}})$ at inverse temperature $\beta \in \mathbb{R}$, for short $(\delta, \beta)$-Gibbs condition, if and only if the following conditions are satisfied:

(Gibbs-1) The GNS vector $\Omega_\varphi$ is separating for $M_\varphi := \pi_\varphi(\mathcal{A})''$.

For $(M_\varphi, \Omega_\varphi, \mathcal{H}_\varphi)$, the modular operator $\Delta_\varphi$ and the modular automorphism group $\sigma_{\varphi,t}$ are defined. Let $\sigma^{\beta H(I)}_{\varphi,t}$ denote the one-parameter group of ∗-automorphisms determined by the generator $\delta_\varphi + \delta_{\pi_\varphi(\beta H(I))}$, where $\delta_\varphi$ denotes the generator for $\sigma_{\varphi,t}$ and $\delta_{\pi_\varphi(\beta H(I))}(A) := i[\beta \pi_\varphi(H(I)), A]$ for $A \in M_\varphi$.

(Gibbs-2) For every $I \in F_{\mathrm{loc}}$, $\sigma^{\beta H(I)}_{\varphi,t}$ fixes the subalgebra $\pi_\varphi(\mathcal{A}_I)$ elementwise.

The modular state for $\sigma^{\beta H(I)}_{\varphi,t}$ is given as the vector state of a (uniquely determined) unit vector $\Omega^{\beta H(I)}_\varphi$ lying in the natural cone for $(M_\varphi, \Omega_\varphi)$ and is denoted $\varphi^{\beta H(I)}$. We use the same symbol for its restriction to $\mathcal{A}$, namely, $\varphi^{\beta H(I)}(A) := \bigl(\pi_\varphi(A)\Omega^{\beta H(I)}_\varphi, \Omega^{\beta H(I)}_\varphi\bigr)$ for $A \in \mathcal{A}$. We remark that $\Omega^{\beta H(I)}_\varphi$ is normalized and $\varphi^{\beta H(I)}(1) = 1$ in our notation. (For the general references of the perturbed states, the relative modular automorphisms, and their application to quantum statistical mechanics, see [1, 2] and § 5.4 of [14].) We next show the product property of $\varphi^{\beta H(I)}$ in the following sense.

Lemma 2. Let $\varphi$ be a $(\delta, \beta)$-Gibbs state for $\delta \in D(\mathcal{A}_{\mathrm{loc}})$ and $\beta \in \mathbb{R}$. If it is even, then for each $I \in F_{\mathrm{loc}}$, $\varphi^{\beta H(I)}$ is a product state extension of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ and its restriction to $\mathcal{A}_{I^c}$, as denoted

$$ \varphi^{\beta H(I)} = \mathrm{tr}_I \circ \varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}}. \tag{14} $$
Proof. It has already been shown in Proposition 7.7 of [11] for the lattice fermion systems, and we can easily verify this statement for the graded lattice systems as well. But we shall provide a slightly simpler proof. In Theorem 9.1 of [2] it is shown that

$$ \varphi^{\beta H(I)}([Q_1, Q_2] Q) = 0 \tag{15} $$

for every $Q_1, Q_2 \in \mathcal{A}_I$ and $Q \in \mathcal{A}_I'$, the commutant of $\mathcal{A}_I$ in $\mathcal{A}$. From this we see that $\varphi^{\beta H(I)}$ is a product state extension of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ and its restriction to $\mathcal{A}_I'$. Since $\varphi$ is an even state and $H(I)$ is an even self-adjoint element, $\varphi^{\beta H(I)}$ is also even. It is easy to see $\mathcal{A}_I' = \mathcal{A}^e_{I^c} + v_I \mathcal{A}^o_{I^c}$, where

$$ v_i := a_i^* a_i - a_i a_i^*, \qquad v_I := \prod_{i \in I} v_i. \tag{16} $$
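This $v_I$ is a self-adjoint unitary implementing $\Theta$ on $\mathcal{A}_I$. For $A_+ \in \mathcal{A}^e_I$, $A_- \in \mathcal{A}^o_I$, $B_+ \in \mathcal{A}^e_{I^c}$ and $B_- \in \mathcal{A}^o_{I^c}$, computing the expectation values of all $A_\sigma B_{\sigma'}$ with $\sigma = \pm$ and $\sigma' = \pm$ for $\varphi^{\beta H(I)}$, we obtain $\varphi^{\beta H(I)}(A_+ B_+) = \mathrm{tr}_I(A_+)\varphi^{\beta H(I)}(B_+)$, and zeros for the others, i.e. $A_+ B_-$, $A_- B_+$ and $A_- B_-$. Therefore $\varphi^{\beta H(I)}$ is equal to the product state extension of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ and $\varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}}$.

We provide a grading structure with von Neumann algebras generated by even states and with their $\Theta$-invariant subalgebras. For an even state $\omega$ of a quasi-local system, let $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ be a GNS triplet of $\omega$ and let $M_\omega$ denote the von Neumann algebra generated by this representation. Let $U_{\Theta,\omega}$ be a unitary operator of $\mathcal{H}_\omega$ implementing the grading involution $\Theta$, and $\Theta_\omega := \mathrm{Ad}(U_{\Theta,\omega})$. Then even and odd parts of $M_\omega$ are given by

$$ M^e_\omega := \{A \in M_\omega \mid \Theta_\omega(A) = A\}, \qquad M^o_\omega := \{A \in M_\omega \mid \Theta_\omega(A) = -A\}. \tag{17} $$

Let $N$ be a $\Theta$-invariant subalgebra of $M_\omega$. We give its grading as

$$ N^{e(\Theta_\omega)} := N \cap M^e_\omega, \qquad N^{o(\Theta_\omega)} := N \cap M^o_\omega, \tag{18} $$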
where the superscripts $e(\Theta_\omega)$ and $o(\Theta_\omega)$ indicate that the grading is determined by $\Theta_\omega$. For any $A \in M_\omega$ (also $A \in N$), we have its unique decomposition $A = A_+ + A_-$ such that $A_+ \in M^e_\omega$ ($N^{e(\Theta_\omega)}$) and $A_- \in M^o_\omega$ ($N^{o(\Theta_\omega)}$) in the same manner as (3). Let $\omega_1$ and $\omega_2$ be even states on $\mathcal{A}$. Let $N_1$ and $N_2$ be some $\Theta$-invariant subalgebras of $M_{\omega_1}$ and $M_{\omega_2}$, respectively. If there is an isomorphism $\eta$ from $N_1$ onto $N_2$, that is, $N_1$ and $N_2$ are isomorphic, then we denote this relationship by $N_1 \sim N_2$. If there is a grading preserving isomorphism $\eta$ from $N_1$ onto $N_2$, that is, $\eta$ maps the even part to the even, the odd to the odd, then we write $N_1 \sim_\Theta N_2$. Obviously each of '$\sim$' and '$\sim_\Theta$' is an equivalence relation.

We recall relative entropy, which will be used in the proof of the next proposition and also for the formulation of our local thermal stability condition in the next section and the Appendix. For two states $\omega_1$ and $\omega_2$ of a finite-dimensional system, it is defined by

$$ S(\omega_1, \omega_2) := \begin{cases} \omega_2(\log D_2 - \log D_1), & \text{if } \ker D_1 \subset \ker D_2, \\ +\infty, & \text{otherwise,} \end{cases} \tag{19} $$

where $D_i$ is the density matrix for $\omega_i$ ($i = 1, 2$). It is positive, and zero if and only if $\omega_1 = \omega_2$. Its generalization to von Neumann algebras is given in [4, 5]. (Note that the order of two states and the sign convention of relative entropy are both reversed in [14].)

In the following discussion we are interested in centers. Let us denote the center of $M_\omega$ by $Z_\omega$. It is immediate to see that $Z_\omega$ is $\Theta$-invariant for an even state $\omega$. We shall use the shorthand $Z^e_\omega$ and $Z^o_\omega$ for $Z_\omega^{e(\Theta_\omega)}$ and $Z_\omega^{o(\Theta_\omega)}$, respectively.

Proposition 3. Let $\varphi$ be an even $(\delta, \beta)$-Gibbs state. For $I \in F_{\mathrm{loc}}$, let $\varphi_{I^c}$ denote the state restriction of $\varphi$ onto $\mathcal{A}_{I^c}$. Then for any $I \in F_{\mathrm{loc}}$ there is a grading preserving isomorphism between the centers of the von Neumann algebras generated by the GNS representation of $\varphi$ and by that of $\varphi_{I^c}$. In particular, $\varphi$ is a factor state if and only if $\varphi_{I^c}$ is also.
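Proof. Let $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$ be a GNS triplet of $\varphi$, and $\Omega^{\beta H(I)}_\varphi$ denote the normalized vector representing its perturbed state $\varphi^{\beta H(I)}$ as in Definition 3. By Theorem 3.10 of [5] (also by the discussion below Definition 6.2.29 of [14]),

$$ S(\varphi, \varphi^{\beta H(I)}) \le 2\|\beta H(I)\|, \qquad S(\varphi^{\beta H(I)}, \varphi) \le 2\|\beta H(I)\|. \tag{20} $$

Since the relative entropy is not increasing by restriction onto any subsystem, taking the restrictions of $\varphi$ and $\varphi^{\beta H(I)}$ onto $\mathcal{A}_{I^c}$, denoted $\varphi_{I^c}$ and $\varphi^{\beta H(I)}_{I^c}$ respectively, we have

$$ S(\varphi_{I^c}, \varphi^{\beta H(I)}_{I^c}) \le 2\|\beta H(I)\|, \tag{21} $$

$$ S(\varphi^{\beta H(I)}_{I^c}, \varphi_{I^c}) \le 2\|\beta H(I)\|. \tag{22} $$

By applying the argument in § 2 and 3 of [3] to the present case, (21) implies that $\varphi_{I^c}$ quasi-contains $\varphi^{\beta H(I)}_{I^c}$, and also for (22), vice-versa. (The notion of quasi-containment given in this reference is as follows. For a pair of representations $\pi_1$ and $\pi_2$ of a C∗-algebra, if there is a subrepresentation of $\pi_1$ which is quasi-equivalent to $\pi_2$, then $\pi_1$ is said to quasi-contain $\pi_2$.) Therefore $\varphi_{I^c}$ and $\varphi^{\beta H(I)}_{I^c}$ are quasi-equivalent.

Let $(\mathcal{H}_{\varphi_{I^c}}, \pi_{\varphi_{I^c}}, \Omega_{\varphi_{I^c}})$ and $(\mathcal{H}_{\varphi^{\beta H(I)}_{I^c}}, \pi_{\varphi^{\beta H(I)}_{I^c}}, \Omega_{\varphi^{\beta H(I)}_{I^c}})$ be GNS representations for $\varphi_{I^c}$ and $\varphi^{\beta H(I)}_{I^c}$, and let $M_{\varphi_{I^c}}$ and $M_{\varphi^{\beta H(I)}_{I^c}}$ be the von Neumann algebras generated by those representations of $\mathcal{A}_{I^c}$. By taking the restriction of the canonical isomorphism between the von Neumann algebras $M_{\varphi_{I^c}}$ and $M_{\varphi^{\beta H(I)}_{I^c}}$, which maps $\pi_{\varphi_{I^c}}(A)$ to $\pi_{\varphi^{\beta H(I)}_{I^c}}(A)$ for $A \in \mathcal{A}_{I^c}$, onto their centers $Z_{\varphi_{I^c}} := M_{\varphi_{I^c}} \cap M_{\varphi_{I^c}}'$ and $Z_{\varphi^{\beta H(I)}_{I^c}} := M_{\varphi^{\beta H(I)}_{I^c}} \cap M_{\varphi^{\beta H(I)}_{I^c}}'$, we have

$$ Z_{\varphi_{I^c}} \sim_\Theta Z_{\varphi^{\beta H(I)}_{I^c}}. \tag{23} $$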
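In the above derivation, we have noted that even and odd parts of von Neumann algebras generated by a GNS representation are weak limits of even and odd parts of the underlying C∗-system (mapped onto the GNS space), and hence the canonical isomorphism conjugating a pair of quasi-equivalent representations and its restriction to $\Theta$-invariant subalgebras are grading preserving.

We shall construct a GNS representation of $\varphi^{\beta H(I)}$ (on $\mathcal{A}$) from the above $(\mathcal{H}_{\varphi^{\beta H(I)}_{I^c}}, \pi_{\varphi^{\beta H(I)}_{I^c}}, \Omega_{\varphi^{\beta H(I)}_{I^c}})$ on $\mathcal{A}_{I^c}$ and a GNS representation of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ denoted $(\mathcal{K}_I, \kappa_I, \Omega_I)$. Define

$$ \begin{aligned} \mathcal{K} &:= \mathcal{K}_I \otimes \mathcal{H}_{\varphi^{\beta H(I)}_{I^c}}, \qquad \Omega := \Omega_I \otimes \Omega_{\varphi^{\beta H(I)}_{I^c}}, \qquad V_I := \kappa_I(v_I) \otimes 1_{I^c}, \\ \hat{\kappa}_I(A) &:= \kappa_I(A) \otimes 1_{I^c} \quad \text{for } A \in \mathcal{A}_I, \\ \hat{\pi}_{\varphi^{\beta H(I)}_{I^c}}(A) &:= 1_I \otimes \pi_{\varphi^{\beta H(I)}_{I^c}}(A) \quad \text{for } A \in \mathcal{A}_{I^c}, \end{aligned} \tag{24} $$

where $1_I$ and $1_{I^c}$ are the identity operators on $\mathcal{K}_I$ and $\mathcal{H}_{\varphi^{\beta H(I)}_{I^c}}$, and $v_I$ is given by (16). Noting $\mathrm{Ad}(v_I) = \Theta|_{\mathcal{A}_I}$, we have a unique representation $\kappa$ of the total system $\mathcal{A}$ on $\mathcal{K}$ satisfying

$$ \kappa(A) = \hat{\kappa}_I(A) \quad \text{for } A \in \mathcal{A}_I, \tag{25} $$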
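and

$$ \kappa(B_+) = \hat{\pi}_{\varphi^{\beta H(I)}_{I^c}}(B_+) \quad \text{for } B_+ \in \mathcal{A}^e_{I^c}, \qquad \kappa(B_-) = V_I\, \hat{\pi}_{\varphi^{\beta H(I)}_{I^c}}(B_-) \quad \text{for } B_- \in \mathcal{A}^o_{I^c}. \tag{26} $$

By (14), i.e., the product property of $\varphi^{\beta H(I)}$ for $\mathcal{A}_I$ and $\mathcal{A}_{I^c}$, we verify that this $(\mathcal{K}, \kappa, \Omega)$ gives a GNS triplet of $\varphi^{\beta H(I)}$. We have also

$$ M_\kappa := \kappa(\mathcal{A})'' = \bigl(\kappa_I(\mathcal{A}_I) \otimes \pi_{\varphi^{\beta H(I)}_{I^c}}(\mathcal{A}_{I^c})\bigr)'' = \bigl(\kappa_I(\mathcal{A}_I)\bigr)'' \otimes M_{\varphi^{\beta H(I)}_{I^c}}. \tag{27} $$

Since $\varphi^{\beta H(I)}_{I^c}$ is even, that is, $\Theta|_{\mathcal{A}_{I^c}}$-invariant, we have a unitary operator $U_{I^c}$ of $\mathcal{H}_{\varphi^{\beta H(I)}_{I^c}}$ which implements $\Theta|_{\mathcal{A}_{I^c}}$ in its GNS space $(\mathcal{H}_{\varphi^{\beta H(I)}_{I^c}}, \pi_{\varphi^{\beta H(I)}_{I^c}}, \Omega_{\varphi^{\beta H(I)}_{I^c}})$. As in (17), $\mathrm{Ad}(U_{I^c})$ determines the even and odd parts of $M_{\varphi^{\beta H(I)}_{I^c}}$. Accordingly by (18), the grading is induced on the center $Z_{\varphi^{\beta H(I)}_{I^c}}$ and it is decomposed into $Z^e_{\varphi^{\beta H(I)}_{I^c}}$ and $Z^o_{\varphi^{\beta H(I)}_{I^c}}$. For $\mathcal{A}_I$, $\kappa_I(v_I)$ gives a unitary operator implementing $\Theta|_{\mathcal{A}_I}$. By the construction of $(\mathcal{K}, \kappa, \Omega)$,

$$ U := \kappa_I(v_I) \otimes U_{I^c} \in B(\mathcal{K}) \tag{28} $$

gives a unitary operator which implements $\Theta$ for $\varphi^{\beta H(I)}$. This $U$ gives a grading for $M_\kappa$ and it is split into $M^e_\kappa$ and $M^o_\kappa$. Also by this grading the center $Z_\kappa := M_\kappa \cap M_\kappa'$ is decomposed into $Z^e_\kappa$ and $Z^o_\kappa$. Note that the center of the tensor product of a pair of von Neumann algebras is equal to the tensor product of their centers by the commutant theorem (Corollary 5.11 in Chap. IV of [26]). Since $\mathcal{A}_I$ is a full matrix algebra, and the center of any state on it is trivial, by (27) we have

$$ Z_\kappa = 1_I \otimes Z_{\varphi^{\beta H(I)}_{I^c}}. \tag{29} $$

Moreover from (28) and (29) it follows that

$$ Z^e_\kappa = 1_I \otimes Z^e_{\varphi^{\beta H(I)}_{I^c}}, \qquad Z^o_\kappa = 1_I \otimes Z^o_{\varphi^{\beta H(I)}_{I^c}}, \tag{30} $$

where we have noted that the grading of $Z_{\varphi^{\beta H(I)}_{I^c}}$ is determined by the unitary $U_{I^c}$. The equalities (29) and (30) give

$$ Z_\kappa \sim_\Theta Z_{\varphi^{\beta H(I)}_{I^c}}. \tag{31} $$

Combining (31) with (23) we have

$$ Z_{\varphi_{I^c}} \sim_\Theta Z_{\varphi^{\beta H(I)}_{I^c}} \sim_\Theta Z_\kappa. \tag{32} $$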
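Since $(\mathcal{K}, \kappa, \Omega)$ and $(\mathcal{H}_\varphi, \pi_\varphi, \Omega^{\beta H(I)}_\varphi)$ are both GNS representations of the same state $\varphi^{\beta H(I)}$ on $\mathcal{A}$, they are apparently unitarily equivalent. The representation $(\mathcal{H}_\varphi, \pi_\varphi, \Omega^{\beta H(I)}_\varphi)$ obviously induces the same von Neumann algebra as $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$, namely $M_\varphi$. Hence $(\mathcal{K}, \kappa, \Omega)$ and $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$ are unitarily equivalent as representations of $\mathcal{A}$. Taking the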
restriction of the unitary map which conjugates those equivalent representations of $\mathcal{A}$ onto the center, we have

$$ Z_\kappa \sim_\Theta Z_\varphi. \tag{33} $$

From (32) and (33), it follows that

$$ Z_{\varphi_{I^c}} \sim_\Theta Z_\varphi, \tag{34} $$

which is what we would like to have.
Remark 1. We note that the identification of two von Neumann algebras in (31) and in (34) does not imply that the underlying C∗-systems $\mathcal{A}$ and $\mathcal{A}_{I^c}$ are conjugated to each other in those representations.

Remark 2. We shall explain by an example that the formula (34) does not hold in general. Take the one-dimensional lattice $\mathbb{Z}$ and a site of it, say the origin 0. We prepare a non-factor quasi-free state $\rho$ [20] on $\mathcal{A}_{\{0\}^c}$, where $\{0\}^c$ denotes the complementary region of $\{0\}$. The factor decomposition of $\rho$ is given by $\rho = \tfrac{1}{2}(\psi + \psi\Theta)$, where $\psi$ is a noneven factor state of $\mathcal{A}_{\{0\}^c}$. Take a (unique) product state extension of the tracial state $\mathrm{tr}_{\{0\}}$ of $\mathcal{A}_{\{0\}}$ and $\psi$ to the total system $\mathcal{A}$, which is denoted $\tilde{\psi}$. We see that the state $\tilde{\psi}\Theta$ on $\mathcal{A}$ is equal to the state extension of $\mathrm{tr}_{\{0\}}$ and $\psi\Theta$ to $\mathcal{A}$. Let $(\mathcal{H}_{\tilde{\psi}}, \pi_{\tilde{\psi}}, \Omega_{\tilde{\psi}})$ be a GNS triplet of $\tilde{\psi}$. Take an odd unitary $u$ of $\mathcal{A}_{\{0\}}$, say, $u = a_0 + a_0^*$ (so that $u^2 = \{a_0, a_0^*\} = 1$). Define $\xi := \tfrac{1}{\sqrt{2}}\bigl(\Omega_{\tilde{\psi}} + \pi_{\tilde{\psi}}(u)\Omega_{\tilde{\psi}}\bigr)$, which is a unit vector of $\mathcal{H}_{\tilde{\psi}}$. Let $\varphi_\xi$ denote the state determined by $\varphi_\xi(A) := (\pi_{\tilde{\psi}}(A)\xi, \xi)$ for $A \in \mathcal{A}$. It is clear that this $\varphi_\xi$ is a factor state of $\mathcal{A}$ by its construction. By direct computation, its restriction onto $\mathcal{A}_{\{0\}^c}$ is equal to $\rho$. Hence $\varphi_\xi$ is a factor state whose restriction to the subsystem $\mathcal{A}_{\{0\}^c}$ is a non-factor.

5. Violation of the Local Thermal Stability for Noneven KMS States

For some technical reason we shall work with KMS states [17] (not directly with Gibbs states). Let $\alpha_t$ ($t \in \mathbb{R}$) be a one-parameter group of ∗-automorphisms of $\mathcal{A}$. A state $\varphi$ is called an $(\alpha_t, \beta)$-KMS state if it satisfies $\varphi\bigl(A \alpha_{i\beta}(B)\bigr) = \varphi(BA)$ for every $A \in \mathcal{A}$ and $B \in \mathcal{A}_{\mathrm{ent}}$, where $\mathcal{A}_{\mathrm{ent}}$ denotes the set of all $B \in \mathcal{A}$ for which $\alpha_t(B)$ has an analytic extension to an $\mathcal{A}$-valued entire function $\alpha_z(B)$ as a function of $z \in \mathbb{C}$. Our dynamics $\alpha_t$ is assumed to be even, namely $\alpha_t \Theta = \Theta \alpha_t$ for each $t \in \mathbb{R}$. We also put the following assumptions in order to relate $\alpha_t$ with some $\delta \in D(\mathcal{A}_{\mathrm{loc}})$.

(I) The domain of the generator $\delta_\alpha$ of $\alpha_t$ includes $\mathcal{A}_{\mathrm{loc}}$.
(II) $\mathcal{A}_{\mathrm{loc}}$ is a core of $\delta_\alpha$.

The next proposition asserts the equivalence of the KMS and Gibbs conditions under (I, II). The proof was given for the lattice fermion systems in Theorem 7.5 (the implication from KMS to Gibbs under the assumption (I)) and Theorem 7.6 (the converse direction under the assumption (I, II)) of [11]. The proof for the graded lattice systems can be done in much the same way and we shall omit it. We emphasize that this equivalence does not require the evenness of states, which becomes essential in the proof of Proposition 5.
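Proposition 4. Let $\alpha_t$ be an even dynamics satisfying Conditions (I, II). Let $\delta\ (\in D(\mathcal{A}_{\mathrm{loc}}))$ be the restriction of its generator $\delta_\alpha$ to $\mathcal{A}_{\mathrm{loc}}$. Then a state $\varphi$ of $\mathcal{A}$ satisfies the $(\alpha_t, \beta)$-KMS condition if and only if it satisfies the $(\delta, \beta)$-Gibbs condition.

One would ask whether fermion grading symmetry is perfectly preserved or not for non-zero temperature states. (It is plausible that we can derive a stronger statement about the unbroken symmetry of fermion grading for KMS states than Proposition 1.) We leave this question for future study. Here we show the following rather weak statement. If there is a nonzero odd element in the center of some even KMS state for even dynamics satisfying (I, II), then there always exist noneven KMS states that do not satisfy the local thermal stability (LTS). This LTS refers to LTS-P in the terminology of [10] (not LTS-M there). The content of the local thermal stability condition is summarized in the Appendix. We give some relevant material here.

Let $\varphi$ be an arbitrary even $(\alpha_t, \beta)$-KMS state. For $I \in F_{\mathrm{loc}}$, which is now fixed, $\varphi^{\beta H(I)}$ denotes the perturbed state of $\varphi$ by $\beta H(I)$. From the given $\delta \in D(\mathcal{A}_{\mathrm{loc}})$ and $I \in F_{\mathrm{loc}}$, a new ∗-derivation $\tilde{\delta} \in D(\mathcal{A}_{\mathrm{loc}})$ is given as follows. Let $\Phi \in P$ denote the potential corresponding to $\delta$. Define $\tilde{\Phi} \in P$ by

$$ \tilde{\Phi}(J) := 0, \ \text{if } J \cap I \ne \emptyset, \qquad \tilde{\Phi}(J) := \Phi(J), \ \text{otherwise.} \tag{35} $$

We denote the ∗-derivation corresponding to $\tilde{\Phi}$ by $\tilde{\delta} \in D(\mathcal{A}_{\mathrm{loc}})$. By definition, $\tilde{\delta}$ acts trivially on $\mathcal{A}_I$. The one-parameter group of ∗-automorphisms of $\mathcal{A}$ generated by $\tilde{\delta}$ is equal to the perturbation of $\alpha_t$ by $H(I)$ given in terms of the Dyson-Schwinger expansion series and denoted $\tilde{\alpha}_t$. By Proposition 4 and its proof found in [11], $\varphi^{\beta H(I)}$ satisfies the $(\tilde{\alpha}_t, \beta)$-KMS condition and the $(\tilde{\delta}, \beta)$-Gibbs condition.

We recall the GNS representation $(\mathcal{K}, \kappa, \Omega)$ of $\varphi^{\beta H(I)}$ previously defined in (24), (25), (26). Let $p$ be a nonzero projection in $Z_\kappa$ which has a unique even-odd decomposition $p = p_+ + p_-$, $p_+ \in Z^e_\kappa$ and $p_- \in Z^o_\kappa$. By (29) we can write $p = 1_I \otimes q$ with some $q \in Z_{\varphi^{\beta H(I)}_{I^c}}$. Furthermore by (30), we have $p_+ = 1_I \otimes q_+$ with $q_+ \in Z^e_{\varphi^{\beta H(I)}_{I^c}}$ and $p_- = 1_I \otimes q_-$ with $q_- \in Z^o_{\varphi^{\beta H(I)}_{I^c}}$. We define a positive linear functional on $\mathcal{A}$ by

$$ \varphi^{\beta H(I)}_p(A) := (\kappa(A)\Omega, p\Omega) \quad \text{for } A \in \mathcal{A}. \tag{36} $$

We take its restriction onto $\mathcal{A}_{I^c}$. For $A_+ \in \mathcal{A}^e_{I^c}$, we have

$$ \begin{aligned} \varphi^{\beta H(I)}_p(A_+) &= (\kappa(A_+)\Omega, p\Omega) \\ &= \Bigl( \bigl(1_I \otimes \pi_{\varphi^{\beta H(I)}_{I^c}}(A_+)\bigr)\, \Omega_I \otimes \Omega_{\varphi^{\beta H(I)}_{I^c}},\ \Omega_I \otimes q\,\Omega_{\varphi^{\beta H(I)}_{I^c}} \Bigr) \\ &= \bigl( \pi_{\varphi^{\beta H(I)}_{I^c}}(A_+)\Omega_{\varphi^{\beta H(I)}_{I^c}},\ q\,\Omega_{\varphi^{\beta H(I)}_{I^c}} \bigr) \\ &= \bigl( \pi_{\varphi^{\beta H(I)}_{I^c}}(A_+)\Omega_{\varphi^{\beta H(I)}_{I^c}},\ q_+\Omega_{\varphi^{\beta H(I)}_{I^c}} \bigr), \end{aligned} \tag{37} $$

where in the last equality we have used the evenness of $\varphi^{\beta H(I)}$. For $A_- \in \mathcal{A}^o_{I^c}$,

$$ \begin{aligned} \varphi^{\beta H(I)}_p(A_-) &= (\kappa(A_-)\Omega, p\Omega) \\ &= \Bigl( \bigl(\kappa_I(v_I) \otimes \pi_{\varphi^{\beta H(I)}_{I^c}}(A_-)\bigr)\, \Omega_I \otimes \Omega_{\varphi^{\beta H(I)}_{I^c}},\ \Omega_I \otimes q\,\Omega_{\varphi^{\beta H(I)}_{I^c}} \Bigr) \\ &= \mathrm{tr}_I(v_I)\, \bigl( \pi_{\varphi^{\beta H(I)}_{I^c}}(A_-)\Omega_{\varphi^{\beta H(I)}_{I^c}},\ q\,\Omega_{\varphi^{\beta H(I)}_{I^c}} \bigr) = 0, \end{aligned} \tag{38} $$

where we have used $\mathrm{tr}_I(v_I) = 0$.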
If $p$ is even, i.e. $p = p_+ = 1_I \otimes q_+$ with $q_+ \in Z^e_{\varphi^{\beta H(I)}_{I^c}}$, then from (37), (38), and $\varphi^{\beta H(I)}(A_-) = 0$ for any $A_- \in \mathcal{A}^o$, it follows that

$$ \varphi^{\beta H(I)}_p(A) = \bigl( \pi_{\varphi^{\beta H(I)}_{I^c}}(A)\Omega_{\varphi^{\beta H(I)}_{I^c}},\ q_+\Omega_{\varphi^{\beta H(I)}_{I^c}} \bigr) \tag{39} $$

for any $A \in \mathcal{A}_{I^c}$.

Suppose that $Z^o_\kappa$ is not empty, that is, it contains a nonzero element. Take any nonzero $f \in Z^o_\kappa$. Then $f + f^*$ and $if + (if)^*$ are self-adjoint elements in $Z^o_\kappa$. Since at least one of them is nonzero, we can take a self-adjoint element in $Z^o_\kappa$ whose operator norm is less than 1 and shall denote such an element by $f$. Let $p_f := \tfrac{1}{2}(1 + f)$, which is a positive operator. Define a noneven state

$$ \psi := 2\varphi^{\beta H(I)}_{p_f} \tag{40} $$

by substituting this $p_f$ into $p$ of (36). We easily see that $\psi\Theta$ is equal to $2\varphi^{\beta H(I)}_{p_{-f}}$ for $p_{-f} := \tfrac{1}{2}(1 - f)$. Their averaged state $\tfrac{1}{2}(\psi + \psi\Theta)$ is obviously equal to $\varphi^{\beta H(I)}$.
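The last claim can be spelled out in one line. The following worked equation is a sketch that writes $\Theta$ for the fermion grading involution and $\Omega$ for the GNS vector of $\varphi^{\beta H(I)}$ in the representation $\kappa$, and uses only the definition (36) together with $p_f + p_{-f} = 1$ and the additivity of $p \mapsto \varphi^{\beta H(I)}_p$.

```latex
\begin{aligned}
\tfrac{1}{2}\bigl(\psi + \psi\Theta\bigr)(A)
  &= \varphi^{\beta H(I)}_{p_f}(A) + \varphi^{\beta H(I)}_{p_{-f}}(A) \\
  &= \bigl(\kappa(A)\Omega,\ (p_f + p_{-f})\Omega\bigr) \\
  &= \bigl(\kappa(A)\Omega,\ \Omega\bigr)
   = \varphi^{\beta H(I)}(A), \qquad A \in \mathcal{A}.
\end{aligned}
```

In other words, $\psi$ and $\psi\Theta$ split the even state $\varphi^{\beta H(I)}$ into two pieces that are exchanged by the grading involution.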
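Proposition 5. Let $\alpha_t$ be an even dynamics satisfying (I, II) and let $\varphi$ be an arbitrary even $(\alpha_t, \beta)$-KMS state. For $I \in F_{\mathrm{loc}}$, let $\tilde{\alpha}_t$ denote the perturbed dynamics of $\alpha_t$ by the local Hamiltonian $H(I)$. Let $\tilde{\Phi}$ denote the potential for $\tilde{\alpha}_t$ given as (35). If the odd part of the center of the perturbed state $\varphi^{\beta H(I)}$ is not empty, then the noneven $(\tilde{\alpha}_t, \beta)$-KMS states $\psi$ and $\psi\Theta$ given as (40) violate the $(\tilde{\Phi}, \beta)$-LTS condition.

Proof. Since $\varphi^{\beta H(I)}$ is an $(\tilde{\alpha}_t, \beta)$-KMS state, $\psi$ and $\psi\Theta$ are also $(\tilde{\alpha}_t, \beta)$-KMS states by Theorem 5.3.30 of [14]. Accordingly $\varphi^{\beta H(I)}$, $\psi$ and $\psi\Theta$ are all $(\tilde{\delta}, \beta)$-Gibbs states by Proposition 4. We consider the state restrictions of $\varphi^{\beta H(I)}$, $\psi$, and $\psi\Theta$ onto $\mathcal{A}_{I^c}$. Since the even parts of $p_f$ and $p_{-f}$ are both scalar, it follows from (37) that

$$ \varphi^{\beta H(I)}|_{\mathcal{A}^e_{I^c}} = \psi|_{\mathcal{A}^e_{I^c}} = \psi\Theta|_{\mathcal{A}^e_{I^c}}. $$

Due to (38) all of them are even when restricted to $\mathcal{A}_{I^c}$. Hence we have

$$ \varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}} = \psi|_{\mathcal{A}_{I^c}} = \psi\Theta|_{\mathcal{A}_{I^c}}. \tag{41} $$

Denote the local Hamiltonians for the new potential $\tilde{\Phi}$ determined by the formula ($\Phi$-e) by $\{\tilde{H}(J)\}_{J \in F_{\mathrm{loc}}}$. From (35) it follows that $\tilde{H}(I) = 0$, and hence

$$ \varphi^{\beta H(I)}(\tilde{H}(I)) = \psi(\tilde{H}(I)) = \psi\Theta(\tilde{H}(I)) = 0. \tag{42} $$

We compute the conditional entropy of $\varphi^{\beta H(I)}$, $\psi$ and $\psi\Theta$ for the finite region $I$. The definition of conditional entropy is given in (47). Noting (14) we have

$$ S_I(\varphi^{\beta H(I)}) = -S\bigl(\mathrm{tr}_I \circ \varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}},\ \varphi^{\beta H(I)}\bigr) = -S\bigl(\mathrm{tr}_I \circ \varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}},\ \mathrm{tr}_I \circ \varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}}\bigr) = 0, \tag{43} $$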
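which is the maximum value of $S_I(\cdot)$. For $\psi$, using (41) and then (14) we have

$$ S_I(\psi) = -S\bigl(\mathrm{tr}_I \circ \psi|_{\mathcal{A}_{I^c}},\ \psi\bigr) = -S\bigl(\mathrm{tr}_I \circ \varphi^{\beta H(I)}|_{\mathcal{A}_{I^c}},\ \psi\bigr) = -S\bigl(\varphi^{\beta H(I)},\ \psi\bigr). \tag{44} $$

Since $\varphi^{\beta H(I)} \ne \psi$ (the former is even and the latter is noneven), it follows from this equality and the strict positivity of relative entropy (see [4]) that

$$ S_I(\psi) < 0. $$

By the automorphism invariance (acting on two states in the argument) of relative entropy, we have

$$ S_I(\psi\Theta) = S_I(\psi) < 0. \tag{45} $$

Substituting (42), (43), and (45) into (48), we obtain

$$ F^{\tilde{\Phi}}_{I,\beta}(\psi) = F^{\tilde{\Phi}}_{I,\beta}(\psi\Theta) < F^{\tilde{\Phi}}_{I,\beta}(\varphi^{\beta H(I)}) = 0. \tag{46} $$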
This strict inequality with (41) shows that $\psi$ and $\psi\Theta$ do not satisfy the $(\tilde{\Phi}, \beta)$-LTS condition (49), although both of them satisfy the $(\tilde{\delta}, \beta)$-Gibbs condition.

Appendix

Local thermal stability (LTS) condition. Let $(\mathcal{A}, \{\mathcal{A}_I\}_{I \in F_{\mathrm{loc}}})$ be a lattice system considered in § 4. In [10] the local thermal stability (LTS) is studied for the lattice fermion systems. It is easy to see that the same formulation is available for the graded lattice systems under consideration. Let $\omega$ be a state of $(\mathcal{A}, \{\mathcal{A}_I\}_{I \in F_{\mathrm{loc}}})$. For $I \in F_{\mathrm{loc}}$, the conditional entropy of $\omega$ is defined in terms of the relative entropy (19) by

$$ S_I(\omega) := -S\bigl(\mathrm{tr}_I \circ \omega|_{\mathcal{A}_{I^c}},\ \omega\bigr) = -S(\omega \cdot E_{I^c},\ \omega) \le 0, \tag{47} $$

where $E_{I^c}$ is the conditional expectation onto $\mathcal{A}_{I^c}$ with respect to the tracial state and $\omega \cdot E_{I^c}(A) := \omega(E_{I^c}(A))$ for $A \in \mathcal{A}$. Let $\Phi \in P$. The conditional free energy of $\omega$ for $I \in F_{\mathrm{loc}}$ is given by

$$ F^{\Phi}_{I,\beta}(\omega) := S_I(\omega) - \beta\,\omega(H(I)), \tag{48} $$

where $H(I)$ is a local Hamiltonian for $I$ with respect to $\Phi$.

Definition 4. Let $\Phi$ be a potential in $P$. A state $\varphi$ of $\mathcal{A}$ is said to satisfy the local thermal stability condition for $\Phi$ at inverse temperature $\beta$, or $(\Phi, \beta)$-LTS condition, if for each $I \in F_{\mathrm{loc}}$,

$$ F^{\Phi}_{I,\beta}(\varphi) \ge F^{\Phi}_{I,\beta}(\omega) \tag{49} $$

for any state $\omega$ satisfying $\omega|_{\mathcal{A}_{I^c}} = \varphi|_{\mathcal{A}_{I^c}}$.
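To make the quantities (47) to (49) tangible, the following Python sketch evaluates them in a purely commutative toy model, which is an assumption made only for illustration and is not the graded setting of the paper. The region I carries a classical variable x with d values, the outside system a classical variable y, a state is a joint probability table `p[x][y]`, and the local Hamiltonian is a function table `h[x][y]`. All names and numerical values are hypothetical.

```python
import math


def outside_marginal(p):
    """Restriction of the state to the outside system: p_c[y] = sum_x p[x][y]."""
    return [sum(p[x][y] for x in range(len(p))) for y in range(len(p[0]))]


def conditional_entropy(p):
    """Commutative analogue of (47): minus the relative entropy between the
    product of the uniform (tracial) state on I with the outside marginal,
    and the state itself."""
    d = len(p)
    p_c = outside_marginal(p)
    total = 0.0
    for x in range(d):
        for y in range(len(p_c)):
            if p[x][y] > 0.0:  # convention 0 log 0 = 0
                total += p[x][y] * (math.log(p[x][y]) - math.log(p_c[y] / d))
    return -total


def conditional_free_energy(p, h, beta):
    """Commutative analogue of (48): S_I(p) - beta * (mean of h under p)."""
    energy = sum(p[x][y] * h[x][y]
                 for x in range(len(p)) for y in range(len(p[0])))
    return conditional_entropy(p) - beta * energy


def gibbs_with_marginal(p_c, h, beta):
    """State with outside marginal p_c whose conditional law on I is
    proportional to exp(-beta * h[x][y])."""
    d = len(h)
    p = [[0.0] * len(p_c) for _ in range(d)]
    for y in range(len(p_c)):
        weights = [math.exp(-beta * h[x][y]) for x in range(d)]
        z = sum(weights)
        for x in range(d):
            p[x][y] = p_c[y] * weights[x] / z
    return p


beta = 1.0
p_c = [0.3, 0.7]
h = [[0.0, 1.0], [0.5, -0.2]]

uniform = [[0.15, 0.35], [0.15, 0.35]]     # uniform on I times the marginal
correlated = [[0.3, 0.0], [0.0, 0.7]]      # x is determined by y
gibbs = gibbs_with_marginal(p_c, h, beta)

for name, state in [("uniform", uniform), ("correlated", correlated),
                    ("gibbs", gibbs)]:
    print(name, conditional_entropy(state),
          conditional_free_energy(state, h, beta))
```

In this toy model all three states share the same outside marginal, the conditional entropy vanishes only for the product with the uniform state, and the Gibbs-type state attains the largest conditional free energy among the three, which is the classical shadow of the variational inequality (49).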
There is another definition of local thermal stability in [10] that has the same variational principle formula as above but takes the commutant algebra $\mathcal{A}_I'$ as the complementary outside system of a local region $I$ instead of $\mathcal{A}_{I^c}$. We shall call this alternative local thermal stability condition the LTS′ condition, where the superscript '′' stands for the commutant. (Also by '′' we mean that this formalism is not so natural compared to Definition 4 if we respect the given quasi-local structure. Nevertheless, there are some mathematically good points with LTS′.) The equivalence of KMS and LTS′ conditions holds for the lattice fermion systems without assuming the evenness of states. For our LTS, on the contrary, such an evenness assumption is required in deriving its equivalence to the KMS condition. (The formalism of LTS′ using commutants for complementary outside systems makes it possible to exploit the known arguments for quantum spin lattice systems [13].)

Acknowledgements. I thank KEK and IHES where this work was done. I thank Professor Araki for providing valuable suggestions that clarify the discussion. I thank Professor Tsutsui for kind hospitality at KEK. I acknowledge the JSPS Postdoctoral Fellowships for Research Abroad (Aug 2003–May 2005) and IHES. I thank Professor Ruelle for kind hospitality at IHES. I have been supported by the COE post-doctoral fellowship of the Mathematics Department of Hokkaido University since Jun 2005, which is greatly appreciated.
References
1. Araki, H.: Relative hamiltonian for faithful normal states of a von Neumann algebra. Publ. RIMS, Kyoto Univ. 7, 165–209 (1973)
2. Araki, H.: Positive cone, Radon-Nikodym theorems, relative hamiltonian and the Gibbs condition in statistical mechanics. An application of the Tomita-Takesaki theory. In: C∗-algebras and their applications to statistical mechanics and quantum field theory. D. Kastler, ed. Bologna: Editrice Composition, pp 64–100, 1975
3. Araki, H.: On uniqueness of KMS states of one-dimensional quantum lattice systems. Commun. Math. Phys. 44, 1–7 (1975)
4. Araki, H.: Relative entropy of states of von Neumann algebras. Publ. RIMS, Kyoto Univ. 11, 809–833 (1976)
5. Araki, H.: Relative entropy for states of von Neumann algebras II. Publ. RIMS, Kyoto Univ. 13, 173–192 (1977)
6. Araki, H.: On superselection rules. Proc. 2nd Int. Symp. Foundations of Quantum Mechanics, Tokyo, (1986) pp. 348–354 Physical Society of Japan, Tokyo (1987)
7. Araki, H.: Conditional expectations relative to a product state and the corresponding standard potentials. Commun. Math. Phys. 246, 113–132 (2004)
8. Araki, H.: Ryoushiba no Suuri (Japanese). Iwanami, 1996. Mathematical Theory of Quantum Fields. Translation by Watamura, U. C., Oxford University Press, Oxford (1999)
9. Araki, H., Ion, P.D.F.: On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems. Commun. Math. Phys. 35, 1–12 (1974); Araki, H.: On the equivalence of the KMS condition and the variational principle for quantum lattice systems. Commun. Math. Phys. 38, 1–10 (1974)
10. Araki, H., Moriya, H.: Local thermodynamical stability of fermion lattice systems. Lett. Math. Phys. 60, 109–121 (2002)
11. Araki, H., Moriya, H.: Equilibrium statistical mechanics of fermion lattice systems. Rev. Math. Phys. 15, 93–198 (2003)
12. Araki, H., Moriya, H.: Joint extension of states of subsystems for a CAR system. Commun. Math. Phys. 237, 105–122 (2003)
13. Araki, H., Sewell, G.L.: KMS conditions and local thermodynamical stability of quantum lattice systems. Commun. Math. Phys. 52, 103–109 (1977); Sewell, G.L.: KMS conditions and local thermodynamical stability of quantum lattice systems II. Commun. Math. Phys. 55, 53–61 (1977)
14. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics I and II, Springer-Verlag, Berlin Heidelberg New York (1979 and 1981)
15. Driessler, D., Summers, S. J.: Central decomposition of Poincaré-invariant nets. Ann. Inst. Henri Poincaré, Phys. Théor. 43, 147–166 (1985)
16. Haag, R.: Local Quantum Physics. Springer-Verlag, Berlin Heidelberg New York (1996)
17. Haag, R., Hugenholz, N.M., Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215–236 (1967)
18. Lanford III, O.E., Robinson, D.W.: Mean entropy of states in quantum statistical mechanics. J. Math. Phys. 9, 1120–1125 (1968)
19. Lanford III, O.E., Ruelle, D.: Observable at infinity and states with short range correlations in statistical mechanics. Commun. Math. Phys. 13, 194–215 (1969)
20. Manuceau, J., Verbeure, A.: Non-factor quasi-free states of the CAR-algebra. Commun. Math. Phys. 18, 319–326 (1970)
21. Moriya, H.: Some aspects of quantum entanglement for CAR systems. Lett. Math. Phys. 60, 109–121 (2002); On separable states for composite systems of distinguishable fermions. J. Phys. A: Math. Gen in press. Validity and failure of some entropy inequalities for CAR systems. J. Math. Phys. 46, 033508 (2005). On a state having pure-state restrictions for a pair of regions. Interdisc. Inf. Sci. 10, 31–40 (2004)
22. Narnhofer, H., Thirring, W.: Spontaneously broken symmetries. Ann. Inst. Henri Poincaré, Phys. Théor. 70, 1–21 (1999)
23. Powers, R.T.: Representations of the canonical anticommutation relations. Thesis, Princeton University, 1967
24. Robinson, D.W.: A characterization of clustering states. Commun. Math. Phys. 41, 79–88 (1975)
25. Ruelle, D.: Statistical Mechanics, Rigorous Results. Benjamin, New York (1969)
26. Takesaki, M.: Theory of Operator Algebras I. Springer-Verlag, Berlin Heidelberg New York (1979)
27. Weinberg, S.: The Quantum Theory of Fields I. Cambridge University Press, Cambridge (2002)
28. Wick, G.C., Wightman, A.S., Wigner, E.P.: The intrinsic parity of elementary particles. Phys. Rev. 88, 101–105 (1952)
Communicated by H. Spohn
Commun. Math. Phys. 264, 427–464 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1486-3
Fermionic Characters and Arbitrary Highest-Weight Integrable $\widehat{\mathfrak{sl}}_{r+1}$-Modules
Eddy Ardonne^1, Rinat Kedem^2, Michael Stone^1
1 Department of Physics, University of Illinois, 1110 W. Green St., Urbana, IL 61801, USA. E-mail: [email protected]; [email protected]
2 Department of Mathematics, University of Illinois, 1409 W. Green Street, Urbana, IL 61801, USA. E-mail: [email protected]
Received: 18 April 2005 / Accepted: 6 July 2005 Published online: 10 February 2006 – © Springer-Verlag 2006
Abstract: This paper contains the generalization of the Feigin-Stoyanovsky construction to all integrable $\widehat{\mathfrak{sl}}_{r+1}$-modules. We give formulas for the q-characters of any highest-weight integrable module of $\widehat{\mathfrak{sl}}_{r+1}$ as a linear combination of the fermionic q-characters of the fusion products of a special set of integrable modules. The coefficients in the sum are the entries of the inverse matrix of generalized Kostka polynomials in $q^{-1}$. We prove the conjecture of Feigin and Loktev regarding the q-multiplicities of irreducible modules in the graded tensor product of rectangular highest-weight modules in the case of $\mathfrak{sl}_{r+1}$. We also give the fermionic formulas for the q-characters of the (non-level-restricted) fusion products of rectangular highest-weight integrable $\widehat{\mathfrak{sl}}_{r+1}$-modules.
1. Introduction
Fermionic formulæ for characters of highest-weight modules of affine algebras or vertex algebras first appeared in a purely algebraic context [17]. They were later shown [13, 12] to be related to the partition functions of certain statistical mechanical systems at their critical points. These character formulæ have desirable combinatorial properties, such as the manifest positivity of the coefficients that represent weight-space multiplicities. They also have a physical significance because they reflect the quasi-particle content of the statistical mechanical system. Consequently, algebraic constructions of bases for representations which reveal this combinatorial structure are important, and have been studied using several methods in the past dozen years. One such method is that of Feigin and Stoyanovskiĭ [23]. These authors used a theorem of Primc [21] to give an interesting construction of the vacuum integrable modules of the affine algebra $\widehat{\mathfrak{g}}$ associated to any simple Lie algebra $\mathfrak{g}$. Their construction relies on the loop generators of the affine algebra. Physical systems associated with such integrable $\widehat{\mathfrak{g}}$-modules are generalizations of the Heisenberg spin chain in statistical mechanics, or the WZW model in conformal field theory.
The formulæ of Feigin-Stoyanovskiĭ [23] have an attractive interpretation in terms of (a bosonic version of) non-abelian quantum Hall states [19, 2]. In these states there are r “types” of particles that obey a generalized exclusion principle: the wave function vanishes if any k + 1 particles occupy the same state. Here r is the rank of the algebra and k is the level of the integrable $\widehat{\mathfrak{g}}$-module. In the presence of quasi-particle excitations, the wave functions can also vanish if fewer than k + 1 particles occupy the same state. The statistics of the quasi-particles is ‘dual’ to the statistics of the fundamental particles [1].
The original construction of Feigin-Stoyanovskiĭ can be used to compute [23] characters of vacuum (with highest weight $k\Lambda_0$) representations of affine algebras. Later, Georgiev [10, 9] generalized it to some modules in the ADE series, with particularly simple highest weights, of the form $l\omega_j + k\Lambda_0$, corresponding to special rectangular Young diagrams. (Here $\omega_j$ are certain fundamental $\mathfrak{g}$-weights, and $l \in \mathbb{Z}_{\ge 0}$.) In general, no fermionic formulæ are available for arbitrary highest-weight, integrable $\widehat{\mathfrak{g}}$-modules.
In this paper, we resolve this problem for the case of $\widehat{\mathfrak{sl}}_{r+1}$. We explain, in terms of the functional realization of Feigin and Stoyanovskiĭ, why such ‘rectangular highest weight’ modules are very special, and why there is no direct fermionic construction for other modules. However, we prove that it is possible to compute the character of any module as a finite sum of fermionic characters of the ‘rectangular’ highest-weight modules. The coefficients in this sum are the entries of the inverse matrix of generalized Kostka polynomials. These coefficients are, however, not manifestly positive (or even of positive degree). In our construction we are naturally led to the graded tensor product of Feigin and Loktev [8] of finite-dimensional $\mathfrak{g}$-modules.
In the case of irreducible $\mathfrak{sl}_{r+1}$-modules with highest weights of the form $l\omega_j$ (where $\omega_j$ is any fundamental weight), we compute the explicit fermionic form of the graded multiplicities of irreducible modules in the Feigin-Loktev tensor product, thus proving two of the conjectures of [8]: that the graded tensor product in this case is independent of the evaluation parameters, and that it is related to the generalized Kostka polynomials of [22, 16].
The plan of the paper is as follows. In Sect. 2 we give the basic definitions of the algebra and its modules. In Sects. 3 and 4, we supply the details of the generalized construction of [23] for integrable modules of $\widehat{\mathfrak{sl}}_{r+1}$, with highest weights corresponding to rectangular Young diagrams. In Sect. 5, we explain a similar calculation of graded characters of conformal blocks or coinvariants (the fusion product of [8]), which turn out to be related to the generalized Kostka polynomials of [22, 16]. We then use this calculation in Sect. 6 to compute the characters of arbitrary highest-weight representations. See Theorem 6.3 for the main result.
For the sake of clarity, we concentrate in this paper on the case of $\mathfrak{g} = \mathfrak{sl}_{r+1}$, but the generalization to affine algebras associated with other simple Lie algebras is possible. In that case one should replace the integrable $\widehat{\mathfrak{g}}$-modules that have irreducible $\mathfrak{g}$-modules as their top component by those which have (the degeneration to the classical case of) Kirillov-Reshetikhin modules as their top component. We will give this construction in a future publication.
2. Notation
2.1. Current generators of affine algebras. Let $\mathfrak{g} = \mathfrak{sl}_{r+1}$ and let $\Pi = \{\alpha_i \mid i = 1, \dots, r\}$ denote its simple roots, and $\{\omega_i \mid i = 1, \dots, r\}$ the fundamental weights. Let $\{e_{\alpha_i} = e_i \mid i = 1, \dots, r\}$ denote the corresponding generators of $\mathfrak{n}_+$, and $\{f_{\alpha_i} = f_i \mid i = 1, \dots, r\}$ those of $\mathfrak{n}_-$. We have the Cartan decomposition $\mathfrak{sl}_{r+1} \simeq \mathfrak{n}_+ \oplus \mathfrak{h} \oplus \mathfrak{n}_-$, where $\mathfrak{h}$ is the Cartan subalgebra. Irreducible, finite-dimensional highest-weight $\mathfrak{g}$-modules $\pi_\lambda$ are parametrized by weights $\lambda \in P^+$, that is, $\lambda = l_1\omega_1 + \cdots + l_r\omega_r$ with $l_i \in \mathbb{Z}_{\ge 0}$. The subset of $P^+$ consisting of weights $\lambda$ such that $\sum_{i=1}^r l_i \le k$ is called the set of level-k restricted weights, $P_k^+$. The affine Lie algebra associated with $\mathfrak{g}$ is $\widehat{\mathfrak{g}}$, where $\widehat{\mathfrak{g}} \simeq \mathfrak{g} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c \oplus \mathbb{C}d$, where $c$ is central and
\[ [d, x \otimes t^n] = -n\, x \otimes t^n. \tag{2.1} \]
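As a small illustration of the set $P_k^+$ defined above, the following Python sketch enumerates the level-k restricted weights as coefficient tuples $(l_1, \dots, l_r)$. The function name `level_restricted_weights` is hypothetical and not from the paper; the count $\binom{k+r}{r}$ used as a check is the standard stars-and-bars count of non-negative integer solutions of $l_1 + \cdots + l_r \le k$.

```python
from itertools import product
from math import comb


def level_restricted_weights(r, k):
    """Return all tuples (l_1, ..., l_r) of non-negative integers
    with l_1 + ... + l_r <= k (hypothetical helper, for illustration)."""
    return [l for l in product(range(k + 1), repeat=r) if sum(l) <= k]


# Example: sl_3 (r = 2) at level k = 2.
weights = level_restricted_weights(2, 2)
print(weights)
# The number of such weights agrees with the binomial coefficient C(k + r, r).
print(len(weights) == comb(2 + 2, 2))
```

Each tuple stands for the weight $\lambda = l_1\omega_1 + \cdots + l_r\omega_r$; nothing beyond the inequality in the definition is assumed.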
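We denote the current generators by $x[n] \overset{\mathrm{def}}{=} x \otimes t^n$, $x \in \mathfrak{sl}_{r+1}$. Let $\langle x, y\rangle$ be the symmetric bilinear form on $\mathfrak{sl}_{r+1}$. Then the relations between the currents are
\[ [x \otimes f(t),\, y \otimes g(t)]_{\widehat{\mathfrak{g}}} = [x, y]_{\mathfrak{g}} \otimes f(t)g(t) + c\,\langle x, y\rangle \oint_{t=0} f'(t)\,g(t)\,dt, \]
where $[\cdot,\cdot]_{\mathfrak{g}}$ is the corresponding commutator in $\mathfrak{g}$. The Cartan decomposition is $\widehat{\mathfrak{g}} \simeq \widehat{\mathfrak{n}}_+ \oplus \widehat{\mathfrak{h}} \oplus \widehat{\mathfrak{n}}_-$ with $\widehat{\mathfrak{n}}_\pm = \mathfrak{n}_\pm \oplus (\mathfrak{sl}_{r+1} \otimes t^{\pm 1}\mathbb{C}[t^{\pm 1}])$ and $\widehat{\mathfrak{h}} = \mathfrak{h} \oplus \mathbb{C}c \oplus \mathbb{C}d$. The algebra $\widehat{\mathfrak{g}}'$ is the algebra obtained by dropping the generator $d$. We will frequently use generating functions for current generators of the affine algebra, which we define by
\[ x(z) = \sum_{n \in \mathbb{Z}} x[n]\, z^{-n-1}, \qquad x \in \mathfrak{sl}_{r+1}. \tag{2.2} \]
Note that the convention for the current generators in (2.2) is different from that used by [23, 4].
2.2. Affine algebra modules. On any irreducible $\widehat{\mathfrak{sl}}_{r+1}$-module, $c$ acts by a constant $k$ called the level of the representation. A cyclic highest-weight $\widehat{\mathfrak{g}}$-module with highest weight $\Lambda = \lambda + k\Lambda_0 + m\delta$ is a cyclic module generated by the action of $\widehat{\mathfrak{g}}$ on a highest-weight vector $v_\lambda$, such that
\[ \widehat{\mathfrak{n}}_+ v_\lambda = 0, \qquad h v_\lambda = \lambda(h) v_\lambda \ \text{ for } h \in \mathfrak{h} \subset \widehat{\mathfrak{g}}, \tag{2.3} \]
\[ c\, v_\lambda = k v_\lambda, \qquad d\, v_\lambda = m v_\lambda. \tag{2.4} \]
The universal such module is the Verma module $M(\Lambda) \simeq U(\widehat{\mathfrak{n}}_-)$. If $k \in \mathbb{N}$ and $\lambda \in P_k^+$, the quotient of the Verma module by its maximal submodule is an irreducible, highest-weight integrable $\widehat{\mathfrak{g}}$-module, which we denote by $V_\lambda$ (we assume $k$ is fixed in this notation). The structure of the cyclic module generated by a highest-weight vector $v_\lambda$ is independent of $m$, so it is generally convenient to set $m = 0$.
Definition 2.1. Let $M$ be an irreducible cyclic highest-weight module with highest weight $\Lambda = \lambda + k\Lambda_0$, generated by the highest-weight vector $v_\lambda$. The subspace generated by the action of the subalgebra $\mathfrak{g} \otimes 1 \simeq \mathfrak{g}$ on $v_\lambda$ is called the top component of $M$. It is isomorphic as a $\mathfrak{g}$-module to $\pi_\lambda$.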
The irreducible, finite-dimensional $\mathfrak{g}$-module $\pi_\lambda$ is characterized as the quotient of the Verma module of $\mathfrak{g}$ by the left ideal in $\mathfrak{g}$ generated by $f_i^{l_i+1}$. Similarly, the integrable module $V_\lambda$ is the quotient of the Verma module of $\widehat{\mathfrak{g}}$, $M(\Lambda)$, by the left ideal in $\widehat{\mathfrak{g}}$ generated by $f_i[0]^{l_i+1}$, plus one additional generator, $e_\theta[-1]^{k-\theta(\lambda)+1}$, where $\theta = \alpha_1 + \cdots + \alpha_r$. A characterization of the maximal proper submodule $M'(\Lambda)$ of $M(\Lambda)$ in the case of integrable modules was given in [21] in terms of the algebra of current generators. Note that on any highest-weight module, the current (2.2) acts as a Laurent series in $z$. Therefore, products of currents make sense when acting on a highest-weight module, and one can consider the associative algebra of currents. Formally, the coefficients of $z^n$ in products of currents of the form $x(z)y(z)$ exist only in a completion $\overline{U}$ of $U(\widehat{\mathfrak{g}})$.
Theorem 2.2 [21]. Let $M(\Lambda)$ be a Verma module with highest weight $\Lambda = \lambda + k\Lambda_0$, with $\lambda \in P_k^+$ and $k \in \mathbb{N}$. Denote its maximal proper submodule by $M'(\Lambda)$, such that $V_\lambda \simeq M(\Lambda)/M'(\Lambda)$. Let $R$ be the subspace in $\overline{U}$ generated by the adjoint action of $U(\mathfrak{sl}_{r+1})$ on the coefficients of $e_\theta(z)^{k+1}$. Then $M'(\Lambda) = R\,M(\Lambda)$.
Again, the elements in $R$ act as well-defined elements of $U(\widehat{\mathfrak{g}})$ on $M(\Lambda)$. We call the set of currents which result from the adjoint action of $\mathfrak{sl}_{r+1}$ on the current $e_\theta(z)^{k+1}$ the integrability conditions. For example, for any root $\alpha$, the coefficients of $e_\alpha(z)^{k+1}$ are in $R$.
3. The Semi-Infinite Construction of Feigin and Stoyanovskiĭ
Theorem 2.2 was used by Feigin and Stoyanovskiĭ [23] to give a construction of the integrable modules in the case where $\Lambda = k\Lambda_0$. The construction naturally gives rise to fermionic formulæ for the characters of integrable modules. We will explain the details of the construction of [23] below.
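3.1. Principal subspaces. For arbitrary integrable highest weight $\Lambda = \lambda + k\Lambda_0$, let $v_\lambda$ be the highest-weight vector of $V_\lambda$. Consider the subalgebra $\widetilde{\mathfrak{n}}_- \overset{\mathrm{def}}{=} \mathfrak{n}_- \otimes \mathbb{C}[t, t^{-1}]$ acting on $v_\lambda$.
Definition 3.1. Define the principal subspace $W_\lambda = W_\lambda^{(0)} = U(\widetilde{\mathfrak{n}}_-)v_\lambda \subset V_\lambda$. Similarly, define the principal subspaces $W_\lambda^{(N)} = U(\widetilde{\mathfrak{n}}_-)T_N v_\lambda$, where $T_N = t_{\alpha(N)}$ is the affine Weyl translation corresponding to the root $\alpha(N) = \sum_i N_i \alpha_i$ (in the notation of [11] (6.5.2)), where $N_i$ are positive integers such that $(C_r \mathbf{N})_\alpha = 2N$ for all $\alpha$, and $C_r$ is the Cartan matrix of $\mathfrak{sl}_{r+1}$.
Lemma 3.2. This choice of $\alpha(N)$ gives a sequence of inclusions
\[ W_\lambda^{(0)} \subset W_\lambda^{(1)} \subset \cdots \subset W_\lambda^{(N)} \subset \cdots, \tag{3.1} \]
such that the inductive limit of the sequence (3.1) as $N \to \infty$ is the integrable module $V_\lambda$.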
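The condition on the integers $N_i$ in Definition 3.1 can be checked numerically. The sketch below assumes that the condition is read as the linear system $C_r (N_1, \dots, N_r)^T = 2N\,(1, \dots, 1)^T$ with $C_r$ the type $A_r$ Cartan matrix. Under that reading, $N_i = N\, i\,(r + 1 - i)$ is a solution; this closed form is our own observation, not a formula stated in the text, and the function names are hypothetical.

```python
def cartan_matrix(r):
    """Cartan matrix of sl_{r+1} (type A_r): 2 on the diagonal, -1 next to it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(r)] for i in range(r)]


def translation_coefficients(r, N):
    """Candidate coefficients N_i = N * i * (r + 1 - i), i = 1..r (assumed form)."""
    return [N * i * (r + 1 - i) for i in range(1, r + 1)]


def satisfies_condition(r, N):
    """Check that every entry of C_r applied to (N_1, ..., N_r) equals 2N."""
    C = cartan_matrix(r)
    v = translation_coefficients(r, N)
    return all(sum(C[a][b] * v[b] for b in range(r)) == 2 * N
               for a in range(r))


# Example: for sl_4 (r = 3) and N = 1 the candidate is (3, 4, 3).
print(translation_coefficients(3, 1))
print(satisfies_condition(3, 1))
```

Since the Cartan matrix is invertible, this solution is the only one for the linear system as read here.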
The inclusions follow from the fact that $v_\lambda \in W_\lambda^{(N)}$. The fact that the inductive limit indeed gives the full module is not obvious (see [20, 5]) but follows from the fact that the module is integrable. In fact, this theorem was proven in [23] for the following cases: $\widehat{\mathfrak{sl}}_2$ for arbitrary highest weight, and $\widehat{\mathfrak{sl}}_3$ with $\Lambda = k\Lambda_0$. This was done by computing the characters in the limit $N \to \infty$, and comparing them with the known character formulæ for $V_\lambda$ of [17]. In [10, 9], certain combinatorial proofs were provided using ideas related to those of [23] (with differently defined principal subspaces) for rectangular highest weights, for all simply laced algebras. The principal subspaces of that paper are different from those used here, as [10] uses what amounts to a different subalgebra to generate the subspace.
In this paper, we will continue this program by giving the character formulæ for arbitrary highest-weight modules of $\widehat{\mathfrak{sl}}_{r+1}$. It turns out that the methods of [23] are not sufficient for the case of non-rectangular representations, and instead we must resort to computing the characters of certain fusion products of representations, and decomposing them in terms of irreducible modules. The result is a formula which is a sum of fermionic formulas of the form found in [23, 10, 9], where the coefficients in the sum are elements of $\mathbb{Z}[q^{-1}]$.
3.2. Relations in the principal subspace. Let us characterize the ideal $I_\lambda$, where $W_\lambda \simeq U(\widetilde{\mathfrak{n}}_-)/I_\lambda$. Using a PBW-type argument, it is easy to see that $W_\lambda = U(\mathfrak{n}_- \otimes \mathbb{C}[t^{-1}])v_\lambda$, because the highest-weight vector $v_\lambda$ is annihilated by $\mathfrak{n}_- \otimes t\,\mathbb{C}[t]$. Thus, $I_\lambda$ includes the left ideal generated by $\{f_\alpha[n] \mid n > 0,\ \alpha \in \Pi\}$. The ideal contains the two-sided ideal generated by relations in the Lie algebra. In terms of generating functions, these relations are
\[ [f_{\alpha_i}(z), f_{\alpha_j}(w)] = \begin{cases} 0, & |i - j| \neq 1, \\ w^{-1}\delta(w/z)\, f_{\alpha_i + \alpha_j}(z), & |i - j| = 1, \end{cases} \tag{3.2} \]
\[ \big[f_{\alpha_i}(z), [f_{\alpha_i}(w), f_{\alpha_{i\pm 1}}(u)]\big] = 0, \tag{3.3} \]
where $\delta(z) = \sum_{n \in \mathbb{Z}} z^n$.
These two relations together mean that matrix elements involving the product $f_i(z) f_{i\pm 1}(w)$ have a simple pole whenever $z = w$, and that the residue of this pole commutes with $f_i(u)$. The integrability condition
\[ f_i(z)^{k+1} v = 0, \qquad v \in V_\lambda,\ 1 \le i \le r, \tag{3.4} \]
implies that $I_\lambda$ contains the two-sided ideal generated by the coefficients of $z^n$ of $f_i(z)^{k+1}$ (in the appropriate completion of the universal enveloping algebra). Finally there are the relations which follow from the integrability of the top component $\pi_\lambda$ of $V_\lambda$, which is also a subspace of $W_\lambda$. Therefore, $I_\lambda$ contains the left ideal generated by $f_i[0]^{l_i+1}$. The integrability condition involving $e_\theta[-1]$ does not play a role, because it is not an element of $U(\widetilde{\mathfrak{n}}_-)$.
3.3. Construction of the dual space. In order to compute the characters of the principal subspace $W_\lambda$, we describe its dual space. This will enable us to calculate the character for sufficiently simple $\lambda$. The dual space is spanned by the coefficients of monomials of the form $x_1^{n_1} \cdots x_m^{n_m}$ of matrix elements in the set
\[ \mathcal{F}_\lambda = \big\{ \langle w | f_{i_1}(x_1) \cdots f_{i_m}(x_m) | v_\lambda \rangle \ \big|\ w \in V_\lambda^*,\ m \ge 0,\ 1 \le i_a \le r \big\}, \]
where $V_\lambda^*$ is the restricted dual module. Given an ordering of the generators, the function above is defined in the region $|x_i| > |x_{i+1}|$, and therefore the coefficient of $x_1^{n_1} \cdots x_m^{n_m}$ for given integers $n_j$ is given by the expansion in this regime. Below, we shall refer to the function space $\mathcal{F}_\lambda$ itself as the dual space, and specify an appropriate pairing. This space can be characterized by its pole structure and vanishing conditions.
3.3.1. The dual space to $U(\widetilde{\mathfrak{n}}_-)$. Let us first consider the larger function space $\mathcal{G}$, dual to the universal enveloping algebra $U = U(\widetilde{\mathfrak{n}}_-)$. The algebra $U$ is spanned by words in the letters $\{f_{\alpha_i}[n] \mid i = 1, \dots, r,\ n \in \mathbb{Z}\}$, and it is $\mathfrak{h}$- and $d$-graded. The graded component $U[\mathbf{m}]_d$, where $\mathbf{m} = (m^{(1)}, \dots, m^{(r)})^T$, is spanned by the elements $f_{i_1}[n_1] \cdots f_{i_m}[n_m]$, of $\mathfrak{h}$-weight $\sum_\alpha m^{(\alpha)}\alpha = \sum_j \alpha_{i_j}$ and $-\sum_i n_i = d$. The dual space to $U$ is also $\mathfrak{h}$- and $d$-graded. Denote by $U[\mathbf{m}]$ the $\mathfrak{h}$-graded component, and by $\mathcal{G}[\mathbf{m}]$ the dual to it. This is a space of functions in the variables
\[ x = \{x_i^{(\alpha)} \mid i = 1, \dots, m^{(\alpha)},\ \alpha = 1, \dots, r\}, \]
where $x_i^{(\alpha)}$ is the variable corresponding to a generator of the form $f_\alpha(x_i^{(\alpha)})$. We define the pairing $(\cdot,\cdot)$ between $U$ and $\mathcal{G}$ inductively, as follows: $(1, 1) = 1$,
\[ (g(x),\, M f_\alpha[n]) = \Big( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n\, g(x)\, dx_1^{(\alpha)},\ M \Big), \qquad M \in U, \tag{3.5} \]
where the contour of integration is taken counter-clockwise around the point $x_1^{(\alpha)} = 0$ in such a way that all other points are excluded, $|x_1^{(\alpha)}| < |x_j^{(\alpha')}|$. Similarly,
\[ (g(x),\, f_\alpha[n] M) = \Big( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n\, g(x)\, dx_1^{(\alpha)},\ M \Big), \]
where the contour is taken clockwise. The commutation relations between the currents are equivalent to the operator product expansion (OPE)
\[ f_i(z) f_{i\pm 1}(w) = \frac{f_{\alpha_i + \alpha_{i\pm 1}}(w)}{z - w} + \text{regular terms}, \]
where “regular terms” refers to terms which have no pole at $z = w$, and the expansion of the denominator is taken in the region $|z| > |w|$. Due to the OPE’s, it is clear that functions in $\mathcal{G}[\mathbf{m}]$ will have at most a simple pole whenever $x_j^{(\alpha)} = x_k^{(\alpha\pm 1)}$. Thus, functions in $\mathcal{G}[\mathbf{m}]$ are rational functions of the form
\[ g(x) = \frac{g_1(x)}{\prod_{i,j,\alpha} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.6} \]
where $g_1(x)$ are polynomials in $(x_i^{(\alpha)})^{\pm 1}$.
Again using the OPE’s, we can construct the pairing between all other elements of $U$ and $\mathcal{G}$. For example,
\[ (g(x),\, M f_{\alpha+\alpha\pm 1}[n]) = \Big( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n\, (x_1^{(\alpha)} - x_1^{(\alpha\pm 1)})\, g(x) \Big|_{x_1^{(\alpha)} = x_1^{(\alpha\pm 1)}} dx_1^{(\alpha)},\ M \Big), \]
where the contour excludes all other points, and
\[ (g(x),\, M f_{\alpha+\cdots+\alpha+h}[n]) = \Big( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n\, (x_1^{(\alpha)} - x_1^{(\alpha+1)}) \cdots (x_1^{(\alpha+h-1)} - x_1^{(\alpha+h)})\, g(x) \Big|_{x_1^{(\alpha)} = \cdots = x_1^{(\alpha+h)}} dx_1^{(\alpha)},\ M \Big). \]
The function $g_1(x)$ is not completely arbitrary, due to the Serre relation (3.3). The Serre relation implies that the function
\[ (x_1^{(\alpha)} - x_1^{(\alpha+1)})\, g(x) \Big|_{x_1^{(\alpha)} = x_1^{(\alpha+1)}} \]
has no poles at the points $x_j^{(\alpha+1)} = x_1^{(\alpha)}$ and $x_j^{(\alpha)} = x_1^{(\alpha+1)}$, where $j > 1$. This implies that the function $g_1(x)$ has the property that
\[ g_1(x) \big|_{x_i^{(\alpha)} = x_j^{(\alpha)} = x_k^{(\alpha\pm 1)}} = 0. \tag{3.7} \]
Finally, it is clear that since $[f_i(z), f_i(w)] = 0$, $g_1(x)$ is symmetric under the exchange of variables $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$. In summary, we have
Theorem 3.3. The space of functions $\mathcal{G}[\mathbf{m}]$ dual to the graded component $U[\mathbf{m}]$ of the universal enveloping algebra of $\widetilde{\mathfrak{n}}_-$, with the pairing defined inductively by (3.5), is the space of functions in the variables $\{x_j^{(\alpha)}\}$ with $j = 1, \dots, m^{(\alpha)}$ and $\alpha = 1, \dots, r$, of the form (3.6), where $g_1(x)$ is a polynomial in $(x_j^{(\alpha)})^{\pm 1}$, symmetric under the exchange of variables with the same superscript, and which vanishes whenever $x_1^{(\alpha)} = x_2^{(\alpha)} = x_1^{(\alpha\pm 1)}$.
3.3.2. Dual to the principal subspace $W_\lambda$. Next, we consider the space $\mathcal{F}_\lambda[\mathbf{m}]$, which is defined as the graded component of the space $\mathcal{F}_\lambda$, the subset of matrix elements of $U[\mathbf{m}]$ in $\mathcal{F}_\lambda$. The space $\mathcal{F}_\lambda[\mathbf{m}]$ is the dual space to $W_\lambda[\mathbf{m}]$ (the weight subspace of $W_\lambda$ of $\mathfrak{h}$-weight $\lambda - \mathbf{m}^T\alpha$) with the pairing defined as in (3.5), where $1 \in U$ is replaced by $v_\lambda$. The dual space $\mathcal{F}_\lambda[\mathbf{m}]$ is the subspace of $\mathcal{G}[\mathbf{m}]$ which couples trivially via the pairing (3.5) to the ideal $I_\lambda \subset U$. Apart from the two-sided ideal coming from the relations in the algebra, which we have already accounted for in constructing $\mathcal{G}[\mathbf{m}]$, the ideal $I_\lambda$ contains the relations coming from the highest-weight conditions (2.4), and from the integrability conditions (3.4). The integrability conditions mean that $U f_i(x)^{k+1} U \subset I_\lambda$, which means that
\[ g_1(x) \big|_{x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}} = 0, \tag{3.8} \]
for all $g(x) \in \mathcal{F}_\lambda[\mathbf{m}]$ and for all $\alpha$.
The ideal $I_\lambda$ contains the left ideal generated by $f_\alpha[n]$, $n > 0$, for any $\alpha$. We see from (3.5) that for functions in $\mathcal{F}_\lambda[\mathbf{m}]$, $g_1(x)$ can have at most a simple pole at $x_1^{(\alpha)} = 0$. Let us define the function $g_2(x)$ by
\[ g(x) = \frac{g_2(x)}{\prod_{\alpha,i} x_i^{(\alpha)}\ \prod_{\alpha,i,j} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.9} \]
where $g_2(x)$ is a polynomial in $x_i^{(\alpha)}$ for all $i, \alpha$.
In order to account for the relation $U f_\beta[n] \subset I_\lambda$ for $\beta = \alpha_i + \cdots + \alpha_{i+h}$, where $n > 0$, we need to impose an additional restriction on $g_2(x)$, because of the prefactor $(x_1^{(\alpha)} x_1^{(\alpha+1)} \cdots x_1^{(\alpha+h)})^{-1}$ in (3.9). The function $g_1(x)$, after evaluation at the point $x_1^{(\alpha)} = x_1^{(\alpha+1)} = \cdots = x_1^{(\alpha+h)} = u$, must be of degree greater than or equal to $-1$ in the variable $u$ if it is to couple trivially to $f_\beta[n]$ for $n > 0$. Therefore, we see that $g_2(x)$ satisfies:
\[ g_2(x) \big|_{x_1^{(\alpha)} = x_1^{(\alpha+1)} = \cdots = x_1^{(\alpha+h)} = u} \ \text{vanishes as } u^h \text{ as } u \to 0. \tag{3.10} \]
Finally we need to take into account the integrability conditions for the top component: $U f_\beta[0]^{\lambda(\beta)+1} \subset I_\lambda$ for each positive root $\beta$. For simple roots, this means that
\[ g_2(x) \big|_{x_1^{(\alpha)} = \cdots = x_{l_\alpha+1}^{(\alpha)} = 0} = 0. \tag{3.11} \]
When $\beta$ is not a simple root, the relations are more complicated, involving variables corresponding to different roots. These are sufficiently complicated that we do not know how to compute the character of the space in this case. However, at this point let us note that for the special case of rectangular representations, the situation is much simpler. The relation (3.10) is automatically satisfied for such representations. For suppose we consider the representation with $l_\beta \neq 0$ for at most one index $\beta$. Then since $U f_\beta[0] \not\subset I_\lambda$, whereas $U f_\alpha[0] \subset I_\lambda$ for $\alpha \neq \beta$, we have that in this special case,
\[ g_1(x) = \prod_j (x_j^{(\beta)})^{-1}\, g_2(x), \tag{3.12} \]
where $g_2(x)$ is a polynomial in all the variables, satisfying (3.11) for the index $\beta$ only, as well as the integrability conditions and the Serre relation. The relation (3.10) is not an extra condition in this case. Let us therefore summarize the result for rectangular representations.
Theorem 3.4. Let $\Lambda_\beta = l\omega_\beta + k\Lambda_0$ for some $1 \le \beta \le r$. Then the dual space of functions to the graded component of the principal subspace $W_{l\omega_\beta}[\mathbf{m}]$ is the space of rational functions of the form (3.6), where $g_1(x)$ is a function of the form (3.12), where $g_2(x)$ is a polynomial in the variables $x_i^{(\alpha)}$ satisfying the Serre relation (3.7), symmetric under the exchange of variables $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$ for all $\alpha$, vanishing when $x_1^{(\beta)} = \cdots = x_{l+1}^{(\beta)} = 0$, or when any $k + 1$ variables of the same superscript coincide, $x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}$ for any $\alpha$.
In the next section, we will show how to compute the character of this space using a filtration on the space. For non-rectangular representations there is no such simple description of the space. The purpose of this paper is to explain how to compute the character for non-rectangular representations as a linear combination of characters of rectangular representations.
3.4. Filtration of the dual space $\mathcal{F}_\lambda$. In this subsection, we will assume that $\Lambda = \Lambda_\beta = \lambda + k\Lambda_0$, $\lambda = \lambda_\beta = l\omega_\beta$ for some fixed $1 \le \beta \le r$. This corresponds to a Young diagram of rectangular form (with $l$ columns and $\beta$ rows). As explained above, the space $\mathcal{F}_\lambda$ is $\mathfrak{h}$-graded, $\mathcal{F}_\lambda = \bigoplus_{\mathbf{m}} \mathcal{F}_\lambda[\mathbf{m}]$, where $\mathcal{F}_\lambda[\mathbf{m}]$ is a subspace of the space of rational functions in the variables $x = \{x_i^{(\alpha)} \mid \alpha = 1, \dots, r;\ i = 1, \dots, m^{(\alpha)}\}$ of the form
\[ G(x) = \frac{g(x)}{\prod_i x_i^{(\beta)}\ \prod_{\alpha=1}^{r-1} \prod_{i,j} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.13} \]
where $g(x)$ is polynomial, symmetric under exchange of variables with the same value of $\alpha$ (which we will refer to as the color index), $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$. The index $\beta$ corresponds to the fundamental weight $\omega_\beta$, where $\lambda_\beta = l\omega_\beta$. In addition, $g(x)$ vanishes when any of the following conditions is met:
\[ x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}, \tag{3.14} \]
\[ x_1^{(\alpha)} = x_2^{(\alpha)} = x_1^{(\alpha\pm 1)}, \tag{3.15} \]
\[ x_1^{(\beta)} = \cdots = x_{l+1}^{(\beta)} = 0. \tag{3.16} \]
Our goal is to compute the character of this space, for which purpose we will introduce a filtration and an associated graded space. We will be able to compute the characters of the graded pieces easily. To simplify the calculations below, let us define the closely related space $\overline{\mathcal{F}}_\lambda[\mathbf{m}]$. This space is a subspace of the space of all rational functions in the variables $x$, which are given by
\[ \overline{G}(x) = \frac{g(x)}{\prod_{\alpha=1}^{r-1} \prod_{i,j} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.17} \]
where $g(x)$ is as in (3.13), so $\overline{G}(x) = \prod_i x_i^{(\beta)}\, G(x)$. In the following, we will fix $\mathbf{m}$ and $l$, and study a filtration of this space $\overline{\mathcal{F}}_\lambda[\mathbf{m}]$ (which we will refer to by $\mathcal{F}$), which can be described as follows. Let $\mu = (\mu^{(1)}, \dots, \mu^{(r)})$ be a collection of partitions, where each $\mu^{(\alpha)}$ is a partition of $m^{(\alpha)}$ and has $m_a^{(\alpha)}$ rows of length $a$.
We can now rename the variables $x_i^{(\alpha)}$ by associating each of them to a box of the Young diagram associated with the partitions $\mu^{(\alpha)}$. As a result of this renaming, we have variables $x_{a,i,j}^{(\alpha)}$, which correspond to the Young diagram of partition $\mu^{(\alpha)}$, namely to column $j$ of the $i$-th row (counted from top to bottom) of length $a$. See the left part of Fig. 1 for an explicit example. In the proofs which follow, we will simplify this notation as much as possible. Note that, due to the symmetry properties of $g(x)$, how we rename the variables is irrelevant.
Let $\mathcal{H}$ be the space of rational functions in the variables $y = \{y_{a,i}^{(\alpha)} \mid \alpha = 1, \dots, r;\ a \ge 1;\ i = 1, \dots, m_a^{(\alpha)}\}$. Define the evaluation map $\varphi_{\mu^{(\alpha)}}$, which sets all the variables in the same row of the (Young diagram associated to the) partition $\mu^{(\alpha)}$ to the same value, $x_{a,i,j}^{(\alpha)} \mapsto y_{a,i}^{(\alpha)}$. The effect of the evaluation map on the variables corresponding to the partition $\mu^{(\alpha)}$ is shown in Fig. 1.
[Fig. 1. The evaluation map for the variables $x^{(\alpha)}$. Note that we dropped the superscripts $(\alpha)$ in $m_a^{(\alpha)}$.]
We define the evaluation map $\varphi_\mu : \mathcal{F} \to \mathcal{H}$ to be $\varphi_\mu = \prod_{\alpha=1}^r \varphi_{\mu^{(\alpha)}}$. By (3.14), $\varphi_\mu(g(x)) = 0$ (where $g(x)$ is as in (3.13) with $G(x) \in \mathcal{F}$), if any of the partitions $\mu^{(\alpha)}$ has a part which is greater than $k$. Hence, in the following, we will assume that none of the partitions has a part greater than $k$, and refer to these (multi)-partitions as k-restricted. Our strategy will be to study the image of $\mathcal{F}$ under the evaluation map.
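The renaming of variables and the evaluation map described above can be made concrete for a single color. The following Python sketch is only an illustration under our own conventions: a partition is given as a weakly decreasing list of row lengths, a box is labelled by the triple $(a, i, j)$, and the function names `label_boxes`, `evaluation_map` and `is_k_restricted` are hypothetical.

```python
def label_boxes(mu):
    """Label every box of the Young diagram of mu by (a, i, j):
    a = length of the row, i = index of the row among rows of length a
    (counted from top to bottom), j = column of the box."""
    labels, seen = [], {}
    for a in mu:  # rows from top to bottom, mu weakly decreasing
        seen[a] = seen.get(a, 0) + 1
        labels.extend((a, seen[a], j) for j in range(1, a + 1))
    return labels


def evaluation_map(mu):
    """Send the variable x_{a,i,j} to y_{a,i}: all boxes in one row
    are evaluated at the same point."""
    return {(a, i, j): (a, i) for (a, i, j) in label_boxes(mu)}


def is_k_restricted(mu, k):
    """A partition is k-restricted if no part exceeds k."""
    return all(part <= k for part in mu)


# Example: mu = (2, 2, 1) has five boxes and three rows,
# so five x-variables are mapped onto three y-variables.
phi = evaluation_map([2, 2, 1])
print(phi)
print(len(phi), len(set(phi.values())))
print(is_k_restricted([2, 2, 1], 2))
```

For a multi-partition one would apply the same map color by color; the sketch does not model the functions $g(x)$ themselves, only the bookkeeping of the variables.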
Definition 3.5. Let $\mathcal{H}$ be the space of functions in the variables $y$, and let $\mathcal{H}_\mu \subset \mathcal{H}$ be the subspace spanned by functions of the form
\[ H(y) = H_\mu(y)\, h(y), \tag{3.18} \]
where $h(y)$ is an arbitrary polynomial in $y$, symmetric under the exchange $y_{a,i}^{(\alpha)} \leftrightarrow y_{a,j}^{(\alpha)}$, and
\[ H_\mu(y) = \prod_{\substack{\alpha = 1, \dots, r \\ (a,i) > (b,j)}} (y_{a,i}^{(\alpha)} - y_{b,j}^{(\alpha)})^{2A_{ab}} \prod_{\substack{\alpha = 1, \dots, r-1 \\ (a,i);(b,j)}} (y_{a,i}^{(\alpha)} - y_{b,j}^{(\alpha+1)})^{-A_{ab}} \prod_{(a,i)} (y_{a,i}^{(\beta)})^{\max(0,\, a-l)}. \tag{3.19} \]
Here, $A_{ab} = \min(a, b)$ and $(a, i) \in I_k \times I_{m_a^{(\alpha)}}$ (where $I_m = \{1, \dots, m\}$). The ordering $(a, i) > (b, j)$ is defined as follows. The index $i$ increases downwards, and we say that $(a, i) > (b, j)$ if $a > b$, or, if $a = b$, when $i < j$.
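To make the ordering $(a, i) > (b, j)$ and the exponents in (3.19) concrete, the sketch below computes the total degree of the prefactor $H_\mu(y)$ for a given multi-partition. It assumes the reading of (3.19) as a product of three factors with exponents $2A_{ab}$, $-A_{ab}$ and $\max(0, a - l)$; the helper names are hypothetical, and each partition is again a weakly decreasing list of row lengths.

```python
def rows(mu):
    """List the rows of a partition as labels (a, i): a = row length,
    i = index among the rows of length a, counted from the top."""
    out, seen = [], {}
    for a in mu:
        seen[a] = seen.get(a, 0) + 1
        out.append((a, seen[a]))
    return out


def greater(p, q):
    """The ordering of Definition 3.5: (a, i) > (b, j) if a > b,
    or if a == b and i < j."""
    (a, i), (b, j) = p, q
    return a > b or (a == b and i < j)


def prefactor_degree(mu, beta, l):
    """Total degree of H_mu(y) for a multi-partition mu (a list of r
    partitions), with the color index beta counted from 1."""
    r = len(mu)
    deg = 0
    for alpha in range(r):
        same = rows(mu[alpha])
        # zeros of order 2 * min(a, b) between rows of the same color
        deg += sum(2 * min(p[0], q[0]) for p in same for q in same
                   if greater(p, q))
        # poles of order min(a, b) between rows of adjacent colors
        if alpha + 1 < r:
            deg -= sum(min(p[0], q[0]) for p in same
                       for q in rows(mu[alpha + 1]))
    # extra zeros at the origin for the color beta
    deg += sum(max(0, a - l) for a, _ in rows(mu[beta - 1]))
    return deg


# Example: r = 1, mu = ((2, 1)), beta = 1, l = 1.
print(prefactor_degree([[2, 1]], 1, 1))
```

Only the degree of $H_\mu$ is computed here; the symmetric polynomial $h(y)$ in (3.18) is left unconstrained.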
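Let us define a lexicographic ordering on multi-partitions. That is, the usual lexicographic ordering is taken on partitions $\mu^{(\alpha)}$, and $\nu > \mu$ if $\nu^{(\alpha)} = \mu^{(\alpha)}$ for all $\alpha < \gamma$ and $\nu^{(\gamma)} > \mu^{(\gamma)}$. Let $\ker \varphi_\mu$ be the kernel of the evaluation map $\varphi_\mu$ acting on $\mathcal{F}$. We can now define the subspaces
\[ \Gamma_\mu = \bigcap_{\nu > \mu} \ker \varphi_\nu, \qquad \Gamma'_\mu = \bigcap_{\nu \ge \mu} \ker \varphi_\nu. \tag{3.20} \]
Thus, $\Gamma_\mu$ is the space of rational functions which are annihilated by every evaluation map with $\nu > \mu$. By definition, $\Gamma_\nu \subset \Gamma_\mu$ if $\nu < \mu$, and $\Gamma'_\mu \subset \Gamma_\mu$. In addition, $\Gamma'_{(1^{m^{(1)}}, \dots, 1^{m^{(r)}})} = \{0\}$. Therefore, $\Gamma_\mu$ defines a filtration on $\mathcal{F}$. Define the associated graded space
\[ \mathrm{Gr} = \bigoplus_\mu \mathrm{Gr}_\mu, \tag{3.21} \]
where $\mathrm{Gr}_\mu = \Gamma_\mu / \Gamma'_\mu$ and the sum is over multi-partitions of $\mathbf{m}$. The main purpose of this section is to prove
Theorem 3.6. The induced map
\[ \overline{\varphi}_\mu : \mathrm{Gr}_\mu \to \mathcal{H}_\mu \tag{3.22} \]
is an isomorphism of graded vector spaces.
This is very similar to the proof found in [7] for the case which corresponds to $\widehat{\mathfrak{sl}}_3$, and we use the same ideas here. To prove the theorem, we need to show three things. First, the evaluation map
\[ \varphi_\mu : \Gamma_\mu \to \mathcal{H}_\mu \tag{3.23} \]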
is well-defined. Second, it is surjective, and third, the induced map (3.22) is well defined and injective. 3.4.1. The evaluation map is well-defined. To prove that the map ϕµ : µ → Hµ , is well defined, we must show that the rational functions obtained after the evaluation are indeed of the form (3.18) and (3.19). We will do this by showing that the structure of the poles and zeros of the image of the functions (3.17) in Fm under the evaluation map is precisely of the form (3.19). Lemma 3.7. Let G(x) ∈ µ . Then, the function ϕµ (G(x)) has a zero of order at least (α) (α) 2 min(a, a ) when ya,i = ya ,i , ∀α. Proof. The proof is independent of α, and so we can use the argument used in the case of sl2 in [2]. We will repeat that argument here for completeness. It is sufficient to consider the dependence of G(x) on the two sets of variables of the same color α, which we denote by {xa,i | i = 1, . . . , a} and {xa ,i | i = 1, . . . , a }. We can assume that a ≥ a without loss of generality. We can carry out the evaluation map in two steps: ϕµ = ϕ 2 ◦ ϕ 1 . Here ϕ 1 consists of evaluating all the variables except the set {xa ,i | i = 1, . . . , a } and ϕ 2 consists of
setting $x_{a',1} = \cdots = x_{a',a'} = y_{a'}$ (note that under $\varphi^1$, the variables $x_{a,1}, \ldots, x_{a,a}$ are all set to $y_a$). Let
$$g_1(y_a; x_{a',1}, \ldots, x_{a',a'}) = \varphi^1(G(x)). \qquad (3.24)$$
Because $G(x) \in \Gamma_\mu$, G(x) is annihilated by all $\varphi_\nu$ with $\nu > \mu$. Therefore
$$g_1(y_a; x_{a',1}, \ldots, x_{a',a'})\big|_{x_{a',i'} = y_a} = 0 \quad \text{for all } i', \qquad (3.25)$$
because this corresponds to an evaluation for a multi-partition greater than µ. Therefore,
$$g_1(y_a; x_{a',1}, \ldots, x_{a',a'}) = \prod_{i'=1}^{a'} (y_a - x_{a',i'})\, \tilde g_1(y_a; x_{a',1}, \ldots, x_{a',a'}). \qquad (3.26)$$
Now $g_1(y_a; x_{a',1}, \ldots, x_{a',a'})$ was obtained from a symmetric function in $x^{(\alpha)}_i$, and so, for each $i'$,
$$\frac{\partial g_1}{\partial y_a}\bigg|_{x_{a',i'} = y_a} = a\, \frac{\partial g_1}{\partial x_{a',i'}}\bigg|_{x_{a',i'} = y_a}. \qquad (3.27)$$
However (3.26) tells us that, again for each $i'$,
$$\frac{\partial g_1}{\partial y_a}\bigg|_{x_{a',i'} = y_a} = -\frac{\partial g_1}{\partial x_{a',i'}}\bigg|_{x_{a',i'} = y_a} = {\prod_{i''=1}^{a'}}' (y_a - x_{a',i''})\, \tilde g_1 \bigg|_{x_{a',i'} = y_a}, \qquad (3.28)$$
the prime on the product meaning that the term with $i'' = i'$ is to be omitted. The only way to reconcile (3.27) with (3.28) is for $\tilde g_1|_{x_{a',i'} = y_a}$ to be zero. Thus the zero at $x_{a',i'} = y_a$ is at least of order two:
$$g_1(y_a; x_{a',1}, \ldots, x_{a',a'}) = \prod_{i'=1}^{a'} (y_a - x_{a',i'})^2\, \tilde g_2(y_a; x_{a',1}, \ldots, x_{a',a'}). \qquad (3.29)$$
We now evaluate the right-hand side of (3.29) at $x_{a',1} = \cdots = x_{a',a'} = y_{a'}$ and, recalling the condition that $a \ge a'$, so that $A_{a,a'} = \min(a, a') = a'$, we have
$$\varphi_\mu(G(x)) = (y_a - y_{a'})^{2A_{a,a'}}\, \tilde G, \qquad (3.30)$$
where $\tilde G$ denotes the evaluation of $\tilde g_2$.
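The mechanism is easiest to see in the smallest case, $a = a' = 1$. The following is a sketch under the assumption that G depends on two variables of the same color and is regular on the diagonal $x_1 = x_2$; it is an illustration, not a replacement for the general argument.

```latex
% Simplest instance of Lemma 3.7: two rows of length one (a = a' = 1).
% Assumptions: G(x_1, x_2) is symmetric and G(y, y) = 0.
\begin{align*}
G(x_1,x_2) &= (x_1 - x_2)\,\tilde g(x_1,x_2)
  && \text{(vanishing on the diagonal)}\\
G(x_2,x_1) = G(x_1,x_2)
  &\;\Longrightarrow\; \tilde g(x_2,x_1) = -\tilde g(x_1,x_2)
  && \text{(symmetry of } G\text{)}\\
\tilde g(y,y) = -\tilde g(y,y) = 0
  &\;\Longrightarrow\; G(x_1,x_2) = (x_1 - x_2)^2\,\tilde g_2(x_1,x_2).
\end{align*}
```

The resulting order of vanishing, two, agrees with $2\min(a, a')$ for $a = a' = 1$.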
Lemma 3.8. The image under the evaluation map $\varphi_\mu$ of any function in F (and hence in $\Gamma_\mu$) has a pole of maximal order $\min(a, a')$ whenever $y^{(\alpha)}_{a,i} = y^{(\alpha+1)}_{a',i'}$.
Proof. We will prove this lemma by looking at the zeros of g(x), which arise because we need to satisfy the Serre relations,
$$g\big|_{x^{(\alpha)}_1 = x^{(\alpha)}_2 = x^{(\alpha+1)}_1} = 0 \quad \text{and} \quad g\big|_{x^{(\alpha)}_1 = x^{(\alpha+1)}_1 = x^{(\alpha+1)}_2} = 0$$
for $\alpha = 1, \ldots, r-1$. These relations depend on two sets of variables only. Consider the dependence of g on the two sets of variables $x_i = x^{(\alpha)}_i$, with $i = 1, \ldots, a$, and $\bar x_j = x^{(\alpha\pm1)}_j$, with $j = 1, \ldots, a'$. Under the evaluation map, these variables map to $\varphi_\mu(x_i) = y$ and $\varphi_\mu(\bar x_i) = \bar y$ respectively. Note that x and $\bar x$ are variables corresponding to two adjacent roots. Again without loss of generality, assume that $a \ge a'$. When $x_1 = \bar x_1 = \bar x_j$ or $x_1 = x_j = \bar x_1$, g vanishes, so we find
$$g(x_1, \ldots, x_a; \bar x_1, \ldots, \bar x_{a'}; \ldots)\big|_{x_1 = \bar x_1 = z_1} = \prod_{i=2}^{a} (x_i - z_1) \prod_{j=2}^{a'} (\bar x_j - z_1)\, g'(z_1; x_2, \ldots, x_a; \bar x_2, \ldots, \bar x_{a'}; \ldots). \qquad (3.31)$$
Repeating the argument for $g'$ we find
$$g'(z_1; x_2, \ldots, x_a; \bar x_2, \ldots, \bar x_{a'}; \ldots)\big|_{x_2 = \bar x_2 = z_2} = \prod_{i=3}^{a} (x_i - z_2) \prod_{j=3}^{a'} (\bar x_j - z_2)\, g''(z_1, z_2; x_3, \ldots, x_a; \bar x_3, \ldots, \bar x_{a'}; \ldots). \qquad (3.32)$$
We can repeat this argument $a'$ times with the result
$$g(x_1, \ldots, x_a; \bar x_1, \ldots, \bar x_{a'}; \ldots)\big|_{\{x_i = \bar x_i = z_i\}_{i=1}^{a'}} = \prod_{i=1}^{a'} \prod_{j=i+1}^{a} (x_j - z_i) \prod_{i=1}^{a'} \prod_{j=i+1}^{a'} (\bar x_j - z_i)\, \tilde g(z_1, \ldots, z_{a'}; x_{a'+1}, \ldots, x_a; \ldots). \qquad (3.33)$$
We find that $\varphi_\mu(g)$ has a zero of order at least $aa' - \min(a, a')$ when $y = \bar y$, by counting the number of zeros in (3.33) and using that $a' \le a$. Taking into account the poles of (3.17), which after applying the evaluation map become a pole of order $aa'$ when $y = \bar y$, we find that the image of $F_m$ has a pole of order at most $\min(a, a')$ when $x^{(\alpha)}_{a,j} = x^{(\alpha\pm1)}_{a',j'}$.

Lemma 3.9. The image of $\varphi_\mu$ acting on a function $G \in \Gamma_\mu$ has a zero of order at least $\max(0, a - l)$ when $y^{(\beta)}_{a,i} = 0$.

Proof. To prove this lemma, we will study the effect of the evaluation map on g(x) in Eq. (3.13). We focus on the variables of a row of length a (where we assume that $a > l$), $\{x^{(\beta)}_j \mid j = 1, \ldots, a\}$. Under the evaluation map, these variables map to $\varphi_\mu(x^{(\beta)}_j) = y^{(\beta)}$. We know that the function
$$g_1(x^{(\beta)}_1, \ldots, x^{(\beta)}_a) = g(x)\big|_{x^{(\beta)}_1 = \cdots = x^{(\beta)}_l = 0} \qquad (3.34)$$
contains a factor $\prod_{j=l+1}^{a} x^{(\beta)}_j$, because it vanishes if any of the remaining variables $x^{(\beta)}_j$ is set to zero (because of the condition (3.16) on g(x)). Thus, the image of $g_1$ under the evaluation map has a zero of order at least $\max(0, a - l)$ whenever $y^{(\beta)}_{a,i} = 0$.
Lemma 3.10. The map $\varphi_\mu : \Gamma_\mu \to \mathcal{H}_\mu$ is well defined.

Proof. This follows from Lemmas 3.7, 3.8, 3.9 and the definition of the space $\mathcal{H}_\mu$.
3.4.2. Proof of surjectivity. We will continue with the proof that the map (3.23) is surjective. We have to prove that for each function of the form defined by (3.18) and (3.19), there is at least one function in the pre-image in $\Gamma_\mu$. We do this by explicitly giving the form of these pre-images, showing that they are elements of F and, finally, proving that these pre-images are indeed in the kernel of $\varphi_\nu$ for each $\nu > \mu$, which shows that they are in $\Gamma_\mu$. For each (k-restricted) multi-partition µ, we consider the function
$$F(x) = \mathrm{Sym}\, \frac{f(x)}{p(x)}, \qquad (3.35)$$
where f(x) and p(x) are polynomials of the form (we identify the variables $x^{(\alpha)}_{a,i,a+1} = x^{(\alpha)}_{a,i,1}$)
$$f(x) = \tilde f(x) \prod_{\alpha,a,i}\ \prod_{j>l} x^{(\beta)}_{a,i,j} \prod_{\alpha} \prod_{a,i,j}\ \prod_{a',i';\, j' \ne j} \big(x^{(\alpha)}_{a,i,j} - x^{(\alpha+1)}_{a',i',j'}\big) \times \prod_{\alpha} \prod_{(a,i)>(a',i')}\ \prod_{j=1,\ldots,m^{(\alpha)}_a} \big(x^{(\alpha)}_{a,i,j} - x^{(\alpha)}_{a',i',j}\big)\big(x^{(\alpha)}_{a,i,j+1} - x^{(\alpha)}_{a',i',j}\big), \qquad (3.36)$$
$$p(x) = \prod_{\alpha=1,\ldots,r-1}\ \prod_{a,i,j}\ \prod_{a',i',j'} \big(x^{(\alpha)}_{a,i,j} - x^{(\alpha+1)}_{a',i',j'}\big), \qquad (3.37)$$
where $\tilde f(x)$ is an arbitrary polynomial. The symmetrization is over each of the r sets of variables $\{x^{(\alpha)}_i\}$ with the same value of α. As we did before, we will drop as many indices as possible in the following lemmas.

Lemma 3.11. The functions F(x) of (3.35) are elements of F.

Proof. We have to show that f(x) satisfies the vanishing conditions (3.14), (3.15) and (3.16). First of all, we easily see that f(x) is zero when any $k + 1$ variables of the same color are set to the same value. Because the partitions have rows of maximum length k, these $k + 1$ variables can not all be placed in the same row, which implies that the factor $\prod (x^{(\alpha)}_{a,i,j} - x^{(\alpha)}_{a',i',j})$ evaluates to zero under $\varphi_\mu$. To show that the Serre relations are satisfied, we have to show that the zeros
$$\prod_{\alpha} \prod_{a,i,j}\ \prod_{a',i';\, j' \ne j} \big(x^{(\alpha)}_{a,i,j} - x^{(\alpha+1)}_{a',i',j'}\big) \qquad (3.38)$$
satisfy the Serre relations. Let $x_{a,j} = x^{(\alpha)}_{a,i,j}$ and $\bar x_{a',j'} = x^{(\alpha+1)}_{a',i',j'}$, for some choice of α, i and $i'$.
For every x, there is a zero with every $\bar x$, except those appearing in the column which has the same number as the x (i.e. for $j = j'$). Note that if we set two variables x, which belong to the same column, to the same value, f(x) is zero, because the factor $\prod (x^{(\alpha)}_{a,i,j} - x^{(\alpha)}_{a',i',j})$ is zero in that case. Hence, we set $x_{a,j} = \bar x_{a',j'} = \tilde x$ ($j = j'$). Focusing on this variable, we find the following zeros:
$$(\tilde x - \bar x_{a,i})(\tilde x - \bar x_{a',i'}) \prod_{a'';\, i'' \ne i, i'} (\tilde x - \bar x_{a'',i''})^2.$$
So, indeed $\tilde x$ has a zero with every $\bar x$. Similarly, we find that there is at least a zero of order one when we set $x_1 = \bar x_1 = \bar x_2$. To complete the proof of this lemma, we need to show that f(x) satisfies the condition (3.16). This easily follows from the factor $\prod_{j>l} x^{(\beta)}_{a,i,j}$, combined with the zeros which give rise to the condition (3.14).

Remark 3.12. It is instructive to note that all the zeros in (3.38) are necessary to satisfy the Serre relations. We need to show that if we remove any of these zeros, we will violate a Serre relation. To show that this is true, it is important that we take the zeros between variables of the same color into account. Let us remove the zero $(x_{a,j} - \bar x_{a',j'})$, where $j \ne j'$. Without loss of generality, we can assume that $j < j'$. The two variables are indicated in Fig. 2 by the black boxes. The gray boxes denote the zeros with the variables corresponding to the black box from the same partition. All we need to do is show that there is at least one variable, of either partition, such that when this variable is set to the same value as the two 'black variables', we do not get a zero, and thus violate a Serre relation. This variable is taken to be of color $(\alpha + 1)$ (if $j > j'$, it is of color $(\alpha)$). More precisely, it is the variable $\bar x_{a',j}$, taken from the same row as $\bar x_{a',j'}$ (denoted by the 'slanted' box), which always exists, because $j < j'$. There is no zero at $\bar x_{a',j} = \bar x_{a',j'}$, because both variables are taken from the same row. In addition, there is no zero at $x_{a,j} = \bar x_{a',j}$, because it is not present in the factor (3.38), and the zero at $x_{a,j} = \bar x_{a',j'}$ is the one we removed. We conclude that after we remove the (arbitrary) zero at $x_{a,j} = \bar x_{a',j'}$, we do not have a zero when $x_{a,j} = \bar x_{a',j} = \bar x_{a',j'}$. Thus, we have shown that by removing any of the zeros in (3.38), we violate a Serre condition. We conclude that the zeros are indeed necessary.

Lemma 3.13. The function F(x) of (3.35) associated to a k-restricted multi-partition µ is an element of the kernel of $\varphi_\nu$ for any $\nu > \mu$.

Fig. 2. A violation of the Serre relations if the zero corresponding to the black squares is removed from (3.36). The left partition corresponds to the variables of color (α), the right one to color (α + 1). The 'slanted' box is the third variable, in addition to the two black ones, for which the Serre condition is violated. The gray boxes denote the zeros with the variable corresponding to the black box of the same partition, coming from the integrability conditions
Proof. Let us take a $\nu > \mu$, and let $\nu^{(\alpha)}$ be the first partition such that $\nu^{(\alpha)} > \mu^{(\alpha)}$. We will focus on the variables $x^{(\alpha)}$ and show that the function F(x) can not be non-zero under the evaluation map $\varphi_\nu$. Two variables in the same column of $\mu^{(\alpha)}$ have a zero, so they can not be placed in the same row in $\nu^{(\alpha)}$ if the result is to be non-zero, because in that case, acting with the evaluation map gives a zero. However, because $\nu^{(\alpha)} > \mu^{(\alpha)}$, we can not avoid placing variables of the same column in $\mu^{(\alpha)}$ in the same row of $\nu^{(\alpha)}$. To show this, let us denote the lengths of the rows of the partitions by $\mu^{(\alpha)}_i$ and $\nu^{(\alpha)}_i$, such that the index i is increasing going downwards. The only way to avoid placing variables of the same column of $\mu^{(\alpha)}$ in the same row of $\nu^{(\alpha)}$ is by placing the variables of $\mu^{(\alpha)}$ in rows of the same length in $\nu^{(\alpha)}$. However, because $\nu^{(\alpha)} > \mu^{(\alpha)}$, there will be an $\tilde\imath$ such that $\nu^{(\alpha)}_{\tilde\imath} > \mu^{(\alpha)}_{\tilde\imath}$. Let us focus on the smallest $\tilde\imath$. We have to place a variable of a row $\mu^{(\alpha)}_i$ with $i > \tilde\imath$ in the row $\nu^{(\alpha)}_{\tilde\imath}$. Because $\mu^{(\alpha)}_i \le \mu^{(\alpha)}_{\tilde\imath}$, this variable belongs to the same column as another variable in $\nu^{(\alpha)}_{\tilde\imath}$. We conclude that F(x) is zero under the evaluation map $\varphi_\nu$ with $\nu > \mu$.

Lemma 3.14. The function F(x) of (3.35) is an element of $\Gamma_\mu$.

Proof. This follows from Lemmas 3.11 and 3.13.
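The combinatorial step in the proof of Lemma 3.13 can be checked by brute force for small partitions. The sketch below assumes a single color and encodes a partition as a weakly decreasing tuple; `can_place` is an illustrative helper, not part of the text. It asks whether the boxes of µ can be redistributed over the rows of ν so that no row of ν receives two boxes from the same column of µ.

```python
from itertools import combinations


def partitions(m, largest=None):
    """Yield all partitions of m as weakly decreasing tuples."""
    if largest is None:
        largest = m
    if m == 0:
        yield ()
        return
    for first in range(min(m, largest), 0, -1):
        for rest in partitions(m - first, first):
            yield (first,) + rest


def column_lengths(mu):
    """Number of boxes in each column of the Young diagram of mu."""
    if not mu:
        return []
    return [sum(1 for part in mu if part > j) for j in range(mu[0])]


def can_place(mu, nu):
    """True if every row of nu can be filled with boxes taken from pairwise
    distinct columns of mu, using each box of mu exactly once."""
    remaining = column_lengths(mu)

    def fill(row):
        if row == len(nu):
            return all(count == 0 for count in remaining)
        available = [j for j, count in enumerate(remaining) if count > 0]
        for cols in combinations(available, nu[row]):
            for j in cols:
                remaining[j] -= 1
            if fill(row + 1):
                return True
            for j in cols:          # undo the choice and try the next one
                remaining[j] += 1
        return False

    return fill(0)
```

For instance, `can_place((2, 2), (3, 1))` is false: the row of length three would need boxes from three different columns, but (2, 2) has only two. The search is exponential and is meant only for small examples.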
As a last step in the proof of surjectivity, we have to show that the image of F(x) under the evaluation map is indeed of the form (3.18) and (3.19). In particular, it contains as a factor the functions h(y), which are symmetric under the exchange of variables $y^{(\alpha)}_{a,i} \leftrightarrow y^{(\alpha)}_{a,i'}$.

Lemma 3.15. The image of F(x) under the evaluation map $\varphi_\mu$ is a scalar multiple of the function H(y) in (3.18).

Proof. To prove this lemma, we can follow the same approach as we did in our paper on the $\mathfrak{sl}_2$ case, because the argument does not depend on the color of the variables. We will focus on the variables $x^{(\alpha)}$, and determine the permutations σ for which $\varphi_\mu(f(\sigma(x^{(\alpha)})))$ is non-zero. So, we consider
$$\sum_{\sigma \in S_{m^{(\alpha)}}} f(\sigma\{x^{(\alpha)}\}). \qquad (3.39)$$
In the following, we will omit the label α. Recall that the variable $x_{a,i,j}$ corresponds to the j-th column in the i-th row of length a. Under the evaluation map, $x_{a,i,j} \mapsto y_{a,i}$ for all j. Suppose that for some σ, we have $\sigma(x_{a,i,j}) = x_{a',i',j'}$ with $(a', i') < (a, i)$ and that $(a, i)$ is the largest row for which this is true. This means that all rows above $(a, i)$ undergo only a permutation within the row. Suppose that the pre-factor
$$\varphi_\mu \circ \sigma \prod_{(a,i)>(a',i')} (x_{a,i,j} - x_{a',i',j})(x_{a,i,j+1} - x_{a',i',j}) \qquad (3.40)$$
is to be non-zero. Then $x_{a',i',j'}$ can not be in a column directly below or to the left of the permutation image of any other element from row $(a, i)$. This means that at least one other element from row $(a, i)$ should be mapped to a row below $(a, i)$. If it is mapped to the row $(a', i')$ it can appear in any column other than $j'$. If it is mapped to any other
row, it can appear in any column other than $j'$ and an adjacent column (to the right or left depending on whether it is above or below $(a', i')$). Now we repeat this argument for this new element, concluding that at least one more element of row $(a, i)$ is mapped to a lower row, and so forth, until eventually we find that all elements are permuted to a row below $(a, i)$. If the elements are permuted to the same row, they can be placed in adjacent columns. Elements which are permuted to different rows can not be placed in adjacent columns, this being due to the factor linking adjacent columns in the pre-factor. There are at most a columns in $\mu^{(\alpha)}$ in rows below $(a, i)$, and hence the elements must all appear in the same row, which is therefore of length a. Thus all the variables in rows of length a are mapped to another row of length a, for the same reason. As a result, the only permutations which give a non-zero contribution to $\varphi_\mu(f(\sigma(x^{(\alpha)})))$ are those that permute variables within each row, or those that permute rows of equal length. Under the evaluation map, the former contribute equal terms to the sum, while row interchanges correspond to the symmetrization over the variables $y^{(\alpha)}_{a,i}$ with the same values of α and a in h(y). Note that the other factors in the function F are symmetric under the permutation of rows of equal length, so these factors do not interfere with the argument above.

Lemma 3.16. The map $\varphi_\mu : \Gamma_\mu \to \mathcal{H}_\mu$ is surjective.

Proof. This follows from Lemmas 3.14 and 3.15.
3.4.3. Injectivity proof.

Lemma 3.17. The induced map $\bar\varphi_\mu : \mathrm{Gr}_\mu \Gamma \to \mathcal{H}_\mu$ of (3.22) is well defined and injective.

Proof. To prove that the map (3.22) is well defined, we use Lemma 3.10 and observe that the image of $\Gamma'_\mu$ under $\varphi_\mu$ is zero by the definition of $\Gamma'_\mu$. It follows that we can define the induced map $\bar\varphi_\mu$ acting on the quotient $\mathrm{Gr}_\mu \Gamma = \Gamma_\mu / \Gamma'_\mu$. Moreover, the difference between two different functions in $\Gamma_\mu$ that map to the same rational function in $\mathcal{H}_\mu$ is in $\Gamma'_\mu$. Hence, the map is also injective.

We have now completed the proof of Theorem 3.6, because the theorem follows from Lemmas 3.17 and 3.16. The map (3.22) is degree preserving, and thus we can count the functions of homogeneous degree d in $\mathcal{H}_\mu$ to obtain the character of the space F.
To compute the character of $F_\lambda$, we add the poles $(x^{(\beta)}_{a,i,j})^{-1}$, which are present in the functions G(x) in (3.13). The only thing in the calculation of the character which changes is the fact that due to these poles, the zeros $(y^{(\beta)}_{a,i})^{\max(0,a-l)}$ in (3.19) become poles $(y^{(\beta)}_{a,i})^{-\min(a,l)}$.

3.5. Character of the dual space. Using the results of the previous section, we can calculate the character of the dual space $F_\lambda$, where $\lambda = l\omega_\beta$. First, let us define the character of $W_\lambda$ as follows:
$$\mathrm{ch}_q W_\lambda = \sum_{d,\, m^{(\alpha)}} \dim W_\lambda[m]_d\; q^d\, e^{\lambda - \omega^T C_r m}, \qquad (3.41)$$
where $W_\lambda[m]_d$ is the subspace generated by elements in U( n− ) of homogeneous degree $m^{(\alpha)}$ in $f_\alpha$, and homogeneous degree $-d$ in t. Here, $\omega = (\omega_1, \ldots, \omega_r)^T$.
The space $F_\lambda$ is a space of functions in the variables $x^{(\alpha)}_i$. If we define its (m, d)-graded component to be the space of functions in $m^{(\alpha)}$ variables $x^{(\alpha)}_i$ and total homogeneous degree d in all the variables, then, due to the way we defined the generating functions $f_\alpha(x)$ (or, equivalently, the coupling), we have that $F_\lambda[m]_d$ is the dual to $W_\lambda[m]_{d'}$, where $d' = d + \sum_\alpha m^{(\alpha)}$. Thus,
$$\mathrm{ch}_q W_\lambda = \sum_m \mathrm{ch}\, W_\lambda[m] = \sum_m \sum_d q^{d + \sum_\alpha m^{(\alpha)}}\, e^{\lambda - \omega^T C_r m} \dim (F_\lambda[m])_d, \qquad (3.42)$$
where $\dim (F_\lambda[m])_d$ denotes the dimension of the subspace of functions in $F_\lambda[m]$ which have homogeneous degree d. The powers of z correspond to the components of the weights in terms of the simple roots. Recall that here, $\lambda = \lambda_\beta = l\omega_\beta$. We will calculate this character by actually summing over all the functions in $\mathcal{H}$, and counting their homogeneous degree. The character of the space of symmetric functions h(y) in $m^{(\alpha)}_a$ variables is given by
$$\prod_{\alpha=1}^{r} \prod_{a=1}^{k} \frac{1}{(q)_{m^{(\alpha)}_a}}, \qquad (3.43)$$
where $(q)_m = \prod_{i=1}^{m} (1 - q^i)$ for $m \in \mathbb{N}$ and $(q)_0 = 1$. The homogeneous degree of the rational function $H_\mu(y)$, combined with the additional poles $\prod_{(a,i)} (y^{(\beta)}_{a,i})^{-a}$, is given by
$$\deg \frac{H_\mu(y)}{\prod_{(a,i)} (y^{(\beta)}_{a,i})^{a}} = \frac{1}{2} \sum_{\alpha,\alpha',a,a'} m^{(\alpha)}_a (C_r)_{\alpha,\alpha'} A_{a,a'}\, m^{(\alpha')}_{a'} - \sum_a A_{a,l}\, m^{(\beta)}_a - \sum_\alpha m^{(\alpha)}. \qquad (3.44)$$
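It follows that the character of $W^{(0)}_\lambda$ is
$$\mathrm{ch}_q W^{(0)}_{l\omega_\beta} = \sum_{\vec m \in \mathbb{Z}^{r\times k}_{\ge 0}} \frac{q^{\frac{1}{2} \vec m^{\,T} (C_r \otimes A) \vec m - (\mathrm{id} \otimes A \vec m)^{(\beta)}_l}}{(q)_{\vec m}}\; e^{l\omega_\beta - \omega^T C_r m}. \qquad (3.45)$$
Here $(A)_{a,b} = \min(a, b)$ is a $k \times k$ matrix, and $C_r$ is the Cartan matrix of $\mathfrak{sl}_{r+1}$. Also, $\vec m$ denotes the vector $(m^{(1)}_1, \ldots, m^{(1)}_k; \cdots; m^{(r)}_1, \ldots, m^{(r)}_k)$. We made use of the definition
$$(q)_{\vec m} = \prod_{\alpha=1}^{r} \prod_{a=1}^{k} (q)_{m^{(\alpha)}_a}.$$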
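To make the structure of (3.45) concrete, the following sketch expands the sum as a power series in q with the weight factor $e^{l\omega_\beta - \omega^T C_r m}$ set to 1. The function names and the truncation parameter `bound` (an upper limit on each $m^{(\alpha)}_a$, which the caller must choose large enough for the requested order) are assumptions made for this illustration. For r = k = 1 and l = 0 the sum reduces to $\sum_m q^{m^2}/(q)_m$, the sum side of the first Rogers-Ramanujan identity.

```python
from itertools import product


def cartan_matrix(r):
    """Cartan matrix C_r of sl_{r+1}: 2 on the diagonal, -1 beside it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(r)] for i in range(r)]


def exponent(m, l, beta):
    """(1/2) m^T (C_r x A) m - sum_a min(a, l) m_a^(beta), with A_ab = min(a, b).

    m[alpha - 1][a - 1] stores m_a^(alpha); beta is 1-based.
    """
    r, k = len(m), len(m[0])
    cartan = cartan_matrix(r)
    quad = sum(m[al][a - 1] * cartan[al][be] * min(a, b) * m[be][b - 1]
               for al in range(r) for be in range(r)
               for a in range(1, k + 1) for b in range(1, k + 1))
    linear = sum(min(a, l) * m[beta - 1][a - 1] for a in range(1, k + 1))
    return quad // 2 - linear   # quad is even: C_r x A has an even diagonal


def principal_series(r, k, l, beta, order, bound):
    """Coefficients of q^0..q^order of (3.45) with all weights set to 1.

    Only vectors with every entry <= bound are summed (truncation assumption).
    """
    total = [0] * (order + 1)
    for flat in product(range(bound + 1), repeat=r * k):
        m = [flat[al * k:(al + 1) * k] for al in range(r)]
        e = exponent(m, l, beta)
        if e < 0:
            raise ValueError("negative q-exponent; not expected here")
        if e > order:
            continue
        term = [0] * (order + 1)
        term[e] = 1
        for n in flat:                      # divide by (q)_n for each entry
            for i in range(1, n + 1):       # multiply by 1 / (1 - q^i)
                for d in range(i, order + 1):
                    term[d] += term[d - i]
        total = [s + t for s, t in zip(total, term)]
    return total
```

The enumeration grows like `(bound + 1) ** (r * k)`, so the sketch is only practical for small r and k.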
4. Characters for Rectangular Highest-Weight $\widehat{\mathfrak{sl}}_{r+1}$-Modules

In this section, we will show that we can use the characters of the principal subspace $W_\lambda$ to obtain the character of the full integrable module $V_\lambda$. We will be able to do this by using the invariance of the weight multiplicities of $V_\lambda$ under the action of the affine Weyl group, in particular the affine Weyl translations $t_\alpha$. More specifically, we will show
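that acting with an affine Weyl translation on the principal subspace, and taking an appropriate limit, we obtain the full integrable module. Let Λ be an affine weight of level k. It can be written as $\Lambda = \lambda + k\Lambda_0 - m\delta$, where λ is the weight with respect to $\mathfrak{h} \subset \mathfrak{sl}_{r+1}$. Let $t_\alpha$ be the affine Weyl translation corresponding to the root α (see [11], Eq. (6.5.2)), and define the translation $t_{\mathbf N} = \prod_i t^{N_i}_{\alpha_i}$, where $\mathbf N = (N_1, \ldots, N_r)^T$. Then
$$t_{\mathbf N}(\Lambda) = \lambda + k\mathbf N^T \cdot \alpha + k\Lambda_0 - \Big(m + \mathbf N^T \cdot \mathbf l + \frac{1}{2} k \mathbf N^T C_r \mathbf N\Big)\delta. \qquad (4.1)$$
Again, $\mathbf l = (l_1, \ldots, l_r)^T$, where $\lambda = \sum_i l_i \omega_i$, and $\alpha = (\alpha_1, \ldots, \alpha_r)^T$. Also note that α in terms of the weights is given by $\alpha = C_r \omega$.

Consider the principal subspace $W^{(\mathbf N)} = U( n− )\, t_{\mathbf N} v_\lambda$. It has a dual space description which is similar to $F_\lambda$, if we choose the vector $\mathbf N$ carefully. Given that if $f_\alpha[m] v_\lambda = 0$, then $f_\alpha[m + (C_r \cdot \mathbf N)_\alpha]\, t_{\mathbf N} v_\lambda = 0$ (since the Weyl group preserves weight space multiplicities), we choose $\mathbf N$ such that $(C_r \cdot \mathbf N)_\alpha = 2N$ for all α, for some $N \in \mathbb{Z}_+$. In the case of $\mathfrak{sl}_{r+1}$, we have $(\mathbf N)_i = N i (r + 1 - i)$. Then $f_\alpha[2N + \delta_{\alpha,\beta}]\, t_{\mathbf N} v_\lambda = 0$, where $\lambda = l\omega_\beta$, and $f_\alpha[2N - 1 + \delta_{\alpha,\beta}]\, t_{\mathbf N} v_\lambda \ne 0$. Note also that the extremal vector $t_{\mathbf N} v_\lambda$ is a basis for the one-dimensional weight subspace of weight
$$t_{\mathbf N}(\lambda) = \lambda + k\mathbf N^T \alpha + k\Lambda_0 - \Big((\lambda, \mathbf N^T \alpha) + \frac{1}{2} k \mathbf N^T C_r \mathbf N\Big)\delta.$$
In the case of interest here, this becomes
$$t_{\mathbf N}(l\omega_\beta) = l\omega_\beta + k\mathbf N^T \alpha + k\Lambda_0 - (l N_\beta + kN|\mathbf N|)\delta,$$
where $|\mathbf N| = \sum_i N_i$. Thus, the space dual to $W^{(\mathbf N)}_\lambda$ is the space of functions of the form
$$\prod_{\alpha,i} (x^{(\alpha)}_i)^{-2N}\, G(x),$$
where G(x) is the function in Eq. (3.13). Thus, we find that the character of $W^{(\mathbf N)}_{l\omega_\beta}$ differs from the character of $W^{(0)}_{l\omega_\beta}$ by a change in the exponent of q by $l N_\beta + kN|\mathbf N| - 2N|m|$ (where $|m| = \sum_\alpha m^{(\alpha)}$) and a change in the weight by $\omega^T C_r k \mathbf N$, which leads to
$$\mathrm{ch}_q W^{(\mathbf N)}_{l\omega_\beta} = \sum_{\vec m \in \mathbb{Z}^{r\times k}_{\ge 0}} \frac{q^{\frac{1}{2} \vec m^{\,T} (C_r \otimes A) \vec m - (\mathrm{id} \otimes A \vec m)^{(\beta)}_l}}{(q)_{\vec m}}\; q^{l N_\beta + kN|\mathbf N| - 2N|m|}\, e^{l\omega_\beta - \omega^T C_r (m - k\mathbf N)}.$$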
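The choice of the translation vector made above can be verified directly. The sketch below uses illustrative function names that do not appear in the text. It checks that $(\mathbf N)_i = N i (r + 1 - i)$ solves $(C_r \cdot \mathbf N)_\alpha = 2N$ for every α, which in turn gives $\frac{1}{2} \mathbf N^T C_r \mathbf N = N|\mathbf N|$, and it evaluates the resulting shift of the q-exponent.

```python
def cartan_matrix(r):
    """Cartan matrix C_r of sl_{r+1}: 2 on the diagonal, -1 beside it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(r)] for i in range(r)]


def translation_vector(r, n):
    """Components N_i = n * i * (r + 1 - i) for i = 1..r."""
    return [n * i * (r + 1 - i) for i in range(1, r + 1)]


def mat_vec(matrix, vec):
    """Plain matrix-vector product on lists."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def q_exponent_shift(r, k, l, beta, n, m_total):
    """l * N_beta + k * n * |N| - 2 * n * |m|, with beta 1-based."""
    big_n = translation_vector(r, n)
    return l * big_n[beta - 1] + k * n * sum(big_n) - 2 * n * m_total
```

For r = 3 and N = 2 the vector is (6, 8, 6), and applying $C_3$ gives (4, 4, 4), as required.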
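A form suitable for taking the limit $N \to \infty$ is obtained by eliminating the summation variable $m^{(\alpha)}_k$ in favor of $m^{(\alpha)} = \sum_{a=1}^{k} a\, m^{(\alpha)}_a$. This gives for the character of $W^{(0)}_{l\omega_\beta}$ (we define $\bar m^{(\alpha)} = \sum_{a=1}^{k-1} a\, m^{(\alpha)}_a$)
$$\mathrm{ch}_q W^{(0)}_{l\omega_\beta} = \sum_{m \in \mathbb{Z}^r_{\ge 0}} q^{\frac{1}{2k} m^T C_r m - \frac{1}{k} l m^{(\beta)}}\, e^{l\omega_\beta - \omega^T C_r m} \times {\sum_{\vec m \in \mathbb{Z}^{r\times(k-1)}_{\ge 0}}}' \frac{q^{\frac{1}{2} \vec m^{\,T} (C_r \otimes C_{k-1}^{-1}) \vec m - \vec\delta^{(\beta)}_l (\mathrm{id} \otimes C_{k-1}^{-1}) \vec m}}{\prod_{\alpha=1}^{r} (q)_{\frac{m^{(\alpha)} - \bar m^{(\alpha)}}{k}} \prod_{a=1}^{k-1} (q)_{m^{(\alpha)}_a}}, \qquad (4.2)$$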
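where the prime on the sum denotes the constraints $\bar m^{(\alpha)} \le m^{(\alpha)}$ and $\bar m^{(\alpha)} \equiv m^{(\alpha)} \bmod k$. Here, $C_{k-1}$ denotes the Cartan matrix of $\mathfrak{sl}_k$. The symbol $\vec\delta^{(\beta)}_l$ […]
$$q^{l N_\beta + \frac{1}{2} k \mathbf N^T C_r \mathbf N - m^T C_r \mathbf N}.$$
Combining this power of q with the power in the first line of (4.2), we use the change of variables $m'^{(\alpha)} = m^{(\alpha)} - k N_\alpha$, since the combined power in this new variable is
$$\frac{1}{2k} m^T C_r m + \frac{1}{2} k \mathbf N^T C_r \mathbf N - m^T C_r \mathbf N - \frac{l}{k}\big(m^{(\beta)} - k N^{(\beta)}\big) = \frac{1}{2k} m'^T C_r m' - \frac{l}{k} m'^{(\beta)}.$$
Making this substitution, we have
$$\mathrm{ch}_q W^{(\mathbf N)}_{l\omega_\beta} = \sum_{m' \ge -k\mathbf N} q^{\frac{1}{2k} m'^T C_r m' - \frac{1}{k} l m'^{(\beta)}}\, e^{l\omega_\beta - \omega^T C_r m'} \times {\sum_{\vec m \in \mathbb{Z}^{r\times(k-1)}_{\ge 0}}}' \frac{q^{\frac{1}{2} \vec m^{\,T} (C_r \otimes C_{k-1}^{-1}) \vec m - \vec\delta^{(\beta)}_l (\mathrm{id} \otimes C_{k-1}^{-1}) \vec m}}{\prod_{\alpha=1}^{r} (q)_{\frac{m'^{(\alpha)} - \bar m^{(\alpha)}}{k} + N_\alpha} \prod_{a=1}^{k-1} (q)_{m^{(\alpha)}_a}}, \qquad (4.3)$$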
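where the prime denotes the constraints $\bar m^{(\alpha)} \le m'^{(\alpha)} + k N_\alpha$ and $\bar m^{(\alpha)} \equiv m'^{(\alpha)} \bmod k$. We can now easily obtain the characters of the integrable level-k modules corresponding to rectangular highest weights by taking the limit $N \to \infty$ while keeping $m'$ finite. This gives
$$\mathrm{ch}_q W^{(\infty)}_{l\omega_\beta} = \sum_{m' \in \mathbb{Z}^r} q^{\frac{1}{2k} m'^T C_r m' - \frac{1}{k} l m'^{(\beta)}}\, e^{l\omega_\beta - \omega^T C_r m'} \times \frac{1}{(q)^r_\infty} \sum_{\vec m \in \mathbb{Z}^{r\times(k-1)}_{\ge 0}} \frac{q^{\frac{1}{2} \vec m^{\,T} (C_r \otimes C_{k-1}^{-1}) \vec m - \vec\delta^{(\beta)}_l (\mathrm{id} \otimes C_{k-1}^{-1}) \vec m}}{(q)_{\vec m}}, \qquad (4.4)$$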
with the constraint $\bar m^{(\alpha)} \equiv m'^{(\alpha)} \bmod k$. The nice feature of this character formula is that it manifestly splits the character into a sum over all the finite weights, each of which contributes a string function to the full character. These string functions are proportional to 'the second line' of Eq. (4.4). We can make the appearance of the characters slightly more compact by rewriting it in terms of the $r \times k$-vector $\vec m$ again. This results in
$$\mathrm{ch}_q W^{(\infty)}_{l\omega_\beta} = \frac{1}{(q)^r_\infty} \sum_{R} \frac{q^{\frac{1}{2} \vec m^{\,T} (C_r \otimes A) \vec m - ((\mathrm{id} \otimes A) \vec m)^{(\beta)}_l}}{\prod_{\alpha=1}^{r} \prod_{a} (q)_{m^{(\alpha)}_a}}\; e^{l\omega_\beta - \omega^T C_r m}, \qquad (4.5)$$
which, again, holds in the case of rectangular representations. Comparing this to the known character formulæ for the integrable representations which appear, for example, in [9], we see that this is indeed the character of the integrable, level-k $\widehat{\mathfrak{sl}}_{r+1}$-module with highest weight $\lambda = l\omega_\beta$, i.e. the module $V_{l\omega_\beta}$. Hence, we have the following result.

Theorem 4.1. The character of the integrable, level-k $\widehat{\mathfrak{sl}}_{r+1}$-module with highest weight $\lambda = l\omega_\beta$ is given by
$$\mathrm{ch}_q V_{l\omega_\beta} = \mathrm{ch}_q W^{(\infty)}_{l\omega_\beta},$$
where $\mathrm{ch}_q W^{(\infty)}_{l\omega_\beta}$ is given by Eq. (4.5).

The remainder of the paper will be devoted to obtaining character formulæ for general irreducible representations.

5. Conformal Blocks and Their Dual Spaces

5.1. Modules localized at ζ ≠ 0. Above, we considered the standard action of the central extension of the loop algebra, $\tilde{\mathfrak g} = \mathfrak g \otimes \mathbb{C}[t, t^{-1}]$, on integrable modules $V_\lambda$ of level k. Such modules can be considered as "localized" at the point 0. For a generic point $\zeta \in \mathbb{CP}^1$, let $t_\zeta = t - \zeta$ denote a local variable at ζ, and consider the action of the current algebra $\tilde{\mathfrak g}(\zeta) = \mathfrak g \otimes \mathbb{C}[t_\zeta, t_\zeta^{-1}]$ on a module $V_\lambda(\zeta)$, "localized" at the point ζ, which is isomorphic to $V_\lambda$. Specifically, the generator $x \otimes t_\zeta^n$ acts as x[n] on the module $V_\lambda(\zeta)$. In the physics literature [3], this action is sometimes denoted by $x_n(\zeta)$. Equivalently, in terms of the generating current $x(z) = \sum_{n \in \mathbb{Z}} x[n] z^{-n-1}$, let $v \in V_\lambda(\zeta)$. The action of $x \otimes t_\zeta^n$ may be written as
$$x \otimes t_\zeta^n \cdot v = \frac{1}{2\pi i} \oint_{C_\zeta} dz\, (z - \zeta)^n x(z) v,$$
where $C_\zeta$ is a contour around ζ. The central extension of $\tilde{\mathfrak g}(\zeta)$ is isomorphic to $\hat{\mathfrak g}$, where the cocycle acts in the same way as on modules localized at 0:
$$\langle x \otimes f(t_\zeta),\, y \otimes g(t_\zeta) \rangle = \langle x, y \rangle\, \frac{1}{2\pi i} \oint_{t_\zeta = 0} f'(t_\zeta) g(t_\zeta)\, dt_\zeta,$$
where $\langle x, y \rangle$ is the symmetric bilinear form on $\mathfrak g$. We call the centrally extended algebra with this cocycle $\hat{\mathfrak g}(\zeta)$. Obviously, its representations are isomorphic to those of $\hat{\mathfrak g}$. We also allow the point $\zeta = \infty$, and at that point we choose the local variable to be $t_\infty = t^{-1}$.

5.2. Fusion product of $\hat{\mathfrak g}(\zeta)$-modules. Let $N \in \mathbb{N}$ and let $(\zeta_1, \ldots, \zeta_N)$ be N distinct, finite points in $\mathbb{CP}^1$ (for convenience we choose $\zeta_p \ne 0$). Denote the local variable at each point by $t_p = t - \zeta_p$. At each point $\zeta_p$, we localize an integrable $\hat{\mathfrak g}(\zeta_p)$-module $V_p = V_{\mu_p}(\zeta_p)$ of level k, and top component $\pi_p = \pi_{\mu_p}$. We choose to consider only modules with highest weights
of the form $\mu_p = a_p \omega_{\alpha_p}$, where $1 \le \alpha_p \le r$ and $a_p \in \mathbb{Z}_{\ge 0}$. That is, highest weights corresponding to rectangular Young diagrams. The completed loop algebra $U = \oplus_p\, \mathfrak g \otimes \mathbb{C}[t_p, t_p^{-1}] \subset \mathfrak g \otimes \mathbb{C}(t)$ acts on the tensor product of these modules, $V_1 \otimes \cdots \otimes V_N$, by the usual coproduct,
$$\Delta^N_\zeta (x \otimes f(t)) = \sum_{p=1}^{N} \big(x \otimes f(t_p + \zeta_p)\big)^{(p)},$$
where the p-th term in the sum above acts on the p-th factor in the tensor product only: $x^{(p)} w_1 \otimes \cdots \otimes w_N := w_1 \otimes \cdots \otimes x \cdot w_p \otimes \cdots \otimes w_N$, $x \in U$. Here, by $\mathbb{C}(t)$ we mean rational functions in t, although for our purposes we need only consider the smaller space of rational functions with poles at most at $\zeta_1, \ldots, \zeta_N$. This action has a central extension, where the cocycle acts as
$$\langle x \otimes f(t),\, y \otimes g(t) \rangle = \langle x, y \rangle \sum_{p=1}^{N} \frac{1}{2\pi i} \oint_{t = \zeta_p} f'(t) g(t)\, dt, \qquad f(t), g(t) \in \mathbb{C}(t).$$
Thus, the level of the action of the centrally extended, completed algebra $\hat U = U \oplus \mathbb{C}c$ is also k, which is the same as the level of each localized module $V_i$. This action is called the fusion action in the physics literature. Since it differs from the usual action on the tensor product of $\hat{\mathfrak g}$-modules (which has level Nk), it is denoted in [4] by the symbol $\boxtimes$ rather than the usual $\otimes$:
$$V_\mu(\zeta) \overset{\mathrm{def}}{=} V_{\mu_1}(\zeta_1) \boxtimes \cdots \boxtimes V_{\mu_N}(\zeta_N), \qquad \mu = (\mu_1, \ldots, \mu_N). \qquad (5.1)$$
5.3. Coinvariant spaces. The fusion product is an integrable $\hat{\mathfrak g}$-module of level k; thus, there is a sense in which it is completely reducible (see Appendix I of [6] for the precise explanation and proofs). The "multiplicity" of the irreducible $\hat{\mathfrak g}$-module $V_\lambda(0)$ in the fusion product is given by the Verlinde numbers [24], which we denote by $K^{(k)}_{\lambda,\mu}$. If k is sufficiently large (that is, $k \ge \sum_p a_p$), these numbers are just the sums of products of the usual Littlewood-Richardson coefficients. In this paper, we only need to consider this case in order to obtain the character formulæ.

Remark 5.1. In the case where $\alpha_p = 1$ for all p and k is sufficiently large, the multiplicities are the usual Kostka numbers $K_{\lambda,\mu}$ in the notation of [18], where $\mu = (a_1, \ldots, a_N)$ and λ is a partition of length $r + 1$ with $|\lambda| = |\mu|$, such that $\lambda_i - \lambda_{i+1} = \lambda(\alpha_i)$, where $\alpha_i$ are the simple roots.

In complete generality, the multiplicity $K^{(k)}_{\lambda,\mu}$ is equal to the dimension of the coinvariant space [24, 6]
$$C_{\lambda,\mu}(\zeta) := V_{\lambda^*}(\infty) \boxtimes V_\mu(\zeta) \,/\, \mathfrak g \otimes A,$$
where the quotient is taken with respect to the image of $\mathfrak g \otimes A$ acting on the fusion product, and A is the space of meromorphic functions with possible poles at the points $\zeta_p$ and ∞ (it has trivial central extension). Here, $\lambda^*$ refers to the highest weight of the dual module to $\pi_\lambda$: $\lambda^* = -\omega_0(\lambda)$, where $\omega_0$ is the longest element in the Weyl group.
5.4. The coinvariant space as a quotient of principal subspaces. The dimension of the coinvariant space $C_{\lambda,\mu}$ was the subject of the paper [4], where a grading was defined on the space, compatible with the action of the current algebra. We will use the results about this space here, and compute the graded dimension for the special case of rectangular Young diagrams, with sufficiently large k.

Theorem 5.2 ([4] (1.6), slightly modified). There is a surjective map
$$u_{\lambda^*}(\infty) \otimes \pi_1 \otimes \cdots \otimes \pi_N \to C_{\lambda,\mu}(\zeta),$$
where $u_{\lambda^*}(\infty)$ is the lowest weight vector of the top component of the module $V_{\lambda^*}(\infty)$ with respect to the action of $\mathfrak g$, and $\pi_p$ are the top components of the modules $V_{\mu_p}(\zeta_p)$.

Thus we can conclude that the coinvariant is a quotient of the fusion product of principal subspaces $W_p = W_{\mu_p}(\zeta) = U(\mathfrak n_- \otimes \mathbb{C}[t_p^{-1}]) v_p$, where $v_p$ is the highest-weight vector of $V_p$, because $\pi_p \subset W_p$. The fusion product of principal subspaces is the space
$$W_\mu(\zeta) = W_1 \boxtimes \cdots \boxtimes W_N = U(\mathfrak n_- \otimes \mathbb{C}(t))\, v_1 \otimes \cdots \otimes v_N,$$
where we allow poles at $t = \zeta_p$. That is, in exactly the same way as for the integrable modules, the fusion product of principal subspaces can be decomposed as a direct sum of principal subspaces $W_\lambda(0)$, with multiplicities given by the Verlinde numbers $K^{(k)}_{\lambda,\mu}$. We can compute these multiplicities by computing the dimension of the space of highest-weight vectors (with respect to the action of $\mathfrak g$) in the space $U(\mathfrak n_- \otimes \mathbb{C}[t])\, v_1 \otimes \cdots \otimes v_N$. Notice that $x \otimes t^n$ acts on the p-th factor by $x\zeta_p^n$. (Here, we do not allow poles at $\zeta_p$, because they generate vectors in $W_p$ which are not in the top component $\pi_p$.)

Remark 5.3. The naturally graded version of the space described in the previous paragraph is the Feigin-Loktev "fusion product" [8].

5.5. Dual space of functions to the coinvariant. Again, in this paper, we do not incorporate the level-restriction for k, but we simply assume k to be sufficiently large with respect to the collection of weights $\mu_p$: if $\mu_p = a_p \omega_{\alpha_p}$, then the assumption is equivalent to $k \ge \sum_p a_p$. In this case, the Verlinde number $K^{(k)}_{\lambda,\mu}$ is equal to the Littlewood-Richardson coefficient $K_{\lambda,\mu}$. This is all we need in this paper to compute the characters of $W_\lambda$ for generic $\lambda \in P^+_k$. Consider the space of matrix elements $C_{\lambda,\mu}$, also known as the space of conformal blocks:
$$C_{\lambda,\mu} = \big\{ \langle u_{\lambda^*} |\, U(\mathfrak n_- \otimes \mathbb{C}[t])\, v_1(\zeta_1) \otimes \cdots \otimes v_N(\zeta_N) \rangle \big\}. \qquad (5.2)$$
Here, $u_{\lambda^*}$ is the lowest weight vector of $V_{\lambda^*}(\infty)$, considered as a $\hat{\mathfrak g}(0)$-module with $\hat{\mathfrak g}(0)$ acting to the left. (Thus, $\mathfrak n_- \otimes \mathbb{C}[t^{-1}]$ acts on $u_{\lambda^*}$ trivially.) If the $\zeta_j$ are pairwise distinct, the action of $\mathfrak n_- \otimes \mathbb{C}[t]$ on the product of highest-weight vectors generates all of $\pi_1 \otimes \cdots \otimes \pi_N$ (cf. the fusion product of [8]). The multiplicity of $v_\lambda \in \pi_\lambda$ in this tensor product is the Littlewood-Richardson coefficient $K_{\lambda,\mu}$. This space has a filtration by degree in t inherited from the corresponding filtration on the universal enveloping algebra. Let $U^{\le n}$ be the subspace of elements in $U(\mathfrak n_- \otimes \mathbb{C}[t])$ of degree less than or equal to n in t. Let $C^{\le n}_{\lambda,\mu}$ be the subspace of matrix elements of
$U^{\le n}$. Let $C_{\lambda,\mu}[n] = \mathrm{Gr}_n C_{\lambda,\mu}$ be the graded component of degree n. We define the graded coefficients $K_{\lambda,\mu}(q^{-1})$ to be
$$K_{\lambda,\mu}(q^{-1}) = \sum_n q^{-n} \dim C_{\lambda,\mu}[n]. \qquad (5.3)$$
We choose powers of $q^{-1}$ rather than q in order to be consistent with the grading in the last section, where we defined the degree of f[n] to be $-n$, as in (2.1). Therefore, $K_{\lambda,\mu}(q)$ is a polynomial in positive powers of q. (Notice that this is by definition the coefficient of $\pi_\lambda$ in the fusion product of Feigin and Loktev [8].) Let $G_{\lambda,\mu}(\zeta)$ be the space of generating functions for matrix elements of the form (5.2). That is,
$$G_{\lambda,\mu}(\zeta) = \big\{ \langle u_{\lambda^*} |\, f_{\alpha_1}(x^{(\alpha_1)}_1) \cdots f_{\alpha_m}(x^{(\alpha_m)}_m)\, v_1(\zeta_1) \otimes \cdots \otimes v_N(\zeta_N) \rangle \big\}, \qquad (5.4)$$
where $f_\alpha(x) = \sum_n f_\alpha[n] x^{-n-1}$ and $1 \le \alpha \le r$. Obviously, for this matrix element to be non-zero, the sum of the $\mathfrak h$-weights should be 0, that is, the matrix element should be $\mathfrak g$-invariant. If there are exactly $m^{(\alpha)}$ generating currents of the form $f_\alpha(x^{(\alpha)}_i)$ in the matrix element (5.4), define $m = (m^{(1)}, \ldots, m^{(r)})^T$. Then m is fixed by the zero-weight condition on the matrix element. Specifically, let $\omega = (\omega_1, \ldots, \omega_r)^T$. Then the zero-weight condition on m is
$$\sum_p \mu_p - \omega^T C_r m - \lambda = 0. \qquad (5.5)$$
Recall the notation $\lambda = \sum_\alpha l_\alpha \omega_\alpha$, and $\mathbf l = (l_1, \ldots, l_r)^T$. Let
$$n^{(\alpha)}_a = \text{number of weights of the form } \mu_p = a\omega_\alpha,$$
and $n^{(\alpha)} = \sum_a a\, n^{(\alpha)}_a$, $n = (n^{(1)}, \ldots, n^{(r)})^T$. Then $\sum_p \mu_p = \sum_\alpha n^{(\alpha)} \omega_\alpha$. We can rewrite (5.5) more compactly as
$$m = C_r^{-1}(n - \mathbf l), \qquad (5.6)$$
where $C_r$ is the Cartan matrix of $\mathfrak{sl}_{r+1}$. Let $g(x) \in G_{\lambda,\mu}(\zeta)$, where $x = \{x^{(\alpha)}_i,\ i = 1, \ldots, m^{(\alpha)};\ \alpha = 1, \ldots, r\}$. We define the pairing between functions g(x) and an element in $U(\mathfrak n_- \otimes \mathbb{C}(t))$ of the form $M (f_\alpha \otimes t_p^n)^{(p)}$, where $M \in U(\mathfrak n_- \otimes \mathbb{C}[t])$ and $x^{(p)}$ is an element in the algebra which acts on the p-th factor only. The pairing is again defined inductively as in (3.5), but the integral is modified to
$$\big(g(x),\, M (f_\alpha \otimes t_p^n)^{(p)}\big) = \Big( \frac{1}{2\pi i} \oint_{C_p} g(x) (x^{(\alpha)}_1 - \zeta_p)^n\, dx^{(\alpha)}_1,\ M \Big), \qquad (5.7)$$
where $C_p$ is a contour around the point $\zeta_p$, and so forth. We now describe the zero and pole structure of the space of functions $G_{\lambda,\mu}(\zeta)$. First we note that $G_{\lambda,\mu}(\zeta)$ is a subspace of the dual space G[m] to $U(\mathfrak n_- \otimes \mathbb{C}[t, t^{-1}])[m]$, which is described in Theorem 3.3:
$$G[m] = \Big\{ \frac{g_1(x)}{\prod (x^{(\alpha)}_i - x^{(\alpha+1)}_j)} \ \Big|\ g_1(x)\big|_{x^{(\alpha)}_1 = x^{(\alpha)}_2 = x^{(\alpha\pm1)}_1} = 0,\ \ g_1(x)\big|_{x^{(\alpha)}_i \leftrightarrow x^{(\alpha)}_j} = g_1(x) \Big\}.$$
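The zero-weight condition (5.6) above is easy to evaluate in examples. The following sketch assumes the vectors n and l are given as integer lists of length r; the function names are illustrative. It uses the standard closed form $(C_r^{-1})_{ij} = \min(i, j)\,(r + 1 - \max(i, j))/(r + 1)$ for the inverse Cartan matrix of $\mathfrak{sl}_{r+1}$ and returns `None` when the solution is not a vector of non-negative integers, in which case no matrix element of the form (5.4) can be non-zero.

```python
from fractions import Fraction


def inverse_cartan(r):
    """Inverse Cartan matrix of sl_{r+1} as exact fractions."""
    return [[Fraction(min(i, j) * (r + 1 - max(i, j)), r + 1)
             for j in range(1, r + 1)] for i in range(1, r + 1)]


def current_numbers(n, l):
    """Solve m = C_r^{-1} (n - l); return None unless m is integral and >= 0."""
    r = len(n)
    inv = inverse_cartan(r)
    m = [sum(inv[i][j] * (n[j] - l[j]) for j in range(r)) for i in range(r)]
    if any(x.denominator != 1 or x < 0 for x in m):
        return None
    return [int(x) for x in m]
```

For example, with r = 2, three weights $\mu_p = \omega_1$ and λ = 0 one has n = (3, 0) and l = (0, 0), which gives m = (2, 1): two currents $f_1$ and one current $f_2$.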
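Also, recall that $f_\alpha \otimes t_p^0 = f_\alpha[0](\zeta_p)$ acts trivially on $v_p$, unless $\mu_p = a_p \omega_\alpha$, in which case $(f_\alpha[0])^{a_p+1}$ acts trivially on $v_p$. In addition, $f_\alpha \otimes t_p^n$ acts trivially on $v_p$ for all $n > 0$ and all α. This implies, from the pairing (5.7), that for $g(x) \in G_{\lambda,\mu}(\zeta)$, $g_1(x)$ in Eq. (3.6) can have at most a simple pole whenever $x^{(\alpha_p)}_i = \zeta_p$. There is no pole when $x^{(\beta)}_i = \zeta_p$ if $\beta \ne \alpha_p$. That is,
$$g(x) = \frac{g_2(x)}{\prod (x^{(\alpha)}_i - x^{(\alpha+1)}_j) \prod_p \prod_a (x^{(\alpha_p)}_a - \zeta_p)} \in G_{\lambda,\mu}(\zeta), \qquad (5.8)$$
where the function $g_2(x)$ satisfies
$$g_2(x)\big|_{x^{(\alpha_p)}_1 = \cdots = x^{(\alpha_p)}_{a_p+1} = \zeta_p} = 0, \qquad \forall p. \qquad (5.9)$$
(Recall that we assume $\zeta_p \ne 0$, so that there is no pole at $x^{(\alpha)}_i = 0$.) Thus, $g_2(x)$ is a polynomial in $x^{(\alpha)}_i$. Finally, the currents in $U(\mathfrak n_- \otimes \mathbb{C}[t])$ may act to the left, on $u_{\lambda^*}$ sitting at infinity. The pairing at infinity is
$$\big(g(x),\, (f_\alpha \otimes t^n)^{(\infty)} M\big) = \Big( \frac{1}{2\pi i} \oint_{C_\infty} g(x) (x^{(\alpha)}_1)^n\, dx^{(\alpha)}_1,\ M \Big) = \Big( \frac{1}{2\pi i} \oint_{C_0} (x^{(\alpha)}_1)^{-n-2}\, g\big((x^{(\alpha)}_1)^{-1}, x^{(\alpha)}_2, \ldots\big)\, dx^{(\alpha)}_1,\ M \Big)$$
(the contour around infinity is clockwise). Since $f_\alpha[n]$ acts trivially at ∞ if $n \le 0$, this integral should be zero for $n \le 0$ if $g(x) \in G_{\lambda,\mu}(\zeta)$. This shows that
$$\deg_{x^{(\alpha)}_i} g(x) \le -2 \quad \text{for all } i, \alpha. \qquad (5.10)$$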
In summary, we have that, for $k$ sufficiently large,

Theorem 5.4. The dual space $G_{\lambda,\mu}(\zeta)$ to the space of coinvariants $C_{\lambda,\mu}(\zeta)$, with respect to the pairing (5.7), is the space of functions in the variables $x = \{x_i^{(\alpha)} \mid \alpha = 1, \dots, r;\ i = 1, \dots, m^{(\alpha)}\}$, where $m^{(\alpha)}$ is determined by (5.6), of the form (5.8), where $g_2(x)$ is a polynomial, symmetric with respect to exchange of variables with the same superscript $(\alpha)$, satisfying the Serre relation (3.7) and the vanishing condition (5.9), with the degree of $g(x)$ in each variable less than or equal to $-2$.

In the next section, we compute the character of this space.

5.6. Filtration of the dual space. The space $G_{\lambda,\mu}(\zeta)$ is filtered by homogeneous (total) degree in $x_i^{(\alpha)}$. Let $G_{\lambda,\mu}(\zeta)[n]$ be the graded component. This space is dual to the space $C_{\lambda,\mu}[n + |m|]$ (because the definition of the pairing involves taking the residue). We normalize the degree of the cyclic vector to be 0. Therefore we have
\[
\mathrm{ch}_q\, C_{\lambda,\mu} = q^{-|m|}\, \mathrm{ch}_q\, G_{\lambda,\mu}(\zeta) = \sum_n q^{-n-|m|}\, G_{\lambda,\mu}(\zeta)[n] = K_{\lambda,\mu}(q^{-1}). \tag{5.11}
\]
We use the same filtration argument as in Sect. 3. That is, consider the lexicographic ordering on $r$-tuples of partitions $\nu$, where $\nu^{(\alpha)}$ is a partition of $m^{(\alpha)}$. (Since $k$ plays no role in the filtration argument except in limiting the types of partitions allowed in the filtration, there is no difference in the zero and pole structure related to $k$.) We act with the evaluation maps $\varphi_\nu$ on the space $G_{\lambda,\mu}(\zeta)$ and consider the image in the space $H[m]$ of functions in the variables
\[
\{ y_{a,i}^{(\alpha)} \mid a \ge 1,\ i = 1, \dots, m_a^{(\alpha)},\ m_a^{(\alpha)} = \mathrm{Card}(\{\nu_i^{(\alpha)} = a\}) \}
\]
of the subspaces $\Gamma_\nu = \cap_{\nu' > \nu} \mathrm{Ker}\,\varphi_{\nu'}$. We take the associated graded space, and compute the character of the graded components $\Gamma_\nu / \Gamma'_\nu$, where $\Gamma'_\nu = \cap_{\nu' \ge \nu} \mathrm{Ker}\,\varphi_{\nu'}$. Define $H_\nu$ to be the image of the induced map $\overline{\varphi}_\nu$ on $\Gamma_\nu / \Gamma'_\nu$. The results are as follows.

Lemma 5.5. Let $g(x) \in G_{\lambda,\mu}(\zeta)$. Then
\[
\varphi_\nu(g(x)) = \prod_{\alpha;\ (a,i) < (a',i')} \bigl(y_{a,i}^{(\alpha)} - y_{a',i'}^{(\alpha)}\bigr)^{2A_{a,a'}}\ h_1(y).
\]
Proof. This follows from Lemma 3.7. The only difference in the two situations is that the partitions are only restricted by $m^{(\alpha)}$, not $k$.

The next lemma gives the pole structure due to the nontrivial commutation relations together with the Serre relations. Its proof is identical to Lemma 3.8.

Lemma 5.6. Let $h_1(y)$ be defined as in Lemma 5.5. Then
\[
h_1(y) = \prod_{\alpha=1}^{r-1}\ \prod_{a,a',i,i'} \bigl(y_{a,i}^{(\alpha)} - y_{a',i'}^{(\alpha+1)}\bigr)^{-A_{a,a'}}\ h_2(y),
\]
where $h_2(y)$ is regular when $y_{a,i}^{(\alpha)} = y_{a',i'}^{(\alpha+1)}$.

The following lemma is a slight modification of Lemma 3.9.

Lemma 5.7. Let $h_2(y)$ be as in Lemma 5.6. Then $h_2(y)$ has a pole of order at most $\min(a, a_p)$ whenever $y_{a,i}^{(\alpha_p)} = \zeta_p$.

Thus, we have
\[
\varphi_\nu(g(x)) = h(y) = \frac{\prod \bigl(y_{a,i}^{(\alpha)} - y_{a',i'}^{(\alpha)}\bigr)^{2A_{a,a'}}}{\prod \bigl(y_{a,i}^{(\alpha)} - y_{a',i'}^{(\alpha+1)}\bigr)^{A_{a,a'}}}\ \prod_p \bigl(y_{a,i}^{(\alpha_p)} - \zeta_p\bigr)^{-A_{a,a_p}}\ h_3(y), \tag{5.12}
\]
where $h_3(y)$ is a polynomial in the variables $\{y_{a,i}^{(\alpha)} \mid a \ge 1,\ i = 1, \dots, m_a^{(\alpha)},\ \alpha = 1, \dots, r\}$, with $\sum_a a\, m_a^{(\alpha)} = m^{(\alpha)}$, symmetric under the exchange of variables $y_{a,i}^{(\alpha)} \leftrightarrow y_{a,i'}^{(\alpha)}$. Here, $m_a^{(\alpha)}$ is the number of parts of length $a$ in the partition $\nu^{(\alpha)}$.

Remark 5.8. It is important to note that, since we are only interested in the character of the space of functions of the form (5.12), we can now set all $\zeta_p = 0$ in the space of polynomials without changing the character of the space.
There is a further restriction on $h(y)$ coming from the degree restriction (5.10) on $g(x)$. (This ensures that the space of coinvariants is finite-dimensional.) The evaluation map is degree preserving, which implies that
\[
\deg_{y_{a,i}^{(\alpha)}} h(y) \le -2a.
\]
This gives the following restriction on the degree of $h_3(y)$:

Lemma 5.9. Let $h_3(y)$ be as in Eq. (5.12). Then $h_3(y)$ is a polynomial in the variables $\{y_{a,i}^{(\alpha)}\}$, with
\[
0 \le \deg_{y_{a,i}^{(\alpha)}} h_3(y) \le -\sum_{b,\beta} (C_r)_{\alpha,\beta} A_{a,b}\, m_b^{(\beta)} + \sum_b A_{a,b}\, n_b^{(\alpha)},
\]
where $n_a^{(\alpha)}$ is the number of $\mathfrak{g}$-modules with highest weight $a\omega_\alpha$.

The injectivity of the induced map $\overline{\varphi}_\nu : \Gamma_\nu / \Gamma'_\nu \to H_\nu$ follows from the injectivity argument of Lemma 3.17. We do not show surjectivity. Instead, we compute the graded character of the coinvariant using the above space of functions, evaluate it at $q = 1$, and show that it is equal to the desired multiplicity given by the Littlewood-Richardson rule, by comparing with the known result [15] for generalized Kostka polynomials. The argument is as follows. The injectivity of the map $\overline{\varphi}_\nu$, which is a degree preserving map, implies that
\[
\dim G_{\lambda,\mu}(\zeta)[n] \le \sum_\nu \dim H_\nu[n],
\]
where $[n]$ denotes the graded component with respect to the homogeneous grading in the variables $y$. We will show that $\dim G_{\lambda,\mu}(\zeta) = \sum_\nu \dim H_\nu$, by computing the $q$-character of $H_\nu$, and showing that $\dim H_\nu = K_{\lambda,\mu}$, which is the dimension of the space of coinvariants. This proves the surjectivity of the evaluation map $\overline{\varphi}_\nu$, and also gives the $q$-character of $G_{\lambda,\mu}(\zeta)$. Define the character of the space $H_\nu$ to be
\[
\mathrm{ch}_q H_\nu = \sum_n q^{-n} H_\nu[n].
\]
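This character can be computed by setting $\zeta_p \to 0$ for all $p$. Recall that we must multiply by $q^{-|m|}$ to obtain the character of the coinvariant. We use the Gaussian polynomial,
\[
\begin{bmatrix} m+n \\ m \end{bmatrix}_q = \frac{(q)_{m+n}}{(q)_m (q)_n}, \qquad m, n \in \mathbb{Z}_{\ge 0}.
\]
Lemma 5.10. Let $H_\nu$ be the space of functions of the form (5.12) with degree restrictions (5.9), and $\zeta_p = 0$. Then
\[
q^{-|m|}\, \mathrm{ch}_q H_\nu = q^{Q(m,n)} \prod_{a,\alpha} \begin{bmatrix} P_a^{(\alpha)} + m_a^{(\alpha)} \\ m_a^{(\alpha)} \end{bmatrix}_q,
\]
where
\[
Q(m,n) = \frac{1}{2} \sum_{a,b,\alpha,\beta} m_a^{(\alpha)} (C_r)_{\alpha,\beta} A_{a,b}\, m_b^{(\beta)} - \sum_{a,b,\alpha} m_a^{(\alpha)} A_{a,b}\, n_b^{(\alpha)}
\]
and
\[
P_a^{(\alpha)} = \sum_b A_{a,b}\, n_b^{(\alpha)} - \sum_{\beta,b} (C_r)_{\alpha,\beta} A_{a,b}\, m_b^{(\beta)}.
\]
Here, $m_a^{(\alpha)}$ is the number of parts of $\nu^{(\alpha)}$ of length $a$, and $n_a^{(\alpha)}$ is the number of $\mathfrak{g}$-modules of highest weight $a\omega_\alpha$. Since the evaluation maps $\varphi_\nu$ are degree preserving, we can conclude that
\[
\mathrm{ch}_q G_{\lambda,\mu}(\zeta) \le \sum_\nu \mathrm{ch}_q H_\nu,
\]
where by the inequality, we mean the inequality in the coefficient of each power of $q$. Recall the identity
\[
\begin{bmatrix} m+n \\ m \end{bmatrix}_q = q^{mn} \begin{bmatrix} m+n \\ m \end{bmatrix}_{1/q}.
\]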
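The reflection identity for the Gaussian polynomial, $\begin{bmatrix} m+n \\ m \end{bmatrix}_q = q^{mn} \begin{bmatrix} m+n \\ m \end{bmatrix}_{1/q}$, can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `gaussian` and `reflected` are illustrative, and polynomials are stored as coefficient lists with the lowest power first.

```python
def gaussian(n, k):
    """Coefficients (lowest power first) of the Gaussian polynomial [n choose k]_q."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    # q-Pascal rule: [n, k]_q = [n-1, k-1]_q + q^k [n-1, k]_q
    left = gaussian(n - 1, k - 1)
    right = [0] * k + gaussian(n - 1, k)
    size = max(len(left), len(right))
    left = left + [0] * (size - len(left))
    right = right + [0] * (size - len(right))
    return [a + b for a, b in zip(left, right)]


def reflected(coeffs, shift):
    """Coefficients of q^shift * p(1/q), where p has the given coefficients."""
    out = [0] * (shift + 1)
    for power, c in enumerate(coeffs):
        out[shift - power] += c
    return out


# Check the identity for small m, n.
for m in range(5):
    for n in range(5):
        g = gaussian(m + n, m)
        assert reflected(g, m * n) == g
```

In this representation the identity says that the coefficient list of $\begin{bmatrix} m+n \\ m \end{bmatrix}_q$ is a palindrome of length $mn + 1$.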
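We can now conclude that we have an equality.

Theorem 5.11. The graded character of the space of conformal blocks $C_{\lambda,\mu}$ is $K_{\lambda,\mu}(q^{-1})$, where
\[
K_{\lambda,\mu}(q) = \sum_{\vec{m}} q^{\frac{1}{2} \vec{m}^T (C_r \otimes A_k)\, \vec{m}} \prod_{a,\alpha} \begin{bmatrix} P_a^{(\alpha)} + m_a^{(\alpha)} \\ m_a^{(\alpha)} \end{bmatrix}_q, \tag{5.13}
\]
where $\vec{m}$ is a vector with entries $m_a^{(\alpha)}$ restricted by (5.6), namely $m = C_r^{-1}(n - l)$, and
\[
\vec{P} = (\mathrm{id} \otimes A_k)\, \vec{n} - (C_r \otimes A_k)\, \vec{m}.
\]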
Proof. A direct comparison of the fermionic formula on the right hand side of (5.13) with Eq. (2.6) of [15] shows that
\[
K_{\lambda,\mu}(q) = K_{\bar\lambda^t, R^t}(q), \tag{5.14}
\]
in the notation of [15] (where $K_{\lambda,R}(q)$ is the co-charge Kostka polynomial). Here, $\bar\lambda$ is the Young diagram obtained from the weight $\lambda$ by adjoining to the corresponding Young diagram of $\lambda$ columns of length $r+1$, so that the equality $|\bar\lambda| = |R|$ is satisfied (the Kostka polynomial is zero unless $|R| - |\lambda| \equiv 0 \bmod (r+1)$, as a consequence of the restriction on the summation over $m_a^{(\alpha)}$, see part (4) of Lemma 5.12 below). The sequence $R = (R_1, \dots, R_N)$, with $R_p = (a_p)^{\alpha_p}$, is the sequence of rectangular Young diagrams corresponding to the weights $\mu_p$. We use a duality theorem for generalized Kostka polynomials [14],
\[
K_{\lambda^t; R^t}(q) = q^{n(R)} K_{\lambda,R}(q^{-1}), \tag{5.15}
\]
where $n(R) = \sum_{1 \le p < p'} \min(\alpha_p, \alpha_{p'}) \min(a_p, a_{p'})$. Then using the fact that
\[
K_{\lambda,R}(1) = \dim \mathrm{Hom}_{\mathfrak{g}}(\pi_\lambda, \pi_{\mu_1} \otimes \cdots \otimes \pi_{\mu_N})
\]
(where $\mathfrak{g} = \mathfrak{sl}_{r+1}$ or $\mathfrak{gl}_{r+1}$) is the dimension of the space of conformal blocks $C_{\lambda\mu}$, we conclude that the equality of $q$-dimensions in the theorem holds.
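The fermionic sum of Theorem 5.11 can be evaluated directly for small cases. The sketch below is an illustration under stated assumptions, not the authors' implementation: it reads the formula as $K_{\lambda,\mu}(q) = \sum_{\vec m} q^{\frac12 \vec m^T (C_r \otimes A_k) \vec m} \prod_{a,\alpha} \begin{bmatrix} P_a^{(\alpha)} + m_a^{(\alpha)} \\ m_a^{(\alpha)} \end{bmatrix}_q$ with $A_{a,b} = \min(a,b)$, restricts to $\mu = (n^{(1)}\omega_1, \dots, n^{(r)}\omega_r)$ with each $n^{(\alpha)} \le k$, and treats a Gaussian polynomial with $P_a^{(\alpha)} < 0$ and $m_a^{(\alpha)} > 0$ as zero. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def gaussian(n, k):
    """Coefficient list of [n choose k]_q; zero polynomial outside 0 <= k <= n."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    left = gaussian(n - 1, k - 1)
    right = [0] * k + gaussian(n - 1, k)
    size = max(len(left), len(right))
    left = left + [0] * (size - len(left))
    right = right + [0] * (size - len(right))
    return [a + b for a, b in zip(left, right)]


def poly_mul(p, s):
    out = [0] * (len(p) + len(s) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(s):
            out[i + j] += a * b
    return out


def multiplicities(total, k):
    """Yield tuples (m_1, ..., m_k) of non-negative integers with sum_a a*m_a == total."""
    if k == 0:
        if total == 0:
            yield ()
        return
    for count in range(total // k + 1):
        for rest in multiplicities(total - count * k, k - 1):
            yield rest + (count,)


def kostka(l, n, k):
    """Coefficient list of K_{lambda,mu}(q) for lambda = sum l_a w_a, mu = (n_1 w_1, ..., n_r w_r)."""
    r = len(l)
    cartan = [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(r)]
              for i in range(r)]
    # m = C_r^{-1}(n - l), using (C_r^{-1})_{a,b} = min(a,b) - a*b/(r+1).
    totals = [sum((min(a, b) - Fraction(a * b, r + 1)) * (n[b - 1] - l[b - 1])
                  for b in range(1, r + 1)) for a in range(1, r + 1)]
    if any(t.denominator != 1 or t < 0 for t in totals):
        return [0]
    result = [0]
    for nu in product(*(list(multiplicities(int(t), k)) for t in totals)):
        # big[beta][a-1] = sum_b min(a, b) * m_b^(beta)
        big = [[sum(min(a, b) * nu[beta][b - 1] for b in range(1, k + 1))
                for a in range(1, k + 1)] for beta in range(r)]
        # cm[alpha][a-1] = sum_beta (C_r)_{alpha,beta} * big[beta][a-1]
        cm = [[sum(cartan[al][be] * big[be][a] for be in range(r)) for a in range(k)]
              for al in range(r)]
        term = [1]
        for al in range(r):
            for a in range(k):
                if nu[al][a] > 0:
                    vacancy = min(a + 1, n[al]) - cm[al][a]  # P_a^(alpha)
                    term = poly_mul(term, gaussian(vacancy + nu[al][a], nu[al][a]))
        twice_exponent = sum(nu[al][a] * cm[al][a] for al in range(r) for a in range(k))
        term = [0] * (twice_exponent // 2) + term
        size = max(len(result), len(term))
        result = [(result[i] if i < len(result) else 0) + (term[i] if i < len(term) else 0)
                  for i in range(size)]
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result
```

For instance, `kostka((0, 2, 0), (1, 2, 1), 4)` sums over the single configuration $m_1^{(1)} = m_1^{(2)} = m_1^{(3)} = 1$, which has exponent 1 and $P_1^{(2)} = 1$, giving the coefficient list of $q + q^2$.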
5.7. A remark about the structure of $K_{\lambda,\mu}(q)$. In this paper, since we are concerned with representations of $\mathfrak{sl}_{r+1}$, we have labeled the representations with highest weight $\lambda$ with respect to the $\mathfrak{sl}_{r+1}$ weights, $\lambda = l_1\omega_1 + \cdots + l_r\omega_r$, and similarly for the weights $\mu_p = a_p\omega_{\alpha_p}$ with $\alpha_p \le r$.

Define $S_N^{(k)}$ to be the set of all unordered $N$-tuples of $\mathfrak{sl}_{r+1}$ dominant weights of the form $\mu_p = a_p\omega_{\alpha_p}$, with $\sum_p a_p \le k$. Let $P(r,k)$ be the set of partitions of length at most $r$ and width at most $k$. Define $\nu : S_N^{(k)} \to P(r,k)$ to be the "horizontal concatenation" map:
\[
(\nu(\mu))_\beta = \sum_{\alpha=\beta}^{r} n^{(\alpha)}, \qquad 1 \le \beta \le r,\quad \mu \in S_N^{(k)}.
\]
Note that this map is surjective but in general not injective. Let $S_r$ be the subset of $S_r^{(k)}$ consisting of precisely $r$ weights of the form $\mu_p = a_p\omega_p$ (again with $a_p \le k$). That is, $n^{(\alpha)} = a_\alpha$. Then $\nu$ is now a natural isomorphism, $\nu : S_r \xrightarrow{\sim} P(r,k)$. The inverse map is $\nu^{-1}(\bar\mu) = \mu = (\mu_1, \dots, \mu_r)$, with $\mu_p = (\bar\mu_p - \bar\mu_{p+1})\omega_p$, with $\bar\mu_{r+1} = 0$ by definition. In this paper we need to consider only the cases where $\mu \in S_r$ and $\lambda \in P(r,k)$. In this special case, we have the following properties of the Kostka polynomial.

Lemma 5.12. Let $\mu \in S_r$ and $\lambda \in P(r,k)$. Then the following statements are true for the Kostka polynomial of Eq. (5.13):
1. $K_{\lambda,\mu}(q) = 1$ if $\nu(\mu) = \lambda$;
2. $K_{\lambda,\mu}(q) = 0$ if $\lambda_1 > \nu(\mu)_1$;
3. $K_{\lambda,\mu}(q) = 0$ if $\lambda_1 = \nu(\mu)_1$ and $\lambda_s > \nu(\mu)_s$, where $s$ is the smallest integer such that $\lambda_s \neq \nu(\mu)_s$;
4. $K_{\lambda,\mu}(q) = 0$ if $\frac{1}{r+1}(|\nu(\mu)| - |\lambda|) \notin \mathbb{Z}_{\ge 0}$.

Let $K(q)$ be the matrix with entries $(K(q))_{\lambda,\nu(\mu)} = K_{\lambda,\mu}(q)$ with $\mu \in S_r$ and $\lambda \in P(r,k)$. The lemma implies in particular that $K(q)$ is upper unitriangular with respect to the ordering on partitions which looks like the lexicographic ordering on partitions, applied to partitions which are not of the same size: $\lambda < \mu$ if $\lambda_i = \mu_i$ for all $i < s \le r$, and $\lambda_s < \mu_s$.

Proof. 1. The constraint $C_r m = n - l$ means, when $n = l$ (i.e. $\lambda = \nu(\mu)$), that $m^{(\alpha)} = 0$ for all $\alpha$, hence only the term with $m_a^{(\alpha)} = 0$ contributes to the sum.

2. Suppose that $\lambda_1 > \nu(\mu)_1$, which implies that $\sum_{\alpha=1}^{r} (n^{(\alpha)} - l_\alpha) < 0$. However, the constraint implies that
\[
\sum_{\alpha=1}^{r} (n^{(\alpha)} - l_\alpha) = \sum_{\alpha=1}^{r} (C_r m)_\alpha = m^{(1)} + m^{(r)}, \tag{5.16}
\]
and since $m^{(\alpha)} \ge 0$ for non-zero Kostka polynomials, this gives the desired result.

3. Arguments similar to the proof of item (2) show that $\lambda_1 = \nu(\mu)_1$ implies $m^{(1)} = m^{(r)} = 0$, and also that $s < r$. Note that from $\lambda_\alpha = \nu(\mu)_\alpha$ for $\alpha \le s-1$ it follows that $n^{(\alpha)} = l_\alpha$ for $\alpha \le s-2$. From the constraint, we now obtain the relations
\[
\sum_{\alpha=1}^{t} (C_r m)_\alpha = m^{(1)} + m^{(t)} - m^{(t+1)} = m^{(t)} - m^{(t+1)} = 0, \qquad 1 \le t \le s-2,
\]
which imply that $m^{(t)} = 0$ for $1 \le t \le s-1$, in order for the Kostka polynomials to be non-zero. From the assumption that $\lambda_s > \nu(\mu)_s$, we obtain $l_{s-1} < n^{(s-1)}$. Thus, we find that
\[
\sum_{\alpha=1}^{s-1} (C_r m)_\alpha = m^{(1)} + m^{(s-1)} - m^{(s)} = -m^{(s)} = n^{(s-1)} - l_{s-1} > 0,
\]
which implies that the Kostka polynomial indeed vanishes, because $m^{(s)} < 0$.

4. This comes from the fact that $m^{(r)} \in \mathbb{Z}_{\ge 0}$, and
\[
m^{(r)} = \sum_{\alpha=1}^{r} (C_r^{-1})_{r,\alpha} (n^{(\alpha)} - l_\alpha) = \frac{1}{r+1} \sum_{\alpha=1}^{r} \alpha\, (n^{(\alpha)} - l_\alpha) = \frac{1}{r+1} (|\nu(\mu)| - |\lambda|). \tag{5.17}
\]
We can also make contact with the usual combinatorial notation for Kostka polynomials, which are labeled by Young diagrams, that is, $\mathfrak{gl}_{r+1}$ representations. Let $\bar\lambda$ be the partition of length at most $r+1$, obtained from $\lambda$ by defining $\bar\lambda_\beta = m^{(r)} + \lambda_\beta$, $1 \le \beta \le r+1$, where $m = C_r^{-1}(n - l)$. Let $\bar\mu = \nu(\mu)$. Then Eq. (5.17) implies $|\bar\lambda| = |\bar\mu|$, which is the usual condition in the Kostka polynomial labeled by $\mathfrak{gl}_{r+1}$-weights. The partition $\bar\lambda$ can be pictured as that obtained by adding $m^{(r)}$ columns of length $r+1$ to the left of the Young diagram corresponding to $\lambda$. The Kostka polynomial is defined for any $r$. If we choose to fix $|\bar\mu| = m$, and choose $r$ sufficiently large ($r \ge m$), then $n^{(\alpha)} = l_\alpha = 0$ if $\alpha > m$. We have the following generalization of the triangularity property for Kostka polynomials:

Lemma 5.13. The generalized Kostka polynomial $K_{\lambda,\mu}(q) = 0$ unless $\bar\lambda \trianglelefteq \bar\mu = \nu(\mu)$ according to the dominance ordering on partitions.

Proof. The dominance ordering on partitions is
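\[
\sum_{\alpha=1}^{\beta} \bar\lambda_\alpha \le \sum_{\alpha=1}^{\beta} \bar\mu_\alpha, \qquad \text{for all } \beta \in 1, \dots, r+1.
\]
Recast in terms of the variables $n$ and $l$ this means that
\[
(A(n - l))_\beta - \beta\, l_{r+1} = (A(n - l))_\beta - \beta\, m^{(r)} \ge 0 \qquad \text{for all } \beta. \tag{5.18}
\]
For $\beta = r+1$ the equality holds due to the condition $|\bar\lambda| = |\bar\mu|$, so we need only consider $\beta \le r$. Using the fact that
\[
m^{(r)} = \frac{1}{r+1} \sum_{\alpha=1}^{r} \alpha\, (n^{(\alpha)} - l_\alpha),
\]
the left hand side of Eq. (5.18) becomes
\[
\sum_{\alpha=1}^{r} \Bigl(A_{\beta\alpha} - \frac{\beta\alpha}{r+1}\Bigr) (n^{(\alpha)} - l_\alpha) = (C_r^{-1}(n - l))_\beta = m^{(\beta)}.
\]
Since $m^{(\beta)} \ge 0$ in the summation in $K_{\lambda,\mu}(q)$, this proves the lemma.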
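The proof of Lemma 5.13 rests on the closed form $(C_r^{-1})_{\beta\alpha} = \min(\alpha,\beta) - \frac{\alpha\beta}{r+1}$ and on the statement that the gaps between the partial sums of $\bar\mu$ and $\bar\lambda$ are exactly the entries $m^{(\beta)}$. The following sketch checks both for a small case; it is an illustration only, and the helper names (`cartan_inverse`, `solve_m`, `concatenate`) are not from the paper.

```python
from fractions import Fraction


def cartan(r):
    """Cartan matrix C_r of sl_{r+1}."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(r)]
            for i in range(r)]


def cartan_inverse(r):
    """Closed form (C_r^{-1})_{a,b} = min(a,b) - a*b/(r+1), indices starting at 1."""
    return [[Fraction(min(a, b)) - Fraction(a * b, r + 1) for b in range(1, r + 1)]
            for a in range(1, r + 1)]


def solve_m(n, l):
    """m = C_r^{-1}(n - l) as a list of Fractions."""
    r = len(n)
    inv = cartan_inverse(r)
    return [sum(inv[a][b] * (n[b] - l[b]) for b in range(r)) for a in range(r)]


def concatenate(n):
    """Horizontal concatenation: (nu(mu))_beta = sum_{alpha >= beta} n^(alpha)."""
    return [sum(n[beta:]) for beta in range(len(n))]


def partial_sums(parts):
    out, running = [], 0
    for x in parts:
        running += x
        out.append(running)
    return out


def dominance_gaps(n, l):
    """Partial sums of mu-bar minus partial sums of lambda-bar, for beta = 1..r+1."""
    m = solve_m(n, l)
    mu_bar = concatenate(n) + [0]
    lam_bar = [m[-1] + x for x in concatenate(l)] + [m[-1]]
    return [a - b for a, b in zip(partial_sums(mu_bar), partial_sums(lam_bar))]


# Example for sl_4 (r = 3): lambda = 0 and mu = (omega_1, 0, omega_3).
n, l = (1, 0, 1), (0, 0, 0)
m = solve_m(n, l)
gaps = dominance_gaps(n, l)
assert gaps[:3] == m and gaps[3] == 0
```

Here $m = (1, 1, 1)$, $\bar\mu = (2, 1, 1, 0)$ and $\bar\lambda = (1, 1, 1, 1)$, so the first $r$ gaps reproduce $m$ and the last gap vanishes because $|\bar\lambda| = |\bar\mu|$.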
Note also that if $m^{(\beta)} = 0$ for all $\beta$, then $\lambda = \bar\mu$. In that special case, $K_{\lambda,\mu}(q) = 1$. To tie in with the usual notion of the unitriangularity of the Kostka matrix, let $S_r[m] \simeq P(r,k)[m]$ be the subsets of (multi-)partitions of $m$, and fix $\max(k, r) \ge m$. The number of elements of both sets is the number of partitions of $m$. Let $\lambda \vdash m$. The last lemma implies that the square matrix $K(q)$, with entries indexed lexicographically by the partitions $\nu(\mu)$ with $\mu \in S_r[m]$ and $\lambda$, is upper unitriangular. That is, define
\[
(K(q))_{\lambda,\bar\mu} = K_{\lambda,\mu}, \qquad \mu \in S_r[m],\quad \bar\mu = \nu(\mu),\quad \max(k, r) \ge |\lambda| = |\bar\mu|.
\]
Then $K(q)_{\lambda,\bar\mu} = 0$ if $\lambda \not\trianglelefteq \bar\mu$, and it is equal to 1 if $\lambda = \bar\mu$. In the case in which we are interested, in which $r$ is fixed and may be smaller than $|\bar\mu|$, we take the subset of the elements of this matrix which have the length of $\bar\mu$ to be at most $r$, and the length of $\lambda$ to be at most $r+1$.

6. Characters for Arbitrary Highest-Weight $\widehat{\mathfrak{sl}}_{r+1}$-Modules

Let $\lambda \in P_k^+$ and let $V_\lambda$ be the highest-weight $\mathfrak{sl}_{r+1}$-module of level $k$. We are interested in computing a fermionic formula for the character of this space, for arbitrary $\lambda$, similar in form to the one found in Sect. 3. We compute this character in several steps. First, we compute the character of the fusion product of several principal subspaces corresponding to rectangular highest weights $\mu_p$. We then use a Weyl translation to find the character of the fusion product of integrable modules corresponding to the same highest weights. At this point, we choose a very particular set of $r$ rectangular highest weights, of the form $\mu_p = a_p\omega_p$ with $p = 1, \dots, r$. We use the decomposition of the fusion product into the graded sum over irreducible highest-weight modules, with coefficients given by the generalized Kostka polynomials. This means that the character of the fusion product is the sum over characters of irreducible modules, with coefficients given by the Kostka polynomial. This relation between the characters is invertible, so we use it to write the character of the irreducible module in terms of a finite sum over characters of particular fusion products. The coefficients in the sum are polynomials in $q^{-1}$ whose coefficients are not necessarily positive, since they are given by the entries of the inverse of the matrix of generalized Kostka polynomials in $q^{-1}$.

6.1. Character of the fusion product of principal subspaces. Consider the fusion product of principal subspaces:
\[
W_\mu(\zeta) = W_1(\zeta_1) \cdots W_N(\zeta_N) = U(\mathfrak{n}_- \otimes \mathbb{C}(t))\, v_1 \otimes \cdots \otimes v_N,
\]
where we allow singularities at $t = \zeta_p$.
Here, $v_p$ is the highest-weight vector of $V_{\mu_p}(\zeta_p)$, the module of level $k$ with highest weight of the form $\mu_p = a_p\omega_{\alpha_p}$, localized at $\zeta_p$. We choose $k$ sufficiently large, that is, $k \ge \sum_p a_p$, so that the level restriction in the decomposition coefficients does not play a role. Note once more that the algebra $U(\mathfrak{n}_- \otimes \mathbb{C}(t))$ is filtered by degree in $t$, and that, defining the cyclic vector $\otimes v_p$ to have degree 0, the fusion product $W_\mu(\zeta)$ inherits this filtration. Hence, we can define the $q$-character of $W_\mu(\zeta)$ as the Hilbert series of the associated graded space; it is a Laurent series in $q$, which we can compute for sufficiently simple $\mu_p$.
As an $\mathfrak{n}_- \otimes \mathbb{C}[t, t^{-1}]$-module, $W_\mu(\zeta)$ decomposes as a direct sum of principal subspaces $W_\lambda(0)$, with graded coefficients which are equal to the generalized Kostka polynomials in the previous section. This follows from the fact that $W_\lambda(0)$ is generated by the action of $\mathfrak{n}_- \otimes \mathbb{C}[t, t^{-1}]$ on the highest weight vector of $V_\lambda(0)$, and in the previous section we computed the graded space of multiplicities of these highest-weight vectors in the fusion product of integrable modules to be generalized Kostka polynomials. Thus, we can see that
\[
\mathrm{ch}_q W_\mu(\zeta) = \sum_\lambda K_{\lambda,\mu}(q^{-1})\, \mathrm{ch}_q W_\lambda. \tag{6.1}
\]
Note that the sum over $\lambda$ is finite, because $K_{\lambda,\mu}(q) = 0$ when $\lambda_1 > (\nu(\mu))_1$. In this subsection we will compute the character of the fusion product $W_\mu(\zeta)$, by characterizing the dual space of $\mathfrak{n}_- \otimes \mathbb{C}(t)$ acting on the cyclic vector $\otimes v_p$. The dual space is the space of generating functions for matrix elements of the form
\[
\{ \langle w | U(\mathfrak{n}_- \otimes \mathbb{C}(t))\, v_1 \otimes \cdots \otimes v_N \rangle,\quad \langle w | \in W_\lambda^*(\infty),\ \lambda \in P_k^+ \}.
\]
Thus, the dual space $F_\mu(\zeta)$ is the space of functions in the variables $x_i^{(\alpha)}$ (with $1 \le \alpha \le r$ and $1 \le i \le m^{(\alpha)}$), with pairing defined in the same way as in Eq. (5.7). Thus it is the space of functions with possible simple poles at $x_i^{(\alpha_p)} = \zeta_p$ and $x_i^{(\alpha)} = x_j^{(\alpha\pm 1)}$, such that the polynomial $f(x)$ defined by
\[
F(x) = \frac{f(x)}{\prod_{p,i} \bigl(x_i^{(\alpha_p)} - \zeta_p\bigr)\ \prod_{\alpha=1}^{r-1} \prod_{j,k} \bigl(x_j^{(\alpha)} - x_k^{(\alpha+1)}\bigr)} \in F_\mu(\zeta) \tag{6.2}
\]
is symmetric under the exchange $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$. In addition, it vanishes due to the Serre relation whenever
\[
x_1^{(\alpha)} = x_2^{(\alpha)} = x_1^{(\alpha\pm 1)}.
\]
There is no degree restriction on $f(x)$, since we allow for poles at infinity in $U(\mathfrak{n}_- \otimes \mathbb{C}(t))$, as well as at $t = \zeta_p$. We do not allow for zeros at $t = \zeta_p$, so the pole structure at $t = \zeta_p$ is as before. Moreover we have, as in the calculation of the coinvariant, the condition that $f(x)$ vanishes whenever
\[
x_1^{(\alpha_p)} = \cdots = x_{a_p+1}^{(\alpha_p)} = \zeta_p, \qquad p = 1, \dots, N. \tag{6.3}
\]
Finally, it is possible now to have currents $f_\alpha(z)^{k+1}$ acting non-trivially on the tensor product of highest-weight vectors. Since $W_\lambda(0)$ is a subspace of an integrable module, where such currents act trivially, the dual space is in the subspace which couples trivially to such currents. That is, we must impose the integrability condition, that $f(x)$ vanishes whenever
\[
x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}. \tag{6.4}
\]
These conditions characterize the space $F_\mu(\zeta)$. In order to compute the character of the $\mathfrak{h}$-graded component $F_\mu(\zeta)[m]$, we introduce the same filtration as in Sect. 3.4. That is, let $\nu$ be a multi-partition consisting of $r$ partitions, where $\nu^{(\alpha)} \vdash m^{(\alpha)}$ (we denote this as $\nu \vdash m$). We order multi-partitions lexicographically, and introduce the
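evaluation maps $\varphi_\nu$ as in Sect. 3.4. The evaluation maps act on the space $F_\mu(\zeta)$. Let $\Gamma_\nu = \cap_{\nu' > \nu} \ker \varphi_{\nu'}$ etc., where the kernel now refers to that of the evaluation map acting on $F_\mu(\zeta)$. Define the graded components $\mathrm{Gr}_\nu = \Gamma_\nu / \Gamma'_\nu$. We compute the image of the induced map $\overline{\varphi}_\nu : \mathrm{Gr}_\nu \to H_\nu$. Here, $H_\nu$ is the space of rational functions in the variables
\[
y = \{ y_{a,i}^{(\alpha)} \mid 1 \le \alpha \le r,\ 1 \le i \le m_a^{(\alpha)},\ 1 \le a \le k \},
\]
where $m_a^{(\alpha)}$ is the number of rows of length $a$ in $\nu^{(\alpha)}$, with possible poles at $y_{a,i}^{(\alpha)} = y_{a',i'}^{(\alpha+1)}$ and at $y_{a,i}^{(\alpha_p)} = \zeta_p$.

Definition 6.1. Let $\widetilde{H}_\nu \subset H_\nu$ be the subspace of functions spanned by functions of the form
\[
H(y) = H_\nu(y)\, h(y), \tag{6.5}
\]
where $h(y)$ is a polynomial, symmetric under the exchange of variables with the same values of $\alpha$ and $a$, and
\[
H_\nu(y) = \prod_{\substack{\alpha = 1, \dots, r \\ (a,i) > (a',i')}} \bigl(y_{a,i}^{(\alpha)} - y_{a',i'}^{(\alpha)}\bigr)^{2A_{a,a'}} \prod_{\substack{\alpha = 1, \dots, r-1 \\ (a,i);\,(a',i')}} \bigl(y_{a,i}^{(\alpha)} - y_{a',i'}^{(\alpha+1)}\bigr)^{-A_{a,a'}} \times \prod_{p,\,(a,i)} \bigl(y_{a,i}^{(\alpha_p)} - \zeta_p\bigr)^{-A_{a,a_p}}. \tag{6.6}
\]
By using almost identical arguments to those in Sect. 3.4, we conclude that

Theorem 6.2. The induced map
\[
\overline{\varphi}_\nu : \mathrm{Gr}_\nu \to \widetilde{H}_\nu \tag{6.7}
\]
is an isomorphism of graded vector spaces.

Therefore we have that
\[
\mathrm{ch}_q F_\mu(\zeta) = \sum_m \sum_{\nu \vdash m} \mathrm{ch}_q \widetilde{H}_\nu.
\]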
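To compute the character of $\widetilde{H}_\nu$ we can set $\zeta_p = 0$ in $H_\nu(y)$, as it does not change the character. Also recall that $\mathrm{ch}_q W_\mu(\zeta)[m] = q^{|m|}\, \mathrm{ch}_q F_\mu(\zeta)$. Thus we have
\[
\mathrm{ch}_q W_\mu(\zeta) = \sum_{\vec{m} \in \mathbb{Z}_{\ge 0}^{r \times k}} \frac{q^{\frac{1}{2} \vec{m}^T (C_r \otimes A)\, \vec{m} - \vec{m}^T (\mathrm{id} \otimes A)\, \vec{n}}}{(q)_{\vec{m}}}\ e^{\omega^T \cdot n - \omega^T C_r m}. \tag{6.8}
\]
Recall that $n = (n^{(1)}, \dots, n^{(r)})^T$, with $n^{(\alpha)} = \sum_{a \ge 0} a\, n_a^{(\alpha)}$, where $n_a^{(\alpha)}$ is the number of highest weights of the form $\mu_p = a\omega_\alpha$. In order to calculate the character for general principal subspaces of $\mathfrak{sl}_{r+1}$, we can restrict ourselves to sequences of $r$ partitions of the form $\mu_p = a_p\omega_p$, with $p = 1, \dots, r$.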
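The results of Sect. 5.7 show that the matrix $K(q)$ with elements $(K(q))_{\lambda,\nu(\mu)} = K_{\lambda,\mu}(q)$ is invertible, so we can invert the relation (6.1) and conclude that the character of the principal subspace of a general highest weight is given by
\[
\mathrm{ch}_q W_\lambda = \sum_\mu (K^{-1}(q^{-1}))_{\nu(\mu),\lambda}\ \mathrm{ch}_q W_\mu(\zeta), \tag{6.9}
\]
where the finite sum is over sequences of partitions of the form $\mu = (n^{(1)}\omega_1, \dots, n^{(r)}\omega_r)$, i.e. sequences of rectangular weights, such that $\nu(\mu) \le \lambda$ (in the sense of Lemma 5.12).

6.2. Characters for general highest-weight modules of $\mathfrak{sl}_{r+1}$. We can now use the results of Sect. 4, to obtain the character formulæ for the Weyl translated principal subspaces and, in particular, the characters of general integrable irreducible representations of $\mathfrak{sl}_{r+1}$. Let us denote the limit $N \to \infty$ of $T^{\mathbf{N}}\, \mathrm{ch}_q V_\mu(\zeta)$ (where $\mathbf{N}$ is chosen in such a way that $(C_r \cdot \mathbf{N})_\alpha = 2N$, for all $\alpha$) by $\mathrm{ch}_q V_\mu(\zeta)$. Using the results and notation of Sect. 4, we find
\[
\mathrm{ch}_q V_\mu(\zeta) = \sum_{m \in \mathbb{Z}^r} q^{\frac{1}{2k} m^T C_r m - \frac{1}{k} n^T \cdot m}\ e^{\omega^T \cdot n - \omega^T C_r m} \times \frac{1}{(q)_\infty^r} \sideset{}{'}\sum_{\vec{m} \in \mathbb{Z}_{\ge 0}^{r \times (k-1)}} \frac{q^{\frac{1}{2} \vec{m}^T (C_r \otimes C_{k-1}^{-1})\, \vec{m} - \vec{n}^T (\mathrm{id} \otimes C_{k-1}^{-1})\, \vec{m}}}{\prod_{\alpha=1}^{r} \prod_{a=1}^{k-1} (q)_{m_a^{(\alpha)}}}, \tag{6.10}
\]
where the prime denotes the constraint $\sum_{a=1}^{k-1} a\, m_a^{(\alpha)} = m^{(\alpha)} \bmod k$. As in the case of the fusion of the principal spaces, the second line of Eq. (6.10) leads to an expression for the string functions, in this case associated to general modules of $\mathfrak{sl}_{r+1}$. However, we can make the character simpler in appearance by reintroducing $m_k^{(\alpha)}$ in favor of $m^{(\alpha)}$. This gives
\[
\mathrm{ch}_q V_\mu(\zeta) = \frac{1}{(q)_\infty^r} \sum_{\substack{\vec{m} \\ m_k^{(\alpha)} \in \mathbb{Z},\ m_a^{(\alpha)} \in \mathbb{Z}_{\ge 0}\ (a < k)}} \frac{q^{\frac{1}{2} \vec{m}^T (C_r \otimes A)\, \vec{m} - \vec{n}^T (\mathrm{id} \otimes A)\, \vec{m}}}{\prod_{\alpha=1}^{r} \prod_{a=1}^{k-1} (q)_{m_a^{(\alpha)}}}\ e^{\omega^T \cdot n - \omega^T C_r m}. \tag{6.11}
\]
This character decomposes into characters of the integrable modules in the following way
\[
\mathrm{ch}_q V_\mu(\zeta) = \sum_{\lambda \le \nu(\mu)} K_{\lambda,\mu}(q^{-1})\ \mathrm{ch}_q V_\lambda, \tag{6.12}
\]
where the sum is over dominant weights of $\mathfrak{sl}_{r+1}$. We can now invert the relation (6.12), to obtain the character of a general integrable highest-weight module of $\mathfrak{sl}_{r+1}$.

Theorem 6.3. The character $\mathrm{ch}_q V_\lambda$ of any integrable, level-$k$ $\widehat{\mathfrak{sl}}_{r+1}$-module with highest weight $\lambda$ is given by
\[
\mathrm{ch}_q V_\lambda = \sum_\mu (K^{-1}(q^{-1}))_{\nu(\mu),\lambda}\ \mathrm{ch}_q V_\mu(\zeta), \tag{6.13}
\]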
where $\mathrm{ch}_q V_\mu(\zeta)$ is given by Eq. (6.11) and the elements of the invertible matrix $K$ are given by $(K(q))_{\lambda,\nu(\mu)} = K_{\lambda,\mu}(q)$, where $K_{\lambda,\mu}(q)$ is given by Eq. (5.13). The finite sum is over sequences of rectangular partitions of the form $\mu = (n^{(1)}\omega_1, \dots, n^{(r)}\omega_r)$, such that $\nu(\mu) \le \lambda$ in the sense of Lemma 5.12.

We note some features of this formula. It is a finite sum, with coefficients in $\mathbb{Z}[q^{-1}]$. Therefore, not only is the positivity of the coefficients of $q^n$ not manifest from this formula, neither is the fact that the character is a series in positive powers of $q$ only.
6.3. Some examples. Let us consider some explicit examples of the matrices of generalized Kostka polynomials, and, as a result, some character formulæ for non-rectangular representations. We will do this for $\mathfrak{sl}_3$ in full generality, and for $\mathfrak{sl}_4$ at fixed level.

6.3.1. The case $\mathfrak{sl}_3$. In this case, it is very easy to write down the elements of the matrix $K_{\lambda;\nu(\mu)}$. For a given partition $\lambda$, let $l_i = \lambda_{i+1} - \lambda_i$. Using this notation, we have the following result
\[
K_{(l_1,l_2);(l_1-i,\,l_2-j)} = \delta_{i,j}\, q^i, \tag{6.14}
\]
where we have the constraints $0 \le i, j \le \min(l_1, l_2)$. The non-zero elements of $K^{-1}$ are also easily obtained
\[
K^{-1}_{(l_1,l_2);(l_1,l_2)}(q) = 1, \tag{6.15}
\]
\[
K^{-1}_{(l_1,l_2);(l_1-1,\,l_2-1)}(q) = -q, \qquad l_1, l_2 > 0, \tag{6.16}
\]
while all the other elements are zero. For the characters of arbitrary $\mathfrak{sl}_3$ representations, this implies for non-rectangular representations (i.e. $l_1, l_2 > 0$),
\[
\mathrm{ch}_q V_{(l_1,l_2)} = \mathrm{ch}_q V_{(l_1,l_2)}(\zeta) - \frac{1}{q}\, \mathrm{ch}_q V_{(l_1-1,\,l_2-1)}(\zeta), \tag{6.17}
\]
where $\mathrm{ch}_q V_\mu(\zeta)$ is given by Eq. (6.10) or (6.11).

6.3.2. An $\mathfrak{sl}_4$ example. We give an explicit example for the matrix $K$ for representations of $\mathfrak{sl}_4$, with level $k \le 4$. In addition, we will restrict ourselves to representations with $\sum_{i=1}^{3} i\, l_i = 0 \bmod 4$ (see Sect. 5.7). There are 10 representations of this kind, and we will use the ordering (0, 0, 0); (1, 0, 1), (0, 2, 0); (2, 1, 0), (0, 1, 2); (4, 0, 0), (2, 0, 2), (1, 2, 1), (0, 4, 0), (0, 0, 4).
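The count of 10 representations can be reproduced by brute force. The sketch below assumes that "level $k \le 4$" means Dynkin labels with $l_1 + l_2 + l_3 \le 4$; the function name `sl4_weights` is illustrative only.

```python
from itertools import product


def sl4_weights(max_level=4):
    """Dynkin labels (l1, l2, l3) with l1 + l2 + l3 <= max_level and l1 + 2*l2 + 3*l3 = 0 mod 4."""
    return [l for l in product(range(max_level + 1), repeat=3)
            if sum(l) <= max_level and (l[0] + 2 * l[1] + 3 * l[2]) % 4 == 0]


weights = sl4_weights()
# Group by l1 + l2 + l3, which matches the grouping used in the ordering above.
by_level = {}
for l in weights:
    by_level.setdefault(sum(l), []).append(l)
```

Under this assumption the weights fall into groups with $l_1 + l_2 + l_3 = 0, 2, 3, 4$, containing 1, 2, 2 and 5 weights respectively.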
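With this ordering, we obtain the following Kostka matrix
\[
K(q) = \begin{pmatrix}
1 & q & 0 & 0 & 0 & 0 & q^2 & 0 & 0 & 0 \\
0 & 1 & 0 & q & q & 0 & q & q^2 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & q + q^2 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & q & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & q & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}. \tag{6.18}
\]
The inverse is
\[
K^{-1}(q) = \begin{pmatrix}
1 & -q & 0 & q^2 & q^2 & 0 & 0 & -q^3 & 0 & 0 \\
0 & 1 & 0 & -q & -q & 0 & -q & q^2 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & -q - q^2 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -q & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & -q & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}. \tag{6.19}
\]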
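That the matrices (6.18) and (6.19) are mutually inverse can be confirmed with exact polynomial arithmetic. The sketch below lists only the off-diagonal entries as read from the two displays (row and column indices follow the ordering of the 10 weights, starting at 1); the helper names are illustrative.

```python
SIZE = 10


def poly_mul(p, s):
    out = [0] * (len(p) + len(s) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(s):
            out[i + j] += a * b
    return out


def poly_add(p, s):
    size = max(len(p), len(s))
    return [(p[i] if i < len(p) else 0) + (s[i] if i < len(s) else 0) for i in range(size)]


def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def unitriangular(entries):
    """Identity matrix plus the given off-diagonal polynomial entries (1-based keys)."""
    mat = [[[1] if i == j else [0] for j in range(SIZE)] for i in range(SIZE)]
    for (row, col), poly in entries.items():
        mat[row - 1][col - 1] = poly
    return mat


def mat_mul(a, b):
    out = []
    for i in range(SIZE):
        row = []
        for j in range(SIZE):
            acc = [0]
            for t in range(SIZE):
                acc = poly_add(acc, poly_mul(a[i][t], b[t][j]))
            row.append(trim(acc))
        out.append(row)
    return out


Q, Q2 = [0, 1], [0, 0, 1]  # coefficient lists of q and q^2
K = unitriangular({(1, 2): Q, (1, 7): Q2, (2, 4): Q, (2, 5): Q, (2, 7): Q,
                   (2, 8): Q2, (3, 8): [0, 1, 1], (4, 8): Q, (5, 8): Q})
K_INV = unitriangular({(1, 2): [0, -1], (1, 4): Q2, (1, 5): Q2, (1, 8): [0, 0, 0, -1],
                       (2, 4): [0, -1], (2, 5): [0, -1], (2, 7): [0, -1], (2, 8): Q2,
                       (3, 8): [0, -1, -1], (4, 8): [0, -1], (5, 8): [0, -1]})
IDENTITY = unitriangular({})

assert mat_mul(K, K_INV) == IDENTITY
```

The eighth column of `K_INV`, after the substitution $q \to q^{-1}$, supplies the coefficients for the weight $(1, 2, 1)$ in Eq. (6.13).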
Note that the inverse Kostka matrix has off-diagonal elements with both signs. As an example, we find that (by making use of Eq. (6.13))
\[
\mathrm{ch}_q V_{(1,2,1)} = \mathrm{ch}_q V_{(1,2,1)}(\zeta) - \frac{1}{q}\, \mathrm{ch}_q V_{(2,1,0)}(\zeta) - \frac{1}{q}\, \mathrm{ch}_q V_{(0,1,2)}(\zeta) - \Bigl(\frac{1}{q} + \frac{1}{q^2}\Bigr) \mathrm{ch}_q V_{(0,2,0)}(\zeta) + \frac{1}{q^2}\, \mathrm{ch}_q V_{(1,0,1)}(\zeta) - \frac{1}{q^3}\, \mathrm{ch}_q V_{(0,0,0)}(\zeta), \tag{6.20}
\]
with $\mathrm{ch}_q V_\mu(\zeta)$ given by Eq. (6.10).

7. Conclusion

The main purpose of this paper was to find explicit fermionic character formulæ for arbitrary integrable highest-weight modules of $\mathfrak{sl}_{r+1}$, using a generalization of the methods of Feigin and Stoyanovskiĭ [23]. Because the functional realization of the dual space for non-rectangular highest weights is too complex for computation of a fermionic character (see Sect. 3.3.2), we did not compute purely fermionic characters, which would have the nice feature that they are manifestly power series in $q$, with non-negative coefficients. Instead, we found explicit character formulæ as a finite sum of fermionic characters with coefficients in $\mathbb{Z}[q^{-1}]$. To obtain these explicit characters, we used the following strategy: we computed the fermionic character formula for the (non level-restricted) fusion product of $N$ integrable modules with rectangular highest weights $\mu_p = a_p\omega_{\alpha_p}$, Eq. (6.11), and of the space of
conformal blocks associated with this fusion product, the generalized Kostka polynomial of Theorem 5.11. We thus provided a proof of the conjecture of Feigin and Loktev [8], concerning the relation between their graded tensor product and the generalized Kostka polynomials [22, 16] in this case. It is also a direct proof of the independence of the dimension of the FL-fusion product of the evaluation parameters (the points ζp ), since the associated graded space whose character we computed corresponds to the limit ζp → 0 for all p. We then used the characters for the special case of these fusion products, together with the relation (6.12), to obtain a formula for the characters of integrable modules of slr+1 of arbitrary (non-rectangular) highest weight, in terms of the inverse matrix of certain generalized Kostka polynomials, see Theorem 6.3. The generalization of the discussion in this paper to other simple Lie algebras requires us to consider the so-called Kirillov-Reshetikhin modules (or rather, their limit to the loop algebra case, as KR-modules were originally defined for Yangians). These take the place of irreducible g-modules with rectangular highest weights but as g-modules, they are not necessarily irreducible. We will explain this generalization in an upcoming publication. Acknowledgements. The work of E.A. is supported by NSF grants numbers DMR-04-42537 and DMR01-32990; that of M.S. by NSF grant DMR-01-32990. R.K. would like to thank B. Feigin and S. Loktev for many useful discussions.
References
1. Ardonne, E., Bouwknegt, P., Schoutens, K.: Non-abelian quantum Hall states—exclusion statistics, K-matrices, and duality. J. Stat. Phys. 102(3–4), 421–469 (2001)
2. Ardonne, E., Kedem, R., Stone, M.: Filling the bose sea: symmetric quantum Hall edge states and affine characters. J. Phys. A 38(3), 617–636 (2005)
3. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B 241(2), 333–380 (1984)
4. Feigin, B., Jimbo, M., Kedem, R., Loktev, S., Miwa, T.: Spaces of coinvariants and fusion product, affine $\mathfrak{sl}_2$ character formulas in terms of Kostka polynomials. Duke Math. J. 125(3), 549–588 (2004)
5. Feigin, B., Jimbo, M., Loktev, S., Miwa, T., Mukhin, E.: Addendum to: "Bosonic formulas for (k, l)-admissible partitions". Ramanujan J. 7(4), 519–530 (2003)
6. Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{\mathfrak{sl}}_2$ spaces of coinvariants. Transform. Groups 6(1), 25–52 (2001)
7. Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{\mathfrak{sl}}_2$ coinvariants: dual functional realization and recursion. Compositio Math. 134(2), 193–241 (2002)
8. Feigin, B., Loktev, S.: On generalized Kostka polynomials and the quantum Verlinde rule. In: Differential topology, infinite-dimensional Lie algebras, and applications, Volume 194 of Amer. Math. Soc. Transl. Ser. 2, Providence, RI: Amer. Math. Soc., 1999, pp. 61–79
9. Georgiev, G.: Combinatorial constructions of modules for infinite-dimensional Lie algebras, II. Parafermionic space. http://arxiv.org/list/math.QA/9504024, 1995
10. Georgiev, G.: Combinatorial constructions of modules for infinite-dimensional Lie algebras, I. Principal subspace. J. Pure Appl. Alg. 112(3), 247–286 (1996)
11. Kac, V.G.: Infinite-dimensional Lie algebras. Cambridge: Cambridge University Press, Third edition, 1990
12. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic quasi-particle representations for characters of $(G^{(1)})_1 \times (G^{(1)})_1 / (G^{(1)})_2$. Phys. Lett. B 304(3–4), 263–270 (1993)
13. Kedem, R., McCoy, B.M.: Construction of modular branching functions from Bethe's equations in the 3-state Potts chain. J. Stat. Phys. 71(5–6), 865–901 (1993)
14. Kirillov, A.N.: Ubiquity of Kostka polynomials. In: Physics and combinatorics 1999 (Nagoya), River Edge, NJ: World Sci. Publishing, 2001, pp. 85–200
15. Kirillov, A.N., Schilling, A., Shimozono, M.: A bijection between Littlewood-Richardson tableaux and rigged configurations. Selecta Math. (N.S.) 8(1), 67–135 (2002)
16. Kirillov, A.N., Shimozono, M.: A generalization of the Kostka-Foulkes polynomials. J. Alg. Combin. 15(1), 27–69 (2002)
17. Lepowsky, J., Primc, M.: Structure of the standard modules for the affine Lie algebra $A_1^{(1)}$. Volume 46 of Contemporary Mathematics. Providence, RI: Amer. Math. Soc., 1985
18. Macdonald, I.G.: Symmetric functions and Hall polynomials. Oxford Mathematical Monographs. New York: Clarendon Press Oxford University Press, Second edition, 1995. With contributions by A. Zelevinsky, Oxford Science Publications
19. Moore, G., Read, N.: Nonabelions in the fractional quantum Hall effect. Nucl. Phys. B 360(2–3), 362–396 (1991)
20. Primc, M.: Vertex operator construction of standard modules for $A_n^{(1)}$. Pac. J. Math. 162(1), 143–187 (1994)
21. Primc, M.: Loop modules in annihilating ideals of standard modules for affine Lie algebras. In: VII. Mathematikertreffen Zagreb-Graz (Graz, 1990), Volume 313 of Grazer Math. Ber., Karl-Franzens-Univ. Graz, 1991, pp. 39–44
22. Schilling, A., Ole Warnaar, S.: Inhomogeneous lattice paths, generalized Kostka polynomials and $A_{n-1}$ supernomials. Commun. Math. Phys. 202(2), 359–401 (1999)
23. Stoyanovskiĭ, A.V., Feĭgin, B.L.: Functional models of the representations of current algebras, and semi-infinite Schubert cells. Funkt. Anal. i Pril. 28(1), 68–90, 96 (1994)
24. Tsuchiya, A., Kanie, Y.: Vertex operators in conformal field theory on $\mathbb{P}^1$ and monodromy representations of braid group. In: Conformal field theory and solvable lattice models (Kyoto, 1986), Volume 16 of Adv. Stud. Pure Math., Boston, MA: Academic Press, 1988, pp. 297–372

Communicated by L. Takhtajan
Commun. Math. Phys. 264, 465–503 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1525-8
Communications in
Mathematical Physics
Decay of Solutions of the Wave Equation in the Kerr Geometry
F. Finster1, N. Kamran2, J. Smoller3, S.-T. Yau4
1 NWF I – Mathematik, Universität Regensburg, 93040 Regensburg, Germany. E-mail: [email protected]
2 Department of Math. and Statistics, McGill University, Montréal, Québec, Canada H3A 2K6. E-mail: [email protected]
3 Mathematics Department, The University of Michigan, Ann Arbor, MI 48109, USA. E-mail: [email protected]
4 Mathematics Department, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
Received: 18 April 2005 / Accepted: 10 August 2005 Published online: 1 March 2006 – © Springer-Verlag 2006
Abstract: We consider the Cauchy problem for the massless scalar wave equation in the Kerr geometry for smooth initial data compactly supported outside the event horizon. We prove that the solutions decay in time in L∞ loc . The proof is based on a representation of the solution as an infinite sum over the angular momentum modes, each of which is an integral of the energy variable ω on the real line. This integral representation involves solutions of the radial and angular ODEs which arise in the separation of variables. Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 2. Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . 3. Asymptotic Estimates for the Radial Equation . . . . . . . . . 3.1 Holomorphic families of radial solutions . . . . . . . . . 3.2 A continuous family of solutions near ω = 0 . . . . . . . 4. Global Estimates for the Radial Equation . . . . . . . . . . . . 4.1 The complex Riccati equation . . . . . . . . . . . . . . 4.2 Invariant disk estimates . . . . . . . . . . . . . . . . . . 4.3 Bounds for the Wronskian and the fundamental solutions 5. Contour Deformations to the Real Axis . . . . . . . . . . . . . 6. Energy Splitting Estimates . . . . . . . . . . . . . . . . . . . 7. An Integral Representation on the Real Axis . . . . . . . . . . 8. Proof of Decay . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Research supported in part by the Deutsche Forschungsgemeinschaft. Research supported by NSERC grant # RGPIN 105490-2004. Research supported in part by the NSF, Grant No. DMS-010-3998. Research supported in part by the NSF, Grant No. 33-585-7510-2-30.
1. Introduction

In this paper we study the long-time dynamics of massless scalar waves in the Kerr geometry. We prove that solutions of the Cauchy problem with smooth initial data which is compactly supported outside the event horizon decay in $L^\infty_{\mathrm{loc}}$. Our starting point is the integral representation for the propagator [5], which involves an integral over a complex contour in the energy variable $\omega$. In order to study the long-time dynamics, we must deform the contour to the real line. To this end, we carefully analyze the solutions of the associated radial and angular ODEs which arise in the separation of variables. In particular, we show that the integrand in our representation has no poles on the real axis. We call such poles radiant modes, because in a dynamical situation they would lead to continuous radiation coming out of the ergosphere.

We now set up some notation and state our main result. As in [5], we choose Boyer-Lindquist coordinates $(t, r, \vartheta, \varphi)$ with $r > 0$, $0 \le \vartheta \le \pi$, $0 \le \varphi < 2\pi$, in which the Kerr metric takes the form
\[ ds^2 = \frac{\Delta}{U}\,(dt - a \sin^2\vartheta\, d\varphi)^2 - U \left( \frac{dr^2}{\Delta} + d\vartheta^2 \right) - \frac{\sin^2\vartheta}{U}\,\bigl(a\, dt - (r^2+a^2)\, d\varphi\bigr)^2 \tag{1.1} \]
with
\[ U(r, \vartheta) = r^2 + a^2 \cos^2\vartheta, \qquad \Delta(r) = r^2 - 2Mr + a^2, \]
where $M$ and $aM$ denote the mass and the angular momentum of the black hole, respectively. We restrict attention to the non-extreme case $M^2 > a^2$, where the function $\Delta$ has two distinct zeros,
\[ r_0 = M - \sqrt{M^2 - a^2} \qquad\text{and}\qquad r_1 = M + \sqrt{M^2 - a^2}, \]
corresponding to the Cauchy and the event horizon, respectively. We consider only the region $r > r_1$ outside the event horizon, and thus $\Delta > 0$. The ergosphere is the region where the Killing vector $\frac{\partial}{\partial t}$ is space-like, that is where
\[ r^2 - 2Mr + a^2 \cos^2\vartheta < 0. \tag{1.2} \]
The ergosphere lies outside the event horizon $r = r_1$, and its boundary intersects the event horizon at the poles $\vartheta = 0, \pi$.

Theorem 1.1. Consider the Cauchy problem for the wave equation in the Kerr geometry for smooth initial data which is compactly supported outside the event horizon and has fixed angular momentum in the direction of the rotation axis of the black hole, i.e. for some $k \in \mathbb{Z}$,
\[ (\Phi_0, \partial_t \Phi_0) = e^{-ik\varphi}\,(\Phi_0, \partial_t \Phi_0)(r, \vartheta) \in C_0^\infty\bigl((r_1, \infty) \times S^2\bigr)^2. \]
Then the solution decays in $L^\infty_{\mathrm{loc}}((r_1, \infty) \times S^2)$ as $t \to \infty$.
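The horizon radii and the ergosphere boundary defined by (1.2) are easy to evaluate numerically. The following is a minimal sketch, assuming illustrative parameter values $M = 1$, $a = 0.6$; the function names `kerr_horizons` and `ergosphere_radius` are hypothetical helpers and not notation from the paper. It illustrates that the outer boundary of the ergosphere touches the event horizon at the pole $\vartheta = 0$ and lies outside it at the equator.

```python
import math

def kerr_horizons(M, a):
    """Return (r0, r1), the Cauchy and event horizon radii, for M^2 > a^2."""
    if not M * M > a * a:
        raise ValueError("non-extreme case requires M^2 > a^2")
    root = math.sqrt(M * M - a * a)
    return M - root, M + root

def ergosphere_radius(M, a, theta):
    """Outer root of r^2 - 2 M r + a^2 cos^2(theta) = 0, the boundary of (1.2)."""
    return M + math.sqrt(M * M - (a * math.cos(theta)) ** 2)

# Illustrative values only (assumption, not taken from the paper).
M, a = 1.0, 0.6
r0, r1 = kerr_horizons(M, a)
for theta in (0.0, math.pi / 4, math.pi / 2):
    # Distance between the ergosphere boundary and the event horizon.
    print(theta, ergosphere_radius(M, a, theta) - r1)
```

For these values the gap vanishes at $\vartheta = 0$ and is largest in the equatorial plane.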
The study of linear hyperbolic equations in a black hole geometry has a long history. Regge and Wheeler [11] considered the radial equation for metric perturbations of the Schwarzschild metric. In the late 1960s and early 1970s, Carter, Teukolsky and Chandrasekhar discovered that the equations describing scalar, Dirac, Maxwell and linearized gravitational fields in the Kerr geometry are separable into ordinary differential equations (see [2]). Much research has been done concerning the long-time behavior of the solutions of these equations, through both numerical and analytical methods. Price [9] gave arguments which indicated decay of solutions of the scalar wave equation in the Schwarzschild geometry. Press and Teukolsky [8] did a numerical study which strongly suggested the absence of unstable modes, and Whiting [10] later proved that for ω in the complex plane, such unstable modes cannot exist. This “mode stability” does not rule out that there might be unstable modes for real ω (what we call radiant modes). Furthermore, mode stability does not lead to any statement on the Cauchy problem. Finally, Kay and Wald [7] used energy estimates to prove a boundedness result for solutions of the scalar wave equation in the Schwarzschild geometry. Unfortunately, these energy methods cannot be used in a rotating black hole geometry, because the energy density is indefinite inside the ergosphere, making it impossible to introduce a positive definite conserved scalar product. This difficulty was dealt with in [5, 6], where Whiting’s mode stability result was combined with estimates for the resolvent and for the radial and angular ODEs. In [5] we established an integral representation which expresses the solution as a contour integral of an integrand involving the separated radial and angular eigenfunctions over a contour staying within a neighborhood located arbitrarily close to the real axis. This integral representation is the starting point of the present paper. 
After deforming the contours onto the real axis, we can prove decay using the Riemann-Lebesgue Lemma, similar to the case of the Dirac equation [4]. We remark that the decay result of our paper would not be expected to hold for a massive scalar field satisfying the Klein-Gordon equation, as indicated in [1].

Finally, we note that the problem considered here is closely related to one of the major open questions in general relativity, namely the problem of linearized stability of the Kerr metric. For the stability under metric perturbations one considers the equation for linearized gravitational waves, which can be identified with the general wave equation for spin $s = 2$ (see [2]). Thus replacing scalar waves ($s = 0$) by gravitational waves ($s = 2$), the above theorem would prove linearized stability of the Kerr metric. However, the analysis for $s = 2$ would be considerably more difficult due to the complexity of the linearized Einstein equations. Nevertheless, we regard this paper as a first step towards proving linearized stability of the Kerr metric.

2. Preliminaries

We recall a few constructions and results from [5, 6] which will be needed later on. As radial variable we usually work with the Regge-Wheeler variable $u \in \mathbb{R}$ defined by
\[ \frac{du}{dr} = \frac{r^2 + a^2}{\Delta}; \tag{2.1} \]
then $u = -\infty$ corresponds to the event horizon. It is most convenient to write the wave equation in the Hamiltonian form
\[ i\,\partial_t \Psi = H\,\Psi, \tag{2.2} \]
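where $\Psi = (\Phi, i\partial_t \Phi)$. The Hamiltonian can be written as
\[ H = \begin{pmatrix} 0 & 1 \\ A & \beta \end{pmatrix}, \tag{2.3} \]
where
\[ A = \frac{1}{\rho} \left[ -\frac{\partial}{\partial u}\,(r^2 + a^2)\,\frac{\partial}{\partial u} - \frac{\Delta}{r^2 + a^2}\,\Delta_{S^2} - \frac{a^2 k^2}{r^2 + a^2} \right], \tag{2.4} \]
\[ \beta = -\frac{2ak}{\rho} \left[ 1 - \frac{\Delta}{r^2 + a^2} \right], \tag{2.5} \]
\[ \rho = r^2 + a^2 - a^2 \sin^2\vartheta\, \frac{\Delta}{r^2 + a^2}. \tag{2.6} \]
The operators $A$ and $\beta$ are symmetric on the Hilbert space $L^2(\mathbb{R} \times S^2, d\mu)^2$ with the measure
\[ d\mu := \rho\; du\; d\cos\vartheta. \tag{2.7} \]
It is immediately verified that the Hamiltonian is symmetric with respect to the bilinear form
\[ <\!\Psi_1, \Psi_2\!> \;=\; \int_{\mathbb{R} \times S^2} \left\langle \Psi_1, \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix} \Psi_2 \right\rangle_{\mathbb{C}^2} d\mu. \tag{2.8} \]
As is worked out in detail in [5], the inner product $<\!\Psi, \Psi\!>$ is the physical energy of $\Psi$. Therefore, we refer to $<\!.,.\!>$ as the energy scalar product. The fact that the energy scalar product is not positive definite can be understood from the fact that the operator $A$ is not positive on $L^2(\mathbb{R} \times S^2, d\mu)$.

Using the ansatz
\[ \Phi(t, r, \vartheta, \varphi) = e^{-i\omega t - ik\varphi}\, R(r)\, \Theta(\vartheta), \tag{2.9} \]
the wave equation can be separated into an angular and a radial ODE,
\[ \mathcal{R}_{\omega,k}\, R_\lambda = -\lambda\, R_\lambda, \qquad \mathcal{A}_{\omega,k}\, \Theta_\lambda = \lambda\, \Theta_\lambda. \tag{2.10} \]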
Here the angular operator $\mathcal{A}_{\omega,k}$ is also called the spheroidal wave operator. The separation constant $\lambda$ is an eigenvalue of $\mathcal{A}_{\omega,k}$ and can thus be regarded as an angular quantum number. In [6] it was shown that if $\omega$ is in a small neighborhood of the real line, more precisely if
\[ \omega \in U_\varepsilon := \left\{ \omega \in \mathbb{C} \;\Big|\; |\mathrm{Im}\,\omega| < \frac{\varepsilon}{1 + |\mathrm{Re}\,\omega|} \right\}, \]
then for sufficiently small $\varepsilon > 0$ the angular operator $\mathcal{A}_{\omega,k}$ has a purely discrete spectrum $(\lambda_n)_{n \in \mathbb{N}}$ with corresponding one-dimensional eigenspaces which span the Hilbert space $L^2(S^2)$. We denote the projections onto the eigenspaces by $Q_n(k, \omega)$. These projections as well as the corresponding eigenvalues $\lambda_n$ are holomorphic in $\omega \in U_\varepsilon$. In analogy to the eigenvalues $l(l+1)$ of the Laplacian on the sphere, the angular eigenvalues $\lambda_n$ grow quadratically for large $n$ in the sense that there is a constant $C(k, \omega) > 0$ such that
\[ |\lambda_n(k, \omega)| \ge \frac{n^2}{C(k, \omega)} \qquad \text{for all } n \in \mathbb{N}. \tag{2.11} \]
We set
\[ \omega_0 = -\frac{ak}{r_1^2 + a^2} \tag{2.12} \]
with $r_1$ the event horizon and use the notation
\[ \Omega(\omega) = \omega - \omega_0. \tag{2.13} \]
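In order to bring the radial equation into a convenient form, we introduce a new radial function $\phi(r)$ by
\[ \phi(r) = \sqrt{r^2 + a^2}\; R(r). \]
Then in the Regge-Wheeler variable, the radial equation can be written as the "Schrödinger equation"
\[ \left( -\frac{d^2}{du^2} + V(u) \right) \phi(u) = 0 \tag{2.14} \]
with the potential
\[ V(u) = -\left( \omega + \frac{ak}{r^2 + a^2} \right)^2 + \frac{\lambda_n(\omega)\, \Delta}{(r^2 + a^2)^2} + \frac{1}{\sqrt{r^2 + a^2}}\; \partial_u^2 \sqrt{r^2 + a^2}. \tag{2.15} \]
In [5] we derived an integral representation for the solution of the Cauchy problem of the following form,
\[ \Psi(t, r, \vartheta, \varphi) = -\frac{1}{2\pi i} \sum_{k \in \mathbb{Z}} e^{-ik\varphi} \sum_{n \in \mathbb{N}} \lim_{\varepsilon \searrow 0} \left( \int_{C_\varepsilon} - \int_{\overline{C_\varepsilon}} \right) d\omega\; e^{-i\omega t}\, \bigl( Q_{k,n}(\omega)\, S_\infty(\omega)\, \Psi_0^k \bigr)(r, \vartheta). \tag{2.16} \]
Here the integration contour $C_\varepsilon$ must lie inside the set $U_\varepsilon$.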
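The potential (2.15) can be evaluated directly in the radial coordinate $r$ by using $\partial_u = \frac{\Delta}{r^2+a^2}\,\partial_r$ from (2.1). The sketch below is only an illustration: it assumes real values of $\omega$ and $\lambda_n$ (treated as plain input numbers, not computed from the angular equation), and the function names are hypothetical. It shows that $V$ tends to $-\Omega^2$ at the event horizon and to $-\omega^2$ at infinity, which is why the boundary conditions for the radial solutions involve $e^{\pm i\Omega u}$ and $e^{\pm i\omega u}$, respectively.

```python
import math

def delta(r, M, a):
    """Delta(r) = r^2 - 2 M r + a^2."""
    return r * r - 2.0 * M * r + a * a

def potential_V(r, omega, lam, k, M, a):
    """Evaluate (2.15) at a radius r > r1, with d/du = Delta/(r^2+a^2) d/dr."""
    s = r * r + a * a
    D = delta(r, M, a)
    # d/du sqrt(s) = D * r / s^(3/2); g_r is the r-derivative of that expression.
    g_r = ((2.0 * r - 2.0 * M) * r + D) / s**1.5 - 3.0 * D * r * r / s**2.5
    last = (D / s) * g_r / math.sqrt(s)      # (1/sqrt(s)) * d^2/du^2 sqrt(s)
    return -(omega + a * k / s) ** 2 + lam * D / (s * s) + last

# Illustrative values only (assumptions): real omega and a made-up eigenvalue lam.
M, a, k, omega, lam = 1.0, 0.6, 1, 0.5, 2.0
r1 = M + math.sqrt(M * M - a * a)
Omega = omega + a * k / (r1 * r1 + a * a)    # Omega = omega - omega_0, see (2.12), (2.13)

print(potential_V(r1 + 1e-9, omega, lam, k, M, a), -Omega**2)
print(potential_V(1e6, omega, lam, k, M, a), -omega**2)
```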
3. Asymptotic Estimates for the Radial Equation

3.1. Holomorphic families of radial solutions. In this section we fix the angular quantum numbers $k$, $n$ and consider solutions $\acute{\phi}$ and $\grave{\phi}$ of the Schrödinger equation (2.14) which satisfy the following asymptotic boundary conditions on the event horizon and at infinity, respectively,
\[ \lim_{u \to -\infty} e^{-i\Omega u}\, \acute{\phi}(u) = 1, \qquad \lim_{u \to -\infty} \bigl( e^{-i\Omega u}\, \acute{\phi}(u) \bigr)' = 0, \tag{3.1} \]
\[ \lim_{u \to \infty} e^{i\omega u}\, \grave{\phi}(u) = 1, \qquad \lim_{u \to \infty} \bigl( e^{i\omega u}\, \grave{\phi}(u) \bigr)' = 0. \tag{3.2} \]
These solutions were introduced in [5] for $\omega$ in the lower complex half plane intersected with $U_\varepsilon$. Here we will show that they are holomorphic in $\omega$, and we will extend their definition to a larger $\omega$-domain. More precisely, we prove the following two theorems.

Theorem 3.1. The solutions $\acute{\phi}$ are well-defined on the domain
\[ D = U_\varepsilon \cap \left\{ \omega \in \mathbb{C} \;\Big|\; \mathrm{Im}\,\omega \le \frac{r_1 - r_0}{2(r_1^2 + a^2)} \right\}. \]
They form a holomorphic family of solutions in the sense that for every fixed $u \in \mathbb{R}$ and $n \in \mathbb{N}$, the function $\acute{\phi}(u)$ is holomorphic in $\omega \in D$.

Theorem 3.2. For every angular momentum number $n$ there is an open set $E$ containing the real line except for the origin,
\[ E \supset E_0 := U_\varepsilon \cap \{ \omega \in \mathbb{C} \;|\; \mathrm{Im}\,\omega \le 0 \text{ and } \omega \ne 0 \}, \tag{3.3} \]
such that the solutions $\grave{\phi}$ are well-defined for all $\omega \in E$ and form a holomorphic family on $E$.

For the proofs we will rewrite the Schrödinger equation with boundary conditions (3.1, 3.2) as an integral equation (which in different contexts is called the Lippmann-Schwinger or Jost equation). Then we will perform a perturbation expansion and get estimates for all the terms of the expansion. To introduce the method, we begin with the solutions $\acute{\phi}$; the solutions $\grave{\phi}$ will be treated later with a similar technique. First we write the Schrödinger equation (2.14) in the form
\[ \left( -\frac{d^2}{du^2} - \Omega^2 \right) \acute{\phi}(u) = -W(u)\, \acute{\phi}(u) \tag{3.4} \]
with a potential $W = \Omega^2 + V(u)$ which vanishes at $u = -\infty$. We define the Green's function of the differential operator $-\partial_u^2 - \Omega^2$ by the distributional equation
\[ (-\partial_v^2 - \Omega^2)\, S(u, v) = \delta(u - v). \tag{3.5} \]
The Green's function is not unique; we choose it such that its support is contained in the region $v \le u$; i.e.
\[ S(u, v) = \Theta(u - v) \times \begin{cases} \dfrac{1}{2i\Omega} \left( e^{-i\Omega(u-v)} - e^{i\Omega(u-v)} \right) & \text{if } \Omega \ne 0 \\[1ex] v - u & \text{if } \Omega = 0. \end{cases} \tag{3.6} \]
(Here $\Theta$ denotes the Heaviside function defined by $\Theta(x) = 1$ if $x \ge 0$ and $\Theta(x) = 0$ otherwise.) We multiply (3.4) by the Green's function and integrate,
\[ \int_{-\infty}^{\infty} S(u, v)\, (-\partial_v^2 - \Omega^2)\, \bigl( \acute{\phi}(v) - e^{i\Omega v} \bigr)\, dv = -\int_{-\infty}^{\infty} S(u, v)\, W(v)\, \acute{\phi}(v)\, dv. \]
If we assume for the moment that $\acute{\phi}$ satisfies the desired boundary conditions (3.1), we can integrate by parts on the left and use (3.5). This gives the Lippmann-Schwinger equation
\[ \acute{\phi}(u) = e^{i\Omega u} - \int_{-\infty}^{u} S(u, v)\, W(v)\, \acute{\phi}(v)\, dv, \]
which in the context of potential scattering is also called the Jost equation (see e.g. [3]). Its significance lies in the fact that we can now easily perform a perturbation expansion in the potential $W$. Namely, taking for $\acute{\phi}$ the ansatz as the perturbation series
\[ \acute{\phi} = \sum_{l=0}^{\infty} \phi^{(l)}, \tag{3.7} \]
we are led to the iteration scheme
\[ \phi^{(0)}(u) = e^{i\Omega u}, \qquad \phi^{(l+1)}(u) = -\int_{-\infty}^{u} S(u, v)\, W(v)\, \phi^{(l)}(v)\, dv. \tag{3.8} \]
This iteration scheme can be used for constructing solutions of the Jost equation, and this will give us the functions $\acute{\phi}$ with the desired properties.

Proof of Theorem 3.1. Fix $\omega \in D$. As the potential $W$ is smooth in $r$ and vanishes on the event horizon, we know that $W$ has near $r_1$ the asymptotics $W = O(r - r_1)$. This means in the Regge-Wheeler variable (2.1) that $W$ decays exponentially as $u \to -\infty$. More precisely, there is a constant $c > 0$ such that
\[ |W(u)| \le c\, e^{\gamma u} \qquad \text{with} \qquad \gamma := \frac{r_1 - r_0}{r_1^2 + a^2}. \tag{3.9} \]
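The rate $\gamma$ in (3.9) can be read off from an explicit antiderivative of (2.1). Integrating $\frac{du}{dr} = \frac{r^2+a^2}{(r-r_0)(r-r_1)}$ by partial fractions gives, up to an additive constant, $u = r + \frac{r_1^2+a^2}{r_1-r_0}\log(r-r_1) - \frac{r_0^2+a^2}{r_1-r_0}\log(r-r_0)$, so that $r - r_1$ behaves like a constant times $e^{\gamma u}$ near the horizon. The sketch below checks both statements numerically; the parameter values and the choice of integration constant are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def horizons(M, a):
    root = math.sqrt(M * M - a * a)
    return M - root, M + root

def regge_wheeler_u(r, M, a):
    """An antiderivative of (r^2+a^2)/Delta for r > r1 (integration constant set to 0)."""
    r0, r1 = horizons(M, a)
    return (r
            + (r1 * r1 + a * a) / (r1 - r0) * math.log(r - r1)
            - (r0 * r0 + a * a) / (r1 - r0) * math.log(r - r0))

def decay_rate_gamma(M, a):
    """gamma of (3.9)."""
    r0, r1 = horizons(M, a)
    return (r1 - r0) / (r1 * r1 + a * a)

def horizon_ratio(r, M, a):
    """(r - r1) * exp(-gamma * u(r)); approaches a constant as r -> r1."""
    _, r1 = horizons(M, a)
    return (r - r1) * math.exp(-decay_rate_gamma(M, a) * regge_wheeler_u(r, M, a))

M, a = 1.0, 0.6          # illustrative values only
r, h = 3.0, 1e-6
numeric = (regge_wheeler_u(r + h, M, a) - regge_wheeler_u(r - h, M, a)) / (2 * h)
exact = (r * r + a * a) / (r * r - 2 * M * r + a * a)
print(numeric, exact)    # central difference versus the right side of (2.1)

_, r1 = horizons(M, a)
print(horizon_ratio(r1 + 1e-6, M, a), horizon_ratio(r1 + 1e-8, M, a))
```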
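Let us show inductively that
\[ |\phi^{(l)}(u)| \le \mu^l\, e^{-\mathrm{Im}\,\Omega\, u} \qquad \text{with} \qquad \mu := \frac{c\, e^{\gamma u}}{(\gamma - \mathrm{Im}\,\Omega - |\mathrm{Im}\,\Omega|)^2}. \tag{3.10} \]
In the case $l = 0$, the claim is obvious from (3.8). Thus assume that (3.10) holds for a given $l$. Then, estimating the integral equation in (3.8) using (3.9), we obtain
\[ |\phi^{(l+1)}(u)| \le c\, \mu^l \int_{-\infty}^{u} |S(u, v)|\; e^{(\gamma - \mathrm{Im}\,\Omega)\, v}\, dv. \tag{3.11} \]
The Green's function (3.6) can be estimated in the case $v \le u$ by
\[ |S(u, v)| = \frac{u - v}{2} \left| \int_{-1}^{1} e^{-i\Omega (u-v)\, \tau}\, d\tau \right| \le (u - v)\; e^{|\mathrm{Im}\,\Omega|\, (u-v)}. \]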
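The bound on the Green's function can be spot-checked numerically. The following sketch evaluates $S(u,v)$ from (3.6) for a few complex values of $\Omega$ and separations $d = u - v \ge 0$ and compares with $(u-v)\,e^{|\mathrm{Im}\,\Omega|(u-v)}$. The sample grid is an arbitrary choice made for illustration, and the function names are hypothetical.

```python
import cmath
import math

def green_S(Omega, d):
    """S(u, v) of (3.6) for d = u - v >= 0, where the Heaviside factor equals 1."""
    if Omega == 0:
        return complex(-d)                  # the case Omega = 0: S = v - u
    return (cmath.exp(-1j * Omega * d) - cmath.exp(1j * Omega * d)) / (2j * Omega)

def green_bound(Omega, d):
    """Right side of the estimate: (u - v) * exp(|Im Omega| (u - v))."""
    return d * math.exp(abs(Omega.imag) * d)

# Arbitrary sample points (assumption), including Omega = 0.
samples = [(complex(x, y), d)
           for x in (-2.0, 0.0, 0.3, 1.5)
           for y in (-0.4, 0.0, 0.25)
           for d in (0.0, 0.1, 1.0, 7.5)]

violations = [(Om, d) for Om, d in samples
              if abs(green_S(Om, d)) > green_bound(Om, d) * (1 + 1e-12) + 1e-15]
print(len(samples), len(violations))
```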
Substituting this inequality in (3.11) gives
\[ |\phi^{(l+1)}(u)| \le c\, \mu^l\, e^{|\mathrm{Im}\,\Omega|\, u} \int_{-\infty}^{u} (u - v)\; e^{(\gamma - \mathrm{Im}\,\Omega - |\mathrm{Im}\,\Omega|)\, v}\, dv. \]
Since the parameter $\alpha := \gamma - \mathrm{Im}\,\Omega - |\mathrm{Im}\,\Omega|$ is positive according to the definition of $D$, we can carry out the integral as follows,
\[ \int_{-\infty}^{u} (u - v)\, e^{\alpha v}\, dv = \left( u - \frac{d}{d\alpha} \right) \int_{-\infty}^{u} e^{\alpha v}\, dv = \left( u - \frac{d}{d\alpha} \right) \frac{e^{\alpha u}}{\alpha} = \frac{e^{\alpha u}}{\alpha^2}. \]
This gives (3.10) with $l$ replaced by $l + 1$.

For $u$ on a compact interval, the analytic dependence of the solutions on $\omega$ (through the coefficients and the initial conditions) follows immediately from the Picard-Lindelöf Theorem. Hence it suffices to consider the region $u < u_0$ for any $u_0 \in \mathbb{R}$. By choosing $u_0$ sufficiently small (that is, sufficiently negative), we can arrange that $\mu < 1/2$ for all $u < u_0$. Then the estimate (3.10) shows that the perturbation series (3.7) converges absolutely, uniformly in $u \in (-\infty, u_0)$. Using similar estimates for the $u$-derivatives of $\phi^{(l)}$, one sees furthermore that the perturbation series (3.7) can be differentiated term by term, and using (3.5) we find that $\acute{\phi}$ is indeed a solution of (3.4). Furthermore,
\[ \acute{\phi}(u) - e^{i\Omega u} = \sum_{l=1}^{\infty} \phi^{(l)}(u), \]
and taking the limit $u \to -\infty$ and using (3.10) we find that the right side goes to zero. Using the same argument for the first derivatives, we obtain (3.1). In order to prove that $\acute{\phi}$ is analytic in $\omega$, we first note that if $\Omega \ne 0$, we can differentiate the perturbation series (3.7) term by term and verify that the Cauchy-Riemann equations are satisfied (note that $\lambda_n$ is holomorphic in $\omega$ according to [6]). Since $\acute{\phi}$ is bounded near $\Omega = 0$, it is also analytic at $\Omega = 0$.

We turn to the solutions $\grave{\phi}$. In analogy to (3.4), we now write the Schrödinger equation as
\[ \left( -\frac{d^2}{du^2} - \omega^2 \right) \grave{\phi}(u) = -W(u)\, \grave{\phi}(u) \tag{3.12} \]
with
\[ W(u) = -\omega\, \frac{2ak}{r^2 + a^2} - \frac{(ak)^2}{(r^2 + a^2)^2} + \frac{\lambda_n\, \Delta}{(r^2 + a^2)^2} + \frac{1}{\sqrt{r^2 + a^2}}\; \partial_u^2 \sqrt{r^2 + a^2}. \tag{3.13} \]
Assuming that $\omega \ne 0$, we choose the Green's function as
\[ S(u, v) = \frac{1}{2i\omega} \left( e^{-i\omega(v-u)} - e^{i\omega(v-u)} \right) \Theta(v - u). \tag{3.14} \]
The corresponding Jost equation is
\[ \grave{\phi}(u) = e^{-i\omega u} - \int_{u}^{\infty} S(u, v)\, W(v)\, \grave{\phi}(v)\, dv. \]
The perturbation series ansatz
\[ \grave{\phi} = \sum_{l=0}^{\infty} \phi^{(l)} \tag{3.15} \]
leads to the iteration scheme
\[ \phi^{(0)}(u) = e^{-i\omega u}, \qquad \phi^{(l+1)}(u) = -\int_{u}^{\infty} S(u, v)\, W(v)\, \phi^{(l)}(v)\, dv. \tag{3.16} \]
Note that, in contrast to the exponential decay (3.9), now the potential $W$, (3.13), has only polynomial decay. As a consequence, the iteration scheme allows us to construct $\grave{\phi}$ only inside the set $E_0$ as defined in (3.3).

Lemma 3.3. The solutions $\grave{\phi}$ are well-defined for every $\omega \in E_0$. They form a holomorphic family in the interior of $E_0$.

Proof. Fix $\omega \in E_0$. Then $\omega \ne 0$ and $\mathrm{Im}\,\omega \le 0$, and this allows us to estimate the potential (3.13) and the Green's function (3.14) for $u, v > u_0$ and some $u_0 > 0$ by
\[ |W(v)| \le \frac{c}{v^2}, \qquad |S(u, v)| \le \frac{1}{|\omega|}\; e^{\mathrm{Im}\,\omega\, (u - v)}. \tag{3.17} \]
Let us show by induction that
\[ |\phi^{(l)}(u)| \le \frac{1}{l!} \left( \frac{c}{|\omega|\, u} \right)^l e^{\mathrm{Im}\,\omega\, u}. \]
For $l = 0$ this is obvious from (3.16), whereas the induction step follows by estimating the integral equation in (3.16) with (3.17),
\[ |\phi^{(l+1)}(u)| \le \frac{1}{l!} \left( \frac{c}{|\omega|} \right)^l \int_{u}^{\infty} \frac{1}{|\omega|}\; e^{\mathrm{Im}\,\omega\, (u-v)}\; \frac{c}{v^{2+l}}\; e^{\mathrm{Im}\,\omega\, v}\, dv = \frac{1}{(l+1)!} \left( \frac{c}{|\omega|\, u} \right)^{l+1} e^{\mathrm{Im}\,\omega\, u}. \]
Hence the perturbation series (3.15) converges absolutely, locally uniformly in $u$. It is straightforward to check that $\grave{\phi}$ satisfies the Schrödinger equation (3.12) with the correct boundary values (3.2). If $\mathrm{Im}\,\omega < 0$, one can differentiate the series (3.15) term by term with respect to $\omega$ and verify that the Cauchy-Riemann equations are satisfied.

It remains to analytically extend the solutions $\grave{\phi}$ for fixed $n$ to a neighborhood of any point $\omega_0 \in \mathbb{R} \setminus \{0\}$. To this end, we need good estimates of the derivatives of $\grave{\phi}$ with respect to $\omega$ and $u$. It is most convenient to work with the functions
\[ \psi^{(l)}(u) := (2i\omega)^l\, e^{i\omega u}\, \phi^{(l)}(u), \tag{3.18} \]
for which the iteration scheme (3.16) can be written as $\psi^{(0)} = 1$ and
\[ \psi^{(l+1)}(u) = \int_{u}^{\infty} \bigl( e^{-2i\omega(v-u)} - 1 \bigr)\, W(v)\, \psi^{(l)}(v)\, dv. \tag{3.19} \]
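Lemma 3.4. For every $\omega_0 \in \mathbb{R} \setminus \{0\}$ and $n \in \mathbb{N}$, there are positive constants $c$, $K$, $\delta$, such that for all $\omega \in E_0 \cap B_\delta(\omega_0)$ with $\mathrm{Im}\,\omega < 0$ and all $p, q, l \in \mathbb{N}$ the following inequality holds,
\[ \left| \left( \frac{\partial}{\partial \omega} \right)^{p} \left( \frac{\partial}{\partial u} \right)^{q} \psi^{(l)}(u) \right| \le c^{1+l+p}\, K^q\; p!\; q!\; \frac{1}{l!}\; \frac{1}{u^{l+q}}. \tag{3.20} \]

Proof. According to [6], $\lambda_n$ is holomorphic in a neighborhood of $\omega_0$, and thus (for example using the Cauchy integral formula) its derivatives can be bounded in $B_\delta(\omega_0)$ by
\[ |\partial_\omega^p \lambda_n(\omega)| \le p! \left( \frac{K}{2} \right)^{1+p} \]
for suitable $K > 0$. Since the potential $W$, (3.13), is also holomorphic in $r$ (in a suitable neighborhood of the positive real axis) and has quadratic decay, its derivatives can be estimated by
\[ |\partial_\omega^p\, \partial_u^q\, W(u)| \le \left( \frac{K}{2} \right)^{1+p+q} \frac{p!\; q!}{u^{2+q}}. \tag{3.21} \]
We choose $c$ so large that the following conditions hold,
\[ c > 16\, K, \qquad \frac{K\, e^{\frac{1}{K}}}{(\omega_0 - \delta)\, c} \le \frac{1}{2}. \tag{3.22} \]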
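We proceed to prove (3.20) by induction in $l$. For $l = 0$ there is nothing to prove. Thus assume that (3.20) holds for a given $l$. Using the induction hypothesis together with (3.21), we can then estimate the derivatives of the product $W \psi^{(l)}$ as follows,
\[ |\partial_\omega^p\, \partial_u^q\, (W \psi^{(l)})| \le \sum_{a=0}^{p} \sum_{b=0}^{q} \binom{p}{a} \binom{q}{b} \left( \frac{K}{2} \right)^{1+a+b} \frac{a!\; b!}{u^{2+b}} \times c^{1+l+p-a}\, K^{q-b}\; \frac{(p-a)!\; (q-b)!}{l!\; u^{l+q-b}} \le \frac{c^{1+l+p}\, K^{1+q}}{u^{2+l+q}}\; \frac{p!\; q!}{l!} \sum_{a=0}^{p} \left( \frac{K}{2c} \right)^{a} \sum_{b=0}^{q} \left( \frac{1}{2} \right)^{b}. \]
According to (3.22), the two remaining sums can be bounded by the geometric series $\sum_{m=0}^{\infty} 2^{-m} = 2$, and thus
\[ |\partial_\omega^p\, \partial_u^q\, (W \psi^{(l)})| \le 4\; \frac{c^{1+l+p}\, K^{1+q}}{u^{2+l+q}}\; \frac{p!\; q!}{l!}. \tag{3.23} \]
Next we differentiate the integral equation (3.19),
\[ \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) = \sum_{r=0}^{p} \binom{p}{r} \int_{-\infty}^{\infty} \partial_\omega^r\, \partial_u^q \Bigl[ \Theta(v - u)\, \bigl( e^{-2i\omega(v-u)} - 1 \bigr) \Bigr] \times \partial_\omega^{p-r} \bigl( W \psi^{(l)} \bigr)(v)\, dv \]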
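(note that, since $\mathrm{Im}\,\omega < 0$, the factor $e^{-2i\omega v}$ gives an exponential decay of the integrand as $v \to \infty$). After manipulating the partial derivatives as follows,
\[ \partial_\omega^r\, \partial_u^q \Bigl[ \Theta(v - u)\, \bigl( e^{-2i\omega(v-u)} - 1 \bigr) \Bigr] = (-\partial_v)^q \left[ \Theta(v - u) \left( \frac{v - u}{\omega}\, \partial_v \right)^{r} \bigl( e^{-2i\omega(v-u)} - 1 \bigr) \right], \]
the resulting $v$-derivatives can all be integrated by parts. The boundary terms drop out, and we obtain
\[ \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) = \sum_{r=0}^{p} \binom{p}{r} \int_{u}^{\infty} \bigl( e^{-2i\omega(v-u)} - 1 \bigr) \left( \partial_v\, \frac{u - v}{\omega} \right)^{r} \partial_v^q\, \partial_\omega^{p-r} \bigl( W \psi^{(l)} \bigr)(v)\, dv. \]
Since $\omega$ is in the lower half plane, we have the inequality $|e^{-2i\omega(v-u)}| \le 1$. We conclude that
\[ \left| \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) \right| \le 2 \sum_{r=0}^{p} \binom{p}{r} \int_{u}^{\infty} \left| \left\{ \partial_v\, \frac{u - v}{\omega} \right\}^{r} \partial_v^q\, \partial_\omega^{p-r} \bigl( W \psi^{(l)} \bigr)(v) \right| dv. \tag{3.24} \]
The $v$-derivatives in the curly brackets can act either on one of the factors $(u - v)$ or on the function $W \psi^{(l)}$. Taking into account the combinatorics, we obtain
\[ \left| \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) \right| \le 2 \sum_{r=0}^{p} \sum_{s=0}^{r} \binom{p}{r}\, \frac{1}{\omega^r}\, \binom{r}{s}\, r^{r-s} \int_{u}^{\infty} (v - u)^s \left| \partial_v^{q+s}\, \partial_\omega^{p-r}\, \bigl( W \psi^{(l)} \bigr) \right| dv. \]
Using (3.23), we get
\[ \left| \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) \right| \le 8 \sum_{r=0}^{p} \sum_{s=0}^{r} \frac{p!\; (q+s)!}{s!\; (r-s)!\; l!}\; \omega^{-r}\, r^{r-s}\, c^{1+l+p-r}\, K^{1+q+s} \times \int_{u}^{\infty} \frac{(v - u)^s}{v^{2+l+q+s}}\, dv. \]
Introducing the new variable $\tau = \frac{u}{v}$, the integral can be computed with iterative integrations by parts,
\[ \int_{u}^{\infty} \frac{(v - u)^s}{v^{2+l+q+s}}\, dv = \frac{1}{u^{1+l+q}} \int_{0}^{1} (1 - \tau)^s\, \tau^{l+q}\, d\tau = \frac{1}{u^{1+l+q}}\; \frac{(l+q)!}{(l+q+s)!} \int_{0}^{1} (1 - \tau)^s\, \frac{d^s}{d\tau^s}\, \tau^{l+q+s}\, d\tau \]
\[ = \frac{1}{u^{1+l+q}}\; \frac{(l+q)!\; s!}{(l+q+s)!} \int_{0}^{1} \tau^{l+q+s}\, d\tau = \frac{1}{u^{1+l+q}}\; \frac{(l+q)!\; s!}{(1+l+q+s)!}. \]
We thus obtain
\[ \left| \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) \right| \le 8\; \frac{c^{1+l+p}\, K^{1+q}}{u^{1+l+q}} \sum_{r=0}^{p} \sum_{s=0}^{r} \frac{p!\; (q+s)!}{(r-s)!\; l!}\; \frac{K^s\, r^{r-s}}{(\omega c)^r}\; \frac{(l+q)!}{(1+l+q+s)!}. \]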
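The value of the $\tau$-integral obtained above by repeated integration by parts is the classical Beta integral $\int_0^1 (1-\tau)^s \tau^m\, d\tau = \frac{m!\, s!}{(1+m+s)!}$ with $m = l + q$. As an independent check, the sketch below evaluates the integral exactly by expanding $(1-\tau)^s$ with the binomial theorem and integrating term by term in rational arithmetic. The function names are hypothetical, and the tested index ranges are an arbitrary choice.

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral_exact(s, m):
    """Integral of (1 - tau)^s * tau^m over [0, 1], by binomial expansion of (1 - tau)^s."""
    return sum(Fraction((-1) ** j * comb(s, j), m + j + 1) for j in range(s + 1))

def closed_form(s, m):
    """The value m! s! / (1 + m + s)! used in the proof, with m = l + q."""
    return Fraction(factorial(m) * factorial(s), factorial(1 + m + s))

# Compare both expressions on a small grid of exponents.
mismatches = [(s, m) for s in range(7) for m in range(9)
              if beta_integral_exact(s, m) != closed_form(s, m)]
print(mismatches)
```

Because the comparison uses `Fraction`, agreement is exact rather than up to rounding.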
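Using the elementary estimate
\[ \frac{(q+s)!\; (l+q)!}{(1+l+q+s)!} = \frac{q!}{q+l+1} \cdot \frac{q+1}{q+l+2} \cdots \frac{q+s}{q+l+s+1} \le \frac{q!}{l+1}, \]
we obtain
\[ \left| \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) \right| \le 8\; \frac{c^{1+l+p}\, K^{1+q}}{u^{1+l+q}}\; \frac{p!\; q!}{(l+1)!} \sum_{r=0}^{p} \left( \frac{K}{\omega c} \right)^{r} \sum_{s=0}^{r} \frac{1}{(r-s)!} \left( \frac{r}{K} \right)^{r-s}. \]
The last sum can be estimated by an exponential,
\[ \sum_{s=0}^{r} \frac{1}{(r-s)!} \left( \frac{r}{K} \right)^{r-s} = \sum_{a=0}^{r} \frac{1}{a!} \left( \frac{r}{K} \right)^{a} \le \sum_{a=0}^{\infty} \frac{1}{a!} \left( \frac{r}{K} \right)^{a} = \exp\left( \frac{r}{K} \right). \]
According to (3.22), we can now estimate the remaining sum over $r$ by a geometric series,
\[ \sum_{r=0}^{p} \left( \frac{K}{\omega c} \right)^{r} \exp\left( \frac{r}{K} \right) \le \sum_{r=0}^{\infty} \left( \frac{K\, e^{\frac{1}{K}}}{\omega c} \right)^{r} \le 2. \]
We thus obtain
\[ \left| \partial_\omega^p\, \partial_u^q\, \psi^{(l+1)}(u) \right| \le 16\; \frac{c^{1+l+p}\, K^{1+q}}{u^{1+l+q}}\; \frac{p!\; q!}{(l+1)!} \le \frac{c^{2+l+p}\, K^{q}}{u^{1+l+q}}\; \frac{p!\; q!}{(l+1)!}, \]
where in the last step we again used (3.22).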
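The elementary factorial estimate used in this proof, $\frac{(q+s)!\,(l+q)!}{(1+l+q+s)!} \le \frac{q!}{l+1}$, can be confirmed by brute force for small indices. The following sketch does this in exact rational arithmetic; the function names and the tested ranges are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def factorial_ratio(q, s, l):
    """Left side: (q+s)! (l+q)! / (1+l+q+s)!."""
    return Fraction(factorial(q + s) * factorial(l + q), factorial(1 + l + q + s))

def upper_bound(q, l):
    """Right side: q! / (l+1)."""
    return Fraction(factorial(q), l + 1)

# Collect any index triple (q, s, l) for which the inequality would fail.
failures = [(q, s, l)
            for q in range(8) for s in range(8) for l in range(8)
            if factorial_ratio(q, s, l) > upper_bound(q, l)]
print(failures)

# Equality holds for q = 0, s = 0: both sides equal 1/(l+1).
print(factorial_ratio(0, 0, 3), upper_bound(0, 3))
```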
Proof of Theorem 3.2. According to (3.15, 3.18),
\[ \grave{\phi}(\omega, u) = e^{-i\omega u} \sum_{l=0}^{\infty} \frac{1}{(2i\omega)^l}\; \psi^{(l)}(\omega, u). \]
Expanding $\psi^{(l)}$ in a Taylor series in $\omega$, we obtain the formal expansion
\[ \grave{\phi}(\omega + \zeta, u) = e^{-i\omega u} \sum_{l=0}^{\infty} \frac{1}{(2i(\omega + \zeta))^l} \sum_{p=0}^{\infty} \frac{\zeta^p}{p!}\; \partial_\omega^p\, \psi^{(l)}(\omega, u). \]
Lemma 3.4 allows us to estimate this expansion for every $\omega \in E_0 \cap B_\delta(\omega_0)$ with $\mathrm{Im}\,\omega < 0$ as follows,
\[ |\grave{\phi}(\omega + \zeta, u)| \le c \sum_{l=0}^{\infty} \frac{1}{l!} \left( \frac{c}{|\omega + \zeta|\, u} \right)^{l} \sum_{p=0}^{\infty} (c\, |\zeta|)^p. \]
This expansion converges uniformly for $|\zeta| < \frac{1}{2c}$. Similarly, one can show that the series of $\zeta$-derivatives also converge uniformly. Hence we can interchange differentiation with summation, and a straightforward calculation shows that the Cauchy-Riemann equations are satisfied. Thus the above expansion allows us to extend $\grave{\phi}$ analytically to the ball $|\zeta| < \frac{1}{2c}$. Since the constant $c$ is independent of $\mathrm{Im}\,\omega$, we thus obtain an analytic extension of $\grave{\phi}$ across the real line.
3.2. A continuous family of solutions near ω = 0. In Theorem 3.2 we made no statement about the behavior of the fundamental solutions $\grave{\phi}$ at $\omega = 0$. Indeed, we cannot expect the solutions to have a holomorphic extension in a neighborhood of $\omega = 0$. But at least, after suitable rescaling, these solutions have a well-defined limit at $\omega = 0$:

Theorem 3.5. For every angular momentum number $n$, there is a real solution $\phi_0$ of the Schrödinger equation (2.14) for $\omega = 0$ with the asymptotics
\[ \lim_{u \to \infty} u^{\mu - \frac{1}{2}}\, \phi_0(u) = \frac{\Gamma(\mu)}{\sqrt{\pi}} \qquad \text{with} \qquad \mu := \sqrt{\lambda_n(0) + \frac{1}{4}}. \tag{3.25} \]
This solution can be obtained as a limit of the solutions from Theorem 3.2, in the sense that for all $u \in \mathbb{R}$,
\[ \phi_0(u) = \lim_{E_0 \ni \omega \to 0} \omega^\mu\, \grave{\phi}(u) \qquad \text{and} \qquad \phi_0'(u) = \lim_{E_0 \ni \omega \to 0} \omega^\mu\, \grave{\phi}'(u). \]

Note that the $\lambda_n(0)$ are the eigenvalues of the Laplacian on the sphere. They are clearly non-negative, and thus the parameter $\mu$ in (3.25) is positive. Unfortunately, the function $\phi_0$ cannot be constructed with the iteration scheme (3.16) because if we put in the Green's function for $\omega = 0$ (which is obtained from (3.14) by taking the limit $\omega \to 0$), we get for $\phi^{(1)}$ the equation
\[ \phi^{(1)}(u) = \int_{u}^{\infty} (v - u)\, W(v)\, dv, \]
and since $W$ decays at infinity only quadratically, the integral diverges. To overcome this problem, we combine the quadratically decaying part of the potential with the unperturbed operator. More precisely, for any $\omega$ in the set
\[ F := \left\{ \omega \in \mathbb{C} \;|\; \mathrm{Im}\,\omega \le 0 \text{ and } |\omega| \le (16ak)^{-1} \right\}, \]
we write the Schrödinger equation as
\[ \left( -\frac{d^2}{du^2} + \frac{\mu^2 - \frac{1}{4}}{u^2} - \omega^2 \right) \phi(u) = -W(u)\, \phi(u), \]
where $\mu(\omega) = (\lambda_n(\omega) - 2ak\omega + \frac{1}{4})^{\frac{1}{2}}$. The potential $W$ is continuous in $\omega$ and bounded by
\[ |W(u)| \le \frac{c}{u^3} \qquad \text{for all } \omega \in F. \tag{3.26} \]
The solutions of the unperturbed Schrödinger equation can be expressed with Bessel functions,
\[ h_1(u) = \sqrt{\frac{\pi u}{2}}\; J_\mu(\omega u), \qquad h_2(u) = \sqrt{\frac{\pi u}{2}}\; Y_\mu(\omega u). \]
They have the following asymptotics,
\[ h_1(u) \sim \cos(\omega u), \qquad h_2(u) \sim \sin(\omega u) \qquad \text{if } \omega u \gg 1, \]
\[ h_1(u) \sim \frac{\sqrt{\pi}\, \omega^\mu}{\Gamma(\mu + 1)\, 2^{\mu + \frac{1}{2}}}\; u^{\mu + \frac{1}{2}}, \qquad h_2(u) \sim -\frac{\Gamma(\mu)\, 2^{\mu - \frac{1}{2}}}{\sqrt{\pi}\, \omega^\mu}\; u^{-\mu + \frac{1}{2}} \qquad \text{if } \omega u \ll 1. \]
The Green's function can be expressed in terms of the two fundamental solutions by the standard formula
\[ S(u, v) = \Theta(v - u)\; \frac{h_1(u)\, h_2(v) - h_1(v)\, h_2(u)}{w(h_1, h_2)}, \]
where $w(h_1, h_2) = h_1' h_2 - h_1 h_2' = -\omega$ is the Wronskian. The perturbation series ansatz
\[ \phi = \sum_{l=0}^{\infty} \phi^{(l)} \tag{3.27} \]
now leads to the integral equation
\[ \phi^{(l+1)}(u) = \int_{u}^{\infty} S(u, v)\, W(v)\, \phi^{(l)}(v)\, dv. \tag{3.28} \]
We choose the function $\phi^{(0)}$ such that its asymptotics at infinity is a multiple of the plane wave $e^{-i\omega u}$, whereas for $\omega = 0$, it has the asymptotics (3.25),
\[ \phi^{(0)}(u) = \omega^\mu\, (h_1 - i h_2)(u). \tag{3.29} \]
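Lemma 3.6. For any fixed $n$ there is $u_0 \in \mathbb{R}$ such that the iteration scheme (3.29, 3.28) converges uniformly for all $u > u_0$ and $\omega \in F$. The functions $\phi$ defined by (3.27) are solutions of the Schrödinger equation (2.14) with the asymptotics
\[ \left| \frac{\phi(u)}{\phi^{(0)}(u)} - 1 \right| \le \frac{c}{u} \]
and a constant $c = c(n)$.

Proof. Using the asymptotic formulas for the Bessel functions, one sees (similar to the estimate [3, Eq. (4.4)] for $\mu = l + \frac{1}{2}$ and integer $l$) that for all $v \ge u$ and $\omega \in F$, the Green's function is bounded by
\[ |S(u, v)| \le C \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \left( \frac{v}{1 + |\omega|\, v} \right)^{\mu + \frac{1}{2}} e^{\mathrm{Im}\,\omega\, (v - u)}. \tag{3.30} \]
Similarly, we can bound the Bessel functions in (3.29) to get
\[ \frac{1}{C} \le |\phi^{(0)}|\; e^{-\mathrm{Im}\,\omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{\mu - \frac{1}{2}} \le C. \tag{3.31} \]
Let us show inductively that
\[ |\phi^{(l)}| \le C\, e^{\mathrm{Im}\,\omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \left( \frac{Cc}{u} \right)^{l}. \tag{3.32} \]
For $l = 0$ there is nothing to prove. The induction step follows from (3.28, 3.26, 3.30),
\[ |\phi^{(l+1)}| \le C\, e^{-\mathrm{Im}\,\omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \int_{u}^{\infty} \frac{cC}{v^3}\; \frac{v}{1 + |\omega|\, v} \left( \frac{Cc}{v} \right)^{l} e^{2\, \mathrm{Im}\,\omega\, v}\, dv \]
\[ \le C\, e^{\mathrm{Im}\,\omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \left( \frac{Cc}{u} \right)^{l} \int_{u}^{\infty} \frac{cC}{v^2}\, dv. \]
The lemma now follows immediately from (3.32, 3.31) and by differentiating the series (3.27) with respect to $u$.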
Proof of Theorem 3.5. From the asymptotics at infinity, it is clear that
\[ \phi = \begin{cases} \omega^\mu\, \grave{\phi} & \text{if } \omega \ne 0 \\ \phi_0 & \text{if } \omega = 0. \end{cases} \]
Denoting the $\omega$-dependence of $\phi$ by a subscript, we thus need to prove that for all $u \in \mathbb{R}$,
\[ \lim_{F \ni \omega \to 0} \phi_\omega(u) = \phi_0(u), \qquad \lim_{F \ni \omega \to 0} \phi_\omega'(u) = \phi_0'(u). \tag{3.33} \]
To simplify the problem, we first note that for $u$ on compact intervals, the continuous dependence on $\omega$ follows immediately from the Picard-Lindelöf Theorem (i.e. the continuous dependence of solutions of ODEs on the coefficients and initial values). Thus it suffices to prove (3.33) for large $u$. Furthermore, writing the Schrödinger equation as
\[ (\partial_u - i\omega)(\partial_u + i\omega)\, \phi_\omega = -U \phi_\omega, \]
the potential $U$ has quadratic decay at infinity. Thus, after the substitution $(\partial_u - i\omega) = e^{i\omega u}\, \partial_u\, e^{-i\omega u}$, we can multiply the above equation by $e^{-i\omega u}$ and integrate to obtain
\[ e^{-i\omega u}\, (\partial_u + i\omega)\, \phi_\omega(u) = \int_{u}^{\infty} e^{-i\omega v}\, U(v)\, \phi_\omega(v)\, dv. \]
Here we emphasized the $\omega$-dependence by a subscript; note also that the integral is well-defined in view of the asymptotics of $\phi_\omega$ at infinity. This equation shows that $\phi_\omega'$ converges pointwise once we know that $\phi_\omega(u)$ converges uniformly in $u$. Hence it remains to show that for every $\varepsilon > 0$ there is $u_0$ and $\delta > 0$ such that for all $\omega \in F$ with $|\omega| < \delta$,
\[ |\phi_\omega(u) - \phi_0(u)| < \varepsilon \qquad \text{for all } u > u_0. \tag{3.34} \]
To prove (3.34) we use the uniform convergence of the functions $\phi_\omega^{(0)}$, (3.29), to choose $\delta$ such that for all $\omega \in F$ with $|\omega| < \delta$,
\[ |\phi_\omega^{(0)}(u) - \phi_0^{(0)}(u)| < \frac{\varepsilon}{3} \qquad \text{for all } u > u_0. \]
According to Lemma 3.6, we can by choosing $u_0$ sufficiently large arrange that
\[ |\phi_\omega^{(0)}(u) - \phi_\omega(u)| < \frac{\varepsilon}{3} \qquad \text{for all } u > u_0 \text{ and } \omega \in F. \]
Now (3.34) follows immediately from the estimate
\[ |\phi_\omega - \phi_0| \le |\phi_\omega - \phi_\omega^{(0)}| + |\phi_\omega^{(0)} - \phi_0^{(0)}| + |\phi_0^{(0)} - \phi_0|. \]
4. Global Estimates for the Radial Equation

Let $Y_1$ and $Y_2$ be two real fundamental solutions of the Schrödinger equation (2.14) for a general real and smooth potential $V$. Then their Wronskian
\[ w := Y_1'(u)\, Y_2(u) - Y_1(u)\, Y_2'(u) \tag{4.1} \]
is a constant. By flipping the sign of $Y_2$, we can always arrange that $w < 0$. We combine the two real solutions into the complex function $z = Y_1 + iY_2$, and denote its polar decomposition by
\[ z = \rho\, e^{i\varphi} \tag{4.2} \]
with real functions $\rho(u) \ge 0$ and $\varphi(u)$. By linearity, $z$ is a solution of the complex Schrödinger equation
\[ z'' = V z. \tag{4.3} \]
Note that $z$ has no zeros because at every $u$ at least one of the fundamental solutions does not vanish.

4.1. The complex Riccati equation. We introduce the function $y$ by
\[ y = \frac{z'}{z}. \tag{4.4} \]
Since $z$ has no zeros, the function $y$ is smooth. Moreover, it satisfies the complex Riccati equation
\[ y' + y^2 = V. \tag{4.5} \]
The fact that the solutions of the complex Riccati equation are smooth will be helpful for getting estimates. Conversely, from a solution of the Riccati equation one obtains the corresponding solution of the Schrödinger equation by integration,
\[ \log z \Big|_u^v = \int_u^v y. \tag{4.6} \]
Using (4.2) in (4.4) gives separate equations for the amplitude and phase of $z$,
\[ \rho' = \rho\; \mathrm{Re}\, y, \qquad \varphi' = \mathrm{Im}\, y, \]
and integration gives
\[ \log \rho \Big|_u^v = \int_u^v \mathrm{Re}\, y, \tag{4.7} \]
\[ \varphi \Big|_u^v = \int_u^v \mathrm{Im}\, y. \tag{4.8} \]
Furthermore, the Wronskian (4.1) gives a simple algebraic relation between $\rho$ and $y$. Namely, $w$ can be expressed by $w = -\mathrm{Im}\,(\overline{z}\, z') = -\rho^2\; \mathrm{Im}\, y$ and thus
\[ \rho^2 = -\frac{w}{\mathrm{Im}\, y}. \tag{4.9} \]
Since $\rho^2$ is positive and $w$ is negative, we see that
\[ \mathrm{Im}\, y(u) > 0 \qquad \text{for all } u. \tag{4.10} \]
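The relations (4.4)-(4.10) can be illustrated on an explicitly solvable toy case. The sketch below assumes a constant potential $V = -\kappa^2$ (an assumption made purely for illustration; it is not the Kerr potential) with the real fundamental solutions $Y_1 = \cos \kappa u$ and $Y_2 = \cos \kappa u + 2 \sin \kappa u$, for which $w = -2\kappa < 0$. It checks the Riccati equation (4.5) by a central difference and verifies (4.9) and (4.10) at a few sample points. All names are hypothetical.

```python
import math

KAPPA = 1.3   # illustrative value; V = -KAPPA**2 is a toy constant potential

def z(u):
    """z = Y1 + i Y2 with Y1 = cos(k u), Y2 = cos(k u) + 2 sin(k u)."""
    c, s = math.cos(KAPPA * u), math.sin(KAPPA * u)
    return complex(c, c + 2.0 * s)

def dz(u):
    """Derivative z' = Y1' + i Y2'."""
    c, s = math.cos(KAPPA * u), math.sin(KAPPA * u)
    return complex(-KAPPA * s, -KAPPA * s + 2.0 * KAPPA * c)

def y(u):
    """Riccati variable (4.4)."""
    return dz(u) / z(u)

def wronskian(u):
    """w = Y1' Y2 - Y1 Y2', see (4.1)."""
    zz, dd = z(u), dz(u)
    return dd.real * zz.imag - zz.real * dd.imag

def riccati_residual(u, h=1e-5):
    """y' + y^2 - V, with y' approximated by a central difference."""
    dy = (y(u + h) - y(u - h)) / (2.0 * h)
    return dy + y(u) ** 2 + KAPPA ** 2

for u in (-2.0, 0.0, 0.7, 3.1):
    rho_sq = abs(z(u)) ** 2
    print(u, wronskian(u), y(u).imag, rho_sq * y(u).imag, abs(riccati_residual(u)))
```

At every sample point the imaginary part of $y$ is positive and $\rho^2\, \mathrm{Im}\, y$ reproduces $-w = 2\kappa$.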
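4.2. Invariant disk estimates. We now explain a method for getting estimates for the complex Riccati equation. This method was first used in [6] for estimates in the case where the potential is negative (Lemma 4.1). Here we extend the method to the situation when the potential is positive (Lemma 4.2). For sake of clarity, we develop the method again from the beginning, but we point out that the proof of Lemma 4.1 is taken from [6].

Let $y(u)$ be a solution of the complex Riccati equation (4.5). We want to estimate the Euclidean distance of $y$ to a given curve $m(u) = \alpha + i\beta$ in the complex plane. A direct calculation using (4.5) gives
\[ \begin{aligned} \frac{1}{2}\, \frac{d}{du}\, |y - m|^2 &= (\mathrm{Re}\, y - \alpha)\, (\mathrm{Re}\, y - \alpha)' + (\mathrm{Im}\, y - \beta)\, (\mathrm{Im}\, y - \beta)' \\ &= (\mathrm{Re}\, y - \alpha) \left[ V - (\mathrm{Re}\, y)^2 + (\mathrm{Im}\, y)^2 - \alpha' \right] - (\mathrm{Im}\, y - \beta) \left[ 2\, \mathrm{Re}\, y\; \mathrm{Im}\, y + \beta' \right] \\ &= (\mathrm{Re}\, y - \alpha) \left[ V - (\mathrm{Re}\, y)^2 - (\mathrm{Im}\, y)^2 + 2\beta\, \mathrm{Im}\, y - \alpha' \right] + (\mathrm{Re}\, y - \alpha)\; 2 (\mathrm{Im}\, y - \beta)\, \mathrm{Im}\, y \\ &\quad - (\mathrm{Im}\, y - \beta) \left[ \beta' + 2\alpha\, \mathrm{Im}\, y \right] - (\mathrm{Im}\, y - \beta)\; 2 (\mathrm{Re}\, y - \alpha)\, \mathrm{Im}\, y \\ &= (\mathrm{Re}\, y - \alpha) \left[ V - (\mathrm{Re}\, y - \alpha)^2 - (\mathrm{Im}\, y - \beta)^2 - \alpha^2 + \beta^2 - \alpha' \right] \\ &\quad - (\mathrm{Im}\, y - \beta) \left[ \beta' + 2\alpha\beta \right] - 2\alpha \left[ (\mathrm{Re}\, y - \alpha)^2 + (\mathrm{Im}\, y - \beta)^2 \right]. \end{aligned} \]
Choosing polar coordinates centered at $m$,
\[ y = m + R\, e^{i\varphi}, \qquad R := |y - m|, \]
we obtain the following differential equation for $R$,
\[ R' + 2\alpha R = \cos\varphi \left[ V - R^2 - \alpha^2 + \beta^2 - \alpha' \right] - \sin\varphi \left[ \beta' + 2\alpha\beta \right]. \tag{4.11} \]
In order to use this equation for estimates, we assume that $\alpha$ is a given function (to be determined later). With the abbreviations
\[ U = V - \alpha^2 - \alpha' \qquad \text{and} \qquad \sigma(u) = \exp\left( 2 \int_0^u \alpha \right), \tag{4.12} \]
the ODE (4.11) can then be written as
\[ (\sigma R)' = \sigma \left[ U - R^2 + \beta^2 \right] \cos\varphi - (\sigma\beta)' \sin\varphi. \]
To further simplify the equation, we want to arrange that the square bracket vanishes. If $U$ is negative, this can be achieved by the ansatz
\[ \beta = \frac{\sqrt{|U|}}{2} \left( T + \frac{1}{T} \right), \qquad R = \frac{\sqrt{|U|}}{2} \left( T - \frac{1}{T} \right) \qquad (U < 0), \tag{4.13} \]
with $T > 1$ a free function. In the case $U > 0$, we make similarly the ansatz
\[ \beta = \frac{\sqrt{U}}{2} \left( T - \frac{1}{T} \right), \qquad R = \frac{\sqrt{U}}{2} \left( T + \frac{1}{T} \right) \qquad (U > 0) \tag{4.14} \]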
with a function $T > 0$. Using (4.13, 4.14), the ODE (4.11) reduces to the simple equation
\[ (\sigma R)' = -(\sigma\beta)' \sin\varphi. \]
If we now replace this equation by a strict inequality,
\[ (\sigma R)' > -(\sigma\beta)' \sin\varphi, \tag{4.15} \]
with $R$ a general positive function, the inequality $|y - m| \le R$ will be preserved as $u$ increases. In other words, the disk $B_R(m)$ will be an invariant region for the flow of $y$. In the next two lemmas we specify the function $T$ in the cases $U < 0$ and $U > 0$, respectively. To avoid confusion, we note that it is only a matter of convenience to state the lemmas on the interval $[0, u_{\max}]$; by translation we can later immediately apply the lemmas on any closed interval.

Fig. 1. Invariant disk estimate for U < 0

Lemma 4.1. Let $\alpha$ be a real function on $[0, u_{\max}]$ which is continuous and piecewise $C^1$, such that the corresponding function $U$, (4.12), is negative,
\[ U \le 0 \qquad \text{on } [0, u_{\max}]. \]
For a constant $T_0 \ge 1$ we introduce the function $T$ by
\[ T(u) = T_0 \exp\left( \frac{1}{2}\; \mathrm{TV}_{[0,u)} \log |\sigma^2 U| \right), \tag{4.16} \]
define the functions $\beta$ and $R$ by (4.13) and set $m = \alpha + i\beta$. If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition
\[ |y - m| \le R, \tag{4.17} \]
then this condition holds for all $u \in [0, u_{\max}]$ (for illustration see Fig. 1).

Proof. For $\varepsilon > 0$ we set
\[ T_\varepsilon(u) = T_0 \exp\left( \frac{1}{2} \int_0^u \left| \frac{|\sigma^2 U|'}{|\sigma^2 U|} \right| + \varepsilon\, (1 - e^{-u}) \right) \tag{4.18} \]
Decay of Solutions of the Wave Equation in the Kerr Geometry
483
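and denote corresponding functions $\alpha$, $R$, $m$, and $\sigma$ by an additional subscript $\varepsilon$. Since $T_\varepsilon(0) = T(0)$ and $\lim_{\varepsilon \searrow 0} T_\varepsilon = T$, it suffices to show that for all $\varepsilon > 0$ the following statement holds,
\[ |y - m_\varepsilon|(0) \le R_\varepsilon(0) \quad \Longrightarrow \quad |y - m_\varepsilon|(u) \le R_\varepsilon(u) \ \text{ for all } u \in [0, u_{\max}]. \]
In differential form, we get the sufficient condition
\[ |y - m_\varepsilon|(u) = R_\varepsilon(u) \quad \Longrightarrow \quad |y - m_\varepsilon|'(u) < R_\varepsilon'(u). \]
According to (4.15), this last condition will be satisfied if
\[ (\sigma_\varepsilon R_\varepsilon)' > |(\sigma_\varepsilon \beta_\varepsilon)'| . \tag{4.19} \]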
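From now on we omit the subscripts $\varepsilon$. In order to prove (4.19), we first use (4.13, 4.12) to rewrite the functions $\sigma\beta$ and $\sigma R$ as
\[ \sigma\beta = \frac{1}{2} \left( \sqrt{|\sigma^2 U|}\, T + \sqrt{|\sigma^2 U|}\, T^{-1} \right), \qquad \sigma R = \frac{1}{2} \left( \sqrt{|\sigma^2 U|}\, T - \sqrt{|\sigma^2 U|}\, T^{-1} \right). \tag{4.20} \]
By definition of $T_\varepsilon$ (4.18),
\[ \frac{T'}{T} = \frac{1}{2} \left| \frac{|\sigma^2 U|'}{|\sigma^2 U|} \right| + \varepsilon e^{-u} . \]
It follows that
\[ \left( \sqrt{|\sigma^2 U|}\, T^{-1} \right)' = -\varepsilon e^{-u} \left( \sqrt{|\sigma^2 U|}\, T^{-1} \right) \quad \text{if } |\sigma^2 U|' \ge 0, \]
\[ \left( \sqrt{|\sigma^2 U|}\, T \right)' = \varepsilon e^{-u} \left( \sqrt{|\sigma^2 U|}\, T \right) \quad \text{if } |\sigma^2 U|' < 0. \]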
Hence when we differentiate through (4.20) and set $\varepsilon = 0$, either the first or the second summand drops out in each equation, and we obtain $(\sigma R)' = |(\sigma\beta)'|$. If $\varepsilon > 0$, an inspection of the signs of the additional terms gives (4.19).

Lemma 4.2. Let $\alpha$ be a real function on $[0, u_{\max}]$ which is continuous and piecewise $C^1$, such that the corresponding function $U$, (4.12), satisfies on $[0, u_{\max}]$ the conditions
\[ U \ge 0 \quad \text{and} \quad U' + 4 U \alpha \ge 0. \tag{4.21} \]
For a constant $T_0 \ge 0$ we introduce the function $T$ by
\[ T(u) = T_0 \sqrt{\frac{U(0)}{\sigma^2 U}}, \tag{4.22} \]
define the functions $\beta$ and $R$ by (4.14) and set $m = \alpha + i\beta$. If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition $|y - m| \le R$, then this condition holds for all $u \in [0, u_{\max}]$ (see Fig. 2). Furthermore,
\[ \mathrm{Re}\, y \ge \alpha - \sqrt{U} - T_0 \frac{\sqrt{U(0)}}{2\sigma} . \tag{4.23} \]
Fig. 2. Invariant disk estimate for U > 0, in the cases T > 1 (left) and T < 1 (right).
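Proof. For $\varepsilon > 0$ we set
\[ T_\varepsilon = T_0\, (\sigma^2 U)^{-\frac{1}{2}} (1 - \varepsilon e^{-u}). \]
Using (4.14, 4.12) we can write the functions $\sigma\beta$ and $\sigma R$ as
\[ \sigma\beta = -\frac{1}{2} \left( T_0^{-1} \sigma^2 U (1 - \varepsilon e^{-u})^{-1} - T_0 (1 - \varepsilon e^{-u}) \right), \]
\[ \sigma R = \frac{1}{2} \left( T_0^{-1} \sigma^2 U (1 - \varepsilon e^{-u})^{-1} + T_0 (1 - \varepsilon e^{-u}) \right), \]
where we again omitted the subscript $\varepsilon$. Differentiation gives
\[ (\sigma R)' > -(\sigma\beta)' = \frac{1}{2} \left( T_0^{-1} \sigma^2 U (1 - \varepsilon e^{-u})^{-1} \right)' - \frac{T_0}{2} \left( 1 - \varepsilon e^{-u} \right)' . \tag{4.24} \]
According to the second inequality in (4.21), the function $\sigma^2 U$ is strictly increasing and thus the expression on the right of (4.24) is positive for sufficiently small $\varepsilon$. Hence (4.19) is satisfied. Letting $\varepsilon \to 0$, we obtain that the circle $B_R(m)$ is invariant. In order to prove (4.23) we note that in the case $T < 1$ the inequality is obvious because even $\mathrm{Re}\, y \ge \alpha - \sqrt{U}$ (see Fig. 2). Thus we can assume $T \ge 1$, and the estimate
\[ \mathrm{Re}\, y \ge \alpha - R \ge \alpha - \frac{\sqrt{U}}{2} (T + 2) \]
together with (4.22) gives the claim.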
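The invariance stated in Lemma 4.2 can be probed numerically in the simplest case $\alpha \equiv 0$, where $\sigma \equiv 1$ and $U = V$. The following is a minimal sketch under stated assumptions, not part of the original argument: it assumes that the Riccati equation (4.5) has the form $y' = V - y^2$, and the sample potential `V`, the constant `tau` (which stands for the constant value of $\sqrt{U}\, T$ under (4.22)) and the initial values are hypothetical choices made only for illustration.

```python
import math

def V(u):
    # Hypothetical sample potential: positive and monotone increasing on [0, 2].
    return 1.0 + u

def disk(u, tau):
    # Lemma 4.2 with alpha = 0 (sigma = 1, U = V). By (4.22), sqrt(U) * T is the
    # constant tau, so (4.14) gives beta = (tau - U/tau)/2 and R = (tau + U/tau)/2.
    U = V(u)
    return 0.5 * (tau - U / tau), 0.5 * (tau + U / tau)

def riccati_rhs(u, y):
    # Assumed form of the complex Riccati equation (4.5): y' = V - y^2.
    return V(u) - y * y

def max_excess(y0, tau, u_max=2.0, steps=4000):
    # Integrate with the classical RK4 scheme and return the largest value of
    # |y - i*beta| - R seen on the grid; a non-positive value means that the
    # numerical solution never left the disk.
    h = u_max / steps
    u, y = 0.0, y0
    worst = -math.inf
    for _ in range(steps + 1):
        beta, R = disk(u, tau)
        worst = max(worst, abs(y - 1j * beta) - R)
        k1 = riccati_rhs(u, y)
        k2 = riccati_rhs(u + h / 2, y + h / 2 * k1)
        k3 = riccati_rhs(u + h / 2, y + h / 2 * k2)
        k4 = riccati_rhs(u + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        u += h
    return worst

if __name__ == "__main__":
    print(max_excess(0.2 + 0.5j, 1.5))
```

In this special case the radius and center satisfy $R^2 - \beta^2 = U$, which is the vanishing of the square bracket arranged by the ansatz (4.14).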
If the potential $V$ is monotone increasing, by choosing $\alpha \equiv 0$ we obtain the following simple estimate.

Corollary 4.3. Assume that the potential $V$ is monotone increasing on $[0, u_{\max}]$. For a constant $T_0 > 0$ with $T_0^2 \ge -V(0)$ we introduce the functions
\[ \beta = \frac{1}{2} \left( T_0 - \frac{V}{T_0} \right), \qquad R = \frac{1}{2} \left( T_0 + \frac{V}{T_0} \right). \tag{4.25} \]
Fig. 3. Invariant region estimate for monotone V (axes Re y and Im y; labels T0, β and R)
If a solution of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition
\[ y \in \left\{ z \ \Big|\ |z - i\beta| \le R,\ \mathrm{Re}\, z,\ \mathrm{Im}\, z \ge 0 \right\} \cup \left\{ z \ \Big|\ \left| z - \frac{i T_0}{2} \right| \le \frac{T_0}{2},\ \mathrm{Re}\, z \le 0 \right\}, \]
then this condition holds for all $u \in [0, u_{\max}]$ (see Fig. 3).

Proof. Choosing $\alpha \equiv 0$ and $\beta$, $T$ according to (4.25), we know from Lemma 4.1 and Lemma 4.2 that the circles $|y - m| \le R$ are invariant. Furthermore, we note that the arc in Fig. 3 is the flow line of the equation $y' + y^2 = 0$, and thus it cannot be crossed from the right to the left when $V$ is positive. This gives the result in the case that $V$ has no zeros. If $V$ has a zero, the invariant disks in the regions $V \le 0$ and $V \ge 0$ coincide at the zero of $V$.

The invariant disk estimates of Lemma 4.1 and Lemma 4.2 can also be used if the functions $\alpha$ and $U$ have a discontinuity at some $v \in [0, u_{\max}]$, i.e.
\[ \alpha_l := \lim_{u \nearrow v} \alpha(u) \ne \lim_{u \searrow v} \alpha(u) =: \alpha_r, \qquad U_l := \lim_{u \nearrow v} U(u) \ne \lim_{u \searrow v} U(u) =: U_r . \]
In this case we choose the function $T$ also to be discontinuous at $v$,
\[ T_l := \lim_{u \nearrow v} T(u) \ne \lim_{u \searrow v} T(u) =: T_r, \]
in such a way that the circle corresponding to $(\alpha_r, U_r, T_r)$ contains that corresponding to $(\alpha_l, U_l, T_l)$ (see Fig. 4). In the next lemma we give sufficient “jump conditions” for this “matching.”

Lemma 4.4. (Matching of invariant disks). Suppose that $U_l < 0$. Depending on the sign of $U_r$, we set
\[ T_r = T_l\, \frac{(\alpha_r - \alpha_l)^2 + |U_l + U_r|}{\sqrt{|U_l|\, |U_r|}} \quad \text{if } U_r < 0, \tag{4.26} \]
\[ T_r = T_l\, \frac{(\alpha_r - \alpha_l)^2 + |U_l + U_r| + \sqrt{|U_l|\, U_r}}{\sqrt{|U_l|\, U_r}} \quad \text{if } U_r > 0. \tag{4.27} \]
Fig. 4. Matching of invariant disks in the cases Ur < 0 (left) and Ur > 0 (right)
Let Bl/r be the disks with centers ml/r = αl/r + iβl/r and radii Rl/r as given by (4.13) or (4.14). Then Bl ⊂ Br .
Proof. We must satisfy the condition $R_r \ge |m_r - m_l| + R_l$. Taking squares, we obtain the equivalent conditions $R_r \ge R_l$ and
\[ (R_r - R_l)^2 \ge (\alpha_r - \alpha_l)^2 + (\beta_r - \beta_l)^2 . \]
This last condition can also be written as
\[ (\alpha_r - \alpha_l)^2 + (\beta_l^2 - R_l^2) + (\beta_r^2 - R_r^2) \le 2\, (\beta_l \beta_r - R_l R_r). \tag{4.28} \]
In the case $U_r < 0$, we can substitute the ansatz (4.13) into (4.28) to obtain the equivalent inequality
\[ (\alpha_r - \alpha_l)^2 + |U_l| + |U_r| \le \sqrt{|U_l|\, |U_r|} \left( \frac{T_r}{T_l} + \frac{T_l}{T_r} \right). \]
Dropping the last summand on the right and solving for $T_r$, we obtain (4.26), which is thus a sufficient condition. In the case $U_r > 0$, we substitute (4.13, 4.14) into (4.28) to obtain the equivalent condition
\[ (\alpha_r - \alpha_l)^2 + |U_l| - U_r \le \sqrt{|U_l|\, U_r} \left( \frac{T_r}{T_l} - \frac{T_l}{T_r} \right). \]
Using the inequality $|U_l| - U_r \le |U_l + U_r|$, replacing the factor $T_l / T_r$ on the right by one and solving for $T_r$, we obtain the sufficient condition (4.27).
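The jump conditions of Lemma 4.4 can be checked on concrete numbers. The sketch below is an illustration only: the function names are hypothetical, and the sample values of $\alpha_{l/r}$, $U_{l/r}$ and $T_l$ are arbitrary. It builds the disks from (4.13) or (4.14), computes $T_r$ from (4.26) or (4.27), and evaluates the slack $R_r - (|m_r - m_l| + R_l)$, which the lemma asserts to be non-negative.

```python
import math

def disk(alpha, U, T):
    # Center m = alpha + i*beta and radius R from the ansatz (4.13) for U < 0
    # or (4.14) for U > 0.
    s = math.sqrt(abs(U))
    if U < 0:
        beta, R = 0.5 * s * (T + 1 / T), 0.5 * s * (T - 1 / T)
    else:
        beta, R = 0.5 * s * (T - 1 / T), 0.5 * s * (T + 1 / T)
    return complex(alpha, beta), R

def matched_T(alpha_l, U_l, T_l, alpha_r, U_r):
    # Jump condition (4.26) for U_r < 0 and (4.27) for U_r > 0; requires U_l < 0.
    root = math.sqrt(abs(U_l) * abs(U_r))
    numerator = (alpha_r - alpha_l) ** 2 + abs(U_l + U_r)
    if U_r > 0:
        numerator += root
    return T_l * numerator / root

def slack(alpha_l, U_l, T_l, alpha_r, U_r):
    # R_r - (|m_r - m_l| + R_l); a non-negative value means B_l lies in B_r.
    T_r = matched_T(alpha_l, U_l, T_l, alpha_r, U_r)
    m_l, R_l = disk(alpha_l, U_l, T_l)
    m_r, R_r = disk(alpha_r, U_r, T_r)
    return R_r - (abs(m_r - m_l) + R_l)

if __name__ == "__main__":
    print(slack(0.0, -1.0, 2.0, 0.5, -3.0))
    print(slack(0.0, -1.0, 2.0, 0.5, 4.0))
```

The case $\alpha_l = \alpha_r$, $U_r = |U_l|$ gives $T_r = T_l$ in (4.27) and zero slack, so the two disks touch from inside and the condition cannot be weakened there.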
4.3. Bounds for the Wronskian and the fundamental solutions. We now consider the solutions $\acute{\phi}$ and $\grave{\phi}$ as defined in Section 3.1 for $\omega$ on the real axis and set
\[ \acute{y} = \frac{\acute{\phi}'}{\acute{\phi}}, \qquad \grave{y} = \frac{\grave{\phi}'}{\grave{\phi}} . \]
We keep $k$ fixed. Since taking the complex conjugate of the separated wave equation flips the sign of $k$, we may assume that $k \ge 0$. Then $\omega_0$ as defined by (2.12) is negative.

Proposition 4.5. If $\omega \notin [\omega_0, 0]$, the Wronskian $w(\acute{\phi}, \grave{\phi})$ is non-zero.

Proof. According to (2.13), $\omega$ and $\Omega$ have the same sign. From (4.10) we know that the functions $\acute{y}$ and $\grave{y}$ both stay either in the upper or lower half plane. In view of the asymptotics (3.1, 3.2), we know that they must be in opposite half planes. Thus
\[ w(\acute{\phi}, \grave{\phi}) = \acute{\phi}\, \grave{\phi} \left( \acute{y} - \grave{y} \right) \ne 0. \]

In the case $\omega \in (\omega_0, 0)$, we need the following global estimate for large $\lambda$.

Proposition 4.6. For any $u_1 \in \mathbb{R}$ there are constants $c, \lambda_0 > 0$ such that
\[ \left| \frac{\acute{\phi}(u)}{w(\acute{\phi}, \grave{\phi})} \right| \le \frac{c}{\lambda} \quad \text{for all } \lambda > \lambda_0,\ \omega \in (\omega_0, 0),\ u < u_1 . \]

The remainder of this section is devoted to the proof of this proposition. Let $u_1 \in \mathbb{R}$ and $\omega \in (\omega_0, 0)$. Possibly by increasing $u_1$ and $\lambda_0$ we can clearly arrange that $V$ is monotone decreasing on $[u_1, \infty)$. Then we have the following estimate.

Lemma 4.7. The functions $\grave{\phi}$ and $\grave{y}$ satisfy the inequalities
\[ |\grave{\phi}(u)| \ge 1, \qquad \mathrm{Re}\, \grave{y}(u) \le |\omega| \quad \text{on } [u_1, \infty). \]
Proof. From the asymptotics (3.2) we know that $\lim_{u \to \infty} \grave{y}(u) = -i\omega$. Thus for $v$ sufficiently large, $|\grave{y}(v) - i|\omega|| < \varepsilon$, and we can apply Corollary 4.3 on the interval $[u_1, v]$ backwards in $u$ with $T_0 = |\omega| + 2\varepsilon$. Since $\varepsilon$ can be chosen arbitrarily small, we conclude that Corollary 4.3 applies even on $[u_1, \infty)$ with $T_0 = |\omega|$. This means that
\[ 0 \le \mathrm{Im}\, \grave{y} \le |\omega|, \qquad \mathrm{Re}\, \grave{y} \le |\omega| \quad \text{on } [u_1, \infty). \]
Finally, we use (4.9) with $w = i|\omega|$.

We now come to the estimates for $\acute{\phi}$, which are more difficult because we need a stronger result. The next lemma specifies the behavior of the potential on $(-\infty, 2u_1]$.

Lemma 4.8. For any $u_1 \in \mathbb{R}$ there are constants $c, \lambda_0$ such that the potential $V$ has for all $\omega \in (\omega_0, 0)$ and all $\lambda > \lambda_0$ the following properties. There are unique points $u_- < u_0 < u_+ < u_1$ such that
\[ V(u_-) = -\frac{\Omega^2}{2}, \qquad V(u_0) = 0, \qquad V(u_+) = \Omega^2 . \]
V is monotone increasing on (−∞, u+ ]. Furthermore, u+ − u− ≤ c,
(4.29)
γ u+ ≥ log − log λ − c, 1 1 |V | + |V
| 2 ≤ |V | on [u+ , 2u1 ], 4 2
2 3
(4.30) (4.31)
with $\gamma$ as in (3.9).

Proof. We expand $V$ in a Taylor series around the event horizon,
\[ V = -\Omega^2 + (\lambda + c_0)(r - r_1) + \lambda\, O((r - r_1)^2). \]
Hence for sufficiently large $\lambda_0$ there are near the event horizon unique points $u_-$, $u_0$, $u_+$ where the potential has the required value. Integrating (2.1) we get near the event horizon the asymptotic formula
\[ u \sim \frac{1}{\gamma} \log(r - r_1). \]
Getting asymptotic expansions for $u_\pm$ we immediately obtain (4.29, 4.30). Furthermore, using (2.1) to transform $r$-derivatives into $u$-derivatives, we obtain in the region $(r_1, r_1 + \varepsilon) \cap (u_+, \infty)$ the estimates
\[ \frac{\lambda}{c}\, e^{\gamma u} \le V(u) \le \lambda\, c\, e^{\gamma u}, \qquad |V'(u)| + |V''(u)| \le \lambda\, c\, e^{\gamma u}, \]
uniformly in $\lambda$ and $\omega$. Hence for sufficiently large $\lambda_0$, (4.31) will be satisfied near the event horizon. In the region $r > r_1 + \varepsilon$ away from the event horizon, $V$ is strictly positive, $V > \lambda / c$, and since the derivatives of $V$ can clearly be bounded by $|V'| + |V''| < c\lambda$, it follows that (4.31) is again satisfied.

First we apply Corollary 4.3 on the interval $(-\infty, u_-)$ to obtain the following result.

Corollary 4.9. There is a constant $c > 0$ such that for all $\omega \in (\omega_0, 0)$ and $\lambda > \lambda_0$,
\[ \frac{\Omega}{2} \le \mathrm{Im}\, y \le \Omega, \qquad |\mathrm{Re}\, y| \le \frac{\Omega}{2} \quad \text{on } (-\infty, u_-]. \]
Also, at $u = u_-$ we have an invariant disk with
\[ \alpha_l = 0, \qquad U_l = -\frac{\Omega^2}{2}, \qquad T_l = \sqrt{2}. \tag{4.32} \]
On the interval $[u_-, u_+]$ we use the method described in the next lemma.
Lemma 4.10. Assume that the potential $V$ is monotone increasing on $[0, u_{\max}]$. We set $\alpha = \sqrt{\max(2 V(u_{\max}), 0)}$ and introduce for a given constant $T_0 > 1$ the functions $U$, $\sigma$, $\beta$, $R$, and $T$ by (4.12, 4.14) and
\[ T(u) = T_0\, \frac{\sqrt{|U(0)|}}{\sqrt{|U(u)|}}\, e^{2\alpha u} . \tag{4.33} \]
If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition $|y - m| \le R$, then this condition holds for all $u \in [0, u_{\max}]$.

Proof. By definition of $\alpha$, the function $U = V - \alpha^2$ is negative and monotone increasing. Using furthermore that $\sigma = e^{2\alpha u}$, we can estimate the total variation in (4.16) as follows,
\[ \mathrm{TV}_{[0,u)} \log |\sigma^2 U| = \int_0^u \left( 4\alpha - \frac{|U|'}{|U|} \right) = 4\alpha u + \log |U(0)| - \log |U(u)| . \]
This gives (4.33).

Thus we match the invariant disk (4.32) to a disk with $U_r = V(u_-) - \alpha_r^2$ and $\alpha_r = \sqrt{2}\, \Omega$. From (4.29) we see that $(u_+ - u_-)\, \alpha$ is uniformly bounded, and thus we obtain the following estimate.
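Corollary 4.11. There is a constant $c > 0$ such that for all $\omega \in (\omega_0, 0)$ and $\lambda > \lambda_0$,
\[ \frac{\Omega}{c} \le \mathrm{Im}\, y \le c\, \Omega, \qquad |\mathrm{Re}\, y| \le c\, \Omega \quad \text{on } [u_-, u_+]. \]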
At u = u+ we get an invariant disk with 0 ≤ αl ≤ ,
−c Ul = −2 ,
Tl ≤ c.
(4.34)
In the remaining interval $[u_+, 2u_1]$ an approximate solution of the Schrödinger equation (2.14) is available from semi-classical analysis: the WKB wave function
\[ \phi(u) = V^{-\frac{1}{4}} \exp\left( \int^u \sqrt{V} \right). \]
The corresponding function $y$ is given by
\[ y(u) = \sqrt{V} - \frac{V'}{4V} . \]
In order to get an invariant disk estimate which quantifies the exponential increase of $\phi$, we choose $\alpha$ such that it also becomes large as $V \gg 0$. For technical simplicity, we choose
\[ \alpha(u) = \frac{7}{8} \sqrt{V(u)}, \tag{4.35} \]
giving rise to the following general result.
Lemma 4.12. Assume that the potential $V$ is positive on $[0, u_{\max}]$ and that
\[ |V'(u)| \le \frac{1}{2}\, V(u)^{\frac{3}{2}}, \qquad |V''(u)| \le \frac{1}{4}\, V(u)^2 . \tag{4.36} \]
We introduce for a given constant $T_0 > 0$ the functions $\alpha$, $U$, $\sigma$, $\beta$, $R$, and $T$ by (4.35, 4.12, 4.14, 4.22). If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition $|y - m| \le R$, then this condition holds for all $u \in [0, u_{\max}]$. Furthermore,
\[ \mathrm{Re}\, y \ge \frac{\sqrt{V}}{8} - \frac{T_0}{2} . \tag{4.37} \]
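Proof. A short calculation yields
\[ U = V - \alpha^2 - \alpha' = \frac{15}{64}\, V - \frac{7}{16}\, \frac{V'}{\sqrt{V}}, \]
\[ U' + 4\alpha U = \frac{105}{128}\, V^{\frac{3}{2}} - \frac{83}{64}\, V' + \frac{7}{32}\, \frac{V'^2}{V^{\frac{3}{2}}} - \frac{7}{16}\, \frac{V''}{\sqrt{V}} . \]
Using (4.36) we obtain the estimates
\[ \frac{V}{64} \le U \le \frac{V}{2} \quad \text{and} \quad U' + 4\alpha U \ge \frac{V^{\frac{3}{2}}}{16} . \]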
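The short calculation in the proof of Lemma 4.12 can be cross-checked numerically. The sketch below is an illustration under stated assumptions: the sample potential $V(u) = (u+4)^2$ is a hypothetical choice that satisfies (4.36) on $[0, 1]$, and the derivatives $\alpha'$ and $U'$ are approximated by central differences, so agreement is expected only up to a small tolerance.

```python
import math

def V(u):
    # Hypothetical sample potential; it satisfies (4.36) on [0, 1].
    return (u + 4.0) ** 2

def dV(u):
    return 2.0 * (u + 4.0)

def d2V(u):
    return 2.0

def alpha(u):
    # Choice (4.35): alpha = (7/8) * sqrt(V).
    return 7.0 / 8.0 * math.sqrt(V(u))

def U_closed(u):
    # Closed form from the proof: U = (15/64) V - (7/16) V'/sqrt(V).
    return 15.0 / 64.0 * V(u) - 7.0 / 16.0 * dV(u) / math.sqrt(V(u))

def U_definition(u, h=1e-5):
    # Definition (4.12), U = V - alpha^2 - alpha', with alpha' from a central difference.
    d_alpha = (alpha(u + h) - alpha(u - h)) / (2 * h)
    return V(u) - alpha(u) ** 2 - d_alpha

def combo_closed(u):
    # Closed form for U' + 4 alpha U from the proof.
    v, dv, d2v = V(u), dV(u), d2V(u)
    return (105.0 / 128.0 * v ** 1.5 - 83.0 / 64.0 * dv
            + 7.0 / 32.0 * dv ** 2 / v ** 1.5 - 7.0 / 16.0 * d2v / math.sqrt(v))

def combo_numeric(u, h=1e-5):
    # U' + 4 alpha U with U' from a central difference of the closed form of U.
    dU = (U_closed(u + h) - U_closed(u - h)) / (2 * h)
    return dU + 4.0 * alpha(u) * U_closed(u)

if __name__ == "__main__":
    for u in (0.0, 0.5, 1.0):
        print(u, U_closed(u), combo_closed(u), V(u) ** 1.5 / 16.0)
```

For this sample potential the bounds $V/64 \le U \le V/2$ and $U' + 4\alpha U \ge V^{3/2}/16$ hold with a wide margin, which is what makes the conditions (4.21) of Lemma 4.2 available.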
Hence the conditions (4.21) are satisfied, and Lemma 4.2 applies. The inequality (4.37) follows from (4.23), the just-derived upper bound for $U$ and the fact that $\sigma \ge 1$.

Matching the invariant disk (4.34) to the invariant disk with $\alpha_r = \alpha(u_r)$ and $U_r = V(u_r) - \alpha^2(u_r) - \alpha'(u_r)$ with $\alpha$ according to (4.35), we obtain
\[ U_r \le \Omega^2, \qquad T_r \le c. \tag{4.38} \]
We can then apply the last lemma on the interval $[u_+, 2u_1]$.

Proof of Proposition 4.6. Suppose that $u < u_1$. Using the definition of $\acute{y}$ and $\grave{y}$, we can rewrite the Wronskian as
\[ w(\acute{\phi}, \grave{\phi}) = \acute{\phi}\, \grave{\phi}\, (\acute{y} - \grave{y}). \]
Applying Lemma 4.7 at $u = 2u_1$ gives
\[ \left| \frac{\acute{\phi}(u)}{w(\acute{\phi}, \grave{\phi})} \right| \le \left| \frac{\acute{\phi}(u)}{\acute{\phi}(2u_1)} \right| \frac{1}{\mathrm{Re}\, \acute{y}(2u_1) - |\omega|} . \tag{4.39} \]
We combine (4.37) with $T_0 = T_r$ satisfying (4.38) to get
\[ \mathrm{Re}\, \acute{y} \ge \frac{\sqrt{V}}{8} - c. \tag{4.40} \]
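Since the potential $V$ is strictly positive on the interval $[u_1, 2u_1]$, we can, possibly by increasing $\lambda_0$ and $c$, arrange that
\[ \sqrt{V} \ge \frac{\sqrt{\lambda}}{c} \quad \text{on } [u_1, 2u_1] \tag{4.41} \]
and thus also that
\[ \mathrm{Re}\, \acute{y} \ge \frac{\sqrt{\lambda}}{16\, c} \quad \text{on } [u_1, 2u_1]. \tag{4.42} \]
This inequality allows us to bound the fraction in (4.39),
\[ \left| \frac{\acute{\phi}(u)}{w(\acute{\phi}, \grave{\phi})} \right| \le \left| \frac{\acute{\phi}(u)}{\acute{\phi}(2u_1)} \right| . \tag{4.43} \]
Thus it remains to control the last quotient. We omit the accent and use the notation $\rho = |\phi|$. In the case $u < u_+$, we can use (4.9),
\[ \frac{\rho(u)^2}{\rho(u_+)^2} = \frac{\mathrm{Im}\, y(u_+)}{\mathrm{Im}\, y(u)}, \]
and the last quotient is controlled from above and below by Corollary 4.9 and Corollary 4.11. Hence, rewriting the quotient on the right of (4.43) as
\[ \frac{\rho(u)}{\rho(2u_1)} = \frac{\rho(u)}{\rho(u_+)}\, \frac{\rho(u_+)}{\rho(2u_1)}, \]
it remains to consider the case $u \ge u_+$. Applying (4.7) and (4.40), we obtain
\[ A := \log \left| \frac{\phi(u)}{\phi(2u_1)} \right| = -\int_u^{2u_1} \mathrm{Re}\, y \le c\, (2u_1 - u_+) - \frac{1}{8} \int_u^{2u_1} \sqrt{V} . \]
Now we use (4.30) and the fact that the function $\log \Omega$ is bounded,
\[ A \le c \log \lambda - \frac{1}{8} \int_u^{2u_1} \sqrt{V} . \]
Estimating the last summand with (4.41),
\[ \int_u^{2u_1} \sqrt{V} \ge \int_{u_1}^{2u_1} \sqrt{V} \ge \frac{\sqrt{\lambda}}{c}\, u_1, \]
we conclude that for large $\lambda$ this summand dominates the term $c \log \lambda$, and thus (4.43) decays in $\lambda$ even like $\exp(-\sqrt{\lambda}/c)$.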
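5. Contour Deformations to the Real Axis

In this section we fix the angular momentum number $k$ throughout and omit the angular variable $\varphi$. We can again assume without loss of generality that $k \ge 0$. Also, since here we are interested in the situation only locally in $u$, we evaluate weakly. Thus we write the integral representation (2.16) for compactly supported initial data $\Psi_0$ and a test function $\eta \in C_0^\infty(\mathbb{R} \times S^2)^2$ as
\[ \langle \eta, \Psi(t) \rangle = -\frac{1}{2\pi i} \sum_{n \in \mathbb{N}} \lim_{\varepsilon \searrow 0} \left( \int_{C_\varepsilon} - \int_{\overline{C_\varepsilon}} \right) d\omega\, e^{-i\omega t}\, \langle \eta, Q_n(\omega)\, S_\infty(\omega)\, \Psi_0 \rangle . \tag{5.1} \]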
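The integration contour in (5.1) can be moved to the real axis provided that the integrand is continuous. In the next lemma we specify when this is the case and simplify the integrand. For $\omega$ real, the complex conjugates of $\acute{\phi}$ and $\grave{\phi}$ are again solutions of the ODE. Thus, apart from the exceptional cases $\omega \in \{0, \omega_0\}$, we can express $\grave{\phi}$ as a linear combination of $\acute{\phi}$ and $\overline{\acute{\phi}}$,
\[ \grave{\phi} = \alpha\, \acute{\phi} + \beta\, \overline{\acute{\phi}} \qquad (\omega \in \mathbb{R} \setminus \{0, \omega_0\}). \tag{5.2} \]
The complex coefficients $\alpha$ and $\beta$ are called transmission coefficients. The Wronskian of $\acute{\phi}$ and $\grave{\phi}$ can then be expressed by
\[ w(\acute{\phi}, \grave{\phi}) = \beta\, w(\acute{\phi}, \overline{\acute{\phi}}) = 2 i\, \Omega\, \beta, \tag{5.3} \]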
where in the last step we used the asymptotics (3.1). Furthermore, it is convenient to introduce the real fundamental solutions ´ φ1 = Re φ,
´ φ2 = Im φ,
and to denote the corresponding solutions of the wave equation in Hamiltonian form ωn . by 1/2 ´ φ) ` is non-zero at ω ∈ R \ {0, ω0 }, then the integrand Lemma 5.1. If the Wronskian w(φ, in (5.1) is continuous at ω and ! lim − lim (Qn (ω + iε) S∞ (ω + iε) )(r, ϑ) ε0
=−
ε 0
2 i ωn ωn tab a < bωn , >, ω
(5.4)
a,b=1
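where the coefficients $t_{ab}$ are given by
\[ t_{11} = 1 + \mathrm{Re}\, \frac{\alpha}{\beta}, \qquad t_{12} = t_{21} = -\mathrm{Im}\, \frac{\alpha}{\beta}, \qquad t_{22} = 1 - \mathrm{Re}\, \frac{\alpha}{\beta} . \tag{5.5} \]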
Proof. We start from the explicit formula for the operator product $Q_n S_\infty$ given in [5, Proposition 5.4]. Since the angular operator $Q_n(\omega + i\varepsilon)$ can be diagonalized for $\varepsilon$ sufficiently small, the kernel $g(u, u')$ is simply the Green's function of the radial ODE, i.e. for $\omega$ in the lower half plane,
\[ g(u, u') := \frac{1}{w(\acute{\phi}, \grave{\phi})} \times \begin{cases} \acute{\phi}(u)\, \grave{\phi}(u') & \text{if } u \le u' \\ \grave{\phi}(u)\, \acute{\phi}(u') & \text{if } u > u', \end{cases} \tag{5.6} \]
whereas the formula in the upper half plane is obtained by complex conjugation. Using ` we find that that limε0 φ´ = limε 0 φ´ and limε0 φ` = limε 0 φ, ! lim − lim g(u, u ) = 2i Im g(u, u ), ε0
ε 0
and a short calculation using (5.2, 5.3) gives 2 ! i lim − lim g(u, u ) = − tab φ a (u) φ b (u ) ε0 ε 0 a,b=1
with tab according to (5.5). Except for the function g(u, u ), all the functions appearing in the formula for Qn S∞ in [5, Proposition 5.4] are continuous on the real axis. A direct calculation shows that ! lim − lim (Qk,n (ω + iε) S∞ (ω + iε) ) ε0
=−
ε 0
2 i (ω − β)ω 0 tab a < b , >L2 (dµ) 0 1 ω a,b=1
with dµ given by (2.7). Since the b are eigenfunctions of the Hamiltonian, we know according to (2.3) that A b = (ω − β)ω b . Using furthermore that the operator A is symmetric on L2 (dµ), we conclude that (ω − β)ω 0 A0 < b , >L2 (dµ) = < b , >L2 (dµ) = < b , >, 0 1 0 1 where in the last step we used (2.8).
Let us now consider for which values of ω and n the contour can be moved to the ´ φ) ` is non-zero unless ω ∈ real axis. According to Proposition 4.5, the Wronskian w(φ, [ω0 , 0]. We now analyze carefully the exceptional cases ω = 0, ω0 . From Theorem 3.1, Theorem 3.2 and Theorem 3.5 we know that the functions φ´ and φω = ωµ φ` are continuous for all ω ∈ R. If ak = 0 and ω = 0, the functions φ´ and φω degenerate to real solutions with the asymptotics ´ lim φ(u) = 1,
u→−∞
1 (µ) lim uµ− 2 φ0 (u) = √ . π
u→∞
Noting that the function
\[ \partial_u \sqrt{r^2 + a^2} = \frac{r\, \Delta}{(r^2 + a^2)^{\frac{3}{2}}} = \frac{r}{\sqrt{r^2 + a^2}} \left( 1 - \frac{2Mr}{r^2 + a^2} \right) \]
is monotone increasing, the potential $V$, (2.15), is everywhere positive. Hence solutions of the Schrödinger equation (2.14) are convex. This implies that the functions $\acute{\phi}$ and $\grave{\phi}$ do not coincide, and thus their Wronskian is non-zero. As a consequence, the Green's function (5.6), and thus the whole integrand in (5.1), is bounded and continuous near $\omega = 0$ (note that (5.6) is invariant under rescalings of $\grave{\phi}$, and thus we can in this formula replace $\grave{\phi}$ by $\phi_\omega$). In the case $ak \ne 0$ and $\omega = 0$, the function $\phi_0$ is real,
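whereas $\acute{\phi}$ is complex, and thus $w(\acute{\phi}, \phi_0) \ne 0$. If on the other hand $ak \ne 0$ and $\omega = \omega_0$, $\acute{\phi}$ is real and $\grave{\phi}$ is complex, and again $w(\acute{\phi}, \phi_0) \ne 0$. Hence the integrand in (5.1) is continuous and bounded at the points $\omega = 0, \omega_0$. We conclude that for every $n \in \mathbb{N}$, the integrand in (5.1) is continuous on an open neighborhood of $\omega \in \mathbb{R} \setminus (\omega_0, 0)$. Furthermore, according to Proposition 4.6, $w(\acute{\phi}, \grave{\phi}) \ne 0$ if $\omega \in (\omega_0, 0)$ and $\lambda$ is sufficiently large. We have thus proved the following result.

Proposition 5.2. There is $\delta > 0$ and $n_0 \in \mathbb{N}$ such that for every $\Psi_0 \in C_0^\infty(\mathbb{R} \times S^2)^2$, the completeness relation
\[ \Psi_0 = \frac{1}{2\pi} \left( \sum_{n=0}^{\infty} \int_{\mathbb{R} \setminus [\omega_0, 0]} + \sum_{n > n_0} \int_{\omega_0}^{0} \right) \frac{d\omega}{\omega\, \Omega} \sum_{a,b=1}^{2} t_{ab}^{\omega n}\, \Psi_a^{\omega n}\, \langle \Psi_b^{\omega n}, \Psi_0 \rangle + \sum_{n \le n_0} \oint_{D_\delta} (Q_n S_\infty \Psi_0)\, d\omega \]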
holds, with the contour Dδ as in Fig. 5. We point out that the contour Dδ passes along the line segment [ω0 , ω0 +δ) twice, once as the limit of the contour in the lower half plane, and once as limit of the contour in the upper half plane. These two integrals can be combined to one integral over [ω0 , ω0 + δ) with the integrand given by (5.4). Let us now consider how the remaining contour integrals over Cε can be moved to the real line. According to Theorems 3.1 and 3.2, the functions φ´ and φ` have for every n ≤ n0 and for every ω ∈ (ω0 , 0) a holomorphic extension to a neighborhood of ω. Thus their Wronskian is also holomorphic in this neighborhood, and consequently ´ φ) ` = 0 for ω near 0 they can have only isolated zeros of finite order. Since w(φ, and ω0 , we conclude that the numbers of zeros must be finite. Since we only need to consider a finite number of angular momentum modes, there is at most a finite number of points ω1 , . . . , ωK ∈ (ω0 , 0), K ≥ 0, where any of the Wronskians w(φ´ n , φ` n ) has a zero. We denote the maximum of the orders of these zeros at ωi by li ∈ N. The above zeros of the Wronskian lead to poles in the integrand of (5.1) and correspond to radiant modes. We will prove in Sect. 7 by contradiction that these radiant modes are actually absent. Therefore, we now make the assumption that there are radiant modes, i.e. that the Wronskians w(φ´ n , φ` n ) have at least one zero on the real axis. As a preparation for the analysis of Sect. 7, we now choose a special configuration where radiant modes appear, but in the simplest possible way. We choose new initial data 0 = P(H ) 0 ,
(5.7)
Fig. 5. The integration contour Dδ (axes Re ω and Im ω, with the points ω0 and ω0 + δ marked)
where P is the polynomial P(x) = ω (ω − ω0 ) (x − ω1 )
l1 −1
K #
(x − ωi )li .
i=2
Then 0 again has compact support, and using the spectral calculus, the corresponding solution (t) of the Cauchy problem is obtained from (5.1) by multiplying the integrand by P(ω). Then the poles of the integrand at ω2 , . . . , ωK disappear, and at ω1 a simple pole remains. Subtracting this pole, the integrand becomes analytic, whereas for the pole itself we get a contour integral which can be computed with residues. Let us summarize the result of the above construction with a compact notation. For a test function η ∈ C0∞ (R × S 2 )2 we introduce the vectors ηωn by ηωn = (η1ωn , η2ωn ) where ηaωn = < aωn , η>. Proposition 5.3. Assume that there are radiant modes, K ≥ 1. Then the Cauchy development (t) of the initial data (5.7) satisfies the relation 1 ∞ −iωt ωn ωn ωn <η, (t)> = e η , T C2 dω + e−iω1 t ηω1 n , σ n C2 . 2π −∞ n≤n n∈N
0
Here ω1 ∈ (ω0 , 0). The (σ n )n=1,...,n0 are vectors in C2 , at least one of which is non-zero. The matrices T ωn have the following properties, (1) If ω ∈ [ω0 , 0] or n > n0 , ωn (T ωn )ab = tab
(5.8)
with $t_{ab}$ according to (5.5).
(2) For each $n$, the function $T^{\omega n}$ is continuous in $\omega \in \mathbb{R}$ and analytic in $(\omega_0, 0)$.

6. Energy Splitting Estimates

In this section, we consider the family of test functions $\eta_L(u) = \eta(u + L)$ for a fixed $\eta \in C_0^\infty(\mathbb{R} \times S^2)^2$. Our goal is to control the inner product $\langle \eta_L, \Psi(t) \rangle$ in the limit $L \to \infty$ when the support of $\eta_L$ moves towards the event horizon. Our method is to split up the inner product into a positive and an indefinite part. Once the indefinite part is bounded using the ODE estimates of Sect. 4, we can use the Schwarz inequality and energy conservation to also control the positive part.

We choose $u_1 \in \mathbb{R}$ and general test functions $\eta, \zeta \in C_0^\infty((-\infty, u_1) \times S^2)^2$ which are supported to the left of $u_1$ (for later use we often work more generally with $\zeta$ instead of $\Psi_0$). Since for each fixed $n$, the $T^{\omega n}$ are continuous and the eigensolutions $\Psi_a^{\omega n}(u)$ are, according to Theorem 3.1, also continuous in $\omega$, uniformly for $u \in (-\infty, u_1)$, we have no difficulty controlling the expressions $\langle \eta^{\omega n}, T^{\omega n} \zeta^{\omega n} \rangle_{\mathbb{C}^2}$ for $n \le n_0$ and $\omega \in [\omega_0, 0]$. Hence we only need to consider the case when the matrix $T^{\omega n}$ is given by (5.8). Using (5.5), the eigenvalues $\lambda_\pm$ of this matrix are
\[ \lambda_\pm = 1 \pm \left| \frac{\alpha}{\beta} \right| . \tag{6.1} \]
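The eigenvalue formula (6.1) follows directly from the coefficients (5.5): the matrix with entries $t_{ab}$ is real symmetric with trace 2 and determinant $1 - |\alpha/\beta|^2$. As a minimal numerical sketch (the function names and the sample transmission coefficients are hypothetical and chosen only for illustration):

```python
import math

def t_matrix(alpha, beta):
    # Coefficients (5.5) built from the ratio alpha/beta of transmission coefficients.
    z = alpha / beta
    return [[1 + z.real, -z.imag], [-z.imag, 1 - z.real]]

def eigenvalues(m):
    # Eigenvalues of a real symmetric 2x2 matrix from its trace and determinant.
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 - disc, tr / 2 + disc

if __name__ == "__main__":
    alpha, beta = 2 + 1j, 1 - 1j   # sample values with |alpha| > |beta|
    print(eigenvalues(t_matrix(alpha, beta)), 1 - abs(alpha / beta), 1 + abs(alpha / beta))
```

The smaller eigenvalue is negative exactly when $|\alpha| > |\beta|$, which is the sign question that the relation (6.2) settles.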
In order to determine the sign of these eigenvalues, we first use the asymptotics (3.1, 3.2) to compute the Wronskians $w(\grave{\phi}, \overline{\grave{\phi}}) = -2i\omega$ and $w(\acute{\phi}, \overline{\acute{\phi}}) = 2i\Omega$. Furthermore, we obtain from (5.2) and its complex conjugate that
\[ w(\grave{\phi}, \overline{\grave{\phi}}) = (|\alpha|^2 - |\beta|^2)\, w(\acute{\phi}, \overline{\acute{\phi}}). \]
Combining these identities, we find that
\[ |\alpha|^2 - |\beta|^2 = -\frac{\omega}{\Omega} . \tag{6.2} \]
From (6.1, 6.2) we see that in the case $\omega \notin [\omega_0, 0]$, where $\omega$ and $\Omega$ have the same sign, the eigenvalues $\lambda_\pm$ are both positive. However, if $\omega \in (\omega_0, 0)$, one of the eigenvalues is negative. This result is not surprising, because the lack of positivity corresponds to the fact that for $\omega \in [\omega_0, 0]$ the energy density can be negative inside the ergosphere. In the case when $T^{\omega n}$ is not positive, we decompose it into the difference of two positive matrices,
\[ T^{\omega n} = T_+^{\omega n} - T_-^{\omega n} \quad \text{for } \omega \in (\omega_0, 0),\ n > n_0, \]
where $T_-^{\omega n} = -\lambda_-\, \mathbb{1}$. In the next lemma we bound the integral over $T_-^{\omega n}$ using ODE techniques.

Lemma 6.1. For any $\varepsilon > 0$ we can, possibly by increasing $n_0$, arrange that for all $L \le 0$,
\[ \sum_{n > n_0} \int_{\omega_0}^{0} \left| \langle \eta_L^{\omega n}, T_-^{\omega n} \zeta^{\omega n} \rangle_{\mathbb{C}^2} \right| d\omega \le \varepsilon . \]
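Proof. Using (6.2, 5.3) we can estimate the norm of $T_-$ by
\[ \| T_- \| = |\lambda_-| = \frac{|\alpha|^2 - |\beta|^2}{|\beta|\, (|\alpha| + |\beta|)} \le \left| \frac{\omega}{\Omega} \right| \frac{1}{2\, |\beta|^2} = \frac{2\, |\omega|\, |\Omega|}{|w(\acute{\phi}, \grave{\phi})|^2} . \]
Hence
\[ \sum_{n > n_0} \int_{\omega_0}^{0} \left| \langle \eta_L^{\omega n}, T_-^{\omega n} \zeta^{\omega n} \rangle_{\mathbb{C}^2} \right| d\omega \le 2 \sum_{n > n_0} \int_{\omega_0}^{0} \frac{|\eta_L^{\omega n}|}{|w(\acute{\phi}, \grave{\phi})|}\, \frac{|\zeta^{\omega n}|}{|w(\acute{\phi}, \grave{\phi})|}\, |\omega|\, |\Omega|\, d\omega . \]
Writing out the energy scalar product using [5, Eq. (2.14)] and expressing the fundamental solutions $\Psi_a^{\omega n}$ in terms of the radial solution $\acute{\phi}$, one sees that
\[ |\eta_L^{\omega n}| \le c\, \sup_{\mathbb{R}} |\eta_L\, \acute{\phi}|, \qquad |\zeta^{\omega n}| \le c\, \sup_{\mathbb{R}} |\zeta\, \acute{\phi}|, \]
where the constant $c = c(\omega)$ is independent of $\lambda$. Now we apply Proposition 4.6 and use that the eigenvalues $\lambda_n$ grow quadratically in $n$, (2.11).

Lemma 6.2. There is a constant $C > 0$ such that for all $L \ge 0$,
\[ \sum_{n \in \mathbb{N}} \int_{\mathbb{R} \setminus [\omega_0, 0]} \left| \langle \eta_L^{\omega n}, T^{\omega n} \zeta^{\omega n} \rangle_{\mathbb{C}^2} \right| d\omega \le C . \]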
Proof. First of all, using the positivity of the matrix T , ωn ωn ωn η , T ζ C2 dω L n∈N R\[ω0 ,0]
≤
! 1 ωn ωn ηL , T ωn ηL C2 + ζ ωn , T ωn ζ ωn C2 dω. 2 n∈N R\[ω0 ,0]
The two summands can be treated in exactly the same way; we treat the summand involvωn because of the additional L-dependence. Applying Proposition 5.2 and dropping ing ηL all negative terms, we get ωn ωn ηL , T+ωn ηL C2 dω ≤ <ηL , H (H − ω0 ) ηL > R\[ω0 ,0]
+
0
n>n0 ω0
ωn ωn ηL , T−ωn ηL C2 dω +
" n≤n0 Dδ
|<ηL , (Qn S∞ H (H − ω0 ) ηL )>| dω.
Using the asymptotic form of the energy scalar product and the Hamiltonian near the event horizon, it is obvious that the first term stays bounded as L → ∞. The second term is bounded according to Lemma 6.1. For the contour integrals we can use the formula (5.4) on the real interval [ω0 , ω0 + δ). Since Theorem 3.1 gives us control of the asymptotics of fundamental solution φ´ uniformly as u → −∞, it is clear that the integral over [ω0 , ω0 + δ) is bounded uniformly in L. For the contour in the complex plane, we cannot work with (5.4), but we must instead consider the formula for the operator product Qn S∞ given in [5, Proposition 5.4] together with the estimate for the Green’s function given in Lemma 6.3 below. Lemma 6.3. For every ω˜ ∈ Dδ with ω = ω0 , there are constants C, > 0 and u0 ∈ R such the Green’s function satisfies for all ω ∈ Dδ ∩ B (ω) ˜ the inequality |g(u, v)| ≤ C for all u, v ≤ u0 . Proof. It suffices to consider the case Im ω ≤ 0, because the Green’s function in the upper half plane is obtained simply by complex conjugation. By symmetry, we can furthermore assume that u ≤ v. Thus, according to (5.6), we must prove the inequality φ(u) ` φ(v) ´ ≤ C for all u ≤ v ≤ u0 . w(φ, ´ φ) ` ´ φ) ` has no zeros away According to Whiting’s mode stability [10], the Wronskian w(φ, from the real line, and thus by choosing δ so small that Bδ (ω) ˜ lies entirely in the lower ´ φ)| ` is bounded away from zero on Bδ (ω). half plane, we can arrange that |w(φ, ˜ Hence ´ ` ´ φ) ` our task is to bound the factor |φ(u) φ(v)|. Solving the defining equation for w(φ, for φ` and integrating, we obtain u0 φ` u0 du ´ ` . = −w(φ, φ) ´ 2 φ´ v φ(u) v
Substituting the identity 1
e−2iu d 2i du
=
´ 2 φ(u)
e2iu ´ 2 φ(u)
−
1 d 2i du
1 ´ 2 φ(u)
,
the integral over the last term gives a boundary term, v
u0
1 d 2i du
1
´ 2 φ(u)
1 1 u0 = . ´ 2 v 2i φ(u)
The integral over the other term can be estimated by 1 2
u0
v
2iu
´ e−2Im v u0 (e−iu φ(u)) −2iu e e dv ≤ dv. 2 −iu 3 ´ ´ (e φ(u) φ(u)) v
Using the asymptotics (3.1) one sees that the last integrand vanishes at the event horizon. ´ (3.7, 3.8), we see that this integrand decays even expoFrom the series expansion for φ, nentially fast. Therefore, the last integral is finite, uniformly in v and locally uniformly ´ the in ω. Collecting all the obtained terms and using the known asymptotics (3.1) of φ, result follows. Lemma 6.4. For any ε > 0 we can, possibly by increasing n0 , arrange that for all L ≥ 0,
0
n>n0 ω0
ωn ωn ωn η , T ζ C2 dω ≤ ε. + L
Proof. Again using positivity, it suffices to bound the terms n>n0
0 ω0
ωn ωn ηL , T+ωn ηL C 2
dω
and
n>n0
0 ω0
ζ ωn , T+ωn ζ ωn C2 dω .
They can be treated similarly, consider for example the first term. For any n1 > n0 , inf λ2n1
(ω0 ,0)
0
n≥n1 ω0
ωn ωn ηL , T+ωn ηL C2 dω ≤
≤
+ +
" n≤n0 Dδ
n>n0
0 ω0
0
n>n0 ω0
(AηL )ωn , T+ωn (AηL )ωn C2 dω
(AηL )ωn , T−ωn (AηL )ωn C2 dω
|| dω.
Here A is the angular operator. When it acts on a test function, we always get rid of the time-derivatives with the replacement i∂t → H . We now argue as in the proof of Lemma 6.2 (with ηL replaced by AηL ) and choose n1 sufficiently large.
7. An Integral Representation on the Real Axis We now use a causality argument together with the estimates of the previous section to show that the radiant modes in Proposition 5.3 must be absent. This will be a contradiction to the assumption that there are radiant modes, ruling out the possibility that there are radiant modes at all. This will lead us to an integral representation of the propagator on the real axis. Let us return to the setting of Proposition 5.3. Choosing the ϑ-dependence of η such that it is orthogonal to the angular wave functions ( aω1 n )n≤n0 except for one n, and choosing the u-dependence of η such that it is orthogonal only to one of the plane waves e±i(ω1 −ω0 )u , we can clearly arrange that ω1 n lim sup |σ (L)| =: κ > 0 where σ (L) := <ηL , σ n >C 2 . (7.1) L→∞
n≤n0
Furthermore, we choose η such that its support lies to the left of the support of 0 , i.e. dist(supp ηL , supp 0 ) > L for all L ≥ 0. Due to the finite propagation speed (which in the (t, u)-coordinates is equal to one), supp ηL ∩ supp (t) = ∅
if |t| ≤ L.
Hence for all L > 0, L 1 0= eiω1 t <ηL , (t)> 2L −L 1 ∞ sin((ω − ω1 )L) ωn ωn ωn = ηL , T 0 C2 dω + σ (L). 2π (ω − ω1 )L −∞ n∈N
We apply Lemma 6.1 and Lemma 6.4 with ε = κ/(8π ) to obtain κ 1 0 ωn ωn ωn ηL , T 0 C2 dω ≤ . 2 2π n>n ω0 0
Furthermore, Lemma 6.2 gives rise to the estimate sin((ω − ω1 )L) sin((ω − ω1 )L) ωn ωn ωn , ηL , T 0 C2 dω ≤ C sup (ω − ω1 )L (ω − ω1 )L R\[ω0 ,0] R\[ω0 ,0] and since this supremum tends to zero as L → ∞, we conclude that the expression on the left vanishes in the limit L → ∞. Combining these estimates with (7.1), we obtain 0 sin((ω − ω )L) 1 ωn ωn ωn lim sup (7.2) ηL , T 0 C2 dω ≥ π κ. (ω − ω1 )L ω0 L→∞ n≤n0
Since the matrices T ωn are continuous in ω and the fundamental solutions aωn (u) are according to Theorem 3.1 uniformly bounded as u → −∞, there is a constant C such that ωn ωn ωn η , T C2 ≤ C for all L ≥ 0 and ω ∈ (ω0 , 0), n ≤ n0 . 0 L
Hence we can apply Lebesgue’s dominated convergence theorem on the left of (7.2) and take the limit L → ∞ inside the integral, giving zero. This is a contradiction. ´ φ) ` has Since radiant modes have been ruled out, we know that the Wronskian w(φ, no zeros on the real axis. Thus we can move all contours up to the real axis. This gives the following integral representation for the propagator. Theorem 7.1. For any initial data 0 ∈ C0∞ (R × S 2 )2 , the solution of the Cauchy problem has the integral representation (t, r, ϑ, ϕ) =
2 1 −ikϕ ∞ dω −iωt kωn a b e tab kωn (r, ϑ) < kωn , 0 > e 2π −∞ ω k∈Z
n∈IN
a,b=1
with the coefficients tab as given by (5.5, 5.2). Here the sums and the integrals converge in L2loc . 8. Proof of Decay We now combine the integral representation of the solution of the Cauchy problem obtained in Theorem 7.1 with the energy splitting estimates of Sect. 6 to prove our main decay theorem. Proof of Theorem 1.1. We choose an interval [rL , rR ] ⊂ (r1 , ∞) and let K be the compact set K = [rL , rR ] × S 2 . As a consequence of Theorem 7.1, we have for any η ∈ ◦
C0∞ (K )2 the integral representation
1 ∞ −iωt ωn ωn ωn <η, H (H − ω0 ) (t)> = e η , T 0 C2 dω. 2π −∞
(8.1)
n∈IN
It is useful to introduce the short notation (η, ζ ) = <η, H (H − ω0 ) ζ >.
(8.2)
We consider on K the Hilbert space H = L2 (K, dµ)2 and denote its scalar product by <., .>H . We can represent the inner product (8.2) as (η, ζ ) = <η, B ζ >H with the operator B given by A0 A(β − ω0 ) A2 B = H (H − ω0 ) = . 0 1 (β − ω0 )A A + β (β − ω0 ) This operator, densely defined on C0∞ (K)2 ⊂ H, is obviously symmetric. We now construct a self-adjoint extension. We decompose B in the form 2 A 0 B = B0 + E with B0 := . 0 A The elliptic operator A on the compact domain K is essentially self-adjoint and has compact resolvent (see [5, Sect. 3]). Thus we can choose a domain D(B0 ) which makes B0
self-adjoint. Denoting the resolvents by Rλ0 = (B0 − λ)−1 and Rλ = (B − λ)−1 , the resolvent identity reads $ $ $ 0 0 0 0 Rλ = (1 1 + Rλ E) Rλ = Rλ 1 + Rλ E Rλ0 B0 − λ R λ . Writing out the operator inside the brackets, $ $ 0 F : = Rλ E Rλ0 1 1 0 (A2 − λ)− 2 A(β − ω0 ) (A − λ)− 2 , = 1 1 1 1 (A − λ)− 2 (β − ω0 )A(A2 − λ)− 2 (A − λ)− 2 β (β − ω0 ) (A − λ)− 2 1
and using that (A2 − λ)− 2 AH ≤ 1 for λ < 0, we conclude that by choosing λ 0, we can make the norm of F arbitrarily small. Hence the operator 1 + F is invertible, and we obtain the formula $ $ Rλ = Rλ0 (1 1 + F )−1 Rλ0 . We conclude that the operator Rλ is also compact. This gives us a self-adjoint extension of B with a purely discrete spectrum without limit points and finite-dimensional eigenspaces. We now arrange that the operator B has no kernel. Namely, if on the contrary the operator has a kernel, it is obvious from the definition of B that one of the operators A, H or H − ω0 has a kernel. Using the separation of variables, we get corresponding radial ODEs with Dirichlet boundary conditions at rL and rR . Since non-trivial solutions of these ODEs have discrete zeros, we can by increasing the size of the interval [rL , rR ] arrange that B has no kernel. Due to the purely discrete spectrum, there is a constant c > 0 such that Bξ ≥
1 ξ c
for all ξ ∈ D(B).
(8.3)
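The factorization of the resolvent used above can be sanity-checked in a finite-dimensional toy model. The following is only a minimal sketch, assuming a diagonal 2×2 matrix in place of $B_0$ and a bounded symmetric 2×2 matrix in place of E (in the text E is unbounded, so this does not reproduce the actual estimate); the function names are hypothetical.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(X):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def max_abs(X):
    """Largest entry in absolute value (a crude stand-in for the operator norm)."""
    return max(abs(v) for row in X for v in row)


def resolvent_via_F(b0_diag, E, lam):
    """Return (sqrt(R0) (1 + F)^{-1} sqrt(R0), F) for the toy B0 = diag(b0_diag)."""
    # sqrt(R0) is diagonal because the toy B0 is diagonal and lam < min(b0_diag).
    s = [[1.0 / math.sqrt(b0_diag[0] - lam), 0.0],
         [0.0, 1.0 / math.sqrt(b0_diag[1] - lam)]]
    F = matmul(matmul(s, E), s)
    one_plus_F = [[1.0 + F[0][0], F[0][1]], [F[1][0], 1.0 + F[1][1]]]
    return matmul(matmul(s, inverse(one_plus_F)), s), F


def resolvent_direct(b0_diag, E, lam):
    """Return (B0 + E - lam)^{-1} computed directly."""
    return inverse([[b0_diag[0] + E[0][0] - lam, E[0][1]],
                    [E[1][0], b0_diag[1] + E[1][1] - lam]])
```

In this toy setting both routes give the same matrix, and the entries of F shrink as λ is moved further to the left, which mirrors the role of the choice $\lambda \ll 0$ in the argument.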
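Let ε > 0. For given $\omega_1 > |\omega_0|$ and $n_1 \ge n_0$ we set
\[
T_I^{\omega n} = \begin{cases} T_-^{\omega n} & \text{if } n > n_1 \text{ and } \omega \in (\omega_0, 0) \\ T^{\omega n} & \text{if } n \le n_1 \text{ and } \omega \in [-\omega_1, \omega_1] \\ 0 & \text{otherwise} \end{cases}
\]
and $T_+^{\omega n} = T^{\omega n} - T_I^{\omega n}$. Furthermore, we introduce the short notation
\[
\int_N d\nu \cdots \equiv \frac{1}{2\pi} \sum_{n \in \mathbb{N}} \int_{-\infty}^{\infty} d\omega \cdots
\]
and omit the superscript $\omega n$. With this notation, we can write (8.1) for t = 0 in the compact form
\[
(\eta, \zeta) = \int_N \langle \eta, (T_+ + T_I)\, \zeta \rangle_{\mathbb{C}^2}\, d\nu.
\]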
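Since in Lemma 6.1 we used pointwise estimates of the $\Psi^a_{\omega n}$, these estimates depend on η and ζ only via their norm in the Hilbert space $\mathcal{H}$. The same is true for the finite number of modes $n \le n_1$ for ω in the compact set $[-\omega_1, \omega_1]$. We thus have the bound
\[
\int_N |\langle \eta, T_I\, \zeta \rangle_{\mathbb{C}^2}|\, d\nu \le c(K, \omega_1, n_1)\, \|\eta\|_{\mathcal{H}}\, \|\zeta\|_{\mathcal{H}}. \tag{8.4}
\]
Now we estimate the inner product $(\eta, \Psi(t))$ for $\eta \in C^\infty_0(\mathring{K})^2$ as follows,
\[
|(\eta, \Psi(t))| \le \Big| \int_N e^{-i\omega t}\, \langle \eta, T_I\, \Psi_0 \rangle_{\mathbb{C}^2}\, d\nu \Big| + \frac{1}{C} \int_N |\langle \eta, T_+ (1 + H^2)(1 + A)\, \Psi_0 \rangle_{\mathbb{C}^2}|\, d\nu, \tag{8.5}
\]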
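where the constant C, given by
\[
C = C(n_1, \omega_1) = \inf_{n > n_1 \text{ or } |\omega| > \omega_1} (1 + \omega^2)(1 + \lambda_n(\omega)),
\]
can be made arbitrarily large by increasing $\omega_1$ and $n_1$, so that the prefactor 1/C in (8.5) becomes arbitrarily small. Since $T_+$ is positive, we have, using the Schwarz inequality,
\[
\int_N \langle \eta, T_+\, \zeta \rangle_{\mathbb{C}^2}\, d\nu \le \Big( \int_N \langle \eta, T_+\, \eta \rangle_{\mathbb{C}^2}\, d\nu \Big)^{\frac12} \Big( \int_N \langle \zeta, T_+\, \zeta \rangle_{\mathbb{C}^2}\, d\nu \Big)^{\frac12}.
\]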
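Applying this inequality in the last term of (8.5), we obtain
\[
\Big( \int_N |\langle \eta, T_+ (1 + H^2)(1 + A)\, \Psi_0 \rangle_{\mathbb{C}^2}|\, d\nu \Big)^2 \le c(\Psi_0) \int_N \langle \eta, T_+\, \eta \rangle_{\mathbb{C}^2}\, d\nu
= c(\Psi_0) \Big( (\eta, \eta) - \int_N \langle \eta, T_I\, \eta \rangle_{\mathbb{C}^2}\, d\nu \Big) \le c(\Psi_0) \big( |(\eta, \eta)| + \|\eta\|^2_{\mathcal{H}} \big),
\]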
where in the last step we used (8.4). Hence by choosing $\omega_1$ and $n_1$ sufficiently large, we can arrange that the second summand in (8.5) is smaller than $\varepsilon\, (|(\eta, \eta)| + \|\eta\|^2_{\mathcal{H}})^{\frac12}$. The first term in (8.5) consists of the sum of the angular modes $n > n_1$ and $n \le n_1$. For the sum over $n > n_1$, we can again apply Lemma 6.1, keeping in mind that the dependence on η and ζ is controlled by their norms. Possibly by further increasing $n_1$ we can arrange that this contribution is bounded by $\varepsilon \|\eta\|_{L^2}$. For the remaining finite sum $n \le n_1$ we simply apply the Riemann-Lebesgue lemma. We conclude that, for large t,
\[
|(\eta, \Psi(t))| \le \varepsilon \Big( \|\eta\|_{L^2(K, d\mu)} + |(\eta, \eta)|^{\frac12} \Big) \quad \text{for all } \eta \in C^\infty_0(\mathring{K})^2.
\]
We rewrite this inequality in the Hilbert space $\mathcal{H}$,
\[
\langle \eta, B\, \Psi(t) \rangle^2_{\mathcal{H}} \le 2\varepsilon^2 \big( \|\eta\|^2_{\mathcal{H}} + \langle \eta, B\, \eta \rangle_{\mathcal{H}} \big).
\]
By continuity, this inequality holds for any $\eta \in \mathcal{H}$. Evaluating this inequality for the sequence $\eta_k$ obtained by projecting $\Psi(t)$ on the eigenspaces with eigenvalues ≤ k,
\[
\eta_k = \chi_{(-\infty, k]}(B)\, \Psi(t),
\]
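and taking the limit k → ∞, we obtain the inequality
\[
\langle \Psi(t), B\, \Psi(t) \rangle^2_{\mathcal{H}} \le 2\varepsilon^2 \big( \|\Psi(t)\|^2_{\mathcal{H}} + \langle \Psi(t), B\, \Psi(t) \rangle_{\mathcal{H}} \big).
\]
Since the term $\langle \Psi(t), B\, \Psi(t) \rangle_{\mathcal{H}}$ vanishes only if $\Psi(t) = 0$, we may divide by this term. Using (8.3), we conclude that
\[
\frac{1}{c}\, \|\Psi(t)\|^2 \le 2\varepsilon^2 \Big( \frac{\|\Psi(t)\|^2_{\mathcal{H}}}{\langle \Psi(t), B\, \Psi(t) \rangle_{\mathcal{H}}} + 1 \Big) \le 2\varepsilon^2 (c + 1)
\]
and thus, for sufficiently large t,
\[
\|\Psi(t)\|_{\mathcal{H}} \le \varepsilon\, (2c(c + 1))^{\frac12}.
\]
We conclude that $\Psi(t)$ converges to zero in $L^2(K)$. Applying the same argument to the initial data $H^n \Psi_0$, we conclude that the partial derivatives of $\Psi(t)$ also decay in $L^2(K)$. The Sobolev embedding $H^{2,2}(K) \hookrightarrow L^\infty(K)$ proves the theorem.

Acknowledgement. We would like to thank Johann Kronthaler for discussions on the Jost equation. We are grateful to the Vielberth Foundation, Regensburg, for support.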
References
1. Cardoso, V., Yoshida, S.: Superradiant instabilities of rotating black branes and strings. JHEP 0507, 009 (2005)
2. Chandrasekhar, S.: The mathematical theory of black holes. Oxford: Oxford University Press, 1983
3. De Alfaro, V., Regge, T.: Potential Scattering. Amsterdam: North-Holland Publishing Company, 1965
4. Finster, F., Kamran, N., Smoller, J., Yau, S.T.: The long-time dynamics of Dirac particles in the Kerr-Newman black hole geometry. Adv. Theor. Math. Phys. 7, 25–52 (2003)
5. Finster, F., Kamran, N., Smoller, J., Yau, S.T.: An integral spectral representation of the propagator for the wave equation in the Kerr geometry. Commun. Math. Phys. 260, no. 2, 257–298 (2005)
6. Finster, F., Schmid, H.: Spectral estimates and non-selfadjoint perturbations of spheroidal wave operators. http://atxiv.org/list/math-ph/0405010, to appear in Crelle’s Journal (2006)
7. Kay, B., Wald, R.: Linear stability of Schwarzschild under perturbations which are nonvanishing on the bifurcation 2-sphere. Classical Quantum Gravity 4, 893–898 (1987)
8. Press, W.H., Teukolsky, S.A.: Perturbations of a rotating black hole. II. Dynamical stability of the Kerr metric. Astrophys. J. 185, 649 (1973)
9. Price, R.H.: Nonspherical perturbations of relativistic gravitational collapse I, scalar and gravitational perturbations. Phys. Rev. D (3) 5, 2419–2438 (1972)
10. Whiting, B.: Mode stability of the Kerr black hole. J. Math. Phys. 30, 1301–1305 (1989)
11. Regge, T., Wheeler, J.A.: Stability of the Schwarzschild singularity. Phys. Rev. (2) 108, 1063–1069 (1957)

Communicated by G.W. Gibbons
Commun. Math. Phys. 264, 505–537 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1524-9
Communications in
Mathematical Physics
Derivation of the Gross-Pitaevskii Equation for Rotating Bose Gases
Elliott H. Lieb¹, Robert Seiringer²
¹ Departments of Mathematics and Physics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, NJ 08544, USA. E-mail: [email protected]
² Department of Physics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, NJ 08544, USA. E-mail: [email protected]
Received: 27 April 2005 / Accepted: 27 September 2005
Published online: 9 March 2006 – © E.H. Lieb and R. Seiringer 2006
Dedicated to Jakob Yngvason on the occasion of his 60th birthday

Abstract: We prove that the Gross-Pitaevskii equation correctly describes the ground state energy and corresponding one-particle density matrix of rotating, dilute, trapped Bose gases with repulsive two-body interactions. We also show that there is 100% Bose-Einstein condensation. While a proof that the GP equation correctly describes non-rotating or slowly rotating gases was known for some time, the rapidly rotating case was unclear because the Bose (i.e., symmetric) ground state is not the lowest eigenstate of the Hamiltonian in this case. We have been able to overcome this difficulty with the aid of coherent states. Our proof also conceptually simplifies the previous proof for the slowly rotating case. In the case of axially symmetric traps, our results show that the appearance of quantized vortices causes spontaneous symmetry breaking in the ground state.

© 2006 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.
Work partially supported by U.S. National Science Foundation grant PHY 01 39984.
Work partially supported by U.S. National Science Foundation grant PHY 03 53181, and by an A.P. Sloan Fellowship.

1. Introduction

In this paper we show that a rotating Bose gas is correctly described by the Gross-Pitaevskii (GP) equation in a suitable low-density limit. We also show that there is 100% Bose-Einstein condensation (BEC) into a solution of the GP equation. These conclusions were heretofore unproved, and it might not be an exaggeration to say that they were even conjectural, primarily because of the unusual situation (proved in [23]) that the absolute ground state of the Schrödinger Hamiltonian is not the bosonic ground state in the rapidly rotating case, as it is in the case when there is little or no rotation. In other
words, the vortices seen in rotating gases are not properties of the absolute ground state but are, instead, true manifestations of the bosonic symmetry requirement. If the GP equation correctly describes the physics of a rotating gas (as we show here), then it also shows the superfluidity of such a gas, as will be discussed below. In the case of a cylindrically symmetric trap potential, the rotational symmetry is broken when more than one vortex is present; the GP equation must describe this broken rotational symmetry and, therefore, it must have multiple minimum energy solutions in this case.

The key mathematical tool employed here is coherent states. Our work is based on the results of [18] and the observation there that one can make a c-number substitution for many boson modes (not just one, as in Bogoliubov’s method) without significant error provided the number of such modes is of lower order than N, the number of particles.

As in our previous work [15, 12, 17, 23, 14] on dilute, trapped Bose gases we start with the Hamiltonian for N bosons
\[
H_N = \sum_{i=1}^{N} H_0^{(i)} + \sum_{1 \le i < j \le N} v_N(x_i - x_j), \tag{1}
\]
where $H_0$ is the one-body part of the Hamiltonian and $v_N$ is the two-body repulsive interaction. These terms and the GP limit are described as follows.

1. The GP limit. We want to fix the external trapping potential but let N tend to infinity. To retain the notion of a dilute gas in this situation we let the interparticle potential depend on N in such a way that $a_N$, the two-body scattering length of $v_N$, is related to N by the condition that
\[
N a_N = a \quad \text{is fixed.} \tag{2}
\]
In this limit the three components of the energy (kinetic, trapping potential and interaction potential) scale in the same way and are all of the same order of magnitude. We call this the GP limit. It is this limit that will lead to the GP equation (6).

2. The two-body potential. We choose a radial two-body potential w(x) such that w(x) ≥ 0 (this is an important restriction for our methods) and such that w(x) = 0 for $|x| > R_0$ (this finite range condition is a technical restriction for simplicity and can be relaxed if need be). We note that integrability of w(x) is not assumed here; w(x) is even allowed to have a hard core. The scattering length of w is a (i.e., the solution to $[-2\Delta + w(x)] f(x) = 0$ with $f(\infty) = 1$ satisfies $f = 1 - a/|x|$ for $|x| > R_0$). The actual two-body potential in (1), given by
\[
v_N(x) = N^2 w(N x), \tag{3}
\]
has scattering length $a_N = a/N$.

3. The one-body Hamiltonian. We work, as usual, in the rotating coordinate system, in which case the kinetic energy has to be supplemented by a term $-\Omega \cdot (p \wedge x) = p \cdot (\Omega \wedge x)$, where Ω is the angular velocity vector, and $p = -i\hbar\nabla$. It is convenient to add and subtract a term $\frac{m}{2} (\Omega \wedge x)^2$ and thereby write
\[
H_0 = \frac{1}{2m} (p + A(x))^2 + V(x) \tag{4}
\]
with $A(x) = m\, \Omega \wedge x$. Then V is the trapping potential (which might or might not have some geometric symmetry) minus $\frac{m}{2} (\Omega \wedge x)^2$. It is well known that we must have V(x) → ∞ as |x| → ∞, for otherwise the system will fly apart. We can also assume that V ≥ 0 without loss of generality. Actually, for technical reasons we require just a little more, namely $V(x) \ge C_1 \ln(|x|) - C_2$ for some positive constants $C_1$ and $C_2$. (This condition can probably be relaxed a bit. What we actually need is that $\operatorname{Tr} e^{\alpha(\Delta - V(x))}$ and $\operatorname{Tr} |A(x)|^s e^{\alpha(\Delta - V(x))}$ are finite for α large enough, for some s > 2. We will show in the appendix that this is fulfilled under the stated assumption on V.)

We note that in the rotating coordinate system, the velocity at x is not p/m but rather $v = i\hbar^{-1} [H_0, x] = p/m + \Omega \wedge x$. The angular velocity around the axis is $\Omega \cdot (v \wedge x)\, |x^\perp|^{-2} |\Omega|^{-1}$, where $|x^\perp|$ is the distance to the axis. In a cylindrically symmetric state ψ we have $\langle \Omega \cdot (p \wedge x) \rangle_\psi = 0$ and, therefore, the angular velocity is Ω, not zero. In the fixed frame the angular velocity is Ω − Ω = 0. In other words, the system in such a state is not rotating. As long as Ω is small enough, the GP ground state is cylindrically symmetric and hence there is no rotation; this is a manifestation of superfluidity. In order to have rotation at least one vortex must form. This is a typical property of superfluids.

Henceforth, we use units in which $\hbar = 2m = 1$. We also note that the modification of the kinetic energy in (4) is mathematically just like that caused by a uniform magnetic field with vector potential A (and e/c = 1). There is nothing special about $A(x) = m\, \Omega \wedge x$ as far as the mathematics is concerned, so one could have an arbitrary A without disturbing our analysis, provided it does not grow too fast at infinity.
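The completion of the square behind (4) can be checked with ordinary vectors. The sketch below is a classical (commuting) illustration only, with hypothetical function names; for the operators one additionally needs $p \cdot A = A \cdot p$, which holds for $A = m\, \Omega \wedge x$ because this field is divergence free.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def rotating_frame_energy(p, x, omega, m):
    """p^2/(2m) - omega.(p x x): kinetic energy plus the rotation term."""
    return dot(p, p) / (2.0 * m) - dot(omega, cross(p, x))


def completed_square_energy(p, x, omega, m):
    """(p + A)^2/(2m) - (m/2)(omega x x)^2 with A = m omega x x."""
    wx = cross(omega, x)
    A = tuple(m * c for c in wx)
    pA = tuple(a + b for a, b in zip(p, A))
    return dot(pA, pA) / (2.0 * m) - 0.5 * m * dot(wx, wx)
```

Both functions return the same number for any choice of vectors, which is the classical content of absorbing the subtracted term into V.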
One could think, for example, of applying a magnetic field to the system, but then our particles would have to be charged and the attendant Coulomb interaction would nullify the treatment of the system as a dilute gas with short range interaction. On the other hand, we could allow our particles to have a magnetic moment (“bosons with spin”) and our analysis would easily extend to this case. The ground state energy depends in a non-trivial way on the total spin when there is rotation [22], even in the absence of a magnetic field. This is due to the symmetry requirement of the wave function, whereby the symmetry of the spin part determines the spatial symmetry (see, e.g., [6]). We will not pursue this topic further in this paper.

Our analysis is carried out here for a gas of three-dimensional particles, but the same ideas apply to a two-dimensional gas. There will be changes, of course, because the notion of scattering length is different in 2D and because the energy per particle of a homogeneous gas of low density ρ is not $4\pi\rho a_s$ as in 3D but rather $4\pi\rho / |\log(\rho a_s^2)|$. (Here, $a_s$ is the unscaled scattering length of the interaction potential, which is held fixed in the thermodynamic limit for the homogeneous gas.) Thus, the GP equation will be a little different, but the conclusion will be the same: The only effect of rotation is to replace $p^2$ by $|p + A|^2$ in the GP equation derived in [16]. In order to keep this paper manageable we do not discuss the 2D case, but the interested reader can easily combine the results in [16, 23] and the present paper.

The Hamiltonian $H_N$ acts on $L^2(\mathbb{R}^{3N})$ but we are interested in its restriction to the bosonic subspace of $L^2(\mathbb{R}^{3N})$, namely to permutation symmetric functions. We denote the ground state energy of $H_N$ in the bosonic sector by $E_0(N)$, and we keep in mind that this might be larger than the absolute ground state energy of $H_N$ when no permutation symmetry is imposed.
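The scaling statement attached to (3), namely that $v_N(x) = N^2 w(Nx)$ has scattering length a/N, can be illustrated in a case where everything is explicit. The following is a minimal sketch, assuming a soft-core potential w equal to a constant w0 inside the radius R0 and zero outside (this particular w and the function names are illustrative assumptions). For such w, solving $[-2\Delta + w] f = 0$ with $u = r f$ gives $u'' = (w_0/2) u$ inside the core and hence $a = R_0 - \tanh(k R_0)/k$ with $k = \sqrt{w_0/2}$.

```python
import math


def scattering_length_soft_core(w0, R0):
    """Scattering length of w = w0 on |x| < R0 (zero outside), for [-2*Laplace + w] f = 0."""
    k = math.sqrt(w0 / 2.0)
    return R0 - math.tanh(k * R0) / k


def scaled_scattering_length(w0, R0, N):
    """Scattering length of v_N(x) = N^2 w(N x): height N^2 * w0, range R0 / N."""
    return scattering_length_soft_core(N ** 2 * w0, R0 / N)
```

For this family the product N times the scaled scattering length does not depend on N, in line with condition (2).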
We turn now to the GP equation, which originates from the GP energy functional for a complex-valued function φ of one variable $x \in \mathbb{R}^3$. For a ≥ 0, the GP energy functional is given by
\[
E^{\mathrm{GP}}[\phi] = \langle \phi | H_0 | \phi \rangle + 4\pi a \int_{\mathbb{R}^3} |\phi(x)|^4\, dx. \tag{5}
\]
It can easily be shown [15] that $E^{\mathrm{GP}}[\phi]$ has a minimum over all φ with $\|\phi\|_2 = 1$ and this minimum energy is denoted by $E^{\mathrm{GP}}(a)$. (We use the standard notation $\|\phi\|_p = \big( \int |\phi(x)|^p\, dx \big)^{1/p}$.) There might be several minimizers (and there surely will be when the trap has axial symmetry and a is large [22, 23]) but each minimizing φ will satisfy the GP equation
\[
(-i\nabla + A(x))^2 \phi(x) + V(x) \phi(x) + 8\pi a |\phi(x)|^2 \phi(x) = \mu\, \phi(x), \tag{6}
\]
where µ is the chemical potential (i.e., the energy per particle to add a small number of particles). Note that $\mu = E^{\mathrm{GP}}(a) + 4\pi a \int |\phi(x)|^4\, dx > E^{\mathrm{GP}}(a)$ because of the quartic nonlinearity.

Our main theorem concerning the bosonic ground state energy of (1) is the following.

Theorem 1. With a denoting the scattering length of w, we have
\[
\lim_{N \to \infty} \frac{E_0(N)}{N} = E^{\mathrm{GP}}(a). \tag{7}
\]
In [23] it was shown that
\[
\limsup_{N \to \infty} \frac{E_0(N)}{N} \le E^{\mathrm{GP}}(a), \tag{8}
\]
and, therefore, it remains only to prove a lower bound to $\liminf_{N \to \infty} E_0(N)/N$ of the right form, which we do here.

The GP energy minimizer(s) φ also tells us something about the density (diagonal and off-diagonal) and about Bose-Einstein condensation in the ground state of $H_N$, or any approximate ground state. We call a sequence of bosonic N-particle density matrices $\gamma_N$ an approximate ground state if $\lim_{N \to \infty} N^{-1} \operatorname{Tr} H_N \gamma_N = E^{\mathrm{GP}}(a)$. The reduced one-particle density matrix of $\gamma_N$ will be denoted by $\gamma_N^{(1)}$. We would like to suppose that, as N → ∞, $\gamma_N^{(1)}$ converges to some γ and that $\gamma = |\phi\rangle\langle\phi|$, where φ is a solution to the GP equation. This would be 100% Bose-Einstein condensation into the GP state and was proved to occur in the non-rotating case [12].

The difficulty in the rotating case is that the solution to the GP equation might not be unique (as it is in the non-rotating case), in which case the limit γ need not be a pure state. We would expect, however, that γ is always a convex combination of pure GP states, i.e., $\gamma = \sum_i \lambda_i |\phi_i\rangle\langle\phi_i|$, where $\phi_i$ is a solution to the GP equation and $\sum_i \lambda_i = 1$. (This, of course, is not the same as the much weaker and less interesting statement that γ is a convex combination of terms of the form $|\psi\rangle\langle\psi|$, in which ψ is a linear combination of GP solutions instead of being equal to just one GP solution.) Unfortunately, as in the case of a cylindrically symmetric trap, the set of GP states might not be countable, and so the summation $\sum_i$ must be replaced by some kind of integral. This accounts for the rather abstract Theorem 2 below. In any event, this theorem tells us that there is always 100% condensation, even if the system has a wide choice of states into which to condense.

Note that $\gamma_N^{(1)}$ is a positive trace class operator on the one-particle space $L^2(\mathbb{R}^3)$, and we choose the normalization $\operatorname{Tr} \gamma_N^{(1)} = 1$ for convenience. (The conventional normalization is $\operatorname{Tr} \gamma_N^{(1)} = N$.) By the Banach-Alaoglu Theorem, any sequence $\gamma_N^{(1)}$ will have a
subsequence that converges to some γ in the weak-* topology, i.e., $\lim_{N \to \infty} \operatorname{Tr} A \gamma_N^{(1)} = \operatorname{Tr} A \gamma$ for all compact operators A. This convergence will even hold in the norm topology, i.e., $\lim_{N \to \infty} \operatorname{Tr} |\gamma_N^{(1)} - \gamma| = 0$, by compactness. More precisely, since the $\gamma_N^{(1)}$ are the one-particle density matrices of approximate ground states, we have (using the positivity of the interaction potential in $H_N$) $\operatorname{Tr} H_0 \gamma_N^{(1)} \le \text{const.}$ independently of N. Hence also $\sqrt{H_0}\, \gamma_N^{(1)} \sqrt{H_0} \rightharpoonup \sqrt{H_0}\, \gamma \sqrt{H_0}$ in weak-* sense, i.e., $\operatorname{Tr} A \sqrt{H_0}\, \gamma_N^{(1)} \sqrt{H_0} \to \operatorname{Tr} A \sqrt{H_0}\, \gamma \sqrt{H_0}$ for all compact A. Since $H_0^{-1}$ is a compact operator, this implies that $\operatorname{Tr} \gamma_N^{(1)} \to \operatorname{Tr} \gamma$ as N → ∞ (simply use $A = H_0^{-1}$ above). For positive operators, weak-* convergence plus convergence of the trace implies norm-convergence [27, 25].

We denote by Γ the set of all γ’s that are limit points of one-particle density matrices of approximate minimizers. That is,
\[
\Gamma = \Big\{ \gamma : \text{there is a sequence } \gamma_N,\ \lim_{N \to \infty} \frac{1}{N} \operatorname{Tr} H_N \gamma_N = E^{\mathrm{GP}}(a),\ \lim_{N \to \infty} \gamma_N^{(1)} = \gamma \Big\}. \tag{9}
\]
As remarked above, the convergence $\gamma_N^{(1)} \to \gamma$ can either mean weak-* convergence or norm convergence. Note that, in particular, norm convergence implies that $\operatorname{Tr} \gamma = 1$ for all γ ∈ Γ.

Theorem 2. The set Γ of one-particle density matrices of approximate ground states, as defined in (9), has the following properties:
(i) Γ is a compact and convex subset of the set of all trace class operators.
(ii) Let $\Gamma_{\mathrm{ext}} \subset \Gamma$ denote the set of extreme points in Γ. (An element γ ∈ Γ is extreme if γ cannot be written as $\gamma = a\gamma_1 + (1 - a)\gamma_2$ with $\gamma_{1,2} \in \Gamma$, $\gamma_1 \ne \gamma_2$, and 0 < a < 1.) We have $\Gamma_{\mathrm{ext}} = \{ |\phi\rangle\langle\phi| : E^{\mathrm{GP}}[\phi] = E^{\mathrm{GP}}(a) \}$, i.e., the extreme points in Γ are given by the rank-one projections onto GP minimizers.
(iii) For each γ ∈ Γ, there is a positive (regular Borel) measure $d\mu_\gamma$, supported in $\Gamma_{\mathrm{ext}}$, with $\int_{\Gamma_{\mathrm{ext}}} d\mu_\gamma(\phi) = 1$, such that
\[
\gamma = \int_{\Gamma_{\mathrm{ext}}} d\mu_\gamma(\phi)\, |\phi\rangle\langle\phi|, \tag{10}
\]
where the integral is understood in the weak sense. That is, every γ ∈ Γ is a convex combination of rank-one projections onto GP minimizers.

A consequence of the Krein–Milman Theorem [4, Vol. 2, Thm. 25.12] is that given any γ ∈ Γ and given any ε > 0 there are finitely many GP minimizers $\phi_i$ and positive coefficients $\lambda_i$ (with $\sum_i \lambda_i = 1$) such that
\[
\gamma = \sum_i \lambda_i |\phi_i\rangle\langle\phi_i| + \Delta_\varepsilon \tag{11}
\]
with $\operatorname{Tr} |\Delta_\varepsilon| < \varepsilon$. That is, every element of Γ can be approximated by a finite convex combination of GP minimizers. We also note that part (iii) of Theorem 2 follows from part (ii) using Choquet’s Theorem [4, Vol. 2, Thm. 27.6]. We shall, however, prove part (iii) (and Eq. (11)) directly in Sect. 3 (see Step 4).

Equation (10) reflects the spontaneous symmetry breaking that occurs in the system under consideration. Consider the case of an external potential V(x) which is axially symmetric, with symmetry axis given by the angular velocity vector Ω. In general, the
non-uniqueness of the GP minimizer stems from the appearance of quantized vortices, which break the axial symmetry, and hence lead to a whole continuum of GP minimizers [22, 23, 2, 3, 8, 7, 1]. Uniqueness of the GP minimizer can be restored by perturbing the one-particle Hamiltonian $H_0$ in such a way as to break the symmetry and to favor one of the minimizers, e.g., by introducing a slightly asymmetric trap potential V(x). This then leads to complete BEC, as can be seen from our Theorem 2, which does not assume any particular symmetry of V(x). Note that in the case of a unique GP minimizer, Theorem 2 implies that the reduced one-particle density matrix of any approximate ground state converges to the projection onto this unique GP minimizer, since $\Gamma_{\mathrm{ext}}$ (and hence Γ) consists of only one element in this case.

The situation of a dilute rotating Bose gas described in this section contrasts with the situation of the absolute ground state of $H_N$, i.e., the lowest eigenvalue and corresponding state without imposing symmetry restrictions on the wavefunctions. In [23] it was shown that Eq. (7) does not hold, in general, for the absolute ground state energy. The energy per particle in this case is given by minimizing a functional similar to (5), but which now depends on one-particle density matrices rather than on wave functions φ(x). In [22, 23] it was shown that the corresponding energy is strictly lower than $E^{\mathrm{GP}}(a)$ for a large enough (and Ω ≠ 0). The density matrix functional has a unique minimizer for any value of Ω and a, and in general this minimizer will not be rank one. An analogue of Theorem 2 also holds for the absolute ground state. As shown in [23], Γ consists of only one element in this case, namely the unique minimizer of the density matrix functional just mentioned. This implies, in particular, that there is no spontaneous symmetry breaking in the absolute ground state. We refer the reader to [23] for more details.
In the remainder of this paper, we present the proof of Theorems 1 and 2. We are grateful to Lev Pitaevskii for drawing our attention to the problem of the correctness of the GP equation for a rapidly rotating Bose gas in an email correspondence in 1999.
2. Proof of Theorem 1

Step 1. Reduction of the number of particles to ensure a bounded energy per particle. One of the problems we shall face in our analysis is to control three-body collisions, i.e., to show that the ground state wave function is suitably small when three particles are close together. We have found a way to do this (see Step 4) with the help of a bound on the change in energy when three particles are added to the system. It is not evident that this bound is always satisfied (although it must be satisfied on average since the total energy is bounded by N) and the discussion in this subsection shows how to circumvent this annoyance. If another way could be found to control the three-body amplitude or to control the incremental energy then the analysis in this section would not be needed.

Let us consider the Hamiltonian (1) for M ≤ N particles (but still with interaction potential $v_N$ depending on N):
\[
H_{M,N} = \sum_{i=1}^{M} H_0^{(i)} + \sum_{1 \le i < j \le M} v_N(x_i - x_j). \tag{12}
\]
This operator acts naturally on all of $L^2(\mathbb{R}^{3M})$. We denote the ground state energy in the bosonic sector by $E_0(M, N)$. Our goal is a good lower bound on $E_0(N, N)$.

Let $\widetilde{M} = \widetilde{M}(N)$ be the largest integer ≤ N satisfying two conditions: a.) $N - \widetilde{M}$ is divisible by 3 and b.) $E_0(\widetilde{M}, N) - E_0(\widetilde{M} - 3, N) \le 6 E^{\mathrm{GP}}(a)$. Then $E_0(\widetilde{M} + 3, N) - E_0(\widetilde{M}, N) > 6 E^{\mathrm{GP}}(a)$, $E_0(\widetilde{M} + 6, N) - E_0(\widetilde{M} + 3, N) > 6 E^{\mathrm{GP}}(a)$, etc., whence
\[
E_0(N, N) \ge E_0(\widetilde{M}, N) + 2 (N - \widetilde{M})\, E^{\mathrm{GP}}(a). \tag{13}
\]
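The selection rule for $\widetilde{M}$ and the telescoping estimate (13) can be sketched as follows. This is only an illustration, assuming that a callable `energy(M)` stands in for $E_0(M, N)$ at fixed N (the true ground state energies are of course not available in closed form); the function name and the toy energies in the usage note are hypothetical.

```python
def choose_m_tilde(energy, N, e_gp):
    """Largest M <= N with (N - M) divisible by 3 and energy(M) - energy(M - 3) <= 6 * e_gp.

    `energy` is a hypothetical stand-in for M -> E0(M, N); returns None if no such M >= 3 exists.
    """
    M = N
    while M >= 3:
        # Condition a.) holds automatically because M decreases from N in steps of 3.
        if energy(M) - energy(M - 3) <= 6.0 * e_gp:
            return M
        M -= 3
    return None
```

With a toy energy that grows by 1 per particle up to M = 20 and by 3 per particle beyond that, and with e_gp = 1, the rule applied to N = 32 stops at M = 20, and each of the four skipped steps of three particles contributes more than 6 * e_gp, which is exactly how (13) arises.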
We will prove the following in the remainder of this section.

Proposition 1. Fix Z > 0, and let $M_j$ and $N_j$ be two sequences of integers, with $M_j \le N_j$, $\lim_{j \to \infty} M_j = \infty$ and $\lim_{j \to \infty} N_j = \infty$, such that $E_0(M_j, N_j) - E_0(M_j - 3, N_j) \le 3Z$ for all j and $\lim_{j \to \infty} M_j / N_j = \lambda$ for some 0 ≤ λ ≤ 1. Then
\[
\liminf_{j \to \infty} \frac{1}{N_j} E_0(M_j, N_j) \ge \lambda\, E^{\mathrm{GP}}(\lambda a). \tag{14}
\]
Note that (14) does not depend on Z. It is now useful to note that the energy $E^{\mathrm{GP}}(a)$ is concave in a (as an infimum over affine functions) and thus satisfies
\[
E^{\mathrm{GP}}(\lambda a) \ge (1 - \lambda)\, E^{\mathrm{GP}}(0) + \lambda\, E^{\mathrm{GP}}(a) \ge \lambda\, E^{\mathrm{GP}}(a). \tag{15}
\]
The last inequality in (15) follows from $E^{\mathrm{GP}}(0) > 0$.

The sequence $\widetilde{M}(N)$ defined above will have a subsequence such that $\widetilde{M}(N_j)/N_j \to \lambda$ as j → ∞ for some 0 ≤ λ ≤ 1. If we combine (13)–(15) with $Z = 2 E^{\mathrm{GP}}(a)$ we find for this sequence that
\[
\lim_{j \to \infty} E_0(N_j, N_j)/N_j \ge \lambda^2 E^{\mathrm{GP}}(a) + 2 (1 - \lambda)\, E^{\mathrm{GP}}(a) = [1 + (1 - \lambda)^2]\, E^{\mathrm{GP}}(a) \ge E^{\mathrm{GP}}(a), \tag{16}
\]
which proves (7) for this sequence $N_j$. (Here and in the following, we denote lim inf by lim for short.) Together with the upper bound (8) we also conclude from (16) that λ = 1. That is, for $Z \ge 2 E^{\mathrm{GP}}(a)$ the sequence $\widetilde{M}(N)/N$ has only 1 as a limit point, and hence (16) holds for the full sequence N = 1, 2, 3, .... Our goal in the rest of this section is to prove Proposition 1, which then proves (7), as just explained.
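The arithmetic in (16) is short but easy to misread, so here is a small check. It is a sketch with a hypothetical function name: the factor λ² comes from inserting (15) into (14), and the term 2(1 − λ) comes from the second summand in (13) divided by N.

```python
def lower_bound_factor(lam):
    """Coefficient of E^GP(a) in (16): lam^2 from (14)-(15) plus 2 * (1 - lam) from (13)."""
    return lam ** 2 + 2.0 * (1.0 - lam)
```

The factor equals 1 + (1 − λ)² and is therefore at least 1, with equality only at λ = 1, which is why the upper bound (8) forces λ = 1.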
Step 2. The generalized Dyson Lemma. To get a lower bound on $E_0(M, N)$, we start by deriving a lower bound on the Hamiltonian $H_{M,N}$, using Corollary 1 in [13]. This corollary, which is a generalization of Lemma 1 in [19] which, in turn, stems from Lemma 1 in Dyson’s paper [5], asserts the following. (Note that the range of the potential $v_N$ is $R_0/N$, and its scattering length is a/N. We use the “hat” to denote Fourier transform.)

Lemma 1. Let $R > R_0/N$. Let χ(p) be a radial function such that 0 ≤ χ(p) ≤ 1 and such that $h(x) \equiv \widehat{(1 - \chi)}(x)$ is bounded and integrable (which implies that χ(p) → 1 as |p| → ∞). Let
\[
f_R(x) = \sup_{|y| \le R} |h(x - y) - h(x)|, \tag{17}
\]
and
\[
w_R(x) = \frac{2}{\pi^2}\, f_R(x) \int_{\mathbb{R}^3} f_R(y)\, dy. \tag{18}
\]
Let $U_R(x)$ be any positive, radial function that vanishes outside the annulus $R_0/N \le |x| \le R$, with $\int_{\mathbb{R}^3} U_R(x)\, dx = 4\pi$. Let ε > 0. If $y_1, \dots, y_n$ denote n fixed points in $\mathbb{R}^3$, with $|y_i - y_j| \ge 2R$ for all i ≠ j, then we have the operator inequality on $L^2(\mathbb{R}^3)$,
\[
-\nabla \chi(p)^2 \nabla + \frac12 \sum_{i=1}^{n} v_N(x - y_i) \ge \sum_{i=1}^{n} \Big[ (1 - \varepsilon) \frac{a}{N}\, U_R(x - y_i) - \frac{a}{N \varepsilon}\, w_R(x - y_i) \Big]. \tag{19}
\]
The sums in (19) are multiplication operators, i.e., they are just functions of x. The operator $-\nabla \chi(p)^2 \nabla$ is just the positive multiplication by $p^2 \chi(p)^2$ in Fourier space. The original Lemma 1 in [19] has χ(p) ≡ 1 and $h = w_R = f_R = \varepsilon = 0$.

Clarification. What Lemma 1 really says is that we can replace the unpleasant interaction potential $v_N$ (which possibly contains an infinite hard core) by a small, smooth, but longer ranged potential whose main part, $U_R$, is positive. There are two prices that have to be paid for this luxury. One is to forego a piece of the positive kinetic energy, $-\nabla \chi(p)^2 \nabla$. The second is that the potential is really only a ‘nearest neighbor’ potential. That is to say, the particle at x is allowed to interact with only one other particle at a time. This is seen from the requirement that the interaction $U_R$ has range R, but the other particles must be separated by a distance 2R.

In order to utilize the coherent state inequalities later on in Step 3 we have to extend our $U_R$ to an ordinary two-body potential, i.e., we have to be able to drop the 2R separation requirement. To do so will require an estimation of the amplitude (in the exact, original ground state wave function) of finding three or more particles within a distance 2R of each other. Clearly, this amplitude is small, but we find that we have to resort to path integrals (or, more precisely, the Trotter product formula) to estimate it. This will be done in Step 4 below.

As an immediate corollary of Lemma 1 we can omit the condition $|y_i - y_j| \ge 2R$ and replace (19) by
\[
-\nabla \chi(p)^2 \nabla + \frac12 \sum_{i=1}^{n} v_N(x - y_i) \ge \sum_{i=1}^{n} \Big[ (1 - \varepsilon) \frac{a}{N}\, U_R(x - y_i) - \frac{a}{N \varepsilon}\, w_R(x - y_i) \Big] \prod_{k \ne i} \theta(|y_k - y_i| - 2R) \tag{20}
\]
for any set of points $y_j \in \mathbb{R}^3$. Here, θ denotes the Heaviside step function, given by θ(t) = 1 if t ≥ 0 and θ(t) = 0 if t < 0. That is to say, if there are only n′ < n of the $y_i$ that are a distance ≥ 2R from all the other $y_k$ then we simply apply (19) to these n′ coordinates. The right side of (20) does not contain the other values of i because the θ factor vanishes for those. The left side does contain these unwanted $y_i$ but, since $v_N$ is non-negative, this does no harm to the inequality (20).

We apply (20) to each particle, considering the other M − 1 particles as fixed, and obtain
\[
H_{M,N} \ge \sum_{i=1}^{M} \Big[ -\nabla_i \big( 1 - \chi(p_i)^2 \big) \nabla_i + 2 p_i \cdot A(x_i) + A(x_i)^2 + V(x_i) \Big]
+ \sum_{i=1}^{M} \sum_{j \ne i} \Big[ (1 - \varepsilon)\, a N^{-1} U_R(x_i - x_j) - a (N \varepsilon)^{-1} w_R(x_i - x_j) \Big] \prod_{k \ne i, j} \theta(|x_k - x_j| - 2R). \tag{21}
\]
For the negative part of the interaction (containing $w_R$), we can simply use θ ≤ 1 for a lower bound. For the positive part (containing $U_R$), we will use the fact that
\[
\prod_{k \ne i, j} \theta(|x_k - x_j| - 2R) \ge 1 - \sum_{k \ne i, j} \theta(2R - |x_k - x_j|), \tag{22}
\]
which follows from the simple inequality $\prod_j (1 - s_j) \ge 1 - \sum_j s_j$ when $0 \le s_j \le 1$ for all j.

We now use (21) and (22) in the following way. We begin by defining a new M-particle Hamiltonian, K, by
\[
K = \sum_{i=1}^{M} K_0^{(i)} + \sum_{1 \le i < j \le M} 2 (1 - \varepsilon)\, a N^{-1} U_R(x_i - x_j), \tag{23}
\]
where $K_0$ is a one-body Hamiltonian to be described next. If $K_0$ were simply $(-i\nabla + A)^2 + V$ then (23) would be the conventional Hamiltonian with two-body interaction $2 U_R$. (The factor 2 arises because each pair i, j appears twice in (21).) Unfortunately, $K_0$ has to be a little more complicated because we used up part of the kinetic energy in replacing $v_N$ by $U_R$ via Lemma 1. Pick some η > 0, and let
\[
K_0 = -\nabla \big( 1 - \chi(p)^2 \big) \nabla - 2\eta \Delta + 2 p \cdot A(x) + A(x)^2 + V(x) + \eta |x|^4 - \kappa(\eta). \tag{24}
\]
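The elementary inequality invoked for (22), $\prod_j (1 - s_j) \ge 1 - \sum_j s_j$ for $0 \le s_j \le 1$, can be spot-checked numerically. The sketch below uses a hypothetical helper name and a small floating-point tolerance (an assumption of this illustration, not part of the inequality); the proof itself is a one-line induction on the number of factors.

```python
def product_bound_holds(s, tol=1e-12):
    """Check prod_j (1 - s_j) >= 1 - sum_j s_j for a list s with entries in [0, 1]."""
    prod = 1.0
    for v in s:
        prod *= (1.0 - v)
    return prod >= 1.0 - sum(s) - tol
```

In (22) each $s_k$ is the indicator that particle k lies within distance 2R of particle j, so the bound replaces the product of step functions by one minus the number of close neighbors.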
The constant κ(η) is chosen so that $K_0 > 0$. It is a matter of convenience to include it in the definition of $K_0$. It is defined by
\[
\kappa(\eta) = \operatorname{inf\,spec} \big[ -\eta \Delta + 2 p \cdot A(x) + \eta |x|^4 \big]. \tag{25}
\]
The reason for adding the terms $-2\eta\Delta$ and $\eta |x|^4$ to $K_0$ is to ensure that $K_0$ is bounded from below and has compact resolvent, and so that κ(η) is finite. (Note: the exponent 4 in $|x|^4$ could be replaced by any exponent > 2 for our purposes. This is due to the fact that we have a vector potential A(x) in mind that is bounded by (const.)|x|, as in the case of pure rotation. If this is not so (because an external magnetic field has been added) some polynomial of higher order than $|x|^4$ could be needed, but our analysis would continue to go through.) Since there is a 2η in (24) and not just η we have that $K_0 \ge -\eta\Delta + V(x) \ge -\eta\Delta \ge 0$, since V(x) ≥ 0 by assumption. This will be convenient later.

Let $\langle \cdot \rangle_B$ denote the M-particle bosonic ground state expectation for the original Hamiltonian $H_{M,N}$. Actually, it is convenient to take the zero temperature limit of the Gibbs state, which means that in case of a ground state degeneracy of $H_{M,N}$, we would take $\langle \cdot \rangle_B$ to be the uniform average over all ground states. Then $E_0(M, N) = \langle H_{M,N} \rangle_B$ and we have, therefore, using (21)–(25),
\[
E_0(M, N) \ge \operatorname{inf\,spec} K + M \kappa(\eta) - \eta M \langle |x_1|^4 \rangle_B - 2 \eta M \langle -\Delta_1 \rangle_B - \frac{M^2 a}{N \varepsilon} \langle w_R(x_1 - x_2) \rangle_B - \frac{a M^3}{N} \langle U_R(x_1 - x_2)\, \theta(2R - |x_2 - x_3|) \rangle_B. \tag{26}
\]
(Note: We made use of the bosonic symmetry to replace $\sum_i \Delta_i$ by $M \Delta_1$, for example.) The term $\langle -\Delta_1 \rangle_B$ can be bounded as follows. We have $p^2 \le 2 (p + A)^2 + 2 A^2$, and hence, using positivity of the interaction potential $v_N$,
\[
M \langle -\Delta_1 \rangle_B \le 2 E_0(M, N) + \tfrac12 |\Omega|^2 M \langle |x_1|^2 \rangle_B. \tag{27}
\]
To prove Proposition 1 we have to bound the various terms in (26) and (27), and that is what we do in the following steps. The main term to bound is $\inf\,\mathrm{spec}\,K$, the ground state energy of the ‘effective Hamiltonian’ (23). The momentum cutoff $\chi(p)$ in (24) will be chosen as follows. Let $\ell(p)$ be an infinitely differentiable, spherically symmetric function with $\ell(p) = 0$ for $|p| \le 1$, $\ell(p) = 1$ for $|p| \ge 2$ and $0 \le \ell(p) \le 1$ in between. Then, for some adjustable parameter $s$ to be determined later, we choose
$$\chi(p) = \ell(sp) . \tag{28}$$
The potential $w_R(x)$ defined in (18) is then a smooth and rapidly decreasing function that depends only on the ratio $R/s$. It is easy to see that
$$\int_{\mathbb{R}^3} w_R(x)\,dx \le \text{const.}\,\frac{R^2}{s^2} \tag{29}$$
as long as $R \le \text{const.}\, s$. We will, in fact, choose $R \ll s$. Finally, we are still free to make a choice for the function $U_R(x)$ in Lemma 1. We choose it to be a ‘hat’ function:
$$U_R(x) = \begin{cases} 6R^{-3} & R \ge |x| \ge 2^{-1/3}R \\ 0 & \text{otherwise,} \end{cases} \tag{30}$$
assuming that $R \ge 2^{1/3} R_0/N$, a condition that will be amply satisfied by our choice $N^{-1/3} \gg R \gg N^{-2/3}$ later on. We remark that the exact form of $U_R(x)$ is unimportant in what is to come. We will need only the properties that $\int U_R(x)\,dx = 4\pi$ and that $\|U_R\|_\infty \le \text{const.}\, R^{-3}$ for $R \gg R_0/N$.
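The normalization $\int U_R(x)\,dx = 4\pi$ of the hat function (30) can be checked directly. The following is a minimal sketch, not part of the original argument; the function names and the midpoint quadrature are illustrative choices.

```python
import math

def hat_potential(r, R):
    """U_R at radius r = |x|: 6 R^-3 on the shell 2^(-1/3) R <= r <= R, else 0."""
    inner = 2.0 ** (-1.0 / 3.0) * R
    return 6.0 / R**3 if inner <= r <= R else 0.0

def integral_exact(R):
    """Closed form: height of the hat times the volume of the spherical shell."""
    inner = 2.0 ** (-1.0 / 3.0) * R
    shell_volume = 4.0 * math.pi / 3.0 * (R**3 - inner**3)
    return 6.0 / R**3 * shell_volume

def integral_numeric(R, n=200_000):
    """Midpoint rule for the radial integral of 4 pi r^2 U_R(r) over [0, R]."""
    h = R / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += 4.0 * math.pi * r * r * hat_potential(r, R) * h
    return total
```

Both routines should agree with $4\pi$ for any $R > 0$, since the shell between $2^{-1/3}R$ and $R$ has exactly half the volume of the ball of radius $R$.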
Step 3. Coherent state method for the ground state. We begin our analysis of (26) by bounding the main term, $\inf\,\mathrm{spec}\,K$. This will be done with the aid of coherent states, exploiting ideas in [18], and is, perhaps, the most methodologically novel part of our work. The one-body operator $K_0$ has purely discrete spectrum and can be written in terms of its eigenvalues $e_j$ and orthonormal eigenfunctions $|\varphi_j\rangle$ as $K_0 = \sum_{j\ge 1} e_j |\varphi_j\rangle\langle\varphi_j|$. Recall that $K_0 \ge -\eta\Delta + V(x) \ge -\eta\Delta \ge 0$, so $e_j > 0$. We assume the sequence $e_j$ to be ordered, i.e., $e_{j+1} \ge e_j$ for all $j$. For simplicity, we introduce the notation
$$W(x_1 - x_2) \equiv (1-\varepsilon)\, a N^{-1}\, U_R(x_1 - x_2) . \tag{31}$$
The well known second quantization formalism involves the operators $a_j^\dagger$ and $a_j$ which are the creation and annihilation operators of a particle in the state $|\varphi_j\rangle$. They satisfy the usual canonical commutation relations $[a_i, a_j^\dagger] = \delta_{ij}$, etc. The second quantized version of (23) is
$$\hat K = \sum_{j\ge 1} e_j a_j^\dagger a_j + \sum_{ijkl} a_i^\dagger a_j^\dagger a_k a_l\, W_{ijkl} , \tag{32}$$
where $W_{ijkl} = \langle\varphi_i\otimes\varphi_j|W|\varphi_k\otimes\varphi_l\rangle$. The operator $\hat K$ acts on the bosonic Fock space $\mathcal{F}$, consisting of a direct sum over all particle number sectors. We are interested in a lower bound to the ground state energy of $\hat K$ in the sector of particle number $M$. Hence we can add a term $(\sum_j a_j^\dagger a_j - M)^2$ to $\hat K$ without changing this energy. We can then look for a lower bound irrespective of particle number. I.e., for any $C \ge 0$, we have that $\inf\,\mathrm{spec}\,K$ for $M$ particles is $\ge \inf\,\mathrm{spec}\,\mathbb{K}$ on the full Fock space, where
$$\mathbb{K} \equiv \sum_{j\ge 1} e_j a_j^\dagger a_j + \sum_{ijkl} a_i^\dagger a_j^\dagger a_k a_l\, W_{ijkl} + \frac{C}{M}\Big(\sum_{j\ge 1} a_j^\dagger a_j - M\Big)^2 . \tag{33}$$
The choice of $C$ will be made later. The Fock space $\mathcal{F}$ can be thought of as the tensor product of the Fock spaces generated by each mode $\varphi_j$. We choose some integer $J \gg 1$ (to be determined later) and split the Fock space into two parts, namely $\mathcal{F} = \mathcal{F}^< \otimes \mathcal{F}^>$, where $\mathcal{F}^<$ is the tensor product of the Fock spaces generated by all the modes $\varphi_j$ with $j \le J$ and where $\mathcal{F}^>$ is that generated by all the other modes. Next, we introduce coherent states [10] for all the modes $j \le J$. (By coherent states we mean ordinary canonical Schrödinger, Bargmann, Glauber, coherent states.) The modes with $j > J$ will not be omitted, but they will be treated differently from the $j \le J$ modes. Let $z = (z_1, \dots, z_J)$ denote a vector in $\mathbb{C}^J$. Let also $\Pi(z)$ denote the projection onto the coherent state $|z_1\otimes\cdots\otimes z_J\rangle \in \mathcal{F}^<$. The symbol $|z_1\otimes\cdots\otimes z_J\rangle$ is shorthand for $|z_1\rangle\otimes|z_2\rangle\otimes\cdots\otimes|z_J\rangle$, and $|z_j\rangle$ denotes the coherent state for the $j$th mode given by $|z_j\rangle = \exp[-|z_j|^2/2 + z_j a_j^\dagger]\,|\text{vacuum}\rangle$. The Hamiltonian $\mathbb{K}$ in (33) can now be written as
$$\mathbb{K} = \int dz\; \Pi(z)\otimes U(z) , \tag{34}$$
where $U(z)$ is an operator acting on $\mathcal{F}^>$. The operator $U(z)$ depends on $z$ since it is also an upper symbol for the modes $j \le J$. The integration measure is $dz = \pi^{-J}\prod_{j\le J} dx_j\,dy_j$ with $z_j = x_j + i y_j$. This is discussed in [18, 10]. As an example, the upper symbol for
$a_j^\dagger$ is $\bar z_j$ and for $a_j$ it is $z_j$, but for $a_j^\dagger a_j$ it is $|z_j|^2 - 1$. Thus, to a term such as $a_i^\dagger a_j^\dagger a_k a_l$ with $i, j \le J$ and $k, l > J$ would correspond the upper symbol operator $\bar z_i \bar z_j a_k a_l$. It is easier to compute the lower symbol (which is denoted by $u(z)$) than the upper symbol $U(z)$. It is obtained simply by replacing $a_j^\dagger$ by $\bar z_j$ and $a_j$ by $z_j$ in all (normal-ordered) polynomials, even in higher polynomials such as $a_j^\dagger a_j$ or $a_j^\dagger a_j^\dagger a_j a_j$. An equivalent definition of the lower symbol of any polynomial $\mathcal{P}$ in the $a_j$'s and $a_j^\dagger$'s (normal-ordered or not) is the expectation value $u(z) = \langle z_1\otimes\cdots\otimes z_J|\mathcal{P}|z_1\otimes\cdots\otimes z_J\rangle$. In the case considered here, $u(z) = \langle z_1\otimes\cdots\otimes z_J|\mathbb{K}|z_1\otimes\cdots\otimes z_J\rangle$. The lower symbol is useful because the upper symbol can conveniently be obtained from it as [10]
$$U(z) = e^{-\partial_z\partial_{\bar z}}\, u(z) = u(z) - \partial_z\partial_{\bar z} u(z) + \tfrac12 (\partial_z\partial_{\bar z})^2 u(z) , \tag{35}$$
where $\partial_z\partial_{\bar z} = \sum_{j\le J}\partial_{z_j}\partial_{\bar z_j}$. (In the general case there would be higher order derivatives on the right side of (35), but not in our case since $u(z)$ is a polynomial of order four.) Note that (34) implies that
$$\inf\,\mathrm{spec}\,\mathbb{K} \ge \inf_z \left(\inf\,\mathrm{spec}\,U(z)\right) , \tag{36}$$
since $\int dz\,\Pi(z) = \mathbb{I}_{\mathcal{F}^<}$ and $\Pi(z)\otimes U(z) \ge \inf\,\mathrm{spec}\,U(z)\;\Pi(z)$. Our goal in the rest of this subsection is to derive a lower bound to $\inf\,\mathrm{spec}\,U(z)$ for a fixed $z$. The reader might wonder why we use coherent states only for modes $j \le J$ and not for all modes. The reason is that the upper symbol for the operator $e_j a_j^\dagger a_j$ is $e_j(|z_j|^2 - 1)$, and the $-1$ term is a term that we do not want when minimizing for a fixed $z$. We make an error in the energy of the form $-\sum_{j\le J} e_j$ and for this reason we cannot take $J = \infty$. But we can, and will, let $J \to \infty$ as $N \to \infty$.

3a. Lower bound on the lower symbol u(z). In order to derive a lower bound to $U(z)$ and the bottom of its spectrum, we start by deriving a lower bound to the lower symbol $u(z)$, which is the first term in (35). This symbol can be conveniently expressed in terms of the function $\Phi_z \in L^2(\mathbb{R}^3)$, parametrized by $z \in \mathbb{C}^J$, given by
$$\Phi_z(x) = \sum_{1\le j\le J} z_j \varphi_j(x) . \tag{37}$$
Note that $\|\Phi_z\|_2^2 = \sum_{j\le J}|z_j|^2$. Denoting $T \equiv \sum_{k>J} e_k a_k^\dagger a_k$, we have
$$\Big\langle z_1\otimes\cdots\otimes z_J\Big|\sum_j e_j a_j^\dagger a_j\Big| z_1\otimes\cdots\otimes z_J\Big\rangle = \sum_{j\le J} e_j |z_j|^2 + T = \langle\Phi_z|K_0|\Phi_z\rangle + T . \tag{38}$$
There is a mild abuse of notation here, which will continue for the rest of this paper, and which we hope will not cause any confusion. The operator $\sum_j e_j a_j^\dagger a_j$ acts on $\mathcal{F}$ while the vector $|z_1\otimes\cdots\otimes z_J\rangle$ is in $\mathcal{F}^<$, so the left side of (38) defines an operator on $\mathcal{F}^>$ in an obvious way (actually, it defines a quadratic form). The right side must also be an operator on $\mathcal{F}^>$, and it is so if the number $\langle\Phi_z|K_0|\Phi_z\rangle$ is regarded as a number times the identity on $\mathcal{F}^>$.
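The rule (35) for passing from the lower to the upper symbol can be illustrated for a single mode. The sketch below is not taken from the paper: it represents a polynomial in $\bar z$ and $z$ as a dictionary mapping the exponent pair $(m, n)$ to the coefficient of $\bar z^m z^n$ (a hypothetical encoding chosen for brevity) and applies the finite series $e^{-\partial_z\partial_{\bar z}}$.

```python
from fractions import Fraction

def laplace(poly):
    """Apply d/dz d/dzbar to a polynomial {(m, n): c}, meaning c * zbar^m * z^n."""
    out = {}
    for (m, n), c in poly.items():
        if m > 0 and n > 0:
            key = (m - 1, n - 1)
            out[key] = out.get(key, 0) + c * m * n
    return out

def upper_symbol(lower):
    """Return exp(-d_z d_zbar) applied to the lower symbol (a finite sum)."""
    result = dict(lower)
    term = dict(lower)
    k = 0
    while term:
        k += 1
        # k-th term of the exponential series: (-1)^k / k! * (d_z d_zbar)^k u
        term = {key: Fraction(-c, k) for key, c in laplace(term).items()}
        for key, c in term.items():
            result[key] = result.get(key, 0) + c
    return {key: c for key, c in result.items() if c != 0}

# Lower symbol |z|^2 of a^dagger a gives the upper symbol |z|^2 - 1.
number_operator = upper_symbol({(1, 1): 1})
# Lower symbol |z|^4 of a^dagger a^dagger a a gives |z|^4 - 4|z|^2 + 2.
quartic = upper_symbol({(2, 2): 1})
```

The first result reproduces the $-1$ discussed in the text for $a_j^\dagger a_j$; the second shows that a quartic term picks up exactly the two correction terms appearing in (35).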
Similarly, with $N^> \equiv \sum_{j>J} a_j^\dagger a_j$ denoting the number of particles in the modes greater than $J$,
$$\Big\langle z_1\otimes\cdots\otimes z_J\Big|\Big(\sum_j a_j^\dagger a_j - M\Big)^2\Big| z_1\otimes\cdots\otimes z_J\Big\rangle = \big(N^> - M + \|\Phi_z\|_2^2\big)^2 + \|\Phi_z\|_2^2 \ge \big(\|\Phi_z\|_2^2 - M\big)^2 - 2 e_J^{-1} M T . \tag{39}$$
Here, we used the normal ordering
$$\Big[\sum_{j\le J} a_j^\dagger a_j\Big]^2 = \sum_{i\le J}\sum_{j\le J} a_i^\dagger a_j^\dagger a_i a_j + \sum_{j\le J} a_j^\dagger a_j ,$$
followed by the elementary bound $N^> \le T/e_J$.
The interaction part of $u(z)$ is obtained by replacing $a_j$ by $z_j$ and $a_j^\dagger$ by $\bar z_j$ when $j \le J$. We will now derive a lower bound on this term. It is convenient to introduce the notation
$$I(\Phi_z) = \int dx\,dy\; |\Phi_z(x)|^2 |\Phi_z(y)|^2\, W(x-y) . \tag{40}$$
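Since $W \ge 0$, it is possible to neglect the interaction between modes $> J$ for a lower bound. More precisely, let $P = \sum_{1\le i\le J}|\varphi_i\rangle\langle\varphi_i|$ and $Q = 1 - P$. The two-body operator $W(x-y)$ is then bounded from below by
$$\begin{aligned} W &= ((P+Q)\otimes(P+Q))\,W\,((P+Q)\otimes(P+Q)) \\ &\ge (P\otimes P)W(P\otimes P) + (P\otimes P)W(P\otimes Q + Q\otimes P + Q\otimes Q) + (P\otimes Q + Q\otimes P + Q\otimes Q)W(P\otimes P) , \end{aligned} \tag{41}$$
since the missing term on the right side of (41) is $(Q\otimes Q + P\otimes Q + Q\otimes P)W(Q\otimes Q + P\otimes Q + Q\otimes P) \ge 0$. We thus have that
$$\begin{aligned} \Big\langle z_1\otimes\cdots\otimes z_J\Big|\sum_{ijkl} a_i^\dagger a_j^\dagger a_k a_l W_{ijkl}\Big| z_1\otimes\cdots\otimes z_J\Big\rangle \ge{}& I(\Phi_z) + \sum_{kl>J}\Big[\langle\Phi_z\otimes\Phi_z|W|\varphi_k\otimes\varphi_l\rangle\, a_k a_l + \langle\varphi_k\otimes\varphi_l|W|\Phi_z\otimes\Phi_z\rangle\, a_k^\dagger a_l^\dagger\Big] \\ &+ 2\sum_{k>J}\langle\Phi_z\otimes\Phi_z|W|\Phi_z\otimes\varphi_k\rangle\, a_k + 2\sum_{k>J}\langle\Phi_z\otimes\varphi_k|W|\Phi_z\otimes\Phi_z\rangle\, a_k^\dagger . \end{aligned} \tag{42}$$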
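Here we used that $W$ is symmetric, implying that in the last line we could replace $|\Phi_z\otimes\varphi_k + \varphi_k\otimes\Phi_z\rangle$ by $2|\Phi_z\otimes\varphi_k\rangle$. We seek a lower bound to the last two expressions in (42). Note that, for a general operator $A$, $|A + A^\dagger|^2 = A^2 + A^{\dagger 2} + AA^\dagger + A^\dagger A \le 2A^\dagger A + 2AA^\dagger$ by Schwarz's inequality, and so
$$(A + A^\dagger)^2 \le 4|A|^2 + 2[A, A^\dagger] . \tag{43}$$
We apply this to the second line in (42), with $A = \sum_{kl>J} c_{kl}\, a_k a_l$ and $c_{kl} = \langle\Phi_z\otimes\Phi_z|W|\varphi_k\otimes\varphi_l\rangle$. The commutator is
$$[A, A^\dagger] = 4\sum_{klm>J} a_k^\dagger a_l\, \overline{c_{km}}\, c_{lm} + 2\sum_{kl>J} |c_{kl}|^2 . \tag{44}$$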
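One way to see the operator inequality (43) is that the difference of its two sides, with $|A|^2 = A^\dagger A$, equals $(A - A^\dagger)(A - A^\dagger)^\dagger$, which is manifestly positive. The sketch below checks this identity on a small complex matrix; the helper names and the sample matrix are illustrative only and do not appear in the paper.

```python
def dagger(X):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * X[i][j] for c, X in terms) for j in range(n)]
            for i in range(n)]

def gap_43(A):
    """Right side minus left side of (43): 4 A^dag A + 2 [A, A^dag] - (A + A^dag)^2."""
    Ad = dagger(A)
    S = lincomb([(1, A), (1, Ad)])
    return lincomb([(4, matmul(Ad, A)), (2, matmul(A, Ad)),
                    (-2, matmul(Ad, A)), (-1, matmul(S, S))])

def gram_of_difference(A):
    """(A - A^dag)(A - A^dag)^dag, a positive semidefinite matrix."""
    B = lincomb([(1, A), (-1, dagger(A))])
    return matmul(B, dagger(B))

A = [[1 + 2j, 0.5 - 1j], [3j, -2 + 0.25j]]
max_diff = max(abs(g - h)
               for row_g, row_h in zip(gap_43(A), gram_of_difference(A))
               for g, h in zip(row_g, row_h))
```

Here `max_diff` vanishes up to rounding, so the gap in (43) is a Gram matrix and therefore non-negative.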
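The last term in (44) is bounded by
$$\sum_{kl>J} |c_{kl}|^2 \le \sum_{kl\ge 1} |c_{kl}|^2 = \langle\Phi_z\otimes\Phi_z|W^2|\Phi_z\otimes\Phi_z\rangle \le \|W\|_\infty\, I(\Phi_z) . \tag{45}$$
The first term on the right side of (44) can be bounded as
$$\sum_{klm>J} a_k^\dagger a_l\, \overline{c_{km}}\, c_{lm} \le \sum_{m\ge 1}\sum_{kl>J} a_k^\dagger a_l\, \overline{c_{km}}\, c_{lm} \le \frac{4}{27\pi^4}\,\eta^{-1}\,\|W\|_1^2\,\|\nabla\Phi_z\|_2^4\; T . \tag{46}$$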
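This can be seen as follows. The integral kernel $\sigma$ of the one-particle operator defined by the matrix $\sum_{m\ge 1}\overline{c_{km}}\, c_{lm}$ is given by
$$\sigma(x, x') = \int dy\, |\Phi_z(y)|^2\, W(x-y)\, W(x'-y)\, \Phi_z(x)\,\overline{\Phi_z(x')} . \tag{47}$$
Using Young's and Schwarz's inequalities, we have, for any function $f$ on $\mathbb{R}^3$,
$$\begin{aligned} \int dx\,dx'\; \overline{f(x)}\, f(x')\, \sigma(x,x') &\le \int dx\,dx'\,dy\; |\Phi_z(y)|^2\, W(x-y)\, W(x'-y)\, |\Phi_z(x) f(x)|^2 \\ &\le \|W\|_1^2\, \|\Phi_z\|_6^2\, \|\Phi_z f\|_3^2 \le \|W\|_1^2\, \|\Phi_z\|_6^4\, \|f\|_6^2 . \end{aligned} \tag{48}$$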
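Hence (46) follows by applying the Sobolev inequality $\|f\|_6^2 \le (4/3)(2\pi^2)^{-2/3}\|\nabla f\|_2^2$ both to $f$ and to $\Phi_z$, and using the fact that $-\Delta \le \eta^{-1}K_0$. To get an upper bound on $|A|^2$ we use Schwarz's inequality again to obtain
$$|A|^2 \le \sum_{kl>J}\frac{|c_{kl}|^2}{e_k e_l}\;\sum_{mn>J} e_m e_n\, a_m^\dagger a_n^\dagger a_m a_n \tag{49}$$
for any sequence of positive numbers $e_j$. We choose $e_j$ to be the eigenvalues of $K_0$, in which case
$$\sum_{mn>J} e_m e_n\, a_m^\dagger a_n^\dagger a_m a_n \le \Big(\sum_{k>J} e_k a_k^\dagger a_k\Big)^2 = T^2 . \tag{50}$$
Moreover,
$$\sum_{kl>J}\frac{|c_{kl}|^2}{e_k e_l} = \Big\langle\Phi_z\otimes\Phi_z\Big|\, W\; \frac{Q}{K_0}\otimes\frac{Q}{K_0}\; W\,\Big|\Phi_z\otimes\Phi_z\Big\rangle . \tag{51}$$
We have the following two operator inequalities; the first comes from the fact that $K_0 \ge e_J$ on the range of the projector $Q$ and the second comes from $K_0 \ge -\eta\Delta$:
$$\frac{Q}{K_0} \le \frac{2}{K_0 + e_J} \le \frac{2}{-\eta\Delta + e_J} . \tag{52}$$
Denoting the integral kernel of $(-\Delta + \mu)^{-1}$ by
$$k_\mu(x - x') = \frac{1}{4\pi}\,\frac{e^{-\sqrt{\mu}\,|x-x'|}}{|x-x'|} , \tag{53}$$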
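we see that (51) is bounded above by
$$\begin{aligned} &\frac{4}{\eta^2}\int dx\,dy\,dx'\,dy'\; \overline{\Phi_z(x)\Phi_z(y)}\, W(x-y)\, k_{e_J/\eta}(x-x')\, k_{e_J/\eta}(y-y')\, \Phi_z(x')\Phi_z(y')\, W(x'-y') \\ &\quad\le \frac{4}{\eta^2}\int dx\,dy\,dx'\,dy'\; |\Phi_z(x)|^2 |\Phi_z(y)|^2\, W(x-y)\, k_{e_J/\eta}(x-x')\, k_{e_J/\eta}(y-y')\, W(x'-y') \\ &\quad\le \frac{1}{2\pi\eta^{3/2}}\,\frac{\|W\|_1}{\sqrt{e_J}}\; I(\Phi_z) . \end{aligned} \tag{54}$$
Here, we used Young's inequality for the $(x', y')$ integration, as well as the fact that $\|k_\mu\|_2^2 = (8\pi\sqrt{\mu})^{-1}$. By putting all this together, we have that
$$(A + A^\dagger)^2 \le 4\|W\|_\infty\, I(\Phi_z) + \frac{32}{27\pi^4}\,\eta^{-1}\,\|W\|_1^2\,\|\nabla\Phi_z\|_2^4\; T + \frac{4}{2\pi\eta^{3/2}}\,\frac{\|W\|_1}{\sqrt{e_J}}\; I(\Phi_z)\, T^2 . \tag{55}$$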
Since the square root preserves operator monotonicity, we can take the square root on both sides of (55). By the triangle inequality, we can take the sum of the square roots of each term on the right side. Finally, applying the Schwarz inequality to the first and third term, we conclude that, for any $\delta > 0$,
$$|A + A^\dagger| \le \delta\, I(\Phi_z) + \frac{1}{\delta}\,\|W\|_\infty + \frac{4}{\pi^2}\sqrt{\frac{2}{27}}\;\eta^{-1/2}\,\|W\|_1\,\|\nabla\Phi_z\|_2^2\,\sqrt{T} + e_J^{-1/4}\Big[\|W\|_1\, I(\Phi_z) + \big(2\pi\eta^{3/2}\big)^{-1}\Big]\, T . \tag{56}$$
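We now proceed similarly with the last term in (42) which is linear in $a_k$ and $a_k^\dagger$. Denoting $c_k = \langle\Phi_z\otimes\Phi_z|W|\Phi_z\otimes\varphi_k\rangle$, we have that
$$\Big(\sum_{k>J}\big[c_k a_k + \overline{c_k}\, a_k^\dagger\big]\Big)^2 \le 4\sum_{k>J}\frac{|c_k|^2}{e_k}\;\sum_{k>J} e_k a_k^\dagger a_k + 2\sum_{k>J}|c_k|^2 . \tag{57}$$
Using Hölder's and Sobolev's inequality,
$$\begin{aligned} \sum_{k\ge 1}|c_k|^2 &= \int dx\,dy\,dz\; |\Phi_z(x)|^2 |\Phi_z(y)|^2 |\Phi_z(z)|^2\, W(x-y)\, W(x-z) \\ &\le \|W\|_{3/2}\, \|\Phi_z\|_6^2\, I(\Phi_z) \le \frac{4}{3(2\pi^2)^{2/3}}\,\|W\|_\infty^{1/3}\,\|W\|_1^{2/3}\,\|\nabla\Phi_z\|_2^2\, I(\Phi_z) . \end{aligned} \tag{58}$$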
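Moreover, using (52) again, together with Young's and Sobolev's inequalities, as well as the fact that the $3/2$ norm of $k_\mu$ is given by $2^{-1/3}\mu^{-1/2}/3$, we find that
$$\begin{aligned} \sum_{k>J}\frac{|c_k|^2}{e_k} &\le \frac{2}{\eta}\int dx\,dy\,dx'\,dy'\; \overline{\Phi_z(x)}\,|\Phi_z(x')|^2\, W(x-x')\, k_{e_J/\eta}(x-y)\, \Phi_z(y)\,|\Phi_z(y')|^2\, W(y-y') \\ &\le \frac{2}{\eta}\int dx\,dy\,dx'\,dy'\; |\Phi_z(x)|^2 |\Phi_z(x')|^2\, W(x-x')\, k_{e_J/\eta}(x-y)\, |\Phi_z(y')|^2\, W(y-y') \\ &\le \frac{4}{9\pi^{4/3}}\,\eta^{-1/2}\,\frac{\|W\|_1}{\sqrt{e_J}}\,\|\nabla\Phi_z\|_2^2\, I(\Phi_z) . \end{aligned} \tag{59}$$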
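The two norms of the kernel $k_\mu$ from (53) that enter (54) and (59) can be confirmed numerically. The following sketch is only an illustration: it reduces each norm to a radial integral and evaluates it with a midpoint rule, where the cutoff radius and the number of grid points are ad hoc choices.

```python
import math

def yukawa_norm_sq_2(mu, n=200_000):
    """Midpoint-rule value of the squared L^2 norm of k_mu."""
    s = math.sqrt(mu)
    h = (40.0 / s) / n
    # radial integrand: 4 pi r^2 |k_mu(r)|^2 = exp(-2 s r) / (4 pi)
    total = sum(math.exp(-2.0 * s * (k + 0.5) * h) for k in range(n)) * h
    return total / (4.0 * math.pi)

def yukawa_norm_3_2(mu, n=200_000):
    """Midpoint-rule value of the L^(3/2) norm of k_mu."""
    s = math.sqrt(mu)
    h = (40.0 / s) / n
    # radial integrand: 4 pi r^2 k_mu(r)^(3/2) = sqrt(r) exp(-1.5 s r) / sqrt(4 pi)
    total = sum(math.sqrt((k + 0.5) * h) * math.exp(-1.5 * s * (k + 0.5) * h)
                for k in range(n)) * h
    return (total / math.sqrt(4.0 * math.pi)) ** (2.0 / 3.0)

mu = 2.5  # arbitrary sample value
expected_sq_2 = 1.0 / (8.0 * math.pi * math.sqrt(mu))
expected_3_2 = 2.0 ** (-1.0 / 3.0) / (3.0 * math.sqrt(mu))
```

The numerical values should match `expected_sq_2` and `expected_3_2`, i.e. the closed forms $(8\pi\sqrt{\mu})^{-1}$ and $2^{-1/3}\mu^{-1/2}/3$ quoted in the text.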
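This implies that
$$\begin{aligned} \Big(\sum_{k>J}\big[c_k a_k + \overline{c_k}\, a_k^\dagger\big]\Big)^2 &\le \frac{8}{3(2\pi^2)^{2/3}}\,\|W\|_\infty^{1/3}\,\|W\|_1^{2/3}\,\|\nabla\Phi_z\|_2^2\, I(\Phi_z) + \frac{16}{9\pi^{4/3}}\,\eta^{-1/2}\,\frac{\|W\|_1}{\sqrt{e_J}}\,\|\nabla\Phi_z\|_2^2\, I(\Phi_z)\; T \\ &\le \left[\sqrt{\frac{2}{3}}\,\frac{1}{(2\pi^2)^{1/3}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{1/3} + \frac{2}{3\pi^{2/3}}\,\eta^{-1/4}\,\|W\|_1^{1/2}\, e_J^{-1/4}\,\sqrt{T}\right]^2 \Big(\|\nabla\Phi_z\|_2^2 + I(\Phi_z)\Big)^2 , \end{aligned} \tag{60}$$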
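again using the triangle and the Schwarz inequality. As mentioned above, operator monotonicity is preserved by the square root, and hence we can take the square root on both sides of Eq. (60). This completes the lower bound on the lower symbol $u(z)$. For the convenience of the reader, we repeat the bound just derived:
$$\begin{aligned} u(z) \ge{}& \langle\Phi_z|K_0|\Phi_z\rangle + I(\Phi_z)\Big[1 - \delta - e_J^{-1/4}\,\|W\|_1\, T\Big] + \frac{C}{M}\Big(\|\Phi_z\|_2^2 - M\Big)^2 \\ &- \Big(\|\nabla\Phi_z\|_2^2 + I(\Phi_z)\Big)\left[2\sqrt{\frac{2}{3}}\,\frac{1}{(2\pi^2)^{1/3}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{1/3} + \frac{4}{3\pi^{2/3}}\,\eta^{-1/4}\,\|W\|_1^{1/2}\, e_J^{-1/4}\,\sqrt{T}\right] \\ &- \|\nabla\Phi_z\|_2^2\;\frac{4}{\pi^2}\sqrt{\frac{2}{27}}\;\eta^{-1/2}\,\|W\|_1\,\sqrt{T} - \frac{1}{\delta}\,\|W\|_\infty \\ &+ T\Big[1 - e_J^{-1/4}\big(2\pi^2\eta^{3/2}\big)^{-1} - 2C e_J^{-1}\Big] . \end{aligned} \tag{61}$$
We note that in the following we will choose $J$ large enough so that the last term in (61) is positive and thus can be neglected for a lower bound. (Recall that $e_J \to \infty$ as $J \to \infty$.)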
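3b. Lower bound on the remaining terms in U(z). A lower bound on the first term on the right side of (35) is given in (61) and, therefore, to get a lower bound on the upper symbol $U(z)$, it remains to bound the last two terms on the right side of (35). The very last term is positive, as will be shown now, and can thus be neglected for a lower bound. Namely,
$$\begin{aligned} \tfrac12(\partial_z\partial_{\bar z})^2 u(z) &= \tfrac12(\partial_z\partial_{\bar z})^2\,\langle\Phi_z\otimes\Phi_z|W|\Phi_z\otimes\Phi_z\rangle + \frac{C}{M}\,\tfrac12(\partial_z\partial_{\bar z})^2\,\|\Phi_z\|_2^4 \\ &= \tfrac12\sum_{1\le i,j\le J}\langle\varphi_i\otimes\varphi_j + \varphi_j\otimes\varphi_i|W|\varphi_i\otimes\varphi_j + \varphi_j\otimes\varphi_i\rangle + \frac{C}{M}\, J(J+1) \ge 0 . \end{aligned} \tag{62}$$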
The remaining expression, $\partial_z\partial_{\bar z} u(z)$, consists of the following terms. First, from the one-body part (38) of the Hamiltonian we obtain a contribution $\sum_{j\le J} e_j$. Second, from the term (39) (see also (33)) that was introduced in order to control the particle number, we get
$$\frac{C}{M}\Big[\big(2N^> - 2M + 1 + 2\|\Phi_z\|_2^2\big) J + 2\|\Phi_z\|_2^2\Big] \le \frac{2C}{M}(J+1)\,\|\Phi_z\|_2^2 + \frac{JC}{M}\big(2N^> + 1\big) . \tag{63}$$
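Finally, the following three contributions are obtained from the interaction part. From the part where all four indices are $\le J$, we have
$$\begin{aligned} \sum_{j\le J}\langle\Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z|W|\Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z\rangle &\le 4\sum_{j\le J}\langle\Phi_z\otimes\varphi_j|W|\Phi_z\otimes\varphi_j\rangle \\ &\le 4\sum_{j\le J}\|W\|_1\,\|\Phi_z\|_6^2\,\|\varphi_j\|_3^2 \le (4/3)^{3/2}\,\frac{1}{2\pi^2}\,\eta^{-1/2}\,\|W\|_1\,\|\nabla\Phi_z\|_2^2\sum_{j\le J}\sqrt{e_j} . \end{aligned} \tag{64}$$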
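Here, we used the inequalities of Young, Hölder and Sobolev as well as the facts that $-\Delta \le \eta^{-1}K_0$ and $\langle\varphi_j|K_0|\varphi_j\rangle = e_j$ in the last step. From the term with 3 indices $\le J$, we get
$$2\sum_{j\le J}\sum_{k>J}\langle\Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z|W|\varphi_j\otimes\varphi_k\rangle\, a_k + \text{adjoint} . \tag{65}$$
Using (57), this time with $e_k \equiv 1$, (65) is bounded above, as an operator, by
$$4\left[\sum_{k>J}\Big|\sum_{j\le J}\langle\varphi_j\otimes\varphi_k|W|\Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z\rangle\Big|^2\,\Big(N^> + \tfrac12\Big)\right]^{1/2} . \tag{66}$$
Similarly to (58), we can derive the bound
$$\sum_{k\ge 1}\big|\langle\varphi_i\otimes\varphi_k|W|\Phi_z\otimes\varphi_i + \varphi_i\otimes\Phi_z\rangle\big|^2 \le 4\,\|W\|_{3/2}\,\|W\|_1\,\|\Phi_z\|_6^2\,\|\varphi_i\|_6^3\,\|\varphi_i\|_2 . \tag{67}$$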
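Since $\|\varphi_i\|_2 = 1$ and $\|\varphi_i\|_6^2 \le (4/3)(2\pi^2)^{-2/3}\|\nabla\varphi_i\|_2^2 \le (4/3)(2\pi^2)^{-2/3}\eta^{-1}e_i$, this implies that
$$\begin{aligned} (66) &\le 8(4/3)^{5/4}\,\frac{1}{(2\pi^2)^{5/6}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{5/6}\,\|\nabla\Phi_z\|_2\;\eta^{-3/4}\sum_{i\le J} e_i^{3/4}\,\sqrt{N^> + \tfrac12} \\ &\le 4(4/3)^{5/4}\,\frac{1}{(2\pi^2)^{5/6}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{5/6}\;\eta^{-3/4}\sum_{i\le J} e_i^{3/4}\,\Big(\|\nabla\Phi_z\|_2^2 + N^> + \tfrac12\Big) . \end{aligned} \tag{68}$$
Here, Schwarz's inequality was used in the last step. The last term to estimate is the one coming from 2 indices $\le J$, given by
$$\sum_{j\le J}\sum_{k,l>J}\langle\varphi_j\otimes\varphi_k + \varphi_k\otimes\varphi_j|W|\varphi_j\otimes\varphi_l + \varphi_l\otimes\varphi_j\rangle\, a_k^\dagger a_l \le (4/3)^{3/2}\,\frac{1}{2\pi^2}\,\eta^{-3/2}\,\|W\|_1\sum_{j\le J}\sqrt{e_j}\; T . \tag{69}$$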
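This inequality can be seen as follows. For any one-particle function $f$,
$$\begin{aligned} \langle\varphi_i\otimes f + f\otimes\varphi_i|W|\varphi_i\otimes f + f\otimes\varphi_i\rangle &\le 4\,\langle\varphi_i\otimes f|W|\varphi_i\otimes f\rangle \le 4\,\|W\|_1\,\|f\|_6^2\,\|\varphi_i\|_3^2 \\ &\le (4/3)^{3/2}\,\frac{1}{2\pi^2}\,\eta^{-1/2}\,\|W\|_1\,\|\nabla f\|_2^2\,\sqrt{e_i} . \end{aligned} \tag{70}$$
The last inequality is the same as in (64). The result now follows using $-\Delta \le \eta^{-1}K_0$. Altogether, we have thus shown that
$$\begin{aligned} \partial_z\partial_{\bar z} u(z) \le{}& \sum_{i\le J} e_i + \frac{2C}{M}(J+1)\,\|\Phi_z\|_2^2 + \frac{2CJ}{M}\Big(N^> + \tfrac12\Big) \\ &+ (4/3)^{3/2}\,\frac{1}{2\pi^2}\,\eta^{-1/2}\,\|W\|_1\,\Big(\|\nabla\Phi_z\|_2^2 + \eta^{-1}T\Big)\sum_{i\le J}\sqrt{e_i} \\ &+ 4(4/3)^{5/4}\,\frac{1}{(2\pi^2)^{5/6}}\,\|W\|_1^{5/6}\,\|W\|_\infty^{1/6}\,\eta^{-3/4}\sum_{i\le J} e_i^{3/4}\,\Big(\|\nabla\Phi_z\|_2^2 + N^> + \tfrac12\Big) . \end{aligned} \tag{71}$$
This finishes our lower bound on the upper symbol $U(z)$. To summarize, we have shown the following operator lower bound to the operator $U(z)$:
$$U(z) \ge \text{right side of (61)} - \text{right side of (71)} . \tag{72}$$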
3c. c-number bound on T. We are interested in the ground state energy of $U(z)$ for a fixed $z \in \mathbb{C}^J$. Since $T$ and $N^>$ are the only operators appearing in (61) and (71), this quantity can be bounded from below using (61) and (71) if we can evaluate the expectation values of $T$ and $N^>$ in the ground state (or one of the ground states) of $U(z)$. Let $\langle\,\cdot\,\rangle_z$ denote the expectation value in a ground state of $U(z)$. We can use two simple facts: i.) Since $\sqrt{T}$ enters (61) negatively, we can use the concavity of the square root to replace $\langle\sqrt{T}\rangle_z$ by $\sqrt{\langle T\rangle_z}$ for a lower bound. ii.) Since $N^>$ appears positively in (71), and hence negatively in (72), we can replace it by the upper bound $N^> \le T/e_J$. For the purpose of bounding $\langle T\rangle_z$ we can use a lower bound to $U(z)$ that is much simpler than (72). This is obtained by totally neglecting both the interaction part and the part controlling the particle number in $u(z)$. These give positive contributions to $u(z)$ (since $u(z)$ is the expectation value of $\mathbb{K}$ in the coherent state). We have to be more careful about $\partial_z\partial_{\bar z} u(z)$, however, because this contains some negative terms, as given in (71). (The annoying fact is that an upper symbol of a positive operator need not be positive, although the lower symbol is always positive.) Proceeding in the manner just described we have that
$$U(z) \ge \langle\Phi_z|K_0|\Phi_z\rangle + T - \partial_z\partial_{\bar z} u(z) . \tag{73}$$
Now let us estimate the various terms in $\partial_z\partial_{\bar z} u(z)$ in (71). We have $\eta\|\nabla\Phi_z\|_2^2 \le \langle\Phi_z|K_0|\Phi_z\rangle$. Also $\|\Phi_z\|_2^2 \le \langle\Phi_z|K_0|\Phi_z\rangle / \inf\,\mathrm{spec}\,(-\eta\Delta + V(x))$. Moreover, $\|W\|_1 \le 4\pi a/N$, and $\|W\|_\infty \le 6a/(R^3 N)$. We will choose $R \gg N^{-2/3}$ below. Therefore, $\|W\|_\infty \ll N$. The operator $N^>$ can be bounded in terms of $T$ as $N^> \le T/e_J$. Note also that $M = O(N)$ by assumption. Hence we see from (71) and (73) that, for $N$ large enough (depending on the parameters $\eta$, $C$ and $J$),
$$U(z) \ge \tfrac12 T - \text{const.} , \tag{74}$$
where the constant depends only on $\eta$, $C$ and $J$, but not on $M$ or $N$. The value of (74) is that it allows us to control the value of $\langle T\rangle_z$, and thereby control $\inf_z \inf\,\mathrm{spec}\,U(z)$, which is our lower bound to the ground state energy of $\mathbb{K}$. There is some number $E$, independent of all parameters, such that $\inf_z \inf\,\mathrm{spec}\,U(z) \le ME/2 - \text{const.}$ because $\inf_z \inf\,\mathrm{spec}\,U(z)$ is less than the known upper bound to the ground state energy of $K$. Then we can, and will, restrict our attention to $z$'s with $\langle T\rangle_z \le ME$ because only those values of $z$ are relevant for computing $\inf_z \inf\,\mathrm{spec}\,U(z)$, as (74) shows. Only the existence of $E$ and not its value is important. We conclude from (72) and the fact that $\langle T\rangle_z \le ME$ for the $z$ in question that
$$\inf_z\, \inf\,\mathrm{spec}\,U(z) \ge \inf_\Phi\, \mathcal{E}[\Phi] , \tag{75}$$
where
$$\mathcal{E}[\Phi] = \langle\Phi|K_0|\Phi\rangle + D_1\, I(\Phi) + \frac{C}{M}\Big(\|\Phi\|_2^2 - M\Big)^2 - D_2\,\|\nabla\Phi\|_2^2 - D_3 - \frac{2C}{M}(J+1)\,\|\Phi\|_2^2 . \tag{76}$$
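The notation is the following:
$$D_1 = 1 - \delta - e_J^{-1/4}\,\|W\|_1\, ME - 2\sqrt{\frac{2}{3}}\,\frac{1}{(2\pi^2)^{1/3}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{1/3} - \frac{4}{3\pi^{2/3}}\,\eta^{-1/4}\,\|W\|_1^{1/2}\, e_J^{-1/4}\, M^{1/2}E^{1/2} , \tag{77}$$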
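$$\begin{aligned} D_2 ={}& 2\sqrt{\frac{2}{3}}\,\frac{1}{(2\pi^2)^{1/3}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{1/3} + \frac{4}{3\pi^{2/3}}\,\eta^{-1/4}\,\|W\|_1^{1/2}\, e_J^{-1/4}\, M^{1/2}E^{1/2} \\ &+ \frac{4}{\pi^2}\sqrt{\frac{2}{27}}\;\eta^{-1/2}\,\|W\|_1\, M^{1/2}E^{1/2} + (4/3)^{3/2}\,\frac{1}{2\pi^2}\,\eta^{-1/2}\,\|W\|_1\sum_{i\le J}\sqrt{e_i} \\ &+ 4(4/3)^{5/4}\,\frac{1}{(2\pi^2)^{5/6}}\,\|W\|_1^{5/6}\,\|W\|_\infty^{1/6}\,\eta^{-3/4}\sum_{i\le J} e_i^{3/4} , \end{aligned} \tag{78}$$
and
$$\begin{aligned} D_3 ={}& \frac{2CJ}{M}\Big(ME/e_J + \tfrac12\Big) + (4/3)^{3/2}\,\frac{1}{2\pi^2}\,\eta^{-3/2}\,\|W\|_1\, ME\sum_{i\le J}\sqrt{e_i} \\ &+ 4(4/3)^{5/4}\,\frac{1}{(2\pi^2)^{5/6}}\,\|W\|_\infty^{1/6}\,\|W\|_1^{5/6}\,\eta^{-3/4}\sum_{i\le J} e_i^{3/4}\,\Big(ME\, e_J^{-1} + \tfrac12\Big) + \sum_{i\le J} e_i + \frac{1}{\delta}\,\|W\|_\infty . \end{aligned} \tag{79}$$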
We have neglected the last term in (61) containing $\big(1 - e_J^{-1/4}(2\pi^2\eta^{3/2})^{-1} - 2C/e_J\big)$, assuming $J$ to be large enough to make this term positive. (Recall that $e_J \to \infty$ as $J \to \infty$.) Our final result in this section, (75)–(76), might not appear to be useful at first sight, but the reader should note that the first two terms in (76) are essentially the GP energy expression. The term $\langle\Phi|K_0|\Phi\rangle$ is the relevant (i.e., low momentum) part of the kinetic energy $\int |(i\nabla - A)\Phi|^2$. The coefficient $D_1$ equals 1 to leading order and $I(\Phi)$ is essentially the GP quartic term $4\pi a\int|\Phi|^4$ (up to errors which will be controlled). Moreover, for $C$ large enough the term $C(\|\Phi\|_2^2 - M)^2/M$ ensures that we have the right particle number. For an appropriate choice of the parameters $J$, $\eta$ and $R$ all other terms are of lower order as $N \to \infty$, as we shall show.

Step 4. Bounds on three-particle density. So far we have bounded the main term in (26), namely $\inf\,\mathrm{spec}\,K$. Of the various other terms in (26) that have to be bounded, the one that is most intuitively negligible, but which we find the hardest to control, is the last term in (26). To show that it is small we have to show that the probability of finding three particles within a distance $2R$ of each other (in a true ground state of $H_{M,N}$) is small. This is accomplished in this section. We begin with a lemma about the possible size of the expectation value of a function of the coordinates of three bosons. Recall from Step 2 that $\langle\,\cdot\,\rangle_B$ denotes expectation value in the bosonic, zero-temperature state of the $M$-body Hamiltonian $H_{M,N}$.

Lemma 2. Let $\xi(x_1, x_2, x_3)$ be any positive function of $x_1$, $x_2$ and $x_3 \in \mathbb{R}^3$. With $V$ the one-body potential appearing in $H_{M,N}$, we define the three-body, independent particle Hamiltonian $h = -\Delta_1 - \Delta_2 - \Delta_3 + V(x_1) + V(x_2) + V(x_3)$. Let $\alpha > 0$ and let $e^{-\alpha h}(x_1, x_2, x_3; y_1, y_2, y_3)$ be the ‘heat kernel’ of $h$ at ‘inverse temperature’ $\alpha$. Finally, consider the modified integral kernel
$$e^{-\alpha h}(x_1, x_2, x_3; y_1, y_2, y_3)\,\sqrt{\xi(x_1, x_2, x_3)\,\xi(y_1, y_2, y_3)} , \tag{80}$$
and let $\Lambda$ denote its largest eigenvalue (i.e., its norm as a map from $L^2(\mathbb{R}^9)$ to $L^2(\mathbb{R}^9)$). Then
$$\langle\xi(x_1, x_2, x_3)\rangle_B \le \Lambda\,\exp\{\alpha(E_0(M,N) - E_0(M-3,N))\} . \tag{81}$$
Note that for the $M$ and $N$ under consideration here, we have $E_0(M,N) - E_0(M-3,N) \le 3Z$, as explained in Step 1. It is the appearance of the peculiar difference $E_0(M,N) - E_0(M-3,N)$ in Lemma 2 that led us to the discussion in Step 1. If the three-body correlations could be bounded more expeditiously than is done here, Step 1 could be simplified.

Proof. We denote by $\mathrm{Tr}[\,\cdot\,]$ the trace over all of $L^2(\mathbb{R}^{3M})$, not just the bosonic states, and by $P_b$ the projection onto the bosonic (i.e., symmetric) subspace. Note that $\exp\{-\beta H_{M,N}\}$ is trace class for large enough $\beta$, by our assumption on the logarithmic increase of the potential $V(x)$. (This follows from the Feynman-Kac-Itô formula, together with the results in the Appendix.) Hence
$$\langle\xi\rangle_B = \lim_{n\to\infty}\frac{\mathrm{Tr}\,[\xi\, e^{-\alpha n H_{M,N}} P_b]}{\mathrm{Tr}\,[e^{-\alpha n H_{M,N}} P_b]} , \tag{82}$$
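independently of $\alpha$, of course. Note that $H_{M,N}$ commutes with $P_b$ so $e^{-\alpha n H_{M,N}} P_b$ is self-adjoint and positive. The multiplication operator $\xi$ is also positive and we can write $\xi\, e^{-\alpha n H_{M,N}} P_b = [\xi\, e^{-\alpha H_{M,N}} P_b]\, e^{-\alpha(n-1) H_{M,N}} P_b$. Hölder's inequality for traces of positive operators states that $\mathrm{Tr}\,AB \le \{\mathrm{Tr}\,A^n\}^{1/n}\{\mathrm{Tr}\,B^{n/(n-1)}\}^{(n-1)/n}$, and therefore
$$\frac{\mathrm{Tr}\,[\xi\, e^{-\alpha n H_{M,N}} P_b]}{\mathrm{Tr}\,[e^{-\alpha n H_{M,N}} P_b]} \le \left(\frac{\mathrm{Tr}\,[(\xi\, e^{-\alpha H_{M,N}} P_b)^n]}{\mathrm{Tr}\,[e^{-\alpha n H_{M,N}} P_b]}\right)^{1/n} . \tag{83}$$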
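Consider the bigger projection $\widetilde P_b$, which symmetrizes only among particles $4, 5, \dots, M$. It commutes with $H_{M,N}$ and also with $\xi$, and hence $e^{-\alpha H_{M,N}} P_b \le e^{-\alpha H_{M,N}} \widetilde P_b$. Since $\xi \ge 0$, this yields the upper bound
$$\left(\frac{\mathrm{Tr}\,[(\xi\, e^{-\alpha H_{M,N}} P_b)^n]}{\mathrm{Tr}\,[e^{-\alpha n H_{M,N}} P_b]}\right)^{1/n} \le \left(\frac{\mathrm{Tr}\,[\widetilde P_b\,(\xi\, e^{-\alpha H_{M,N}})^n]}{\mathrm{Tr}\,[e^{-\alpha n H_{M,N}} P_b]}\right)^{1/n} . \tag{84}$$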
We now claim that
$$\mathrm{Tr}\,[\widetilde P_b\,(\xi\, e^{-\alpha H_{M,N}})^n] \le \mathrm{Tr}_3\,[(\xi\, e^{-\alpha h})^n]\;\mathrm{Tr}_{M-3}\,[e^{-\alpha n H_{M-3,N}}\,\widetilde P_b] , \tag{85}$$
where $\mathrm{Tr}_3$ and $\mathrm{Tr}_{M-3}$ denote the trace over the first 3 and last $M-3$ particles, respectively. Taking the limit $n \to \infty$ this proves (81). To show (85), we write $H_{M,N} = H_{3,N}\otimes\mathbb{I}_{M-3} + \mathbb{I}_3\otimes H_{M-3,N} + W$, with $W$ denoting the interaction between the first 3 and the last $M-3$ particles. Note that $W \ge 0$. Using the Trotter product formula, we first replace each factor $e^{-\alpha H_{M,N}}$ by $(e^{-\alpha H_{3,N}/m}\, e^{-\alpha(H_{M-3,N}+W)/m})^m$ for some integer $m$. (Here we abuse the notation slightly, omitting to write tensor products and identity operators.) For $x = (x_1, x_2, x_3)$, let $k(x, x')$ denote the integral kernel of $e^{-\alpha H_{3,N}/m}$. Denoting by $W_x$ the multiplication operator on the
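subspace of the last $M-3$ particles obtained by fixing the first 3 to have positions $x$, and introducing $nm$ integration variables $x_{ij}$, $1 \le i \le n$, $1 \le j \le m$, we can write
$$\begin{aligned} \mathrm{Tr}\left[\widetilde P_b\left(\xi\left(e^{-\alpha H_{3,N}/m}\, e^{-\alpha(H_{M-3,N}+W)/m}\right)^m\right)^n\right] ={}& \int\prod_{ij} dx_{ij}\;\prod_i \xi(x_{i1})\;\prod_{ij} k(x_{ij}, x_{i(j+1)}) \\ &\times \mathrm{Tr}_{M-3}\left[\widetilde P_b\prod_{i,j} e^{-\alpha(H_{M-3,N}+W_{x_{ij}})/m}\right] , \end{aligned} \tag{86}$$
where we identify $x_{i(m+1)} \equiv x_{(i+1)1}$ and $x_{n(m+1)} \equiv x_{1,1}$. By Hölder's inequality for traces, we can estimate
$$\mathrm{Tr}_{M-3}\left[\widetilde P_b\prod_{i,j} e^{-\alpha(H_{M-3,N}+W_{x_{ij}})/m}\right] \le \sup_{ij}\,\mathrm{Tr}_{M-3}\left[\widetilde P_b\, e^{-\alpha n(H_{M-3,N}+W_{x_{ij}})}\right] \le \mathrm{Tr}_{M-3}\left[\widetilde P_b\, e^{-\alpha n H_{M-3,N}}\right] , \tag{87}$$
where in the last inequality we used the fact that $W_x \ge 0$ and that the partition function is monotone in the potential. By the Feynman-Kac-Itô formula [24, Sect. 15], the integral kernel $k(x, x')$ is bounded in absolute value by the kernel of $e^{-\alpha h/m}$. Using this estimate and rewriting the integrals as a trace we obtain
$$\mathrm{Tr}\left[\widetilde P_b\left(\xi\left(e^{-\alpha H_{3,N}/m}\, e^{-\alpha(H_{M-3,N}+W)/m}\right)^m\right)^n\right] \le \mathrm{Tr}_3\,[(\xi\, e^{-\alpha h})^n]\;\mathrm{Tr}_{M-3}\,[e^{-\alpha n H_{M-3,N}}\,\widetilde P_b] . \tag{88}$$
Letting $m \to \infty$ this yields (85).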
We now use Lemma 2 to obtain a bound on the various terms in (26) and (27). Lemma 2 immediately implies that
$$\langle |x_1|^2\rangle_B \le e^{3\alpha Z}\,\big\|\,|x|\, e^{\alpha(\Delta - V(x))}\,|x|\,\big\|_\infty , \tag{89}$$
with $\|\cdot\|_\infty$ denoting operator norm. For positive operators, the operator norm is bounded by the trace, in this case given by $\mathrm{Tr}\,|x|^2 e^{\alpha(\Delta - V(x))}$. This expression, in turn, is bounded for $\alpha$ large enough, as shown in the Appendix. In exactly the same way we can bound $\langle |x_1|^4\rangle_B$. Moreover, we have that
$$\langle w_R(x_1 - x_2)\rangle_B \le e^{3\alpha Z}\,\big\|\sqrt{w_R}\; e^{-\alpha h}\,\sqrt{w_R}\big\|_\infty \le e^{3\alpha Z}\,\frac{1}{(4\pi\alpha)^{3/2}}\int_{\mathbb{R}^3} w_R(x)\,dx . \tag{90}$$
The last inequality can be seen as follows. Denote by $k(x, x')$ the kernel of $e^{\alpha(\Delta - V(x))}$. The Feynman-Kac formula implies that $k(x, x') \le (4\pi\alpha)^{-3/2}$ for any positive $V(x)$.
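Hence, for any function $f \in L^2(\mathbb{R}^6)$,
$$\begin{aligned} &\int dx\,dx'\,dy\,dy'\;\overline{f(x,y)}\,\sqrt{w_R(x-y)}\; k(x,x')\, k(y,y')\,\sqrt{w_R(x'-y')}\; f(x',y') \\ &\quad\le (4\pi\alpha)^{-3/2}\int dx\,dx'\; k(x,x')\left[\int dy\,\sqrt{w_R(x-y)}\,|f(x,y)|\right]\left[\int dy\,\sqrt{w_R(x'-y)}\,|f(x',y)|\right] \\ &\quad\le (4\pi\alpha)^{-3/2}\int w_R\;\int dx\,dx'\; k(x,x')\left[\int dy\,|f(x,y)|^2\right]^{1/2}\left[\int dy\,|f(x',y)|^2\right]^{1/2} , \end{aligned} \tag{91}$$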
where we used Schwarz's inequality in the last step. The result now follows from the fact that $e^{\alpha(\Delta - V(x))} \le \mathbb{I}$. Similarly, repeating the above argument with $x_2$ in place of $x$ and $(x_1, x_3)$ in place of $y$, we obtain
$$\langle U_R(x_1 - x_2)\,\theta(2R - |x_2 - x_3|)\rangle_B \le e^{3\alpha Z}\,\frac{1}{(4\pi\alpha)^3}\int_{\mathbb{R}^3} U_R(x)\,dx\int_{\mathbb{R}^3}\theta(2R - |x|)\,dx = e^{3\alpha Z}\,\frac{1}{\alpha^3}\,\frac{2}{3\pi}\, R^3 . \tag{92}$$
This finishes our bounds on the various terms appearing in (26) and (27).

Step 5. Collection of all the terms and the final inequality. In this section we concatenate the various pieces of the lower bound to the energy $E_0(M,N)$ in (26), and finish the proof of Proposition 1. Inequality (26) contains several terms. All except $\inf\,\mathrm{spec}\,K$ were bounded in Step 4 and in (27). The essence of Step 3 is the bound on the main term
$$\inf\,\mathrm{spec}\,K \ge \inf_z\,\inf\,\mathrm{spec}\,U(z) \ge \inf_\Phi\,\mathcal{E}[\Phi] , \tag{93}$$
where $\mathcal{E}[\Phi]$ is defined in (76). Let us begin by disposing of the terms mentioned in Step 4. As shown there, $\langle |x_1|^2\rangle_B \le \text{const.}$ and $\langle |x_1|^4\rangle_B \le \text{const.}$ for some constant depending only on $Z$. (Recall that $Z$ is a fixed number of order 1.) Moreover, from (90) and (29) we see that (recalling that $M \le N$)
$$\frac{M^2 a}{N^2\varepsilon}\,\langle w_R(x_1 - x_2)\rangle_B \le \text{const.}\,\frac{a}{\varepsilon}\,\frac{R^2}{s^2} . \tag{94}$$
This term will thus be negligible, if $R \to 0$ as $N \to \infty$ (keeping $\varepsilon$ and $s$ fixed for the moment). We are free to choose the dependence of $R$ on $N$, and we choose $R$ to satisfy
$$N^{-1/3} \gg R \gg N^{-2/3} \quad\text{as}\quad N \to \infty . \tag{95}$$
The last term to estimate is then
$$\frac{aM^3}{N^2}\,\langle U_R(x_1 - x_2)\,\theta(2R - |x_2 - x_3|)\rangle_B \le \text{const.}\; a N R^3 \ll 1 . \tag{96}$$
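The constant $2/(3\pi)$ in (92) and the role of the window (95) can be made concrete with a few lines of arithmetic. The sketch below is an illustration only: the power law $R = N^{-\beta}$ with $1/3 < \beta < 2/3$ is an assumed example of a choice satisfying (95), and the function names are hypothetical.

```python
import math

def three_body_constant():
    """(4 pi)^-3 * int U_R * int theta(2R - |x|) dx, in units of R^3 / alpha^3."""
    int_U = 4.0 * math.pi                    # integral of U_R, independent of R
    ball = 4.0 * math.pi / 3.0 * 2.0 ** 3    # volume of the ball of radius 2R over R^3
    return int_U * ball / (4.0 * math.pi) ** 3

def error_terms(N, beta, a=1.0):
    """N-dependent factors for the illustrative choice R = N^-beta."""
    R = N ** (-beta)
    three_body = a * N * R ** 3                   # right side of (96), up to a constant
    w_sup_over_N = 6.0 * a / (R ** 3 * N ** 2)    # bound on ||W||_inf / N
    return three_body, w_sup_over_N

small = error_terms(1e3, 0.5)
large = error_terms(1e6, 0.5)
```

With $\beta = 1/2$ both factors decrease as $N$ grows: the upper end of the window (95) controls the three-body term in (96), while the lower end gives $\|W\|_\infty \ll N$.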
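Hence it follows from (26) and (27) that, for any fixed $s$, $\varepsilon$ and $\eta$ (recalling that $\lambda = \lim_{N\to\infty} M/N$),
$$\lim_{N\to\infty}\frac{1}{N}\,(1 + 4\eta)\, E_0(M,N) \ge \lim_{N\to\infty}\frac{1}{N}\,\inf_\Phi\,\mathcal{E}[\Phi] + \lambda\kappa(\eta) - \text{const.}\,\lambda\eta . \tag{97}$$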
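The only thing left is the minimization of $\mathcal{E}[\Phi]$ given in (76), which contains the numbers $D_1$, $D_2$ and $D_3$ in (77)–(79). To evaluate them as $N \to \infty$ we note that $\|W\|_1 \le 4\pi a/N$, and $\|W\|_\infty \ll N$ for our choice of $R$ in (95). Hence, we see that
$$\lim_{\delta\to 0}\,\lim_{J\to\infty}\,\lim_{N\to\infty} D_1 = 1 , \qquad \lim_{J\to\infty}\,\lim_{N\to\infty} D_2 = 0 , \qquad\text{and}\qquad \lim_{N\to\infty}\frac{1}{N}\, D_3 = 0 . \tag{98}$$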
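Using the fact that both $\|\nabla\Phi\|_2^2$ and $\|\Phi\|_2^2$ are bounded relative to $\langle\Phi|K_0|\Phi\rangle$, and rescaling $\Phi \to M^{1/2}\Phi$, we obtain
$$\lim_{J\to\infty}\,\lim_{N\to\infty}\frac{1}{N}\,\inf_\Phi\,\mathcal{E}[\Phi] \ge \lim_{R\to 0}\,\inf_\Phi\left\{\lambda\,\langle\Phi|K_0|\Phi\rangle + (1-\varepsilon)\, a\lambda^2\int |\Phi(x)|^2 |\Phi(y)|^2\, U_R(x-y)\,dx\,dy + C\lambda\Big(\|\Phi\|_2^2 - 1\Big)^2\right\} . \tag{99}$$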
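Note that the infimum can obviously be restricted to a set of bounded $\langle\Phi|K_0|\Phi\rangle$, independent of $R$, since $U_R \ge 0$. Since $K_0 \ge -\eta\Delta$ this implies that we can assume that $\|\nabla\Phi\|_2$ is bounded independent of $R$, and hence also $\|\Phi\|_6$ by Sobolev's inequality. Using the inequality (proved below)
$$\left|\int |\Phi(x)|^2 |\Phi(y)|^2\, U_R(x-y)\,dx\,dy - 4\pi\,\|\Phi\|_4^4\right| \le 8\pi R\,\|\Phi\|_6^3\,\|\nabla\Phi\|_2 , \tag{100}$$
we see that we can interchange the limit and the infimum and thus obtain
$$\lim_{J\to\infty}\,\lim_{N\to\infty}\frac{1}{N}\,\inf_\Phi\,\mathcal{E}[\Phi] \ge \inf_\Phi\left\{\lambda\,\langle\Phi|K_0|\Phi\rangle + (1-\varepsilon)\, 4\pi a\lambda^2\,\|\Phi\|_4^4 + C\lambda\Big(\|\Phi\|_2^2 - 1\Big)^2\right\} . \tag{101}$$
Inequality (100) can be obtained in the following way. Using Schwarz's inequality, as well as $\int U_R(y)\,dy = 4\pi$,
$$\begin{aligned} \left|\int |\Phi(x)|^2 |\Phi(y)|^2\, U_R(x-y)\,dx\,dy - 4\pi\,\|\Phi\|_4^4\right| &\le \int dy\, U_R(y)\int dx\, |\Phi(x)|^2\,\big(|\Phi(x)| + |\Phi(x+y)|\big)\,\big|\,|\Phi(x)| - |\Phi(x+y)|\,\big| \\ &\le 2\,\|\Phi\|_6^3\int dy\, U_R(y)\left(\int dx\,\big|\,|\Phi(x)| - |\Phi(x+y)|\,\big|^2\right)^{1/2} . \end{aligned} \tag{102}$$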
The result now follows from the fact that $\big\|\,|\Phi| - |\Phi(\,\cdot + y)|\,\big\|_2 \le |y|\,\|\nabla\Phi\|_2$, which can be seen by evaluating the norm in Fourier space, using $|1 - e^{-ip\cdot y}|^2 \le |y|^2 |p|^2$, and also using the fact that $\int U_R(y)\,|y|\,dy \le R\int U_R(y)\,dy = 4\pi R$.
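Now, letting $C \to \infty$, we infer from (101) that
$$\lim_{C\to\infty}\,\lim_{J\to\infty}\,\lim_{N\to\infty}\frac{1}{N}\,\inf_\Phi\,\mathcal{E}[\Phi] \ge \inf_{\|\Phi\|_2 = 1}\left\{\lambda\,\langle\Phi|K_0|\Phi\rangle + (1-\varepsilon)\, 4\pi a\lambda^2\,\|\Phi\|_4^4\right\} . \tag{103}$$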
The final step is to remove the momentum cutoff in $K_0$, i.e., to let $s \to 0$ in Eq. (103). Again, we claim that we can interchange the limit and the infimum, at least to obtain a lower bound. Let $\Phi_s$ denote a minimizer of the functional on the right side of (103). Since $K_0 \ge -\eta\Delta + V(x)$ and $V(x) \to \infty$ as $|x| \to \infty$, a sequence $\Phi_{s_j}$ with $s_j \to 0$ as $j \to \infty$ lies in a compact subset of $L^2(\mathbb{R}^3)$, and hence there exists a subsequence which converges strongly and pointwise almost everywhere [11] (both in $p$-space and $x$-space) to a function $\Phi_0$ as $j \to \infty$, with $\|\Phi_0\|_2 = 1$. All the $s$-independent terms in the functional on the right side of (103) are weakly lower semicontinuous. Moreover, by Fatou's Lemma [11],
$$\lim_{s\to 0}\int p^2\,\big(1 - \chi_s(p)^2\big)\,|\widehat\Phi_s(p)|^2\,dp \ge \int p^2\,|\widehat\Phi_0(p)|^2\,dp . \tag{104}$$
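Hence the infimum and the limit $s \to 0$ can be interchanged for a lower bound. In combination with inequalities (103) and (97), we find that
$$\lim_{N\to\infty}\frac{1}{N}\,(1 + 4\eta)\, E_0(M,N) \ge \inf_{\|\Phi\|_2 = 1}\left\{\lambda\,\big\langle\Phi\big| -\Delta + 2p\cdot A(x) + A(x)^2 + V(x)\big|\Phi\big\rangle + (1-\varepsilon)\, 4\pi a\lambda^2\,\|\Phi\|_4^4\right\} - \text{const.}\,\lambda\eta . \tag{105}$$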
(For a lower bound we simply dropped the positive terms $-2\eta\Delta$ and $\eta|x|^4$ in $K_0$.) By letting $\eta \to 0$ and $\varepsilon \to 0$ Proposition 1 is proved. As explained in Step 1, this proves Theorem 1.

Remark about the optimal choice of the parameters. In Eq. (95) we showed how the parameter $R$ has to depend on $N$, as $N \to \infty$, in order to obtain the correct limit for the energy. The explicit dependence on $N$ of the other parameters $J$, $C$, $s$, $\eta$ and $\varepsilon$ need not be specified so closely (unless we wish to obtain a detailed error estimate). It suffices to let $J \to \infty$, $C \to \infty$, $s \to 0$, $\eta \to 0$ and $\varepsilon \to 0$ (in this order) after taking the $N \to \infty$ limit.

3. Proof of Theorem 2

Step 1. Proof of Part (i). The fact that $\Gamma$ is a convex set follows easily from its definition. Namely, if $\gamma_N$ and $\bar\gamma_N$ are two approximate ground state sequences, and $0 \le \lambda \le 1$, then $\lambda\gamma_N + (1-\lambda)\bar\gamma_N$ is certainly also an approximate ground state sequence, whose reduced one-particle density matrix is given by $\lambda\gamma_N^{(1)} + (1-\lambda)\bar\gamma_N^{(1)}$. Compactness of $\Gamma$ is also not difficult to see. Given a sequence $\gamma_i \in \Gamma$, the Banach-Alaoglu Theorem implies the existence of a subsequence such that $\gamma_i \rightharpoonup \gamma_\infty$ for some $\gamma_\infty$ in the weak-* sense as $i \to \infty$. As already remarked in the introduction, the fact that $\mathrm{Tr}\,H_0\gamma_i \le \text{const.}$ implies that $\gamma_i \to \gamma_\infty$ in trace norm. To prove compactness we have to show that $\gamma_\infty \in \Gamma$.
By definition, corresponding to every $\gamma_i$ there is an approximate ground state sequence $\gamma_{N,i}$. That is, there is a number $N_i$ such that $N \ge N_i$ implies that $\|\gamma_{N,i}^{(1)} - \gamma_i\| \le 1/i$ and $|N^{-1}\,\mathrm{Tr}\,H_N\gamma_{N,i} - E^{\mathrm{GP}}(a)| \le 1/i$. (Here, $\|\cdot\|$ denotes trace norm.) We can assume that $N_i \to \infty$ as $i \to \infty$. Now, for given $N$, let $\hat\imath(N)$ be the largest integer $i$ such that $N \ge N_i$. Then $\hat\imath(N) \to \infty$ as $N \to \infty$, and hence the sequence $\gamma_{N,\hat\imath(N)}$ is an approximate ground state sequence. Moreover, $\|\gamma_{N,\hat\imath(N)}^{(1)} - \gamma_\infty\| \le \|\gamma_{N,\hat\imath(N)}^{(1)} - \gamma_{\hat\imath(N)}\| + \|\gamma_{\hat\imath(N)} - \gamma_\infty\| \to 0$ as $N \to \infty$. This proves that $\gamma_\infty \in \Gamma$, and hence $\Gamma$ is compact.

Step 2. An extension of Theorem 1. A key step in the proof of Theorem 2 is an extension of the lower bound in Theorem 1 to the case of a perturbed Hamiltonian, where we replace the one-particle part $H_0$ of the Hamiltonian (1) by $H_0 + S$, where $S$ is a bounded hermitian operator on the one-particle space $L^2(\mathbb{R}^3)$. Let $H_N^{(S)}$ denote the perturbed $N$-particle operator
$$H_N^{(S)} = H_N + \sum_{i=1}^{N} S^{(i)} ,$$
and let $E_0^{(S)}(N) = \inf\,\mathrm{spec}\,H_N^{(S)}$ denote its ground state energy. Correspondingly, define the perturbed GP functional $\mathcal{E}^{\mathrm{GP}}_{(S)}$ as in (5), with $H_0 + S$ in place of $H_0$, and let $E^{\mathrm{GP}}_{(S)}(a)$ denote its infimum over all $\phi$ with $\|\phi\|_2 = 1$. Then we have the following extension of Theorem 1, to whose proof we will devote the remainder of this subsection.

Proposition 2. For all bounded hermitian operators $S$,
$$\liminf_{N\to\infty}\frac{1}{N}\, E_0^{(S)}(N) \ge E^{\mathrm{GP}}_{(S)}(a) . \tag{106}$$
We start by noting that in order to prove Proposition 2 it suffices to prove it in the special case in which $S$ is a finite rank operator with exponentially decaying eigenfunctions. In particular, we can assume that its integral kernel $S(x,y)$ satisfies a bound
$$|S(x,y)| \le B \exp\bigl(-D(|x| + |y|)\bigr) \tag{107}$$
for some positive constants $B$ and $D$. This can be seen as follows. Let $\{f_i\}_{i=1}^{\infty}$ be an orthonormal basis for $L^2(\mathbb{R}^3)$ such that $|f_i(x)| < B_i \exp(-D_i|x|)$ for some choice of constants $B_i, D_i > 0$, and let $P_n$ denote the projection onto the first $n$ of these functions. Clearly, $P_n \to I$ strongly as $n \to \infty$. Then, for any bounded $S$, $P_n S P_n$ is of the desired form, i.e., it has finite rank and its integral kernel satisfies a bound of the form (107). For any one-particle density matrix $\gamma$,
$$\bigl|\operatorname{Tr} \gamma\,(S - P_n S P_n)\bigr| \le \left\| \frac{1}{\sqrt{H_0}}\,(S - P_n S P_n)\,\frac{1}{\sqrt{H_0}} \right\| \operatorname{Tr}[H_0\gamma], \tag{108}$$
with $\|\cdot\|$ denoting operator norm. Since $H_0^{-1/2}$ is compact and, therefore, is the norm limit of finite rank operators, it is easy to see that the norm in (108) goes to zero as $n \to \infty$. On the other hand, the set of numbers $\operatorname{Tr}[H_0\gamma]$ that arise from those $\gamma$'s that come from approximate ground states is bounded. Consequently, both sides of (106) can be approximated to within any desired $\varepsilon$ by replacing $S$ by $P_n S P_n$ and choosing $n$ large enough, which implies the statement.
Derivation of the Gross-Pitaevskii Equation for Rotating Bose Gases
Thus we can assume (107) henceforth. The proof of Proposition 2 then follows exactly the same lines as the proof of Theorem 1. In fact, our proof of Theorem 1 has the advantage of being almost completely independent of the exact form of the Hamiltonian. The only place where we used the explicit form is Lemma 2, which was used to bound expectation values of certain one-, two- and three-body operators in the zero-temperature state of $H_{M,N}$. We now have to bound the expectation value of these operators in the zero-temperature state of $H^{(S)}_{M,N}$, which we denote as $\langle\,\cdot\,\rangle^{(S)}_B$. (Here, the operator $H^{(S)}_{M,N}$ is defined in the obvious way. Its ground state energy will be denoted by $E_0^{(S)}(M,N)$.) To this end, Lemma 2 can be extended in the following way.

Lemma 3. Let $\xi(x_1,x_2,x_3)$ be any positive function of $x_1$, $x_2$ and $x_3 \in \mathbb{R}^3$. Let $\widetilde S$ denote the rank one operator on the one-particle space with integral kernel given by the right side of (107). With $V$ the one-body potential appearing in $H_{M,N}$, we define the three-body, independent particle Hamiltonian
$$h^{(S)} = -\Delta_1 - \Delta_2 - \Delta_3 + V(x_1) + V(x_2) + V(x_3) - \widetilde S_1 - \widetilde S_2 - \widetilde S_3. \tag{109}$$
Let $\alpha > 0$ and let $e^{-\alpha h^{(S)}}(x_1,x_2,x_3;y_1,y_2,y_3)$ be the 'heat kernel' of $h^{(S)}$ at 'inverse temperature' $\alpha$. Finally, consider the modified integral kernel
$$e^{-\alpha h^{(S)}}(x_1,x_2,x_3;y_1,y_2,y_3)\,\sqrt{\xi(x_1,x_2,x_3)\,\xi(y_1,y_2,y_3)} \tag{110}$$
and let $\Lambda^{(S)}$ denote its largest eigenvalue (i.e., its norm as a map from $L^2(\mathbb{R}^9)$ to $L^2(\mathbb{R}^9)$). Then
$$\bigl\langle \xi(x_1,x_2,x_3) \bigr\rangle^{(S)}_B \le \Lambda^{(S)} \exp\bigl\{\alpha\bigl(E_0^{(S)}(M,N) - E_0^{(S)}(M-3,N)\bigr)\bigr\}. \tag{111}$$

The proof follows along the same lines as the proof of Lemma 2, except for one step. Before Eq. (88), it was necessary to get an upper bound on the absolute value of the integral kernel of $\exp\{-\alpha H_{3,N}/m\}$ in terms of the kernel of $\exp\{-\alpha h/m\}$, which can be obtained with the help of the Feynman-Kac-Itô formula. In the case considered here, we need an upper bound on the integral kernel of $\exp\{-\alpha H^{(S)}_{3,N}/m\}$. We will now show that the absolute value of this kernel is bounded above by the kernel of $\exp\{-\alpha h^{(S)}/m\}$ for the modified three-particle operator $h^{(S)}$ in (109).

This claim follows from the Trotter product formula, together with the Feynman-Kac-Itô formula, in the following way. Since $S$ is a bounded (in fact, finite rank) operator, we can write
$$e^{-\alpha H^{(S)}_{3,N}/m} = \lim_{n\to\infty} \Bigl( e^{-\alpha H_{3,N}/n} \Bigl(1 - \frac{\alpha}{n} S\Bigr) \Bigr)^{n/m}. \tag{112}$$
By the Feynman-Kac-Itô formula, (107) and the definition of $\widetilde S$,
$$\Bigl| \Bigl( e^{-\alpha H_{3,N}/n} \Bigl(1 - \frac{\alpha}{n} S\Bigr) \Bigr)^{n/m}(x,y) \Bigr| \le \Bigl( e^{-\alpha h/n} \Bigl(1 + \frac{\alpha}{n} \widetilde S\Bigr) \Bigr)^{n/m}(x,y). \tag{113}$$
In the limit $n \to \infty$, the operator on the right side converges strongly to $e^{-\alpha h^{(S)}/m}$. This proves our claim.

For the application of this lemma, as in Sect. 2, Step 4, it is necessary to have some bounds on the kernel of $e^{-\alpha h^{(S)}}$. In particular, we need that the kernel is bounded, and that its diagonal decays for large $|x|$ at least like $|x|^{-\text{const.}\,\alpha}$ for some positive constant.
As for the case $S = 0$, these properties are again shown in the appendix. It is there that the exponential decay of the kernel of $\widetilde S$ gets used. As already mentioned, except for the replacement of Lemma 2 by Lemma 3, the proof of Proposition 2 consists of simply mimicking the discussion of the proof of Theorem 1 given in Sect. 2.

Step 3. $\Gamma$ contains projections onto GP minimizers. Let $\Phi^{\rm GP} \subset L^2(\mathbb{R}^3)$ denote the set of all minimizers of the GP functional (5). We now consider the special case where $S = -|\phi\rangle\langle\phi|$ for some $\phi \in \Phi^{\rm GP}$. In this case, we claim that
$$\lim_{N\to\infty} \frac{1}{N} E_0^{(\lambda S)}(N) = E^{\rm GP}(a) - \lambda \tag{114}$$
for any $\lambda \ge 0$. Given Theorem 1, the lower bound is trivial in this case. The upper bound can be derived in the same way as the upper bound for Theorem 1 in [23]. The arguments there also apply to this case, and the expectation value of $S$ in the trial state can easily be estimated using the methods in [5, 15]. (In the non-rotating case, this was carried out in [21].)

Taking the derivative of (114) at $\lambda > 0$, Griffiths' argument [9, 18] implies that the one-particle density matrix of a ground state of $H_N^{(\lambda S)}$ converges to $|\phi\rangle\langle\phi|$ as $N \to \infty$ in this case. Hence, by a 'diagonal' argument similar to the one at the end of the proof of part (i) of Theorem 2, we can find a sequence $\lambda_N$ with $\lambda_N \to 0$ as $N \to \infty$ such that the ground state of $H_N^{(\lambda_N S)}$ represents an approximate ground state sequence for the $\lambda = 0$ problem, and its reduced one-particle density matrix converges to $|\phi\rangle\langle\phi|$ as $N \to \infty$. This shows that $|\phi\rangle\langle\phi| \in \Gamma$ for any $\phi \in \Phi^{\rm GP}$.

(Remark. The claim of this subsection can in principle be proved by simply constructing an appropriate approximate ground state. However, although the one-particle density matrix of the trial state used in [23] converges to $|\phi\rangle\langle\phi|$ as $N \to \infty$, this does not immediately imply that $|\phi\rangle\langle\phi| \in \Gamma$ since the trial state is not symmetric! This explains the somewhat different reasoning in this subsection.)

We note that also $|\phi\rangle\langle\phi| \in \Gamma_{\rm ext}$ for all $\phi \in \Phi^{\rm GP}$. This follows from the fact that all elements of $\Gamma$ are positive operators, and a rank one operator cannot be written as a non-trivial convex combination of two positive operators. In the next subsection, we will show that all elements of $\Gamma_{\rm ext}$ are of the form $|\phi\rangle\langle\phi|$ with $\phi \in \Phi^{\rm GP}$.

Step 4. Proof of Parts (ii) and (iii). For a given $\gamma \in \Gamma$, let $\gamma_N$ be an approximate ground state sequence for $H_N$, with $\gamma_N^{(1)} \to \gamma$ as $N \to \infty$. By Proposition 2 we have that, for any bounded hermitian operator $S$ and any $\lambda \in \mathbb{R}$,
$$E^{\rm GP}(a) + \lambda \operatorname{Tr} S\gamma = \lim_{N\to\infty} \frac{1}{N} \operatorname{Tr} H_N^{(\lambda S)}\gamma_N \ge E^{\rm GP}_{(\lambda S)}(a). \tag{115}$$
Upon dividing by $\lambda$ and letting $\lambda \to 0$, this yields
$$\operatorname{Tr} S\gamma \ge \lim_{\lambda \searrow 0} \frac{E^{\rm GP}_{(\lambda S)}(a) - E^{\rm GP}(a)}{\lambda}. \tag{116}$$
We claim that
$$\lim_{\lambda \searrow 0} \frac{E^{\rm GP}_{(\lambda S)}(a) - E^{\rm GP}(a)}{\lambda} = \min_{\phi \in \Phi^{\rm GP}} \langle\phi|S|\phi\rangle. \tag{117}$$
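Using $\phi \in \Phi^{\rm GP}$ as a trial function, we immediately see that $E^{\rm GP}_{(\lambda S)}(a) \le E^{\rm GP}(a) + \lambda\langle\phi|S|\phi\rangle$ for all $\phi \in \Phi^{\rm GP}$. For the other direction, we use a minimizer of $\mathcal{E}^{\rm GP}_{(\lambda S)}$ as a trial state for $\mathcal{E}^{\rm GP}$. As $\lambda \to 0$, this sequence of minimizers will have a subsequence that converges strongly to a minimizer of $\mathcal{E}^{\rm GP}$. Hence, for some $\phi \in \Phi^{\rm GP}$, $\lim_{\lambda \searrow 0} \lambda^{-1}\bigl(E^{\rm GP}_{(\lambda S)}(a) - E^{\rm GP}(a)\bigr) \ge \langle\phi|S|\phi\rangle$, which proves our claim. (Note that this argument also proves that the right side of (117) is a true minimum and not merely an infimum.) We have thus shown that, for every bounded hermitian operator $S$, and every $\gamma \in \Gamma$,
$$\operatorname{Tr} S\gamma \ge \min_{\phi \in \Phi^{\rm GP}} \langle\phi|S|\phi\rangle. \tag{118}$$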
Replacing $S$ by $-S$, this also implies that $\operatorname{Tr} S\gamma \le \max_{\phi \in \Phi^{\rm GP}} \langle\phi|S|\phi\rangle$.

Inequality (118) is the key to the proof of statements (ii) and (iii) in Theorem 2. Let $P_n$ be a rank $n$ projection, and let $\Gamma_{P_n} = \{P_n\gamma P_n : \gamma \in \Gamma\}$. When $\gamma$ is a bounded operator on $L^2(\mathbb{R}^3)$, $P_n\gamma P_n$ can be identified with an $n \times n$ complex matrix, and hence with a vector in $\mathbb{R}^{2n^2}$. We make this identification (denoted by $\iota$ in the following) in order to be able to use finite-dimensional convexity theory (see, e.g., [20]). Note that $\iota$ is linear and continuous, and hence the set $B_n = \iota\Gamma_{P_n} = \{\iota P_n\gamma P_n : \gamma \in \Gamma\}$ is a closed convex subset of $\mathbb{R}^{2n^2}$.

An exposed point [20] of a convex set $C \subset \mathbb{R}^m$ is an extreme point $p$ of $C$ with the additional property that there is a tangent plane to $C$ containing $p$ but containing no other point of $C$. (For an example of points that are extreme but not exposed, let $C \subset \mathbb{R}^2$ be a square with each corner rounded off into a quarter of a circle. The extreme points are all the points on the four quarter-circles, including their endpoints, but the endpoints are not exposed.) An equivalent way to say this is that an exposed point $p$ in $C \subset \mathbb{R}^m$ is characterized by the existence of a vector $a \in \mathbb{R}^m$ (a normal to the tangent plane) such that
$$(a, p) \le (a, b) \quad \text{for all } b \in C, \tag{119}$$
with equality if and only if $b = p$. (Here, $(\cdot\,,\cdot)$ denotes the standard inner product in $\mathbb{R}^m$.)

For a fixed $n$, an exposed point of $B_n \subset \mathbb{R}^{2n^2}$ corresponds to some $P_n\widetilde\gamma P_n \in \Gamma_{P_n}$. This density matrix $\widetilde\gamma$ may not be unique and it may depend on $n$, but this is of no concern to us. We note that our space of density matrices is a complex space and, therefore, we have to translate (119) to this setting. For any two bounded operators $\gamma$, $\gamma'$ (not necessarily in $\Gamma$), the real inner product $(\cdot\,,\cdot)$ becomes
$$(\iota P_n\gamma P_n,\ \iota P_n\gamma' P_n) = \Re \operatorname{Tr}(P_n\gamma^\dagger P_n\gamma'), \tag{120}$$
where $\gamma^\dagger$ is the adjoint of $\gamma$. Translated to our original space, this means that if $P_n\widetilde\gamma P_n$ is an exposed point of $\Gamma_{P_n}$, then there exists an operator $S$ (with $P_n S P_n = S$) such that
$$\Re \operatorname{Tr} S\widetilde\gamma \le \Re \operatorname{Tr} S\gamma \quad \text{for all } \gamma \in \Gamma, \tag{121}$$
or, equivalently, there exists a hermitian $S$ such that
$$\operatorname{Tr} S\widetilde\gamma \le \operatorname{Tr} S\gamma \quad \text{for all } \gamma \in \Gamma. \tag{122}$$
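The identification $\iota$ and the inner product formula (120) can be illustrated numerically. The following is a minimal sketch, assuming that $\iota$ simply lists the real and imaginary parts of the matrix entries (the helper names `iota`, `real_inner`, `adjoint`, `trace_of_product` and `random_matrix` are hypothetical and are not taken from the paper). It checks that the standard inner product of the two real vectors agrees with the real part of $\operatorname{Tr}(A^\dagger B)$.

```python
import random


def iota(a):
    """Map an n x n complex matrix to a real vector of length 2*n*n
    by listing real and imaginary parts of each entry (assumed form of iota)."""
    vec = []
    for row in a:
        for z in row:
            vec.append(z.real)
            vec.append(z.imag)
    return vec


def real_inner(u, v):
    """Standard inner product on R^m."""
    return sum(x * y for x, y in zip(u, v))


def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace_of_product(a, b):
    """Tr(a b) for square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def random_matrix(n, rng):
    """Random n x n complex matrix with entries in the unit square."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]


rng = random.Random(0)  # fixed seed so the check is reproducible
a = random_matrix(3, rng)
b = random_matrix(3, rng)
lhs = real_inner(iota(a), iota(b))            # inner product in R^(2 n^2)
rhs = trace_of_product(adjoint(a), b).real    # Re Tr(A^dagger B)
print(abs(lhs - rhs) < 1e-12)
```

For hermitian arguments the trace is already real, which is consistent with the passage from (121) to (122).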
Note that, by definition, equality holds in (122) if and only if $P_n\gamma P_n = P_n\widetilde\gamma P_n$. We now use inequality (122), with $\gamma = |\phi\rangle\langle\phi|$, where $\phi \in \Phi^{\rm GP}$ minimizes $\langle\phi|S|\phi\rangle$ among all
GP minimizers. We know from Step 3 that this $\gamma$ is an element of $\Gamma$. The inequalities (118) (applied to $\widetilde\gamma$) and (122) for this special choice of $\gamma$ together imply that there is actually equality in this case, and thus that $P_n\widetilde\gamma P_n = P_n|\phi\rangle\langle\phi|P_n$. That is, all exposed points of $\Gamma_{P_n}$ are of the form $P_n|\phi\rangle\langle\phi|P_n$, with $\phi \in \Phi^{\rm GP}$. We can go further and conclude that all extreme points in $\Gamma_{P_n}$ are of this form, not only the exposed points. This follows from the fact that the set of GP minimizers is closed, together with Straszewicz's Theorem [20, Thm. 18.6], which states that the exposed points are a dense subset of the extreme points.

Carathéodory's Theorem [20, Thm. 17.1] implies that every $P_n\gamma P_n \in \Gamma_{P_n}$ can be written as a convex combination of $2n^2 + 1$ extreme points. That is, there exist $\lambda_i \ge 0$ with $\sum_i \lambda_i = 1$ such that
$$P_n\gamma P_n = P_n \Biggl( \sum_{i=1}^{2n^2+1} \lambda_i |\phi_i\rangle\langle\phi_i| \Biggr) P_n, \tag{123}$$
with $\phi_i \in \Phi^{\rm GP}$ for all $i$. This equation defines an atomic (i.e., point) measure $d\mu_n(\phi)$ supported on the (compact) space of projections onto GP minimizers. Let us provisionally call this space $\widetilde\Gamma$, with the intention of showing that $\Gamma_{\rm ext} = \widetilde\Gamma$. For every $\psi$ with $P_n\psi = \psi$ we have thus shown that
$$\langle\psi|\gamma|\psi\rangle = \int d\mu_n(\phi)\,|\langle\psi|\phi\rangle|^2 \quad \text{with} \quad \int d\mu_n(\phi) = 1. \tag{124}$$
To complete the proof of Theorem 2 we wish to take the limit $n \to \infty$ in (124). We choose $P_n$ in such a way that $P_n$ converges strongly to the identity as $n \to \infty$. The sequence $d\mu_n$ has a subsequence that converges weakly to some measure $d\mu$ with $\int d\mu = 1$ (see [4, Vol. 1, Thm. 12.7 and 12.10]). This implies that, for $\psi$ in a dense subset of $L^2(\mathbb{R}^3)$ (namely, those $\psi$ for which $P_n\psi = \psi$ for some $n$),
$$\langle\psi|\gamma|\psi\rangle = \int d\mu(\phi)\,|\langle\psi|\phi\rangle|^2 \quad \text{with} \quad \int d\mu(\phi) = 1. \tag{125}$$
Since (125) holds for a dense set of $\psi$, it actually holds for all $\psi$ by continuity. That is, $\gamma = \int d\mu(\phi)\,|\phi\rangle\langle\phi|$ in the weak sense.

Note that there is a representation (125) for $\gamma \in \Gamma_{\rm ext}$ (since there is such a representation for all $\gamma \in \Gamma$). It is not hard to see that for an extreme $\gamma$ the corresponding Borel measure $d\mu$ must be an atomic measure at a single point in $\widetilde\Gamma$. Another way to say this is that $\Gamma_{\rm ext} \subset \widetilde\Gamma$, which is exactly part (ii) of Theorem 2 (since we have already proved in Step 3 that $\widetilde\Gamma \subset \Gamma_{\rm ext}$). Part (iii) of Theorem 2 follows from (125), together with part (ii). This completes the proof of Theorem 2.

We conclude with the direct proof of (11), which was promised just after the statement of Theorem 2. We start with (123) and choose $P_n$ to be the projection onto the largest $n$ eigenvalues of $\gamma$, with $n$ large enough so that $\operatorname{Tr}|\gamma - P_n\gamma P_n| < \varepsilon^2/8$. We now denote $P_n = P$, $1 - P_n = Q$ and $B = \sum_i \lambda_i|\phi_i\rangle\langle\phi_i|$. From (123) (and a little algebra) we learn that $\gamma - B = Q(\gamma - B)Q - QBP - PBQ$. Thus $\operatorname{Tr}|\gamma - B| \le \operatorname{Tr}\bigl(|Q\gamma Q| + |QBQ| + 2|QBP|\bigr)$. Obviously, $\operatorname{Tr} Q\gamma Q < \varepsilon^2/8$ and, since $\operatorname{Tr} B = \operatorname{Tr}\gamma = 1$, we also have $\operatorname{Tr}|QBQ| = \operatorname{Tr} QBQ = \operatorname{Tr}(1-P)B = \operatorname{Tr}(1-P)\gamma = \operatorname{Tr} Q\gamma Q < \varepsilon^2/8$. The remaining term can be bounded, using Schwarz's inequality, by $(\operatorname{Tr} QBQ)^{1/2}(\operatorname{Tr} PBP)^{1/2} < \varepsilon/\sqrt{8}$. This proves (11).
Appendix: Heat Kernel Estimates

In this appendix we derive an upper bound on the heat kernel for a general Schrödinger operator. This bound will show, in particular, that for any $s > 0$ and $\alpha$ large enough (depending on $s$)
$$\operatorname{Tr} |x|^s e^{\alpha(\Delta - V)} < \infty \tag{126}$$
if $V(x) \ge C_1\ln(|x|) - C_2$ for some constants $C_1 > 0$ and $C_2$. This property was used in the proof of Theorem 1. (Actually, in the proof of Theorem 1 we used only the cases $s = 2$ and $s = 4$ (see Step 4 of Sect. 2) because we assumed $A = \tfrac{1}{2}\,\Omega \wedge x$, but (126) permits the inclusion of a magnetic field with polynomial growth of $A$.)

Our bound on the heat kernel follows an idea of Symanzik [26]. Using the Feynman-Kac formula for the integral kernel, we can write
$$e^{\alpha(\Delta - V)}(x,y) = \int d\mu_{x,y}(\omega) \exp\Bigl(-\int_0^\alpha ds\, V(\omega(s))\Bigr), \tag{127}$$
where $d\mu_{x,y}$ denotes the conditional Wiener measure for paths $\omega$ going from $x$ to $y$ in time $\alpha$. By Jensen's inequality we have, for any given path $\omega$,
$$\exp\Bigl(-\int_0^\alpha ds\, V(\omega(s))\Bigr) \le \frac{1}{\alpha}\int_0^\alpha ds\, \exp\bigl(-\alpha V(\omega(s))\bigr). \tag{128}$$
Therefore (using Fubini's Theorem)
$$e^{\alpha(\Delta - V)}(x,y) \le \frac{1}{\alpha}\int_0^\alpha ds \int d\mu_{x,y}(\omega) \exp\bigl(-\alpha V(\omega(s))\bigr) = \frac{1}{\alpha}\int_0^\alpha ds\, \bigl(e^{s\Delta} e^{-\alpha V} e^{(\alpha - s)\Delta}\bigr)(x,y). \tag{129}$$
To evaluate the trace in (126), we only need the heat kernel on the diagonal, i.e., for $x = y$. The integral kernel of $e^{t\Delta}$ is given by
$$j_t(x - y) \equiv \frac{1}{(4\pi t)^{3/2}}\, e^{-|x-y|^2/4t}, \tag{130}$$
which leads to
$$\bigl(e^{s\Delta} e^{-\alpha V} e^{(\alpha - s)\Delta}\bigr)(x,x) = \frac{1}{(4\pi)^3}\,\frac{1}{(t\alpha)^{3/2}} \int_{\mathbb{R}^3} dy\, e^{-\alpha V(y)} \exp\bigl(-|x-y|^2/4t\bigr), \tag{131}$$
where $t$ is defined by $1/t \equiv 1/s + 1/(\alpha - s)$. Let us change the integration variable from $s$ to $t$, and introduce the function
$$h_\alpha(x) = \frac{2}{\alpha}\int_0^{\alpha/4} dt\, \frac{1}{\sqrt{1 - 4t/\alpha}}\, j_t(x). \tag{132}$$
Then the bound (129) yields
$$e^{\alpha(\Delta - V)}(x,x) \le \frac{1}{(4\pi\alpha)^{3/2}}\, \bigl(e^{-\alpha V} * h_\alpha\bigr)(x), \tag{133}$$
with $*$ denoting convolution. Note that $\int h_\alpha(x)\,dx = 1$. It is easy to see that $h_\alpha(x) \sim \exp(-|x|^2/\alpha)$ for large $|x|$. Hence, if $V(x)$ increases logarithmically with $|x|$, we see that the diagonal of the heat kernel decays at least as $|x|^{-\text{const.}\,\alpha}$ for large $|x|$. Thus, we can choose $\alpha$ large enough to ensure that (126) is finite.

For the proof of Theorem 2 it is necessary to extend this result to the case where $-\Delta + V$ is replaced by $-\Delta + V + K$, with $K$ a finite rank operator. As explained there, we can restrict ourselves to the case when $K$ has exponentially decaying eigenfunctions. I.e., we can assume that the kernel of $K$, which we denote by $K(x,y)$, satisfies a bound
$$|K(x,y)| \le B e^{-D(|x| + |y|)} \tag{134}$$
for some constants $B > 0$ and $D > 0$. Again we want to show that, for any $s > 0$ and $\alpha$ large enough (depending on $s$),
$$\operatorname{Tr} |x|^s e^{\alpha(\Delta - V - K)} < \infty \tag{135}$$
if $V(x) \ge C_1\ln(|x|) - C_2$ for some constants $C_1 > 0$ and $C_2$. With the notation $L_t = e^{t(\Delta - V)}$, we can use the Dyson expansion to write
$$e^{\alpha(\Delta - V - K)} = L_\alpha + \sum_{n \ge 1} (-1)^n \int_{\sum_i t_i = \alpha} dt_0\, dt_1 \cdots dt_n\, L_{t_0} K L_{t_1} K \cdots K L_{t_n}. \tag{136}$$
We have already derived an upper bound on the kernel of $L_\alpha$ above. The kernel of the terms for $n \ge 1$ in the sum can be bounded as follows. First of all, the Feynman-Kac formula tells us that since $V \ge 0$ we have the inequality $L_t(x,y) \le j_t(x - y)$ for the kernel of $L_t$. Moreover, using (134) and denoting by $\Psi$ the function $\Psi(x) = \sqrt{B}\, e^{-D|x|}$, we have
$$\bigl| L_{t_0} K L_{t_1} K \cdots K L_{t_n}(x,y) \bigr| \le (j_{t_0} * \Psi)(x) \prod_{i=1}^{n-1} \langle\Psi|L_{t_i}|\Psi\rangle\, (j_{t_n} * \Psi)(y). \tag{137}$$
Since $L_t \le I$, we have $\langle\Psi|L_{t_i}|\Psi\rangle \le \|\Psi\|_2^2$. Denoting
$$\xi_\alpha(x) = \|\Psi\|_2^{-1} \sup_{0 \le t \le \alpha} (j_t * \Psi)(x), \tag{138}$$
we thus have
$$\bigl| L_{t_0} K L_{t_1} K \cdots K L_{t_n}(x,y) \bigr| \le \xi_\alpha(x)\,\xi_\alpha(y)\,\|\Psi\|_2^{2n}. \tag{139}$$
The integral over the simplex in (136) yields a factor $\alpha^n/n!$, and hence
$$\bigl| e^{\alpha(\Delta - V - K)}(x,y) \bigr| \le e^{\alpha(\Delta - V)}(x,y) + \bigl(e^{\alpha\|\Psi\|_2^2} - 1\bigr)\, \xi_\alpha(x)\,\xi_\alpha(y). \tag{140}$$
Since $\xi_\alpha$ decays exponentially for large $|x|$, this proves our claim (135).
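The change of variables from $s$ to $t$ behind (132), and the normalization $\int h_\alpha(x)\,dx = 1$, can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a crude midpoint rule (so the agreement is approximate because of the integrable endpoint singularity), and the function names `s_average` and `t_weighted` are hypothetical. Since $\int j_t(x)\,dx = 1$, the normalization of $h_\alpha$ reduces to the case $f \equiv 1$ of the identity $\frac{1}{\alpha}\int_0^\alpha f(t(s))\,ds = \frac{2}{\alpha}\int_0^{\alpha/4} f(t)\,(1 - 4t/\alpha)^{-1/2}\,dt$ with $t(s) = s(\alpha - s)/\alpha$.

```python
import math


def s_average(f, alpha, n=20000):
    """(1/alpha) * integral over s in [0, alpha] of f(t(s)),
    where t(s) = s*(alpha - s)/alpha, i.e. 1/t = 1/s + 1/(alpha - s)."""
    h = alpha / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += f(s * (alpha - s) / alpha)
    return total * h / alpha


def t_weighted(f, alpha, n=200000):
    """(2/alpha) * integral over t in [0, alpha/4] of f(t)/sqrt(1 - 4t/alpha).
    Midpoint rule; the integrand has an integrable singularity at t = alpha/4,
    so convergence is slow (roughly like the square root of the step size)."""
    h = (alpha / 4.0) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoints never hit the singular endpoint
        total += f(t) / math.sqrt(1.0 - 4.0 * t / alpha)
    return 2.0 * total * h / alpha


alpha = 3.0
norm = t_weighted(lambda t: 1.0, alpha)   # should be close to 1
lhs = s_average(lambda t: t, alpha)       # s-form of the average of f(t) = t
rhs = t_weighted(lambda t: t, alpha)      # t-form of the same average
print(norm, lhs, rhs)
```

For $f(t) = t$ both forms can also be evaluated by hand and equal $\alpha/6$, which gives an independent check of the weight $2/(\alpha\sqrt{1 - 4t/\alpha})$ in (132).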
References
1. Aftalion, A., Du, Q.: Vortices in a rotating Bose-Einstein condensate: Critical angular velocities and energy diagrams in the Thomas-Fermi regime. Phys. Rev. A 64, 063603 (2001)
2. Butts, D.A., Rokhsar, D.S.: Predicted signatures of rotating Bose-Einstein condensates. Nature 397, 327–329 (1999)
3. Castin, Y., Dum, R.: Bose-Einstein condensates with vortices in rotating traps. Eur. Phys. J. D 7, 399–412 (1999)
4. Choquet, G.: Lectures on Analysis, Vols. 1 and 2. New York: W.A. Benjamin, 1969
5. Dyson, F.J.: Ground State Energy of a Hard-Sphere Gas. Phys. Rev. 106, 20–26 (1957)
6. Eisenberg, E., Lieb, E.H.: Polarization of interacting bosons with spin. Phys. Rev. Lett. 89, 220403 (2002)
7. Fetter, A.L., Svidzinsky, A.A.: Vortices in a trapped dilute Bose-Einstein condensate. J. Phys.: Condens. Matter 13, R135–R194 (2001)
8. García-Ripoll, J.J., Pérez-García, V.M.: Stability of vortices in inhomogeneous Bose condensates subject to rotation: A three-dimensional analysis. Phys. Rev. A 60, 4864–4874 (1999)
9. Griffiths, R.B.: A Proof that the Free Energy of a Spin System is Extensive. J. Math. Phys. 5, 1215–1222 (1964)
10. Klauder, J., Skagerstam, B.-S.: Coherent states, applications in physics and mathematical physics. Singapore: World Scientific, 1985
11. Lieb, E.H., Loss, M.: Analysis, Second edition. Providence, RI: Amer. Math. Soc., 2001
12. Lieb, E.H., Seiringer, R.: Proof of Bose-Einstein Condensation for Dilute Trapped Gases. Phys. Rev. Lett. 88, 170409 (2002)
13. Lieb, E.H., Seiringer, R., Solovej, J.P.: Ground-state energy of the low-density Fermi gas. Phys. Rev. A 71, 053605 (2005)
14. Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: The Mathematics of the Bose Gas and its Condensation. Oberwolfach Seminars, Vol. 34, Basel-Boston: Birkhäuser, 2005
15. Lieb, E.H., Seiringer, R., Yngvason, J.: Bosons in a Trap: A Rigorous Derivation of the Gross-Pitaevskii Energy Functional. Phys. Rev. A 61, 043602 (2000)
16. Lieb, E.H., Seiringer, R., Yngvason, J.: A Rigorous Derivation of the Gross-Pitaevskii Energy Functional for a Two-Dimensional Bose Gas. Commun. Math. Phys. 224, 17–31 (2001)
17. Lieb, E.H., Seiringer, R., Yngvason, J.: Superfluidity in dilute trapped Bose gases. Phys. Rev. B 66, 134529 (2002)
18. Lieb, E.H., Seiringer, R., Yngvason, J.: Justification of c-Number Substitutions in Bosonic Hamiltonians. Phys. Rev. Lett. 94, 080401 (2005)
19. Lieb, E.H., Yngvason, J.: Ground State Energy of the Low Density Bose Gas. Phys. Rev. Lett. 80, 2504–2507 (1998)
20. Rockafellar, R.T.: Convex Analysis. Princeton, NJ: University Press, 1970
21. Seiringer, R.: Contributions to the Rigorous Theory of Many-Body Quantum Systems. PhD thesis, University of Vienna (2000). Available online at http://www.math.princeton.edu/~rseiring/theses.html
22. Seiringer, R.: Gross-Pitaevskii Theory of the Rotating Bose Gas. Commun. Math. Phys. 229, 491–509 (2002)
23. Seiringer, R.: Ground state asymptotics of a dilute, rotating gas. J. Phys. A: Math. Gen. 36, 9755–9778 (2003)
24. Simon, B.: Functional Integration and Quantum Physics. New York-London-San Diego: Academic Press, 1979
25. Simon, B.: Trace ideals and their application. London Math. Soc. Lecture Notes 35. Cambridge: Cambridge University Press, 1979
26. Symanzik, K.: Proof and Refinement of an Inequality of Feynman. J. Math. Phys. 6, 1155–1156 (1964)
27. Wehrl, A.: Three theorems about entropy and convergence of density matrices. Rep. Math. Phys. 10, 159–163 (1976)

Communicated by H.-T. Yau
Commun. Math. Phys. 264, 539–561 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1488-1
Communications in
Mathematical Physics
On Metastability in FPU Dario Bambusi, Antonio Ponno Dipartimento di Matematica Via Saldini 50, 20133 Milano, Italy. E-mail:
[email protected],
[email protected] Received: 3 May 2005 / Accepted: 18 July 2005 Published online: 26 January 2006 – © Springer-Verlag 2006
Abstract: We present an analytical study of the Fermi–Pasta–Ulam (FPU) α–model with periodic boundary conditions. We analyze the dynamics corresponding to initial data with one low frequency Fourier mode excited. We show that, correspondingly, a pair of KdV equations constitutes the resonant normal form of the system. We also use such a normal form in order to prove the existence of a metastability phenomenon. More precisely, we show that the time average of the modal energy spectrum rapidly attains a well defined distribution corresponding to a packet of low frequency modes. Subsequently, the distribution remains unchanged up to the time scales of validity of our approximation. The phenomenon is controlled by the specific energy.

1. Introduction

In this paper we present an analytical study of the Fermi–Pasta–Ulam (FPU) α–model with periodic boundary conditions for initial data with one low frequency Fourier mode excited. We give some rigorous results concerning the relaxation to a metastable state, in which energy sharing takes place among low frequency modes only.

The FPU model consists of a long chain of particles interacting with their nearest neighbours through nonlinear springs. It was first introduced and studied numerically by FPU [22] in order to determine the time of approach to equilibrium of the system. In the original FPU experiment all the energy was initially given to a single low frequency Fourier mode and the energies of the Fourier modes were plotted vs. time. The result was surprising: energy sharing occurred only among a few low frequency modes and an almost recurrent behaviour of the solution was observed. On the contrary, a fast approach to a state characterized by equipartition of all modal energies was expected.

The chain numerically integrated by FPU was composed of a relatively small number of particles; a problem that naturally arises is that of understanding whether the
unexpected lack of equipartition persists when the number of particles grows. Actually a huge number of numerical computations have been performed [5, 7, 26, 31], but the situation is not yet clear.

From the theoretical point of view, in the FPU problem there were initially two lines of research. The first one originated from the paper [40] by Zabusky and Kruskal, who numerically studied the dynamics of the Korteweg–de Vries equation (KdV), which was heuristically known to describe the long wave solutions of the FPU. The authors observed a recurrent behaviour in the KdV and interpreted it as a possible explanation of the FPU recurrence. The paper by Zabusky and Kruskal constituted the starting point of the theory of Lax–integrable partial differential equations, but, as far as we know, the relevance of the KdV equation for the FPU relaxation problem was never completely clarified.

A second line of research was initiated by Izrailev and Chirikov [24] and is based on the Kolmogorov–Arnold–Moser (KAM) theorem, or more generally on the application of canonical perturbation theory to the study of FPU. The idea of Izrailev and Chirikov is that an energy threshold exists, below which KAM theory is (in principle) applicable (actually the applicability of KAM theory to FPU is a delicate question, since one has to verify the validity of the KAM nondegeneracy condition, which was accomplished only recently in the paper [37]). The main point is the dependence of such a threshold on the number of degrees of freedom. The thesis by Izrailev and Chirikov is that, if the mode initially excited has high frequency, then the threshold goes to zero as the number of degrees of freedom increases, so that the region of recurrent motions becomes irrelevant. Afterwards, many heuristic arguments have been developed in order to support and refine Chirikov's thesis. In particular, Shepeliansky gave some heuristic arguments according to which Chirikov's thesis should hold also for low frequency initial excitations [38]. Anyway, up to now no rigorous result is available.

It has to be noticed that the thesis of Izrailev–Chirikov–Shepeliansky is hardly compatible with the result of Zabusky–Kruskal: according to the former authors the FPU phenomenon disappears when the number of degrees of freedom is large, while the latter explain the FPU recurrence by making use of a PDE, which requires a large number of degrees of freedom.

Finally a new theoretical scenario, which we call the metastability scenario, was proposed for the FPU problem in the paper [17] (see also [26]). The thesis is that the FPU system approaches, in a relatively short time, a first state whose modal energy spectrum displays a plateau of equipartition among low frequency modes, followed by an exponentially decreasing tail in the region of high frequencies. Complete equipartition is eventually reached on a second very long time–scale. In [17] the presence of the exponential tail in the energy spectrum of the metastable state was explicitly referred to as "similar to Wien's law for black–body radiation". Actually such an analogy was previously pointed out by Galgani and Scotti [23], who fitted the FPU energy spectrum to a Planck–like distribution. A new emphasis to such a metastability scenario was given in the papers [9–11].

2. Main Ideas

In the present paper we consider low frequency initial data and, following the line sketched in [29], we unify the first two approaches presented above, in the sense that we show that canonical perturbation theory leads to the Zabusky–Kruskal result. More precisely, we show that a pair of KdV equations constitutes the resonant normal form of
FPU in the standard sense of canonical perturbation theory. We also use such a normal form in order to give a first rigorous result on energy sharing among the modes. In doing this, we show that the result of Zabusky–Kruskal is controlled by the specific energy, so that the stability phenomenon should persist in the thermodynamic limit, contrary to the thesis of Izrailev–Chirikov–Shepeliansky. On the other hand, we make a bridge with the metastability scenario of [17], because we point out the relevance of the time scales over which different qualitative descriptions of the dynamics hold.

More precisely, we consider a very long chain with periodic boundary conditions, and focus on initial data in which only one Fourier mode with very small index (i.e. with low frequency) is initially excited. It is useful to describe the system using an interpolating function, namely a function whose values at integers are the displacements of the particles from equilibrium. It turns out that such an interpolating function has to fulfill a differential-difference equation which is well approximated (for long wavelengths) by a partial differential equation coinciding, at first order, with the linear string equation. More precisely, the Hamiltonian of the system describing the interpolating function has the structure
$$H_0 + P + R_1, \tag{2.1}$$
where $H_0$ is the Hamiltonian of the linear wave equation, $P$ contains the lowest order (nonlinear and dispersive) corrections, and $R_1$ contains higher order corrections.

In order to take into account the corrections to the dynamics due to $P$ we use the methods of Hamiltonian perturbation theory. In particular we apply the Galerkin averaging method of [3]. Thus we construct a canonical transformation conjugating the original system to a system with Hamiltonian $H_0 + \langle P\rangle + R$, where $\langle P\rangle$ is the average of $P$ with respect to the flow of $H_0$ (so that $H_0 + \langle P\rangle$ coincides with the normal form of the system), and $R$ is a remainder whose size is here rigorously estimated (uniformly with respect to the length of the chain) for states with small specific energy (but large total energy).

Then, we explicitly compute the averaged Hamiltonian $H_0 + \langle P\rangle$, and show that its equations of motion consist of a pair of uncoupled KdV equations with periodic boundary conditions on a ring of length 2 (independently of the number of particles).

As a third step we use these KdV's to construct approximate solutions of the FPU chain and we estimate the error with respect to a true solution. Denote by $\mu$ the wave number of the initially excited mode, and assume it has specific energy $\mathcal{E} \equiv E/N \sim \mu^4$ (where $N$ is the number of particles and $E$ the total energy); then the dynamics of the KdV equations gives rise to finite size effects over a time–scale $\mu^{-3}$. In order to get an estimate of the error valid over such a time–scale we use a technique by Schneider and Wayne [39]. It turns out that, having fixed an arbitrarily long time $T_f$, the KdV's describe the solutions of the FPU up to a time $T_f\,\mu^{-3}$.

Finally we use known results on the KdV dynamics with periodic boundary conditions in order to compute the energy per mode along an approximate solution of the FPU system. In particular, denoting by $E_k$ the energy in the $k$-th mode and by $\mathcal{E}_k := E_k/N$ the corresponding specific energy, we prove that, for the considered initial data, $\mathcal{E}_k$ decreases as $\exp(-\sigma\kappa/\mu)$, with $\kappa = k/N$ and $\sigma > 0$, at least for the times such that the approximation is valid. Moreover, if we consider the time average of $\mathcal{E}_k$, we prove that it quickly relaxes to a certain energy distribution, and then remains unchanged up to the times accessible within our approximation.
Notice that the time–scale $\mu^{-3} \sim \mathcal{E}^{-3/4}$ for the formation of the packet and the width $\mu \sim \mathcal{E}^{1/4}$ display the same dependence on the specific energy $\mathcal{E}$ as numerically observed in [5] and [7], and heuristically predicted in [29, 32, 33, 28].

As far as we know, this is the first rigorous result on a large FPU chain with finite specific energy. Moreover, this is a first rigorous description of the fast formation of a metastable packet of modes of the type observed by FPU.

The main limitation of our result concerns the choice of the initial data: one would like to consider initial data involving e.g. a small packet of nearby modes, as in most numerical computations, while here only one mode (and possibly its higher harmonics) is excited. The reason for our limitation is that the manifold consisting of states with only one mode and its higher harmonics excited is invariant (see [36]). Moreover, on this manifold the dynamics is equivalent to the dynamics of a chain with $2/\mu$ particles. However, the fact that the result involves the specific energy and moreover is in agreement with numerical results with low frequency initial data on a full packet of low frequency modes seems to suggest that this limitation may be just a technical one.

From the technical point of view, the core of our paper consists in the proof that a pair of KdV's is the normal form of the FPU problem and in an estimation of the error. We point out that a previous result on the justification of KdV as a modulation equation for FPU was obtained by Schneider and Wayne in [39]. In their paper the attention was restricted to the case of solutions fast decreasing in space, whereas we deal here with space–periodic ones. The fact that a pair of uncoupled KdV equations describes well the FPU dynamics when the initial datum is space periodic is quite surprising. Indeed, the two waves travelling in the chain and described by the KdV equations continue to interact forever and one might expect some constructive interference to occur. This is not the case, essentially due to the structure of the FPU nonlinearity. This is in sharp contrast with the typical behaviour for short waves; see [35, 4].

We also mention the papers [18–21], where a remarkable connection between the FPU and the KdV has been obtained. However, this series of papers also refers to initial data that decay fast in space and thus is not directly connected with the problem of thermalization.
3. Main Result

Consider the Hamiltonian system

$$H(q,p) = \sum_{j=-N}^{N-1} \left[ \frac{p_j^2}{2} + U(q_{j+1} - q_j) \right], \tag{3.1}$$

$$U(x) = \frac{x^2}{2} + \frac{x^3}{3}, \tag{3.2}$$

$$q_{j+2N} = q_j, \qquad p_{j+2N} = p_j, \tag{3.3}$$

describing a chain composed of $2N$ particles interacting through nonlinear springs. The canonical variables are $q = (q_{-N}, \dots, q_{N-1})$, $p = (p_{-N}, \dots, p_{N-1})$. The Hamiltonian (3.1) is known as the Fermi, Pasta and Ulam (FPU) α-model (with α = 1). Note that, due to the periodic boundary conditions (3.3), the total linear momentum of the system is conserved. So one can restrict oneself to the case $\sum_j p_j = \sum_j q_j = 0$.
On Metastability in FPU
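Introduce the Fourier coefficients $\hat p_k$ by

$$p_j = \frac{1}{\sqrt{2N}} \sum_{k=-N}^{N-1} \hat p_k \, e^{i\frac{jk\pi}{N}} \tag{3.4}$$

and similarly for $q_j$. We denote by

$$E_k := \frac{|\hat p_k|^2 + \omega_k^2 |\hat q_k|^2}{2}, \qquad k = -N, \dots, N-1 \tag{3.5}$$

the energy of the $k$th mode, where $\omega_k := 2\left|\sin\frac{k\pi}{2N}\right|$.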
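As an illustration of definitions (3.4) and (3.5), the following minimal sketch computes the mode energies of a given state of the chain. The function name `mode_energies` and the array layout (entry 0 of `q` and `p` holding the particle $j = -N$) are assumptions made here for illustration; they are not part of the paper.

```python
import numpy as np

def mode_energies(q, p):
    """Harmonic energies E_k, k = -N..N-1, of a periodic chain of 2N particles.

    q[0], p[0] are assumed to belong to the particle j = -N (illustrative layout).
    """
    two_n = len(q)
    n = two_n // 2
    j = np.arange(-n, n)  # particle labels j = -N..N-1
    k = np.arange(-n, n)  # mode labels k = -N..N-1
    # Inverse of (3.4): hat p_k = (2N)^(-1/2) sum_j p_j exp(-i j k pi / N)
    phase = np.exp(-1j * np.pi * np.outer(k, j) / n)
    p_hat = phase @ np.asarray(p, dtype=float) / np.sqrt(two_n)
    q_hat = phase @ np.asarray(q, dtype=float) / np.sqrt(two_n)
    omega = 2.0 * np.abs(np.sin(k * np.pi / (2 * n)))
    # Definition (3.5)
    return k, (np.abs(p_hat) ** 2 + omega ** 2 * np.abs(q_hat) ** 2) / 2.0
```

Since the transform is unitary, the sum of the $E_k$ reproduces the quadratic part of (3.1), which gives a simple consistency check on any implementation.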
Remark 3.1. For real states one has $E_k = E_{-k}$ for all $k$, thus we will consider only positive indexes.

It is convenient to state our main result in terms of “specific quantities”, thus we will label the modes with the index

$$\kappa := \frac{k}{N};$$

correspondingly we denote by

$$\mathcal{E}_\kappa := \frac{E_k}{N} \tag{3.6}$$

the specific energy in the mode with index $\kappa$. In the following a small but finite index $k_0/N \equiv \kappa_0 \equiv \mu \ll 1$ will appear.

Theorem 3.2. Fix a constant $C_0$ and a positive (large) time $T_f$; then there exist positive constants $\mu_*$, $C_1$, $C_2$, dependent only on $C_0$ and on $T_f$, such that the following holds. Consider an initial datum with

$$\mathcal{E}_{\kappa_0}(0) = C_0\mu^4, \qquad \mathcal{E}_\kappa(0) \equiv \mathcal{E}_\kappa(t)\big|_{t=0} = 0, \quad \forall \kappa \neq \kappa_0, \tag{3.7}$$

and assume $\mu < \mu_*$. Then, there exists $\sigma > 0$ such that, along the corresponding solution, one has

(i)

$$\mathcal{E}_\kappa(t) \le \mu^4 C_1 e^{-\sigma\kappa/\mu} + C_2\mu^5, \qquad \text{for } |t| \le \frac{T_f}{\mu^3} \tag{3.8}$$

for all $\kappa > 0$.

(ii) There exists a sequence of almost periodic functions $\{F_n\}_{n\in\mathbb{N}}$ such that, defining the specific energy distribution

$$\mathcal{F}_{n\kappa_0} = \mu^4 F_n, \qquad \mathcal{F}_\kappa = 0 \ \text{ if } \kappa \neq n\kappa_0, \tag{3.9}$$

one has

$$|\mathcal{E}_\kappa(t) - \mathcal{F}_\kappa(t)| \le C_2\mu^5, \qquad |t| \le \frac{T_f}{\mu^3}. \tag{3.10}$$
Remark 3.3. Since $F_n(t)$ are almost periodic functions of time, their time average, defined by

$$\bar F_n := \lim_{T\to\infty} \frac{1}{T}\int_0^T F_n(t)\,dt, \tag{3.11}$$

exists (see e.g. [16]). It follows that, up to the error, the time average of $\mathcal{E}_\kappa(t)$ relaxes to the limit distribution obtained by rescaling $\bar F_n$ as in (3.9).

Remark 3.4. One can give heuristic arguments to show that the (rescaled) limit distribution $\bar F_n$ is the same for all initial data in a set of full measure. Moreover such a limit distribution was computed explicitly in [32], obtaining a result in very good agreement with the numerical observations by [5]. However, we were unable to transform the heuristic argument into a rigorous one.

Remark 3.5. There exist numerical results showing that the time $T_e$ of approach to equipartition in FPU systems is a stretched exponential of the inverse of the specific energy $\mathcal{E}$: $T_e \sim \exp[(1/\mathcal{E})^a]$ [31, 6]. The existence of such a time–scale à la Nekhoroshev was first conjectured in [17] making use of probabilistic arguments. It is not yet clear whether the metastable state with energy distribution $\bar{\mathcal{E}}_\kappa$ may survive over such a time–scale. The only rigorous result in this direction was obtained in [8] (see also [30]), where the exponential stability of the fundamental mode of a nonlinear string was proved.

Remark 3.6. We expect Theorem 3.2 to hold also in the β–FPU model (the time scale should be substituted by $\mu^{-4}$). Indeed, the theory of Sects. 4, 5 can be trivially generalized to the β model, the only difference being that the KdV equation has to be substituted by a different integrable equation, namely the modified KdV equation (mKdV). However, the study of the modified KdV is less developed than the study of the KdV equation, so, even if the results of Sect. 6 are expected to hold also in the case of the mKdV, there are no “ready to use theorems” available.

Remark 3.7. It is very easy to see that a variant of Theorem 3.2 holds also in the case where not only the first Fourier mode is excited, but also its higher harmonics are excited, provided that the energy decreases exponentially or at least quadratically with $\kappa/\mu$.

Remark 3.8. With an extension of our theory we would (probably) be able to prove stability of the solutions constructed in Theorem 3.2 with respect to excitations involving a small packet of modes, but only on a time–scale of order $\mu^{-2}$. Over such a time–scale the effects of the nonlinearity are not visible, so this extension has to be considered unsatisfactory. On the time–scale $\mu^{-3}$, at present, we are only able to prove stability of the solutions we constructed for perturbations of the initial data that decay fast in space (i.e. with vanishing specific energy). Thus the energy spectrum of the initial data that we can control has the shape of a sequence of peaks of height proportional to $N$, but decreasing exponentially with $\kappa$, each with a superimposed bump of modes of small height. Work is in progress in order to deal with more general initial data.

4. Normal Form

In this section we compute the normal form of the FPU and we give a rigorous estimate of the remainder.
From now on, instead of the “specific index $\kappa$” we will use integers to label the modes and the energy per mode $E_k$ instead of the specific energy per mode $\mathcal{E}_\kappa = E_k/N$. As above, corresponding to an integer index $1 \le k_0 \le N$ we define the parameter

$$\mu := \frac{k_0}{N}. \tag{4.1}$$

Rewrite the FPU system in terms of new rescaled variables $r_j$ defined by

$$\sum_j r_j = 0, \qquad \mu^2 r_j := q_j - q_{j-1}; \tag{4.2}$$

one has that the change of variables $q \to r$ is well defined and invertible. Introducing also the operator of second difference $\Delta_1$ by

$$(\Delta_1 r)_j := r_{j+1} + r_{j-1} - 2r_j, \tag{4.3}$$

the FPU equations take the form

$$\ddot r_j = \left(\Delta_1(r + \mu^2 r^2)\right)_j. \tag{4.4}$$

Remark 4.1. Introducing also the momenta $s_j$ defined by

$$\sum_j s_j = 0, \qquad p_j = \mu^2\,(s_j - s_{j+1}), \tag{4.5}$$

one gets that the transformation $(p, q) \to (s, r)$ is canonical. Moreover, it is easy to verify that in these variables one has

$$E_k = \mu^4\, \frac{|\hat r_k|^2 + \omega_k^2 |\hat s_k|^2}{2} \tag{4.6}$$

with $\hat r_k$ and $\hat s_k$ the Fourier coefficients of $r$ and $s$, respectively.

We introduce now an interpolating function $r = r(x, t)$ for the sequence $r_j$, namely a (smooth) function with the property that the sequence

$$r_j(t) \equiv r(j, t) \tag{4.7}$$

fulfills the FPU equations (4.4). Moreover we will assume that the function $r(x)$ is $2/\mu$ periodic and has zero average, namely that

$$r(x + 2/\mu, t) = r(x, t), \qquad \int_{-1/\mu}^{1/\mu} r(x, t)\,dx = 0. \tag{4.8}$$

Thus we postulate that the function $r$ fulfills

$$\ddot r = \Delta_1(r + \mu^2 r^2) \tag{4.9}$$

with an obvious extension of the definition of $\Delta_1$ to smooth functions. It is easy to verify that this system is Hamiltonian with Hamiltonian function

$$H(r, s) := \int_{-1/\mu}^{1/\mu} \left[ \frac{-s\,\Delta_1 s + r^2}{2} + \mu^2 \frac{r^3}{3} \right] dx \tag{4.10}$$
and with $s$ a periodic function with zero average, playing the role of the momentum conjugated to the function $r(x)$. The momentum $s(x)$ is actually an interpolating function for the momentum introduced in Remark 4.1. Actually one has $s_j(t) = s(j, t)$. The Hamilton equations of (4.10) are given by

$$\frac{dr}{dt} = \frac{\delta H}{\delta s}, \qquad \frac{ds}{dt} = -\frac{\delta H}{\delta r} \tag{4.11}$$

with $\frac{\delta H}{\delta r}$ denoting the $L^2$ gradient of $H$ with respect to $r$ and similarly for $\frac{\delta H}{\delta s}$.

It is now convenient to rescale the length of the ring and the size of the momentum $s$, by introducing as new phase variables two functions $(u, v)$ periodic of period 2, defined by

$$v(\mu x) = \mu s(x), \qquad u(\mu x) = r(x). \tag{4.12}$$

In the following we will denote by $y$ the rescaled space variable, namely $y = \mu x$. The coordinate transformation (4.12) is not canonical, but it turns out that the equations for the variables $(u, v)$ are still Hamiltonian with the original symplectic structure, and with Hamiltonian function

$$H(u, v) = \mu K(u, v) \tag{4.13}$$

with

$$K(u, v) = \int_{-1}^{1} \left[ \frac{-v\,\Delta_\mu v}{2\mu^2} + \frac{u^2}{2} + \mu^2\frac{u^3}{3} \right] dy, \tag{4.14}$$

where we introduced the difference operator

$$(\Delta_\mu v)(y) := v(y + \mu) + v(y - \mu) - 2v(y). \tag{4.15}$$

Remark 4.2. From now on we will study the system (4.14). This clearly amounts to introducing a new time $\tau \equiv \mu t$. More precisely, denote by $u(\tau), v(\tau)$ a solution of the equations of motion of $K$, namely of

$$\frac{du}{d\tau} = \frac{\delta K}{\delta v}, \qquad \frac{dv}{d\tau} = -\frac{\delta K}{\delta u}. \tag{4.16}$$

Then $u(\mu t), v(\mu t)$ is a solution of the equations of motion of $H$.
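The formal expansion of the operator $\Delta_\mu$, defined in (4.15), gives

$$\frac{\Delta_\mu}{\mu^2} = \partial_y^2 + \frac{\mu^2\,\partial_y^4}{12} + O(\mu^4), \tag{4.17}$$

so that one has

$$K = H_0 + P + R_1, \tag{4.18}$$

with

$$H_0(u, v) := \int_{-1}^{1} \frac{v(-\partial_y^2 v) + u^2}{2}\,dy, \tag{4.19}$$

$$P(u, v) := \int_{-1}^{1} \left[ -\mu^2\frac{v\,\partial_y^4 v}{24} + \mu^2\frac{u^3}{3} \right] dy, \tag{4.20}$$

$R_1$ being the remainder of the expansion.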
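The expansion (4.17) can be checked numerically on a single Fourier mode. The sketch below is only an illustration under stated assumptions: the test function $\cos(\pi K y)$, the grid, and the helper names `second_difference` and `truncation_error` are choices made here, not taken from the paper. It compares $\Delta_\mu/\mu^2$ with the two leading terms of (4.17); the discrepancy should scale like $\mu^4$.

```python
import numpy as np

def second_difference(v, y, mu):
    """(Delta_mu v)(y) = v(y + mu) + v(y - mu) - 2 v(y), as in (4.15)."""
    return v(y + mu) + v(y - mu) - 2.0 * v(y)

def truncation_error(mu, wavenumber=1):
    """Max error of Delta_mu/mu^2 against the two leading terms of (4.17),
    evaluated on the test function cos(pi * wavenumber * y)."""
    kappa = np.pi * wavenumber
    y = np.linspace(-1.0, 1.0, 201)
    v = lambda s: np.cos(kappa * s)
    exact = second_difference(v, y, mu) / mu**2
    # For cos(kappa*y): second derivative = -kappa^2 v, fourth derivative = kappa^4 v
    approx = (-kappa**2 + mu**2 * kappa**4 / 12.0) * v(y)
    return np.max(np.abs(exact - approx))
```

Halving $\mu$ should reduce `truncation_error(mu)` by a factor close to 16, in agreement with the $O(\mu^4)$ remainder.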
Remark 4.3. The equations of motion of the Hamiltonian $H_0$ are

$$u_\tau = -\partial_y^2 v, \qquad v_\tau = -u, \tag{4.21}$$

and thus they are equivalent to the linear wave equation. Its flow will be denoted $\Phi^\tau(v, u)$ and is periodic in time with period 2.

Following [3] we are going to use a Galerkin averaging method in order to compute the corrections to the dynamics due to the presence of $P$, and to estimate the effect of $R_1$. To this end we first have to introduce a topology in the phase space. This is conveniently done in terms of Fourier coefficients.

Definition 4.4. Having fixed two positive constants $s$, $\sigma$, consider the Hilbert space $\ell^2_{\sigma,s}$ of the complex sequences $v \equiv \{v_K\}_{K\in\mathbb{Z}-\{0\}}$ such that

$$\|v\|^2_{\sigma,s} := \sum_K |v_K|^2\, |K|^{2s} e^{2\sigma|K|} < \infty. \tag{4.22}$$

We will identify a 2 periodic function $v$ with its Fourier coefficients $\hat v_K$ defined by

$$v(y) = \frac{1}{\sqrt 2}\sum_{K\in\mathbb{Z}} \hat v_K e^{i\pi K y},$$

and we will say that $v \in \ell^2_{\sigma,s}$ if its Fourier coefficients have this property. Moreover in what follows the coefficient $\sigma$ will be kept fixed. We will study the system $K(u, v)$ in the phase spaces $\mathcal{P}_s$ defined by

$$\mathcal{P}_s := \ell^2_{\sigma,s+1} \times \ell^2_{\sigma,s} \ni (v, u), \tag{4.23}$$

endowed with the norm

$$\|(v, u)\|^2_s := \|v\|^2_{\sigma,s+1} + \|u\|^2_{\sigma,s}. \tag{4.24}$$
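To make the weighted norms (4.22) and (4.24) concrete, here is a minimal sketch that evaluates them for finitely many Fourier coefficients. Representing a sequence as a Python dictionary mapping the index $K$ to the coefficient, and the function names `weighted_norm` and `phase_space_norm`, are assumptions made for illustration only.

```python
import math

def weighted_norm(coeffs, sigma, s):
    """Norm of (4.22): sqrt(sum_K |v_K|^2 |K|^(2s) exp(2 sigma |K|)), K != 0."""
    total = 0.0
    for K, v_K in coeffs.items():
        if K == 0:
            continue  # the sequences of Definition 4.4 are indexed by K != 0
        total += abs(v_K) ** 2 * abs(K) ** (2 * s) * math.exp(2 * sigma * abs(K))
    return math.sqrt(total)

def phase_space_norm(v_coeffs, u_coeffs, sigma, s):
    """Norm of (4.24): sqrt(||v||_{sigma,s+1}^2 + ||u||_{sigma,s}^2)."""
    return math.sqrt(weighted_norm(v_coeffs, sigma, s + 1) ** 2
                     + weighted_norm(u_coeffs, sigma, s) ** 2)
```

The exponential weight is what encodes analyticity in a strip of width $\sigma$: a finite norm forces $|v_K|$ to decay at least like $e^{-\sigma|K|}$.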
A phase point $(v, u)$ will also be denoted by $z$, and the ball of radius $R$ centered at the origin of $\mathcal{P}_s$ will be denoted by $B_s(R)$. It is easy to see that the flow $\Phi^\tau$ of the system $H_0$ is unitary in all the spaces $\mathcal{P}_s$.

Theorem 4.5. For any $r \ge 5$ there exists a constant $\mu_* \equiv \mu_{*r}$, such that, if $\mu < \mu_*$, then there exists an analytic canonical transformation $T : B_r(1) \to B_r(2)$ which averages $K$, namely such that

$$K \circ T = H_0 + \langle P\rangle + R; \tag{4.25}$$

here

$$\langle P\rangle(z) := \frac{1}{2}\int_0^2 P(\Phi^\tau(z))\,d\tau \tag{4.26}$$

and the vector field $X_R$ of the remainder is analytic in a complex ball of radius 1 and fulfills the estimate

$$\sup_{\|z\|_r \le 1} \|X_R(z)\|_0 \le C_r\,\mu^{4 - \frac{12}{6+r}}. \tag{4.27}$$

Moreover for any $1 \le r_1 \le r$ the transformation $T$ maps $B_{r_1}$ into $\mathcal{P}_{r_1}$ and fulfills

$$\sup_{\|z\|_{r_1} \le 1} \|z - T(z)\|_{r_1} \le C\mu^{2 - \frac{6}{6+r}}. \tag{4.28}$$
The proof is an application of the techniques of [3] and, for the sake of completeness, it will be given in Appendix A.

Remark 4.6. We recall that a heuristic discussion on the possibility of putting the FPU system in normal form corresponding to low frequency initial data was given in [38]. The above theorem rigorously proves that this is indeed possible. Below we give the explicit expression of the normal form, which is integrable! As a consequence we think that some of the conclusions of the paper [38], which are based on the heuristic argument that resonances enforce chaos, could be incorrect.

In the rest of this section we will perform the explicit computation of the averaged equations, showing that they coincide with two uncoupled KdV equations. To obtain the result it is useful to introduce new variables in which the unperturbed flow $\Phi^\tau$ assumes a simpler form. To this end we introduce the non canonical transformation

$$\xi := \frac{u + v_y}{\sqrt 2}, \qquad \eta := \frac{u - v_y}{\sqrt 2}. \tag{4.29}$$

Since the transformation is not canonical one has to modify the Poisson tensor in order to deduce the equations of motion from the Hamiltonian.

Lemma 4.7. In terms of the variables $\xi, \eta$ the Poisson tensor takes the form

$$J = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \partial_y, \tag{4.30}$$

i.e. the Hamilton equations associated to a Hamiltonian function $H$ take the form

$$\frac{dz}{d\tau} = J\nabla H(z) \iff \xi_\tau = -\partial_y\frac{\delta H}{\delta\xi}, \quad \eta_\tau = \partial_y\frac{\delta H}{\delta\eta}, \tag{4.31}$$

where $\nabla H$ denotes the $L^2$ gradient and $z = (\xi, \eta)$.

In the variables $(\xi, \eta)$ the various parts of the Hamiltonian take the form

$$H_0(\xi, \eta) = \int_{-1}^{1} \frac{\xi^2 + \eta^2}{2}\,dy, \tag{4.32}$$

$$P(\xi, \eta) = \int_{-1}^{1} \left[ -\mu^2\frac{[\partial_y(\xi - \eta)]^2}{48} + \mu^2\frac{(\xi + \eta)^3}{6\sqrt 2} \right] dy, \tag{4.33}$$

and in particular the equations of motion of $H_0$ assume the simple form

$$\xi_\tau = -\xi_y, \quad \eta_\tau = \eta_y \iff \left[\xi(y, \tau) = \xi_0(y - \tau), \quad \eta(y, \tau) = \eta_0(y + \tau)\right]. \tag{4.34}$$

It is now easy to obtain the following
Proposition 4.8. In the variables $\xi, \eta$ the average of the perturbation is given by

$$\langle P\rangle(\xi, \eta) = \int_{-1}^{1} \left[ -\mu^2\frac{\xi_y^2 + \eta_y^2}{48} + \mu^2\frac{\xi^3 + \eta^3}{6\sqrt 2} \right] dy, \tag{4.35}$$

and the equations of motion of $H_0 + \langle P\rangle$ are given by

$$\xi_\tau = -\xi_y - \mu^2\frac{1}{24}\xi_{yyy} - \mu^2\frac{1}{2\sqrt 2}\xi\xi_y, \tag{4.36}$$

$$\eta_\tau = \eta_y + \mu^2\frac{1}{24}\eta_{yyy} + \mu^2\frac{1}{2\sqrt 2}\eta\eta_y, \tag{4.37}$$

i.e. two uncoupled KdV equations in translating frames, and therefore such equations constitute the resonant normal form of FPU in the region of the phase space corresponding to long wavelength excitations.

Remark 4.9. It is a remarkable fact that averaging an infinite dimensional system with respect to one angle only one gets a normal form that is integrable (two uncoupled KdV). Similar phenomena were already pointed out in the β–FPU model (see [37]) and for the water wave problem (see [15, 12–14]). We have no a priori explanation of this fact.

Remark 4.10. One could also write down the normal form in the original variables $u, v$, but the resulting expression would turn out to be quite complicated and difficult to read.

Proof. One has to compute the average of the different terms composing Eq. (4.33). As an example we deal explicitly with the term proportional to $\int_{-1}^{1} dy\,\xi_y\eta_y$. One has

$$\left\langle \int_{-1}^{1} dy\,\xi_y\eta_y \right\rangle = \frac{1}{2}\int_0^2 ds \int_{-1}^{1} dy\,\xi_y(y - s)\,\eta_y(y + s) = \frac{1}{4}\int_{-2}^{0} d\alpha \int_0^4 d\beta\,\xi_y(\alpha)\,\eta_y(\beta), \tag{4.38}$$

which vanishes due to the fact that $\xi_y$ has zero average. Performing the same computation over all the terms one gets the result.

Since we are interested in the energy per mode we give now the relation of $E_k$ with the Fourier coefficients of $\xi$ and $\eta$, which in turn are defined by

$$\xi(y) = \frac{1}{\sqrt 2}\sum_{K\in\mathbb{Z}} \hat\xi_K e^{iKy\pi} \tag{4.39}$$

and similarly for $\eta$.

Proposition 4.11. Let $\xi(y), \eta(y)$ be a pair of functions belonging to $\mathcal{P}_0$; denote by $E_k$ the energy in the $k$th mode as defined by (3.5) in terms of the original variables. Then, for $\mu$ small enough, one has

$$\left| \frac{E_k}{N} - \mu^4\frac{|\hat\xi_K|^2 + |\hat\eta_K|^2}{2} \right| \le C\mu^{\frac{11}{2}}\,\|(\xi, \eta)\|^2_0 \tag{4.40}$$

for all $k$ such that $\frac{k}{N} = \mu K$ with $|K| \le \frac{|\ln\mu|}{2\sigma}$;

$$\frac{|E_k|}{N} \le \mu^{\frac{11}{2}}\,\|(\xi, \eta)\|^2_0 \tag{4.41}$$

for all $k$ such that $\frac{k}{N} = \mu K$ and $|K| > \frac{|\ln\mu|}{2\sigma}$, and $E_k = 0$ otherwise.
The elementary proof is based on the exponential decay of the Fourier coefficients of a function in $\ell^2_{\sigma,0}$. It is deferred to Appendix B.

5. Estimate of the Error

Here we use the normal form to construct approximate solutions of FPU and we estimate their difference from true solutions. First we construct explicitly the approximate solutions. Consider the following pair of KdV equations

$$\xi_{\tau_1} = -\frac{1}{24}\xi_{yyy} - \frac{1}{2\sqrt 2}\xi\xi_y, \tag{5.1}$$

$$\eta_{\tau_1} = \frac{1}{24}\eta_{yyy} + \frac{1}{2\sqrt 2}\eta\eta_y, \tag{5.2}$$

obtained by rescaling time to $\tau_1 = \mu^2\tau$. Let $\xi^a(y, \tau_1), \eta^a(y, \tau_1)$ be a solution of such a pair of equations with the property that it belongs to $\mathcal{P}_r$ for all times $\tau_1$, with a given $r$. Correspondingly, we define an approximate solution $z_a \equiv (r^a, s^a)$ of the FPU by

$$r^a(x, t) := \frac{\xi^a(\mu(x - t), \mu^3 t) + \eta^a(\mu(x + t), \mu^3 t)}{\sqrt 2}, \tag{5.3}$$

$$s^a_x(x, t) := \frac{\xi^a(\mu(x - t), \mu^3 t) - \eta^a(\mu(x + t), \mu^3 t)}{\sqrt 2}. \tag{5.4}$$
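The structure of definition (5.3), two counter-propagating profiles modulated on the slow time $\mu^3 t$, can be sketched as follows. The function name `approximate_r` and the two sample profiles are hypothetical and serve only to show how the arguments are rescaled; in particular the sample profiles are frozen in the slow time and are not solutions of (5.1) and (5.2).

```python
import math

def approximate_r(x, t, mu, xi_a, eta_a):
    """r^a(x, t) of (5.3), built from callables xi_a(y, tau1) and eta_a(y, tau1)."""
    return (xi_a(mu * (x - t), mu**3 * t)
            + eta_a(mu * (x + t), mu**3 * t)) / math.sqrt(2.0)

# Hypothetical profiles, frozen in the slow time tau1, for illustration only.
xi_frozen = lambda y, tau1: math.cos(math.pi * y)
eta_zero = lambda y, tau1: 0.0
```

With these frozen profiles `approximate_r` reduces to a wave travelling to the right with unit speed, and it is $2/\mu$ periodic in $x$ as required by (4.8); the slow dependence on $\mu^3 t$ only enters through genuine solutions of the KdV pair.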
The main result of this section is a theorem comparing the approximate solution with a corresponding true solution. Precisely, consider an initial datum $(r_{0,j}, s_{0,j})$ and the corresponding Fourier coefficients $(\hat r_{0,k}, \hat s_{0,k})$ as defined by Eq. (3.4). We assume that they are different from zero only if $k/N = \mu K$ and that there exist two positive constants $C$ and $\rho$ such that

$$\frac{|\hat r_{0,k}|^2 + \omega_k^2|\hat s_{0,k}|^2}{N} \le C e^{-2\rho\frac{k}{\mu N}}.$$

Finally, we define uniquely a corresponding interpolating function for the initial datum by

$$r_0(y) := \frac{1}{\sqrt{2N}}\sum_K \hat r_{0,k}\,e^{i\pi\mu K y},$$

where the sum runs over the integers $K$ such that $|K|\mu = |k|/N \le 1$, and in the formula one has to read $k = \mu K N$. We will consider a similar interpolating function for $s_{0,j}$ and corresponding initial data for the KdV equations.

Theorem 5.1. Consider an initial datum for the FPU system with the above properties and denote by $(r_j(t), s_j(t))$ the corresponding solution. Consider the approximate solution $\xi^a(y, t), \eta^a(y, t)$ with the corresponding initial datum just constructed. Assume that for all times $t$ the approximate solution is such that $(\xi^a, \eta^a) \in \mathcal{P}_{78}$ with some $\sigma > 0$, and fix an arbitrary $T_f > 0$. Then there exists $\mu_*$ depending on $T_f$ and on $\|(\xi^a(t), \eta^a(t))\|_{78}$ only, such that, if $\mu < \mu_*$ then for all times $t$ fulfilling

$$|t| \le \frac{T_f}{\mu^3} \tag{5.5}$$

one has

$$\sup_j \left( \left|r_j(t) - r^a(j, t)\right| + \left|s_j(t) - s^a(j, t)\right| \right) \le C\mu, \tag{5.6}$$

where $r^a, s^a$ are given by (5.3), (5.4); moreover

$$\left| \frac{E_k(t)}{N} - \mu^4\frac{|\hat\xi^a_K(t)|^2 + |\hat\eta^a_K(t)|^2}{2} \right| \le C\mu^5 \tag{5.7}$$

for all $k$ such that $\frac{k}{N} = \mu K$ with $|K| \le \frac{|\ln\mu|}{2\sigma}$, and

$$\frac{|E_k(t)|}{N} \le \mu^5 \tag{5.8}$$

for all $k$ such that $\frac{k}{N} = \mu K$ with $|K| > \frac{|\ln\mu|}{2\sigma}$, whereas $E_k(t) = 0$ otherwise.
The proof of the theorem, which follows closely the strategy of [39], is deferred to Appendix C.

6. Dynamics of KdV and Conclusion of the Proof

In this section we recall some known facts on the dynamics of the KdV equation with periodic boundary conditions and we use them to prove the results of Sect. 3. Consider the KdV equation (5.1), namely

$$\xi_{\tau_1} = -\frac{1}{24}\xi_{yyy} - \frac{1}{2\sqrt 2}\xi\xi_y.$$

It is a well known consequence of the Lax pair formulation that the spectrum of the Sturm–Liouville operator

$$L_\xi := -\partial_{yy} + 6\sqrt 2\,\xi(y, \tau_1) \tag{6.1}$$

with periodic boundary conditions on $[0, 4]$ is invariant under the KdV evolution, i.e. it is independent of $\tau_1$. The spectrum of $L_\xi$ with periodic boundary conditions on $[0, 4]$ will be simply called the periodic spectrum of $\xi$. Such a periodic spectrum is of pure point type and consists of a sequence of eigenvalues

$$\lambda_0 < \lambda_1 \le \lambda_2 < \lambda_3 \le \lambda_4 < \cdots \tag{6.2}$$

(notice that the symbols $<$ and $\le$ alternate exactly). The quantities

$$\gamma_n := \lambda_{2n} - \lambda_{2n-1} \tag{6.3}$$

are called the gaps of the spectrum. From standard asymptotic properties of the spectrum one has $\gamma_n \in \ell^2$ for any $L^2$ potential $\xi$. Moreover, it has been proved by Garnett and Trubowitz that the sequence of the $\gamma_n$ entirely determines the periodic spectrum of $\xi$. A further, very important, feature of the above Sturm–Liouville problem is the relation between the sequence of the gaps and the regularity of the corresponding potential $\xi$. Indeed, to a certain extent the correspondence between the regularity of $\xi$ and the properties of the sequence $\gamma_n$ is the same as the one existing between the regularity of a function and its Fourier coefficients (see [27]). Precisely, the following theorem (from [34]) holds:
Theorem 6.1. Suppose $\xi \in L^2$; then $\xi \in \ell^2_{0,s}$ if and only if its gap lengths satisfy

$$\sum_{n\ge 1} n^{2s}|\gamma_n|^2 < \infty. \tag{6.4}$$

Moreover, if $\xi \in \ell^2_{\sigma,s}$ then

$$\sum_{n\ge 1} n^{2s} e^{2\sigma n}|\gamma_n|^2 < \infty; \tag{6.5}$$

conversely, if (6.5) holds, then $\xi \in \ell^2_{\sigma',0}$ with some $\sigma' > 0$.

From a Hamiltonian point of view the KdV is an integrable infinite dimensional system. It has been shown that a complete system of integrals of motion is given by the $\gamma_n^2$. Moreover the KdV admits global action angle coordinates. More precisely, the following result holds.

Theorem 6.2. [Kappeler–Pöschel [25]] There exists a diffeomorphism $\Omega : L^2 \to \ell^2_{0,1/2} \times \ell^2_{0,1/2}$ with the following properties¹:

i) $\Omega$ is one-to-one, onto, bianalytic, and canonical.

ii) For each $s \ge 0$, the restriction of $\Omega$ to $\ell^2_{0,s}$ is a map $\Omega : \ell^2_{0,s} \to \ell^2_{0,s+1/2} \times \ell^2_{0,s+1/2}$, which is one-to-one, onto, and bianalytic as well.

iii) The coordinates $(x, y) \in \ell^2_{0,3/2} \times \ell^2_{0,3/2}$ are Birkhoff coordinates for the KdV equation. That is to say, in terms of the coordinates $(x, y)$ the Hamiltonian $H_{KdV}$ of the KdV depends only on $I_n := (x_n^2 + y_n^2)/2$, $n \ge 1$, with $(x, y)$ canonically conjugated coordinates.

In terms of the variables $(x, y)$ the dynamics of the KdV is trivial. To describe the latter, fix an initial datum $(x^0, y^0)$, and define

$$\nu_n(x^0, y^0) := \frac{\partial H_{KdV}}{\partial I_n}(x^0, y^0);$$

then the equations of motion take the form

$$\dot x_n = \nu_n y_n, \qquad \dot y_n = -\nu_n x_n. \tag{6.6}$$
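Equations (6.6) are decoupled planar rotations, so their solution can be written in closed form. The sketch below assumes that the frequency $\nu_n$ is supplied as a number (along a solution it is constant, since it depends only on the conserved actions); the names `birkhoff_flow` and `action` are illustrative choices, not notation from the paper.

```python
import math

def birkhoff_flow(x0, y0, nu, t):
    """Solution at time t of x' = nu*y, y' = -nu*x (one index n of (6.6))."""
    c, s = math.cos(nu * t), math.sin(nu * t)
    return c * x0 + s * y0, -s * x0 + c * y0

def action(x, y):
    """Action I_n = (x_n^2 + y_n^2) / 2, conserved along the flow."""
    return (x * x + y * y) / 2.0
```

Each excited index contributes one frequency, which is why the number of nonzero actions decides whether the motion is periodic, quasiperiodic or almost periodic.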
Thus, it is immediately seen that any solution is periodic, quasiperiodic or almost periodic, depending on the number of gaps (actions) initially different from zero. With these tools at hand it is easy to obtain the

Proof of Theorem 3.2. We begin by proving (i). Consider an initial datum as in the statement of the theorem. This corresponds to initial data with $\xi$ and $\eta$ which are entire analytic functions (actually proportional to a sine). By Theorem 6.1 the corresponding sequence of gaps decreases exponentially with any coefficient $\rho$ in the exponential. This property is then conserved along the corresponding solution. Going back to Fourier coefficients one immediately deduces that the corresponding solution $\xi(\tau_1)$ is analytic in the $y$ variable in a complex strip of width $\sigma(\tau_1)$. Taking the minimum of such quantities one finds the coefficient $\sigma$ of Theorem 3.2. This is the result for the solution of the KdV equations. Using Theorem 5.1, Eq. (5.7), one goes back to the quantities $E_k$ and obtains the desired result.

In order to prove statement (ii) we use the fact that any solution is almost periodic in time. Denote the quantity

$$E_K^{(1)} := \left|\hat\xi_K\right|^2;$$

then $E_K^{(1)}(x(\tau_1), y(\tau_1))$ is almost periodic. Define also

$$E_K^{(2)} := \left|\hat\eta_K\right|^2$$

and $E_K := (\bar E_K^{(1)} + \bar E_K^{(2)})/2$. Scaling back to physical variables, using again Theorem 5.1, Eq. (5.7), and dividing by $N$ where required, one gets statement (ii).

¹ By abuse of notation, here $\ell^2_{0,\alpha}$ is the space of the sequences $\{x_n\}_{n\ge 1}$ such that $\sum_n n^{2\alpha}|x_n|^2 < \infty$.
A. Appendix: Proof of Theorem 4.5

Since the Hamiltonian (and its vector field) is analytic, it is useful to complexify the phase space. Thus, from now on we will think of the phase variable $z$ as a complex variable. The main reason is that, through the Cauchy inequality, the sup norm of a function controls also the supremum of the derivatives of the function. First we prove the following simple

Lemma A.1. For any $s \ge 0$ one has

$$\left\|X_{R_1}(z)\right\|_s \le 2\mu^4, \qquad \forall z : \|z\|_{s+5} \le 2, \tag{A.1}$$

$$\left\|X_P(z)\right\|_s \le C\mu^2, \qquad \forall z : \|z\|_{s+3} \le 2. \tag{A.2}$$

Proof. The estimate of $X_P$ is an immediate consequence of the definition of the norm and of the fact that $\ell^2_{\sigma,s}$ is an algebra for $s \ge 1$. Concerning $X_{R_1}$ just remark that the $K$th Fourier coefficient of its $u$ component is given (and estimated) by

$$\left| \left[X^u_{R_1}(u, v)\right]^\wedge_K \right| = \left| \left( \frac{4}{\mu^2}\sin^2\frac{K\mu\pi}{2} - \pi^2K^2 + \mu^2\frac{K^4\pi^4}{12} \right) \hat v_K \right| \le \frac{2\pi^6}{6!}K^6\mu^4\left|\hat v_K\right|, \tag{A.3}$$

from which the thesis follows.

Then we perform a Galerkin cutoff of $P$. Precisely, define the projector $\Pi_n$ on the Fourier modes with index smaller than $n$, i.e. $\Pi_n(\hat u_{-\infty} \dots \hat u_{-K} \dots \hat u_K \dots \hat u_\infty) = (\hat u_{-n} \dots \hat u_n)$, define also $\Pi_n(u, v) := (\Pi_n u, \Pi_n v)$, and finally define

$$P^{(n)}(z) := P(\Pi_n(z)). \tag{A.4}$$

Following [2] we have the following
Lemma A.2. For any $s \ge 1$ there exists a constant $C_s$ such that, for any $r \ge 0$, and any $n \ge 0$, one has

$$\left\|X_{P - P^{(n)}}(z)\right\|_s \le \frac{\mu^2 C_s}{n^r}, \qquad \forall z : \|z\|_{s+r+3} \le 3/2. \tag{A.5}$$

For the proof see the proof of Lemma 5.2 in [2]. Moreover it is easy to show that $X_{P^{(n)}}$ is analytic as a map from $\mathcal{P}_s$ to itself and that

$$\left\|X_{P^{(n)}}(z)\right\|_s \le \mu^2 C_s n^3 \qquad \forall z : \|z\|_s \le 2. \tag{A.6}$$

We now use the Lie transform to construct a canonical transformation averaging the Hamiltonian up to order $\mu^4$ (or more precisely, slightly less). Thus consider an auxiliary Hamiltonian function $\chi$ (of order $\mu^2$), assume that the corresponding Hamiltonian vector field is analytic as a map from $\mathcal{P}_s$ to itself $\forall s \ge 1$, and consider the corresponding Hamilton equations

$$\dot z = X_\chi(z). \tag{A.7}$$

Denote by $T^\tau$ the corresponding time $\tau$ flow and by $T$ the time 1 flow. We use such a $T$ in order to transform our Hamiltonian system $K$. One has

$$K \circ T = H_0 + P^{(n)} + \{\chi, H_0\} + R, \tag{A.8}$$

where

$$R = (P - P^{(n)}) \circ T + R_1 \circ T + \left[P^{(n)} \circ T - P^{(n)}\right] + \left[H_0 \circ T - H_0 - \{\chi, H_0\}\right] \tag{A.9}$$

is the sum of the higher order terms (they will be estimated in a while). First of all we choose $\chi$ in such a way that $P^{(n)} + \{\chi, H_0\} = \langle P^{(n)}\rangle$; according to Lemma 8.4 of [1] (a simple computation), this is given by

$$\chi(z) := \frac{1}{2}\int_0^2 \tau\left[ P^{(n)}(\Phi^\tau(z)) - \langle P^{(n)}\rangle(\Phi^\tau(z)) \right] d\tau \tag{A.10}$$

and its vector field is analytic and estimated by

$$\left\|X_\chi(z)\right\|_s \le \mu^2 C_s n^3 \qquad \forall z : \|z\|_s \le 2. \tag{A.11}$$

It also follows that the transformation $T$ exists and fulfills the estimates (4.28). Moreover the various terms of (A.9) are estimated by

Lemma A.3. The following estimates hold:

$$\left\|X_{(P - P^{(n)}) \circ T}\right\|_s \le \frac{\mu^2 C_s}{n^r}, \qquad \forall z : \|z\|_{s+r+3} \le 1, \tag{A.12}$$

$$\left\|X_{R_1 \circ T}\right\|_s \le C\mu^4, \qquad \forall z : \|z\|_{s+5} \le 1, \tag{A.13}$$

$$\left\|X_{P^{(n)} \circ T - P^{(n)}}\right\|_s \le C\mu^4 n^6, \qquad \forall z : \|z\|_s \le 1, \tag{A.14}$$

$$\left\|X_{H_0 \circ T - H_0 - \{\chi, H_0\}}\right\|_s \le C\mu^4 n^6, \qquad \forall z : \|z\|_s \le 1. \tag{A.15}$$
Proof. All these estimates are a direct application of some lemmas already proved in [1]. In particular (A.12) and (A.13) follow from Lemma 8.2 with $R = 3/2$ and $\delta = 1/2$ the first one, and $R = 2$ and $\delta = 1$ the second one. Equation (A.14) is a consequence of Lemma 8.3 with $R = 2$ and $\delta = 1$. Equation (A.15) is a consequence of Lemma 8.5 with $R = 2$ and $\delta = 1$.

We choose now $n$ in such a way that (A.12) and (A.14) are of the same order of magnitude. This leads to the choice $n = \mu^{-\frac{2}{r+6}}$, which gives the estimate (4.27) for the remainder. Up to now we have shown that

$$K \circ T = H_0 + \langle P^{(n)}\rangle + R \tag{A.16}$$

with $R$ fulfilling the wanted estimate. To conclude the proof it is enough to remark that

$$\left\|X_{\langle P\rangle - \langle P^{(n)}\rangle}(z)\right\|_s \le \frac{\mu^2 C_s}{n^r} \le C\mu^{4 - \frac{12}{6+r}}, \qquad \forall z : \|z\|_{s+r+3} \le 3/2, \tag{A.17}$$

and thus one can simply substitute $\langle P\rangle$ in place of $\langle P^{(n)}\rangle$, including the difference in the remainder.

B. Appendix: Proof of Proposition 4.11

Define the Fourier coefficients of the function $u$ by

$$\hat u_K := \frac{1}{\sqrt 2}\int_{-1}^{1} u(y)\,e^{-i\pi K y}\,dy, \tag{B.1}$$
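and similarly for $v$; then

Lemma B.1. For a state of the FPU corresponding to a pair of functions $(u, v)$ one has

$$\frac{E_k}{N} = \frac{\mu^4}{2}\left[ \Big|\sum_{L\in\mathcal{L}} \hat u_{K+L}\Big|^2 + \frac{\omega_k^2}{\mu^2}\Big|\sum_{L\in\mathcal{L}} \hat v_{K+L}\Big|^2 \right], \qquad \forall k : \mu K = \frac{k}{N}, \tag{B.2}$$

where

$$\mathcal{L} := \{L \in \mathbb{Z} : L\mu = 2l \text{ with } l \in \mathbb{Z}\} \tag{B.3}$$

and $E_k = 0$ otherwise.

Proof. First introduce a $2N$–periodic interpolating function for $r_j$, namely a smooth function $r^N(x)$ such that

$$r_j = r^N(j), \qquad r^N(x + 2N) = r^N(x). \tag{B.4}$$

Denote

$$\hat r^N_k := \frac{1}{\sqrt{2N}}\int_{-N}^{N} r^N(x)\,e^{-\frac{ik\pi x}{N}}\,dx, \tag{B.5}$$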
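then one has

$$r_j = r^N(j) = \frac{1}{\sqrt{2N}}\sum_{k\in\mathbb{Z}} \hat r^N_k\,e^{\frac{ik\pi j}{N}} = \frac{1}{\sqrt{2N}}\sum_{k=-N}^{N-1}\Big(\sum_{l\in\mathbb{Z}} \hat r^N_{k+2Nl}\Big)\,e^{\frac{ik\pi j}{N}},$$

which implies

$$\hat r_k = \sum_{l\in\mathbb{Z}} \hat r^N_{k+2Nl}. \tag{B.6}$$

Then the relation between $\hat r^N_k$ and $\hat u_K$ is easily obtained remarking that

$$r^N(j) = \mu^2 u(\mu j) = \frac{\mu^2}{\sqrt 2}\sum_{K\in\mathbb{Z}} \hat u_K\,e^{iK\mu j\pi} = \frac{1}{\sqrt{2N}}\sum_{k\in\mathbb{Z}} \hat r^N_k\,e^{\frac{ik\pi j}{N}}. \tag{B.7}$$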
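Proof of Proposition 4.11. We start from Eq. (B.2), and as a first step we remark that, for $K\mu = k/N$, one has

$$\frac{\omega_k}{\mu} = \frac{2}{\mu}\sin\frac{k\pi}{2N} = \frac{2}{\mu}\sin\frac{\mu K\pi}{2} \le \pi|K|, \tag{B.8}$$

and that, for $|K| \ge 2|\ln\mu|/\sigma$, one has

$$\frac{|\hat u_K|^2 + \pi^2K^2|\hat v_K|^2}{2} \le \pi^2\mu^4\,\|(u, v)\|^2_0. \tag{B.9}$$

Using the relation between $(u, v)$ and $(\xi, \eta)$ one gets

$$\frac{|\hat u_K|^2 + \pi^2K^2|\hat v_K|^2}{2} = \frac{|\hat\xi_K|^2 + |\hat\eta_K|^2}{2}, \tag{B.10}$$

from which, using (B.8), Eq. (4.41) immediately follows. Concerning (4.40) one has, for $|K| \le 2|\ln\mu|/\sigma$,

$$\left| \frac{E_k}{\mu^4 N} - \frac{|\hat\xi_K|^2 + |\hat\eta_K|^2}{2} \right| \le \left| \frac{\omega_k^2 - (\mu K)^2}{2\mu^2} \right| |\hat v_K|^2 + \frac{1}{2}\sum_{\substack{L\neq 0 \\ L\in\mathcal{L}}} \left( |\hat u_{K+L}|^2 + \frac{\omega_k^2}{\mu^2}|\hat v_{K+L}|^2 \right)$$

$$\le \frac{(\mu K)^4}{\mu^2}|\hat v_K|^2 + \frac{1}{2}\sum_{\substack{L\neq 0 \\ L\in\mathcal{L}}} \left( |\hat u_{K+L}|^2 + |K + L|^2|\hat v_{K+L}|^2 \right)$$

$$\le \mu^2(2|\ln\mu|)^2\,\|v\|^2_{\sigma,1} + \|(v, u)\|^2_0 \sum_{l\neq 0} e^{-2\sigma\frac{l}{\mu}}.$$

The logarithm of $\mu$ can obviously be estimated by $\mu^{-1/2}$, while the sum is exponentially small with $\mu$. Thus the thesis follows.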
C. Appendix: Proof of Theorem 5.1

It is useful to use also the variables $(u, v)$, to define

$$u^a(y, \tau) := \frac{\xi^a(y - \tau, \mu^2\tau) + \eta^a(y + \tau, \mu^2\tau)}{\sqrt 2}, \tag{C.1}$$

$$v^a_y(y, \tau) := \frac{\xi^a(y - \tau, \mu^2\tau) - \eta^a(y + \tau, \mu^2\tau)}{\sqrt 2}, \tag{C.2}$$

and denote $z_a(y, \tau) = (u^a(y, \tau), v^a(y, \tau))$. Then, in order to get a better approximation we define

$$(\tilde u, \tilde v) \equiv \tilde z = T(z_a) = z_a + \psi^a(z_a), \tag{C.3}$$

where

$$\left\|\psi^a\right\|_r \le C\mu^{2 - \frac{6}{6+r}}, \tag{C.4}$$

and $(\tilde u, \tilde v)$ fulfills the equations

$$\frac{\tilde v_t}{\mu} = -\tilde u - \mu^2\pi_0\tilde u^2 + R^v, \tag{C.5}$$

$$\tilde u_t = -\Delta_1\frac{\tilde v}{\mu} + \mu R^u, \tag{C.6}$$

where the operator $\Delta_1$ acts in terms of the $x$ variables, the remainders are functions of $y, \tau$ which fulfill

$$\left\|R^v\right\|_{\sigma,1} \le C\mu^{4 - \frac{12}{6+r}}, \qquad \left\|R^u\right\|_{\sigma,0} \le C\mu^{4 - \frac{12}{6+r}}, \tag{C.7}$$

and $\pi_0$ is the projector on the space of the functions with zero average.

We restrict the space variable to integer values. If $\mu = l/n$ with $l$ and $n$ relatively prime integers then all the quantities involved in Eqs. (C.5), (C.6) are periodic with period $n$. In what follows we will restrict to the case $l = 1$; the case $l \neq 1$ can be dealt with by simple modifications. Keeping this in mind we will allow the space variable $j$ to vary in $\{-n, \dots, n - 1\}$. For a (finite) sequence $r = \{r_j\}$ we define the norm

$$\|r\|^2_{\ell^2(j)} := \sum_{j=-n}^{n-1} |r_j|^2. \tag{C.8}$$

For the quantities $\tilde u, \tilde v, R^v, R^u$ evaluated at the integers $j$ we will retain the same notation as for the original quantities. Moreover it is useful to introduce the difference operator $\partial$ defined by

$$(\partial r)_j := r_j - r_{j-1}, \tag{C.9}$$

where $r$ is an arbitrary sequence.

We consider the FPU model (4.9). We rewrite it in the form

$$\dot s = -r - \mu^2\pi_0 r^2, \tag{C.10}$$

$$\dot r = -\Delta_1 s, \tag{C.11}$$
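and we look for two sequences $E \equiv \{E_j\}$ and $F \equiv \{F_j\}$ such that

$$r = \tilde u + \mu E, \qquad s = \frac{\tilde v}{\mu} + \mu F \tag{C.12}$$

fulfill the FPU equation in the form (C.10) and (C.11). Then $E$ and $F$ have to fulfill

$$\dot E = -\Delta_1 F - \frac{\mu R^u}{\mu}, \tag{C.13}$$

$$\dot F = -E - \mu^2\, 2\pi_0\tilde u E - \mu^3\pi_0 E^2 - \frac{R^v}{\mu}. \tag{C.14}$$

Moreover, for $(E, F)$ we impose initial conditions such that $(\tilde u, \tilde v)$ has initial data corresponding to those of the true initial datum, namely we assume

$$\tilde u(\mu j, 0) + \mu E_{0,j} = r_{0,j} = u^a(\mu j, 0), \qquad \frac{\tilde v(\mu j, 0)}{\mu} + \mu F_{0,j} = s_{0,j} = \frac{v^a(\mu j, 0)}{\mu}. \tag{C.15}$$

Lemma C.1. One has

$$\|E_0\|_{\ell^2(j)} \le C\mu^{\frac{1}{2} - \frac{6}{6+r}}, \qquad \|\partial F_0\|_{\ell^2(j)} \le C\mu^{\frac{1}{2} - \frac{6}{6+r}}. \tag{C.16}$$

Proof. From (C.3), (C.4) one has

$$E_0 = \frac{\tilde u - u^a}{\mu} = \frac{\psi^u}{\mu}, \qquad F_0 = \frac{\tilde v - v^a}{\mu^2} \equiv \frac{\psi^v}{\mu^2}$$

and

$$\sup_y \left|\tilde u(y) - u^a(y)\right| \le C\mu^{2 - \frac{6}{6+r}},$$

from which

$$\|E_0\|^2_{\ell^2(j)} \le \sum_{j=-n}^{n-1} \sup_j |E_j|^2 \le 2n\frac{C\mu^{4 - \frac{12}{6+r}}}{\mu^2} = 2C\frac{\mu^{4 - \frac{12}{6+r}}}{\mu^3},$$

from which the estimate of $E_0$ follows. Concerning $F$ we need an estimate of $\partial\psi^v$. Since $\psi^v$ is a function of $y$, one has

$$\left|(\partial\psi^v)(j)\right| = \left|\psi^v(\mu j) - \psi^v(\mu j - \mu)\right| \le \mu\sup_y\left|\partial_y\psi^v(y)\right| \le C\mu^{3 - \frac{6}{6+r}},$$

from which

$$\left\|\frac{\partial\psi^v}{\mu^2}\right\|^2_{\ell^2(j)} \le \sum_{j=-n}^{n-1} \frac{|\partial\psi^v(j)|^2}{\mu^4} \le C\frac{n\mu^{6 - \frac{12}{6+r}}}{\mu^4}. \tag{C.17}$$

We use now an idea of Wayne and Schneider to obtain the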
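Theorem C.2. Fix $r = 78$, and fix $T_f$ and $C_F > 0$; then, provided $\mu$ is small enough, one has that

$$\|E\|^2_{\ell^2(j)} + \|\partial F\|^2_{\ell^2(j)} \le C_F \tag{C.18}$$

for all times $t$ fulfilling

$$|t| \le \frac{T_f}{\mu^3}. \tag{C.19}$$

Proof. Define the function

$$\mathcal{F}(E, F) := \sum_j \left[ \frac{E_j^2 + F_j(-\Delta_1 F)_j}{2} + \frac{2\mu^2\tilde u_j E_j^2}{2} \right] \tag{C.20}$$

and remark that

$$\frac{1}{2}\mathcal{F}(E, F) \le \|E\|^2_{\ell^2(j)} + \|\partial F\|^2_{\ell^2(j)} \le 2\mathcal{F}(E, F).$$

Compute now the time derivative of $\mathcal{F}$; inserting Eqs. (C.13) and (C.14) one gets

$$\dot{\mathcal{F}} = \sum_j \left[ \mu^3 E_j^2(\Delta_1 F)_j + \frac{R^v_j}{\mu}(\Delta_1 F)_j - \frac{\mu R^u_j E_j}{\mu} - \frac{2\mu^3\tilde u_j R^u_j E_j}{\mu} + \frac{2\mu^3 E_j^2}{2}\frac{\partial\tilde u_j}{\partial\tau} \right]. \tag{C.21}$$

In order to estimate the r.h.s. we need some preliminary estimates. The first one is

$$\sup_j \left|(\Delta_1 F)_j\right| = \sup_j \left|(\partial F)_{j+1} - (\partial F)_j\right| \le 2\sup_j \left|(\partial F)_j\right| \le 4\sqrt{\mathcal{F}}.$$

Next we will need an estimate of $\|R^u\|_{\ell^2(j)}$. This is given by

$$\|R^u\|^2_{\ell^2(j)} \le \sum_j |R^u|^2 \le 2n\sup_y \left|R^u(y)\right|^2 \le Cn\mu^{8 - \frac{24}{6+r}}, \tag{C.22}$$

which gives

$$\|R^u\|_{\ell^2(j)} \le C\mu^{4 - \frac{1}{2} - \frac{12}{6+r}}. \tag{C.23}$$

Concerning $R^v$ we need an estimate of $\|\partial R^v\|_{\ell^2(j)}$. This is given by

$$\|\partial R^v\|_{\ell^2(j)} \le C\mu^{4 + \frac{1}{2} - \frac{12}{6+r}}, \tag{C.24}$$

which is obtained by remarking that

$$\left|(\partial R^v)_j\right| = \left|R^v(\mu j) - R^v(\mu j - \mu)\right| \le \mu\sup_y \left|\frac{\partial R^v}{\partial y}(y)\right|$$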
and proceeding as in the proof of (C.23). Now, the first term of (C.21) is estimated by $4\mu^3\mathcal{F}^{3/2}$. Concerning the second term, first remark that it coincides with $\sum_j (\partial F)_j(\partial R^v)_j/\mu$ and therefore it is estimated by $C\mathcal{F}^{1/2}\mu^{4 + \frac{1}{2} - \frac{12}{6+r} - 1}$. The same estimate holds for the third term, the fourth term is estimated by $C\mathcal{F}^{1/2}\mu^{6 + \frac{1}{2} - \frac{12}{6+r} - 1}$, and the last term is also easily estimated remarking that the derivative of $\tilde u$ with respect to $\tau$ is bounded and therefore such a term is bounded by $C\mu^3\mathcal{F}$. As far as $\mathcal{F} < 2C_F$ one thus has

$$\dot{\mathcal{F}} \le C(\mu^3 + \mu^3)\mathcal{F} + C\mu^{5 + 1/4 - 1}. \tag{C.25}$$

Such a differential inequality can be easily solved giving

$$\mathcal{F}(t) \le \mathcal{F}_0\,e^{T_0 C} + e^{T_0 C}\,C T_0\,\mu^{1 + \frac{1}{2} - \frac{12}{6+r} - 1}, \tag{C.26}$$

which, inserting the value of $r$, implies the thesis.

Moreover the result on the Fourier modes is an immediate consequence of Proposition 4.11 and of the fact that the error from a true solution is measured in the norm $\ell^2(j)$, which controls the Fourier coefficients.

Acknowledgement. This work emerged from many discussions within our group. In particular we would like to thank Andrea Carati and Luigi Galgani for their very interesting suggestions and comments. We would like to thank Antonio Giorgilli who showed us many numerical simulations and stimulated our interest in the phenomenon of formation of the packet. We also thank Giancarlo Benettin whose criticism is always very stimulating and leads to a better understanding of the problems at hand.
Communicated by G. Gallavotti
Commun. Math. Phys. 264, 563–564 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1558-z
Erratum
The Hamiltonian Operator Associated with Some Quantum Stochastic Evolutions
M. Gregoratti
Dipartimento di Matematica “F. Brioschi”, Politecnico di Milano, piazza Leonardo da Vinci 32, 20133 Milano, Italy. E-mail: [email protected]
Received: 29 July 2004 / Accepted: 23 January 2006
Published online: 22 March 2006 – © Springer-Verlag 2006
Commun. Math. Phys. 222, 181–200 (2001)
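It was kindly pointed out to us by W. von Waldenfels that Section 3.2 of [1] contains an error when the trace operator is introduced for functions in the Sobolev space $H(\mathbb{R}^n_*; \mathcal{H})$: we claimed that there exists a bounded operator $\cdot|_{\{r_\ell = s\}} : H(\mathbb{R}^n_*; \mathcal{H}) \to L^2(\mathbb{R}^{n-1}; \mathcal{H})$ which naturally defines the trace of each $v$ in $H(\mathbb{R}^n_*; \mathcal{H})$ as a function $v|_{\{r_\ell = s\}}$ in $L^2(\mathbb{R}^{n-1}; \mathcal{H})$, but actually such a trace $v|_{\{r_\ell = s\}}$ is naturally defined only as a function in $L^2_{\mathrm{loc}}(\mathbb{R}^{n-1}_*; \mathcal{H})$, and a trace operator from $H(\mathbb{R}^n_*; \mathcal{H})$ to $L^2(\mathbb{R}^{n-1}; \mathcal{H})$ can only be closed, with a domain to be specified. Nevertheless the main result of [1], Theorem 3, is correct and provable through an adjustment of the argument. We refer to [2] for a detailed introduction of the traces $\cdot|_{\{r_\ell = s\}}$ and we list below the points which require an adjustment, that is, the points involving $\cdot|_{\{r_\ell = s\}}$ which are to be handled taking into account domain constraints.

1. The integration by parts formula (22) needs to be generalized [2] because $\langle u|_{\partial Q_m} \,|\, v|_{\partial Q_m} \rangle_{\mathcal{H}}$ is not necessarily in $L^1(\partial Q_m)$ for every $u$ and $v$ in $H(\mathbb{R}^n_*; \mathcal{H})$. Therefore, for $\varepsilon > 0$, we introduce on $\mathbb{R}^n$ the totally symmetric indicator function
$$I_\varepsilon(r) = \prod_{\ell < \ell'} \Big[ 1 - I_{(-\infty,0)}(r_\ell r_{\ell'})\, I_{[0,\varepsilon]}(|r_\ell| + |r_{\ell'}|) \Big],$$
which vanishes when $r$ has two small coordinates of opposite sign. Then $I_\varepsilon(r) \uparrow 1$ as $\varepsilon \downarrow 0$ and for every $u$ and $v$ in $H(\mathbb{R}^n_*; \mathcal{H})$ the following generalized integration by parts formula holds:
$$\int_{Q_m} \Big\langle u \,\Big|\, \sum_{\ell=1}^n \partial_\ell v \Big\rangle_{\mathcal{H}} = - \int_{Q_m} \Big\langle \sum_{\ell=1}^n \partial_\ell u \,\Big|\, v \Big\rangle_{\mathcal{H}} + \lim_{\varepsilon \downarrow 0} \int_{\partial Q_m} \sum_{\ell=1}^n \eta_m \cdot e_\ell \, \big\langle (I_\varepsilon u)|_{\partial Q_m} \,\big|\, (I_\varepsilon v)|_{\partial Q_m} \big\rangle_{\mathcal{H}}, \qquad \text{(22b)}$$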
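which reduces to (22), by dominated convergence, every time $\langle u|_{\partial Q_m} \,|\, v|_{\partial Q_m} \rangle_{\mathcal{H}}$ is in $L^1(\partial Q_m)$. This happens if $u$ and $v$ have traces $u|_{\partial Q_m}$ and $v|_{\partial Q_m}$ in $L^2(\partial Q_m; \mathcal{H})$, or also if, independently of $v$, $u|_{\partial Q_m} = (I_\varepsilon u)|_{\partial Q_m}$ for some $\varepsilon$.

Analogously, for every $u$ and $v$ in $H_{\mathrm{symm}}\big((\mathbb{R}_* \times J)^n; \mathcal{H}\big)$, the correct version of (23) is the following generalized integration by parts formula [2]:
$$\Big\langle u \,\Big|\, \sum_{\ell=1}^n \partial_\ell v \Big\rangle_{L^2((\mathbb{R}\times J)^n; \mathcal{H})} = - \Big\langle \sum_{\ell=1}^n \partial_\ell u \,\Big|\, v \Big\rangle_{L^2((\mathbb{R}\times J)^n; \mathcal{H})} + n \lim_{\varepsilon \downarrow 0} \Big( \big\langle (I_\varepsilon u)|_{\{r_n = 0^-\}} \,\big|\, (I_\varepsilon v)|_{\{r_n = 0^-\}} \big\rangle_{\mathcal{Z} \otimes L^2((\mathbb{R}\times J)^{n-1}; \mathcal{H})} - \big\langle (I_\varepsilon u)|_{\{r_n = 0^+\}} \,\big|\, (I_\varepsilon v)|_{\{r_n = 0^+\}} \big\rangle_{\mathcal{Z} \otimes L^2((\mathbb{R}\times J)^{n-1}; \mathcal{H})} \Big). \qquad \text{(23b)}$$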
2. The unbounded operators $a(s)$ and their domains $V_s$ are to be defined just by Eqs. (32) and (25) of [1], which therefore imply that a vector in $V_s$ needs to have every single $n$-particle component with square integrable trace, that is, the norm of its trace on $\{r_n = s\}$ in $\mathcal{Z} \otimes L^2((\mathbb{R}\times J)^{n-1}; \mathcal{H})$ must be finite for all $n$.

3. Proposition 3 can still be proved as in [1], but domain constraints for $a(0^-)$ and $a(0^+)$ are to be dealt with more carefully. Clearly Eq. (36) can always be extended by linearity, and it can also be extended by continuity (bounded convergence) to a vector in $V_{0^\pm}$ every time there is a sequence of vectors in $V_{0^\pm}$ satisfying (36) which converges to it in $\mathcal{K}$ and whose images under $E$ and under $a(s)$ converge to the corresponding images of the limit vector, in $\mathcal{K}$ and in $\mathcal{Z} \otimes \mathcal{K}$ respectively, for $s = 0^-, 0^+$. So the validity of (36) can be extended from $E\,H^1(\mathbb{R}_*; \mathcal{Z})$ to $n$-particle vectors in $\operatorname{span}\{ v^{\otimes n} \otimes h : v \in H^1(\mathbb{R}_*; \mathcal{Z}),\ h \in \mathcal{H} \}$ and then to $n$-particle vectors in $H^1(\mathbb{R}_*; \mathcal{Z})^{\otimes n} \otimes \mathcal{H}$; thanks to Theorem 4 in [2], since the latter space includes $D(\mathbb{R}^n_*; \mathcal{Z}^{\otimes n} \otimes \mathcal{H}) \cap L^2_{\mathrm{symm}}\big((\mathbb{R}\times J)^{n-1}; \mathcal{H}\big)$, Eq. (36) can be extended also to all $n$-particle vectors belonging to $V_{0^\pm}$ and finally to all vectors in $V_{0^\pm}$.

4. Proposition 6 can still be proved as in [1], even if only the generalized integration by parts formula (23b) is available. The integration by parts formula is applied to prove that $U_t$ belongs to $V_{0^-}$, and with (23b) there is a limit w.r.t. $\varepsilon \downarrow 0$ which has to be commuted with the integrations in the scalar products. Such operations can be commuted if the vector $\Upsilon$ in $V_0$ is assumed to have components $\Upsilon_n$ vanishing in a neighborhood of all the coordinate hyperedges $\{r_j = r_\ell = 0\}$, $j \neq \ell$. Then, thanks to Lemma 8 in [2], this class of vectors is large enough to get the thesis.

Acknowledgement. We thank W. von Waldenfels who pointed out the error and G. Gilardi who gave us helpful hints to solve it.
References
1. Gregoratti, M.: The Hamiltonian Operator Associated with Some Quantum Stochastic Evolutions. Commun. Math. Phys. 222, 181–200 (2001)
2. Gregoratti, M.: Traces of Sobolev functions with one square integrable directional derivative. Math. Meth. Appl. Sci. 29, No. 2, 157–171 (2006)

Communicated by M. Aizenman
Commun. Math. Phys. 264, 565–581 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1540-9
On Hausdorff Dimension of Unimodal Attractors
J. Graczyk (1), O.S. Kozlovski (2)
(1) Université de Paris-Sud, Mathématiques, 91405 Orsay, France
(2) Mathematical Institute, Warwick University, England
Received: 23 January 2005 / Accepted: 22 November 2005 Published online: 31 March 2006 – © Springer-Verlag 2006
Abstract: There exists a universal constant $\sigma < 1$ such that every attractor of every $C^4$ unimodal map with a non-degenerate critical point is an analytic manifold or its Hausdorff dimension is equal to or less than $\sigma$.
1. Introduction

The concept of an attractor is central to understanding the long-time behavior of dynamical systems. Intuitively, it is a piece of the phase space where a large number of trajectories accumulate. More precisely, a forward invariant compact set $A$ is called a (minimal) metric attractor for some dynamics if its basin of attraction $B(A) = \{x : \lim_{n\to\infty} \operatorname{dist}(f^n(x), A) = 0\}$ has positive Lebesgue measure and $A$ has no proper subset with the same property, [15]. Analogously, a topological attractor is a minimal closed set $A$ whose basin $B(A)$ is of second Baire category.

Unimodal maps of the interval are among the simplest examples of non-linear dynamical systems. It is well known that for sufficiently smooth unimodal maps an attractor can be either an attracting periodic point, or a cycle of intervals, or an invariant Cantor set, [14]. The main result of the paper asserts that there exists $\sigma < 1$ such that the Cantor-like attractor of every $C^4$ unimodal map with a non-degenerate critical point has Hausdorff dimension bounded above by $\sigma$. This answers negatively the long-standing question in 1-dimensional dynamics of whether there exist Cantor-like attractors of dimension 1 in the quadratic family $f_a(x) = ax(1 - x)$.

The Feigenbaum quadratic polynomial is a prototype map in the theory of renormalization, [20]. It is both the simplest and the most studied map in the class of infinitely renormalizable unimodal maps with bounded combinatorics. By the general theory, [20, 14], every sufficiently smooth map from this class has a Cantor-like attractor of Hausdorff dimension strictly less than 1. Even in the bounded case, the general theory does not supply uniform estimates.
Our main result bears a close relationship to a spectral gap problem in conformal geometry, [19, 18]. In [2] a universal lower bound for the eigenvalues of the Laplacian is obtained for a classical Schottky group operation on the Riemann sphere. By [18] and [19], this yields a universal upper estimate of the Hausdorff dimension of the limit set for the group.

A $C^1$ map $f$ of a compact interval $I$ is unimodal if it has exactly one point $\zeta$ where $Df(\zeta) = 0$ (the critical point), $\zeta \in \operatorname{int} I$, $f$ has a local extremum at $\zeta$, and $f$ maps the boundary of $I$ into itself. The critical point $\zeta$ of a $C^2$ map $f$ is non-degenerate if $D^2 f(\zeta) \neq 0$. Here $Df$ denotes the derivative of $f$.

Theorem 1. There exists a universal constant $\sigma < 1$ such that every attractor of every $C^4$ unimodal map with non-degenerate critical point either has Hausdorff dimension less than $\sigma$ or is a finite union of closed and non-degenerate intervals.

The scope of validity of Theorem 1 should extend beyond the unimodal setting. It is an interesting problem to identify the systems which maximize the Hausdorff dimension of the Cantor-like metric/topological attractors in a given class of unimodal (multimodal) maps. The asymptotic estimates of Lemma 6.6 as well as numerical experiments, [16], seem to indicate that the attractor of the Feigenbaum map has the maximal Hausdorff dimension in the class of smooth unimodal maps with a non-degenerate critical point. The Hausdorff dimension of the Feigenbaum attractor is universally equal to 0.538..., see [9], for maps which satisfy the Feigenbaum scaling laws [3]. This value is quite close to, but bigger than, 0.5, which is a standard ‘saddle-node’ lower bound on $\sigma$, compare [7].

Idea of the proof. We use analytic estimates from [8]. A reduction of smooth systems to analytic ones is possible due to the recent developments of [4, 11]. The renormalization methods are used to set an initial framework of the problem.
The union of the images of a restrictive interval, see Definition 1, forms a finite cover of the Cantor-like attractor. The proof of Theorem 1 is based on explicit estimates of the sum of lengths of disjoint intervals from the forward orbit of the restrictive interval in some positive power $\alpha < 1$.

In Sect. 3 we obtain two fairly general estimates. The first yields an upper bound on the sum of lengths of intervals in the power $\alpha \in (\frac{1}{2}, 1)$ as they are mapped through a funnel opened by the parabolic bifurcation, Proposition 1. The second provides similar estimates in the hyperbolic case (for any $\alpha > 0$) when an escaping route is provided by an almost renormalizable setting, Proposition 2.

In the next step, we introduce a canonical induction procedure and propose inductive estimates of the sums of lengths of dynamically defined intervals in some power $\alpha \in (\frac{1}{2}, 1)$. These inductive estimates rely on the expansion property of the first return maps outside the central intervals together with the parabolic and the hyperbolic estimates of Sect. 3. All these estimates still hold under the relaxed hypothesis that the critical point is non-flat. However, to derive absolute estimates from the inductive ones we need a super-exponential decay of geometry to win over an exponential accumulation of constants. The decay of geometry holds only for maps with a non-degenerate critical point. In the last section, we produce the constant $\sigma$ by calibrating $\alpha$ in the preceding estimates.

2. Preliminaries

Classification of metric dynamics. The classification of metric attractors in unimodal dynamics with non-degenerate critical point and negative Schwarzian derivative for maps
having no saddle-node cascades was obtained in [12]. The general case of unimodal maps with non-degenerate critical point was done in [4, 5]. For $C^4$ unimodal maps with non-degenerate critical point the metric and topological attractors coincide. More precisely, in this case any metric (topological) attractor is either
1. a topologically attracting periodic orbit, or
2. a transitive cycle of intervals, or
3. a Cantor set of solenoid type,
and there is at most one metric attractor of type other than 1. This should be contrasted with the result of [1] that there exists a unimodal polynomial with negative Schwarzian derivative and the critical point of order higher than 2 for which the metric and topological attractors do not coincide.

The classification of attractors reduces a proof of Theorem 1 to estimates of Hausdorff dimension of attractors which are Cantor sets of solenoidal type. By general theory, [14], the corresponding unimodal maps have infinitely many restrictive intervals of different periods.

Definition 1. A compact interval $J$ is restrictive of period $n$ for an ambient dynamics $f$ if $J$ contains the critical point of $f$ in its interior and, for some $n > 0$, $f^n(J) \subseteq J$ and $f^n|_J$ is unimodal. In particular, $f^n$ maps the boundary of $J$ into itself.

Asymptotic dynamics and Hausdorff dimension. Let $f$ be a unimodal map with an attractor of solenoidal type. Assume that $J$ is a restrictive interval of $f$ with period $n(J)$. The solenoidal attractor is equal to $\bigcap_J \bigcup_{i=1}^{n(J)} f^i(J)$, where the intersection is taken over all restrictive intervals of $f$. The Hausdorff dimension is an invariant in the smooth category. This means that the Hausdorff dimensions of the attractors of $f$ and $f^{n(J)}|_J$ coincide. In other words, we can pick an arbitrarily small restrictive interval $J$ and then replace $f$ by any smooth rescaling of $f^{n(J)}|_J$ without altering the Hausdorff dimension of the attractor. Consequently, it is enough to get universal estimates of
$$\sum_{i=1}^{n(J)} |f^i(J)|^\sigma$$
for all restrictive intervals $J$ of high enough periods.

Negative Schwarzian. We say that a $C^3$ function $g$ has negative Schwarzian derivative if
$$S(g)(x) := \frac{D^3 g(x)}{Dg(x)} - \frac{3}{2} \left( \frac{D^2 g(x)}{Dg(x)} \right)^2 < 0$$
whenever $Dg(x) \neq 0$. Such maps will be called S-unimodal. The Schwarzian derivative $S(g)$ satisfies the composition law $S(g \circ h)(x) = S(g)(h(x))\, Dh(x)^2 + S(h)(x)$. Thus iterates of a map with negative Schwarzian derivative also have negative Schwarzian derivative.

Cross-ratio. If $J \subset T$ are two intervals, then
$$b(T, J) = \frac{|J|\,|T|}{|X|\,|Y|},$$
where $X$, $Y$ are the two components of $T \setminus J$. We have a well-known property of expanding the cross-ratio $b(T, J)$.
Lemma 2.1. For any $J \subset T$ and any diffeomorphism $g$ with negative Schwarzian derivative, $b(g(T), g(J)) > b(T, J)$.

Lemma 2.2. For every $C > 0$ there is $\lambda < 1$ such that if $I \subset J \subset T$ are three intervals and $b(T, J) < C$, then $b(T, I) < \lambda\, b(J, I)$.

Proof. From $b(T, J) < C$, we infer that there exists $C'$, which depends only on $C$, such that each connected component of $T \setminus J$ is bigger than $C'|J|$. Hence, $|J| \le |T|/(2C' + 1)$. Now a direct calculation shows that the claim is satisfied for $\lambda := (1 + 2C')/(1 + C')^2 < 1$.
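The two lemmas above can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the helper names (`cross_ratio`, `image`, `lam`) are hypothetical, the map $x \mapsto x^2$ on positive intervals is chosen because its Schwarzian derivative $-3/(2x^2)$ is negative there, and the constant `C_prime` is read off directly from the chosen intervals rather than derived from a bound on $b(T, J)$.

```python
def cross_ratio(T, J):
    """b(T, J) = |J||T| / (|X||Y|), with X, Y the two components of T minus J."""
    (t0, t1), (j0, j1) = T, J
    assert t0 < j0 < j1 < t1, "J must lie strictly inside T"
    return (j1 - j0) * (t1 - t0) / ((j0 - t0) * (t1 - j1))


def image(g, interval):
    """Image of an interval under an increasing map g."""
    a, b = interval
    return (g(a), g(b))


def square(x):
    # Diffeomorphism of (0, inf) with negative Schwarzian derivative.
    return x * x


def lam(c_prime):
    """The constant lambda = (1 + 2C') / (1 + C')^2 from the proof of Lemma 2.2."""
    return (1 + 2 * c_prime) / (1 + c_prime) ** 2


# Lemma 2.1: the cross-ratio grows under a negative-Schwarzian diffeomorphism.
T, J = (1.0, 4.0), (2.0, 3.0)
b_before = cross_ratio(T, J)
b_after = cross_ratio(image(square, T), image(square, J))

# Lemma 2.2: here both components of T2 minus J2 have length C'|J2| with C' = 1.
T2, J2, I2 = (0.0, 3.0), (1.0, 2.0), (1.4, 1.6)
C_prime = 1.0
ratio = cross_ratio(T2, I2) / cross_ratio(J2, I2)

print(b_before, b_after)
print(ratio, lam(C_prime))
```

In this example the ratio $b(T, I)/b(J, I)$ sits well below $\lambda$, which is consistent with the proof: $\lambda$ is a uniform bound, not a sharp one.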
Nestedness and symmetry. An open set $U$ is $\tau$-nested inside $T$ if
$$\frac{|U|}{\operatorname{dist}(U, \partial T)} \le \tau.$$
We say that a map $g : U \to \mathbb{R}$ with one critical point $\xi$ is $\kappa$-symmetric if $|Dg(x_1)|/|Dg(x_2)| < \kappa$ for every $x_1, x_2 \in Y$ such that $g(x_1) = g(x_2)$. Note that if $g$ is $\kappa$-symmetric, then $|x_1 - \xi|/|x_2 - \xi| < \kappa$.
3. Estimates near Bifurcation

Saddle-node estimates.

Proposition 1. Let $Y$ be a compact interval and $g : Y \to \mathbb{R}$ a $C^3$ diffeomorphism without fixed points such that $|g(x) - x|$ has a local minimum inside $Y$. Let $\tau, \delta$ be positive so that the length of $g(Y) \setminus Y$ is greater than $\tau |Y|$ and $Sg(x)|Y|^2 < -\delta$ for every $x \in Y$. Let $\{x_0, \dots, x_k\} \subset Y$ be a sequence such that $g(x_{i-1}) = x_i$ for $i = 1, \dots, k$. Then for every $\alpha > 1/2$ there exists a constant $C > 0$ which depends only on $\tau$ and $\delta$ such that
$$\sum_{i=1}^{k} |x_i - x_{i-1}|^\alpha < C |x_0 - x_k|^\alpha.$$

The geometric set up of Proposition 1 when $g(x) > x$ is shown in Fig. 1. Let $Ng(x) = \frac{D^2 g(x)}{Dg(x)}$ stand for the nonlinearity of $g$. The proof of Proposition 1 is based on the following lemma.

Lemma 3.1. Suppose that $g$ is a diffeomorphism defined in a neighborhood of 0 such that
(i) $Sg < -\beta$,
(ii) $g(0) = w$,
(iii) there is no $x \in (-L, L)$ where $g$ is defined and $g(x) \le x$.
Then for every $L$ and $\beta$ positive there exist $K_1 > 0$ and $K_2 > 0$ so that $w < K_1$ yields $Ng(0) > K_2$ for every $g$ satisfying (i)-(iii).
Fig. 1. Orbit of x near saddle-node bifurcation
Proof. For the convenience of the reader we sketch the proof from p. 248 of [7]. Let $G(w, L)$ be the class of $C^3$ diffeomorphisms defined on $(-L, L)$ and satisfying (i)-(iii). Recall a well-known differential equation
$$DNg = Sg + \frac{1}{2}(Ng)^2. \qquad (1)$$
Every function in $G(w, L)$ is uniquely determined by a continuous function $\psi = Sg$ and two numbers $\nu$ and $\mu$ equal to $Ng(0)$ and $Dg(0)$ respectively. Observe that with $\mu$ fixed, Eq. (1) implies that every $g \in G(w, L)$ is a monotone function of $\psi$ and $\nu$.

Now, we will prove the following claim: for every $\mu \in \mathbb{R}$ there exists $w > 0$ such that the solution $g$, $g(0) = w$, of Eq. (1) with $\psi = -\beta$ and $\nu = 0$ cannot satisfy $g(x) \ge x$ for every $x \in (-L, L)$. Suppose, to the contrary, that for every $w > 0$ there exists $\mu$ such that $g(x) \ge x$. From another well-known differential formula, $D^2 u = -\beta u$, where $Dg = 1/u^2$. Hence,
$$Dg(x) = \frac{\mu}{\cosh \sqrt{\beta}\, x}$$
and the set of such $\mu$ must be bounded. Every $g$, as a map with negative Schwarzian, has exactly one inflection point, which is at 0. Taking a limit, we obtain a map $g$ with the property that $g(x) \ge x$ for every $x \in (-L, L)$ and $g(0) = 0$, a contradiction.

By the claim and the monotonicity of $g \in G(w, L)$ with respect to $\psi$ and $\nu$, there exist $K_1$ and $K_2$ positive such that $Sg < -\beta \Rightarrow Ng(0) > K_2$, provided $w < K_1$.
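For readers who want to see where Eq. (1) comes from, the following short derivation may help. It uses only the definitions of $Ng$ and $Sg$ given in the text; the check on the quadratic family $f_a(x) = ax(1-x)$ is an added illustration, not part of the original argument.

```latex
% Nonlinearity and its derivative (quotient rule):
Ng = \frac{D^2 g}{Dg},
\qquad
D(Ng) = \frac{D^3 g}{Dg} - \left(\frac{D^2 g}{Dg}\right)^{2}.

% Schwarzian derivative, as defined in Sect. 2:
Sg = \frac{D^3 g}{Dg} - \frac{3}{2}\left(\frac{D^2 g}{Dg}\right)^{2}.

% Subtracting gives Eq. (1):
D(Ng) - Sg = \frac{1}{2}\left(\frac{D^2 g}{Dg}\right)^{2} = \frac{1}{2}(Ng)^{2}.

% Check on f_a(x) = a x (1 - x), x \neq 1/2:
% Df_a = a(1 - 2x), \quad D^2 f_a = -2a, \quad D^3 f_a = 0, so
Nf_a = \frac{-2}{1 - 2x},
\qquad
D(Nf_a) = \frac{-4}{(1 - 2x)^{2}},
\qquad
Sf_a = \frac{-6}{(1 - 2x)^{2}},
\qquad
Sf_a + \tfrac{1}{2}(Nf_a)^{2} = \frac{-6 + 2}{(1 - 2x)^{2}} = D(Nf_a).
```

In particular the quadratic family has negative Schwarzian derivative away from its critical point, independently of the parameter $a$.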
Let $g$ be as in Proposition 1. Since $Sg < 0$, there exists a unique point $\chi \in Y$ so that $|g(x) - x|$ achieves its local minimum at $\chi$.

Lemma 3.2. Let $g$ be as in Proposition 1. Then there exist positive constants $k_0$, $Q$ and $\kappa$ (which depend only on $\delta$ and $\tau$ from Proposition 1) so that for every $k > k_0$ the following holds: if there is a point contained in $Y$ together with its $k$ iterates by $g$, then there exists a neighborhood $U$ of $\chi$ so that
$$|D^2 g(x)| > \frac{Q}{|Y|} \quad \text{if } x \in U, \qquad |g(x) - x| > \kappa |Y| \quad \text{if } x \in Y \setminus U.$$

Proof. Let $x \in Y$. Without loss of generality $g(x) > x$. We normalize $Y$ by the affine transformation $y \mapsto (y - x)/|Y|$. The function $G$ obtained by conjugating $g$ with this transformation satisfies (i)-(iii) of Lemma 3.1 with $G(0) = w = (g(x) - x)/|Y|$. Therefore, there are constants $P, K > 0$ so that if $G(0) \le P$ then
$$K < NG(0) = |Y|\, Ng(x). \qquad (2)$$
We will choose $k_0$ and $\kappa$ as functions of $P$ and $\tau$. The existence of an orbit by $g$ of length $k > k_0 > 0$ contained in $Y$ implies that
$$\inf_{x \in Y} (g(x) - x) < |Y|/k_0.$$
By the negative Schwarzian property, we have that either $g(\chi) - \chi < |Y|/k_0$ or $|g(Y) \setminus Y| < |Y|/k_0$. If $k_0 > \tau^{-1}$ then the latter is not possible. Choosing $k_0 > P^{-1}$, we obtain that $g(\chi) - \chi < P|Y|$. Let $U$ be the connected component of $\{x \in Y : g(x) - x < P|Y|\}$ containing $\chi$. For $x \in U$ we rewrite (2) as $D \log(Dg(x)) > K/|Y|$. Solving this differential inequality, we obtain that $Dg(x) > e^{-K}$ and consequently $D^2 g(x) > Q/|Y|$ for every $x \in U$. The negative Schwarzian of $g$ and the hypothesis that $|g(Y) \setminus Y| > \tau |Y|$ yield $g(x) - x > \min(\tau, P)|Y|$ for every $x \in Y \setminus U$.
Proof of Proposition 1. Let $k_0$ come from Lemma 3.2. If $k \le k_0$ then by the power means inequality, for every $\alpha$ there exists $C$ so that
$$\sum_{i=1}^{k} |x_i - x_{i-1}|^\alpha < C |x_0 - x_k|^\alpha,$$
where $C$ depends only on $\alpha$, $\tau$, and $\delta$. If $k > k_0$ then we change the coordinates by an affine map so that $\chi$ goes to 0 and the length of $Y$ is 1. In this coordinate system $g$ becomes $\phi$ and $Y$ becomes $V$. Without loss of generality $\phi(x) > x$. Observe that $\phi$ satisfies a $(Q, \kappa)$-lower approximation rule, i.e. $\phi$ is not smaller than the map $q(x) = x + Qx^2 + \phi(0)$ on the interval $U'$, the image of $U$ under the affine change of coordinates, where $U \ni \chi$ and $Q > 0$ come from Lemma 3.2.
Fig. 2. Geometric set up of Proposition 2 when g(ξ) > ξ
We see that only what is inside $U$ counts. Indeed, the number of iterates $n$ so that $0 < n < k$ and $\phi^n(x) \notin U$ is uniformly bounded. Moreover, for all such values of $n$, $\phi^{n+1}(x) - \phi^n(x)$ is uniformly bounded away from 0. The second part of the claim follows readily from Lemma 3.2 and the first part is an obvious consequence of the second. Therefore, the sum $\sum_{i=1}^{k} |\phi^i(x) - \phi^{i-1}(x)|^\alpha$ is majorated by a uniform constant multiple of
$$|x - \phi^k(x)|^\alpha + \sum_{q^i(y) \in I} |q^i(y) - q^{i-1}(y)|^\alpha,$$
where $I = [y, z]$ and $y$, $z$ are, respectively, the first and the last point of the orbit $x, \dots, \phi^k(x)$ in $U$. By the well-known estimates for quadratic maps, see [7],
$$\sum_{q^i(y) \in I} |q^i(y) - q^{i-1}(y)|^\alpha \le C |I|^\alpha$$
with $C$ depending only on $Q$ and $\alpha > 1/2$. This completes the proof of Proposition 1.
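The role of the threshold $\alpha > 1/2$ can be seen in a small numerical experiment on the model map $q(x) = x + Qx^2 + \varepsilon$, where $\varepsilon$ plays the role of $\phi(0)$. This is only a sketch under assumptions: the function name `funnel_sum`, the choice $Q = 1$ and the window $[-0.5, 0.5]$ are illustrative and do not come from the paper. Heuristically the sum behaves like $\int (x^2 + \varepsilon)^{\alpha - 1}\,dx$, which stays bounded as $\varepsilon \to 0$ exactly when $\alpha > 1/2$.

```python
def funnel_sum(alpha, eps, Q=1.0, start=-0.5, stop=0.5):
    """Sum of |q^i(x) - q^(i-1)(x)|**alpha along the orbit of `start`
    under q(x) = x + Q*x**2 + eps, until the orbit leaves [start, stop)."""
    x, total = start, 0.0
    while x < stop:
        nxt = x + Q * x * x + eps  # one step through the funnel
        total += (nxt - x) ** alpha
        x = nxt
    return total


if __name__ == "__main__":
    for alpha in (0.75, 0.25):
        for eps in (1e-2, 1e-4, 1e-6):
            print(alpha, eps, funnel_sum(alpha, eps))
```

For $\alpha = 0.75$ the printed sums should stay of order one as the funnel narrows, while for $\alpha = 0.25$ they should grow roughly like $\varepsilon^{-1/4}$, in line with the restriction $\alpha > 1/2$ in Proposition 1.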
Estimates near almost restrictive interval.

Proposition 2. Let $g : Y \to T$ with $Y \subset T$ be a $C^3$ map with exactly one critical point $\xi$, negative Schwarzian derivative, and two repelling fixed points. Suppose $g$ is $\kappa$-symmetric for some $\kappa > 1$ and suppose there is $E > 0$ such that $g^E(\xi) \notin Y$. Let $Y$ be $\tau$-nested inside $T$, $\tau > 0$, $g(\partial Y) \subset \partial T$, let $r$ be a fixed point of $g$ with a positive multiplier, and let $Y'$ be the component of $Y \setminus \{r\}$ not containing the critical point. Let $\{x_0, \dots, x_k\} \subset Y'$ be a sequence such that $g(x_{i-1}) = x_i$ for $i = 1, \dots, k$. Then for every $\alpha > 0$ there exists a constant $C > 0$ which depends only on $\tau$ and $\kappa$ such that
$$\sum_{i=1}^{k} |x_i - x_{i-1}|^\alpha < C |x_0 - x_k|^\alpha.$$
Proof of Proposition 2. Without loss of generality assume that $g$ is first increasing and then decreasing. Let $r' > \xi$ be a preimage of $r$ by $g$. Since the iterates of the critical point leave the interval $Y$ we have $g(\xi) > r'$. Let $y$ be the endpoint of $Y$ lying on the same side of $\xi$ as $r$. Draw two lines through the points $(r, r)$ and $(y, g(y))$, and through $(r, r)$ and $(\xi, g(\xi))$. Let $y = l(x)$ be the equation of the line with smaller slope. This slope $D(l)$ is larger than some constant ($> 1$) which depends only on $\tau$ and $\kappa$.

For $x \in Y'$ we have $g(x) \le l(x)$. Indeed, we know that $g(y) \le l(y)$, $g(r) = l(r)$ and $g(\xi) \ge l(\xi)$. If there is a point $a \in Y'$ such that $g(a) > l(a)$, then there are two points $b_1$ and $b_2$ in $Y' \cup \{r\}$ such that $b_1 < a < b_2$, $g(b_{1,2}) = l(b_{1,2})$. By the minimum principle, $Dg(y) \ge D(l) > 1$ for every $y \in (b_1, b_2)$, which yields a contradiction. Recall that the minimum principle, see Lemma 6.1 in [14], says that every diffeomorphism with negative Schwarzian derivative from a closed interval $U$ into $\mathbb{R}$ has the derivative inside $U$ strictly larger than the minimum of the derivatives at the boundary points of $U$.

Let $I = [x_k, x]$. Since the graph of $g$ for $x \in Y'$ lies below the graph of $l$, there are constants $C_1, C > 0$ which depend only on $D(l)$ and $\alpha > 0$ so that
$$\sum_{i=1}^{k} |x_i - x_{i-1}|^\alpha < C_1 \sum_{l^i(x) \in I} |l^i(x) - l^{i-1}(x)|^\alpha < C |I|^\alpha.$$
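The last inequality, for the linear comparison map, reduces to a geometric series, which is why no restriction on $\alpha > 0$ is needed in the hyperbolic case. A short worked version follows; the symbols $\Lambda$ and $d_i$ are notation introduced here for illustration, and the computation assumes only that $l$ is affine with fixed point $r$ and slope $\Lambda = D(l) > 1$.

```latex
% Affine repeller with fixed point r and slope \Lambda > 1:
l(x) = r + \Lambda (x - r),
\qquad
d_i := |l^{i}(x) - l^{i-1}(x)| = \Lambda^{\,i-1} d_1 .

% Steps grow geometrically, so the sum is dominated by its last term:
\sum_{i=1}^{k} d_i^{\alpha}
  = d_k^{\alpha} \sum_{j=0}^{k-1} \Lambda^{-\alpha j}
  < \frac{d_k^{\alpha}}{1 - \Lambda^{-\alpha}} .

% If the orbit segment l^{0}(x), \dots, l^{k}(x) lies in I, then d_k \le |I| and
\sum_{i=1}^{k} |l^{i}(x) - l^{i-1}(x)|^{\alpha}
  < \frac{|I|^{\alpha}}{1 - \Lambda^{-\alpha}}
\qquad \text{for every } \alpha > 0 .
```

The constant $1/(1 - \Lambda^{-\alpha})$ depends only on the slope and on $\alpha$, which matches the dependence of $C$ on $D(l)$ and $\alpha$ stated in the proof.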
4. Sequences of Box Mappings

We recall that an open interval $T$ is regularly returning for some dynamics $f$ defined in an ambient space containing $T$ if $f^n(\partial T) \cap T = \emptyset$ for every $n > 0$. The first entry map $\phi$ of $f$ into a set $T$ is defined on $D_T := \{x : \exists\, n > 0,\ f^n(x) \in T\}$ by the formula $\phi(x) := f^{n(x)}(x)$, where $n(x) := \min\{n > 0 : f^n(x) \in T\}$. If $T$ is a regularly returning open interval then the function $n(x)$ is locally constant on $D_T$. Box mappings form an important class of maps which naturally arise as first entry maps of unimodal maps to regularly returning intervals.

Definition 2. Let $U$ be an open set in $\mathbb{R}$ and $T$ an open interval. We say that $R : U \to T$ is a box mapping if the following holds:
• $\partial T \cap U = \emptyset$;
• $R$ is differentiable and has exactly one non-degenerate critical point $\xi$; the connected component of $U$ containing $\xi$ is called the central domain;
• if $I$ is a connected component of $U$ and $\xi \notin I$, then the branch $R|_I$ is a diffeomorphism between the intervals $I$ and $T$;
• if $I$ is the central domain then $R : I \to T$ is proper.
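The first entry map defined at the start of this section can be sketched in a few lines. This is a minimal illustration under assumptions: the function name `first_entry`, the iteration cap `max_iter`, and the sample interval $(0.4, 0.6)$ for the map $x \mapsto 4x(1-x)$ are all hypothetical choices, and the sample interval is not claimed to be regularly returning.

```python
def first_entry(f, x, T, max_iter=10_000):
    """Return (n(x), phi(x)) for the first entry map of f into the open
    interval T = (lo, hi): n(x) is the least n > 0 with f^n(x) in T and
    phi(x) = f^n(x). Returns None if no entry occurs within max_iter steps."""
    lo, hi = T
    for n in range(1, max_iter + 1):
        x = f(x)
        if lo < x < hi:
            return n, x
    return None


def logistic(x):
    # Quadratic family f_a(x) = a*x*(1 - x) with a = 4 (illustrative choice).
    return 4.0 * x * (1.0 - x)


if __name__ == "__main__":
    print(first_entry(logistic, 0.1, (0.4, 0.6)))
    # Toy non-unimodal example where the arithmetic is exact:
    print(first_entry(lambda x: x / 2, 1.0, (0.1, 0.3)))
```

Note that the entry time counts from $n = 1$, so a point already inside $T$ is still iterated at least once, in agreement with the condition $n > 0$ in the definition.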
An induced sequence of central intervals. Let $R : U \to T$ be a box mapping and let $J$ be its central domain. It might happen that $R(\xi)$ is in $J$. Let $E$ be the minimal positive integer such that $R^E(\xi) \notin J$. This number does not necessarily exist: all iterates of the critical point can stay in the central domain. If this number is finite, it will be called an escaping time. The end points of $J$ are mapped by $R$ onto an end point $a$ of $T$. Let $a_i$ be a preimage of $a$ by the $i$-th iterate of the central branch of $R$, $i = 0, \dots, E - 1$, such that $R^{i+1}(a_i) = a$. In this notation $a_0$ is a boundary point of $J$. Each point $a_i$ has a symmetrical point which we denote by $a_i'$ (here “symmetrical” means that $R|_J(a_i) = R|_J(a_i')$). Finally, let $J_i = (a_i, a_i')$ (so $J = J_0$). We call the sequence $J_0, \dots, J_{E-1}$ an induced sequence of central intervals for $R$. We say that a box mapping is $\kappa$-symmetric if its central branch is $\kappa$-symmetric.

Lemma 4.1. Let $R : U \to T$ be a $C^3$ box mapping with negative Schwarzian derivative. Let $\delta, \tau$ be positive and $\alpha > 1/2$, $\kappa > 1$ such that
(i) $U$ is $\tau$-nested inside $T$,
(ii) the Schwarzian derivative of the central branch is smaller than $-\delta/|J|^2$ for every $x \in J$,
(iii) $R$ is $\kappa$-symmetric.
Then there exists a constant $K > 0$ depending only on $\alpha, \tau, \kappa$ and $\delta$ so that the following is true: if $I$ is an interval such that
• $R^m$ is monotone on $I$,
• $R^i(I)$ does not intersect $J_{E-1}$ for $i = 0, \dots, m - 1$,
• $R^{m-1}(I)$ and $J_0$ are disjoint,
then
$$\sum_{i=0}^{m} |R^i(I)|^\alpha < K\, b(T, R^m(I))^\alpha\, |T|^\alpha.$$
We can prove this lemma only for $\alpha > 1/2$ and believe that for $\alpha < 1/2$ it might not hold. The constant 1/2 is not related to the order of the critical point, but to the order of “almost” tangency of the central branch and the diagonal as in Proposition 1.

Proof. We divide the orbit $\{R^i(I),\ i = 0, \dots, m\}$ into pieces according to the following rule: $m_0 = 0$ and $m_{j+1}$ is the smallest integer greater than $m_j$ such that the interval $R^{m_{j+1}}(I)$ is disjoint from the central domain $J$. In this way we have constructed a sequence $\{m_j,\ j = 0, \dots, p\}$. Put $m_{p+1} = m$. If $m_{j+1} \neq m_j + 1$, then between the $m_j$-th and $m_{j+1}$-th iterates the interval $I$ is either close to a saddle-node bifurcation or to a repelling fixed point. Let $V_j$ be a domain of $R$ such that $R^{m_j}(I) \subset V_j$. The domain $V_j$ exists because otherwise the map $R^{m_j+1}|_I$ would not be monotone. Since $U$ is $\tau$-nested inside $T$, the cross-ratio $b(T, V_j)$ is bounded by a universal constant depending only on $\tau$. We want to prove that there exists $\lambda < 1$ which depends only on $\tau$ so that
$$b(V_j, R^{m_j}(I)) < \lambda\, b(V_{j+1}, R^{m_{j+1}}(I)). \qquad (3)$$
The proof falls naturally into two cases.

Case I. Suppose that $m_{j+1} = m_j + 1$. Since $R$ expands cross-ratios, $b(V_j, R^{m_j}(I)) < b(T, R^{m_{j+1}}(I))$.
Lemma 2.2 applied to $T \supset V_{j+1} \supset R^{m_{j+1}}(I)$ yields $b(T, R^{m_{j+1}}(I)) < \lambda\, b(V_{j+1}, R^{m_{j+1}}(I))$, where $\lambda < 1$ depends only on $\tau$. Combining these two inequalities we obtain (3).

Case II. Suppose $m_{j+1} > m_j + 1$. Let $J_i = (a_i, a_i')$, $i = 1, \dots, E - 1$, be an induced sequence of central intervals for $R$. Let $V_j' \supset R^{m_j+1}(I)$ be a connected component of $J_i \setminus J_{i+1}$. Since $R^{m-1}(I) \cap J = \emptyset$, the intervals $\{R^j(I),\ j = 0, \dots, m - 1\}$ never meet the boundary of the intervals $J_i$. Since $R$ expands cross-ratios,
$$b(V_j', R^{m_j+1}(I)) < b((a, a_0), R^{m_{j+1}}(I)) < b(V_{j+1}, R^{m_{j+1}}(I)),$$
$$b(V_j, R^{m_j}(I)) < b(T, R^{m_j+1}(I)).$$
Lemma 2.2 applied to the intervals $T \supset V_j' \supset R^{m_j+1}(I)$ yields
$$b(T, R^{m_j+1}(I)) < \lambda\, b(V_j', R^{m_j+1}(I)) < \lambda\, b(V_{j+1}, R^{m_{j+1}}(I)),$$
which combined with the other inequalities implies (3).

Conclusion. The estimate (3) implies an exponential decay of $b(T, R^{m_{p-j}}(I))$ with $j$. Since $|J|/|T| < b(T, J)$, we also have an exponential decay of lengths of the intervals $R^{m_{p-j}}(I)$. Consequently, there exists a positive constant $C_1$ so that
$$\sum_{j=0}^{p+1} |R^{m_j}(I)|^\alpha < C_1\, b(T, R^m(I))^\alpha\, |T|^\alpha, \qquad \sum_{j=0}^{p+1} b(T, R^{m_j}(I))^\alpha < C_1\, b(T, R^m(I))^\alpha. \qquad (4)$$
We see that the only missing estimate is a universal bound on mj +1
|R i (I )|α .
i=mj +1
Clearly, we may assume that $m_{j+1} > m_j + 1$. Recall that $V_j$ is either $(a_i, a_{i+1})$ or $(a_i', a_{i+1}')$, where $i = m_{j+1} - m_j - 2$. Hence $R^{l}(V_j) = (a_{i-l}, a_{i-l+1})$ for $l = 1, \dots, i$. Therefore, by the property of expanding cross-ratios, for the same values of $l$, we have that
\[
|R^{l+m_j+1}(I)|^{\alpha} < |R^{l}(V_j)|^{\alpha}\, b(R^{l}(V_j), R^{l+m_j+1}(I))^{\alpha} < |a_{i-l} - a_{i-l+1}|^{\alpha}\, b(V_{j+1}, R^{m_{j+1}}(I))^{\alpha}.
\]
The above inequality and the estimates of Propositions 1 and 2 yield (for convenience we put $a_{-1} = a$)
\[
\sum_{i=m_j+1}^{m_{j+1}} |R^{i}(I)|^{\alpha}
< b(V_{j+1}, R^{m_{j+1}}(I))^{\alpha} \sum_{i=0}^{m_{j+1}-m_j} |a_i - a_{i-1}|^{\alpha}
< C_2\, b(V_{j+1}, R^{m_{j+1}}(I))^{\alpha}\, |T|^{\alpha}. \tag{5}
\]
Combining (4) and (5), we obtain the assertion of the lemma.
Estimates for induced box mappings. Suppose that a box mapping R : U → T is the first entry map of a unimodal map f to a regularly returning interval T. Then every branch of R is an iterate of f, i.e. for every connected component I of U, one has $R|_I = f^{n(I)}$, where n(I) is a positive integer. We say that a box mapping R is induced by f.
Definition 3. An induced box mapping R is τ-extendible if there exists an interval T′ so that T is τ-nested inside T′ and every branch of R with the domain I ⊂ T of U has an extension mapping diffeomorphically over T′.
An inductive parameter. Let a box mapping R : U → T be induced by f. We define
\[
\rho^{\alpha}(f, R) = \sup_{V} \sum_{i=0}^{n(V)} |f^{i}(V)|^{\alpha},
\]
where the supremum is taken over connected components V of the domain of R outside of T which contain points from the critical orbit, i.e. $\{f^{i}(\xi),\ i > 0\}$. The next lemma is a counterpart of Lemma 4.1 for induced box mappings.
Lemma 4.2. Let R : U → T be the first entry map of an S-unimodal map f to a regularly returning interval T. Suppose that R satisfies the conditions (i-iii) from Lemma 4.1 and is τ-extendible. Let E be an escaping time and $J_i$, i = 0, . . . , E − 1, be induced central intervals. Then for every α > 1/2 there exists K > 1 (depending on α, δ, κ and τ) with the following property: If I is an interval such that
• $f^{m}$ is monotone on I and $f^{m}(I) \subset T$,
• $f^{i}(I)$ is disjoint from $J_{E-1}$ for i = 0, . . . , m − 1,
• $f^{\mu}(I)$ is disjoint from $J_0$, where µ is the greatest integer smaller than m such that $f^{\mu}(I) \subset T$ (we set µ = 0 if $f^{i}(I) \cap T = \emptyset$ for i = 0, . . . , m − 1),
then
\[
\sum_{i=0}^{m} |f^{i}(I)|^{\alpha} < K \rho^{\alpha}(f, R)\, b(T, f^{m}(I))^{\alpha}.
\]
Proof. Let $1 \le m_1 < m_2 < \dots < m$ be the iterates when the interval I is mapped into T, $f^{m_j}(I) \subset T$. Denote by $V_j$ an interval such that $f^{m_j+1}(I) \subset V_j$ and $f^{m_{j+1}-m_j-1}(V_j) = T$. Let $I \subset V_0$. Since f has negative Schwarzian derivative and the range of $f^{m_{j+1}-m_j-1}$ restricted to $V_j$ has a definite extension, the distortion of $f^{m_{j+1}-m_j-1}$ on $V_j$ is bounded by a constant which depends only on the τ-extendibility of R. Hence, for $i = m_{j-1}+1, \dots, m_j$,
\[
\frac{|f^{i}(I)|^{\alpha}}{|f^{\,i-m_{j-1}-1}(V_{j-1})|^{\alpha}} < C_3\, \frac{|f^{m_j}(I)|^{\alpha}}{|T|^{\alpha}},
\]
where $C_3$ depends only on τ. Therefore,
\[
\sum_{i=m_{j-1}+1}^{m_j} |f^{i}(I)|^{\alpha}
< C_3\, \frac{|f^{m_j}(I)|^{\alpha}}{|T|^{\alpha}} \sum_{i=m_{j-1}+1}^{m_j} |f^{\,i-m_{j-1}-1}(V_{j-1})|^{\alpha}
< C_3\, \rho^{\alpha}(f, R)\, \frac{|f^{m_j}(I)|^{\alpha}}{|T|^{\alpha}}.
\]
Lemma 4.2 follows from Lemma 4.1 and the above inequality.
Corollary 1. Suppose that R : U → T is as in Lemma 4.2. Then there is a constant K depending only on α, δ, κ, and τ such that if R′ is the first entry map of f to $J_{E-1}$, then
\[
\rho^{\alpha}(f, R') < K \rho^{\alpha}(f, R)\, \frac{|J_{E-1}|^{\alpha}}{|T|^{\alpha}}.
\]
Proof. By Lemma 4.2, $\rho^{\alpha}(f, R') < K \rho^{\alpha}(f, R)\, b(T, J_{E-1})^{\alpha}$. Since J is τ-nested inside T, both components of $T \setminus J_{E-1}$ have length greater than $C_4 |T|$. Hence, there exists a constant $C_5$ which depends only on τ so that
\[
b(T, J_{E-1})^{\alpha} < C_5\, \frac{|J_{E-1}|^{\alpha}}{|T|^{\alpha}}.
\]
5. Canonical Inducing
Let f : M → M be an S-unimodal map without periodic attractors. To begin the construction, look for an open interval T0 ⊂ M which contains ξ and is regularly returning. There is a canonical way of finding an initial T0. Note first that the fixed point q which is in the interior of M has another preimage q′ < 0. The interval (q′, q) can be taken as T0. The central domain of R0 can coincide with T0. In this case the map f has a restrictive interval of period 2. If the central domain of R0 does not coincide with T0, then it is compactly contained in T0. Let E be an escaping time of R0 and let $J_{E-1}$ be an induced central domain for R0. Note that $J_{E-1}$ is regularly returning. If E is infinite, then f has a restrictive interval properly contained in the central domain of R0. If E is finite, set $T_1 = J_{E-1}$ and take the first entry map R1 of f into T1. We can repeat this construction inductively and get a canonical induced sequence of box mappings $R_l : U_l \to T_l$. This sequence is finite iff either f has a restrictive interval or ξ is not recurrent.
Initial estimates. Suppose that f is an S-unimodal map with all periodic points repelling. If f has a restrictive interval then a canonical sequence of induced box mappings $R_l$ is finite. We will analyze the last box mapping R, termed terminal, in this sequence. Let J be the central domain of R. The escaping time E of R is infinite, and hence
\[
M' = \bigcap_{i=1}^{\infty} (R|_J)^{-i}(J)
\]
is a restrictive interval for f. From now on we call M′ a canonical restrictive interval.
Lemma 5.1. Let f : M → M be an S-unimodal map with all periodic points repelling and M′ be a canonical restrictive interval for f. Let R : U → T be a terminal box mapping from the canonical sequence induced by f and suppose that U is τ-nested inside T. If the central branch of R is a restriction of $f^{n}$ then there exists K < 1 which depends only on τ so that
\[
\sum_{i=1}^{n} |f^{i}(M')| < K |M|.
\]
Proof. Let V be a component of U containing f(M′). Then $f^{n-1}(V) = T$. Moreover, $f^{n-1}(f(M')) \subset M' \subset J$. The map $f^{i}$ on $f^{n-1-i}(V)$ has negative Schwarzian derivative and J is τ-nested inside T, hence there exists a constant K < 1 depending only on τ such that $|f^{i}(M')| < K |f^{i-1}(V)|$, i = 1, . . . , n. The orbit of V is disjoint because R is the first return map and consequently, $\sum_{i=1}^{n} |f^{i-1}(V)| < |M|$.
It is easy to see that there are exactly two repelling fixed points of the terminal box mapping R : U → T in the canonical restrictive interval M′, but only one, say r, is in the interior of M′. Let r′ ∈ M′ be its symmetrical point, i.e. R(r′) = R(r) = r. An interval T′ := (r, r′) is regularly returning.
Lemma 5.2. For every λ > 1 and α > 0 there exist constants C > 0 and κ > 1 with the following property. Suppose that R : U → T is a terminal box mapping induced by f and M′ is a canonical restrictive interval. Let T′ = (r, r′) and R′ : U′ → T′ be the first entry map of R into T′. If multipliers of the fixed points of $R|_{M'}$ are larger than λ and $R|_{M'}$ is κ-symmetric then $\rho^{\alpha}(R|_{M'}, R') < C |M'|^{\alpha}$.
Proof. Let W be a connected component of M′ \ r not containing the critical point. If V is a connected component of the domain of R′, then it is easy to see that either R(V) = T′ or $R^{k}(V) \subset W$ for some k ∈ {0, 1, 2}. Thus, in order to prove the lemma we can assume that V ⊂ W. In this case there is k such that $R^{i}(V) \subset W$ for i = 0, . . . , k and $R^{k+1}(V) = T'$. Take κ = 2λ/(1 + λ). Then the absolute values of DR at the boundary points of W are larger than (1 + λ)/2. Due to the minimal principle this implies that the absolute value of $DR|_W$ is bigger than (1 + λ)/2. Hence there exists a constant C > 0 depending on λ and α such that
\[
\sum_{i=0}^{k} |R^{i}(V)|^{\alpha} < C |M'|^{\alpha}.
\]
6. Universal Estimates
Suppose that M is a canonical restrictive interval of f of period n. Then $f^{n}|_M$ is a unimodal map and we have a sequence of canonically induced box mappings $R_{l,M} : U_l \to T_l$. We say that Y is a canonically induced interval if there exists a canonical restrictive interval M and a box mapping $R_{l,M}$ such that $Y = T_l$ for some l.
Proposition 3. There exist positive constants τ, λ > 1 and δ with the following properties. Let f : M → M be an S-unimodal map with a non-degenerate critical point and infinitely many canonical restrictive intervals $M_k$ of different periods $n_k$. Then for every κ > 1 there exists $k_0$ so that for every canonically induced interval $Y \subset M_{k_0}$ we have that
• the first entry map R of f to Y is τ-extendible and the central domain J of R is τ-nested inside Y,
• R is κ-symmetric,
• $SR|_J(x)\,|J|^{2} < -\delta$ for every x ∈ J,
• the absolute value of a multiplier of every periodic point of R is larger than λ.
The hypothesis of Proposition 3 that f has infinitely many restrictive intervals is redundant. It is enough to assume that the length of Y is small enough. The proof of Proposition 3 is based on a few lemmas, which use only the small size of Y.
Lemma 6.1. For a positive integer n, let f be a $C^{n+1}$ unimodal map from a closed interval I, normalized so that the critical point is at 0 and f(0) = 1 is a local maximum. Suppose that $D^{2}f(0) < 0$. Then f(x) can be expressed as $1 - (h(x))^{2}$ where h(x) is a $C^{n}$ increasing diffeomorphism of I onto its image and h(0) = 0.
Proof. The map h is uniquely determined: $h(x) = \sqrt{1 - f(x)}$, where positive values of the root are chosen for positive x and negative for negative x. Denoting $\theta(x) = \frac{1 - f(x)}{x^{2}}$ we can write $h(x) = x\sqrt{\theta(x)}$. We see that θ is $C^{n-1}$ in a neighborhood of 0, $C^{n}$ in a punctured neighborhood with the nth derivative bounded as $o(|x|^{-1})$. Hence h is $C^{n}$.
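The factorisation in Lemma 6.1 can be checked numerically on a concrete map. The sketch below is only an illustration: the test map f(x) = cos x is a hypothetical choice, not taken from the paper, and the helper names `h` and `theta` are ours. For this map 1 − f(x) = 2 sin²(x/2), so one expects h(x) = √2 sin(x/2).

```python
import math

def f(x):
    # Hypothetical test map: cos has a non-degenerate maximum f(0) = 1 at 0.
    return math.cos(x)

def h(x):
    # h(x) = sign(x) * sqrt(1 - f(x)), the choice of root made in the proof.
    return math.copysign(math.sqrt(max(0.0, 1.0 - f(x))), x)

def theta(x):
    # theta(x) = (1 - f(x)) / x^2, extended by -f''(0)/2 = 1/2 at x = 0.
    return 0.5 if x == 0 else (1.0 - f(x)) / (x * x)

xs = [-1.0 + 0.25 * k for k in range(9)]  # grid on [-1, 1]

# f = 1 - h^2, h agrees with the closed form, and h = x * sqrt(theta).
recon_err = max(abs(f(x) - (1.0 - h(x) ** 2)) for x in xs)
closed_err = max(abs(h(x) - math.sqrt(2.0) * math.sin(x / 2.0)) for x in xs)
factor_err = max(abs(h(x) - x * math.sqrt(theta(x))) for x in xs)
increasing = all(h(a) < h(b) for a, b in zip(xs, xs[1:]))

print(recon_err, closed_err, factor_err, increasing)
```

The three errors should be at the level of rounding, and `increasing` confirms on the grid that h is monotone, as the lemma asserts.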
By Lemma 6.1 the central branch can be put in the desired form with the sacrifice of one order of smoothness.
Lemma 6.2. Suppose that f is a $C^{2}$ unimodal map with non-degenerate critical point ξ. Then for every κ > 1 there is ε > 0 such that if Y is a regularly returning interval of size smaller than ε, then $f^{n}|_M$ is κ-symmetric. This is a straightforward application of the previous lemma.
Lemma 6.3. Let f be a $C^{3}$ unimodal map with non-degenerate critical point ξ. There exists ε > 0 so that for every regularly returning interval Y with length smaller than ε the following holds: Let ψ : Y → Y be the central branch of the first return map to Y. Then $S\psi(x)\,|Y|^{2} < -1/3$.
Proof. By calculus, for every $C^{3}$ unimodal map f with non-degenerate critical point there exists η > 0 such that for every x ∈ (−η, η), $Sf(x) < -(1/3)|x|^{-2}$. If ψ is a restriction of $f^{n}$ and Y ⊂ (−η, η) then the composition law for the Schwarzian derivative implies that
\[
S\psi(x)\,|Y|^{2} \le Sf(x)\,|x|^{2} + Sf^{n-1}(f(x))\,|Df(x)|^{2}\,|Y|^{2} \le -1/3.
\]
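To see where the bound on the Schwarzian derivative near a non-degenerate critical point comes from, one can evaluate Sf = f'''/f' − (3/2)(f''/f')² on the simplest model. The following sketch assumes the model map f(x) = 1 − x² (our choice for illustration, the lemma itself concerns general C³ maps); for it Sf(x)·x² is identically −3/2, comfortably below −1/3.

```python
def schwarzian(d1, d2, d3):
    # S f = f'''/f' - (3/2) * (f''/f')^2, from the first three derivatives at a point.
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def schwarzian_quadratic(x):
    # Model map f(x) = 1 - x^2: f' = -2x, f'' = -2, f''' = 0 (x must be nonzero).
    return schwarzian(-2.0 * x, -2.0, 0.0)

sample_points = (-0.5, -0.1, 0.01, 0.3)
scaled = [schwarzian_quadratic(x) * x * x for x in sample_points]
print(scaled)
```

Each entry of `scaled` equals −3/2 up to rounding, independently of x, which is the scale-invariant form of the estimate used in the proof of Lemma 6.3.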
Lemma 6.4 (Real bounds). For every κ > 0 there exists τ > 0 with the following property. Let f be a κ-symmetric S-unimodal map with a non-degenerate critical point and infinitely many different periods of renormalization. If Y is a canonically induced interval then the first entry map R to Y is τ -extendible and the central domain of R is τ -nested inside Y . Proof. This is a well-known fact. We refer the reader, for example, to Theorem 3.2 of [17] from which a numerical constant for the extendibility can be readily produced. The τ -nestedness is a direct consequence of the uniform extendibility and the uniform separation from 1 of the absolute values of multipliers of periodic points of f , see [14], p. 268.
Proof of Proposition 3. Observe that lengths of restrictive intervals Mk go to 0 with k since otherwise f would have a wandering interval. The first claim of Proposition 3 follows from Lemmas 6.4 and 6.2. The other two claims are obtained from Lemmas 6.2 and 6.3 with δ = 1/3. For the last claim see [14], p. 268.
Uniform decay of geometry. Suppose that the critical orbit of an S-unimodal map f is recurrent. Consider a canonical induced sequence of box mappings $R_l : U_l \to T_l$. An induced sequence $R_l : U_l \to T_l$ shows the decay of geometry provided that
\[
\omega_l := \log \frac{|T_l|}{|T_{l+1}|} \ge C\,l
\]
for some C > 0 (which depends only on $\omega_0$) and all l for which $R_l$ is well-defined.
Lemma 6.5. Let f be as in Proposition 3. Then there exists $k_0$ so that for every $k > k_0$ if $R_l$, l = 0, . . . , m, is a canonical induced sequence for $f_k := f^{n_k}|_{M_k}$ then it shows a uniform decay of geometry.
Proof. We will need a simplified version of Theorem 5 of [5]: Suppose that f is an S-unimodal map with recurrent critical orbit or restrictive interval. Then for every ε > 0, there is a regularly returning interval Y for f which contains ξ and such that if Y′ denotes the connected component of the first entry map into Y containing ξ, then |Y| ≥ K, where K > 0 depends only on ε and not on f and either Y′ is a restrictive interval or |Y′|/dist(Y′, ∂Y) < ε. Now from this (see Corollary 1 in [5]) and Proposition 3 it follows that for any ε > 0 there exists a universal number $l_0$ so that $\omega_l < \varepsilon$ provided $l > l_0$. The starting condition of [10] is satisfied and the uniform decay of geometry follows from [10].
Lemma 6.6. For every α > 1/2 there exists a constant C > 0 with the following property. Let f be as in Proposition 3 and $R_l$, l = 0, . . . , m, be a canonical induced sequence for $f_k := f^{n_k}|_{M_k}$. There exists $k_0$ such that for every $k \ge k_0$, $\rho^{\alpha}(f_k, R_m) < C \rho^{\alpha}(f_k, R_0)$.
Proof. Proposition 3 implies that the constants K produced by Corollary 1 applied to $R_i$, i = 0, . . . , m − 1, are all bounded by a universal constant provided k is large enough. Hence, by Corollary 1,
\[
\rho^{\alpha}(f_k, R_m) < K^{m} \rho^{\alpha}(f_k, R_0)\, \frac{|T_m|^{\alpha}}{|T_0|^{\alpha}}.
\]
Combining Proposition 3 and Lemma 6.5, we see that the lengths of $T_m$ decay superexponentially fast in a uniform fashion. Consequently, $K^{m} |T_m|^{\alpha} / |T_0|^{\alpha}$ remains uniformly bounded by a constant independent of m and tends to 0 as m → ∞.
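The last step of this proof rests on the fact that a superexponential decay of $|T_m|$ beats any exponential factor $K^m$. A minimal numerical sketch follows, assuming the decay-of-geometry bound ω_l ≥ C·l holds for all l, so that log(|T_0|/|T_m|) ≥ C·m(m−1)/2. The constants K, α and C below are hypothetical values chosen for illustration only.

```python
import math

def bound_sequence(K, alpha, C, m_max):
    # Upper bound for K^m |T_m|^alpha / |T_0|^alpha under the assumption
    # omega_l >= C*l, i.e. |T_m|/|T_0| <= exp(-C * m * (m - 1) / 2).
    values = []
    for m in range(m_max + 1):
        log_ratio = -C * m * (m - 1) / 2.0
        values.append(math.exp(m * math.log(K) + alpha * log_ratio))
    return values

# Hypothetical constants: a large K still loses against the quadratic exponent.
seq = bound_sequence(K=10.0, alpha=0.75, C=0.5, m_max=60)
print(max(seq), seq[-1])
```

The sequence first grows (while the linear term m·log K dominates), reaches a finite maximum, and then tends to 0, which is exactly the behaviour claimed in the proof.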
Proof of Theorem 1. We may assume that f has negative Schwarzian derivative. Let $M_k$ be a sequence of restrictive intervals of periods $n_k$ and denote by $f_k$ the map $f^{n_k}|_{M_k} : M_k \to M_k$. Suppose that $k > k_0$, where $k_0$ is a constant supplied by Proposition 3. In view of Proposition 3, Lemma 5.1 and combined Lemmas 6.6 and 5.2 imply respectively
\[
\sum_{i=1}^{n_k/n_{k-1}} |f_{k-1}^{i}(M_k)| < C_6 |M_{k-1}|,
\qquad
\sum_{i=1}^{n_k/n_{k-1}} |f_{k-1}^{i}(M_k)|^{\alpha} < C_7 |M_{k-1}|^{\alpha},
\]
where $C_6 < 1$ and $C_7$ are universal constants if we set α to be some number in (1/2, 1). All the intervals $f_{k-1}^{i}(M_k)$ belong to $M_{k-1}$ for $i = 1, \dots, n_k/n_{k-1}$. The distortion of the map $f^{n_{k-1}-j}|_{f^{j}(M_{k-1})}$, $j = 1, \dots, n_{k-1}$, is bounded by some universal constant, hence we have
\[
\sum_{i=1}^{n_k} |f^{i}(M_k)| < C_8 \sum_{i=1}^{n_{k-1}} |f^{i}(M_{k-1})|,
\qquad
\sum_{i=1}^{n_k} |f^{i}(M_k)|^{\alpha} < C_9 \sum_{i=1}^{n_{k-1}} |f^{i}(M_{k-1})|^{\alpha},
\]
where $C_8 < 1$. Every σ ∈ (α, 1) can be represented as $(1/\mu) + (1/\mu')\alpha$, where $1/\mu + 1/\mu' = 1$. Using the Hölder inequality with exponents µ and µ′, we obtain that
\[
\begin{aligned}
\sum_{i=1}^{n_k} |f^{i}(M_k)|^{\sigma}
&= \sum_{i=1}^{n_k} |f^{i}(M_k)|^{1/\mu}\, \bigl(|f^{i}(M_k)|^{\alpha}\bigr)^{1/\mu'} \\
&< C_8^{1/\mu} \Bigl(\sum_{i=1}^{n_{k-1}} |f^{i}(M_{k-1})|\Bigr)^{1/\mu} C_9^{1/\mu'} \Bigl(\sum_{i=1}^{n_{k-1}} |f^{i}(M_{k-1})|^{\alpha}\Bigr)^{1/\mu'} \\
&\le C_8^{1/\mu} C_9^{1/\mu'} \sum_{i=1}^{n_{k-1}} |f^{i}(M_{k-1})|^{\sigma}.
\end{aligned}
\]
Taking µ sufficiently large, we get universal constants σ < 1 and K < 1 such that
\[
\sum_{i=1}^{n_k} |f^{i}(M_k)|^{\sigma} < K \sum_{i=1}^{n_{k-1}} |f^{i}(M_{k-1})|^{\sigma}.
\]
Thus, $\sum_{i=1}^{n_k} |f^{i}(M_k)|^{\sigma}$ decays exponentially fast to zero. Our attractor is equal to $\bigcap_{k=1}^{\infty} \bigcup_{i=1}^{n_k} f^{i}(M_k)$, therefore its Hausdorff dimension is equal to or smaller than σ.
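The final implication (covering sums in the power σ tending to zero give Hausdorff dimension at most σ) can be illustrated on a toy example that is not the attractor of the paper: the middle-thirds Cantor set, whose level-k cover consists of 2^k intervals of length 3^(−k). In the sketch below the function name `covering_sum` is ours; the sum decays exponentially exactly when σ exceeds log 2 / log 3.

```python
import math

def covering_sum(k, sigma):
    # Level-k cover of the middle-thirds Cantor set: 2^k intervals of
    # length 3^-k, so the sum of lengths^sigma equals (2 * 3^-sigma)^k.
    return (2.0 * 3.0 ** (-sigma)) ** k

critical = math.log(2.0) / math.log(3.0)  # Hausdorff dimension of the Cantor set

print(covering_sum(200, 0.7))       # sigma above the critical value: tends to 0
print(covering_sum(200, 0.6))       # sigma below the critical value: grows
print(covering_sum(10, critical))   # at the critical value the sum stays equal to 1
```

In the proof above the role of the level-k cover is played by the intervals $f^i(M_k)$, $i = 1, \dots, n_k$, and the geometric factor is the universal constant K < 1.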
Acknowledgement. Both authors are very grateful to Nicolae Mihalache for many valuable comments. Jacek Graczyk would like to thank Frédéric Paulin for bringing to his attention the results of P. Doyle.
References
1. Bruin, H., Keller, G., Nowicki, T., van Strien, S.: Wild Cantor attractors exist. Ann. of Math. (2) 143, no. 1, 97–130 (1996)
2. Doyle, P.: On the bass note of a Schottky group. Acta Math. 160, no. 3–4, 249–284 (1988)
3. Feigenbaum, M.: Quantitative universality for a class of nonlinear transformations. J. Statist. Phys. 19, no. 1, 25–52 (1978)
4. Graczyk, J., Sands, D., Świątek, G.: Metric attractors for smooth unimodal maps. Annals of Math. 159, 725–740 (2004)
5. Graczyk, J., Sands, D., Świątek, G.: Decay of geometry for unimodal maps: negative Schwarzian case. Annals of Math. 161, 613–677 (2005)
6. Graczyk, J., Świątek, G.: Induced expansion for quadratic polynomials. Ann. Scient. Éc. Norm. Sup. 29, 399–482 (1996)
7. Graczyk, J., Świątek, G.: Critical circle maps near bifurcation. Commun. Math. Phys. 127, 227–260 (1996)
8. Graczyk, J., Świątek, G.: The real Fatou conjecture. Princeton, NJ: Princeton University Press, 1998
9. Grassberger, P.: On the Hausdorff dimension of fractal attractors. J. Statist. Phys. 26, no. 1, 173–179 (1981)
10. Jakobson, M., Świątek, G.: Metric properties of non-renormalizable S-unimodal maps. I. Induced expansion and invariant measures. Erg. Th. Dyn. Sys. 14, 721–755 (1994)
11. Kozlovski, O.: Getting rid of the negative Schwarzian derivative condition. Ann. of Math. 152, 743–762 (2000)
12. Lyubich, M.: Combinatorics, geometry and attractors of quasi-quadratic maps. Ann. of Math. 140, 347–404 (1994)
13. Martens, M., de Melo, W., van Strien, S.: Julia-Fatou-Sullivan theory for real one-dimensional dynamics. Acta Math. 168, no. 3–4, 273–318 (1992)
14. de Melo, W., van Strien, S.: One-dimensional dynamics. Berlin-Heidelberg-New York: Springer-Verlag, 1993
15. Milnor, J.: On the concept of attractor. Commun. Math. Phys. 99, no. 2, 177–195 (1985)
16. Mihalache, N.: Hausdorff dimension of fractal attractors. Preprint, 2003
17. Nowicki, T., Sands, D.: Non-uniform hyperbolicity and universal bounds for S-unimodal maps. Invent. Math. 132, no. 3, 633–680 (1998)
18. Patterson, S.J.: The limit set of a Fuchsian group. Acta Math. 136, no. 3–4, 241–273 (1976)
19. Sullivan, D.: Entropy, Hausdorff measures old and new, and limit sets of geometrically finite Kleinian groups. Acta Math. 153, no. 3–4, 259–277 (1984)
20. Sullivan, D.: Bounds, quadratic differentials, and renormalization conjectures. American Mathematical Society centennial publications. Vol. II. Providence, RI: Amer. Math. Soc., 1992, pp. 417–466
Communicated by G. Gallavotti
Commun. Math. Phys. 264, 583–611 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1543-6
Communications in
Mathematical Physics
Spectra of Sol-Manifolds: Arithmetic and Quantum Monodromy
A.V. Bolsinov1,2, H.R. Dullin1, A.P. Veselov1,3
1 Department of Mathematical Sciences, Loughborough University, Loughborough, Leicestershire, LE11 3TU, UK. E-mail: [email protected], [email protected], [email protected]
2 Department of Mathematics and Mechanics, Moscow State University, 119899 Moscow, Russia
3 Landau Institute for Theoretical Physics, Moscow, Russia
Received: 18 March 2005 / Accepted: 16 November 2005
Published online: 31 March 2006 – © Springer-Verlag 2006
Abstract: The spectral problem of three-dimensional manifolds MA3 admitting Solgeometry in Thurston’s sense is investigated. Topologically MA3 are torus bundles over a circle with a unimodular hyperbolic gluing map A. The eigenfunctions of the corresponding Laplace-Beltrami operators are described in terms of modified Mathieu functions. It is shown that the multiplicities of the eigenvalues are the same for generic values of the parameters in the metric and are directly related to the number of representations of an integer by a given indefinite binary quadratic form. As a result the spectral statistics is shown to disagree with the Berry-Tabor conjecture. The topological nature of the monodromy for both classical and quantum systems on Sol-manifolds is demonstrated.
1. Introduction It has been known since the nineteenth century that in dimension two there is a close relationship between geometry and topology. Namely each compact orientable surface admits a metric of constant curvature: positive if it is a topological sphere, zero if it is a torus and negative if it has genus more than 1. In dimension three the situation is much more sophisticated. The major development here was due to Thurston [29] who put forward the famous Geometrisation Conjecture: Any compact orientable 3-manifold can be cut by disjoint embedded 2-spheres and tori into pieces, which after gluing 3-balls to all boundary spheres, admit one of 8 special geometric structures. These special 3-dimensional geometries are the standard Euclidean E 3 , spherical S 3 and hyperbolic H 3 geometries, the product geometries S 2 × R and H 2 × R and three geometries related to the Lie groups SL2 (R), N il and Sol. The last group Sol is the 3-dimensional solvable Lie group, which is isomorphic to the group of isometries of Minkowski 2-space. The corresponding metric has the least symmetry of all the 8 geometries as the identity component of the stabiliser of a point is trivial.
The structure of 3-manifolds admitting any of the seven geometries excluding the most complicated hyperbolic case H 3 is pretty well understood. In particular a 3-manifold M possesses Sol-geometric structure if and only if M is finitely covered by a torus bundle over S 1 with hyperbolic gluing map. For all other 6 geometries M must be a Seifert fibre space (see e.g. [27]), so the Sol-manifolds are special from this point of view. Their special role in the theory of dynamical systems became clear after a recent paper [5] by Taimanov and one of the authors, who showed the surprising fact that although the geodesic flow on Sol-manifolds is integrable in the sense of Liouville (but not in the analytic category) it has non-zero topological entropy! In the present paper we investigate the quantum version of the geodesic flow on Sol-manifolds, which is the spectral problem for the corresponding Laplace-Beltrami operator . We describe the spectra explicitly in terms of the spectrum of the modified Mathieu equation. These spectra are degenerate and have very interesting arithmetic. The multiplicities are directly related to the numbers of representations of a given integer by an indefinite binary quadratic form determined by the corresponding hyperbolic gluing map. This allows us to conclude that the spectral statistics for Sol-manifolds is not Poisson contrary to the well-known Berry-Tabor conjecture [3]. Note that the Sol-structure on Sol-manifolds is not unique in the same way as the flat structure on a torus is. The spectra of tori are very sensitive to a change of the flat metric: if we change the periods slightly the degeneracy will essentially disappear. The fact that this does not happen with Sol-manifolds shows the rigidity of the spectra and can be considered as a reflection of the hyperbolicity hidden inside the topology of Sol-manifolds. We should mention that a deep relation of Sol-manifolds with arithmetic was known before (see e.g. [1, 6, 13]). 
In particular, Hirzebruch [13] and Atiyah, Donnelly and Singer [1] discovered a remarkable relation between topological “signature defects” of Sol-manifolds and arithmetical L-functions. From the dynamical point of view the arithmetic and topology reveal themselves through Hamiltonian monodromy [9]. Its quantum analogue - quantum monodromy is a relatively new phenomenon [8, 12, 30], which still needs better understanding. An interesting feature of our case is that the corresponding grid of the quantum states can be described explicitly and nicely visualised (“Sol-flower”, see Figs. 6, 7 below). This is probably the first example of quantum monodromy of that kind. The structure of the paper is the following. First we introduce the class of Sol-manifolds and describe the classical geodesic dynamics and the corresponding Hamiltonian monodromy. Then we review the facts from classical number theory about the relations between binary quadratic forms and the modular group SL(2, Z). In Sect. 5 we consider the spectral problem for the corresponding Laplace-Beltrami operator and find the eigenfunctions in terms of modified Mathieu functions. The arithmetic of the multiplicities of the eigenvalues is discussed in detail in Sect. 6. The semiclassical analysis of the problem is done in Sect. 7 in relation with Weyl’s law. In Sect. 8 we discuss the spectral statistics in the context of the Berry-Tabor conjecture. The quantum monodromy for Sol-manifolds is discussed in the final section. 2. Sol-Manifolds In this paper we restrict ourselves to the main class of Sol-manifolds, which are T 2 torus bundles over a circle S 1 with hyperbolic gluing maps with positive eigenvalues.
More precisely, consider the action of Z on $\tilde{M}^{3} = T^{2} \times \mathbb{R}$ generated by the following transformation $T_A$. Let (x, y) be standard periodic coordinates on $T^{2}$ defined modulo 1, and z ∈ (−∞, +∞) be a coordinate on R. Then in these coordinates the transformation $T_A$ is given by
\[
T_A : \begin{pmatrix} x \\ y \\ z \end{pmatrix} \longmapsto \begin{pmatrix} a_{11} x + a_{12} y \\ a_{21} x + a_{22} y \\ z + 1 \end{pmatrix}, \tag{1}
\]
where $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \in SL(2, \mathbb{Z})$ is an integer hyperbolic matrix, which defines a hyperbolic automorphism of the 2-torus. The corresponding Sol-manifold $M_A^{3}$ is defined as the quotient $\tilde{M}^{3}/\mathbb{Z}$ by this action. Let λ and $\lambda^{-1}$ be the eigenvalues of A and we assume that λ > 1. The Sol-manifolds with negative λ are covered by those with positive eigenvalues. Together with (x, y, z) we shall use another coordinate system (u, v, z) on $M_A^{3}$, where (u, v) are linear coordinates on the fibres related to a positively oriented eigenbasis of A. The transformation $T_A$ in these coordinates is given by
\[
\begin{pmatrix} u \\ v \\ z \end{pmatrix} \longmapsto \begin{pmatrix} \lambda u \\ \lambda^{-1} v \\ z + 1 \end{pmatrix}. \tag{2}
\]
One should note that unlike (x, y), the new coordinates (u, v) are not periodic on the tori $T^{2}$ anymore: two pairs (u, v), (u′, v′) define the same point on $T^{2}$ if and only if $(u - u', v - v') = k(c_1^{1}, c_1^{2}) + m(c_2^{1}, c_2^{2})$, where k, m ∈ Z and $e_1 = (c_1^{1}, c_1^{2})$, $e_2 = (c_2^{1}, c_2^{2})$ is the basis of the lattice associated to $T^{2}$:
\[
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
= \begin{pmatrix} c_1^{1} & c_2^{1} \\ c_1^{2} & c_2^{2} \end{pmatrix}^{-1}
\begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix}
\begin{pmatrix} c_1^{1} & c_2^{1} \\ c_1^{2} & c_2^{2} \end{pmatrix}.
\]
The Riemannian metrics on Sol-manifolds come from right-invariant metrics on the universal covering of $M_A^{3}$, which has the natural structure of a solvable Lie group Sol. Topologically this group is $\mathbb{R}^{3}$ with a multiplication of the form
\[
(u, v, w) * (u', v', w') = (u + e^{w} u',\; v + e^{-w} v',\; w + w').
\]
One can realise it as the group of 3 × 3 matrices of the form
\[
\begin{pmatrix} e^{w} & 0 & u \\ 0 & e^{-w} & v \\ 0 & 0 & 1 \end{pmatrix}.
\]
The Sol-manifolds $M_A^{3}$ we consider are the quotients of the group Sol by the discrete subgroups $G_A$ corresponding to w = m ln λ, m ∈ Z and $(u, v) = k e_1 + l e_2$ belonging to the integer lattice described above, z = w/ ln λ. The right-invariant metrics on the group Sol correspond to the following class of metrics on the Sol-manifold $M_A$:
\[
ds^{2} = \alpha(z)\,dx^{2} + 2\beta(z)\,dx\,dy + \gamma(z)\,dy^{2} + dz^{2}, \tag{3}
\]
where
\[
\begin{pmatrix} \alpha(z) & \beta(z) \\ \beta(z) & \gamma(z) \end{pmatrix}
= \exp(-zB^{\top}) \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix} \exp(-zB).
\]
Here α, β, γ are real parameters with the only condition that the form $ds^{2} = \alpha\,dx^{2} + 2\beta\,dx\,dy + \gamma\,dy^{2}$ is positive definite and B is defined by the relation exp B = A:
\[
B = \begin{pmatrix} c_1^{1} & c_2^{1} \\ c_1^{2} & c_2^{2} \end{pmatrix}^{-1}
\begin{pmatrix} \ln\lambda & 0 \\ 0 & -\ln\lambda \end{pmatrix}
\begin{pmatrix} c_1^{1} & c_2^{1} \\ c_1^{2} & c_2^{2} \end{pmatrix}.
\]
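The relations between A, its eigenbasis and the matrix B can be checked on a concrete example. The sketch below is an illustration under stated assumptions: the matrix A is Arnold's cat map, a hypothetical choice not singled out in the paper; C denotes the matrix whose columns are $e_1, e_2$; and the metric is read as exp(−zB)ᵀ G₀ exp(−zB), with the transpose on the left factor, which is the reading that makes the matrix symmetric. All helper names are ours.

```python
import math

# Hypothetical example: Arnold's cat matrix, a hyperbolic element of SL(2, Z).
A = [[2.0, 1.0], [1.0, 1.0]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def max_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

# Larger eigenvalue lam > 1 of A (trace > 2, determinant 1).
tr = A[0][0] + A[1][1]
lam = (tr + math.sqrt(tr * tr - 4.0)) / 2.0

# Rows of C are left eigenvectors of A, so that C A = diag(lam, 1/lam) C.
# The sign of the second row is chosen to make det C > 0.
C = [[A[1][0], lam - A[0][0]],
     [-A[1][0], A[0][0] - 1.0 / lam]]
Cinv = inverse(C)

def conj_diag(d1, d2):
    """Return C^{-1} diag(d1, d2) C."""
    return matmul(matmul(Cinv, [[d1, 0.0], [0.0, d2]]), C)

A_rebuilt = conj_diag(lam, 1.0 / lam)
B = conj_diag(math.log(lam), -math.log(lam))

def exp_series(X, terms=40):
    """Truncated power series for the matrix exponential."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def metric(z, G0):
    """G(z) = exp(-zB)^T G0 exp(-zB), with exp(-zB) = C^{-1} diag(lam^-z, lam^z) C."""
    E = conj_diag(lam ** (-z), lam ** z)
    return matmul(matmul(transpose(E), G0), E)

G0 = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical positive definite (alpha, beta, gamma)
z = 0.3
# Pull back the fibre metric at height z + 1 by (x, y) -> A (x, y).
pulled_back = matmul(matmul(transpose(A), metric(z + 1.0, G0)), A)

print(max_diff(A, A_rebuilt), max_diff(A, exp_series(B)),
      max_diff(pulled_back, metric(z, G0)))
```

All three printed differences should be at rounding level: A is recovered from its eigen-decomposition, exp B = A, and the metric (3) is invariant under the gluing transformation (1), so it descends to the quotient.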
One can consider a more general metric allowing a constant coefficient at $dz^{2}$ but this will lead only to a general scaling.
3. Geodesic Flows on Sol-Manifolds: Integrals and Hamiltonian Monodromy
Thus, the Hamiltonian of the geodesic flow on $M_A^{3}$ in (u, v, z)-coordinates can be written as
\[
H = \frac{1}{2}\left(E e^{2z\ln\lambda} p_u^{2} + 2F p_u p_v + G e^{-2z\ln\lambda} p_v^{2}\right) + \frac{1}{2} p_z^{2},
\]
where E, F, G are real parameters: E > 0, G > 0, $EG - F^{2} > 0$. It is invariant under the following transformation:
\[
T_A^{*} : (u,\, v,\, z,\, p_u,\, p_v,\, p_z) \longmapsto (\lambda u,\, \lambda^{-1} v,\, z + 1,\, \lambda^{-1} p_u,\, \lambda p_v,\, p_z), \tag{4}
\]
and, of course, under the translations by the elements of the lattice. The same property must be satisfied for any smooth function on $T^{*}M_A^{3}$, in particular, for the first integrals of the geodesic flow. Since H depends neither on u, nor on v, the corresponding momenta $p_u$ and $p_v$ are local first integrals of the geodesic flow. However, being not invariant under (4), they are not well defined on the cotangent bundle $T^{*}M_A^{3}$. That is why, to get global first integrals, we need to replace $p_u$, $p_v$ by two smooth functions $f_1(p_u, p_v)$, $f_2(p_u, p_v)$ invariant under the transformation $(p_u, p_v) \to (\lambda^{-1} p_u, \lambda p_v)$ (or, speaking in more general terms, by the invariants of the Z-action on the cotangent plane generated by the hyperbolic linear transformation $A^{-1}$). One invariant function is evident: $Q = p_u p_v$. To find another one we introduce the following expression which will be useful also in the future
\[
\alpha = \frac{\ln\left(\sqrt{E/G}\;\left|p_u/p_v\right|\right)}{2\ln\lambda}. \tag{5}
\]
Under the transformation (4) α changes in a very simple way: $\alpha(p_u, p_v) \to \alpha(p_u, p_v) - 1$.
Thus, as a second integral we can take any function of α with period 1, for instance, cos(2πα) or sin(2πα). However, these functions are not smooth at $p_u = 0$ and $p_v = 0$. To avoid this difficulty and to get the first integrals in a more symmetric form we put: $f_1 = R(Q)\cos 2\pi\alpha$, $f_2 = R(Q)\sin 2\pi\alpha$, where
\[
R(Q) = |Q| \exp\left(-\frac{1}{Q^{2}}\right).
\]
Remark. The fact that the second integral is not analytic is not accidental: A theorem proved by Taimanov [28] implies that Sol-manifolds do not admit integrable geodesic flows with analytic integrals (see [5] for more details).
We are going to show now that one can see the topological structure of Sol-manifolds by looking at the Hamiltonian monodromy of the geodesic flow. For that we will have to investigate the bifurcation diagram (i.e. the set of critical values) of the momentum mapping restricted to the isoenergy surface $E_A^{5} = \{H = 1\}$:
\[
F_A = (f_1, f_2) : E_A^{5} \to \mathbb{R}^{2}. \tag{6}
\]
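The invariance properties used in this construction are easy to test numerically. The sketch below is a check under stated assumptions rather than part of the paper: α is read as ln(√(E/G)·|p_u/p_v|)/(2 ln λ) and the cutoff as R(Q) = |Q| exp(−1/Q²); the parameter values and the sample point are hypothetical. It applies the transformation (4) once and compares H, α and (f₁, f₂) before and after.

```python
import math

def hamiltonian(z, pu, pv, pz, E, F, G, lam):
    s = math.log(lam)
    quad = (E * math.exp(2 * z * s) * pu * pu + 2 * F * pu * pv
            + G * math.exp(-2 * z * s) * pv * pv)
    return 0.5 * quad + 0.5 * pz * pz

def alpha(pu, pv, E, G, lam):
    # Assumed reading of (5): ln( sqrt(E/G) |pu/pv| ) / (2 ln lam).
    return math.log(math.sqrt(E / G) * abs(pu / pv)) / (2.0 * math.log(lam))

def R(Q):
    # Assumed reading of the cutoff: |Q| exp(-1/Q^2), extended by 0 at Q = 0.
    return 0.0 if Q == 0 else abs(Q) * math.exp(-1.0 / (Q * Q))

def integrals(pu, pv, E, G, lam):
    a = alpha(pu, pv, E, G, lam)
    r = R(pu * pv)
    return r * math.cos(2 * math.pi * a), r * math.sin(2 * math.pi * a)

def lift(z, pu, pv, pz, lam):
    # The part of (4) that H depends on: z -> z + 1, pu -> pu/lam, pv -> lam*pv.
    return z + 1.0, pu / lam, pv * lam, pz

E, F, G = 2.0, 0.5, 1.0             # hypothetical parameters with EG - F^2 > 0
lam = (3.0 + math.sqrt(5.0)) / 2.0  # hypothetical eigenvalue > 1
state = (0.3, 0.7, -1.1, 0.4)       # (z, pu, pv, pz), arbitrary sample point
moved = lift(*state, lam)

dH = hamiltonian(*moved, E, F, G, lam) - hamiltonian(*state, E, F, G, lam)
d_alpha = alpha(moved[1], moved[2], E, G, lam) - alpha(state[1], state[2], E, G, lam)
f_before = integrals(state[1], state[2], E, G, lam)
f_after = integrals(moved[1], moved[2], E, G, lam)
print(dH, d_alpha, f_before, f_after)
```

One expects `dH` to vanish, `d_alpha` to equal −1, and the pair (f₁, f₂) to be unchanged, so that f₁ and f₂ are well defined on the quotient while α is defined only modulo 1.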
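Proposition 1. The set of critical points of the momentum mapping $F_A$ consists of five parts:
a) four one-parameter families $L_i$ (i = 1, . . . , 4) of (degenerate) 2-dimensional tori lying in the cotangent bundle given by (α is a parameter):
\[
z = -\alpha, \quad u \text{ and } v \text{ are arbitrary}, \quad p_z = 0, \quad
p_u = \pm\sqrt{\frac{e^{2\alpha\ln\lambda}}{E\left(1 + \frac{F}{\sqrt{EG}}\right)}}, \quad
p_v = \pm\sqrt{\frac{e^{-2\alpha\ln\lambda}}{G\left(1 + \frac{F}{\sqrt{EG}}\right)}};
\]
and
\[
z = -\alpha, \quad u \text{ and } v \text{ are arbitrary}, \quad p_z = 0, \quad
p_u = \pm\sqrt{\frac{e^{2\alpha\ln\lambda}}{E\left(1 - \frac{F}{\sqrt{EG}}\right)}}, \quad
p_v = \mp\sqrt{\frac{e^{-2\alpha\ln\lambda}}{G\left(1 - \frac{F}{\sqrt{EG}}\right)}},
\]
b) the critical set N given by the equation $Q = p_u p_v = 0$.
The bifurcation diagram of $F_A$ consists of two circles
\[
\{f_1^{2} + f_2^{2} = R^{2}(Q_+^{*})\} = F_A(L_i), \quad i = 1, 2,
\]
and
\[
\{f_1^{2} + f_2^{2} = R^{2}(Q_-^{*})\} = F_A(L_i), \quad i = 3, 4,
\]
where $Q_{\pm}^{*} = (F \pm \sqrt{EG})^{-1}$, and the point $(0, 0) = F_A(N)$, the centre of these circles.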
Proof. We are interested in the singularities of $F_A$ or, which is the same, those of the Liouville foliation. These singularities can be of two types. To explain their nature we first consider the geodesic flow on the covering manifold $\tilde{M}^{3}$. On this (non-compact) manifold the integrals of the flow are simply $p_u$ and $p_v$. Consider the Liouville foliation for this covering system. Its singular leaves correspond to the critical points of the momentum mapping
\[
\tilde{F} = (p_u, p_v) : \tilde{E}^{5} \to \mathbb{R}^{2},
\]
where $\tilde{E}^{5} = \{H = 1\} \subset T^{*}\tilde{M}$. Obviously, these leaves remain singular after the natural projection $\tilde{E}^{5} \to E_A^{5}$. These are singularities of the first type. On the other hand some new singularities appear since instead of $p_u$ and $p_v$ we have to consider more complicated functions $f_1$ and $f_2$. In other words, these are singularities of the map $(p_u, p_v) \to (f_1, f_2)$. Let us treat both cases in turn.
It is easily seen that $p_u$ and $p_v$ are functionally dependent, as functions on $\tilde{E}^{5} = \{H = 1\}$, if and only if two conditions are simultaneously satisfied: 1) $\frac{\partial H}{\partial p_z} = p_z = 0$ and 2) $\frac{\partial H}{\partial z} = 0$. Taking into account the condition H = 1, we obtain a system of equations
\[
E e^{2z\ln\lambda} p_u^{2} - G e^{-2z\ln\lambda} p_v^{2} = 0, \qquad
E e^{2z\ln\lambda} p_u^{2} + 2F p_u p_v + G e^{-2z\ln\lambda} p_v^{2} = 2.
\]
The first equation gives
\[
z = -\frac{\ln\left(\sqrt{E/G}\;\left|p_u/p_v\right|\right)}{2\ln\lambda} = -\alpha.
\]
Now solving this system for $p_u$ and $p_v$ (after substituting z = −α), we find four distinct solutions:
\[
\begin{aligned}
&1)\quad p_u = \frac{e^{\alpha\ln\lambda}}{\sqrt{E\left(1 + \frac{F}{\sqrt{EG}}\right)}}, \qquad p_v = \frac{e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 + \frac{F}{\sqrt{EG}}\right)}}; \\
&2)\quad p_u = \frac{-e^{\alpha\ln\lambda}}{\sqrt{E\left(1 + \frac{F}{\sqrt{EG}}\right)}}, \qquad p_v = \frac{-e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 + \frac{F}{\sqrt{EG}}\right)}}; \\
&3)\quad p_u = \frac{e^{\alpha\ln\lambda}}{\sqrt{E\left(1 - \frac{F}{\sqrt{EG}}\right)}}, \qquad p_v = \frac{-e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 - \frac{F}{\sqrt{EG}}\right)}}; \\
&4)\quad p_u = \frac{-e^{\alpha\ln\lambda}}{\sqrt{E\left(1 - \frac{F}{\sqrt{EG}}\right)}}, \qquad p_v = \frac{e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 - \frac{F}{\sqrt{EG}}\right)}}.
\end{aligned}
\]
Thus, for each value of α we obtain four 2-dimensional invariant tori in $T^{*}M_A^{3}$. All of them are diffeomorphically projected onto the same $T^{2}$-fibre $T_{-\alpha}^{2} = \{z = \mathrm{const} = -\alpha\} \subset M$. Varying α, we obtain 4 families of degenerate Liouville 2-tori $L_i$, i = 1, . . . , 4. It is easy to verify that for each family $L_i$ the value of $Q = p_u p_v$ is constant and equal to $Q_+^{*} = (F + \sqrt{EG})^{-1}$ for $L_1$ and $L_2$, and equal to $Q_-^{*} = (F - \sqrt{EG})^{-1}$ for $L_3$ and $L_4$. Hence the image of $L_1$ and $L_2$ is the circle $f_1^{2} + f_2^{2} = R^{2}(Q_+^{*})$, and analogously the image of $L_3$ and $L_4$ is the other circle $f_1^{2} + f_2^{2} = R^{2}(Q_-^{*})$, as required.
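The four solutions above can be verified directly. The following sketch is a numerical check under assumptions: it reads the solutions with the square roots shown, i.e. $p_u = \pm e^{\alpha\ln\lambda}/\sqrt{E(1 \pm F/\sqrt{EG})}$ and similarly for $p_v$, and uses hypothetical parameter values. For each family it confirms that H = 1, that ∂H/∂z = 0 at z = −α, and that $Q = p_u p_v$ equals $(F \pm \sqrt{EG})^{-1}$.

```python
import math

def hamiltonian(z, pu, pv, pz, E, F, G, lam):
    s = math.log(lam)
    quad = (E * math.exp(2 * z * s) * pu * pu + 2 * F * pu * pv
            + G * math.exp(-2 * z * s) * pv * pv)
    return 0.5 * quad + 0.5 * pz * pz

def dH_dz(z, pu, pv, E, G, lam):
    # Partial derivative of H with respect to z.
    s = math.log(lam)
    return s * (E * math.exp(2 * z * s) * pu * pu
                - G * math.exp(-2 * z * s) * pv * pv)

def critical_point(a, E, F, G, lam, eps, sign):
    """eps = +1 for L1, L2 (pu*pv > 0), eps = -1 for L3, L4; sign = +1 or -1."""
    s = math.log(lam)
    d = 1.0 + eps * F / math.sqrt(E * G)
    pu = sign * math.exp(a * s) / math.sqrt(E * d)
    pv = eps * sign * math.exp(-a * s) / math.sqrt(G * d)
    return -a, pu, pv, 0.0  # z = -alpha and pz = 0

E, F, G = 2.0, 0.5, 1.0             # hypothetical parameters with EG - F^2 > 0
lam = (3.0 + math.sqrt(5.0)) / 2.0  # hypothetical eigenvalue > 1
a = 0.37                            # arbitrary value of the parameter alpha

report = []
for eps in (1.0, -1.0):
    for sign in (1.0, -1.0):
        z, pu, pv, pz = critical_point(a, E, F, G, lam, eps, sign)
        report.append((
            hamiltonian(z, pu, pv, pz, E, F, G, lam) - 1.0,  # energy defect
            dH_dz(z, pu, pv, E, G, lam),                     # criticality in z
            pu * pv - 1.0 / (F + eps * math.sqrt(E * G)),    # Q minus Q*_(+/-)
        ))
print(report)
```

Every entry of `report` should vanish up to rounding, which is consistent with the values $Q_{\pm}^{*}$ stated in Proposition 1.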
Fig. 1. The topological structure of the singular leaf is $M_A^3 \times K$, where $K$ is given in the figure
The singularities of the second type come from those of the mapping $(p_u, p_v) \to (f_1, f_2)$. It can be easily seen that the critical points of this mapping are defined by the equation $Q = p_u p_v = 0$. This implies immediately $f_1 = f_2 = 0$, which gives a single point on the bifurcation diagram, namely the centre of the circles. Notice that topologically the subset $N = \{Q = p_u p_v = 0,\ H = 1\} \subset T^*M_A^3$ is homeomorphic to the direct product $M_A^3 \times K$, where $K$ is a graph that consists of two vertices and four segments connecting them (see Fig. 1). This follows from the parallelizability of $M_A^3$ and the simple observation that in each cotangent space the conditions $p_u p_v = 0$, $H = 1$ define a graph homeomorphic to $K$: two circles $\{G p_u^2 + \tfrac12 p_z^2 = 1\}$ and $\{E p_v^2 + \tfrac12 p_z^2 = 1\}$ intersecting at $p_u = p_v = 0$, $p_z = \pm 1$.

Now we are able to describe the global structure of the foliation of the isoenergy surface $E_A^5$ into Liouville tori. If we remove the singular set from the isoenergy surface we obtain four families of 3-dimensional Liouville tori distinguished from each other by the signs of $p_u$ and $p_v$:
a) $p_u > 0$, $p_v > 0$;
b) $p_u < 0$, $p_v < 0$;
c) $p_u > 0$, $p_v < 0$;
d) $p_u < 0$, $p_v > 0$.
The families a) and b) are isomorphic (more precisely, they transform into each other by the globally defined time reversal automorphism of the geodesic flow $(u, v, z, p_u, p_v, p_z) \to (u, v, z, -p_u, -p_v, -p_z)$). The same is true for the families c) and d). Each Liouville 3-torus is uniquely determined by the values of two integrals $Q$ and $\alpha \bmod 1$, where the values of $Q$ form the interval $(0, Q^*_+)$ in the first two cases and $(Q^*_-, 0)$ for the other two cases. In particular, in each of the cases, the base of the $T^3$-foliation is homeomorphic to a punctured disc. As $Q \to 0$, the Liouville torus approaches the singular set $N$. As $Q \to Q^*_\pm$, the torus shrinks into one of the degenerate 2-tori described above.

Thus, the base of the global Liouville foliation on $E_A^5 = \{H = 1\}$ can be considered as four discs glued together at their centres. All interior points of these discs except the centre correspond one-to-one to regular 3-dimensional Liouville tori, the boundary circles of the discs correspond to the families $L_i$ of degenerate 2-tori, and finally, the common centre of the discs corresponds to the singular set $N$.
The image of each family under the momentum map is a 2-disc with the center removed. This is exactly the situation when we can talk about Hamiltonian monodromy [9].
Theorem 1. For each family of Liouville 3-tori there exists a basis of cycles in the first homology group of the tori in which the Hamiltonian monodromy has the matrix
$$\begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix}.$$

Proof. This fact can be observed in different ways. We shall follow the definition of Hamiltonian monodromy and will explicitly compute the deformation of Liouville tori and the final gluing map. Consider an arbitrary Liouville 3-torus $T^3 = T^3_{Q_0,\alpha_0}$. In coordinates, this torus is given by three conditions:
$$\begin{aligned}
& E e^{2z\ln\lambda} p_u^2 + 2F p_u p_v + G e^{-2z\ln\lambda} p_v^2 + p_z^2 = 2,\\
& Q(p_u, p_v) = p_u p_v = Q_0,\\
& \alpha(p_u, p_v) = \frac{\ln\left(\sqrt{E/G}\,|p_u/p_v|\right)}{2\ln\lambda} = \alpha_0 \mod 1.
\end{aligned} \tag{7}$$
More precisely, these conditions define a disjoint union of two or four tori, which differ from each other by the signs of the momenta $p_u$ and $p_v$. We consider one of them, $T^3_{Q_0,\alpha_0}$, by putting for definiteness $p_u > 0$, $p_v > 0$. For our purposes we first need to explain why the above conditions indeed define a three-dimensional torus and to describe the basic cycles on this torus. Notice that the common level set (7) of the first integrals can be regarded from two slightly different points of view: as a subset in $T^*\tilde M^3$ and as one in $T^*M_A^3$. However, one can show that the natural projection $T^*\tilde M^3 \to T^*M_A^3$ restricted to this level set is a diffeomorphism (no points are glued together). Thus, in fact there is no real difference between these two points of view. In particular, instead of the conditions $p_u p_v = Q_0$, $\alpha(p_u, p_v) = \alpha_0 \bmod 1$ we may simply assume that the momenta $p_u$, $p_v$ themselves are constant. Then the conditions (7) can be rewritten as:
$$p_u = \mathrm{const}, \qquad p_v = \mathrm{const}, \qquad u \text{ and } v \text{ are arbitrary},$$
and
$$c_1 \cosh\bigl(2\ln\lambda\,(z + \alpha_0)\bigr) + p_z^2 = c_2, \tag{8}$$
where $c_1 = 2\sqrt{EG}\,|Q_0|$, $c_2 = 2 - 2FQ_0$. We see that the variables separate and the fact that this system defines a 3-torus becomes evident. Indeed, the variables $u$, $v$ run over a two-dimensional torus and the last equation defines a simple closed curve on the plane $\mathbb{R}^2(z, p_z)$. In other words, we have a natural splitting of $T^3_{Q_0,\alpha_0}$ into the direct product $T^2 \times S^1$. Thus, as basic cycles on $T^3_{Q_0,\alpha_0}$ we can take the cycles on $T^2(u, v)$ related to the original coordinate system $(x, y)$ (see above) and the third cycle defined by (8).

Now let us look at what happens to this torus if we change the parameters $Q_0$ and $\alpha_0$ in such a way that the point $F_A(T^3_{Q_0,\alpha_0})$ moves inside the image of the momentum mapping around the singular point $F_A(N) = (0, 0)$. It is easy to see that this deformation just means that we change the value of $\alpha$, while $Q$ can be chosen to remain constant:
$$Q(t) = Q_0, \qquad \alpha(t) = \alpha_0 + t, \qquad t \in [0, 1].$$
Consider the family of mappings
$$\varphi_t(u, v, z, p_u, p_v, p_z) = (u, v, z - t, e^{t\ln\lambda} p_u, e^{-t\ln\lambda} p_v, p_z).$$
It is not hard to see that the image of $T^3_{Q_0,\alpha_0}$ under $\varphi_t$ is exactly $T^3_{Q_0,\alpha_0+t}$ and $\varphi_t : T^3_{Q_0,\alpha_0} \to T^3_{Q_0,\alpha_0+t}$ is a diffeomorphism. In other words, $\varphi_t$ defines the deformation of Liouville tori we need. At the moment $t = 1$ the torus comes back to the initial position, i.e., $T^3_{Q_0,\alpha_0} = T^3_{Q_0,\alpha_0+1}$, and we obtain the monodromy map
$$\varphi_1 : T^3_{Q_0,\alpha_0} \to T^3_{Q_0,\alpha_0} = T^3_{Q_0,\alpha_0+1}.$$
Now our goal is to describe the corresponding automorphism of the first homology group:
$$\varphi_{1*} : H_1(T^3_{Q_0,\alpha_0}) = \mathbb{Z}^3 \to H_1(T^3_{Q_0,\alpha_0}) = \mathbb{Z}^3.$$
Using the identification (4) we see that the map $\varphi_1$ can be rewritten as follows:
$$\varphi_1 \begin{pmatrix} u \\ v \\ z \\ p_u \\ p_v \\ p_z \end{pmatrix} = \begin{pmatrix} \lambda u \\ \lambda^{-1} v \\ z \\ p_u \\ p_v \\ p_z \end{pmatrix}.$$
We see that the only transformation is related to the variables $u$ and $v$. Moreover, this transformation is exactly the original hyperbolic automorphism $A : T^2 \to T^2$. Taking into account the natural splitting $T^3_{Q_0,\alpha_0} = T^2(u, v) \times S^1(z, p_z)$ we conclude immediately that the monodromy matrix in the chosen basis is
$$\begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix}.$$
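The key property of the deformation $\varphi_t$ (it preserves $H$ and $Q$ and shifts $\alpha$ by $t$) can be checked numerically. The sketch below is only an illustration under stated assumptions: the Hamiltonian is taken as one half of the quadratic form appearing in (7), $\alpha$ is taken with the square root $\sqrt{E/G}$ and the absolute value $|p_u/p_v|$, and the numerical values of $E$, $F$, $G$, $\lambda$ and of the phase point are arbitrary.

```python
import math

E, F, G = 2.0, 0.5, 3.0                  # illustrative (assumed) coefficients
MU = math.log((3 + math.sqrt(5)) / 2)    # ln(lambda) for an illustrative lambda

def hamiltonian(point):
    """H, assumed to be half of the quadratic form in (7)."""
    u, v, z, pu, pv, pz = point
    return 0.5 * (E * math.exp(2 * MU * z) * pu ** 2 + 2 * F * pu * pv
                  + G * math.exp(-2 * MU * z) * pv ** 2 + pz ** 2)

def q_integral(point):
    return point[3] * point[4]

def alpha_integral(point):
    pu, pv = point[3], point[4]
    return math.log(math.sqrt(E / G) * abs(pu / pv)) / (2 * MU)

def phi(t, point):
    """The deformation phi_t of the text."""
    u, v, z, pu, pv, pz = point
    return (u, v, z - t, math.exp(t * MU) * pu, math.exp(-t * MU) * pv, pz)

point = (0.1, 0.2, 0.4, 0.7, 0.3, 0.5)
t = 0.37
image = phi(t, point)
print(hamiltonian(image) - hamiltonian(point))            # H is preserved
print(q_integral(image) - q_integral(point))              # Q is preserved
print(alpha_integral(image) - alpha_integral(point) - t)  # alpha is shifted by t
```

All three printed differences should vanish up to rounding errors.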
We conclude this section with the discussion of the geodesics on Sol-manifolds. They have different properties depending on the types of leaves of the Liouville foliation which they belong to. First consider the geodesics lying on Liouville tori of dimension three. They are characterized by the property that all momenta $p_u$, $p_v$ and $p_z$ differ from zero. More precisely, the signs of $p_u$ and $p_v$ always remain the same, whereas the sign of $p_z$ changes. This happens when $z$ reaches the value
$$z_\pm = \frac{\pm\cosh^{-1}\dfrac{h - F p_u p_v}{\sqrt{EG}\,|p_u p_v|}}{2\ln\lambda} - \alpha(p_u, p_v).$$
The two levels $z = z_+$ and $z = z_-$ are exactly the caustics of the Liouville torus that contains a given geodesic. The situation is quite similar to that on a surface of revolution where the motion takes place between two levels of $z$. It is easy to see that the distance $z_+ - z_-$ between these levels tends to infinity as $p_u p_v$ tends to zero. From this it follows that the corresponding geodesics rotate many times (along the base $S^1$), then turn back, after this go in the opposite direction, then turn back and so on. As $p_u p_v$ tends to zero the number of rotations in one direction until turning back (or, which is the same, between two caustics) increases up to infinity.
If $p_u p_v = 0$, then we are on the singular level. The corresponding geodesics have the following behaviour. If both $p_u$ and $p_v$ vanish, then we obtain the family of geodesics $u = \mathrm{const}$, $v = \mathrm{const}$, $z = t$. Such geodesics obviously form an invariant submanifold $N_+$ in $T^*M$ which is diffeomorphic to $M$. Exactly on this submanifold the geodesic flow is chaotic and has positive entropy. Indeed, the time-one map transforms each fibre $T^2_z$ into itself by means of the hyperbolic automorphism $A$. As is well known, the entropy of $A : T^2 \to T^2$ is $\ln\lambda > 0$. There is another invariant submanifold $N_-$ with the same properties formed by vertical geodesics going in the opposite direction: $u = \mathrm{const}$, $v = \mathrm{const}$, $z = -t$. From the viewpoint of the ambient geodesic flow $N_+$ and $N_-$ are hyperbolic invariant subsets. The stable manifold corresponding to $N_+$ is given by $p_v = 0$, the unstable one by $p_u = 0$. For $N_-$ the stable and unstable manifolds interchange. The geodesics satisfying the condition $p_v = 0$ asymptotically approach $N_+$ as $t \to +\infty$; in particular, $p_z \to +1$. But there is a moment $t = t_0$ when $p_z$ changes sign, so that for $t \to -\infty$ the geodesic approaches $N_-$. The geodesics satisfying $p_u = 0$ behave in the opposite way. In slightly different terms this structure can be described as follows: there are two hyperbolic submanifolds diffeomorphic to $M_A^3$, which are connected by four 4-dimensional separatrices, see Fig. 1.

Finally we would like to mention an interesting phenomenon which one would not expect from an integrable geodesic flow on a compact manifold. Namely, one of the action integrals diverges as the integral $Q \to 0$ with the energy fixed (see the calculations and footnote in Sect. 7). A more familiar scenario is that, on approaching the singular level, some of the cycles of the Liouville tori shrink, so that the actions stay finite. The fact that this is not true for Sol-manifolds when one approaches the singular (chaotic) level demonstrates once again the peculiar nature of this system.
To discuss the quantum case we will need some facts from the classical number theory, which we present in the next section.

4. SL(2, Z) and Binary Quadratic Forms

The content of this section is well-known (see e.g. [20, 21, 24]). Let
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \in SL(2, \mathbb{Z})$$
be an integer hyperbolic matrix. Hyperbolicity as before means that its eigenvalues are real and distinct. We would like to consider $A$ as the automorphism of the lattice $L = \mathbb{Z} \oplus \mathbb{Z} \subset \mathbb{R}^2$ by choosing some basis $e_1, e_2$ in this lattice. For any such $A$ we can define the following integer binary quadratic form $Q_A$ by the formula
$$Av \wedge v = Q_A(v)\, e_1 \wedge e_2, \tag{9}$$
where $v$ is a vector from $\mathbb{R}^2$. Explicitly, if $v = x e_1 + y e_2$ then
$$Q_A(x, y) = \det\begin{pmatrix} a_{11}x + a_{12}y & x \\ a_{21}x + a_{22}y & y \end{pmatrix} = -a_{21}x^2 + (a_{11} - a_{22})xy + a_{12}y^2. \tag{10}$$
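Formula (10) is easy to experiment with. The following minimal sketch (hypothetical helper names, integer matrices given as nested tuples) computes the coefficients of $Q_A$, its discriminant and its primitive part, and checks the invariance $Q_A(Av) = Q_A(v)$ on a few sample vectors; the matrix used is just an illustrative hyperbolic element of $SL(2, \mathbb{Z})$.

```python
from math import gcd

def form_of_matrix(A):
    """Coefficients (a, b, c) of Q_A = a x^2 + b x y + c y^2 from formula (10)."""
    (a11, a12), (a21, a22) = A
    return (-a21, a11 - a22, a12)

def evaluate(form, x, y):
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def apply(A, v):
    (a11, a12), (a21, a22) = A
    return (a11 * v[0] + a12 * v[1], a21 * v[0] + a22 * v[1])

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

def primitive_part(form):
    """Return (content, primitive form); the primitive form is fixed only up to sign."""
    a, b, c = form
    g = gcd(gcd(abs(a), abs(b)), abs(c))
    return g, (a // g, b // g, c // g)

A = ((1, 3), (3, 10))            # illustrative hyperbolic matrix, det = 1, trace = 11
Q = form_of_matrix(A)
print(Q, discriminant(Q))        # the discriminant equals trace^2 - 4
print(primitive_part(Q))
for v in [(1, 0), (0, 1), (2, -5), (7, 3)]:
    assert evaluate(Q, *apply(A, v)) == evaluate(Q, *v)   # invariance under A
```

The discriminant of the primitive part differs from that of $Q_A$ by the square of the content.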
It is easy to see from the definition that this form is invariant under the action of $A$: $Q_A(Av) = Q_A(v)$. Notice that $Q_A$ has the discriminant
$$D = (a_{11} - a_{22})^2 + 4a_{12}a_{21} = (a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}a_{21}) = (a_{11} + a_{22})^2 - 4,$$
which is exactly the discriminant of the characteristic equation of $A$: $\lambda^2 - (a_{11} + a_{22})\lambda + 1 = 0$. In particular, since $A$ is hyperbolic the form $Q_A$ is indefinite. Note that the discriminant $D$ cannot be a perfect square. In general the coefficients of the quadratic form $Q_A$ may have a common factor. Let
$$\hat Q_A(x, y) = ax^2 + bxy + cy^2 \tag{11}$$
be its primitive form after division of $Q_A$ by the largest common factor. It is defined correctly only up to a sign. Thus to each integer unimodular hyperbolic matrix $A$ we relate an indefinite integer primitive quadratic form $\hat Q_A$.

Conversely, suppose we have such a form $Q(x, y) = ax^2 + bxy + cy^2$. We would like to describe all $A$ from $SL(2, \mathbb{Z})$ which preserve this form. Such $A$ are called the automorphs of $Q$. Let $d = b^2 - 4ac$ be the discriminant of $Q$, which we assume not to be a perfect square, and consider the corresponding Diophantine equation called Pell's equation:
$$X^2 - dY^2 = 4. \tag{12}$$
Then the group of automorphs consists of matrices of the form
$$A = \pm\begin{pmatrix} \dfrac{X - bY}{2} & -cY \\ aY & \dfrac{X + bY}{2} \end{pmatrix},$$
where $(X, Y)$ are the solutions of Pell's equation. Modulo $\pm I$ this group is cyclic with generator
$$A_0 = \begin{pmatrix} \dfrac{X_0 - bY_0}{2} & -cY_0 \\ aY_0 & \dfrac{X_0 + bY_0}{2} \end{pmatrix}, \tag{13}$$
where $(X_0, Y_0)$ is the fundamental solution of this equation. Recall that $(X_0, Y_0)$ is the fundamental solution of Pell's equation if $X_0 > 0$, $Y_0 > 0$ and $X_0 + \sqrt{d}\,Y_0$ is minimal among all such solutions. The classical result about Pell's equation says that all other solutions can be found from the relation
$$\frac{X + \sqrt{d}\,Y}{2} = \pm\left(\frac{X_0 + \sqrt{d}\,Y_0}{2}\right)^n,$$
where $n = 0, 1, \ldots$. One can find the fundamental solution from the continued fraction of $\sqrt{d}$. This structure of the solutions of Pell's equation induces the cyclic group structure for the automorphs. Notice that the form $Q_A$ corresponding to the matrix (13) has the form $Q = Y_0(ax^2 + bxy + cy^2)$. Let us call a hyperbolic element $A$ from $SL(2, \mathbb{Z})$ primitive if it cannot be represented as a power of any other element from $SL(2, \mathbb{Z})$. Thus we have described a natural correspondence between the primitive binary indefinite forms $Q$ and primitive elements $A$ from $SL(2, \mathbb{Z})$. In particular, it helps us to answer the question whether a given integer unimodular matrix $A$ is primitive or, if not, which power of a primitive matrix it is.

5. Spectrum and Eigenfunctions of the Laplace-Beltrami Operator

Let us now discuss the quantum geodesic problem on the Sol-manifold $M_A^3$:
$$-\Delta\psi = E\psi, \tag{14}$$
where $\Delta$ is the Laplace-Beltrami operator on $M_A^3$ and $\psi = \psi(P, E)$, $P \in M_A^3$. In coordinates $(u, v, z)$ the Laplace-Beltrami operator has the following explicit form:
$$\Delta = E e^{2z\ln\lambda}\frac{\partial^2}{\partial u^2} + 2F\frac{\partial^2}{\partial u\,\partial v} + G e^{-2z\ln\lambda}\frac{\partial^2}{\partial v^2} + \frac{\partial^2}{\partial z^2}. \tag{15}$$
This is a self-adjoint operator in the Hilbert space $L^2(M_A^3)$, where the integration measure on $M_A^3$ is induced by the Riemannian metric (3). In both $(x, y, z)$ and $(u, v, z)$ coordinate systems the corresponding measure $d\mu$ is proportional to the standard Lebesgue measure on $\mathbb{R}^3$. Because the coefficients of $\Delta$ depend only on $z$ it is quite natural to separate variables and look for the eigenfunctions of $\Delta$ of the form
$$\Psi_\gamma(u, v, z) = e^{2\pi i(\gamma, w)} f(z),$$
where $\gamma$ is an element of the dual lattice $\Gamma^*$ corresponding to the $T^2$-fibres and $w = (u, v)$ (so the scalar product $(\gamma, w)$ is defined modulo $\mathbb{Z}$). By substituting into the Schrödinger equation (14), (15) we get
$$\Delta\Psi_\gamma = \left(\frac{\partial^2 f}{\partial z^2} - 8\pi^2\sqrt{EG}\,|Q(\gamma)|\left[\cosh\bigl(2\ln\lambda\,(z + \alpha(\gamma))\bigr) + \frac{F\,\mathrm{sgn}\,Q(\gamma)}{\sqrt{EG}}\right] f\right) e^{2\pi i(\gamma, w)},$$
where $Q(\gamma) = (\gamma, e_u)(\gamma, e_v)$ is a quadratic form on the lattice $\Gamma^*$, and
$$\alpha(\gamma) = \frac{\ln\left(\sqrt{E/G}\,\left|\dfrac{(\gamma, e_u)}{(\gamma, e_v)}\right|\right)}{2\ln\lambda}.$$
Here $e_u$ and $e_v$ are the eigenvectors of $A$ related to the eigenvalues $\lambda$ and $\lambda^{-1}$ respectively, and the basis $e_u, e_v$ is assumed to be positively oriented. Notice that $\alpha$ is the same as before in (5) if we replace $p_u$ by $(\gamma, e_u)$ and $p_v$ by $(\gamma, e_v)$.
To clarify the meaning of the coefficient in front of the cosh let us consider the basis $e_u^*, e_v^*$ in $\mathbb{R}^{2*}$ dual to $e_u, e_v$. The vectors $e_u^*$ and $e_v^*$ are also the eigenvectors of $A^*$ with the eigenvalues $\lambda$ and $\lambda^{-1}$ respectively. By definition we have $\gamma = (\gamma, e_u)e_u^* + (\gamma, e_v)e_v^*$. Since $Q(\gamma)$ is obviously invariant under the action of $A^*$ it is natural to compare it with the binary form $Q_{A^*}$ defined in Sect. 3. We have
$$A^*\gamma \wedge \gamma = \bigl(\lambda(\gamma, e_u)e_u^* + \lambda^{-1}(\gamma, e_v)e_v^*\bigr) \wedge \bigl((\gamma, e_u)e_u^* + (\gamma, e_v)e_v^*\bigr) = (\gamma, e_u)(\gamma, e_v)(\lambda - \lambda^{-1})\, e_u^* \wedge e_v^*.$$
Let $l_1, l_2$ be a positively oriented basis in the dual lattice $\Gamma^*$, then by definition $A^*\gamma \wedge \gamma = Q_{A^*}(\gamma)\, l_1 \wedge l_2$. From these calculations and from the equalities $E = |e_u^*|^2$, $G = |e_v^*|^2$ it follows that
$$\sqrt{EG}\,|Q(\gamma)| = c\,|Q_{A^*}(\gamma)|,$$
where
$$c = c(A; E, F, G) = \frac{A(\Gamma^*)}{\sqrt{D}\,\sin\theta} = \frac{1}{\sqrt{D}\,A(T^2)\sin\theta}. \tag{16}$$
Here $A(\Gamma^*)$ is the area of the dual basic parallelogram $(e_1^*, e_2^*)$ (which is the inverse of the area of the fibre $T^2$), $D = (\lambda - \lambda^{-1})^2$ is the discriminant of the characteristic equation of the matrix $A$ (or equivalently $A^*$), and $\theta$ is the angle between $e_u^*$ and $e_v^*$. Thus we have proved the following

Proposition 2. A function $\Psi = e^{2\pi i(\gamma, w)} f(z)$ satisfies Eq. (14) if and only if $f(z)$ satisfies the modified Mathieu equation
$$\left(-\frac{d^2}{dz^2} + |\nu(\gamma)|\cosh 2\mu\bigl(z + \alpha(\gamma)\bigr)\right) f(z) = \Lambda f(z), \tag{17}$$
where $\mu = \ln\lambda$, $\nu(\gamma) = 8\pi^2 c\,Q_{A^*}(\gamma)$ and $\alpha(\gamma)$ is given above. The eigenvalues $E$ and $\Lambda$ are related by the shift
$$E = \Lambda + \nu(\gamma)\cos\theta. \tag{18}$$
Recall that the modified Mathieu equation is the cosh-version of the standard Mathieu equation
$$\frac{d^2 y}{dx^2} + (a\cos 2\mu x + b)\,y = 0.$$
Its solutions are known as modified Mathieu functions (see e.g. [33, 34]). They appear also in the theory of Coulomb spheroidal functions [17], where one can find some related numerical results (see also [18]). Let $\Lambda = \Lambda_k(\nu)$, $k = 1, 2, \ldots$ be the spectrum of the corresponding modified Mathieu operator
$$M = -\frac{d^2}{dz^2} + |\nu|\cosh 2\mu z$$
and $f_{\gamma,k}(z)$ be the corresponding solutions of (17).
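The spectrum $\Lambda_k(\nu)$ of the modified Mathieu operator $M$ has no elementary closed form, but it is easy to approximate numerically. The following is a minimal sketch, not a method used in the paper: it discretizes $M$ by second-order finite differences on an interval $[-L, L]$ with Dirichlet boundary conditions (an assumption that is harmless when the potential at $\pm L$ is far above the levels of interest) and locates the eigenvalues of the resulting tridiagonal matrix by Sturm-sequence bisection. The function name and the default grid parameters are hypothetical.

```python
import math

def mathieu_levels(nu, mu, count=4, half_width=3.0, points=1500):
    """Approximate the lowest `count` eigenvalues of -d^2/dz^2 + |nu| cosh(2 mu z)."""
    h = 2.0 * half_width / (points + 1)
    off2 = (1.0 / h ** 2) ** 2      # square of the off-diagonal entry -1/h^2
    diag = [2.0 / h ** 2 + abs(nu) * math.cosh(2.0 * mu * (-half_width + (i + 1) * h))
            for i in range(points)]

    def below(x):
        """Number of eigenvalues smaller than x (Sturm sequence count)."""
        n_neg, q = 0, 1.0
        for i, d in enumerate(diag):
            q = d - x - (off2 / q if i else 0.0)
            if abs(q) < 1e-12:      # avoid division by zero in the next step
                q = -1e-12
            if q < 0:
                n_neg += 1
        return n_neg

    # Gershgorin bounds for the whole spectrum of the tridiagonal matrix
    lo0 = min(diag) - 2.0 / h ** 2
    hi0 = max(diag) + 2.0 / h ** 2
    levels = []
    for k in range(count):
        lo, hi = lo0, hi0
        for _ in range(80):         # bisection for the (k+1)-th smallest eigenvalue
            mid = 0.5 * (lo + hi)
            if below(mid) > k:
                hi = mid
            else:
                lo = mid
        levels.append(0.5 * (lo + hi))
    return levels

print(mathieu_levels(50.0, 1.0))
```

For small $|\nu|$ the eigenfunctions spread out, so `half_width` has to be increased accordingly (keeping $2\mu L$ moderate so that the hyperbolic cosine does not overflow).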
Thus, to each element $\gamma$ of the dual lattice $\Gamma^*$ we associate the functions $\Psi_{\gamma,k}(u, v, z) = e^{2\pi i(\gamma, w)} f_{\gamma,k}(z)$. The problem with these functions is that they are well defined on the covering space $\tilde M^3 = T^2 \times \mathbb{R}$ but not on the Sol-manifold $M_A^3$ itself, because they are not invariant with respect to the transformation (1), (2). One can try to construct the genuine eigenfunctions of $\Delta$ on $M_A^3$ by averaging these functions with respect to the action of $\mathbb{Z}$ on $\tilde M^3$ generated by this transformation. It turns out that the averaging procedure works. To show this let us consider instead of $\Psi_{\gamma,k}(u, v, z)$ the following sum:
$$\hat\Psi_{\gamma,k} = \sum_{n\in\mathbb{Z}} \Psi_{\gamma,k}(\lambda^n u, \lambda^{-n} v, z + n) = \sum_{n\in\mathbb{Z}} \Psi_{A^{*n}\gamma,k}(u, v, z). \tag{19}$$
Because of the fast decay of the eigenfunctions $f_{\gamma,k}(z)$ this sum is absolutely convergent. It is easy to see that it defines a well-defined function on $M_A^3$, which is an eigenfunction of the Laplace-Beltrami operator $\Delta$. The eigenfunctions $\hat\Psi_{\gamma,k}(u, v, z)$ on $M_A^3$ actually depend only on the orbits $[\gamma] = \{A^{*n}(\gamma)\}_{n\in\mathbb{Z}}$ with respect to the action of $A^*$ on $\Gamma^*$: $\hat\Psi_{\gamma,k}(u, v, z) = \Psi_{[\gamma],k}(u, v, z)$. We should also consider separately the eigenfunctions related to $\gamma = 0$. It is easy to see that the corresponding eigenfunctions have the very simple form
$$\Psi_{0,l} = e^{2\pi i l z}, \qquad l \in \mathbb{Z}, \tag{20}$$
with the eigenvalues $E_l = (2\pi)^2 l^2$.

Theorem 2. The eigenfunctions of the Laplace-Beltrami operator $\Psi_{[\gamma],k}(u, v, z)$, $[\gamma] \in (\Gamma^* \setminus \{0\})/A^*$, and $\Psi_{0,l}(z)$ form a complete basis in $L^2(M_A^3)$.

Proof. The independence and orthogonality of these functions are obvious. The only thing we have to verify is the completeness. To prove this we need to show that any smooth function $\Phi : M_A^3 \to \mathbb{R}$ which is orthogonal to each eigenfunction from the list is, in fact, zero. Consider such a function $\Phi(w, z)$ on the covering space $\tilde M^3$ and expand it as a Fourier series (with respect to $w$):
$$\Phi(w, z) = \sum_{\gamma\in\Gamma^*} e^{2\pi i(\gamma, w)} a_\gamma(z)$$
with some smooth coefficients $a_\gamma(z)$, $z \in \mathbb{R}$.

Lemma 1. For all $\gamma \neq 0$ the functions $a_\gamma(z)$ have fast decay at infinity and thus belong to $L^2(\mathbb{R})$.

Proof. Since $\Phi$ is invariant with respect to the transformation (1), we have $\Phi(w, z) = \Phi(Aw, z + 1)$. Hence
$$\sum_{\gamma\in\Gamma^*} e^{2\pi i(\gamma, w)} a_\gamma(z) = \sum_{\gamma\in\Gamma^*} e^{2\pi i(\gamma, Aw)} a_\gamma(z + 1) = \sum_{\gamma\in\Gamma^*} e^{2\pi i(A^*\gamma, w)} a_\gamma(z + 1).$$
Thus the Fourier coefficients satisfy the following property:
$$a_\gamma(z + 1) = a_{A^*\gamma}(z),$$
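or, more generally,
$$a_\gamma(z + n) = a_{A^{*n}\gamma}(z), \qquad n \in \mathbb{Z}.$$
Since the Fourier coefficients $a_\gamma$ of a smooth function decay fast for large $\gamma$ and $A^{*k}\gamma$ for $\gamma \neq 0$ tends to infinity, we see that the functions $a_\gamma(z)$ decay very fast and thus belong to $L^2(\mathbb{R})$.

Now suppose that $\Phi(w, z)$ is orthogonal to the eigenfunction $\Psi_{[\gamma_0],k}(u, v, z) = \sum_{n\in\mathbb{Z}} \Psi_{A^{*n}\gamma_0,k}(u, v, z)$. Since the measure on $M_A^3$ is proportional to the standard Lebesgue measure $du\,dv\,dz$ we have
$$\begin{aligned}
0 &= \langle \Phi(w, z), \Psi_{[\gamma_0],k}(w, z)\rangle = \int_{M_A^3} \Phi(w, z)\,\bar\Psi_{[\gamma_0],k}(w, z)\,d\sigma \\
&= \int_0^1 \sum_{\gamma\in\Gamma^*}\sum_{n\in\mathbb{Z}} \left(\int_{T^2} e^{2\pi i(\gamma, w)} e^{-2\pi i(A^{*n}\gamma_0, w)}\,du\,dv\right) a_\gamma(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz \\
&= \int_0^1 \sum_{n\in\mathbb{Z}} \left(\int_{T^2} e^{2\pi i(A^{*n}\gamma_0, w)} e^{-2\pi i(A^{*n}\gamma_0, w)}\,du\,dv\right) a_{A^{*n}\gamma_0}(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz \\
&= A(T^2)\int_0^1 \sum_{n\in\mathbb{Z}} a_{A^{*n}\gamma_0}(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz.
\end{aligned}$$
We now use the property that $f_{\gamma,k}(z + n) = f_{A^{*n}\gamma,k}(z)$ and $a_\gamma(z + n) = a_{A^{*n}\gamma}(z)$, $n \in \mathbb{Z}$, to conclude that
$$\int_0^1 \sum_{n\in\mathbb{Z}} a_{A^{*n}\gamma_0}(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz = \int_0^1 \sum_{n\in\mathbb{Z}} a_{\gamma_0}(z + n)\, f_{\gamma_0,k}(z + n)\,dz = \int_{-\infty}^{+\infty} a_{\gamma_0}(z)\, f_{\gamma_0,k}(z)\,dz.$$
Thus, the Fourier coefficients $a_{\gamma_0}(z)$ for $\gamma_0 \neq 0$ belong to $L^2(\mathbb{R})$ and at the same time are orthogonal to all the functions $f_{\gamma_0,k}(z)$, which form a complete basis in $L^2(\mathbb{R})$. Hence for $\gamma_0 \neq 0$ the coefficients $a_{\gamma_0}(z) \equiv 0$. This means that the function $\Phi$ must be of the form $\Phi(w, z) = a(z)$, where $a(z)$ is periodic with period 1. Now using orthogonality to the functions (20) we conclude that $a(z)$ must be identically zero.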
Corollary 1. The spectrum of the Laplace-Beltrami operator on Sol-manifolds consists of two parts: the trivial part
$$E = E_l = 4l^2\pi^2, \qquad l = 0, 1, \ldots$$
corresponding to the eigenfunctions (20) and the non-trivial part
$$E = E_{k,[\gamma]} = \Lambda_k(\nu([\gamma])) + \nu([\gamma])\cos\theta, \qquad k = 1, 2, \ldots, \qquad [\gamma] \in (\Gamma^* \setminus \{0\})/A^*$$
related to the modified Mathieu equation (17). The multiplicities of the trivial eigenvalues are 2 except for the ground state $E = 0$, which has multiplicity 1. The multiplicities of the non-trivial part of the spectrum are much more interesting and the answer depends on the arithmetical properties of the gluing map $A$. We discuss this in the next section.
Fig. 2. Fundamental domain of the lattice in $(p_x, p_y)$ with $|Q| \le 30^2$ for the cat-map (axes $p_x$, $p_y$ and $p_u$, $p_v$). The hyperbola $Q = -11^2 = -121$ illustrates the first example of a non-trivial degeneracy
6. Multiplicities of the Eigenvalues and Number Theory

As one can see from the previous section the eigenvalue of $\Psi_{[\gamma],k}(u, v, z)$ depends on $\gamma$ only via $Q_{A^*}(\gamma)$. Thus the calculation of the multiplicity (see footnote 1) is reduced to the classical number theoretic problem of finding the number $N_Q(n)$ of integer solutions of the equation $Q(x, y) = ax^2 + bxy + cy^2 = n$ for a primitive indefinite quadratic form $Q$, different modulo its automorphs. Figure 2 illustrates this for the cat-map $A$ with $Q = -x^2 + xy + y^2$. For forms $Q$ with certain discriminants there exists an effective formula which permits the computation of $N_Q(n)$. To be more precise we need the following notion. We say that two forms $Q$ and $Q'$ are equivalent if there exists a transformation from $SL(2, \mathbb{Z})$ mapping one into the other. It is easy to see that two equivalent forms must have the same discriminant $d = b^2 - 4ac$. The converse is not true: there can be non-equivalent forms with the same discriminant, see the examples after Theorem 3. Let $h(d)$ be the number of classes of primitive forms with discriminant $d$. Note that the discriminant $d = b^2 - 4ac$ is always 0 or 1 modulo 4 and we assume as usual that it is not a perfect square.

Footnote 1. Here we consider generic values of the parameters to avoid additional accidental coincidences. Any such degeneracy can be removed by changing the metric such that $c$ in Eq. (16) and hence $\Lambda$ are constant while $\theta$ and hence $E$ are changed.
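The counting problem for the cat-map form $Q = -x^2 + xy + y^2$ can be explored by brute force. The sketch below is only an illustration under stated assumptions: it takes the automorph group to be $\{\pm A^n\}$ with $A$ the cat-map matrix, picks in every orbit the lattice point of minimal Euclidean norm as a canonical representative (this works because the norm is a strictly convex function of $n$ along an orbit of this symmetric matrix), and searches a box which is large enough to contain all such representatives. The helper names are hypothetical.

```python
import math

A = ((2, 1), (1, 1))         # cat map
A_INV = ((1, -1), (-1, 2))   # its inverse

def q(x, y):
    return -x * x + x * y + y * y

def act(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def norm2(v):
    return v[0] * v[0] + v[1] * v[1]

def canonical(v):
    """Canonical representative of the orbit of v under the group {+-A^n}."""
    while True:
        best = min((act(A, v), act(A_INV, v)), key=norm2)
        if norm2(best) < norm2(v):
            v = best           # descend towards the point of minimal norm
        else:
            break
    # the minimum may be attained at two neighbouring points; also factor out the sign
    ties = [w for w in (v, act(A, v), act(A_INV, v)) if norm2(w) == norm2(v)]
    return max(max(w, (-w[0], -w[1])) for w in ties)

def count_orbits(n):
    """Number of solutions of q(x, y) = n modulo the automorphs, for n != 0."""
    bound = 2 * math.isqrt(abs(n)) + 2
    reps = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if q(x, y) == n:
                reps.add(canonical((x, y)))
    return len(reps)

for n in (1, 2, 11, 121, -121):
    print(n, count_orbits(n))
```

The value obtained for $n = -121$ corresponds to the hyperbola $Q = -11^2$ singled out in Fig. 2.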
Remark. One should distinguish $h(d)$ and the class number of ideals in the quadratic number field $\mathbb{Q}(\sqrt{d})$. They coincide only if the so-called negative Pell equation $X^2 - dY^2 = -4$ has a solution; otherwise $h(d)$ is twice as big (see e.g. [15], Chapter 16). The last property can be reformulated in terms of the period of the continued fraction expansion of $\sqrt{d}$, but a more explicit description is unknown.

If $h(d) = 1$ then all forms with discriminant $d$ are equivalent. In that case there is the following remarkable formula for the number $N_Q(n)$ when $n$ is positive and coprime with $d$:
$$N_Q(n) = N_d(n) = \sum_{k|n}\left(\frac{d}{k}\right), \tag{21}$$
where the sum is taken over all divisors of $n$ and $\left(\frac{d}{k}\right)$ is the standard Kronecker symbol (see Landau [20], Chapter IV.4). The Kronecker symbol is a real character modulo $d$, which has the following properties determining it uniquely:
(1) If $d$ and $k$ are not coprime then $\left(\frac{d}{k}\right) = 0$;
(2) If $d$ and $k$ are coprime then $\left(\frac{d}{k}\right) = \pm 1$;
(3) $\left(\frac{d}{kl}\right) = \left(\frac{d}{k}\right)\left(\frac{d}{l}\right)$;
(4) for an odd prime $p$ which is not a divisor of $d$, $\left(\frac{d}{p}\right)$ coincides with the Legendre symbol, which is 1 if $d$ is a quadratic residue modulo $p$ and $-1$ otherwise;
(5) $\left(\frac{d}{2}\right)$ is 1 if $d$ has residue 1 modulo 8 and $-1$ if it has residue 5 modulo 8.
For its computation one can use the following generalisation of the Law of Quadratic Reciprocity: if $p$, $q$ are coprime positive odd numbers then
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$
Here is the list of the discriminants $d$ up to 100 with $h(d) = 1$, see [15]:
5, 8, 13, 17, 20, 29, 37, 41, 52, 53, 61, 65, 68, 73, 85, 89, 97.
It is believed that there are infinitely many fundamental discriminants with $h(d) = 1$, but it is still an open problem. Notice that for positive definite forms it is known that there are only 9 fundamental discriminants with $h(d) = 1$, as was conjectured by Gauss, namely $d = -3, -4, -7, -8, -11, -19, -43, -67, -163$.

In general, if $h(d) > 1$ the right-hand side of the formula (21) gives the total number of representations of $n$ by all non-equivalent forms with discriminant $d$. An interesting case is when the ideal class number of $d$ is 1 but $h(d) = 2$. In that case we have only two non-equivalent forms with discriminant $d$: $Q$ and $-Q$, and the formula (21) gives the number of the solutions of the equation $|Q| = n$. The first corresponding discriminants are:
12, 21, 24, 28, 32, 33, 44, 45, 48, 56, 57, 69, 72, 76, 77, 80, 84, 88, 92, 93
(see [15]). The only discriminants $< 100$ not listed in either table above are 40, 60, 85, 96 with $h(d) = 2, 4, 2, 4$, respectively. Note that most of these discriminants $d$ are not of
the form $D = t^2 - 4$, but they can still be obtained from $A \in SL_2(\mathbb{Z})$ because $D/d$ may be an arbitrary square.

Now we are ready to describe the multiplicities of the eigenvalues $E_{k,[\gamma]}$. First we should take into account that the gluing map $A \in SL(2, \mathbb{Z})$ and the corresponding form $Q = Q_{A^*}$ may be non-primitive. Let us define the positive integers $r = r(A)$ and $l = l(A)$ from the relations $A = A_0^r$ and $Q_{A^*} = l\hat Q$, where $A_0 \in SL(2, \mathbb{Z})$ and $\hat Q = \hat Q_{A^*}$ are primitive.

Theorem 3. The multiplicity $m$ of the eigenvalue $E_{k,[\gamma]}$ of the Laplace-Beltrami operator for generic values of the parameters in the metric (3) is
$$m(\gamma) = 2r(A)\,N_{Q^*}(n),$$
where $n = Q^*(\gamma) = l(A)^{-1}Q_{A^*}(\gamma)$. When the discriminant $d$ of the form $Q^*$ has class number 1 and $n$ is coprime with $d$ then $N_{Q^*}(n)$ can be computed using the formula (21).

Examples.
1. Let
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$$
be the so-called cat-map. Then $A^* = A$ and $Q_A = Q_{A^*} = -(x^2 - xy - y^2)$ are both primitive. The discriminant $D = d = 5$ has class number 1, so one can use the formula (21) to compute the multiplicity of the corresponding $E_{k,[\gamma]}$. One can check that this leads to the formula $m = 2(N_{\pm 1}(n) - N_{\pm 2}(n))$, where $N_{\pm 1}(n)$ and $N_{\pm 2}(n)$ are the numbers of divisors of $n = Q_A(\gamma)$ which have respectively the residues $\pm 1$ and $\pm 2$ modulo 5. This example shows that the multiplicities of the eigenvalues can be as big as we like: for example for $n = 11^M$ the multiplicity is $M + 1$; for $n$ a product of $M$ distinct primes all $\pm 1 \bmod 5$ the multiplicity is $2^M$.
2. The matrix
$$A = \begin{pmatrix} 1 & 3 \\ 3 & 10 \end{pmatrix}$$
corresponds to $d = 13$ with $D = 3^2 d$ and $Q_{A^*} = -3(x^2 - 3xy - y^2)$, so $l(A) = 3$.
3. For
$$A = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}$$
we have $d = 8$ with $D = 2^2 d$ and $Q_{A^*} = Q_A = -2(x^2 - 2xy - y^2)$, so $l(A) = 2$. This example shows that $l(A)$ in general is not directly related to the largest square divisor of $D$.
4. For
$$A = \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix},$$
$d = D = 21$ and $Q_{A^*} = -(3x^2 + 3xy - y^2)$, $l(A) = 1$. Here $h(d) = 2$, but the non-equivalent forms simply differ by a sign.
5. The matrices
$$A_1 = \begin{pmatrix} 1 & 6 \\ 6 & 37 \end{pmatrix} \quad\text{and}\quad A_2 = \begin{pmatrix} 7 & 18 \\ 12 & 31 \end{pmatrix}$$
correspond to $d = 40$ with $D = 6^2 d$, $l(A_i) = 6$. Then $Q_{A_1^*} = -6(x^2 + 6xy - y^2)$ and $Q_{A_2^*} = -6(3x^2 + 4xy - 2y^2)$ are the two corresponding (non-trivially) non-equivalent forms.

Remark. In the case when $h(d)$ is larger than 1 in general we do not have a simple formula for the multiplicities for a particular Sol-manifold but only for the disjoint union of Sol-manifolds with non-equivalent forms of given discriminant $d$. The fact that the multiplicities are large and not sensitive to the change of the parameters in the metric seems to be remarkable. A possible explanation of the rigidity of multiplicities for Sol-manifolds is in the hyperbolicity hidden in the topology of the manifolds.

Remark. The same numbers $N_Q(n)$ appear in the harmonic analysis on Sol-manifolds as the multiplicities of the irreducible Sol-representations in $C^\infty(M_A^3)$ (see Chapter 1 in [6]). Although this fact has a similar origin it does not explain the degeneracy of the spectrum of $\Delta$.

It is interesting to compare the Sol-case with the spectra of flat tori $T^2$. It is easy to see that in the last case the answer will depend drastically on the metric parameters (or equivalently, on the geometry of the basic parallelogram). For example, if it is a square then the spectrum up to a multiple is given by the values of the standard quadratic form $n = x^2 + y^2$, and the multiplicities are given by Gauss's famous formula $m = 4(N_1(n) - N_3(n))$, where $N_1(n)$ and $N_3(n)$ are the numbers of the divisors of $n$ with the residues 1 and 3 modulo 4 respectively. If however the basic parallelogram is generic then all multiplicities are 2 (which is due to the central symmetry of the problem).

7. Semiclassical Analysis and Weyl's Law

In this section we use semiclassical arguments to investigate the spectrum of Sol-manifolds in two limiting cases when $c$ is either small ("fat Sol-manifold") or large. In particular we will show that for fat Sol-manifolds one can "hear the values of the indefinite quadratic form $Q$".
It is instructive to see first how our exact calculation of the spectrum agrees with the famous Weyl's law [32], which says that for a quantum system the number $N(\Lambda)$ of eigenvalues $E \le \Lambda$ for large $\Lambda$ asymptotically is equal (up to a factor $(2\pi)^{-n}$) to the volume of the domain in the classical phase space with energy less than $\Lambda$. For our Laplace-Beltrami operator (15) this means that
$$N(\Lambda) \sim \frac{\mathrm{Vol}(M_A^3)}{(2\pi)^3}\,\frac{4}{3}\pi\Lambda^{3/2} = \frac{A(T^2)}{(2\pi)^3}\,\frac{4}{3}\pi\Lambda^{3/2}, \tag{22}$$
where $\mathrm{Vol}(M_A^3)$ is the volume of our Sol-manifold (which equals the area of the fibre $A(T^2)$ since the length in $z$-direction was assumed to be 1).
Let us now count the eigenvalues $E$ using the results of Sect. 5. Let us assume for simplicity that $\cos\theta = 0$, so besides the trivial part they coincide with the eigenvalues $\Lambda$ of the Mathieu operator
$$M = -\frac{d^2}{dz^2} + |\nu|\cosh 2\mu z, \tag{23}$$
where as before
$$\nu = \frac{8\pi^2 Q}{\sqrt{D}\,A(T^2)\sin\theta}, \tag{24}$$
and $Q$ is the corresponding binary quadratic form. First of all let us use the well-known fact from number theory (see e.g. [15]) that for large $Q_0$ the number of lattice points (modulo $A$) with values of $|Q|$ less than $Q_0$ is proportional to the area of the fundamental domain up to $Q_0$:
$$M(Q_0) \sim 4\frac{\mu}{\sqrt{D}}\,Q_0.$$
The factor of four counts lattice points related by the symmetry given by changing the sign of both $p_u$ and $p_v$ and also accounts for the states in the quadrants where $Q$ is of opposite sign. For a fixed value of $Q$ (and hence $\nu$) there is a whole line of eigenvalues of the Mathieu operator (23). The number of these eigenvalues up to energy $\Lambda$ for large $\Lambda$ is given asymptotically by the action integral
$$I(\Lambda, Q) = \frac{1}{2\pi}\oint\sqrt{\Lambda - |\nu|\cosh(2\mu z)}\,dz,$$
which is of course the area of the domain in the phase plane with energy less than $\Lambda$ divided by $2\pi$. This can be simplified to
$$\frac{2\pi\mu}{\sqrt{\Lambda}}\,I = \oint\sqrt{1 - g\cosh 2\zeta}\,d\zeta, \qquad g = |\nu|/\Lambda.$$
With $\xi = \cosh(2\zeta)$ this becomes a standard elliptic integral (see e.g. [33])
$$\frac{2\pi\mu}{\sqrt{\Lambda}}\,I = 2\int_1^{1/g}\frac{\sqrt{1 - g\xi}}{\sqrt{\xi^2 - 1}}\,d\xi = 4\sqrt{1 + g}\,\bigl(K(k) - E(k)\bigr), \qquad k^2 = \frac{1 - g}{1 + g}. \tag{25}$$
Let us denote this expression $f(g)$. Thus we see that the total number of states up to energy $\Lambda$ is
$$N(\Lambda) \sim \int M'(Q)\,I(\Lambda, Q)\,dQ = 2\,\frac{\Lambda^{3/2}}{C\mu\pi}\,\frac{\mu}{\sqrt{D}}\int_0^1 f(g)\,dg,$$
where $C = \dfrac{8\pi^2}{\sqrt{D}\,A(T^2)\sin\theta}$. The integral over $g$ is best performed by treating it as a double integral over $g$ and $\xi$. Introducing $\eta = g\xi$ and performing the $\eta$-integral first gives
$$\int_0^1 f(g)\,dg = \frac{4}{3}\int_1^\infty\frac{d\xi}{\xi\sqrt{\xi^2 - 1}} = \frac{2}{3}\pi.$$
Thus we have
$$N(\Lambda) \sim \Lambda^{3/2}\,\frac{A(T^2)}{(2\pi)^3}\,\frac{4\pi}{3}\sin\theta, \tag{26}$$
which agrees with Weyl's formula (22) when $\theta = \pi/2$. For general $\theta$ we have $\Lambda = E - \nu\cos\theta$ by (18). In that case we need to compute
$$X_\pm = \iint\sqrt{1 - g(\cosh 2z \pm \cos\theta)}\,dz\,dg$$
over the domain $0 \le g \le 1/(1 \pm \cos\theta)$ and $|z| \le z_0$, where $z_0$ is the smallest positive root of the integrand. The transformation
$$\eta = g(\cosh 2z \pm \cos\theta), \qquad \xi = \cosh 2z$$
folds the integration region to the strip $\xi > 1$ and $0 \le \eta \le 1$ and the integral becomes
$$X_\pm = 2\iint\frac{\sqrt{1 - \eta}}{(\xi \pm \cos\theta)\sqrt{\xi^2 - 1}}\,d\eta\,d\xi.$$
The integral over $\eta$ is easily done as before while the integral over $\xi$ evaluates to
$$X_\pm = \frac{4}{3}\,\frac{\frac{\pi}{2} \pm \left(\theta - \frac{\pi}{2}\right)}{\sin\theta}.$$
Hence the sum of the contributions from the two cases of positive and negative $Q$ is
$$\sin\theta\,(X_+ + X_-) = \frac{4}{3}\pi$$
as before. This computation shows that $\theta$ determines the relative number of states between the regions with positive and negative $Q$, namely $X_+/X_- = \theta/(\pi - \theta)$.

Let us look at what this calculation gives for the first eigenvalues of $\Delta$. There are two opposite cases depending on whether the geometric parameter $A = A(T^2)\sin\theta$ is small or large. Let us assume again for simplicity that $\theta = \pi/2$, then $A$ is simply the area of the fibre. The small $A$ corresponds to the "rope-like" Sol-manifolds. In this case the first eigenvalues are "trivial": $E_l = 4l^2\pi^2$, $l = 0, 1, 2, \ldots$, which correspond to $Q = 0$. The second case when $A$ is large ("fat" Sol-manifolds) is more interesting. In that case the parameter $\nu$ in the Mathieu operator is small. The action integral (25) for small $\nu$ (or large $\Lambda$) has the asymptotics
$$I \sim \frac{\sqrt{\Lambda}}{\mu\pi}\ln\frac{2\Lambda}{|\nu|}.$$
This suggests the following asymptotics for the eigenvalues for small $\nu$ (or large $\Lambda$):
$$\Lambda_k \sim \frac{(\mu\pi(k + 1))^2}{(\ln|\nu|)^2}, \qquad k = 0, 1, 2, \ldots. \tag{27}$$
2 The fact that the action diverges logarithmically for ν → 0 seems to be surprising. To our knowledge this is the first example of a Liouville integrable system on a compact manifold for which the action diverges on approach of a singular level (with energy fixed).
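The ξ-integral behind X± can be checked numerically. The following is only an illustrative sketch, not part of the original computation: it assumes the substitution ξ = cosh u, which turns the ξ-integral into ∫₀^∞ du/(cosh u ± cos θ), and uses a hand-written Simpson rule. The function names, the cut-off `upper` and the step count are choices made here.

```python
import math

def xi_integral(theta, sign=+1, upper=40.0, steps=4000):
    """Simpson's rule for int_0^upper du / (cosh(u) + sign*cos(theta)).

    This equals int_1^inf d(xi) / ((xi + sign*cos(theta)) * sqrt(xi^2 - 1))
    after the substitution xi = cosh(u); the tail beyond `upper` is negligible
    because the integrand decays like 2*exp(-u). `steps` must be even.
    """
    c = sign * math.cos(theta)
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        if i == 0 or i == steps:
            weight = 1
        else:
            weight = 4 if i % 2 else 2
        total += weight / (math.cosh(i * h) + c)
    return total * h / 3

def x_pm(theta, sign):
    """X_+ (sign=+1) or X_- (sign=-1): prefactor 2 times the eta-integral 2/3."""
    return 2 * (2.0 / 3.0) * xi_integral(theta, sign)

if __name__ == "__main__":
    theta = 1.0
    print(x_pm(theta, +1), 4 / 3 * theta / math.sin(theta))
    print(x_pm(theta, -1), 4 / 3 * (math.pi - theta) / math.sin(theta))
```

For θ = π/2 both values should reduce to 2π/3, the value of the integral of f(g) over 0 ≤ g ≤ 1, and for general θ their ratio should approach θ/(π − θ) up to quadrature error.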
Fig. 3. First 20 states of the cosh-Mathieu equation in dependence on the parameter log ν, where $f(\varepsilon, \nu) = (\varepsilon - \nu)/(\sqrt{2\nu}\,\mu)$
Note that although $\varepsilon_k \to 0$ as ν → 0 the decay is slower than any power of ν. The first 20 states are shown in Fig. 3. When ν is not small the low states are like those of the harmonic oscillator,
\[ \varepsilon_k \approx |\nu| + \sqrt{2\nu}\,\lambda \left(k + \tfrac{1}{2}\right), \qquad k = 0, 1, 2, \ldots \tag{28} \]
Both semiclassical formulas (27) and (28) are well known, see e.g. [16]. In that paper it is shown that the modified Mathieu equation is equivalent to the relativistic harmonic oscillator in which ν is proportional to the speed of light. This explains why (not only semiclassically) the low lying states behave like a harmonic oscillator for large enough ν. For the corresponding Sol-manifolds this gives the following behaviour of the first eigenvalues
\[ E_{k,[\gamma]} \sim \frac{(\mu\pi k)^2}{(\ln|C\, Q_{A^*}([\gamma])|)^2}, \qquad k = 1, 2, \ldots, \quad [\gamma] \in (\Gamma^* \setminus \{0\})/A^* \tag{29} \]
for small $C = \dfrac{8\pi^2}{\sqrt{D}\, A(T^2) \sin\theta}$. Starting with the ground state E_0 = 0 we order these eigenvalues E_0 < E_1 ≤ E_2 ≤ ... to have
\[ E_j \sim \frac{(\mu\pi)^2}{(\ln C)^2} \left(1 - 2\, \frac{\ln Q_j}{\ln C}\right), \qquad j = 1, 2, \ldots, \tag{30} \]
where Q_0 = 0 ≤ Q_1 ≤ Q_2 ≤ ... are positive values of the form $Q_{A^*}$ listed in increasing order and we have assumed that $Q_j C \ll 1$. In particular,
\[ E_1 \sim \frac{(\mu\pi)^2}{(\ln C)^2}, \qquad \frac{E_j - E_1}{E_i - E_1} \sim \frac{\ln(Q_j/Q_1)}{\ln(Q_i/Q_1)}, \tag{31} \]
so when C is small we can “hear” the values of the quadratic form $Q_{A^*}$. Note that the question about the next order term in Weyl’s law is non-trivial. For the simpler case of Nil-manifolds some results in this direction can be found in [23]. One can also look at the corresponding Minakshisundaram-Plejel asymptotic expansion, which is an important characteristic of a spectrum (see e.g. [11]). In particular the
second coefficient in this expansion is proportional to the integral of the scalar curvature K. A straightforward calculation shows that for the Sol-manifold $M_A^3$ the principal sectional curvatures are ± sin 2θ log² λ and −2(sin θ log λ)², so K = −2(sin θ log λ)² and thus is always negative.

8. Spectral Statistics

The spectral statistics of integrable and chaotic systems are quite different, see, e.g., [3, 4, 10]. As we have seen, the geodesic flow on Sol-manifolds has properties of both integrable and chaotic systems. Therefore it is natural to ask what the spectral statistics of the Sol-manifolds look like. Note that according to the Berry-Tabor conjecture [3] integrable systems should have Poisson distributed level spacing. We are going to show that this is not the case for Sol-manifolds. The reason is the high multiplicities of the eigenvalues. Indeed, for the simplest positive quadratic form Q_0 = x² + y² the classical result due to E. Landau says that the number of integers up to a number K represented by this form grows as $K/\sqrt{\log K}$. If there were no degeneracies then this number would grow like the area of the fundamental region, which is proportional to K. This means that most of the level spacings of the values of Q_0 are zero. P. Sarnak pointed out to us [26] that a similar fact is true for indefinite forms as well, and has been proved by Bernays [2]:

Theorem (Bernays [2]). The number of positive integers up to K that can be represented by a given indefinite quadratic form Q is $O(K/\sqrt{\log K})$.

Combining this with the results of Sect. 5 we have the following:

Theorem 4. The level spacing distribution for the spectrum of Sol-manifolds $M_A^3$ is not Poisson.

In the fundamental work by Berry and Tabor [3] a stationary phase computation was shown to yield Poisson statistics if the energy surface in action space is curved. This argument cannot, however, detect global coincidences of eigenvalues coming from different parts of the energy surface.
Berry and Tabor in [3] also discussed exceptions to the principle “Integrable systems yield Poisson statistics” with flat energy surface and the subtle number theory involved. Examples with a curved energy surface that do not have Poisson statistics have been known for some time, see e.g. [22, 25], but so far no examples with continuous parameters were known. The present example is therefore particularly interesting because the statement of the theorem is not sensitive to change of the metric in the Sol-class (3). Let us illustrate the level spacing distribution in the example of the cat-map A. In Fig. 4 (left) the level spacing statistics for the indefinite binary quadratic form $Q_A = -x^2 + xy + y^2$ is shown for three different values of $Q_{\max}$. Since the cat-map is the product of two involutions, there is a simple reflection symmetry in the lattice, which causes almost all states to be at least twofold degenerate. This discrete symmetry needs to be factored out before the level spacing statistics can be studied. The involutions $R_i$ with $R_i^2 = \mathrm{Id}$ are
\[ A = R_2 R_1, \qquad R_1 = \begin{pmatrix} 1 & 0 \\ -1 & -1 \end{pmatrix}, \qquad R_2 = \begin{pmatrix} 1 & -1 \\ 0 & -1 \end{pmatrix} . \]
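The algebra in this decomposition is easy to confirm by machine. The sketch below is an illustration only: the helper names `matmul`, `act` and `q_form` are made up here, and the invariance of the form is tested on a finite grid of integer points rather than proved.

```python
# 2x2 integer matrices as nested tuples; all helper names are hypothetical.
def matmul(m, n):
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def act(m, v):
    """Apply the matrix m to the column vector v = (x, y)."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def q_form(v):
    """The indefinite form Q_A(x, y) = -x^2 + x*y + y^2."""
    x, y = v
    return -x * x + x * y + y * y

IDENTITY = ((1, 0), (0, 1))
R1 = ((1, 0), (-1, -1))
R2 = ((1, -1), (0, -1))
A = matmul(R2, R1)

def preserves_form(m, bound=6):
    """Check Q(m v) == Q(v) on the integer grid |x|, |y| <= bound."""
    grid = range(-bound, bound + 1)
    return all(q_form(act(m, (x, y))) == q_form((x, y)) for x in grid for y in grid)

if __name__ == "__main__":
    print(A)                                       # the cat map
    print(matmul(R1, R1) == IDENTITY, matmul(R2, R2) == IDENTITY)
    print(preserves_form(R1), preserves_form(R2), preserves_form(A))
```

On this grid both involutions, and hence their product, leave the values of the form unchanged, which is the reflection symmetry responsible for the twofold degeneracy mentioned above.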
[Figure 4 panels, left and right columns: Qmax = 10^5 (# = 21749), Qmax = 10^7 (# = 2154323), Qmax = 10^9 (# = 215227349). Horizontal axis: DQ; vertical axis: %.]
Fig. 4. Level spacing statistics of the indefinite binary quadratic form Q(x, y) = −x² + xy + y² (left) with degeneracies removed (right)
The fixed line of R_1 is the line y = −x/2, and factoring the fundamental region in Fig. 2 by R_1 simply cuts the fundamental region in half along this line. Since the values of Q are integers we chose to present the raw level spacing statistics, i.e. without unfolding the spectrum first. The number of lattice points found up to the corresponding Qmax in the reduced fundamental region is given in the heading of each figure, and the ratio approaches $\ln\lambda/(2\sqrt{d})$. The figures clearly show that the proportion of degenerate levels grows in agreement with what we said above. When the degenerate levels are discarded in the statistics Fig. 4 (right) shows that the distribution appears to converge to some non-universal shape. We would like to mention here the paper [14] where the moments of the intervals between the sums of two squares were studied.
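The quoted limit can be compared with the counts in the headings of Fig. 4. The short sketch below rests on two assumptions not spelled out in this passage: that λ is the larger eigenvalue (3 + √5)/2 of the cat map, and that d = 5 is the discriminant of −x² + xy + y².

```python
import math

# Assumptions: lam is the larger eigenvalue of the cat map [[2, 1], [1, 1]],
# and d = 5 is the discriminant of Q(x, y) = -x^2 + x*y + y^2.
lam = (3 + math.sqrt(5)) / 2
d = 5
limit = math.log(lam) / (2 * math.sqrt(d))

# Lattice point counts taken from the panel headings of Fig. 4.
counts = {10**5: 21749, 10**7: 2154323, 10**9: 215227349}
ratios = {qmax: n / qmax for qmax, n in counts.items()}

if __name__ == "__main__":
    print("limit:", limit)
    for qmax in sorted(ratios):
        print(qmax, ratios[qmax], abs(ratios[qmax] - limit))
```

Under these assumptions the deviations from the limit shrink as Qmax grows, consistent with the statement that the ratio approaches ln λ/(2√d).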
9. Quantum monodromy

In view of the previous results the appearance of quantum monodromy in our problem is quite natural. However there is a problem with this notion in our case which we want to discuss first. As was shown in [5], the geodesic flow on Sol-manifolds cannot have three analytic integrals (see Sect. 3 above). A similar fact holds in the quantum case. Namely, one can show that the algebra of the differential operators on the Sol-manifold $M_A^3$ commuting with the Laplace-Beltrami operator (15) is generated by ∂/∂u and ∂/∂v. This means that our quantum problem does not have enough quantum integrals, at least in the class of the
differential operators and therefore it is not clear if we can apply the rigorous treatment of quantum monodromy from [30]. So in this section we will treat the quantum monodromy on Sol-manifolds on the intuitive level, paying more attention to geometry than to analysis. As we have already shown, the set of eigenfunctions is in a natural one-to-one correspondence with $\Gamma^*/A^* \times \mathbb{N}$, $[\gamma] \in \Gamma^*/A^*$, $k \in \mathbb{N}$. The fundamental domain of A* is shown in Fig. 2. It is natural to represent the orbit space as a lattice on the cone obtained by gluing the edges of the fundamental domain of A* on the plane (more precisely we should consider four different cones corresponding exactly to four families of Liouville tori). The complete 3-dimensional lattice of quantum states is obtained by attaching a line to each point of the Sol-flower which contains the spectrum of the modified Mathieu equation with the corresponding value of ν(Q). The Sol-flower can be viewed as a slice through this lattice. But since there is no monodromy in the Mathieu fibres we have chosen to simply present the 2-dimensional base of this 3-dimensional lattice. Quantum monodromy arises when we pass around the vertex of the cone. It is clear that the basis of the lattice will undergo the transformation A. On the other hand, nothing happens to the third direction corresponding to the parameter k. Therefore the quantum monodromy for the Sol-manifold $M_A^3$ is given by the matrix
\[ \begin{pmatrix} A^* & 0 \\ 0 & 1 \end{pmatrix} . \]
We want to emphasize that in this case quantum monodromy has a purely topological nature. It is determined by the topology of the underlying manifold, and not by properties of the metric. It does not depend on the parameters E, G, F; moreover the monodromy remains the same for all metrics of the form
\[ ds^2 = ds_z^2 + dz^2, \]
where $ds_z^2$ is a flat metric on fibres $T_z^2$ with coefficients depending on z. In the previously known examples (like the geodesic flow on the 3-dimensional ellipsoid of revolution, [31]) the metric g played the principal role.
[Figure 5: two panels with horizontal axis F1 and vertical axis F2.]
Fig. 5. Image of the lattice Z² in (p_x, p_y) under the momentum map (F_1, F_2) for fixed energy where A is the cat-map. Left: origin at the centre. Right: distorted standard lattice away from the origin
In Figs. 5, 6, and 7 we demonstrate the quantum monodromy of the Sol-manifold $M_A^3$ related to the cat-map
\[ A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} . \]
To make the image of the lattice uniform we have slightly modified the classical integrals f_1, f_2 from Sect. 3 as follows:
\[ F_1 = \sqrt{|Q|}\, \cos 2\pi\beta, \qquad F_2 = \sqrt{|Q|}\, \sin 2\pi\beta, \]
where $Q = p_u p_v$ and $\beta = \dfrac{\ln|p_u|}{\ln\lambda}$. The image of the lattice under the map F we call Sol-flower (see Fig. 6). Note that an alternative choice $\beta = \dfrac{\ln|p_v|}{\ln\lambda}$ would give a similar picture and the freedom of the rescaling of the eigenvectors e_u, e_v leads simply to a rotation of the plane (F_1, F_2). Figure 5 illustrates that away from the origin the lattice is simply a deformed standard lattice. A nice property of the map (p_u, p_v) → (F_1, F_2) is that it changes the area simply by a constant multiple: it is easy to check that
\[ dF_1 \wedge dF_2 = \frac{\pi}{\ln\lambda}\, dp_u \wedge dp_v . \]
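The constant area factor can be checked numerically. The sketch below is an illustration under stated assumptions: λ is taken to be the larger eigenvalue of the cat map, β = ln|p_u|/ln λ, and the Jacobian is estimated by central differences. The function names and the step size are choices made here. Only the absolute value of the determinant is compared with π/ln λ, since its sign depends on orientation conventions and on the sign of Q.

```python
import math

LAM = (3 + math.sqrt(5)) / 2  # assumed: larger eigenvalue of the cat map

def flower_map(pu, pv):
    """(p_u, p_v) -> (F1, F2) with Q = p_u * p_v and beta = ln|p_u| / ln(lambda)."""
    r = math.sqrt(abs(pu * pv))
    beta = math.log(abs(pu)) / math.log(LAM)
    return (r * math.cos(2 * math.pi * beta), r * math.sin(2 * math.pi * beta))

def jacobian_det(pu, pv, h=1e-6):
    """Central-difference estimate of det d(F1, F2)/d(p_u, p_v)."""
    fu_plus, fu_minus = flower_map(pu + h, pv), flower_map(pu - h, pv)
    fv_plus, fv_minus = flower_map(pu, pv + h), flower_map(pu, pv - h)
    d1u = (fu_plus[0] - fu_minus[0]) / (2 * h)
    d2u = (fu_plus[1] - fu_minus[1]) / (2 * h)
    d1v = (fv_plus[0] - fv_minus[0]) / (2 * h)
    d2v = (fv_plus[1] - fv_minus[1]) / (2 * h)
    return d1u * d2v - d1v * d2u

if __name__ == "__main__":
    expected = math.pi / math.log(LAM)
    for point in [(1.3, 0.7), (-2.0, 5.0), (4.0, -0.25)]:
        print(point, abs(jacobian_det(*point)), expected)
```

The sample points are arbitrary and lie away from the axes p_u = 0 and p_v = 0, where the map is singular.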
It is interesting to mention that the multiplicity problem becomes the standard “circle problem" if one replaces the square lattice by the Sol-flower (but of course it does not help to compute them). When a fundamental cell is chosen in the Sol-flower as indicated in grey in Fig. 7 the monodromy can be observed as follows: A line extending a basis vector is parallel transported in the lattice. After completing a cycle about the origin this direction is changed. The left picture shows images of the lines (px , py ) = (30 − 2l, −j − l), j = 0, . . . , 27, l = 0, . . . , 5. The right picture shows images of the lines (px , py ) = (30 − l, −j − l), j = 0, . . . , 30, l = 0, . . . , 5. Denote the direction of the line shown in the left part of
Fig. 6. Image of the lattice Z² in (p_x, p_y) under the momentum map (F_1, F_2) for fixed energy and A the cat-map. Left: Sol-flower with |Q| ≤ 60². Right: Sol-flower with |Q| ≤ 90²
Fig. 7. Parallel transport of basic directions in the image of the momentum map
Fig. 7 by e_1, and the one on the right part by e_2. The preimages of these basis vectors in Fig. 2 are −(2e_x + e_y) and −(e_x + e_y). Parallel transporting e_1 clockwise by increasing j gives e_1 + e_2 (determining the second row of A), while parallel transporting e_2 counterclockwise by decreasing j gives −e_1 + e_2 (determining the first row of $A^{-1}$). Since A ∈ SL(2, Z) this determines the cat-map.

10. Concluding Remarks

The Sol-geometry from a dynamical point of view has the special property of being on the border between integrability and chaos. Integrability is reflected in the solvability of the corresponding group while the chaos is related to a hidden (partial) hyperbolicity. This makes the Sol-case of particular interest and explains why the geodesic problem on the Sol-manifolds has both integrable and chaotic features. As we have seen the quantum case gives a new interesting twist to the story by bringing arithmetic into play. Atiyah, Donnelly and Singer [1] considered a more general case of Sol-manifolds which are $T^{n+1}$ torus fibres over $T^n$. Much of our analysis can be generalised to this case as well. The quantum Toda lattice Hamiltonian will appear then as a generalisation of the modified Mathieu operator. Some very interesting results in the corresponding classical problem were found recently by Leo Butler in [7]. It would also be interesting to study in more detail how the chaos (at the degenerate level Q = 0) of the classical system manifests itself in the quantum version. We showed that the spectral statistics provides a counterexample to the Berry-Tabor conjecture, but it cannot be taken as an indicator of chaos. One simple observation is that the trivial eigenfunctions $\Psi_{0,s}$ are asymptotically ‘uniformly distributed’ on the manifold. Hence the subset of eigenfunctions that are associated with the classical chaos are quantum unique ergodic, cf. [19].
Already at relatively small quantum numbers it can be seen that the nodal lines are more complicated when Q is small, see Fig. 8. Acknowledgements. We are very grateful to M. Berry, E. Bombieri, V. Kuznetsov, J. Marklof, A. Pushnitski, P. Sarnak, R. Schubert and P. Shiu for very useful and stimulating discussions. This work was started in December 2002 when one of us (A.B.) visited Loughborough University. We are grateful to the London Mathematical Society for the support of this visit. The work of APV
Fig. 8. Slice of $\Psi_{[\gamma],k}$ at x = 0 for γ = (1, 0), k = 15 (left) and γ = (12, −5), k = 2 (right) for the cat-map. The appearance of the eigenfunction whose Q(γ) is small and thus close to the classical chaos is more irregular
was partially supported by the European research network ENIGMA (contract MRTN-CT-2004-5652) and ESF Scientific programme MISGAM. The work of HRD was partially supported by the European research programme MASIE (contract HPRN-CT-2000-0113). The work of AVB was partially supported by the Russian Foundation for Basic Research (grant 05-01-00978).
References
1. Atiyah, M.F., Donnelly, H., Singer, I.M.: Eta invariants, signature defects of cusps and values of L-functions. Ann. Math. 118, 131–177 (1983)
2. Bernays, P.: Über die Darstellung von positiven, ganzen Zahlen durch die primitiven, binären quadratischen Formen einer nicht-quadratischen Diskriminante. Dissertation Göttingen, 1912
3. Berry, M.V., Tabor, M.: Level clustering in the regular spectrum. Proc. Roy. Soc. London A 356, 375–394 (1977)
4. Bohigas, O., Giannoni, M.-J., Schmidt, C.: Characterization of chaotic quantum spectra and universality of level fluctuation laws. Phys. Rev. Lett. 52, 1–4 (1984)
5. Bolsinov, A.V., Taimanov, I.A.: Integrable geodesic flows with positive topological entropy. Invent. Math. 140, 639–650 (2000)
6. Brezin, J.: Harmonic analysis on compact solvmanifolds. LNM 602, New York: Springer-Verlag, 1977
7. Butler, L.: Toda lattices and positive-entropy integrable systems. Invent. Math. 158, 515–549 (2004)
8. Cushman, R., Duistermaat, J.J.: The quantum spherical pendulum. Bull. Amer. Math. Soc. (N.S.) 19, 475–479 (1988)
9. Duistermaat, J.J.: On global action-angle coordinates. Comm. Pure Appl. Math. 33, 687–706 (1980)
10. Duistermaat, J.J., Guillemin, V.W.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29(1), 39–79 (1975)
11. Gilkey, P.: Spectral geometry of a Riemannian manifold. J. Diff. Geom. 10(4), 601–618 (1975)
12. Guillemin, V., Uribe, A.: Monodromy in the quantum spherical pendulum. Commun. Math. Phys. 122, 563–574 (1989)
13. Hirzebruch, F.: Hilbert modular surfaces. L’Enseign. Math. 19, 183–281 (1973)
14. Hooley, C.: On the intervals between numbers that are sums of two squares. Acta Math. 127, 279–297 (1971)
15. Keng, H.L.: Introduction to Number Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1982
16. Jager, L.: Fonctions de Mathieu et fonctions propres de l’oscillateur relativiste. Ann. Fac. Sci. Toulouse Math. Serie 6, 7, 465–495 (1998)
17. Komarov, I.V., Ponomarev, L.I., Slavyanov, S.Yu.: Spheroidal and Coulomb spheroidal functions (Russian). Moscow: “Nauka”, 1975, pp 320
18. Komarov, I.V., Tsiganov, A.B.: Quantum two-particle periodic Toda lattice (Russian). Vestnik LGU, 4(2), 69–71 (1988)
19. Jakobson, D., Nadirashvili, N., Toth, J.: Geometric properties of eigenfunctions. Russ. Math. Surv. 56, 1085–1105 (2001)
20. Landau, E.: Elementary Number Theory. New York: Chelsea, 1958
21. LeVeque, W.J.: Fundamentals of Number Theory. New York: Dover Publications, 1996
22. Marklof, J.: Spectral form factors of rectangle billiards. Commun. Math. Phys. 199, 169–202 (1998)
23. Petridis, Y.N., Toth, J.A.: The remainder in Weyl’s law for Heisenberg manifolds. J. Differ. Geom. 60, 455–483 (2002)
24. Sarnak, P.: Class numbers of indefinite binary quadratic forms. J. Number Theory 15, 229–247 (1982)
25. Sarnak, P.: Values at integers of binary quadratic forms. CMS Conf. Proc. 21, Providence, RI: Amer. Math. Soc., 1997, pp. 181–203
26. Sarnak, P.: Private communication. (June 2003)
27. Scott, P.: The geometries of 3-manifolds. Bull. London Math. Soc. 15, 401–487 (1983)
28. Taimanov, I.A.: Topological obstructions to integrability of geodesic flows on non-simply-connected manifolds. Math. USSR Izv. 30, 403–409 (1988)
29. Thurston, W.P.: Hyperbolic geometry and 3-manifolds. London Math. Soc. Lecture Note Ser., 48, Cambridge: Cambridge Univ. Press, 1982
30. Vu Ngoc S.: Quantum monodromy in integrable systems. Commun. Math. Phys. 203(2), 465–479 (1999)
31. Waalkens, H., Dullin, H.R.: Quantum monodromy in prolate ellipsoidal billiards. Ann. Phys. 295, 81–112 (2002)
32. Weyl, H.: Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen. Math. Ann. 141, 441–479 (1912)
33. Whittaker, E.T., Watson, G.N.: A Course in Modern Analysis, 4th ed., Cambridge: Cambridge University Press, 1990
34. Zwillinger, D.: Handbook of Differential Equations, 3rd ed. Boston, MA: Academic Press, 1997
Communicated by L. Takhtajan
Commun. Math. Phys. 264, 613–630 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1470-y
Communications in
Mathematical Physics
On Curvature Decay in Expanding Cosmological Models
Hans Ringström
Max-Planck-Institut für Gravitationsphysik, Am Mühlenberg 1, 14476 Golm, Germany. E-mail:
[email protected] Received: 21 April 2005 / Accepted: 22 June 2005 Published online: 18 November 2005 – © Springer-Verlag 2005
Abstract: Consider a globally hyperbolic cosmological spacetime. Topologically, the spacetime is then a compact 3-manifold in cartesian product with an interval. Assuming that there is an expanding direction, is there any relation between the topology of the 3-manifold and the asymptotics? In fact, there is a result by Michael Anderson, where he obtains relations between the long-time evolution in General Relativity and the geometrization of 3-manifolds. In order to obtain conclusions however, he makes assumptions concerning the rate of decay of the curvature as proper time tends to infinity. It is thus of interest to find out if such curvature decay conditions are always fulfilled. We consider here the Gowdy spacetimes, for which we prove that the decay condition holds. However, we observe that for general Bianchi VIII spacetimes, the curvature decay condition does not hold, but that some aspects of the expected asymptotic behaviour are still true. 1. Introduction The objects of study in this paper are cosmological spacetimes. We shall assume them to be globally hyperbolic, so that topologically, they are of the form I × M, where M is a compact 3-manifold. We shall also only consider spacetimes which have one expanding direction, i.e. there is one time direction in which spacetime is causally geodesically complete. The question is then, what is the relationship between the asymptotic behaviour and the topology of the compact Cauchy surfaces? Anderson, Fischer and Moncrief have written several papers on the subject, see [2] and [7] and the references cited therein. In the current paper, we are concerned with questions raised in [2] regarding the relationship between the asymptotics and geometrization. The special case of interest here is when one has a globally hyperbolic vacuum spacetime foliated by compact constant mean curvature (CMC) hypersurfaces, though in the case of Gowdy, we shall also be interested in another geometrically defined foliation. 
We shall assume that σ(Σ) ≤ 0
Current address: Department of Mathematics, KTH, 100 44 Stockholm, Sweden
for any CMC hypersurface Σ (for a definition of the σ-constant of a compact 3-manifold, see [1]) or, in other words, that Σ does not admit a metric of positive scalar curvature, see [2]. Furthermore, we shall assume that the range of the mean curvatures attained in the foliation exhausts the interval (−∞, 0) and that the spacetime is future causally geodesically complete. In fact, we shall only be interested in the expanding direction, so it is enough if the foliation exhausts the interval [H_0, 0) for some H_0 < 0, and sometimes future causal geodesic completeness will be a consequence of other assumptions. In this setting we wish to consider the behaviour of the geometry induced on the leaves of the foliation as proper time tends to infinity. Let us recall some definitions from [2].

Definition 1. Let Σ be a closed, oriented and connected 3-manifold, satisfying σ(Σ) ≤ 0. A weak geometrization of Σ is a decomposition of Σ,
\[ \Sigma = H \cup G, \tag{1} \]
where H is a finite collection of complete connected hyperbolic manifolds of finite volume embedded in Σ and G is a finite collection of connected graph manifolds embedded in Σ. The union is along a finite collection of embedded tori T = ∪T_i, T = ∂H = ∂G. A strong geometrization of Σ is a weak geometrization as above, for which each torus T_i in T is incompressible in Σ, i.e. the inclusion of T_i into Σ induces an injection of fundamental groups.

For more details concerning the terminology, we refer to [2] and the references cited therein. Graph manifolds are built by gluing together Seifert fibred spaces along toral boundary components. Since we shall only be concerned with Seifert fibred 3-manifolds in this paper, the details of these constructions are not of any greater importance here. Let us however define the concept of a Seifert fibred space.

Definition 2. A 3-manifold is said to be a Seifert fibred space if it satisfies the following two conditions:
1. It can be written as a disjoint union of circles.
2. Each circle fibre has an open neighbourhood U satisfying:
– U can be written as a disjoint union of circle fibres,
– U is isomorphic either to a solid torus or a cylinder where the ends have been identified after a rotation by a rational angle.
When we say that U is isomorphic to a solid torus, we mean that U is diffeomorphic to a solid torus and that the circle fibres of U are mapped to the natural circle fibres of the solid torus under the diffeomorphism.

Note that there are different definitions of Seifert fibred spaces in the literature. In particular, our definition coincides with the original definition by Seifert but not with that of Scott [14]. Since the geometry on the leaves of the foliation becomes more and more flat, it is natural to rescale the metric in some way. Following [2], we shall use the proper time distance to a fixed Cauchy surface in order to do so. Let Σ be a fixed Cauchy surface and define, for an arbitrary spacetime point p,
\[ \hat t(p) = \sup_{\gamma} \int_0^1 \left[ -\langle \gamma', \gamma' \rangle \right]^{1/2} ds, \]
where the supremum is taken over timelike curves γ with γ(0) ∈ Σ and γ(1) = p and ⟨·, ·⟩ denotes inner product with respect to the spacetime metric. We also define
\[ \hat t(\Sigma') = \sup_{p \in \Sigma'} \hat t(p) \]
for a Cauchy surface Σ'. Let the leaves of the foliation be indexed by a parameter s. In the case of a CMC foliation, the parameter can be chosen to be the mean curvature of the corresponding leaf, and in the case of Gowdy, the parameter will be the so-called areal time coordinate. We are interested in the interval [s_0, s_max), where s_0 corresponds to some arbitrary initial hypersurface (filling the role of Σ above) and s_max corresponds to infinite expansion, i.e. s_max = 0 in the CMC case and s_max = ∞ in the case of the areal time coordinate in Gowdy. Let $\hat g_s$ be the Riemannian metric induced on the leaf $\Sigma_s$ by the spacetime metric and define $g_s = \hat t^{-2}(\Sigma_s)\, \hat g_s$. The following weak asymptotics problem was raised in [2]. Suppose that Σ is a closed, oriented, connected 3-manifold with σ(Σ) ≤ 0. Suppose further that the vacuum spacetime is future causally geodesically complete and that the CMC foliation exhausts the future development. Then for any sequence s_i → s_max, the slices $(\Sigma_{s_i}, g_{s_i})$ have a subsequence asymptotic to a weak geometrization of Σ. More precisely, there should be a division of Σ as in (1) and on the region H, the metrics $g_{s_i}$ should converge to complete hyperbolic metrics of finite volume, while on G, the metrics collapse the graph manifold with bounded curvature. When we say that a region collapses we mean that the injectivity radius of that region converges to zero. If a region collapses even though the curvature remains bounded, we shall say that it collapses in the sense of Cheeger-Gromov. This conjecture should be compared with the work of Andersson and Moncrief [3], Choquet-Bruhat and Moncrief [4] and Fischer and Moncrief [7]. In [3], the authors considered the future development of perturbations of spatially compact variants of the k = −1 Friedmann-Robertson-Walker vacuum spacetime. They proved that the future development is covered by CMC hypersurfaces exhausting the maximal range, and that it is future causally geodesically complete.
Furthermore, the rescaled metric on the spatial hypersurfaces was shown to converge to the hyperbolic one. In [4], the authors considered Cauchy surfaces that have the topology of a trivial circle bundle over a higher genus surface and they restricted their attention to small, polarized, U (1)symmetric data. They proved that the future development is foliated by CMC hypersurfaces exhausting the maximal range. Furthermore, they stated that causal geodesic completeness should hold, though they did not prove it. However, this was shown for a larger class of spacetimes in [5], a paper which extends the results of [4] to the non-polarized case, using the results of [6]. Finally, they showed that the Cauchy surfaces should undergo a Cheeger-Gromov type collapse. In [7], some known spatially homogeneous examples were studied and the expected behaviour was confirmed. Note that in all the cases mentioned above, either H = ∅ or G = ∅ in the division (1). The reason for this is the fact that all results, as far as we are aware, can be divided into the category of small data results and the category of results for a situation in which there is symmetry. The small data category may seem to be more general, but since it presupposes the existence of a symmetric solution around which to perturb, it is not more general in terms of spatial topology. In other words, all results known require the spatial manifold to allow a highly symmetric metric, and this reduces the number of allowed spatial topologies.
In [2], the following statement was proved. Consider a spacetime which is the maximal development of vacuum initial data, with σ(Σ) ≤ 0, where Σ is the initial hypersurface, and assume that it is foliated to the future by CMC hypersurfaces exhausting the maximum range. Assume furthermore that the curvature satisfies
\[ |R|(p) + \hat t(p)\, |\nabla R|(p) \le \frac{C}{\hat t^2(p)}, \tag{2} \]
where |R|² is defined as the sum of the squares of the components of the Riemann curvature tensor with respect to an orthonormal frame, where the timelike unit vector in the frame is the future oriented normal to the foliation (the definition of |∇R|² is similar). Then the spacetime is future causally geodesically complete and, for any sequence s_i → s_max, the slices $(\Sigma_{s_i}, g_{s_i})$ have a subsequence asymptotic to a weak geometrization. Due to this theorem, it is of interest to analyze how curvature decays in expanding cosmological spacetimes. In the following, we shall only consider whether the estimate
\[ |R|(p) \le \frac{C}{\hat t^2(p)} \tag{3} \]
holds or not. In the case of Gowdy, it turns out that such an estimate holds, at least relative to the foliation defined by the areal time coordinate. In the case of locally rotationally symmetric Bianchi VIII, the estimate also holds, but it turns out that for general Bianchi VIII it does not. In that case tˆ(p) ln tˆ(p)|R|(p) converges to a positive number as p tends to a point in the infinite future. In fact, in the case of general Bianchi VIII, one does not get a better estimate even if one considers the Kretschmann scalar
\[ \kappa = R_{\alpha\beta\gamma\delta} R^{\alpha\beta\gamma\delta} . \tag{4} \]
It is then of interest to consider the Ricci curvature of gsi . It turns out that in general, the Ricci curvature does not have any better decay, but that there is a time sequence such that one does get the expected decay. This time sequence corresponds to the metric being locally rotationally symmetric. Concerning the topology, we have the following results. In the case of Gowdy, the topology is T 3 , and after rescaling the 3-tori collapse along 2-tori. In the Bianchi VIII case, the topology is that of a non-trivial circle bundle over a higher genus surface. After rescaling one obtains the conclusion that the length of the circle fibers converges to zero.
1.1. Gowdy spacetimes. The Gowdy spacetimes are a class of vacuum spacetimes with a two dimensional group of isometries. Of the spatial topologies compatible with the symmetry requirements, only T³ is expected to be compatible with infinite expansion. For this reason, we shall only be interested in such a spatial topology in this paper. There are natural conditions defining the Gowdy spacetimes, see [12] and references therein, but we shall not write them down here. For the purposes of the present paper, a Gowdy T³ spacetime is defined as a Lorentz manifold $\mathbb{R}_+ \times T^3$, where $\mathbb{R}_+ = (0, \infty)$, with metric
\[ g = t^{-1/2} e^{\lambda/2} (-dt^2 + d\theta^2) + t\left[ e^{P} d\sigma^2 + 2 e^{P} Q\, d\sigma\, d\delta + (e^{P} Q^2 + e^{-P})\, d\delta^2 \right], \tag{5} \]
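where $P$, $Q$ and $\lambda$ only depend on $t$ and $\theta$, satisfying Einstein’s vacuum equations. In terms of $P$, $Q$ and $\lambda$, the equations are
\[ P_{tt} + \frac{1}{t} P_t - P_{\theta\theta} - e^{2P} (Q_t^2 - Q_\theta^2) = 0, \tag{6} \]
\[ Q_{tt} + \frac{1}{t} Q_t - Q_{\theta\theta} + 2 (P_t Q_t - P_\theta Q_\theta) = 0, \tag{7} \]
and
\[ \lambda_t = t [P_t^2 + P_\theta^2 + e^{2P} (Q_t^2 + Q_\theta^2)], \tag{8} \]
\[ \lambda_\theta = 2t (P_\theta P_t + e^{2P} Q_\theta Q_t). \tag{9} \]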
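As a quick check on the form (5), and as a sketch rather than part of the original argument, the determinant of the $(\sigma, \delta)$ block of the metric can be computed directly. The matrix name $h$ is introduced here only for illustration, and the sketch assumes that the coordinate area of the two-torus is normalized to one. Under that assumption the area of the two-torus of constant $t$ and $\theta$ equals $t$, which is the property behind the name of the time coordinate.

```latex
% h denotes the (sigma, delta) block of the metric (5); the name h is illustrative.
\[
h = t \begin{pmatrix} e^{P} & e^{P} Q \\ e^{P} Q & e^{P} Q^{2} + e^{-P} \end{pmatrix},
\qquad
\det h = t^{2} \bigl[ e^{P} (e^{P} Q^{2} + e^{-P}) - e^{2P} Q^{2} \bigr] = t^{2}.
\]
% Hence the area element on a two-torus of constant t and theta is
\[
\sqrt{\det h}\; d\sigma\, d\delta = t\, d\sigma\, d\delta .
\]
```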
The time coordinate t appearing in (5) is called the areal time coordinate. The reason for this is that the area of the two torus given by fixing t and θ is t. On the other hand, the trace of the second fundamental form need not be constant on the hypersurfaces of constant t. One might then naively expect this to approximately be the case asymptotically. However, there are metrics of the form (5) such that there is a time sequence tk → ∞ with the property that the quotient of the maximum and the minimum of |trktk | tends to infinity, where ktk is the second fundamental form of the hypersurface defined by t = tk . We refer the reader to [13] for a proof of this fact. Thus there is certainly a difference between the CMC foliation and the areal time coordinate foliation. Since most of the analysis concerning Gowdy spacetimes has been carried out in the areal time coordinate and since this coordinate has a natural geometric definition, we shall however only consider this choice here. In the end we are interested in getting estimates for the curvature. In [12], we analyzed the asymptotics of solutions to (6)–(7). However, the analysis was not complete. In particular, [12] only contains estimates of the first derivatives of P and Q, and this is not sufficient for computing curvature. The first step is to remedy this situation. Theorem 1. Consider a solution to (6)–(7). Then
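\[ \bigl\| (\partial_\theta^k \partial_t P)^2 + (\partial_\theta^{k+1} P)^2 + e^{2P} [(\partial_\theta^k \partial_t Q)^2 + (\partial_\theta^{k+1} Q)^2] \bigr\|_{C^0(S^1, \mathbb{R})} \le C_k \frac{(\ln t)^{2k}}{t} \tag{10} \]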
for t ≥ 2 and k ≥ 0. Remark 1. The above estimates together with Eqs. (6)–(7) yield estimates for the higher order derivatives involving an arbitrary number of time derivatives. In the polarized case, i.e. when Q = 0, there is an improved estimate. In fact, one does not need the logarithms. To see this, note that the case k = 0 of (10) was proved in [12] and that in the polarized case, the equation remains the same under differentiation with respect to θ . The proof is to be found at the beginning of Sect. 2. Define the proper time distance between the hypersurfaces defined by t0 and t to be τ (t0 , t), cf. (18). Then the decay estimate for the curvature is as follows. Theorem 2. Consider a metric of the form (5), where P , Q and λ satisfy (6)–(9). Assume furthermore that P and Q are not both independent of θ for all t. Then for every t0 > 0, there is a positive constant C(t0 ) and a T (t0 ) such that for t ≥ T (t0 ), |R|(t) ≤ C(t0 )τ −2 (t0 , t), where |R| is defined with respect to the areal time coordinate foliation.
(11)
Remark 2. When considering metrics of the form (5), the spatially homogeneous solutions have a special type of behaviour. In particular, if there is some spatial variation, λ tends to infinity linearly, but if there is no spatial variation, λ tends to infinity logarithmically, cf. [12]. Since P cannot grow faster than logarithmically and Q cannot grow faster than polynomially, cf. [12], it is clear that in the spatially inhomogeneous case, the factor in front of −dt 2 + dθ 2 tends to infinity exponentially whereas all the other factors tend to infinity at worst polynomially. In other words, all the expansion is in the factor in front of −dt 2 + dθ 2 . In the spatially homogeneous case, there is however no such clear distinction between the different factors, since λ tends to infinity logarithmically. For this reason we focus on the spatially inhomogeneous case and leave the homogeneous case to the reader. The proof is to be found at the end of Sect. 2. Finally, let us say something about the rescaled Riemannian metric on the hypersurfaces of constant areal time. The proof is also to be found at the end of Sect. 2. Proposition 1. Consider a metric of the form (5), where P , Q and λ satisfy (6)–(9). Assume furthermore that P and Q are not both independent of θ for all t. Let gˆ t be the Riemannian metric induced on the hypersurface of constant areal time t, and let gt = gˆ t /τ 2 (t0 , t). Then gt is a metric on T 3 , which can be written gt = f1 (t, θ )dθ 2 + f2 (t, θ )dδ 2 + f3 (t, θ )dδdσ + f4 (t, θ )dσ 2 . The family f1 (t, ·) of functions is bounded in C 1 and from below by a positive constant, for t ≥ t0 + 1. For i ≥ 2, k ≥ 0 and t ≥ t0 + 1, we have the following estimate,
\[ \| f_i(t, \cdot) \|_{C^k} \le C_k \frac{\{\ln[1 + \tau(t_0, t)]\}^{\alpha_k}}{\tau^2(t_0, t)}, \]
where $\alpha_k$ and $C_k$ are positive constants.
Remark 3. By the conclusions of the proposition and the Arzela-Ascoli theorem, there is, for any time sequence $t_k \to \infty$, a subsequence such that $f_1(t_k, \cdot)$ converges to a positive continuous function (the limit function will of course be Lipschitz). Furthermore, it is clear that the metric collapses in the two-torus direction defined by $\delta$ and $\sigma$. Finally, if it were possible to improve the estimate (10) in such a way that the logarithms do not occur, $f_1(t, \cdot)$ would be bounded in any $C^k$ norm for $t \ge t_0 + 1$. In particular, in the polarized Gowdy case, we have such bounds.
1.2. Bianchi VIII. For proofs of the statements made below, we refer the reader to [11] and the references cited therein. We define Bianchi VIII spacetimes in terms of initial data. Bianchi VIII initial data are given by $(G, g, k)$, where $G$ is a Lie group of Bianchi type VIII (to be defined below), $g$ is a left invariant metric, $k$ is a left invariant symmetric two tensor and $g$ and $k$ satisfy the constraint equations. In practice, $G$ can be assumed to be the universal covering group of $\mathrm{Sl}(2, \mathbb{R})$. However, in general, a Lie group $G$ is said to be of Bianchi type VIII if it has a basis $e_i$ of the Lie algebra satisfying $[e_i, e_j] = \gamma_{ij}^k e_k$, with $\gamma_{ij}^k = \epsilon_{ijl} n^{lk}$, where $\epsilon_{ijl}$ is antisymmetric in all its indices, $\epsilon_{123} = 1$, and $n^{lk}$ is diagonal with diagonal components $n_i$ such that $n_1 < 0$ and $n_2, n_3 > 0$. Given initial
data, there is a basis $e_i$ satisfying the conditions of the previous sentence such that $g$ is orthonormal with respect to this basis and $k$ is diagonal. We call such a basis a canonical basis. Such bases are not unique, but it turns out that $e_1$ is well defined up to a sign. Let $k_i = k(e_i, e_i)$. Then the initial data are said to be of NUT type if $k_2 = k_3$ and $n_2 = n_3$. Given initial data, one can construct a globally hyperbolic Lorentz manifold $(I \times G, \bar g)$, where $I$ is an open interval and $\bar g$ is of the form
\[ \bar g = -dt^2 + \sum_{i=1}^{3} a_i^2(t)\, \xi^i \otimes \xi^i, \tag{12} \]
where the $\xi^i$ are the duals of $e_i$, a canonical basis, and $a_i(0) = 1$. Finally $\mathrm{Ric}[\bar g] = 0$ and the Riemannian metric and the second fundamental form induced on $\Sigma = \{0\} \times G$ by $\bar g$ are given by $g$ and $k$, after identifying $G$ with $\Sigma$ in the obvious way. The development is future causally geodesically complete and independent of the canonical basis chosen. If the data are not of NUT type, the development is $C^2$-inextendible, in fact, the Kretschmann scalar (4) is unbounded to the past, cf. [8]. Finally, if the data are of NUT type, $a_2(t) = a_3(t)$ for all $t$.
We can, without loss of generality, assume $G$ to be $\widetilde{\mathrm{Sl}}(2, \mathbb{R})$, the universal covering group of $\mathrm{Sl}(2, \mathbb{R})$. Since $\widetilde{\mathrm{Sl}}(2, \mathbb{R})$ is diffeomorphic to $\mathbb{R}^3$, it is of interest to know when the geometry allows compactifications of the spatial hypersurfaces. In [11] we showed that if $\Gamma$ is a free and properly discontinuous subgroup of the isometry group of the initial data $(G, g, k)$, then $\{\mathrm{Id}\} \times \Gamma$ is a free and properly discontinuous subgroup of the isometry group of the development. By taking the quotient, we thus get developments such that the corresponding CMC hypersurfaces have topology $G/\Gamma$. Furthermore, the compact manifold $G/\Gamma$ must be Seifert fibred and $e_1$ corresponds to the Seifert fibre direction. We also proved that $a_1 = l_0 + O(t^{-1})$ in the NUT case and $a_1(t) = c_0 (\ln t)^{1/2} [1 + O(\ln \ln t / \ln t)]$ in the non-NUT case. Furthermore $a_i(t)/t \to \alpha_i > 0$ for $i = 2, 3$. Thus, after rescaling, the Seifert fibred spaces collapse as expected. Note that for each $p > 1$, there is a subgroup $\Gamma_p$ of $\widetilde{\mathrm{Sl}}(2, \mathbb{R})$ such that the quotient of $\widetilde{\mathrm{Sl}}(2, \mathbb{R})$ by $\Gamma_p$ (when $\Gamma_p$ is viewed as a group of isometries by acting on the left) is diffeomorphic to the unit tangent bundle of a compact orientable surface of genus $p$ with respect to some hyperbolic metric. Thus all initial data allow infinitely many different compactifications. However, the following holds.
Theorem 3. Consider a Bianchi VIII spacetime.
If it is of NUT type, there are constants $c_0, c_1 > 0$ and a $T > 0$, such that $c_0 t^{-3} \le |R|(t) \le c_1 t^{-3}$ for all $t \ge T$. If it is of non-NUT type, there is a constant $c_0 > 0$ such that
\[ \lim_{t \to \infty} t \ln t\, |R|(t) = c_0. \]
Furthermore, there are constants $c_i > 0$ and sequences $t_{i,k} \to \infty$, $i = 1, 2$, such that
\[ \lim_{k \to \infty} t_{i,k}^2 (\ln t_{i,k})^2 \kappa(t_{i,k}) = (-1)^i c_i, \]
where $\kappa$ is defined in (4).
The proofs of this result and the next are to be found in Sect. 3. One can then ask the question if the Ricci curvature of the spatial hypersurfaces behaves better. This turns out not to be the case in general, but there is in fact a time sequence along which it behaves well.
Proposition 2. Consider a Bianchi VIII spacetime which is not of NUT type. Then there are time sequences $t_{i,k} \to \infty$, $i = 1, 2$, and positive constants $c_i$ such that
\[ \lim_{k \to \infty} t_{1,k}^2 (\ln t_{1,k})^2 (R_{ij} R^{ij})(t_{1,k}) = c_1, \qquad t_{2,k}^4 (R_{ij} R^{ij})(t_{2,k}) \le c_2, \]
where the last inequality is valid for all $k$, and $R_{ij}(t)$ denotes the Ricci tensor of the spatial hypersurface of homogeneity defined by $t$, with metric induced by $\bar g$.
Remark 4. The time sequence $t_{2,k}$ corresponds to the induced Riemannian metric being locally rotationally symmetric. Due to the existence of the sequence $t_{1,k}$, the conjecture embodied in the weak asymptotics problem is not correct.
2. Curvature Estimates for Gowdy
The expanding direction of Gowdy spacetimes was considered in [12]. The leading order behaviour for the functions $P$, $Q$ and $\lambda$ was sorted out and (10) was proved to hold for $k = 0$. In this paper, we are interested in the behaviour of curvature quantities, and thus we need to concern ourselves with the asymptotic behaviour of higher order derivatives.
Proof (Theorem 1). By [12], we know that the conclusion holds for $k = 0$. Define
\[ A_{k,\pm} = \frac{t}{2} [(\partial_\theta^k \partial_t P \pm \partial_\theta^{k+1} P)^2 + e^{2P} (\partial_\theta^k \partial_t Q \pm \partial_\theta^{k+1} Q)^2], \qquad E_k(t) = \sum_{\pm} \| A_{k,\pm}(t, \cdot) \|_{C^0(S^1, \mathbb{R})}. \]
Let us make the inductive assumption that
\[ E_m^{1/2}(t) \le C_m (\ln t)^m \]
for $m = 0, \ldots, k - 1$ and $t \ge 2$. Observe that since (10) holds for $k = 0$, this holds for $k = 1$. Compute, for $k \ge 1$,
\[ (\partial_t \mp \partial_\theta) A_{k,\pm} = I_{1,k,\pm} + I_{2,k,\pm}, \]
where
\[ I_{1,k,\pm} = \frac{1}{2} \{ -(\partial_\theta^k P_t)^2 + (\partial_\theta^k P_\theta)^2 + e^{2P} [-(\partial_\theta^k Q_t)^2 + (\partial_\theta^k Q_\theta)^2] \} - t e^{2P} (P_t \pm P_\theta) [(\partial_\theta^k Q_t)^2 - (\partial_\theta^k Q_\theta)^2] + t e^{2P} (Q_t \pm Q_\theta) [(\partial_\theta^k Q_t \mp \partial_\theta^k Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta) - (\partial_\theta^k Q_t \pm \partial_\theta^k Q_\theta)(\partial_\theta^k P_t \mp \partial_\theta^k P_\theta)], \tag{13} \]
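and
\[ I_{2,k,\pm} = t \{ \partial_\theta^k [e^{2P} (Q_t^2 - Q_\theta^2)] - 2 e^{2P} (Q_t \partial_\theta^k Q_t - Q_\theta \partial_\theta^k Q_\theta) \} (\partial_\theta^k P_t \pm \partial_\theta^k P_\theta) - 2t e^{2P} \sum_{j=1}^{k-1} \binom{k}{j} [\partial_\theta^j P_t\, \partial_\theta^{k-j} Q_t - \partial_\theta^j P_\theta\, \partial_\theta^{k-j} Q_\theta] (\partial_\theta^k Q_t \pm \partial_\theta^k Q_\theta). \]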
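Fix $\theta$ and define $\gamma_\pm(u) = (u, \theta \pm u)$. For $f : \mathbb{R}_+ \times S^1 \to \mathbb{R}$, let $f_\pm = f \circ \gamma_\pm$. Note that $\partial_u f_\pm = [(\partial_t \pm \partial_\theta) f]_\pm$. Compute
\[ A_{k,\pm}[\gamma_\mp(u)] = A_{k,\pm}[\gamma_\mp(u_0)] + \int_{u_0}^{u} [(\partial_t \mp \partial_\theta) A_{k,\pm}]_\mp(t)\, dt. \tag{14} \]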
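Note that we have (13) and that each of the terms in $I_{1,k,\pm} \circ \gamma_\mp$ can be written, disregarding numerical factors, as a sum of terms of the form $f_{1\mp} f_{2\mp} \partial_u f_{3\mp}$. Here, the possibilities for $f_1$ are
\[ 1, \quad e^{2P}, \quad u e^{2P} (P_u \pm P_\theta), \quad u e^{2P} (Q_u \pm Q_\theta), \tag{15} \]
the corresponding estimates for $|f_1|$ and $|\partial_u f_{1\mp}|$ being, respectively,
\[ 1, \quad C e^{2P}, \quad C u^{1/2} e^{2P}, \quad C u^{1/2} e^{P} \qquad \text{and} \qquad 0, \quad C u^{-1/2} e^{2P_\mp}, \quad C e^{2P_\mp}, \quad C e^{P_\mp}, \]
where we have used (6)–(7) and the fact that (10) holds for $k = 0$. The possibilities for $f_2$ are
\[ (\partial_u \pm \partial_\theta) \partial_\theta^k P, \quad (\partial_u \pm \partial_\theta) \partial_\theta^k Q, \tag{16} \]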
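the corresponding estimates for $|f_2|$ and $|\partial_u f_{2\mp}|$ being, respectively,
\[ u^{-1/2} E_k^{1/2}, \quad u^{-1/2} e^{-P} E_k^{1/2} \qquad \text{and} \qquad u^{-1} E_k^{1/2} + \frac{(\ln u)^k}{u}, \quad e^{-P_\mp} \Bigl[ u^{-1} E_k^{1/2} + \frac{(\ln u)^k}{u} \Bigr], \tag{17} \]
up to numerical factors. The reason for the latter is that
\[ \partial_u [(\partial_u \pm \partial_\theta) \partial_\theta^k P]_\mp = [\partial_\theta^k (P_{uu} - P_{\theta\theta})]_\mp = \Bigl[ -\frac{1}{u} \partial_\theta^k P_t + \partial_\theta^k [e^{2P} (Q_t^2 - Q_\theta^2)] \Bigr]_\mp. \]
The first term on the right hand side satisfies a better estimate than the second to last expression in (17), and the terms resulting from the second term when at least one derivative hits the factor $e^{2P}$ are also better. What remains to be considered are terms of the form
\[ [e^{2P} (\partial_\theta^{j_1} Q_t\, \partial_\theta^{j_2} Q_t - \partial_\theta^{j_1} Q_\theta\, \partial_\theta^{j_2} Q_\theta)]_\mp, \]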
where $j_1 + j_2 = k$. These terms can be estimated by the second to last expression in (17) due to the induction hypothesis. The argument for the second possibility for $f_2$ is similar. The possibilities for $f_3$ are $\partial_\theta^k P$, $\partial_\theta^k Q$, and the corresponding estimates for $|f_3|$ are
\[ \frac{(\ln u)^{k-1}}{u^{1/2}}, \quad e^{-P} \frac{(\ln u)^{k-1}}{u^{1/2}} \]
due to the induction hypothesis (note that $k \ge 1$). Consider
\[ \int_{u_0}^{u} I_{1,k,\pm} \circ \gamma_\mp(t)\, dt. \]
Up to numerical factors, this integral can be written as a sum of terms of the form
\[ \int_{u_0}^{u} f_{1\mp} f_{2\mp} \partial_t f_{3\mp}\, dt = [f_{1\mp} f_{2\mp} f_{3\mp}]_{u_0}^{u} - \int_{u_0}^{u} [\partial_t f_{1\mp}\, f_{2\mp} f_{3\mp} + f_{1\mp}\, \partial_t f_{2\mp}\, f_{3\mp}]\, dt. \]
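Note that not all combinations occur and that when taking the products, all factors of $e^{P}$ in the estimates cancel. Using the definition of $I_{1,k,\pm}$ and the estimates written down above, we get
\[ \Bigl| \int_{u_0}^{u} I_{1,k,\pm} \circ \gamma_\mp(t)\, dt \Bigr| \le C + C \frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C \int_{u_0}^{u} \Bigl[ \frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t} \Bigr] dt. \]
Let us turn to $I_{2,k,\pm}$. Up to numerical factors, the first term can be written as a sum of terms of the form
\[ t\, \partial_\theta^{j_1} P \cdots \partial_\theta^{j_l} P\; e^{2P} (\partial_\theta^{m_1} Q_t\, \partial_\theta^{m_2} Q_t - \partial_\theta^{m_1} Q_\theta\, \partial_\theta^{m_2} Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta), \]
where $j_i \ge 1$, $m_i \le k - 1$ and $j_1 + \cdots + j_l + m_1 + m_2 = k$. Using the induction hypothesis, this can be estimated by
\[ C \frac{(\ln t)^{k-l}}{t^{(l+1)/2}} E_k^{1/2}(t). \]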
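If $l \ge 1$, this estimate is as good as what we already have, so let us consider terms of the form
\[ t e^{2P} (\partial_\theta^{m_1} Q_t\, \partial_\theta^{m_2} Q_t - \partial_\theta^{m_1} Q_\theta\, \partial_\theta^{m_2} Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta), \]
where $m_1 + m_2 = k$ but $m_i \le k - 1$. Note that
\[ \partial_\theta^{m_1} Q_t\, \partial_\theta^{m_2} Q_t - \partial_\theta^{m_1} Q_\theta\, \partial_\theta^{m_2} Q_\theta = \frac{1}{2} [(\partial_\theta^{m_1} Q_t \pm \partial_\theta^{m_1} Q_\theta)(\partial_\theta^{m_2} Q_t \mp \partial_\theta^{m_2} Q_\theta) + (\partial_\theta^{m_1} Q_t \mp \partial_\theta^{m_1} Q_\theta)(\partial_\theta^{m_2} Q_t \pm \partial_\theta^{m_2} Q_\theta)]. \]
In other words, we need only concern ourselves with terms of the form
\[ t e^{2P} (\partial_\theta^{m_1} Q_t \pm \partial_\theta^{m_1} Q_\theta)(\partial_\theta^{m_2} Q_t \mp \partial_\theta^{m_2} Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta). \]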
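The polarization identity used just above is elementary and can be checked in one line. In the sketch below the letters $a$, $b$, $c$, $d$ are shorthand introduced here only for illustration; they stand for $\partial_\theta^{m_1} Q_t$, $\partial_\theta^{m_1} Q_\theta$, $\partial_\theta^{m_2} Q_t$ and $\partial_\theta^{m_2} Q_\theta$ respectively.

```latex
% a = \partial_\theta^{m_1} Q_t,  b = \partial_\theta^{m_1} Q_\theta,
% c = \partial_\theta^{m_2} Q_t,  d = \partial_\theta^{m_2} Q_\theta  (illustrative shorthand).
\[
(a \pm b)(c \mp d) + (a \mp b)(c \pm d)
= (ac \mp ad \pm bc - bd) + (ac \pm ad \mp bc - bd)
= 2 (ac - bd).
\]
```

The point of the rewriting appears to be that each product pairs a derivative along one null direction with a derivative along the other, which is the structure the integration by parts along $\gamma_\mp$ exploits.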
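We can then argue as before, with $f_1 = t e^{2P} (\partial_\theta^{m_1} Q_t \pm \partial_\theta^{m_1} Q_\theta)$, $f_2 = (\partial_\theta^k P_t \pm \partial_\theta^k P_\theta)$ and $f_3 = \partial_\theta^{m_2} Q$. Note that since $m_1 + m_2 = k$ and $m_i \le k - 1$, we have $m_i \ge 1$. The arguments for the remaining terms in $I_{2,k,\pm}$ are similar, and by (13) we get
\[ \int_{u_0}^{u} [(\partial_t \mp \partial_\theta) A_{k,\pm}]_\mp(t)\, dt \le C + C \frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C \int_{u_0}^{u} \Bigl[ \frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t} \Bigr] dt. \]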
Taking the supremum of the right hand side in (14), we thus get
\[ A_{k,\pm}[\gamma_\mp(u)] \le \| A_{k,\pm}(u_0, \cdot) \|_{C^0(S^1, \mathbb{R})} + C + C \frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C \int_{u_0}^{u} \Bigl[ \frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t} \Bigr] dt. \]
Taking the supremum of the left hand side (note that there is a $\theta$ hidden in $\gamma_\pm$) and adding the two estimates, we get
\[ E_k(u) \le C + C \frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C \int_{u_0}^{u} \Bigl[ \frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t} \Bigr] dt. \]
Note that
\[ C \frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) \le \frac{1}{2} C^2 \frac{(\ln u)^{2k-2}}{u} + \frac{1}{2} E_k(u). \]
Defining $\hat E_k(u) = E_k(u) + (\ln u)^{2k}$, we thus get the estimate
\[ \hat E_k(u) \le C + C \int_{u_0}^{u} \Bigl[ \frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t} \Bigr] dt \le C + C \int_{u_0}^{u} \frac{(\ln t)^{k-1}}{t} \hat E_k^{1/2}(t)\, dt. \]
By a Grönwall’s lemma type argument, we conclude that $\hat E_k(u) \le C_k (\ln u)^{2k}$ for $u \ge u_0$. This completes the induction proof.
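The Grönwall-type step can be unpacked as follows. This is a sketch of one standard way to carry it out, not necessarily the argument the author had in mind. The auxiliary function $F$ is a name introduced here, and the sketch assumes $k \ge 1$, $C > 0$ and $u_0 > 1$, so that $\ln u$ is bounded from below by a positive constant on $[u_0, \infty)$.

```latex
% F is the right-hand side of the last inequality for \hat E_k (hypothetical name).
\[
F(u) = C + C \int_{u_0}^{u} \frac{(\ln t)^{k-1}}{t}\, \hat E_k^{1/2}(t)\, dt,
\qquad \hat E_k(u) \le F(u).
\]
% Differentiate and use \hat E_k \le F; since F \ge C > 0 one may divide by F^{1/2}.
\[
F'(u) = C\, \frac{(\ln u)^{k-1}}{u}\, \hat E_k^{1/2}(u)
\le C\, \frac{(\ln u)^{k-1}}{u}\, F^{1/2}(u)
\quad \Longrightarrow \quad
\frac{d}{du} F^{1/2}(u) \le \frac{C}{2}\, \frac{(\ln u)^{k-1}}{u}.
\]
% Integrate from u_0 to u, using d/du (\ln u)^k = k (\ln u)^{k-1} / u.
\[
F^{1/2}(u) \le C^{1/2} + \frac{C}{2k} \bigl[ (\ln u)^{k} - (\ln u_0)^{k} \bigr]
\quad \Longrightarrow \quad
\hat E_k(u) \le F(u) \le C_k (\ln u)^{2k}, \qquad u \ge u_0 .
\]
```

In the last step the constant term $C^{1/2}$ is absorbed using $(\ln u)^k \ge (\ln u_0)^k > 0$, which is where the assumption $u_0 > 1$ enters.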
Before we come to the curvature estimate, let us define
\[ \tau(t, t_0) = \sup_{\gamma} \int_{t_0}^{t} [-\langle \dot\gamma(s), \dot\gamma(s) \rangle]^{1/2}\, ds, \tag{18} \]
where the supremum is taken over smooth timelike curves $\gamma(s) = [s, x(s)]$, where $x$ takes values on $T^3$. Note that for an arbitrary smooth timelike curve joining the hypersurface corresponding to $t_0$ with the hypersurface corresponding to $t$, one can change the parameterization so that it is of the above mentioned form.
Proposition 3. Consider a metric of the form (5), where $P$, $Q$ and $\lambda$ satisfy (6)–(9). Assume furthermore that $P$ and $Q$ are not both independent of $\theta$ for all $t$. Given $t_0 > 0$ there are positive constants $c(t_0)$ and $C(t_0)$ such that for $t \ge t_0 + 1$,
\[ c(t_0)\, t^{-1/4} e^{\lambda(t)/4} \le \tau(t, t_0) \le C(t_0)\, t^{-1/4} e^{\lambda(t)/4}. \tag{19} \]
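Proof. Note that since (10) holds for $k = 0$, $|\lambda_\theta|$ is bounded to the future, and consequently,
\[ |\lambda(t, \theta) - \lambda(t)| \le C(t_0) \tag{20} \]
for $t \ge t_0$. Let us estimate
\[ t^{1/4} e^{-\lambda(t)/4} \int_{t_0}^{t} [-\langle \dot\gamma(s), \dot\gamma(s) \rangle]^{1/2}\, ds \le \int_{t_0}^{t} \Bigl( \frac{t}{s} \Bigr)^{1/4} \exp\{[\lambda(s, \theta(s)) - \lambda(t)]/4\}\, ds \le C(t_0) \int_{t_0}^{t} \Bigl( \frac{t}{s} \Bigr)^{1/4} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds. \]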
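However, by Theorem 1.6 of [12] we have
\[ |\lambda_t(t) - c_0| \le C(t_0)\, t^{-1} \tag{21} \]
for $t \ge t_0$, where $c_0 > 0$, assuming the solution is not independent of $\theta$. Thus
\[ \lambda(s) - \lambda(t) \le -c_0 (t - s) + C(t_0) \ln \frac{t}{s}. \]
We conclude that
\[ t^{1/4} e^{-\lambda(t)/4} \int_{t_0}^{t} [-\langle \dot\gamma(s), \dot\gamma(s) \rangle]^{1/2}\, ds \le C(t_0) \int_{t_0}^{t} \Bigl( \frac{t}{s} \Bigr)^{\alpha(t_0)} \exp[-c_0 (t - s)/4]\, ds = C(t_0) \int_{t_0/t}^{1} u^{-\alpha(t_0)} \exp[-c_0 t (1 - u)/4]\, t\, du. \]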
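If $t \le 2t_0$, this integral is bounded. If $t \ge 2t_0$ we can divide the integral into two parts. Let us estimate
\[ \int_{1/2}^{1} u^{-\alpha(t_0)} \exp[-c_0 t (1 - u)/4]\, t\, du \le 2^{\alpha(t_0)} \int_{1/2}^{1} \exp[-c_0 t (1 - u)/4]\, t\, du \le 2^{\alpha(t_0)} \frac{4}{c_0}. \]
We also have
\[ \int_{t_0/t}^{1/2} u^{-\alpha(t_0)} \exp[-c_0 t (1 - u)/4]\, t\, du \le \frac{4}{c_0} \Bigl( \frac{t}{t_0} \Bigr)^{\alpha(t_0)} \exp[-c_0 t/8], \]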
which is bounded by a constant depending on $t_0$. Note that the constants involved in the arguments above are independent of the curve $\gamma$. Thus $\tau(t, t_0) \le C(t_0)\, t^{-1/4} e^{\lambda(t)/4}$. In order to get the opposite inequality, consider the curve $\gamma(s) = (s, x_0)$, where $x_0$ is a fixed point on $T^3$. We get
\[ t^{1/4} e^{-\lambda(t)/4} \int_{t_0}^{t} [-\langle \dot\gamma(s), \dot\gamma(s) \rangle]^{1/2}\, ds = \int_{t_0}^{t} \Bigl( \frac{t}{s} \Bigr)^{1/4} \exp\{[\lambda(s, \theta_0) - \lambda(t)]/4\}\, ds \ge c(t_0) \int_{t_0}^{t} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds, \]
where $c(t_0)$ is a positive constant. Assuming $t \ge t_0 + 1$, we can use (21) to prove that
\[ \int_{t_0}^{t} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds \ge \int_{t - 1/2}^{t} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds \ge c(t_0) > 0. \]
The proposition follows.
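Proof (Theorem 2). Note that there is no loss of generality in choosing the vectors orthogonal to $e_0$ to be
\[ e_1 = t^{1/4} e^{-\lambda/4} \partial_\theta, \quad e_2 = t^{-1/2} e^{-P/2} \partial_\sigma, \quad e_3 = t^{-1/2} e^{P/2} (-Q \partial_\sigma + \partial_\delta). \]
It will be convenient to introduce the notation $\phi = t^{1/4} e^{-\lambda/4}$. Note that
\[ c(t_0) \le \phi(t, \theta)\, \tau(t_0, t) \le C(t_0) \tag{22} \]
for $t \ge t_0 + 1$ and $\theta \in S^1$ due to (20) and (19). Let $\Gamma^\alpha_{\beta\gamma} e_\alpha = \nabla_{e_\beta} e_\gamma$. Then
\[ \langle R_{e_\mu e_\nu} e_\alpha, e_\beta \rangle = e_\nu(\Gamma^\delta_{\mu\alpha}) \eta_{\delta\beta} - e_\mu(\Gamma^\delta_{\nu\alpha}) \eta_{\delta\beta} + \Gamma^\delta_{\mu\alpha} \Gamma^\kappa_{\nu\delta} \eta_{\kappa\beta} - \Gamma^\delta_{\nu\alpha} \Gamma^\kappa_{\mu\delta} \eta_{\kappa\beta} + \eta_{\delta\beta} \gamma^\kappa_{\mu\nu} \Gamma^\delta_{\kappa\alpha}, \]
where $\eta$ is the Minkowski metric and where $[e_\alpha, e_\beta] = \gamma^\kappa_{\alpha\beta} e_\kappa$ defines $\gamma^\kappa_{\mu\nu}$. The above formulas indicate what sign conventions we are using. One can check that all the terms except $e_\nu(\Gamma^\delta_{\mu\alpha}) \eta_{\delta\beta} - e_\mu(\Gamma^\delta_{\nu\alpha}) \eta_{\delta\beta}$ can be estimated by $\phi^2$. Furthermore, due to the estimate (10), one sees that the only problem consists in second derivatives of $\lambda$. However, one can check that these derivatives only occur in the combination $\lambda_{tt} - \lambda_{\theta\theta}$ which is $O(t^{-1/2})$ due to (10) and the equations. This proves that $|R| \le C \phi^2$, which together with (22) proves (11).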
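As a sanity check on the spatial frame chosen at the start of this proof, one can verify directly from the metric (5) that $e_1$, $e_2$, $e_3$ are orthonormal. The computation below is a sketch added for illustration and uses only the components of (5); $g(\cdot, \cdot)$ denotes the metric (5).

```latex
% Components read off from (5):
% g(\partial_\theta,\partial_\theta) = t^{-1/2} e^{\lambda/2},
% g(\partial_\sigma,\partial_\sigma) = t e^{P},
% g(\partial_\sigma,\partial_\delta) = t e^{P} Q,
% g(\partial_\delta,\partial_\delta) = t (e^{P} Q^{2} + e^{-P}).
\[
g(e_1, e_1) = t^{1/2} e^{-\lambda/2} \cdot t^{-1/2} e^{\lambda/2} = 1,
\qquad
g(e_2, e_2) = t^{-1} e^{-P} \cdot t e^{P} = 1,
\]
\[
g(e_3, e_3) = t^{-1} e^{P} \cdot t \bigl[ Q^{2} e^{P} - 2 Q^{2} e^{P} + e^{P} Q^{2} + e^{-P} \bigr] = 1,
\qquad
g(e_2, e_3) = t^{-1} \cdot t \bigl[ -Q e^{P} + e^{P} Q \bigr] = 0.
\]
% e_1 is orthogonal to e_2 and e_3 because (5) has no d\theta d\sigma or d\theta d\delta terms.
```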
Proof (Proposition 1). Let f1 = φ −2 τ −2 (t0 , t), using the notation of the previous proof. Due to (22), we conclude that f1 (τ, ·) is bounded from above and from below by positive constants. Since λθ is bounded, due to (10) for k = 0, ∂θ f1 is bounded. The conclusions concerning f1 follow. Note that if we had an estimate of the form (10) without the logarithms, ∂θk λ would be bounded to the future for any k ≥ 1, and consequently f1 (t, ·) would be bounded in any C k norm for t ≥ t0 + 1. Due to the results of [12], P does not grow faster than logarithmically and Q does not go to infinity faster than polynomially. Combining this information with (10), we conclude that ∂θk P converges to zero for any k ≥ 1 and that ∂θk Q does not grow faster than polynomially. Due to (19) and the fact that λ = c0 t + O(ln t), where c0 > 0, cf. (21), we conclude that for large t, t and ln[1 + τ (t0 , t)] are equivalent. Adding these pieces together, we get the conclusions of the proposition. 3. Bianchi VIII In this section we prove Theorem 3 and Proposition 2. The results necessary in order to carry out the computations are all taken from [11]. However, we refer the reader to [10] and the appendices of [9] for more details on curvature computations in the current setting. Proof (Theorem 3). Let e0 = ∂t and ei = (ai )−1 ei (no summation) for i = 1, 2, 3, with terminology as in Subsect. 1.2. Let Greek indices range from 0 to 3 and Latin indices δ e . Due to the form (12) and the fact that e is a from 1 to 3. Define [eα , eβ ] = γαβ δ i canonical basis, we have γij0 = γ0i0 = 0. Furthermore, we can define n, θ and k by i γijk = ij l nlk , γ0j = −θij and k(ei , ej ) = ∇ei e0 , ej .
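Then $n^{lk}$ is diagonal, and the diagonal components will be denoted by $n_i$. Furthermore $\theta_{ij}$ is diagonal, and coincides with $-k(e_i, e_j)$. In what follows, we shall raise and lower Latin indices with $\delta_{ij}$, and we shall consequently not be very careful when it comes to indices being upstairs or downstairs. Let $\theta$ denote the trace of $\theta_{ij}$ and let $\sigma_{ij}$ be the traceless part. Since $\theta$ is never zero in the case of Bianchi VIII, cf. Lemma 21.5 of [9], we can define
\[ \Sigma_{ij} = \frac{\sigma_{ij}}{\theta}, \quad N_i = \frac{n_i}{\theta}, \quad \Sigma_+ = \frac{3}{2} (\Sigma_{22} + \Sigma_{33}), \quad \Sigma_- = \frac{\sqrt{3}}{2} (\Sigma_{22} - \Sigma_{33}). \]
The relevant curvature quantities can be written
\[ \kappa = R_{\alpha\beta\gamma\delta} R^{\alpha\beta\gamma\delta} = 8 (E_{ij} E^{ij} - H_{ij} H^{ij}), \qquad |R|^2 = 8 (E_{ij} E^{ij} + H_{ij} H^{ij}), \]
where
\[ E_{ij} = \frac{1}{3} \theta \sigma_{ij} - \Bigl( \sigma_i{}^k \sigma_{kj} - \frac{1}{3} \sigma_{kl} \sigma^{kl} \delta_{ij} \Bigr) + s_{ij}, \]
\[ H_{ij} = -3 \sigma^k{}_{(i} n_{j)k} + n_{kl} \sigma^{kl} \delta_{ij} + \frac{1}{2} \mathrm{tr}(n) \sigma_{ij}, \]
\[ s_{ij} = b_{ij} - \frac{1}{3} \mathrm{tr}(b) \delta_{ij}, \qquad b_{ij} = 2 n_i{}^k n_{kj} - \mathrm{tr}(n) n_{ij}, \]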
cf. p. 19 and p. 40 of [15]. Note that $E_{ij}$ and $H_{ij}$ define diagonal traceless matrices. In order to relate these expressions to the variables defined above, it will be convenient to define $\tilde H_i = H_{ii}/\theta^2$, $\tilde E_i = E_{ii}/\theta^2$. Then
\[ \tilde H_1 = N_1 \Sigma_+ + \frac{1}{\sqrt{3}} (N_2 - N_3) \Sigma_-, \]
\[ \tilde H_2 = -\frac{1}{2} N_2 (\Sigma_+ + \sqrt{3} \Sigma_-) + \frac{1}{2} (N_3 - N_1) \Bigl( \Sigma_+ - \frac{1}{\sqrt{3}} \Sigma_- \Bigr), \]
\[ \tilde E_2 - \tilde E_3 = \frac{2}{\sqrt{3}} \Sigma_- (1 - 2 \Sigma_+) + (N_2 - N_3)(N_2 + N_3 - N_1), \]
\[ \tilde E_2 + \tilde E_3 = \frac{2}{9} \Sigma_+ (1 + \Sigma_+) - \frac{2}{9} \Sigma_-^2 - \frac{2}{3} N_1^2 + \frac{1}{3} (N_2 - N_3)^2 + \frac{1}{3} N_1 (N_2 + N_3). \]
Note that all other components of $\tilde E_i$ and $\tilde H_i$ can be computed from this due to the fact that $E_{ij}$ and $H_{ij}$ both define traceless matrices.
Let us consider the case when the initial data are of NUT type. The relevant statements concerning the asymptotics are then to be found on pp. 1955–1956 of [11]. In this case $\Sigma_- = 0$, $N_2 = N_3$ and
\[ \Bigl| \Sigma_+ - \frac{1}{2} \Bigr| + \Bigl| (N_1 N_2)(\tau) + \frac{1}{4} \Bigr| + \bigl| N_2 e^{-3\tau/2} - c_N \bigr| \le C e^{-3\tau/2} \]
for some positive constants $c_N$ and $C$ and for $\tau \ge 0$. Furthermore, there are positive constants $c_\theta$, $C$ such that
\[ \Bigl| \frac{1}{\theta(\tau)} - c_\theta e^{3\tau/2} \Bigr| \le C \]
for $\tau \ge 0$. Finally, $t$ and $\tau$ are related through $|t(\tau) - 2 c_\theta e^{3\tau/2}| \le C (1 + \tau)$ for all $\tau \ge 0$. We conclude that $\tilde H_i$ and $\tilde E_i$ are all $O(e^{-3\tau/2}) = O(\theta)$. We conclude that $|R|^2 = O(\theta^6) = O(t^{-6})$. This proves the upper bound in the theorem. In order to prove the lower bound, we need only observe that
\[ \lim_{t \to \infty} t \tilde H_1 = -\frac{c_\theta}{4 c_N} \ne 0. \]
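The value of this limit can be traced back to the NUT asymptotics quoted above. The following is a sketch under the assumption that those estimates are read as $\Sigma_+ \to 1/2$, $N_1 N_2 \to -1/4$, $N_2 e^{-3\tau/2} \to c_N$ and $t e^{-3\tau/2} \to 2 c_\theta$; it is added for illustration and is not taken from [11].

```latex
% In the NUT case \Sigma_- = 0, so \tilde H_1 = N_1 \Sigma_+ .
\[
t \tilde H_1 = t N_1 \Sigma_+
= \bigl( t e^{-3\tau/2} \bigr) \cdot \frac{(N_1 N_2)\, \Sigma_+}{N_2 e^{-3\tau/2}}
\longrightarrow 2 c_\theta \cdot \frac{(-\tfrac14) \cdot \tfrac12}{c_N}
= -\frac{c_\theta}{4 c_N}.
\]
```

Since $|R|^2 \ge 8 \theta^4 \tilde H_1^2$ and $t \theta$ converges to a positive constant under the same reading, a nonzero limit of $t \tilde H_1$ yields a lower bound of order $t^{-3}$ for $|R|$, consistent with the statement of Theorem 3.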
Let us consider the general case. The necessary information is contained in Proposition 6, Corollary 7 and Corollary 8 of [11]. Note that in these results,
\[ h := \Sigma_-^2 + \frac{3}{4} (N_2 - N_3)^2, \quad v := -N_1 (N_2 + N_3) - \frac{1}{2}, \quad u := \Sigma_+ - \frac{1}{2}. \]
We have
\[ \Sigma_-^2 + \frac{3}{4} (N_2 - N_3)^2 = \frac{1}{4\tau} + O\Bigl( \frac{\ln \tau}{\tau^2} \Bigr), \qquad \Sigma_+ = \frac{1}{2} + O(\tau^{-1}) \tag{23} \]
and
\[ N_1 (N_2 + N_3) = -\frac{1}{2} + O(\tau^{-2}). \]
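By (82) of [11], we also have
\[ N_2 = c_N \tau^{-3/4} e^{3\tau/2} \Bigl[ 1 + O\Bigl( \frac{\ln \tau}{\tau} \Bigr) \Bigr] \tag{24} \]
for some positive constant $c_N$. In combination with the above equations, this proves that $N_1$ converges to zero exponentially. In view of the above equations, we have
\[ \tilde H_1 = O(\tau^{-1}), \]
\[ \tilde H_2 = -\frac{1}{2} N_2 (\Sigma_+ + \sqrt{3} \Sigma_-) + \frac{1}{2} N_2 \Bigl( \Sigma_+ - \frac{1}{\sqrt{3}} \Sigma_- \Bigr) + O(\tau^{-1/2}) = -\frac{2}{\sqrt{3}} N_2 \Sigma_- + O(\tau^{-1/2}), \]
\[ \tilde E_2 - \tilde E_3 = 2 N_2 (N_2 - N_3) + O(\tau^{-1}), \qquad \tilde E_2 + \tilde E_3 = O(\tau^{-1}). \]
Thus
\[ \theta^{-4} |R|^2 = 8 \Bigl[ \frac{3}{2} (\tilde E_2 + \tilde E_3)^2 + \frac{1}{2} (\tilde E_2 - \tilde E_3)^2 + \tilde H_1^2 + \tilde H_2^2 + (\tilde H_1 + \tilde H_2)^2 \Bigr] = 8 \Bigl[ 2 N_2^2 (N_2 - N_3)^2 + \frac{8}{3} N_2^2 \Sigma_-^2 + N_2 O(\tau^{-1}) \Bigr] = \frac{64}{3} N_2^2 \Bigl[ \Sigma_-^2 + \frac{3}{4} (N_2 - N_3)^2 + N_2^{-1} O(\tau^{-1}) \Bigr]. \]
Taking (23) into account, we conclude that
\[ \lim_{\tau \to \infty} \tau N_2^{-2} \theta^{-4} |R|^2 = \frac{16}{3}. \tag{25} \]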
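How (25) follows from (23) and (24) can be spelled out in two steps. The following is a sketch using only the relations quoted above; the shorthand $E_i$, $H_i$ for the diagonal entries of the traceless matrices is introduced here for illustration.

```latex
% For a diagonal traceless matrix, E_1 = -(E_2 + E_3), so
\[
E_{ij} E^{ij} = (E_2 + E_3)^2 + E_2^2 + E_3^2
= \tfrac{3}{2} (E_2 + E_3)^2 + \tfrac{1}{2} (E_2 - E_3)^2 ,
\]
% and in the same way H_{ij} H^{ij} = H_1^2 + H_2^2 + (H_1 + H_2)^2.
% Multiplying the last expression for \theta^{-4} |R|^2 by \tau N_2^{-2}:
\[
\tau N_2^{-2} \theta^{-4} |R|^2
= \frac{64}{3} \Bigl[ \tau \Bigl( \Sigma_-^2 + \tfrac{3}{4} (N_2 - N_3)^2 \Bigr) + N_2^{-1} O(1) \Bigr]
\longrightarrow \frac{64}{3} \cdot \frac{1}{4} = \frac{16}{3},
\]
% since \tau (\Sigma_-^2 + \tfrac34 (N_2 - N_3)^2) \to 1/4 by (23) and N_2^{-1} \to 0 by (24).
```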
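On p. 1972 of [11], it is shown that there is a positive constant $\alpha_\theta$ such that
\[ \frac{1}{\theta} = \frac{\alpha_\theta}{\tau^{1/4}} e^{3\tau/2} \Bigl[ 1 + O\Bigl( \frac{\ln \tau}{\tau} \Bigr) \Bigr], \qquad t = \frac{2 \alpha_\theta}{\tau^{1/4}} e^{3\tau/2} \Bigl[ 1 + O\Bigl( \frac{\ln \tau}{\tau} \Bigr) \Bigr]. \]
Combining this with (24), we conclude that there are positive constants $c_i$, $i = 1, 2, 3$, such that
\[ \lim_{\tau \to \infty} t^{-2}(\tau)\, \tau N_2^2(\tau) = c_1, \qquad \lim_{\tau \to \infty} t(\tau) \theta(\tau) = c_2, \qquad \lim_{\tau \to \infty} \tau [\ln t(\tau)]^{-1} = c_3. \]
Combining this with (25), we conclude that there is a positive constant $c_0$ such that
\[ \lim_{t \to \infty} t \ln t\, |R|(t) = c_0. \]
Since there are sequences $\tau_{i,k} \to \infty$, $i = 1, 2$, such that $\Sigma_-(\tau_{1,k}) = 0$ and $(N_2 - N_3)(\tau_{2,k}) = 0$, cf. [11], the conclusions concerning the Kretschmann scalar follow by similar arguments.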
Proof (Proposition 2). Let Ric denote the Ricci curvature of a spatial hypersurface of homogeneity. One can compute that
\[ \mathrm{Ric}(e_i, e_j) = 2 n_i{}^k n_{kj} - \mathrm{tr}(n) n_{ij} - n_{kl} n^{kl} \delta_{ij} + \frac{1}{2} [\mathrm{tr}(n)]^2 \delta_{ij}, \]
with terminology as in the proof of Theorem 3. Let $R_i = \mathrm{Ric}(e_i, e_i)$. We get
\[ \theta^{-2} R_1 = \frac{1}{2} N_1^2 - \frac{1}{2} (N_2 - N_3)^2, \qquad \theta^{-2} R_2 = \frac{1}{2} N_2^2 - \frac{1}{2} (N_1 - N_3)^2 \]
and similarly for $R_3$. We see that $\theta^{-2} R_1$ tends to zero and that
\[ \theta^{-2} R_2 = \frac{1}{2} (N_2 + N_3)(N_2 - N_3) - \frac{1}{2} N_1^2 + N_1 N_3. \]
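The expression for $R_1$ can be checked by inserting the diagonal matrix $n = \mathrm{diag}(n_1, n_2, n_3)$ into the formula for the Ricci curvature. The short computation below is a sketch added for illustration; dividing by $\theta^2$ then gives the stated formula in terms of the $N_i$, and $R_2$ follows by permuting indices.

```latex
\[
R_1 = 2 n_1^2 - (n_1 + n_2 + n_3) n_1 - (n_1^2 + n_2^2 + n_3^2) + \tfrac{1}{2} (n_1 + n_2 + n_3)^2
\]
\[
\phantom{R_1} = \tfrac{1}{2} n_1^2 - \tfrac{1}{2} n_2^2 - \tfrac{1}{2} n_3^2 + n_2 n_3
= \tfrac{1}{2} n_1^2 - \tfrac{1}{2} (n_2 - n_3)^2 .
\]
% Expanding \tfrac12 N_2^2 - \tfrac12 (N_1 - N_3)^2 in the same way gives
% \tfrac12 (N_2 + N_3)(N_2 - N_3) - \tfrac12 N_1^2 + N_1 N_3 .
```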
The statement concerning $R_3$ is similar. Note that there are time sequences $\tau_{i,k} \to \infty$, $i = 1, 2$, such that
\[ \lim_{k \to \infty} (N_2 - N_3)(\tau_{1,k})\, \tau_{1,k}^{1/2} = c_0, \]
for some positive constant $c_0$, and such that $(N_2 - N_3)(\tau_{2,k}) = 0$. Once one has made the above observations, the argument is similar to the end of the proof of Theorem 3.
Acknowledgement. I am grateful to Michael Anderson for discussions that led to me considering these problems.
References
1. Anderson, M.: Scalar curvature and geometrization conjectures for 3-manifolds. Comparison Geometry, Vol. 30, MSRI Publications, Cambridge: Cambridge University Press, 1997, pp. 49–82
2. Anderson, M.: On long-time evolution in general relativity and geometrization of 3-manifolds. Commun. Math. Phys. 222, 533–567 (2001)
3. Andersson, L., Moncrief, V.: Future complete vacuum spacetimes. In: Chruściel, P.T., Friedrich, H. (eds.), The Einstein equations and the large scale behavior of gravitational fields, Basel: Birkhäuser, 2004, pp. 299–330
4. Choquet-Bruhat, Y., Moncrief, V.: Future Global in Time Einsteinian Spacetimes with U(1) Isometry Group. Ann. Henri Poincaré 2, 1001–1064 (2001)
5. Choquet-Bruhat, Y.: Future complete U(1) symmetric Einsteinian spacetimes, the unpolarized case. In: Chruściel, P.T., Friedrich, H. (eds.), The Einstein equations and the large scale behavior of gravitational fields, Basel: Birkhäuser, 2004, pp. 251–298
6. Choquet-Bruhat, Y., Cotsakis, S.: Global hyperbolicity and completeness. J. Geom. Phys. 43, 345–350 (2002)
7. Fischer, A., Moncrief, V.: The reduced Einstein equations and the conformal volume collapse of 3-manifolds. Class. Quantum Grav. 18, 4493–4515 (2001)
8. Ringström, H.: Curvature blow up in Bianchi VIII and IX vacuum spacetimes. Class. Quantum Grav. 17, 713–731 (2000)
9. Ringström, H.: The Bianchi IX attractor. Ann. Henri Poincaré 2, 405–500 (2001)
10. Ringström, H.: The future asymptotics of Bianchi VIII vacuum solutions. Class. Quantum Grav. 18, 3791–3823 (2001)
11. Ringström, H.: Future asymptotics expansions of Bianchi VIII vacuum metrics. Class. Quantum Grav. 20, 1943–1989 (2003)
12. Ringström, H.: On a wave map equation arising in General Relativity. Commun. Pure Appl. Math. 57, 657–703 (2004)
13. Ringström, H.: Data at the moment of infinite expansion for polarized Gowdy. Class. Quantum Grav. 22, 1647–1653 (2005)
14. Scott, P.: The geometries of 3-manifolds. Bull. London Math. Soc. 15, 401–487 (1983)
15. Wainwright, J., Ellis, G.F.R. (eds.): Dynamical systems in cosmology. Cambridge: Cambridge University Press, 1997
Communicated by G.W. Gibbons
Commun. Math. Phys. 264, 631–656 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1523-x
Communications in
Mathematical Physics
Forbidden Gap Argument for Phase Transitions Proved by Means of Chessboard Estimates
Marek Biskup1, Roman Kotecký2
1 Department of Mathematics, UCLA, Los Angeles, California, USA
2 Center for Theoretical Study, Charles University, Prague, Czech Republic
Received: 3 May 2005 / Accepted: 13 September 2005
Published online: 22 March 2006 – © M. Biskup and R. Kotecký 2006
Abstract: Chessboard estimates are one of the standard tools for proving phase coexistence in spin systems of physical interest. In this note we show that the method not only produces a point in the phase diagram where more than one Gibbs states coexist, but that it can also be used to rule out the existence of shift-ergodic states that differ significantly from those proved to exist. For models depending on a parameter (say, the temperature), this shows that the values of the conjugate thermodynamic quantity (the energy) inside the “transitional gap” are forbidden in all shift-ergodic Gibbs states. We point out several models where our result provides useful additional information concerning the set of possible thermodynamic equilibria.
© 2006 by M. Biskup and R. Kotecký. Reproduction, by any means, of the entire article for noncommercial purposes is permitted without charge.
1. Introduction
One of the basic tasks of mathematical statistical mechanics is to find a rigorous approach to various first-order phase transitions in lattice spin systems. Here two methods of proof are generally available: Pirogov-Sinai theory and chessboard estimates. The former, developed in [30, 31], possesses an indisputable advantage of robustness with respect to (general) perturbations, but its drawbacks are the restrictions (not entirely without hope of being eventually eliminated [22, 23, 15, 35, 7]) to (effectively) finite sets of possible spin values and to situations with rapidly decaying correlations. The latter method, which goes back to [20, 18, 19], is limited, for the most part, to systems with nearest-neighbor interactions but it poses almost no limitations on the individual spin space and/or the rate of correlation decay; see e.g. [29].
While both techniques ultimately produce a proof of phase coexistence, Pirogov-Sinai theory offers significantly better control of the number of possible Gibbs states. Indeed, one can prove the so called completeness of phase diagram [34, 8] which asserts that the states constructed by the theory exhaust the set of all shift-ergodic Gibbs states. (In
technical terms, there is a one-to-one correspondence between the shift-ergodic Gibbs states and the “stable phases” defined in terms of minimal “metastable free energy”.) Unfortunately, no conclusion of this kind is currently available in the approaches based solely on chessboard estimates. This makes many of the conclusions of this technique (see [12, 33, 3, 5, 17] for a modest sample of recent references) seem to be somewhat “incomplete”.
To make the distinction more explicit, let us consider the example of the temperature-driven first-order phase transition in the $q$-state Potts model with $q \gg 1$. In dimensions $d \ge 2$, there exists a transition temperature, $T_t$, at which there are $q$ ordered states that are low on both entropy and energy, and one disordered state which is abundant in both quantities. The transition is accompanied by a massive jump in the energy density (as a function of temperature). Here the “standard” proof based on chessboard estimates [25, 26] produces “only” the existence of a temperature where the aforementioned $q + 1$ states coexist, but it does not rule out the existence of other states; particularly, those with energies “inside” the jump. On the other hand, Pirogov-Sinai approaches [24, 27] permit us to conclude that no other than the above $q + 1$ shift-ergodic Gibbs states can exist at $T_t$ and, in particular, there is a forbidden gap of energy densities where no shift-ergodic Gibbs states are allowed to enter.
The purpose of this note is to show that, after all, chessboard estimates can also be supplemented with a corresponding “forbidden-gap” argument. Explicitly, we will show that the calculations (and the assumptions) used, e.g., in [25, 12, 33, 3, 5, 17] to prove the existence of particular Gibbs states at the corresponding transition temperature, or other driving parameter, imply also the absence of Gibbs states that differ significantly from those proved to exist.
We emphasize that no statement about the number of possible extremal, translation-invariant Gibbs states is being made here, i.e., the completeness of phase diagram in its full extent remains unproved. Notwithstanding, our results go some way towards a proof of completeness by ruling out, on general grounds, all but a “small neighborhood” of the few desired states (which may themselves be a non-trivial convex combination of extremal states). The assumptions we make are quite modest; indeed, apart from the necessary condition of reflection positivity we require only translation invariance and absolute summability of interactions. And, of course, the validity—uniformly in the parameter driving the transition—of a bound that is generally used to suppress the contours while proving the existence of coexisting phases. We also remark that the conclusion about the “forbidden gap” should not be interpreted too literally. Indeed, there are systems (e.g., the Potts model in an external field) where more than one gap may “open up” at the transition. Obviously, in such situations one may have to consider a larger set of observables and/or richer parametrization of the model. We refer the reader to our theorems for the precise interpretation of the phrase “forbidden gap” in a general context. The main idea of the proof is that all Gibbs states (at the same temperature) have the same large-deviation properties on the scale that is exponential in volume. This permits us to compare any translation-invariant Gibbs state with a corresponding measure on the torus, where chessboard estimates can be used to rule out most of the undesirable scenarios. The comparison with torus boundary conditions requires an estimate on the interaction “across” the boundary; as usual this is implied by the absolute summability of interactions. This is the setting we assume for the bulk of this paper (cf Theorem 2.5). 
For systems with unbounded interactions, a similar conclusion can be made under the assumption that the interactions are integrable with respect to the measures of interest (see Theorem 4.4).
Forbidden Gap Argument and Chessboard Estimates
The rest of this paper is organized as follows: In Sects. 2.1 and 2.2 we define the class of models to which our techniques apply and review various elementary facts about reflection positivity and chessboard estimates. The statements of our main theorems (Theorem 2.5 and Corollary 2.6) come in Sect. 2.3. The proofs constitute the bulk of Sect. 3; applications to recent results established by means of chessboard estimates are discussed in Sect. 4. The Appendix (Sect. 5) contains the proof of Theorem 4.4 which provides an explicit estimate on the energy gap from Theorem 3 of [17]. This result is needed for one of our applications in Sect. 4.

2. Main Result

In order to formulate our principal claims we will first recall the standard setup for proofs of first-order phase transitions by chessboard estimates and introduce the necessary notations. The actual theorems are stated in Sect. 2.3.

2.1. Models of interest. We will work with the standard class of spin systems on $\mathbb{Z}^d$ and so we will keep our discussion of general concepts to the minimum possible. We refer the reader to Georgii’s monograph [21] for a more comprehensive treatment and relevant references. Our spins, $s_x$, will take values in a compact separable metric space $\Omega_0$. We equip $\Omega_0$ with the σ-algebra $\mathcal{F}_0$ of its Borel subsets and consider an a priori probability measure $\nu_0$ on $(\Omega_0, \mathcal{F}_0)$. Spin configurations on $\mathbb{Z}^d$ are the collections $(s_x)_{x\in\mathbb{Z}^d}$. We will use $\Omega = \Omega_0^{\mathbb{Z}^d}$ to denote the set of all spin configurations on $\mathbb{Z}^d$ and $\mathcal{F}$ to denote the σ-algebra of Borel subsets of $\Omega$ defined using the product topology. If $\Lambda \subset \mathbb{Z}^d$, we define $\mathcal{F}_\Lambda$ to be the sub-σ-algebra of events depending only on $(s_x)_{x\in\Lambda}$. For each $x \in \mathbb{Z}^d$, the map $\tau_x\colon \Omega \to \Omega$ is the “translation by x” defined by $(\tau_x s)_y = s_{x+y}$. It is easy to check that $\tau_x$ is continuous and hence measurable for all $x \in \mathbb{Z}^d$. We will write $\Lambda \Subset \mathbb{Z}^d$ to indicate that $\Lambda$ is a finite subset of $\mathbb{Z}^d$. To define Gibbs measures, we will consider a family of Hamiltonians $(H_\Lambda)_{\Lambda\Subset\mathbb{Z}^d}$.
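These will be defined in terms of interaction potentials $(\Phi_A)_{A\Subset\mathbb{Z}^d}$. Namely, for each $A \Subset \mathbb{Z}^d$, let $\Phi_A\colon \Omega \to \mathbb{R}$ be a function with the following properties:
(1) The function $\Phi_A$ is $\mathcal{F}_A$-measurable for each $A \Subset \mathbb{Z}^d$.
(2) The interaction $(\Phi_A)$ is translation invariant, i.e., $\Phi_{A+x} = \Phi_A \circ \tau_x$ for all $x \in \mathbb{Z}^d$ and all $A \Subset \mathbb{Z}^d$.
(3) The interaction $(\Phi_A)$ is absolutely summable in the sense that
$$|||\Phi||| = \sum_{\substack{A\Subset\mathbb{Z}^d \\ 0\in A}} \|\Phi_A\|_\infty < \infty. \tag{2.1}$$
The Hamiltonian on a set $\Lambda \Subset \mathbb{Z}^d$ is a function $H_\Lambda\colon \Omega \to \mathbb{R}$ defined by
$$H_\Lambda = \sum_{\substack{A\Subset\mathbb{Z}^d \\ A\cap\Lambda\neq\emptyset}} \Phi_A. \tag{2.2}$$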
For each β ≥ 0, let $\mathfrak{G}_\beta$ be the set of Gibbs measures for the Hamiltonian (2.2). Specifically, $\mu \in \mathfrak{G}_\beta$ if and only if the conditional probability $\mu(\,\cdot \mid \mathcal{F}_{\Lambda^c})$—which exists since
M. Biskup, R. Kotecký
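$\Omega$ is a Polish space—satisfies, for all $\Lambda \Subset \mathbb{Z}^d$ and µ-almost all s, the (conditional) DLR equation
$$\mu(\mathrm{d}s_\Lambda \mid \mathcal{F}_{\Lambda^c})(s) = \frac{e^{-\beta H_\Lambda(s)}}{Z_\Lambda} \prod_{x\in\Lambda} \nu_0(\mathrm{d}s_x). \tag{2.3}$$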
Here $Z_\Lambda = Z_\Lambda(\beta, s_{\Lambda^c})$ is a normalization constant which is independent of $s_\Lambda = (s_x)_{x\in\Lambda}$.

Remark 2.1. The results of the present paper can be generalized even to the situations with unbounded spins and interactions; see Theorem 4.5. However, the general theory of Gibbs measures with unbounded spins features some unpleasant technicalities that would obscure the presentation. We prefer to avoid them and to formulate the bulk of the paper for systems with compact spins. Our restriction to translation-invariant interactions in (2) above is mostly for convenience of exposition. Actually, the proofs in Sect. 3 can readily be modified to include periodic interactions as well.
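As an illustration of the definition (2.2), the following minimal sketch evaluates the finite-volume Hamiltonian for one particular interaction. The choice of a nearest-neighbor pair potential $\Phi_{\{x,y\}}(s) = -\delta_{s_x,s_y}$, as well as the names `potts_pair` and `hamiltonian`, are assumptions made for this example only; the point it shows is that every bond meeting $\Lambda$ contributes exactly once, including bonds that reach outside $\Lambda$.

```python
def potts_pair(sx, sy):
    """Assumed two-body potential: Phi_{x,y}(s) = -1 if s_x == s_y, else 0."""
    return -1.0 if sx == sy else 0.0


def hamiltonian(region, config):
    """H_Lambda(s): sum of Phi_A over nearest-neighbor bonds A with A ∩ Lambda != ∅.

    region -- finite set of sites (tuples of ints), playing the role of Lambda
    config -- callable mapping any site of Z^d to its spin value
    """
    bonds = set()
    for x in region:
        for i in range(len(x)):
            for step in (-1, 1):
                y = tuple(c + step if k == i else c for k, c in enumerate(x))
                # frozenset makes the bond unordered, so it is counted once
                bonds.add(frozenset((x, y)))
    return sum(potts_pair(*(config(z) for z in bond)) for bond in bonds)


def constant(site):
    return 1


def checkerboard(site):
    return sum(site) % 2


# One site in Z^2 touches 4 bonds; two adjacent sites touch 7 distinct bonds.
print(hamiltonian({(0, 0)}, constant))
print(hamiltonian({(0, 0), (1, 0)}, constant))
print(hamiltonian({(0, 0), (1, 0)}, checkerboard))
```

Because the bonds crossing the boundary of the region are included, the value depends on the configuration outside $\Lambda$, which is exactly how the boundary condition enters the DLR equation (2.3).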
2.2. Chessboard estimates. As alluded to before, chessboard estimates are among the principal tools for proving phase coexistence. In order to make this tool available, we have to place our spin system on a torus. Let $\mathbb{T}_L$ be the torus of L × · · · × L sites and let $H_L\colon \Omega_0^{\mathbb{T}_L} \to \mathbb{R}$ be the function defined as follows. Given a configuration $s = (s_x)_{x\in\mathbb{T}_L}$, we extend s periodically to a configuration $\bar{s}$ on all of $\mathbb{Z}^d$. Using $H_{\mathbb{T}_L}$ to denote the Hamiltonian associated with the embedding of $\mathbb{T}_L$ into $\mathbb{Z}^d$, we define $H_L(s) = H_{\mathbb{T}_L}(\bar{s})$. The torus measure $\mathbb{P}_{L,\beta}$ then simply is
$$\mathbb{P}_{L,\beta}(\mathrm{d}s) = \frac{e^{-\beta H_L(s)}}{Z_L} \prod_{x\in\mathbb{T}_L} \nu_0(\mathrm{d}s_x). \tag{2.4}$$
Here $Z_L = Z_L(\beta)$ is the torus partition function. Chessboard estimates will be implied by the condition of reflection positivity. While this condition can already be defined in terms of interactions $(\Phi_\Lambda)_{\Lambda\Subset\mathbb{Z}^d}$, it is often easier to check it directly on the torus. Let us consider a torus $\mathbb{T}_L$ with even L and let us split it into two symmetric halves, $\mathbb{T}_L^+$ and $\mathbb{T}_L^-$, sharing a “plane of sites” on their boundary. We will refer to the set $P = \mathbb{T}_L^+ \cap \mathbb{T}_L^-$ as a plane of reflection. Let $\mathcal{F}_P^+$ and $\mathcal{F}_P^-$ denote the σ-algebras of events depending only on configurations in $\mathbb{T}_L^+$ and $\mathbb{T}_L^-$, respectively. We assume that the naturally-defined (spatial) reflection $\vartheta_P\colon \mathbb{T}_L^+ \leftrightarrow \mathbb{T}_L^-$ gives rise to a map $\theta_P\colon \Omega_0^{\mathbb{T}_L} \to \Omega_0^{\mathbb{T}_L}$ which obeys the following constraints:
(1) $\theta_P$ is an involution, $\theta_P \circ \theta_P = \mathrm{id}$.
(2) $\theta_P$ is a reflection in the sense that if $A \in \mathcal{F}_P^+$ depends only on configurations in $\Lambda \subset \mathbb{T}_L^+$, then $\theta_P(A) \in \mathcal{F}_P^-$ depends only on configurations in $\vartheta_P(\Lambda)$.
In many cases of interest, $\theta_P$ is simply the mapping that is directly induced by the spatial reflection $\vartheta_P$, i.e., $\theta_P = \vartheta_P^*$, where $(\vartheta_P^*(s))_x = s_{\vartheta_P(x)}$; our definition permits us to combine the spatial reflection with an involution of the single-spin space. Reflection positivity is now defined as follows:
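Definition 2.2. Let $\mathbb{P}$ be a probability measure on $\Omega_0^{\mathbb{T}_L}$ and let $\mathbb{E}$ be the corresponding expectation. We say that $\mathbb{P}$ is reflection positive, if for any plane of reflection P and any two bounded $\mathcal{F}_P^+$-measurable random variables X and Y,
$$\mathbb{E}\bigl(X\theta_P(Y)\bigr) = \mathbb{E}\bigl(Y\theta_P(X)\bigr) \tag{2.5}$$
and
$$\mathbb{E}\bigl(X\theta_P(X)\bigr) \ge 0. \tag{2.6}$$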
Here, $\theta_P(X)$ denotes the $\mathcal{F}_P^-$-measurable random variable $X \circ \theta_P$.

Remark 2.3. Here are some standard examples of summable two-body interactions that are reflection positive. Consider spin systems with vector-valued spins $s_x$ and interaction potentials
$$\Phi_{\{x,y\}} = J_{x,y}\,(s_x, s_y), \qquad x \ne y, \tag{2.7}$$
where $J_{x,y}$ are coupling constants and (·, ·) denotes a positive-semidefinite inner product on $\Omega$. Then the corresponding torus Gibbs measure with β ≥ 0 is reflection positive (for reflections through sites) for the following choices of $J_{x,y}$’s:
(1) “Cube” interactions: Reflection-symmetric $J_{x,y}$’s such that $J_{x,y} = 0$ unless x and y are vertices of a cube of 2 × · · · × 2 sites in $\mathbb{Z}^d$.
(2) Yukawa-type potentials:
$$J_{x,y} = e^{-\mu|x-y|_1}, \tag{2.8}$$
where µ > 0 and $|x-y|_1$ is the $\ell^1$-distance between x and y.
(3) Power-law decaying interactions:
$$J_{x,y} = \frac{1}{|x-y|_1^{\kappa}}, \tag{2.9}$$
with κ > 0.
The proofs of these are based on the general theory developed in [20, 18, 19]; relevant calculations can also be found in [2, Sect. 4.2]. Of course, any linear combination of the above (as well as other reflection-positive interactions) with positive coefficients is still reflection positive.

Now, we are finally getting to the setup underlying chessboard estimates. Suppose that L is an integer multiple of an (integer) number B. (To rule out various technical complications with the following theorem, we will actually always assume that L/B is a power of 2.) Let $\Lambda_B \subset \mathbb{T}_L$ be the box of (B + 1) × · · · × (B + 1) sites with the “lower-left” corner at the origin; we will call such a box a B-block. We can tile $\mathbb{T}_L$ by translates of $\Lambda_B$ by B-multiples of vectors from the factor torus, $\mathbb{T} = \mathbb{T}_{L/B}$. Note that the neighboring translates of $\Lambda_B$ will have a side in common. Let A be an event depending only on configurations in $\Lambda_B$; we will call such A a B-block event. For each $t \in \mathbb{T}$, we define the event $\theta_t(A)$ as follows:
(1) If t has all components even, then $\theta_t(A)$ is simply the translation of A by vector Bt, i.e., $\theta_t(A) = \tau_{Bt}^{-1}(A) = \{s \in \Omega_0^{\mathbb{T}_L} : \tau_{Bt}(s) \in A\}$.
(2) For the remaining $t \in \mathbb{T}$, we first reflect A through the “midplane” of $\Lambda_B$ in all directions whose component of t is odd, and then translate the result by Bt as before.
Thus, $\theta_t(A)$ will always depend only on configurations in the B-block $\Lambda_B + Bt$. The desired consequence of reflection positivity is now stated as follows.

Theorem 2.4 (Chessboard estimate). Let $\mathbb{P}$ be a measure on $\Omega_0^{\mathbb{T}_L}$ which is reflection-positive with respect to $\theta_P$. Then for any B-block events $A_1, \dots, A_m$ and any distinct sites $t_1, \dots, t_m \in \mathbb{T}$,
$$\mathbb{P}\Bigl(\bigcap_{j=1}^{m} \theta_{t_j}(A_j)\Bigr) \le \prod_{j=1}^{m} \mathbb{P}\Bigl(\bigcap_{t\in\mathbb{T}} \theta_t(A_j)\Bigr)^{1/|\mathbb{T}|}. \tag{2.10}$$
Proof. See [20, Theorem 2.2].
The moral of this result (whose proof is nothing more than an enhanced version of the Cauchy-Schwarz inequality applied to the inner product $X, Y \mapsto \mathbb{E}(X\theta_P(Y))$) is that the probability of any number of events factorizes, as a bound, into the product of probabilities. This is particularly useful for contour estimates; of course, provided that the word contour refers to a collection of boxes on each of which some “bad” event occurs. Indeed, by (2.10) the probability of a contour will automatically be suppressed exponentially in the number of constituting “bad” boxes.

2.3. Main theorems. For any B-block event A, we introduce the quantity
$$p_\beta(A) = \lim_{L\to\infty} \Bigl(\mathbb{P}_{L,\beta}\Bigl(\bigcap_{t\in\mathbb{T}} \theta_t(A)\Bigr)\Bigr)^{1/|\mathbb{T}|}, \tag{2.11}$$
with the limit taken over multiples of B. The limit exists by standard subadditivity arguments. While the definition would suggest that $p_\beta(A)$ is a large-deviation rate, chessboard estimates (2.10) show that $p_\beta(A)$ can also be thought of as the “probability of A regardless of the status of all other B-blocks.” This interpretation is supported by the fact that $A \mapsto p_\beta(A)$ is an outer measure on $\mathcal{F}_{\Lambda_B}$ with $p_\beta(\Omega) = 1$, cf. Lemma 6.3 of [5]. Furthermore, recalling that $\Lambda_{N-1}$ is the block of N × · · · × N sites with the “lower-left” corner at the lattice origin, let
$$R_N(A) = \frac{1}{|\Lambda_{N-1}|} \sum_{x\in\Lambda_{N-1}} 1_A \circ \tau_{Bx} \tag{2.12}$$
be the fraction of B-blocks (in $\Lambda_{NB-1}$) in which A occurs. Whenever $\mu \in \mathfrak{G}_\beta$ is a Gibbs state for the Hamiltonian (2.2) at inverse temperature β that is invariant with respect to the shifts $(\tau_{Bx})_{x\in\mathbb{Z}^d}$, the limit
$$\rho_\mu(A) = \lim_{N\to\infty} R_N(A) \tag{2.13}$$
exists µ-almost surely. In the following, we will use $\rho_\mu(A)$ mostly for measures that are actually ergodic with respect to the shifts by multiples of B. In such cases the limit
is self-averaging, $\rho_\mu(A) = \mu(A)$ almost surely. Notwithstanding, we will stick to the notation $\rho_\mu(A)$ to indicate that claims are being made about almost-sure properties of configurations and not just expectations. To keep our statements concise, we will refer to measures which are invariant and ergodic with respect to the translations $(\tau_{Bx})_{x\in\mathbb{Z}^d}$ as B-shift ergodic. Our principal result can be formulated as follows:

Theorem 2.5. Let d ≥ 2 and consider a spin system as described above for which the torus measure is reflection positive for all β ≥ 0 and all even L ≥ 2. Let $\mathcal{G}_1, \dots, \mathcal{G}_r$ be a finite number of B-block events and let $\mathcal{B} = (\mathcal{G}_1 \cup \dots \cup \mathcal{G}_r)^c$. Suppose that the good block events are mutually exclusive and non-compatible (different types of goodness cannot occur in neighboring blocks):
(1) $\mathcal{G}_i \cap \mathcal{G}_j = \emptyset$ for all $i \ne j$.
(2) If $t_1, t_2 \in \mathbb{T}$ are nearest neighbors, then
$$\theta_{t_1}(\mathcal{G}_i) \cap \theta_{t_2}(\mathcal{G}_j) = \emptyset \quad \text{for all } i \ne j. \tag{2.14}$$
Then for every $\epsilon > 0$, there exists δ > 0 (which may depend on d but not on the details of the model nor on B or n) such that for any β ≥ 0 with $p_\beta(\mathcal{B}) < \delta$ we have
$$\rho_\mu(\mathcal{B}) \in [0, \epsilon] \tag{2.15}$$
and
$$\rho_\mu(\mathcal{G}_i) \in [0, \epsilon] \cup [1-\epsilon, 1], \qquad i = 1, \dots, r, \tag{2.16}$$
for every B-shift ergodic Gibbs state $\mu \in \mathfrak{G}_\beta$. In particular, if $\epsilon < 1/2$ then for every such µ there exists a unique i such that $\rho_\mu(\mathcal{G}_i) \ge 1-\epsilon$ and $\rho_\mu(\mathcal{G}_j) \le \epsilon$ for all $j \ne i$.

We remark that the conclusion of Theorem 2.5 holds even when the requirements of compact single-spin space and norm-bounded interactions are relaxed to the condition of finite average energy. We state the corresponding generalization in Theorem 4.5. Theorem 2.5 directly implies the standard conclusion of chessboard estimates (cf. [14, Propositions 3.1-3.3] or [25, Theorem 4]):

Corollary 2.6. Let d ≥ 2, let $\beta_1 < \beta_2$ be two inverse temperatures and let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two mutually exclusive, non-compatible good B-block events (cf. conditions (1) and (2) in Theorem 2.5). Then, for every $\epsilon > 0$ there exists a constant δ > 0 (which may depend on d but not on B or the details of the model) such that the conditions
(1) $p_\beta(\mathcal{B}) < \delta$ for all $\beta \in [\beta_1, \beta_2]$ and
(2) $p_{\beta_1}(\mathcal{G}_2) < \delta$ and $p_{\beta_2}(\mathcal{G}_1) < \delta$
imply the existence of an inverse temperature $\beta_t \in (\beta_1, \beta_2)$ and of two distinct B-shift ergodic Gibbs measures $\mu_1, \mu_2 \in \mathfrak{G}_{\beta_t}$ such that
$$\rho_{\mu_j}(\mathcal{G}_j) \ge 1-\epsilon, \qquad j = 1, 2. \tag{2.17}$$
The above assumptions (1) and (2) appear in some form in all existing proofs based on chessboard estimates; see Sect. 4 for some explicit examples. The conclusions about the set of coexistence points can be significantly strengthened when, on the basis of thermodynamic arguments and/or stochastic domination, the expected amount of goodness $\mathcal{G}_2$ increases (and $\mathcal{G}_1$ decreases) with increasing β. For $\epsilon \ll 1$ the phase diagram then features a unique (massive) jump at some $\beta_t$ from states dominated by $\mathcal{G}_1$ to those dominated by $\mathcal{G}_2$. Theorem 2.5 implies that the bulk of the values inside the jump are not found in any ergodic Gibbs state. Both Theorem 2.5 and Corollary 2.6 are proved in Sect. 3.2.
Remark 2.7. Both results above single out inverse temperature as the principal parameter of interest. However, this is only a matter of convenience; all results hold equally well for any parameter of the model. An inspection of the proof shows that we can take $\delta = c(d)\,\epsilon^{2/d}$ in Theorem 2.5, where c(d) is a constant that grows with dimension. However, the dependence on $\epsilon$ should be significantly better; we made no attempts to reach the optimum. In any case, the fact that δ does not depend on the details of the model is definitely sufficient to prove phase coexistence.

3. Proofs of Main Results

We will assume that there is an ergodic Gibbs measure $\mu \in \mathfrak{G}_\beta$ that violates one of the conditions (2.15–2.16), and derive a contradiction. Various steps of the proof will be encapsulated in technical lemmas in Sect. 3.1; the actual proofs come in Sect. 3.2.
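3.1. Technical lemmas. Our first step is to convert the information about infinite-volume densities into a finite-volume event. Using the sites from $\Lambda_{N-1}$ to translate the B-block $\Lambda_B$ by multiples of B in each coordinate direction, we get $\bigcup_{x\in\Lambda_{N-1}} (\Lambda_B + Bx) = \Lambda_{NB}$. Similarly, considering translates of $\Lambda_{NB}$ by vectors NBx, where $x \in \Lambda_{M-1}$, we get $\bigcup_{x\in\Lambda_{M-1}} (\Lambda_{NB} + NBx) = \Lambda_{MNB}$. The important point is that, while the neighboring translates $\Lambda_{NB} + NBx$ and $\Lambda_{NB} + NBy$ are not disjoint, they have only one of their (d − 1)-dimensional sides in common. Let $\mathcal{B}_N$ and $\mathcal{E}_{j,N}$, j = 1, . . . , r, be events defined by
$$\mathcal{B}_N = \bigl\{R_N(\mathcal{B}) > \epsilon\bigr\} \tag{3.1}$$
and
$$\mathcal{E}_{j,N} = \bigl\{R_N(\mathcal{G}_j) > \epsilon\bigr\}, \qquad j = 1, \dots, r. \tag{3.2}$$
Introducing the event
$$\mathcal{E}_N = \mathcal{B}_N \cup \bigcup_{1\le i<j\le r} (\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N}) \tag{3.3}$$
and the fraction $R_{M,N}(\mathcal{E}_N)$ of BN-blocks (in $\Lambda_{MNB}$) in which $\mathcal{E}_N$ occurs,
$$R_{M,N}(\mathcal{E}_N) = \frac{1}{|\Lambda_{M-1}|} \sum_{x\in\Lambda_{M-1}} 1_{\mathcal{E}_N} \circ \tau_{NBx}, \tag{3.4}$$
we have:

Lemma 3.1. Let $\epsilon < 1/2$ and consider a B-shift ergodic Gibbs measure $\mu \in \mathfrak{G}_\beta$ that violates one of the conditions (2.15–2.16). Then there exists an $N_0 < \infty$ and, for each $N \ge N_0$, there exists an $M_0 = M_0(N)$ such that for all $N \ge N_0$ and all $M \ge M_0(N)$, one has
$$\mu\bigl(R_{M,N}(\mathcal{E}_N) > 1/2\bigr) > \frac{1}{2N^d}. \tag{3.5}$$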
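Proof. The proof is based on a two-fold application of the Pointwise Ergodic Theorem. Indeed, by ergodicity of µ and Fatou’s lemma we know that
$$\liminf_{N\to\infty} \mu(\mathcal{B}_N) \ge \mu\bigl(\rho_\mu(\mathcal{B}) > \epsilon\bigr) \tag{3.6}$$
and
$$\liminf_{N\to\infty} \mu(\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N}) \ge \mu\bigl(\{\rho_\mu(\mathcal{G}_i) > \epsilon\} \cap \{\rho_\mu(\mathcal{G}_j) > \epsilon\}\bigr). \tag{3.7}$$
But µ violates one of the conditions (2.15–2.16) and so either $\rho_\mu(\mathcal{B}) > \epsilon$ or $\rho_\mu(\mathcal{G}_i) > \epsilon$ and $\rho_\mu(\mathcal{G}_j) > \epsilon$ for some $i \ne j$. All of these inequalities are valid µ-almost surely and so it follows that
$$\mu(\mathcal{E}_N) \xrightarrow[N\to\infty]{} 1. \tag{3.8}$$
Now, let us fix N so that $\mu(\mathcal{E}_N) \ge 3/4$. Then ergodicity with respect to translates by multiples of B implies that
$$\mu\Bigl(\bigcup_{y\in\Lambda_{N-1}} \bigl\{R_{M,N}(\mathcal{E}_N) \circ \tau_{By} > 1/2\bigr\}\Bigr) \ge \mu\Bigl(\frac{1}{N^d} \sum_{y\in\Lambda_{N-1}} R_{M,N}(\mathcal{E}_N) \circ \tau_{By} > \frac{1}{2}\Bigr) = \mu\bigl(R_{MN}(\mathcal{E}_N) > 1/2\bigr) \xrightarrow[M\to\infty]{} 1. \tag{3.9}$$
It follows that the left-hand side exceeds 1/2 once M is sufficiently large, which in conjunction with subadditivity and $\tau_{By}$-invariance of µ directly implies (3.5).

Our next task will be to express $\mathcal{E}_N$ solely in terms of conditions on bad B-blocks in $\Lambda_{NB} = \bigcup_{x\in\Lambda_{N-1}} (\Lambda_B + Bx)$. Given two distinct sites $x, y \in \Lambda_{N-1}$, let $\{x \nleftrightarrow y\}$ denote the event that there is no nearest-neighbor path $\pi = (x_1, \dots, x_k)$ on $\Lambda_{N-1}$ such that
(1) π connects x to y, i.e., $x_1 = x$ and $x_k = y$.
(2) all B-blocks “along” π are good, i.e., $\tau_{Bx_j}(\mathcal{B}^c)$ occurs for all j = 1, . . . , k.
Note that $\{x \nleftrightarrow y\}$ automatically holds when one of the blocks $\Lambda_B + Bx$ or $\Lambda_B + By$ is bad. Further, let $Y_N$ be the ($\mathcal{F}_{\Lambda_{NB}}$-measurable) random variable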
$$Y_N = \#\bigl\{(x, y) \in \Lambda_{N-1} \times \Lambda_{N-1} : x \ne y \ \&\ x \nleftrightarrow y\bigr\} \tag{3.10}$$
and let $\mathcal{C}_N$ be the event
$$\mathcal{C}_N = \bigl\{Y_N \ge (\epsilon N^d)^2\bigr\}. \tag{3.11}$$
Conditions (1) and (2) from Theorem 2.5 now directly imply:

Lemma 3.2. For all N, we have $\mathcal{E}_N \subset \mathcal{C}_N$.

Proof. Clearly, we have $\mathcal{B}_N \subset \mathcal{C}_N$, and so we only have to show that
$$\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N} \subset \mathcal{C}_N, \qquad 1 \le i < j \le r. \tag{3.12}$$
Let us fix $i \ne j$ and recall that on $\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N}$, at least an $\epsilon$-fraction of all B-blocks in $\Lambda_{NB}$ will be i-good and at least an $\epsilon$-fraction of them will be j-good. By conditions (1) and (2) from Theorem 2.5, no two B-blocks of different type of goodness can be connected by a path of good B-blocks, and so there are at least $(\epsilon N^d)^2$ pairs of distinct B-blocks in $\Lambda_{NB}$ that are not connected to each other by a path of good blocks. This is exactly what defines the event $\mathcal{C}_N$.
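The counting behind the event defined via the number of disconnected pairs can be made concrete with a small sketch. It assumes d = 2 and a grid of booleans marking which blocks are good; the function names and the breadth-first search are choices made for this illustration, not part of the argument above. Since two sites are connected exactly when they lie in the same cluster of good blocks, the count of ordered disconnected pairs equals the number of all ordered pairs of distinct sites minus the sum of s(s − 1) over cluster sizes s.

```python
from collections import deque


def count_disconnected_pairs(good):
    """Number of ordered pairs (x, y), x != y, not joined by a path of good blocks.

    good[i][j] is True when the block indexed by (i, j) is good (d = 2 assumed).
    """
    n_rows, n_cols = len(good), len(good[0])
    total = n_rows * n_cols
    seen = [[False] * n_cols for _ in range(n_rows)]
    connected = 0  # ordered pairs of distinct sites inside a common good cluster
    for i in range(n_rows):
        for j in range(n_cols):
            if not good[i][j] or seen[i][j]:
                continue
            size = 0
            queue = deque([(i, j)])
            seen[i][j] = True
            while queue:  # breadth-first search over nearest neighbors
                a, b = queue.popleft()
                size += 1
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if (0 <= u < n_rows and 0 <= v < n_cols
                            and good[u][v] and not seen[u][v]):
                        seen[u][v] = True
                        queue.append((u, v))
            connected += size * (size - 1)
    return total * (total - 1) - connected


def in_event_C(good, eps):
    """Check the defining inequality Y_N >= (eps * N^d)^2, with N^d = number of blocks."""
    n_blocks = len(good) * len(good[0])
    return count_disconnected_pairs(good) >= (eps * n_blocks) ** 2


# A column of bad blocks separating two columns of good blocks.
wall = [[True, False, True] for _ in range(3)]
print(count_disconnected_pairs(wall), in_event_C(wall, 0.3))
```

In the `wall` example no bad block is needed near most of the pairs that get counted, which mirrors the situation in the proof where blocks of two different types of goodness are separated.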
The events $\mathcal{E}_N$ and $\mathcal{C}_N$ have the natural interpretation as NB-block events on $\mathbb{T}_L$ whenever L is divisible by NB. If A is such an NB-block event, let $\tilde{p}_\beta(A)$ denote the analogue of the quantity from (2.11) where the $\theta_t$’s now involve translations by multiples of NB. Our next technical lemma provides an estimate on $\tilde{p}_\beta(\mathcal{C}_N)$ in terms of $p_\beta(\mathcal{B})$:

Lemma 3.3. Let d be the dimension of the underlying lattice and suppose that d ≥ 2. For each $\epsilon > 0$ (underlying the definitions of $\mathcal{B}_N$, $\mathcal{E}_N$ and $\mathcal{C}_N$) and each η > 0, there exists a number $\delta = \delta(\epsilon, \eta, d) > 0$ such that if $p_\beta(\mathcal{B}) < \delta$, then $\tilde{p}_\beta(\mathcal{C}_N) < \eta$.

Proof. Let us use $\Pi_{L,\beta}(\mathcal{C}_N)$ to abbreviate the quantity
$$\Pi_{L,\beta}(\mathcal{C}_N) = \mathbb{P}_{L,\beta}\Bigl(\bigcap_{t\in\mathbb{T}} \theta_t(\mathcal{C}_N)\Bigr), \tag{3.13}$$
where $\mathbb{T} = \mathbb{T}_{L/(NB)}$ is the factor torus in the present context. Observing that $\mathcal{C}_N$ is preserved by reflections through the “midplanes” of $\Lambda_{NB}$, a multivariate version of Chebyshev’s inequality then yields
$$\Pi_{L,\beta}(\mathcal{C}_N) \le \mathbb{E}_{L,\beta}\Bigl(\prod_{t\in\mathbb{T}} \frac{Y_N \circ \tau_{BNt}}{(\epsilon N^d)^2}\Bigr). \tag{3.14}$$
Here $\mathbb{E}_{L,\beta}$ is the expectation with respect to $\mathbb{P}_{L,\beta}$. To estimate the right-hand side of (3.14), we will rewrite $Y_N$ as a sum. Let $x, y \in \Lambda_{N-1}$ be distinct. A connected subset $\Gamma \subset \Lambda_{N-1}$ is said to separate x from y (in $\Lambda_{N-1}$) if each nearest-neighbor path π from x to y on $\Lambda_{N-1}$ intersects Γ. We use S(x, y) to denote the set of all such sets $\Gamma \subset \Lambda_{N-1}$. Notice that {x}, {y} ∈ S(x, y). We claim that, whenever (x, y) is a pair of points contributing to $Y_N$, there exists Γ ∈ S(x, y) separating x from y such that every block $\Lambda_B + Bz$ with z ∈ Γ is bad. Indeed, if $\Lambda_B + Bx$ is a bad block we take Γ = {x}. If $\Lambda_B + Bx$ is a good block, then we define $C_x$ to be the maximal connected subset of $\Lambda_{N-1}$ containing x such that $\Lambda_B + Bz$ is a good block for all $z \in C_x$, and let Γ be its external boundary. Using $1_\Gamma$ to denote the indicator of the event that every block $\Lambda_B + Bz$ with z ∈ Γ is bad, we get
$$Y_N \le \sum_{x,y\in\Lambda_{N-1}} \ \sum_{\Gamma\in S(x,y)} 1_\Gamma. \tag{3.15}$$
Let $K = \bigl(\frac{L}{BN}\bigr)^d$ be the volume of the factor torus and let $t_1, \dots, t_K$ be an ordering of all sites of $\mathbb{T}$. Then we have
$$\Pi_{L,\beta}(\mathcal{C}_N) \le \frac{1}{(\epsilon N^d)^{2K}} \sum_{\substack{(x_j,y_j) \\ j=1,\dots,K}} \ \sum_{\Gamma_1,\dots,\Gamma_K} \mathbb{E}_{L,\beta}\Bigl(\prod_{j=1}^{K} 1_{\Gamma_j} \circ \tau_{BNt_j}\Bigr), \tag{3.16}$$
where the first sum runs over collections of pairs $(x_j, y_j)$, j = 1, . . . , K, of distinct sites in $\Lambda_{N-1}$ and the second sum is over all collections of separating surfaces $\Gamma_j \in S(x_j, y_j)$, j = 1, . . . , K. To estimate the right-hand side of (3.16) we define $p_{L,\beta}(\mathcal{B})$ to be the quantity on the right-hand side of (2.11), before taking the limit L → ∞, with $A = \mathcal{B}$. Since each indicator $1_{\Gamma_j} \circ \tau_{BNt_j}$ enforces bad blocks $\Lambda_B + B(z + Nt_j)$ for $z \in \Gamma_j$, and the set of blocks $\Lambda_B + B(z + Nt_j)$, $z \in \Lambda_{N-1}$, is, for $t_i \ne t_j$, disjoint from the set $\Lambda_B + B(z + Nt_i)$, $z \in \Lambda_{N-1}$, we can use chessboard estimates (Theorem 2.4) to get
$$\mathbb{E}_{L,\beta}\Bigl(\prod_{j=1}^{K} 1_{\Gamma_j} \circ \tau_{BNt_j}\Bigr) \le p_{L,\beta}(\mathcal{B})^{|\Gamma_1|+\dots+|\Gamma_K|}. \tag{3.17}$$
A standard contour-counting argument now shows that, for any distinct $x, y \in \Lambda_{N-1}$,
$$\sum_{\Gamma\in S(x,y)} p_{L,\beta}(\mathcal{B})^{|\Gamma|} \le c_1\, p_{L,\beta}(\mathcal{B})^{d} \tag{3.18}$$
with some constant $c_1 = c_1(d)$, provided that $p_{L,\beta}(\mathcal{B})$ is sufficiently small. The sum over collections of pairs $(x_j, y_j)$, j = 1, . . . , K, contains at most $(N^{2d})^K$ terms, allowing us to bound
$$\Pi_{L,\beta}(\mathcal{C}_N) \le \Bigl(\frac{c_1\, p_{L,\beta}(\mathcal{B})^{d}}{\epsilon^2}\Bigr)^{K}. \tag{3.19}$$
Since $\Pi_{L,\beta}(\mathcal{C}_N)^{1/K} \to \tilde{p}_\beta(\mathcal{C}_N)$ as L → ∞, it follows that $\tilde{p}_\beta(\mathcal{C}_N) \le c_1\, p_\beta(\mathcal{B})^{d}\, \epsilon^{-2}$, which for $p_\beta(\mathcal{B})$ small enough, can be made smaller than any η initially prescribed.
Our final technical ingredient is an estimate on the Radon-Nikodym derivative of a Gibbs measure $\mu \in \mathfrak{G}_\beta$ and the torus measure at the same temperature:

Lemma 3.4. Let $\Lambda_L \subset \mathbb{Z}^d$ be an L-block and let $\mathbb{T}_{2L}$ be a torus of side 2L. Let us view $\Lambda_L$ as embedded into $\mathbb{T}_{2L}$ and let $\mathbb{P}_{2L,\beta}$ be the torus Gibbs measure on $\mathbb{T}_{2L}$. Then for any a > 0 there exists $L_0$ such that
$$e^{-\beta a L^d}\, \mathbb{P}_{2L,\beta}(A) \le \mu(A) \le e^{\beta a L^d}\, \mathbb{P}_{2L,\beta}(A) \tag{3.20}$$
for all $L \ge L_0$, any $\mu \in \mathfrak{G}_\beta$, and any $\mathcal{F}_{\Lambda_L}$-measurable event A.

Proof. For finite-range interactions, this lemma is completely standard. However, since our setting includes also interactions with infinite range, we provide a complete proof. We will prove only the right-hand side of the above inequality; the other side is completely analogous. First, from the DLR equation we know that there exists a configuration $s = (s_x)_{x\in\mathbb{Z}^d}$, such that
$$\mu(A \mid \mathcal{F}_{\Lambda_L^c})(s) \ge \mu(A) \tag{3.21}$$
with the left-hand side of the form (2.3). Let $s'$ be a configuration on $\mathbb{T}_{2L}$. We will show that $\mu(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s)$ and $\mathbb{P}_{2L,\beta}(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s')$ are absolutely continuous with respect to each other, as measures on $\mathcal{F}_{\Lambda_L}$, and the Radon-Nikodym derivative is bounded above by $e^{\beta a L^d}$ regardless of the “boundary conditions” s and $s'$. Suppose that $s_x = s'_x$ for all $x \in \Lambda_L$ and let $\bar{s}'$ be its 2L-periodic extension to all of $\mathbb{Z}^d$. Then the Radon-Nikodym derivative of $\mathbb{P}_{2L,\beta}(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s')$ with respect to the product measure $\prod_{x\in\Lambda_L} \nu_0(\mathrm{d}s_x)$ is $e^{-\beta H_{\Lambda_L}(\bar{s}')}/Z_{\Lambda_L}(\bar{s}'_{\Lambda_L^c})$ while that of $\mu(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s)$ is $e^{-\beta H_{\Lambda_L}(s)}/Z_{\Lambda_L}(s_{\Lambda_L^c})$. It thus suffices to show, uniformly in $(s_x)_{x\in\Lambda_L}$, that
$$\bigl|H_{\Lambda_L}(s) - H_{\Lambda_L}(\bar{s}')\bigr| \le \frac{a}{2} L^d \tag{3.22}$$
once L is sufficiently large. To this end, we first note that
$$\bigl|H_{\Lambda_L}(s) - H_{\Lambda_L}(\bar{s}')\bigr| \le 2 \sum_{\substack{A \colon A\cap\Lambda_L\neq\emptyset \\ A\cap\Lambda_L^c\neq\emptyset}} \|\Phi_A\|_\infty. \tag{3.23}$$
To estimate the right-hand side, we will decompose $\Lambda_L$ into “shells,” $\Lambda_n \setminus \Lambda_{n-1}$, and use the fact that if A intersects $\Lambda_n \setminus \Lambda_{n-1}$ as well as $\Lambda_L^c$, then the diameter of A must be at least L − n. Using the translation invariance of the interactions, we thus get
$$\sum_{\substack{A \colon A\cap\Lambda_L\neq\emptyset \\ A\cap\Lambda_L^c\neq\emptyset}} \|\Phi_A\|_\infty \le \sum_{n=1}^{L} |\Lambda_n \setminus \Lambda_{n-1}| \sum_{\substack{A \colon 0\in A \\ \operatorname{diam}(A)\ge L-n}} \|\Phi_A\|_\infty. \tag{3.24}$$
But $|||\Phi||| < \infty$ implies that the second sum tends to zero as L − n → ∞ and since $|\Lambda_n \setminus \Lambda_{n-1}| = o(L^d)$ while $\sum_{1\le n\le L} |\Lambda_n \setminus \Lambda_{n-1}| = L^d$, the result is thus $o(L^d)$. In particular, for L sufficiently large, the right-hand side of (3.23) will be less than $\frac{a}{2} L^d$.

3.2. Proofs of Theorem 2.5 and Corollary 2.6. Now we are ready to prove our main theorem:

Proof of Theorem 2.5. Fix $\epsilon < 1/2$ and let $\mu \in \mathfrak{G}_\beta$ be a B-shift ergodic Gibbs measure for which one of the conditions (2.15–2.16) fails. Applying Lemma 3.1 and the inclusion in Lemma 3.2 we find that
$$\mu\bigl(R_{M,N}(\mathcal{C}_N) > 1/2\bigr) > \frac{1}{2N^d} \tag{3.25}$$
once $N \ge N_0$ and $M \ge M_0(N)$. Now, consider the torus $\mathbb{T}_L$ of side L = 2MNB and embed $\Lambda_{MNB} = \bigcup_{x\in\Lambda_{M-1}} (\Lambda_{NB} + NBx)$ into $\mathbb{T}_L$ in the “usual” way. By Lemma 3.4 we know that for any fixed $N \ge N_0$, there exists a sequence $a_M$ of positive numbers with $a_M \downarrow 0$ as M → ∞, such that we have
$$\mathbb{P}_{L,\beta}\bigl(R_{M,N}(\mathcal{C}_N) > 1/2\bigr) > \frac{1}{2N^d}\, e^{-\beta (NB)^d a_M M^d}, \qquad M \to \infty. \tag{3.26}$$
Our goal is to show that, once N is chosen sufficiently large, the left-hand side is exponentially small in $M^d$, thus arriving at a contradiction. By conditioning on which of the $M^d/2$ translates of $\Lambda_{NB}$ have $\mathcal{C}_N$ satisfied, and applying the chessboard estimates in blocks of side NB, we get
$$\mathbb{P}_{L,\beta}\bigl(R_{M,N}(\mathcal{C}_N) > 1/2\bigr) \le 2^{M^d}\, \tilde{p}_{2L,\beta}(\mathcal{C}_N)^{M^d/2}, \tag{3.27}$$
where $\tilde{p}_{2L,\beta}(\mathcal{C}_N)$ is the finite-torus version of $\tilde{p}_\beta(\mathcal{C}_N)$. Next we choose η < 1/4 and let δ > 0 and $N \ge N_0$ be such that the bounds in Lemma 3.3 apply. Then for all sufficiently large M (and hence all large L) we have $\tilde{p}_{2L,\beta}(\mathcal{C}_N) < \eta$ and so
$$\mathbb{P}_{L,\beta}\bigl(R_{M,N}(\mathcal{C}_N) > 1/2\bigr) \le (4\eta)^{M^d/2}. \tag{3.28}$$
But this is true for all M ≫ 1 and so the bound (3.26) must be false. Hence, no such $\mu \in \mathfrak{G}_\beta$ could exist to begin with; i.e., (2.15–2.16) must hold for all B-shift ergodic $\mu \in \mathfrak{G}_\beta$.
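The incompatibility of the lower bound (3.26) with the upper bound (3.28) can be seen numerically by comparing logarithms. The sketch below is only an illustration: the sequence a_M = 1/M and all parameter values are assumptions chosen for the example, since the proof only uses that a_M decreases to zero and that η < 1/4.

```python
import math


def log_lower_bound(M, N, B, beta, d, a_M):
    """log of (1 / (2 N^d)) * exp(-beta (N B)^d a_M M^d), cf. (3.26)."""
    return -math.log(2 * N**d) - beta * (N * B) ** d * a_M * M**d


def log_upper_bound(M, d, eta):
    """log of (4 eta)^(M^d / 2), cf. (3.28)."""
    return 0.5 * M**d * math.log(4 * eta)


def first_contradiction(N, B, beta, d, eta, a, M_max=10**4):
    """Smallest M <= M_max where the upper bound drops below the lower bound.

    a is a callable M -> a_M; its form is an assumption of this example.
    Returns None when no such M is found up to M_max.
    """
    for M in range(1, M_max + 1):
        if log_upper_bound(M, d, eta) < log_lower_bound(M, N, B, beta, d, a(M)):
            return M
    return None


def a(M):
    return 1.0 / M  # assumed rate of decay of a_M


print(first_contradiction(N=2, B=1, beta=1.0, d=2, eta=0.1, a=a))
print(first_contradiction(N=2, B=1, beta=1.0, d=2, eta=0.3, a=a, M_max=200))
```

With η = 0.1 the upper bound decays like exp(−c M^d) with c > 0 while the exponent in the lower bound is only o(M^d), so the two cross at a finite M. With η = 0.3 one has 4η > 1, the upper bound is trivial, and no contradiction arises, which is why the proof requires η < 1/4.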
To finish our proofs, we will also need to establish our claims concerning phase coexistence:

Proof of Corollary 2.6. Suppose that $\epsilon$ and δ are such that Theorem 2.5 applies. By condition (1), the conclusions (2.15–2.16) of this theorem are thus available for all $\beta \in [\beta_1, \beta_2]$. This implies
$$\rho_\mu(\mathcal{G}_j) \in [0, \epsilon] \cup [1-\epsilon, 1], \qquad j = 1, 2, \tag{3.29}$$
for every B-shift ergodic $\mu \in \mathfrak{G}_\beta$ at every $\beta \in [\beta_1, \beta_2]$. We claim that $\rho_\mu(\mathcal{G}_2)$ is small in every ergodic state $\mu \in \mathfrak{G}_{\beta_1}$. Indeed, by Lemma 6.3 of [5] and condition (2) of the corollary, we have
$$p_{\beta_1}(\mathcal{B} \cup \mathcal{G}_2) \le p_{\beta_1}(\mathcal{B}) + p_{\beta_1}(\mathcal{G}_2) < 2\delta. \tag{3.30}$$
Hence, if the δ in Corollary 2.6 was so small that Theorem 2.5 applies for some $\epsilon < 1/2$ even when δ is replaced by 2δ, we can regard $\mathcal{B} \cup \mathcal{G}_2$ as a bad event at $\beta = \beta_1$ and conclude that $\rho_\mu(\mathcal{G}_2) < 1/2$, and hence $\rho_\mu(\mathcal{G}_2) \le \epsilon$, by (3.29), in every ergodic $\mu \in \mathfrak{G}_{\beta_1}$. A similar argument proves that $\rho_\mu(\mathcal{G}_1) \le \epsilon$ in every ergodic $\mu \in \mathfrak{G}_{\beta_2}$. Usual weak-limit arguments then yield the existence of at least one point $\beta_t \in (\beta_1, \beta_2)$ where both types of goodness coexist.

4. Applications

The formulation of our main result is somewhat abstract. In the present section, we will pick several models in which phase coexistence has been proved using chessboard estimates and use them to demonstrate the consequences of our main theorem. Although we will try to stay rather brief, we will show that, generally, the hypothesis of our main result (i.e., the assumption on smallness of the parameter $p_\beta(\mathcal{B})$) is directly implied by the calculations already carried out in the corresponding papers. The reader should consult the original articles for more motivation and further details concerning the particular models.

4.1. Potts model. The q-state Potts model serves as a paradigm of order-disorder transitions. The existence of the transition has been proved by chessboard estimates in [25]. While the completeness of the phase diagram has, in the meantime, been established with the help of Pirogov-Sinai theory [28], we find it useful to illustrate our general claims on this rather straightforward example. Later on we will pass to more complex systems where no form of completeness (and, more relevantly, no “forbidden gap”) has been proved. The spins $\sigma_x$ of the q-state Potts model take values in the set {1, . . . , q} with a priori equal probabilities. The formal Hamiltonian is
$$H(\sigma) = -\sum_{\langle x,y\rangle} \delta_{\sigma_x,\sigma_y}, \tag{4.1}$$
where $\langle x, y\rangle$ runs over all (unordered) nearest-neighbor pairs in $\mathbb{Z}^d$. The states of minimal energy have all neighboring spins equal, and so we expect that low temperature states are dominated by nearly constant spin-configurations. On the other hand, at high temperatures the spins should be nearly independent and, in particular, neighboring spins
will typically be different from each other. This leads us to consider the following good events on the 1-block $\Lambda_1$:
$$\mathcal{G}^{\mathrm{dis}} = \bigl\{\sigma : \sigma_x \ne \sigma_y \text{ for all } x, y \in \Lambda_1,\ |x-y| = 1\bigr\},$$
$$\mathcal{G}^{\mathrm{ord},m} = \bigl\{\sigma : \sigma_x = m \text{ for all } x \in \Lambda_1\bigr\}, \qquad m = 1, \dots, q. \tag{4.2}$$
Using similar events, it was proved [25] that, for d ≥ 2 and q sufficiently large, there exists an inverse temperature $\beta_t$ and q + 1 ergodic Gibbs states $\mu^{\mathrm{dis}} \in \mathfrak{G}_{\beta_t}$ and $\mu^{\mathrm{ord},m} \in \mathfrak{G}_{\beta_t}$, m = 1, . . . , q, such that the corresponding 1-block densities satisfy
$$\rho_{\mu^{\mathrm{dis}}}(\mathcal{G}^{\mathrm{dis}}) \ge 1-\epsilon \tag{4.3}$$
and
$$\rho_{\mu^{\mathrm{ord},m}}(\mathcal{G}^{\mathrm{ord},m}) \ge 1-\epsilon, \qquad m = 1, \dots, q, \tag{4.4}$$
where $\epsilon = \epsilon(q)$ tends to zero as q → ∞. In addition, monotonicity of the energy density as a function of β can be invoked to show that $\rho_\mu(\mathcal{G}^{\mathrm{dis}})$ is large in all translation-invariant $\mu \in \mathfrak{G}_\beta$ when $\beta < \beta_t$, while it is small in all such states when $\beta > \beta_t$. The full completeness [28] asserts that the above-mentioned q + 1 states exhaust the set of all shift-ergodic Gibbs states in $\mathfrak{G}_{\beta_t}$. A weaker claim follows as a straightforward application of our Theorem 2.5: For each shift-ergodic Gibbs state $\mu \in \mathfrak{G}_{\beta_t}$ we have either $\rho_\mu(\mathcal{G}^{\mathrm{dis}}) \ge 1-\epsilon$ or $\rho_\mu(\mathcal{G}^{\mathrm{ord},m}) \ge 1-\epsilon$ for some m = 1, . . . , q. The main hypothesis of our theorem amounts to the smallness of the quantity $p_\beta(\mathcal{B})$, where
$$\mathcal{B} = \Bigl(\mathcal{G}^{\mathrm{dis}} \cup \bigcup_{m=1}^{q} \mathcal{G}^{\mathrm{ord},m}\Bigr)^c, \tag{4.5}$$
which in turn boils down to an estimate on the probability of the disseminated event $\mathcal{B}$ on the right-hand side of (2.11). The needed estimate coincides with the bound provided in [25] by evaluating directly (i.e., “by hand”) the energy and the number of contributing configurations. The result (which in [25] appears right before the last formula on p. 506 and is used to produce (4.4)) reads
pβ (B) ≤
q d−2−(d−1) 1
2d
(q
− 2d)d
(4.6)
.
This implies the needed bound once q ≫ 1.

Remark 4.1. Analogous calculations establish the corresponding forbidden gap in more complicated variants of the Potts model; see e.g. [4].

4.2. Intermediate phases in dilute spin systems. The first instance where our results provide some new insight are dilute annealed ferromagnets exhibiting staggered order phases at intermediate temperatures. These systems have been studied in the context of both discrete [10] and continuous spins [11]. The characteristic examples of these classes are the site-diluted Potts model with the Hamiltonian
$$H(n, \sigma) = -\sum_{\langle x,y\rangle} n_x n_y (\delta_{\sigma_x,\sigma_y} - 1) - \lambda \sum_{x} n_x - \kappa \sum_{\langle x,y\rangle} n_x n_y, \tag{4.7}$$
and the site-diluted XY-model with the Hamiltonian
$$H(n, \phi) = -\sum_{\langle x,y\rangle} n_x n_y \bigl[\cos(\phi_x - \phi_y) - 1\bigr] - \lambda \sum_{x} n_x - \kappa \sum_{\langle x,y\rangle} n_x n_y. \tag{4.8}$$
Here, as before, $\sigma_x \in \{1, \dots, q\}$ are the Potts spins, $\phi_x \in [-\pi, \pi)$ are variables representing the “angle” of the corresponding O(2)-spins, and $n_x \in \{0, 1\}$ indicates the presence or absence of a particle (that carries the Potts spin $\sigma_x$ or the angle variable $\phi_x$) at site x. On the basis of “usual” arguments, the high temperature region is characterized by disordered configurations while the low temperatures feature configurations with a strong (local) order, at least at small-to-intermediate dilutions. The phenomenon discovered in [10, 11] is the existence of a region of intermediate temperatures and chemical potentials, sandwiched between the low temperature/high density ordered region and the high temperature/low density disordered region, where typical configurations exhibit preferential occupation of one of the even/odd sublattices. The appearance of such states is due to an effective entropic repulsion. Indeed, at low temperatures the spins on particles at neighboring sites are forced to be (nearly) aligned while if a particle is completely isolated, its spin is permitted to enjoy the full freedom of the available spin space. Hence, at intermediate temperatures and moderate dilutions, there is an entropic advantage for the particles to occupy only one of the sublattices. Let us concentrate on the portion of the phase boundary between the staggered region and the low temperature region. The claim can be stated uniformly for both systems in (4.7–4.8) provided we introduce the relevant good events in terms of the occupation variable n. Namely, we let:
$$\mathcal{G}^{\mathrm{dense}} = \bigl\{(\sigma, n) : n_x = 1 \text{ for all } x \in \Lambda_1\bigr\},$$
$$\mathcal{G}^{\mathrm{even}} = \bigl\{(\sigma, n) : n_x = 1_{\{x \text{ even}\}} \text{ for all } x \in \Lambda_1\bigr\}, \tag{4.9}$$
$$\mathcal{G}^{\mathrm{odd}} = \bigl\{(\sigma, n) : n_x = 1_{\{x \text{ odd}\}} \text{ for all } x \in \Lambda_1\bigr\}.$$
Again, using slightly modified versions of these events, it was shown in [10, 11] that there exist positive numbers $\epsilon, \kappa_0 \ll 1$ and, for every $\kappa \in (0, \kappa_0)$, an interval $I(\kappa) \subset \mathbb{R}$ such that the following is true: For any $\lambda \in I(\kappa)$ there exist inverse temperatures $\beta_1(\kappa, \lambda)$ and $\beta_2(\kappa, \lambda)$, and a transition temperature $\beta_t(\kappa, \lambda) \in [\beta_1, \beta_2]$ such that
(1) for any $\beta \in [\beta_t, \beta_2]$ there exists a “densely occupied” state $\mu^{\mathrm{dense}} \in \mathfrak{G}_\beta$, for which
$$\rho_{\mu^{\mathrm{dense}}}(\mathcal{G}^{\mathrm{dense}}) \ge 1-\epsilon, \tag{4.10}$$
(2) for any $\beta \in [\beta_1, \beta_t]$ there exist two states $\mu^{\mathrm{even}}, \mu^{\mathrm{odd}} \in \mathfrak{G}_\beta$ satisfying
$$\rho_{\mu^{\mathrm{even}}}(\mathcal{G}^{\mathrm{even}}) \ge 1-\epsilon \quad \text{and} \quad \rho_{\mu^{\mathrm{odd}}}(\mathcal{G}^{\mathrm{odd}}) \ge 1-\epsilon. \tag{4.11}$$
The error $\epsilon$ is of order $\beta^{-1/8}$ (cf. the bound (2.15) in [11]) in the case of the XY-model in d = 2, and it tends to zero as q → ∞ in the case of the diluted Potts model. A somewhat stronger conclusion can be made for the diluted Potts model. Namely, at $\beta = \beta_t$, there are actually q + 2 distinct states, two staggered states $\mu^{\mathrm{even}}$ and $\mu^{\mathrm{odd}}$ and q ordered states $\mu^{\mathrm{dense},m}$, with the latter characterized by the condition
$$\rho_{\mu^{\mathrm{dense},m}}(\mathcal{G}^{\mathrm{dense},m}) \ge 1-\epsilon, \tag{4.12}$$
where
\[
\mathcal G^{\mathrm{dense},m} = \bigl\{(\sigma,n)\colon n_x = 1 \text{ and } \sigma_x = m \text{ for all } x\in\Lambda_1\bigr\}.
\tag{4.13}
\]
It is plausible that an analogous conclusion applies to the XY-model in d ≥ 3 because there the low-temperature phase should exhibit magnetic order. However, in d = 2 such long-range order is not permitted by the Mermin-Wagner theorem and so there one expects to have only 3 distinct ergodic Gibbs states at βt. A weaker form of the expected conclusion is an easy consequence of our Theorem 2.5: For each extremal 2-periodic Gibbs state µ ∈ Gβt there exists G ∈ {G^even, G^odd, G^dense} (in the case of the diluted Potts model, G ∈ {G^even, G^odd, G^{dense,m}, m = 1, . . . , q}) such that
\[
\rho_\mu(\mathcal G) \ge 1-\epsilon.
\tag{4.14}
\]
In particular, no ergodic Gibbs state µ ∈ Gβt has particle density in [ε, 1/2 − ε] ∪ [1/2 + ε, 1 − ε]. The proof of these observations goes by noting that the smallness of pβ(ℬ) for the bad event ℬ = (G^dense ∪ G^even ∪ G^odd)^c is a direct consequence of the corresponding bounds from [10, 11] of the “contour events.” In the case of the XY-model in dimension d = 2, this amounts to the bounds (2.9) and (2.15) from [11].

Remark 4.2. A more general class of models, with spin taking values in a Riemannian manifold, is also considered in [11]. A related phase transition in an annealed diluted O(n) Heisenberg ferromagnet has been proved in [12].

4.3. Order-by-disorder transitions. Another class of systems where our results provide new information consists of the O(2)-nearest and next-nearest neighbor antiferromagnet [3], the 120-degree model [5], and the orbital-compass model [6]. All of these are continuum-spin systems whose common feature is that the infinite degeneracy of the ground states is broken, at positive temperatures, by long-wavelength (spin-wave) excitations. We will restrict our attention to the first of these models, the O(2)-nearest and next-nearest neighbor antiferromagnet. The other two models are somewhat more complicated (particularly, due to the presence of non-translation invariant ground states) but the conclusions are fairly analogous. Consider a spin system on Z² whose spins, S_x, take values on the unit circle in R² with a priori uniform distribution. The Hamiltonian is
\[
H(\boldsymbol S) = \sum_x \bigl(\boldsymbol S_x\cdot\boldsymbol S_{x+\hat e_1+\hat e_2} + \boldsymbol S_x\cdot\boldsymbol S_{x+\hat e_1-\hat e_2}\bigr)
+ \gamma \sum_x \bigl(\boldsymbol S_x\cdot\boldsymbol S_{x+\hat e_1} + \boldsymbol S_x\cdot\boldsymbol S_{x+\hat e_2}\bigr),
\tag{4.15}
\]
where ê1 and ê2 are the unit vectors in the coordinate lattice directions and the dot denotes the usual scalar product. Note that both nearest and next-nearest neighbors are coupled antiferromagnetically but with a different strength. The following are the ground state configurations for γ ∈ (−2, 2): Both even and odd sublattices enjoy a Néel (antiferromagnetic) order, but the relative orientation of these sublattice states is arbitrary. It is clear that, at low temperatures, the configurations will be locally near one of the aforementioned ground states. Due to the continuous nature of the spins, the fluctuation spectrum is dominated by “harmonic perturbations,” a.k.a. spin waves. A heuristic spin-wave calculation (cf. [5, Sect. 2.2] for an example in the context of the 120-degree model) suggests that among all 2π possible relative orientations of the sublattices, the parallel and the antiparallel orientations are those entropically most favorable. And,
indeed, as was proved in [3], there exist two 2-periodic Gibbs states µ1 and µ2 with the corresponding type of long-range order. However, the existence of Gibbs states with other relative orientations has not been ruled out. We will now state a stronger version of [3, Theorem 2.1]. Let B be a large even integer and consider two B-block events G1 and G2 defined as follows: fixing a positive κ ≪ 1, let
\[
\mathcal G_1 = \bigcap_{\substack{x,y\in\Lambda_B\\ (y-x)\cdot\hat e_2=0}} \{\boldsymbol S_x\cdot\boldsymbol S_y \ge 1-\kappa\}
\;\cap \bigcap_{x,\,x+\hat e_2\in\Lambda_B} \{\boldsymbol S_x\cdot\boldsymbol S_{x+\hat e_2} \le -1+\kappa\},
\tag{4.16}
\]
i.e., G1 enforces horizontal stripes all over Λ_B. The event G2 in turn enforces vertical stripes; the definition is as above with the roles of ê1 and ê2 interchanged. Then we have:

Theorem 4.3. Let γ ∈ (0, 2) and let κ ≪ 1. For each ε > 0 there exists β0 ∈ (0, ∞) such that for each β ≥ β0:
(1) There exist two ergodic Gibbs states µ1, µ2 ∈ Gβ, such that
\[
\rho_{\mu_j}(\mathcal G_j) \ge 1-\epsilon, \qquad j = 1, 2.
\tag{4.17}
\]
(2) There exists an integer B ≥ 1 such that for any µ ∈ Gβ that is ergodic with respect to shifts by multiples of B we have either
\[
\rho_\mu(\mathcal G_1) \ge 1-\epsilon \quad\text{or}\quad \rho_\mu(\mathcal G_2) \ge 1-\epsilon.
\tag{4.18}
\]
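The ground-state degeneracy behind Theorem 4.3 can be illustrated numerically. The following Python sketch evaluates the Hamiltonian (4.15) per site on a small even torus for configurations in which both sublattices are Néel ordered and the odd sublattice is rotated by a relative angle θ. The parametrization φ_x = πx_1 + θ·1{x odd}, the helper names, and the choice of torus size, γ and κ are assumptions made for this illustration; they are not taken from [3]. Under these assumptions the energy per site equals −2 for every θ, whereas only θ = π and θ = 0 give the stripe patterns required by G1 and G2, respectively.

```python
import math

def neel_angles(theta, L):
    """Angles phi[x1][x2] on an L x L torus (L even): both sublattices are
    Neel ordered and the odd sublattice is rotated by theta (assumed form)."""
    return [[math.pi * x1 + (theta if (x1 + x2) % 2 else 0.0)
             for x2 in range(L)] for x1 in range(L)]

def energy_per_site(phi, gamma):
    """Hamiltonian (4.15) divided by the number of sites, periodic boundary."""
    L = len(phi)
    total = 0.0
    for x1 in range(L):
        for x2 in range(L):
            def dot(d1, d2):
                # For unit vectors, S_x . S_y is the cosine of the angle difference.
                return math.cos(phi[x1][x2] - phi[(x1 + d1) % L][(x2 + d2) % L])
            total += dot(1, 1) + dot(1, -1) + gamma * (dot(1, 0) + dot(0, 1))
    return total / (L * L)

def in_G1(phi, kappa):
    """Horizontal stripes on the whole block, as in (4.16):
    aligned within each row (same x2), anti-aligned along e2."""
    B = len(phi)
    aligned = all(math.cos(phi[x1][x2] - phi[y1][x2]) >= 1 - kappa
                  for x2 in range(B) for x1 in range(B) for y1 in range(B))
    anti = all(math.cos(phi[x1][x2] - phi[x1][x2 + 1]) <= -1 + kappa
               for x1 in range(B) for x2 in range(B - 1))
    return aligned and anti

def in_G2(phi, kappa):
    """Vertical stripes: the same test with the roles of e1 and e2 interchanged."""
    transposed = [list(col) for col in zip(*phi)]
    return in_G1(transposed, kappa)

if __name__ == "__main__":
    for theta in (0.0, 0.5, math.pi / 2, math.pi):
        phi = neel_angles(theta, 6)
        print(theta, energy_per_site(phi, 1.0), in_G1(phi, 0.1), in_G2(phi, 0.1))
```

The energy is independent of θ because the two nearest-neighbor terms contribute cos θ and −cos θ, which cancel; the selection of θ ∈ {0, π} at positive temperature is therefore an entropic effect, not an energetic one.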
The first conclusion, the existence of Gibbs states with parallel and antiparallel relative orientation of the sublattices, was the main content of Theorem 2.1 of [3]. What we have added here is that the corresponding configurations dominate all ergodic Gibbs states. The O(2) ground-state symmetry of the relative orientation of the sublattices is thus truly broken at positive temperatures, which bolsters significantly the main point of [3]. Note that no restrictions are posed on the overall orientation of the spins. Indeed, by the Mermin-Wagner theorem every µ ∈ Gβ is invariant under simultaneous rotations of all spins.

Proof of Theorem 4.3. As expected, the proof boils down to showing that, for a proper choice of scale B, we have pβ(ℬ) ≪ 1 for ℬ = (G1 ∪ G2)^c. In [3] this is done by decomposing ℬ into more elementary events (depending on whether the “badness” comes from excessive energy or insufficient entropy) and estimating each of them separately. The relevant bounds are proved in [3, Lemmas 4.4 and 4.5] and combined together in [3, Eq. (4.20)]. Applying Theorem 2.5 of the present paper, we thus know that every B-shift ergodic µ ∈ Gβ is dominated either by blocks of type G1 or by blocks of type G2. Since ρµ(ℬ) ≤ ε in all states, the existence of µ1, µ2 ∈ Gβ satisfying (4.17) follows by symmetry with respect to rotation (of the lattice) by 90 degrees.

4.4. Nonlinear vector models. A class of models with continuous symmetry that are conceptually close to the Potts model has been studied recently by van Enter and Shlosman [17]. As for our previous examples with continuous spins, Pirogov-Sinai theory is not readily available and one has to rely on chessboard estimates. We will focus our attention on one example in this class, a nonlinear ferromagnet, although our conclusions apply with appropriate, and somewhat delicate, modifications also to liquid crystal models and lattice gauge models discussed in [17].
Let us consider an O(2)-spin system on Z² with spins parametrized by the angular variables φx ∈ (−π, π]. The Hamiltonian is given by
\[
H(\phi) = -\sum_{\langle x,y\rangle} \Bigl(\frac{1+\cos(\phi_x-\phi_y)}{2}\Bigr)^{p},
\tag{4.19}
\]
where p is a nonlinearity parameter. The a priori distribution of the φx’s is the Lebesgue measure on (−π, π]; the difference φx − φy is always taken modulo 2π. In order to define the good block events, we first split all bonds into three classes. Namely, given a configuration (φx), x ∈ Z², we say that the bond ⟨x, y⟩ is
(1) strongly ordered if |φx − φy| ≤ 1/(C√p),
(2) weakly ordered if 1/(C√p) < |φx − φy| < C/√p, and
(3) disordered if |φx − φy| ≥ C/√p.
Here C is a large number to be determined later. If a bond is either strongly or weakly ordered, we will call it simply ordered. On the basis of (4.19), it is clear that strongly ordered bonds are favored energetically while the disordered bonds are favored entropically. The main observation of [17], going back to [14, 25, 1], is that, at least in torus measures, ordered and disordered bonds are unlikely to occur in the same configuration. This immediately implies coexistence of at least two distinct states at some intermediate temperature. Moreover, since it is also unlikely to have many bonds in the “borderline” region |φx − φy| ≈ C/√p, the transition is accompanied by a jump in the energy density. But, to prove that the energy gap stays uniformly positive as p → ∞, it appears that one needs to establish the existence of a free-energy barrier between the strongly ordered and disordered phases. Let Λ_1 be a 1-block (i.e., a plaquette) and let us consider the following good events on Λ_1: The event that all bonds on Λ_1 are strongly ordered,
\[
\mathcal G_{\mathrm{so}} = \Bigl\{ |\phi_x-\phi_y| \le \frac{1}{C\sqrt p}\colon \forall x,y\in\Lambda_1,\ |x-y| = 1 \Bigr\},
\tag{4.20}
\]
and the event that all bonds on Λ_1 are disordered,
\[
\mathcal G_{\mathrm{dis}} = \Bigl\{ |\phi_x-\phi_y| \ge \frac{C}{\sqrt p}\colon \forall x,y\in\Lambda_1,\ |x-y| = 1 \Bigr\}.
\tag{4.21}
\]
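The three bond classes and the two good plaquette events can be made concrete with a short Python sketch. The function names, the representation of a plaquette as four corner angles in cyclic order, and the sample values of p and C are assumptions chosen for illustration; only the thresholds 1/(C√p) and C/√p come from the definitions above.

```python
import math

def angle_diff(phi_x, phi_y):
    """|phi_x - phi_y| taken modulo 2*pi, returned in [0, pi]."""
    d = (phi_x - phi_y) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def classify_bond(phi_x, phi_y, p, C):
    """Return the class of the bond <x, y> for nonlinearity p and constant C."""
    d = angle_diff(phi_x, phi_y)
    if d <= 1.0 / (C * math.sqrt(p)):
        return "strongly ordered"
    if d >= C / math.sqrt(p):
        return "disordered"
    return "weakly ordered"

def plaquette_event(corners, p, C):
    """corners: the four angles of a plaquette in cyclic order.
    Returns 'G_so', 'G_dis', or 'bad' (the complement of their union)."""
    kinds = {classify_bond(corners[i], corners[(i + 1) % 4], p, C) for i in range(4)}
    if kinds == {"strongly ordered"}:
        return "G_so"
    if kinds == {"disordered"}:
        return "G_dis"
    return "bad"

if __name__ == "__main__":
    p, C = 10_000, 5.0   # thresholds: 1/(C*sqrt(p)) = 0.002 and C/sqrt(p) = 0.05
    print(classify_bond(0.0, 0.001, p, C))
    print(classify_bond(0.0, 0.01, p, C))
    print(plaquette_event([0.0, 1.0, 2.0, 3.0], p, C))
```

A plaquette is bad either because it contains a weakly ordered bond or because it mixes strongly ordered and disordered bonds; this is the split that the proof in Sect. 5 treats separately.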
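Then we have:

Theorem 4.4. For each ε > 0 and each sufficiently large C > 1, there exists p0 > 0 such that for all p > p0, there exists a number βt ∈ (0, ∞) and two distinct, shift-ergodic Gibbs states µso, µdis ∈ Gβt such that
\[
\rho_{\mu_{\mathrm{so}}}(\mathcal G_{\mathrm{so}}) \ge 1-\epsilon
\quad\text{and}\quad
\rho_{\mu_{\mathrm{dis}}}(\mathcal G_{\mathrm{dis}}) \ge 1-\epsilon.
\tag{4.22}
\]
In addition, for all shift-ergodic Gibbs states µ ∈ Gβt, we have either
\[
\rho_\mu(\mathcal G_{\mathrm{dis}}) \ge 1-\epsilon
\quad\text{or}\quad
\rho_\mu(\mathcal G_{\mathrm{so}}) \ge 1-\epsilon,
\tag{4.23}
\]
while
\[
\rho_\mu(\mathcal G_{\mathrm{so}}) \ge 1-\epsilon \quad\text{for all shift-ergodic } \mu\in\mathfrak G_\beta \text{ with } \beta > \beta_t
\tag{4.24}
\]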
and
\[
\rho_\mu(\mathcal G_{\mathrm{dis}}) \ge 1-\epsilon \quad\text{for all shift-ergodic } \mu\in\mathfrak G_\beta \text{ with } \beta < \beta_t.
\tag{4.25}
\]
Finally, for every p > p0 and C large, every ergodic Gibbs state will have energy near zero when β > βt and at least 1 − O(C^{−2}) when β < βt.

We remark that the existence of a first-order transition in energy density has been a matter of some controversy in the physics literature; see [16, 17] for more discussion and relevant references. The proof of Theorem 4.4 is fairly technical and it is therefore deferred to Sect. 5.
4.5. Magnetostriction transition. Our final example is the magnetostriction transition studied recently by Shlosman and Zagrebnov [33]. The specific system considered in [33] has the Hamiltonian
\[
H(\sigma, r) = -\sum_{\langle x,y\rangle} J(r_{x,y})\,\sigma_x\sigma_y
+ \kappa \sum_{\langle x,y\rangle} (r_{x,y}-R)^2
+ \lambda \sum_{\substack{\langle x,y\rangle,\,\langle z,y\rangle\\ |x-z|=\sqrt2}} (r_{x,y}-r_{z,y})^2.
\tag{4.26}
\]
Here the sites x ∈ Z^d label the atoms in a crystal; the atoms have magnetic moments represented by the Ising spins σx. The crystal is not rigid; the variables r_{x,y} ∈ R, r_{x,y} > 0, play the role of spatial distance between neighboring crystal sites. The word magnetostriction refers to the phenomenon where a solid undergoes a magnetic transition accompanied by a drastic change in the crystalline structure. In [33] such a transition was proven for interaction potentials J = J(r_{x,y}) that are strong at short distances and weak at large distances. The relevant states are characterized by disjoint contracted,
\[
\mathcal G^{\mathrm{contr}} = \bigl\{(r,\sigma)\colon r_{x,y} \le \eta,\ \forall x,y\in\Lambda_1,\ |x-y| = 1\bigr\},
\tag{4.27}
\]
and expanded,
\[
\mathcal G^{\mathrm{exp},\pm} = \bigl\{(r,\sigma)\colon r_{x,y} \ge \eta+\epsilon,\ \forall x,y\in\Lambda_1,\ |x-y| = 1\bigr\} \cap \bigl\{\sigma_x = \pm1,\ \forall x\in\Lambda_1\bigr\},
\tag{4.28}
\]
block events. The parameters η and ε can be chosen so that there exists βt ∈ (0, ∞) for which the following holds:
(1) For all β ≤ βt there exists an expanded Gibbs state µ^exp ∈ Gβ such that ρ_{µ^exp}(G^exp) ≥ 3/4;
(2) For all β ≥ βt there exist two distinct contracted Gibbs states µ^{contr,±} ∈ Gβ such that ρ_{µ^{contr,±}}(G^{contr,±}) ≥ 3/4.
In particular, at β = βt there exist three distinct Gibbs states: one expanded and two contracted with opposite values of the magnetization. The authors conjecture that these are the only shift-ergodic Gibbs states at β = βt. Unfortunately, the above system has unbounded interactions and so it is not strictly of the form for which Theorem 2.5 applies. Instead we will use the following generalization:
Theorem 4.5. Let d ≥ 2 and consider a spin system with translation-invariant finite-range interaction potentials (Φ_A), A ⋐ Z^d, such that the torus measure is reflection positive for all even L. Let G1, . . . , Gr be a collection of good B-block events satisfying the requirements in Theorem 2.5 and let ℬ be the corresponding bad event. Then for all ε > 0 there exists δ > 0 (depending possibly only on d but not on details of the model nor on n or B) such that for all β ≥ 0 for which pβ(ℬ) < δ the following is true: If µ ∈ Gβ is a B-shift ergodic Gibbs state with
\[
\sum_{\substack{A\colon A\Subset\mathbb Z^d\\ 0\in A}} E_\mu |\Phi_A| < \infty,
\tag{4.29}
\]
then we have
\[
\rho_\mu(\mathcal B) \in [0,\epsilon],
\tag{4.30}
\]
and there exists i ∈ {1, . . . , r} such that
\[
\rho_\mu(\mathcal G_i) \ge 1-\epsilon.
\tag{4.31}
\]
Proof. The proof is virtually identical to that of Theorem 2.5 with one exception: Since the interactions are not bounded, we cannot use Lemma 3.4 directly. Suppose we have a Gibbs state µ that obeys (4.29) but violates one of the conditions (4.30–4.31). Let R_{M,N}(C_N) be as in (3.4). Lemma 3.1 still applies and so we have (3.5) for some N. Let L = MNB and let D_M be the event that the boundary energy in the box Λ_L is less than cM^{d−1}, i.e.,
\[
D_M = \Bigl\{ \sum_{\substack{A\colon A\cap\Lambda_L\ne\emptyset\\ A\cap\Lambda_L^{c}\ne\emptyset}} |\Phi_A| \le cM^{d-1} \Bigr\},
\tag{4.32}
\]
where c is a positive constant. In light of the condition (4.29), the fact that the interaction has a finite range, and the Chebyshev bound, it is clear that we can choose c so that µ(D_M^c) < (4N^d)^{−1} for all M. Hence, we have
\[
\mu\bigl(D_M \cap \{R_{M,N}(C_N) > 1/2\}\bigr) > \frac{1}{4N^d}.
\tag{4.33}
\]
Next let s and s′ be as in the proof of Lemma 3.4 and suppose that both s and s′ belong to D_M. Then, by definition,
\[
H_{\Lambda_L}(s) - H_{\Lambda_L}(s') \le 2cM^{d-1}
\tag{4.34}
\]
and, applying the rest of the proof of Lemma 3.4, we thus have
\[
\mu\bigl(D_M \cap \{R_{M,N}(C_N) > 1/2\}\bigr) \le e^{2\beta cM^{d-1}}\, P_{2L,\beta}\bigl(D_M \cap \{R_{M,N}(C_N) > 1/2\}\bigr).
\tag{4.35}
\]
Neglecting D_M on the right-hand side and invoking (3.28), we again derive the desired contradiction once M is sufficiently large.
With Theorem 4.5 in hand, we can extract the desired conclusion for the magnetostriction transition. First, the energy condition is clearly satisfied in any state generated by tempered boundary conditions. We then know that, in every such ergodic state µ, only a small number of blocks will feature bonds that are neither contracted (and magnetized) nor expanded (and non-magnetized):
\[
\rho_\mu(\mathcal G^{\mathrm{exp}}),\ \rho_\mu(\mathcal G^{\mathrm{exp},\pm}) \in [0,\epsilon]\cup[1-\epsilon,1]
\quad\text{and}\quad
\rho_\mu(\mathcal B) \le \epsilon.
\tag{4.36}
\]
The existence of a phase transition follows by noting that the contracted states have less energy than the expanded ones; there is thus a jump in the energy density as the temperature varies.

5. Appendix

The goal of this section is to prove Theorem 4.4 which concerns the non-linear vector model with interaction (4.19). The technical part of the proof is encapsulated into the following claim:

Proposition 5.1. There exists a constant C0 > 0 such that for all δ > 0 and all C ≥ C0 the following holds: There exists p0 > 0 such that for all p ≥ p0 we have
\[
\sup_{\beta\ge0}\, p_\beta\bigl((\mathcal G_{\mathrm{so}}\cup\mathcal G_{\mathrm{dis}})^{c}\bigr) < \delta
\tag{5.1}
\]
and
\[
\lim_{\beta\to\infty} p_\beta(\mathcal G_{\mathrm{dis}}) = 0
\quad\text{and}\quad
\lim_{\beta\downarrow0} p_\beta(\mathcal G_{\mathrm{so}}) < \delta.
\tag{5.2}
\]
To prove this proposition, we will need to carry out a sequence of energy and entropy bounds. To make our energy estimates easier, and uniform in p, we first notice that there are constants 0 < a < b such that
\[
e^{-bx^2} \le \frac{1+\cos(x)}{2} \le e^{-ax^2}, \qquad -1\le x\le 1.
\tag{5.3}
\]
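A concrete admissible pair (a, b) in (5.3) can be checked numerically. The sketch below assumes the values a = 1/4 and b = 0.3, which are illustrative choices and not constants quoted from the paper. The upper bound with a = 1/4 follows from (1 + cos x)/2 = cos²(x/2) together with ln cos u ≤ −u²/2; the lower bound with b = 0.3 is verified here only on a grid.

```python
import math

def satisfies_5_3(a, b, n=2001, tol=1e-12):
    """Check exp(-b x^2) <= (1 + cos x)/2 <= exp(-a x^2) on a grid of [-1, 1].
    The tolerance absorbs rounding near x = 0, where all three terms are ~1."""
    for i in range(n):
        x = -1.0 + 2.0 * i / (n - 1)
        mid = (1.0 + math.cos(x)) / 2.0
        lower_ok = math.exp(-b * x * x) <= mid + tol
        upper_ok = mid <= math.exp(-a * x * x) + tol
        if not (lower_ok and upper_ok):
            return False
    return True

if __name__ == "__main__":
    print(satisfies_5_3(0.25, 0.3))   # assumed admissible pair
    print(satisfies_5_3(0.3, 0.3))    # a too large: the upper bound fails
```

The restriction to |x| ≤ 1 matters for the lower bound only: the ratio −ln((1 + cos x)/2)/x² increases with |x|, so b has to be at least its value at x = 1, roughly 0.261.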
The argument commences by splitting the bad event ℬ = (Gso ∪ Gdis)^c into two events: The event ℬwo that Λ_1 contains a weakly-ordered bond, and ℬmix = ℬ \ ℬwo which, as a moment’s thought reveals, is the event that Λ_1 contains two adjacent bonds, one of which is strongly ordered and the other disordered. The principal chessboard estimate yields the following lemma:

Lemma 5.2. Suppose that C ≤ √p. Then
\[
p_\beta(\mathcal B_{\mathrm{wo}}) \le 4 \min\Bigl\{ \frac{C^2}{\kappa}\, e^{-2\beta[e^{-b\kappa^2/C^2} - e^{-a/C^2}]},\ \frac{C}{\pi\sqrt p}\, e^{2\beta e^{-a/C^2}} \Bigr\}^{1/4}
\tag{5.4}
\]
and
\[
p_\beta(\mathcal B_{\mathrm{mix}}) \le 4 \min\Bigl\{ e^{-2\beta[\frac32 e^{-b/C^2} - 1 - e^{-aC^2}]},\ e^{2\beta}\Bigl(\frac{1}{\pi C\sqrt p}\Bigr)^{3/4} \Bigr\}^{1/2}
\tag{5.5}
\]
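for all β ≥ 0 and all κ ∈ (0, 1). Moreover, we have
\[
p_\beta(\mathcal G_{\mathrm{dis}}) \le \pi C\sqrt p\, \exp\bigl\{-2\beta\bigl[e^{-b/C^2} - e^{-aC^2}\bigr]\bigr\}
\tag{5.6}
\]
and
\[
p_\beta(\mathcal G_{\mathrm{so}}) \le \frac{e^{2\beta}}{\pi C\sqrt p}.
\tag{5.7}
\]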
Proof. Let Z_L be the partition function obtained by integrating e^{−βH_L} over all allowed configurations. Consider the following reduced partition functions:
(1) Z_L^dis, obtained by integrating e^{−βH_L} subject to the restriction that every bond in T_L is disordered.
(2) Z_L^so, obtained similarly while stipulating that every bond in T_L is strongly ordered.
(3) Z_L^wo, in which every bond in T_L is asked to be weakly ordered.
(4) Z_L^mix, enforcing that every other horizontal line contains only strongly-ordered bonds, and the remaining lines contain only disordered bonds. A similar periodic pattern is imposed on vertical lines as well.
To prove the lemma, we will need upper and lower bounds on the partition functions in (1-2), and upper bounds on the partition functions in (3-4). We begin with upper and lower bounds on Z_L^dis. First, using the fact that the Hamiltonian is always non-positive, we have e^{−βH_L} ≥ 1. On the other hand, the inequalities (5.3) and a natural monotonicity of the interaction imply that
\[
\Bigl(\frac{1+\cos(\phi_x-\phi_y)}{2}\Bigr)^{p} \le \Bigl(\frac{1+\cos(C/\sqrt p)}{2}\Bigr)^{p} \le e^{-aC^2}
\tag{5.8}
\]
whenever ⟨x, y⟩ is a disordered bond. In particular, −βH_L is less than 2βe^{−aC²}|T_L| for every configuration contributing to Z_L^dis. Using these observations we now easily derive that
\[
(2\pi)^{|T_L|} \le Z_L^{\mathrm{dis}} \le (2\pi)^{|T_L|}\, e^{2\beta e^{-aC^2}|T_L|}.
\tag{5.9}
\]
Similarly, for the partition function Z_L^so we get
\[
e^{2\beta e^{-b\kappa^2/C^2}|T_L|} \Bigl(\frac{2\kappa}{C\sqrt p}\Bigr)^{|T_L|} \le Z_L^{\mathrm{so}} \le 2\pi\, e^{2\beta|T_L|} \Bigl(\frac{2}{C\sqrt p}\Bigr)^{|T_L|-1}.
\tag{5.10}
\]
Indeed, for the upper bound we first note that −βH_L ≤ 2β|T_L|. Then we fix a tree spanning all vertices of T_L, disregard the constraints everywhere except on the edges in the tree and, starting from the “leaves,” we sequentially integrate all site variables. (Thus, each site is effectively forced into an interval of length 2/(C√p), except for the “root” which retains all of its 2π possibilities.) For the lower bound we fix a number κ ∈ (0, 1) and restrict the integrals to configurations such that |φx − φy| ≤ κ/(C√p) for all bonds ⟨x, y⟩ in T_L. The bound −βH_L ≥ 2βe^{−bκ²/C²}|T_L| then permits us to estimate away the Boltzmann factor for all configurations; the entropy factor reflects the fact that each site can vary throughout an interval of length at least 2κ/(C√p).
Next we will derive good upper bounds on the remaining two partition functions. First, similar estimates as those leading to the upper bound in (5.10) give us
\[
Z_L^{\mathrm{wo}} \le 2\pi \Bigl( e^{2\beta e^{-a/C^2}}\, \frac{2C}{\sqrt p} \Bigr)^{|T_L|}.
\tag{5.11}
\]
For the partition function Z_L^mix we note that 1/4 of all sites are adjacent only to disordered bonds, while the remaining 3/4 are connected to one another via a grid of strongly-ordered bonds. Estimating −βH_L ≤ β(1 + e^{−aC²})|T_L| for all relevant configurations, similar calculations as those leading to (5.10) again give us
\[
Z_L^{\mathrm{mix}} \le 2\pi\, e^{\beta(1+e^{-aC^2})|T_L|}\, (2\pi)^{\frac{|T_L|}{4}} \Bigl(\frac{2}{C\sqrt p}\Bigr)^{\frac34|T_L|-1}.
\tag{5.12}
\]
It now remains to combine these estimates into the bounds on the quantities on the left-hand side of (5.4–5.5) and (5.6–5.7). We begin with the bound (5.6). Clearly, pβ(Gdis) is the L → ∞ limit of (Z_L^dis/Z_L)^{1/|T_L|}, which using the lower bound Z_L ≥ Z_L^so with κ = 1 easily implies (5.6). The bound (5.7) is obtained similarly, except that now we use that Z_L ≥ Z_L^dis. The remaining two bounds will conveniently use the fact that for two-dimensional nearest-neighbor models, and square tori, the torus measure P_{L,β} is reflection positive even with respect to the diagonal planes in T_L. Indeed, focusing on (5.4) for a moment, we first note that ℬwo is covered by the union of four (non-disjoint) events characterized by the position of the weakly-ordered bond on Λ_1. If ℬwo^{(1)} is the event that the lower horizontal bond is the culprit, the subadditivity property of pβ (see Lemma 6.3 of [5]) gives us pβ(ℬwo) ≤ 4pβ(ℬwo^{(1)}). Disseminating ℬwo^{(1)} using reflections in coordinate directions, we obtain an event enforcing weakly-ordered bonds on every other horizontal line. Next we apply a reflection in a diagonal line of even parity to make this into an even parity grid. From the perspective of reflections in odd-parity diagonal lines (i.e., those not passing through the vertices of the grid) half of the “cells” enforce all four bonds therein to be weakly ordered, while the other half does nothing. Applying chessboard estimates for these diagonal reflections, we get rid of the latter cells. The result of all these operations is the bound
\[
p_\beta(\mathcal B_{\mathrm{wo}}) \le \lim_{L\to\infty} 4 \Bigl(\frac{Z_L^{\mathrm{wo}}}{Z_L}\Bigr)^{\frac{1}{4|T_L|}}.
\tag{5.13}
\]
Estimating Z_L from below by the left-hand sides of (5.9–5.10) now directly implies (5.4). The event ℬmix is handled similarly: First we fix a position of the ordered-disordered pair of bonds and use subadditivity of pβ to enforce the same choice at every lattice plaquette; this leaves us with four overall choices. Next we use diagonal reflections to produce the event underlying Z_L^mix. Estimating Z_L from below by the 1/4-th power of the lower bound in (5.9) and the 3/4-th power of the lower bound in (5.10) with κ = 1, we get the first term in the minimum in (5.5). To get the second term, we use that Z_L ≥ Z_L^dis, apply (5.12) and invoke the bound 1 + e^{−aC²} ≤ 2.

Proof of Proposition 5.1. The desired properties are simple consequences of the bounds in Lemma 5.2. Indeed, if C is so large that e^{−b/C²} > e^{−aC²}, then (5.6) implies that pβ(Gdis) → 0 as β → ∞. On the other hand, (5.7) shows that the β → 0 limit
of pβ(Gso) is of order 1/√p, which can be made as small as desired by choosing p sufficiently large. To prove also (5.1), we first invoke Lemma 6.3 of [5] one last time to see that pβ(ℬ) ≤ pβ(ℬwo) + pβ(ℬmix). We thus have to show that both pβ(ℬwo) and pβ(ℬmix) can be made arbitrarily small by increasing p appropriately. We begin with pβ(ℬmix). Let C be so large that
\[
\tfrac32 e^{-b/C^2} - 1 - e^{-aC^2} > 0.
\tag{5.14}
\]
Then for β such that e^{2β} > p^{1/4} the first term in the minimum in (5.5) decays like a negative power of p, while for the complementary values of β, the second term is O(p^{−1/8}). As to the remaining term, pβ(ℬwo), here we choose κ ∈ (0, 1) such that
\[
e^{-b\kappa^2/C^2} - e^{-a/C^2} > 0,
\tag{5.15}
\]
and apply the first part of the minimum in (5.4) for β with e^{2β} ≥ √p, and the second part for the complementary β, to show that pβ(ℬwo) is also bounded by a constant times a negative power of p, independently of β. Choosing p large, (5.1) follows.

Now we can finally prove Theorem 4.4:

Proof of Theorem 4.4. We will plug the claims of Proposition 5.1 into our main theorem. First, it is easy to check that the good block events Gso and Gdis satisfy Conditions (1) and (2) of Theorem 2.5. Then (5.1) and (2.15–2.16) imply that either
\[
\rho_\mu(\mathcal G_{\mathrm{dis}}) \ge 1-\epsilon
\quad\text{or}\quad
\rho_\mu(\mathcal G_{\mathrm{so}}) \ge 1-\epsilon
\tag{5.16}
\]
for all shift-ergodic Gibbs states µ ∈ Gβ and all β ∈ (0, ∞). The limits (5.2) and Corollary 2.6 then imply the existence of the transition temperature βt and of the corresponding coexisting states. Since the energy density with negative sign undergoes a jump at βt from values e^{−b/C²} to values e^{−aC²} (which differ by almost one once C ≫ 1), all ergodic states for β > βt must have small energy density while the states for β < βt will have quite a lot of energy. Applying (5.16), all ergodic µ ∈ Gβ for β > βt must be dominated by strongly-ordered bonds, while those for β < βt must be dominated by disordered bonds.

Acknowledgement. The research of M.B. was supported by the NSF grant DMS-0306167 and that of R.K. by the grants GAČR 201/03/0478 and MSM 0021620845. Large parts of this paper were written while both authors visited Microsoft Research in Redmond. The authors would like to thank Senya Shlosman, Aernout van Enter and an anonymous referee for many valuable suggestions on the first version of this paper.
References
1. Alexander, K.S., Chayes, L.: Non-perturbative criteria for Gibbsian uniqueness. Commun. Math. Phys. 189(2), 447–464 (1997)
2. Biskup, M., Chayes, L., Crawford, N.: Mean-field driven first-order phase transitions in systems with long-range interactions. J. Statist. Phys. (to appear)
3. Biskup, M., Chayes, L., Kivelson, S.A.: Order by disorder, without order, in a two-dimensional spin system with O(2)-symmetry. Ann. Henri Poincaré 5(6), 1181–1205 (2004)
4. Biskup, M., Chayes, L., Kotecký, R.: Coexistence of partially disordered/ordered phases in an extended Potts model. J. Statist. Phys. 99(5/6), 1169–1206 (2000)
5. Biskup, M., Chayes, L., Nussinov, Z.: Orbital ordering in transition-metal compounds: I. The 120-degree model. Commun. Math. Phys. 255, 253–292 (2005)
6. Biskup, M., Chayes, L., Nussinov, Z.: Orbital ordering in transition-metal compounds: II. The orbital-compass model. In preparation
7. Borgs, C., Waxler, R.: First order phase transitions in unbounded spin systems. I. Construction of the phase diagram. Commun. Math. Phys. 126, 291–324 (1990)
8. Borgs, C., Waxler, R.: First order phase transitions in unbounded spin systems. II. Completeness of the phase diagram. Commun. Math. Phys. 126, 483–506 (1990)
9. Bricmont, J., Slawny, J.: Phase transitions in systems with a finite number of dominant ground states. J. Statist. Phys. 54(1-2), 89–161 (1989)
10. Chayes, L., Kotecký, R., Shlosman, S.B.: Aggregation and intermediate phases in dilute spin systems. Commun. Math. Phys. 171, 203–232 (1995)
11. Chayes, L., Kotecký, R., Shlosman, S.B.: Staggered phases in diluted systems with continuous spins. Commun. Math. Phys. 189, 631–640 (1997)
12. Chayes, L., Shlosman, S., Zagrebnov, V.: Discontinuity in magnetization in diluted O(n)-Models. J. Statist. Phys. 98, 537–549 (2000)
13. Dinaburg, E.I., Sinai, Ya.G.: An analysis of ANNNI model by Peierls’ contour method. Commun. Math. Phys. 98(1), 119–144 (1985)
14. Dobrushin, R.L., Shlosman, S.B.: Phases corresponding to minima of the local energy. Selecta Math. Soviet. 1(4), 317–338 (1981)
15. Dobrushin, R.L., Zahradník, M.: Phase diagrams for continuous-spin models: an extension of the Pirogov-Sinaĭ theory. In: Dobrushin, R.L. (ed.) Mathematical problems of statistical mechanics and dynamics. Math. Appl. (Soviet Ser.), Vol. 6, Dordrecht: Reidel, 1986, pp. 1–123
16. van Enter, A.C.D., Shlosman, S.B.: First-order transitions for n-vector models in two and more dimensions: Rigorous proof. Phys. Rev. Lett. 89, 285702 (2002)
17. van Enter, A.C.D., Shlosman, S.B.: Provable first-order transitions for nonlinear vector and gauge models with continuous symmetries. Commun. Math. Phys. 255, 21–32 (2005)
18. Fröhlich, J., Israel, R., Lieb, E.H., Simon, B.: Phase transitions and reflection positivity. I. General theory and long range models. Commun. Math. Phys. 62, 1–34 (1978)
19. Fröhlich, J., Israel, R., Lieb, E.H., Simon, B.: Phase transitions and reflection positivity. II. Lattice systems with short range and Coulomb interactions. J. Statist. Phys. 22, 297–347 (1980)
20. Fröhlich, J., Lieb, E.H.: Phase transitions in anisotropic lattice spin systems. Commun. Math. Phys. 60(3), 233–267 (1978)
21. Georgii, H.-O.: Gibbs Measures and Phase Transitions. de Gruyter Studies in Mathematics, Vol. 9, Berlin: Walter de Gruyter & Co., 1988
22. Imbrie, J.Z.: Phase diagrams and cluster expansions for low temperature P(ϕ)2 models. I. The phase diagram. Commun. Math. Phys. 82(2), 261–304 (1981/82)
23. Imbrie, J.Z.: Phase diagrams and cluster expansions for low temperature P(ϕ)2 models. II. The Schwinger functions. Commun. Math. Phys. 82(3), 305–343 (1981/82)
24. Kotecký, R., Laanait, L., Messager, A., Ruiz, J.: The q-state Potts model in the standard Pirogov-Sinaĭ theory: surface tensions and Wilson loops. J. Statist. Phys. 58(1-2), 199–248 (1990)
25. Kotecký, R., Shlosman, S.B.: First-order phase transitions in large entropy lattice models. Commun. Math. Phys. 83(4), 493–515 (1982)
26. Kotecký, R., Shlosman, S.B.: Existence of first-order transitions for Potts models. In: Albeverio, S., Combe, Ph., Sirigue-Collins, M. (eds.), Proc. of the International Workshop — Stochastic Processes in Quantum Theory and Statistical Physics, Lecture Notes in Physics 173, Berlin-Heidelberg-New York: Springer-Verlag, 1982, pp. 248–253
27. Laanait, L., Messager, A., Miracle-Solé, S., Ruiz, J., Shlosman, S.: Interfaces in the Potts model. I. Pirogov-Sinai theory of the Fortuin-Kasteleyn representation. Commun. Math. Phys. 140(1), 81–91 (1991)
28. Martirosian, D.H.: Translation invariant Gibbs states in the q-state Potts model. Commun. Math. Phys. 105(2), 281–290 (1986)
29. Messager, A., Nachtergaele, B.: A model with simultaneous first and second order phase transitions. http://arxiv.org/list/cond-mat/0501229, 2005
30. Pirogov, S.A., Sinai, Ya.G.: Phase diagrams of classical lattice systems (Russian). Theor. Math. Phys. 25(3), 358–369 (1975)
31. Pirogov, S.A., Sinai, Ya.G.: Phase diagrams of classical lattice systems. Continuation (Russian). Theor. Math. Phys. 26(1), 61–76 (1976)
32. Shlosman, S.B.: The method of reflective positivity in the mathematical theory of phase transitions of the first kind (Russian). Uspekhi Mat. Nauk 41(3(249)), 69–111, 240 (1986)
33. Shlosman, S., Zagrebnov, V.: Magnetostriction transition. J. Statist. Phys. 114, 563–574 (2004)
34. Zahradník, M.: An alternate version of Pirogov-Sinai theory. Commun. Math. Phys. 93, 559–581 (1984)
35. Zahradník, M.: Contour methods and Pirogov-Sinai theory for continuous spin lattice models. In: R.A. Minlos, S. Shlosman, Yu.M. Suhov (eds.), On Dobrushin’s way. From probability theory to statistical physics, Amer. Math. Soc. Transl. Ser. 2, Vol. 198, Providence, RI: Amer. Math. Soc., 2000, pp. 197–220

Communicated by M. Aizenman
Commun. Math. Phys. 264, 657–681 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1552-5
Communications in
Mathematical Physics
Spectral Triples of Holonomy Loops
Johannes Aastrup1, Jesper Møller Grimstrup2,3
1 Institut für Analysis, Universität Hannover, Welfengarten 1, 30167 Hannover, Germany. E-mail: [email protected]
2 NORDITA, Blegdamsvej 17, 2100 Copenhagen, Denmark. E-mail: [email protected]
3 Science Institute, University of Iceland, Dunhaga 3, 107 Reykjavik, Iceland
Received: 4 May 2005 / Accepted: 27 October 2005 Published online: 31 March 2006 – © Springer-Verlag 2006
Abstract: The machinery of noncommutative geometry is applied to a space of connections. A noncommutative function algebra of loops closely related to holonomy loops is investigated. The space of connections is identified as a projective limit of Lie-groups composed of copies of the gauge group. A spectral triple over the space of connections is obtained by factoring out the diffeomorphism group. The triple consists of equivalence classes of loops acting on a Hilbert space of sections in an infinite dimensional Clifford bundle. We find that the Dirac operator acting on this Hilbert space does not fully comply with the axioms of a spectral triple.

Contents
1. Introduction
2. The Hoop Group
3. Hoop Group Representations
4. The Space Ā as a Projective Limit
5. Spectral Triples over Gn and the Projective Limit
   5.1 The hilbert space
   5.2 The Euler-Dirac operator
   5.3 The algebra
   5.4 An extended Euler-Dirac operator
6. The Space of Connections
   6.1 Distances on Ā
7. Diffeomorphism Invariance
   7.1 Transformations of states and operators
   7.2 Diffeomorphism invariant states
   7.3 Diffeomorphism invariance via equivalent triples
   7.4 Spectral in the sense of Connes?
8. Discussion & Outlook
658
J. Aastrup, J.M. Grimstrup
A. Clifford Algebras and Dirac Operators . . . . . . . . . . . . . . B. Projective and Inductive Limits . . . . . . . . . . . . . . . . . . B.1 Projective limits . . . . . . . . . . . . . . . . . . . . . . . B.2 Inductive limits . . . . . . . . . . . . . . . . . . . . . . . B.3 Constructing operators on inductive limits of hilbert spaces References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
677 678 678 679 680 681
1. Introduction

The story of noncommutative geometry starts with the idea that instead of studying spaces one studies algebras of functions on the spaces. A concrete result supporting this idea is the Gel'fand-Naimark theorem [1], which states that the world of locally compact Hausdorff spaces is, by taking the corresponding algebra of continuous complex-valued functions vanishing at infinity, the same as the world of commutative $C^*$-algebras. Hence noncommutative $C^*$-algebras can be considered as noncommutative locally compact Hausdorff spaces.

The crucial leap from noncommutative topology to geometry was made by Alain Connes [2]. The key observation is that the Dirac operator on a Riemannian manifold gives full information about the metric. This idea provides the definition of a noncommutative geometry, i.e. a spectral triple, by abstracting a Dirac operator as an operator acting on the same hilbert space as the (non)commutative algebra and satisfying a list of axioms generalizing the interaction rules of smooth functions with the Dirac operator.

Prime examples of noncommutative geometries are given by quotient spaces. A conceptually simple case is the set of two points identified. The classical way of identification would be to consider just one point. The noncommutative quotient is to consider two by two matrices. So we regard the two sub-algebras
\[ \begin{pmatrix} \mathbb{C} & 0 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 \\ 0 & \mathbb{C} \end{pmatrix} \]
as the function algebras over the two points. These algebras are then identified through the partial isometries
\[ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \]
which not only identify the points but also belong to the algebra. Represented on $H = \mathbb{C} \oplus \mathbb{C}$, the algebra of two by two matrices interacts with a Dirac operator given by
\[ D = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}, \qquad a \in \mathbb{R}. \]
This noncommutative geometry, when combined with the commutative algebra of smooth functions on a manifold, is related to the Higgs effect in the Connes-Lott model [3] and to the Higgs effect in Connes' full formulation of the Standard Model [4].
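The two-point example can be checked by hand or with a few lines of code. The following is a minimal sketch in plain Python, assuming the matrices exactly as written above; the helper names (`matmul`, `dagger`, `two_point_commutator`) are illustrative and not taken from the paper. It verifies that the partial isometry relates the units of the two sub-algebras and computes the commutator of $D$ with a diagonal "function" on the two points.

```python
# Pure-Python sketch of the two-point example; all names are illustrative.

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def commutator(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

# Units of the two sub-algebras: "functions" supported on point 1 and point 2.
e1 = [[1, 0], [0, 0]]
e2 = [[0, 0], [0, 1]]

# One of the two partial isometries; the other one is its adjoint.
v = [[0, 1], [0, 0]]

def dirac(a):
    """The operator D of the text for a real parameter a."""
    return [[0, a], [-a, 0]]

def two_point_commutator(a, f1, f2):
    """[D, f] for the diagonal 'function' f = diag(f1, f2)."""
    return commutator(dirac(a), [[f1, 0], [0, f2]])

print(matmul(v, dagger(v)) == e1)          # v v* is the unit at point 1
print(matmul(dagger(v), v) == e2)          # v* v is the unit at point 2
print(two_point_commutator(2.0, 1.0, 4.0))
```

The commutator works out to $a(f_2 - f_1)$ times the off-diagonal matrix with both entries equal to one, so it vanishes exactly when $f$ takes the same value on both points.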
The crucial point is that exactly the noncommutativity of the algebra generates the entire bosonic sector, including the Higgs scalar, through fluctuations around the Dirac operator. The action of the standard model coupled to gravity comes out as [5, 6]
\[ \langle \xi | \tilde{D} | \xi \rangle + \mathrm{Trace}\, \varphi\!\left( \frac{\tilde{D}}{\Lambda} \right), \]
where $\tilde{D}$ is the fluctuated Dirac operator, $\xi$ a hilbert state and $\varphi$ a suitable cutoff function selecting eigenvalues below the cutoff $\Lambda$.

Spectral Triples of Holonomy Loops

Unfortunately, this beautiful unification of the standard model with general relativity is completely classical. No clear notion of quantization exists within the framework of noncommutative geometry. The aim of this paper is to explore new ideas on the unification of noncommutative geometry with the principles of quantum field theory.

Quantum field theory deals with spaces of field configurations. The central object is the path integral
\[ \int \mathcal{D}\Phi \, \exp\left( \frac{i}{\hbar} S[\Phi] \right), \]
where $\Phi$ denotes the field content of the theory described by the (symmetries of the) classical action $S[\Phi]$, and $\mathcal{D}\Phi$ is a formal measure on the space of field configurations. Therefore, rather than dealing with manifolds or algebras of functions on them, quantum field theory lives on the much larger spaces of field configurations.

We now suggest the following: If Connes' formulation of the standard model and quantum field theory are to be linked, and if the principles of noncommutative geometry are fundamental (which we believe they are), then one should apply the machinery of noncommutative geometry to some space of field configurations. Further, since Connes' formulation of the standard model is in principle a gravitational theory (pure geometry), we suggest that the correct implementation of quantum theory must involve quantum gravity.

Thus, we suggest studying a functional space related to general relativity. The aim is to find a suitable configuration space on which a generalized Dirac operator exists. A function algebra on this space may very well be naturally noncommutative (classically). The hope is that the Dirac operator will generate a kind of quantization of the underlying space.

For the space of field configurations we use ideas from loop quantum gravity [7]. Here the space is the space of certain connections modulo gauge equivalence.
The function algebra is generated by traced holonomies of connections along loops, i.e. all physical observables can be expressed by these. This gives a commutative algebra. However, the lesson taught by noncommutative geometry is that the noncommutativity of the algebra provides essential structure. The idea is therefore to keep the noncommutativity by taking holonomies without tracing them; a loop $L$ maps connections into group elements of $G$,
\[ L : \nabla \to \mathrm{Hol}(L, \nabla) \in G, \tag{1} \]
where $\mathrm{Hol}(L, \nabla)$ is the holonomy along $L$, and $G$ is the gauge group which we, for now, assume to be compact. Loop functions like (1) correspond to an underlying space of gauge connections which also includes gauge equivalent connections. This will also resemble Connes' construction of the standard model, since we get an algebra of matrix valued functions over a configuration space just as Connes has matrix valued functions over a manifold.

Furthermore, in loop quantum gravity a fibration of the space-time manifold into global space and time directions is considered. This is done in order to apply a canonical quantization scheme. In the present case the aim is to construct a spectral triple over a functional space of connections. For this purpose such a fibration is not needed and we therefore consider the whole manifold. Thus, the connections considered are space-time connections.
The central achievement of loop quantum gravity is its ability to obtain a separable hilbert space of loop functions via diffeomorphism invariance¹ (see [8] and references therein). It is possible to extend these results to the case of a noncommutative algebra; we represent certain equivalence classes of noncommutative loop operators on a diffeomorphism invariant, separable hilbert space. Further, the Dirac operator we construct on the holonomy algebra is diffeomorphism invariant and hence also descends to the diffeomorphism invariant hilbert space. This is important since the Dirac operator stores the full physical information.

¹ In fact, diffeomorphism invariance alone does not give a separable hilbert space. Instead one has to use a generalized notion of diffeomorphisms, see [8].

Let us finally add a note on noncommutativity and quantum theory. Clearly, the noncommutativity suggested is classical: It is simply related to the non-Abelian structure of the group $G$ and therefore carries no quantum aspect. On the other hand, the Dirac operator which we construct will resemble a global functional derivation. As a Dirac operator it carries spectral information of the underlying space, the space of connections, and will enable integration theory. In this sense, the quantum aspect enters through the constructed Dirac operator.

Outline of the paper. The algebra of (untraced) holonomy transformations, which is a central object in this paper, is introduced as the hoop group $\mathcal{HG}$ in Sect. 2. Since a smooth connection in a $G$-bundle maps a loop $L \in \mathcal{HG}$ into $G$ homomorphically via the holonomy transform, we define in Sect. 3 the space $\bar{\mathcal{A}}$ of generalized connections as the set of homomorphisms
\[ \bar{\mathcal{A}} = \mathrm{Hom}(\mathcal{HG}, G). \]
This is the functional space on which we wish to do geometry. Conversely, since the hoop group acts on $\bar{\mathcal{A}}$ simply by
\[ H_L(\nabla) = \nabla(L), \]
we interpret $\mathcal{HG}$ as a noncommutative function algebra on $\bar{\mathcal{A}}$. The key technical tool for dealing with the space $\bar{\mathcal{A}}$ is described in Sect. 4. Referring to [9] we identify $\bar{\mathcal{A}}$ as a projective limit over the representations of finite subgroups of the hoop group. This enables us to work with only finitely many loops at a time. The space $\bar{\mathcal{A}}$ seen from finitely many loops looks like
\[ G^n = \underbrace{G \times \cdots \times G}_{n \text{ times}}, \]
where $n$ is related to the number of loops. Thus, since $G$ is a Lie group, we are at this level dealing with just an ordinary manifold and we can therefore write down Dirac operators from classical geometry. A concrete realization of this idea is worked out in Sect. 5. Since we are sitting in a projective system we are not entirely free to choose our Dirac operator; it has to fit with different choices of finitely many loops. In fact, problems arise from loops with common line segments. We remedy this defect by technically excluding such combinations of loops. Also, for technical reasons, we choose the classical Euler-Dirac operator instead of the real Dirac operator.
In doing this the link to connections becomes unclear. This is clarified in Sect. 6, where we show that the connections are still contained in the spectrum of the modified algebra. A key issue in the construction presented is the implementation of diffeomorphism invariance, which is the concern of Sect. 7. Using once more ideas from loop quantum gravity we construct diffeomorphism invariant states and a diffeomorphism invariant algebra of loop operators. Finally we are concerned with the question whether the obtained diffeomorphism invariant triple is spectral in the sense of Connes. It turns out not to be the case, since the eigenvalues of the Dirac operator have infinite multiplicity. In particular, this is linked to the kernel of the Euler-Dirac operator on $G$, which has dimension larger than one. Although we are at present unable to solve the problem we suggest some possible solutions. We provide a final discussion and outlook in Sect. 8 and leave some extra material for the appendices.

2. The Hoop Group

The starting point is a manifold $M$. Let us for simplicity assume that $M$ is topologically trivial. On this manifold we consider first the set $\mathcal{P}$ of piecewise analytic paths
\[ \mathcal{P} := \{ P(t) \mid P : [0, 1] \to M \}, \]
where paths which differ only by a reparameterization are identified. If two paths $P_1, P_2 \in \mathcal{P}$ have coinciding end and start points, $P_1(1) = P_2(0)$, we define their product
\[ P_1 \circ P_2 (t) = \begin{cases} P_1(2t), & t \in [0, \tfrac{1}{2}], \\ P_2(2t - 1), & t \in [\tfrac{1}{2}, 1]. \end{cases} \]
In case $P_1(1) \neq P_2(0)$ we set their product to zero. There is a natural involution on $\mathcal{P}$,
\[ P^*(t) = P(1 - t) \quad \forall t, \]
since
\[ (P^*)^* = P, \qquad (P_1 \circ P_2)^* = P_2^* \circ P_1^*. \]
Choose an arbitrary basepoint $x^o \in M$. We call a path which starts and ends at $x^o$ a based loop. Further, by a simple loop we understand a based loop for which
\[ L(t) = x^o \Leftrightarrow t \in \{0, 1\}. \]
The set of based loops is called loop space and is denoted $\mathcal{L}_{x^o}$. An equivalence relation on loop space is generated by identifying loops which differ by a simple retracing along a path,
\[ L_1 \sim L_2 \Leftrightarrow \begin{cases} L_1 = P_1 \circ P_2 \circ P_2^* \circ P_3, \\ L_2 = P_1 \circ P_3, \end{cases} \]
where $L_i \in \mathcal{L}_{x^o}$ and $P_i \in \mathcal{P}$. An equivalence class $[L]$ is called a hoop [10]. The set of hoops is called the hoop group, denoted
\[ \mathcal{HG} = \mathcal{L}_{x^o} / \sim, \]
since the involution on $\mathcal{HG}$ gives an inverse element
\[ [L] \cdot [L]^* = [L_{id}], \]
where $L_{id}$ is the trivial loop
\[ L_{id}(t) = x^o \quad \forall t \in [0, 1]. \]
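The retracing relation can be illustrated with a small combinatorial model. The sketch below is an assumption-laden simplification, not the construction of [10]: a loop is represented as a word in labelled path segments, each segment carries a flag saying whether it is traversed backwards, and two segments are taken to overlap only if they carry the same label. The function names are hypothetical.

```python
# Toy model of hoops: a loop is a list of (segment_name, is_reversed) pairs.
# Assumption: segments overlap only when they carry the same name.

def reduce_retracings(word):
    """Cancel adjacent pairs P, P* (or P*, P) until none remain."""
    stack = []
    for name, is_reversed in word:
        if stack and stack[-1] == (name, not is_reversed):
            stack.pop()              # a segment followed by its retracing drops out
        else:
            stack.append((name, is_reversed))
    return stack

def star(word):
    """The involution: reverse the order and flip every segment."""
    return [(name, not is_reversed) for name, is_reversed in reversed(word)]

P1, P2, P3 = ("P1", False), ("P2", False), ("P3", False)

L1 = [P1, P2, ("P2", True), P3]      # P1 o P2 o P2* o P3
L2 = [P1, P3]                        # P1 o P3
L = [P1, P2, P3]

print(reduce_retracings(L1) == reduce_retracings(L2))  # same hoop
print(reduce_retracings(L + star(L)))                  # L o L* reduces to the trivial loop
```

In this model the reduced word plays the role of the hoop $[L]$, and the empty word plays the role of $[L_{id}]$.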
To ease the notation we will denote a hoop $[L]$ simply by a representative $L$ of the equivalence class. Furthermore, for literary reasons we often call $[L]$ a loop. We emphasize that since $M$ has no metric, any notion of distance between, or length of, loops and hoops is meaningless.

3. Hoop Group Representations

Consider the space of homomorphisms
\[ \bar{\mathcal{A}} = \mathrm{Hom}(\mathcal{HG}, G) \]
from the hoop group into a matrix representation of a compact Lie group $G$ (we denote both the group and its representation by $G$; the group $G$ is assumed to have a metric that is both left and right invariant). That is, for $\nabla \in \bar{\mathcal{A}}$ we have
\[ \nabla(L_1) \cdot \nabla(L_2) = \nabla(L_1 \circ L_2) \quad \forall L_1, L_2 \in \mathcal{HG}. \]
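To make the homomorphism property concrete, here is a minimal sketch in plain Python. It assumes a hoop group that is freely generated by two hoops, hypothetically named `L1` and `L2`, and takes $G = SU(2)$ in its defining representation; the group elements assigned to the generators are arbitrary illustrative values. A generalized connection is then just a choice of one group element per generator, extended to words by multiplication.

```python
import cmath
import math

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    """Conjugate transpose; for unitary matrices this is the inverse."""
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]

def rot_x(t):
    """SU(2) rotation about the x axis."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -1j * s], [-1j * s, c]]

def rot_z(t):
    """SU(2) rotation about the z axis."""
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]

# Hypothetical generalized connection: one group element per generating hoop.
connection = {"L1": rot_x(0.7), "L2": rot_z(1.3)}

def evaluate(conn, word):
    """Value of the connection on a hoop given as [(generator, inverted), ...]."""
    result = IDENTITY
    for name, inverted in word:
        g = conn[name]
        result = matmul(result, dagger(g) if inverted else g)
    return result

w1 = [("L1", False), ("L2", True)]
w2 = [("L2", False), ("L1", False)]

# Homomorphism property: the product of the values equals the value on the product.
print(close(matmul(evaluate(connection, w1), evaluate(connection, w2)),
            evaluate(connection, w1 + w2)))
```

Nothing in this assignment refers to a smooth connection on $M$, which is one way to see why the space of such homomorphisms can be much larger than the space of smooth connections.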
If we denote by $\mathcal{A}$ the space of smooth connections in a bundle with structure group $G$, then a connection $\nabla \in \mathcal{A}$ clearly gives such a homomorphism via
\[ \nabla : L \to \mathrm{Hol}(L, \nabla), \tag{2} \]
where $\mathrm{Hol}(L, \nabla)$ is the holonomy of the connection around the loop $L$. Let us recall that the holonomy is the parallel transport of the connection along a path $P$,
\[ \mathrm{Hol}(P, \nabla) = \mathcal{P} \exp\left( i \int_P \nabla \right), \]
where $\mathcal{P}$ is the path ordering symbol. The parallel transport along a closed loop is a nonlocal, gauge covariant object and its trace, the Wilson loop, is gauge invariant. From (2) we conclude that
\[ \mathcal{A} \subset \bar{\mathcal{A}}. \]
It is however important to realize that $\bar{\mathcal{A}}$ is much larger² [9].

² In fact, $\mathcal{A}$ has, with respect to the Ashtekar-Lewandowski measure, zero measure in $\bar{\mathcal{A}}$ (modulo gauge transformations).

The space $\bar{\mathcal{A}}$ is the general space of field configurations on which we wish to obtain a geometrical structure. We therefore consider an algebra of functions over $\bar{\mathcal{A}}$. To this end we first notice that a hoop $L \in \mathcal{HG}$ gives rise to a function $H_L$ on $\bar{\mathcal{A}}$ into $G$ via
\[ H_L(\nabla) = \nabla(L), \tag{3} \]
where $\nabla \in \bar{\mathcal{A}}$. Notice that
\[ H_{L_1} \cdot H_{L_2} = H_{L_1 \circ L_2}, \qquad (H_L)^* = H_{L^{-1}}, \qquad H_{L_{id}} = 1, \]
where $L_i \in \mathcal{HG}$. The set of complex linear combinations of all functions $H_L$ is a $\star$-algebra. The norm of a general linear combination $a_1 H_{L_1} + \cdots + a_n H_{L_n}$ is defined by
\[ \| a_1 H_{L_1} + \cdots + a_n H_{L_n} \| = \sup_{\nabla \in \bar{\mathcal{A}}} \| a_1 H_{L_1}(\nabla) + \cdots + a_n H_{L_n}(\nabla) \|, \]
where $\| \cdot \|$ on the right-hand side is the matrix norm. Notice that $\| H_L \| = 1$ for all $L \in \mathcal{HG}$ if the group $G$ is orthogonal or unitary. The closure in this norm of the algebra generated by the functions $H_L$ is a $C^*$-algebra. Let us denote it $C^*(\mathcal{L}_{x^o})$. For now, this is the noncommutative function algebra over $\bar{\mathcal{A}}$ which we wish to embed in a spectral triple. However, as we will explain in the next section, we need to change the algebra slightly to be able to construct a Dirac operator.

4. The Space $\bar{\mathcal{A}}$ as a Projective Limit

The space $\bar{\mathcal{A}}$ was analyzed in [9] in a somewhat different context³. There, the authors identify $\bar{\mathcal{A}}$ with a projective limit (see Appendix B for details on projective and inductive limits)
\[ \lim_{\longleftarrow} \mathrm{Hom}(F, G), \qquad F \in \mathcal{F}, \]
where $\mathcal{F}$ is the set of all strongly independent, finitely generated subgroups of $\mathcal{HG}$ (strongly independent in the sense of [10]). Let $L_1, \ldots, L_{n(F)}$ be the strongly independent generators of $F \in \mathcal{F}$. We then identify [10]
\[ \mathrm{Hom}(F, G) \simeq G^{n(F)}, \]
since we just map $\varphi \in \mathrm{Hom}(F, G)$ into
\[ (\varphi(L_1), \ldots, \varphi(L_{n(F)})) \in G^{n(F)}. \tag{4} \]
This identification is of great advantage: Since $G^{n(F)}$ is a Lie group it is now straightforward to construct a spectral triple by choosing a metric on $G$ and then using the Euler-Dirac⁴ or the Dirac operator (since $G^{n(F)}$ is a Lie group it is parallelizable and hence possesses a spin structure). Once a geometrical construction on $G^{n(F)}$ is obtained we extend this to all of $\bar{\mathcal{A}}$ by taking the projective limit of the algebra and the inductive limit of the relevant hilbert space.

³ The authors of [9] considered the smaller space $\bar{\mathcal{A}}/\mathrm{Ad}$ of smooth connections modulo local gauge transformations. This otherwise important difference is not essential for the issues regarding the projective limit.
⁴ See Appendix A.

Thus, it is tempting to consider the hilbert space
\[ L^2(G^{n(F)}, M_k(\mathbb{C}) \otimes S), \]
where $S$ is the Clifford algebra or the spin bundle corresponding to either the Euler-Dirac or the Dirac operator, and $L^2$ is taken with respect to the Haar measure on $G^{n(F)}$. The problem with this construction is that the Euler-Dirac and the Dirac operators contain all the metric information on the underlying space [2], whereas the structure maps defining the projective limit are not metric. The problems can be traced back to the definition of the generating hoops. Here, following [10], we encounter overlapping hoops which lead to structure maps
\[ P_{F_1, F_2} : \mathrm{Hom}(F_2, G) \to \mathrm{Hom}(F_1, G) \]
of the form⁵
\[ (g_1, g_2, g_3) \to g_1 g_2, \tag{5} \]
where $F_1 \subset F_2$ lie in $\mathcal{F}$. The problem is that such maps do not have a canonical isometric cross-section. The solution to this problem is to redefine our notion of generating hoops. This, in turn, will affect the projective limit. Let us go into detail in the next section.

⁵ Here $\mathrm{Hom}(F_1, G)$ and $\mathrm{Hom}(F_2, G)$ are, as an example, identified with $G^1$ and $G^3$, respectively. A similar structure map with a $G^2$-subgroup is not possible due to the special construction of independent hoops.

Before we do that we end this section by mentioning that the identification (4) indirectly chooses an orientation of the hoop. Basically, there are two possible identifications corresponding to either $\varphi(L)$ or $\varphi(L^{-1})$. Therefore, we can identify $\mathrm{Hom}(F, G)$, where $F$ is a subgroup generated by a single hoop, with both $G$ and $G^{-1}$.

5. Spectral Triples over $G^n$ and the Projective Limit

Let $\mathcal{F}_I$ be the set of finitely generated subgroups of $\mathcal{HG}$ with the property that they are generated by simple, non-selfintersecting loops that do not have overlapping segments or points⁶. The inclusion of groups $F_1 \subset F_2$ gives an inductive system on $\mathcal{F}_I$ and therefore a projective structure on $\{\mathrm{Hom}(F, G)\}_{F \in \mathcal{F}_I}$. Again, we can identify $\mathrm{Hom}(F, G)$ with $G^{n(F)}$, where $n(F)$, as before, is the number of simple loops in a generating set of $F$. Since we are looking at subgroups with the property that no two loops have overlapping segments, the maps
\[ P_{F_1, F_2} : G^{n(F_2)} \to G^{n(F_1)} \]
induced by the inclusion $F_1 \subset F_2$ are just given by deleting some coordinates or inverting some coordinates. This eliminates structure maps of the form (5) and thus enables the following construction of a spectral triple.

⁶ In contrast to [10] we no longer require loops to be piecewise analytic. Nor does the manifold need a real analytic structure.

5.1. The hilbert space. We first construct the hilbert space. We choose a left and right invariant metric on $G$. We therefore also have a metric on $G^{n(F)}$ and hence we can construct the Clifford bundle $Cl(TG^{n(F)})$. Due to the invariance of the metric we get the following result.

Proposition 5.1.1. There is an embedding of hilbert spaces
\[ P^*_{F_1, F_2} : L^2(G^{n(F_1)}, Cl(TG^{n(F_1)})) \to L^2(G^{n(F_2)}, Cl(TG^{n(F_2)})), \]
where the measure on $G^{n(F_i)}$ is the Haar measure.
Proof. We will need some notation. Let $e_1, \ldots, e_n$ be an orthonormal basis in $T_{id}G$, the tangent space over the identity in $G$. Due to the invariance property of the metric we get that $D_g(id)(e_1), \ldots, D_g(id)(e_n)$ is an orthonormal basis in $T_g G$. Here $D_g(id)$ denotes the differential of the map
\[ m_g : G \to G, \qquad m_g(g_1) = g g_1 \]
at the identity. We will also use the notation $e_1, \ldots, e_n$ to denote the corresponding global vector fields in $TG$, i.e. $e_k(g) = D_g(id)(e_k)$. We will abbreviate $n(F_i)$ by $n_i$.

We first consider the case where the projection $P_{F_1, F_2}$ is of the form
\[ P_{F_1, F_2}(g_1, \ldots, g_{n_2}) = (g_1, \ldots, g_{n_1}), \tag{6} \]
and denote by
\[ e^1_1, \ldots, e^1_n,\ e^2_1, \ldots, e^2_n,\ \ldots,\ e^{n_i}_1, \ldots, e^{n_i}_n \]
the global vector fields on $G^{n_i}$, where $e^k_1, \ldots, e^k_n$ denote the global vector fields $e_1, \ldots, e_n$ in the $k$th component of $TG^{n_i}$. Put $\bar{g}_{n_i} = (g_1, \ldots, g_{n_i})$. We clearly have that
\[ \langle e^k_l(\bar{g}_{n_1}), e^{k'}_{l'}(\bar{g}_{n_1}) \rangle_{T_{\bar{g}_{n_1}} G^{n_1}} = \langle e^k_l(\bar{g}_{n_2}), e^{k'}_{l'}(\bar{g}_{n_2}) \rangle_{T_{\bar{g}_{n_2}} G^{n_2}}, \]
where $k, k' \leq n_1$. An element in $L^2(G^{n_1}, Cl(TG^{n_1}))$ is a linear combination of elements of the form $f e$, where $e$ is a product of elements in $e^1_1, \ldots, e^1_n, e^2_1, \ldots, e^2_n, \ldots, e^{n_1}_1, \ldots, e^{n_1}_n$, and $f \in L^2(G^{n_1})$. We define
\[ P^*_{F_1, F_2}(f e) = \tilde{f} e, \]
where $\tilde{f}(\bar{g}_{n_2}) \equiv f(P_{F_1, F_2}(\bar{g}_{n_2})) = f(\bar{g}_{n_1})$. This map preserves the inner product since
\begin{align*}
\langle f e, f' e' \rangle_{L^2(G^{n_1}, Cl(TG^{n_1}))}
&= \int \bar{f}(\bar{g}_{n_1}) f'(\bar{g}_{n_1}) \langle e, e' \rangle_{T_{\bar{g}_{n_1}} G^{n_1}} \, d\mu_H(g_1) \cdots d\mu_H(g_{n_1}) \\
&= \int \bar{f}(\bar{g}_{n_1}) f'(\bar{g}_{n_1}) \langle e, e' \rangle_{T_{\bar{g}_{n_2}} G^{n_2}} \, d\mu_H(g_1) \cdots d\mu_H(g_{n_2}) \\
&= \langle \tilde{f} e, \tilde{f}' e' \rangle_{L^2(G^{n_2}, Cl(TG^{n_2}))},
\end{align*}
where we have used that $\int 1 \, d\mu_H = 1$.
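The role of the normalization $\int 1 \, d\mu_H = 1$ can be checked numerically in the simplest case. The sketch below assumes $G = U(1)$, parameterized by an angle, replaces the Haar integral by an average over an equally spaced grid, and looks only at the scalar factor $f$ (the Clifford factor is ignored). The function names and the sample function are illustrative choices, not taken from the paper.

```python
import cmath
import math

def haar_grid(n_points):
    """Equally spaced angles; averaging over them approximates the Haar integral."""
    return [2 * math.pi * j / n_points for j in range(n_points)]

def norm_sq_g1(f, n_points=64):
    """Squared L2 norm of f on G = U(1) with total mass one."""
    return sum(abs(f(t)) ** 2 for t in haar_grid(n_points)) / n_points

def norm_sq_g2(f2, n_points=64):
    """Squared L2 norm of f2 on G x G with total mass one."""
    grid = haar_grid(n_points)
    return sum(abs(f2(t1, t2)) ** 2 for t1 in grid for t2 in grid) / n_points ** 2

def pullback(f):
    """Pull f back along the projection (g1, g2) -> g1, as in Eq. (6)."""
    return lambda t1, t2: f(t1)

def sample_function(theta):
    """An arbitrary trigonometric polynomial on U(1)."""
    return cmath.exp(1j * theta) + 0.5 * cmath.exp(2j * theta)

print(norm_sq_g1(sample_function))
print(norm_sq_g2(pullback(sample_function)))
```

With an unnormalized measure of total mass $2\pi$ the second norm would pick up an extra factor of $2\pi$ for every deleted coordinate, and the pullback would no longer be an isometry.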
To finish the construction we only need to consider a map of the form
\[ P_{F_1, F_2}(g) = g^{-1}, \tag{7} \]
since any structure map is the composition of maps of the type (6) and (7). However, the map
\[ P^*_{F_1, F_2} : L^2(G, Cl(TG)) \to L^2(G, Cl(TG)), \]
defined by (with the notation from before)
\[ P^*_{F_1, F_2}(f e)(g) = f(g^{-1}) \, DP^{-1}_{F_1, F_2}(e) \]
is, due to the left and right invariance of the metric, a map of hilbert spaces. This completes the proof.

We can now construct the direct limit of these hilbert spaces (see Appendix B for a more detailed discussion on inductive limits). This is done in the following way: First define
\[ \mathcal{H}_{alg} = \bigoplus_{F \in \mathcal{F}_I} L^2(G^{n(F)}, Cl(TG^{n(F)})) / N, \]
where $N$ is the subspace generated by elements of the form
\[ (\ldots, v, \ldots, -P^*_{F_1, F_2}(v), \ldots). \]
In other words, we identify the vectors $v$ and $P^*_{F_1, F_2}(v)$. The problem is now to define an inner product on $\mathcal{H}_{alg}$. Decompose $L^2(G, Cl(TG))$ into the subspace generated by the function $1$ and the orthogonal complement. We will write this as
\[ L^2(G, Cl(TG)) = \mathcal{H}_1 \oplus \mathcal{H}_2, \]
where $\mathcal{H}_1 = \mathbb{C}$. A vector $v \in L^2(G^{n(F)}, Cl(TG^{n(F)}))$ can be uniquely decomposed into vectors of the form
\[ v_1 \otimes \cdots \otimes v_{n(F)}, \]
where each $v_i$ belongs either to $\mathcal{H}_1$ or to $\mathcal{H}_2$. It is therefore enough to define the inner product of vectors of this type. Further, let $v_1 \in L^2(G^{n(F_1)}, Cl(TG^{n(F_1)}))$ and $v_2 \in L^2(G^{n(F_2)}, Cl(TG^{n(F_2)}))$ be vectors of this form. We will assume that in the tensor decomposition of $v_1$ and $v_2$ only elements from $\mathcal{H}_2$ appear. We can assume this since otherwise $v_1$ and/or $v_2$ will be the image under one of the $P^*$'s, and we can simply pull the vector back. We finally define the inner product by
\[ \langle v_1, v_2 \rangle = \langle P^*_{F_1, F_3}(v_1), P^*_{F_2, F_3}(v_2) \rangle_{L^2(G^{n(F_3)}, Cl(TG^{n(F_3)}))} \tag{8} \]
if there exists an $F_3$ with $F_1, F_2 \subset F_3$, and zero otherwise. The completion of $\mathcal{H}_{alg}$ with respect to this inner product is the inductive limit and will be denoted by $\mathcal{H}'_{si}$ (si ∼ segment independent). In Eq. (8), since $v_1$ and $v_2$ are, by definition, decomposed into tensor powers in $\mathcal{H}_2$, the inner product will be different from zero only when $F_1 = F_2$.
We can give a more concrete description of $\mathcal{H}'_{si}$ in terms of the hilbert space $\mathcal{H}_2$, namely
\[ \mathcal{H}'_{si} = \mathbb{C} \oplus (\oplus_{l^1} \mathcal{H}_2) \oplus (\oplus_{l^2} \mathcal{H}_2 \otimes \mathcal{H}_2) \oplus \ldots, \tag{9} \]
where $l^k$ is the set of all products of $k$ nonintersecting simple loops, and where $\oplus$ means orthogonal sum. The first $\mathbb{C}$ corresponds to the trivial loop. For each simple loop we get a copy of $L^2(G, Cl(TG))$; however the constant functions are identified in the inductive limit, and we hence only get a copy of $\mathcal{H}_2$ for each simple loop. This picture continues for products of two simple loops and so on.

5.2. The Euler-Dirac operator. On each of the hilbert spaces $L^2(G^{n(F)}, Cl(TG^{n(F)}))$ we have a canonical Euler-Dirac operator
\[ D(\xi) = \sum_i e_i \cdot \nabla_{e_i}(\xi), \tag{10} \]
where $\{e_i\}$ are global, orthonormal sections in the tangent bundle of $G^{n(F)}$ and $\nabla$ is the Levi-Civita connection. It is clear that this Euler-Dirac operator commutes with the structure maps $P^*_{F_1, F_2}$ not involving inversions. According to [14] $D$ can, under the identification of $Cl(TM)$ with $\wedge^*(T^*M)$ (differential forms), be identified with $d + d^*$. The exterior derivative $d$ is invariant under all diffeomorphisms, and since $d^*$ only additionally depends on the metric and the metric on $G$ is invariant under inversions, the Euler-Dirac operator also commutes with structure maps involving inversions. Therefore we get an Euler-Dirac operator $D$ on $\mathcal{H}'_{si}$.

The reason why we choose the Euler-Dirac operator instead of the classical Dirac operator is that the former has better functorial properties. In particular, it is invariant under inversions of loops. If we consider for example the Abelian case, $G = S^1$, and parameterize $S^1$ by $\theta \in [0, 2\pi]$, then the Dirac operator reads
\[ D = i \frac{\partial}{\partial \theta}, \tag{11} \]
which, under inversion of the underlying loop
\[ G \to G^{-1} \tag{12} \]
picks up a minus sign. On the other hand, we have just argued that the Euler-Dirac operator is invariant under inversions. It is of course desirable to work out a construction that works for the classical Dirac operator, but for now we choose to work with the easier Euler-Dirac operator.

The particular choice of "Dirac" operator in (11) is motivated by its resemblance to an (integrated) functional derivation. Heuristically: A (smooth) connection is determined by holonomies along hoops. In the projective system described here we consider first a finite number of hoops, and a connection is thus described in a 'coarse-grained' way by assigning group elements to each of the finitely many elementary hoops. The Euler-Dirac operator (10) takes the derivative on each of these copies of the group $G$ and throws it into the Clifford bundle. In this way the Dirac operator resembles a functional derivation operator.
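The different behaviour of the two operators under inversion can be illustrated on $G = S^1$ with Fourier modes. The sketch below is an illustration under stated assumptions rather than the construction of the paper: a function is stored as a dictionary mapping the mode number $k$ to the coefficient of $e^{ik\theta}$, a form $f + g \, d\theta$ is stored as the pair $(f, g)$, the flat metric is used, and the sign convention $d^*(g \, d\theta) = -g'$ is assumed. All function names are hypothetical.

```python
def derivative(f):
    """d/dtheta on Fourier coefficients: c_k -> i k c_k."""
    return {k: 1j * k * c for k, c in f.items()}

def scale(f, s):
    """Multiply every coefficient by the scalar s."""
    return {k: s * c for k, c in f.items()}

def reflect(f):
    """Composition with the inversion theta -> -theta: c_k -> c_{-k}."""
    return {-k: c for k, c in f.items()}

def same(f, g, tol=1e-12):
    """Compare two coefficient dictionaries up to rounding."""
    return all(abs(f.get(k, 0) - g.get(k, 0)) < tol for k in set(f) | set(g))

def dirac(f):
    """The operator D = i d/dtheta of Eq. (11)."""
    return scale(derivative(f), 1j)

def euler_dirac(form):
    """d + d* acting on f + g dtheta, with d f = f' dtheta and d*(g dtheta) = -g'."""
    f, g = form
    return (scale(derivative(g), -1), derivative(f))

def pull_back_inversion(form):
    """Pull back f + g dtheta along theta -> -theta (dtheta changes sign)."""
    f, g = form
    return (reflect(f), scale(reflect(g), -1))

def same_form(a, b):
    return same(a[0], b[0]) and same(a[1], b[1])

f_example = {1: 2.0, -3: 1j}
g_example = {2: 0.5, 0: 1.0}

# The Dirac operator anticommutes with the inversion ...
print(same(dirac(reflect(f_example)), scale(reflect(dirac(f_example)), -1)))
# ... whereas d + d* commutes with it.
print(same_form(euler_dirac(pull_back_inversion((f_example, g_example))),
                pull_back_inversion(euler_dirac((f_example, g_example)))))
```

The sign change of $d\theta$ under the pullback compensates the sign picked up by the derivative, which is the mechanism behind the invariance of $d + d^*$ in this one-dimensional case.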
We interpret this Euler-Dirac operator as intrinsically 'quantum' since it bears some resemblance to a canonical conjugate of the connection. Heuristically, we write
\[ D \sim \frac{\delta}{\delta \nabla} \tag{13} \]
and
\[ H_L \sim 1 + \nabla \tag{14} \]
due to $H_L$'s relation to the holonomy map. Here $\nabla$ is a connection. From (13) and (14) the non-vanishing commutator
\[ [D, H_L] \neq 0 \]
obtains, on a very heuristic level, a resemblance to a commutation relation of canonical conjugate variables. Thus, it is not the noncommutativity of the algebra of holonomy loops (to be defined rigorously below) which is 'quantum' but rather the Dirac operator and its interaction with the algebra. This is an essential point for the interpretation of the geometrical construction presented.

5.3. The algebra. We will construct our algebra as an algebra of operators on $\mathcal{H}'_{si}$, or rather on a variant hereof, denoted $\mathcal{H}_{si}$. This algebra will be similar, but not equal, to the group algebra $C^*(\mathcal{L}_{x^o})$ of hoops. The hilbert space $\mathcal{H}_{si}$ is constructed in the same way as $\mathcal{H}'_{si}$, but where
\[ L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C})) \]
is used instead of $L^2(G^{n(F)}, Cl(TG^{n(F)}))$. Here $n$ is the size of the representation of $G$. The reason for the additional matrix factor is that we wish to represent the holonomy loops by left matrix multiplication. The decomposition analogous to (9) looks like
\[ \mathcal{H}_{si} = M_n(\mathbb{C}) \oplus (\oplus_{l^1} \mathcal{H}_2 \otimes M_n(\mathbb{C})) \oplus (\oplus_{l^2} \mathcal{H}_2 \otimes \mathcal{H}_2 \otimes M_n(\mathbb{C})) \oplus \ldots. \]
If we are given a simple hoop $L$, we construct an operator $H_L$ on $\mathcal{H}_{si}$ in the following way: For a subgroup $F \in \mathcal{F}_I$ we make use of the identification (4) of $G^{n(F)}$ with $\mathrm{Hom}(F, G)$ and hence define
\[ H_L(s)(\varphi) = (id \otimes \varphi(L))(s(\varphi)), \]
where $s \in L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$ and where $\varphi(L) = id$ when $L \notin F$. Since $H_L$ respects the maps $P^*_{F_1, F_2}$ we get an operator $H_L$ on $\mathcal{H}_{si}$. For a general hoop $L$, using the unique decomposition of $L$ into simple hoops $L_1 \circ \ldots \circ L_n$, define
\[ H_L = H_{L_1} \circ \cdots \circ H_{L_n}. \]
Our algebra, which we denote $\mathcal{A}$, is the $C^*$-algebra generated by the operators $H_L$, $L \in \mathcal{HG}$. It is important to realize that the algebra $\mathcal{A}$ is not identical to the $C^*$-algebra $C^*(\mathcal{L}_{x^o})$ introduced in Sect. 3. That is, we have not obtained a representation of the group algebra of hoops on $M$. To illustrate this consider the following two situations:
1. Loops with common line segment. We consider for example two loops $L_1$ and $L_2$ where
\[ L_1 = P_1 \circ P_2, \qquad L_2 = P_2^* \circ P_3, \]
with $P_i \in \mathcal{P}$. Hence $L_3 \equiv L_1 \circ L_2 = P_1 \circ P_3$.
2. Intersecting loops. Consider two loops $L_4$ and $L_5$ where $L_4(t_1) = L_5(t_2) \neq x^o$.

In the first case $L_1$, $L_2$ and $L_3$ cannot belong to the same subgroup $F \in \mathcal{F}_I$ since they all have common line segments. Thus, their associated operators $H_{L_i}$ act on different parts of the hilbert space. This means that they commute,
\[ H_{L_1} \cdot H_{L_2} = H_{L_2} \cdot H_{L_1}. \]
In particular, it means that
\[ H_{L_1} \cdot H_{L_2} \neq H_{L_3}. \]
In the second case, the product $L_4 \circ L_5$ does not even belong to any subgroup $F \in \mathcal{F}_I$. Thus, the operator $H_{L_4 \circ L_5}$ only exists as the composition $H_{L_4} \cdot H_{L_5}$.

5.4. An extended Euler-Dirac operator. The Euler-Dirac operator defined in Eq. (10) acts, basically, on the hilbert space $\mathcal{H}'_{si}$. When acting on $\mathcal{H}_{si}$ it does not 'see' the matrix part of the hilbert space. This need not be so. We can for example define an extended Euler-Dirac operator by
\[ D_{ext}(\xi \otimes m)(g) = D(\xi(g)) \otimes m + \xi(g) \otimes m_n(g) \cdot m, \tag{15} \]
where $m_n(g)$ is a matrix valued function on $G^{n(F)}$ and $\xi \otimes m \in \mathcal{H}_{si}$. The form of the operator in Eq. (15) is similar to the Dirac operators of the almost commutative geometries (including the standard model). See for example [4].

6. The Space of Connections

So far, we have considered a geometrical structure over spaces related to certain loop group homomorphisms. We now want to describe in more detail the role of connections in this construction. In the above we constructed the hilbert space
\[ \mathcal{H}_{si} = \lim_{\longrightarrow} L^2(G^m, Cl(TG^m) \otimes M_n(\mathbb{C})). \]
Let us for simplicity now consider the same hilbert space but without the spin structure and the matrix factor:
\[ \mathcal{H} = \lim_{\longrightarrow} L^2(G^n). \]
Hence, $\mathcal{H}_{si}$ is $\mathcal{H}$ with coefficients in an infinite dimensional Clifford algebra tensored with $n$ by $n$ matrices. If we backtrack our line of reasoning we first make the identification
\[ \mathcal{H} = L^2\left( \lim_{\longleftarrow} \mathrm{Hom}(F, G) \right), \]
where $F \in \mathcal{F}_I$. Let $\nabla$ be a fixed, smooth connection in $\mathcal{A}$. As already mentioned, for a given $F$, $\nabla$ gives rise to a homomorphism into $G$ via the holonomy loop
\[ \nabla : L \to \mathrm{Hol}(L, \nabla) \in G, \]
where $L \in F$. It is easy to see that this commutes with the structure maps and hence that we get a map
\[ \mathcal{A} \to \lim_{\longleftarrow} \mathrm{Hom}(F, G). \]
Clearly, this map is injective. We therefore conclude that $\mathcal{H}$ is a hilbert space over a space which contains all smooth connections.

6.1. Distances on $\bar{\mathcal{A}}$. On a Riemannian spin-geometry the Dirac operator $D$ contains the geometrical information of the manifold $M$. In particular, distances can be formulated in a purely algebraic fashion due to Connes [2]. Given two points $x, y \in M$ their distance is given by
\[ d(x, y) = \sup_{f \in C^\infty(M)} \{ |f(x) - f(y)| \, : \, \| [D, f] \| \leq 1 \}. \tag{16} \]
On a noncommutative geometry the state space replaces the notion of points. It is possible to extend the notion of distance to the state space by generalizing (16) in an obvious manner. For the present case, however, it is quite unclear in what sense a Dirac operator incorporates a distance. Further, the usefulness of such a notion is in the present situation not obvious. Clearly, if the Dirac operator (10) is interpreted as a metric it will give rise to ¯ distances on the space A. For example, it is not difficult to see that for the G = U (1) case the distance between two smooth connections will be infinite. This can be seen by first noting that the distance induced by the Dirac operator on U (1)n is just the sum of distances on each copy of U (1). This product distance of two smooth connections will differ on infinitely many non-intersecting loops. Further, summing these differences will give an infinite distance between the points. Perhaps this is not so surprising considering the fact that our geometry is infinite dimensional. 7. Diffeomorphism Invariance Clearly, the construction considered so far is very large. In fact, the hilbert space Hsi is not separable and it is unclear how to extract physical quantities in a well-defined manner. What is missing is of course the implementation of diffeomorphism invariance relative to the underlying manifold M. Invariance under arbitrary coordinate transformations is the defining symmetry of general relativity and it is therefore an essential ingredient in the formalism. It turns out that the ‘size’ of the construction can indeed be
Spectral Triples of Holonomy Loops
671
drastically reduced by taking diffeomorphism invariance into account$^7$. First we write down transformation laws of hilbert states and operators. Next, we define diffeomorphism invariant states via a formal sum over states connected via diffeomorphisms. We are able to represent loop operators on such ‘smeared’ states, albeit not as a representation of the operator algebra $A$. In a subsequent subsection we investigate an alternative approach where we introduce an equivalence of spectral triples to cut down the size of the hilbert space, the algebra and the corresponding Euler-Dirac operator simultaneously. We find that the two approaches are in fact equivalent. Finally we look at the spectrum of the relevant Euler-Dirac operators and show that it is not fully a Dirac operator in the sense of Connes. We assume that the space-time dimension of the manifold $M$ is larger than three. Since there is no knot theory outside three dimensions, we thereby avoid considering different “knot states,” etc.

7.1. Transformations of states and operators. We first consider states in $L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$ which are polynomial in $g_1, \ldots, g_{n(F)}$ tensored with constant elements in $Cl(TG^{n(F)})$. A diffeomorphism $d \in \mathrm{Diff}(M)$ which maps $d : L_i \to L'_i$ has a natural action on such polynomials,
$$ d : p(g_1, \ldots, g_{n(F)}) \to p(g'_1, \ldots, g'_{n(F)}), \tag{17} $$
where $g'_i \in G'_i$, and $G'_i$ is the copy of the group corresponding to the new loop $L'_i$. Because we interpret states in $H_{si}$ as (polynomials in) holonomy loops we can really only state how polynomials and their closure should transform under diffeomorphisms. However, we can simply extend the transformation law (17) to all of $L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$,
$$ d : \xi(g_1, \ldots, g_{n(F)}) \to \xi(g'_1, \ldots, g'_{n(F)}), $$
and via the inductive limit to all of $H_{si}$. The action of the diffeomorphism group on the algebra $A$ is straightforward, simply taken from (17). Above and in the following we only consider diffeomorphisms in $\mathrm{Diff}(M)$ which preserve the basepoint $x_0$.

7.2. Diffeomorphism invariant states. From one point of view we need to solve the diffeomorphism constraint
$$ d\xi = \xi, \qquad \forall\, d \in \mathrm{Diff}(M);\ \xi \in H_{si}. \tag{18} $$
Let us start by investigating this. The following is inspired by [7, 12]. Equation (18) has, of course, the formal solution
$$ \tilde{\xi} = \sum_{d \in \mathrm{Diff}(M)} d(\xi). \tag{19} $$

$^7$ The construction in this section works both for diffeomorphisms and for extended diffeomorphisms, and we will therefore not distinguish between them notationally. But only the latter case gives a separable hilbert space, and hence the verification of (or lack of) the axioms of a spectral triple only makes sense for the extended diffeomorphisms.
J. Aastrup, J.M. Grimstrup
This, however, makes no sense in $H_{si}$. Instead we need to consider the dual of $H_{si}$. So, given a vector $\eta \in H_{si}$ we let the formal sum (19) act on $\eta$ like
$$ \tilde{\xi}(\eta) = \sum_{d \in \mathrm{Diff}(M)} \langle d(\xi) | \eta \rangle. \tag{20} $$
Strictly speaking this does not make sense either, since the sum on the right-hand side need not be convergent. If, however, we define the action of $\tilde{\xi}$ only on the algebraic part of $H_{si}$, i.e. only on finite sums of elements in the sum (9), the sum (20) becomes finite if the summation over $\mathrm{Diff}(M)$ is understood correctly. We will now describe how this works.

First we define the projection onto symmetrized states. Given a state $\xi \in H_2^{\otimes n(F)} \otimes M_n$ we denote by $\mathrm{Diff}(M|F)$ the diffeomorphisms which preserve the form as well as the orientation of all loops in $F$. Consider next diffeomorphisms $F \to F$ which do not lie in $\mathrm{Diff}(M|F)$. We denote these by $\mathrm{Diff}(F \to F)$. The symmetry group of $F$, denoted $SG_F$, is the quotient
$$ SG_F = \mathrm{Diff}(F \to F)/\mathrm{Diff}(M|F). \tag{21} $$
It consists of certain permutations and inversions. The projection is defined by
$$ P(\xi) = \frac{1}{N_F} \sum_{d \in SG_F} d(\xi), \tag{22} $$
where $N_F$ is the number of elements in $SG_F$. Next, consider the remaining diffeomorphisms which move the loops in $F$ outside $F$. We define the sum (20) by
$$ \tilde{\xi}(\eta) = \sum_{d \in \mathrm{Diff}(M)/\mathrm{Diff}(F \to F)} \langle d(P\xi) | \eta \rangle, \tag{23} $$
where the sum is interpreted as an effective sum, i.e. if $d_1(F) = d_2(F)$ we identify $d_1$ and $d_2$. If $\eta \in H_2^{\otimes n(F)} \otimes M_n$ we find $N_F$ contributions on the right-hand side of (23). Otherwise it is zero. The vector space of linear combinations of sums (23) is given the inner product
$$ \langle \tilde{\xi}_1 | \tilde{\xi}_2 \rangle = \tilde{\xi}_1(\xi_2). \tag{24} $$
The crucial point is that this sum has finitely many non-vanishing terms (see above). The completion of this vector space in the norm (24) is a diffeomorphism invariant hilbert space which we denote by $H_{\mathrm{diff}}$. The problem with this construction is that it is somewhat unclear how the algebra of hoops should be represented on $H_{\mathrm{diff}}$. Since our goal is to find a spectral triple involving not only a separable hilbert space but also a (separable) algebra and a well-defined Dirac operator, this is clearly a crucial point. The difficulty stems from the fact that the algebra is not diffeomorphism invariant but rather covariant. The Dirac operator, on the other hand, is diffeomorphism invariant and therefore causes no problems. Essentially, we need to make sense of a ‘smearing’ of algebra elements according to
$$ \tilde{H}_L = \sum_{d \in \mathrm{Diff}(M)} H_{d(L)}, \tag{25} $$
similar to Eq. (19). As it stands, Eq. (25) is meaningless. Instead we do the following: Given a hoop operator $H_L \in A$, define the symmetrized operator by
$$ P_F(H_L) = \frac{1}{N_F} \sum_{d \in SG_F} H_{d(L)}, \tag{26} $$
where $SG_F$ is the symmetry group of a subgroup $F$ containing $L$, and $N_F$ is again the total number of elements in $SG_F$. For example, if $L$ is simple and $F$ is the algebra generated by $L$, we have
$$ P_F(H_L) = \frac{1}{2}\left(L + L^{-1}\right). $$
For a ‘smeared’ state $\tilde{\xi} \in H_{\mathrm{diff}}$ we define the action of $H_L$ on $\tilde{\xi}$ by
$$ H_L(\tilde{\xi}) = \sum_{d \in \mathrm{Diff}(M)/\mathrm{Diff}(F \to F)} d\big(P_F(H_L) \cdot P(\xi)\big), $$
where we choose the representative $\xi$ so that $L$ and $\xi$ have coinciding domains and where $P_F$ is taken with respect to the subgroup $F$ defined by the domain of $\xi$ and $L$. Note that we no longer deal with a representation of loops. For example, given a simple loop $L$ acting on a state $\xi$ with domain on a single copy of $G$ we find that (using a somewhat sloppy notation)
$$ H_L \cdot H_L = \frac{1}{4}\left(H_L^{2} + H_L^{-2} + 2\right). $$
This relation, however, changes according to which states in $H_{\mathrm{diff}}$ the operator $H_L$ acts on.

7.3. Diffeomorphism invariance via equivalent triples. In the previous subsection we implemented diffeomorphism invariance by constructing diffeomorphism invariant states and defining an action of loop operators on them. In fact, there is another option which, however, only works for extended diffeomorphisms. As explained above, the diffeomorphism group acts not only on the hilbert space but also on the algebra. We can therefore define an equivalence on the level of sub-triples: algebra, hilbert space and Euler-Dirac operator. This identification happens at the level of subgroups $F \in HG$. If we consider a single, simple loop $L$, the spectral triple associated to this is just
$$ \big(\langle L \rangle,\ L^2(G, Cl(TG) \otimes M_n),\ D\big), \tag{27} $$
where $\langle a, b, \ldots \rangle$ is the $C^*$-algebra generated by $\{a, b, \ldots\}$. Since all single, simple loops are diffeomorphic, at this level we just get expression (27) when we identify spectral sub-triples which are diffeomorphic. At the level of two nonintersecting simple loops $L_1$ and $L_2$ the spectral triple associated to this is
$$ \big(\langle L_1, L_2 \rangle,\ L^2(G^2, Cl(TG^2) \otimes M_n),\ D\big). \tag{28} $$
Again, by identifying spectral triples of diffeomorphic loops we get at this level just expression (28). This picture simply continues for all finitely generated subgroups, and taking the limit gives us an equivalence class of spectral triples represented by the infinite dimensional triple
$$ \big(\langle L_1, L_2, \ldots \rangle,\ L^2(G^\infty, Cl(TG^\infty) \otimes M_n),\ D\big). \tag{29} $$
Further, not only are all subgroups of nonintersecting loops with $n$ generators diffeomorphic, there are also internal diffeomorphisms which shuffle the generators. One can factor out this symmetry by symmetrizing operators and states, just as we did in the previous subsection. Therefore, the result is, in fact, identical to the result of the previous subsection. Instead of symmetrizing one could also form the noncommutative quotient by the action of the internal diffeomorphism group $SG_F$ (and the limit), i.e. consider the crossed product $A_F \times SG_F$, where $A_F$ is the part of our algebra acting on the $F$ part. This would be more in the spirit of noncommutative geometry and Connes. We will investigate this alternative elsewhere.

7.4. Spectral in the sense of Connes? It remains to clarify whether the spectral triple (29) satisfies the conditions put forward by Connes [2], see also [11]. An affirmative answer would give us access to the full power of noncommutative geometry. Clearly, on each level in the projective/inductive limit, the relevant Dirac (Euler-Dirac) operator satisfies the conditions for a spectral triple, simply by construction. The question remains whether this also holds in the limit. There are three conditions. First, the operator
$$ [D, a], \tag{30} $$
where $a$ belongs to the subspace of $A$ of finite linear combinations of loop operators, has to be bounded. A simple loop operator $a = H_L$ acts, according to (26), on a state via
$$ \frac{1}{N_F}\left(H_{L_1} + H_{L_1^{-1}} + \cdots + H_{L_{n(F)}} + H_{L_{n(F)}^{-1}}\right), \tag{31} $$
where the number $n(F)$ refers to the domain of $L$ and the state on which it acts (see Sect. 7.2). We can estimate the commutator of (31) with the Dirac operator by
$$ \frac{1}{N_F}\left\| \left[D,\ H_{L_1} + H_{L_1^{-1}} + \cdots + H_{L_{n(F)}} + H_{L_{n(F)}^{-1}}\right] \right\| \le \frac{1}{N_F}\left( \|[D, H_{L_1}]\| + \|[D, H_{L_1^{-1}}]\| + \cdots + \|[D, H_{L_{n(F)}}]\| + \|[D, H_{L_{n(F)}^{-1}}]\| \right) = \|[D, H_L]\|. \tag{32} $$
Because the operator
$$ H_L : G \to G; \qquad g \to g $$
is bounded we conclude that the operator (30) is bounded for a simple loop operator. For compositions of simple loop operators the argument is repeated and therefore we conclude that the first condition is satisfied. Second, we need to investigate whether the operator
$$ \frac{1}{D - \lambda}, \qquad \lambda \in \mathbb{C} \setminus \mathbb{R}, $$
is compact. In fact, this turns out not to be the case. Let us explain. For simplicity we leave out the matrix part of the hilbert space and simply consider the space
$$ L^2(G^n, Cl(TG^n)) = L^2(G, Cl(TG))^{\otimes n}, $$
where we only consider symmetrized (un-ordered) elements according to (22). Given a set of eigenfunctions $\{\xi_1, \ldots, \xi_m\}$ in $L^2(G, Cl(TG))$ of the Dirac operator, the product
$$ \xi_{i_1} \otimes \cdots \otimes \xi_{i_n} \tag{33} $$
is an eigenfunction of the Dirac operator in $L^2(G^n, Cl(TG^n))$. The problem is that if we find a function $\xi_0$ in $L^2(G, Cl(TG))$ with eigenvalue zero which differs from the function 1, then we will automatically have an infinite dimensional eigenspace associated to any eigenvalue. To see this, start from the function (33) and simply consider (remember that we consider only symmetrized products) the function
$$ \xi_0 \otimes \xi_{i_1} \otimes \cdots \otimes \xi_{i_n} $$
in $L^2(G^{n+1}, Cl(TG^{n+1}))$. This is again an eigenfunction with the same eigenvalue as (33). According to Hodge theory (see Theorem II.5.15 in [14]) the kernel of an Euler-Dirac operator on a compact manifold $M$ is related to the cohomology groups:
$$ \ker(D) = \bigoplus_p \mathbf{H}^p, \qquad \mathbf{H}^p = H^p(M; \mathbb{C}), $$
and the cohomology is, at least on an orientable manifold such as the Lie group $G$, nontrivial (the volume form is an example). Therefore we conclude that the Euler-Dirac operator in (29) does not satisfy Connes’ second condition. In principle, it is possible to correct this “flaw” in the construction of the Dirac operator in (29) by adding a bounded perturbation to $D$ on each level in the projective/inductive limit. Such a perturbation will, in general, not be bounded in the limit itself. Indeed, if the perturbation is constructed in such a way that the perturbed Dirac operator satisfies condition two, then the full perturbation will be unbounded. Changing the operator on each level of the projective/inductive limit does not change the K-homology class at each level. In the limit, however, the operator will be changed (the original Euler-Dirac operator in (29) does not have a K-homology class). The third condition is self-adjointness. That $D$ is self-adjoint is ensured by construction. Let us end this subsection by noting that the fact that the Dirac operator in the triple (29) does not satisfy Connes’ second condition may be interpreted as a hint that there exist some extra symmetries that have not been (and should be) factored out.

8. Discussion & Outlook

In the present paper we presented new ideas on the unification of noncommutative geometry (in particular Connes’ formulation of the standard model) and the principles of quantum field theory. We apply the machinery of noncommutative geometry to a general function space of connections related to gravity. A noncommutative algebra of holonomy loops is represented on a separable, diffeomorphism invariant hilbert space. An Euler-Dirac operator is constructed. The whole setup relies on techniques of projective and inductive limits of algebras, hilbert spaces and operators. What comes out is a geometrical structure, including integration theory, on a space of field configurations modulo diffeomorphism invariance.
A global notion of differentiation (the Dirac operator) is obtained. We find it remarkable that the whole construction boils down to the study of Dirac operators on various copies of some Lie group.
Whereas the noncommutativity of the algebra is intrinsically classical, we interpret the Dirac operator, which resembles a functional derivation, as ‘quantum’. Certain problems arose during the analysis. First, we were unable to represent the full hoop group in a manner compatible with the Euler-Dirac operator. The solution proposed and analyzed is to consider only finite subgroups of non-intersecting loops (in the projective system). This modification has important consequences; instead of graphs (spin-networks) we deal with polynomials on various copies of the group. It is, however, not clear to us whether this is an important point. More seriously, the final Dirac operator does not fulfill the conditions formulated by Connes. In particular, it has infinite-dimensional eigenspaces. Thus, we did not succeed in constructing a spectral triple which satisfies the conditions put forward by Connes. A prime concern for further development is to understand why our constructed Dirac operator is not spectral in the sense of Connes. We suspect that what is missing is a symmetry related to the infinite dimensional Clifford algebra, i.e. $Cl(T_{\mathrm{id}}(G^\infty))$. Another possible solution is to use the ordinary Dirac operator instead of the Euler-Dirac operator. To account for the lack of invariance under inversions of loops one can double the hilbert space: instead of assigning to each simple loop the hilbert space of square integrable functions over $G$, we can assign two copies of this hilbert space, one for each orientation of the loop. The diffeomorphism associated with inversion of the loop will then act by interchanging the two hilbert spaces. There will however still be some problems, for example embedding properties when we increase the number of copies of $G$. Another concern is to extend the present construction to work for non-compact groups, since gravity involves $SO(3, 1)$. The main problem will be the embeddings in the projective limit.
For example, $L^2(G)$ is not naturally embedded in $L^2(G^2)$ whenever $G$ is non-compact. We believe, however, that this is a technical and solvable problem. Also, loops will no longer occur as states in the hilbert space; a priori this is not necessarily a problem. We would also like to understand in what sense the noncommutativity of the holonomy algebra generates a bosonic sector and, if so, what it is. Clearly, noncommutativity permits inner automorphisms and nontrivial fluctuations of the Dirac operator. If we assume that we succeed in constructing a Dirac operator $D$ satisfying all of Connes’ conditions, and if we consider fluctuations around $D$ of the form
$$ D \to \tilde{D} = D + A + JAJ^\dagger, $$
where $J$ is Tomita’s anti-linear isometry [13] and $A$ is a noncommutative one-form$^8$, $A \in \Omega^1_D$, then we can apply Chamseddine and Connes’ spectral action principle [5, 6]. Thus, we can write down automorphism invariant quantities like
$$ \langle \tilde{\xi} | \tilde{D} | \tilde{\xi} \rangle, \qquad \mathrm{Tr}\,\varphi(\tilde{D}), \qquad \ldots . $$
Such terms can be interpreted as integrated quantities, schematically of the form
$$ \int_{\bar{A}/\mathrm{Diff}} d\nabla \ldots $$
which resembles a Feynman path integral and contains both fermionic and bosonic degrees of freedom. Here the integration is defined, modulo diffeomorphisms, on a space of connections.

$^8$ Elements of $\Omega^n_D$ are of the form $a_0 [D, a_1] \cdots [D, a_n]$, where the $a_i$ are elements of the algebra [4].
In the introduction we motivated our analysis by stating that Connes’ formulation of the standard model coupled to gravity is intrinsically classical. With the aim of combining noncommutative geometry and the principles of quantum field theory, we have found a spectral triple which a priori appears to be quite far from field theory. It is clearly of prime concern to investigate whether the construction does contain a field theory limit and, if so, what it is.

Acknowledgement. It is a pleasure to thank Raimar Wulkenhaar for comments and for carefully reading the manuscript.
A. Clifford Algebras and Dirac Operators

Here we give a brief review of Clifford algebras and the Euler-Dirac operator. For a detailed account see for example [14]. Since we are interested in Clifford algebras over Lie groups we only treat the Euclidean case. Let $V$ be a real vector space. We define the tensor algebra $T(V)$ as
$$ T(V) = \bigoplus_{i \ge 0} V^{\otimes i} $$
with multiplication
$$ (v_1 \otimes \cdots \otimes v_n) \cdot (u_1 \otimes \cdots \otimes u_m) = v_1 \otimes \cdots \otimes v_n \otimes u_1 \otimes \cdots \otimes u_m. $$
Given a metric $\langle \cdot, \cdot \rangle$ on $V$ one defines the Clifford algebra as
$$ Cl(V) = T(V)/(v \otimes u + u \otimes v = -2\langle v, u \rangle). $$
If $e_1, \ldots, e_n$ is an orthonormal basis of $V$, the Clifford algebra $Cl(V)$ is spanned by elements of the form $e_{i_1} \cdots e_{i_k}$, where $i_1 < \cdots < i_k$, with the product rules
$$ e_i e_j = -e_j e_i, \quad i \neq j, \qquad e_i^2 = -1. $$
There is an inner product on $Cl(V)$ given by
$$ \langle e_{i_1} \cdots e_{i_k},\ e_{j_1} \cdots e_{j_l} \rangle = 1 $$
if $k = l$ and $i_1 = j_1, \ldots, i_k = j_k$, and zero otherwise. The group $O(n)$ acts on $Cl(V)$ by
$$ o(e_{i_1} \cdots e_{i_k}) = o(e_{i_1}) \cdots o(e_{i_k}), \qquad o \in O(n), $$
as automorphisms preserving the inner product. In particular one also gets an action of $so(n)$ on $Cl(V)$. For a manifold $M$ with a metric, one defines the Clifford bundle $Cl(TM)$ as the bundle
$$ M \ni m \to Cl(T_m M), $$
where the inner product on $T_m M$ is the one given by the metric.
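The product rules for the basis monomials of $Cl(V)$ can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: a monomial $e_{i_1} \cdots e_{i_k}$ is encoded as a strictly increasing tuple of indices, and the helper names `basis` and `clifford_mul` are hypothetical, not taken from the text or from any library.

```python
from itertools import combinations


def basis(n):
    """All basis monomials e_{i1}...e_{ik}, i1 < ... < ik, as index tuples.

    The empty tuple stands for the unit element 1.
    """
    return [c for k in range(n + 1) for c in combinations(range(1, n + 1), k)]


def clifford_mul(a, b):
    """Product of two basis monomials in the Euclidean Clifford algebra.

    Uses only the rules e_i e_j = -e_j e_i (i != j) and e_i^2 = -1.
    Returns (sign, monomial) with the monomial again in increasing order.
    """
    sign = 1
    result = list(a)
    for j in b:
        # Move e_j to the left past every generator with a larger index;
        # each such transposition contributes a factor -1.
        pos = len(result)
        while pos > 0 and result[pos - 1] > j:
            pos -= 1
        if (len(result) - pos) % 2 == 1:
            sign = -sign
        if pos > 0 and result[pos - 1] == j:
            # e_j meets a second copy of itself: e_j e_j = -1.
            sign = -sign
            del result[pos - 1]
        else:
            result.insert(pos, j)
    return sign, tuple(result)


if __name__ == "__main__":
    print(clifford_mul((1,), (2,)))      # e_1 e_2
    print(clifford_mul((2,), (1,)))      # e_2 e_1 = -e_1 e_2
    print(clifford_mul((1, 2), (1, 2)))  # (e_1 e_2)^2
```

For an $n$-dimensional $V$ the list returned by `basis(n)` has $2^n$ entries, which matches the dimension of $Cl(V)$; by the definition of the inner product above, these monomials form an orthonormal basis.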
Let $\nabla$ denote the Levi-Civita connection associated to the metric. Via the extension of the action of $O(n)$ from $V$ to $Cl(V)$, the Levi-Civita connection extends to a connection in $Cl(TM)$ via the formula
$$ \nabla(e_{i_1} \cdots e_{i_k}) = \sum_l e_{i_1} \cdots \nabla(e_{i_l}) \cdots e_{i_k}, $$
where the $e_{i_l}$ are local orthonormal sections in $TM$. One defines the Euler-Dirac operator $D$ by
$$ L^2(M, Cl(TM)) \ni s \to D(s) = \sum_i e_i \cdot \nabla_{e_i} s, $$
where $\{e_i\}$ is a local orthonormal frame in $TM$. Since one wants to work with hilbert spaces one complexifies the space $L^2(M, Cl(TM))$, leaving the notation unchanged.

B. Projective and Inductive Limits

Here we review the concepts of projective and inductive limits. For a different treatment we refer to [9].

B.1. Projective limits. To illustrate the concept of a projective limit we will consider the index set $\mathbb{N}$ and for each $n \in \mathbb{N}$ the space $\mathbb{R}^n$. If $n_1 \le n_2$ there are projections
$$ P_{n_2,n_1} : \mathbb{R}^{n_2} \to \mathbb{R}^{n_1} $$
given by
$$ P_{n_2,n_1}(x_1, \ldots, x_{n_2}) = (x_1, \ldots, x_{n_1}). $$
We define the product
$$ \prod_{n \in \mathbb{N}} \mathbb{R}^n = \{(X_n)_{n \in \mathbb{N}} \mid X_n \in \mathbb{R}^n\}, $$
i.e. an element in $\prod_{n \in \mathbb{N}} \mathbb{R}^n$ is obtained by picking an element in $\mathbb{R}^n$ for each $n$. An element can thus be written as
$$ \big(x^1_1,\ (x^2_1, x^2_2),\ (x^3_1, x^3_2, x^3_3),\ \ldots\big). $$
The projective limit is defined as those elements in $\prod_{n \in \mathbb{N}} \mathbb{R}^n$ where
$$ (x^{n_1}_1, \ldots, x^{n_1}_{n_1}) = P_{n_2,n_1}(x^{n_2}_1, \ldots, x^{n_2}_{n_2}), $$
or written out
$$ x^1_1 = x^2_1, \qquad (x^2_1, x^2_2) = (x^3_1, x^3_2), \qquad (x^3_1, x^3_2, x^3_3) = (x^4_1, x^4_2, x^4_3), \ \ldots . $$
In other words, the projective limit, also written
$$ \varprojlim\,(\mathbb{R}^n, P_{n_2,n_1}), $$
is just $\mathbb{R}^\infty$, the set of all sequences in $\mathbb{R}$.
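The consistency condition that singles out the projective limit inside the product can be checked mechanically on a finite truncation. The following is a minimal sketch, assuming that a family is stored as a Python list whose entry with index $k$ is a point of $\mathbb{R}^{k+1}$; the function names are hypothetical and chosen only for this illustration.

```python
def project(x, n1):
    """P_{n2,n1}: keep the first n1 coordinates of a point x in R^{n2}."""
    return tuple(x[:n1])


def is_consistent(family):
    """Check P_{n2,n1}(X_{n2}) = X_{n1} for all n1 <= n2 in a finite family.

    family[k] is expected to be a point of R^{k+1}.
    """
    return all(
        project(family[n2], n1 + 1) == tuple(family[n1])
        for n2 in range(len(family))
        for n1 in range(n2 + 1)
    )


def family_from_sequence(seq):
    """The family of truncations (x_1), (x_1, x_2), ... of a sequence."""
    return [tuple(seq[:n]) for n in range(1, len(seq) + 1)]


if __name__ == "__main__":
    good = family_from_sequence([3.0, 1.0, 4.0, 1.0])
    bad = [(1.0,), (2.0, 5.0)]  # first coordinates disagree
    print(is_consistent(good), is_consistent(bad))
```

Every sequence gives a consistent family through its truncations, and a consistent family is determined by the sequence of its last coordinates, which is the content of the identification with $\mathbb{R}^\infty$.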
Another example, which is more relevant to our case, comes from group theory. Let $G$ be a group. We let $\mathcal{F}$ be the set of finitely generated subgroups of $G$. If $F_1, F_2 \in \mathcal{F}$ and $F_1 \subset F_2$ we have the inclusion map
$$ \iota_{F_1,F_2} : F_1 \to F_2. $$
If we therefore consider group homomorphisms from each of these finitely generated subgroups to a fixed group $G_1$ we get, by dualizing, restriction maps
$$ \iota^*_{F_1,F_2} : \mathrm{Hom}(F_2, G_1) \to \mathrm{Hom}(F_1, G_1). $$
As in the case of $\mathbb{R}^n$ we can consider the product
$$ \prod_{F \in \mathcal{F}} \mathrm{Hom}(F, G_1) = \{(\varphi_F)_{F \in \mathcal{F}} \mid \varphi_F \in \mathrm{Hom}(F, G_1)\}, $$
and the projective limit is defined as the subset of the product consisting of families that are consistent with the restriction maps, i.e. a family $(\varphi_F)_{F \in \mathcal{F}}$ is in the projective limit if
$$ \iota^*_{F_1,F_2}(\varphi_{F_2}) = \varphi_{F_1} $$
for all $F_1, F_2 \in \mathcal{F}$ with $F_1 \subset F_2$. We note that we have a map
$$ \mathrm{Hom}(G, G_1) \to \varprojlim\,(\mathrm{Hom}(F, G_1), \iota^*_{F_1,F_2}) $$
obtained simply by restricting a homomorphism from $G$ to $G_1$ to the finitely generated subgroups of $G$. It is easy to see that this map is a bijection, and we can hence identify $\mathrm{Hom}(G, G_1)$ with the projective limit. It might seem that we have just expressed something easy, namely $\mathrm{Hom}(G, G_1)$, in terms of something complicated, namely the projective limit. However, the description as a projective limit turns out to be very useful.
B.2. Inductive limits. The inductive limit is the dual concept of the projective limit. For simplicity we take $T^\infty$, the infinite torus (easier than $\mathbb{R}^\infty$ since $T^n$ is compact). This means that we have a projective system
$$ P_{n_2,n_1} : T^{n_2} \to T^{n_1}, \qquad n_1, n_2 \in \mathbb{N}, \quad n_1 \le n_2, $$
where $T^n$ is the $n$-torus and $P_{n_2,n_1}$ are the natural projections. The dual of a space is the functions on the space. There are of course several candidates for functions. In this example we will take the space of square integrable functions on $T^n$ with respect to the Haar measure, i.e. $L^2(T^n, d\mu_H)$. The dual map of $P_{n_2,n_1}$ gives a map
$$ P^*_{n_2,n_1} : L^2(T^{n_1}) \to L^2(T^{n_2}) $$
defined by
$$ P^*_{n_2,n_1}(\xi)(x) = \xi(P_{n_2,n_1}(x)), \qquad x \in T^{n_2}. $$
These maps are embeddings and are maps of hilbert spaces since
$$ \int 1 \, d\mu_H = 1. $$
The inductive limit of these hilbert spaces is constructed in the following way: We take the direct sum $\oplus_n L^2(T^n)$, i.e. sequences $\{\xi_n\}_{n \in \mathbb{N}}$ with $\xi_n \in L^2(T^n)$ such that $\{\xi_n\}$ is zero from a certain step on. In this space we consider the subspace $N$ generated by elements of the form
$$ (0, \ldots, 0, \xi_{n_1}, 0, \ldots, 0, -P^*_{n_2,n_1}(\xi_{n_1}), 0, \ldots), $$
and form the quotient space $\oplus_n L^2(T^n)/N$. This quotient just means that we consider all vectors lying in some $L^2(T^n)$, and identify two vectors $\xi_{n_1}, \xi_{n_2}$ if
$$ P^*_{n_2,n_1}(\xi_{n_1}) = \xi_{n_2}. $$
The space $\oplus_n L^2(T^n)/N$ is the algebraic inductive limit
$$ \varinjlim L^2(T^n). $$
Naively we are considering $L^2(T^1)$ as a subspace of $L^2(T^2)$, $L^2(T^2)$ as a subspace of $L^2(T^3)$, $L^2(T^3)$ as a subspace of $L^2(T^4)$ and so on, and the limit space as $n$ tends to infinity is the direct limit. Or in a picture,
$$ L^2(T^1) \subset L^2(T^2) \subset L^2(T^3) \subset \cdots \subset \varinjlim L^2(T^n). $$
We have used the words algebraic inductive limit, since we want to put a hilbert space structure on the inductive limit. If we have two vectors in the inductive limit, let us say $\xi_{n_1} \in L^2(T^{n_1})$ and $\xi_{n_2} \in L^2(T^{n_2})$ with $n_1 \le n_2$, we define the inner product by
$$ \langle \xi_{n_1}, \xi_{n_2} \rangle = \langle P^*_{n_2,n_1}(\xi_{n_1}), \xi_{n_2} \rangle_{L^2(T^{n_2})}. $$
Since the embeddings $P^*_{n_2,n_1}$ are hilbert space maps, this inner product is well defined. The hilbert space inductive limit of $\{L^2(T^n), P^*_{n_2,n_1}\}$ is therefore defined as the completion of $\oplus_n L^2(T^n)/N$ in the inner product $\langle \cdot, \cdot \rangle$. We will also denote this limit by $\varinjlim L^2(T^n)$.
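The embeddings and the inner product on the algebraic inductive limit can be illustrated on trigonometric polynomials. The sketch below rests on several assumptions that are not part of the text: a function on $T^n$ is stored as a dictionary from frequency tuples $k \in \mathbb{Z}^n$ to Fourier coefficients, the Haar measure is normalized so that the exponentials $e^{i k \cdot \theta}$ are orthonormal, and the inner product is taken conjugate-linear in the first slot. In this encoding the pull-back $P^*_{n_2,n_1}$ simply pads frequencies with zeros. The names `embed` and `inner` are hypothetical.

```python
def embed(xi, n1, n2):
    """P*_{n2,n1}: pull a trigonometric polynomial on T^{n1} back to T^{n2}.

    The pulled-back function does not depend on the extra angles, so each
    frequency tuple is padded with zeros.
    """
    assert n1 <= n2
    pad = (0,) * (n2 - n1)
    return {k + pad: c for k, c in xi.items()}


def inner(xi, n1, eta, n2):
    """Inner product of xi in L^2(T^{n1}) and eta in L^2(T^{n2}).

    Both vectors are first embedded into the larger of the two levels.
    Convention (an assumption): conjugate-linear in the first argument.
    """
    n = max(n1, n2)
    a, b = embed(xi, n1, n), embed(eta, n2, n)
    return sum(c.conjugate() * b[k] for k, c in a.items() if k in b)


if __name__ == "__main__":
    xi = {(1,): 2 + 0j, (0,): 1j}           # 2 e^{i theta_1} + i on T^1
    eta = {(1, 0): 3 + 0j, (0, 1): 5 + 0j}  # 3 e^{i theta_1} + 5 e^{i theta_2} on T^2
    print(inner(xi, 1, eta, 2))
    print(inner(xi, 1, xi, 1), inner(embed(xi, 1, 3), 3, embed(xi, 1, 3), 3))
```

The second printed line compares the norm of a vector before and after embedding into $L^2(T^3)$. That the two values agree reflects the statement that the maps $P^*_{n_2,n_1}$ are maps of hilbert spaces, which is what makes the inner product independent of the level at which it is computed.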
B.3. Constructing operators on inductive limits of hilbert spaces. The main advantage of describing spaces as projective or inductive limits is that one can work on each copy, and then extend to the whole space if the construction is compatible with the structure maps, for example $P^*_{n_2,n_1}$. As an example of this, let us take $\varinjlim L^2(T^n)$. On $L^2(T^n)$ we have the Laplacian
$$ \Delta_n : L^2(T^n) \to L^2(T^n) $$
defined by
$$ \Delta_n = -(\partial^2_{\theta_1} + \partial^2_{\theta_2} + \cdots + \partial^2_{\theta_n}). $$
Note that
$$ P^*_{n_2,n_1}(\Delta_{n_1}(\xi_{n_1})) = \Delta_{n_2}(P^*_{n_2,n_1}(\xi_{n_1})), $$
and therefore
$$ \Delta = \bigoplus_n \Delta_n $$
on $\oplus_n L^2(T^n)$ has the property
$$ \Delta(N) \subset N, $$
i.e. $\Delta$ descends to a densely defined operator on the quotient space, i.e. the inductive limit $\varinjlim L^2(T^n)$.

References

1. Gel’fand, I.M., Naimark, M.A.: On the imbedding of normed rings into the ring of operators in Hilbert space. Mat. Sb. 12, 197–213 (1943)
2. Connes, A.: Noncommutative Geometry. London-New York: Academic Press, 1994
3. Connes, A., Lott, J.: Particle Models And Noncommutative Geometry (Expanded Version). Nucl. Phys. Proc. Suppl. 18B, 29 (1991)
4. Connes, A.: Gravity coupled with matter and the foundation of non-commutative geometry. Commun. Math. Phys. 182, 155 (1996)
5. Chamseddine, A.H., Connes, A.: Universal formula for noncommutative geometry actions: Unification of gravity and the standard model. Phys. Rev. Lett. 77, 4868 (1996)
6. Chamseddine, A.H., Connes, A.: A universal action formula. Phys. Rev. Lett. 77, 4868 (1996)
7. Ashtekar, A., Lewandowski, J.: Background independent quantum gravity: A status report. Class. Quant. Grav. 21, R53 (2004)
8. Fairbairn, W., Rovelli, C.: Separable Hilbert space in loop quantum gravity. J. Math. Phys. 45, 2802 (2004)
9. Marolf, D., Mourao, J.M.: On the support of the Ashtekar-Lewandowski measure. Commun. Math. Phys. 170, 583 (1995)
10. Ashtekar, A., Lewandowski, J.: Representation theory of analytic holonomy C* algebras. In: Baez, J. (ed.), Knots and Quantum Gravity, Oxford: Oxford Univ. Press, 1994
11. Connes, A., Moscovici, H.: The local index formula in noncommutative geometry. Geom. Funct. Anal. 5(2), 174–243 (1995)
12. Ashtekar, A., Lewandowski, J., Marolf, D., Mourao, J., Thiemann, T.: Quantization of diffeomorphism invariant theories of connections with local degrees of freedom. J. Math. Phys. 36, 6456 (1995)
13. Takesaki, M.: Tomita’s theory on modular Hilbert algebras and its applications. Lecture Notes in Math., Berlin-Heidelberg-New York: Springer, 1970
14. Lawson, H., Michelsohn, M.: Spin Geometry. Princeton, NJ: Princeton University Press, 1989

Communicated by A. Connes
Commun. Math. Phys. 264, 683–703 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1487-2
Communications in
Mathematical Physics
Notes on Fast Moving Strings

Andrei Mikhailov$^{1,2}$

$^1$ California Institute of Technology 452-48, Pasadena, CA 91125, USA. E-mail: [email protected]
$^2$ Institute for Theoretical and Experimental Physics, Bol. Cheremushkinskaya, 25, 117259 Moscow, Russia
Received: 13 May 2005 / Accepted: 13 July 2005 Published online: 24 January 2006 – © Springer-Verlag 2006
Abstract: We review the recent work on the mechanics of fast moving strings in anti-de Sitter space times a sphere and discuss the role of conserved charges. An interesting relation between the local conserved charges of rigid solutions was found in the earlier work. We propose a generalization of this relation for arbitrary solutions, not necessarily rigid. We conjecture that an infinite combination of local conserved charges is an action variable generating periodic trajectories in the classical string phase space. It corresponds to the length of the operator on the field theory side.

1. Introduction

The AdS/CFT correspondence is a strong-weak coupling duality. Weakly coupled Yang-Mills is mapped to the string theory on the highly curved AdS space. When AdS space is highly curved, the string worldsheet theory becomes strongly coupled. Therefore, the weakly coupled Yang-Mills maps to the strongly coupled string worldsheet theory. Nevertheless, in some situations elements of the YM perturbation theory can be reproduced from the string theory side. One example is the “spinning strings”. Spinning strings are a class of solutions of the classical string worldsheet theory. They were first considered in the context of the AdS/CFT correspondence in [1–3]. These are strings rotating in $S^5$ with a large angular momentum. It was noticed in [1] that the energy of these solutions has an expansion in some small parameter which is similar in form to the perturbative expansion in the field theory on the boundary. Then [4] computed the anomalous dimensions of single trace operators with the generic large R-charge, making the actual comparison possible. In [5] more general solutions were considered, having large compact charges both in $S^5$ and in $AdS_5$.
For all these solutions, computations in the classical worldsheet theory lead to a series in the small parameter which on the field theory side is identified with $\lambda/J^2$, where $\lambda$ is the ’t Hooft coupling constant and $J$ a large conserved charge. Moreover it was shown in [6] that the quantum corrections to the classical worldsheet theory are suppressed for the solutions with the large conserved
charge (see also the recent discussion in [7]). This opened the possibility that the results of the calculations in the classical mechanics of spinning strings, which are valid a priori only in the large $\lambda$ limit, can in fact be extended to weak coupling and therefore compared to the Yang-Mills perturbation theory. It was conjectured that the Yang-Mills perturbation theory in the corresponding sector is reproduced by the classical dynamics of the spinning strings. The following picture is emerging. Single string states in $AdS_5 \times S^5$ correspond to single-trace operators in the $\mathcal{N} = 4$ supersymmetric Yang-Mills theory. (We consider the large $N$ limit.) The dynamics of the single-trace operators is described in the perturbation theory by an integrable spin chain. This spin chain has a classical continuous limit [8] which describes a class of operators with the large R-charges. In this limit the spin chain becomes a classical continuous system. We have conjectured in [9] that this classical system is equivalent to the worldsheet theory of the classical string in $AdS_5 \times S^5$. The Yang-Mills perturbative expansion corresponds to considering the worldsheet of the fast moving string as a perturbation of the null-surface [8–12]. The null-surface perturbation theory was previously considered in a closely related context in [13]. In this paper we will try to make the statement of equivalence more precise. We will argue that the string worldsheet theory has a “hidden” $U(1)$ symmetry which is defined unambiguously by its characteristic properties, which we describe. This $U(1)$ commutes with the group of geometrical symmetries of the target space. It corresponds to the length of the spin chain on the field theory side. We conjecture that the phase space of the classical continuous spin chain is equivalent to the Hamiltonian reduction of the phase space of the classical string by the action of this $U(1)$. The equivalence commutes with the action of geometrical symmetries.
We should stress that the hidden $U(1)$ symmetry which we discuss in this paper was constructed already in [12], but the explicit calculation was carried out only at the first nontrivial order of the null-surface perturbation theory. The main new result of our paper is that we discuss this hidden symmetry from the point of view of integrability. We conjecture a relation between the $U(1)$ symmetry and the local conserved charges which, if true, gives a uniform description of this symmetry at all orders of the perturbation theory. The classical string on $AdS_5 \times S^5$ is an integrable system (see [14–18] and references there), and our $U(1)$ corresponds to an action variable. The existence of the action variables for integrable systems with a finite-dimensional phase space is a consequence of the Liouville theorem [19]. The classical string has an infinite-dimensional phase space. We are not aware of the existence of a general theorem which would guarantee that the action variables can be constructed in the infinite-dimensional case. But we will give two arguments for the existence of one action variable for the string in $AdS_5 \times S^5$, at least in the perturbation theory around the null-surfaces. The first argument gives an explicit procedure to construct the action variable order by order in the perturbation theory (Sects. 3, 4.4 and 4.6). The second argument uses the existence of the local conserved charges [20] (known as higher Pohlmeyer charges) and the results of the evaluation of these charges on the so-called “rigid solutions” performed in [21, 22]. The arguments in Sect. 4 of our paper together with the results of [21, 22] suggest that the action variable is an infinite linear combination of the Pohlmeyer charges and allow in principle to find the coefficients of this linear combination.

The plan of the paper. In Sect. 2 we will review the classification of the null-surfaces following mostly [11, 9] and stress that the moduli space of the null-surfaces is a $U(1)$-bundle over a loop space. Therefore it has a canonically defined action of $U(1)$. In Sect. 3
we will explain how to extend the action of $U(1)$ from the null-surfaces to the nearly-degenerate extremal surfaces using the perturbation theory. A large part of Sect. 3 is a review of [12]. In Sect. 4 we discuss the geometrical meaning of this $U(1)$ as an action variable and argue that it is an infinite sum of the local conserved charges.

Note added in the revised version. The coefficients of the expansion of the action variable in the local conserved charges were fixed to all orders in the first paper of [28]. Here we consider only the Pohlmeyer charges for the $S^5$ part of the string sigma-model. The role of the Pohlmeyer charges for $AdS_5$ was discussed in the second paper of [28]. In the special case when the motion of the string is restricted to $\mathbb{R} \times S^2 \subset AdS_5 \times S^5$ the action variable discussed here corresponds to the action variable of the sine-Gordon model, see the third paper of [28].

2. Null-surfaces

2.1. The definition. A two-dimensional surface in a space-time of Lorentzian signature is called a null-surface if it has a degenerate metric and is ruled by the light rays. There is a connection between null-surfaces and extremal surfaces. An extremal surface is a two-dimensional surface with the induced metric of the signature 1+1 which extremizes the area functional. Extremal surfaces are solutions of the string worldsheet equation of motion in the purely geometrical background (no B-field). When the string moves very fast, the metric on the worldsheet degenerates and the worldsheet becomes a null-surface. Therefore a null-surface can be considered as a degenerate limit of an extremal surface. In $AdS_5 \times S^5$ there are two types of light rays. The light rays of the first type project to points in $S^5$. The light rays of the second type project to the timelike geodesics in $AdS_5$ and the equator of $S^5$. The operators of the large R-charge correspond to the null-surfaces ruled by the light rays of the second type$^1$.

2.2. The moduli space of null-surfaces.
It is straightforward to explicitly describe all the null-surfaces of the second type in $AdS_5 \times S^5$. We have to first describe the moduli space of the null-geodesics of the second type. An equator of $S^5$ is specified by a point $g_S$ in the coset space $SO(6)/(SO(2)\times SO(4))$. Similarly, a timelike geodesic in $AdS_5$ is specified by $g_A \in SO(2,4)/(SO(2)\times SO(4))$. Given $g_S$ and $g_A$, let $E(g_S) \subset S^5$ and $T(g_A) \subset AdS_5$ be the corresponding equator in $S^5$ and timelike geodesic in $AdS_5$, respectively. To specify a light ray in $AdS_5 \times S^5$ we have to give also a map $F : T \to E$ which pulls back the angular coordinate on $E$ to the length parameter on $T$ (see Fig. 1). Such maps are parametrized by $S^1$. We see that each light ray is defined by a triple $(T, E, F)$. Therefore, the moduli space of light-rays of the second type in $AdS_5 \times S^5$ is geometrically:
$$\frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)} \times S^1. \tag{1}$$
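As a quick consistency check on (1), which is not part of the original argument, one can count dimensions. The sketch below assumes only the standard facts that $SO(p,q)$ has dimension $(p+q)(p+q-1)/2$ and that the null geodesics of an $n$-dimensional Lorentzian manifold form a family of dimension $2n-3$:

```latex
\begin{align*}
\dim \frac{SO(2,4)}{SO(2)\times SO(4)} &= 15 - (1 + 6) = 8, \\
\dim \frac{SO(6)}{SO(2)\times SO(4)} &= 15 - (1 + 6) = 8, \\
8 + 8 + 1 &= 17 = 2\cdot 10 - 3 .
\end{align*}
```

So the space (1) has the same dimension as the space of all light rays in the ten-dimensional target, which is what one would expect if the light rays of the second type are the generic ones.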
A null-surface is a one-parameter family of light rays. Therefore it determines a contour in $\frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)} \times S^1$. But we also have to remember that an arbitrary collection of the light rays is not necessarily a null-surface. It is a null-surface only if the induced metric is degenerate. To understand what it means, let us choose a

Footnote 1. The null-surfaces of the first type have a boundary. They describe the shock wave propagating from the cusp of the worldline of a spectator quark in $\mathbb{R} \times S^3$.
A. Mikhailov
Fig. 1. A null-geodesic in $AdS_5 \times S^5$ is specified by the choice of an equator $E$ in $S^5$, a time-like geodesic $T$ in $AdS_5$ and a map $F : T \to E$ which maps the angular parameter $\psi$ on the equator to the time $t$ on the geodesic, up to a constant
space-like curve belonging to our surface. This space-like curve is a collection of points, one point on each light ray. For the surface to be null, the tangent vector to this curve at each point of the curve should be orthogonal to the light ray to which the point belongs. (This condition does not depend on how we choose a space-like curve.) What kind of a constraint does it impose on the contour? The space $\frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)} \times S^1$ is a $U(1)$ bundle over $\frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)}$. The condition of the degeneracy of the metric defines a connection on this bundle. The definition of this connection is: a curve in the total space is considered horizontal precisely if the corresponding collection of light rays is a degenerate surface. What is the curvature of this connection? Both $\frac{SO(2,4)}{SO(2)\times SO(4)}$ and $\frac{SO(6)}{SO(2)\times SO(4)}$ are Kähler manifolds (if we forgive that the metric on the first coset is not positive-definite). Let us denote the Kähler forms $k_A$ and $k_S$. The curvature of our $U(1)$-bundle is $k_A + k_S$. A curve in the base space $\frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)}$ can be lifted to a horizontal curve in the total space if and only if a two-dimensional film ending on this curve has an integer Kähler area (the integral of $k_A + k_S$ over this film should be an integer). Moreover, it is lifted as a horizontal curve almost unambiguously, except that there is a “global” action of $U(1)$ shifting $F : T \to E$ on every light ray by the same constant. Therefore, the moduli space of null-surfaces is the $U(1)$ bundle over the space of contours in $\frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)}$ subject to the integrality condition which we described. To summarize, the moduli space of the null-surfaces of the second type is:
$$\frac{\mathrm{Map}_0\left(S^1,\ \frac{SO(2,4)}{SO(2)\times SO(4)} \times \frac{SO(6)}{SO(2)\times SO(4)}\right) \times S^1}{\mathrm{Diff}(S^1)}. \tag{2}$$
Here $\mathrm{Map}(S^1, X)$ means the space of maps from the circle to $X$; for $X$ a Kähler manifold $\mathrm{Map}_0(S^1, X)$ means the space of maps satisfying the integrality condition. At this
Fig. 2. A picture of a null-surface in $AdS_5 \times S^5$. A null-surface is a two-dimensional surface with the degenerate metric, ruled by the light rays. We have shown five light rays and a spacial contour with a parameter $\sigma$. One can visualize the null-surface as the surface swept by the spacial contour as it moves along the light rays
point we consider the null-surfaces without a parametrization; therefore we divide by the group $\mathrm{Diff}(S^1)$ of the diffeomorphisms of the circle. Turning on the fermionic degrees of freedom on the worldsheet we get the moduli space of supersymmetric null-surfaces [9]:
$$\frac{\mathrm{Map}_0\left(S^1, Gr(2|2, 4|4)\right) \times S^1}{\mathrm{Diff}(S^1)}. \tag{3}$$
Here $\mathrm{Map}_0\left(S^1, Gr(2|2, 4|4)\right)$ is the phase space of the continuous spin chain [9]. Therefore the moduli space of null-surfaces is “almost” equivalent to the phase space of the continuous spin chain, except for the fiber $S^1$ and the reparametrizations $\mathrm{Diff}(S^1)$. We have to explain what happens to the fiber and why the null-surface actually comes with the parametrization. Also, we have to explain how the symplectic structure is defined on the moduli space of null surfaces. Let us start with the parametrization.

2.3. Parametrized null-surfaces. The phase space of the classical string has a boundary which consists of strings “moving with the speed of light”. A string moving very fast can be approximated by a null-surface. But one null-surface can approximate many different fast moving strings. The null-surface as we defined it so far “remembers” only the direction of the velocity at each point of the approximated string, but it misses the information about the ratios of the relativistic factors $\sqrt{1 - v^2}$ at different points of the string. Although $\sqrt{1 - v^2} \to 0$ in the null-surface limit, the ratio $\sqrt{1 - v^2(\tau, \sigma_1)}\,/\sqrt{1 - v^2(\tau, \sigma_2)}$ for two different points on the worldsheet remains finite. Therefore, if we want to think of the moduli space of the null-surfaces as the boundary of the phase space, we have to equip the null-surfaces with an additional structure. This additional structure is the parametrization. A null-surface is a one-parameter family of the light rays. The parametrization is a particular choice of the parameter. In other words, it is a monotonic function $\sigma$ from the family of light rays forming the null-surface to the circle, defined modulo $\sigma \sim \sigma + \mathrm{const}$. One can also think of it as a density $d\sigma$ on the set of light rays forming the null-surface. This density is, roughly speaking, proportional to the density of energy on the worldsheet of the fast-moving string, in the limit when it becomes the null-surface. We will now give the definition of $\sigma$.
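As an illustration of why the ratio of relativistic factors survives the limit, here is a small worked example. The parametrization of the speeds by constants $a_1, a_2 > 0$ and a small parameter $\epsilon$ is an assumption made only for this example and does not appear in the original text. Suppose the speeds at two points approach the speed of light as $v(\tau,\sigma_i) = 1 - a_i \epsilon$. Then:

```latex
\begin{align*}
\sqrt{1 - v^2(\tau,\sigma_i)} &= \sqrt{2 a_i \epsilon - a_i^2 \epsilon^2}
 \;\to\; 0 \quad (\epsilon \to 0), \\
\frac{\sqrt{1 - v^2(\tau,\sigma_1)}}{\sqrt{1 - v^2(\tau,\sigma_2)}}
 &= \sqrt{\frac{a_1\,(2 - a_1 \epsilon)}{a_2\,(2 - a_2 \epsilon)}}
 \;\to\; \sqrt{\frac{a_1}{a_2}} .
\end{align*}
```

Both factors vanish in the limit, but their ratio keeps the finite value $\sqrt{a_1/a_2}$. This is the kind of information that the unparametrized null-surface forgets and the density $d\sigma$ is meant to record.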
Consider the family of string worldsheets $\Sigma(L)$ converging to the null-surface $\Sigma_0 = \Sigma(\infty)$. We will introduce a parametrization $d\sigma$ of $\Sigma_0$ in the following way. Consider a Killing vector field $U$ on $S^5$, corresponding to some rotation of the sphere:
$$U.x_S^i = u_{ij}\, x_S^j. \tag{4}$$
Here $x_S^i$ parametrizes the $S^5$: $\sum_i (x_S^i)^2 = 1$. When $L$ is large, $\Sigma(L)$ is close to $\Sigma_0$, the string moves very fast and the conserved charge corresponding to $U$ is very large. We can approximate this charge by an integral over a spacial contour on the null-surface $\Sigma_0$ of $u_{ij}\, x_{0,S}^i\, \partial_\tau x_{0,S}^j$ times some density $d\sigma$:
$$Q_U = L \int_{\sigma \in [0,2\pi]} d\sigma\; u_{ij}\, x_{0,S}^i\, \partial_\tau x_{0,S}^j + (\text{terms vanishing at } L \to \infty). \tag{5}$$
Here $x_{0,S}$ is the $S^5$-part of the null-surface; we choose the $\tau$ coordinate on the null-surface to be the affine parameter on the light ray normalized by the condition $x_{0,S}(\tau + 2\pi, \sigma) = x_{0,S}(\tau, \sigma)$. Equation (5) with the condition $\int d\sigma = 2\pi$ is the definition of $d\sigma$, and also the precise definition of the large parameter $L$, modulo $O(1/L)$. We choose $\sigma$ as the parametrization. We can now say that the moduli space of parametrized null-surfaces is the boundary of the phase space of a classical string. We say that a family $\Sigma(L)$ of extremal surfaces has a parametrized null-surface $\Sigma_0$ as a limit when $L \to \infty$ if and only if
• $\Sigma(L)$ has $\Sigma_0$ as a limit when $L \to \infty$, as a continuous family of smooth two-dimensional surfaces in a smooth two-dimensional manifold, and
• the density of $Q_U$ approaches Eq. (5) in the limit $L \to \infty$.
This definition of the parametrization does not depend on which particular geometrical symmetry $U$ we use. An alternative way to define the same parametrization is to use a special choice of the worldsheet coordinates on $\Sigma$. Let us choose the worldsheet coordinates $\tau, \tilde\sigma$ so that
$$\left(\frac{\partial x_S}{\partial \tau}\right)^2 + \left(\frac{\partial x_S}{\partial \tilde\sigma}\right)^2 = -\left(\frac{\partial x_A}{\partial \tau}\right)^2 - \left(\frac{\partial x_A}{\partial \tilde\sigma}\right)^2 = 1,$$
$$\left(\frac{\partial x_S}{\partial \tau}, \frac{\partial x_S}{\partial \tilde\sigma}\right) = -\left(\frac{\partial x_A}{\partial \tau}, \frac{\partial x_A}{\partial \tilde\sigma}\right) = \mathrm{const},$$
where $x_A$ is the projection of the string worldsheet to $AdS_5$ and $x_S$ is the projection to $S^5$. Then we define $\sigma = \tilde\sigma / \int d\tilde\sigma$. In the null-surface limit $d\sigma$ defines the parametrization of the null-surface.

2.4. The symplectic structure. The moduli space of parametrized null-surfaces as a manifold depends only on the conformal structure of the target space. But we can introduce additional structures on this moduli space which use the metric on $AdS_5 \times S^5$. An important additional structure is the closed 2-form which originates from the symplectic form of the classical string. Strictly speaking, a differential form in the bulk of the manifold does not automatically determine a differential form on the boundary. Indeed, suppose that we have a differential form, for example a 2-form $\Omega$ in the bulk. We can try to define the “boundary value” $\omega$ of $\Omega$ on the boundary in the following way. Given two vector fields $v_1, v_2$ on the boundary, we find two vector fields $V_1, V_2$ in the bulk such that $\lim V_1 = v_1$ and $\lim V_2 = v_2$. Then we define $\omega(v_1, v_2) = \lim \Omega(V_1, V_2)$. But the problem is that this definition will depend on the choice of $V_1$ and $V_2$. Intuitively, if $(\tilde V_1, \tilde V_2)$ is some other choice of a pair of vector fields inducing $(v_1, v_2)$ on the boundary, and the “vertical component” of $\tilde V_i - V_i$ is not small enough near the boundary, then $\lim \Omega(V_1, V_2) \neq \lim \Omega(\tilde V_1, \tilde V_2)$. Given this difficulty, how do we define the symplectic form on the space of null-surfaces given the symplectic form on the string phase space? When we lift the vector field $v$ on the boundary to the vector field $V$ in the bulk, let us require that $dL(V)$ goes to zero when $L \to \infty$. We define $L$ by Eq. (5); it is only an approximate definition at $L \to \infty$, but this is good enough for the purpose of our definition:
$$\omega(v_1, v_2) = \lim_{L \to \infty} L^{-1}\, \Omega(V_1, V_2), \tag{6}$$
where $V_1$ and $V_2$ are such that $dL(V_1) = dL(V_2) \simeq 0$. One can see that $\omega$ has a kernel, which is precisely the tangent space to the fiber $S^1$ in the numerator of Eq. (3). The moduli space has a symmetry $U(1)$ rotating this fiber; we will discuss this symmetry in the next section; we will call it $U(1)_L$. Therefore $\omega$ is the symplectic form on the moduli space of null-surfaces modulo $U(1)_L$. Equation (3) implies that the moduli space of parametrized null-surfaces modulo $U(1)_L$ is the space of parametrized contours in the Grassmannian:
$$\mathrm{Map}_0\left(S^1, Gr(2|2, 4|4)\right). \tag{7}$$
One can see that $\omega$ is equal to the integral of the symplectic form on the super-Grassmannian pointwise on the contour, with the measure $d\sigma$. The symplectic area of the film filling the contour is the generating function of the shift of the origin of the circle. Therefore the integrality condition guarantees that the symplectic form does not depend on the choice of the origin on $S^1$; the symplectic form is horizontal and invariant with respect to the shifts of the origin of $S^1$. Our definition of the symplectic form on the space of null-surfaces used the target space metric (just the conformal structure would not be enough) and also the fact that the target space is a product of two manifolds.

3. Nearly-Degenerate Extremal Surfaces and the Role of the Engineering Dimension

Our discussion in this and the next section will be limited to the classical bosonic string.

3.1. Definition of $U(1)_L$. The moduli space (3) of null-surfaces is a $U(1)$-bundle. The $U(1)$ symmetry shifting in the fiber $S^1$ plays an important role in the formalism. We will call it $U(1)_L$. In Fig. 3 we have shown schematically how $U(1)_L$ acts on the null-surfaces. We conjecture that $U(1)_L$ corresponds to the length of the spin chain. Generally speaking, the length of the spin chain is not conserved in the Yang-Mills perturbation theory [24], but it is probably conserved in the continuous limit (this should be related to the discussion of the “closed sectors” in [26]). It should be conserved modulo the corrections vanishing in the continuous limit. We therefore conjecture that there is a continuation of $U(1)_L$ from the space of null-surfaces to the phase space of the classical
Fig. 3. The action of $U(1)_L$ on the null-surface. The symmetry acts only on the $S^5$-part of the null-string. Each point shifts by the same angle along the equator which is the projection to $S^5$ of the corresponding light ray
string, at least to the region of the phase space corresponding to fast moving strings. We conjecture that this continuation is uniquely defined by the following properties:
1. The action of $U(1)_L$ preserves the symplectic structure.
2. The action of $U(1)_L$ does not change the projection of the worldsheet to $AdS_5$. Moreover, it preserves the projection to $AdS_5$ of the null-directions on the worldsheet.
3. We require that the orbits of $U(1)_L$ are closed (otherwise, we would not have called it $U(1)$).
4. The restriction of $U(1)_L$ to the null-surfaces acts as we described (see Fig. 3).
The second property reflects the fact that $U(1)_L$ corresponds to the length of the operator rather than its engineering dimension. Let $E$ denote the Hamiltonian of $U(1)_L$. Let $X$ denote the phase space of the classical string, and $X/\!/(E = l)$ denote the Hamiltonian reduction of the phase space on the level set of $E$. The basic conjecture is: There is a one-to-one map from the phase space of the spin chain of the length $l$ to the reduced phase space of the classical string $X/\!/(E = l)$ preserving the symplectic structure and commuting with the action of $SO(2, 4) \times SO(6)$. The reduction by $U(1)_L$ was discussed in [23] but only in a sector [24] in which $U(1)_L$ acts as some element of $SO(6)$. The perturbation theory in this sector was discussed in [25] (see also Sect. 2 of [11]).
3.2. Action of $U(1)_L$ on nearly-degenerate extremal surfaces. In this subsection we will explain how to continue the action of $U(1)_L$ from the boundary of the phase space. Most of this section is a partial review (footnote 2) of Sect. 3 of [12].

Footnote 2. Section 3 of [12] has more than just a construction of $U(1)_L$. The next step is considering the action of the Killing vector field $\frac{\partial}{\partial T}$, where $T$ is the global time in $AdS_5$, on the invariants of $U(1)_L$ and bringing the result to the form suitable for the comparison with the field theory computation. Here we are discussing only the first step.
3.2.1. Particle on a sphere. Consider the phase space of a particle moving on $S^5$, and restrict to the domain where the velocity of the particle is nonzero. This domain is naturally a bundle over the moduli space of equators of $S^5$; let $\pi$ denote the projection map in this bundle. A point of the phase space, corresponding to the position $x \in S^5$ and the velocity $v \in T_x S^5$, projects by $\pi$ to the equator going through $x$ and tangent to $v$. See the discussion in [11]. The symplectic form on the phase space is expressed in terms of the symplectic form on the base and the connection form $D\psi$:
$$\omega = df \wedge D\psi + f\, \pi^* \Omega, \tag{8}$$
where $D\psi = \frac{(p, dx)}{\sqrt{(p,p)}}$, $f = \sqrt{(p, p)}$ ($p$ is the momentum of the particle) and $\Omega$ is the symplectic form on the moduli space of equators. The moduli space of equators $\frac{SO(6)}{SO(2)\times SO(4)}$ is a Kähler manifold, and the symplectic form is the Kähler form. Now it is easy to construct the action of $U(1)$. One takes
$$V = \frac{\partial}{\partial\psi}. \tag{9}$$
This is a vertical vector field; it does not act on the base. The coordinate $\psi$ is essentially the angle along the equator on which the particle is moving. More explicitly:
$$\frac{\partial}{\partial\psi}.x = \frac{1}{\sqrt{(\partial_\tau x, \partial_\tau x)}}\, \partial_\tau x. \tag{10}$$
It is easy to see that the trajectories of the vector field $\frac{\partial}{\partial\psi}$ on the phase space of a particle on $S^5$ are periodic with the period $2\pi$. One has to remember that this vector field is defined only on the open subset of the phase space where the velocity of the particle is nonzero. But we consider fast moving strings, and the region of the phase space where the velocity is nearly zero is not important for us.
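The periodicity claim can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the sphere is embedded as the unit sphere in $\mathbb{R}^6$, represents a phase space point by a unit vector `x` and a tangent momentum `p`, and uses the closed-form rotation in the plane spanned by `x` and `p`, which moves `x` in the direction $p/|p|$ as in Eq. (10). The function name `flow` and the sample values are illustrative choices.

```python
import math

def dot(a, b):
    """Euclidean scalar product of two vectors given as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def flow(x, p, psi):
    """Move the point (x, p) by the angle psi along its equator.

    Assumes x is a unit vector in R^6 and p is tangent to the sphere at x
    (dot(x, p) == 0, p != 0). At psi = 0 the velocity dx/dpsi equals p/|p|.
    """
    norm_p = math.sqrt(dot(p, p))
    c, s = math.cos(psi), math.sin(psi)
    x_new = [c * xi + s * pi / norm_p for xi, pi in zip(x, p)]
    p_new = [c * pi - s * norm_p * xi for xi, pi in zip(x, p)]
    return x_new, p_new

# Illustrative initial data: a point on S^5 and a tangent momentum with |p| = 5.
x0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
p0 = [0.0, 3.0, 0.0, 4.0, 0.0, 0.0]

# After the angle 2*pi the point returns to where it started.
x1, p1 = flow(x0, p0, 2 * math.pi)
```

Along the flow the point stays on the sphere, the momentum stays tangent and $|p|$ is unchanged, so the flow moves only along the fiber of the bundle described above.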
3.2.2. String on a sphere. In some sense, a string is a continuous collection of particles. Therefore, it is natural to apply a similar construction to the string. Treating the string as a continuous collection of particles requires the choice of the coordinates on the worldsheet. We will therefore introduce the conformal gauge:
$$(\partial_\tau x)^2 + (\partial_\sigma x)^2 = 0, \qquad (\partial_\tau x, \partial_\sigma x) = 0. \tag{11}$$
In this gauge the symplectic form is:
$$\omega = \int d\sigma\, (\delta_1 x, \overleftrightarrow{D}_\tau \delta_2 x). \tag{12}$$
In the Hamiltonian formalism, we introduce $p_A = \partial_\tau x_A \in T(AdS_5)$, the $AdS_5$ component of the momentum, and $p_S = \partial_\tau x_S \in T(S^5)$, the $S^5$-component of the momentum. Now we will interpret the string as a collection of particles parametrized by $\sigma$. We are tempted to interpret the vector field (9), (10) acting pointwise in $\sigma$ as the required $U(1)_L$ symmetry. The generator of this symmetry would be $\int d\sigma\, |p_S|$. But this would be wrong. This field preserves the symplectic structure, does have periodic trajectories and acts correctly on the null surfaces. But unfortunately it does not preserve
the gauge (11). It only commutes with the second constraint, $(p, \partial_\sigma x) = 0$. But it does not commute with the first one, $(p, p) + (\partial_\sigma x, \partial_\sigma x) = 0$. Indeed, it commutes with $(\partial_\tau x_S)^2 = (p_S, p_S)$ but not with $(\partial_\sigma x_S)^2$. Therefore we should modify this vector field so that it still has periodic trajectories, but also commutes with the constraint. There is a systematic procedure to do this, order by order in $\frac{1}{(p_S, p_S)}$, developed in [12]. Let us summarize this procedure, or perhaps a variation of it. To make sure that the modified vector field is Hamiltonian (preserves the symplectic structure) we construct it as a conjugation of $\frac{\partial}{\partial\psi}$ with some canonical transformation, which we denote $F$:
$$V.x = F^{-1}\, \frac{\partial}{\partial\psi}.F[x] \tag{13}$$
or schematically $V = F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$. Since $F$ is a canonical transformation, $V$ is automatically a Hamiltonian vector field. Since $F$ is single-valued, $V$ generates periodic trajectories. It remains to construct $F$ such that $V$ commutes with the constraint $(p, p) + (\partial_\sigma x)^2$. But to require that $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ commutes with $(p, p) + (\partial_\sigma x)^2$ is the same as to require that $\frac{\partial}{\partial\psi}$ commutes with $F^*[(p_S, p_S) + (\partial_\sigma x_S)^2]$, the pullback of $(p_S, p_S) + (\partial_\sigma x_S)^2$ by $F$. Therefore we have to find such a canonical transformation $F$ that the pullback of $(p_S, p_S) + (\partial_\sigma x_S)^2$ with $F$ is annihilated by the vector field $\frac{\partial}{\partial\psi}$. In other words, we have to find a canonical transformation which removes $\psi$ from $(p_S, p_S) + (\partial_\sigma x_S)^2$; after this canonical transformation $|p_S|^2 + (\partial_\sigma x_S)^2$ becomes $|p_S|^2 + \phi_0 + \phi_1 + \ldots$, where all the $\phi_k$ for $k \geq 0$ are in involution with $\int |p_S(\sigma)|\, d\sigma$ and $\phi_k$ is of the order $1/|p_S|^{2k}$. This was done in Sect. 3 of [12]. The canonical transformation can be expanded in $1/(p_S, p_S)$; the corresponding generating function is expanded in the odd powers of $1/|p_S|$. The authors of [12] gave the explicit expression for $F$ to the first order in $1/|p_S|$, but they also give a straightforward algorithm for constructing the higher orders. (We will reconsider the higher orders from a slightly different point of view in Sect. 4.4, perhaps making this algorithm more precise.) At the first order we need to find $h^{(1)}$ such that the canonically transformed constraint, which is a function of $\sigma$:
$$(p_S, p_S)(\sigma) + (\partial_\sigma x_S, \partial_\sigma x_S)(\sigma) + \left\{h^{(1)},\ \left[(p_S, p_S)(\sigma) + (\partial_\sigma x_S, \partial_\sigma x_S)(\sigma)\right]\right\}$$
has zero Poisson bracket with $\int d\sigma\, |p_S|(\sigma)$ up to the terms of the order $1/|p_S|^3$, for every $\sigma$. And $h^{(1)}$ should be of the order $1/|p_S|$. In other words, we should have:
$$\left\{ \int d\sigma'\, |p_S|(\sigma'),\ (\partial_\sigma x_S)^2(\sigma) + \left\{h^{(1)}, |p_S|^2(\sigma)\right\} \right\} = 0. \tag{14}$$
One can see that
$$h^{(1)} = -\frac{1}{4} \int d\sigma \left( \frac{p_S}{|p_S|},\ D_\sigma \frac{\partial_\sigma x_S}{|p_S|} \right)(\sigma) \tag{15}$$
works. Notice also that this $h^{(1)}$ is reparametrization invariant (where $|p_S|$ transforms as a density of weight one). Therefore it commutes also with the second constraint $(p_S, \partial_\sigma x_S) = \mathrm{const}$. Therefore, to the first order in $1/|p_S|$ the canonical transformation we are looking for is generated by this $h^{(1)}$. Then the generator of the $U(1)_L$ is, up to the terms of the order $1/|p_S|^3$:
$$E = \int d\sigma\, |p_S| - \left\{h^{(1)}, \int d\sigma\, |p_S|(\sigma)\right\} + \ldots = \int d\sigma \left[ |p_S| + \frac{1}{4|p_S|} \left( (\partial_\sigma x_S)^2 - \left(D_\sigma \frac{p_S}{|p_S|}, D_\sigma \frac{p_S}{|p_S|}\right) - \frac{(p_S, \partial_\sigma x_S)^2}{(p_S, p_S)} \right) + \ldots \right]. \tag{16}$$
One can see immediately that the trajectories of this charge are closed up to the terms of the order subleading to $1/|p_S|$. Indeed, we have explained in Sect. 3.2.1 why the leading term gives periodic trajectories. And the second term (which as we have seen is needed to make the charge commute with the Virasoro constraints) averages to zero on the periodic trajectories of the first term. Therefore (see for example Sect. 3 of [11]) the trajectories of $E$ do not drift at this order. We will discuss the higher orders in Sect. 4.4.
4. Length of the Operator and Local Conserved Charges

We have seen that the null-surface perturbation theory has a “hidden” symmetry $U(1)_L$. The existence of $U(1)$ symmetries acting on the phase space is typical for integrable systems, at least for those which have a finite-dimensional phase space. Corresponding conserved charges are called action variables [19]. Classical string in $AdS_5 \times S^5$ is an integrable system. Therefore, we should not be surprised to find such an action variable (footnote 3). The local conserved charges in involution for the classical string in $AdS_5 \times S^5$ are explicitly known. Therefore, instead of constructing $U(1)_L$ in the perturbation theory, we can try to build it as some linear combination of the already known conserved charges. In this section, we will argue that the coefficients of this linear combination are actually fixed by the calculation of [21, 22].

Footnote 3. Strictly speaking, the integrability is not necessary for the construction of the action variable in perturbation theory. A typical example is a particle on a sphere $S^2$ in an arbitrary (polynomial) potential. When the particle moves very fast, it does not feel the potential. All the trajectories are periodic in the limit of an infinite velocity. Therefore on the boundary of the phase space, when the velocity is infinite, we have an action variable $|p|$, the absolute value of the momentum. It is well known that the perturbation theory in $1/|p|$ allows us to extend this action variable from the boundary inside the phase space, but only in the perturbation theory. For an arbitrary potential, the perturbation series must diverge, because in fact there is no additional conserved quantity besides the energy. Therefore the $U(1)$ will be actually broken by effects which are not visible in the perturbation theory, unless the potential is such that the system is integrable. We want to thank V. Kaloshin and A. Starinets for discussions of this subject.

4.1. Local conserved charges. Consider a string in the target space which is a product of two manifolds $A$ and $S$. We assume that the metric on $A$ has the Lorentzian signature, and the metric on $S$ has the Euclidean signature. We will need $A = AdS_5$ and $S = S^5$, but let us first consider the general $A \times S$. The string worldsheet will be denoted $\Sigma$. The classical trajectory of the string is an embedding $x : \Sigma \to A \times S$. We are going to use the fact that the target space is a direct product. A point of $A \times S$ is obviously a pair $(x_A, x_S)$, where $x_A$ is a point of $A$ and $x_S$ is a point of $S$. Therefore for each point $\zeta \in \Sigma$ we have $x(\zeta) = (x_A(\zeta), x_S(\zeta))$, where $x_A \in A$ and $x_S \in S$. Consider the 1-forms $dx_A$ and $dx_S$ on the string worldvolume, $dx_A$ taking values in $T_x A$ and $dx_S$ in $T_x S$. In other words, $\begin{pmatrix} dx_A \\ dx_S \end{pmatrix}$ is a differential of $x$. The metric on $A \times S$ has the Lorentzian signature, and we consider the string worldsheets which have the induced metric with the Lorentzian signature. Pick two vector fields $\xi_+$ and $\xi_-$ on $\Sigma$, which are both lightlike but have a nonzero scalar product:
$$(\xi_+, \xi_+) = 0, \qquad (\xi_-, \xi_-) = 0, \qquad (\xi_+, \xi_-) \neq 0.$$
These vector fields have a simple geometrical meaning. Since the worldsheet is two-dimensional, at each point we have two different lightlike directions. The vector $\xi_+$ points along one lightlike direction, and $\xi_-$ points along another. Pick a spacial contour $C$ on $\Sigma$, and a 1-form $\nu$ on $\Sigma$ such that $\nu(\xi_-) = 0$ and $\nu(\xi_+) \neq 0$. Consider the following functional:
$$Q^{[1]}[x] = \int_C \nu\, \frac{\sqrt{(dx_S(\xi_+))^2}}{\nu(\xi_+)}. \tag{17}$$
We will prove that this functional does not depend on a particular choice of $\xi_+$, $\xi_-$, $\nu$ and $C$. This is therefore a correctly defined functional on the phase space of the string. Indeed, the only ambiguity in the choice of $\xi_+$ is $\xi_+ \to f(\zeta)\xi_+$, where $f$ is some function on the worldsheet. But this function cancels in (17). The ambiguity in the choice of $\xi_-$ and $\nu$ is also in rescaling, which does not change (17). It remains to prove that (17) does not depend on the choice of the integration contour $C$. To prove that (17) is independent of $C$, let us choose coordinates $(\tau^+, \tau^-)$ on the worldsheet in such a way that the induced metric is $ds^2 = \rho(\tau^+, \tau^-)\, d\tau^+ d\tau^-$. Then $\xi_+$ is proportional to $\frac{\partial}{\partial\tau^+}$ and $\xi_-$ is proportional to $\frac{\partial}{\partial\tau^-}$. In these coordinates
$$Q^{[1]} = \int_C d\tau^+ \sqrt{(\partial_+ x_S)^2}. \tag{18}$$
The variation of $Q^{[1]}$ under the variation of the contour is measured by the differential of the form:
$$d\left[ d\tau^+ \sqrt{(\partial_+ x_S)^2} \right] = -d\tau^+ \wedge d\tau^-\, \frac{(\partial_+ x_S, D_- \partial_+ x_S)}{\sqrt{(\partial_+ x_S)^2}}.$$
But on the equations of motion $D_+ \partial_- x_S = 0$. Therefore the integral does not depend on the choice of the contour. Let us explain why on the equations of motion we have $D_+ \partial_- x_S = 0$. Let $N$ be the second quadratic form of the surface, $N : S^2(T\Sigma) \to \mathcal{N}$ (here $\mathcal{N} = T(A \times S)/T\Sigma$ is the normal bundle to $\Sigma$ in $A \times S$). The second quadratic form is defined in the following way: suppose that the particle moves on $\Sigma$ with the velocity $v$, then the acceleration of the particle is $N(v)$ modulo a vector parallel to $T\Sigma$. For the surface to be extremal, the trace of $N$ should be zero. The trace of $N$ is the contraction of $N$ with the induced metric on $\Sigma$; it is a section of $\mathcal{N}$. The trace of $N$ is proportional to $D_+ \partial_- x$, therefore we should have:
$$D_+ \partial_- x = f^+(\tau^+, \tau^-)\, \partial_+ x + f^-(\tau^+, \tau^-)\, \partial_- x.$$
But notice that $(D_+ \partial_- x, \partial_- x) = (D_- \partial_+ x, \partial_+ x) = 0$, therefore $f^+ = f^- = 0$. Another conserved charge is:
$$\tilde Q^{[1]} = \int_C d\tau^- \sqrt{(\partial_- x_S)^2}. \tag{19}$$
Are there charges containing higher derivatives of $x_S$? Let us consider the following expression:
$$J_+^{[2]}(\tau_+, \tau_-) = \frac{1}{|\partial_+ x_S|} \left( D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|},\ D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|} \right). \tag{20}$$
Even though $D_+ \partial_- x_S = 0$ it is not true that $\partial_- J_+^{[2]}$ is zero. The covariant derivatives $D_+$ and $D_-$ do not commute, therefore $D_- D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|} \neq 0$. In fact, for any function $w : \Sigma \to T(A \times S)$ we have
$$[D_+, D_-]\, w = R(\partial_+ x, \partial_- x).w, \tag{21}$$
where $R$ is the Riemann tensor of $A \times S$. Now we have to start using that $S$ is a sphere. For $S = S^5$, the Riemann tensor is constructed from the metric tensor, and
$$[D_+, D_-]\, w = \partial_+ x\, (\partial_- x, w) - \partial_- x\, (\partial_+ x, w). \tag{22}$$
Now consider the following differential form:
$$\lambda = 2\, \frac{d\tau^-}{|\partial_+ x_S|}\, (\partial_- x_S, \partial_+ x_S) + \frac{d\tau^+}{|\partial_+ x_S|} \left( D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|},\ D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|} \right). \tag{23}$$
Using (22) we can show that $d\lambda = 0$, therefore $\lambda$ is a local conservation law. We use the formula $D_- D_+ \partial_+ x_S = (\partial_+ x_S)^2\, \partial_- x_S - (\partial_+ x_S, \partial_- x_S)\, \partial_+ x_S$ which is special for $S^5$. We will denote this charge $Q^{[2]}$. There is also a charge $\tilde Q^{[2]}$ which is obtained from (23) by replacing $\tau^+$ with $\tau^-$ and $\partial_+$ or $D_+$ with $\partial_-$ or $D_-$. These charges are just the first examples of an infinite family of charges, which are all in involution. This infinite family was constructed in [20]. A particularly important linear combination is
$$E_2 = \frac{1}{2} \left( Q^{[1]} - \tilde Q^{[1]} \right). \tag{24}$$
The construction of this charge requires only that the target space is a direct product of two manifolds.
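The closure of the form $\lambda$ of Eq. (23) can be checked in a few lines. What follows is a sketch of that check, not a quotation from [12] or [20]. The shorthand $\rho = |\partial_+ x_S|$ and $n = \partial_+ x_S / \rho$ is introduced here only for convenience, and the computation assumes $D_- \partial_+ x_S = D_+ \partial_- x_S = 0$ together with Eq. (22) applied to $w = n$:

```latex
\begin{align*}
\partial_- \rho &= \rho^{-1} (\partial_+ x_S, D_- \partial_+ x_S) = 0,
 \qquad D_- n = 0, \qquad (n, D_+ n) = 0, \\
D_- D_+ n &= -[D_+, D_-]\, n
 = -\rho\, (\partial_- x_S, n)\, n + \rho\, \partial_- x_S, \\
\partial_- \left[ \rho^{-1} (D_+ n, D_+ n) \right]
 &= 2 \rho^{-1} (D_- D_+ n, D_+ n) = 2\, (\partial_- x_S, D_+ n), \\
\partial_+ \left[ 2\, (\partial_- x_S, n) \right]
 &= 2\, (D_+ \partial_- x_S, n) + 2\, (\partial_- x_S, D_+ n)
 = 2\, (\partial_- x_S, D_+ n).
\end{align*}
```

Writing $\lambda = a\, d\tau^- + b\, d\tau^+$ with $a = 2 (\partial_- x_S, n)$ and $b = \rho^{-1} (D_+ n, D_+ n)$, one has $d\lambda = (\partial_+ a - \partial_- b)\, d\tau^+ \wedge d\tau^-$, which vanishes by the last two lines. Multiplying the second line by $\rho$ gives $D_- D_+ \partial_+ x_S = (\partial_+ x_S)^2\, \partial_- x_S - (\partial_+ x_S, \partial_- x_S)\, \partial_+ x_S$, the identity quoted in the text.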
4.2. Local conserved charges are invariant under $U(1)_L$. Consider a local conserved charge $Q$ acting trivially on the AdS part of the worldsheet. In the conformal gauge, this means that $Q$ is constructed as a contour integral of some combination of $x_S$ and $p_S$. Let us decompose $Q$ in the inverse powers of $|p_S|$:
$$Q = Q_m + Q_{m+1} + Q_{m+2} + \ldots, \tag{25}$$
where $m$ is a non-negative integer, the “order” of the charge; $Q_m$ is of the order $1/|p_S|^{2m-1}$, $Q_{m+1}$ is of the order $1/|p_S|^{2m+1}$, etc. We have to require that $Q$ is in involution with the Virasoro constraints. In particular, it should be in involution with $|p_S(\sigma)|^2 + (\partial_\sigma x_S(\sigma))^2$ for an arbitrary $\sigma$. (Here we used that $Q$ is trivial in the AdS-part.) Let us now apply the canonical transformation $F$ which we described in Sect. 3.2.2. After this canonical transformation $|p_S|^2 + (\partial_\sigma x_S)^2$ becomes $|p_S|^2 + \phi_0 + \phi_1 + \ldots$, where all the $\phi_k$ for $k \geq 0$ are in involution with $\int |p_S(\sigma)|\, d\sigma$ and $\phi_k$ is of the order $1/|p_S|^{2k}$. And $Q = Q_m + Q_{m+1} + \ldots$ becomes $Q' = Q'_m + Q'_{m+1} + \ldots$, where $Q'$ is the canonically transformed $Q$. We should have:
$$\left\{ |p_S(\sigma)|^2 + \phi_0(\sigma) + \phi_1(\sigma) + \ldots,\ Q'_m + Q'_{m+1} + \ldots \right\} = 0 \tag{26}$$
for an arbitrary $\sigma$. At the leading order in $|p_S|$ this implies that $\int d\sigma\, |p_S(\sigma)|$ is in involution with $Q'_m$. At the next order, it follows that for all values of $\sigma$ the expression $\{|p_S(\sigma)|^2, Q'_{m+1}\}$ is in involution with $\int d\sigma\, |p_S(\sigma)|$. This implies that:
$$\left\{ \int d\sigma\, |p_S(\sigma)|,\ \left\{ \int d\sigma\, |p_S(\sigma)|,\ Q'_{m+1} \right\} \right\} = 0. \tag{27}$$
Since the vector field generated by $\int d\sigma\, |p_S(\sigma)|$ is periodic, this equation implies that $\int d\sigma\, |p_S(\sigma)|$ is in involution with $Q'_{m+1}$. Indeed, by (27) the function $Q'_{m+1}$ changes linearly with the parameter along each orbit of this vector field, and a linear function on a closed orbit must be constant. An analogous argument at higher orders shows that all the $Q'_{m+j}$ commute with $\int d\sigma\, |p_S(\sigma)|$. Therefore $Q$ is in involution with the expression $F^* \int d\sigma\, |p_S(\sigma)|$ which is the generator of $U(1)_L$. The conserved charges of [20] do have an expansion of the form (25), therefore they should commute with $U(1)_L$. This reinforces our conjecture that $U(1)_L$ should be a combination of the local conserved charges.

4.3. A geometrical meaning of $U(1)_L$. We can try to make more transparent the geometrical meaning of $U(1)_L$ by drawing an analogy with the Liouville theorem for finite-dimensional integrable systems. A mechanical system with $2n$-dimensional phase space is integrable if there are $n$ functions $F_1, \ldots, F_n$ in involution with each other, and the Hamiltonian is a function of $F_1, \ldots, F_n$. Then, there are $n$ action variables $I_1, \ldots, I_n$, each of them being some combination of $F_1, \ldots, F_n$: $I_j = I_j(F_1, \ldots, F_n)$, such that each $I_j$ generates $U(1)$ (has periodic orbits). In this paper we are dealing with an infinite-dimensional system, a classical string in $AdS_5 \times S^5$. We can take the first Pohlmeyer charge $Q^{[1]} - \tilde Q^{[1]}$ as a Hamiltonian (footnote 4). This Hamiltonian is presumably integrable, because there is an infinite family of higher charges commuting with it. On the other hand, it does not have any special periodicity properties (we do not see any reason why it would). This means that the closure of the orbit of $Q^{[1]} - \tilde Q^{[1]}$ is an invariant torus. Our $U(1)_L$ commutes with $Q^{[1]} - \tilde Q^{[1]}$. (This fact is seen immediately, because $Q^{[1]}$ can be rewritten as $\int d\tau^+ \sqrt{-(\partial_+ x_A, \partial_+ x_A)}$ and by definition $U(1)_L$ does not act on the AdS-part of the worldsheet.) Therefore $U(1)_L$ should be a shift of one of the angles parametrizing the invariant torus of $Q^{[1]} - \tilde Q^{[1]}$. The angles parametrizing the invariant torus are in correspondence with its one-dimensional cycles. Which cycle corresponds to $U(1)_L$? Every invariant torus can be connected by a one-parameter family of invariant tori to a torus on the boundary of the phase space (or one very close to the boundary). This means that every 1-cycle is connected to some 1-cycle on a torus on the boundary, the space of null-surfaces. We should take that 1-cycle which is connected to the orbit of $U(1)_L$ on the null-surfaces, described in Sect. 3.1. The corresponding action variable is $E$, the generator of $U(1)_L$. These arguments show the uniqueness of $U(1)_L$. The first Pohlmeyer charge $Q^{[1]} - \tilde Q^{[1]}$ has a special property: it actually generates $U(1)_L$ on the boundary. Therefore the difference between $Q^{[1]} - \tilde Q^{[1]}$ and $E$ should be a combination of charges vanishing at the boundary. We expect that this is an infinite and linear combination. Indeed, the construction of [12] tells us that the charge we are looking for is local at each order in $1/|p_S|$. A nonlinear combination of the charges would be non-local (a product of integrals).

Footnote 4. It is a natural Hamiltonian on the phase space of a classical string in any case when the target space is a direct product of two manifolds.
4.4. A different point of view on the perturbation theory; higher orders. In Sect. 3 we constructed $U(1)_L$ as $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$, where $\frac{\partial}{\partial\psi}$ is generated by $\int d\sigma\,|p_S(\sigma)|$ and $F$ is the canonical transformation such that $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ commutes with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$. This canonical transformation is constructed in the perturbation theory, order by order in $\frac{1}{|p_S|^2}$. A disadvantage of this procedure is that at each order we have to require that our $U(1)_L$ commutes with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$ for any $\sigma$. Since there are infinitely many values of $\sigma$, we have to impose infinitely many conditions on $F$ at each order. At the first order, we have seen in Sect. 3.2.2 that these conditions are not really independent; one generating function $h^{(1)}$ takes care of all of them (see Eq. (14)). At the higher orders, this is not immediately obvious.

Therefore, we would like to propose a slightly different way of constructing $F$. Let us forget for a moment about the Virasoro constraint; instead of the phase space of the string consider the space of harmonic maps $x(\tau, \sigma)$. Instead of requiring that $U(1)_L$ commutes with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$, let us require that $U(1)_L$ commutes with $Q^{[1]} = \int d\sigma\,|\partial_+ x_S(\sigma)|$. We will see that the requirement that $U(1)_L$ commutes with $Q^{[1]}$ already fixes $U(1)_L$ in the perturbation theory, and the resulting $U(1)_L$ will automatically commute with the Virasoro constraints. As in Sect. 3, we look for the generator of $U(1)_L$ as a pullback by a canonical transformation of $\int |p_S(\sigma)|\,d\sigma$. In other words, let us look for such a canonical transformation $F$ that $\int d\sigma\,|p_S(\sigma)|$ commutes with $F^* Q^{[1]}$ (the pullback of $Q^{[1]}$ by $F$). We can construct such a canonical transformation order by order in the perturbation theory. Let us denote $K = \int d\sigma\,|p_S(\sigma)|$. We have:

$$Q^{[1]} = K + q_1 + q_2 + \ldots. \tag{28}$$

Under the rescaling $p_S \to t p_S$: $K \to tK$, $q_1 \to t^{-1} q_1$, $q_2 \to t^{-3} q_2$, $q_m \to t^{1-2m} q_m$. The symplectic structure is of the degree 1: $\omega \to t\omega$, therefore the Poisson brackets are of the degree $-1$: $\{\,,\} \to t^{-1}\{\,,\}$. We can construct $F$ order by order in this grading. We have:

$$F^*(Q^{[1]}) = K + q_1 + q_2 + \ldots + q_m + \ldots. \tag{29}$$

Suppose that we have already found $F$ such that $q_1, \ldots, q_{m-1}$ commute with $K$. At the order $m$, we want to modify $F$ by the canonical transformation with the generating function $f_m$ of the order $|p_S|^{1-2m}$ so that $q_m + \{f_m, K\}$ commutes with $K$. Since $K$ is periodic, we can decompose

$$q_m = q_{m,0} + \sum_{k \neq 0} q_{m,k}, \tag{30}$$

where $\{K, q_{m,k}\} = ik\, q_{m,k}$. Then we should take

$$f_m = \sum_{k \neq 0} \frac{1}{ik}\, q_{m,k}. \tag{31}$$

Indeed, with this choice $\{f_m, K\} = -\sum_{k \neq 0} q_{m,k}$, so that $q_m + \{f_m, K\} = q_{m,0}$, which commutes with $K$.
Repeating this procedure at higher orders, we end up with the function $F$ such that $\{K, F^*(Q^{[1]})\} = 0$. The reparametrization invariance is manifestly preserved at each order, therefore the resulting charge $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ will commute with $(p_S, \partial_\sigma x_S)(\sigma)$ for any $\sigma$. Also, the fact that $Q^{[1]}$ is reparametrization-invariant and the arguments analogous to the discussion at the end of Sect. 4.2 show that $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ will automatically commute with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$, as well as with the higher Pohlmeyer charges. Indeed, we know that $F^* Q^{[1]} = K + q_1 + q_2 + \ldots$ commutes with $F^*(|p_S|^2(\sigma) + |\partial_\sigma x_S|^2(\sigma)) = |p_S|^2 + \phi_0 + \phi_1 + \ldots$; therefore

$$\{K, \phi_0\} = \{|p_S|^2(\sigma), q_1\} \;\Rightarrow\; \{K, \{K, \phi_0\}\} = 0 \;\Rightarrow\; \{K, \phi_0\} = 0,$$
$$\{K, \phi_1\} + \{q_1, \phi_0\} = \{|p_S|^2(\sigma), q_2\} \;\Rightarrow\; \{K, \{K, \phi_1\}\} = 0 \;\Rightarrow\; \{K, \phi_1\} = 0,$$

etc. We used the periodicity of the trajectories of $K$ when we claimed that $\{K, \{K, \phi\}\} = 0$ implies $\{K, \phi\} = 0$. Indeed, for any functional $\phi$ on the phase space, if $\{K, \{K, \phi\}\} = 0$ then $\{K, \phi\}$ is constant on the trajectories of $K$. But if this constant were nonzero, then the change of $\phi$ along the trajectory of $K$ would accumulate over the period of $K$, which would contradict the single-valuedness of $\phi$ on the phase space.

4.5. An infinite combination of local conserved charges. Expanding (23) in the conformal gauge in the powers of $\frac{1}{|p_S|}$ we get:

$$\frac{1}{2}\left(Q^{[2]} - \tilde Q^{[2]}\right) = \int d\sigma \left[ -2|p_S| + \frac{3}{|p_S|}(\partial_\sigma x_S)^2 - \frac{3}{|p_S|^3}(p_S, \partial_\sigma x_S)^2 + \frac{4}{|p_S|}\left(D_\sigma \frac{p_S}{|p_S|}, D_\sigma \frac{p_S}{|p_S|}\right) + \ldots \right]. \tag{32}$$

And for $Q^{[1]}$ we get:

$$Q^{[1]} - \tilde Q^{[1]} = \int d\sigma \left[ 2|p_S| + \frac{1}{|p_S|}(\partial_\sigma x_S)^2 - \frac{1}{|p_S|^3}(p_S, \partial_\sigma x_S)^2 + \ldots \right]. \tag{33}$$

We have:

$$\frac{1}{16}\left[ 7\left(Q^{[1]} - \tilde Q^{[1]}\right) - \frac{1}{2}\left(Q^{[2]} - \tilde Q^{[2]}\right) \right] = \int d\sigma \left[ |p_S| + \frac{1}{4|p_S|}\left( (\partial_\sigma x_S)^2 - \left(D_\sigma \frac{p_S}{|p_S|}, D_\sigma \frac{p_S}{|p_S|}\right) - \frac{(p_S, \partial_\sigma x_S)^2}{(p_S, p_S)} \right) + \ldots \right].$$

This coincides with the result (16) for $E$ which we know from the perturbation theory. We see that up to the terms of the order $\frac{1}{|p_S|^3}$ the Hamiltonian of $U(1)_L$ can be represented as a sum of the first two commuting local charges. We conjecture that $U(1)_L$ is in fact an infinite combination of the local conserved charges. The perturbation theory construction suggests that it should be a worldsheet parity-invariant combination.

The coefficients of this linear combination can be found from considering the conserved charges of particular solutions. There is a special class of fast moving strings, the so-called “rigid” strings. For these “rigid” strings, the corresponding field theory operators are known a priori. These operators provide local extrema of the anomalous dimension in the sector with the given charges. These “rigid” solutions were classified in [10, 27]. They are related to the solutions of the Neumann integrable system. The local conserved charges of some rigid strings were computed in [21, 22]. In [21] the local conserved charges are denoted $E_k$. (This agrees with our notation $E$ for the Hamiltonian of $U(1)_L$.) The precise definition of $E_k$ is given in Sect. 3 of [21]. The relation to our notations is: $Q^{[1]} - \tilde Q^{[1]} = 2E_2$, $Q^{[2]} + 2Q^{[1]} - \tilde Q^{[2]} - 2\tilde Q^{[1]} = -4E_4$. The conserved charges have the following structure:

$$E_n = \delta_{2,n}\,\mathcal{J} + \frac{\epsilon_n^{(1)}}{\mathcal{J}} + \frac{\epsilon_n^{(2)}}{\mathcal{J}^3} + \frac{\epsilon_n^{(3)}}{\mathcal{J}^5} + \ldots, \tag{34}$$

where $\mathcal{J}^{-2} = \lambda/J^2$, and $J$ is a particular combination of the $SO(6)$ momenta. The coefficients $\epsilon_n^{(m)}$ depend on what kind of a rigid string is considered (the ratio of spins). But the authors of [21] noticed that the coefficients $\epsilon_n^{(m)}$ for different values of $n$ are not independent. For all the solutions they considered, they find that:

$$E_{10} + \frac{74}{7} E_8 + \frac{1898}{35} E_6 + \frac{6922}{35} E_4 + \frac{32768}{35}(E_2 - \mathcal{J}) \sim \frac{1}{\mathcal{J}^9}. \tag{35}$$

This means that up to the terms of the order $1/|p_S|^9$ we should have:

$$\mathcal{J} = E_2 + \frac{6922}{32768} E_4 + \frac{1898}{32768} E_6 + \frac{370}{32768} E_8 + \frac{35}{32768} E_{10} + \ldots. \tag{36}$$
At first this formula looks rather strange, because it seems to imply that a certain combination of Pohlmeyer charges (which all commute with $SO(6)$) is equal to some component of the angular momentum (which transforms in the adjoint of $SO(6)$). We propose the following resolution of this puzzle. The right-hand side of (36) is actually the action variable, which for a particular class of the solutions considered in [21, 22] happens to be equal to the $SO(6)$ charge $\mathcal{J}$ (because these particular solutions correspond to the chiral operators on the field theory side; see Sect. 2 of [11]). In other words, for this particular class of solutions the angular momentum $\mathcal{J}$ should be equal to our action variable $E$. The general formula should have on the left-hand side $E$, the generator of $U(1)_L$, instead of $\mathcal{J}$:

$$E = E_2 + \frac{6922}{32768} E_4 + \frac{1898}{32768} E_6 + \frac{370}{32768} E_8 + \frac{35}{32768} E_{10} + \ldots. \tag{37}$$

This gives the expansion of the generating function of $U(1)_L$ to the order $\frac{1}{|p_S|^9}$. It would be interesting to check explicitly, beyond the order $1/|p_S|$, that this Hamiltonian generates periodic trajectories.
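The arithmetic behind two statements of this subsection can be checked mechanically. The sketch below is only a bookkeeping aid, under the assumption that the integrands of (32) and (33) are those of $\frac{1}{2}(Q^{[2]} - \tilde Q^{[2]})$ and $Q^{[1]} - \tilde Q^{[1]}$ respectively; the basis ordering and the function names are illustrative choices, not notation from the text. It verifies that $\frac{1}{16}[7\cdot(33) - (32)]$ reproduces the stated integrand, and that dividing (35) by $32768/35$ gives the coefficients in (36) and (37).

```python
from fractions import Fraction as Fr

# Coefficients of the integrands in the (assumed) basis
#   [ |p_S|, (d_sigma x_S)^2/|p_S|, (p_S, d_sigma x_S)^2/|p_S|^3, (D p/|p|, D p/|p|)/|p_S| ]
Q1 = [Fr(2), Fr(1), Fr(-1), Fr(0)]        # Q^[1] - Q~^[1], read off from (33)
HALF_Q2 = [Fr(-2), Fr(3), Fr(-3), Fr(4)]  # (1/2)(Q^[2] - Q~^[2]), read off from (32)


def two_charge_combination():
    """Coefficients of (1/16) * [7 * (33) - (32)] in the same basis."""
    return [(7 * a - b) / 16 for a, b in zip(Q1, HALF_Q2)]


def coefficients_of_36():
    """Divide relation (35) by 32768/35 to get the coefficients of E4..E10 in (36)."""
    c35 = {"E4": Fr(6922, 35), "E6": Fr(1898, 35), "E8": Fr(74, 7), "E10": Fr(1)}
    norm = Fr(32768, 35)
    return {name: c / norm for name, c in c35.items()}


if __name__ == "__main__":
    print(two_charge_combination())
    print(coefficients_of_36())
```

Exact rational arithmetic is used so that the comparison with the printed fractions involves no rounding.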
4.6. More on the perturbation theory. Here we want to present a slightly different and perhaps simpler way of thinking about the continuation of $U(1)_L$ in the perturbation theory. Consider the Hamiltonian vector field $\xi_{E_2}$ corresponding to the first Pohlmeyer charge $E_2$. Consider the canonical transformation

$$F = e^{2\pi \xi_{E_2}}. \tag{38}$$

This canonical transformation is the Hamiltonian flow generated by $E_2$ over the time $2\pi$. The trajectories of $E_2$ are almost periodic in the null-surface limit, therefore we can write

$$F = e^{v_1}, \tag{39}$$

where $v_1$ is a vector field of the order $1/|p_S|^2$. This vector field can be constructed in the following way. Let us choose the conformal gauge on the worldsheet. We know from (33) that $E_2 = \int d\sigma\,|p_S| + f = K + f$, where $f$ is of the order $1/|p_S|$. Taking into account that $e^{2\pi \xi_K} = 1$ we get

$$F = 1 + \int_0^{2\pi} ds\, e^{-s\xi_K} \xi_f e^{s\xi_K} + \int_{s_1 < s_2} ds_2\, ds_1\, e^{-s_2 \xi_K} \xi_f e^{s_2 \xi_K}\, e^{-s_1 \xi_K} \xi_f e^{s_1 \xi_K} + \ldots$$
$$= \exp\left[ \int_0^{2\pi} ds\, e^{-s\xi_K} \xi_f e^{s\xi_K} + \frac{1}{2} \int_{s_1 < s_2} ds_1\, ds_2\, \left[ e^{-s_2 \xi_K} \xi_f e^{s_2 \xi_K},\; e^{-s_1 \xi_K} \xi_f e^{s_1 \xi_K} \right] + \ldots \right]. \tag{40}$$

This defines

$$v_1 = \int_0^{2\pi} ds\, e^{-s\xi_K} \xi_f e^{s\xi_K} + \frac{1}{2} \int_{s_1 < s_2} ds_1\, ds_2\, \left[ e^{-s_2 \xi_K} \xi_f e^{s_2 \xi_K},\; e^{-s_1 \xi_K} \xi_f e^{s_1 \xi_K} \right] + \ldots$$

in the perturbation theory. (Notice that $f$ can be decomposed in the Fourier series $f = \sum_k f_k$ so that $\{K, f_k\} = ik f_k$, and then the leading term of $v_1$ is the zero mode $\xi_{f_0}$; this is the “averaging” procedure of [11].) The vector field $v_1$ defines the vector field on the moduli space of null-surfaces as a limit $\lim_{|p_S| \to \infty}(E_2^2 v_1) = \lim_{|p_S| \to \infty}(L^2 v_1)$ (where $L$ was defined in Sect. 2.3). This vector field determines the slow evolution of the null-surface; it is the Hamiltonian vector field of the Landau-Lifshitz model on the moduli space of the null-surfaces modulo $U(1)_L$ [11, 9].

As in [21] we can consider the improved currents $\mathcal{E}_{2n}$. By definition $\mathcal{E}_{2n}$ is a linear combination of $E_2, \ldots, E_{2n}$ such that $\mathcal{E}_{2n} = O(|p_S|^{-2n+3})$. The Hamiltonian of the Landau-Lifshitz model is the null-surface limit of $\mathcal{E}_4$, more precisely $\lim_{|p_S| \to \infty}(E_2 \mathcal{E}_4)$. Given that $E_2$ and $\mathcal{E}_4$ are in involution, this implies that for some $a_1$ we have

$$F_1 = e^{2\pi(\xi_{E_2} + a_1 \xi_{\mathcal{E}_4})} = e^{v_2}, \tag{41}$$

where $v_2$ is of the order $1/|p_S|^4$. Again, $\lim_{|p_S| \to \infty}(E_2^4 v_2)$ determines a vector field on the moduli space of null-surfaces. (It can be also defined as $\lim_{|p_S| \to \infty}(L^4 v_2)$.) This vector field commutes with the time evolution of the Landau-Lifshitz model. We conjecture that this vector field is generated by the second conservation law of the Landau-Lifshitz model, which is proportional to the null-surface limit$^5$ of $\mathcal{E}_6$, more precisely $\lim_{|p_S| \to \infty}(E_2^3 \mathcal{E}_6)$ or $\lim_{|p_S| \to \infty}(L^3 \mathcal{E}_6)$. Repeating this procedure we get

$$e^{2\pi(\xi_{E_2} + a_1 \xi_{\mathcal{E}_4} + a_2 \xi_{\mathcal{E}_6} + \ldots)} = 1$$

in the perturbation theory. These arguments lead us to the following conclusion. First, we see once again that there is a linear combination $E_2 + a_1 \mathcal{E}_4 + a_2 \mathcal{E}_6 + \ldots$ generating periodic trajectories. Second, the moduli space of null-surfaces in $AdS_5 \times S^5$ modulo $U(1)_L$ is naturally equipped with the infinite tower of Hamiltonians in involution which are the null-surface limit of the Pohlmeyer charges. This is the generalized Landau-Lifshitz model.

5. Conclusion

Given a manifold with the metric of the Lorentzian signature it is possible to construct the extremal surfaces in this manifold as perturbations of the null-surfaces. In the special case when the manifold is $AdS_5 \times S^5$ the AdS/CFT correspondence predicts that the extremal surfaces (which are the same as classical string worldsheets) correspond to the states of the large R-charge in the $N = 4$ supersymmetric Yang-Mills theory. From this point of view considering the extremal surface as a perturbation around the null-surface corresponds to considering the state of the interacting Yang-Mills theory as a perturbation of the state of the free Yang-Mills theory. This correspondence has the following important features:

1. Locality. In the planar limit (the limit of infinitely many colors) the Yang-Mills perturbation theory is local in the following sense: the Feynman diagrams involve only interactions of those elementary field operators which stand next to each other in the product under the trace. We expect that the correspondence between the parton chains and the string worldsheets is local in each order of the perturbation theory, and therefore the locality of the planar Yang-Mills perturbation theory should correspond to the locality of the string worldsheet theory.

2. Integrability. The classical string worldsheet theory in $AdS_5 \times S^5$ is an integrable system. Because of the integrability, there is an infinite family of local conserved charges in involution.
In this paper we have argued that an infinite linear combination of these local charges generates periodic trajectories on the string phase space. This statement can be verified order by order in the null-surface perturbation theory, and it is local at each order. This means that the “slow evolution” of nearly-degenerate extremal surfaces [11] is essentially controlled by the Pohlmeyer charges (we will further discuss the slow evolution and how it is related to the Pohlmeyer charges in the second paper of [28]).

$^5$ Notice that the null-surface limit of $\mathcal{E}_{2n}$ is invariant under the $U(1)_L$ symmetry of the null-surfaces, because the $U(1)_L$ symmetry of the null-surfaces is generated by the conserved quantity $E_2$ which commutes with $\mathcal{E}_{2n}$.

It would be interesting to further study the null-surface perturbation theory from the point of view of the integrability. It would be especially interesting to study those manifestations of the integrability which are local. The Bäcklund transformations [20] are one example. They allow us to construct a new extremal surface from a given extremal surface, and in the null-surface perturbation theory these transformations are well-defined and local at each order. The Bäcklund transformations are closely related to the local conserved charges, and in fact the hidden symmetry $U(1)_L$ can be considered as a consequence of the special properties of these transformations. We will discuss the relation between $U(1)_L$ and the Bäcklund transformations in the third paper of [28].

The general problem is to study those aspects of the integrability which are local in the null-surface perturbation theory. (Without a reference to the null-surface perturbation theory, we would define the locality as some sort of an independence of the choice of the boundary conditions.) This problem arises also on the field theory side. The Feynman diagrams in the planar limit are local, but we usually compute the anomalous dimension of the single-trace operators, which requires summing over the whole parton chain. The spectrum of single-trace operators at large $N$ is certainly an invariant of the theory, but it is non-local. If it is true that the planar $N = 4$ Yang-Mills theory is integrable, it would be important to understand the integrability as much as possible in terms of the local properties of the parton chain (perhaps on the level of the individual Feynman diagrams).
Strictly speaking, we have defined $U(1)_L$ in the perturbation theory, but it should actually be well-defined in the domain of the string phase space where the velocity of the string is large enough. In other words, the series defining $U(1)_L$ in fact converges if the string moves fast enough. It would be interesting to study the global properties of $U(1)_L$.

An important question is what happens to $U(1)_L$ after the quantization. To answer this question we should first include fermions. Important steps in this direction were made recently in [29–31]. It would be interesting to understand better why the “length” is conserved on the field theory side. (Why is there a quantum number $L$ with a well-defined classical limit?) To what extent is the conservation of $L$ related to the integrability of the planar Yang-Mills theory? What happens to $L$ when we turn on the fermions?

Null-surfaces are obviously an important ingredient in our construction. The correspondence between the null-surfaces and the “engineering” operators in the free field theory is rather straightforward; the null-surfaces in $AdS_5 \times S^5$ appear very naturally in the description of the coherent states of the free theory [8, 9]. Is there any way to see directly on the field theory side that turning on the Yang-Mills interaction corresponds to the deformation of the null-surface into the extremal surface?

Acknowledgements. I would like to thank S. Frolov, A. Gorsky, V. Kaloshin, A. Kapustin, A. Marshakov, S. Minwalla, M. Van Raamsdonk, A. Starinets, K. Zarembo and especially A. Tseytlin for discussions, and G. Arutyunov for a correspondence on the local conserved charges. I want to thank A. Tseytlin for comments on the text. I want to thank the organizers of the String Field Theory Camp at the Banff International Research Station for their hospitality while this work was in progress. This research was supported by the Sherman Fairchild Fellowship and in part by the RFBR Grant No. 03-02-17373 and in part by the Russian Grant for the support of the scientific schools NSh-1999.2003.2.
References

1. Frolov, S., Tseytlin, A.A.: Semiclassical quantization of rotating superstring in AdS5 × S5. JHEP 0206, 007 (2002)
2. Tseytlin, A.A.: Semiclassical quantization of superstrings: AdS5 × S5 and beyond. Int. J. Mod. Phys. A18, 981 (2003)
3. Russo, J.G.: Anomalous dimensions in gauge theories from rotating strings in AdS5 × S5. JHEP 0206, 038 (2002)
4. Minahan, J.A., Zarembo, K.: The Bethe-Ansatz for N=4 Super Yang-Mills. JHEP 0303, 013 (2003)
5. Frolov, S., Tseytlin, A.A.: Multi-spin string solutions in AdS5 × S5. Nucl. Phys. B668, 77–110 (2003)
6. Frolov, S., Tseytlin, A.A.: Quantizing three-spin string solution in AdS5 × S5. JHEP 0307, 016 (2003)
7. Frolov, S.A., Park, I.Y., Tseytlin, A.A.: On one-loop correction to energy of spinning strings in S5. Phys. Rev. D71, 026006 (2005)
8. Kruczenski, M.: Spin chains and string theory. Phys. Rev. Lett. 93, 161602 (2004)
9. Mikhailov, A.: Supersymmetric null-surfaces. JHEP 0409, 068 (2004)
10. Mikhailov, A.: Speeding Strings. JHEP 0312, 058 (2003)
11. Mikhailov, A.: Slow evolution of nearly-degenerate extremal surfaces. J. Geom. Phys. 54, 228–250 (2005)
12. Kruczenski, M., Tseytlin, A.: Semiclassical relativistic strings in S5 and long coherent operators in N=4 SYM theory. JHEP 0409, 038 (2004)
13. De Vega, H.J., Nicolaidis, A.: Strings in strong gravitational fields. Phys. Lett. B295, 214–218 (1992); de Vega, H.J., Giannakis, I., Nicolaidis, A.: String Quantization in Curved Spacetimes: Null String Approach. Mod. Phys. Lett. A10, 2479–2484 (1995)
14. Mandal, G., Suryanarayana, N.V., Wadia, S.R.: Aspects of Semiclassical Strings in AdS5. Phys. Lett. B543, 81 (2002)
15. Bena, I., Polchinski, J., Roiban, R.: Hidden Symmetries of the AdS5 × S5 Superstring. Phys. Rev. D69, 046002 (2004)
16. Alday, L.F.: Non-local charges on AdS5 × S5 and PP-waves. JHEP 0312, 033 (2003)
17. Swanson, I.: On the Integrability of String Theory in AdS5 × S5. http://arxiv.org/list/hep-th/0405172, 2004
18. Swanson, I.: Quantum string integrability and AdS/CFT. Nucl. Phys. B709, 443–464 (2005)
19. Arnold, V.I.: Mathematical methods of classical mechanics. New York: Springer-Verlag, 1989
20. Pohlmeyer, K.: Integrable Hamiltonian Systems and Interactions through Quadratic Constraints. Commun. Math. Phys. 46, 207–221 (1976)
21. Arutyunov, G., Staudacher, M.: Matching Higher Conserved Charges for Strings and Spins. JHEP 0403, 004 (2004)
22. Engquist, J.: Higher Conserved Charges and Integrability for Spinning Strings in AdS5 × S5. JHEP 0404, 002 (2004)
23. Kazakov, V.A., Marshakov, A., Minahan, J.A., Zarembo, K.: Classical/quantum integrability in AdS/CFT. JHEP 0405, 024 (2004)
24. Beisert, N.: The su(2|3) Dynamic Spin Chain. Nucl. Phys. B682, 487–520 (2004)
25. Kruczenski, M., Ryzhov, A.V., Tseytlin, A.A.: Large spin limit of AdS5 × S5 string theory and low energy expansion of ferromagnetic spin chains. Nucl. Phys. B692, 3–49 (2004)
26. Minahan, J.: Higher Loops Beyond the SU(2) Sector, hep-th/0405243
27. Arutyunov, G., Russo, J., Tseytlin, A.A.: Spinning strings in AdS5 × S5: new integrable system relations, hep-th/0311004
28. Mikhailov, A.: Plane wave limit of local conserved charges, hep-th/0502097; Anomalous dimension and local charges, hep-th/0411178; An action variable of the sine-Gordon model, hep-th/0504035
29. Arutyunov, G., Frolov, S.: Integrable Hamiltonian for Classical Strings on AdS5 × S5. JHEP 0502, 059 (2005)
30. Beisert, N., Kazakov, V.A., Sakai, K., Zarembo, K.: The Algebraic Curve of Classical Superstrings on AdS5 × S5, hep-th/0502226
31. Alday, L.F., Arutyunov, G., Tseytlin, A.A.: On Integrability of Classical SuperStrings in AdS5 × S5, hep-th/0502240

Communicated by G.W. Gibbons
Commun. Math. Phys. 264, 705–724 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1522-y
Communications in
Mathematical Physics
Global Solutions to the Cauchy Problem for the Relativistic Boltzmann Equation with Near–Vacuum Data Robert T. Glassey Department of Mathematics, Indiana University, Bloomington, IN 47405–7106, USA. E-mail:
[email protected] Received: 14 May 2005 / Accepted: 9 August 2005 Published online: 1 March 2006 – © Springer-Verlag 2006
Abstract: The Cauchy Problem for the relativistic Boltzmann equation is studied with small (i.e., near–vacuum) data. For an appropriate class of scattering cross sections, global “mild” solutions are obtained.

1. Introduction

We study the Cauchy Problem for the relativistic Boltzmann equation

$$\partial_t F + \hat v \cdot \nabla_x F = Q(F, F), \qquad F(0, x, v) = F_0(x, v) \qquad (x, v \in \mathbb{R}^3,\ t > 0) \tag{RB}$$

for “small” data $F_0$, i.e., the near–vacuum situation. We normalize the speed of light $c$ and the particle mass $m$ to unity. Then $\hat v$, the relativistic velocity, is defined in terms of the momenta $v \in \mathbb{R}^3$ by

$$\hat v = \frac{v}{v_0}; \qquad v_0 \equiv \sqrt{1 + |v|^2} \tag{1.1}$$

and thus $|\hat v| < 1$ for all $v$.

There is of course a vast literature on solutions to this equation, but most of it concerns the classical (nonrelativistic) formulation. General references include [6, 9, 12, 18, 22, 30, 31, 41, 42]. In contrast, the relativistic version studied here has not received comparable attention. Background information on this relativistic case may be found in [10, 13, 39]. A sampling of the many relevant papers is given in the references; other books and surveys contain a much more thorough listing, see e.g., [11, 18 and 42].

Supported in part by NSF DMS 0204227

For the classical equation in all space (or with periodic boundary conditions) one has global existence of smooth solutions near the vacuum ([28, 11, 23, 36]), near the
equilibrium ([32, 11, 29, 38, 43, 24 and 18]) as well as global weak solutions ([14]). (The papers of Guo cited here apply to a more general situation in which a nonlinear Vlasov–type force term may also be included.) Moreover the “nearly homogeneous” case has also been successfully treated, see [11] for references and details. It has been known for some time that the near–equilibrium relativistic Boltzmann equation also admits global smooth solutions ([20 and 21]). For this relativistic situation details of the collisions are studied in [19], a regularizing property is presented in [1] and weak solutions are treated in [15].

There are numerous examples in Kinetic Theory which suggest that the relativistic case can be quite challenging. One such example is the Cauchy Problem for the Vlasov–Poisson system. The classical version possesses global smooth solutions for smooth large data ([34, 37]), but the relativistic problem remains open. The classical near–vacuum problem was solved in the hard–sphere case via a beautiful trick used in [28] (essentially Galilean invariance: $|x - tv'|^2 + |x - tu'|^2 = |x - tv|^2 + |x - tu|^2$); cf. also [40]. It allows one to eliminate in many situations the dependence in estimates on the post–collisional velocities $u'$, $v'$. That device does not work in the present relativistic case and we are forced to proceed otherwise.

In the kernel of the collision operator $Q(F, F)$ appears the scattering cross section $\sigma$. The specific possible forms of $\sigma$ do not seem to be widely disseminated in the literature (see however [10]). In appropriate coordinates it may depend on the “relative momentum” and the scattering angle (both to be defined below). The hypotheses we impose on this kernel will be given in (H0) below after the relevant quantities are defined. Once the requisite quadratic estimates are achieved on the collision term, the nonnegativity of the solution follows essentially from the Illner–Shinbrot iteration [28].
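The classical identity quoted above can be illustrated numerically. The following sketch assumes the textbook form of an elastic collision between equal masses, $v' = v - ((v - u)\cdot\omega)\,\omega$, $u' = u + ((v - u)\cdot\omega)\,\omega$, which is not spelled out in the text; the function names are hypothetical. Since this collision rule preserves both $u + v$ and $|u|^2 + |v|^2$, the two sides of the identity should agree up to rounding.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha, x, y):
    """Return alpha * x + y for 3-vectors given as lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def random_unit(rng):
    w = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = dot(w, w) ** 0.5
    return [c / norm for c in w]


def classical_collision(u, v, w):
    """Assumed classical (nonrelativistic) elastic collision rule; returns (u', v')."""
    k = dot(axpy(-1.0, u, v), w)          # (v - u) . w
    return axpy(k, w, u), axpy(-k, w, v)


def weight(x, t, p, q):
    """|x - t p|^2 + |x - t q|^2."""
    dp, dq = axpy(-t, p, x), axpy(-t, q, x)
    return dot(dp, dp) + dot(dq, dq)


def max_identity_gap(samples=1000, seed=0):
    """Largest observed difference between the two sides of the identity."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = [rng.uniform(-2, 2) for _ in range(3)]
        u = [rng.uniform(-2, 2) for _ in range(3)]
        v = [rng.uniform(-2, 2) for _ in range(3)]
        t = rng.uniform(0, 5)
        u_new, v_new = classical_collision(u, v, random_unit(rng))
        worst = max(worst, abs(weight(x, t, v_new, u_new) - weight(x, t, v, u)))
    return worst


if __name__ == "__main__":
    print(max_identity_gap())
```

In the relativistic setting the transport velocity is $\hat v$ rather than $v$, and $|\hat u|^2 + |\hat v|^2$ is not conserved by collisions, which is one way to see why the device fails there.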
A “mild” solution to the initial–value problem is a continuous function satisfying the time–integrated form of (RB); see Theorem 2 below.

2. Notation and Details of the Collision Operator

As is standard, we abbreviate $F(t, x, u)$ by $F(u)$, etc., and use primes to represent the results of collisions. The conservation laws for momentum and energy are

$$u' + v' = u + v \equiv m, \tag{2.1}$$

$$\sqrt{1 + |u'|^2} + \sqrt{1 + |v'|^2} = \sqrt{1 + |u|^2} + \sqrt{1 + |v|^2} \equiv e, \tag{2.2}$$

for $u, v \in \mathbb{R}^3$. The scattering angle $\theta$ is defined as follows: given two 4–vectors $U = (u_0, u_1, u_2, u_3)$, $V \equiv (v_0, v_1, v_2, v_3)$, we set $U \cdot V = u_0 v_0 - \sum_{k=1}^{3} u_k v_k$ (which is the Lorentz inner product). Then the angle $\theta$ is given by

$$\cos\theta = \frac{(V - U) \cdot (V' - U')}{(V - U) \cdot (V - U)}. \tag{2.3}$$

The post–collisional momenta $u'$, $v'$ can be explicitly calculated [19]. In terms of a unit vector $\omega$ we have

$$v' = v - a(u, v, \omega)\,\omega, \qquad u' = u + a(u, v, \omega)\,\omega, \tag{2.4}$$

where

$$a(u, v, \omega) = \frac{2 e\, u_0 v_0\, \omega \cdot (\hat v - \hat u)}{e^2 - (\omega \cdot m)^2}. \tag{2.5}$$
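As a sanity check on (2.4) and (2.5), the sketch below samples random momenta $u$, $v$ and unit vectors $\omega$, builds $u'$, $v'$, and measures how well the conservation laws (2.1) and (2.2) hold in floating point. The helper names are hypothetical and the sampling ranges are arbitrary.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(p):
    """p_0 = sqrt(1 + |p|^2) with c = m = 1."""
    return math.sqrt(1.0 + dot(p, p))


def post_collision(u, v, w):
    """Return (u', v') from (2.4) with a(u, v, w) from (2.5)."""
    u0, v0 = energy(u), energy(v)
    e = u0 + v0
    m = [ui + vi for ui, vi in zip(u, v)]
    rel = [vi / v0 - ui / u0 for ui, vi in zip(u, v)]   # v_hat - u_hat
    a = 2.0 * e * u0 * v0 * dot(w, rel) / (e * e - dot(w, m) ** 2)
    u_new = [ui + a * wi for ui, wi in zip(u, w)]
    v_new = [vi - a * wi for vi, wi in zip(v, w)]
    return u_new, v_new


def max_conservation_gap(samples=1000, seed=1):
    """Largest violation of (2.1) and of (2.2) over random samples."""
    rng = random.Random(seed)
    worst_m = worst_e = 0.0
    for _ in range(samples):
        u = [rng.uniform(-3, 3) for _ in range(3)]
        v = [rng.uniform(-3, 3) for _ in range(3)]
        w = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(dot(w, w))
        w = [c / n for c in w]
        un, vn = post_collision(u, v, w)
        gap_m = max(abs(a + b - c - d) for a, b, c, d in zip(un, vn, u, v))
        gap_e = abs(energy(un) + energy(vn) - energy(u) - energy(v))
        worst_m, worst_e = max(worst_m, gap_m), max(worst_e, gap_e)
    return worst_m, worst_e


if __name__ == "__main__":
    print(max_conservation_gap())
```

The denominator in (2.5) satisfies $e^2 - (\omega\cdot m)^2 \ge e^2 - |m|^2 \ge 4$, so the division is always safe.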
Other relevant quantities which will appear below now follow. We define

$$s = (U + V)^2 = (u_0 + v_0)^2 - |u + v|^2 = 2(u_0 v_0 - u \cdot v + 1) = e^2 - |m|^2, \tag{2.6}$$

$$4g^2 = -(U - V)^2 = -(u_0 - v_0)^2 + |u - v|^2 = 2(u_0 v_0 - u \cdot v - 1) = s - 4. \tag{2.7}$$

Furthermore, we define the Møller velocity as the scalar $v_M$ given by

$$v_M^2 = |\hat v - \hat u|^2 - |\hat v \times \hat u|^2 = \frac{s(s - 4)}{4 v_0^2 u_0^2}$$

or

$$v_M = \frac{2g\sqrt{1 + g^2}}{v_0 u_0}. \tag{2.8}$$

The last equality is established in [20]. There are several representations of the collision operator $Q$; we will use that in Appendix II of [20]. Using that formulation we can write the collision operator as

$$Q(F, F)(t, x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} q(u, v, \omega)\,[F(t, x, v') F(t, x, u') - F(t, x, u) F(t, x, v)]\, d\omega\, du = \text{“gain} - \text{loss”} = Q_g(F, F) - Q_\ell(F, F),$$

where

$$S^2_+ = \{\omega \in S^2 : \omega \cdot \hat v \ge \omega \cdot \hat u\}.$$

As shown in [20], the scattering kernel $q$ admits the representation

$$q(u, v, \omega) = \frac{4 s \sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{(e^2 - (\omega \cdot m)^2)^2} \tag{2.9}$$

in which the scattering cross section $\sigma$ appears. We now state the hypotheses on $\sigma$:

(H0) Hypothesis on the Scattering Cross Section $\sigma$. Let $\omega$ be a unit vector and $u, v \in \mathbb{R}^3$. Let $0 < \delta < 1$ and denote by $p_u$ the vector $v \times \hat u$. $\sigma = \sigma(u, v, \omega)$ is to be nonnegative, continuous and satisfy

$$\sigma(u, v, \omega) \le \frac{|\omega \cdot p_u|\, \tilde\sigma(\omega)}{g\,(1 + g^2)^{\frac{1}{2} + \delta}},$$

where $\tilde\sigma(\omega)$ is also nonnegative, bounded and continuous and satisfies for some constant $c$ and for every $0 \ne z \in \mathbb{R}^3$,

$$\int_{|\omega| = 1} \frac{\tilde\sigma(\omega)\, d\omega}{1 + |\omega \cdot z|} \le c\,|z|^{-1}.$$
The major theorem is as follows:

Theorem 1. Let $\sigma$ satisfy the Hypothesis (H0) and let $\delta$ be as in (H0). Consider (RB) with initial value $F_0(x, v)$ satisfying $0 \le F_0(x, v) \in C^0(\mathbb{R}^6)$ as well as

$$\exp(v_0)\,\left(1 + |x \times v|^2\right)^{\frac{1+\delta}{2}} F_0(x, v) \le c_0.$$

Then there exists a positive number $\varepsilon$ with the property that if $c_0 \le \varepsilon$, a uniquely determined nonnegative global solution $F(t, x, v)$ to the mild form of the Cauchy problem for (RB) exists. This solution satisfies the estimate

$$F(t, x, v) \le c\, \exp(-v_0)\,\left(1 + |x \times v|^2\right)^{-\frac{1+\delta}{2}}$$

for some constant $c$ depending only on $\delta$ and the data.

The solution $F$ need not be integrable in $x$. This will be addressed later using the causality present in (RB) as well as an additional assumption that the initial value $F_0(x, v)$ have compact support in $x$. The exponential decay of $F_0$ in $v$ can be easily weakened to algebraic decay at the rate $v_0^{-w}$ for sufficiently large $w > 0$. This is because of the inequality

$$v_0'\, u_0' \ge c\, v_0 \qquad (c > 0)$$

which is known from [20]; see also [7].

Discussion of (H0). A sufficient condition for the integral condition in (H0) to hold is the following. Assume that $\tilde\sigma(\omega) \le |\omega_3|^\mu$ for some $\mu > 0$ and choose spherical coordinates with the polar axis directed along $z$. Denote by $\phi$ the polar angle. Then we have

$$\int_{S^2_+} \frac{\tilde\sigma(\omega)\, d\omega}{1 + |\omega \cdot z|} \le \int_{|\omega| = 1} \frac{|\omega_3|^\mu\, d\omega}{1 + |\omega \cdot z|} = 2 \cdot 2\pi \int_0^{\pi/2} \frac{\sin\phi\, (\cos\phi)^\mu\, d\phi}{1 + |z| \cos\phi} \le c\,|z|^{-1}$$

which is the desired condition. (For the last step, substitute $\tau = \cos\phi$ and use $1 + |z|\tau \ge |z|\tau$, which bounds the integral by $\frac{1}{\mu |z|}$.) An assumption in (H0) on the decay in $g$ is “natural” because the kernel $q$ will be seen in Lemma 2.2 below to grow in $g$ at a rate no greater than $g(1 + g^2)^{\frac{1}{2}}$. We do not know if the factor $\omega \cdot p_u$ appearing in (H0) represents a physically realistic situation. This assumption, while not entirely satisfactory and probably method–related, is needed in the estimates on the gain term. It can be shown that the imposition of this condition is not required to deal with the loss term, in which case we use part ii) of Lemma 2.2 below.

We now begin the sequence of estimates with a study of the function $a(u, v, \omega)$.

Lemma 2.1. The function $a(u, v, \omega)$ and the expression $\omega \cdot (\hat v - \hat u)$ satisfy the estimates

i) $\displaystyle \frac{2 u_0 v_0\, |\omega \cdot (\hat v - \hat u)|}{e} \le |a(u, v, \omega)| \le \hat g\, e = e\, \frac{g}{\sqrt{1 + g^2}}$,

ii) $\displaystyle v_M \sin\frac{\theta}{2} \le |\omega \cdot (\hat v - \hat u)| \le \frac{e\, v_M \sin\frac{\theta}{2}}{2\sqrt{1 + g^2}}$.
Proof. The lower bound in i) is trivial: in the denominator of the definition of $a$ we simply use $e^2 - (\omega \cdot m)^2 \le e^2$. For the upper bound in i) we begin with the definition of the scattering angle

$$\cos\theta = \frac{(v_0 - u_0)(v_0' - u_0') - (v - u) \cdot (v' - u')}{(v_0 - u_0)^2 - |v - u|^2}. \tag{2.10}$$

We expand the denominator to get

$$(v_0 - u_0)^2 - |v - u|^2 = v_0^2 + u_0^2 - 2u_0 v_0 - |v|^2 - |u|^2 + 2u \cdot v = 2 - 2u_0 v_0 + 2u \cdot v = -4g^2. \tag{2.11}$$

For the numerator we have by definition $(v - u) \cdot (v' - u') = (v - u) \cdot (v - u - 2a\omega)$ and

$$v_0' - u_0' = \frac{(v_0')^2 - (u_0')^2}{e} = \frac{|v'|^2 - |u'|^2}{e} = \frac{v_0^2 - u_0^2 - 2a\,\omega \cdot m}{e}. \tag{2.12}$$

Therefore

$$(v_0 - u_0)(v_0' - u_0') = \frac{(v_0 + u_0)(v_0 - u_0)^2 - 2a(v_0 - u_0)\,\omega \cdot m}{e}$$

so that the numerator becomes

$$(v_0 - u_0)^2 - \frac{2a(v_0 - u_0)\,\omega \cdot m}{e} - |v - u|^2 + 2a\,\omega \cdot (v - u)$$
$$= 2 - 2u_0 v_0 + 2u \cdot v + \frac{2a}{e}\,\omega \cdot [e(v - u) - (v_0 - u_0)m]$$
$$= -4g^2 + \frac{2a}{e}\,\omega \cdot [(v_0 + u_0)(v - u) - (v_0 - u_0)(u + v)].$$

Now performing some elementary algebra, namely $(v_0 + u_0)(v - u) - (v_0 - u_0)(u + v) = 2(u_0 v - v_0 u) = 2u_0 v_0(\hat v - \hat u)$, inserting the definition (2.5) of $a$ and returning to the definition (2.10) we get

$$-4g^2 \cos\theta = -4g^2 + \frac{8 u_0^2 v_0^2\, (\omega \cdot (\hat v - \hat u))^2}{e^2 - (\omega \cdot m)^2},$$

and it follows that

$$\sin^2\frac{\theta}{2} = \frac{u_0^2 v_0^2\, (\omega \cdot (\hat v - \hat u))^2}{g^2\,(e^2 - (\omega \cdot m)^2)} = \frac{e^2 - (\omega \cdot m)^2}{4 e^2 g^2} \cdot a^2. \tag{2.13}$$

Hence

$$|a| \le \frac{2eg}{\sqrt{e^2 - (\omega \cdot m)^2}} \le \frac{2eg}{\sqrt{e^2 - |m|^2}} = \frac{2eg}{\sqrt{s}} = \frac{eg}{\sqrt{1 + g^2}}$$

as desired. For part ii) we insert the definition of the function $a$ from (2.5) into (2.13) and then solve for $\omega \cdot (\hat v - \hat u)$ to get

$$|\omega \cdot (\hat v - \hat u)| = \frac{g \sin\frac{\theta}{2}\, \sqrt{e^2 - (\omega \cdot m)^2}}{u_0 v_0}.$$

Thus by the definition of $s$ and the relation (2.8),

$$v_M \sin\frac{\theta}{2} = \frac{\sqrt{s}\, g \sin\frac{\theta}{2}}{u_0 v_0} \le |\omega \cdot (\hat v - \hat u)| \le \frac{e g \sin\frac{\theta}{2}}{u_0 v_0} = \frac{e\, v_M \sin\frac{\theta}{2}}{2\sqrt{1 + g^2}},$$

and this completes the proof.
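The identities (2.8) and (2.13) and the four inequalities of Lemma 2.1 lend themselves to a numerical spot check. The sketch below is such a check under arbitrary sampling assumptions (momenta in $[-3, 3]^3$, samples with $s - 4 < 10^{-3}$ skipped to avoid dividing by a very small $g^2$); all function names are hypothetical. The scattering angle is computed from (2.10) and compared with the right-hand side of (2.13).

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def check_sample(u, v, w, tol=1e-9):
    """Return (residual of (2.8), residual of (2.13), Lemma 2.1 bounds hold?) or None."""
    u0, v0 = math.sqrt(1.0 + dot(u, u)), math.sqrt(1.0 + dot(v, v))
    e = u0 + v0
    m = [p + q for p, q in zip(u, v)]
    s = e * e - dot(m, m)
    if s - 4.0 < 1e-3:          # u almost equal to v: skip, g^2 is too small
        return None
    g = 0.5 * math.sqrt(s - 4.0)
    uh, vh = [c / u0 for c in u], [c / v0 for c in v]
    rel = [q - p for p, q in zip(uh, vh)]            # v_hat - u_hat
    wrel = dot(w, rel)
    denom = e * e - dot(w, m) ** 2
    a = 2.0 * e * u0 * v0 * wrel / denom             # (2.5)
    up = [c + a * wc for c, wc in zip(u, w)]         # u' from (2.4)
    vp = [c - a * wc for c, wc in zip(v, w)]         # v' from (2.4)
    up0, vp0 = math.sqrt(1.0 + dot(up, up)), math.sqrt(1.0 + dot(vp, vp))

    cr = cross(vh, uh)
    vm = 2.0 * g * math.sqrt(1.0 + g * g) / (u0 * v0)
    res_28 = abs(dot(rel, rel) - dot(cr, cr) - vm * vm)

    d = [q - p for p, q in zip(u, v)]                # v - u
    dp = [q - p for p, q in zip(up, vp)]             # v' - u'
    cos_theta = ((v0 - u0) * (vp0 - up0) - dot(d, dp)) / ((v0 - u0) ** 2 - dot(d, d))
    sin_half_sq = 0.5 * (1.0 - cos_theta)
    res_213 = abs(sin_half_sq - denom * a * a / (4.0 * e * e * g * g))

    sin_half = math.sqrt(max(sin_half_sq, 0.0))
    root = math.sqrt(1.0 + g * g)
    ok = (2.0 * u0 * v0 * abs(wrel) / e <= abs(a) + tol
          and abs(a) <= e * g / root + tol
          and vm * sin_half <= abs(wrel) + tol
          and abs(wrel) <= e * vm * sin_half / (2.0 * root) + tol)
    return res_28, res_213, ok


def run_checks(samples=2000, seed=2):
    rng = random.Random(seed)
    worst_28 = worst_213 = 0.0
    all_ok = True
    for _ in range(samples):
        u = [rng.uniform(-3, 3) for _ in range(3)]
        v = [rng.uniform(-3, 3) for _ in range(3)]
        w = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(dot(w, w))
        result = check_sample(u, v, [c / n for c in w])
        if result is None:
            continue
        worst_28 = max(worst_28, result[0])
        worst_213 = max(worst_213, result[1])
        all_ok = all_ok and result[2]
    return worst_28, worst_213, all_ok


if __name__ == "__main__":
    print(run_checks())
```

The check does not restrict $\omega$ to $S^2_+$, since (2.13) and the bounds involve only $|a|$ and $|\omega\cdot(\hat v - \hat u)|$.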
Although we will not use part ii) of this lemma in this paper, we present it to relate the variation over the sphere (parameterized by $\omega$) to that of the scattering angle $\theta$. The precise relationship between $\omega$ and $\theta$ is expressed in (2.13). Another reason for our doing so is that our notation is not the same as that in other sources such as [10]. Next we derive estimates on the scattering kernel $q$.

Lemma 2.2. The function $q(u, v, \omega)$ satisfies the following estimates:

i) $|q(u, v, \omega)| \le 8\sqrt{1 + g^2}\; g\, \sigma$,

ii) $|q(u, v, \omega)| \le 4\sigma\, u_0 v_0\, |\omega \cdot (\hat v - \hat u)|$.

We will not use ii) in this paper; as above, it is included for reference purposes and completeness.

Proof. From Lemma 3.1 of [20] we have the relation

$$\frac{e^2}{1 + g^2} = \frac{|u \times v|^2}{g^2(1 + g^2)} + \frac{|u - v|^2}{g^2},$$

so that

$$\frac{e^2}{1 + g^2} \le \frac{|u \times v|^2 + |u - v|^2}{g^2}.$$

Also from this same lemma in [20] we have the inequality

$$g \ge \frac{(|u \times v|^2 + |u - v|^2)^{1/2}}{2\sqrt{u_0 v_0}},$$

and hence

$$\frac{e^2}{1 + g^2} \le 4 u_0 v_0. \tag{2.14}$$

For the second estimate we begin with the definition of $q$:

$$|q| = \frac{4 s \sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{(e^2 - (\omega \cdot m)^2)^2} \le \frac{4 s \sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{(e^2 - |m|^2)^2} = \frac{4 \sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{s} = \frac{\sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{1 + g^2} \le 4\sigma\, u_0 v_0\, |\omega \cdot (\hat v - \hat u)|.$$

To verify the first inequality we write $q$ in terms of $a$ using its definition and then apply Lemma 2.1:

$$|q| = \frac{2 s \sigma e\, |a|}{u_0 v_0\, (e^2 - (\omega \cdot m)^2)} \le \frac{2 s \sigma e \cdot \hat g e}{u_0 v_0\, s} = \frac{2 \sigma e^2 g}{u_0 v_0 \sqrt{1 + g^2}}.$$

Therefore by (2.14) we get

$$|q| \le \frac{2 g \sqrt{1 + g^2}\; \sigma e^2}{u_0 v_0\, (1 + g^2)} \le 8 \sigma g \sqrt{1 + g^2}$$

which concludes the proof.
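The relation quoted from Lemma 3.1 of [20], the inequality (2.14) and the two bounds of Lemma 2.2 can be spot-checked in the same spirit. The sketch below sets $\sigma = 1$ (an assumption made only to isolate the kinematic factor), rewrites the quoted relation in the division-free form $e^2 g^2 = |u \times v|^2 + (1 + g^2)|u - v|^2$, and uses hypothetical helper names and arbitrary sampling ranges.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def kernel_checks(samples=2000, seed=3):
    """Return (max relative residual of the quoted relation, all inequalities hold?)."""
    rng = random.Random(seed)
    worst = 0.0
    ok = True
    for _ in range(samples):
        u = [rng.uniform(-3, 3) for _ in range(3)]
        v = [rng.uniform(-3, 3) for _ in range(3)]
        w = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(dot(w, w))
        w = [c / n for c in w]

        u0, v0 = math.sqrt(1.0 + dot(u, u)), math.sqrt(1.0 + dot(v, v))
        e = u0 + v0
        m = [p + q for p, q in zip(u, v)]
        s = e * e - dot(m, m)                      # (2.6)
        g2 = (s - 4.0) / 4.0                       # (2.7)
        if g2 < 1e-4:                              # skip nearly equal momenta
            continue
        wrel = abs(dot(w, [q / v0 - p / u0 for p, q in zip(u, v)]))
        q_kernel = 4.0 * s * e * e * wrel / (e * e - dot(w, m) ** 2) ** 2   # (2.9), sigma = 1

        d = [q - p for p, q in zip(u, v)]
        cr = cross(u, v)
        lhs = e * e * g2
        rhs = dot(cr, cr) + (1.0 + g2) * dot(d, d)
        worst = max(worst, abs(lhs - rhs) / max(1.0, abs(lhs)))

        slack = 1.0 + 1e-9
        ok = ok and e * e / (1.0 + g2) <= 4.0 * u0 * v0 * slack                     # (2.14)
        ok = ok and q_kernel <= 8.0 * math.sqrt(1.0 + g2) * math.sqrt(g2) * slack   # i)
        ok = ok and q_kernel <= 4.0 * u0 * v0 * wrel * slack + 1e-12                # ii)
    return worst, ok


if __name__ == "__main__":
    print(kernel_checks())
```

Equality in (2.14) is approached only as $u \to v$, which is why such samples are excluded and a small multiplicative slack is allowed.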
3. Estimates on the Collision Operator

In (RB)
\[ \partial_t F + \hat v\cdot\nabla_x F = Q(F,F) \]
we first make an elementary change of variables to achieve exponential decay in the integration variable $v$. Set $F(t,x,v) = \exp(-v_0)f(t,x,v)$. Then by energy conservation
\[ Q(F,F) = \int_{\mathbb{R}^3}\int_{S^2_+} q\,[\exp(-v_0')f(v')\exp(-u_0')f(u') - \exp(-v_0)f(v)\exp(-u_0)f(u)]\,d\omega\,du \]
\[ = \exp(-v_0)\int_{\mathbb{R}^3}\int_{S^2_+}\exp(-u_0)\,q\,[f(v')f(u') - f(v)f(u)]\,d\omega\,du. \]
The equation for $f$ is then
\[ \partial_t f + \hat v\cdot\nabla_x f = Q(f,f) = Q_g(f,f) - Q_l(f,f), \]
where
\[ Q_l(f,f)(t,x,v) = f(t,x,v)\int_{\mathbb{R}^3}\int_{S^2_+} q(u,v,\omega)e^{-u_0}f(t,x,u)\,d\omega\,du \tag{3.1} \]
and
\[ Q_g(f,f)(t,x,v) = \int_{\mathbb{R}^3}\int_{S^2_+} q(u,v,\omega)e^{-u_0}f(t,x,u')f(t,x,v')\,d\omega\,du. \tag{3.2} \]
We introduce the by now standard notation
\[ f^\#(t,x,v) = f(t,x+\hat v t,v). \tag{3.3} \]
Then (RB) can be written as
\[ \frac{d}{dt}f^\#(t,x,v) = Q^\#(f,f)(t,x,v). \tag{3.4} \]
Thus $Q^\#(f,f)(t,x,v) = Q_g^\#(f,f)(t,x,v) - Q_l^\#(f,f)(t,x,v)$, where
\[ Q_g^\#(f,f)(t,x,v) = \int_{\mathbb{R}^3}\int_{S^2_+} q(u,v,\omega)e^{-u_0}f(t,x+t\hat v,v')f(t,x+t\hat v,u')\,d\omega\,du \]
\[ = \int_{\mathbb{R}^3}\int_{S^2_+} q(u,v,\omega)e^{-u_0}f^\#(t,x+t(\hat v-\hat v'),v')\,f^\#(t,x+t(\hat v-\hat u'),u')\,d\omega\,du, \]
\[ Q_l^\#(f,f)(t,x,v) = f(t,x+t\hat v,v)\int_{\mathbb{R}^3}\int_{S^2_+} q e^{-u_0}f(t,x+t\hat v,u)\,d\omega\,du \]
\[ = f^\#(t,x,v)\int_{\mathbb{R}^3}\int_{S^2_+} q e^{-u_0}f^\#(t,x+t(\hat v-\hat u),u)\,d\omega\,du \equiv f^\#(t,x,v)R^\#(f)(t,x,v). \tag{3.5} \]
It is for the time-integrated form of (3.4) that we will find a continuous bounded nonnegative solution $f$ in this paper. Because $f$ will be bounded, the original distribution function $F$ will decay exponentially in $v$. The function spaces in which we will seek the solution are as follows. Let
\[ M = \Big\{ f\in C^0([0,\infty)\times\mathbb{R}^3\times\mathbb{R}^3) : \|f\| \equiv \sup_{t,x,v}(1+|x\times v|^2)^{\frac{1+\delta}{2}}|f(t,x,v)| < \infty \Big\}. \]
We also define
\[ X = \big\{ f\in L^\infty([0,\infty)\times\mathbb{R}^3\times\mathbb{R}^3) \big\} \]
with the same norm. We give a name to the weight function:
\[ \rho(x,v) = (1+|x\times v|^2)^{\frac{1+\delta}{2}}. \]
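Turning to the proof of the major result, we begin with a simple calculus lemma.

Lemma 3.1. For vectors $a, b\in\mathbb{R}^3$ with $b\ne 0$ let $\nu = \frac{b}{|b|}$ and
\[ I = \int_0^t \frac{d\tau}{(1+|a+b\tau|^2)^{\frac{1+\delta}{2}}}. \]
Then
\[ I \le \frac{c}{|b|\,(1+|a\times\nu|^2)^{\frac{\delta}{2}}} \]
for some constant $c$ independent of $t$, $a$ and $b$.

Proof. Elementary algebra gives us
\[ 1+|a+b\tau|^2 = 1+|a\times\nu|^2+|b|^2\Big(\tau+\frac{a\cdot b}{|b|^2}\Big)^2, \]
so that
\[ I = \int_0^t \frac{d\tau}{\big(1+|a\times\nu|^2+|b|^2(\tau+\frac{a\cdot b}{|b|^2})^2\big)^{\frac{1+\delta}{2}}} \le \frac{2}{|b|}\int_0^\infty \frac{ds}{(1+|a\times\nu|^2+s^2)^{\frac{1+\delta}{2}}} \le \frac{c}{|b|\,(1+|a\times\nu|^2)^{\frac{\delta}{2}}} \]
as desired.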
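The bound of Lemma 3.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, the time integral is approximated by a midpoint rule, and the explicit constant $c=\sqrt{\pi}\,\Gamma(\delta/2)/\Gamma((1+\delta)/2)$ is an assumption obtained by evaluating the integral over the whole line in closed form.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def perp_sq(a, b):
    """|a x nu|^2 with nu = b/|b| (hypothetical helper)."""
    c = cross(a, b)
    return dot(c, c) / dot(b, b)

def identity_gap(a, b, tau):
    """Difference between both sides of the completed-square identity."""
    lhs = 1 + dot(a, a) + 2 * tau * dot(a, b) + tau * tau * dot(b, b)
    rhs = 1 + perp_sq(a, b) + dot(b, b) * (tau + dot(a, b) / dot(b, b)) ** 2
    return abs(lhs - rhs)

def time_integral(a, b, t, delta, n=20000):
    """Midpoint-rule approximation of I = int_0^t (1+|a+b*tau|^2)^(-(1+delta)/2) dtau."""
    h = t / n
    p = (1 + delta) / 2
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        total += (1 + dot(a, a) + 2 * tau * dot(a, b) + tau * tau * dot(b, b)) ** (-p)
    return total * h

def lemma_bound(a, b, delta):
    """c / (|b| (1+|a x nu|^2)^(delta/2)) with the assumed explicit constant c."""
    c = math.sqrt(math.pi) * math.gamma(delta / 2) / math.gamma((1 + delta) / 2)
    return c / (math.sqrt(dot(b, b)) * (1 + perp_sq(a, b)) ** (delta / 2))
```

For sample vectors, `time_integral(a, b, t, delta)` should stay below `lemma_bound(a, b, delta)` for every `t`, which reflects the independence of the constant from $t$.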
The next lemma provides the key estimates of the collision operator.
Lemma 3.2. Let hypothesis (H0) hold. For any $t\ge 0$ and $f^\#\in M$ there is a constant $c$ independent of $t$, $x$, $v$ for which
\[ \int_0^t |Q_g^\#(f,f)|\,d\tau \le c\rho(x,v)^{-1}\|f^\#\|^2, \qquad \int_0^t |Q_l^\#(f,f)|\,d\tau \le c\rho(x,v)^{-1}\|f^\#\|^2. \]
More generally, for $f_1^\#, f_2^\#\in M$, $t\ge 0$, define $\tilde Q_g^\#(f_1,f_2)$ and $\tilde Q_l^\#(f_1,f_2)$ by
\[ \tilde Q_g^\#(f_1,f_2)(t,x,v) = \int_{\mathbb{R}^3}\int_{S^2_+} e^{-u_0}q\,f_1^\#(t,x+t(\hat v-\hat v'),v')\,f_2^\#(t,x+t(\hat v-\hat u'),u')\,d\omega\,du, \]
\[ \tilde Q_l^\#(f_1,f_2)(t,x,v) = f_1^\#(t,x,v)\int_{\mathbb{R}^3}\int_{S^2_+} q e^{-u_0}f_2^\#(t,x+t(\hat v-\hat u),u)\,d\omega\,du. \]
Then
\[ \int_0^t |\tilde Q_g^\#(f_1,f_2)|\,d\tau \le c\rho(x,v)^{-1}\|f_1^\#\|\,\|f_2^\#\|, \qquad \int_0^t |\tilde Q_l^\#(f_1,f_2)|\,d\tau \le c\rho(x,v)^{-1}\|f_1^\#\|\,\|f_2^\#\|. \]
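Proof. By definition
\[ \Big|\int_0^t Q_l^\#(f,f)(\tau,x,v)\,d\tau\Big| = \Big|\int_0^t f^\#(\tau,x,v)\int_{S^2_+}d\omega\int_{\mathbb{R}^3} q e^{-u_0}f^\#(\tau,x+\tau(\hat v-\hat u),u)\,du\,d\tau\Big| \]
\[ \le \rho(x,v)^{-1}\|f^\#\|^2\int_{S^2_+}d\omega\int_{\mathbb{R}^3}|q|e^{-u_0}\,du\int_0^t\frac{d\tau}{(1+|(x+\tau\hat v)\times u|^2)^{\frac{1+\delta}{2}}}. \]
The time integral has the form of that in Lemma 3.1 with $a = x\times u$, $b = \hat v\times u \equiv p_u$ and is therefore dominated by $\frac{c}{|b|}$. Hence
\[ \Big|\int_0^t Q_l^\#(f,f)(\tau,x,v)\,d\tau\Big| \le c\rho(x,v)^{-1}\|f^\#\|^2\int_{S^2_+}\int_{\mathbb{R}^3}\frac{|q|e^{-u_0}}{|\hat v\times u|}\,du\,d\omega \]
\[ \le c\rho(x,v)^{-1}\|f^\#\|^2\int_{S^2_+}\int_{\mathbb{R}^3}\frac{\tilde\sigma(\omega)\,|\omega\cdot p_u|\,e^{-u_0}}{|p_u|(1+g^2)^{\delta}}\,du\,d\omega, \]
where we have used Lemma 2.2 i) and (H0) to bound $|q|$. Now in the remaining integrals we use $|\omega\cdot p_u|\le|p_u|$, apply (H0) again to bound the $\omega$ integral uniformly and bound the $g$ expression below by 1. Then
\[ \Big|\int_0^t Q_l^\#(f,f)(\tau,x,v)\,d\tau\Big| \le c\rho(x,v)^{-1}\|f^\#\|^2, \]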
which is the desired estimate for the loss term. As expected, the gain term is much more difficult to handle. We break up its estimate into a number of steps. First we have
\[ \int_0^t |Q_g^\#(f,f)(\tau,x,v)|\,d\tau = \int_0^t\Big|\int_{\mathbb{R}^3}\int_{S^2_+} e^{-u_0}q\,f^\#(\tau,x+\tau(\hat v-\hat v'),v')\,f^\#(\tau,x+\tau(\hat v-\hat u'),u')\,d\omega\,du\Big|\,d\tau \]
\[ \le \|f^\#\|^2\int_{\mathbb{R}^3}\int_{S^2_+}|q|e^{-u_0}\,d\omega\,du\int_0^t\frac{d\tau}{\big[(1+|(x+\tau\hat v)\times v'|^2)(1+|(x+\tau\hat v)\times u'|^2)\big]^{\frac{1+\delta}{2}}}. \tag{3.6} \]
Now consider
\[ \tilde D \equiv (1+|(x+\tau\hat v)\times v'|^2)(1+|(x+\tau\hat v)\times u'|^2). \]
We define
\[ a_v = x\times v',\quad a_u = x\times u',\quad b_v = \hat v\times v',\quad b_u = \hat v\times u',\quad \nu_v = \frac{b_v}{|b_v|},\quad \nu_u = \frac{b_u}{|b_u|},\quad c_v = a_v\cdot b_v,\quad c_u = a_u\cdot b_u. \tag{3.7} \]

Step 1. Estimation of $\tilde D$ from below. From its definition we have
\[ \tilde D = [1+|a_v+\tau b_v|^2][1+|a_u+\tau b_u|^2] = 1+|a_v+\tau b_v|^2+|a_u+\tau b_u|^2+|a_v+\tau b_v|^2|a_u+\tau b_u|^2. \]
The last term, quartic in $\tau$, will be estimated using $|a_v+\tau b_v|\ge|\omega\cdot(a_v+\tau b_v)|$, and for this we compute
\[ \omega\cdot b_v = \omega\cdot(\hat v\times v') = \omega\cdot\hat v\times(v-a\omega) = 0 \]
and
\[ \omega\cdot a_v = \omega\cdot(x\times v') = \omega\cdot x\times(v-a\omega) = \omega\cdot x\times v. \]
These expressions are substituted into the quartic term in $\tilde D$; we arrive at
\[ \tilde D \ge 1+|a_v|^2+\tau^2|b_v|^2+2\tau c_v+|a_u|^2+\tau^2|b_u|^2+2\tau c_u+v_0^2(\omega\cdot p_x)^2\big[|a_u|^2+\tau^2|b_u|^2+2\tau c_u\big], \tag{3.8} \]
where $p_x\equiv x\times\hat v$.
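We write this as
\[ \tilde D \ge \alpha\tau^2+2\beta\tau+\gamma, \tag{3.9} \]
where
\[ \alpha = |b_v|^2+|b_u|^2(1+v_0^2(\omega\cdot p_x)^2),\quad \beta = c_v+c_u(1+v_0^2(\omega\cdot p_x)^2),\quad \gamma = 1+|a_v|^2+|a_u|^2(1+v_0^2(\omega\cdot p_x)^2). \tag{3.10} \]

Step 2. Estimation of the time integral. We will show shortly that $\alpha\gamma-\beta^2>0$. Assuming this for the moment, let
\[ I_t \equiv \int_0^t\frac{d\tau}{\tilde D^{\frac{1+\delta}{2}}} = \int_0^t\frac{d\tau}{\big([1+|a_v+\tau b_v|^2][1+|a_u+\tau b_u|^2]\big)^{\frac{1+\delta}{2}}} \le \int_0^t\frac{d\tau}{(\alpha\tau^2+2\beta\tau+\gamma)^{\frac{1+\delta}{2}}} \]
and complete the square to write
\[ \alpha\tau^2+2\beta\tau+\gamma = \alpha\Big[\Big(\tau+\frac{\beta}{\alpha}\Big)^2+\frac{\alpha\gamma-\beta^2}{\alpha^2}\Big]. \]
Therefore
\[ I_t \le 2\alpha^{-\frac{1+\delta}{2}}\int_0^\infty\frac{ds}{\big(s^2+\frac{\alpha\gamma-\beta^2}{\alpha^2}\big)^{\frac{1+\delta}{2}}} = c\,\alpha^{\frac{\delta-1}{2}}(\alpha\gamma-\beta^2)^{-\frac{\delta}{2}}. \tag{3.11} \]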
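Step 3. Estimation of $\alpha\gamma-\beta^2$ from below. From (3.10) we have directly that
\[ \alpha\gamma-\beta^2 = |b_v|^2(1+|a_v|^2)+|b_v|^2|a_u|^2(1+v_0^2(\omega\cdot p_x)^2)+|b_u|^2(1+|a_v|^2)(1+v_0^2(\omega\cdot p_x)^2) \]
\[ \qquad +|b_u|^2|a_u|^2(1+v_0^2(\omega\cdot p_x)^2)^2-c_v^2-c_u^2(1+v_0^2(\omega\cdot p_x)^2)^2-2c_vc_u(1+v_0^2(\omega\cdot p_x)^2) \]
\[ = |b_v|^2+|a_v\times b_v|^2+|a_u\times b_u|^2(1+v_0^2(\omega\cdot p_x)^2)^2+(1+v_0^2(\omega\cdot p_x)^2)\big(|b_v|^2|a_u|^2+|b_u|^2+|b_u|^2|a_v|^2-2c_uc_v\big). \]
For the last expression in parentheses here we write
\[ |b_v|^2|a_u|^2+|b_u|^2+|b_u|^2|a_v|^2-2c_uc_v \]
\[ = |b_v|^2\big[(a_u\cdot\nu_u)^2+|a_u\times\nu_u|^2\big]+|b_u|^2+|b_u|^2\big[(a_v\cdot\nu_v)^2+|a_v\times\nu_v|^2\big]-2c_uc_v \]
\[ = \frac{|b_v|^2c_u^2}{|b_u|^2}+|b_v|^2|a_u\times\nu_u|^2+|b_u|^2+\frac{|b_u|^2c_v^2}{|b_v|^2}+|b_u|^2|a_v\times\nu_v|^2-2c_uc_v \]
\[ = |b_u|^2(1+|a_v\times\nu_v|^2)+|b_v|^2|a_u\times\nu_u|^2+\frac{|b_v|^2c_u^2}{|b_u|^2}+\frac{|b_u|^2c_v^2}{|b_v|^2}-2\,\frac{|b_v|}{|b_u|}c_u\cdot\frac{|b_u|}{|b_v|}c_v \]
\[ \ge |b_u|^2(1+|a_v\times\nu_v|^2)+|b_v|^2|a_u\times\nu_u|^2. \]
It follows that $\alpha\gamma-\beta^2$ is bounded below by
\[ |b_v|^2(1+|a_v\times\nu_v|^2)+|b_u|^2|a_u\times\nu_u|^2(1+v_0^2(\omega\cdot p_x)^2)^2+(1+v_0^2(\omega\cdot p_x)^2)\big[|b_u|^2(1+|a_v\times\nu_v|^2)+|b_v|^2|a_u\times\nu_u|^2\big] \]
\[ = \big[|b_v|^2+|b_u|^2(1+v_0^2(\omega\cdot p_x)^2)\big]\big[1+|a_v\times\nu_v|^2+(1+v_0^2(\omega\cdot p_x)^2)|a_u\times\nu_u|^2\big] \]
\[ = \alpha\big[1+|a_v\times\nu_v|^2+(1+v_0^2(\omega\cdot p_x)^2)|a_u\times\nu_u|^2\big] \]
\[ \ge \alpha\big[1+|a_v\times\nu_v|^2+|a_u\times\nu_u|^2\big] \ge c\,\alpha\big[1+|a_v\times\nu_v|+|a_u\times\nu_u|\big]^2. \tag{3.12} \]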
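The algebra of Step 3 can be spot-checked numerically. The sketch below is illustrative only: it treats $a_v$, $b_v$, $a_u$, $b_u$ as arbitrary vectors and abbreviates $1+v_0^2(\omega\cdot p_x)^2$ by a free parameter `K >= 1`. Both are assumptions made for the test, on the grounds that the computation above appears to use only the Lagrange identity and a completed square; the function names are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def step3_sides(av, bv, au, bu, K):
    """Return (alpha*gamma - beta^2, alpha*(1 + |av x nu_v|^2 + K*|au x nu_u|^2))."""
    alpha = dot(bv, bv) + K * dot(bu, bu)
    beta = dot(av, bv) + K * dot(au, bu)
    gamma = 1 + dot(av, av) + K * dot(au, au)
    # |a x nu|^2 = |a x b|^2 / |b|^2 for nu = b/|b|
    sv = dot(cross(av, bv), cross(av, bv)) / dot(bv, bv)
    su = dot(cross(au, bu), cross(au, bu)) / dot(bu, bu)
    return alpha * gamma - beta ** 2, alpha * (1 + sv + K * su)

def random_vector(rng):
    return tuple(rng.uniform(-3.0, 3.0) for _ in range(3))

def check_step3(samples=1000, seed=0):
    """True if the lower bound holds (up to rounding) on all random samples."""
    rng = random.Random(seed)
    for _ in range(samples):
        av, bv, au, bu = (random_vector(rng) for _ in range(4))
        K = 1.0 + rng.uniform(0.0, 5.0)
        lhs, lower = step3_sides(av, bv, au, bu, K)
        if lhs < lower - 1e-9 * (1.0 + abs(lower)):
            return False
    return True
```

A random test of this kind does not replace the proof, but it guards against sign or index slips when the long expansion is transcribed.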
Using the vector identity $A\times(B\times C) = (C\cdot A)B-(B\cdot A)C$, we compute these cross products as
\[ a_v\times b_v = (x\times v')\times(\hat v\times v') = -(\hat v\cdot x\times v')v' = -(v'\cdot\hat v\times x)v' = (v'\cdot p_x)v'; \]
similarly $a_u\times b_u = (u'\cdot p_x)u'$. Since $|b_v| = |\hat v\times v'|\le|\hat v||v'|$ we have
\[ |a_v\times\nu_v| = \frac{|a_v\times b_v|}{|b_v|} \ge \frac{|v'\cdot p_x|}{|\hat v|} \]
and therefore also
\[ |a_u\times\nu_u| \ge \frac{|u'\cdot p_x|}{|\hat v|}. \]
We can now estimate the expression in the last line of (3.12) from below:
\[ 1+|a_v\times\nu_v|+|a_u\times\nu_u| \ge 1+\frac{|v'\cdot p_x|}{|\hat v|}+\frac{|u'\cdot p_x|}{|\hat v|} = |\hat v|^{-1}\big[|\hat v|+|v'\cdot p_x|+|u'\cdot p_x|\big] \]
\[ \ge |\hat v|^{-1}\big[|\hat v|+|(v'+u')\cdot p_x|\big] = |\hat v|^{-1}\big[|\hat v|+|u\cdot p_x|\big]. \]
Here we have used conservation of momentum and the fact that $v\cdot p_x = 0$. Using this in (3.12) we see that
\[ \alpha\gamma-\beta^2 \ge c\,\alpha|\hat v|^{-2}\big[|\hat v|+|u\cdot p_x|\big]^2. \]
This completes Step 3. With its result, we can estimate the time integral $I_t$ from (3.11) as
\[ I_t \le c\,\alpha^{\frac{\delta-1}{2}}(\alpha\gamma-\beta^2)^{-\frac{\delta}{2}} \le c|\hat v|^{\delta}\alpha^{-\frac12}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}. \]
From the definition (3.10) of $\alpha$ we have
\[ \alpha = |b_v|^2+|b_u|^2(1+v_0^2(\omega\cdot p_x)^2) \ge c|b_u|^2(1+v_0|\omega\cdot p_x|)^2. \]
For $|b_u|$ we can write
\[ |b_u| = |\hat v\times u'| \ge |\omega\cdot(\hat v\times u')| = |\omega\cdot(\hat v\times(u+a\omega))| = |\omega\cdot(\hat v\times u)| \equiv |\omega\cdot p_u|. \]
Therefore $\alpha\ge c|\omega\cdot p_u|^2(1+v_0|\omega\cdot p_x|)^2$ and it follows that the time integral satisfies the estimate
\[ I_t \le c|\hat v|^{\delta}|\omega\cdot p_u|^{-1}(1+v_0|\omega\cdot p_x|)^{-1}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}. \tag{3.13} \]
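Returning to the integral for the gain term in (3.6) we now have
\[ \int_0^t |Q_g^\#(f,f)(\tau,x,v)|\,d\tau \le c|\hat v|^{\delta}\|f^\#\|^2\int_{\mathbb{R}^3}\int_{S^2_+}|q|e^{-u_0}|\omega\cdot p_u|^{-1}(1+v_0|\omega\cdot p_x|)^{-1}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,d\omega\,du \]
\[ \le c|\hat v|^{\delta}\|f^\#\|^2\int_{\mathbb{R}^3}\int_{S^2_+} g\sqrt{1+g^2}\,\sigma e^{-u_0}|\omega\cdot p_u|^{-1}(1+v_0|\omega\cdot p_x|)^{-1}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,d\omega\,du \]
\[ \le c|\hat v|^{\delta}\|f^\#\|^2\int_{\mathbb{R}^3}\int_{S^2_+}\tilde\sigma e^{-u_0}(1+g^2)^{-\delta}(1+v_0|\omega\cdot p_x|)^{-1}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,d\omega\,du, \tag{3.14} \]
where we have first used part i) of Lemma 2.2 to bound $q$ and then, in the last line, have applied hypothesis (H0) to adjust the powers of $g$ and to cancel the factor $|\omega\cdot p_u|$.

Step 4. The desired estimate holds when $|x\times v|$ is bounded. Let us assume that $|x\times v|\le 1$. Then (3.13) allows us to write
\[ \int_0^t |Q_g^\#(f,f)(\tau,x,v)|\,d\tau \le c\|f^\#\|^2\int_{\mathbb{R}^3}\int_{S^2_+}\tilde\sigma e^{-u_0}\,d\omega\,du \le c\rho^{-1}\|f^\#\|^2 \]
by (H0). Therefore in what follows we may assume that $|x\times v|\ge 1$.

Step 5. Estimation of the $\omega$ integral from above. It is now immediate to estimate the $\omega$ integral $I_\omega$ appearing in (3.13):
\[ I_\omega \equiv \int_{S^2_+}\tilde\sigma(\omega)(1+v_0|\omega\cdot p_x|)^{-1}\,d\omega \le c(v_0|p_x|)^{-1} = c|x\times v|^{-1} \]
in view of hypothesis (H0). (As $|x\times v|\ge 1$ there is no singularity.) Using this result in (3.13) we now have
\[ \int_0^t |Q_g^\#(f,f)(\tau,x,v)|\,d\tau \le c|x\times v|^{-1}|\hat v|^{\delta}\|f^\#\|^2\int_{\mathbb{R}^3}e^{-u_0}(1+g^2)^{-\delta}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,du. \tag{3.15} \]
In this same set we may write $|x\times v|^{-1}\le c\rho^{-1}|x\times v|^{\delta} = c\rho^{-1}v_0^{\delta}|p_x|^{\delta}$, and therefore (3.15) takes the form
\[ \int_0^t |Q_g^\#(f,f)(\tau,x,v)|\,d\tau \le c\rho^{-1}v_0^{\delta}|p_x|^{\delta}|\hat v|^{\delta}\|f^\#\|^2\int_{\mathbb{R}^3}e^{-u_0}(1+g^2)^{-\delta}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,du. \tag{3.16} \]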
Step 6. Estimation of the $u$ integral and completion of the lemma. Denote by $I_u$ the remaining integral:
\[ I_u = \int_{\mathbb{R}^3}e^{-u_0}(1+g^2)^{-\delta}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,du. \]
Our goal is to show that this integral is dominated by a constant multiple of $(v_0|p_x|)^{-\delta}$. Toward this end we partition it as
\[ I_u = \int_{|u|<|v|/2}+\int_{|u|>|v|/2} \equiv I+II. \]
From Lemma 3.1 of [20] we have
\[ g \ge \frac{(|u\times v|^2+|u-v|^2)^{1/2}}{2\sqrt{u_0v_0}}, \]
and thus on the set $\{|u|<|v|/2\}$ we get
\[ 1+g^2 \ge 1+\frac{|u-v|^2}{4v_0u_0} \ge 1+\frac{|v|^2}{16v_0u_0} \ge \frac{1+|v|^2}{16v_0u_0} = \frac{v_0}{16u_0}. \]
Hence because $\delta<1$,
\[ I \le cv_0^{-\delta}\int_{\mathbb{R}^3}u_0^{\delta}e^{-u_0}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,du \le cv_0^{-\delta}\int_0^\infty r^2(1+r^2)^{\frac{\delta}{2}}e^{-\sqrt{1+r^2}}\,dr\int_0^{\frac{\pi}{2}}\frac{\sin\phi\,d\phi}{(r|p_x|\cos\phi)^{\delta}} \le c(|p_x|v_0)^{-\delta}. \]
The other integral $II$ admits the estimate
\[ II \le cv_0^{-\delta}\int_{|u|>|v|/2}u_0^{\delta}e^{-u_0}\big[|\hat v|+|u\cdot p_x|\big]^{-\delta}\,du, \]
and this integral is dominated by the same integral which appeared in the upper bound for $I$ above. We can now conclude that $I_u\le c(|p_x|v_0)^{-\delta}$. Inserting this in (3.16) we finally get
\[ \int_0^t |Q_g^\#(f,f)(\tau,x,v)|\,d\tau \le c\rho^{-1}\|f^\#\|^2, \]
which is the desired estimate for the gain term.
With quadratic estimates of the form from Lemma 3.2 in hand, a small-data theorem results without difficulty. Write $f(0,x,v) = f_0(x,v)$. Returning to (3.4), we integrate in time to get
\[ f^\#(t,x,v) = f_0(x,v)+\int_0^t Q^\#(f,f)(\tau,x,v)\,d\tau. \tag{3.17} \]
Define the operator $\mathcal{F}$ on $M$ by
\[ \mathcal{F}f^\# = f_0(x,v)+\int_0^t Q^\#(f,f)(\tau,x,v)\,d\tau \]
and let $M_R = \{f\in M : \|f^\#\|\le R\}$.

Theorem 2. Let hypothesis (H0) hold. There exists a constant $R_0$ such that if $\|f_0\|$ is sufficiently small, then Eq. (3.17) has a unique solution $f^\#\in M_{R_0}$. Moreover, under the same restrictions on $f_0$ and $R_0$, this equation is uniquely solvable in $X$ as well.

Proof. The estimates of Lemma 3.2 show that if, e.g., $\|f_0\|\le R/2$ and $f\in M_R$, then
\[ |\mathcal{F}f^\#| \le \rho(x,v)^{-1}\|f_0\|+c\rho(x,v)^{-1}\|f^\#\|^2 \le \rho(x,v)^{-1}\Big[\frac{R}{2}+cR^2\Big]. \]
Thus $\mathcal{F}$ maps $M_R$ into itself for $R$ sufficiently small. Similarly, we show that $\mathcal{F}$ is a contraction on $M_R$ for suitably small $R$. Since elements of $M_R$ are continuous, the continuity of $\mathcal{F}f^\#$ is evident.
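The smallness mechanism in this proof can be illustrated on a scalar toy model. The sketch below is only an analogy, not the operator from the paper: it replaces the norm inequality by the scalar map $y\mapsto y_0+cy^2$, and the choice $R = 1/(4c)$ together with the function names are assumptions made for illustration. With $y_0\le R/2$ the map sends $[0,R]$ into itself, since $R/2+cR^2\le R$, and its Lipschitz constant there is at most $2cR = 1/2$.

```python
def invariant_radius(c):
    """Radius R with R/2 + c*R^2 <= R and Lipschitz constant 2*c*R = 1/2 (toy choice)."""
    return 1.0 / (4.0 * c)

def toy_map(y, y0, c):
    """Scalar stand-in for the map f -> f0 + (quadratic term)."""
    return y0 + c * y * y

def picard(y0, c, steps=60):
    """Iterate the toy map starting from zero and return the final iterate."""
    y = 0.0
    for _ in range(steps):
        y = toy_map(y, y0, c)
    return y
```

For instance, with `c = 3.0` and `y0 = invariant_radius(c) / 2`, the iterates remain in `[0, R]` and settle at a fixed point of the toy map, mirroring how small data keep the iteration inside $M_R$.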
4. Nonnegativity of the Solution It remains to show that the solution just obtained remains nonnegative. For this purpose we use the well–known iteration of Illner and Shinbrot [28] which, to a certain point, proceeds as in the classical case. Let T > 0 be arbitrary and let MT denote the restriction of elements f ∈ M to [0, T ] × R3 × R3 . Suppose that there exist U0# , #0 ∈ M such that 0 (t, x, v) ≤ U0 (t, x, v) for all 0 ≤ t < T , (x, v) ∈ R3 × R3 . Define two sequences {k }, {Uk } by d # + #k+1 R# (Uk ) = Q#g (k , k ), k+1 (0) = f0 , dt k+1 d # # U + Uk+1 R# (k ) = Q#g (Uk , Uk ), Uk+1 (0) = f0 . dt k+1
(4.1)
Because we have assumed that U0# ∈ M, the estimates of Lemma 3.2 allow us to conclude that R# (U0 ), Q#g (U0 , U0 ) ∈ L1 ((0, T ), C 0 (R3 × R3 )).
(4.2)
Clearly there exists a solution when k = 0. These are linear ordinary differential equations; thus if k−1 , Uk−1 exist on (0, T ) then so do k , Uk .
Lemma 4.1. Let 0 ≤ f0 ∈ M. Assume the beginning condition (BC) 0 ≤ 0 (t) ≤ 1 (t) ≤ U1 (t) ≤ U0 (t),
0 ≤ t < T.
(BC)
Then the system (4.1) has a unique solution #k , Uk# ∈ MT
(4.3)
for all k ≥ 1 with the property k−1 (t) ≤ k (t) ≤ Uk (t) ≤ Uk−1 (t),
0 ≤ t < T.
(4.4)
Temporarily we assume that the (BC) and the result of the lemma hold with some U0# ∈ MT . Then there exist functions , U with k , Uk U, and (t) ≤ U(t) for all t. Now integrate over [0, t] the ordinary differential equations (4.1) at step k; let k → ∞ and apply the dominated convergence theorem to get t t # (t) + # R# (U)(τ ) dτ = f0 + Q#g (, )(τ ) dτ, 0 0 (4.5) t t # # # # U R ()(τ ) dτ = f0 + Qg (U, U)(τ ) dτ. U (t) + 0
0
This is the separated Boltzmann system. If we can show that U = , then f ≡ U = will be a nonnegative “mild” solution of (RB). Proof. In order to see the monotonicity, we solve explicitly to get t t # t # #k (t) = f0 e− 0 R (Uk−1 )ds + e− τ R (Uk−1 )ds Q#g (k−1 , k−1 ) dτ.
(4.6)
0
Thus #k+1 (t)
= f0 e
−
t 0
R# (Uk )ds
t
+
e−
t τ
R# (Uk )ds
0
Q#g (k , k ) dτ.
(4.7)
Assume that for some k ≥ 1, k−1 (t) ≤ k (t) ≤ Uk (t) ≤ Uk−1 (t),
(4.8)
and subtract (4.6) from (4.7): t # t # #k+1 (t) − #k (t) = f0 e− 0 R (Uk )ds − e− 0 R (Uk−1 )ds t t # t # + e− τ R (Uk )ds − e− τ R (Uk−1 )ds Q#g (k , k ) dτ 0 t t # + e− τ R (Uk−1 )ds Q#g (k , k ) − Q#g (k−1 , k−1 ) dτ. (4.9) 0
The kernel q is nonnegative on the set of integration, and from definition (3.5), R(u) ≤ R(v) if u ≤ v a.e. So the first two terms are nonnegative. By the induction assumption, the last term is too, since Qg is monotone. Hence #k (t) ≤ #k+1 (t),
(4.10)
and a similar argument applies to the {Uk# (t)}. We see that each member of {#k }, {Uk# } is nonnegative and belongs to MT by using the estimates of Lemma 3.2. This proves the lemma.
In order to simplify the (BC), we take 0 = 0 and any 0 ≤ U0# ∈ MT . We claim that 0 = 0 (t) ≤ 1 (t) ≤ U1 (t).
(4.11)
Indeed, by the differential equations, d # + #1 R# (U0 ) = Q#g (0 , 0 ), dt 1 d # U + U1# R# (0 ) = Q#g (U0 , U0 ). dt 1
(4.12)
0 = 0 implies R# (0 ) = 0, Q#g (0 , 0 ) = 0.
(4.13)
Now
Therefore
U1#
t
= f0 +
0 ≤ #1 = f0 e−
0 t 0
Q#g (U0 , U0 ) ds,
R# (U0 )ds
(4.14)
≤ f0 ≤ U1# .
Hence the (BC) reduces to U1 (t) ≤ U0 (t).
(BC)
5. Satisfaction of the Beginning Condition We are assuming that 0 ≤ f0 ∈ M. Lemma 5.1. If f0 is sufficiently small, then the (BC) holds, and the separated Boltzmann system has a global solution (, U) with (# , U # ) ∈ MR . Proof. Since 0 = 0,
U1# (t) = f0 +
i.e., U1 (t, x + t v, ˆ v) = f0 (x, v) +
t 0
Q#g (U0 , U0 ) dτ,
t 0
R3
2 S+
(5.1)
q exp(−u0 )U0 (τ, x + τ v, ˆ v)
×U0 (τ, x + τ v, ˆ u ) dω du dτ.
(5.2)
The (BC) requires U1 (t) ≤ U0 (t); sufficient for this is that there exists a function U0 which satisfies t U0 (t, x + t v, ˆ v) = f0 (x, v) + q exp(−u0 )U0 (τ, x + τ v, ˆ v) 0
2 R3 S+
ˆ u ) dω du dτ. ×U0 (τ, x + τ v,
(5.3)
We recognize the right–hand side here as having the same form as the “gain term” studied above. As such, the same estimates apply and provide the existence of a solution U0 provided f0 is sufficiently small. (Large–data solutions will not exist in general to the “gain only” Boltzmann equation; see [2]).
Proof that U = : It remains to show that U = . Take R from Theorem 2. Lemma 5.2. When f0 is sufficiently small, U = , where U, are the solutions of the separated Boltzmann system 4.5. Proof. By definition,
t
# (t) +
0 t
U # (t) +
t
# R# (U)(τ ) dτ = f0 +
0 t
U # R# ()(τ ) dτ = f0 +
0
0
Q#g (, )(τ ) dτ, Q#g (U, U)(τ ) dτ.
Subtracting these equations, we have t # # (U − )(t) = [Q#g (U, U − ) + Q#g (U − , ) 0
+# R# (U − ) − (U # − # )R# ()] dτ. Now we simply take norms in M, using the estimates from the second part of Lemma 3.2: U # − # ≤ c U # U # − # + # U # − # . Now U # , # both lie in MR , so each of the factors U # , # is bounded by cR. The conclusion now follows when R is sufficiently small.
As in Lemma 3.2, we can show that U = ∈ X under the same restriction on f0 . Thus the nonnegative solution just obtained must coincide with the unique solution f ∈ X obtained from the last sentence of Theorem 2. Since f0 ∈ M by hypothesis, and since M ⊆ X, our solutions must be identical, and the solution f from Theorem 2 must remain nonnegative. This completes the proof.

We end with an observation about the integrability of the solution F in x and v. As f is bounded, the solution F is clearly integrable in v because it decays exponentially. Let us now assume that the initial value F0 (x, v) has compact support in x, say F0 (x, v) = 0 for |x| > k. Then it can be seen from the representation that the gain and loss terms (and hence the solution F as well) each has support in |x| ≤ k + t. Perhaps the easiest way to see this is to use induction on the successive approximations, which is straightforward. (The linear case was studied in [16, 17].) Therefore the solution is integrable in x and v under this additional assumption.

References
1. Andréasson, H.: Regularity of the gain term and strong L1 convergence to equilibrium for the relativistic Boltzmann equation. SIAM J. Math. Anal. 27, 1386–1405 (1996)
2. Andréasson, H., Calogero, S., Illner, R.: On Blow–up for Gain–term–only classical and relativistic Boltzmann equations. Math. Methods Appl. Sci. 27, no. 18, 2231–2240 (2004)
3. Andréasson, H.: The Einstein-Vlasov system/kinetic theory. Living Rev. Relativ. (electronic), 5, 2002–7, 33 pp (2002)
4. Asano, K., Ukai, S.: On the Cauchy Problem of the Boltzmann Equation with Soft Potentials. Publ. R.I.M.S. Kyoto Univ. 18, 477–519 (1982)
5. Bardos, C., Degond, P., Golse, F.: A priori estimates and existence results for the Vlasov and Boltzmann Equations. In: Proc. AMS–SIAM Summer Seminar, Santa Fe (1984), AMS Lectures in Appl. Math. Vol. 23, Providence, RI: Amer. Math. Soc., 1986
6. Boltzmann, L.: Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen. Sitzungsberichte der Akademie der Wissenschaften Wien 66, 275–370 (1872)
7. Bellomo, N., Toscani, G.: On the Cauchy Problem for the nonlinear Boltzmann equation. Global existence, uniqueness and asymptotic stability. J. Math. Phys. 26, 334–338 (1985)
8. Caflisch, R.: The Boltzmann Equation with Soft Potentials. Commun. Math. Phys. 74, 71–109 (1980)
9. Cercignani, C.: The Boltzmann Equation and its Applications. New York: Springer–Verlag, 1988
10. Cercignani, C., Kremer, G.: The relativistic Boltzmann equation, theory and applications. Boston: Birkhaeuser, 2002
11. Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. New York: Springer–Verlag, 1994
12. Chapman, S., Cowling, T.G.: The Mathematical Theory of Non–uniform Gases. Third Ed., Cambridge: Cambridge University Press, 1990
13. de Groot, S.R., van Leeuwen, W.A., van Weert, C.G.: Relativistic Kinetic Theory. Amsterdam: North–Holland, 1980
14. DiPerna, R., Lions, P.L.: On the Cauchy Problem for Boltzmann Equations: Global Existence and Weak Stability. Ann. Math. 130, 321–366 (1989)
15. Dudyński, M., Ekiel-Jezewska, M.: Global Existence Proof for Relativistic Boltzmann Equation. J. Stat. Phys. 66, 991–1001 (1992)
16. Dudyński, M., Ekiel-Jezewska, M.: Causality of the linearized relativistic Boltzmann equation. Phys. Rev. Lett. 55, 2831–2834 (1985)
17. Dudyński, M., Ekiel-Jezewska, M.: Errata: Causality of the linearized relativistic Boltzmann equation. Investigación Oper. 6, 2228 (1985)
18. Glassey, R.: The Cauchy problem in kinetic theory. Philadelphia: SIAM, 1996
19. Glassey, R., Strauss, W.: On the Derivatives of the Collision Map of Relativistic Particles. Trans. Th. Stat. Phys. 20, 55–68 (1991)
20. Glassey, R., Strauss, W.: Asymptotic Stability of the Relativistic Maxwellian. Publ. R.I.M.S. Kyoto Univ. 29, 301–347 (1993)
21. Glassey, R., Strauss, W.: Asymptotic Stability of the Relativistic Maxwellian via Fourteen Moments. Trans. Th. Stat. Phys. 24, 657–678 (1995)
22. Grad, H.: Principles of the Kinetic Theory of Gases. Handbuch der Physik 12, Berlin: Springer–Verlag, 1958, pp. 205–294
23. Guo, Y.: The Vlasov–Poisson–Boltzmann system near vacuum. Commun. Math. Phys. 218, 293–313 (2001)
24. Guo, Y.: The Vlasov–Maxwell–Boltzmann system near Maxwellians. Invent. Math. 153, 593–630 (2003)
25. Guo, Y.: Classical solutions to the Boltzmann equation for molecules with an angular cutoff. Arch. Rat. Mech. Anal. 169, no. 4, 305–353 (2003)
26. Hamdache, K.: Quelques résultats pour l’équation de Boltzmann. C. R. Acad. Sci. Paris I 299, 431–434 (1984)
27. Hamdache, K.: Initial boundary value problems for Boltzmann equation: Global existence of weak solutions. Arch. Rat. Mech. Anal. 119, 309–353 (1992)
28. Illner, R., Shinbrot, M.: The Boltzmann Equation, global existence for a rare gas in an infinite vacuum. Commun. Math. Phys. 95, 217–226 (1984)
29. Kawashima, S.: The Boltzmann Equation and Thirteen Moments. Japan J. Appl. Math. 7, 301–320 (1990)
30. Kaniel, S., Shinbrot, M.: The Boltzmann Equation, uniqueness and local existence. Commun. Math. Phys. 58, 65–84 (1978)
31. Landau, E.M., Pitaevskii, L.P.: Physical Kinetics. Vol. 10 of Course of Theoretical Physics, Oxford: Pergamon Press, 1981
32. Nishida, T., Imai, K.: Global Solutions to the initial value problem for the nonlinear Boltzmann Equation. Publ. R.I.M.S. Kyoto Univ. 12, 229–239 (1976)
33. Noutchegueme, N., Tetsadjio, M.E.: Global solutions for the relativistic Boltzmann equation in the homogeneous case on the Minkowski space–time. http://arXiv.org/abs/gr-qc/0307065, 2004
34. Pfaffelmoser, K.: Global classical solutions of the Vlasov–Poisson system in three dimensions for general initial data. J. Diff. Eqs. 95, 281–303 (1992)
35. Lions, P.L.: Compactness in Boltzmann’s equation via Fourier integral operators and applications. J. Math. Kyoto Univ. 34, 391–427 (1994)
36. Polewczak, J.: Classical solution of the Nonlinear Boltzmann equation in all R3. J. Stat. Phys. 50 (3 & 4), 611–632 (1988)
37. Schaeffer, J.: Global Existence of Smooth Solutions to the Vlasov–Poisson System in Three Dimensions. Commun. P.D.E. 16, 1313–1335 (1991)
38. Shizuta, Y.: On the Classical Solutions of the Boltzmann Equation. Commun. Pure Appl. Math. 36, 705–754 (1983)
39. Stewart, J.: Non–Equilibrium Relativistic Kinetic Theory. Lecture Notes in Physics 10, New York: Springer–Verlag 1971
40. Tartar, L.: Some Existence Theorems for semilinear hyperbolic systems in one space variable. MRC Technical Summary Report, Madison, WI, 1980
41. Truesdell, C., Muncaster, R.: Fundamentals of Maxwell Kinetic Theory of a Simple Monatomic Gas (treated as a branch of rational continuum mechanics). New York: Academic Press, 1980
42. Ukai, S.: Solutions of the Boltzmann Equation. Studies in Math. Appl. 18, 37–96 (1986)
43. Ukai, S.: On the Existence of Global Solutions of a mixed problem for the nonlinear Boltzmann equation. Proc. Japan. Acad. 50, 179–184 (1974)

Communicated by P. Constantin
Commun. Math. Phys. 264, 725–740 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1521-z
Communications in
Mathematical Physics
On Lieb-Thirring Inequalities for Schrödinger Operators with Virtual Level
T. Ekholm, R. L. Frank
Royal Institute of Technology, Department of Mathematics, 100 44 Stockholm, Sweden. E-mail: [email protected]; [email protected]
Received: 16 May 2005 / Accepted: 11 October 2005 Published online: 10 February 2006 – © Springer-Verlag 2006
Abstract: We consider the operator $H = -\Delta - V$ in $L_2(\mathbb{R}^d)$, $d\ge 3$. For the moments of its negative eigenvalues we prove the estimate
\[ \operatorname{tr} H_-^{\gamma} \le C_{\gamma,d}\int_{\mathbb{R}^d}\Big(V(x)-\frac{(d-2)^2}{4|x|^2}\Big)_+^{\gamma+\frac d2}\,dx, \qquad \gamma>0. \]
Similar estimates hold for the one-dimensional operator with a Dirichlet condition at the origin and for the two-dimensional Aharonov-Bohm operator.

Introduction

The Lieb-Thirring inequalities estimate a quantum mechanical quantity, namely moments of negative eigenvalues of the Schrödinger operator $-\Delta - V$ in $L_2(\mathbb{R}^d)$, by means of the classical phase space volume. They state for suitable values of $\gamma$ and $d$ (see [LiTh] and, for more recent results, the survey [LaWei2]) that
\[ \operatorname{tr}(-\Delta-V)_-^{\gamma} \le L_{\gamma,d}\int_{\mathbb{R}^d}V_+(x)^{\gamma+\frac d2}\,dx. \tag{0.1} \]
Lately the main topic in connection with these inequalities has been to establish their sharp constants $L_{\gamma,d}$. We are interested in a different question. As is well-known, in dimension $d\ge 3$ a sufficiently weak potential cannot bind a particle. Put differently, if $V\in C_0^\infty(\mathbb{R}^d)$ then $-\Delta-\beta V$ is non-negative for small $\beta>0$. This follows, e.g., from the Hardy inequality
\[ \frac{(d-2)^2}{4}\int_{\mathbb{R}^d}\frac{|u|^2}{|x|^2}\,dx \le \int_{\mathbb{R}^d}|\nabla u|^2\,dx, \qquad u\in C_0^\infty(\mathbb{R}^d). \tag{0.2} \]
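As a quick sanity check of the Hardy inequality, one can compare both sides for a radial test function. The sketch below is illustrative only: it uses $u(r) = e^{-r^2}$, which is not compactly supported (we assume the inequality extends to it by density), reduces both integrals to radial ones so that the common surface-area factor cancels, and approximates them by a midpoint rule on a truncated interval. The function name and the truncation `r_max` are hypothetical choices.

```python
import math

def hardy_sides(d, n=20000, r_max=12.0):
    """Radial parts of both sides of the Hardy inequality for u(r) = exp(-r^2).

    Returns ((d-2)^2/4 * int u^2 r^(d-3) dr, int (u')^2 r^(d-1) dr);
    the common factor |S^(d-1)| is omitted on both sides.
    """
    h = r_max / n
    lhs = 0.0
    rhs = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        u = math.exp(-r * r)
        du = -2.0 * r * u
        lhs += u * u * r ** (d - 3)
        rhs += du * du * r ** (d - 1)
    return (d - 2) ** 2 / 4.0 * lhs * h, rhs * h
```

For `d = 3, 4, 5` the first returned value should not exceed the second. Since this test function is far from the (non-attained) extremal behavior, the gap is comfortable.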
We see that the Lieb-Thirring estimate does not yield a good bound for weak potentials. In the particular case $V(x) = \frac{(d-2)^2}{4|x|^2}$ the l.h.s. of (0.1) is zero whereas the r.h.s. is infinite! In this paper we show the rather unexpected result that the part of the potential which is stronger than the Hardy weight is sufficient to estimate the moments of the negative eigenvalues of $-\Delta-V$. More precisely, we prove the inequality
\[ \operatorname{tr}(-\Delta-V)_-^{\gamma} \le C_{\gamma,d}\int_{\mathbb{R}^d}\Big(V(x)-\frac{(d-2)^2}{4|x|^2}\Big)_+^{\gamma+\frac d2}\,dx \tag{0.3} \]
for any $d\ge 3$ and $\gamma>0$. Note that a direct approach based only on (0.1) and (0.2) leads to $-\Delta-V \ge -\varepsilon\Delta+(1-\varepsilon)\frac{(d-2)^2}{4|x|^2}-V$ for $\varepsilon\in(0,1)$ and hence
\[ \operatorname{tr}(-\Delta-V)_-^{\gamma} \le L_{\gamma,d}\,\varepsilon^{-\frac d2}\int_{\mathbb{R}^d}\Big(V(x)-(1-\varepsilon)\frac{(d-2)^2}{4|x|^2}\Big)_+^{\gamma+\frac d2}\,dx. \]
However, as $\varepsilon$ tends to zero the constant in front of the integral diverges. For a deeper analysis in the case of positive $\varepsilon$ we refer to [St]. In order to prove (0.3) we will choose a slightly different (but equivalent) point of view and establish Lieb-Thirring inequalities for the operator
\[ -\Delta-\frac{(d-2)^2}{4|x|^2}-\beta V \quad\text{in } L_2(\mathbb{R}^d),\ d\ge 3, \tag{0.4} \]
see Theorem 1.1. We note that several works have been devoted to the investigation of this operator. In [Bi] sufficient conditions for the finiteness of the negative spectrum were given. In [BiLa] the asymptotic number of negative eigenvalues in the strong coupling regime $\beta\to\infty$ was investigated and, in particular, it was shown that the Weyl-type formula may be violated. Indeed, an additional contribution of a one-dimensional auxiliary problem appears. We emphasize that such a term does not appear in our Lieb-Thirring estimates, which demonstrates the ‘smoothing effect’ of taking $\gamma>0$. In [Wei] the weak coupling regime $\beta\to 0$ is investigated and a necessary and sufficient condition on $V$ is given for the operator (0.4) to have a negative eigenvalue for any $\beta>0$. This is in particular the case if $V\ge 0$ and stands in sharp contrast to the operator $-\Delta-\beta V$ if $d\ge 3$. This is what we mean by a virtual level.

It will turn out that the operator (0.4) has both two- and d-dimensional features and the main difficulty in establishing a Lieb-Thirring inequality is to estimate the former. Here we rely on weighted Lieb-Thirring inequalities by Egorov-Kondrat’ev [EgKo]. The same approach allows us to obtain Lieb-Thirring inequalities for the one-dimensional analogue of (0.4) with a Dirichlet boundary condition at the origin (see Theorem 1.6) and for the two-dimensional magnetic Schrödinger operator corresponding to the Aharonov-Bohm field when the critical Hardy weight is subtracted (see Theorem 1.10). Our estimates allow also the inclusion of a weight in the spirit of [GlGrMaTh, BlReSt and EgKo].

1. Statement of the Results

1.1. Schrödinger operators in d ≥ 3. The main result of this paper is
Theorem 1.1. Let $d\ge 3$, $\gamma>0$ and $\alpha\ge 0$. Then
\[ \operatorname{tr}\Big(-\Delta-\frac{(d-2)^2}{4|x|^2}-V\Big)_-^{\gamma} \le C_{\gamma,d,\alpha}\int_{\mathbb{R}^d}V(x)_+^{\gamma+\frac{d+\alpha}{2}}|x|^{\alpha}\,dx \tag{1.1} \]
with a constant $C_{\gamma,d,\alpha}$ independent of $V$.

To be more precise, we prove that if $V\in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ and if the r.h.s. of (1.1) is finite then the quadratic form
\[ \int_{\mathbb{R}^d}\Big(|\nabla u|^2-\frac{(d-2)^2}{4|x|^2}|u|^2-V|u|^2\Big)\,dx \tag{1.2} \]
is lower semi-bounded and closable on $C_0^\infty(\mathbb{R}^d)$ and the estimate (1.1) holds for the operator associated with the closure of this form. Specializing to the case $\alpha=0$ we obtain for the standard Schrödinger operator

Corollary 1.2. Let $d\ge 3$ and $\gamma>0$. Then
\[ \operatorname{tr}(-\Delta-V)_-^{\gamma} \le C_{\gamma,d,0}\int_{\mathbb{R}^d}\Big(V(x)-\frac{(d-2)^2}{4|x|^2}\Big)_+^{\gamma+\frac d2}\,dx \tag{1.3} \]
with the constant $C_{\gamma,d,0}$ from (1.1).

Remark 1.3. In particular, if we replace $V$ by $\beta V$, where $V$ is bounded and compactly supported, then the r.h.s. of (1.3) is zero for sufficiently small $\beta>0$. As explained in the introduction, this is an important feature of (1.3) which is not shared by the classical estimate (0.1).

Remark 1.4. Neither Theorem 1.1 nor Corollary 1.2 hold for $\gamma=0$. This follows from the fact that if $V\ge 0$, $V\not\equiv 0$, then the operator $-\Delta-\frac{(d-2)^2}{4|x|^2}-\beta V$ has a negative eigenvalue for all $\beta>0$, see Remark 8.2 in [Wei] or, for the case of a spherically symmetric $V$, our Proposition 3.2 below.

Remark 1.5. Our constants are explicit and given in (2.5). However, they might be strongly overblown. It would be challenging to find their sharp values.
1.2. Schrödinger operators on the semi-axis. Our result has a one-dimensional analogue, which is an important ingredient in the proof of Theorem 1.1, but also of independent interest. We consider the operator $-\frac{d^2}{dr^2}-\frac{1}{4r^2}-V$ in $L_2(\mathbb{R}_+)$ with Dirichlet boundary conditions at the origin and prove

Theorem 1.6. Let $\gamma>0$ and $\alpha\ge 0$ such that $\gamma+\frac{1+\alpha}{2}>1$. Then
\[ \operatorname{tr}\Big(-\frac{d^2}{dr^2}-\frac{1}{4r^2}-V\Big)_-^{\gamma} \le C_{\gamma,1,\alpha}\int_{\mathbb{R}_+}V(r)_+^{\gamma+\frac{1+\alpha}{2}}r^{\alpha}\,dr \]
with a constant $C_{\gamma,1,\alpha}$ independent of $V$.
Similarly as before, the precise statement involves the operator associated with the closure of the form
\[ \int_{\mathbb{R}_+}\Big(|f'|^2-\frac{1}{4r^2}|f|^2-V|f|^2\Big)\,dr \tag{1.4} \]
on $C_0^\infty(\mathbb{R}_+)$. We stress that $\mathbb{R}_+ = (0,\infty)$. Note also that $\frac{1}{4r^2}$ is the critical Hardy weight if $d=1$, see (3.1) below. We will prove this theorem in two steps. The case $\alpha\ge 1$ is dealt with in Sect. 2 using results of [EgKo] and the case $0\le\alpha\le 1$ in Sect. 3 using explicit diagonalization of the operator $-\frac{d^2}{dr^2}-\frac{1}{4r^2}$.

Remark 1.7. If $\alpha=0$ and $\gamma>\frac12$ this leads to a Lieb-Thirring-type inequality in the spirit of Corollary 1.2. It would be interesting to extend the estimate to the critical case $\gamma=\frac12$.

Remark 1.8. If $\alpha\ge 1$ we can take any $\gamma>0$. However, a similar estimate for $\gamma=0$ cannot hold due to the virtual level of $-\frac{d^2}{dr^2}-\frac{1}{4r^2}$, see Proposition 3.2 below.

Remark 1.9. If $\alpha=1$ this is a Bargmann-type inequality. Recall that
\[ \operatorname{tr}\Big(-\frac{d^2}{dr^2}+\frac{s(s+1)}{r^2}-V\Big)_-^{\gamma} \le \frac{1}{(1+2s)(\gamma+1)}\int_{\mathbb{R}_+}V(r)_+^{\gamma+1}\,r\,dr \tag{1.5} \]
for $s>-\frac12$ and $\gamma\ge 0$. Indeed, the proof of the Bargmann inequality in [Si2] (Theorem 7.3) applies for any $s>-\frac12$ and $\gamma=0$. By the argument of Aizenman-Lieb [AiLi] one extends this inequality to $\gamma\ge 0$. When $s\to-\frac12$ the constant in (1.5) diverges to infinity. Our Theorem 1.6 with $\alpha=1$ yields (1.5) for $s=-\frac12$ and $\gamma>0$ with a finite constant. Indeed, the constant can be chosen as $2\pi L_{\gamma,2}$ with $L_{\gamma,2}$ from (0.1), see Remark 2.3.

1.3. Aharonov-Bohm operators in d = 2. If $d=2$ then the Hardy inequality (0.2) becomes trivial. In this case it is interesting to consider the magnetic Schrödinger operator $(-i\nabla-\phi A)^2$, where $\phi\in\mathbb{R}$ and $A$ is the Aharonov-Bohm magnetic vector potential,
\[ A(x) = |x|^{-2}(-x_2,x_1), \qquad x\in\mathbb{R}^2\setminus\{0\}. \]
As usual, $(-i\nabla-\phi A)^2$ is defined as the Friedrichs extension of the corresponding differential operator on $C_0^\infty(\mathbb{R}^2\setminus\{0\})$. By gauge invariance we can assume that $-\frac12<\phi\le\frac12$. Recall (see [LaWei1]) the Hardy-type inequality
\[ \phi^2\int_{\mathbb{R}^2}\frac{|u|^2}{|x|^2}\,dx \le \int_{\mathbb{R}^2}|(-i\nabla-\phi A)u|^2\,dx, \qquad u\in C_0^\infty(\mathbb{R}^2\setminus\{0\}). \]
Here the constant $\phi^2$ is sharp. Our result is

Theorem 1.10. Let $\gamma>0$, $\alpha\ge 0$ and $-\frac12<\phi\le\frac12$. Then
\[ \operatorname{tr}\Big((-i\nabla-\phi A)^2-\frac{\phi^2}{|x|^2}-V\Big)_-^{\gamma} \le C_{\gamma,2,\alpha}\int_{\mathbb{R}^2}V(x)_+^{\gamma+\frac{2+\alpha}{2}}|x|^{\alpha}\,dx \]
with a constant $C_{\gamma,2,\alpha}$ independent of $V$ and $\phi$. The proof is found in Sect. 4.
2. Proof of Theorem 1.1

In this section we assume, unless stated otherwise, that $d \ge 3$ and write
$$H_0 = -\Delta - \frac{(d-2)^2}{4|x|^2}.$$
The main idea in the proof of Theorem 1.1 is to consider $H_0 - V$ separately on the space of spherically symmetric functions and on its orthogonal complement. An essential ingredient in our study of the operator on the former space will be the following result from [EgKo].

Proposition 2.1. Let $d \ge 2$, $\gamma > 0$ and $\alpha \ge 0$. Then
$$\mathrm{tr}(-\Delta - V)_-^{\gamma} \le C^{EK}_{\gamma,d,\alpha} \int_{\mathbb{R}^d} V(x)_+^{\gamma + \frac{d+\alpha}{2}} |x|^\alpha\,dx$$
with a constant $C^{EK}_{\gamma,d,\alpha}$ independent of $V$.

For the convenience of the reader we will give a proof of this proposition in the appendix. We remark that for $\alpha = 0$ this coincides with the classical Lieb-Thirring inequality (0.1). The inclusion of the weight $|x|^\alpha$ increases the power of $V$ by $\frac{\alpha}{2}$ as compared to (0.1). That this is necessary can easily be seen by scaling of the space variables. We note that the result holds also if $d \ge 3$ and $\gamma = 0$ and if $d = 1$ and $\gamma > \frac{1+\alpha}{2}$.

From Proposition 2.1 we deduce now the first part of Theorem 1.6. Recall that we consider the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} - V$ in $L_2(\mathbb{R}_+)$ with Dirichlet boundary condition.

Corollary 2.2. Let $\gamma > 0$ and $\alpha \ge 1$. Then
$$\mathrm{tr}\Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - V \Big)_-^{\gamma} \le C_{\gamma,1,\alpha} \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}} r^\alpha\,dr$$
with a constant $C_{\gamma,1,\alpha}$ independent of $V$.

Proof. The operator $-\Delta - V(|\cdot|)$ in $L_2(\mathbb{R}^2)$ is unitarily equivalent to the direct sum $\bigoplus_{n\in\mathbb{Z}} (h_n - V)$ in $\bigoplus_{n\in\mathbb{Z}} L_2(\mathbb{R}_+)$, where we define
$$h_n - V := -\frac{d^2}{dr^2} - \frac{1}{4r^2} + \frac{n^2}{r^2} - V$$
as quadratic form on $C_0^\infty(\mathbb{R}_+)$. (Here we used that $C_0^\infty(\mathbb{R}^2 \setminus \{0\})$ is a form core for $-\Delta - V$.) Hence Proposition 2.1 yields
$$\begin{aligned}
\mathrm{tr}_{L_2(\mathbb{R}_+)}\Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - V \Big)_-^{\gamma}
&\le \mathrm{tr}_{L_2(\mathbb{R}^2)}\big(-\Delta - V(|\cdot|)\big)_-^{\gamma} \\
&\le C^{EK}_{\gamma,2,\alpha-1} \int_{\mathbb{R}^2} V(|x|)_+^{\gamma + \frac{1+\alpha}{2}} |x|^{\alpha-1}\,dx \\
&= 2\pi\, C^{EK}_{\gamma,2,\alpha-1} \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}} r^{\alpha}\,dr
\end{aligned}$$
as claimed.
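The unitary equivalence used in this proof can be verified by a direct computation. The following worked equation is a sketch that assumes the standard substitution $u(r,\theta) = r^{-1/2} f(r) e^{in\theta}$ in polar coordinates; the substitution itself is not spelled out in the text above.

```latex
\begin{align*}
-\Delta u
  &= -\Big(\partial_r^2 + \tfrac{1}{r}\,\partial_r + \tfrac{1}{r^2}\,\partial_\theta^2\Big)
     \big(r^{-1/2} f(r)\, e^{in\theta}\big) \\
  &= r^{-1/2}\Big(-f''(r) - \frac{1}{4r^2}\, f(r) + \frac{n^2}{r^2}\, f(r)\Big) e^{in\theta},
\end{align*}
% and, since dx = r dr d\theta, the map f -> u preserves norms up to the factor 2\pi:
\[
\int_{\mathbb{R}^2} |u|^2 \, dx = 2\pi \int_{\mathbb{R}_+} |f(r)|^2 \, dr .
\]
```

The term $-\frac{1}{4r^2}$ thus arises from the factor $r^{-1/2}$, and the channel $n = 0$ is exactly the operator of Corollary 2.2.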
Remark 2.3. In the case $\alpha = 1$ we can apply the Lieb-Thirring inequality (0.1) instead of Proposition 2.1. This shows that the sharp value of the constant $C_{\gamma,1,1}$ is bounded from above by $2\pi L_{\gamma,2}$ with $L_{\gamma,2}$ from (0.1).

Corollary 2.2 will allow us to treat the part of $H_0 - V$ on spherically symmetric functions. On the orthogonal complement of that space one has an improved Hardy inequality.

Lemma 2.4. Let $d \ge 2$. Then
$$\frac{d^2}{4} \int_{\mathbb{R}^d} \frac{|u|^2}{|x|^2}\,dx \le \int_{\mathbb{R}^d} |\nabla u|^2\,dx$$
for all $u \in C_0^\infty(\mathbb{R}^d)$ satisfying $\int_{S^{d-1}} u(r\omega)\,d\omega = 0$ for all $r \ge 0$.

This inequality appears, e.g., in [BiLa]. We sketch the simple proof.

Proof. We substitute $u = |x|^{(2-d)/2} v$ and obtain
$$\begin{aligned}
\int_{\mathbb{R}^d} \Big( |\nabla u|^2 - \frac{(d-2)^2}{4|x|^2} |u|^2 \Big)\,dx
&= \int_{\mathbb{R}^d} |\nabla v|^2 |x|^{2-d}\,dx \\
&= \int_0^\infty \int_{S^{d-1}} \Big( \Big|\frac{\partial v}{\partial r}\Big|^2 + \frac{|\nabla_\theta v|^2}{r^2} \Big)\,d\theta\; r\,dr \\
&\ge \int_0^\infty r^{-1} \int_{S^{d-1}} |\nabla_\theta v|^2\,d\theta\,dr.
\end{aligned} \tag{2.1}$$
For fixed $r$ the function $v(r\,\cdot)$ is orthogonal to constants, i.e., to the first eigenfunction of the Laplace-Beltrami operator on $S^{d-1}$. Since the next eigenvalue is $d - 1$ we find
$$\int_{S^{d-1}} |\nabla_\theta v(r\theta)|^2\,d\theta \ge (d-1) \int_{S^{d-1}} |v(r\theta)|^2\,d\theta.$$
Multiplying by $r^{-1}$ and integrating yields
$$\int_0^\infty r^{-1} \int_{S^{d-1}} |\nabla_\theta v(r\theta)|^2\,d\theta\,dr \ge (d-1) \int_{\mathbb{R}^d} |x|^{-2} |u|^2\,dx.$$
Combining this with (2.1) we obtain the result.
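For completeness, the arithmetic behind the last step of this proof can be written out. Moving the Hardy term in (2.1) to the right-hand side and inserting the spectral-gap bound gives:

```latex
\[
\int_{\mathbb{R}^d} |\nabla u|^2 \, dx
\;\ge\; \Big( \frac{(d-2)^2}{4} + (d-1) \Big) \int_{\mathbb{R}^d} \frac{|u|^2}{|x|^2} \, dx ,
\qquad
\frac{(d-2)^2}{4} + (d-1) = \frac{d^2 - 4d + 4 + 4d - 4}{4} = \frac{d^2}{4} .
\]
```

So removing the spherically symmetric part improves the Hardy constant from $\frac{(d-2)^2}{4}$ to $\frac{d^2}{4}$.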
Now we are in position to give the

Proof of Theorem 1.1. By the variational principle it suffices to prove the result for $V \ge 0$. Moreover, we will assume that the r.h.s. of (1.1) is finite and that the quadratic form (1.2) is lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$. (Note that these assumptions are satisfied, e.g., if $V$ is bounded and has compact support.) Since the form (1.2) is closable we can define $H_0 - V$ as the operator associated with this form. At the end of the proof we use a standard approximation argument to show that the finiteness of the r.h.s. of (1.1) implies the lower semi-boundedness.

We denote by $P$ the projection onto spherically symmetric functions,
$$(Pu)(x) := |S^{d-1}|^{-1} \int_{S^{d-1}} u(|x|\,\omega)\,d\omega, \qquad x \in \mathbb{R}^d,$$
and put $Q := I - P$. Note that $P$ and $Q$ commute with $H_0$. Moreover, for $u \in C_0^\infty(\mathbb{R}^d)$ one has
$$2\,\mathrm{Re}\,(PVQu, u) \le 2\,\|V^{1/2} Qu\|\,\|V^{1/2} Pu\| \le (PVPu, u) + (QVQu, u),$$
which implies the operator inequality $PVQ + QVP \le PVP + QVQ$. It follows that
$$H_0 - V = P(H_0 - V)P + Q(H_0 - V)Q - PVQ - QVP \ge P(H_0 - 2V)P + Q(H_0 - 2V)Q,$$
and hence
$$\mathrm{tr}(H_0 - V)_-^{\gamma} \le \mathrm{tr}\big(P(H_0 - 2V)P\big)_-^{\gamma} + \mathrm{tr}\big(Q(H_0 - 2V)Q\big)_-^{\gamma}. \tag{2.2}$$
We consider the two terms separately and begin with the second one. By Lemma 2.4 we find for all $0 < \rho \le 1$ that
$$Q(H_0 - 2V)Q \ge \rho\, Q\big(-\Delta - \rho^{-1} 2V\big)Q + \tfrac14 \big( (1-\rho) d^2 - (d-2)^2 \big)\, Q |x|^{-2} Q.$$
We choose $\rho$ such that $(1-\rho) d^2 = (d-2)^2$ and obtain from Proposition 2.1 (or (0.1) if $\alpha = 0$) that
$$\begin{aligned}
\mathrm{tr}\big(Q(H_0 - 2V)Q\big)_-^{\gamma}
&\le \rho^{\gamma}\, \mathrm{tr}\big(Q(-\Delta - \rho^{-1} 2V)Q\big)_-^{\gamma} \\
&\le \rho^{\gamma}\, \mathrm{tr}\big(-\Delta - \rho^{-1} 2V\big)_-^{\gamma} \\
&\le \rho^{-\frac{d+\alpha}{2}}\, 2^{\gamma + \frac{d+\alpha}{2}}\, C^{EK}_{\gamma,d,\alpha} \int_{\mathbb{R}^d} V(x)^{\gamma + \frac{d+\alpha}{2}} |x|^\alpha\,dx.
\end{aligned} \tag{2.3}$$
Now we turn to the term $\mathrm{tr}\big(P(H_0 - 2V)P\big)_-^{\gamma}$. We define the spherical average of $V$ by
$$\tilde V(r) := |S^{d-1}|^{-1} \int_{S^{d-1}} V(r\omega)\,d\omega, \qquad r \in \mathbb{R}_+,$$
and note that (the non-trivial part of) $P(H_0 - 2V)P$ is unitarily equivalent to the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} - 2\tilde V$ in $L_2(\mathbb{R}_+)$. By Corollary 2.2 (with $\alpha$ replaced by $\alpha + d - 1$) we obtain
$$\mathrm{tr}\big(P(H_0 - 2V)P\big)_-^{\gamma} \le 2^{\gamma + \frac{d+\alpha}{2}}\, C_{\gamma,1,\alpha+d-1} \int_{\mathbb{R}_+} \tilde V(r)^{\gamma + \frac{d+\alpha}{2}}\, r^{\alpha+d-1}\,dr.$$
Now Hölder's (or Jensen's) inequality implies that
$$\tilde V(r)^{\gamma + \frac{d+\alpha}{2}} \le |S^{d-1}|^{-1} \int_{S^{d-1}} V(r\omega)^{\gamma + \frac{d+\alpha}{2}}\,d\omega, \qquad r \in \mathbb{R}_+,$$
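and hence
$$\begin{aligned}
\mathrm{tr}\big(P(H_0 - 2V)P\big)_-^{\gamma}
&\le 2^{\gamma + \frac{d+\alpha}{2}}\, |S^{d-1}|^{-1}\, C_{\gamma,1,\alpha+d-1} \int_{\mathbb{R}_+} \int_{S^{d-1}} V(r\omega)^{\gamma + \frac{d+\alpha}{2}}\, r^{\alpha+d-1}\,d\omega\,dr \\
&= 2^{\gamma + \frac{d+\alpha}{2}}\, |S^{d-1}|^{-1}\, C_{\gamma,1,\alpha+d-1} \int_{\mathbb{R}^d} V(x)^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\,dx.
\end{aligned} \tag{2.4}$$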
Adding (2.3) and (2.4) we obtain the assertion in view of (2.2) with a constant satisfying
$$C_{\gamma,d,\alpha} \le 2^{\gamma + \frac{d+\alpha}{2}} \Big( \rho^{-\frac{d+\alpha}{2}}\, C^{EK}_{\gamma,d,\alpha} + |S^{d-1}|^{-1}\, C_{\gamma,1,\alpha+d-1} \Big). \tag{2.5}$$
To complete the proof it remains to show that if $0 \le V \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ is such that the r.h.s. of (1.1) is finite then the quadratic form (1.2) is lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$. To see this, choose bounded, compactly supported functions $0 \le V_n \le V$ such that $V_n \to V$ a.e. and
$$\int_{\mathbb{R}^d} \big(V(x) - V_n(x)\big)^{\gamma + \frac{d+\alpha}{2}} |x|^\alpha\,dx \to 0. \tag{2.6}$$
The operators $H_0 - V_n$ are well-defined and (1.1) holds for them. In particular, $\lambda(V_n) := \inf \sigma(H_0 - V_n)$ satisfies
$$\lambda(V_n) \ge -C_{\gamma,d,\alpha}^{1/\gamma} \Big( \int_{\mathbb{R}^d} V_n(x)^{\gamma + \frac{d+\alpha}{2}} |x|^\alpha\,dx \Big)^{1/\gamma}.$$
Hence for any $u \in C_0^\infty(\mathbb{R}^d)$ one has
$$\int_{\mathbb{R}^d} \Big( |\nabla u|^2 - \frac{(d-2)^2}{4|x|^2} |u|^2 - V_n |u|^2 \Big)\,dx \ge -C_{\gamma,d,\alpha}^{1/\gamma} \Big( \int_{\mathbb{R}^d} V_n(x)^{\gamma + \frac{d+\alpha}{2}} |x|^\alpha\,dx \Big)^{1/\gamma} \|u\|^2.$$
Using dominated convergence and (2.6) we can pass to the limit $n \to \infty$ and find that also the form (1.2) is bounded from below on $C_0^\infty(\mathbb{R}^d)$. This completes the proof.

Proof of Corollary 1.2. Assume that the r.h.s. of (1.3) is finite. Then according to our comments after Theorem 1.1 the form
$$\int_{\mathbb{R}^d} \big( |\nabla u|^2 - \tilde V |u|^2 \big)\,dx, \tag{2.7}$$
where
$$\tilde V(x) := \frac{(d-2)^2}{4|x|^2} + \Big( V(x) - \frac{(d-2)^2}{4|x|^2} \Big)_+,$$
is lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$ and we denote by $-\Delta - \tilde V$ the operator associated with its closure. Since $V \le \tilde V$ the form (2.7), with $\tilde V$ replaced by $V$, is also lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$ and the associated operator satisfies $-\Delta - V \ge -\Delta - \tilde V$. Now Corollary 1.2 follows from Theorem 1.1 with $\alpha = 0$ by the variational principle.
3. Proof of Theorem 1.6

Recall the one-dimensional Hardy inequality
$$\int_{\mathbb{R}_+} \frac{|f(r)|^2}{r^2}\,dr \le 4 \int_{\mathbb{R}_+} |f'(r)|^2\,dr, \qquad f \in C_0^\infty(\mathbb{R}_+). \tag{3.1}$$
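As a sanity check on (3.1) and on the sharpness of the constant 4, one can evaluate both sides for a one-parameter family of test functions. The sketch below is illustrative only: the family $f(r) = r^a e^{-r}$ with $a > \frac12$ and all function names are assumptions of this example, and these $f$ are not in $C_0^\infty(\mathbb{R}_+)$, so the check relies on the inequality extending to them by density.

```python
import math

def hardy_sides_closed_form(a):
    """For f(r) = r**a * exp(-r) with a > 1/2, return the pair
    (int |f|^2 / r^2 dr, 4 * int |f'|^2 dr) over (0, inf) in closed form."""
    if a <= 0.5:
        raise ValueError("a > 1/2 is needed for both integrals to converge")
    g = math.gamma(2 * a - 1) / 2 ** (2 * a - 1)  # int_0^inf r^(2a-2) e^(-2r) dr
    return g, 4 * (a / 2) * g                      # int |f'|^2 dr equals (a/2) * g

def hardy_sides_numeric(a, upper=40.0, n=200_000):
    """Same two quantities by the midpoint rule on (0, upper); intended for a >= 1,
    where both integrands are bounded near r = 0."""
    h = upper / n
    lhs = rhs = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        f = r ** a * math.exp(-r)
        df = (a * r ** (a - 1) - r ** a) * math.exp(-r)  # derivative of f
        lhs += f * f / (r * r)
        rhs += 4 * df * df
    return lhs * h, rhs * h

if __name__ == "__main__":
    for a in (0.51, 1.0, 2.0):
        lhs, rhs = hardy_sides_closed_form(a)
        print(a, lhs, rhs, lhs / rhs)
```

For this family the ratio of the left side to the right side is $1/(2a)$, which stays below 1 and approaches 1 as $a \to \frac12$; this is consistent with 4 being the best constant in (3.1).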
The inequality (3.1) allows one to define the non-negative operator
$$h_0 = -\frac{d^2}{dr^2} - \frac{1}{4r^2} \qquad \text{in } L_2(\mathbb{R}_+) \tag{3.2}$$
as the Friedrichs extension of the quadratic form (1.4) with $V \equiv 0$ on $C_0^\infty(\mathbb{R}_+)$. This operator can be diagonalized explicitly. Indeed, let $J_0$ be the Bessel function of the first kind of order zero (see [AbSt]). Then
$$(F_0 f)(k) := \int_0^\infty \sqrt{kr}\, J_0(kr)\, f(r)\,dr, \qquad k \in \mathbb{R}_+,$$
initially defined for $f \in C_0^\infty(\mathbb{R}_+)$, can be extended to a unitary operator $F_0 : L_2(\mathbb{R}_+) \to L_2(\mathbb{R}_+)$. It has the property
$$(F_0 h_0 f)(k) = k^2 (F_0 f)(k), \qquad k \in \mathbb{R}_+, \tag{3.3}$$
for all $f \in D(h_0)$. (These facts are essentially contained in Chapter 4 of [StWei].)

We denote by $N(\tau, h_0 - V)$ the number of eigenvalues less than $-\tau$, counting multiplicities, of the operator $h_0 - V$ in $L_2(\mathbb{R}_+)$. Our proof of Theorem 1.6 relies on the following

Lemma 3.1. Let $q > 1$, $0 \le \alpha \le 1$ such that $2q - \alpha > 1$. Then
$$N(\tau, h_0 - V) \le C_{\alpha,q}\, \tau^{-q + \frac{1+\alpha}{2}} \int_{\mathbb{R}_+} V(r)_+^{q}\, r^\alpha\,dr, \qquad \tau > 0, \tag{3.4}$$
with a constant $C_{\alpha,q}$ independent of $V$.

What we precisely prove is that if $V \in L_{1,\mathrm{loc}}(\mathbb{R}_+)$ and if the r.h.s. of (3.4) is finite, then the form (1.4) is closed and lower semi-bounded on $D\big(h_0^{1/2}\big)$ and for the corresponding self-adjoint operator $h_0 - V$ the estimate (3.4) holds.

Before we begin the proof we recall (see [BiSo1, Si2]) that $S_p$ denotes the class of compact operators $K$ (in a given Hilbert space, in our case in $L_2(\mathbb{R}_+)$) such that
$$\|K\|_p^p := \mathrm{tr}\,(K^* K)^{p/2} < \infty.$$
We will use the following fact (see [LiTh]). If $q \ge 1$ and $A$, $B$ are self-adjoint, non-negative operators such that $A^q B^q \in S_2$, then $AB \in S_{2q}$ and
$$\|AB\|_{2q}^{2q} \le \|A^q B^q\|_2^2. \tag{3.5}$$
Proof of Lemma 3.1. Scaling with respect to the space variables shows that it is enough to consider the case $\tau = 1$. Moreover, by the variational principle we may assume $V \ge 0$. By the Birman-Schwinger principle and the inequality (3.5) we obtain
$$N(1, h_0 - V) \le \big\| V^{1/2} (h_0 + I)^{-1/2} \big\|_{2q}^{2q} \le \big\| V^{q/2} (h_0 + I)^{-q/2} \big\|_2^2.$$
It follows from this estimate that we can restrict ourselves to, say, bounded and compactly supported $V$. The general result as well as the comment we made after the lemma are derived then in a standard way.

It follows from (3.3) that the operator $V^{q/2} (h_0 + I)^{-q/2} F_0^*$ has the integral kernel
$$V(r)^{q/2} \sqrt{rk}\, J_0(rk)\, (k^2 + 1)^{-q/2}, \qquad r, k \in \mathbb{R}_+,$$
and therefore
$$\big\| V^{q/2} (h_0 + I)^{-q/2} \big\|_2^2 = \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} rk\, J_0^2(rk)\, (k^2 + 1)^{-q}\, V(r)^q\,dk\,dr.$$
Recall (see [AbSt]) that $J_0$ is a continuous function with $J_0(0) = 1$ and
$$J_0(x) \sim \sqrt{\frac{2}{\pi x}}\, \cos(x - \pi/4) \qquad \text{as } x \to \infty.$$
Hence with $c_\alpha := \sup_{x>0} x^{(1-\alpha)/2} |J_0(x)| < \infty$ we can estimate
$$N(1, h_0 - V) \le c_\alpha^2 \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} (rk)^\alpha (k^2 + 1)^{-q}\, V(r)^q\,dk\,dr = C_{\alpha,q} \int_{\mathbb{R}_+} V(r)^q\, r^\alpha\,dr.$$
Here $C_{\alpha,q} := c_\alpha^2 \int_{\mathbb{R}_+} k^\alpha (1 + k^2)^{-q}\,dk$ is finite in view of our assumptions.
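The finiteness of the $k$-integral in $C_{\alpha,q}$ is exactly the condition $2q - \alpha > 1$ of the lemma. A minimal sketch, assuming the standard beta-function identity $\int_0^\infty k^\alpha (1+k^2)^{-q}\,dk = \frac12 B\big(\frac{\alpha+1}{2},\, q - \frac{\alpha+1}{2}\big)$ (this closed form is not stated in the text, and the function names are hypothetical), compares the identity with a direct quadrature:

```python
import math

def beta(a, b):
    """Euler beta function B(a, b) expressed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def k_integral_closed_form(alpha, q):
    """Closed form of int_0^inf k^alpha (1 + k^2)^(-q) dk.
    Converges only if alpha > -1 and 2q - alpha > 1."""
    if not (alpha > -1 and 2 * q - alpha > 1):
        raise ValueError("integral diverges unless alpha > -1 and 2q - alpha > 1")
    a = (alpha + 1) / 2
    return 0.5 * beta(a, q - a)

def k_integral_numeric(alpha, q, n=200_000):
    """Midpoint rule after the substitution k = tan(theta), which turns the
    integrand into sin(theta)^alpha * cos(theta)^(2q - 2 - alpha) on (0, pi/2)."""
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        total += math.sin(th) ** alpha * math.cos(th) ** (2 * q - 2 - alpha)
    return total * h

if __name__ == "__main__":
    for alpha, q in ((0.5, 1.5), (1.0, 2.0)):
        print(alpha, q, k_integral_closed_form(alpha, q), k_integral_numeric(alpha, q))
```

The constant $c_\alpha$ itself is not computed here; it depends on the supremum of $x^{(1-\alpha)/2}|J_0(x)|$, which would require a Bessel-function routine.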
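Given Lemma 3.1 we obtain in a standard manner the

Proof of Theorem 1.6. The case $\alpha \ge 1$ was already proven in Corollary 2.2. We assume now $0 \le \alpha \le 1$. The operator inequality
$$h_0 - V + t \ge h_0 - \Big(V - \frac{t}{2}\Big)_+ + \frac{t}{2}$$
implies $N(t, h_0 - V) \le N\big(\tfrac{t}{2},\, h_0 - (V - \tfrac{t}{2})_+\big)$ and hence
$$\begin{aligned}
\mathrm{tr}(h_0 - V)_-^{\gamma}
&= \gamma \int_{\mathbb{R}_+} N(t, h_0 - V)\, t^{\gamma-1}\,dt \\
&\le \gamma \int_{\mathbb{R}_+} N\big(\tfrac{t}{2},\, h_0 - (V - \tfrac{t}{2})_+\big)\, t^{\gamma-1}\,dt \\
&= \gamma\, 2^{\gamma} \int_{\mathbb{R}_+} N\big(\tau,\, h_0 - (V - \tau)_+\big)\, \tau^{\gamma-1}\,d\tau.
\end{aligned}$$
Now we fix $1 < q < \gamma + \frac{1+\alpha}{2}$ and apply Lemma 3.1 to obtain
$$\begin{aligned}
\mathrm{tr}(h_0 - V)_-^{\gamma}
&\le \gamma\, 2^{\gamma}\, C_{\alpha,q} \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} (V(r) - \tau)_+^{q}\, \tau^{\gamma - q + \frac{\alpha-1}{2}}\,d\tau\; r^\alpha\,dr \\
&= \gamma\, 2^{\gamma}\, C_{\alpha,q}\, B\big(\gamma + \tfrac{1+\alpha}{2} - q,\; q + 1\big) \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}}\, r^\alpha\,dr,
\end{aligned}$$
where $B(a, b) = \int_0^1 s^{a-1} (1 - s)^{b-1}\,ds$ is the beta function. Finally, one may optimize over all $1 < q < \gamma + \frac{1+\alpha}{2}$. This establishes the theorem.

In Theorem 1.6 it is impossible to take $\gamma = 0$ since the operator $h_0$ has a virtual level. More precisely, one has

Proposition 3.2. Let $V$ obey $\int_{\mathbb{R}_+} |V(r)|^{1+\delta}\, r\,dr < \infty$ and $\int_{\mathbb{R}_+} |V(r)| (1 + r^\delta)\, r\,dr < \infty$ for some $\delta > 0$. Then $h_0 - \beta V$ has a negative eigenvalue for all $\beta > 0$ if and only if $\int_{\mathbb{R}_+} V(r)\, r\,dr \ge 0$, $V \not\equiv 0$. In this case, for sufficiently small $\beta$ the eigenvalue $\lambda(\beta)$ is unique and satisfies as $\beta \to 0$,
$$\lambda(\beta) \sim -\exp\Big( -\Big( \frac{\beta}{2} \int_{\mathbb{R}_+} V(r)\, r\,dr \Big)^{-1} \Big) \qquad \text{if } \int_{\mathbb{R}_+} V(r)\, r\,dr > 0,$$
$$\lambda(\beta) \sim -\exp\Big( -\frac{c}{\beta^2} \Big) \qquad \text{if } \int_{\mathbb{R}_+} V(r)\, r\,dr = 0$$
with a suitable constant $c = c(V) > 0$.

Here we use the notation $\lambda(\beta) \sim -\exp\big(-a \beta^{-\rho}\big)$ meaning
$$\lim_{\beta \to 0}\, -\beta^{\rho} \log(-\lambda(\beta)) = a.$$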
The proof uses the same idea as the proof of Corollary 2.2.

Proof. We recall that $-\Delta - \beta V(|\cdot|)$ in $L_2(\mathbb{R}^2)$ is unitarily equivalent to $\bigoplus_{n\in\mathbb{Z}} (h_n - \beta V)$ in $\bigoplus_{n\in\mathbb{Z}} L_2(\mathbb{R}_+)$, where $h_n - \beta V$ are similarly defined as in the proof of Corollary 2.2. Clearly $\bigoplus_{n\in\mathbb{Z}} (h_n - \beta V)$ has a negative eigenvalue if and only if $h_0 - \beta V$ has a negative eigenvalue. The assertion follows therefore from Theorem 3.4 in [Si1].

4. Proof of Theorem 1.10

The proof of Theorem 1.10 is similar to the proof of Theorem 1.1 and we only sketch the major steps. We write
$$H_\phi := (-i\nabla - \phi A)^2 - \frac{\phi^2}{|x|^2}.$$
With polar coordinates $(r, \theta)$ we define the projections $P_n$, $n \in \mathbb{Z}$, in $L_2(\mathbb{R}^2)$,
$$(P_n u)(r, \theta) := \frac{1}{2\pi} \int_{-\pi}^{\pi} u(r, \omega)\, e^{-in\omega}\,d\omega\; e^{in\theta}, \qquad r > 0,\ \theta \in (-\pi, \pi).$$
The subspace $P_n L_2(\mathbb{R}^2)$ reduces $H_\phi$ and its part in this space is unitarily equivalent to the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} + \frac{n(n - 2\phi)}{r^2}$ in $L_2(\mathbb{R}_+)$, defined as quadratic form on $C_0^\infty(\mathbb{R}_+)$. Note that this operator coincides with (3.2) if $n = 0$ and, if $\phi = \frac12$, also if $n = 1$. This means that $H_\phi$ has one virtual level if $|\phi| < \frac12$ and two virtual levels if $\phi = \frac12$.
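The coefficient $n(n - 2\phi)$ can be traced as follows. The worked equation below is a sketch assuming the usual polar form of the Aharonov-Bohm potential, $A = r^{-1} e_\theta$, so that on the $n$-th angular channel the tangential component of $-i\nabla - \phi A$ acts as multiplication by $(n - \phi)/r$:

```latex
\[
\frac{(n-\phi)^2}{r^2} - \frac{\phi^2}{r^2} = \frac{n^2 - 2n\phi}{r^2} = \frac{n(n-2\phi)}{r^2},
\qquad
n(n-2\phi) = 0 \iff n = 0 \ \text{or}\ n = 2\phi .
\]
```

Since $n$ is an integer and $-\frac12 < \phi \le \frac12$, the second root $n = 2\phi$ is admissible with $n \neq 0$ only for $\phi = \frac12$, $n = 1$, which accounts for the count of virtual levels stated above.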
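Proof of Theorem 1.10. We assume $V \ge 0$ and put $P := P_{-1} + P_0$ if $-\frac12 < \phi \le 0$, $P := P_0 + P_1$ if $0 < \phi \le \frac12$ and $Q := I - P$. (We emphasize that one may also take $P = P_0$ if $|\phi| < \frac12$, but then the constants below will blow up as $|\phi| \to \frac12$.) As in the proof of Theorem 1.1 one finds
$$\mathrm{tr}(H_\phi - V)_-^{\gamma} \le \mathrm{tr}\big(P(H_\phi - 2V)P\big)_-^{\gamma} + \mathrm{tr}\big(Q(H_\phi - 2V)Q\big)_-^{\gamma}.$$
On $Q L_2(\mathbb{R}^2)$ we use the estimate $Q H_\phi Q \ge (1 - |\phi|)\, Q(-\Delta)Q$ which is easily obtained by decomposition into the subspaces $P_n L_2(\mathbb{R}^2)$. By Proposition 2.1 we conclude
$$\begin{aligned}
\mathrm{tr}\big(Q(H_\phi - 2V)Q\big)_-^{\gamma}
&\le (1 - |\phi|)^{\gamma}\, \mathrm{tr}\big(Q(-\Delta - 2(1 - |\phi|)^{-1} V)Q\big)_-^{\gamma} \\
&\le (1 - |\phi|)^{-\frac{2+\alpha}{2}}\, 2^{\gamma + \frac{2+\alpha}{2}}\, C^{EK}_{\gamma,2,\alpha} \int_{\mathbb{R}^2} V(x)^{\gamma + \frac{2+\alpha}{2}} |x|^\alpha\,dx.
\end{aligned}$$
On the orthogonal complement $P L_2(\mathbb{R}^2)$ we estimate
$$P(H_\phi - 2V)P \ge P_0 (H_\phi - 4V) P_0 + P_{\mp 1} (H_\phi - 4V) P_{\mp 1}.$$
The latter operator is unitarily equivalent to the operator
$$\Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - 4\tilde V \Big) \oplus \Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} + \frac{1 - 2|\phi|}{r^2} - 4\tilde V \Big)$$
in $L_2(\mathbb{R}_+) \oplus L_2(\mathbb{R}_+)$, where
$$\tilde V(r) := \frac{1}{2\pi} \int_{-\pi}^{\pi} V(r, \theta)\,d\theta, \qquad r > 0.$$
We estimate $1 - 2|\phi| \ge 0$ and conclude by Corollary 2.2 that
$$\begin{aligned}
\mathrm{tr}\big(P(H_\phi - 2V)P\big)_-^{\gamma}
&\le 2\, \mathrm{tr}\Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - 4\tilde V \Big)_-^{\gamma} \\
&\le 2 \cdot 4^{\gamma + \frac{2+\alpha}{2}}\, C_{\gamma,1,\alpha+1} \int_{\mathbb{R}_+} \tilde V(r)^{\gamma + \frac{2+\alpha}{2}}\, r^{\alpha+1}\,dr.
\end{aligned}$$
It remains to use Hölder's inequality to complete the proof. Finally, we remark that the constants can be chosen independently of $\phi$.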
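The estimate $Q H_\phi Q \ge (1 - |\phi|)\, Q(-\Delta)Q$ used in this proof reduces, channel by channel, to the elementary inequality $n(n - 2\phi) \ge (1 - |\phi|)\, n^2$ for every angular momentum $n$ not removed by $P$; the remaining part $|\phi|\,\big(-\frac{d^2}{dr^2} - \frac{1}{4r^2}\big)$ is non-negative by the Hardy inequality (3.1). The sketch below checks the elementary inequality on a grid. It is an illustration under these stated assumptions, and the function names are hypothetical.

```python
def excluded_modes(phi):
    """Angular momenta n carried by the projection P in the proof:
    {-1, 0} for -1/2 < phi <= 0 and {0, 1} for 0 < phi <= 1/2."""
    return {-1, 0} if phi <= 0 else {0, 1}

def centrifugal_gap(n, phi):
    """n(n - 2 phi) - (1 - |phi|) n^2, which must be non-negative on Q."""
    return n * (n - 2 * phi) - (1 - abs(phi)) * n * n

def worst_gap(phis, n_max=50):
    """Smallest gap over all phi in `phis` and all |n| <= n_max outside P."""
    worst = float("inf")
    for phi in phis:
        for n in range(-n_max, n_max + 1):
            if n in excluded_modes(phi):
                continue
            worst = min(worst, centrifugal_gap(n, phi))
    return worst

if __name__ == "__main__":
    grid = [k / 100 for k in range(-49, 51)]  # phi from -0.49 to 0.50
    print(worst_gap(grid))
```

Algebraically the gap equals $|\phi|\, n^2 - 2n\phi$, which vanishes at $n = 2\,\mathrm{sgn}(\phi)$ and is positive on the other channels of $Q$ when $\phi \neq 0$. If one keeps only $P = P_0$, the channel $n = \mathrm{sgn}(\phi)$ gives a negative gap, so the factor $1 - |\phi|$ is no longer available there.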
Appendix A. An Inequality of Egorov-Kondrat'ev

Our exposition in this appendix follows rather closely [EgKo]. Proposition 2.1 can be deduced by standard arguments (as, e.g., in our proof of Theorem 1.6 in Sect. 3) provided we have established

Lemma A.1. Let $d \ge 2$, $q > 1$, $\alpha \ge 0$ such that $2q - \alpha > d$. Then
$$N(\tau, -\Delta - V) \le C\, \tau^{-q + \frac{d+\alpha}{2}} \int_{\mathbb{R}^d} V(x)_+^{q} |x|^\alpha\,dx, \qquad \tau > 0, \tag{A.1}$$
with a constant $C = C(\alpha, d, q)$ independent of $V$. The same result (with the same proof) holds if $d = 1$, $q > 1$, $\alpha \ge 0$ such that $q - \alpha > 1$.

First some terminology. By a 'cube' we mean always a cube with edges parallel to the coordinate axes, and by its 'length' we mean the length of one of its edges. We need the following variant of Rozenblum's covering lemma (see [EgKo], where also an explicit value for the constant can be found).

Lemma A.2. Let $d \ge 1$. Then there exists a constant $C_1 > 0$ such that for any $\varepsilon \in (0, 1]$, any cube $Q \subset \mathbb{R}^d$ and any non-negative $f \in L_1(Q)$ there exists a finite number of cubes $Q_1, \dots, Q_M$ with the following properties:
(1) $Q \subset \bigcup_{j=1}^{M} Q_j$ and any point in $\mathbb{R}^d$ is contained in at most $C_1$ cubes,
(2) $\int_{Q_j} f(x)\,dx \le \varepsilon C_1 \int_Q f(x)\,dx$, $j = 1, \dots, M$,
(3) $M \le \varepsilon^{-1} C_1$.

Lemma A.3. Let $0 \le s < 2$ and $1 \le p < \frac{d-s}{d-2}$ if $d \ge 3$ and $1 \le p < \infty$ if $d = 2$. Then there exists a constant $C_2 > 0$ such that for any cube $Q$ of length $l$ and any $u \in H^1(Q)$,
$$\Big( \int_Q |u|^{2p} |x|^{-s}\,dx \Big)^{1/p} \le C_2\, l^{\,2-d+\frac{d-s}{p}} \int_Q \big( |\nabla u|^2 + l^{-2} |u|^2 \big)\,dx. \tag{A.2}$$
Moreover, if $\int_Q u\,dx = 0$ then
$$\Big( \int_Q |u|^{2p} |x|^{-s}\,dx \Big)^{1/p} \le C_2\, l^{\,2-d+\frac{d-s}{p}} \int_Q |\nabla u|^2\,dx. \tag{A.3}$$

Proof. By scaling it suffices to consider the case $l = 1$. We can choose $1 < p_1 < \infty$ such that $2pp_1 \le \frac{2d}{d-2}$ if $d \ge 3$ and such that $s q_1 < d$, where $p_1^{-1} + q_1^{-1} = 1$. Then by Hölder's inequality
$$\int_Q |u|^{2p} |x|^{-s}\,dx \le \Big( \int_Q |u|^{2pp_1}\,dx \Big)^{1/p_1} \Big( \int_Q |x|^{-s q_1}\,dx \Big)^{1/q_1}.$$
The latter integral is finite, uniformly for all cubes of length one, by our choice of $q_1$. Moreover, by the Sobolev embedding theorems
$$\Big( \int_Q |u|^{2pp_1}\,dx \Big)^{\frac{1}{pp_1}} \le C_2 \int_Q \big( |\nabla u|^2 + |u|^2 \big)\,dx$$
for some constant $C_2 = C_2(p, p_1, d)$. If $u$ has mean value zero we may use Poincaré's inequality instead (see [LiLo]).
Proof of Lemma A.1. As in the proof of Lemma 3.1 we may assume $\tau = 1$ and $V \ge 0$. Fix $q > 1$, $\alpha \ge 0$ such that $2q - \alpha > d$ and note that $p$ and $s$, defined by $p^{-1} + q^{-1} = 1$ and $\alpha = s(q - 1)$, satisfy the assumptions of Lemma A.3. Put
$$I := \int_{\mathbb{R}^d} V^q |x|^\alpha\,dx$$
and introduce the unit cube $Q_0 := (0, 1)^d$. By Hölder's inequality and (A.2) we obtain that
$$\begin{aligned}
\int_{\mathbb{R}^d} V |u|^2\,dx
&\le \sum_{k \in \mathbb{Z}^d} \Big( \int_{Q_0 + k} V^q |x|^{s(q-1)}\,dx \Big)^{1/q} \Big( \int_{Q_0 + k} |u|^{2p} |x|^{-s}\,dx \Big)^{1/p} \\
&\le C_2\, I^{1/q} \int_{\mathbb{R}^d} \big( |\nabla u|^2 + |u|^2 \big)\,dx.
\end{aligned} \tag{A.4}$$
In view of this inequality it suffices to prove the assertion only for, say, bounded and compactly supported $V$. Moreover, we conclude from this inequality that $N(1, -\Delta - V) = 0$ if $I \le C_2^{-q}$. Hence it is enough to establish the estimate
$$N(1, -\Delta - V) \le C I \tag{A.5}$$
under the additional condition
$$I \ge C_2^{-q}. \tag{A.6}$$
To obtain (A.5) we find a subspace $L \subset H^1(\mathbb{R}^d)$ such that
$$\int_{\mathbb{R}^d} V |u|^2\,dx \le \int_{\mathbb{R}^d} \big( |\nabla u|^2 + |u|^2 \big)\,dx, \qquad u \in L, \tag{A.7}$$
and such that $\mathrm{codim}\, L \le C I$. For $0 < \varepsilon \le 1$ (which will be determined later) Lemma A.2 yields cubes $Q_1, \dots, Q_M$ such that $\mathrm{supp}\, V \subset \bigcup_{j=1}^{M} Q_j$, such that each point is covered by at most $C_1$ cubes,
$$\int_{Q_j} V^q |x|^\alpha\,dx \le \varepsilon\, C_1 I \tag{A.8}$$
and $M \le \varepsilon^{-1} C_1$. With $l_j$ denoting the length of $Q_j$ put
$$J_\le := \{ j : l_j \le 1 \}, \qquad J_> := \{ j : l_j > 1 \}.$$
First we consider $j \in J_>$ (i.e., large cubes). We divide $Q_j = \bigcup_k Q_{j,k}$ in a finite number of non-intersecting cubes with equal length $\tilde l_j \in (1, 2]$. Estimating similarly as in (A.4) and using (A.8) we obtain
$$\begin{aligned}
\int_{Q_j} V |u|^2\,dx
&\le \sum_k \Big( \int_{Q_{j,k}} V^q |x|^{s(q-1)}\,dx \Big)^{1/q} \Big( \int_{Q_{j,k}} |u|^{2p} |x|^{-s}\,dx \Big)^{1/p} \\
&\le C_2\, 2^{\,2-d+\frac{d-s}{p}}\, (\varepsilon\, C_1 I)^{1/q} \int_{Q_j} \big( |\nabla u|^2 + |u|^2 \big)\,dx.
\end{aligned} \tag{A.9}$$
Now we consider $j \in J_\le$ (i.e., small cubes). If $u \in H^1(\mathbb{R}^d)$ satisfies
$$\int_{Q_j} u\,dx = 0, \tag{A.10}$$
then a similar estimate (but using (A.3) instead of (A.2)) yields
$$\int_{Q_j} V |u|^2\,dx \le C_2\, (\varepsilon\, C_1 I)^{1/q} \int_{Q_j} |\nabla u|^2\,dx. \tag{A.11}$$
Let $L$ be the space of all $u \in H^1(\mathbb{R}^d)$ such that (A.10) holds for all $j \in J_\le$. We sum (A.9) and (A.11) over all $j$ to get
$$\int_{\mathbb{R}^d} V |u|^2\,dx \le C_3\, (\varepsilon I)^{1/q} \int_{\mathbb{R}^d} \big( |\nabla u|^2 + |u|^2 \big)\,dx, \qquad u \in L,$$
where $C_3 := C_2\, 2^{\,2-d+(d-s)/p}\, C_1^{1+1/q}$. Now we choose $\varepsilon := C_3^{-q} I^{-1}$. (Note that in view of (A.6) and $C_1 \ge 1$ one has $\varepsilon \le 1$.) Moreover, with this choice of $\varepsilon$ relation (A.7) holds and
$$\mathrm{codim}\, L = \# J_\le \le M \le \varepsilon^{-1} C_1 = C_1 C_3^{q}\, I.$$
This yields (A.5) with $C = C_1 C_3^{q}$ and finishes the proof.
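The claim made at the start of this proof, that $p$ and $s$ satisfy the assumptions of Lemma A.3, amounts to a short computation. The following worked equation is a sketch for the case $d \ge 3$ (for $d = 2$ only the condition on $s$ is needed):

```latex
\[
p = \frac{q}{q-1}, \qquad s = \frac{\alpha}{q-1}, \qquad
s < 2 \iff 2q - \alpha > 2,
\]
\[
p < \frac{d-s}{d-2}
\iff q(d-2) < d(q-1) - \alpha
\iff 2q - \alpha > d .
\]
```

Both conditions follow from the hypothesis $2q - \alpha > d \ge 2$, while $s \ge 0$ and $p > 1$ are immediate from $\alpha \ge 0$ and $q > 1$.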
Remark A.4. If $d \ge 3$ then Proposition 2.1 follows by the argument of Aizenman-Lieb [AiLi] from
$$N(0, -\Delta - V) \le C^{EK}_{0,d,\alpha} \int_{\mathbb{R}^d} V(x)_+^{\frac{d+\alpha}{2}} |x|^\alpha\,dx.$$
Different proofs of this inequality can be found in [BlReSt, EgKo and BiSo2]. It would be desirable, in particular in view of constants, to find an alternative proof of Proposition 2.1 in the case $d = 2$.

Acknowledgements. The authors are grateful to Ari Laptev and Timo Weidl for useful discussions. The first author has been partially supported by the ESF European programme SPECT.
References

[AbSt] Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions with formulas, graphs, and mathematical tables. Reprint of the 1972 edition. New York: Dover Publications, 1992
[AiLi] Aizenman, M., Lieb, E.: On semiclassical bounds for eigenvalues of Schrödinger operators. Phys. Lett. A 66(6), 427–429 (1978)
[Bi] Birman, M.Sh.: The spectrum of singular boundary problems. Amer. Math. Soc. Trans. (2) 53, 23–80 (1966)
[BiLa] Birman, M.Sh., Laptev, A.: The negative discrete spectrum of a two-dimensional Schrödinger operator. Comm. Pure Appl. Math. 49, 967–997 (1996)
[BiSo1] Birman, M.Sh., Solomyak, M.Z.: Spectral theory of selfadjoint operators in Hilbert space. Dordrecht: D. Reidel, 1987
[BiSo2] Birman, M.Sh., Solomyak, M.Z.: Schrödinger operators. Estimates for number of bound states as function-theoretic problem. Amer. Math. Soc. Transl. (2) 150, 1–54 (1992)
[BlReSt] Blanchard, Ph., Rezende, J., Stubbe, J.: New estimates on the number of bound states of Schrödinger operators. Lett. Math. Phys. 14(3), 215–225 (1987)
[EgKo] Egorov, Yu.V., Kondrat'ev, V.A.: On spectral theory of elliptic operators. Oper. Theory Adv. Appl. 89, Basel: Birkhäuser, 1996
[GlGrMaTh] Glaser, V., Grosse, H., Martin, A., Thirring, W.: A family of optimal conditions for the absence of bound states in a potential. Studies in Mathematical Physics. Princeton, NJ: Princeton University Press, 1976, pp. 169–194
[LaWei1] Laptev, A., Weidl, T.: Hardy inequalities for magnetic Dirichlet forms. Mathematical results in quantum mechanics (Prague, 1998). Oper. Theory Adv. Appl. 108, Basel: Birkhäuser, 1999, pp. 299–305
[LaWei2] Laptev, A., Weidl, T.: Recent results on Lieb-Thirring inequalities. Journées "Équations aux Dérivées Partielles" (La Chapelle sur Erdre, 2000), Exp. No. XX, Nantes: Univ. Nantes, 2000
[LiLo] Lieb, E., Loss, M.: Analysis. Second edition. Graduate Studies in Mathematics 14. Providence, RI: Amer. Math. Soc., 2001
[LiTh] Lieb, E., Thirring, W.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. Studies in Mathematical Physics. Princeton, NJ: Princeton University Press, 1976, pp. 269–303
[Si1] Simon, B.: The bound state of weakly coupled Schrödinger operators in one and two dimensions. Ann. Physics 97(2), 279–288 (1976)
[Si2] Simon, B.: Trace ideals and their applications. London Mathematical Society Lecture Note Series 35. Cambridge-New York: Cambridge University Press, 1979
[StWei] Stein, E.M., Weiss, G.: Introduction to Fourier analysis on Euclidean spaces. Princeton Mathematical Series 32. Princeton, NJ: Princeton University Press, 1971
[St] Stubbe, J.: Bounds on the number of bound states for potentials with critical decay at infinity. J. Math. Phys. 31(5), 1177–1180 (1990)
[Wei] Weidl, T.: Remarks on virtual bound states for semi-bounded operators. Comm. Partial Differ. Eqs. 24(1–2), 25–60 (1999)

Communicated by B. Simon
Commun. Math. Phys. 264, 741–758 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1553-4
Communications in
Mathematical Physics
Propagation Effects on the Breakdown of a Linear Amplifier Model: Complex-Mass Schrödinger Equation Driven by the Square of a Gaussian Field

Philippe Mounaix¹, Pierre Collet¹, Joel L. Lebowitz²

¹ Centre de Physique Théorique, UMR 7644 du CNRS, Ecole Polytechnique, 91128 Palaiseau Cedex, France. E-mail: [email protected]; [email protected]
² Departments of Mathematics and Physics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, USA. E-mail: [email protected]
Received: 20 May 2005 / Accepted: 8 December 2005
Published online: 31 March 2006 – © Springer-Verlag 2006

Abstract: Solutions to the equation $\partial_t E(x,t) - \frac{i}{2m} \Delta E(x,t) = \lambda |S(x,t)|^2 E(x,t)$ are investigated, where $S(x,t)$ is a complex Gaussian field with zero mean and specified covariance, and $m \ne 0$ is a complex mass with $\mathrm{Im}(m) \ge 0$. For real $m$ this equation describes the backscattering of a smoothed laser beam by an optically active medium. Assuming that $S(x,t)$ is the sum of a finite number of independent complex Gaussian random variables, we obtain an expression for the value of $\lambda$ at which the $q$th moment of $|E(x,t)|$ w.r.t. the Gaussian field $S$ diverges. This value is found to be less or equal for all $m \ne 0$, $\mathrm{Im}(m) \ge 0$ and $|m| < +\infty$ than for $|m| = +\infty$, i.e. when the $\Delta E$ term is absent. Our solution is based on a distributional formulation of the Feynman path-integral and the Paley-Wiener theorem.

I. Introduction

We investigate the breakdown of linear amplification in a system driven by the square of a Gaussian noise. This problem, which models the backscattering of an incoherent laser by an optically active medium, was first considered by Akhmanov et al. in nonlinear optics [1], and by Rose and DuBois in laser-plasma interaction [10]. The latter investigated the divergence of the average solution to the stochastic PDE,
$$\partial_t E(x,t) - \frac{i}{2m} \Delta E(x,t) = \lambda |S(x,t)|^2 E(x,t), \tag{1}$$
$t \ge 0$, $x \in \Lambda \subset \mathbb{R}^d$, and $E(x,0) = 1$, heuristically and numerically in the "diffractive case" where $\mathrm{Im}(m) = 0$ and $\mathrm{Re}(m) \ne 0$. Here $\lambda > 0$ is the coupling constant and $S$ is a complex Gaussian noise with zero mean¹. More recently, this problem was analyzed from a more rigorous mathematical point of view in [2] and [7]. The "diffusive case" in which $\mathrm{Re}(m) = 0$ and $\mathrm{Im}(m) > 0$ was considered in [2], and the one dimensional diffractive case was considered in [7] for a restrictive class of $S$'s. In the present work we will consider the general case $m \ne 0$ and $\mathrm{Im}(m) \ge 0$ for a $d$-dimensional torus $\Lambda$ with $d \le 3$. As in [2] and [7] we will express the solution to (1) formally as the Feynman-Kac path-integral
$$E(x,t) = \int_{x(\cdot) \in B(x,t)} \exp\Big( \int_0^t \Big[ \frac{im}{2}\, \dot x(\tau)^2 + \lambda |S(x(\tau), \tau)|^2 \Big]\,d\tau \Big)\, d[x(\cdot)], \tag{2}$$
where $B(x,t)$ denotes the set of all the continuous paths in $\Lambda$ satisfying $x(t) = x$.

In the diffusive case, the right-hand side of (2) is just the Wiener integral of $\exp\big( \lambda \int_0^t d\tau\, |S(x(\tau), \tau)|^2 \big)$ over $B(x,t)$. This was used in [2] to prove, under some reasonable assumptions on the covariance of $S$, that for every $t > 0$ and any positive integer $q$ the average of $E(x,t)^q$ over the realizations of $S$, $\langle E(x,t)^q \rangle$, diverges as $\lambda$ increases past some critical value smaller (or equal) than in the diffusion-free case (i.e. when $|m| = +\infty$), with equality holding for a class of $S$. It was conjectured there that this inequality should also apply when diffusion is replaced by diffraction, i.e. $m$ real, $m \ne 0$, the case of physical interest considered by Rose and DuBois in [10].

The diffractive case is much more difficult because the right-hand side of (2) is no longer well defined and one cannot a priori exclude the possibility that destructive interference between paths makes the sum of divergent contributions finite, raising (possibly to infinity) the critical value of $\lambda$ at which the average of (2) diverges. Using heuristics and numerical simulations, Rose and DuBois argued that $\langle |E(x,t)|^2 \rangle$ should diverge for every $t > 0$ as $\lambda$ increases to some finite critical value [10]. The conjecture made in [2] that diffraction should actually lower the critical coupling (or, at least, not increase it) compared to the case $|m| = +\infty$ was proved in [7], for very special choices of $S$, for the divergence of $\langle |E(x,t)| \rangle$. In this paper, we extend the results of [7] to a much wider class of $S$. We analyze the divergence of $\langle |E(x,t)|^q \rangle$ for any positive integer $q$, and we treat both the diffusive and diffractive cases as well as all the intermediate cases between these two limits [i.e. complex $m$ with $\mathrm{Im}(m) \ge 0$ and $m \ne 0$].

Our strategy for controlling the complex Feynman path-integral (2) and determining the critical value of $\lambda$ uses the following three ingredients: (i) we consider a restricted but quite wide class of $S$ for which $E(x,t)$ can be written as a Fourier-Laplace integral w.r.t. a distribution with compact support;² (ii) we apply the Paley-Wiener theorem to this Fourier-Laplace integral. This yields the control of (2) for "large" $|S|^2$; (iii) we average $|E(x,t)|^q$ over the realizations of $S$ and use the control obtained in (ii) to determine the smallest value of $\lambda$ for which this average blows up.

The rest of the paper follows this strategy quite faithfully. In Sect. II we specify the class of $S$ which we can treat. The distributional formulation of $E(x,t)$ is given in Sect. III and the way to control its growth is explained in Sect. IV. Finally, the determination of the critical value of $\lambda$ and the proof of the conjecture made in [2] are given in Sect. V. It is worth noting that (i) and (ii) do not depend on $S$ being Gaussian and thus apply also in a more general setting.

¹ This is the case of interest in laser-plasma interaction and nonlinear optics in which $S$ is the (complex) time-envelope of the laser electric field. With the help of some minor modifications, our results carry over straightforwardly to the cases where $S$ is real.

² The "Fourier-Laplace" integral w.r.t. a distribution with compact support on $\mathbb{R}^N$ is the continuation of the usual Fourier integral from $\mathbb{R}^N$ to $\mathbb{C}^N$.
II. Model and Definitions

We consider the solution to the linear amplifier equation (1), written in its integral representation (2), with $m$ in $\mathbb{C}_+ \setminus \{0\}$, where $\mathbb{C}_+ \equiv \{ m \in \mathbb{C} : \mathrm{Im}(m) \ge 0 \}$. We assume that $S$ can be expressed as a finite combination of $M$ complex Gaussian r.v., $s_n$,
$$S(x,t) = \sum_{n=1}^{M} s_n \Phi_n(x,t), \tag{3}$$
with
$$\langle s_n \rangle = \langle s_n s_m \rangle = 0, \qquad \langle s_n s_m^* \rangle = \delta_{nm}. \tag{4}$$
The $\Phi_n$ are normalized such that
$$\frac{1}{|\Lambda|} \int_\Lambda \int_0^1 \langle |S(x,\tau)|^2 \rangle\,d\tau\,d^d x = \frac{1}{|\Lambda|} \sum_{n=1}^{M} \int_\Lambda \int_0^1 |\Phi_n(x,\tau)|^2\,d\tau\,d^d x = 1.$$
Furthermore, the $\Phi_n(\cdot, \tau)$ are assumed to have second derivatives bounded uniformly in $\tau \in [0, t]$, and the $\Phi_n(x, \cdot)$ are piecewise continuous for every $x \in \Lambda$ with a finite number of discontinuities in $[0, t]$ for all finite $t$. Note that locally, i.e. for each $x$ and $\tau$, $|S(x,\tau)|^2$ is a quadratic form of $2M$ real Gaussian r.v. (the real and imaginary parts of the $s_n$), so it is a $\chi^2$ r.v. with $2M$ degrees of freedom.

Equation (3) generalizes models of spatially smoothed laser beams in which the laser light is represented by a superposition of a finite number of monochromatic beamlets the amplitudes of which are independent r.v. [9]. For a large number of beamlets these r.v. can be taken as Gaussian and the laser electric field takes on the form (3) with $\Phi_n(x,t) \propto \exp[i(k_n \cdot x + a k_n^2 t)]$, where $k_n$ is the wave vector of the $n$th beamlet and $a > 0$ is a (real) constant. It can be checked that all the assumptions made on $S$ are fulfilled.

We are interested in the critical coupling $\lambda_q(x,t)$ and its Laplacian-free counterpart $\bar\lambda_q(x,t)$ obtained by setting $m^{-1} = 0$ in Eq. (1). These quantities are defined by
$$\lambda_q(x,t) = \inf\{ \lambda > 0 : \langle |E(x,t)|^q \rangle = +\infty \}, \tag{5a}$$
$$\bar\lambda_q(x,t) = \inf\{ \lambda > 0 : \langle e^{q\lambda \int_0^t |S(x,\tau)|^2\,d\tau} \rangle = +\infty \}. \tag{5b}$$
Equations (5) give the values of $\lambda$ at which $\langle |E(x,t)|^q \rangle$ diverges with and without the Laplacian on the left-hand side of (1). Note that $S$ is not assumed to be homogeneous and the critical coupling will depend on $x$ in general.

III. Distributional Formulation of E(x, t; s)

Let $s$ be the $M$-dimensional Gaussian random vector the elements of which are the $s_n$, and $\gamma(x,\tau)$ the $M \times M$ Hermitian matrix defined by $\gamma_{nm}(x,\tau) = \Phi_n^*(x,\tau)\, \Phi_m(x,\tau)$.
Inserting (3) into the right-hand side of (2) yields
$$E(x,t;s) = \int_{x(\cdot) \in B(x,t)} \exp\Big( \int_0^t \Big[ \frac{im}{2}\, \dot x(\tau)^2 + \lambda\, s^\dagger \gamma(x(\tau), \tau)\, s \Big]\,d\tau \Big)\, d[x(\cdot)], \tag{6}$$
where we have made the dependence of $E(x,t)$ on the realization of $s$ explicit. In order to make (6) more appropriate to a distributional formulation it is desirable to replace the quadratic form $s^\dagger \gamma(x(\tau), \tau)\, s$ with its monomial decomposition. One obtains
$$E(x,t;s) = \int_{x(\cdot) \in B(x,t)} e^{\frac{im}{2} \int_0^t \dot x(\tau)^2\,d\tau}\, \exp\Big( \lambda \sum_{i=1}^{N} k_i(s) \int_0^t \varphi_i(x(\tau), \tau)\,d\tau \Big)\, d[x(\cdot)], \tag{7}$$
with $N = M^2$, and where the $\varphi_i$ are $N$ real valued functions given by $\gamma_{nn}$, $\sqrt{2}\,\mathrm{Re}(\gamma_{nm})$, and $\sqrt{2}\,\mathrm{Im}(\gamma_{nm})$, $n < m$. The components of the vector $k(s) \in \mathbb{R}^N$ are given by $|s_n|^2$, $\sqrt{2}\,\mathrm{Re}(s_n s_m^*)$, and $\sqrt{2}\,\mathrm{Im}(s_n s_m^*)$, $n < m$. It can be checked that
$$\|k(s)\| = \|s\|^2, \tag{8}$$
where $\|k(s)\| = \big( \sum_{i=1}^{N} k_i(s)^2 \big)^{1/2}$ and $\|s\| = \big( \sum_{i=1}^{M} |s_i|^2 \big)^{1/2}$. We first give a heuristic derivation of the distributional formulation of (6). Then, we set it on a much firmer ground by justifying it rigorously from a mathematical point of view.

A. Heuristics. Inserting the identity
$$1 = \prod_{i=1}^{N} \int_{\mathbb{R}} \delta\Big( u_i - \int_0^t \varphi_i(x(\tau), \tau)\,d\tau \Big)\,du_i,$$
in the path-integral (7) and permuting the path- and $u$-integrals, one obtains
$$E(x,t;s) = \int \cdots \int_{\mathbb{R}^N} G_{x,t}(u)\, e^{\lambda k(s) \cdot u} \prod_{i=1}^{N} du_i, \tag{9}$$
with
$$G_{x,t}(u_1, \dots, u_N) = \int_{x(\cdot) \in B(x,t)} e^{\frac{im}{2} \int_0^t \dot x(\tau)^2\,d\tau} \prod_{i=1}^{N} \delta\Big( u_i - \int_0^t \varphi_i(x(\tau), \tau)\,d\tau \Big)\, d[x(\cdot)]. \tag{10}$$
As a Feynman-Kac path-integral, the expression (10) is not well defined. A possible way to make it meaningful consists in writing $G_{x,t}$ as the Fourier transform w.r.t. $\eta$ of some function $\Psi(x,t;\eta)$:
$$G_{x,t}(u) = \frac{1}{(2\pi)^N} \int \cdots \int_{\mathbb{R}^N} \Psi(x,t;\eta)\, e^{iu \cdot \eta} \prod_{i=1}^{N} d\eta_i, \tag{11}$$
in which $\Psi(x,t;\eta)$ has a well defined meaning. Fourier transforming (10) w.r.t. $u$ and permuting the path- and $u$-integrals, one obtains
$$\Psi(x,t;\eta) = \int_{x(\cdot) \in B(x,t)} \exp\Big( i \int_0^t \Big[ \frac{m}{2}\, \dot x(\tau)^2 - V(x(\tau), \tau; \eta) \Big]\,d\tau \Big)\, d[x(\cdot)], \tag{12}$$
where $V(x,t;\eta)$ is given by
$$V(x,t;\eta) \equiv \sum_{i=1}^{N} \eta_i \varphi_i(x,t). \tag{13}$$
We now observe that Eq. (12) is the path-integral solution to the Schrödinger equation
$$i \partial_t \Psi(x,t;\eta) = -\frac{1}{2m} \Delta \Psi(x,t;\eta) + V(x,t;\eta)\, \Psi(x,t;\eta), \tag{14}$$
$t \ge 0$, $x \in \Lambda$, and $\Psi(x,0;\eta) = 1$. This yields a well defined $\Psi(x,t;\eta)$. The permutation of path- and ordinary integrals, as well as the formal Feynman-Kac path-integral used in the derivation of Eqs. (9), (11), and (14) above require justification. The work by Cartier and DeWitt-Morette [4] suggests that we define the path-integral (6) by the right-hand side of (9) in which $G_{x,t}$ is defined by its Fourier transform given as the solution to Eq. (14). We now prove the validity of this approach.
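The monomial decomposition behind (7) and the norm identity (8) can be checked numerically at a single point $(x, \tau)$. The sketch below is illustrative only: the function names and the sample values of $s_n$ and of the mode functions (written `Phi` here) are assumptions of this example, and the ordering of the off-diagonal components is one possible convention, chosen identically for $k(s)$ and for the $\varphi_i$.

```python
import math

def monomials_of_s(s):
    """Components k_i(s): |s_n|^2 for each n, followed by sqrt(2)*Re and
    sqrt(2)*Im of s_n * conj(s_m) for every pair n < m."""
    M = len(s)
    k = [abs(z) ** 2 for z in s]
    for n in range(M):
        for m in range(n + 1, M):
            z = s[n] * s[m].conjugate()
            k += [math.sqrt(2) * z.real, math.sqrt(2) * z.imag]
    return k

def monomials_of_gamma(Phi):
    """Components phi_i at one point: gamma_nn, followed by sqrt(2)*Re and
    sqrt(2)*Im of gamma_nm for n < m, where gamma_nm = conj(Phi_n) * Phi_m."""
    M = len(Phi)
    phi = [abs(p) ** 2 for p in Phi]
    for n in range(M):
        for m in range(n + 1, M):
            g = Phi[n].conjugate() * Phi[m]
            phi += [math.sqrt(2) * g.real, math.sqrt(2) * g.imag]
    return phi

def field_intensity(s, Phi):
    """|S|^2 with S = sum_n s_n * Phi_n, as in (3)."""
    S = sum(sn * pn for sn, pn in zip(s, Phi))
    return abs(S) ** 2

if __name__ == "__main__":
    s = [1 + 2j, -0.5 + 0.3j, 0.7 - 1.1j]       # arbitrary sample amplitudes
    Phi = [0.2 - 0.4j, 1.0 + 0.5j, -0.3 + 0.9j]  # arbitrary sample mode values
    k, phi = monomials_of_s(s), monomials_of_gamma(Phi)
    print(len(k))                                              # number of monomials
    print(sum(a * b for a, b in zip(k, phi)), field_intensity(s, Phi))
    print(math.sqrt(sum(a * a for a in k)), sum(abs(z) ** 2 for z in s))
```

The second printed line compares $k(s) \cdot \varphi$ with $|S|^2 = s^\dagger \gamma s$, and the third compares $\|k(s)\|$ with $\|s\|^2$; in both cases the two numbers should agree up to rounding.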
B. The distributional formulation. Let $\Psi(x,t;\eta)$ be the solution to (14) where $V(x,t;\eta)$ is given by (13) with $\eta\in\mathbb{C}^N$. Let $a_i=\inf_{x(\cdot)\in B(x,t)}\int_0^t\varphi_i(x(\tau),\tau)\,d\tau$ and $b_i=\sup_{x(\cdot)\in B(x,t)}\int_0^t\varphi_i(x(\tau),\tau)\,d\tau$. Then the following lemma holds.

Lemma 1. For every $t>0$, $x\in\Lambda$, and $m\in\overline{\mathbb{C}^+}\setminus\{0\}$, (i) $G_{x,t}$ defined by (11) is a distribution with compact support on $\mathbb{R}^N$ and $\mathrm{supp}\,G_{x,t}\subset[a_1,b_1]\times\cdots\times[a_N,b_N]$; (ii) $E(x,t;s)$ defined by (9) is the solution to (1).

Proof. Taking the derivative of (14) with respect to $\eta_i^*$, the complex conjugate of $\eta_i$, and using $\partial_{\eta_i^*}V(x,t;\eta)=0$, which follows from analyticity of $V(x,t;\eta)$ in $\eta$ [see Eq. (13)], one finds that $\partial_{\eta_i^*}\Psi(x,t;\eta)$ evolves in time according to the same Eq. (14) with the initial condition $\partial_{\eta_i^*}\Psi(x,0;\eta)=0$. Thus, $\partial_{\eta_i^*}\Psi(x,t;\eta)=0$ for all $t\ge 0$ and $\eta\in\mathbb{C}^N$, which implies that $\Psi(x,t;\eta)$ is analytic in $\eta$.

Let $\tilde\Psi(x,t;\eta)=\Psi(x,t;\eta)\exp\left(-it\sum_{i=1}^N\eta_ic_i\right)$, where the constants $c_i\in\mathbb{R}$ will be specified later. $\tilde\Psi$ is the solution to (14) with $V$ given by (13) in which the $\varphi_i$ are replaced by $\tilde\varphi_i=\varphi_i+c_i$. Let $\varepsilon_i=\mathrm{sgn}[\mathrm{Im}(\eta_i)]$. From Eq. (14) for $\tilde\Psi$ and the Schwartz inequality one obtains
$$\frac{d}{dt}\|\tilde\Psi\|_2^2=-\frac{\mathrm{Im}(m)}{|m|^2}\|\nabla\tilde\Psi\|_2^2+2\sum_{i=1}^N\mathrm{Im}(\eta_i)\int_\Lambda\tilde\varphi_i|\tilde\Psi|^2\,d^dx\le 2\|\tilde\Psi\|_2^2\sum_{i=1}^N|\mathrm{Im}(\eta_i)|\sup_{x\in\Lambda}(\varepsilon_i\tilde\varphi_i), \tag{15}$$
Ph. Mounaix, P. Collet, J.L. Lebowitz
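$$\frac{d}{dt}\|\nabla\tilde\Psi\|_2^2=-\frac{\mathrm{Im}(m)}{|m|^2}\|\Delta\tilde\Psi\|_2^2+2\sum_{i=1}^N\mathrm{Im}(\eta_i)\int_\Lambda\tilde\varphi_i|\nabla\tilde\Psi|^2\,d^dx+2\,\mathrm{Im}\sum_{i=1}^N\eta_i\int_\Lambda\tilde\Psi\,\nabla\tilde\Psi^*\cdot\nabla\tilde\varphi_i\,d^dx$$
$$\le 2\|\nabla\tilde\Psi\|_2^2\sum_{i=1}^N|\mathrm{Im}(\eta_i)|\sup_{x\in\Lambda}(\varepsilon_i\tilde\varphi_i)+2\|\tilde\Psi\|_2\|\nabla\tilde\Psi\|_2\sum_{i=1}^N|\eta_i|\,\|\nabla\tilde\varphi_i\|_\infty, \tag{16}$$
and
$$\frac{d}{dt}\|\Delta\tilde\Psi\|_2^2=-\frac{\mathrm{Im}(m)}{|m|^2}\|\nabla\Delta\tilde\Psi\|_2^2+2\sum_{i=1}^N\mathrm{Im}(\eta_i)\int_\Lambda\tilde\varphi_i|\Delta\tilde\Psi|^2\,d^dx+2\,\mathrm{Im}\sum_{i=1}^N\eta_i\int_\Lambda\left(\tilde\Psi\,\Delta\tilde\varphi_i+2\nabla\tilde\Psi\cdot\nabla\tilde\varphi_i\right)\Delta\tilde\Psi^*\,d^dx$$
$$\le 2\|\Delta\tilde\Psi\|_2^2\sum_{i=1}^N|\mathrm{Im}(\eta_i)|\sup_{x\in\Lambda}(\varepsilon_i\tilde\varphi_i)+2\|\Delta\tilde\Psi\|_2\sum_{i=1}^N|\eta_i|\left(\|\tilde\Psi\|_2\|\Delta\tilde\varphi_i\|_\infty+2\|\nabla\tilde\Psi\|_2\|\nabla\tilde\varphi_i\|_\infty\right), \tag{17}$$
where $\|\cdot\|_2$ and $\|\cdot\|_\infty$ respectively denote the $L^2$ and uniform norms [footnote 3] on $\Lambda$ for given $t$ and $\eta$. Both $\|\nabla\tilde\varphi_i\|_\infty(t)$ and $\|\Delta\tilde\varphi_i\|_\infty(t)$ are bounded by assumption. Integrating then the inequality (15) over time from 0 to $t$, one obtains
$$\|\tilde\Psi\|_2(t,\eta)\le|\Lambda|^{1/2}\exp\left(\sum_{i=1}^N|\mathrm{Im}(\eta_i)|\int_0^t\sup_{x\in\Lambda}[\varepsilon_i\tilde\varphi_i(x,\tau)]\,d\tau\right). \tag{18}$$
Similarly, by integrating (16) and (17) one finds
$$\|\Delta\tilde\Psi\|_2(t,\eta)\le\left[C_1t\sum_{i=1}^N|\eta_i|+C_2t^2\left(\sum_{i=1}^N|\eta_i|\right)^2\right]\exp\left(\sum_{i=1}^N|\mathrm{Im}(\eta_i)|\int_0^t\sup_{x\in\Lambda}[\varepsilon_i\tilde\varphi_i(x,\tau)]\,d\tau\right), \tag{19}$$
where $C_1$ and $C_2$ are finite and independent of $\eta$ and $m$. We now substitute (18) and (19) into the right-hand side of the Sobolev-type inequality below, valid for $d\le 3$ (see e.g. [12], pp 106–107),
$$|\tilde\Psi(x,t;\eta)|\le C_3\left[\|\tilde\Psi\|_2(t,\eta)+\|\Delta\tilde\Psi\|_2(t,\eta)\right],$$

[Footnote 3] In the case of a vector field $v(x,t)\in\mathbb{C}^d$, these norms are to be understood as $\|v\|_2(t)=\left(\int_\Lambda v(x,t)\cdot v(x,t)^*\,d^dx\right)^{1/2}$ and $\|v\|_\infty(t)=\sup_{x\in\Lambda}\sqrt{v(x,t)\cdot v(x,t)^*}$.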
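with $C_3$ finite and independent of $\eta$ and $m$. This yields
$$|\tilde\Psi(x,t;\eta)|\le\left[A+Bt\sum_{i=1}^N|\eta_i|+Ct^2\left(\sum_{i=1}^N|\eta_i|\right)^2\right]\exp\left(\sum_{i=1}^N|\mathrm{Im}(\eta_i)|\int_0^t\sup_{y\in\Lambda}[\varepsilon_i\tilde\varphi_i(y,\tau)]\,d\tau\right), \tag{20}$$
where $A$, $B$, and $C$ are finite and independent of $\eta$ and $m$. Take
$$c_i=-\frac{1}{2t}\left[\int_0^t\sup_{x\in\Lambda}[\varphi_i(x,\tau)]\,d\tau+\int_0^t\inf_{x\in\Lambda}[\varphi_i(x,\tau)]\,d\tau\right],$$
and define
$$\kappa_i\equiv\int_0^t\sup_{x\in\Lambda}[\varepsilon_i\tilde\varphi_i(x,\tau)]\,d\tau=\frac{1}{2}\left[\int_0^t\sup_{x\in\Lambda}[\varphi_i(x,\tau)]\,d\tau-\int_0^t\inf_{x\in\Lambda}[\varphi_i(x,\tau)]\,d\tau\right].$$
Note that, with this choice of $c_i$, $\kappa_i$ is independent of $\varepsilon_i$. Since $\kappa_i\ge 0$ and $|\mathrm{Im}(\eta_i)|\le|\eta_i|$, one can bound the right side of (20) by
$$|\tilde\Psi(x,t;\eta)|\le\left[A+Bt\sum_{i=1}^N|\eta_i|+Ct^2\left(\sum_{i=1}^N|\eta_i|\right)^2\right]\exp\left(\sum_{i=1}^N\kappa_i|\eta_i|\right). \tag{21}$$
Let $\tilde G_{x,t}$ be defined by Eq. (11) in which $\Psi$ is replaced by $\tilde\Psi$. By the Paley-Wiener theorem in the formulation given in [11] (Theorem XVI in Chapter VII), it follows from the analyticity of $\tilde\Psi(x,t;\eta)$ in $\eta$ and Eq. (21) that $\tilde G_{x,t}$ is a distribution with compact support on $\mathbb{R}^N$ and $\mathrm{supp}\,\tilde G_{x,t}\subset[-\kappa_1,\kappa_1]\times\cdots\times[-\kappa_N,\kappa_N]$. From the definition of $\tilde\Psi$ one has $\tilde G_{x,t}(u)=G_{x,t}(u-ct)$, which implies immediately by translation that $G_{x,t}$ is also a distribution with compact support on $\mathbb{R}^N$ and $\mathrm{supp}\,G_{x,t}\subset[\alpha_1,\beta_1]\times\cdots\times[\alpha_N,\beta_N]$, where $\alpha_i=\int_0^t\inf_{x\in\Lambda}\varphi_i(x,\tau)\,d\tau$ and $\beta_i=\int_0^t\sup_{x\in\Lambda}\varphi_i(x,\tau)\,d\tau$. The permutation of time integral and space supremum (resp. infimum) can then be performed by using Lemma A1 with $W=\pm\varphi_i$ (see Appendix A). One obtains $\alpha_i=a_i$ and $\beta_i=b_i$, yielding
$$\mathrm{supp}\,G_{x,t}\subset[a_1,b_1]\times\cdots\times[a_N,b_N]. \tag{22}$$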
It is worth noting that, heuristically, (22) follows immediately from the formal expression (10), since the product of the delta functions vanishes identically outside $[a_1,b_1]\times\cdots\times[a_N,b_N]$. It remains to prove that $E(x,t;s)$ defined by the r.h.s. of (9) is the solution to (1). To this end it suffices to note that Eq. (9) can be written as $E(x,t;s)=\Psi(x,t;\eta=i\lambda k(s))$, which is the solution to (14) with $\eta=i\lambda k(s)$ in the potential (13). It can be checked that the latter equation is indeed Eq. (1) [reconstruct $s^\dagger\gamma s$ from its monomial decomposition and multiply (14) by $-i$], which completes the proof of Lemma 1.
IV. Controlling the Growth of $|E(x,t;s)|^q$

The advantage gained by recasting (6) as (9) is that the latter formulation is suitable for a straightforward application of the Paley-Wiener theorem (see e.g. [11], Theorem XVI in Chapter VII, and [6], Theorem 7.4 in Chapter VI), offering the possibility of controlling the growth of $E(x,t;s)$ as $\|s\|\to+\infty$. This is embodied in Lemma 2 below. Let $\hat s\equiv s/\|s\|$ be the direction of $s$ in $\mathbb{C}^M$ and $H_{x,t}(\hat s)=\sup_{x(\cdot)\in B(x,t)}\int_0^tU(x(\tau),\tau;\hat s)\,d\tau$, with $U(x,\tau;\hat s)=\sum_{i=1}^N\hat k(s)_i\varphi_i(x,\tau)$, where $\hat k(s)=k(s)/\|k(s)\|$.

Lemma 2. For every $t>0$, $x\in\Lambda$, $m\in\overline{\mathbb{C}^+}\setminus\{0\}$, and $q$ a positive integer, one has
$$\limsup_{\|s\|\to+\infty}\frac{\ln|E(x,t;s)|^q}{\|s\|^2}=q\lambda H_{x,t}(\hat s), \tag{23}$$
along every given direction $\hat s$ in $\mathbb{C}^M$.

Proof. From Eqs. (8) and (9), one can rewrite the left-hand side of (23) as
$$\limsup_{\|s\|\to+\infty}\frac{\ln|E(x,t;s)|^q}{\|s\|^2}=q\lambda\limsup_{\|k(s)\|\to+\infty}\frac{1}{\lambda\|k(s)\|}\ln\left|\int\cdots\int_{\mathbb{R}^N}G_{x,t}(u)\,e^{\lambda k(s)\cdot u}\prod_{i=1}^Ndu_i\right|.$$
Fixing the direction of $s$ in $\mathbb{C}^M$ also fixes the direction of $k(s)$ in $\mathbb{R}^N$. Write $u=v\hat k(s)+u_\perp$, with $u_\perp\cdot\hat k(s)=0$, replace $G_{x,t}(u)$ by its Fourier representation (11), and let $\eta_\parallel$ and $\eta_\perp$ denote the Fourier conjugated variables of $v$ and $u_\perp$, respectively. One obtains
$$\int\cdots\int_{\mathbb{R}^N}G_{x,t}(u)\,e^{\lambda k(s)\cdot u}\prod_{i=1}^Ndu_i=\int\cdots\int_{\mathbb{R}^N}e^{\lambda\|k(s)\|v}\left[\frac{1}{(2\pi)^N}\int\cdots\int_{\mathbb{R}^N}\Psi(x,t;\eta)\,e^{i(v\eta_\parallel+u_\perp\cdot\eta_\perp)}\,d\eta_\parallel\prod_{i=1}^{N-1}d\eta_{\perp i}\right]dv\prod_{i=1}^{N-1}du_{\perp i}.$$
Performing the integration over $u_\perp$ first, and then the one over $\eta_\perp$, one finds that the latter expression reduces to
$$\int\cdots\int_{\mathbb{R}^N}G_{x,t}(u)\,e^{\lambda k(s)\cdot u}\prod_{i=1}^Ndu_i=\int_{\mathbb{R}}g_{x,t}(v)\,e^{\lambda\|k(s)\|v}\,dv,$$
with
$$g_{x,t}(v)=\frac{1}{2\pi}\int_{\mathbb{R}}\Psi(x,t;\eta_\parallel,\eta_\perp=0)\,e^{iv\eta_\parallel}\,d\eta_\parallel,$$
where $\Psi(x,t;\eta_\parallel,\eta_\perp=0)$ is the solution to (14) with $V(x,t;\eta)=\eta_\parallel U(x,t;\hat s)$. Let $a=\inf_{x(\cdot)\in B(x,t)}\int_0^tU(x(\tau),\tau;\hat s)\,d\tau$ and $b=\sup_{x(\cdot)\in B(x,t)}\int_0^tU(x(\tau),\tau;\hat s)\,d\tau=H_{x,t}(\hat s)$. By Lemma 1, $g_{x,t}$ is a distribution with compact support on $\mathbb{R}$ and $\mathrm{supp}\,g_{x,t}\subset[a,b]$. This implies that $\sup\{v:v\in\mathrm{supp}\,g_{x,t}\}\le H_{x,t}(\hat s)$, and by the Paley-Wiener theorem,
$$\limsup_{\|k(s)\|\to+\infty}\frac{1}{\lambda\|k(s)\|}\ln\left|\int_{\mathbb{R}}g_{x,t}(v)\,e^{\lambda\|k(s)\|v}\,dv\right|\le H_{x,t}(\hat s). \tag{24}$$
We now prove that (24) is an equality. Suppose that $\exists\varepsilon>0$ such that
$$\limsup_{\|k(s)\|\to+\infty}\frac{1}{\lambda\|k(s)\|}\ln\left|\int_{\mathbb{R}}g_{x,t}(v)\,e^{\lambda\|k(s)\|v}\,dv\right|\le H_{x,t}(\hat s)-\varepsilon. \tag{25}$$
Then, according to the Paley-Wiener theorem, $\sup\{v:v\in\mathrm{supp}\,g_{x,t}\}\le H_{x,t}(\hat s)-\varepsilon$. It is shown in Appendix B that $\sup\{v:v\in\mathrm{supp}\,g_{x,t}\}=H_{x,t}(\hat s)$, yielding $H_{x,t}(\hat s)\le H_{x,t}(\hat s)-\varepsilon$, in contradiction with $\varepsilon>0$. Thus, Eq. (25) is false and one obtains
$$\limsup_{\|k(s)\|\to+\infty}\frac{1}{\lambda\|k(s)\|}\ln\left|\int_{\mathbb{R}}g_{x,t}(v)\,e^{\lambda\|k(s)\|v}\,dv\right|=H_{x,t}(\hat s).$$
This completes the proof of Lemma 2.
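The Paley-Wiener mechanism behind Lemma 2 can be illustrated on a toy case. The sketch below is an assumption-laden illustration, not the $g_{x,t}$ of the text: it takes the hypothetical density $g\equiv 1$ on a hypothetical interval $[a,b]$, for which $\int_a^b e^{\Lambda v}\,dv=(e^{\Lambda b}-e^{\Lambda a})/\Lambda$ in closed form, and checks that the exponential growth rate $\Lambda^{-1}\ln\int g\,e^{\Lambda v}\,dv$ approaches the upper edge $b$ of the support from below as $\Lambda$ grows. The name `growth_rate` is introduced here for illustration only.

```python
import math


def growth_rate(a: float, b: float, lam: float) -> float:
    """Return (1/lam) * ln( integral_a^b exp(lam * v) dv ) for the toy density g = 1 on [a, b].

    The integral equals (exp(lam*b) - exp(lam*a)) / lam. It is evaluated in
    logarithmic form so that large values of lam do not overflow.
    """
    log_integral = lam * b + math.log1p(-math.exp(-lam * (b - a))) - math.log(lam)
    return log_integral / lam


if __name__ == "__main__":
    a, b = -1.0, 2.0  # hypothetical support endpoints, chosen only for the demo
    for lam in (10.0, 100.0, 1000.0, 10000.0):
        # The printed rates increase towards b = 2 as lam grows.
        print(lam, growth_rate(a, b, lam))
```

In this toy case the lower endpoint $a$ drops out of the limit, which mirrors the role played by $\sup\{v:v\in\mathrm{supp}\,g_{x,t}\}$ in the proof.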
V. Determination of $\lambda_q(x,t)$ and Comparison to $\bar\lambda_q(x,t)$

In this section we prove the conjecture made in Ref. [2] that $\lambda_q\le\bar\lambda_q$, in the case where $S(x,t)$ is given by (3). Since we wish to express the results in terms of eigenvalues of the correlation function $\langle S^*(x(t),t)S(x(t'),t')\rangle$, we begin with a technical preliminary linking these eigenvalues to those of the matrix $\int_0^t\gamma(x(\tau),\tau)\,d\tau$. Let $\mu_1[x(\cdot)]\ge\mu_2[x(\cdot)]\ge\cdots\ge 0$ be the eigenvalues of the covariance operator $T_{x(\cdot)}$ acting on $f(\tau)\in L^2(d\tau)$, defined by
$$(T_{x(\cdot)}f)(\tau)=\int_0^t\langle S^*(x(\tau),\tau)S(x(\tau'),\tau')\rangle f(\tau')\,d\tau', \tag{26}$$
with $0\le\tau,\tau'\le t$ and $x(\cdot)\in B(x,t)$. Let $f_i(\tau)\in L^2(d\tau)$ be the eigenfunction associated with $\mu_i[x(\cdot)]$ and define the vector $\sigma_i\in\mathbb{C}^M$ by $\sigma_{in}=\int_0^t\Phi_n^*(x(\tau),\tau)f_i^*(\tau)\,d\tau$. From (26), (3), and (4) one has
$$(T_{x(\cdot)}f_i)(\tau)=\sum_{m=1}^M\Phi_m^*(x(\tau),\tau)\,\sigma_{im}^*=\mu_i[x(\cdot)]f_i(\tau), \tag{27}$$
and
$$\mu_i[x(\cdot)]\sigma_{in}=\int_0^t\Phi_n^*(x(\tau),\tau)(T_{x(\cdot)}f_i)^*(\tau)\,d\tau=\sum_{m=1}^M\sigma_{im}\int_0^t\Phi_n^*(x(\tau),\tau)\Phi_m(x(\tau),\tau)\,d\tau=\sum_{m=1}^M\left[\int_0^t\gamma_{nm}(x(\tau),\tau)\,d\tau\right]\sigma_{im}. \tag{28}$$
It follows from the last equality of (28) that any non-vanishing eigenvalue of $T_{x(\cdot)}$ is also an eigenvalue of $\int_0^t\gamma(x(\tau),\tau)\,d\tau$ with eigenvector $\sigma_i$. Conversely, any non-vanishing eigenvalue of $\int_0^t\gamma(x(\tau),\tau)\,d\tau$ is also an eigenvalue of $T_{x(\cdot)}$ with eigenfunction $f_i(\tau)=\mu_i^{-1}\sum_{m=1}^M\sigma_{im}^*\Phi_m^*(x(\tau),\tau)$ [see Eq. (27)]. Thus, there is a one-to-one relationship between the non-vanishing eigenvalues of $T_{x(\cdot)}$ and $\int_0^t\gamma(x(\tau),\tau)\,d\tau$. In the sequel $\mu_1[x(\cdot)]$ will denote the largest eigenvalue of $T_{x(\cdot)}$ and of $\int_0^t\gamma(x(\tau),\tau)\,d\tau$. Define
$$\mu_{x,t}=\sup_{x(\cdot)\in B(x,t)}\mu_1[x(\cdot)].$$
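The correspondence between the non-vanishing eigenvalues of $T_{x(\cdot)}$ and those of $\int_0^t\gamma\,d\tau$ can be checked numerically on a discretised toy model. The sketch below makes several assumptions that are not in the text: the mode functions along a fixed path are replaced by seeded random complex samples `phi[n][k]` on a uniform time grid, the time integral is replaced by a Riemann sum with weight `dt`, and the largest eigenvalue is estimated by plain power iteration. The names `largest_eigenvalue` and `demo` are hypothetical.

```python
import random


def largest_eigenvalue(matrix, iterations=500):
    """Estimate the top eigenvalue of a Hermitian positive semi-definite matrix by power iteration."""
    size = len(matrix)
    vector = [complex(1.0, 0.0)] * size
    for _ in range(iterations):
        image = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
        norm = sum(abs(c) ** 2 for c in image) ** 0.5
        vector = [c / norm for c in image]
    image = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
    # Rayleigh quotient v^dagger M v for the normalised vector v.
    return sum((vector[i].conjugate() * image[i]).real for i in range(size))


def demo(num_modes=3, num_times=30, t=1.0, seed=0):
    """Return the top eigenvalues of the discretised operator T and of the matrix integral of gamma."""
    rng = random.Random(seed)
    dt = t / num_times
    # Hypothetical samples standing in for the mode functions along one fixed path.
    phi = [
        [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / (n + 1) for _ in range(num_times)]
        for n in range(num_modes)
    ]
    # Discretised kernel of T: sum over modes of conj(phi_n(tau_k)) * phi_n(tau_l), times dt.
    t_matrix = [
        [dt * sum(phi[n][k].conjugate() * phi[n][l] for n in range(num_modes)) for l in range(num_times)]
        for k in range(num_times)
    ]
    # Discretised time integral of gamma_nm = conj(phi_n) * phi_m.
    gamma_matrix = [
        [dt * sum(phi[n][k].conjugate() * phi[m][k] for k in range(num_times)) for m in range(num_modes)]
        for n in range(num_modes)
    ]
    return largest_eigenvalue(t_matrix), largest_eigenvalue(gamma_matrix)


if __name__ == "__main__":
    mu_t, mu_gamma = demo()
    print(mu_t, mu_gamma)  # the two estimates agree up to rounding
```

The design point is that the $M\times M$ matrix is far smaller than the discretised operator, which is why working with $\int_0^t\gamma\,d\tau$ is convenient in what follows.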
One can now prove the following proposition:

Proposition 1. For every $t>0$ and $x\in\Lambda$, $\lambda_q(x,t)=(q\mu_{x,t})^{-1}\le\bar\lambda_q(x,t)$.

Proof. First we prove $\lambda_q(x,t)\ge(q\mu_{x,t})^{-1}$. Expressing $U(x(\tau),\tau;\hat s)$ in terms of the quadratic form $s^\dagger\gamma(x(\tau),\tau)s$ in the expression for $H_{x,t}(\hat s)$, one has
$$H_{x,t}(\hat s)=\sup_{x(\cdot)\in B(x,t)}\frac{s^\dagger}{\|s\|}\left[\int_0^t\gamma(x(\tau),\tau)\,d\tau\right]\frac{s}{\|s\|}\le\mu_{x,t}.$$
Hence, by Lemma 2,
$$\limsup_{\|s\|\to+\infty}\frac{\ln|E(x,t;s)|^q}{\|s\|^2}\le q\lambda\mu_{x,t}.$$
This implies that for every $\lambda<(q\mu_{x,t})^{-1}$,
$$\langle|E(x,t)|^q\rangle=\int\cdots\int_{\mathbb{C}^M}e^{-\|s\|^2}|E(x,t;s)|^q\prod_{n=1}^M\frac{d^2s_n}{\pi}<+\infty, \tag{29}$$
which proves $\lambda_q(x,t)\ge(q\mu_{x,t})^{-1}$. We now prove the inequality $\lambda_q(x,t)\le(q\mu_{x,t})^{-1}$ [or, more exactly, $\lambda_q(x,t)\le(q\mu_{x,t}-0^+)^{-1}$]. To this end we follow the same line of reasoning as in Ref. [7]. Let $A(r)=\{z\in\mathbb{C}^M:|z_n|\le r,\ 1\le n\le M\}$. For Eq. (29) to hold it is necessary that, for every $r>0$,
$$\lim_{\|s\|\to+\infty}\int\cdots\int_{A(r)}e^{-\|s+s'\|^2}|E(x,t;s+s')|^q\prod_{n=1}^M\frac{d^2s'_n}{\pi}=0, \tag{30}$$
along every direction $\hat s$ in $\mathbb{C}^M$. For any fixed $s\in\mathbb{C}^M$, $e^{-2s^*\cdot z/q}E(x,t;s+z)$ is an entire function of $z\in\mathbb{C}^M$ and hence $|e^{-2s^*\cdot z/q}E(x,t;s+z)|^q$ is subharmonic w.r.t. each component of $z$ [5]. Thus, writing
$$e^{-\|s+s'\|^2}=e^{-\|s\|^2}e^{-\|s'\|^2}\left|\exp\left(-\frac{2}{q}s^*\cdot s'\right)\right|^q$$
in the integral (30), one obtains by the subharmonicity
$$\int\cdots\int_{A(r)}e^{-\|s+s'\|^2}|E(x,t;s+s')|^q\prod_{n=1}^M\frac{d^2s'_n}{\pi}=e^{-\|s\|^2}\int\cdots\int_{A(r)}e^{-\|s'\|^2}\left|e^{-\frac{2}{q}s^*\cdot s'}E(x,t;s+s')\right|^q\prod_{n=1}^M\frac{d^2s'_n}{\pi}\ge\left[1-\exp(-r^2)\right]^Me^{-\|s\|^2}|E(x,t;s)|^q,$$
and the condition (30) implies
$$\lim_{\|s\|\to+\infty}e^{-\|s\|^2}|E(x,t;s)|^q=0, \tag{31}$$
along every direction $\hat s$ in $\mathbb{C}^M$. Since every element of the matrix $\int_0^t\gamma(x(\tau),\tau)\,d\tau$ is a continuous functional of $x(\cdot)\in B(x,t)$ with the uniform norm on $[0,t]$ (see Appendix B), its eigenvalues are also continuous functionals of $x(\cdot)$. Accordingly, $\forall\varepsilon>0$ $\exists x_\varepsilon(\cdot)\in B(x,t)$ such that $\mu_{x,t}-\varepsilon\le\mu_1[x_\varepsilon(\cdot)]\le\mu_{x,t}$. Let $\sigma_\varepsilon\in\mathbb{C}^M$ (with $\|\sigma_\varepsilon\|=1$) be an eigenvector of $\int_0^t\gamma(x_\varepsilon(\tau),\tau)\,d\tau$ associated with the eigenvalue $\mu_1[x_\varepsilon(\cdot)]$ and take $\hat s=\sigma_\varepsilon$; then
$$H_{x,t}(\hat s)=\sup_{x(\cdot)\in B(x,t)}\sigma_\varepsilon^\dagger\left[\int_0^t\gamma(x(\tau),\tau)\,d\tau\right]\sigma_\varepsilon\ge\sigma_\varepsilon^\dagger\left[\int_0^t\gamma(x_\varepsilon(\tau),\tau)\,d\tau\right]\sigma_\varepsilon\ge\mu_{x,t}-\varepsilon. \tag{32}$$
Thus, along the direction of $\sigma_\varepsilon$, Lemma 2 and Eq. (32) yield
$$\limsup_{\|s\|\to+\infty}\frac{\ln|E(x,t;s)|^q}{\|s\|^2}\ge q\lambda(\mu_{x,t}-\varepsilon),$$
and for every $\lambda>(q\mu_{x,t}-q\varepsilon)^{-1}$,
$$\limsup_{\|s\|\to+\infty}e^{-\|s\|^2}|E(x,t;s)|^q=+\infty,$$
in contradiction with (31). Therefore, $\lambda_q(x,t)\le(q\mu_{x,t}-q\varepsilon)^{-1}$. Taking $\varepsilon$ arbitrarily small one obtains $\lambda_q(x,t)\le(q\mu_{x,t}-0^+)^{-1}$, hence $\lambda_q(x,t)=(q\mu_{x,t})^{-1}$. Finally, we always have $\bar\lambda_q(x,t)=1/q\mu_1[x(\cdot)\equiv x]$ (see [2]), and $\mu_{x,t}\ge\mu_1[x(\cdot)\equiv x]$ yields $\lambda_q(x,t)\le\bar\lambda_q(x,t)$, which completes the proof of the proposition.

VI. Summary and Perspectives

In this paper, we have studied the effects of propagation on the divergence of the solution to a linear amplifier driven by the square of a Gaussian field. We have considered a model in which the propagation is that of a free Schrödinger equation with a complex mass. For this model, we have explicitly determined the values of the coupling constant at which the moments of the solution diverge. We proved that the divergence yielded by a propagation-free calculation, i.e. in the limit of an infinite mass, cannot occur at a smaller coupling constant than the one obtained with a finite mass. This extends the results of ref. [2], where such an inequality was proven in the diffusion case only, i.e. imaginary mass. As explained in the conclusion of ref. [2], the stumbling block to going beyond the purely diffusive case was to control the growth of a complex Feynman path-integral. Our solution of this problem is based on the realization that, if $S$ is given by Eq. (3), the Feynman path-integral can be rewritten as the Fourier integral of a distribution with compact support (Lemma 1). Control can then be obtained as a consequence of the Paley-Wiener theorem (Lemma 2).
In conclusion we outline some possible generalizations of this work. From a practical point of view, it would be interesting to find out whether there exists a class of $S$ of the form (3) for which there are no propagation effects on the onset of the divergence, i.e. for which $\lambda_q(x,t)=\bar\lambda_q(x,t)$. In addition, since most Gaussian fields of physical interest admit Karhunen-Loève-type expansions, it would also be very interesting to find a way to generalize our solution to the case where the finite sum (3) is replaced by an infinite sum. Other problems involve relaxing some of the assumptions in (1). For instance, under what conditions on $S$ do the results carry over to the case where $\Lambda$ is replaced by $\mathbb{R}^d$ and $E(x,0)\in L^2(\mathbb{R}^d)$? It should also be checked whether our solution of the problem is robust with respect to the initial condition. If the answer is no, the size of the set of $E(x,0)$ for which our results do not hold should be estimated according to physically relevant measures on the space of $E(x,0)$.

Acknowledgements. We warmly thank K. Yajima, C. Kopper, and G. Ben Arous for providing many valuable insights all along the completion of this work. The work of J. L. L. was supported by ASFOSR grant 49620-01-1-0154 and NSF grant DMR 01-279-26. J. L. L. also thanks the IHES at Bures-sur-Yvette, France, where part of this work was done.
Appendix A: On the Permutation of Time Integral and Space Supremum

This appendix is devoted to the proof of the following lemma:

Lemma A1. Let $\Lambda$ be a compact pathwise connected metric space [with distance denoted by $d(\cdot,\cdot)$], and $t>0$ a real number. Let $W$ be a real function on $\Lambda\times[0,t]$ such that $W(\cdot,\tau)$ is continuous in $x$ uniformly in $\tau\in[0,t]$, and $\forall x\in\Lambda$, $W(x,\cdot)$ is piecewise continuous with a finite number of discontinuities. Then, for any $x\in\Lambda$,
$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau=\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau,$$
where $B(x,t)$ is the set of continuous paths in $\Lambda$ satisfying $x(t)=x$.

Proof. We obviously have
$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau\ge\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau,$$
and it remains to prove the inequality in the other direction. First we assume $W\in C^0(\Lambda\times[0,t])$. Since $\Lambda\times[0,t]$ is compact, $W$ is uniformly continuous, and for every $\varepsilon>0$ we can find a number $\delta=\delta(\varepsilon)>0$ such that if $\max\{d(x,x'),|\tau-\tau'|\}<\delta(\varepsilon)$, then $|W(x,\tau)-W(x',\tau')|<\varepsilon$. Moreover, $\delta(\varepsilon)$ tends to zero with $\varepsilon$. If $t/\delta(\varepsilon)$ is not an integer, it is convenient to replace $\delta(\varepsilon)$ by the smaller quantity $t/(1+[t/\delta(\varepsilon)])$, where $[\cdot]$ denotes the integer part, and from now on we will assume that $t/\delta(\varepsilon)$ is an integer. For a fixed $\varepsilon>0$, let $N=N(\varepsilon)=t/\delta(\varepsilon)$, and let $\mathcal{R}_\varepsilon$ be a finite partition of $\Lambda$ by sets of diameter at most $\delta(\varepsilon)$. We observe that one has
$$\sup_{F\in\mathcal{R}_\varepsilon}\ \sup_{0\le q\le N-1}\mathrm{Osc}_{F\times[qt/N,\,(q+1)t/N]}W\le\varepsilon, \tag{A1}$$
where "Osc" denotes the oscillation of the function (namely, its sup minus its inf). We now choose once and for all a point $(x_{F,q},\tau_q)$ in each $F\times[qt/N,(q+1)t/N]$. For each $0\le q\le N-1$ and any $\tau\in[qt/N,(q+1)t/N]$ one has
$$\sup_{y\in\Lambda}W(y,\tau)\le\sup_{F\in\mathcal{R}_\varepsilon}\ \sup_{F\times[qt/N,\,(q+1)t/N]}W$$
$$\le\sup_{F\in\mathcal{R}_\varepsilon}\left[\sup_{F\times[qt/N,\,(q+1)t/N]}W+W(x_{F,q},\tau_q)-\inf_{F\times[qt/N,\,(q+1)t/N]}W\right]$$
$$\le\sup_{F\in\mathcal{R}_\varepsilon}W(x_{F,q},\tau_q)+\sup_{F\in\mathcal{R}_\varepsilon}\mathrm{Osc}_{F\times[qt/N,\,(q+1)t/N]}W$$
$$\le\sup_{F\in\mathcal{R}_\varepsilon}W(x_{F,q},\tau_q)+\sup_{F\in\mathcal{R}_\varepsilon}\ \sup_{0\le q\le N-1}\mathrm{Osc}_{F\times[qt/N,\,(q+1)t/N]}W$$
$$\le\sup_{F\in\mathcal{R}_\varepsilon}W(x_{F,q},\tau_q)+\varepsilon,$$
where we have used the inequality (A1). Thus, choosing an atom $F_q\in\mathcal{R}_\varepsilon$ such that $\sup_{F\in\mathcal{R}_\varepsilon}W(x_{F,q},\tau_q)=W(x_{F_q,q},\tau_q)$, one has for any $\tau\in[qt/N,(q+1)t/N]$,
$$\sup_{y\in\Lambda}W(y,\tau)\le W(x_{F_q,q},\tau_q)+\varepsilon. \tag{A2}$$
Let $y_1$ and $y_2$ be two given points in $\Lambda$. We now define a continuous path $x_\varepsilon$ in $\Lambda$ from $y_1$ to $y_2$. For any $1\le j\le N-1$ we choose a family of continuous paths $x_j$ from $[-t/2N^2,t/2N^2]$ to $\Lambda$ satisfying $x_j(-t/2N^2)=x_{F_{j-1},j-1}$ and $x_j(t/2N^2)=x_{F_j,j}$. We also choose a continuous path $x_0$ from $[0,t/2N^2]$ to $\Lambda$ such that $x_0(0)=y_1$ and $x_0(t/2N^2)=x_{F_0,0}$, and a continuous path $x_N$ from $[-t/2N^2,0]$ to $\Lambda$ such that $x_N(0)=y_2$ and $x_N(-t/2N^2)=x_{F_{N-1},N-1}$. The continuous path $x_\varepsilon$ is defined by
$$x_\varepsilon(\tau)=\begin{cases}x_0(\tau)&\text{for }0\le\tau\le t/2N^2,\\ x_{F_q,q}&\text{for }qt/N+t/2N^2\le\tau\le(q+1)t/N-t/2N^2,\\ x_q(\tau-qt/N)&\text{for }qt/N-t/2N^2\le\tau\le qt/N+t/2N^2\text{ for }q\ne 0,\\ x_N(\tau-t)&\text{for }t-t/2N^2\le\tau\le t.\end{cases} \tag{A3}$$
One can observe that the Lebesgue measure of the time domain over which $x_\varepsilon(\tau)\ne x_{F_q,q}$ is at most equal to $t/N$. Since $\Lambda$ is compact and $\forall\tau\in[0,t]$, $W(\cdot,\tau)\in C^0(\Lambda)$, there is a finite number $M>0$ such that for any $(x,\tau)\in\Lambda\times[0,t]$ one has $|W(x,\tau)|\le M$. Thus, $|W(x_{F_q,q},\tau)-W(x_\varepsilon(\tau),\tau)|\le 2M$ for any $\tau$ in $[qt/N,(q+1)t/N]$. Using the latter estimate, the remark below (A3), and (A2) one obtains
$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau=\sum_{q=0}^{N-1}\int_{qt/N}^{(q+1)t/N}\sup_{y\in\Lambda}W(y,\tau)\,d\tau\le\varepsilon t+\sum_{q=0}^{N-1}\int_{qt/N}^{(q+1)t/N}W(x_{F_q,q},\tau)\,d\tau\le\varepsilon t+\frac{2Mt}{N}+\int_0^tW(x_\varepsilon(\tau),\tau)\,d\tau. \tag{A4}$$
Now, assume that there is a finite set of times independent of $x$, $\{\tau_1,...,\tau_L\}$, such that $\forall x\in\Lambda$ the set of times at which $W(x,\cdot)$ is discontinuous is a subset of $\{\tau_1,...,\tau_L\}$.
Thus, (A4) applies in each time interval $[\tau_i,\tau_{i+1}]$, $0\le i\le L$, with $\tau_0=0$ and $\tau_{L+1}=t$. Let $y_0,y_1,...,y_{L+1}$ be $L+2$ points in $\Lambda$ with $y_{L+1}=x$. Let $x_\varepsilon$ be a continuous path in $\Lambda$ passing by $x_\varepsilon(\tau_i)=y_i$ and defined by (A3) in each time interval $[\tau_i,\tau_{i+1}]$. From Eq. (A4), in which one writes $C_i(\varepsilon)=\varepsilon+2M/N_i(\varepsilon)$, it follows
$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau\le\sum_{i=0}^LC_i(\varepsilon)(\tau_{i+1}-\tau_i)+\sum_{i=0}^L\int_{\tau_i}^{\tau_{i+1}}W(x_\varepsilon(\tau),\tau)\,d\tau$$
$$=\sum_{i=0}^LC_i(\varepsilon)(\tau_{i+1}-\tau_i)+\int_0^tW(x_\varepsilon(\tau),\tau)\,d\tau$$
$$\le\sum_{i=0}^LC_i(\varepsilon)(\tau_{i+1}-\tau_i)+\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau,$$
where the last inequality results from the fact that $x_\varepsilon\in B(x,t)$. The proof of Lemma A1 for the class of $W$ considered in this paragraph is completed by taking the limit $\varepsilon\to 0$ and observing that the $C_i(\varepsilon)$ tend to zero with $\varepsilon$.

We are now ready to prove Lemma A1 in the general case. Since $W(\cdot,\tau)$ is continuous in $x$ uniformly in $\tau\in[0,t]$, for any $\varepsilon>0$ we can find $\delta=\delta(\varepsilon)>0$ such that if $d(x,x')\le\delta$, then
$$\sup_{\tau\in[0,t]}|W(x,\tau)-W(x',\tau)|\le\varepsilon.$$
Since $\Lambda$ is compact, we can find a finite covering by open balls of radius at most $\delta/2$, and therefore a finite partition of unity $(\chi_k)$ by continuous functions whose support has diameter at most $\delta$ (see [3], paragraph 4.3 in Chapter IX). For any $k$ we choose once and for all a point $x_k\in\mathrm{supp}\,\chi_k$, and define the function
$$W_\varepsilon(x,\tau)=\sum_kW(x_k,\tau)\chi_k(x).$$
For each fixed $\tau$, this function is obviously continuous in $x$, and for fixed $x$, it is piecewise continuous in $\tau$, with the possible discontinuity points belonging to a finite set which can be chosen independent of $x$. From the previous result we have
$$\int_0^t\sup_{y\in\Lambda}W_\varepsilon(y,\tau)\,d\tau=\sup_{x(\cdot)\in B(x,t)}\int_0^tW_\varepsilon(x(\tau),\tau)\,d\tau.$$
Since $\sup_{x\in\Lambda,\tau\in[0,t]}|W(x,\tau)-W_\varepsilon(x,\tau)|\le\varepsilon$, we deduce that
$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau\le\varepsilon t+\int_0^t\sup_{y\in\Lambda}W_\varepsilon(y,\tau)\,d\tau=\varepsilon t+\sup_{x(\cdot)\in B(x,t)}\int_0^tW_\varepsilon(x(\tau),\tau)\,d\tau\le 2\varepsilon t+\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau.$$
Since the estimate holds for any $\varepsilon>0$ the general result follows.
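Lemma A1 can be illustrated numerically in a one-dimensional toy setting. The sketch below rests on assumptions that are not part of the text: the space is taken to be the interval $[0,1]$ with $t=1$, the function is the hypothetical choice $W(y,\tau)=\cos(2\pi(y-\tau))$, for which $\sup_yW(y,\tau)=1$ is attained at $y=\tau$, and the endpoint is fixed at $x=0.25$. The path follows the maximiser until time $t-\delta$ and then moves linearly to $x$, in the spirit of the construction (A3); the deficit $\int_0^t\sup_yW\,d\tau-\int_0^tW(x(\tau),\tau)\,d\tau$ then shrinks in proportion to $\delta$. The function names are introduced here for illustration only.

```python
import math


def w(y: float, tau: float) -> float:
    """Hypothetical test function; its supremum over y in [0, 1] equals 1, attained at y = tau."""
    return math.cos(2.0 * math.pi * (y - tau))


def path(tau: float, x_end: float, delta: float, t: float = 1.0) -> float:
    """Continuous path that tracks the maximiser y = tau, then moves linearly to x_end during [t - delta, t]."""
    switch = t - delta
    if tau <= switch:
        return tau
    fraction = (tau - switch) / delta
    return switch + fraction * (x_end - switch)


def integral_along_path(x_end: float, delta: float, n: int = 100_000, t: float = 1.0) -> float:
    """Midpoint-rule approximation of the time integral of w along the path."""
    h = t / n
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        total += w(path(tau, x_end, delta, t), tau)
    return total * h


if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-3):
        # The integral of sup_y w over [0, 1] is exactly 1; print the deficit of this particular path.
        print(delta, 1.0 - integral_along_path(0.25, delta))
```

Only the time spent away from the maximiser contributes to the deficit, which is the same reason the term $2Mt/N$ in (A4) vanishes as $N\to\infty$.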
Appendix B: Determination of the Support of $g^{(m)}_{x,t}$

Let $g^{(m)}_{x,t}(u)$ be a distribution with compact support on $\mathbb{R}$ whose Fourier transform, $\Psi^{(m)}(x,t;\eta)\equiv(\mathcal{F}g^{(m)}_{x,t})(\eta)$ with $\eta\in\mathbb{R}$, is the solution to (14) with $V(x,t;\eta)=\eta U(x,t;\hat s)$, where $U(x,t;\hat s)=\sum_{i=1}^N\hat k(s)_i\varphi_i(x,t)$. This appendix is devoted to the determination of the support of $g^{(m)}_{x,t}$. We have modified the notation used in the text to make the dependence on $m$ explicit. We begin with a technical lemma that will be useful in the sequel. Let $C_0^\infty(\mathbb{R})$ denote the set of all smooth compactly supported functions in $\mathbb{R}$.

Lemma B1. For every $t>0$, $x\in\Lambda$, and $f\in C_0^\infty(\mathbb{R})$, $\int_{\mathbb{R}}g^{(m)}_{x,t}(v)f(v)\,dv$ is an analytic function of $m$ on $\mathbb{C}^+\equiv\{m\in\mathbb{C}:\mathrm{Im}(m)>0\}$, and $\int_{\mathbb{R}}g^{(m)}_{x,t}(v)f(v)\,dv=\lim_{\gamma\to 0^+}\int_{\mathbb{R}}g^{(m+i\gamma)}_{x,t}(v)f(v)\,dv$ for each real $m\ne 0$.

Proof. As a Fourier transform of a function with compact support on $\mathbb{R}$, $(\mathcal{F}f)(\eta)$ is an analytic function of $\eta\in\mathbb{C}$. We have seen at the beginning of the proof of Lemma 1 that $\Psi^{(m)}(x,t;\eta)$, with $m\in\mathbb{C}^+$, is also analytic in $\eta$. Furthermore, if $\Lambda$ is a torus and $V$ is bounded on $\Lambda$ (which is the case), then (i) $\Psi^{(m)}(x,t;\eta)$ is analytic in $m\in\mathbb{C}^+$; and (ii) $\forall\eta\in\mathbb{C}$, $\Psi^{(m)}(x,t;\eta)=\lim_{\gamma\to 0^+}\Psi^{(m+i\gamma)}(x,t;\eta)$ for each real $m\ne 0$. We are indebted to K. Yajima for the proof of the latter result, which we reproduce here for the sake of completeness [13]. Define
$$U_m(t)=\exp\left(\frac{it\Delta}{2m}\right),\quad t\ge 0,\ m\in\overline{\mathbb{C}^+}\setminus\{0\},$$
and write the initial value problem (14) in the form of an integral equation,
$$\Psi^{(m)}(t)=1-i\eta\int_0^tU_m(t-\tau)V(\tau)\Psi^{(m)}(\tau)\,d\tau.$$
Here $V(t)$ is the multiplication operator with $U(x,t;\hat s)$. Let $\mathcal{B}$ denote the space of bounded operators in $L^2(\Lambda)$. By Fourier series expansion, it is evident that (a) $\|U_m(t)\Psi^{(m)}(t)\|_2\le\|\Psi^{(m)}(t)\|_2$, viz. $U_m(t)\in\mathcal{B}$ and $\|U_m(t)\|\le 1$; (b) the function $[0,\infty)\times\overline{\mathbb{C}^+}\setminus\{0\}\ni(t,m)\to U_m(t)\in\mathcal{B}$ is strongly continuous [viz. $(t,m)\to U_m(t)f\in L^2(\Lambda)$ is continuous for every $f\in L^2(\Lambda)$]; and (c) for every $t\ge 0$, $m\to U_m(t)\in\mathcal{B}$ is analytic for $m\in\mathbb{C}^+$ and $(d/dm)U_m(t)$ is norm continuous w.r.t. $(t,m)\in[0,\infty)\times\mathbb{C}^+$. It follows from the boundedness of $V$ that the Dyson expansion [8]
$$D_m(t)=U_m(t)-i\eta\int_0^tU_m(t-\tau)V(\tau)U_m(\tau)\,d\tau+\cdots+(-i\eta)^n\int_{0<\tau_1<\cdots<\tau_n<t}U_m(t-\tau_n)V(\tau_n)\cdots V(\tau_1)U_m(\tau_1)\,d\tau_1\cdots d\tau_n+\cdots$$
converges in the operator norm of $\mathcal{B}$ uniformly w.r.t. $(t,m)$ in every compact subset of $[0,\infty)\times(\overline{\mathbb{C}^+}\setminus\{0\})$. Thus, the operator $D_m(t)$ enjoys the same properties (b) and (c) mentioned above as an operator valued function of $t$ and $m$. It is easy to check that $D_m(t)$ defines the propagator for (14) and is unitary if $m$ is real. Hence the solution to (14) satisfies properties (i) and (ii).
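Analyticity of $\Psi^{(m)}(x,t;\eta)$ and $(\mathcal{F}f)(\eta)$ in $m\in\mathbb{C}^+$ and $\eta\in\mathbb{C}$ implies that
$$\int_{\mathbb{R}}g^{(m)}_{x,t}(u)f(u)\,du\equiv\int_{\mathbb{R}}\Psi^{(m)}(x,t;\eta)(\mathcal{F}f)(-\eta)\,\frac{d\eta}{2\pi}$$
is an analytic function of $m\in\mathbb{C}^+$. By Eq. (20) and the fact that none of the constants $A$, $B$, and $C$ depend on $m$, $|\Psi^{(m)}(x,t;\eta)(\mathcal{F}f)(-\eta)|$ is bounded by an integrable function of $\eta$ independent of $m$, from which it follows that the $m$-limit and the $\eta$-integral can be interchanged. Thus, according to (ii), one finds that for each real $m\ne 0$,
$$\lim_{\gamma\to 0^+}\int_{\mathbb{R}}g^{(m+i\gamma)}_{x,t}(u)f(u)\,du=\lim_{\gamma\to 0^+}\int_{\mathbb{R}}\Psi^{(m+i\gamma)}(x,t;\eta)(\mathcal{F}f)(-\eta)\,\frac{d\eta}{2\pi}=\int_{\mathbb{R}}\lim_{\gamma\to 0^+}\Psi^{(m+i\gamma)}(x,t;\eta)(\mathcal{F}f)(-\eta)\,\frac{d\eta}{2\pi}$$
$$=\int_{\mathbb{R}}\Psi^{(m)}(x,t;\eta)(\mathcal{F}f)(-\eta)\,\frac{d\eta}{2\pi}=\int_{\mathbb{R}}g^{(m)}_{x,t}(u)f(u)\,du,$$
which completes the proof of Lemma B1.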
Let $a=\inf_{x(\cdot)\in B(x,t)}\int_0^tU(x(\tau),\tau;\hat s)\,d\tau$ and $b=\sup_{x(\cdot)\in B(x,t)}\int_0^tU(x(\tau),\tau;\hat s)\,d\tau=H_{x,t}(\hat s)$. One has the following lemma:

Lemma B2. For every $t>0$, $x\in\Lambda$, and $m\in\overline{\mathbb{C}^+}\setminus\{0\}$, the support of $g^{(m)}_{x,t}$ is equal to $[a,b]$.

Proof. First, consider the case $m=i\gamma$, $\gamma>0$. Denote by $\alpha[x(\cdot)]$ the functional $\alpha[x(\cdot)]\equiv\int_0^tU(x(\tau),\tau;\hat s)\,d\tau$. By boundedness of $\nabla V$ over $\Lambda\times[0,t]$, $\exists A>0$ such that $\forall x,y\in\Lambda$ and $0\le\tau\le t$,
$$|U(x,\tau;\hat s)-U(y,\tau;\hat s)|=\left|\int_{\overset{\frown}{x,y}}\nabla U(x',\tau;\hat s)\cdot dx'\right|\le A\|x-y\|,$$
where $\overset{\frown}{x,y}$ denotes the segment of geodesic from $x$ to $y$ and $\|\cdot\|$ is the usual Euclidean distance. This implies that $\forall x(\cdot),y(\cdot)\in B(x,t)$,
$$|\alpha[x(\cdot)]-\alpha[y(\cdot)]|\le\int_0^t|U(x(\tau),\tau;\hat s)-U(y(\tau),\tau;\hat s)|\,d\tau\le A\int_0^t\|x(\tau)-y(\tau)\|\,d\tau\le At\sup_{0\le\tau\le t}\|x(\tau)-y(\tau)\|,$$
which shows that $\alpha[x(\cdot)]$ is a continuous functional of $x(\cdot)\in B(x,t)$ with the uniform norm on $[0,t]$. Let $h\in C_0^\infty(\mathbb{R})$ be a real positive test function with support in $[a,b]$ and $\sup_{u\in\mathbb{R}}h(u)=1$. From the continuity of $\alpha[x(\cdot)]$ it follows that $\exists x_0(\cdot)\in B(x,t)$ such that $h(\alpha[x_0(\cdot)])=1$. By continuity of $h$ and $\alpha[x(\cdot)]$ it follows that $\forall\varepsilon>0$, $\exists\delta>0$ such that $|h(\alpha[x(\cdot)])-1|<\varepsilon$ for every $x(\cdot)\in B_0(\delta)\equiv\{x(\cdot)\in B(x,t):\sup_{0\le\tau\le t}\|x(\tau)-x_0(\tau)\|<\delta\}$. Take $\varepsilon=1/2$; in this case $h(\alpha[x(\cdot)])>1/2$ for every $x(\cdot)\in B_0(\delta)$ and one has
$$\int_{\mathbb{R}}g^{(i\gamma)}_{x,t}(u)h(u)\,du=\int_{x(\cdot)\in B(x,t)}e^{-\frac{\gamma}{2}\int_0^t\dot{x}(\tau)^2d\tau}h(\alpha[x(\cdot)])\,d[x(\cdot)]\ge\int_{x(\cdot)\in B_0(\delta)}e^{-\frac{\gamma}{2}\int_0^t\dot{x}(\tau)^2d\tau}h(\alpha[x(\cdot)])\,d[x(\cdot)]>\frac{1}{2}\int_{x(\cdot)\in B_0(\delta)}e^{-\frac{\gamma}{2}\int_0^t\dot{x}(\tau)^2d\tau}\,d[x(\cdot)].$$
Since the set of the Brownian paths $x(\cdot)$ that are in $B_0(\delta)$ has a strictly positive Wiener measure, the last term is strictly positive and one finds
$$\int_{\mathbb{R}}g^{(i\gamma)}_{x,t}(u)h(u)\,du>0. \tag{B1}$$
If there were an open subset of $[a,b]$ not intersecting the support of $g^{(i\gamma)}_{x,t}$, it would be possible to choose the support of $h$ outside the one of $g^{(i\gamma)}_{x,t}$, yielding $\int g^{(i\gamma)}_{x,t}(u)h(u)\,du=0$ in contradiction with Eq. (B1). Thus, for every $x\in\Lambda$ and $\gamma>0$, the support of $g^{(i\gamma)}_{x,t}$ is equal to $[a,b]$.

Consider now the general case $m\in\overline{\mathbb{C}^+}\setminus\{0\}$. Assume that there is an open subset of $[a,b]$ not intersecting the support of $g^{(m)}_{x,t}$. In this case it is possible to choose the support of $h$ outside the one of $g^{(m)}_{x,t}$, yielding $\int g^{(m)}_{x,t}(u)h(u)\,du=0$. By Lemma B1 the support of $g^{(z)}_{x,t}$ must vary continuously with $z\in\mathbb{C}^+$, whence the support of $h$ can be taken small enough such that there is an open subset $\mathcal{V}(m)\subset\mathbb{C}^+$ with $m\in\overline{\mathcal{V}(m)}$ and $\int g^{(z)}_{x,t}(u)h(u)\,du=0$ identically in $\mathcal{V}(m)$. From the analyticity of $\int g^{(z)}_{x,t}(u)h(u)\,du$ in $z$ on $\mathbb{C}^+$ (Lemma B1), it follows immediately that $\int g^{(z)}_{x,t}(u)h(u)\,du=0$ identically in all $\mathbb{C}^+$, in contradiction with Eq. (B1), which completes the proof of Lemma B2. In particular, one has $\sup\{v:v\in\mathrm{supp}\,g^{(m)}_{x,t}\}=b=H_{x,t}(\hat s)$, which is the result used in the proof of Lemma 2.

References
1. Akhmanov, S.A., D'yakov, Yu.E., Pavlov, L.I.: Statistical phenomena in Raman scattering stimulated by a broad-band pump. Sov. Phys. JETP 39, 249–256 (1974)
2. Asselah, A., Dai Pra, P., Lebowitz, J.L., Mounaix, Ph.: Diffusion effects on the breakdown of a linear amplifier model driven by the square of a Gaussian field. J. Stat. Phys. 104, 1299–1315 (2001)
3. Bourbaki, N.: Éléments de mathématique: topologie générale, chapitres 5 à 10. Paris: Dunod, 1997 (in French)
4. DeWitt-Morette, C.: Feynman's path integral. Definition without limiting procedure. Commun. Math. Phys. 28, 47–67 (1972); Cartier, P., DeWitt-Morette, C.: A new perspective on functional integration. J. Math. Phys. 36, 2237–2312 (1995); Cartier, P., DeWitt-Morette, C.: Functional integration. J. Math. Phys. 41, 4154–4187 (2000)
5. Hayman, W.K., Kennedy, P.B.: Subharmonic Functions, Vol. I. London Mathematical Society Monographs, No. 9. London-New York: Academic Press (Harcourt Brace Jovanovich, Publishers), 1976
6. Katznelson, Y.: An Introduction to Harmonic Analysis. Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo: Cambridge University Press, 2004
7. Mounaix, Ph., Lebowitz, J.L.: Note on a diffraction-amplification problem. J. Phys. A: Math. Gen. 37, 5289–5294 (2004)
8. Reed, M., Simon, B.: Methods of Modern Mathematical Physics 2: Fourier Analysis, Self-Adjointness. New York-San Francisco-London: Academic Press, 1975
9. Rose, H.A., DuBois, D.F.: Statistical properties of laser hot spots produced by a random phase plate. Phys. Fluids B 5, 590–596 (1993)
10. Rose, H.A., DuBois, D.F.: Laser hot spots and the breakdown of linear instability theory with application to stimulated Brillouin scattering. Phys. Rev. Lett. 72, 2883–2886 (1994)
11. Schwartz, L.: Théorie des distributions. Paris: Hermann, 1997 (in French)
12. Thirring, W.: A Course in Mathematical Physics 3: Quantum Mechanics of Atoms and Molecules. Wien-New York: Springer-Verlag, 1991
13. Yajima, K.: Private communication, 2004

Communicated by A. Kupiainen
Commun. Math. Phys. 264, 759–772 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1520-0
Communications in
Mathematical Physics
Spectral Asymptotics of Pauli Operators and Orthogonal Polynomials in Complex Domains N. Filonov1 , A. Pushnitski2, 1
Department of Mathematical Physics, Faculty of Physics, St. Petersburg State University, 198504 St. Petersburg, Russia. E-mail:
[email protected] 2 Mathematics 253-37, Caltech, Pasadena, CA 91125, U.S.A. E-mail:
[email protected] Received: 24 May 2005 / Accepted: 7 September 2005 Published online: 15 February 2006 – © Springer-Verlag 2006
Abstract: We consider the spectrum of a two-dimensional Pauli operator with a compactly supported electric potential and a variable magnetic field with a positive mean value. The rate of accumulation of eigenvalues to zero is described in terms of the logarithmic capacity of the support of the electric potential. A connection between these eigenvalues and orthogonal polynomials in complex domains is established.

1. Introduction

1.1. The unperturbed Pauli operator. Let $B=B(x)$, $x=(x_1,x_2)\in\mathbb{R}^2$, be a real valued function which has the physical meaning of the strength of a magnetic field in $\mathbb{R}^2$. A two-dimensional non-relativistic electron in the external magnetic field $B$ can be described by the Pauli operator
$$h=\begin{pmatrix}h^+&0\\0&h^-\end{pmatrix}\quad\text{in }L^2(\mathbb{R}^2)\oplus L^2(\mathbb{R}^2).$$
The standard approach to the definition of the operators $h^\pm$ in $L^2(\mathbb{R}^2)$ involves introducing the magnetic vector potential $A(x)=(A_1(x),A_2(x))$ such that $B=\partial_{x_1}A_2-\partial_{x_2}A_1$ and setting
$$h^\pm=(-i\nabla-A)^2\mp B. \tag{1.1}$$
Instead, we adopt the approach advocated in [6], which consists of defining $h^\pm$ in terms of a solution $\Psi=\Psi(x)$ to the differential equation $\Delta\Psi=B$. Assume that $B$ is such that a solution $\Psi$ can be chosen subject to the condition
$$\Psi(x)=\frac{B_0}{4}|x|^2+\Psi_1(x),\quad\Psi_1=\overline{\Psi_1}\in L^\infty(\mathbb{R}^2),\quad B_0>0. \tag{1.2}$$
On leave of absence from King's College London, U.K.
Important examples of magnetic fields $B$ of this class are periodic fields with mean value $B_0$ and constant magnetic fields $B(x)=B_0$. Next, denote, as usual, $\partial=\frac{1}{2}(\partial_{x_1}-i\partial_{x_2})$ and $\bar\partial=\frac{1}{2}(\partial_{x_1}+i\partial_{x_2})$. Consider the quadratic forms
$$\mathfrak{h}^+[u]=4\int_{\mathbb{R}^2}|\bar\partial(e^{\Psi(x)}u(x))|^2e^{-2\Psi(x)}\,dx,\qquad\mathfrak{h}^-[u]=4\int_{\mathbb{R}^2}|\partial(e^{-\Psi(x)}u(x))|^2e^{2\Psi(x)}\,dx, \tag{1.3}$$
which are closed on the domains $\mathrm{Dom}(\mathfrak{h}^\pm)=\{u\in L^2(\mathbb{R}^2)\mid\mathfrak{h}^\pm[u]<\infty\}$. Let us define $h^\pm$ as the self-adjoint operators in $L^2(\mathbb{R}^2)$ corresponding to the quadratic forms $\mathfrak{h}^\pm$. For a wide class of magnetic fields this definition is equivalent to the standard definition (1.1) with $A=(-\partial_{x_2}\Psi,\partial_{x_1}\Psi)$; see [6] for a detailed analysis of this issue. In fact, neither the magnetic field $B$ nor the magnetic vector potential $A$ enters directly into the definition of the Pauli operator or any of our considerations; instead, the 'potential function' $\Psi$ becomes the main functional parameter. Note that condition (1.2) is very close to the 'admissibility' condition used in [16]. We will denote by $\mathfrak{h}_0^\pm$ and $h_0^\pm$ the above defined forms and operators corresponding to the case of the constant magnetic field $B(x)=B_0>0$. Note that for all $u\in\mathrm{Dom}(\mathfrak{h}_0^-)$ one has
$$\mathfrak{h}_0^-[u]\ge 2B_0\|u\|^2. \tag{1.4}$$
1.2. Zero modes and the spectral gap. It is well known that the Pauli operator $h$ has an infinite-dimensional kernel. More precisely, by the argument dating back to [1], we have
$$\mathrm{Ker}\,h^-=\{0\}\quad\text{and}\quad\mathrm{Ker}\,h^+=\{u\in L^2(\mathbb{R}^2)\mid u(x)=f(x)e^{-\Psi(x)},\ \bar\partial f=0\},\quad\dim\mathrm{Ker}\,h^+=\infty. \tag{1.5}$$
Next, the following well known supersymmetric argument (which has appeared in many forms in the literature; see e.g. [9] or [16]) establishes the existence of a spectral gap $(0,m)$, $m>0$, of the operator $h$. Let $a_0$ and $a_0^*$ be the annihilation and creation operators in $L^2(\mathbb{R}^2)$, corresponding to the constant component $B_0>0$ of the magnetic field $B$:
$$a_0=-2ie^{-B_0|x|^2/4}\,\bar\partial\,e^{B_0|x|^2/4},\qquad a_0^*=-2ie^{B_0|x|^2/4}\,\partial\,e^{-B_0|x|^2/4}. \tag{1.6}$$
Then one can define
$$a=e^{-\Psi_1}a_0e^{\Psi_1}\ \text{on}\ \mathrm{Dom}(a)=\{e^{-\Psi_1}u\mid u\in\mathrm{Dom}(a_0)\}\quad\text{and}\quad a^*=e^{\Psi_1}a_0^*e^{-\Psi_1}. \tag{1.7}$$
In terms of these operators, we have $h^+=a^*a$ and $h^-=aa^*$, and therefore
$$\sigma(h^+)\setminus\{0\}=\sigma(h^-)\setminus\{0\}. \tag{1.8}$$
Finally, comparing the form $\mathfrak{h}^-$ with $\mathfrak{h}_0^-$ and using (1.4), one obtains (see e.g. [3] or [16, Prop. 1.2])
$$\mathfrak{h}^-[u]\ge e^{2\,\mathrm{ess\,inf}\,\Psi_1}\,\mathfrak{h}_0^-[e^{-\Psi_1}u]\ge 2B_0e^{2\,\mathrm{ess\,inf}\,\Psi_1}\|e^{-\Psi_1}u\|^2\ge 2B_0e^{-2\,\mathrm{osc}\,\Psi_1}\|u\|^2 \tag{1.9}$$
for all u ∈ Dom(h− ), where osc 1 = ess sup 1 − ess inf 1 . It follows that (0, m), m = 2B0 e−2 osc 1 > 0, is a gap in the spectrum of h+ and of h. See [5] for a different point of view on the issue of existence of the spectral gap and [18] for recent progress on this topic. 1.3. Perturbations of the Pauli operator and spectral asymptotics. Let v ∈ Lp (R2 ), p > 1, be a non-negative compactly supported function, which has the physical meaning of the electric potential. The Pauli operator which describes a particle in the external magnetic field B and the electric field with the potential v is + h +v 0 h + vI = in L2 (R2 ) ⊕ L2 (R2 ). 0 h− + v Along with h + vI , we will also consider the operator h − vI . In order to define h+ ± v as a quadratic form sum, let us establish that v is h+ -form compact. By (1.7), (1.8) and boundedness of 1 , we see that v is h+ -form compact if and only if ve−21 is h+ 0 -form compact. As ve−21 ∈ Lp (R2 ), p > 1, we obtain that ve−21 is h+ -form compact (see 0 [2]). A similar argument shows that v is h− -form compact and so the operators h− ± v are also well defined. The main object of interest in this paper is the rate of accumulation of the eigenvalues of h ± vI to zero. By the above established relative compactness and by the estimate (1.9), the operators h− ±v can have only finitely many eigenvalues in a sufficiently small neighbourhood of zero. Thus, the question reduces to that of the rate of accumulation of the eigenvalues of h+ ± v to zero. Due to the assumption v 0, the eigenvalues of h+ +v can accumulate to 0 only from − above, and the eigenvalues of h+ − v can do so only from below. Let λ− 1 λ2 · · · be + + + the negative eigenvalues of h − v, and λ1 λ2 · · · be the eigenvalues of h+ + v in the spectral gap (0, m); here and in the rest of the paper, we assume eigenvalues to be enumerated with multiplicities taken into account. Our aim is to describe the rate of convergence λ± n → 0 as n → ∞. 
Roughly speaking, we prove the following asymptotics (precise statements are given in Sect. 2):
$$\log(\pm n!\,\lambda_n^\pm) = n \log(B_0/2) + 2n \log \operatorname{Cap}(\operatorname{supp} v) + o(n), \quad n \to \infty, \tag{1.10}$$
where Cap is the logarithmic capacity of a set. The notion of logarithmic capacity is introduced in the framework of potential theory; see e.g. [7, 11]. Recall that the logarithmic capacity of compact sets in $\mathbb{R}^2$ has the following properties: (i) if $\Omega_1 \subset \Omega_2$ then $\operatorname{Cap}\Omega_1 \le \operatorname{Cap}\Omega_2$; (ii) $\operatorname{Cap}\Omega$ coincides with the logarithmic capacity of the outer boundary of $\Omega$ (that is, the boundary of the unbounded component of $\mathbb{R}^2 \setminus \Omega$); (iii) the logarithmic capacity of a disc of radius $r$ is $r$; (iv) if $\Omega_2 = \{\alpha x \mid x \in \Omega_1\}$, $\alpha > 0$, then $\operatorname{Cap}\Omega_2 = \alpha \operatorname{Cap}\Omega_1$.

We establish (1.10) by means of the following simple chain of equivalent reformulations of the problem. Firstly, a perturbation theory argument reduces the problem to the spectral asymptotics of an auxiliary compact self-adjoint operator $P_0 v P_0$, where $P_0$ is the spectral projection of $h^+$ corresponding to the eigenvalue 0. Next, we observe that the eigenvalues of $P_0 v P_0$ coincide with the singular values of a certain embedding operator (see (4.2)). Using the approach of [13], we relate the singular numbers of this
N. Filonov, A. Pushnitski
embedding operator to some sequence of orthogonal polynomials in the complex domain (see below). Finally, application of the results of [22] concerning the asymptotics of these orthogonal polynomials leads to (1.10).

Using the same technique, we are also able to treat two similar problems. First, we consider the Pauli operators $h_0^+ \pm v$ in the case of a constant magnetic field and describe the rate of accumulation of the eigenvalues to the higher Landau levels. Secondly, we consider the three-dimensional Pauli Hamiltonian with a constant magnetic field and a compactly supported electric potential and describe the rate of convergence of eigenvalues to 0. These results are presented in Sect. 2.

The rate of convergence of eigenvalues to zero for Pauli operators in dimensions two and three was investigated before in the case of constant magnetic field $B(x) = B_0 > 0$ for various classes of potentials $v$ with power or exponential decay at infinity; see [21, 19, 23, 14, 15, 10, 17]. We refer the reader to the discussion in [17]. The case of a constant magnetic field and compactly supported potentials $v$ was considered in [17] and [12]. The case of a two-dimensional operator with variable magnetic field and potentials $v$ with power or exponential decay and also with compactly supported potentials was treated in [16]. The results of [17, 12, 16] for the case of compactly supported potentials read as
$$\log(\pm\lambda_n^\pm) = -n \log n + O(n), \quad n \to \infty. \tag{1.11}$$
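As a short consistency check (a worked step added for orientation, not part of the original argument), Stirling's formula $\log n! = n\log n - n + O(\log n)$ shows how (1.10), when it holds with $\operatorname{Cap}(\operatorname{supp} v) > 0$, refines (1.11):

```latex
\begin{aligned}
\log(\pm\lambda_n^\pm)
 &= -\log n! + n\log(B_0/2) + 2n\log\operatorname{Cap}(\operatorname{supp} v) + o(n) \\
 &= -n\log n + n\bigl(1 + \log(B_0/2) + 2\log\operatorname{Cap}(\operatorname{supp} v)\bigr) + o(n),
 \qquad n \to \infty .
\end{aligned}
```

In other words, the leading term $-n\log n$ of (1.11) is recovered, and the coefficient of the linear term is expressed through $B_0$ and the logarithmic capacity of the support.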
As far as we are aware, a connection between the spectral asymptotics of magnetic operators and logarithmic capacity or orthogonal polynomials has not been made before. Some physical intuition concerning these problems with constant magnetic field can be gained from [8].

1.4. Orthogonal and Chebyshev polynomials. We identify $\mathbb{R}^2$ and $\mathbb{C}$ in a standard way: $z = x_1 + i x_2$ for $(x_1, x_2) \in \mathbb{R}^2$, denote by $dm(z)$ the Lebesgue measure in $\mathbb{C}$ and consider $v$ as a function of $z$. It appears that the sequence of polynomials in $z$, orthogonal with respect to the measure $v(z)\,dm(z)$, is related to the asymptotics of $\lambda_n^\pm$. Here we recall necessary facts from the theory of Chebyshev and orthogonal polynomials in complex domains; see e.g. [7] and [22] for the details.

For any $n = 0, 1, 2, \dots$, let $\mathcal{P}_n$ be the set of all monic polynomials in $z$ of degree $n$:
$$\mathcal{P}_n = \{z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 \mid a_0, \dots, a_{n-1} \in \mathbb{C}\}. \tag{1.12}$$
Let $\Omega \subset \mathbb{C}$ be a compact set. For a fixed $n$, consider the problem of minimization of the norm $\|t\|_{C(\Omega)} \equiv \sup_{z \in \Omega} |t(z)|$ on the set $t \in \mathcal{P}_n$. It is clear that the minimum is positive and attained at some polynomial $t_n \in \mathcal{P}_n$. The polynomial $t_n$ is called the $n$th Chebyshev polynomial for the set $\Omega$. (One can prove that such a polynomial is unique, but we will not need this fact.) It is well known that all zeros of $t_n$ lie in the closed convex hull of $\Omega$. The $n$th root asymptotics of $t_n$ is given by
$$\lim_{n \to \infty} \|t_n\|_{C(\Omega)}^{1/n} = \operatorname{Cap}\Omega. \tag{1.13}$$
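To make (1.13) concrete, here is a minimal numerical sketch for one classical case that is not treated in the text: the segment $[-1, 1]$. It relies on two standard facts assumed here without proof, namely that the monic Chebyshev polynomial of this segment is $2^{1-n} T_n$ with $T_n(x) = \cos(n \arccos x)$, and that the logarithmic capacity of $[-1, 1]$ is $1/2$. The function name and the sampling grid are illustrative choices only.

```python
import math

def monic_chebyshev_sup_norm(n, samples=20001):
    """Approximate sup norm on [-1, 1] of the monic polynomial 2^(1-n) * T_n.

    The grid contains both endpoints, where |T_n| = 1 is attained.
    """
    best = 0.0
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        # T_n(x) = cos(n * arccos(x)) for x in [-1, 1]
        value = 2.0 ** (1 - n) * math.cos(n * math.acos(x))
        best = max(best, abs(value))
    return best

for n in (5, 20, 80):
    norm = monic_chebyshev_sup_norm(n)
    # n-th root of the sup norm; it approaches Cap([-1, 1]) = 1/2 as n grows
    print(n, norm ** (1.0 / n))
```

The printed values equal $\tfrac12 \cdot 2^{1/n}$ up to rounding, so the convergence to the capacity in this example is of order $1/n$.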
Next, let $v \in L^1(\mathbb{C}, dm)$ be a non-negative compactly supported function. Denote
$$M_n(v) = \inf_{p \in \mathcal{P}_n} \int_{\mathbb{C}} |p(z)|^2 v(z)\,dm(z). \tag{1.14}$$
It is easy to prove that the infimum in (1.14) is attained at the polynomials $\{p_n\}_{n=0}^\infty$, $p_n \in \mathcal{P}_n$, which can be obtained by applying the Gram–Schmidt orthogonalisation process in $L^2(\mathbb{C}, v(z)\,dm(z))$ to the sequence $1, z, z^2, \dots$. All zeros of $p_n$ lie in the closed convex hull of $\operatorname{supp} v$. Regarding the $n$th root asymptotics of $p_n$, the following facts are known (see [22]). Denote
$$\rho_+(v) = \limsup_{n \to \infty} M_n(v)^{1/n}, \qquad \rho_-(v) = \liminf_{n \to \infty} M_n(v)^{1/n}. \tag{1.15}$$
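A minimal sketch of (1.14) and (1.15) in the simplest case, assuming $v$ is the indicator function of the disc $|z| \le R$ (an illustrative choice). For a rotation-invariant weight the monomials $z^n$ are mutually orthogonal, so the minimising monic polynomial is $z^n$ and $M_n(v) = \int_{|z| \le R} |z|^{2n}\,dm(z) = \pi R^{2n+2}/(n+1)$. The helper name `log_M_n_disc` is hypothetical; logarithms are used only to avoid overflow.

```python
import math

def log_M_n_disc(n, radius):
    """log M_n(v) for v = indicator of the disc |z| <= radius.

    Uses the closed form M_n = pi * radius^(2n+2) / (n+1), valid because
    z^n is the n-th monic orthogonal polynomial for a radial weight.
    """
    return math.log(math.pi) + (2 * n + 2) * math.log(radius) - math.log(n + 1)

radius = 1.5
for n in (10, 100, 1000, 10000):
    nth_root = math.exp(log_M_n_disc(n, radius) / n)
    # tends to radius**2, the square of the capacity of the disc
    print(n, nth_root)
```

In this example $\rho_+(v) = \rho_-(v) = R^2$, in agreement with property (iii) of the capacity listed in Sect. 1.3.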
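In general, it can happen that $\rho_-(v) < \rho_+(v)$ (see the proof of Theorem 1.1.9 in [22]). One has the estimates
$$\rho_+(v) \le (\operatorname{Cap} \operatorname{supp} v)^2, \qquad \rho_-(v) \ge (\operatorname{Cap} \Omega_-(v))^2, \tag{1.16}$$
where
$$\Omega_-(v) = \Bigl\{ z \in \mathbb{C} \Bigm| \limsup_{r \to +0} \frac{\log \int_{|z - \zeta| \le r} v(\zeta)\,dm(\zeta)}{\log r} < \infty \Bigr\}.$$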
The first inequality in (1.16) is a part of Corollary 1.1.7 of [22]. The second inequality in (1.16), although not stated explicitly in [22], follows directly from the proof of Theorem 4.2.1 therein.

Remark 1. Let $\Omega \subset \mathbb{C}$ be a compact set with a Lipschitz boundary, and let $v \in L^1(\mathbb{C}, dm)$ be such that $v(z) \ge c > 0$ for all $z \in \Omega$ and $v(z) = 0$ for all $z \in \mathbb{C} \setminus \Omega$. Then we easily find that $\Omega_-(v) = \Omega = \operatorname{supp} v$ and therefore $\rho_+(v) = \rho_-(v) = (\operatorname{Cap}\Omega)^2$.

2. Main Results

2.1. Two-dimensional Pauli operators with variable magnetic field. Let $h^+$, as in the Introduction, be the Pauli operator defined via (1.3) with $\Psi$ subject to (1.2). Let $v$ and $\lambda_n^\pm$ be as in the Introduction.

Theorem 1. Let $0 \le v \in L^p(\mathbb{R}^2)$, $p > 1$, be a compactly supported potential and let $M_n(v)$ be as defined in (1.14). Then there exists $k \in \mathbb{N}$ such that
$$(B_0/2) M_{n+k}(v)^{1/n} (1 + o(1)) \le (n!\,\lambda_n^+)^{1/n} \le (B_0/2) M_{n-1}(v)^{1/n} (1 + o(1)), \tag{2.1}$$
$$(B_0/2) M_{n-1}(v)^{1/n} (1 + o(1)) \le (-n!\,\lambda_n^-)^{1/n} \le (B_0/2) M_{n-k}(v)^{1/n} (1 + o(1)), \tag{2.2}$$
as $n \to \infty$. In particular,
$$\limsup_{n \to \infty} (\pm n!\,\lambda_n^\pm)^{1/n} = B_0 \rho_+(v)/2, \qquad \liminf_{n \to \infty} (\pm n!\,\lambda_n^\pm)^{1/n} = B_0 \rho_-(v)/2,$$
where $\rho_\pm(v)$ are defined by (1.15). If $v$ is of the class described in Remark 1, then the asymptotics (1.10) holds true.

Remark 2. Let $\mu$ be a compactly supported finite measure in $\mathbb{R}^2$ such that the quadratic form $\int_{\mathbb{C}} |u(x)|^2\,d\mu(x)$ is compact with respect to the quadratic form $h^+$. Then one can define the self-adjoint operators corresponding to the quadratic forms
$$h^+[u] \pm \int_{\mathbb{R}^2} |u(x)|^2\,d\mu(x).$$
All our considerations remain valid for such operators. For example, the case of a measure $\mu$ supported by a curve can be interesting.
2.2. Two-dimensional Pauli operators with constant magnetic field. Let $B(x) = B_0 > 0$; consider the corresponding operator $h_0^+$. As is well known, the spectrum of $h_0^+$ consists of the eigenvalues $\{2qB_0\}_{q=0}^\infty$ of infinite multiplicities; these eigenvalues are known as Landau levels. Consider the problem of accumulation of eigenvalues of $h_0^+ \pm v$ to a fixed higher Landau level $2qB_0$, $q \ge 1$. Let $\lambda_{q,1}^- \le \lambda_{q,2}^- \le \cdots$ be the eigenvalues of $h_0^+ - v$ in the interval $(2(q-1)B_0, 2qB_0)$, and let $\lambda_{q,1}^+ \ge \lambda_{q,2}^+ \ge \cdots$ be the eigenvalues of $h_0^+ + v$ in $(2qB_0, 2(q+1)B_0)$.

Theorem 2. Let $\Omega \subset \mathbb{R}^2$ be a compact set with Lipschitz boundary and let $v \in L^p(\mathbb{R}^2)$, $p > 1$, be such that $v(x) \ge c > 0$ for $x \in \Omega$ and $v(x) = 0$ for $x \in \mathbb{R}^2 \setminus \Omega$. Then for the corresponding eigenvalues $\lambda_{q,n}^\pm$ we have:
$$\lim_{n \to \infty} \bigl(\pm n!\,(\lambda_{q,n}^\pm - 2qB_0)\bigr)^{1/n} = \frac{B_0}{2} (\operatorname{Cap}\Omega)^2.$$
The rate of convergence of $\lambda_{q,n}^\pm \to 2qB_0$ as $n \to \infty$, $q \ge 1$, was studied before in [17, 12], where the asymptotics
$$\log\bigl(\pm(\lambda_{q,n}^\pm - 2qB_0)\bigr) = -n \log n + O(n), \quad n \to \infty$$
was obtained. Note that if the potential $v$ depends only on $|x|$, then the result of Theorem 2 can be obtained by a direct calculation using separation of variables, see e.g. [17, Prop. 3.2].

2.3. Three-dimensional Pauli operator with a constant magnetic field. Let
$$H = (-i\nabla - A(x))^2 - B_0 \quad \text{in } L^2(\mathbb{R}^3, dx), \quad x = (x_1, x_2, x_3),$$
where $A(x) = (-\tfrac12 B_0 x_2, \tfrac12 B_0 x_1, 0)$. It is well known that the spectrum of $H$ is absolutely continuous and coincides with the interval $[0, \infty)$. The background information concerning the spectral theory of $H$ and its perturbations can be found in [2]. Let $V \in L^{3/2}(\mathbb{R}^3)$ be a non-negative compactly supported potential. The operator of multiplication by $V$ in $L^2(\mathbb{R}^3)$ is $H$-form compact (cf. [2]). Thus, one can define the self-adjoint operator $H - V$ via the corresponding quadratic form; the essential spectrum of $H - V$ is also $[0, \infty)$. Let $\Lambda_1 \le \Lambda_2 \le \cdots$ be the negative eigenvalues of $H - V$; we have $\Lambda_n \to 0$ as $n \to \infty$. Below we describe the asymptotic behaviour of $\Lambda_n$ as $n \to \infty$ in terms of the auxiliary weight function
$$w(x_1, x_2) = \int_{-\infty}^{\infty} V(x_1, x_2, x_3)\,dx_3, \quad \text{a.e. } (x_1, x_2) \in \mathbb{R}^2. \tag{2.3}$$
As above, we consider $w$ as a function of $z = x_1 + i x_2$.

Theorem 3. Let $0 \le V \in L^{3/2}(\mathbb{R}^3)$ be a compactly supported potential and $w$ be defined by (2.3). Then there exists $k \in \mathbb{N}$ such that
$$(B_0/2)^2 M_{n+k}(w)^{2/n} (1 + o(1)) \le \bigl(-(n!)^2 \Lambda_n\bigr)^{1/n} \le (B_0/2)^2 M_{n-k}(w)^{2/n} (1 + o(1)), \tag{2.4}$$
as $n \to \infty$. In particular,
$$\limsup_{n \to \infty} \bigl(-(n!)^2 \Lambda_n\bigr)^{1/n} = (B_0 \rho_+(w)/2)^2, \qquad \liminf_{n \to \infty} \bigl(-(n!)^2 \Lambda_n\bigr)^{1/n} = (B_0 \rho_-(w)/2)^2,$$
where $\rho_\pm(w)$ are defined by (1.15).
The rate of accumulation $\Lambda_n \to 0$ for potentials $V$ with power or exponential decay was considered before in [21, 19, 23, 14, 15, 10, 17]. For compactly supported potentials, this problem was considered in [17, 12], where the asymptotics
$$\log(-\Lambda_n) = -2n \log n + O(n), \quad n \to \infty$$
was obtained.

Remark 3. Theorem 3 remains valid under the following assumptions on $V$: (i) $V \ge 0$, $V$ is $H$-form compact; (ii) $\int_{\mathbb{R}^3} V(x)(1 + |x_3|^2)\,dx < \infty$; (iii) the function $w$, defined by (2.3), is compactly supported.

3. Proof of Theorems 1, 2 and 3

Proof of Theorem 1. Let $\mathcal{H}_0 \subset L^2(\mathbb{R}^2)$ be the kernel of $h^+$, and let $P_0$ be the corresponding eigenprojection, $\operatorname{Ran} P_0 = \mathcal{H}_0$. Consider the compact self-adjoint operator $P_0 v P_0$. The key ingredient in the proof is the following

Lemma 1. Let $v \in L^1(\mathbb{R}^2)$ be a non-negative compactly supported function and let $s_1 \ge s_2 \ge \cdots > 0$ be the eigenvalues of $P_0 v P_0$. Then
$$(n!\,s_{n+1})^{1/n} = (B_0/2) M_n(v)^{1/n} (1 + o(1)), \quad n \to \infty, \tag{3.1}$$
where $M_n(v)$ are defined by (1.14).

The proof is given in Sect. 4. Now it remains to employ a perturbation theory argument (see [16, Prop. 3.1] or [17, Prop. 4.1]) based on the Birman-Schwinger principle and on Weyl inequalities for eigenvalues of a sum of compact operators. This argument shows that there exists $k \in \mathbb{N}$ such that for all sufficiently large $n \in \mathbb{N}$ one has
$$s_n \le -\lambda_n^- \le 2 s_{n-k}, \qquad \tfrac12 s_{n+k} \le \lambda_n^+ \le s_n.$$
Combining these inequalities with Lemma 1, we obtain the required result.
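Lemma 1 can be checked by hand in one explicitly solvable case. The sketch below assumes a constant magnetic field $B(x) = B_0$ and takes $v$ to be the indicator function of the disc $|z| \le R$ (both are illustrative assumptions). In the Fock-space picture of Sect. 4 the monomials $z^n$ then diagonalise the quadratic form of $P_0 v P_0$, and a direct computation gives $n!\,s_{n+1} = \gamma(n+1, B_0 R^2/2)$, where $\gamma$ is the lower incomplete gamma function. The helper names are hypothetical, and the series used for $\gamma$ is the standard one, $\gamma(a, x) = x^a e^{-x} \sum_{k \ge 0} x^k / (a(a+1)\cdots(a+k))$.

```python
import math

def log_lower_incomplete_gamma(a, x, terms=500):
    """log of gamma(a, x) = int_0^x t^(a-1) e^(-t) dt, via the series
    gamma(a, x) = x^a e^(-x) * sum_k x^k / (a (a+1) ... (a+k))."""
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= x / (a + k)
        total += term
    return a * math.log(x) - x + math.log(total)

def nth_root_of_scaled_eigenvalue(n, B0, R):
    """(n! * s_{n+1})^(1/n) for the disc indicator in a constant field B0."""
    x = B0 * R * R / 2.0
    return math.exp(log_lower_incomplete_gamma(n + 1, x) / n)

B0, R = 1.0, 2.0
for n in (10, 100, 2000):
    # Lemma 1 predicts the limit (B0 / 2) * R**2 = (B0 / 2) * (Cap of the disc)**2
    print(n, nth_root_of_scaled_eigenvalue(n, B0, R), B0 * R * R / 2.0)
```

Since $M_n(v)^{1/n} \to R^2$ for this weight, the printed values approaching $B_0 R^2/2$ are consistent with (3.1).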
Proof of Theorem 2. For any $q \ge 0$, denote $\mathcal{H}_q = \operatorname{Ker}(h_0^+ - 2qB_0)$ and let $P_q$ be the eigenprojection of $h_0^+$ corresponding to the eigenvalue $2qB_0$. Consider the compact self-adjoint operator $P_q v P_q$, and let $s_1^{(q)} \ge s_2^{(q)} \ge \cdots$ be the eigenvalues of this operator. As in the proof of Theorem 1, using a perturbation theory argument based on the Birman-Schwinger principle and on Weyl inequalities (see [17, Prop. 4.1]), one shows that there exists $k \in \mathbb{N}$ such that for all sufficiently large $n \in \mathbb{N}$,
$$\tfrac12 s_{n+k}^{(q)} \le \pm(\lambda_{q,n}^\pm - 2qB_0) \le 2 s_{n-k}^{(q)}. \tag{3.2}$$
Now the proof of Theorem 2 reduces to

Lemma 2. Let $\Omega \subset \mathbb{R}^2$ be a compact set with Lipschitz boundary and let $v \in L^1(\mathbb{R}^2)$ be such that $v(x) \ge c > 0$ for $x \in \Omega$ and $v(x) = 0$ for $x \in \mathbb{R}^2 \setminus \Omega$. Fix $q \in \mathbb{N}$ and let $s_1^{(q)} \ge s_2^{(q)} \ge \cdots$ be the eigenvalues of $P_q v P_q$. Then one has
$$\lim_{n \to \infty} (n!\,s_n^{(q)})^{1/n} = (B_0/2)(\operatorname{Cap}\Omega)^2. \tag{3.3}$$
The proof of Lemma 2 is given in Sect. 5. From Lemma 2 and the estimate (3.2), we immediately obtain the required result.

Proof of Theorem 3. The proof repeats almost word for word the construction of [19]. According to the Birman-Schwinger principle, for $E > 0$ we have:
$$\#\{n \mid \Lambda_n < -E\} = n_+\bigl(1; \sqrt{V}(H_0 + E)^{-1}\sqrt{V}\bigr). \tag{3.4}$$
The operator $\sqrt{V}(H_0 + E)^{-1}\sqrt{V}$ can be represented as
$$\sqrt{V}(H_0 + E)^{-1}\sqrt{V} = \frac{1}{2\sqrt{E}} K_1 + K_2 + K_3.$$
Here $K_1$, $K_2$ are the operators in $L^2(\mathbb{R}^3)$ with the integral kernels
$$K_1(x, y) = \sqrt{V(x)}\,P_0(x_\perp, y_\perp)\,\sqrt{V(y)},$$
$$K_2(x, y) = \sqrt{V(x)}\,P_0(x_\perp, y_\perp)\,\frac{e^{-\sqrt{E}|x_3 - y_3|} - 1}{2\sqrt{E}}\,\sqrt{V(y)},$$
where the notation $x_\perp = (x_1, x_2)$, $y_\perp = (y_1, y_2)$ is used, and $P_0(x_\perp, y_\perp)$ is the integral kernel of the operator $P_0$ in $L^2(\mathbb{R}^2)$. Finally, $K_3$ is the operator
$$K_3 = \sqrt{V}\,Q_0 (H_0 + E)^{-1}\sqrt{V},$$
where $Q_0 = (I - P_0) \otimes I$ in the decomposition $L^2(\mathbb{R}^3, dx_1 dx_2 dx_3) = L^2(\mathbb{R}^2, dx_1 dx_2) \otimes L^2(\mathbb{R}, dx_3)$. The operators $K_2$ and $K_3$ have limits (in the operator norm) as $E \to +0$; these limits are compact self-adjoint operators. Thus, by Weyl's inequalities for eigenvalues (see e.g. [4]), we have for $E \to +0$:
$$n_+\bigl(1; \sqrt{V}(H_0 + E)^{-1}\sqrt{V}\bigr) \le n_+\Bigl(\tfrac12; \tfrac{1}{2\sqrt{E}} K_1\Bigr) + n_+\bigl(\tfrac12; K_2 + K_3\bigr) \le n_+(\sqrt{E}; K_1) + O(1), \tag{3.5}$$
$$n_+\bigl(1; \sqrt{V}(H_0 + E)^{-1}\sqrt{V}\bigr) \ge n_+\Bigl(\tfrac32; \tfrac{1}{2\sqrt{E}} K_1\Bigr) - n_+\bigl(\tfrac12; -K_2 - K_3\bigr) \ge n_+(3\sqrt{E}; K_1) - O(1). \tag{3.6}$$
Finally, again as in [19], let us prove that the non-zero eigenvalues of $K_1$ coincide with those of $P_0 w P_0$, where $w$ is defined by (2.3). It suffices to prove this statement for continuous $V$ with compact support; the general case $V \in L^{3/2}$ then follows by an approximation argument. Let $N_1 : L^2(\mathbb{R}^3, dx_1 dx_2 dx_3) \to L^2(\mathbb{R}^2, dx_1 dx_2)$ and $N_2 : L^2(\mathbb{R}^2, dx_1 dx_2) \to L^2(\mathbb{R}^3, dx_1 dx_2 dx_3)$ be the following operators:
$$(N_1 u)(x_1, x_2) = \int_{-\infty}^{\infty} V^{1/2}(x_1, x_2, x_3)\,u(x_1, x_2, x_3)\,dx_3,$$
$$(N_2 u)(x_1, x_2, x_3) = V^{1/2}(x_1, x_2, x_3)\,u(x_1, x_2).$$
Then $K_1 = N_2 P_0 N_1 = (N_2 P_0)(P_0 N_1)$ and $P_0 w P_0 = (P_0 N_1)(N_2 P_0)$. It follows that the non-zero eigenvalues of $K_1$ coincide with $\{s_n\}$, the non-zero eigenvalues of $P_0 w P_0$, and so $n_+(\sqrt{E}; K_1) = \#\{n \mid (s_n)^2 > E\}$. From here and (3.4), (3.5), (3.6) it follows that for some $k \in \mathbb{N}$ and all sufficiently large $n \in \mathbb{N}$, one has
$$\tfrac19 (s_{n+k})^2 \le -\Lambda_n \le (s_{n-k})^2. \tag{3.7}$$
Combining this with Lemma 1, we get the statement of Theorem 3.
4. Proof of Lemma 1

First let us consider the case of a constant magnetic field $B(x) = B_0 > 0$. Let $F^2$ be the Hilbert space of all entire functions $f$ such that
$$\|f\|_{F^2}^2 = \int_{\mathbb{C}} |f(z)|^2 e^{-B_0 |z|^2/2}\,dm(z) < \infty. \tag{4.1}$$
In the case $B_0 = 2$, the space $F^2$ is usually called Fock space or Segal-Bargmann space. By (1.5), we have an isometry between $\mathcal{H}_0 = \operatorname{Ker} h^+ \subset L^2(\mathbb{C}, dm)$ and $F^2$, given by $u(z) = e^{-B_0 |z|^2/4} f(z)$, $u \in \mathcal{H}_0$, $f \in F^2$. Thus, the quadratic form of the operator $P_0 v P_0|_{\mathcal{H}_0}$ is unitarily equivalent to the quadratic form
$$\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-B_0 |z|^2/2}\,dm(z), \quad f \in F^2.$$
It follows that the non-zero eigenvalues $s_n$ of $P_0 v P_0$ coincide with the singular values $\mu_n$ of the embedding operator
$$F^2 \subset L^2\bigl(\mathbb{C}, v(z) e^{-B_0 |z|^2/2}\,dm(z)\bigr). \tag{4.2}$$
The case of a variable magnetic field can also be reduced to the embedding (4.2). Indeed, if $s_n$ correspond to the case of a variable magnetic field, then, using the boundedness of $\Psi_1$, one obtains (see [16, Prop. 3.2]):
$$\mu_n e^{-2\operatorname{osc}\Psi_1} \le s_n \le \mu_n e^{2\operatorname{osc}\Psi_1}, \quad n \in \mathbb{N}.$$
Thus, it remains to prove the asymptotic formula
$$(n!\,\mu_{n+1})^{1/n} = (B_0/2) M_n(v)^{1/n} (1 + o(1)), \quad n \to \infty \tag{4.3}$$
for the singular values $\mu_n$ of the embedding (4.2). We shall assume $B_0 = 2$; the general case can be reduced to this one by a linear change of coordinates.

The embedding $F^2 \subset C(\Omega)$, where $\Omega$ is a compact set in $\mathbb{C}$, was studied in [13]. Below we repeat the arguments of [13] (with trivial modifications) to obtain the required asymptotics. By the minimax principle, we have the following variational characterisation of $\mu_n$:
$$\mu_{n+1} = \inf_{L_n^+ \subset F^2}\ \sup_{f \in L_n^+ \setminus \{0\}} \frac{\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\,dm(z)}{\|f\|_{F^2}^2}, \quad \operatorname{codim} L_n^+ = n, \tag{4.4}$$
$$\mu_{n+1} = \sup_{L_n^- \subset F^2}\ \inf_{f \in L_n^- \setminus \{0\}} \frac{\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\,dm(z)}{\|f\|_{F^2}^2}, \quad \dim L_n^- = n + 1. \tag{4.5}$$
Upper bound on $\mu_{n+1}$. 1. For the subspaces $L_n^+$ from (4.4), we will take
$$L_n^+ = \{f \in F^2 \mid f(z) = p_n(z) g(z),\ g \text{ is an entire function}\},$$
where $p_n$ is the sequence of monic polynomials orthogonal with respect to the measure $v(z)\,dm(z)$. In order to estimate the ratio in (4.4) from above, let us prove the following auxiliary statement. Denote $R_0 = \max_{z \in \operatorname{supp} v} |z|$. We claim that for any $\varepsilon \in (0, \tfrac13)$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and any $f = p_n g \in L_n^+$, we have
$$\sup_{|z| \le R_0} |g(z)|^2 \le (1 - \varepsilon)^{-2n} \frac{1}{n!} \|p_n g\|_{F^2}^2. \tag{4.6}$$
Indeed, we have
$$g(z) = \frac{1}{2\pi i} \int_{|\zeta| = r} \frac{f(\zeta)}{p_n(\zeta)(\zeta - z)}\,d\zeta, \quad r > R_0,$$
and therefore
$$\sup_{|z| \le R_0} |g(z)|^2 \le \frac{r}{2\pi} \sup_{|z| \le R_0} \int_{|\zeta| = r} \frac{|f(\zeta)|^2}{|p_n(\zeta)|^2 |\zeta - z|^2}\,d|\zeta|$$
for any $r > R_0$. Denote $R = R_0/\varepsilon$. Since all zeros of $p_n$ lie in the closed convex hull of $\operatorname{supp} v$, we obtain:
$$|p_n(\zeta)||\zeta - z| \ge ((1 - \varepsilon) r)^{n+1}, \quad |z| \le R_0, \quad |\zeta| = r \ge R.$$
Thus, we get
$$\sup_{|z| \le R_0} |g(z)|^2 \le \frac{r^{-2n-1}}{2\pi (1 - \varepsilon)^{2n+2}} \int_{|\zeta| = r} |f(\zeta)|^2\,d|\zeta|, \quad r \ge R.$$
Integrating the last inequality over $r$ from $R$ to $\infty$ with the weight $e^{-r^2} r^{2n+1}$, and using the fact that
$$\int_R^\infty e^{-r^2} r^{2n+1}\,dr = \frac12 n! - \int_0^R e^{-r^2} r^{2n+1}\,dr \ge \frac{1}{2\pi} (1 - \varepsilon)^{-2} n!$$
for all sufficiently large $n$, we obtain (4.6).

2. From (4.6) we obtain for any $f = p_n g \in L_n^+$:
$$\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\,dm(z) \le M_n(v) \|g\|_{C(\operatorname{supp} v)}^2 \le M_n(v) (1 - \varepsilon)^{-2n} \frac{\|f\|_{F^2}^2}{n!}.$$
Together with (4.4), the last estimate yields
$$(n!\,\mu_{n+1})^{1/n} \le (1 - \varepsilon)^{-2} M_n(v)^{1/n} \tag{4.7}$$
for all sufficiently large $n$.
Lower bound for $\mu_{n+1}$. Let us use formula (4.5) and take $L_n^-$ to be the set of all polynomials in $z$ of degree $\le n$. As in the proof of the upper bound, we denote $R_0 = \max_{z \in \operatorname{supp} v} |z|$, fix $\varepsilon > 0$ and set $R = R_0/\varepsilon$. We shall use the following norm in $F^2$:
$$|||f|||_{F^2}^2 = \int_{|z| \ge R} |f(z)|^2 e^{-|z|^2}\,dm(z). \tag{4.8}$$
This norm is equivalent to the earlier introduced norm (4.1). Indeed, the inequality $|||f|||_{F^2} \le \|f\|_{F^2}$ is trivial; the inequality $\|f\|_{F^2} \le C(R)\,|||f|||_{F^2}$ is easy to obtain by application of the Cauchy integral formula. Let $q_n \in L_n^- \setminus \{0\}$ be the polynomial which minimizes the ratio
$$\frac{\int_{\mathbb{C}} |q_n(z)|^2 v(z) e^{-|z|^2}\,dm(z)}{|||q_n|||_{F^2}^2} \tag{4.9}$$
among all polynomials in $L_n^- \setminus \{0\}$. Next, without loss of generality, we may assume that $q_n$ is monic. Denote $k = \deg q_n \le n$. The following standard argument shows that all zeros of $q_n$ are confined to the disk $\{z \mid |z| \le R_0\}$. Suppose that one of the zeros $z_0$ is outside the disk; then replace $q_n(z)$ by $q_n(z)\,|z_0|(z - R_0^2/\bar z_0)/(R_0 (z - z_0))$. One has
$$\frac{|z_0||z - R_0^2/\bar z_0|}{R_0 |z - z_0|} \le 1 \ \text{ for } |z| \le R_0 \quad \text{and} \quad \frac{|z_0||z - R_0^2/\bar z_0|}{R_0 |z - z_0|} \ge 1 \ \text{ for } |z| \ge R_0,$$
so this change decreases the ratio (4.9), which is a contradiction. Thus, we get the estimate
$$|q_n(z)| \le (1 + \varepsilon)^k |z|^k, \quad |z| \ge R, \quad k = \deg q_n.$$
It follows that
$$|||q_n|||_{F^2}^2 \le \int_{|z| \ge R} |z|^{2k} (1 + \varepsilon)^{2k} e^{-|z|^2}\,dm(z) \le (1 + \varepsilon)^{2k} \pi\,k!.$$
On the other hand, for the numerator of (4.9), we have
$$\int_{\mathbb{C}} |q_n(z)|^2 v(z) e^{-|z|^2}\,dm(z) \ge e^{-R_0^2} \int_{\mathbb{C}} |q_n(z)|^2 v(z)\,dm(z) \ge e^{-R_0^2} M_k(v).$$
Combining the above estimates, we obtain:
$$\mu_{n+1} \ge \inf_{f \in L_n^- \setminus \{0\}} \frac{\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\,dm(z)}{C(R)\,|||f|||_{F^2}^2} \ge \min_{0 \le k \le n} \frac{M_k(v)}{C_1(R) (1 + \varepsilon)^{2k} k!}. \tag{4.10}$$
As $z p_k(z) \in \mathcal{P}_{k+1}$, from the definition (1.14) of $M_k(v)$ we get a trivial estimate $M_{k+1}(v) \le R_0^2 M_k(v)$. This estimate shows that for a sufficiently large $n$, the minimum in (4.10) is attained at $k = n$. Therefore,
$$(n!\,\mu_{n+1})^{1/n} \ge \Bigl(\frac{1}{C_1(R)}\Bigr)^{1/n} \frac{M_n(v)^{1/n}}{(1 + \varepsilon)^2} \ge (1 + \varepsilon)^{-3} M_n(v)^{1/n}$$
for all sufficiently large $n$. The latter estimate together with (4.7) completes the proof of the lemma.
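The zero-reflection step in the lower bound can be sanity-checked numerically. The sketch below evaluates the modulus of the factor $|z_0|(z - R_0^2/\bar z_0)/(R_0 (z - z_0))$, with the complex conjugate on $z_0$ as required for the factor to be unimodular on the circle $|z| = R_0$, at points inside, on and outside that circle. The specific values of `R0` and `z0` are hypothetical and chosen only for illustration.

```python
import cmath

def reflection_factor(z, z0, R0):
    """|z0| (z - R0^2 / conj(z0)) / (R0 (z - z0)) for a zero z0 with |z0| > R0."""
    return abs(z0) * (z - R0 ** 2 / z0.conjugate()) / (R0 * (z - z0))

R0 = 1.0
z0 = 1.7 * cmath.exp(0.4j)   # hypothetical zero lying outside the disc |z| <= R0

for k in range(6):
    angle = 2.0 * cmath.pi * k / 6
    inside = 0.9 * R0 * cmath.exp(1j * angle)
    on_circle = R0 * cmath.exp(1j * angle)
    outside = 3.0 * R0 * cmath.exp(1j * angle)
    # expected pattern: modulus below 1 inside, equal to 1 on the circle, above 1 outside
    print(abs(reflection_factor(inside, z0, R0)),
          abs(reflection_factor(on_circle, z0, R0)),
          abs(reflection_factor(outside, z0, R0)))
```

This is the behaviour used in the proof: multiplying $q_n$ by the factor does not increase the numerator of (4.9), which lives on $|z| \le R_0$, and does not decrease the norm (4.8), which lives on $|z| \ge R$.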
5. Proof of Lemma 2

First recall some well known facts concerning the spectral decomposition of the operator $h_0^+$. As above, we use the notation $\mathcal{H}_q = \operatorname{Ker}(h_0^+ - 2qB_0)$, and $P_q$ is the orthogonal projection in $L^2(\mathbb{C}, dm)$ onto the subspace $\mathcal{H}_q$. The operator $h_0^+$ can be represented in terms of the annihilation and creation operators (1.6) as $h_0^+ = a_0^* a_0$. The operators $a_0$, $a_0^*$ obey the commutation relation $[a_0, a_0^*] = 2B_0$, wherefrom we get the identity $a_0^q (a_0^*)^q u = (2B_0)^q q!\,u$ for all $u \in \mathcal{H}_0$ and $q \in \mathbb{N}$. It follows that
$$(2B_0)^{-q/2} (q!)^{-1/2} (a_0^*)^q : \mathcal{H}_0 \to \mathcal{H}_q \ \text{ is an isometry onto } \mathcal{H}_q. \tag{5.1}$$
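For completeness, the identity behind (5.1) follows from the commutation relation by a short induction; the derivation below is a standard worked step added here, using only $[a_0, a_0^*] = 2B_0$ and the fact that $a_0 u = 0$ for $u$ in the kernel of $h_0^+ = a_0^* a_0$.

```latex
\begin{aligned}
[a_0, (a_0^*)^q] &= 2B_0\, q\, (a_0^*)^{q-1}
  &&\text{(Leibniz rule for the commutator)},\\
a_0 (a_0^*)^q u &= (a_0^*)^q a_0 u + 2B_0\, q\, (a_0^*)^{q-1} u
  = 2B_0\, q\, (a_0^*)^{q-1} u
  &&\text{(since } a_0 u = 0\text{)},\\
a_0^q (a_0^*)^q u &= 2B_0\, q\; a_0^{q-1} (a_0^*)^{q-1} u
  = \dots = (2B_0)^q\, q!\, u,\\
\|(a_0^*)^q u\|^2 &= \langle a_0^q (a_0^*)^q u, u \rangle = (2B_0)^q\, q!\, \|u\|^2 .
\end{aligned}
```

The last line is exactly the normalisation appearing in (5.1).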
Recalling the explicit isomorphism between $\mathcal{H}_0$ and the space $F^2$ (see the previous section), we see that the change $u = (2B_0)^{-q/2} (q!)^{-1/2} (a_0^*)^q (e^{-B_0 |z|^2/4} f(z))$ gives a unitary equivalence between the operator $P_q v P_q$ and the operator in $F^2$ defined by the quadratic form
$$(2B_0)^{-q} (q!)^{-1} \int_{\mathbb{C}} \bigl|(a_0^*)^q (e^{-B_0 |z|^2/4} f(z))\bigr|^2 v(z)\,dm(z).$$
We will consider the case $B_0 = 2$; the general case can be reduced to this one by a linear change of variables. With this simplification, the above quadratic form becomes
$$(q!)^{-1} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\,dm(z), \quad f \in F^2. \tag{5.2}$$
Let us prove the asymptotics (3.3) for the eigenvalues $\{s_n^{(q)}\}_{n=1}^\infty$ corresponding to the form (5.2).

Upper bound for $s_n^{(q)}$. Let $\Omega_\delta = \{z \mid \operatorname{dist}(z, \Omega) \le \delta\}$. By the Cauchy integral formula, we have
$$\sup_{\Omega} |f^{(j)}| \le \frac{j!}{\delta^j} \sup_{\Omega_\delta} |f|, \quad f \in F^2, \quad j \in \mathbb{N}.$$
Thus, we have the following bound for the form (5.2):
$$(q!)^{-1} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\,dm(z) \le C \|f\|_{C(\Omega_\delta)}^2,$$
where $C$ depends on $\Omega$, $\delta$, $v$, $q$. Let us define
$$L_n^+ = \{f \in F^2 \mid f(z) = t_n(z) g(z),\ g \text{ entire}\},$$
where $t_n$ is the $n$th Chebyshev polynomial for the set $\Omega_\delta$. Note that the proof of (4.6) uses only the fact that all zeros of $p_n$ lie in the closed convex hull of $\operatorname{supp} v$. Therefore, the same estimate remains valid with the change $p_n \to t_n$. Thus, we get
$$(q!)^{-1} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\,dm(z) \le C \|f\|_{C(\Omega_\delta)}^2 \le C \|t_n\|_{C(\Omega_\delta)}^2 \|g\|_{C(\Omega_\delta)}^2 \le C (1 - \varepsilon)^{-2n} \frac{1}{n!} \|t_n\|_{C(\Omega_\delta)}^2 \|f\|_{F^2}^2, \quad f \in L_n^+,$$
for all sufficiently large $n$. This yields
$$\limsup_{n \to \infty} (n!\,s_n^{(q)})^{1/n} \le (1 - \varepsilon)^{-2} \lim_{n \to \infty} \|t_n\|_{C(\Omega_\delta)}^{2/n} = (1 - \varepsilon)^{-2} (\operatorname{Cap}\Omega_\delta)^2.$$
It remains to note that $\varepsilon$ and $\delta$ can be chosen arbitrarily small and that for any compact set $\Omega$ one has $\lim_{\delta \to +0} \operatorname{Cap}\Omega_\delta = \operatorname{Cap}\Omega$ (see e.g. [7]).
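Lower bound for $s_n^{(q)}$. 1. Due to the compactness of the embedding of the Sobolev space $W_2^1(\Omega) \subset L^2(\Omega)$, for any $\gamma > 0$ there exists a subspace in $W_2^1(\Omega)$ of a finite codimension such that for all elements $u$ in this subspace we have $\|u\|_{L^2(\Omega)} \le \gamma \|\nabla u\|_{L^2(\Omega)}$. It follows that for any $\gamma > 0$ there exists a subspace in $F^2$ of a finite codimension such that for any element of this subspace we have $\|f\|_{L^2(\Omega)} \le \gamma \|f'\|_{L^2(\Omega)}$. Arguing by induction, we see that for any $\gamma > 0$ there exists a subspace $N = N(\gamma, q) \subset F^2$ of a finite codimension $l$ such that
$$\|\partial^{k-1} f\|_{L^2(\Omega)} \le \gamma \|\partial^k f\|_{L^2(\Omega)}, \quad \forall f \in N(\gamma, q), \quad \forall k = 1, 2, \dots, q. \tag{5.3}$$
2. We need to estimate the form (5.2) from below. Using the assumption $v(z) \ge c > 0$, $z \in \Omega$, on the first step and the triangle inequality on the second step, we obtain
$$\begin{aligned}
\int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\,dm(z)
 &\ge C \|(\partial - \bar z)^q f\|_{L^2(\Omega)}^2 \\
 &\ge C \Bigl( \|\partial^q f\|_{L^2(\Omega)} - \Bigl\| \sum_{k=1}^q \binom{q}{k} (-\bar z)^k \partial^{q-k} f \Bigr\|_{L^2(\Omega)} \Bigr)^2 \\
 &\ge C \Bigl( \|\partial^q f\|_{L^2(\Omega)} - \sum_{k=1}^q \binom{q}{k} R_0^k \|\partial^{q-k} f\|_{L^2(\Omega)} \Bigr)^2,
\end{aligned} \tag{5.4}$$
where $R_0 = \max_{z \in \Omega} |z|$. From this estimate it is clear that by choosing $\gamma$ sufficiently small we can ensure that for all $f$ in the corresponding subspace $N(\gamma, q)$ (see (5.3)) we have
$$\int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\,dm(z) \ge \frac{C}{2} \|\partial^q f\|_{L^2(\Omega)}^2 \ge \frac{C}{2} \gamma^{-2q} \|f\|_{L^2(\Omega)}^2.$$
3. Now, as in the proof of Lemma 1, let $L_n^-$ be the set of all polynomials in $z$ of degree $\le n$. Consider the subspace $\tilde L_n^- = L_n^- \cap N(\gamma, q)$; clearly, $\dim \tilde L_n^- \ge n + 1 - l$, where $l = \operatorname{codim} N(\gamma, q)$. Next, again as in the proof of Lemma 1 (see (4.10) and the following argument), we obtain for all sufficiently large $n$:
$$s_{n+1-l}^{(q)} \ge \inf_{f \in \tilde L_n^- \setminus \{0\}} \frac{\int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 e^{-|z|^2} v(z)\,dm(z)}{\|f\|_{F^2}^2} \ge C \inf_{f \in \tilde L_n^- \setminus \{0\}} \frac{\|f\|_{L^2(\Omega)}^2}{\|f\|_{F^2}^2} \ge C \frac{M_n(\chi_\Omega)}{(1 + \varepsilon)^{2n} n!},$$
where $\chi_\Omega$ denotes the characteristic function of the set $\Omega$ in $\mathbb{C}$. As stated in Remark 1, $\lim_{n \to \infty} M_n(\chi_\Omega)^{1/n} = (\operatorname{Cap}\Omega)^2$. Thus,
$$\liminf_{n \to \infty} (n!\,s_n^{(q)})^{1/n} \ge \frac{(\operatorname{Cap}\Omega)^2}{(1 + \varepsilon)^2}$$
for any $\varepsilon > 0$.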
Acknowledgement. We are indebted to M. Sh. Birman, Yu. Netrusov, G. Raikov, G. Rozenblum, A. Sobolev and H. Stahl for useful discussions. The work was supported by the Royal Society grant 2004/R1-FS. The authors are grateful to the Mathematisches Forschungsinstitut Oberwolfach for hospitality and financial support. The first named author is also grateful to Loughborough University for hospitality.
References
1. Aharonov, Y., Casher, A.: Ground state of a spin-1/2 charged particle in a two-dimensional magnetic field. Phys. Rev. A (3) 19(6), 2461–2462 (1979)
2. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45(4), 847–883 (1978)
3. Besch, A.: Eigenvalues in spectral gaps of the two-dimensional Pauli operator. J. Math. Phys. 41, 7918–7931 (2002)
4. Birman, M.Sh., Solomyak, M.Z.: Spectral theory of self-adjoint operators in Hilbert space. Dordrecht: D. Reidel P.C., 1987
5. Dubrovin, B.A., Novikov, S.P.: Fundamental states in a periodic field. Magnetic Bloch functions and vector bundles. (Russian) Dokl. Akad. Nauk SSSR 253(6), 1293–1297 (1980)
6. Erdős, L., Vougalter, V.: Pauli operator and Aharonov-Casher theorem for measure valued magnetic fields. Commun. Math. Phys. 225(2), 399–421 (2002)
7. Hille, E.: Analytic function theory. Vol. II, Boston, Mass: Ginn and Co., 1962
8. Hornberger, K., Smilansky, U.: Magnetic edge states. Physics Reports 367, 249–385 (2002)
9. Iwatsuka, A.: The essential spectrum of two-dimensional Schrödinger operators with perturbed constant magnetic fields. J. Math. Kyoto Univ. 23(3), 475–480 (1983)
10. Ivrii, V.: Microlocal analysis and precise spectral asymptotics. Berlin: Springer, 1998
11. Landkof, N.S.: Foundations of modern potential theory. New York: Springer, 1972
12. Melgaard, M., Rozenblum, G.: Eigenvalue asymptotics for weakly perturbed Dirac and Schrödinger operators with constant magnetic fields of full rank. Comm. Partial Differ. Eqs. 28(3–4), 697–736 (2003)
13. Parfënov, O.G.: The widths of some classes of entire functions. Mat. Sb. 190(4), 87–94 (1999); translation in Sb. Math. 190(3–4), 561–568 (1999)
14. Raikov, G.D.: Eigenvalue asymptotics for the Schrödinger operator with homogeneous magnetic potential and decreasing electric potential. I. Behaviour near the essential spectrum tips. Comm. Partial Differ. Eqs. 15(3), 407–434 (1990); Errata: Comm. Partial Differ. Eqs. 18(11), 1977–1979 (1993)
15. Raikov, G.D.: Border-line eigenvalue asymptotics for the Schrödinger operator with electromagnetic potential. Integral Equations Operator Theory 14(6), 875–888 (1991)
16. Raikov, G.D.: Spectral asymptotics for the perturbed 2D Pauli operator with oscillating magnetic fields. I. Non-zero mean value of the magnetic field. Markov Processes Relat. Fields 9, 775–794 (2003)
17. Raikov, G.D., Warzel, S.: Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials. Rev. Math. Phys. 14(10), 1051–1072 (2002)
18. Rozenblum, G., Shirokov, N.: Infiniteness of zero modes for the Pauli operator with singular magnetic field. http://lanl.arxiv.org/abs/math-ph/0501059, 2005. To appear in J. Functional Analysis
19. Sobolev, A.V.: Asymptotic behavior of energy levels of a quantum particle in a homogeneous magnetic field perturbed by an attenuating electric field. I, (Russian). Probl. Mat. Anal. 9, Leningrad: Leningrad. Univ., 1984, pp. 67–84. English translation in: J. Sov. Math. 35, 2201–2212 (1986)
20. Sobolev, A.V.: On the Lieb-Thirring estimates for the Pauli operator. Duke Math. J. 82(3), 607–635 (1996)
21. Solnyshkin, S.N.: Asymptotic behavior of the energy of bound states of the Schrödinger operator in the presence of electric and homogeneous magnetic fields (Russian). Probl. Mat. Fiz., 10, Leningrad: Leningrad. Univ., 1982, pp. 266–278
22. Stahl, H., Totik, V.: General orthogonal polynomials. Cambridge: Cambridge Univ. Press, 1992
23. Tamura, H.: Asymptotic distribution of eigenvalues for Schrödinger operators with homogeneous magnetic fields. Osaka J. Math. 25(3), 633–647 (1988)

Communicated by B. Simon
Commun. Math. Phys. 264, 773–795 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1554-3
Communications in
Mathematical Physics
Integration with Respect to the Haar Measure on Unitary, Orthogonal and Symplectic Group
Benoît Collins (1), Piotr Śniady (2)
(1) Department of Mathematics, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan. E-mail: [email protected]
(2) Institute of Mathematics, University of Wroclaw, pl. Grunwaldzki 2/4, 50-384 Wroclaw, Poland. E-mail: [email protected]
Received: 24 May 2005/ Accepted: 31 October 2005 Published online: 22 March 2006 – © Springer-Verlag 2006
Abstract: We revisit the work of the first named author and using simpler algebraic arguments we calculate integrals of polynomial functions with respect to the Haar measure on the unitary group U(d). The previous result provided exact formulas only for 2d bigger than the degree of the integrated polynomial and we show that these formulas remain valid for all values of d. Also, we consider the integrals of polynomial functions on the orthogonal group O(d) and the symplectic group Sp(d). We obtain an exact character expansion and the asymptotic behavior for large d. Thus we can show the asymptotic freeness of Haar-distributed orthogonal and symplectic random matrices, as well as the convergence of integrals of the Itzykson–Zuber type.

1. Introduction

Let $G \subset \operatorname{End}(\mathbb{C}^d)$ be a compact Lie group viewed as a group of matrices. The matrix structure provides a very natural coordinate system on $G$; in particular we are interested in the family of functions $e_{ij} : G \to \mathbb{C}$ defined by $e_{ij} : M_d(\mathbb{C}) \ni m \mapsto m_{ij}$, which assign to a matrix one of its entries. We call polynomials in $(e_{ij})$ polynomial functions on $G$. In this article we are interested in the integrals of the polynomial functions on compact Lie groups with respect to the Haar measure on $G$, i.e. the integrals of the form
$$\int_G U_{i_1 j_1} \cdots U_{i_n j_n}\,\overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_n j'_n}}\,dU. \tag{1}$$
For simplicity, such integrals will be called moments of the group G. If we consider a matrix-valued random variable U the distribution of which is the Haar measure on G then the integrals of the form (1) have a natural interpretation as certain moments of the entries of U and they appear very naturally in the random matrix
B.C. is supported by a JSPS postdoctoral fellowship. P.Ś. was supported by State Committee for Scientific Research (KBN) grant 2 P03A 007 23.
theory. The reason for this is that many random matrix ensembles $X$ are invariant with respect to conjugation by the elements of the group $G$ and therefore can be written as $X = U X' U^{-1}$, where $U$ and $X'$ are independent matrix-valued random variables and the distribution of $U$ is the Haar measure on $G$. As a result, expressions similar to
$$\mathbb{E} \operatorname{Tr}(X_1 U^{s_1} X_2 U^{s_2} \cdots X_n U^{s_n}) \tag{2}$$
are quite common in random matrix theory, where $s_1, \dots, s_n \in \{1, *\}$ and $X_1, \dots, X_n$ are matrix-valued random variables independent of $U$. It is easy to see that the calculation of (2) can be reduced to the calculation of (1). In random matrix theory we are quite often interested not in the exact value of an expression of type (2) but in its asymptotic behavior as $d$ tends to infinity. Results of this type were obtained for the first time by Weingarten [Wei78].

In this article we are interested in the case when $G \subset M_d(\mathbb{C})$ belongs to one of the series of the classical Lie groups, i.e. $G$ is either the unitary group $U(d)$ or the orthogonal group $O(d)$ or the symplectic group $Sp(d/2)$, where in the latter case we assume that $d$ is even. Since the Haar measures of the classical groups and their moments have many physical interpretations, the problem we are considering in this paper has had a long history in theoretical physics. In particular, many algorithms have been found for computing specific moments with high symmetry properties, but as far as the authors know, no general theory has been developed so far at a mathematical level of rigor. For interesting computations handling particular cases of this paper and for a good bibliography, we refer to [Gor02, BB96].

Firstly, we revisit a part of the work of the first named author [Col03] and compute with a new convolution formula the moments of the unitary group. This formula gives a new combinatorial insight into the relation between free probability and asymptotics of moments of the unitary group. Then, we make use of other features of the invariant theory to give an explicit integration formula on the orthogonal and symplectic groups and to compute the asymptotics in the latter case. This allows us to prove a new convergence result for a large family of matrix integrals. Our main tool is the Schur–Weyl duality for the unitary group and its analogues for the orthogonal and symplectic groups.
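The simplest moments of type (1) can be checked by simulation. The sketch below is a Monte Carlo illustration, not the method of this paper: it uses the standard fact that the first column of a Haar-distributed matrix in $U(d)$ is a normalised vector of independent standard complex Gaussians, and compares the estimates with the known values $\int |U_{11}|^2\,dU = 1/d$ and $\int |U_{11}|^4\,dU = 2/(d(d+1))$. Function names, sample size and seed are illustrative choices.

```python
import math
import random

def haar_first_column(d, rng):
    """First column of a Haar-distributed unitary matrix in U(d):
    a vector of i.i.d. standard complex Gaussians, normalised to length 1."""
    col = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(d)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in col))
    return [c / norm for c in col]

def estimate_moments(d, samples, seed=0):
    """Monte Carlo estimates of E|U_11|^2 and E|U_11|^4."""
    rng = random.Random(seed)
    second = 0.0
    fourth = 0.0
    for _ in range(samples):
        u11 = haar_first_column(d, rng)[0]
        second += abs(u11) ** 2
        fourth += abs(u11) ** 4
    return second / samples, fourth / samples

d = 3
m2, m4 = estimate_moments(d, 20000)
print(m2, 1.0 / d)                 # estimate vs exact value 1/d
print(m4, 2.0 / (d * (d + 1)))     # estimate vs exact value 2/(d(d+1))
```

Both quantities are instances of (1) with $n = 1$ and $n = 2$ and all indices equal to 1; the exact formulas developed in the paper cover arbitrary index patterns.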
2. Integration Over Unitary Groups

2.1. Schur–Weyl duality for unitary groups. We recall a couple of notations and standard facts. A non-increasing sequence of nonnegative integers $\lambda = (\lambda_1, \dots)$ is said to be a partition of the integer $n$ (abbreviated by $\lambda \vdash n$) if $\sum_i \lambda_i = n$. We denote by $l(\lambda)$ its length, i.e. the largest index $i$ for which $\lambda_i$ is non-zero.

There is a canonical way to parameterize all irreducible polynomial representations $\rho^\lambda_{U(d)} : U(d) \to \operatorname{End} V^\lambda_{U(d)}$ of the compact unitary group $U(d)$ by partitions $\lambda$ such that $l(\lambda) \le d$. The character of this representation evaluated on the torus is the Schur polynomial $s_{\lambda,d}$ (see [Ful97]). By $s_{\lambda,d}(x)$ we shall understand $s_{\lambda,d}(x, \dots, x)$ with $d$ copies of $x$. In particular, $s_{\lambda,d}(1)$ is the dimension of the representation $V^\lambda_{U(d)}$ of $U(d)$.

The group algebra $\mathbb{C}[S_n]$ of the symmetric group $S_n$ is semi-simple. It is endowed with its canonical basis $\{\delta_\sigma\}_{\sigma \in S_n}$. The irreducible representations $\rho^\lambda_{S_n} : S_n \to \operatorname{End} V^\lambda_{S_n}$ are canonically labeled by $\lambda \vdash n$ via the Schur functor (see [Ful97] as well); we denote the corresponding characters by $\chi^\lambda$.
Integration on the Unitary Group
775
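The following isomorphism holds:
\[ \mathbb{C}[S_n] \cong \bigoplus_{\lambda \vdash n} \operatorname{End} V^\lambda_{S_n}. \tag{3} \]
For any $\lambda \vdash n$, let $p_\lambda = \frac{\chi^\lambda(e)}{n!}\, \chi^\lambda \in \mathbb{C}[S_n]$ be the minimal central projector onto $\operatorname{End} V^\lambda_{S_n}$. We define for future use the algebra
\[ \mathbb{C}_d[S_n] = \bigoplus_{\lambda \vdash n,\ l(\lambda) \le d} p_\lambda\, \mathbb{C}[S_n] = \bigoplus_{\lambda \vdash n,\ l(\lambda) \le d} \operatorname{End} V^\lambda_{S_n}. \tag{4} \]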
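Consider the representation $\rho^d_{S_n}$ of $S_n$ on $(\mathbb{C}^d)^{\otimes n}$, where
\[ \rho^d_{S_n}(\pi) : v_1 \otimes \cdots \otimes v_n \mapsto v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)} \]
is given by the natural permutation of the elementary tensors. We consider also the representation $\rho^n_{U(d)}$ of $U(d)$ on $(\mathbb{C}^d)^{\otimes n}$, where
\[ \rho^n_{U(d)}(U) : v_1 \otimes \cdots \otimes v_n \mapsto U(v_1) \otimes \cdots \otimes U(v_n) \]
is the diagonal action. Since the representations $\rho^d_{S_n}$ and $\rho^n_{U(d)}$ commute, we obtain a representation $\rho_{S_n \times U(d)}$ of $S_n \times U(d)$ on $(\mathbb{C}^d)^{\otimes n}$.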
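Theorem 2.1 (Schur–Weyl duality for unitary groups [Wey39]). The action of $S_n \times U(d)$ is multiplicity free, i.e. no irreducible representation of $S_n \times U(d)$ occurs more than once in $\rho_{S_n \times U(d)}$. The decomposition of $\rho_{S_n \times U(d)}$ into irreducible components is given by
\[ (\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash n,\ l(\lambda) \le d} V^\lambda_{S_n} \otimes V^\lambda_{U(d)}, \tag{5} \]
where $S_n \times U(d)$ acts by $\rho^\lambda_{S_n} \otimes \rho^\lambda_{U(d)}$ on the summand corresponding to $\lambda$.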
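A quick consistency check of the decomposition (5) is to compare dimensions: the left-hand side has dimension $d^n$, while the right-hand side has dimension $\sum_\lambda \chi^\lambda(e)\, s_{\lambda,d}(1)$. The sketch below is not part of the original argument. It assumes the standard hook length formula for $\chi^\lambda(e)$ and the hook content formula for $s_{\lambda,d}(1)$, and all function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hook_lengths(lam):
    """Map each box (row, col) of the Young diagram lam to its hook length."""
    conj = [sum(1 for part in lam if part > col) for col in range(lam[0])] if lam else []
    return {(row, col): (lam[row] - col) + (conj[col] - row) - 1
            for row in range(len(lam)) for col in range(lam[row])}


def dim_symmetric(lam):
    """chi^lambda(e): dimension of the S_n irrep (hook length formula)."""
    product = 1
    for hook in hook_lengths(lam).values():
        product *= hook
    return factorial(sum(lam)) // product


def dim_unitary(lam, d):
    """s_{lambda,d}(1): dimension of the U(d) irrep (hook content formula)."""
    value = Fraction(1)
    for (row, col), hook in hook_lengths(lam).items():
        value *= Fraction(d + col - row, hook)
    return value


def schur_weyl_dimension(n, d):
    """Dimension of the right-hand side of (5)."""
    return sum(dim_symmetric(lam) * dim_unitary(lam, d)
               for lam in partitions(n) if len(lam) <= d)


if __name__ == "__main__":
    for n in range(1, 6):
        for d in range(1, 4):
            print(n, d, schur_weyl_dimension(n, d), d ** n)
```

The restriction `len(lam) <= d` mirrors the condition $l(\lambda) \le d$ in (5); the hook content formula returns zero for longer partitions anyway, so the filter only makes the correspondence with the theorem explicit.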
We shall consider the inclusion of algebras $\rho^d_{S_n}(\mathbb{C}_d[S_n]) \subseteq \operatorname{End}(\mathbb{C}^d)^{\otimes n}$.

Equations (4) and (5) show that $\rho^d_{S_n}$ is injective when restricted to $\mathbb{C}_d[S_n]$, and for this reason we shall omit $\rho^d_{S_n}$ whenever convenient and consider $\mathbb{C}_d[S_n]$ as sitting inside $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$. Conversely, we can identify every element of the image $\rho^d_{S_n}(\mathbb{C}_d[S_n]) \subseteq \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ with the unique corresponding element of the group algebra $\mathbb{C}_d[S_n]$.
B. Collins, P. Śniady
776
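2.2. Conditional expectation. For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we define
\[ E(A) = \int_{U(d)} U^{\otimes n} A\, (U^{-1})^{\otimes n}\, dU, \tag{6} \]
where the integration is taken with respect to the Haar measure on the compact group $U(d)$. We recall that for an algebra inclusion $M \subset N$, a conditional expectation is an $M$-bimodule map $E : N \to M$ such that $E(1_N) = 1_M$.

Proposition 2.2. $E$ defined in (6) is a conditional expectation of $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ onto $\mathbb{C}_d[S_n]$. We regard $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ as a Euclidean space with a scalar product $\langle A, B \rangle = \operatorname{Tr} A^* B$. Then $E$ is an orthogonal projection onto $\rho^d_{S_n}(\mathbb{C}_d[S_n])$. Moreover, it is compatible with the trace in the sense that $\operatorname{Tr} \circ E = \operatorname{Tr}$.

Proof. Since the Haar measure is a probability measure invariant with respect to left and right multiplication, $E(A)$ commutes with the action of the unitary group $U(d)$ for every $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$. Theorem 2.1 shows that $E(A) \in \mathbb{C}_d[S_n]$ and that the range of $E$ is exactly $\mathbb{C}_d[S_n]$. Since $\langle E(A), E(B) \rangle = \langle E(A), B \rangle$, it follows that $E$ is an orthogonal projection. The other statements of the proposition can be easily checked directly.

For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we set
\[ \Phi(A) = \sum_{\sigma \in S_n} \operatorname{Tr}\bigl(A\, \rho^d_{S_n}(\sigma^{-1})\bigr)\, \delta_\sigma \in \mathbb{C}[S_n]. \tag{7} \]

Proposition 2.3. $\Phi$ fulfills the following properties:
1. $\Phi$ is a $\mathbb{C}[S_n]$–$\mathbb{C}[S_n]$ bimodule morphism in the sense that
\[ \Phi\bigl(A\, \rho^d_{S_n}(\sigma)\bigr) = \Phi(A)\, \sigma, \qquad \Phi\bigl(\rho^d_{S_n}(\sigma)\, A\bigr) = \sigma\, \Phi(A); \]
2. $\Phi(\mathrm{Id})$ coincides with the character of $\rho^d_{S_n}$, hence it is equal to
\[ \Phi(\mathrm{Id}) = n! \sum_{\lambda \vdash n} \frac{s_{\lambda,d}(1)}{\chi^\lambda(e)}\, p_\lambda \tag{8} \]
and is an invertible element of $\mathbb{C}_d[S_n]$; its inverse will be called the Weingarten function and is equal to
\[ \mathrm{Wg} = \frac{1}{(n!)^2} \sum_{\lambda \vdash n,\ l(\lambda) \le d} \frac{\chi^\lambda(e)^2}{s_{\lambda,d}(1)}\, \chi^\lambda; \]
3. the relation between $\Phi(A)$ and $E(A)$ is explicitly given by
\[ \Phi(A) = E(A)\, \Phi(\mathrm{Id}); \tag{9} \]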
4. the range of $\Phi$ is equal to $\mathbb{C}_d[S_n]$;
5. in $\mathbb{C}_d[S_n]$, the following holds true:
\[ \Phi(A\, E(B)) = \Phi(A)\, \Phi(B)\, \Phi(\mathrm{Id})^{-1}. \tag{10} \]
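Proof. Points 1 and 2 are immediate. Point 1 implies $\Phi(A) = \Phi(E(A)) = \Phi(\mathrm{Id}\, E(A)) = \Phi(\mathrm{Id})\, E(A)$, which proves Point 3. Point 4 follows from Point 3 and Point 2. Point 5 follows from Points 1 and 3.

Corollary 2.4. Let $n$ be a positive integer and $i = (i_1, \dots, i_n)$, $i' = (i'_1, \dots, i'_n)$, $j = (j_1, \dots, j_n)$, $j' = (j'_1, \dots, j'_n)$ be $n$-tuples of positive integers. Then
\[ \int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_n j'_n}}\, dU = \sum_{\sigma, \tau \in S_n} \delta_{i_1 i'_{\sigma(1)}} \cdots \delta_{i_n i'_{\sigma(n)}}\, \delta_{j_1 j'_{\tau(1)}} \cdots \delta_{j_n j'_{\tau(n)}}\, \mathrm{Wg}(\tau \sigma^{-1}). \tag{11} \]
If $n \neq n'$ then
\[ \int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_{n'} j'_{n'}}}\, dU = 0. \tag{12} \]

Proof. In order to show (11) it is enough to take appropriate $A$ and $B$ in $M_d(\mathbb{C})^{\otimes n}$ and take the value of both sides of (10) in $e \in S_n$.

For every $u \in \mathbb{C}$ such that $|u| = 1$ the map $U(d) \ni U \mapsto uU \in U(d)$ is measure preserving, therefore
\[ \int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_{n'} j'_{n'}}}\, dU = \int_{U(d)} u U_{i_1 j_1} \cdots u U_{i_n j_n}\, \overline{u U_{i'_1 j'_1}} \cdots \overline{u U_{i'_{n'} j'_{n'}}}\, dU. \]
The right-hand side equals $u^{n - n'}$ times the left-hand side, and (12) follows.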
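To make formula (11) concrete, the following sketch evaluates its right-hand side by brute-force summation over $\sigma, \tau \in S_n$. It is an illustration only: the function names are hypothetical, and the Weingarten values are hard-coded for $n \le 2$ (namely $1/d$ for $n = 1$, and $1/(d^2-1)$, $-1/(d(d^2-1))$ for the identity and the transposition in $S_2$), so $d \ge 2$ is assumed.

```python
from fractions import Fraction
from itertools import permutations


def wg_small(perm, d):
    """Unitary Weingarten function on S_1 and S_2 (closed forms, d >= 2 assumed)."""
    d = Fraction(d)
    if len(perm) == 1:
        return 1 / d
    if len(perm) == 2:
        if perm == (0, 1):
            return 1 / (d * d - 1)
        return -1 / (d * (d * d - 1))
    raise ValueError("only n <= 2 is tabulated in this sketch")


def compose(a, b):
    """Composition a after b: (a b)(k) = a[b[k]]."""
    return tuple(a[b[k]] for k in range(len(a)))


def inverse(a):
    inv = [0] * len(a)
    for k, image in enumerate(a):
        inv[image] = k
    return tuple(inv)


def unitary_moment(i, j, ip, jp, d, wg=wg_small):
    """Right-hand side of (11); returns 0 when the numbers of U and conj(U) factors differ, as in (12)."""
    if len(i) != len(ip):
        return Fraction(0)
    n = len(i)
    total = Fraction(0)
    for sigma in permutations(range(n)):
        if any(i[k] != ip[sigma[k]] for k in range(n)):
            continue
        for tau in permutations(range(n)):
            if any(j[k] != jp[tau[k]] for k in range(n)):
                continue
            total += wg(compose(tau, inverse(sigma)), d)
    return total


if __name__ == "__main__":
    d = 3
    print(unitary_moment((1,), (1,), (1,), (1,), d))          # E |U_11|^2
    print(unitary_moment((1, 1), (1, 1), (1, 1), (1, 1), d))  # E |U_11|^4
    print(unitary_moment((1, 1), (1, 2), (1, 1), (1, 2), d))  # E |U_11|^2 |U_12|^2
```

One sanity check on these values: since every row of $U$ is a unit vector, $\mathbb{E}|U_{11}|^4 + (d-1)\,\mathbb{E}|U_{11}|^2|U_{12}|^2$ must equal $\mathbb{E}|U_{11}|^2 = 1/d$.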
The above result was obtained by the first named author [Col03] under the assumption $d \ge n$. As we shall see, this assumption is not necessary. For $d \ge n$ the formula for Wg given in Proposition 2.3 takes the simpler form
\[ \mathrm{Wg} = \frac{1}{(n!)^2} \sum_{\lambda \vdash n} \frac{\chi^\lambda(e)^2}{s_{\lambda,d}(1)}\, \chi^\lambda, \tag{13} \]
with no restrictions on the length of $\lambda$. The right-hand side is a rational function of $d$ and hence we may consider it for any $d \in \mathbb{C}$. However, the polynomial $d \mapsto s_{\lambda,d}(1)$ has zeros in the integer points $-l(\lambda), -l(\lambda)+1, \dots, l(\lambda)-1, l(\lambda)$, and hence the right-hand side of (13) has poles in the points $-n, -n+1, \dots, n-1, n$ and therefore is not well-defined on the whole of $\mathbb{C}$. Nevertheless, even for the case $d < n$, let us plug this incorrect value (13) into (11). In this way the right-hand side of (11) becomes a rational function in $d$.
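The character formula (13) can be evaluated exactly for small $n$. The sketch below does this for $n = 2$ and $n = 3$ and checks that the result is the convolution inverse of $\Phi(\mathrm{Id})$, whose value at $\tau$ is $d^{\#\mathrm{cycles}(\tau)}$. It is a sketch under stated assumptions: the character tables of $S_2$ and $S_3$ are hard-coded, $s_{\lambda,d}(1)$ is computed with the hook content formula, $d \ge n$ is assumed so that no denominator vanishes, and all names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

# Character tables of S_2 and S_3: partition lambda -> cycle type -> character value.
CHARACTERS = {
    2: {(2,): {(1, 1): 1, (2,): 1},
        (1, 1): {(1, 1): 1, (2,): -1}},
    3: {(3,): {(1, 1, 1): 1, (2, 1): 1, (3,): 1},
        (2, 1): {(1, 1, 1): 2, (2, 1): 0, (3,): -1},
        (1, 1, 1): {(1, 1, 1): 1, (2, 1): -1, (3,): 1}},
}


def cycle_type(perm):
    """Cycle lengths of perm (a tuple mapping k -> perm[k]), sorted decreasingly."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = perm[k]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def schur_at_ones(lam, d):
    """s_{lambda,d}(1) by the hook content formula."""
    conj = [sum(1 for part in lam if part > col) for col in range(lam[0])]
    value = Fraction(1)
    for row, part in enumerate(lam):
        for col in range(part):
            hook = (part - col) + (conj[col] - row) - 1
            value *= Fraction(d + col - row, hook)
    return value


def weingarten(perm, d):
    """Formula (13) evaluated at perm, for n = len(perm) in {2, 3} and d >= n."""
    n = len(perm)
    identity_type, ctype = (1,) * n, cycle_type(perm)
    total = Fraction(0)
    for lam, chi in CHARACTERS[n].items():
        total += Fraction(chi[identity_type] ** 2) / schur_at_ones(lam, d) * chi[ctype]
    return total / factorial(n) ** 2


def is_inverse_of_phi_identity(n, d):
    """Check sum_tau d^{#cycles(tau)} Wg(tau^{-1} sigma) = delta_e(sigma) for all sigma in S_n."""
    group = list(permutations(range(n)))
    for sigma in group:
        total = Fraction(0)
        for tau in group:
            tau_inv = [0] * n
            for k, image in enumerate(tau):
                tau_inv[image] = k
            product = tuple(tau_inv[sigma[k]] for k in range(n))
            total += Fraction(d) ** len(cycle_type(tau)) * weingarten(product, d)
        if total != (1 if sigma == tuple(range(n)) else 0):
            return False
    return True


if __name__ == "__main__":
    print(weingarten((0, 1, 2), 5), weingarten((1, 0, 2), 5), weingarten((1, 2, 0), 5))
    print(is_inverse_of_phi_identity(3, 5))
```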
We claim that for every $d \in \mathbb{N}$ for which the left-hand side of (11) makes sense (i.e. if $i_1, \dots, i_n, i'_1, \dots, i'_n, j_1, \dots, j_n, j'_1, \dots, j'_n \in \{1, \dots, d\}$), the right-hand side also makes sense (possibly after some cancelations of poles) and is equal to the left-hand side of (11). Indeed, let us view the product $\Phi(A)\Phi(B)\,\mathrm{Wg}$ as an element of $\mathbb{C}[S_n]$ with rational coefficients in $d$. For the choice of $A, B \in M_d(\mathbb{C})^{\otimes n}$ used in the proof of Corollary 2.4 we must have $\Phi(A), \Phi(B) \in \mathbb{C}_d[S_n]$, therefore the product $\Phi(A)\Phi(B)\,\mathrm{Wg}$ is an element of $\mathbb{C}_d[S_n]$ with rational coefficients in $d$. Since the formula for Wg in Proposition 2.3 and (13), regarded as elements of $\mathbb{C}[S_n]$ with rational coefficients in $d$, coincide on $\mathbb{C}_d[S_n]$, our claim holds true. We summarize the above discussion in the following proposition.

Proposition 2.5. For fixed values of the indices $i, j, i', j'$ the integral
\[ \int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_n j'_n}}\, dU \]
is a rational function of $d$. Furthermore, Eq. (11) remains true (possibly after some cancelations of poles) if we replace the correct value of the Weingarten function given in Proposition 2.3 by (13).

Example. Corollary 2.4 implies that for $d \ge 2$,
\[ \int_{U(d)} |U_{11}|^4\, dU = \int_{U(d)} U_{11} U_{11} \overline{U_{11}}\, \overline{U_{11}}\, dU = 2\, \mathrm{Wg}\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} + 2\, \mathrm{Wg}\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} = 2\, \frac{1}{d^2 - 1} + 2\, \frac{-1}{d(d^2 - 1)}, \]
where the values of the Weingarten function were computed by (13) and where $\begin{pmatrix} 1 & \cdots & n \\ \sigma(1) & \cdots & \sigma(n) \end{pmatrix}$ denotes the permutation $\sigma$. The right-hand side appears to make no sense for $d = 1$; nevertheless, after algebraic simplifications we obtain
\[ \int_{U(d)} |U_{11}|^4\, dU = \frac{2}{d(d+1)}, \]
which is a correct value for all $d \ge 1$.

2.3. Asymptotics of the Weingarten function. In this section we compute the first order asymptotics of the Weingarten function for large values of $d$. Consider the algebra $\mathbb{C}[S_n][[d^{-1}]]$ of functions on $S_n$ valued in formal power series in $d^{-1}$ and the vector space
\[ A = \operatorname{Vect}\bigl\{ \alpha\, \delta_\sigma : \alpha = O(d^{-|\sigma|}) \text{ and } \alpha\, d^{|\sigma|} \text{ is a power series in } d^{-2} \bigr\}, \]
where $|\sigma|$ denotes the minimal number of factors necessary to write $\sigma$ as a product of transpositions. By the triangle inequality $|\sigma_1| + |\sigma_2| \ge |\sigma_1 \sigma_2|$ and the parity property $(-1)^{|\sigma_1|} (-1)^{|\sigma_2|} = (-1)^{|\sigma_1 \sigma_2|}$, $A$ turns out to be a unital subalgebra of $\mathbb{C}[S_n][[d^{-1}]]$. It is easy to check that $d^{-n} \Phi(\mathrm{Id}) \in A$. Since $d^{-n} \Phi(\mathrm{Id}) = \delta_e + O(d^{-1})$, its inverse
\[ d^n\, \mathrm{Wg} = \sum_{i \ge 0} \bigl( 1 - d^{-n} \Phi(\mathrm{Id}) \bigr)^i \]
makes sense as a formal power series in $d^{-1}$. The following proposition follows immediately.
Proposition 2.6. $d^n\, \mathrm{Wg} \in A$. Equivalently, for any $\sigma \in S_n$, $\mathrm{Wg}(\sigma) = O(d^{-n-|\sigma|})$.

In order to find a more precise asymptotic expansion we consider the two-sided ideal $I$ in $A$ generated by $d^{-2} \delta_e$. It is easy to check that the quotient algebra $A/I$ regarded as a vector space is spanned by the vectors $d^{-|\sigma|} \delta_\sigma$. The products of these elements are given by
\[ (d^{-|\sigma|} \delta_\sigma)(d^{-|\rho|} \delta_\rho) \cong \begin{cases} d^{-|\sigma\rho|}\, \delta_{\sigma\rho} & \text{if } |\sigma\rho| = |\sigma| + |\rho|, \\ 0 & \text{if } |\sigma\rho| < |\sigma| + |\rho|. \end{cases} \]
Biane [Bia97] considered an algebra which as a vector space is equal to $\mathbb{C}[S_n]$ with the multiplication
\[ \delta_\sigma\, \delta_\rho = \begin{cases} \delta_{\sigma\rho} & \text{if } |\sigma\rho| = |\sigma| + |\rho|, \\ 0 & \text{if } |\sigma\rho| < |\sigma| + |\rho|. \end{cases} \]
One can easily see now that $d^{-|\rho|} \delta_\rho \mapsto \delta_\rho$ provides an isomorphism of $A/I$ and the Biane algebra. Under this isomorphism $d^{-n} \Phi(\mathrm{Id})$ is mapped into $\zeta = \sum_{\sigma \in S_n} \delta_\sigma$. The inverse of $\zeta$ in the Biane algebra is called the Möbius function and is given explicitly by
\[ \mathrm{Moeb}(\sigma) = \prod_{1 \le i \le k} c_{|C_i| - 1}\, (-1)^{|C_i| - 1}, \]
where $\sigma$ is a permutation with a cycle decomposition $\sigma = C_1 \cdots C_k$, $|C_i|$ is the number of elements in the cycle $C_i$ and
\[ c_n = \frac{(2n)!}{n!\,(n+1)!} \tag{14} \]
is the Catalan number.

Corollary 2.7. $d^{n+|\sigma|}\, \mathrm{Wg}(\sigma) = \mathrm{Moeb}(\sigma) + O(d^{-2})$.

3. Integration Over Orthogonal Groups

3.1. Schur–Weyl duality for orthogonal groups.

3.1.1. Brauer algebras. We consider the group of orthogonal matrices $O(d) = \{ M \in GL(d),\ M^{-1} = M^t = M^* \}$. Its invariant theory was first studied by R. Brauer [Bra37], who introduced a family of algebras nowadays called Brauer algebras. These algebras have been at the center of many investigations (see [BW89, Gro99] and the references therein). Some actions of these algebras lead to an analogue of the Schur–Weyl duality in the case of the orthogonal and symplectic groups, and for this reason they are very useful for our purposes.

Consider $2n$ vertices arranged in two rows: the upper one with $n$ vertices denoted by $U_1, \dots, U_n$ and the bottom row with $n$ vertices denoted by $B_1, \dots, B_n$. We regard $S_{2n}$ as a group of permutations of the set of vertices and denote by $P_{2n}$ the set of all pairings of this set. An example of such a pairing is presented on Fig. 1. We
Fig. 1. Example of an element of P20
can view $P_{2n}$ as the set of permutations $\sigma \in S_{2n}$ such that $\sigma^2 = e$ and $\sigma$ has no fixed points. We will consider the action $\rho_{S_{2n}}$ of $S_{2n}$ on $P_{2n}$ by conjugation under the embedding $P_{2n} \subset S_{2n}$ described above. By $\mathbb{C}[P_{2n}]$ we denote the linear space spanned by $P_{2n}$. We equip this linear space with a bilinear symmetric form $\langle \cdot, \cdot \rangle$ by the requirement that the elements of $P_{2n}$ form an orthonormal basis. The embedding $P_{2n} \subset S_{2n}$ extends linearly to the inclusion of $S_{2n}$-modules $\mathbb{C}[P_{2n}] \subset \mathbb{C}[S_{2n}]$ and the scalar product can be described as
\[ \langle a, b \rangle = \frac{\chi_{\mathrm{reg}}(a b^*)}{\chi_{\mathrm{reg}}(e)}, \]
where $\chi_{\mathrm{reg}}$ denotes the character of the left regular representation.

The Brauer algebra $B(d, n)$ regarded as a vector space is isomorphic to $\mathbb{C}[P_{2n}]$. The multiplication in the algebra $B(d, n)$ depends on the parameter $d$, but in this article we will not use the multiplicative structure of the Brauer algebra.

3.1.2. Canonical representation of the Brauer algebra. By $\langle \cdot, \cdot \rangle$ we denote the canonical bilinear symmetric forms on $\mathbb{C}^d$ and on $(\mathbb{C}^d)^{\otimes n}$. The canonical representation $\rho_B$ of the Brauer algebra $B(d, n)$ on $(\mathbb{C}^d)^{\otimes n}$ is defined as follows: in order to compute
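$\langle u_1 \otimes \cdots \otimes u_n,\ \rho_B(p)[b_1 \otimes \cdots \otimes b_n] \rangle$, where $p \in P_{2n}$ and $u_1, \dots, u_n, b_1, \dots, b_n \in \mathbb{C}^d$, we assign to the upper vertices of $p$ the vectors $u_1, \dots, u_n$ and to the bottom vertices the vectors $b_1, \dots, b_n$. The value of $\langle u_1 \otimes \cdots \otimes u_n,\ \rho_B(p)[b_1 \otimes \cdots \otimes b_n] \rangle$ is defined to be the product of the scalar products of vectors assigned to vertices joined by the same line. For example, for the diagram $p$ from Fig. 1 we obtain:
\[ \langle u_1 \otimes \cdots \otimes u_{10},\ \rho_B(p)[b_1 \otimes \cdots \otimes b_{10}] \rangle = \langle u_1, u_3 \rangle \langle u_2, u_4 \rangle \langle u_5, b_7 \rangle \langle u_6, u_{10} \rangle \langle u_7, u_9 \rangle \langle u_8, b_8 \rangle \langle b_1, b_3 \rangle \langle b_2, b_5 \rangle \langle b_4, b_6 \rangle \langle b_9, b_{10} \rangle. \tag{15} \]
The bilinear form $\langle \cdot, \cdot \rangle$ allows us to identify $\mathbb{C}^d$ canonically with its dual and we can write the isomorphism of vector spaces
\[ \operatorname{End}(\mathbb{C}^d)^{\otimes n} = \bigotimes_{i \in \{U_1, \dots, U_n, B_1, \dots, B_n\}} \mathbb{C}^d. \tag{16} \]
Let us consider the action of $S_{2n}$ on $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ by permutation of factors on the right-hand side of (16). We consider the representation $\rho^n_{O(d)}$ of $O(d)$ on $(\mathbb{C}^d)^{\otimes n}$, where
\[ \rho^n_{O(d)}(O) : v_1 \otimes \cdots \otimes v_n \mapsto O(v_1) \otimes \cdots \otimes O(v_n) \]
is the diagonal action.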
Theorem 3.1 (Schur–Weyl duality for orthogonal groups [Bra37, Wen88]). The commutant of $\rho^n_{O(d)}(O(d))$ is equal to $\rho_B(\mathbb{C}[P_{2n}])$. Furthermore, if $d \ge n$ then $\rho_B$ is injective.

3.2. Integration formula.

3.2.1. For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we define
\[ E(A) = \int_{O(d)} O^{\otimes n} A\, (O^t)^{\otimes n}\, dO. \]

Proposition 3.2. $E$ is a conditional expectation of $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ into $\rho_B(\mathbb{C}[P_{2n}])$; in particular it satisfies $E^2 = E$. We regard $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ as a Euclidean space with a scalar product $\langle A, B \rangle = \operatorname{Tr} A B^*$. Then $E$ is an orthogonal projection onto $\rho_B(\mathbb{C}[P_{2n}])$. It is compatible with the trace in the sense that $\operatorname{Tr} \circ E = \operatorname{Tr}$.

Proof. The proof is analogous to the proof of Proposition 2.2, but instead of Theorem 2.1 we use Theorem 3.1.

For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we set
\[ \Phi(A) = \sum_{p \in P_{2n}} p\, \operatorname{Tr}\bigl(\rho_B(p)^t A\bigr) \in \mathbb{C}[P_{2n}]. \tag{17} \]
Every element of $\mathbb{C}[P_{2n}]$ can be viewed by the representation $\rho_B$ as an element of $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ and therefore we can consider the linear map
\[ \tilde{\Phi} = \Phi \circ \rho_B : \mathbb{C}[P_{2n}] \to \mathbb{C}[P_{2n}]. \]
The matrix of the operator $\tilde{\Phi}$ coincides with the Gram matrix of the set of the vectors $\rho_B(p) \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ indexed by $p \in P_{2n}$. We denote by Wg the inverse of $\tilde{\Phi}$. We postpone answering the question of whether this inverse exists to Proposition 3.10. We denote by $\Pi_{p_1, p_2}$ the partition induced by the action of the group generated by $p_1, p_2$.

Proposition 3.3. $\rho_B$, $E$, $\Phi$ are morphisms of $S_{2n}$-spaces. As a consequence, $\langle p_1, \mathrm{Wg}\, p_2 \rangle$ depends only on the conjugacy class of $p_1 p_2$.

Proof. The proof of this proposition is straightforward.
By a change of labels we can view $P_{2n}$ as the set of pairings of the set $\{1, \dots, 2n\}$. We do not care about the choice of the way in which the labels $\{U_1, \dots, U_n, B_1, \dots, B_n\}$ are replaced by $\{1, \dots, 2n\}$. For a tuple of indices $i = (i_1, \dots, i_{2n})$, where $i_1, \dots, i_{2n} \in \{1, \dots, d\}$, and a pairing $p \in P_{2n}$ we set $\delta^p_i = 1$ if for each pair $a, b \in \{1, \dots, 2n\}$ connected by $p$ we have $i_a = i_b$; otherwise we set $\delta^p_i = 0$.
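Corollary 3.4. The following formulas hold true:
\[ E = \rho_B \circ \mathrm{Wg} \circ \Phi, \tag{18} \]
\[ \operatorname{Tr} A E(B) = \sum_{p_1, p_2 \in P_{2n}} \operatorname{Tr}\bigl(A \rho_B(p_1)\bigr)\, \operatorname{Tr}\bigl(\rho_B(p_2)^t B\bigr)\, \langle p_1, \mathrm{Wg}\, p_2 \rangle. \tag{19} \]
For every choice of $u_1, \dots, u_{2n}, v_1, \dots, v_{2n}$ we have
\[ \int_{O(d)} \langle u_1, O v_1 \rangle \cdots \langle u_{2n}, O v_{2n} \rangle\, dO = \sum_{p_1, p_2 \in P_{2n}} \langle u_1 \otimes \cdots \otimes u_n,\ \rho_B(p_1)\, u_{n+1} \otimes \cdots \otimes u_{2n} \rangle \times \langle v_1 \otimes \cdots \otimes v_n,\ \rho_B(p_2)\, v_{n+1} \otimes \cdots \otimes v_{2n} \rangle\, \langle p_1, \mathrm{Wg}\, p_2 \rangle. \tag{20} \]
In particular, for every choice of indices $i = (i_1, \dots, i_{2n})$, $j = (j_1, \dots, j_{2n})$,
\[ \int_{O(d)} O_{i_1 j_1} \cdots O_{i_{2n} j_{2n}}\, dO = \sum_{p_1, p_2 \in P_{2n}} \delta^{p_1}_i\, \delta^{p_2}_j\, \langle p_1, \mathrm{Wg}\, p_2 \rangle. \tag{21} \]
The moments of an odd number of factors vanish:
\[ \int_{O(d)} O_{i_1 j_1} \cdots O_{i_{2n+1} j_{2n+1}}\, dO = 0. \tag{22} \]

Proof. It is enough to take appropriate matrices in the canonical basis to establish this result. The map $O(d) \ni O \mapsto -O \in O(d)$ preserves the Haar measure, therefore
\[ \int_{O(d)} O_{i_1 j_1} \cdots O_{i_{2n+1} j_{2n+1}}\, dO = \int_{O(d)} (-O_{i_1 j_1}) \cdots (-O_{i_{2n+1} j_{2n+1}})\, dO, \]
which shows (22).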
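Formula (21) can be tried out numerically for small $n$. The sketch below builds the Gram matrix of the pairings, using the value $d^{\,\text{number of connected components}}$ for each entry (this is the content of Lemma 3.5 later in the text), inverts it exactly, and sums the relevant entries. It is an illustration under stated assumptions: $d \ge n$ is assumed so that the Gram matrix is invertible, the exact inversion is a plain Gauss–Jordan elimination, and all function names are hypothetical.

```python
from fractions import Fraction


def pairings(points):
    """Yield all perfect matchings of the list points, each as a list of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail


def components(p1, p2, size):
    """Connected components of the graph on range(size) whose edges are the pairs of p1 and p2."""
    parent = list(range(size))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in p1 + p2:
        parent[find(a)] = find(b)
    return len({find(x) for x in range(size)})


def invert(matrix):
    """Exact inverse of a square matrix of Fractions (Gauss-Jordan elimination)."""
    size = len(matrix)
    aug = [list(row) + [Fraction(int(r == c)) for c in range(size)]
           for r, row in enumerate(matrix)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def orthogonal_moment(i, j, d):
    """Right-hand side of (21) for index tuples i, j of equal length; odd length gives 0 as in (22)."""
    size = len(i)
    if size % 2 == 1:
        return Fraction(0)
    all_pairings = list(pairings(list(range(size))))
    gram = [[Fraction(d) ** components(p, q, size) for q in all_pairings] for p in all_pairings]
    wg = invert(gram)

    def delta(p, idx):
        return all(idx[a] == idx[b] for a, b in p)

    return sum((wg[a][b]
                for a, p1 in enumerate(all_pairings) if delta(p1, i)
                for b, p2 in enumerate(all_pairings) if delta(p2, j)),
               Fraction(0))


if __name__ == "__main__":
    d = 3
    print(orthogonal_moment((1, 1), (1, 1), d))              # E O_11^2
    print(orthogonal_moment((1, 1, 1, 1), (1, 1, 1, 1), d))  # E O_11^4
    print(orthogonal_moment((1, 1, 1, 1), (1, 1, 2, 2), d))  # E O_11^2 O_12^2
```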
Therefore Wg appears to be of fundamental importance in the computation of moments of the orthogonal group, and it is of theoretical importance to give a closed formula for it. We shall do this in the following.

3.2.2. An abstract formula for the orthogonal Weingarten function. Let $\mathrm{Id} \in P_{2n}$ be any fixed pairing; to have a concrete example let us say that $\mathrm{Id}$ is the identity of the Brauer algebra, i.e. the pairing which connects the pairs of vertices $U_i, B_i$ for each $1 \le i \le n$. From now on we fix an inclusion of the hyperoctahedral group $O_n$ into $S_{2n}$ by considering $O_n$ as the global stabilizer of $\mathrm{Id}$ under the action of $S_{2n}$. We equip the set $P_{2n}$ of pairings with a metric $l$ by setting
\[ l(p_1, p_2) = \frac{|p_1 p_2|}{2}, \]
where the pairings $p_1, p_2$ are regarded on the right-hand side as elements of $S_{2n}$.
Fig. 2. Identity in Brauer algebra
Lemma 3.5. If $p_1, p_2 \in P_{2n}$ then
\[ \operatorname{Tr} \rho_B(p_1)\, \rho_B(p_2)^t = d^{\,n - l(p_1, p_2)}. \tag{23} \]
Furthermore, $l(p_1, p_2)$ is an integer.

Each right class $\pi O_n$ of $S_{2n}/O_n$ is uniquely determined by its action $\pi(\mathrm{Id})$ on the identity diagram, hence the right classes $S_{2n}/O_n$ are in a one-to-one correspondence with the elements of $P_{2n}$. We set $|\pi O_n| = \min_{\sigma \in \pi O_n} |\sigma|$. Then
\[ \langle \tilde{\Phi}(\pi(\mathrm{Id})), \mathrm{Id} \rangle = d^{\,n - |\pi O_n|}. \tag{24} \]
Let a left and right class $O_n \rho O_n$ be fixed. The value of $|\pi O_n|$ does not depend on the choice of $\pi \in O_n \rho O_n$, therefore the definition $|O_n \rho O_n| = |\rho O_n|$ makes sense.

Proof. Let $e_1, \dots, e_d$ be the orthonormal basis of $\mathbb{C}^d$; then
\[ \operatorname{Tr} \rho_B(p_1)\, \rho_B(p_2)^t = \sum_{1 \le i_1, \dots, i_n, j_1, \dots, j_n \le d} \langle e_{j_1} \otimes \cdots \otimes e_{j_n},\ \rho_B(p_1)(e_{i_1} \otimes \cdots \otimes e_{i_n}) \rangle \times \langle e_{j_1} \otimes \cdots \otimes e_{j_n},\ \rho_B(p_2)(e_{i_1} \otimes \cdots \otimes e_{i_n}) \rangle. \]
To every upper vertex $U_k$ (respectively, bottom vertex $B_k$) we assign the appropriate index $i_k$ (respectively, $j_k$). From the very definition of $\rho_B$, each summand on the right-hand side is equal to 1 if the indices corresponding to each pair of vertices connected by $p_1$ or $p_2$ are equal; otherwise the summand is equal to 0. It follows that
\[ \operatorname{Tr} \rho_B(p_1)\, \rho_B(p_2)^t = d^{\,\text{number of connected components of the graph depicting } p_1 \text{ and } p_2}. \]
We observe that each connected component of the graph depicting $p_1$ and $p_2$ corresponds to a pair of orbits of the permutation $p_1 p_2$. The number of orbits of $p_1 p_2$ is equal to $2n - |p_1 p_2|$, which finishes the proof of the first part.

The above considerations imply that
\[ \langle \tilde{\Phi}(\pi(\mathrm{Id})), \mathrm{Id} \rangle = d^{\,n - \frac{1}{2} |\pi\, \mathrm{Id}\, \pi^{-1}\, \mathrm{Id}|}. \]
Let $\sigma \in \pi O_n$. Since $|\pi\, \mathrm{Id}\, \pi^{-1}\, \mathrm{Id}| = |\sigma\, \mathrm{Id}\, \sigma^{-1}\, \mathrm{Id}| \le |\sigma| + |\mathrm{Id}\, \sigma^{-1}\, \mathrm{Id}| = 2|\sigma|$, we obtain $|\pi\, \mathrm{Id}\, \pi^{-1}\, \mathrm{Id}| \le 2\, |\pi O_n|$.
We can decompose the set of the vertices $\{U_1, \dots, U_n, B_1, \dots, B_n\}$ into two classes in such a way that the graph depicting the pairings $\pi(\mathrm{Id})$ and $\mathrm{Id}$ is bipartite; in other words, each of the pairings $\pi(\mathrm{Id})$, $\mathrm{Id}$ regarded as a permutation maps these two classes into each other. We leave it to the reader to check that there exists a unique permutation $\sigma \in \pi O_n$ which is equal to the identity on the first of these classes. It follows that $|\pi\, \mathrm{Id}\, \pi^{-1}\, \mathrm{Id}| = 2\, |\sigma|$, which shows that $|\pi\, \mathrm{Id}\, \pi^{-1}\, \mathrm{Id}| \ge 2\, |\pi O_n|$.

Let $\sigma \in O_n$. Then $|\pi\, \mathrm{Id}\, \pi^{-1}\, \mathrm{Id}| = |\pi\, \mathrm{Id}\, \pi^{-1} \sigma^{-1}\, \mathrm{Id}\, \sigma| = |\sigma \pi\, \mathrm{Id}\, \pi^{-1} \sigma^{-1}\, \mathrm{Id}|$, therefore $|\pi O_n| = |\sigma \pi O_n|$, which finishes the proof.
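The counting argument in the proof of Lemma 3.5 is easy to replay by brute force for small $n$ and $d$. The sketch below labels the $2n$ vertices by $0, \dots, 2n-1$ (an arbitrary relabeling, as permitted in the text), counts index assignments that are constant along every pair of $p_1$ and of $p_2$, and compares the count with $d^{\,n - l(p_1, p_2)}$, where $l(p_1, p_2) = |p_1 p_2| / 2$ is computed from the cycle structure of the product. The function names are illustrative and not from the paper.

```python
from itertools import product


def pairings(points):
    """Yield all perfect matchings of the list points, each as a list of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail


def as_involution(p, size):
    """The fixed-point-free involution of range(size) that swaps the two ends of every pair."""
    perm = [0] * size
    for a, b in p:
        perm[a], perm[b] = b, a
    return perm


def distance(p1, p2, size):
    """l(p1, p2) = |p1 p2| / 2, where |s| = size - (number of cycles of s)."""
    s1, s2 = as_involution(p1, size), as_involution(p2, size)
    prod = [s1[s2[x]] for x in range(size)]
    seen, cycles = set(), 0
    for start in range(size):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = prod[x]
    norm = size - cycles
    assert norm % 2 == 0  # Lemma 3.5: l(p1, p2) is an integer
    return norm // 2


def trace_by_counting(p1, p2, size, d):
    """Number of index tuples constant on every pair of p1 and p2; this is the trace in (23)."""
    return sum(1 for idx in product(range(d), repeat=size)
               if all(idx[a] == idx[b] for a, b in p1 + p2))


if __name__ == "__main__":
    n, d = 2, 3
    all_pairings = list(pairings(list(range(2 * n))))
    for p1 in all_pairings:
        for p2 in all_pairings:
            print(p1, p2, trace_by_counting(p1, p2, 2 * n, d), d ** (n - distance(p1, p2, 2 * n)))
```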
Lemma 3.6. The sum of dimensions of representations of $S_{2n}$ of the shape $2y_1 \ge 2y_2 \ge \cdots$, where $y_1 + y_2 + \cdots = n$, equals the cardinality of $P_{2n}$.

Proof. The Robinson–Schensted–Knuth algorithm provides a bijection between permutations and pairs $(P, Q)$ of standard Young tableaux of the same shape. Furthermore, if $\sigma \mapsto (P, Q)$ then $\sigma^{-1} \mapsto (Q, P)$; it follows that the RSK algorithm is a bijection between involutions $\sigma = \sigma^{-1}$ and standard Young tableaux. In particular, for any involution without fixed points, the pair of tableaux $(P, Q)$ given by the RSK algorithm satisfies $P = Q$. Furthermore, implementing the reverse of the RSK algorithm (see [Ful97]) shows that the tableaux must have the shape prescribed in the lemma, and that any such tableau gives rise to an involution without fixed points.

Proposition 3.7. The space $\mathbb{C}[P_{2n}]$ splits under the action of $S_{2n}$ as a direct sum of representations associated to Young diagrams of the shape $2y_1 \ge 2y_2 \ge \dots$, where $y_1 + \dots + y_q = n$, hence the action is multiplicity-free.

Proof. Following Fulton [Ful97], let us consider a diagram of shape $2y_1 \ge 2y_2 \ge \dots$ and consider its row numbering Young tableau. Let $C$ be the column invariant subgroup of $S_{2n}$ and $L$ the line invariant subgroup; both groups are isomorphic to a product of symmetric groups. We consider the projection operator $p_C$ associated to the trivial representation of $C$ and the projection operator $p_L$ associated to the alternating representation of $L$. One can see geometrically that these two operators commute and that the pairing $(1, 2)(3, 4) \cdots (2n-1, 2n)$ is not in the kernel of $p_C \circ p_L$. The dimension argument of Lemma 3.6 concludes the proof and shows the uniqueness of the occurrence of any representation of the shape $2y_1 \ge 2y_2 \ge \dots$.
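The dimension count of Lemma 3.6 can be checked directly for small $n$. The sketch below sums the hook length dimensions over all shapes $2y_1 \ge 2y_2 \ge \cdots$ with $y_1 + y_2 + \cdots = n$ and compares the result with the number of pairings of $2n$ points, $(2n-1)!! = 1 \cdot 3 \cdots (2n-1)$. It assumes the standard hook length formula; the helper names are illustrative.

```python
from math import factorial


def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dimension(lam):
    """Dimension of the S_m irrep of shape lam (m = sum of lam), by the hook length formula."""
    conj = [sum(1 for part in lam if part > col) for col in range(lam[0])]
    hooks = 1
    for row, part in enumerate(lam):
        for col in range(part):
            hooks *= (part - col) + (conj[col] - row) - 1
    return factorial(sum(lam)) // hooks


def sum_over_even_shapes(n):
    """Sum of dimensions over the shapes (2 y_1, 2 y_2, ...) with y a partition of n."""
    return sum(dimension(tuple(2 * y for y in mu)) for mu in partitions(n))


def number_of_pairings(n):
    """(2n - 1)!!, the cardinality of P_{2n}."""
    count = 1
    for odd in range(1, 2 * n, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, sum_over_even_shapes(n), number_of_pairings(n))
```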
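Proposition 3.8. The eigenspaces of $\tilde{\Phi}$ are indexed by Young diagrams $\lambda$ with the shape $2l_1 \ge 2l_2 \ge \dots$. The corresponding eigenvalue is given by
\[ z_\lambda = \frac{\sum_{\pi \in O_n \backslash S_{2n} / O_n} d^{\,n - |\pi|} \sum_{\sigma \in \pi} \chi^\lambda(\sigma)}{\sum_{\sigma \in O_n} \chi^\lambda(\sigma)}, \tag{25} \]
and the corresponding eigenspace is equal to the image of $\rho_{S_{2n}}(p_\lambda)$.

Proof. $\tilde{\Phi}$ is a morphism of $S_{2n}$-spaces by Proposition 3.3, hence Proposition 3.7 gives the classification of the eigenspaces of $\tilde{\Phi}$. Let $\lambda$ be as in Proposition 3.7; then the element $\rho_{S_{2n}}(p_\lambda)(\mathrm{Id})$ is non-zero and belongs to an irreducible submodule of $\mathbb{C}[P_{2n}]$, thus it satisfies
\[ \tilde{\Phi}\, \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}) = z_\lambda\, \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}). \]
We have therefore by bilinearity
\[ \langle \tilde{\Phi}\, \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}), \mathrm{Id} \rangle = z_\lambda\, \langle \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}), \mathrm{Id} \rangle = z_\lambda \sum_{\sigma \in O_n} p_\lambda(\sigma). \tag{26} \]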
Lemma 3.5 can be used to evaluate the left-hand side of (26). Since the left-hand side of (26) is non-zero for sufficiently big $d$, the right-hand side is also non-zero and the division makes sense.

Theorem 3.9. The Weingarten function is given by
\[ \mathrm{Wg} = \sum_\lambda \frac{1}{z_\lambda}\, \rho_{S_{2n}}(p_\lambda), \tag{27} \]
where the sum runs over diagrams $\lambda$ with a shape prescribed in Proposition 3.7 and $z_\lambda$ was defined in Eq. (25). In particular,
\[ \langle p_1, \mathrm{Wg}\, p_2 \rangle = \sum_\lambda \frac{1}{z_\lambda}\, \frac{\chi_{\mathrm{reg}}\bigl(\rho_{S_{2n}}(p_\lambda)(p_1) \cdot \rho_{S_{2n}}(p_\lambda)(p_2)\bigr)}{(2n)!}, \tag{28} \]
where $\rho_{S_{2n}}(p_\lambda)(p_i)$ are considered as elements of $\mathbb{C}[S_{2n}]$ and $\cdot$ is the multiplication in $\mathbb{C}[S_{2n}]$.

Proof. The first point follows from the above discussion and for the second it is enough to observe that $\langle p_1, p_2 \rangle = \frac{1}{(2n)!}\, \chi_{\mathrm{reg}}(p_1 p_2^t)$.

Observe that Eq. (28) is a closed formula for $\langle p_1, \mathrm{Wg}\, p_2 \rangle$ as a (rational) function of the dimension $d$, expressed in terms of the characters of the symmetric group.

A priori, Corollary 3.4 is valid only for $d \ge n$, since in this case $\rho_B$ is injective and therefore $\tilde{\Phi}$ is invertible; otherwise the Weingarten function does not exist. The following result deals also with the cases $d < n$.

Proposition 3.10. Corollary 3.4 remains true for all values of $d$ and $n$ if the following definition of the Weingarten function is used:
\[ \mathrm{Wg} = \sum_\lambda \frac{1}{z_\lambda}\, \rho_{S_{2n}}(p_\lambda), \tag{29} \]
where the sum is taken over all diagrams $\lambda$ with a shape prescribed in Proposition 3.7 for which $z_\lambda \neq 0$.
Proof. Since $E$ is an orthogonal projection, it is enough to check the validity of (18) on the range of $\rho_B$. We denote by $V \subseteq \mathbb{C}[P_{2n}]$ the span of the images of $\rho_{S_{2n}}(p_\lambda)$ for which $z_\lambda \neq 0$; the range of $\rho_B$ is equal to $\rho_B(V)$, hence it is enough to show that
\[ E \circ \rho_B = \rho_B \circ \mathrm{Wg} \circ \tilde{\Phi} \]
holds true on $V$. The latter equality is obvious since the inverse of $\tilde{\Phi} : V \to V$ is equal to Wg given by (29).

We can treat $\tilde{\Phi}_d : \mathbb{C}[P_{2n}] \to \mathbb{C}[P_{2n}]$ as a matrix the entries of which are polynomials in $d$, and therefore its inverse $\mathrm{Wg}_d : \mathbb{C}[P_{2n}] \to \mathbb{C}[P_{2n}]$ makes sense as a matrix the entries of which are rational functions of $d \in \mathbb{C}$; therefore $\mathrm{Wg}_d$ is well-defined for all $d \in \mathbb{C}$ except for a finite set; it is explicitly given by (27). For fixed $A, B \in M_{d_0}(\mathbb{C})^{\otimes n}$ let us plug this (incorrect for $d_0 < n$) value of $\mathrm{Wg}_d$ into (19); the right-hand side becomes a rational function of $d$ and after the cancelation of poles it has a limit as $d \to d_0$ which is indeed equal to the left-hand side of (19). In other words, we claim that
\[ E = \lim_{d \to d_0} \rho_B \circ \mathrm{Wg}_d \circ \Phi. \tag{30} \]
It is indeed the case since for every $d \in \mathbb{C}$ the value of $\mathrm{Wg}_d(\Phi(A))$ is the same no matter if we use (27) or (29). We summarize the above discussion in the following proposition.

Proposition 3.11. Corollary 3.4 remains true for all values of $d$ and $n$ if the Weingarten function is regarded as a rational function computed in (27), possibly after some cancelation of poles.

3.3. Asymptotics of Weingarten function. For pairings $p_1, p_2 \in P_{2n}$ let $2n_1, 2n_2, \dots$ denote the numbers of elements in the orbits of the action of $\{p_1, p_2\}$. We define the Möbius function
\[ \mathrm{Moeb}(p_1, p_2) = \prod_i (-1)^{n_i - 1}\, c_{n_i - 1}, \]
where $c_n$ is the Catalan number defined in (14).

Lemma 3.12. For every $p \in P_{2n}$ and $|d|$ sufficiently large we have
\[ \mathrm{Wg}(p) = d^{-n} \sum_{k \ge 0}\ \sum_{\substack{p_0, p_1, \dots, p_k \\ p_0 = p,\ p_i \neq p_{i+1} \text{ for } i \in \{0, 1, \dots, k-1\}}} (-1)^k\, d^{-l(p_0, p_1) - \cdots - l(p_{k-1}, p_k)}\, p_k. \]

Proof. It is enough to observe that
\[ d^{-n}\, \tilde{\Phi}(p) = p + \sum_{p' \neq p} d^{-l(p, p')}\, p' \]
and to use the power series expansion $\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots$ for the operator $d^{-n}\, \tilde{\Phi}$.
Theorem 3.13. The leading term of the Weingarten function is given by
\[ \langle p, \mathrm{Wg}\, p' \rangle = d^{-n - l(p, p')}\, \mathrm{Moeb}(p, p') + O\bigl(d^{-n - l(p, p') - 1}\bigr). \tag{31} \]
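For $n = 2$ the statement (31) can be checked by hand, and the sketch below does so with exact rational arithmetic. It relies on an assumption that is not stated in this form in the text: for $n = 2$ the Gram matrix of the three pairings has diagonal entries $d^2$ and off-diagonal entries $d$ (by Lemma 3.5), and inverting this $3 \times 3$ matrix gives $\langle p, \mathrm{Wg}\, p \rangle = \frac{d+1}{d(d-1)(d+2)}$ and $\langle p, \mathrm{Wg}\, p' \rangle = \frac{-1}{d(d-1)(d+2)}$ for $p \neq p'$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def catalan(m):
    """Catalan number c_m = (2m)! / (m! (m+1)!), as in (14)."""
    return comb(2 * m, m) // (m + 1)


def moebius(orbit_sizes):
    """Moeb(p1, p2) from the orbit sizes 2 n_1, 2 n_2, ... of the action of {p1, p2}."""
    value = 1
    for size in orbit_sizes:
        n_i = size // 2
        value *= (-1) ** (n_i - 1) * catalan(n_i - 1)
    return value


def wg_n2(same_pairing, d):
    """Orthogonal Weingarten function for n = 2 (inverse of the 3 x 3 Gram matrix, d >= 2)."""
    d = Fraction(d)
    denominator = d * (d - 1) * (d + 2)
    return (d + 1) / denominator if same_pairing else -1 / denominator


def rescaled_error(same_pairing, d):
    """d * | d^{n + l} <p, Wg p'> - Moeb(p, p') |, which (31) predicts to stay bounded."""
    n = 2
    distance = 0 if same_pairing else 1          # l(p, p') for n = 2
    orbits = [2, 2] if same_pairing else [4]     # orbit sizes of the action of {p, p'}
    scaled = Fraction(d) ** (n + distance) * wg_n2(same_pairing, d)
    return d * abs(scaled - moebius(orbits))


if __name__ == "__main__":
    for d in (10, 100, 1000):
        print(d, float(rescaled_error(True, d)), float(rescaled_error(False, d)))
```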
Proof. Lemma 3.12 implies that we need to find explicitly all tuples of pairings $p_0, \dots, p_k$ such that $p_0 = p$, $p_k = p'$ which fulfill $p_i \neq p_{i+1}$ for $i \in \{0, \dots, k-1\}$ and $l(p_0, p_1) + \cdots + l(p_{k-1}, p_k) = l(p_0, p_k)$. For every such tuple the triangle inequality implies that $l(p_0, p_i) + l(p_i, p_k) = l(p_0, p_k)$, or equivalently, $|p_0 p_i| + |p_i p_k| = |p_0 p_k| = |(p_0 p_i)(p_i p_k)|$. The latter condition implies that every orbit of $p_0 p_i \in S_{2n}$ must be a subset of one of the orbits of $p_0 p_k$ [Bia97, Bia98]. Therefore the pairing $p_i$ cannot connect vertices which belong to different connected components of the graph spanned by $p_0$ and $p_k$. It follows that it is enough to consider the case in which the graph spanned by $p_0$ and $p_k$ is connected.

Suppose that the graph spanned by $p_0$ and $p_k$ is connected. It follows that the permutation $p_0 p_k$ consists of two $n$-cycles; we denote one of them by $\pi$. Since every orbit of $p_0 p_i$ is a subset of one of the orbits of $p_0 p_k$, it makes sense to consider the restriction $\rho_i$ of $p_0 p_i$ to the support of $\pi$. Observe that knowing $\rho_i$ we can reconstruct the pairing $p_i$ by the formula
\[ p_i(s) = \begin{cases} p_0\, \rho_i(s) & \text{if } \rho_i(s) \text{ is defined}, \\ \rho_i^{-1}\, p_0(s) & \text{otherwise}. \end{cases} \]
It follows that the solutions of the equation $l(p_0, p_i) + l(p_i, p_k) = l(p_0, p_k)$ can be identified with the solutions of the equation $|\rho| + |\rho^{-1} \pi| = |\pi|$.

Now one can easily see that the tuples of pairings $p_1, \dots, p_{k-1}$ which fulfill $p_i \neq p_{i+1}$ for $i \in \{0, \dots, k-1\}$ and $l(p_0, p_1) + \cdots + l(p_{k-1}, p_k) = l(p_0, p_k)$ are in one-to-one correspondence with tuples of permutations $\rho_1, \dots, \rho_{k-1}$ such that $\rho_i \neq \rho_{i+1}$ and $|\rho_0^{-1} \rho_1| + \cdots + |\rho_{k-1}^{-1} \rho_k| = |\rho_0^{-1} \rho_k|$, where $\rho_0$ is the identity permutation and $\rho_k = \pi$. The results of Biane [Bia97] finish the proof.

3.4. Cumulants. Recall that in the work of the first named author [Col03] the asymptotics of the cumulants of the unitary Weingarten functions have been obtained (Theorem 2.15).
The purpose of this section is to establish the counterpart of this result for the orthogonal Wg functions. As we see by Proposition 3.3, the function Wg can be labeled by $\mathrm{Wg}(\lambda, d)$, where $\lambda \vdash n$ is a partition of the number $n$. It will be more convenient to define in the obvious way $\mathrm{Wg}(\pi, d)$, where $\pi$ is a partition of the interval $[1, n]$. The set of partitions of an interval is endowed with the order of refinement, denoted by $\le$. The set of partitions is known to be a lattice, in which there is a smallest element (the partition $0_n$ with only one-element blocks) and a largest element (the partition $1_n$ with only one block). In addition, the notion of sup and inf makes sense. For partitions $\Pi, \Pi'$ of $[1, n]$ such that $\pi \le \Pi \le \Pi'$, it is of fundamental importance to have a good understanding of the relative cumulants $C_{\pi, \Pi, \Pi'}$ of Wg defined implicitly by the relation
\[ \mathrm{Wg}_{\Pi'}(\pi, d) = \sum_{\Pi \le \Pi'' \le \Pi'} C_{\pi, \Pi, \Pi''} \]
whenever $\Pi' \ge \Pi$, with
\[ \mathrm{Wg}_\Pi(\pi) = \prod_k \mathrm{Wg}(\pi|_{V_k}) \]
if one denotes $\Pi = \{V_1, \dots, V_k\}$.
Remark. $C_{\pi, \Pi, \Pi'}$ is multiplicative, therefore it is enough to know $C_{\pi, \Pi, 1_n}$ to know all $C_{\pi, \Pi, \Pi'}$. Actually, it is shown in [Col03] that $C_{\pi, \Pi, \Pi'}$ can be written as a sum of $C_{\pi, \pi, 1_n}$'s.

Lemma 3.14. The relative cumulant is given, for $d$ large enough, by
\[ C_{\pi, \Pi, \Pi'} = d^{-n} \sum_{k \ge 0}\ \sum_{\substack{p = p_0, p_1, \dots, p_k \\ p_i \neq p_{i+1} \text{ for } i \in \{0, 1, \dots, k-1\} \\ \sup(\Pi, \pi, \pi_1, \dots, \pi_k) = \Pi'}} (-1)^k\, d^{-l(p_0, p_1) - \cdots - l(p_{k-1}, p_k)}. \]
The leading order of the series of $C_{\pi, \Pi, \Pi'}$ is therefore the number of $k$-tuples $(\pi_1, \dots, \pi_k)$ of elements of $P_{2n}$ such that $l(\pi, \pi_1) + l(\pi_1, \pi_2) + \dots + l(\pi_k, \mathrm{Id}) = n + l(\pi, \mathrm{Id}) - 2(\#\mathrm{blocks}(\Pi') - \#\mathrm{blocks}(\Pi))$ together with the requirement that $\sup(\Pi, \pi, \pi_1, \dots, \pi_k) = \Pi'$.

Proof. For the first point, it is enough to check that this equation satisfies the moment-cumulant equation. The asymptotics of the leading order is elementary. For a less direct approach, see also [Col03].

In order to compute the leading order, it is enough to compute the number of $k$-tuples $(\pi_1, \dots, \pi_k)$ of elements of $P_{2n}$ such that $l(\pi, \pi_1) + l(\pi_1, \pi_2) + \dots + l(\pi_k, \mathrm{Id}) = n + l(\pi, \mathrm{Id}) - 2(\#\mathrm{blocks}(\Pi') - \#\mathrm{blocks}(\Pi))$ together with the requirement that $\sup(\pi, \pi_1, \dots, \pi_k) = 1_n$. We call this number $B[\pi, k]$. Denote by $\tau_1, \dots, \tau_n$ the disjoint transpositions generating the pairing $\mathrm{Id} \in P_{2n}$, and let $G$ be the subgroup of $S_{2n}$ generated by these transpositions. This group has the structure of $(\mathbb{Z}/2\mathbb{Z})^n$.

The symmetric group $S_n$ can be regarded as a subset of $P_{2n}$ when we identify a permutation $\sigma$ with the pairing which connects the upper vertex $U_i$ with the bottom vertex $B_{\sigma(i)}$ for all values of $1 \le i \le n$. We say that pairings which can be obtained by this construction are permutation-like. The group $G$ acts on $P_{2n}$ by conjugations and one checks easily that in any orbit under the action of $G$ there exists at least one permutation-like element. Moreover, two permutation-like elements in the same orbit are conjugate to each other when regarded as elements of $S_n$. More precisely, each orbit has $2^l$ elements, where $l$ is the number of cycles with at least 3 elements in $S_n$.

Fix $\pi \in P_{2n}$ and call $k$ the number of its connected components (i.e. the number of cycles, including trivial cycles (two-element orbits) and transpositions (four-element orbits), of an associated permutation-like element). Let $\sigma \in S_n$ be one image of $\pi$. Consider the number of $k$-tuples $(\sigma_1, \dots, \sigma_k)$ of permutations of $S_n$ such that $\sigma_1 \cdots \sigma_k \sigma = e$, the group generated by $\sigma_1, \dots, \sigma_k$ acts transitively on $[1, n]$ and $|\sigma| + |\sigma_1| + \dots + |\sigma_k| = 2n - 2$. This number has already been computed in [BMS00] and is equal to
\[ \tilde{A}[\sigma, k] = k\, \frac{(nk - n - 1)!}{(nk - 2n + |\sigma| + 2)!} \prod_{i \ge 1} \left[ i \binom{ki - 1}{i} \right]^{d_i}, \]
where $d_i$ denotes the number of cycles with $i$ elements of $\sigma$.

Proposition 3.15. $B[\pi, k] = 2^{k-1}\, \tilde{A}[\sigma, k]$.
Proof. The group $G$ acts by conjugation on the $k$-tuples $(\pi_1, \dots, \pi_k)$ arising in the counting of $B[\pi, k]$. There is an element of $G$ which turns $\pi$ into a permutation-like element. In other words, one can assume that $\pi$ is permutation-like. Introduce the group $G'$ generated by $\tau_{i_{1,1}} \cdots \tau_{i_{1,l_1}}, \dots, \tau_{i_{k,1}} \cdots \tau_{i_{k,l_k}}$, where $\tau_{i_{j,1}}, \dots, \tau_{i_{j,l_j}}$ correspond to the elements of the $j$th cycle of $\sigma$. This group has the structure of $(\mathbb{Z}/2\mathbb{Z})^k$ and acts by restriction of $G$ on the $k$-tuples $(\pi_1, \dots, \pi_k)$. One checks that for any $k$-tuple $(\pi_1, \dots, \pi_k)$ satisfying the length conditions, there exist exactly elements of $G$ such that their action turns all $k$-tuples into permutation-like elements.

Theorem 3.16. $C_{\pi, \Pi, \Pi'}$ is a rational fraction of order $d^{-n - l(\pi, \mathrm{Id}) + 2(\#\mathrm{blocks}(\Pi') - \#\mathrm{blocks}(\Pi))}$ and its leading term is given by $\gamma_{\pi, \Pi, \Pi'}$. Assume that $\pi$ has $d_i$ cycles of length $i - 1$. Then
\[ \gamma_{\pi, \pi, 1_n} = (-1)^{|\pi|}\, \frac{2^{2q - 2|\pi| - 1}\, (3q - 3 - |\pi|)!}{(2q)!} \prod_{i=1}^{q} \left[ \frac{(2i - 1)!}{(i - 1)!^2} \right]^{d_i}. \tag{32} \]

Proof. The proof is exactly the same as that of Theorem 2.15 in [Col03]. It is enough, in Eq. (2.56), to replace $\tilde{A}[\sigma, k]$ by $B[\pi, k]$.

According to the remark above Lemma 3.14, it is possible to write $\gamma_{\pi, \Pi, \Pi'}$ as a sum of elements of type $\gamma_{\pi, \pi, 1_n}$. This is a straightforward adaptation of Theorem 2.15, item (iii) of [Col03]. Therefore the above theorem actually gives us a full understanding of the leading order of $C_{\pi, \Pi, \Pi'}$.

4. Integration Over Symplectic Groups

Let $e_1, \dots, e_d, f_1, \dots, f_d$ be an orthonormal basis of $\mathbb{C}^{2d}$. We refer to this basis as the canonical basis. Consider the bilinear antisymmetric form $\langle \cdot, \cdot \rangle$ such that
ei , fj = δi,j ,
ei , ej = fi , fj = 0.
(33)
The symplectic group Sp(d) is the set of unitary matrices of $M_{2d}(\mathbb{C})$ preserving $\langle \cdot, \cdot \rangle$. Also by $\langle \cdot, \cdot \rangle$ we denote the bilinear form on $(\mathbb{C}^{2d})^{\otimes n}$ given by the canonical tensor product of forms $\langle \cdot, \cdot \rangle$ on $\mathbb{C}^{2d}$. This form is symmetric if n is even and antisymmetric if n is odd. The Brauer algebra B(−d, n) admits a natural action on the space $(\mathbb{C}^{2d})^{\otimes n}$ given in the same way as in Sect. 3.1.2 with the difference that $\langle \cdot, \cdot \rangle$ should be understood as in Eq. (33). Most of the results from Sect. 3 remain true also for the symplectic case. Below we present briefly which changes are necessary.

Theorem 4.1 (Schur–Weyl duality for symplectic groups [Bra37, Wen88, BW89]). The commutant of $\rho_{Sp(d)}(Sp(d))$ is equal to $\rho_B(\mathbb{C}[P_{2n}])$. Furthermore if $d \ge n$ then $\rho_B$ is injective.

For $A \in \mathrm{End}(\mathbb{C}^{2d})^{\otimes n}$ we set
\[ E(A) = \int_{Sp(d)} O^{\otimes n} A \, (O^t)^{\otimes n} \, dO, \]
and define Φ(A) as in (17). All results of Sect. 3 remain true with the only difference that the value of d in all formulas should be replaced by (−d). As for the cumulants, $\gamma_{\pi,\pi,1_n}$ should be replaced by $(-1)^{k+1} \gamma_{\pi,\pi,1_n}$, where k is the number of blocks of π.
5. Expectation of Product of Random Matrices and Free Probability

This section is rather sketchy since it follows very closely the work of the first-named author [Col03].

5.1. Asymptotic freeness for orthogonal matrices. Let n be an integer. We consider the following enumeration of 8n integers: $1, \ldots, 4n, \bar{1}, \ldots, \overline{4n}$. Consider T the subset of $P_{8n}$ such that any pairing links each $i$ with some $\bar{j}$. There is a natural bijection between this set and $S_{4n}$. Call Ξ the element of $P_{8n}$ linking $2i - 1$ to $2i$ and $\overline{2j - 1}$ to $\overline{2j}$, and S the subset of $P_{8n}$ such that elements link $2i - 1$ to $2i$ and an odd (resp. even) $\bar{j}$ to an odd (resp. even) $\bar{k}$. Let $A^{(1)}, \ldots, A^{(2n)}$ be (constant) matrices in $M_d(\mathbb{C})$. For $\tau \in P_{8n}$, and a random matrix B, define
\[ \mathrm{tr}(A^{(1)}, \ldots, A^{(2n)}; B, \tau) = d^{-\mathrm{loops}(\Xi,\tau)} \, E \sum_{k_1, \ldots, k_{4n}, k_{\bar{1}}, \ldots, k_{\overline{4n}}} \; \prod_{i=1}^{2n} B_{k_{2i-1}, k_{2i}} \, A^{(i)}_{k_{\overline{2i-1}}, k_{\overline{2i}}} \, \delta_{\tau,k}, \tag{34} \]
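where $\delta_{\tau,k} = 1$ if $k_i = k_j$ for every pair $(i, j)$ of τ, and 0 otherwise. This expression is obviously a product of normalized traces of $\{B, B^t\}$ alternating with $\{A^{(i)}, A^{(i)t}\}$. Let $\tau \in T$ and $\sigma \in S$. Define
\[ \mathrm{tr}(A^{(1)}, \ldots, A^{(2n)}; \tau, \sigma) = d^{-\mathrm{loops}(\sigma,\tau)} \, E \sum_{k_1, \ldots, k_{4n}, k_{\bar{1}}, \ldots, k_{\overline{4n}}} \; \prod_{i=1}^{2n} A^{(i)}_{k_{\overline{2i-1}}, k_{\overline{2i}}} \, \delta_{\tau,k} \, \delta_{\sigma|_{\{1, \ldots, 4n\}}, k}. \tag{35} \]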
Observe that σ is restricted to the set $\{1, \ldots, 4n\}$. It makes sense to do so because it belongs to S and elements of S do not link $i$'s with $\bar{j}$'s. As in Eq. (34) this expression is obviously a product of normalized traces of $\{A^{(i)}, A^{(i)t}\}$. Let O be a random orthogonal Haar distributed matrix in $M_d(\mathbb{C})$. One establishes easily
\[ E \, \mathrm{tr}(A^{(1)}, \ldots, A^{(2n)}; O, \tau) = \sum_{\sigma \in S} \mathrm{tr}(A^{(1)}, \ldots, A^{(2n)}; \tau, \sigma) \, \widetilde{\mathrm{Wg}}(\sigma, \Xi) \, d^{\,l(\Xi,\tau) - l(\Xi,\sigma) - l(\sigma,\tau)}, \tag{36} \]
where $\widetilde{\mathrm{Wg}}$ is the asymptotic normalized Wg function restricted on the set $\{1, \ldots, 4n\}$. From this we obtain
Lemma 5.1. In Eq. (36), assuming that $\{A^{(i)}, A^{(i)*}\}$ admits a joint limit distribution with respect to the normalized trace tr on $M_d(\mathbb{C})$, any term on the right hand side has asymptotic order ≤ 0. In case $l(\Xi, \tau) - l(\Xi, \sigma) - l(\sigma, \tau) = 0$, at least two factors of $\mathrm{tr}(A^{(1)}, \ldots, A^{(2n)}; \tau, \sigma)$ have to be of the kind $\mathrm{tr}(A^{(i)})$. In addition, at least two of such indices i are such that neither the pattern "$\ldots O A^{(i)} O^* \ldots$" nor "$\ldots O^* A^{(i)} O \ldots$" occurs in the cycle decomposition.

Proof. The first point is an obvious consequence of the triangle inequality. In the case $l(\Xi, \tau) - l(\Xi, \sigma) - l(\sigma, \tau) = 0$, observe that since $l(\Xi, \sigma) \ge n$, one has to have $l(\sigma, \tau) \le 3n - 1$. The remaining assertions are an easy adaptation of [Col03], Proposition 3.3 (note that according to the definition of T and S, $l(\sigma, \tau) \ge 2n$; the proof follows by an easy graphical interpretation and the description of geodesics given in the proof of [Col03], Theorem 3.13).

From this we deduce:

Theorem 5.2. Let $O_1, O_2, \ldots$ be independent copies of orthogonal ensembles, and W be a set of matrices such that the set $(W, W^t)$ admits a limit distribution. Then $W, \{O_1, O_1^t\}, \{O_2, O_2^t\}, \ldots$ are asymptotically free. This convergence holds almost surely.

Proof. Asymptotic freeness is an immediate application of the definition of freeness together with the previous lemma and the asymptotic multiplicativity of the Wg function established in Theorem 3.13. The proof of almost sure convergence is a consequence of the computation of cumulants of the Wg function in Theorem 3.16 together with an application of the Chebyshev inequality and the Borel-Cantelli lemma (see [Col03], Theorem 3.7 for details).

Remark. We would like to draw the attention of the reader to the existence of important differences between the unitary case and other cases. For example, in the unitary case, the matrix family (2d $E^{(d)}_{i,i+1}$, $\{O, O^*\}$) ∈ $M_d(\mathbb{C})$ admits an asymptotic joint law whereas this is not true in the orthogonal case.
The existence of an asymptotic joint law does not fail if one assumes that matrices are bounded. It also holds if one modifies the joint law assumption by enlarging the family W to $W, W^t$ as we do in the previous theorem. It is also possible to write down a necessary and sufficient relation from Eq. (36) but to our knowledge, there is no mathematical need for this at this point.

5.2. Orthogonal matrix integral. In this section we deal with orthogonal matrix integrals (that is, matrix integrals where the integral is taken with respect to the Haar measure on the orthogonal group), and in particular with the orthogonal Itzykson-Zuber integral. For unitary matrix integrals many tools are available and this paper together with [Col03] just provide a complementary mathematical approach. However, interestingly enough, it seems that up to now, except character expansion, there were not so many systematic tools for the study of non-unitary (i.e. orthogonal, symplectic) matrix integrals. One bright side of our approach is to provide such a tool and therefore new formulae to theoretical physics.

Theorem 5.3. Let W be a family of matrices such that the family $W, W^t$ admits a limit joint distribution. Let $O_1, \ldots, O_k$ be independent Haar distributed unitary (resp. orthogonal or symplectic) matrices. Let $(P_{i,j})_{1 \le i,j \le k}$ and $(Q_{i,j})_{1 \le i,j \le k}$ be two families of noncommutative polynomials in $O_1, O_1^*, \ldots, O_k, O_k^*$ and W. Let $A_d$ be the random variable $\sum_{i=1}^{k} \prod_{j=1}^{k} \mathrm{tr}\, P_{i,j}(O, O^*, W)$ and $B_d$ the variable $\sum_{i=1}^{k} \prod_{j=1}^{k} \mathrm{tr}\, Q_{i,j}(O, O^*, W)$, where $\mathrm{tr}\, x = \frac{1}{d} \mathrm{Tr}\, x$ for $x \in M_d(\mathbb{C})$ denotes the normalized trace.
• (i) For each d, the analytic function
\[ z \mapsto d^{-2} \log E \exp(z d^2 A_d) = \sum_{n \ge 1} a_{d,n} z^n \]
is such that for all n the limit $\lim_{d \to \infty} a_{d,n}$ exists and is finite. It depends only on the limit distribution of W and on the polynomials $P_{i,j}$.
• (ii) For each d, the analytic function
\[ z \mapsto \frac{E \exp(z B_d + z d^2 A_d)}{E \exp(z d^2 A_d)} = 1 + \sum_{n \ge 1} b_{d,n} z^n \]
is such that for all n the limit $\lim_{d \to \infty} b_{d,n}$ exists and is finite. It depends only on the limit distribution of W and on the polynomials $P_{i,j}$ and $Q_{i,j}$.

Proof. This is a straightforward application of Theorem 3.16. See Theorem 4.1 of [Col03] for details.

As a further illustration of our results on the asymptotics of cumulants, we state the asymptotics of $d^{-2} C_n(d \, \mathrm{Tr}\, A_d O B_d O^*)$. This number is also known as the coefficient of the series of the orthogonal Itzykson-Zuber integral. Observe that if $A_d$, $B_d$ are real antisymmetric, the Harish-Chandra formula applies and yields a formula for the finite dimensional IZ integral provided that the eigenvalues of $A_d$ and $B_d$ have no multiplicity. Without these assumptions, there is no formula to our knowledge. However, interesting results have been obtained in [BH03] (see also references therein) about asymptotics of symplectic Harish-Chandra integrals and the two results would deserve to be compared. The asymptotic convergence of $d^{-2} C_n(d \, \mathrm{Tr}\, A_d O B_d O^*)$, provided that $A_d, A_d^t, B_d, B_d^t$ admit a joint limit distribution, is already granted by Theorem 5.3.

Let $G_n$ be the set of (not-necessarily connected) planar graphs (such that any connected component is drawn on a distinct sphere) with n edges together with the following conditions: (i) each face has an even number of edges, (ii) the edges are labeled from 1 to n, (iii) there is a bicoloring in white and black of the vertices such that each black vertex has only white neighbors and vice versa. To each such graph $g \in G_n$ we associate the permutations σ(g) (resp. τ(g)) of $S_n$ defined by turning clockwise (resp. counterclockwise) around the white (resp. black) vertices and the function
\[ \mathrm{Moeb}(g) = \gamma_{\tau\sigma^{-1},\, \Pi_\tau \vee \Pi_\sigma,\, q + |\tau\sigma^{-1}| + 2(C(\Pi_\tau \vee \Pi_\sigma) - 1)}. \]
For this definition to make sense in the orthogonal framework, we choose an embedding of $S_n$ into $B_{2n}$: we partition $[1, 2n]$ into two sets $V_1$ and $V_2$ of cardinality n, and to a permutation σ we associate the element of $B_{2n}$ pairing the i-th element of $V_1$ with the σ(i)-th element of $V_2$.
[Figure: a planar graph with white and black vertices whose edges are labeled 1 to 17.]
For example in the picture, σ = (1 13 2)(3 5 4)(6 7)(8 9 10)(11 12)(16 17)(14 15), τ = (5 6)(7 8)(10 11)(2 3 9)(12 13)(1 4)(14 17)(15 16), $\tau\sigma^{-1}$ = (1 3)(5 9 7)(6 8 11 13 4)(2 12 10)(17 15)(14 16). Two graphs are said to be equivalent if there is a positively oriented diffeomorphism of the plane transforming one into the other and respecting the coloring of the vertices and the labeling of the edges. We call ∼ this equivalence relation. For a permutation $\sigma \in S_n$, we call $X_\sigma$ the amount $\prod_{i=1}^{k} \mathrm{tr}\, X^{l_i}$ if σ splits into orbits containing $l_1, \ldots, l_k$ elements.

Theorem 5.4. If $X_d, Y_d, X_d^t, Y_d^t$ admit a joint limit distribution, one has
\[ \lim_{d} d^{-2} C_n(d^2 A) = \sum_{g \in G_q/\sim} X_{\tau(g)} \, Y_{\sigma(g)} \, \mathrm{Moeb}(g). \tag{37} \]
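The permutations listed for the picture can be checked mechanically. The sketch below assumes the convention $(\tau\sigma^{-1})(x) = \tau(\sigma^{-1}(x))$, with each cycle $(a\ b\ c)$ read as $a \mapsto b \mapsto c \mapsto a$; the helper names are illustrative. It also compares the counts of vertices, edges and faces with Euler's relation, under the further assumption that the cycles of σ and τ are the vertices and the cycles of $\tau\sigma^{-1}$ are the faces.

```python
def from_cycles(cycles, n):
    """Build a permutation of {1, ..., n} (as a dict) from a list of cycles."""
    perm = {x: x for x in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def canonical_cycles(perm):
    """Cycles of perm, each starting at its smallest element, in increasing order."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        cycles.append(tuple(cyc))
    return cycles

def count_components(perms, n):
    """Number of orbits of the group generated by perms on {1, ..., n}."""
    seen, components = set(), 0
    for start in range(1, n + 1):
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            for p in perms:
                if p[x] not in seen:
                    seen.add(p[x])
                    stack.append(p[x])
    return components

N_EDGES = 17
sigma = from_cycles([(1, 13, 2), (3, 5, 4), (6, 7), (8, 9, 10),
                     (11, 12), (16, 17), (14, 15)], N_EDGES)
tau = from_cycles([(5, 6), (7, 8), (10, 11), (2, 3, 9),
                   (12, 13), (1, 4), (14, 17), (15, 16)], N_EDGES)
claimed = from_cycles([(1, 3), (5, 9, 7), (6, 8, 11, 13, 4),
                       (2, 12, 10), (17, 15), (14, 16)], N_EDGES)

sigma_inv = {image: x for x, image in sigma.items()}
tau_sigma_inv = {x: tau[sigma_inv[x]] for x in range(1, N_EDGES + 1)}

vertices = len(canonical_cycles(sigma)) + len(canonical_cycles(tau))
faces = len(canonical_cycles(tau_sigma_inv))
components = count_components([sigma, tau], N_EDGES)
```

Under these assumptions the computed product agrees with the cycles stated in the text, and the counts satisfy $V - E + F = 2 \cdot \#\text{components}$, as expected for a graph whose components are each drawn on a sphere.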
We omit this proof, for it is almost the same as that of Theorem 4.3 of [Col03]. Observe that the asymptotic result only depends on traces of polynomials in $X_d$ and traces of polynomials in $Y_d$. Mixed patterns (involving traces of a non-commutative polynomial in the four variables $X_d, Y_d, X_d^t, Y_d^t$) do not occur in the limit. However we need a control on the joint moments. In other words, the same diagrams appear as in the unitary case. The only difference is that the orthogonal function Moeb is the unitary one times $2^{\#\text{connected components} - 1}$.

Theorem 5.5. Let $X_d$ be a rank one projection and assume that $(Y_d, Y_d^t)$ has a limit joint distribution whose first marginal is µ. Then
\[ \lim_{d} d^{-1} \cdot C_n(d \, \mathrm{Tr}(X_d O Y_d O^*)) = (n - 1)! \, k_n(\mu). \tag{38} \]
In other words, the coefficients of $z \mapsto d^{-1} \log E \, e^{z d \, \mathrm{Tr}(X_d O Y_d O^*)}$ converge pointwise to those of the primitive of the R-transform of µ.
The proof goes along the same lines as Theorem 4.7 of [Col03], therefore we omit it. Observe that this result is exactly the same as for the unitary case, except that we need an extra control on the joint moments of $X_d, Y_d, X_d^t, Y_d^t$.

5.3. Orthogonal replaced by symplectic. When orthogonal matrices are replaced by symplectic ones, the statements should be modified in the following way: if P is the unitary such that $P O^T P = O^*$, then $(X_d, X_d^t)$ (resp. $(Y_d, Y_d^t)$) should be replaced by $(X_{2d}, P X_{2d}^T P)$ (resp. $(Y_{2d}, P Y_{2d}^T P)$). Theorem 5.3 remains true, and Theorem 5.4 as well (one only needs to modify accordingly the definition of Moeb). In Theorem 5.5, µ should be replaced by −µ.

6. Examples of Wg function

We present below the values of the Weingarten function computed for the orthogonal group $O_d$. In order to obtain the appropriate results for the symplectic group $Sp_d$ one should replace in the formulas d by −d. These formulae have been obtained directly from the definition of Wg, without the help of formula (28). Observe that relative cumulants that can be obtained from these values yield asymptotics predicted by Theorem 3.16, Formula (32).
\begin{align*}
\mathrm{Wg}([1]) &= d^{-1}, \\
\mathrm{Wg}([1, 1]) &= \frac{d + 1}{d(d - 1)(d + 2)}, \\
\mathrm{Wg}([2]) &= \frac{-1}{d(d - 1)(d + 2)}, \\
\mathrm{Wg}([1, 1, 1]) &= \frac{d^2 + 3d - 2}{d(d - 1)(d - 2)(d + 2)(d + 4)}, \\
\mathrm{Wg}([2, 1]) &= \frac{-1}{d(d - 1)(d - 2)(d + 4)}, \\
\mathrm{Wg}([3]) &= \frac{2}{d(d - 1)(d - 2)(d + 2)(d + 4)}, \\
\mathrm{Wg}([4]) &= \frac{-5d - 6}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\mathrm{Wg}([3, 1]) &= \frac{2d + 8}{(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\mathrm{Wg}([2, 2]) &= \frac{d^2 + 5d + 18}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\mathrm{Wg}([2, 1, 1]) &= \frac{-d^3 - 6d^2 - 3d + 6}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\mathrm{Wg}([1, 1, 1, 1]) &= \frac{d^4 + 7d^3 + d^2 - 35d - 6}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}.
\end{align*}
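The tabulated values admit a quick consistency check. The sketch below rests on two assumptions not restated in this section: that the label $[\mu]$ denotes the coset-type of a pair partition relative to a fixed one, and that Wg is the inverse of the Gram matrix $d^{\mathrm{loops}(p,q)}$, whose row sums equal $d(d+2)\cdots(d+2n-2)$. Under these assumptions the class-size weighted sum of the Wg values of a given n should equal the reciprocal of that product. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def wg_orthogonal(d):
    """Tabulated orthogonal Weingarten values, evaluated exactly at an integer d >= 4."""
    d = Fraction(d)
    den2 = d * (d - 1) * (d + 2)
    den3 = d * (d - 1) * (d - 2) * (d + 2) * (d + 4)
    den4 = d * (d + 1) * (d + 2) * (d + 4) * (d + 6) * (d - 1) * (d - 2) * (d - 3)
    return {
        (1,): 1 / d,
        (1, 1): (d + 1) / den2,
        (2,): -1 / den2,
        (1, 1, 1): (d**2 + 3 * d - 2) / den3,
        (2, 1): -1 / (d * (d - 1) * (d - 2) * (d + 4)),
        (3,): 2 / den3,
        (4,): (-5 * d - 6) / den4,
        (3, 1): (2 * d + 8) * d / den4,  # the tabulated denominator has no factor d
        (2, 2): (d**2 + 5 * d + 18) / den4,
        (2, 1, 1): (-d**3 - 6 * d**2 - 3 * d + 6) / den4,
        (1, 1, 1, 1): (d**4 + 7 * d**3 + d**2 - 35 * d - 6) / den4,
    }

def class_size(mu):
    """Number of pair partitions of [1, 2n] with coset-type mu relative to a fixed one."""
    n = sum(mu)
    z = 1
    for part in set(mu):
        multiplicity = mu.count(part)
        z *= part**multiplicity * factorial(multiplicity)
    return (2**n * factorial(n)) // (2 ** len(mu) * z)

def weighted_row_sum(n, d):
    """Sum over all pair partitions q of Wg(p, q), for a fixed p."""
    table = wg_orthogonal(d)
    return sum(class_size(mu) * value for mu, value in table.items() if sum(mu) == n)

def expected_row_sum(n, d):
    """Reciprocal of d (d + 2) ... (d + 2n - 2)."""
    result = Fraction(1)
    for j in range(n):
        result /= d + 2 * j
    return result
```

Exact rational arithmetic is used so that agreement is an identity rather than a floating-point coincidence; the values d = 0, 1, 2, 3 must be avoided because of the poles in the table.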
Acknowledgements. B.C. was Allocataire Moniteur at the Ecole Normale Supérieure, Paris while a part of this work was done. He is currently a JSPS postdoctoral fellow. P.Ś. was supported by State Committee for Scientific Research (KBN) grant No. 2 P03A 007 23. Research of P.Ś. was performed during a visit in Ecole Normale Supérieure (Paris) and Institut des Hautes Etudes Scientifiques funded by European Post-Doctoral Institute for Mathematical Sciences.
References
[BB96] Brouwer, P.W., Beenakker, C.W.J.: Diagrammatic method of integration over the unitary group, with applications to quantum transport in mesoscopic systems. J. Math. Phys. 37(10), 4904–4934 (1996)
[BH03] Brézin, E., Hikami, S.: An extension of the Harish Chandra-Itzykson-Zuber integral. Commun. Math. Phys. 235(1), 125–137 (2003)
[Bia97] Biane, P.: Some properties of crossings and partitions. Discrete Math. 175(1–3), 41–53 (1997)
[Bia98] Biane, P.: Representations of symmetric groups and free probability. Adv. Math. 138(1), 126–181 (1998)
[BMS00] Bousquet-Mélou, M., Schaeffer, G.: Enumeration of planar constellations. Adv. in Appl. Math. 24(4), 337–368 (2000)
[Bra37] Brauer, R.: On algebras which are connected with the semisimple continuous groups. Ann. Math. 38, 857–872 (1937)
[BW89] Birman, J.S., Wenzl, H.: Braids, link polynomials and a new algebra. Trans. Amer. Math. Soc. 313(1), 249–273 (1989)
[Col03] Collins, B.: Moments and cumulants of polynomial random variables on unitary groups, the Itzykson-Zuber integral, and free probability. Int. Math. Res. Not. 17, 953–982 (2003)
[Ful97] Fulton, W.: Young tableaux. Volume 35 of London Mathematical Society Student Texts. Cambridge: Cambridge University Press, 1997
[Gor02] Gorin, T.: Integrals of monomials over the orthogonal group. J. Math. Phys. 43(6), 3342–3351 (2002)
[Gro99] Grood, C.: Brauer algebras and centralizer algebras for SO(2n,C). J. Algebra 222(2), 678–707 (1999)
[Wei78] Weingarten, D.: Asymptotic behavior of group integrals in the limit of infinite rank. J. Math. Phys. 19(5), 999–1001 (1978)
[Wen88] Wenzl, H.: On the structure of Brauer's centralizer algebras. Ann. of Math. (2) 128(1), 173–193 (1988)
[Wey39] Weyl, H.: The Classical Groups. Their Invariants and Representations. Princeton, NJ: Princeton University Press, 1939

Communicated by Y. Kawahigashi
Commun. Math. Phys. 264,797–810 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1555-2
The Expected Area of the Filled Planar Brownian Loop is π/5
Christophe Garban^{1,2}, José A. Trujillo Ferreras^{1}
1 Department of Mathematics, Cornell University, Ithaca, NY 14853-4201, USA. E-mail: [email protected]
2 Ecole Normale Superieure, 45 rue d'Ulm, 75230 Paris Cedex 05, France. E-mail: [email protected]
Received: 24 May 2005 / Accepted: 5 December 2005 Published online: 15 March 2006 – © Springer-Verlag 2006
Abstract: Let $B_t$, $0 \le t \le 1$, be a planar Brownian loop (a Brownian motion conditioned so that $B_0 = B_1$). We consider the compact hull obtained by filling in all the holes, i.e. the complement of the unique unbounded component of $\mathbb{C} \setminus B[0, 1]$. We show that the expected area of this hull is π/5. The proof uses, perhaps not surprisingly, the Schramm-Loewner Evolution (SLE). As a consequence of this result, using Yor's formula [17] for the law of the index of a Brownian loop, we find that the expected area of the region inside the loop having index zero is π/30; this value could not be obtained directly using Yor's index description.

1. Introduction

The main result of the present paper goes as follows: Let B denote a Brownian loop in $\mathbb{C}$ of time duration 1. There are various equivalent ways to define it. One can view it as a Brownian path $(B_t, 0 \le t \le 1)$ appropriately conditioned to be back at its starting point at time 1. One can also write $B_t = W_t - t W_1$, where W is just a standard Brownian motion in $\mathbb{C}$. Then, $\mathbb{C} \setminus B[0, 1]$, i.e. the complement of the path, has a unique infinite connected component H. The hull T generated by the Brownian loop is by definition equal to $\mathbb{C} \setminus H$. This is the set obtained by filling in the holes in the loop. Let A be the random variable whose value is the area of T. Then:

Theorem 1.1. The expected value of A is π/5.

Our result gives interesting information regarding the Brownian loop soups introduced in [4]. This conformally invariant object plays an important role in the understanding and description of SLE curves (see, e.g. [4, 14, 5]). It can be viewed as a Poissonian cloud (of intensity c) of filled Brownian loops in subdomains of the plane. Among other things, it is announced in [15] that the dimension of the set of points in the complement of the loop soup (i.e. the points that are in the inside of no loop) can be shown to be equal to 2 − c/5, using consequences of the restriction property. A detailed proof of this statement has never been published, and in fact, our result implies the corresponding first moment estimate (i.e. the mean number of balls of radius ε needed to cover the set). The other arguments needed to derive the result announced in [15] will be detailed in [9]. Another consequence of our result concerns the direct relation between different measures on self-avoiding loops in the plane defined as outer boundaries of planar Brownian loops. See [16].

Fig. 1. Random walk loop of 50000 steps and corresponding hull

In the abundant existing literature about planar Brownian motion, there are certainly results dealing with the question of area. Paul Lévy's stochastic area formula describing the algebraic area “swept” by a Brownian motion will likely come to the mind of many readers. Our result, however, is very different from this classical theorem, firstly because Lévy's area is a signed area, but mainly because of the following: in order to apprehend Lévy's area it is enough to follow the Brownian curve locally without paying attention to the rest of the curve. In our case, one needs to consider the curve globally. Also, Yor [17] has been able to give an explicit formula for the law of the index of a Brownian loop around a fixed point z. Yor's proof relies on the fact that the index can be obtained via a stochastic integral along the loop. Let us explain how this result is related to ours. A point with a non-zero index has to be inside the loop. Using this fact, it is almost possible to describe the probability that a given point is inside the loop, modulo the problem of the zero index; indeed, there are some regions inside the Brownian loop around which the loop has an index equal to zero. In the last section of our paper, we combine Theorem 1.1 with the law of the index given by Yor, to find that the expected area of the set of points inside the hull that have index zero is π/30. The expected areas of the regions of index $n \in \mathbb{Z} \setminus \{0\}$ can be directly obtained by integrating Yor's formula with respect to z.
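The hull T of the Introduction (the loop together with its filled holes) can be approximated numerically. The following is a minimal sketch and not the authors' simulation: it discretizes $B_t = W_t - tW_1$ with Gaussian increments, marks the grid cells visited by the polygonal loop, flood-fills the exterior, and counts every cell that the flood does not reach. The function names, the mesh parameter and the rasterization scheme are all assumptions made for illustration.

```python
import math
import random
from collections import deque

def brownian_loop(n_steps, rng):
    """Discrete approximation of B_t = W_t - t W_1 on [0, 1], as a list of (x, y) points."""
    scale = math.sqrt(1.0 / n_steps)
    walk = [(0.0, 0.0)]
    for _ in range(n_steps):
        x, y = walk[-1]
        walk.append((x + rng.gauss(0.0, scale), y + rng.gauss(0.0, scale)))
    end_x, end_y = walk[-1]
    return [(x - i / n_steps * end_x, y - i / n_steps * end_y)
            for i, (x, y) in enumerate(walk)]

def filled_area(points, mesh):
    """Grid estimate of the area of the hull of a closed polygonal curve."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs) - 2 * mesh, min(ys) - 2 * mesh
    nx = int((max(xs) - x0) / mesh) + 3
    ny = int((max(ys) - y0) / mesh) + 3
    blocked = [[False] * ny for _ in range(nx)]
    # Sample each segment at spacing below mesh / 2, so marked cells form an 8-connected path.
    for (ax, ay), (bx, by) in zip(points, points[1:]):
        steps = int(2 * max(abs(bx - ax), abs(by - ay)) / mesh) + 1
        for s in range(steps + 1):
            t = s / steps
            i = int((ax + t * (bx - ax) - x0) / mesh)
            j = int((ay + t * (by - ay) - y0) / mesh)
            blocked[i][j] = True
    # Flood-fill the exterior with 4-neighbour moves, starting from a corner cell.
    outside = [[False] * ny for _ in range(nx)]
    outside[0][0] = True
    queue = deque([(0, 0)])
    while queue:
        i, j = queue.popleft()
        for u, v in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= u < nx and 0 <= v < ny and not blocked[u][v] and not outside[u][v]:
                outside[u][v] = True
                queue.append((u, v))
    hull_cells = sum(1 for i in range(nx) for j in range(ny) if not outside[i][j])
    return hull_cells * mesh * mesh
```

Averaging `filled_area(brownian_loop(n, rng), mesh)` over many independent loops gives a Monte Carlo estimate that can be compared with π/5 ≈ 0.628; the estimate is biased by the time discretization and by the boundary cells, so agreement should only be expected as the number of steps grows and the mesh shrinks.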
In [2], using physics methods, Comtet, Desbois and Ouvry obtained the values of the expected areas for the non-zero index regions by different techniques, and they pointed out the different nature of the n = 0 sector (the points in the plane of zero index) and emphasized that “it would be interesting to distinguish in the n = 0 sector, curves which do not enclose the origin from curves which do enclose the origin but an equal number
of times clockwise and anticlockwise” but they argue that the 0-case cannot be treated within the scope of their analysis. From a probabilistic viewpoint, it also appears that usual techniques for Brownian motion are not strong enough to obtain the expected area of the Brownian hull or the expected area of the 0-index region inside the Brownian hull. Let us briefly explain why. Basically, the enclosed area depends only on the boundary of the hull generated by the Brownian loop. The frontier of the Brownian loop concerns only a small subset of the time duration [0, 1]. In some sense, on certain time-intervals, the enclosed area does not depend much on the behavior of the Brownian motion. So, this problem needs a good description of the frontier of a Brownian loop. Recently, Lawler, Schramm and Werner proved a conjecture of Mandelbrot that the Hausdorff dimension of the Brownian frontier is 4/3. For this purpose they used the value of intersection exponents computed with the help of SLE curves, see for instance [6] and references therein. The description of the Brownian frontier via SLE can be done in a slightly different way using the conformal-restriction point of view, see [5]. We will use this approach, and so will present to the reader the facts needed about conformal restriction measures in the next section. Our paper gives another striking example of a simple result concerning planar Brownian motion that seemed out of reach using the usual stochastic calculus approach, but that can be derived using conformal invariance and SLE. For a thorough account on SLE processes, see [3, 13]. Let us also finally mention another related result from the Physics literature. In [1], using methods of conformal field theory, Cardy has shown that the ratio of the expected area enclosed by a self-avoiding polygon of perimeter 2n to the expected squared radius of gyration for a polygon of perimeter 2n converges as n goes to infinity to 4π/5. 
We note that self-avoiding polygons are supposed to have the same asymptotic shape as filled Brownian loops (see, for example, [8] and references therein). However, studying this relationship is hard basically for the following reason. The boundary of the Brownian loop is of SLE$_{8/3}$-type, but, unfortunately, there does not exist a good way of “talking about the length” of SLE curves at this moment. A rigorous analysis of the squared radius of gyration seems currently out of reach. However, the universality of the ratio of the expected area to the expected squared radius of gyration for loops has been widely explored in the theoretical physics literature. Combining Cardy's result with our result about the expected area of the Brownian loop, and considering the above described universality, one could think that the expected squared radius of gyration for the boundary of a simple random walk of length 2n in the plane behaves like $\frac{1}{4} n$. In fact, we ran some numerical simulations and this seems to be indeed the case.

2. Preliminaries

Conformal restriction measures in $\mathbb{H}$ are measures supported on the set of closed subsets K of $\overline{\mathbb{H}}$ such that $K \cap \mathbb{R} = \{0\}$, K is unbounded and $\mathbb{H} \setminus K$ has two infinite connected components, that satisfy the conformal restriction property: for all simply connected domains $H \subset \mathbb{H}$ such that $\mathbb{H} \setminus H$ is bounded and bounded away from the origin, the law of K conditioned on $K \subset H$ is the law of $\Phi(K)$, where Φ is any conformal transformation from $\mathbb{H}$ to H preserving 0 and ∞ (this law doesn't depend on the choice of Φ). It is proved in [5] that there is only one real parameter family of such restriction measures, $P_\alpha$ where $\alpha \ge 5/8$. These measures are uniquely described by the following property: for all closed A in $\mathbb{H}$ bounded and bounded away from 0,
\[ P_\alpha[K \cap A = \emptyset] = \Phi_A'(0)^\alpha, \tag{2.1} \]
where $\Phi_A$ is a conformal transformation from $\mathbb{H} \setminus A$ onto $\mathbb{H}$ such that $\Phi_A(z)/z \to 1$ when $z \to \infty$. To aid with the notation for the rest of the paper, whenever we write $\Phi_A$ we will be assuming that we have chosen the translate with the additional property $\Phi_A(0) = 0$. $P_{5/8}$ is the law of chordal SLE$_{8/3}$, and $P_1$ can be constructed by filling the closed loops of a Brownian excursion in $\mathbb{H}$ (Brownian motion started at 0 conditioned to stay in $\mathbb{H}$). An important property of these conformal restriction measures is that using two independent restriction measures $P_{\alpha_1}$ and $P_{\alpha_2}$, we can construct $P_{\alpha_1 + \alpha_2}$ by filling the “inside” of the union of $K_1$ and $K_2$. This “additivity” property and the construction of $P_{5/8}$ and $P_1$ give the good description of the Brownian motion in terms of SLE curves, namely, 8 SLE$_{8/3}$ give the same hull as 5 Brownian excursions. Since we want to describe the boundary of loops of time duration 1, we will first create loops with the use of the infinite hulls described above. Restriction measures are conformally invariant (Brownian excursion, SLE$_{8/3}$, ...), so we had better use conformal maps. There is obviously no conformal equivalence which sends both ∞ and 0 to 0, so the natural idea is to consider a Möbius transformation preserving $\mathbb{H}$ which maps 0 to 0, and ∞ to ε. We can choose
\[ m_\varepsilon(z) = \frac{\varepsilon z}{z + 1}, \qquad m_\varepsilon^{-1}(z) = \frac{z}{\varepsilon - z}. \]
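The stated properties of the Möbius maps can be verified numerically. The sketch below uses the identity $m_\varepsilon^{-1}(z) + 1 = \varepsilon/(\varepsilon - z)$, which follows directly from the formula for $m_\varepsilon^{-1}$, to locate the image of a half circle $|z| = R$: it lies near −1, at distance between $\varepsilon/(R + \varepsilon)$ and $\varepsilon/(R - \varepsilon)$. The function names and sample values are illustrative only.

```python
import cmath
import math

def m_eps(z, eps):
    """Moebius map of the upper half-plane sending 0 to 0 and infinity to eps."""
    return eps * z / (z + 1)

def m_eps_inv(w, eps):
    """Inverse map, sending eps to infinity and infinity to -1."""
    return w / (eps - w)

def distances_to_minus_one(radius, eps, samples=200):
    """Distances from -1 of the images under m_eps_inv of points on the half circle |z| = radius."""
    result = []
    for s in range(1, samples):
        z = radius * cmath.exp(1j * math.pi * s / samples)
        result.append(abs(m_eps_inv(z, eps) + 1))
    return result
```

For small ε, the half circles of radius 1 and 1 + δ are therefore sent to curves at distance approximately ε and ε/(1 + δ) from −1.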
The limit when ε goes to zero of the measures $m_\varepsilon(P_1)$ is the Dirac measure at {0}. The good renormalization to keep something interesting is in $\varepsilon^2$. Hence, we define the Brownian bubble measure in $\mathbb{H}$ as:
\[ \mu^{\mathrm{bub}} = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_1). \]
This measure was introduced in [5], and it is an important tool for studying the link between SLE curves and the Brownian loop soup (see [4]). It was already noted in [5, 7], as an easy consequence of the “additivity” property described above, that
\[ \frac{5}{8} \mu^{\mathrm{bub}} = \frac{5}{8} \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_1) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_{5/8}). \]
The last measure can be seen as an infinite measure on “SLE$_{8/3}$ loops”; let us call this measure $\mu^{\mathrm{sle}}$. Recall that we are interested in a Brownian loop of time duration 1. We have the following time decomposition for $\mu^{\mathrm{bub}}$ (see [4, 3]):
\[ \mu^{\mathrm{bub}} = \int_0^\infty \frac{dt}{2t^2} \, P_t^{\mathrm{br}} \times P_t^{\mathrm{exc}}, \tag{2.2} \]
where $P_t^{\mathrm{br}}$ is the law of a one-dimensional Brownian bridge of time duration t, and $P_t^{\mathrm{exc}}$ is the law of an Itô Brownian excursion re-normalized to have time t. $P_t^{\mathrm{br}} \times P_t^{\mathrm{exc}}$ is the law of an $\mathbb{H}$-Brownian bridge of time duration t, by considering the one-dimensional bridge as the x coordinate of the curve, and the excursion as the y coordinate. Unfortunately, it is hard to compute fixed-time quantities with SLE techniques. Thus, we will compute a “geometric quantity” using SLE$_{8/3}$, and then extract E(A) from this geometric value by using the relation $\mu^{\mathrm{bub}} = \frac{8}{5} \mu^{\mathrm{sle}}$ and the decomposition (2.2).
Let us explain in a few words why we need to deal with Brownian bridges in $\mathbb{H}$ and cannot work directly with bridges in $\mathbb{C}$. The underlying idea is the fact that one needs to choose a starting point on the boundary of the Brownian loop for the SLE loop representation. A natural choice is the (almost surely) unique lowest point; this is why we are interested in $\mathbb{H}$ quantities. So let $A_{\mathbb{H}}$ be the random variable giving the area of an $\mathbb{H}$-Brownian bridge of time duration one. Working with $A_{\mathbb{H}}$ will turn out not to be a problem since, as the reader might already suspect, the random variables A and $A_{\mathbb{H}}$ have the same law. For the geometric quantity, we could choose to compute $\int A(\gamma) \, d\mu^{\mathrm{sle}}$, where A(γ) is the area enclosed in $\mathbb{H}$ by the “curve” γ, but this integral is infinite. Let $\gamma^*$ be the radius of the curve γ, that is, $\gamma^* = \sup_{0 \le t \le t_\gamma} |\gamma(t)|$. We may consider the “expected” area under the law $\mu^{\mathrm{sle}}$ “conditioned” on $\gamma^* = 1$. Here, $\mu^{\mathrm{sle}}$ is not a probability measure so the term “expected value” is not correct, and the conditioning is on a set of $\mu^{\mathrm{sle}}$-measure equal to 0. The following definition will be sufficient for our purposes:
\[ \mu^{\mathrm{sle}}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma) \, 1_{\{\gamma^* \in [1, 1+\delta)\}} \, d\mu^{\mathrm{sle}}}{\mu^{\mathrm{sle}}\{\gamma^* \in [1, 1+\delta)\}}. \tag{2.3} \]
Using $\mu^{\mathrm{sle}} = \frac{5}{8} \mu^{\mathrm{bub}}$, we can write in the same way:
\[ \mu^{\mathrm{sle}}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma) \, 1_{\{\gamma^* \in [1, 1+\delta)\}} \, d\mu^{\mathrm{bub}}}{\mu^{\mathrm{bub}}\{\gamma^* \in [1, 1+\delta)\}}. \tag{2.4} \]
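Thus, $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$ represents at the same time the “expected” area of an SLE$_{8/3}$ loop conditioned to touch the half circle of radius one and the expected area of a Brownian bubble with the same conditioning. With the use of the restriction property for SLE$_{8/3}$, we will be able to compute $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$ in the last section. Before, in the coming section, we will find the relationship between E(A) and $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$.

3. Extraction of E(A) from $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$

In this section we will prove the following

Lemma 3.1. The expected value of A is equal to $2\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$.

Proof. First of all, by using the definition of $\mu^{\mathrm{bub}}$ in terms of $\lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_1)$ and the restriction property of $P_1$, it is easy to show that $\mu^{\mathrm{bub}}\{\gamma^* \ge r\} = \frac{1}{r^2}$. (The definition of $\mu^{\mathrm{bub}}$ just mentioned shows that the crucial quantity to compute is the probability that an $\mathbb{H}$-Brownian excursion hits a half-ball of radius ε/r centered about −1; the restriction property tells us that to evaluate this, one just needs to differentiate the appropriate map at 0. An analogous computation for the case of SLE$_{8/3}$, or equivalently $P_{5/8}$, is carried out in more detail in Lemma 4.1.) Hence,
\[ \mu^{\mathrm{bub}}\{\gamma^* \in [1, 1+\delta)\} = 1 - \frac{1}{(1+\delta)^2} = 2\delta + O(\delta^2). \]
On the other hand, recalling (2.2) and letting $\mu_t = P_t^{\mathrm{br}} \times P_t^{\mathrm{exc}}$, we have
\begin{align*}
\int A(\gamma) \, 1_{\{\gamma^* \in [1, 1+\delta)\}} \, d\mu^{\mathrm{bub}}
&= \int_0^\infty \frac{1}{2t^2} \int A(\gamma) \, 1_{\{\gamma^* \in [1, 1+\delta)\}} \, d\mu_t \, dt \\
&= \int_0^\infty \frac{1}{2t^2} \, t \int A(\gamma) \, 1_{\{\gamma^* \in [\frac{1}{\sqrt{t}}, \frac{1+\delta}{\sqrt{t}})\}} \, d\mu_1 \, dt && \text{(Brownian scaling)} \\
&= \int A(\gamma) \int_0^\infty \frac{1}{2t} \, 1_{\{t \in [\frac{1}{\gamma^{*2}}, \frac{(1+\delta)^2}{\gamma^{*2}})\}} \, dt \, d\mu_1 && \text{(Fubini)} \\
&= \int A(\gamma) \log(1+\delta) \, d\mu_1 = E(A_{\mathbb{H}}) (\delta + O(\delta^2)).
\end{align*}
Thus, from (2.4),
\[ \mu^{\mathrm{sle}}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma) \, 1_{\{\gamma^* \in [1, 1+\delta)\}} \, d\mu^{\mathrm{bub}}}{\mu^{\mathrm{bub}}\{\gamma^* \in [1, 1+\delta)\}} = \frac{1}{2} E(A_{\mathbb{H}}). \]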
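The two elementary ingredients of the limit above can be checked numerically: the inner time integral equals $\log(1+\delta)$ whatever the value of $\gamma^*$, and the ratio $\log(1+\delta) / (1 - (1+\delta)^{-2})$ tends to 1/2. The sketch below uses a plain midpoint rule; the function names are illustrative and the chosen values of δ and $\gamma^*$ are arbitrary.

```python
import math

def bubble_mass(delta):
    """mu^bub{gamma* in [1, 1 + delta)} = 1 - 1 / (1 + delta)^2."""
    return 1.0 - 1.0 / (1.0 + delta) ** 2

def time_integral(delta, radius, n=10000):
    """Midpoint-rule value of the integral of dt / (2t) over [1/radius^2, (1+delta)^2/radius^2)."""
    lower = 1.0 / radius**2
    upper = (1.0 + delta) ** 2 / radius**2
    width = (upper - lower) / n
    return sum(width / (2.0 * (lower + (i + 0.5) * width)) for i in range(n))

def conditional_ratio(delta):
    """log(1 + delta) / mu^bub{gamma* in [1, 1 + delta)}, which should approach 1/2."""
    return math.log1p(delta) / bubble_mass(delta)
```

The independence of `time_integral` from `radius` is what allows $\log(1+\delta)$ to be pulled out of the integral against $\mu_1$ in the last line of the computation.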
Hence, the proof of the lemma will be concluded as soon as we establish $E(A_{\mathbb{H}}) = E(A)$. There is a (almost sure) one-to-one correspondence between $\mathbb{C}$-Brownian bridges and $\mathbb{H}$-Brownian bridges. The idea is to start the Brownian loop from its lowest point. More precisely, if $B_t$, $0 \le t \le 1$, is a Brownian bridge in $\mathbb{C}$, with probability one, there is a unique $\bar{t} \in [0, 1]$ such that $\mathrm{Im}(B_{\bar{t}}) \le \mathrm{Im}(B_t)$ for all $t \in [0, 1]$. We associate to the Brownian bridge $B_t$ the process $(Z_t)_{0 \le t \le 1}$ in $\mathbb{H}$, defined by this simple space-time translation:
\[ Z_t = \begin{cases} B_{\bar{t}+t} - B_{\bar{t}}, & 0 \le t \le 1 - \bar{t}, \\ B_{\bar{t}+t-1} - B_{\bar{t}}, & 1 - \bar{t} \le t \le 1. \end{cases} \tag{3.1} \]
Now, we have to identify the law of $Z_t$ with $P_1^{\mathrm{exc}} \times P_1^{\mathrm{br}}$. The real and imaginary parts of $B_t$ are two independent one-dimensional Brownian bridges. The law of the random variable $\bar{t}$ is independent of $\mathrm{Re}(B_t)$, so in the space-time change (3.1), $\mathrm{Re}(Z_t)$ is still a one-dimensional bridge independent of the imaginary part of $Z_t$. $\mathrm{Im}(Z_t)$ has the law of a one-dimensional Brownian bridge viewed from its (almost sure) unique lowest point. By the Vervaat Theorem (see [11]), this gives the law of an Itô excursion renormalized to have time one. Thus $Z_t$ has the law of an $\mathbb{H}$-Brownian bridge of time one. Our space-time transformation obviously preserves the area, hence $E(A_{\mathbb{H}}) = E(A)$.

4. Computation of $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$

In this section we compute $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$. Combined with Lemma 3.1, this will conclude the proof of Theorem 1.1.

Lemma 4.1. $\mu^{\mathrm{sle}}(A \mid \gamma^* = 1)$ is equal to $\frac{\pi}{10}$.
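The space-time translation (3.1) used in the proof of Lemma 3.1 has a direct discrete analogue, sketched below for a closed polygonal loop stored as a list of complex numbers (the closing segment from the last point back to the first is implicit). The helper names are illustrative, and the signed polygon area is used only as a simple stand-in for a quantity that is invariant under translation and cyclic relabeling; it is not the hull area A of the paper.

```python
def reroot_at_lowest_point(loop):
    """Discrete version of (3.1): start the loop at its lowest point and translate it to 0."""
    n = len(loop)
    lowest = min(range(n), key=lambda i: loop[i].imag)
    base = loop[lowest]
    return [loop[(lowest + i) % n] - base for i in range(n)]

def signed_area(loop):
    """Shoelace formula for the signed area of a closed polygon with complex vertices."""
    n = len(loop)
    return 0.5 * sum((loop[i].conjugate() * loop[(i + 1) % n]).imag for i in range(n))

example = [0j, 1 + 1j, 2 - 1j, -1 - 0.5j]
rerooted = reroot_at_lowest_point(example)
```

After rerooting, the path starts at 0 and stays in the closed upper half-plane, which mirrors the passage from a $\mathbb{C}$-Brownian bridge to an $\mathbb{H}$-Brownian bridge.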
The proof of this lemma provides a good example of the use of standard techniques for SLE$_{8/3}$. We have chosen to leave out some algebraic details in order to allow the reader to focus on the main ideas. We first state a result that we will use extensively. This result is due to Schramm [10], who gave a general formula covering the values $\kappa \in [0, 8)$. For simplicity we will only state his result for κ = 8/3, which is all that we need. Let γ be chordal SLE$_{8/3}$ in the upper half-plane $\mathbb{H}$, and let $z = r e^{i\theta}$ be a point in $\mathbb{H}$. Then [10],
\[ P\{z \text{ is to the right of } \gamma[0, \infty)\} = \frac{1}{2} + \frac{1}{2} \cos(\theta). \tag{4.1} \]
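Proof. Recall (2.3):
\[ \mu^{\mathrm{sle}}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma) \, 1_{\{\gamma^* \in [1, 1+\delta)\}} \, d\mu^{\mathrm{sle}}}{\mu^{\mathrm{sle}}\{\gamma^* \in [1, 1+\delta)\}}. \tag{4.2} \]
By using the definition $\mu^{\mathrm{sle}} = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_{5/8})$, we can rewrite (4.2) as:
\[ \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} E_\varepsilon(A(\gamma) \mid \gamma^* \in [1, 1+\delta)), \tag{4.3} \]
where $E_\varepsilon$ is a more appealing notation for the expected value under the law of $m_\varepsilon(P_{5/8})$ (this law, in simpler words, is the law of a chordal SLE$_{8/3}$ in $\mathbb{H}$ from 0 to ε). Recall that A(γ) is the area of the bounded set in $\mathbb{H}$ enclosed by the curve γ. A(γ) can be written as $\int_{\mathbb{H}} 1_{\{z \text{ inside}\}} \, dA(z)$, where {z inside} means that z is in the component bounded by γ. Thus (4.3) can be written as:
\[ \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} E_\varepsilon\left( \int_{(1+\delta)\mathbb{D}_+} 1_{\{z \text{ inside}\}} \, dA(z) \;\Big|\; \gamma^* \in [1, 1+\delta) \right), \tag{4.4} \]
where $\mathbb{D}_+$ is $\mathbb{D} \cap \mathbb{H}$. Since everything is nicely bounded, we can interchange the limits and the integral. This gives us:
\[ \mu^{\mathrm{sle}}(A \mid \gamma^* = 1) = \int_{\mathbb{D}_+} \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} P_\varepsilon\{z \text{ inside} \mid \gamma^* \in [1, 1+\delta)\} \, dA(z). \tag{4.5} \]
Therefore, what remains to be done is to compute, for a fixed z, the “probability” that this z is inside an “SLE$_{8/3}$ loop” conditioned to have radius exactly 1. So let us fix $z_0$ in $\mathbb{D}_+$. Let $D_\varepsilon$ (resp. $D_\varepsilon^\delta$) denote the image under $m_\varepsilon^{-1}(z) = z/(\varepsilon - z)$ of the set $\{z \in \mathbb{H} : |z| \ge 1\}$ (resp. $\{z \in \mathbb{H} : |z| \ge 1 + \delta\}$).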
804
C. Garban, J. A. Trujillo Ferreras
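We warn the reader that $\gamma$ will denote two different kinds of curves in $\mathbb{H}$: a curve from 0 to $\infty$, or a curve from 0 to $\varepsilon$. Let $F_\varepsilon$ be the event $\{\gamma[0,\infty) \cap D_\varepsilon = \emptyset\}$, and, similarly, let $F_\varepsilon^\delta$ be the analogous event for $D_\varepsilon^\delta$. Then,
\[
P_\varepsilon\{z_0 \text{ inside} \mid \gamma^* \in [1, 1+\delta)\} = P_{5/8}\{m_\varepsilon^{-1}(z_0) \text{ is to the right of } \gamma \mid (F_\varepsilon)^c \cap F_\varepsilon^\delta\}.
\]
Recall that $P_{5/8}$ is the law of a chordal $\mathrm{SLE}_{8/3}$ from 0 to $\infty$ in $\mathbb{H}$; henceforth, we will simply call it $P$. In order to make the formulas more concise we will denote the event $\{z \text{ is to the right of } \gamma\}$ by $R(z)$. Then,
\[
P\{R(m_\varepsilon^{-1}(z_0)) \mid (F_\varepsilon)^c \cap F_\varepsilon^\delta\} = \frac{P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon^\delta\}\, P\{F_\varepsilon^\delta\} - P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon\}\, P\{F_\varepsilon\}}{P\{F_\varepsilon^\delta\} - P\{F_\varepsilon\}}. \tag{4.6}
\]
The reason for this last step is that now all the probabilities involved can be computed using the restriction property for $\mathrm{SLE}_{8/3}$, and a simple formula, see Lemma 4.1, for the probability that a point is to the right of an $\mathrm{SLE}_{8/3}$ path from 0 to $\infty$ in $\mathbb{H}$. This requires (cf. Sect. 2) knowing the unique conformal map $\Phi_\varepsilon = \Phi_{D_\varepsilon}$ from $\mathbb{H} \setminus D_\varepsilon$ into $\mathbb{H}$, with $\Phi_\varepsilon(0) = 0$, $\Phi_\varepsilon(\infty) = \infty$ and $\Phi_\varepsilon'(\infty) = 1$ (with a similar statement for $D_\varepsilon^\delta$). Thus by restriction, the law of the chordal $\mathrm{SLE}_{8/3}$ in $\mathbb{H}$ conditioned not to touch $D_\varepsilon$ is the inverse image of the chordal SLE in $\mathbb{H}$ by $\Phi_\varepsilon$. This implies for the quantities we need to compute:
\[
P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon\} = P\{R(\Phi_\varepsilon(m_\varepsilon^{-1}(z_0)))\}, \qquad
P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon^\delta\} = P\{R(\Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)))\}.
\]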
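Note that $m_\varepsilon^{-1}$ is a Möbius transformation, which maps $\infty$ to $-1$. Therefore, $D_\varepsilon$ and $D_\varepsilon^\delta$ are half disks whose centers are very close to $-1$. The fact that they are not exactly centered at $-1$ is due to the lack of symmetry in the problem: an SLE from 0 to $\varepsilon$ in a half disk $\mathbb{D}_+$ centered at 0. Nevertheless, for the computation of $\Phi_\varepsilon(z)$ and $\Phi_\varepsilon^\delta(z)$, we can think of $D_\varepsilon$ and $D_\varepsilon^\delta$ as two half disks centered at $-1$ with radii respectively $\varepsilon$ and $(1-\delta)\varepsilon$. If we carried out the computations with the actual disks (straightforward but tedious), we would see that our approximation is of order $O(\varepsilon^2 + \varepsilon^2\delta^2/|z+1| + \varepsilon^4/|z+1|^2)$, when $z$ goes to $-1$. In this way, we have
\[
\Phi_\varepsilon(z) = z - \varepsilon^2 + \frac{\varepsilon^2}{z+1} + O\Big(\varepsilon^2 + \frac{\varepsilon^4}{|z+1|^2}\Big),
\]
\[
\Phi_\varepsilon^\delta(z) = z - \varepsilon^2(1-\delta)^2 + \frac{\varepsilon^2(1-\delta)^2}{z+1} + O\Big(\varepsilon^2 + \frac{\varepsilon^2\delta^2}{|z+1|} + \frac{\varepsilon^4}{|z+1|^2}\Big).
\]
We now have to evaluate these functions at the point $m_\varepsilon^{-1}(z_0) = z_0/(\varepsilon - z_0) = -1 - \varepsilon/z_0 + O(\varepsilon^2)$ (recall $z_0$ is fixed). The approximations $O(\varepsilon^4/|z+1|^2)$ and $O(\varepsilon^2\delta^2/|z+1|)$ at the point $m_\varepsilon^{-1}(z_0)$ are of order $O(\varepsilon^2)$ and $O(\varepsilon\delta^2)$, respectively; this gives us:
\[
\Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)) = -1 - \frac{\varepsilon}{z_0} + \frac{\varepsilon^2(1-\delta)^2}{-\varepsilon/z_0 + O(\varepsilon^2)} + O(\varepsilon\delta^2 + \varepsilon^2)
= -1 - \varepsilon\Big(z_0 + \frac{1}{z_0}\Big) + 2\varepsilon\delta z_0 + O(\varepsilon\delta^2 + \varepsilon^2).
\]
Using the Taylor series for the logarithm, and then taking the imaginary part, we see that
\[
\arg \Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)) = \pi + \varepsilon\, \mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big) - 2\varepsilon\delta\, \mathrm{Im}(z_0) + O(\varepsilon\delta^2 + \varepsilon^2).
\]
Now, using Lemma 4.1, and the Taylor series for cosine we see that
\[
P\{R(\Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)))\} = \frac{\varepsilon^2}{4}\Big[\mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)^2 - 4\delta\, \mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\mathrm{Im}(z_0)\Big] + O(\varepsilon^2\delta^2 + \varepsilon^3). \tag{4.7}
\]
In particular, if we set $\delta = 0$ we obtain
\[
P\{R(\Phi_\varepsilon(m_\varepsilon^{-1}(z_0)))\} = \frac{\varepsilon^2}{4}\Big(\mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\Big)^2 + O(\varepsilon^3). \tag{4.8}
\]
Also, by (2.1), we have (our approximation doesn’t change significantly the derivative at 0, which is far away from small disks centered at $-1$):
\[
P\{F_\varepsilon^\delta\} = P_{5/8}\{\gamma[0,\infty) \cap D_\varepsilon^\delta = \emptyset\} = (\Phi_\varepsilon^\delta)'(0)^{5/8} = \big(1 - \varepsilon^2(1 - 2\delta + O(\delta^2))\big)^{5/8} + O(\varepsilon^3) = 1 - \frac58 \varepsilon^2 + \frac54 \varepsilon^2\delta + O(\varepsilon^2\delta^2 + \varepsilon^3).
\]
Similarly, $P\{F_\varepsilon\} = 1 - \frac58 \varepsilon^2 + O(\varepsilon^3)$, which gives
\[
P\{F_\varepsilon^\delta\} - P\{F_\varepsilon\} = \frac54 \varepsilon^2\delta + O(\varepsilon^2\delta^2 + \varepsilon^3). \tag{4.9}
\]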
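Hence, by combining this last expression, (4.6), (4.7), (4.8) and using the fact that both $P\{F_\varepsilon\}$ and $P\{F_\varepsilon^\delta\}$ are $1 + O(\varepsilon^2)$, we obtain:
\[
\lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} P_\varepsilon\{z_0 \text{ inside} \mid \gamma^* \in [1, 1+\delta)\}
= \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} \frac{\frac{\varepsilon^2}{4}\big(-4\delta\, \mathrm{Im}(z_0 + \frac{1}{z_0})\, \mathrm{Im}(z_0)\big) + \varepsilon^2 O(\delta^2) + O(\varepsilon^3)}{\frac54 \varepsilon^2\delta + O(\varepsilon^2\delta^2 + \varepsilon^3)}
= -\frac45\, \mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\mathrm{Im}(z_0).
\]
Therefore, by (4.5), and using polar coordinates to evaluate the integral, we get:
\[
\mu^{\mathrm{sle}}(A \mid \gamma^* = 1) = -\frac45 \int_{\mathbb{D}_+} \mathrm{Im}\Big(z + \frac1z\Big)\mathrm{Im}(z)\, dA(z) = \frac{\pi}{10}.
\]
This concludes the proof of the lemma.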
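The final integral can be checked numerically. In polar coordinates $z = re^{i\theta}$ one has $\mathrm{Im}(z + 1/z)\,\mathrm{Im}(z) = (r^2 - 1)\sin^2\theta$, so the integral equals $\frac45 \cdot \frac{\pi}{2} \cdot \frac14 = \frac{\pi}{10}$. The sketch below (function names are hypothetical, and the midpoint rule is just one convenient quadrature) evaluates the integrand directly from its complex form rather than from this simplification.

```python
import math


def integrand(r, theta):
    """-(4/5) Im(z + 1/z) Im(z) at z = r e^{i theta}."""
    z = complex(r * math.cos(theta), r * math.sin(theta))
    return -0.8 * (z + 1 / z).imag * z.imag


def half_disk_integral(n_r=400, n_theta=400):
    """Midpoint rule over D_+ = {0 < r < 1, 0 < theta < pi}, dA = r dr dtheta."""
    dr = 1.0 / n_r
    dth = math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            th = (j + 0.5) * dth
            total += integrand(r, th) * r
    return total * dr * dth


value = half_disk_integral()
print(value, math.pi / 10)
```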
Remark. We note that the 1/5 in the final result comes from the 8/5 in the restriction formula (2.1).
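Before finishing this section we would like to show how the techniques used in our proof yield a conditional version of Schramm’s formula (4.1), for $\kappa = 8/3$. We have decided to work with an infinitesimal slit instead of an infinitesimal ball, but the result is the same in either case.

Lemma 4.2. Let $\gamma$ be an $\mathrm{SLE}_{8/3}$ from 0 to $\infty$ in $\mathbb{H}$, and let $S_\varepsilon$ be the vertical slit $[-1, -1 + i\varepsilon]$. Then, for any $z = re^{i\theta}$ in $\mathbb{H}$,
\[
\lim_{\varepsilon \to 0} P\{z \text{ is to the right of } \gamma[0,\infty) \mid \gamma[0,\infty) \cap S_\varepsilon \ne \emptyset\} = \frac12 + \frac12 \cos(\theta) - \frac45 \sin(\theta)\, \mathrm{Im}\Big(\frac{1}{z+1}\Big).
\]

Proof. The map $\Phi_\varepsilon(z) = \sqrt{(z+1)^2 + \varepsilon^2} - \sqrt{1 + \varepsilon^2}$, where the square root has been chosen to be positive on $\mathbb{R}_+$, is a conformal transformation from $\mathbb{H} \setminus S_\varepsilon$ onto $\mathbb{H}$. It fixes both 0 and $\infty$, and has derivative 1 at $\infty$. Simple algebra yields
\[
\Phi_\varepsilon(z) = z\Big(1 - \frac12 \frac{\varepsilon^2}{z+1}\Big) + O(\varepsilon^4), \qquad (\Phi_\varepsilon'(0))^{5/8} = 1 - \frac{5}{16}\varepsilon^2 + O(\varepsilon^4).
\]
In particular, we have
\[
\arg \Phi_\varepsilon(z) = \arg z - \frac12 \varepsilon^2\, \mathrm{Im}\Big(\frac{1}{z+1}\Big) + O(\varepsilon^4).
\]
Thus, using this last remark, the restriction property and (4.1), we have
\begin{align*}
P\{R(z) \mid \gamma[0,\infty) \cap S_\varepsilon \ne \emptyset\}
&= \frac{P\{R(z)\} - P\{R(z) \mid \gamma[0,\infty) \cap S_\varepsilon = \emptyset\}\, P\{\gamma[0,\infty) \cap S_\varepsilon = \emptyset\}}{P\{\gamma[0,\infty) \cap S_\varepsilon \ne \emptyset\}} \\
&= \frac{\frac12 + \frac12 \cos(\arg z) - \big(\frac12 + \frac12 \cos(\arg \Phi_\varepsilon(z))\big)\big(1 - \frac{5}{16}\varepsilon^2\big) + O(\varepsilon^4)}{\frac{5}{16}\varepsilon^2 + O(\varepsilon^4)} \\
&= \frac{\frac12\big[\cos(\arg z) - \cos\big(\arg z + (-\frac12 \varepsilon^2\, \mathrm{Im}(\frac{1}{z+1})) + O(\varepsilon^4)\big)\big]}{\frac{5}{16}\varepsilon^2 + O(\varepsilon^4)} + \frac12 (1 + \cos(\arg z))(1 + o(1)) \\
&= \frac{\frac12\big[\sin(\arg z)\big(-\frac12 \varepsilon^2\, \mathrm{Im}(\frac{1}{z+1})\big) + O(\varepsilon^4)\big]}{\frac{5}{16}\varepsilon^2 + O(\varepsilon^4)} + \frac12 (1 + \cos(\arg z))(1 + o(1)) \\
&= -\frac45 \sin(\arg z)\, \mathrm{Im}\Big(\frac{1}{z+1}\Big)(1 + o(1)) + \frac12 (1 + \cos(\arg z))(1 + o(1)),
\end{align*}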
which is what we wanted to prove.
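The limit in Lemma 4.2 can be sanity-checked numerically for a small but positive $\varepsilon$, using only the ingredients of the proof: the slit map $\Phi_\varepsilon$, Schramm's formula (4.1), and the restriction exponent 5/8. This is a sketch, not part of the original argument; the function names are hypothetical, and the branch of the square root is implemented as $i\sqrt{-v}$ so that values lie in the upper half-plane for interior points.

```python
import cmath
import math


def sqrt_upper(v):
    """Square root of v taking values in the upper half-plane (v off [0, inf))."""
    return 1j * cmath.sqrt(-v)


def slit_map(z, eps):
    """Phi_eps(z) = sqrt((z+1)^2 + eps^2) - sqrt(1 + eps^2) on H minus the slit."""
    return sqrt_upper((z + 1) ** 2 + eps ** 2) - math.sqrt(1 + eps ** 2)


def schramm_right(z):
    """Formula (4.1): probability that z is to the right of the SLE_{8/3} path."""
    return 0.5 + 0.5 * math.cos(cmath.phase(z))


def conditional_right(z, eps):
    """P{R(z) | path hits the slit}, assembled as in the proof of Lemma 4.2."""
    # P{hit} = 1 - Phi'(0)^{5/8} with Phi'(0) = (1 + eps^2)^{-1/2};
    # expm1/log1p keep the small difference accurate.
    hit = -math.expm1(-(5.0 / 16.0) * math.log1p(eps ** 2))
    no_hit = 1.0 - hit
    numerator = schramm_right(z) - schramm_right(slit_map(z, eps)) * no_hit
    return numerator / hit


def lemma_limit(z):
    """Right-hand side of Lemma 4.2."""
    theta = cmath.phase(z)
    return 0.5 + 0.5 * math.cos(theta) - 0.8 * math.sin(theta) * (1 / (z + 1)).imag


for point in (0.3 + 0.7j, -2 + 1j):
    print(point, conditional_right(point, 1e-3), lemma_limit(point))
```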
5. Expected Areas for Fixed Index

Let $z \in \mathbb{C} \setminus \{0\}$ be fixed, and $(B_t)_{0 \le t \le 1}$ a Brownian loop in $\mathbb{C}$ starting at 0. Almost surely $z \notin \{B_s : 0 \le s \le 1\}$, and therefore we can define its index $n_z$. More precisely, for all $s \in [0,1]$, $B_s - z = R_s^z \exp(i\theta_s^z)$, where $R_s^z = |B_s - z|$ and $\theta_s^z$ is any continuous representative of the argument. The index $n_z$ is by definition $\frac{\theta_1^z - \theta_0^z}{2\pi}$; this is the number of times that the Brownian particle winds around $z$. Let us now recall Yor’s result from [17]:

Theorem 5.1. (M. Yor). Fix $z = re^{i\theta}$, with $r \ne 0$. Under the law of a Brownian loop of time duration one, starting at 0, we have the following probabilities:
\[
P(n_z = n) = e^{-r^2}\big[\Phi_r((2n-1)\pi) - \Phi_r((2n+1)\pi)\big] \quad \text{if } n \in \mathbb{Z} \setminus \{0\},
\]
\[
P(n_z = 0) = 1 + e^{-r^2}\big[\Phi_r(-\pi) - \Phi_r(\pi)\big],
\]
where for all $x \ne 0$,
\[
\Phi_r(x) = \frac{x}{\pi} \int_0^\infty e^{-r^2 \cosh(t)}\, \frac{dt}{t^2 + x^2}.
\]

For each $n \in \mathbb{Z}$, $n \ne 0$, let $W_n$ denote the area of the open set of points of index $n_z = n$. This random variable can be written as:
\[
W_n = \int_{\mathbb{C}} 1_{\{n_z = n\}}\, dA(z).
\]
Let $W_0$ be the area of the open set of points inside the loop that have index zero:
\[
W_0 = \int_{\mathbb{C}} 1_{\{n_z = 0\} \cap \{z \text{ is inside}\}}\, dA(z).
\]
Fig. 2. Random walk of 50000 steps with areas of index 0 inside its hull in black
Since the Brownian curve is of Lebesgue measure zero, we have the following decomposition of the area $A$ inside the Brownian loop (basically, the Brownian path does not take up much space inside its hull):
\[
A = \sum_{n \in \mathbb{Z}} W_n.
\]
Hence:
\[
E(A) = \frac{\pi}{5} = \sum_{n \in \mathbb{Z}} E(W_n).
\]
Using Yor’s result, it will be rather straightforward to compute $E(W_n)$ for $n \ne 0$. And, hence, by subtracting from $\pi/5$, one can obtain the value of $E(W_0)$:

Theorem 5.2.
\[
E(W_n) = \begin{cases} \dfrac{\pi}{30}, & n = 0, \\[1ex] \dfrac{1}{2\pi n^2}, & n \ne 0,\ n \in \mathbb{Z}. \end{cases} \tag{5.1}
\]
Remark. This result is consistent with the asymptotic result obtained by Werner in [12], about the area $A_n^t$ of the set of points around which the planar Brownian motion (not the loop) winds $n$ times on $[0, t]$. It is indeed proved there that $A_n^t$ is equivalent (in the $L^2$-sense) to $\frac{t}{2\pi n^2}$ as $n$ goes to infinity. Very roughly, the area of the $n$-sector for large $n$ comes from local contributions along the path, hence the global picture of the hull is not relevant; that is why both Brownian motion and Brownian bridge should have the same asymptotics. Werner’s proof requires computing the asymptotics of the first and second moments. The present paper gives exact computations for the first moments in the case of the loop, but it does not provide any information about the second moments. In fact, if one were to try to compute the second moment (or any higher order moment) of the area of the loop following an approach similar to the one in this paper, one would face two difficulties. First, the simplification that allows one to obtain the equation in Lemma 3.1 only occurs for the first moment. And even if somehow one were able to obtain a formula analogous to the one in the lemma, it would still be necessary to have a statement analogous to that of Lemma 4.1 for more than one point: a notoriously difficult problem.

Proof. We start by computing $E(W_n)$ for $n \ne 0$. For this purpose we use Theorem 5.1. Thus, for each $n \ne 0$, using polar coordinates:
\begin{align*}
E(W_n) &= \int_{\mathbb{C}} P(n_z = n)\, dA(z) \\
&= 2\pi \int_0^\infty r\, dr\, e^{-r^2} \int_0^\infty dt\, e^{-r^2 \cosh(t)} \Big[\frac{2n-1}{t^2 + (2n-1)^2\pi^2} - \frac{2n+1}{t^2 + (2n+1)^2\pi^2}\Big] \\
&= 2\pi \int_0^\infty dt\, \Big[\frac{2n-1}{t^2 + (2n-1)^2\pi^2} - \frac{2n+1}{t^2 + (2n+1)^2\pi^2}\Big] \int_0^\infty r e^{-r^2(1 + \cosh(t))}\, dr \\
&= \pi \int_0^\infty \frac{dt}{1 + \cosh(t)} \Big[\frac{2n-1}{t^2 + (2n-1)^2\pi^2} - \frac{2n+1}{t^2 + (2n+1)^2\pi^2}\Big] \\
&= \frac{1}{2\pi n^2}.
\end{align*}
We sketch one possible way to obtain the last line in the above chain of equalities. It is slightly more convenient to generalize a bit, so thinking of $2n$ as $x$ and using the symmetry of the integrand, we consider the function
\[
F(x) = \int_{-\infty}^{\infty} \frac{dt}{1 + \cosh(t)} \Big[\frac{x-1}{t^2 + (x-1)^2\pi^2} - \frac{x+1}{t^2 + (x+1)^2\pi^2}\Big].
\]
In this new notation what we want to prove is that $F(x) = \frac{4}{\pi^2 x^2}$ (for $|x| \ge 2$). Since $F$ is symmetric about 0, it is enough to study the case of $x$ positive; furthermore, since $F$ is real analytic on $\{x : x > 1\}$, we can allow ourselves to assume that $x$ is not an integer. Now, for $x > 1$ and $x$ not an integer, a simple residue computation with appropriate contours yields
\[
F(x) = -\frac{8}{\pi^2} \sum_{k=1}^{\infty} \Big[\frac{(2k-1)(x-1)}{((x-1)^2 - (2k-1)^2)^2} - \frac{(2k-1)(x+1)}{((x+1)^2 - (2k-1)^2)^2}\Big].
\]
In order to evaluate this sum, it is enough to notice that using partial fractions one can obtain
\[
\sum_{k=1}^{\infty} \frac{(2k-1)w}{(w^2 - (2k-1)^2)^2} = -\frac{1}{16} \sum_{k=0}^{\infty} \Big[\frac{1}{(k + w/2 + 1/2)^2} - \frac{1}{(k - w/2 + 1/2)^2}\Big],
\]
and substituting $x - 1$ and $x + 1$ for $w$, and noticing the telescoping cancellations, one readily obtains
\[
F(x) = \frac{4}{\pi^2 x^2}, \qquad \text{hence} \qquad E(W_n) = \frac{1}{2\pi n^2}.
\]
Finally, using the fact that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$, and the fact that the area of the Brownian loop is $\pi/5$, we conclude $E(W_0) = \frac{\pi}{30}$. This finishes the proof of the theorem.
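The one-dimensional integral in the proof of Theorem 5.2 is easy to check numerically, and so is the final bookkeeping $\pi/5 - \sum_{n \ne 0} 1/(2\pi n^2) = \pi/30$. The sketch below uses a composite Simpson rule on a truncated interval; the function name, the truncation point and the step count are assumptions chosen for illustration only.

```python
import math


def winding_area(n, t_max=60.0, steps=60000):
    """Approximate E(W_n) = pi * int_0^inf dt/(1+cosh t) [ ... ] for n != 0.

    Uses composite Simpson on [0, t_max]; the integrand decays like exp(-t),
    so the truncation error at t_max = 60 is negligible.
    """
    if n == 0:
        raise ValueError("the integral formula applies to n != 0 only")
    a = (2 * n - 1) * math.pi
    b = (2 * n + 1) * math.pi

    def f(t):
        bracket = (2 * n - 1) / (t * t + a * a) - (2 * n + 1) / (t * t + b * b)
        return bracket / (1.0 + math.cosh(t))

    h = t_max / steps  # steps must be even for Simpson's rule
    s = f(0.0) + f(t_max)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)
    return math.pi * s * h / 3.0


for n in (1, 2, 3, -1):
    print(n, winding_area(n), 1 / (2 * math.pi * n * n))

# Bookkeeping: sum over n != 0 of 1/(2 pi n^2) = 2 * (1/(2 pi)) * pi^2/6 = pi/6.
nonzero_total = math.pi / 6
expected_w0 = math.pi / 5 - nonzero_total
print(expected_w0, math.pi / 30)
```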
Acknowledgement. We wish to thank Greg Lawler and Wendelin Werner for suggesting the problem and for fruitful discussions, and Wendelin Werner for pointing out the link with the paper of Yor [17] as well as for his many comments on previous versions of the manuscript. J. T. thanks Julien Dubédat for a discussion during a workshop at Oberwolfach that has led to a clearer presentation of Lemma 3.1. We thank the referees for useful suggestions.
References
1. Cardy, J.: Mean area of self-avoiding loops. Phys. Rev. Lett. 72, 1580–1583 (1994)
2. Comtet, A., Desbois, J., Ouvry, S.: Winding of planar Brownian curves. J. Phys. A: Math. Gen. 23, 3563–3572 (1990)
3. Lawler, G.F.: Conformally Invariant Processes in the Plane. Mathematical Surveys and Monographs 114, Providence, RI: Amer. Math. Soc., 2005
4. Lawler, G.F., Werner, W.: The Brownian loop soup. Probab. Theory Related Fields 128, 565–588 (2004)
5. Lawler, G.F., Schramm, O., Werner, W.: Conformal restriction. The chordal case. J. Amer. Math. Soc. 16, 917–955 (2003)
6. Lawler, G.F., Schramm, O., Werner, W.: The dimension of the planar Brownian frontier is 4/3. Math. Res. Lett. 8, 401–411 (2001)
7. Lawler, G.F., Schramm, O., Werner, W.: On the scaling limit of planar self-avoiding walk. In: Fractal geometry and applications, A jubilee of Benoît Mandelbrot, Proc. Symp. Pure Math. 72, Providence, RI: Amer. Math. Soc., 2004
8. Richard, C.: Area distribution of the planar random loop boundary. J. Phys. A. 37, 4493–4500 (2004)
9. Thacker, J.: Hausdorff Dimension of the Brownian Loop Soup. In preparation (2005)
10. Schramm, O.: A percolation formula. Electron. J. Probab. Vol. 7(2), 1–13 (2001)
11. Vervaat, W.: A relation between Brownian bridge and Brownian excursion. Ann. Probab. 7, 143–149 (1979)
12. Werner, W.: Sur l’ensemble des points autour desquels le mouvement brownien plan tourne beaucoup. Probability Theory and Related Fields 99, 111–142 (1994)
13. Werner, W.: Random planar curves and Schramm-Loewner evolutions. In: Lecture Notes from the 2002 Saint-Flour Summer School, L.N. Math. 1840, Berlin-Heidelberg-New York: Springer, 2004, pp. 107–195
14. Werner, W.: Conformal restriction and related questions. http://arxiv.org/list/math.PR/0307353, 2003
15. Werner, W.: SLEs as boundaries of clusters of Brownian loops. C. R. Acad. Sci. Paris Ser. I Math. 337, 481–486 (2003)
16. Werner, W.: The conformally invariant measure on self-avoiding loops. http://arxiv.org/list/math.PR/0511605, 2005
17. Yor, M.: Loi de l’indice du lacet brownien, et distribution de Hartman-Watson. Z. Wahrsch. Verw. Gebiete 53, 71–95 (1980)

Communicated by M. Aizenman
Commun. Math. Phys. 264, 811–842 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1518-7
Communications in
Mathematical Physics
Scattering Theory for Jacobi Operators with Quasi-Periodic Background
Iryna Egorova1, Johanna Michor2,3, Gerald Teschl2,3
1 Kharkiv National University, 47, Lenin ave, 61164 Kharkiv, Ukraine. E-mail: [email protected]
2 Faculty of Mathematics, Nordbergstrasse 15, 1090 Wien, Austria
3 International Erwin Schrödinger Institute for Mathematical Physics, Boltzmanngasse 9, 1090 Wien, Austria. E-mail: [email protected]; [email protected]
Received: 7 June 2005 / Accepted: 19 September 2005 Published online: 15 February 2006 – © Springer-Verlag 2006
Abstract: We develop direct and inverse scattering theory for Jacobi operators which are short range perturbations of quasi-periodic finite-gap operators. We show existence of transformation operators, investigate their properties, derive the corresponding Gel’fand-Levitan-Marchenko equation, and find minimal scattering data which determine the perturbed operator uniquely.

Work supported by the Austrian Science Fund (FWF) under Grant No. P17762, the Austrian Academy of Sciences under DOC-21388, and INTAS Research Network NeCCA 03-51-6637.

1. Introduction

Classical scattering theory deals with the reconstruction of a given Jacobi operator
\[
Hu(n) = a(n)u(n+1) + a(n-1)u(n-1) + b(n)u(n), \tag{1.1}
\]
which is a short range perturbation of the free one $H_0$ associated with the coefficients $a(n) = \frac12$, $b(n) = 0$. This case was first developed on an informal level by Case in a series of papers [5–10]. The first rigorous results were established by Guseinov [18], who gave necessary and sufficient conditions for the scattering data to determine $H$ uniquely under the assumption
\[
\sum_{n} |n| \Big( \big|a(n) - \tfrac12\big| + |b(n)| \Big) < \infty. \tag{1.2}
\]
Further extensions were made by Guseinov [19, 20], and Teschl [27]. Additional details and further references can be found, e.g., in [28]. In addition to being of interest on its own, scattering theory can also be used to solve the initial value problem for the Toda equation via the inverse scattering transform. This has been formally developed by Flaschka [14] (see also [29] and [11] for the case of rapidly decaying sequences) who also worked out the inverse procedure in the reflection-less case. Further results and an extension of the method to the entire Toda hierarchy were given by Teschl in [26] and [27]. The next interesting problem is to replace the free Hamiltonian $H_0$ by one with a periodic potential. First results in the case of Sturm-Liouville operators have been obtained by Firsova in a series of papers (see [13]). For further results, including potentials with different spatial asymptotics, and additional references see Gesztesy et al. [16]. In the discrete case, the investigation has only recently been started by Boutet de Monvel and Egorova [2] and by Volberg and Yuditskii [31], who treat the case where $H$ has a homogeneous spectrum and is of Szegő class exhaustively from an operator point of view. Applications to the Toda lattice can be found in Bazargan and Egorova [1] and Boutet de Monvel and Egorova [3]. Finally, let us give a brief overview of the paper: Section 2 collects some well-known facts from Riemann surfaces and introduces the necessary notation. Section 3 introduces the Baker-Akhiezer function and investigates the quasi-momentum map. In the periodic case, where the integrals can be explicitly computed, this was first done in [24]. In addition, we characterize the second solution at the band edges. In Sect. 4 we prove existence of Jost solutions and use them to characterize the spectrum of the perturbed operator. In the periodic case, existence of Jost solutions was first shown by Geronimo and Van Assche [17] and the fact that there are only finitely many eigenvalues in each gap was first proven in Cojuhari [12] and later rediscovered in Teschl [25]. Section 5 introduces the transformation operator and proves the crucial decay estimate on its coefficients. This was first done by Boutet de Monvel and Egorova [2] in the periodic case under the additional assumption that all spectral gaps are open.
We fix a problem in the original proof and at the same time simplify and streamline the argument. Section 6 investigates the scattering matrix. Our main result here is the reconstruction of the transmission coefficient from the reflection coefficient, which was not known previously even in the periodic case. Section 7 derives the Gel’fand-Levitan-Marchenko equation and proves positivity of the Gel’fand-Levitan-Marchenko operator. In addition, we formulate necessary conditions for the scattering data to uniquely determine our Jacobi operator. Our final Sect. 8 shows that our necessary conditions for the scattering data are also sufficient. It should be mentioned that, due to the lack of continuity with respect to the spatial variable $n$, a significant change in the strategy of the original proof in the continuous case from [22] is needed. Our approach relies heavily on the fact that the Baker-Akhiezer function is a meromorphic function on the Riemann surface associated with the problem. This strategy gives a more streamlined treatment and more elegant proofs even in the special cases which were previously known. In this respect it is important to emphasize that, in contradistinction to the constant background case, the upper sheet of our Riemann surface is not simply connected and in particular not isomorphic to the unit disc.
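As a concrete illustration of the difference expression (1.1), the following sketch applies it in the free case $a(n) = \frac12$, $b(n) = 0$ and checks the standard fact that $u(n) = k^n$ solves $H_0 u = \lambda u$ with $\lambda = \frac12(k + k^{-1})$. The helper name `apply_jacobi` and the chosen value of $k$ are illustrative assumptions, not notation from the paper.

```python
def apply_jacobi(a, b, u, n):
    """(H u)(n) = a(n) u(n+1) + a(n-1) u(n-1) + b(n) u(n), cf. (1.1)."""
    return a(n) * u(n + 1) + a(n - 1) * u(n - 1) + b(n) * u(n)


# Free operator H_0: a(n) = 1/2, b(n) = 0.
def a_free(n):
    return 0.5


def b_free(n):
    return 0.0


k = 0.4 + 0.3j  # any nonzero complex number


def u(n):
    """Candidate solution u(n) = k^n."""
    return k ** n


lam = 0.5 * (k + 1 / k)  # spectral parameter paired with k

# Residual of H_0 u = lam u on a window of lattice sites.
residuals = [
    abs(apply_jacobi(a_free, b_free, u, n) - lam * u(n)) for n in range(-5, 6)
]
print(max(residuals))
```

For $|k| < 1$ this solution decays as $n \to +\infty$, which is the free-case analogue of the role played later by the quasi-momentum $w(z)$.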
2. Quasi-Periodic Finite-Gap Operators and Riemann Surfaces

To set the stage let $M$ be the Riemann surface associated with the following function
\[
R_{2g+2}^{1/2}(z), \qquad R_{2g+2}(z) = \prod_{j=0}^{2g+1} (z - E_j), \qquad E_0 < E_1 < \cdots < E_{2g+1}, \tag{2.1}
\]
$g \in \mathbb{N}$. $M$ is a compact, hyperelliptic Riemann surface of genus $g$. We will choose $R_{2g+2}^{1/2}(z)$ as the fixed branch
\[
R_{2g+2}^{1/2}(z) = -\prod_{j=0}^{2g+1} \sqrt{z - E_j}, \tag{2.2}
\]
where $\sqrt{\cdot}$ is the standard root with branch cut along $(-\infty, 0)$.
A point on $M$ is denoted by $p = (z, \pm R_{2g+2}^{1/2}(z)) = (z, \pm)$, $z \in \mathbb{C}$, or $p = \infty_\pm$, and the projection onto $\mathbb{C} \cup \{\infty\}$ by $\pi(p) = z$. The points $\{(E_j, 0),\ 0 \le j \le 2g+1\} \subseteq M$ are called branch points and the sets
\[
\Pi_\pm = \Big\{ (z, \pm R_{2g+2}^{1/2}(z)) \;\Big|\; z \in \mathbb{C} \setminus \bigcup_{j=0}^{g} [E_{2j}, E_{2j+1}] \Big\} \subset M \tag{2.3}
\]
are called upper, lower sheet, respectively. Let $\{a_j, b_j\}_{j=1}^{g}$ be loops on the surface $M$ representing the canonical generators of the fundamental group $\pi_1(M)$. We require $a_j$ to surround the points $E_{2j-1}$, $E_{2j}$ (thereby changing sheets twice) and $b_j$ to surround $E_0$, $E_{2j-1}$ counter-clockwise on the upper sheet, with pairwise intersection indices given by
\[
a_i \circ a_j = b_i \circ b_j = 0, \qquad a_i \circ b_j = \delta_{ij}, \qquad 1 \le i, j \le g. \tag{2.4}
\]
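The corresponding canonical basis $\{\zeta_j\}_{j=1}^{g}$ for the space of holomorphic differentials can be constructed by
\[
\underline{\zeta} = \sum_{j=1}^{g} \underline{c}(j)\, \frac{\pi^{j-1}\, d\pi}{R_{2g+2}^{1/2}}, \tag{2.5}
\]
where the constants $\underline{c}(\cdot)$ are given by
\[
c_j(k) = C_{jk}^{-1}, \qquad C_{jk} = \int_{a_k} \frac{\pi^{j-1}\, d\pi}{R_{2g+2}^{1/2}} = 2 \int_{E_{2k-1}}^{E_{2k}} \frac{z^{j-1}\, dz}{R_{2g+2}^{1/2}(z)} \in \mathbb{R}.
\]
The differentials fulfill
\[
\int_{a_j} \zeta_k = \delta_{j,k}, \qquad \int_{b_j} \zeta_k = \tau_{j,k}, \qquad \tau_{j,k} = \tau_{k,j}, \qquad 1 \le j, k \le g. \tag{2.6}
\]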
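Now pick $g$ numbers (the Dirichlet eigenvalues)
\[
(\hat\mu_j)_{j=1}^{g} = (\mu_j, \sigma_j)_{j=1}^{g} \tag{2.7}
\]
whose projections lie in the spectral gaps, that is, $\mu_j \in [E_{2j-1}, E_{2j}]$. Associated with these numbers is the divisor $D_{\underline{\hat\mu}}$ which is one at the points $\hat\mu_j$ and zero otherwise. Using this divisor we introduce
\[
\underline{z}(p, n) = \underline{\hat A}_{p_0}(p) - \underline{\hat\alpha}_{p_0}(D_{\underline{\hat\mu}}) - n \underline{\hat A}_{\infty_-}(\infty_+) - \underline{\hat\Xi}_{p_0} \in \mathbb{C}^g, \qquad \underline{z}(n) = \underline{z}(\infty_+, n), \tag{2.8}
\]
where $\underline{\Xi}_{p_0}$ is the vector of Riemann constants
\[
\hat\Xi_{p_0, j} = \frac{1 - \sum_{k=1}^{g} \tau_{j,k}}{2}, \qquad p_0 = (E_0, 0), \tag{2.9}
\]
and $\underline{A}_{p_0}$ ($\underline{\alpha}_{p_0}$) is Abel’s map (for divisors). The hat indicates that we regard it as a (single-valued) map from $\hat M$ (the fundamental polygon associated with $M$) to $\mathbb{C}^g$. We recall that the function $\theta(\underline{z}(p, n))$ has precisely $g$ zeros $\hat\mu_j(n)$ (with $\hat\mu_j(0) = \hat\mu_j$), where $\theta(\underline{z})$ is the Riemann theta function of $M$. Then our Jacobi operator $H_q$ is given by
\[
a_q(n)^2 = \tilde a^2\, \frac{\theta(\underline{z}(n+1))\, \theta(\underline{z}(n-1))}{\theta(\underline{z}(n))^2}, \qquad
b_q(n) = \tilde b + \sum_{j=1}^{g} c_j(g)\, \frac{\partial}{\partial w_j} \ln\Big(\frac{\theta(\underline{w} + \underline{z}(n))}{\theta(\underline{w} + \underline{z}(n-1))}\Big)\Big|_{\underline{w} = 0}. \tag{2.10}
\]
The constants $\tilde a$, $\tilde b$ depend only on the Riemann surface and will be defined in the next section. It is well known that the spectrum of $H_q$ is purely absolutely continuous and consists of $g + 1$ bands
\[
\sigma(H_q) = \bigcup_{j=0}^{g} [E_{2j}, E_{2j+1}]. \tag{2.11}
\]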
For further information and proofs we refer to [28], Sect. 9.

3. The Baker-Akhiezer Function and the Quasi-Momentum Map

The Baker-Akhiezer function $\psi_q(p, n) = \psi_q(p, n, 0)$ is given by
\[
\psi_q(p, n, n_0) = \sqrt{\frac{\theta(\underline{z}(n_0 - 1))\, \theta(\underline{z}(n_0))}{\theta(\underline{z}(n - 1))\, \theta(\underline{z}(n))}}\; \frac{\theta(\underline{z}(p, n))}{\theta(\underline{z}(p, n_0))}\, \exp\Big((n - n_0) \int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-}\Big), \tag{3.1}
\]
where $\omega_{\infty_+, \infty_-}$ is the normalized Abelian differential of the third kind with simple poles at $\infty_\pm$ and residues $\pm 1$, respectively. It is normalized such that $\psi_q(p, n_0, n_0) = 1$. The two branches
\[
\psi_{q,\pm}(z, n) = \prod_{j=0}^{n-1} \phi_{q,\pm}(z, j) \tag{3.2}
\]
of the Baker-Akhiezer function are solutions of $\tau_q u = zu$, $z \in \mathbb{C}$, where $\tau_q$ is the difference expression associated with $H_q$ and ([28], (8.87))
\[
\phi_{q,\pm}(z, n) = \frac{1}{2a_q(n)} \Big( z - b_q(n) + \sum_{j=1}^{g} \frac{\hat R_j(n)}{z - \mu_j(n)} \pm \frac{R_{2g+2}^{1/2}(z)}{\prod_{j=1}^{g} (z - \mu_j(n))} \Big), \tag{3.3}
\]
\[
R_j(n) = \frac{R_{2g+2}^{1/2}(\mu_j(n))}{\prod_{k \ne j} (\mu_j(n) - \mu_k(n))}, \qquad \hat R_j(n) = \sigma_j(n) R_j(n).
\]
R2g+2 (z) Wq (ψq,− (z), ψq,+ (z)) = g j =1 (z − µj )
(3.4)
(µj = µj (0)) shows that they are linearly dependent at the band edge Ej , 0 ≤ j ≤ 2g + 1. The branch ψq,σj (z, n) has a first order pole at µj if µj is away from the band edges lim (z − µj )ψq,σj (z, n) = ψq,σj (µj , n, 1)
z→µj
Rˆ j (0) aq (0)
(3.5)
(use (3.3) and ψq,± (z, n) = ψq,± (z, n, 1)φq,± (z, 0)) and both branches have a square root singularity if µj coincides with a band edge El ,
√ il k =l |El − Ek | lim z − µj ψq,± (z, n) = ± ψq,+ (El , n, 1). (3.6)
√ z→µj 2aq (0) k =j El − µk Lemma 3.1. The solutions of τq u = zu can be characterized as follows: (i) If R2g+2 (z) = 0, there exist two solutions satisfying ψq,± (z, n) = θ± (z, n)w(z)±n ,
w(z) = exp
(z,+)
p0
ωˆ ∞+ ,∞− ,
(3.7)
with θ± (z, n) quasi-periodic. (ii) If R2g+2 (z) = 0, z = El , there are two solutions satisfying ψq (El , n) = ψq,+ (El , n) = ψq,− (El , n),
ψˆ q (El , n) = ψq (El , n)(θˆl (n)+n), (3.8)
where θˆl (n) is quasi-periodic. Proof. (ii). We construct a second linearly independent solution at z = E = El using (see [28], (1.50)) sq (E, n) = lim aq (0) z→E
ψq,+ (z, n) − ψq,− (z, n) , W (ψq,− (z), ψq,+ (z))
(3.9)
where sq (z, n) denotes the fundamental solution of τq u = zu with initial conditions sq (z, 0) = 0, sq (z, 1) = 1. W.l.o.g. we assume that El does not coincide with one of the Dirichlet eigenvalues µj (otherwise shift the base point). To derive an expression for ψq,± (z) at z = E + 2 we start with 1/2 R2g+2 (z) = (R˜ + O( 2 )), R˜ = − E − Ej . j =l
816
I. Egorova, J. Michor, G. Teschl
Moreover, R˜ (1 + O( 2 )) (E − µ ) j j =1
Wq (ψq,− (z), ψq,+ (z)) = g and for p = (E + 2 , ±) (see (3.11) below)
p
p0
ωˆ ∞+ ,∞− =
E
ωˆ ± β + O( ), 3
β=
2
g
j =1 (E
R˜
p0
z(p, n) = z(E, n) ± γ + O( 3 ),
γ =
− λj )
g j =1
c(j )
,
2E j −1 , R˜
and θ (z(p, n)) = θ (z(E, n)) ±
∂θ (z(E, n)) γ + O( 3 ). ∂z
Using this to evaluate the limit ε → 0 shows sq (E, n) = 2aq (0)
g E − µj ψˆ q (E, n) = ψq (E, n)(θˆ (n) + n), E − λj
j =1
where g ∂ 1 E j ck (j ) ln θ(z(E, n) + w), ∂w (E − λ ) k j j,k=1 j =1
θˆ (n) = g and finishes the proof.
Remark 3.2. (i) Since $\psi_q(z, n)$ has a singularity if $z = \mu_j$, the solutions in Lemma 3.1 are not well-defined for those $z$. However, one can either remove the singularities of $\psi_q(z, n)$ or choose a different normalization point $n_0 \ne 0$ to see that solutions of the above type exist for every $z$. (ii) In the periodic case Floquet theory shows that there are two possible cases at a band edge: either two (linearly independent) periodic solutions or one periodic and one linearly growing solution. The above lemma shows that the first case happens if the corresponding gap is closed and the second if the gap is open.

To understand the properties of $\psi_{q,\pm}(z, n)$ we need to investigate the quasi-momentum map
\[
w(z) = \exp\Big(\int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-}\Big), \qquad p = (z, +). \tag{3.10}
\]
The differential $\omega_{\infty_+, \infty_-}$ is given by
\[
\omega_{\infty_+, \infty_-} = \frac{\prod_{j=1}^{g} (\pi - \lambda_j)}{R_{2g+2}^{1/2}}\, d\pi, \tag{3.11}
\]
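where the constants $\lambda_j$ have to be determined from the normalization
\[
\int_{a_j} \omega_{\infty_+, \infty_-} = 2 \int_{E_{2j-1}}^{E_{2j}} \frac{\prod_{k=1}^{g} (z - \lambda_k)}{R_{2g+2}^{1/2}(z)}\, dz = 0, \tag{3.12}
\]
which shows $\lambda_j \in (E_{2j-1}, E_{2j})$. Since $\lambda_j \in (E_{2j-1}, E_{2j})$ the integrand is a Herglotz function and admits the following representation (cf. [28], Appendix B)
\[
\frac{\prod_{j=1}^{g} (z - \lambda_j)}{R_{2g+2}^{1/2}(z)} = \int_{-\infty}^{\infty} \frac{1}{\lambda - z}\, d\tilde\mu(\lambda) \tag{3.13}
\]
with the probability measure
\[
d\tilde\mu(\lambda) = \frac{\prod_{j=1}^{g} (\lambda - \lambda_j)}{\pi i\, R_{2g+2}^{1/2}(\lambda)}\, \chi_{\sigma(H_q)}(\lambda)\, d\lambda. \tag{3.14}
\]
Hence
\[
g(z, \infty) = \int_{p_0}^{p} \omega_{\infty_+, \infty_-} = \int_{E_0}^{z} \int_{-\infty}^{\infty} \frac{1}{\lambda - \zeta}\, d\tilde\mu(\lambda)\, d\zeta = \int_{-\infty}^{\infty} \ln\Big(\frac{\lambda - E_0}{\lambda - z}\Big)\, d\tilde\mu(\lambda). \tag{3.15}
\]
In particular, note that $-\mathrm{Re}(g(z, \infty))$ is the Green’s function of the upper sheet $\Pi_+$ with pole at $\infty_+$ and $\tilde\mu$ is the equilibrium measure of the spectrum (see [30], Thm. III.37). We will abbreviate $g(z) = g(z, \infty)$. The asymptotic expansion of $\exp(g(z))$ is given by ([28], (9.42))
\[
\exp\Big(\int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-}\Big) = -\frac{\tilde a}{z}\Big(1 + \frac{\tilde b}{z} + O\Big(\frac{1}{z^2}\Big)\Big), \qquad z \to \infty, \tag{3.16}
\]
where $\tilde a$ is the capacity of the spectrum and
\[
\tilde b = \frac12 \sum_{j=0}^{2g+1} E_j - \sum_{j=1}^{g} \lambda_j. \tag{3.17}
\]

Theorem 3.3. The map $g$ is a bijection from the upper (resp. lower) half plane $\mathbb{C}^\pm = \{z \in \mathbb{C} \mid \pm\mathrm{Im}(z) > 0\}$ to
\[
S^\pm = \{z \in \mathbb{C} \mid \pm\mathrm{Re}(z) < 0,\ 0 < \mathrm{Im}(z) < \pi\} \setminus \bigcup_{j=1}^{g} [g(\lambda_j), g(E_{2j+1})]
\]
such that
\[
\sigma(H_q) = \{z \mid \mathrm{Re}(z) = 0\}. \tag{3.18}
\]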
Proof. By the Herglotz property of its integrand, the function $g(z, \infty)$ satisfies the conditions of [23], Theorem 1(b) in Chapter VI, which shows that it is one-to-one. To prove that $g(z, \infty)$ is surjective, it suffices to show that the boundary of $\mathbb{C}^+$ is mapped to the boundary of $S^+$. Note that $g(\lambda)$ is negative for $\lambda < E_0$ and purely imaginary for $\lambda \in [E_0, E_1]$. At $E_1$, the real part starts to decrease from zero until it hits its minimum at $\lambda_1$ and increases again until it becomes 0 at $E_2$ (since all $a$-periods are zero), while the imaginary part remains constant. Proceeding like this we move along the boundary of $S^+$ as $\lambda$ moves along the real line. For $\lambda > E_{2g+1}$, $g(\lambda)$ is again negative.

Remark 3.4. In the special case where $H_q$ is periodic the quasi-momentum is given by $w(z) = \exp(i N^{-1} \arccos \Delta(z))$, where $\Delta(z)$ is the Floquet discriminant, and our result is due to [24].

Therefore the map
\[
w : \mathbb{C}^\pm \to W^\pm = \{w \in \mathbb{C} \mid |w| < 1,\ \pm\mathrm{Im}(w) > 0\} \setminus \bigcup_{j=1}^{g} [w(\lambda_j), w(E_{2j+1})], \qquad z \mapsto \exp(g(z)), \tag{3.19}
\]
is bijective. Denote $W = W^+ \cup W^- \cup (-1, 1)$, $W_0 = W \setminus \{0\}$. If we identify corresponding points on the slits $[w(\lambda_j), w(E_{2j+1})]$ we obtain a Riemann surface $\mathcal{W}$ which is isomorphic to the upper sheet $\Pi_+$.

Remark 3.5. In [24] the largest band edge $E_{2g+1}$ is chosen for $p_0$ and $w$ will map $\mathbb{C}^\pm \to W^\mp$ in this case. Moreover, in the periodic case the slits $[w(\lambda_j), w(E_{2j+1})]$ appear at equal angles $\frac{2\pi}{N}$, where $N$ is the period.

Since $z \mapsto w(z) = \exp(g(z))$ is a bijection, we consider the functions $\psi_{q,\pm}$ as functions of the new parameter $w$ whenever convenient. For notational simplicity we will write $\psi_{q,\pm}(w, n)$ for $\psi_{q,\pm}(\lambda(w), n)$ and similarly for other quantities. The functions $\psi_{q,\pm}(w, n)$ are meromorphic in $\mathcal{W}$ and continuous up to the boundary with the only possible singularities at the images of the Dirichlet eigenvalues $w(\mu_j)$ and at 0. More precisely, denote by $M_\pm$ the sets of poles (and square root singularities if $\mu_j = E_l$) of the Weyl $m$-functions $\tilde m_\pm(\lambda)$, i.e. $M_+ \cup M_- = \{\mu_j\}_{j=1}^{g}$ (see (3.2) and [28], Sect. 2.1). Note that $\mu_j \in M_+ \cap M_-$ if and only if $\mu_j = E_l$. Then
(B1) $\psi_{q,\pm}(w, n)$ are holomorphic in $\mathcal{W} \setminus (\{w(\mu_j)\}_{j=1}^{g} \cup \{0\})$ and continuous on $\partial W \setminus \{w(\mu_j)\}$.
(B2) $\psi_{q,\pm}(w, n)$ has a simple pole at $w(\mu_j)$ if $\mu_j \in M_\pm \setminus \{E_l\}$, no pole if $\mu_j \notin M_\pm$, and if $\mu_j = E_l$,
\[
\psi_{q,\pm}(w, n) = \pm \frac{i^l C(n)}{w - w_l} + O(1),
\]
where $C(n)$ is bounded and real.
(B3) $\overline{\psi_{q,\pm}(w, n)} = \psi_{q,\mp}(w, n)$ for $|w| = 1$.
(B4) At $w = 0$ the following asymptotics hold
\[
\psi_{q,\pm}(w, n) = (-1)^n \Big(\prod_{m=0}^{n-1}{}^{*}\, a_q(m)\Big)^{\pm 1} \Big(\frac{w}{\tilde a}\Big)^{\pm n} (1 + O(w)).
\]
By Sect. 2.5 of [28] the vector valued functions
\[
U(\lambda, n) = \frac{1}{\sqrt{4 a_q(0)^2\, \pi\, \mathrm{Im}(\tilde m_+(\lambda))}} \begin{pmatrix} \psi_{q,+}(\lambda, n) \\ \psi_{q,-}(\lambda, n) \end{pmatrix} \tag{3.20}
\]
form an orthonormal basis for the Hilbert space $L^2(\sigma(H_q), \mathbb{C}^2, d\lambda)$. The Weyl $m$-functions $\tilde m_\pm(z)$ satisfy (see [28], Eq. (8.95))
\[
\mathrm{Im}(\tilde m_\pm(\lambda)) = \frac{\mp R_{2g+2}^{1/2}(\lambda)}{2i\, a_q(0)^2 \prod_{j=1}^{g} (\lambda - \mu_j)}, \qquad \lambda \in \sigma(H_q). \tag{3.21}
\]
Using our map $w(z) = \exp\big(\int_{p_0}^{(z,+)} \hat\omega_{\infty_+, \infty_-}\big)$ we can transform this into an orthonormal basis on the unit circle.

Lemma 3.6. Both functions $\psi_{q,+}(w, n)$ and $\psi_{q,-}(w, n)$ form orthonormal bases in the Hilbert space $L^2(S^1, \frac{1}{2\pi i}\, d\omega)$, where
\[
d\omega(w) = \prod_{j=1}^{g} \frac{\lambda(w) - \mu_j}{\lambda(w) - \lambda_j}\, \frac{dw}{w}. \tag{3.22}
\]
Proof. Just use
\[
\frac{dw}{dz} = w\, \frac{\prod_{j=1}^{g} (z - \lambda_j)}{R_{2g+2}^{1/2}(z)}. \tag{3.23}
\]
Observe that $d\omega$ is meromorphic on $\mathcal{W}$ with a simple pole at $w = 0$. In particular, there are no poles at $w(\lambda_j)$.

Remark 3.7. In [2] a different normalization is used. To establish the connection observe
\[
\sum_{n=1}^{N} \psi_{q,+}(z, n)\, \psi_{q,-}(z, n) = N \prod_{j=1}^{N-1} \frac{z - \lambda_j}{z - \mu_j} \tag{3.24}
\]
if $H_q$ is periodic with period $N$.

4. Existence of Jost Solutions

With these preparations out of the way, we come to the study of short-range perturbations $H$ of $H_q$ associated with sequences $a$, $b$ satisfying $a(n) \to a_q(n)$ and $b(n) \to b_q(n)$ as $|n| \to \infty$. More precisely, we will make the following assumption throughout this paper.

Hypothesis H.4.1. Let $H$ be a perturbation such that
\[
\sum_{n \in \mathbb{Z}} |n| \Big( |a(n) - a_q(n)| + |b(n) - b_q(n)| \Big) < \infty. \tag{4.1}
\]
We first establish existence of Jost solutions, that is, solutions of the perturbed operator which asymptotically look like the Baker-Akhiezer solutions.

Theorem 4.2. Assume (H.4.1). Then there exist solutions $\psi_\pm(z, \cdot)$, $z \in \mathbb{C}$, of $\tau\psi = z\psi$ satisfying
\[
\lim_{n \to \pm\infty} \big| w(z)^{\mp n} \big(\psi_\pm(z, n) - \psi_{q,\pm}(z, n)\big) \big| = 0, \tag{4.2}
\]
where $\psi_{q,\pm}(z, \cdot)$ are the Baker-Akhiezer functions. Moreover, $\psi_\pm(z, \cdot)$ are continuous (resp. holomorphic) with respect to $z$ whenever $\psi_{q,\pm}(z, \cdot)$ are and inherit the properties (B1) and (B2), where now $\psi_\pm(z, n) = \frac{i^l C_\pm(n)}{\sqrt{z - \mu_j}} + O(1)$. (B4) has to be replaced by
\[
\psi_\pm(z, n) = \frac{z^{\mp n}}{A_\pm(n)} \Big(\prod_{j=0}^{n-1}{}^{*}\, a_q(j)\Big)^{\pm 1} \Big(1 + \Big(B_\pm(n) \pm \sum_{j=1}^{n}{}^{*}\, b_q\big(j - {\textstyle{0 \atop 1}}\big)\Big)\frac{1}{z} + O\Big(\frac{1}{z^2}\Big)\Big), \tag{4.3}
\]
where
\[
A_+(n) = \prod_{j=n}^{\infty} \frac{a(j)}{a_q(j)}, \quad A_-(n) = \prod_{j=-\infty}^{n-1} \frac{a(j)}{a_q(j)}, \quad
B_+(n) = \sum_{m=n+1}^{\infty} (b_q(m) - b(m)), \quad B_-(n) = \sum_{m=-\infty}^{n-1} (b_q(m) - b(m)). \tag{4.4}
\]
Proof. The proof can be done as in the periodic case (see e.g., [17], [25] or [28], Sect. 7.5). The only problem is to show that the second solution at a band edge grows at most linearly. In the periodic case this follows from Floquet theory; here we just use Lemma 3.1.

From this result we obtain a complete characterization of the spectrum of $H$.

Theorem 4.3. Assume (H.4.1). Then we have $\sigma_{ess}(H) = \sigma(H_q)$, the point spectrum of $H$ is finite and confined to the spectral gaps of $H_q$, that is, $\sigma_p(H) \subset \mathbb{R} \setminus \sigma(H_q)$. Furthermore, the essential spectrum of $H$ is purely absolutely continuous.

Proof. Again the proof can be done as in the periodic case (see e.g., [25] or [28], Sect. 7.5).

5. The Transformation Operator

We define the kernel of the transformation operator as the Fourier coefficients of the Jost solutions $\psi_\pm(w, n)$ with respect to the orthonormal system given in Lemma 3.6, $\{\psi_{q,\pm}(w, n)\}_{n \in \mathbb{Z}}$,
\[
K_\pm(n, m) := \frac{1}{2\pi i} \int_{|w| = 1} \psi_\pm(w, n)\, \psi_{q,\mp}(w, m)\, d\omega(w). \tag{5.1}
\]
Scattering Theory for Jacobi Operators
By the Cauchy theorem, this integral equals the residue at $w = 0$,
\[
K_\pm(n,m) = \mathrm{Res}_0\, \frac{1}{w}\, \psi_\pm(w,n)\,\psi_{q,\mp}(w,m). \tag{5.2}
\]
In particular, since $\psi_\pm(w,n)\psi_{q,\mp}(w,m) = O(w^{\pm(n-m)})$, we conclude
\[
K_\pm(n,m) = 0, \qquad \pm(m-n) < 0. \tag{5.3}
\]
Lemma 5.1. Assume H.4.1. The Jost solutions $\psi_\pm(w,n)$ can be represented as
\[
\psi_\pm(w,n) = \sum_{m=n}^{\pm\infty} K_\pm(n,m)\,\psi_{q,\pm}(w,m), \qquad |w| = 1, \tag{5.4}
\]
where the kernels $K_\pm(n,.)$ satisfy $K_\pm(n,m) = 0$ for $\pm m < \pm n$ and
\[
|K_\pm(n,m)| \le C \sum_{j=[\frac{m+n}{2}]\pm 1}^{\pm\infty} \Big( |a(j) - a_q(j)| + |b(j) - b_q(j)| \Big), \qquad \pm m > \pm n. \tag{5.5}
\]
The constant $C$ depends only on $H_q$ and the value of the sum in (4.1).

Proof. We prove the estimate for $K_+(n,m)$ and omit "+" and "$z$" whenever possible. Define $\varphi(n) = \psi(n)K(n,n)^{-1}$, then $\varphi$ fulfills
\[
\varphi(n) = \psi_q(n) + \sum_{m=n+1}^{\infty} J(n,m)\,\varphi(m), \tag{5.6}
\]
where
\[
J(z,n,m) = \tilde a(m-1)\, \frac{s_q(z,n,m-1)}{a_q(m-1)} + \tilde b(m)\, \frac{s_q(z,n,m)}{a_q(m)} \tag{5.7}
\]
with the abbreviation
\[
\tilde a(m) = \frac{a(m)^2}{a_q(m)} - a_q(m), \qquad \tilde b(m) = b(m) - b_q(m). \tag{5.8}
\]
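On the other hand, $\varphi(n)$ is given by
\[
\varphi(n) = \sum_{m=n}^{\infty} \kappa(n,m)\,\psi_q(m), \qquad \kappa(n,m) = \frac{K(n,m)}{K(n,n)},
\]
therefore, since $\kappa(n,n) = 1$,
\[
\sum_{m=n+1}^{\infty} \kappa(n,m)\,\psi_q(m) = \sum_{m=n+1}^{\infty} J(n,m)\,\psi_q(m) + \sum_{m=n+1}^{\infty} \sum_{l=m+1}^{\infty} J(n,m)\,\kappa(m,l)\,\psi_q(l). \tag{5.9}
\]
Multiplying both sides of (5.9) by $\psi_{q,-}(k)$ and integrating over the unit circle yields
\[
\kappa(n,k) = \sum_{m=n+1}^{\infty} \Gamma(n,m,m,k) + \sum_{m=n+1}^{\infty} \sum_{l=m+1}^{\infty} \Gamma(n,m,l,k)\,\kappa(m,l), \tag{5.10}
\]
where
\[
\Gamma(n,m,l,k) = \frac{1}{2\pi i} \oint_{|w|=1} J(w,n,m)\,\psi_{q,+}(w,l)\,\psi_{q,-}(w,k)\,d\omega(w). \tag{5.11}
\]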
Using [28], (1.50), sq (n, m) ψq,+ (m)ψq,− (n) − ψq,+ (n)ψq,− (m) = , a(m) W (ψq,+ , ψq,− )
(5.12)
˜ (n, m, l, k) = b(m) ˜ q (n, m, l, k) + a(m) q (n, m − 1, l, k)
(5.13)
we obtain
with q (n, m, l, k) = 0 (m, n, l, k) − 0 (n, m, l, k), ψq,+ (w, n)ψq,− (w, m)ψq,+ (w, l)ψq,− (w, k) 1 0 (n, m, l, k) = dω(w) 2πi w(γ ) W (ψq,+ (w), ψq,− (w))
ψq,+ (z, n)ψq,− (z, m)ψq,+ (z, l)ψq,− (z, k) (z − µj ) 1 = dz 1/2 2πi γ W (ψq,+ (z), ψq,− (z)) R2g+2 (z) ψq,+ (z, n)ψq,− (z, m)ψq,+ (z, l)ψq,− (z, k) 1 dz. (5.14) = 2πi γ W (ψq,+ (z), ψq,− (z))2 Here γ is a path on the upper sheet encircling the spectrum. The integrand of 0 is meromorphic on the Riemann surface M with poles of order one at Ej and poles of order O(z±(n−m+l−k)−2 ) near ∞± (there are no poles at the Dirichlet eigenvalues µj ). We apply the residue theorem twice, first on the side of γ including ∞+ , then on the other side including the spectrum (and thus ∞− ), 0 (n, m, l, k) = −Res∞+ = Res∞−
ψq,+ (n)ψq,− (m)ψq,+ (l)ψq,− (k) W (ψq,+ , ψq,− )2 2g+1
ψ (n)ψ (m)ψ (l)ψ (k) q,+ q,− q,+ q,− . + ResEj W (ψq,+ , ψq,− )2 j =0
(5.15) The order of the poles at ∞± implies 0 (n, m, l, k) =
2g+1
j =0
ResEj
ψq,+ (n)ψq,− (m)ψq,+ (l)ψq,− (k) W (ψq,+ ,ψq,− )2
0
n−m+l−k <0 n − m + l − k ≥ 0,
which shows that 0 (n, m, l, k) is real and bounded since ψq,+ (E, .) = ψq,− (E, .) are (if µj = El , use (B2)). Together with (5.14) this yields 0 (n, m, l, k) = −0 (m, n, k, l) = −0 (m, n, k, l) = −0 (n, m, k, l).
(5.16)
Moreover, l − k ≥ |m − n|, q (n, m, l, k) = 0, q (n, m, l, k) = −q (m, n, k, l) = q (n, m, k, l),
(5.17)
which then implies 2g+1 ψ (n)ψ (m)ψq,+ (l)ψq,− (k) sign(n−m) ResEj q,+ Wq,− |l − k| < |m − n| (ψq,+ ,ψq,− )2 q (n, m, l, k) = j =0 0 |l − k| ≥ |m − n| (5.18) and (n, m, l, k) = 0 for |l − k| ≥ m − n if m > n. Note that the residue at Ej is given by
g 2 =1 (Ej − µ )2
(5.19) ψq (Ej , n)ψq (Ej , m)ψq (Ej , l)ψq (Ej , k). =j (Ej − E ) Now we obtain for κ(n, k), ∞
κ(n, k) =
(n, m, m, k) +
m=n+1 ∞
=
∞
∞
(n, m, l, k)κ(m, l)
m=n+1 l=m+1
(n, m, m, k) +
∞
m+k−n−1
(n, m, l, k)κ(m, l),
m=n+1 l=n+k−m+1
m=[ n+k 2 ]+1
(5.20) since (n, m, m, k) = 0 only if |m − k| < m − n implying m > n+k 2 . In the third sum of (5.20) we need that |m + δ − k| < m − n for δ ≥ 1 which yields δ < k − n and δ > n + k − 2m. Two remarks might be in order: m + k − n − 1 ≥ n + k − m + 1 since m − n ≥ n − m + 2, and the starting point l = n + k − m + 1 of the third sum actually has a lower limit, namely m ≤ n+k 2 , since we require l ≥ m + 1 for κ(m, l) = 0, 1. Note that ∞ ∞ ˜ (n, m, m, k) ≤ D |b(m) + a(m)| ˜ =: q( ˆ n+k 2 ),
m=[ n+k 2 ]+1 m+k−n−1
m=[ n+k 2 ]+1
˜ |(n, m, l, k)| ≤ D (m − n − 1)|b(m) + a(m)| ˜ =: c(m) ˆ ∈ 1 (Z),
l=n+k−m+1
where D is the estimate provided by (5.18), (5.19). We set up the following iteration procedure κ0 (n, k) =
∞
(n, m, m, k),
m=[ n+k 2 ]+1
κj (n, k) =
∞
m+k−n−1
m=n+1 l=n+k−m+1
(n, m, l, k)κj −1 (m, l).
(5.21)
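Then using induction one has
\[
|\kappa_j(n,k)| \le \hat q\Big(\frac{n+k}{2}\Big)\, \frac{\big( \sum_{m=n+1}^{\infty} \hat c(m) \big)^j}{j!}, \tag{5.22}
\]
and hence the iteration converges and implies the estimate
\[
|\kappa(n,k)| = \Big| \sum_{j=0}^{\infty} \kappa_j(n,k) \Big| \le \hat q\Big(\frac{n+k}{2}\Big) \exp\Big( \sum_{m=n+1}^{\infty} \hat c(m) \Big). \tag{5.23}
\]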
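Associated with $K_\pm(n,m)$ is the operator
\[
(K_\pm f)(n) = \sum_{m=n}^{\pm\infty} K_\pm(n,m)\,f(m), \qquad f \in \ell^\infty_\pm(\mathbb{Z},\mathbb{C}), \tag{5.24}
\]
which acts as a transformation operator for the pair $\tau$, $\tau_q$.

Theorem 5.2. Let $\tau_q$ and $\tau$ be the quasi-periodic and perturbed Jacobi difference expression, respectively. Then
\[
\tau K_\pm f = K_\pm \tau_q f, \qquad f \in \ell^\infty_\pm(\mathbb{Z},\mathbb{C}). \tag{5.25}
\]

Proof. It suffices to show that $H K_\pm = K_\pm H_q$. Indeed,
\[
\begin{aligned}
H K_\pm(n,m) &= \frac{1}{2\pi i} \oint_{|w|=1} H\psi_\pm(w,n)\,\psi_{q,\mp}(w,m)\,d\omega(w) \\
&= \frac{1}{2\pi i} \oint_{|w|=1} \lambda(w)\,\psi_\pm(w,n)\,\psi_{q,\mp}(w,m)\,d\omega(w) \\
&= \frac{1}{2\pi i} \oint_{|w|=1} \psi_\pm(w,n)\,H_q\psi_{q,\mp}(w,m)\,d\omega(w).
\end{aligned} \tag{5.26}
\]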
Lemma 5.3. For $n \in \mathbb{Z}$ we have
\[
\frac{a(n)}{a_q(n)} = \frac{K_+(n+1,n+1)}{K_+(n,n)} = \frac{K_-(n,n)}{K_-(n+1,n+1)}, \tag{5.27}
\]
\[
\begin{aligned}
b(n) - b_q(n) &= a_q(n)\,\frac{K_+(n,n+1)}{K_+(n,n)} - a_q(n-1)\,\frac{K_+(n-1,n)}{K_+(n-1,n-1)} \\
&= a_q(n-1)\,\frac{K_-(n,n-1)}{K_-(n,n)} - a_q(n)\,\frac{K_-(n+1,n)}{K_-(n+1,n+1)}.
\end{aligned}
\]

Proof. Consider the equation of the transformation operator $H K_\pm = K_\pm H_q$, which is equivalent to (cf. (5.26))
\[
a(n-1)K_\pm(n-1,m) + b(n)K_\pm(n,m) + a(n)K_\pm(n+1,m) = a_q(m-1)K_\pm(n,m-1) + b_q(m)K_\pm(n,m) + a_q(m)K_\pm(n,m+1).
\]
Evaluating at $m = n \mp 1$ (where all but one term on each side vanish by (5.3)) we obtain the first equation, and evaluating at $m = n$ and inserting the first equation we obtain the second.

In particular, observe
\[
K_\pm(n,n) = \frac{1}{A_\pm(n)}, \qquad K_\pm(n,n\pm 1) = \frac{B_\pm(n)}{A_\pm(n)\, a_q\big(n - {\textstyle{0 \atop 1}}\big)}. \tag{5.28}
\]
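As a sanity check of how (5.27) and (5.28) fit together, the following minimal sketch recovers $a(n)$ and $b(n)$ from the diagonal and first off-diagonal of $K_+$. The background coefficients `a_q`, `b_q` and the finitely supported perturbation are hypothetical values chosen only for illustration; with finite support the products and sums in (4.4) reduce to finite ones.

```python
from math import prod

# Hypothetical periodic background (illustration only, not from the paper).
def a_q(n):
    return (1.0, 1.3)[n % 2]

def b_q(n):
    return (0.2, -0.1, 0.4)[n % 3]

# Hypothetical perturbation supported on a few sites inside [LO, HI).
A_PERT = {-1: 0.9, 0: 1.5, 2: 0.8}
B_PERT = {0: 0.7, 1: -0.3}
LO, HI = -5, 6

def a(n):
    return A_PERT.get(n, a_q(n))

def b(n):
    return B_PERT.get(n, b_q(n))

def A_plus(n):
    # A_+(n) from (4.4); factors beyond HI equal 1 and are dropped.
    return prod(a(j) / a_q(j) for j in range(n, HI))

def B_plus(n):
    # B_+(n) from (4.4); terms beyond HI equal 0 and are dropped.
    return sum(b_q(m) - b(m) for m in range(n + 1, HI))

def K_diag(n):
    # K_+(n, n) according to (5.28).
    return 1.0 / A_plus(n)

def K_off(n):
    # K_+(n, n + 1) according to (5.28).
    return B_plus(n) / (A_plus(n) * a_q(n))

def a_from_K(n):
    # First relation of Lemma 5.3.
    return a_q(n) * K_diag(n + 1) / K_diag(n)

def b_from_K(n):
    # Second relation of Lemma 5.3 (the "+" version).
    return (b_q(n) + a_q(n) * K_off(n) / K_diag(n)
            - a_q(n - 1) * K_off(n - 1) / K_diag(n - 1))
```

Calling `a_from_K(n)` and `b_from_K(n)` for `n` in `range(LO, HI)` should reproduce `a(n)` and `b(n)` up to rounding, since the ratios telescope.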
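6. The Scattering Matrix

Let $H_q$ be a given quasi-periodic Jacobi operator and $H$ a perturbation of $H_q$ satisfying Hypothesis H.4.1. To set up scattering theory for the pair $(H, H_q)$ we proceed as usual. The Wronskian of our Jost functions can be evaluated as $n \to \pm\infty$ and is given by
\[
W(\psi_\pm(\lambda), \overline{\psi_\pm(\lambda)}) = W_q(\psi_{q,\pm}(\lambda), \psi_{q,\mp}(\lambda)) = \mp \frac{R_{2g+2}^{1/2}(\lambda)}{\prod_{j=1}^{g} (\lambda - \mu_j)}, \qquad \lambda \in \sigma(H_q). \tag{6.1}
\]
Hence $\psi_\pm(\lambda)$, $\overline{\psi_\pm(\lambda)}$ are linearly independent for $\lambda$ in the interior of $\sigma(H_q)$ and we consider the scattering relations
\[
\psi_\pm(\lambda,n) = \alpha(\lambda)\,\overline{\psi_\mp(\lambda,n)} + \beta_\mp(\lambda)\,\psi_\mp(\lambda,n), \qquad \lambda \in \sigma(H_q), \tag{6.2}
\]
where
\[
\begin{aligned}
\alpha(\lambda) &= \frac{W(\psi_\mp(\lambda), \psi_\pm(\lambda))}{W(\psi_\mp(\lambda), \overline{\psi_\mp(\lambda)})} = \frac{\prod_{j=1}^{g} (\lambda - \mu_j)}{R_{2g+2}^{1/2}(\lambda)}\, W(\psi_-(\lambda), \psi_+(\lambda)), \\
\beta_\pm(\lambda) &= \frac{W(\psi_\mp(\lambda), \overline{\psi_\pm(\lambda)})}{W(\psi_\pm(\lambda), \overline{\psi_\pm(\lambda)})} = \mp \frac{\prod_{j=1}^{g} (\lambda - \mu_j)}{R_{2g+2}^{1/2}(\lambda)}\, W(\psi_\mp(\lambda), \overline{\psi_\pm(\lambda)}).
\end{aligned} \tag{6.3}
\]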
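While $\alpha(\lambda)$ is only defined for $\lambda \in \sigma(H_q)$, (6.3) may be used as a definition for $\lambda \in \mathbb{C}\setminus\{E_j\}$. Therefore $\alpha(w)$ can be continued as a holomorphic function on $W$ and it is continuous up to the boundary except possibly at the band edges.

Remark 6.1. Note that $\alpha(\lambda)$ does not depend on the normalization of $\psi_\pm(\lambda)$ at the base point $n_0 = 0$ whereas $\beta_\pm = \beta_{\pm,0}$ does. Using $\psi_\pm(z,n,n_0) = \psi_{q,\pm}(z,n_0)^{-1}\psi_\pm(z,n)$ and
\[
W(\psi_+(\lambda), \psi_-(\lambda)) = \prod_{j=1}^{g} \frac{\lambda - \mu_j(n_0)}{\lambda - \mu_j}\; W(\psi_+(\lambda,.,n_0), \psi_-(\lambda,.,n_0))
\]
we see
\[
\beta_{\pm,0}(\lambda) = \frac{\psi_{q,\mp}(\lambda,n_0)}{\psi_{q,\pm}(\lambda,n_0)}\, \beta_{\pm,n_0}(\lambda). \tag{6.4}
\]

A direct calculation shows
\[
\alpha(\overline{w}) = \overline{\alpha(w)}, \qquad \beta_\pm(\overline{w}) = \overline{\beta_\pm(w)} = -\beta_\mp(w), \tag{6.5}
\]
and the Plücker identity (cf. [28], (2.169)) implies
\[
|\alpha(w)|^2 = 1 + |\beta_\pm(w)|^2, \qquad |w| = 1. \tag{6.6}
\]
We will denote the eigenvalues of $H$ by
\[
\sigma_p(H) = \{\rho_j\}_{j=1}^{q}. \tag{6.7}
\]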
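Our next aim is to study the behavior of $\alpha(\lambda)$ at the eigenvalues $\rho_j$, therefore we modify the Jost solutions $\psi_\pm(\lambda,n)$ according to their poles at $\mu_j$ and define the following eigenfunctions $\hat\psi_\pm(\lambda,.)$:
\[
\hat\psi_+(\lambda,.) = \prod_{\mu_l \in M_+} (\lambda - \mu_l)\; \psi_+(\lambda,.), \qquad
\hat\psi_-(\lambda,.) = \prod_{\mu_l \in M_- \setminus \{E_j\}} (\lambda - \mu_l)\; \psi_-(\lambda,.). \tag{6.8}
\]
Define $\hat\psi_{q,\pm}(\lambda,.)$ accordingly. Moreover, $\hat\psi_\pm(\rho_j,n) = c_j^\pm\, \hat\psi_\mp(\rho_j,n)$ with $c_j^+ c_j^- = 1$. The norming constants $\gamma_{\pm,j}$ are defined by
\[
\frac{1}{\gamma_{\pm,j}} = \sum_{m\in\mathbb{Z}} |\hat\psi_\pm(\rho_j,m)|^2. \tag{6.9}
\]
To compute the derivative of $\alpha(\lambda)$ at $\rho_j$, note that
\[
\alpha(\lambda) = \frac{W(\hat\psi_-(\lambda), \hat\psi_+(\lambda))}{R_{2g+2}^{1/2}(\lambda)}. \tag{6.10}
\]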
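By virtue of [28], Lemma 2.4,
\[
\frac{d}{d\lambda} W(\hat\psi_-(\lambda), \hat\psi_+(\lambda)) \Big|_{\rho_j} = -\sum_{k\in\mathbb{Z}} \hat\psi_-(\rho_j,k)\,\hat\psi_+(\rho_j,k) = -\frac{1}{c_j^\pm \gamma_{\pm,j}}. \tag{6.11}
\]
Therefore
\[
\frac{d}{d\lambda} \alpha(\lambda) \Big|_{\rho_j} = \frac{W'(\hat\psi_-(\rho_j), \hat\psi_+(\rho_j))}{R_{2g+2}^{1/2}(\rho_j)} = \frac{-1}{c_j^\pm \gamma_{\pm,j}\, R_{2g+2}^{1/2}(\rho_j)}, \tag{6.12}
\]
where $W'$ denotes the $\lambda$-derivative of the Wronskian evaluated in (6.11). From (6.12) we obtain a connection between the left and right norming constants
\[
\gamma_{+,j}\,\gamma_{-,j} = \frac{1}{(\alpha'(\rho_j))^2\, R_{2g+2}(\rho_j)}. \tag{6.13}
\]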
As a last preparation, we study the behavior of α(w) as w → 0. By (4.3), W (ψ− (w), ψ+ (w)) =
1 aw ˜ −1 + O(w) A
(6.14)
with A = A− (0)A+ (0) and 1/2
R2g+2 (λ(w)) = aw ˜ −1 + O(1),
g (λ(w) − λ ) j j =1
(6.15)
therefore α −1 (w) is bounded at 0 with α(0) =
∞ aq (j ) . a(j )
j =−∞
(6.16)
We now define the scattering matrix
\[
S(w) = \begin{pmatrix} T(w) & R_-(w) \\ R_+(w) & T(w) \end{pmatrix}, \qquad |w| = 1, \tag{6.17}
\]
where $T(w) := \alpha^{-1}(w)$ and $R_\pm(w) := \alpha^{-1}(w)\beta_\pm(w)$ are called transmission and reflection coefficients. Equations (6.5) and (6.6) imply

Lemma 6.2. The scattering matrix $S(w)$ is unitary. The coefficients $T(w)$, $R_\pm(w)$ are bounded for $|w| = 1$, continuous for $|w| = 1$ except possibly at $w_l = w(E_l)$, and fulfill
\[
|T(w)|^2 + |R_\pm(w)|^2 = 1, \qquad |w| = 1, \tag{6.18}
\]
\[
T(w)\overline{R_+(w)} + \overline{T(w)}R_-(w) = 0, \qquad |w| = 1, \tag{6.19}
\]
and $\overline{T(w)} = T(\overline{w})$, $\overline{R_\pm(w)} = R_\pm(\overline{w})$ for $|w| = 1$.

Moreover, $R_{2g+2}^{1/2}(w)\,T(w)^{-1}$ is continuous (in particular $T(w)$ can only vanish at $w_l$) and
\[
\begin{aligned}
\lim_{w\to w_l} R_{2g+2}^{1/2}(w)\, \frac{R_\pm(w) + 1}{T(w)} &= 0, \qquad w_l \ne w(\mu_j), \\
\lim_{w\to w_l} R_{2g+2}^{1/2}(w)\, \frac{R_\pm(w) - 1}{T(w)} &= 0, \qquad w_l = w(\mu_j).
\end{aligned} \tag{6.20}
\]
The transmission coefficient $T(w)$ has a meromorphic continuation to $W$ with simple poles at $w(\rho_j)$,
\[
\big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^2 = \gamma_{+,j}\,\gamma_{-,j}\, R_{2g+2}(\rho_j). \tag{6.21}
\]
In addition, $T(z) \in \mathbb{R}$ as $z \in \mathbb{R}\setminus\sigma(H_q)$ and
\[
T(0) = \frac{1}{K_+(n,n)K_-(n,n)} = \prod_{j=-\infty}^{\infty} \frac{a(j)}{a_q(j)}, \tag{6.22}
\]
where $K_\pm(n,n)$ are the coefficients of the transformation operators.

Proof. To show (6.20) we use the definition (6.3),
\[
R_{2g+2}^{1/2}(\lambda)\, \frac{R_\pm(\lambda) + 1}{T(\lambda)} = \prod_{j=1}^{g} (\lambda - \mu_j) \Big( W(\psi_-(\lambda), \psi_+(\lambda)) \mp W(\psi_\mp(\lambda), \overline{\psi_\pm(\lambda)}) \Big).
\]
There are two cases to distinguish: If $\mu_j \ne E_l$ then $\psi_\pm$ are continuous and real at $\lambda = E_l$ and the two Wronskians cancel. Otherwise, if $\mu_j = E_l$ they are purely imaginary (by property (B2) of the Jost functions) and the two terms are equal in the limit and add up.

The sets
\[
S_\pm(H) = \{ R_\pm(w),\ |w| = 1;\ (\rho_j, \gamma_{\pm,j}),\ 1 \le j \le q \} \tag{6.23}
\]
are called left/right scattering data for $H$. First we want to show that the transmission coefficient can be reconstructed from either left or right scattering data.
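The Wronskian formulas (6.3) and the identities (6.18), (6.19) can be checked numerically in the simplest special case. The sketch below assumes a constant background $a_q = 1/2$, $b_q = 0$ (so that $\psi_{q,\pm}(w,n) = w^{\pm n}$ and $\lambda = (w + w^{-1})/2$), which is far more restrictive than the quasi-periodic setting of the text, and a hypothetical finitely supported perturbation passed in as dictionaries. The function name and its arguments are illustrative only.

```python
import cmath

def scattering_coefficients(a_pert, b_pert, theta):
    """Return (T, R_plus, R_minus) at w = exp(i*theta) for a finitely
    supported perturbation of the constant background a_q = 1/2, b_q = 0.
    a_pert, b_pert map site n -> perturbed value of a(n), b(n)."""
    w = cmath.exp(1j * theta)
    lam = (w + 1 / w).real / 2          # spectral parameter, real on |w| = 1
    a = lambda n: a_pert.get(n, 0.5)
    b = lambda n: b_pert.get(n, 0.0)
    sites = list(a_pert) + list(b_pert)
    lo, hi = min(sites) - 2, max(sites) + 2

    # Jost solution psi_+ equals w^n to the right of the support;
    # continue it to the left with the three-term recurrence.
    psi_p = {hi: w ** hi, hi + 1: w ** (hi + 1)}
    for n in range(hi, lo, -1):
        psi_p[n - 1] = ((lam - b(n)) * psi_p[n] - a(n) * psi_p[n + 1]) / a(n - 1)

    # Jost solution psi_- equals w^(-n) to the left of the support;
    # continue it to the right.
    psi_m = {lo - 1: w ** (-(lo - 1)), lo: w ** (-lo)}
    for n in range(lo, hi):
        psi_m[n + 1] = ((lam - b(n)) * psi_m[n] - a(n - 1) * psi_m[n - 1]) / a(n)

    def wronskian(f, g, n=lo):
        # W_n(f, g) = a(n) (f(n) g(n+1) - f(n+1) g(n)), independent of n.
        return a(n) * (f[n] * g[n + 1] - f[n + 1] * g[n])

    conj = lambda f: {n: v.conjugate() for n, v in f.items()}

    # alpha and beta_pm as in (6.3).
    alpha = wronskian(psi_m, psi_p) / wronskian(psi_m, conj(psi_m))
    beta_p = wronskian(psi_m, conj(psi_p)) / wronskian(psi_p, conj(psi_p))
    beta_m = wronskian(psi_p, conj(psi_m)) / wronskian(psi_m, conj(psi_m))
    return 1 / alpha, beta_p / alpha, beta_m / alpha
```

For example, `scattering_coefficients({0: 0.7, 1: 0.4}, {0: 0.3, -1: -0.2}, 0.9)` returns a triple for which $|T|^2 + |R_\pm|^2$ and $T\overline{R_+} + \overline{T}R_-$ can be compared with (6.18) and (6.19). Values of `theta` at the band edges ($0$ and $\pi$) should be avoided, since the Wronskians in the denominators vanish there.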
Let $g(w,w_0)$ be the Green function associated with $W$ and let
\[
\mu(w,w_0)\,dw_0 = \frac{\partial g}{\partial r}(w, re^{i\theta}) \Big|_{r=1^-} e^{i\theta}\,d\theta, \qquad w_0 = e^{i\theta}, \tag{6.24}
\]
be the corresponding harmonic measure on the boundary (see, e.g., [30]). Since $W_0$ is simply connected, we can choose a function $h(w,w_0)$ such that $\hat g(w,w_0) = g(w,w_0) + i h(w,w_0)$ is analytic in $W_0$. Clearly $\hat g$ is only well-defined up to an imaginary constant and it will not be analytic on $W\setminus\{0\}$ in general. Similarly we can find a corresponding $\nu(w,w_0)$ and set $\hat\mu(w,w_0) = \mu(w,w_0) + i\nu(w,w_0)$.

Theorem 6.3. Either one of the sets $S_\pm(H)$ determines the other and $T(w)$ via the Poisson-Jensen type formula
\[
T(w) = \Big( \prod_{j=1}^{q} \exp\big( \hat g(w, w(\rho_j)) \big) \Big) \exp\Big( \frac{1}{2} \int_{|w|=1} \ln\big(1 - |R_\pm(w_0)|^2\big)\, \hat\mu(w,w_0)\,dw_0 \Big), \tag{6.25}
\]
where the constant of $\hat g$ has to be chosen such that $T(0) > 0$, and
\[
\frac{R_-(w)}{\overline{R_+(w)}} = -\frac{T(w)}{\overline{T(w)}}, \qquad \gamma_{+,j}\,\gamma_{-,j} = \frac{\big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^2}{\prod_{l=0}^{2g+1} (\rho_j - E_l)}.
\]

Proof. It suffices to prove the formula for $T(w)$, since evaluating the residua provides $\gamma_{\pm,j}$, together with $\{\lambda_l\}$, $\{E_l\}$. The formula for $T(w)$ holds by [32], Theorem 1, at least when taking absolute values. Since both sides are analytic and have equal absolute values, they can only differ by a constant of absolute value one. But both sides are positive at $w = 0$ and hence this constant is one.

Note that neither the Blaschke factors nor the outer function in (6.25) are single valued on $W$ in general. In particular, the eigenvalues cannot be chosen arbitrarily, which was first observed in [21].

7. The Gel'fand-Levitan-Marchenko Equations

In this section we want to derive a procedure which allows the reconstruction of the Jacobi operator $H$ with asymptotically quasi-periodic coefficients from its scattering data $S_\pm(H)$. This will be achieved by deriving an equation for $K_\pm(n,m)$ which is generally known as Gel'fand-Levitan-Marchenko equation.

Since $K_\pm(n,m)$ are essentially the Fourier coefficients of the Jost solutions $\psi_\pm(w,n)$ we compute the Fourier coefficients of the scattering relations (6.2). Therefore we multiply
\[
T(w)\,\psi_\mp(w,n) = R_\pm(w)\,\psi_\pm(w,n) + \overline{\psi_\pm(w,n)} \tag{7.1}
\]
by $(2\pi i)^{-1}\psi_{q,\pm}(w,m)\,d\omega$, where $\pm m \ge \pm n$, and integrate around the unit circle. First we evaluate the right-hand side of (7.1) using (5.1),
\[
\begin{aligned}
\frac{1}{2\pi i} \oint_{|w|=1} \overline{\psi_+(w,n)}\,\psi_{q,+}(w,m)\,d\omega(w) &= K_+(n,m), \\
\frac{1}{2\pi i} \oint_{|w|=1} R_+(w)\,\psi_+(w,n)\,\psi_{q,+}(w,m)\,d\omega(w) &= \sum_{l=n}^{\infty} K_+(n,l)\,\tilde F^+(l,m),
\end{aligned} \tag{7.2}
\]
where
\[
\tilde F^+(l,m) = \frac{1}{2\pi i} \oint_{|w|=1} R_+(w)\,\psi_{q,+}(w,l)\,\psi_{q,+}(w,m)\,d\omega(w). \tag{7.3}
\]
Note that $\tilde F^+(l,m) = \tilde F^+(m,l)$ is real. To evaluate the left-hand side of (7.1) we use the residue theorem. The only poles are at the eigenvalues and at $0$ if $n = m$, hence
\[
\frac{1}{2\pi i} \oint_{|w|=1} T(w)\,\psi_-(w,n)\,\psi_{q,+}(w,m)\,d\omega(w) = \frac{\delta(n,m)}{K_+(n,n)} + \sum_{j=1}^{q} \mathrm{Res}_{\rho_j} \frac{T(\lambda)\,\hat\psi_-(\lambda,n)\,\hat\psi_{q,+}(\lambda,m)}{R_{2g+2}^{1/2}(\lambda)}.
\]
Here $\delta(n,m)$ is one for $m = n$ and zero else. By (6.12) the residua at the eigenvalues are given by
\[
\mathrm{Res}_{\rho_j} \frac{T(\lambda)\,\hat\psi_-(\lambda,n)\,\hat\psi_{q,+}(\lambda,m)}{R_{2g+2}^{1/2}(\lambda)} = -\gamma_{+,j}\,\hat\psi_+(\rho_j,n)\,\hat\psi_{q,+}(\rho_j,m). \tag{7.4}
\]
Collecting all terms yields
\[
K_\pm(n,m) + \sum_{l=n}^{\pm\infty} K_\pm(n,l)\,\tilde F^\pm(l,m) = \frac{\delta(n,m)}{K_\pm(n,n)} - \sum_{j=1}^{q} \gamma_{\pm,j}\,\hat\psi_\pm(\rho_j,n)\,\hat\psi_{q,\pm}(\rho_j,m) \tag{7.5}
\]
and we have thus proved the following result.

Theorem 7.1. The kernel $K_\pm(n,m)$ of the transformation operator satisfies the Gel'fand-Levitan-Marchenko equation,
\[
K_\pm(n,m) + \sum_{l=n}^{\pm\infty} K_\pm(n,l)\,F^\pm(l,m) = \frac{\delta(n,m)}{K_\pm(n,n)}, \qquad \pm m \ge \pm n, \tag{7.6}
\]
where
\[
F^\pm(l,m) = \tilde F^\pm(l,m) + \sum_{j=1}^{q} \gamma_{\pm,j}\,\hat\psi_{q,\pm}(\rho_j,l)\,\hat\psi_{q,\pm}(\rho_j,m). \tag{7.7}
\]

Defining the Gel'fand-Levitan-Marchenko operator
\[
F_n^\pm f(j) = \sum_{l=0}^{\infty} F^\pm(n \pm l, n \pm j)\,f(l), \qquad f \in \ell^2(\mathbb{N}_0,\mathbb{C}), \tag{7.8}
\]
yields that the Gel'fand-Levitan-Marchenko equation is equivalent to
\[
(1 + F_n^\pm)\,K_\pm(n, n \pm .) = (K_\pm(n,n))^{-1}\,\delta_0. \tag{7.9}
\]
Our next aim is to study the Gel'fand-Levitan-Marchenko operator $F_n^\pm$ in more detail. The structure of the Gel'fand-Levitan-Marchenko equation suggests that the estimate (5.5) for $K_\pm(n,m)$ should imply a similar estimate for $F^\pm(n,m)$.
Lemma 7.2.
\[
|F^\pm(n,m)| \le C \sum_{j=[\frac{n+m}{2}]\pm 1}^{\pm\infty} \Big( |a(j) - a_q(j)| + |b(j) - b_q(j)| \Big), \tag{7.10}
\]
where the constant $C$ is of the same nature as in (5.5).

Proof. We abbreviate the estimate (5.5) for $K_+(n,m)$ by
\[
|K_+(n,m)| \le C\, C_+(n+m), \tag{7.11}
\]
where
\[
C_+(n+m) = \sum_{j=[\frac{n+m}{2}]+1}^{\infty} c(j), \qquad c(j) = |a(j) - a_q(j)| + |b(j) - b_q(j)|.
\]
Note that $C_+(n+1) \le C_+(n)$. Moreover, $C_+(n) \in \ell^1_+(\mathbb{Z})$ since the summation by parts formula (e.g. [28], (1.18))
\[
\sum_{m=n}^{N} g(m)\big( f(m+1) - f(m) \big) = g(N)f(N+1) - g(n-1)f(n) + \sum_{m=n}^{N} \big( g(m-1) - g(m) \big) f(m) \tag{7.12}
\]
implies for $g(m) = m$, $f(m) = C_+(m)$ that
\[
\sum_{m=n}^{\infty} m\, c(m) = (n-1)\,C_+(n) + \sum_{m=n}^{\infty} C_+(m), \tag{7.13}
\]
where we used limn→∞ n C+ (n + 1) ≤ limn→∞ ∞ m=n m c(m) = 0. Solving the GLMequation (7.6) for F + (n, m), m > n, we obtain ∞ 1 K+ (n, l)F + (l, m) |K+ (n, m)| + |F + (n, m)| ≤ K+ (n, n) l=n+1 ∞ + C+ (n + l) F (l, m) , ≤ C1 (n) C+ (n + m) + l=n+1
(n, n)|−1
→ C for n → ∞ (see (5.28)). For n large enough, i.e. where C1 (n) = C |K+ C1 (n)C+ (2n) < 1, we apply the discrete Gronwall-type inequality [28], Lemma 10.8, ∞ C1 (l)C+ (l + m)C+ (n + l) + |F (n, m)| ≤ C1 (n) C+ (n + m) +
l k=n+1 (1 − C1 (k)C+ (n + k)) l=n+1 ∞ C1 (k)C+ (n + l) , ≤ C1 (n)C+ (n + m) 1 +
l k=n+1 (1 − C1 (n)C+ (n + k)) l=n+1 (7.14) which finishes the proof.
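The summation by parts formula (7.12) used in this proof is a finite identity and can be checked directly. The following minimal sketch evaluates both sides for given sequences; the particular choice of `f` below is a hypothetical summable sequence picked only for illustration, while `g(m) = m` is the choice made in the proof.

```python
def sbp_lhs(g, f, n, N):
    # Left-hand side of (7.12): sum of g(m) (f(m+1) - f(m)) for m = n..N.
    return sum(g(m) * (f(m + 1) - f(m)) for m in range(n, N + 1))

def sbp_rhs(g, f, n, N):
    # Right-hand side of (7.12): boundary terms plus the shifted sum.
    boundary = g(N) * f(N + 1) - g(n - 1) * f(n)
    return boundary + sum((g(m - 1) - g(m)) * f(m) for m in range(n, N + 1))

def g(m):
    return m

def f(m):
    # Hypothetical decaying sequence (not taken from the paper).
    return 1.0 / (1 + m) ** 3
```

With `g(m) = m` the shifted sum on the right reduces to minus the sum of `f(m)`, which is what produces the two terms on the right of (7.13) once the boundary term at `N` tends to zero.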
Furthermore,

Lemma 7.3. Let $F^\pm(n,m)$ be solutions of the Gel'fand-Levitan-Marchenko equation. Then
\[
\sum_{n=n_0}^{\pm\infty} |n|\, \big| F^\pm(n,n) - F^\pm(n \pm 1, n \pm 1) \big| < \infty, \tag{7.15}
\]
\[
\sum_{n=n_0}^{\pm\infty} |n|\, \big| a_q(n)F^\pm(n,n+1) - a_q(n-1)F^\pm(n-1,n) \big| < \infty. \tag{7.16}
\]
Proof. We first prove (7.16) for $F^+$. Lemma 5.3 implies
\[
b(n) - b_q(n) = a_q(n)\,\kappa_{+,1}(n) - a_q(n-1)\,\kappa_{+,1}(n-1), \tag{7.17}
\]
where
\[
\kappa_{+,j}(n) := \kappa_+(n, n+j) := \frac{K_+(n,n+j)}{K_+(n,n)}. \tag{7.18}
\]
Abbreviate $F_j^+(n) = F^+(n+j,n)$. With this notation, the GLM-equation (7.6) reads
\[
\kappa_{+,l}(n) + F_l^+(n) + \sum_{j=1}^{\infty} \kappa_{+,j}(n)\,F_{j-l}^+(n+l) = \frac{\delta(l,0)}{K_+(n,n)^2}, \qquad l \ge 0. \tag{7.19}
\]
Insert the GLM-equation for $F^+(n,n+1)$, $F^+(n-1,n)$ (recall $F^+(n,m) = F^+(m,n)$),
\[
\begin{aligned}
a_q(n)F_1^+(n) - a_q(n-1)F_1^+(n-1) ={}& -a_q(n)\,\kappa_{+,1}(n) + a_q(n-1)\,\kappa_{+,1}(n-1) \\
& - \sum_{j=1}^{\infty} \Big( a_q(n)\,\kappa_{+,j}(n)\,F_{j-1}^+(n+1) - a_q(n-1)\,\kappa_{+,j}(n-1)\,F_{j-1}^+(n) \Big).
\end{aligned} \tag{7.20}
\]
Since −aq (n)κ+,1 (n) + aq (n − 1)κ+,1 (n − 1) = bq (n) − b(n) the only interesting part is the sum. For N, J < ∞, N n=n0
=
n
aq (n)κ+,j (n)Fj+−1 (n + 1) − aq (n − 1)κ+,j (n − 1)Fj+−1 (n)
j =1
N J j =1 n=n0
=
J
n aq (n)κ+,j (n)Fj+−1 (n + 1) − aq (n − 1)κ+,j (n − 1)Fj+−1 (n)
J j =1
Naq (N )κ+,j (N )Fj+−1 (N + 1) − (n0 − 1)aq (n0 − 1)κ+,j (n0 − 1)Fj+−1 (n0 )
+
N n=n0
(−1)aq (n − 1)κ+,j (n − 1)Fj+−1 (n) ,
(7.21)
where we used the summation by parts. Estimates (7.11), (7.14) imply for the first summand J J ˜ + (2N + j )C+ (2N + j + 1) Naq (N )κ+,j (N)Fj+−1 (N + 1) ≤ |N |aq (N )CC j =1
j =1
ˆ + (2N + 1), ≤ |N |aq (N )CC which holds uniformly in J , and (compare (7.13)) ˆ + (2N + 1) = 0. lim N aq (N )CC
N→∞
(7.22)
Moreover, lim
N,J →∞
≤ ≤
J N aq (n − 1)κ+,j (n − 1)Fj+−1 (n) j =1 n=n0
lim
N,J →∞
N J aq (n − 1)κ+,j (n − 1)Fj+−1 (n) j =1 n=n0
∞ ∞
˜ + (2n + j )C+ (2n + j + 1) < ∞. aq (n − 1)CC
j =1 n=n0
Therefore |n||aq (n)F + (n, n + 1) − aq (n − 1)F + (n − 1, n)| ∈ 1+ (Z) as desired. To apply Lemma 5.3 for F − use the symmetry property F − (n, m) = F − (m, n). For (7.15), inserting the GLM-equation yields −2 −2 F + (n, n) − F + (n + 1, n + 1) = K+ (n, n) − K+ (n + 1, n + 1) ∞
+ κ+,j (n + 1)Fj+ (n + 1) − κ+,j (n)Fj+ (n) . j =1
By (5.28), ∞ |a(n) + a (n)| a(j )2 q −2 −2 (n + 1, n + 1) ≤ |a(n) − aq (n)| K+ (n, n) − K+ a(n)2 aq (j )2 j =n+1
≤ C|a(n) − aq (n)|, and the same considerations as above imply (7.15).
(7.23)
Remark 7.4. The Gel'fand-Levitan-Marchenko equation is symmetric in $K_\pm(n,m)$ and $F^\pm(n,m)$, therefore we can invert the analysis done in Lemma 7.3 and obtain estimates for $K_\pm(n,m)$ starting with an analogue of estimate (7.10) for $F^\pm(n,m)$ and the estimates (7.15), (7.16) (cf. Lemma 8.1).

Theorem 7.5. For $n \in \mathbb{Z}$, the Gel'fand-Levitan-Marchenko operator $F_n^\pm : \ell^2 \to \ell^2$ is Hilbert-Schmidt. Moreover, $1 + F_n^\pm$ is positive and hence invertible. In particular, the Gel'fand-Levitan-Marchenko equation (7.9) has a unique solution and $S_+(H)$ or $S_-(H)$ uniquely determine $H$.
Proof. That Fn± is Hilbert-Schmidt is a straightforward consequence of our estimate in Lemma 7.2. Let f ∈ 2 (N0 ) be real (which is no restriction since F + (n, l) is real and the real ∞and imaginary part of (7.24) could be treated separately) and abbreviate fn (w) = j =0 f (j )ψq,+ (w, n + j ). Then ∞
f (j )Fn+ f (j ) =
j =0
=
1 2π i +
∞
f (j )
j =0 ∞
|w|=1
R+ (w)
∞
F + (n + j, n + l)f (l)
l=0
f (j )ψq,+ (w, n + j )ψq,+ (w, n + l)f (l) dω(w)
j,l=0
q ∞
f (j )γ+,k ψˆ q,+ (ρk , n + j )ψˆ q,+ (ρk , n + l)f (l)
k=1 j,l=0
1 = 2π i =
1 2π i
|w|=1
|w|=1
R+ (w)fn (w)fn (w) dω(w) +
q
γ+,k |fˆn (ρk )|2
k=1
R˜ + (w)|fn (w)|2 dω(w) +
q
γ+,k |fˆn (ρk )|2 ,
(7.24)
k=1
−1 where R˜ + (w) = R+ (w)fn (w) fn (w) with |R˜ + (w)| = |R+ (w)| and fˆn (w) = ∞ ˜ ˆ j =0 f (j )ψq,+ (w, n+j ). The integral over the imaginary part vanishes since R+ (w) = R˜ + (w) and we replace the real part by 1 1 |1 + R˜ + (w)|2 − 1 − |R˜ + (w)|2 = |1 + R˜ + (w)|2 + |T (w)|2 − 1, 2 2 1 2 (recall |R˜ + (w)|2 +|T (w)|2 = 1). This yields using |f (j )|2 = 2πi |w|=1 |fn (w)| dω,
Re(R˜ + (w)) =
∞ j =0
f (j )(1 + Fn+ )f (j ) =
q
γ+,k |fˆn (ρk )|2
k=1
+
1 4πi
|w|=1
|1 + R˜ + (w)|2 + |T (w)|2 |fn (w)|2 dω(w), (7.25)
which establishes $1 + F_n^+ \ge 0$. According to Lemma 6.2, $|T(w)|^2 > 0$ a.e., therefore $-1$ is not an eigenvalue and $1 + F_n^+ \ge \varepsilon_n$ for some $\varepsilon_n > 0$.

To finish the direct scattering step for the Jacobi operator $H$ with asymptotically quasi-periodic coefficients we summarize the properties of the scattering data $S_\pm(H)$.

Hypothesis H.7.6. The scattering data
\[
S_\pm(H) = \{ R_\pm(w),\ |w| = 1;\ (\rho_j, \gamma_{\pm,j}),\ 1 \le j \le q \} \tag{7.26}
\]
satisfy the following conditions:
(i) The reflection coefficients $R_\pm(w)$ are continuous except possibly at $w_l = w(E_l)$ and fulfill
\[
\overline{R_\pm(w)} = R_\pm(\overline{w}). \tag{7.27}
\]
Moreover, $|R_\pm(w)| < 1$ for $w \ne w_l$ and
\[
1 - |R_\pm(w)|^2 \ge C \prod_{l=0}^{2g+1} |w - w_l|^2. \tag{7.28}
\]
The Fourier coefficients
\[
\tilde F^\pm(l,m) = \frac{1}{2\pi i} \oint_{|w|=1} R_\pm(w)\,\psi_{q,\pm}(w,l)\,\psi_{q,\pm}(w,m)\,d\omega(w) \tag{7.29}
\]
satisfy
\[
|\tilde F^\pm(n,m)| \le \sum_{j=n+m}^{\pm\infty} q(j), \qquad q(j) \ge 0, \qquad |j|\,q(j) \in \ell^1(\mathbb{Z}),
\]
\[
\sum_{n=n_0}^{\pm\infty} |n|\, \big| \tilde F^\pm(n,n) - \tilde F^\pm(n \pm 1, n \pm 1) \big| < \infty,
\]
\[
\sum_{n=n_0}^{\pm\infty} |n|\, \big| a_q(n)\tilde F^\pm(n,n+1) - a_q(n-1)\tilde F^\pm(n-1,n) \big| < \infty.
\]

(ii) The values $\rho_j \in \mathbb{R}\setminus\sigma(H_q)$, $1 \le j \le q$, are distinct and the norming constants $\gamma_{\pm,j}$, $1 \le j \le q$, are positive.

(iii) $T(w)$ defined via Eq. (6.25) extends to a single valued function on $W$ (i.e., it has equal values on the corresponding slits).

(iv) Transmission and reflection coefficients satisfy
\[
\begin{aligned}
\lim_{w\to w_l} (w - w_l)\, \frac{R_\pm(w) + 1}{T(w)} &= 0, \qquad w_l \ne w(\mu_j), \\
\lim_{w\to w_l} (w - w_l)\, \frac{R_\pm(w) - 1}{T(w)} &= 0, \qquad w_l = w(\mu_j),
\end{aligned} \tag{7.30}
\]
and the consistency conditions
\[
\frac{R_-(w)}{\overline{R_+(w)}} = -\frac{T(w)}{\overline{T(w)}}, \qquad \gamma_{+,j}\,\gamma_{-,j} = \frac{\big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^2}{\prod_{l=0}^{2g+1} (\rho_j - E_l)}.
\]

Remark 7.7. Note that (7.28) implies that $\ln(1 - |R_\pm(w)|^2)$ is integrable and ensures that (6.25) is well-defined, at least as a multi-valued function. Condition (iii), which is void in the constant background case, shows that the reflection coefficient and the eigenvalues cannot be chosen independently of each other.
8. Inverse Scattering Theory

In this section we want to invert the process of scattering theory, that is, we want to reconstruct the operator $H$ from a given set $S_\pm$ and a given quasi-periodic Jacobi operator $H_q$. If $S_\pm$ (satisfying H.7.6 (i)–(ii)) and $H_q$ are known, we can construct $F^\pm(l,m)$ via formula (7.7) and thus derive the Gel'fand-Levitan-Marchenko equation, which has a unique solution by Theorem 7.5. This solution
\[
K_\pm(n,n) = \langle \delta_0, (1 + F_n^\pm)^{-1}\delta_0 \rangle^{1/2}, \qquad
K_\pm(n, n \pm j) = \frac{1}{K_\pm(n,n)}\, \langle \delta_j, (1 + F_n^\pm)^{-1}\delta_0 \rangle \tag{8.1}
\]
is the kernel of the transformation operator. Since $1 + F_n^\pm$ is positive, $K_\pm(n,n)$ is positive and we can set in accordance with Lemma 5.3,
\[
\begin{aligned}
a_+(n) &= a_q(n)\, \frac{K_+(n+1,n+1)}{K_+(n,n)}, \\
a_-(n) &= a_q(n)\, \frac{K_-(n,n)}{K_-(n+1,n+1)}, \\
b_+(n) &= b_q(n) + a_q(n)\, \frac{K_+(n,n+1)}{K_+(n,n)} - a_q(n-1)\, \frac{K_+(n-1,n)}{K_+(n-1,n-1)}, \\
b_-(n) &= b_q(n) + a_q(n-1)\, \frac{K_-(n,n-1)}{K_-(n,n)} - a_q(n)\, \frac{K_-(n+1,n)}{K_-(n+1,n+1)}.
\end{aligned} \tag{8.2}
\]
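The reconstruction step (8.1), (8.2) can be illustrated in a special case that is much simpler than the quasi-periodic setting of the text. The sketch below assumes a constant background $a_q = 1/2$, $b_q = 0$ with $\psi_{q,+}(w,n) = w^n$, vanishing reflection coefficient, and a single eigenvalue at $w = \kappa \in (-1,1)$ with norming constant $\gamma$, so that (7.7) gives $F^+(l,m) = \gamma\kappa^{l+m}$. All names (`kappa`, `gamma`, `size`) are illustrative; the infinite operator $F_n^+$ is truncated to a finite matrix, and the closed form used for comparison follows from the rank-one structure of $F_n^+$ via the Sherman-Morrison formula.

```python
import math

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def K_diag(n, kappa, gamma, size=40):
    """K_+(n, n) from (8.1), with F_n^+ truncated to a size x size matrix.
    Assumes F^+(l, m) = gamma * kappa**(l + m) (reflectionless, one eigenvalue)."""
    mat = [[(1.0 if j == l else 0.0) + gamma * kappa ** (2 * n + l + j)
            for l in range(size)] for j in range(size)]
    delta0 = [1.0] + [0.0] * (size - 1)
    x = solve_linear(mat, delta0)      # x = (1 + F_n^+)^{-1} delta_0
    return math.sqrt(x[0])             # <delta_0, x>^{1/2}

def a_plus(n, kappa, gamma):
    # First line of (8.2) with a_q = 1/2.
    return 0.5 * K_diag(n + 1, kappa, gamma) / K_diag(n, kappa, gamma)

def a_closed(n, kappa, gamma):
    # Closed form obtained from the rank-one inverse (Sherman-Morrison).
    c = lambda m: 1 - kappa ** 2 + gamma * kappa ** (2 * m)
    return math.sqrt(c(n) * c(n + 2)) / (2 * c(n + 1))
```

For moderate `|kappa|` the truncation error is of order `kappa**(2*size)`, so `a_plus(n, 0.5, 0.8)` and `a_closed(n, 0.5, 0.8)` should agree to rounding accuracy for small `|n|`; both tend to the background value $1/2$ as $n \to \infty$.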
Let $H_+$, $H_-$ be the associated Jacobi operators.

Lemma 8.1. Suppose a given set $S_\pm$ satisfies H.7.6 (i)–(ii). Then the sequences defined in (8.2) satisfy $n|a_\pm(n) - a_q(n)|$, $n|b_\pm(n) - b_q(n)| \in \ell^1_\pm(\mathbb{N})$. Moreover, $\psi_\pm(\lambda,n) = \sum_{m=n}^{\pm\infty} K_\pm(n,m)\,\psi_{q,\pm}(\lambda,m)$, where $K_\pm(n,m)$ is the solution of the Gel'fand-Levitan-Marchenko equation, satisfies $\tau_\pm\psi_\pm = \lambda\psi_\pm$.

Proof. We only prove the statements for the "+" case. Define $F^+(n,m)$ by (cf. (7.7))
\[
F^+(l,m) = \tilde F^+(l,m) + \sum_{j=1}^{q} \gamma_{+,j}\,\hat\psi_{q,+}(\rho_j,l)\,\hat\psi_{q,+}(\rho_j,m).
\]
Hypothesis H.7.6 (i) implies
\[
|F^+(n,m)| \le C \sum_{j=n+m}^{\infty} q(j) =: C_+(n+m), \tag{8.3}
\]
\[
\sum_{n=n_0}^{\infty} |n|\, \big| F^+(n,n) - F^+(n+1,n+1) \big| < \infty, \tag{8.4}
\]
\[
\sum_{n=n_0}^{\infty} |n|\, \big| a_q(n)F^+(n,n+1) - a_q(n-1)F^+(n-1,n) \big| < \infty, \tag{8.5}
\]
since ψˆ q,+ (ρj , n) decay exponentially as n → ∞ and j γ+,j ψˆ q,+ (ρj , .)ψˆ q,+ (ρj , .) form a telescopic sum. Note that C+ (n + 1) < C+ (n). Set κ+ (n, m) := K+ (n, m)K+ (n, n)−1 . Then as in the proof of Lemma 7.2 we obtain |κ+ (n, m)| ≤ C+ (n + m)(1 + O(1)).
(8.6)
Now we have all estimates at our disposal to prove n|b+ (n) − bq (n)| ∈ 1 (N). By definition (cf. (8.2)), b+ (n) − bq (n) = aq (n)κ+ (n, n + 1) − aq (n − 1)κ+ (n − 1, n).
(8.7)
We insert the GLM-equation for κ+ (n, n + 1), κ+ (n − 1, n) and use estimate (8.5), the summation by parts formula, and estimates (8.3), (8.6) in the same way as in Lemma 7.3. Similarly using (8.4) we see ∞ 1 1 |n| 2 − 2 (8.8) < ∞. K+ (n, n) K+ (n + 1, n + 1) n=n0 Equation (8.2) yields ∞ 1 1 1 a+ (j )2
|a+ (n)2 − aq (n)2 |. − 2 2 = K+ (n, n) K+ (n + 1, n + 1) aq (n)2 aq (j )2 j =n+1 The product converges and therefore |n||a+ (n)2 − aq (n)2 | ∈ 1 (N). Next we consider ψ+ (λ, n). Abbreviate 2 (n)aq−1 (n)κ+ (n + 1, m) (K+ )(n, m) = aq (n − 1)κ+ (n − 1, m) + a+ −aq (m − 1)κ+ (n, m − 1) −aq (m)κ+ (n, m + 1) + (b+ (n) − bq (m))κ+ (n, m).
(8.9)
K+ = 0 is equivalent to the operator equality H+ K+ = K+ Hq , which in turn implies that ψ+ (λ, n) satisfies H+ ψ+ = λψ+ , H+ ψ+ = H+ K+ ψq,+ = K+ Hq ψq,+ = K+ λψq,+ = λK+ ψq,+ = λψ+ .
(8.10)
To show that K+ = 0 we insert the GLM-equation into (8.9) and obtain (K+ )(n, m) +
∞
(K+ )(n, l)F + (l, m) = 0,
m > n + 1.
(8.11)
l=n+1
In the calculations we used aq (n − 1)F + (n − 1, m) + bq (n)F + (n, m) + aq (n)F + (n + 1, m) = aq (m − 1)F + (n, m − 1) + bq (m)F + (n, m) + aq (m)F + (n, m + 1) which follows from (7.7). By Theorem 7.5 Eq. (8.11) has only the trivial solution K+ = 0 and hence the proof is complete. Now we can prove the main result of this section.
Theorem 8.2. Hypothesis H.7.6 is necessary and sufficient for a set S± to be the left/right scattering data of a unique Jacobi operator H associated with sequences a, b satisfying H.4.1. Proof. Necessity has been established in the previous section. By Lemma 8.1, we know existence of sequences a± , b± and corresponding solutions ψ± (w, n) associated with S+ (or S− ). Hence it remains to establish a+ (n) = a− (n) and b+ (n) = b− (n). Consider the following part of the GLM-equation + (n, .) :=
∞
K+ (n, l)F˜ + (l, .) ∈ 1+ (Z).
(8.12)
l=n
Then by use of (7.2) and Lemma 3.6,
+ (n, m)ψq,− (w, m) =
m∈Z
∞ m∈Z
K+ (n, l)F˜ + (l, m) ψq,− (w, m)
l=n
1 = R+ (w)ψ+ (w, n)ψq,+ (w, m)dω(w) ψq,− (w, m) 2πi |w|=1 m∈Z = ψq,− (w, m), R+ (w)ψ+ (w, n)ψq,− (w, m) m∈Z
= R+ (w)ψ+ (w, n).
(8.13)
On the other hand, inserting the GLM-equation yields for |w| = 1, + (n, m)ψq,− (w, m) = m∈Z
=
n−1
+ (n, m)ψq,− (w, m) +
m=−∞ ∞
−
q
γ+,j ψˆ q,+ (ρj , l)ψˆ q,+ (ρj , m) ψq,− (w, m)
j =1
n−1
−1 + (n, m)ψq,− (w, m) + ψq,− (w, n)K+ (n, n) − ψ+ (w, n)
m=−∞ q
∞
j =1
m=n
γ+,j ψˆ + (ρj , n)
−
−1 δ(n, m)K+ (n, n) − K+ (n, m)
m=n
K+ (n, l)
l=n
=
∞
ψˆ q,+ (ρj , m)ψq,− (w, m),
(8.14)
(recall the definition of ψˆ q,± from (6.8)) and therefore T (w)h− (w, n) = ψ+ (w, n) + R+ (w)ψ+ (w, n), where
|w| = 1,
n−1 ψq,− (w, n) ψq,− (w, m) 1 h− (w, n) = + (n, m) + T (w) K+ (n, n) m=−∞ ψq,− (w, n) q Wn−1 (ψˆ q,+ (ρj ), ψq,− (w)) ˆ γ+,j ψ+ (ρj , n) , + ψq,− (w, n)(λ(w) − ρj ) j =1
(8.15)
(8.16)
since Green’s formula ([28], Eq. (1.20)) implies for λ ∈ σ (Hq ), (λ − ρj )
∞
ψˆ q,+ (ρj , m)ψq,− (λ, m) = −Wn−1 (ψˆ q,+ (ρj ), ψq,− (λ)).
m=n
Similarly, we obtain h+ (w, n) =
∞ ψq,+ (w, n) ψq,+ (w, m) 1 − (n, m) + T (w) K− (n, n) ψq,+ (w, n) m=n+1 q Wn (ψˆ q,− (ρj ), ψq,+ (w)) γ−,j ψˆ − (ρj , n) − ψq,+ (w, n)(λ(w) − ρj )
(8.17)
j =1
with − (n, m) =
n
K− (n, l)F˜ − (l, m).
l=−∞
For n ∈ Z, |w| = 1, we see that h∓ (w −1 , n) = h∓ (w, n), since K± (n, m) and ± (n, m) are real. The functions h∓ (w, n) are continuous for |w| = 1, w = w(Ej ), since T −1 (w) is continuous on this set by the Poisson-Jensen formula (6.25) (|R± (w)| < 1 for w = w(Ej ) by H.7.6 (i)) and ψq,∓ (w, m) are continuous on ∂W \{w(µk )}. The functions h∓ (w, n) have a meromorphic continuation to W\{0} with the only possible poles at w(ρj ) and w(µj ). At w(ρj ) there are no poles, due to the zeros of T −1 (w) at w(ρj ). For w = w(µj ) we have the same type of singularity as ψq,± . In summary, h± (w, n) have simple poles at w(µj ) and are continuous at the boundary except possibly at w(Ej ). To study the behavior of h± (w, n) as w → 0, we recall z−1 = −w/a˜ (1 + O(w)). Then w + O(w 2 ) Wn−1 (ψˆ q,+ (ρj ), ψq,− (w)) a˜ (−1)n a˜ n−1 −n+1 w = n−2 (ψˆ q,+ (ρj , n − 1) + O(w)), a (j ) q j =0 −w + O(w 2 ) Wn (ψˆ q,− (ρj ), ψq,+ (w)) a˜
(−1)n nj=0 aq (j ) n+1 w (ψˆ q,− (ρj , n + 1) + O(w)), = a˜ n+1 and property (B4) implies ∓∞
−1 ± (n, m)ψq,∓ (w, m)ψq,∓ (w, n) = O(w),
w → 0.
(8.18)
m=n∓1
We conclude that lim h∓ (w, n)ψq,± (w, n) =
w→0
1 . T (0)K± (n, n)
(8.19)
H.7.6 (iv) and (6.1) imply the following behavior of hˆ ∓ (λ, n) as λ → ρj : lim hˆ ∓ (λ, n) = ±γ±,j ψˆ ± (ρj , n) lim
λ→ρj
λ→ρj
Wn−1 (ψˆ q,± (ρj ), ψˆ q,∓ (λ)) (λ − ρj )T (λ)
2g+1 −1 = γ±,j ψˆ ± (ρj , n) Resρj T (λ) ρj − E l ,
(8.20)
l=0
where hˆ ± are defined as in (6.8). By virtue of the consistency condition T (w)R+ (w) = −T (w)R− (w) we obtain h± (w, n) + R± (w)h± (w, n) =
R (w)
1 ± ψ∓ (w, n) + R∓ (w)ψ∓ (w, n) + ψ∓ (w, n) + R∓ (w)ψ∓ (w, n) = T (w) T (w) 1 R (w) R (w)
R± (w)R∓ (w) ∓ ± = ψ∓ (w, n) + + ψ∓ (w, n) + T (w) T (w) T (w) T (w) |w| = 1. = ψ∓ (w, n)T (w), If we eliminate R∓ (w) from the last equation and (8.15) we see −1/2 T (w)R2g+2 (w) ψˆ + (w, n)ψˆ − (w, n) − hˆ + (w, n)hˆ − (w, n)
j (λ(w)−µj ) h± (w, n)ψ± (w, n)−ψ± (w, n)h± (w, n) =: G(w, n) = 1/2 R2g+2 (w)
(8.21)
for |w| = 1. Observe that G(w, n) = G(w, n) = G(w, n), |w| = 1, since h± ψ± − −1/2 ψ± h± and R2g+2 (w) are odd functions for |w| = 1. The function G(w, n) can be continued analytically on W since the difference ψˆ + ψˆ − − hˆ + hˆ − vanishes at the poles w(ρj ) of T (w) by (8.20). Note that the product ψˆ + ψˆ − and hence also hˆ + hˆ − do not have poles at w(µj ). Moreover, since W is just the image of the upper sheet, we can ˜ by adding the image of the lower sheet. Now extend it to a compact Riemann surface W ˜ by setting G(w, n) = G(w −1 , n) for by G(w, n) = G(w, n) we can extend G to W |w| > 1. Now let us investigate the behavior at the band edges: If wl = w(µj ), we obtain by (7.30), (8.15), and real-valuedness of ψˆ ± at the band edges that 1/2
lim R2g+2 (w)
w→wl
= lim
w→wl
= lim
w→wl
(λ(w) − µj )h∓ (w, n)ψ∓ (w, n) j
1/2 R2g+2 j (λ − µj )
T
1/2 R2g+2 j (λ − µj )
T
ψ± + R ± ψ± ψ∓
(R± + 1)ψ± + ψ± − ψ± ψ∓ = 0.
If wl = w(µj ), the same calculation shows that 1/2
lim R2g+2 (w)
w→wl
(λ(w) − µj )h± (w, n)ψ± (w, n) j 1/2
= (−1)l+1 C+ (n)C− (n) lim R2g+2 (w) w→wl
R± (w) − 1 =0 T (w)
by (7.30), where we used ψ± (w, n) = il C± (n)(λ(w) − µj )−1/2 + O(1). Consequently, R2g+2 (w)G(w, n) is continuous at w = wl and vanishes at the band 1/2 edges. Thus the singularities of R2g+2 (w)G(w, n) at wl are removable. Furthermore, 1/2
R2g+2 (w)G(w, n) is purely imaginary for |w| = 1 and real on the slits and hence must vanish at wl by continuity. So the singularities of G(w, n) at wl are removable as well. ˜ and vanishes at w = 0, that is, G(w, n) ≡ 0 which Thus G is holomorphic on all of W implies (compare (B4))
lim ψ+ (w, n)ψ− (w, n) − h+ (w, n)h− (w, n)
w→0
= K+ (n, n)K− (n, n) − (T (0)2 K+ (n, n)K− (n, n))−1 = 0. −2 Using (8.2) we finally obtain from T (0)2 = K+ (n, n)K− (n, n) that a+ (n) = a− (n) ≡ a(n),
∀n ∈ Z.
(8.22)
It remains to prove b+ (n) = b− (n). Proceeding as for G(w, n) we can show that −1/2 T (w)R2g+2 (w) ψˆ + (w, n)ψˆ − (w, n + 1) − hˆ + (w, n + 1)hˆ − (w, n)
j (λ(w)−µj ) = h+ (w, n + 1)ψ+ (w, n)−ψ+ (w, n)h+ (w, n + 1) 1/2 R2g+2 (w)
(8.23)
is a constant equal to −1/a(n). Thus W (w, n) := a(n) (ψ+ (w, n)ψ− (w, n + 1) − h+ (w, n + 1)h− (w, n)) 1/2
= −
R2g+2 (w)
. T (w) j (λ(w) − µj )
(8.24)
Computing the asymptotics at w = 0 (compare (4.3)) we see 0 = W (w, n) − W (w, n − 1) = and in particular b+ (n) = b− (n) ≡ b(n).
1 (b+ (n) − b− (n)) A
(8.25)
Our operator H has the correct norming constants since as in (6.12) it follows
2g+1 −1 ˆ ˆ ψ+ (ρj , n)ψ− (ρj , n) = Resρj T (λ) ρj − E l ,
n∈Z
(8.26)
l=0
and by (8.20), n∈Z
−1 . ψˆ ± (ρj , n)ψˆ ± (ρj , n) = γ±,j
Acknowledgement. I.E. thanks A. Boutet de Monvel for the kind hospitality of University Paris-7, where part of this work was done. G.T. thanks Peter Yuditskii for several helpful discussions and hints with respect to the literature. We thank Mark Losik for help with respect to literature.
Communicated by B. Simon
Commun. Math. Phys. 264, 843 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1557-0
Erratum

Infinite Volume Limit for the Stationary Distribution of Abelian Sandpile Models

S.R. Athreya (1), A.A. Járai (2)

(1) 7 SJSS marg, Indian Statistical Institute, New Delhi, 110016, India
(2) School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario K1S 5B6, Canada. E-mail: [email protected]

Received: 25 November 2005 / Accepted: 16 December 2005
Erratum published online: 3 April 2006 – © Springer-Verlag 2006

Commun. Math. Phys. 249, 197–213 (2004)
Electronic Supplementary Material: Supplementary material is available in the online version of this article at http://dx.doi.org/10.1007/s00220-006-1557-0 and is accessible for authorized users.

Regrettably, our proof of the main theorem in [1] contains some errors. The results of the paper do hold without change, and the original line of argument can be followed after appropriate modifications. The corrections can be found in the electronic supplementary material to this article. The problems are indicated below.

(a) The way $H_{F,x}$ was defined, the inclusion $\{(F^*, x^*) = (F, x)\} \supset \{T \cap H_{F,x} = F\}$ in (7) may fail. On the event $\{T \cap H_{F,x} = F\}$, there may be descendants of $x$ in $T$ that do not belong to $F$ (but belong to $F^*$). Therefore, we cannot conclude $F^* = F$. This problem can be fixed by letting $(F^*, e^*)$ play the role of $(F^*, x^*)$, where $e^*$ is the unique edge joining $F^*$ to the rest of the tree.

(b) The sets $\{\omega \in \Omega : \omega \cap H_{F,x} = F\}$ are not disjoint (as claimed above (7)); only their intersections with $X$ are. This is remedied by a more careful application of weak convergence.

(c) The description of the event $B(\bar F, \bar x)$ via Wilson's algorithm is not correct. The random walks started at vertices in $\cup_{i=1}^{r} V(F_i)$ are not sufficient to describe this event. A suitable modification of $B$ works.

Reference

1. Athreya, S.R., Járai, A.A.: Infinite volume limit for the stationary distribution of Abelian sandpile models. Commun. Math. Phys. 249, 197–213 (2004)

Communicated by M. Aizenman