Commun. Math. Phys. 206, 1 – 22 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On Fusion Algebras and Modular Matrices*
T. Gannon^1, M.A. Walton^2,**
1 Department of Mathematics, University of Alberta, Edmonton, Alberta, Canada T6G 2G1.
E-mail:
[email protected]
2 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street,
Cambridge CB3 9EW, UK. E-mail:
[email protected] Received: 7 October 1997 / Accepted: 7 March 1999
Abstract: We consider the fusion algebras arising in e.g. Wess–Zumino–Witten conformal field theories, affine Kac–Moody algebras at positive integer level, and quantum groups at roots of unity. Using properties of the modular matrix S, we find small sets of primary fields (equivalently, sets of highest weights) which can be identified with the variables of a polynomial realization of the Ar fusion algebra at level k. We prove that for many choices of rank r and level k, the number of these variables is the minimum possible, and we conjecture that it is in fact minimal for most r and k. We also find new, systematic sources of zeros in the modular matrix S. In addition, we obtain a formula relating the entries of S at fixed points, to entries of S at smaller ranks and levels. Finally, we identify the number fields generated over the rationals by the entries of S, and by the fusion (Verlinde) eigenvalues.
* This research was supported in part by NSERC.
** On leave from the Physics Dept, Univ. Lethbridge, Alberta, Canada.

1. Introduction

Fix an affine non-twisted algebra $g = X_r^{(1)}$, and level $k$. Put $\bar k := k + h^\vee$, where $h^\vee$ is the dual Coxeter number of $g$. Let $w^0, \ldots, w^r$ denote its fundamental weights, and put $\rho := \sum_{i=0}^r w^i$. Let $P_+^k(g)$ be the set of all level $k$ integrable highest weights of $g$. For example,
\[ P_+^k(A_r^{(1)}) = \Big\{ \sum_{i=0}^r \lambda_i w^i \ \Big|\ \lambda_i \in \mathbb{Z}_{\ge 0},\ \sum_{i=0}^r \lambda_i = k \Big\}. \]
Write $\mathrm{ch}_\lambda$ for the corresponding character. Sometimes it is convenient to write $(\lambda_0, \lambda_1, \ldots, \lambda_r)$ for $\sum_i \lambda_i w^i$. When the level of a weight is known, we will often drop the $w^0$ component. For example, the element $k w^0$ of $P_+^k(g)$ will be denoted by 0. The corresponding quantities for the underlying finite-dimensional Lie algebra $\bar g$ will always be denoted with a bar.
Under the familiar action of $SL_2(\mathbb{Z})$ on the Cartan subalgebras of $g$, we find that the span of the level $k$ characters $\mathrm{ch}_\lambda$ is stable. In particular, define a matrix $S$ by:
\[ \mathrm{ch}_\lambda\Big(\frac{-1}{\tau}, \frac{z}{\tau}, u - \frac{(z|z)}{\tau}\Big) = \sum_{\mu \in P_+^k(g)} S_{\lambda,\mu}\, \mathrm{ch}_\mu(\tau, z, u). \]
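$S$ has several interesting properties. Most importantly:

Lemma 1 (Kac–Peterson [16]). Let $\mathrm{ch}_{\bar\nu}$ denote the Weyl character of $\bar g$ with highest weight $\bar\nu$. Then for any $\lambda, \mu \in P_+^k(g)$, we have both $S_{0,\mu} \ne 0$ and
\[ \frac{S_{\lambda,\mu}}{S_{0,\mu}} = \mathrm{ch}_{\bar\lambda}\Big(-2\pi i\, \frac{\overline{\mu + \rho}}{\bar k}\Big) =: \chi_\lambda(\mu). \tag{1.1a} \]

By Lemma 1, a useful expression for $\chi_\lambda(\mu)$ is
\[ \chi_\lambda(\mu) = \sum_{\beta \in \Omega(\lambda)} m_\lambda(\beta) \sum_{\gamma \in W\beta} \exp\Big[-2\pi i\, \frac{\gamma \cdot (\mu + \rho)}{\bar k}\Big], \tag{1.1b} \]
where $W$ is the (finite) Weyl group, where $\Omega(\lambda)$ is the set of dominant weights of the representation of $\bar g$ with highest weight $\lambda$, and where $m_\lambda(\beta)$ is the weight multiplicity. A classical result is:

Lemma 2 (Cartan [3]). For each $\bar\nu$, we can write $\mathrm{ch}_{\bar\nu} = P_{\bar\nu}(\mathrm{ch}_{w^1}, \ldots, \mathrm{ch}_{w^r})$ for some polynomial $P_{\bar\nu}(x_1, \ldots, x_r)$. Therefore,
\[ \chi_\lambda(\mu) = P_{\bar\lambda}(\chi_{w^1}(\mu), \ldots, \chi_{w^r}(\mu)), \tag{1.2} \]
for all $\mu \in P_+^k(g)$.

Define the fusion matrices $N_\lambda$ by Verlinde's formula [21]:
\[ (N_\lambda)^\nu_\mu := N_{\lambda,\mu}^\nu = \sum_{\gamma \in P_+^k(g)} S_{\lambda,\gamma}\, \frac{S_{\mu,\gamma}}{S_{0,\gamma}}\, S^*_{\nu,\gamma}. \tag{1.3} \]
Equation (1.3) tells us that the $N_\lambda$ are simultaneously diagonalized by $S$, and have eigenvalues $\chi_\lambda(\mu)$. The fusion algebra (or Verlinde algebra) of $g$ at level $k$ can be defined to be the $\mathbb{C}$-span of $\{N_\lambda : \lambda \in P_+^k(g)\}$. It is associative and commutative, with unit $N_0 = I$ and integer structure constants $N_{\lambda,\mu}^\nu$:
\[ N_\lambda N_\mu = \sum_{\nu \in P_+^k(g)} N_{\lambda,\mu}^\nu N_\nu. \]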
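As a concrete check of (1.3), the following minimal sketch evaluates Verlinde's formula numerically in the simplest case, $A_1^{(1)}$ at level 3. The explicit sine formula for $S$ in this case is a standard fact assumed here rather than taken from the text, and the function names are purely illustrative.

```python
import math

def s_matrix_a1(k):
    """Modular S matrix of A_1 at level k (weights labelled a = 0..k).

    Assumed standard formula: S_ab = sqrt(2/kbar) * sin(pi (a+1)(b+1) / kbar).
    """
    kbar = k + 2  # level plus the dual Coxeter number of A_1
    norm = math.sqrt(2.0 / kbar)
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / kbar)
             for b in range(k + 1)]
            for a in range(k + 1)]

def fusion_matrices(S):
    """Apply Verlinde's formula (1.3); this S is real, so S* = S.

    Returns N with N[lam][mu][nu] = (N_lam)_mu^nu, rounded to integers.
    """
    n = len(S)
    return [[[round(sum(S[lam][g] * S[mu][g] * S[nu][g] / S[0][g]
                        for g in range(n)))
              for nu in range(n)]
             for mu in range(n)]
            for lam in range(n)]

N = fusion_matrices(s_matrix_a1(3))
print(N[1][1])  # coefficients N_{1,1}^nu for nu = 0..3
```

The rounding step relies on the structure constants being integers, as stated above; for larger algebras one would want to check the size of the rounding error explicitly.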
In fact it is isomorphic as an algebra to $\mathbb{C}^{\|P_+^k(g)\|}$, defined with componentwise addition and multiplication, and so a critical ingredient in our definition is the choice of preferred basis $\{N_\lambda : \lambda \in P_+^k(g)\}$. Fusion algebras (or the corresponding fusion ring) appear in many different contexts, e.g. in rational conformal field theory (RCFT) [21]. The RCFTs with fusion algebras of the type discussed here, i.e. those associated with some $g$, are known as Wess–Zumino–Witten models. Fusion algebras also appear in the study of quantum groups [19] and Hecke algebras [14] at roots of unity, Chevalley groups at nonzero characteristic [12], and quantum cohomology [22].
Call a set $\Gamma = \{\gamma^1, \ldots, \gamma^n\} \subset P_+^k(g)$ a fusion-generator if any $N_\lambda$ can be written as a polynomial$^1$ in $N_{\gamma^1}, \ldots, N_{\gamma^n}$; in other words, if for each $\lambda \in P_+^k(g)$ there is a polynomial $P_\lambda(x_1, \ldots, x_n)$ such that
\[ \chi_\lambda(\mu) = P_\lambda(\chi_{\gamma^1}(\mu), \ldots, \chi_{\gamma^n}(\mu)) \qquad \forall \mu \in P_+^k(g). \tag{1.4a} \]
Equivalently, $\Gamma$ is a fusion-generator$^2$ iff for any $\lambda, \mu \in P_+^k(g)$, the only way we can have
\[ \chi_{\gamma^\ell}(\lambda) = \chi_{\gamma^\ell}(\mu) \quad \text{for all } \ell = 1, \ldots, n, \tag{1.4b} \]
is when $\lambda = \mu$. The equivalence of the statements of (1.4a) and (1.4b) can be seen as follows. First, assume (1.4a). If the equalities in (1.4b) hold for some $\lambda, \mu$, then (1.4a) implies $\chi_\varphi(\lambda) = \chi_\varphi(\mu)$ for all $\varphi \in P_+^k(g)$. Multiplying this result by $S^*_{\nu,\varphi}$ and summing over $\varphi \in P_+^k(g)$ gives $\lambda = \mu$, by the unitarity of the matrix $S$. In the other direction, we need to construct a polynomial $P_\lambda$ in $n = \|\Gamma\|$ variables, taking values $\chi_\lambda(\mu)$ at $m = \|P_+^k(g)\|$ distinct points. Let $x := (x_1, \ldots, x_n)$ denote a point in $\mathbb{C}^n$, and let $x_a$, $a = 1, \ldots, m$, be the points at which the required polynomial must take the values $y_a$. Here $x_{a,j} = \chi_{\gamma^j}(\mu_a)$ and $y_a = \chi_\lambda(\mu_a)$, where $a$ labels the different weights of $P_+^k(g)$. A polynomial of minimal degree satisfying the requirements can be constructed by the Lagrange interpolation formula:
\[ P(x) = \sum_{a=1}^m y_a \prod_{b=1,\, b \ne a}^m \frac{r \cdot (x - x_b)}{r \cdot (x_a - x_b)}. \]
Here $r$ can be any (constant) vector such that $r \cdot (x_a - x_b)$ vanishes iff $a = b$.
By the fusion-rank $R_k(g)$, we mean the minimum possible cardinality $n = \|\Gamma\|$ of a fusion-generator $\Gamma$. Such a $\Gamma$ is called a fusion-basis.

Question 1. For a given $g$ and $k$, what is the fusion-rank $R_k(g)$, and what is a fusion-basis?

This problem was studied by Di Francesco and Zuber [6]. For the applications it should suffice to get a reasonable upper bound for the fusion-rank, and to find a $\Gamma$ which realizes that bound. Incidentally, it was proven in [1] that there will be a fusion potential [13] corresponding to any fusion-generator $\Gamma$.
Question 1 seems a natural one from the fusion algebra perspective, and is especially interesting considering that the fusion-rank often turns out to be surprisingly low. This analysis should have consequences for the work of Moody, Patera, Pianzola, . . . on elements of finite order in a finite-dimensional Lie group (see e.g. [18,20] and references therein). It has direct relevance for the classification of conformal field theories (more precisely, their 1-loop partition functions; see e.g. [9,11,10]). Our results may lead to a new presentation of the fusion algebras, along the lines of the Schubert calculus of [13,15]. As another example, we mention that our problem may be related to finding bases for the quantum cohomology of Grassmannians [22].
1 By Lagrange interpolation, “polynomial” here is equivalent to “function”.
2 Our definition should not be confused with the “bootstrapped” version of a fusion-generator used in [10].
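The Lagrange interpolation step used above to pass from (1.4b) to (1.4a) can be sketched in a few lines. This is only an illustration under stated assumptions: the points, values and projection vector are made-up toy data, `lagrange_projected` is a hypothetical helper name, and the dot product is the plain bilinear one (no complex conjugation).

```python
def lagrange_projected(points, values, r):
    """Return a function P with P(points[a]) == values[a].

    Each point is projected to the scalar r . x, and ordinary one-variable
    Lagrange interpolation is applied to the projections. This requires
    r . (x_a - x_b) != 0 whenever a != b.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    proj = [dot(r, p) for p in points]
    if len(set(proj)) != len(proj):
        raise ValueError("r . (x_a - x_b) vanishes for some a != b")

    def P(x):
        s = dot(r, x)
        total = 0
        for a, ya in enumerate(values):
            term = ya
            for b, pb in enumerate(proj):
                if b != a:
                    term = term * (s - pb) / (proj[a] - pb)
            total += term
        return total

    return P

# Toy data (hypothetical): four points in C^2 and the values to reproduce.
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [5, -1, 2, 7]
P = lagrange_projected(points, values, r=(1, 2))
```

With `r = (1, 1)` the second and third points would have the same projection, so the helper raises an error; this is the reason for the condition on $r$ in the text.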
Incidentally, these fusion algebras all have a rank of one, in a sense: precisely, the Krull dimension of a fusion algebra will be one. It is not difficult to find an element $N$ of the fusion algebra in which every fusion matrix $N_\lambda$ will be a polynomial. These $N$ however will in general be nontrivial linear combinations of our basis vectors (1.3). For the applications we are interested in, this observation is not helpful. There is a natural basis for the fusion algebra, namely $P_+^k(g)$, and an important condition is that fusion-generators are required to be subsets of that basis.
We will address Question 1 for $g = A_r^{(1)}$ in Sect. 3. Our best lower bound for $R_{r,k} := R_k(A_r^{(1)})$ is given in Thm. 1(2); our best upper bound and smallest fusion-generator is given in Thm. 3. Corollary 1 tells us precisely when $\{w^1\}$ is a fusion-generator. Corollary 2 answers Question 1 when $r$ or $k$ is small, and Conjecture 1 gives our guess for a general statement.
Another question related to this one, which we will consider in Sect. 4, is:

Question 2. For $g = A_r^{(1)}$, when is $N_{w^1}$ invertible?

The first fundamental weight $w^1$ is especially interesting, since (1.1b) and its fusion numbers $N_{w^1,\mu}^\nu$ are so simple. Incidentally, $N_\lambda$ is invertible iff $N_{\lambda^\sigma}$ is, for any Galois element $\sigma$ (see (2.6) below); this holds in fact for any RCFT [5]. However, the inverse of a fusion matrix will only itself be a fusion matrix in the trivial cases: $(N_\lambda)^{-1} = N_\mu$ iff both $\lambda = J^a 0$ and $\mu = J^{-a} 0$ for some $a \in \mathbb{Z}$, where $J$ is given in (2.1b); again the analogue holds for any RCFT. (The proof of this uses the fact that the inverse of a nonnegative integer matrix can itself be integral and nonnegative only if it is a permutation matrix.) Our best condition for $N_{w^1}$ being invertible is given in Thm. 6(3), while our best conditions for noninvertibility are Thms. 6(4),(5). Together, these answer Question 2 for most $r$, $k$. Conjecture 2 gives our guess for the general answer.
A final question, which we solve in Sect. 6, was asked in [4]. It is interesting because of the Galois action (2.6) on the matrix $S$ and on the fusion coefficients.

Question 3. For $A_r^{(1)}$, what are the number fields $K_{r,k}$ and $L_{r,k}$ generated over the rationals by the entries $S_{\lambda,\mu}$, and by the fusion (Verlinde) eigenvalues $\chi_\lambda(\mu)$, respectively?

The primary purpose of this paper is to develop tools for the analysis of affine fusions. We focus mostly on the most important case: $A_{r,k}$. We believe that these three questions are both interesting and representative.

2. The $A_{r,k}$ Modular Matrix $S$
For now, let us restrict attention to $A_{r,k}$ (i.e. $A_r^{(1)}$ at level $k$). Write $\bar r := r + 1$, $P_+^{r,k} := P_+^k(A_r^{(1)})$ and $R_{r,k} := R_k(A_r^{(1)})$. The symmetry group of its Coxeter–Dynkin diagram is the dihedral group on $\bar r$ elements, generated by an order 2 conjugation $C$ and an order $\bar r$ simple current $J$:
\[ C\lambda = \lambda_0 w^0 + \sum_{i=1}^r \lambda_{r+1-i}\, w^i, \tag{2.1a} \]
\[ J\lambda = \lambda_r w^0 + \sum_{i=1}^r \lambda_{i-1}\, w^i. \tag{2.1b} \]
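These act on the $\chi_\lambda(\mu)$ by
\[ \chi_{C\lambda}(\mu) = \chi_\lambda(C\mu) = \chi_\lambda(\mu)^*, \tag{2.2a} \]
\[ \chi_{J^a\lambda}(J^b\mu) = \exp[2\pi i\,(b\, t(\lambda) + a\, t(\mu) + kab)/\bar r]\ \chi_\lambda(\mu), \tag{2.2b} \]
where
\[ t(\lambda) := \sum_{j=1}^r j\,\lambda_j \tag{2.2c} \]
is called the $\bar r$-ality. A useful relation is
\[ t(J^a\lambda) \equiv ak + t(\lambda) \pmod{\bar r}. \tag{2.2d} \]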
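The actions (2.1a), (2.1b) and the $\bar r$-ality (2.2c) are easy to experiment with. The sketch below stores a weight as the tuple of Dynkin labels $(\lambda_0, \ldots, \lambda_r)$; the function names are illustrative choices, not notation from the text.

```python
def conj(lam):
    """C of (2.1a): keep lambda_0 and reverse the remaining Dynkin labels."""
    return (lam[0],) + tuple(reversed(lam[1:]))

def simple_current(lam, a=1):
    """J^a of (2.1b): cyclic shift of the labels, (J lam)_i = lam_{i-1}."""
    n = len(lam)  # n = rbar = r + 1
    return tuple(lam[(i - a) % n] for i in range(n))

def rality(lam):
    """t(lam) = sum_{j=1}^{r} j * lam_j, as in (2.2c)."""
    return sum(j * lj for j, lj in enumerate(lam))

# Example at r = 3, k = 4: J maps the vacuum k*w^0 to k*w^1.
vacuum = (4, 0, 0, 0)
print(simple_current(vacuum))
```

Relation (2.2d) can then be spot-checked by comparing `rality(simple_current(lam, a))` with `a * k + rality(lam)` modulo `len(lam)`.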
Another “symmetry” of $\chi_\lambda(\mu)$, when $k \ne 1$, is rank-level duality [2]:
\[ \chi_\lambda(\mu) = \exp[2\pi i\, t(\lambda)\, t(\mu)/\bar r k]\ \tilde\chi_{\tau\lambda}(\tau\mu)^*, \tag{2.3a} \]
where $\tau\lambda$ denotes the weight in $P_+^{k-1,r+1}$ corresponding to the transpose (sometimes called “conjugate”) of the Young diagram of $\lambda$, after deleting any columns of length $k$ in the transposed diagram (reminder: the $i$th row of the Young diagram of $\lambda$ has $\sum_{j=i}^r \lambda_j$ boxes). This deletion is a consequence of (2.4f) below. We will usually denote the quantities of $A_{k-1,r+1}$ with tildes. For example, $\tau w^\ell = \ell\, \tilde w^1$. $\tau$ defines a bijection between the $J$-orbits in $P_+^{r,k}$ and the $\tilde J$-orbits in $P_+^{k-1,r+1}$. Note that
\[ \tilde t(\tau\lambda) \in t(\lambda) - k\,\mathbb{Z}_{\ge 0}, \tag{2.3b} \]
since $t(\lambda)$ is the number of boxes in the Young diagram of $\lambda$.
The Weyl group of $A_r$ is the symmetric group $S_{\bar r}$. This gives us an essential property of $S$: its relation to the symmetric polynomials. In particular, we can see from (1.1b) that
\[ \chi_\lambda(\mu) = \exp[2\pi i\, t(\lambda)\, t(\mu + \rho)/\bar r \bar k]\ S_\lambda(x_1, \ldots, x_{\bar r}), \tag{2.4a} \]
where $x_\ell := \exp[-2\pi i\, \mu(\ell)/\bar k]$ for $\mu(\ell) := \sum_{j=\ell}^r (\mu_j + 1)$. $S_\lambda$ is a polynomial over $\mathbb{Z}$ (the Schur polynomial of shape $(\sum_{i=1}^r \lambda_i, \sum_{i=2}^r \lambda_i, \ldots, \lambda_r)$ [8]), symmetric in the $x_i$, and homogeneous of degree $t(\lambda)$. It is often convenient to write $S_\lambda$ as a polynomial
\[ Q_\lambda(y_1, \ldots, y_{rk}) = \sum_{m = (m_1, \ldots, m_{rk})} c_m \prod_\ell y_\ell^{m_\ell}, \tag{2.4b} \]
evaluated at the “power sums” of our $x_i$:
\[ y_\ell = \sum_{i=1}^{\bar r} x_i^\ell = P_\ell(x_1, \ldots, x_{\bar r}). \tag{2.4c} \]
The coefficients $c_m$ of $Q_\lambda$ can be expressed in terms of the characters of the symmetric group $S_r$ (this is essentially the Frobenius–Schur duality), and each nonzero $c_m$ will have $\sum_j j\, m_j = t(\lambda)$ [8]. We will also write $S_\lambda[\mu]$ and $P_\ell[\mu]$, when convenient. Note that
\[ P_\ell[J^m \mu] = \exp[2\pi i\, \ell\, \mu(\bar r - m)/\bar k]\ P_\ell[\mu]. \tag{2.4d} \]
A valuable special case of (2.4a) is
\[ \chi_{w^\ell}(\mu) = \exp[2\pi i\, \ell\, t(\mu + \rho)/\bar r \bar k] \sum_{1 \le i_1 < \cdots < i_\ell \le \bar r} x_{i_1} \cdots x_{i_\ell}. \tag{2.4e} \]
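Formula (2.4e) can be evaluated numerically as follows. This is a minimal sketch assuming the reading $\bar r = r + 1$, $\bar k = k + r + 1$, with the weight $\mu$ passed as the full tuple $(\mu_0, \ldots, \mu_r)$; the name `chi_fundamental` is illustrative.

```python
import cmath
from functools import reduce
from itertools import combinations
from operator import mul

def chi_fundamental(ell, mu, k):
    """Evaluate chi_{w^ell}(mu) for A_r at level k via (2.4e)."""
    rbar = len(mu)
    kbar = k + rbar
    # mu(l) = sum_{j=l}^{r} (mu_j + 1) for l = 1..rbar, so mu(rbar) = 0.
    labels = [sum(mu[j] + 1 for j in range(l, rbar)) for l in range(1, rbar + 1)]
    x = [cmath.exp(-2j * cmath.pi * p / kbar) for p in labels]
    t_shifted = sum(labels)  # equals t(mu + rho)
    phase = cmath.exp(2j * cmath.pi * ell * t_shifted / (rbar * kbar))
    # Elementary symmetric polynomial of degree ell in x_1..x_rbar.
    e_ell = sum(reduce(mul, c, 1) for c in combinations(x, ell))
    return phase * e_ell

# r = 3, k = 4, evaluated at the weight (2, 0, 2, 0).
phi = (2, 0, 2, 0)
values = [chi_fundamental(ell, phi, 4) for ell in (1, 2, 3, 4)]
```

At this particular weight the odd fundamental weights give a vanishing value and $\chi_{w^2}$ comes out as $\sqrt 2$, while $\chi_{w^{\bar r}}$ is always 1 because the phase cancels the product of all the $x_i$.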
Symmetric polynomials have an important variable-specialisation property which permits the number of variables to be increased (with the extra variables set to 0), and yet all algebraic relations$^3$ among the symmetric polynomials will be preserved. This permits us to define $\chi_\lambda$ when $\lambda$ has more than $r$ components, using (2.4a) with variables $x'_1 = x_1, \ldots, x'_{\bar r} = x_{\bar r}$, and $x'_{\bar r+1} = \cdots = 0$, and we find
\[ \chi_{(\lambda_0, \lambda_1, \ldots, \lambda_r, \ldots)}(\mu) = \begin{cases} 0 & \text{if } \lambda_\ell > 0 \text{ for some } \ell > \bar r, \\ \chi_{(\lambda_0, \lambda_1, \ldots, \lambda_r)}(\mu) & \text{otherwise,} \end{cases} \tag{2.4f} \]
valid for any $\mu \in P_+^{r,k}$. This can be directly understood using for example the construction of Schur polynomials from Young tableaux. A special case of (2.4f) is $\chi_{w^{\bar r}} = 1$ and $\chi_{w^\ell} = 0$ for $\ell > \bar r$. We will use (2.4f) in several places; see e.g. the proof of Thm. 3.
Call $\lambda \in P_+^{r,k}$ a $J^d$-fixed point if $d$ is the smallest positive integer satisfying $J^d\lambda = \lambda$; in other words, if the $\lambda_i$ have period $d$. We will say $\lambda$ is a fixed point if it is a $J^d$-fixed point for some $d < \bar r$. Note that if $\varphi$ is a fixed point of $J^d$, we can speak of a “truncated weight” $(\varphi_0, \varphi_1, \ldots, \varphi_{d-1}) =: \varphi'$; by (2.5a) below it will lie in $P_+^{d-1,kd/\bar r}$. We have
\[ \sum_{i=0}^{d-1} \varphi_i = \sum_{i=0}^{d-1} \varphi'_i = \frac{dk}{\bar r}, \tag{2.5a} \]
\[ t(\varphi) = \frac{\bar r}{d} \sum_{j=1}^{d-1} j\,\varphi_j + k\,\frac{\bar r - d}{2} = \frac{\bar r}{d}\, t'(\varphi') + k\,\frac{\bar r - d}{2}, \tag{2.5b} \]
where $t'$ denotes $d$-ality. There exist $J^d$-fixed points in $P_+^{r,k}$ iff $d$ divides $\bar r$ and $\bar r/d$ divides $k$. In other words, the smallest fixed-point period is $\bar r/\gcd\{\bar r, k\}$, and all other possible periods are multiples of this number. Also, if $\varphi$ is a $J^{\bar r/d}$-fixed point, its rank-level dual $\tau\varphi$ is a $\tilde J^{k/d}$-fixed point.
By (2.2b), if $\mu$ is a $J^d$-fixed point, then $\chi_\lambda(\mu) = 0$ whenever $t(\lambda) \not\equiv 0 \pmod{\bar r/d}$. The same comment holds for $\mu$ if instead $\lambda$ is a $J^d$-fixed point. As we shall see, this is certainly not the only source of zeros in the matrix $S$, but it is an important one.
In fact, there are many more zeros at fixed points than this simple $\bar r$-ality test suggests. For example, of all weights $\lambda$ with $t(\lambda) = \bar r/d$, the entry $S_{\lambda,\varphi}$ will equal zero for every $J^d$-fixed point $\varphi$, unless $\lambda$ is a hook $(\frac{\bar r}{d} - a)\,w^1 + w^a$. We will describe below the set $NZ(d)$ of all weights $\lambda$ which can have nonzero entries at $J^d$-fixed points.
Moreover, many different weights $\lambda \ne \mu$, even in the set $NZ(d)$, will have the same value $S_{\lambda,\varphi} = S_{\mu,\varphi}$ at all $J^d$-fixed points $\varphi$. For example, for the hooks $\lambda$ with $t(\lambda) = \bar r/d$, we will have $\chi_\lambda(\varphi) = \pm\chi_{w^{\bar r/d}}(\varphi)$ for all $\varphi$, where the sign is independent of $\varphi$. More generally, note that the right side of (2.8c) is independent of $a''$, except for the unimportant sign.
3 Such as (2.4), but not e.g. (2.3a), (2.6) or (2.8). More precisely, specialisation defines a homomorphism between the polynomial rings, taking Schur polynomials to Schur polynomials, power sums to power sums, etc.
Hence fixed point considerations are very important for both Questions 1 and 2, and play a large role in this paper.
An unexpected symmetry of the matrix $S$ is the Galois action discussed in [5]. For any $\sigma \in \mathrm{Gal}(K_{r,k}/\mathbb{Q})$, there exists a permutation $\mu \mapsto \sigma\mu$ of $P_+^{r,k}$ such that
\[ \sigma S_{\lambda,\mu} = \epsilon_\sigma(\mu)\, S_{\lambda,\sigma\mu}, \tag{2.6a} \]
\[ \sigma \chi_\lambda(\mu) = \chi_\lambda(\sigma\mu), \tag{2.6b} \]
where $\epsilon_\sigma(\mu) \in \{\pm 1\}$. Similar equations hold for any other affine algebra $g$, and more generally for any RCFT. The field $K_{r,k}$ here is generated over $\mathbb{Q}$ by all elements $S_{\lambda,\mu}$; if instead we are only interested in the permutation $\mu \mapsto \sigma\mu$, and not the “parities” $\epsilon_\sigma(\mu)$, then we are more concerned with the effective Galois group $\mathrm{Gal}(L_{r,k}/\mathbb{Q})$ coming from the subfield $L_{r,k}$ generated over $\mathbb{Q}$ by the fusion eigenvalues $\chi_\lambda(\mu)$. Incidentally, Galois orbits tend to be nicely behaved; see e.g. Thm. 8 below. They also have been studied in the “elements of finite order” Lie group context; see e.g. [18, 20].
Galois group considerations are central to many arguments in this paper, so next we will quickly review the cyclotomic Galois group. The cyclotomic field $\mathbb{Q}_n := \mathbb{Q}[\exp[2\pi i/n]]$ consists of all polynomials in $\xi_n := \exp[2\pi i/n]$. The Galois group $G_n := \mathrm{Gal}(\mathbb{Q}_n/\mathbb{Q})$ is isomorphic to the multiplicative group $(\mathbb{Z}/n\mathbb{Z})^\times$ of integers coprime to $n$, taken mod $n$. More precisely, any automorphism $\sigma \in G_n$ corresponds to some integer $\ell \in (\mathbb{Z}/n\mathbb{Z})^\times$, in such a way that $\sigma\xi_n = \xi_n^\ell$. We write $\sigma_\ell$ for this $\sigma$. The classic example of a Galois automorphism is complex conjugation, which always corresponds to $\ell = -1$. A subfield $F$ of $\mathbb{Q}_n$ will have Galois group $\mathrm{Gal}(F/\mathbb{Q})$ isomorphic to a factor group (equivalently here, a subgroup) of $(\mathbb{Z}/n\mathbb{Z})^\times$.
The previous properties of $S$ are all well known. The following one, which relates $S$ entries at fixed points to $S$ entries at both smaller rank and level, appears to be new. We will call it fixed-point factorisation. Let $\varphi$ be a fixed point of $J^d$ for $A_{r,k}$. Then we will show that $\chi_\lambda(\varphi) = 0$ unless

($*$) for each $i = 1, \ldots, \bar r/d$, there are precisely $d$ integers $1 \le \ell_1^{(i)} < \cdots < \ell_d^{(i)} \le \bar r$ for which $\lambda(\ell_j^{(i)}) \equiv -i \pmod{\bar r/d}$.

(Recall $\lambda(a) := \sum_{b=a}^r (\lambda_b + 1)$.) Assume this for now. ($*$) implies $\bar r/d$ will divide $t(\lambda)$, which we already know, but it is much stronger. Write $NZ(d)$ for the set of all weights $\lambda \in P_+^{r,k}$ which obey ($*$). We will see below that
\[ \lambda \in NZ(d) \iff \chi_\lambda\Big(\frac{kd}{\bar r} \sum_{i=0}^{\bar r/d - 1} w^{di}\Big) \ne 0. \tag{2.7a} \]
The fixed-point argument of this last equation has truncated weight $0'$.
Consider any $\lambda \in NZ(d)$. Let $\pi$ be the unique permutation of $\{1, \ldots, \bar r\}$ defined by the following rule: for each $1 \le i \le \bar r/d$ and $1 \le j \le d$, put $\pi(i + (j-1)\frac{\bar r}{d}) = \ell_j^{(i)}$. $\pi$ will exist iff ($*$) holds. For each such $i$, let $\lambda'^{(i)}$ denote the weight in $P_+^{d-1,kd/\bar r}$ with Dynkin labels
\[ \lambda'^{(i)}_j = \frac{\lambda(\ell_j^{(i)}) - \lambda(\ell_{j+1}^{(i)})}{\bar r/d} - 1 \tag{2.7b} \]
for $j = 1, \ldots, d-1$. As above, let $\varphi' \in P_+^{d-1,kd/\bar r}$ be the truncated weight $(\varphi_0, \varphi_1, \ldots, \varphi_{d-1})$. Then we obtain the “factorisations”
\[ S_{\lambda,\varphi} = \mathrm{sgn}\,\pi\ \xi\ (-1)^{t(\lambda)(1 - d/\bar r)} \Big(\frac{\bar r}{\bar k}\Big)^{\frac{\bar r/d - 1}{2}} S'_{\lambda'^{(1)},\varphi'} \cdots S'_{\lambda'^{(\bar r/d)},\varphi'}, \tag{2.8a} \]
\[ \chi_\lambda(\varphi) = \mathrm{sgn}\,\pi\ \xi\ (-1)^{t(\lambda)(1 - d/\bar r)}\ \chi'_{\lambda'^{(1)}}(\varphi') \cdots \chi'_{\lambda'^{(\bar r/d)}}(\varphi'), \tag{2.8b} \]
where $\xi$ is the $\bar k d/\bar r$-th root of unity equal to $\exp[2\pi i\, t'(\varphi' + \rho') \sum_i (\lambda_d^{(i)} + i - \bar r/d)]$, and where primes denote quantities in $A_{d-1,kd/\bar r}$ (we take $S' = \chi' = 1$ for $d = 1$).
Perhaps some examples at low rank and level will be helpful. For $r = 3$, $k = 4$, the only fixed points are $(\varphi_0, \varphi_1, \varphi_2, \varphi_3) = (2, 0, 2, 0)$, $(0, 2, 0, 2)$, and $(1, 1, 1, 1)$. $NZ(1)$ consists of the $J$-orbits of $(4,0,0,0)$ and $(2,1,0,1)$, for a total of 8 weights out of the full 35. $NZ(2)$ contains $NZ(1)$ plus the $J$-orbits of $(3, 0, 1, 0)$, $(2, 2, 0, 0)$ and $(2, 0, 2, 0)$, increasing the number of weights to 18 out of 35. All three fixed points are in the simple-current orbits of weights of the special type indicated in (2.7a) (for $d = 1$ or 2). Therefore, for these fixed points $\varphi$, we must have $S_{\varphi,\lambda} \ne 0$ for all weights $\lambda$ in the appropriate $NZ(d)$.
For $r = 3$, $k = 8$, $d = 2$, however, there are fixed points such as $\varphi = (3, 1, 3, 1)$ that are not of the type in (2.7a), i.e. $(4, 0, 4, 0)$. In this case, we find that $S_{(3,1,3,1),\lambda} \ne 0$ for only 48 weights $\lambda$, while $NZ(2)$ has cardinality 75 (and $\|P_+^{3,8}\| = 165$). The large discrepancy here between “48” and “75” is not surprising and is explained by (2.8): $\chi'_{\varphi'}$ will vanish at a fifth of the points of $P_+^{1,4}$. Incidentally, the total number of weights satisfying $t(\lambda) \equiv 0 \pmod{\bar r/d}$ is 85. This means there are 10 weights that satisfy the $\bar r$-ality test necessary for $\chi_\lambda(\varphi) \ne 0$, yet still have $\chi_\lambda(\varphi) = 0$ for all $\varphi$.
Condition ($*$) will become more severe as $r$ and $k$ increase. For example, with $r = 3$ and $d = 2$, the numbers of weights in $NZ(2)$, with even $\bar r$-ality, and in $P_+^{3,k}$ are, respectively: 196, 231, and 455 for $k = 12$; and 405, 489, and 969 for $k = 16$.
As an example of how “factor weights” $\{\lambda'^{(i)}\}$ are found, consider the weight $\lambda = (0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0)$ at $r = 11$, $k = 6$. Fix $d = 4$. The corresponding partition labels $\{\lambda(\ell)\}$ are $\{17, 16, 14, 13, 12, 10, 8, 6, 5, 3, 1, 0\}$. Those congruent to $-1 \pmod{\bar r/d = 3}$ are $\{17, 14, 8, 5\}$. From these we find $\lambda'^{(1)} = (1, 0, 1, 0)$, where the zeroth Dynkin label is set so that the factor weight is at level $kd/\bar r = 2$. We find $\lambda'^{(2)} = (0, 0, 0, 2)$ and $\lambda'^{(3)} = (1, 1, 0, 0)$ in similar fashion.
For a more general example, consider any hook $\lambda = a w^1 + w^b$. It will lie in $NZ(d)$ iff $\bar r/d$ divides $a + b$, in which case we find
\[ \chi_{a w^1 + w^b}(\varphi) = \xi\, (-1)^{a + b + c + (a'' + 1)(c + a' + 1)}\ \chi'_{(a' - 1) w'^1 + w'^{c - a' + 1}}(\varphi'), \tag{2.8c} \]
where $c = (a + b)d/\bar r$ and $a = \frac{\bar r}{d} a' - a''$, for $a'' \in \{1, \ldots, \frac{\bar r}{d}\}$, and where $\xi = 1$ unless $b > \bar r - \bar r/d$, in which case $\xi = \exp[2\pi i\, t'(\varphi' + \rho')\, \bar r/d\bar k]$. The permutation $\pi$ here is the product of $c - a' + 1$ disjoint $a''$-cycles. In this example, each $\lambda'^{(i)} = 0'$ except for $\lambda'^{(a'')} = (a' - 1) w'^1 + w'^{c - a' + 1}$. Equation (2.8c) says that hooks in $P_+^{r,k}$ act like hooks in $P_+^{d-1,kd/\bar r}$, when their fusion eigenvalues are restricted to fixed points of $J^d$. The most interesting special case of (2.8c) is
\[ \chi_{w^\ell}(\varphi) = \begin{cases} \chi'_{w'^{\ell d/\bar r}}(\varphi') & \text{if } \bar r/d \text{ divides } \ell, \\ 0 & \text{otherwise.} \end{cases} \tag{2.8d} \]
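The counts and factor weights quoted in the examples above can be reproduced by a direct implementation of condition ($*$) and of (2.7b). The sketch below is a hedged illustration: it assumes the reading $\bar r = r + 1$, stores weights as full tuples $(\lambda_0, \ldots, \lambda_r)$, requires $d$ to divide $\bar r$ and $\bar r/d$ to divide $k$, and uses hypothetical helper names.

```python
from itertools import product

def weights(r, k):
    """All level-k weights (lambda_0, ..., lambda_r) of A_r^(1)."""
    return [w for w in product(range(k + 1), repeat=r + 1) if sum(w) == k]

def partition_labels(lam):
    """lambda(a) = sum_{b=a}^{r} (lambda_b + 1) for a = 1..rbar (decreasing)."""
    rbar = len(lam)
    return [sum(lam[b] + 1 for b in range(a, rbar)) for a in range(1, rbar + 1)]

def residue_classes(lam, d):
    """Group the labels by i in 1..rbar/d with label = -i mod rbar/d.

    Returns None unless every class has exactly d members, i.e. unless (*) holds.
    """
    m = len(lam) // d
    classes = {i: [] for i in range(1, m + 1)}
    for val in partition_labels(lam):
        i = (-val) % m or m  # residue 0 corresponds to i = m
        classes[i].append(val)
    if any(len(c) != d for c in classes.values()):
        return None
    return classes

def factor_weights(lam, k, d):
    """Factor weights of (2.7b), each with its zeroth label fixed by the level."""
    m = len(lam) // d
    level = k * d // len(lam)
    classes = residue_classes(lam, d)
    if classes is None:
        return None
    result = []
    for i in range(1, m + 1):
        vals = classes[i]  # already in decreasing order
        labels = [(vals[j] - vals[j + 1]) // m - 1 for j in range(d - 1)]
        result.append((level - sum(labels),) + tuple(labels))
    return result

def nz_count(r, k, d):
    """Number of weights of P_+^{r,k} obeying (*)."""
    return sum(residue_classes(w, d) is not None for w in weights(r, k))

lam = (0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0)  # the r = 11, k = 6 example
factors = factor_weights(lam, k=6, d=4)
```

The brute-force enumeration in `weights` is only practical for small rank and level; it is meant as a check of the quoted numbers, not as an efficient algorithm.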
Lemma 3 (Fixed-point factorisation). Choose any $A_{r,k}$, any divisor $d$ of $\gcd\{\bar r, k\}$, and any $\lambda \in P_+^{r,k}$. Then exactly one of the following holds:
(i) $S_{\lambda,\varphi} = \chi_\lambda(\varphi) = 0$ for all fixed points $\varphi$ of $J^d$; or
(ii) $\lambda \in NZ(d)$ and so $\lambda$ obeys (2.8a), (2.8b) for every fixed point $\varphi$ of $J^d$.

The leading signs in (2.8) are independent of $\varphi$ and so for our purposes are of no significance. The phase $\xi$ depends only on $\varphi$ and will often equal 1. Of course the right side of (2.8b) can be “linearised” by expanding it out using fusion coefficients. Conversely, it leads to the curious observation that the fusion coefficients of $A_{r,k}$ can be seen in the fusion eigenvalues of $A_{2r+1,2k}$ evaluated at fixed points.
At present we do not have formulas of equal generality for the other affine algebras with simple currents. Since their Coxeter–Dynkin diagrams are similarly related by the “folding” by a simple current, one would expect that $E_6^{(1)}$ would be related in this way to $G_2^{(1)}$, $E_7^{(1)}$ to $F_4^{(1)}$, $D_r^{(1)}$ for its vector simple current (i.e. the one interchanging $w^0$ and $w^1$, and $w^{r-1}$ and $w^r$) to $C_{r-2}^{(1)}$, etc. Perhaps an algebraic understanding of these equations can be obtained from the ideas in e.g. [7].
To prove Eqs. (2.8), first note that
\[ P_\ell[\varphi] = \begin{cases} \frac{\bar r}{d}\, P'_{\ell d/\bar r}[\varphi'] & \text{if } \bar r/d \text{ divides } \ell, \\ 0 & \text{otherwise,} \end{cases} \tag{2.9a} \]
from which we immediately obtain that $H_\ell[\varphi]$ equals $H'_{\ell d/\bar r}[\varphi']$ or 0 (for $\bar r \mid d\ell$, $\bar r \nmid d\ell$, respectively), for the “complete” symmetric polynomials $H_\ell := S_{\ell w^1}$, since (2.4b) for $\lambda = \ell w^1$ takes a simple form [8]. We have the determinantal formula [8]
\[ S_\lambda = \det(H_{\lambda(i) - \bar r + j})_{1 \le i,j \le \bar r} = \sum_\sigma \mathrm{sgn}\,\sigma\ H_{\lambda(\sigma 1) - \bar r + 1} \cdots H_{\lambda(\sigma \bar r) - \bar r + \bar r}. \tag{2.9b} \]
In this formula, $H_0$ identically equals 1, and for negative $\ell$, $H_\ell$ is identically 0. Evaluated at the fixed point $\varphi$, this will be a sparse matrix: each row will have at most $d$ nonzero elements, spaced $\bar r/d$ entries apart. If $S_\lambda[\varphi] \ne 0$, then some product $H_{\lambda(\sigma 1) - \bar r + 1}[\varphi] \cdots H_{\lambda(\sigma \bar r) - \bar r + \bar r}[\varphi] \ne 0$, and thus $\{\ell_1^{(i)}, \ldots, \ell_d^{(i)}\} = \{\sigma i, \sigma(i + \frac{\bar r}{d}), \ldots, \sigma(i + \bar r - \frac{\bar r}{d})\}$ for each $i$. This shows that ($*$) is satisfied, and that the permutation $\pi$ exists. The sum in (2.9b) can be restricted to those $\sigma$ in the coset $(S_d)^{\times(\bar r/d)}\,\pi \subset S_{\bar r}$, where the $i$th factor $S_d$ permutes the indices congruent to $i \pmod{\bar r/d}$. So (2.9b) can now be written as the product of determinants, the $i$th one of which corresponds to the weight $\lambda'^{(i)} \in P_+^{d-1,kd/\bar r}$ (note that (2.4f) is implicit in (2.7b)), which gives us (2.8b).
Equation (2.7a) follows from (2.8b) and the fact that $\big(\frac{kd}{\bar r} \sum_i w^{di}\big)' = 0'$. Using the product formula (= Weyl denominator formula) for $S_{0,\mu}$, we can show
\[ S_{0,\varphi} = \Big(\frac{\bar r}{\bar k}\Big)^{\frac{\bar r/d - 1}{2}} (S'_{0',\varphi'})^{\bar r/d}. \tag{2.9c} \]
Together with (2.8b), this immediately gives us (2.8a).
3. Fusion-rank of $A_{r,k}$

The original polynomial realisation [13,15] uses the Cartan fusion-generator $\Gamma = \{w^1, \ldots, w^r\}$, which works by Lemma 2. We can do better. From (2.2a) and Lemma 2, we see that $R_{r,k} \le \bar r/2$, with $\Gamma = \{w^1, \ldots, w^{\lfloor \bar r/2 \rfloor}\}$, where $\lfloor x \rfloor$ is the largest integer not larger than $x$. For example, the fusion-rank of $A_{1,k}$ and $A_{2,k}$ equals 1 for all $k$, with $\{w^1\}$ a fusion-generator. This result for $A_2$ was first obtained in [6], though by a more complicated argument. We also obtain, from Thm. 2(3) below (rank-level duality), the bound $R_{r,k} \le \frac{k}{2} + 1$.
We begin by collecting a few simple consequences of the previous comments. Parts (1) and (3) of Thm. 1 are technical facts we will use repeatedly in the rest of the paper. Theorem 1(2) gives a fairly strong lower bound on $R_{r,k}$. We give some consequences of Thm. 1(4) in the paragraph before Conjecture 1.

Theorem 1 (Simple-current constraints).
(1) Let $\Gamma$ be a fusion-generator, and choose any $\mu \in P_+^{r,k}$. Let $\Gamma_\mu$ be the set of all $\gamma \in \Gamma$ for which $\chi_\gamma(\mu) \ne 0$. Let $d = \gcd\{\bar r, k, t(\gamma)|_{\gamma \in \Gamma_\mu}\}$ (put $d = \bar r$ if $\Gamma_\mu = \emptyset$). Then $\mu$ is a $J^{\bar r/d}$-fixed point.
(2) (Our best lower bound). Let $\Gamma$ be any fusion-generator. Write out the prime decomposition $D := \gcd\{\bar r, k\} = \prod_i p_i^{a_i}$, where each prime $p_i$ is distinct. Then
\[ R_{r,k} \ge \sum_i a_i. \]
If $D \ne \bar r$, we get the stronger bound
\[ R_{r,k} \ge 1 + \sum_i a_i. \]
More precisely, for each $p_i$, and each $\ell$, $1 \le \ell \le a_i$, there must be some $\gamma \in \Gamma \cap NZ(\bar r p_i^\ell/D)$ (see Lemma 3) with $\gcd\{D, t(\gamma)\} = D/p_i^\ell$. When $D \ne \bar r$, there must also be some $\gamma \in \Gamma \cap NZ(\bar r/D)$ whose $\bar r$-ality $t(\gamma)$ is a multiple of $D$.
(3) Suppose $J^{\bar r/d}\mu = \mu$ and $J^{\bar r/d}\nu = \nu$ for some divisor $d$ of $\bar r$. Then for any weight $\lambda$, $\chi_\lambda(\mu) = \chi_\lambda(\nu) \ne 0$ implies $t(\lambda)\, t(\mu) \equiv t(\lambda)\, t(\nu) \pmod{d\,\bar r}$.
(4) When $\bar k_1$ is some multiple of $\bar k_2$, then $R_{r,k_1} \ge R_{r,k_2}$.

Proof. (1) Let $\mu$ be a $J^c$-fixed point. Then from the previous remarks, $c$ must divide $\bar r$, and $\bar r/c$ must divide both $k$ and $t(\gamma)$ for each $\gamma \in \Gamma_\mu$. Therefore $c$ must be a multiple of $\bar r/d$. Moreover $\chi_\gamma(J^{\bar r/d}\mu) = \chi_\gamma(\mu)$ for all $\gamma \in \Gamma_\mu$ (hence all $\gamma \in \Gamma$); since $\Gamma$ is a fusion-generator this means $J^{\bar r/d}\mu = \mu$, and hence $c = \bar r/d$.
(2) We know that for every divisor $d$ of $D$, there are $J^{\bar r/d}$-fixed points (more than one, unless $d = D = \bar r$). Choose such a fixed point $\varphi$, say. Let $\Gamma_\varphi$ be as in (1); necessarily $\Gamma_\varphi \subseteq NZ(\bar r/d)$. Then, by (1), $\gcd\{\bar r, k, t(\gamma)|_{\gamma \in \Gamma_\varphi}\} = d$. So we see there must be a subset $\Gamma_d \subseteq \Gamma$, namely $\Gamma_d = \Gamma_\varphi$, such that $\gcd\{D, t(\gamma)|_{\gamma \in \Gamma_d}\} = d$. Note that each $\Gamma_{D/p_i^\ell}$ must contain some weight $\gamma$ with $\gcd\{D, t(\gamma)\} = D/p_i^\ell$ (otherwise different $J^{\bar r/d}$-fixed points would not be distinguished by $\Gamma$). This gives the first bound. If $\bar r \ne D$, then there will be several $J^{\bar r/D}$-fixed points, and in order for $\Gamma$ to distinguish them, $\Gamma_D$ must be nonempty. This gives the second bound.
(3) Let $P_\ell$ be the $\ell$th power sum polynomial (2.4c). From (2.4d) and (2.5a), $P_\ell[\mu] \ne 0$ requires $d$ to divide $\ell$. Consider the $m = (m_1, m_2, \ldots)$th term in $Q_\lambda$ (see (2.4b)); either it will vanish at $\mu$, or $d$ will divide each $\ell$ with $m_\ell \ne 0$. Since $P_\ell[\mu]$ lies in the cyclotomic field $\mathbb{Q}[\exp[2\pi i\, \ell/\bar k]]$, we find that $S_\lambda[\mu]$ lies in the cyclotomic field $\mathbb{Q}[\exp[2\pi i\, d/\bar k]]$. Therefore (2.4a) applied to $\chi_\lambda(\mu) = \chi_\lambda(\nu) \ne 0$ gives us the desired conclusion.
(4) First note that we have the containment $\frac{\bar k_1}{\bar k_2}(P_+^{r,k_2} + \rho) \subset P_+^{r,k_1} + \rho$. Moreover, for any weight $\gamma$ we have $\chi_\gamma^{(1)}(\frac{\bar k_1}{\bar k_2}(\mu + \rho) - \rho) = \chi_\gamma^{(2)}(\mu)$ for all $\mu \in P_+^{r,k_2}$, where the superscripts indicate that $\bar k_1$ or $\bar k_2$ should be substituted for $\bar k$ in (1.1b). Suppose $\Gamma^{(1)}$ is a fusion-generator for $A_{r,k_1}$. Then for any $\mu, \nu \in P_+^{r,k_2}$, if we have $\chi_\gamma^{(2)}(\mu) = \chi_\gamma^{(2)}(\nu)$ for all $\gamma \in \Gamma^{(1)}$, then we know $\mu = \nu$. Now the $\rho$-shifted action of the affine Weyl group at level $k_2$ will map any weight $\gamma \in \Gamma^{(1)}$ either to some $\gamma' \in P_+^{r,k_2}$ or onto the “boundary” of $P_+^{r,k_2}$. In the former case we get $\chi_\gamma^{(2)}(\mu) = \pm\chi_{\gamma'}^{(2)}(\mu)$, for some sign independent of $\mu$. In the latter case $\chi_\gamma^{(2)}(\mu) = 0$ for any $\mu$, and can be ignored. Therefore, the set of weights $\gamma'$ in $P_+^{r,k_2}$ obtained in this way from those in $\Gamma^{(1)}$ will be a fusion-generator for $A_{r,k_2}$. □

Equation (2.3a) suggests that the fusion-generators for $A_{r,k}$ should be related to those of $A_{k-1,r+1}$. This is indeed so:

Theorem 2 (Rank-level duality).
(1) Suppose $\bar r$ does not divide $k$. Then $R_{r,k} \ge R_{k-1,r+1}$. Moreover, if $\Gamma = \{\gamma^1, \ldots, \gamma^n\}$ is a fusion-generator for $A_{r,k}$, then $\tilde\Gamma = \{\tilde J^{a_1}\tau\gamma^1, \ldots, \tilde J^{a_n}\tau\gamma^n\}$ is one for $A_{k-1,r+1}$, where each $a_i$ is chosen so that $\gcd\{a_i \bar r + t(\gamma^i), k\} = \gcd\{t(\gamma^i), \bar r, k\}$ for each $i$.
(2) If $\bar r$ does not divide $k$, and $k$ does not divide $\bar r$, then $R_{r,k} = R_{k-1,r+1}$; in this case if $\Gamma$ is a fusion-basis for $A_{r,k}$, then $\tilde\Gamma$, defined in (1), will be one for $A_{k-1,r+1}$.
(3) If $\bar r$ does divide $k$, then $R_{r,k} \le R_{k-1,r+1} \le R_{r,k} + 1$. Using the notation of (1), $\{\tilde J\tilde 0, \tau\gamma^1, \ldots, \tau\gamma^n\}$ is a fusion-generator for $A_{k-1,r+1}$.

Proof. (1) Any weight of $P_+^{k-1,r+1}$ can be expressed as $\tilde J^b\tau\mu$ for some integer $b$ and some weight $\mu \in P_+^{r,k}$. So, it suffices to consider any $\mu, \mu' \in P_+^{r,k}$ and $b \in \mathbb{Z}$ for which
\[ \tilde\chi_{\tilde J^{a_i}\tau\gamma^i}(\tau\mu) = \tilde\chi_{\tilde J^{a_i}\tau\gamma^i}(\tilde J^b\tau\mu') \qquad \forall i, \tag{3.1a} \]
and show that this implies $\tau\mu = \tilde J^b\tau\mu'$. Equation (3.1a) becomes
\[ \chi_{\gamma^i}(\mu) = \exp[2\pi i\, \{\bar r a_i + t(\gamma^i)\}\, \{t(\mu) - t(\mu') - \bar r b\}/\bar r k]\ \chi_{\gamma^i}(\mu'). \tag{3.1b} \]
Define $\Gamma_\mu$ as in Thm. 1(1). Because $\bar r$ does not divide $k$, we know $\Gamma_\mu \ne \emptyset$. Equation (3.1b) and Thm. 1(1) imply that $\mu$ and $\mu'$ will both be $J^{\bar r/d}$-fixed points, where $d = \gcd_{\gamma^i \in \Gamma_\mu}\{a_i \bar r + t(\gamma^i), k\}$. Then $\tau\mu$ and $\tilde J^b\tau\mu'$ will both be $\tilde J^{k/d}$-fixed points. For each $\gamma^i \in \Gamma_\mu$, Thm. 1(3) and (3.1a) imply
\[ \{\bar r a_i + t(\gamma^i)\}\, \{t(\mu) - t(\mu') - \bar r b\} \equiv 0 \pmod{d\,k}. \tag{3.1c} \]
For each prime $p \mid k$, write $p^a$ and $p^{a'}$ for the exact powers dividing $k$ and $d$, respectively: i.e. $p^a \| k$ and $p^{a'} \| d$. So $a \ge a'$. If $a = a'$, then $p^a$ must divide both $\bar r$ and $t(\mu) - t(\mu')$, by (2.5b). If $a > a'$, then $p^{a'} \| (\bar r a_i + t(\gamma^i))$, for some $\gamma^i \in \Gamma_\mu$. Therefore (3.1c) tells us that $L := \{t(\mu) - t(\mu') - \bar r b\}/k$ is an integer. Equation (3.1b) then implies $\chi_{\gamma^i}(\mu) = \chi_{\gamma^i}(J^L\mu')$ for all $i$. Therefore $\mu = J^L\mu'$, so we may take $\mu = \mu'$ in (3.1a), and absorb the $L$ into $b$. Then $\bar r/d$ must divide $L$, i.e. $k/d$ must divide $b$, i.e. $\tilde J^b\tau\mu = \tau\mu$, and we see that (3.1a) can only be trivially satisfied. Hence $R_{k-1,r+1} \le R_{r,k}$.
(2) is immediate from part (1).
(3) The first inequality comes from (1). That the given set is a fusion-generator follows by the proof of (1). More precisely, replacing $\tilde J^{a_i}\tau\gamma^i$ with $\tilde J\tilde 0$ in (3.1a) implies $L \in \mathbb{Z}$. The rest of the argument is as before. □

The Chinese Remainder Theorem tells us that it is always possible to choose the $a_i$'s in Thm. 2(1). Incidentally, in all cases of which we know, $R_{k-1,r+1} = 1 + R_{r,k}$ when $\bar r < k$ divides $k$.
Earlier we suggested the upper bound $R_{r,k} \le \bar r/2$, and now we also know $R_{r,k} \le \frac{k}{2} + 1$ (or $k/2$ if $k$ fails to divide $\bar r$). In fact we can do much better than this for most pairs $(r, k)$. The argument relies on the cyclotomic Galois group $G_n$ described briefly in the previous section.

Theorem 3 (Galois considerations). $\Gamma_\div := \{w^d : 2d \le \bar r \text{ and } d \text{ divides } \bar k\}$ is a fusion-generator for $A_{r,k}$, called the divisor generator. A related fusion-generator is $\Gamma_\div^\tau$, defined by
\[ \Gamma_\div^\tau := \begin{cases} \{w^d : 2d \le k \text{ and } d \text{ divides } \bar k\} & \text{when } k \text{ does not divide } \bar r, \\ \{w^k\} \cup \{w^d : 2d \le k \text{ and } d \text{ divides } \bar k\} & \text{when } k \text{ divides } \bar r. \end{cases} \]
Moreover, each $w^d$ in $\Gamma_\div$ and $\Gamma_\div^\tau$ can be replaced with any hook $\ell w^1 + w^{d-\ell}$ for $1 \le \ell \le d$.
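To compare the bounds numerically, the sketch below computes the labels $d$ of the divisor generator of Thm. 3 together with the lower bound of Thm. 1(2). It assumes the reading in which the divisibility condition is on $\bar k = k + r + 1$ and the size condition is on $\bar r = r + 1$ (the reading suggested by the $\bar k$-th roots of unity in the proof); the function names are illustrative.

```python
from math import gcd

def divisor_generator(r, k):
    """Labels d with 2d <= rbar and d dividing kbar, i.e. the w^d of Thm. 3."""
    rbar, kbar = r + 1, k + r + 1
    return [d for d in range(1, rbar // 2 + 1) if kbar % d == 0]

def lower_bound(r, k):
    """Lower bound of Thm. 1(2): sum of prime exponents of D = gcd(rbar, k),
    plus one when D differs from rbar."""
    rbar = r + 1
    D = gcd(rbar, k)
    n, exponent_sum, p = D, 0, 2
    while n > 1:
        while n % p == 0:
            n //= p
            exponent_sum += 1
        p += 1
    return exponent_sum + (1 if D != rbar else 0)

# r = 3, k = 4: rbar = 4 and kbar = 8 are both powers of 2.
print(divisor_generator(3, 4), lower_bound(3, 4))
```

Whenever `len(divisor_generator(r, k)) == lower_bound(r, k)`, the two theorems together pin down the fusion-rank for that pair $(r, k)$; when the lengths differ, the true value lies somewhere in between.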
Proof. The key observation here is that, because each $x_j$ is a $k$th root of unity, for any $\ell$ there will exist a Galois automorphism $\sigma \in G_k$ for which

$$\sigma P_d(x_1, \dots, x_r) = P_\ell(x_1, \dots, x_r), \tag{3.2}$$

where $d = \gcd\{\ell, k\}$. Suppose, for all $d \le r/2$ dividing $k$, that

$$\chi_{w^{d}}(\mu) = \chi_{w^{d}}(\mu'). \tag{3.3a}$$

We will show this implies $\mu = \mu'$. Equations (3.3a) and (2.4a) give

$$S_{w^{d}}[\mu] = \xi^{d}\, S_{w^{d}}[\mu'] \tag{3.3b}$$

for all $d \le r/2$ dividing $k$, where $\xi = \exp[-2\pi i\,(t(\mu) - t(\mu'))/rk]$. Equation (2.4b) reads

$$S_{w^{\ell}}[\mu] = \frac{(-1)^{\ell+1}}{\ell}\, P_\ell[\mu] + \dot Q_\ell(P_1[\mu], \dots, P_{\ell-1}[\mu]) \tag{3.4a}$$

for some polynomial $\dot Q_\ell$ homogeneous in the same sense as $Q_\lambda$ (and so has no constant term). Let $d$ be the smallest $\ell$ with $P_\ell[\mu] \ne 0$. Then (3.4a) implies $S_{w^{\ell}}[\mu] = 0$ for all $\ell < d$ and $S_{w^{d}}[\mu] = \pm\frac{1}{d} P_d[\mu] \ne 0$, so either $d = r$ (in which case $\mu = (\frac{k}{r}, \frac{k}{r}, \dots, \frac{k}{r}) = \mu'$), or $d \le r/2$ by (2.2a). But (3.2) requires $d$ to divide $k$, if it is to be minimal. Thus (3.3b) holds. However both $S_{w^{d}}[\mu]$ and $S_{w^{d}}[\mu']$ lie in $\mathbb{Q}_{k/d}$, so $\xi$ must be a $k$th root of unity.
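The Galois step (3.2) rests on an elementary fact about roots of unity that is easy to check numerically. The following is only an illustrative sketch, not part of the proof: the names `galois_exponent` and `power_sum` are ours, `n` stands for whatever order of root of unity the $x_j$ have, and $\sigma$ is modelled as the map $\zeta \mapsto \zeta^{h}$ on $n$th roots of unity with $h$ coprime to $n$.

```python
import cmath
from math import gcd

def galois_exponent(ell, n):
    """Return h coprime to n with h * gcd(ell, n) == ell (mod n).

    Raising every n-th root of unity to the power h models the Galois
    automorphism sigma of (3.2); the search is brute force.
    """
    d = gcd(ell, n)
    for h in range(1, n + 1):
        if gcd(h, n) == 1 and (h * d - ell) % n == 0:
            return h
    raise ValueError("no suitable unit found")

def power_sum(exponents, power, n):
    """P_power = sum_j x_j**power, where x_j = exp(2*pi*i*m_j/n)."""
    return sum(cmath.exp(2j * cmath.pi * ((m * power) % n) / n)
               for m in exponents)

# Hypothetical data: n = 12, ell = 8, so d = gcd(8, 12) = 4.
n, ell = 12, 8
d = gcd(ell, n)
h = galois_exponent(ell, n)
ms = [0, 1, 3, 7]                      # arbitrary exponents m_j
lhs = power_sum(ms, d * h, n)          # sigma applied to P_d
rhs = power_sum(ms, ell, n)            # P_ell
print(h, abs(lhs - rhs) < 1e-9)
```

Such an $h$ exists because $\ell/d$ is a unit modulo $n/d$, and units modulo $n/d$ lift to units modulo $n$; the brute-force search is used here only to keep the sketch short.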
We next want to show, by induction on $\ell$, that

$$P_\ell[\mu] = \xi^{\ell}\, P_\ell[\mu'] \tag{3.4b}$$

for all $\ell \le r/2$. If we could show this, we would be done, because by (3.4a) it would force $\chi_{w^{\ell}}(\mu) = \chi_{w^{\ell}}(\mu')$ for all $\ell \le r/2$, i.e. $\mu = \mu'$. Equation (3.4b) is clearly true for $P_1 = S_1$, using (3.3b) with $d = 1$. By (3.2), it is then true for all $\ell$ with $\gcd\{\ell, k\} = 1$. Take any divisor $d \le r/2$ of $k$, and suppose (3.4b) is true for all $\ell < d$. Using (3.3b), Eq. (3.4a) means that (3.4b) is true for $\ell = d$, and hence all $\ell$ with $\gcd\{\ell, k\} = d$. Therefore (3.4b) is indeed true for all $\ell \le r/2$, and $\mu = \mu'$.

The above remarks continue to hold if we replace each $w^{d}$ with any hook $\ell w^{1} + w^{d-\ell}$ (of all the weights $\lambda$ with $t(\lambda) = d$, only the hooks have the variable $y_d$ appearing nontrivially in the corresponding polynomial $Q_\lambda(y_i)$; see e.g. p.51 of [8]). Theorem 2 applied to $\Gamma_\div$ gives us the fusion-generator $\Gamma_\div^\tau$ (the hooks $d w^{1}$ and $J0 = k w^{1}$ can be replaced here with $w^{d}$ and $w^{k}$, respectively). □

In many special cases, most notably Cor. 1 and Cor. 2 below, we can prove that the divisor generator $\Gamma_\div$ is actually a fusion-basis. Another example: suppose $\gcd\{r, k\} = p^{\ell}$ for some prime $p$, so $k$ will equal $p^{m} q$ for some $m \ge \ell$ and some number $q$ coprime to $p$. If all prime divisors of $q$ are larger than $r/2$, then $\Gamma_\div$ will be a fusion-basis, and $R_{r,k} = \ell + 1$ (if $r \ne p^{\ell}$) or $R_{r,k} = \ell$ (if $r = p^{\ell}$). The reason is that here the lower bound for $R_{r,k}$ from Thm. 1(2) agrees with the upper bound from Thm. 3. A special case of this occurs when both $r$ and $k$ are powers of $p$.

In fact we know of only a few examples (for $r \le k$) where the divisor generator is not a fusion-basis. For $r = 4$, for example, we find by computer that the fusion-rank is one for $k = 5, 9, 17$ and $21$. On the other hand, the computer program tells us that the fusion-rank is 2 for $r = 4$ and $k = 7, 11, 13$ and $15$. This implies, by Thm. 1(4), that whenever $k$ is a multiple of 12, 16, 18 or 20, $R_{4,k} = 2$ and $\Gamma_\div$ will be a fusion-basis.

Conjecture 1.
At fixed rank $r$, the divisor generator $\Gamma_\div$ is a fusion-basis for all sufficiently high levels $k$.

For reasons of simplicity, the case of greatest interest is when $\Gamma = \{w^{1}\}$ is a fusion-generator. The complete solution to this is a consequence of this theorem:

Theorem 4. $\Gamma = \{w^{1}, w^{2}, \dots, w^{m}\}$ is a fusion-generator of $A_{r,k}$ iff $\Gamma_\div \subseteq \Gamma$ or $\Gamma_\div^\tau \subseteq \Gamma$.

Proof. "⇐" is immediate from Thm. 3. "⇒" Suppose we could find a polynomial $p(x) = x^{m_1} + \cdots + x^{m_\ell} - x^{n_1} - \cdots - x^{n_\ell}$, not identically 0, such that:
(a) $\ell < r$,
(b) $1 \le m_1 < \cdots < m_\ell < k$ and $1 \le n_1 < \cdots < n_\ell < k$,
(c) $x = \exp[2\pi i a/k]$ is a root of $p(x)$, for each $a = 1, 2, \dots, m$, and
(d) $\sum_{i=1}^{\ell} m_i = \sum_{i=1}^{\ell} n_i$.
Then there would exist weights $\lambda \ne \mu$ in $P_+^{r,k}$ obeying $\chi_{w^{a}}(\lambda) = \chi_{w^{a}}(\mu)$ for each $a = 1, \dots, m$; in other words, $\Gamma$ could not in this case be a fusion-generator. To see this, choose any $r - \ell$ distinct integers $h_i$ such that $h_1 = 0$, the remaining $h_i$ obey $1 \le h_i < k$, and $\{h_i\} \cap \{m_i\} = \{h_i\} \cap \{n_i\} = \emptyset$. The $h_i$ and $m_j$ together equal the $r$ values of $\lambda(i)$, and the $h_i$ and $n_j$ together equal the $r$ values of $\mu(i)$. Since $p(x) \not\equiv 0$, we know $\mu \ne \lambda$. Condition (c) says that $P_a[\lambda] = P_a[\mu]$ for all $a \le m$, and (d) is just the statement that $t(\lambda + \rho) = t(\mu + \rho)$. Hence (c) together with induction on (3.4a) is equivalent to saying $\chi_{w^{a}}(\lambda) = \chi_{w^{a}}(\mu)$ for those $a$, and we are done.

It is easy to find this polynomial in many cases. In particular, let $d$ be the largest divisor of $k$ with $2d \le \min\{r, k\}$, and assume $d > m$. Take $p(x)$ to be $(x^{4} - x^{3} - x^{2} + x)(x^{k-n} + x^{k-2n} + \cdots + x^{n} + 1)$, where $n = k/d$. Then (c) and (d) are automatically satisfied. $\ell = 2d$ here, so (a) will be satisfied unless $d = r/2$. Also, (b) will be satisfied unless $n \le 4$, which can only happen if $d = r/2 = k/2$. This argument breaks down only when $d = r/2$. However, when $r/2$ divides $k$, there will be $J^{2}$-fixed points, and by Thm. 1(2) we would require some $\gamma \in \Gamma$ with $t(\gamma)$ a multiple of $r/2$ if $\Gamma$ is to be a fusion-generator.

The only remaining way $\Gamma$ could fail to contain $\Gamma_\div \cap \Gamma_\div^\tau$ is if simultaneously $k \mid r$, $r \ne k$, and $m < k$. But then Thm. 1(2) applies, and $\Gamma$ would not be able to distinguish the $J^{r/k}$-fixed points. □

Corollary 1 (The first-fundamental generator). $\Gamma = \{w^{1}\}$ is a fusion-generator iff both: (i) each prime divisor $p$ of $k$ satisfies $2p > \min\{r, k\}$, and (ii) either $r$ divides $k$, or $\gcd\{r, k\} = 1$.

Incidentally, the proof of Thm. 4 also implies that at least one weight $\gamma$ in any fusion-generator must have $t(\gamma) \ge d$, where $d$ is the largest divisor of $k$ with $d \le r/2$ and $d \le k/2$. If this $\gamma$ is not a hook, then in fact $t(\gamma)$ would have to be strictly larger than $d$.

Corollary 2. Some fusion-bases for $A_{r,k}$ are:
• $\Gamma_\div = \{w^{1}\}$ for $r = 1$ and $2$, $\forall k \ge 1$;
• $\Gamma_\div = \{w^{1}\}$ for $r = 3$ when $k$ is odd; $\Gamma_\div = \{w^{1}, w^{2}\}$ for $r = 3$ when $k$ is even;
• $\Gamma_\div^\tau = \{w^{1}\}$ for $k = 1$, $\forall r \ge 1$;
• $\Gamma_\div^\tau = \{w^{1}\}$ for $k = 2$ and any even $r$; both $\Gamma = \{J0, w^{1}\}$ and $\Gamma_\div^\tau = \{w^{1}, w^{2}\}$ for $k = 2$ and any odd $r > 1$;
• $\Gamma_\div^\tau = \{w^{1}\}$ for $k = 3$ and any $r$ coprime to 3; both $\Gamma = \{J0, w^{1}\}$ and $\Gamma_\div^\tau = \{w^{1}, w^{3}\}$ for $k = 3$ and any multiple $r > 3$ of 3;
• $\Gamma_\div^\tau = \{w^{1}\}$ for $k = 4$ when $r$ is even; $\Gamma_\div^\tau = \{w^{1}, w^{2}\}$ for $k = 4$ when $r \equiv 1$ (mod 4), $r > 4$; and both $\Gamma = \{J0, w^{1}, w^{2}\}$ and $\Gamma_\div^\tau = \{w^{1}, w^{2}, w^{4}\}$ for $k = 4$ when $r \equiv 3$ (mod 4), $r > 4$.
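The criterion of Cor. 1 is simple enough to test by machine. The sketch below is only a sanity check under an assumption of ours about notation: we read the $r$ and $k$ of Cor. 1 as the shifted quantities $r + 1$ and $k + r + 1$ (the latter being $k + h^\vee$ for $A_r$), except for the $k$ inside $\min\{\cdot,\cdot\}$, which we read as the level itself. The function names and the 0/1 encoding of Table 1 are ours. Under that reading the criterion reproduces which entries of Table 1 equal $\{w^{1}\}$.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def w1_generates(r, k):
    """Cor. 1 test for {w^1}, under the ASSUMED reading described above."""
    r_bar = r + 1        # assumption: shifted rank
    k_bar = k + r + 1    # assumption: shifted level k + h^vee
    cond_i = all(2 * p > min(r_bar, k) for p in prime_divisors(k_bar))
    cond_ii = k_bar % r_bar == 0 or gcd(r_bar, k_bar) == 1
    return cond_i and cond_ii

# 1 where the Table 1 entry is exactly {w^1}, else 0 (rows r = 1..8, k = 1..5).
TABLE_IS_W1 = {
    1: [1, 1, 1, 1, 1],
    2: [1, 1, 1, 1, 1],
    3: [1, 0, 1, 0, 1],
    4: [1, 1, 1, 1, 0],
    5: [1, 0, 0, 0, 1],
    6: [1, 1, 1, 1, 0],
    7: [1, 0, 1, 0, 1],
    8: [1, 1, 0, 1, 0],
}

mismatches = [(r, k) for r, row in TABLE_IS_W1.items()
              for k in range(1, 6)
              if w1_generates(r, k) != bool(row[k - 1])]
print(mismatches)
```

An empty list of mismatches supports the reading of the symbols, but of course says nothing beyond the forty tabulated cases.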
Corollary 2 follows immediately from Thm. 1(2) and Thm. 3. Some of these fusion-bases are collected in the table. Corollary 2 tells us the fusion-rank when either $r \le 3$ or $k \le 4$. In addition, other fusion-bases are $\Gamma_\div = \{w^{1}\}$ for $r = 4$ when $k$ is even, for $r = 5$ when $k$ is coprime to 6, and $\Gamma_\div^\tau = \{w^{1}\}$ for $k = 6$ when $r$ is coprime to 6; $\Gamma_\div = \{w^{1}, w^{2}\}$ for $r = 5$ when $k \equiv 2, 4$ (mod 6), and $\Gamma_\div^\tau = \{w^{1}, w^{2}\}$ for $k = 6$ when $r \equiv 1, 3$ (mod 6); and $\Gamma_\div = \{w^{1}, w^{3}\}$ for $r = 5$ when $k \equiv 3$ (mod 6), and $\Gamma_\div^\tau = \{w^{1}, w^{3}\}$ for $k = 6$ when $r \equiv 2$ (mod 6). The simplest cases we do not yet know the answer for are: $r = 4$ when $k$ is odd ($R \le 2$); $r = 5$ when 6 divides $k$ ($R = 2$ or $3$); $k = 5$ when $r$ is even ($R \le 3$); and $k = 6$ when 6 divides $r$ ($R = 3$ or $4$). Obviously to go further we need a better lower bound. Theorem 1(2) is the best we have, but it only exploits the presence of fixed points.

Table 1. Listed are $A_{r,k}$ fusion-bases for low ranks and/or levels. The symbols | in rows of the table delimit sequences of fusion-bases that repeat indefinitely as the level $k$ increases. For increasing ranks $r$, overlines and underlines work similarly in the columns. "l" signifies that $N_{w^1}$ is invertible (see Sect. 4)

r \ k   1             2               3               4                    5
1       |{w^1}| l     {w^1}           {w^1} l         {w^1}                {w^1} l
2       |{w^1}| l     {w^1} l         {w^1}           {w^1} l              {w^1} l
3       |{w^1} l      {w^1, w^2}|     {w^1} l         {w^1, w^2}           {w^1} l
4       {w^1} l       {w^1} l         {w^1} l         {w^1} l              {w^2}
5       {w^1} l       {w^1, w^2}      {w^1, w^3}      {w^1, w^2}           {w^1} l
6       {w^1} l       {w^1} l         {w^1} l         {w^1} l              {w^1, w^2}
7       {w^1} l       {w^1, w^2}      {w^1} l         {w^1, w^2, w^4}      {w^1} l
8       {w^1} l       {w^1} l         {w^1, w^3}      {w^1} l              {2w^2 + w^5} l

4. The Fusion Matrix of $w^{1}$

There are many times when it is useful to know whether particular $S$ matrix elements are nonzero. This is the case for example in almost every modular invariant partition function classification attempt; e.g. see the underlying assumption in [17]. It is especially useful to answer this for the first fundamental weight $w^{1}$; in Thm. 5 below we give some consequences. For later convenience, define the sets

$$P_{r,k} := \{p \text{ prime} : p \le \min\{r, k\} \text{ and } p \text{ divides } k\}, \tag{4.1a}$$

$$\mathbb{Z}_{\ge} X := \Big\{ \sum_{x \in X} a_x\, x : a_x \in \mathbb{Z}_{\ge 0} \Big\}, \tag{4.1b}$$
where $X$ in (4.1b) is any set of natural numbers. $\mathbb{Z}_{\ge} X$ is the set of all possible sums (repetitions allowed) of elements of $X$. For example, $\mathbb{Z}_{\ge}\{n\} = \{0, n, 2n, \dots\}$ is the set of all nonnegative multiples of $n$.

Theorem 5. (1) Suppose $S_{w^{1},\mu} = 0$. Then $S_{\lambda,\mu} = 0$ unless $t(\lambda) \in \mathbb{Z}_{\ge} P_{r,k}$. Both $k$ and $r$ must lie in $\mathbb{Z}_{\ge} P_{r,k}$. (2) Suppose there is only one prime divisor $p$ of $k$ not larger than $\min\{r, k\}$. Then $S_{w^{1},\mu} = 0$ iff $\mu$ is a fixed point.

Proof. When $k \ge r$, part (1) follows by considering the polynomial expression (2.4b) and using the Galois argument of (3.2): $P_\ell[\mu] \ne 0$ requires $\ell \in \mathbb{Z}_{\ge} P_{r,k}$. Taking $\lambda = J0$ gives us $k \in \mathbb{Z}_{\ge} P_{r,k}$, and $\lambda = w^{r}$ (see (2.4f)) gives us $r \in \mathbb{Z}_{\ge} P_{r,k}$. When $k < r$, to show that we can restrict to primes $p \le k$, we use rank-level duality (2.3a) to get that $\tilde t(\tau\lambda) \in \mathbb{Z}_{\ge} P_{r,k}$ and then $t(\lambda) \in \mathbb{Z}_{\ge} P_{r,k}$ follows from (2.3b) and the fact that $k \in \mathbb{Z}_{\ge} P_{r,k}$. For part (2), use part (1) and Thm. 1(1) to get that $\mu$ must be fixed by $J^{r/p}$. □
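The sets (4.1a) and (4.1b) are easy to compute, and Thm. 5(1) then gives a quick necessary test for $N_{w^1}$ to have a zero. The sketch below is illustrative only and rests on an assumption of ours about notation: in (4.1a) we read "$p \le \min\{r, k\}$" with $r + 1$ in place of $r$ and the level $k$ itself, and "$p$ divides $k$" with $k + r + 1$ in place of $k$; the two numbers tested for membership are likewise $r + 1$ and $k + r + 1$. All function names and the 0/1 encoding of the "l" marks are ours.

```python
def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True

def in_semigroup(n, gens):
    """True iff n is a nonnegative integer combination of gens, cf. (4.1b)."""
    reachable = [False] * (n + 1)
    reachable[0] = True
    for t in range(1, n + 1):
        reachable[t] = any(g <= t and reachable[t - g] for g in gens)
    return reachable[n]

def prime_set(r, k):
    """P_{r,k} of (4.1a), under the ASSUMED reading described above."""
    r_bar, k_bar = r + 1, k + r + 1
    return [p for p in range(2, min(r_bar, k) + 1)
            if is_prime(p) and k_bar % p == 0]

def zero_allowed(r, k):
    """Necessary condition of Thm. 5(1) for some S_{w^1,mu} to vanish."""
    r_bar, k_bar = r + 1, k + r + 1
    gens = prime_set(r, k)
    return in_semigroup(r_bar, gens) and in_semigroup(k_bar, gens)

# 1 where Table 1 carries the invertibility mark "l" (rows r = 1..8, k = 1..5).
TABLE_INVERTIBLE = {
    1: [1, 0, 1, 0, 1],
    2: [1, 1, 0, 1, 1],
    3: [1, 0, 1, 0, 1],
    4: [1, 1, 1, 1, 0],
    5: [1, 0, 0, 0, 1],
    6: [1, 1, 1, 1, 0],
    7: [1, 0, 1, 0, 1],
    8: [1, 1, 0, 1, 1],
}

disagreements = [(r, k) for r, row in TABLE_INVERTIBLE.items()
                 for k in range(1, 6)
                 if zero_allowed(r, k) == bool(row[k - 1])]
print(disagreements)
```

Theorem 5(1) only guarantees that a failed test implies invertibility. That the test coincides exactly with the absence of the mark on this block of Table 1 is an observation about these forty cases, not a consequence of the theorem.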
Note that the hypothesis of (2) holds whenever $k$ is a power of a prime. This special case follows directly from (4.2) below, by using Gauss' Lemma on factorising integral polynomials, and evaluating certain factored polynomials at 1. Theorem 5(2) however is much more general. $N_{w^1}$ is invertible iff $S_{w^{1},\mu} \ne 0$ for all $\mu \in P_+^{r,k}$. Equivalently, $N_{w^1}$ is invertible iff

$$\sum_{j=1}^{r} \exp[2\pi i\, \mu(j)/k] \ne 0 \qquad \forall \mu \in P_+^{r,k}. \tag{4.2}$$
It is not hard to show that for $k \le 4$ or $r \le 4$, $N_{w^1}$ is invertible iff $\gcd\{r, k\} = 1$; in fact, for those $r, k$, $\chi_{w^1}(\mu) = 0$ only for fixed points $\mu$. The identical conclusion holds for many other $r$ and $k$, as we saw in Thm. 5(2). But Thms. 6(4),(5) below say that these cases are uncharacteristically well-behaved. For example, when $r = 5$, if 6 divides $k \ge 12$, then $N_{w^1}$ will not be invertible, even though there are no fixed points.

Theorem 6 (Invertibility).
(1) $N_{w^1}$ is invertible iff $\tilde N_{\tilde w^1}$ is, where the latter is the fusion matrix for $A_{k-1,r+1}$.
(2) If $\gcd\{r, k\} \ne 1$, then $N_{w^1}$ cannot be invertible.
(3) $N_{w^1}$ is invertible if either $r \notin \mathbb{Z}_{\ge} P_{r,k}$ or $k \notin \mathbb{Z}_{\ge} P_{r,k}$.
(4) Suppose $pq$ divides $k$, where $p$ and $q$ are distinct primes for which $r \in \mathbb{Z}_{\ge}\{p, q\}$, i.e. there exist nonnegative integers $a, b$ such that $ap + bq = r$. If $k \ge pq\,(\lceil a/q \rceil + \lceil b/p \rceil)$, then $N_{w^1}$ will not be invertible ($\lceil x \rceil$ here denotes the smallest integer not smaller than $x$, e.g. $\lceil 2 \rceil = 2$, $\lceil 3.1 \rceil = 4$).
(5) Suppose $p_1, p_2, \dots, p_n$ are primes dividing $k$ for which $r \in \mathbb{Z}_{\ge}\{p_1, \dots, p_n\}$, i.e. there exist nonnegative integers $a_i$ such that $\sum_i a_i p_i = r$. If $k \ge p_i p_j \sum_{h=i}^{j} a_h$ for any $i < j$, then $N_{w^1}$ will not be invertible.

Proof. (1) follows directly from (2.3a). (2) exploits the fact (see (2.2b)) that $\chi_{w^1}(\varphi) = 0$ for any fixed point $\varphi$. (3) is a corollary of Thm. 5(1).

(4) We want to construct a particular $\mu \in P_+^{r,k}$ such that $\chi_{w^1}(\mu) = 0$. To do this we find an arithmetic sequence $\frac{k}{p}\mathbb{Z} + c_i$ for each $i = 1, \dots, a$, and an arithmetic sequence $\frac{k}{q}\mathbb{Z} + c'_j$ for each $j = 1, \dots, b$, such that none of these $a + b$ sequences intersect. This is easy to do, provided $k$ is big enough. Choose as the $c_i$'s $0, \frac{k}{q}, \dots, \frac{k}{q}(q-1), 1, 1 + \frac{k}{q}, \dots, 1 + \frac{k}{q}(q-1)$, etc., until we have chosen $a$ of them (the last one will be $\lceil a/q \rceil - 1$ plus some multiple of $\frac{k}{q}$). Next choose as the $c'_j$'s $\lceil a/q \rceil, \lceil a/q \rceil + \frac{k}{p}, \dots$, until we have chosen $b$ of them (the last one will be $\lceil a/q \rceil + \lceil b/p \rceil - 1$ plus some multiple of $\frac{k}{p}$). Our $a + b$ sequences will be disjoint, provided the bound on $k$ is satisfied, and will intersect the interval $0 \le x < k$ in precisely $ap + bq = r$ points. Let $\mu$ be the unique weight in $P_+^{r,k}$ whose $\mu(\ell)$ equal those $r$ points. Then $\chi_{w^1}(\mu) = 0$, because the sum in (4.2) along each of the $a + b$ sequences is 0.

(5) follows immediately from similar considerations: we are looking for $a_i$ series $\frac{k}{p_i}\mathbb{Z} + c_{ij}$, where $c_{ij} \not\equiv c_{i\ell}$ (mod $\frac{k}{p_i}$) for $j \ne \ell$, and $c_{ij} \not\equiv c_{h\ell}$ (mod $\frac{k}{p_i p_h}$) for $i \ne h$. The choice $c_{ij} = j - 1 + \sum_{\ell=1}^{i-1} a_\ell$ works. □
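The construction in the proof of part (4) can be carried out mechanically. The sketch below follows the choice of offsets $c_i$ and $c'_j$ described there and then evaluates the sum in (4.2); it is meant as an illustration, not as the authors' program. The function names are ours, `k` denotes the modulus appearing in (4.2), and the disjointness of the sequences is checked at run time rather than assumed.

```python
import cmath

def zero_weight_points(k, p, q, a, b):
    """Values mu(1), ..., mu(r) built as in the proof of Thm. 6(4).

    Requires p*q | k; raises ValueError if the sequences collide, which the
    proof excludes once k >= p*q*(ceil(a/q) + ceil(b/p)).
    """
    if k % (p * q) != 0:
        raise ValueError("p*q must divide k")
    step_p, step_q = k // p, k // q
    # Offsets c_i: 0, k/q, ..., (q-1)k/q, 1, 1 + k/q, ... (a of them).
    c = [i // q + (i % q) * step_q for i in range(a)]
    # Offsets c'_j: ceil(a/q), ceil(a/q) + k/p, ... (b of them).
    base = -(-a // q)  # integer ceiling of a/q
    c_prime = [base + j // p + (j % p) * step_p for j in range(b)]
    points = []
    for ci in c:       # each sequence (k/p)Z + c_i meets [0, k) in p points
        points += [(ci + m * step_p) % k for m in range(p)]
    for cj in c_prime:  # each sequence (k/q)Z + c'_j meets [0, k) in q points
        points += [(cj + m * step_q) % k for m in range(q)]
    if len(set(points)) != len(points):
        raise ValueError("sequences intersect; k is too small")
    return sorted(points, reverse=True)

def character_sum(points, k):
    """The sum appearing in (4.2)."""
    return sum(cmath.exp(2j * cmath.pi * x / k) for x in points)

pts = zero_weight_points(30, 3, 5, 2, 1)
print(pts, abs(character_sum(pts, 30)) < 1e-9)
```

For $k = 30$, $p = 3$, $q = 5$, $a = 2$, $b = 1$ the offsets come out as $c_1 = 0$, $c_2 = 6$, $c'_1 = 1$, and the sum vanishes because each sequence contributes a complete set of $p$th or $q$th roots of unity times a phase.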
The proofs of Thms. 6(4),(5) are constructive: their zeros arise when (4.2) finds itself a sum of terms such as $\sum_{a=1}^{p} \xi^{a}$ for $\xi$ a primitive $p$th root of unity. A simple example of Thm. 6(4) is at $r = 11$, $k = 30$. With $p = 3$, $q = 5$, and $a = 2$, $b = 1$, the bound is saturated. One finds $c_1 = 0$, $c_2 = 6$ and $c'_1 = 1$. These yield 0, 10, 20; 6, 16, 26; and 1, 7, 13, 19, 25; respectively. So, there is a zero for the weight given by $\{\mu(1), \dots, \mu(r)\} = \{26, 25, 20, 19, 16, 13, 10, 7, 6, 1, 0\}$.

Conjecture 2. For $A_{r,k}$, $N_{w^1}$ fails to be invertible iff one can find distinct primes $p_i \le \min\{r, k\}$ dividing $k$ and nonnegative integers $a_i$, $b_i$ such that $r = \sum_i a_i p_i$ and $k = \sum_i b_i p_i$.

In other words, we conjecture that the condition of Thm. 6(3) is an "iff". Note that one way this condition will be satisfied is if $\gcd\{r, k\} \ne 1$. The conditions in Thms. 6(4),(5) are strongest when we take $r < k$ (which without loss of generality we can). Also, the bound in 6(5) is best when the $p_i$ are labelled so that the largest are given indices near $n/2$. In practice the most useful special case of Thms. 6(4),(5) is: If one can find an odd prime $p \le r$ for which $2p$ divides $k$ and $k \ge 3p - 1$, then $N_{w^1}$ will not be invertible. The analogue of Thm. 1(4) is also valid here, but is not very useful.

The answer to Question 2 for small $r$ and $k$ is indicated in the table. Computer checks were performed for $r \le 9$ and all levels $k > r$ such that $\dim P_+^{r,k} < 300{,}000$. The results were consistent with Conjecture 2. Conjectures 1 and 2 are the simplest guesses consistent with our results, but it would be nice to test them against additional numerical data.

Incidentally, conditions like "$\ell \in \mathbb{Z}_{\ge}\{n_1, \dots, n_m\}$" are only strong when $\ell$ is small. For example, given any coprime numbers $m$ and $n$, there are only $(m-1)(n-1)/2$ positive integers $\ell$ which do not lie in $\mathbb{Z}_{\ge}\{m, n\}$; the largest such $\ell$ is $mn - m - n$. So for fixed $r$, we know Conjecture 2 will hold for all sufficiently large $k$.

5. Extensions

Because the fundamental weights are much simpler, the most interesting fusion-generators are the ones which consist only of fundamental weights: $\Gamma \subseteq \{w^{1}, \dots, w^{r}\}$. We can speak of fundamental-fusion-generators and fundamental-fusion-rank $FR_{r,k}$. All of the results in Sects. 3 and 4 also apply directly to $FR_{r,k}$. By definition, $R_{r,k} \le FR_{r,k}$, and Conjecture 1 predicts that, for fixed $r$, $R_{r,k} = FR_{r,k}$ for all sufficiently large $k$. Note however from the table that $FR_{8,5} = FR_{4,9} = 2$ while $R_{8,5} = R_{4,9} = 1$. Because of (2.8d), we can strengthen here the bound in Thm. 1(2). For example, if $FR_{r,k}$ equals the bound given in Thm. 1(2), then so must $FR_{r/d-1,k/d}$ for all divisors $d$ of $\gcd\{r, k\}$.

One can also ask Question 2 for other weights, most importantly the other fundamental weights, and again (2.8b) will be very useful. For example, we know $\chi_{w^2}$ will vanish at some $J^{5}$-fixed point of $A_{9,14}$, because $N_{w^1}$ is not invertible for $A_{4,7}$.

Of course Questions 1 and 2 can and should be asked of the fusion algebras for the other affine algebras, and similar arguments will apply. We have not investigated them, except to find some fusion-bases for $C_{2,k}$ and $G_{2,k}$ on the computer, and to get Thm. 7 below for $G_{2,k}$. Of course $R_k(C_2^{(1)})$ must equal 2 for any even $k$, and we find the rank is also 2 for all odd $k < 26$ (the limit of our computer check), save $k = 1, 3$ and $9$. For $k = 1$ and $9$, the only fusion-bases are $\{w^{1}\}$ and $\{2w^{1} + 6w^{2}\}$, respectively. At $k = 3$ there are four different fusion-bases: $\{2w^{1}\}$, $\{w^{2}\}$, $\{2w^{1} + w^{2}\}$, and $\{2w^{2}\}$. A very
tempting conjecture is that the rank $R(C_{r,k})$ equals 2 for all sufficiently large $k$ (and probably for all $k > 9$). The situation for $G_{2,k}$ however is more surprising:

Theorem 7. (1) When the level $k$ is odd, $\{w^{2}\}$ is a fusion-basis for $G_{2,k}$. (2) $N_{w^2}$ fails to be invertible for $G_{2,k}$ iff either 4 or 30 divides $\bar k := k + 4$.

Proof. The key here is to reduce the $G_{2,k}$ quantities to $A_{2,k+1}$ quantities, and use the fact that $\{w^{1}\}$ is a fusion-basis for $A_{2,k+1}$. Using (1.1b) and the simple Lie subalgebra $A_2 \subset G_2$, we find

$$\chi_{w^{2}}(\mu) = \underline{\chi}_{w^{1}}(\underline{\mu}) + \underline{\chi}_{w^{2}}(\underline{\mu}) + 1, \tag{5.1}$$

where underlines denote $A_{2,k+1}$ quantities, and $\underline{\mu} = \mu_1 w^{1} + (\mu_1 + \mu_2 + 1) w^{2}$. So part (1) reduces to the following statement$^{4}$ for $A_{2,k+1}$: for any $\lambda, \mu \in P_+^{2,k+1}$ with $\lambda \ne C\lambda$ and $\mu \ne C\mu$ (only these nonselfconjugate weights correspond to $G_{2,k}$ ones), does the equality

$$\cos\Big(2\pi\frac{\lambda_1 + 2\lambda_2 + 3}{3\bar k}\Big) + \cos\Big(2\pi\frac{\lambda_2 - \lambda_1}{3\bar k}\Big) + \cos\Big(2\pi\frac{2\lambda_1 + \lambda_2 + 3}{3\bar k}\Big) = \cos\Big(2\pi\frac{\mu_1 + 2\mu_2 + 3}{3\bar k}\Big) + \cos\Big(2\pi\frac{\mu_2 - \mu_1}{3\bar k}\Big) + \cos\Big(2\pi\frac{2\mu_1 + \mu_2 + 3}{3\bar k}\Big) \tag{5.2a}$$

force either $\lambda = \mu$ or $\lambda = C\mu$?

4 For the remainder of the proof of part (1), we will switch to $A_{2,k+1}$ notation.

Write $c_1, c_2, c_3$ for the three cosines on the left side of (5.2a), and write $c'_1, c'_2, c'_3$ for those on the right. Then (5.2a) says $c_1 + c_2 + c_3 = c'_1 + c'_2 + c'_3$, and since $(2\nu_1 + \nu_2 + 3) + (\nu_2 - \nu_1) = \nu_1 + 2\nu_2 + 3$, we also get $c_1^2 + c_2^2 + c_3^2 = 1 + 2c_1 c_2 c_3$ and $c_1'^2 + c_2'^2 + c_3'^2 = 1 + 2c'_1 c'_2 c'_3$. Hit both sides of (5.2a) with the Galois automorphism $\sigma_2$ (see Sect. 2). Since $\cos(2x) = 2\cos^2(x) - 1$, we obtain

$$c_1^2 + c_2^2 + c_3^2 = c_1'^2 + c_2'^2 + c_3'^2. \tag{5.2b}$$

Thus any symmetric polynomial in $c_1, c_2, c_3$ will equal the corresponding symmetric polynomial in $c'_1, c'_2, c'_3$. In particular

$$\Big(\sin\Big(2\pi\frac{\lambda_1 + 2\lambda_2 + 3}{3\bar k}\Big) - \sin\Big(2\pi\frac{\lambda_2 - \lambda_1}{3\bar k}\Big) - \sin\Big(2\pi\frac{2\lambda_1 + \lambda_2 + 3}{3\bar k}\Big)\Big)^{2} = \Big(\sin\Big(2\pi\frac{\mu_1 + 2\mu_2 + 3}{3\bar k}\Big) - \sin\Big(2\pi\frac{\mu_2 - \mu_1}{3\bar k}\Big) - \sin\Big(2\pi\frac{2\mu_1 + \mu_2 + 3}{3\bar k}\Big)\Big)^{2}. \tag{5.2c}$$

In other words, we know from (5.2a) that the real parts of $\chi_{w^1}(\lambda)$ and $\chi_{w^1}(\mu)$ are equal, and from (5.2c) that their imaginary parts are also equal, up to a sign. Hence either $\lambda = \mu$ or $\lambda = C\mu$, and we have proven part (1).

For part (2), note that $\chi_{w^2}(\mu) = 0$ is equivalent to (see (5.1))

$$c_1 + c_2 + c_3 = -\frac{1}{2}, \tag{5.3a}$$

in the above notation. Consider first $\bar k$ odd. Then hitting (5.3a) with the Galois automorphism $\sigma_2$ gives us $c_1^2 + c_2^2 + c_3^2 = \frac{5}{4}$, and hence $c_1 c_2 c_3 = \frac{1}{8}$. We can solve these equations, and we find $8c_i^3 + 4c_i^2 - 4c_i - 1 = 0$, i.e. $\{c_1, c_2, c_3\} = \{\cos(2\pi\frac{1}{7}), \cos(2\pi\frac{2}{7}), \cos(2\pi\frac{3}{7})\}$. However, these cosines cannot be realised by a weight in $P_+^{k}(G_2^{(1)})$.

Next, suppose $\bar k \equiv 2$ (mod 4). We may assume (using $G_{2,k}$ notation) that exactly two of the arguments $\{3\mu_1 + 2\mu_2 + 5,\ \mu_2 + 1,\ 3\mu_1 + \mu_2 + 4\}$ are odd, otherwise they would all be even and the argument would reduce to the $\bar k$ odd one. Here we use the automorphism $\sigma_{3\bar k/2 - 2}$ and find (relabeling the $c_i$ if necessary) that $c_3^2 - c_1^2 - c_2^2 = -\frac{3}{4}$. We can solve for $c_i$ as before, and we find that either $c_3 = \cos(2\pi\frac{1}{5})$ and $\{c_1, c_2\} = \{\cos(2\pi\frac{7}{30}), \cos(2\pi\frac{13}{30})\}$, or $c_3 = \cos(2\pi\frac{2}{5})$ and $\{c_1, c_2\} = \{\cos(2\pi\frac{1}{30}), \cos(2\pi\frac{11}{30})\}$. Either possibility requires 30 to divide $\bar k$, in order to be realised by a weight of $G_{2,k}$. When 30 divides $\bar k$, we do indeed get zeros: $\mu = (\bar k/3 - 1,\ \bar k/30 - 1,\ 3\bar k/5 - 1) \in P_+^{k}(G_2^{(1)})$ works.

Finally, suppose 4 divides $\bar k$. Then $\mu = (\bar k/4, \bar k/4, \bar k/4) \in P_+^{k}(G_2^{(1)})$ works. □

(By $w^{2}$ here we mean the Weyl-dimension 7 fundamental weight of $G_2$, corresponding to the short simple root.) However, $\{w^{2}\}$ will not be a fusion-generator when $k > 4$ is even. Our computer program tells us that for $k \le 24$, the fusion-rank is 1 except for $k = 6, 12, 16$ and $20$ (of course this implies it will also be 2 whenever $k + 4$ is a multiple of 10, 16, or 24).

6. Number Fields Associated with S

By the field $K_{r,k}$ we mean the smallest field containing the rationals and all of the entries $S_{\lambda,\mu}$ of $S$. Similarly, by the field $L_{r,k}$ we mean the smallest field containing $\mathbb{Q}$ and all of the values $\chi_\lambda(\mu)$. Because of their role in the Galois symmetry (2.6), it is natural to try to identify these fields. This question was posed in [4], and related questions have been considered in e.g. [18,20]. Another reason the question is interesting is that, as we shall see, it has a simple answer! We will give this answer in Cor. 3 below, for the most important case: $A_{r,k}$.
The matrix $S$ for any nontwisted affine algebra $g$ is given in e.g. [16]. The expression for $S_{\lambda,\mu}$ consists of a sum $s(\lambda, \mu)$ over the Weyl group of $g$, multiplied by a constant $c$. For $A_{r,k}$, $s(\lambda, \mu)$ manifestly lies in the field $\mathbb{Q}_{rk}$, and

$$c = \frac{i^{\,r(r+1)/2}}{k^{r/2}\sqrt{r+1}}.$$

Using Gauss sums, which express square-roots of integers as sums of roots of unity, it can be shown that the constant $c$ lies in either $\mathbb{Q}_{r}$ if $r$ is even, or $\mathbb{Q}_{rk}$ if either $r \equiv 3$ (mod 4) or $k$ is even, or $\mathbb{Q}_{rk}[\sqrt{\pm 2}]$ if both $k$ is odd and $kr \equiv \pm 2$ (mod 8). Thus we know $L_{r,k}$ is always a subfield of $\mathbb{Q}_{rk}$, and $K_{r,k}$ is always a subfield of $\mathbb{Q}_{4rk}$.

Write $[\lambda]$ for the orbit $\{J^{i}\lambda\}$ of $\lambda$ by the simple currents. We will find our fields by first computing some Galois orbits. This result should be of independent value.

Theorem 8. Consider any $k > 2$ and $r \ne 1$.
(1) Choose any fundamental weight $w^{m}$ with $m \le \min\{r - 2, k - 2\}$, and any Galois automorphism $\sigma_\ell$. Then (with one exception) $\sigma_\ell w^{m} \in [w^{m}] \cup [Cw^{m}]$ iff $\ell \equiv \pm 1$ (mod $k$); for all other $\ell$ the quantum-dimension $S_{\sigma_\ell w^{m},0}/S_{0,0}$ of $\sigma_\ell w^{m}$ will be strictly greater than that of $w^{m}$. (The one exception is $w^{2}$ for $A_{3,4}$, where each $\sigma_\ell$ fixes $w^{2}$.)
(2) When $r \not\equiv 1$ (mod 4), $\sigma_\ell w^{1} = w^{1}$ iff $\ell \equiv 1$ (mod $rk$). When $r \equiv 1$ (mod 4) and $k$ is even, then $\sigma_\ell w^{1} = w^{1}$ iff $\ell \equiv 1$ (mod $rk/2$).

Proof. (1) Because of (2.1a), we may assume $m \le r/2$. Assume first that $k \ge r$. From the Weyl denominator formula, we compute

$$\frac{S_{\sigma_\ell w^{m},0}}{S_{w^{m},0}} = \prod_{n=1}^{m} \frac{|\sin(\pi\ell n/k)|^{r-n}}{\sin(\pi n/k)^{r-n}} \prod_{n=m+1}^{r-m} \frac{|\sin(\pi\ell n/k)|^{r-n}}{\sin(\pi n/k)^{r-n}} \prod_{n=r+1-m}^{r} \frac{|\sin(\pi\ell n/k)|^{r+1-n}}{\sin(\pi n/k)^{r+1-n}} \tag{6.2a}$$

where we drop the middle product if $m = r/2$. We want to know when (6.2a) equals 1. This is easy, for $k > r \ge 2$, since $\sin(\pi/k) < \sin(2\pi/k) < \cdots < \sin(\pi r/k)$. Consider first $m < r/2$: of all possible choices of integers $1 \le n_1 < n_2 < \cdots < n_{r+1} \le k/2$, the minimum possible product of $r - 1$ $\sin(\pi n_1/k)$'s, $r - 2$ $\sin(\pi n_2/k)$'s, ..., $r - m$ $\sin(\pi n_m/k)$'s, $r - m$ $\sin(\pi n_{m+1}/k)$'s, ..., $m$ $\sin(\pi n_{r-m}/k)$'s, $m$ $\sin(\pi n_{r+1-m}/k)$'s, ..., and 1 $\sin(\pi n_r/k)$, is the choice $n_1 = 1$, $n_2 = 2$, ..., $\{n_m, n_{m+1}\} = \{m, m + 1\}$, ..., $n_{m+2} = m + 2$, ..., $\{n_{r-m}, n_{r+1-m}\} = \{r - m, r + 1 - m\}$, ..., $n_{m+1} = m + 1$. This immediately forces $\ell \equiv \pm 1$ (mod $k$) (for $m > 1$, just look at the first term; when $m = 1$, $\ell \equiv \pm 2$ is eliminated by seeing what happens to the second term).

If instead $m = r/2$, the exponents of $\sin(\pi n/k)$ in (6.2a) are no longer nonincreasing: near $n = m + 1$ we get the subproduct $\cdots \sin(\pi(m-1)/k)^{r-m} \sin(\pi m/k)^{r-m} \sin(\pi(m+1)/k)^{r-m} \sin(\pi(m+2)/k)^{r-m} \cdots$. For $m > 2$, the proof that (6.2a) will always be greater than 1 for $\ell \not\equiv \pm 1$ (mod $k$), follows from the simple observation that $\sin(\pi/k)\sin(\pi(m+1)/k) < \sin(2\pi/k)\sin(\pi m/k)$: the least-harmful place to move "1" to is "2", and the best place to move "$m + 1$" to is "$m$", and yet even that (forgetting the other terms, which will make matters worse) will increase the product. The remaining case $m = 2$ corresponds to $r = k = 4$, i.e. to the given exception. This completes the argument for $k \ge r$.

When $k < r$, apply rank-level duality (2.3a): it is an exact symmetry of quantum-dimensions, and maps $J$-orbits to $\tilde J$-orbits. $\tau w^{m} = m\tilde w^{1}$, so we are interested in the ratio

$$\frac{\tilde S_{\sigma_\ell m\tilde w^{1},0}}{\tilde S_{m\tilde w^{1},0}} = \prod_{n=1}^{k-2} \frac{|\sin(\pi\ell n/k)|^{k-1-n}}{\sin(\pi n/k)^{k-1-n}} \prod_{n=1}^{k-1} \frac{|\sin(\pi\ell(n+m)/k)|}{\sin(\pi(n+m)/k)}. \tag{6.2b}$$

The rest of the argument is as before: again $m = r/2$ causes minor problems.

Now consider any $\ell = (-1)^{a} + bk$. Applying (2.6b) to the Cartan generators $\lambda \in \{w^{1}, \dots, w^{r}\}$ and using (2.4e), we find

$$\sigma_\ell\,\mu = C^{a} J^{\,b\,t(\mu+\rho)}\mu \tag{6.2b}$$

whenever $\sigma_\ell \in \mathrm{Gal}(L_{r,k}/\mathbb{Q})$. Applying (6.2b) to $\mu = w^{1}$ gives us part (2). □
Corollary 3. When both $k > 2$ and $r \ne 1$, then $L_{r,k} = \mathbb{Q}_{rk}$ and

$$K_{r,k} = \begin{cases} \mathbb{Q}_{rk} & \text{if either } r \not\equiv 1 \text{ (mod 4) or } k \text{ is even}, \\ \mathbb{Q}_{rk}[\sqrt{\pm 2}] & \text{if } r \text{ is odd and } rk \equiv \pm 2 \text{ (mod 8)}. \end{cases}$$

The proof of the corollary is immediate from Thm. 8, by regarding Galois orbit sizes: when $r \not\equiv 1$ (mod 4), the Galois orbit of $w^{1}$ alone suffices, but when $r \equiv 1$ (mod 4) and $k$ is even, we have $\sigma_{1+rk/2}\, w^{1} = w^{1}$, so also use $\sigma_{1+rk/2}\, 0 = J^{r/2} 0 \ne 0$, which is obtained from (6.2b). What we find in all cases is that for any $\ell \in (\mathbb{Z}/rk\mathbb{Z})^{\times}$, $\ell \ne 1$, either $\sigma_\ell w^{1} \ne w^{1}$ or $\sigma_\ell 0 \ne 0$. This tells us $L_{r,k} = \mathbb{Q}_{rk}$, and $K_{r,k}$ is then obtained by adjoining the constant $c$ shown above.

Similar statements to Thm. 8 can be found for other weights. For example, by rank-level duality the identical result to Thm. 8(1) holds for any $m w^{1}$, $0 \le m \le \min\{r - 2, k - 2\}$, and we can expect similar results for other hooks. When $r \equiv 1$ (mod 4) and $k$ is odd, $\mathbb{Q}_{4rk}$ is a degree 2 extension of $K_{r,k}$, which is in turn a degree 2 extension of $\mathbb{Q}_{rk}$.

The results corresponding to Corollary 3 for $k = 1, 2$ or $r = 1$ can be easily found, but are more complicated and hence less interesting. We include them here for completeness.
• $L_{r,1} = \mathbb{Q}_{r}$. $K_{r,1}$ will equal either $\mathbb{Q}_{r}$, $\mathbb{Q}_{r}[i]$, or $\mathbb{Q}_{r}[\sqrt{\pm 2}]$, depending on whether or not $r \equiv 0, 1$ (mod 4), or $r \equiv 3$ (mod 4), or $r \equiv \pm 2$ (mod 8), respectively.
• $L_{1,k} = \mathbb{Q}[\cos(2\pi/k)]$ if $k$ is odd, and $\mathbb{Q}[\cos(\pi/k)]$ if $k$ is even. $K_{1,k}$ will equal either $L_{1,k}$, or $L_{1,k}[\sqrt{2}\sin(2\pi/k)]$, or $L_{1,k}[\sqrt{2}]$, depending on whether $k \equiv 0, 2$, or $k \equiv 3$, or $k \equiv 1$ (mod 4), respectively.
• $L_{r,2} = \mathbb{Q}_{r}[\cos(2\pi/k)]$ if $r$ is odd, and $\mathbb{Q}_{rk}$ if $r$ is even. $K_{r,2}$ will equal $L_{r,2}$, unless $r \equiv 3$ (mod 4) when $K_{r,2} = \mathbb{Q}_{rk}$.

Acknowledgements. T.G. thanks A. Coste for showing him Questions 1 and 3, and C. Cummins for discussions. M.W. thanks the High Energy Physics group of DAMTP for hospitality, and W. Eholzer for reading the manuscript.
References
1. Aharony, O.: Generalized fusion potentials. Phys. Lett. B306, 276–282 (1993)
2. Altschuler, D., Bauer, M., and Itzykson, C.: The branching rules of conformal embeddings. Commun. Math. Phys. 132, 349–364 (1990)
3. Bourbaki, N.: Groupes et Algèbres de Lie. Chapitres IV-VI, Paris: Hermann, 1968
4. Buffenoir, E., Coste, A., Lascoux, J., Buhot, A., and Degiovanni, P.: Precise study of some number fields and Galois actions occurring in conformal field theory. Annales de l'I.H.P.: Phys. Théor. 63, 41–79 (1995)
5. Coste, A. and Gannon, T.: Remarks on Galois symmetry in rational conformal field theories. Phys. Lett. B323, 316–321 (1994)
6. Di Francesco, P. and Zuber, J.-B.: Fusion Potentials I. J. Phys. A26, 1441–1454 (1993)
7. Fuchs, J., Schellekens, B., and Schweigert, C.: From Dynkin diagram symmetries to fixed point structures. Commun. Math. Phys. 180, 39–97 (1996)
8. Fulton, W. and Harris, J.: Representation Theory: A First Course. New York: Springer-Verlag, 1991
9. Gannon, T.: Symmetries of the Kac–Peterson modular matrices of affine algebras. Invent. Math. 122, 341–357 (1995)
10. Gannon, T.: Kac–Peterson, Perron–Frobenius, and the classification of conformal field theories. e-print q-alg/9510026 (1995)
11. Gannon, T., Ruelle, Ph., and Walton, M.A.: Automorphism modular invariants of current algebras. Commun. Math. Phys. 179, 121–156 (1996)
12. Georgiev, G. and Mathieu, O.: Catégorie de fusion pour les groupes de Chevalley. C. R. Acad. Sci. Paris 315, 659–662 (1992)
13. Gepner, D.: Fusion rings and geometry. Commun. Math. Phys. 141, 381–411 (1991)
14. Goodman, F. and Nakanishi, T.: Fusion algebras in integrable systems in two dimensions. Phys. Lett. B262, 259–264 (1991)
15. Goodman, F. and Wenzl, H.: Littlewood–Richardson coefficients for Hecke algebras at roots of unity. Adv. Math. 82, 244–265 (1990)
16. Kac, V. and Peterson, D.: Infinite-dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–264 (1984)
17. Kreuzer, M. and Schellekens, A.N.: Simple currents versus orbifolds with discrete torsion – a complete classification. Nucl. Phys. B411, 97–121 (1994)
18. Moody, R.V. and Patera, J.: Characters of elements of finite order in Lie groups. SIAM J. Alg. Disc. Meth. 5, 359–383 (1984)
19. Pasquier, V. and Saleur, H.: Common structures between finite systems and conformal field theories through quantum groups. Nucl. Phys. B330, 523–526 (1990)
20. Pianzola, A.: The arithmetic of the representation ring and elements of finite order in Lie groups. J. Algebra 108, 1–33 (1987)
21. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B300, 360–376 (1988)
22. Witten, E.: The Verlinde algebra and the cohomology of the Grassmannian. In: Geometry, Topology and Physics. Conf. Proc. and Lecture Notes in Geom. Topol. Vol. VI, 1995, pp. 357–422

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 23 – 32 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Master Partitions for Large N Matrix Field Theories Matthias Staudacher?,?? Albert-Einstein-Institut, Max-Planck-Institut für Gravitationsphysik, Schlaatzweg 1, D-14473 Potsdam, Germany. E-mail:
[email protected] Received: 30 October 1998 / Accepted: 7 March 1999
Abstract: We introduce a systematic approach for treating the large N limit of matrix field theories.
1. Introduction

It has been known for thirty years that quantum field theory simplifies enormously if the number $N$ of internal field components tends to infinity. In the case where the $N$ components form a vector this leads to exact solutions in any dimension of spacetime. For physical applications, ranging from solid state physics to gauge theories and quantum gravity, a different situation is much more pertinent: The case of $N^{2}$ internal components that form a matrix. Here exact solutions have only been produced for very low dimensionalities. It is one of the outstanding problems of theoretical physics to extend large $N$ technology to physically interesting dimensions.

In the present article we will be concerned with matrix "spin systems", that is $D$-dimensional Euclidean lattice field theories whose internal degrees of freedom are hermitian, complex or unitary $N \times N$ matrices. The idea is to treat the problem by a three step procedure:

(1) Eguchi–Kawai reduction: Replace the $N = \infty$ field theory by a one-matrix model coupled to appropriate constant external field matrices.
(2) Character expansion: Express the partition function of the one-matrix model of (1) as a sum over polynomial representations, labelled by Young diagrams, of $U(N)$.
(3) Saddle point analysis: Find an effective Young diagram that dominates the partition sum of (2) in the large $N$ limit.

? Supported in part by EU Contract FMRX-CT96-0012.
?? Current institute address: Am Mühlenberg, Haus 5, D-14476 Golm, Germany.
The insight that step (1) is possible is due to Eguchi and Kawai [1]. Intuitively it says that, if a saddle point configuration exists at $N = \infty$, it should be given by a single translationally invariant matrix (the so-called master field). In practice the reduction is rather subtle, and we will be using the twisted EK reduction [2] which results in a one-matrix model in external constant fields encoding the original (discrete) space-time.

Step (2) is novel in this context and is the main focus of the present work. The one-matrix model of (1) still has $N^{2}$ degrees of freedom, and it is well known that a saddle point for matrix models can only be found once the degrees of freedom are reduced as $N^{2} \to N$. The external fields encoding space-time prevent any naive reduction to the $N$ eigenvalues of the matrix, which is the route of choice for simpler models without external fields. But it is possible to replace the matrix integral by a sum over partitions corresponding to a sum over all polynomial representations of $U(N)$. The crucial point is then that one ends up with a kind of one-dimensional spin model in Young diagram space with only $N$ variables: the possible lengths of the $N$ rows of the diagram.

Step (3) might appear to be an exotic idea: we claim that the $N = \infty$ "master field" can be described by a "master partition". However, it has already been recently demonstrated in a series of papers [3,4] that certain infinite sums over partitions are dominated by a saddle point configuration. This led to the solution of matrix models in external fields not treatable by any other method. The present models are more complicated, but not fundamentally different. The character expansions we find lead to a very interesting and apparently novel combinatorial problem in Young pattern space (see Sect. 4). More insight into this problem will be needed in order to proceed with the final step (3) of our program, the saddle point analysis.
We introduce what we call “lattice polynomials” $\Xi_h$, $\Upsilon_h$, which are polynomials in $\frac{1}{N}$. They depend on the Young diagram $h$ and the precise nature of the space-time lattice. It might be objected that the present approach is futile unless one can demonstrate that the lattice polynomials $\Xi_h$, $\Upsilon_h$ can be explicitly computed or at least bootstrapped at $N = \infty$. But there is one important argument against this pessimistic assessment: the lattice polynomials $\Xi_h$, $\Upsilon_h$ only depend on the nature of the lattice but not on the local measure of the minimally coupled (matrix) spins of the model$^1$. Therefore, solving interacting field theory in our language is of the same degree of complexity as solving the free field case.

Finally we should mention that our program is very general since it applies in principle to any large $N$ matrix spin system. It would be interesting to extend the method to matrix field theories with a gauge symmetry such as Yang-Mills theory. Indeed the EK reduction was initially designed for lattice gauge theory [1]. Recently it was demonstrated by Monte Carlo methods that even the path integral of continuum gauge theory may be EK reduced to a convergent ordinary multiple matrix integral [5]. A rigorous mathematical proof, as well as an investigation of whether the reduced model reinduces the field theory as $N \to \infty$, are still lacking. At any rate, reducing a $D$-dimensional gauge theory, one so far ends up with a nonlinearly coupled $D$-matrix model, which is not yet tractable by the present machinery unless it is understood how to perform a further reduction $D N^2 \to N^2$.
1 Except for the global symmetry of the matrix spins. In this paper we develop the theory in parallel for the case of U(N) global symmetry (hermitian matrices) and U(N ) × U(N ) symmetry (complex matrices). The other classical groups could presumably be treated as well, but it is well known that they do not lead to different large N limits.
Master Partitions for Large N Matrix Field Theories
2. Reduced Matrix Spin Systems

Consider a spin model on a periodic lattice. In order to be specific we will sketch the method for a two-dimensional lattice, but higher (or lower) dimensions can be treated as well. We will not dwell on details since they are well explained elsewhere. The variables are $N \times N$ hermitian matrices $M(x)$ defined on the lattice sites $x$:

$$Z_H = \int \prod_x \mathcal{D}M(x)\; e^{-S_H}, \qquad S_H = N\,\mathrm{Tr} \sum_x \Big[ \tfrac{1}{2} M(x)^2 + V\big(M(x)\big) - \frac{\beta}{2} \sum_{\mu=1,2} \big( M(x)M(x+\hat\mu) + M(x)M(x-\hat\mu) \big) \Big], \tag{1}$$
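where $\hat\mu$ denotes the unit vector in the $\mu$-direction. It is equally natural to consider general complex matrices $\Phi(x) \in \mathrm{GL}(N, \mathbb{C})$, in which case

$$Z_{GL} = \int \prod_x \mathcal{D}\Phi(x)\; e^{-S_{GL}}, \qquad S_{GL} = N\,\mathrm{Tr} \sum_x \Big[ \Phi(x)\Phi^\dagger(x) + V\big(\Phi(x)\Phi^\dagger(x)\big) - \beta \sum_{\mu=1,2} \big( \Phi(x)\Phi^\dagger(x+\hat\mu) + \Phi(x)\Phi^\dagger(x-\hat\mu) \big) \Big]. \tag{2}$$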
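If $V = 0$ in Eqs. (1), (2) the model is free. The integration measures in Eqs. (1), (2) are the flat measures for hermitian and complex matrices:

$$\mathcal{D}M = \prod_{i=1}^{N} \frac{dM_{ii}}{\sqrt{2\pi N^{-1}}} \prod_{i<j} \frac{d\mathrm{Re}M_{ij}\; d\mathrm{Im}M_{ij}}{\pi N^{-1}}, \qquad \mathcal{D}\Phi = \prod_{i,j=1}^{N} \frac{d\mathrm{Re}\Phi_{ij}\; d\mathrm{Im}\Phi_{ij}}{\pi N^{-1}}. \tag{3}$$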
A third, very important type of spin model is the so-called chiral field, which looks like the free complex model Eq. (2),

$$Z_U = \int \prod_x \mathcal{D}U(x)\; e^{-S_U}, \qquad S_U = -\beta N\,\mathrm{Tr} \sum_x \sum_{\mu=1,2} \big[ U(x)U^\dagger(x+\hat\mu) + U(x)U^\dagger(x-\hat\mu) \big], \tag{4}$$

but the matrices $U(x) \in \mathrm{U}(N)$ are unitary. In this case the measure $\mathcal{D}U(x)$ is the Haar measure on the group. The model is therefore not free.
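The Eguchi–Kawai reduction [1,2] states that the above lattice models can be replaced at $N = \infty$ by, respectively, the following one-matrix models coupled to constant external field matrices $P$ and $Q$:

$$Z_H = \int \mathcal{D}M \exp N\,\mathrm{Tr}\Big[ -\tfrac{1}{2} M^2 - V(M) + \beta \big( M P M P^\dagger + M Q M Q^\dagger \big) \Big], \tag{5}$$

$$Z_{GL} = \int \mathcal{D}\Phi \exp N\,\mathrm{Tr}\Big[ -\Phi\Phi^\dagger - V(\Phi\Phi^\dagger) \Big] \times \exp \beta N\,\mathrm{Tr}\Big[ \Phi P \Phi^\dagger P^\dagger + \Phi P^\dagger \Phi^\dagger P + \Phi Q \Phi^\dagger Q^\dagger + \Phi Q^\dagger \Phi^\dagger Q \Big], \tag{6}$$

$$Z_U = \int \mathcal{D}U \exp \beta N\,\mathrm{Tr}\Big[ U P U^\dagger P^\dagger + U P^\dagger U^\dagger P + U Q U^\dagger Q^\dagger + U Q^\dagger U^\dagger Q \Big]. \tag{7}$$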
Here $P = P_N$ and $Q = Q_N$ are the famous $N \times N$ unitary “shift and clock” matrices

$$P_N = \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ 1 & & & & 0 \end{pmatrix}, \qquad Q_N = \begin{pmatrix} 1 & & & & \\ & \omega_N & & & \\ & & \ddots & & \\ & & & \omega_N^{N-2} & \\ & & & & \omega_N^{N-1} \end{pmatrix}, \tag{8}$$

where $\omega_N = \exp\frac{2\pi i}{N}$ and $P_N Q_N = \omega_N Q_N P_N$. To be more precise, the free energies as well as appropriate correlation functions (see [2]) are identical to leading order in $\frac{1}{N}$ in the lattice field theory and the corresponding one-matrix model. The thermodynamic limit, that is a lattice of infinite extent, is approached when $N \to \infty$. We see that the structure of the lattice has been “hidden” in index space! It is natural to generalize the situation to a toroidal $K \times L$ lattice:

$$P = P_K \otimes 1_{N/K}, \qquad Q = Q_L \otimes 1_{N/L}, \tag{9}$$

where $N$ is chosen to be divisible by $K$ and $L$. This allows one to take the thermodynamic limit and the large $N$ limit independently. If we put $L = 1$ (we can then equivalently omit $Q$ altogether) the target space becomes a closed one-dimensional chain. We suspect that matrix models on arbitrary discrete target spaces can be EK reduced by appropriate external matrices, but this has not been worked out yet.

3. Character Expansions

Now we turn to step (2) and rewrite the reduced hermitian, complex and unitary matrix integrals Eqs. (5), (6), (7) as sums over representations of U(N). To this end introduce the following source integrals:

$$Z_H[J] = \int \mathcal{D}M \exp\Big[ N\,\mathrm{Tr}\big( -\tfrac{1}{2} M^2 - V(M) + J M \big) \Big], \tag{10}$$

$$Z_{GL}[J\bar J] = \int \mathcal{D}\Phi \exp\Big[ N\,\mathrm{Tr}\big( -\Phi\Phi^\dagger - V(\Phi\Phi^\dagger) + J\Phi + \Phi^\dagger \bar J \big) \Big], \tag{11}$$
$$Z_U[J\bar J] = \int \mathcal{D}U \exp\Big[ N\,\mathrm{Tr}\big( J U + U^\dagger \bar J \big) \Big]. \tag{12}$$
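The operators that act on these source integrals in Eqs. (13), (14) are built from the shift and clock matrices of Eq. (8). As a sanity check of their algebra, the Python sketch below verifies numerically the relation $P_N Q_N = \omega_N Q_N P_N$ and the trace identity $\mathrm{Tr}(P^k Q^l) = N \delta_{k,0}\delta_{l,0}$ for $0 \le k, l < N$, which is used in Sect. 4. The function names (`shift_clock`, `commutation_defect`, `trace_identity_holds`) are hypothetical, and only the Python standard library is assumed.

```python
import cmath


def shift_clock(N):
    """Return the shift matrix P_N, the clock matrix Q_N and omega_N of Eq. (8)."""
    omega = cmath.exp(2j * cmath.pi / N)
    P = [[1 if j == (i + 1) % N else 0 for j in range(N)] for i in range(N)]
    Q = [[omega ** i if i == j else 0 for j in range(N)] for i in range(N)]
    return P, Q, omega


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, k):
    """k-th power of a square matrix, k >= 0, by repeated multiplication."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


def commutation_defect(N):
    """Largest entry (in modulus) of P Q - omega Q P; zero up to rounding."""
    P, Q, omega = shift_clock(N)
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return max(abs(PQ[i][j] - omega * QP[i][j])
               for i in range(N) for j in range(N))


def trace_identity_holds(N, tol=1e-9):
    """Check Tr(P^k Q^l) = N delta_{k,0} delta_{l,0} for 0 <= k, l < N."""
    P, Q, _ = shift_clock(N)
    for k in range(N):
        for l in range(N):
            prod = matmul(matpow(P, k), matpow(Q, l))
            tr = sum(prod[i][i] for i in range(N))
            expected = N if (k == 0 and l == 0) else 0
            if abs(tr - expected) > tol:
                return False
    return True
```

Calling `commutation_defect(5)` should return a number at the level of floating-point rounding, and `trace_identity_holds(5)` should return `True`.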
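The two different ways of introducing a source are due to the U(N) symmetry of hermitian matrices on the one hand and the U(N) × U(N) symmetry of complex (and complex unitary) matrices on the other. The reduced models are easily obtained from the source integrals by applying an operator:

$$Z_H = \exp\Big( \frac{\beta}{N}\,\mathrm{Tr}\big[ \partial P \partial P^\dagger + \partial Q \partial Q^\dagger \big] \Big) \cdot Z_H[J] \Big|_{J=0}, \tag{13}$$

$$Z_{GL,U} = \exp\Big( \frac{\beta}{N}\,\mathrm{Tr}\big[ \partial P \bar\partial P^\dagger + \partial P^\dagger \bar\partial P + \partial Q \bar\partial Q^\dagger + \partial Q^\dagger \bar\partial Q \big] \Big) \cdot Z_{GL,U}[J\bar J] \Big|_{J=\bar J=0}. \tag{14}$$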
Here $\partial$, $\bar\partial$ denote $N \times N$ matrix differential operators whose matrix elements are $\partial_{ji} = \frac{\partial}{\partial J_{ij}}$ and $\bar\partial_{ji} = \frac{\partial}{\partial \bar J_{ij}}$. It is clear that the source integrals are class functions of, respectively, $J$ and $J\bar J$. Therefore they can be expressed as character expansions, with known (see [3,4,6]) expansion coefficients. If $V = 0$, they read for the hermitian and complex source integrals, respectively,

$$Z_H[J] = \exp\Big( \tfrac{1}{2} N\,\mathrm{Tr}\, J^2 \Big) = \sum_h \chi_h(A_2)\,\chi_h(J), \tag{15}$$

$$Z_{GL}[J\bar J] = \exp\big( N\,\mathrm{Tr}\, J\bar J \big) = \sum_h \chi_h(A_1)\,\chi_h(J\bar J), \tag{16}$$

while for the unitary source integral one has [6]

$$Z_U[J\bar J] = \sum_h \frac{\chi_h(A_1)\,\chi_h(A_1)}{\chi_h(1)}\,\chi_h(J\bar J). \tag{17}$$
Here the sum runs over all partitions $h$ labeled by the shifted weights $h_i = N - i + m_i$, where $m_i \ge 0$, $i = 1, \ldots, N$, is the number of boxes in the $i$-th row of the Young pattern associated to $h$. $\chi_h(J)$ is the Schur function of $J$ associated with the diagram $h$. It is identical to the Weyl character of the matrix $J$ corresponding to the representation labeled by $h$. $A_1$ and $A_2$ are defined through $\mathrm{Tr}\, A_1^k = N(\delta_{k,0} + \delta_{k,1})$ and $\mathrm{Tr}\, A_2^k = N(\delta_{k,0} + \delta_{k,2})$, and $\chi_h(1)$ is the dimension of the representation. For more details on the notation, and for explicit formulas for the characters $\chi_h(A_1)$, $\chi_h(A_2)$ and $\chi_h(1)$ see [3,4]. For a nonzero potential $V$, the hermitian and complex character expansions become a bit more complicated, but are still available:

$$Z_H[J] = \sum_h \Theta_h\,\chi_h(J), \tag{18}$$

$$Z_{GL}[J\bar J] = \sum_h \Omega_h\,\chi_h(J\bar J), \tag{19}$$
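where $\Theta_h$ is given by

$$\Theta_h = \frac{\chi_h(A_1)}{\chi_h(1)} \int \mathcal{D}M \exp\Big[ N\,\mathrm{Tr}\big( -\tfrac{1}{2} M^2 - V(M) \big) \Big]\,\chi_h(M), \tag{20}$$

and $\Omega_h$ by

$$\Omega_h = \Big( \frac{\chi_h(A_1)}{\chi_h(1)} \Big)^2 \int \mathcal{D}\Phi \exp\Big[ N\,\mathrm{Tr}\big( -\Phi\Phi^\dagger - V(\Phi\Phi^\dagger) \big) \Big]\,\chi_h(\Phi\Phi^\dagger). \tag{21}$$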
The integrals appearing in Eqs. (20), (21) are ordinary one-matrix integrals which may be computed rather explicitly as $N \times N$ determinants. Their analysis in the $N \to \infty$ limit proceeds by employing standard techniques, supplemented by the methods of [3]. Now we apply the operators in Eqs. (13), (14) in order to generate the space-time lattice; this results in character expansions for the reduced matrix field theories. In the hermitian case one has (here $|h| = \sum_i m_i$ = number of boxes in the Young diagram)

$$Z_H = \sum_h \chi_h(A_2)\,\Xi_h\,\beta^{\frac{|h|}{2}} \quad \text{for } V = 0, \tag{22}$$

$$Z_H = \sum_h \Theta_h\,\Xi_h\,\beta^{\frac{|h|}{2}} \quad \text{for } V \neq 0, \tag{23}$$

with

$$\Xi_h = \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ \partial P \partial P^\dagger + \partial Q \partial Q^\dagger \big] \Big) \cdot \chi_h(J) \Big|_{J=0}. \tag{24}$$

The free complex, interacting complex, and the unitary case become

$$Z_{GL} = \sum_h \chi_h(A_1)\,\Upsilon_h\,\beta^{|h|} \quad \text{for } V = 0, \tag{25}$$

$$Z_{GL} = \sum_h \Omega_h\,\Upsilon_h\,\beta^{|h|} \quad \text{for } V \neq 0, \tag{26}$$

$$Z_U = \sum_h \frac{\chi_h(A_1)\,\chi_h(A_1)}{\chi_h(1)}\,\Upsilon_h\,\beta^{|h|}, \tag{27}$$
with

$$\Upsilon_h = \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ \partial P \bar\partial P^\dagger + \partial P^\dagger \bar\partial P + \partial Q \bar\partial Q^\dagger + \partial Q^\dagger \bar\partial Q \big] \Big) \cdot \chi_h(J\bar J) \Big|_{J=\bar J=0}. \tag{28}$$

The character expansions Eqs. (22), (23), (25), (26), (27) are at the heart of our proposal. It is seen that they neatly separate the nature of the local spin weight ($\chi_h(A_2)$, $\Theta_h$, $\chi_h(A_1)$, $\Omega_h$, $(\chi_h(A_1))^2 (\chi_h(1))^{-1}$) and the nature of the embedding space ($\Xi_h$, $\Upsilon_h$). As a striking example, note that from the point of view of our character expansion method the difference between the free Gaussian model on a toroidal lattice Eq. (25) and the non-trivial chiral model Eq. (27) is a simple, explicitly known factor

$$\frac{\chi_h(A_1)}{\chi_h(1)} = N^{|h|} \prod_{i=1}^{N} \frac{(N-i)!}{h_i!}.$$
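The explicit factor above is easy to evaluate for small diagrams. The following Python sketch computes $\chi_h(A_1)/\chi_h(1)$ from the shifted weights $h_i = N - i + m_i$ exactly, using rational arithmetic, and cross-checks it against an independent expression. That second expression, $\prod_{\text{boxes}} N/(N + c)$ with $c = j - i$ the content of the box in row $i$ and column $j$, is not taken from the text: it is an assumption of this sketch, obtained by combining the standard hook length and hook-content formulas with $\mathrm{Tr}\, A_1^k = N\delta_{k,1}$ for $k \ge 1$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def ratio_shifted_weights(rows, N):
    """chi_h(A_1)/chi_h(1) = N^|h| * prod_i (N-i)!/h_i!, with h_i = N - i + m_i.

    `rows` lists the row lengths m_1 >= m_2 >= ... of the diagram (at most N rows).
    """
    m = list(rows) + [0] * (N - len(rows))
    value = Fraction(N) ** sum(m)
    for i in range(1, N + 1):
        h_i = N - i + m[i - 1]
        value *= Fraction(factorial(N - i), factorial(h_i))
    return value


def ratio_contents(rows, N):
    """Independent check (assumption of this sketch): product over boxes of N/(N + content)."""
    value = Fraction(1)
    for i, length in enumerate(rows):
        for j in range(length):
            value *= Fraction(N, N + j - i)
    return value
```

For instance, for the one-row diagram with two boxes and $N = 3$ both functions give $N/(N+1) = 3/4$, in agreement with $\chi_{(2)}(A_1) = N^2/2$ and $\chi_{(2)}(1) = N(N+1)/2$.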
The character expansions involve sums over $N$ variables only and we can write down a saddle point equation for the effective density of the master partition. In order to complete the program, we need a second bootstrap equation for the novel quantities $\Xi_h$ and $\Upsilon_h$, which contain the connectivity information of the lattice.
4. Lattice Polynomials

Inspection of the quantities $\Xi_h$ and $\Upsilon_h$ in Eqs. (24), (28) shows that they are polynomials in the variable $\frac{1}{N}$ of degree not higher than, respectively, $\frac{1}{2}|h| - 1$ and $|h| - 2$. They are zero if the number $|h|$ of boxes in the Young pattern is odd. Conjugating the diagram gives the same polynomial except for the replacement $\frac{1}{N} \to -\frac{1}{N}$. The first few can be computed by brute force calculation directly from the definitions Eqs. (24), (28), see Table 1.

Table 1. The first few D = 2 lattice polynomials

h        $\Xi_h$          $\Upsilon_h$
2        2                2
1^2      2                2
4        3 + 12/N         3 + 24/N + 54/N^2
3 1      5 + 4/N          5 + 8/N + 18/N^2
2^2      6                6
2 1^2    5 − 4/N          5 − 8/N + 18/N^2
1^4      3 − 12/N         3 − 24/N + 54/N^2
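The constant ($N = \infty$) terms of the $\Xi_h$ column can be cross-checked against the generating function given in Eq. (33) of this section. The Python sketch below assumes the reading $\prod_{i,j=1}^{M} (1 - z_i z_j)^{-2} = \sum_h \Xi_h^{N=\infty} \chi_h(z)$, with the product running over ordered pairs $(i,j)$; this reading is an assumption of the sketch. It works with $M = 2$ variables, truncates the power series at a given total degree, and extracts the Schur coefficients by multiplying with $z_1 - z_2$, since $\chi_h(z)\,(z_1 - z_2) = z_1^{m_1+1} z_2^{m_2} - z_1^{m_2} z_2^{m_1+1}$ for a two-row diagram. All function names are hypothetical.

```python
from collections import defaultdict


def multiply(p, q, max_degree):
    """Product of two polynomials in (z1, z2), truncated at total degree max_degree.

    Polynomials are dicts mapping exponent pairs (e1, e2) to integer coefficients.
    """
    out = defaultdict(int)
    for (a1, a2), ca in p.items():
        for (b1, b2), cb in q.items():
            if a1 + a2 + b1 + b2 <= max_degree:
                out[(a1 + b1, a2 + b2)] += ca * cb
    return dict(out)


def geometric(e1, e2, max_degree):
    """Truncated power series of 1 / (1 - z1**e1 * z2**e2)."""
    return {(k * e1, k * e2): 1 for k in range(max_degree // (e1 + e2) + 1)}


def leading_lattice_terms(max_degree):
    """Schur coefficients of prod_{i,j=1}^{2} (1 - z_i z_j)^(-2) up to |h| = max_degree.

    Returns a dict mapping row lengths (m1, m2) to the nonzero coefficients.
    """
    series = {(0, 0): 1}
    # Ordered pairs (i, j) give the monomials z1^2, z1 z2, z2 z1, z2^2.
    for e1, e2 in [(2, 0), (1, 1), (1, 1), (0, 2)]:
        for _ in range(2):  # each factor enters squared
            series = multiply(series, geometric(e1, e2, max_degree), max_degree)
    # Multiply by the Vandermonde factor (z1 - z2).
    alternant = defaultdict(int)
    for (a, b), c in series.items():
        alternant[(a + 1, b)] += c
        alternant[(a, b + 1)] -= c
    result = {}
    for m1 in range(max_degree + 1):
        for m2 in range(min(m1, max_degree - m1) + 1):
            coeff = alternant.get((m1 + 1, m2), 0)
            if coeff:
                result[(m1, m2)] = coeff
    return result
```

With `max_degree = 4` the nonzero coefficients are 2 for the diagrams 2 and 1^2, and 3, 5, 6 for the diagrams 4, 3 1 and 2^2, matching the constant terms listed in Table 1; diagrams with more than two rows are not visible with only two variables.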
Here we used $\mathrm{Tr}(P^k Q^l) = N\delta_{k,0}\delta_{l,0}$, which is true as long as $|k| < N$, $|l| < N$. We also replaced $\omega_N \to 1$, $\omega_N^* \to 1$ (remember $\omega_N = \exp\frac{2\pi i}{N}$): in other words, we assumed $P$ and $Q$ to commute at large $N$. Both assumptions are innocent at least in the strong coupling (small $\beta$) phase. If the model possesses a weak coupling phase (like e.g. the chiral field Eq. (7)), these assumptions may have to be reconsidered, if we want the character expansion to describe this second phase as well. This is because in the present approach we expect large $N$ phase transitions to correspond to the situation where the number of rows of the master partition is of $O(N)$ (“touching transition”). Note that we cannot drop the other terms of $O(\frac{1}{N})$ in $\Xi_h$, $\Upsilon_h$ since the character expansions are for the partition function and not for the free energy.

The direct calculation of the lattice polynomials quickly gets very tedious. The combinatorics involved seems to be of a novel type. While we have not yet found an efficient calculational scheme or recursive method, let us give some interesting representations for $\Xi_h$ and $\Upsilon_h$ that may prove useful later. Introduce the following Gaussian measure on the space of $M \times N$ ($M \le N$) complex matrices $\Lambda$:

$$[\mathcal{D}\Lambda] = \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{d\mathrm{Re}\Lambda_{ij}\; d\mathrm{Im}\Lambda_{ij}}{\pi N^{-1}}\, \exp\Big[ N\,\mathrm{Tr}\big( -\Lambda\Lambda^\dagger \big) \Big]. \tag{29}$$
This measure is invariant under U(M) × U(N). It is then fairly easy to prove (cf. [4]) the following representation for the character of the source:

$$\chi_h(J) = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda]\, \exp\Big[ N\,\mathrm{Tr}\; U \Lambda J \Lambda^\dagger \Big], \tag{30}$$
where $U \in \mathrm{U}(M)$ is unitary and $\mathcal{D}U$ is the Haar measure on U(M). This formula is valid for diagrams $h$ with at most $M$ rows. Therefore $\Xi_h$ becomes, cf. Eq. (24),

$$\Xi_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda]\, \exp N\,\mathrm{Tr}\Big[ \Lambda^\dagger U \Lambda P \Lambda^\dagger U \Lambda P^\dagger + \Lambda^\dagger U \Lambda Q \Lambda^\dagger U \Lambda Q^\dagger \Big]. \tag{31}$$

After a Hubbard–Stratonovich transformation decoupling the quartic terms by Gaussian $M \times M$ complex matrices $S$ and $T$ (with measure as in Eq. (29) with $N \to M$), and integration over $\Lambda$, we obtain the representation

$$\Xi_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}S][\mathcal{D}T] \times \exp\Big( \mathrm{Tr}_{M \otimes N} \sum_{k=1}^{\infty} \frac{1}{k} \Big[ S U \otimes P + S^\dagger U \otimes P^\dagger + T U \otimes Q + T^\dagger U \otimes Q^\dagger \Big]^k \Big). \tag{32}$$

The combinatorial interpretation of the exponential in Eq. (32) is the following: we have a generating function for a non-commutative random walk on a two-dimensional lattice with variable $U$. The representation is useful for getting some exact results on the $\Xi_h$, but we have not yet been able to compute the integral Eq. (32) exactly except for $M = 1$ (characters with just one row). E.g. we can find a generating function (with $z_i$ being the eigenvalues of $U$) for the large $N$ limit of $\Xi_h$,

$$\prod_{i,j}^{M} \frac{1}{(1 - z_i z_j)^2} = \sum_h \Xi_h^{N=\infty}\, \chi_h(z), \tag{33}$$

giving the constant terms of the lattice polynomials. This is however not sufficient for the large $N$ limit of the field theory, as already mentioned. A curious feature of Eq. (32) is that we can take $N \to \infty$ while keeping $M$ in the range $1 \ll M \ll N$. That is, it should be possible to find a saddle point for the situation where the row lengths are large compared to the number of rows, corresponding to the extreme strong coupling limit. Furthermore, it should be investigated whether the $M \times M$ matrices can be taken to commute as $N \to \infty$.

Similar, if slightly more complicated, representations are possible for $\Upsilon_h$; here the starting point is the expression

$$\chi_h(J\bar J) = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda_1][\mathcal{D}\Lambda_2]\, \exp N\,\mathrm{Tr}\Big[ U^{\frac12} \Lambda_1 J \Lambda_2^\dagger + \Lambda_2 \bar J \Lambda_1^\dagger U^{\frac12} \Big], \tag{34}$$

which means the lattice polynomials become

$$\Upsilon_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda_1][\mathcal{D}\Lambda_2] \times \exp N\,\mathrm{Tr}\Big[ \Lambda_2^\dagger U^{\frac12} \Lambda_1 P \Lambda_1^\dagger U^{\frac12} \Lambda_2 P^\dagger + \Lambda_2^\dagger U^{\frac12} \Lambda_1 P^\dagger \Lambda_1^\dagger U^{\frac12} \Lambda_2 P \Big] \times \exp N\,\mathrm{Tr}\Big[ \Lambda_2^\dagger U^{\frac12} \Lambda_1 Q \Lambda_1^\dagger U^{\frac12} \Lambda_2 Q^\dagger + \Lambda_2^\dagger U^{\frac12} \Lambda_1 Q^\dagger \Lambda_1^\dagger U^{\frac12} \Lambda_2 Q \Big], \tag{35}$$
and the non-commutative random walk representation is

$$\Upsilon_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}S][\mathcal{D}\bar S][\mathcal{D}T][\mathcal{D}\bar T] \times \exp\Big( \mathrm{Tr}_{M \otimes N} \sum_{k=1}^{\infty} \frac{1}{k} \Big[ S U^{\frac12} \otimes P + \bar S U^{\frac12} \otimes P^\dagger + T U^{\frac12} \otimes Q + \bar T U^{\frac12} \otimes Q^\dagger \Big]^k \Big) \times \exp\Big( \mathrm{Tr}_{M \otimes N} \sum_{k=1}^{\infty} \frac{1}{k} \Big[ \bar S^\dagger U^{\frac12} \otimes P + S^\dagger U^{\frac12} \otimes P^\dagger + \bar T^\dagger U^{\frac12} \otimes Q + T^\dagger U^{\frac12} \otimes Q^\dagger \Big]^k \Big), \tag{36}$$

from which we find that $\Upsilon_h^{N=\infty} = \Xi_h^{N=\infty}$, cf. Eq. (33), but the $\frac{1}{N}$ corrections are different (see Table 1). Again, for arbitrary one-row representations ($M = 1$) it is possible to obtain $\Upsilon_h$ rather explicitly.

Another potentially useful representation$^2$ of the lattice polynomials is given by the following dual equations: Eq. (24) becomes

$$\Xi_h = \chi_h(\partial) \cdot \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ J P J P^\dagger + J Q J Q^\dagger \big] \Big) \Big|_{J=0}, \tag{37}$$

and Eq. (28) is dual to

$$\Upsilon_h = \chi_h(\partial\bar\partial) \cdot \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ J P \bar J P^\dagger + J P^\dagger \bar J P + J Q \bar J Q^\dagger + J Q^\dagger \bar J Q \big] \Big) \Big|_{J=\bar J=0}. \tag{38}$$

We could go on and discuss correlation functions, which are naturally included in the present formalism. In particular, it is straightforward to give expressions for their character expansions in terms of modified lattice polynomials, and it remains true that the combinatorics is independent of whether the reduced field theory is free or interacting. This is however beyond the scope of the present article. While it is unclear whether the $D \ge 2$ lattice polynomials can be computed exactly for a general partition, it should be stressed once more that this is unnecessary; all we need is an indirect method in order to extract the large $N$ behavior.

5. Conclusions

This solution to the problem of the large $N$ limit of (non-gauge) matrix field theories is not yet complete, since the structure of the lattice polynomials we introduced still needs to be further analyzed in order to be able to write the full set of saddle point equations. However we feel that we are definitely closing in on the large $N$ problem, and that we have brought it into the simplest form to date. The proposed approach is concrete, systematic and rather general: we demonstrated that the reduction from $N^2$ to $N$ variables is possible once one changes variables from matrices to partitions. In this language the master field becomes a master partition.
Presumably one should first (re)derive in the current framework the exact solutions for some lower dimensional target spaces before dealing with the two (and higher) dimensional field theories.

Acknowledgements. We thank Daya-Nand Verma and Brian G. Wybourne for interesting and useful discussions concerning the combinatorial aspects of this project. This work was supported in part by the EU under Contract FMRX-CT96-0012.

2 We thank D.-N. Verma for pointing this out to us.
References

1. Eguchi, T. and Kawai, H.: Reduction of dynamical degrees of freedom in the large N gauge theory. Phys. Rev. Lett. 48, 1063 (1982)
2. Eguchi, T. and Nakayama, R.: Simplification of Quenching Procedure for Large N Spin Models. Phys. Lett. B122, 59 (1982)
3. Kazakov, V.A., Staudacher, M. and Wynter, T.: Character Expansion Methods for Matrix Models of Dually Weighted Graphs. hep-th/9502132, Commun. Math. Phys. 177, 451 (1996); Almost Flat Planar Diagrams. hep-th/9506174, Commun. Math. Phys. 179, 235 (1996); Exact Solution of Discrete 2D R^2 Gravity. hep-th/9601069, Nucl. Phys. B471, 309 (1996)
4. Kostov, I. and Staudacher, M.: Two-Dimensional Chiral Matrix Models and String Theories. hep-th/9611011, Phys. Lett. B394, 75 (1997); Kostov, I., Staudacher, M. and Wynter, T.: Complex Matrix Models and Statistics of Branched Coverings of 2D Surfaces. hep-th/9703189, Commun. Math. Phys. 191, 283 (1998)
5. Krauth, W. and Staudacher, M.: Finite Yang-Mills Integrals. AEI-065, hep-th/9804199, accepted for publication in Phys. Lett. B.
6. Bars, I.: U(N) Integral for the Generating Functional in Lattice Gauge Theory. J. Math. Phys. 21, 2678 (1980)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 33 –55 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Statistics of Return Times: A General Framework and New Applications

Masaki Hirata$^1$, Benoît Saussol$^2$, Sandro Vaienti$^2$

1 Mathematical Department, Tokyo Metropolitan University, Japan. E-mail: [email protected]
2 Centre de Physique Théorique, Luminy, Marseille and PHYMAT, Mathematical Department, University of Toulon, France. E-mail: [email protected]; [email protected]

Received: 4 August 1998 / Accepted: 9 March 1999
Abstract: In this paper we provide general estimates for the errors between the distribution of the first, and more generally, the $K$-th return time (suitably rescaled) and the Poisson law for measurable dynamical systems. In the case that the system exhibits strong mixing properties, these bounds are explicitly expressed in terms of the speed of mixing. Using these approximations, the Poisson law is finally proved to hold for a large class of non-hyperbolic systems on the interval.

1. Introduction

The investigation of asymptotically rare events is emerging as a new direction in the understanding of statistical properties of dynamical systems. By “asymptotically rare” events we mean, in a wide sense and following the terminology in the review paper [Coe97], those events which have asymptotically zero probability but which occur with a well determined asymptotic limit law. In the dynamical setting, where we have a probability space $(X, \mathcal{B}, \mu)$ with a measurable $\mu$-preserving mapping $T$ acting on it, the “events” will usually be the visits into a sequence of sets $\Omega_k \in \mathcal{B}$ of positive measure but with their measure going to zero in the limit of large $k$. We call the event “rare” when the expected entrance time in $\Omega_k$ diverges with $k$. A well-known result in ergodic theory shows how abundant the “asymptotically rare” events are. Let us consider in fact an ergodic measure $\mu$ for an endomorphism $T$ and take a measurable subset $\Omega$: then Kac’s theorem [CFS82] says that the expectation of the return time to $\Omega$, starting from $\Omega$, is just $\mu(\Omega)^{-1}$. Kac’s theorem suggests the good normalization to keep in order to study the asymptotic distribution of the return time to $\Omega$. The natural object will thus be the distribution:

$$F_\Omega(t) = \mu_\Omega\big( x \in \Omega : \tau_\Omega(x)\,\mu(\Omega) > t \big), \tag{1}$$

where $\tau_\Omega(x)$ is the first return time to $\Omega$ provided that $x \in \Omega$, and $\mu_\Omega$ is the normalized restriction of $\mu$ to $\Omega$. The question will be whether the limit of $F_\Omega(t)$ exists when the
measure goes to zero and what kind of distribution is recovered. The condition that the starting point $x$ in (1) belongs to $\Omega$ could be relaxed by asking that $x$ belongs to the whole space. In this case, $F_\Omega(t)$ will give the distribution of the “visiting time” into $\Omega$, but in order to get its asymptotic distribution, a suitable normalization is needed [GS97]. The situations sketched above could be considerably refined, producing richer processes (see the quoted paper [Coe97] for a historical account of these questions and an exhaustive bibliography). We will however explore some of them in this paper under a more general perspective, and then give applications to classes of systems never investigated before.

Let us first come back to formula (1) and replace $\Omega$ with a decreasing sequence of neighborhoods of a given point $z \in X$, $\Omega_\varepsilon(z)$, such that their measure goes to zero when $\varepsilon \to 0^+$. Then for some classes of hyperbolic dynamical systems, notably axiom A diffeomorphisms [Hir93], transitive Markov chains [Pit91], expanding maps of the interval with a spectral gap [Col96] and in the more general setting of systems verifying a strong mixing property (“self-mixing” condition and $\varphi$-mixing [Hir95]), and recently even in the case of rational maps with critical points in the Julia set [Hay98a], it is possible to prove that the distribution $F_{\Omega_\varepsilon(z)}(t)$ goes to the exponential-one law $e^{-t}$, and this for $\mu$-almost every $z \in X$. A strong improvement of this kind of result appears in the paper [GS97], where an upper bound for the difference

$$\Big| \mu\Big( \tau_A(x) > \frac{t}{\mu(A)\lambda(A)} \Big) - e^{-t} \Big|$$

was explicitly computed in the case of $\varphi$-mixing systems, where $A$ is a cylinder set and $\lambda(A)$ a suitable normalizing factor. Recently [Hay98b] obtained an exponential error estimate for a quantity like (1) in the case of parabolic rational maps.
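To enrich the process, and the statistics, one then introduces the $K$-th return time, $\tau^K_{\Omega_\varepsilon}(x)$, from $\Omega_\varepsilon$ into itself (see the precise definition in the next section), where $\Omega_\varepsilon = \Omega_\varepsilon(z)$ is still a neighborhood of some point $z \in X$. For the dynamical systems quoted above, a Poisson statistics can be proved, by showing that the distribution of successive return times into $\Omega_\varepsilon$ satisfies, for $\mu$-a.e. $z$,

$$\mu_{\Omega_\varepsilon}\Big( x \in \Omega_\varepsilon : \tau^K_{\Omega_\varepsilon}(x) \le t < \tau^{K+1}_{\Omega_\varepsilon}(x) \Big) \xrightarrow[\varepsilon \to 0^+]{} \frac{t^K}{K!}\, e^{-t}. \tag{2}$$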
The preceding results deserve further investigation in at least two directions:

1. extend them to non-hyperbolic dynamical systems and, more ambitiously, check their robustness when the system loses strong mixing properties;
2. prove an error estimate even for the distribution of successive return times (2) and relate this approximation rate, if possible, to the statistical properties of the system like correlations decay or spectral properties.

We try to give partial answers to these questions in this paper. The general setting we work in is the return time(s) to the set $\Omega$ starting from $\Omega$ itself, as expressed in formulas (1) and (2) (although in Theorem 2.1 we will also consider points starting everywhere). The first attempt was to give, for measure preserving dynamical systems, a general upper bound for the difference between the distribution of the (rescaled) first return time and the exponential-one law $e^{-t}$, and then between the distribution of high-order (rescaled) return times and the Poisson law $\frac{t^K}{K!} e^{-t}$. We do not make any hypothesis on the set $\Omega$, nor on the ergodic properties of $\mu$; nevertheless these bounds are expressed in terms of the
self-interactions of the set $\Omega$ and can be explicitly computed when typical rates of mixing are known (uniform mixing, $\alpha$-mixing or $\varphi$-mixing). In this context, our bounds greatly improve and simplify the hypothesis of self-mixing condition of [Hir95], which was a powerful tool to get a sufficient condition for the Poisson statistics. This first part of the paper is essentially due to one of us (B.S.) and is part of his Ph.D. Thesis [Sau98b].

In the second part we apply the preceding bounds to new situations. The systems we treat are some non-uniformly hyperbolic maps of the interval; these maps are characterized by a structure parameter, say $\alpha$, which measures the order of tangency at a neutral fixed point and governs the algebraic decay of correlations (in our example the order is $n^{1-1/\alpha}$). If $\mu$ denotes the absolutely continuous invariant measure, we prove Poisson statistics (in the sense made precise above), by giving an explicit approximation of the asymptotic law in terms of the measure of the set $\Omega_n$, where in this case $\Omega_n$ is a decreasing sequence of cylinder sets chosen around almost all points in the interval. To be precise the error is of the type $\mu(\Omega_n)^\beta$, for any $\beta < 1 - \alpha$, and therefore $\beta$ is explicitly related to $\alpha$ and optimized just by $1 - \alpha$. For the distributions of the $K$-th return times the bounds simply become $\mu(\Omega_n)^{\beta/K}$. By inspecting these results, we could argue that the non-hyperbolic character of the maps is reflected in the error term; to be more precise we think that as soon as the degree of non-uniform hyperbolicity of the map is monitored by a structure parameter $\alpha$, this parameter will appear explicitly in the approximation to the Poisson law, which suggests, conversely, that we could use Poissonian statistics to test lack of hyperbolicity. Our claim is motivated by two more observations: first, in getting these bounds we proved a sort of $\alpha$-mixing for the map with a rate which was exactly the same as the algebraic rate for the correlations’ decay.
Second, in the forthcoming paper [Sau98a] the return times are analyzed for a class of piecewise expanding multidimensional maps. Although the mixing properties are much more difficult to handle, especially because of the presence of singularity lines and the geometry of their shape, the uniform dilatation will provide bounds of the form $\mu(\Omega_n)^\beta$ and $\mu(\Omega_n)^{\beta/K}$ for all $\beta < 1$, which reflects the fact that all the quantities involved, and the correlations’ decay too, admit exponential estimates. We will come back to these questions in Sect. 4.

As a final remark, we address two questions:

1. Our analysis is local: the events are chosen around almost all points which we could call, following a widespread tradition, generic (for our statistics). What happens if we consider non-generic points (discarding of course some trivial situations like fixed points)? Could we see their (possibly different) statistics by involving some sort of large deviation argument?
2. What is the place of Poissonian statistics regarding other ergodic characterizations of dynamical systems? For example: what is the largest class of ergodic dynamical systems enjoying a Poissonian statistics? Conversely, does an invariant measure satisfying that behavior imply strong ergodic properties too?
2. General Bounds on the Distribution of Return Times

We will consider in this section a probability space $(X, \mathcal{B}, \mu)$ together with a measure preserving transformation $T$ acting on $X$. The basic object will be the return time into a positive measure set $U$ starting from $U$, defined by

$$\tau_U(x) = \inf\Big( \big\{ k \ge 1 \;|\; T^k x \in U \big\} \cup \{\infty\} \Big).$$
We define as usual the conditional measure $\mu_U$ on $U$ by $\mu_U(A) = \frac{\mu(A \cap U)}{\mu(U)}$. We then recall Kac’s theorem, which says that the conditional expectation of $\tau_U$ given $U$ is finite, and equal to $1/\mu(U)$, when $\mu$ is ergodic. As indicated in the introduction, Kac’s result suggests how to properly rescale the return time when we are interested in its distribution.
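Kac’s theorem can be checked exactly on a toy example. The Python sketch below is only an illustration under assumptions not taken from the text: the space is the finite set $\{0, \ldots, n-1\}$ with the uniform measure, $T$ is the cyclic map $x \mapsto x + 3 \bmod 10$ (a single cycle, hence ergodic for the uniform measure), and the function names are hypothetical. The mean of $\tau_U$ with respect to $\mu_U$ then equals $1/\mu(U) = n/|U|$.

```python
from fractions import Fraction


def first_return_time(x, U, T):
    """tau_U(x) = inf{k >= 1 : T^k x in U}; assumes the orbit of x returns to U."""
    k, y = 1, T(x)
    while y not in U:
        y = T(y)
        k += 1
    return k


def mean_return_time(U, T):
    """Expectation of tau_U under mu_U, for the uniform measure on a finite set."""
    return Fraction(sum(first_return_time(x, U, T) for x in U), len(U))


def rotation(x):
    """Cyclic map on {0, ..., 9}; one cycle of length 10 since gcd(3, 10) = 1."""
    return (x + 3) % 10
```

With $U = \{0, 3, 4\}$ the three return times are 1, 7 and 2, so `mean_return_time({0, 3, 4}, rotation)` equals $10/3 = 1/\mu(U)$.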
2.1. First return time. We begin by showing that the distribution of the first return time into the set $U$ starting from $U$ is close to an exponential-one law if and only if the two distributions of the first return time starting, respectively, from $U$ and everywhere, are close.

Theorem 2.1. Let us define $c(k, U) = \mu_U(\tau_U > k) - \mu(\tau_U > k)$ and set $c(U) = \sup_k |c(k, U)|$. The distribution of the (rescaled) first return time into the set $U$ differs from the exponential-one law by at most $d(U) := 4\mu(U) + c(U)(1 + \log c(U)^{-1})$, namely:

$$\sup_{t \ge 0} \Big| \mu\Big( \tau_U > \frac{t}{\mu(U)} \Big) - e^{-t} \Big| \le d(U),$$

which is still true starting from $U$:

$$\sup_{t \ge 0} \Big| \mu_U\Big( \tau_U > \frac{t}{\mu(U)} \Big) - e^{-t} \Big| \le d(U).$$

Conversely, the difference between the two distributions (starting inside $U$ and everywhere) can be bounded in terms of the distance $\tilde c(U) := \sup_{t \ge 0} |\mu_U(\tau_U > t/\mu(U)) - e^{-t}|$, precisely:

$$c(U) \le 2\mu(U) + \tilde c(U)\big( 2 + \log \tilde c(U)^{-1} \big).$$

Remark 2.2. Whenever $\mu(U) > 0$ the return time’s law is discrete and this allows us to get a lower bound for the rate of convergence. More precisely, we have the following proposition:

Proposition 2.3. For each $k \ge 0$,

$$\varepsilon_{k,U} := \big| \mu(\tau_U > k) - e^{-k\mu(U)} \big| + \big| \mu(\tau_U > k + 1/2) - e^{-(k+1/2)\mu(U)} \big| \ge \frac{e^{-k\mu(U)}}{4}\,\mu(U).$$

In particular, $\varepsilon_{0,U} \ge \mu(U)/4$.

Proof of Proposition 2.3. Let $k \ge 0$ be an integer. Since $\tau_U$ takes only integer values, the distribution for $t = k\mu(U)$ and $t' = (k + 1/2)\mu(U)$ is the same, then

$$\varepsilon_{k,U} \ge \big| \exp(-k\mu(U)) - \exp(-(k + 1/2)\mu(U)) \big| \ge \exp(-k\mu(U))\big( 1 - e^{-\mu(U)/2} \big) \ge \frac{e^{-k\mu(U)}}{4}\,\mu(U). \qquad \square$$
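The last step of the proof of Proposition 2.3 rests on the elementary inequality $1 - e^{-x/2} \ge x/4$ for $0 < x \le 1$, applied with $x = \mu(U)$. The short Python sketch below checks the resulting bound numerically on a grid of values of $\mu(U)$ and $k$; it is a numerical illustration only, not a proof, and the function names and grid sizes are hypothetical choices.

```python
import math


def gap(k, mu):
    """exp(-k mu) - exp(-(k + 1/2) mu): the quantity bounded below in the proof."""
    return math.exp(-k * mu) - math.exp(-(k + 0.5) * mu)


def lower_bound(k, mu):
    """The claimed lower bound exp(-k mu) * mu / 4."""
    return math.exp(-k * mu) * mu / 4


def bound_holds(k_max=50, steps=100):
    """Check gap >= lower_bound for mu = 1/steps, ..., 1 and k = 0, ..., k_max."""
    return all(gap(k, i / steps) >= lower_bound(k, i / steps)
               for k in range(k_max + 1)
               for i in range(1, steps + 1))
```

The check passes with room to spare: for small $\mu(U)$ the gap behaves like $e^{-k\mu(U)}\mu(U)/2$, twice the stated bound.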
Proof of Theorem 2.1. Let us remark that for any $k \ge 1$ we have

$$\mu(\tau_U = k) = \mu(U \cap \{\tau_U > k - 1\}). \tag{3}$$

Since $\{\tau_U > k\} = T^{-1}(U^c \cap \{\tau_U > k - 1\})$, by the invariance of $\mu$ we get that $\mu(\tau_U > k) = \mu(\tau_U > k - 1) - \mu(U \cap \{\tau_U > k - 1\})$, whence the result. Next, for all $k > 0$ we have

$$\mu(\tau_U > k) = \mu(\tau_U > k - 1) - \mu(U)\,\mu_U(\tau_U > k - 1) = \mu(\tau_U > k - 1) - \mu(U)\big[ \mu(\tau_U > k - 1) + c(k, U) \big] = \mu(\tau_U > k - 1)\big[ 1 - \mu(U) \big] - \mu(U)\,c(k, U).$$

Then it follows by an immediate induction that

$$\mu(\tau_U > k) = (1 - \mu(U))^k - \mu(U) \sum_{j=1}^{k} c(j, U)\,(1 - \mu(U))^{k-j}.$$

Hence for all $t \ge 0$, putting $k_t = [t/\mu(U)]$, we have

$$\big| \mu(\tau_U > k_t) - (1 - \mu(U))^{k_t} \big| \le \mu(U) \sum_{j=1}^{k_t} |c(j, U)| \le t\,c(U). \tag{4}$$

Setting $z = -\log c(U)$, and $k_z = [z/\mu(U)]$, we get $(1 - \mu(U))^{k_z} \le e^{-k_z \mu(U)} \le c(U)\,e^{\mu(U)} \le c(U) + 2\mu(U)$; for any $t > z$,

$$\mu(\tau_U > k_t) \le \mu(\tau_U > k_z) \le (1 - \mu(U))^{k_z} + z\,c(U) \le 2\mu(U) + c(U)(1 - \log c(U)),$$

which gives

$$\big| \mu(\tau_U > k_t) - (1 - \mu(U))^{k_t} \big| \le 2\mu(U) + c(U)(1 - \log c(U)).$$

For any $t \le z$, instead, the same estimate holds by inequality (4). Since, by an easy computation, $|(1 - \mu(U))^{k_t} - e^{-t}| \le 2\mu(U)$, we get for any $t \ge 0$,

$$\big| \mu(\tau_U > k_t) - e^{-t} \big| \le 4\mu(U) + c(U)(1 - \log c(U)),$$

which proves the first part of the theorem. Moreover, since $|\mu_U(\tau_U > k_t) - \mu(\tau_U > k_t)| = |c(k_t, U)| \le c(U)$, we finally have for each $t \ge 0$,

$$\big| \mu_U(\tau_U > k_t) - e^{-t} \big| \le 4\mu(U) + c(U)(2 - \log c(U)).$$
The converse part is proven in the same way. For $k \ge 1$,

$$\mu(\tau_U > k) = 1 - \mu(\tau_U \le k) = 1 - \sum_{j=1}^{k} \mu(\tau_U = j) = 1 - \mu(U) \sum_{j=1}^{k} \mu_U(\tau_U > j - 1),$$

where we used in the last equality the relation (3). Hence

$$\big| \mu(\tau_U > k) - e^{-k\mu(U)} \big| \le \Big| 1 - \mu(U) \sum_{j=1}^{k} e^{-(j-1)\mu(U)} - e^{-k\mu(U)} \Big| + k\mu(U)\,\tilde c(U)$$
$$\le \Big| 1 - \mu(U)\,\frac{1 - e^{-k\mu(U)}}{1 - e^{-\mu(U)}} - e^{-k\mu(U)} \Big| + k\mu(U)\,\tilde c(U)$$
$$\le \big( 1 + e^{-k\mu(U)} \big) \Big| 1 - \frac{\mu(U)}{1 - e^{-\mu(U)}} \Big| + k\mu(U)\,\tilde c(U)$$
$$\le 2\mu(U) + k\mu(U)\,\tilde c(U).$$

This gives, whenever $k \le k_0 := \log \tilde c(U)^{-1} / \mu(U)$:

$$|c(k, U)| \le 2\mu(U) + \tilde c(U) \log \tilde c(U)^{-1}.$$

For $k > k_0$ we simply have

$$|c(k, U)| \le \mu(\tau_U > k_0) + \mu_U(\tau_U > k_0) \le 2\mu(U) + \tilde c(U) \log \tilde c(U)^{-1} + e^{-k_0 \mu(U)} + \tilde c(U). \qquad \square$$
The last theorem gives a necessary and sufficient condition to obtain the exponential law, that is $d(U) \to 0$. However, such a quantity is not very transparent for dynamical systems, which is why we give a criterion to estimate it. This kind of condition is a generalization of the so-called “self-mixing condition” introduced in [Hir95].

Lemma 2.4. Let $U \subset X$ be a measurable set. The following estimate holds:

$$c(U) \le \inf\{\, a_N(U) + b_N(U) + N\mu(U) \;|\; N \in \mathbb{N} \,\},$$

where the quantities are defined by

$$a_N(U) = \mu_U\Big( \bigcup_{j=1}^{N} T^{-j} U \Big) = \mu_U(\tau_U \le N), \qquad b_N(U) = \sup_{V \in \mathcal{U}_\infty} \big| \mu_U(T^{-N} V) - \mu(V) \big|,$$

with $\mathcal{U} = \{U, U^c\}$, $\mathcal{U}_n = \bigvee_{k=0}^{n-1} T^{-k}\mathcal{U}$ and $\mathcal{U}_\infty = \cup_n \sigma(\mathcal{U}_n)$.
Proof. Let $N \in \mathbb{N}$. If $k < N$, we just bound $c(k, U)$ by

$$|\mu_U(\tau_U > k) - \mu(\tau_U > k)| = |\mu_U(\tau_U \le k) - \mu(\tau_U \le k)| \le |\mu_U(\tau_U \le k)| + |\mu(\tau_U \le k)| \le a_N(U) + k\mu(U) \le a_N(U) + N\mu(U).$$

Otherwise, let us remark that $\{\tau_U > k\}$ and $\{\tau_U \circ T^N > k - N\}$ differ only on $\{\tau_U \le N\}$, and by hypothesis

$$|\mu_U(\tau_U > k) - \mu_U(\tau_U \circ T^N > k - N)| \le \mu_U(\tau_U \le N) = a_N(U).$$

Moreover

$$|\mu_U(\tau_U \circ T^N > k - N) - \mu(\tau_U > k - N)| = |\mu_U(T^{-N}(\tau_U > k - N)) - \mu(\tau_U > k - N)| \le b_N(U).$$

But $\{\tau_U > k - N\}$ and $\{\tau_U > k\}$ differ only on $\{\tau_U \circ T^{k-N} \le N\}$, hence

$$|\mu(\tau_U > k - N) - \mu(\tau_U > k)| \le \mu(\tau_U \circ T^{k-N} \le N) = \mu(\tau_U \le N) \le N\mu(U).$$

We finally get for each $k, N \in \mathbb{N}$,

$$|\mu_U(\tau_U > k) - \mu(\tau_U > k)| \le a_N(U) + b_N(U) + N\mu(U),$$

which concludes the proof, since $N$ is arbitrary. $\square$

We remark that $b_N(U)$ is bounded by $\alpha(N)$ if the partition $\mathcal{U} = \{U, U^c\}$ is $\alpha$-mixing, and by $\gamma(N)$ if it is uniformly mixing (see Definition 2.1 below). To simplify, we could say that the exponential law holds when there exists some $N$ so small that only few points of $U$ come back in $U$ before $N$ steps, but large enough such that $T^N U$ is uniformly spread out.

Definition 2.1 (Speed of mixing). Let $(X, \mathcal{B}, T, \mu)$ be a dynamical system and $\xi$ a finite or countable measurable partition of $X$. We set $\xi_k = \bigvee_{j=0}^{k-1} T^{-j}\xi$ and $\sigma(\xi_k)$ the $\sigma$-algebra generated by $\xi_k$.

1. Uniform mixing. The partition $\xi$ is uniformly mixing with speed $\gamma(n)$ going to zero for $n$ going to infinity if for any $n$,

$$\gamma(n) = \sup_{k,l} \; \sup_{\substack{R \in \sigma(\xi_k) \\ S \in T^{-(n+k)}\sigma(\xi_l)}} |\mu(R \cap S) - \mu(R)\mu(S)|.$$

2. $\alpha$-mixing. The partition $\xi$ is $\alpha$-mixing with speed $\alpha(n)$ going to zero for $n$ going to infinity if for any $n$,

$$\alpha(n) = \sup_{k,l} \; \sup_{\substack{R \in \xi_k \\ S \in T^{-(n+k)}\sigma(\xi_l)}} \Big| \frac{\mu(R \cap S)}{\mu(R)} - \mu(S) \Big|.$$
3. ϕ-mixing. The partition $\xi$ is ϕ-mixing with speed $\varphi(n)$ going to zero for $n$ going to infinity if for any $n$,
$$\varphi(n) = \sup_{k,l}\ \sup_{\substack{R \in \sigma(\xi_k) \\ S \in T^{-(n+k)}\xi_l}} \left| \frac{\mu(R \cap S)}{\mu(R)\mu(S)} - 1 \right|.$$

4. Weak-Bernoulli. The partition $\xi$ is weak-Bernoulli with speed $\beta(n)$ going to zero when $n$ goes to infinity, if for any $n$,
$$\beta(n) = \sup_{k,l} \sum_{\substack{R \in \xi_k \\ S \in T^{-(n+k)}\xi_l}} |\mu(R \cap S) - \mu(R)\mu(S)|.$$
Remark 2.5. We state some general implications and results verified by the preceding types of mixing.
1. ϕ-mixing implies α-mixing, which implies uniform mixing. For any $n$, $\gamma(n) \le \alpha(n) \le \varphi(n)$.
2. ϕ-mixing implies weak-Bernoulli, which implies uniform mixing. For any $n$, $\gamma(n) \le \beta(n) \le \varphi(n)$.
3. If $\xi$ is a generating partition of a uniformly mixing dynamical system, then the system is mixing.
4. If $\xi$ is a generating weak-Bernoulli partition, then the system is metrically conjugate to a Bernoulli shift.

2.2. Successive return times. We will now investigate the properties of successive return times to the set $U$. For this purpose, let us define the $k$-th return time to $U$ by
$$\tau_U^{(k)}(x) = \begin{cases} 0 & \text{if } k = 0, \\ \tau_U^{(k-1)}(x) + \tau_U\big(T^{\tau_U^{(k-1)}(x)}(x)\big) & \text{if } k \ge 1. \end{cases}$$
Observe that the difference between two consecutive return times follows the same law as the first, for the simple reason that
$$\tau_U^{(K+1)} - \tau_U^{(K)} = \tau_U \circ T^{\tau_U^{(K)}}$$
and the measure $\mu_U$ is invariant with respect to the induced map on $U$.

Theorem 2.6. Let $U \subset X$ be a measurable set, and $\mathcal{U} = \{U, U^c\}$ the partition associated to it. Given an integer $K$ and a rectangle $Q_K$ in $\mathbb{R}^K$, the differences between successive normalized return times to $U$ are independent and exponentially distributed up to $f(K, U)$ (see (5) below), where $f(K, U)$ is defined depending on the type of mixing by

(α) When $(X, T, \mu)$ is α-mixing for $\mathcal{U}$, with speed $\alpha$ (see footnote 1), then
$$f(K, U) = K\Big[ 3d(U) + \inf_{M \in \mathbb{N}} \{\alpha(M) + 3M\mu(U)\} \Big].$$
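1 We need this mixing property only for some special sets; more precisely, we are interested in
$$\alpha'(N) = \sup \left| \frac{\mu(R \cap S)}{\mu(R)} - \mu(S) \right|,$$
where the supremum is taken over $j \in \mathbb{N}$, $R \in \mathcal{U}_j$ with $T^j R \subset U$, and $S \in T^{-j-N}\mathcal{U}_\infty$.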
(γ) When the partition $\mathcal{U}$ is uniformly mixed by $(X, T, \mu)$ with speed $\gamma$, then
$$f(K, U) = K\Bigg[ 4d(U) + \inf_{\substack{M \in \mathbb{N} \\ \gamma(M) < \mu(U)^2}} \left\{ \frac{\gamma(M)}{\mu(U)^2}\left(2 - K \log\frac{\gamma(M)}{\mu(U)^2}\right) + 3M\mu(U) \right\} \Bigg].$$

Indeed the following inequality holds:
$$\left| \mu_U\Big( \big(\tau_U^{(1)},\ \tau_U^{(2)} - \tau_U^{(1)},\ \dots,\ \tau_U^{(K)} - \tau_U^{(K-1)}\big) \in \frac{1}{\mu(U)} Q_K \Big) - \int_{Q_K} \prod_{i=1}^{K} e^{-s_i}\, ds^K \right| \le f(K, U). \qquad (5)$$

Remark 2.7. Note that the mixing assumption is made only for the special partition $\mathcal{U}$. If the system has a partition $\mathcal{Z}$ (not necessarily with two elements) which is uniformly mixing with speed $\gamma_{\mathcal{Z}}$, then for any cylinder $U \in \mathcal{Z}_n$ of order $n$, the partition $\mathcal{U} = \{U, U^c\}$ is still uniformly mixing, with speed $\gamma_{\mathcal{U}}(M) \le \gamma_{\mathcal{Z}}(M - n)$.

The proof of the theorem is inspired by [CG93], with the following differences: 1) $U$ is any measurable set; 2) we keep track of the approximations to get an estimate of the error; 3) we still get an estimate even if the system is only uniformly mixing; however, it is of interest only when $\gamma(M) = o(1/M^2)$.

Proof of Theorem 2.6. Let us remark first that if we denote by $F = T^{\tau_U}$ the induced map on $U$, then for each $k \in \mathbb{N}$,
$$\tau_U^{(k+1)} - \tau_U^{(k)} = \tau_U \circ F^k.$$
We set $\tau_k = (\tau_U, \tau_U \circ F, \dots, \tau_U \circ F^{k-1})$. We will show that the inequality (5) holds by induction on $K$. For $K = 1$, we apply Theorem 2.1 which gives, setting $Q_1 = [u, v]$,
$$\left| \mu_U\Big(\tau_U \in \frac{[u, v]}{\mu(U)}\Big) - \int_u^v e^{-s}\, ds \right| = \left| \mu_U\Big(\tau_U > \frac{u}{\mu(U)}\Big) - \mu_U\Big(\tau_U > \frac{v}{\mu(U)}\Big) - (e^{-u} - e^{-v}) \right| \le 2d(U).$$
Let us suppose that the inequality (5) is true for $K$; we want to prove that it is also true for $K + 1$. Let $[r, s]$ be the projection of $Q_{K+1}$ onto the last coordinate, and for $k = K, K + 1$ denote
$$D_k = U \cap \tau_k^{-1}\Big(\frac{1}{\mu(U)} Q_k\Big).$$
For any $M \in \mathbb{N}$, the set defined by
$$E_{K+1}(M) = D_K \cap \Big\{ x \in U \ \Big|\ \tau_U \circ T^M \circ F^K(x) \in \frac{[r, s]}{\mu(U)} - M \Big\}$$
verifies the inclusions
$$E_{K+1}(M) \cap \{\tau_U \circ F^K > M\} \subset D_{K+1} \subset E_{K+1}(M) \cup \{\tau_U \circ F^K \le M\}.$$
Theorem 2.1 shows that the two sets which bound $D_{K+1}$ do not differ too much, namely,
$$\mu_U(\tau_U \circ F^K \le M) = \mu_U(\tau_U \le M) \le 1 - e^{-M\mu(U)} + d(U) \le M\mu(U) + d(U).$$
Therefore we get the first bound
$$|\mu_U(D_{K+1}) - \mu_U(E_{K+1}(M))| \le M\mu(U) + d(U). \qquad (6)$$
So the problem reduces to proving that $\mu_U(E_{K+1}(M))$ follows the expected law. We decompose the sets $E_{K+1}(M)$ over $A_K^j = U \cap \{\tau_U^{(K)} = j\}$. We have
$$E_{K+1}(M) \cap A_K^j = D_K \cap A_K^j \cap T^{-(M+j)}\Big\{ \tau_U \in \frac{[r, s]}{\mu(U)} - M \Big\}.$$
We can now use the mixing with $R = D_K \cap A_K^j \in \sigma(\mathcal{U}_j)$ and $S = T^{-(M+j)}\{\tau_U \in [r, s]/\mu(U) - M\}$. According to the type of mixing, we get two approximations:

(α) When the partition $\mathcal{U}$ is α-mixing:
$$\Big| \mu_U(E_{K+1}(M) \cap A_K^j) - \mu_U(D_K \cap A_K^j)\, \mu\Big(\tau_U \in \frac{[r, s]}{\mu(U)} - M\Big) \Big| \le \alpha(M)\, \mu_U(D_K \cap A_K^j).$$
Summing over the possible values of $j$ we get:
$$\Big| \mu_U(E_{K+1}(M)) - \mu_U(D_K)\, \mu\Big(\tau_U \in \frac{[r, s]}{\mu(U)} - M\Big) \Big| \le \alpha(M)\, \mu_U(D_K) \le \alpha(M). \qquad (7)$$
Now Theorem 2.1 gives
$$\Big| \mu\Big(\tau_U \in \frac{[r, s]}{\mu(U)} - M\Big) - (e^{-r} - e^{-s}) \Big| \le \Big| \mu\Big(\tau_U \in \frac{[r, s]}{\mu(U)}\Big) - (e^{-r} - e^{-s}) \Big| + 2M\mu(U) \le 2(M\mu(U) + d(U)).$$
We briefly recall the approximations done, with their respective errors:
$$\mu_U(D_{K+1}) \ \xrightarrow{\ M\mu(U) + d(U)\ }\ \mu_U(E_{K+1}(M)) \ \xrightarrow{\ \alpha(M)\ }\ \mu_U(D_K)\, \mu\Big\{\tau_U \in \frac{[r, s]}{\mu(U)}\Big\} \ \xrightarrow{\ 2(M\mu(U) + d(U))\ }\ \mu_U(D_K)(e^{-r} - e^{-s}).$$
This allows us to show that the difference
$$\left| \mu_U(D_{K+1}) - \int_{Q_{K+1}} \prod_{i=1}^{K+1} e^{-s_i}\, ds^{K+1} \right| \qquad (8)$$
is bounded by the quantity $f(K, U) + 3M\mu(U) + \alpha(M) + 3d(U) \le f(K + 1, U)$, which proves the induction and concludes the proof of this first case.

(γ) We now consider the case when $\mathcal{U}$ is uniformly mixing:
Let $M$ be such that $\gamma(M) < \mu(U)^2$. As a first step, we can restrict ourselves to the case when $Q_K \subset [0, z]^K$, with $z = -\log\frac{\gamma(M)}{\mu(U)^2} > 0$. In fact,
$$Q_K \setminus [0, z]^K \subset \bigcup_{k=1}^{K} \mathbb{R}_+^{k-1} \times\, ]z, \infty] \times \mathbb{R}_+^{K-k},$$
which implies, using Theorem 2.1,
$$\mu_U\big(\mu(U)\tau_K \in Q_K \setminus [0, z]^K\big) \le \sum_{k=1}^{K} \mu_U\big(\tau_U^{(k+1)} - \tau_U^{(k)} > z/\mu(U)\big) = K\mu_U(\tau_U > z/\mu(U)) \le K(e^{-z} + d(U)).$$
Moreover
$$\int_{Q_K \setminus [0, z]^K} \prod_{i=1}^{K} e^{-s_i}\, ds^K \le \sum_{k=1}^{K} \int_{\mathbb{R}_+^{k-1} \times ]z, \infty] \times \mathbb{R}_+^{K-k}} \prod_{i=1}^{K} e^{-s_i}\, ds^K \le K e^{-z}.$$
Next, by decomposing according to
$$\mu_U(\mu(U)\tau_K \in Q_K) = \mu_U(\mu(U)\tau_K \in Q_K \cap [0, z]^K) + \mu_U(\mu(U)\tau_K \in Q_K \setminus [0, z]^K),$$
we get $f(K, U) \le K(2e^{-z} + d(U)) + f'(K, U)$, where $f'(K, U)$ is the maximum of the difference (5) for the boxes $Q_K \subset [0, z]^K$. We then estimate $f'(K, U)$. First, by uniform mixing we get
$$\big| \mu_U(E_{K+1}(M) \cap A_K^j) - \mu_U(D_K \cap A_K^j)\, \mu(\tau_U \in [r, s]/\mu(U) - M) \big| \le \frac{\gamma(M)}{\mu(U)}$$
and then we sum over all possible$^2$ values $j$ of $\tau^{(K)}$,
$$\big| \mu_U(E_{K+1}(M)) - \mu_U(D_K)\, \mu(\tau_U \in [r, s]/\mu(U) - M) \big| \le \frac{Kz\gamma(M)}{\mu(U)^2}.$$
The same computation performed after estimate (7) (where now $\alpha(M)$ is replaced by $Kz\gamma(M)/\mu(U)^2$ in inequality (7)) gives the bound
$$f'(K + 1, U) \le f'(K, U) + K\frac{z\gamma(M)}{\mu(U)^2} + 3(d(U) + M\mu(U)).$$
Then for each $M$,
$$f'(K, U) \le K^2 \frac{z\gamma(M)}{\mu(U)^2} + 3K(d(U) + M\mu(U)).$$
Since $M$ is arbitrary, our choice of $z$ implies that the inequality (5) is verified with
$$f(K, U) = K\Bigg[ 4d(U) + \inf_{\substack{M \in \mathbb{N} \\ \gamma(M) < \mu(U)^2}} \left\{ \frac{\gamma(M)}{\mu(U)^2}\left(2 - K \log\frac{\gamma(M)}{\mu(U)^2}\right) + 3M\mu(U) \right\} \Bigg]. \qquad □$$
2 Since $Q_K \subset [0, z]^K$, the $K$-th return time is less than or equal to $Kz$, hence it takes at most $[Kz]$ different values.
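As an illustration of the successive return times $\tau_U^{(k)}$ defined in Sect. 2.2, the following minimal sketch computes them from a finite record of visits of an orbit to $U$. The representation of the orbit as a boolean list and the function names `successive_return_times` and `inter_return_gaps` are assumptions made for this example; they do not appear in the paper.

```python
from typing import List, Sequence


def successive_return_times(visits: Sequence[bool]) -> List[int]:
    """Return [tau^(1), tau^(2), ...] for an orbit that starts in U.

    visits[n] is True exactly when T^n(x) lies in U (hypothetical encoding);
    visits[0] must be True because the return times are defined for x in U.
    """
    if not visits or not visits[0]:
        raise ValueError("the orbit must start inside U")
    # tau^(k) is the time of the k-th visit to U after time 0.
    return [n for n in range(1, len(visits)) if visits[n]]


def inter_return_gaps(times: Sequence[int]) -> List[int]:
    """Differences tau^(k) - tau^(k-1), with the convention tau^(0) = 0."""
    previous = [0] + list(times)
    return [t - s for s, t in zip(previous, times)]


if __name__ == "__main__":
    orbit = [True, False, False, True, True, False, True]
    times = successive_return_times(orbit)
    print(times)                     # successive return times
    print(inter_return_gaps(times))  # gaps between consecutive returns
```

The gaps returned by `inter_return_gaps` are the quantities $\tau_U \circ F^k$ whose joint law, once multiplied by $\mu(U)$, Theorem 2.6 compares with independent exponential variables.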
We are now ready to give the most important result of this section, namely the Poisson statistics for successive return times. Let $N(t)$ be the number of visits to $U$ up to the normalized time $t/\mu(U)$,
$$N(t) = \sup\big\{ K > 0 \ \big|\ \tau_U^{(K)} \le t/\mu(U) \big\}.$$
It turns out that $N(t)$ is a discrete random variable whose law is close to a Poissonian one; more precisely we have

Theorem 2.8. The distribution of the number of visits $N(t)$ differs from the Poissonian law by
$$\left| \mu_U(N(t) = K) - \frac{t^K}{K!} e^{-t} \right| \le g(t, K, U) + g(t, K + 1, U),$$
where for each $k \ge 0$, $g(t, k, U) = \big(12t^k/k + k^{k-1}\big) \sqrt[k]{f(k, U)}$.

Proof. It is a consequence of the weak dependence of the differences of successive return times established by Theorem 2.6. We first remark that
$$\mu_U(N(t) = K) = \mu_U\Big( \Big\{\tau_U^{(K)} \le \frac{t}{\mu(U)}\Big\} \cap \Big\{\tau_U^{(K+1)} > \frac{t}{\mu(U)}\Big\} \Big) = \mu_U\big(\tau^{(K)} \le t/\mu(U)\big) - \mu_U\big(\tau^{(K+1)} \le t/\mu(U)\big).$$
It is then sufficient to compute the measure of points whose $k$-th rescaled return time is smaller than $t$, for $k = K, K + 1$. If we denote by $\tilde{P}_k(t)$ the distribution of the sum of the differences of successive return times, we know that when the latter are i.i.d. random variables with the same exponential law, then setting
$$L_k(t) = \big\{ (s_1, \dots, s_k) \in \mathbb{R}_+^k \ \big|\ s_1 + \dots + s_k \le t \big\}$$
we get
$$\tilde{P}_k(t) = P_k(t) := \int_{L_k(t)} \prod_{i=1}^{k} e^{-s_i}\, ds_i,$$
which gives the classical result
$$P_K(t) - P_{K+1}(t) = \frac{t^K}{K!} e^{-t}.$$
The difficulty now comes from the fact that we have to transfer Theorem 2.6, which is stated for boxes, to the simplex $L_k(t)$. Let us suppose that $f(k, U) < 1$, otherwise there is nothing to prove. Hence the integer defined by $N = \big[k/\sqrt[k]{f(k, U)}\big]$ is bigger than $k$. We consider the uniform partition of $[0, t]^k$ by cubes of size $t/N$. Let $\Delta_k$ be the union of those cubes $Q_k$ included in the interior of $L_k(t)$, i.e. for which $\sum_{i=1}^{k} s_i < t$ for any $(s_1, \dots, s_k) \in Q_k$, and $\Sigma_k$ the union of those which intersect the boundary, i.e. the union of those cubes such that there exists
$(s_1, \dots, s_k) \in Q_k$ with $\sum_{i=1}^{k} s_i = t$.

Fig. 1. Partition of the cube $[0, t]^k$ for $k = 2$. $\Sigma_k$ is the union of dotted squares and $\Delta_k$ the union of shaded rectangles $R_k(Q_k)$.

By using the notation $\tau_k$ introduced in the proof of Theorem 2.6 we have
$$\delta := \left| \mu_U\big(\tau_U^{(k)} \le t/\mu(U)\big) - \int_{L_k(t)} \prod_{i=1}^{k} e^{-s_i}\, ds^k \right|$$
$$\le \left| \mu_U\Big(\tau_k \in \frac{\Delta_k}{\mu(U)}\Big) - \int_{\Delta_k} \prod_{i=1}^{k} e^{-s_i}\, ds^k \right| + \mu_U\Big(\tau_k \in \frac{\Sigma_k}{\mu(U)}\Big) + \int_{\Sigma_k} \prod_{i=1}^{k} e^{-s_i}\, ds^k$$
$$\le \delta_1 + \delta_2 + \delta_3.$$
To estimate $\delta_1$, we write $\Pi$ for the projection onto the last $k - 1$ coordinates; then the sets $R_k(Q_k) = \{Q'_k \in \Delta_k \mid \Pi(Q'_k) = \Pi(Q_k)\}$ are boxes, and their number is bounded by $N^{k-1}$ (see Fig. 1). For each of these boxes Theorem 2.6 gives an error smaller than $f(k, U)$, and then we get $\delta_1 \le N^{k-1} f(k, U)$.

To compute $\delta_2$ and $\delta_3$, we first remark that a straightforward combinatorial computation gives, for the number $C_N^k$ of cubes inside $\Sigma_k$, $C_N^k \le 6N^{k-1}$ (see [Sau98b]). But for each cube $Q_k \subset \Sigma_k$ Theorem 2.6 gives
$$\mu_U(\tau_k \in Q_k) \le \int_{Q_k} \prod_{i=1}^{k} e^{-s_i}\, ds^k + f(k, U).$$
Summing over all the cubes contained in $\Sigma_k$ one has $\delta_2 \le 6N^{k-1} f(k, U) + \delta_3$. Moreover the integral $\int_{Q_k} \prod_{i=1}^{k} e^{-s_i}\, ds^k$ is bounded by the volume of $Q_k$, equal to $(t/N)^k$, which gives $\delta_3 \le 6N^{k-1} t^k/N^k$. We then deduce that
$$\delta \le \delta_1 + \delta_2 + \delta_3 \le N^{k-1} f(k, U) + 12t^k/N,$$
which implies $\delta \le \big(12t^k/k + k^{k-1}\big) \sqrt[k]{f(k, U)}$ by the previous choice of $N$. □
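The classical identity $P_K(t) - P_{K+1}(t) = t^K e^{-t}/K!$ used in the proof above can be checked numerically. The sketch below relies on the standard closed form of the distribution function of a sum of $k$ independent exponential variables of parameter one, $P_k(t) = 1 - e^{-t}\sum_{i<k} t^i/i!$; this closed form is a well-known fact that is not stated in the paper, and the function names `erlang_cdf` and `poisson_pmf` are chosen here for illustration only.

```python
import math


def erlang_cdf(k: int, t: float) -> float:
    """P_k(t): probability that a sum of k i.i.d. Exp(1) variables is <= t."""
    if k == 0:
        return 1.0  # the empty sum is 0, which is <= t for every t >= 0
    partial = sum(t**i / math.factorial(i) for i in range(k))
    return 1.0 - math.exp(-t) * partial


def poisson_pmf(k: int, t: float) -> float:
    """Poisson weight t^k e^{-t} / k!."""
    return t**k * math.exp(-t) / math.factorial(k)


if __name__ == "__main__":
    for K in range(0, 6):
        for t in (0.1, 1.0, 2.5, 7.0):
            lhs = erlang_cdf(K, t) - erlang_cdf(K + 1, t)
            # The two columns agree up to rounding error.
            print(K, t, lhs, poisson_pmf(K, t))
```

In the dynamical setting the exact equality is replaced by the error term $g(t, K, U) + g(t, K + 1, U)$ of Theorem 2.8, since the differences of return times are only approximately independent and exponential.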
3. Applications

In the preceding section we gave general estimates for the error between the distribution of the number of visits to a set $U$ and the Poissonian law. We could wonder whether this law is attained in the limit $\mu(U) \to 0$. Put in this way the question is not very clear. What we need instead is to fix a sequence of neighborhoods $U_\varepsilon(z)$ whose measure shrinks to zero and ask whether the Poisson law holds in the limit $\varepsilon \to 0$. This approach was successfully carried out by several authors, as recalled in the introduction. Although their results were applied to dynamical systems, the inspiration and some of the techniques of the proofs were of a probabilistic nature (theory of moments, Laplace transform). Here we follow a purely dynamical direction, trying to extract all the statistical information from the ergodic properties of the system. In this way we are able, for example, to exhibit the Poissonian statistics for a large class of non-uniformly hyperbolic maps of the interval, which have been widely studied in recent years, especially to determine the rate of decay of correlations and the central limit theorem. Some statistical properties of these maps have been studied in the paper [LSV97] (which contains a quite complete bibliography on the subject), where an absolutely continuous invariant probability measure (acim) is first constructed, and then it is shown that it enjoys a polynomial decay of correlations. One feature of these maps is that they are characterized by a structure parameter (the order of tangency at an indifferent fixed point), which governs the statistical properties and can be viewed as an indicator of the "weak" hyperbolicity of the map. Actually, it turns out that this parameter appears even in the approximation to the Poissonian law. Let us then consider for $0 < \alpha < 1$ the following map of the unit interval:
$$T(x) = \begin{cases} x(1 + 2^\alpha x^\alpha) & \forall x \in [0, 1/2), \\ 2x - 1 & \forall x \in [1/2, 1]. \end{cases}$$
We recall some properties and results which we will need in the following, and we refer the reader to the quoted paper for more information and proofs. This map has a finite Markov partition (with two elements), but for our purposes it is more convenient to work with the countable one $\xi$ generated by the left preimages $a_n$ of 1, $\xi = \{A_m \mid m \in \mathbb{N}\}$ with $A_n = ]a_{n+1}, a_n]$. We will often use in the following the easy bound $\frac{a_n}{a_{n+1}} \le 2$. We can associate to each point $z \in X = ]0, 1]$ a unique infinite sequence $\omega = \omega_1\omega_2\ldots$ with the property that $T^{m-1}z \in A_{\omega_m}$ for all integers $m \ge 1$. We denote by $\xi_m$ the dynamical partition $\xi \vee T^{-1}\xi \vee \cdots \vee T^{-m+1}\xi$ and call its elements $m$-cylinders. We denote by $\xi_m(z) \in \xi_m$ the $m$-cylinder which contains $z$. The sequence $\omega$ satisfies the admissibility condition: $\omega_m\omega_{m+1}$ appears in $\omega$ if and only if $\omega_m = 0$ or $\omega_{m+1} = \omega_m - 1$. We say that a non-empty cylinder $C = [\omega_1 \ldots \omega_k] \in \xi_k$ is maximal if it maps onto $X$ after exactly $k$ iterations, which is easily seen to be equivalent to $\omega_k = 0$.

3.1. Some mixing properties. We begin with a brief survey of some results proved by two of us (B.S., S.V.) in the joint paper [LSV97] with Carlangelo Liverani. We showed that the density $h$ of the acim belongs to a certain cone of functions $\mathcal{C}_*(a)$, which will be characterized later (see Lemma 3.2), provided $a$ is big enough, and satisfies$^3$:

3 We recall the formal definition of the Perron-Frobenius operator $P$ acting on functions $f : [0, 1] \to \mathbb{R}$: $Pf(x) = \sum_{Ty = x} \frac{1}{D_y T} f(y)$. One easily checks that $\mu$ is an acim iff $h = \frac{d\mu}{dx}$ is a fixed point of $P$ on $L^1(dx)$.
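To make the map $T$ and the points $a_n$ concrete, here is a minimal numerical sketch. It assumes the convention $a_0 = 1$ and $a_{n+1} = $ the preimage of $a_n$ under the left branch (so that $a_1 = 1/2$ and $A_0 = ]1/2, 1]$), which is our reading of "the left preimages $a_n$ of 1"; the function names and the bisection solver are choices made for this example, not part of the paper.

```python
from typing import List


def lsv_map(x: float, alpha: float) -> float:
    """The interval map T: x(1 + 2^alpha x^alpha) on [0, 1/2), 2x - 1 on [1/2, 1]."""
    if x < 0.5:
        return x * (1.0 + (2.0 * x) ** alpha)
    return 2.0 * x - 1.0


def left_preimages(alpha: float, n_max: int) -> List[float]:
    """Return [a_0, ..., a_{n_max}] with a_0 = 1 (assumed convention).

    Each a_{n+1} solves y(1 + (2y)^alpha) = a_n on [0, 1/2]; the left branch
    is increasing, so plain bisection converges.
    """
    a = [1.0]
    for _ in range(n_max):
        target = a[-1]
        lo, hi = 0.0, 0.5
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if mid * (1.0 + (2.0 * mid) ** alpha) < target:
                lo = mid
            else:
                hi = mid
        a.append(hi)
    return a


if __name__ == "__main__":
    a = left_preimages(alpha=0.5, n_max=10)
    for n in range(len(a) - 1):
        # Ratio a_n / a_{n+1}; the text uses the bound a_n / a_{n+1} <= 2.
        print(n, a[n], a[n] / a[n + 1])
```

The printed ratios illustrate the easy bound $a_n/a_{n+1} \le 2$, which follows from $T(a_{n+1}) = a_{n+1}(1 + (2a_{n+1})^\alpha) \le 2a_{n+1}$ on the left branch.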
Lemma A (Lemma 2.2 in [LSV97]). The cone $\mathcal{C}_*(a)$ is left invariant by the Perron-Frobenius operator $P$, i.e. $P(\mathcal{C}_*(a)) \subset \mathcal{C}_*(a)$.

Lemma B (Lemma 2.3 in [LSV97]). The density $h$ belongs to the cone $\mathcal{C}_*(a)$, and verifies in particular, whenever $x \le y$,
$$\frac{h(x)}{h(y)} \le (y/x)^{\alpha+1}, \qquad (9)$$
$$h(x) \le a x^{-\alpha}. \qquad (10)$$

Proposition C (Distortion inequality, proof of Proposition 3.3 in [LSV97]). There exists some constant $\Delta$ such that for all $k$ and $x, y \in C \in \xi_k$,
$$\frac{D_x T^k}{D_y T^k} \le \Delta < \infty. \qquad (11)$$
We will suppose without loss of generality that $a \ge 4\Delta$.

Theorem D (Theorem 4.1 in [LSV97]). In the proof of this theorem it was shown in particular that for $f \in \mathcal{C}_*(a)$,
$$\| P^n f - \lambda(f) \|_{L^1(\lambda)} \le \Phi(n) \| f \|_{L^1(\lambda)} \qquad (12)$$
with $\Phi(n) = C n^{-\frac{1}{\alpha}+1} (\log n)^{\frac{1}{\alpha}} = O_L(n^{-\frac{1}{\alpha}+1})$, where we define $O_L(\varepsilon) = O(\varepsilon (\log \varepsilon^{-1})^r)$ in the limit $\varepsilon \to 0$, for some constant $r$.

We then need a few more results on the speed of mixing, which turn out to be useful for the statistics of return times and also to establish the weak-Bernoulli property of the map.

Lemma 3.1. For any $z \in X$, and for any $m$ such that $\xi_m(z)$ is maximal, the partition $\mathcal{U} = \{\xi_m(z), \xi_m(z)^c\}$ satisfies a property close to α-mixing, namely
$$\alpha'(N) = \sup_{j \in \mathbb{N}}\ \sup_{\substack{R \in \mathcal{U}_j \\ T^j R \subset U}}\ \sup_{S \in \mathcal{U}_\infty} \left| \frac{\mu(R \cap T^{-N-j}S)}{\mu(R)} - \mu(S) \right| = O_L\big((N - m)^{1 - \frac{1}{\alpha}}\big).$$
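Proof. Let $z$ be a point of $X$ and $m$ be an integer such that $\xi_m(z)$ is maximal. Let $\mathcal{U}$ be the partition given by $\xi_m(z)$ and its complement, and $\mathcal{U}_j$ the refinement of $\mathcal{U}$. For $R \in \mathcal{U}_j$ such that $T^j R \subset U$, we have $R \in \sigma(\xi_{m+j})$ and $R$ is a union of maximal cylinders $V^k_{m+j} \in \xi_{m+j}$; choose $V \in \xi_{m+j}$ one of these maximal cylinders. For any $S \in T^{-(N+j)}\mathcal{B}$ there exists a set $W \in \mathcal{B}$ such that $S = T^{-(N+j)}W$. We then have
$$(*) := \mu(V \cap S) - \mu(V)\mu(S) = \int \mathbf{1}_V\, \mathbf{1}_W \circ T^{N+j}\, h\, d\lambda - \int \mu(V)\, h\, \mathbf{1}_W\, d\lambda = \int P^{N+j}[h(\mathbf{1}_V - \mu(V))]\, \mathbf{1}_W\, d\lambda \le \big\| P^{N+j}[h(\mathbf{1}_V - \mu(V))] \big\|_{L^1(\lambda)}.$$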
By exploiting the fact that $V$ is maximal we continue the preceding bound as
$$(*) \le \big\| P^{N-m}[P^{j+m}(h\mathbf{1}_V) - \mu(V)] \big\|_{L^1(\lambda)} + \big\| P^{N-m}[\mu(V)h - \mu(V)] \big\|_{L^1(\lambda)} \le 4a\,\Phi(N - m)\,\mu(V),$$
with $\Phi$ given by inequality (12), provided $P^{m+j}(h\mathbf{1}_V) \in \mathcal{C}_*(a)$, which is the case by Lemma 3.2 below. We conclude the proof by summing over all the maximal cylinders of $R$. □

Lemma 3.2. For any maximal cylinder $V \in \xi_p$, $P^p(h\mathbf{1}_V) \in \mathcal{C}_*(a)$.

Proof. We first set $f := P^p(h\mathbf{1}_V)$ and $T_V^p : V \to X$ the restriction of $T^p$ to $V$. Since $T^p$ is injective over $V$ we can rewrite $f$ as
$$f(x) = h \circ T_V^{-p}(x)\, D_x T_V^{-p},$$
which in particular shows that $f$ is continuous. To prove that $f$ belongs to the cone of smooth functions $\mathcal{C}_*(a)$ we must verify the following four properties, which define the cone:
1. $f$ is continuous and positive, which is clear in our case.
2. $f$ is decreasing. Since $h \in \mathcal{C}_*(a)$, $h$ decreases. In addition, $T_V^{-p}$ is increasing and concave, therefore $h \circ T_V^{-p}$ and $DT_V^{-p}$ decrease.
3. $x \mapsto x^{\alpha+1} f(x)$ increases. Since $T_V^{-p} : X \to V$ is increasing, an equivalent statement is that $(T^p u)^{\alpha+1} h(u) \frac{1}{D_u T^p}$ is increasing with $u \in V$. Observing that $\left(\frac{T^p u}{u}\right)^{\alpha+1} \frac{1}{D_u T^p}$ increases over $V \in \xi_p$ (which is true for $p = 1$, and the general case is proved by induction), and that $u \mapsto u^{\alpha+1} h(u)$ increases, we obtain the result.
4. $f(x) \le a x^{-\alpha} \int f$. Since $f$ is continuous, there exists $v \in V$ such that
$$\int f = f(T^p v) = h(v) \frac{1}{D_v T^p}.$$
The distortion estimate (11) for $u \in V \in \xi_p$ gives
$$\frac{D_v T^p}{D_u T^p} \le \Delta.$$
Moreover since $h$ decreases, inequality (9) yields
$$\frac{h(u)}{h(v)} \le \frac{h(a_{\omega_1+1})}{h(a_{\omega_1})} \le \left(\frac{a_{\omega_1}}{a_{\omega_1+1}}\right)^{\alpha+1} \le 4.$$
As a consequence, we get for $u = T_V^{-p}x$,
$$f(x) = h(u) \frac{1}{D_u T^p} \le 4\Delta\, h(v) \frac{1}{D_v T^p} \le a x^{-\alpha} \int f,$$
because $x \le 1$ and $4\Delta \le a$. □

We finally prove that the countable partition $\xi$, and therefore the two-element one, is weakly Bernoulli.

Theorem 3.3. The partition $\xi$ is weakly Bernoulli for $(X, T, \mu)$ with speed $\beta(n) = O_L(n^{1-1/\alpha})$.

Proof. We begin by recalling the following result by Hofbauer and Keller [HK82], which allows us to bound $\beta(n)$ as
$$\beta(n) \le \sup_{m \in \mathbb{N}} \sum_{R \in \xi_m} \big\| P^{n+m}((\mathbf{1}_R - \mu(R))h) \big\|_{L^1_\lambda}. \qquad (13)$$
Then it will be enough to bound $\| P^{m+n}((\mathbf{1}_R - \mu(R))h) \|$ with $R \in \xi_m$. Let $p_R \ge m$ be the integer for which $R \in \xi_{p_R}$ is maximal. We decompose the sum over all the cylinders $R \in \xi_m$ into two blocks. Let $M(m, n)$ be the set of cylinders for which $p_R < m + n/2$. When $R \in M(m, n)$, the same computation performed in Lemma 3.1 gives
$$\big\| P^{m+n}((\mathbf{1}_R - \mu(R))h) \big\|_{L^1_\lambda} \le \mu(R)\, O_L\big((m + n - p_R)^{1-1/\alpha}\big) = \mu(R)\, O_L(n^{1-1/\alpha}).$$
Then the set of cylinders which do not belong to $M(m, n)$ is exactly $T^{-m+1}[0, a_{n/2}]$, whose measure is equal to
$$\mu(T^{-m+1}[0, a_{n/2}]) = \mu([0, a_{n/2}]) = \int_0^{a_{n/2}} h(x)\, dx = O(n^{1-1/\alpha}).$$
This proves the theorem. □
3.2. Statistics of return times. We now come back to the study of return times, and the first step will be the estimation of the quantities involved in the error term given by Lemma 2.4.

Lemma 3.4. There exists a constant $B$ such that for any $k$ and $C \in \xi_k$ with $T^{-k}C \cap C \ne \emptyset$,
$$\sup P^k \mathbf{1}_C \le B k^{-1-1/\alpha}. \qquad (14)$$
Proof. Let $k_0$ be such that $D_{a_{k_0}}T \le 2$, and put $r = D_{a_{k_0}}T > 1$. Let $C = [\omega_1 \ldots \omega_k]$ be a $k$-cylinder such that $T^{-k}C \cap C \ne \emptyset$. This implies that $\omega_k\omega_1$ is admissible. We want to estimate $\sup P^k \mathbf{1}_C = 1/\inf_C DT^k$. If $\omega_j \le k_0$ for all $j = 1, \ldots, k$, then $DT^k \ge r^k$. Otherwise, take $j$ such that $\omega_j = \max_{1 \le i \le k} \omega_i$. Either $j = 1$, and consequently $\omega_k = 0$, or $\omega_{j-1} = 0$. In the latter case we have
$$\inf_C DT^k \ge \inf_{[\omega_1 \ldots \omega_{j-1}]} DT^{j-1}\ \inf_{[\omega_j \ldots \omega_k]} DT^{k+1-j} \ge \Delta^{-1} \inf_{[\omega_j \ldots \omega_k \omega_1 \ldots \omega_{j-1}]} DT^k.$$
By this argument we are led to consider the worst case, which is given by a cylinder of type $C = [(k-1)(k-2)\ldots 0]$. Since $T^k C = [0, 1]$, the distortion formula (11) and the estimate $a_k \le c k^{-1/\alpha}$ given by Lemma 3.2 in [LSV97] yield $D_{a_k}T^k = c' k^{1+1/\alpha}$ for some constant $c'$, from which the lemma follows by taking $B \ge 1/c'$ such that $B k^{-1-1/\alpha} \ge r^{-k}$ for all $k > 0$. □

We now introduce the first return time of a cylinder $U$, which plays a crucial role in [Hir95]. We define it as $\tau(U) = \inf\{\tau_U(x) \mid x \in U\}$.

Lemma 3.5. The quantity $a_N(U)$ defined in Lemma 2.4 for $U = \xi_m(z)$ is bounded by
$$a_N(U) \le N\, \frac{4\Delta}{\inf h}\, \frac{\mu(U)}{\lambda(T^{\tau(U)}U)}.$$

Proof. We suppose $N > \tau(U)$, otherwise $a_N(U) = 0$. Set $\tau = \tau(U)$; for each $z$ in $X$ we have
$$a_N(U) \le \sum_{j=1}^{N} \frac{1}{\mu(U)} \mu(T^{-j}U \cap U) = \sum_{j=\tau}^{N} \frac{1}{\mu(U)} \int P^j(\mathbf{1}_U h)\, \mathbf{1}_U\, d\lambda \le N \sup_{j = \tau, \ldots, N}\ \sup_U \frac{P^j(\mathbf{1}_U h)}{h}.$$
Now the distortion (11) and the regularity of the density (9) give
$$P^\tau(\mathbf{1}_U h) = h \circ T_U^{-\tau}\, DT_U^{-\tau}\, \mathbf{1}_{T^\tau U} \le 4\Delta \frac{1}{\lambda(T^\tau U)} \int_{T^\tau U} h \circ T_U^{-\tau}\, DT_U^{-\tau}\, \mathbf{1}_{T^\tau U}\, d\lambda \le 4\Delta \frac{\mu(U)}{\lambda(T^\tau U)}.$$
Finally, $Ph = h$ and since $P$ is a positive operator one has
$$\frac{P^j(\mathbf{1}_U h)}{h} \le \sup P^\tau(\mathbf{1}_U h)\, \frac{P^{j-\tau}\mathbf{1}}{h} \le \sup P^\tau(\mathbf{1}_U h)\, \frac{P^{j-\tau}\frac{h}{\inf h}}{h} \le \frac{4\Delta}{\inf h}\, \frac{\mu(U)}{\lambda(T^\tau U)}. \qquad □$$
The next step will be to show that $\tau(U)$ is almost everywhere big enough to give a good upper bound in the previous lemma for $a_N(U)$. We first define in full generality the local rate of return for cylinders. As a matter of fact, we would like to point out that the first return time of a set into itself allows one to define and compute an interesting dimension-like characteristic, which we called the Afraimovich-Pesin dimension in [PSV98].

Definition 3.1. Let $\zeta$ be a partition of $X$. Denote by $\zeta_n(x)$ the element of $\zeta \vee T^{-1}\zeta \vee \cdots \vee T^{-n+1}\zeta$ which contains $x \in X$. We then define the local (lower and upper) rate of return for cylinders as
$$\underline{R}_\zeta(x) = \liminf_{n \to \infty} \frac{\tau(\zeta_n(x))}{n}, \qquad \overline{R}_\zeta(x) = \limsup_{n \to \infty} \frac{\tau(\zeta_n(x))}{n}.$$

Proposition 3.6. (i) Both $\underline{R}_\zeta$ and $\overline{R}_\zeta$ are sub-invariant, namely $\underline{R}_\zeta \circ T \le \underline{R}_\zeta$ and $\overline{R}_\zeta \circ T \le \overline{R}_\zeta$. (ii) Assume that $\zeta$ is a measurable partition of the measurable space $X$, and $\mu$ is an invariant probability; then $\underline{R}_\zeta$ and $\overline{R}_\zeta$ are $\mu$-a.e. invariant. (iii) Moreover, whenever $\mu$ is ergodic, $\underline{R}_\zeta$ and $\overline{R}_\zeta$ are $\mu$-a.e. constant.

Proof. (i) Let $x \in X$. For each integer $n > 0$, we have
$$\zeta_n(x) \cap T^k\zeta_n(x) \ne \emptyset \implies \zeta_{n-1}(Tx) \cap T^k\zeta_{n-1}(Tx) \ne \emptyset,$$
which implies that $\tau(\zeta_{n-1}(Tx)) \le \tau(\zeta_n(x))$. (ii) is a standard property of sub-invariant functions on finite measure spaces, and then (iii) follows immediately. □

We state the following result, which can be improved for some subshifts$^4$.

4 We have in fact the following: Theorem. Suppose that $\mu$ is a Gibbs state for the Hölder potential $\varphi$ on some irreducible and aperiodic subshift of finite type with finite alphabet $\zeta$; then $\mu$-almost everywhere, $\underline{R}_\zeta = \overline{R}_\zeta = 1$. Proof. An easier version of Proposition 3.7 gives the lower bound, while the uniform upper bound $\tau(C_n) \le n + n_0$ holds, where $C_n$ is a cylinder of order $n$ and $n_0$ is the lowest power for which the transition matrix becomes strictly positive.

Proposition 3.7. For $\mu$-almost every $z \in X$, the lower rate of return for cylinders is equal to 1: $\underline{R}_\xi(z) = 1$.

Proof. Let $1/2 < \delta < 1$. Consider the set (we denote $N_m(z) = \tau(\xi_m(z))$)
$$L_m := \{ z \in A_0 \mid N_m(z) \le \delta m \}.$$
If
$$\sum_{m=1}^{\infty} \mu(L_m) < \infty, \qquad (15)$$
then the Borel-Cantelli Lemma ensures that for almost every $z \in A_0$, we have $N_m > \delta m$ for all but finitely many $m$. By sending $\delta$ to 1 we show that $\underline{R}_\xi(z) \ge 1$ almost everywhere on A_0. Then by the preceding proposition (iii) and the ergodicity of the measure $\mu$, we
get the same bound almost everywhere. The equality finally follows since each time that $T^{m-1}z \in A_0$, we have $T^m\xi_m(z) = X$, hence $N_m(z) \le m$. In order to prove (15) it is sufficient to consider the Lebesgue measure instead of $\mu$ (since the density $h$ is bounded from below). We have
$$\lambda(L_m) = \underbrace{\sum_{k=1}^{[m/2]} \lambda(N_m = k)}_{(1)} + \underbrace{\sum_{k=[m/2]+1}^{\delta m} \lambda(N_m = k)}_{(2)}.$$
We now perform a detailed analysis of the sets appearing in the preceding formula.

(1): In this case, the cylinder $\xi_m(z)$ with $N_m = k$ must be of the form
$$\xi_m(z) = [\underbrace{(\omega_1 \ldots \omega_k)(\omega_1 \ldots \omega_k) \ldots (\omega_1 \ldots \omega_k)}_{[m/k]} \ldots].$$
Therefore when $k \le [m/2]$, the cylinder is completely determined by its first $k$ symbols. Put $C = [\omega_1 \ldots \omega_k]$; we say that a cylinder of length $k$ is admissible (admis) when it is the beginning of a cylinder of $L_m$ with $N_m = k$. Then we can bound (1) by
$$(1) \le \sum_{k=1}^{[m/2]} \sum_{C\ \mathrm{admis}} \lambda\big(C \cap T^{-k}C \cap \cdots \cap T^{-[m/k-1]k}C\big) \le \sum_{k=1}^{[m/2]} \sum_{C\ \mathrm{admis}} \Big(\sup_C P^k\mathbf{1}_C\Big)^{[m/k]-1} \lambda(C) \le \sum_{k=1}^{[m/2]} \sup_{C\ \mathrm{admis}} \Big(\sup_C P^k\mathbf{1}_C\Big)^{[m/k]-1}.$$
We first remark that, $T^k$ being injective over $C \in \xi_k$, we have
$$P^k\mathbf{1}_C \le 1/\inf_{A_0} DT^k \le 1/2.$$
We split the last sum into three pieces by fixing $k_0$ as the smallest integer for which $k_0^{1+\frac{1}{\alpha}} \ge eB$, where $B$ is the constant in Lemma 3.4. We then have, by using Lemma 3.4,
$$(1) \le \sum_{k=1}^{k_0} (1/2)^{[m/k]-1} + \sum_{k=k_0}^{m/3} \big(Bk^{-1-1/\alpha}\big)^{m/k-2} + \sum_{k=m/3}^{[m/2]} Bk^{-1-1/\alpha}.$$
The first and the last sums are easily shown to be summable with respect to $m$. For the second term, we observe that the terms $(Bk^{-1-1/\alpha})^{m/k-2}$ are increasing in $k$ when $k$ is bigger than $k_0$. A direct estimate of the sum is $B3^{1/\alpha}m^{-1/\alpha}$, which is summable with respect to $m$.

(2): In this case, the cylinder $\xi_m(z)$ has the form
$$\xi_m(z) = [\underbrace{\omega_1 \ldots \omega_{m-k}}_{m-k}\ \underbrace{\omega_{m-k+1} \ldots \omega_k}_{2k-m}\ \underbrace{\omega_1 \ldots \omega_{m-k}}_{m-k}].$$
As before, we set $C = [\omega_1 \ldots \omega_{m-k}]$, and we say that $C$ is admissible (admis) when it is the beginning of a cylinder of $L_m$ with $N_m = k$,
$$(2) \le \sum_{k=[m/2]+1}^{\delta m} \sum_{C\ \mathrm{admis}} \lambda(C \cap T^{-k}C) \le \sum_{k=[m/2]+1}^{\delta m} \sup_{C\ \mathrm{admis}}\ \sup_C P^k\mathbf{1}_C.$$
Let first $p = p(C) \ge m - k$ be such that $C \in \xi_p$ is maximal (i.e. $p(C)$ is the smallest $p$ for which $C \in \xi_p$). When $p < k$, since $1 \in \mathcal{C}_*(a)$, the inequality (12) and Lemma 3.4 give
$$\sup_C P^k\mathbf{1}_C \le \sup_C P^p\mathbf{1}_C\ \sup P^{k-p}1 \le a2^\alpha Bp^{-1-1/\alpha} \le a2^\alpha B(m - k)^{-1-1/\alpha}.$$
When $p \ge k$, $C \in \xi_k$ and $T^{-k}C \cap C \ne \emptyset$, so we have $P^k\mathbf{1}_C \le Bk^{-1-1/\alpha}$. But $k \ge m - k \ge (1 - \delta)m$, and then the sum (15) is summable for any $\delta < 1$. □

We are now ready to state and prove the main theorems of this section.

Theorem 3.8. For $\mu$-almost every $z \in X$ and $\beta < \bar{\beta}(\alpha)$,
$$\sup_{t \ge 0} \left| \mu_{\xi_m(z)}\Big(\tau_{\xi_m(z)} > \frac{t}{\mu(\xi_m(z))}\Big) - \exp(-t) \right| = O\big(\mu(\xi_m(z))^\beta\big),$$
where the critical exponent is $\bar{\beta}(\alpha) = 1 - \alpha$.

Proof. Let $\varepsilon$ be a positive number. Let $z$ be a typical point for Proposition 3.7 and for the Shannon-McMillan-Breiman theorem. We want to apply Lemma 2.4. Let $m(\varepsilon)$ be such that for any $m > m(\varepsilon)$ we have $(1 - \varepsilon)m \le \tau(\xi_m(z))$, $\mu(\xi_m(z)) \le \exp(-2mh_\mu/3)$ and also $\mu(\xi_{\varepsilon m}(T^{[(1-\varepsilon)m]}z)) \ge \exp(-(2[\varepsilon m])h_\mu)$. For the sake of simplicity, we put $U_m = \xi_m(z)$ for any $m$. For any $m > m(\varepsilon)$ such that $U_m$ is maximal, we have $(1 - \varepsilon)m \le \tau(U_m) \le m$, and all the iterates $T^jU_m$ for $1 \le j < m$ are at a distance bigger than $a_m$ from the neutral fixed point (because $U_m$ is maximal). If $\tau(U_m) < m$ then the density stays bounded on the orbit $T^jU_m$ by $ba_m^{-\alpha}$, so we have
$$\lambda(T^{\tau(U_m)}U_m) \ge \frac{a_m^\alpha}{b}\, \mu(T^{\tau(U_m)}U_m) \ge \frac{a_m^\alpha}{b} \exp(-2\varepsilon mh_\mu).$$
On the other hand, when $\tau(U_m) = m$ we still get
$$\lambda(T^{\tau(U_m)}U_m) = 1 \ge \frac{a_m^\alpha}{b} \exp(-2\varepsilon mh_\mu).$$
Lemma 3.5 gives us the following estimate with $N = \mu(U_m)^{-\alpha+\varepsilon}$:
$$a_N(U_m) = O\big(\mu(U_m)^{1-\alpha-3\varepsilon}\big).$$
Lemma 3.1 with $R = U_m$ gives us
$$b_N(U_m) = O_L\big((\mu(U_m)^{-\alpha+\varepsilon} - m)^{1-\frac{1}{\alpha}}\big) = O_L\big(\mu(U_m)^{(-\alpha+\varepsilon)(1-\frac{1}{\alpha})}\big).$$
We can then apply Lemma 2.4, which gives
$$c(U_m) \le a_N(U_m) + b_N(U_m) + N\mu(U_m) = O(\mu(U_m)^\beta)$$
for $\beta \le 1 - \alpha - 3\varepsilon$ and $\beta \le 1 - \alpha - 2\varepsilon(1/\alpha - 1)$. We finally end up with
$$d(U_m) = O(\mu(U_m)^\beta) \qquad (16)$$
for any $\beta < 1 - \alpha$, since $\varepsilon$ is arbitrarily small, which concludes the proof by applying Theorem 2.1. □

Remark 3.9. The preceding theorem shows that the critical exponent $\bar{\beta}(\alpha)$ is smaller than 1. We point out that, by Proposition 2.3, the power $\bar{\beta}$ cannot exceed 1.

Theorem 3.10. For $\mu$-almost every $z \in X$, we have for any $t \ge 0$, $K \ge 0$ and $\beta < \bar{\beta}(\alpha)$,
$$\left| \mu_{\xi_m(z)}\big(N_{\xi_m(z)}(t) = K\big) - \frac{t^K}{K!} \exp(-t) \right| = O\big(\mu(\xi_m(z))^{\beta/(K+1)}\big),$$
with the critical exponent $\bar{\beta}(\alpha) = 1 - \alpha$.

Proof. Let $z$ be a typical point satisfying the preceding theorem and $m$ such that $U_m = \xi_m(z)$ is maximal. By invoking the footnote of Theorem 2.6, it will be sufficient to use the weakened α-mixing condition $\alpha'(M) = O_L((M - m)^{1-\frac{1}{\alpha}})$ given by Lemma 3.1 to apply Theorem 2.6. Take $M = \mu(U_m)^{-\alpha}$; we thus find for $\beta < 1 - \alpha$, by the estimate (16) and Theorem 2.6, an error of the order
$$f(K, U_m) = \mathrm{const}\,[d(U_m) + \alpha'(M) + M\mu(U_m)] = O(\mu(U_m)^\beta).$$
By applying Theorem 2.8, the error for the probability to have $K$ successive visits is of the order $\mu(U_m)^{\beta/(K+1)}$ for all $\beta < 1 - \alpha$. □

4. Concluding Remarks

We conclude with a few observations. First, the proofs of the exponential-one law and the Poisson law given in Sect. 3 for a class of non-uniformly hyperbolic maps can be easily adapted, and are even easier, in all the cases quoted in the introduction, namely: Axiom A diffeomorphisms, transitive Markov chains, expanding maps of the interval with a spectral gap, and in general all ϕ-mixing dynamical systems. For such systems, an estimate of the error can also be obtained: following the arguments of Theorems 3.8 and 3.10, one can easily see that the critical exponent $\bar{\beta}$ is equal to 1. This supports our belief that: (i) the error terms of type $\mu(U)^\beta$ could be optimal and (ii) the non-uniform hyperbolicity of the map is reflected in the critical exponent: in that case, in fact, it should be strictly smaller than one.

Acknowledgements. We would like to thank Viviane Baladi for a careful reading of a preliminary version of this work and Bernard Schmitt for useful discussions. B.S. acknowledges the ESF for support during the workshop "Probabilistic methods in non-hyperbolic dynamics" in Warwick.
References

[Coe97] Coelho, Z.: Asymptotic laws for symbolic dynamical systems. Lectures given in Temuco, Chile, 1997
[Col96] Collet, P.: Some ergodic properties of maps of the interval. In: Dynamical systems (Temuco, 1991/1992), Travaux en Cours, vol. 52, Paris: Hermann, 1996, pp. 55–91
[CG93] Collet, P. and Galves, A.: Statistics of close visits to the indifferent fixed point of an interval map. J. Stat. Phys. 72, no. 3-4, 459–478 (1993)
[CFS82] Cornfeld, I.P., Fomin, S.V. and Sinaĭ, Ya.G.: Ergodic theory. vol. 245, New York: Springer-Verlag, 1982
[GS97] Galves, A. and Schmitt, B.: Inequalities for hitting time in mixing dynamical systems. Random Comput. Dynam. 5, no. 4, 337–347 (1997)
[Hay98a] Haydn, N.: The distribution of the first return time for rational maps. 1998, USC
[Hay98b] Haydn, N.: Statistical properties of equilibrium states for rational maps. 1998, USC
[Hir93] Hirata, M.: Poisson law for Axiom A diffeomorphisms. Ergodic Theory Dynamical Systems 13, no. 3, 533–556 (1993)
[Hir95] Hirata, M.: Poisson law for the dynamical systems with the "self-mixing" conditions. In: Dynamical systems and chaos, Vol. 1 (Hachioji, 1994), River Edge, NJ: World Sci. Publishing, 1995, pp. 87–96
[HK82] Hofbauer, F. and Keller, G.: Ergodic properties of invariant measures for piecewise monotonic transformations. Math. Z. 180, no. 1, 119–140 (1982)
[LSV97] Liverani, C., Saussol, B. and Vaienti, S.: A probabilistic approach to intermittency. Ergodic Theory Dynamical Systems (1997). To appear
[PSV98] Penné, V., Saussol, B. and Vaienti, S.: Fractal and statistical characteristics of recurrence times. To appear in Journal de Physique (Paris), 1998, Proceedings of the conference "Disorder and Chaos" (Rome, 22–24th Sept. 1997), in honour of Giovanni Paladin
[Pit91] Pitskel, B.: Poisson limit law for Markov chains. Ergodic Theory Dynamical Systems 11, no. 3, 501–513 (1991)
[Sau98a] Saussol, B.: Absolutely continuous invariant measures for multidimensional expanding maps. 1998, Submitted
[Sau98b] Saussol, B.: Etude statistique de systèmes dynamiques dilatants. Ph.D. thesis, Université de Toulon et du Var, 1998
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 57 – 103 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Lifshitz Tails for Random Schrödinger Operators with Negative Singular Poisson Potential
Frédéric Klopp1, Leonid Pastur2,3
1 Département de Mathématique, Institut Galilée, U.M.R 7539 C.N.R.S, Université de Paris-Nord, Avenue J.-B. Clément, F-93430 Villetaneuse, France. E-mail: [email protected]
2 Département de Mathématique, Université Paris VII, 2, Place Jussieu, F-75005 Paris, France. E-mail: [email protected]
3 Mathematical Division, Institute for Low Temperature Physics, 47, Lenin’s Ave., 310164, Kharka, Ukraine
Received: 18 November 1998 / Accepted: 9 March 1999
Abstract: We develop a method of asymptotic study of the integrated density of states (IDS) N(E) of a random Schrödinger operator with a non-positive (attractive) Poisson potential. The method is based on the periodic approximations of the potential instead of the Dirichlet-Neumann bracketing used before. This allows us to derive more precise bounds for the rate of approximations of the IDS by the IDS of respective periodic operators and to obtain rigorously for the first time the leading term of log N(E) as E → −∞ for the Poisson random potential with a singular single-site (impurity) potential, in particular, for the screened Coulomb impurities, dislocations, etc.

Contents
0. Introduction: Problems and History
1. The Assumptions and the Results
   1.1 The integrated density of states
   1.2 The asymptotics of the IDS
2. Periodic Approximations
   2.1 A general approximation result
   2.2 Application to the estimation of the integrated density of states
3. The Lower Bounds
   3.1 The general case
   3.2 The case when V is bounded from below
   3.3 The case when V has power law singularities
4. The Upper Bounds
   4.1 The general case
   4.2 Proof of Proposition 4.1
   4.3 The case when V has power law singularities
5. Appendix
   5.1 The structure of the Poisson potential
F. Klopp, L. Pastur
   5.2 An a-priori estimate on the density of states
   5.3 Exponential decay estimates
   5.4 Some useful facts about the single site potential Hamiltonian Hg

0. Introduction: Problems and History

The Integrated Density of States (IDS) is one of the simplest but quite important characteristics of the random Schrödinger operator. Among numerous problems related to the IDS, the problem of its asymptotic behavior near the edges of the spectrum is well known and studied. The results of these studies can be summarized as follows. One has to distinguish two types of spectral edges: stable and fluctuational (see e.g. [7,18,21]). The latter are special for shortly correlated random potentials. In the simplest case of the lower edge of the spectrum, they are determined by the absolute minimum of the potential, since the spectrum in a neighborhood of this edge exists only because of the (arbitrarily large) fluctuations of the potential arbitrarily close to the minimum. Using quantum mechanical terminology, one can call these portions of the realization potential wells. A heuristic derivation of the fluctuational asymptotics of the IDS was proposed by I. Lifshitz in the early 1960s [16,17]. The asymptotics is given by the probability to have a potential well whose ground state energy is close enough to the spectral edge. Since the probability of these realizations having the form of very broad or deep potential wells (also known as optimal fluctuations) is usually exponentially small, one has to deal here with a version of the large deviation technique in the spectral context. In particular, to determine the asymptotic formula for the IDS one has to be able to give a rather detailed description of the statistics of these special realizations. This is why precise and explicit asymptotic formulae are known only for comparatively restricted classes of random potentials.
One of the widely studied random potentials is the Poisson potential having the form
$$V_\omega(x) = \sum_j V(x - x_j), \tag{0.1}$$
where $\{x_j\}$ is the Poisson point field of density $\mu$ in $\mathbb{R}^d$ and V(x), the one-site (or single site) potential, is a function decaying sufficiently fast at infinity. The Poisson potential is of considerable interest both in spectral theory and in the theoretical physics of disordered systems. It possesses a number of nontrivial asymptotic regimes, only part of which has been studied so far. One has to mention first the case of the nonnegative one-site potential V(x) of compact support. In this case E = 0 is a fluctuational edge and, according to I. Lifshitz,
$$N(E) \simeq \exp(-\mathrm{const}\cdot E^{-d/2}), \quad E \to +0. \tag{0.2}$$
The right-hand side of this formula is just the probability to have a well (a region of $\mathbb{R}^d$ free of $x_j$’s) of width $L \simeq E^{-1/2}$; the latter relation is due to the uncertainty principle. In other words, the asymptotics of the IDS in this case is determined by an optimization procedure balancing the quantum and the probabilistic components of the problem. This is why formula (0.2) is often called the quantum Lifshitz tail. Rigorous derivations of various versions of (0.2) (e.g. its logarithmic or even its double logarithmic forms) have required a number of rather sophisticated probabilistic and spectral techniques (see
Lifshitz Tails for Random Schrödinger Operators
e.g. [3,21,27,30,31,11] for results and references). Here and below we use the symbol “$\simeq$” to denote the asymptotic equivalence without indicating explicitly the order of the remainder and respective constants. Other asymptotic regimes of the IDS for the potential (0.1) correspond to the case when the one-site potential has a non-positive part, i.e. inf V(x) < 0, so that the lower edge of the spectrum is E = −∞. In this case one has to distinguish two asymptotic regimes, usually called quantum and classical. We will present the respective asymptotic formulae by using a version of Lifshitz’s arguments adapted to this case. Recall the definition of the IDS. It is the limit as L → +∞ of the expectation of the normalized counting function of eigenvalues of the Schrödinger operator $H_{\Lambda_L}$, where $H_{\Lambda_L}$ is the restriction of $H_\omega = -\Delta + V_\omega$ to $L^2(\Lambda_L)$ (see e.g. [21,7]). Here $\Lambda_L$ is the cube in $\mathbb{R}^d$ of center zero and of side length L. The definition shows that the IDS can be regarded as the probability to find an eigenvalue of $H_{\Lambda_L}$ lying below a given energy E. For E → −∞ these eigenvalues are produced by very deep potential wells, created by large clusters of k Poisson points $x_j$, confined to sufficiently small regions of the space, say a cube of side length $l \ll L$. The respective probability is
$$e^{-\mu l^d}\,(\mu l^d)^k / k!. \tag{0.3}$$
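As an informal numerical illustration (not part of the original argument), the following Python sketch evaluates the logarithm of the probability (0.3) and compares it with $-k\log k$, the term produced by the factor $1/k!$. The function names and the parameter values $\mu = 1$, $l = 1$, $d = 3$ are assumptions chosen only for the illustration.

```python
import math


def log_cluster_probability(k: int, mu: float, l: float, d: int) -> float:
    """Logarithm of e^{-mu l^d} (mu l^d)^k / k!, i.e. of formula (0.3)."""
    m = mu * l ** d  # expected number of Poisson points in a cube of side l
    # lgamma(k + 1) = log(k!), which avoids overflow for large k
    return -m + k * math.log(m) - math.lgamma(k + 1)


def ratio_to_k_log_k(k: int, mu: float = 1.0, l: float = 1.0, d: int = 3) -> float:
    """Ratio of log-probability (0.3) to -k log k, for fixed l (hypothetical defaults)."""
    return log_cluster_probability(k, mu, l, d) / (-k * math.log(k))


if __name__ == "__main__":
    for k in (10, 10**3, 10**6):
        print(k, ratio_to_k_log_k(k))
```

For fixed l the ratio approaches 1 only logarithmically slowly (the relative correction is of order $1/\log k$), which is consistent with stating the asymptotics at the level of the leading term of log N(E).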
Assume first that l is independent of k. This corresponds to a single site potential that varies relatively slowly with respect to the exponential decay of the single site ground state. Then, the IDS has to be asymptotically equivalent to the probability (0.3) in which k is determined by the equation E(k) = E, where E(g) is the ground state energy of the well gV(x). In other words, for a non-positive one-site potential V(x), optimal fluctuations are potential wells whose form is roughly kV(x). Thus, we have to find the ground state energy E(g) of the well gV(x) as a function of its coupling constant g, and then the asymptotic form of the IDS will be given by (0.3) in which k = g(E), where g(E) is the asymptotic inverse of −E(g) for E → −∞, g → +∞. Since the leading contribution to (0.3) in this case is given by the factor 1/k!, the asymptotic formula should have the form
$$N(E) \simeq \exp(-g(E)\log g(E)), \quad E \to -\infty \tag{0.4}$$
or
$$\log N(E) \simeq -g(E)\log g(E), \quad E \to -\infty. \tag{0.5}$$
The simplest case when this formula can be made rigorous is when V(x) is continuous at its (negative) absolute minimum. In this case
$$E(g) = gV(0)(1 + o(1)), \quad E \to -\infty. \tag{0.6}$$
The respective rigorous formula
$$\log N(E) = \frac{E}{V(0)}\log|E|\,(1 + o(1)), \quad E \to -\infty \tag{0.7}$$
was proved in [20] using a combination of variational and Wiener integral techniques. The formula is obviously classical because it includes only the characteristics of the potential but not the Laplacian. In addition, it can easily be shown that the r.h.s. of (0.4) is the asymptotic form of the logarithm of $P(V_\omega(x) < E)$. This provides one more piece of evidence of the classical nature of the formula.
This paper is devoted to quantum versions of (0.4), i.e. to singular single site potentials V(x). Despite a considerable amount of physical results on the question (see e.g. [18,4]), rigorous results are almost absent, except for the one-dimensional case of point one-site potentials
$$V(x) = -g_0\,\delta(x), \quad g_0 > 0, \tag{0.8}$$
where δ(x) is the Dirac delta-function. In this case, by using the special Markov processes technique, one can obtain (see [18]) an asymptotic of the form
$$\log N(E) = -2\sqrt{\frac{E}{E_0}}\,\log\sqrt{\frac{E}{E_0}}\,(1 + o(1)), \quad E \to -\infty,$$
where $E_0 = -g_0^2/4$. We see that formula (0.5) is only valid up to a factor 2. This difference is the result of a tunneling phenomenon related to the question of how close to one another k potentials (0.8) should be in order to be regarded as the potential $k^*(k)\delta(x)$, i.e. the potential of the same shape as (0.8) and of amplitude $k^*(k)$. In other words, in this case, contrary to the classical case, the radius of the exponential decay of the single site ground state is much larger than the width of the single site potential. Thus the optimal cluster should be much smaller (i.e. its radius should tend to 0 sufficiently fast) in order to be modeled by a single site potential of some effective amplitude $k^*(k)$. We shall see below that in many interesting cases $l \simeq k^{-\alpha^*}$ for some $\alpha^* > 0$. Because of that, the factor $k^{-\alpha^* dk}$ in (0.3) will also contribute to the asymptotic formula of the IDS. The study of this phenomenon is one of the topics dealt with in the present paper. In this paper we study the case of the singular one-site potential (mainly with power-law singularities), following in essence the scheme outlined above. We find the precise form of $k^*(k)$ (see Theorems 1.4 and 1.6 below). This became possible owing to an improvement of one of the techniques in the field, based on approximations of the Schrödinger operator in the whole space by the operator with the same potential but defined in a finite box whose size is properly chosen as a function of energy. Previous versions of this technique were based on the so-called Neumann–Dirichlet bracketing, where the boxes with the Neumann and Dirichlet boundary conditions were used to construct the upper and the lower bounds for the IDS. The error in these bounds is of the order $O(L^{-1})$, where L is the size of the box.
This precision is not sufficient to treat the quantum case. Therefore, we approximate the IDS of the random model by the IDS of some well chosen periodic Schrödinger operators and obtain much more precise bounds (see Sect. 2 and more precisely Lemmas 2.1 and 2.3). This method was proposed in [11] and has been used to solve several problems in the field ([10,12]). We obtain the once logarithmic versions of (0.4), i.e. (0.5) with explicit g(E) and constants in front of g(E) log g(E) (see Theorems 1.5 and 1.6). Let us now give just one example of the results presented in Sect. 1.2; let V, the single site potential, be the 3-dimensional attractive screened Coulomb potential $V(x) = -\frac{e^{-|x|}}{|x|}$, widely used in semiconductor physics (see e.g. [4]). In this case we prove
$$\log N(E) = -2\sqrt{|E|}\,\log|E|\,(1 + o(1)), \quad E \to -\infty,$$
(see Theorem 1.6 and the discussion following it). The role of the IDS in the spectral theory and theoretical physics of disordered systems is well known and appreciated (see [2,7,21,4,18]). However, there is one more reason to
study this quantity. Since the pioneering papers of I. Lifshitz, the study of the IDS has been providing a first important step in the study and in the understanding of more complex properties and quantities in a respective version of the strong localization regime. In particular, the IDS is the first moment (see formula (1.4)) of the spectral kernel of the Schrödinger operator. From the mathematical physics point of view, the IDS determines the equilibrium properties of disordered systems, i.e. of the ideal gas of elementary excitations (electrons, phonons, spin waves, etc.) in the random environment. The study of the kinetic properties of this gas and of the interaction effects requires knowledge of higher moments of the spectral kernel, the second moments first of all. The knowledge of these correlators allows one to answer a number of relevant questions concerning the existence and the nature of the localization and behavior of related quantities. In particular, in a subsequent paper ([13]), we use the technique developed in this paper in order to find the large deficit asymptotic behavior of the inter-band light absorption coefficient. The paper is organized as follows. In Sect. 1, we define the framework of our study and give a brief account of our results. We also present several examples at the end of the section. In Sect. 2, we first prove the basic relation (1.4) expressing the IDS in terms of the spectral family of the random Schrödinger operator. Then we construct our main technical tool, the periodic approximations of the IDS. Sections 3 and 4 are devoted to the derivation of the lower and upper bounds for the IDS using the periodic approximations. Section 5 contains auxiliary facts on the statistics of the Poisson field, on random Schrödinger operators and on the structure of the ground state of the Schrödinger operator with a singular single site potential.

1. The Assumptions and the Results

Let $V : \mathbb{R}^d \to \mathbb{R}$ be a function such that $V = V_1 + V_2$, where
H1 For some C > 0 and any $x \in \mathbb{R}^d$, $|V_1(x)| \le Ce^{-|x|/C}$.
H2 The function $V_2$ is compactly supported and satisfies $V_2 \in L^p(\mathbb{R}^d)$, where p > p(d) and p(d) = 2 if d ≤ 2 and p(d) = d/2 if d ≥ 3.
H3 For some set E of positive measure, $V|_E < 0$.
Define the random potential
$$V_\omega(x) = \int_{\mathbb{R}^d} V(x - y)\, m(\omega, dy), \tag{1.1}$$
where m(ω, dy) is a random Poisson measure of concentration µ. $V_\omega$ is an ergodic random field on $\mathbb{R}^d$. Consider the random Schrödinger operator
$$H_\omega = -\Delta + V_\omega. \tag{1.2}$$
One has

Theorem 1.1 ([7]). Under the assumptions made above, $H_\omega$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^d)$ ω-almost surely.

Under our assumptions on V, we know that the almost sure spectrum of $H_\omega$ is $\Sigma = \mathbb{R}$ ([21,7]).
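1.1. The integrated density of states. Let Λ be a cube centered at 0 in $\mathbb{R}^d$. We define $H^D_{\omega,\Lambda}$ to be the Dirichlet restriction of $H_\omega$ to Λ. Pick $E \in \mathbb{R}$. Consider the quantity
$$N_{\omega,\Lambda}(E) = \frac{1}{\mathrm{Vol}(\Lambda)}\,\sharp\{\text{eigenvalues of } H^D_{\omega,\Lambda} \text{ smaller than or equal to } E\}. \tag{1.3}$$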
Then one has

Theorem 1.2 ([7]). Under the assumptions made above, there exists a non-random, non-decreasing, non-negative, right continuous function N(E) such that, ω-almost surely, for all $E \in \mathbb{R}$, E a continuity point of N, $N_{\omega,\Lambda}(E)$ converges to N(E) as Λ exhausts $\mathbb{R}^d$.

N(E) is the integrated density of states (IDS) of $H_\omega$. As N is non-decreasing, one can define its distributional derivative dN. It is a positive measure and is supported on the almost sure spectrum of $H_\omega$ (see [7,21]). One has the following result:

Theorem 1.3. For $\varphi \in C_0^\infty(\mathbb{R})$, we have
$$(\varphi, dN) = \mathbb{E}\big(\mathrm{tr}(\mathbf{1}_{C(0,1)}\,\varphi(H_\omega)\,\mathbf{1}_{C(0,1)})\big), \tag{1.4}$$
where C(0, 1) is the cube of center 0 and side length 1.

Formula (1.4) is well known under more restrictive assumptions on the potential $V_\omega$, i.e. for less singular single site potentials V (see [21]).

1.2. The asymptotics of the IDS. To describe the asymptotic behavior of N(E) near −∞, we will need to define an auxiliary operator. For $g \in \mathbb{R}$, define
$$H(g) = -\Delta + gV. \tag{1.5}$$
Under our assumptions on V, V is relatively form bounded with respect to $-\Delta$ with relative bound 0. Hence, H(g) admits a unique self-adjoint extension. Let σ(H(g)) denote its spectrum. It is lower semi-bounded. The infimum of σ(H(g)), i.e. the ground state energy of H(g), will be denoted by E(g). Let $\varphi_g$ be the respective ground state, i.e. the unique positive normalized eigenfunction of H(g) associated to energy E(g) ([22,26]). In the sequel it will often be more convenient to work with $E_-(g) = -E(g)$ instead of E(g) itself. From assumption H3, one easily infers that $E_-(g) \to +\infty$ when g → +∞ (see Sect. 5). Moreover, $E_-$ is strictly increasing in a neighborhood of +∞. Let g be an inverse of $E_-$ in a neighborhood of +∞; g is strictly increasing. In the regular (classical) case, it was found that g governs the first term asymptotic of log N (cf. [21,20]). In the singular (quantum) case, the singular set of V will play a special part in the asymptotics. To measure this role, we introduce the notion of asymptotic ground state, i.e.

Definition 1.1. Let $g \in (1, +\infty) \mapsto \psi_g \in H^1(\mathbb{R}^d)$. We will say that $\psi_g$ is an asymptotic ground state if and only if
• the vector $\psi_g$ is normalized,
• there exist $g_0 > 1$, $l_0 > 0$ such that, for all $g \ge g_0$, $\mathrm{supp}\,\psi_g \subset C(0, l_0)$ (where C(x, l) denotes the cube of center x and side length l),
• $$\frac{|\langle (H(g) - E(g))\psi_g, \psi_g\rangle|}{|E(g)|} \to 0 \quad \text{as } g \to +\infty. \tag{1.6}$$
In Lemma 5.6, we prove the existence of an asymptotic ground state. For $a \in \mathbb{R}^d$, we define the translation $\tau_a$ by $\tau_a V(x) = V(x - a)$ and we define
$$A_{\psi_g} = \left\{\alpha > 0;\ \lim_{g\to+\infty}\ \sup_{|a| \le g^{-\alpha}} \frac{g\,|\langle(\tau_a V - V)\psi_g, \psi_g\rangle|}{E_-(g)} = 0\right\}.$$
If $A_{\psi_g} \ne \emptyset$, then we define $\alpha^*(\psi_g) := \inf A_{\psi_g}$. Moreover, we define A to be the union of all $A_{\psi_g}$. By Lemma 5.6, we know that $A \ne \emptyset$. We define
$$\alpha^* := \inf A. \tag{1.7}$$
Roughly speaking, the dependence of the radius of the exponential decay of the single site potential ground state on the coupling constant g is of the form $g^{-\alpha^*}$. This determines the characteristic size l of the optimal cluster. Then we prove

Theorem 1.4. Under the assumptions H1, H2 and H3, for sufficiently large E, one has
$$-(1 + \alpha^* d)\,g(E)\log g(E)\,(1 + o(1)) \le \log N(-E) \le -g(E)\log g(E)\,(1 + o(1)). \tag{1.8}$$

One may complain that Theorem 1.4 is somewhat imprecise in that it only gives a two sided estimate. But, as we will see below, this is in some way unavoidable, as the true asymptotic depends not only on g but also on the singular set of the negative part of V. More precisely, as can be seen from Theorem 1.6 (and from the proof of Theorem 1.4), the asymptotics of the IDS depends on the way the eigenfunction associated to the lowest eigenvalue of the operator $-\Delta + gV$ concentrates near the singular set of the negative part of V as g becomes large. In general the correction also depends on the geometry of the singular set. For example, if the singular set is a segment (e.g. a dislocation), using the techniques developed in Sect. 3, one can see that neither the lower nor the upper bound given by Theorem 1.4 is sharp. The two sided estimate (1.8) can be made more precise if we know more about V. The first and simplest example we give is the case when V is bounded from below, reaches its minimum at a single point, say 0, and is continuous near 0. Then one easily proves that $\alpha^* = 0$, and the upper and lower bounds in (1.8) coalesce to give (0.7). We will now give other results that, we think, cover most of the physically relevant examples. Let $v_-$ be the essential infimum of V and assume that V is bounded from below, say
H1’ $-\infty < v_- < 0$.
It is easy to show that $g(E) \sim E/|v_-|$ when E → +∞ (see Lemma 5.5). We obtain

Theorem 1.5. Under the assumptions H1, H2 and H1’, one has
$$\log N(-E) \underset{E\to+\infty}{\sim} -g(E)\log g(E) \underset{E\to+\infty}{\sim} \frac{E}{v_-}\log E. \tag{1.9}$$
Here and in the rest of the paper, a ∼ b will always mean a = b(1 + o(1)).
This result extends (0.7), removing the continuity assumption near the minimum. Consider now an example a bit more singular. In this case, d = 2 and $V_2(x) = \log_-|x|$, $x \in \mathbb{R}^2$, where, for a ≥ 0, $\log_- a = \min\{\log a, 0\}$. Using the inequality $\log_-|x| + \log R \le \log_- R|x| \le \log|x|$ for 0 < R < 1 and the variational principle for the ground state energy, one shows that, in this case,
$$E_-(g) \underset{g\to+\infty}{\sim} \frac{g}{2}\log g, \quad \text{hence} \quad g(E) \underset{E\to+\infty}{\sim} \frac{2E}{\log E}.$$
One also shows that $\alpha^* = 0$ for this single site potential. Therefore, Theorem 1.4 tells us that
$$\log N(-E) \underset{E\to+\infty}{\sim} -g(E)\log g(E) \underset{E\to+\infty}{\sim} -2E.$$
Hence, the asymptotic formula (0.5) is also valid for certain mildly singular potentials. Another case where one can find an asymptotic for log N is when V has only power law singularities. Let q be a positive integer and pick q positive exponents $(\nu_i)_{i=1,\dots,q}$ and q functions $(h_i(\theta))_{i=1,\dots,q}$ continuous on the sphere $S^{d-1}$. For 1 ≤ i ≤ q, consider the potentials
$$V_i(x) = \frac{h_i(\theta(x))}{|x|^{\nu_i}}, \quad \text{where } \theta(x) = \frac{x}{|x|}. \tag{1.10}$$
Assume that
$$0 < \nu_i < \begin{cases} 1 & \text{if } d = 1, 2, \\ 2 & \text{if } d \ge 3. \end{cases} \tag{1.11}$$
Then $V_i$ is relatively form bounded with respect to $-\Delta$ with relative bound 0, and we can consider the operators $H^i = -\Delta + V_i$ with form domain $H^1(\mathbb{R}^d)$. For 1 ≤ i ≤ q, $E_i$ denotes the ground state energy of $H^i$. Now we assume that
H1”
• There exist q distinct points $(x_i)_{i=1,\dots,q}$ in $\mathbb{R}^d$ and q continuous compactly supported functions $(W_i)_{i=1,\dots,q}$ such that $W_i(0) = 1$ and
$$V_2(x) = \sum_{i=1}^{q} W_i(x - x_i)V_i(x - x_i) = \sum_{i=1}^{q} \tau_{x_i}(W_i V_i)(x). \tag{1.12}$$
• For some $1 \le i_0 \le q$, we have $E_{i_0} < 0$.
Notice that assumption H1” implies assumptions H2 and H3. Define
$$\nu^\dagger = \sup\{\nu_i;\ 1 \le i \le q \text{ such that } E_i < 0\}, \tag{1.13}$$
$$E_- = \sup\{|E_i|;\ 1 \le i \le q \text{ such that } E_i < 0 \text{ and } \nu_i = \nu^\dagger\}, \tag{1.14}$$
$$\alpha^\dagger = \frac{1}{2 - \nu^\dagger}. \tag{1.15}$$
Then, by Lemma 5.7, we know that $E_-(g) \sim E_-\, g^{2\alpha^\dagger}$ when g → +∞. In addition, using the periodic approximation scheme, in this case, we can find an upper bound having the same form as the lower bound in (1.8), where $\alpha^*$ is replaced by $\alpha^\dagger$ (Proposition 4.4). On the other hand, we prove that $\alpha^* \le \alpha^\dagger$ (Lemma 3.1). From these two facts and from Theorem 1.4, we deduce $\alpha^* = \alpha^\dagger$ and

Theorem 1.6. Under the assumptions H1 and H1”, one has
$$\log N(-E) \underset{E\to+\infty}{\sim} -(1 + \alpha^\dagger d)\,g(E)\log g(E) \underset{E\to+\infty}{\sim} -\left(1 + \frac{d - \nu^\dagger}{2}\right)\left(\frac{E}{E_-}\right)^{1 - \nu^\dagger/2}\log\frac{E}{E_-}. \tag{1.16}$$
We see that the lower bound in (1.8) is rather universal. There exist other methods to obtain this bound, for instance a version of the variational method of [18,21] (also [13]). This method also gives a sharp lower bound for practically all asymptotic formulae for log N(−E) known so far. This is true in particular for the Poisson potential with a non-positive single site potential. Thus it is the upper bound that has been requiring different techniques depending on the specific potential (see e.g. [3,27,30,11]). Assumptions H1 and H1” include most physically interesting cases as, for example, the 3-dimensional attractive screened Coulomb potential $V(x) = -\frac{e^{-|x|}}{|x|}$. In this case we have $\alpha^* = \alpha^\dagger = 1$ and $E_- = 1$ (see [15]); thus
$$\log N(-E) \underset{E\to+\infty}{\sim} -2\sqrt{E}\,\log E.$$
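The passage between the two forms of (1.16) can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: it assumes the leading-order inversion $g(E) = (E/E_-)^{1/(2\alpha^\dagger)}$ of the relation $E_-(g) \sim E_-\,g^{2\alpha^\dagger}$, and all function names are hypothetical.

```python
import math


def alpha_dagger(nu: float) -> float:
    """alpha^dagger = 1 / (2 - nu^dagger), formula (1.15)."""
    return 1.0 / (2.0 - nu)


def g_of_E(E: float, nu: float, E_minus: float) -> float:
    """Leading-order inverse of E_-(g) ~ E_minus * g**(2 * alpha_dagger) (assumption)."""
    return (E / E_minus) ** (1.0 / (2.0 * alpha_dagger(nu)))


def leading_term_via_g(E: float, d: int, nu: float, E_minus: float) -> float:
    """First form in (1.16): -(1 + alpha^dagger d) g(E) log g(E)."""
    g = g_of_E(E, nu, E_minus)
    return -(1.0 + alpha_dagger(nu) * d) * g * math.log(g)


def leading_term_explicit(E: float, d: int, nu: float, E_minus: float) -> float:
    """Second form in (1.16), written directly in terms of E / E_minus."""
    r = E / E_minus
    return -(1.0 + (d - nu) / 2.0) * r ** (1.0 - nu / 2.0) * math.log(r)


if __name__ == "__main__":
    # Screened Coulomb example from the text: d = 3, nu = 1, E_minus = 1
    E = 1.0e6
    print(leading_term_via_g(E, 3, 1.0, 1.0), -2.0 * math.sqrt(E) * math.log(E))
```

With d = 3, $\nu^\dagger = 1$ and $E_- = 1$ both forms reduce to $-2\sqrt{E}\log E$, in agreement with the screened Coulomb example above.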
There is another physically interesting case that has not been discussed here: it is the case of point potentials. Such potentials will be studied in a sequel to this paper [13].

2. Periodic Approximations

2.1. A general approximation result. In this subsection, we will show how to adapt the periodic approximation scheme developed in [11] to the case of unbounded potentials. The main new technical problems will come from the fact that, in the present case, the negative part of the random potential grows much faster at infinity than in the case dealt with in [11] or when the single site potential is bounded. In the sequel C(x, l) will denote the cube of center x and side length l. For $y \in \mathbb{R}^d$ and f a function on $\mathbb{R}^d$, $f_y$ will denote the translate of f by y, i.e. $f_y(x) = f(x - y)$. Fix ω; the support of the Poisson measure m(ω, dy) is a set of discrete points that we denote by $(x_k(\omega))_{k\in\mathbb{N}}$. We then define the following periodic potential:
$$V_{\omega,n} = \sum_{\beta \in n\mathbb{Z}^d}\ \sum_{k;\ x_k(\omega) \in C(0,n)} V(x - \beta - x_k(\omega)) \tag{2.1}$$
and the corresponding periodic Schrödinger operator
$$H_{\omega,n} = -\Delta + V_{\omega,n}. \tag{2.2}$$
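To make the construction (2.1) concrete, here is a minimal one-dimensional Python sketch. It is an illustration under stated assumptions and not the construction used in the proofs: the dimension d = 1, the single site potential $V(x) = -e^{-|x|}$ (which satisfies H1), the truncation of the lattice sum to finitely many translates, and all function names are choices made here.

```python
import math
import random


def sample_poisson_points(mu: float, n: float, rng: random.Random) -> list:
    """Poisson points of intensity mu in C(0, n) = [-n/2, n/2) for d = 1.

    In one dimension the gaps between consecutive Poisson points are
    independent exponential variables with rate mu.
    """
    points = []
    position = -n / 2.0 + rng.expovariate(mu)
    while position < n / 2.0:
        points.append(position)
        position += rng.expovariate(mu)
    return points


def single_site(x: float) -> float:
    """Illustrative single site potential V(x) = -exp(-|x|) (an assumption)."""
    return -math.exp(-abs(x))


def periodic_potential(x: float, points: list, n: float, shells: int = 20) -> float:
    """Truncated version of (2.1): sum over beta = j*n with |j| <= shells."""
    total = 0.0
    for j in range(-shells, shells + 1):
        beta = j * n
        for xk in points:
            total += single_site(x - beta - xk)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    pts = sample_poisson_points(1.0, 5.0, rng)
    print(len(pts), periodic_potential(0.0, pts, 5.0))
```

Because V decays exponentially, the truncated sum is n-periodic up to an error that is exponentially small in `shells * n`; only the Poisson points falling in C(0, n) are used, exactly as in (2.1).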
By our regularity assumptions on V, ω-almost surely, $H_{\omega,n}$ is relatively form bounded with respect to $-\Delta$ (see [2]). It is essentially self-adjoint on $C_0^\infty(\mathbb{R}^d)$. We can define the integrated density of states $N_{\omega,n}(E)$ of this periodic operator by the same limit procedure as in the case of the random operator (see [22,14]). We will now compare N(E) and $\mathbb{E}(N_{\omega,n}(E))$. We prove
Lemma 2.1. For any α ∈ (0, 1), there exist C > 1 and ρ > 0 (depending only on d) such that, for any $\varphi \in C_0^\infty(\mathbb{R})$, for $k \in \mathbb{N}^*$ and $n \in \mathbb{N}^*$, we have
$$|\mathbb{E}((\varphi, dN_{\omega,n})) - (\varphi, dN)| \le Ce^{C\mu}\, n^{-(1-\alpha)k}\, e^{Ck\log k} \sup_{\substack{x \in \mathbb{R} \\ 0 \le j \le k+\rho}} \left|(|x| + C)^{\rho+k}\,\frac{d^j\varphi}{dx^j}(x)\right|. \tag{2.3}$$

Remark 2.1. The proof shows that the constant C obtained in Lemma 2.1 is independent of the concentration µ of the Poisson process.

Before starting the proof of Lemma 2.1, let us recall some basic facts about the density of states of a periodic Schrödinger operator. Let $\mathbb{T}^*_n = \mathbb{R}^d/(2\pi n\mathbb{Z}^d)$. For $\theta \in \mathbb{T}^*_n$, we can consider $H_{\omega,n,\theta}$, the unique self-adjoint operator defined by the quadratic form $\|\nabla\varphi\|^2 + \langle V_\omega\varphi, \varphi\rangle$ on $L^2_{\theta,loc}$ (i.e. the set of $L^2_{loc}$-functions that satisfy the boundary conditions $\varphi(x + n\gamma) = e^{in\theta\gamma}\varphi(x)$ for $\gamma \in \mathbb{Z}^d$ and $x \in \mathbb{R}^d$; this set is endowed with the usual scalar product on $L^2(C(0, n))$). We know that $H_{\omega,n,\theta}$ has a compact resolvent (see [23]); hence its spectrum is discrete. Let us denote its eigenvalues by
$$E_0(\theta, \omega, n) \le E_1(\theta, \omega, n) \le \dots \le E_n(\theta, \omega, n) \le \dots$$
The functions $(\theta \mapsto E_n(\theta, \omega, n))_{n\in\mathbb{N}}$ are Lipschitz continuous in θ and one has $E_n(\theta, \omega, n) \to +\infty$ as n → +∞ (uniformly in θ). One proves that the IDS of $H_{\omega,n}$ satisfies
$$N_{\omega,n}(E) = \sum_{n\in\mathbb{N}} \frac{1}{(2\pi)^d}\int_{\{\theta \in \mathbb{T}^*_n;\ E_n(\theta,\omega,n) \le E\}} d\theta \tag{2.4}$$
and
$$(\varphi, dN_{\omega,n}) = \frac{1}{\mathrm{Vol}(C(0, n))}\,\mathrm{tr}(\mathbf{1}_{C(0,n)}\,\varphi(H_{\omega,n})\,\mathbf{1}_{C(0,n)})$$
for any $\varphi \in C_0^\infty(\mathbb{R})$ (see [24,23] or [28]). The rest of this subsection will be devoted to the proof of Lemma 2.1. To prove this result we will need the formula given in Theorem 1.3. The proof of this formula will be given at the end of the section. Let us proceed with the proof of Lemma 2.1. Fix $\varphi \in C_0^\infty(\mathbb{R})$. We want to estimate $|\mathbb{E}((\varphi, dN_{\omega,n})) - (\varphi, dN)|$. The computation done in the proof of Theorem 5.1 in [11] gives
$$\mathbb{E}((\varphi, dN_{\omega,n})) = \mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}\,\varphi(H_{\omega,n})\,\mathbf{1}_{C(0,1)})).$$
Here we used the fact that the Poisson process is $\mathbb{Z}^d$-homogeneous. Notice that our regularity assumptions on $V_{\omega,n}$ are weaker than the ones used in [11] and [22]. Indeed, ω-almost surely, $V_{\omega,n}$ is only relatively form bounded with respect to $-\Delta$ with relative bound 0 (see Lemma 5.1). Nevertheless, the proofs of the relevant results in these papers extend easily to the case of relatively form bounded perturbations of $-\Delta$. Now we only have to estimate $|\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}(\varphi(H_{\omega,n}) - \varphi(H_\omega))))|$. This is done with an integral representation of ϕ(H) using an almost analytic extension of ϕ. Pick $\varphi \in \mathcal{S}(\mathbb{R})$ (the Schwartz space of rapidly decreasing functions). An almost analytic extension of ϕ is a function $\tilde\varphi : \mathbb{C} \to \mathbb{C}$ satisfying
1. For $z \in \mathbb{R}$, $\tilde\varphi(z) = \varphi(z)$.
2. $\mathrm{supp}(\tilde\varphi) \subset \{z \in \mathbb{C};\ |\mathrm{Im}(z)| < 1\}$.
3. $\tilde\varphi \in \mathcal{S}(\{z \in \mathbb{C};\ |\mathrm{Im}(z)| < 1\})$.
4. The family of functions $x \mapsto \frac{\partial\tilde\varphi}{\partial\bar z}(x + iy)\cdot|y|^{-n}$ (for 0 < |y| < 1) is bounded in $\mathcal{S}(\mathbb{R})$ for any $n \in \mathbb{N}$.
Such extensions always exist for $\varphi \in \mathcal{S}$ (see [19]) and one has the following estimates: there exists C > 0 such that, for n ≥ 0, α ≥ 0, β ≥ 0, one has
$$\sup_{0<|y|\le 1}\ \sup_{x\in\mathbb{R}} \left|x^\alpha \frac{\partial^\beta}{\partial x^\beta}\left(|y|^{-n}\cdot\frac{\partial\tilde\varphi}{\partial\bar z}(x + iy)\right)\right| \le C^{\,n\log n + \alpha\log\alpha + \beta + 1} \sup_{\substack{\beta' \le n+\beta+2 \\ \alpha' \le \alpha}}\ \sup_{x\in\mathbb{R}} \left|x^{\alpha'}\frac{\partial^{\beta'}\varphi}{\partial x^{\beta'}}(x)\right|. \tag{2.5}$$
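Let $\tilde\varphi$ be an almost analytic extension of $(i + x)^q\varphi(x)$ for some integer q > d/2. Then, by [5] and [8], we know that, for any n and $\omega \in \Omega$, the following formula holds:
$$\varphi(H_{\omega,n}) = \frac{i}{2\pi}\int_{\mathbb{C}} \frac{\partial\tilde\varphi}{\partial\bar z}(z)\cdot(i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1}\,dz\wedge d\bar z. \tag{2.6}$$
For q > d/2, $\mathbf{1}_{C(0,1)}(i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1}$ is trace-class and we have
$$\mathrm{tr}\big(\mathbf{1}_{C(0,1)}\,\varphi(H_{\omega,n})\,\mathbf{1}_{C(0,1)}\big) = \frac{i}{2\pi}\int_{\mathbb{C}} \frac{\partial\tilde\varphi}{\partial\bar z}(z)\cdot\mathrm{tr}\big(\mathbf{1}_{C(0,1)}(i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1}\mathbf{1}_{C(0,1)}\big)\,dz\wedge d\bar z. \tag{2.7}$$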
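By Lemma 5.1 and Sect. B.12 in [26], we know that ω-almost surely $H_\omega$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^d)$ and that $\mathbf{1}_{C(0,1)}(i + H_\omega)^{-q}(z - H_\omega)^{-1}$ is trace-class. Hence, (2.6) and (2.7) also hold for $H_\omega$. We are now going to use Lemma 5.1. We pick $p' \in (p(d), p)$ and b = 1, and compute
$$\begin{aligned} |\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}(\varphi(H_{\omega,n}) - \varphi(H_\omega))))| &\le \frac{1}{2\pi}\int_{\mathbb{C}} \left|\frac{\partial\tilde\varphi}{\partial\bar z}(z)\right| \mathbb{E}\Big(\big\|\mathbf{1}_{C(0,1)}\big((i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1} - (i + H_\omega)^{-q}(z - H_\omega)^{-1}\big)\mathbf{1}_{C(0,1)}\big\|_{tr}\Big)\,dx\,dy \\ &\le \sum_{k\ge 1} \frac{1}{2\pi}\int_{\mathbb{C}} \left|\frac{\partial\tilde\varphi}{\partial\bar z}(z)\right| \mathbb{E}\Big(\mathbf{1}_{\{\omega;\ V_\omega \in \Omega_k(\alpha,1,p')\setminus\Omega_{k-1}(\alpha,1,p')\}}\,K(z, \omega)\Big)\,dx\,dy, \end{aligned} \tag{2.8}$$
where
$$K(z, \omega) = \big\|\mathbf{1}_{C(0,1)}\big((i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1} - (i + H_\omega)^{-q}(z - H_\omega)^{-1}\big)\mathbf{1}_{C(0,1)}\big\|_{tr}.$$
Here $\|\cdot\|_{tr}$ denotes the trace-class norm. We need to estimate
$$\mathbf{1}_{C(0,1)}\big((i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1} - (i + H_\omega)^{-q}(z - H_\omega)^{-1}\big)\mathbf{1}_{C(0,1)}$$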
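under the assumption $V_\omega \in \Omega_k(\alpha, 1, p')$. Therefore, we imitate the method used in [11]. We write
$$\big\|\mathbf{1}_{C(0,1)}\big((i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1} - (i + H_\omega)^{-q}(z - H_\omega)^{-1}\big)\mathbf{1}_{C(0,1)}\big\|_{tr} \le A + B, \tag{2.9}$$
where
$$\begin{aligned} A &= \big\|\mathbf{1}_{C(0,1)}\big((z - H_{\omega,n})^{-1} - (z - H_\omega)^{-1}\big)(i + H_\omega)^{-q}\mathbf{1}_{C(0,1)}\big\|_{tr} \\ &= \big\|\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}\big(V_{\omega,n} - V_\omega\big)(z - H_\omega)^{-1}(i + H_\omega)^{-q}\mathbf{1}_{C(0,1)}\big\|_{tr} \end{aligned}$$
and
$$\begin{aligned} B &= \big\|\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}\big((i + H_{\omega,n})^{-q} - (i + H_\omega)^{-q}\big)\mathbf{1}_{C(0,1)}\big\|_{tr} \\ &= \Big\|\sum_{l=1}^{q-1}\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}(i + H_{\omega,n})^{l-q}\big(V_{\omega,n} - V_\omega\big)(i + H_\omega)^{-l}\mathbf{1}_{C(0,1)}\Big\|_{tr}. \end{aligned}$$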
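The estimates for A and B being obtained essentially in the same way, we will write the details for A only. Pick $\chi \in C_0^\infty(\mathbb{R}^d)$ such that 0 ≤ χ ≤ 1, χ ≡ 1 on C(0, 1/2) and χ ≡ 0 outside of C(0, 3/2), such that $\sum_{\gamma\in\mathbb{Z}^d}\chi_\gamma^4 \equiv 1$. Then, we have
$$A \le \sum_{\gamma'\in\mathbb{Z}^d,\ \beta\in\mathbb{Z}^d} \big\|\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}\chi_{\gamma'}\big\|_{\mathcal{L}(H^{-1},L^2)} \cdot \big\|\chi_{\gamma'}(V_{\omega,n} - V_\omega)\chi_{\gamma'}\big\|_{\mathcal{L}(H^1,H^{-1})} \cdot \big\|\chi_{\gamma'}(z - H_\omega)^{-1}\chi_\beta\big\|_{\mathcal{L}(L^2,H^1)} \cdot \big\|\chi_\beta^3(i + H_\omega)^{-q}\mathbf{1}_{C(0,1)}\big\|_{tr}.$$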
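Here $\chi_\beta(\cdot) = \chi(\cdot - \beta)$. By Lemma 5.4 applied to $H_\omega$ and to $H_{\omega,n}$, for $V_\omega \in \Omega_k(\alpha, 1, p')$, we know that, for some ε > 0, ρ ≥ 1 and C > 0, for all $\gamma' \in \mathbb{Z}^d$ and $\beta \in \mathbb{Z}^d$, we have
$$\big\|\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}\chi_{\gamma'}\big\|_{\mathcal{L}(H^{-1},L^2)} \le C\,\frac{e^{-\varepsilon\cdot\eta(z,K)|\gamma'|^{1-\alpha}}}{\eta(z, K)^\rho},$$
$$\big\|\chi_{\gamma'}(z - H_\omega)^{-1}\chi_\beta\big\|_{\mathcal{L}(L^2,H^1)} \le \frac{C}{\eta(z, K)^\rho}\,e^{-\varepsilon\cdot\eta(z,K)\,\big||\gamma'|^{1-\alpha} - |\beta|^{1-\alpha}\big|},$$
$$\big\|\chi_\beta^3(i + H_\omega)^{-q}\mathbf{1}_{C(0,1)}\big\|_{tr} \le C\,\frac{e^{-\varepsilon\cdot\eta(z,K)|\beta|^{1-\alpha}}}{\eta(z, K)^\rho},$$
where $K = Ck^{\frac{p}{p-p'}}$, $\eta(z, K) = \eta(z, K, 1) = \frac{|\mathrm{Im}\,z|}{|z| + K + C}$ and ρ depends only on d and q. By Lemma 5.3 and the growth estimate known for $V_{\omega,n}$ and $V_\omega$ (when $V_\omega \in \Omega_k(\alpha, 1, p')$), we know that, for $\gamma' \in \mathbb{Z}^d$, we have
$$\big\|\chi_{\gamma'}(V_{\omega,n} - V_\omega)\chi_{\gamma'}\big\|_{\mathcal{L}(H^1,H^{-1})} \le C(1 + |\gamma'|)^\alpha.$$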
On the other hand, due to the exponential decay of $V_1$ and the compact support of $V_2$, there exists C > 0 such that, for $|\gamma'| \le n/2$, we have
$$\big\|\chi_{\gamma'}(V_{\omega,n} - V_\omega)\chi_{\gamma'}\big\|_{\mathcal{L}(H^1,H^{-1})} \le Ce^{-n/C}.$$
Hence, if we multiply these estimates and sum the result over β and γ′, we get that
$$A \le C\,\frac{n^{(d+3)\alpha}}{\eta(z, K)^\rho}\,e^{-\varepsilon\cdot\eta(z,K)\,n^{1-\alpha}}. \tag{2.10}$$
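For B, we get an estimate analogous to (2.10); only the constants change. Plugging this into (2.9) and (2.8), summing over k using the estimate (5.1) for the probability of ${}^c\Omega_k(\alpha, 1, p')$, we get
$$|\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}(\varphi(H_{\omega,n}) - \varphi(H_\omega))))| \le Cn^{(d+3)\alpha}\,\frac{1}{2\pi}\int_{\mathbb{C}} \left|\frac{\partial\tilde\varphi}{\partial\bar z}(z)\right| S(z, n)\,dx\,dy, \tag{2.11}$$
where
$$S(z, n) := \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\left(\frac{|z| + Ck^{\frac{p}{p-p'}} + C}{|\mathrm{Im}\,z|}\right)^{\rho} \exp\left(-\frac{|\mathrm{Im}\,z|\,n^{1-\alpha}}{|z| + Ck^{\frac{p}{p-p'}} + C}\right).$$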
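As $\mathrm{supp}\,\tilde\varphi \subset \{z \in \mathbb{C};\ |\mathrm{Im}(z)| < 1\}$, using the notation z = x + iy, for $l \in \mathbb{N}^*$, we estimate S(z, n) for |y| < 1 by
$$\begin{aligned} S(z, n) &\le \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\left(\frac{(|x| + C)\,k^{\frac{p}{p-p'}}}{|y|}\right)^{\rho} \exp\left(-\frac{|y|\,n^{1-\alpha}}{(|x| + C)\,k^{\frac{p}{p-p'}}}\right) \\ &\le n^{-l(1-\alpha)}\left(\frac{|x| + C}{|y|}\right)^{\rho+l} \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\,k^{\rho\frac{p}{p-p'}}\left(\frac{|y|\,n^{1-\alpha}}{|x| + C}\right)^{l} \exp\left(-\frac{|y|\,n^{1-\alpha}}{|x| + C}\,k^{-\frac{p}{p-p'}}\right) \\ &= n^{-l(1-\alpha)}\left(\frac{|x| + C}{|y|}\right)^{\rho+l} f_l(t), \end{aligned} \tag{2.12}$$
where $t = \frac{|y|\,n^{1-\alpha}}{|x| + C}$ and
$$\begin{aligned} f_l(t) := \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\,k^{\rho\frac{p}{p-p'}}\,t^l\,e^{-t\,k^{-\frac{p}{p-p'}}} &\le \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\,k^{\rho\frac{p}{p-p'}}\left(l\,k^{\frac{p}{p-p'}}\right)^{l} e^{-l} \\ &\le l^l e^{-l}\sum_{k\ge 1}\frac{(C\mu)^k}{k!}\,k^{(\rho+l)\frac{p}{p-p'}} \le l^l e^{-l}\,L!\,\big(e^{C\mu e} - e^{C\mu}\big), \end{aligned} \tag{2.13}$$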
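where L denotes the smallest integer larger than $(\rho + l)\frac{p}{p - p'}$. Here we used Stirling’s formula and the identity
$$\sum_{l\ge 1}\frac{1}{l!}\sum_{k\ge 1}\frac{(C\mu)^k}{k!}\,k^l = \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\sum_{l\ge 1}\frac{k^l}{l!} = \sum_{k\ge 1}\frac{(C\mu)^k}{k!}\,(e^k - 1) = e^{C\mu e} - e^{C\mu}.$$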
Hence, for some C > 0 (independent of l, n, z and µ), we have
$$S(z, n) \le Ce^{C\mu}\,n^{-(1-\alpha)l}\,e^{Cl\log l}\left(\frac{|x| + C}{|y|}\right)^{\rho+l}. \tag{2.14}$$
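The exponential identity used for (2.13), and the moment bound $\sum_{k\ge1}\frac{a^k}{k!}k^L \le L!\,(e^{ae} - e^a)$ that follows from it for integer L ≥ 1 (since $k^L/L! \le e^k - 1$), can be checked numerically. The Python sketch below is only a sanity check; the helper name `poisson_weighted_sum`, the truncation `k_max` and the sample value a = 1.5 standing in for Cµ are assumptions.

```python
import math


def poisson_weighted_sum(a: float, term, k_max: int = 200) -> float:
    """Truncated sum over k = 1..k_max of a^k / k! * term(k).

    The weight a^k / k! is computed through logarithms to avoid overflow.
    """
    total = 0.0
    for k in range(1, k_max + 1):
        log_weight = k * math.log(a) - math.lgamma(k + 1)
        total += math.exp(log_weight) * term(k)
    return total


def identity_rhs(a: float) -> float:
    """Right-hand side e^{a e} - e^{a} of the identity."""
    return math.exp(a * math.e) - math.exp(a)


if __name__ == "__main__":
    a = 1.5  # plays the role of C * mu
    print(poisson_weighted_sum(a, lambda k: math.exp(k) - 1.0), identity_rhs(a))
```

The same helper evaluates the moment sums $\sum_k \frac{a^k}{k!}k^L$, so the factor $L!$ appearing in (2.13) can be compared directly with the numerical value.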
Plugging this into (2.11) and using estimate (2.5) for almost analytic extensions, we get (2.3) and end the proof of Lemma 2.1. □

Remark 2.2. In the proof of Lemma 2.1, we have only used the fact that the space of realizations can be written as $\Omega = \cup_n\Omega_n$, where the probability $P({}^c\Omega_n)$ decreases fast enough, and that in these subsets we have uniform estimates on the quantity we want to compute. Obviously, to get such a decomposition, one does not need to have a Poisson potential but only a homogeneous random field with suitable bounds at infinity. This idea is applicable to many other random Schrödinger operators.

Proof of Theorem 1.3. By [21], we know that, for $\phi \in C_0^\infty(\mathbb{R})$ and for almost every ω, we have
$$\begin{aligned} \langle\phi, dN\rangle &= \lim_{\Lambda\to\mathbb{R}^d}\frac{1}{\mathrm{Vol}(\Lambda)}\,\mathrm{tr}(\phi(H^D_{\omega,\Lambda})) \\ &= \lim_{\Lambda\to\mathbb{R}^d}\frac{1}{\mathrm{Vol}(\Lambda)}\,\mathbb{E}(\mathrm{tr}(\phi(H^D_{\omega,\Lambda}))) \\ &= \lim_{\Lambda\to\mathbb{R}^d}\frac{1}{\mathrm{Vol}(\Lambda)}\sum_{\gamma\in\Lambda\cap\mathbb{Z}^d}\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H^D_{\omega,\Lambda})\,\mathbf{1}_{C(\gamma,1)})), \end{aligned} \tag{2.15}$$
where $H^D_{\omega,\Lambda}$ is defined in Sect. 1.1. On the other hand, as $H_\omega$ is homogeneous, for any $\gamma \in \mathbb{Z}^d$, we have
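$$\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H_\omega)\,\mathbf{1}_{C(\gamma,1)})) = \mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}\,\phi(H_\omega)\,\mathbf{1}_{C(0,1)})).$$
So that
$$\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}\,\phi(H_\omega)\,\mathbf{1}_{C(0,1)})) = \frac{1}{\mathrm{Vol}(\Lambda)}\sum_{\gamma\in\Lambda\cap\mathbb{Z}^d}\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H_\omega)\,\mathbf{1}_{C(\gamma,1)})).$$
Hence, by (2.15), to get Eq. (1.4), we just have to prove that
$$\lim_{\Lambda\to\mathbb{R}^d}\frac{1}{\mathrm{Vol}(\Lambda)}\sum_{\gamma\in\Lambda\cap\mathbb{Z}^d}\mathbb{E}\Big(\mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H^D_{\omega,\Lambda})\,\mathbf{1}_{C(\gamma,1)}) - \mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H_\omega)\,\mathbf{1}_{C(\gamma,1)})\Big) = 0. \tag{2.16}$$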
To shorten the notation, let $\Lambda_L = C(0, L)$ be the cube of center 0 and side length L. Pick ε > 0. To prove (2.16), one could try to prove that, for any $\gamma \in \mathbb{Z}^d$, one has
$$\lim_{L\to\infty} A(\gamma, \Lambda_L) := \lim_{\Lambda_L\to\mathbb{R}^d}\mathbb{E}\Big(\mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H^D_{\omega,\Lambda_L})\,\mathbf{1}_{C(\gamma,1)}) - \mathrm{tr}(\mathbf{1}_{C(\gamma,1)}\,\phi(H_\omega)\,\mathbf{1}_{C(\gamma,1)})\Big) = 0 \tag{2.17}$$
in some uniform way. However, this may be difficult as, because of the Dirichlet boundary conditions, $A(\gamma, \Lambda_L)$ may have some non-uniform behavior for γ close to the boundary of $\Lambda_L$. So we are going to split the difficulty into two parts. On the one hand, we will show that, for any ε > 0 and for $\gamma \in \Lambda_L \setminus \Lambda_{(1-\varepsilon)L}$, $A(\gamma, \Lambda_L)$ stays bounded (uniformly in L, ω). As there are only very few such terms, this part of the sum tends to 0. On the other hand, for $\gamma \in \Lambda_{(1-\varepsilon)L}$, we will show that $A(\gamma, \Lambda_L)$ tends to 0 uniformly in L and ω. As in the proof of Lemma 2.1, uniformity in ω cannot be achieved over the whole set of realizations, but only over subsets whose measure we control (see Lemma 5.1). This will suffice. To estimate $A(\gamma, \Lambda_L)$, we will use (2.6) to compute $\phi(H_\omega)$ and $\phi(H^D_{\omega,\Lambda_L})$. Hence, we see that we only need to estimate the following expression (cf. (2.9)):
$$\mathrm{tr}\big(\mathbf{1}_{C(\gamma,1)}(z - H^D_{\omega,\Lambda_L})^{-1}(i + H^D_{\omega,\Lambda_L})^{-q}\mathbf{1}_{C(\gamma,1)}\big) - \mathrm{tr}\big(\mathbf{1}_{C(\gamma,1)}(z - H_\omega)^{-1}(i + H_\omega)^{-q}\mathbf{1}_{C(\gamma,1)}\big),$$
where q > d/2 is an even integer. To do this we need a way to compare the resolvent of the Dirichlet problem on $\Lambda_L$ with the resolvent of $H_\omega$ over the whole space. We use the following resolvent identity (see e.g. [1]): let $\chi \in C_0^2(\mathring{\Lambda}_L)$; then, for $\mathrm{Im}\,z \ne 0$, we have
$$\chi(z - H_\omega)^{-1} = (z - H^D_{\omega,\Lambda_L})^{-1}\chi + (z - H^D_{\omega,\Lambda_L})^{-1}[-\Delta, \chi](z - H_\omega)^{-1}. \tag{2.18}$$
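Let $\alpha$, $k$ and $p'$ be as in Lemma 2.1. We will use the following lemma.

Lemma 2.2. Assume that $q > d$ is an even integer and that $\omega$ is such that $V_\omega \in \Omega_k(\alpha, 1, p')$. Then, there exists $C_q > 0$ such that, for any $\Lambda' \subset \Lambda_L$ ($\Lambda'$ measurable) and $\lambda \ge 1$, we have
\[
\|(i + H^D_{\omega,\Lambda_L})^{-1}\|_{T_q} \le C_q\, k^{\frac{pd}{2q(p-p')}} L^{\frac{d}{q}(1+\frac{\alpha}{2})},
\tag{2.19}
\]
\[
\|(\lambda - \Delta^D_{\Lambda_L})^{-1/2}\mathbf{1}_{\Lambda'}\|_{T_q} \le C_q |\Lambda'|^{1/q}.
\tag{2.20}
\]
Here, $p$ and $p(d)$ are defined in H1, $p'$ satisfies $p(d) < p' < p$, $|\Lambda'|$ denotes the measure of $\Lambda'$ and $\|\cdot\|_{T_q} = (\operatorname{tr}|\cdot|^q)^{1/q}$.

Let us postpone the proof of this result to finish the proof of Theorem 1.3. Pick $\varepsilon > 0$ to be chosen precisely later. Pick $\omega$ such that $V_\omega \in \Omega_k(\alpha, 1, p')$. Hence, by Lemma 5.1, we can decompose $V_\omega = V_\omega^{(i)} + V_\omega^{(r)}$, where
• $V_\omega^{(i)} \in L^{p'}_{\mathrm{loc,unif}}(\Lambda_L)$ and $\sup_{x\in\Lambda_L} \|V_\omega^{(i)}\|_{L^{p'}(C(x,1))} \le 1$,
• for $x \in \Lambda_L$, $|V_\omega^{(r)}(x)| \le C k^{\frac{p}{p-p'}} L^\alpha$.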
F. Klopp, L. Pastur p
For 0
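By Lemma 5.3 and the estimates given on $V_\omega^{(r)}$ and $V_\omega^{(i)}$ above, we get that, for $L$ large enough,
\[
\|(\lambda - \Delta^D_{\Lambda_L})^{-1/2}\, V_{\omega|\Lambda_L}\, (\lambda - \Delta^D_{\Lambda_L})^{-1/2}\|_{\mathcal{L}(L^2)} < \frac12 .
\]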
Hence, combining the two estimates of Lemma 2.2, we get 1
(2+d
d
D )−1 13L \3(1−2)L kTq ≤ C q k 2(p−p0 ) Lα L q k(i + Hω,3 L
(1+ α2 )+ dq
.
(2.21)
So, by formula (2.6), we get that, for ω such that Vω ∈ k (α, 1, p0 ), we have (2+d
1
α+ dq + d(q−1) (1+ α2 ) q
D )13L \3(1−2)L )| ≤ Ck 2(p−p0 ) q L |tr(φ(Hω,3 L
.
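Hence, taking the expectation in $\omega$ and summing over all $k$, we get that
\[
\mathbb{E}\bigl[\operatorname{tr}(\varphi(H^D_{\omega,\Lambda_L})\mathbf{1}_{\Lambda_L\setminus\Lambda_{(1-2\varepsilon)L}})\bigr] \le C \varepsilon^{1/q} L^{\alpha(1+\frac{d(q-1)}{2q})} L^d .
\]
Finally, we obtain
\[
\frac{1}{\mathrm{Vol}(\Lambda_L)} \sum_{\substack{\gamma\in\Lambda_L\setminus\Lambda_{(1-2\varepsilon)L} \\ \gamma\in\mathbb{Z}^d}} \mathbb{E}\bigl[\operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H^D_{\omega,\Lambda_L})\mathbf{1}_{C(\gamma,1)})\bigr]
= \frac{1}{\mathrm{Vol}(\Lambda_L)} \mathbb{E}\bigl[\operatorname{tr}(\varphi(H^D_{\omega,\Lambda_L})\mathbf{1}_{\Lambda_L\setminus\Lambda_{(1-2\varepsilon)L}})\bigr]
\le C \varepsilon^{1/q} L^{\alpha(1+\frac{d(q-1)}{2q})} .
\tag{2.22}
\]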
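On the other hand, let $\chi$ be a $C_0^\infty$-function supported in $\Lambda_L$ that is constant equal to 1 in $\Lambda_{(1-\varepsilon)L}$. For $\gamma \in \Lambda_{(1-2\varepsilon)L}$, we use the resolvent formula (2.18) to write
\[
\begin{aligned}
(i + H^D_{\omega,\Lambda_L})^{-q}\mathbf{1}_{C(\gamma,1)} &= (i + H^D_{\omega,\Lambda_L})^{-q}\chi\mathbf{1}_{C(\gamma,1)} \\
&= (i + H^D_{\omega,\Lambda_L})^{-(q-1)}\chi(i + H_\omega)^{-1}\mathbf{1}_{C(\gamma,1)} - (i + H^D_{\omega,\Lambda_L})^{-q}[\Delta, \chi](i + H_\omega)^{-1}\mathbf{1}_{C(\gamma,1)} \\
&= \chi(i + H_\omega)^{-q}\mathbf{1}_{C(\gamma,1)} - \sum_{j=1}^{q} (i + H^D_{\omega,\Lambda_L})^{-j}[\Delta, \chi](i + H_\omega)^{-q+j-1}\mathbf{1}_{C(\gamma,1)} .
\end{aligned}
\]
Hence
\[
(i + H^D_{\omega,\Lambda_L})^{-q}\mathbf{1}_{C(\gamma,1)} = (i + H_\omega)^{-q}\mathbf{1}_{C(\gamma,1)} + R(L),
\tag{2.23}
\]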
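where
\[
R(L) = -(1 - \chi)(i + H_\omega)^{-q}\mathbf{1}_{C(\gamma,1)} - \sum_{j=1}^{q} (i + H^D_{\omega,\Lambda_L})^{-j}[\Delta, \chi](i + H_\omega)^{-q+j-1}\mathbf{1}_{C(\gamma,1)} .
\]
By the support properties of $1 - \chi$ and $[\Delta, \chi]$, by Lemma 5.4 and by the assumption on $\omega$, we have that, for some $C > 0$, if $\varepsilon L \ge 1$, then
\[
\|R(L)\|_{T_1} \le C k^{\frac{p}{p-p'}} e^{-(\varepsilon L)^{1-\alpha}/(C k^{p/(p-p')})} .
\]
Plugging this into (2.6), we get that, for some $C > 0$,
\[
|\operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H^D_{\omega,\Lambda_L})\mathbf{1}_{C(\gamma,1)}) - \operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H_\omega)\mathbf{1}_{C(\gamma,1)})| \le C k^{\frac{p}{p-p'}} e^{-(\varepsilon L)^{1-\alpha}/(C k^{p/(p-p')})} .
\tag{2.24}
\]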
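Taking the expectation of (2.24), summing in $k$ and using (5.1) in the same way as in (2.14), we get that
\[
\mathbb{E}\,|\operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H^D_{\omega,\Lambda_L})\mathbf{1}_{C(\gamma,1)}) - \operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H_\omega)\mathbf{1}_{C(\gamma,1)})| \le C(\varepsilon L)^{-(1-\alpha)} .
\tag{2.25}
\]
Then, by (2.22) and (2.25), we have that
\[
\frac{1}{\mathrm{Vol}(\Lambda_L)} \sum_{\gamma\in\Lambda_L\cap\mathbb{Z}^d} \mathbb{E}\operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H^D_{\omega,\Lambda_L})\mathbf{1}_{C(\gamma,1)})
= \frac{1}{\mathrm{Vol}(\Lambda_L)} \sum_{\gamma\in\Lambda_L\cap\mathbb{Z}^d} \mathbb{E}\operatorname{tr}(\mathbf{1}_{C(\gamma,1)}\varphi(H_\omega)\mathbf{1}_{C(\gamma,1)}) + Q(L),
\]
where
\[
|Q(L)| \le C(\varepsilon L)^{-(1-\alpha)} + C\varepsilon^{1/q} L^{\alpha(1+\frac{d(q-1)}{2q})} + CL^{-1} .
\]
If we choose $\beta = \alpha(q + d(q-1)/2)$, set $\varepsilon = L^{-\beta}$ and pick $0 < \alpha < 1/3$ small enough so that $1 - \beta > 1/2$, we get that $Q(L) \to 0$ as $L \to +\infty$. This completes the proof of Theorem 1.3. $\square$

Remark 2.3. Remark 2.2 also applies to the proof of Theorem 1.3.

Proof of Lemma 2.2. Under our assumptions on $\omega$, we have $V_\omega \in \Omega_k(\alpha, 1, p')$. Hence, by Lemma 5.1, we can decompose $V_\omega = V_\omega^{(i)} + V_\omega^{(r)}$, where
• $V_\omega^{(i)} \in L^{p'}_{\mathrm{loc,unif}}(\Lambda_L)$ and $\sup_{x\in\Lambda_L} \|V_\omega^{(i)}\|_{L^{p'}(C(x,1))} \le 1$,
• for $x \in \Lambda_L$, $|V_\omega^{(r)}(x)| \le C k^{\frac{p}{p-p'}} L^\alpha$.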
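For $\phi \in C_0^\infty(\Lambda_L)$, this yields
\[
\begin{aligned}
\langle H^D_{\omega,\Lambda_L}\phi, \phi\rangle &= \langle -\Delta^D_{\Lambda_L}\phi, \phi\rangle + \langle V_\omega^{(i)}\phi, \phi\rangle + \langle V_\omega^{(r)}\phi, \phi\rangle \\
&\ge \frac12 \langle -\Delta^D_{\Lambda_L}\phi, \phi\rangle - C\bigl(k^{\frac{p}{p-p'}} L^\alpha + 1\bigr)\|\phi\|^2 .
\end{aligned}
\tag{2.26}
\]
Equation (2.26) and the variational principle for eigenvalues immediately imply that
\[
\lambda_j(H^D_{\omega,\Lambda_L}) \ge \frac12 \lambda_j(-\Delta^D_{\Lambda_L}) - C\bigl(k^{\frac{p}{p-p'}} L^\alpha + 1\bigr),
\]
where $\lambda_j(H)$ denotes the $j$th eigenvalue of $H$ (ordered increasingly counting multiplicity). Hence
\[
\begin{aligned}
\|(i + H^D_{\omega,\Lambda_L})^{-1}\|^q_{T_q} &= \sum_{j\in\mathbb{N}} \frac{1}{\bigl(1 + [\lambda_j(H^D_{\omega,\Lambda_L})]^2\bigr)^{q/2}} \\
&\le \sum_{\substack{j\in\mathbb{N} \\ \lambda_j(-\Delta^D_{\Lambda_L}) < 4C(k^{p/(p-p')}L^\alpha+1)}} 1
+ \sum_{\substack{j\in\mathbb{N} \\ \lambda_j(-\Delta^D_{\Lambda_L}) \ge 4C(k^{p/(p-p')}L^\alpha+1)}} \frac{1}{\bigl(1 + [\lambda_j(-\Delta^D_{\Lambda_L}) - C(k^{\frac{p}{p-p'}}L^\alpha + 1)]^2\bigr)^{q/2}} .
\end{aligned}
\tag{2.27}
\]
Now the eigenvalues of $-\Delta^D_{\Lambda_L}$ are known explicitly: for $j = (j_1, \dots, j_d) \in (\mathbb{N}\setminus\{0\})^d$, one has
\[
\lambda_j(-\Delta^D_{\Lambda_L}) = \frac{\pi^2}{L^2} \sum_{1\le p\le d} j_p^2 .
\]
Hence, Eq. (2.27) tells us that
\[
\begin{aligned}
\|(i + H^D_{\omega,\Lambda_L})^{-1}\|^q_{T_q} &\le \bigl[4CL^2(k^{\frac{p}{p-p'}}L^\alpha + 1)\bigr]^{d/2} + \sum_{|j| \ge 2\sqrt{C}L(k^{p/(p-p')}L^\alpha+1)^{1/2}} \frac{L^{2q}}{\bigl(L^4 + [|j|^2 - CL^2(k^{\frac{p}{p-p'}}L^\alpha + 1)]^2\bigr)^{q/2}} \\
&\le \bigl[4CL^2(k^{\frac{p}{p-p'}}L^\alpha + 1)\bigr]^{d/2} + \sum_{|j| \ge 2\sqrt{C}L(k^{p/(p-p')}L^\alpha+1)^{1/2}} \frac{L^{2q}}{\bigl(L^4 + |j|^4/2\bigr)^{q/2}} \\
&\le \bigl[4CL^2(k^{\frac{p}{p-p'}}L^\alpha + 1)\bigr]^{d/2} + C_q L^d \\
&\le C_q\, k^{\frac{pd}{2(p-p')}} L^{d(1+\frac{\alpha}{2})} .
\end{aligned}
\]
This completes the proof of (2.19). The proof of (2.20) will follow from an explicit computation. Pick $\lambda \ge 1$. As a corollary of the Golden–Thompson inequalities (see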
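[25]), we have
\[
\begin{aligned}
\|(\lambda - \Delta^D_{\Lambda_L})^{-1/2}\mathbf{1}_{\Lambda'}\|^q_{T_q} &\le \|\mathbf{1}_{\Lambda'}(\lambda - \Delta^D_{\Lambda_L})^{-q/4}\|^2_{T_2} \\
&= \sum_{j\in(\mathbb{N}\setminus\{0\})^d} \langle (\lambda - \Delta^D_{\Lambda_L})^{-q/4}\mathbf{1}_{\Lambda'}(\lambda - \Delta^D_{\Lambda_L})^{-q/4}\phi_j, \phi_j\rangle \\
&= \sum_{j\in(\mathbb{N}\setminus\{0\})^d} \frac{1}{(\lambda + \lambda_j)^{q/2}} \langle \mathbf{1}_{\Lambda'}\phi_j, \phi_j\rangle,
\end{aligned}
\tag{2.28}
\]
where, for the sake of simplicity, we denoted the eigenvalues of $-\Delta^D_{\Lambda_L}$ by $(\lambda_j)_j$ and the associated eigenvectors by $(\phi_j)_j$. The vector $\phi_j$ being explicitly known, one easily computes
\[
\langle \mathbf{1}_{\Lambda'}\phi_j, \phi_j\rangle \le \Bigl(\frac{2}{L}\Bigr)^d |\Lambda'| .
\]
Plugging this into Eq. (2.28), as $\lambda \ge 1$, we get
\[
\|(\lambda - \Delta^D_{\Lambda_L})^{-1/2}\mathbf{1}_{\Lambda'}\|^q_{T_q} \le \Bigl(\frac{2}{L}\Bigr)^d \sum_{j\in(\mathbb{N}\setminus\{0\})^d} \frac{|\Lambda'|}{(1 + \lambda_j)^{q/2}} \le C_q |\Lambda'|,
\]
using the explicit formula given above for $\lambda_j$. $\square$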
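As a quick numerical sanity check of this last step, the following sketch evaluates the sum $\sum_j (1+\lambda_j)^{-q/2}$ in the simplest case $d = 1$, $q = 2$, $\lambda = 1$ (a choice made here purely for illustration; it satisfies $q > d$). In that case the full sum has the closed form $(L\coth L - 1)/2$, so the prefactor $(2/L)^d \sum_j (1+\lambda_j)^{-q/2}$ multiplying $|\Lambda'|$ stays below 1 for every $L$. The function names and the truncation parameter `j_max` are hypothetical.

```python
import math

def dirichlet_sum(L, q=2, j_max=200_000):
    """Truncated sum over j >= 1 of (1 + lambda_j)^(-q/2) in dimension d = 1,
    where lambda_j = (pi * j / L)^2 are the Dirichlet eigenvalues on a
    segment of length L."""
    return sum((1.0 + (math.pi * j / L) ** 2) ** (-q / 2.0)
               for j in range(1, j_max + 1))

def closed_form(L):
    """Exact value of the full (untruncated) sum for q = 2: (L*coth(L) - 1)/2."""
    return (L / math.tanh(L) - 1.0) / 2.0

def prefactor(L, q=2, j_max=200_000):
    """The factor (2/L)^d * sum_j (1 + lambda_j)^(-q/2) in front of |Lambda'|."""
    return (2.0 / L) * dirichlet_sum(L, q, j_max)

if __name__ == "__main__":
    for L in (5.0, 50.0, 500.0):
        print(L, dirichlet_sum(L), closed_form(L), prefactor(L))
```

The prefactor is bounded uniformly in $L$, which is the content of the constant $C_q$ in (2.20) in this toy case.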
2.2. Application to the estimation of the integrated density of states. We prove

Lemma 2.3. Pick $\nu > 0$. There exist $\beta > 0$ and $E_\nu > 0$ such that, for $E > E_\nu$ and for $n \ge E^\beta$, one has
\[
\mathbb{E}(N_{\omega,n}(-E - 1)) - e^{-E^\nu} \le N(-E) \le \mathbb{E}(N_{\omega,n}(-E + 1)) + e^{-E^\nu} .
\tag{2.29}
\]
Proof of Lemma 2.3. Pick $\nu > 0$ arbitrary. By Eq. (5.13), the a priori estimate on $N$ given in Lemma 5.2, we know that, for some $\tau > 0$,
\[
N(-E^\tau) \le \frac14 e^{-E^\nu} .
\tag{2.30}
\]
Hence we just have to estimate $N(-E) - N(-E^\tau)$. To this end, introduce two functions $\phi_\pm$ defined by
\[
\phi_\pm = \mathbf{1}_{[-E^\tau \mp \frac12,\, -E \pm \frac12]} * \phi_0,
\]
where $\phi_0 \in C_0^\infty(\mathbb{R})$ is a non-negative Gevrey class function of Gevrey exponent $\rho > 1$ such that $\phi_0 \equiv 1$ on $[-\frac14, \frac14]$ and $\operatorname{supp}\phi_0 \subset [-\frac12, \frac12]$. The functions $\phi_\pm$ are then Gevrey class of Gevrey exponent $\rho$ and $\operatorname{supp}\phi_\pm \subset [-E^\tau - 1, -E + 1]$ (see e.g. [6]). Then one has
\[
\langle \phi_-, dN\rangle \le N(-E) - N(-E^\tau) \le \langle \phi_+, dN\rangle .
\tag{2.31}
\]
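Using the Gevrey estimates for the derivatives and the estimates on the support of $\phi_\pm$, by Lemma 2.1, we have that, for some $C > 0$, and for all $n \in \mathbb{N}^*$ and all $k \in \mathbb{N}^*$,
\[
|\langle \phi_\pm, dN\rangle - \mathbb{E}(\langle \phi_\pm, dN_{\omega,n}\rangle)| \le C n^{-(1-\alpha)k} e^{Ck\log k} (E^\tau + C)^{\rho+k} (\rho + k)^{\eta(\rho+k)} .
\tag{2.32}
\]
We optimize the right-hand side of (2.32) in $k$ to get
\[
|\langle \phi_\pm, dN\rangle - \mathbb{E}(\langle \phi_\pm, dN_{\omega,n}\rangle)| \le \exp\Bigl(-(\eta + C)\bigl(n^{1-\alpha}/(E^\tau + C)\bigr)^{\frac{1}{\eta+C}}\Bigr) .
\]
Hence, for some $\beta > 0$, if $n \ge E^\beta$, we have
\[
|\langle \phi_\pm, dN\rangle - \mathbb{E}(\langle \phi_\pm, dN_{\omega,n}\rangle)| \le \frac14 e^{-E^\nu} .
\tag{2.33}
\]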
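The optimization in $k$ can be made explicit in a simplified form. Keeping only the terms of (2.32) that grow with $k$, the logarithm of the right-hand side behaves like $f(k) = -k\log X + c\,k\log k$ with $X = n^{1-\alpha}/(E^\tau + C)$ and $c = \eta + C$; this reduction, and the names `X` and `c`, are simplifying assumptions of the sketch rather than the exact expression. Setting $f'(k) = 0$ gives $k_* = X^{1/c}/e$ and $f(k_*) = -c\,k_*$, that is, decay like $\exp(-\mathrm{const}\cdot X^{1/(\eta+C)})$, up to the value of the constants.

```python
import math

def log_bound(k, X, c):
    """Simplified logarithm of the right-hand side of (2.32):
    f(k) = -k*log(X) + c*k*log(k), for k > 0."""
    return -k * math.log(X) + c * k * math.log(k)

def optimal_k(X, c):
    """Stationary point of f, from f'(k) = -log(X) + c*(log(k) + 1) = 0."""
    return X ** (1.0 / c) / math.e

def optimal_value(X, c):
    """Value of f at the stationary point: f(k*) = -c * k*."""
    return -c * optimal_k(X, c)

if __name__ == "__main__":
    X, c = 1.0e6, 2.0
    k_star = optimal_k(X, c)
    print(k_star, log_bound(k_star, X, c), optimal_value(X, c))
```

Since $f''(k) = c/k > 0$, the stationary point is the minimum, so any polynomial growth of $n$ in $E$ with a large enough exponent makes the bound smaller than $e^{-E^\nu}$.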
Thus
\[
\mathbb{E}(N_{\omega,n}(-E - 1) - N_{\omega,n}(-E^\tau + 1)) \le \mathbb{E}(\langle \phi_-, dN_{\omega,n}\rangle) \le \mathbb{E}(\langle \phi_+, dN_{\omega,n}\rangle) \le \mathbb{E}(N_{\omega,n}(-E + 1) - N_{\omega,n}(-E^\tau - 1)) .
\tag{2.34}
\]
Using (5.14) to estimate $N_{\omega,n}(-E^\tau \pm 1)$ and summing Eqs. (2.30), (2.31) and (2.34), we end the proof of Lemma 2.3. $\square$

Remark 2.4. Notice that we could have estimated $N(-E)$ with $\mathbb{E}(N_{\omega,n}(-E \pm \varepsilon))$ (for $\varepsilon$ small). The price to pay to get an error of the same size as in (2.29) would have been to take $n$ of size $E^\beta \varepsilon^{-\zeta}$ for some $\zeta > 0$. This was used in [12] to get precise asymptotics for $N$ at high energy for a different model.

3. The Lower Bounds

In this section, we will prove the asymptotic lower bound on the approximated density of states $dN_{\omega,n}$ defined in Sect. 2.1 in the different cases considered in the introduction.

3.1. The general case. We prove the following general bound.

Proposition 3.1. Under the assumptions H1, H2 and H3, there exists $\beta_0 > 0$ such that, for any $\beta > \beta_0$ and $n = [E^\beta]$, we have
\[
\log \mathbb{E}(N_{\omega,n}(-E - 1)) \ge -(1 + \alpha^* d)\, g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\tag{3.1}
\]
Here $[\cdot]$ denotes the integer part.

Proof. The strategy used to prove the lower bound is straightforward: we construct a normalized vector $\phi$ such that $\langle (H + V_\omega)\phi, \phi\rangle \le -E - 1$, with a sufficiently large probability. The right candidate will be an asymptotic ground state for $H(g)$ for $g$ chosen properly (see Sect. 1.2). Pick $n = [E^\beta]$ and $l = [|\log E|^\beta]$. Pick $\rho > 1$ large, $0 < \varepsilon < 1$ small, $0 < \alpha < 1$ and $1 < k$ large. Let $\psi_g$ be an asymptotic ground state for $H(g)$ such that $\alpha^*(\psi_g) < \alpha^*(1 + \varepsilon)$ (see Sect. 1.2). Define $\Omega_{k,E} = \Omega^1_{k,E} \cap \Omega^2_E$,
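where
\[
\Omega^1_{k,E} = \{\omega : k(1 + 2\varepsilon) \ge m(\omega, C(0, k^{-\alpha^*(1+\varepsilon)})) = m(\omega, C(0, l)) \ge k(1 + \varepsilon)\},
\tag{3.2}
\]
\[
\Omega^2_E = \{\omega : \forall\gamma \in r_0\mathbb{Z}^d,\ m(\omega, C(\gamma, r_0)) < E^\rho(|\gamma|^\alpha + 1)\} .
\tag{3.3}
\]
Here $r_0$ is chosen such that $\operatorname{supp} V_2 \subset C(0, r_0)$. We bound the probability of $\Omega^1_{k,E}$ from below by the probability that $m(\omega, C(0, k^{-\alpha^*(1+\varepsilon)})) = k(1 + \varepsilon)$ and that $m(\omega, C(0, l) \setminus C(0, k^{-\alpha^*(1+\varepsilon)})) = 0$. Using (5.2) to bound the probability of $\Omega^2_E$ from below, for $E$ sufficiently large, we get that the probability of $\Omega_{k,E}$ is estimated by
\[
\mathbb{P}(\Omega_{k,E}) \ge \frac{1}{C} e^{-C\mu l^d} \frac{(\mu k^{-d\alpha^*(1+\varepsilon)})^{k(1+\varepsilon)}}{\Gamma(k(1 + \varepsilon))} - \frac{C}{\Gamma(E^\rho/2)},
\tag{3.4}
\]
where $C > 0$ is a constant independent of $l$, $k$ and $\varepsilon$, and $\Gamma$ is the Euler $\Gamma$-function. For $\omega \in \Omega_{k,E}$, we have
\[
\langle (-\Delta + V_{\omega,n})\psi_k, \psi_k\rangle = \langle (-\Delta + V^{(i)}_{\omega,n})\psi_k, \psi_k\rangle + \langle V^{(e)}_{\omega,n}\psi_k, \psi_k\rangle,
\]
where
\[
V^{(i)}_\omega = \int_{C(0,l)} V(x - y)\, m(\omega, dy) \quad\text{and}\quad V^{(e)}_\omega = \int_{\mathbb{R}^d\setminus C(0,l)} V(x - y)\, m(\omega, dy),
\]
i.e. they are the parts of $V_\omega$ with centers in $C(0, l)$ or outside of $C(0, l)$, and $V^{(i,e)}_{\omega,n}$ is built from these in the same way as $V_{\omega,n}$ is from $V_\omega$ (see Eq. (2.2)). As $V_2$ is of compact support, as $V_1$ is exponentially decaying and as $\omega \in \Omega^2_E$, using the support properties of asymptotic ground states (see Definition 1.1), we get that
\[
|\langle V^{(e)}_{\omega,n}\psi_k, \psi_k\rangle| \le C E^\rho e^{-|\log E|^\beta/C} .
\tag{3.5}
\]
On the other hand, if we set $m(\omega) = m(\omega, B(0, k^{-\alpha^*(1+\varepsilon)}))$ and define $(x_i(\omega))_i$ to be the points supporting $m(\omega, dx)$ in $B(0, k^{-\alpha^*(1+\varepsilon)})$, we have
\[
\begin{aligned}
\langle (-\Delta + V^{(i)}_\omega)\psi_k, \psi_k\rangle &= \langle (-\Delta + m(\omega)V)\psi_k, \psi_k\rangle + \sum_{i=1}^{m(\omega)} \langle (\tau_{x_i(\omega)}V - V)\psi_k, \psi_k\rangle \\
&= \langle (-\Delta + kV)\psi_k, \psi_k\rangle + \frac{m(\omega) - k}{k} \langle kV\psi_k, \psi_k\rangle + \sum_{i=1}^{m(\omega)} \langle (\tau_{x_i(\omega)}V - V)\psi_k, \psi_k\rangle \\
&\le E(k) + o(E(k)) + \varepsilon(E(k) + o(E(k))) + \sum_{i=1}^{m(\omega)} |\langle (\tau_{x_i(\omega)}V - V)\psi_k, \psi_k\rangle| .
\end{aligned}
\tag{3.6}
\]
We used the fact that $\psi_k$ is an asymptotic ground state and the fact that $\omega \in \Omega^1_{k,E}$.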
As $m(\omega) \le k(1 + 2\varepsilon)$ and $\alpha^*(\psi_g) < \alpha^*(1 + \varepsilon)$, the definition of $\alpha^*(\psi_g)$ tells us that
\[
\frac{1}{E_-(k)} \sum_{i=1}^{m(\omega)} \langle (\tau_{x_i(\omega)}V - V)\psi_k, \psi_k\rangle \to 0 \quad\text{as } k \to +\infty .
\]
Plugging all this into (3.6), we get
\[
\langle (-\Delta + V^{(i)}_\omega)\psi_k, \psi_k\rangle \le E(k)(1 + o(1) + \varepsilon) \quad\text{as } k \to +\infty .
\tag{3.7}
\]
If we now choose
\[
k = g(E)(1 + \varepsilon),
\tag{3.8}
\]
then, for sufficiently large $k$, we get
\[
\langle (-\Delta + V^{(i)}_\omega)\psi_k, \psi_k\rangle \le -(E + 1) .
\tag{3.9}
\]
But, as an asymptotic ground state, $\psi_k$ vanishes outside some fixed cube independent of $k$; hence for $E$ sufficiently large, it vanishes in a neighborhood of the boundary of the cube $C(0, n)$ and can be continued so as to satisfy any quasi-periodic boundary conditions on $C(0, n)$ (see Sect. 2.1). This implies that there exists $C > 0$ independent of $n$ and $E$ such that for $\omega \in \Omega_{k,E}$ and $k$ given by (3.8), we have
\[
N_{\omega,n}(-E - 1) \ge n^{-d}/C .
\tag{3.10}
\]
Taking into account the probability estimate (3.4), if $\rho$ is large enough and as $k \ge E^\delta$ for some $\delta > 0$ by (5.30), we get
\[
\mathbb{E}(N_{\omega,n}(-E - 1)) \ge E^{-\beta d} e^{-C\mu l^d} \frac{1}{C} \frac{(\mu k^{-d\alpha^*(1+\varepsilon)})^{k(1+\varepsilon)}}{\Gamma(k(1 + \varepsilon))} .
\]
So that, as $E$ tends to $+\infty$, we get
\[
\log[\mathbb{E}(N_{\omega,n}(-E - 1))] \ge -(1 + \varepsilon + d\alpha^*(1 + \varepsilon))\, g(E) \log g(E)\,(1 + 8\varepsilon) .
\]
As we can choose $\varepsilon$ as small as we please, this ends the proof of Proposition 3.1. $\square$

By Lemma 5.5, we know that, for some $C > 0$ and sufficiently large $E$, $g(E) \le CE$. Thus, if $\beta > 1$ is large enough and $\nu$ in Eq. (2.29) satisfies $\nu > 1$, Lemma 2.3 and Proposition 3.1 immediately imply

Proposition 3.2. Under the assumptions H1, H2 and H3, we have
\[
\log N(-E) \ge -(1 + \alpha^* d)\, g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\]
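The passage from the probability estimate (3.4) to the $k\log k$ asymptotics rests on Stirling's formula. The sketch below evaluates the logarithm of the main term of (3.4), $(\mu k^{-d\alpha^*(1+\varepsilon)})^{k(1+\varepsilon)}/\Gamma(k(1+\varepsilon))$, and compares it with $-(1+\varepsilon)(1 + d\alpha^*(1+\varepsilon))\,k\log k$. The parameter values are arbitrary illustrative choices, and the factor $e^{-C\mu l^d}/C$ as well as the correction $C/\Gamma(E^\rho/2)$ are left out.

```python
import math

def log_main_term(k, d, alpha_star, eps, mu=1.0):
    """log of (mu * k^(-d*alpha_star*(1+eps)))^(k(1+eps)) / Gamma(k(1+eps))."""
    m = k * (1.0 + eps)
    log_base = math.log(mu) - d * alpha_star * (1.0 + eps) * math.log(k)
    return m * log_base - math.lgamma(m)

def predicted_coefficient(d, alpha_star, eps):
    """Coefficient of k*log(k) predicted by Stirling's formula."""
    return -(1.0 + eps) * (1.0 + d * alpha_star * (1.0 + eps))

if __name__ == "__main__":
    d, alpha_star, eps = 3, 0.5, 0.1
    for k in (1.0e3, 1.0e6, 1.0e12):
        ratio = log_main_term(k, d, alpha_star, eps) / (k * math.log(k))
        print(k, ratio, predicted_coefficient(d, alpha_star, eps))
```

The convergence is only logarithmic in $k$, which is consistent with the $(1 + o(1))$ factors carried through the statements.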
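3.2. The case when V is bounded from below. We now assume that $V$ is bounded from below, i.e. that $V_-$ is bounded. We then prove

Proposition 3.3. Under the assumptions H1, H2 and H1’, there exists $\beta_0 > 0$ such that, for any $\beta > \beta_0$ and $n = [E^\beta]$, we have
\[
\log \mathbb{E}(N_{\omega,n}(-E - 1)) \ge -g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\tag{3.11}
\]
Proof. The strategy of the proof is the one used in the proof of Proposition 3.1. Recall that $v_-$ is the essential infimum of $V$. Then, for any $\varepsilon > 0$, we can find $\chi \in C_0^\infty(\mathbb{R}^d)$ such that
\[
\int_{\mathbb{R}^d} V(x)\chi^2(x)\,dx \le v_- + \varepsilon/2 \quad\text{and}\quad \int_{\mathbb{R}^d} \chi^2(x)\,dx = 1 .
\]
Recall that $\tau_a\chi(x) = \chi(x - a)$. As $V \in L^1(\mathbb{R}^d)$, there exists $\delta > 0$ such that
\[
\sup_{|a|\le\delta} \int_{\mathbb{R}^d} |V(x)| \cdot |\tau_a\chi^2(x) - \chi^2(x)|\,dx \le \varepsilon/2 .
\]
Pick now $n$ and $l$ as in the proof of Proposition 3.1; pick $k$ large and define $\Omega_{\delta,E} = \Omega^1_{\delta,E} \cap \Omega^2_E$, where $\Omega^2_E$ is defined in (3.3) and (see (3.2))
\[
\Omega^1_{\delta,E} = \{\omega;\ k(1 + 2\varepsilon) \ge m(\omega, C(0, \delta)) = m(\omega, C(0, l)) \ge k(1 + \varepsilon)\} .
\]
Using (5.2), we get that the probability of $\Omega_{\delta,E}$ is estimated by
\[
\mathbb{P}(\Omega_{\delta,E}) \ge \frac{1}{C} e^{-C\mu l^d} \frac{(\mu\delta)^{kd(1+\varepsilon)}}{\Gamma(k(1 + \varepsilon))} - \frac{C}{\Gamma(E^\rho/2)} .
\tag{3.12}
\]
Pick $\omega \in \Omega_{\delta,E}$. An argument similar to that used in the proof of Proposition 3.1, in which $k^{-\alpha^*(1+\varepsilon)}$ is replaced by $\delta$, yields (cf. (3.6))
\[
\begin{aligned}
\langle (-\Delta + V_{\omega,n})\chi, \chi\rangle &= \|\nabla\chi\|^2 + \sum_{i=1}^{m(\omega)} \int \tau_{x_i(\omega)}(V\chi^2) + \sum_{i=1}^{m(\omega)} \int [\tau_{x_i(\omega)}V](\chi^2 - \tau_{x_i(\omega)}\chi^2) + \langle V^{(e)}_{\omega,n}\chi, \chi\rangle \\
&\le k(1 + \varepsilon)(v_- + \varepsilon) + C + O(1/E) .
\end{aligned}
\tag{3.13}
\]
Hence, if we take $k = \frac{E}{|v_-|}(1 + \varepsilon/|v_-|)$, for $E$ sufficiently large, we get (cf. (3.9))
\[
\langle (-\Delta + V_{\omega,n})\chi, \chi\rangle \le -E - 1 .
\]
This inequality and the probability estimate (3.12) give, for some $C > 0$ independent of $E$ and $\varepsilon$,
\[
\log \mathbb{E}(N_{\omega,n}(-E - 1)) \ge -\frac{E}{|v_-|}(1 + C\varepsilon) \log E .
\]
This ends the proof of Proposition 3.3. $\square$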
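The choice of $k$ can be checked by direct arithmetic: with $k = \frac{E}{|v_-|}(1 + \varepsilon/|v_-|)$ and $v_- < 0$, one gets $k(1+\varepsilon)(v_- + \varepsilon) = -E(1+\varepsilon)(1 - \varepsilon^2/v_-^2)$, which falls below $-E - 1$ minus any fixed constant once $E$ is large. A minimal sketch follows; the additive constant `C` standing for the bounded terms of (3.13) and the sample values are assumptions made for illustration.

```python
def quadratic_form_bound(E, v_minus, eps, C=1.0):
    """Right-hand side of (3.13), k(1+eps)(v_minus+eps) + C, for the choice
    k = E/|v_minus| * (1 + eps/|v_minus|); the O(1/E) term is dropped.
    Assumes v_minus < 0."""
    k = E / abs(v_minus) * (1.0 + eps / abs(v_minus))
    return k * (1.0 + eps) * (v_minus + eps) + C

def closed_form(E, v_minus, eps, C=1.0):
    """The same quantity written as -E(1+eps)(1 - eps^2/v_minus^2) + C."""
    return -E * (1.0 + eps) * (1.0 - (eps / v_minus) ** 2) + C

if __name__ == "__main__":
    for E in (10.0, 100.0, 1000.0):
        print(E, quadratic_form_bound(E, -2.0, 0.1), -E - 1.0)
```

With these sample values the bound drops below $-E - 1$ as soon as $E$ exceeds roughly 21, illustrating the phrase "for $E$ sufficiently large".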
By the same argument as above, we get

Proposition 3.4. Under the assumptions H1, H2 and H1’, if $V$ is bounded from below, then we have
\[
\log N(-E) \ge -g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\]
This ends the proof of Theorem 1.5 as the asymptotic upper bound given by Theorem 1.4 coalesces with the lower bound given by Proposition 3.4.

3.3. The case when V has power law singularities. Let us assume that $V_2$ satisfies assumption H1.2”. We will show that

Lemma 3.1. Under the assumptions H1 and H1”, we have $\alpha^\dagger \ge \alpha^*$.

Proof. To estimate $\alpha^*$ we will use the asymptotic ground state constructed for $H(g)$ in Lemma 5.7. Pick $i_0$ and $\psi_g^{(i_0)}$ as in Lemma 5.7. Let $\alpha > \alpha^\dagger$. If the support of $\chi$ (cf. Lemma 5.7) is small enough, then, for $|a| \le g^{-\alpha}$ and $g$ large enough, we have
\[
\begin{aligned}
\Phi(a) &= \frac{g\langle (\tau_a V - V)\psi_g^{(i_0)}, \psi_g^{(i_0)}\rangle}{E_-(g)} \\
&= \frac{g\langle [\tau_a(W_{i_0}V_{i_0}) - W_{i_0}V_{i_0}]\psi_g^{(i_0)}, \psi_g^{(i_0)}\rangle}{E_-(g)} + \frac{g\langle [\tau_a V_1 - V_1]\psi_g^{(i_0)}, \psi_g^{(i_0)}\rangle}{E_-(g)} \\
&= \frac{g\langle [\tau_a(W_{i_0}V_{i_0}) - W_{i_0}V_{i_0}]\psi_g^{(i_0)}, \psi_g^{(i_0)}\rangle}{E_-(g)} + o(1),
\end{aligned}
\]
as V1 is bounded and g = o(E− (g)). Hence to estimate 8(a), it is enough to estimate the expression (i )
(i )
gh[τa (Wi0 Vi0 ) − Wi0 Vi0 ]ψg 0 , ψg 0 i E− (g) Z g Wi0 (x − a)Vi0 (x − a) − Wi0 (x)Vi0 (x) χ 2 (x)|ϕg(i0 ) (x)|2 dx = E− (g) |x|
=
†
(3.14) where
Z I (a) =
Wi0 (g −α (x − b))Vi0 (x − b)−Wi0 (g −α x)Vi0 (x)
†
|x|
†
†
· χ 2 (g −α x)|ϕ (i0 ) (x)|2 dx, †
where $b = g^{\alpha^\dagger} a$. Notice that $b \to 0$ when $g \to +\infty$ as $\alpha > \alpha^\dagger$ and $|a| \le g^{-\alpha}$. Moreover, by Lemma 5.7, $E_-(g) \sim Cg^{2\alpha^\dagger}$ (for some $C$). Hence to show that $\alpha > \alpha^*$, we just need to show that $I(a) \to 0$ when $g \to +\infty$. This is easily seen cutting $I(a)$
into two parts. In the first one, we integrate over some small neighborhood of 0 and this is small as $\chi$ and $\phi^{(i_0)}$ are bounded and the singularities of $V_{i_0}$ are integrable; outside of this neighborhood, the potentials $V_{i_0}$ and $V_{i_0}(\cdot - b)$ are continuous and, as $\phi^{(i_0)}$ is in $L^2$, we can conclude using Lebesgue’s Dominated Convergence Theorem. This implies that $\alpha \ge \alpha^*$. As it holds for any $\alpha > \alpha^\dagger$, we get the result of Lemma 3.1. $\square$

Combining Lemma 3.1 with Theorem 1.4, we get

Proposition 3.5. Under the assumptions H1 and H1”, one has
\[
-(1 + \alpha^\dagger d)\, g(E) \log g(E)\,(1 + o(1)) \le \log N(-E), \quad E \to +\infty .
\]
In Sect. 4.3 we improve on the general upper bound given in Theorem 1.4 so that the new upper bound coalesces with the lower bound obtained here.

4. The Upper Bounds

In this section, we will prove the asymptotic upper bound on the approximated density of states $dN_{\omega,n}$ defined in Sect. 2.1 in the different cases considered in the introduction.

4.1. The general case. We prove the following general bound.

Proposition 4.1. Under the assumptions H1, H2 and H3, there exists $\beta_0 > 0$ such that, for any $\beta > \beta_0$ and $n = 2[E^\beta] \cdot [|\log E|^\beta]$, we have
\[
\log \mathbb{E}(N_{\omega,n}(-E + 1)) \le -g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\tag{4.1}
\]
By Lemma 5.5, we know that, for some $C > 0$ and $\alpha > 1$, $g(E) \ge CE^{1/\alpha}$ for $E$ large enough. So that if we pick $\beta > 1$ large enough so that $\nu$ defined in Eq. (2.29) satisfies $\nu > 1$, Lemma 2.3 and Proposition 4.1 immediately imply

Proposition 4.2. Under the assumptions H1, H2 and H3, we have
\[
\log N(-E) \le -g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\]
Taking into account Proposition 3.2, this ends the proof of Theorem 1.4. We now turn to the proof of Proposition 4.1.

4.2. Proof of Proposition 4.1. The idea of the proof is to show that, if $H_{\omega,n,\theta}$ has some low energies, then the corresponding potential $V_\omega$ must have a very deep well, i.e. the corresponding realization of the Poisson measure must put sufficiently many points inside the cube $C(0, n)$ and those points must be sufficiently close to each other. The main technical difficulty comes from the fact that our single site potential is not of finite range; so we need some a priori estimate on $N_{\omega,n}$ that tells us that the behavior of $V_\omega$ outside of $C(0, n)$ does not interact too much with the one inside, more precisely, that this interaction can be large only with a small probability. Actually we need this not only on the scale of the large cube used in the periodic approximation but also on a much smaller scale, namely the scale of the size of the cube where we want the Poisson points to pile up.

Pick $\beta > 0$ large. Set $n = 2[E^\beta] \cdot [|\log E|^\beta]$ and $l = [|\log E|^\beta]$. Pick $\rho > 0$ large and let $\Omega^2_E$ be defined as in (3.3). Then we prove
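Lemma 4.1. For some $C > 0$ and sufficiently large $E$, we have
\[
\mathbb{E}(N_{\omega,n}(-E + 1)) \le \mathbb{E}(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega^2_E}) + \frac{C}{\Gamma(E^\rho/2)} .
\tag{4.2}
\]
Proof. We decompose
\[
\mathbb{E}(N_{\omega,n}(-E + 1)) = \mathbb{E}(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega^2_E}) + A,
\]
where, $\alpha$ and $p'$ being as in Lemma 5.1,
\[
\begin{aligned}
A &= \sum_{k\ge1} \mathbb{E}\bigl(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega_k(\alpha,1,p')\setminus(\Omega^2_E\cup\Omega_{k-1}(\alpha,1,p'))}\bigr) + \mathbb{E}\bigl(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega_0(\alpha,1,p')\setminus\Omega^2_E}\bigr) \\
&\le \sum_{k\ge E^\rho} \mathbb{E}\bigl(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega_k(\alpha,1,p')\setminus\Omega_{k-1}(\alpha,1,p')}\bigr)
\end{aligned}
\tag{4.3}
\]
as, for $k < E^\rho$, by the proof of Lemma 5.1, we have $\Omega_k(\alpha, 1, p') \subset \Omega^2_E$. For $\omega \in \Omega_k(\alpha, 1, p')$ (see Lemma 5.1), we have
\[
\|V^{(r)}_{\omega,n}\|_\infty \le C(E \log E)^{\alpha\beta} k^{\frac{p}{p-p'}} .
\]
Hence, if $N$ denotes the density of states of $-\Delta$, for $\omega \in \Omega_k(\alpha, 1, p')$, we get
\[
N_{\omega,n}(-E + 1) \le N\bigl(-E + 2 + C(E \log E)^{\alpha\beta} k^{\frac{p}{p-p'}}\bigr) \le C\bigl(-E + 2 + C(E \log E)^{\alpha\beta} k^{\frac{p}{p-p'}}\bigr)^{d/2} .
\]
Plugging this into (4.3) and computing the sum over $k$ using (5.1), we get Lemma 4.1. $\square$

Define
\[
V^{(i)}_\omega(x) = \int_{C(0,n+2l)} V(x - y)\, m(\omega, dy) \quad\text{and}\quad V^{(e)}_\omega(x) = \int_{\mathbb{R}^d\setminus C(0,n+2l)} V(x - y)\, m(\omega, dy)
\tag{4.4}
\]
and the corresponding periodized potentials $V^{(i,e)}_{\omega,n}$ (see Eq. (2.1)). Note that the periodized potentials are of period $n$. As $V_2$ is compactly supported, $V^{(e)}_{\omega,n}$ is almost surely bounded for $l$ (i.e. $E$) large. We will estimate its magnitude later. The proof of Proposition 4.1 will be a consequence of the following lemmas:

Lemma 4.2. For any $\varepsilon > 0$, there exist $E_0$ and $k_0$ such that, if $k > k_0$, $E > E_0$ and
\[
\forall x \in C(0, n + 2l),\quad m(\omega, C(x, 2l)) \le k,
\tag{4.5}
\]
then, for any $\theta \in \mathbb{T}_n^*$, we have
\[
H_{\omega,n,\theta} \ge -E_-\Bigl(\frac{k}{1 - \varepsilon}\Bigr) - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty - C((n/2l)^d + 1) \cdot k e^{-l/C}
\tag{4.6}
\]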
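and

Lemma 4.3. Pick $\delta > 0$. If $E$ is large enough and if $k \ge E^\delta$, then
\[
\log\bigl[\mathbb{P}(\{\omega;\ \exists x \in C(0, n + 2l),\ m(\omega, C(x, 2l)) > k\})\bigr] \underset{E\to+\infty}{\sim} -k \log k .
\tag{4.7}
\]
Before proving Lemmas 4.2 and 4.3, we finish the proof of Proposition 4.1. To this end, fix $\varepsilon > 0$ and set $k = g(E - 2)(1 - \varepsilon)$. For $\omega \in \Omega^2_E$, we have, for some $C > 0$,
\[
\|V^{(e)}_{\omega,n}\|_\infty \le E^\rho e^{-(\log E)^\beta/C} .
\tag{4.8}
\]
Hence, using (4.6) and the bounds known for $g(E)$ given in Lemma 5.5, we get that for $\omega \in \Omega^2_E$ such that $\forall x \in C(0, n + 2l)$, $m(\omega, C(x, 2l)) \le k$, for any $\theta \in \mathbb{T}_n^*$, we have $H_{\omega,n,\theta} \ge -E + 3/2$. In other words, we have
\[
\sharp\{\text{eigenvalues of } H_{\omega,n,\theta} \le -E + 1\} = \sharp\{\text{eigenvalues of } H_{\omega,n,\theta} \le -E + 1\}\, \mathbf{1}_{\{\omega\in\Omega^2_E :\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\}} .
\]
By the definition of $N_{\omega,n}$, using Fubini’s Theorem, we compute
\[
\begin{aligned}
\mathbb{E}(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega^2_E}) &= \int_{\mathbb{T}_n^*} \mathbb{E}\bigl(\sharp\{\text{eigenvalues of } H_{\omega,n,\theta} \le -E + 1\}\mathbf{1}_{\Omega^2_E}\bigr)\,d\theta \\
&= \int_{\mathbb{T}_n^*} \mathbb{E}\bigl[\sharp\{\text{eigenvalues of } H_{\omega,n,\theta} \le -E + 1\}\, \mathbf{1}_{\{\omega\in\Omega^2_E :\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\}}\bigr]\,d\theta .
\end{aligned}
\tag{4.9}
\]
On the other hand, for $\omega \in \Omega^2_E$ ($\Omega^2_E$ is defined in (3.3)), we have
\[
\|V^{(i)}_{\omega,n}\|_{L^p(C(0,n))} \le CE^{\beta(d+\alpha')+\rho},
\]
hence, using Corollary 5.1, there exists $\nu > 0$ such that
\[
H_{\omega,n,\theta} \ge -\frac12 \Delta_{n,\theta} - E^{(\beta(d+\alpha')+\rho)\nu},
\]
where $-\Delta_{n,\theta}$ is the Laplace operator on $C(0, n)$ with quasi-periodic boundary conditions. This implies that, for some $C > 0$,
\[
\begin{aligned}
\sharp\{\text{eigenvalues of } H_{\omega,n,\theta} \le -E + 1\} &\le \sharp\{\text{eigenvalues of } -\Delta_{n,\theta} \le CE^{(\beta(d+\alpha')+\rho)\nu} - 2E + 2\} \\
&\le Cn^d\bigl(E^{(\beta(d+\alpha')+\rho)\nu} - E + 1\bigr)^{d/2} .
\end{aligned}
\]
Plugging this into (4.9), we get
\[
\mathbb{E}(N_{\omega,n}(-E + 1)\mathbf{1}_{\Omega^2_E}) \le C\bigl(E^{(\beta(d+\alpha')+\rho)\nu} - E + 1\bigr)^{d/2} \cdot \mathbb{P}(\{\omega;\ \exists x \in C(0, n + 2l),\ m(\omega, C(x, 2l)) > k\}) .
\]
Notice that, by Lemma 5.5, our choice of $k$ fulfills the assumptions of Lemma 4.3. From this we deduce that, if $E$ is large enough, then
\[
\log \mathbb{E}(N_{\omega,n}(-E + 1)) \le -g(E) \log g(E)\,(1 - 2\varepsilon) .
\]
This completes the proof of Proposition 4.1. $\square$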
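4.2.1. The proof of Lemma 4.2. Consider the partition of the cube $C(0, n)$ into cubes of side length $l$, i.e.
\[
C(0, n) = \bigcup_{\substack{\gamma\in l\mathbb{Z}^d \\ |\gamma|\le n/l}} C(\gamma, l),
\]
and the covering
\[
C(0, n) \subset \bigcup_{\substack{\gamma\in l\mathbb{Z}^d \\ |\gamma|\le n/l}} C(\gamma, 2l) .
\]
Set $L = 2[E^\beta]$ and consider a $\mathbb{Z}^d$-periodic partition of unity of $\mathbb{R}^d$, i.e.
\[
1 = \sum_{\beta\in L\mathbb{Z}^d} \sum_{\gamma\in\mathbb{Z}^d\cap C(0,L)} \chi^2_{\gamma+\beta},
\]
where $\chi \in C_0^\infty(\mathbb{R}^d)$ is such that $\chi \equiv 1$ on $C(0, 1/2)$, $0 \le \chi \le 1$ and $\operatorname{supp}\chi \subset C(0, 3/2)$. For $\gamma \in l\mathbb{Z}^d$, define $\chi_{\gamma,l}(x) = \chi((x - \gamma)/l)$. Then, one has
\[
1 = \sum_{\beta\in n\mathbb{Z}^d} \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n)} \chi^2_{\gamma+\beta,l} .
\]
Assume (4.5) holds and consider $\phi \in C^\infty(\mathbb{R}^d)$ satisfying the quasi-periodic boundary conditions $\phi(x + n\gamma) = e^{in\theta\gamma}\phi(x)$ for $\gamma \in \mathbb{Z}^d$ and $x \in \mathbb{R}^d$ (i.e. $\phi \in C^\infty(\mathbb{R}^d) \cap L^2_{\theta,\mathrm{loc}}(C(0, n))$) such that $\|\phi\| = 1$. Note that $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ denote respectively the usual norm and scalar product in $L^2(C(0, n))$. For small positive $\varepsilon$, we compute
\[
\begin{aligned}
\langle (-\Delta + V_{\omega,n})\phi, \phi\rangle &= \langle (-\Delta + V^{(i)}_{\omega,n})\phi, \phi\rangle + \langle V^{(e)}_{\omega,n}\phi, \phi\rangle \\
&\ge \|\nabla\phi\|^2 + \langle V^{(i)}_{\omega,n}\phi, \phi\rangle - \|V^{(e)}_{\omega,n}\|_\infty \\
&= \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)} \Bigl(\|\chi_{\gamma,l}\nabla\phi\|^2 + \langle V^{(i)}_{\omega,n}\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\rangle\Bigr) - \|V^{(e)}_{\omega,n}\|_\infty \\
&\ge (1 - \varepsilon) \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)} \Bigl(\|\nabla(\chi_{\gamma,l}\phi)\|^2 + \frac{1}{1 - \varepsilon}\langle V^{(i)}_{\omega,n}\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\rangle\Bigr) \\
&\qquad + \Bigl(1 - \frac{1}{\varepsilon}\Bigr) \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)} \|(\nabla\chi_{\gamma,l})\phi\|^2 - \|V^{(e)}_{\omega,n}\|_\infty \\
&\ge (1 - \varepsilon) \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)} \Bigl(\|\nabla(\chi_{\gamma,l}\phi)\|^2 + \frac{1}{1 - \varepsilon}\langle V^{(i)}_{\omega,n}\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\rangle\Bigr) - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty .
\end{aligned}
\tag{4.10}
\]
Define
\[
V^{(i,\gamma)}_\omega = \int_{C(\gamma,2l)} V(x - y)\, m(\omega, dy) \quad\text{and}\quad V^{(e,\gamma)}_\omega = \int_{C(0,n+2l)\setminus C(\gamma,2l)} V(x - y)\, m(\omega, dy)
\]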
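and the corresponding periodized potentials $V^{((i,e),\gamma)}_{\omega,n}$ (see Eq. (2.2)). Then (4.5) tells us that $m(\omega, C(0, n + 2l)) \le C(n/2l)^d \cdot k$. Using this and the exponential decay of $V_1$, we get that, for some $C > 0$,
\[
\sup_{x\in C(\gamma,l)} |V^{(e,\gamma)}_{\omega,n}(x)| \le (n/2l)^d \cdot k e^{-l/C}, \qquad \sup_{x\in C(\gamma,l)} \bigl|\bigl(V^{(i,\gamma)}_{\omega,n} - V^{(i,\gamma)}_\omega\bigr)(x)\bigr| \le Ck e^{-l/C} .
\]
So that (4.10) gives us
\[
\langle (-\Delta + V_{\omega,n})\phi, \phi\rangle \ge (1 - \varepsilon) \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)} \Bigl\langle \Bigl(-\Delta + \frac{1}{1 - \varepsilon}V^{(i,\gamma)}_\omega\Bigr)\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\Bigr\rangle - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty - C((n/2l)^d + 1) \cdot k e^{-l/C} .
\tag{4.11}
\]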
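Now, if $(x_j(\omega))_{j=1,\dots,m(\omega,C(\gamma,2l))}$ denotes the support of $m(\omega, dx)$ in $C(\gamma, 2l)$, we write, with $m = m(\omega, C(\gamma, 2l))$,
\[
\begin{aligned}
\Bigl\langle \Bigl(-\Delta + \frac{1}{1 - \varepsilon}V^{(i,\gamma)}_\omega\Bigr)\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\Bigr\rangle &= \Bigl\langle \Bigl(-\Delta + \frac{1}{1 - \varepsilon}\sum_{j=1}^{m} \tau_{x_j(\omega)}V\Bigr)\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\Bigr\rangle \\
&= \frac{1}{m} \sum_{j=1}^{m} \Bigl\langle \Bigl(-\Delta + \frac{m}{1 - \varepsilon}\tau_{x_j(\omega)}V\Bigr)\chi_{\gamma,l}\phi, \chi_{\gamma,l}\phi\Bigr\rangle \\
&\ge -\frac{1}{m} \sum_{j=1}^{m} E_-\Bigl(\frac{m}{1 - \varepsilon}\Bigr)\|\chi_{\gamma,l}\phi\|^2 \\
&\ge -E_-\Bigl(\frac{k}{1 - \varepsilon}\Bigr)\|\chi_{\gamma,l}\phi\|^2 .
\end{aligned}
\tag{4.12}
\]
Here we have used (4.5). Plugging (4.12) into (4.11), we get
\[
\langle (-\Delta + V_{\omega,n})\phi, \phi\rangle \ge -E_-\Bigl(\frac{k}{1 - \varepsilon}\Bigr) - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty - C((n/2l)^d + 1) \cdot k e^{-l/C} .
\tag{4.13}
\]
Lemma 4.2 follows then from the fact that $C^\infty(\mathbb{R}^d) \cap L^2_\theta(C(0, n))$ is dense in the domain of $H_{\omega,n,\theta}$. $\square$

4.2.2. The proof of Lemma 4.3. Define
\[
P(n, k, l) = \mathbb{P}(\{\omega;\ \exists x \in C(0, n + 2l),\ m(\omega, C(x, 2l)) > k\}) .
\]
We will assume that $k$, $n$ and $l$ are large and that they satisfy
\[
(n + l)^{d+1} e^{l^d} = o(l^{-kd}\,\Gamma(k)) .
\tag{4.14}
\]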
We will prove a lower and an upper bound on $P(n, k, l)$. We start with the lower bound. Consider the partition of the cube $C(0, n + 2l)$ into cubes of side length $2l$, i.e.
\[
C(0, n + 2l) = \bigcup_{\substack{\gamma\in 2l\mathbb{Z}^d \\ |\gamma|\le n/2l+1}} C(\gamma, 2l) .
\]
Using the independence for disjoint cubes and the homogeneity of the Poisson field, we obtain that
\[
\begin{aligned}
P(n, k, l) &\ge \mathbb{P}(\{\omega;\ \exists\gamma \in 2l\mathbb{Z}^d,\ |\gamma| \le n/2l + 1;\ m(\omega, C(\gamma, 2l)) > k\}) \\
&= 1 - \bigl(1 - \mathbb{P}(m(\omega, C(0, 2l)) > k)\bigr)^{(n/2l+1)^d} .
\end{aligned}
\tag{4.15}
\]
By definition, $\mathbb{P}(m(\omega, C(0, 2l)) > k) \ge e^{-\mu(2l)^d} \dfrac{(\mu(2l)^d)^k}{k!}$. Plugging this into (4.15), we get
\[
\log P(n, k, l) \ge -k \log k\,(1 + o(1))
\]
when $k$, $n$ and $l$ tend to $+\infty$ under the assumption (4.14).

To get an upper bound, we consider the partition of the cube $C(0, n + 2l)$ into cubes of side length $4l$, i.e.
\[
C(0, n + 2l) = \bigcup_{\substack{\gamma\in 4l\mathbb{Z}^d \\ |\gamma|\le n/(4l)+1/2}} C(\gamma, 4l) .
\]
For any $x \in C(0, n + 2l)$, there exists $\gamma \in 4l\mathbb{Z}^d$, $|\gamma| \le n/(4l) + 1/2$, such that $C(\gamma, 4l) \cap C(x, 2l) = C(x, 2l)$. Hence,
\[
\mathbb{P}(\{\omega;\ \forall x \in C(0, n + 2l),\ m(\omega, C(x, 2l)) \le k\}) \ge \mathbb{P}(\{\omega;\ \forall\gamma \in 4l\mathbb{Z}^d,\ |\gamma| \le n/2l + 1/2;\ m(\omega, C(\gamma, 4l)) \le k\}),
\]
that is, using the stationarity of the Poisson process,
\[
\begin{aligned}
\mathbb{P}(\{\omega;\ \exists x \in C(0, n + 2l),\ m(\omega, C(x, 2l)) > k\}) &\le \sum_{\gamma\in 4l\mathbb{Z}^d,\ |\gamma|\le n/2l+1/2} \mathbb{P}(\{\omega;\ m(\omega, C(\gamma, 4l)) > k\}) \\
&\le (n + 2l)^d\, \mathbb{P}(\{\omega;\ m(\omega, C(0, 4l)) > k\}) .
\end{aligned}
\]
On the other hand
\[
\mathbb{P}(\{m(\omega, C(0, 4l)) > k\}) = \sum_{j>k} e^{-(4l)^d\mu} \frac{[\mu(4l)^d]^j}{j!} \le \frac{(\mu(4l)^d)^k}{k!} .
\tag{4.16}
\]
This then implies that
\[
\log P(n, k, l) \le -k \log k\,(1 + o(1))
\]
when $k$, $n$ and $l$ tend to $+\infty$ under assumption (4.14). $\square$
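The Poisson tail behavior used in this proof can be checked numerically. The sketch below computes $\log \mathbb{P}(X > k)$ for a Poisson variable $X$ of mean $\lambda$ (playing the role of $\mu(2l)^d$ or $\mu(4l)^d$) in log-space, and compares it with $-k\log k$ and with the upper bound $\lambda^k/k!$ from (4.16). The truncation length `extra_terms` is a hypothetical numerical parameter; the neglected terms decay geometrically when $k \gg \lambda$.

```python
import math

def log_poisson_pmf(j, lam):
    """log of e^(-lam) * lam^j / j!."""
    return -lam + j * math.log(lam) - math.lgamma(j + 1)

def log_poisson_tail(k, lam, extra_terms=2000):
    """log P(X > k) for X ~ Poisson(lam), summing the terms
    j = k+1, ..., k+extra_terms with the log-sum-exp trick."""
    logs = [log_poisson_pmf(j, lam) for j in range(k + 1, k + 1 + extra_terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def log_upper_bound(k, lam):
    """log of lam^k / k!, the right-hand side of (4.16)."""
    return k * math.log(lam) - math.lgamma(k + 1)

if __name__ == "__main__":
    for k in (10**2, 10**4, 10**6):
        print(k, log_poisson_tail(k, 1.0) / (-k * math.log(k)))
```

The printed ratio approaches 1 only slowly, at a rate of order $(1 + \log\lambda)/\log k$, which is why the statement requires $k \ge E^\delta$ while $l$ grows only like a power of $\log E$.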
4.3. The case when V has power law singularities. We now assume that H1” holds. Obviously, modifying $V_1$, we can assume that the functions $W_i$ do not change sign and that the supports of distinct $W_i$ are pairwise disjoint. We then prove

Proposition 4.3. Under the assumptions H1 and H1”, there exists $\beta_0 > 0$ such that, for any $\beta > \beta_0$ and $n = 2[E^\beta] \cdot [|\log E|^\beta]$, we have
\[
\log \mathbb{E}(N_{\omega,n}(-E + 1)) \le -(1 + \alpha^\dagger d)\, g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\tag{4.17}
\]
Taking into account Lemma 5.5 and Lemma 2.3, as a corollary to Proposition 4.3, we get

Proposition 4.4. Under the assumptions H1 and H1”, we have
\[
\log(\mathbb{E}(N(-E + 1))) \le -(1 + \alpha^\dagger d)\, g(E) \log g(E)\,(1 + o(1)), \quad E \to +\infty .
\]
This ends the proof of Theorem 1.6 if one takes into account Proposition 3.5.

4.3.1. Proof of Proposition 4.3. The idea guiding this proof is essentially the same as the one guiding the proof of Proposition 4.1. The difference comes from the fact that, as $E_-(g)$ increases faster than linearly in $g$, if we want to gather $k$ single site potentials sufficiently close together so as to get the effect of having $k$ single site potentials exactly at the same point, we need the single site potentials to be roughly at a distance less than $k^{-\alpha^\dagger}$ from each other. Hence, the scale on which we want the Poisson points to concentrate is much smaller than the one used to prove the upper bound in the general case. This leads to some supplementary technical difficulties as the single site potentials have a finite non-zero range. Recall that $(x_i)_{1\le i\le q}$ are the singularities of the single site potential $V$ (see Sect. 1.2). We define
\[
K(x, r) = \bigcup_{1\le i\le q} C(x - x_i, r) .
\tag{4.18}
\]
Pick $\varepsilon > 0$ small and define the events
\[
\begin{aligned}
\tilde\Omega_1 &= \{\omega \in \Omega^2_E :\ \exists x \in C(0, n + 2l),\ m(\omega, K(x, k^{-(\alpha^\dagger-2\varepsilon)})) > k\}, \\
\tilde\Omega_2 &= \{\omega \in \Omega^2_E :\ \exists x \in C(0, n + 2l),\ m(\omega, C(x, 1)) > k^{1+\varepsilon\nu^\dagger}\},
\end{aligned}
\]
and $\tilde\Omega = \tilde\Omega_1 \cup \tilde\Omega_2$. Pick $\beta > 0$ large. Set $n = 2[E^\beta] \cdot [|\log E|^\beta]$ and $l = [|\log E|^\beta]$. Pick $\rho > 0$ large. Taking into account Lemma 4.1, Proposition 4.3 is a direct consequence of the following two lemmas (cf. Lemmas 4.2 and 4.3).

Lemma 4.4. For any $\varepsilon > 0$, there exist $E_0$ and $k_0$ such that, if $k > k_0$ and $E > E_0$, if $\omega \in \Omega^2_E$ and if
\[
\forall x \in C(0, n + 2l),\quad m(\omega, K(x, k^{-(\alpha^\dagger-2\varepsilon)})) \le k \ \text{ and }\ m(\omega, C(x, 1)) \le k^{1+\varepsilon\nu^\dagger},
\tag{4.19}
\]
then, for any $\theta \in \mathbb{T}_n^*$, we have
\[
H_{\omega,n,\theta} \ge -E_-\Bigl(\frac{k}{1 - \varepsilon}\Bigr)(1 + \varepsilon) - \frac{Ck^{2(\alpha^\dagger-\nu^\dagger\varepsilon/2)}}{\varepsilon} - E^\rho e^{-(\log E)^\beta/C} - Ck^{1+\varepsilon\nu^\dagger},
\tag{4.20}
\]
and

Lemma 4.5. Pick $\delta > 0$. Then, if $E$ is large enough and if $k \ge E^\delta$, the probability of the event $\tilde\Omega$ satisfies
\[
\log \mathbb{P}(\tilde\Omega) \underset{E\to+\infty}{\sim} -(1 + (\alpha^\dagger - 2\varepsilon)d)\, k \log k .
\tag{4.21}
\]
We now pick $k = [(1 - \varepsilon)g(E/(1 + 2\varepsilon))]$, i.e. $k$ is of order of magnitude $E^{1-\nu^\dagger/2}$ as $E \to +\infty$ (see Lemma 5.7). In that case, for sufficiently small $\varepsilon$, $k^{1+\varepsilon\nu^\dagger}$ and $k^{2(\alpha^\dagger-\nu^\dagger\varepsilon/2)}$ are $o(E)$. Hence Proposition 4.3 follows from Lemma 4.1, Lemma 4.4 and Lemma 4.5 in the same way as Proposition 4.1 followed from Lemmas 4.1, 4.2 and 4.3.
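The exponent bookkeeping behind these $o(E)$ claims can be verified with exact rational arithmetic. The sketch assumes the relation $\alpha^\dagger = 1/(2 - \nu^\dagger)$, which is what the identity $\alpha^\dagger\nu^\dagger + 1 = 2\alpha^\dagger$ used in (4.27) amounts to and which is consistent with $E_-(g) \sim Cg^{2\alpha^\dagger}$ and $k$ of order $E^{1-\nu^\dagger/2}$; the sample values of $\nu^\dagger$ and $\varepsilon$ are arbitrary.

```python
from fractions import Fraction

def alpha_dagger(nu):
    """Assumed relation alpha_dagger = 1 / (2 - nu_dagger), for 0 < nu_dagger < 2."""
    return 1 / (2 - nu)

def exponents_in_E(nu, eps):
    """Powers of E carried by the two error terms when k ~ E^(1 - nu/2)."""
    k_power = 1 - nu / 2
    a = alpha_dagger(nu)
    return {
        "k^(1 + eps*nu)": k_power * (1 + eps * nu),
        "k^(2*(alpha - nu*eps/2))": k_power * 2 * (a - nu * eps / 2),
    }

if __name__ == "__main__":
    for nu in (Fraction(1), Fraction(3, 2)):
        print(nu, exponents_in_E(nu, Fraction(1, 10)))
```

Both exponents are strictly below 1 for small positive $\varepsilon$, and the second one equals exactly 1 at $\varepsilon = 0$, which shows why the $\varepsilon$-shrinking of the cells is needed.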
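4.3.2. The proof of Lemma 4.4. We are going to proceed along the same lines as in the proof of Lemma 4.2. The potentials $V^{(i,e)}_{\omega,n}$ are defined as in (4.4). Fix $\phi \in C^\infty(\mathbb{R}^d) \cap L^2_\theta(C(0, n))$ normalized by $\|\phi\| = 1$. Then
\[
\begin{aligned}
\langle H_{\omega,n,\theta}\phi, \phi\rangle &= \langle -\Delta\phi, \phi\rangle + \langle V^{(i)}_{\omega,n}\phi, \phi\rangle + \langle V^{(e)}_{\omega,n}\phi, \phi\rangle \\
&\ge \langle -\Delta\phi, \phi\rangle + \langle V^{(i)}_{\omega,n}\phi, \phi\rangle - E^\rho e^{-(\log E)^\beta/C}
\end{aligned}
\tag{4.22}
\]
using (4.8). We can split
\[
V^{(i)}_{\omega,n} = V^{(i,1)}_{\omega,n} + V^{(i,2)}_{\omega,n},
\]
where
\[
V^{(i,1)}_\omega = \int_{C(0,n+2l)} V_1(x - y)\, m(\omega, dy) \quad\text{and}\quad V^{(i,2)}_\omega = \int_{C(0,n+2l)} V_2(x - y)\, m(\omega, dy) .
\]
$V^{(i,2)}_{\omega,n}$ contains all the local singularities of $V^{(i)}_{\omega,n}$ and, as $V_1$ is exponentially decaying, there exists $C > 0$ such that for $\omega \in \Omega^2_E$ satisfying (4.19), we have
\[
\|V^{(i,1)}_{\omega,n}\|_\infty \le Ck^{1+\varepsilon\nu^\dagger} .
\]
Hence (4.22) gives
\[
\langle H_{\omega,n,\theta}\phi, \phi\rangle \ge \langle -\Delta\phi, \phi\rangle + \langle V^{(i,2)}_{\omega,n}\phi, \phi\rangle - E^\rho e^{-(\log E)^\beta/C} - Ck^{1+\varepsilon\nu^\dagger} .
\tag{4.23}
\]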
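Consider now a periodic partition of unity of the cube $C(0, n)$ of the form
\[
\mathbf{1}_{C(0,n)} = \sum_{\gamma\in\Gamma} \chi^2_\gamma,
\]
where the $\chi_\gamma$ are supported on cells of size roughly $k^{-\alpha^\dagger+\nu^\dagger\varepsilon/2}$. These cells are centered at the points of $\Gamma = \delta(k)\mathbb{Z}^d \cap C(0, n)$, where $\delta(k) = 1/[k^{\alpha^\dagger-\nu^\dagger\varepsilon/2}]$. Here $[\cdot]$ denotes the integer part. These functions can then be chosen so that, for some $C > 1$, we have
\[
\sup_{\gamma\in\Gamma} \|\nabla\chi_\gamma\|^2_\infty + \sup_{\gamma\in\Gamma} \|\Delta\chi_\gamma\|_\infty \le C\delta(k)^{-2} \le Ck^{2(\alpha^\dagger-\nu^\dagger\varepsilon/2)} .
\tag{4.24}
\]
We compute
\[
\begin{aligned}
\langle -\Delta\phi, \phi\rangle &= \sum_{\gamma\in\Gamma} \Bigl(\|\nabla(\chi_\gamma\phi)\|^2 + \langle |\nabla\chi_\gamma|^2\phi, \phi\rangle - 2\operatorname{Re}\langle \nabla(\chi_\gamma\phi), \phi\nabla\chi_\gamma\rangle\Bigr) \\
&\ge (1 - \varepsilon) \sum_{\gamma\in\Gamma} \|\nabla(\chi_\gamma\phi)\|^2 + \Bigl(1 - \frac{1}{\varepsilon}\Bigr) \sum_{\gamma\in\Gamma} \langle |\nabla\chi_\gamma|^2\phi, \phi\rangle \\
&\ge (1 - \varepsilon) \sum_{\gamma\in\Gamma} \|\nabla(\chi_\gamma\phi)\|^2 + C\Bigl(1 - \frac{1}{\varepsilon}\Bigr) k^{2(\alpha^\dagger-\nu^\dagger\varepsilon/2)} .
\end{aligned}
\tag{4.25}
\]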
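On the other hand
\[
\langle V^{(i,2)}_{\omega,n}\phi, \phi\rangle = \sum_{\gamma\in\Gamma} \langle V^{(i,2)}_{\omega,n}\chi_\gamma\phi, \chi_\gamma\phi\rangle .
\]
Set $\tilde V_i = W_iV_i$. As the $W_i$’s are of compact support, so are the $\tilde V_i$. Hence, for some $R_0$ positive, we have
\[
\begin{aligned}
\langle V^{(i,2)}_{\omega,n}\chi_\gamma\phi, \chi_\gamma\phi\rangle &= \sum_{\substack{1\le j\le q,\ u\ge1 \\ x_u(\omega)+x_j\in C(\gamma,R_0)}} \langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle \\
&= \sum_{\substack{1\le j\le q,\ u\ge1 \\ x_u(\omega)+x_j\in C(\gamma,k^{2\varepsilon-\alpha^\dagger})}} \langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle \\
&\qquad + \sum_{\substack{1\le j\le q,\ u\ge1 \\ x_u(\omega)+x_j\in C(\gamma,R_0)\setminus C(\gamma,k^{2\varepsilon-\alpha^\dagger})}} \langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle,
\end{aligned}
\]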
(4.26) where the (xu (ω))u≥1 are the support of the Poisson measure m(ω, dy). † For |γ − xu (ω) − xj | ≥ k ε−α and x in the support of χγ , one has † † † |τxu (ω)+xj (V˜j )(x)| ≤ Ck εν −α ν . †
We have assumed that, for any x ∈ C(0, n + 2l), one has m(ω, C(x, 1)) ≤ k 1+εν ; hence, we obtain X † † † † hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi ≥ −Ck −2εν +α ν +1+εν 1≤j ≤q, u≥1
†
xu (ω)+xj ∈C(γ ,R0 )\C(γ ,k 2ε−α )
= −Ck 2α
† −εν †
. (4.27)
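Hence, to estimate $\langle H_{\omega,n,\theta}\phi, \phi\rangle$ using Eqs. (4.25), (4.26) and (4.27), we only need to estimate
\[
A := \|\nabla(\chi_\gamma\phi)\|^2 + \frac{1}{1 - \varepsilon} \sum_{\substack{1\le j\le q,\ u\ge1 \\ x_u(\omega)\in K(\gamma,k^{2\varepsilon-\alpha^\dagger})}} \langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle .
\]
To this end, we notice that, as the $(x_j)_{1\le j\le q}$ are distinct, for $k$ sufficiently large, for any $\omega$, any $\gamma$ and any $u$, there is at most a single $1 \le j \le q$ such that $x_u(\omega) \in C(\gamma - x_j, k^{2\varepsilon-\alpha^\dagger})$. Hence, by (4.18), we get a partition
\[
Q := \{x_u;\ x_u(\omega) \in K(\gamma, k^{2\varepsilon-\alpha^\dagger})\} = \bigcup_{1\le j\le q} \{x_u;\ x_u(\omega) \in C(\gamma - x_j, k^{2\varepsilon-\alpha^\dagger})\} =: \bigcup_{1\le j\le q} Q_j .
\]
Set $m(\omega) = m(\omega, K(\gamma, k^{2\varepsilon-\alpha^\dagger}))$ and $q_j = \sharp Q_j$. We compute
\[
\begin{aligned}
A &= \|\nabla(\chi_\gamma\phi)\|^2 + \frac{1}{1 - \varepsilon} \sum_{j=1}^{q} \sum_{Q_j} \langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle \\
&= \sum_{j=1}^{q} \Bigl(\frac{q_j}{m(\omega)}\|\nabla(\chi_\gamma\phi)\|^2 + \frac{1}{1 - \varepsilon} \sum_{Q_j} \langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle\Bigr) \\
&= \sum_{j=1}^{q} \frac{1}{m(\omega)} \sum_{Q_j} \Bigl(\|\nabla(\chi_\gamma\phi)\|^2 + \frac{m(\omega)}{1 - \varepsilon}\langle \tau_{x_u(\omega)+x_j}(\tilde V_j)\chi_\gamma\phi, \chi_\gamma\phi\rangle\Bigr) \\
&\ge \sum_{j=1}^{q} \frac{q_j}{m(\omega)} \tilde E_j\Bigl(\frac{m(\omega)}{1 - \varepsilon}\Bigr)\|\chi_\gamma\phi\|^2,
\end{aligned}
\]
where $\tilde E_j(g)$ is the ground state of $-\Delta + g\tilde V_j$. In Sect. 5.4, we prove that the lowest of these ground states is asymptotic to $E(g)$ as $g \to +\infty$. Hence, for $k$ sufficiently large, by (4.19), we have
\[
A \ge -(1 + \varepsilon)E_-\Bigl(\frac{m(\omega)}{1 - \varepsilon}\Bigr)\|\chi_\gamma\phi\|^2 \ge -(1 + \varepsilon)E_-\Bigl(\frac{k}{1 - \varepsilon}\Bigr)\|\chi_\gamma\phi\|^2 .
\tag{4.28}
\]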
Combining this with Eqs. (4.25), (4.26) and (4.27), we end the proof of Lemma 4.4. □

4.3.3. The proof of Lemma 4.5. We first prove an asymptotic lower bound for $P(\tilde\Omega_1)$. Recall that $K$ is defined by (4.18). We notice that, for $E$ large enough,
\[
P(\{\omega\in\Theta_E :\ \exists x\in C(0,n+2l),\ m(\omega,K(x,k^{-(\alpha^\dagger-2\varepsilon)}))>k\})
\ \ge\ P(\{\omega\in\Theta_E :\ \exists x\in C(0,n+2l),\ m(\omega,C(x,k^{-(\alpha^\dagger-2\varepsilon)}))>k\}).
\]
Hence, using the proof of the lower estimate in Lemma 4.3, we get
\[
\liminf_{E\to+\infty}\frac{\log P(\tilde\Omega_1)}{(1+(\alpha^\dagger-2\varepsilon)d)\,k\log k}\ \ge\ -1 .
\tag{4.29}
\]
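Let us prove the asymptotic upper bound for $P(\tilde\Omega_1)$. We partition $C(0,n+3l)$ into cubes of side length $k^{-(\alpha^\dagger-2\varepsilon)}$, the cubes being indexed by $\Xi_k = k^{-(\alpha^\dagger-2\varepsilon)}\mathbb{Z}^d\cap C(0,n+3l)$. For $1\le i\le q$, let $\gamma_i$ be the projection of $x_i$ on the lattice $k^{-(\alpha^\dagger-2\varepsilon)}\mathbb{Z}^d$. Then, for any $x\in C(0,n+2l)$, there exists $\gamma\in\Xi_k$ such that
\[
K(x,k^{-(\alpha^\dagger-2\varepsilon)}) \subset \bigcup_{1\le i\le q} C(\gamma-\gamma_i,4k^{-(\alpha^\dagger-2\varepsilon)}).
\]
Hence
\[
\begin{aligned}
P(\{\omega\in\Theta_E :\ \exists x\in C(0,n+2l),\ m(\omega,K(x,k^{-(\alpha^\dagger-2\varepsilon)}))>k\})
&\le \sum_{\gamma\in\Xi_k} P\Big(\Big\{\omega\in\Theta_E;\ m\Big(\omega,\bigcup_{1\le i\le q}C(\gamma_i+\gamma,4k^{-(\alpha^\dagger-2\varepsilon)})\Big)>k\Big\}\Big)\\
&\le \big((n+3l)k^{(\alpha^\dagger-2\varepsilon)}\big)^d\, P\Big(\Big\{\omega\in\Theta_E :\ m\Big(\omega,\bigcup_{1\le i\le q}C(\gamma_i,4k^{-(\alpha^\dagger-2\varepsilon)})\Big)>k\Big\}\Big).
\end{aligned}
\]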
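As $k>E^\delta$ and as $n$ and $l$ are at most of polynomial size in $E$, this gives
\[
\limsup_{E\to+\infty}\frac{\log P(\tilde\Omega_1)}{(1+(\alpha^\dagger-2\varepsilon)d)\,k\log k}\ \le\ -1 .
\]
Combined with (4.29), we get
\[
\log P(\tilde\Omega_1)\ \underset{E\to+\infty}{\sim}\ -(1+(\alpha^\dagger-2\varepsilon)d)\,k\log k .
\tag{4.30}
\]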
On the other hand, Lemma 4.3 tells us that
\[
\log P(\tilde\Omega_2)\ \underset{E\to+\infty}{\sim}\ -(1+\varepsilon\nu^\dagger)\,k^{1+\varepsilon\nu^\dagger}\log k .
\]
So that $P(\tilde\Omega_2)$ is negligible with respect to $P(\tilde\Omega_1)$. In view of (4.30), this completes the proof of Lemma 4.5. □

5. Appendix

5.1. The structure of the Poisson potential. Let $V\in L^p(\mathbb{R}^d)$ be a potential satisfying assumptions H1, H2 and H3. Let $m(\omega,dx)$ be a Poisson measure of concentration $\mu$. Consider the potential $V_\omega$ defined by (1.1). One has
Lemma 5.1. For any $\alpha\in(0,1)$, $p'<p$ and $0<b<1$, there exists $C>0$, such that $\omega$-almost surely,
\[
V_\omega\in\bigcup_{k\ge 1}\Omega_k(\alpha,b,p'),
\]
where, for $k\ge 1$, $\Omega_k(\alpha,b,p')$ is the set of measurable functions $V:\mathbb{R}^d\to\mathbb{C}$ that can be written in the form $V=V^{(r)}+V^{(i)}$, where $V^{(r)}$ and $V^{(i)}$ satisfy
1. $|V^{(r)}(x)|\le Ck^{\frac{p}{p-p'}}\,b^{\frac{-p'}{p-p'}}(|x|^\alpha+1)$ for $x\in\mathbb{R}^d$,
2. $V^{(i)}\in L^{p'}_{\mathrm{loc,unif}}$ and $\|V^{(i)}\|_{L^{p'}_{\mathrm{loc,unif}}}\le b$ (here $\|\cdot\|_{L^{p}_{\mathrm{loc,unif}}}=\sup_{x\in\mathbb{R}^d}\|\cdot\|_{L^p(C(x,1))}$).
Moreover, there exists $C>0$ such that
\[
P(\{V_\omega\notin\Omega_k(\alpha,b,p')\})\le\frac{(C\mu)^k}{k!}.
\tag{5.1}
\]
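Proof. Fix $\alpha$, $p'$ and $b$ as in the lemma. Pick $0<\alpha'<\alpha$. As $V_2$ is compactly supported, there exists $r_0>1$ such that $\operatorname{supp}V_2\subset C(0,r_0)$. Let $\Gamma=r_0\mathbb{Z}^d$. Pick $k>0$. Define $P_{k,\alpha'}:=P(\{\exists\gamma\in\Gamma;\ m(\omega,C(\gamma,r_0))\ge k(|\gamma|^{\alpha'}+1)\})$. Using (4.16), we compute
\[
\begin{aligned}
P_{k,\alpha'} &\le \sum_{\gamma\in\Gamma}P(\{\omega;\ m(\omega,C(\gamma,r_0))\ge k(|\gamma|^{\alpha'}+1)\})\\
&\le \sum_{\gamma\in\Gamma}\frac{(\mu r_0^d)^{k(|\gamma|^{\alpha'}+1)}}{(k(|\gamma|^{\alpha'}+1))!}\\
&\le C\,\frac{(\mu r_0^d)^k}{k!}\quad\text{for some } C>0 .
\end{aligned}
\tag{5.2}
\]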
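The step leading to (5.2) relies on a Poisson tail bound. Estimate (4.16) is not reproduced in this section, so the following is only a sketch under the assumption that it is of the form $P(N\ge k)\le\lambda^k/k!$ for a Poisson variable $N$ of mean $\lambda$ (an elementary bound that does hold). The function names are illustrative and not taken from the paper.

```python
import math

def poisson_tail(lam, k, terms=400):
    """P(N >= k) for N ~ Poisson(lam), summed term by term.

    Logarithms are used so that large factorials do not overflow.
    The series is truncated after `terms` terms (assumption: enough for
    the small parameters used here).
    """
    total = 0.0
    for n in range(k, k + terms):
        total += math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
    return total

def crude_bound(lam, k):
    """The bound lam^k / k! assumed to play the role of (4.16)."""
    return math.exp(k * math.log(lam) - math.lgamma(k + 1))

if __name__ == "__main__":
    for lam in (0.5, 2.0, 7.5):
        for k in (1, 5, 20):
            print(lam, k, poisson_tail(lam, k), crude_bound(lam, k))
```

In (5.2) the role of $\lambda$ is played by $\mu r_0^d$ and that of $k$ by $k(|\gamma|^{\alpha'}+1)$; the factorial decay is what makes the sum over $\gamma\in\Gamma$ converge.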
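We write
\[
V_\omega = V_{1,\omega}+V_{2,\omega},
\tag{5.3}
\]
where
\[
V_{1,\omega}(x)=\int_{\mathbb{R}^d}V_1(x-y)\,dm(\omega,y)\quad\text{and}\quad V_{2,\omega}(x)=\int_{\mathbb{R}^d}V_2(x-y)\,dm(\omega,y).
\]
Now assume that $\omega$ is such that
\[
\forall\gamma\in\Gamma,\quad m(\omega,C(\gamma,r_0))<k(|\gamma|^{\alpha'}+1),
\tag{5.4}
\]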
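and let $(x_n(\omega))_{n\in\mathbb{N}}$ be the points in the support of $m(\omega,dx)$. We will estimate $V_{1,\omega}$ and $V_{2,\omega}$ separately. $V_{1,\omega}$ can be estimated as in [9]. One has
\[
\begin{aligned}
|V_{1,\omega}(x)| &\le \sum_{\beta\in r_0\mathbb{Z}^d}\ \sum_{x_n(\omega)\in C(\beta,r_0)}|V_1(x-x_n(\omega))|\\
&\le C\sum_{\beta\in r_0\mathbb{Z}^d}e^{-|x-\beta|/C}\,m(\omega,C(\beta,r_0))\\
&\le C\sum_{\beta\in r_0\mathbb{Z}^d}e^{-|x-\beta|/C}\,k(|\beta|^{\alpha'}+1)\\
&\le Ck(|x|^{\alpha'}+1).
\end{aligned}
\tag{5.5}
\]
As $V_2\in L^p(\mathbb{R}^d)$, for $K\ge 1$, we have
\[
\Big(\int_{\mathbb{R}^d}1_{|V_2(x)|\ge K}|V_2(x)|^{p'}dx\Big)^{1/p'}
=\Big(\int_{\mathbb{R}^d}1_{|V_2(x)|\ge K}|V_2(x)|^{p}\,|V_2(x)|^{p'-p}dx\Big)^{1/p'}
\le K^{\frac{p'-p}{p'}}\|V_2\|_{L^p}^{p/p'} .
\tag{5.6}
\]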
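Let us introduce a notation: take a function $W$ and decompose it into $W=W_K^{(r)}+W_K^{(i)}$, where
\[
W_K^{(r)}:=1_{|W(x)|\le K}W\quad\text{and}\quad W_K^{(i)}:=1_{|W(x)|>K}W .
\tag{5.7}
\]
Fix a sequence of positive numbers $(K_\gamma)_{\gamma\in\Gamma}$. As $\operatorname{supp}V_2\subset C(0,r_0)$, $V_{2,\omega}$ can be rewritten as
\[
V_{2,\omega}=V_{2,\omega}^{(i)}+V_{2,\omega}^{(r)},
\tag{5.8}
\]
where we define
\[
V_{2,\omega}^{(i,r)}(x):=\sum_{\gamma\in\Gamma}1_{C(\gamma,r_0)}(x)\sum_{x_n(\omega)\in C(\gamma,2r_0)}(V_2)^{(i,r)}_{K_\gamma}(x-x_n(\omega)).
\]
Pick $a>0$, $K_\gamma=a(|\gamma|^\beta+1)$ and set $\beta=\frac{p'}{p}\alpha$, $\alpha'=\frac{p-p'}{p}\alpha$. Then, for some $C>0$, by (5.6), we have
\[
\|1_{C(\gamma,r_0)}V_{2,\omega}^{(i)}\|_{L^{p'}}\le Cm(\omega,C(\gamma,2r_0))\big(a(|\gamma|^\beta+1)\big)^{\frac{p'-p}{p'}}
\le Cka^{\frac{p'-p}{p'}}\big(|\gamma|^{\alpha'+\beta\frac{p'-p}{p'}}+1\big)\le 2Cka^{\frac{p'-p}{p'}},
\tag{5.9}
\]
and
\[
\|1_{C(\gamma,r_0)}V_{2,\omega}^{(r)}\|_\infty\le Cka(|\gamma|^{\alpha'+\beta}+1)\le Cka(|\gamma|^{\alpha}+1).
\tag{5.10}
\]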
Pick $a=(2^{-d}b)^{\frac{p'}{p'-p}}(Ck)^{\frac{p'}{p-p'}}$ (where $C$ is the constant appearing in (5.9)). As $r_0>1$, we have $\|V_{2,\omega}^{(i)}\|_{L^{p'}_{\mathrm{loc,unif}}}\le 2^d\sup_{\gamma\in\Gamma}\|1_{C(\gamma,r_0)}V_{2,\omega}^{(i)}\|_{L^{p'}}$. Hence, by (5.9), we get
\[
\|V_{2,\omega}^{(i)}\|_{L^{p'}_{\mathrm{loc,unif}}}\le b .
\tag{5.11}
\]
Moreover, by (5.10), for some $C>0$, for $x\in\mathbb{R}^d$, we have
\[
|V_{2,\omega}^{(r)}(x)|\le Ck^{\frac{p}{p-p'}}\,b^{\frac{-p'}{p-p'}}(|x|^\alpha+1).
\tag{5.12}
\]
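The exponents in (5.9) to (5.12) are easy to lose track of. The following sketch checks the bookkeeping with exact rational arithmetic; the function name and the sample values of $p$, $p'$ and $\alpha$ are illustrative assumptions, and multiplicative constants (such as $C$, $2$ and $2^{-d}$) are deliberately ignored.

```python
from fractions import Fraction

def exponent_bookkeeping(p, p_prime, alpha):
    """Track powers of |gamma|, k and b in (5.9)-(5.12); constants are ignored."""
    p, pp, alpha = Fraction(p), Fraction(p_prime), Fraction(alpha)
    assert 0 < pp < p
    beta = pp / p * alpha               # beta = (p'/p) alpha
    alpha_prime = (p - pp) / p * alpha  # alpha' = ((p - p')/p) alpha
    theta = (pp - p) / pp               # power to which a is raised in (5.9)
    a_exp_b = pp / (pp - p)             # power of b in the chosen a
    a_exp_k = pp / (p - pp)             # power of k in the chosen a
    return {
        "alpha_prime_plus_beta": alpha_prime + beta,          # exponent of |gamma| in (5.10)
        "gamma_exponent_5_9": alpha_prime + beta * theta,     # exponent of |gamma| in (5.9)
        "k_exponent_5_9": 1 + a_exp_k * theta,                # power of k in k * a^theta
        "b_exponent_5_9": a_exp_b * theta,                    # power of b in k * a^theta
        "k_exponent_5_12": 1 + a_exp_k,                       # power of k in k * a
        "b_exponent_5_12": a_exp_b,                           # power of b in k * a
    }

if __name__ == "__main__":
    for name, value in exponent_bookkeeping(5, 3, Fraction(1, 2)).items():
        print(name, value)
```

With this choice of $\beta$ and $\alpha'$ the power of $|\gamma|$ in (5.9) vanishes, $ka^{(p'-p)/p'}$ is proportional to $b$, and $ka$ is proportional to $k^{p/(p-p')}b^{-p'/(p-p')}$, which is the prefactor in (5.12).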
Now putting together Eqs. (5.3), (5.5), (5.8), (5.11) and (5.12), we see that if $\omega$ is such that $\forall\gamma\in\Gamma$, $m(\omega,C(\gamma,r_0))<k(|\gamma|^{\alpha'}+1)$, then $V_\omega\in\Omega_k(\alpha,b,p')$. Taking into account Eq. (5.2), we get Lemma 5.1. □

5.2. An a-priori estimate on the density of states. Recall that $N$ denotes the integrated density of states of $H_\omega$. We prove

Lemma 5.2. Under assumptions H1, H2 and H3, there exists $C>0$ such that, for $E>1$, we have
\[
\log N(-E)\le-\frac{1}{C}E^{\frac{2p-d}{2p}}\log E,
\tag{5.13}
\]
and, for any $n\ge 1$ and $E>1$, we have
\[
\log\mathbb{E}(N_{\omega,n}(-E))\le-\frac{1}{C}E^{\frac{2p-d}{2p}}\log E .
\tag{5.14}
\]
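Proof. Equations (5.14) and (5.13) are proved along the same lines. We will only write the details for (5.13). By [21], we know that, for any cube $\Lambda$ and $E>0$, we have
\[
\begin{aligned}
N(-E)&\le\mathbb{E}\Big(\frac{1}{\mathrm{Vol}(\Lambda)}\#\{\text{eigenvalues of }H^N_{\omega,\Lambda}\text{ smaller than or equal to }-E\}\Big)\\
&=\frac{1}{\mathrm{Vol}(\Lambda)}\sum_{n\in\mathbb{N}}\mathbb{E}\Big(1_{\{\omega;\ m(\omega,\Lambda)=n\}}\#\{\text{eigenvalues of }H^N_{\omega,\Lambda}\text{ smaller than or equal to }-E\}\Big),
\end{aligned}
\tag{5.15}
\]
where $H^N_{\omega,\Lambda}$ denotes the restriction of $H_\omega$ to $\Lambda$ with the Neumann boundary conditions. One has the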
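Lemma 5.3. Pick $p>p(d)$ ($p(d)$ is defined in assumption H2). If $V\in L^p(\mathbb{R}^d)$ then, for $\alpha>\frac{d}{2p}$ and $E>0$, $(E-\Delta)^{-\alpha/2}V(E-\Delta)^{-\alpha/2}$ is bounded on $L^2(\mathbb{R}^d)$ and for some $C_{\alpha,p}>0$, one has
\[
\|(E-\Delta)^{-\alpha/2}V(E-\Delta)^{-\alpha/2}\|_{\mathcal{L}(L^2)}\le C_{\alpha,p}\|V\|_{L^p(\mathbb{R}^d)}E^{-\alpha+d/2p}.
\tag{5.16}
\]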
Proof. Using classical properties of the Fourier transform from $L^p$ to $L^q$ ($q^{-1}+p^{-1}=1$ and $1\le p\le 2$) (see e.g. [6]), one shows $(E-\Delta)^{-\alpha/2}$ is bounded from $L^2(\mathbb{R}^d)$ to $L^r(\mathbb{R}^d)$ for $\frac{1}{r}<\frac{1}{2}+\frac{\alpha}{d}$ with the bound
\[
\|(E-\Delta)^{-\alpha/2}\|_{L^2\to L^r}\le C_{\alpha,r}E^{-\frac{\alpha}{2}+\frac{d(2-r)}{4r}}.
\]
One concludes using duality and Hölder's inequality. □

Lemma 5.3 admits an immediate corollary.
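Corollary 5.1. Pick $p>p(d)$. Let $V\in L^p_{\mathrm{loc,unif}}(\mathbb{R}^d)$ be such that $\|V^{(i)}\|_{L^p_{\mathrm{loc,unif}}}\le 1$. Then, there exists $C>0$ such that, for $\phi\in H^1(\mathbb{R}^d)$ and $\epsilon>0$, we have
\[
\langle|V|\phi,\phi\rangle\le\epsilon\|\nabla\phi\|^2_{L^2(\mathbb{R}^d)}+C\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\mathbb{R}^d)}.
\tag{5.17}
\]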
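We now continue the proof of Lemma 5.2. The estimate (5.17) can be carried over to $H^1(\Lambda)$ in the following way. Let $\phi\in H^1(\Lambda)$; we can extend $\phi$ to $\mathbb{R}^d$ and we denote this extension by $\tilde\phi$ (see [29,32]). By definition, $\tilde\phi|_\Lambda=\phi$. Moreover we know that, for some $C_0>0$,
\[
\|\tilde\phi\|_{L^2(\mathbb{R}^d)}\le C_0\|\phi\|_{L^2(\Lambda)}\quad\text{and}\quad\|\nabla\tilde\phi\|_{L^2(\mathbb{R}^d)}\le C_0\|\nabla\phi\|_{L^2(\Lambda)}.
\tag{5.18}
\]
Pick $x_0\in\mathbb{R}^d$ and define $V_{x_0}(x)=V(x-x_0)$. Then, for $\phi\in H^1(\Lambda)$ and $\epsilon>0$, by (5.17),
\[
\begin{aligned}
\langle|V_{x_0}|\phi,\phi\rangle_{L^2(\Lambda)}&=\langle|V_{x_0}|1_\Lambda\tilde\phi,\tilde\phi\rangle_{L^2(\mathbb{R}^d)}\\
&\le\epsilon\|\nabla\tilde\phi\|^2_{L^2(\mathbb{R}^d)}+C\epsilon^{\frac{d}{d-2p}}\|\tilde\phi\|^2_{L^2(\mathbb{R}^d)}\\
&\le C_0\epsilon\|\nabla\phi\|^2_{L^2(\Lambda)}+C_0C\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\Lambda)}.
\end{aligned}
\]
Hence, there exists $C>0$ such that, for any $x_0\in\mathbb{R}^d$, $\epsilon>0$ and $\phi\in H^1(\Lambda)$, we have
\[
\langle|V_{x_0}|\phi,\phi\rangle_{L^2(\Lambda)}\le\epsilon\|\nabla\phi\|^2_{L^2(\Lambda)}+C\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\Lambda)}.
\tag{5.19}
\]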
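Now pick $\omega$ such that $m(\omega,\Lambda)=n$. By (5.19), for $\phi\in H^1(\Lambda)$, we have
\[
\langle V_\omega\phi,\phi\rangle_{L^2(\Lambda)}\ge-n\epsilon\|\nabla\phi\|^2_{L^2(\Lambda)}-Cn\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\Lambda)}.
\]
Pick $\epsilon=\frac{1}{2n}$ to get
\[
\langle(-\Delta+V_\omega)\phi,\phi\rangle_{L^2(\Lambda)}\ge\frac{1}{2}\|\nabla\phi\|^2_{L^2(\Lambda)}-Cn^{\frac{2p}{2p-d}}\|\phi\|^2_{L^2(\Lambda)}.
\tag{5.20}
\]
Hence
\[
H^N_{\omega,\Lambda}\ge-Cn^{\frac{2p}{2p-d}}.
\tag{5.21}
\]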
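As a direct consequence of (5.20), we get that
\[
\begin{aligned}
\#\{\text{eigenvalues of }H^N_{\omega,\Lambda}\text{ smaller than or equal to }-E\}
&\le\#\{\text{eigenvalues of }-\Delta^N_\Lambda\text{ smaller than or equal to }Cn^{\frac{2p}{2p-d}}-E\}\\
&\le C\,\mathrm{Vol}(\Lambda)\,n^{\frac{2p}{2p-d}}.
\end{aligned}
\tag{5.22}
\]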
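Plugging this into (5.15), for some $C>0$ (depending on $\Lambda$), we obtain
\[
\begin{aligned}
N(-E)&\le\frac{C}{\mathrm{Vol}(\Lambda)}\sum_{n\ge E^{\frac{2p-d}{2p}}/C}\mathbb{E}\big(1_{\{\omega;\ m(\omega,\Lambda)=n\}}\big)\mathrm{Vol}(\Lambda)\,n^{\frac{2p}{2p-d}}\\
&\le C\sum_{n\ge E^{\frac{2p-d}{2p}}/C}n^{\frac{2p}{2p-d}}\,\frac{(\mu\mathrm{Vol}(\Lambda))^n}{n!}\\
&\le C\,\frac{E\,(\mu\mathrm{Vol}(\Lambda))^{E^{\frac{2p-d}{2p}}/C}}{\big(E^{\frac{2p-d}{2p}}/C\big)!}.
\end{aligned}
\]
This ends the proof of Lemma 5.2. □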
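5.3. Exponential decay estimates. One has

Lemma 5.4. Let $\alpha\in(0,1)$, $p>p(d)$ ($p(d)$ is defined in assumption H2) and $q>d/2$. Pick $\chi\in C_0^\infty(\mathbb{R}^d)$ such that $0\le\chi\le1$, $\chi\equiv1$ on $C(0,1/2)$ and $\chi\equiv0$ outside of $C(0,3/2)$. Then, there exist $C_{\alpha,p,q,\chi}>0$ and $\epsilon_{\alpha,p,q,\chi}>0$ such that, for any $V$ of the form $V=V^i+V^r$, where
• $V^i\in L^p_{\mathrm{loc,unif}}$ and $\|V^i\|_{L^p_{\mathrm{loc,unif}}}\le b$ (for some $b>0$),
• for some $K>0$, $V^r$ satisfies $|V^r(x)|\le K(|x|^\alpha+1)$ for all $x\in\mathbb{R}^d$,
there exists $C_b>0$ such that, for any $(\gamma,\gamma')\in\mathbb{Z}^d\times\mathbb{Z}^d$ and $z\in\mathbb{C}\setminus\mathbb{R}$, one has
\[
\|\chi_\gamma(z-(-\Delta+V))^{-1}\chi_{\gamma'}\|_{T_q}\le C_{\alpha,p,q,\chi}\frac{(1+|\gamma'|)^\alpha}{\eta(z,K,b)}\cdot e^{-\epsilon_{\alpha,p,q,\chi}\cdot\eta(z,K,b)|\delta_\alpha(\gamma)-\delta_\alpha(\gamma')|},
\tag{5.23}
\]
\[
\|\chi_\gamma(z-(-\Delta+V))^{-1}\chi_{\gamma'}\|_{\mathcal{L}(L^2,H^1)}\le C_{\alpha,p,q,\chi}\frac{(1+|\gamma|)^\alpha}{\eta(z,K,b)}\cdot e^{-\epsilon_{\alpha,p,q,\chi}\cdot\eta(z,K,b)|\delta_\alpha(\gamma)-\delta_\alpha(\gamma')|},
\tag{5.24}
\]
where $\delta_\alpha(x)=(1+x^2)^{(1-\alpha)/2}$ and $\eta(z,K,b)=\dfrac{|\mathrm{Im}\,z|}{|z|+K+C_b}$. Here $\chi_\gamma(\cdot)=\chi(\cdot-\gamma)$, $\|\cdot\|_{T_q}$ denotes the norm in the $q$th Schatten class, $H^1(\mathbb{R}^d)$ is the usual Sobolev space $H^1(\mathbb{R}^d)=(1-\Delta)^{-1/2}L^2(\mathbb{R}^d)$ and $\mathrm{dist}(z,z')$ denotes the distance in $\mathbb{C}$.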
Proof. Up to small modifications, the proof of this result is the same as the proof of Lemma 4.1 in [9]. Let us just say that, in order to prove (5.24), we use the fact that $(1+x^2)^{\alpha/2}\nabla(z-(-\Delta+V))^{-1}$ is bounded on $L^2(\mathbb{R}^d)$ for $\mathrm{Im}\,z\ne0$ (see [26]). □
5.4. Some useful facts about the single site potential Hamiltonian $H_g$. Recall that, for $g\in\mathbb{R}$, in Sect. 1.2, we have defined $H(g)=-\Delta+gV$ and $-E_-(g)$ to be the infimum of the spectrum of $H(g)$. It is well known that, for $g$ large enough, $-E_-$ is a simple eigenvalue ([23,26]); hence it is analytic in $g$. Moreover, $E_-$ is convex (by the variational principle, $-E_-$ is an infimum of functions affine in $g$). Its first and second derivatives are positive. Let $\varphi_g$ be the unique positive normalized ground state associated to $-E_-(g)$. Then the eigenvalue equation gives
\[
\|\nabla\varphi_g\|^2+g\langle V\varphi_g,\varphi_g\rangle=-E_-(g).
\]
Hence $E_-'(g)=-\langle V\varphi_g,\varphi_g\rangle$; so that $E_-(g)$ satisfies
\[
\|\nabla\varphi_g\|^2+E_-(g)=gE_-'(g).
\tag{5.25}
\]
Let $g$ be an inverse of $E_-$ in a neighborhood of $+\infty$. Then, one has the following

Lemma 5.5. Let $V\in L^p(\mathbb{R}^d)$ where $p$ is chosen as in assumption H2. Then, there exists $C>0$ such that, for $g$ and $E$ sufficiently large, one has
\[
\frac{1}{C}\,g\le E_-(g)\le Cg^{\frac{2p}{2p-d}}.
\tag{5.26}
\]
\[
\frac{E_-(g)}{g}\text{ is bounded if and only if }V_-\text{ is bounded. In this case, we have } E_-(g)\underset{g\to+\infty}{\sim}\|V_-\|_\infty\,g .
\tag{5.27}
\]
\[
\exists R>0\text{ such that }\int_{|x|>R}|\varphi_g|^2dx+\int_{|x|>R}|\nabla\varphi_g|^2dx\to0\text{ as }g\to+\infty .
\tag{5.28}
\]
\[
\|\nabla\varphi_g\|^2\le Cg^{\frac{2p}{2p-d}}.
\tag{5.29}
\]
\[
\frac{1}{C}E^{\frac{2p-d}{2p}}\le g(E)\le CE .
\tag{5.30}
\]
\[
\exists C>0,\ E_0>0\text{ such that }\forall a>0,\ \forall E>E_0,\text{ one has } g(E)\le g(E+a)\le g(E)+Ca .
\tag{5.31}
\]
Proof. The two sided bound (5.26) is an immediate corollary of (5.17) and of assumption H3. The definition of $g$ and (5.26) give (5.30). The proof of (5.27) is easy and left to the reader. By Eqs. (5.25) and (5.26), for $g$ large enough, we have
\[
E_-'(g)\ge\frac{1}{g}E_-(g)\ge\frac{1}{C}.
\]
Integrating this relation, we get
\[
E_-(g+k)-E_-(g)\ge\frac{k}{C}.
\]
Hence, as $g$ is increasing, for $E$ large enough and $a>0$, we have
\[
g(E)\le g(E+a)\le g(E_-(g(E)+Ca))=g(E)+Ca .
\]
This completes the proof of (5.31). Let us now prove (5.29). As $E_-'$ is increasing, for $\varepsilon>0$, we have
\[
E_-(g(1+\varepsilon))-E_-(g)\ge\varepsilon gE_-'(g)\ge\varepsilon\|\nabla\varphi_g\|^2
\]
by Eq. (5.25). Equation (5.29) is then an immediate consequence of (5.26). Let us now prove (5.28). We will distinguish two cases: when $V_-$ is bounded and when it is not. Let us start with assuming $V_-$ bounded and let $v_-$ be its essential infimum. Then, by (5.27), as $g\to\infty$, we have
\[
\|\nabla\varphi_g\|^2+g\int V|\varphi_g|^2dx=gv_-+o(g).
\]
Hence,
\[
0\le\int(V-v_-)|\varphi_g|^2dx\le o(1).
\tag{5.32}
\]
By assumptions H1 and H2, there exists $\delta>0$ such that for $|x|\ge1/\delta$, $V(x)-v_-\ge\delta$. Then, as $V-v_-$ is non negative, Eq. (5.32) tells us that
\[
\int_{|x|\ge1/\delta}|\varphi_g|^2dx\le o(1).
\]
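If $V$ is not bounded from below, let $\chi$ be a $C_0^\infty$ cut-off for the cube $C(0,R)$. The eigenvalue equation for $\varphi_g$ gives us
\[
(-\Delta-E(g))[(1-\chi)\varphi_g]=gV(1-\chi)\varphi_g+2\nabla\chi\cdot\nabla\varphi_g+\Delta\chi\,\varphi_g .
\]
So that $(1-\chi)\varphi_g=a_1+a_2+a_3$, where
\[
a_1=g(-\Delta-E(g))^{-1}[V(1-\chi)\varphi_g],\quad a_2=2(\Delta+E(g))^{-1}[\nabla\chi\cdot\nabla\varphi_g],\quad a_3=(\Delta+E(g))^{-1}[\Delta\chi\,\varphi_g].
\]
For $R$ large enough, $V$ is bounded on the support of $1-\chi$; this and (5.27) imply that $\|a_1\|+\|a_3\|\to0$ as $g\to+\infty$. We write
\[
a_2=2\big((\Delta+E(g))^{-1}[\Delta,\nabla\chi]+\nabla\chi(\Delta+E(g))^{-1}\big)\nabla\varphi_g .
\]
As $(\Delta+E(g))^{-1}\nabla\to0$ in $L^2$-operator norm when $g\to+\infty$, we have $\|a_2\|\to0$ as $g\to+\infty$. Hence $\|(1-\chi)\varphi_g\|\to0$ as $g\to+\infty$. We compute
\[
\begin{aligned}
\|\nabla[(1-\chi)\varphi_g]\|^2&=2\langle(1-\chi)\nabla\varphi_g,\varphi_g\nabla\chi\rangle+\langle\Delta\chi\,\varphi_g,\chi\varphi_g\rangle\\
&=2\langle\nabla[(1-\chi)\varphi_g],\varphi_g\nabla\chi\rangle-2\langle(\nabla\chi)^2\varphi_g,\varphi_g\nabla\chi\rangle+\langle\Delta\chi\,\varphi_g,\chi\varphi_g\rangle .
\end{aligned}
\]
For $R$ sufficiently large, by what we have just proved, the last two terms in the equation above tend to 0 as $g\to+\infty$. Using the Cauchy–Schwarz inequality, this implies
\[
\|\nabla[(1-\chi)\varphi_g]\|^2\le2\|\nabla[(1-\chi)\varphi_g]\|\,\|\varphi_g\nabla\chi\|+o(1).
\]
As $\|\varphi_g\nabla\chi\|\to0$ when $g\to+\infty$, we get that $\|\nabla[(1-\chi)\varphi_g]\|\to0$ when $g\to+\infty$. This completes the proof of Eq. (5.28), hence of Lemma 5.5. □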
As a corollary of this lemma, we get

Lemma 5.6. Let $V\in L^p(\mathbb{R}^d)$ where $p$ is chosen as in assumption H2. Then, there exists $R_0>0$ such that, if $\chi\in C_0^\infty(\mathbb{R}^d)$, $0\le\chi\le1$, $\chi(x)=1$ if $|x|\le R_0$ and $\chi(x)=0$ if $|x|\ge2R_0$, then $\psi_g=\chi\cdot\varphi_g/\|\chi\cdot\varphi_g\|$ is an asymptotic ground state and we have
\[
\Big\{\alpha>0;\ \lim_{g\to+\infty}\ \sup_{|a|\le g^{-\alpha}}\frac{g|\langle(\tau_aV-V)\psi_g,\psi_g\rangle|}{E_-(g)}=0\Big\}\ne\emptyset .
\]
Proof. If we pick $\chi$ as above for $R_0$ large enough, by Lemma 5.5, we know that
\[
\|(1-\chi)\varphi_g\|+\|(1-\chi)\nabla\varphi_g\|\to0\text{ as }g\to+\infty .
\tag{5.33}
\]
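Then, as $\varphi_g$ is the ground state of $H(g)$, we compute
\[
\begin{aligned}
\langle(H(g)-E(g))\psi_g,\psi_g\rangle&=\langle(H(g)-E(g))(1-\chi)\varphi_g,(1-\chi)\varphi_g\rangle\\
&=-E(g)\|(1-\chi)\varphi_g\|^2+\langle\nabla\chi\nabla\varphi_g,(1-\chi)\varphi_g\rangle+\langle\Delta\chi\nabla\varphi_g,(1-\chi)\varphi_g\rangle .
\end{aligned}
\]
Equation (5.33) tells us that $\psi_g$ is an asymptotic ground state as $\|\chi\cdot\varphi_g\|\to1$ when $g\to+\infty$. Define $\phi_g=(E_-(g)-\Delta)^{1/2}\varphi_g$. Then, by (5.26) and (5.29), we have
\[
\|\phi_g\|^2\le Cg^{\frac{2p}{2p-d}}.
\]
Pick $a\in\mathbb{R}^d$ and write $\langle(\tau_aV-V)\varphi_g,\varphi_g\rangle=\langle\Gamma(g)\phi_g,\phi_g\rangle$, where $\Gamma(g)=(E_-(g)-\Delta)^{-1/2}(\tau_aV-V)(E_-(g)-\Delta)^{-1/2}$. We now estimate the norm of $\Gamma(g)$. We write
\[
\Gamma(g)=(\tau_a-1)(E_-(g)-\Delta)^{-1/2}V(E_-(g)-\Delta)^{-1/2}\tau_{-a}+(E_-(g)-\Delta)^{-1/2}V(E_-(g)-\Delta)^{-1/2}(\tau_{-a}-1).
\]
Hence, for $0<\delta<1/2$ such that $-1+2\delta+d/2p<0$, by Lemma 5.3, we have
\[
\begin{aligned}
\|\Gamma(g)\|&\le2\|(\tau_a-1)(E_-(g)-\Delta)^{-\delta}\|\,\|(E_-(g)-\Delta)^{-1/2+\delta}V(E_-(g)-\Delta)^{-1/2}\|\\
&\le C[E_-(g)]^{-1+\delta+d/2p}|a|^\delta(E_-(g))^{-\delta},
\end{aligned}
\]
writing $(\tau_a-1)(E_-(g)-\Delta)^{-\delta}$ as a Fourier multiplier. Hence
\[
\sup_{|a|\le g^{-\alpha}}\frac{g|\langle(\tau_aV-V)\varphi_g,\varphi_g\rangle|}{E_-(g)}\le Cg[E_-(g)]^{-2+d/2p}\,g^{-\alpha\delta+\frac{2p}{2p-d}}.
\]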
Using (5.26) and picking $\alpha$ large enough, we get
\[
\lim_{g\to+\infty}\ \sup_{|a|\le g^{-\alpha}}\frac{g|\langle(\tau_aV-V)\varphi_g,\varphi_g\rangle|}{E_-(g)}=0 .
\]
Now using Eqs. (5.26), (5.33) and the fact that, for $R_0$ large enough, $\tau_aV-V$ is bounded outside $\{|x|\le2R_0\}$, we get
\[
\lim_{g\to+\infty}\ \sup_{|a|\le g^{-\alpha}}\frac{g|\langle(\tau_aV-V)\psi_g,\psi_g\rangle|}{E_-(g)}=0 .
\]
This ends the proof of Lemma 5.6. □

To end this section we describe the ground state energy of $H(g)$ and an asymptotic ground state in the case when $V$ satisfies assumption H1''. For $1\le i\le q$, define $H^i_g=-\Delta+gV_i$, where $V_i$ is defined in Eq. (1.10). $H^i_g$ is a form bounded perturbation of $-\Delta$ with relative bound 0. By the homogeneity properties of $V_i$, when $E_i<0$ ($E_i$ is the ground state energy of $H^i_1$), it is obvious that the ground state energy of $H^i_g$ is given by
\[
E_i(g)=E_i\,g^{\frac{2}{2-\nu_i}}.
\tag{5.34}
\]
Define $E_\dagger(g)=\inf_i E_i(g)$. Moreover, if $\varphi_i$ is the ground state of $H^i_1$ then $\varphi_g^{(i)}$, the ground state of $H^i_g$, has the following form:
\[
\varphi_g^{(i)}(x)=g^{\frac{d}{2(2-\nu_i)}}\varphi_i\big(g^{\frac{1}{2-\nu_i}}x\big).
\tag{5.35}
\]
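As an illustration of (5.25), (5.34) and (5.35), one can take the explicitly solvable Coulomb case $V(x)=-1/|x|$ in $d=3$, i.e. $\nu=1$. This example is an assumption made here for illustration only and is not claimed to satisfy all hypotheses of the paper. The ground state of $-\Delta-g/|x|$ is $\varphi_g(x)=\sqrt{g^3/(8\pi)}\,e^{-g|x|/2}=g^{3/2}\varphi_1(gx)$, with energy $-g^2/4$. The sketch below checks these identities by radial quadrature; all function names are hypothetical.

```python
import math

def radial_integral(f, r_max, n=20000):
    """Composite Simpson rule for the integral of f over [0, r_max] (n even)."""
    h = r_max / n
    s = f(0.0) + f(r_max)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3

def coulomb_ground_state(g):
    """Return (mass, kinetic, potential) for phi_g(r) = sqrt(g^3/(8 pi)) exp(-g r / 2).

    mass      = ||phi_g||^2
    kinetic   = ||grad phi_g||^2
    potential = <V phi_g, phi_g> with V = -1/|x| (assumed example potential)
    """
    norm2 = g ** 3 / (8 * math.pi)
    phi2 = lambda r: norm2 * math.exp(-g * r)      # |phi_g(r)|^2
    r_max = 60.0 / g                               # tail beyond r_max is ~ e^{-60}
    mass = radial_integral(lambda r: phi2(r) * 4 * math.pi * r * r, r_max)
    # |grad phi_g|^2 = (g/2)^2 |phi_g|^2 for a radial exponential
    kinetic = radial_integral(lambda r: (g / 2) ** 2 * phi2(r) * 4 * math.pi * r * r, r_max)
    # the factor 1/r of V cancels one power of r in the volume element
    potential = radial_integral(lambda r: -phi2(r) * 4 * math.pi * r, r_max)
    return mass, kinetic, potential

def ground_energy(g):
    """Quadratic form of -Delta + g V on phi_g, i.e. -E_-(g) in the notation above."""
    _, kinetic, potential = coulomb_ground_state(g)
    return kinetic + g * potential

if __name__ == "__main__":
    for g in (1.0, 3.0):
        mass, kinetic, potential = coulomb_ground_state(g)
        print(g, mass, ground_energy(g), kinetic - ground_energy(g), g * (-potential))
```

Here $E_-(g)=g^2/4$, so $E_-'(g)=g/2=-\langle V\varphi_g,\varphi_g\rangle$ and both sides of (5.25) equal $g^2/2$, while (5.34) reads $E_-(g)=E_-(1)\,g^{2/(2-\nu)}$ with $\nu=1$.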
We have

Lemma 5.7. Assume $V$ satisfies assumptions H1 and H1''. Then, we have
\[
E(g)\underset{g\to+\infty}{\sim}E_\dagger(g).
\]
Moreover, if $1\le i_0\le q$ is such that $E_{i_0}(g)=E_\dagger(g)$, then, if $\chi_{i_0}$ is a $C_0^\infty$ cut-off function for a sufficiently small neighborhood of $x_{i_0}$, $\psi_g^{(i_0)}(\cdot)=\chi_{i_0}\varphi_g^{(i_0)}(\cdot-x_{i_0})/\|\chi_{i_0}\varphi_g^{(i_0)}(\cdot-x_{i_0})\|$ is an asymptotic ground state for $H(g)$.

Proof. Fix $\varepsilon>0$ small. As in Sect. 4.3, at the cost of adding a term to $V_1$, we may assume that the functions $W_i$ stay non-negative and smaller than $1+\varepsilon$ and that the supports of the $W_i$ are pairwise disjoint. For $1\le i\le q$, let $\chi_i$ be a $C_0^\infty$ cut-off function of a neighborhood of $x_i$ such that, on the support of $\chi_i$, the function $W_i(\cdot-x_i)$ stays larger than $1-\varepsilon$. Define $\chi_0^2=1-\sum_{1\le i\le q}\chi_i^2$. Pick $\varphi\in H^1(\mathbb{R}^d)$. Then, for any $\varepsilon>0$, we
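have
\[
\begin{aligned}
\langle H(g)\varphi,\varphi\rangle&=\sum_{i=0}^{q}\langle-\Delta\chi_i^2\varphi,\varphi\rangle+\sum_{i=0}^{q}g\langle V\chi_i^2\varphi,\varphi\rangle\\
&=\sum_{i=0}^{q}\Big(\|\nabla(\chi_i\varphi)\|^2-2\mathrm{Re}\langle\chi_i\nabla\varphi,\varphi\nabla\chi_i\rangle-\langle|\nabla\chi_i|^2\varphi,\varphi\rangle\Big)+\sum_{i=1}^{q}g\langle\tau_{x_i}(W_iV_i)\chi_i^2\varphi,\varphi\rangle+g\langle V_0\varphi,\varphi\rangle\\
&\ge(1-\varepsilon)\sum_{i=1}^{q}\Big(\|\nabla(\chi_i\varphi)\|^2+\frac{g}{1-\varepsilon}\langle\tau_{x_i}(W_iV_i)\chi_i\varphi,\chi_i\varphi\rangle\Big)+g\langle V_0\varphi,\varphi\rangle+\Big(1-\frac{1}{\varepsilon}\Big)\sum_{i=0}^{q}\|\nabla\chi_i\cdot\varphi\|^2,
\end{aligned}
\tag{5.36}
\]
where $V_0$ is some bounded potential (depending on $\varepsilon$). On the other hand, on the support of $\chi_i$, we have $W_iV_i\ge V_i-\varepsilon|V|_i$ where
\[
|V|_i(\cdot):=\frac{|h_i(\cdot)|}{|\cdot|^{\nu_i}}.
\]
Hence
\[
\begin{aligned}
\|\nabla(\chi_i\varphi)\|^2+\frac{g}{1-\varepsilon}\langle\tau_{x_i}(W_iV_i)\chi_i^2\varphi,\varphi\rangle
&\ge(1-\varepsilon)\Big(\|\nabla(\chi_i\varphi)\|^2+\frac{g}{1-\varepsilon}\langle\tau_{x_i}(V_i)\chi_i\varphi,\chi_i\varphi\rangle\Big)\\
&\quad+\varepsilon\Big(\|\nabla(\chi_i\varphi)\|^2-\frac{g}{1-\varepsilon}\langle\tau_{x_i}(|V|_i)\chi_i\varphi,\chi_i\varphi\rangle\Big).
\end{aligned}
\tag{5.37}
\]
By Eq. (5.37), we get that
\[
\|\nabla(\chi_i\varphi)\|^2+\frac{g}{1-\varepsilon}\langle\tau_{x_i}(W_iV_i)\chi_i^2\varphi,\varphi\rangle\ge(1+\varepsilon C)E_i\Big(\frac{g}{1-\varepsilon}\Big)\|\chi_i\varphi\|^2
\tag{5.38}
\]
for some constant $C$ (independent of $\varepsilon$) given by the lowest eigenvalue of $-\Delta-g|V|_i$. Here we used the fact that this eigenvalue has the same growth rate in $g$ as $E_i(g)$; indeed, in the present case, this growth rate only depends on the homogeneity properties of the potential $|V|_i$ (as can be seen by a scaling argument). Putting Eqs. (5.38) and (5.36) together, we obtain that there exists $C>0$ such that, for any $\varepsilon>0$ small and $\varphi\in H^1(\mathbb{R}^d)$, we have
\[
\langle H(g)\varphi,\varphi\rangle\ge\Big((1-\varepsilon)E_\dagger\Big(\frac{g}{1-\varepsilon}\Big)+O(g)\Big)\|\varphi\|^2 .
\]
This proves that
\[
\limsup_{g\to+\infty}\frac{E(g)}{E_\dagger(g)}\le1 .
\]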
Pick $1\le i_0\le q$ such that $E_{i_0}(g)=E_\dagger(g)$ and $\chi$ a $C_0^\infty$ cut-off function for a sufficiently small neighborhood of $x_{i_0}$. By the definition of $\varphi_g^{(i_0)}$, it is immediate that $\|\chi\varphi_g^{(i_0)}\|\to1$ as $g\to+\infty$. An immediate computation gives that
\[
\lim_{g\to+\infty}\frac{\langle H(g)\chi\varphi_g^{(i_0)},\chi\varphi_g^{(i_0)}\rangle}{E_{i_0}(g)}=1 .
\]
This implies that
\[
\liminf_{g\to+\infty}\frac{E(g)}{E_\dagger(g)}\ge1 .
\]
Hence, it completes the proof of Lemma 5.7. □

Acknowledgement. F. K. thanks the Erwin Schrödinger Institute (Vienna) where this work was partially done and A. Trouvé for interesting discussions about Poisson processes. The authors are grateful to the referee for his very careful reading of the paper and for his pertinent remarks that allowed them to correct a number of misprints and to make several important improvements.
References
1. Combes, J.M. and Hislop, P.D.: Localization for some continuous random hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994)
2. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin: Springer Verlag, 1987
3. Donsker, M. and Varadhan, S.R.S.: Asymptotics for the Wiener sausage. Commun. Pure and App. Math. 28, 525–565 (1975)
4. Efros, M. and Shlovski, B.: Electronic properties of doped semi-conductors. Heidelberg: Springer Verlag, 1984
5. Helffer, B. and Sjöstrand, J.: On diamagnetism and the De Haas-Van Alphen effect. Ann. de l'Institut Henri Poincaré, série Phys. Théor. 52, 303–375 (1990)
6. Hörmander, L.: The analysis of linear partial differential equations. I. Vol. 256 of Grundlehren der Mathematischen Wissenschaften, Berlin–Heidelberg–New York: Springer Verlag, 1990
7. Kirsch, W.: Random Schrödinger operators. In: A. Jensen, H. Holden, editors, Schrödinger Operators, Number 345 in Lecture Notes in Physics, Berlin: Springer Verlag, 1989
8. Klopp, F.: An asymptotic expansion for the density of states of a random Schrödinger operator with Bernoulli disorder. Random Operators and Stochastic Equations 3 (4), 315–332 (1995)
9. Klopp, F.: A low concentration asymptotic expansion for the density of states of a random Schrödinger operator with Poisson disorder. J. Funct. Anal. 145, 267–295 (1995)
10. Klopp, F.: Band edge behaviour for the integrated density of states of random Jacobi matrices in dimension 1. J. Stat. Phy. 90 (3–4), 927–947 (1998)
11. Klopp, F.: Internal Lifshits tails for random perturbations of periodic Schrödinger operators. Duke Math. J. 1999. To appear
12. Klopp, F.: Precise high energy asymptotics for the integrated density of states of an unbounded random Jacobi matrix. Rev. Math. Phys. 1999. To appear
13. Klopp, F. and Pastur, L.: In progress
14. Kuchment, P.: Floquet theory for partial differential equations. Vol. 60 of Operator Theory: Advances and Applications, Basel: Birkhäuser, 1993
15. Landau, L. and Lifshitz, L.: Mécanique quantique, théorie non-relativiste. Moscou: Editions MIR, 1966
16. Lifshitz, I.M.: Structure of the energy spectrum of impurity bands in disordered solid solutions. Sov. Phys. JETP 17, 1159–1170 (1963)
17. Lifshitz, I.M.: Energy spectrum structure and quantum states of disordered condensed systems. Sov. Phys. Uspekhi 7, 549–573 (1965)
18. Lifshitz, I.M., Gredeskul, S.A. and Pastur, L.A.: Introduction to the theory of disordered systems. New York: Wiley, 1988
19. Mather, J.N.: On Nirenberg's proof of Malgrange's preparation theorem. In: Proceedings of Liverpool Singularities-Symposium I, Number 192 in Lecture Notes in Mathematics, Berlin: Springer Verlag, 1971
20. Pastur, L.: Behaviour of some Wiener integrals as t → +∞ and the density of states of the Schrödinger equation with a random potential. Teor.-Mat.-Fiz 32, 88–95 (1977) (in Russian)
21. Pastur, L. and Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer Verlag, 1992
22. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol IV: Analysis of Operators. New York: Academic Press, 1978
23. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol I: Functional Analysis. New York: Academic Press, 1980
24. Shubin, M.A.: Spectral theory and index of elliptic operators with almost periodic coefficients. Russ. Math. Surv. 34, 109–157 (1979)
25. Simon, B.: Trace ideals and their applications. Cambridge: Cambridge University Press, 1979
26. Simon, B.: Schrödinger semigroups. Bull. Am. Math. Soc. 7, 447–526 (1982)
27. Simon, B.: Lifshitz tails for the Anderson model. J. Stat. Phys. 38, 65–76 (1985)
28. Sjöstrand, J.: Microlocal analysis for periodic magnetic Schrödinger equation and related questions. In: Microlocal analysis and applications, Vol. 1495 of Lecture Notes in Mathematics, Berlin: Springer Verlag, 1991
29. Stein, E.: Singular integrals and Differentiability properties of functions. Princeton, N.J.: Princeton University Press, 1970
30. Sznitman, A.: Lifshitz tails and Wiener sausages. I. J. Funct. Anal. 94, 223–246 (1990)
31. Sznitman, A.: Fluctuations of principal eigenvalues and random scales. Commun. Math. Phys. 189, 337–363 (1997)
32. Taylor, M.: Partial differential equations. New York–Berlin: Springer, 1996

Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 105 – 136 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Theta Functions and Hodge Numbers of Moduli Spaces of Sheaves on Rational Surfaces
Lothar Göttsche
International Center for Theoretical Physics, Strada Costiera 11, P.O. Box 586, 34100 Trieste, Italy. E-mail: [email protected]
Received: 26 August 1998 / Accepted: 10 March 1999
Abstract: We compute generating functions for the Hodge numbers of the moduli spaces of $H$-stable rank 2 sheaves on a rational surface $S$ in terms of theta functions for indefinite lattices. If $H$ lies in the closure of the ample cone and has self-intersection 0, it follows that the generating functions are Jacobi forms. In particular the generating functions for the Euler numbers can be expressed in terms of modular forms, and their transformation behaviour is compatible with the predictions of S-duality. We also express the generating functions for the signatures in terms of modular forms. It turns out that these generating functions are also (with respect to another developing parameter) the generating function for the Donaldson invariants of $S$ evaluated on all powers of the point class.

1. Introduction

Let $(S,H)$ be a rational algebraic surface with an ample divisor. We assume that $K_SH\le0$. In the current paper we want to compute the Betti numbers and Hodge numbers of the moduli spaces $M_S^H(C,d)$ of $H$-semistable torsion-free sheaves of rank 2 on $S$. In [V-W] Vafa and Witten made a number of predictions about the Euler numbers of moduli spaces of sheaves on algebraic surfaces: in many cases their generating functions should be given by modular forms. In the case of rational surfaces this cannot be true for all polarizations $H$: the moduli spaces and their Euler numbers depend on $H$, and this dependence is not compatible with the modularity properties. We study the limit of the generating function for the Euler numbers as $H$ approaches a point $F$ on the boundary of the ample cone with $F^2=0$ (see below for the definitions). It turns out that this limit is indeed a (quasi)-modular form (see Sect. 2.3).

More generally we will relate the generating functions for the Hodge numbers and Betti numbers of the $M_S^H(C,d)$ to certain theta functions of indefinite lattices, which were introduced and studied in [G-Z] in order to show structural results about Donaldson invariants. That the Euler numbers and signatures are given by modular and quasimodular
forms follows then from the fact that these theta functions are Jacobi forms. As in [G-Z], where the Donaldson invariants were studied, the theta functions enter the calculations by summing over walls. The ample cone has a chamber structure, and the moduli spaces $M_S^H(C,d)$ only change when $H$ crosses a wall. The structure of the walls for the moduli spaces is precisely the same as for the Donaldson invariants. Therefore we can use again the same theta functions as in [G-Z]. We write our results for the $\chi_y$-genera instead of for the Hodge numbers, which is equivalent as all the cohomology is of type $(p,p)$ [Be]. One could also have instead used the Poincaré polynomial, but I believe that in general the $\chi_y$-genus will be better behaved. By specializing the generating functions for the $\chi_y$-genera of the moduli spaces, we also obtain that the generating functions for the signatures are given by modular forms, a fact that does not seem to have been predicted by the physics literature. It turns out that the generating function for the signatures is better behaved than that for the Euler numbers. If $F$ lies on the boundary of the positive cone, then the corresponding generating function for the signatures is a modular form and not just a quasimodular form. A surprising and interesting result is that the signatures of the moduli spaces $M_S^H(C,d)$ are closely related to the corresponding Donaldson invariants $\Phi^{S,H}_C$. For any point $H$ in the ample cone, the generating function for the signatures is also the generating function for the Donaldson invariants $\Phi^{S,H}_C(p^r)$ evaluated on all powers of the point class $p\in H_0(S,\mathbb{Z})$. The signatures of the moduli spaces are just the coefficients of the Fourier development of this generating function, whereas the Donaldson invariants are (up to some elementary factors) the coefficients of the development of this function into powers of a modular function $u(\tau)$ for $\Gamma(2)$.
In particular knowing all the signatures of the moduli spaces $M_S^H(C,d)$ is equivalent to knowing all the Donaldson invariants $\Phi^{S,H}_C(p^r)$. This relation also persists under our extension of the generating functions and, together with the formulas for the K3 surfaces, suggests a similar result for any algebraic surface. The proof of this result uses the conjecture of Kotschick and Morgan [K-M]. Feehan and Leness [F-L1,F-L2,F-L3,F-L4] are working towards the proof of this conjecture.

This paper grew out of discussions with Jun Li on some aspects of [V-W]. I would like to thank K. Yoshioka for several very useful comments, G. Thompson for useful discussions and the referee for many useful comments and improvements. While preparing this manuscript I learned about related work. In [M-N-V-W] new predictions are made about the Euler numbers of $M_S^F(C,d)$, where $S$ is a rational elliptic surface, $F$ is the class of a fibre and $CF$ even. Yoshioka [Y4] has shown these predictions. Li and Qin ([L-Q1,L-Q2]) have shown blowup formulas for the Euler numbers and virtual Hodge polynomials of $M_S^H(C,d)$ for arbitrary $S$. After this paper was submitted Baranovsky [Ba] displayed an action of the oscillator algebra on the cohomology of the moduli spaces $M_S^F(r,C,d)$ and gave a simple relation between the Betti numbers of the Gieseker and Uhlenbeck compactifications.
2. Notations, Definitions and Background In this paper S usually denotes a smooth algebraic surface over C. Often we will assume S to be also rational. For a variety Y over C, we denote by upper case letters the classes in H 2 (Y, C), unless they appear as walls (see below), when we denote them by Greek letters. For A, B ∈ H 2 (Y, C) the intersection product on H 2 (Y, C) is just denoted by AB. Later we will also need the negative of the intersection product, which we denote
by $\langle A,B\rangle$. For a smooth compact variety $Y$ of complex dimension $d$ let
\[
h(Y,x,y):=\sum_{p,q}(-1)^{p+q}h^{p,q}(Y)x^py^q
\]
be the Hodge polynomial (note the signs), and let
\[
H(Y)=H(Y:x,y):=(xy)^{-\frac{d}{2}}h(Y,x,y).
\]
The advantage of this (Laurent) polynomial in $x^{1/2}$, $y^{1/2}$ is that it is symmetric around degree 0. In a similar way let $P(Y)=P(Y:y)=\sum_i(-1)^ib_i(Y)y^{i-d}:=H(Y:y,y)$ be the (shifted) Poincaré polynomial (again note the signs) and let $X_y(Y)=H(Y:1,y)$ be the (shifted) $\chi_{-y}$-genus. Then the Euler number of $Y$ is $e(Y)=X_1(Y)=P(Y,1)$, and the signature is $\sigma(Y):=(-1)^{\frac{d}{2}}X_{-1}(Y)$.

2.1. Virtual Hodge polynomials and the Weil conjectures. Virtual Hodge polynomials were introduced in [D-K]. For $Y$ a complex variety the cohomology $H^k_c(Y,\mathbb{Q})$ with compact support carries a natural mixed Hodge structure. If $Y$ is smooth and projective, this Hodge structure coincides with the classical one. Following [Ch], we put
\[
h_v(Y:x,y):=\sum_{p,q}\sum_k(-1)^kh^{p,q}(H^k_c(Y,\mathbb{Q}))x^py^q .
\]
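To make the sign and shift conventions above concrete, here is a minimal sketch that computes the Euler number, the signature and the shifted Hodge polynomial from a table of Hodge numbers. The function names and the dictionary representation are illustrative assumptions, not notation from the paper. With $(-1)^a=e^{\pi ia}$, the factor $(-1)^{d/2}$ cancels against $y^{-d/2}$ at $y=-1$, so $\sigma(Y)=\sum_{p,q}(-1)^ph^{p,q}(Y)$, which is what the code uses.

```python
from fractions import Fraction

def hodge_polynomial(hodge):
    """Coefficients of h(Y,x,y) = sum (-1)^{p+q} h^{p,q} x^p y^q as {(p, q): coeff}."""
    return {(p, q): (-1) ** (p + q) * h for (p, q), h in hodge.items() if h}

def shifted_hodge_polynomial(hodge, d):
    """Coefficients of H(Y:x,y) = (xy)^{-d/2} h(Y,x,y); exponents are Fractions."""
    half = Fraction(d, 2)
    return {(p - half, q - half): c for (p, q), c in hodge_polynomial(hodge).items()}

def euler_number(hodge):
    """e(Y) = X_1(Y) = h(Y,1,1)."""
    return sum(hodge_polynomial(hodge).values())

def signature(hodge):
    """sigma(Y) = (-1)^{d/2} X_{-1}(Y), simplified to sum (-1)^p h^{p,q}."""
    return sum((-1) ** p * h for (p, q), h in hodge.items())

def add_polynomials(*polys):
    """Sum of polynomials given as coefficient dictionaries."""
    total = {}
    for poly in polys:
        for key, c in poly.items():
            total[key] = total.get(key, 0) + c
    return total

# Hodge numbers of the projective plane, of P^1 x P^1 and of a K3 surface.
P2 = {(0, 0): 1, (1, 1): 1, (2, 2): 1}
P1xP1 = {(0, 0): 1, (1, 1): 2, (2, 2): 1}
K3 = {(0, 0): 1, (2, 0): 1, (1, 1): 20, (0, 2): 1, (2, 2): 1}

if __name__ == "__main__":
    for name, hodge in (("P2", P2), ("P1xP1", P1xP1), ("K3", K3)):
        print(name, euler_number(hodge), signature(hodge))
```

The helper `add_polynomials` also illustrates the additivity of virtual Hodge polynomials under a decomposition into locally closed pieces: assuming $h_v(\mathbb{A}^n:x,y)=(xy)^n$, the cell decomposition $\mathbb{P}^2=\mathbb{A}^2\sqcup\mathbb{A}^1\sqcup\mathrm{pt}$ gives back $1+xy+x^2y^2$.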
These virtual Hodge polynomials have the following properties (see [Ch]). If $Y$ is a smooth projective variety, then $h_v(Y:x,y)=h(Y:x,y)$. For $Z\subset Y$ Zariski-closed we have $h_v(Y:x,y)=h_v(Y\setminus Z:x,y)+h_v(Z:x,y)$. For $f:Z\to Y$ a Zariski-locally trivial fibre bundle with fibre $F$, we have $h_v(Z:x,y)=h_v(Y:x,y)h_v(F:x,y)$. Finally $e(Y)=h_v(Y,1,1)$ for any complex variety $Y$. We denote by
\[
p_v(Y:y):=h_v(Y:y,y)=\sum_i(-1)^ib^v_i(Y)y^i
\]
the virtual Poincaré polynomial. If $Y$ has pure complex dimension $d$ (or sometimes when $Y$ has expected dimension $d$), we write $H_v(Y)=H_v(Y:x,y):=(xy)^{-\frac{d}{2}}h_v(Y:x,y)$, $X^v_y(Y):=H_v(Y:1,y)$ and $P_v(Y)=P_v(Y:y):=y^{-d}p_v(Y:y)$. If $Y$ is smooth and projective of dimension $d$ we have therefore $H_v(Y)=H(Y)$, $X^v_y(Y)=X_y(Y)$ and $P_v(Y)=P(Y)$.

Let $Y$ be an arbitrary quasiprojective variety (not necessarily irreducible or smooth) over $\mathbb{C}$. We want to show that the Weil conjectures still compute the virtual Poincaré polynomials. This was pointed out to me by Jun Li, and seems to be known to the experts.

Proposition 2.1. There is a finitely generated subring $A=\mathbb{Z}[a_1,\ldots,a_l]\subset\mathbb{C}$ and a variety $Y_A$ over $A$, such that $Y=Y_A\times_A\mathbb{C}$, and the following holds: For $m$ a maximal ideal of $A$ we put $Y_m:=Y_A\times_AA/m$. There is a nonempty dense open subset $U$ of $\mathrm{spec}(A)$, such that if $m\in U$ is a maximal ideal of $A$ with quotient field $\mathbb{F}_q$, then there exist complex numbers $(a_{i,j})_{i,j}$ with $|a_{i,j}|=q^{i/2}$, such that for all $n\in\mathbb{Z}_{>0}$,
\[
\#Y_m(\mathbb{F}_{q^n})=\sum_i(-1)^i\sum_{j=1}^{b^v_i(Y)}a_{i,j}^n .
\]
Proof. If Y is smooth and projective, this is part of the Weil conjectures, proven by Deligne [De]. The general case is a simple consequence of this and resolution of singularities in characteristic 0. Let d be the largest dimension of a component of Y . The proof is by induction on d, the case d = 0 being trivial. Write Y = Y0 t W , where Y0 is the smooth locus of Y , and let Y˜ = Y0 t Z be a smooth compactification of Y . Then pv (Y, z) = p(Y˜ , z) + pv (W, z) − pv (Z, z). Let A = Z[a1 , . . . , al ] ⊂ C be a finitely generated subring, such that Y , Y˜ , Z, W are already defined over A. Let U be an open dense subset of spec(A) where the proposition applies to Y˜ (by the usual Weil conjectures) Z and W (by induction). Let m ∈ U be a maximal ideal with quotient field t Fq . Then #Ym (Fq n ) = #Y˜m (Fq n ) + #Wm (Fq n ) − #Zm (Fq n ), and the result follows. u 2.2. Moduli spaces. Let again S be an algebraic surface, H a general ample divisor on S, and let C ∈ H 2 (X, Z). Let MSH (r, C, d) denote the moduli space of H -semistable sheaves E on S (in the sense of Gieseker-Maruyama), with c1 (E) = C and discriminant H 2 d = c2 (E) − r−1 2r C . Let MS (r, C, d)s denote the open subspace of H -slope stable H sheaves and NS (r, C, d) the subspace of H -slope stable locally free sheaves. If d is sufficiently large, then MSH (r, C, d) is irreducible and generically smooth of dimension e = 2rd − (r 2 − 1)χ(OS ) (see e.g. [H-L]). We put MSH (C, d) := MSH (2, C, d), MSH (C, d)s := MSH (2, C, d)s and NSH (C, d) := NSH (2, C, d). If S is a rational algebraic surface and H is an ample divisor with H KS ≤ 0, then a slope stable sheaf E fulfills Ext2 (E, E) = Hom(E, E ⊗ KS ) = 0, and therefore MSH (r, C, d)s is smooth of dimension e = 2rd − (r 2 − 1). 2.3. Modular forms. We give a brief review of the results for modular that we will forms need. It might be helpful to also look at [G-Z] Sect. 2.2. Let H := τ ∈ C Im(τ ) > 0 be the complex upper half-plane. 
For $\tau \in \mathbb{H}$ let $q := e^{2\pi i\tau}$ and $q^{1/n} := e^{2\pi i\tau/n}$. For $a \in \mathbb{Q}$ we often write $(-1)^a$ instead of $e^{\pi i a}$. We always use the principal branch of the square root (with $\sqrt{\tau} \in \mathbb{H}$ for $\tau \in \mathbb{H}$ and $\sqrt{a} \in \mathbb{R}_{>0}$ for $a \in \mathbb{R}_{>0}$). We recall the definition of quasimodular forms from [K-Z]. A modular form of weight $k$ on a subgroup $\Gamma \subset Sl(2, \mathbb{Z})$ of finite index is a holomorphic function $f$ on $\mathbb{H}$ satisfying
\[ f\Big(\frac{a\tau + b}{c\tau + d}\Big) = (c\tau + d)^k f(\tau), \qquad \tau \in \mathbb{H},\ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma, \]
growing at most polynomially in $1/\mathrm{Im}(\tau)$ as $\mathrm{Im}(\tau) \to 0$. An almost holomorphic modular form of weight $k$ is a function $F$ on $\mathbb{H}$ with the same transformation properties and growth conditions as a modular form which is of the form $F(\tau) = \sum_{m=0}^{M} f_m(\tau)(\mathrm{Im}(\tau))^{-m}$ for $M \ge 0$ and $f_i$ holomorphic functions. Functions $f$ which occur as $f_0(\tau)$ (the holomorphic part of $F$) in such an expansion are called quasimodular forms of weight $k$. We denote $\sigma_k(n) := \sum_{d|n} d^k$ and by $\sigma_1^{odd}(n)$ the sum of the odd divisors of $n$. For even $k \ge 2$ let
\[ G_k(\tau) := -\frac{B_k}{2k} + \sum_{n>0} \sigma_{k-1}(n) q^n \]
be the Eisenstein series, where $B_k$ is the $k$th Bernoulli number. Note that $G_k$ is a modular form of weight $k$ on $SL(2, \mathbb{Z})$ for $k \ge 4$, but is only quasimodular for $k = 2$, i.e.
Theta Functions and Hodge Numbers of Moduli Spaces
$G_2(\tau) + 1/(8\pi\,\mathrm{Im}(\tau))$ is an almost holomorphic modular form of weight 2. Equivalently
\[ G_2\Big(\frac{a\tau + b}{c\tau + d}\Big) = (c\tau + d)^2 G_2(\tau) - \frac{c(c\tau + d)}{4\pi i} \tag{2.1} \]
(see [Z2, p. 242]). Let $\eta(\tau) := q^{1/24} \prod_{n>0} (1 - q^n)$ be the Dedekind eta function and $\Delta := \eta^{24}$ the discriminant. We have the transformation laws
\[ \eta(\tau + 1) = (-1)^{1/12} \eta(\tau), \qquad \eta(-1/\tau) = \sqrt{\frac{\tau}{i}}\, \eta(\tau), \tag{2.2} \]
see [C, VIII.3.].
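The transformation laws (2.1) and (2.2) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `G2` and `eta`, the truncation order, and the sample point are illustrative assumptions, and the series are simply truncated.

```python
import cmath
import math


def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def G2(tau, terms=80):
    """Truncated Eisenstein series G_2 = -1/24 + sum_{n>0} sigma_1(n) q^n."""
    q = cmath.exp(2j * math.pi * tau)
    return -1 / 24 + sum(sigma1(n) * q**n for n in range(1, terms))


def eta(tau, terms=80):
    """Truncated Dedekind eta function q^(1/24) * prod_{n>0} (1 - q^n)."""
    q = cmath.exp(2j * math.pi * tau)
    prod = 1
    for n in range(1, terms):
        prod *= 1 - q**n
    return cmath.exp(2j * math.pi * tau / 24) * prod


tau = 0.3 + 1.1j  # arbitrary sample point in the upper half-plane

# (2.1) for the matrix (a, b; c, d) = (0, -1; 1, 0), i.e. tau -> -1/tau.
g2_defect = abs(G2(-1 / tau) - (tau**2 * G2(tau) - tau / (4j * math.pi)))

# (2.2): eta(tau + 1) = (-1)^(1/12) eta(tau), with (-1)^a := exp(pi i a),
# and eta(-1/tau) = sqrt(tau / i) eta(tau) with the principal square root.
eta_t_defect = abs(eta(tau + 1) - cmath.exp(1j * math.pi / 12) * eta(tau))
eta_s_defect = abs(eta(-1 / tau) - cmath.sqrt(tau / 1j) * eta(tau))
```

All three defects should vanish up to floating-point error; the non-modular correction term $-\tau/(4\pi i)$ in (2.1) is what distinguishes $G_2$ from a genuine modular form.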
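We write $y := e^{2\pi i z}$ for $z$ a complex variable. Recall the classical theta functions
\[ \theta_{\mu,\nu}(\tau, z) := \sum_{n \in \mathbb{Z}} (-1)^{n\nu} q^{(n+\mu/2)^2/2}\, y^{n+\mu/2} \qquad (\mu, \nu \in \{0, 1\}) \tag{2.3} \]
(see e.g. [C, Ch. V], where however the notations and conventions are slightly different), and the "Nullwerte"
\[ \theta(\tau) := \theta_{0,0}(\tau, 0) = \frac{\eta(\tau)^5}{\eta(\tau/2)^2\, \eta(2\tau)^2}, \qquad \theta^0_{1,0}(\tau) = \theta_{1,0}(\tau, 0) = 2\,\frac{\eta(2\tau)^2}{\eta(\tau)}, \]
\[ \theta^0_{0,1}(\tau) = \theta_{0,1}(\tau, 0) = \frac{\eta(\tau/2)^2}{\eta(\tau)}, \qquad \theta_{1,1}(\tau, 0) = 0. \tag{2.4} \]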
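As a sanity check on the conventions in (2.3) and (2.4), the eta quotients can be compared numerically with the truncated theta series. This is a sketch under stated assumptions: the helper names `theta` and `eta`, the truncation ranges, and the sample point are illustrative choices and not part of the text.

```python
import cmath
import math


def eta(tau, terms=120):
    """Truncated Dedekind eta function q^(1/24) * prod_{n>0} (1 - q^n)."""
    q = cmath.exp(2j * math.pi * tau)
    prod = 1
    for n in range(1, terms):
        prod *= 1 - q**n
    return cmath.exp(2j * math.pi * tau / 24) * prod


def theta(mu, nu, tau, z=0.0, N=40):
    """Truncated theta_{mu,nu}(tau, z) as in (2.3), for integers mu, nu.

    q^((n+mu/2)^2/2) is evaluated as exp(pi i tau (n+mu/2)^2) and
    y^(n+mu/2) as exp(2 pi i z (n+mu/2)).
    """
    total = 0
    for n in range(-N, N):
        m = n + mu / 2
        sign = -1 if (n * nu) % 2 else 1
        total += sign * cmath.exp(1j * math.pi * tau * m * m
                                  + 2j * math.pi * z * m)
    return total


tau = 0.2 + 1.1j  # arbitrary sample point in the upper half-plane

defect_00 = abs(theta(0, 0, tau) - eta(tau)**5 / (eta(tau / 2)**2 * eta(2 * tau)**2))
defect_10 = abs(theta(1, 0, tau) - 2 * eta(2 * tau)**2 / eta(tau))
defect_01 = abs(theta(0, 1, tau) - eta(tau / 2)**2 / eta(tau))
nullwert_11 = abs(theta(1, 1, tau))
```

The summation range `range(-N, N)` is symmetric under $n \mapsto -1-n$, which is the symmetry that makes $\theta_{1,1}(\tau, 0)$ vanish term by term.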
We use the same notations also for $\mu, \nu$ arbitrary in $\mathbb{Q}$. The identities (2.4) follow readily from the product formulas
\[ \theta_{1,1}(\tau, z) = q^{1/8} (y^{1/2} - y^{-1/2}) \prod_{n>0} (1 - q^n)(1 - q^n y)(1 - q^n y^{-1}), \]
\[ \theta_{0,1}(\tau, z) = \prod_{n>0} (1 - q^n)(1 - q^{n-1/2} y)(1 - q^{n-1/2} y^{-1}), \tag{2.5} \]
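and the fact that $\theta_{\mu,\nu}(\tau, z) = \theta_{\mu,0}(\tau, z + \nu)$. $\theta_{1,1}$ has the transformation behaviour
\[ \theta_{1,1}(\tau + 1, z) = (-1)^{1/4}\, \theta_{1,1}(\tau, z), \qquad \theta_{1,1}(-1/\tau, z/\tau) = -i \sqrt{\frac{\tau}{i}}\, e^{\pi i z^2/\tau}\, \theta_{1,1}(\tau, z). \tag{2.6} \]
By the product formulas (2.5) we see that
\[ \theta_{0,1}(\tau, z)\, \theta_{1,1}(\tau, z) = \frac{\eta(\tau)^2}{\eta(\tau/2)}\, \theta_{1,1}(\tau/2, z). \tag{2.7} \]
We write
\[ \widetilde{\theta}_{1,1}(\tau, z) := \frac{\theta_{1,1}(\tau, z)}{y^{1/2} - y^{-1/2}}. \]
From the definitions it is straightforward to see that
\[ \theta_{\mu+2,0}(\tau, z) = \theta_{\mu,0}(\tau, z), \qquad \theta_{\mu+2,1}(\tau, z) = -\theta_{\mu,1}(\tau, z), \qquad \mu \in \mathbb{Q}. \tag{2.8} \]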
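By $\frac{(-n-1/2)^2}{4} = \frac{(n+1/2)^2}{4} = (\pm(n/2 + 1/4))^2$, one also checks immediately that
\[ \theta^0_{1/2,0}(2\tau) = \sum_{n \in \mathbb{Z}} q^{(n+1/4)^2} = \frac{\theta^0_{1,0}(\tau/2)}{2} = \frac{1}{2} \sum_{n \in \mathbb{Z}} q^{(n+1/2)^2/4} = \frac{\eta(\tau)^2}{\eta(\tau/2)}. \tag{2.9} \]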
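Following [Gö3,G-Z], we set $f(\tau) := (-1)^{-1/4}\, \eta(\tau)^3/\theta(\tau)$. Let $e_2$ and $e_3$ be the 2-division values of the Weierstraß $\wp$-function at $\tau/2$ and $(1 + \tau)/2$ respectively, i.e.
\[ e_2(\tau) = \frac{1}{12} + 2 \sum_{n>0} \sigma_1^{odd}(n)\, q^{n/2}, \qquad e_3(\tau) = \frac{1}{12} + 2 \sum_{n>0} (-1)^n \sigma_1^{odd}(n)\, q^{n/2} \]
(see e.g. [H-B-J, p. 132]). It is easy to see that $e_3(2\tau + 1) = e_2(2\tau)$. We also see that $\theta(2\tau + 1) = \theta^0_{0,1}(2\tau)$ and $f(2\tau + 1) = \eta(2\tau)^4/\eta(\tau)^2$. We write
\[ u(\tau) := -\frac{f(\tau)^2}{3 e_3(\tau)}, \qquad \overline{u}(\tau) := u(2\tau + 1) = -\frac{\eta(2\tau)^8}{3 e_2(2\tau)\, \eta(\tau)^4}. \tag{2.10} \]

Remark 2.2. Let
\[ T := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad S := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad V := T^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}. \]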
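Let $\Gamma_u = \pm\langle V^2, VS, SV \rangle$; this is a subgroup of index 6 of $SL(2, \mathbb{Z})$. $u(\tau)$ is a modular function on $\Gamma_u$. Let $\Gamma(2) := \{A \in Sl(2, \mathbb{Z}) \mid A \equiv \mathrm{id} \bmod 2\}$. Let $X := \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$. It is easy to see that $X^{-1} \Gamma_u X = \Gamma(2)$. In other words a function $g(\tau)$ is a modular function on $\Gamma_u$ if and only if $h(\tau) := g(2\tau + 1)$ is a modular function on $\Gamma(2)$. In particular $\overline{u}(\tau)$ is a modular function on $\Gamma(2)$.

2.4. Theta functions for indefinite lattices. We review the definition of theta functions for indefinite lattices from [G-Z]. Let $\Gamma$ be a lattice, i.e. a free $\mathbb{Z}$-module $\Gamma$ together with a $\mathbb{Z}$-valued bilinear form $\langle x, y \rangle$ on $\Gamma$. The extension of the bilinear form to $\Gamma_{\mathbb{C}} := \Gamma \otimes \mathbb{C}$ and $\Gamma_{\mathbb{R}} := \Gamma \otimes \mathbb{R}$ is denoted in the same way. The type of $\Gamma$ is the pair $(r - s, s)$, where $r$ is the rank of $\Gamma$ and $s$ the largest rank of a sublattice of $\Gamma$ on which $\langle\ ,\ \rangle$ is negative definite. Let $M_\Gamma$ be the space of meromorphic functions on $\mathbb{H} \times \Gamma_{\mathbb{C}}$. For $v \in \Gamma_{\mathbb{Q}}$, $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, and $k \in \mathbb{Z}$ we put
\[ f|v(\tau, x) := q^{\langle v, v \rangle/2} \exp(2\pi i \langle v, x \rangle)\, f(\tau, x + v\tau), \tag{2.11} \]
\[ f|_k A(\tau, x) := (c\tau + d)^{-k} \exp\Big(-\pi i\, \frac{c \langle x, x \rangle}{c\tau + d}\Big) f\Big(\frac{a\tau + b}{c\tau + d}, \frac{x}{c\tau + d}\Big). \tag{2.12} \]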
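Now assume that $\Gamma$ is unimodular of type $(r - 1, 1)$. We fix a vector $f_0 \in \Gamma_{\mathbb{R}}$ with $\langle f_0, f_0 \rangle < 0$, and let
\[ C_\Gamma := \{ f \in \Gamma_{\mathbb{R}} \mid \langle f, f \rangle < 0,\ \langle f, f_0 \rangle < 0 \}, \qquad S_\Gamma := \{ f \in \Gamma \mid f \text{ primitive},\ \langle f, f \rangle = 0,\ \langle f, f_0 \rangle < 0 \}. \]
For $f \in S_\Gamma$ put
\[ D(f) := \{ (\tau, x) \in \mathbb{H} \times \Gamma_{\mathbb{C}} \mid 0 < \mathrm{Im}(\langle f, x \rangle) < \mathrm{Im}(\tau) \}, \]
and for $f \in C_\Gamma$ put $D(f) := \mathbb{H} \times \Gamma_{\mathbb{C}}$. For $t \in \mathbb{R}$ we put $\mu(t) := 1$ if $t \ge 0$ and $\mu(t) := 0$ otherwise. Let $f, g \in C_\Gamma \cup S_\Gamma$. For $c \in \Gamma$ and $(\tau, x) \in D(f) \cap D(g)$ we put
\[ \Theta^{f,g}_{\Gamma,c}(\tau, x) := \sum_{\xi \in \Gamma + c/2} \big( \mu(\langle \xi, f \rangle) - \mu(\langle \xi, g \rangle) \big)\, q^{\langle \xi, \xi \rangle/2}\, e^{2\pi i \langle \xi, x \rangle}, \tag{2.13} \]
and $\Theta^{f,g}_\Gamma := \Theta^{f,g}_{\Gamma,0}$.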
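Assume now that $f, g \in S_\Gamma$. Then (see [G-Z]) the function $\Theta^{f,g}_{\Gamma,c}$ has a meromorphic extension to $\mathbb{H} \times \Gamma_{\mathbb{C}}$, which is defined as follows. Let
\[ F : \mathbb{H} \times \mathbb{C}^2 \to \mathbb{C}; \qquad (\tau, u, v) \mapsto \frac{\eta(\tau)^3\, \theta_{1,1}(\tau, (u + v)/(2\pi i))}{\theta_{1,1}(\tau, u/(2\pi i))\, \theta_{1,1}(\tau, v/(2\pi i))} \]
(see [Z1]; note the different conventions for $\theta_{1,1}$ in [Z1]). We have
\[ F(\tau, u, v) = \sum_{n \ge 0,\, m > 0} q^{nm} e^{-nu - mv} - \sum_{n > 0,\, m \ge 0} q^{nm} e^{nu + mv} \]
(see [G-Z, Sect. 3.1]). Assume $\langle f, g \rangle = -N \in \mathbb{Z}_{<0}$. We denote by $[f, g]$ the lattice generated by $f$ and $g$ and by $[f, g]^\perp$ its orthogonal complement. Let $L := [f, g] \oplus [f, g]^\perp$. For $(\tau, x) \in \mathbb{H} \times \Gamma_{\mathbb{C}}$, we put
\[ \Theta^{f,g}_L(\tau, x) := F(N\tau, -2\pi i \langle f, x \rangle, 2\pi i \langle g, x \rangle) \sum_{\xi \in [f,g]^\perp} q^{\langle \xi, \xi \rangle/2}\, e^{2\pi i \langle \xi, x \rangle}. \tag{2.14} \]
Let $P$ be a system of representatives of $\Gamma$ modulo $L$. Then, using the notation of (2.11), the meromorphic extension is given by
\[ \Theta^{f,g}_{\Gamma,c} := \sum_{t \in P} \Theta^{f,g}_L(\tau, x) \big| (t + c/2). \tag{2.15} \]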
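For $f, g \in S_\Gamma$ the following is shown in [G-Z]: For $|\mathrm{Im}(\langle f, x \rangle)/\mathrm{Im}(\tau)| < 1$, $|\mathrm{Im}(\langle g, x \rangle)/\mathrm{Im}(\tau)| < 1$ we have
\[ \Theta^{f,g}_{\Gamma,c}(\tau, x) = \frac{1}{1 - e^{2\pi i \langle f, x \rangle}} \sum_{\substack{\langle \xi, f \rangle = 0 \\ \langle f, g \rangle \le \langle \xi, g \rangle < 0}} q^{\langle \xi, \xi \rangle/2}\, e^{2\pi i \langle \xi, x \rangle} - \frac{1}{1 - e^{2\pi i \langle g, x \rangle}} \sum_{\substack{\langle \xi, g \rangle = 0 \\ \langle f, g \rangle \le \langle \xi, f \rangle < 0}} q^{\langle \xi, \xi \rangle/2}\, e^{2\pi i \langle \xi, x \rangle} \]
\[ \qquad + \sum_{\xi \cdot f > 0 > \xi \cdot g} q^{\langle \xi, \xi \rangle/2} \big( e^{2\pi i \langle \xi, x \rangle} - e^{-2\pi i \langle \xi, x \rangle} \big). \]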
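Here the sum is taken over all $\xi \in \Gamma + c/2$. For $b, c \in \Gamma$ and any characteristic vector $w$ of $\Gamma$ we have
\[ (\Theta^{f,g}_{\Gamma,c}/\theta^{\sigma(\Gamma)}) | S(\tau, x + b/2) = (\Theta^{f,g}_{\Gamma,b}/\theta^{\sigma(\Gamma)})(\tau, x + c/2), \]
\[ \Theta^{f,g}_{\Gamma,c}(\tau + 1, x) = (-1)^{3\langle c, c \rangle/4 - \langle c, w \rangle/2}\, \Theta^{f,g}_{\Gamma,c}(\tau, x + (w - c)/2), \tag{2.16} \]
\[ \Theta^{f,g}_{\Gamma,c}(\tau + 2, x) = (-1)^{\langle c, c \rangle/2}\, \Theta^{f,g}_{\Gamma,c}(\tau, x). \]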
The last two formulas are elementary consequences of Definition (2.13), and they also hold for $f, g \in C_\Gamma \cup S_\Gamma$.

2.5. Hilbert schemes. For a general algebraic surface $S$, we denote by $S^{[n]}$ the Hilbert scheme of subschemes of length $n$ on $S$. $S^{[n]}$ is smooth of dimension $2n$ [F], and its Hodge numbers have been computed ([E-S,Gö1,G-S,Ch,dC-M]). Using (2.5), the results can be easily translated to
\[ \sum_{n \ge 0} \chi_y(S^{[n]})\, q^{n - e(S)/24} = \frac{\eta(\tau)^{\sigma(S) - \chi(\mathcal{O}_S)}}{\widetilde{\theta}_{1,1}(\tau, z)^{\chi(\mathcal{O}_S)}}. \tag{2.17} \]
(Recall that we write $y := e^{2\pi i z}$.) In particular
\[ \sum_{n \ge 0} e(S^{[n]})\, q^{n - e(S)/24} = \frac{1}{\eta(\tau)^{e(S)}}, \qquad \sum_{n \ge 0} \sigma(S^{[n]}) (-1)^n q^{n - e(S)/24} = \frac{\eta(\tau)^{\sigma(S)}}{\eta(2\tau)^{2\chi(\mathcal{O}_S)}}. \]
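The Euler number formula above is equivalent to $\sum_{n \ge 0} e(S^{[n]}) q^n = \prod_{k \ge 1} (1 - q^k)^{-e(S)}$, which is easy to expand with integer arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes $e(S) \ge 0$ so that the product can be built by repeated multiplication with $1/(1 - q^k)$.

```python
def hilbert_scheme_euler_numbers(e_S, n_max):
    """Coefficients of prod_{k>=1} (1 - q^k)^(-e_S) up to q^n_max.

    The n-th entry is e(S^[n]) for a surface S with Euler number e_S >= 0.
    """
    coeffs = [0] * (n_max + 1)
    coeffs[0] = 1
    for k in range(1, n_max + 1):
        for _ in range(e_S):
            # Multiply the truncated series in place by 1 / (1 - q^k).
            for n in range(k, n_max + 1):
                coeffs[n] += coeffs[n - k]
    return coeffs


# For e_S = 1 the product is the generating function of partitions.
partitions = hilbert_scheme_euler_numbers(1, 5)

# For the projective plane, e(P^2) = 3.
plane = hilbert_scheme_euler_numbers(3, 3)
```

For $e(S) = 3$ the first coefficients count partitions into parts of three colours, so for instance $e((\mathbb{P}^2)^{[2]}) = 9$.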
3. Relation to Locally Free Sheaves and Blowup Formulas

In this section let $S$ be an arbitrary smooth projective surface, and let $C \in H^2(S, \mathbb{Z})$. Let $\widehat{S}$ be the blowup of $S$ in a point and $E$ the exceptional divisor. Let $H$ be a general ample divisor on $S$ (general means that it does not lie on a wall with respect to $(r, C)$, see [Y3]; in the case $r = 2$ we will discuss walls and chambers in the next section). We will usually denote the cohomology classes on $S$ and their pullbacks to $\widehat{S}$ by the same letter. We denote by $M^H_{\widehat{S}}(r, C + bE, d)_s$ the space of slope stable sheaves on $\widehat{S}$ which are stable with respect to (the pullback of) $H$. It can be identified with $M^{H - \epsilon E}_{\widehat{S}}(r, C + bE, d)_s$ for $\epsilon > 0$ small enough.

We want to relate the virtual Poincaré polynomials of $M_S^H(r, C, d)_s$, $N_S^H(r, C, d)$ and $M^H_{\widehat{S}}(r, C + bE, d)_s$. In fact we will see that the generating function for $\widehat{S}$ is obtained from that for $S$ by multiplying by a suitable theta function and dividing by a power of the eta function. The results are easy consequences of corresponding results of Yoshioka about the counting of points of these moduli spaces over finite fields and of Prop. 2.1. We write
\[ P_v(M_S^H(r, C, d)_s) = y^{-e}\, p_v(M_S^H(r, C, d)_s, y), \qquad P_v(N_S^H(r, C, d)) = y^{-e}\, p_v(N_S^H(r, C, d), y), \]
where $e = 2rd - (r^2 - 1)\chi(\mathcal{O}_S)$ is the virtual dimension, which agrees with the actual dimension for $d$ sufficiently large.
Proposition 3.1. Let $S$ be an algebraic surface and let $H$ be a general ample divisor on $S$.

1.
\[ \sum_{d \ge 0} P_v(M_S^H(r, C, d)_s)\, q^d = \prod_{k \ge 1} \prod_{b=1}^{r} \prod_{i=0}^{4} (1 - y^{i - 2b} q^k)^{(-1)^{i+1} b_i(S)} \sum_{d \ge 0} P_v(N_S^H(r, C, d))\, q^d, \]
in particular
\[ \sum_{d \ge 0} e(M_S^H(r, C, d)_s)\, q^d = \frac{q^{r e(S)/24}}{\eta(\tau)^{r e(S)}} \sum_{d \ge 0} e(N_S^H(r, C, d))\, q^d. \]

2. Let $A = (a_{ij})_{ij}$ be the $(r - 1) \times (r - 1)$-matrix with entries $a_{ij} = 1$ for $i \le j$ and $a_{ij} = 0$ otherwise. We view elements of $\mathbb{R}^{r-1}$ as column vectors. We write $I$ for the column vector of length $r - 1$ with all entries equal to one. Then
\[ \sum_{d \ge 0} P_v(M^H_{\widehat{S}}(r, C + bE, d)_s)\, q^d = \frac{q^{r/24}}{\eta(\tau)^r} \Big( \sum_{v \in \mathbb{Z}^{r-1} + \frac{b}{r} I} (y^2)^{v^t A I}\, q^{v^t A v} \Big) \sum_{d \ge 0} P_v(M_S^H(r, C, d)_s)\, q^d, \]
in particular
\[ \sum_{d \ge 0} e(M^H_{\widehat{S}}(r, C + bE, d)_s)\, q^d = \frac{q^{r/24}}{\eta(\tau)^r} \Big( \sum_{v \in \mathbb{Z}^{r-1} + \frac{b}{r} I} q^{v^t A v} \Big) \sum_{d \ge 0} e(M_S^H(r, C, d)_s)\, q^d. \]
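Proof. (1) is a consequence of ([Y1], Thm. 0.4) and Prop. 2.1. Let $X$ be a surface over $\mathbb{F}_q$. For every sheaf $E$ in $M_X^H(r, C, d)_s(\overline{\mathbb{F}}_q)$ there is an exact sequence
\[ 0 \to E \to E^{\vee\vee} \to E^{\vee\vee}/E \to 0, \]
where $E^{\vee\vee} \in N_X^H(r, C, d - k)_s(\overline{\mathbb{F}}_q)$ and $E^{\vee\vee}/E \in \mathrm{Quot}^k_{E^{\vee\vee}}(\overline{\mathbb{F}}_q)$ for a suitable $k \le d$. In fact it is easy to see that if $E$ is defined over $\overline{\mathbb{F}}_q$, then it is defined over $\mathbb{F}_q$ if and only if both $E^{\vee\vee}$ and $E^{\vee\vee}/E$ are. For a sheaf $F$ over $X$ we denote by $\mathrm{Quot}^k_F$ the (Grothendieck) scheme of quotients of length $k$ of $F$ and by $\mathrm{Quot}^k_{F,p}$ the subscheme (with the reduced structure) of quotients with support in the point $p \in X$. If $F$ is locally free of rank $r$ and $p$ is defined over $\mathbb{F}_q$, we get isomorphisms $\mathrm{Quot}^k_{F,p} \simeq \mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r},p}$ over $\mathbb{F}_q$. In particular $\#\mathrm{Quot}^k_{F,p}(\mathbb{F}_q) = \#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r},p}(\mathbb{F}_q)$. Therefore the proof of ([Y1], Thm. 0.4) for the numbers $\#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r}}(\mathbb{F}_q)$ can be repeated for $\#\mathrm{Quot}^k_F(\mathbb{F}_q)$, the only numbers entering the calculation being the $\#\mathrm{Quot}^k_{F,p}(\mathbb{F}_{q^n})$. Therefore $\#\mathrm{Quot}^k_F(\mathbb{F}_q) = \#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r}}(\mathbb{F}_q)$ (see also [Y1, p. 194]). This gives
\[ \#M_X^H(r, C, d)_s(\mathbb{F}_q) = \sum_{k \le d} \#N_X^H(r, C, d - k)_s(\mathbb{F}_q) \cdot \#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r}}(\mathbb{F}_q). \]
Applying Prop. 2.1 to a good reduction $X$ of $S$ modulo $q$, we obtain immediately
\[ \sum_{d \ge 0} p_v(M_S^H(r, C, d)_s)\, q^d = \prod_{k \ge 1} \prod_{b=1}^{r} \prod_{i=0}^{4} (1 - y^{2rk + i - 2b} q^k)^{(-1)^{i+1} b_i(S)} \sum_{d \ge 0} p_v(N_S^H(r, C, d))\, q^d \]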
(recall the signs in the definition of $p_v$). By the definition of $P_v$ and the formula $e = 2rd - (r^2 - 1)\chi(\mathcal{O}_S)$, we see that in order to replace $p_v$ by $P_v$ we have to replace the factor $(1 - y^{2rk + i - 2b} q^k)$ by $(1 - y^{i - 2b} q^k)$.

(2) We apply Prop. 2.1 to ([Y3], Prop. 3.4). Using again $e = 2rd - (r^2 - 1)\chi(\mathcal{O}_S)$ we obtain
\[ \sum_{d \ge 0} P_v(M^H_{\widehat{S}}(r, C + bE, d)_s)\, q^d = \frac{q^{r/24}}{\eta(\tau)^r} \sum_{(a_1, \ldots, a_r)} (y^2)^{w(a_1, \ldots, a_r)}\, q^{-\sum_{i<j} a_i a_j} \sum_{d \ge 0} P_v(M_S^H(r, C, d)_s)\, q^d. \]
Here the sum runs through the $r$-tuples $(a_1, \ldots, a_r)$ in $\mathbb{Z} + \frac{b}{r}$ with $\sum_{i=1}^{r} a_i = 0$, and
\[ w(a_1, \ldots, a_r) = \sum_{i<j \le r} \binom{a_j - a_i}{2} + r \sum_{i<j \le r} a_i a_j. \]
We note that equivalently we can let the sum run through the $(r - 1)$-tuples $(a_1, \ldots, a_{r-1})$, and put $a_r = -\sum_{i=1}^{r-1} a_i$. Then
\[ -\sum_{i<j \le r} a_i a_j = \sum_{j \le i \le r-1} a_i a_j. \]
Furthermore we have
\[ \sum_{i<j \le r} (a_j - a_i)^2 = 2r \sum_{j \le i \le r-1} a_i a_j \qquad \text{and} \qquad \sum_{i<j \le r} (a_j - a_i) = -2 \sum_{i=1}^{r-1} (r - i) a_i. \]
Putting things together, we obtain
\[ w(a_1, \ldots, a_r) = \sum_{i=1}^{r-1} (r - i) a_i = (a_1, \ldots, a_{r-1}) A I. \]
Finally we note that
\[ \sum_{j \le i \le r-1} a_i a_j = (a_1, \ldots, a_{r-1}) A (a_1, \ldots, a_{r-1})^t. \]
□
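The combinatorial identities used in part (2) of the proof can be verified in exact rational arithmetic for small ranks. The following is a minimal sketch with hypothetical helper names; it takes $\binom{x}{2} = x(x-1)/2$ for rational $x$ and only samples a small box of tuples, so it is a consistency check and not a proof.

```python
from fractions import Fraction
from itertools import product


def binom2(x):
    """Binomial coefficient 'x choose 2' for a rational number x."""
    return x * (x - 1) / 2


def w(a):
    """w(a_1,...,a_r) = sum_{i<j} binom(a_j - a_i, 2) + r * sum_{i<j} a_i a_j."""
    r = len(a)
    pairs = [(i, j) for i in range(r) for j in range(i + 1, r)]
    return (sum(binom2(a[j] - a[i]) for i, j in pairs)
            + r * sum(a[i] * a[j] for i, j in pairs))


def bilinear(u, A, v):
    """Row vector u times matrix A times column vector v."""
    n = len(u)
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))


def identities_hold(r, b, ints):
    """Check both identities for a_i = ints[i] + b/r (i < r), a_r = -sum a_i."""
    head = [Fraction(n) + Fraction(b, r) for n in ints]
    a = head + [-sum(head)]
    A = [[1 if i <= j else 0 for j in range(r - 1)] for i in range(r - 1)]
    ones = [1] * (r - 1)
    pair_sum = sum(a[i] * a[j] for i in range(r) for j in range(i + 1, r))
    return w(a) == bilinear(head, A, ones) and -pair_sum == bilinear(head, A, head)


all_ok = all(
    identities_hold(r, b, ints)
    for r in range(2, 5)
    for b in range(r)
    for ints in product(range(-2, 3), repeat=r - 1)
)
```

Note that $a_r = -\sum_{i<r} a_i$ automatically lies in $\mathbb{Z} + b/r$, so every sampled tuple is of the kind summed over in the proof.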
Remark 3.2. 1. Li and Qin ([L-Q1,L-Q2]) have shown a blowup formula for the virtual Hodge polynomials in the case $r = 2$ using completely different methods. In particular they also obtain a blowup formula for the Euler numbers. Their method also gives a blowup formula for the virtual Hodge polynomials of the Uhlenbeck compactification. We write again $H_v(M_S^H(r, C, d)_s) = (xy)^{-e/2}\, h_v(M_S^H(r, C, d)_s)$ with $e = 2rd - (r^2 - 1)\chi(\mathcal{O}_S)$. Then, writing $x = e^{2\pi i u}$, their result can be rewritten as
\[ \sum_{d \ge 0} H_v(M^H_{\widehat{S}}(C, d)_s)\, q^d = \frac{q^{1/12}\, \theta_{0,0}(2\tau, u + z)}{\eta(\tau)^2} \sum_{d \ge 0} H_v(M_S^H(C, d)_s)\, q^d, \]
\[ \sum_{d \ge 0} H_v(M^H_{\widehat{S}}(C + E, d)_s)\, q^d = \frac{q^{1/12}\, \theta_{1,0}(2\tau, u + z)}{\eta(\tau)^2} \sum_{d \ge 0} H_v(M_S^H(C, d)_s)\, q^d. \]
This is the case $r = 2$ of the formula
\[ \sum_{d \ge 0} H_v(M^H_{\widehat{S}}(r, C + bE, d)_s)\, q^d = \frac{q^{r/24}}{\eta(\tau)^r} \Big( \sum_{v \in \mathbb{Z}^{r-1} + \frac{b}{r} I} (xy)^{v^t A I}\, q^{v^t A v} \Big) \sum_{d \ge 0} H_v(M_S^H(r, C, d)_s)\, q^d. \]
I expect that this formula holds for all $r$.

2. Using [Y5], Prop. 3.1(2) can also be rewritten: Let $A_{r-1} = \{(x_1, \ldots, x_r) \mid \sum_i x_i = 0\}$ be the $A_{r-1}$-lattice and $e_1, \ldots, e_{r-1}$ its standard basis. Let $a := \sum_{i=1}^{r-1} i(r - i) e_i$ and $\lambda = (1 - 1/r, -1/r, \ldots, -1/r)$. Then the theta function in Prop. 3.1 can be written as
\[ \sum_{v \in A_{r-1} + b\lambda} y^{\langle v, a \rangle}\, q^{\langle v, v \rangle/2}, \]
where $\langle\ ,\ \rangle$ is the pairing of $A_{r-1}$. This was pointed out to me by K. Yoshioka.

4. Wallcrossing and Theta Functions

4.1. Wallcrossing. Now let $S$ again be a rational algebraic surface. Let $\Gamma$ be the lattice $H^2(S, \mathbb{Z})$ with the negative of the intersection form as a quadratic form, i.e. for $A, B \in \Gamma$ let $\langle A, B \rangle = -AB$. In this section we want to relate the Hodge numbers of the moduli spaces $M_S^H(C, d)$ to the theta functions $\Theta^{F,H}_{\Gamma,C}$ from [G-Z]. The dependence of the moduli spaces $M_S^H(C, d)$ on the polarization $H$ and the corresponding dependence of the Donaldson invariants has been studied by a number of authors [Q1,Q2,F-Q,Gö2,E-G,Y2,Y3,L]. We follow (with some modifications) the notations in [Gö2,E-G].
An ample divisor $H$ is called good if $K_S \cdot H \le 0$. We denote by $C_S$ the ample cone of $S$ and by $C_S^G$ the subcone of all good ample divisors. A class $\xi \in H^2(S, \mathbb{Z}) + C/2$ is called of type $(C, d)$ if $\xi^2 + d \in \mathbb{Z}_{\ge 0}$. In this case we call $W^\xi := \xi^\perp \cap C_S$ the wall defined by $\xi$. If $\xi^\perp \cap C_S^G \ne \emptyset$, we call $W^\xi$ a good wall. The chambers of type $(C, d)$ are the connected components of the complement of the walls of type $(C, d)$ in $C_S$. If $L$ and $H$ lie in the same chamber of type $(C, d)$, then $M_S^L(C, d) = M_S^H(C, d)$. We say that $L$ lies on a wall of type $C$ if $L\xi = 0$ for some class $\xi \in H^2(S, \mathbb{Z}) + C/2$.

Theorem 4.1. Let $C \in H^2(S, \mathbb{Z})$. Let $H, L \in C_S^G$ not on a wall of type $C$. Then

1.
\[ \sum_{d \ge 0} \big( \chi_y^v(M_S^H(C, d)) - \chi_y^v(M_S^L(C, d)) \big)\, q^{d - e(S)/12} = \frac{\eta(\tau)^{2\sigma(S) - 2} (y^{1/2} - y^{-1/2})}{\theta_{1,1}(\tau, z)^2}\, \Theta^{L,H}_{\Gamma,C}(2\tau, K_S z), \]
\[ \sum_{d \ge 0} \big( e(M_S^H(C, d)) - e(M_S^L(C, d)) \big)\, q^{d - e(S)/12} = \frac{1}{\eta(\tau)^{2e(S)}}\, \mathrm{Coeff}_{2\pi i z}\, \Theta^{L,H}_{\Gamma,C}(2\tau, K_S z). \]

2. Assume now that $C \notin 2H^2(S, \mathbb{Z})$. Then we can replace $\chi_y^v$ by $\chi_y$ in (1). Furthermore
\[ \sum_{d \ge 0} \big( e(N_S^H(C, d)) - e(N_S^L(C, d)) \big)\, q^d = \mathrm{Coeff}_{2\pi i z}\, \Theta^{L,H}_{\Gamma,C}(2\tau, K_S z), \]
\[ \sum_{d \ge 0} (-1)^{e(d)/2} \big( \sigma(M_S^H(C, d)) - \sigma(M_S^L(C, d)) \big)\, q^{d - e(S)/12} = \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4}\, \Theta^{L,H}_{\Gamma,C}(2\tau, K_S/2). \]
Here e(d) := 4d − 3 is the dimension of MSH (C, d). Proof. This is essentially a reformulation of Thm. 3.4 from [Gö2]. Assume that H and L do not lie on a wall of type C. The result of [Gö2] gives y 2d−3/2 (Xyv (MSH (C, d)) − Xyv (MSL (C, d))) = ξ KS − y −ξ KS X 2 2 2y y d+ξ Xy ((S t S)[d+ξ ] )y d−ξ , = y(y − 1) ξ
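where the sum runs through all classes of type $(C, d)$ with $\xi H < 0 < \xi L$. We sum over all $d \ge 0$. We use (2.17), noting that $\sum_{n \ge 0} \chi_y((S \sqcup S)^{[n]})\, q^n = \big( \sum_{n \ge 0} \chi_y(S^{[n]})\, q^n \big)^2$. We obtain
\[ \sum_{d \ge 0} \big( \chi_y^v(M_S^H(C, d)) - \chi_y^v(M_S^L(C, d)) \big)\, q^{d - e(S)/12} = \frac{\eta(\tau)^{2\sigma(S) - 2}}{\widetilde{\theta}_{1,1}(\tau, z)^2} \sum_{\xi} q^{-\xi^2}\, \frac{y^{\xi K_S} - y^{-\xi K_S}}{y^{1/2} - y^{-1/2}}. \tag{4.1} \]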
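The sum on the right-hand side runs through all $\xi \in H^2(S, \mathbb{Z}) + C/2$ satisfying $\xi H < 0 < \xi L$. Using the definition (2.13) of the theta functions $\Theta^{g,f}_{\Gamma,c}$, we obtain
\[ \frac{\theta_{1,1}(\tau, z)^2}{\eta(\tau)^{2\sigma(S) - 2} (y^{1/2} - y^{-1/2})} \sum_{d \ge 0} \big( \chi_y^v(M_S^H(C, d)) - \chi_y^v(M_S^L(C, d)) \big)\, q^{d - e(S)/12} \]
\[ = \sum_{\substack{\xi \in H^2(S, \mathbb{Z}) + C/2 \\ \xi H < 0 < \xi L}} q^{-\xi^2} \big( y^{\xi K_S} - y^{-\xi K_S} \big) = \sum_{\substack{\xi \in \Gamma + C/2 \\ \langle \xi, H \rangle < 0 < \langle \xi, L \rangle}} q^{\langle \xi, \xi \rangle} \big( y^{\langle \xi, K_S \rangle} - y^{-\langle \xi, K_S \rangle} \big) \]
\[ = \sum_{\xi \in \Gamma + C/2} q^{\langle \xi, \xi \rangle} \big( \mu(\langle \xi, L \rangle) - \mu(\langle \xi, H \rangle) \big)\, y^{\langle \xi, K_S \rangle} = \Theta^{L,H}_{\Gamma,C}(2\tau, K_S z). \]
Specializing (4.1) to the Euler number we obtain:
\[ \sum_{d \ge 0} \big( e(M_S^H(C, d)) - e(M_S^L(C, d)) \big)\, q^{d - e(S)/12} = \frac{1}{\eta(\tau)^{2e(S)}} \sum_{\xi H < 0 < \xi L} 2\xi K_S\, q^{-\xi^2} \]
\[ = \frac{1}{\eta(\tau)^{2e(S)}} \sum_{\langle H, \xi \rangle < 0 < \langle L, \xi \rangle} 2\langle \xi, K_S \rangle\, q^{\langle \xi, \xi \rangle} = \frac{1}{\eta(\tau)^{2e(S)}}\, \mathrm{Coeff}_{2\pi i z} \big( \Theta^{L,H}_{\Gamma,C}(2\tau, K_S z) \big). \]
The last line follows directly from (2.13). Assume now that $C \notin 2H^2(S, \mathbb{Z})$. Then $M_S^H(C, d) = M_S^H(C, d)_s$ and $M_S^L(C, d) = M_S^L(C, d)_s$ are smooth, and we can replace $\chi_y^v(M_S^H(C, d))$ by $\chi_y(M_S^H(C, d))$ and $\chi_y^v(M_S^L(C, d))$ by $\chi_y(M_S^L(C, d))$. Furthermore we get by Prop. 3.1,
\[ \sum_{d \ge 0} \big( e(N_S^H(C, d)) - e(N_S^L(C, d)) \big)\, q^d = \mathrm{Coeff}_{2\pi i z}\, \Theta^{L,H}_{\Gamma,C}(2\tau, K_S z). \]
Finally we obtain from (4.1),
\[ \sum_{d \ge 0} (-1)^{-e(d)/2} \big( \sigma(M_S^H(C, d)) - \sigma(M_S^L(C, d)) \big)\, q^{d - e(S)/12} = \frac{\eta(\tau)^{2\sigma(S)}}{\eta(2\tau)^4} \sum_{\xi H < 0 < \xi L} q^{-\xi^2}\, \frac{(-1)^{\xi K_S} - (-1)^{-\xi K_S}}{(-1)^{1/2} - (-1)^{-1/2}} \]
\[ = \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4} \sum_{\langle \xi, H \rangle < 0 < \langle \xi, L \rangle} q^{\langle \xi, \xi \rangle} \big( (-1)^{\langle \xi, K_S \rangle} - (-1)^{-\langle \xi, K_S \rangle} \big) = \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4}\, \Theta^{L,H}_{\Gamma,C}(2\tau, K_S/2). \]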
As the signature can only be nonzero if $e(d)$ is even, we can replace $(-1)^{-e(d)/2}$ by $(-1)^{e(d)/2}$. □

4.2. Extension of the invariants. A class $F \in H^2(S, \mathbb{Z})$ is called nef if its intersection with every effective curve is nonnegative. The real cone $\overline{C}_{S,\mathbb{R}}$ of nef classes is the closure of the ample cone. Let $\delta(C_S)$ be the set of primitive classes in $(\overline{C}_{S,\mathbb{R}} \setminus C_{S,\mathbb{R}}) \cap H^2(S, \mathbb{Z})$, and put $\overline{C}_S := C_S \cup \delta(C_S)$. Let $S_S := \{F \in \delta(C_S) \mid F^2 = 0\}$. An upper index $G$ will indicate that we allow only classes $H$ with $H K_S \le 0$. We now extend the generating functions for the $\chi_y$-genera, Euler numbers and signatures of the $M_S^H(C, d)$ to the whole of $\overline{C}_S$.

Definition 4.2. Let $C \in H^2(S, \mathbb{Z})$.

1. Let $F \in \delta(C_S^G)$, and assume that $CF$ is odd. Then, for each $d$, the class $F$ lies in the closure of a unique chamber $\alpha \subset C_S^G$ of type $(C, d)$. We put $M_S^F(C, d) := M_S^H(C, d)$ for $H \in \alpha$.

2. Let $C \in H^2(S, \mathbb{Z})$, and let $H \in C_S^G$, not lying on a wall of type $C$. Let $F \in \overline{C}_S$. If $F \in \delta(C_S)$, we assume that $K_S F \ne 0$ or $FC$ is odd. We put
\[ \mathbb{X}^{S,F}_C := \sum_{d \ge 0} \chi_y^v(M_S^H(C, d))\, q^{d - e(S)/12} + \Theta^{H,F}_{\Gamma,C}(2\tau, K_S z)\, \frac{\eta(\tau)^{2\sigma(S) - 2} (y^{1/2} - y^{-1/2})}{\theta_{1,1}(\tau, z)^2}. \]
If $C \notin 2H^2(S, \mathbb{Z})$ we can replace $\chi_y^v(M_S^H(C, d))$ by $\chi_y(M_S^H(C, d))$. We also put
\[ (\mathbb{E}^{S,F}_C)_0 := \sum_{d \ge 0} e(N_S^H(C, d))\, q^d + \mathrm{Coeff}_{2\pi i z}\, \Theta^{H,F}_{\Gamma,C}(2\tau, K_S z), \qquad \mathbb{E}^{S,F}_C := \frac{(\mathbb{E}^{S,F}_C)_0}{\eta(\tau)^{2e(S)}}. \]
Finally we put $\Sigma^{S,F}_C = 0$ if $C^2 \equiv 0$ modulo 2, and otherwise
\[ \Sigma^{S,F}_C := \sum_{d \ge 0} (-1)^{e(d)/2}\, \sigma(M_S^H(C, d))\, q^{d - e(S)/12} + \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4}\, \Theta^{H,F}_{\Gamma,C}(2\tau, K_S/2). \]
The cocycle condition
\[ \Theta^{F,G}_{\Gamma,C} + \Theta^{G,H}_{\Gamma,C} = \Theta^{F,H}_{\Gamma,C} \qquad \text{(Rem. 3.4 from [G-Z])} \tag{4.2} \]
and Thm. 4.1 imply that the definitions of $\mathbb{X}^{S,F}_C$, $(\mathbb{E}^{S,F}_C)_0$, $\mathbb{E}^{S,F}_C$ and $\Sigma^{S,F}_C$ are independent of $H$.
Remark 4.3. The definitions above are motivated as follows. Denote by $X_S^F(C, d)$, $E_S^F(C, d)$ and $S_S^F(C, d)$ the coefficients of $q^{d - e(S)/12}$ in $\mathbb{X}^{S,F}_C$, $\mathbb{E}^{S,F}_C$ and $\Sigma^{S,F}_C$. Then

1. If $F \in C_S^G$ does not lie on a wall of type $(C, d)$, then by Thm. 4.1 $X_S^F(C, d) = \chi_y(M_S^F(C, d))$, $E_S^F(C, d) = e(M_S^F(C, d))$ and $S_S^F(C, d) = (-1)^{e(d)/2} \sigma(M_S^F(C, d))$. If $C \equiv 0$ modulo 2, then the moduli space $M_S^F(C, d)$ will sometimes be singular. We defined $\sigma(M_S^F(C, d)) := 0$, which is reasonable, as $M_S^F(C, d)$ has odd complex dimension.

2. If $L \in C_S$ fulfills $K_S L > 0$, then $M_S^L(C, d)$ is not necessarily smooth, and the wallcrossing formulas of Thm. 4.1 need not be true. We extend the generating function by formally requiring that Thm. 4.1 holds. We have seen above that this gives a consistent definition of the generating functions. I believe that there exists a geometric definition of the $\chi_y(M_S^L(C, d))$ given by this generating function.

3. If $F \in C_S$ lies on a finite number of walls of type $(C, d)$, then $X_S^F(C, d)$ is the average of the $\chi_y(M_S^G(C, d))$ over all the chambers of type $(C, d)$ which contain $F$ in their closure (and similarly for $E_S^F(C, d)$ and $S_S^F(C, d)$).

4. If $F \in S_S$ and $CF$ is even, then $F$ usually lies on infinitely many walls of type $(C, d)$ and in the closure of infinitely many chambers. We can view $X_S^F(C, d)$ as a renormalized average over all these chambers. Note that in this case $X_S^F(C, d)$ need not be a Laurent polynomial in $y$, and $E_S^F(C, d)$ and $S_S^F(C, d)$ need not be integers.

Remark 4.4. Note that we did not define $\mathbb{X}^{S,F}_C$ for $F \in S_S$ in case $K_S F = 0$ and $FC$ even. The point is that in this case $\Theta^{H,F}_{\Gamma,C}(2\tau, K_S z)$ is not well-defined: $\Theta^{H,F}_{\Gamma,C}(2\tau, \cdot)$ has a pole along $F^\perp$. If $S$ is the blowup of $\mathbb{P}^2$ in 9 points and $F = -K_S$, it is easy to see that $\chi_y(M_S^L(C, d))$ is constant for $L$ near $F$, and one can therefore define $\chi_y(M_S^F(C, d)) := \chi_y(M_S^L(C, d))$. This case has been studied by Yoshioka [Y4] in order to check predictions from [M-N-V-W].

In the future, whenever we deal with $\mathbb{X}^{S,F}_C$, $\mathbb{E}^{S,F}_C$ for $F \in S_S$, we implicitly assume that $K_S F \ne 0$ or $CF$ is odd.

Corollary 4.5. Let $F$, $G$ in $\overline{C}_S$, then
\[ \mathbb{X}^{S,F}_C - \mathbb{X}^{S,G}_C = (y^{1/2} - y^{-1/2})\, \frac{\eta(\tau)^{2\sigma(S) - 2}}{\theta_{1,1}(\tau, z)^2}\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S z), \]
\[ (\mathbb{E}^{S,F}_C)_0 - (\mathbb{E}^{S,G}_C)_0 = \mathrm{Coeff}_{2\pi i z}\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S z), \]
\[ \mathbb{E}^{S,F}_C - \mathbb{E}^{S,G}_C = \frac{1}{\eta(\tau)^{2e(S)}}\, \mathrm{Coeff}_{2\pi i z}\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S z), \]
\[ \Sigma^{S,F}_C - \Sigma^{S,G}_C = \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4}\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S/2). \]
Proof. This is straightforward from Thm. 4.1, Def. 4.2 and the cocycle condition (4.2). □

4.3. Birational properties. Let $\widehat{S}$ be the blowup of $S$ in a point, and let $E$ be the exceptional divisor. We identify $H^2(S, \mathbb{Z})$ with $E^\perp \subset H^2(\widehat{S}, \mathbb{Z})$.
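Corollary 4.6 (Blowup formulas). Assume $C \notin 2H^2(S, \mathbb{Z})$, then for all $F \in \overline{C}_S$ we have

1. $\mathbb{X}^{\widehat{S},F}_C = \dfrac{\theta_{0,0}(2\tau, z)}{\eta(\tau)^2}\, \mathbb{X}^{S,F}_C$, $\quad \mathbb{X}^{\widehat{S},F}_{C+E} = \dfrac{\theta_{1,0}(2\tau, z)}{\eta(\tau)^2}\, \mathbb{X}^{S,F}_C$.

2. $\Sigma^{\widehat{S},F}_C = \dfrac{1}{\eta(2\tau)}\, \Sigma^{S,F}_C$, $\quad \Sigma^{\widehat{S},F}_{C+E} = 0$.

3. If $F \in C_S$ or $F \in \delta(C_S)$ and $FC$ is odd, then $\mathbb{E}^{\widehat{S},F}_C = \dfrac{\theta(2\tau)}{\eta(\tau)^2}\, \mathbb{E}^{S,F}_C$, $\quad \mathbb{E}^{\widehat{S},F}_{C+E} = \dfrac{\theta^0_{1,0}(2\tau)}{\eta(\tau)^2}\, \mathbb{E}^{S,F}_C$.

Proof. If $C \notin 2H^2(S, \mathbb{Z})$ and $F \in C_S^G$ does not lie on a wall of type $(C)$, or if $F \in \delta(C_S^G)$ and $CF$ is odd, then by [Be] all the cohomology of $M_S^F(C, d)$, of $M^F_{\widehat{S}}(C, d)$ and of $M^F_{\widehat{S}}(C + E, d)$ is of type $(p, p)$. Therefore in this case the result follows from the blowup formulas from [L-Q1,L-Q2]. Alternatively one can use Prop. 3.1. In order to check the result for general $F$, we have to check that the blowup formula is compatible with our extension. Let $\Gamma := H^2(S, \mathbb{Z})$ and $\widehat{\Gamma} := H^2(\widehat{S}, \mathbb{Z})$ with the negative of the intersection forms. By definition the compatibility of $\mathbb{X}^{S,F}_C$ with the blowup formulas amounts to the easy formulas
\[ \Theta^{G,F}_{\widehat{\Gamma},C}(\tau, K_{\widehat{S}} z) = \theta_{0,0}(\tau, z)\, \Theta^{G,F}_{\Gamma,C}(\tau, K_S z), \qquad \Theta^{G,F}_{\widehat{\Gamma},C+E}(\tau, K_{\widehat{S}} z) = \theta_{1,0}(\tau, z)\, \Theta^{G,F}_{\Gamma,C}(\tau, K_S z). \]
The result for the signatures follows by
\[ \frac{\theta_{0,0}(2\tau, \tfrac{1}{2})}{\eta(\tau)^2} = \frac{\theta^0_{0,1}(2\tau)}{\eta(\tau)^2} = \frac{1}{\eta(2\tau)}, \qquad \theta_{1,0}(2\tau, \tfrac{1}{2}) = \theta_{1,1}(\tau, 0) = 0. \]
□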
Immediately from (2) we get:

Corollary 4.7. Assume that $\pi : S \to X$ is the blowup of a rational surface $X$ in finitely many points. Let $C \in H^2(S, \mathbb{Z}) \setminus \pi^*(H^2(X, \mathbb{Z}))$. Let $F \in \pi^*(\overline{C}^G_X)$ not on a wall of type $(C, d)$. If $F \in \pi^*(\delta(C^G_X))$, assume that $CF$ is odd. Then $\sigma(M_S^F(C, d)) = 0$.

Corollary 4.5 expresses the differences $\mathbb{X}^{S,F}_C - \mathbb{X}^{S,G}_C$ in terms of the theta functions $\Theta^{F,G}_{\Gamma,C}$. We now want to show that in case $C \notin 2H^2(S, \mathbb{Z})$ we can for suitable $G$ also express $\mathbb{X}^{S,F}_C$, $\mathbb{E}^{S,F}_C$ and $\Sigma^{S,F}_C$ in terms of $\Theta^{F,G}_{\Gamma,C}$. We use the following easy fact (see e.g. [Q1,H-L]).

Lemma 4.8. Let $\pi : X \to \mathbb{P}^1$ be a rational ruled surface. Let $S$ be obtained from $X$ by successively blowing up a number of points. Let $G$ be the pullback of the class of a fibre of $\pi$. Let $C \in H^2(S, \mathbb{Z})$ with $CG$ odd. Then $M_S^G(C, d) = \emptyset$ for all $d$.

Proposition 4.9. Let $F \in \overline{C}_S$. Assume $C \notin 2H^2(S, \mathbb{Z})$. There exists a blowup $\widetilde{S}$ of $S$ in $n$ points and a $G \in S^G_{\widetilde{S}}$ such that $\mathbb{X}^{\widetilde{S},G}_C = 0$. Let $E_1, \ldots, E_n$ be the classes of the exceptional divisors, and write $\widetilde{\Gamma} := \Gamma \oplus \langle E_1, \ldots, E_n \rangle$. Then
\[ \mathbb{X}^{S,F}_C = \frac{\eta(\tau)^{2\sigma(S) - 2} (y^{1/2} - y^{-1/2})}{\theta_{1,1}(\tau, z)^2}\, \frac{\Theta^{G,F}_{\widetilde{\Gamma},C}(2\tau, K_{\widetilde{S}} z)}{\theta_{0,0}(2\tau, z)^n}, \]
\[ \mathbb{E}^{S,F}_C = \frac{\mathrm{Coeff}_{2\pi i z}\, \Theta^{G,F}_{\widetilde{\Gamma},C}(2\tau, K_{\widetilde{S}} z)}{\eta(\tau)^{2e(S)}\, \theta(2\tau)^n}, \]
\[ \Sigma^{S,F}_C = \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4}\, \frac{\Theta^{G,F}_{\widetilde{\Gamma},C}(2\tau, K_{\widetilde{S}}/2)}{\theta^0_{0,1}(2\tau)^n}. \]
(The factor $y^{1/2} - y^{-1/2}$ in the first formula is the one required by Cor. 4.5 and Cor. 4.6.)

Proof. Any rational surface $S$ can be blown up in such a way that $\widetilde{S}$ is a blowup of a ruled surface $X$, and $CG$ is odd for $G$ the pullback of the fibre. Then by Lem. 4.8 $M^G_{\widetilde{S}}(C, d) = \emptyset$ for all $d$, and therefore $\mathbb{X}^{\widetilde{S},G}_C = 0$. The formulas are then a straightforward application of Thm. 4.1 and the blowup formula Cor. 4.6. □

Corollary 4.10. $\mathbb{X}^{S,H}_C$ is invariant under deformations of the triple $(S, H, C)$.

Let $p_1, p_2, p_3$ be three non-collinear points in $\mathbb{P}^2$. Let $L_1, L_2, L_3$ be the lines through pairs of the $p_i$ with $p_i, p_j \in L_k$ for distinct indices $i, j, k$. Let $X$ be the blowup of $\mathbb{P}^2$ in $p_1, p_2, p_3$, let $E_1, E_2, E_3$ be the exceptional divisors and $\overline{E}_1, \overline{E}_2, \overline{E}_3$ the strict transforms of $L_1, L_2, L_3$. They can be blown down to points $\overline{p}_1, \overline{p}_2, \overline{p}_3$ to obtain another projective plane $\overline{\mathbb{P}}^2$. Let $H$ and $\overline{H}$ be the hyperplane classes on $\mathbb{P}^2$ and $\overline{\mathbb{P}}^2$. Let $S$ be the blowup of $X$ in additional points $p_4, \ldots, p_r$ with exceptional divisors $E_4, \ldots, E_r$. We denote the pullbacks of $H$ and $\overline{H}$ by the same letters. Then $\overline{H} = 2H - E_1 - E_2 - E_3$, $\overline{E}_i = H - E_j - E_k$ for $i, j, k$ distinct in $\{1, 2, 3\}$. We can view $S$ both as a blowup of $\mathbb{P}^2$ in $p_1, \ldots, p_r$ and as a blowup of $\overline{\mathbb{P}}^2$ in $\overline{p}_1, \overline{p}_2, \overline{p}_3, p_4, \ldots, p_r$. The change of viewpoint amounts to a Cremona transform on $H^2(S, \mathbb{Z})$ sending $dH - \sum_{i=1}^{r} a_i E_i$ to
\[ (2d - a_1 - a_2 - a_3) H - (d - a_2 - a_3) E_1 - (d - a_1 - a_3) E_2 - (d - a_1 - a_2) E_3 - \sum_{i=4}^{r} a_i E_i. \]
This shows:

Corollary 4.11. Let $C \in H^2(S, \mathbb{Z})$ and $F \in \overline{C}_S$. Let $G$ be the group generated by the Cremona transforms and the permutations of $E_1, \ldots, E_r$. Then for all $g \in G$ we have $\mathbb{X}^{S,F}_C = \mathbb{X}^{S,g(F)}_{g(C)}$, and, if $F \in C_S$, or $F \in \delta(C_S^G)$ with $FC$ odd, then $M_S^F(C, d) = M_S^{g(F)}(g(C), d)$ for all $d$.
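The Cremona transform described before Corollary 4.11 acts on coordinate vectors $(d, a_1, \ldots, a_r)$ of classes $dH - \sum_i a_i E_i$. The sketch below is an illustration, with hypothetical function names, of three properties one expects of it: it is an involution, it preserves the intersection form ($H^2 = 1$, $E_i^2 = -1$, all other products zero), and it fixes the canonical class $K_S = -3H + \sum_i E_i$.

```python
from itertools import product


def cremona(cls):
    """Cremona transform of (d, a_1, ..., a_r), r >= 3, based at E_1, E_2, E_3."""
    d, a1, a2, a3, *rest = cls
    return (2 * d - a1 - a2 - a3,
            d - a2 - a3,
            d - a1 - a3,
            d - a1 - a2,
            *rest)


def intersect(x, y):
    """Intersection number of dH - sum a_i E_i with d'H - sum a'_i E_i."""
    return x[0] * y[0] - sum(p * q for p, q in zip(x[1:], y[1:]))


r = 4
K = (-3,) + (-1,) * r              # K_S = -3H + E_1 + ... + E_r
H = (1,) + (0,) * r                # hyperplane class
samples = list(product(range(-2, 3), repeat=r + 1))

is_involution = all(cremona(cremona(x)) == x for x in samples)
preserves_form = all(
    intersect(cremona(x), cremona(y)) == intersect(x, y)
    for x in samples[::37] for y in samples[::41]
)
fixes_K = cremona(K) == K
image_of_H = cremona(H)            # coordinates of 2H - E_1 - E_2 - E_3
```

In these coordinates the map is the reflection in the class $H - E_1 - E_2 - E_3$ of self-intersection $-2$, which is one way to see why the intersection form and $K_S$ are preserved.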
5. Transformation Properties on the Boundary

Vafa and Witten [V-W] made predictions for the modular behaviour of generating functions of the Euler numbers of moduli spaces of sheaves on algebraic surfaces. Up to possible quasimodularity their generating function $Z_C^S$ (which can be essentially identified with $\mathbb{E}^{S,F}_C$) should fulfill the equations
\[ Z_C^S(\tau + 1) = \epsilon\, Z_C^S; \qquad Z_C^S(-1/\tau) = \pm 2^{-b_2(S)/2} \Big( \frac{\tau}{i} \Big)^{-e(S)/2} \sum_{D} (-1)^{DC} Z_D^S. \tag{5.1} \]
Here $\epsilon$ is a root of unity, and $D$ runs through a system of representatives of $H^2(S, \mathbb{Z})$ modulo $2H^2(S, \mathbb{Z})$. We want to show that for $F \in S_S$ a similar transformation behaviour holds for $\mathbb{X}^{S,F}_C$. Formulas similar to those of (5.1) for the Euler numbers then follow as a corollary. In addition we also get the modular behaviour for the signatures. For the purpose of this section we will for $F \in \overline{C}_S$ define
\[ \mathbb{X}^{S,F}_0 := \frac{\eta(\tau)^2}{\theta_{1,0}(2\tau, z)}\, \mathbb{X}^{\widehat{S},F}_E, \qquad \mathbb{E}^{S,F}_0 := \frac{\eta(\tau)^2}{\theta^0_{1,0}(2\tau)}\, \mathbb{E}^{\widehat{S},F}_E, \]
where $\widehat{S}$ is the blowup of $S$ in a point, and $E$ is the class of the exceptional divisor, i.e. we formally use the blowup formulas Cor. 4.6 (which do not apply). This is similar to the approach for the Donaldson invariants. We put for any $H \in C_S$
\[ Y^{S,F}_C(\tau, z) := \frac{\mathbb{X}^{S,F}_C}{y^{1/2} - y^{-1/2}}, \qquad F^{S,F}_C(\tau) := \mathbb{E}^{S,F}_C(\tau) - \frac{K_S^2\, G_2(\tau)}{2\, \eta(\tau)^{2e(S)}}\, \mathrm{Coeff}_{(2\pi i z)^{-1}}\, \Theta^{H,F}_{\Gamma,C}(2\tau, K_S z). \]
Note that in case $CF$ is odd we just have $F^{S,F}_C(\tau) := \mathbb{E}^{S,F}_C(\tau)$, because $\Theta^{H,F}_{\Gamma,C}(2\tau, K_S z)$ is holomorphic at $z = 0$. By the cocycle condition (4.2) $F^{S,F}_C(\tau)$ is independent of $H$.

Theorem 5.1. For all $C \in H^2(S, \mathbb{Z})$ and all $F \in S_S$ with $F K_S \ne 0$ we have

1. $Y^{S,F}_C$ transforms according to the rules
\[ Y^{S,F}_C(\tau + 1, z) = (-1)^{-e(S)/6 - C^2/2}\, Y^{S,F}_C(\tau, z), \]
\[ Y^{S,F}_C(-1/\tau, z/\tau) = -i \Big( \sqrt{\frac{2\tau}{i}} \Big)^{-b_2(S)} \exp\big( -\pi i (K_S^2/2 + 2) z^2/\tau \big) \sum_{D \in H^2(S, \mathbb{Z})/2H^2(S, \mathbb{Z})} (-1)^{CD}\, Y^{S,F}_D(\tau, z). \]

2. $F^{S,F}_C$ transforms according to
\[ F^{S,F}_C(\tau + 1) = (-1)^{-e(S)/6 - C^2/2}\, F^{S,F}_C(\tau), \]
\[ F^{S,F}_C(-1/\tau) = -2^{-b_2(S)/2} \Big( \sqrt{\frac{\tau}{i}} \Big)^{-e(S)} \sum_{D \in H^2(S, \mathbb{Z})/2H^2(S, \mathbb{Z})} (-1)^{CD}\, F^{S,F}_D(\tau). \]

3. Assume $C^2$ is odd. Then $\dfrac{\eta(\tau)^2}{\eta(2\tau)^{\sigma(S)}}\, \Sigma^{S,F}_C$ is a modular function on $\Gamma(2)$.

Remark 5.2. By the fact that $G_2(\tau) + 1/(8\pi\, \mathrm{Im}(\tau))$ transforms like a modular form of weight 2, we could also define
\[ F^{S,F}_C(\tau) := \mathbb{E}^{S,F}_C(\tau) + \frac{K_S^2}{16\pi\, \mathrm{Im}(\tau)\, \eta(\tau)^{2e(S)}}\, \mathrm{Coeff}_{(2\pi i z)^{-1}}\, \Theta^{H,F}_{\Gamma,C}(2\tau, K_S z), \]
and get the same transformation behaviour in 2.
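Proof. (1) Let $F, G \in S_S$. We first want to show that (1) holds if we replace $Y^{S,F}_C$ and $Y^{S,F}_D$ by $Y^{S,F}_C - Y^{S,G}_C$ and $Y^{S,F}_D - Y^{S,G}_D$. By Cor. 4.5 we have
\[ Y^{S,F}_C - Y^{S,G}_C = \frac{\eta(\tau)^{2\sigma(S) - 2}}{\theta_{1,1}(\tau, z)^2}\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S z). \]
By (2.2) and (2.6), we know that
\[ \frac{\eta(\tau + 1)^{2\sigma(S) - 2}}{\theta_{1,1}(\tau + 1, z)^2} = (-1)^{-e(S)/6}\, \frac{\eta(\tau)^{2\sigma(S) - 2}}{\theta_{1,1}(\tau, z)^2}, \qquad \frac{\eta(-1/\tau)^{2\sigma(S) - 2}}{\theta_{1,1}(-1/\tau, z/\tau)^2} = -\Big( \frac{\tau}{i} \Big)^{-b_2(S)} e^{-2\pi i z^2/\tau}\, \frac{\eta(\tau)^{2\sigma(S) - 2}}{\theta_{1,1}(\tau, z)^2}. \]
Furthermore, putting $T(\tau, z) := \Theta^{G,F}_{\Gamma,C}(2\tau, K_S z)$ we get by (2.16),
\[ T(\tau + 1, z) = (-1)^{-C^2/2}\, T(\tau, z), \]
\[ T(-1/\tau, z/\tau) = \Theta^{G,F}_{\Gamma,C}(-2/\tau, K_S z/\tau) = i \Big( \sqrt{\frac{\tau}{2i}} \Big)^{b_2(S)} \exp\big( -\pi i K_S^2 z^2/(2\tau) \big)\, \Theta^{G,F}_{\Gamma}(\tau/2, K_S z/2 + C/2). \]
Putting this together, we obtain
\[ (Y^{S,F}_C - Y^{S,G}_C)(-1/\tau, z/\tau) = -i \Big( \sqrt{\frac{2\tau}{i}} \Big)^{-b_2(S)} \exp\big( -\pi i (K_S^2/2 + 2) z^2/\tau \big)\, \Theta^{G,F}_{\Gamma}(\tau/2, K_S z/2 + C/2). \]
Finally we have
\[ \Theta^{G,F}_{\Gamma}(\tau/2, K_S z/2 + C/2) = \sum_{\xi \in \Gamma} \big( \mu(\langle \xi, G \rangle) - \mu(\langle \xi, F \rangle) \big)\, q^{\langle \xi, \xi \rangle/4} (-1)^{C\xi}\, y^{\langle K_S, \xi \rangle/2} \]
\[ = \sum_{D} (-1)^{DC} \sum_{\xi \in \Gamma + D/2} \big( \mu(\langle \xi, G \rangle) - \mu(\langle \xi, F \rangle) \big)\, q^{\langle \xi, \xi \rangle}\, y^{\langle K_S, \xi \rangle} = \sum_{D} (-1)^{DC}\, \Theta^{G,F}_{\Gamma,D}(2\tau, K_S z). \tag{5.2} \]
This shows (1) for $Y^{S,F}_C - Y^{S,G}_C$. It is therefore enough to show (1) for $Y^{S,F}_C$ for all $S$, $C$ and one particular $F \in S_S$. Let $\epsilon : \widehat{S} \to S$ be the blowup in a point with exceptional divisor $E$. By the blowup formulas Cor. 4.6 and the definition of $\mathbb{X}^{S,F}_0$ we get
\[ Y^{\widehat{S},F}_C(\tau, z) = \frac{\theta_{0,0}(2\tau, z)}{\eta(\tau)^2}\, Y^{S,F}_C(\tau, z), \qquad Y^{\widehat{S},F}_{C+E}(\tau, z) = \frac{\theta_{1,0}(2\tau, z)}{\eta(\tau)^2}\, Y^{S,F}_C(\tau, z), \]
and therefore
\[ Y^{\widehat{S},F}_C(\tau + 1, z) = (-1)^{-1/6}\, \frac{\theta_{0,0}(2\tau, z)}{\eta(\tau)^2}\, Y^{S,F}_C(\tau + 1, z), \qquad Y^{\widehat{S},F}_{C+E}(\tau + 1, z) = (-1)^{1/2 - 1/6}\, \frac{\theta_{1,0}(2\tau, z)}{\eta(\tau)^2}\, Y^{S,F}_C(\tau + 1, z). \]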
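By the transformation behaviour of $\theta_{0,0}$, $\theta_{1,0}$ (see [C, Sect. V.8.]) and $\eta$ we also see
\[ Y^{\widehat{S},F}_C(-1/\tau, z/\tau) = \sqrt{\frac{i}{2\tau}}\, \exp\big( (\pi i/2) z^2/\tau \big)\, \frac{\theta_{0,0}(\tau/2, z/2)}{\eta(\tau)^2}\, Y^{S,F}_C(-1/\tau, z/\tau), \]
\[ Y^{\widehat{S},F}_{C+E}(-1/\tau, z/\tau) = \sqrt{\frac{i}{2\tau}}\, \exp\big( (\pi i/2) z^2/\tau \big)\, \frac{\theta_{0,1}(\tau/2, z/2)}{\eta(\tau)^2}\, Y^{S,F}_C(-1/\tau, z/\tau). \]
Using the elementary identities
\[ \theta_{0,0}(\tau/2, z/2) = \theta_{0,0}(2\tau, z) + \theta_{1,0}(2\tau, z), \qquad \theta_{0,1}(\tau/2, z/2) = \theta_{0,0}(2\tau, z) - \theta_{1,0}(2\tau, z) \]
it follows that (1) holds for $Y^{S,F}_C$ for all $C \in H^2(S, \mathbb{Z})$ if and only if it holds for $Y^{\widehat{S},F}_C$ for all $C \in H^2(\widehat{S}, \mathbb{Z})$. As any two rational surfaces can be connected by a sequence of blowups and blow downs, it is enough to check the result for $S = \mathbb{P}^1 \times \mathbb{P}^1$ and $F$ the class of a fibre of the first projection. Let $G$ be the class of a fibre of the second projection. By Lem. 4.8 we have $Y^{S,F}_G = Y^{S,F}_{F+G} = 0$. Denote by $\widehat{\mathbb{P}}^2$ the blowup of $\mathbb{P}^2$ in a point with exceptional divisor $E_1$. Let $H \in H^2(\widehat{\mathbb{P}}^2, \mathbb{Z})$ be the class of a hyperplane. Let $\sigma : \widetilde{\mathbb{P}}^2 \to \widehat{\mathbb{P}}^2$ be the blowup in a point with exceptional divisor $E_2$. There exists a blowup $\epsilon : \widetilde{\mathbb{P}}^2 \to \mathbb{P}^1 \times \mathbb{P}^1$ with exceptional divisor $E$ such that $\sigma^*(H - E_1) = \epsilon^*(F)$ and $E = \epsilon^*(F) - E_2$. By Cor. 4.6 and our definition of $Y^{S,F}_0$ we get that
\[ Y^{S,F}_F = \frac{\eta(\tau)^2}{\theta_{0,0}(2\tau, z)}\, Y^{\widetilde{\mathbb{P}}^2, \epsilon^*(F)}_{\epsilon^*(F)} = Y^{\widehat{\mathbb{P}}^2, H - E_1}_{H - E_1}, \qquad Y^{S,F}_0 = \frac{\eta(\tau)^2}{\theta_{1,0}(2\tau, z)}\, Y^{\widetilde{\mathbb{P}}^2, \epsilon^*(F)}_{E} = Y^{\widehat{\mathbb{P}}^2, H - E_1}_{H - E_1}. \]
By $F^2 = G^2 = 0$, $FG = 1$, (1) follows.

(2) By an argument that is very similar to that at the end of the proof of (1), it is enough to show the formula for the difference $F^{S,F}_C - F^{S,G}_C$, for $F, G \in S_S$. The transformation behaviour $F^{S,F}_C(\tau + 1) = (-1)^{-e(S)/6 - C^2/2} F^{S,F}_C(\tau)$ follows immediately from the corresponding transformation behaviour of $Y^{S,F}_C$. By the transformation behaviour of $\Theta^{G,F}_{\Gamma,C}(2\tau, K_S z)$, the transformation properties (2.1) of $G_2$ and (5.2), we see that
\[ U_C(\tau, z) := \exp\big( 2\pi^2 K_S^2\, G_2(\tau)\, z^2 \big)\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S z) \]
transforms according to
\[ U_C(-1/\tau, z/\tau) = i \Big( \sqrt{\frac{\tau}{2i}} \Big)^{b_2(S)} \sum_{D \in H^2(S, \mathbb{Z})/2H^2(S, \mathbb{Z})} (-1)^{CD}\, U_D(\tau, z), \]
where the exponent $b_2(S)$ and the signs $(-1)^{CD}$ are the ones given by the formula for $T(-1/\tau, z/\tau)$ and by (5.2). By definition
\[ (F^{S,F}_C - F^{S,G}_C)(\tau) = \frac{1}{\eta(\tau)^{2e(S)}}\, \mathrm{Coeff}_{2\pi i z}\, U_C(\tau, z), \]
and (2) follows.

(3) Let $\widehat{S}$ be the blowup of $S$ in a point. By the blowup formula Cor. 4.6, we see that the statements for $(S, F, C)$ and $(\widehat{S}, F, C)$ are equivalent. Therefore, by Prop. 4.9, we can assume that there exists a $G \in S^G_S$ such that $\mathbb{X}^{S,G}_C = 0$ and
\[ \Sigma^{S,F}_C(\tau) = \frac{\eta(\tau)^{2\sigma(S)}}{2i\, \eta(2\tau)^4}\, \Theta^{G,F}_{\Gamma,C}(2\tau, K_S/2). \]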
By [G-Z], Thm 3.13.1) the function τ 7 → G(τ ) =
θ(τ )σ (S) G,F 20,C (τ, C/2) f (τ )
is a modular function on 0u . Therefore, by Rem. 2.2, τ 7→ G(2τ + 1) is a modular function on 0(2). By (2.16), we have −3C 2G,F 0,C (2τ + 1, C/2) = (−1)
2 /4+CK /2 S
2G,F 0,C (2τ, KS /2),
and by Rem. 2.2 we see that η(τ )2σ (S)+2 θ(2τ + 1)σ (S) . = f (2τ + 1) η(2τ )σ (S)+4 The result follows. u t 6. The Signature and the Donaldson Invariants Let again S be a rational algebraic surface, let H ∈ CS , and let E be a differentiable complex vector bundle on S, with Chern classes (C, c2 ). Let d := c2 − C 2 /4 and e := 4d − 3. Let Ae (S) be the set of polynomials of weight e in H2 (S, Q) ⊕ H0 (S, Q), where a ∈ H2 (S, C) has weight 1, and the class p ∈ H0 (S, Z) of a point has weight 2. The Donaldson invariants corresponding to E, the Fubini-Study metric associated to H , and the homology orientation determined by the connected component of L ∈ H 2 (S, R) L2 > 0 containing H are a linear map 8S,H C,e : Ae (S) → Q. Let X M := 8S,H Ae (S) → Q. 8S,H C C,e : A∗ (S) := e≥0
e≥0
In [K-M] it is shown (more generally for simply connected 4-manifolds S with b+ = 1, where now H is the period point of a Riemannian metric on S), that 8S,H C,e depends
S,L only on the chamber of type (C, d) of H , and that 8S,H C,e − 8C,e can be expressed as S , for ξ running through the classes of type (C, d) with a sum of wallcrossing terms δξ,e S . ξ H < 0 < ξ L. Kotschick and Morgan make a conjecture about the structure of the δξ,e
S (x e ) is a polynomial in ξ x, x 2 whose coefficients depend Conjecture 6.1. [K-M] δξ,e 2 only on ξ , e and the homotopy type of S. S,L Using this conjecture the difference 8S,H C,e − 8C,e was in [Gö3] and [G-Z] expressed in terms of modular forms and theta functions. In a series of (in part forthcoming) papers [F-L1,F-L2,F-L3,F-L4] Feehan and Leness work towards a proof of Conjecture 6.1. In [F-L1] some necessary gluing results are proven. We will in this section assume Conjecture 6.1 and show that for any class H ∈ C S the generating function for the signatures σ (MSH (C, d)) is also (with respect to a different r development parameter) the generating function for the Donaldson invariants 8S,H C (p ), evaluated on the powers of the point class p. The reason for this result is that both Donaldson invariants and the signatures of the moduli spaces vanish in certain chambers, the chamber structures for Donaldson invariants and signatures are the same, and the wallcrossing terms for Donaldson invariants and signatures are related.
Theorem 6.2. Assume Conjecture 6.1. Let H ∈ C S . Then
$$\Phi^{S,H}_C(p^r) = (-1)^{(CK_S+1)/2}\,\mathrm{Coeff}_{u(\tau)^{r+1}}\left[\frac{4\eta(\tau)^2}{\eta(2\tau)^{\sigma(S)}}\,\Sigma^{S,H}_C\right].$$
Here $\mathrm{Coeff}_{u(\tau)^{r+1}} V(\tau)$ is the coefficient of $u(\tau)^{r+1}$ in the Laurent development of $V(\tau)$ in powers of $u(\tau)$. In particular, if H ∈ CSG does not lie on a wall of type C or F ∈ δ(CS ) with CF odd, then
$$\sum_{d\ge 0}(-1)^{e(d)/2}\,\sigma(M^S_H(C,d))\,q^{d-e(S)/12} = (-1)^{(CK_S+1)/2}\,\frac{\eta(2\tau)^{\sigma(S)}}{4\eta(\tau)^2}\sum_{r\ge 0}\Phi^{S,H}_C(p^r)\,u(\tau)^{r+1}.$$
Proof. We note that the result is trivially true if $C^2$ is even. We assume that $C^2$ is odd.
Case 1: Assume that S is the blowup of a ruled surface and that CG is odd for G the pullback of the class of the fibre of the ruling. Then we get by Prop. 4.9,
$$\Sigma^{S,H}_C = \frac{\eta(\tau)^{2\sigma(S)}}{2i\,\eta(2\tau)^4}\,\Theta^{G,H}_{\Gamma,C}(2\tau, K_S/2).$$
On the other hand we get by [G-Z, Cor. 4.3] and Lem. 5.1,
$$\Phi^{S,H}_C(p^r) = \mathrm{Coeff}_{u(\tau)^{r+1}}\left[(-1)^{\frac{3}{4}C^2}\,\frac{2\theta(\tau)^{\sigma(S)}}{f(\tau)}\,\Theta^{G,H}_{\Gamma,C}(\tau, C/2)\right].$$
We make the transformation $\tau \to 2\tau + 1$. By (2.16) we get
$$\Theta^{G,H}_{\Gamma,C}(2\tau+1, C/2) = (-1)^{-3C^2/4 + CK_S/2}\,\Theta^{G,H}_{\Gamma,C}(2\tau, K_S/2).$$
Using also Rem. 2.2, we get
$$\Phi^{S,H}_C(p^r) = (-1)^{CK_S/2}\,\mathrm{Coeff}_{u(\tau)^{r+1}}\left[\frac{2\eta(\tau)^{2\sigma(S)+2}}{\eta(2\tau)^{\sigma(S)+4}}\,\Theta^{H,G}_{\Gamma,C}(2\tau, K_S/2)\right] = (-1)^{(CK_S+1)/2}\,\mathrm{Coeff}_{u(\tau)^{r+1}}\left[\frac{4\eta(\tau)^2}{\eta(2\tau)^{\sigma(S)}}\,\Sigma^{S,H}_C\right].$$
This shows the first part. To show the second part, we need to see that the smallest power of $u(\tau)$ that occurs in the development of $\frac{4\eta(\tau)^2}{\eta(2\tau)^{\sigma(S)}}\Sigma^{S,H}_C$ is $u(\tau)$. We see that $u(\tau)$ is $q^{1/2}$ multiplied with a power series in q. In case $C^2 \equiv 1$ modulo 4 it follows from the definition that $\Sigma^{S,H}_C$ is $q^{-e(S)/12+3/4}$ multiplied with a power series in q. If $C^2 \equiv 3$ modulo 4, the fact that $M^S_H(C,d)$ is only nonempty if the expected dimension $4d-3$ is nonnegative implies that $\Sigma^{S,H}_C$ is $q^{-e(S)/12+5/4}$ multiplied with a power series in q. This shows the second part.
General case: Let $\widetilde S$ be the blowup of S in n points, so that Case 1 applies to $\widetilde S$. Then by the blowup formulas for the Donaldson invariants [F-S] and by Cor. 4.6,
$$\Phi^{\widetilde S,H}_C(p^r) = \Phi^{S,H}_C(p^r), \qquad \Sigma^{\widetilde S,H}_C = \frac{1}{\eta(2\tau)^n}\,\Sigma^{S,H}_C.$$
The result follows. □
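The power counting in the second part of the proof can be checked with exact arithmetic. The sketch below is only an illustration and rests on assumptions that are not stated in this section: the standard leading behaviour $\eta(\tau) \sim q^{1/24}$, and, for a rational surface with second Betti number `b2`, the values $e(S) = 2 + b_2$ and $\sigma(S) = 2 - b_2$. The function name `leading_q_exponent` is hypothetical.

```python
from fractions import Fraction

def leading_q_exponent(b2, c_sq_mod4):
    """Leading q-exponent of 4*eta(tau)^2 / eta(2*tau)^sigma(S) * Sigma_C^{S,H}.

    Assumes a rational surface S with second Betti number b2, so that
    e(S) = 2 + b2 and sigma(S) = 2 - b2, and eta(tau) ~ q^(1/24).
    """
    e = 2 + b2
    sigma = 2 - b2
    # Smallest d allowed in the proof: 3/4 if C^2 = 1 mod 4, 5/4 if C^2 = 3 mod 4.
    d_min = {1: Fraction(3, 4), 3: Fraction(5, 4)}[c_sq_mod4]
    eta_squared = Fraction(2, 24)          # eta(tau)^2
    eta_2tau_power = Fraction(-sigma, 12)  # eta(2*tau)^(-sigma)
    sigma_series = d_min - Fraction(e, 12) # q^(d - e(S)/12)
    return eta_squared + eta_2tau_power + sigma_series

for b2 in range(1, 11):
    print(b2, leading_q_exponent(b2, 1), leading_q_exponent(b2, 3))
```

Under these assumptions the exponent does not depend on `b2`: it is 1/2 for $C^2 \equiv 1$ and 1 for $C^2 \equiv 3$ modulo 4, so in both cases the development starts at $u(\tau)$ or later, given that $u(\tau)$ is $q^{1/2}$ times a power series in q.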
Corollary 6.3. If F ∈ SSG , CF is odd and σ (S) > −8, then σ (MSF (C, d)) = 0. Proof. This follows immediately from [G-Z, Cor. 5.5]. u t Remark 6.4. Theorem 6.2 and the results for the K3 surface suggest that there should be a general formula relating the Donaldson invariants and the signatures of the moduli spaces MSH (C, d) for all simply connected algebraic surfaces S even if pg (S) > 0. In general the moduli spaces MSH (C, d) will be very singular, and one first has to find a suitable definition of the signature. The simplest formula that fits the known data seems to be the following: X (−1)e(d)/2 σ (MSH (C, d))q d−e(S)/12 = d
χ (OS ) X η(2τ )σ (S) w(τ ) 8SC (p r )u(τ )r+1 , = ± (2η(τ ))2χ(OS ) r≥0 where w(τ ) = χ (OS ) χ (OS ) 1 2u(τ )+1 + 2u(τ )−1 , 2 1 2
2u(τ )
if 3χ(OS ) − C 2 ≡ 2 mod 4,
2u(τ )
1 + 2u(τ ))χ(OS ) − (−1)χ (OS ) (1 − 2u(τ ))χ (OS ) , if 3χ(OS ) − C 2 ≡ 0 mod 4.
The formula has the following features: 1. It gives the correct result for rational surfaces and for K3-surfaces. 2. It is compatible with the blow-up formulas of [F-S] for the Donaldson invariants and with those of Prop. 3.1 for the signatures. 3. It is compatible with taking the disjoint union of algebraic surfaces. The formula for the rational surfaces is just Thm. 6.2, and the compatibility with the blowup formulas is obvious. We check the formula for the K3 surfaces. We know by [G-H] that, for generic polarization H and suitable C ∈ H 2 (X, Z), the moduli space MXH (C, d) has the same Hodge numbers as X[2d−3] . Let L and M be two such classes in H 2 (X, Z), satisfying L2 ≡ 2 modulo 4 and M 2 ≡ 0 modulo 4. Using (2.17), this gives X (−1)e(d) (σ (MXH (L, d)) + σ (MXH (M, d))q d−2 = d≥0
=
X d≥0
(−1)n σ (S [n] )q (n−1)/2 =
1 . η(τ )4 η(τ/2)16
For the Donaldson invariants we have the following results: X fulfills the simple type 2 = (−1)C /2 for all C ∈ H 2 (S, Z) (see e.g. [Kr-M]). Therefore we condition and 8S,H C X,H 2r 2r 2r+1 ) = 22r+1 , i.e. get 8X,H L (p ) = −2 and 8M (p X r≥0
X,M r r r+1 (8X,H =− L (p ) + 8L (p ))u(τ )
η(2τ )8 u(τ ) =4 . 1 + 2u(τ ) η(τ/2)8
The last identity is an elementary exercise in modular forms (e.g. one multiplies both sides with a suitable modular form on 0(2) such that they both become modular forms on 0(2) and compares the first few coefficients). Putting this together, we obtain X (σ (MXH (L, d) − σ (MXH (M, d))q d−2 d≥0
2 η(2τ )σ (X) X X,H r X,H = (8 (p ) + 8M (pr ))u(τ )r+1 (4η(τ ))2χ(OX ) r≥0 L 2 X X,H η(2τ )σ (X) = (1 − 2u(τ ))2 8L (pr ) (4η(τ ))2χ(OX ) r≥0 2 X X,H η(2τ )σ (X) 2 r = (1 − 1/(2u(τ ))) 8M (p ) . (4η(τ ))2χ(OX ) r≥0
The result follows by collecting the odd powers of u(τ ) (for L) and the even powers of u(τ ) (for M). Remark 6.5. Note that u(τ ) is the modular function on 0(2) that occurs in a natural way in physics ([W], there it is called u). In [M-W] the Donaldson invariants of 4-manifolds with b+ = 1 were (using physics arguments) related to (Borcherds type [Bo]) integrals over the “u-plane” H/ 0(2). This suggests that also many results of this paper could be reformulated in terms of such integrals. For the Euler number we can prove a weaker statement along the same lines. We can relate the generating functions for the difference of the Euler numbers for two polarizations H, L to the difference of certain Donaldson invariants between H and L. Let kS be the Poincaré dual of KS . Proposition 6.6. Let H , L ∈ CS not on a wall of type 0. Then 6 X iη(2τ ) S,L S,L r r r+1 = 8S,H . ES,H 0 −E0 0 (kS p )−80 (kS p ) u(2τ ) 2θ(2τ )σ (S)+2 η(τ )e(S) r≥0
Proof. The proof is similar to the case of the signature. By Corollary 4.5 we have i h 1 S,L L,H − E = Coeff (2τ, K z) . 2 ES,H 2π iz S 0 0 0,0 η(τ )2e(S) i h As H, L ∈ CS , we see that Coeff2πiz 2L,H 0 (2τ, Ks z) starts in degree ≥ 1/2 in q. Using this, we get from [G-Z], Cor 4.3, X S,H r r+1 80 (kS pr ) − 8S,L 0 (kS p ) u(2τ ) r≥0
" = Coeff2πiz =
#
2θ(2τ )σ (S) H,L 20,0 (2τ, KS z) f (2τ )2
2iθ(2τ )σ (S)+2 η(τ )2e(S) S,L (E0 − ES,H 0 ). η(2τ )6
t u
Remark 6.7. This result can be reformulated as follows. The expression 6 X η(2τ ) r r+1 + 8S,H ES,H 0 0 (kS p )u(2τ ) 2iθ(2τ )σ (S)+2 η(τ )e(S) r≥0
is independent of H ∈ CS . 7. Examples 7.1. Rational ruled surfaces. Let S be a rational ruled surface. Let F be the class of a fibre of the ruling, and let G be a section with G2 ≤ 0. By Lem. 4.8 we know = 0 if CF = 1. We will compute XS,F that XS,F and ES,F C F F . Furthermore we set F+ F +G (F, d) for > 0 sufficiently small, so that there is no wall of MS (F, d) := MS type (F, d) between F and F + G. 1 1 y 2 − y − 2 η(τ ) S,F , Proposition 7.1. 1. XF = θ1,1 (τ, z)2 θ1,1 (τ, 2z) 1 1 X 1 y 2 − y− 2 1 η(τ )3 F Xy (MS + (F, d))q d− 3 = − , 2. η(τ )2 θ1,1 (τ, z)2 θ1,1 (τ, 2z) y − y −1 d≥0
3. ES,F F =
2G2 (τ ) + 1 2G2 (τ ) X F , e(MS + (F, d))q d− 3 = 8 η(τ ) η(τ )8
1 12
.
d≥0
Proof. Let F1 , F2 be the fibres of the two projections of P1 × P1 to P1 . By a sequence of blowups and blowdowns (S, F ) can be obtained from (P1 × P1 , F1 ), where in each blowup F is replaced by its total transform. By the blowup formula Cor. 4.6 we get = XPF11 ×P1 ,F1 . We can therefore assume that S = P1 × P1 and F = F1 , G = F2 . XS,F F
= 0 and By Cor. 4.8 we get XS,G F 1
XS,F F
1
y2 −y2 = 2G,F (2τ, −2F z − 2Gz). η(τ )2 θ1,1 (τ, z)2 0,F
By (2.15) we have 3 2G,F 0,F (2τ, x) = η(2τ )
=
θ1,1 (·, h(F − G), ·i) (2τ, x) θ1,1 (·, −hG, ·i)θ1,1 (·, hF, ·i) F /2
η(2τ )3 θ0,1 (2τ, h(F − G), xi) . θ0,1 (2τ, −hG, xi)θ1,1 (2τ, hF, xi)
Thus 2G,F 0,F (2τ, −2F z − 2Gz) = By (2.7) we get XS,F F
0 (2τ ) η(2τ )3 θ0,1
θ0,1 (2τ, 2z)θ1,1 (2τ, 2z)
1 1 y 2 − y − 2 η(τ ) = . θ1,1 (τ, z)2 θ1,1 (τ, 2z)
.
+
To get the χy -genus of MSF (F, d), we note that X d≥0
y 2 − y− 2 +G (τ, −2F z − 2Gz) , lim 2G,F 0,F 2 2 η(τ ) θ1,1 (τ, z) →0 1
+
Xy (MSF (F, d)q d− 3 = 1
1
and by formula (3.9.1) from [G-Z], G,F +G (τ, −2F z − 2Gz) = 2G,F 0,F (τ, −2F z − 2Gz) − lim 20,F →0
1 . y − y −1
To finally obtain the formulas for the Euler numbers we use the formula 1 2 η(τ )3 = exp Gk (τ )(2π i)k , (see [Z1]), θ1,1 (τ, z) 2πiz k! and Coeff2πiz
1 1 t =− . u −1 y−y 12
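The last coefficient used in the proof, $\mathrm{Coeff}_{2\pi i z}\,\frac{1}{y - y^{-1}} = -\frac{1}{12}$, can be confirmed by a short exact computation. This is a sketch under the assumption that $y = e^{2\pi i z}$, so that with $x = 2\pi i z$ one has $\frac{1}{y - y^{-1}} = \frac{1}{2\sinh x}$; the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def laurent_coefficients_of_inverse_2sinh(n_terms):
    """Laurent coefficients of 1/(e^x - e^(-x)) around x = 0.

    Writes 2*sinh(x) = x * sum_m a[m] * x^(2m) with a[m] = 2/(2m+1)!,
    inverts that power series, and returns {power of x: coefficient}.
    """
    a = [Fraction(2, factorial(2 * m + 1)) for m in range(n_terms)]
    b = [1 / a[0]]
    for n in range(1, n_terms):
        s = sum(a[k] * b[n - k] for k in range(1, n + 1))
        b.append(-s / a[0])
    # 1/(2*sinh x) = (1/x) * sum_m b[m] * x^(2m)
    return {2 * m - 1: b[m] for m in range(n_terms)}

coeffs = laurent_coefficients_of_inverse_2sinh(3)
print(coeffs[-1], coeffs[1], coeffs[3])
```

The coefficient of $x^1$ is the one called $\mathrm{Coeff}_{2\pi i z}$ in the text.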
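7.2. The rational elliptic surface. Let $m \in \mathbb{Z}_{\ge 0}$. Let S be the blowup of $\mathbb{P}^2$ in $4m+5$ points, and assume that $F := (m+2)H - mE_1 - \sum_{i=2}^{4m+5} E_i$ is nef, e.g. F is the fibre of a fibration of S over $\mathbb{P}^1$, such that the genus of the generic fibre is m.
Theorem 7.2. If m is odd, then
1. $X^{S,F}_H + X^{S,F}_{E_1} = \dfrac{\theta_{1,1}(\tau, mz)}{\theta_{1,1}(\tau,z)\,\widetilde\theta_{1,1}(\tau/2,z)\,\theta_{0,1}(\tau,(m-1)z)\,\eta(\tau)\,\eta(\tau/2)^{4m+3}}$,
2. $E^{S,F}_H + E^{S,F}_{E_1} = \dfrac{m}{\eta(\tau/2)^{4m+8}}$,
3. $\Sigma^{S,F}_H + \Sigma^{S,F}_{E_1} = \dfrac{1}{\eta(\tau)^2\,\eta(\tau/2)^{4m+4}}$.
If m is even, then
1. $X^{S,F}_{H+E_2} + X^{S,F}_{E_1+E_2} = \dfrac{\theta_{1,1}(\tau, mz)}{\theta_{1,1}(\tau,z)\,\widetilde\theta_{1,1}(\tau/2,z)\,\theta_{0,1}(\tau,(m-1)z)\,\eta(\tau)\,\eta(\tau/2)^{4m+3}}$,
2. $E^{S,F}_{H+E_2} + E^{S,F}_{E_1+E_2} = \dfrac{m}{\eta(\tau/2)^{4m+8}}$,
3. $\Sigma^{S,F}_{H+E_2} = \Sigma^{S,F}_{E_1+E_2} = 0$.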
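The numerical conditions on the class F can be checked directly. The following sketch assumes the standard intersection form $\mathrm{diag}(1,-1,\ldots,-1)$ on $H^2$ of a blowup of $\mathbb{P}^2$ in the basis $H, E_1, \ldots, E_n$, the canonical class $K_S = -3H + \sum_i E_i$, and the adjunction formula $2g - 2 = F^2 + K_S F$; the helper names are hypothetical.

```python
def pairing(u, v):
    """Intersection form diag(1, -1, ..., -1) in the basis H, E_1, ..., E_n."""
    return u[0] * v[0] - sum(a * b for a, b in zip(u[1:], v[1:]))

def fibre_class(m):
    """F = (m+2)H - m*E_1 - E_2 - ... - E_{4m+5}."""
    n = 4 * m + 5
    return [m + 2, -m] + [-1] * (n - 1)

def canonical_class(m):
    """K_S = -3H + E_1 + ... + E_{4m+5}."""
    n = 4 * m + 5
    return [-3] + [1] * n

def fibre_invariants(m):
    """Return (F^2, K_S.F, genus) with genus from adjunction."""
    F, K = fibre_class(m), canonical_class(m)
    f_squared = pairing(F, F)
    k_dot_f = pairing(K, F)
    genus = 1 + (f_squared + k_dot_f) // 2
    return f_squared, k_dot_f, genus

for m in range(6):
    print(m, fibre_invariants(m))
```

For every m this gives $F^2 = 0$ and genus m, which is consistent with F being the fibre class of a genus m fibration.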
Proof. We mostly deal with the case m = 2l − 1 odd. The proof in the case m even is analogous. Let 0 = H 2 (S, Z) with the negative of the intersection form. Let G := H − E1 . Then by Lem. 4.8 and Cor. 4.5, y 2 − y− 2 G,E1 G,F (2τ, K z) + 2 (2τ, K z) . 2 S S 0,H 0,H η(τ )16l+2 θ1,1 (τ, z)2 1
S,F XS,F H + XE1 =
1
Let [G, F ] be the lattice generated by G and F , and let [G, F ]⊥ be its orthogonal complement in 0. We write 3 := [G, F ] ⊕ [G, F ]⊥ . By hF, F i = hG, Gi = 0, hF, Gi = −2, hE1 , F i = 1 − 2l, hE1 , Gi = −1, hE2 , F i = −1, hE2 , Gi = 0 we see that 3 has index 4 in 0, and that 0, E1 , E2 , E2 + E1 form a system of representatives of 0 modulo 3. Therefore we get by (2.15): G,F G,F |0 + |E1 + |E2 + |E1 +E2 |H /2 +E1 /2 2G,F 0,H + 20,E1 = 23 = 2G,F |0 + |E1 + |E2 + |E1 +E2 |0 + |G/2 |E1 . 3
Let $D_{8l} := \{(a_1,\ldots,a_{8l}) \in \mathbb{Z}^{8l} \mid \sum_{i=1}^{8l} a_i \text{ even}\}$. Then the map $\varphi : D_{8l} \to [G,F]^{\perp}$ defined by $(a_1,\ldots,a_{8l}) \mapsto \sum_{i=1}^{8l} a_i\,(E_{i+1} - G/2)$ is easily seen to be an isomorphism of lattices. It is well-known (and easy to check) that
$$\Theta_{D_{8l}}(\tau,(x_1,\ldots,x_{8l})) = \frac{1}{2}\left(\prod_{i=1}^{8l}\theta_{0,0}(\tau,x_i) + \prod_{i=1}^{8l}\theta_{0,1}(\tau,x_i)\right).$$
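At $x = 0$ the identity for $\Theta_{D_n}$ is a counting statement, and it can be tested by brute force in small rank. The sketch below assumes the conventions $\theta_{0,0} = \sum_a x^{a^2}$ and $\theta_{0,1} = \sum_a (-1)^a x^{a^2}$ in a formal variable x (the normalisation of the exponent in Sect. 2 of the paper may differ by a factor), and uses n = 4 instead of 8l to keep the enumeration small. All function names are hypothetical.

```python
from itertools import product
from math import isqrt

def count_even_sum_vectors(n, max_norm):
    """counts[k] = number of integer vectors of length n with even
    coordinate sum and squared norm k, for k <= max_norm."""
    bound = isqrt(max_norm)
    counts = [0] * (max_norm + 1)
    for v in product(range(-bound, bound + 1), repeat=n):
        norm = sum(a * a for a in v)
        if norm <= max_norm and sum(v) % 2 == 0:
            counts[norm] += 1
    return counts

def one_variable_series(sign, max_norm):
    """Coefficients of sum_a sign^a x^(a^2), truncated at x^max_norm."""
    series = [0] * (max_norm + 1)
    for a in range(-isqrt(max_norm), isqrt(max_norm) + 1):
        series[a * a] += sign ** abs(a)
    return series

def power(series, n, max_norm):
    """n-th power of a truncated power series."""
    result = [1] + [0] * max_norm
    for _ in range(n):
        new = [0] * (max_norm + 1)
        for i, ci in enumerate(result):
            for j, cj in enumerate(series):
                if ci and cj and i + j <= max_norm:
                    new[i + j] += ci * cj
        result = new
    return result

def half_sum_of_products(n, max_norm):
    """Coefficients of (theta_00^n + theta_01^n) / 2 at x = 0."""
    plus = power(one_variable_series(1, max_norm), n, max_norm)
    minus = power(one_variable_series(-1, max_norm), n, max_norm)
    return [(p + m) // 2 for p, m in zip(plus, minus)]

print(count_even_sum_vectors(4, 6))
print(half_sum_of_products(4, 6))
```

The two lists agree because $a \equiv a^2 \pmod 2$, so a vector has even coordinate sum exactly when its squared norm is even, and averaging the two products keeps precisely the even-norm terms.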
So we get by (2.14),
η(2τ )3 θ1,1 (2τ, hF − G, xi) · θ1,1 (2τ, hF, xi)θ1,1 (2τ, h−G, xi) 8l+1 8l+1 Y 1 Y θ0,0 (τ, hEi − G/2, xi) + θ0,1 (τ, hEi − G/2, xi) . · 2
2G,F 3 (τ, x) =
i=2
i=2
If H (τ, x) is a function H × (0C ) → C satisfying H (τ, x) = θa,b (nτ, hL, xi)H1 (τ, x) for some L ∈ 0Q , then, for W ∈ 0,
H |W (τ, x) = θa+2hW,L/ni,b (nτ, hL, xi)H1 |W (τ, x).
We have hH, F i = −(2l + 1), hH, Gi = −1 and hH, Ei i = 0 for i ≥ 2; hE1 , F i = −(2l − 1), hE1 , Gi = −1 and hE1 , Ei i = 0 for i ≥ 2; hE2 , F i = −1, hE2 , Gi = 0 and hE2 , E2 i = 1, hE2 , Ei i = 0 for i ≥ 3. We also use repeatedly (2.8). Using this we obtain the following:
Put A(τ, x) :=
η(2τ )3 θ1,1 (2τ, hF − G, xi) , θ1,1 (2τ, hF, xi)θ1,1 (2τ, h−G, xi)
B(τ, x) :=
−η(2τ )3 θ1,1 (2τ, hF − G, xi) , θ0,1 (2τ, hF, xi)θ0,1 (2τ, h−G, xi)
C(τ, x) :=
η(2τ )3 θ0,1 (2τ, hF − G, xi) , θ1,1 (2τ, hF, xi)θ0,1 (2τ, h−G, xi)
η(2τ )3 θ0,1 (2τ, hF − G, xi) , θ0,1 (2τ, hF, xi)θ1,1 (2τ, h−G, xi) 8l+1 8l+1 Y 1 Y θ0,0 (τ, hEi − G/2, xi) + θ0,1 (τ, hEi − G/2, xi) , α(τ, x) := 2
D(τ, x) :=
β(τ, x) := γ (τ, x) := δ(τ, x) :=
i=2
i=2
8l+1 1 Y
8l+1 Y
2
θ1,0 (τ, hEi − G/2, xi) +
i=2
i=2
8l+1 1 Y
8l+1 Y
2
θ0,0 (τ, hEi − G/2, xi) −
i=2
i=2
8l+1 1 Y
8l+1 Y
2
θ1,0 (τ, hEi − G/2, xi) −
i=2
θ1,1 (τ, hEi − G/2, xi) , θ0,1 (τ, hEi − G/2, xi) , θ1,1 (τ, hEi − G/2, xi) .
i=2
Then we get G,F 2G,F 3 |0 (τ, x) := A(τ, x)α(τ, x), 23 |E1 (τ, x) := B(τ, x)β(τ, x),
G,F 2G,F 3 |E2 (τ, x) := C(τ, x)γ (τ, x), 23 |E1 +E2 (τ, x) := D(τ, x)δ(τ, x),
G,F 2G,F 3 |G/2 (τ, x) := D(τ, x)α(τ, x), 23 |G/2+E1 (τ, x) := C(τ, x)β(τ, x),
G,F 2G,F 3 |G/2+E2 (τ, x) := B(τ, x)γ (τ, x), 23 |G/2+E1 +E2 (τ, x) := A(τ, x)δ(τ, x).
By hKS , Ei − G/2i = 0 and hE1 , Ei − G/2i = 1/2 for i ≥ 2, we see that 0 0 (τ ))8l + (θ1/2,1 (τ ))8l )/2, α|E1 /2 (τ, KS z) = ((θ1/2,0 0 0 (τ ))8l + (θ3/2,1 (τ ))8l )/2, β|E1 /2 (τ, KS z) = ((θ3/2,0 0 0 (τ ))8l − (θ1/2,1 (τ ))8l )/2, γ |E1 /2 (τ, KS z) = ((θ1/2,0 0 0 (τ ))8l − (θ3/2,1 (τ ))8l )/2. δ|E1 /2 (τ, KS z) = ((θ3/2,0
Now we note that by (2.8) and (2.9), 0 0 (τ ) = θ3/2,0 (τ ) = θ1/2,0
η(τ/2)2 0 0 , θ1/2,1 (τ ) = −θ3/2,1 (τ ). η(τ/4)
Putting things together, we get that G,F 2G,F 0,E1 (τ, KS z) + 20,H (τ, KS z) =
(A + B + C + D)|E1 /2 (τ, KS z) ·
η(τ/2)16l . η(τ/4)8l
The orthogonal projections of 0, E1 and E2 and E1 + E2 to [F, G] are a system of representatives of [F /2, G/2] modulo [F, G]. Therefore (A + B + C + D)(τ, x) = 2G,F [G/2,F /2] (τ, x) =
η(τ/2)3 θ1,1 (τ/2, hF /2 − G/2, xi) . θ1,1 (τ/2, hF /2, xi)θ1,1 (τ/2, h−G/2, xi)
Finally by hE1 , F i = 1 − 2l, hE1 , −Gi = 1, we get 2G,F [G/2,F /2] |E1 /2 (2τ, KS z) = − =
η(τ )3 θ1,1 (τ, hF /2 − G/2, KS iz) θ0,1 (τ, hF /2, KS iz)θ0,1 (τ, h−G/2, KS iz)
η(τ )3 θ1,1 (τ, (2l − 1)z) . θ0,1 (τ, (2 − 2l)z)θ0,1 (τ, z)
Putting this together, we obtain (y 2 − y − 2 )η(τ )3 θ1,1 (τ, (2l − 1)z) η(τ )16l = η(τ )16l+2 θ1,1 (τ, z)2 θ0,1 (τ, (2 − 2l)z)θ0,1 (τ, z) η(τ/2)8l θ1,1 (τ, (2l − 1)z) = . θ1,1 (τ, z)e θ1,1 (τ/2, z)θ0,1 (τ, (2 − 2l)z)η(τ )η(τ/2)8l−1 1
XS,F H
+ XS,F E1
1
S,F In the last line we have used (2.7). To get the formula for ES,F H + EE1 , take the limit θ1,1 (τ,(2l−1)z) |z=0 = 2l − 1 and e θ1,1 (τ/2, 0) = z → 0. It is immediate from (2.5) that θ1,1 (τ,z)
η(τ/2)3 . Therefore S,F ES,F H + EE1 =
2l − 1 2l − 1 = . 0 8l+4 8l+2 η(τ/2) η(τ )η(τ/2) θ0,1 (τ )
S,F + 6ES,F we put z = π i, to obtain To get the formula for 6H 1 S,F + 6ES,F = 6H 1
1 0 (τ/2)θ 0 (τ )η(τ )η(τ/2)8l−1 θ1,0 0,1
=
1 η(τ )2 η(τ/2)8l
.
In the case m = 2l even, we again have that 0, E1 , E2 , E1 + E2 form a system of representatives of 0 modulo 3. So we get G,F G,F |0 + |E1 + |E2 + |E1 +E2 |0 + |G/2 |E1 +E2 . 2G,F 0,H +E2 + 20,E1 +E2 = 23 Essentially the same computations as in the case m odd give the result. u t Let now S be the blowup of P2 in 9 points. Let H be the pullback of the hyperplane class, and let E1 , . . . , E9 be the classes of the exceptional divisors. Let F := 3H − P9 i=1 Ei . Then KS = −F . An interesting case is when S is a rational elliptic surface, and F is the class of a fibre. In [M-N-V-W] the generating functions of the Euler numbers e(MS (C, d)) are predicted in case CF is even. This prediction was proven in [Y4]. As an immediate consequence of Thm. 7.2 we can compute the Hodge numbers of the MSF (C, d) in case F C is odd. For the Betti numbers this result was already obtained (more generally for regular elliptic surfaces) in [Y6] using completely different methods. By [Be] the result about the Hodge numbers for S is a direct consequence.
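Theorem 7.3. Let $C \in H^2(S,\mathbb{Z})$ with $C^2$ odd. Then $M^F_S(C,d)$ has the same Hodge numbers as $S^{[2d-3/2]}$. In particular the Hodge numbers depend only on d, and we have
1. $\displaystyle\sum_{d\ge 0}\big(X_y(M^F_S(H,d)) + X_y(M^F_S(E_1,d))\big)\,q^{d-1} = \frac{1}{\widetilde\theta_{1,1}(\tau/2,z)\,\eta(\tau/2)^{9}}$,
2. $\displaystyle\sum_{d\ge 0}\big(e(M^F_S(H,d)) + e(M^F_S(E_1,d))\big)\,q^{d-1} = \frac{1}{\eta(\tau/2)^{12}}$,
3. $\displaystyle\sum_{d\ge 0}\big(\sigma(M^F_S(H,d)) - \sigma(M^F_S(E_1,d))\big)\,q^{d-1} = \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^{8}}$.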
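Part 2 of Theorem 7.3 can be read off from the Euler numbers of Hilbert schemes of points. The sketch below assumes the product formula $\sum_n e(S^{[n]})\,x^n = \prod_{k\ge 1}(1-x^k)^{-e(S)}$ from [Gö1], the value $e(S) = 12$ for $\mathbb{P}^2$ blown up in 9 points, and $\eta(\tau/2) = q^{1/48}\prod_{k\ge 1}(1-q^{k/2})$; with $x = q^{1/2}$ the coefficient of $q^{d-1}$ in $1/\eta(\tau/2)^{12}$ is then $e(S^{[n]})$ with $n = 2d - 3/2$. The function name is hypothetical.

```python
def hilbert_scheme_euler_numbers(euler_number, n_max):
    """Coefficients of prod_{k>=1} (1 - x^k)^(-euler_number) up to x^n_max."""
    coeffs = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        for _ in range(euler_number):
            # In-place multiplication by 1/(1 - x^k).
            for i in range(k, n_max + 1):
                coeffs[i] += coeffs[i - k]
    return coeffs

print(hilbert_scheme_euler_numbers(12, 3))
```

The entry for n = 2 agrees with the classical value $e(S^{[2]}) = (e(S)^2 + 3e(S))/2 = 90$.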
Remark 7.4. 1. We can recover the Hodge numbers of MSF (C, d): X d≥0
Xy (MSF (H, d))q d−1
1 i + , e θ1,1 (τ/2, z)η(τ/2)9 e θ1,1 ((τ + 1)/2, z)η((τ + 1)/2)9 X Xy (MSF (E1 , d))q d−1 =
d≥0
=
1 i − . 9 e e θ1,1 (τ/2, z)η(τ/2) θ1,1 ((τ + 1)/2, z)η((τ + 1)/2)9
2. We can also use Thm. 6.2 to compute the generating functions for the signatures. From [G-Z], Sect. 5.3 we get 8S,F H (1 + p/2) = 1, and by the simple type condition S,F r r 8H (p ) is 2 if r even and 0 otherwise. After some calculations this gives X d≥0
σ (MSF (H, d))q d−1 = 12e2 (2τ )
η(2τ )8 . η(τ )22
A similar calculation using [G-Z], Sect. 5.3 and Thm. 6.2 gives X d≥0
σ (MSF (E1 , d))q d−1 = −8
η(2τ )16 . η(τ )26
It is an exercise in modular forms to show that 12e2 (2τ )
η(2τ )8 η(2τ )16 1 + 8 = , 22 26 2 η(τ ) η(τ ) η(τ ) η(τ/2)8
and thus to recover part (3) of Thm. 7.3. 3. If X is a K3 surface, L a primitive line bundle and H a generic ample line bundle on X, then it was shown in [G-H] that MXH (L, d) has the same Hodge numbers as X [2d−3] , and in [H] that MXH (L, d) is deformation equivalent to X[2d−3] . There should be a similar proof of Thm. 7.3 as that in [G-H]. Furthermore I expect that, in case C 2 odd, MSF (C, d) is deformation equivalent to S [2d−3/2] . More generally similar results also should hold for arbitrary rank.
4. In physics the polarized rational elliptic surface (S, F ) is often called 21 K3. This is related to the fact that one can degenerate an elliptic K3 surface to the union of 2 rational elliptic surfaces intersecting along a fibre. The generating function of the χy -genera of the MXH (L, d) (L primitive and allowing L2 both congruent 0 modulo 4 and congruent 2 modulo 4) on the K3 surface is just the square of the generating function on S. One could ask whether this result can also be shown by degenerating the moduli spaces MXH (L, d). Proof (of Thm. 7.3). We first show that the MSF (C, d) depend only on d. Let G be the subgroup of Aut (H 2 (S, Z)) generated by the Cremona transforms and the permutations of E1 , . . . , E9 . F is invariant under the operation of G, and therefore, by Cor. P 4.11, MSF (C, d) ' MSF (g(C), d) for all g ∈ G. We can assume that C = nH − i ai Ei with n, a1 , . . . , a9 ∈ {0, 1}. Let m be the number of indices i ≥ 1 with ai = 1. By renumbering E1 . . . E9 we can assume that either C = H or C = E1 , in which case we are done, or (h, a1 , a2 , a3 ) is one of (0, 1, 1, 1) or (1, 0, 1, 1). The Cremona transform replaces (h, a1 , a2 , a3 ) by (1, 0, 0, 0), (0, 1, 0, 0), and the result follows by induction on m. As KS F ≤ 0, the moduli spaces MSF (C, d) are smooth, and by [Be] all their cohomology is of Hodge type (p, p). Therefore the theorem follows from the case l = 1 of Thm. 7.2. u t
References
[Ba] Baranovsky, V.: Moduli of sheaves on surfaces and action of the oscillator algebra. Preprint math.AG/9811092
[Be] Beauville, A.: Sur la cohomologie de certaines espaces de modules de fibrés vectoriels. In: Geometry and Analysis. (Bombay, 1952) Bombay: Tata Inst. Fund. Res., 1995, pp. 37–40
[Bo] Borcherds, R.: Automorphic forms with singularities on Grassmannians. Invent. Math. 132, 491–562 (1998)
[C] Chandrasekharan, K.: Elliptic functions. Grundlehren 281, Berlin Heidelberg: Springer-Verlag, 1985
[dC-M] de Cataldo, M.A., Migliorini, M.: The Douady space of a complex surface. Preprint math.AG/9811199
[Ch] Cheah, J.: On the cohomology of Hilbert schemes of points. J. Alg. Geom. 5, 479–511 (1996)
[D-K] Danilov, V.I., Khovanskii, A.G.: Newton Polyhedra and an algorithm for computing Hodge–Deligne numbers. Math. USSR Izvestiya 29, 279–298 (1987)
[De] Deligne, P.: La conjecture de Weil I. Publ. Math. IHES 43, 273–307 (1974)
[E-G] Ellingsrud, G., Göttsche, L.: Variation of moduli spaces and Donaldson invariants under change of polarisation. J. reine angew. Math. 467, 1–49 (1995)
[E-S] Ellingsrud, G., Strømme, S.A.: On the homology of the Hilbert scheme of points in the plane. Invent. Math. 87, 343–352 (1987)
[F-S] Fintushel, R., Stern, R.: The blowup formula for Donaldson invariants. Ann. Math. 143, 529–546 (1996)
[F] Fogarty, J.: Algebraic families on an algebraic surface. Am. J. Math. 90, 511–521 (1968)
[F-L1] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, I: Continuity of gluing maps. Preprint math.DG/9812060
[F-L2] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, II: Surjectivity of gluing maps. In preparation
[F-L3] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, III: Bubble-tree compactifications and manifolds-with-corners structures. In preparation
[F-L4] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, IV: Intersection theory. In preparation
[F-Q] Friedman, R., Qin, Z.: Flips of moduli spaces and transition formulas for Donaldson polynomial invariants of rational surfaces. Commun. in Analysis and Geometry 3, 11–83 (1995)
[Gö1] Göttsche, L.: The Betti numbers of the Hilbert schemes of points on a smooth projective surface. Math. Ann. 286, 193–207 (1990)
[Gö2] Göttsche, L.: Change of polarization and Hodge numbers of moduli spaces of torsion free sheaves on surfaces. Math. Zeitschr. 223, 247–260 (1996)
[Gö3] Göttsche, L.: Modular forms and Donaldson invariants for 4-manifolds with b+ = 1. J. Am. Math. Soc. 9, 826–843 (1996)
[G-H] Göttsche, L., Huybrechts, D.: Hodge numbers of moduli spaces of stable sheaves on K3 surfaces. Int. J. Math. 7 No. 3, 359–372 (1996)
[G-S] Göttsche, L., Soergel, W.: Perverse sheaves and the cohomology of Hilbert schemes of smooth algebraic surfaces. Math. Ann. 296, 235–245 (1993)
[G-Z] Göttsche, L., Zagier, D.: Jacobi forms and the structure of Donaldson invariants for 4-manifolds with b+ = 1. Sel. Math. New. Ser. 4, 69–115 (1998)
[H-B-J] Hirzebruch, F., Berger, T., Jung, R.: Manifolds and modular forms. Aspects of Math. E20, Braunschweig–Wiesbaden: Vieweg, 1994
[H] Huybrechts, D.: Birational symplectic manifolds and their deformations. J. Diff. Geom. 45, 488–513 (1997)
[H-L] Huybrechts, D., Lehn, M.: The Geometry of Moduli Spaces of Sheaves. Aspects of Math. Vol. E 31, Braunschweig–Wiesbaden: Vieweg, 1997
[K-Z] Kaneko, M., Zagier, D.: A generalized Jacobi theta function and quasimodular forms. In: R. Dijkgraaf, C. Faber, G. van der Geer (eds.) The moduli space of curves. Boston: Birkhäuser, 1995, pp. 165–172
[K-M] Kotschick, D., Morgan, J.: SO(3)-invariants for 4-manifolds with b+ = 1 II. J. Diff. Geom. 39, 433–546 (1994)
[Kr-M] Kronheimer, P., Mrowka, T.: Embedded surfaces and the structure of Donaldson’s Polynomial invariants. J. Diff. Geom. 33, 573–734 (1995)
[L] Leness, T.G.: Donaldson wall-crossing formulas via topology. Forum Math. (1998) to appear, dgga/960316
[L-Q1] Li, W.-P., Qin, Z.: On blowup formulae for the S-duality conjecture of Vafa and Witten. Preprint mathAG/9805054
[L-Q2] Li, W.-P., Qin, Z.: On blowup formulae for the S-duality conjecture of Vafa and Witten II: The universal functions. Preprint mathAG/9805055, to appear in Math Research letters
[M-N-V-W] Minahan, J.A., Nemeschansky, D., Vafa, C., Warner, N.P.: E-Strings and N = 4 Topological Yang-Mills Theories. hep-th/9802168
[M-W] Moore, G., Witten, E.: Integration over the u-plane in Donaldson theory. Preprint hep-th/9709193
[Q1] Qin, Z.: Moduli of stable sheaves on ruled surfaces and their Picard groups. J. reine angew. Math. 433, 201–219 (1992)
[Q2] Qin, Z.: Equivalence classes of polarizations and moduli spaces of sheaves. J. Diff. Geom. 37, 397–413 (1993)
[V-W] Vafa, C., Witten, E.: A Strong Coupling Test of S-Duality. Nucl. Phys. B 431 (1994)
[W] Witten, E.: Monopoles and four-manifolds. Math. Research Letters 1, 769–796 (1994)
[Y1] Yoshioka, K.: The Betti numbers of the moduli space of stable sheaves of rank 2 on P2. J. reine angew. Math. 453, 193–220 (1994)
[Y2] Yoshioka, K.: The Betti numbers of the moduli space of stable sheaves of rank 2 on a ruled surface. Math. Ann. 302, 519–540 (1995)
[Y3] Yoshioka, K.: Chamber structure of polarizations and the moduli space of stable sheaves on a ruled surface. Int. J. Math. 7, 411–431 (1996)
[Y4] Yoshioka, K.: Euler characteristics of SU(2) instanton moduli spaces of rational elliptic surfaces. Preprint math.AG/9805003
[Y5] Yoshioka, K.: Betti numbers of moduli of stable sheaves on some surfaces. Nucl. Phys. B (Proc. Suppl.) 46, 263–268 (1996)
[Y6] Yoshioka, K.: Numbers of Fq-rational points of moduli of stable sheaves on elliptic surfaces. Moduli of vector bundles, Lect. Notes in Pure and Applied Math. 179, Marcel Dekker, 297–305
[Z1] Zagier, D.: Periods of modular forms and Jacobi theta functions. Invent. math. 104, 449–465 (1991)
[Z2] Zagier, D.: Introduction to Modular forms. In: From Number Theory to Physics, eds. W. Waldschmidt, P. Moussa, J.-M. Luck, C. Itzykson, Berlin–Heidelberg: Springer-Verlag, 1992
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 137 – 155 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Angular Momentum and Positive Mass Theorem Xiao Zhang Institute of Mathematics, Chinese Academy of Sciences, Beijing 100080, P. R. China. E-mail:
[email protected] Received: 26 October 1998 / Accepted: 10 March 1999
Abstract: Total angular momentum for asymptotically flat manifolds is defined. Positive mass theorem for initial (spin) data set (M, gij , pij ) with nonsymmetric pij is proved. As an application, we establish positive mass theorems involving total linear momentum and total angular momentum. This gives an answer to a problem of S. T. Yau in his Problem Section [Ya2] and a partial answer to his recent conjecture on the relationship among total energy, total linear momentum, total angular momentum and entropy of black hole. 1. Introduction In general relativity, basic quantities are total energy, total linear momentum, total angular momentum and entropy of black hole (i.e., area of apparent horizon). Total energy and total linear momentum can be defined on asymptotically flat manifolds. Let (M, gij , hij ) be a 3-dimensional Riemannian manifold with metric tensor gij , and a 2-symmetric tensor hij . M is called asymptotically flat of order τ if there is a compact set K ⊂ M such that M − K is the disjoint union of a finite number of subsets M1 , · · · , Mk − called the “ends” of M − each diffeomorphic to R 3 − Br , where Br is the closed ball of radius r with center at the coordinate origin. Under the diffeomorphism the metric of Ml ⊂ M is of the form gij = δij + aij
(1.1)
in the standard coordinates $\{x^i\}$ on $\mathbb{R}^3$, where $a_{ij}$ satisfies
$$a_{ij} = O(r^{-\tau}), \tag{1.2}$$
$$\partial_k a_{ij} = O(r^{-\tau-1}), \tag{1.3}$$
$$\partial_l \partial_k a_{ij} = O(r^{-\tau-2}). \tag{1.4}$$
Furthermore, the 2-symmetric tensor $h_{ij}$ satisfies
$$h_{ij} = O(r^{-\tau-1}), \tag{1.5}$$
$$\partial_k h_{ij} = O(r^{-\tau-2}). \tag{1.6}$$
We will often identify the end $M_l \subset M$ with the corresponding set $M_l \subset \mathbb{R}^3$. For an asymptotically flat manifold M, the total energy of the end $M_l$ is defined as
$$E_l = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S_{r,l}}(\partial_j g_{ij} - \partial_i g_{jj})\,d\Omega^i, \tag{1.7}$$
and the total linear momentum of the end $M_l$ is defined as
$$P_{lk} = \frac{1}{8\pi}\lim_{r\to\infty}\int_{S_{r,l}}(h_{ki} - g_{ki}h_{jj})\,d\Omega^i, \tag{1.8}$$
where $S_{r,l}$ is the sphere of radius r in the end $M_l \subset \mathbb{R}^3$. When the asymptotic order $\tau > \frac{1}{2}$, the total energy is independent of the choice of asymptotic coordinates, and it vanishes when $\tau > 1$ [Ba1]. The Riemannian version of the Positive Mass Conjecture was proved first by R. Schoen and S. T. Yau [SY1,SY2,SY3].
Theorem 1.1 (Schoen, Yau). Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order 1. If M satisfies the following dominant energy condition:
$$\frac{1}{2}\Big(R + \big(\sum_i h_{ii}\big)^2 - \sum_{i,j} h_{ij}^2\Big) \ge \sqrt{\sum_j \Big(\sum_i (\nabla_i h_{ji} - \nabla_j h_{ii})\Big)^2}, \tag{1.9}$$
where R is the scalar curvature of M, then, for each end $M_l$, we have
$$E_l \ge 0. \tag{1.10}$$
If $E_{l_0} = 0$ for some $l_0$, then M can be isometrically embedded into the 4-dimensional Minkowski space $\mathbb{R}^{3,1}$ as a spacelike hypersurface so that $g_{ij}$ is the induced metric from $\mathbb{R}^{3,1}$ and $h_{ij}$ is the second fundamental form. In particular, M is topologically $\mathbb{R}^3$.
Alternatively, there is a Lorentzian version of the Positive Mass Conjecture, which was then proved by E. Witten [Wi] and, soon after, completed by Parker and Taubes [PT].
Theorem 1.2 (Witten). Let N be a 4-dimensional Lorentzian manifold with Lorentzian metric $\tilde g$ of signature $(-1,1,1,1)$, which satisfies the Einstein equations
$$\tilde R_{\alpha\beta} - \frac{\tilde R}{2}\,\tilde g_{\alpha\beta} = T_{\alpha\beta}, \tag{1.11}$$
where $\tilde R_{\alpha\beta}$, $\tilde R$ are the Ricci curvature and the scalar curvature of $\tilde g$ respectively. Let $M \subset N$ be a spacelike hypersurface with induced Riemannian metric $g_{ij}$ and second fundamental form $h_{ij}$, such that $(M, g_{ij}, h_{ij})$ is asymptotically flat of order 1. If M satisfies the following dominant energy condition
$$T_{00} \ge \sqrt{\sum_i T_{0i}^2}, \tag{1.12}$$
where we choose an orthonormal frame $\{e_\alpha\}$ on N with $e_0$ timelike, then, for each end $M_l$, we have
$$E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{1.13}$$
If $E_{l_0} = 0$ for some $l_0$, and $T_{00} \ge |T_{\alpha\beta}|$, then M has only one end and N is flat over M.
In Schoen and Yau's Positive Mass Theorem, $E_l \ge 0$ can also be improved to $E_l \ge \sqrt{\sum_k P_{lk}^2}$ by their argument [Ya1]. By the Gauss and Codazzi equations, (1.12) is equivalent to (1.9). Hence the two versions of the Positive Mass Theorem are equivalent. On the other hand, the Penrose inequality, which was proved by Huisken and Ilmanen [HI1,HI2] (see also [Ba2,Br,Gi,He1] for partial or related results), gives the relation between total energy and the area of the apparent horizon.
In [Ya2], S. T. Yau asked what a good definition of total angular momentum for asymptotically flat spaces is and what its relationship with total mass would be (Problem 120). In Spring 1997, S. T. Yau also conjectured that the most general case of the “Positive Mass Theorem/Penrose inequality” should be as follows: Under a certain kind of “dominant energy condition”, one should have
$$E_l \ge |P_l| + |J_l| + \sqrt{\frac{A}{16\pi}},$$
where $P_l$, $J_l$ are the total linear momentum and the total angular momentum of the end $M_l$ respectively, and A is the area of the apparent horizon. The above conjecture of Yau reduces to the Penrose inequality when $h_{ij} = 0$. (See also a related conjecture of Huisken and Ilmanen that the ADM rest mass is $\ge \sqrt{A/16\pi}$ if (1.9) holds [HI2].) It should also have an analogous version on higher dimensional asymptotically flat manifolds, simply replacing the part $\sqrt{A/16\pi}$ by some universal constant times the $\frac{n-2}{n-1}$ power of the entropy of the black hole, where n is the dimension of the manifold; see [Va] for some evidence.
We refer to [AH,AM,AS,St] for many important works on angular momentum on more restrictive asymptotically flat 3-dimensional manifolds in spacetime which can be conformally compactified in the sense of Penrose, and for which the limit of the magnetic part of the asymptotic Weyl curvature tensor vanishes at spacelike infinity of Minkowski space. On these manifolds, total angular momentum can be defined in terms of the Weyl curvature tensor and the conformal factor. Unfortunately, the concept of angular momentum remains somewhat problematical [Pe].
In this paper, we shall give total angular momentum a “good” definition and study its relations with total energy and total linear momentum. From now on, we always denote by $(M, g_{ij}, h_{ij})$ ($h_{ij} = h_{ji}$) a 3-dimensional asymptotically flat manifold of order $\tau$ in the sense that
$$g_{ij} - \delta_{ij} \in C^{2,\alpha}_{-\tau}(M), \tag{1.14}$$
$$h_{ij} \in C^{0,\alpha}_{-\tau-1}(M), \tag{1.15}$$
$$\sum_i h_{ii} \in C^{1,\alpha}_{-\tau-1}(M). \tag{1.16}$$
(Here, and henceforth, definitions of weighted Sobolev and Hölder spaces follow from Bartnik [Ba1].) Let $\rho_z$ be the distance function of $M$ with respect to some fixed point $z \in M$. $\rho_z$ is Lipschitz. Let $\epsilon_{ijk}$ be the components of the volume element of $M$ relative to an arbitrary frame.

Definition 1.1. For any 3-dimensional manifold $(M, g_{ij}, h_{ij})$ ($h_{ij} = h_{ji}$), the local angular momentum density $\tilde h^z_{ij}$ with respect to point $z \in M$ is defined as
\[ \tilde h^z_{ij} = \frac{1}{2}\, \epsilon_i{}^{uv} (\nabla_u \rho_z^2)(h_{vj} - g_{vj}\, \mathrm{tr}_g(h)). \tag{1.17} \]

The 2-tensor $\tilde h^z_{ij}$ is bounded on any compact set $\tilde K \subset M$, but it might have good smoothness, depending on the smoothness of $h_{ij}$, near infinity on the ends. Moreover, since $\epsilon^{juv}$ is anti-symmetric with respect to $j$ and $v$, we have
\[ \mathrm{tr}_g(\tilde h^z) = \frac{1}{2}\, \epsilon^{juv} (\nabla_u \rho_z^2)(h_{vj} - g_{vj}\, \mathrm{tr}_g(h)) = 0. \tag{1.18} \]
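As a quick sanity check of (1.17) and (1.18), the following minimal sketch evaluates the local angular momentum density at a single point. It assumes a flat metric, so that $\nabla_u \rho_z^2 = 2(x^u - z^u)$; the helper names `levi_civita` and `local_density` and the sample tensor are hypothetical and chosen only for illustration.

```python
def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2); 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2


def local_density(h, grad_rho2):
    """Pointwise value of (1.17) in a flat metric (an assumption of this sketch):
    h~_ij = 1/2 * eps_i^{uv} * (d_u rho^2) * (h_vj - delta_vj * tr h)."""
    tr = sum(h[a][a] for a in range(3))
    t = [[h[v][j] - (tr if v == j else 0.0) for j in range(3)] for v in range(3)]
    return [[0.5 * sum(levi_civita(i, u, v) * grad_rho2[u] * t[v][j]
                       for u in range(3) for v in range(3))
             for j in range(3)] for i in range(3)]


# Illustrative symmetric tensor h_ij and the gradient 2(x - z) for x - z = (1, 0, 0).
h_sample = [[1.0, 2.0, 0.0], [2.0, 0.0, 1.0], [0.0, 1.0, 3.0]]
grad_sample = [2.0, 0.0, 0.0]
h_tilde = local_density(h_sample, grad_sample)
trace = sum(h_tilde[i][i] for i in range(3))
```

For a symmetric input the computed trace vanishes, as (1.18) states, while the off-diagonal entries of `h_tilde` need not agree under transposition.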
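Note that $\tilde h^z_{ij}$ is not symmetric in general. (One can also define local angular momentum density with respect to any global function $f$ on $M$, replacing $\rho_z^2$ by $f$ in (1.17).)

Definition 1.2. For a 3-dimensional asymptotically flat manifold $(M, g_{ij}, h_{ij})$ in the above sense, the total angular momentum of end $M_l$ with respect to point $z \in M$ is defined as
\[ J^{\mathrm{in}}_{lk}(z) = \frac{1}{8\pi} \lim_{r\to\infty} \int_{S_{r,l}} \tilde h^z_{ki}\, d\Omega^i, \tag{1.19} \]
and the total angular momentum of end $M_l$ with respect to point $x_0 \in \mathbb{R}^3$ with coordinates $\{x_0^u\}$ is defined as
\[ J^{\mathrm{ex}}_{lk}(x_0) = \frac{1}{8\pi} \lim_{r\to\infty} \int_{S_{r,l}} \epsilon_{ku}{}^{v} (x^u - x_0^u)(h_{vi} - g_{vi}\, \mathrm{tr}_g(h))\, d\Omega^i. \tag{1.20} \]
(Note that $J^{\mathrm{ex}}_{lk}(0)$ is defined as total angular momentum in [CK].)

In classical theory, total angular momentum is defined as
\[ J_k = \int_{\mathbb{R}^3} \epsilon_{kuv}\, x^u\, T^{v0} * 1 \]
with origin of coordinates at the system’s center of mass, where $T_{v0}$ is the momentum density of the system [MTW]. If there is a symmetric 2-tensor $h_{ij}$ such that
\[ T_{v0} = \sum_i \partial_i h_{vi} - \partial_v\, \mathrm{tr}(h), \]
then
\[
\begin{aligned}
J_k &= \int_{\mathbb{R}^3} \epsilon_{kiv} (h_{vi} - \delta_{vi}\, \mathrm{tr}(h)) * 1 + \lim_{r\to\infty} \int_{S_r} \epsilon_{kuv}\, x^u (h_{vi} - \delta_{vi}\, \mathrm{tr}(h)) * dx^i \\
&= \lim_{r\to\infty} \int_{S_r} \epsilon_{kuv}\, x^u (h_{vi} - \delta_{vi}\, \mathrm{tr}(h)) * dx^i .
\end{aligned}
\]
Here the volume integral vanishes because $\epsilon_{kiv}$ is antisymmetric in $i$ and $v$, while $h_{vi} - \delta_{vi}\, \mathrm{tr}(h)$ is symmetric.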
Therefore the definitions (1.19), (1.20) of total angular momentum coincide with the one in the classical case up to a constant. Inspired by it, we can also define total angular momentum with respect to $z \in M$ as
\[ J^{\mathrm{in}}_{lij}(z) = C_n \lim_{r\to\infty} \int_{S_{r,l}} \frac{1}{2} \Big[ (\nabla_i \rho_z^2)(h_{jk} - g_{jk}\, \mathrm{tr}_g(h)) - (\nabla_j \rho_z^2)(h_{ik} - g_{ik}\, \mathrm{tr}_g(h)) \Big]\, d\Omega^k, \tag{1.21} \]
and with respect to $x_0 \in \mathbb{R}^n$ with coordinates $\{x_0^u\}$ as
\[ J^{\mathrm{ex}}_{lij}(x_0) = C_n \lim_{r\to\infty} \int_{S_{r,l}} \Big[ (x^i - x_0^i)(h_{jk} - g_{jk}\, \mathrm{tr}_g(h)) - (x^j - x_0^j)(h_{ik} - g_{ik}\, \mathrm{tr}_g(h)) \Big]\, d\Omega^k \tag{1.22} \]
for higher dimensional asymptotically flat manifolds, where $C_n$ is some universal constant and $S_{r,l}$ is the sphere of radius $r$ in end $M_l \subset \mathbb{R}^n$.

Note that the total angular momentum $J^{\mathrm{ex}}_{lk}(x_0)$ of each end depends on the choice of point in $\mathbb{R}^3$. Hence the one with respect to the center of mass of each end, if it exists, will play a special role in general relativity. For a class of 3-manifolds $M$ with an asymptotically flat end and satisfying much more special asymptotic conditions than the ones of (1.2), (1.3), (1.4), Huisken and Yau proved that the center of mass does exist if the mass is positive [HY], i.e., there is a unique round sphere foliation of constant mean curvature for the asymptotically flat end such that their center of gravity converges to a vector $a \in \mathbb{R}^3$ which depends only on the geometry of $M$. Therefore it defines a geometric center of mass. Although the same statement in their paper is not claimed, it is believed the center of mass will still exist for those ends which satisfy asymptotic conditions (1.2), (1.3), (1.4) and have positive total energy [Hu].

We first prove a Positive Mass Theorem for 3-dimensional almost asymptotically flat manifolds $(M, g_{ij}, p_{ij})$ with nonsymmetric $p_{ij}$. We also generalize it to higher dimensional almost asymptotically flat spin manifolds.

Definition 1.3. For any $n$-dimensional manifold $(M, g_{ij}, p_{ij})$ ($n \ge 3$) with metric tensor $g_{ij}$ and an arbitrary 2-tensor $p_{ij}$, the local mass density is defined as
\[ \mu = \frac{1}{2} \Big( R + \big( \sum_i p_{ii} \big)^2 - \sum_{i,j} p_{ij}^2 \Big), \tag{1.23} \]
where $R$ is the scalar curvature of $M$, and the local momentum densities are defined as
\[ \omega_j = \sum_i (\nabla_i p_{ji} - \nabla_j p_{ii}), \tag{1.24} \]
\[ \chi_j = 2 \sum_i \nabla_i (p_{ij} - p_{ji}). \tag{1.25} \]
The dominant energy condition is satisfied on $(M, g_{ij}, p_{ij})$ if
\[ \mu \ge \max\Big\{ \sqrt{\sum_j \omega_j^2},\ \sqrt{\sum_j (\omega_j + \chi_j)^2} \Big\}. \tag{1.26} \]
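The pointwise content of Definition 1.3 can be sketched in a few lines. This is only an illustration under stated assumptions: the scalar curvature $R$ and the components of $\omega_j$ and $\chi_j$ at a point are taken as given numbers (no covariant derivatives are computed), and the function names `mass_density` and `dominant_energy_holds` are hypothetical.

```python
import math


def mass_density(R, p):
    """Local mass density (1.23): mu = 1/2 * (R + (sum_i p_ii)^2 - sum_ij p_ij^2)."""
    n = len(p)
    trace = sum(p[i][i] for i in range(n))
    squares = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    return 0.5 * (R + trace ** 2 - squares)


def dominant_energy_holds(mu, omega, chi):
    """Pointwise check of (1.26): mu >= max(|omega|, |omega + chi|)."""
    norm_omega = math.sqrt(sum(w * w for w in omega))
    norm_shifted = math.sqrt(sum((w + c) ** 2 for w, c in zip(omega, chi)))
    return mu >= max(norm_omega, norm_shifted)
```

When $p_{ij}$ is symmetric, $\chi_j = 0$ by (1.25) and the two norms in (1.26) coincide, so the check reduces to $\mu \ge |\omega|$.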
$(M, g_{ij}, p_{ij})$ is called almost asymptotically flat of order $\tau$ if on each end $M_l \subset M$ the metric is of the form (1.1) which is uniformly equivalent to the flat metric on $\mathbb{R}^n - B_r$ and there exists $q > n$ such that
\[ a_{ij} \in W^{2,q}_{-\tau}(M), \tag{1.27} \]
\[ R \in L^1(M) \cap L^{\frac{q}{2},\,-\tau-2}(M). \tag{1.28} \]
Furthermore, the 2-tensor $p_{ij}$ together with its associated 2-form
\[ \theta = \sum_{i,j} (p_{ij} - p_{ji})\, e^i \wedge e^j \tag{1.29} \]
satisfy that there exists a compact set $\tilde K \supset K$ and $C > 0$ such that
\[ p_{ij} \in C^{0,\alpha}_{-\tau-1}(M - \tilde K), \tag{1.30} \]
\[ |p_{ij} - p_{ji}| < C \ \text{on } \tilde K, \tag{1.31} \]
\[ \Big| \sum_i p_{ii} \Big| < C \ \text{on } \tilde K, \tag{1.32} \]
\[ \sum_i p_{ii} \in W^{1,\frac{q}{2}}_{-\tau-1}(M), \tag{1.33} \]
\[ d\theta,\ d{*}\theta \in L^{\frac{q}{2},\,-\tau-2}(M). \tag{1.34} \]
For a 3-dimensional almost asymptotically flat manifold $(M, g_{ij}, p_{ij})$, the total energy of end $M_l$ is again defined by (1.7), and the total linear momentum of end $M_l$ is defined as in (1.8) with $h_{ij}$ replaced by $p_{ij}$. We refer to [Ba1,PT,Sc,Zh1] for definitions of total energy and total linear momentum on higher dimensional manifolds.

Theorem 1.3. Let $(M, g_{ij}, p_{ij})$ be a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the dominant energy condition (1.26), then, for each end $M_l$, we have
\[ E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{1.35} \]
If equality holds in (1.35) for some end $M_{l_0}$, then $M$ has only one end. Furthermore, if $E_{l_0} = 0$ and $g_{ij}$ is $C^2$, $p_{ij}$ is $C^1$, then the following equations hold on $M$:
\[ R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk} = 0, \tag{1.36} \]
\[ \nabla_i p_{jk} - \nabla_j p_{ik} = 0, \tag{1.37} \]
\[ \sum_i \nabla_i (p_{ij} - p_{ji}) = 0. \tag{1.38} \]

Theorem 1.4. Let $(M, g_{ij}, p_{ij})$ be an $n$-dimensional almost asymptotically flat spin manifold of order $n - 2 \ge \tau > \frac{n-2}{2}$ ($n > 3$), where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the following dominant energy condition:
\[ \mu \ge \max\Big\{ \sqrt{\sum_j \omega_j^2},\ \sqrt{\sum_j (\omega_j + \chi_j)^2} \Big\} + \sqrt{\sum_{1 \le j \le n-3} \kappa_j^2}, \tag{1.39} \]
denoting $\tilde p_{ab} = p_{ab} - p_{ba}$, where
\[ \kappa_j^2 = \sum_{i,k,l;\ k > l > i > j} (\tilde p_{ji} \tilde p_{kl} + \tilde p_{jk} \tilde p_{li} + \tilde p_{jl} \tilde p_{ik})^2, \tag{1.40} \]
then, for each end $M_l$, we have
\[ E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{1.41} \]
If equality holds in (1.41) for some end $M_{l_0}$, then $M$ has only one end. Furthermore, if $E_{l_0} = 0$ and $g_{ij}$ is $C^2$, $p_{ij}$ is $C^1$, then
\[
\begin{aligned}
&\sum_k (R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk})\, e^k e^l - \sqrt{-1} \sum_k (\nabla_i p_{jk} - \nabla_j p_{ik})\, e^k \\
&\quad = -\sqrt{-1} \Big( \sum_{a,b;\ i \ne a \ne b \ne j \ne i} \nabla_i p_{ab}\, e^j e^a e^b - \sum_{a,b;\ i \ne a \ne b \ne j \ne i} \nabla_j p_{ab}\, e^i e^a e^b \Big) \\
&\qquad - \Big( \sum_{l,a,b;\ i \ne l \ne a \ne b \ne j \ne i} p_{ab} p_{il}\, e^j e^l e^a e^b - \sum_{l,a,b;\ i \ne l \ne a \ne b \ne j \ne i} p_{ab} p_{jl}\, e^i e^l e^a e^b \Big)
\end{aligned} \tag{1.42}
\]
as an endomorphism of $S$, where $R_{ijkl}$ is the curvature tensor of $M$.

Our argument is inspired by Witten [Wi,PT]. We define a Dirac type operator $\tilde D$ on a (spin) initial data set $(M, g_{ij}, p_{ij})$ with nonsymmetric $p_{ij}$, which is analogous to the one used by Witten on Lorentzian manifolds. The crucial point is that this Dirac–Witten operator $\tilde D$ is not self-adjoint, but there are nice Weitzenböck formulas for $\tilde D^* \tilde D$ and $\tilde D \tilde D^*$. Then our Positive Mass Theorem is a direct consequence of these formulas. We also observe that the existence result for the Dirac–Witten equation (and subsequently the Positive Mass Theorem) actually holds for a much more general class of asymptotic behavior of the 2-tensor $p_{ij}$, namely on almost asymptotically flat manifolds, which can be applied to the local angular momentum density.

Now we can easily see that total energy is greater than or equal to the norm of total linear momentum for any asymptotically flat spin manifold $(M^n, g_{ij}, h_{ij})$ ($n \ge 3$, $h_{ij} = h_{ji}$) which satisfies the dominant energy condition, and equality for some end implies the manifold has only one end. Furthermore, when total energy is zero for some end and $n \le 7$, by solving Jang’s equation as in [SY3], one can prove that $M$ is topologically $\mathbb{R}^n$ and can be embedded into $\mathbb{R}^{n,1}$ as a spacelike hypersurface so that $g_{ij}$ is the induced metric from $\mathbb{R}^{n,1}$ and $h_{ij}$ is the second fundamental form. This avoids the difficulty of proving that there is a positive definite metric on the space of $\mathrm{Spin}(n, 1)$ spinors for $n \ge 5$ when one tries to generalize the Lorentzian version of the Positive Mass Theorem to higher dimensional spin manifolds [Zh1,Zh2,Zh3].

As an application of Theorem 1.3, we prove the following two Positive Mass Theorems involving total angular momentum. This gives an answer to Problem 120 in [Ya2] and a partial answer to Yau’s conjecture.

Theorem 1.5. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant
energy condition (1.26) holds for $p_{ij} = \tilde h^z_{ij}$, then, for each end $M_l$, and each point $x_0 \in \Omega_l^z$ (if $\Omega_l^z$ is nonempty), we have
\[ E_l \ge \sqrt{\sum_k (J^{\mathrm{in}}_{lk}(z))^2} = \sqrt{\sum_k (J^{\mathrm{ex}}_{lk}(x_0))^2}. \tag{1.43} \]
If equality holds in (1.43), then $M$ has only one end. Furthermore, if $E_l = 0$ and $\tilde h^z_{ij}$ is $C^1$, then (1.36), (1.37) and (1.38) hold true for $p_{ij} = \tilde h^z_{ij}$.

Theorem 1.6. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant energy condition (1.26) holds for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega_l^z$ (if $\Omega_l^z$ is nonempty), we have
\[ E_l \ge \sqrt{\sum_k (P_{lk} \pm J^{\mathrm{in}}_{lk}(z))^2} = \sqrt{\sum_k (P_{lk} \pm J^{\mathrm{ex}}_{lk}(x_0))^2}. \tag{1.44} \]
If equality holds in (1.44), then $M$ has only one end. Furthermore, if $E_l = 0$ and $h_{ij} \pm \tilde h^z_{ij}$ is $C^1$, then (1.36), (1.37) and (1.38) hold true for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$.

We refer to Sect. 4 for related definitions in the above two theorems. We shall address in another paper the Positive Mass Theorem involving total angular momentum for higher dimensional manifolds.

2. Dirac–Witten Operator

Let $(M, g_{ij}, p_{ij})$ be an $n$-dimensional spin Riemannian manifold with metric tensor $g_{ij}$ and 2-tensor $p_{ij}$. Fix a point $p \in M$ and an orthonormal basis $\{e_i\}$ of $T_p M$ such that $(\nabla_i e_j)_p = 0$, where $\nabla$ is the metric connection of $M$. Let $\{e^i\}$ be the dual frame. Let $S$ be the spinor bundle of $M$ with Hermitian metric $\langle\ ,\ \rangle$. The metric connection $\nabla$ of $M$ induces a metric connection (also denoted by $\nabla$) on $S$. Define the modified connections $\tilde\nabla$ and $\bar\nabla$ on $S$ by
\[ \tilde\nabla_i = \nabla_i + \frac{\sqrt{-1}}{2} \sum_j p_{ij}\, e^j, \tag{2.1} \]
\[ \bar\nabla_i = \nabla_i + \frac{\sqrt{-1}}{2} \sum_j p_{ij}\, e^j - \frac{\sqrt{-1}}{2} \sum_{j,k;\ i \ne j \ne k \ne i} p_{jk}\, e^i e^j e^k. \tag{2.2} \]
In a local orthonormal coframe $\{e^i\}$ of $M$, the Dirac operator $D$ and the Dirac–Witten operator $\tilde D$ are defined by
\[ D = \sum_i e^i \nabla_i, \tag{2.3} \]
\[ \tilde D = \sum_i e^i \tilde\nabla_i \tag{2.4} \]
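respectively. We have the following Lichnerowicz formula:
\[ D^2 = \nabla^* \nabla + \frac{1}{4} R, \tag{2.5} \]
where $R$ is the scalar curvature of $M$. In terms of (2.1), we have
\[ \tilde D = D + \frac{\sqrt{-1}}{2} \sum_{i,j} p_{ij}\, e^i e^j. \tag{2.6} \]
Moreover,
\[
\begin{aligned}
d(\langle \phi, \psi \rangle * e^i) &= (\langle \nabla_i \phi, \psi \rangle + \langle \phi, \nabla_i \psi \rangle) * 1 && (2.7) \\
&= \Big( \langle \tilde\nabla_i \phi, \psi \rangle + \big\langle \phi, \big( \tilde\nabla_i - \sqrt{-1} \sum_j p_{ij}\, e^j \big) \psi \big\rangle \Big) * 1 && (2.8) \\
&= \Big( \langle \bar\nabla_i \phi, \psi \rangle + \big\langle \phi, \big( \bar\nabla_i - \sqrt{-1} \sum_j p_{ij}\, e^j \big) \psi \big\rangle \Big) * 1, && (2.9) \\
d(\langle e^i \phi, \psi \rangle * e^i) &= (\langle D\phi, \psi \rangle - \langle \phi, D\psi \rangle) * 1 && (2.10) \\
&= \Big( \langle \tilde D\phi, \psi \rangle - \big\langle \phi, \big( \tilde D + \sqrt{-1} \sum_i p_{ii} \big) \psi \big\rangle \Big) * 1. && (2.11)
\end{aligned}
\]
Hence,
\[
\begin{aligned}
\tilde\nabla_i^* &= -\tilde\nabla_i + \sqrt{-1} \sum_j p_{ij}\, e^j && (2.12) \\
&= -\nabla_i + \frac{\sqrt{-1}}{2} \sum_j p_{ij}\, e^j, && (2.13) \\
\bar\nabla_i^* &= -\bar\nabla_i + \sqrt{-1} \sum_j p_{ij}\, e^j && (2.14) \\
&= -\nabla_i + \frac{\sqrt{-1}}{2} \sum_j p_{ij}\, e^j + \frac{\sqrt{-1}}{2} \sum_{j,k;\ i \ne j \ne k \ne i} p_{jk}\, e^i e^j e^k, && (2.15) \\
\tilde D^* &= \tilde D + \sqrt{-1} \sum_i p_{ii} && (2.16) \\
&= D + \sqrt{-1} \sum_i p_{ii} + \frac{\sqrt{-1}}{2} \sum_{i,j} p_{ij}\, e^i e^j. && (2.17)
\end{aligned}
\]
Now we prove the following three Weitzenböck formulas.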
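Theorem 2.1.
\[
\begin{aligned}
\tilde D^* \tilde D &= \nabla^* \nabla + \sqrt{-1} \sum_{i \ne j \ne k \ne i} p_{jk}\, e^i e^j e^k \nabla_i + \frac{\sqrt{-1}}{2} \sum_{i \ne j \ne k \ne i} e^i e^j e^k\, \nabla_i p_{jk} \\
&\quad + \frac{1}{4} \Big( R + \big( \sum_i p_{ii} \big)^2 - \sum_{i \ne j,\, k \ne l} p_{ij} p_{kl}\, e^i e^j e^k e^l \Big) \\
&\quad - \frac{\sqrt{-1}}{2} \sum_{i,j} \nabla_i (p_{ij} - p_{ji})\, e^j - \frac{\sqrt{-1}}{2} \sum_{i,j} \nabla_j p_{ii}\, e^j && (2.18) \\
&= \bar\nabla^* \bar\nabla + \frac{1}{2} \Big( \mu + \sqrt{-1} \sum_j \omega_j\, e^j \Big) + \frac{1}{2} F, && (2.19) \\
\tilde D \tilde D^* &= \bar\nabla \bar\nabla^* + \frac{1}{2} \Big( \mu - \sqrt{-1} \sum_j (\omega_j + \chi_j)\, e^j \Big) - \frac{1}{2} F, && (2.20)
\end{aligned}
\]
where
\[ F = \begin{cases} 0 & (n = 3), \\ \sum_{i \ne j \ne k \ne l \ne i} p_{ij} p_{kl}\, e^i e^j e^k e^l & (n > 3). \end{cases} \tag{2.21} \]

Proof. By (2.17), we have
\[
\begin{aligned}
\tilde D^* \tilde D &= D^2 + \frac{\sqrt{-1}}{2} \sum_{i,j,k} \nabla_k p_{ij}\, e^k e^i e^j \\
&\quad - \sqrt{-1} \sum_{i,j} p_{ij}\, e^j \nabla_i + \sqrt{-1} \sum_{i,j} p_{ij}\, e^i \nabla_j + \sqrt{-1} \sum_{i \ne j} p_{ij}\, e^i e^j D \\
&\quad - \frac{1}{2} \sum_i p_{ii} \sum_{i,j} p_{ij}\, e^i e^j - \frac{1}{4} \sum_{i,j,k,l} p_{ij} p_{kl}\, e^i e^j e^k e^l \\
&= D^2 + \frac{\sqrt{-1}}{2} \sum_{i,j,k} \nabla_k p_{ij}\, e^k e^i e^j + \sqrt{-1} \sum_{i \ne j \ne k \ne i} e^i e^j e^k\, p_{jk} \nabla_i \\
&\quad + \frac{1}{4} \big( \sum_i p_{ii} \big)^2 - \frac{1}{4} \sum_{i \ne j,\, k \ne l} p_{ij} p_{kl}\, e^i e^j e^k e^l .
\end{aligned}
\]
Hence (2.18) follows from the Lichnerowicz formula (2.5). By (2.15), we have
\[
\begin{aligned}
\bar\nabla^* \bar\nabla &= \nabla^* \nabla + \frac{\sqrt{-1}}{2} \sum_{i \ne j \ne k \ne i} e^i e^j e^k\, \nabla_i p_{jk} + \sqrt{-1} \sum_{i \ne j \ne k \ne i} e^i e^j e^k\, p_{jk} \nabla_i \\
&\quad - \frac{\sqrt{-1}}{2} \sum_{i,j} \nabla_i p_{ij}\, e^j + \frac{1}{4} \sum_{i,j} p_{ij}^2 - \frac{1}{4} \sum_{i \ne j,\, k \ne l} p_{ij} p_{kl}\, e^i e^j e^k e^l - \frac{1}{2} F.
\end{aligned} \tag{2.22}
\]
Hence (2.19) follows. On the other hand, (2.2), (2.15), (2.6) and (2.17) give
\[ \bar\nabla \bar\nabla^* - \bar\nabla^* \bar\nabla = \sqrt{-1} \sum_{i,j} \nabla_i p_{ij}\, e^j + F, \tag{2.23} \]
\[ \tilde D \tilde D^* - \tilde D^* \tilde D = \sqrt{-1} \sum_{i,j} \nabla_j p_{ii}\, e^j. \tag{2.24} \]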
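Hence (2.20) follows. □

Now we can derive the following integral form of the Weitzenböck formula (2.19):

Theorem 2.2.
\[ \int_{\partial M} \langle \phi, \bar\nabla_i \phi + e^i \tilde D \phi \rangle * e^i = \int_M |\bar\nabla \phi|^2 + \frac{1}{2} \Big\langle \phi, \Big( \mu + \sqrt{-1} \sum_j \omega_j\, e^j \Big) \phi \Big\rangle + \int_M \frac{1}{2} \langle \phi, F \phi \rangle - |\tilde D \phi|^2. \tag{2.25} \]

Proof. It follows from (2.9), (2.11) and (2.19). □

3. Positive Mass Theorem

If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$ with asymptotic coordinates $\{x^i\}$, then, on each end, we have
\[ \nabla_j = \partial_j - \frac{1}{4} \Gamma_{kjl}\, dx^k dx^l + O(r^{-2\tau-1}), \tag{3.1} \]
\[ \tilde D = dx^j \partial_j - \frac{1}{4} \Gamma_{kjl}\, dx^j dx^k dx^l + \frac{\sqrt{-1}}{2} \sum_{i,j} p_{ij}\, dx^i dx^j + O(r^{-2\tau-1}), \tag{3.2} \]
where
\[ \Gamma_{kjl} = \frac{1}{2} (\partial_j g_{kl} + \partial_l g_{kj} - \partial_k g_{jl}). \tag{3.3} \]
Let $\delta = \tau$ for $1 > \tau > \frac{1}{2}$ and $\delta = 1 - \varepsilon$ for $\tau = 1$, where $\varepsilon > 0$ is chosen such that $1 > \delta > \frac{1}{2}$. Thus $\tilde D$, $\tilde D^*$ give the maps for the following weighted Sobolev spaces defined by the connection $\nabla$ on $S$,
\[ W^{2,q}_{-\delta}(S) \xrightarrow{\ \tilde D\ } W^{1,q}_{-\delta-1}(S) \xrightarrow{\ \tilde D^*\ } W^{0,q}_{-\delta-2}(S). \]
For a constant spinor $\phi_0$, $\partial_j \phi_0 = 0$, we have $\tilde D \phi_0 \in W^{1,q}_{-\delta-1}(S)$, and $\tilde D^* \tilde D \phi_0 \in W^{0,q}_{-\delta-2}(S)$.

Recall the Pauli representation of the coframe $\{e^i\}$ on a spinor bundle
\[ e^1 \mapsto \begin{pmatrix} \sqrt{-1} & 0 \\ 0 & -\sqrt{-1} \end{pmatrix}, \quad e^2 \mapsto \begin{pmatrix} 0 & \sqrt{-1} \\ \sqrt{-1} & 0 \end{pmatrix}, \quad e^3 \mapsto \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{3.4} \]
Obviously, $e^1 e^2 e^3 = \mathrm{Id}$. Now we recall a lemma, which can be easily proved in the spirit of [PT], see also [Zh1].

Lemma 3.1. Suppose $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $\tau > 0$, and $\phi$, $\{\phi_a\}$ are $C^1$ spinors which satisfy either $\bar\nabla \phi = 0$, $\bar\nabla \phi_a = 0$ or $\bar\nabla^* \phi = 0$, $\bar\nabla^* \phi_a = 0$.

(i) If $\lim_{x \to \infty} \phi(x) = 0$, where the limit is taken along $M$ in one asymptotic end, then $\phi = 0$.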
(ii) If $\{\phi_a\}$ are linearly independent in some end, then they are linearly independent everywhere on $M$.

Proof. (i) By the assumption, we have
\[ \nabla_i \phi = \mp \frac{\sqrt{-1}}{2} \sum_j p_{ij}\, e^j \phi + \frac{\sqrt{-1}}{2} \sum_{j,k;\ i \ne j \ne k \ne i} p_{jk}\, e^i e^j e^k \phi. \]
Then $|d|\phi|^2| = 2 |\mathrm{Re} \langle \nabla \phi, \phi \rangle| \le C |p| |\phi|^2$. This implies $|d \ln |\phi|| \le C r^{-\tau-1}$ on the complement of the zero set of $\phi$ on $M - \tilde K$. Integrating it along a path from $x_0 \in M$ gives
\[ |\phi(x)| \ge |\phi(x_0)|\, e^{C(|x_0|^{-\tau} - |x|^{-\tau})}. \]
Taking $x$ to be the first zero of $\phi$ along the path of integration, or taking the limit as $|x| \to \infty$ if no such zero exists, shows that $\phi(x_0) = 0$. Hence $\phi = 0$ on $M - \tilde K$. Since $\phi$ satisfies the following Dirac-type equation,
\[ D\phi = \mp \frac{\sqrt{-1}}{2} \Big( \sum_i p_{ii} \Big) \phi - (1 \pm 1) \frac{\sqrt{-1}}{4} \sum_{i,j} (p_{ij} - p_{ji})\, e^i e^j \phi, \]
therefore $\phi = 0$ by the Unique Continuation Property.

(ii) Suppose there are constants $C_a$ such that $\phi = \sum_a C_a \phi_a$ vanishes at some point $x_0 \in M$. Since $\bar\nabla \phi = 0$ or $\bar\nabla^* \phi = 0$, we can repeat the above argument to conclude that $\phi \to 0$ on each end. Hence $C_a = 0$ and it follows. □

Proposition 3.1. If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$ and the dominant energy condition (1.26) holds, then the map
\[ \tilde D^* \tilde D : W^{2,q}_{-\delta}(S) \longrightarrow W^{0,q}_{-\delta-2}(S) \tag{3.5} \]
is an isomorphism.

Proof. Note that, by (2.18), $\tilde D^* \tilde D$ is asymptotic to the standard Laplacian operator $\nabla^* \nabla$ in the sense of Bartnik [Ba1]. Hence $\tilde D^* \tilde D$ is Fredholm with index
\[ \mathrm{ind}\big( \tilde D^* \tilde D|_{W^{2,q}_{-\delta}(S)} \big) = \mathrm{ind}\big( \nabla^* \nabla|_{W^{2,q}_{-\delta}(S)} \big), \tag{3.6} \]
see [Ba1]. The weighted Hölder inequality and elliptic regularity imply
\[ \mathrm{Coker}\big( \nabla^* \nabla|_{W^{2,q}_{-\delta}(S)} \big) = \mathrm{Ker}\big( (\nabla^* \nabla)^*|_{W^{0,\bar q}_{-1+\delta}(S)} \big) = \mathrm{Ker}\big( \nabla^* \nabla|_{W^{2,\bar q}_{-1+\delta}(S)} \big), \tag{3.7} \]
where $\bar q = \frac{q}{q-1}$. By the maximal principle, $\nabla^* \nabla$ has trivial kernel both on $W^{2,q}_{-\delta}(S)$ and on $W^{2,\bar q}_{-1+\delta}(S)$. Thus $\mathrm{ind}(\tilde D^* \tilde D) = 0$ and we only need to show that the kernel of $\tilde D^* \tilde D$ on $W^{2,q}_{-\delta}(S)$ is trivial. Let $\phi \in W^{2,q}_{-\delta}(S)$ satisfy $\tilde D^* \tilde D \phi = 0$, then
\[ \int_M |\tilde D \phi|^2 = \int_M \langle \phi, \tilde D^* \tilde D \phi \rangle + \int_{\partial M} \langle e^i \phi, \tilde D \phi \rangle * e^i = \int_{\partial M} \langle e^i \phi, \tilde D \phi \rangle * e^i \to 0 \]
as $x \to \infty$. Thus $\tilde D \phi = 0$. Therefore $\bar\nabla \phi = 0$ on $M$ by (2.25). Hence $\phi = 0$ by Lemma 3.1 (i). The proof of the proposition is complete. □
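The algebraic facts about the Pauli representation (3.4) that this section relies on can be checked directly. The sketch below assumes the three matrices are assigned to $e^1, e^2, e^3$ as written in the code (one assignment consistent with $e^1 e^2 e^3 = \mathrm{Id}$); the helper names are hypothetical. It checks the Clifford relations $e^i e^j + e^j e^i = -2\delta_{ij}$, the product $e^1 e^2 e^3$, and the eigenvalues of $\sqrt{-1} \sum_k P_k e^k$, which the proof of the Positive Mass Theorem uses in the form $\pm\sqrt{\sum_k P_k^2}$.

```python
import cmath

# Matrices for e^1, e^2, e^3; the assignment is an assumption of this sketch.
E = [
    [[1j, 0], [0, -1j]],
    [[0, 1j], [1j, 0]],
    [[0, 1], [-1, 0]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))


def clifford_relations_hold():
    """Check e^i e^j + e^j e^i = -2 delta_ij for all i, j."""
    for i in range(3):
        for j in range(3):
            ab, ba = matmul(E[i], E[j]), matmul(E[j], E[i])
            anti = [[ab[r][c] + ba[r][c] for c in range(2)] for r in range(2)]
            expected = [[(-2 if (i == j and r == c) else 0) for c in range(2)]
                        for r in range(2)]
            if not close(anti, expected):
                return False
    return True


def volume_element():
    """The product e^1 e^2 e^3."""
    return matmul(matmul(E[0], E[1]), E[2])


def momentum_eigenvalues(P):
    """Eigenvalues of sqrt(-1) * sum_k P_k e^k via trace and determinant."""
    A = [[1j * sum(P[k] * E[k][r][c] for k in range(3)) for c in range(2)]
         for r in range(2)]
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr - root) / 2).real, ((tr + root) / 2).real
```

For example, `momentum_eigenvalues([3.0, 0.0, 4.0])` returns the pair $\mp 5$, the norm of the vector with either sign.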
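Proposition 3.2. If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$ and the dominant energy condition (1.26) holds, then the map
\[ \tilde D^* : W^{1,q}_{-\delta-1}(S) \longrightarrow W^{0,q}_{-\delta-2}(S) \tag{3.8} \]
is injective.

Proof. Let $\phi \in \mathrm{Ker}\big( \tilde D^*|_{W^{1,q}_{-\delta-1}(S)} \big)$. Then (2.20) gives
\[ \bar\nabla \bar\nabla^* \phi + \frac{1}{2} \Big( \mu - \sqrt{-1} \sum_j (\omega_j + \chi_j)\, e^j \Big) \phi = 0. \]
By (2.9), we have
\[
\begin{aligned}
d(\langle \bar\nabla_i^* \phi, \phi \rangle * e^i) &= (\langle \bar\nabla \bar\nabla^* \phi, \phi \rangle - \langle \bar\nabla^* \phi, \bar\nabla^* \phi \rangle) * 1 \\
&= -\Big( |\bar\nabla^* \phi|^2 + \frac{1}{2} \Big\langle \phi, \Big( \mu - \sqrt{-1} \sum_j (\omega_j + \chi_j)\, e^j \Big) \phi \Big\rangle \Big) * 1.
\end{aligned}
\]
Hence
\[ \int_M |\bar\nabla^* \phi|^2 + \frac{1}{2} \Big\langle \phi, \Big( \mu - \sqrt{-1} \sum_j (\omega_j + \chi_j)\, e^j \Big) \phi \Big\rangle = -\int_{\partial M} \langle \bar\nabla_i^* \phi, \phi \rangle * e^i \to 0 \]
as $x \to \infty$. Thus $\bar\nabla^* \phi = 0$, and $\phi = 0$ by Lemma 3.1 (i). The proof of the proposition is complete. □

Proposition 3.3. If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$ and the dominant energy condition (1.26) holds, then for any constant spinor $\phi_0$ on ends, the following boundary value problem has a unique solution $\phi \in W^{2,q}(S)$,
\[ \begin{cases} \tilde D \phi = 0, \\ \lim_{r \to \infty} \phi = \phi_0. \end{cases} \tag{3.9} \]

Proof. Since $\tilde D^* \tilde D \phi_0 \in W^{0,q}_{-\delta-2}(S)$, by Proposition 3.1, there is a unique $\phi_1 \in W^{2,q}_{-\delta}(S)$ such that $\tilde D^* \tilde D \phi_1 = -\tilde D^* \tilde D \phi_0$. Then $\phi = \phi_1 + \phi_0$ satisfies $\tilde D^* \tilde D \phi = 0$. Since $\tilde D \phi \in W^{1,q}_{-\delta-1}(S)$, then $\tilde D \phi = 0$ by Proposition 3.2 and $\phi$ is the unique solution of (3.9). □

Theorem 3.1. Let $(M, g_{ij}, p_{ij})$ be a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the dominant energy condition (1.26), then, for each end $M_l$, we have
\[ E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{3.10} \]
If equality holds in (3.10) for some end $M_{l_0}$, then $M$ has only one end. Furthermore, if $E_{l_0} = 0$ and $g_{ij}$ is $C^2$, $p_{ij}$ is $C^1$, then the following equations hold on $M$:
\[ R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk} = 0, \tag{3.11} \]
\[ \nabla_i p_{jk} - \nabla_j p_{ik} = 0, \tag{3.12} \]
\[ \sum_i \nabla_i (p_{ij} - p_{ji}) = 0. \tag{3.13} \]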
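Proof. Let $\phi_0$ be a constant spinor with $\phi_0 \ne 0$ on $M_l$ and $\phi_0 = 0$ on the other ends. Denote by $\phi = \phi_1 + \phi_0$, where $\phi_1 \in W^{2,q}_{-\delta}(S)$, the corresponding solution of (3.9) for this $\phi_0$. We have
\[
\begin{aligned}
&\int_M |\bar\nabla \phi|^2 + \frac{1}{2} \Big\langle \phi, \Big( \mu + \sqrt{-1} \sum_j \omega_j\, e^j \Big) \phi \Big\rangle \\
&\quad = \int_{\partial M_\infty} \Big\langle \phi_0, \sum_{i \ne j} dx^i dx^j \tilde\nabla_j \phi_0 - \frac{\sqrt{-1}}{2} \sum_{i \ne j \ne k \ne i} p_{jk}\, dx^i dx^j dx^k \phi_0 \Big\rangle * dx^i \\
&\quad = \int_{\partial M_\infty} \Big\langle \phi_0, -\frac{1}{4} \sum_{i \ne j} \Gamma_{kjl}\, dx^i dx^j dx^k dx^l \phi_0 \Big\rangle * dx^i \\
&\qquad + \frac{\sqrt{-1}}{2} \int_{\partial M_\infty} \Big\langle \phi_0, \Big( \sum_{i \ne j} p_{jk}\, dx^i dx^j dx^k - \sum_{i \ne j \ne k \ne i} p_{jk}\, dx^i dx^j dx^k \Big) \phi_0 \Big\rangle * dx^i \\
&\quad = \frac{1}{4} \int_{\partial M_\infty} \langle \phi_0, (\partial_j g_{ij} - \partial_i g_{jj}) \phi_0 \rangle * dx^i + \frac{\sqrt{-1}}{2} \int_{\partial M_\infty} \langle \phi_0, (p_{ki}\, dx^k - p_{jj}\, dx^i) \phi_0 \rangle * dx^i \\
&\quad = 4\pi \big( \langle \phi_0, E_l \phi_0 \rangle + \langle \phi_0, \sqrt{-1} P_{lk}\, dx^k \phi_0 \rangle \big).
\end{aligned}
\]
Note that $\sqrt{-1} P_{lk}\, dx^k$ has eigenvalues $\pm \sqrt{\sum_k P_{lk}^2}$. Taking $\phi_0$ to be the eigenspinor of eigenvalue $-\sqrt{\sum_k P_{lk}^2}$ with $|\phi_0| = 1$, we obtain the first part of the theorem.

Equality implies that there is at least one spinor $\phi$ such that $\bar\nabla \phi = 0$. Hence it follows from Lemma 3.1 (i) that $M$ has only one end. If the total energy vanishes, then, by Lemma 3.1 (ii), there is $\{\phi_a\}$ which forms a basis of the spinor bundle everywhere on $M$ such that $\bar\nabla \phi_a = 0$. So in a local frame $\{e_i\}$ of $M$,
\[ \nabla_i \phi_a = -\frac{\sqrt{-1}}{2} p_{ik}\, e^k \phi_a + \frac{\sqrt{-1}}{2} \epsilon_{iab}\, p_{ab}\, \phi_a. \]
Thus
\[
\begin{aligned}
\nabla_i \nabla_j \phi_a &= -\frac{\sqrt{-1}}{2} \nabla_i p_{jl}\, e^l \phi_a - \frac{\sqrt{-1}}{2} p_{jl}\, e^l \nabla_i \phi_a + \frac{\sqrt{-1}}{2} \epsilon_{jab} \nabla_i p_{ab}\, \phi_a + \frac{\sqrt{-1}}{2} \epsilon_{jab}\, p_{ab} \nabla_i \phi_a \\
&= -\frac{\sqrt{-1}}{2} \nabla_i p_{jl}\, e^l \phi_a - \frac{1}{4} p_{jl} p_{ik}\, e^l e^k \phi_a + \frac{1}{4} \epsilon_{iab}\, p_{ab} p_{jl}\, e^l \phi_a \\
&\quad + \frac{\sqrt{-1}}{2} \epsilon_{jab} \nabla_i p_{ab}\, \phi_a + \frac{1}{4} \epsilon_{jab}\, p_{ab} p_{il}\, e^l \phi_a - \frac{1}{4} \epsilon_{iab} \epsilon_{jcd}\, p_{ab} p_{cd}\, \phi_a.
\end{aligned}
\]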
Therefore
\[
\begin{aligned}
-\frac{1}{4} R_{ijkl}\, e^k e^l \phi_a &= (\nabla_i \nabla_j - \nabla_j \nabla_i) \phi_a \\
&= -\frac{\sqrt{-1}}{2} (\nabla_i p_{jk} - \nabla_j p_{ik})\, e^k \phi_a + \frac{1}{4} (p_{ik} p_{jl} - p_{il} p_{jk})\, e^k e^l \phi_a \\
&\quad - \frac{\sqrt{-1}}{2} (\epsilon_{iab} \nabla_j p_{ab} - \epsilon_{jab} \nabla_i p_{ab}) \phi_a.
\end{aligned} \tag{3.14}
\]
Thus
\[ \sum_k (R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk})\, e^k e^l \phi_a = \sqrt{-1} \sum_k (\nabla_i p_{jk} - \nabla_j p_{ik})\, e^k \phi_a + \sqrt{-1} (\epsilon_{iab} \nabla_j p_{ab} - \epsilon_{jab} \nabla_i p_{ab}) \phi_a \tag{3.15} \]
for a basis $\{\phi_a\}$ of $S$. This implies
\[ \sum_k (R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk})\, e^k e^l = \sqrt{-1} \sum_k (\nabla_i p_{jk} - \nabla_j p_{ik})\, e^k + \sqrt{-1} (\epsilon_{iab} \nabla_j p_{ab} - \epsilon_{jab} \nabla_i p_{ab}) \tag{3.16} \]
as an endomorphism of $S$. In terms of the Clifford representation (3.4), we obtain
\[ R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk} = 0, \tag{3.17} \]
\[ \nabla_i p_{jk} - \nabla_j p_{ik} = 0, \tag{3.18} \]
\[ \epsilon_{iab} \nabla_j p_{ab} - \epsilon_{jab} \nabla_i p_{ab} = 0. \tag{3.19} \]
The first two equations are exactly the Gauss and Codazzi equations of a spacelike hypersurface in $\mathbb{R}^{3,1}$ if $p_{ij}$ is symmetric. Equation (3.19) is equivalent to
\[ \frac{1}{2} \chi_j = \sum_i \nabla_i (p_{ij} - p_{ji}) = 0. \tag{3.20} \]
Therefore the proof of the theorem is complete. □

Remark 3.1. Note that (3.13) implies $d{*}\theta = 0$, and (3.12) implies $d\theta = 0$. Therefore if we assume that there is no $L^2$ form on $M$ in Theorem 3.1, then the vanishing of total energy for some end implies $p_{ij} = p_{ji}$. Hence $M$ is topologically $\mathbb{R}^3$ which can be isometrically embedded into a 4-dimensional Minkowski space $\mathbb{R}^{3,1}$ as a spacelike hypersurface so that $g_{ij}$ is the induced metric from $\mathbb{R}^{3,1}$ and $p_{ij}$ is the second fundamental form.

Remark 3.2. Suppose $M$ has a finite number of apparent horizons, i.e., there are a finite number of 2-surfaces $\Sigma_k$, each of them a topological 2-sphere with mean curvature $H_{\Sigma_k} = \pm \mathrm{tr}_g(p)|_{\Sigma_k}$. Denote by $n_k$ a normal covector of $\Sigma_k$. We take the boundary condition $\phi = \pm \sqrt{-1}\, n_k \phi$ on each $\Sigma_k$. Under this condition,
\[ \int_{\Sigma_k} \langle \phi, \bar\nabla_i \phi + e^i \tilde D \phi \rangle * e^i \]
vanishes. This verifies that Theorem 3.1 holds true for black holes, see [GHHP,He2] for the case of symmetric $h_{ij}$.
Remark 3.3. Suppose $N$ is a 4-dimensional Lorentzian manifold and $M$ is a spacelike hypersurface of $N$ with induced metric $g_{ij}$ and second fundamental form $h_{ij}$. Let $\{e^\alpha;\ \alpha = 0, 1, 2, 3\}$ be a coframe of $N$ such that $e^i$ is tangential to $M$ and $e^0$ is normal to $M$. Let $\nabla^N$ be a Lorentzian metric connection on a hypersurface spinor bundle of $M$ [Wi,PT,Zh1,Zh2,Zh3]. For any 2-tensor $\bar p_{ij}$ on $M$, we define modified connections on a hypersurface spinor bundle by
\[ \tilde\nabla_i = \nabla^N_i + \frac{1}{2} \sum_j \bar p_{ij}\, e^0 e^j, \tag{3.21} \]
\[ \bar\nabla_i = \nabla^N_i + \frac{1}{2} \sum_j \bar p_{ij}\, e^0 e^j - \frac{\sqrt{-1}}{2} \sum_{j,k;\ i \ne j \ne k \ne i} \bar p_{jk}\, e^i e^j e^k. \tag{3.22} \]
Suppose $(M, g_{ij}, h_{ij} + \bar p_{ij})$ is almost asymptotically flat of order $1 \ge \tau > \frac{1}{2}$, then the equivalent Lorentzian version of Theorem 3.1 can be proved similarly in terms of the Dirac–Witten operator $\tilde D = e^i \tilde\nabla_i$. We leave it as an exercise.

Remark 3.4. Theorem 3.1 holds true also for Bondi energy-momentum.

The same argument extends to arbitrary $n$-dimensional spin manifolds.

Theorem 3.2. Let $(M, g_{ij}, p_{ij})$ be an $n$-dimensional almost asymptotically flat spin manifold of order $n - 2 \ge \tau > \frac{n-2}{2}$ ($n > 3$), where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the dominant energy condition (1.39), then, for each end $M_l$, we have
\[ E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{3.23} \]
If equality holds in (3.23) for some end $M_{l_0}$, then $M$ has only one end. Furthermore, if $E_{l_0} = 0$ and $g_{ij}$ is $C^2$, $p_{ij}$ is $C^1$, then
\[
\begin{aligned}
&\sum_k (R_{ijkl} + p_{ik} p_{jl} - p_{il} p_{jk})\, e^k e^l - \sqrt{-1} \sum_k (\nabla_i p_{jk} - \nabla_j p_{ik})\, e^k \\
&\quad = -\sqrt{-1} \Big( \sum_{a,b;\ i \ne a \ne b \ne j \ne i} \nabla_i p_{ab}\, e^j e^a e^b - \sum_{a,b;\ i \ne a \ne b \ne j \ne i} \nabla_j p_{ab}\, e^i e^a e^b \Big) \\
&\qquad - \Big( \sum_{l,a,b;\ i \ne l \ne a \ne b \ne j \ne i} p_{ab} p_{il}\, e^j e^l e^a e^b - \sum_{l,a,b;\ i \ne l \ne a \ne b \ne j \ne i} p_{ab} p_{jl}\, e^i e^l e^a e^b \Big)
\end{aligned} \tag{3.24}
\]
as an endomorphism of $S$, where $R_{ijkl}$ is the curvature tensor of $M$.

4. Angular Momentum

In this section, we assume that $(M, g_{ij}, h_{ij})$ ($h_{ij} = h_{ji}$) is a 3-dimensional asymptotically flat manifold of order $\tau$ in the sense that
\[ g_{ij} - \delta_{ij} \in C^{2,\alpha}_{-\tau}, \tag{4.1} \]
\[ h_{ij} \in C^{0,\alpha}_{-\tau-1}, \tag{4.2} \]
\[ \sum_i h_{ii} \in C^{1,\alpha}_{-\tau-1}. \tag{4.3} \]
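Let $\tilde h^z_{ij}$ be the local angular momentum density with respect to point $z \in M$ and $\tilde\theta^z$ be its associated 2-form.

Definition 4.1. The point $z \in M$ is called “regular” if there exists $q > 3$ and a compact set $\tilde K \supset K$ such that
\[ \tilde h^z_{ij} \in C^{0,\alpha}_{-\tau-1}(M - \tilde K), \tag{4.4} \]
\[ d\tilde\theta^z,\ d{*}\tilde\theta^z \in L^{\frac{q}{2},\,-\tau-2}(M). \tag{4.5} \]
For any $z \in M$, its “influence domain” $\Omega_l^z$ with respect to end $M_l$ is the set of points $x_0 \in \mathbb{R}^3$ such that
\[ \epsilon_i{}^{uv}\, \partial_u (\rho_z^2 - \sigma^l_{x_0})(h_{vj} - g_{vj}\, \mathrm{tr}_g(h)) = O(r^{-2\tau-1}), \tag{4.6} \]
\[ \epsilon_i{}^{uv}\, \partial_u g_{kl}\, (h_{vj} - g_{vj}\, \mathrm{tr}_g(h))(x^k - x_0^k)(x^l - x_0^l) = O(r^{-2\tau-1}) \tag{4.7} \]
as $x \to \infty$ on end $M_l$, where $\sigma^l_{x_0}$ is the barrier function defined on $\mathbb{R}^3 - B_r$ with respect to end $M_l$ and point $x_0 \in \mathbb{R}^3$ by
\[ \sigma^l_{x_0}(x) = g_{ij} (x^i - x_0^i)(x^j - x_0^j), \tag{4.8} \]
where $g_{ij}$ is understood to be a function on $\mathbb{R}^3 - B_r$.

As an example, consider a symmetric 2-tensor $h_{ij}$ defined on $\mathbb{R}^3$ with flat metric,
\[ h_{ii} = -f + F_{ii}, \tag{4.9} \]
\[ h_{ij} = 2f + F_{ij} \quad (i \ne j), \tag{4.10} \]
where $f$, $F_{ij}$ are certain smooth functions which satisfy the asymptotic orders
\[ f = O(r^{-2}), \tag{4.11} \]
\[ F_{ij} = O(r^{-3}). \tag{4.12} \]
It is not hard to see that the influence domain $\Omega^0$ of the origin is the line spanned by the vector $a = (1, 1, 1)$.

For each $x_0 \in \Omega_l^z$, we have
\[
\begin{aligned}
\tilde h^z_{ij} &= \frac{1}{2} \epsilon_i{}^{uv} (\nabla_u \sigma^l_{x_0})(h_{vj} - g_{vj}\, \mathrm{tr}_g(h)) + O(r^{-2\tau-1}) \\
&= \epsilon_{iu}{}^{v} (x^u - x_0^u)(h_{vj} - g_{vj}\, \mathrm{tr}_g(h)) + O(r^{-2\tau-1})
\end{aligned}
\]
as $x \to \infty$ on end $M_l$. Therefore,
\[
\begin{aligned}
J^{\mathrm{in}}_{lk}(z) &= \frac{1}{8\pi} \lim_{r\to\infty} \int_{S_{r,l}} \tilde h^z_{ki}\, d\Omega^i \\
&= \frac{1}{8\pi} \lim_{r\to\infty} \int_{S_{r,l}} \epsilon_{ku}{}^{v} (x^u - x_0^u)(h_{vi} - g_{vi}\, \mathrm{tr}_g(h))\, d\Omega^i \\
&= J^{\mathrm{ex}}_{lk}(x_0).
\end{aligned}
\]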
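The example (4.9)–(4.12) can be made concrete with a short computation. The sketch below works under explicit assumptions: the metric is flat, $z$ is the origin so that $\rho_z^2 = |x|^2$ and $\partial_u(\rho_z^2 - \sigma_{x_0}) = 2 x_0^u$, the lower-order terms $F_{ij}$ are dropped, and $f$ is normalized to 1 (only its exact order $r^{-2}$ matters). Since $\tau > \frac{1}{2}$ gives $2\tau + 1 > 2$, a term of exact order $r^{-2}$ in (4.6) must vanish identically, and (4.7) holds trivially because $\partial_u g_{kl} = 0$. The function names are hypothetical.

```python
def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2); 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2


def leading_obstruction(x0):
    """Coefficient of f in the left side of (4.6) for the example (4.9)-(4.10),
    with F_ij dropped and a flat metric (both assumptions of this sketch)."""
    # h_ii = -f, h_ij = 2f for i != j, normalized to f = 1.
    h = [[(-1.0 if i == j else 2.0) for j in range(3)] for i in range(3)]
    tr = sum(h[i][i] for i in range(3))
    t = [[h[v][j] - (tr if v == j else 0.0) for j in range(3)] for v in range(3)]
    grad = [2.0 * c for c in x0]  # d_u(|x|^2 - |x - x0|^2) = 2 * x0_u
    return [[sum(levi_civita(i, u, v) * grad[u] * t[v][j]
                 for u in range(3) for v in range(3))
             for j in range(3)] for i in range(3)]


def in_influence_domain(x0, tol=1e-12):
    """True when the order r^{-2} term of (4.6) vanishes for this x0."""
    return all(abs(c) <= tol for row in leading_obstruction(x0) for c in row)
```

In this normalization $h_{vj} - \delta_{vj}\,\mathrm{tr}(h)$ has all entries equal to $2f$, so the obstruction is proportional to the cross product of $x_0$ with $(1, 1, 1)$; it vanishes exactly for $x_0$ on the line spanned by $(1, 1, 1)$, in agreement with the statement above.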
By taking $p_{ij} = \tilde h^z_{ij}$ and $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$ in Theorem 3.1 respectively, we can obtain the following two Positive Mass Theorems involving total angular momentum.
Theorem 4.1. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant energy condition (1.26) holds for $p_{ij} = \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega_l^z$ (if $\Omega_l^z$ is nonempty), we have
\[ E_l \ge \sqrt{\sum_k (J^{\mathrm{in}}_{lk}(z))^2} = \sqrt{\sum_k (J^{\mathrm{ex}}_{lk}(x_0))^2}. \tag{4.13} \]
If equality holds in (4.13), then $M$ has only one end. Furthermore, if $E_l = 0$ and $\tilde h^z_{ij}$ is $C^1$, then (3.11), (3.12) and (3.13) hold true for $p_{ij} = \tilde h^z_{ij}$.

Theorem 4.2. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant energy condition (1.26) holds for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega_l^z$ (if $\Omega_l^z$ is nonempty), we have
\[ E_l \ge \sqrt{\sum_k (P_{lk} \pm J^{\mathrm{in}}_{lk}(z))^2} = \sqrt{\sum_k (P_{lk} \pm J^{\mathrm{ex}}_{lk}(x_0))^2}. \tag{4.14} \]
If equality holds in (4.14), then $M$ has only one end. Furthermore, if $E_l = 0$ and $h_{ij} \pm \tilde h^z_{ij}$ is $C^1$, then (3.11), (3.12) and (3.13) hold true for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$.

Remark 4.1. Theorem 4.1 holds true when $M$ contains minimal 2-spheres and Theorem 4.2 holds true when $M$ contains apparent horizons, due to Remark 3.2 and (1.18).

Acknowledgements. The author would like to express his gratitude to Professor S. T. Yau for his suggestion and especially for bringing the center of mass to his attention, and to thank Professors R. Bartnik, F. H. Lin, K. Liu, L. F. Tam, G. Tian, J. P. Wang and W. P. Zhang for their interest in this work and useful conversations. This work was finished while the author visited the Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany. He would like to thank Professor J. Jost for his invitation and to thank the institute for its hospitality and financial support.
References

[ADM] Arnowitt, R., Deser, S., Misner, C.: Coordinate invariance and energy expressions in general relativity. Phys. Rev. 122, 997–1006 (1961)
[AH] Ashtekar, A., Hansen, R.: A unified treatment of null and spatial infinity in general relativity. I. Universal structure, asymptotic symmetries, and conserved quantities at spatial infinity. J. Math. Phys. 19, 1542–1566 (1978)
[AM] Ashtekar, A., Magnon-Ashtekar, A.: On conserved quantities in general relativity. J. Math. Phys. 20, 793–800 (1979)
[AS] Ashtekar, A., Streubel, M.: On angular momentum of stationary gravitating systems. J. Math. Phys. 20, 1362–1365 (1979)
[Ba1] Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 39, 661–693 (1986)
[Ba2] Bartnik, R.: Quasi-spherical metrics and prescribed scalar curvature. J. Diff. Geom. 37, 31–71 (1993)
[Br] Bray, H.: The Penrose inequality in general relativity and volume comparison theorems involving scalar curvature. Thesis, Stanford University, 1997
[CK] Christodoulou, D., Klainerman, S.: The global nonlinear stability of Minkowski space. Princeton Math. Series 41, Princeton, NJ: Princeton Univ. Press, 1993
[Gi] Gibbons, G.: Collapsing shells and the isoperimetric inequality for black holes. Class. Quant. Grav. 14, 2905–2915 (1997)
[GHHP] Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
[Ha] Harvey, R.: Spinors and calibrations. London: Academic Press, 1989
[HE] Hawking, S., Ellis, G.: The large scale structure of space-time. Cambridge: Cambridge Univ. Press, 1973
[He1] Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1997)
[He2] Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
[Hu] Huisken, G.: Private communication
[HI1] Huisken, G., Ilmanen, T.: The Riemannian Penrose inequality. (Announcement) Int. Math. Res. Not. 20, 1045–1058 (1997)
[HI2] Huisken, G., Ilmanen, T.: The inverse mean curvature flow and the Riemannian Penrose inequality. Preprint
[HY] Huisken, G., Yau, S.T.: Definition of center of mass for isolated physical systems and unique foliations by stable spheres with constant mean curvature. Invent. Math. 124, 281–311 (1996)
[LP] Lee, J., Parker, T.: The Yamabe problem. Bull. Am. Math. Soc. 17, 31–81 (1987)
[Li] Li, P.: Lecture notes on geometric analysis. Lecture Notes Series No. 6, Research Institute of Mathematics and Global Analysis Research Center, Seoul National University, Seoul, 1993
[MTW] Misner, C., Thorne, K., Wheeler, J.: Gravitation. (19th printing), New York: W.H. Freeman and Company, 1995
[PT] Parker, T., Taubes, C.: On Witten’s proof of the positive energy theorem. Commun. Math. Phys. 84, 223–238 (1982)
[Pe] Penrose, R.: Some unsolved problems in classical general relativity. Seminar on differential geometry, ed. S.T. Yau, Annals of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 631–668
[Sc] Schoen, R.: Variational theory for the total scalar curvature functional for Riemannian metric and related topics. Lecture Notes in Math. 1365, Berlin–Heidelberg–New York: Springer-Verlag, 1987, pp. 120–154
[St] Streubel, M.: Conserved quantities for isolated gravitational system. Gen. Rel. Grav. 9, 551–561 (1978)
[SY1] Schoen, R., Yau, S.T.: On the proof of the positive mass conjecture in general relativity. Commun. Math. Phys. 65, 45–76 (1979)
[SY2] Schoen, R., Yau, S.T.: The energy and the linear momentum of spacetimes in general relativity. Commun. Math. Phys. 79, 47–51 (1981)
[SY3] Schoen, R., Yau, S.T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231–260 (1981)
[Va] Vafa, C.: Geometric physics. Doc. Math., Extra Vol. ICM 1998 (I), pp. 375–394
[Wi] Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
[Ya1] Yau, S.T.: Private communication
[Ya2] Yau, S.T.: Problem section. Seminar on differential geometry, ed. S.T. Yau, Annals of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 699–706
[Zh1] Zhang, X.: Positive mass conjecture for 5-dimensional Lorentzian manifolds. Submitted
[Zh2] Zhang, X.: Positive mass theorem for modified energy condition. Submitted
[Zh3] Zhang, X.: Positive mass theorem for hypersurface in 5-dimensional Lorentzian manifolds. Comm. Anal. Geom., to appear
[Zh4] Zhang, X.: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 5, 199–210 (1998)

Communicated by H. Nicolai
Commun. Math. Phys. 206, 157 – 183 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Structure of the Small Quantum Cohomology Rings of Projective Hypersurfaces

Alberto Collino$^1$, Masao Jinzenji$^{2,\star}$

1 Dipartimento di Matematica, Università di Torino, Via Carlo Alberto 10, 10123 Torino, Italy. E-mail: [email protected]
2 Department of Physics, University of Tokyo, Bunkyo-ku, Tokyo 113, Japan. E-mail: [email protected]

Received: 29 November 1996 / Accepted: 15 March 1999
Abstract: We give an explicit procedure which computes for degree $d \le 3$ the correlation functions of topological sigma model (A-model) on a projective Fano hypersurface $X$ as homogeneous polynomials of degree $d$ in the correlation functions of degree 1 (number of lines). We extend this formalism to the case of Calabi–Yau hypersurfaces and explain how the polynomial property is preserved. Our key tool is the construction of universal recursive formulas which express the structure constants of the quantum cohomology ring of $X$ as weighted homogeneous polynomial functions of the constants of the Fano hypersurface with the same degree and dimension one more. We propose some conjectures about the existence and the form of the recursive laws for the structure constants of rational curves of arbitrary degree. Our recursive formulas should yield the coefficients of the hypergeometric series used in the mirror calculation. Assuming the validity of the conjectures we find the recursive laws for rational curves of degree four.
1. Introduction
In [16], we studied the Kähler sub-ring $H^*_{q,e}(M_N^k)$ in the quantum cohomology ring of a hypersurface $M_N^k$ of degree $k$ in $CP^{N-1}$, where we used numerical computation based on the torus action method. We worked under the condition that $c_1(M_N^k)$ is not negative, i.e. under the hypothesis $N \ge k$. The following statements summarize the content of that paper:
1. For $N \le 9$ with $N - k \ge 2$, we computed that the main relation satisfied by the generator $O_e$ of $H^*_{q,e}(M_N^k)$ has the simple form
\[ (O_e)^{N-1} - k^k (O_e)^{k-1} \cdot q = 0. \tag{1.1} \]
* Present address: Graduate School of Mathematical Sciences, University of Tokyo, Meguro-ku, Tokyo, 153-8914, Japan.
2. Under the same restriction as for 1 the structural constants of $H^*_{q,e}(M_N^k)$ can be expressed as polynomial functions of a finite set of integers. These integers are basically the Schubert numbers of lines, and they do depend on the degree of the hypersurface but not on its dimension.
3. An explanation for (1.1) was found by looking at a toric compactification of the moduli space of maps from $CP^1$ to $CP^{N-1}$. It was said that the boundary portion of the moduli space should turn out to be irrelevant for the calculation, under the condition $N - k \ge 2$.
The justification for 1 and 2 was based on numerical computations and the explanation of 3 was heuristic. Givental [11] gave a mathematically rigorous proof of (1.1). He constructed the fundamental solution of the Gauss–Manin system (the deformation parameter is restricted to the Kähler deformation) associated to the A-model on $M_N^k$ by inventing a powerful extension of the torus action method and he showed that it satisfies the linear ODE of hypergeometric type if $N - k \ge 2$. This ODE yields the main relation (1.1) of $H^*_{q,e}(M_N^k)$ by means of a certain limit procedure (in his notation $\hbar \to 0$). Givental also treated the cases when $N - k$ is 1 or 0 and he showed that the solution above satisfies the linear ODE of hypergeometric type if a) some multiplicative factor is added (when $N - k = 1$); b) some multiplicative factor is added and a coordinate transformation (the mirror map) is performed (when $N = k$). In this way he proved the mirror symmetry conjecture, namely that topological sigma models on Calabi–Yau manifolds realized as complete intersections in projective space can be solved by the analysis of hypergeometric series. His proof of the symmetry seems to rely on the flat metric condition, namely on the fact that the three point functions which include the identity operator do not receive quantum correction. His arguments are quite powerful and deep, but we can hardly see what is happening microscopically.
The original motivation of the present paper was an attempt to prove (1.1) by descending induction on $N$, see [6]. Our program is to construct recursive formulas that express the structural constants of $H^*_{q,e}(M_N^k)$ as weighted homogeneous polynomials in the structural constants of $H^*_{q,e}(M_{N+1}^k)$. Our method is based on a geometric process, which we call the specialization procedure, but unfortunately it works only up to the case of cubic curves.
We believe that it should be possible to find and construct universal recursive laws also for curves of higher degree. We state some ansatz on the expected structure of such formulas. The main expected property says that if the index $N - k \ge 2$ then the recursive laws should stay the same, independently of $N$ and $k$. When the hypersurface is of Fano index 1 then the recursion law for the Schubert numbers of lines changes, while the formulas for curves of higher degree do not. Coming to the Calabi–Yau situation, $N = k$, the recursion relation is modified for all degrees. In this case the main relation of (1.1) must be changed entirely.
We first computed the recursion relation for lines in the case of $H^*_{q,e}(M_N^N)$ and evaluated the degree 1 part of the relation [16]. The result had a structure strongly reminiscent of the situation from mirror symmetry [18] and we speculated that the above correction and the correction terms argued in [18] are closely related. On the other hand the universal recursion laws valid for the case $N - k \ge 2$ can be formally iterated by descent of dimension (while keeping the degree of the hypersurface fixed) twice more down to the case of a Calabi–Yau. What we conjecture here, and verify in part, is that this formal procedure yields the coefficients of the hypergeometric series which appear in the mirror
calculation, but without use of the mirror conjecture. At this point the construction of the structure constants for the quantum ring of the Calabi–Yau hypersurface can be realized by the standard procedure that arises from the flat metric condition. Coming back to the main relation above, we can prove it modulo $(q^4)$ as follows. We construct explicit recursion relations for $d \le 3$ and start the descending induction where $N$ is large enough with respect to the fixed degree $k$ so that the only non-trivial quantum corrections come from lines. This procedure yields (1.1) and 2. We expect that the universal recursive procedure should provide interesting information also for the case when $M_N^k$ is a hypersurface of general type, i.e. $N < k$.
Our paper is organized as follows. In Sect. 2 we recall first the main properties of the structure of the quantum Kähler algebra $H^*_{q,e}(M_N^k)$ and then we study the quantum product with primitive cohomology classes for the Fano case, $k < N$. In Sect. 3 we introduce the specialization calculation and derive the recursion relations for rational curves of degree at most 3 under the assumption that the hypersurface is a Fano manifold. In Sect. 4 we try to extend the specialization procedure to Calabi–Yau hypersurfaces and determine how the recursion laws should be modified up to degree 2. Using this we evaluate the main relation of the quantum ring and compare it with the result from mirror symmetry. We find here the motivation to organize our findings in a compact form by means of the hypergeometric series used in mirror calculation. At this point some computations induce us to state a conjecture which says how to modify the recursion relation for cubic curves in the Calabi–Yau case. In Sect. 5 we present a set of conjectures which should provide a guiding rule in the explicit construction of the recursive formulas for rational curves of higher degree.
Assuming the conjectures we can explicitly construct the recursive formulas for degree four.
2. Quantum Cohomology of Fano Hypersurfaces
2.1. The quantum Kähler sub-ring $H^*_{q,e}(M_N^k)$. Let $M_N^k$ be the generic and non-singular hypersurface of degree $k$ in $CP^{N-1}$. By the Lefschetz theorem the cohomology ring $H^*(M_N^k)$ splits into two parts. One of them is the Kähler sub-ring generated by the Kähler form $e$ induced from the hyperplane section $H$ of $CP^{N-1}$, and the other is the primitive part, which is a subspace of the middle dimension cohomology $H^{N-2}(M_N^k)$.
We first consider the quantum Kähler sub-ring $H^*_{q,e}(M_N^k)$. It is generated additively by $O_{e^\alpha}$ ($\alpha = 0, 1, 2, \cdots, N-2$), where $O_{e^\alpha}$ represents the BRST-closed operator induced from $e^\alpha \in H^*(M_N^k)$. The multiplication rules of $H^*_{q,e}(M_N^k)$ are determined by means of the flat metric and the three point correlation functions (or Gromov–Witten invariants):
\[ \eta^{N,k}_{\alpha\beta} := \langle O_{e^0} O_{e^\alpha} O_{e^\beta} \rangle_{M_N^k} = \int_{M_N^k} e^\alpha \wedge e^\beta = k \cdot \delta_{\alpha+\beta,N-2}, \tag{2.1} \]
\[ \eta_{N,k}^{\alpha\beta} = \frac{1}{k} \cdot \delta_{\alpha+\beta,N-2}, \qquad \eta_{N,k}^{\alpha\beta}\, \eta^{N,k}_{\beta\gamma} = \delta^\alpha_\gamma, \tag{2.2} \]
\[ C^{N,k}_{\alpha,\beta,\gamma} = \langle O_{e^\alpha} O_{e^\beta} O_{e^\gamma} \rangle_{M_N^k} = \sum_{d=0}^{\infty} q^d \int_{\overline{M}^{M_N^k}_{0,d,3}} \phi_1^*(e^\alpha) \wedge \phi_2^*(e^\beta) \wedge \phi_3^*(e^\gamma). \tag{2.3} \]
Here $q := \exp(t)$, $\overline{M}^{M_N^k}_{0,d,3}$ represents the moduli space of stable maps from 3-pointed genus 0 curves to $M_N^k$ and $\phi_i$ is the natural $i$-th evaluation morphism $\overline{M}^{M_N^k}_{0,d,3} \to M_N^k$. Basically $\overline{M}^{M_N^k}_{0,d,3}$ is the space of rational curves of degree $d$ with three punctures in $M_N^k$. The rules of quantum multiplication are
\[ O_{e^\alpha} \cdot O_{e^\beta} = C^{N,k}_{\alpha,\beta,\gamma}\, \eta_{N,k}^{\gamma\delta}\, O_{e^\delta} = \frac{1}{k} C^{N,k}_{\alpha,\beta,\gamma}\, O_{e^{N-2-\gamma}} := \frac{1}{k} \sum_{d=0}^{\infty} q^d C^{N,k,d}_{\alpha,\beta,\gamma}\, O_{e^{N-2-\gamma}}. \tag{2.4} \]
One should note that $O_e$ is a multiplicative generator of $H^*_{q,e}(M_N^k)$, and therefore it is enough to determine the multiplication rule between $O_e$ and $O_{e^\alpha}$. The topological selection rule yields that the degree $d$ part of $C^{N,k}_{1,\alpha,\beta}$ is non-zero only if $1 + \alpha + \beta = N - 2 + (N-k)d$, hence it is:
\[ O_e \cdot O_{e^\alpha} = \sum_{d=0}^{\infty} \frac{1}{k}\, q^d\, C^{N,k,d}_{1,\alpha,N-3-\alpha+(N-k)d}\, O_{e^{\alpha+1-(N-k)d}}. \tag{2.5} \]
For conventional reasons, we rewrite (2.5) as follows:
\[ O_e \cdot O_{e^{N-2-m}} = O_{e^{N-1-m}} + \sum_{d=1}^{\infty} q^d L_m^{N,k,d}\, O_{e^{N-1-m-(N-k)d}}, \qquad L_m^{N,k,d} := \frac{1}{k} C^{N,k,d}_{1,N-2-m,m-1+(N-k)d}. \tag{2.6} \]
Here we have used the fact that the $q^0$ part of $H^*_{q,e}(M_N^k)$ coincides with the classical cohomology ring. The integers $L_m^{N,k,d}$ are the structure constants of the quantum ring. One should think of $k L_m^{N,k,d}$ as the number of rational curves of degree $d$ on $M_N^k$ which meet a linear section of dimension $m$ and a second linear section of the right ($= m + (N-k)d - 1$) codimension. We shall see that $L_m^{N,k,1}$ are independent of $N$ if $N \ge k+2$, and therefore we write them simply as $L^k_m$; we refer to them as the Schubert numbers of lines. Note that $L^k_m = L^k_{k-m-1}$. Gromov–Witten invariants of genus 0 vanish by insertion of $O_{e^0}$. Since $M_N^k$ is an $N-2$ dimensional manifold we obtain:
\[ L_m^{N,k,d} \ne 0 \implies 1 \le N-2-m \le N-2, \quad 1 \le m-1+(N-k)d \le N-2 \]
\[ \iff \max\{0,\, 2-(N-k)d\} \le m \le \min\{N-3,\, N-1-(N-k)d\}. \tag{2.7} \]
These vanishing conditions translate into
\[ L_m^{N,k,d} \ne 0 \implies \begin{cases} 0 \le m \le (N-1)-(N-k)d & (N-k \ge 2), \\ 1 \le m \le (N-3) & (N-k = 1,\ d = 1), \\ 0 \le m \le (N-1)-(N-k)d & (N-k = 1,\ d \ge 2), \\ 2 \le m \le (N-3) & (N-k = 0). \end{cases} \tag{2.8} \]
We remark explicitly that if the dimension N is large with respect to k (N ≥ 2k) then the only non trivial quantum correction left is due to curves of degree 1.
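The bounds in (2.7) are easy to check mechanically. The following is a minimal Python sketch; the function names `allowed_m_range` and `degrees_with_corrections` are ours and do not appear in the paper. It enumerates the values of m that (2.7) permits and can be used to confirm, on small examples, the four cases of (2.8) and the remark that only d = 1 survives once N ≥ 2k.

```python
def allowed_m_range(N, k, d):
    """Values of m for which (2.7) allows L_m^{N,k,d} to be non-zero."""
    h = N - k  # Fano index of M_N^k
    lower = max(0, 2 - h * d)
    upper = min(N - 3, N - 1 - h * d)
    return range(lower, upper + 1)


def degrees_with_corrections(N, k, max_degree=10):
    """Degrees 1 <= d <= max_degree for which (2.7) leaves some admissible m."""
    return [d for d in range(1, max_degree + 1)
            if len(allowed_m_range(N, k, d)) > 0]
```

For instance, `degrees_with_corrections(10, 5)` only reports degree 1, in line with the remark for N ≥ 2k, while for N = 9, k = 5 degree 2 is still admissible.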
As we said above $O_e$ is a multiplicative generator of the ring, and then there are coefficients $\gamma$ which give the representations:
\[ O_{e^{N-1-m}} = (O_e)^{N-1-m} - \sum_{d=1}^{\infty} q^d \gamma_m^{N,k,d} (O_e)^{N-1-m-(N-k)d}. \tag{2.9} \]
When we set $m = 0$ we obtain the main relation of $H^*_{q,e}(M_N^k)$. One has
\[ O_e \cdot \Big( (O_e)^{N-2-m} - \sum_{d=1}^{\infty} q^d \gamma_{m+1}^{N,k,d} (O_e)^{N-2-m-(N-k)d} \Big) \]
\[ = (O_e)^{N-1-m} - \sum_{d=1}^{\infty} q^d \gamma_m^{N,k,d} (O_e)^{N-1-m-(N-k)d} + \sum_{d=1}^{\infty} L_m^{N,k,d} q^d \Big( (O_e)^{N-1-m-(N-k)d} - \sum_{d'=1}^{\infty} q^{d'} \gamma_{m+(N-k)d}^{N,k,d'} (O_e)^{N-1-m-(N-k)(d+d')} \Big), \tag{2.10} \]
and therefore it is
\[ \gamma_m^{N,k,d} - \gamma_{m+1}^{N,k,d} = L_m^{N,k,d} - \sum_{d'=1}^{d-1} L_m^{N,k,d-d'}\, \gamma_{m+(N-k)(d-d')}^{N,k,d'}. \tag{2.11} \]
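This yields:
\[ \gamma_m^{N,k,d} = \sum_{l=1}^{d}\ \sum_{\sum_{i=1}^{l} d_i = d} (-1)^{l-1} \sum_{j_l=m}^{N-1-(N-k)d} \cdots \sum_{j_2=m}^{j_3}\ \sum_{j_1=m}^{j_2} \Big( \prod_{i=1}^{l} L^{N,k,d_i}_{j_i + (\sum_{n=1}^{i-1} d_n)(N-k)} \Big). \tag{2.12} \]
The fact that the main relation of $H^*_{q,e}(M_N^k)$ is of the form $(O_e)^{N-1} - k^k (O_e)^{k-1} \cdot q = 0$ [16,11] is equivalent to
\[ \gamma_0^{N,k,1} = \sum_{j=0}^{k-1} L_j^{N,k,1} = k^k, \]
\[ \gamma_0^{N,k,d} = \sum_{l=1}^{d}\ \sum_{\sum_{i=1}^{l} d_i = d} (-1)^{l-1} \sum_{j_l=0}^{N-1-(N-k)d} \cdots \sum_{j_2=0}^{j_3}\ \sum_{j_1=0}^{j_2}\ \prod_{i=1}^{l} L^{N,k,d_i}_{j_i + (\sum_{n=1}^{i-1} d_n)(N-k)} = 0 \qquad (d \ge 2). \tag{2.13} \]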
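The passage from the recursion (2.11) to the closed formula (2.12) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the values returned by `L` are arbitrary placeholder integers (not actual Schubert numbers or Gromov–Witten invariants), the toy choice N = 7, k = 5 is ours, and we assume the boundary condition that $\gamma_m^{N,k,d}$ vanishes for m > N − 1 − (N − k)d, where the exponent in (2.9) would become negative. All function names are hypothetical.

```python
from itertools import combinations_with_replacement

N, K = 7, 5      # toy values (an assumption), Fano index N - K = 2
H = N - K


def L(d, m):
    """Placeholder integers standing in for L_m^{N,k,d}; NOT real invariants."""
    return (3 * d + 5 * m * m + d * m) % 7 + 1


def top(d):
    """Largest m for which gamma_m^{N,k,d} can be non-zero."""
    return N - 1 - H * d


def gamma_recursive(d, m):
    """Solve (2.11) downwards in m, with gamma = 0 above top(d)."""
    if m > top(d):
        return 0
    correction = sum(L(d - dp, m) * gamma_recursive(dp, m + H * (d - dp))
                     for dp in range(1, d))
    return gamma_recursive(d, m + 1) + L(d, m) - correction


def compositions(d):
    """All ordered tuples of positive integers summing to d."""
    if d == 0:
        yield ()
        return
    for first in range(1, d + 1):
        for rest in compositions(d - first):
            yield (first,) + rest


def gamma_closed(d, m):
    """Evaluate the nested sum (2.12) directly."""
    total = 0
    for parts in compositions(d):
        l = len(parts)
        # shift of the i-th index: (d_1 + ... + d_{i-1}) * (N - k)
        shifts = [H * sum(parts[:i]) for i in range(l)]
        # non-decreasing tuples m <= j_1 <= ... <= j_l <= top(d)
        for js in combinations_with_replacement(range(m, top(d) + 1), l):
            term = 1
            for di, ji, shift in zip(parts, js, shifts):
                term *= L(di, ji + shift)
            total += (-1) ** (l - 1) * term
    return total
```

Since the agreement of the two functions is an algebraic identity in the $L$'s, it holds for any choice of placeholder values; with genuine invariants one would additionally expect `gamma_closed(d, 0)` to vanish for d ≥ 2, as stated in (2.13).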
2.2. The role of primitive cohomology. In this subsection we consider the general structure of the quantum cohomology ring of a Fano hypersurface $V$ of degree $k$ in $P^{n+1}$ ($n \ge 3$) including the primitive part. It is $H_2(V, \mathbb{Z}) = \mathbb{Z}q$, where $kq$ is the class of a plane section, and $H^2(V, \mathbb{Z})$ is spanned by the class $x$ ($:= e$) of the hyperplane section $H$. The ring $H^*(V, \mathbb{Q})$ is generated by $x$ and by the primitive cohomology $H^n(V, \mathbb{Q})_0$, with the relations $x^{n+1} = 0$, $x \cup a_1 = 0$, $a_1 \cup a_2 = k^{-1} \big( \int_V a_1 \wedge a_2 \big) x^n$ for $a_1, a_2$ primitive classes. We shall denote by $(\ |\ )_V$ the intersection form, hence $(a|b)_V = \int_V a \wedge b$. For $0 \le i \le n$, $x_i$ ($:= e^i$) is the class of the linear section of $V$ of codimension $i$, so that $x = x_1$. The vectors $x_i$ span the invariant part $R$ of $H^*(V, \mathbb{Q})$, which is the orthogonal complement of $H^n(V, \mathbb{Q})_0$. We recall that the Fano index of $V$ is $h = h(V) = n + 2 - k$. Let $\mathbb{Z}\{H_2(V, \mathbb{Z})\}$ be the graded homogeneous ring of formal series $\sum n_d q^d$ with integer coefficients. One introduces a ring structure on $H^*(V, \mathbb{Z}\{H_2(V, \mathbb{Z})\})$ by the rule that for homogeneous $\alpha^*, \beta^*$ in $H^*(V, \mathbb{Z})$ the quantum multiplication product is $\alpha^* \cdot \beta^* = \sum_d (\alpha^*, \beta^*)_d\, q^d$, where $(\alpha^*, \beta^*)_0$ is the ordinary cohomology product, and $(\alpha^*, \beta^*)_d$ is a class of degree $\deg(\alpha^*) + \deg(\beta^*) - 2hd$ defined by the condition $((\alpha^*, \beta^*)_d | \gamma) = [\alpha^*, \beta^*, \gamma; d; V]$ ($= \langle O_{\alpha^*} O_{\beta^*} O_\gamma \rangle_{V,d,\mathrm{gravity}}$). This last term is the GW invariant, which can be informally defined as the number of rational curves of degree $d$ on $V$ meeting the representative submanifolds $A$, $B$, $G$ in general position. We shall use the associativity and the grading properties of $\cdot$, whose rigorous and highly non-trivial construction is due to Ruan and Tian [27].
We recall some facts from [28]. Tian observed that the GW classes $[\alpha_1, \ldots, \alpha_l; d; V]$ are invariant under monodromy action, which is a direct corollary of the main result in [27], and he applied this explicitly to cases like hypersurfaces by using the Picard–Lefschetz theorem.
Proposition 1 (Tian). If $m - l$ is odd and the $a_s$ are primitive classes then $[x_{i_1}, \ldots, x_{i_l}, a_{l+1}, \ldots, a_m; d; V] = 0$.
Proof. The statement holds when $n$ is odd for trivial reasons, indeed by definition $[x_{i_1}, \ldots, x_{i_l}, a_{l+1}, \ldots, a_m; d; V] = 0$ if $2(\sum i_j) + (m-l)n \ne 2n + 2hd + 2(m+l-3)$. Coming to the case when the hypersurface $V$ is even dimensional, we recall that the monodromy group $M$ is generated by reflections defined by the vanishing cycles. The case of even dimensional quadrics is readily checked, since the vanishing cohomology has rank one in this case. On the other hand if $n > 3$ and $k > 3$, by the same argument explained on p. 384 of [26], a lemma of Deligne yields that the Zariski closure $\bar{M}$ is in fact the full group of isometries of $H^*(V, \mathbb{C})_0$. Thus the GW invariant above defines a symmetric multilinear form with an odd number of entries, invariant under the orthogonal group. It is clear that such a form vanishes. □
If $h \ge 2$ Tian's result yields $x \cdot a = 0$, for $a \in H^n(V, \mathbb{Q})_0$. Instead we have
Proposition 2. If $h = 1$ then $x \cdot a = -k!\, a q$.
Proof. The statement is equivalent to $[x, a_1, a_2; 1] = -k!\,(a_1|a_2)_V$; here the $a_i$ are primitive classes and $[x, a_1, a_2; 1]$ is the GW number of the lines which meet them. Our proof of this equality is based on a remark of Beauville, [1, 4, Application II]. In this direction we also need to prove the formula below, which is a generalization of a result of Tyurin, [29, 3] and [22]. Let $W$ be a general hypersurface whose generic hyperplane section is $V$. Then the Fano variety $F(W)$ of lines on $W$ is a non-singular irreducible variety of dimension $k$ and there are $k!$ lines on $W$ which meet a general point, [22]. The variety $F(V)$ is a nonsingular subvariety of codimension 2 in $F(W)$.
The natural $P^1$ bundle $p : L \to F(W)$
surjects $\lambda : L \to W$ with degree $k!$. We denote by $\gamma : BF \to V$ the restriction of $\lambda$ to $V$, where $\gamma$ has degree $k!$. Then $\beta : BF \to F(W)$ is the blow up along $F(V)$ and the projection of the exceptional divisor $\pi : E \to F(V)$ is the restriction of $p$. We denote here $i : E \to BF$ and $j : V \to W$ the natural inclusions. The cohomology of a blow up decomposes as a direct sum, in our case $H^*(BF, \mathbb{Q}) = i_* \pi^*(H^{*-2}(F(V), \mathbb{Q})) \oplus \beta^*(H^*(F(W), \mathbb{Q}))$. Now $\gamma_* : H^*(BF, \mathbb{Q}) \to H^*(V, \mathbb{Q})$ is a surjection, because $\gamma : BF \to V$ is. It is known that the primitive cohomology is contained in the image $(\gamma i)_* \pi^*(H^{*-2}(F(V), \mathbb{Q}))$, [22]. We need the stronger result that given a primitive class $a$ there is a class $\alpha$ with $\gamma^*(a) = i_* \pi^* \alpha$. To prove this statement we first note that it is equivalent to $\beta_* \gamma^*(a) = 0$, and then we consider a cycle $A$ which represents $a$ and which is in general position with respect to the locus covered by the lines on $V$. We have $\beta_* \gamma^*(A) = \beta_*(\lambda^*(j_*(A)) \cap BF)$, and then $j_*(A) = 0$ in $H^{n+2}(W, \mathbb{Q})$ because primitive classes are annihilated by $j_*$. Fix next the primitive classes $a_1$ and $a_2$ so that $\gamma^* a_1 = i_* \pi^* \alpha_1$, $\gamma^* a_2 = i_* \pi^* \alpha_2$. One has equality of degrees of intersection $(\gamma^* a_1 | \gamma^* a_2)_{BF} = k!\,(a_1|a_2)_V$, because the degree of $\gamma$ is $k!$. On the other hand the excess intersection formula of [9] yields $(i_* \pi^* \alpha_1 | i_* \pi^* \alpha_2)_{BF} = -(\pi^* \alpha_1 | \pi^* \alpha_2 \cdot \zeta)_E = -(\alpha_1|\alpha_2)_{F(V)}$; here $\zeta$ denotes the tautological class of $E$ as a $P^1$ bundle, and $\zeta$ is known to be the opposite of the class of the normal bundle of $E$ in $BF$. Thus $k!\,(a_1|a_2)_V = -(\alpha_1|\alpha_2)_{F(V)}$. Now it is geometrically clear, and is the idea from [1], that $[x, a_1, a_2; 1] = (\pi_* i^* \gamma^* a_1 | \pi_* i^* \gamma^* a_2)_{F(V)} = (\alpha_1|\alpha_2)_{F(V)}$.
Tian's vanishing implies also that the quantum product of the hyperplane class with a linear section is of type
\[ x \cdot x_{s-1} = x_s + \sum_{d \ge 1} a_{d,s}\, x_{s-dh}\, q^d, \]
where $a_{d,s} = k^{-1} [x_{s-1}, x_{n+dh-s}, x; d]$ are the structure constants. We set $w := x + k!\,q$, if $h = 1$, and otherwise $w := x$, and we write $w^s$ for the $s$-th power of $w$ with respect to the quantum product. Then $w$ satisfies a unique minimal monic equation $F = 0$, of degree $(n+1)$, the equation which is found by setting $s = n+1$ in the displayed formula. This is of the form $F := w^{n+1} + \sum_{d=1}^{[(n+1)/h]} c_d\, w^{n+1-dh} q^d$. For primitive classes $a$ and $b$ we have $a \cdot b = k^{-1} (a|b)_V \big( w^n + \sum_{d=1}^{[n/h]} b_d\, w^{n-dh} q^d \big)$. Following Tian we note that associativity yields $0 = (w \cdot a) \cdot b = k^{-1} (a|b)_V \big( w^{n+1} + \sum_{d=1}^{[n/h]} b_d\, w^{n+1-dh} q^d \big)$, and thus $c_d = b_d$ for $1 \le d \le n$, $c_{n+1} = 0$. Beauville in [1] studied the structure of the quantum ring of Fano hypersurfaces of degree small with respect to the dimension. Beauville's result deals with the case $n \ge 2k - 3$; in this case only the coefficient $c_1 \ne 0$. Now $-c_1$ is the sum of the Schubert numbers of lines on $V$, and it turns out that $-c_1 = k^k$, hence:
Theorem 1. The quantum cohomology of $V$ over the rational numbers is generated by $w$ and $H^n(V, \mathbb{Q})_0$ with relations (i) $w^{n+1} = k^k w^{k-1} q$, (ii) $w \cdot a = 0$, (iii) $a \cdot b = k^{-1} (a|b)_V (w^n - k^k w^{k-2} q)$.
This theorem in fact always holds; the hardest part (i) is a deep theorem of Givental [11], while (ii) and (iii) follow from the same arguments used before. □
3. Recursion Relations for the Structure Constants of Fano Hypersurfaces
This section is devoted to the proof of the following recursion laws and of some related results:
Theorem 2. Consider a hypersurface $M_N^k$ in $CP^{N-1}$ of degree $k$; if the 1st Chern class $N - k \ge 2$ then the basic structure constants satisfy the following recursion relations:
\[ L_m^{N,k,1} = L_m^{N+1,k,1} =: L^k_m, \tag{3.1} \]
\[ L_m^{N,k,2} = \frac{1}{2} \big( L_{m-1}^{N+1,k,2} + L_m^{N+1,k,2} + 2 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,1} \big), \tag{3.2} \]
\[ L_m^{N,k,3} = \frac{1}{18} \big( 4 L_{m-2}^{N+1,k,3} + 10 L_{m-1}^{N+1,k,3} + 4 L_m^{N+1,k,3} \]
\[ \quad + 12 L_{m-1}^{N+1,k,2} \cdot L_{m+2(N-k)}^{N+1,k,1} + 9 L_m^{N+1,k,2} \cdot L_{m+2(N-k)}^{N+1,k,1} + 6 L_m^{N+1,k,2} \cdot L_{m+1+2(N-k)}^{N+1,k,1} \]
\[ \quad + 6 L_{m-1}^{N+1,k,1} \cdot L_{m-1+(N-k)}^{N+1,k,2} + 9 L_m^{N+1,k,1} \cdot L_{m-1+(N-k)}^{N+1,k,2} + 12 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,2} \]
\[ \quad + 18 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,1} \cdot L_{m+2(N-k)}^{N+1,k,1} \big). \tag{3.3} \]
Our arguments are heuristic. We embed $X := M_N^k$ as the linear section of a general hypersurface $Y := M_{N+1}^k$ in $CP^N$ so that
\[ M_N^k = M_{N+1}^k \cap H, \tag{3.4} \]
where the hyperplane $H$ is identified with $CP^{N-1}$. Next we introduce the notation
\[ \langle O_{e^{a_1}} O_{e^{a_2}} \cdots O_{e^{a_m}} \rangle_{M_N^k,d,\mathrm{gravity}} = [A^N_{a_1}, A^N_{a_2}, \cdots, A^N_{a_m}; d, N, k]. \tag{3.5} \]
Here the spaces $A^N_{a_i}$ are linear subspaces of codimension $a_i$ in $CP^{N-1}$ and in general position, so that
\[ PD_{M_N^k}(e^{a_i}) = A^N_{a_i} \cap M_N^k. \tag{3.6} \]
We introduce below the "special position" correlation functions,
\[ G[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_m} \cap H, A^{N+1}_{a_{m+1}}, \cdots, A^{N+1}_{a_{m+n}}; d, N+1, k]. \tag{3.7} \]
Clearly each $A^{N+1}_{a_i} \cap H$ is a linear subspace in $CP^N$ of codimension $a_i + 1$ which lies in $CP^{N-1} = H$. The special position correlation function should count the number of rational curves of degree $d$ on $Y$ with $m$ labeled points on them which belong to the corresponding linear spaces and which have the further property that points with different labels stay distinct. By taking the linear spaces in general position in $CP^{N-1}$ we may assume that $[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_m} \cap H, A^{N+1}_{a_{m+1}}, \cdots, A^{N+1}_{a_{m+n}}; d, N, k]$ has no contribution from reducible curves on $X$. Now an irreducible curve of degree $d$ which cuts $H$ in $d+1$ points lies on it and then
\[ [A^N_{a_1}, A^N_{a_2}, \cdots, A^N_{a_{d+1}}; d, N, k] + R = G[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_{d+1}} \cap H; d, N+1, k], \tag{3.8} \]
where $R$ measures the contributions due to the connected reducible curves on $Y$ which satisfy the conditions. In the cases that we consider, $R$ does not occur for lines and conics and it is a finite set for the case of cubic curves, as we compute below. For curves of degree 4 or more the family of reducible curves supporting $R$ may be of positive dimension and we are not able to determine the contribution due to them. For this reason we shall restrict to the case of curves of degree $d$ at most equal to 3.
The following lemma, the specialization formula, gives a procedure for computing the degree of $G[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_{d+1}} \cap H; d, N+1, k]$, because by definition $G[A^{N+1}_{a_1+1}, A^{N+1}_{a_2+1}, \cdots, A^{N+1}_{a_m+1}; d, N+1, k] = [A^{N+1}_{a_1+1}, A^{N+1}_{a_2+1}, \cdots, A^{N+1}_{a_m+1}; d, N+1, k]$. By moving $A^{N+1}_{a_{s+1}+1}$ into $A^{N+1}_{a_{s+1}} \cap H$ one has
Lemma 1.
\[ G[A^{N+1}_{a_1} \cap H, \cdots, A^{N+1}_{a_s} \cap H, A^{N+1}_{a_{s+1}+1}, A^{N+1}_{a_{s+2}+1}, \cdots, A^{N+1}_{a_{s+t}+1}; d, N+1, k] \]
\[ = G[A^{N+1}_{a_1} \cap H, \cdots, A^{N+1}_{a_s} \cap H, A^{N+1}_{a_{s+1}} \cap H, A^{N+1}_{a_{s+2}+1}, \cdots, A^{N+1}_{a_{s+t}+1}; d, N+1, k] \]
\[ + \sum_{j=1}^{s} G[A^{N+1}_{a_1} \cap H, \cdots, A^{N+1}_{a_{j-1}} \cap H, A^{N+1}_{a_j+a_{s+1}} \cap H, A^{N+1}_{a_{j+1}} \cap H, \cdots, A^{N+1}_{a_s} \cap H, A^{N+1}_{a_{s+2}+1}, \cdots, A^{N+1}_{a_{s+t}+1}; d, N+1, k]. \tag{3.9} \]
Here we explain the definition of the special position G-W invariants.
Given a projective variety $Z$ Kontsevich [20] has constructed the coarse moduli space $\bar{M} := \bar{M}(Z, m, \beta)$ of stable maps of homology class $\beta$ to $Z$. $\bar{M}$ is the set of equivalence classes of data $[C, p_1, \ldots, p_m, \mu]$, where $\mu : C \to Z$ is the "stable" map, $C$ is a varying, projective, connected, nodal curve of arithmetic genus 0, and $p_1, \ldots, p_m$ are distinct, labeled nonsingular points on $C$. We refer to [10] for a detailed discussion of this construction. The canonical evaluation maps $\rho_i : \bar{M} \to Z$ are defined by $\rho_i([C, p_1, \ldots, p_m, \mu]) = \mu(p_i)$. The interior $M(Z, m, \beta)$ is the locus in $\bar{M}(Z, m, \beta)$ corresponding to nonsingular irreducible domain curves. We write $\bar{M}(Z, A, \beta)$ when the index set of labels is a set $A$ instead of $[n] = \{1, \ldots, n\}$. There are forgetful maps $\phi_B : \bar{M}(Z, A, \beta) \to \bar{M}(Z, A - B, \beta)$, defined when $B$ is a subset of $A$. Let now $Z$ be a general non-singular Fano hypersurface of dimension $n \ge 3$ and of index $h(Z) = n + 2 - \deg(Z)$. We consider the case when $\beta$ is the class of a curve of degree $d$ and we assume that $\bar{M}(Z, m, d)$ has the expected dimension $\dim Z + d\,h(Z) + m - 3$ and similarly for the boundary components. We recall that such components are associated with the choice of a partition $A \cup B$ of the set $[m] := \{1, \ldots, m\}$ and of the choice of $d_1$ and $d_2$ with $d = d_1 + d_2$. The boundary component $\bar{D}(A, B; d_1, d_2)$ is defined as the locus of moduli points corresponding to reducible domain curves $C = C_1 \cup C_2$, where $\mu_*(C_i)$ has degree $d_i$. Here the curve $C$ is obtained by gluing at $\bullet$ the curve $C_1$, which has on it points marked by the elements in $A$ and a further point, labeled by $\bullet$, and $C_2$, which has on it points marked by the elements in $B$ and a further point, also labeled by $\bullet$. There is an identification $\bar{D}(A, B; d_1, d_2) = \bar{M}(Z, A \cup \{\bullet\}, d_1) \times_Z \bar{M}(Z, B \cup \{\bullet\}, d_2)$.
In what follows we take $n + 2 = N$ and define $T_i$, $i = 1, \ldots, m$ to be linear spaces of codimension $t_i \ge 1$ in $P^{n+2}$ and in general position there. As before $Y$
is a Fano hypersurface of degree $k$. We assume that $\sum t_i$ is the expected dimension $\dim Y + d\,h(Y) + m - 3$ of $\bar{M}(Y, m, d)$. We define $[t_1, \ldots, t_m; d, Y]$ to be the degree of the zero cycle $[T_1, \ldots, T_m; d, Y]$, which we define to be the intersection product of the cycles $\rho_i^{-1}(T_i)$. Here we assume that those cycles intersect transversally in a finite number of points, each one of which is associated with an irreducible source curve and with the property that the corresponding map sends different labeled points to different images. By definition $[t_1, \ldots, t_m; d, Y]$ is one of the GW invariants of $Y$, and it is called basic if $m = 3$ and if at least one of the $t_i$ is 1. The GW invariants on $X$ are defined in a similar way, by means of linear spaces $S_i \subset P^{n+1}$. We shall use the convention that $S_i$ and $T_i$ are spaces of the same dimension, so that $S_i$ is obtained by moving $T_i$ into $P^{n+1}$.
Given linear spaces as above we write $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y]$ to represent the open cycle in $\bar{M}(Y, s+t, d)$ which can be informally described as the set of rational curves of degree $d$ on $Y$ with $s + t$ marked points such that the images of the labeled points $p_j$ belong to the space with the same label, and such that for $j \le s$ and $i \le s$ if $p_j$ and $p_i$ have the same image point in $S_j \cap S_i$ then this point is a double point for the image curve. We shall use the notation that $s_i$ is the codimension of $S_i$ in $P^{n+1}$ so that $s_i = t_i - 1$, because of our convention. The codimension of the preceding cycle is $\sum_j (s_j + 1) + \sum_j t_j$. Our aim is to compute the degree of $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y]$ when its expected dimension is 0. By abuse of notation we shall often use the same notation to represent both a cycle of dimension 0 and the degree of the said cycle. We define $\bar{M}^0(s)$ to be the complement in $\bar{M} := \bar{M}(Y, s+t, d)$ of the union of the components of type $\bar{D}(A, B; d_1, d_2)$ with $d_2 = 0$ and with at least two elements of $B$ which are $\le s$.
Thus we have $\bar{M}^0(s+1) \subset \bar{M}^0(s) \subset \bar{M}$. The evaluation $\rho_i$ restricts to $\rho(s)^0_i$ on $\bar{M}^0(s)$. Our definition is that $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y]$ is the intersection product of the cycles $(\rho(s)^0_j)^{-1}(S_j)$, $j = 1, \ldots, s$, and $(\rho(s)^0_l)^{-1}(T_l)$, $l = s+1, \ldots, s+t$. If the codimensions $s_j$ and $t_l$ are fixed the set of lists $(S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t})$ is parameterized by a product of Grassmann manifolds, hence it is an irreducible variety and then there is an open dense subset of it where the degree of $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}]$ is maximum. We shall assume that our lists come from this subset. We start by noting that $[S_1, \ldots, S_{d+1}; d; X]$ and $G[S_1, \ldots, S_{d+1}; d; Y]$ both have the same expected dimension, which we take to be 0.
Proposition 3. If the given cycles are zero dimensional then $G[S_1, \ldots, S_{d+1}; d; Y] = [S_1, \ldots, S_{d+1}; d; X] + R$, where $R$ is supported on the boundary locus of $\bar{M}(Y, m, d)$ and more precisely on the locus corresponding to reducible domain curves with reducible image.
In order to compute the degree of $[S_1, \ldots, S_{d+1}; d; X]$ we need to verify that the dimension of the preceding cycles is in fact 0 and then to compute their degrees. Now we have by assumption that $[S_1, \ldots, S_{d+1}; d; X]$ has the correct dimension 0, so the dimension of $G[S_1, \ldots, S_{d+1}; d; Y]$ can fail to be 0 as well only if that of $R$ fails. Of course the dimension of $R$ can be detected by looking at decomposable curves on $Y$, that is, at the behavior of rational curves of degree strictly less than $d$. As we have said above the degree of $G[S_1, \ldots, S_{d+1}; d; Y]$ is determined by a reduction procedure, which is performed by moving linear spaces $T_i$ which are in general position in $P^{n+2}$ to spaces $S_i$ of the same dimension, which are contained in the hyperplane $P^{n+1}$ and in general position there. Our main tool is the next proposition; we have only heuristic arguments to support it.
Proposition 4. Provided that the dimensions of the cycles below are 0 as it is expected then
\[ \deg G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y] = \deg G[S_1, \ldots, S_s, S_{s+1}, T_{s+2}, \ldots, T_{s+t}; d; Y] + \sum_{i} \deg \psi_i^{-1} G[S_1, \ldots, S_{i-1}, S_{i,s+1}, S_{i+1}, \ldots, S_s, T_{s+2}, \ldots, T_{s+t}; d; Y], \]
where $S_{i,s+1} := S_i \cap S_{s+1}$ and where $\psi_i$ is the isomorphism $\bar{D}([s+t] - \{i, s+1\}, \{i, s+1\}; d, 0) \to \bar{M}(Y, ([s+t] - \{i, s+1\}) \cup \{\bullet\}, d)$.
The specialization lemma is just a restatement of this proposition. The following procedure gives the recursive formulas:
1. By iterative application of the specialization formula we write $G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1,1}_1 \cap H, \cdots, A^{N+1,d-1}_1 \cap H; d, N+1, k]$ in terms of the standard correlation functions of $Y = M^k_{N+1}$.
2. We decompose the standard correlation functions found in Step 1 as polynomials in the basic G-W invariants of $Y$, by which we mean the functions $\langle O_{e^a} O_{e^b} O_e \rangle_{d, M^k_{N+1}, \mathrm{gravity}}$. This step is done by means of the first reconstruction theorem of Kontsevich and Manin or, equivalently, by the microscopic version of the WDVV equations.
3. We compute the contribution of reducible curves in $G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1,1}_1 \cap H, \cdots, A^{N+1,d-1}_1 \cap H; d, N+1, k]$.
The 1st step gives the next equalities. In writing them we use the convention that if the number of insertion points gets lower than 3 then we insert a hyperplane condition and divide by the degree of the curve:
\[ G[A^{N+1}_a \cap H, A^{N+1}_b \cap H; 1, N+1, k] = [A^{N+1}_{a+1}, A^{N+1}_{b+1}; 1, N+1, k] - [A^{N+1}_{a+b+1}; 1, N+1, k] \qquad (a + b = N - 3 + (N-k)), \]
\[ G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1}_1 \cap H; 2, N+1, k] = [A^{N+1}_{a+1}, A^{N+1}_{b+1}, A^{N+1}_2; 2, N+1, k] - [A^{N+1}_{a+2}, A^{N+1}_{b+1}; 2, N+1, k] \]
\[ \quad - [A^{N+1}_{a+1}, A^{N+1}_{b+2}; 2, N+1, k] - [A^{N+1}_{a+b+1}, A^{N+1}_2; 2, N+1, k] + 2 [A^{N+1}_{a+b+2}; 2, N+1, k] \qquad (a + b = N - 3 + 2(N-k)), \]
\[ G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1,1}_1 \cap H, A^{N+1,2}_1 \cap H; 3, N+1, k] = [A^{N+1}_{a+1}, A^{N+1}_{b+1}, A^{N+1,1}_2, A^{N+1,2}_2; 3, N+1, k] \]
\[ \quad - 2 [A^{N+1}_{a+1}, A^{N+1}_{b+2}, A^{N+1}_2; 3, N+1, k] - 2 [A^{N+1}_{a+2}, A^{N+1}_{b+1}, A^{N+1}_2; 3, N+1, k] \]
\[ \quad - [A^{N+1}_{a+b+1}, A^{N+1,1}_2, A^{N+1,2}_2; 3, N+1, k] - [A^{N+1}_{a+1}, A^{N+1}_{b+1}, A^{N+1}_3; 3, N+1, k] \]
\[ \quad + 2 [A^{N+1}_{a+1}, A^{N+1}_{b+3}; 3, N+1, k] + 2 [A^{N+1}_{a+2}, A^{N+1}_{b+2}; 3, N+1, k] + 2 [A^{N+1}_{a+3}, A^{N+1}_{b+1}; 3, N+1, k] \]
\[ \quad + 4 [A^{N+1}_{a+b+2}, A^{N+1}_2; 3, N+1, k] + [A^{N+1}_{a+b+1}, A^{N+1}_3; 3, N+1, k] - 6 [A^{N+1}_{a+b+3}; 3, N+1, k] \qquad (a + b = N - 3 + 3(N-k)). \tag{3.10} \]
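The integer coefficients in (3.10) follow from repeated use of the specialization formula (3.9) alone, so they can be reproduced by a short program. The sketch below is an illustration, not the authors' code: the function name `expand_special` and the encoding are ours. A space $A^{N+1}_c \cap H$ is recorded only by its label $c$, a general-position space $A^{N+1}_c$ by its codimension $c$, and (3.9) is solved for the term in which one more space has been moved into $H$.

```python
from collections import Counter


def expand_special(s_codims, t_codims=()):
    """Expand G[A_{s_1} ∩ H, ..., A_{s_p} ∩ H, A_{t_1}, ..., A_{t_q}] into
    standard brackets [A_{c_1}, ..., A_{c_r}] using the lemma (3.9).

    Returns a dict mapping the sorted tuple (c_1, ..., c_r) to its integer
    coefficient.
    """
    s_codims = tuple(s_codims)
    t_codims = tuple(t_codims)
    if not s_codims:
        # No space lies in H: by definition this is a standard bracket.
        return {tuple(sorted(t_codims)): 1}
    *rest, x = s_codims
    rest = tuple(rest)
    result = Counter()
    # Left-hand side of (3.9): the last space put back in general position.
    result.update(expand_special(rest, t_codims + (x + 1,)))
    # Subtract the terms where it has merged with an earlier space in H.
    for j in range(len(rest)):
        merged = rest[:j] + (rest[j] + x,) + rest[j + 1:]
        for key, coeff in expand_special(merged, t_codims).items():
            result[key] -= coeff
    return {key: coeff for key, coeff in result.items() if coeff != 0}
```

Calling `expand_special((a, b, 1, 1))` with two generic integers a and b (chosen so that no two resulting codimension tuples coincide, e.g. a = 10, b = 20) returns eleven brackets whose coefficients 1, −2, −2, −1, −1, 2, 2, 2, 4, 1, −6 match the degree 3 line of (3.10).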
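We assume now that the Fano index of $X$ is at least 2, namely $N - k \ge 2$; then $a + b + 1$ is greater than $N = \dim(M^k_{N+1}) + 1$, and therefore (3.10) is truncated in an obvious way. At this point we recall the definition $[A^{N+1}_{a_1}, A^{N+1}_{a_2}, \cdots, A^{N+1}_{a_m}; d, N+1, k] = \langle O_{e^{a_1}} O_{e^{a_2}} \cdots O_{e^{a_m}} \rangle_{d, M^k_{N+1}, \mathrm{gr}}$. In order to proceed we need to express $\langle O_{e^a} O_{e^b} O_{e^2} \rangle_{d, M^k_{N+1}, \mathrm{gr}}$ and $\langle O_{e^a} O_{e^b} O_{e^3} \rangle_{d, M^k_{N+1}, \mathrm{gr}}$ in terms of $\langle O_{e^a} O_{e^b} O_e \rangle_{d, M^k_{N+1}, \mathrm{gr}}$. Our tool is the first reconstruction theorem of Kontsevich and Manin [21], and it yields
\[ \langle O_{e^a} O_{e^b} O_{e^m} \rangle \langle O_{e^{N-1-m}} O_e O_e \rangle = \langle O_{e^a} O_e O_{e^m} \rangle \langle O_{e^{N-1-m}} O_{e^b} O_e \rangle, \]
\[ \langle O_{e^a} O_{e^b} O_{e^m} \rangle \langle O_{e^{N-1-m}} O_{e^2} O_e \rangle = \langle O_{e^a} O_{e^2} O_{e^m} \rangle \langle O_{e^{N-1-m}} O_{e^b} O_e \rangle, \]
\[ \langle O_{e^a} O_{e^b} O_{e^2} O_{e^m} \rangle \langle O_{e^{N-1-m}} O_e O_e \rangle + \langle O_{e^a} O_{e^b} O_{e^m} \rangle \langle O_{e^{N-1-m}} O_{e^2} O_e O_e \rangle \]
\[ \quad = \langle O_{e^a} O_e O_{e^2} O_{e^m} \rangle \langle O_{e^{N-1-m}} O_{e^b} O_e \rangle + \langle O_{e^a} O_e O_{e^m} \rangle \langle O_{e^{N-1-m}} O_{e^b} O_{e^2} O_e \rangle. \tag{3.11} \]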
We find in the end, if the Fano index of X is at least 2:
$$\frac{1}{k} G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+(N-k)} \cap H; 1, N+1, k] = L^{N+1,k,1}_m := L^k_m, \tag{3.12}$$
$$\frac{1}{k} G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+2(N-k)} \cap H, A^{N+1}_1 \cap H; 2, N+1, k] = \frac{1}{2}\left(L^{N+1,k,2}_{m-1} + L^{N+1,k,2}_{m} + 2 L^{N+1,k,1}_{m} \cdot L^{N+1,k,1}_{m+(N-k)}\right), \tag{3.13}$$
$$\begin{aligned} \frac{1}{k} G[A^{N+1}_{N-2-m} \cap H, & A^{N+1}_{m-1+3(N-k)} \cap H, A^{N+1,1}_1 \cap H, A^{N+1,2}_1 \cap H; 3, N+1, k] \\ = \frac{1}{6}\Big( & 4L^{N+1,k,3}_{m-2} + 10L^{N+1,k,3}_{m-1} + 4L^{N+1,k,3}_{m} \\ & + 12L^{N+1,k,2}_{m-1} \cdot L^{N+1,k,1}_{m+2(N-k)} + 12L^{N+1,k,2}_{m} \cdot L^{N+1,k,1}_{m+2(N-k)} + 6L^{N+1,k,2}_{m} \cdot L^{N+1,k,1}_{m+1+2(N-k)} \\ & + 6L^{N+1,k,2}_{m-1+(N-k)} \cdot L^{N+1,k,1}_{m-1} + 12L^{N+1,k,2}_{m-1+(N-k)} \cdot L^{N+1,k,1}_{m} + 12L^{N+1,k,2}_{m+(N-k)} \cdot L^{N+1,k,1}_{m} \\ & + 18L^{N+1,k,1}_{m} \cdot L^{N+1,k,1}_{m+(N-k)} \cdot L^{N+1,k,1}_{m+2(N-k)} \Big). \end{aligned} \tag{3.14}$$
We make the hypothesis that the Schubert varieties of conics and lines on Y and on X which are associated with the given linear spaces $S_i$, $T_j$ and their intersections have the right dimension. A count of dimensions shows that for the cases of degree 1 and degree 2 there is no contribution from the reducible curves and therefore
$$G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+(N-k)} \cap H; 1, N+1, k]/k = L^{N,k,1}_m$$
and
$$G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+2(N-k)} \cap H, A^{N+1}_1 \cap H; 2, N+1, k]/k = L^{N,k,2}_m.$$
For cubic curves there is a contribution due to the reducible connected curves which are made of a line lying on X and of a conic lying on Y. Two cases occur: in one instance the line meets $A^{N+1}_{N-2-m} \cap H$, and the conic meets the line and $A^{N+1}_{m-1+3(N-k)} \cap H$. In the other case the incidence conditions with the linear spaces are reversed. In this way
$$\begin{aligned} G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+3(N-k)} \cap H, A^{N+1,1}_1 \cap H, A^{N+1,2}_1 \cap H; 3, N+1, k]/k & = 3L^{N,k,3}_m + \frac{1}{k} R \\ & = 3L^{N,k,3}_m + \frac{1}{2} L^{N+1,k,2}_{m} \cdot L^{N,k,1}_{m+2(N-k)} + \frac{1}{2} L^{N+1,k,2}_{m-1+(N-k)} \cdot L^{N,k,1}_{m}. \end{aligned} \tag{3.15}$$
□
We come now to the case of Fano index 1. The same type of computations as above yields:

Theorem 3. If X is a hypersurface of degree k in $CP^k$, the recursion relations for the basic invariants of conics and cubic curves are the same as given in Theorem 1, whereas the numbers of lines satisfy the law
$$L^{k+1,k,1}_m = L^{k+2,k,1}_m - L^{k+2,k,1}_0 = L^{k+2,k,1}_m - k!. \tag{3.16}$$

Proof. In this case, a + b + 1 = N − 2 + d, from (3.10). Then we obtain
$$G[A^{N+1}_a \cap H, A^{N+1}_b \cap H; 1, N+1, k] = [A^{N+1}_{a+1}, A^{N+1}_{b+1}; 1, N+1, k] - [A^{N+1}_{N-1}; 1, N+1, k] \qquad (a + b + 1 = N - 1). \tag{3.17}$$
□
Now we can prove:

Corollary 1. The main relation of the quantum ring of a Fano hypersurface $M^k_N$ with index N − k ≥ 2 is of the form
$$(\mathcal{O}_e)^{N-1} - k^k (\mathcal{O}_e)^{k-1} q = 0 \tag{3.18}$$
up to $q^3$.
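Proof. The recursion relations of Theorem 2 do not change $\gamma_0^{N,k,d}$, namely $\gamma_0^{N,k,d} = \gamma_0^{N+1,k,d}$. If N ≥ 2k + 1, then $\gamma_0^{N,k,d} = 0$ (d ≥ 2), because of the vanishing conditions due to the topological selection rule. On the other hand $\gamma_0^{N,k,1} = k^k$, from Schubert calculus, cf. [1]. □

Corollary 2. The main relation of the quantum cohomology ring of a Fano hypersurface of index 1 and dimension k − 1 is of the form
$$(\mathcal{O}_e + k!\,q)^k - k^k (\mathcal{O}_e + k!\,q)^{k-1} q = 0 \tag{3.19}$$
up to $q^3$.

Proof. Consider the multiplication rule (2.6):
$$\mathcal{O}_e \cdot \mathcal{O}_{e^{k-1-m}} := \mathcal{O}_{e^{k-m}} + q L^{k+1,k,1}_m \mathcal{O}_{e^{k-1-m}} + \sum_{d=2}^{\infty} q^d L^{k+1,k,d}_m \mathcal{O}_{e^{k-m-d}}. \tag{3.20}$$
This gives:
$$\begin{aligned} (\mathcal{O}_e + k!\,q) \cdot \mathcal{O}_{e^{k-1-m}} & = \mathcal{O}_{e^{k-m}} + q (L^{k+1,k,1}_m + k!) \mathcal{O}_{e^{k-1-m}} + \sum_{d=2}^{\infty} q^d L^{k+1,k,d}_m \mathcal{O}_{e^{k-m-d}} \\ & = \mathcal{O}_{e^{k-m}} + \sum_{d=1}^{\infty} q^d \tilde{L}^{k+1,k,d}_m \mathcal{O}_{e^{k-m-d}}. \end{aligned} \tag{3.21}$$
Set now $F = \mathcal{O}_e + k!\,q$, use F as a multiplicative generator and write
$$\mathcal{O}_{e^{k-m}} = (F)^{k-m} - \sum_{d=1}^{\infty} q^d \tilde{\gamma}_m^{k+1,k,d} (F)^{k-m-d} \qquad (m = 0, 1, \cdots, k-1). \tag{3.22}$$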
A standard computation yields $\tilde{\gamma}_0^{k+1,k,d} = \gamma_0^{k+2,k,d}$, and we conclude by descending induction as in the proof of the preceding corollary. □

Our last result is also easily proved by descending induction on N.

Corollary 3. If d ≤ 3 and N − k ≥ 1, the structure constants $L^{N,k,d}_m$ can be written as a homogeneous polynomial of degree d in the structure constants of lines $L^k_m$.

4. Recursive Formulas in the Calabi–Yau Case

Here we try to understand how the recursive formulas change when the hypersurface becomes of Calabi–Yau type, i.e. when we deal with $M^k_k$ of degree k in $CP^{k-1}$. In this situation we can proceed as before for lines and conics. On the other hand we cannot use the same method for curves of degree 3, because in this case it is difficult to control the contribution from reducible curves. We give instead conjectural recursive formulas for cubics. In the last part of the section we explain the line of thought which led us to the conjecture.

We recall first that given a general point on $M^k_k$ there are no rational curves meeting it, and therefore $L^{k,k,d}_0 = L^{k,k,d}_1 = L^{k,k,d}_{k-2} = L^{k,k,d}_{k-1} = 0$.
Theorem 4. The recursive laws for lines and conics on a Calabi–Yau hypersurface of degree k are, for 2 ≤ m ≤ k − 3,
$$L^{k,k,1}_m = L^{k+1,k,1}_m - L^{k+1,k,1}_1, \tag{4.1}$$
$$L^{k,k,2}_m = \frac{1}{2}\left(L^{k+1,k,2}_{m-1} + L^{k+1,k,2}_{m} - L^{k+1,k,2}_{0} - L^{k+1,k,2}_{1} + 2(L^{k+1,k,1}_m - L^{k+1,k,1}_1)^2\right). \tag{4.2}$$

Proof of Theorem 4. We go back to the specialization formula (3.10), which we use with the condition a + b = N − 3 because now we have N = k. Repeated use of the first reconstruction theorem yields
$$\frac{1}{k} G[A^{k+1}_{k-2-m} \cap H, A^{k+1}_{m-1} \cap H; 1, k+1, k] = L^{k+1,k,1}_m - L^{k+1,k,1}_1, \tag{4.3}$$
$$\frac{1}{k} G[A^{k+1}_{k-2-m} \cap H, A^{k+1}_{m-1} \cap H, A^{k+1}_1 \cap H; 2, k+1, k] = \frac{1}{2}\left(L^{k+1,k,2}_{m-1} + L^{k+1,k,2}_{m} - L^{k+1,k,2}_{0} - L^{k+1,k,2}_{1} + 2L^{k+1,k,1}_m \cdot (L^{k+1,k,1}_m - L^{k+1,k,1}_1)\right). \tag{4.4}$$
Next we check the contribution from reducible curves. For lines there are no reducible curves, so that
$$\frac{1}{k} G[A^{k+1}_{k-2-m} \cap H, A^{k+1}_{m-1} \cap H; 1, k+1, k] = L^{k,k,1}_m. \tag{4.5}$$
In the case of conics, the reducible curves are given by two intersecting lines, one lying on $M^k_k = M^k_{k+1} \cap H$ and the other on $M^k_{k+1}$, hence
$$\frac{1}{k} G[A^{k+1}_{k-2-m} \cap H, A^{k+1}_{m-1} \cap H, A^{k+1}_1 \cap H; 2, k+1, k] = L^{k,k,2}_m + L^{k+1,k,1}_1 \cdot L^{k,k,1}_m. \tag{4.6}$$
□

Next we deal with cubic curves. The Calabi–Yau condition again implies $L^{k,k,3}_m = 0$ for 0 ≤ m ≤ 1 and k − 2 ≤ m ≤ k − 1; our proposal for the remaining m is the following:

Conjecture 1. The recursive law for curves of degree 3 on $M^k_k$ and for 2 ≤ m ≤ k − 3 becomes:
$$\begin{aligned} L^{k,k,3}_m = {} & \frac{1}{18}\Big(4L^{k+1,k,3}_{m-2} + 10L^{k+1,k,3}_{m-1} + 4L^{k+1,k,3}_{m} - 10L^{k+1,k,3}_{0} - 4L^{k+1,k,3}_{1} \\ & \quad + 12L^{k+1,k,2}_{m-1} \cdot L^{k+1,k,1}_{m} + 12L^{k+1,k,2}_{m} \cdot L^{k+1,k,1}_{m} + 6L^{k+1,k,2}_{m} \cdot L^{k+1,k,1}_{m+1} \\ & \quad + 6L^{k+1,k,2}_{m-1} \cdot L^{k+1,k,1}_{m-1} + 12L^{k+1,k,2}_{m-1} \cdot L^{k+1,k,1}_{m} + 12L^{k+1,k,2}_{m} \cdot L^{k+1,k,1}_{m} \\ & \quad + 18(L^{k+1,k,1}_m - L^{k+1,k,1}_1)^2 \cdot (L^{k+1,k,1}_m + 2L^{k+1,k,1}_1)\Big) \\ & - \frac{1}{6} L^{k+1,k,2}_{m-1} \cdot L^{k+1,k,1}_{m} - \frac{1}{6} L^{k+1,k,2}_{m} \cdot L^{k+1,k,1}_{m} - \frac{3}{4} L^{k+1,k,2}_{0} \cdot L^{k+1,k,1}_{m} - \frac{3}{4} L^{k+1,k,2}_{1} \cdot L^{k+1,k,1}_{m} \\ & - \frac{5}{12} L^{k+1,k,2}_{1} \cdot L^{k+1,k,1}_{1} - \frac{5}{12} L^{k+1,k,2}_{0} \cdot L^{k+1,k,1}_{1} - \frac{1}{3} L^{k+1,k,2}_{1} \cdot L^{k+1,k,1}_{2} \\ & - 3L^{k+1,k,1}_{1} \cdot \frac{1}{2}\left(L^{k+1,k,2}_{m-1} + L^{k+1,k,2}_{m} - L^{k+1,k,2}_{0} - L^{k+1,k,2}_{1} + 2(L^{k+1,k,1}_m - L^{k+1,k,1}_1)^2\right). \end{aligned} \tag{4.7}$$
We came to this formula by means of the following considerations. Using Theorem 4 one can compute the main relation of $H^*_{q,e}(M^k_k)$ up to degree 2; this reads:
$$\begin{aligned} \Big(1 & - (k^k - (k-2)L^k_1 - 2L^k_0)\,q \\ & - \Big(2k^k L^k_0 + (k-3)k^k L^k_1 - 3(L^k_0)^2 - (2k-6)L^k_1 L^k_0 - \frac{(k-3)(k-2)}{2}(L^k_1)^2 - \frac{k-2}{2} L^{k+1,k,2}_1 - \frac{k}{2} L^{k+1,k,2}_0\Big) q^2 - \cdots \Big)(\mathcal{O}_e)^{k-1} = 0. \end{aligned} \tag{4.8}$$
On the other hand one has from [18] and [16] that the main relation can be written using the k − 2 point correlation function of the pure matter theory in the form
$$\frac{k}{\langle \prod_{j=1}^{k-2} \mathcal{O}_e(z_j) \rangle_{M^k_k, matter}} (\mathcal{O}_e)^{k-1} = 0, \tag{4.9}$$
and that
$$\begin{aligned} \Big\langle \prod_{j=1}^{k-2} \mathcal{O}_e(z_j) \Big\rangle_{M^k_k, matter} = {} & k + k^{k+1}(1 - 2a_1 - (k-2)b_1)\,q \\ & + k^{2k+1}\Big(1 - 2a_1 - b_1 + 3(a_1)^2 - 2a_2 + 2a_1 \cdot b_1 \\ & \qquad + (k-2)(-b_1 + 4a_1 \cdot b_1 + 2(b_1)^2 - 2b_2) + \frac{(k-2)(k-3)}{2}(b_1)^2\Big) q^2 + \cdots. \end{aligned} \tag{4.10}$$
Here
$$a_d = \frac{(kd)!}{(d!)^k k^{kd}}, \qquad b_d = a_d \Big(\sum_{i=1}^{d} \sum_{m=1}^{k-1} \frac{m}{i(ki-m)}\Big)$$
are the coefficients of the hypergeometric series associated to the solutions
$$W_0(x) = \sum_{d=0}^{\infty} a_d e^{dx}, \qquad W_1(x) = \sum_{d=1}^{\infty} b_d e^{dx} + W_0(x)\,x,$$
of the differential equation
$$\Big(\Big(\frac{d}{dx}\Big)^{k-1} - e^x \Big(\frac{d}{dx} + \frac{1}{k}\Big)\Big(\frac{d}{dx} + \frac{2}{k}\Big) \cdots \Big(\frac{d}{dx} + \frac{k-1}{k}\Big)\Big) W_i(x) = 0. \tag{4.11}$$
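By comparison of (4.8) with (4.10), we notice the following equalities:
$$\begin{aligned} k^k a_1 & = L^k_0, \\ k^k b_1 & = L^k_1, \\ k^{2k} a_2 & = \frac{1}{2}\left(L^{k+1,k,2}_0 + 2(L^k_0)^2\right), \\ k^{2k} b_2 & = \frac{1}{4}\left(L^{k+1,k,2}_0 + L^{k+1,k,2}_1 + 2L^k_1 L^k_1\right) + L^k_1 L^k_0. \end{aligned} \tag{4.12}$$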
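The first two equalities in (4.12) can be illustrated numerically. The following is a minimal sketch, assuming the quintic case k = 5 and exact rational arithmetic; the function names `a` and `b` are ours and only transcribe the definitions of $a_d$ and $b_d$ given before (4.11). The values 120 and 770 it produces are two of the three integers quoted for the quintic in the Conclusion.

```python
from fractions import Fraction
from math import factorial

def a(k, d):
    """a_d = (kd)! / ((d!)^k * k^(kd)), as defined before (4.11)."""
    return Fraction(factorial(k * d), factorial(d) ** k * k ** (k * d))

def b(k, d):
    """b_d = a_d * sum_{i=1..d} sum_{m=1..k-1} m / (i*(k*i - m))."""
    s = sum(Fraction(m, i * (k * i - m))
            for i in range(1, d + 1)
            for m in range(1, k))
    return a(k, d) * s

k = 5                      # assumption: the quintic case
L0 = k ** k * a(k, 1)      # k^k a_1, identified with L^k_0 in (4.12)
L1 = k ** k * b(k, 1)      # k^k b_1, identified with L^k_1 in (4.12)
print(L0, L1)
```

Exact fractions are used because $b_d$ involves harmonic-type sums whose floating-point evaluation would obscure the integrality of $k^k b_1$.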
These equalities can be organized more systematically by means of generating functions. To this aim we need:

Definition 1. For arbitrary N and k let $\tilde{L}^{N,k,d}_m$ be the integer obtained by applying formally the recursive laws of Theorem 2.

Remark. One should note: (i) $\tilde{L}^{k,k}_i(e^x) = \tilde{L}^{k,k}_{k-1-i}(e^x)$, (ii) if the index N − k is at least 2 then $\tilde{L}^{N,k,d}_m$ must be the ordinary structural constant $L^{N,k,d}_m$ of the Fano hypersurface.

Now we can rewrite (4.12) as
$$\begin{aligned} k^k a_1 & = \tilde{L}^{k,k,1}_0, & k^{2k} a_2 & = \tilde{L}^{k,k,2}_0, \\ k^k b_1 & = \tilde{L}^{k,k,1}_1, & k^{2k} b_2 & = \frac{1}{2}\tilde{L}^{k,k,2}_1 + \tilde{L}^{k,k,1}_1 \cdot \tilde{L}^{k,k,1}_0. \end{aligned} \tag{4.13}$$
After having performed some numerical computations, we have noticed that the following relations should also hold true:
$$\begin{aligned} k^{3k} a_3 & = \tilde{L}^{k,k,3}_0, \\ k^{3k} b_3 & = \frac{1}{3}\tilde{L}^{k,k,3}_1 + \frac{1}{2}\tilde{L}^{k,k,2}_1 \cdot \tilde{L}^{k,k,1}_0 + \tilde{L}^{k,k,1}_1 \cdot \tilde{L}^{k,k,2}_0. \end{aligned} \tag{4.14}$$
Consider next the generating functions:
$$\tilde{L}^{k,k}_i(\tilde{q}) := 1 + \sum_{d=1}^{\infty} \tilde{L}^{k,k,d}_i \tilde{q}^d, \qquad \tilde{q} := e^x, \tag{4.15}$$
and define
$$t := x + \Big(\sum_{j=1}^{\infty} b_j k^{kj} e^{jx}\Big) \Big/ \Big(\sum_{j=0}^{\infty} a_j k^{kj} e^{jx}\Big). \tag{4.16}$$
The preceding equalities motivate us to expect:
$$\tilde{L}^{k,k}_0(\tilde{q}) = \sum_{j=0}^{\infty} a_j k^{kj} e^{jx}, \qquad \tilde{L}^{k,k}_1(e^x) = \frac{dt}{dx}. \tag{4.17}$$
Thus, we expect that $\tilde{L}^{k,k}_m(\tilde{q})$ automatically gives the information of the B-model on the mirror manifold of $M^k_k$ without using the mirror conjecture. We use the virtual structure constants $\tilde{L}^{k,k}_m(\tilde{q})$ to define a virtual quantum product determined by the action of a ring generator G which operates according to the rules:
$$G \cdot \mathcal{O}_{e^{m-1}} = \tilde{L}^{k,k}_{k-1-m}(e^x)\, \mathcal{O}_{e^m} \quad (m = 1, 2, \cdots, k-2), \qquad G \cdot \mathcal{O}_{e^{k-2}} = 0. \tag{4.18}$$
We note the relation $G = G \cdot 1 = \tilde{L}^{k,k}_{k-2}(e^x)\, \mathcal{O}_e$. We expect that the structure constants of the virtual action satisfy the following equality, which in fact may be checked up to $\tilde{q}^3$ using the recursive laws for the Fano case:
$$\prod_{i=0}^{k-1} \tilde{L}^{k,k}_i(e^x) = (1 - k^k e^x)^{-1}, \tag{4.19}$$
and this yields the relation:
$$(1 - k^k e^x) \cdot (\tilde{L}^{k,k}_0(e^x))^2 \cdot (G)^{k-1} = 0. \tag{4.20}$$
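The product identity (4.19) can be tested on truncated power series. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes k = 5, uses the coefficients of $\tilde{L}^{5,5}_1$ and $\tilde{L}^{5,5}_2$ listed in the quintic example (4.32), obtains $\tilde{L}^{5,5}_0$ from (4.17) as $k^{kj} a_j = (kj)!/(j!)^k$, and uses the symmetry $\tilde{L}^{k,k}_i = \tilde{L}^{k,k}_{k-1-i}$ from the Remark after Definition 1. The helper `mul` is ours.

```python
from math import factorial

def mul(p, q, n):
    """Product of two power series given as coefficient lists, truncated after order n."""
    r = [0] * (n + 1)
    for i in range(n + 1):
        for j in range(n + 1 - i):
            r[i + j] += p[i] * q[j]
    return r

k, n = 5, 3
# L~_0 from (4.17): coefficient of e^{jx} is k^{kj} a_j = (kj)! / (j!)^k
Lt0 = [factorial(k * j) // factorial(j) ** k for j in range(n + 1)]
# L~_1 and L~_2 for the quintic, read off from (4.32)
Lt1 = [1, 770, 1435650, 3225308000]
Lt2 = [1, 1345, 3296525, 8940963625]

# Factors L~_0 ... L~_4, using L~_i = L~_{k-1-i}
prod = [1] + [0] * n
for f in (Lt0, Lt1, Lt2, Lt1, Lt0):
    prod = mul(prod, f, n)

# (1 - k^k e^x)^{-1} = sum_j k^{kj} e^{jx}
expected = [k ** (k * j) for j in range(n + 1)]
print(prod == expected)
```

Integer arithmetic suffices here, since all coefficients involved are integers.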
Note that the multiplicative factor in (4.20) is nothing but the reciprocal of the generalized Yukawa coupling of the B-model [18]. On the other hand the true quantum cohomology ring satisfies a similar multiplication rule:
$$\begin{aligned} & \mathcal{O}_e \cdot 1 = \mathcal{O}_e, \\ & \mathcal{O}_e \cdot \mathcal{O}_{e^{m-1}} = L^{k,k}_{k-1-m}(e^t)\, \mathcal{O}_{e^m} \quad (m = 2, 3, \cdots, k-2), \\ & \mathcal{O}_e \cdot \mathcal{O}_{e^{k-3}} = \mathcal{O}_{e^{k-2}}, \qquad \mathcal{O}_e \cdot \mathcal{O}_{e^{k-2}} = 0, \end{aligned} \tag{4.21}$$
where
$$L^{k,k}_i(e^t) := 1 + \sum_{d=1}^{\infty} L^{k,k,d}_i e^{dt}. \tag{4.22}$$
We can compute $L^{k,k}_i(e^t)$ in concrete examples using the method of torus localization. See [17] for details and results. Now we search for a transformation rule to pass from the virtual to the true quantum multiplication. To this aim we find it useful to introduce a formal definition:
Definition 2. The commutative product (∗) between differential operators of the form $f(x)\frac{d^m}{dx^m}$ is given by:
$$\Big(f(x)\frac{d^m}{dx^m}\Big) * \Big(g(x)\frac{d^n}{dx^n}\Big) = (f(x) \cdot g(x)) \frac{d^{m+n}}{dx^{m+n}}. \tag{4.23}$$
Given the coordinate change x = x(t) we define a map from $\frac{d}{dx}$ operators to $\frac{d}{dt}$ operators by means of the rule
$$f(x)\frac{d^m}{dx^m} \to f(x(t)) \Big(\frac{dt}{dx}\Big)^m \frac{d^m}{dt^m}. \tag{4.24}$$
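At this point we can relate the quantum product laws using as an intermediate step the product of differential operators. To start we propose the correspondence
$$\mathcal{O}_e = \frac{d}{dt}, \qquad G = \frac{d}{dx}. \tag{4.25}$$
Then one has
$$\frac{d}{dx} * \mathcal{O}_{e^{m-1}} = \tilde{L}^{k,k}_{k-1-m}(e^x)\, \mathcal{O}_{e^m} \quad (m = 1, 2, \cdots, k-2), \qquad \frac{d}{dx} * \mathcal{O}_{e^{k-2}} = 0, \tag{4.26}$$
and
$$\begin{aligned} & \frac{d}{dt} * 1 = \mathcal{O}_e, \\ & \frac{d}{dt} * \mathcal{O}_{e^{m-1}} = L^{k,k}_{k-1-m}(e^t)\, \mathcal{O}_{e^m} \quad (m = 2, 3, \cdots, k-3), \\ & \frac{d}{dt} * \mathcal{O}_{e^{k-3}} = \mathcal{O}_{e^{k-2}}, \qquad \frac{d}{dt} * \mathcal{O}_{e^{k-2}} = 0. \end{aligned} \tag{4.27}$$
It follows from (4.26):
$$\frac{d}{dx} * 1 = \tilde{L}^{k,k}_{k-2}(e^x) \cdot \mathcal{O}_e = \tilde{L}^{k,k}_{k-2}(e^x) \cdot \frac{d}{dt} = \frac{dt}{dx} \cdot \frac{d}{dt}. \tag{4.28}$$
This equality suggests that (4.26) and (4.27) become isomorphic if we use the transformation of differential operators defined above. Comparing the coefficients of $\mathcal{O}_{e^\alpha}$ in (4.26) with those in (4.27), the desired isomorphism yields the equality
$$\prod_{j=1}^{\alpha} \Big(\tilde{L}^{k,k}_{k-1-j}(e^{x(t)}) \frac{dx}{dt}\Big) = \prod_{j=1}^{\alpha} L^{k,k}_{k-1-j}(e^t). \tag{4.29}$$
We find in this way the transformation laws that we were looking for; they are:
$$L^{k,k}_i(e^t) = \tilde{L}^{k,k}_i(e^{x(t)}) \frac{dx}{dt} = \frac{\tilde{L}^{k,k}_i(e^{x(t)})}{\tilde{L}^{k,k}_1(e^{x(t)})}. \tag{4.30}$$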
This is the rule that provides the recursive formulas for curves of arbitrary degree d on the Calabi–Yau hypersurface Mkk once we know the recursive formulas for curves up to degree d on Fano hypersurfaces. At this point we obtain the recursive formulas for cubics in Conjecture 1 by means of elementary calculations.
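The rule (4.30) can be applied mechanically to truncated power series. The following is a minimal sketch under our own assumptions: the integration constant in $dt/dx = \tilde{L}^{k,k}_1(e^x)$ is fixed so that $t - x \to 0$ as $e^x \to 0$, in agreement with (4.16), and the helper names are hypothetical. The input is the pair of quintic series $\tilde{L}^{5,5}_1$, $\tilde{L}^{5,5}_2$ whose coefficients appear in (4.32); the output is compared with the true structure constants in (4.31).

```python
from fractions import Fraction

def mul(p, q, n):
    """Product of two series (lists of length n+1), truncated after order n."""
    r = [Fraction(0)] * (n + 1)
    for i in range(n + 1):
        for j in range(n + 1 - i):
            r[i + j] += p[i] * q[j]
    return r

def reciprocal(p, n):
    """1/p for a series with p[0] == 1."""
    r = [Fraction(1)] + [Fraction(0)] * n
    for m in range(1, n + 1):
        r[m] = -sum(p[j] * r[m - j] for j in range(1, m + 1))
    return r

def exp_series(p, n):
    """exp(p) for a series with p[0] == 0, from the relation e' = p' e."""
    e = [Fraction(1)] + [Fraction(0)] * n
    for m in range(1, n + 1):
        e[m] = sum(j * p[j] * e[m - j] for j in range(1, m + 1)) / m
    return e

def compose(p, s, n):
    """p(s(Q)) for a series s with s[0] == 0, by Horner's rule."""
    r = [Fraction(0)] * (n + 1)
    for c in reversed(p):
        r = mul(r, s, n)
        r[0] += c
    return r

def true_series(Lt_i, Lt_1, n):
    """Coefficients of L_i(e^t) obtained from the virtual series via (4.30)."""
    # t - x = sum_d (1/d) Lt_1[d] q^d with q = e^x (termwise integral of dt/dx - 1)
    shift = [Fraction(0)] + [Fraction(Lt_1[d], d) for d in range(1, n + 1)]
    # Q = e^t = q * exp(t - x)
    Q_of_q = [Fraction(0)] + exp_series(shift, n)[:n]
    # Invert Q(q) by fixed-point iteration q = Q - h(q), where h = Q(q) - q
    h = list(Q_of_q)
    h[1] -= 1
    q_of_Q = [Fraction(0), Fraction(1)] + [Fraction(0)] * (n - 1)
    for _ in range(n):
        hq = compose(h, q_of_Q, n)
        q_of_Q = [(1 if m == 1 else 0) - hq[m] for m in range(n + 1)]
    ratio = mul([Fraction(c) for c in Lt_i],
                reciprocal([Fraction(c) for c in Lt_1], n), n)
    return compose(ratio, q_of_Q, n)

Lt1 = [1, 770, 1435650, 3225308000]
Lt2 = [1, 1345, 3296525, 8940963625]
quintic = true_series(Lt2, Lt1, 3)
print([int(c) for c in quintic])
```

Each pass of the fixed-point iteration fixes one more order of the inverse series, so n passes are enough for a truncation at order n.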
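Example. The true quantum cohomology ring of the quintic Calabi–Yau threefold is:
$$\begin{aligned} & \mathcal{O}_e \cdot 1 = \mathcal{O}_e, \\ & \mathcal{O}_e \cdot \mathcal{O}_e = \mathcal{O}_{e^2}(1 + 575e^t + 975375e^{2t} + 1712915000e^{3t} + \cdots), \\ & \mathcal{O}_e \cdot \mathcal{O}_{e^2} = \mathcal{O}_{e^3}, \\ & \mathcal{O}_e \cdot \mathcal{O}_{e^3} = 0, \end{aligned} \tag{4.31}$$
while the associated virtual ring is:
$$\begin{aligned} & G \cdot 1 = \mathcal{O}_e(1 + 770e^x + 1435650e^{2x} + 3225308000e^{3x} + \cdots), \\ & G \cdot \mathcal{O}_e = \mathcal{O}_{e^2}(1 + 1345e^x + 3296525e^{2x} + 8940963625e^{3x} + \cdots), \\ & G \cdot \mathcal{O}_{e^2} = \mathcal{O}_{e^3}(1 + 770e^x + 1435650e^{2x} + 3225308000e^{3x} + \cdots), \\ & G \cdot \mathcal{O}_{e^3} = 0. \end{aligned} \tag{4.32}$$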
Using (4.30) we find that:
$$575 = 1345 - 770, \tag{4.33}$$
$$975375 = 3296525 - 1435650 + 770 \cdot 770 - 1345 \cdot 770 - 770 \cdot (1345 - 770), \tag{4.34}$$
$$\begin{aligned} 1712915000 = {} & 8940963625 - 3225308000 + 2 \cdot 770 \cdot 1435650 - 770^3 + 1345 \cdot (770)^2 \\ & - 1345 \cdot 1435650 - 3296525 \cdot 770 \\ & - 2 \cdot 770 \cdot (3296525 - 1435650 + 770 \cdot 770 - 1345 \cdot 770) \\ & + \Big(\frac{3}{2} \cdot (770)^2 - \frac{1}{2} \cdot 1435650\Big) \cdot (1345 - 770). \end{aligned} \tag{4.35}$$

5. On the Construction of the Recursive Formulas for Rational Curves of Larger Degree

Motivated by the preceding results, we begin this section by proposing some conjectures. They are strong enough to allow in principle the construction of the expected recursive formulas for curves of higher degree, and we explicitly produce the law for the d = 4 case.

Conjecture 2. There are universal recursive polynomial laws which express the structure constants $L^{N,k,d}_m$ on a Fano variety in terms of $L^{N+1,k,n}_m$ (1 ≤ n ≤ d). The formulas have the following properties:
1. The form of the recursive polynomials is invariant if the index N − k ≥ 2, and the equality $\gamma_0^{N,k,d} = \gamma_0^{N+1,k,d}$ for the coefficients in the fundamental relations is a consequence of them.
2. If N − k = 1 the recursive formulas change only for the case of lines, i.e. d = 1.

We keep the notations of Sect. 4, so that $\tilde{L}^{N,k,d}_m$ represents the result of a formal iteration of the recursive functions for fixed k down to any chosen N, and then $\tilde{L}^{k,k}_m(e^x)$ is the associated generating function.
Conjecture 3. Formal iteration of the laws of Conjecture 2 for descending N down to the case N = k yields the coefficients of the hypergeometric series used in the mirror calculation, i.e., it should be
$$k^{dk} a_d = \tilde{L}^{k,k,d}_0, \qquad k^{dk} b_d = \frac{1}{d}\tilde{L}^{k,k,d}_1 + \sum_{m=1}^{d-1} \frac{1}{m} \tilde{L}^{k,k,m}_1 \cdot \tilde{L}^{k,k,d-m}_0. \tag{5.1}$$
The same procedure gives the structure constants of the quantum cohomology ring of the Calabi–Yau hypersurface of degree k, according to the rule
$$L^{k,k}_i(e^t) = \frac{\tilde{L}^{k,k}_i(e^{x(t)})}{\tilde{L}^{k,k}_1(e^{x(t)})}. \tag{5.2}$$
Here we set
$$\frac{dt}{dx} := \tilde{L}^{k,k}_1(e^x).$$
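As a consistency check of (5.1) in low degree, the relation can be evaluated for the quintic. This is a hedged sketch, assuming k = 5 and taking the values $\tilde{L}^{5,5,d}_1$ = 770, 1435650, 3225308000 from the virtual ring (4.32); the names `kdk_a`, `kdk_b` and `rhs` are ours.

```python
from fractions import Fraction
from math import factorial

k = 5  # assumption: the quintic case

def kdk_a(d):
    """k^{dk} a_d = (kd)! / (d!)^k, which (5.1) identifies with L~^{k,k,d}_0."""
    return factorial(k * d) // factorial(d) ** k

def kdk_b(d):
    """k^{dk} b_d, with b_d as defined before (4.11)."""
    s = sum(Fraction(m, i * (k * i - m))
            for i in range(1, d + 1)
            for m in range(1, k))
    return kdk_a(d) * s

# Virtual structure constants L~^{5,5,d}_1 for d = 1, 2, 3, read off from (4.32)
Lt1 = {1: 770, 2: 1435650, 3: 3225308000}

def rhs(d):
    """Right-hand side of the second relation in (5.1)."""
    return Fraction(Lt1[d], d) + sum(
        Fraction(Lt1[m], m) * kdk_a(d - m) for m in range(1, d))

for d in (1, 2, 3):
    print(d, kdk_b(d) == rhs(d))
```

Note that $k^{3k} b_3$ is not an integer for k = 5, which is why the comparison is done with exact fractions rather than integers.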
Remark. It is an immediate consequence of the conjectures and of the vanishing conditions of Sect. 2 that the structure constant $L^{N,k,d}_m$ is a polynomial of degree d in the constants $L^k_m$, N ≥ k.

We proceed now to the construction of the recursive formulas for the case d = 4. Our method is based on the expectation that the specialization procedure gives, if not the right coefficients, at least the right monomials which appear in the recursive laws. We formalize this below with a conjecture.

We start by constructing some technical formulas for the factorization of the Gromov–Witten invariants. Let $\{n_*\} := \{n_1, n_2, \cdots, n_l\}$ and $\mathrm{ind}(\{n_*\}) = \sum_{j=1}^{l} (n_j - 1)$. We formally define $\mathrm{ind}(\{\emptyset\})$ to be 0. We have a formula for the correlation functions (Gromov–Witten invariants) of the topological sigma model on $M^k_{N+1}$ coupled to gravity, which reads:
$$k^{-1} \langle \mathcal{O}_{e^a} \mathcal{O}_{e^{n_1}} \mathcal{O}_{e^{n_2}} \cdots \mathcal{O}_{e^{n_l}} \mathcal{O}_{e^b} \rangle_{d, M^k_{N+1}, gr} = \sum_{d_1=0}^{d} \sum_{d_2=0}^{d_1} \cdots \sum_{d_{\mathrm{ind}(\{n_*\})}=0}^{d_{\mathrm{ind}(\{n_*\})-1}} C^d(\{n_*\}; d_1, d_2, \cdots, d_{\mathrm{ind}(\{n_*\})}) \prod_{i=0}^{\mathrm{ind}(\{n_*\})} L^{N+1,k,d_i-d_{i+1}}_{n+1-a-i+(N-k+1)(d-d_i)}, \tag{5.3}$$
$$d_0 := d, \qquad N - k + 1 \ge \mathrm{ind}(\{n_*\}) + 1. \tag{5.4}$$
The coefficients $C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})})$ which appear here have the following properties:
$$C^d(\{m\}; d_1, \cdots, d_{m-1}) = 1, \tag{5.5}$$
$$C^d(\{n_*\} \cup \{1\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})}) = d\, C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})}). \tag{5.6}$$
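One can determine $C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})})$ by means of the recursive relation
$$\begin{aligned} C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})}) = {} & \sum_{\substack{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\} \\ \{m_*\} \ne \emptyset}} \Big( C^{d - d_{\mathrm{ind}(\{l_*\})+n_l-1}}(\{l_*\} \cup \{n_l - 1\}; d_1 - d_{\mathrm{ind}(\{l_*\})+n_l-1}, \cdots, d_{\mathrm{ind}(\{l_*\})+n_l-2} - d_{\mathrm{ind}(\{l_*\})+n_l-1}) \\ & \qquad \cdot C^{d_{\mathrm{ind}(\{l_*\})+n_l-1}}(\{m_*\}; d_{1+\mathrm{ind}(\{l_*\})+n_l-1}, \cdots, d_{\mathrm{ind}(\{n_*\})}) \cdot d_{\mathrm{ind}(\{l_*\})+n_l-1} \Big) \\ & + C^{d - d_{\mathrm{ind}(\{n_*\})}}(\{n_*\} \backslash \{n_e\} \cup \{n_l - 1\}; d_1 - d_{\mathrm{ind}(\{n_*\})}, \cdots, d_{\mathrm{ind}(\{n_*\})-1} - d_{\mathrm{ind}(\{n_*\})}). \end{aligned} \tag{5.7}$$

Proof. We prove (5.7) by induction on $\mathrm{ind}(\{n_*\})$. We denote $\mathcal{O}_{e^{n_1}} \mathcal{O}_{e^{n_2}} \cdots \mathcal{O}_{e^{n_l}}$ by $\mathcal{O}_{e^{\{n_*\}}}$ for brevity. The first reconstruction theorem of Kontsevich and Manin yields:
$$\begin{aligned} & \sum_{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\}} \sum_{d'=0}^{d} k^{-1} \langle \mathcal{O}_{e^a} \mathcal{O}_{e^{\{l_*\}}} \mathcal{O}_{e^b} \mathcal{O}_{e^{n_l + \mathrm{ind}(\{m_*\}) - (N-k+1)d'}} \rangle_{d-d',gr} \cdot k^{-1} \langle \mathcal{O}_{e^{\mathrm{ind}(\{l_*\}) - (N-k+1)(d-d') + a + b}} \mathcal{O}_{e^{\{m_*\}}} \mathcal{O}_{e^{n_l-1}} \mathcal{O}_{e} \rangle_{d',gr} \\ & = \sum_{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\}} \sum_{d'=0}^{d} k^{-1} \langle \mathcal{O}_{e^a} \mathcal{O}_{e^{\{l_*\}}} \mathcal{O}_{e^{n_l-1}} \mathcal{O}_{e^{b+1+\mathrm{ind}(\{m_*\}) - (N-k+1)d'}} \rangle_{d-d',gr} \cdot k^{-1} \langle \mathcal{O}_{e^{a-1+n_l+\mathrm{ind}(\{l_*\}) - (N-k+1)(d-d')}} \mathcal{O}_{e^{\{m_*\}}} \mathcal{O}_{e} \mathcal{O}_{e^b} \rangle_{d',gr}. \end{aligned} \tag{5.8}$$
The l.h.s. of (5.8) receives a contribution only from d′ = 0 and $\{m_*\} = \{\emptyset\}$, because $\mathrm{ind}(\{n_*\}) - (N-k+1)d' \le -1$ (d′ ≥ 1) and because the classical correlation function remains non-zero only if the number of operator insertion points equals 3. Then we have
$$\text{(the l.h.s. of (5.8))} = k^{-1} \langle \mathcal{O}_{e^a} \mathcal{O}_{e^{\{n_*\}}} \mathcal{O}_{e^b} \rangle_{d,gr}. \tag{5.9}$$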
On the other hand, we can rewrite the r.h.s. of (5.8) using the induction hypothesis:
$$\begin{aligned} & \sum_{\substack{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\} \\ \{m_*\} \ne \emptyset}} \sum_{d'=0}^{d} \Big( \sum_{t_1=0}^{d-d'} \cdots \sum_{t_{\mathrm{ind}(\{l_*\})+n_l-2}=0}^{t_{\mathrm{ind}(\{l_*\})+n_l-3}} C^{d-d'}(\{l_*\} \cup \{n_l - 1\}; t_1, \cdots, t_{\mathrm{ind}(\{l_*\})+n_l-2}) \cdot \prod_{i=0}^{\mathrm{ind}(\{l_*\})+n_l-2} L^{N+1,k,t_i-t_{i+1}}_{n+1-a-i+(N-k+1)(d-d'-t_i)} \Big) \\ & \qquad \cdot \Big( \sum_{u_1=0}^{d'} \cdots \sum_{u_{\mathrm{ind}(\{m_*\})}=0}^{u_{\mathrm{ind}(\{m_*\})-1}} d'\, C^{d'}(\{m_*\}; u_1, \cdots, u_{\mathrm{ind}(\{m_*\})}) \cdot \prod_{j=0}^{\mathrm{ind}(\{m_*\})} L^{N+1,k,u_j-u_{j+1}}_{n+1-a+1+\mathrm{ind}(\{l_*\})-j+(N-k+1)(d-d')+(N-k+1)(d'-u_j)} \Big) \\ & + \sum_{d'=0}^{d} \sum_{t_1=0}^{d-d'} \cdots \sum_{t_{\mathrm{ind}(\{n_*\})-1}=0}^{t_{\mathrm{ind}(\{n_*\})-2}} C^{d-d'}(\{n_*\} \backslash \{n_e\} \cup \{n_l - 1\}; t_1, \cdots, t_{\mathrm{ind}(\{n_*\})-1}) \cdot \prod_{i=0}^{\mathrm{ind}(\{n_*\})-1} L^{N+1,k,t_i-t_{i+1}}_{n+1-a-i+(N-k+1)(d-d'-t_i)} \cdot L^{N+1,k,d'}_{n+1-a-\mathrm{ind}(\{n_*\})+(N-k+1)(d-d')}. \end{aligned} \tag{5.10}$$
Then an appropriate change of the $t_i$'s and $u_i$'s leads to (5.7). □

The specialization process can be done systematically employing the formula:
$$[A_{a_1-1} \cap H, A_{a_2-1} \cap H, \cdots, A_{a_{d+1}-1} \cap H; d, N+1, k] = \sum_{m=1}^{d+1} \sum_{\substack{\sqcup_{j=1}^{m} U_j = \{a_*\} \\ U_j \ne \emptyset}} (-1)^{d+1-m} \Big( [A_{\mathrm{ind}(U_1)+1}, A_{\mathrm{ind}(U_2)+1}, \cdots, A_{\mathrm{ind}(U_m)+1}; d, N+1, k] \cdot \Big( \prod_{j=1}^{m} (\sharp(U_j) - 1)! \Big) \Big). \tag{5.11}$$
We apply formally the process of specialization to quartic curves. Combining (5.7) with (5.11), we obtain:
$$\begin{aligned} 16L^{N,k,4}_n + R = {} & \tfrac{3}{2}L^4_{n-3} + \tfrac{13}{2}L^4_{n-2} + \tfrac{13}{2}L^4_{n-1} + \tfrac{3}{2}L^4_n \\ & + 2L^1_{n-2}L^3_{n-2+N-k} + 2L^1_{n-1}L^3_{n-2+N-k} + 6L^1_nL^3_{n-2+N-k} \\ & + 8L^1_{n-1}L^3_{n-1+N-k} + 12L^1_nL^3_{n-1+N-k} + 6L^1_nL^3_{n+N-k} \\ & + 3L^2_{n-2}L^2_{n-1+2(N-k)} + 7L^2_{n-1}L^2_{n-1+2(N-k)} + 6L^2_nL^2_{n-1+2(N-k)} \\ & + 10L^2_{n-1}L^2_{n+2(N-k)} + 7L^2_nL^2_{n+2(N-k)} + 3L^2_nL^2_{n+1+2(N-k)} \\ & + 6L^3_{n-2}L^1_{n+3(N-k)} + 12L^3_{n-1}L^1_{n+3(N-k)} + 6L^3_nL^1_{n+3(N-k)} \\ & + 8L^3_{n-1}L^1_{n+1+3(N-k)} + 2L^3_nL^1_{n+1+3(N-k)} + 2L^3_nL^1_{n+2+3(N-k)} \\ & + 4L^1_{n-1}L^1_{n-1+N-k}L^2_{n-1+2(N-k)} + 9L^1_nL^1_{n-1+N-k}L^2_{n-1+2(N-k)} \\ & + 10L^1_nL^1_{n+N-k}L^2_{n-1+2(N-k)} + 12L^1_nL^1_{n+N-k}L^2_{n+2(N-k)} \\ & + 8L^1_{n-1}L^2_{n-1+N-k}L^1_{n+3(N-k)} + 14L^1_nL^2_{n-1+N-k}L^1_{n+3(N-k)} \\ & + 14L^1_nL^2_{n+N-k}L^1_{n+3(N-k)} + 8L^1_nL^2_{n+N-k}L^1_{n+1+3(N-k)} \\ & + 12L^2_{n-1}L^1_{n+2(N-k)}L^1_{n+3(N-k)} + 10L^2_nL^1_{n+2(N-k)}L^1_{n+3(N-k)} \\ & + 9L^2_nL^1_{n+1+2(N-k)}L^1_{n+3(N-k)} + 4L^2_nL^1_{n+1+2(N-k)}L^1_{n+1+3(N-k)} \\ & + 16L^1_nL^1_{n+N-k}L^1_{n+2(N-k)}L^1_{n+3(N-k)}. \end{aligned} \tag{5.12}$$
Here and further down we drop N + 1, k from $L^{N+1,k,d}_n$ for simplicity. We determine next the contribution from the connected reducible curves R indirectly, namely by means of a conjecture on the structure of the recursive formulas which is motivated by the results that we have found for curves of degree up to 3.
Conjecture 4. The prototype result obtained from the specialization exhausts all the addends that appear in the “true” recursive formula and the coefficients described by the following generating polynomial remain unchanged after subtraction of the contribution from “R” term:
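$$\prod_{j=1}^{d-1} \Big( \frac{jx + (d-j)y}{d} + z_j \Big). \tag{5.13}$$

Examples.
$$\begin{aligned} d = 2: \quad & \Big(\frac{x+y}{2}\Big) + z_1, \\ d = 3: \quad & \Big(\frac{2x^2 + 5xy + 2y^2}{9}\Big) + \Big(\frac{2x+y}{3}\Big) z_1 + \Big(\frac{x+2y}{3}\Big) z_2 + z_1 z_2, \\ d = 4: \quad & \Big(\frac{3x^3 + 13x^2y + 13xy^2 + 3y^3}{32}\Big) + \Big(\frac{x^2 + 4xy + 3y^2}{8}\Big) z_3 + \Big(\frac{3x^2 + 10xy + 3y^2}{16}\Big) z_2 + \Big(\frac{3x^2 + 4xy + y^2}{8}\Big) z_1 \\ & + \Big(\frac{3x+y}{4}\Big) z_1 z_2 + \Big(\frac{x+y}{2}\Big) z_1 z_3 + \Big(\frac{x+3y}{4}\Big) z_2 z_3 + z_1 z_2 z_3. \end{aligned} \tag{5.14}$$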
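The examples in (5.14) follow from expanding the product (5.13). A minimal sketch of that expansion, with a dictionary-based polynomial representation of our own choosing (keys are triples: power of x, power of y, tuple of the indices j of the factors $z_j$ present):

```python
from fractions import Fraction
from collections import defaultdict

def generating_polynomial(d):
    """Expand prod_{j=1}^{d-1} ((j*x + (d-j)*y)/d + z_j) from (5.13)."""
    poly = {(0, 0, ()): Fraction(1)}
    for j in range(1, d):
        factor = {
            (1, 0, ()): Fraction(j, d),        # (j/d) x
            (0, 1, ()): Fraction(d - j, d),    # ((d-j)/d) y
            (0, 0, (j,)): Fraction(1),         # z_j
        }
        new = defaultdict(Fraction)
        for (px, py, zs), c in poly.items():
            for (qx, qy, zs2), c2 in factor.items():
                new[(px + qx, py + qy, zs + zs2)] += c * c2
        poly = dict(new)
    return poly

p3 = generating_polynomial(3)
p4 = generating_polynomial(4)
# z-free part for d = 4: coefficients of x^3, x^2 y, x y^2, y^3
print([p4[(3 - i, i, ())] for i in range(4)])
```

For d = 4 the z-free part has coefficients 3/32, 13/32, 13/32, 3/32, which are the coefficients of $L^4_{n-3}, \ldots, L^4_n$ in the first line of the quartic law (5.15).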
The remaining unknown coefficients are found by means of the constraints that they must satisfy. One condition is that the recursive laws must be compatible with the main relation of the Kähler sub-ring, i.e. with $(\mathcal{O}_e)^{N-1} - k^k (\mathcal{O}_e)^{k-1} q = 0$. We also use the symmetry of the recursive formulas arising from the equality $L^{N,k,d}_n = L^{N,k,d}_{N-1-(N-k)d-n}$.

As a result the only unknown coefficients left are those of $L^1_{n-1}L^3_{n-2+N-k}$ and of $L^1_nL^3_{n-1+N-k}$. By symmetry they coincide with the coefficients of $L^3_nL^1_{n+1+3(N-k)}$ and of $L^3_{n-1}L^1_{n+3(N-k)}$ respectively. Using the condition imposed by the main relation we find that their sum must be 7/9. Everything is thus reduced to the computation of the coefficient of $L^1_{n-1}L^3_{n-2+N-k}$. Now we already have the recursive laws for the curves of degree at most 3, and iterated use of the recursive formula (with the unknown coefficients) from the N ≥ 2k region (in this region, what we need is only the Schubert numbers!) down to the N = k region yields linear functions of the remaining unknown coefficients. At this point the information that we have about the coefficients $a_d$ and $b_d$ of the hypergeometric series in Conjecture 3 provides us with an infinite number of linear relations, by varying k.
Using these arguments we compute that the coefficient of $L^1_{n-1}L^3_{n-2+N-k}$ is 1/6. We have thus obtained the law for curves of degree 4:
$$\begin{aligned} L^{N,k,4}_n = {} & 32^{-1}(3L^4_{n-3} + 13L^4_{n-2} + 13L^4_{n-1} + 3L^4_n) \\ & + 72^{-1}(9L^1_{n-2}L^3_{n-2+N-k} + 12L^1_{n-1}L^3_{n-2+N-k} + 16L^1_nL^3_{n-2+N-k} \\ & \qquad + 36L^1_{n-1}L^3_{n-1+N-k} + 44L^1_nL^3_{n-1+N-k} + 27L^1_nL^3_{n+N-k}) \\ & + 16^{-1}(3L^2_{n-2}L^2_{n-1+2(N-k)} + 6L^2_{n-1}L^2_{n-1+2(N-k)} + 4L^2_nL^2_{n-1+2(N-k)} \\ & \qquad + 10L^2_{n-1}L^2_{n+2(N-k)} + 6L^2_nL^2_{n+2(N-k)} + 3L^2_nL^2_{n+1+2(N-k)}) \\ & + 72^{-1}(27L^3_{n-2}L^1_{n+3(N-k)} + 44L^3_{n-1}L^1_{n+3(N-k)} + 16L^3_nL^1_{n+3(N-k)} \\ & \qquad + 36L^3_{n-1}L^1_{n+1+3(N-k)} + 12L^3_nL^1_{n+1+3(N-k)} + 9L^3_nL^1_{n+2+3(N-k)}) \\ & + 12^{-1}(3L^1_{n-1}L^1_{n-1+N-k}L^2_{n-1+2(N-k)} + 4L^1_nL^1_{n-1+N-k}L^2_{n-1+2(N-k)} \\ & \qquad + 6L^1_nL^1_{n+N-k}L^2_{n-1+2(N-k)} + 9L^1_nL^1_{n+N-k}L^2_{n+2(N-k)}) \\ & + 6^{-1}(3L^1_{n-1}L^2_{n-1+N-k}L^1_{n+3(N-k)} + 4L^1_nL^2_{n-1+N-k}L^1_{n+3(N-k)} \\ & \qquad + 4L^1_nL^2_{n+N-k}L^1_{n+3(N-k)} + 3L^1_nL^2_{n+N-k}L^1_{n+1+3(N-k)}) \\ & + 12^{-1}(9L^2_{n-1}L^1_{n+2(N-k)}L^1_{n+3(N-k)} + 6L^2_nL^1_{n+2(N-k)}L^1_{n+3(N-k)} \\ & \qquad + 4L^2_nL^1_{n+1+2(N-k)}L^1_{n+3(N-k)} + 3L^2_nL^1_{n+1+2(N-k)}L^1_{n+1+3(N-k)}) \\ & + L^1_nL^1_{n+N-k}L^1_{n+2(N-k)}L^1_{n+3(N-k)}. \end{aligned} \tag{5.15}$$
We have checked numerically that this formula is compatible with the previous conjecture (3.1) on the relation satisfied by the coefficients $a_d$ and $b_d$. We have also checked that the formula for quartic curves yields the right Gromov–Witten invariant for the quintic threefold when we use the same conjecture.

Remark. Looking at the recursive laws for the Fano case established so far, we expect that the number of addends appearing in the laws for degree d is the number of monomials of degree d in d variables, (2d − 1)!/(d!(d − 1)!). Of course the procedure that we have used for the determination of the recursive formula for d = 4 is in principle applicable to any d. For example we have also determined the recursive formula for d = 5. See (hep-th 9611053), the web version of this paper. On the other hand a general description of the coefficients that appear in the conjectural recursive formula of arbitrary degree is still an open problem.

6. Conclusion

In this paper, we proved that up to degree 3 the correlation functions of the hypersurfaces $M^k_N$ in $CP^{N-1}$ (N ≥ k) can be written as polynomial functions of a finite number of integers $L^k_m$. For instance in the quintic case these numbers are 1345, 770, 120. In proving this, we have found certain recursive formulas that are invariant in the $c_1(M^k_N) \ge 2$ region. They yield the "bare" B-model or "bare" coordinates of the deformation space of the complex structure of the mirror manifold associated to $M^k_k$. This completely agrees with the results of Givental, which say that in the $c_1(M^k_N) \ge 2$ region the sigma models on $M^k_N$ can be solved with the hypergeometric series without coordinate transformation (i.e., the bare deformation parameter is a good coordinate of the A-model) and that in the Calabi–Yau case, one has to transform the bare coordinate by the mirror map. In sum one can see the B-model as the toric quantum cohomology compatible with the toric compactification of the moduli space of the pure matter theory. In the Calabi–Yau case one has to introduce the mirror map to compensate for the gap between the toric compactification of the moduli space of the pure matter theory and the exact moduli space. These conclusions agree with the arguments of [24]. Possibly the case of complete intersections in projective space can be achieved by changing the input integers $L^k_m$. One may try to search for the generalization of our arguments to the case of complete intersections in weighted projective spaces.

Acknowledgement. A. C. thanks A. B. Givental, J. Lewis, C. Peters, S. A. Strømme, G. Tian, A. Tyurin. M. J. thanks T. Eguchi, K. Hori, A. Matsuo, M. Kobayashi, members of the Algebraic Geometry Group in the Dept. of Mathematical Science at Tokyo Univ., M. Nagura and Y. Sun for discussions and kind encouragement. A. C. has been partially supported by Science Project Geom. of Alg. Var. n. 0-198-SC1, and by funding from M.U.R.S.T. and G.N.S.A.G.A. (C.N.R.) Italy. M. J. has been supported by a grant of the Japan Society for the Promotion of Science.
References
1. Beauville, A.: Quantum cohomology of complete intersections. Mathematical Physics, Analysis and Geometry 168, 384–398 (1995)
2. Bertram, A.: Quantum Schubert Calculus. To appear in Advances in Mathematics
3. Bloch-Murre: On the Chow group of certain types of Fano threefolds. Compositio Math. 39, 47–105 (1979)
4. Candelas, P. and de la Ossa, X.: Nucl. Phys. B355, 415 (1991)
5. Candelas, P., de la Ossa, X., Green, P. and Parkes, L.: Phys. Lett. 258B, 118 (1991); Nucl. Phys. B359, 21 (1991)
6. Collino, A.: Some computations on the quantum cohomology algebra of a Fano hypersurface. Informal draft (1996)
7. Dubrovin, B.: The geometry of 2D topological field theories. In: Integrable systems and quantum groups. LNM 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996, pp. 120–348
8. Ellingsrud, Strømme, S.A.: Bott's formula and enumerative geometry. J. AMS 9, n. 1, 175–193 (1996)
9. Fulton, W.: Intersection Theory. Ergebnisse der Math. und ihrer Grenzgebiete 3. Folge Band 2, Berlin–Heidelberg–New York: Springer-Verlag, 1984
10. Fulton and Pandharipande: Notes on stable maps and quantum cohomology. In: Proceedings of symposia in pure mathematics: Algebraic geometry Santa Cruz 1995, (J. Kollar, R. Lazarsfeld, D. Morrison eds.) Volume 62, Part 2, Providence, RI: American Mathematical Society, pp. 45–96
11. Givental, A.B.: Equivariant Gromov–Witten Invariants. Internat. Math. Res. Notices 13, 613–663 (1996)
12. Greene, B.R., Morrison, D.R. and Plesser, M.R.: Mirror Manifolds in Higher Dimension. Commun. Math. Phys. 173, 559–598 (1995)
13. Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley, 1978
14. Hosono, S., Klemm, A., Theisen, S. and Yau, S.T.: Mirror Symmetry, Mirror Map and Applications to Calabi–Yau Hypersurfaces. Commun. Math. Phys. 167, 301–350 (1995)
15. Intriligator, K.: Fusion residues. Modern Physics Letters A6 Number 38, 3543–3556 (1991)
16. Jinzenji, M.: On Quantum Cohomology Rings for Hypersurfaces in $CP^{N-1}$. J. Math. Phys. 38, 5775–5802 (1997)
17. Jinzenji, M.: Construction of Free Energy of Calabi–Yau Manifold embedded in $CP^{N-1}$ via Torus Actions. Int. J. Mod. Phys. A12, 5775–5802 (1997)
18. Jinzenji, M. and Nagura, M.: Mirror Symmetry and An Exact Calculation of N − 2 point Correlation Function on Calabi–Yau Manifold embedded in $CP^{N-1}$. Int. J. Mod. Phys. A11, 171–202 (1996)
19. Keel, S.: Intersection theory of moduli spaces of n-stable pointed curves of genus zero. Trans. Am, 330, 545–574 (1992)
20. Kontsevich, M.: Enumeration of Rational Curves via Torus Actions. In: The moduli space of curves, R. Dijkgraaf, C. Faber, G. van der Geer (Eds.), Progress in Math., 129, Basel–Boston: Birkhäuser, 1995, pp. 335–368
21. Kontsevich, M., Manin, Y.: Gromov–Witten Classes, Quantum Cohomology, and Enumerative Geometry. Commun. Math. Phys. 164, 525–562 (1994)
22. Lewis, J.: The cylinder correspondence for hypersurfaces of degree n in $P^n$. Am. J. of Math. 110, 77–114
23. Li, J., Tian, G.: Quantum Cohomology of Homogeneous Varieties. alg/geom/9504009
24. Morrison, D.R. and Plesser, M.R.: Summing the Instantons: Quantum Cohomology and Mirror Symmetry in Toric Varieties. Nucl. Phys. B440, 279–354 (1995)
25. Nagura, M. and Sugiyama, K.: Mirror Symmetry of K3 Surface. Int. J. Mod. Phys. A10, 233 (1995)
26. Persson, U., Peters, C.: Some aspects of the topology of algebraic surfaces. Israel Mathematical Conference Proceedings Vol. 9, 377–392 (1996)
27. Ruan, Y., Tian, G.: A mathematical theory of quantum cohomology. J. Diff. Geom. 42 no. 2, (1995)
28. Tian, G.: Quantum cohomology and its associativity. In: Proc. of 1st Current Developments in Math., Cambridge: Cambridge International Press, 1995
29. Tjurin, A. N.: Five lectures on three-dimensional varieties. (Russian) Uspehi Mat. Nauk 27 no. 5, (167), 3–50 (1972)
30. Vafa, C.: Topological Mirrors and Quantum Rings. hep-th/9111017
31. Witten, E.: Mirror Manifolds and Topological Field Theory. In: Essays on Mirror Manifolds, ed. S.-T. Yau, Hong Kong: Int. Press. Co., 1992, pp. 120–180

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 185 – 233 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Dynamics of Cubic Siegel Polynomials Saeed Zakeri? Department of Mathematics, SUNY at Stony Brook, Stony Brook, NY 11794, USA. E-mail:
[email protected] Received: 29 August 1998 / Accepted: 19 March 1999
Abstract: We study the one-dimensional parameter space of cubic polynomials in the complex plane which have a fixed Siegel disk of rotation number θ, where θ is a given irrational number of Brjuno type. The main result of this work is that when θ is of bounded type, the boundary of the Siegel disk is a quasicircle which contains one or both critical points of the cubic polynomial. We also show that these boundaries vary continuously as one moves in the parameter space. This is most nontrivial near the set of cubics with both critical points on the boundary of their Siegel disk. We prove that this locus is a Jordan curve in the parameter space. Most of the techniques and results can be generalized to polynomials of higher degrees.
Contents
1. Introduction
2. A Cubic Parameter Space
3. Components of the Interior of M(θ)
4. Renormalizable Cubics
5. Quasiconformal Conjugacy Classes
6. Connectivity of M(θ)
7. Critical Parametrization of Blaschke Products
8. A Blaschke Parameter Space
9. The Surgery
10. The Blaschke Connectedness Locus C(θ)
11. Continuity of the Surgery Map
12. Renormalizable Blaschke Products
13. Surjectivity of the Surgery Map
14. Siegel Disks with Two Critical Points on Their Boundary
* Current address: Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104-6395, USA.
1. Introduction
Let f be a polynomial of degree d ≥ 2 in the complex plane and consider the following statements:
• $(A_d)$ “If f has a fixed Siegel disk Δ of bounded type rotation number, then ∂Δ is a quasicircle passing through some critical point of f.”
• $(B_d)$ “If f has a fixed Siegel disk Δ such that ∂Δ is a quasicircle passing through some critical point of f, then the rotation number of Δ is of bounded type.”
Statement $(A_2)$ is a theorem of Douady, Ghys, Herman and Shishikura, $(B_d)$ is open, even for d = 2,¹ and one of the main corollaries of this work is $(A_3)$:
Theorem. Let P be a cubic polynomial which has a fixed Siegel disk Δ of rotation number θ. Let θ be of bounded type. Then the boundary of Δ is a quasicircle which contains one or both critical points of P.
In fact, we study the one-dimensional slice in the cubic parameter space which consists of all cubics with a fixed Siegel disk of a given rotation number. Many of the results apply to all rotation numbers of Brjuno type and can be generalized to polynomials of degree d ≥ 4.
Siegel disks provide examples of quasiperiodic dynamics. Let p be an irrationally indifferent fixed point of a rational map $f : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$. This means that f(p) = p and the multiplier $f'(p)$ is of the form $e^{2\pi i\theta}$, where the rotation number 0 < θ < 1 is irrational. When f is linearizable near p, the largest domain Δ on which the linearization is possible is simply connected and is called the Siegel disk of f centered at p. Every punctured Siegel disk $\Delta \smallsetminus \{p\}$ is foliated by dynamically defined real-analytic invariant curves. However, as we get close to ∂Δ, these invariant curves may become more wiggly, and in the limit we lose control of their distortion. So, a priori, we do not even know if ∂Δ is a Jordan curve. The topology and geometry of the boundary of Siegel disks is a current field of research in Holomorphic Dynamics.
It was conjectured by Douady and Sullivan in the early 80’s that the boundary of every Siegel disk of a rational map has to be a Jordan curve (see [D1]). To this date, this has remained an open problem, even for polynomials, even when the degree is 2. Even worse, there are very few explicit examples of polynomials for which we can effectively verify this conjecture. For instance, it is easy to see that local-connectivity of the Julia set implies that the boundary of a Siegel disk is a Jordan curve, but except for one case in the quadratic family [Pe], we do not know how to check local-connectivity of the Julia set of a rational map which has a Siegel disk (and even in that single case, the boundary being a Jordan curve is proved as a first step in the proof of local-connectivity). On the other hand, there are examples of non locally-connected quadratic Julia sets whose Siegel disks are bounded by quasicircles [H3] or indifferent linearizable germs with non locally-connected “hedgehogs” whose Siegel disks are bounded by smooth or even quasianalytic Jordan curves [Pr2]. It is known that in any counterexample to the Douady–Sullivan conjecture, the boundary of the Siegel disk must either be very complicated (an indecomposable continuum) or very simple (a circle with infinitely many topologist’s sine curves planted on it) [R].
¹ Note added in proof (July 1999): Carsten Petersen has announced a proof of $(B_d)$ for d ≥ 2.
Let $[a_1, a_2, \ldots, a_n, \ldots]$ be the continued fraction expansion of the rotation number θ and $p_n/q_n = [a_1, a_2, \ldots, a_n]$ be its nth rational approximation, where every $a_n$ is a positive integer. According to the theorem of Brjuno–Yoccoz [Y], every holomorphic germ with an indifferent fixed point of multiplier $e^{2\pi i\theta}$ is linearizable if and only if θ satisfies the condition
$$\sum_{n=1}^{\infty} \frac{\log q_{n+1}}{q_n} < +\infty. \tag{1.1}$$
Such a θ is said to be of Brjuno type. It is not hard to show that the set of such θ has full measure in the unit interval. The set of irrational numbers of Brjuno type contains two important arithmetic subsets: (1) numbers of Diophantine type, the set of all 0 < θ < 1 for which there exist positive constants C and ν such that $|\theta - p/q| > C/q^{\nu}$ for every rational number 0 < p/q < 1; and (2) numbers of bounded type, the set of all 0 < θ < 1 for which $\sup_n a_n < +\infty$.
Another issue is the existence of critical points on the boundary of Siegel disks. This problem was first studied by Ghys under the assumption that the boundary is a Jordan curve and the rotation number is Diophantine [G]. Later Herman improved the result by showing that when the rotation number is Diophantine and the action on the boundary is injective, there must be a critical point on the boundary [H1]. A very short proof of this theorem is now possible with knowledge of “Siegel compacts” as recently introduced by Pérez-Marco [Pr1] (see [Z2] for such a proof). In the case of quadratic polynomials, the absence of a critical point on the boundary of the Siegel disk automatically implies that the map acts injectively on this boundary. Hence one concludes that for θ of Diophantine type, the critical point of $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ is on the boundary of the Siegel disk centered at 0. Later Herman gave the first example of a θ of Brjuno type for which the boundary of the Siegel disk for $Q_\theta$ is disjoint from the entire orbit of the critical point [H3].
The most significant example in which one can explicitly show that the boundary of a Siegel disk is a Jordan curve containing a critical point is the quadratic map $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$, with θ of bounded type.
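The arithmetic conditions above can be explored numerically. The following is a minimal sketch, assuming the convention $p_n/q_n = [a_1, \ldots, a_n] = 1/(a_1 + 1/(a_2 + \cdots))$ used in the text; the function names are illustrative and not taken from the paper. It computes the denominators $q_n$ by the standard recurrence $q_n = a_n q_{n-1} + q_{n-2}$ and a partial sum of the series in (1.1) for the golden mean, whose partial quotients are all equal to 1.

```python
import math

def convergent_denominators(partial_quotients):
    """Denominators q_n of p_n/q_n = [a_1, ..., a_n] (hypothetical helper)."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    qs = []
    for a in partial_quotients:
        # Standard recurrence q_n = a_n * q_{n-1} + q_{n-2}.
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def brjuno_partial_sum(partial_quotients):
    """Partial sum of sum_{n>=1} log(q_{n+1}) / q_n from condition (1.1)."""
    qs = convergent_denominators(partial_quotients)
    # qs[n] holds q_{n+1} (zero-based list), so pair consecutive entries.
    return sum(math.log(qs[n + 1]) / qs[n] for n in range(len(qs) - 1))

# Golden mean (sqrt(5) - 1)/2 = [1, 1, 1, ...]: the q_n are Fibonacci numbers.
golden = [1] * 40
print(convergent_denominators(golden)[:8])
print(brjuno_partial_sum(golden))
```

For a bounded type θ the $q_n$ grow at least geometrically while $\log q_{n+1}$ grows at most linearly in n, so the partial sums stabilize quickly; a finite partial sum is of course only numerical evidence, not a proof of convergence.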
The idea, originally due to Ghys but utilized by Douady, Herman and Shishikura, is to consider the degree 3 Blaschke product
$$f_\theta(z) = e^{2\pi i t(\theta)}\, z^2\, \frac{z-3}{1-3z},$$
which has a double critical point at 1, where 0 < t(θ) < 1 is chosen such that the rotation number of the restriction of $f_\theta$ to the unit circle is θ. Using a theorem of Świątek and Herman on quasisymmetric linearization of critical circle maps ([Sw,H2]), one can redefine $f_\theta$ on the unit disk to make it quasiconformally conjugate to the rigid rotation by angle θ. After modifying the conformal structure of the sphere on the unit disk and all its preimages, one applies the Measurable Riemann Mapping Theorem of Morrey-Ahlfors-Bers to prove that the resulting map is quasiconformally conjugate to a quadratic polynomial Q. But the image of the unit disk has to be a Siegel disk of rotation number θ for Q, and there is only one such quadratic up to an affine conjugacy, so Q must be conjugate to $Q_\theta$, which proves $(A_2)$.
The Julia set of $Q_\theta$ for the golden mean $\theta = (\sqrt{5} - 1)/2$ and its self-similar properties were studied empirically by physicists in the early 80’s (see [MN,W]). For general θ of bounded type, it has been a subject of recent rigorous studies by mathematicians (see for example [Pe,GJ,Mc3,YZ]). In a very recent work [Z4], using a non-quasiconformal surgery on $f_\theta$, we find explicit arithmetical conditions on unbounded type rotation numbers θ which guarantee that the boundary of the Siegel disk of $Q_\theta$ is a Jordan curve passing through the critical point.
In any attempt to generalize $(A_2)$ to higher degrees, one must address several problems. In fact, the main difficulty is not the surgery, which can be performed in all degrees in a similar way, provided that one has the appropriate Blaschke products in hand. Instead, we have to face a different set of questions such as parametrization of the candidate Blaschke products by their critical points, combinatorics of various “drops” of their Julia sets, continuity of the surgery, and surjectivity of this operation. None of these issues arises in degree 2, where the corresponding parameter spaces are single points.
In this paper we address these questions in detail for cubic polynomials, although many of the arguments apply to higher degrees as well. We introduce the parameter space $\mathcal{P}^{cm}(\theta)$ of critically marked cubic polynomials with a Siegel disk of a given rotation number θ of Brjuno type, which is canonically isomorphic to the punctured plane. The connectedness locus $M(\theta) \subset \mathcal{P}^{cm}(\theta)$ (the analogue of the Mandelbrot set for the quadratic family) is the set of all cubics with both critical orbits bounded (see Fig. 1). In the interior of M(θ), every cubic is either hyperbolic-like, for which the free critical point approaches an attracting cycle, or capture, for which the free critical point eventually maps into the Siegel disk, or of neither type, in which case it is called queer. (There may be no queer components. In any case, no example is known.) The presence of hyperbolic-like cubics in $\mathcal{P}^{cm}(\theta)$ implies the existence of copies of the Mandelbrot set all over M(θ), while captures appear as components in M(θ) which look like Siegel disks in the dynamical plane. The most significant property of queer cubics is that their Julia sets support invariant line fields and in particular have positive Lebesgue measure (Theorem 3.4).
Motivated by the Douady-Ghys-Herman-Shishikura approach, we introduce an auxiliary family of degree 5 critically marked Blaschke products which serve as models for cubics in $\mathcal{P}^{cm}(\theta)$ in the same way $f_\theta$ does for the quadratic $Q_\theta$.
We show that these Blaschke products can be parametrized by their critical points (Theorem 7.1) and we use this parametrization to define the parameter space $\mathcal{B}^{cm}(\theta)$, which is also homeomorphic to the punctured plane. A connectedness locus $C(\theta) \subset \mathcal{B}^{cm}(\theta)$ can be defined similarly. When θ is of bounded type, one can perform a quasiconformal surgery on Blaschke products in $\mathcal{B}^{cm}(\theta)$ in order to obtain critically marked cubics in $\mathcal{P}^{cm}(\theta)$. The result of this surgery does not depend on various choices we make along the way (Proposition 9.2), hence it gives rise to a well-defined surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$. Continuity of S is far from being straightforward and depends on the fact that the parameter spaces have one complex dimension (Theorem 11.1). In fact, in higher degrees, the proof of this continuity step is the only part where our techniques for cubic polynomials fail to apply. Various pieces of evidence suggest that the connectedness loci C(θ) and M(θ) are in fact homeomorphic. One can go even further and speculate that S is a homeomorphism. Although we provide some evidence to support this, we only need to show that S is surjective (Theorem 13.6) in order to get the desired results in the dynamical plane of cubics. Surjectivity follows from an injectivity result (Theorem 13.3) which in particular shows that S induces a homeomorphism between the complementary components of C(θ) and M(θ). The proof of the injectivity result relies on various tools developed along the way, especially a renormalization scheme to “extract” $Q_\theta$ from some cubics in $\mathcal{P}^{cm}(\theta)$ and $f_\theta$ from some Blaschke products in $\mathcal{B}^{cm}(\theta)$. Surjectivity of S proves $(A_3)$. As another consequence, we obtain the following
Theorem. For θ of bounded type, the boundary of the Siegel disk of $P \in \mathcal{P}^{cm}(\theta)$ is a continuous function of P in the Hausdorff topology.
It is interesting to contrast this result with the fact that the Julia set of P undergoes drastic implosions near the boundary of M(θ), especially near the set of cubics with both critical points on the boundary of their Siegel disk. We study this locus and describe its topology:
Theorem. For θ of bounded type, the set Γ of all cubics in $\mathcal{P}^{cm}(\theta)$ with both critical points on the boundary of their Siegel disk is a Jordan curve.
Figure 18 shows the Jordan curve Γ. We conclude with a topological characterization of this set as the common boundary of the two complementary components of M(θ) (Theorem 14.4).
2. A Cubic Parameter Space
We begin by considering the space of all cubic polynomials which have a fixed Siegel disk of multiplier $\lambda = e^{2\pi i\theta}$ centered at the origin. Here 0 < θ < 1 is a given irrational number of Brjuno type satisfying the condition (1.1). By the theorem of Brjuno–Yoccoz [Y], every holomorphic germ $z \mapsto e^{2\pi i\theta} z + O(z^2)$ with θ of Brjuno type is holomorphically linearizable near 0. In particular, every cubic polynomial of the form
$$z \mapsto \lambda z + a_2 z^2 + a_3 z^3, \tag{2.1}$$
with $(a_2, a_3) \in \mathbb{C} \times \mathbb{C}^*$ has a Siegel disk centered at the origin. We are not directly interested in the rather big space of all such cubics. Instead, we would like to consider the space of affine conjugacy classes of these cubics together with a marking of their critical points.
A few words on the notion of “marking” are in order; however, we will hardly refer to the following formal definition in the rest of this paper. Roughly speaking, a marking of the critical points of a cubic P of the form (2.1) is a choice of labeling these critical points. It can be thought of as a surjective function m from the set {1, 2} to the set of critical points of P. Two such critically marked cubics (P, m) and (Q, m′) are affinely conjugate if there is a dilation $\varphi : z \mapsto \alpha z$ such that $\varphi \circ P = Q \circ \varphi$ and $m' = \varphi \circ m$. In other words, an affine conjugacy must also respect the markings. We denote the space of affine conjugacy classes of such critically marked cubics by $\mathcal{P}^{cm}(\theta)$.
One way to parametrize $\mathcal{P}^{cm}(\theta)$ is as follows: In each conjugacy class we choose the unique critically marked cubic (P, m) whose second critical point m(2) is located at z = 1 in the complex plane. The first critical point m(1) will then be located at some point c ≠ 0. It is easy to see that such a cubic has the form
$$P_c : z \mapsto \lambda z \left( 1 - \frac{1}{2}\left(1 + \frac{1}{c}\right) z + \frac{1}{3c} z^2 \right). \tag{2.2}$$
Note that using this normal form, every cubic comes automatically with a marking of its critical points. Thus (2.2) provides us with an identification $\mathcal{P}^{cm}(\theta) \simeq \mathbb{C}^*$. Under this identification, the natural $\mathbb{Z}_2$-action on $\mathcal{P}^{cm}(\theta)$ (swapping the markings of the critical points) corresponds to the involution $c \mapsto 1/c$. By an abuse of notation, we often identify the cubic $P_c \in \mathcal{P}^{cm}(\theta)$ with the parameter $c \in \mathbb{C}^*$.
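The normal form (2.2) can be checked numerically. The sketch below is an illustration under stated assumptions: the helper name `make_cubic` and the sample values of c and w are arbitrary choices, not part of the paper. It evaluates $P_c$ and its derivative, which factors as $P_c'(z) = (\lambda/c)(z-1)(z-c)$, and checks that the dilation $z \mapsto z/c$ conjugates $P_c$ to $P_{1/c}$, which is the involution $c \mapsto 1/c$ described above.

```python
import cmath
import math

def make_cubic(c, theta):
    """Return (P_c, P_c') for the normal form (2.2) with lambda = exp(2*pi*i*theta)."""
    lam = cmath.exp(2j * math.pi * theta)

    def P(z):
        return lam * z * (1 - 0.5 * (1 + 1 / c) * z + z * z / (3 * c))

    def dP(z):
        # Derivative of P_c; equals (lam / c) * (z - 1) * (z - c).
        return lam * (1 - (1 + 1 / c) * z + z * z / c)

    return P, dP

theta = (math.sqrt(5) - 1) / 2      # golden mean, as in Fig. 1
c = 0.7 + 0.4j                      # arbitrary sample parameter in C*
P, dP = make_cubic(c, theta)
P_inv, _ = make_cubic(1 / c, theta)

print(abs(dP(1)), abs(dP(c)))       # both critical points: derivative vanishes
w = 0.3 - 0.2j                      # arbitrary test point
print(abs(P_inv(w) - P(c * w) / c)) # conjugacy by z -> z/c, up to rounding
```

The conjugacy sends the critical point c of $P_c$ to 1 and the critical point 1 to 1/c, so it exchanges the two markings, as expected.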
The parameter space $\mathcal{P}^{cm}(\theta)$ has two very special points: $P_1$, which corresponds to the conjugacy class of cubics of the form (2.1) with one critical point, and $P_{-1}$, which corresponds to the conjugacy class of those cubics whose critical points are centered. The pair $\{P_1, P_{-1}\}$ coincides with the set of fixed points of the natural $\mathbb{Z}_2$-action on $\mathcal{P}^{cm}(\theta)$.
To understand the implication of marking the critical points, let us also consider the space $\mathcal{P}(\theta)$ of affine conjugacy classes of cubics of the form (2.1), this time with no particular marking. Every cubic in (2.1) can be conjugated to a monic cubic of the form
$$z \mapsto \lambda z + a z^2 + z^3,$$
where $a \in \mathbb{C}$, and this polynomial is uniquely determined by ±a. In other words, the space $\mathcal{P}(\theta)$ is parametrized by the invariant $\zeta = a^2 \in \mathbb{C}$, hence it can be identified with the complex plane. Consider the map which sends every critically marked cubic in $\mathcal{P}^{cm}(\theta)$ to its unique monic representative in $\mathcal{P}(\theta)$. This amounts to “forgetting” the markings of the critical points. It is easy to check that in the coordinates we have chosen, this map $\mathcal{P}^{cm}(\theta) \to \mathcal{P}(\theta)$ is given by
$$\zeta = \frac{3}{4}\,\lambda c \left(1 + \frac{1}{c}\right)^2.$$
It follows that $\mathcal{P}^{cm}(\theta)$ is a double cover of $\mathcal{P}(\theta)$, branched over the points c = ±1. Note that by the above formula ζ(c) = ζ(1/c), as expected.
Notation and Terminology. Throughout this work, the Siegel disk of the cubic $P_c$ centered at the origin is denoted by $\Delta_c$. When we do not want to emphasize the dependence on c, we denote the Siegel disk of a cubic P by $\Delta_P$. By the grand orbit $GO(\Delta_P)$ we mean the set of all points in the plane which eventually map to the Siegel disk under the iteration of P. In other words,
$$GO(\Delta_P) = \bigcup_{k \ge 0} P^{-k}(\Delta_P).$$
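The formula for the forgetful map $c \mapsto \zeta$ given earlier in this section can be cross-checked against the coefficients of (2.2). The following sketch is an illustration only; the helper names are hypothetical. It uses the observation that conjugating $z \mapsto \lambda z + a_2 z^2 + a_3 z^3$ by a dilation $z \mapsto \alpha z$ with $\alpha^2 = a_3$ gives a monic cubic with $a = a_2/\alpha$, so that $a^2 = a_2^2/a_3$.

```python
import cmath
import math

def zeta(c, lam):
    """The invariant zeta = (3/4) * lam * c * (1 + 1/c)^2."""
    return 0.75 * lam * c * (1 + 1 / c) ** 2

def zeta_from_coefficients(c, lam):
    """a^2 = a_2^2 / a_3, with a_2 and a_3 read off from the normal form (2.2)."""
    a2 = -0.5 * lam * (1 + 1 / c)  # coefficient of z^2 in P_c
    a3 = lam / (3 * c)             # coefficient of z^3 in P_c
    return a2 ** 2 / a3

lam = cmath.exp(2j * math.pi * (math.sqrt(5) - 1) / 2)
c = 0.7 + 0.4j                     # arbitrary sample parameter
print(abs(zeta(c, lam) - zeta_from_coefficients(c, lam)))  # agreement up to rounding
print(abs(zeta(c, lam) - zeta(1 / c, lam)))                # symmetry under c -> 1/c
print(abs(zeta(-1, lam)))                                  # centered case: a = 0
```

Since $\zeta = \frac{3}{4}\lambda (c + 2 + 1/c)$, its derivative $\frac{3}{4}\lambda(1 - 1/c^2)$ vanishes exactly at c = ±1, which is consistent with the branching over these two points.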
Remark. From classical Fatou–Julia theory, we know that every point on the boundary of the Siegel disk $\Delta_c$ must be in the closure of the orbit of either c or 1 [M1, Corollary 11.4]. According to Herman [H1], $P_c|_{\partial\Delta_c}$ has a dense orbit. It follows that the orbit of either c or 1 must accumulate on the entire boundary of $\Delta_c$.
The “size” of a Siegel disk can be measured by the following invariant:
Definition (Conformal Capacity). Consider the Siegel disk $\Delta_c$ for $c \in \mathbb{C}^*$ and the unique linearizing map $h_c : \mathbb{D}(0, r_c) \xrightarrow{\;\simeq\;} \Delta_c$, with $h_c(0) = 0$ and $h_c'(0) = 1$. The radius $r_c > 0$ of the domain of $h_c$ is called the conformal capacity of $\Delta_c$ and is denoted by $\kappa(\Delta_c)$. Alternatively, $\kappa(\Delta_c)$ can be described as the derivative $\varphi_c'(0)$ of the unique linearizing map $\varphi_c : \mathbb{D} \xrightarrow{\;\simeq\;} \Delta_c$ normalized by $\varphi_c(0) = 0$ and $\varphi_c'(0) > 0$.
Naturally, one is interested in the behavior of the function $c \mapsto \kappa(\Delta_c)$. This function is upper semicontinuous [Y], so a priori it can jump to a lower value, meaning that the Siegel disk $\Delta_c$ can shrink by a very small perturbation of the cubic $P_c$. Later we will see that for θ of bounded type, the closed Siegel disk $\overline{\Delta}_c$ is a quasidisk which moves continuously in the Hausdorff topology on compact subsets of the plane (see Theorem 13.9). Therefore, in that case $\kappa(\Delta_c)$ is actually continuous as a function of c. On the other hand, for arbitrary θ of Brjuno type, I do not know if $c \mapsto \kappa(\Delta_c)$ is continuous. However, we have the following general theorem of Yoccoz [Y]:
Theorem 2.1. Let 0 < θ < 1 be an irrational number of Brjuno type, and set $W(\theta) = \sum_{n=1}^{\infty} (\log q_{n+1})/q_n < \infty$. Let S(θ) be the space of all univalent functions $f : \mathbb{D} \to \mathbb{C}$ with f(0) = 0 and $f'(0) = e^{2\pi i\theta}$, with the maximal Siegel disk $\Delta_f \subset \mathbb{D}$. Finally, define $\kappa(\theta) = \inf_{f \in S(\theta)} \kappa(\Delta_f)$. Then, there is a universal constant C > 0 such that $|\log(\kappa(\theta)) + W(\theta)| < C$.
Fig. 1. The connectedness locus M(θ) for θ = (√5 − 1)/2
We obtain the following statement which will be used in Theorem 5.3.
Corollary 2.2. In the family $\{P_c\}$ of cubic polynomials in (2.2), the conformal capacity function $c \mapsto \kappa(\Delta_c)$ is locally bounded away from 0.
Definition. We define the cubic connectedness locus M(θ) as the set of all critically marked cubics $P \in \mathcal{P}^{cm}(\theta)$ whose Julia sets J(P) are connected.
It follows from classical Fatou–Julia theory that P ∈ M(θ) if and only if both critical points of P have bounded orbits [M1, Theorem 17.3]:
$$M(\theta) = \{c \in \mathbb{C}^* : \text{the Julia set } J(P_c) \text{ is connected}\} = \{c \in \mathbb{C}^* : \text{both sequences } \{P_c^{\circ k}(c)\} \text{ and } \{P_c^{\circ k}(1)\} \text{ are bounded}\}.$$
Since $P_c$ and $P_{1/c}$ are affinely conjugate as maps when we neglect markings of their critical points, M(θ) as a subset of the c-plane is invariant under the mapping $c \mapsto 1/c$. Figure 1 shows the connectedness locus M(θ) for the golden mean $\theta = (\sqrt{5} - 1)/2 = 0.61803399...$ and Fig. 2 shows the details of the same set near the unit circle.
Proposition 2.3.
(a) M(θ) is compact and contained in the open annulus $A(\frac{1}{30}, 30)$.
(b) The complement $\mathbb{C}^* \smallsetminus M(\theta)$ has exactly two connected components $\Omega_{ext}$ and $\Omega_{int}$ which are mapped to one another by $c \mapsto 1/c$. Moreover,
$$\Omega_{ext} = \{c \in \mathbb{C}^* : P_c^{\circ k}(c) \to \infty \text{ as } k \to \infty\}, \qquad \Omega_{int} = \{c \in \mathbb{C}^* : P_c^{\circ k}(1) \to \infty \text{ as } k \to \infty\}.$$
Later we will prove that $\Omega_{ext}$ (hence $\Omega_{int}$) is homeomorphic to a punctured disk. This will show that M(θ) is a connected set (Theorem 6.1).
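The description of M(θ) by bounded critical orbits suggests a simple escape-time experiment. The sketch below is a hedged illustration, not the method used to produce Fig. 1: it relies on the escape radius $m_c = 4.38\max\{|c|, 1\}$ from (2.3) in the proof that follows, and the function names and the iteration cap are assumptions made for this example. A returned `None` only means that no escape was detected within the cap, so it is numerical evidence of boundedness and nothing more.

```python
import cmath
import math

def cubic(c, lam):
    """The normal form P_c of (2.2) for a given multiplier lam."""
    return lambda z: lam * z * (1 - 0.5 * (1 + 1 / c) * z + z * z / (3 * c))

def escape_time(c, lam, z0, max_iter=200):
    """First k with |P_c^k(z0)| > m_c, or None if no escape within max_iter steps."""
    P = cubic(c, lam)
    m_c = 4.38 * max(abs(c), 1.0)   # escape radius from (2.3)
    z = z0
    for k in range(max_iter + 1):
        if abs(z) > m_c:
            return k                # beyond m_c the orbit tends to infinity
        z = P(z)
    return None

def looks_connected(c, lam, max_iter=200):
    """Heuristic membership test for M(theta): neither critical orbit escapes."""
    return (escape_time(c, lam, c, max_iter) is None
            and escape_time(c, lam, 1.0, max_iter) is None)

lam = cmath.exp(2j * math.pi * (math.sqrt(5) - 1) / 2)
print(escape_time(40, lam, 40))         # critical point c escapes when |c| >= 30
print(escape_time(1 / 40, lam, 1.0))    # by symmetry, critical point 1 escapes
print(looks_connected(40, lam))
```

Testing the two critical orbits separately also distinguishes the two escape regions: an escaping orbit of c corresponds to $\Omega_{ext}$ and an escaping orbit of 1 to $\Omega_{int}$.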
Fig. 2. Details of the same connectedness locus near the unit circle
Proof. (a) M(θ) is clearly closed. Let
$$m_c = (4.38)\max\{|c|, 1\}. \tag{2.3}$$
If $|z| \ge m_c$, then
$$|P_c(z)| \ge \left( \frac{1}{|c|}\left( \frac{1}{3}|z| - \frac{1}{4.38}|z| \right)|z| - 1 \right)|z| \ge (0.46|z| - 1)|z| \ge 1.0148\,|z|,$$
from which it follows that
$$K(P_c) \subset \mathbb{D}(0, m_c), \tag{2.4}$$
where $K(P_c)$ is the filled Julia set of $P_c$. Now if |c| ≥ 30, then
$$|P_c(c)| = \left| \frac{1}{6}c - \frac{1}{2} \right| |c| \ge (4.5)|c| > m_c,$$
which implies $P_c^{\circ k}(c) \to \infty$ as k → ∞. Therefore $M(\theta) \subset \mathbb{D}(0, 30)$, hence by symmetry $M(\theta) \subset A(\frac{1}{30}, 30)$.
(b) Let $\Omega_{ext}$ be the unbounded connected component of $\mathbb{C}^* \smallsetminus M(\theta)$. Since M(θ) is invariant under $c \mapsto 1/c$, there exists a corresponding component $\Omega_{int}$ of the complement of M(θ) containing a punctured neighborhood of the origin. By the proof of (a), we have
$$\Omega_{ext} \subset \{c \in \mathbb{C}^* : P_c^{\circ k}(c) \to \infty \text{ as } k \to \infty\}$$
and similarly
$$\Omega_{int} \subset \{c \in \mathbb{C}^* : P_c^{\circ k}(1) \to \infty \text{ as } k \to \infty\}.$$
Suppose that there exists a bounded connected component U of $\mathbb{C}^* \smallsetminus M(\theta)$ which is not $\Omega_{int}$. Then
$$0 < \sup_{c \in \partial U} |c| = R < +\infty.$$
If c ∈ ∂U, it follows from (2.4) that for each k ≥ 0, $|P_c^{\circ k}(c)|$ and $|P_c^{\circ k}(1)|$ are not greater than $m_c$, and
$$\sup_{c \in \partial U} m_c \le (4.38)\max\{R, 1\} < +\infty.$$
Since $U \ne \Omega_{int}$, we have ∂U ⊂ ∂M(θ) and both $P_c^{\circ k}(c)$ and $P_c^{\circ k}(1)$ are holomorphic in U as functions of c. It follows from the Maximum Principle that the iterates $P_c^{\circ k}(c)$ and $P_c^{\circ k}(1)$ are uniformly bounded throughout U, which is a contradiction. □
3. Components of the Interior of M(θ)
First we give the following dynamical characterization of the boundary of the connectedness locus M(θ), which is reminiscent of the similar property of the Mandelbrot set. For terminology and basic results on holomorphic motions and J-stability, see for example [Mc2].
Theorem 3.1 (Boundary of M(θ) is Unstable). The boundary ∂M(θ) is the set of parameters for which the corresponding cubics are not J-stable in $\mathcal{P}^{cm}(\theta)$.
Proof. A polynomial $P_{c_0} \in \mathcal{P}^{cm}(\theta)$ is J-stable if and only if both sequences $\{P_c^{\circ k}(c)\}$ and $\{P_c^{\circ k}(1)\}$ are normal for c in a neighborhood of $c_0$ ([Mc2, Theorem 4.2]). If $c_0 \in \Omega_{ext}$, then $c_0$ escapes to infinity under iterations of $P_{c_0}$, while 1 has bounded orbit. For c close to $c_0$, the orbit of c under $P_c$ will still converge to infinity while 1 will have bounded orbit, with a bound given by $m_c$ in (2.3). It follows from Montel’s theorem that both sequences are normal throughout a neighborhood of $c_0$. Hence $c_0$ is J-stable. Similarly, every $P_{c_0}$ with $c_0 \in \Omega_{int}$ is J-stable.
If $c_0$ belongs to the interior of M(θ), then both $c_0$ and 1 will have orbits contained in $\mathbb{D}(0, m_{c_0})$ and the same holds for all c sufficiently close to $c_0$. Again both sequences $\{P_c^{\circ k}(c)\}$ and $\{P_c^{\circ k}(1)\}$ are normal in a neighborhood of $c_0$.
Finally, if $c_0$ belongs to the boundary of M(θ), then a small perturbation will make either c or 1 escape to infinity. Hence at least one of the sequences $\{P_c^{\circ k}(c)\}$ or $\{P_c^{\circ k}(1)\}$ fails to be normal in any neighborhood of $c_0$. □
Corollary 3.2. Let $P_{c_0} \in \mathcal{P}^{cm}(\theta)$ have an indifferent periodic orbit other than the fixed point at the origin. Then $c_0 \in \partial M(\theta)$.
Proof. Otherwise $c_0$ will be a J-stable parameter by the above theorem.
But any stable indifferent cycle has to be persistent ([Mc2, Theorem 4.2]). So the indifferent cycle
$$z(c_0) \mapsto P_{c_0}(z(c_0)) \mapsto \cdots \mapsto P_{c_0}^{\circ (k-1)}(z(c_0)) \mapsto z(c_0)$$
can be continued analytically to the whole plane as a function of c and the multiplier function $c \mapsto (P_c^{\circ k})'(z(c))$ remains constant. This is clearly impossible, since for example when $c = 3 - 6\bar{\lambda}$, $P_c(c) = c$ is a superattracting fixed point, hence there cannot be any indifferent periodic point other than 0. □
Definition (Types of Components). A component U of the interior of M(θ) is called hyperbolic-like if for every c ∈ U, the orbit of either c or 1 under $P_c$ converges to an attracting cycle. U is called a capture component if for every c ∈ U, either c or 1
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
eventually maps into the Siegel disk $\Delta_c$. In case U is neither hyperbolic-like nor capture, we call it a queer component. We say that $P_c$ is hyperbolic-like, capture, or queer if the corresponding parameter c belongs to such a component.
For example, there is a hyperbolic-like component in the form of the main cardioid of a large copy of the Mandelbrot set in the lower right corner of Fig. 1. For every c in this component, the orbit of the critical point c of $P_c$ converges to an attracting fixed point. On the other hand, the large component which is attached on the right side of the unit circle to c = 1 is a capture, consisting of all c for which $P_c(c)$ belongs to $\Delta_c$.
Figures 3–7 show examples of the filled Julia sets of cubics in $\mathcal{P}^{cm}(\theta)$ for $\theta = (\sqrt{5} - 1)/2$. Figure 3 is the filled Julia set of a hyperbolic-like cubic. The large topological disk in black is the immediate basin of attraction of an attracting fixed point. Figure 4 is the filled Julia set of a capture, with a critical point in the large preimage of the Siegel disk on the right. The cubic in Fig. 5 is located at the “cusp” of the large cardioid in the lower right corner of Fig. 1, hence it has a parabolic fixed point. Figure 6 has two critical points on the boundary of its Siegel disk. Finally, the cubic in Fig. 7 belongs to $\Omega_{ext}$, so it has a disconnected Julia set. There are countably many components each quasiconformally homeomorphic to the quadratic Siegel filled Julia set with the same rotation number. The uncountably many remaining components are single points.
In the above definition, we tacitly assumed that hyperbolic-like or capture cubics define components of the interior of M(θ). The condition of being hyperbolic-like is clearly open. It is also closed in the interior of M(θ) since by Theorem 3.1 a cubic P in the interior of M(θ) is J-stable, so in a small neighborhood of it the number of attracting
Fig. 7.
cycles remains constant [Mc2, Theorem 4.2]. This number is 1 if P is accumulated by hyperbolic-like cubics.
Now consider the property of being capture for $P \in \mathcal{P}^{cm}(\theta)$. It follows from Theorem 3.1 that when a capture cubic P belongs to the interior of M(θ), there is an open neighborhood of P consisting of captures. Let V be the component of the interior of the set of capture cubics containing P. Similarly, define U to be the component of the interior of M(θ) containing P. Clearly V ⊂ U. If they are not equal, choose a cubic Q ∈ ∂V ∩ U. Since Q is J-stable, for all Q′ in a small neighborhood of Q, a critical point of Q′ belongs to the Fatou set of Q′ if and only if the corresponding critical point of Q belongs to the Fatou set of Q. If we choose Q′ ∈ V, there is a critical point of Q′ which hits the Siegel disk $\Delta_{Q'}$. It follows that the same is true for Q, hence Q is capture, which contradicts Q ∈ ∂V. This proves V = U. In other words, when a capture cubic P belongs to the interior of M(θ), the entire component of the interior of M(θ) containing P consists of captures, hence the name “capture component”. However, the above argument does not rule out the possibility of a capture being on the boundary of the connectedness locus M(θ). In fact, that the capture condition is open follows from a different type of argument which is standard in deformation theory of rational maps (see Theorem 5.3).
Conjecturally, queer components do not exist. But if they do, every cubic in a queer component exhibits an outstanding property: It admits an invariant line field on its Julia set, and in particular, its Julia set has positive Lebesgue measure.
The proof of this fact depends on the harmonic λ-lemma of Bers and Royden [BR] as well as the elementary observation of Sullivan [Su2] that if the boundary of a Siegel disk moves holomorphically in a family of rational maps, then there is a choice of holomorphically varying Riemann maps for the Siegel disks (also see the new expanded version [McS]). There is a technical difficulty showing up in the proof: For a general θ of Brjuno type, it is not known whether the boundary of the Siegel disk of a P ∈ P cm (θ ) is a Jordan curve. For this reason, the
extension of holomorphic motions to the grand orbits of Siegel disks will require some extra work. We will repeatedly use the following lemma of L. Bers [B], [DH2]:
Lemma 3.3 (Bers Sewing Lemma). Let $E \subset \mathbb{C}$ be closed and U and V be two open neighborhoods of E. Let $\varphi : U \xrightarrow{\;\simeq\;} \varphi(U)$ and $\psi : V \xrightarrow{\;\simeq\;} \psi(V)$ be two homeomorphisms such that
• ϕ is $K_1$-quasiconformal,
• $\psi|_{V \smallsetminus E}$ is $K_2$-quasiconformal,
• $\varphi|_{\partial E} = \psi|_{\partial E}$.
Then the map $\varphi \amalg \psi$ defined on V by
$$(\varphi \amalg \psi)(z) = \begin{cases} \varphi(z) & z \in E \\ \psi(z) & z \in V \smallsetminus E \end{cases}$$
is a K-quasiconformal homeomorphism with $K = \max\{K_1, K_2\}$. Moreover, $\bar\partial(\varphi \amalg \psi) = \bar\partial\varphi$ almost everywhere on E.
Theorem 3.4 (Invariant Line Fields for Queer Cubics). Let U be a queer component of the interior of M(θ). Then for any c ∈ U, the Julia set $J(P_c)$ has positive Lebesgue measure and supports an invariant line field.
Proof. Fix some $c_0 \in U$. We first note that every Fatou component of $P_{c_0}$ eventually maps to the Siegel disk $\Delta_{c_0}$ and the mapping is a conformal isomorphism: There cannot be further attracting cycles (since $P_{c_0}$ is not hyperbolic-like) or indifferent periodic orbits (see Corollary 3.2). In particular, $K(P_{c_0}) = \overline{GO(\Delta_{c_0})}$. Choose some c ∈ U with $c \ne c_0$, and let
$$\varphi_c : \mathbb{C} \smallsetminus K(P_{c_0}) \xrightarrow{\;\simeq\;} \mathbb{C} \smallsetminus K(P_c)$$
be the conformal conjugacy given by composition of the Böttcher maps of $P_{c_0}$ and $P_c$. A brief computation using the normal form (2.2) shows that $\varphi_c(z) = \sqrt{c/c_0}\, z + O(1)$ and we can choose the branch of the square root near $c_0$ for which $\varphi_{c_0}(z) = z$. Since $\varphi_c$ depends holomorphically on c, it defines a holomorphic motion of $\mathbb{C} \smallsetminus K(P_{c_0})$. By the harmonic λ-lemma [BR], this motion extends to a unique holomorphic motion $\hat\varphi_c$ of the entire plane, which is now defined only for c in a small neighborhood V of $c_0$, with the following properties:
• For every c ∈ V, $\hat\varphi_c$ is a quasiconformal homeomorphism of the plane.
• For every c ∈ V, the Beltrami differential $\dfrac{\bar\partial\hat\varphi_c}{\partial\hat\varphi_c}\,\dfrac{d\bar z}{dz}$ is harmonic in $GO(\Delta_{c_0})$.
It is easy to see that uniqueness of this extended motion implies that $\hat\varphi_c$ conjugates $P_{c_0}$ to $P_c$ on the entire plane (compare [McS]). In fact, one can replace $\hat\varphi_c$ by $P_c^{-1} \circ \hat\varphi_c \circ P_{c_0}$ on $GO(\Delta_{c_0})$, which also extends $\varphi_c$, where the branch of $P_c^{-1}$ is determined uniquely by the values of $\hat\varphi_c$ on the Julia set $J(P_{c_0})$. Hence $\hat\varphi_c = P_c^{-1} \circ \hat\varphi_c \circ P_{c_0}$ by uniqueness.
Next, we want to show that the restriction $\hat\varphi_c : GO(\Delta_{c_0}) \to GO(\Delta_c)$ is a conformal conjugacy. As Sullivan observes in [Su2], the fact that the boundary of $\Delta_c$ moves holomorphically for c ∈ U (Theorem 3.1) implies that there is a choice of the Riemann map $\zeta_c : \mathbb{D} \to \Delta_c$ such that $\zeta_c(0) = 0$ and $c \mapsto \zeta_c$ is holomorphic in c. Define a conformal
conjugacy ψc : 1c0 → 1c by ψc = ζc ◦ ζc−1 , and extend it to a conformal conjugacy 0 ψc : GO(1c0 ) → GO(1c ) by taking pull-backs as follows. Take any component W of (1c0 ) and let Wc = b ϕc (W ) be the corresponding component of Pc−n (1c ). Define Pc−n 0 . Since c 7 → ψc is holomorphic and ψc = id on ψc : W → Wc by ψc = Pc−n ◦ ψc ◦ Pc◦n 0 GO(1c0 ) when c = c0 , it follows that ψc defines a holomorphic motion of GO(1c0 ). bc of the entire By the harmonic λ-lemma, ψc extends to a unique holomorphic motion ψ plane which is defined for c in a neighborhood V 0 of c0 and has harmonic Beltrami difϕc , it follows that ferential on CrK(Pc0 ). By an argument similar to the one we used for b bc respects the dynamics, i.e., it conjugates Pc0 to Pc on the entire plane. In particular, ψ it sends the marked critical point c0 of Pc0 to the marked critical c of Pc . Let us assume for example that the forward orbit of c0 accumulates on the boundary of 1c0 . Then the ϕc was also a conjugacy to begin with, for all c ∈ V ∩ V 0 same is true for c and 1c . Since b bc (c0 ) = c = b bc (Pc◦k (c0 )) = Pc◦k (c) = b we have ψ ϕc (c0 ), and by induction ψ ϕc (Pc◦k (c0 )) 0 0 for all k. Since every point on the boundary of 1c0 is in the closure of the forward orbit bc and b bc and b ϕc agree on ∂1c0 . Evidently this shows that ψ ϕc of c0 , we conclude that ψ agree on the boundary of every bounded Fatou component of Pc0 , hence on the entire Julia set J (Pc0 ). It follows then from the Bers Sewing Lemma 3.3 that ϕc q ψc defined by b ϕ (z) z ∈ C r GO(1c0 ) (ϕc q ψc )(z) = bc ψc (z) z ∈ GO(1c0 ) is a quasiconformal homeomorphism which has harmonic Beltrami differential in C r J (Pc0 ). Note that ϕc qψc is an extension of both ϕc and ψc . By uniqueness, we conclude bc . In particular, when c ∈ V ∩ V 0 , b ϕc is conformal away from the Julia set that b ϕc ≡ ψ J (Pc0 ). ϕc would have been conformal, conNow, if the Julia set J (Pc0 ) had measure zero, b tradicting c 6 = c0 . 
So $J(P_{c_0})$ has positive measure. The desired invariant line field is then given by $\hat{\varphi}_c^*(\sigma_0)$, the pull-back of the standard conformal structure $\sigma_0$ on the plane by $\hat{\varphi}_c$. □

The existence of holomorphic motions in the above proof was the crucial fact which made the conformal extensions possible. In the case we have “static” quasiconformal conjugacies, such conformal extensions are still possible once we assume that the boundaries of Siegel disks are Jordan curves.

Let $\Delta$ be a Jordan domain containing the origin and $R_t : z \mapsto e^{2\pi i t} z$ be the rigid rotation on the unit circle. Let $\zeta : \Delta \to \mathbb{D}$ be any conformal isomorphism with $\zeta(0) = 0$. Then the homeomorphism $h_t^{\Delta} : \partial\Delta \to \partial\Delta$ defined by $h_t^{\Delta} = \zeta^{-1} \circ R_t \circ \zeta$ is the intrinsic rotation of $\partial\Delta$ by angle $t$. By the Schwarz Lemma, $h_t^{\Delta}$ is independent of the choice of $\zeta$. Now suppose $\Delta_1$ and $\Delta_2$ are two Jordan domains containing 0 and $t$ is irrational. Let $\varphi : \partial\Delta_1 \to \partial\Delta_2$ be a homeomorphism satisfying $\varphi \circ h_t^{\Delta_1} = h_t^{\Delta_2} \circ \varphi$. Then two points $a_1 \in \Delta_1$ and $a_2 \in \Delta_2$ have the same conformal position with respect to $\varphi$ if $\zeta_1(a_1) = \zeta_2(a_2)$, where the $\zeta_j : \Delta_j \to \mathbb{D}$ are conformal isomorphisms with $\zeta_j(0) = 0$ and $\zeta_1 = \zeta_2 \circ \varphi$ on $\partial\Delta_1$.

Lemma 3.5 (Extending QC Conjugacies). Let $P$ and $Q$ be two cubics in $\mathcal{P}^{cm}(\theta)$ such that the boundaries of the Siegel disks $\Delta_P$ and $\Delta_Q$ are Jordan curves. Let $\varphi : \mathbb{C} \to \mathbb{C}$ be a quasiconformal homeomorphism whose restriction $\mathbb{C} \smallsetminus GO(\Delta_P) \to \mathbb{C} \smallsetminus GO(\Delta_Q)$ conjugates $P$ to $Q$. Then
(a) If $P$ is not capture, there exists a quasiconformal homeomorphism $\psi : \mathbb{C} \to \mathbb{C}$ which conjugates $P$ and $Q$, which is conformal on $GO(\Delta_P)$ and agrees with $\varphi$ on $\mathbb{C} \smallsetminus GO(\Delta_P)$.
S. Zakeri
Fig. 8. Extending ϕ in the capture case
(b) If $P$ is capture, we can construct a $\psi$ as in (a) if and only if the captured images of the critical points of $P$ and $Q$ in $\Delta_P$ and $\Delta_Q$ have the same conformal position with respect to $\varphi$.

Proof. (a) Fix some $b_1 \in \partial\Delta_P$ and let $b_2 = \varphi(b_1)$. Consider conformal isomorphisms $\zeta_1 : \Delta_P \xrightarrow{\simeq} \mathbb{D}$ and $\zeta_2 : \Delta_Q \xrightarrow{\simeq} \mathbb{D}$, with $\zeta_1(0) = 0 = \zeta_2(0)$ and $\zeta_1(b_1) = 1 = \zeta_2(b_2)$, which conjugate $P$ on $\Delta_P$ and $Q$ on $\Delta_Q$ to the rigid rotation $R_\theta : z \mapsto e^{2\pi i\theta} z$ on $\mathbb{D}$. Since the boundaries of $\Delta_P$ and $\Delta_Q$ are Jordan curves, $\zeta_1$ and $\zeta_2$ extend homeomorphically to the closures. The composition $\psi = \zeta_2^{-1} \circ \zeta_1 : \Delta_P \to \Delta_Q$ is conformal and conjugates $P$ on $\Delta_P$ to $Q$ on $\Delta_Q$. Also $\psi(b_1) = \varphi(b_1) = b_2$ and by induction $\psi(P^{\circ k}(b_1)) = Q^{\circ k}(b_2) = \varphi(P^{\circ k}(b_1))$. Since the orbit of $b_1$ is dense on the boundary of $\Delta_P$, we have $\psi|_{\partial\Delta_P} = \varphi|_{\partial\Delta_P}$. Therefore, $\psi$ gives the required extension of $\varphi$ to the Siegel disk $\Delta_P$. It is now easy to extend $\psi$ to the grand orbit $GO(\Delta_P)$: $P^{\circ k}$ maps any component of $P^{-k}(\Delta_P)$ isomorphically onto $\Delta_P$. Hence we can define $\psi$ on any such component as the composition $Q^{-k} \circ \psi|_{\Delta_P} \circ P^{\circ k}$, where the branch of $Q^{-k}$ is determined by the values of $\varphi$ on the Julia set $J(P)$. Clearly this composition is conformal inside this component and agrees with $\varphi$ on its boundary. The map $\psi$ defined this way is a quasiconformal homeomorphism by the Bers Sewing Lemma 3.3, with $U = V = \mathbb{C}$ and $E = \mathbb{C} \smallsetminus GO(\Delta_P)$.

(b) Now let $P$ be capture. The construction of $\psi$ goes through as in case (a) except for the last part where we want to extend $\psi$ by taking pull-backs. Suppose that there exists a positive integer $k$ such that the critical point $c_1$ of $P$ belongs to the component $U_1$ of $P^{-k}(\Delta_P)$. Let $V_1 = P(U_1)$ and let $v_1 = P(c_1)$ be the critical value in $V_1$. Since $P : \partial U_1 \to \partial V_1$ is a double covering and $\varphi$ conjugates $P$ to $Q$ on the Julia sets, there must be a critical point $c_2$ of $Q$ in a component $U_2$ of $Q^{-k}(\Delta_Q)$, with $\partial U_2 = \varphi(\partial U_1)$. Similarly define $V_2$ and $v_2$.
By the proof of part (a) we can define $\psi$ inductively up to the $(k-1)$th preimages of $\Delta_P$, including $V_1$. This gives us a conformal isomorphism $\psi : V_1 \to V_2$ which necessarily maps $v_1$ to $v_2$, because by our assumption $P^{\circ k}(c_1)$ and $Q^{\circ k}(c_2)$ have the same conformal position in $\Delta_P$ and $\Delta_Q$ and so one gets mapped to the other by $\psi|_{\Delta_P}$. Choose any simple arc $\gamma_1$ in $V_1$ connecting $v_1$ to some boundary point $\beta_1$. The simple arc $\gamma_2 = \psi(\gamma_1)$ in $V_2$ connects $v_2$ to the boundary point $\beta_2 = \psi(\beta_1)$.
Pull $\gamma_1$ back by $P$ to get two branches of a simple arc passing through the critical point $c_1$ with two distinct endpoints $\alpha_1$ and $\alpha_1'$ on the boundary of $U_1$. Similarly we consider the pull-back of $\gamma_2$ by $Q$ and we get two endpoints on the boundary of $U_2$, which we label as $\alpha_2 = \varphi(\alpha_1)$ and $\alpha_2' = \varphi(\alpha_1')$ (see Fig. 8). Now the inverse $Q^{-1}$ can be defined analytically over $V_2 \smallsetminus \gamma_2$ and has two branches which take values in two different connected components of $U_2 \smallsetminus Q^{-1}(\gamma_2)$. Define $\psi$ on $U_1$ as the composition $Q^{-1} \circ \psi \circ P$, where the boundary orientation tells us which of the two branches of $Q^{-1}$ has to be taken. This way we extend $\psi$ to $U_1$, and $\psi$ can then be defined on further preimages of $\Delta_P$ as in case (a). □

4. Renormalizable Cubics

This section briefly studies the class of renormalizable cubics in $\mathcal{P}^{cm}(\theta)$. These are the cubics with disjoint critical orbits from which one can extract the quadratic $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ by straightening. From a different point of view, one may consider a renormalizable cubic with connected Julia set as the result of “intertwining” the quadratic $Q_\theta$ with another quadratic with connected Julia set (compare [EY]). For background on polynomial-like maps, straightening and hybrid classes, see for example [DH2].

Definition. A cubic $P \in \mathcal{P}^{cm}(\theta)$ is called renormalizable if there exists a pair of Jordan domains $U$ and $V$, with $0 \in U \Subset V$, such that the restriction $P|_U : U \to V$ is a quadratic-like map hybrid equivalent to $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$.

When θ is irrational of bounded type, it follows from the work of Douady-Ghys-Herman-Shishikura [D2] that the boundary of the Siegel disk of $Q_\theta$ is a quasicircle passing through the critical point. Hence the same is true for the Siegel disk $\Delta_P$ when $P$ is renormalizable. To prove the next theorem, we need the following useful lemma of Kiwi in [K]. This lemma in particular shows that each indifferent cycle for a cubic $P \in \mathcal{P}^{cm}(\theta)$ must attract its own critical point.

Lemma 4.1 (Separation Lemma). Let $P$ be a polynomial with connected Julia set. Then there exists a finite collection of closed preperiodic external rays, separating the plane into disjoint open simply-connected sets $\{U_j\}$, such that:
• Each $U_j$ contains at most one non-repelling periodic point or periodic Fatou component of $P$.
• If $z_1 \mapsto \cdots \mapsto z_p \mapsto z_1$ is a non-repelling cycle meeting $U_{i_1} \mapsto \cdots \mapsto U_{i_p} \mapsto U_{i_1}$, then $\bigcup_{j=1}^{p} U_{i_j}$ contains the entire orbit of at least one critical point of $P$.

Theorem 4.2. A cubic $P \in \mathcal{P}^{cm}(\theta)$ is renormalizable if either of the following conditions holds:
(a) $P$ has a non-repelling periodic orbit other than 0 which is not parabolic.
(b) $P$ has disconnected Julia set.

Proof. First assume that we are in case (a) so that $J(P)$ is connected. Let $\mathcal{R}$ be the finite collection of the closed preperiodic external rays given by the Separation Lemma 4.1. Let $V$ be the component of $\mathbb{C} \smallsetminus \mathcal{R}$ which contains 0, cut off by an equipotential of $K(P)$. Finally, let $U$ be the component of $P^{-1}(V)$ containing 0. Since all the rays in $\mathcal{R}$ are preperiodic, $P(\mathcal{R}) \subset \mathcal{R}$, hence $U \subset V$. $U$ necessarily contains a critical point of $P$ since
Fig. 9. Filled Julia set of the quadratic $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ for $\theta = (\sqrt{5} - 1)/2$
otherwise the Schwarz lemma and $|P'(0)| = 1$ would imply that $U = V$ and $P|_U : U \to V$ is a conformal isomorphism conjugate to a rotation. This would contradict the fact that $U$ intersects the basin of attraction of infinity for $P$. The other critical point of $P$ has to stay away from $V$ because by the second part of the Separation Lemma its entire orbit lives in the cycle of components of $\mathbb{C} \smallsetminus \mathcal{R}$ which contains the non-repelling periodic orbit of $P$. Since by our assumption the non-repelling cycle of $P$ is not parabolic, the landing points of the external rays in $\mathcal{R}$ must all be repelling. Therefore, by a simple “thickening” procedure (see for example [M3]), we can assume that $\overline{U} \subset V$, so that $P|_U : U \to V$ is a quadratic-like map. Up to affine conjugation, there is only one quadratic polynomial which has a fixed Siegel disk of rotation number θ, so this quadratic-like map has to be hybrid equivalent to $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$.

Now suppose that $J(P)$ is disconnected. For $\varepsilon > 0$, let $U_\varepsilon$ be the connected component of $\{z \in \mathbb{C} : G_P(z) < \varepsilon\}$ containing the Siegel disk $\Delta_P$, where $G_P : \mathbb{C} \to \{x \in \mathbb{R} : x \geq 0\}$ is the Green's function of $K(P)$. It is not hard to see that for small $\varepsilon$, $P|_{U_\varepsilon} : U_\varepsilon \to U_{3\varepsilon}$ is a quadratic-like map, necessarily hybrid equivalent to $Q_\theta$. □

Figures 3 and 7 demonstrate the above theorem. In either example, one can see the filled Julia set of the quadratic-like restriction $P|_U : U \to V$ given by the above theorem, which is quasiconformally homeomorphic to the filled Julia set of $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ in Fig. 9.

Remark. When $P \in \mathcal{P}^{cm}(\theta)$ has a parabolic cycle, we can no longer expect to extract $Q_\theta$ from it by straightening. However, there must be a homeomorphic embedding $K(Q_\theta) \to K(P)$, conformal in the interior of $K(Q_\theta)$, which conjugates $Q_\theta$ to $P$. This can be proved directly when θ is of bounded type (by using Theorem 13.7), and in the general case by using the parabolic surgery recently introduced in [Ha].

Corollary 4.3. Let θ be an irrational number of bounded type. Let $P \in \mathcal{P}^{cm}(\theta)$ be hyperbolic-like or have disconnected Julia set $J(P)$. Then $J(P)$ has Lebesgue measure zero.
Proof. Let $P|_U : U \to V$ be the quadratic-like restriction given by Theorem 4.2 and let $K$ be its filled Julia set. Since this restriction is hybrid equivalent to $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$, whose Julia set has measure zero by the theorem of Petersen [Pe], we simply conclude that $\partial K$ has Lebesgue measure zero. It is well-known that the forward orbit of almost every point $z \in J(P)$ accumulates on the ω-limit set of the critical points of $P$ [Ly, Prop. 1.14], which in this case is just $\partial\Delta_P$ together with the attracting periodic orbit (resp. $\partial\Delta_P$) if $P$ is hyperbolic-like (resp. with disconnected Julia set). So the orbit of almost every $z \in J(P)$ accumulates on $\partial\Delta_P$. This implies that for all $n \geq N = N(z)$, $P^{\circ n}(z) \in V$. This can happen only if $P^{\circ N}(z) \in \partial K$, or equivalently $z \in P^{-N}(\partial K)$. We conclude that, up to a set of measure zero, $J(P) = \bigcup_{N \geq 0} P^{-N}(\partial K)$. But the right-hand side has measure zero because $\partial K$ does. This proves that $J(P)$ has Lebesgue measure zero as well. □
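As a numerical aside, the quadratic $Q_\theta$ of Fig. 9 can be explored with a few lines of Python. The sketch below is an illustration only and not part of the argument: it iterates the critical point $-e^{2\pi i\theta}/2$ of $Q_\theta$ for the golden mean $\theta = (\sqrt{5}-1)/2$ and records the largest modulus reached. The function name `critical_orbit` and the iteration count are assumptions made here; floating-point iteration gives evidence of boundedness, not a proof.

```python
import cmath
import math

def critical_orbit(theta, n_iter=2000):
    """Forward orbit of the critical point of Q_theta(z) = lam*z + z^2."""
    lam = cmath.exp(2j * math.pi * theta)
    z = -lam / 2              # solves Q_theta'(z) = lam + 2z = 0
    orbit = []
    for _ in range(n_iter):
        z = lam * z + z * z
        orbit.append(z)
    return orbit

theta = (math.sqrt(5) - 1) / 2          # golden mean, of bounded type
orbit = critical_orbit(theta)
max_modulus = max(abs(z) for z in orbit)
# Any orbit leaving the closed disk of radius 2 escapes to infinity,
# so a bounded critical orbit must stay inside that disk.
print("largest modulus along the critical orbit:", max_modulus)
```

Plotting the points of `orbit` traces out an approximation of the boundary of the Siegel disk, since for bounded type θ that boundary passes through the critical point.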
5. Quasiconformal Conjugacy Classes

In this section we characterize the quasiconformal conjugacy classes in $\mathcal{P}^{cm}(\theta)$. A central role is played by the following:

Theorem 5.1 (Parametrization of QC Conjugacy Classes). Let $P_{c_0}$, $P_{c_1}$ be distinct cubics in $\mathcal{P}^{cm}(\theta)$ and let $\varphi : \mathbb{C} \to \mathbb{C}$ be a $K$-quasiconformal homeomorphism which conjugates $P_{c_0}$ to $P_{c_1}$, i.e., $\varphi \circ P_{c_0} = P_{c_1} \circ \varphi$ and $\varphi(c_0) = c_1$. Then there exists a holomorphic map $t \mapsto c_t$ from an open disk $D(0, r)$ ($r > 1$) into $\mathbb{C}^*$ which maps 0 to $c_0$ and 1 to $c_1$, such that for every $t \in D(0, r)$, $P_{c_0}$ is conjugate to $P_{c_t}$ by a $K_t$-quasiconformal homeomorphism $\varphi_t : \mathbb{C} \to \mathbb{C}$. Moreover, $K_t \to 1$ as $t \to 0$.

Proof. The idea of the proof is standard in Holomorphic Dynamics (see [Su2, DH2]); however, we briefly sketch it here because similar arguments appear again in the rest of this work. Define a conformal structure σ on $\mathbb{C}$ by $\sigma = \varphi^* \sigma_0$, where, as usual, $\sigma_0$ is the standard conformal structure on $\mathbb{C}$. (To simplify the notation, in what follows we identify a conformal structure on $\mathbb{C}$ with its associated Beltrami differential.) Since $P_{c_1}$ is holomorphic, $P_{c_0}$ has to preserve σ. Since φ is quasiconformal, $\|\sigma\|_\infty < 1$. Define a one-parameter family $\{\sigma_t\}$ of complex-analytic deformations of σ by $\sigma_t = t\sigma$, where $t \in D(0, r)$ and $r > 1$ is chosen such that $r\|\sigma\|_\infty < 1$. By the Measurable Riemann Mapping Theorem [AB], there exists a unique quasiconformal homeomorphism $\varphi_t$ of the plane which solves the Beltrami equation $\varphi_t^* \sigma_0 = \sigma_t$ and fixes 0, 1 and ∞. Define $P^t = \varphi_t \circ P_{c_0} \circ \varphi_t^{-1}$. Since $P_{c_0}$ is holomorphic, it acts as a pure rotation on Beltrami differentials. Hence $P_{c_0}^* \sigma = \sigma$ implies $P_{c_0}^* \sigma_t = \sigma_t$ and therefore $P^t$ is a quasiregular self-map of the plane which preserves $\sigma_0$ and is conjugate to a cubic polynomial. It is then easy to see that $P^t$ itself is a cubic polynomial with a fixed Siegel disk of rotation number θ centered at 0 with a marked critical point at $z = 1$.
Note that $t \mapsto \sigma_t$ is holomorphic, so the same is true for $t \mapsto \varphi_t$ and hence $t \mapsto P^t$ by the analytic dependence of the solutions of the Beltrami equation on parameters [AB]. Therefore the map $t \mapsto c_t$ which defines the second critical point of $P^t$, so that $P^t = P_{c_t}$, is holomorphic. It is easy to see that $c_t$ has all the required properties. □

Corollary 5.2. Quasiconformal conjugacy classes in $\mathcal{P}^{cm}(\theta)$ are either single points or open and connected. In particular, cubics on the boundary $\partial\mathcal{M}(\theta)$ are quasiconformally rigid, i.e., their conjugacy classes are single points.
Theorem 5.3 (Capture is an open condition). Let $P_{c_0}$ be a capture cubic. Then there is an open neighborhood $U \subset \mathcal{P}^{cm}(\theta)$ of $c_0$ such that for every $c \in U$, $P_c$ is also capture. In particular, capture cubics belong to the interior of the connectedness locus $\mathcal{M}(\theta)$.

Proof. To fix the ideas, let us assume that $P_{c_0}^{\circ k}(c_0) \in \Delta_{c_0}$ and $k \geq 1$ is the smallest such integer. First assume that $P_{c_0}^{\circ k}(c_0) \neq 0$. Let $A \subset \Delta_{c_0}$ be the annulus bounded by $\partial\Delta_{c_0}$ and the analytic invariant curve in $\Delta_{c_0}$ passing through $P_{c_0}^{\circ k}(c_0)$. Take a conformal isomorphism $\psi : A \xrightarrow{\simeq} A(1, \varepsilon)$, with $\varepsilon = e^{2\pi\,\mathrm{mod}(A)} > 1$, which conjugates $P_{c_0}$ on $A$ to the rotation on $A(1, \varepsilon)$. Postcompose ψ with a (non-conformal) dilation $A(1, \varepsilon) \to A(1, \varepsilon^2)$ to get a quasiconformal homeomorphism $\varphi : A \to A(1, \varepsilon^2)$ conjugating $P_{c_0}$ to the rotation. Define a $P_{c_0}$-invariant conformal structure σ on $\mathbb{C}$ by putting $\sigma = \varphi^* \sigma_0$ on $A$ and pulling it back by the inverse branches of $P_{c_0}$ to the entire grand orbit of $A$. Set $\sigma = \sigma_0$ elsewhere. As in the proof of Theorem 5.1, we define $\sigma_t = t\sigma$ for $t \in D(0, r)$ for some $r > 1$, solve the Beltrami equation $\varphi_t^* \sigma_0 = \sigma_t$ and set $P^t = \varphi_t \circ P_{c_0} \circ \varphi_t^{-1}$. Then $P^t$ is a capture cubic in $\mathcal{P}^{cm}(\theta)$ and $P^0 = P_{c_0}$. The holomorphic mapping $t \mapsto P^t$ is not constant because $\mathrm{mod}(\varphi_1(A))$ is the same as the modulus of $A$ equipped with the conformal structure σ, which in turn is $(1/2\pi) \log(\varepsilon^2) = 2\,\mathrm{mod}(A)$. Hence $P^1 \neq P^0$ and the mapping $t \mapsto P^t$ is open.

Now consider the case where $P_{c_0}^{\circ k}(c_0) = 0$. In this case, by Corollary 2.2, the conformal capacity of $\Delta_c$ has a positive lower bound for all $c$ sufficiently close to $c_0$. It follows that there exists an $\varepsilon > 0$ such that for all $c$ close to $c_0$, $\Delta_c \supset D(0, \varepsilon)$. Hence a small perturbation of $P_{c_0}$ will still be a capture cubic. □

By a center of a hyperbolic-like component $U \subset \mathcal{M}(\theta)$ we mean a cubic $P_c \in U$ with one of the critical points $c$ or 1 being periodic. Similarly, a center of a capture component will be a cubic with one critical point eventually mapped to the indifferent fixed point at the origin.

Lemma 5.4 (Existence of Centers). Every hyperbolic-like or capture component of the interior of $\mathcal{M}(\theta)$ has a center.

By the remark after the proof, centers of hyperbolic-like or capture components are unique when θ is of bounded type.

Proof. First let $U$ be a hyperbolic-like component. For every $c \in U$, consider the multiplier $m(c)$ of the unique attracting periodic orbit of $P_c$. The mapping $c \mapsto m(c)$ from $U$ into $\mathbb{D}$ is easily seen to be proper and holomorphic. Hence it vanishes at a finite number of points in $U$.

Now let $U$ be capture. To be more specific, let us assume that for every $c \in U$, $P_c^{\circ k}(c)$ belongs to the Siegel disk $\Delta_c$, and let $k$ be the smallest such integer. Since $P_c$ is $J$-stable by Theorem 3.1, the boundary of $\Delta_c$ moves holomorphically. Then, as in the proof of Theorem 3.4, there is a holomorphically varying choice of the Riemann maps $\zeta_c : \mathbb{D} \to \Delta_c$ with $\zeta_c(0) = 0$. Define a map $m : U \to \mathbb{D}$ by $m(c) = \zeta_c^{-1}(P_c^{\circ k}(c))$. Clearly $m$ is holomorphic. Let $c_n \in U$ be any sequence which converges to $c \in \partial U$ as $n \to \infty$. For simplicity, put $\zeta_{c_n} = \zeta_n$. Let $z_n = P_{c_n}^{\circ k}(c_n) \in \Delta_{c_n}$ and $w_n = m(c_n) = \zeta_n^{-1}(z_n) \in \mathbb{D}$. If $w_n$ does not converge to the unit circle, we can find a subsequence $w_{n(j)}$ such that $w_{n(j)} \to w \in \mathbb{D}$ as $j \to \infty$. Since the family of univalent functions
$\{\zeta_n : \mathbb{D} \to \mathbb{C}\}$ is normal, by passing to a further subsequence if necessary, we may assume that $\zeta_{n(j)} \to \zeta$ locally uniformly on $\mathbb{D}$. Clearly $\zeta(\mathbb{D}) \subset \Delta_c$. Therefore, $\zeta(w) = \lim_j \zeta_{n(j)}(w_{n(j)}) = \lim_j z_{n(j)} = P_c^{\circ k}(c) \in \Delta_c$. But this means that $P_c$ is capture, which contradicts $c \in \partial U$. This proves that $w_n$ converges to the unit circle. Hence $m$ is a proper map. Now, as before, $m^{-1}(0)$ has to be non-vacuous and finite. □

Remark. To show uniqueness of centers, by Theorem 5.1 it would be enough to prove that any two centers for a component are quasiconformally conjugate. When the rotation number θ is of bounded type, this can be proved by a pull-back argument similar to Lemma 3.5 since in this case the boundary of $\Delta_P$ for $P \in \mathcal{P}^{cm}(\theta)$ is a Jordan curve by Theorem 13.7 (compare [Mc1] or [M2], where uniqueness of centers is shown for every hyperbolic component in the space of polynomial maps).

Theorem 5.5 (QC Conjugacy Classes in $\mathcal{P}^{cm}(\theta)$). Quasiconformal conjugacy classes in $\mathcal{P}^{cm}(\theta)$ are given by the following list:
(a) Hyperbolic-like or capture components of the interior of $\mathcal{M}(\theta)$ with the center(s) removed.
(b) The two components $\Omega_{\mathrm{ext}}$ and $\Omega_{\mathrm{int}}$.
(c) Queer components of the interior of $\mathcal{M}(\theta)$.
(d) Centers of hyperbolic-like or capture components.
(e) Single points on the boundary of $\mathcal{M}(\theta)$.

Proof. Corollary 5.2 shows that no conjugacy class intersects two distinct members of the above list. It also proves that (d) and (e) are in fact conjugacy classes. Also the proof of Theorem 3.4 shows that every queer component is a conjugacy class. That (a) and (b) are quasiconformal conjugacy classes follows from the fact that over the components of type (a) or (b), the family $\{P_c\}$ has no critical orbit relations ([McS], Theorem 2.7). □

6. Connectivity of $\mathcal{M}(\theta)$

In this section we prove that $\mathcal{M}(\theta)$ is connected. This amounts to showing that each of its complementary components $\Omega_{\mathrm{ext}}$ and $\Omega_{\mathrm{int}}$ is homeomorphic to the punctured disk.
One way to do this is to mimic the standard Douady–Hubbard proof of connectivity of the Mandelbrot set [DH1]: We can construct a holomorphic branched covering $\Phi : \Omega_{\mathrm{ext}} \to \mathbb{C} \smallsetminus \overline{\mathbb{D}}$ by assigning to each $P_c \in \Omega_{\mathrm{ext}}$ the position of the critical point $c$ in the Böttcher coordinate of $P_c$. Φ extends holomorphically to infinity with $\Phi^{-1}(\infty) = \infty$. The degree of this map is 3, so to prove that $\Omega_{\mathrm{ext}}$ is a punctured disk we must show that Φ has no critical point other than ∞. (This additional difficulty does not show up in the case of the Mandelbrot set where the similar map has degree 1.) To prove that Φ is locally injective, one can start with two nearby polynomials in the same fiber of Φ and define a conformal conjugacy between them near infinity by composing their Böttcher coordinates. This conjugacy can be conformally extended using the dynamics to the entire basin of attraction of infinity. Then a delicate argument is necessary to prove that one can extend the conjugacy further to the complex plane in a holomorphic way, proving that the two polynomials are identical (see [Z3] for details of such a proof).

However, to prove that $\Omega_{\mathrm{ext}}$ is a punctured disk, it would be much easier to use methods of Teichmüller theory of rational maps as developed in [McS]. (There one can also find a different proof for connectivity of the Mandelbrot set.) Let $P \in \mathcal{P}^{cm}(\theta)$. By definition, the Teichmüller space $\mathrm{Teich}(P)$ consists of all pairs $(Q, [\varphi])$, where $Q \in \mathcal{P}^{cm}(\theta)$ and
Fig. 10.
$\varphi : \mathbb{C} \to \mathbb{C}$ is a quasiconformal conjugacy between $P$ and $Q$, i.e., $P \circ \varphi = \varphi \circ Q$. Here $[\varphi]$ means that we only remember the isotopy class of φ. The modular group $\mathrm{Mod}(P)$ is the group of isotopy classes of quasiconformal homeomorphisms commuting with $P$. $\mathrm{Mod}(P)$ acts on $\mathrm{Teich}(P)$ properly discontinuously by $[\psi](Q, [\varphi]) = (Q, [\psi \circ \varphi])$. The quotient $\mathrm{Teich}(P)/\mathrm{Mod}(P)$, also called the moduli space of $P$, is isomorphic to the quasiconformal conjugacy class of $P$ in $\mathcal{P}^{cm}(\theta)$.

More generally, one can define the Teichmüller space $\mathrm{Teich}(U, P)$, where $U$ is an open set invariant under $P$. It consists of all triples $(V, Q, [\varphi])$, where $V$ is open and invariant under $Q$, and the quasiconformal homeomorphism $\varphi : V \to U$ conjugates $P$ and $Q$. But now $[\varphi]$ denotes the isotopy class of φ rel ideal boundary of $V$.

Theorem 6.1. The connectedness locus $\mathcal{M}(\theta)$ is connected.

Proof. Let $P = P_c \in \Omega_{\mathrm{ext}}$. Then $J(P)$ is disconnected and the critical point $c$ belongs to the basin of attraction of infinity. Let γ be the equipotential of the Green's function of $K(P)$ passing through $c$. Topologically γ is a figure eight with the double point at $c$ (see Fig. 10). Let $\hat{J}(P)$ be the union of $J(P)$ together with the backward orbit of the fixed point 0 as well as the union of all forward and backward images of γ. In other words, $\hat{J}(P)$ is the closure of the grand orbits of all periodic points and critical points of $P$. The complement $U = \mathbb{C} \smallsetminus \hat{J}(P)$ consists of countably many annuli $A_i$ of finite modulus (contained in the basin of attraction of ∞) and countably many punctured disks (corresponding to the Siegel disk and its preimages). On $U$ the grand orbit equivalence relation is clearly indiscrete. By [McS, Theorem 6.2], $\mathrm{Teich}(P) \cong \mathrm{Teich}(U, P) \times M_1(J(P), P)$, where $M_1(J(P), P)$ is the unit ball in the space of all $P$-invariant Beltrami differentials supported on $J(P)$. This factor is trivial by the following

Lemma 6.2. The Julia set of a cubic polynomial outside the connectedness locus $\mathcal{M}(\theta)$ admits no invariant line field.

Note that for arbitrary θ of Brjuno type, it is not known whether this Julia set has measure zero (compare Corollary 4.3).
Proof. By Theorem 4.2(b), such a cubic is renormalizable. By straightening, an invariant line field on its Julia set gives rise to an invariant line field, or equivalently an invariant Beltrami differential σ, on the Julia set of $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$. Now, as in the proof of Theorem 5.1, by deforming σ to $\sigma_t = t\sigma$ we can get a holomorphic family $Q_t$ of normalized quadratic polynomials all quasiconformally conjugate to $Q_\theta$. But $Q_\theta$ belongs to the boundary of the Mandelbrot set, hence admits no non-trivial deformations, implying that $Q_t = Q_\theta$ for all $t$. So the normalized quasiconformal homeomorphisms $\varphi_t$ which solve the Beltrami equation $\varphi_t^* \sigma_0 = \sigma_t$ must all commute with $Q_\theta$. Now for any periodic point $z \in J(Q_\theta)$ of period $n$, $t \mapsto \varphi_t(z)$ is a continuous path in the finite set of all period-$n$ points in $J(Q_\theta)$. Since $\varphi_0(z) = z$, we must have $\varphi_t(z) = z$ for all $t$. Such points $z$ are dense in the Julia set, so $\varphi_t|_{J(Q_\theta)}$ must be the identity. Since $\sigma_t = 0$ off the Julia set, it follows from the Bers Sewing Lemma 3.3 that $\bar{\partial}\varphi_t = 0$ almost everywhere in the plane. This implies that $\sigma_t$, or equivalently σ, vanishes almost everywhere, which is a contradiction. □

Now by Theorem 5.5, $\Omega_{\mathrm{ext}}$ coincides with the quasiconformal conjugacy class of $P$. It follows that $\Omega_{\mathrm{ext}} \cong \mathrm{Teich}(P)/\mathrm{Mod}(P)$. By [McS, Theorem 6.1], $\mathrm{Teich}(P) \cong \mathrm{Teich}(U, P)$ is isomorphic to the upper half-plane $\mathbb{H}$. Finally, every quasiconformal self-conjugacy ψ of $P$ preserves grand orbits of the distinguished points 0 and $c$, hence it fixes the boundaries of all the annuli $A_i$ pointwise. In particular, ψ is the identity on the Julia set $J(P)$. Hence the action of $[\psi] \in \mathrm{Mod}(P)$ is the identity except in the annuli $A_i$ where it is possibly a power of a Dehn twist. So $\mathrm{Mod}(P)$ is at most $\mathbb{Z}$. Since $\Omega_{\mathrm{ext}}$ is not simply-connected, $\mathrm{Mod}(P) = \mathbb{Z}$. It follows that $\Omega_{\mathrm{ext}}$ is homeomorphic to a punctured disk. □

7. Critical Parametrization of Blaschke Products

This section is the beginning of a digression in the study of cubic Siegel polynomials. We look at certain Blaschke products which will serve as models for the cubics in $\mathcal{P}^{cm}(\theta)$. We will introduce these model maps in Sect. 8 and return to their relation with the cubics in Sect. 9.

Let us consider the following space of degree 5 normalized Blaschke products:
\[
\widehat{\mathcal{B}} = \left\{ B : z \mapsto \tau z^3 \left(\frac{z-p}{1-\bar{p}z}\right) \left(\frac{z-q}{1-\bar{q}z}\right) : B(1) = 1 \text{ and } |p| > 1,\ |q| > 1 \right\}, \tag{7.1}
\]
where the rotation factor τ on the unit circle $\mathbb{T}$ is chosen so as to achieve the normalization $B(1) = 1$. Each $B \in \widehat{\mathcal{B}}$ has superattracting fixed points at 0 and ∞ and four other critical points counted with multiplicity. We are interested in the open subset $\mathcal{B} \subset \widehat{\mathcal{B}}$ of those normalized Blaschke products of the form (7.1) whose four critical points other than 0 and ∞ are of the form
\[
c_1,\quad c_2,\quad \frac{1}{\bar{c}_1},\quad \frac{1}{\bar{c}_2} \qquad \text{with } |c_1| > 1,\ |c_2| > 1.
\]
Our goal is to parametrize elements of $\mathcal{B}$ by their critical points $c_1$ and $c_2$. The following theorem provides this “critical parametrization” for $\mathcal{B}$:

Theorem 7.1 (Critical Parametrization). Let $c_1$ and $c_2$ be two points outside the closed unit disk in the complex plane. Then there exists a unique normalized Blaschke product $B \in \mathcal{B}$ whose critical points are located at $0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2$.
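The critical points in Theorem 7.1 come in pairs $c$, $1/\bar{c}$ because every map of the form (7.1) commutes with the reflection $z \mapsto 1/\bar{z}$ in the unit circle. The following Python sketch checks this symmetry and the normalization $B(1) = 1$ numerically. It is an illustration under stated assumptions: the helper name `make_blaschke` and the sample values of $p$, $q$ are hypothetical and do not come from the text.

```python
import cmath

def make_blaschke(p, q):
    """Return (B, tau) for B(z) = tau z^3 (z-p)/(1-conj(p)z) (z-q)/(1-conj(q)z), B(1) = 1."""
    def unnormalized(z):
        return (z ** 3
                * (z - p) / (1 - p.conjugate() * z)
                * (z - q) / (1 - q.conjugate() * z))
    tau = 1 / unnormalized(1)   # unimodular, since the product preserves the unit circle
    return (lambda z: tau * unnormalized(z)), tau

# Hypothetical sample parameters with |p| > 1 and |q| > 1.
B, tau = make_blaschke(2 + 1j, -3j)
z = 0.3 + 0.4j
reflected = 1 / z.conjugate()
# Symmetry B(1/conj(z)) = 1/conj(B(z)): differentiating it shows that c is
# critical exactly when 1/conj(c) is critical.
symmetry_defect = abs(B(reflected) - 1 / B(z).conjugate())
print("|tau| =", abs(tau), " symmetry defect =", symmetry_defect)
```

The sketch does not locate the critical points themselves; it only confirms the reflection symmetry that forces them to occur in symmetric pairs.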
The proof of this theorem will be given after the following two supporting lemmas. It would be interesting to find a conceptual proof of this fact which can be generalized to higher degrees (compare a similar situation in [Z1], where such a proof is given).

The space $\widehat{\mathcal{B}}$ of all Blaschke products of the form (7.1) can be identified with the set of all unordered pairs $\{p, q\}$ of points outside the closed unit disk. This is homeomorphic to the symmetric product of two copies of the punctured plane. The latter can be identified with the space of all degree 2 monic polynomials
\[
w \mapsto (w - w_1)(w - w_2) = w^2 - (w_1 + w_2)w + w_1 w_2
\]
with $w_1 w_2 \neq 0$. It follows that $\widehat{\mathcal{B}}$ is homeomorphic to $\mathbb{C} \times \mathbb{C}^*$. In particular, it is an open topological manifold of real dimension 4. In the same way, we may consider the space $\mathcal{C}$ of all unordered pairs $\{c_1, c_2\}$ of points outside the closed unit disk, which has a completely similar description. We consider the continuous map
\[
\Psi : \mathcal{B} \to \mathcal{C}
\]
which sends a normalized Blaschke product $B \simeq \{p, q\}$ with critical points $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$ to the unordered pair $\{c_1, c_2\}$.

Lemma 7.2. Ψ is a proper map.

Proof. Let $B_n \simeq \{p_n, q_n\}$ be a sequence of normalized Blaschke products in $\mathcal{B}$ which leaves every compact subset of $\mathcal{B}$. Then, a priori we have the following three possibilities:
• Some critical point of $B_n$ accumulates on the unit circle, or
• After relabeling, $p_n$ goes to ∞, or
• After relabeling, $p_n$ accumulates on the unit circle (later we show that this cannot be the case; see Lemma 7.4).
In the first two cases, it is easy to see that $\Psi(B_n)$ leaves every compact subset of $\mathcal{C}$. In the third case, there is a subsequence of $B_n$ which converges locally uniformly on $\mathbb{C} \smallsetminus \mathbb{T}$ to a Blaschke product of degree < 5. It follows that the corresponding subsequence of $\Psi(B_n)$ has to leave every compact subset of $\mathcal{C}$. □

Lemma 7.3. Ψ is injective.

Proof. Let $A$ and $B$ be two normalized Blaschke products in $\mathcal{B}$ with the same critical points $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$. Let
\[
A : z \mapsto \tau_A z^3 \left(\frac{z-p_1}{1-\bar{p}_1 z}\right) \left(\frac{z-q_1}{1-\bar{q}_1 z}\right), \qquad
B : z \mapsto \tau_B z^3 \left(\frac{z-p_2}{1-\bar{p}_2 z}\right) \left(\frac{z-q_2}{1-\bar{q}_2 z}\right).
\]
If $p_1 = p_2$ or $p_1 = q_2$, or if one of the critical points $c_1, c_2$ coincides with one of the zeros $p_i, q_i$, then a straightforward computation shows that $A = B$. So let us assume that $p_1 \neq p_2$ and $p_1 \neq q_2$ and consider the rational function
\[
R(z) = \frac{A(z)}{B(z)}.
\]
Clearly $\deg R = 4$ and hence $R$ has 6 critical points counted with multiplicity. We have
\[
A'(z) = (\text{const.})\, \frac{z^2 \prod (z-c_j)(1-\bar{c}_j z)}{(1-\bar{p}_1 z)^2 (1-\bar{q}_1 z)^2}, \qquad
B'(z) = (\text{const.})\, \frac{z^2 \prod (z-c_j)(1-\bar{c}_j z)}{(1-\bar{p}_2 z)^2 (1-\bar{q}_2 z)^2},
\]
from which it follows that
\[
R'(z) = (\text{const.})\, \frac{1}{z} \prod (z-c_j)(1-\bar{c}_j z) \left\{ \frac{\sum (-1)^j (z-p_j)(z-q_j)(1-\bar{p}_j z)(1-\bar{q}_j z)}{(z-p_2)^2 (z-q_2)^2 (1-\bar{p}_1 z)^2 (1-\bar{q}_1 z)^2} \right\}.
\]
(Note that all the sums and products are taken over $j = 1, 2$.) From the above expression, $R$ has already 4 critical points at the $c_j$ and $1/\bar{c}_j$. So the rational function in the braces could have at most 2 zeros. Since this fraction is irreducible (by our assumption $p_1 \neq p_2$ and $p_1 \neq q_2$), the numerator should have degree ≤ 2. But that implies
\[
p_1 q_1 = p_2 q_2, \qquad p_1(1 + |q_1|^2) + q_1(1 + |p_1|^2) = p_2(1 + |q_2|^2) + q_2(1 + |p_2|^2),
\]
from which it follows that $p_1 = p_2$ or $p_1 = q_2$, hence $q_1 = q_2$ or $q_1 = p_2$, which contradicts our assumption. □

Proof of Theorem 7.1 (Critical Parametrization). By Lemma 7.2 and Lemma 7.3, Ψ is a covering map of degree 1. Hence, it is a homeomorphism $\mathcal{B} \xrightarrow{\simeq} \mathcal{C}$. □

In particular, the theorem shows that $\mathcal{B}$ is also homeomorphic to the product $\mathbb{C} \times \mathbb{C}^*$.

Lemma 7.4. Let $B : z \mapsto \tau z^3 (z-p)(z-q)/((1-\bar{p}z)(1-\bar{q}z))$ be any normalized Blaschke product in $\mathcal{B}$. Then $|p| > 2$ and $|q| > 2$.

Proof. Write $B(z) = \rho z^3 / R(z)$, where $|\rho| = 1$ and
\[
R(z) = \left(\frac{z-\alpha}{1-\bar{\alpha}z}\right) \left(\frac{z-\beta}{1-\bar{\beta}z}\right)
\]
is a degree 2 Blaschke product preserving the unit disk having zeros at $\alpha = 1/\bar{p}$ and $\beta = 1/\bar{q}$. We look at the logarithmic derivative $LD(z) = d(\log R(z))/d(\log z) = zR'(z)/R(z)$ on the unit circle $\mathbb{T}$. A brief computation shows that for $z \in \mathbb{T}$,
\[
LD(z) = \frac{1-|\alpha|^2}{|z-\alpha|^2} + \frac{1-|\beta|^2}{|z-\beta|^2},
\]
which is strictly positive. It is easy to see that
\[
\max_{z \in \mathbb{T}} LD(z) \geq \max\left\{ \frac{1+|\alpha|}{1-|\alpha|},\ \frac{1+|\beta|}{1-|\beta|} \right\}.
\]
Hence if either $|\alpha| > 1/2$ or $|\beta| > 1/2$, the maximum value of $LD$ on $\mathbb{T}$ will be greater than 3. On the other hand, $R$ induces a 2-to-1 covering map of the unit circle, so the average value of $|R'| = LD$ on $\mathbb{T}$ will be 2. Putting these two facts together, it follows that if $|\alpha| > 1/2$ or $|\beta| > 1/2$, then
\[
\min_{z \in \mathbb{T}} LD(z) \leq 2 < 3 < \max_{z \in \mathbb{T}} LD(z).
\]
This simply implies that when $|\alpha| > 1/2$ or $|\beta| > 1/2$, there are at least two points on $\mathbb{T}$ where $LD$ takes on the value 3. Now $B(z) = \rho z^3 / R(z)$ gives
\[
B'(z) = \rho\, \frac{3z^2 R(z) - z^3 R'(z)}{R(z)^2} = \rho z^2\, \frac{3 - LD(z)}{R(z)}.
\]
Hence by the above argument, $B$ has at least two critical points on the unit circle as soon as $|p| < 2$ or $|q| < 2$. Certainly this cannot happen since by definition $B \in \mathcal{B}$ means the critical points of $B$ are off the unit circle. □

Corollary 7.5. Given any two points $c_1$ and $c_2$ in the plane, with $|c_1| \geq 1$ and $|c_2| \geq 1$, there exists a unique normalized Blaschke product $B$ in the closure $\overline{\mathcal{B}}$ with critical points $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$.

In other words, critical parametrization is possible even if one or both critical points $c_1, c_2$ belong to the unit circle.

Proof. Take a sequence $\{c_1^n, c_2^n\}$ of pairs of points outside the closed unit disk such that $c_1^n \to c_1$ and $c_2^n \to c_2$ as $n \to \infty$. The zeros $p_n, q_n$ of the corresponding normalized Blaschke products $\Psi^{-1}(\{c_1^n, c_2^n\})$ stay away from the unit circle by Lemma 7.4. Therefore, $\Psi^{-1}(\{c_1^n, c_2^n\})$ has a subsequence which converges to a normalized Blaschke product which, by continuity of Ψ, has critical points at $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$. To see uniqueness, it is enough to note that the proof of Lemma 7.3 can be repeated word by word even if we assume $|c_1| = 1$ or $|c_2| = 1$. □

We conclude with the following proposition, the proof of which is quite straightforward.

Proposition 7.6. Every $B \in \mathcal{B}$ induces a real-analytic diffeomorphism of the unit circle. Consequently, if $B \in \overline{\mathcal{B}} \smallsetminus \mathcal{B}$, the restriction of $B$ to the unit circle will be a real-analytic homeomorphism with one (or two) critical point(s).
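The two facts used in the proof of Lemma 7.4, namely that $LD$ averages to 2 on the circle and that it must cross the value 3 when a zero of $B$ is too close to the circle, can be checked numerically. The Python sketch below samples $LD$ on a uniform grid of $\mathbb{T}$. It is a sanity check under stated assumptions, not a replacement for the proof: the names `log_derivative` and `ld_profile`, the grid size and the sample values of $p$, $q$ are all hypothetical choices made here.

```python
import cmath
import math

def log_derivative(z, p, q):
    """LD(z) on the unit circle, with alpha = 1/conj(p) and beta = 1/conj(q)."""
    alpha, beta = 1 / p.conjugate(), 1 / q.conjugate()
    return ((1 - abs(alpha) ** 2) / abs(z - alpha) ** 2
            + (1 - abs(beta) ** 2) / abs(z - beta) ** 2)

def ld_profile(p, q, n=4000):
    """Mean, maximum, and number of crossings of the level 3 on an n-point grid."""
    values = [log_derivative(cmath.exp(2j * math.pi * k / n), p, q) for k in range(n)]
    mean = sum(values) / n
    crossings = sum(1 for k in range(n)
                    if (values[k] - 3) * (values[(k + 1) % n] - 3) < 0)
    return mean, max(values), crossings

# |p| < 2: LD crosses 3, so B' = rho z^2 (3 - LD)/R vanishes on the circle.
mean_near, max_near, crossings_near = ld_profile(1.5 + 0j, 5j)
# |p| and |q| large: LD stays below 3 and B has no critical point on the circle.
mean_far, max_far, crossings_far = ld_profile(10 + 0j, -10j)
print(mean_near, max_near, crossings_near)
print(mean_far, max_far, crossings_far)
```

Each sign change of $LD - 3$ on the grid brackets a zero of $3 - LD$, that is, a critical point of $B$ on the unit circle.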
8. A Blaschke Parameter Space

Now we focus on a certain class of degree 5 Blaschke products. These are the maps $B$ with the following two properties:
(i) $B$ has the form
\[
B : z \mapsto e^{2\pi i t} z^3 \left(\frac{z-p}{1-\bar{p}z}\right) \left(\frac{z-q}{1-\bar{q}z}\right), \qquad |p| > 1,\ |q| > 1, \tag{8.1}
\]
where $p$ and $q$ are chosen such that $B$ has a double critical point on the unit circle $\mathbb{T}$ and a pair $(c, 1/\bar{c})$ of symmetric critical points which may or may not be on $\mathbb{T}$.
(ii) $t$ is the unique number in $[0, 1]$ for which the rotation number of $B|_{\mathbb{T}}$ is equal to θ, with $0 < \theta < 1$ being a given irrational number.
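Property (ii) can be illustrated numerically: for fixed $p$ and $q$, the rotation number of a map of the form (8.1) is monotone in $t$, so $t$ can be located by bisection. The Python sketch below is an illustration only. It relies on the identity $B(z) = z\, e^{2\pi i t} e^{2i(\arg(z-p) + \arg(z-q))}$ for $|z| = 1$, which follows from $1 - \bar{p}z = z\,\overline{(z - p)}$ on the circle. The function names, the iteration counts and the sample parameters in the usage note are assumptions made here, and the sample $p$, $q$ are not required to satisfy the double critical point condition in (i).

```python
import cmath
import math

def displacement(z, t, p, q):
    """Displacement, in turns, of a continuous lift of B at the point z of the unit circle.

    z - p = (-p)(1 - z/p) with Re(1 - z/p) > 0, so the principal phase of
    (1 - z/p) gives a continuous branch of arg(z - p) along the circle.
    """
    arg_p = cmath.phase(-p) + cmath.phase(1 - z / p)
    arg_q = cmath.phase(-q) + cmath.phase(1 - z / q)
    return t + (arg_p + arg_q) / math.pi

def rotation_estimate(t, p, q, n_iter=3000):
    """Average displacement (F^n(x) - x)/n of the lift F along one orbit."""
    z, total = 1 + 0j, 0.0
    for _ in range(n_iter):
        d = displacement(z, t, p, q)
        total += d
        z *= cmath.exp(2j * math.pi * d)   # equals B(z) by the identity above
        z /= abs(z)                        # guard against round-off drift
    return total / n_iter

def find_t(theta, p, q, steps=40):
    """Bisect on t in [0, 1] until the estimate agrees with theta modulo 1."""
    base = rotation_estimate(0.0, p, q)
    target = theta + math.ceil(base - theta)   # lift of theta lying in [base, base + 1)
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if rotation_estimate(mid, p, q) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, target
```

For instance, `find_t((math.sqrt(5) - 1) / 2, -10.0, 10j)` returns a value of $t$ together with the lifted target it was matched against. The estimate has an error of order $1/n$ in the number of iterations, so the result approximates the $t$ of property (ii) rather than computing it exactly.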
Fig. 11. Topology of the parameter space $\mathcal{B}^{cm}(\theta)$
The number $t$ in (ii) is unique because the rotation number of $B$ in (8.1) is a continuous and increasing function of $t$ which is strictly increasing at all irrational values (see for example [KH, Prop. 11.1.9]). From the above description, it follows that every $B$ which satisfies (i) and (ii) can be represented as a normalized Blaschke product in $\overline{\mathcal{B}} \smallsetminus \mathcal{B}$ followed by a unique rotation which adjusts the rotation number to $\theta$. As a consequence, Corollary 7.5 shows that every such $B$ is uniquely determined by the position of its critical points. The rotation group $\mathrm{rot} = \{R_\rho : z \mapsto \rho z \text{ with } |\rho| = 1\}$ acts on the set of all such Blaschke products by conjugation. In fact,
$$R_\rho^{-1} \circ B \circ R_\rho : z \mapsto e^{2\pi i t} \rho^4 z^3 \left(\frac{z - p\bar{\rho}}{1 - \bar{p}\rho z}\right) \left(\frac{z - q\bar{\rho}}{1 - \bar{q}\rho z}\right).$$
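The conjugation formula for $R_\rho^{-1} \circ B \circ R_\rho$ can be checked directly from (8.1). The short computation below is included only as a sanity check; it uses nothing beyond $|\rho| = 1$, so that $\rho^{-1} = \bar{\rho}$.

```latex
\begin{align*}
(R_\rho^{-1} \circ B \circ R_\rho)(z)
  &= \rho^{-1} B(\rho z)
   = \rho^{-1} e^{2\pi i t} \rho^{3} z^{3}
     \left(\frac{\rho z - p}{1 - \bar{p}\rho z}\right)
     \left(\frac{\rho z - q}{1 - \bar{q}\rho z}\right) \\
  &= e^{2\pi i t} \rho^{2} z^{3} \cdot
     \rho \left(\frac{z - p\bar{\rho}}{1 - \bar{p}\rho z}\right) \cdot
     \rho \left(\frac{z - q\bar{\rho}}{1 - \bar{q}\rho z}\right)
   = e^{2\pi i t} \rho^{4} z^{3}
     \left(\frac{z - p\bar{\rho}}{1 - \bar{p}\rho z}\right)
     \left(\frac{z - q\bar{\rho}}{1 - \bar{q}\rho z}\right).
\end{align*}
```

In other words, conjugation by $R_\rho$ replaces the zeros $p, q$ by $p\bar{\rho}, q\bar{\rho}$ and multiplies the unimodular constant by $\rho^4$, so the conjugate is again of the form (8.1).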
We would like to understand the topology of the space $\mathcal{B}^{cm}(\theta)$ of all "critically marked" Blaschke products satisfying (i) and (ii) modulo the action of rot. Here by a marking of the critical points of such a Blaschke product $B$ we mean a surjective function $m$ from the set $\{1, 2\}$ to the set of finite critical points of $B$ outside the open unit disk. Two critically marked Blaschke products $(B, m)$ and $(A, m')$ are equivalent under the action of rot if there exists an $R_\rho$ such that $R_\rho \circ B = A \circ R_\rho$ and $m' = R_\rho \circ m$.

Here is how we parametrize the space $\mathcal{B}^{cm}(\theta)$: For $j = 1, 2$, consider the closed set $E_j$ consisting of all conjugacy classes in $\mathcal{B}^{cm}(\theta)$ for which the critical point $m(j)$ belongs to the unit circle. In each class in $E_1$, we choose the unique representative $(B, m)$ for which $m(1) = 1$. It follows from Corollary 7.5 that $E_1$ can be parametrized by the location of the second critical point $m(2) \in \mathbb{C} \smallsetminus \mathbb{D}$. Similarly, in each class in $E_2$, pick the unique representative $(B, m)$ for which $m(2) = 1$. This shows that $E_2$ can be parametrized by the location of the first critical point $m(1) \in \mathbb{C} \smallsetminus \mathbb{D}$. Now on the common boundary $E_1 \cap E_2$, consisting of all Blaschke products with two double critical points on $\mathbb{T}$, we have two different coordinates which must correspond to the same conjugacy class. This simply yields the identification $m(1) = 1/m(2)$ between the two copies of $\mathbb{C} \smallsetminus \mathbb{D}$ along their boundary circles. Consequently, $\mathcal{B}^{cm}(\theta)$ can be identified with the punctured plane (see Fig. 11).
It is easy to see that this gluing corresponds to choosing the uniformizing parameter $\mu = m(1)/m(2) \in \mathbb{C}^*$ for the space $\mathcal{B}^{cm}(\theta)$. Here is the concrete interpretation of this identification $\mathcal{B}^{cm}(\theta) \simeq \mathbb{C}^*$: For $\mu \in \mathbb{C}^*$ with $|\mu| \ge 1$, the corresponding Blaschke product $B_\mu$ has marked critical points at $m(1) = \mu$, $m(2) = 1$. Similarly, if $|\mu| \le 1$, $B_\mu$ is the unique Blaschke product with marked critical points at $m(1) = 1$, $m(2) = 1/\mu$. Note that $B_\mu = B_{1/\mu}$ as maps, if we forget the markings of the critical points.

As in the case of the cubic parameter space $\mathcal{P}^{cm}(\theta)$, the Blaschke space $\mathcal{B}^{cm}(\theta)$ also has two very special points: $\mu = 1$, which corresponds to the conjugacy class of Blaschke products with a critical point of local degree 5 on $\mathbb{T}$, and $\mu = -1$, which corresponds to the conjugacy class of Blaschke products with two centered double critical points on $\mathbb{T}$.

The identification with $\mathbb{C}^*$ puts the following topology on $\mathcal{B}^{cm}(\theta)$: If $|\mu| \ne 1$, so that $B_\mu$ has only one double critical point on $\mathbb{T}$, then $B_{\mu_n} \to B_\mu$ simply means uniform convergence on compact subsets of the plane respecting the convergence of the marked critical points. On the other hand, if $|\mu| = 1$, so that $B_\mu$ has two double critical points on the unit circle, then $B_{\mu_n} \to B_\mu$ means that in the topology of local uniform convergence, $\{B_{\mu_n}\}$ can only accumulate on $B_\mu$ or its conjugate $R_\mu^{-1} \circ B_\mu \circ R_\mu$.

For future reference, we need to analyze the structure of the invariant set $\bigcup_{k \ge 0} B^{-k}(\mathbb{T})$ for a Blaschke product $B \in \mathcal{B}^{cm}(\theta)$. For similar descriptions in a family of degree 3 Blaschke products, see [Pe] or [YZ].

Definition (Skeletons). Let $B \in \mathcal{B}^{cm}(\theta)$. Define $T_0 = \mathbb{T}$ and $T_1 = B^{-1}(T_0) \smallsetminus T_0$. In general, for $k \ge 2$ we define $T_k$ inductively as $T_k = B^{-1}(T_{k-1})$. We call the closed set $T_k$ the $k$-skeleton of $B$.

Note that $B$ commutes with the reflection $I : z \mapsto 1/\bar{z}$. Therefore, every $T_k$ is invariant under $I$. Figure 12 shows different possibilities for the 1-skeleton of a $B \in \mathcal{B}^{cm}(\theta)$.
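Two facts used here are that a map of the form (8.1) preserves the unit circle and that it commutes with the reflection across the circle. Both are easy to test numerically. The following is a minimal sketch and not part of the paper: the function names and the sample values of t, p, q are illustrative assumptions, and in particular they are not chosen so that the map lies in the space under study.

```python
import cmath

def blaschke(z, t, p, q):
    """Evaluate B(z) = e^{2 pi i t} z^3 (z-p)/(1-conj(p) z) (z-q)/(1-conj(q) z)."""
    rot = cmath.exp(2j * cmath.pi * t)
    fp = (z - p) / (1 - p.conjugate() * z)
    fq = (z - q) / (1 - q.conjugate() * z)
    return rot * z**3 * fp * fq

def reflect(z):
    """Reflection I(z) = 1 / conj(z) across the unit circle."""
    return 1 / z.conjugate()

# Hypothetical sample parameters with |p| > 1 and |q| > 1 (assumed values).
T, P, Q = 0.3, 2.0 + 1.0j, -1.5 + 0.5j

def circle_defect(n=360):
    """Largest deviation of |B(z)| from 1 over n sample points on the unit circle."""
    pts = (cmath.exp(2j * cmath.pi * k / n) for k in range(n))
    return max(abs(abs(blaschke(z, T, P, Q)) - 1) for z in pts)

def symmetry_defect(z):
    """|B(I(z)) - I(B(z))|, which vanishes when B commutes with I."""
    return abs(blaschke(reflect(z), T, P, Q) - reflect(blaschke(z, T, P, Q)))
```

Both defects should be zero up to rounding error. This is the symmetry that forces every skeleton to be invariant under the reflection.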
The next proposition gives basic properties of $k$-skeletons. The proofs are straightforward and will be omitted.

Proposition 8.1 (Structure of the k-Skeleton).
(a) For $k \ge 1$, the $k$-skeleton $T_k$ is the union of finitely many piecewise analytic Jordan curves $\{T_k^1, \dots, T_k^m\}$ which intersect one another at finitely many points and do not cross the unit circle $\mathbb{T}$. None of the $T_k^i$ encloses $\mathbb{T}$. For any $T_k^i$ in this family, the reflected copy $I(T_k^i)$ also belongs to this family.
(b) With the notation of (a), let $D_k^i$ denote the bounded component of $\mathbb{C} \smallsetminus T_k^i$ for $k \ge 1$. For $k = 0$, $D_0^i$ could mean either $\mathbb{D}$ or $\hat{\mathbb{C}} \smallsetminus \overline{\mathbb{D}}$. Then for $k \ge 1$, $B$ maps $D_k^i$ onto some $D_{k-1}^j$. The mapping is either a conformal isomorphism or a 2-to-1 branched covering. As a result, $B^{\circ k}$ is a proper holomorphic map from $D_k^i$ onto $\mathbb{D}$ or $\hat{\mathbb{C}} \smallsetminus \overline{\mathbb{D}}$.
(c) If $k \ge 1$ and $i \ne j$, we have $D_k^i \cap D_k^j = \emptyset$.
(d) For $k > \ell \ge 1$, either $D_k^i$ and $D_\ell^j$ are disjoint or $D_k^i \subset D_\ell^j$. Conversely, if $D_k^i \subset D_\ell^j$, we necessarily have $k \ge \ell$.

Every $D_k^i$ is called a $k$-drop or simply a drop of $B$. In other words, $k$-drops are the open topological disks bounded by the Jordan curves in the decomposition of the $k$-skeleton of $B$. For $k = 0$, we have slightly changed the notion of drops. The unit circle $\mathbb{T}$ is the only Jordan curve in the 0-skeleton of $B$, and we agree to call any of the two topological disks $\mathbb{D}$ or $\hat{\mathbb{C}} \smallsetminus \overline{\mathbb{D}}$ a 0-drop. The integer $k$ is called the depth of $D_k^i$.
Fig. 12. Four different configurations, labeled (a) to (d), for $B^{-1}(\mathbb{T})$, where $B \in \mathcal{B}^{cm}(\theta)$. The shaded regions are components of $B^{-1}(\mathbb{D})$. The shaded subregion of $\mathbb{D}$ is mapped to $\mathbb{D}$ by a 3-to-1 branched covering with a superattracting fixed point at the origin. There is a critical point at $z = 1$ and the other critical point(s) (marked by an asterisk) are symmetric with respect to the unit circle
Definition (Nucleus of a Drop). Let $D_k^i$ be a drop. We define the nucleus $N_k^i$ of $D_k^i$ as the set of all points in $D_k^i$ which are not accumulated by any other drop of $B$. The nuclei of $k$-drops are said to have depth $k$. It follows from Proposition 8.1(c) that
$$N_k^i = D_k^i \smallsetminus \overline{\bigcup_{\ell \ne k} \bigcup_{j} D_\ell^j}.$$
Clearly every nucleus is open. It is also non-empty because every drop contains an open set which eventually maps to the immediate basin of attraction of 0 or $\infty$, and this open set cannot intersect the closure of any other drop of $B$. We have two nuclei of depth zero: $N_0$, which is the nucleus of $\mathbb{D}$ and contains the immediate basin of attraction of 0, and $N_\infty$, which is the nucleus of $\hat{\mathbb{C}} \smallsetminus \overline{\mathbb{D}}$ and contains the immediate basin of attraction of $\infty$. Obviously $N_\infty = I(N_0)$. It is not hard to see that both $N_0$ and $N_\infty$ are invariant under $B$:
$$B(N_0) \subset N_0, \qquad B(N_\infty) \subset N_\infty. \tag{8.2}$$
This of course implies that $N_0$ and $N_\infty$ are subsets of the Fatou set of $B$. It follows from Proposition 8.1(b) that $B$ maps every nucleus of depth $k$ onto some nucleus of depth $k - 1$ and the mapping is either a conformal isomorphism or a 2-to-1 branched covering. We include the following lemma for completeness:
Lemma 8.2. Let $N_k^i$ be the nucleus of a drop $D_k^i$ which eventually maps to the unit disk $\mathbb{D}$. Then
(a) No point in the orbit
$$N_k^i = N_k^{i_0} \xrightarrow{\ B\ } N_{k-1}^{i_1} \xrightarrow{\ B\ } \cdots \xrightarrow{\ B\ } N_1^{i_{k-1}} \xrightarrow{\ B\ } N_0$$
can intersect any of the reflected nuclei $I(N_{k-j}^{i_j})$, $0 \le j \le k$.
(b) For $z \in N_k^i$, $B^{\circ k}$ is the first iterate of $B$ which sends $z$ to $N_0$.
Proof. (a) $B$ commutes with $I$, so there is a reflected orbit
$$I(N_k^i) = I(N_k^{i_0}) \xrightarrow{\ B\ } I(N_{k-1}^{i_1}) \xrightarrow{\ B\ } \cdots \xrightarrow{\ B\ } I(N_1^{i_{k-1}}) \xrightarrow{\ B\ } N_\infty.$$
Now any point in both orbits would have to map to a point in $N_0$ and $N_\infty$ simultaneously, which is impossible since $N_0 \cap N_\infty = \emptyset$.
(b) This is obvious if $k = 1$. Suppose that $k > 1$ and that for some $0 < \ell < k$, $B^{\circ \ell}(z) \in N_0$. Then by (8.2), $B^{\circ k-1}(z) \in N_0 \subset \mathbb{D}$. But $B^{\circ k-1}(z) \in B^{\circ k-1}(D_k^i)$ and $B^{\circ k-1}(D_k^i)$ is a 1-drop which does not intersect $\mathbb{D}$. □

Remark. If $z \in N_k^i$, it is not true that $B^{\circ k}$ is the first iterate of $B$ which sends $z$ to the unit disk. In fact, the orbit of $z$ can pass through $\mathbb{D}$ several times before it maps to $N_0$.

Proposition 8.3.
(a) Distinct nuclei are disjoint.
(b) The map $B^{\circ k}$ from $N_k^i$ onto $N_0$ or $N_\infty$ is either a conformal isomorphism or a 2-to-1 branched covering.
Proof. (a) Let $N_k^i$ and $N_\ell^j$ be two distinct nuclei which intersect. By Proposition 8.1(c), we have $k \ne \ell$. Without loss of generality, we assume that $k > \ell$ and the iterate $B^{\circ \ell}$ maps $N_\ell^j$ onto $N_0$. So for every $z$ in the intersection $N_k^i \cap N_\ell^j$, $B^{\circ \ell}(z)$ will belong to $N_0$. This contradicts Lemma 8.2(b).
(b) Since by (a) distinct nuclei are disjoint, an orbit
$$N_k^i = N_k^{i_0} \xrightarrow{\ B\ } N_{k-1}^{i_1} \xrightarrow{\ B\ } \cdots \xrightarrow{\ B\ } N_1^{i_{k-1}} \xrightarrow{\ B\ } N_0 \text{ or } N_\infty$$
can hit every critical point of $B$ at most once. Since the critical point $z = 1$ of $B$ does not belong to any nucleus, the above orbit can only hit the pair of critical points $c$ and $1/\bar{c}$, with $|c| \ne 1$. By Lemma 8.2(a), these critical points cannot belong to the above orbit simultaneously. This means that $B^{\circ k} : N_k^i \to N_0$ or $N_\infty$ is either a conformal isomorphism or a 2-to-1 branched covering. □

Figures 13–15 show the Julia sets of some Blaschke products in $\mathcal{B}^{cm}(\theta)$ for $\theta = (\sqrt{5} - 1)/2$. In Fig. 13 there are two symmetric attracting cycles in the nuclei $N_0$ and $N_\infty$ whose basins of attraction consist of the topological disks in black. Figure 14 shows the Julia set of a map outside of the connectedness locus $\mathcal{C}(\theta)$ (see Sect. 10). In Fig. 15 there is a critical point in the nucleus of the large 1-drop attached to the unit disk at $z = 1$ which maps into $N_0$. Hence this nucleus contains the zeros $p$ and $q$. Surgery (see Sect. 9 below) will turn the first Blaschke product into a hyperbolic-like cubic, while it sends the second to a cubic in $\Omega_{ext}$ and the last one to a capture cubic in $\mathcal{P}^{cm}(\theta)$.
Fig. 13.
Fig. 14.
Fig. 15.
9. The Surgery

For the rest of the paper, unless otherwise stated, we assume that $\theta$ is an irrational number of bounded type. We describe a surgery on Blaschke products in $\mathcal{B}^{cm}(\theta)$ to obtain cubic polynomials in $\mathcal{P}^{cm}(\theta)$. A similar surgery has been done in the case of quadratic polynomials [D2] using the following theorem of Świątek and Herman (see [Sw] or [H2]). Recall that a homeomorphism $h : \mathbb{R} \to \mathbb{R}$ is called $k$-quasisymmetric, or simply quasisymmetric, if
$$0 < k^{-1} \le \frac{|h(x + t) - h(x)|}{|h(x) - h(x - t)|} \le k < +\infty$$
for all $x$ and all $t > 0$. A homeomorphism $h : \mathbb{T} \to \mathbb{T}$ is $k$-quasisymmetric if its lift to $\mathbb{R}$ has this property.

Theorem 9.1 (Linearization of Critical Circle Maps). Let $f : \mathbb{T} \to \mathbb{T}$ be a real-analytic homeomorphism with finitely many critical points and rotation number $\theta$. Then there exists a quasisymmetric homeomorphism $h : \mathbb{T} \to \mathbb{T}$ which conjugates $f$ to the rigid rotation $R_\theta : z \mapsto e^{2\pi i \theta} z$ if and only if $\theta$ is an irrational number of bounded type. Moreover, if $f$ belongs to a compact family of real-analytic homeomorphisms with rotation number $\theta$, then $h$ is $k$-quasisymmetric, where the constant $k$ only depends on the family and not on the choice of $f$.

Let us briefly sketch what this surgery does on a Blaschke product $B \in \mathcal{B}^{cm}(\theta)$. By Proposition 7.6, the restriction $B|_{\mathbb{T}}$ is a real-analytic homeomorphism with one (or two) critical point(s). When the rotation number of this circle map is of bounded type, by Theorem 9.1 one can find a unique $k$-quasisymmetric homeomorphism $h : \mathbb{T} \to \mathbb{T}$ with $h(1) = 1$ such that the following diagram commutes:
$$\begin{array}{ccc} \mathbb{T} & \xrightarrow{\ B\ } & \mathbb{T} \\ \downarrow h & & \downarrow h \\ \mathbb{T} & \xrightarrow{\ R_\theta\ } & \mathbb{T} \end{array}$$
Moreover, $\{B|_{\mathbb{T}}\}_{B \in \mathcal{B}^{cm}(\theta)}$ is contained in a compact family (see Theorem 12.3), hence $h$ is $k(\theta)$-quasisymmetric, where the constant $k(\theta)$ only depends on the family $\mathcal{B}^{cm}(\theta)$. We can extend $h$ to a $K(\theta)$-quasiconformal homeomorphism $H : \mathbb{D} \to \mathbb{D}$ whose dilatation depends only on $k(\theta)$. Possible extensions are given by the theorem of Beurling and Ahlfors [A] or Douady and Earle [DE] (which has the advantage of being conformally invariant). Define a modified Blaschke product $\tilde{B}$ as follows:
$$\tilde{B}(z) = \begin{cases} B(z) & |z| \ge 1 \\ (H^{-1} \circ R_\theta \circ H)(z) & |z| < 1 \end{cases} \tag{9.1}$$
This amounts to cutting out the unit disk and gluing in a Siegel disk instead. Note that the two definitions match along $\mathbb{T}$ by the above commutative diagram. Now define a conformal structure $\sigma$ on the plane as follows: On $\mathbb{D}$, let $\sigma$ be the pull-back $H^* \sigma_0$ of the standard conformal structure $\sigma_0$. Since $R_\theta$ preserves $\sigma_0$, $\tilde{B}$ will preserve $\sigma$ on $\mathbb{D}$. For every $k \ge 1$, pull $\sigma|_{\mathbb{D}}$ back by $\tilde{B}^{\circ k} = B^{\circ k}$ on $\tilde{B}^{-k}(\mathbb{D}) \smallsetminus \mathbb{D}$ (which consists of all the maximal $k$-drops of $B$; see Sect. 10). Since $B^{\circ k}$ is holomorphic, this does not increase the dilatation of $\sigma$. Finally, let $\sigma = \sigma_0$ on the rest of the plane. By the construction, $\sigma$ has bounded dilatation and is invariant under $\tilde{B}$. Therefore, by the Measurable Riemann Mapping Theorem, we can find a quasiconformal homeomorphism $\varphi : \mathbb{C} \to \mathbb{C}$ such that $\varphi^* \sigma_0 = \sigma$. Set
$$P = \varphi \circ \tilde{B} \circ \varphi^{-1}. \tag{9.2}$$
Then $P$ is a quasiregular self-map of the sphere which preserves $\sigma_0$, hence it is holomorphic. Also $P$ is proper of degree 3 since $\tilde{B}$ has the same properties. Therefore $P$ is a cubic polynomial. Evidently, $\varphi(\mathbb{D})$ is a Siegel disk for $P$ whose boundary $\varphi(\mathbb{T})$ is a quasicircle passing through the critical point $\varphi(1)$.

To mark the critical points of $P$, hence getting an element of $\mathcal{P}^{cm}(\theta)$, we must normalize $\varphi$ carefully. Recall from Sect. 8 that $\mathcal{B}^{cm}(\theta)$ is uniformized by the parameter $\mu \in \mathbb{C}^*$ as follows: If $|\mu| \ge 1$, $B_\mu$ has marked critical points at $m(1) = \mu$, $m(2) = 1$, while for $|\mu| \le 1$, $B_\mu$ has marked critical points at $m(1) = 1$, $m(2) = 1/\mu$. In the first case, we normalize $\varphi$ such that $\varphi(H^{-1}(0)) = 0$ and $\varphi(1) = 1$. Call $\varphi(\mu) = c$ and mark the critical points of $P$ by declaring $P = P_c$ as in Sect. 2. In the case $|\mu| \le 1$, we normalize $\varphi$ similarly by putting $\varphi(H^{-1}(0)) = 0$ and $\varphi(1/\mu) = 1$, but this time we call $\varphi(1) = c$ and set $P = P_c$. It is easy to see that when $|\mu| = 1$, both normalizations produce the same critically marked cubic polynomial in $\mathcal{P}^{cm}(\theta)$.

Let us denote the polynomial $P$ constructed this way by $S_H(B)$. We will see that for two quasiconformal extensions $H$ and $H'$, the cubics $S_H(B)$ and $S_{H'}(B)$ are quasiconformally conjugate and the conjugacy is conformal everywhere except on the grand orbit of the Siegel disk centered at the origin. When $S_H(B)$ is capture, we can certainly end up with two different cubics if we choose the extensions arbitrarily. In fact, let $k$ be the first moment the orbit of the critical point $c$ of $B$ hits the unit disk, and let $w = B^{\circ k}(c)$. Then for two quasiconformal extensions $H$ and $H'$, the captured images of the critical points of $S_H(B)$ and $S_{H'}(B)$ have the same conformal position in their corresponding Siegel disks if and only if $H(w) = H'(w)$. It follows that $S_H(B) \ne S_{H'}(B)$ as soon as we choose two different extensions $H, H'$ with $H(w) \ne H'(w)$.

The following proposition has a very non-trivial content in case the result of the surgery is a cubic whose Julia set has positive measure (say, in a queer component). It is the Bers Sewing Lemma which makes the proof work.

Proposition 9.2. Let $P = S_H(B)$ and $H'$ be any other quasiconformal extension of the circle homeomorphism $h$ which linearizes $B|_{\mathbb{T}}$. Then, if $P$ is not capture, $S_H(B) = S_{H'}(B)$. On the other hand, when $P$ is capture, $S_H(B) = S_{H'}(B)$ if and only if $H(w) = H'(w)$, where $w \in \mathbb{D}$ is the captured image of the critical point of $B$.

Proof.
Let $Q = S_{H'}(B)$ and $\varphi_H$ and $\varphi_{H'}$ denote the quasiconformal homeomorphisms which satisfy $P = \varphi_H \circ \tilde{B}_H \circ \varphi_H^{-1}$ and $Q = \varphi_{H'} \circ \tilde{B}_{H'} \circ \varphi_{H'}^{-1}$ as in (9.2). The homeomorphism $\varphi$ defined by
$$\varphi(z) = \begin{cases} (\varphi_{H'} \circ \varphi_H^{-1})(z) & z \in \mathbb{C} \smallsetminus GO(\Delta_P) \\ (\varphi_{H'} \circ B^{-k} \circ H'^{-1} \circ H \circ B^{\circ k} \circ \varphi_H^{-1})(z) & z \in P^{-k}(\Delta_P) \end{cases}$$
is quasiconformal and conjugates $P$ to $Q$. By Lemma 3.5, one can find a quasiconformal conjugacy $\psi : \mathbb{C} \to \mathbb{C}$ between $P$ and $Q$ which is conformal on the grand orbit $GO(\Delta_P)$ and agrees with $\varphi$ everywhere else. By the Bers Sewing Lemma, $\bar{\partial}\psi = \bar{\partial}\varphi$ almost everywhere on $\mathbb{C} \smallsetminus GO(\Delta_P)$. But the latter generalized partial derivative vanishes almost everywhere on $\mathbb{C} \smallsetminus GO(\Delta_P)$ because the surgery does not change the conformal structures outside $\bigcup_{k \ge 0} \tilde{B}^{-k}(\mathbb{D})$. Hence $\bar{\partial}\psi = 0$ almost everywhere on $\mathbb{C}$, which means $\psi$ is conformal. This shows $P = Q$. □

Convention. For the rest of this paper, we always choose the Douady–Earle extension of circle homeomorphisms to perform surgery. By the above proposition, this is really a "choice" only in the capture case. We can therefore neglect the dependence on $H$ and call $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ the surgery map.

As an immediate corollary of the normalization of $\varphi$ and the construction of $S$, we have the following:
Corollary 9.3. Let $\mu \in \mathbb{C}^*$ and $P_c = S(B_\mu)$ be the cubic obtained by performing the above surgery.
• If $|\mu| > 1$, then $1 \in \partial\Delta_c$ and $c \notin \partial\Delta_c$.
• If $|\mu| < 1$, then $c \in \partial\Delta_c$ and $1 \notin \partial\Delta_c$.
• If $|\mu| = 1$, then both $c$ and $1 \in \partial\Delta_c$.

10. The Blaschke Connectedness Locus $\mathcal{C}(\theta)$

Suggested by the case of cubic polynomials, we define the Blaschke connectedness locus $\mathcal{C}(\theta)$ by
$$\mathcal{C}(\theta) = \{B \in \mathcal{B}^{cm}(\theta) : \text{the Julia set } J(B) \text{ is connected}\}.$$
The following theorem provides a useful characterization of $\mathcal{C}(\theta)$ in terms of the critical orbits.

Theorem 10.1. $B \in \mathcal{C}(\theta)$ if and only if one of the following holds:
• The orbit of $c$, the critical point of $B$ in $\mathbb{C} \smallsetminus \mathbb{D}$ other than 1, eventually hits $\mathbb{D}$.
• The orbit of $c$ never hits $\mathbb{D}$, but remains bounded.

The proof of this theorem depends on an alternative dynamical description for Julia sets of Blaschke products in $\mathcal{B}^{cm}(\theta)$ which is obtained by taking pull-backs along a certain type of drops called maximal drops. This description will be useful later in the proof of Theorem 13.1.

Definition. Let $D_k^i$ be a $k$-drop of $B \in \mathcal{B}^{cm}(\theta)$. We call $D_k^i$ a maximal drop if $D_k^i = \mathbb{D}$, or if $D_k^i \cap \mathbb{D} = \emptyset$ and $D_k^i$ is not contained in any other $\ell$-drop of $B$ for $\ell \ge 1$. It follows in particular that maximal drops of $B$ are disjoint.

Proposition 10.2. Let $B \in \mathcal{B}^{cm}(\theta)$ and let $P = S(B) = \varphi \circ \tilde{B} \circ \varphi^{-1}$ as in (9.2). Then
(a) $D_k^i$ is a maximal drop of $B$ if and only if $\varphi(D_k^i)$ is a Fatou component of $P$ which eventually maps to the Siegel disk $\Delta_P$.
(b) $\varphi$ maps the nucleus $N_\infty$ of $B$ onto $\hat{\mathbb{C}} \smallsetminus \overline{GO(\Delta_P)}$.
(c) The boundary of the immediate basin of attraction of infinity for $B$ is precisely the closure of the union of the boundaries of all maximal drops of $B$. Under $\varphi$ this set maps to the Julia set $J(P)$.

Proof. (a) and (b) are easy consequences of the definitions.
For (c), just note that under $\varphi$, the boundary of the immediate basin of attraction of infinity for $B$ corresponds to the similar boundary for $P$, and the closure of the union of the boundaries of all maximal drops of $B$ corresponds to the Julia set $J(P)$ by (a). □

Lemma 10.3 (Alternative description for Julia Sets). Let $B \in \mathcal{B}^{cm}(\theta)$ and let $J_0$ be the boundary of the immediate basin of attraction of infinity for $B$. Define a sequence of compact sets $J_n = J_n(B)$ inductively by
$$J_n = \bigcup_{D_k^i \text{ maximal}} B^{-k}(I J_{n-1} \cap \mathbb{D}) \cap D_k^i. \tag{10.1}$$
Then
$$J(B) = \overline{\bigcup_{n \ge 0} J_n}. \tag{10.2}$$
Proof. Each $J_n$ is compact and contained in $J(B)$. By Proposition 10.2(c), $J_0 \subset J_1$ and it follows by induction on $n$ that $J_n \subset J_{n+1}$ for $n \ge 0$. Put
$$J_\infty = \overline{\bigcup_{n \ge 0} J_n}.$$
Clearly $J_\infty$ is compact and contained in the Julia set $J(B)$, and it is not hard to see that it is invariant under the reflection $I$. We will show that $J_\infty$ is totally invariant under $B$, i.e., $B^{-1}(J_\infty) = J_\infty$. This will prove that $J_\infty = J(B)$.

First we prove that $J_\infty$ is forward invariant. For any $n$, it follows from (10.1) that $B(J_n \smallsetminus \mathbb{D}) \subset J_n \subset J_\infty$. On the other hand, $B(J_n \cap \mathbb{D}) = B(I J_{n-1} \cap \mathbb{D}) = I B(J_{n-1} \smallsetminus \mathbb{D}) \subset I J_\infty = J_\infty$. These two inclusions show that $B(J_n) \subset J_\infty$, hence $B(J_\infty) \subset J_\infty$.

To prove backward invariance, first note that for any $n$, $B^{-1}(J_n) \smallsetminus \mathbb{D} \subset J_n \subset J_\infty$ by (10.1). To obtain the same kind of inclusion for $B^{-1}(J_n) \cap \mathbb{D}$, we distinguish two cases. First,
$$B^{-1}(J_n \cap \mathbb{D}) \cap \mathbb{D} = B^{-1}(I J_{n-1} \cap \mathbb{D}) \cap \mathbb{D} \subset I(B^{-1}(J_{n-1} \smallsetminus \mathbb{D})) \subset I J_{n-1} \cup J_n \subset J_\infty.$$
Second,
$$B^{-1}(J_n \smallsetminus \mathbb{D}) \cap \mathbb{D} = I(B^{-1}(I J_n \cap \mathbb{D}) \smallsetminus \mathbb{D}) \subset I(B^{-1}(J_{n+1}) \smallsetminus \mathbb{D}) \subset I J_{n+1} \subset J_\infty.$$
Altogether, these three inclusions show that $B^{-1}(J_n) \subset J_\infty$ for all $n$. Hence $B^{-1}(J_\infty) \subset J_\infty$ and this proves (10.2). □

Proof of Theorem 10.1. One direction is quite easy to see: If the orbit of $c$ never hits the closed unit disk and escapes to infinity, one can easily show that $J(B)$ is disconnected in a way identical to the polynomial case by considering the Böttcher map of the immediate basin of attraction of $\infty$ for $B$ (see for example [M1, Theorem 17.3]). Conversely, suppose that the orbit of the critical point $c$ either hits $\mathbb{D}$ or stays bounded in $\mathbb{C} \smallsetminus \mathbb{D}$. Then the Julia set $J(P)$ is connected, where $P = S(B)$. Consider the sequence of compact sets $J_n$ in (10.1). By Proposition 10.2(c), $J_0$ is connected and it follows by induction on $n$ that each $J_n$ defined by (10.1) is connected. Therefore (10.2) shows that $J(B)$ is connected. Hence $B \in \mathcal{C}(\theta)$. □

In what follows, we prove that the connectedness locus $\mathcal{C}(\theta)$ is compact. Other facts, e.g. having only two complementary components, or connectivity, will be proved later using surgery (see Corollary 13.4 and Corollary 13.5).
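The dichotomy in Theorem 10.1 suggests a simple numerical experiment: follow an orbit outside the unit disk and record whether it enters the closed disk or runs off to infinity. The sketch below is only a heuristic illustration and is not taken from the paper. The names `classify_orbit`, `max_iter` and `escape_radius` are hypothetical, and the map is a general product of the shape (8.1) whose parameters are not required to lie in the family studied here. A finite computation can never certify that an orbit stays bounded, so that case is reported as undecided.

```python
def blaschke(z, rot, p, q):
    """B(z) = rot * z^3 (z-p)/(1-conj(p) z) (z-q)/(1-conj(q) z), with |rot| = 1."""
    fp = (z - p) / (1 - p.conjugate() * z)
    fq = (z - q) / (1 - q.conjugate() * z)
    return rot * z * z * z * fp * fq

def classify_orbit(z0, rot, p, q, max_iter=200, escape_radius=1e6):
    """Follow the forward orbit of z0 under B and report what is observed.

    Returns "captured" if the orbit enters the closed unit disk,
    "escaping" if it leaves the disk of radius escape_radius (an assumed cutoff),
    and "undecided" if neither happens within max_iter steps.
    """
    z = complex(z0)
    for _ in range(max_iter + 1):
        if abs(z) <= 1:
            return "captured"
        if abs(z) > escape_radius:
            return "escaping"
        # All poles of B lie inside the unit disk, so this evaluation is safe here.
        z = blaschke(z, rot, p, q)
    return "undecided"
```

For instance, with the assumed values `rot = 1` and `p = q = 2`, the call `classify_orbit(100, 1, 2+0j, 2+0j)` reports an escaping orbit, since $|B(z)|$ grows like $|z|^3/|pq|$ for large $|z|$.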
We would like to remark that unlike the case of cubic polynomials, it is often difficult to prove anything about the topology of the Blaschke connectedness locus, partly because of the complicated way these Blaschke products depend on their critical points, but more importantly because of the fact that the family $\mu \mapsto B_\mu$ does not depend holomorphically on $\mu$.

Lemma 10.4. Let $\{B_{\mu_n}\}$ be an arbitrary sequence of Blaschke products in $\mathcal{B}^{cm}(\theta)$ and $h_n : \mathbb{T} \to \mathbb{T}$ be the unique normalized quasisymmetric homeomorphism which conjugates $B_{\mu_n}|_{\mathbb{T}}$ to the rigid rotation $R_\theta$. Let $H_n$ denote the Douady–Earle extension of $h_n$. Then the sequence $\{H_n\}$ has a subsequence which converges locally uniformly to a quasiconformal homeomorphism of $\mathbb{D}$. It follows in particular that the sequence $\{H_n^{-1}(0)\}$ stays in a compact subset of the unit disk.

Proof. This follows from the facts that the space of all uniformly quasisymmetric normalized homeomorphisms of the circle is compact [Le, Lemma 5.1] and the Douady–Earle extension depends continuously on the circle homeomorphism [DE]. □
Corollary 10.5. Let $B \in \mathcal{B}^{cm}(\theta)$ and $\varphi_B : \mathbb{C} \to \mathbb{C}$ be the quasiconformal homeomorphism which conjugates the modified Blaschke product $\tilde{B}$ to the cubic $P = S(B)$ as in (9.2): $P = \varphi_B \circ \tilde{B} \circ \varphi_B^{-1}$. Then the family $\mathcal{F} = \{\varphi_B\}_{B \in \mathcal{B}^{cm}(\theta)}$ is normal.

Proof. By the surgery construction as described in Sect. 9, $\mathcal{F}$ is uniformly quasiconformal. Choose a sequence $\{B_{\mu_n}\}$ in $\mathcal{B}^{cm}(\theta)$ and let $\varphi_n = \varphi_{B_{\mu_n}}$ denote the corresponding sequence in $\mathcal{F}$. Choose a subsequence, still denoted by $B_{\mu_n}$, such that $|\mu_n| \ge 1$ for all $n$ (the case $|\mu_n| \le 1$ is similar). By the way we normalized $\varphi_n$, $\varphi_n(H_n^{-1}(0)) = 0$, $\varphi_n(1) = 1$, $\varphi_n(\infty) = \infty$. But $\{H_n^{-1}(0)\}$ lives in a compact subset of $\mathbb{D}$ by the previous lemma. Hence the three points $H_n^{-1}(0)$, 1 and $\infty$ have mutual spherical distance larger than some positive constant independent of $n$. This implies equicontinuity of $\{\varphi_n\}$ by a standard theorem on quasiconformal mappings [Le, Theorem 2.1]. □

Proposition 10.6. The surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is proper.

Proof. Let the sequence $\{B_{\mu_n}\}$ leave every compact set in $\mathcal{B}^{cm}(\theta)$ and consider the corresponding cubics $P_{c_n} = S(B_{\mu_n}) = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1}$. To be more specific, let us assume that the critical point $\mu_n$ tends to infinity. Clearly $c_n = \varphi_n(\mu_n)$. Since $\{\varphi_n\}$ is normal by the above corollary, we simply conclude that $c_n \to \infty$. □

Proposition 10.7. The Blaschke connectedness locus $\mathcal{C}(\theta)$ is compact and invariant under $\mu \mapsto 1/\mu$. As a result, there exists an unbounded component $\Lambda_{ext}$ of $\mathbb{C}^* \smallsetminus \mathcal{C}(\theta)$ which contains a punctured neighborhood of $\infty$ and a corresponding component $\Lambda_{int}$ which is mapped to it by $\mu \mapsto 1/\mu$.

Proof. The invariance follows from the definition of $\mathcal{B}^{cm}(\theta)$ and its identification with $\mathbb{C}^*$. Note that the unit circle $\mathbb{T} \subset \mathcal{B}^{cm}(\theta)$ is contained in $\mathcal{C}(\theta)$ by Theorem 10.1. So $\Lambda_{ext}$ and $\Lambda_{int}$ are actually distinct components of $\mathbb{C}^* \smallsetminus \mathcal{C}(\theta)$. $\mathcal{C}(\theta)$ is clearly closed by Theorem 10.1. Let us prove it is bounded. Assuming the contrary, there is a sequence $B_{\mu_n} \in \mathcal{C}(\theta)$ with $\mu_n \to \infty$ as in the above proof.
It follows from Proposition 10.2(c) and Theorem 10.1 that the corresponding polynomials $P_{c_n} = S(B_{\mu_n}) = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1}$ have connected Julia sets. By Proposition 2.3, $1/30 \le |c_n| \le 30$. This contradicts properness of $S$. □

11. Continuity of the Surgery Map

This section is devoted to the proof of continuity of the surgery map $S$ which depends strongly on the cubic parameter space being one-dimensional. We point out that the situation is similar to Douady–Hubbard's proof of the continuity of the "straightening map" in their study of the space of quadratic-like maps [DH2]. One additional difficulty here is the lack of complete information on quasiconformal conjugacy classes in the nonholomorphic family $\mathcal{B}^{cm}(\theta)$ (the analogue of Theorem 5.5; see however Theorem 12.4).

The idea of the proof is as follows: Given a sequence $B_{\mu_n} \in \mathcal{B}^{cm}(\theta)$ such that $B_{\mu_n} \to B = B_\mu$, we prove that there exists a subsequence $\{B_{\mu_{n(j)}}\}$ such that $S(B_{\mu_{n(j)}}) \to S(B)$ in $\mathcal{P}^{cm}(\theta)$. The topology of the parameter space $\mathcal{P}^{cm}(\theta)$ is local uniform convergence respecting the markings of the critical points. The same is true for $\mathcal{B}^{cm}(\theta)$ with one exception (see Sect. 8): If $\mu$ has absolute value 1, i.e., if $B$ has two double critical points on the unit circle, then $B_{\mu_n} \to B$ means that every subsequence of $\{B_{\mu_n}\}$ has a further subsequence which either converges locally uniformly to $B$ or to its conjugate $R_\mu^{-1} \circ B \circ R_\mu$. From the construction of $S$ it is easy to see that $S(B) = S(R_\mu^{-1} \circ B \circ R_\mu)$. Therefore, in order to prove continuity of $S$, all we have to show is that $B_{\mu_n} \to B$ locally uniformly on $\mathbb{C}$ (respecting the markings of the critical points) implies that for some subsequence $\{B_{\mu_{n(j)}}\}$, $S(B_{\mu_{n(j)}}) \to S(B)$ locally uniformly on $\mathbb{C}$ (again, respecting the markings of the critical points).

So let $h_n$ and $h$ be the unique $k(\theta)$-quasisymmetric homeomorphisms which fix $z = 1$ and conjugate $B_{\mu_n}|_{\mathbb{T}}$ and $B|_{\mathbb{T}}$ to the rigid rotation $R_\theta$. It is easy to see that $h_n \to h$ uniformly on $\mathbb{T}$. Consider the Douady–Earle extensions $H_n$ and $H$, which are $K(\theta)$-quasiconformal homeomorphisms of the unit disk. By the construction of these extensions, $H_n$ and $H$ are real-analytic in $\mathbb{D}$ and $H_n \to H$ locally uniformly in the $C^\infty$ topology [DE]. In particular, the partial derivatives $\partial H_n$ and $\bar{\partial} H_n$ converge locally uniformly in $\mathbb{D}$ to the corresponding derivatives $\partial H$ and $\bar{\partial} H$. This shows that $\sigma_n|_{\mathbb{D}} \to \sigma|_{\mathbb{D}}$ locally uniformly, where $\sigma_n = H_n^* \sigma_0$ and $\sigma = H^* \sigma_0$ are the conformal structures we constructed in the course of surgery for $B_{\mu_n}$ and $B$ (see Sect. 9).

At this point, the main problem is to prove that $B_{\mu_n} \to B$ and $\sigma_n|_{\mathbb{D}} \to \sigma|_{\mathbb{D}}$ implies $\sigma_n \to \sigma$ in the $L^1$-norm on $\mathbb{C}$, for this would show that the normalized solutions $\varphi_n = \varphi_{H_n}$ of the Beltrami equations $\varphi_n^* \sigma_0 = \sigma_n$ converge locally uniformly on $\mathbb{C}$ to the normalized solution $\varphi$ of the equation $\varphi^* \sigma_0 = \sigma$. This would simply mean that $S(B_{\mu_n}) \to S(B)$ as $n \to \infty$. Unfortunately, we cannot prove $\sigma_n \to \sigma$ in $L^1(\mathbb{C})$ in all cases. So, following [DH2], we take a slightly different approach by splitting the argument into two cases depending on whether $S(B)$ is quasiconformally rigid or not. In the former case, we show continuity directly using the rigidity.
In the latter case, however, we prove $\varphi_n \to \varphi$ using the fact that $S(B)$ admits non-trivial deformations.

Theorem 11.1. The surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is continuous.

Proof. Consider $B_{\mu_n}, B \in \mathcal{B}^{cm}(\theta)$ and start with the same construction as above to get a sequence $\{\sigma_n\}$ of conformal structures on the plane with uniformly bounded dilatation and the corresponding sequence $\{\varphi_n\}$ of normalized solutions of $\varphi_n^* \sigma_0 = \sigma_n$. Since $\{\varphi_n\}$ is a normal family by Corollary 10.5, it has a subsequence, still denoted by $\{\varphi_n\}$, which converges locally uniformly to a quasiconformal homeomorphism $\psi : \mathbb{C} \to \mathbb{C}$. Set $P_{c_n} = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1} = S(B_{\mu_n})$, $P = \varphi \circ \tilde{B} \circ \varphi^{-1} = S(B)$, and $Q = \psi \circ \tilde{B} \circ \psi^{-1}$. All these maps are cubic polynomials in $\mathcal{P}^{cm}(\theta)$. Also $P$ is quasiconformally conjugate to $Q$, and $P_{c_n} \to Q$ as $n \to \infty$. We will show that $P = Q$ and this will prove continuity at $B$.

For the rest of the argument, we distinguish two cases: If $P = S(B)$ is quasiconformally rigid, then automatically $P = Q$ and we are done. Otherwise, $P$ is not rigid, so the quasiconformal conjugacy class of $P$ is a non-empty open set $U \subset \mathcal{P}^{cm}(\theta)$ by Corollary 5.2. Assume by way of contradiction that $P \ne Q$. Since $P_{c_n} \to Q$ as $n \to \infty$, $P_{c_n} \in U$ for large $n$. Hence $P_{c_n}$ is quasiconformally conjugate to $P$ for large $n$, i.e., there exists a normalized quasiconformal homeomorphism $\eta_n : \mathbb{C} \to \mathbb{C}$ such that $\eta_n \circ P = P_{c_n} \circ \eta_n$. Observe that the dilatation of $\eta_n$ is uniformly bounded, since by Theorem 5.1 the dilatation of $(\psi \circ \varphi^{-1}) \circ \eta_n^{-1}$ goes to 1 as $n$ goes to $\infty$ (see Fig. 16). By "lifting" $\eta_n$, we can find a quasiconformal conjugacy $\xi_n = \varphi_n^{-1} \circ \eta_n \circ \varphi$ between the modified Blaschke products $\tilde{B}$ and $\tilde{B}_{\mu_n}$, i.e.,
$$\xi_n \circ \tilde{B} = \tilde{B}_{\mu_n} \circ \xi_n. \tag{11.1}$$
Fig. 16. Sketch of the proof of continuity of $S$
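Again, note that the dilatation of $\xi_n$ is uniformly bounded. We prove that the sequence of conformal structures $\{\sigma_n\}$ converges in $L^1(\mathbb{C})$ to $\sigma$. This, by a standard theorem on quasiconformal mappings (see for example [Le, Theorem 4.6]), will show that $\varphi_n \to \varphi$ locally uniformly, hence $P_{c_n} \to P$, hence $P = Q$, which contradicts our assumption. To this end, we introduce the following sequences of conformal structures (where, as usual, we identify a conformal structure with its associated Beltrami differential):
$$\sigma_n^k(z) = \begin{cases} \sigma_n(z) & \text{when } z \in \bigcup_{i=0}^{k} \tilde{B}_{\mu_n}^{-i}(\mathbb{D}) \\ 0 & \text{otherwise} \end{cases}$$
and
$$\sigma^k(z) = \begin{cases} \sigma(z) & \text{when } z \in \bigcup_{i=0}^{k} \tilde{B}^{-i}(\mathbb{D}) \\ 0 & \text{otherwise.} \end{cases}$$
Note that $\sigma^k \to \sigma$ in $L^1(\mathbb{C})$ as $k \to \infty$ and for every fixed $k$, $\sigma_n^k \to \sigma^k$ in $L^1(\mathbb{C})$ as $n \to \infty$.

Lemma 11.2. The $L^1$-norm $\|\sigma_n - \sigma\|_1$ goes to zero as $n \to \infty$ if the area of the open set $\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})$ goes to zero uniformly in $n$ as $k \to \infty$.

Proof. For a given $\varepsilon > 0$, take $k_0$ so large that $k > k_0$ implies $\mathrm{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})\big) < \varepsilon$ for all $n$. Then for a fixed large $k > k_0$ and $n$ large enough,
$$\|\sigma_n - \sigma\|_1 \le \|\sigma_n - \sigma_n^k\|_1 + \|\sigma_n^k - \sigma^k\|_1 + \|\sigma^k - \sigma\|_1 \le \|\sigma_n - \sigma_n^k\|_1 + 2\varepsilon = \int_{\bigcup_{i=k+1}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})} |\sigma_n| \, dx\, dy + 2\varepsilon < 3\varepsilon.$$
This completes the proof of the lemma. □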
So it remains to prove that the area of $\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})$ goes to zero uniformly in $n$ as $k \to \infty$. Clearly $\mathrm{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}^{-i}(\mathbb{D})\big) \to 0$ as $k \to \infty$. Since $\{\xi_n\}$ is uniformly quasiconformal, there is a constant $C \ge 1$ such that
$$C^{-1}\, \mathrm{area}(E) \le \mathrm{area}(\xi_n(E)) \le C\, \mathrm{area}(E)$$
for every $n$ and every measurable set $E \subset \bigcup_{i=0}^{\infty} \tilde{B}^{-i}(\mathbb{D})$. By (11.1),
$$\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D}) = \xi_n\Big(\bigcup_{i=k}^{\infty} \tilde{B}^{-i}(\mathbb{D})\Big),$$
so $\mathrm{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})\big) \le C\, \mathrm{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}^{-i}(\mathbb{D})\big)$ and this proves that the left side goes to zero uniformly in $n$. □

12. Renormalizable Blaschke Products

Here we consider those Blaschke products in $\mathcal{B}^{cm}(\theta)$ from which one can "extract" the standard degree 3 Blaschke product $f_\theta$ to be defined below. The importance of this particular Blaschke product lies in the fact that it provides a model for the dynamics of the quadratic polynomial $Q_\theta : z \mapsto e^{2\pi i \theta} z + z^2$. It will be convenient to define renormalizable Blaschke products in $\mathcal{B}^{cm}(\theta)$ as ones which after the surgery give rise to renormalizable cubics in $\mathcal{P}^{cm}(\theta)$ (see Sect. 4). In what follows we will have to work with a symmetrized version of the notion of a quadratic-like map in order to show that any renormalizable Blaschke product is quasiconformally conjugate near the Julia set of its renormalization to the standard map $f_\theta$. The proof of this fact resembles the proof of [DH2] that every hybrid class of polynomial-like maps contains a polynomial. First we include the following simple fact for completeness.

Proposition 12.1. Let $0 < \theta < 1$ be a given irrational number and $f : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ be a degree 3 Blaschke product with a superattracting fixed point at the origin and a double critical point at $z = 1$. Let the rotation number of $f|_{\mathbb{T}}$ be $\theta$. Then there exists a unique $0 < t(\theta) < 1$ such that
$$f(z) = f_\theta(z) = e^{2\pi i t(\theta)} z^2 \left(\frac{z - 3}{1 - 3z}\right). \tag{12.1}$$

Proof. Clearly $f(z) = e^{2\pi i t} z^2 \left(\dfrac{z - a}{1 - \bar{a} z}\right)$, with $|a| > 1$ and $0 < t < 1$. The fact that $f'(1) = 0$ implies $a = 3$. Since the rotation number of $f|_{\mathbb{T}}$ as a function of $t$ is continuous and strictly monotone at all irrational values, there exists a unique $t$ for which this rotation number is $\theta$. □

Remark. Computer experiments give the value $t(\theta) \approx 0.613648\cdots$ for the golden mean $\theta = (\sqrt{5} - 1)/2$. Figure 17 shows the Julia set of $f_\theta$ for this value of $\theta$.
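As a quick consistency check on the value $a = 3$ (a verification only, not a replacement for the argument in the proof), one can use the logarithmic derivative $L = f'/f$ of $f(z) = e^{2\pi i t} z^2 (z - 3)/(1 - 3z)$. Since $f(1) \ne 0$, we have $f' = fL$, $f'' = f(L^2 + L')$ and, wherever $L = L' = 0$, $f''' = f L''$. So $z = 1$ is a critical point of multiplicity exactly two when $L(1) = L'(1) = 0$ and $L''(1) \ne 0$.

```latex
\begin{align*}
L(z)   &= \frac{2}{z} + \frac{1}{z-3} + \frac{3}{1-3z},
       & L(1)   &= 2 - \tfrac{1}{2} - \tfrac{3}{2} = 0, \\
L'(z)  &= -\frac{2}{z^{2}} - \frac{1}{(z-3)^{2}} + \frac{9}{(1-3z)^{2}},
       & L'(1)  &= -2 - \tfrac{1}{4} + \tfrac{9}{4} = 0, \\
L''(z) &= \frac{4}{z^{3}} + \frac{2}{(z-3)^{3}} + \frac{54}{(1-3z)^{3}},
       & L''(1) &= 4 - \tfrac{1}{4} - \tfrac{27}{4} = -3 \ne 0.
\end{align*}
```

Hence $f'(1) = f''(1) = 0$ and $f'''(1) = -3 f(1) \ne 0$, which agrees with the requirement of a double critical point at $z = 1$.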
This standard degree 3 Blaschke product was introduced by Douady, Ghys, Herman and Shishikura as a model for the quadratic $Q_\theta : z \mapsto e^{2\pi i\theta}z + z^2$ in the case $\theta$ is irrational of bounded type [D2]. It was also used by Petersen [Pe] to prove that the Julia set of $Q_\theta$ is locally connected and has measure zero.
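As a reminder of what "bounded type" means, recall the standard definition (assumed here to agree with the one used earlier in the paper): $\theta$ is of bounded type when the partial quotients of its continued fraction expansion are bounded. The Python sketch below is only an illustration in floating-point arithmetic, with hypothetical helper names. It lists the first partial quotients of the golden mean, the denominators $q_n$ of its rational approximations, and the distances from $q_n\theta$ to the nearest integer.

```python
import math


def partial_quotients(theta, count):
    """First `count` continued fraction digits a_1, a_2, ... of theta in (0, 1)."""
    digits = []
    x = theta
    for _ in range(count):
        x = 1.0 / x
        a = math.floor(x)
        digits.append(a)
        x -= a
    return digits


def convergent_denominators(digits):
    """Denominators q_1, q_2, ... from q_n = a_n q_{n-1} + q_{n-2}, q_0 = 1, q_{-1} = 0."""
    q_prev, q = 0, 1
    result = []
    for a in digits:
        q_prev, q = q, a * q + q_prev
        result.append(q)
    return result


def distance_to_integers(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))


if __name__ == "__main__":
    theta = (math.sqrt(5) - 1) / 2
    digits = partial_quotients(theta, 12)
    qs = convergent_denominators(digits)
    print(digits)
    print(qs)
    print([distance_to_integers(q * theta) for q in qs])
```

Only a modest number of digits is computed because each step of the floating-point expansion amplifies rounding error. The shrinking distances illustrate why the rigid rotation by $\theta$ returns close to its starting point after $q_n$ steps.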
Fig. 17. Julia set of $f_\theta$ for $\theta = (\sqrt{5}-1)/2$
Definition. A Blaschke product $B \in \mathcal{B}^{cm}(\theta)$ is called renormalizable if $S(B) \in \mathcal{P}^{cm}(\theta)$ is a renormalizable cubic, as defined in Sect. 4.
Theorem 12.2. Let $B \in \mathcal{B}^{cm}(\theta)$ be renormalizable. Then there exists a pair of annuli $W' \Subset W$, both containing the unit circle and symmetric with respect to it, and a quasiconformal homeomorphism $\varphi_B : \mathbb{C} \to \mathbb{C}$ such that:
(a) $B : \partial W' \to \partial W$ is a degree 2 covering map,
(b) $\varphi_B \circ I = I \circ \varphi_B$,
(c) $(\varphi_B \circ B)(z) = (f_\theta \circ \varphi_B)(z)$ for all $z \in W'$.
Moreover, one can arrange $\bar\partial\varphi_B = 0$ on $K(B) = \bigcap_{n \geq 0} B^{-n}(W')$.
Proof. Consider the cubic $P = S(B) = \varphi \circ \tilde{B} \circ \varphi^{-1} \in \mathcal{P}^{cm}(\theta)$ which is renormalizable. Consider the quadratic-like restriction $P|_U : U \to V$ and the corresponding regions $U_1 = \varphi^{-1}(U)$ and $V_1 = \varphi^{-1}(V)$. Clearly $U_1 \Subset V_1$ and both contain the closed unit disk. Define the symmetrized regions
\[ W' = U_1 \cap I(U_1), \qquad W = V_1 \cap I(V_1), \]
which are topological annuli with $W' \Subset W$. Note that $B$ sends $\partial W'$ to $\partial W$ in a 2-to-1 fashion. Now extend $B|_{W'}$ to the whole complex plane by gluing it to the polynomial $z \mapsto z^2$ near $0$ and $\infty$ as follows: Let $r > 1$ and $\omega : \mathbb{C} \smallsetminus W' \to \mathbb{C} \smallsetminus A(r^{-1}, r)$ be a diffeomorphism such that $\omega \circ I = I \circ \omega$ and $\omega(B(z)) = \omega(z)^2$ for $z \in \partial W'$. Define the extension of $B|_{W'}$ by
\[ F(z) = \begin{cases} B(z) & z \in W' \\ \omega^{-1}(\omega(z)^2) & z \notin W'. \end{cases} \]
Note that $F$ is a quasiregular degree 3 self-map of the sphere, $F \circ I = I \circ F$, and every point outside $W'$ will converge to $0$ or $\infty$ under the iteration of $F$. Define a conformal structure $\sigma$ on the plane as follows: Put $\sigma = \omega^*\sigma_0$ on $\mathbb{C} \smallsetminus W'$, and pull it back by $F^{\circ n}$ to all the components of $F^{-n}(\mathbb{C} \smallsetminus W') \cap W'$. Finally, on $K(B)$ set $\sigma = \sigma_0$. It is easy to see that $\sigma$ has bounded dilatation on the plane, is symmetric with respect to the unit circle, and $F^*(\sigma) = \sigma$. By the Measurable Riemann Mapping Theorem, there exists a unique quasiconformal homeomorphism $\varphi_B$ of the plane which fixes $0, 1, \infty$, such that $\varphi_B^*(\sigma_0) = \sigma$. The conjugate map $f = \varphi_B \circ F \circ \varphi_B^{-1}$ is easily seen to be a degree 3 rational map on the sphere. The quasiconformal homeomorphism $I \circ \varphi_B \circ I$ also fixes $0, 1, \infty$ and pulls $\sigma_0$ back to $\sigma$ because $\sigma$ is symmetric with respect to $\mathbb{T}$. By uniqueness, $\varphi_B = I \circ \varphi_B \circ I$. This implies that $f$ commutes with $I$, hence it is a Blaschke product. By Proposition 12.1, $f = f_\theta$, and we are done. □
While the above theorem establishes a direct connection between some Blaschke products in $\mathcal{B}^{cm}(\theta)$ and $f_\theta$, it is curious to note the following entirely different relation:
Theorem 12.3. Let $B_{\mu_n}$ be any sequence in $\mathcal{B}^{cm}(\theta)$ such that $\mu_n \to \infty$ as $n \to \infty$. Then $B_{\mu_n} \to f_\theta$ locally uniformly on $\mathbb{C}^*$ as $n \to \infty$. In other words, $f_\theta$ can be regarded as the point at infinity of the parameter space $\mathcal{B}^{cm}(\theta)$.
Proof. As in Sect. 8, let
\[ B_{\mu_n} : z \mapsto e^{2\pi i t_n} z^3 \left(\frac{z - p_n}{1 - \bar{p}_n z}\right)\left(\frac{z - q_n}{1 - \bar{q}_n z}\right). \]
The first and second logarithmic derivatives
\[ \frac{B'_{\mu_n}}{B_{\mu_n}} \quad \text{and} \quad \frac{B_{\mu_n} B''_{\mu_n} - (B'_{\mu_n})^2}{(B_{\mu_n})^2} \]
both vanish at $z = 1$. A brief computation shows that these two conditions translate into
\[ \frac{|p_n|^2 - 1}{|p_n - 1|^2} + \frac{|q_n|^2 - 1}{|q_n - 1|^2} = 3 \tag{12.2} \]
and
\[ \frac{(p_n - \bar{p}_n)(|p_n|^2 - 1)}{|p_n - 1|^4} + \frac{(q_n - \bar{q}_n)(|q_n|^2 - 1)}{|q_n - 1|^4} = 0. \tag{12.3} \]
Since $\mu_n \to \infty$, $p_n$ and $q_n$ cannot both stay bounded. Hence, after relabeling, $p_n \to \infty$ (compare Theorem 7.1). Then (12.2) shows that $(|q_n|^2 - 1)/|q_n - 1|^2 \to 2$, or equivalently, $|q_n - 2| \to 1$, but $q_n$ stays away from $z = 1$ by Lemma 7.4. On the other hand, (12.3) shows that $(q_n - \bar{q}_n)(|q_n|^2 - 1)/|q_n - 1|^4 \to 0$, hence $(q_n - \bar{q}_n)/|q_n - 1|^2 \to 0$. Since $q_n$ does not accumulate on $z = 1$, this implies that $(q_n - \bar{q}_n) \to 0$. Near the circle $|z - 2| = 1$ this can happen only if $q_n \to 3$. Since the rotation number depends continuously on the circle map, it is easy to see that $B_{\mu_n} \to f_\theta$ locally uniformly on $\mathbb{C}^*$. □
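The brief computation behind (12.2) and (12.3) can be reconstructed along the following lines. This is a sketch that uses only the formula for $B_{\mu_n}$ given in the proof; the shorthand $b_p$ and the dropped index $n$ are conveniences introduced here, not the paper's notation.

```latex
% Shorthand (not from the paper): b_p(z) = (z - p)/(1 - \bar p z), index n dropped,
% so that B'/B = 3/z + b_p'/b_p + b_q'/b_q.
\begin{align*}
\frac{b_p'(z)}{b_p(z)}
  &= \frac{1}{z-p} + \frac{\bar p}{1-\bar p z}
   = \frac{1-|p|^2}{(z-p)(1-\bar p z)}, \\
\left(\frac{b_p'}{b_p}\right)'(z)
  &= -\,\frac{(1-|p|^2)\,(1+|p|^2-2\bar p z)}{(z-p)^2(1-\bar p z)^2}.
\end{align*}
% At z = 1 use (1-p)(1-\bar p) = |p-1|^2 and 1+|p|^2-2\bar p = |p-1|^2 + (p-\bar p):
\begin{align*}
\frac{B'}{B}(1)
  &= 3 - \frac{|p|^2-1}{|p-1|^2} - \frac{|q|^2-1}{|q-1|^2}, \\
\left(\frac{B'}{B}\right)'(1)
  &= -3 + \sum_{a\in\{p,q\}}
     \left[\frac{|a|^2-1}{|a-1|^2} + \frac{(a-\bar a)(|a|^2-1)}{|a-1|^4}\right].
\end{align*}
```

Setting the first expression equal to zero gives (12.2). Substituting (12.2) into the second expression cancels the constant $-3$ against the first terms of the sum, and what remains is exactly (12.3).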
Consider a sequence $B_{\mu_n}$ going off to infinity as in the previous theorem. Consider the cubics $P_{c_n} = S(B_{\mu_n}) = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1}$ as in (9.2). By the previous theorem, $B_{\mu_n} \to f_\theta$ locally uniformly on $\mathbb{C}^*$, so $\tilde{B}_{\mu_n} \to \tilde{f}_\theta$ locally uniformly on $\mathbb{C}$. Here $\tilde{f}_\theta$ denotes the modified Blaschke product for $f_\theta$, defined in a way similar to (9.1). Since $\{\varphi_n\}$ is normal by Corollary 10.5, by passing to a subsequence if necessary, $\varphi_n$ converges to a quasiconformal homeomorphism $\varphi$. Since the surgery map is proper by Proposition 10.6, $c_n \to \infty$. By examining the normal form (2.2), we see that $P_{c_n} \to Q$, where $Q : z \mapsto \lambda z(1 - \tfrac{1}{2}z)$ is affinely conjugate to $Q_\theta : z \mapsto e^{2\pi i\theta}z + z^2$. Hence, $Q = \varphi \circ \tilde{f}_\theta \circ \varphi^{-1}$ and we recover the surgery introduced by Douady and others. We conclude that the surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ extends continuously to the points at infinity of both parameter spaces, and the extension is also a surgery.
The next theorem is the analogue of Theorem 5.1 for Blaschke products. It will be more convenient to formulate it for a general Blaschke product since we would like to use it for $f_\theta$ as well as the elements of $\mathcal{B}^{cm}(\theta)$.
Theorem 12.4 (Paths of QC Conjugacies). Let $A$ and $B$ be two Blaschke products of degree $d$ and let $\Phi$ be a quasiconformal homeomorphism which fixes $0, 1, \infty$ such that $\Phi \circ I = I \circ \Phi$ and $\Phi \circ A = B \circ \Phi$. Then there exists a path $\{\Phi_t\}_{0 \leq t \leq 1}$ of quasiconformal homeomorphisms, with $\Phi_0 = \mathrm{id}$ and $\Phi_1 = \Phi$, such that $A_t = \Phi_t \circ A \circ \Phi_t^{-1}$ is a Blaschke product for every $0 \leq t \leq 1$. In particular, either $A$ is quasiconformally rigid or its conjugacy class is non-trivial and path-connected.
Proof. The proof is almost identical to that of Theorem 5.1. Consider $\sigma = \Phi^*\sigma_0$, which is invariant under $A$, and take the real perturbations $\sigma_t = t\sigma$, $0 \leq t \leq 1$. Let $\Phi_t$ be the unique quasiconformal homeomorphism which fixes $0, 1, \infty$ and satisfies $\Phi_t^*\sigma_0 = \sigma_t$. The map $A_t = \Phi_t \circ A \circ \Phi_t^{-1}$ is easily seen to be a degree $d$ rational map.
By uniqueness, $I \circ \Phi_t \circ I = \Phi_t$ since the left-hand side also pulls $\sigma_0$ back to $\sigma_t$ and fixes $0, 1, \infty$. Hence $A_t$ commutes with $I$. So it is a Blaschke product. □
We will need the next lemma in the proof of Theorem 13.3.
Lemma 12.5 (Rigidity on the Julia Set). Let $\psi$ be a quasiconformal homeomorphism defined on an open annulus containing the Julia set $J(f_\theta)$ of the Blaschke product $f_\theta$ defined in (12.1). Suppose that $\psi$ commutes with $I$ and conjugates $f_\theta$ to itself. Then $\psi|_{J(f_\theta)}$ is the identity.
Proof. Extend $\psi$ to a quasiconformal homeomorphism $\mathbb{C} \to \mathbb{C}$ which commutes with $I$ and conjugates $f_\theta$ to itself. By the previous theorem, there exists a path $t \mapsto \psi_t$ of quasiconformal homeomorphisms, with $0 \leq t \leq 1$ and $\psi_0 = \mathrm{id}$, $\psi_1 = \psi$, such that $\psi_t \circ f_\theta \circ \psi_t^{-1}$ is a degree 3 Blaschke product quasiconformally conjugate to $f_\theta$. By Proposition 12.1, this Blaschke product has to be $f_\theta$ itself, so $\psi_t$ commutes with $f_\theta$. The fact that $\psi|_{J(f_\theta)}$ must be the identity map now follows from an argument similar to the proof of Lemma 6.2. □
13. Surjectivity of the Surgery Map
In this section we prove that the surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is surjective. We do this by showing that $S$ is injective on the set of Blaschke products which map to $\mathbb{C}^* \smallsetminus \mathcal{M}(\theta)$ or to hyperbolic-like cubics. The proof of this fact is based on the combinatorics of drops and their nuclei as developed in Sect. 8. Here is the outline of the proof: If $S(A) = S(B)$ for some $A, B \in \mathcal{B}^{cm}(\theta)$, there exists a quasiconformal homeomorphism of the plane which conjugates the modified Blaschke products $\tilde{A}$ and $\tilde{B}$, and which is conformal everywhere except on the union of the maximal drops. A careful analysis will then show that when $S(A)$ is not capture, one can redefine this homeomorphism on all the drops of the two Blaschke products to get a conjugacy between $A$ and $B$ everywhere. A pull-back argument together with the Bers Sewing Lemma at each step shows that this conjugacy is conformal away from the Julia sets (Theorem 13.1). When $S(A)$ is hyperbolic-like or has disconnected Julia set, one can use the renormalization scheme of Sect. 12 and the rigidity on the Julia sets (Lemma 12.5) to conclude that the conjugacy between $A$ and $B$ is in fact conformal (Theorem 13.3). Surjectivity of $S$, Theorem 13.7 and some corollaries will follow immediately.
Theorem 13.1. Let $A, B \in \mathcal{B}^{cm}(\theta)$ and $S(A) = S(B) = P$. Suppose that $P$ is not capture. Then there exists a quasiconformal homeomorphism $\Phi : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ which fixes $0, 1, \infty$, commutes with $I$ and conjugates $A$ to $B$. Moreover, $\Phi$ is conformal on the Fatou set $\hat{\mathbb{C}} \smallsetminus J(A)$.
Proof. Following the notation of (9.2), we assume that $P = \varphi \circ \tilde{A} \circ \varphi^{-1} = \varphi' \circ \tilde{B} \circ \varphi'^{-1}$ for some quasiconformal homeomorphisms $\varphi$ and $\varphi'$. Consider the quasiconformal homeomorphism $\Phi_0 = \varphi'^{-1} \circ \varphi$ which conjugates $\tilde{A}$ to $\tilde{B}$ on the entire plane and is conformal (i.e., $\bar\partial\Phi_0 = 0$) everywhere except on $\bigcup_{k \geq 0} \tilde{A}^{-k}(\mathbb{D})$.
Note that by Proposition 10.2(b) the open set $\hat{\mathbb{C}} \smallsetminus \overline{\bigcup_{k \geq 0} \tilde{A}^{-k}(\mathbb{D})}$ is precisely the nucleus $N_\infty$ as defined in Sect. 8. Also, $\bigcup_{k \geq 0} \tilde{A}^{-k}(\mathbb{D})$ is the disjoint union of the maximal drops of $A$ (which by Proposition 10.2(a) correspond to the bounded Fatou components of $P$ which map to the Siegel disk $\Delta_P$). A similar correspondence holds for the open set $\bigcup_{k \geq 0} \tilde{B}^{-k}(\mathbb{D})$. Therefore, for any maximal $k$-drop $D_k^i(A)$, there corresponds a unique maximal $k$-drop $D_k^i(B) = \Phi_0(D_k^i(A))$.
Finally, note that for any such maximal drops, $A^{\circ k} : D_k^i(A) \to \mathbb{D}$ and $B^{\circ k} : D_k^i(B) \to \mathbb{D}$ are conformal isomorphisms since by our assumption $P$ is not capture.
In what follows we construct a sequence of quasiconformal homeomorphisms $\Phi_n : \mathbb{C} \to \mathbb{C}$ which preserve the unit circle $\mathbb{T}$ and another sequence $\Upsilon_n$ by symmetrizing each $\Phi_n$:
\[ \Upsilon_n(z) = \begin{cases} \Phi_n(z) & |z| \geq 1 \\ (I \circ \Phi_n \circ I)(z) & |z| < 1. \end{cases} \]
We have already constructed $\Phi_0$, hence $\Upsilon_0$. Consider the sequences of compact sets $\{J_n(A)\}$ and $\{J_n(B)\}$ as in Lemma 10.3. Note that $\Phi_0 \circ A = B \circ \Phi_0$ on $J_0(A)$. The next step is to define $\Phi_1$: Let $\Phi_1 = \Upsilon_0$ everywhere except on the maximal drops of $A$. On any maximal $k$-drop $D_k^i(A)$ we define $\Phi_1 : D_k^i(A) \to D_k^i(B)$ by $B^{-k} \circ \Upsilon_0 \circ A^{\circ k}$. (When $k = 0$, the only maximal 0-drop is $\mathbb{D}$ and by this definition $\Phi_1|_{\mathbb{D}} = \Upsilon_0|_{\mathbb{D}}$.) Observe that the two definitions match along the common boundary. Hence $\Phi_1$ is in fact a quasiconformal homeomorphism by the Bers Sewing Lemma. Note that $\Phi_1|_{J_0(A)} = \Phi_0|_{J_0(A)}$ and by definition of $J_1(A)$ in (10.1), $\Phi_1 \circ A = B \circ \Phi_1$ on $J_1(A)$. The homeomorphism $\Upsilon_1$ is then obtained by symmetrizing $\Phi_1$. Continuing inductively, we define $\Phi_n$ to be equal to $\Upsilon_{n-1}$ everywhere except on the maximal drops of $A$ and then on the maximal drops we define it by taking pull-backs. In other words, $\Phi_n : D_k^i(A) \to D_k^i(B)$ will be defined by $B^{-k} \circ \Upsilon_{n-1} \circ A^{\circ k}$.
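The symmetrization step that produces $\Upsilon_n$ from $\Phi_n$ can be illustrated with a toy computation. The Python sketch below is not part of the construction in the paper: `sample_map` is a hypothetical circle-preserving map chosen only because it fails to commute with the reflection $I(z) = 1/\bar{z}$, and the point $z = 0$ is excluded because `reflect` is undefined there.

```python
import cmath


def reflect(z):
    """Reflection I(z) = 1 / conj(z) in the unit circle (z must be nonzero)."""
    return 1 / z.conjugate()


def symmetrize(phi):
    """Return the map equal to phi on |z| >= 1 and to I o phi o I on 0 < |z| < 1."""
    def upsilon(z):
        if abs(z) >= 1:
            return phi(z)
        return reflect(phi(reflect(z)))
    return upsilon


def sample_map(z):
    """Hypothetical map preserving |z| = 1 that does not commute with reflect."""
    return z * (1 + abs(z)) / 2 * cmath.exp(0.3j)


if __name__ == "__main__":
    upsilon = symmetrize(sample_map)
    z = 0.5 + 0.2j
    # The two printed values agree up to rounding: upsilon commutes with reflect.
    print(upsilon(reflect(z)), reflect(upsilon(z)))
```

By construction the symmetrized map agrees with the original one outside the open unit disk and commutes with the reflection everywhere it is defined, which is the property the inductive construction relies on.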
Lemma 13.2. The sequence of quasiconformal homeomorphisms $\{\Phi_n\}$ has the following properties:
\[ \Phi_n|_{J_{n-1}(A)} = \Phi_{n-1}|_{J_{n-1}(A)}, \tag{13.1} \]
and
\[ (\Phi_n \circ A)(z) = (B \circ \Phi_n)(z), \qquad z \in J_n(A). \tag{13.2} \]
Proof. Both properties follow by induction on $n$. Let us prove (13.1) first. We have already seen (13.1) for $n = 1$. Assume (13.1) is true for $n$ and let $z \in J_n(A)$. We distinguish three cases:
• Case 1. $z \in J_n(A) \cap \mathbb{D}$. Then $I(z) \in J_{n-1}(A)$ and we have $\Phi_{n+1}(z) = \Upsilon_n(z) = (I \circ \Phi_n \circ I)(z) = (I \circ \Phi_{n-1} \circ I)(z)$ by the induction hypothesis. The latter is clearly equal to $\Upsilon_{n-1}(z) = \Phi_n(z)$.
• Case 2. $z \in J_n(A) \smallsetminus \mathbb{D}$ and $A^{\circ k}(z) \in \mathbb{D}$ for some $k \geq 1$. Then $A^{\circ k}(z) \in I(J_{n-1}(A))$ and hence $(I \circ A^{\circ k})(z) \in J_{n-1}(A)$. So $\Phi_{n+1}(z) = (B^{-k} \circ \Upsilon_n \circ A^{\circ k})(z) = (B^{-k} \circ I \circ \Phi_n \circ I \circ A^{\circ k})(z) = (B^{-k} \circ I \circ \Phi_{n-1} \circ I \circ A^{\circ k})(z)$ by the induction hypothesis. Again, the latter is equal to $(B^{-k} \circ \Upsilon_{n-1} \circ A^{\circ k})(z) = \Phi_n(z)$.
• Case 3. $z \in J_n(A) \smallsetminus \mathbb{D}$ and $z$ is accumulated by points of the form considered in Case 2. Then, clearly, $\Phi_{n+1}(z) = \Phi_n(z)$ by continuity.
Altogether the three cases show that $\Phi_{n+1}|_{J_n(A)} = \Phi_n|_{J_n(A)}$, which completes the induction step and the proof of (13.1).
To prove (13.2) we have to work a little bit more. We have already seen (13.2) for $n = 1$. Assume (13.2) is true for $n$ and let $z \in J_{n+1}(A)$. We split the induction step into the following cases:
• Case 1. $z \in J_{n+1}(A) \smallsetminus \mathbb{D}$ and $A(z) \notin \mathbb{D}$. Then $(\Phi_{n+1} \circ A)(z) = (B \circ \Phi_{n+1})(z)$ automatically since $\Phi_{n+1}$ is defined by pull-backs.
• Case 2. $z \in J_{n+1}(A) \smallsetminus \mathbb{D}$ but $A(z) \in \mathbb{D}$. Then $(\Phi_{n+1} \circ A)(z) = (\Upsilon_n \circ A)(z) = (B \circ B^{-1} \circ \Upsilon_n \circ A)(z) = (B \circ \Phi_{n+1})(z)$.
• Case 3. $z \in J_{n+1}(A) \cap \mathbb{D}$ and $A(z) \in \mathbb{D}$. Then $(\Phi_{n+1} \circ A)(z) = (\Upsilon_n \circ A)(z) = (I \circ \Phi_n \circ I)(A(z)) = (I \circ \Phi_n \circ A)(I(z))$. But $I(z) \in J_n(A)$ so by the induction hypothesis, $(I \circ \Phi_n \circ A)(I(z)) = (I \circ B \circ \Phi_n)(I(z)) = (B \circ I \circ \Phi_n)(I(z)) = (B \circ \Upsilon_n)(z) = (B \circ \Phi_{n+1})(z)$.
• Case 4. $z \in J_{n+1}(A) \cap \mathbb{D}$ but $A(z) \notin \mathbb{D}$. Then $I(z) \in J_n(A)$. Let $w = A(z)$. Since $A(I(z)) = I(w) \in \mathbb{D}$, we have $I(w) \in I(J_{n-1}(A))$, hence $w \in J_{n-1}(A)$. By (13.1), one has $\Phi_{n+1}(w) = \Phi_n(w) = \Phi_{n-1}(w) = \Upsilon_{n-1}(w) = (I \circ \Upsilon_{n-1} \circ I)(w) = (I \circ \Phi_n \circ I)(w) = (I \circ \Phi_n \circ I)(A(z)) = (I \circ \Phi_n \circ A)(I(z)) = (I \circ B \circ \Phi_n)(I(z))$ by the induction hypothesis. The latter is equal to $(B \circ I \circ \Phi_n)(I(z)) = (B \circ \Upsilon_n)(z) = (B \circ \Phi_{n+1})(z)$. □
Back to the proof of Theorem 13.1. By the Bers Sewing Lemma, the symmetrization $\Phi_n \longrightarrow \Upsilon_n$ does not increase the dilatation. On the other hand, the modification $\Upsilon_n \longrightarrow \Phi_{n+1}$ achieved by pull-backs along the maximal drops does not increase the dilatation either, simply because $A$ and $B$ are holomorphic. So we may assume that $\{\Phi_n\}$ is uniformly quasiconformal. Since all the $\Phi_n$ fix $0, 1, \infty$, it follows that some subsequence $\Phi_{n(j)}$ converges locally uniformly to a quasiconformal homeomorphism $\Phi$. Lemma 10.3 and Lemma 13.2 imply that $\Phi \circ A = B \circ \Phi$ on $J(A)$. In particular, this shows that $\Phi$ sends all the drops of $A$ bijectively to the drops of $B$ (before we only had a correspondence between the maximal drops of $A$ and $B$).
It is easy to check that $\Phi$ obtained this way is conformal on the union $N = \bigcup_{i,k} N_k^i(A)$ of all the nuclei of drops of $A$ at all depths as defined in Sect. 8 and in fact conjugates $A$ to $B$ there. Since $N$ is clearly disjoint from the Julia set $J(A)$ by (8.2), it remains to show that every Fatou component of $A$ is contained in $N$. Consider a component $U$ of the Fatou set of $A$. Under the iteration of $A$, $U$ visits both $\mathbb{D}$ and $\mathbb{C} \smallsetminus \mathbb{D}$ either finitely many times or infinitely often. In the first case, $U$ has to map eventually into the nucleus $N_0(A)$ or $N_\infty(A)$, hence it has to be contained in $N$. We prove that the second case cannot occur. In fact, suppose that the orbit of $U$ visits $\mathbb{D}$ and $\mathbb{C} \smallsetminus \mathbb{D}$ infinitely often. According to Sullivan [Su1], $U$ eventually maps to a periodic Fatou component of $A$ which is either an attracting or parabolic basin or a Siegel disk or a Herman ring. It follows that this cycle of periodic Fatou components intersects both $\mathbb{D}$ and $\mathbb{C} \smallsetminus \mathbb{D}$, so in either case a critical point of $A$ has to enter $\mathbb{D}$ and leave it infinitely often, which is impossible since $S(A)$ is not a capture. This shows that $N = \hat{\mathbb{C}} \smallsetminus J(A)$ and proves that $\Phi$ is a conjugacy between $A$ and $B$ everywhere and is conformal on $\hat{\mathbb{C}} \smallsetminus J(A)$. It is easy to see that $\Phi$ constructed this way commutes with $I$. □
Theorem 13.3. Let $A, B \in \mathcal{B}^{cm}(\theta)$ and $S(A) = S(B)$. If $S(A)$ is hyperbolic-like or has disconnected Julia set, then $A = B$.
Proof. $A$ and $B$ are renormalizable by Theorem 4.2. Consider the quasiconformal homeomorphism $\Phi$ given by Theorem 13.1. By Theorem 12.2, there exists a pair of annuli $W'_A \Subset W_A$ (resp. $W'_B \Subset W_B$) and a quasiconformal homeomorphism $\varphi_A$ (resp. $\varphi_B$) which conjugates $A$ (resp. $B$) to $f_\theta$ on $W'_A$ (resp. $W'_B$). Since $S(A) = S(B)$, we can assume that $W'_B = \Phi(W'_A)$ and $W_B = \Phi(W_A)$. The quasiconformal homeomorphism $\psi = \varphi_B \circ \Phi \circ \varphi_A^{-1} : \varphi_A(W'_A) \to \varphi_B(W'_B)$ is a self-conjugacy of $f_\theta$ near its Julia set which commutes with $I$. By Lemma 12.5, we must have $\psi|_{J(f_\theta)} = \mathrm{id}$. It follows from the Bers Sewing Lemma that the $\bar\partial$-derivative of $\psi$ is zero almost everywhere on $J(f_\theta)$. Since by Theorem 12.2(b) $\varphi_A$ (resp. $\varphi_B$) has zero $\bar\partial$-derivative on $K(A)$ (resp. $K(B)$), we conclude that $\bar\partial\Phi = 0$ almost everywhere on $K(A)$. But, as in the proof of Corollary 4.3, up to a set of measure zero, $J(A) = \bigcup_{n \geq 0} A^{-n}(K(A))$. Therefore, $\bar\partial\Phi$ has to be zero almost everywhere on the Julia set $J(A)$. Hence $\Phi$ is conformal, so $A = B$. □
Remark. We believe that the surgery map is a homeomorphism, at least outside of the capture components where it might have branching. This would imply that the connectedness loci $\mathcal{C}(\theta)$ and $\mathcal{M}(\theta)$ are actually homeomorphic, a conjecture that is strongly supported by computer experiments.
Corollary 13.4. The surgery map $S$ restricts to a homeomorphism $\Lambda_{ext} \xrightarrow{\ \simeq\ } \Omega_{ext}$. A similar conclusion holds for $\Lambda_{int}$ and $\Omega_{int}$. In particular, the connectedness locus $\mathcal{C}(\theta)$ is connected.
Proof. Clearly $S$ maps $\Lambda_{ext}$ into $\Omega_{ext}$ injectively by the previous theorem. Since $S$ is a proper map by Proposition 10.6, it extends to a continuous injection $\Lambda_{ext} \cup \{\infty\} \hookrightarrow \Omega_{ext} \cup \{\infty\}$. We claim that this injection is onto. To this end, we show that for any sequence $B_{\mu_n} \in \Lambda_{ext}$ which converges to the boundary of the connectedness locus $\mathcal{C}(\theta)$, the sequence $P_{c_n} = S(B_{\mu_n}) \in \Omega_{ext}$ converges to the boundary of $\mathcal{M}(\theta)$. If not, there is a subsequence of $B_{\mu_n}$ which converges to $B \in \partial\mathcal{C}(\theta)$ but the corresponding subsequence of $P_{c_n}$ converges to some $P \in \Omega_{ext}$. By continuity, $P = S(B)$. But $B$ has connected Julia set while $J(P)$ is disconnected. This is impossible by Theorem 10.1. □
Corollary 13.5. The connectedness locus $\mathcal{C}(\theta)$ has only two complementary components $\Lambda_{ext}$ and $\Lambda_{int}$.
Proof. Let $U$ be a bounded component of $\mathbb{C}^* \smallsetminus \mathcal{C}(\theta)$ which is not $\Lambda_{int}$. Without loss of generality, we assume that $U$ maps into $\Omega_{ext}$ by $S$. Take $A \in U$. By the previous corollary, there exists a $B \in \Lambda_{ext}$ such that $S(A) = S(B)$. By Theorem 13.3, $A = B$ and this is a contradiction. □
Corollary 13.6. The surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is surjective.
Proof. Compactify $\mathcal{B}^{cm}(\theta)$ and $\mathcal{P}^{cm}(\theta)$ by adding points at $0$ and $\infty$ to get topological 2-spheres. $S$ extends to a continuous map between these spheres by Proposition 10.6. This map has topological degree $\neq 0$ because it is a homeomorphism $\Lambda_{ext} \xrightarrow{\ \simeq\ } \Omega_{ext}$ and $S^{-1}(\Omega_{ext}) = \Lambda_{ext}$. Therefore it has to be surjective. □
Since the boundary of the Siegel disk of a cubic which comes from the surgery is a quasicircle passing through some critical point, we have proved the main result (A3) of the introduction:
Theorem 13.7 (Bounded type cubic Siegel disks are quasidisks). Let $P$ be a cubic polynomial which has a fixed Siegel disk $\Delta$ of rotation number $\theta$. Let $\theta$ be of bounded type. Then the boundary of $\Delta$ is a quasicircle which contains one or both critical points of $P$.
By a recent theorem of Graczyk and Jones [GJ], we have
Corollary 13.8. Under the assumptions of Theorem 13.7, the boundary of the Siegel disk $\Delta$ has Hausdorff dimension greater than 1.
A recent result of McMullen [Mc3] implies the following interesting fact: The Hausdorff dimension of $\partial\Delta_c$ is equal to the Hausdorff dimension $1 < \delta(\theta) < 2$ of the boundary of the Siegel disk of $Q_\theta : z \mapsto e^{2\pi i\theta}z + z^2$ whenever $P_c$ is renormalizable. It follows from Theorem 4.2 that the function $c \mapsto \mathrm{HD}(\partial\Delta_c)$ takes on the single value $\delta(\theta)$ on $\Omega_{ext}$, $\Omega_{int}$ as well as on all the hyperbolic-like components of $\mathcal{M}(\theta)$. (One can actually find rigorous estimates for the value of $\delta(\theta)$ when $\theta = (\sqrt{5}-1)/2$; see [BOS].)
Now it is possible to show that despite all the bifurcations taking place near the boundary of the connectedness locus $\mathcal{M}(\theta)$, which give rise to discontinuity of the Julia sets, the boundaries of the Siegel disks move continuously.
Theorem 13.9 (Boundaries of Siegel disks move continuously). The boundary $\partial\Delta_c$ of the Siegel disk of $P_c \in \mathcal{P}^{cm}(\theta)$ centered at $0$ is a continuous function of $c \in \mathbb{C}^*$ in the Hausdorff topology.
Proof. Let us fix some $P \in \mathcal{P}^{cm}(\theta)$. If $P \notin \partial\mathcal{M}(\theta)$, Theorem 3.1 shows that $J(P)$, hence $\partial\Delta_P$, moves holomorphically in a neighborhood of $P$ and continuity at $P$ is obvious. So let us assume that $P \in \partial\mathcal{M}(\theta)$ and consider a sequence $P_{c_n} \in \mathcal{P}^{cm}(\theta)$ which converges to $P$ as $n \to \infty$. Since the surgery map is surjective, there exists a sequence $B_{\mu_n} \in \mathcal{B}^{cm}(\theta)$ such that $S(B_{\mu_n}) = P_{c_n}$. By properness (Proposition 10.6), some subsequence which we still denote by $B_{\mu_n}$ converges to some $B \in \mathcal{B}^{cm}(\theta)$, which by continuity maps to $P$. Now consider the representations $P_{c_n} = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1}$ as in (9.2). Then the boundary $\partial\Delta_{c_n}$ is just the image $\varphi_n(\mathbb{T})$. Since $\{\varphi_n\}$ is normal by Corollary 10.5, some further subsequence, still denoted by $\{\varphi_n\}$, converges to a quasiconformal homeomorphism $\psi$. The map $Q = \psi \circ \tilde{B} \circ \psi^{-1} \in \mathcal{P}^{cm}(\theta)$ is quasiconformally conjugate to $P$. Since $P$ is rigid by Theorem 5.5, $P = Q$. Now, as $n \to \infty$, $\partial\Delta_{c_n} = \varphi_n(\mathbb{T})$ converges in the Hausdorff topology to $\psi(\mathbb{T}) = \partial\Delta_Q = \partial\Delta_P$. □
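Convergence in the Hausdorff topology, as used in Theorem 13.9, means that the Hausdorff distance between the compact sets tends to zero. The Python sketch below only illustrates this notion on finite samples: the concentric circles are hypothetical stand-ins for nearby curves, not actual Siegel disk boundaries, and the function names are illustrative.

```python
import cmath
import math


def hausdorff_distance(set_a, set_b):
    """Hausdorff distance between two finite, non-empty sets of complex numbers."""
    def one_sided(xs, ys):
        # Largest distance from a point of xs to its nearest point of ys.
        return max(min(abs(x - y) for y in ys) for x in xs)
    return max(one_sided(set_a, set_b), one_sided(set_b, set_a))


def sampled_circle(radius, n_points=200):
    """Equally spaced sample of the circle |z| = radius."""
    return [radius * cmath.exp(2j * math.pi * k / n_points) for k in range(n_points)]


if __name__ == "__main__":
    unit = sampled_circle(1.0)
    for eps in (0.1, 0.01, 0.001):
        print(eps, hausdorff_distance(unit, sampled_circle(1.0 + eps)))
```

With matching sample angles the computed distance between the circles of radius $1$ and $1 + \varepsilon$ is $\varepsilon$ up to rounding, so the samples converge to the unit circle in the Hausdorff sense as $\varepsilon \to 0$.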
Remark. We can actually make this theorem stronger in the following sense: Let $P_{c_0} \in \mathcal{P}^{cm}(\theta)$ be a cubic for which one of the critical points $c_0$ or $1$ is off the boundary $\partial\Delta_{c_0}$ (this happens if $P_{c_0}$ is off the Jordan curve $\Gamma$ studied in the next section). Then the boundary $\partial\Delta_c$ moves holomorphically as a function of $c$ in a neighborhood of $c_0$. To see this, assume for example that for all $c$ sufficiently close to $c_0$ we have $c \in \partial\Delta_c$ but $1 \notin \partial\Delta_c$. Evidently the critical orbit $\{P_c^{\circ k}(c)\}_{k \geq 0}$ moves holomorphically as a function of $c$, and we can extend this motion to the closure of this critical orbit by the $\lambda$-lemma. But this closure is precisely the boundary $\partial\Delta_c$ if $c$ is close to $c_0$.
14. Siegel Disks with Two Critical Points on Their Boundary
In this section we characterize those cubics in $\mathcal{P}^{cm}(\theta)$ which have both critical points on the boundary of their Siegel disk. In Theorem 14.3 we will prove that the set of all such cubics is a Jordan curve $\Gamma$ in $\mathcal{P}^{cm}(\theta)$. The proof of this theorem will use the fact that the quasiconformal conjugacy classes in $\mathcal{B}^{cm}(\theta)$ are path-connected (Theorem 12.4). We then show that when there are no queer components, $\Gamma$ is in fact the common boundary of $\Omega_{ext}$ and $\Omega_{int}$ (Theorem 14.4).
Consider the set $\Gamma$ which consists of all cubics $P \in \mathcal{P}^{cm}(\theta)$ such that both critical points of $P$ belong to the boundary of the Siegel disk $\Delta_P$. Fig. 18 shows this set in the parameter space $\mathcal{P}^{cm}(\theta)$. Since the surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is surjective by Corollary 13.6, every $P \in \Gamma$ is of the form $S(B_\mu)$ with $B_\mu$ having two double critical points on the circle. Corollary 9.3 shows that $\mu$ must belong to the unit circle $\mathbb{T} \subset \mathbb{C}^* \simeq \mathcal{B}^{cm}(\theta)$. Therefore, we simply have $\Gamma = S(\mathbb{T})$. In particular, $\Gamma$ is a closed path in $\mathcal{P}^{cm}(\theta) \simeq \mathbb{C}^*$. Suggested by Fig. 18, we want to prove that $\Gamma$ is a Jordan curve. This would follow immediately if one could prove that $S|_{\mathbb{T}}$ is injective. However, I have not been able to show this.
In fact, I do not know how to prove that Blaschke products on the boundary of the connectedness locus $\mathcal{C}(\theta)$ are quasiconformally rigid. So we take a slightly different approach by showing that the fibers of $S|_{\mathbb{T}} : \mathbb{T} \to \Gamma$ are connected.
Lemma 14.1. Let $A, B \in \mathcal{B}^{cm}(\theta)$ and $S(A) = S(B) = P$. Suppose that $P$ is not capture. Then there exists a path $t \mapsto A_t \in \mathcal{B}^{cm}(\theta)$ of Blaschke products for $0 \leq t \leq 1$, with $A_0 = A$, $A_1 = B$, such that $S(A_t) = P$ for all $t$.
Proof. Since $P$ is not capture, by Theorem 13.1 there exists a quasiconformal homeomorphism $\Phi$ which conjugates $A$ to $B$ and which is conformal away from the Julia set $J(A)$. By Theorem 12.4 there exists a path $\{\Phi_t\}_{0 \leq t \leq 1}$ connecting the identity map to $\Phi$ and a corresponding path $\{A_t = \Phi_t \circ A \circ \Phi_t^{-1}\}_{0 \leq t \leq 1}$ of elements of $\mathcal{B}^{cm}(\theta)$ connecting $A$ to $B$. Note that by the definition of $\Phi_t$, these quasiconformal homeomorphisms are all conformal away from $J(A)$. It remains to show that $S(A_t) = P$ for all $0 \leq t \leq 1$.
Consider the Douady–Earle extension $H : \mathbb{D} \to \mathbb{D}$ used in the definition of $S(A)$ in Sect. 9. Recall that $H|_{\mathbb{T}}$ conjugates $A|_{\mathbb{T}}$ to the rigid rotation $R_\theta$. Hence, the quasiconformal homeomorphism $H_t = H \circ \Phi_t^{-1} : \mathbb{D} \to \mathbb{D}$ will conjugate $A_t|_{\mathbb{T}}$ to the rigid rotation as well. Note that $H_t$ is not in general the Douady–Earle extension of the linearizing homeomorphism $h_t : \mathbb{T} \to \mathbb{T}$ for $A_t$. Nevertheless, $S_{H_t}(A_t) = S(A_t)$ by Proposition 9.2.
Fig. 18. The Jordan curve $\Gamma$, the locus of all critically marked cubics in $\mathcal{P}^{cm}(\theta)$ which have both critical points on the boundary of their Siegel disk. Topologically it can be described as the common boundary of the complementary regions $\Omega_{ext}$ and $\Omega_{int}$. Note that $\Gamma$ is invariant under $c \mapsto 1/c$
Consider the modified Blaschke products
\[ \tilde{A}(z) = \begin{cases} A(z) & |z| \geq 1 \\ (H^{-1} \circ R_\theta \circ H)(z) & |z| < 1 \end{cases} \]
and
\[ \tilde{A}_t(z) = \begin{cases} A_t(z) & |z| \geq 1 \\ (H_t^{-1} \circ R_\theta \circ H_t)(z) & |z| < 1. \end{cases} \]
Note that $\Phi_t \circ \tilde{A} = \tilde{A}_t \circ \Phi_t$. Define the corresponding conformal structures $\sigma = H^*\sigma_0$ and $\sigma_t = H_t^*\sigma_0$ as in Sect. 9. It is easy to see that
\[ \sigma = \Phi_t^*\sigma_t. \tag{14.1} \]
Here we use the fact that $\Phi_t$ is conformal away from $J(A)$. Consider the normalized solutions $\varphi$ and $\varphi_t$ of the Beltrami equations $\varphi^*\sigma_0 = \sigma$, $\varphi_t^*\sigma_0 = \sigma_t$. By (14.1) and uniqueness, we have $\varphi_t = \varphi \circ \Phi_t^{-1}$.
Hence, by Proposition 9.2,
\[ S(A_t) = \varphi_t \circ \tilde{A}_t \circ \varphi_t^{-1} = \varphi \circ \Phi_t^{-1} \circ \tilde{A}_t \circ \Phi_t \circ \varphi^{-1} = \varphi \circ \tilde{A} \circ \varphi^{-1} = S(A). \]
This completes the proof of the lemma. □
Corollary 14.2. The fibers of $S|_{\mathbb{T}} : \mathbb{T} \to \Gamma$ are connected.
Proof. Let $A, B \in \mathbb{T} \subset \mathcal{B}^{cm}(\theta)$ and $S(A) = S(B)$. Apply the previous lemma to $A$ and $B$. Note that $A_t \in \mathbb{T}$ for all $0 \leq t \leq 1$, since $A_t$ is quasiconformally conjugate to $A$, hence has two double critical points on the unit circle. □
Theorem 14.3. $\Gamma$ is a Jordan curve.
Proof. Consider $S|_{\mathbb{T}} : \mathbb{T} \to \Gamma$ whose fibers are closed and connected by Corollary 14.2. By general topology, $\Gamma$ is homeomorphic to $\mathbb{T}/\!\sim$, where $A \sim B$ means $S(A) = S(B)$. Since each equivalence class of $\sim$ is a closed connected proper subset of $\mathbb{T}$, it follows that $\mathbb{T}/\!\sim$ is homeomorphic to the circle. □
Finally, we find a topological characterization of $\Gamma$ in $\mathcal{P}^{cm}(\theta)$ under the assumption that there are no queer components in the interior of $\mathcal{M}(\theta)$.
Theorem 14.4 (Topological Characterization of $\Gamma$). $\Gamma$ is a subset of the boundary $\partial\mathcal{M}(\theta)$ and it contains $\partial\Omega_{ext} \cap \partial\Omega_{int}$. If there are no queer components in the interior of $\mathcal{M}(\theta)$, then $\Gamma = \partial\Omega_{ext} \cap \partial\Omega_{int}$.
Proof. First let us show that $\partial\Omega_{ext} \cap \partial\Omega_{int} \subset \Gamma$. Let $P_c \in \partial\Omega_{ext} \cap \partial\Omega_{int}$ and assume that $P_c \notin \Gamma$. Choose $B_\mu \in \mathcal{B}^{cm}(\theta)$ such that $S(B_\mu) = P_c$. We can assume without loss of generality that $|\mu| > 1$. Choose a sequence $P_{c_n} \in \Omega_{int}$ converging to $P_c$ and a sequence $B_{\mu_n} \in \Lambda_{int}$ such that $S(B_{\mu_n}) = P_{c_n}$. By passing to a subsequence, we may assume that $B_{\mu_n} \to B_{\mu_0}$ as $n \to \infty$, where $|\mu_0| \leq 1$. By continuity, $S(B_{\mu_0}) = P_c$ and by our assumption $P_c \notin \Gamma$, so we must have $|\mu_0| < 1$. Since $P_c$ is not capture by Corollary 5.3, Lemma 14.1 shows that there is a path $t \mapsto A_t$ of quasiconformally conjugate Blaschke products in $\mathcal{B}^{cm}(\theta)$ connecting $B_\mu$ to $B_{\mu_0}$, all of which are mapped to $P_c$. Since this path must intersect $\mathbb{T}$ somewhere, we conclude that $P_c \in \Gamma$ which is a contradiction.
Now we prove that $\Gamma \subset \partial\mathcal{M}(\theta)$. Fix some $P \in \Gamma$.
Since $P$ has both critical points on $\partial\Delta_P$, it cannot belong to any hyperbolic-like or capture component. Also, $P$ cannot be in a queer component $U$ of the interior of $\mathcal{M}(\theta)$, since otherwise every $Q \in U$ would have to be quasiconformally conjugate to $P$ by Theorem 5.5, which would imply that $Q$ has two critical points on $\partial\Delta_Q$, which would show $U \subset \Gamma$. But this is evidently impossible because $U$ is open and $\Gamma$ is a Jordan curve. Therefore, $P$ has to lie in $\partial\mathcal{M}(\theta) = \partial\Omega_{ext} \cup \partial\Omega_{int}$.
Now assume that there are no queer components in the interior of $\mathcal{M}(\theta)$. To show that $\Gamma = \partial\Omega_{ext} \cap \partial\Omega_{int}$, let $P_{c_0} \in \Gamma$ and assume by way of contradiction that $c_0 \in \partial\Omega_{ext} \smallsetminus \partial\Omega_{int}$. Since $c_0$ has positive distance from $\Omega_{int}$, for all $c$ in a neighborhood $D$ of $c_0$ the sequence $\{P_c^{\circ n}(1)\}$ has to be normal. Assuming that $D$ is a small disk, the Jordan curve $\Gamma$ cuts $D$ into two topological disks $D_1$ and $D_2$ such that for every $c \in D_1$, $1 \in \partial\Delta_c$ and $c \notin \partial\Delta_c$, and for every $c \in D_2$, $c \in \partial\Delta_c$ and $1 \notin \partial\Delta_c$ (see Fig. 19).
Fig. 19. [Schematic: the curve $\Gamma$ cuts the disk $D$ around $c_0$ into $D_1$, where $1$ belongs to the boundary of $\Delta_c$, and $D_2$, where $c$ belongs to the boundary of $\Delta_c$.]
Clearly $D_2 \cap \partial\Omega_{ext} = D_2 \cap \partial\Omega_{int} = \emptyset$. So $D_2$ has to be a subset of a component $U$ of the interior of $\mathcal{M}(\theta)$. Since there are no queer components by the assumption, $U$ is either hyperbolic-like or capture.
For every $c \in D_1$, we have $1 \in \partial\Delta_c$ and the restriction $P_c|_{\partial\Delta_c}$ is conjugate to the rigid rotation by angle $\theta$. Therefore, $P_c^{\circ q_n}(1) \to 1$ for all $c \in D_1$, where the $q_n$ are the denominators of the rational approximations of $\theta$. Since $\{P_c^{\circ n}(1)\}$ is normal in $D$, for a subsequence $\{q_{n(j)}\}$ we must have $P_c^{\circ q_{n(j)}}(1) \to 1$ throughout $D$. In particular, if $c \in D_2$, the critical point $1$ of $P_c$ must be recurrent. This is impossible if $U$ is hyperbolic-like or capture, since over $D_2$, $c \in \partial\Delta_c$ and hence $1$ either gets attracted to the attracting cycle or eventually maps to the Siegel disk $\Delta_c$. □
Acknowledgements. I am grateful to J. Milnor for many inspiring discussions and his support and interest in this work. He also showed me with great patience how to write programs to create the pictures of the Julia sets in this paper. I would like to thank M. Lyubich and D. Schleicher for very useful conversations during the Spring and Fall semesters of 1997 at Stony Brook. Further thanks are due to X. Buff for his suggestion on using deformation in the proof of Theorem 5.3, and A. Epstein and M. Yampolsky for various discussions.
References
[A] Ahlfors, L.: Lectures on Quasiconformal Mappings. Amsterdam: Van Nostrand, 1966
[AB] Ahlfors, L. and Bers, L.: Riemann's mapping theorem for variable metrics. Annals of Math. 72, 385–404 (1960)
[B] Bers, L.: On moduli of Kleinian groups. Russ. Math. Surv. 29, 88–102 (1974)
[BR] Bers, L. and Royden, H.L.: Holomorphic families of injections. Acta Math. 157, 259–286 (1986)
[BOS] Burbanks, A., Osbaldestin, A., Stirnemann, A.: Rigorous bounds on the Hausdorff dimension of Siegel disk boundaries. Commun. Math. Phys. 199, 417–439 (1998)
[CG] Carleson, L. and Gamelin, T.: Complex Dynamics. Berlin–Heidelberg–New York: Springer-Verlag, 1993
[D1] Douady, A.: Systemes dynamiques holomorphes. Seminar Bourbaki, Asterisque 105–106, 39–63 (1983)
[D2] Douady, A.: Disques de Siegel et anneaux de Herman. Seminar Bourbaki, Asterisque 152–153, 151–172 (1987)
[DE] Douady, A. and Earle, C.: Conformally natural extension of homeomorphisms of the circle. Acta Math. 157, 23–48 (1986)
[DH1] Douady, A. and Hubbard, J.: Iteration des polynomes quadratiques complexes. C.R. Acad. Sc. Paris 294, 123–126 (1982)
[DH2] Douady, A. and Hubbard, J.: On the dynamics of polynomial-like mappings. Ann. Sci. Ec. Norm. Sup. 18, 287–343 (1985)
[EY] Epstein, A. and Yampolsky, M.: Geography of the cubic connectedness locus I: Intertwining surgery. Ann. Sci. Ec. Norm. Sup. 32, 151–185 (1999)
[G] Ghys, E.: Transformations holomorphes au voisinage d'une courbe de Jordan. C.R. Acad. Sc. Paris 298, 385–388 (1984)
[GJ] Graczyk, J. and Jones, P.: Geometry of Siegel disks. Manuscript, 1997
[GM] Goldberg, L. and Milnor, J.: Fixed points of polynomial maps II. Ann. Sci. Ec. Norm. Sup. 26, 51–98 (1993)
[Ha] Haissinsky, P.: Chirurgie parabolique. C.R. Acad. Sc. Paris 327, 195–198 (1998)
[H1] Herman, M.: Are there critical points on the boundary of singular domains? Commun. Math. Phys. 99, 593–612 (1985)
[H2] Herman, M.: Conjugaison quasisymetrique des homeomorphismes analytique des cercle a des rotations. Manuscript
[H3] Herman, M.: Conjugaison quasisymetrique des diffeomorphismes des cercle a des rotations et applications aux disques singuliers de Siegel. Manuscript
[KH] Katok, A. and Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press, 1995
[K] Kiwi, J.: Non-accessible critical points of Cremer polynomials. SUNY at Stony Brook IMS preprint, 1996/2, to appear in Ergod. Th. Dynam. Sys.
[Le] Lehto, O.: Univalent Functions and Teichmüller Spaces. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[Ly] Lyubich, M.: Dynamics of rational transforms: The topological picture. Russ. Math. Surv. 41, 35–95 (1986)
[MN] Manton, N. and Nauenberg, M.: Universal scaling behavior for iterated maps in the complex plane. Commun. Math. Phys. 89, 555–570 (1983)
[Mc1] McMullen, C.: Automorphisms of rational maps. In: Holomorphic Functions and Moduli I, ed. Drasin, Earle, Gehring, Kra, Marden, MSRI Pub. 10, Berlin–Heidelberg–New York: Springer, 1988
[Mc2] McMullen, C.: Complex Dynamics and Renormalization. Annals of Math Studies 135, 1994
[Mc3] McMullen, C.: Self-similarity of Siegel disks and the Hausdorff dimension of Julia sets. Acta Math. 180, 247–292 (1998)
[McS] McMullen, C. and Sullivan, D.: Quasiconformal homeomorphisms and dynamics III: The Teichmüller space of a holomorphic dynamical system. Adv. Math. 135, 351–395 (1998)
[M1] Milnor, J.: Dynamics in One Complex Variable: Introductory Lectures. SUNY at Stony Brook IMS preprint, 1990/5
[M2] Milnor, J.: Hyperbolic components in spaces of polynomial maps, with an appendix by A. Poirier. SUNY at Stony Brook IMS preprints, 1992/3
[M3] Milnor, J.: Local connectivity of Julia sets: Expository Lectures. SUNY at Stony Brook IMS preprint, 1992/11
[Pr1] Perez-Marco, R.: Fixed points and circle maps. Acta Math. 179, 243–294 (1997)
[Pr2] Perez-Marco, R.: Siegel disks with quasi-analytic boundary. Manuscript, 1997
[Pe] Petersen, C.: Local connectivity of some Julia sets containing a circle with an irrational rotation. Acta Math. 177, 163–224 (1996)
[R] Rogers, J.: Singularities in the boundaries of local Siegel disks. Ergod. Th. Dynam. Sys. 12, 803–821 (1992)
[Su1] Sullivan, D.: Quasiconformal homeomorphisms and dynamics I: Solution of the Fatou–Julia problem on wandering domains. Annals of Math. 122, 401–418 (1985)
[Su2] Sullivan, D.: Quasiconformal homeomorphisms and dynamics III: Topological conjugacy classes of analytic endomorphisms. Manuscript
[Sw] Świątek, G.: On critical circle homeomorphisms. Bol. Soc. Bras. Mat. 29, 329–351 (1998)
[W] Widom, M.: Renormalization group analysis of quasi-periodicity in analytic maps. Commun. Math. Phys. 92, 121–136 (1983)
[YZ] Yampolsky, M. and Zakeri, S.: Mating Siegel quadratic polynomials. SUNY at Stony Brook IMS preprint, 1998/8
[Y] Yoccoz, J.C.: Petits Diviseurs en Dimension 1. Asterisque 231 (1995)
[Z1] Zakeri, S.: On critical points of proper holomorphic maps on the unit disk. Bull. London Math. Soc. 30, 62–66 (1998)
[Z2] Zakeri, S.: Biaccessibility in quadratic Julia sets. SUNY at Stony Brook IMS preprint, 1998/1, to appear in Ergod. Th. Dynam. Sys.
[Z3] Zakeri, S.: On dynamics of cubic Siegel polynomials. SUNY at Stony Brook IMS preprint, 1998/4
[Z4] Zakeri, S.: Non-quasiconformal surgery and Siegel Julia sets. In preparation
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 235 – 245 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Semilinear PDEs on Self-Similar Fractals
K. J. Falconer
Mathematical Institute, University of St Andrews, North Haugh, St Andrews, Fife, KY16 9SS, Scotland. E-mail: [email protected]
Received: 11 December 1998 / Accepted: 22 March 1999
Abstract: A Laplacian may be defined on self-similar fractal domains in terms of a suitable self-similar Dirichlet form, enabling discussion of elliptic PDEs on such domains. In this context it is shown that semilinear equations such as $\Delta u + u^p = 0$, with zero Dirichlet boundary conditions, have non-trivial non-negative solutions if $0 < \nu \le 2$ and $p > 1$, or if $\nu > 2$ and $1 < p < (\nu + 2)/(\nu - 2)$, where $\nu$ is the “intrinsic dimension” or “spectral dimension” of the system. Thus the intrinsic dimension takes the rôle of the Euclidean dimension in the classical case in determining critical exponents of semilinear problems.

1. Introduction

Recently a great deal of effort has gone into defining a Laplacian operator for functions on fractal domains. This has led to a study of linear PDEs, such as the heat equation and linear eigenvalue problem, on fractal domains, see for example [4,8,10,12,14]. There are considerable difficulties in defining the Laplacian on a general fractal set, and several definitions have been proposed that are applicable to certain classes of fractal. For example, [10,12] define a Laplacian as the limit of discrete differences on graphs approximating the fractal, a method suited to post-critically finite self-similar domains such as the Sierpiński triangle. Recently, Mosco [17,18] introduced a framework for the Laplacian, by taking as a starting point a Dirichlet form that reflects the self-similarities of the underlying fractal. This framework, which depends on the intrinsic structure of the fractal and its “intrinsic dimension” $\nu$, leads to a very general theory of “variational fractals” that includes many of the more specific examples that have been analysed by other means, such as in [3,15,10,12]. So far, attention has concentrated on linear PDEs.
However, many problems on fractal domains lead to nonlinear models, for example reaction-diffusion equations, problems on elastic fractal media or fluid flow through fractal regions, so it is appropriate to investigate nonlinear PDEs. It turns out that the analytic estimates obtained for variational fractals
are just what is needed for critical point analysis in the nonlinear case. We demonstrate that semilinear elliptic PDEs of the form
\[ \Delta_0 u + f(x, u) = 0, \tag{1.1} \]
where $\Delta_0$ is the Laplacian corresponding to zero Dirichlet boundary conditions, have non-trivial non-negative solutions if $f$ satisfies certain conditions, in particular that $f(x, t) \sim |t|^p$ for large $t$, provided that either $0 < \nu \le 2$ and $p > 1$, or $\nu > 2$ and $1 < p < (\nu + 2)/(\nu - 2)$, where $\nu$ is the intrinsic dimension of the system. This is reminiscent of the critical exponent condition for PDEs on classical domains in $\mathbb{R}^n$ with smooth boundary, where such equations have non-trivial solutions for all $p > 1$ if $n = 1, 2$ and for $1 < p < (n + 2)/(n - 2)$ if $n = 3, 4, \dots$, see [13]. Thus the intrinsic dimension plays an analogous rôle to the Euclidean dimension in the classical case. Moreover, the intrinsic dimension has another natural interpretation as the spectral dimension of $\Delta_0$; thus the number of eigenvalues of $\Delta_0$ at most $\lambda$ is asymptotic to $\lambda^{\nu/2}$. As a particular application, a solution of Eq. (1.1) might provide a steady state of a nonlinear reaction-diffusion equation $u_t = \Delta_0 u + f(x, u)$ on a fractal catalyst, where $u$ is temperature and the nonlinearity results from an exothermic chemical reaction.

2. Dirichlet Forms and Variational Fractals

We review the definition of the Laplacian on self-similar sets via a suitable Dirichlet form, following the “variational fractal” approach of Mosco [17,18]. For $i = 1, \dots, m$, let $\psi_i$ be contracting similarities on $\mathbb{R}^n$ equipped with the Euclidean metric $d_0$, so that $d_0(\psi_i(x), \psi_i(y)) = r_i d_0(x, y)$ $(x, y \in \mathbb{R}^n)$, where $r_i < 1$. Thus $\{\psi_1, \dots, \psi_m\}$ is an iterated function system which has a self-similar attractor $K$, that is, a unique non-empty compact set $K \subset \mathbb{R}^n$ satisfying $K = \bigcup_{i=1}^m \psi_i(K)$, see [7]. We assume that the system satisfies the open set separation condition, that is, there is a non-empty open set $U$ such that $\bigcup_{i=1}^m \psi_i(U) \subset U$ with this union disjoint. Then the Hausdorff and box-counting dimensions of $K$ are given by the non-negative number $d_f$ satisfying
\[ \sum_{i=1}^m r_i^{d_f} = 1. \]
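The dimension equation above has no closed form when the ratios $r_i$ differ, but its left-hand side is strictly decreasing in $d_f$, so the root can be located numerically. The following is a minimal sketch only, assuming plain bisection; the function name `similarity_dimension` and the tolerance are illustrative choices and not part of the paper.

```python
import math


def similarity_dimension(ratios, tol=1e-12):
    """Solve sum(r**d for r in ratios) == 1 for d >= 0 by bisection.

    `ratios` are the contraction ratios r_i of the similarities
    (hypothetical helper; each ratio must satisfy 0 < r_i < 1).
    """
    if not ratios or any(not (0.0 < r < 1.0) for r in ratios):
        raise ValueError("need at least one ratio, each with 0 < r < 1")

    def g(d):
        # g is strictly decreasing in d, with g(0) = m - 1 >= 0.
        return sum(r ** d for r in ratios) - 1.0

    lo, hi = 0.0, 1.0
    while g(hi) > 0.0:          # enlarge the bracket until g(hi) <= 0
        hi *= 2.0
    while hi - lo > tol:        # standard bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Three similarities of ratio 1/2 (the Sierpinski triangle):
    # the root should agree with log 3 / log 2.
    print(similarity_dimension([0.5, 0.5, 0.5]), math.log(3) / math.log(2))
```

For equal ratios $r_i = r$ the equation reduces to $d_f = \log m / \log(1/r)$, which gives a quick check of the numerical root.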
Moreover, $K$ has positive finite $d_f$-dimensional Hausdorff measure. Let $\mu$ denote the restriction to $K$ of normalised $d_f$-dimensional Hausdorff measure, so that $\mu(K) = 1$ and $\mu$ satisfies the scaling property
\[ \mu(A) = \sum_{i=1}^m r_i^{d_f} \mu(\psi_i^{-1}(A)) \tag{2.1} \]
for $A \subset \mathbb{R}^n$. We write $A_{i_1,\dots,i_k} = \psi_{i_1} \circ \cdots \circ \psi_{i_k}(A)$ for a set $A \subset \mathbb{R}^n$ in the usual way. Then $\mu(K_{i_1,\dots,i_k} \cap K_{j_1,\dots,j_k}) = 0$ whenever $(i_1, \dots, i_k) \neq (j_1, \dots, j_k)$. We write $\Gamma \equiv \bigcup_{i \neq j} \psi_i^{-1}(K_i \cap K_j)$ for the intrinsic boundary of $K$, so $\Gamma_i \cap \Gamma_j = K_i \cap K_j$. We also assume an iterated analogue of this, namely $K_{i_1,\dots,i_k} \cap K_{j_1,\dots,j_k} = \Gamma_{i_1,\dots,i_k} \cap \Gamma_{j_1,\dots,j_k}$ for all $(i_1, \dots, i_k) \neq (j_1, \dots, j_k)$.

Let $L^2(K, \mu)$ be the Hilbert space of $\mu$-square integrable functions on $K$, with the usual inner product and norm $\|\ \|_2$. Let $W : D_W \times D_W \to \mathbb{R}_{\ge 0}$ be a Dirichlet form in $L^2(K, \mu)$ with domain $D_W = \{u \in L^2(K, \mu) : W[u] < \infty\}$, where we write $W[u] = W(u, u)$. Thus $W$ is a densely defined, closed, non-negative, symmetric bilinear form, which satisfies the Markovian property, that is, if $u \in D_W$ and $T : \mathbb{R} \to \mathbb{R}$ is such that
\[ T(0) = 0 \text{ and } |T(t_1) - T(t_2)| \le |t_1 - t_2| \text{ for all } t_1, t_2 \in \mathbb{R}, \tag{2.2} \]
then
\[ T \circ u \in D_W \text{ and } W[T \circ u] \le W[u]. \tag{2.3} \]
We assume that $W$ is irreducible, that is, $W[u] = 0$ if and only if $u$ is constant, and strongly local, so that $W(u, v) = 0$ if $u$ is constant on $\operatorname{spt} v$. We define the energy norm on $L^2(K, \mu)$ by
\[ \|u\| = (W[u] + \|u\|_2^2)^{1/2}; \tag{2.4} \]
this is analogous to the Sobolev norm $\|\ \|_{2,1}$ for functions on classical domains. We assume that $W$ is regular, that is, $D_W \cap C(K)$ is dense both in $C(K)$ with the uniform norm, and in $D_W$ with the norm $\|\ \|$. For a Dirichlet form $W$ satisfying these conditions, there is a representation
\[ W[u] = \int dL[u], \tag{2.5} \]
where the measure $L(u, v)$ is defined by
\[ \int g \, dL(u, v) = \tfrac{1}{2} [W(gu, v) + W(u, gv) - W(g, uv)] \]
for $g \in C_0(K)$. We think of $L[u] \equiv L(u, u)$ as the analogue of “$|\operatorname{grad} u|^2$”, though in general $L[u]$ is a measure rather than a function, and indeed it is not clear if “$\operatorname{grad} u$” itself has any analogue.

We next impose a requirement that $W$, and thus $L$, is compatible with the self-similar structure of $K$. We assume that for some constant $0 < \sigma < 1$,
\[ W[u] = \sum_{i=1}^m \mu(K_i)^{\sigma} W[u \circ \psi_i] = \sum_{i=1}^m r_i^{\sigma d_f} W[u \circ \psi_i] \quad (u \in D_W). \]
This leads to corresponding identities for $L$, that is,
\[ \int g \, dL[u] = \sum_{i=1}^m r_i^{\sigma d_f} \int (g \circ \psi_i) \, dL[u \circ \psi_i] \]
for $g \in C_0(K)$. These formulae may be iterated to obtain identities involving the composed mappings $\psi_{i_1} \circ \cdots \circ \psi_{i_k}$. Following Mosco [17,18] we term the triple $(K, \mu, W)$ a variational fractal. A consequence of self-similarity is that there is a number $\nu > 0$, called the intrinsic dimension of $K$, that gives the scaling law exponent of $\mu(B(x, r))$ for $x \in K$ and $r > 0$, where the ball $B(x, r)$ is determined by a natural intrinsic metric on $K$. (This intrinsic metric is closely related to the effective resistance metric used in the constructive approach of Kigami [11].) This intrinsic approach, described in detail in [17,18], sheds a great deal of light on this theory. It turns out that
\[ \nu = \frac{2}{1 - \sigma} \quad \text{or} \quad \sigma = \frac{\nu - 2}{\nu}. \]
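The relation between $\sigma$ and $\nu$, together with the exponent range quoted in the Introduction, is simple enough to tabulate. The sketch below is illustrative only: the helper names are hypothetical, and the value $\nu = 2 \log 3 / \log 5$ used in the example is the intrinsic dimension of the Sierpiński triangle as stated in Sect. 5.

```python
import math


def sigma_from_nu(nu):
    """sigma = (nu - 2) / nu, the inverse of nu = 2 / (1 - sigma)."""
    if nu <= 0:
        raise ValueError("intrinsic dimension must be positive")
    return (nu - 2.0) / nu


def nu_from_sigma(sigma):
    """nu = 2 / (1 - sigma); requires sigma < 1."""
    if sigma >= 1:
        raise ValueError("sigma must be less than 1")
    return 2.0 / (1.0 - sigma)


def critical_exponent(nu):
    """Upper bound on admissible p: infinity if nu <= 2, else (nu+2)/(nu-2)."""
    if nu <= 0:
        raise ValueError("intrinsic dimension must be positive")
    return math.inf if nu <= 2 else (nu + 2.0) / (nu - 2.0)


if __name__ == "__main__":
    nu_sierpinski = 2 * math.log(3) / math.log(5)   # value quoted in Sect. 5
    for nu in (nu_sierpinski, 2.0, 3.0, 4.0):
        print(nu, sigma_from_nu(nu), critical_exponent(nu))
```

Note that $(\nu + 2)/(\nu - 2)$ decreases to 1 as $\nu \to \infty$ and blows up as $\nu \downarrow 2$, which matches the absence of any upper restriction on $p$ when $\nu \le 2$.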
We may use $W$ to define Laplace operators associated with Neumann boundary conditions, and with Dirichlet boundary conditions, respectively, by analogy with the Gauss-Green equation. The (unbounded) Neumann operator $\Delta$ has domain $D_\Delta \subset D_W$ and is such that
\[ W(u, v) = -\int (\Delta u) v \, d\mu \quad \text{for all } v \in D_W \text{ and } u \in D_\Delta. \tag{2.6} \]
To permit Dirichlet boundary conditions we let $W_0$ be the closure in $\|\ \|_2$ of the restriction of $W$ to $D_W \cap C_0(K \setminus \Gamma)$, where $C_0(K \setminus \Gamma)$ is the space of continuous functions with compact support in $K \setminus \Gamma$, and we write $D_{W_0}$ for the domain of $W_0$. Then $W_0$ is closed, regular and strongly local with $D_W \cap C_0(K)$ dense in $C_0(K \setminus \Gamma)$ in the uniform norm, and dense in $D_{W_0}$ in the norm $\|\ \|$. The self-adjoint Dirichlet operator $\Delta_0$ with domain $D_{\Delta_0} \subset D_{W_0}$ is defined by
\[ W_0(u, v) = -\int (\Delta_0 u) v \, d\mu \quad \text{for all } v \in D_{W_0} \text{ and } u \in D_{\Delta_0}. \tag{2.7} \]
It turns out that the intrinsic dimension $\nu$ equals the spectral dimension of $K$, which gives the asymptotic distribution of the eigenvalues of the Laplacian. Thus
\[ \#\{\text{eigenvalues of } \Delta \le \lambda\} \asymp \#\{\text{eigenvalues of } \Delta_0 \le \lambda\} \asymp \lambda^{\nu/2}, \]
see [14,16,20].

3. Analytic Properties

Some further analytic structure on the variational fractal $(K, \mu, W)$ is needed to enable PDEs to be studied. As in Mosco [17,18] we assume that $W$ satisfies a rather stronger irreducibility condition, namely a global Poincaré inequality of the form
\[ \int_K |u - u_K|^2 \, d\mu \le c \int_{K \setminus \Gamma} dL[u] \quad (u \in D_W). \tag{3.1} \]
We also need a connectivity condition to ensure continuation of analytic properties between the different “similarity regions” $K_{i_1,\dots,i_k} = \psi_{i_1} \circ \cdots \circ \psi_{i_k}(K)$ of $K$. The intersection of adjoining regions may be negligible in the measure theoretic sense, so we require $K$ to be locally connected in a capacity sense. Essentially, we assume that for each small ball $B$, the regions $K_{i_1,\dots,i_k}$ of comparable size to $B$ and which intersect $B$ can be ordered so that consecutive regions intersect in a set of positive capacity with respect to $W$, with these capacities uniformly bounded away from 0 after normalising by the size of $B$ (see [17,18] for the full technical condition). Using this condition and self-similarity, (3.1) leads to scaled Poincaré inequalities
\[ \int_{B(x,r)} |u - u_{B(x,r)}|^2 \, d\mu \le c \left( \frac{r}{\operatorname{diam} K} \right)^2 \int_{B(x,qr)} dL[u] \tag{3.2} \]
for all $u \in D_W$, $x \in K$ and $r > 0$, for constants $q \ge 1$ and $c > 0$, see [17].

The scaled Poincaré inequalities yield the norm estimates needed for PDE analysis, see [5,17]. There are two cases depending on whether $\nu < 2$ or $\nu > 2$, where $\nu$ is the intrinsic dimension. In what follows, the constants $c_1, c_2, \dots$ are, in particular, independent of the functions $u \in D_W$.

(a) If $0 < \nu < 2$ there is a Morrey inequality:
\[ |u(x) - u(y)| \le c_1 W[u]^{1/2} d_0(x, y)^{d_f(2-\nu)/2\nu} \quad (u \in D_W, \ x, y \in K). \tag{3.3} \]
In particular this gives a uniform bound
\[ |u(x)| \le c_2 (W[u] + \|u\|_2^2)^{1/2} = c_2 \|u\| \quad (u \in D_W, \ x \in K); \tag{3.4} \]
thus the embedding
\[ (D_W, \|\ \|) \hookrightarrow (C(K), \|\ \|_\infty) \tag{3.5} \]
is continuous, and moreover, by the Arzelà–Ascoli theorem, it is compact.

(b) If $\nu > 2$, (3.2) leads to a Sobolev-type inequality:
\[ \|u\|_{2\nu/(\nu-2)} \le c_3 \|u\| = c_3 (W[u] + \|u\|_2^2)^{1/2} \quad (u \in D_W), \tag{3.6} \]
so the embedding
\[ (D_W, \|\ \|) \hookrightarrow (L^{2\nu/(\nu-2)}(K, \mu), \|\ \|_{2\nu/(\nu-2)}) \tag{3.7} \]
is continuous. Assuming also the analogue of Rellich’s theorem, that the embedding
\[ (D_W, \|\ \|) \hookrightarrow (L^2(K, \mu), \|\ \|_2) \tag{3.8} \]
is compact, then together with (3.6) this easily implies that the embedding
\[ (D_W, \|\ \|) \hookrightarrow (L^q(K, \mu), \|\ \|_q) \tag{3.9} \]
is compact if $2 \le q < 2\nu/(\nu - 2)$.

Note that in the critical case $\nu = 2$ we get $\|u\|_q \le c_3 \|u\|$ for all $1 \le q < \infty$ in place of (3.6), along with the consequential embeddings.

For all $\nu > 0$ there is a global Poincaré inequality for the Dirichlet operator:
\[ \|u\|_2^2 \le c_4 W_0[u] \quad (u \in D_{W_0}). \tag{3.10} \]
If $0 < \nu < 2$ this follows by taking $y$ on the boundary and integrating (3.3). If $\nu > 2$, (3.10) may be deduced from (3.8), using that $\|u\| \le \limsup_{k \to \infty} \|u_k\|$ if $u_k \to u$ weakly, and that $W$ is irreducible. Taking (3.10) together with (2.4) gives that
\[ W_0[u] \le \|u\|^2 \le c_5 W_0[u] \quad (u \in D_{W_0}), \tag{3.11} \]
so $W_0[\ ]^{1/2}$ and $\|\ \|$ are equivalent norms on $D_{W_0}$.
4. Semilinear PDEs

Let $f : K \times \mathbb{R} \to \mathbb{R}$ be continuous. With the notation of (2.7), $u \in D_{\Delta_0}$ is a solution of
\[ \Delta_0 u + f(x, u) = 0 \tag{4.1} \]
with Dirichlet boundary conditions if
\[ -W_0(u, v) + \int f(x, u) v \, d\mu = \int [(\Delta_0 u) v + f(x, u) v] \, d\mu = 0 \tag{4.2} \]
for all $v \in D_{W_0}$. Variational methods enable us to study solutions of (4.1) for suitable $f$. Our conclusions depend crucially on the intrinsic dimension $\nu$ of the variational fractal $(K, \mu, W)$. We set
\[ F(x, t) = \int_0^t f(x, s) \, ds. \]
We will require that the continuous $f : K \times \mathbb{R} \to \mathbb{R}$ satisfies conditions (i)-(iv) below; such conditions are familiar in the theory of PDEs on classical domains, see for example [2]:

(i) \[ |f(x, t)| \le a + b|t|^p \quad (x \in K, \ t \in \mathbb{R}) \tag{4.3} \]
for some $a$, $b$ and $p$, where
\[ 1 < p < \infty \text{ if } 0 < \nu \le 2 \quad \text{and} \quad 1 < p < (\nu + 2)/(\nu - 2) \text{ if } \nu > 2, \tag{4.4} \]
(ii) \[ f(x, t) = o(t) \text{ near } t = 0 \text{ uniformly in } x, \tag{4.5} \]
(iii) \[ f(x, t) t^{-1} \to \infty \text{ as } t \to \infty \text{ uniformly in } x, \tag{4.6} \]
(iv) there are numbers $0 \le \kappa < \tfrac{1}{2}$ and $a_0 > 0$ such that
\[ F(x, t) \le \kappa f(x, t) t \quad (x \in K, \ |t| \ge a_0). \tag{4.7} \]
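For the model nonlinearity $f(x, t) = t|t|^{p-1}$ one has $F(x, t) = |t|^{p+1}/(p+1)$ and $f(x, t) t = |t|^{p+1}$, so (4.7) holds with equality for $\kappa = 1/(p+1)$, which is less than $\tfrac{1}{2}$ exactly when $p > 1$. The following is a rough numerical sanity check of that computation, not a proof; the function names, the sample grid and the tolerance are all illustrative assumptions.

```python
def f(t, p):
    """Model nonlinearity f(t) = t |t|^(p-1) (no x-dependence)."""
    return t * abs(t) ** (p - 1)


def F(t, p):
    """Primitive of f vanishing at 0: F(t) = |t|^(p+1) / (p+1)."""
    return abs(t) ** (p + 1) / (p + 1)


def check_condition_iv(p, ts, kappa=None, rel_tol=1e-12):
    """Check F(t) <= kappa * f(t) * t on the sample points `ts`.

    By default kappa = 1/(p+1), for which equality holds up to rounding;
    rel_tol absorbs floating-point error in that borderline case.
    """
    if kappa is None:
        kappa = 1.0 / (p + 1)
    return all(F(t, p) <= kappa * f(t, p) * t * (1 + rel_tol) for t in ts)


if __name__ == "__main__":
    samples = [k / 4 for k in range(-40, 41)]      # grid on [-10, 10]
    for p in (1.5, 2.0, 3.0):
        print(p, 1.0 / (p + 1) < 0.5, check_condition_iv(p, samples))
```

Choosing any $\kappa$ smaller than $1/(p+1)$ makes the check fail at every non-zero sample point, which reflects that $1/(p+1)$ is the least admissible constant for this particular $f$.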
Note that, although we do not require $f(x, t) > 0$, condition (iii) ensures that $f(x, t)$ is everywhere positive for large $t$. In particular, these conditions are all satisfied by $f(x, t) = t|t|^{p-1}$ provided that $p$ satisfies (4.4); this may be regarded as the canonical example. We define $\psi : D_W \to \mathbb{R}$ by
\[ \psi(v) = \int_K F(x, v(x)) \, d\mu. \tag{4.8} \]
If $f$ satisfies (i) then $|F(x, t)| \le b_1 |t|^{p+1}$ for large $t$, so $\psi$ is well-defined by (3.4) or (3.6), noting that $p + 1 < 2\nu/(\nu - 2)$. Moreover, by an argument parallel to that of [1, Theorem 2.9], $\psi$ is continuous, and has derivative $\psi' : D_W \to L(D_W, \mathbb{R})$ given by
\[ \psi'(v) w = \int_K f(x, v(x)) w(x) \, d\mu. \tag{4.9} \]
Indeed, (i) implies that the Nemitski map $v \mapsto f(x, v(x))$ is continuous as a mapping $L^{p+1}(K, \mu) \to L^{(p+1)/p}(K, \mu)$, see [1, Theorem 2.2], so as the embedding $D_W \hookrightarrow L^{p+1}(K, \mu)$ is compact by (3.5) or (3.9), the Nemitski map is compact as a mapping $D_W \to L^{(p+1)/p}(K, \mu)$. Using Hölder’s inequality it follows that $\psi' : D_W \to L(D_W, \mathbb{R})$ is compact.

We state a form of the variational principle for (4.1) appropriate in this setting.

Proposition 4.1 (Variational principle). Let $f : K \times \mathbb{R} \to \mathbb{R}$ be continuous and satisfy (i). Then $u$ is a stationary point in $D_{W_0}$ of
\[ \tfrac{1}{2} W_0[v] - \int F(x, v) \, d\mu \tag{4.10} \]
if and only if $u \in D_{\Delta_0}$ and
\[ \Delta_0 u + f(x, u) = 0. \tag{4.11} \]

Proof. For $u, v \in D_{W_0}$ we have, using (4.9), that
\[ \tfrac{1}{2} W_0[u + v] - \int F(x, u + v) \, d\mu = \tfrac{1}{2} W_0[u] - \int F(x, u) \, d\mu + W_0(u, v) - \int f(x, u) v \, d\mu + o(\|v\|). \tag{4.12} \]
If $u \in D_{\Delta_0}$ satisfies (4.11) and thus (4.2), the second term of (4.12) vanishes, so $u$ is a stationary point in $D_{W_0}$ of (4.10). Conversely, if $u \in D_{W_0}$ is a stationary point of (4.10), then by (4.12)
\[ W_0(u, v) - \int f(x, u) v \, d\mu = o(\|v\|) \]
for all $v \in D_{W_0}$. Since the left-hand side of this expression is linear in $v$, (4.2) follows for all $v \in D_{W_0}$, so $u \in D_{\Delta_0}$ and (4.11) is satisfied. □

We recall the mountain pass lemma which will be used to demonstrate the existence of solutions.

Proposition 4.2 (Mountain pass lemma). Let $(X, \|\ \|)$ be a Banach space and let $\varphi : X \to \mathbb{R}$ be $C^1$. Suppose $\varphi(0) = 0$, that there exist numbers $r, h > 0$ such that
\[ \varphi(v) > 0 \text{ if } 0 < \|v\| \le r \tag{4.13} \]
and
\[ \varphi(v) \ge h \text{ if } \|v\| = r, \tag{4.14} \]
and that there exists $w \in X$ with $\|w\| > r$ and $\varphi(w) \le 0$. Suppose further that the Palais–Smale compactness condition holds, that is, every sequence $\{v_i\}$ in $X$ with $\varphi(v_i)$ positive and bounded above and with $\varphi'(v_i) \to 0$ in $L(D_W, \mathbb{R})$ has a convergent subsequence. Then $\varphi$ has a critical value $c$ with $h \le c < \infty$, that is, there exists $u \in X$ such that $\varphi'(u) = 0$ and $\varphi(u) = c$.

Proof. A proof of the mountain pass lemma is given, for example, in [6]. □
We apply the mountain pass lemma to semilinear PDEs on fractal domains in a similar way to the case of classical domains [2,6]; we include the details for the benefit of readers with backgrounds in fractal rather than nonlinear analysis and since there are some technical differences.

Theorem 4.3. Suppose $0 < \nu \le 2$ and $1 < p < \infty$, or $\nu > 2$ and $1 < p < (\nu + 2)/(\nu - 2)$. Let $f : K \times \mathbb{R} \to \mathbb{R}$ be continuous and satisfy (i)-(iv). Then the Dirichlet problem
\[ \Delta_0 u + f(x, u) = 0 \tag{4.15} \]
on $K$ has a non-zero solution $u \in D_{\Delta_0} \subset D_{W_0}$, with $u \ge 0$ on $K \setminus \Gamma$. The same conclusion holds if $f$ is merely defined on the restricted domain $f : K \times \mathbb{R}_{\ge 0} \to \mathbb{R}$ with (i)-(iv) satisfied for $t \ge 0$.

Proof. We apply the mountain pass lemma to the space $(D_{W_0}, \|\ \|)$, with
\[ \varphi(v) = \tfrac{1}{2} W_0[v] - \int F(x, v) \, d\mu \tag{4.16} \]
for $v \in D_{W_0}$. As in (4.12), $\varphi : D_{W_0} \to \mathbb{R}$ is well-defined, continuous, and $C^1$, with
\[ \varphi'(v) w = W_0(v, w) - \int f(x, v) w \, d\mu \tag{4.17} \]
for $v, w \in D_{W_0}$. Take $0 < \varepsilon < \tfrac{1}{2} c_5^{-1}$ with $c_5$ as in (3.11). By property (ii) there exists $\delta$ such that $|F(x, t)| \le \varepsilon |t|^2$ if $|t| < \delta$, and by (i) $|F(x, t)| \le c_6 |t|^{p+1}$ if $|t| \ge \delta$. Then splitting the domain of integration into $x$ with $|v(x)| < \delta$ and $x$ with $|v(x)| \ge \delta$ gives
\[ \int F(x, v) \, d\mu \le \varepsilon \|v\|_2^2 + c_6 \|v\|_{p+1}^{p+1} \le \varepsilon \|v\|^2 + c_7 \|v\|^{p+1} \]
by (3.4) or (3.6). Hence from (4.16) and (3.11)
\[ \varphi(v) \ge (\tfrac{1}{2} c_5^{-1} - \varepsilon) \|v\|^2 - c_7 \|v\|^{p+1}, \tag{4.18} \]
so $\varphi(v)$ is positive for sufficiently small $\|v\|$, and we may choose $r, h$ so that (4.13) and (4.14) hold. Moreover, fixing a bounded non-zero $v \in D_{W_0}$ with $v \ge 0$, it follows from (4.16) and (iii) that $\varphi(\lambda v) < 0$ if $\lambda$ is large enough. Thus $\varphi$ satisfies the basic hypotheses of the mountain pass lemma.

To check the Palais–Smale conditions, let $v_i$ be a sequence in $D_{W_0}$ such that $|\varphi(v_i)| \le M$ and $\varphi'(v_i) \to 0$ in $L(D_W, \mathbb{R})$. Then for $w \in D_{W_0}$,
\[ \|\varphi'(v_i)\| \|w\| \ge |\varphi'(v_i) w| = \left| W_0(v_i, w) - \int f(x, v_i) w \, d\mu \right|, \tag{4.19} \]
so setting $w = v_i$ we have that for sufficiently large $i$,
\[ \int f(x, v_i) v_i \, d\mu \le W_0[v_i] + \|v_i\|. \]
Then if $i$ is sufficiently large, writing $s = \sup\{F(x, t) : x \in K, |t| \le a_0\}$, and using (iv),
\[
\begin{aligned}
M \ge \varphi(v_i) &= \tfrac{1}{2} W_0[v_i] - \int F(x, v_i) \, d\mu \\
&\ge \tfrac{1}{2} W_0[v_i] - \kappa \int_{|v_i| \ge a_0} f(x, v_i) v_i \, d\mu - \int_{|v_i| < a_0} F(x, v_i) \, d\mu \\
&\ge (\tfrac{1}{2} - \kappa) W_0[v_i] - \kappa \|v_i\| - s \ge (\tfrac{1}{2} - \kappa) c_5^{-1} \|v_i\|^2 - \kappa \|v_i\| - s,
\end{aligned}
\]
by (3.11). Since $\kappa < \tfrac{1}{2}$, it follows that $\|v_i\|$ is bounded.

Recall from the remark after (4.9) that $\psi' : D_W \to L(D_W, \mathbb{R})$ defined by $\psi'(v) w = \int f(x, v) w \, d\mu$ is compact. Thus, by taking a subsequence if necessary, we may assume that $\psi'(v_i)$ is convergent in $L(D_W, \mathbb{R})$. Since $\varphi'(v_i) \to 0$, we conclude from (4.17) that $W_0(v_i, \cdot)$ is convergent in $L(D_W, \mathbb{R})$. Since $W_0[\ ]^{1/2}$ is equivalent to the Hilbert space norm $\|\ \|$ on $D_{W_0}$, the Riesz theorem gives $v \in D_{W_0}$ such that $W_0(v_i, \cdot) \to W_0(v, \cdot)$, from which it follows that $v_i \to v$.

We conclude from the mountain pass lemma that (4.15) has a non-zero solution $u \in D_{W_0}$; we must show that we can find such a solution with $u \ge 0$. Set $\bar{f}(x, t) = f(x, t)$ if $t \ge 0$ and $\bar{f}(x, t) = 0$ if $t < 0$. Then $\bar{f}$ satisfies (i)-(iv), so by the above argument $\Delta_0 u + \bar{f}(x, u) = 0$ has a non-trivial solution $u \in D_{W_0}$ with $W_0(u, v) = \int \bar{f}(x, u) v \, d\mu$ for all $v \in D_{W_0}$ by (4.2). In particular $W_0(u, v) = 0$ if $v(x) = 0$ for all $x \notin A \equiv \{x \in K : u(x) < 0\}$. Writing $\bar{u} = \min\{0, u\} \in D_{W_0}$, using (2.2) with $T(t) = \min\{0, t\}$, we have
\[ W(\bar{u}, \bar{u}) = W(\bar{u} - u, \bar{u}) + W(u, \bar{u}) = 0 + 0 \]
since $\bar{u}$ is constant on the support of $\bar{u} - u$ and $W$ is strongly local, and $\bar{u}(x) = 0$ if $x \notin A$. Thus $\bar{u} = 0$ by the regularity of $W$, so $u \ge 0$ on $K$, and $u$ satisfies (4.15).

The restricted domain version of the result follows just as in the previous paragraph, by setting $f(x, t) = 0$ for $t < 0$ to extend $f$ to $K \times \mathbb{R}$. □

Taking $f(x, t) = t|t|^{p-1}$ in Theorem 4.3 gives non-trivial non-negative solutions of $\Delta_0 u + u^p = 0$ for appropriate $p$.

5. Further Remarks

Dirichlet forms and Laplacians have been constructed on a variety of self-similar fractals. The best-known constructions are on post-critically finite (p.c.f.) fractals, that is, roughly speaking, fractals where the components $\psi_i(K)$ intersect each other in finitely many points. Suitable Dirichlet forms and Laplacians may be obtained by taking a renormalised limit of appropriate sums over vertices of graphs approximating $K$.
For details see, for example, [3,4,9,10,12,14–16,21]. Such constructions fit into the variational fractal framework, and the resulting structures necessarily have intrinsic dimension less than 2, so (4.15) has non-trivial solutions for all $p > 1$. For example, the most commonly studied construction, the Sierpiński triangle, has Hausdorff dimension $\log 3/\log 2$ and intrinsic dimension $2 \log 3/\log 5$. In such constructions the basic analytic inequalities (3.3) and (3.4) may be obtained directly from the explicit expressions for the graph Laplacians and Dirichlet forms, given, for example, in [10,12]. Indeed, our original approach to semilinear PDEs took these explicit forms as a starting point, before we became aware of Mosco’s more general framework.

It is natural to ask whether the solutions guaranteed by Theorem 4.3 can be taken to be strictly positive throughout $K \setminus \Gamma$ as in the classical case. The usual way of demonstrating this is by a strong maximum principle. Such maximum principles are well-known in the classical setting, but in the fractal setting have so far been established only in the case of regular harmonic structures on p.c.f. fractals, see Strichartz [21].

The solutions of (4.15) given by Theorem 4.3 are in $D_{W_0}$, so in particular if $0 < \nu < 2$ the solutions are continuous and in the Hölder class $d_f(2 - \nu)/2\nu$. If $\nu > 2$ we have “weak” solutions in $L^{2\nu/(\nu-2)}(K, \mu)$; the regularity theory needed to improve this is not generally available, though under circumstances where Green’s functions exist stronger conclusions may be possible.

A feature of these results is that the intrinsic dimension $\nu$ plays an analogous rôle to Euclidean dimension in determining for which nonlinearity exponents $p$ the PDEs have a non-trivial solution. Thus Theorem 4.3 should be compared with classical results on critical exponents: for example, if $K$ is a domain in $\mathbb{R}^n$ with smooth boundary then $\Delta u + u^p = 0$ in $K$ with $u = 0$ on $\partial K$ has a positive solution for all $p > 1$ if $n = 1, 2$, and for $1 < p < (n + 2)/(n - 2)$ if $n = 3, 4, \dots$, see [13] for a survey of such results. Moreover, if $K$ is starshaped then no such positive solution exists if $n \ge 3$ and $p > (n + 2)/(n - 2)$, see [19]. It would be interesting to know if there are conditions on a fractal domain $K$ that ensure no non-trivial solutions of $\Delta_0 u + u^p = 0$ if $\nu > 2$ and $p > (\nu + 2)/(\nu - 2)$.

References
1. Ambrosetti, A., Prodi, G.: A Primer of Nonlinear Analysis. Cambridge: University Press, 1992
2. Ambrosetti, A., Rabinowitz, P.H.: Dual variational methods in critical point theory and applications. J. Funct. Anal. 14, 349–381 (1973)
3. Barlow, M.T., Bass, R.F.: Transition densities for Brownian motion on the Sierpiński carpet. Probab. Theory Related Fields 91, 307–330 (1992)
4. Barlow, M.T., Kigami, J.: Localized eigenfunctions of the Laplacian on p.c.f. self-similar sets. J. London Math. Soc. (2) 56, 320–332 (1997)
5. Biroli, M., Mosco, U.: Sobolev and isoperimetric inequalities for Dirichlet forms on discontinuous media. Rend. Mat. Acc. Lincei s. 9 6, 37–44 (1995)
6. Chow, S-N., Hale, J.K.: Methods of Bifurcation Theory. Berlin: Springer, 1982
7. Falconer, K.J.: Fractal Geometry – Mathematical Foundations and Applications. Chichester: John Wiley, 1992
8. Falconer, K.J.: Techniques in Fractal Geometry. Chichester: John Wiley, 1997
9. Hambly, B.M.: Brownian motion on a random recursive Sierpinski gasket. Ann. Probab. 25, 1059–1102 (1997)
10. Kigami, J.: In quest of fractal analysis. In: Yamaguti, M., Hata, M., Kigami, J. (eds.) Mathematics of Fractals, Providence, RI: American Mathematical Society, 1993, pp. 53–73
11. Kigami, J.: Effective resistances for harmonic structures on p.c.f. self-similar sets. Math. Proc. Cambridge Philos. Soc. 115, 291–303 (1994)
12. Kigami, J.: Harmonic calculus on p.c.f. self-similar sets. Trans. Am. Math. Soc. 335, 721–755 (1993)
13. Lions, P.L.: On the existence of positive solutions of semi-linear elliptic equations. SIAM Review 24, 441–467 (1982)
14. Kigami, J., Lapidus, M.L.: Weyl’s problem for the spectral distribution of Laplacians on p.c.f. self-similar sets. Commun. Math. Phys. 158, 93–125 (1993)
15. Kusuoka, S., Yin, Z.X.: Dirichlet forms on fractals: Poincaré constant and resistance. Probab. Theory Related Fields 93, 169–196 (1992)
16. Lapidus, M.L.: Analysis on fractals, Laplacians on self-similar sets, noncommutative geometry and spectral dimensions. Topol. Methods Nonlinear Anal. 4, 137–195 (1994)
17. Mosco, U.: Dirichlet forms and self-similarity. In: J. Jost et al. (eds.) New directions in Dirichlet forms. Cambridge: International Press, 1998
18. Mosco, U.: Lagrangian metrics on fractals. Proc. Symp. Appl. Math. 54, 301–323 (1998)
19. Pohozaev, S.: Eigenfunctions of the equation $\Delta u + \lambda f(u) = 0$. Soviet. Math. Dokl. 6, 1408–1411 (1965)
20. Posta, G.: Spectral asymptotics for variational fractals. Z. Anal. Anwendungen 17, 417–430 (1998)
21. Strichartz, R.S.: Some properties of Laplacians on fractals. J. Funct. Analysis, to appear
Communicated by J. L. Lebowitz
Commun. Math. Phys. 206, 247 – 264 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Projective Module Description of the q-Monopole
Piotr M. Hajac1, Shahn Majid2
1 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver St., Cambridge CB3 9EW, England and Department of Mathematical Methods in Physics, Warsaw University, ul. Hoża 74, Warsaw 00–682, Poland. E-mail: [email protected]
2 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver St., Cambridge CB3 9EW, England. E-mail: [email protected]
Received: 4 September 1998 / Accepted: 16 October 1998
Abstract: The Dirac q-monopole connection is used to compute projector matrices of quantum Hopf line bundles for arbitrary winding number. The Chern–Connes pairing of cyclic cohomology and K-theory is computed for the winding number −1. The nontriviality of this pairing is used to conclude that the quantum principal Hopf fibration is non-cleft. Among general results, we provide a left-right symmetric characterization of the canonical strong connections on quantum principal homogeneous spaces with an injective antipode. We also provide for arbitrary strong connections on algebraic quantum principal bundles (Hopf–Galois extensions) their associated covariant derivatives on projective modules.
Introduction

The goal of this paper is to provide a better understanding of the relationship between the quantum-group and K-theory approaches to the noncommutative-geometry gauge theory. The latter approach is based on the classical Serre-Swan theorem that allows one to think of vector bundles as projective modules. The former comes from the concept of a Hopf–Galois extension which describes a quantum principal bundle the same way Hopf algebras describe quantum groups. Here a Hopf algebra $H$ plays the role of the algebra of functions on the structure group, and the total space of a bundle is replaced by an $H$-comodule algebra $P$. We rely on the Hopf–Galois theory to derive our noncommutative-geometric constructions. On the other hand, it is the machinery of noncommutative geometry that allows us to obtain a Galois-theoretic result: We employ the Chern–Connes pairing to prove the non-cleftness of the Hopf–Galois extension of the algebraic quantum principal Hopf fibration.

We begin in Sect. 1 with some preliminaries about Hopf–Galois extensions, connections and connection 1-forms on algebraic quantum principal bundles, and connections on projective modules. In Sect. 2 we extend the existing theory with some general results
about strong connections, their covariant derivatives on projective modules, and bicovariant splittings of canonical Hopf algebra surjections. We also discuss how to obtain projector matrices from splittings of the multiplication map. In Sect. 3 we first define (the space of sections of) a quantum Hopf line bundle as a bimodule associated to the quantum principal Hopf fibration via a one-dimensional corepresentation of the Hopf algebra $k[z, z^{-1}]$. Then we use a canonical strong connection on the quantum principal Hopf fibration (Dirac q-monopole) to compute, for any one-dimensional corepresentation, left and right projector matrices of the thus defined quantum Hopf line bundles. This computation is the main part of our paper and provides the projective-module characterization of the q-monopole. Further results relating to the Chern–Connes pairing are in Sect. 4. We end with the Appendix where we show that the only invertible elements of the coordinate ring of $SL_q(2)$ are non-zero numbers, and use it as an alternative way to conclude the non-cleftness of the quantum Hopf fibration.

To focus attention and take advantage of the cyclic cohomology results in [MNW91], we work over a ground field $k$ of characteristic zero, and assume that $q$ is a non-zero element in $k$ that is not a root of 1. We use the Sweedler notation $\Delta h = h_{(1)} \otimes h_{(2)}$ (summation understood) and its derivatives. The antipode of the Hopf algebra is a linear map $S : H \to H$, and the counit is an algebra map $\varepsilon : H \to k$ obeying certain properties. The convolution product of two linear maps from a coalgebra to an algebra is denoted in the following way: $(f * g)(c) := f(c_{(1)}) g(c_{(2)})$. We use interchangeably the words “colinear” and “covariant” with respect to linear maps that preserve the comodule structure. For an introduction to noncommutative geometry, quantum groups, Hopf–Galois extensions and quantum-group gauge theory we refer to [C-A94,L-G97], [M-S95], [S-HJ94] and [BM93,BM98] respectively.
1. Preliminaries

We begin by recalling basic definitions and known results.

Definition 1.1. Let $E$ be a left $B$-module, and $(\Omega(B), d)$ a differential algebra on $B$. A linear map $\nabla : \Omega^*(B) \otimes_B E \to \Omega^{*+1}(B) \otimes_B E$ is called a connection (covariant derivative) on $E$ iff
\[ \forall\, \xi \in E, \ \lambda \in \Omega^*(B) : \quad \nabla(\lambda \otimes_B \xi) = \lambda(\nabla \xi) + d\lambda \otimes_B \xi. \]
In the case of the universal differential algebra the existence of a connection is equivalent to the projectivity of $E$ [CQ95, Corollary 8.2]; [L-G97, Proposition 8.2.3]. If $E$ is projective then a connection exists for any differential algebra because it can be obtained from the universal differential algebra and the canonical surjection onto a given differential algebra [C-A94, p. 555].

Definition 1.2. Let $H$ be a Hopf algebra, $P$ be a right $H$-comodule algebra with multiplication $m_P$ and coaction $\Delta_R$, and $B := P^{coH} := \{p \in P \mid \Delta_R\, p = p \otimes 1\}$ the subalgebra of coinvariants. We say that $P$ is a (right) $H$-Galois extension of $B$ iff the canonical left $P$-module right $H$-comodule map
\[ \chi := (m_P \otimes \mathrm{id}) \circ (\mathrm{id} \otimes_B \Delta_R) : P \otimes_B P \longrightarrow P \otimes H \]
is bijective. We say that $P$ is a faithfully flat $H$-Galois extension of $B$ iff $P$ is faithfully flat as a right and left $B$-module. For a comprehensive review of the concept of faithful flatness see [B-N72].
Projective Module Description of q-Monopole
249
Definition 1.3. An $H$-Galois extension is called cleft iff there exists a unital convolution invertible linear map $\Phi : H \to P$ satisfying $\Delta_R \circ \Phi = (\Phi \otimes \mathrm{id}) \circ \Delta$. We call $\Phi$ a cleaving map of $P$.
Note that, in general, $\Phi$ is not uniquely determined by its defining conditions. Observe also that the unitality assumption for the cleaving map is unnecessary in the sense that any right colinear convolution invertible mapping can be normalised to be unital. Indeed, let $\tilde\Phi$ be such a mapping, and $\tilde\Phi(1) := b$. By the colinearity, we have that $b \in B$, and the convolution invertibility entails that $b$ is invertible. Also, $b^{-1} \otimes 1 = b^{-1}\Delta_R(b b^{-1}) = b^{-1} b\, \Delta_R(b^{-1}) = \Delta_R(b^{-1})$. It is straightforward to check that $\Phi := b^{-1}\tilde\Phi$ is right colinear, convolution invertible and unital. Let us also remark that a cleaving map is necessarily injective:
$$(m_P \circ (m_P \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Phi^{-1} \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta) \circ \Delta_R \circ \Phi)(h) = \Phi(h_{(1)})\,\Phi^{-1}(h_{(2)})\,h_{(3)} = h, \quad \forall\, h \in H.$$
To fix convention, let us recall that the universal differential calculus (grade one of the universal differential algebra) can be defined as the kernel of the multiplication map, $\Omega^1 B := \mathrm{Ker}(B \otimes B \xrightarrow{m} B)$, with the differential $db := 1 \otimes b - b \otimes 1$ (e.g., see [L-G97, Sect. 7.1]). (We abuse the notation and use the same letter $d$ to signify both the universal and general differential.) The following are the universal-differential-calculus versions of more general definitions in [BM93,H-PM96]:
Definition 1.4 ([BM93]). Let $B \subseteq P$ be an $H$-Galois extension. Denote by $\Omega^1 P$ the universal differential calculus on $P$. A left $P$-module projection $\Pi$ on $\Omega^1 P$ is called a connection on a quantum principal bundle iff
1. $\mathrm{Ker}\, \Pi = P(\Omega^1 B)P$ (horizontal forms),
2. $\Delta_R \circ \Pi = (\Pi \otimes \mathrm{id}) \circ \Delta_R$ (right covariance).
Here $\Delta_R$ is the right coaction on differential forms given by the formula $\Delta_R(a\, da') := a_{(0)}\, da'_{(0)} \otimes a_{(1)} a'_{(1)}$, where $\Delta_R\, a := a_{(0)} \otimes a_{(1)}$ (summation understood). Coaction on higher order forms is defined in the same manner.
Definition 1.5 ([BM93]). Let $P$, $H$, $B$ and $\Omega^1 P$ be as above. A $k$-homomorphism $\omega : H \to \Omega^1 P$ such that $\omega(1) = 0$ is called a connection form iff it satisfies the following properties:
1. $(m_P \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta_R) \circ \omega = 1 \otimes (\mathrm{id} - \varepsilon)$ (fundamental vector field condition),
2. $\Delta_R \circ \omega = (\omega \otimes \mathrm{id}) \circ \mathrm{ad}_R$, $\ \mathrm{ad}_R(h) := h_{(2)} \otimes S(h_{(1)})h_{(3)}$ (right adjoint covariance).
For every Hopf–Galois extension there is a one-to-one correspondence between connections and connection forms (see [M-S97, Proposition 2.1]). In particular, the connection $\Pi^\omega$ associated to a connection form $\omega$ is given by the formula:
$$\Pi^\omega(dp) = p_{(0)}\,\omega(p_{(1)}). \tag{1.1}$$
$\Pi^\omega$ is a left $P$-module homomorphism, so that it suffices to know its values on exact forms.
Definition 1.6 ([H-PM96]). Let $\Pi$ be a connection in the sense of Definition 1.4. It is called strong iff $(\mathrm{id} - \Pi)(dP) \subseteq (\Omega^1 B)P$. We say that a connection form is strong iff its associated connection is strong.
250
P. M. Hajac, S. Majid
A natural next step is to consider associated quantum vector bundles. More precisely, what we need here is a replacement of the module of sections of an associated vector bundle. In the classical case such sections can be equivalently described as “functions of type $\rho$” from the total space of a principal bundle to a vector space. We follow this construction in the quantum case by considering $B$-bimodules of colinear maps $\mathrm{Hom}_\rho(V, P)$ associated with an $H$-Galois extension $B \subseteq P$ via a corepresentation $\rho : V \to V \otimes H$ (see [D-M96]). For our later purpose, we need the following reformulation of [BM93, Prop. A.7]:
Lemma 1.7. Let $B \subseteq P$ be a cleft $H$-Galois extension and $\rho : V \to V \otimes H$ a right corepresentation of $H$ on $V$. Then the space of colinear maps $\mathrm{Hom}_\rho(V, P)$ is isomorphic as a left $B$-module to the free module $\mathrm{Hom}(V, B)$.
2. Strong Connections on Associated Projective Modules
First we study a general setting for translating strong connections on algebraic quantum principal bundles to connections on projective modules. The associated bimodule of colinear maps is finitely generated projective as a left module over the subalgebra of coinvariants under rather unrestrictive assumptions. However, we do not assume the projectivity of this module in the following two propositions, as it is needed only later to ensure the existence of a connection. Also, although we work only with the universal differential algebra in the sequel, we do not assume here that the differential algebra is universal. It suffices that it is right-covariant, i.e., the right coaction is well-defined on differential forms, and right-covariant and right-flat in the second proposition. On the other hand, we do not aim here at the utmost generality but try to keep our noncommutative-geometric motivation evident.
Proposition 2.1. Let $H$ be a Hopf algebra with a bijective antipode, $P$ a faithfully flat $H$-Galois extension of $B$, and $\rho : V \to V \otimes H$ ($\dim V < \infty$) a coaction. Denote by $\mathrm{Hom}_\rho(V, P)$ the $B$-bimodule of colinear homomorphisms from $V$ to $P$, and choose a right-covariant differential algebra $\Omega(P)$. Then the following map
$$\check\ell : \Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P) \longrightarrow \mathrm{Hom}_\rho(V, \Omega(B)P), \qquad (\check\ell(\lambda \otimes_B \varphi))(v) = \lambda\varphi(v),$$
is an isomorphism of graded left $\Omega(B)$-modules.
Proof. It suffices to show that $\check\ell$ has an inverse. By choosing a linear basis $\{\lambda_\mu\}$ of $\Omega(B)$, for any $\varphi \in \mathrm{Hom}_\rho(V, \Omega(B)P)$ we can write $\varphi(v) = \sum_\mu \lambda_\mu \varphi^\mu(v)$. The point is now to show that we can always choose each $\varphi^\mu$ to be an element of $\mathrm{Hom}_\rho(V, P)$. It can be done by assuming flatness of $\Omega(B)$ (see Proposition 2.3), or by employing our assumptions on the Hopf–Galois extension.
Lemma 2.2. Under the assumptions of Proposition 2.1, for any $\varphi \in \mathrm{Hom}_\rho(V, \Omega(B)P)$ there exist colinear homomorphisms $\tilde\varphi^\mu \in \mathrm{Hom}_\rho(V, P)$ such that $\varphi(v) = \sum_\mu \lambda_\mu \tilde\varphi^\mu(v)$, $\forall v \in V$.
Proof. By assumption, we have
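$$((\Delta_R \circ \varphi) \otimes \mathrm{id})(v_{(0)} \otimes v_{(1)}) = (((\varphi \otimes \mathrm{id}) \circ \rho) \otimes \mathrm{id})(v_{(0)} \otimes v_{(1)}),$$
i.e.,
$$\sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})_{(0)} \otimes \varphi^\mu(v_{(0)})_{(1)} \otimes v_{(1)} = \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)}) \otimes v_{(1)} \otimes v_{(2)}.$$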
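Taking advantage of the faithful flatness of $P$, Theorem I in [S-HJ90] and (1.6) in [D-Y85] (Remark 3.3 in [S-HJ90]), we know that there exists a unital colinear map $j : H \to P$. Applying $m_{\Omega(P)} \circ (\mathrm{id} \otimes (j \circ m)) \circ (\mathrm{id} \otimes S \otimes \mathrm{id})$, where $m_{\Omega(P)}$ and $m$ are appropriate multiplication maps, to both sides of the above equality, we get
$$\sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})_{(0)}\, j\big(S(\varphi^\mu(v_{(0)})_{(1)})\, v_{(1)}\big) = \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})\, j\big(S(v_{(1)})\, v_{(2)}\big).$$
Hence, by the unitality of $j$, we obtain
$$\varphi(v) = \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})_{(0)}\, j\big(S(\varphi^\mu(v_{(0)})_{(1)})\, v_{(1)}\big).$$
On the other hand, using the colinearity of $j$ it is straightforward to verify that each of the maps $\tilde\varphi^\mu : v \mapsto \varphi^\mu(v_{(0)})_{(0)}\, j\big(S(\varphi^\mu(v_{(0)})_{(1)})\, v_{(1)}\big)$ is colinear. □
The next step is to take advantage of the existence of the translation map $\tau : H \to P \otimes_B P$, $\tau(h) := \chi^{-1}(1 \otimes h)$ (see Definition 1.2), and define an auxiliary isomorphism $f : \Omega(B)P \to \Omega(B) \otimes_B P$, $f := (m \otimes_B \mathrm{id}) \circ (\mathrm{id} \otimes \tau) \circ \Delta_R$. From the definition of the translation map it follows that
$$f(\lambda p) = \lambda\, p_{(0)}\, \tau(p_{(1)}) = \lambda\, p_{(0)}\, \chi^{-1}(1 \otimes p_{(1)}) = \lambda\, \chi^{-1}(p_{(0)} \otimes p_{(1)}) = \lambda\, \chi^{-1}(\chi(1 \otimes_B p)) = \lambda \otimes_B p.$$
(Note that $f$ is the inverse of the multiplication map.) Moreover, let $I$ be the restriction to $\mathrm{Hom}_\rho(V, \Omega(B)P)$ of the canonical isomorphism from $\mathrm{Hom}(V, \Omega(B)P)$ to $\Omega(B)P \otimes V^*$. Then we have a well-defined map
$$\hat\ell := (\mathrm{id} \otimes_B I^{-1}) \circ (f \otimes \mathrm{id}) \circ I : \mathrm{Hom}_\rho(V, \Omega(B)P) \longrightarrow \Omega(B)P \otimes_B \mathrm{Hom}_\rho(V, P),$$
$$\hat\ell(\varphi) = \big((\mathrm{id} \otimes_B I^{-1}) \circ (f \otimes \mathrm{id})\big)\Big(\sum_i \sum_\mu \lambda_\mu\, \tilde\varphi^\mu(e_i) \otimes e^i\Big) = \sum_\mu \lambda_\mu \otimes_B \tilde\varphi^\mu,$$
where $\{e_i\}$ is a basis of $V$, $\{e^i\}$ its dual, and (by the above lemma) we choose $\tilde\varphi^\mu \in \mathrm{Hom}_\rho(V, P)$ such that $\varphi(v) = \sum_\mu \lambda_\mu \tilde\varphi^\mu(v)$. It is straightforward to check that $\hat\ell = \check\ell^{-1}$, as desired. □
Proposition 2.3. Let $H$ be a Hopf algebra and $P \supseteq B$ an $H$-Galois extension. Let $\check\ell$ be the map defined in Proposition 2.1. Then if $\Omega(B)$ is flat as a right $B$-module, $\check\ell$ is an isomorphism of graded left $\Omega(B)$-modules.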
Proof. Let $\check\rho : \mathrm{Hom}(V, \Omega(B)P) \to \mathrm{Hom}(V, \Omega(B)P \otimes H)$ be a left $\Omega(B)$-linear homomorphism defined by the formula $\check\rho(\varphi)(v) = \varphi(v_{(0)}) \otimes v_{(1)} - \varphi(v)_{(0)} \otimes \varphi(v)_{(1)}$, and let $\tilde\rho$ denote its restriction to $\mathrm{Hom}(V, P)$. Evidently, we have $\mathrm{Ker}\,\check\rho = \mathrm{Hom}_\rho(V, \Omega(B)P)$ and $\mathrm{Ker}\,\tilde\rho = \mathrm{Hom}_\rho(V, P)$. Moreover, since $\Omega(B)$ is flat as a right $B$-module, we have the following commutative diagram with exact rows of left $\Omega(B)$-modules:
$$\begin{array}{ccccccc}
0 & \longrightarrow & \Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P) & \longrightarrow & \Omega(B) \otimes_B \mathrm{Hom}(V, P) & \xrightarrow{\ \mathrm{id} \otimes_B \tilde\rho\ } & \Omega(B) \otimes_B \mathrm{Hom}(V, P \otimes H) \\
 & & \downarrow{\scriptstyle \check\ell} & & \downarrow{\scriptstyle \ell} & & \downarrow{\scriptstyle \tilde\ell} \\
0 & \longrightarrow & \mathrm{Hom}_\rho(V, \Omega(B)P) & \longrightarrow & \mathrm{Hom}(V, \Omega(B)P) & \xrightarrow{\ \check\rho\ } & \mathrm{Hom}(V, \Omega(B)P \otimes H).
\end{array} \tag{2.1}$$
Here $\ell$ is defined by the formula $\ell(\sum_\mu \lambda_\mu \otimes_B \varphi^\mu)(v) = \sum_\mu \lambda_\mu \varphi^\mu(v)$, and $\tilde\ell$ is given the same way. With the help of the translation map $\tau : H \to P \otimes_B P$, reasoning as in the proof of the preceding proposition, one can show that $\ell$ and $\tilde\ell$ are isomorphisms. By standard diagram chasing (or completing the left hand side of (2.1) with zeros and invoking the Five Isomorphism Lemma), one can conclude from the diagram (2.1) that $\check\ell$ is also an isomorphism. □
If $\omega$ is a strong connection form, then $(\mathrm{id} - \Pi^\omega) \circ d \circ \mathrm{Hom}_\rho(V, P) \subseteq \mathrm{Hom}_\rho(V, \Omega^1(B)P)$. Assuming also that the conditions allowing us to utilise one of the above propositions are fulfilled, we can define the covariant derivative associated to $\omega$ in the following way:
$$\nabla^\omega : \mathrm{Hom}_\rho(V, P) \longrightarrow \Omega^1(B) \otimes_B \mathrm{Hom}_\rho(V, P), \qquad \nabla^\omega \xi := \check\ell^{-1}\big((\mathrm{id} - \Pi^\omega) \circ d \circ \xi\big). \tag{2.2}$$
One can check that $\nabla^\omega$ satisfies the Leibniz rule $\nabla^\omega(b\xi) = b\nabla^\omega\xi + db \otimes_B \xi$. Hence $\nabla^\omega$ can be extended (by the Leibniz rule) to an endomorphism of $\Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P)$ which is of degree 1 with respect to the grading of $\Omega(B)$.
Our second group of results concerns the canonical connection on a quantum principal homogeneous space (principal homogeneous $H$-Galois extension), which is the general construction behind the Dirac q-monopole. A principal homogeneous $H$-Galois extension $B \subseteq P$ is a Hopf–Galois extension obtained from a surjective Hopf algebra map $\pi : P \to H$ which defines the right comodule structure by the formula $\Delta_R := (\mathrm{id} \otimes \pi) \circ \Delta$. We know from the proof of [BM93, Proposition 5.3] that if $B \subseteq P$ is a principal homogeneous $H$-Galois extension, and $i : H \to P$ is a linear unital map such that $\pi \circ i = \mathrm{id}$ (splitting of $\pi$) and
$$(\mathrm{id} \otimes \pi) \circ \mathrm{ad}_R \circ i = (i \otimes \mathrm{id}) \circ \mathrm{ad}_R, \tag{2.3}$$
then $\omega := (S * d) \circ i$ is a connection form in the sense of Definition 1.5. (Note that since $i$ is a splitting of a Hopf algebra map, it is counital: $\varepsilon_H = \varepsilon_H \circ \pi \circ i = \varepsilon_P \circ i$.) We call the thus constructed connection the canonical connection (form) associated to splitting $i$. (In what follows, we skip writing “form” for the sake of brevity.) The next step is towards a left-right symmetric characterization of strong canonical connections.
Proposition 2.4. The canonical connection associated to splitting $i : H \to P$ satisfying the above conditions is strong if and only if the splitting $i$ obeys in addition the right covariance condition $(i \otimes \mathrm{id}) \circ \Delta = \Delta_R \circ i$.
Proof. First we need to reduce the strongness condition for the canonical connection to a simpler form:
Lemma 2.5. The canonical connection $\omega$ associated to $i : H \to P$ is strong if and only if
$$i(h_{(2)})_{(2)} \otimes h_{(1)}\, S\pi(i(h_{(2)})_{(1)}) = i(h) \otimes 1, \quad \forall h \in H. \tag{2.4}$$
Proof. To simplify the notation, let us put $\pi(p) = \bar p$. Also, let $\Pi^\omega$ denote the connection associated to $\omega$, i.e., $\Pi^\omega(dp) = p_{(1)}\,\omega(\bar p_{(2)})$. (We take advantage of the fact that $\Delta_R = (\mathrm{id} \otimes \pi) \circ \Delta$, see (1.1).) Using the Leibniz rule we obtain:
$$\begin{aligned}
(\mathrm{id} - \Pi^\omega)(dp) &= d\big(p_{(1)}\, S(i(\bar p_{(2)})_{(1)})\, i(\bar p_{(2)})_{(2)}\big) - p_{(1)}\, S(i(\bar p_{(2)})_{(1)})\, d\big(i(\bar p_{(2)})_{(2)}\big) \\
&= d\big(p_{(1)}\, S(i(\bar p_{(2)})_{(1)})\big)\, i(\bar p_{(2)})_{(2)} \\
&= 1 \otimes p - p_{(1)}\, S(i(\bar p_{(2)})_{(1)}) \otimes i(\bar p_{(2)})_{(2)}.
\end{aligned}$$
On the other hand, applying $\Delta_R \otimes \mathrm{id}$ to $p_{(1)}\, S(i(\bar p_{(2)})_{(1)}) \otimes i(\bar p_{(2)})_{(2)}$ yields $p_{(1)}\, S(i(\bar p_{(3)})_{(2)}) \otimes \bar p_{(2)}\, \overline{S(i(\bar p_{(3)})_{(1)})} \otimes i(\bar p_{(3)})_{(3)}$. Remembering that $(\Omega^1 B)P \subseteq B \otimes P$, we conclude that the strongness condition (see Definition 1.6, cf. [M-S97, (11)]) of the canonical connection is equivalent to
$$p_{(1)}\, S(i(\bar p_{(3)})_{(2)}) \otimes \bar p_{(2)}\, \overline{S(i(\bar p_{(3)})_{(1)})} \otimes i(\bar p_{(3)})_{(3)} = p_{(1)}\, S(i(\bar p_{(2)})_{(1)}) \otimes 1 \otimes i(\bar p_{(2)})_{(2)}.$$
The above equation is of the form $(\mathrm{id} * f_1)(p) = (\mathrm{id} * f_2)(p)$. Since the antipode $S$ is the convolution inverse of $\mathrm{id}$, it is equivalent to $f_1(p) = f_2(p)$. Therefore we can cancel the $p_{(1)}$ product from both sides. Also, since $\pi$ is surjective and a coalgebra map, we can replace $\pi(p)$ by a general element $h \in H$. Thus we arrive at
$$S(i(h_{(2)})_{(2)}) \otimes h_{(1)}\, \overline{S(i(h_{(2)})_{(1)})} \otimes i(h_{(2)})_{(3)} = S(i(h)_{(1)}) \otimes 1 \otimes i(h)_{(2)}.$$
Moreover, for any Hopf algebra the map $(S \otimes \mathrm{id}) \circ \Delta$ is injective (apply $\varepsilon \otimes \mathrm{id}$). Consequently, the strongness is equivalent to the condition $i(h_{(2)})_{(2)} \otimes h_{(1)}\, \overline{S(i(h_{(2)})_{(1)})} = i(h) \otimes 1$, $\forall\, h \in H$, as claimed. □
Note now that we can write the adjoint covariance of $i$, in an explicit manner, as
$$i(h)_{(2)} \otimes \overline{S(i(h)_{(1)})\, i(h)_{(3)}} = i(h_{(2)}) \otimes (Sh_{(1)})\,h_{(3)}, \quad \forall h \in H.$$
In this case
$$i(h_{(1)}) \otimes h_{(2)} = i(h_{(3)}) \otimes h_{(1)} S(h_{(2)}) h_{(4)} = (1 \otimes h_{(1)})\big((i \otimes \mathrm{id}) \circ \mathrm{ad}_R\big)(h_{(2)}) = i(h_{(2)})_{(2)} \otimes h_{(1)}\, \overline{S(i(h_{(2)})_{(1)})\, i(h_{(2)})_{(3)}}. \tag{2.5}$$
Assume that $\omega$ is strong. Hence, by the above lemma, the strongness condition implies that $i(h_{(1)}) \otimes h_{(2)} = i(h)_{(1)} \otimes \pi(i(h)_{(2)})$ as required. Conversely, using the right covariance of $i$ for the first step and (2.5) for the second, we compute the left hand side of (2.4) as
$$\begin{aligned}
i(h_{(2)})_{(2)} \otimes h_{(1)}\, \pi\big(S(i(h_{(2)})_{(1)})\, i(h_{(2)})_{(3)}\, S(i(h_{(2)})_{(4)})\big)
&= i(h_{(2)})_{(2)} \otimes h_{(1)}\, \pi\big(S(i(h_{(2)})_{(1)})\, i(h_{(2)})_{(3)}\big)\, Sh_{(3)} \\
&= i(h_{(3)}) \otimes h_{(1)} S(h_{(2)}) h_{(4)} S(h_{(5)}) = i(h) \otimes 1.
\end{aligned}$$
Hence the canonical connection is strong by Lemma 2.5. □
Corollary 2.6. Assume that antipode $S$ is injective. Then strong canonical connections are in 1-1 correspondence with linear unital splittings of $\pi$ obeying the two conditions $(i \otimes \mathrm{id}) \circ \Delta = \Delta_R \circ i$, $(\mathrm{id} \otimes i) \circ \Delta = \Delta_L \circ i$, where $\Delta_R = (\mathrm{id} \otimes \pi) \circ \Delta$, $\Delta_L = (\pi \otimes \mathrm{id}) \circ \Delta$.
Proof. Assume first that the canonical connection associated to $i$ is strong. Then, by the preceding proposition, $i$ is right covariant and (2.5) holds. Hence
$$i(h_{(1)})_{(2)} \otimes \pi\big(S(i(h_{(1)})_{(1)})\big)\, h_{(2)} = i(h)_{(1)(2)} \otimes \pi\big(S(i(h)_{(1)(1)})\, i(h)_{(2)}\big) = i(h)_{(2)} \otimes \pi\big(S(i(h)_{(1)})\, i(h)_{(3)}\big) = i(h_{(2)}) \otimes (Sh_{(1)})\,h_{(3)}.$$
Reasoning as in the proof of Lemma 2.5, we can cancel $h_{(2)}$ and $h_{(3)}$ from the two sides. Then cancelling $S$ from both sides (we assume $S$ to be injective), we have $i(h)_{(2)} \otimes \pi(i(h)_{(1)}) = i(h_{(2)}) \otimes h_{(1)}$, which is the left covariance condition. Conversely, if the left and right covariance conditions hold then
$$i(h_{(2)}) \otimes (Sh_{(1)})\,h_{(3)} = i(h_{(2)(1)}) \otimes (Sh_{(1)})\,h_{(2)(2)} = i(h_{(2)})_{(1)} \otimes (Sh_{(1)})\,\pi\big(i(h_{(2)})_{(2)}\big) = i(h)_{(2)(1)} \otimes \pi\big(S(i(h)_{(1)})\, i(h)_{(2)(2)}\big),$$
which is the same as (2.5). Invoking again the preceding proposition, we can conclude that the canonical connection associated to $i$ is strong as required. □
Remark 2.7. Let $\pi : P \to H$ be a Hopf algebra surjection. If a linear map $i : H \to P$ is counital and left or right colinear, then $i$ is a splitting of $\pi$, i.e., $\pi \circ i = \mathrm{id}$. Indeed, if $i$ is right colinear ($i(h)_{(1)} \otimes \pi(i(h)_{(2)}) = i(h_{(1)}) \otimes h_{(2)}$), we have:
$$(\pi \circ i)(h) = \varepsilon\big((\pi \circ i)(h)_{(1)}\big)\,(\pi \circ i)(h)_{(2)} = \varepsilon\big(\pi(i(h)_{(1)})\big)\,\pi(i(h)_{(2)}) = \varepsilon\big(\pi(i(h_{(1)}))\big)\,h_{(2)} = h.$$
The left-sided case is analogous.
We end this section by showing how to obtain a projector matrix (explicit embedding of a projective module in a free module) from the canonical strong connection. It is known [DGH] that strong connection forms on $P$ are equivalent to unital left $B$-linear right $H$-colinear splittings of the multiplication map $m : B \otimes P \to P$. Explicitly, if $\omega$ is a strong connection form, then
$$s : P \longrightarrow B \otimes P, \qquad s(p) = p \otimes 1 + p_{(0)}\,\omega(p_{(1)}) \tag{2.6}$$
gives the desired splitting. (Solving this equation for $\omega$ one gets $\omega(h) = h^{[1]} s(h^{[2]}) - 1 \otimes \varepsilon(h)$, where $h^{[1]} \otimes_B h^{[2]} = \chi^{-1}(1 \otimes h)$, summation understood, see Definition 1.2.) In particular, for the canonical strong connection associated to a bicovariant splitting $i$ (i.e., $\omega = (S * d) \circ i$), we have:
$$s(p) = p_{(1)}\, S\big(i(\pi(p_{(2)}))_{(1)}\big) \otimes i(\pi(p_{(2)}))_{(2)}. \tag{2.7}$$
Note that a splitting of the multiplication map is almost the same as a projector matrix, for it is an embedding of $P$ in the free $B$-module $B \otimes P$. (We will use formula (2.7) in the next section to compute projector matrices of quantum Hopf line bundles from the Dirac q-monopole connection.) To turn (2.7) into a concrete recipe for producing finite size projector matrices of finitely generated projective modules, let us note the following general lemma:
Lemma 2.8. Let $A$ be an algebra and $M$ a projective left $A$-module generated by linearly independent generators $g_1, ..., g_n$. Also, let $\{\tilde g_\mu\}_{\mu \in I}$ be a completion of $\{g_1, ..., g_n\}$ to a linear basis of $M$, $f_2$ be a left $A$-linear splitting of the multiplication map $m : A \otimes M \to M$ given by the formula $f_2(g_k) = \sum_{l=1}^n a_{kl} \otimes g_l + \sum_{\mu \in I} a_{k\mu} \otimes \tilde g_\mu$, and $c_{\mu l} \in A$ a choice of coefficients such that $\tilde g_\mu = \sum_{l=1}^n c_{\mu l}\, g_l$. Then $e_{kl} = a_{kl} + \sum_{\mu \in I} a_{k\mu} c_{\mu l}$ defines a projector matrix of $M$, i.e., $e \in M_n(A)$, $e^2 = e$ and $A^n e$ and $M$ are isomorphic as left $A$-modules.
Proof. Note first that we do not lose any generality by assuming $g_1, ..., g_n$ to be linearly independent (we can always remove generators that are linear combinations of other generators), and that a splitting of the multiplication map always exists by the projectivity assumption (cf. [CQ95, Sect. 8]). Let $N$ be the kernel of the surjection $f_1 : A^n \to M = A^n/N$, $f_1(e_k) = g_k$, $k \in \{1, ..., n\}$, where $\{e_k\}_{k \in \{1,...,n\}}$ is the standard basis of $A^n$, i.e., $e_k$ is the row with zeros everywhere except for the $k$-th place where there is 1. We have the following commutative diagram of left $A$-module homomorphisms whose rows are exact:
$$\begin{array}{ccccccccc}
0 & \longrightarrow & A \otimes N & \longrightarrow & A \otimes A^n & \underset{f_3}{\overset{\mathrm{id} \otimes f_1}{\rightleftarrows}} & A \otimes M & \longrightarrow & 0 \\
 & & & & \downarrow{\scriptstyle f_4} & & {\scriptstyle f_2}\uparrow\ \downarrow{\scriptstyle m} & & \\
0 & \longrightarrow & N & \longrightarrow & A^n & \xrightarrow{\ f_1\ } & M & \longrightarrow & 0.
\end{array} \tag{2.8}$$
Here $f_2$ is a splitting of the multiplication map ($m \circ f_2 = \mathrm{id}$), $f_3$ a splitting of $\mathrm{id} \otimes f_1$ (which exists because $A \otimes M$ is free), and $f_4$ is the multiplication map on $A \otimes A^n$. From the commutativity of the diagram we can infer that $f_4 \circ f_3 \circ f_2$ is a splitting of $f_1$: $f_1 \circ f_4 \circ f_3 \circ f_2 = m \circ (\mathrm{id} \otimes f_1) \circ f_3 \circ f_2 = \mathrm{id}$.
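Hence $f_e := f_4 \circ f_3 \circ f_2 \circ f_1$ is an idempotent ($f_e^2 = f_e$) and $f_e(A^n)$ is isomorphic to $M$, as needed. To compute a matrix of $f_e$, we choose a splitting $f_3$ so that $f_3(1 \otimes g_k) = 1 \otimes e_k$, $f_3(1 \otimes \tilde g_\mu) = 1 \otimes \sum_{l=1}^n c_{\mu l}\, e_l$, $\sum_{l=1}^n c_{\mu l}\, g_l = \tilde g_\mu$, $k \in \{1, ..., n\}$, $\mu \in I$. Then
$$\begin{aligned}
f_e(e_k) &= (f_4 \circ f_3 \circ f_2)(g_k) \\
&= (f_4 \circ f_3)\Big(\sum_{l=1}^n a_{kl} \otimes g_l + \sum_{\mu \in I} a_{k\mu} \otimes \tilde g_\mu\Big) \\
&= f_4\Big(\sum_{l=1}^n a_{kl} \otimes e_l + \sum_{\mu \in I} a_{k\mu} \otimes \sum_{l=1}^n c_{\mu l}\, e_l\Big) \\
&= \sum_{l=1}^n \Big(a_{kl} + \sum_{\mu \in I} a_{k\mu} c_{\mu l}\Big) e_l.
\end{aligned}$$
This means that $\big(a_{kl} + \sum_{\mu \in I} a_{k\mu} c_{\mu l}\big)_{k,l \in \{1,...,n\}}$ is a projector matrix of $M$, as claimed. □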
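Observe that if $a_{k\mu} = 0$ for all $k$ and $\mu$, the matrix elements of $e$ are simply $a_{kl}$, and can be directly read off from the formula for splitting $f_2$ written in terms of the module generators $g_1, ..., g_n$. By a completely analogous reasoning, the same kind of lemma is true for right modules.
3. Projective Module Form of the Dirac q-Monopole
Recall that $A(SL_q(2))$ is a Hopf algebra over a field $k$ generated by $1, \alpha, \beta, \gamma, \delta$, satisfying the following relations:
$$\begin{gathered}
\alpha\beta = q^{-1}\beta\alpha, \quad \alpha\gamma = q^{-1}\gamma\alpha, \quad \beta\delta = q^{-1}\delta\beta, \quad \beta\gamma = \gamma\beta, \quad \gamma\delta = q^{-1}\delta\gamma, \\
\alpha\delta - \delta\alpha = (q^{-1} - q)\beta\gamma, \quad \alpha\delta - q^{-1}\beta\gamma = \delta\alpha - q\beta\gamma = 1,
\end{gathered} \tag{3.1}$$
where $q \in k \setminus \{0\}$. The comultiplication $\Delta$, counit $\varepsilon$, and antipode $S$ of $A(SL_q(2))$ are defined by the following formulas:
$$\Delta\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \alpha \otimes 1 & \beta \otimes 1 \\ \gamma \otimes 1 & \delta \otimes 1 \end{pmatrix}\begin{pmatrix} 1 \otimes \alpha & 1 \otimes \beta \\ 1 \otimes \gamma & 1 \otimes \delta \end{pmatrix}, \quad \varepsilon\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad S\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \delta & -q\beta \\ -q^{-1}\gamma & \alpha \end{pmatrix}.$$
Now we need to recall the construction of the standard quantum sphere of Podleś and the quantum principal Hopf fibration. The standard quantum sphere is singled out among the principal series of Podleś quantum spheres by the property that it can be constructed as a quantum quotient space [P-P87]. In algebraic terms it means that its coordinate ring can be obtained as the subalgebra of coinvariants of a comodule algebra. To carry out this construction, first we need the right coaction on $A(SL_q(2))$ of the commutative and cocommutative Hopf algebra $k[z, z^{-1}]$ generated by the grouplike element $z$ and its inverse. This Hopf algebra can be obtained as the quotient of $A(SL_q(2))$ by the Hopf ideal generated by the off-diagonal generators $\beta$ and $\gamma$. Identifying the image of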
$\alpha$ and $\delta$ under the Hopf algebra surjection $\pi : A(SL_q(2)) \to k[z, z^{-1}]$ with $z$ and $z^{-1}$ respectively, we can describe the right coaction $\Delta_R := (\mathrm{id} \otimes \pi) \circ \Delta$ by the formula:
$$\Delta_R\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \alpha \otimes z & \beta \otimes z^{-1} \\ \gamma \otimes z & \delta \otimes z^{-1} \end{pmatrix}.$$
We call the subalgebra of coinvariants defined by this coaction the coordinate ring of the (standard) quantum sphere, and denote it by $A(S_q^2)$. Since $k[z, z^{-1}] = A(SL_q(2))/(A(S_q^2) \cap \mathrm{Ker}\,\varepsilon)A(SL_q(2))$ by Remark 3.4, we know from the general argument that $A(S_q^2) \subseteq A(SL_q(2))$ is a principal homogeneous $k[z, z^{-1}]$-Galois extension. (If $P$ is a Hopf algebra, $I$ a Hopf ideal, $B$ the subalgebra of coinvariants under the coaction $\Delta_R = (\mathrm{id} \otimes \pi) \circ \Delta$, $\pi : P \to P/I$, and $I = (B \cap \mathrm{Ker}\,\varepsilon)P$, then we can define the inverse of the canonical map by $\chi^{-1}(p' \otimes \pi(p)) = p'\, Sp_{(1)} \otimes_B p_{(2)}$.) We refer to the quantum principal bundle given by this Hopf–Galois extension as the quantum principal Hopf fibration. (An $SO_q(3)$ version of this quantum fibration was introduced in [BM93].) The main point of this section is to compute projector matrices of quantum Hopf line bundles associated to the just described Hopf q-fibration.
Definition 3.1. Let $\rho_n : k \to k \otimes k[z, z^{-1}]$, $\rho_n(1) = 1 \otimes z^{-n}$, $n \in \mathbb{Z}$, be a one-dimensional corepresentation of $k[z, z^{-1}]$. We call the $A(S_q^2)$-bimodule of colinear maps $\mathrm{Hom}_{\rho_n}(k, A(SL_q(2)))$ the (bimodule of) quantum Hopf line bundle of winding number $n$.
Since we deal here with one-dimensional corepresentations, we identify colinear maps with their value at 1. We have
$$\mathrm{Hom}_{\rho_n}(k, A(SL_q(2))) \cong \{p \in A(SL_q(2)) \mid \Delta_R\, p = p \otimes z^{-n}\} =: P_n$$
as $A(S_q^2)$-bimodules. With the help of the PBW basis $\alpha^k\beta^l\gamma^m$, $\beta^p\gamma^r\delta^s$, $k, l, m, p, r, s \in \mathbb{N}_0$, $k > 0$ of $A(SL_q(2))$, one can show that
$$P_n = \begin{cases} \sum_{k=0}^{-n} A(S_q^2)\, \alpha^{-n-k}\gamma^k = \sum_{k=0}^{-n} \alpha^{-n-k}\gamma^k\, A(S_q^2) & \text{for } n \le 0, \\[4pt] \sum_{k=0}^{n} A(S_q^2)\, \beta^k\delta^{n-k} = \sum_{k=0}^{n} \beta^k\delta^{n-k}\, A(S_q^2) & \text{for } n \ge 0, \end{cases}$$
and $A(SL_q(2)) = \bigoplus_{n \in \mathbb{Z}} P_n$ (cf. [MMNNU91, (1.10)]).
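The grading by winding number can be read off directly from the coaction on generators, because $\Delta_R$ is an algebra map and $z$ is grouplike. The following is a minimal sketch under stated assumptions: the letters `a`, `b`, `c`, `d` stand for $\alpha, \beta, \gamma, \delta$, a monomial is written as a string of these letters, and the function names are hypothetical. It does not implement the algebra relations; it only tracks the power of $z$, with respect to which the relations (3.1) are homogeneous.

```python
# Right coaction on generators: a -> a (x) z, b -> b (x) z^-1,
# c -> c (x) z, d -> d (x) z^-1.
Z_DEGREE = {"a": 1, "b": -1, "c": 1, "d": -1}


def z_degree(word):
    """Exponent e with Delta_R(word) = word (x) z^e, for a word in a, b, c, d."""
    return sum(Z_DEGREE[g] for g in word)


def winding_number(word):
    """n such that the monomial lies in P_n, i.e. Delta_R p = p (x) z^(-n)."""
    return -z_degree(word)


# alpha*beta, beta*gamma, gamma*delta are coinvariant, so they lie in A(S_q^2) = P_0.
sphere_generators = ["ab", "bc", "cd"]
print([winding_number(w) for w in sphere_generators])  # [0, 0, 0]

# The spanning monomials alpha^(2-k) gamma^k lie in P_{-2},
# and beta^k delta^(2-k) lie in P_2.
print([winding_number(w) for w in ["aa", "ac", "cc"]])  # [-2, -2, -2]
print([winding_number(w) for w in ["dd", "bd", "bb"]])  # [2, 2, 2]
```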
Next, similarly to [BM93,BM98], we consider the canonical connection induced by the bicovariant splitting $i(z^n) = \alpha^n$, $i(z^{-n}) = \delta^n$. By Corollary 2.6 it induces a strong connection. We call this connection the (Dirac) q-monopole. Now, formula (2.7) gives us a splitting $s : A(SL_q(2)) \to A(S_q^2) \otimes A(SL_q(2))$, and we can claim:
Proposition 3.2. Put
$$(e_n)_{kl} = \begin{cases} \alpha^{-n-k}\gamma^k \binom{-n}{l}_{q^2} (-q)^l \beta^l \delta^{-n-l} & \text{for } n \le 0, \\[4pt] \beta^k\delta^{n-k} \binom{n}{l}_{q^2} (-q)^{-l} \alpha^{n-l}\gamma^l & \text{for } n \ge 0. \end{cases}$$
Then, for any $n \in \mathbb{Z}$, $e_n \in M_{|n|+1}(A(S_q^2))$, $e_n^2 = e_n$, and $A(S_q^2)^{|n|+1} e_n$ is isomorphic to $P_n$ as a left $A(S_q^2)$-module.
Proof. Recall first that if $qxy = yx$, then $(x + y)^n = \sum_{k=0}^n \binom{n}{k}_q x^k y^{n-k}$, where
$$\binom{n}{k}_q = \frac{(q - 1)\cdots(q^n - 1)}{(q - 1)\cdots(q^k - 1)\,(q - 1)\cdots(q^{n-k} - 1)}$$
are the q-binomial coefficients. (See, e.g., [M-S95, p.85].) Taking advantage of formula (2.7) in the q-monopole case, we compute:
$$\begin{aligned}
s(\alpha^{m-k}\gamma^k) &= \alpha^{m-k}\gamma^k\, S\big(i(z^m)_{(1)}\big) \otimes i(z^m)_{(2)} \\
&= \sum_{l=0}^m \alpha^{m-k}\gamma^k \binom{m}{l}_{q^2} S(\alpha^{m-l}\beta^l) \otimes \alpha^{m-l}\gamma^l \\
&= \sum_{l=0}^m \alpha^{m-k}\gamma^k \binom{m}{l}_{q^2} (-q)^l \beta^l\delta^{m-l} \otimes \alpha^{m-l}\gamma^l.
\end{aligned}$$
Similarly, $s(\beta^k\delta^{n-k}) = \sum_{l=0}^n \beta^k\delta^{n-k} \binom{n}{l}_{q^2} (-q)^{-l} \alpha^{n-l}\gamma^l \otimes \beta^l\delta^{n-l}$. Thus we have verified that $s$ preserves the direct sum decomposition of $A(SL_q(2))$, i.e., $s(P_n) \subseteq A(S_q^2) \otimes P_n$, $n \in \mathbb{Z}$. Hence, by restriction, we have a splitting of the left multiplication map for each $P_n$. The claim of the proposition follows directly from Lemma 2.8 and the above formulas for $s$. □
Remark 3.3. Observe that for $n \ge 0$ we can write $e_n = uv^T$, where $u^T = (\delta^n, ..., \beta^k\delta^{n-k}, ..., \beta^n)$ and $v^T = \big(S(\delta^n), ..., \binom{n}{k}_{q^2} S(\gamma^k\delta^{n-k}), ..., S(\gamma^n)\big)$. Since
$$v^T u = \sum_{k=0}^n \binom{n}{k}_{q^2} S(\gamma^k\delta^{n-k})\, \beta^k\delta^{n-k} = S\big((\delta^n)_{(1)}\big)(\delta^n)_{(2)} = \varepsilon(\delta^n) = 1,$$
we can directly see that $e_n^2 = e_n$. The case $n \le 0$ is similar.
Remark 3.4. We can define the fibre of a quantum vector bundle over a classical point (understood as a number-valued algebra homomorphism) as the localization of the module of “sections” of this bundle at the kernel of this homomorphism. The standard Podleś quantum sphere that we consider here has one classical point given by the restriction of the counit map $\varepsilon$. Let us consider the quantum Hopf line bundles as left $A(S_q^2)$-modules $P_n$. We can then regard the localization $P_n/A(S_q^2)^+ P_n$, $A(S_q^2)^+ := \mathrm{Ker}\,\varepsilon \cap A(S_q^2)$, as the fibre vector space of $P_n$ over the point given by $A(S_q^2)^+$. (Note that $P_n/A(S_q^2)^+ P_n$ is automatically a vector space over $A(S_q^2)/A(S_q^2)^+ = k$.) Since $\varepsilon(A(S_q^2)^+ P_n) = 0$, $\varepsilon$ induces a linear map $\tilde\varepsilon : P_n/A(S_q^2)^+ P_n \to k$ given by the formula $\tilde\varepsilon(p/A(S_q^2)^+ P_n) = \varepsilon(p)$. Assume now that $n \ge 0$. Arbitrary $p \in P_n$ can be written as $p = \sum_{l=0}^n b_l\, \beta^l\delta^{n-l}$, $b_l \in A(S_q^2)$. Hence $\tilde\varepsilon(p/A(S_q^2)^+ P_n) = \varepsilon(b_0)$, and we can conclude
that $\tilde\varepsilon$ is surjective. Note now that $\beta = (-q^{-1}\beta\gamma)\beta + (q\alpha\beta)\delta$, and consequently, for $l > 0$, $\beta^l\delta^{n-l} = (-q^{-1}\beta\gamma)\beta^l\delta^{n-l} + (q\alpha\beta)\delta\beta^{l-1}\delta^{n-l} \in A(S_q^2)^+ P_n$. It follows that
$$\begin{aligned}
\Big(\sum_{l=0}^n b_l\, \beta^l\delta^{n-l}\Big)\Big/A(S_q^2)^+ P_n &= b_0\delta^n/A(S_q^2)^+ P_n \\
&= \varepsilon(b_0)\delta^n/A(S_q^2)^+ P_n + (b_0 - \varepsilon(b_0))\delta^n/A(S_q^2)^+ P_n \\
&= \varepsilon(b_0)\delta^n/A(S_q^2)^+ P_n.
\end{aligned}$$
This entails the injectivity of $\tilde\varepsilon$. Thus $\tilde\varepsilon$ is an isomorphism, and we can infer that the fibre $P_n/A(S_q^2)^+ P_n$ is a one-dimensional vector space, exactly as expected for a line bundle. The reasoning for $n \le 0$ is analogous, and relies on the identity $\gamma = (-q\beta\gamma)\gamma + (q^{-1}\delta\gamma)\alpha$. This agrees with the fact that $A(SL_q(2)) = \bigoplus_{n \in \mathbb{Z}} P_n$ and $A(SL_q(2))/A(S_q^2)^+ A(SL_q(2)) = k[z, z^{-1}] = \bigoplus_{n \in \mathbb{Z}} kz^n$. The latter equality can be directly seen as follows: Since $\beta$ and $\gamma$ q-commute with all monomials, the two-sided ideal $\langle\beta, \gamma\rangle = \beta A(SL_q(2)) + \gamma A(SL_q(2))$. Thus, as $\beta, \gamma \in A(S_q^2)^+ A(SL_q(2))$ by the above formulas, we have $\langle\beta, \gamma\rangle \subseteq A(S_q^2)^+ A(SL_q(2))$. On the other hand, since $A(S_q^2)^+$ is the ideal in $A(S_q^2)$ generated by $\alpha\beta$, $\beta\gamma$, $\gamma\delta$, we also have $A(S_q^2)^+ A(SL_q(2)) \subseteq \langle\beta, \gamma\rangle$. Hence $k[z, z^{-1}] = A(SL_q(2))/\langle\beta, \gamma\rangle = A(SL_q(2))/A(S_q^2)^+ A(SL_q(2))$.
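The proof of Proposition 3.2 rests on the q-binomial theorem for variables with $qxy = yx$. As a hedged numerical illustration (a sketch, not part of the paper), one can expand $(x+y)^n$ by normal ordering, using only the rule $yx = q\,xy$, and compare the coefficients with the product formula for the q-binomial coefficients. The function names are hypothetical, and $q$ is assumed to be a rational number that is neither zero nor a root of unity.

```python
from fractions import Fraction


def q_binomial(n, k, q):
    """[n choose k]_q from the product formula quoted in the proof."""
    def q_factorial(m):
        out = Fraction(1)
        for j in range(1, m + 1):
            out *= q**j - 1
        return out
    return q_factorial(n) / (q_factorial(k) * q_factorial(n - k))


def expand_power(n, q):
    """Coefficients of x^a y^b in (x + y)^n when y x = q x y."""
    coeffs = {(0, 0): Fraction(1)}  # (power of x, power of y) -> coefficient
    for _ in range(n):
        nxt = {}
        for (a, b), c in coeffs.items():
            # right-multiply by x: x^a y^b x = q^b x^(a+1) y^b
            nxt[(a + 1, b)] = nxt.get((a + 1, b), 0) + c * q**b
            # right-multiply by y: x^a y^b y = x^a y^(b+1)
            nxt[(a, b + 1)] = nxt.get((a, b + 1), 0) + c
        coeffs = nxt
    return coeffs


q = Fraction(3, 2)
for n in range(7):
    expansion = expand_power(n, q)
    assert all(expansion[(k, n - k)] == q_binomial(n, k, q) for k in range(n + 1))
```

In the proof the theorem is applied with $q^2$ in place of $q$, to $x = \alpha \otimes \alpha$ and $y = \beta \otimes \gamma$, which satisfy $q^2 xy = yx$ by (3.1).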
To compute projector matrices of the quantum Hopf line bundles thought of as right $A(S_q^2)$-modules, we need a right-sided version of formula (2.7). A natural first candidate appears to be:
$$\tilde s(p) = i(\pi(p_{(1)}))_{(1)} \otimes S\big(i(\pi(p_{(1)}))_{(2)}\big)\, p_{(2)}. \tag{3.2}$$
It is evidently a splitting of the multiplication map $m : A(SL_q(2)) \otimes A(SL_q(2)) \to A(SL_q(2))$. Only now it is right linear under left coinvariants. By left coinvariants we understand here $\tilde A(S_q^2) := \{p \in A(SL_q(2)) \mid \Delta_L\, p = 1 \otimes p\}$, where $\Delta_L = (\pi \otimes \mathrm{id}) \circ \Delta$. On generators, we have explicitly:
$$\Delta_L\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} z \otimes \alpha & z \otimes \beta \\ z^{-1} \otimes \gamma & z^{-1} \otimes \delta \end{pmatrix}.$$
Using the PBW basis $\alpha^k\beta^l\gamma^m$, $\beta^p\gamma^r\delta^s$, $k, l, m, p, r, s \in \mathbb{N}_0$, $k > 0$ of $A(SL_q(2))$, one can show that $\tilde A(S_q^2)$ is a unital subalgebra of $A(SL_q(2))$ generated by $\alpha\gamma$, $\beta\delta$, $\beta\gamma$. We want to prove now that the image of $\tilde s$ lies in $A(SL_q(2)) \otimes \tilde A(S_q^2)$. To this end we note that the right covariance of $i$ implies the formula $i(h)_{(1)} \otimes \pi(i(h)_{(3)}) \otimes i(h)_{(2)} = i(h_{(1)})_{(1)} \otimes h_{(2)} \otimes i(h_{(1)})_{(2)}$. With the above formula at hand, one can verify that $((\mathrm{id} \otimes \Delta_L) \circ \tilde s)(p) = i(\pi(p_{(1)}))_{(1)} \otimes 1 \otimes S\big(i(\pi(p_{(1)}))_{(2)}\big)\, p_{(2)}$, as needed. Thus we can conclude that $\tilde s$ is a right $\tilde A(S_q^2)$-linear splitting of the multiplication map $A(SL_q(2)) \otimes \tilde A(S_q^2) \to A(SL_q(2))$. However, $\tilde A(S_q^2)$ and $A(S_q^2)$ are different subalgebras of $A(SL_q(2))$, and we want to find projector matrices for $P_n$ thought of as right $A(S_q^2)$-modules. To our aid comes the transpose automorphism of $A(SL_q(2))$ defined on generators by
$$T\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}.$$
One can check directly that $T$ is well defined. In particular, when we work over $\mathbb{C}$, $A(SL_q(2))$ has a natural ∗-algebra structure for $q$ real, namely
$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}^{*} = \begin{pmatrix} \delta & -q^{-1}\gamma \\ -q\beta & \alpha \end{pmatrix},$$
and we can simply define $T = * \circ S$. This automorphism gives an isomorphism between $A(S_q^2)$ and $\tilde A(S_q^2)$. We have $T(A(S_q^2)) = \tilde A(S_q^2)$ and $T(\tilde A(S_q^2)) = A(S_q^2)$. (Note that $T^2 = \mathrm{id}$.) It is straightforward to verify that $\check s := (T \otimes T) \circ \tilde s \circ T$ is a right $A(S_q^2)$-linear splitting of the right multiplication map $m : A(SL_q(2)) \otimes A(S_q^2) \to A(SL_q(2))$. We can now proceed as in the left-sided case to prove:
Proposition 3.5. Put
$$(f_n)_{lk} = \begin{cases} \binom{-n}{l}_{q^2} (-q)^{-l} \beta^l\delta^{-n-l}\, \alpha^{-n-k}\gamma^k & \text{for } n \le 0, \\[4pt] \binom{n}{l}_{q^2} (-q)^{l} \alpha^{n-l}\gamma^l\, \beta^k\delta^{n-k} & \text{for } n \ge 0. \end{cases}$$
Then, for any $n \in \mathbb{Z}$, $f_n \in M_{|n|+1}(A(S_q^2))$, $f_n^2 = f_n$, and $f_n A(S_q^2)^{|n|+1}$ is isomorphic to $P_n$ as a right $A(S_q^2)$-module.
Proof. We have:
$$\begin{aligned}
\check s(\alpha^{m-k}\gamma^k) &= (T \otimes T)\big(\tilde s(\alpha^{m-k}\beta^k)\big) \\
&= (T \otimes T)\big(i(z^m)_{(1)} \otimes S(i(z^m)_{(2)})\, \alpha^{m-k}\beta^k\big) \\
&= (T \otimes T)\Big(\sum_{l=0}^m \binom{m}{l}_{q^2} \alpha^{m-l}\beta^l \otimes S(\alpha^{m-l}\gamma^l)\, \alpha^{m-k}\beta^k\Big) \\
&= (T \otimes T)\Big(\sum_{l=0}^m \alpha^{m-l}\beta^l \otimes \binom{m}{l}_{q^2} (-q)^{-l} \gamma^l\delta^{m-l}\, \alpha^{m-k}\beta^k\Big) \\
&= \sum_{l=0}^m \alpha^{m-l}\gamma^l \otimes \binom{m}{l}_{q^2} (-q)^{-l} \beta^l\delta^{m-l}\, \alpha^{m-k}\gamma^k.
\end{aligned}$$
Similarly, $\check s(\beta^k\delta^{n-k}) = \sum_{l=0}^n \beta^l\delta^{n-l} \otimes \binom{n}{l}_{q^2} (-q)^l \alpha^{n-l}\gamma^l\, \beta^k\delta^{n-k}$. Hence $\check s(P_n) \subseteq P_n \otimes A(S_q^2)$, $n \in \mathbb{Z}$. By restriction of $\check s$, we have a splitting of the right multiplication map for each $P_n$. The claim of the proposition follows from the right-sided version of Lemma 2.8 and the above formulas for $\check s$. □
Finally, let us observe that, identifying $\mathrm{Hom}_{\rho_n}(k, A(SL_q(2)))$ with $P_n$, we can view the covariant derivative $\nabla_n^\omega : \mathrm{Hom}_{\rho_n}(k, A(SL_q(2))) \to \Omega^1 A(S_q^2) \otimes_{A(S_q^2)} \mathrm{Hom}_{\rho_n}(k, A(SL_q(2)))$ associated to the q-monopole by (2.2), as the Grassmannian connection associated to the splitting $s_n := s|_{P_n}$. More precisely, let $\psi : \mathrm{Hom}_{\rho_n}(k, A(SL_q(2))) \to P_n$, $\psi(\xi) = \xi(1)$ be the identification isomorphism mentioned above. The Grassmannian connection associated to the splitting $s_n : P_n \to A(S_q^2) \otimes P_n$ is by definition the connection
$\tilde\nabla_n^s : P_n \to \Omega^1 A(S_q^2) \otimes_{A(S_q^2)} P_n$ given by the formula $\tilde\nabla_n^s\, p = \sum_i db_i \otimes_{A(S_q^2)} p_i$, where $\sum_i b_i \otimes p_i := s(p)$. (See [CQ95, (54)] or [L-G97, (8.27)] for the right-sided version.) We want to show that $\nabla_n^\omega = (\mathrm{id} \otimes_{A(S_q^2)} \psi^{-1}) \circ \tilde\nabla_n^s \circ \psi$, $n \in \mathbb{Z}$, or equivalently that
$$\forall\, \xi \in \mathrm{Hom}_{\rho_n}(k, A(SL_q(2))),\ n \in \mathbb{Z} :\quad (\check\ell(\nabla_n^\omega \xi))(1) = \big((\check\ell \circ (\mathrm{id} \otimes_{A(S_q^2)} \psi^{-1}) \circ \tilde\nabla_n^s \circ \psi)(\xi)\big)(1).$$
(See Proposition 2.1 and (2.2).) Notice that we can use here either Proposition 2.1 or Proposition 2.3 to guarantee that $\nabla_n^\omega$, $n \in \mathbb{Z}$, makes sense. Indeed, since $k[z, z^{-1}]$ admits the Haar functional ($h_H : k[z, z^{-1}] \to k$, $h_H(z^n) = \delta_{0n}$), we can construct a unital right colinear mapping $j : k[z, z^{-1}] \to A(SL_q(2))$, $j := \eta \circ h_H$, where $\eta : k \to A(SL_q(2))$ is the unit map, so that $A(SL_q(2))$ is injective as a right $k[z, z^{-1}]$-comodule. Thus, as the antipode of $k[z, z^{-1}]$ is bijective, $A(SL_q(2))$ is left and right faithfully flat over $A(S_q^2)$ by [S-HJ90, Theorem I], and Proposition 2.1 applies. (In fact, we used the existence of a unital right colinear mapping to prove Proposition 2.1.) Also, $\Omega^1 A(S_q^2)$ is isomorphic with $A(S_q^2)/k \otimes A(S_q^2)$ as a right $A(S_q^2)$-module via $db \mapsto b/k \otimes 1$, so that it is free, whence flat. Therefore Proposition 2.3 applies as well. Now, we put $s(\xi(1)) = b_i \otimes \xi(1)_i$, $\xi_i(1) = \xi(1)_i$, and taking advantage of $m \circ s_n = \mathrm{id}$, (2.6), (1.1) and (2.2) compute:
$$\begin{aligned}
\big((\check\ell \circ (\mathrm{id} \otimes_{A(S_q^2)} \psi^{-1}) \circ \tilde\nabla_n^s \circ \psi)(\xi)\big)(1)
&= \Big((\check\ell \circ (\mathrm{id} \otimes_{A(S_q^2)} \psi^{-1}))\Big(\sum_i db_i \otimes_{A(S_q^2)} \xi(1)_i\Big)\Big)(1) \\
&= \sum_i (db_i)\, \xi(1)_i \\
&= 1 \otimes (m \circ s_n)(\xi(1)) - s_n(\xi(1)) \\
&= 1 \otimes \xi(1) - \xi(1) \otimes 1 - \xi(1)_{(0)}\, \omega(\xi(1)_{(1)}) \\
&= d\xi(1) - \Pi^\omega(d\xi(1)) \\
&= (\check\ell(\nabla_n^\omega \xi))(1).
\end{aligned}$$
This is exactly as one should expect, since we have constructed the splitting $s : A(SL_q(2)) \to A(S_q^2) \otimes A(SL_q(2))$ from the connection form $\omega$ by formula (2.6).
4. Chern–Connes Pairing for the n = −1 Bimodule
The aim of this section is to compute the left and right Chern numbers of the left and right finitely generated projective bimodule $P_{-1}$ describing the quantum Hopf line bundle of winding number $-1$. This computation is a simple example of the Chern–Connes pairing between K-theory and cyclic cohomology [C-A94,L-JL97]. To obtain the desired Chern numbers we need to evaluate (to pair) the appropriate even cyclic cocycle with the left and right projector matrix respectively. Since the positive even cyclic cohomology $HC^{2n}(A(S_q^2))$, $n > 0$, is the image of the periodicity operator applied to $HC^0(A(S_q^2))$, and the pairing is compatible with the action of the periodicity
operator, the even cyclic cocycle computing Chern numbers is necessarily of degree zero, i.e., a trace. This trace is explicitly provided in [MNW91, (4.4)]. Adapting [MNW91, (4.4)] to our special case of the standard Podleś quantum sphere, we obtain:
\[
\tau^1((\alpha\beta)^m \zeta^n) =
\begin{cases}
(1 - q^{2n})^{-1} & \text{for } n > 0,\ m = 0, \\
0 & \text{otherwise,}
\end{cases}
\qquad
\tau^1((\gamma\delta)^m \zeta^n) =
\begin{cases}
(1 - q^{2n})^{-1} & \text{for } n > 0,\ m = 0, \\
0 & \text{otherwise,}
\end{cases}
\tag{4.1}
\]
where $\zeta := -q^{-1}\beta\gamma$. The fact that the “Chern cyclic cocycle” is in degree zero is a quantum effect caused by the non-classical structure of $HC^*(A(S_q^2))$ (see [MNW91]). In the classical case the corresponding cocycle is in degree two, as it comes from the volume form of the two-sphere. Since $\tau^1$ is a 0-cyclic cocycle, the pairing is given by the formula $\langle [\tau^1], [p] \rangle = (\tau^1 \circ \mathrm{Tr})(p)$, where $p \in M_n(A(S_q^2))$, $p^2 = p$, and $\mathrm{Tr} : M_n(A(S_q^2)) \to A(S_q^2)$ is the usual matrix trace. The following proposition establishes the pairing between the cyclic cohomology class $[\tau^1]$ and the $K_0$-classes $[e_{-1}]$ and $[f_{-1}]$ of the left and right projector matrix of bimodule $P_{-1}$ respectively:
Proposition 4.1. Let $\tau^1 : A(S_q^2) \to k$ be the trace (4.1), and $e_{-1}$, $f_{-1}$ the projectors given in Propositions 3.2 and 3.5. Then $(\tau^1 \circ \mathrm{Tr})(e_{-1}) = -1$ and $(\tau^1 \circ \mathrm{Tr})(f_{-1}) = 1$.
Proof. Taking advantage of (3.1) and (4.1), we get:
\[
(\tau^1 \circ \mathrm{Tr})
\begin{pmatrix} \alpha\delta & -\beta\alpha \\ \gamma\delta & -q\beta\gamma \end{pmatrix}
= \tau^1(1 + (q^{-1} - q)\beta\gamma) = \tau^1((q^2 - 1)\zeta) = -1.
\]
Similarly,
\[
(\tau^1 \circ \mathrm{Tr})
\begin{pmatrix} \delta\alpha & \delta\gamma \\ -\alpha\beta & -q^{-1}\beta\gamma \end{pmatrix}
= 1,
\]
as claimed. □
This computation is in agreement with the classical situation. Only there the sign change of the Chern number when switching (by transpose) from the left to right projector matrix is due to the anticommutativity of the standard differential forms on manifolds. Here the sign change relies on the noncommutativity of the algebra. Since every free module can be represented in $K_0$ by the identity matrix, we obtain that the pairing of the cyclic cohomology class $[\tau^1]$ with the $K_0$-class of any free $A(S_q^2)$-module always vanishes: $\langle [\tau^1], [I] \rangle = \tau^1(n) = 0$, $n \in \mathbb{N}$. Now, combining Proposition 4.1 with Lemma 1.7 yields:
Corollary 4.2. The Hopf–Galois extension of the quantum principal Hopf fibration is not cleft.
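As a check on the proof of Proposition 4.1, the two traces can be expanded step by step. This is only a sketch: it assumes the relations $\alpha\delta = 1 + q^{-1}\beta\gamma$ and $\delta\alpha = 1 + q\beta\gamma$, which is how the first equality in the proof appears to use (3.1) (equation (3.1) itself is not reproduced in this part of the text), together with $\zeta = -q^{-1}\beta\gamma$, linearity of $\tau^1$, and the values $\tau^1(1) = 0$, $\tau^1(\zeta) = (1 - q^2)^{-1}$ read off from (4.1).

```latex
% Assumed relations (our reading of (3.1)):
%   \alpha\delta = 1 + q^{-1}\beta\gamma, \qquad \delta\alpha = 1 + q\beta\gamma,
% and \beta\gamma = -q\zeta from the definition of \zeta.
\begin{align*}
\mathrm{Tr}(e_{-1}) &= \alpha\delta - q\beta\gamma
  = 1 + (q^{-1} - q)\beta\gamma = 1 + (q^{2} - 1)\zeta, \\
(\tau^1 \circ \mathrm{Tr})(e_{-1}) &= \tau^1(1) + (q^{2} - 1)\,\tau^1(\zeta)
  = 0 + \frac{q^{2} - 1}{1 - q^{2}} = -1, \\
\mathrm{Tr}(f_{-1}) &= \delta\alpha - q^{-1}\beta\gamma
  = 1 + (q - q^{-1})\beta\gamma = 1 + (1 - q^{2})\zeta, \\
(\tau^1 \circ \mathrm{Tr})(f_{-1}) &= \tau^1(1) + (1 - q^{2})\,\tau^1(\zeta)
  = 0 + \frac{1 - q^{2}}{1 - q^{2}} = 1.
\end{align*}
```

Only the diagonal entries of the two projector matrices enter, so the opposite signs come entirely from the different orderings $\alpha\delta$ versus $\delta\alpha$.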
Appendix
In this appendix we provide a direct proof of non-cleftness of the quantum principal Hopf fibration which is possible in the purely algebraic setting. This complements our K-theoretic proof. Thus, suppose that there exists a cleaving map $\Phi : k[z, z^{-1}] \to A(SL_q(2))$. The existence of the convolution inverse $\Phi^{-1}$ entails $\Phi(z)\Phi^{-1}(z) = \varepsilon(z)$, whence $\Phi(z)$ must be invertible in $A(SL_q(2))$. The polynomial $\Phi(z)$ cannot be constant because then $\Phi(z)$ and $\Phi(1) = 1$ would be linearly dependent, which contradicts the injectivity of $\Phi$ (see Sect. 1). Therefore to prove the non-cleftness it suffices to show that all invertible elements of $A(SL_q(2))$ are non-zero numbers. One can do it using the direct sum decomposition $A(SL_q(2)) = \bigoplus_{m,n \in \mathbb{Z}} A[m, n]$, where $A[m, n] = \{p \in A(SL_q(2)) \mid \pi(p_{(1)}) \otimes p_{(2)} = z^m \otimes p,\ p_{(1)} \otimes \pi(p_{(2)}) = p \otimes z^n\}$ (see [MMNNU91, (1.10)]). To be consistent with [MMNNU91], let us put now $k = \mathbb{C}$. (See, however, the bottom of p. 360 in [MMNNU91].) We know from [MMNNU91, p. 363] that we can write any element of $A(SL_q(2))$ as a sum $\sum_{m,n} p_{m,n}(\zeta) e_{m,n}$ or $\sum_{k,l} e_{k,l} r_{k,l}(\zeta)$, where $\zeta := -q^{-1}\beta\gamma$, $p_{m,n}, r_{k,l} \in \mathbb{C}[\zeta]$, $e_{m,n} \in A[m, n]$. Assume now that $\sum_{m,n} p_{m,n}(\zeta) e_{m,n} \sum_{k,l} e_{k,l} r_{k,l}(\zeta) = 1$. Since both sums are finite, there exist indices $m_+ := \max\{m \in \mathbb{Z} \mid p_{m,n} \neq 0\}$, $n_+ := \max\{n \in \mathbb{Z} \mid p_{m_+,n} \neq 0\}$, $m_- := \min\{m \in \mathbb{Z} \mid p_{m,n} \neq 0\}$, $n_- := \min\{n \in \mathbb{Z} \mid p_{m_-,n} \neq 0\}$, and similarly $k_+$, $k_-$, $l_+$, $l_-$. We have
\[
A[0, 0] \ni e_{0,0} = 1 = \sum_{m,n} p_{m,n}(\zeta) e_{m,n} \sum_{k,l} e_{k,l} r_{k,l}(\zeta)
= \sum_{m,n,k,l} p_{m,n}(\zeta)\, s_{m,n,k,l}(\zeta)\, \tilde r_{k,l}(\zeta)\, e_{m+k,n+l}.
\tag{4.2}
\]
Here $s_{m,n,k,l}(\zeta) e_{m+k,n+l} := e_{m,n} e_{k,l}$ (see [MMNNU91, p. 363]), and $\tilde r_{k,l}(\zeta)$ is obtained from $r_{k,l}(\zeta)$ by commuting it over $e_{m+k,n+l}$, i.e., $e_{m+k,n+l}\, r_{k,l}(\zeta) = \tilde r_{k,l}(\zeta)\, e_{m+k,n+l}$. It follows from the commutation relations (3.1) that the coefficients of $\tilde r_{k,l}$ are $q$ to some powers times the corresponding coefficients of $r_{k,l}$. In particular, $r_{k,l} = 0 \Leftrightarrow \tilde r_{k,l} = 0$. Since $p_{m_+,n_+}(\zeta) e_{m_+,n_+}$, $e_{k_+,l_+} r_{k_+,l_+}(\zeta)$ and $p_{m_-,n_-}(\zeta) e_{m_-,n_-}$, $e_{k_-,l_-} r_{k_-,l_-}(\zeta)$ are the only terms that can contribute to the direct summand $A[m_+ + k_+, n_+ + l_+]$ and $A[m_- + k_-, n_- + l_-]$ respectively, we can conclude from Eq. (4.2) that either $m_+ + k_+$, $n_+ + l_+$, $m_- + k_-$, $n_- + l_-$ are all zero, or else $p_{m_\pm,n_\pm}(\zeta)\, s_{m_\pm,n_\pm,k_\pm,l_\pm}(\zeta)\, \tilde r_{k_\pm,l_\pm}(\zeta)\, e_{m_\pm + k_\pm, n_\pm + l_\pm} = 0$. From [MMNNU91, p. 363] we know, however, that $e_{m_\pm + k_\pm, n_\pm + l_\pm}$ is a (left and right) basis of $A[m_\pm + k_\pm, n_\pm + l_\pm]$ over $\mathbb{C}[\zeta]$. Also, using formulas $\alpha^j \delta^j = \prod_{i=1}^{j} (1 - q^{-2(i-1)}\zeta)$, $\delta^j \alpha^j = \prod_{i=1}^{j} (1 - q^{2i}\zeta)$ one can check that $e_{m,n} e_{k,l} \neq 0$, whence $s_{m_\pm,n_\pm,k_\pm,l_\pm} \neq 0$. Thus, as there are no zero divisors in $\mathbb{C}[\zeta]$ and $r_{k,l} = 0 \Leftrightarrow \tilde r_{k,l} = 0$, we can conclude that $p_{m_\pm,n_\pm} = 0$ or $r_{k_\pm,l_\pm} = 0$. This, however, contradicts the definition of $m_\pm$, $n_\pm$, $k_\pm$, $l_\pm$. Therefore $m_\pm = -k_\pm$ and $n_\pm = -l_\pm$. Consequently, as $m_- \leq m_+$ and $k_- \leq k_+$, we have $m_- = m_+ = -k_+ = -k_-$. Hence also $n_- = n_+ = -l_+ = -l_-$. Put $m_0 = m_- = m_+$ and $n_0 = n_- = n_+$. It follows now that $\sum_{m,n} p_{m,n}(\zeta) e_{m,n} = p_{m_0,n_0}(\zeta) e_{m_0,n_0}$ and $\sum_{k,l} e_{k,l} r_{k,l}(\zeta) = e_{-m_0,-n_0} r_{-m_0,-n_0}(\zeta)$. This way (4.2) reduces to $p_{m_0,n_0}(\zeta)\, s_{m_0,n_0,-m_0,-n_0}(\zeta)\, \tilde r_{-m_0,-n_0}(\zeta) = 1$. Hence all three of the above polynomials must be non-zero constants. Using again [MMNNU91, p. 363] and remembering that
$\alpha^j \delta^j$ and $\delta^j \alpha^j$ are polynomials in $\zeta$ of degree $j$, we can infer that $m_0 = 0 = n_0$. (Otherwise $s_{m_0,n_0,-m_0,-n_0}$ is not of degree 0.) Consequently $\sum_{m,n} p_{m,n}(\zeta) e_{m,n} = p_{0,0}(\zeta)$, $\sum_{k,l} e_{k,l} r_{k,l}(\zeta) = \tilde r_{0,0}(\zeta) = r_{0,0}(\zeta)$, and $p_{0,0}$, $r_{0,0}$ are invertible constant polynomials, as needed.
Acknowledgements. P. M. H. was partially supported by the NATO and CNR postdoctoral fellowships and KBN grant 2 P03A 030 14. It is a pleasure to thank Max Karoubi and Giovanni Landi for very helpful discussions.
References
[B-N72] Bourbaki, N.: Commutative Algebra. Reading, MA: Addison-Wesley, 1972
[BM93] Brzeziński, T., Majid, S.: Quantum Group Gauge Theory on Quantum Spaces. Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995)
[BM98] Brzeziński, T., Majid, S.: Quantum Differentials and the q-Monopole Revisited. Acta Applic. Math. 54, 185–232 (1998)
[C-A94] Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
[CQ95] Cuntz, J., Quillen, D.: Algebra Extensions and Nonsingularity. J. Amer. Math. Soc. 8 (2), 251–289 (1995)
[DGH] Dąbrowski, L., Grosse, H., Hajac, P.M.: Joint project. Trieste, Italy, SISSA 84/99/FM
[D-Y85] Doi, Y.: Algebras with total integrals. Commun. Alg. 13, 2137–2159 (1985)
[D-M96] Durdevic, M.: Quantum Principal Bundles and Tannaka–Krein Duality Theory. Rep. Math. Phys. 38 (3), 313–324 (1996)
[H-PM96] Hajac, P.M.: Strong Connections on Quantum Principal Bundles. Commun. Math. Phys. 182 (3), 579–617 (1996)
[L-G97] Landi, G.: An Introduction to Noncommutative Spaces and their Geometries. Berlin–Heidelberg–New York: Springer-Verlag, 1997
[L-JL97] Loday, J.-L.: Cyclic Homology. Berlin–Heidelberg–New York: Springer, 1997
[M-S95] Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
[M-S97] Majid, S.: Some Remarks on Quantum and Braided Group Gauge Theory. Banach Center Publications 40, 336–349 (1997)
[MMNNU91] Masuda, T., Mimachi, K., Nakagami, Y., Noumi, M., Ueno, K.: Representations of the Quantum Group SUq(2) and the Little q-Jacobi Polynomials. J. Funct. Anal. 99, 357–387 (1991)
[MNW91] Masuda, T., Nakagami, Y., Watanabe, J.: Noncommutative Differential Geometry on the Quantum Two Sphere of Podleś. I: An Algebraic Viewpoint. K-Theory 5, 151–175 (1991)
[P-P87] Podleś, P.: Quantum Spheres. Lett. Math. Phys. 14, 521–531 (1987)
[S-HJ90] Schneider, H.-J.: Principal Homogenous Spaces for Arbitrary Hopf Algebras. Isr. J. Math. 72 (1–2), 167–195 (1990)
[S-HJ94] Schneider, H.-J.: Hopf Galois Extensions, Crossed Products, and Clifford Theory. In: Bergen, J., Montgomery, S. (eds.) Advances in Hopf Algebras, Lecture Notes in Pure and Applied Mathematics 158. New York: Marcel Dekker, Inc., 1994, pp. 267–297
Communicated by A. Connes
Commun. Math. Phys. 206, 265 – 272 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Categorial Mirror Symmetry for K3 Surfaces
C. Bartocci1, U. Bruzzo1,2, G. Sanguinetti3
1 Dipartimento di Matematica, Università degli Studi di Genova, Via Dodecaneso 35, 16146 Genova, Italy. E-mail: [email protected]
2 Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Beirut 2–4, 34014 Trieste, Italy. E-mail: [email protected]
3 Mathematical Institute, University of Oxford, 24–29 St. Giles’, Oxford OX1 3LB, UK. E-mail: [email protected]
Received: 20 October 1998 / Accepted: 15 March 1999
Abstract: We study the structure of a modified Fukaya category F(X) associated with a K3 surface X, and prove that whenever X is an elliptic K3 surface with a section, the derived category of F(X) is equivalent to a subcategory of the derived category $D(\hat{X})$ of coherent sheaves on the mirror K3 surface $\hat{X}$.
1. Introduction
In 1994 M. Kontsevich conjectured that a proper mathematical formulation of the mirror conjecture is provided by an equivalence between Fukaya’s category of a Calabi–Yau manifold X and the derived category of coherent sheaves of the mirror Calabi–Yau manifold $\hat{X}$ [10]. Thus in some sense mirror symmetry relates the symplectic structure of a Calabi–Yau manifold with the holomorphic structure of its mirror. It is expected that special Lagrangian tori on X are mapped by mirror symmetry to skyscraper sheaves on the mirror $\hat{X}$. This conjecture found some physical evidence with the discovery of D-branes and the description of their role in mirror symmetry [14,17]. Moreover, in a recent paper [15] Kontsevich’s conjecture has been proved in the case of the simplest Calabi–Yau manifolds, the elliptic curves. Our approach to mirror symmetry follows the geometric interpretation due to Strominger, Yau and Zaslow [17]. According to their construction, given a Calabi–Yau manifold admitting a foliation in special Lagrangian tori, its mirror manifold should be obtained by relative T-duality. In the case of K3 surfaces this formulation has been given a rigorous treatment in [2,4], proving that Strominger, Yau and Zaslow’s approach is consistent with previous descriptions of mirror symmetry [5] (this is also related to work by Aspinwall and Donagi [1]). We show here how the constructions described in [2,4] can be given a categorial interpretation which provides a proof of Kontsevich’s conjecture in the case of K3 surfaces. More precisely, we show that, under some assumptions which will be spelled out in the
following sections, the derived category of a Fukaya-type category built out of special Lagrangian submanifolds of an elliptic K3 surface X is equivalent to a subcategory of the derived category of coherent sheaves on the mirror surface $\hat{X}$. This subcategory is formed by the complexes of sheaves whose zeroth Chern character vanishes.
2. Special Lagrangian Submanifolds and Fukaya’s Category
Definition 2.1. Let X be a Calabi–Yau n-fold, with Kähler form $\omega$ and holomorphic n-form $\Omega$. A (real) n-dimensional submanifold $\iota : M \hookrightarrow X$ of X is said to be special Lagrangian if the following two conditions are met:
– M is Lagrangian in the symplectic structure given by $\omega$, i.e. $\iota^*\omega = 0$;
– there exists a multiple $\Omega'$ of $\Omega$ such that $\iota^*(\mathrm{Im}\,\Omega') = 0$.
It can be shown that these conditions are equivalent to requiring that the real part of $\Omega'$ restricts on M to the volume form induced by the Riemannian metric of X. This exhibits special Lagrangian submanifolds as a special type of calibrated submanifolds [8]. There are not many explicit examples of special Lagrangian submanifolds. The simplest ones are the 1-dimensional submanifolds of an elliptic curve: the first condition is trivial, and the multiple $\Omega'$ of the global holomorphic one-form is readily obtained by a holomorphic change of coordinates in the universal covering of the elliptic curve. Additional examples are provided by Calabi–Yau manifolds equipped with an antiholomorphic involution. Since the involution changes the sign of both the Kähler form and the imaginary part of the holomorphic n-form, the fixed point sets of the involution are special Lagrangian submanifolds. A third example, and the most relevant in our case, arises when considering Calabi–Yau manifolds endowed with a hyper-Kähler structure. This is always the case in dimension 2, i.e. for K3 surfaces. In this case special Lagrangian submanifolds are just holomorphic submanifolds with respect to a different complex structure compatible with the same hyper-Kähler metric.
This example will be discussed at length in the next section. Special Lagrangian submanifolds have received remarkable attention in physics since the appearance of D-branes in string theory, and especially since their role turned out to be of a primary importance for the mirror conjecture [3,17]. D-branes are special Lagrangian submanifolds of the Calabi–Yau manifold which serves as compactification space, and are equipped with a flat U (1) line bundle. In the physicists’ language, special Lagrangian submanifolds of the compactification space are associated with physical states which retain part of the supersymmetry of the vacuum. For this (and other related) reasons, special Lagrangian submanifolds are often called supersymmetric cycles, or also BPS states. Fukaya’s category, whose objects are Lagrangian submanifolds of a symplectic manifold, was introduced in connection with Floer’s homology [6]. Here, basically following the exposition of [15], we offer a description of a modified Fukaya category, built out of the special Lagrangian submanifolds of a Calabi–Yau manifold X. We shall call this the special Lagrangian Fukaya category (SLF category for short) of X, and will denote it by F(X). The objects in F(X) are pairs (L, E), where L is a special Lagrangian submanifold of X, and E is a flat vector bundle on L. The morphisms in this category are a little bit more complicated to define. Since special Lagrangian submanifolds are n-cycles in a compact complex n-dimensional manifold, two special Lagrangian cycles generically intersect at a finite number of points. The basic concept is that a morphism between two objects in the SLF category is a way for passing from the vector bundle defined on one cycle to the bundle on the other.
Definition 2.2. Let $U_1 = (L_1, E_1)$, $U_2 = (L_2, E_2)$ be two objects in the SLF category. Then the space of morphisms $\mathrm{Hom}(U_1, U_2)$ is defined to be
\[
\mathrm{Hom}(U_1, U_2) = \bigoplus_{x \in L_1 \cap L_2} \mathrm{Hom}(E_1|_x, E_2|_x).
\]
Thus the space of morphisms between two objects in the SLF category turns out to be a direct sum of vector spaces, each one being the space of homomorphisms between the fibers of the two vector bundles at the intersection points of the two special Lagrangian cycles.
Maslov index. The space of morphisms between two objects is naturally graded over $\mathbb{Z}$ by the Maslov index of the tangent spaces to the special Lagrangian submanifolds at the intersection points [15]. Let us recall some basic facts about the Maslov index. Let V be a 2n-dimensional real symplectic vector space, and denote by G(V) the Grassmannian of Lagrangian n-planes in V. One has an isomorphism $G(V) \simeq U(n)/O(n)$, so that $\pi_1(G(V)) \simeq \mathbb{Z}$. The Maslov index is the unique integer-valued function on the space of loops in G(V) satisfying some naturality conditions [13] which include its homotopic invariance, and thus provides an explicit isomorphism $\pi_1(G(V)) \to \mathbb{Z}$. In order to define a Maslov index for the intersection of Lagrangian cycles one has to slightly modify its definition so as to consider open paths. One first notices that the Lagrangian Grassmannian is naturally stratified by the dimension of the intersection of the Lagrangian n-planes with a fixed Lagrangian n-plane. Then one can define a Maslov index for the intersection of two Lagrangian planes as a $\mathbb{Z}$-valued function on the space of paths in G(V) which is homotopy invariant under deformations of the paths that do not move the extrema out of their strata. (Actually one should consider a Grassmannian of special Lagrangian (rather than just Lagrangian) planes, and restrict the Maslov index to it. This will be done in the next section in the case of K3 surfaces.)
$A_\infty$ structure.
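Strictly speaking Fukaya’s category is not a category at all, since in general the composition of morphisms fails to be associative. Associativity is replaced by a more complicated property, which makes Fukaya’s “category” into an $A_\infty$ category.
Definition 2.3. An $A_\infty$ category $\mathcal{F}$ consists of a class of objects $\mathrm{Ob}(\mathcal{F})$; for any two objects $\mathcal{X}$, $\mathcal{Y}$, a $\mathbb{Z}$-graded abelian group of morphisms $\mathrm{Hom}(\mathcal{X}, \mathcal{Y})$; composition maps
\[
m_k : \mathrm{Hom}(\mathcal{X}_1, \mathcal{X}_2) \otimes \cdots \otimes \mathrm{Hom}(\mathcal{X}_k, \mathcal{X}_{k+1}) \to \mathrm{Hom}(\mathcal{X}_1, \mathcal{X}_{k+1}), \quad k \geq 1,
\]
of degree $2 - k$, satisfying the condition
\[
\sum_{\substack{r = 1 \ldots n \\ s = 1 \ldots n-r+1}} (-1)^{\epsilon}\, m_{n-r+1}\big(a_1 \otimes \cdots \otimes a_{s-1} \otimes m_r(a_s \otimes \cdots \otimes a_{s+r-1}) \otimes a_{s+r} \otimes \cdots \otimes a_n\big) = 0
\tag{1}
\]
for all $n \geq 1$, where $\epsilon = (r + 1)s + r\big(n + \sum_{j=1}^{s-1} \deg(a_j)\big)$.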
Condition (1) implies that $m_1$ is a coboundary operator. The vanishing of the morphism $m_1$, together with condition (1) for $n = 3$ (the relation involving $m_3$), implies that the composition law given by $m_2$ is associative. Let us see how this $A_\infty$ structure arises in Fukaya’s category. Let us assume that the first object $\mathcal{X}_1$ and the last object $\mathcal{X}_{k+1}$ have a nonvoid intersection, otherwise $\mathrm{Hom}(\mathcal{X}_1, \mathcal{X}_{k+1}) = 0$ and the composition map is trivial. The composition maps are explicitly described as follows: Let $u_j = (a_j, t_j) \in \mathrm{Hom}(U_j, U_{j+1})$, where $a_j \in L_j \cap L_{j+1}$ and $t_j \in \mathrm{Hom}(E_j|_{a_j}, E_{j+1}|_{a_j})$. One defines
\[
m_k(u_1 \otimes \cdots \otimes u_k) = \sum_{a_{k+1} \in L_1 \cap L_{k+1}} \big(C(u_1, \ldots, u_k, a_{k+1}),\, a_{k+1}\big).
\]
Here one has
\[
C(u_1, \ldots, u_k, a_{k+1}) = \sum_{\phi} \pm \exp\Big[2\pi i \int \phi^* \omega^c\Big]\, P \exp\Big[\oint \phi^* \beta\Big].
\tag{2}
\]
This requires some explanation. The sum is performed over holomorphic and antiholomorphic maps $\phi$ from the disc $D^2$ into the manifold X, up to projective equivalence, with the following boundary condition: there are $k + 1$ points $p_j = e^{2\pi i \alpha_j} \in S^1 = \partial D^2$ such that $\phi(p_j) = a_j$ and $\phi(e^{2\pi i \alpha}) \in L_j$ for $\alpha \in (\alpha_{j-1}, \alpha_j)$. The two-form $\omega^c$ appearing in (2) is the complexified Kähler form, while $\beta$ is the connection of the bundle restricted to the image of the boundary of the disc. P represents a path-ordered integration, defined by
\[
P \exp\Big(\oint \phi^* \beta\Big) = \exp\Big(\int_{\alpha_k}^{\alpha_{k+1}} \beta_k\, d\alpha\Big)\, t_k\, \exp\Big(\int_{\alpha_{k-1}}^{\alpha_k} \beta_{k-1}\, d\alpha\Big)\, t_{k-1} \cdots t_1\, \exp\Big(\int_{\alpha_{k+1}}^{\alpha_1} \beta_1\, d\alpha\Big).
\]
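To make the remarks on condition (1) concrete, its three lowest cases can be written out by evaluating the sign exponent $(r+1)s + r(n + \sum_{j<s} \deg(a_j))$ for each pair $(r, s)$. The following is a sketch rather than a statement taken from the source; the shorthand $|a| = \deg(a)$ and $m_k(a_1, \ldots, a_k)$ for $m_k(a_1 \otimes \cdots \otimes a_k)$ is ours, and terms have been moved across the equality sign for readability.

```latex
% Shorthand (ours): |a| = \deg(a), \quad m_k(a_1,\dots,a_k) = m_k(a_1 \otimes \cdots \otimes a_k).
\begin{align*}
n = 1:\quad & m_1(m_1(a_1)) = 0, \\
n = 2:\quad & m_1(m_2(a_1, a_2)) = m_2(m_1(a_1), a_2) + (-1)^{|a_1|}\, m_2(a_1, m_1(a_2)), \\
n = 3:\quad & m_2(a_1, m_2(a_2, a_3)) - m_2(m_2(a_1, a_2), a_3) \\
 & \quad = m_1(m_3(a_1, a_2, a_3)) + m_3(m_1(a_1), a_2, a_3) \\
 & \qquad + (-1)^{|a_1|}\, m_3(a_1, m_1(a_2), a_3)
   + (-1)^{|a_1| + |a_2|}\, m_3(a_1, a_2, m_1(a_3)).
\end{align*}
```

The case $n = 1$ says that $m_1$ squares to zero, $n = 2$ is a graded Leibniz rule for $m_2$, and in the case $n = 3$ every term on the right-hand side contains $m_1$, so setting $m_1 = 0$ leaves exactly the associativity of $m_2$.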
3. The Special Lagrangian Fukaya Category for K3 Surfaces
The main purpose of this section is to give a description of the SLF category when the Calabi–Yau manifold is a K3 surface X. In this case, due to the fact that K3 surfaces admit hyper-Kähler metrics, special Lagrangian submanifolds are very easily exhibited. Let us denote by $\omega$ the Kähler form associated with a given hyper-Kähler metric and complex structure. One also has a holomorphic 2-form $\Omega = x + iy$. The three elements $\omega$, $x$, $y$ can be regarded as vectors in the cohomology space $H^2(X, \mathbb{R})$; if the latter is equipped with the scalar product of signature (3,19) induced by the intersection form on $H^2(X, \mathbb{Z})$, these three elements are spacelike, and generate a 2-sphere which can be identified with the set of complex structures compatible with the fixed hyper-Kähler metric. It is very easy to check that what is special Lagrangian in the original complex structure is holomorphic in the complex structure in which the roles of $\omega$ and $x$ are exchanged (up to a sign) [8] (this corresponds to a rotation of 90 degrees around the $y$ axis). We shall call such a change of complex structure a hyper-Kähler rotation. We want in particular to consider elliptic K3 surfaces X which admit a section.1
1 This means that there exists an epimorphism $p : X \to \mathbb{P}^1$ whose generic fiber is a smooth elliptic curve and admitting a section $e : \mathbb{P}^1 \to X$.
K3 surfaces arising as compactification spaces of string theories which admit mirror
partners are always of this type [17]. So let us consider a K3 surface X that in a complex structure I is elliptic and has a section. Let us denote by $X_I$ this K3 surface. The Picard group of $X_I$ is generated by the section, by the divisor of the generic fiber, and by the irreducible components of the singular fibers that do not intersect the section.2 If we perform the hyper-Kähler rotation described above, and call J the new complex structure, the submanifolds which were holomorphic in the complex structure I are now special Lagrangian. Assuming that $X_J$ is elliptic as well, it has been shown [4] that this hyper-Kähler rotation reproduces, at the level of the Picard lattice of an elliptic K3 surface, the effects of mirror symmetry previously described in an algebraic way [5]. So the varieties $X_I$ and $X_J$ can be regarded as a mirror pair of K3 surfaces. In this way one has a very precise picture of the configuration of special Lagrangian submanifolds of $X_J$. Moreover, the flat vector bundles one considers on special Lagrangian submanifolds of $X_J$ are (flat) holomorphic bundles in the complex structure I. On a K3 surface the $A_\infty$ structure of the SLF category turns out to be trivial, that is, the SLF category is a true category. In fact due to the hyper-Kähler structure of a K3 surface X, the Grassmannian of special Lagrangian subspaces of the tangent space to X at a point reduces to a copy of $\mathbb{P}^1$, hence is simply connected. Moreover, special Lagrangian 2-cycles always intersect transversally, so there is no stratification, and the Maslov index is trivial (cf. [11]). The Hom groups in the SLF category have trivial grading, so $m_k = 0$ for $k \neq 2$, while condition (1) for $m_3$ yields the associativity of the composition of morphisms.
The triviality of this Fukaya category for K3 surfaces may be related, via Sadov’s claim [16] that the Floer homology of an almost Kähler manifold X with coefficients in the Novikov ring of X is equivalent to the quantum cohomology of X, to the triviality of the quantum cohomology of K3 surfaces.
4. The Special Lagrangian Fukaya Category and the Derived Category of Coherent Sheaves
We want now to describe a construction which exhibits the relationship between the SLF category of a K3 surface and the derived category of coherent sheaves on the mirror K3 surface. We start by briefly recalling the definition of derived category of an abelian category $\mathcal{A}$ (cf. [18]). One starts from the category $K(\mathcal{A})$ whose objects are complexes of objects in $\mathcal{A}$, while the morphisms are morphisms of complexes identified up to homotopies. Let $Ac(\mathcal{A})$ be the full subcategory of $K(\mathcal{A})$ formed by acyclic complexes (i.e. complexes such that all cohomology objects vanish). The derived category $D(\mathcal{A})$ is by definition the quotient $K(\mathcal{A})/Ac(\mathcal{A})$. A morphism between two objects $[\mathcal{X}]$, $[\mathcal{Y}]$ in $D(\mathcal{A})$ is represented by a diagram of morphisms in $K(\mathcal{A})$,
\[
\mathcal{X} \xleftarrow{\ q\ } \mathcal{Z} \xrightarrow{\ m\ } \mathcal{Y},
\]
where q is a quasi-isomorphism, i.e., a morphism which induces an isomorphism between the cohomology objects of $\mathcal{Z}$ and $\mathcal{X}$. Two objects $\mathcal{X}$, $\mathcal{Y}$ in $K(\mathcal{A})$ turn out to be equivalent in $D(\mathcal{A})$ whenever they are quasi-isomorphic, that is, whenever there is a diagram as above where m is also a quasi-isomorphism. If there exists a quasi-isomorphism between two complexes, these represent isomorphic objects in $D(\mathcal{A})$.
2 Actually one may have further generators of the Picard group provided by additional sections of the projection $p : X \to \mathbb{P}^1$.
Now we consider a K3 surface X with a fixed hyper-Kähler metric, and a compatible complex structure J. If we start from an object (L, E) in the SLF category $F(X_J)$, where L is a special Lagrangian submanifold of real dimension 2, and E a flat rank n vector bundle on L, in the complex structure I obtained by performing a hyper-Kähler rotation L is a divisor, and E may be regarded as a coherent sheaf on $X_I$ concentrated on L, whose restriction to L is a rank n locally free sheaf. This operation is clearly functorial: the sheaf of homomorphisms between two such objects is a torsion sheaf concentrated on the points where the two divisors intersect. The stalks at such points are precisely the homomorphisms between the stalks of the two coherent sheaves. Thus the hyper-Kähler rotation induces a functor between the SLF category $F(X_J)$ and the category $C(X_I)$ of coherent sheaves supported on a divisor of $X_I$, whose restriction to the divisor is locally free. This functor is clearly faithful, full and representative and hence gives an equivalence of the two categories.
Remark 4.1. To take account of the singular divisors in X we should consider torsion-free sheaves rather than just locally free ones. However, since any coherent sheaf on a possibly singular curve over $\mathbb{C}$ has a projective resolution by locally free sheaves, what we miss by restricting to locally free sheaves will be recovered when we go to the derived categories.
The category $C(X_I)$ that we obtained via a hyper-Kähler rotation is not abelian (kernels and cokernels of morphisms do not necessarily lie in the category). In order to introduce a related derived category, one should find a somehow natural abelian category $\tilde{C}(X_I)$ containing $C(X_I)$. The most obvious choice is the subcategory of the category $\mathrm{Coh}(X_I)$ of coherent sheaves on $X_I$ whose objects are sheaves of rank 0 (in particular we are adding all the skyscraper sheaves). We assume that the K3 surface $X_I$ is elliptic and has a section.
Since $X_I$ is elliptic any point $p \in X$ lies on a divisor D. The complex $0 \to k_p \to 0$ concentrated in degree zero, where $k_p$ is the length one skyscraper at p, is quasi-isomorphic to the complex of sheaves in $C(X_I)$, $0 \to \mathcal{O}_D(-p) \to \mathcal{O}_D \to 0$, where $\mathcal{O}_D$ is the term of degree zero. Since every coherent sheaf on a smooth curve is the direct sum of a locally free sheaf and a skyscraper sheaf, we obtain that all coherent sheaves whose support lies on a divisor are objects of $\tilde{C}(X_I)$. It is not always true that the derived category of an abelian subcategory $C'$ of an abelian category $C$ is also a subcategory of the derived category of $C$. However, this is indeed the case for the category $\tilde{C}(X_I)$, as we shall next show. Let us recall the definition of thick subcategory (cf. e.g. [9]).
Definition 4.2. A subcategory $C'$ of a category $C$ is said to be thick if for any exact sequence $Y \to Y' \to W \to Z \to Z'$ in $C$ with $Y$, $Y'$, $Z$, $Z'$ in $C'$, then $W$ belongs to $C'$ as well.
Now, $\tilde{C}(X_I)$ is a thick subcategory of $\mathrm{Coh}(X_I)$: in fact, the generic stalk of a sheaf in $\tilde{C}(X_I)$ is 0, and, since a sequence of sheaves is exact when it is so at the stalks, this implies that also the generic stalk of $W$ is 0, i.e. $W$ also is a rank 0 sheaf. Moreover, $\tilde{C}(X_I)$ is a full subcategory, so that we can apply the following theorem [9].
Theorem 4.3. Let $C$ be an abelian category, $C'$ a thick full abelian subcategory. Assume that for any monomorphism $f : W' \to W$ with $W' \in \mathrm{Ob}(C')$, there exists a morphism $g : W \to Y$, with $Y \in \mathrm{Ob}(C')$, such that $g \circ f$ is a monomorphism. Then the derived category $D(C')$ is equivalent to the subcategory of $D(C)$ consisting of complexes whose cohomology objects belong to $C'$.
In our case the condition of this theorem is easily met, just take for g the evaluation morphism. Thus the derived category built up from $\tilde{C}(X_I)$ is a subcategory of the derived category of coherent sheaves. The image of the category $\tilde{C}(X_I)$ in cohomology is $H^{1,1}(\mathbb{Z}) \oplus H^4(\mathbb{Z})$ and is an ideal in the algebraic cohomology ring. Since the Chern map is a ring morphism between K-theory and algebraic cohomology, by adding the structure sheaf to $\tilde{C}(X_I)$ we recover the whole derived category of coherent sheaves. Adding the structure sheaf of the surface has no motivation from a strictly geometric viewpoint, but has physical grounds in the necessity of having 0-branes in the spectrum of the theory. (The association between coherent sheaves and branes is usually done by taking the Poincaré dual of the support of the coherent sheaf.) Let us check explicitly that every complex $0 \to \mathcal{F} \to 0$, where $\mathcal{F}$ is a coherent sheaf on $X_I$, is quasi-isomorphic to a complex $0 \to \oplus\,\mathcal{O}_{X_I} \to S \to 0$, where S is a coherent sheaf supported on a divisor. Let us fix a very ample divisor H in $X_I$. Every coherent sheaf $\mathcal{F}$ admits a finite projective resolution by sheaves of the form $\bigoplus_{j=1}^{r} \mathcal{O}_{X_I}(-m_j H)$ (cf. [7]). Moreover, due to the exactness of the sequence
\[
0 \to \mathcal{O}_{X_I}(-m_j H) \to \mathcal{O}_{X_I} \to \mathcal{O}_{m_j H} \to 0,
\]
the sheaf $\bigoplus_{j=1}^{r} \mathcal{O}_{X_I}(-m_j H)$ is quasi-isomorphic to a complex
\[
0 \to \oplus\,\mathcal{O}_{X_I} \to S \to 0,
\]
where S is a coherent sheaf supported on a divisor (here $\oplus\,\mathcal{O}_{X_I}$ is concentrated in degree 0). This proves that the whole derived category of coherent sheaves is obtained by complexes whose elements are either direct sums of the structure sheaf or lie in the image of the SLF category. Collecting these results, we have eventually proved the following fact: the derived category of a “natural abelianization” of the SLF category $F(X_J)$ is equivalent to a subcategory of the derived category $D(X_I)$ of coherent sheaves on $X_I$.
5. Conclusions
Mirror symmetry yields definite predictions about the transformations of branes [14], which can be given a precise mathematical interpretation in terms of transformations of the derived category of coherent sheaves. In [2] it was indeed proved that the action of a Fourier–Mukai transform on the derived category of coherent sheaves mimics precisely the action of mirror symmetry on branes. In particular, this shows that on an elliptic K3 surface genus 1 special Lagrangian cycles are mapped to points, which is exactly the behaviour one expects from mirror symmetry [12]. Moreover, one can argue that the very essence of mirror symmetry is an equivalence between a suitable (derived) version of the Fukaya category of a Calabi–Yau manifold
X and the derived category of coherent sheaves of the mirror manifold $\hat{X}$. This is exactly what we have proved when X is an elliptic K3 surface with a section, admitting also a fibration in special Lagrangian tori. After performing a hyper-Kähler rotation, we map the SLF category into a category whose “natural abelianization” is a thick full subcategory of the category of coherent sheaves. Now, if we consider an extension of this category adding the structure sheaf (which seems in some sense very natural) and derive this, we obtain the whole derived category of coherent sheaves. Applying a Fourier–Mukai transform (which at the level of derived categories is an equivalence) we obtain the desired transformation mapping 2-cycles of genus 1 to points. If, instead, we do not extend the SLF category by adding the structure sheaf, we obtain a subcategory of the derived category of coherent sheaves. This will be mapped by Fourier–Mukai transform to another subcategory, but again this will show the desired feature of mapping 2-cycles of genus 1 to points.
Acknowledgements. We thank B. Dubrovin for valuable discussions and D. Hernández Ruipérez for his enlightening suggestions. This research was partly supported by the research project “Geometria delle varietà differenziabili”. The second author wishes to thank the School of Mathematical and Computing Sciences of the Victoria University of Wellington, New Zealand, for the warm hospitality during the completion of this paper while he was supported by the Marsden Fund research grant VUW-703.
References
1. Aspinwall, P., and Donagi, R.: The heterotic string, the tangent bundle, and derived categories. hep-th/9806094
2. Bartocci, C., Bruzzo, U., Hernández Ruipérez, D., and Muñoz Porras, J.M.: Mirror symmetry on K3 surfaces via Fourier–Mukai transform. Commun. Math. Phys. 195, 79–93 (1998); alg-geom/9704023
3. Becker, K., Becker, M., and Strominger, A.: Fivebranes, membranes and non-perturbative string theory. Nucl. Phys. B456, 130–152 (1995); hep-th/9507158
4. Bruzzo, U., and Sanguinetti, G.: Mirror symmetry on K3 surfaces as a hyper-Kähler rotation. Lett. Math. Phys. 45, 295–301 (1998); physics/9802044
5. Dolgachev, I.V.: Mirror symmetry for lattice polarized K3 surfaces. J. Math. Sci. 81, 2599–2630 (1996); alg-geom/9502005
6. Fukaya, K.: Morse homotopy, A∞-category and Floer homologies. In: Proceedings of the 1993 GARC Workshop on Geometry and Topology, Seoul National University
7. Hartshorne, R.: Algebraic geometry. New York: Springer-Verlag, 1977 (Corollary II.5.18)
8. Harvey, R., and Lawson Jr., H.B.: Calibrated geometries. Acta Math. 148, 47–157 (1982)
9. Kashiwara, M., and Schapira, P.: Sheaves on manifolds. Berlin: Springer-Verlag, 1990
10. Kontsevich, M.: Homological algebra of mirror symmetry. In: Proceedings of the 1994 International Congress of Mathematicians, I, Zürich: Birkhäuser, 1995, p. 120; alg-geom/9411018
11. Kontsevich, M.: Talk delivered at “European Conference on Algebraic Geometry”, University of Warwick, July 1996
12. Manin, Yu.I.: Talk delivered at the Pisa symposium “Hodge Theory, Mirror Symmetry and Quantum Cohomology”, April 1998
13. McDuff, D., and Salamon, D.: Introduction to symplectic topology. Oxford: Clarendon Press, 1995
14. Ooguri, H., Oz, Y., and Yin, Z.: D-branes on Calabi–Yau spaces and their mirrors. Nucl. Phys. B477, 407–430 (1996); hep-th/9606112
15. Polishchuk, A., and Zaslow, E.: Categorical mirror symmetry: The elliptic curve. math.AG/9801119
16. Sadov, V.: On equivalence of Floer’s and quantum cohomology. Commun. Math. Phys. 173, 77–99 (1995); hep-th/9310153
17. Strominger, A., Yau, S.-T., and Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B479, 243–259 (1996); hep-th/9606040
18. Verdier, J.-L.: Des catégories dérivées des catégories abéliennes. Astérisque 239, Société Mathématique de France (1996)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 273 – 288 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Ergodicity of 2D Navier–Stokes Equations with Random Forcing and Large Viscosity
Jonathan C. Mattingly*
Program in Applied and Computational Mathematics, Princeton NJ, USA
Received: 16 February 1998 / Accepted: 19 March 1999
Abstract: The stochastically forced, two-dimensional, incompressible Navier–Stokes equations are shown to possess a unique invariant measure if the viscosity is taken large enough. This result follows from a stronger result showing that at high viscosity there is a unique stationary solution which attracts solutions started from arbitrary initial conditions. That is to say, the system has a trivial random attractor. Along the way, results controlling the expectation and averaging time of the energy and enstrophy are given.
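We consider the stochastically forced, 2D, incompressible Navier–Stokes equations (SNS) on a bounded domain $U \subset \mathbb{R}^2$ with a smooth boundary $\partial U$, namely
\[
\begin{aligned}
&\frac{\partial u(x,t)}{\partial t} - \nu\Delta u(x,t) + (u(x,t)\cdot\nabla)u(x,t) = f(x,t) - \nabla P(x,t),\\
&\nabla\cdot u(x,t) = 0,\\
&u(x,0) = u_0(x) \quad\text{and}\quad u(x,t) = 0 \ \text{ for } x\in\partial U.
\end{aligned}
\tag{1}
\]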
Here $f(x,t)$ is a divergence-free, mean zero, white in time Gaussian random field satisfying the specified boundary conditions, $P(x,t)$ is the pressure, and $\nu > 0$ is the viscosity. Equation (1) determines a Markov process whose phase space consists of the square integrable, divergence-free vector fields defined on the domain $U$ with the given boundary conditions. This process was studied in a series of papers by Crauel, Da Prato, Ferrario, Flandoli, Foias, Gatarek, Maslowski, Temam, Zabczyk and others (see [Fla94, FG95, FM95, CF94, Fer97, DPZ96]). In particular, for (1) they proved the existence and uniqueness of strong solutions to the integral equation and the existence of at least one invariant measure. The uniqueness of this measure was proven under stringent additional conditions on the forcing.

* Current address: Department of Mathematics, Stanford University, Stanford, CA 94305, USA. E-mail: [email protected]

In this paper, we prove a general theorem about the behavior of solutions of the SNS equations which holds for large viscosity. This theorem, among other things, easily gives the uniqueness of the invariant measure. The approach was motivated by the paper [EKMS98]. Furthermore, it opens the possibility of studying various statistical properties of the solutions with respect to this invariant measure.

Before stating the main result let us describe the setting more precisely. We begin by eliminating the pressure term from the equations by incorporating the divergence-free condition into the state space. Essentially, $\nabla P$ can be understood as a Lagrange multiplier which enforces the divergence-free condition. Its effect can be captured by restricting ourselves to solutions living in a divergence-free space. We denote by $V$ the space of all $C^\infty$, divergence-free vector fields on $U$ satisfying the boundary conditions and by $L^2$ (written without an argument) the closure of $V$ in $L^2(U)\times L^2(U)$. $L^2$ should be thought of as the square integrable functions “in our setting”. By projecting onto $L^2$, we can rewrite (1) as an abstract Itô stochastic differential equation on $L^2$. We thus obtain
\[
du(t;t_0,u_0) = \{-\nu\Lambda^2 u - B(u,u)\}\,dt + dW(t), \qquad u(t_0;t_0,u_0) = u_0 \in L^2.
\tag{2}
\]
Here $u(t;t_0,u_0)$ is the value at time $t$ of a solution which started from an initial condition $u_0$ at time $t_0$. $B(v,w) = P_{L^2}(v\cdot\nabla)w$ and $\Lambda^2 u = -P_{L^2}\Delta u$ are, respectively, the projections of the bilinear term and the linear term onto $L^2$. We will also need the eigenvectors, $\{e_k\}_{k\in\mathbb{Z}}$, of $\Lambda^2$ in $L^2$ and their corresponding eigenvalues, $\lambda_k$. To characterize the spatial smoothness, we will use the spaces $H^s = D(\Lambda^s)\cap L^2$, where $D(\Lambda^s)$ is the domain of the operator $\Lambda^s$. $H^s$ is essentially the Sobolev space $H^s(U)\times H^s(U)$ with the addition of the boundary and divergence-free conditions. $dW(t)$ is the Itô differential of an infinite dimensional Brownian motion in $L^2$. We assume $W(t,\omega)$ is of the form
\[
W(t) = \sum_{k\in\mathbb{Z},\,k\neq 0} \sigma_k e_k \beta_k(t), \qquad t\in(-\infty,\infty).
\tag{3}
\]
Here the $\beta_k(t)$ are independent, two-sided, standard Brownian motions on the probability space $(\Omega,\mathcal{F}_t,\mathbb{P},\theta_t)$, $\mathcal{F}_t$ is the filtration of $\sigma$-algebras to which the $\beta_k$'s are adapted, $\mathbb{P}$ is the probability measure on $\Omega$, and $\theta_t$ is the induced ergodic group of $\mathbb{P}$-preserving shifts on $\Omega$. For a unique strong solution to exist, a sufficient requirement on the coefficients $\sigma_k\in\mathbb{R}$ is that
\[
\sum_k \sigma_k^2\lambda_k^{1/2} < \infty.
\tag{4}
\]
This condition is natural for it makes $W(\cdot)$ a Brownian motion with values in $H^{1/2}$ ([Kun90, DPZ92]). By the Sobolev embedding theorem, this is the marginal space to be continuously embedded in $L^4\times L^4$ as is required of the forcing in the deterministic theory. It is possible to work with less spatially regular forcing at the expense of having to deal with weak solutions and the imposition of additional conditions to assure uniqueness. However, since our goal is to outline a new approach, not give the most general theorems, we steer clear of these extra complications in the name of clarity. Thus henceforth, except when explicitly stated, we require that $W(t)$ satisfy the condition given in (4).
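As a purely illustrative aside, condition (4) can be probed numerically by looking at partial sums of the series. The sketch below assumes a model spectrum $\lambda_k = k$ and two power-law choices of $\sigma_k$; these choices, and the helper name `partial_sum`, are hypothetical and are not taken from the paper.

```python
from math import sqrt

def partial_sum(sigma, lam, n_terms):
    """Partial sum of sigma_k^2 * lambda_k^(1/2) for k = 1..n_terms, the series in (4)."""
    return sum(sigma(k) ** 2 * sqrt(lam(k)) for k in range(1, n_terms + 1))

# Hypothetical model spectrum: lambda_k = k. This is an assumption made only
# for illustration; the true eigenvalues depend on the domain U.
lam = lambda k: float(k)

fast = lambda k: k ** -1.5   # terms behave like k^(-2.5): the series converges
slow = lambda k: k ** -0.5   # terms behave like k^(-0.5): the series diverges

for n in (10**2, 10**3, 10**4):
    print(n, partial_sum(fast, lam, n), partial_sum(slow, lam, n))
```

The partial sums for the fast-decaying coefficients settle down, while those for the slowly decaying coefficients keep growing. A forcing with only finitely many nonzero $\sigma_k$ satisfies (4) trivially.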
The process $W(t)$ has stationary increments; hence, the expected value of the squared $L^2$ norm of the forcing grows at a fixed constant rate. We will denote this constant by $E_0$,
\[
\mathbb{E}\left\{|W(t+\tau)|^2_{L^2} - |W(t)|^2_{L^2}\right\} = \tau E_0 = \tau\sum_k \sigma_k^2.
\tag{5}
\]
Physically, $E_0$ is the expected energy flux imparted by the stochastic forcing per unit of time. We also observe that the Poincaré inequality, $|\Lambda u|^2_{L^2} \ge \lambda_1 |u|^2_{L^2}$, holds in our setting. Furthermore, we shall need later the classical estimate on the bilinear term $B(v,w)$ (see [CF88]):
\[
|\langle B(v,w),u\rangle_{L^2}|^2 \le \gamma^2\, |v|_{L^2}\,|\Lambda v|_{L^2}\,|\Lambda w|^2_{L^2}\,|u|_{L^2}\,|\Lambda u|_{L^2}.
\tag{6}
\]
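The identity (5) lends itself to a quick Monte Carlo sanity check for a finite dimensional forcing. The following is a minimal sketch, assuming three hypothetical coefficients $\sigma_k$ and orthonormal $e_k$, so that $|W(t)|^2_{L^2} = \sum_k \sigma_k^2\beta_k(t)^2$; the function names are illustrative only.

```python
import random

def energy_flux(sigmas):
    """E_0 = sum_k sigma_k^2, the constant appearing in (5)."""
    return sum(s * s for s in sigmas)

def mean_energy_increment(sigmas, t, tau, n_samples, seed=0):
    """Monte Carlo estimate of E{|W(t+tau)|^2 - |W(t)|^2} for orthonormal e_k."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        for s in sigmas:
            b_t = rng.gauss(0.0, t ** 0.5)    # beta_k(t), variance t
            db = rng.gauss(0.0, tau ** 0.5)   # independent increment, variance tau
            total += s * s * ((b_t + db) ** 2 - b_t ** 2)
    return total / n_samples

sigmas = [1.0, 0.5, 0.25]   # hypothetical forcing amplitudes
tau = 0.5
estimate = mean_energy_increment(sigmas, t=1.0, tau=tau, n_samples=40000)
exact = tau * energy_flux(sigmas)
print(estimate, exact)
```

Up to sampling error, the estimate should not depend on the starting time `t`, which reflects the stationarity of the increments.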
For completeness, we restate the existence and uniqueness theorems for the SNS.

Definition. A stochastic process $u(t,\omega)$ is a solution of (2), over the time interval $[t_0,T]$ with initial condition $u_0\in L^2$, if
• $u(\cdot,\omega)\in C((t_0,T),L^2)\cap L^2((t_0,T);H^1)$ a.s.
• $u(\cdot,\omega)$ is a solution of the integral equation
\[
u(t,\omega) = e^{-\nu\Lambda^2(t-t_0)}u_0 - \int_{t_0}^{t} e^{-\nu\Lambda^2(t-s)}B(u(s,\omega),u(s,\omega))\,ds + \int_{t_0}^{t} e^{-\nu\Lambda^2(t-s)}\,dW(s,\omega)
\]
with probability one.

Just as in the deterministic Navier–Stokes equations, one can obtain a short time existence proof by means of a fixed point argument. The solution can then be extended for all time by an a priori energy estimate.

Theorem (Da Prato, Zabczyk, Flandoli). If $W(\cdot)$ satisfies (4), then for each initial condition $u_0\in L^2$ there exists a unique solution $u$ of the SNS, Eq. (2), such that
• $u(\cdot,\omega)\in C([0,T];L^2)\cap L^2(0,T;H^1)$ a.s.
• $u$ is a Markov process in $L^2$.

Proof. Given the observation that in the two-dimensional setting, $H^{1/2}\subset L^4\times L^4$, the existence and uniqueness was proved in [DPZ96]. Flandoli proved the regularity and Markov properties in [Fla94]. □

As mentioned before, in [FM95] Flandoli and Maslowski proved that the invariant measure for weak solutions of the SNS, Eq. (2), is unique if
\[
c\lambda_k^{-1/2} \le \sigma_k \le C\lambda_k^{-3/8-\epsilon} \qquad\text{for some } C>0,\ c>0, \text{ and } \epsilon>0,
\]
asymptotically in $k$. The upper bound ensures that a weak solution of a needed regularity exists, the lower bound ensures the uniqueness of the invariant measure. These results have been improved in [Fer97] but only in so far as the decay rates have been relaxed. All these results require the noise not only to be infinite dimensional but also not to have a high degree of smoothness in space. Our results, though requiring a viscosity which
is “large enough”, impose no spatial roughness on the forcing. In particular, the forcing can be finite dimensional. We use a simple yet, when applicable, extremely powerful methodology for showing the uniqueness of an invariant measure of our Markov process. It amounts to showing, noise realization by noise realization, that trajectories starting from different initial data converge to each other with probability one as the system evolves.

1. Main Results

As we have alluded to, all of our central results require the square of the viscosity to be large relative to the mean energy flux of the forcing. We now make this statement precise. Recall that $\gamma$ is the domain dependent constant defined in (6) and that $\lambda_1$ was the first eigenvalue of $\Lambda^2$ on $U$. Define $\delta_0 = \lambda_1\nu - E_0\gamma/(2\nu^2)$.

Condition (A).
\[
\frac{\nu^3}{E_0} > \frac{\gamma}{2\lambda_1} \iff \delta_0 > 0.
\]

As $E_0$ is the mean increase in the squared $L^2$ norm of the Brownian forcing $W(t)$ per unit of time, Condition (A) requires that the mean energy input be small relative to the viscosity squared. All of our results stem from the following two theorems.

Theorem 1. Assume Condition (A) holds. Fix a $\delta\in(0,\delta_0)$ and a time $t_0$. Let $u_0\in L^2$ be an initial condition, measurable with respect to $\mathcal{F}_{t_0}$, such that $\mathbb{E}|u_0|^{2p}_{L^2}$ is finite for some $p>1$. Set $u(t) = u(t;t_0,u_0)$ and let $\tilde u(t) = u(t;t_0,\tilde u_0)$ denote the solution starting from some other arbitrary initial condition $\tilde u_0\in L^2$. Then, there exists a positive integer-valued random time $\overrightarrow{\tau}(\delta,t_0,u_0)$, independent of $\tilde u$, such that
\[
|u(t)-\tilde u(t)|^2_{L^2} \le |u_0-\tilde u_0|^2_{L^2}\, e^{-2\delta(t-t_0)}
\]
for all $t > t_0 + \overrightarrow{\tau}$. In addition, $\mathbb{E}(\overrightarrow{\tau}^{\,q})$ is finite for any $q\in(0,p-1)$.
Fig. 1. Summary of Theorem 1 (vertical axis: $L^2$-norm; trajectories starting from $u_0$ and $\tilde u_0$ at time $t_0$, with the time $t_0+\tau^*$ marked)
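To make Condition (A) concrete, the sketch below evaluates $\delta_0$ and the inequality of Condition (A) for a few viscosities. The parameter values are made up for illustration, and the helper names (`delta0`, `condition_A`, `critical_viscosity`) are hypothetical; only the two formulas come from the text.

```python
def delta0(nu, lam1, E0, gamma):
    """delta_0 = lambda_1*nu - E_0*gamma/(2*nu^2), as defined before Condition (A)."""
    return lam1 * nu - E0 * gamma / (2.0 * nu ** 2)

def condition_A(nu, lam1, E0, gamma):
    """Condition (A): nu^3 / E_0 > gamma / (2*lambda_1). Requires E0 > 0."""
    return nu ** 3 / E0 > gamma / (2.0 * lam1)

def critical_viscosity(lam1, E0, gamma):
    """Viscosity at which delta_0 = 0, i.e. nu^3 = gamma*E_0/(2*lambda_1)."""
    return (gamma * E0 / (2.0 * lam1)) ** (1.0 / 3.0)

# Illustrative (made-up) parameter values.
lam1, E0, gamma = 1.0, 2.0, 1.0
nu_c = critical_viscosity(lam1, E0, gamma)
for nu in (0.5, 1.5, 3.0):
    print(nu, condition_A(nu, lam1, E0, gamma), delta0(nu, lam1, E0, gamma))
```

For these values the threshold is `nu_c = 1.0`: viscosities above it give $\delta_0 > 0$, and Theorem 1 then allows any contraction rate $\delta\in(0,\delta_0)$.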
Theorem 2. Assume Condition (A) holds. Fix a $\delta\in(0,\delta_0)$ and a $t\in\mathbb{R}$. Let $\{u_0(n)\}$ be a sequence of random variables with $n\in\alpha\mathbb{Z}$ and $n\ge 0$. Assume that the $u_0(n)$ are measurable with respect to $\mathcal{F}_{t-n}$ and that $\mathbb{E}|u_0(n)|^{2p}_{L^2}$ is uniformly bounded in $n$ for some $p>2$. Then the following hold:
1. There exists a random $\alpha\mathbb{Z}$-valued time $\overleftarrow{n}(\delta,t,\omega) > 0$ such that for real $s>0$ and all $n\in\alpha\mathbb{Z}$ with $n > \overleftarrow{n}$ one has, with probability one,
\[
\sup_{u_0'\in A_n} |u(t+s;t-n,u_0(n)) - u(t+s;t-n,u_0')|_{L^2} \le \delta^2 |n|\, e^{-\delta(|n|+s)}.
\]
Here $A_n$ is the set $\{u_0' : |u_0'|^2_{L^2} \le \tfrac{\delta^2}{2}|n|\}$. In addition, $\mathbb{E}(\overleftarrow{n}^{\,q}) < \infty$ for $q\in(0,p-2)$.
2. Let $\{\tilde u_0(n)\}$ be a second sequence of random variables with $n\in\alpha\mathbb{Z}$ and $n\ge 0$, measurable with respect to $\mathcal{F}_{t-n}$, and with $\mathbb{E}|\tilde u_0(n)|^{2p}_{L^2}$ uniformly bounded in $n$ for some $p>2$. Then there exists another $\alpha\mathbb{Z}$-valued random time $\overleftarrow{n}'$ such that, with probability one, for real $s>0$ and all $n\in\alpha\mathbb{Z}$ with $n > \overleftarrow{n}'$ one has
\[
|u(t+s;t-n,u_0(n)) - u(t+s;t-n,\tilde u_0(n))|_{L^2} \le \delta^2 |n|\, e^{-\delta(|n|+s)}.
\]
Again, $\mathbb{E}(\overleftarrow{n}')^q < \infty$ for $q\in(0,p-2)$.

Theorem 2 is similar in spirit to Theorem 1. The main difference is that in Theorem 2, time is running backwards. However, Theorem 2 is a bit weaker in that we are restricted to points on the lattice $\alpha\mathbb{Z}$ as starting times. This, however, is an artifact of our approach. By proving a “backwards” version of the critical lemma used in the proof of Theorem 1 (that is, Lemma 3), one can prove a version of Theorem 2 completely analogous to Theorem 1. See [Mat98] for the details. The following corollary will allow us to build a solution starting from “$-\infty$.”

Corollary 1. Under Condition (A), fix a lattice $\alpha\mathbb{Z}$, a $t_1\in\alpha\mathbb{Z}$, and a $\delta\in(0,\delta_0)$. Given any $\varepsilon>0$, there exists a positive $\alpha\mathbb{Z}$-valued random time $n^*(\varepsilon,\delta,t_1)$ such that with probability one, for all $\tau\ge 0$ and all $n_1,n_2\in\alpha\mathbb{Z}$,
\[
n_1,n_2 < t_1 - n^* \implies |u(t_1+\tau;n_1,0) - u(t_1+\tau;n_2,0)|_{L^2} \le \varepsilon e^{-\delta\tau}.
\]
Furthermore, $n^*(\omega)$ is a stationary random variable with all moments finite.
Fig. 2. Summary of Corollary 1 (vertical axis: $L^2$-norm; marked times $t_1-n^*$ and $t_1$)
Theorem 1 implies that, for almost every realization of the noise, trajectories starting from different initial conditions converge to each other. Corollary 1 states that two solutions with initial conditions identically equal to zero, but starting at different instances of time, converge to each other for almost every instance of noise. Together they show
that there exists a unique asymptotic behavior and thus a unique invariant measure. Essentially, Corollary 1 shows the existence of a single distinguished solution to which all solutions starting from zero converge almost surely. Theorem 1 guarantees that all initial conditions converge to this distinguished solution for almost every instance of the noise. Thus the asymptotic behavior depends only on the realization of the noise and is insensitive to the initial conditions. We now make this discussion more formal, and prove the following statements.

Theorem 3. If Condition (A) holds, then there exists a unique solution $u^* : (-\infty,\infty)\times\Omega \to L^2$ of the SNS, defined for all $t\in(-\infty,\infty)$, such that:
1. $u^*(t,\omega)$ is a stationary stochastic process with values in $H^1$.
2. For any time $t_0\in\mathbb{R}$, any $\delta$ in $(0,\delta_0)$, and any lattice $\alpha_0\mathbb{Z}$, there exist integer random times $\overrightarrow{n}^*(t_0,\delta)$ and $\overleftarrow{n}^*(t_0,\delta,\alpha_0)$ such that
\[
\sup_{\{u_0 \,:\, |u^*(t_0)-u_0|^2_{L^2}\,\ldots\}} |u(t;t_0,u_0) - u^*(t)|_{L^2} \le r e^{-\delta(t-t_0)},
\]
\[
\sup_{\{u_0 \,:\, |u^*(t')-u_0|^2_{L^2}\,\ldots\}} |u(t_0;t',u_0) - u^*(t_0)|_{L^2} \le r e^{-\delta|t'-t_0|},
\]
for $r>0$ ($r\in\mathbb{R}$), $t > t_0 + \overrightarrow{n}^*$ ($t\in\mathbb{R}$), and $t' < t_0 - \overleftarrow{n}^*$ ($t'\in\alpha_0\mathbb{Z}$). In addition, $\overleftarrow{n}^*$ and $\overrightarrow{n}^*$ have all moments finite.
Fig. 3. Summary of Theorem 3 (vertical axis: $L^2$-norm; distinguished solution $u^*$)
In fact, $u^*$ has greater spatial regularity than mentioned above. See [Mat98] for the details.

Proof. We begin by constructing $u^*$. Pick an $\alpha\in\mathbb{R}^+$. Let $n_1$ be an arbitrary element of $\alpha\mathbb{Z}$. Define $u_n(t,\omega) = u(t,\omega;n_1-n,0)$ for $n\in\alpha\mathbb{Z}^+$ and $t\ge n_1$. By Corollary 1, we see that the $\{u_n\}$, restricted to the time interval $[n_1,\infty)$, form a Cauchy sequence in the space $C([n_1,\infty),L^2)$ under the norm $|u|_{\infty,L^2} \stackrel{\mathrm{def}}{=} \sup_{s\ge n_1}|u(s)|_{L^2}$. This space is complete so the limit exists. Define $u^*(t,\omega)$ to be this limit for $t\ge n_1$. Since $n_1$ was arbitrary this defines $u^*(t,\omega)$ for all time.

Flandoli proved in [Fla94] that there is an absorbing ball for the dynamics in the $H^1$ topology. Thus for any fixed $T>n_1$, we see that $\limsup_n |\Lambda u_n(s)|_{L^2} < K(\omega,T)$ almost surely, for some random $K$ and all $s\in[n_1,T]$. This gives that $|\Lambda u^*|_{L^2} < K$ almost surely, which means $u^*\in C([n_1,\infty);H^1)$. This also shows that the $\{u_n\}$ converge to $u^*$ weakly in $H^1$. We already know that the $\{u_n\}$ converge strongly to $u^*(t,\omega)$ in
$L^2$. Hence by standard techniques and some estimates on the bilinear term, we see that $u^*(t,\omega)$ is a weak solution to the SNS equation (cf. Sect. 2.1 of [Tem79]). Since $u^*\in C([n_1,\infty);H^1)$ almost surely, it is in fact a strong solution to the integral equation.

Because each $u_n$ starts from zero, Lemma 2 shows that for each $p$, $\mathbb{E}\{|u_n(t)|^{2p}_{L^2}\}$ is bounded uniformly in both $t$ and $n$. Thus, $\mathbb{E}\{|u^*(t)|^{2p}_{L^2}\}$ is bounded uniformly in $t$, $t\in(-\infty,\infty)$. This uniformity allows us to apply Theorems 1 and 2, which proves the two statements about balls in phase space being exponentially attracted to $u^*$.

Next, we must show that $u^*$ is stationary. Observe that, by construction, $u^*(t,\omega)$ is stationary under shifts of length $\alpha$,
\[
u^*(t+\alpha,\omega) = \lim_{\substack{n\to-\infty\\ n\in\alpha\mathbb{Z}}} u(t+\alpha,\omega;t-n,0) = \lim_{\substack{n\to-\infty\\ n\in\alpha\mathbb{Z}}} u(t,\theta_\alpha\omega;t-n+\alpha,0) = u^*(t,\theta_\alpha\omega).
\]
Since $\alpha$ was arbitrary, for another $\tilde\alpha\in\mathbb{R}^+$ we could construct $\tilde u^*$ corresponding to the lattice $\tilde\alpha\mathbb{Z}$. Again $\tilde u^*$ would be a strong solution, with $E_p(\tilde u^*)$ uniformly bounded in time and stationary relative to shifts of length $\tilde\alpha$. Since $u^*(t)$ and $\tilde u^*(t)$ both have uniformly bounded energy moments, we can apply Theorem 2. Because $u^*(t)$ and $\tilde u^*(t)$ exist for all times, we can slide the “initial times” used in Theorem 2 back to “$-\infty$”, thus showing that the two solutions are identical. □

In light of Theorem 3, we have the following corollary.

Corollary 2. If Condition (A) holds, the SNS has a unique invariant measure.

Proof. The invariant measure is simply the law of $u^*(t)$ at any time $t$. Since every trajectory is attracted to $u^*(\cdot)$ the measure is unique. □

We can recast these conclusions in the language of random attractors (see [CF94]) by saying that the SNS possesses a random attractor which for each noise realization is a single solution in $L^2$.

2. Energy Estimates

Before proving our main results, we establish a few facts concerning the evolution of the energy which do not require Condition (A). We will denote the moments of the energy by $E_p(t;u_0) = \mathbb{E}\{|u(t;t_0,u_0)|^{2p}_{L^2}\}$. Also define $\sigma^2_{\max} = \sup_k \sigma_k^2$. Letting $u_k(t) = \langle u(t),e_k\rangle_{L^2}$ and denoting by $\langle u(t),dW\rangle_{L^2}$ the sum $\sum_k u_k(t)\cdot\sigma_k\,d\beta_k(t)$, we have the following lemmas describing the evolution of the energy moments.

Lemma 1. For $p\ge 1$, the energy moments satisfy the Itô stochastic differential equation
\[
\begin{aligned}
d|u(t)|^{2p}_{L^2} ={}& 2p|u(t)|^{2(p-1)}_{L^2}\Big[-\nu|\Lambda u(t)|^2_{L^2}\,dt + \langle u(t),dW\rangle_{L^2}\Big]\\
&+ 2p(p-1)|u(t)|^{2(p-2)}_{L^2}\sum_k |u_k(t)|^2|\sigma_k|^2\,dt + p|u(t)|^{2(p-1)}_{L^2}E_0\,dt.
\end{aligned}
\tag{7}
\]
Furthermore, the local martingale defined by
\[
M_t = \int_{t_0}^{t} 2p|u(s)|^{2(p-1)}_{L^2}\langle u(s),dW(s)\rangle_{L^2}
\]
is in fact an $L^2(\Omega)$ martingale.
Lemma 2. Assume that the initial condition is such that $\mathbb{E}\{|u_0|^{2p}_{L^2}\}$ is finite for some $p\ge 1$ and measurable with respect to $\mathcal{F}_{t_0}$, then
\[
E_1(t,u_0) \le \frac{E_0}{2\nu\lambda_1} + e^{-2\nu\lambda_1 t}\left(\mathbb{E}\{|u_0|^2_{L^2}\} - \frac{E_0}{2\nu\lambda_1}\right) = \frac{E_0}{2\nu\lambda_1} + e^{-2\nu\lambda_1 t}\left(E_1(t_0,u_0) - \frac{E_0}{2\nu\lambda_1}\right),
\]
and for all $j\in\mathbb{Z}$, $1<j\le p$,
\[
E_j(t,u_0) \le E_j(t_0,u_0)e^{-2j\nu\lambda_1 t} + C_j\int_{t_0}^{t} E_{j-1}(s,u_0)\,e^{-2j\nu\lambda_1(t-s)}\,ds,
\]
where $C_j = 2j(j-1)\sigma^2_{\max} + jE_0$. Furthermore, for $s<t$,
\[
E_p(t,u_0) \le E_p^{\max}(s,u_0) \stackrel{\mathrm{def}}{=} \sum_{j=1}^{p} C_j' E_j(s,u_0) + C_0',
\tag{8}
\]
where the $C_j'$ are constants depending only on $j$ and the $\sigma_k$'s. We also see that asymptotically
\[
E_p(t) \le \left(\frac{\max(E_0,\sigma^2_{\max})}{\nu\lambda_1}\right)^{p}(p-1)!.
\tag{9}
\]

Proof of Lemmas 1 and 2. We begin by deriving (7), leaving the problem of showing the local martingale term is a true martingale until after we have derived the estimates for the expectations. In fact, these bounds on the expectations will be used to bound the quadratic variation process of the local martingale.

Applying Itô's formula to $u\mapsto|u|^{2p}_{L^2}$, one obtains (7). For $p=1$, this is identical to the deterministic energy evolution equation except for the additional term with $E_0$. This term arises in Itô's formula when the second functional derivative of $u\mapsto|u|^2_{L^2}$ is applied to the quadratic variation of $W(t)$. These somewhat formal manipulations can be understood as the limit of classical finite-dimensional stochastic calculus applied to the Galerkin approximations in Fourier space. All of the terms are independent of the order of the Galerkin approximation so the limit can be taken.

In the rest of the section, we will seem to cover the same ground three times. On each pass, we will glean a little more information. It is probably worthwhile to mention the difficulties that necessitate such repetition. From the existence and uniqueness theory, we know only that $|u(t)|^2_{L^2}$ is finite with probability one. This puts $u(t)$ in one of the weakest classes for which the Itô stochastic integral is defined. Knowing that $\mathbb{P}\left\{\int_0^t |u(s)|^{2p}_{L^2}\,ds < \infty\right\} = 1$ allows one to define the stochastic integral
\[
M_t \stackrel{\mathrm{def}}{=} \int_0^t 2p|u(s)|^{2(p-1)}_{L^2}\langle u(s),dW(s)\rangle_{L^2}
\]
but only as a local martingale. In particular, as the diligent referee correctly observed, this means that one does not know that $\mathbb{E}M_t = 0$. This requires that $\mathbb{E}\int_0^t |u(s)|^{2p}_{L^2}\,ds$ is finite. This is not given by the existence and uniqueness theorem. Hence, we must establish this before we can make any conclusion which requires $\mathbb{E}M_t = 0$.
We will now show that
\[
\begin{aligned}
\mathbb{E}|u(t)|^{2p}_{L^2} \le{}& \mathbb{E}|u(0)|^{2p}_{L^2} - \nu 2p\,\mathbb{E}\int_0^t |u(s)|^{2(p-1)}_{L^2}|\Lambda u(s)|^2_{L^2}\,ds\\
&+ \mathbb{E}\int_0^t 2p(p-1)|u(s)|^{2(p-2)}_{L^2}\sum_k |u_k(s)|^2|\sigma_k|^2\,ds + \mathbb{E}\int_0^t p|u(s)|^{2(p-1)}_{L^2}E_0\,ds.
\end{aligned}
\tag{10}
\]
Since $M_t$ is a local martingale there exists a sequence of stopping times $\{T_n\}$, with $T_n\to\infty$ as $n\to\infty$, that reduces $M_t$, that is, makes $M_{t\wedge T_n}$ a bounded martingale. For $t<T_n$, $M_{t\wedge T_n}$ follows the evolution of $M_t$. At the time $T_n$, it “stops”. For all future times it takes the value $M_{T_n}$. Since $M_{t\wedge T_n}$ is a bounded martingale, the Optional Stopping Time Theorem implies that $\mathbb{E}M_{t\wedge T_n}$ equals 0 (see [DM82, Dur96]). We denote by $f_n(t)$ the expression
\[
|u(0)|^{2p}_{L^2} + \int_0^t 2p(p-1)|u(s)|^{2(p-2)}_{L^2}\sum_k |u_k(s)|^2|\sigma_k|^2\,ds + \int_0^t p|u(s)|^{2(p-1)}_{L^2}E_0\,ds + M_{t\wedge T_n}.
\tag{11}
\]
This is simply the positive drift terms from the right-hand side of (7) written in integral form, with the local martingale $M_t$ replaced by the stopped martingale $M_{t\wedge T_n}$. Because, as already observed, $M_{t\wedge T_n}$ is a bounded martingale and hence has expected value zero, we see that $\mathbb{E}f_n - \nu\mathbb{E}\int_0^t 2p|u(s)|^{2(p-1)}_{L^2}|\Lambda u(s)|^2_{L^2}\,ds$ is the desired right-hand side from (10). Next, rearranging (7), we observe that
\[
0 \le |u(t)|^{2p}_{L^2} + 2p\nu\int_0^t |u(s)|^{2(p-1)}_{L^2}|\Lambda u(s)|^2_{L^2}\,ds = f_n(t)
\]
for $t\le T_n$. This shows that $f_n(t)$ is non-negative for $t\le T_n$. We intend to use Fatou's lemma; hence, we need to show that $f_n(t)$ is non-negative for all $t$. In fact, we will see that for $t>T_n$, $f_n(t)\ge f_n(T_n)$. This can be seen by using (11) to write $f_n(t)-f_n(T_n)$. When $t\ge T_n$, we have
\[
f_n(t) - f_n(T_n) = \int_{T_n}^{t} 2p(p-1)|u(s)|^{2(p-2)}_{L^2}\sum_k |u_k(s)|^2|\sigma_k|^2\,ds + \int_{T_n}^{t} p|u(s)|^{2(p-1)}_{L^2}E_0\,ds.
\tag{12}
\]
Since each integral on the right-hand side is the integral of a non-negative quantity, it is clearly non-negative. Putting all of this together shows that $f_n(t)$ is non-negative for all $t$, which allows us to apply Fatou's lemma. Doing so gives
\[
\mathbb{E}|u(t)|^{2p}_{L^2} + \nu\mathbb{E}\int_0^t 2p|u(s)|^{2(p-1)}_{L^2}|\Lambda u(s)|^2_{L^2}\,ds = \mathbb{E}\lim_{n\to\infty} f_n \le \lim_{n\to\infty}\mathbb{E}f_n,
\tag{13}
\]
which proves (10).
Next, we complete Lemma 2 by constructing the bounds in (8). Pulling $\sigma^2_{\max}$ out and using the Poincaré inequality once gives
\[
\frac{d}{dt}E_p \le -2\nu p\lambda_1 E_p + \left[2p(p-1)\sigma^2_{\max} + pE_0\right]E_{p-1}.
\]
Integration of this differential inequality gives the desired bounds on $E_p(t)$. Lastly, we obtain uniform bounds on each moment in terms of the values of moments of lesser or equal order evaluated at an earlier moment of time. For $t>s$,
\[
E_1(t) \le \max\left\{\frac{E_0}{2\nu\lambda_1},\,E_1(s)\right\} \le \frac{E_0}{2\nu\lambda_1} + E_1(s) \stackrel{\mathrm{def}}{=} E_1^{\max}(s),
\]
\[
E_p(t) \le E_p(s) + C_p E_{p-1}^{\max}(s) \stackrel{\mathrm{def}}{=} E_p^{\max}(s).
\]
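For $p=1$ the differential inequality above reduces to $\frac{d}{dt}E_1 \le -2\nu\lambda_1 E_1 + E_0$, whose extremal case is a linear ODE. As a rough numerical illustration (with made-up parameter values, $t_0 = 0$, and hypothetical function names), the sketch below integrates that ODE by the explicit Euler method and compares the result with the closed-form bound of Lemma 2 and with the cruder uniform bound $E_1^{\max}$.

```python
from math import exp

def first_moment_bound(t, E1_0, nu, lam1, E0):
    """Closed-form solution of dE/dt = -2*nu*lam1*E + E0 with E(0) = E1_0
    (the right-hand side of the first bound in Lemma 2, taking t0 = 0)."""
    limit = E0 / (2.0 * nu * lam1)
    return limit + exp(-2.0 * nu * lam1 * t) * (E1_0 - limit)

def euler(t_end, E1_0, nu, lam1, E0, n_steps=100000):
    """Explicit Euler integration of the same linear ODE, for comparison."""
    dt = t_end / n_steps
    E = E1_0
    for _ in range(n_steps):
        E += dt * (-2.0 * nu * lam1 * E + E0)
    return E

# Illustrative (made-up) values.
nu, lam1, E0, E1_0 = 2.0, 1.0, 1.0, 3.0
closed = first_moment_bound(1.0, E1_0, nu, lam1, E0)
numeric = euler(1.0, E1_0, nu, lam1, E0)
E1_max = E0 / (2.0 * nu * lam1) + E1_0   # uniform-in-time bound E_1^max
print(closed, numeric, E1_max)
```

The closed form relaxes exponentially to $E_0/(2\nu\lambda_1)$ and always stays below $E_1^{\max}$, which is the only feature of the bound used later.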
Notice that $E_p^{\max}(s)$ is just a linear combination of the moments of order less than or equal to $p$ evaluated at the time $s$. In other words, there exist constants $C_j'$ depending only on $p$ and $\{\sigma_k\}$ so that $E_p^{\max}(s) = \sum_{j=1}^{p} C_j' E_j(s) + \frac{E_0}{2\nu\lambda_1}$.

We now examine whether $M_t$ is a true martingale or simply a local martingale. By Corollary 3 on p. 66 of [Pro90], it is sufficient to show that the quadratic variation, $[M,M]_t$, has finite expectation for all finite times,
\[
[M,M]_t = \int_0^t 2p|u(s)|^{2(p-1)}_{L^2}\sum_k |u_k(s)|^2|\sigma_k|^2\,ds \le \sigma^2_{\max}\int_0^t |u(s)|^{2p}_{L^2}\,ds.
\]
Hence,
\[
\mathbb{E}[M,M]_t \le \sigma^2_{\max}\int_0^t E_p(s)\,ds,
\tag{14}
\]
which is finite by the bounds proved in Lemma 2. This completes Lemma 1. □

Before moving on, we mention that completely analogous estimates are possible for $|\nabla u|_{L^2}$ and of a slightly different form for higher Sobolev norms. See [Mat98].

3. The Contraction in Phase Space

Condition (A) makes the system strongly dissipative. In the deterministic setting, it produces a system with a globally attracting fixed point. Our understanding of the dissipative nature will come from examining the evolution of the difference between two solutions starting from different initial data, $u_0$ and $\tilde u_0$, but subjected to the same instance of noise. We define $\rho(t;t_0,u_0,\tilde u_0) = u(t;t_0,u_0) - u(t;t_0,\tilde u_0)$. At times, we will use the shorthand $\tilde u(t)$ for $u(t;t_0,\tilde u_0)$ and $u(t)$ for $u(t;t_0,u_0)$. From Eq. (2), we see that $\rho(t)$ satisfies the following partial differential equation
\[
\begin{aligned}
\frac{d\rho}{dt} &= -\nu\Lambda^2\rho + B(\tilde u,\tilde u) - B(u,u)\\
&= -\nu\Lambda^2\rho + B(u-\rho,u-\rho) - B(u,u)\\
&= -\nu\Lambda^2\rho - [B(u,\rho) + B(\rho,u) - B(\rho,\rho)].
\end{aligned}
\tag{15}
\]
This PDE is classical in so far as there are no Itô integrals, only random coefficients. In the following manipulations, we will not make specific reference to the regularity of the solutions. Implicitly, we do the intermediate calculations with finite Galerkin approximations which are $C^\infty$. The quantities presented in the final estimates will be well defined in the limit as the order of the Galerkin approximation is taken to $\infty$. Thus, the final conclusions will hold for the actual solution and not just its Galerkin approximations. Taking the inner product of (15) with $\rho$ and remembering that $\langle B(v,w),w\rangle_{L^2} = 0$ for general $v$ and $w$, we arrive at
\[
\frac{1}{2}\frac{d}{dt}|\rho(t;t_0)|^2_{L^2} = -\nu|\Lambda\rho|^2_{L^2} - \langle B(\rho,u),\rho\rangle_{L^2}.
\tag{16}
\]
Next recall the estimate on $|\langle B(v,w),u\rangle_{L^2}|$ from the introduction. We use this inequality, followed by the application of $ab \le a^2/2 + b^2/2$, and lastly the Poincaré inequality to obtain
\[
\begin{aligned}
\frac{1}{2}\frac{d}{dt}|\rho(t;t_0)|^2_{L^2} &\le -\nu|\Lambda\rho|^2_{L^2} + \gamma|\rho|_{L^2}|\Lambda\rho|_{L^2}|\Lambda u|_{L^2}\\
&\le -\frac{\nu}{2}|\Lambda\rho|^2_{L^2} + \frac{\gamma}{2\nu}|\rho|^2_{L^2}|\Lambda u|^2_{L^2}\\
&\le -\left(\frac{\nu\lambda_1}{2} - \frac{\gamma}{2\nu}|\Lambda u|^2_{L^2}\right)|\rho|^2_{L^2}.
\end{aligned}
\tag{17}
\]
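Thus by Gronwall's lemma, we arrive at the estimate we need:
\[
|\rho(t;t_0,u_0,\tilde u_0)|^2_{L^2} \le e^{-2(t-t_0)\Gamma(t-t_0;t_0,u_0)}\,|\rho_0|^2_{L^2},
\tag{18}
\]
where
\[
\Gamma(\tau;t_0,u_0) = \nu\lambda_1 - \frac{\gamma}{\nu}\left(\frac{1}{\tau}\int_{t_0}^{t_0+\tau}|\Lambda u(s)|^2_{L^2}\,ds\right).
\]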
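The process $\Gamma$ is a time average, so it can be estimated from a sampled path of $|\Lambda u(s)|^2_{L^2}$. The following sketch uses the trapezoidal rule on equally spaced samples; the sample paths, parameter values and the function name `gamma_process` are assumptions made for illustration and do not come from a simulation of the SNS.

```python
from math import exp

def gamma_process(enstrophy, dt, nu, lam1, gamma):
    """Gamma(tau) = nu*lam1 - (gamma/nu) * (1/tau) * integral of |Lambda u(s)|^2 ds.
    The integral over [t0, t0+tau] is approximated by the trapezoidal rule;
    `enstrophy` holds samples of |Lambda u(s)|^2 at spacing dt, starting at s = t0."""
    n = len(enstrophy) - 1
    tau = n * dt
    integral = dt * (sum(enstrophy) - 0.5 * (enstrophy[0] + enstrophy[-1]))
    return nu * lam1 - (gamma / nu) * integral / tau

# Illustrative (made-up) values.
nu, lam1, E0, gam = 1.5, 1.0, 2.0, 1.0
d0 = lam1 * nu - E0 * gam / (2.0 * nu ** 2)          # delta_0

# Hypothetical path 1: time average exactly E0/(2*nu).
flat = [E0 / (2.0 * nu)] * 1001
# Hypothetical path 2: same long-time level plus a decaying transient.
dt = 0.01
transient = [E0 / (2.0 * nu) + 5.0 * exp(-k * dt) for k in range(10001)]

print(gamma_process(flat, dt, nu, lam1, gam), d0)
print(gamma_process(transient, dt, nu, lam1, gam), d0)
```

When the time average of $|\Lambda u|^2_{L^2}$ equals $E_0/(2\nu)$, the value returned coincides with $\delta_0$; a transient excess of enstrophy lowers $\Gamma$ by an amount that decays like $1/\tau$.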
The following lemma, which will be proved in a later section, gives the needed control on the process $\Gamma$.

Lemma 3. Let $u_0$ be an $L^2$-valued random variable, measurable with respect to $\mathcal{F}_{t_0}$, with $\mathbb{E}|u_0|^{2p}_{L^2}$ finite for some fixed $p>0$. Then for any fixed $\epsilon>0$, there exists a random time $s_0(\epsilon,t_0,u_0)$ such that for $n\in\mathbb{Z}^+$,
\[
n > s_0 \implies |\Gamma(n;t_0,u_0) - \delta_0| < \epsilon.
\tag{19}
\]
Also,
1. if $p>1$ then $s_0$ is finite almost surely and
2. if $q\in(0,p-1)$ then $\mathbb{E}s_0^q$ is finite.
(The definition of $\delta_0$ was given at the beginning of Sect. 1.)
4. Proofs of Theorems 1, 2, 3 and Corollary 1

Proof of Theorem 1. Most of the work of this theorem is contained in the proof of Lemma 3. We set $\epsilon = \delta_0 - \delta$. By Lemma 3, there exists a random time $s_0(\epsilon)$ so that the condition in (19) holds. This implies that for all times $\tau > s_0(\epsilon)$, we have $\Gamma(\tau;t_0,u_0) > \delta$. Thus if we take $\overrightarrow{\tau}(\delta,t_0,u_0) = s_0(\epsilon)$, the estimate in (18) becomes the estimate given in the theorem. By Lemma 3, $\overrightarrow{\tau}(\delta,t_0,u_0)$ has the desired moments. □

Proof of Theorem 2. Without loss of generality we will take $t=0$. The letter $n$, with all of its various ornamentations, will always be $\alpha\mathbb{Z}$-valued. Similarly, $m$ will always be an integer. For $m\ge 0$, let $n_0(\omega) = \alpha T_{\mathrm{bound}}(\{u_0(\alpha m)\},\tfrac{\delta^2}{2}|\alpha m|)$ and $\tilde n_0(\omega) = \alpha T_{\mathrm{bound}}(\{\tilde u_0(\alpha m)\},\tfrac{\delta^2}{2}|\alpha m|)$. (The rescaling by $\alpha$ is necessary because the Bounding Lemma is written for sequences indexed by integers.) The definition of $T_{\mathrm{bound}}(\cdot,\cdot)$ is given at the start of the appendix; however, in words it is defined as the first integer moment of time such that the first sequence is smaller than the second sequence for all subsequent integer times. It is the nearest integer moment of time when the second sequence overtakes the first. Set $\delta_1 = \delta_0-\delta$ and $t_0(n) = s_0(\delta_1,n,u_0(n))$, where $s_0$ was defined in Lemma 3. Hence by Lemma 3, $\mathbb{E}\,t_0(n)^q < \infty$ for $q\in(0,p-1)$. Now set $n_1^* = T_{\mathrm{bound}}(\{t_0(n)\},|n|)$. By the first corollary to the Bounding Lemma contained in the appendix (Lemma 5), $\mathbb{E}(n_1^*)^q < \infty$ for $q\in(0,p-2)$. Define $\overleftarrow{n} = \max(n_0,\tilde n_0,n_1^*)$. Because $\max(X,Y)^p \le (X+Y)^p \le C_p(X^p+Y^p)$, $\overleftarrow{n}$ has all the same moments as $n_0$, $\tilde n_0$ and $n_1^*$. Thus, $\mathbb{E}\overleftarrow{n}^{\,q} < \infty$ for $q\in(0,p-2)$. Putting everything together and using the estimate (18), we have
\[
|u(0;n,u_0(n)) - u(0;n,\tilde u_0(n))|^2_{L^2} \le |u_0(n)-\tilde u_0(n)|^2_{L^2}\,e^{-\delta|n|} \le \delta^2|n|\,e^{-\delta|n|}
\]
for $n < n^*$. □

Proof of Corollary 1. Without loss of generality, we take $t_1 = 0$. As in the previous proof, the letter $n$ will always be $\alpha\mathbb{Z}$-valued. Similarly, $m$ will always be an integer. Define $u_0(n) = u(-n;-n-\alpha,0)$ for $n\in\alpha\mathbb{Z}$ with $n>0$. The sequence $u_0(n)$ forms a stationary sequence of random variables. By Lemma 2, all of the moments of $|u_0(n)|_{L^2}$ are uniformly bounded in $n$ because the initial conditions are deterministic. Now use Theorem 2 to compare the solution starting from $u_0(n)$ at time $-n$ and the solution starting from zero at time $-n$. Theorem 2 says that there exists an $\alpha\mathbb{Z}$-valued random variable $n^*$, with all moments finite, such that for $n''>n'>n^*>0$ and $\tau>0$,
\[
\begin{aligned}
|u(\tau;-n',0) - u(\tau;-n'',0)|_{L^2} &= \Big|\sum_{j=n'/\alpha}^{n''/\alpha-1} u(\tau;-\alpha j,0) - u(\tau;-\alpha(j+1),0)\Big|_{L^2}\\
&\le \sum_{j=n'/\alpha}^{n''/\alpha-1} |u(\tau;-\alpha j,0) - u(\tau;-\alpha(j+1),0)|_{L^2} \le \varepsilon e^{-\delta\tau}.
\end{aligned}
\]
□
For $m<0$, let $n_0^*(\omega) = \alpha T_{\mathrm{bound}}(\{u_0(\alpha m)\},\delta^2|\alpha m|)$. As in the proof of Theorem 2, the rescaling by $\alpha$ is necessary because the Bounding Lemma is written for sequences indexed by integers. Set $\delta_1 = \delta_0-\delta$ and $t_0(n) = s_0(\delta_1,n,u_0(n),0)$, where $s_0$ was defined in Lemma 3. Observe that it is also a stationary sequence of random variables. By Lemma 2, $\mathbb{E}|u_0(n)|^{2p}_{L^2}$ is finite for all $p\ge 1$ and $n\in\alpha\mathbb{Z}$. Hence by Lemma 3, all moments of $t_0(n)$ are finite. By the first corollary to the Bounding Lemma contained in the appendix (Lemma 5), $n_1^* = T_{\mathrm{bound}}(\{t_0(n)\},|n|)$ has all moments finite. Define $n^* = \min(n_0^*,n_1^*)$. Because $\max(X,Y)^p \le (X+Y)^p \le C_p(X^p+Y^p)$, $n^*$ has all moments finite since $n_0^*$ and $n_1^*$ do. Putting everything together and using the estimate (18), we have
\[
|u(0;n,0) - u(0;n-\alpha,0)|^2_{L^2} = |u(0;n,0) - u(0;n,u_0(n))|^2_{L^2} \le |u_0(n)|^2_{L^2}\,e^{-\delta|n|} \le \delta^2|n|\,e^{-\delta|n|}
\]
for $n<n^*$. And hence, for $n',n'' < -n^* < 0 < \tau$, we have the needed estimate
\[
|u(\tau;n',0) - u(\tau;n'',0)|^2_{L^2} \le e^{-\delta\tau}\sum_{n\in\alpha\mathbb{Z},\,n<-n^*} |u(0;n,0) - u(0;n-\alpha,0)|^2_{L^2} \le e^{-\delta\tau}\,\delta^2\int_0^\infty x e^{-\delta x}\,dx = e^{-\delta\tau}\,\delta^2\,\frac{1}{\delta^2} = e^{-\delta\tau}.
\]
□
5. Proof of Lemma 3

Controlling the process $\Gamma$ is contingent on controlling the time average it contains. By writing Eq. (7) in integral form with $p=1$, we obtain
\[
\begin{aligned}
\frac{2\nu}{\tau}\int_{t_0}^{t_0+\tau}|\Lambda u(s;t_0,u_0)|^2_{L^2}\,ds &= \frac{|u_0|^2_{L^2} - |u(t_0+\tau;t_0,u_0)|^2_{L^2}}{\tau} + E_0 + \frac{1}{\tau}\int_{t_0}^{t_0+\tau}\langle u(s;t_0,u_0),dW(s)\rangle_{L^2}\\
&\le E_0 + \frac{1}{\tau}\left(|u_0|^2_{L^2} + M(\tau;t_0,u_0)\right).
\end{aligned}
\tag{20}
\]
The process $M(\tau;t_0,u_0)$ is defined by
\[
M(\tau;t_0,u_0) \stackrel{\mathrm{def}}{=} \int_{t_0}^{t_0+\tau}\sum_k u_k(s)\sigma_k\,d\beta_k(s) = \int_{t_0}^{t_0+\tau}\langle u(s;t_0,u_0),dW(s)\rangle_{L^2}.
\]
First note that $M(\tau;t_0,u_0)$ is a martingale in the $\tau$ variable with expectation zero. To control $\Gamma$, we will need to control $M$. We will find the time needed for $M(\tau;t_0,u_0)$ to get within $\epsilon$ of its average value. We begin by computing the quadratic variation process of $M$. We have
\[
[M,M](\tau) = \int_{t_0}^{t_0+\tau}\sum_k u_k(s)^2\sigma_k^2\,ds \le \sigma^2_{\max}\int_{t_0}^{t_0+\tau}|u(s;t_0,u_0)|^2_{L^2}\,ds.
\]
We recall the Burkholder–Davis–Gundy Inequality (see [Dur96, Pro90]).
Theorem (Burkholder–Davis–Gundy). Let $X(t)$ be a local martingale with $X(0)=0$. For any $0<p<\infty$, there exist constants $0<c,C<\infty$ such that
\[
c\,\mathbb{E}\left([X,X](t)^{p/2}\right) \le \mathbb{E}\left(\sup_{0\le s\le t} X(s)\right)^{p} < C\,\mathbb{E}\left([X,X](t)^{p/2}\right).
\]
In the case of $[M,M](t)$, the Burkholder–Davis–Gundy inequality gives
\[
\begin{aligned}
\mathbb{E}\left(\sup_{0\le s\le\tau} M(s)\right)^{2p} &\le C\,\mathbb{E}\left([M,M](\tau)\right)^{p}\\
&\le C\tau^{p-1}\int_{t_0}^{t_0+\tau}\mathbb{E}|u(s;t_0,u_0)|^{2p}_{L^2}\,ds\\
&\le C\tau^{p}\,E_p^{\max}(t_0,\mathbb{E}|u_0|^2_{L^2},\mathbb{E}|u_0|^4_{L^2},\ldots,\mathbb{E}|u_0|^{2p}_{L^2}).
\end{aligned}
\tag{21}
\]
The second to last inequality comes from Hölder's inequality and the last one from the estimates on the moments of the energy from Sect. 2. Define $M_n = \sup_{n-1<s\le n}|M(s)|$. By (21) and Chebyshev's inequality,
\[
\mathbb{P}(M_n > \epsilon n^{\delta}) \le \frac{C E_p^{\max}\,n^{p}}{\epsilon^{2p}n^{2p\delta}} \le \frac{C'(\mathbb{E}|u_0|^2_{L^2},\ldots,\mathbb{E}|u_0|^{2p}_{L^2})}{\epsilon^{2p}n^{2p\delta-p}}.
\tag{22}
\]

Lemma 4. Let $u_0(\omega)$ be a random variable, measurable with respect to $\mathcal{F}_{t_0}$, such that $\mathbb{E}|u_0|^{2p}_{L^2}$ is finite.
1. If $p>1$, then $T_{\mathrm{bound}}(\{M_n\},\epsilon n)$ is finite almost surely.
2. $\mathbb{E}\,T_{\mathrm{bound}}(\{M_n\},\epsilon n)^q$ is finite for $q\in(0,p-1)$.

Proof. We apply the Bounding Lemma (Lemma 5 from the appendix) using the estimate given in (21). The condition to be almost surely finite translates to $2p > 1+p$, which implies $p>1$. The condition on the moments translates to $2p > 1+p+q$, which implies $p-q>1$. □

We are now in a position to prove Lemma 3.

Proof of Lemma 3. Recalling (20) and the definition of $\Gamma$, we have
\[
\Gamma(\tau;t_0,u_0) \ge \nu\lambda_1 - \frac{\gamma}{2\nu^2}\left[E_0 + \frac{1}{\tau}\left(|u_0|^2_{L^2} + M(\tau;t_0,u_0)\right)\right].
\]
Let $M_n(t_0) = \sup_{n-1<s\le n}|M(s;t_0,u_0)|$ [...]
\[
s_M(\epsilon,\omega;t_0) = T_{\mathrm{bound}}(\{M_n\},\epsilon n),
\]
\[
s_0(\epsilon) = \max\left(s_u(\epsilon/2),\,s_M(\epsilon/2)\right).
\]
$s_0$ has the desired property, namely
\[
n > s_0 \implies |\Gamma(n;t_0,u_0) - \delta_0| < \frac{1}{n}\left[\frac{\epsilon}{2}n + \frac{\epsilon}{2}n\right] = \epsilon.
\]
All that remains is to show that this definition of $s_0$ is finite almost surely and has finite moments up to order $p-1$. It is sufficient that $s_u$ and $s_M$ be almost surely finite and have finite moments up to order $p-1$. For $s_u$, this follows from Corollary 4 in the appendix. For $s_M$, this follows from Lemma 4. □
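The random times used in this proof are instances of $T_{\mathrm{bound}}$, whose definition is given at the start of the appendix. For a finite sample of a sequence the definition can be evaluated directly. The following is a minimal sketch with made-up sample values; the function name `t_bound` is hypothetical, and a finite sample can of course only approximate the almost-sure statement.

```python
def t_bound(xs, f):
    """For a finite sample X_1..X_N (1-based indices), return the smallest
    positive integer T such that |X_m| < f(m) for every T < m <= N."""
    T = 1
    for m, x in enumerate(xs, start=1):
        if abs(x) >= f(m):
            T = m   # the bound fails at index m, so T cannot be smaller than m
    return T

# Made-up sample and the linear comparison sequence f(m) = eps * m.
eps = 0.5
xs = [5.0, 1.0, 4.0, 0.5, 0.2, 3.5, 0.1, 0.3]
print(t_bound(xs, lambda m: eps * m))
```

Here the bound $|X_m| < \epsilon m$ fails for the last time at $m=6$, so the function returns 6; after that index the sample stays below the comparison sequence.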
Conclusion

We have shown that for a fixed realization of the noise, solutions starting from different initial conditions converge to the same distinguished trajectory. This implies that the asymptotic behavior depends only on the behavior of this distinguished solution and not on the initial conditions. In particular, this proves that there exists a unique invariant measure. The long term statistics of the problem are reduced to the dynamics of this solution.

Acknowledgement. The author would like to thank René Carmona, Erhan Çınlar, Fabrice Planchon, and Ya. Sinai for useful discussions. The author also thanks a careful referee who pointed out a number of inadequacies in an earlier draft and generally improved the author's understanding of his own work. Lastly, after the submission of this note, the author became aware of [Sch96], which contains results similar to Theorem 1, though in a slightly different setting and from a different point of view.
Appendix

Let $\{X_n\}$ be a sequence of real random variables indexed by $n$. Let $f : \mathbb{Z}^{+} \to \mathbb{R}^{+}$. Define the random variable $T_{\mathrm{bound}}(\{X_n\}, f, \omega)$ to be the smallest positive integer such that for almost every $\omega$,
$$m > T_{\mathrm{bound}}(\{X_n\}, f, \omega) \implies |X_m(\omega)| < f(m).$$
For a single random variable $X$ define $T_{\mathrm{bound}}(X, f, \omega) = \sup\{n : |X| > f(n)\}$.

Lemma 5 (Bounding Lemma). Assume that
$$P(|X_n| \ge \epsilon n^{\delta}) \le \frac{E|X_n|^{p}}{\epsilon^{p} n^{p\delta}} \le \frac{C}{\epsilon^{p} n^{p\delta - r}}$$
for some $\epsilon, \delta, p, C > 0$ and $r \ge 0$.
1. If $p\delta > 1 + r$, then $T_{\mathrm{bound}}(\{X_n\}, \epsilon n^{\delta}) < \infty$ a.s.
2. $E[T_{\mathrm{bound}}(\{X_n\}, \epsilon n^{\delta})]^{q}$ is finite for $q \in (0, p\delta - (1 + r))$.

Proof. In light of Chebyshev's inequality, the sum $\sum_n P(|X_n| > \epsilon n^{\delta})$ is finite. Thus by the first Borel–Cantelli Lemma, there exists a random variable $n^{*}(\omega)$, which is almost surely finite, such that $n > n^{*} \implies |X_n| < \epsilon n^{\delta}$ a.s. To prove the second statement, we observe that
$$E(n^{*})^{q} = \sum_{n=1}^{\infty} n^{q} P(n^{*} = n) \le \sum_{n=1}^{\infty} n^{q} P(|X_n| \ge \epsilon n^{\delta}) \tag{23}$$
$$\le \sum_{n=1}^{\infty} \frac{C}{\epsilon^{p} n^{p\delta - (r + q)}}. \tag{24}$$
The first estimate hinges on the fact that $n^{*}$ was the smallest integer such that for all greater integers $n$, $|X_n| < \epsilon n^{\delta}$. To conclude, note that the final sum converges if $p\delta - (r + q) > 1$. □

The following two corollaries are specializations of the above lemma.
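Before turning to the corollaries, the exponent bookkeeping in the Bounding Lemma can be illustrated with a short Python sketch. It is not part of the original argument: the function names `bounding_exponents`, `t_bound_sample` and `majorant_partial_sum` are hypothetical, and `t_bound_sample` only inspects a finite sample, so it is a stand-in for $T_{\mathrm{bound}}$ rather than the random variable itself.

```python
def bounding_exponents(p, delta, r):
    """Return (almost_surely_finite, q_max) for the Bounding Lemma.

    The lemma needs p*delta > 1 + r for a.s. finiteness, and moments of
    order q are finite for q in (0, p*delta - (1 + r)).
    """
    margin = p * delta - (1 + r)
    return margin > 0, max(margin, 0)


def t_bound_sample(xs, eps, delta):
    """Last index n (1-based) with |x_n| >= eps * n**delta in a finite sample.

    Returns 0 if the bound is never violated in the sample.
    """
    last = 0
    for n, x in enumerate(xs, start=1):
        if abs(x) >= eps * n ** delta:
            last = n
    return last


def majorant_partial_sum(p, delta, r, q, eps=1.0, C=1.0, N=10_000):
    """Partial sum of the majorant series in (24), up to n = N."""
    s = p * delta - (r + q)
    return sum(C / (eps ** p * n ** s) for n in range(1, N + 1))


# Lemma 4 applies the Bounding Lemma with exponent 2p, delta = 1 and r = p.
# Taking p = 3 as an example, moments are finite up to order p - 1 = 2.
print(bounding_exponents(2 * 3, 1, 3))
print(t_bound_sample([5, 0.1, 3, 0.2, 0.1], eps=1.0, delta=1.0))
```

With $p = 1$ in Lemma 4 the margin vanishes, which matches the requirement $p > 1$ stated there.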
Corollary 3. Let $\{Y_n\}$ be a family of random variables for which $E|Y_n|^{p} \le C < \infty$ for all $n$ (in particular, $\{Y_n\}$ could be a stationary sequence with $E|Y_n|^{p}$ finite).
1. If $\delta > \frac{1}{p}$, then $T_{\mathrm{bound}}(\{Y_n\}, \epsilon n^{\delta})$ is finite almost surely.
2. Let $q > 0$. If $\delta > \frac{q+1}{p}$, then $E[T_{\mathrm{bound}}(\{Y_n\}, \epsilon n^{\delta})]^{q}$ is finite.

Proof. By Chebyshev's inequality and the bound on $E|Y_n|^{p}$, $P(|Y_n| > \epsilon n^{\delta}) \le E|Y_n|^{p} / (\epsilon^{p} n^{p\delta}) \le C / (\epsilon^{p} n^{p\delta})$. This estimate satisfies the conditions of the above lemma with $r = 0$. The conclusion follows from the lemma. □

Corollary 4. Let $X$ be a random variable such that $E|X|^{p}$ is finite.
1. If $\delta > \frac{1}{p}$, then $T_{\mathrm{bound}}(X, \epsilon n^{\delta})$ is finite almost surely.
2. Let $q > 0$. If $\delta > \frac{q+1}{p}$, then $E[T_{\mathrm{bound}}(X, \epsilon n^{\delta})]^{q}$ is finite.

Proof. This is just a specialization of the above corollary. □

References

[CF88] Constantin, Peter and Foiaş, Ciprian: Navier-Stokes Equations. Chicago: University of Chicago Press, 1988
[CF94] Crauel, Hans and Flandoli, Franco: Attractors for random dynamical systems. Probability Theory and Related Fields 100, 365–393 (1994)
[DM82] Dellacherie, Claude and Meyer, Paul-André: Probabilities and potential. B. Theory of martingales. Volume 72 of North-Holland Mathematics Studies. Amsterdam–New York: North-Holland Publishing Co., 1982
[DPZ92] Da Prato, Giuseppe and Zabczyk, Jerzy: Stochastic Equations in Infinite Dimensions. Cambridge: Cambridge University Press, 1992
[DPZ96] Da Prato, Giuseppe and Zabczyk, Jerzy: Ergodicity for Infinite Dimensional Systems. Cambridge: Cambridge University Press, 1996
[Dur96] Durrett, Richard: Stochastic Calculus, A practical introduction. CRC Press, 1996
[EKMS98] E, W., Khanin, K., Mazel, A., Sinai, Ya.: Burgers Equation with Random Forcing. Submitted to The Annals of Mathematics, Princeton University Press, 1998
[EFNT94] Eden, A., Foias, C., Nicolaenko, B. and Temam, R.: Exponential Attractors for dissipative Evolution equations. Research in Applied Mathematics. John Wiley and Sons and Masson, 1994
[Fer97] Ferrario, Benedetta: Ergodic results for stochastic Navier–Stokes equation. Stochastics and Stochastics Reports 60 (3–4), 271–288 (1997)
[FG95] Flandoli, Franco and Gatarek, Dariusz: Martingale and stationary solutions for stochastic Navier–Stokes equations. Probability Theory and Related Fields 102, 367–391 (1995)
[Fla94] Flandoli, Franco: Dissipativity and invariant measures for stochastic Navier–Stokes equations. NoDEA 1, 403–426 (1994)
[FM95] Flandoli, Franco and Maslowski, B.: Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Commun. Math. Phys. 171, 119–141 (1995)
[Kun90] Kunita, Hiroshi: Stochastic Differential Equations. Cambridge: Cambridge University Press, 1990
[Mat98] Mattingly, Jonathan C.: The Stochastically forced Navier–Stokes equations: Energy estimates and phase space contraction. PhD thesis, Princeton University, 1998
[Pro90] Protter, Philip: Stochastic Integration and Differential Equations: A new approach. Berlin–Heidelberg–New York: Springer-Verlag, 1990
[Sch96] Schmalfuß, Björn: A random fixed point theorem based on Lyapunov exponents. Random & Computational Dynamics 4, 257–268 (1996)
[Tem79] Temam, Roger: Navier-Stokes equations: Theory and numerical analysis. Volume 2 of Studies in Mathematics and its Applications. Amsterdam–New York: North-Holland Publishing Co., revised edition, 1979
[Tem88] Temam, Roger: Infinite Dimensional Dynamical Systems in Mechanics and Physics. New York: Springer-Verlag, 1988

Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 289 – 335 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Effective Interactions Due to Quantum Fluctuations

Roman Kotecký1,2,⋆, Daniel Ueltschi3,⋆⋆
1 Center for Theoretical Study, Charles University, Jilská 1, 110 00 Praha 1, Czech Republic
2 Department of Theoretical Physics, Charles University, V Holešovičkách 2, 180 00 Praha 8, Czech Republic.
E-mail: [email protected]
3 Institut de Physique Théorique, EPF Lausanne, CH-1015 Lausanne, Switzerland
Received: 28 April 1998 / Accepted: 19 March 1999

⋆ Partly supported by the grants GAČR 202/96/0731 and GAUK 96/272.
⋆⋆ Present address: Department of Mathematics, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854-8019, USA. E-mail: [email protected]

Abstract: A class of quantum lattice models is considered, with Hamiltonians consisting of a classical (diagonal) part and a small off-diagonal part (e.g. hopping terms). In some cases when the classical part has an infinite degeneracy of ground states, the quantum perturbation may stabilize some of them. The mechanism of this stabilization stems from effective potential created by the quantum perturbation. Conditions are found when this strategy can be rigorously controlled and the low temperature phase diagram of the full quantum model can be proven to be a small deformation of the zero temperature phase diagram of the classical part with the effective potential added. As illustrations we discuss the asymmetric Hubbard model and the Bose–Hubbard model.

Contents
1. Introduction
2. Assumptions and Statements
2.1 Classical Hamiltonian with quantum perturbation
2.2 The effective potential
2.3 Stability of the dominant states
2.4 Characterization of stable phases
2.5 Phase diagram
3. Examples
3.1 The asymmetric Hubbard model
3.2 The Bose–Hubbard model
4. Contour Representation of a Quantum Model
5. Exponential Decay of the Weight of the Contours
6. Expectation Values of Local Observables and Construction of Pure States
A. General Expression for the Effective Potential
1. Introduction

Physics of a large number of quantum particles at equilibrium is very interesting and difficult at the same time. Interesting, because it treats such macroscopic phenomena as magnetization, crystallisation, superfluidity or superconductivity. And difficult, because their study has to combine Quantum Mechanics and Statistical Physics. A natural approach is to decrease difficulties arising from this combination by starting from only one aspect. Thus one can use only Quantum Mechanics and treat the particles first as independent, trying next to add small interactions. In the present paper we are concerned with the other approach. Namely, to start with a model treated by Classical Statistical Physics, adding next a small quantum perturbation. Another simplification is to consider lattice systems (going back to a physical justification for the modeling process, we can invoke applications to condensed matter physics). Quantum systems studied here have Hamiltonians consisting of two terms. The first term is a classical interaction between particles; formally, this operator is a “function” of the position operators of the particles and it is diagonal with respect to the corresponding basis in occupation numbers. The second term is an off-diagonal operator that we suppose to be small with respect to the interaction. A typical example for this is a hopping matrix. The aim of the paper is to show that a new effective interaction appears that is due to the combination of the potential and the kinetic term. An explicit formula is computed, and sufficient conditions are given in order that the low temperature behaviour is controlled by the sum of the original diagonal interaction and the effective potential. To be more precise, it is rigorously shown that the phase diagram of the original quantum model is only a small perturbation of the phase diagram of a classical lattice model with the effective interaction.
Thus, we will start by recalling some standard ideas of Classical Statistical Mechanics of lattice systems. The Peierls argument for proving the occurrence of a first order phase transition in the Ising model [Pei,Dob,Gri] marks the beginning of the perturbative studies of the low temperature regimes of classical lattice models. Partition functions and expectation values of observables may be expanded with respect to the excitations on top of the ground states, interpreting the excitations in geometric terms as contours. These ideas and methods are referred to as the Pirogov–Sinai theory; they were first introduced in [PS,Sin] and later further extended [Zah,BI,BS]. The intuitive picture is that a low temperature phase is essentially a ground state configuration with small excitations. A phase is stable whenever it is improbable to install a large domain with another phase inside. For such an insertion one has to pay on its boundary, which is excited (two phases are separated by excitations), but, on the other hand, one may gain on its volume if its metastable free energy (its ground energy minus the contribution of small thermal fluctuations) is smaller than the one of the external phase. It is important to take into account the fluctuations since they can play a role in determining which phase is dominant. A standard example here is the Blume-Capel model with an external field slightly favouring the “+1” phase; at low temperatures, the “0” phase may still be selected because it has more low energy excitations (the theory of such dominant states chosen by thermal fluctuations may be found in [BS]).
The partition function of a quantum system $\mathrm{Tr}\, e^{-\beta H}$ may be expressed using the Duhamel expansion (or Trotter formula), yielding a classical contour model in a space with one more (continuous) dimension. If the corresponding classical model (the diagonal part only) has stable low temperature phases, and if the off-diagonal terms of the Hamiltonian are small, the contours have low probability of occurrence and it is possible to extend the Peierls argument to quantum models [Gin]. More generally, one can formulate a “Quantum Pirogov–Sinai theory” [BKU1,DFF1], in order to establish that (i) low temperature phases are very close to ground states of the diagonal interaction (more precisely: the density matrix $\frac{1}{Z} e^{-\beta H}$ is close to the projection operator $|g\rangle\langle g|$, where $|g\rangle$ is the ground state of the diagonal interaction only) and (ii) low temperature phase diagrams are small deformations of zero temperature phase diagrams of the interactions.

So far we have only discussed the case when the effect of the quantum perturbation is small, and the features of the phases are due to the classical interaction between the particles. It may happen, however, that the classical interaction alone is not sufficient to choose the low temperature behaviour. This is the case in the two models we introduce now and use later for illustration of our general approach.

• The asymmetric Hubbard model. It describes hopping spin-$\frac{1}{2}$ particles on a lattice $\Lambda \subset \mathbb{Z}^{\nu}$. A basis of its Hilbert space is indexed by classical configurations $n \in \{0, \uparrow, \downarrow, 2\}^{\Lambda}$, and the Hamiltonian is
$$H = - \sum_{\|x-y\|_2 = 1} \sum_{\sigma = \uparrow, \downarrow} t_\sigma\, c_{x\sigma}^{\dagger} c_{y\sigma} + U \sum_{x} n_{x\uparrow} n_{x\downarrow} - \mu \sum_{x} (n_{x\uparrow} + n_{x\downarrow}) \tag{1.1}$$
(the hopping parameter $t_\sigma$ depends on the spin of the particle). In the atomic limit $t_\uparrow = t_\downarrow = 0$ the ground states are all the configurations with exactly one particle at each site. The degeneracy equals $2^{|\Lambda|}$, which means that it has nonvanishing residual entropy at zero temperature. The case $t_\uparrow \neq t_\downarrow = 0$ corresponds to the Falicov–Kimball model (see [GM]); in this case, spin-↓ electrons behave as classical particles. Here, we shall consider the strongly asymmetric Hubbard model, with $U \gg t_\uparrow \gg t_\downarrow$.

• The Bose–Hubbard model. We consider bosons moving on a lattice $\Lambda \subset \mathbb{Z}^{2}$. They interact through on-site, nearest neighbour and next nearest neighbour repulsive potentials. A basis of its Hilbert space is the set of all configurations $n \in \mathbb{N}^{\Lambda}$, and its Hamiltonian is
$$H = -t \sum_{\|x-y\|_2 = 1} a_x^{\dagger} a_y + U_0 \sum_{x} (n_x^2 - n_x) + U_1 \sum_{\|x-y\|_2 = 1} n_x n_y + U_2 \sum_{\|x-y\|_2 = \sqrt{2}} n_x n_y - \mu \sum_{x} n_x. \tag{1.2}$$
For $U_0 > 4U_1 - 4U_2$ and $U_1 > 2U_2$, and if $0 < \mu < 8U_2$, the ground states of the potential part are those generated by $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, i.e. any configuration with alternatively a ferromagnetic and an empty line is a ground state (and similarly in the other direction); see Fig. 2 in Sect. 3. The degeneracy is of the order $2^{\frac{1}{2} |\Lambda|^{1/2}}$ (if $\Lambda$ is a square), so there is no residual entropy. Actually, we shall add to (1.2) a generalized hard-core condition that prevents more than $N$ bosons from being present at the same site; this condition has technical motivations, and does not change the physics of the model.
In these two situations, the smallest quantum fluctuations yield an effective interaction, and this interaction stabilizes phases displaying long-range order (there is neither superfluidity nor superconductivity).
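The degeneracy count quoted for the atomic limit of the asymmetric Hubbard model can be checked by brute force on a few sites. The following Python sketch is only an illustration: the function names and the numerical values of `U` and `mu` are assumptions chosen so that $0 < \mu < U$ (the regime in which one particle per site is optimal), and with $t_\uparrow = t_\downarrow = 0$ the sites decouple, so no lattice geometry is needed.

```python
from itertools import product

STATES = ("0", "up", "down", "2")  # empty, spin up, spin down, doubly occupied


def site_energy(state, U, mu):
    """On-site energy U*n_up*n_down - mu*(n_up + n_down) from (1.1) with t = 0."""
    n_up = state in ("up", "2")
    n_down = state in ("down", "2")
    return U * n_up * n_down - mu * (n_up + n_down)


def ground_states(num_sites, U, mu):
    """Enumerate all configurations and return (ground energy, minimisers)."""
    best, minimisers = None, []
    for cfg in product(STATES, repeat=num_sites):
        energy = sum(site_energy(s, U, mu) for s in cfg)
        if best is None or energy < best - 1e-12:
            best, minimisers = energy, [cfg]
        elif abs(energy - best) <= 1e-12:
            minimisers.append(cfg)
    return best, minimisers


# Illustrative values with 0 < mu < U, on four sites.
energy, states = ground_states(4, U=4.0, mu=1.0)
print(energy, len(states))
```

Every minimiser found this way has exactly one particle per site, and their number is $2^{|\Lambda|}$, in line with the residual entropy mentioned above.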
Besides the low temperature Gibbs states, the effective potential may have an influence in situations with interfaces; it has been shown in [DMN] that rigid 100 and 111 interfaces occur in the Falicov–Kimball model at low temperature. In the case where classical and quantum particles are mixed in one model, like the Falicov–Kimball model, a method using the Peierls argument was proposed by Kennedy and Lieb [KL]; it was extended in [LM] to situations that are not covered by the present paper, namely to cases of such mixed systems with continuous classical variables. Results very similar to ours have already been obtained by Datta, Fernández, Fröhlich and Rey-Bellet [DFFR]. Their approach is different, however. Starting from a Hamiltonian $H(\lambda) = H^{(0)} + \lambda V$, $H^{(0)}$ being a diagonal operator with infinitely many ground states, and $V$ the quantum perturbation, the idea is to choose an antisymmetric matrix $S = \lambda S^{(1)} + \lambda^2 S^{(2)}$ in such a way that the operator $H^{(2)}(\lambda) = e^{S} H(\lambda)\, e^{-S}$, expanded with the help of Lie-Schwinger series, turns out to be diagonal, up to terms of order $\lambda^3$ or higher. If the diagonal part of $H^{(2)}$ has a finite number of ground states and the excitations cost strictly positive energy, it can be shown that the ground states are stable. It is possible to include higher orders in this perturbation scheme (see [DFFR]). In fact, our first intention was to study the stability of the results of [BS] with respect to a quantum perturbation, and we began the present study as a warm-up and the first simple step towards this goal. This simple step turned out however to be rather involved. Even though, at the end, the paper contains results similar to those of [DFFR], we think that the subject is important enough to justify an alternative approach, and that there are some advantages in an explicit formula for the effective potential and sufficient conditions for it to control the low temperature behaviour that may be useful in explicit applications.
The intuitive background of this paper owes much to the work of Bricmont and Slawny [BS] discussing the situation with infinite degeneracy of ground states, where only a finite number of ground states is dominating as a result of thermal fluctuations, and to the paper of Messager and Miracle-Solé [MM] which was useful to understand the structure of the quantum fluctuations. Having expanded the partition function $\mathrm{Tr}\, e^{-\beta H}$ using the Duhamel formula and having defined quantum contours as excitations with respect to a well chosen classical configuration, we identify the smallest quantum contours (that we call loops). Given a set of big quantum contours, we can replace the sum over sets of loops by an effective interaction acting on the quantum configurations without loops. This effective interaction is long-range, but decays exponentially quickly with respect to the distance. This allows, for a class of models, to have an explicit control on the approximation given by the effective interaction allowing to prove rigorous statements about the behaviour of the original quantum model.

An important model that does not fall into the class of models we can treat is the (symmetric) Hubbard model. Take $U = 1$ and $t_\uparrow = t_\downarrow = t$ in (1.1). Computing the effective potential stemming from one transition of a particle to a neighbouring site and back, we find an antiferromagnetic interaction of strength $t^2$. On the other hand, it is possible to make two transitions as a result of which the spins of nearest neighbours are interchanged,
$$|n_x, n_y\rangle = |\downarrow, \uparrow\rangle = - c_{x\downarrow}^{\dagger} c_{y\downarrow} c_{y\uparrow}^{\dagger} c_{x\uparrow} |\uparrow, \downarrow\rangle.$$
It turns out that this brings the factor $t^2$, which is of the same order as the strength of the effective interaction. In this case we cannot ensure the stability of the phases selected by the effective potential – we would need a stronger effective interaction. Otherwise the system jumps easily from a configuration with one particle per site to another such configuration, i.e. from a classical ground state to another classical ground state. We call
quantum instability this property of the system. In the Hubbard model it is a manifestation of a continuous symmetry of the system, namely the rotation invariance.

In Sect. 2 the ideas discussed above are introduced with precise definitions. The effective potential is written down in Sect. 2.2 – actually, we restrict here to lowest orders; the general formula is not that pleasant, and is therefore hidden in the appendix. The results of the paper are summarized in Theorems 2.2 (a characterization of stable pure phases) and 2.3 (the structure of the phase diagram); experts will recognize standard formulations of Pirogov–Sinai theory. Taking into account that our aim is to describe in a rigorous way the behaviour of a quantum system, some care must be given to the introduction of stable phases. We define them with the help of an external field perturbation of the state constructed with periodic boundary conditions. In Sect. 3 we apply the results to our two illustrative examples. The rest of the paper is devoted to the construction of a contour representation (Sect. 4), the proof of the exponential decay of the weights of the contours (Sect. 5), and, finally, the proofs of our claims with the help of contour expansions of the expectation values of local observables and the standard Pirogov–Sinai theory (Sect. 6).

Let us end this introduction by noting that given a model which enters our setting, it is not a straightforward task to apply our theorems. One still has to separate the correct leading orders that determine the behaviour of the effective interaction. This situation has the utmost advantage that it should bring much more pleasure to users, since the most interesting part of the job remains to be done – to get intuition and to understand how the system behaves.

2. Assumptions and Statements

2.1. Classical Hamiltonian with quantum perturbation. Let $\mathbb{Z}^{\nu}$, $\nu \ge 2$, be the hypercubic lattice. We use $|x - y| := \|x - y\|_\infty$ to denote the distance between two sites $x, y \in \mathbb{Z}^{\nu}$. $\Omega$ is the finite state space of the system at site $x = 0$, $|\Omega| = S < \infty$. Our standard setting will be to consider the system on a finite torus $\Lambda = (\mathbb{Z}/L\mathbb{Z})^{\nu}$ (i.e. a finite hypercube with periodic boundary conditions). With a slight abuse of notation we identify $\Lambda$ with a subset of $\mathbb{Z}^{\nu}$ and always assume that it is sufficiently large (to surpass the range of considered finite range interactions). A classical configuration $n_\Lambda$ (occasionally we suppress the index and denote it $n$) is an element of $\Omega^{\Lambda}$. If $A \subset \Lambda$, the restriction of $n_\Lambda$ to $A$ is also denoted by $n_A$. $\mathcal{H}_\Lambda$ is the (finite-dimensional) Hilbert space spanned by the classical configurations, i.e. the set of vectors
$$|v\rangle = \sum_{n_\Lambda} a_{n_\Lambda} |n_\Lambda\rangle, \qquad a_{n_\Lambda} \in \mathbb{C},$$
with the scalar product
$$\langle v | v' \rangle = \sum_{n_\Lambda} a_{n_\Lambda}^{*} a'_{n_\Lambda}.$$
Given two configurations $n_A \in \Omega^{A}$ and $n'_{A'} \in \Omega^{A'}$, with $A \cap A' = \emptyset$, it is convenient to define $n_A n'_{A'} \in \Omega^{A \cup A'}$ to be the configuration coinciding with $n_A$ on $A$ and with $n'_{A'}$ on $A'$.

The Hamiltonian is a sum of two terms, $H_\Lambda = V_\Lambda + T_\Lambda$. The former is the quantum equivalent of a classical interaction, the latter is the quantum perturbation – the notation was chosen such because we have in mind models where $V$ represents the potential
energy of quantum particles, that is diagonal in the basis of occupation number operators, and $T$ represents the kinetic energy. It helps considerably to assume that $V_\Lambda$ is the quantum equivalent of a classical “block interaction”, that is, an interaction that has support on blocks of a given size in $\mathbb{Z}^{\nu}$. More precisely, let $R_0 \in \frac{1}{2}\mathbb{N}$ be the range of the interaction, and $U_0(x)$ be the $R_0$-neighbourhood of $x \in \mathbb{Z}^{\nu}$:
$$U_0(x) = \begin{cases} \{ y \in \mathbb{Z}^{\nu} : |y - x| \le R_0 \} & \text{if } R_0 \in \mathbb{N}, \\ \{ y \in \mathbb{Z}^{\nu} : |y - (x_1 + \tfrac{1}{2}, \dots, x_\nu + \tfrac{1}{2})| \le R_0 \} & \text{otherwise.} \end{cases} \tag{2.1}$$
When $R_0$ is half-integer, $U_0(x)$ is a block of integer size $2R_0 \times \cdots \times 2R_0$ whose center is at distance $\frac{1}{2}$ from $x$. Then we assume the following structure for $V_\Lambda$.

Assumption 1 (Classical Hamiltonian). There exists a classical periodic block interaction $\Phi$ of range $R_0$ (i.e. a collection of functions $\Phi_x : \Omega^{U_0(x)} \to \mathbb{R} \cup \{\infty\}$, $x \in \mathbb{Z}^{\nu}$) and period $\ell_0$ such that
$$V_\Lambda |n_\Lambda\rangle = \sum_{x \in \Lambda} \Phi_x(n_{U_0(x)})\, |n_\Lambda\rangle$$
for any torus $\Lambda \subset \mathbb{Z}^{\nu}$ of side $L$ that is a multiple of $\ell_0$ and any $n_\Lambda \in \Omega^{\Lambda}$.

Let us suppose that a fixed collection of reference local configurations $G_0(x) \subset \Omega^{U_0(x)}$ is given, for all sites of $\mathbb{Z}^{\nu}$.¹ Let $G_A = \{ g_A \in \Omega^{A} : g_{U_0(x)} \in G_0(x) \text{ for all } U_0(x) \subset A \}$, $A \subset \mathbb{Z}^{\nu}$, and $G = G_{\mathbb{Z}^{\nu}}$. Finally, we set
$$\bar{A} = \bigcup_{U_0 \cap A \neq \emptyset} U_0 = \{ y : \mathrm{dist}(y, A) \le 2R_0 \}. \tag{2.2}$$
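The block $U_0(x)$ of (2.1) and the identity in (2.2) can be checked on small examples. The Python sketch below is an illustration only: the names `block_U0`, `closure` and `sup_ball` are hypothetical, distances are taken in the sup-norm as defined above, and $\mathrm{dist}(y, A)$ is assumed to mean the minimum of $|y - a|$ over $a \in A$.

```python
import math
from fractions import Fraction
from itertools import product


def block_U0(x, R0):
    """Sites of U_0(x) in (2.1); x is a tuple of ints, R0 an int or half-integer."""
    R0 = Fraction(R0)
    # Integer range: centred at x. Half-integer range: centred at x + (1/2, ..., 1/2).
    shift = Fraction(0) if R0.denominator == 1 else Fraction(1, 2)
    axes = []
    for c in x:
        centre = c + shift
        axes.append(range(math.ceil(centre - R0), math.floor(centre + R0) + 1))
    return set(product(*axes))


def closure(A, R0):
    """Union of all blocks U_0(x) that intersect A, i.e. the left side of (2.2)."""
    A = set(A)
    nu = len(next(iter(A)))
    reach = math.ceil(Fraction(R0)) + 1  # blocks further away cannot touch A
    out = set()
    for a in A:
        for d in product(range(-reach, reach + 1), repeat=nu):
            x = tuple(ai + di for ai, di in zip(a, d))
            block = block_U0(x, R0)
            if block & A:
                out |= block
    return out


def sup_ball(A, radius):
    """All sites within sup-norm distance `radius` of the set A."""
    nu = len(next(iter(A)))
    return {tuple(ai + di for ai, di in zip(a, d))
            for a in A
            for d in product(range(-radius, radius + 1), repeat=nu)}


A = {(0, 0), (3, 1)}
print(closure(A, 1) == sup_ball(A, 2))
print(closure(A, Fraction(1, 2)) == sup_ball(A, 1))
```

In this sketch a half-integer range $R_0 = \frac{1}{2}$ gives a block with two sites per side, namely $x$ and its neighbours in the positive coordinate directions.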
We assume that the local energy gap of excitations is uniformly bounded from below, while the spread of local energies of reference states is not too big (Fig. 1).

Fig. 1. Illustration for Assumption 2. The image of $\Phi_x$ decomposes into two sets, the values on $G_0(x)$ and the values on $\Omega^{U_0(x)} \setminus G_0(x)$, separated by a gap $\Delta_0$; the spread of the set of small values is bounded by $\delta_0$
Assumption 2 (Energy gap for classical excitations). There exist constants $\Delta_0 > 0$ and $\delta_0 < \infty$ such that:
• For any $x \in \mathbb{Z}^{\nu}$ and any $n_{U_0(x)} \notin G_0(x)$, one has the lower bound
$$\Phi_x(n_{U_0(x)}) - \max_{g_{U_0(x)} \in G_0(x)} \Phi_x(g_{U_0(x)}) \ge \Delta_0, \tag{2.3}$$
• and,
$$\max_{g_{U_0(x)},\, g'_{U_0(x)} \in G_0(x)} \Big[ \Phi_x(g_{U_0(x)}) - \Phi_x(g'_{U_0(x)}) \Big] \le \delta_0. \tag{2.4}$$

¹ In some situations $G_0(x)$ is simply the set of all ground configurations of $\Phi_x$. When discussing the full phase diagram, however, we will typically extend the interaction $\Phi_x$ to a class of interactions by adding certain “external fields”. The set $G_0(x)$ then will actually play the role of ground states of the interaction with a particular value of external fields (the point of maximal coexistence of ground state phase diagram).
For later purpose, we note the following consequence of Assumption 2.

Property. Let $\Phi$ satisfy Assumption 2, $R$ be such that $R^{\nu} \le \Delta_0 / \delta_0$, and $A \subset \mathbb{Z}^{\nu}$ with $\mathrm{diam}\, A \le R$. Then any pair of configurations $g_A \in G_A$ and $n_A \notin G_A$ satisfies the lower bound
$$\sum_{x,\, U_0(x) \subset A} \Big[ \Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \Big] \ge R^{-\nu} \Delta_0. \tag{2.5}$$

Proof. Since $n_A \notin G_A$, there exists at least one site $x$, $U_0(x) \subset A$, such that $n_{U_0(x)} \notin G_0(x)$. From the assumption, this implies that
$$\sum_{x,\, U_0(x) \subset A} \Big[ \Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \Big] \ge \Delta_0 - \sum_{y \in A,\, y \neq x} \delta_0.$$
Using $|A| \le R^{\nu}$, we obtain the property. □

The quantum perturbation $T_\Lambda$ is supposed to be a periodic quantum interaction. Namely, $T_\Lambda$ is a sum of local operators $T_{\boldsymbol{A}}$, $T_\Lambda = \sum_{\boldsymbol{A}} T_{\boldsymbol{A}}$, where $T_{\boldsymbol{A}}$ has support $\mathrm{supp}\, \boldsymbol{A} = A \subset \Lambda$ and $\boldsymbol{A}$ is, in general, a pair $(A, \alpha)$, where the index $\alpha$ specifies $T_{\boldsymbol{A}}$ from a possible finite set of operators with the same support. We found it useful to label quantum interactions $T_{\boldsymbol{A}}$ not only by the interaction domain $A$, but also, say, by quantum numbers of participating creation and annihilation operators. Thus, for example, the term $\boldsymbol{A}$ might, in the case of the Hubbard model, be a pair $(\langle x, y \rangle, \uparrow)$ corresponding to the operator $T_{\boldsymbol{A}} = c_{x,\uparrow}^{\dagger} c_{y,\uparrow}$. We refer to $\boldsymbol{A}$ as a quantum transition.

Assumption 3 (Quantum perturbations). The collection of operators $T_{\boldsymbol{A}}$ is supposed to be periodic,² with period $\ell_0$, with respect to the translations of $\mathrm{supp}\, \boldsymbol{A}$. The interactions $T_{\boldsymbol{A}}$ are assumed to satisfy the following condition, for fermions or bosons, respectively:
• (Fermions) $T_{\boldsymbol{A}}$ is a finite sum of even monomials in creation and annihilation operators of fermionic particles at a given site, i.e.
$$T_{\boldsymbol{A}} = \sum_{\substack{(x_1, \sigma_1), \dots, (x_k, \sigma_k) \\ (y_1, \sigma'_1), \dots, (y_\ell, \sigma'_\ell)}} \tilde{T}(\{x_i, \sigma_i, y_j, \sigma'_j\})\, c_{x_1,\sigma_1}^{\dagger} \cdots c_{x_k,\sigma_k}^{\dagger} c_{y_1,\sigma'_1} \cdots c_{y_\ell,\sigma'_\ell}$$
with $x_i, y_i \in A$, and $\sigma_i, \sigma'_i$ are the internal degrees of freedom, such as spins; $\tilde{T}(\cdot)$ is a complex number. $k + \ell$ must be an even number. The creation and annihilation operators satisfy the anticommutation relations
$$\{c_{x,\sigma}^{\dagger}, c_{y,\sigma'}^{\dagger}\} = 0, \qquad \{c_{x,\sigma}, c_{y,\sigma'}\} = 0, \qquad \{c_{x,\sigma}^{\dagger}, c_{y,\sigma'}\} = \delta_{x,y} \delta_{\sigma,\sigma'}.$$
• (Spins or bosons) The matrix element $\langle n_\Lambda | T_{\boldsymbol{A}} | n'_\Lambda \rangle$ is zero whenever $n_{\Lambda \setminus A} \neq n'_{\Lambda \setminus A}$ and otherwise it depends on $n_A$ and $n'_A$ only.

² By taking the least common multiple, we can always suppose the same periodicity for $\Phi$ and $T$. Moreover, whenever a torus $\Lambda$ is considered, we suppose that its side is a multiple of $\ell_0$.

In both cases $T$ is supposed to have an exponential decay with respect to its support: defining $\|T\|$ to be
$$\|T\| = \sup_{\boldsymbol{A},\, A \subset \mathbb{Z}^{\nu}} \Big[ \max_{n_A, n'_A \in \Omega^{A}} |\langle n'_A | T_{\boldsymbol{A}} | n_A \rangle| \Big]^{1/|A|}, \tag{2.6}$$
we assume that $\|T\| < \infty$. When stating our theorems, we shall actually suppose $\|T\|$ to be sufficiently small. Notice also that we do not assume that $T$ is of finite range, the exponential decay suffices.

2.2. The effective potential. In this section we define the effective potential that results from quantum fluctuations. It is due to a succession of “quantum transitions”, that is, it involves terms of the form $\langle g | T_{\boldsymbol{A}} | n \rangle$. What are the sequences $(\boldsymbol{A}_1, \dots, \boldsymbol{A}_k)$ to take into account? There is no general answer to this question; it depends on the model and on the properties of the phases under observation. In the case where the Hamiltonian is of the form $V + \lambda T$, $\lambda$ being a perturbation parameter, one could restrict to all sequences that contain less than, say, 4 transitions (or 2, or 17...). But we can also consider models with more than one parameter. Let us say that the choice of the suitable sequence requires some physical intuition. The procedure is the following. First we guess a list $S$ of sequences of quantum transitions, and we apply the formulæ (2.8)–(2.10) below to compute the effective potential. Then we must answer positively two questions:
• Does $S$ contain all the quantum transitions that actually play a role?
• Are other quantum effects negligible?
The mathematical formulation of these conditions is the subject of Assumptions 5 and 6 below. Notice that there is some freedom in the choice of $S$; indeed, it is harmless to include more transitions than what is necessary. Simply, it decreases the number of computations to guess the minimal set $S$. Let us now state the formulæ for the effective potential. Equations are rather simple in the case where $S$ contains sequences of no more than 4 transitions; we restrict to that situation in this section, and postpone the general expression, that is quite involved, to the appendix.
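Let us decompose $S = S^{(2)} \cup S^{(3)} \cup S^{(4)}$, with $S^{(k)}$ denoting the list of sequences with exactly $k$ transitions, and write
$$\Psi = \Psi^{(2)} + \Psi^{(3)} + \Psi^{(4)}. \tag{2.7}$$
Here $\Psi^{(k)}$ is the contribution to the effective potential due to the fluctuations from $S^{(k)}$. Let
$$\phi_A(n_A; g_A) = \sum_{x,\, U_0(x) \subset A} \Big[ \Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \Big].$$
Then, for any connected $A \subset \mathbb{Z}^{\nu}$ and $g_A \in G_A$, we define
$$\Psi_A^{(2)}(g_A) = - \sum_{\substack{(\boldsymbol{A}_1, \boldsymbol{A}_2) \in S^{(2)} \\ \bar{A}_1 \cup \bar{A}_2 = A}} \sum_{n_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | g_A \rangle}{\phi_A(n_A; g_A)}, \tag{2.8}$$
$$\Psi_A^{(3)}(g_A) = - \sum_{\substack{(\boldsymbol{A}_1, \boldsymbol{A}_2, \boldsymbol{A}_3) \in S^{(3)} \\ \bar{A}_1 \cup \bar{A}_2 \cup \bar{A}_3 = A}} \sum_{n_A, n'_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | n'_A \rangle \langle n'_A | T_{\boldsymbol{A}_3} | g_A \rangle}{\phi_A(n_A; g_A)\, \phi_A(n'_A; g_A)}. \tag{2.9}$$
The expression for $\Psi^{(4)}$ becomes more complicated (we shall see in Sect. 4 that clusters of excitations are actually occurring here),
$$\Psi_A^{(4)}(g_A) = \sum_{\substack{(\boldsymbol{A}_1, \boldsymbol{A}_2, \boldsymbol{A}_3, \boldsymbol{A}_4) \in S^{(4)} \\ \bar{A}_1 \cup \bar{A}_2 \cup \bar{A}_3 \cup \bar{A}_4 = A}} \Bigg[ - \sum_{n_A, n'_A, n''_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | n'_A \rangle \langle n'_A | T_{\boldsymbol{A}_3} | n''_A \rangle \langle n''_A | T_{\boldsymbol{A}_4} | g_A \rangle}{\phi_A(n_A; g_A)\, \phi_A(n'_A; g_A)\, \phi_A(n''_A; g_A)} - \frac{1}{2} \sum_{n_A, n'_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | g_A \rangle \langle g_A | T_{\boldsymbol{A}_3} | n'_A \rangle \langle n'_A | T_{\boldsymbol{A}_4} | g_A \rangle}{\phi_A(n_A; g_A) + \phi_A(n'_A; g_A)} \Big\{ \frac{1}{\phi_A(n_A; g_A)} + \frac{1}{\phi_A(n'_A; g_A)} \Big\}^{2} \Bigg]. \tag{2.10}$$
Property (2.5) implies that all the denominators are strictly positive. These equations simplify further if $T_{\boldsymbol{A}}$ is a monomial in creation and annihilation operators; indeed in the sums over intermediate configurations only one element has to be taken into account. Notice, finally, that the diagonal terms in $T$ are not playing any role in the previous definitions; we consider that they are small, since otherwise we would have included them into the diagonal potential.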
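As an illustration of the second-order formula (2.8), the following Python sketch evaluates it for two neighbouring sites of the asymmetric Hubbard model (1.1), each carrying one particle. This is a simplified toy computation, not the general construction: the helper names are hypothetical, the excitation energy $\phi$ is taken to be $U$ times the number of doubly occupied sites created, and fermionic signs are ignored, which is harmless here only because each nonzero term is the product of a hop amplitude with that of the reverse hop.

```python
def hop(cfg, sigma, src, dst):
    """Move a spin-sigma particle from site src to site dst, or return None."""
    if sigma not in cfg[src] or sigma in cfg[dst]:
        return None  # nothing to move, or blocked by the Pauli principle
    new = list(cfg)
    new[src] = cfg[src] - {sigma}
    new[dst] = cfg[dst] | {sigma}
    return tuple(new)


def double_occ(cfg):
    """Number of doubly occupied sites."""
    return sum(len(site) == 2 for site in cfg)


# All nearest-neighbour hopping transitions on the two sites 0 and 1.
TRANSITIONS = [(s, a, b) for s in ("up", "down") for a, b in ((0, 1), (1, 0))]


def second_order_potential(g, t, U):
    """Sum over pairs of transitions, following the structure of (2.8)."""
    total = 0.0
    for s2, a2, b2 in TRANSITIONS:          # second transition acts on g first
        n = hop(g, s2, a2, b2)
        if n is None:
            continue
        phi = U * (double_occ(n) - double_occ(g))
        if phi <= 0:
            continue                         # keep only genuine excitations
        for s1, a1, b1 in TRANSITIONS:      # first transition must return to g
            if hop(n, s1, a1, b1) == g:
                total -= (-t[s1]) * (-t[s2]) / phi
    return total


t = {"up": 0.2, "down": 0.1}                 # illustrative hopping amplitudes
antiparallel = (frozenset({"up"}), frozenset({"down"}))
parallel = (frozenset({"up"}), frozenset({"up"}))
print(second_order_potential(antiparallel, t, U=4.0))
print(second_order_potential(parallel, t, U=4.0))
```

In this toy setting the antiparallel pair gains $-(t_\uparrow^2 + t_\downarrow^2)/U$ while the parallel pair gains nothing, so the resulting effective interaction favours antiparallel neighbours.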
2.3. Stability of the dominant states. The aim of rewriting a class of quantum transitions in terms of the effective potential was to get control over stable low temperature phases. To this end, the three conditions, expressed first only vaguely and then in precise terms in the following Assumptions 4, 5, and 6, must be met. Namely, we suppose that
• the Hamiltonian corresponding to the sum $\Phi + \Psi$ of the classical (diagonal) and effective interactions has a finite number of ground configurations, and its excitations have strictly positive energy;³
• the list $S$ contains all the lowest quantum fluctuations;
• there is no “quantum instability”; the transition probability from a “ground state” $g$ to another “ground state” $g'$ is small compared to the energy cost of the excitations.

³ Again, when exploring a region of phase diagram at once, we have a fixed finite set of reference configurations that, strictly speaking, turn out to be ground configurations of the corresponding Hamiltonian for a particular value of “external fields”. See below for a more detailed formulation.
Each component of the effective interaction $\Psi_A$ is a mapping $G_A \to \mathbb{R}$; let us first extend it to $\Omega^A \to \mathbb{R}$ by putting $\Psi_A(n_A) = 0$ if $n_A \notin G_A$. To give a precise meaning to the first condition, we suppose that a finite number of periodic reference configurations $D \subset G$ is given such that the interaction $\Phi + \Psi$ satisfies the Peierls condition with respect to $D$. We choose a formulation in which it is very easy to verify the condition and, in addition, it takes into account the fact that the configurations from $D$ are not necessarily translation invariant. Namely, we will formulate the condition in terms of a block potential $\Upsilon$ that is equivalent to $\Phi + \Psi$ and is chosen in a suitable way. Of course, in many particular cases this is not necessary and the condition as stated below is valid directly for $\Phi + \Psi$. However, in several important cases treated in Sect. 3, the interaction $\Phi + \Psi$ turns out not to be the so-called m-potential and the use of the equivalent m-potential $\Upsilon$ not only simplifies the formulation of the Peierls condition, but also makes the task of its verification much easier. We will consider the interactions $\varphi$ and $\phi$ to be equivalent⁴ if, for any finite torus $\Lambda$ and any configuration $n \in \Omega^\Lambda$, one has

$$
\sum_{A\subset\Lambda^{\mathrm{per}}} \varphi_A(n_A) = \sum_{A\subset\Lambda^{\mathrm{per}}} \phi_A(n_A).
$$
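The notion of equivalence can be illustrated on a toy model. The following Python sketch is only an illustration under stated assumptions and is not taken from the paper: it uses a one-dimensional torus with occupation numbers in {0, 1}, a hypothetical interaction made of an on-site term `h * n_x` and a nearest-neighbour term `J * n_x * n_{x+1}`, and a block interaction that attaches to each bond half of the on-site energy of its two endpoints. It checks by brute force that the two interactions give the same total energy for every configuration of the torus.

```python
from itertools import product

# Hypothetical toy parameters (not from the paper).
J, h = 1.5, -0.25

def energy_original(n):
    """Sum of on-site terms h*n_x and bond terms J*n_x*n_{x+1} on a 1D torus."""
    L = len(n)
    onsite = sum(h * n[x] for x in range(L))
    bonds = sum(J * n[x] * n[(x + 1) % L] for x in range(L))
    return onsite + bonds

def block_term(n, x):
    """Block interaction attached to the bond {x, x+1}: the bond term plus
    half of the on-site energy of each endpoint."""
    y = (x + 1) % len(n)
    return J * n[x] * n[y] + 0.5 * h * (n[x] + n[y])

def energy_block(n):
    """Total energy computed from the block interaction."""
    return sum(block_term(n, x) for x in range(len(n)))

def equivalent_on_torus(L, tol=1e-12):
    """True if both interactions agree on every configuration of the torus."""
    return all(
        abs(energy_original(n) - energy_block(n)) < tol
        for n in product((0, 1), repeat=L)
    )
```

Each site belongs to two bonds, so the two halves of the on-site energy add up to the full on-site term; this is the same kind of redistribution that is used when an interaction is rewritten as a block potential.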
Assumption 4 (Peierls condition). There exist a finite set of periodic configurations $D \subset G$ with the smallest common period $L_0$, a constant $\Delta$ such that $\Delta > \|T\|^k$ for some finite constant $k$, and a periodic block interaction $\Upsilon = \{\Upsilon_x\}$ (with period $\ell_0$) that is equivalent to $\Phi + \Psi$ such that the following conditions are satisfied. The interaction $\Upsilon$ is of a finite range⁵ $R \in \tfrac12\mathbb{N}$ such that $R^\nu \le \Delta_0/\delta_0$, with the constants $\delta_0$ and $\Delta_0$ determined by the interaction $\Phi$ in Assumption 2. We denote by $U(x)$ the $R$-neighbourhood of $x$. The value $\Upsilon_x(d_{U(x)})$ is supposed to be translation invariant with respect to $x$ for any $d \in D$, and the interaction $\Upsilon$ satisfies the following conditions:

• For any $x \in \Lambda$ and any $n$ with $n_{U_0(x)} \notin G_{U_0(x)}$, one has
$$
\Upsilon_x(n_{U(x)}) - \max_{g\in G} \Upsilon_x(g_{U(x)}) > \tfrac12 \Delta_0 .
$$
• For any $x \in \Lambda$ and any $n$ with $n_{U(x)} \notin D_{U(x)}$, one has
$$
\Upsilon_x(n_{U(x)}) - \min_{d\in D} \Upsilon_x(d_{U(x)}) > \Delta .
$$

⁴ The usual notion of (physically) equivalent interactions (see [Geo,EFS]) is slightly weaker, but we will not need it here.
⁵ We will suppose, taking larger $R$ if necessary, that it is larger or equal to the range $R_0$ of $\Phi$, as well as to half of the range of the effective interaction $\Psi$ and to $L_0$.

The following assumption is a condition demanding that the list $S$ should contain all transitions that are relevant for the effective potential. For this, we evaluate the diagonal terms arising from any sequence of transitions that does not appear in $S$; it will have to be small compared to the Peierls constant $\Delta$. We define

$$
m(T_{A_1},\dots,T_{A_k}) = \max_{g_A\in G_A}\; \max_{n^1_A,\dots,n^{k-1}_A\notin G_A} \bigl|\langle g_A|\,T_{A_1}|n^1_A\rangle\langle n^1_A|\,T_{A_2}|n^2_A\rangle \dots \langle n^{k-1}_A|\,T_{A_k}|g_A\rangle\bigr|,
\tag{2.11}
$$

where $A = \cup_{j=1}^k \bar A_j$.

Assumption 5 (Completeness of the set of quantum transitions). There exists a finite number $\varepsilon_1$ such that for any sequence $(A_1,\dots,A_m) \notin S$ with connected $\cup_{i=1}^m \bar A_i$ one has

$$
m(T_{A_1},\dots,T_{A_{k_1}})\, m(T_{A_{k_1+1}},\dots,T_{A_{k_2}}) \dots m(T_{A_{k_{n-1}+1}},\dots,T_{A_m}) \le \varepsilon_1 \Delta .
$$

In general, it is not true that the main effect of quantum fluctuations results in a diagonal effective interaction. A sufficient condition for this to occur is that all possible transitions between different configurations $g$ and $g'$ have small contribution compared to $\Delta$.

Assumption 6 (Absence of quantum instability). There exists a finite number $\varepsilon_2$ such that for any sequence $(A_1,\dots,A_m)$, and any $g_A, g'_A \in G_A$ ($A = \cup_{j=1}^m \bar A_j$), $g_A \ne g'_A$, one has

$$
\bigl|\langle g_A|\,T_{A_1}\dots T_{A_m}|g'_A\rangle\bigr| \le \varepsilon_2 \Delta .
$$

When formulating our theorems, we shall suppose that $\varepsilon_1$ and $\varepsilon_2$ are small, more precisely: smaller than a constant that does not depend on $T$.

2.4. Characterization of stable phases. Notice first that the specific energy per lattice site of the configuration $d \in D$, defined by

$$
e(d) = \lim_{\Lambda\nearrow\mathbb{Z}^\nu} \frac{1}{|\Lambda|} \sum_{A\subset\Lambda} \bigl[\Phi_A(d_A) + \Psi_A(d_A)\bigr],
\tag{2.12}
$$
is equal, according to Assumption 4, to $\Upsilon_x(d_{U(x)})$ (whose value does not depend on $x$).

Our first result concerns the existence of the thermodynamic limit for the state under periodic boundary conditions. Taking $L_0$ to be the smallest common period of periodic configurations from $D$, we always consider in the following the limit over tori $\Lambda \nearrow \mathbb{Z}^\nu$ whose sides are multiples of $L_0$ and $\ell_0$.

Theorem 2.1 (Thermodynamic limit). Suppose that the Assumptions 1–6 are satisfied. There exist constants $\varepsilon_0 > 0$ (independent of $T$) and $\beta_0 = \beta_0(\Delta)$ such that the limit

$$
\langle K\rangle^{\mathrm{per}}_\beta = \lim_{\Lambda\nearrow\mathbb{Z}^\nu} \frac{\operatorname{Tr} K\, e^{-\beta H_\Lambda}}{\operatorname{Tr} e^{-\beta H_\Lambda}}
\tag{2.13}
$$

exists whenever $\varepsilon_1, \varepsilon_2, \|T\| \le \varepsilon_0$ in Assumptions 5 and 6, $\beta > \beta_0$, and $K$ is a local observable.⁶

⁶ A local observable, here, is a finite sum of even monomials in creation and annihilation operators, in the case of fermion systems.
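To make the ratio in (2.13) concrete before the limit is taken, the following Python sketch evaluates $\operatorname{Tr} K e^{-\beta H}/\operatorname{Tr} e^{-\beta H}$ for a two-level toy Hamiltonian. The model is purely illustrative and not taken from the paper: `H = [[a, t], [t, b]]` stands for a diagonal ("classical") part with entries `a`, `b` and an off-diagonal ("quantum") part of strength `t`, and the function name is hypothetical. The sketch uses the closed form of the exponential of a real symmetric 2×2 matrix, so no linear-algebra library is needed.

```python
import math

def gibbs_expectation_2x2(a, b, t, K, beta):
    """Tr(K e^{-beta H}) / Tr(e^{-beta H}) for the real symmetric matrix
    H = [[a, t], [t, b]] and a 2x2 observable K given as nested lists."""
    d = 0.5 * (a - b)
    r = math.hypot(d, t)             # half of the spectral gap of H
    # With m = (a + b) / 2 one has (H - m I)^2 = r^2 I, hence
    # e^{-beta H} = e^{-beta m} [cosh(beta r) I - (sinh(beta r) / r) (H - m I)].
    # The scalar prefactor e^{-beta m} cancels in the ratio and is dropped.
    c = math.cosh(beta * r)
    s = beta if r == 0.0 else math.sinh(beta * r) / r
    w = [[c - s * d, -s * t],
         [-s * t, c + s * d]]
    numerator = sum(K[i][j] * w[j][i] for i in range(2) for j in range(2))
    denominator = w[0][0] + w[1][1]
    return numerator / denominator
```

For `t = 0` the result reduces to ordinary Boltzmann weights of the two classical configurations. For very large `beta * r` the hyperbolic functions overflow, so the sketch is meant for moderate values only.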
Notice the logic of constants in the theorem above (as well as in the remaining two theorems stated below). The constant $\varepsilon_0$ is given by the context (lattice, phase space, range and periodicity of the model, and $\Phi$), but does not depend on $T$. Then, for any $T$ such that $\|T\|$ and both $\varepsilon_1$ and $\varepsilon_2$ are smaller than $\varepsilon_0$, one can choose $\beta_0$ (depending on $\Delta$, which is determined in terms of $T$ through the effective potential $\Psi$) such that the claim is valid for the given $T$ and any $\beta > \beta_0(\Delta)$. With $\|T\| \to 0$ we may have to go to lower temperatures (higher $\beta$) to keep the control. Of course, if $\Delta$ does not vanish with vanishing $\|T\|$ (i.e. Assumption 4 is valid for $\Phi$ alone), as was the case in [BKU1, DFF1], one can choose the constant $\beta_0$ uniformly in $\|T\|$.

If there are coexisting phases for a given temperature and Hamiltonian, the state $\langle\cdot\rangle^{\mathrm{per}}_\beta$ will actually turn out to be a linear combination of several pure states. A standard way to select such a pure state is to consider a thermodynamic limit with a suitably chosen fixed boundary condition. In many situations to which the present theory should apply, this approach is not easy to implement. The classical part of the Hamiltonian might actually consist only of on-site terms, and to make the system “feel” the boundary, the truly quantum terms must be used. One possibility is, of course, to couple the system with the boundary with the help of the effective potential. The problem here is, however, that since we are interested in a genuine quantum model, we would have to introduce the effective potential directly in the finite volume quantum state. Expanding this state, in a similar manner as will be done in the next section, we would actually obtain a new, boundary dependent effective potential. One can imagine that it would be possible to cancel the respective terms by assuming that the boundary potential satisfies certain “renormalizing self-consistency conditions”. However, the details of such an approach remain to be clarified.
Here we have chosen another approach. Namely, we construct the pure states by limits of states $\langle\cdot\rangle^{\Phi^\alpha\,\mathrm{per}}_\beta$, defined by (2.13) with $H_\Lambda = V^{\Phi^\alpha}_\Lambda + T_\Lambda$, where $\Phi^\alpha$ is a perturbation of the interaction $\Phi$ suitably chosen in such a way that one approaches the coexistence point from the one-phase region. Consider thus $F_{R_0}$, the space of all periodic interactions $\phi$ of range $R_0$. We say that a state $\langle\cdot\rangle^{\phi,\,\mathrm{per}}_\beta$, $\phi \in F_{R_0}$, is thermodynamically stable if it is insensitive to small perturbations:

$$
\langle K\rangle^{\phi,\,\mathrm{per}}_\beta = \lim_{\alpha\to0} \langle K\rangle^{(\phi+\alpha\psi)\,\mathrm{per}}_\beta
\tag{2.14}
$$

for every $\psi \in F_{R_0}$ and every local observable $K$. We define now a state $\langle\cdot\rangle^{*}_\beta$ to be a pure state (with classical potential $\Phi$ and quantum interaction $T$) if there exists a function $(0,\alpha_0) \ni \alpha \to \Phi^\alpha \in F_{R_0}$ so that $\lim_{\alpha\to0+} \Phi^\alpha = \Phi$, the states $\langle\cdot\rangle^{\Phi^\alpha\,\mathrm{per}}_\beta$ are thermodynamically stable, and

$$
\langle K\rangle^{*}_\beta = \lim_{\alpha\to0+} \langle K\rangle^{\Phi^\alpha\,\mathrm{per}}_\beta
\tag{2.15}
$$

for every local observable $K$.

Theorem 2.2 (Pure low temperature phases). Under Assumptions 1–6 and for any $\eta > 0$, there exist $\varepsilon_0 > 0$ (independent of $T$) and $\beta_0 = \beta_0(\Delta)$ such that if $\varepsilon_1, \varepsilon_2, \|T\| \le \varepsilon_0$ and $\beta > \beta_0$, there exists for every $d \in D$ a function $f^\beta(d)$ such that the set $Q = \{d \in D;\ \operatorname{Re} f^\beta(d) = \min_{d'\in D} \operatorname{Re} f^\beta(d')\}$ characterizes the set of pure phases. Namely, for any $d \in Q$:
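a) The function $f^\beta(d)$ is equal to the free energy of the system, i.e.
$$
f^\beta(d) = -\frac{1}{\beta} \lim_{\Lambda\nearrow\mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log \operatorname{Tr} e^{-\beta H_\Lambda}.
$$

b) There exists a pure state $\langle\cdot\rangle^d_\beta$. Moreover, it is close to the state $|d_\Lambda\rangle$ in the sense that for any bounded local observable $K$ and any sufficiently large $\Lambda$, one has
$$
\bigl|\langle K\rangle^d_\beta - \langle d_\Lambda|\,K\,|d_\Lambda\rangle\bigr| \le \eta\,|\operatorname{supp} K|\,\|K\|,
$$
where $\operatorname{supp} K$ is the support of the operator $K$.

c) There is exponential decay of correlations in the state $\langle\cdot\rangle^d_\beta$, i.e. there exists a constant $\xi^d > 0$ such that
$$
\bigl|\langle KK'\rangle^d_\beta - \langle K\rangle^d_\beta \langle K'\rangle^d_\beta\bigr| \le |\operatorname{supp} K|\,|\operatorname{supp} K'|\,\|K\|\,\|K'\|\, e^{-\operatorname{dist}(\operatorname{supp} K,\,\operatorname{supp} K')/\xi^d}
$$
for any bounded local observables $K$ and $K'$.

d) The state $\langle\cdot\rangle^{\mathrm{per}}_\beta$ is a linear combination of the states $\langle\cdot\rangle^d_\beta$, $d \in Q$, with equal weights,
$$
\langle K\rangle^{\mathrm{per}}_\beta = \frac{1}{|Q|} \sum_{d\in Q} \langle K\rangle^d_\beta
$$
for each local observable $K$.

2.5. Phase diagram. We now turn to the phase diagram at low temperatures. Let $r$ be the number of dominant states, i.e. $r = |D|$. To be able to investigate the phase diagram, we suppose that $r - 1$ suitable “external fields” are added to the Hamiltonian $H_\Lambda$. Or, in other words, we suppose that the classical potential $\Phi$ and quantum interaction $T$ depend on a vector parameter $\mu = (\mu_1,\dots,\mu_{r-1}) \in \mathcal{U}$, where $\mathcal{U}$ is an open set of $\mathbb{R}^{r-1}$. The dependence should be such that the parameters $\mu$ remove the degeneracy on the set $D$ of dominant states. One way to formulate this condition is to assume a nonsingularity of the matrix of derivatives $\bigl(\partial e^\mu(d_j)/\partial\mu_i\bigr)$.

Assumption 7. The potential $\Phi$ and the quantum perturbation $T$ are differentiable with respect to $\mu$ and there exists a constant $M < \infty$ such that
$$
\max_{n\in\Omega^{\mathbb{Z}^\nu}} \Bigl|\frac{\partial}{\partial\mu_i} \Phi_x(n_{U_0(x)})\Bigr| \le M
$$
for all $x \in \mathbb{Z}^\nu$, and
$$
\|T\| + \sum_{i=1}^{r-1} \Bigl\|\frac{\partial T}{\partial\mu_i}\Bigr\| \le M
$$
for all $\mu \in \mathcal{U}$. Further, there exists a point $\mu_0 \in \mathcal{U}$ such that $e^{\mu_0}(d) = e^{\mu_0}(d')$ for all $d, d' \in D$,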
and the inverse of the matrix of derivatives
$$
\Bigl(\frac{\partial}{\partial\mu_i}\bigl[e^\mu(d_j) - e^\mu(d_r)\bigr]\Bigr)_{1\le i,j\le r-1}
$$
has a uniform bound for all $\mu \in \mathcal{U}$.

Notice that if for some $d \in D$ one has $e^\mu(d) = e^\mu := \min_{d'\in D} e^\mu(d')$, then, according to the Peierls condition (Assumption 4), the configuration $d$ is actually a ground state of $\Upsilon$. Thus, the assumption above implies that the zero temperature phase diagram has a regular structure: there exists a point $\mu_0 \in \mathcal{U}$ where all energies $e^{\mu_0}(d)$ are equal, $e^{\mu_0}(d) = e^{\mu_0}$, $r$ lines ending in $\mu_0$ with $r - 1$ ground states, $\tfrac12 r(r-1)$ two-dimensional surfaces whose boundaries are the lines above with $r - 2$ ground states, …, $r$ open $(r-1)$-dimensional domains with only one ground state. Denoting the $(r - |Q|)$-dimensional manifolds corresponding to the coexistence of a given set $Q \subset D$ of ground states by

$$
\mathcal{M}^*(Q) = \Bigl\{\mu \in \mathcal{U};\ \operatorname{Re} e^\mu(d) = \min_{d'\in D} \operatorname{Re} e^\mu(d') \text{ if } d \in Q, \text{ and } \operatorname{Re} e^\mu(d) > \min_{d'\in D} \operatorname{Re} e^\mu(d') \text{ if } d \notin Q\Bigr\},
\tag{2.16}
$$

we can summarize the above structure by saying that the collection $\mathcal{P}^* = \{\mathcal{M}^*(Q)\}_{Q\subset D}$ determines a regular phase diagram. Notice, in particular, that $\cup_{Q\subset D} \mathcal{M}^*(Q) = \mathcal{U}$, $\mathcal{M}^*(Q) \cap \mathcal{M}^*(Q') = \emptyset$ whenever $Q \ne Q'$, while for the closures, $\overline{\mathcal{M}}^*(Q) \cap \overline{\mathcal{M}}^*(Q') = \overline{\mathcal{M}}^*(Q \cup Q')$. Here we set $\mathcal{M}(\emptyset) = \emptyset$.

The statement of the following theorem is that the similar collection $\mathcal{P} = \{\mathcal{M}(Q)\}_{Q\subset D}$ of manifolds corresponding to existence of corresponding stable pure phases for the full model is also a regular phase diagram and differs only slightly from $\mathcal{P}^*$. To measure the distance of two manifolds $\mathcal{M}$ and $\mathcal{M}'$, we introduce the Hausdorff distance

$$
\operatorname{dist}_H(\mathcal{M}, \mathcal{M}') = \max\Bigl(\sup_{\mu\in\mathcal{M}} \operatorname{dist}(\mu, \mathcal{M}'),\ \sup_{\mu\in\mathcal{M}'} \operatorname{dist}(\mu, \mathcal{M})\Bigr).
$$
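For finite samples of two manifolds, the Hausdorff distance defined above can be computed directly. The Python sketch below is an illustration only: it assumes each manifold is represented by a finite, non-empty list of points in $\mathbb{R}^d$ (so the suprema become maxima), and the function name is hypothetical.

```python
import math

def hausdorff_distance(M1, M2):
    """Hausdorff distance between two finite, non-empty point sets in R^d."""
    def one_sided(A, B):
        # Largest distance from a point of A to the set B.
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(one_sided(M1, M2), one_sided(M2, M1))
```

Both one-sided terms are needed: in the usage below, every point of the first set lies in the second, yet the distance is 2 because the second set contains a point far from the first.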
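Theorem 2.3 (Low temperature phase diagram). Under Assumptions 1–7 there exist $\varepsilon_0 > 0$ and $\beta_0 = \beta_0(\Delta)$ such that if $\|T\| + \sum_{i=1}^{r-1} \bigl\|\frac{\partial}{\partial\mu_i} T\bigr\| \le \varepsilon_0$, $\varepsilon_1, \varepsilon_2 \le \varepsilon_0$, and $\beta > \beta_0$, there exists a collection of manifolds $\mathcal{P}^\beta = \{\mathcal{M}^\beta(Q)\}_{Q\subset D}$ such that

(a) The collection $\mathcal{P}^\beta$ determines a regular phase diagram;
(b) If $\mu \in \mathcal{M}^\beta(Q)$, the corresponding stable pure state $\langle\cdot\rangle^d_\beta$ exists for every $d \in Q$ and satisfies the properties b), c), and d), from Theorem 2.2;
(c) The Hausdorff distance $\operatorname{dist}_H$ between the manifolds of $\mathcal{P}^\beta$ and their correspondent in $\mathcal{P}^*$ is bounded,
$$
\operatorname{dist}_H(\mathcal{M}^\beta(Q), \mathcal{M}^*(Q)) \le O\Bigl(e^{-\beta} + \|T\| + \sum_{i=1}^{r-1} \Bigl\|\frac{\partial T}{\partial\mu_i}\Bigr\|\Bigr),
$$
for all $Q \subset D$.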
The proofs of these theorems are given in the rest of the paper. Expansions of the partition function and expectation values of local observables are constructed, and interpreted as contours of a classical model in one additional dimension. Then we show that the assumptions for using the standard Pirogov–Sinai theory are fulfilled, and, with some special care to be taken due to our definition of stability, the validity of the three theorems follows.

3. Examples

3.1. The asymmetric Hubbard model. The usual Hubbard model describes spin-$\tfrac12$ fermions on a lattice, interacting with an on-site repulsion. The kinetic energy of the particles is modelled by a hopping operator. There are many interesting questions about this model, but far fewer rigorous results; see [Lieb] for a review. It is natural to think of the model as describing one kind of particles, that can be in two different states because of their spins. But since the Hamiltonian conserves the total magnetization, we can adopt a different point of view, namely to imagine having two different kinds of particles, the $\uparrow$ and $\downarrow$ ones; each kind of particle obeys the Pauli exclusion principle which prevents them from being at the same site. Whenever two particles of different kinds are at the same site, there is an energy cost of $U$. The natural phase space is the Fock space of antisymmetric wave functions on $\Lambda$. It is isomorphic to $\mathcal{H}_\Lambda$ if we take for the state space $\Omega = \{0, \uparrow, \downarrow, 2\}$. Particles with different spins being different, it becomes natural to consider that they have different masses, hence different hopping coefficients. The Hamiltonian is written in (1.1). If we set $t_\downarrow = 0$, we obtain the Falicov–Kimball model [GM]; in the following, we consider the situation $t_\downarrow \ll t_\uparrow \ll U$ (strongly asymmetric Hubbard model). This model has for classical interaction

$$
\Phi_x(n_x) = \begin{cases} 0 & \text{if } n_x = 0 \\ -\mu & \text{if } n_x = \uparrow \text{ or } n_x = \downarrow \\ U - 2\mu & \text{if } n_x = 2 \end{cases}
\tag{3.1}
$$

($R_0 = 0$). We choose the chemical potential such that $0 < \mu < U$.

The set $G$ is here the set of ground states of $\Phi$, i.e.
$$
G = \{n \in \Omega^{\mathbb{Z}^\nu} : n_x = \uparrow \text{ or } n_x = \downarrow \text{ for any } x \in \mathbb{Z}^\nu\}.
$$
Assumption 2 holds with $\Delta_0 = \min(\mu, U - \mu)$ and $\delta_0 = 0$. The quantum perturbation is defined to be

$$
T_A = \begin{cases} t_\uparrow\, c^\dagger_{x\uparrow} c_{y\uparrow} & \text{if } A = (\langle x,y\rangle, \uparrow), \\ t_\downarrow\, c^\dagger_{x\downarrow} c_{y\downarrow} & \text{if } A = (\langle x,y\rangle, \downarrow), \end{cases}
\tag{3.2}
$$

and we always have $\bar A = \{x, y\}$ for a pair of nearest neighbours $x, y \in \mathbb{Z}^\nu$. $\|T\| = |t_\uparrow|^{1/2}$ (if $|t_\uparrow| > |t_\downarrow|$). The sequence $S$ of transitions that we consider is
$$
S = \{(A, A') : A = (\langle x,y\rangle, \uparrow) \text{ and } A' = (\langle y,x\rangle, \uparrow) \text{ for some } x, y \in \mathbb{Z}^\nu,\ \|x - y\|_2 = 1\}.
$$
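The effective potential is given by Eq. (2.8). For any nearest neighbours $x, y \in \mathbb{Z}^\nu$, any configuration $n$ such that $|n\rangle = c^\dagger_{x\uparrow} c_{y\uparrow} |g\rangle$, $g \in G$, has an increase of energy of
$$
\phi_{\{x,y\}}(n_{\{x,y\}}; g_{\{x,y\}}) = U.
$$
Furthermore we have

$$
\langle g_{\{x,y\}}|\, c^\dagger_{x\uparrow} c_{y\uparrow} c^\dagger_{y\uparrow} c_{x\uparrow} |g_{\{x,y\}}\rangle + \langle g_{\{x,y\}}|\, c^\dagger_{y\uparrow} c_{x\uparrow} c^\dagger_{x\uparrow} c_{y\uparrow} |g_{\{x,y\}}\rangle = \begin{cases} 1 & \text{if } g_{\{x,y\}} \in \{(\uparrow,\downarrow), (\downarrow,\uparrow)\} \\ 0 & \text{otherwise.} \end{cases}
\tag{3.3}
$$

Therefore

$$
\Psi_{\{x,y\}}(g_{\{x,y\}}) = \begin{cases} -t_\uparrow^2/U & \text{if } g_{\{x,y\}} \in \{(\uparrow,\downarrow), (\downarrow,\uparrow)\} \\ 0 & \text{otherwise.} \end{cases}
\tag{3.4}
$$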
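The result (3.4) can be reproduced by a direct enumeration on two sites. The Python sketch below is an illustration under stated assumptions: it assumes that the second-order expression (2.8), which is not reproduced in this section, has the usual perturbative form of minus the squared matrix element divided by the energy increase of the intermediate configuration; it ignores fermionic signs, which cancel when a particle hops to the neighbouring site and back; and all function names are hypothetical. A site is encoded as a pair `(n_up, n_down)` of occupation numbers.

```python
from fractions import Fraction

UP, DOWN = "up", "down"
SITE = {UP: (1, 0), DOWN: (0, 1)}      # singly occupied sites

def classical_energy(site, U, mu):
    """On-site interaction (3.1) for a site given as (n_up, n_down)."""
    n_up, n_down = site
    return U * n_up * n_down - mu * (n_up + n_down)

def hop_up(config, src, dst):
    """Move an up particle from site src to site dst, or return None if the
    move is forbidden (no up particle at src, or dst already holds one)."""
    if config[src][0] == 0 or config[dst][0] == 1:
        return None
    new = [list(s) for s in config]
    new[src][0] = 0
    new[dst][0] = 1
    return tuple(tuple(s) for s in new)

def effective_potential(g, t_up, U, mu):
    """Second-order energy shift of the two-site configuration g caused by an
    up particle hopping to the neighbouring site and back."""
    e0 = sum(classical_energy(s, U, mu) for s in g)
    psi = Fraction(0)
    for src, dst in ((0, 1), (1, 0)):
        n = hop_up(g, src, dst)
        if n is None:
            continue
        gap = sum(classical_energy(s, U, mu) for s in n) - e0
        psi -= Fraction(t_up) ** 2 / gap
    return psi
```

For antiparallel spins exactly one of the two hops is allowed and leads to an empty and a doubly occupied site with energy increase $U$; for parallel spins the Pauli principle blocks the hop or there is no $\uparrow$ particle to move, so the shift vanishes.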
This interaction is nearest-neighbour and can be inscribed in blocks $2 \times \dots \times 2$. We take $R = \tfrac12$ and choose for the physically equivalent interaction $\Upsilon$,

$$
\Upsilon_x(n_{U(x)}) = \Phi_x(n_x) + \frac{1}{2^{\nu-1}} \sum_{\{y,z\}\subset U(x)} \Psi_{\{y,z\}}(n_{\{y,z\}}).
\tag{3.5}
$$
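Since (3.5) gives the same weight to every nearest-neighbour pair of the block, comparing the block energies of configurations with one particle per site amounts to counting edges of the cube $\{0,1\}^\nu$: by (3.4), an edge with opposite spins contributes $-t_\uparrow^2/U$ and an edge with equal spins ("ferromagnetic") contributes nothing. The Python sketch below enumerates all spin configurations on the cube and lists the numbers of ferromagnetic edges that occur. It is a brute-force illustration, feasible only for small $\nu$, and the function names are hypothetical.

```python
from itertools import product

def ferromagnetic_edges(spins, nu):
    """Number of edges of the cube {0,1}^nu whose endpoints carry equal spins.
    `spins` maps each vertex (a tuple of 0/1) to +1 or -1."""
    count = 0
    for v in product((0, 1), repeat=nu):
        for i in range(nu):
            if v[i] == 0:                       # count each edge once
                w = v[:i] + (1,) + v[i + 1:]
                if spins[v] == spins[w]:
                    count += 1
    return count

def edge_count_spectrum(nu):
    """Sorted distinct numbers of ferromagnetic edges over all 2^(2^nu)
    spin configurations on the cube."""
    vertices = list(product((0, 1), repeat=nu))
    counts = set()
    for values in product((1, -1), repeat=len(vertices)):
        counts.add(ferromagnetic_edges(dict(zip(vertices, values)), nu))
    return sorted(counts)
```

The smallest value 0 is attained by the two chessboard patterns. The second smallest value, multiplied by $t_\uparrow^2/(2^{\nu-1}U)$, is the energy gap of $\Upsilon_x$ above the chessboard patterns among configurations with one particle per site.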
The set $D$ has two elements, namely the two chessboard configurations $d^{(1)}$ and $d^{(2)}$; if $(-1)^x := \prod_{i=1}^\nu (-1)^{x_i}$,

$$
d^{(1)}_x = \begin{cases} \uparrow & \text{if } (-1)^x = 1 \\ \downarrow & \text{if } (-1)^x = -1 \end{cases},
\qquad
d^{(2)}_x = \begin{cases} \uparrow & \text{if } (-1)^x = -1 \\ \downarrow & \text{if } (-1)^x = 1 \end{cases}.
$$

To find the Peierls constant $\Delta$ of Assumption 4, let us make the following observation. Consider a cube $2 \times \dots \times 2$ in $\mathbb{Z}^\nu$, that we denote $C$, and a configuration $n_C$ on it. First, only configurations with one particle per site need to be taken into account, the others having an increase of energy of the order $U$. If $n_C \in G_C$, then all edges of the cube are either ferromagnetic, or antiferromagnetic. If a spin at a site is flipped, then exactly $\nu$ edges change their state. Since any configuration can be created by starting from the chessboard one, and flipping the spins at some sites, we see that the minimum number of ferromagnetic edges, for configurations that are not chessboard, is $\nu$. This leads to
$$
\Delta = \frac{\nu}{2^{\nu-1}} \frac{t_\uparrow^2}{U}.
$$
The maximum of the expression in Assumption 5 is equal to $\max(t_\downarrow^2, t_\uparrow^4)$. The constant $\varepsilon_1$ can be chosen to be $\frac{2^{\nu-1}U}{\nu} \max(t_\downarrow^2/t_\uparrow^2,\ t_\uparrow^2)$. For Assumption 6 the expression has maximum equal to $|t_\downarrow t_\uparrow|$ and we can take $\varepsilon_2 = \frac{2^{\nu-1}U}{\nu}\, |t_\downarrow/t_\uparrow|$ (we cannot suppose this to be very small in the symmetric Hubbard model; the effective potential is not strong enough in order to forbid the model to jump from one $g$ to another $g'$). Our results for the asymmetric Hubbard model can be stated in the following theorem (see also [KL,DFF2]):

Theorem 3.1 (Chessboard phases in asymmetric Hubbard model). Consider the lattice $\mathbb{Z}^\nu$, $\nu \ge 2$, and suppose $0 < \mu < U$. Then for any $\delta > 0$, there exist $t, \alpha > 0$ and $\beta_0(t_\uparrow) < \infty$ ($\lim_{t_\uparrow\to0} \beta_0(t_\uparrow) = \infty$) such that if $|t_\uparrow| \le t$, $|t_\downarrow| \le \alpha |t_\uparrow|$, and $\beta > \beta_0$,
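• the free energy exists in the thermodynamic limit with periodic boundary conditions, as well as expectation values of observables.
• There are two pure periodic phases, $\langle\cdot\rangle^{(1)}_\beta$ and $\langle\cdot\rangle^{(2)}_\beta$, with exponential decay of correlations.
• One of these pure phases, $\langle\cdot\rangle^{(1)}_\beta$, is a small deformation of the chessboard state $|d^{(1)}\rangle$:
$$
\langle n_{x\uparrow}\rangle^{(1)}_\beta \begin{cases} > 1 - \delta & \text{if } (-1)^x = 1 \\ \le \delta & \text{if } (-1)^x = -1 \end{cases}
\qquad
\langle n_{x\downarrow}\rangle^{(1)}_\beta \begin{cases} \le \delta & \text{if } (-1)^x = 1 \\ > 1 - \delta & \text{if } (-1)^x = -1. \end{cases}
$$
The other pure phase, $\langle\cdot\rangle^{(2)}_\beta$, is a small deformation of $|d^{(2)}\rangle$.

To construct the two pure phases, one way is to consider the Hamiltonian
$$
H_\Lambda(h) = H_\Lambda - h \sum_{x\in\Lambda} (-1)^x (n_{x\uparrow} - n_{x\downarrow}).
$$
Then
$$
\langle\cdot\rangle^{(1)}_\beta = \lim_{h\to0+} \langle\cdot\rangle^{\mathrm{per}}_\beta(h)
\quad\text{and}\quad
\langle\cdot\rangle^{(2)}_\beta = \lim_{h\to0-} \langle\cdot\rangle^{\mathrm{per}}_\beta(h),
$$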
where $\langle\cdot\rangle^{\mathrm{per}}_\beta(h)$ is defined by (2.13) with Hamiltonian $H_\Lambda(h)$.

3.2. The Bose–Hubbard model. This model was introduced by Fisher et al. [FWGF] and may describe $^4$He absorbed in porous media, or Cooper pairs in superconductors, … It is extremely simple, but has a very interesting phase diagram with insulating and superfluid domains [FWGF]. Rigorous results mainly concern the insulating phases; when the classical model [(1.2) with $t = 0$] has a finite number of ground states, existence of Gibbs states that are close to projection operators onto the classical ground states can be proven for small $t$ and large $\beta$; moreover, the compressibility vanishes in the ground states of the quantum model [BKU2]. If $U_0 = \infty$, $U_1 = U_2 = 0$ and $\mu = 0$, we obtain a model of hard-core bosons; the reflection positivity technique [DLS] shows that the model has off-diagonal long-range order at low enough temperature, hence has superfluid behaviour.

On-site repulsion $U_0$ discourages too high occupancy of sites, so it is physically harmless to introduce a generalized hard-core constraint, namely that there cannot be more than $N$ bosons at the same site. As a consequence the local state space is $\Omega = \{0, 1, 2, \dots, N\}$ and is finite. We restrict our discussion to the two-dimensional case. The range $R_0$ is equal to $\tfrac12$, and the classical interaction is

$$
\Phi_x(n_{U_0(x)}) = \tfrac14 \sum_{y\in U_0(x)} (U_0 n_y^2 - U_0 n_y - \mu n_y) + \tfrac12 U_1 \sum_{\substack{y,z\in U_0(x) \\ \|y-z\|_2 = 1}} n_y n_z + U_2 \sum_{\substack{y,z\in U_0(x) \\ \|y-z\|_2 = \sqrt2}} n_y n_z .
\tag{3.6}
$$
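Remark that we have [BKU2]

$$
\begin{aligned}
\Phi_x(n_{U_0(x)}) = {} & \bigl(\tfrac14 U_0 - U_1 + U_2\bigr) \sum_{y\in U_0(x)} \bigl(n_y - \tfrac12\bigr)^2 + \bigl(\tfrac14 U_1 - \tfrac12 U_2\bigr) \sum_{\substack{y,z\in U_0(x) \\ \|y-z\|_2 = 1}} \bigl(n_y + n_z - \tfrac12\bigr)^2 \\
& + U_2 \Bigl(\sum_{y\in U_0(x)} n_y - \tfrac12 - \frac{\mu}{8U_2}\Bigr)^2 + C
\end{aligned}
\tag{3.7}
$$

with a constant $C$ independent of $n$. When the chemical potential satisfies $0 < \mu < 8U_2$, $\Phi_x(n_{U_0(x)})$ is minimum if
$$
n_{U_0(x)} = \begin{pmatrix} \bullet & \circ \\ \circ & \circ \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
$$
or any configuration obtained from it by rotation. Hence we define
$$
G_{U_0(x)} = \Bigl\{ \begin{pmatrix} \bullet & \circ \\ \circ & \circ \end{pmatrix}, \begin{pmatrix} \circ & \bullet \\ \circ & \circ \end{pmatrix}, \begin{pmatrix} \circ & \circ \\ \circ & \bullet \end{pmatrix}, \begin{pmatrix} \circ & \circ \\ \bullet & \circ \end{pmatrix} \Bigr\}
$$
for any $x \in \mathbb{Z}^\nu$. Here, $G$ is the set of ground states of the interaction $\Phi$, so that $\delta_0 = 0$. Since $\Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \ge \tfrac14 \min(\mu, 8U_2 - \mu)$, for any $n_{U_0(x)} \notin G_{U_0(x)}$, $g_{U_0(x)} \in G_{U_0(x)}$, Assumption 2 holds with $\Delta_0 = \tfrac1{36} \min(\mu, 8U_2 - \mu)$ (the factor $\tfrac1{36}$, rather than $\tfrac14$, has been chosen in view of Assumption 4, see below).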
[Figure 2: three panels, (a), (b) and (c), each showing a lattice configuration of occupied (●) and empty (○) sites.]

Fig. 2. Configurations that minimize the diagonal interaction; (a) a general configuration; (b) and (c) two natural candidates that may be selected by lowest quantum fluctuations. Actually, candidate (c) dominates, because it allows for more freedom in the moves of bosons.
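The relation between (3.6) and (3.7) and the structure of the minimizers can be checked by brute force on one $2\times2$ block. The Python sketch below is an illustration under stated assumptions: it assumes that the sums over $y, z$ in (3.6) and (3.7) run over ordered pairs of block sites (with unordered pairs the two expressions would not differ by a constant), it uses exact rational arithmetic, and all function names and parameter values are hypothetical.

```python
from fractions import Fraction
from itertools import product

SITES = [(0, 0), (0, 1), (1, 0), (1, 1)]          # the 2x2 block U_0(x)

def ordered_pairs(dist2):
    """Ordered pairs (y, z) of block sites at squared Euclidean distance dist2."""
    return [(y, z) for y in SITES for z in SITES
            if (y[0] - z[0]) ** 2 + (y[1] - z[1]) ** 2 == dist2]

NN, DIAG = ordered_pairs(1), ordered_pairs(2)

def phi_36(n, U0, U1, U2, mu):
    """Right-hand side of (3.6); n maps block sites to occupation numbers."""
    onsite = Fraction(1, 4) * sum(U0 * n[y] ** 2 - U0 * n[y] - mu * n[y] for y in SITES)
    nn = Fraction(1, 2) * U1 * sum(n[y] * n[z] for y, z in NN)
    diag = U2 * sum(n[y] * n[z] for y, z in DIAG)
    return onsite + nn + diag

def phi_37_without_C(n, U0, U1, U2, mu):
    """Right-hand side of (3.7) with the constant C left out."""
    half = Fraction(1, 2)
    a = Fraction(1, 4) * U0 - U1 + U2
    b = Fraction(1, 4) * U1 - half * U2
    first = a * sum((n[y] - half) ** 2 for y in SITES)
    second = b * sum((n[y] + n[z] - half) ** 2 for y, z in NN)
    third = U2 * (sum(n[y] for y in SITES) - half - Fraction(mu) / (8 * U2)) ** 2
    return first + second + third

def configurations(N):
    """All occupation patterns of the block with at most N bosons per site."""
    for values in product(range(N + 1), repeat=len(SITES)):
        yield dict(zip(SITES, values))

def constant_C(U0, U1, U2, mu, N):
    """Return C if (3.6) minus (3.7) is the same for all configurations, else None."""
    diffs = {phi_36(n, U0, U1, U2, mu) - phi_37_without_C(n, U0, U1, U2, mu)
             for n in configurations(N)}
    return diffs.pop() if len(diffs) == 1 else None

def minimizers(U0, U1, U2, mu, N):
    """Occupation patterns minimizing (3.6) on the block, and the gap to the rest."""
    energies = sorted((phi_36(n, U0, U1, U2, mu), tuple(n[y] for y in SITES))
                      for n in configurations(N))
    e_min = energies[0][0]
    ground = [occ for e, occ in energies if e == e_min]
    gap = min(e for e, _ in energies if e != e_min) - e_min
    return ground, gap
```

With the hypothetical values $U_0 = 12$, $U_1 = 3$, $U_2 = 1$, $\mu = 3$ and $N = 2$, the minimizers are the four patterns with a single boson in the block, and the gap equals $\tfrac14\min(\mu, 8U_2 - \mu)$.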
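We take as a sequence of transitions for the smallest quantum fluctuations
$$
S = \{(A, A') : A = \langle x,y\rangle \text{ and } A' = \langle y,x\rangle \text{ for some } x, y \in \mathbb{Z}^2,\ \|x - y\|_2 = 1\}.
$$
The effective potential follows from (2.8). Let $P_{xy} = \{z : |z - x| \le 1 \text{ or } |z - y| \le 1\}$ and more generally we denote by $P$ any $3 \times 4$ or $4 \times 3$ rectangle. Up to rotations and reflections, we have to take into account five configurations, namely

$$
g_P^{(A)} = \begin{matrix} \circ&\bullet&\circ \\ \circ&\circ&\circ \\ \circ&\bullet&\circ \\ \circ&\circ&\circ \end{matrix}
\qquad
g_P^{(B)} = \begin{matrix} \circ&\bullet&\circ \\ \circ&\circ&\circ \\ \bullet&\circ&\bullet \\ \circ&\circ&\circ \end{matrix}
\qquad
g_P^{(C)} = \begin{matrix} \bullet&\circ&\bullet \\ \circ&\circ&\circ \\ \circ&\bullet&\circ \\ \circ&\circ&\circ \end{matrix}
\qquad
g_P^{(D)} = \begin{matrix} \bullet&\circ&\bullet \\ \circ&\circ&\circ \\ \bullet&\circ&\bullet \\ \circ&\circ&\circ \end{matrix}
\qquad
g_P^{(E)} = \begin{matrix} \bullet&\circ&\circ \\ \circ&\circ&\bullet \\ \bullet&\circ&\circ \\ \circ&\circ&\bullet \end{matrix}
$$

We find $\Psi_P(g_P^{(A)}) = -t^2/2U_1$, $\Psi_P(g_P^{(C)}) = -t^2/4U_2$, and $\Psi_P(g_P^{(B)}) = \Psi_P(g_P^{(D)}) = \Psi_P(g_P^{(E)}) = 0$.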
We can choose $R = \tfrac32$; $U(x)$ is a block $4 \times 4$ centered on $(x_1 + \tfrac12, x_2 + \tfrac12)$. The configurations $g_{U(x)} \in G_{U(x)}$ are (up to rotations and reflections)

$$
g^{(a)}_{U(x)} = \begin{matrix} \bullet&\circ&\bullet&\circ \\ \circ&\circ&\circ&\circ \\ \bullet&\circ&\bullet&\circ \\ \circ&\circ&\circ&\circ \end{matrix}
\qquad
g^{(b)}_{U(x)} = \begin{matrix} \bullet&\circ&\bullet&\circ \\ \circ&\circ&\circ&\circ \\ \circ&\bullet&\circ&\bullet \\ \circ&\circ&\circ&\circ \end{matrix}
$$

We choose for $\Upsilon$

$$
\Upsilon_x(n_{U(x)}) = \frac19 \sum_{y,\ U_0(y)\subset U(x)} \tilde\Phi_y(n_{U_0(y)}) + \frac12 \sum_{P\subset U(x)} \Psi_P(n_P),
\tag{3.8}
$$

with $\tilde\Phi_y(n_{U_0(y)}) = \Phi_y(n_{U_0(y)}) - \min_{g\in G} \Phi_y(g_{U_0(y)})$. Which configurations, among the four generated by $g^{(a)}$ and the eight generated by $g^{(b)}$, allow for more quantum fluctuations? The effective potential yields

$$
\Upsilon_x(g^{(a)}_{U(x)}) = -\frac{t^2}{2U_1},
\qquad
\Upsilon_x(g^{(b)}_{U(x)}) = -\frac{t^2}{4U_1} - \frac{t^2}{8U_2}.
$$

We see that the set of dominant states $D$ is formed by all the configurations generated by $g^{(b)}$ (recall that $U_1 > 2U_2$). Heuristically, there is more freedom for the bosons to move in $g^{(b)}$, since they can go to a nearest-neighbour site and feel a small repulsion of strength $U_2$; as for bosons of the configuration $g^{(a)}$, any nearest-neighbour move brings them at distance 1 of another boson, and they feel a bigger repulsion $U_1$. As a result we can choose $\Delta = t^2 \bigl(\frac{1}{8U_2} - \frac{1}{4U_1}\bigr)$ in Assumption 4. The maximum of the expression in Assumption 5 leads to $\varepsilon_1 = t^2 \bigl(\frac{1}{8U_2} - \frac{1}{4U_1}\bigr)^{-1}$. In Assumption 6 we have $\varepsilon_2 = 0$, because $g \ne g'$ means that $g$ and $g'$ must differ on a whole row, and the matrix element is zero for any finite $m$. These eight dominant states bring eight pure periodic phases, $\langle\cdot\rangle^{(1)}, \dots, \langle\cdot\rangle^{(8)}$; each one can be constructed by adding a suitable field in the Hamiltonian (e.g. the projector onto the dominant state).

Theorem 3.2 (Bose–Hubbard model). Consider the Bose–Hubbard model on the lattice $\mathbb{Z}^2$ with a generalized hard-core, and suppose $U_0 > 4(U_1 - U_2)$, $U_1 > 2U_2$ and $0 < \mu < 8U_2$. There exist $t_0 > 0$ and $\beta_0(t) < \infty$ ($\lim_{t\to0} \beta_0(t) = \infty$) such that if $t \le t_0$ and $\beta > \beta_0$,

• the free energy exists in the thermodynamic limit with periodic boundary conditions, as well as expectation values of observables,
• there are 8 pure periodic phases with exponential decay of correlations.

Each of these eight phases is a perturbation of a dominant state $d$, and the expectation value of any local operator is close to its value in the state $d$, see Theorem 2.2 for a more precise statement.

Similar properties hold for other quarter-integer density phases. Equation (3.7) may be generalized so as to exhibit gaps for the spectrum of $\Phi$, cf. [BKU2].
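The comparison between the two families of configurations is elementary arithmetic and can be sketched as follows. The Python snippet only evaluates the two block energies quoted above and their difference in exact rational arithmetic; the function names are hypothetical and the parameter values used below are illustrative.

```python
from fractions import Fraction

def upsilon_a(t, U1, U2):
    """Block energy of the pattern g^(a): -t^2 / (2 U1)."""
    return -Fraction(t) ** 2 / (2 * U1)

def upsilon_b(t, U1, U2):
    """Block energy of the pattern g^(b): -t^2 / (4 U1) - t^2 / (8 U2)."""
    return -Fraction(t) ** 2 / (4 * U1) - Fraction(t) ** 2 / (8 * U2)

def peierls_constant(t, U1, U2):
    """Energy difference between g^(a) and g^(b), i.e. t^2 (1/(8 U2) - 1/(4 U1))."""
    return upsilon_a(t, U1, U2) - upsilon_b(t, U1, U2)
```

The difference is strictly positive exactly when $U_1 > 2U_2$, vanishes at $U_1 = 2U_2$, and becomes negative below it, in which case $g^{(a)}$ would have the lower block energy.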
4. Contour Representation of a Quantum Model

Our Hamiltonian has periodicity $\ell_0 < \infty$. Without loss of generality, however, one can consider only translation invariant Hamiltonians, applying the standard trick. Namely, if $\Omega$ is the single site phase space, we let $\Omega' = \Omega^{\{1,\dots,\ell_0\}^\nu}$; $S' = |\Omega'| = S^{\ell_0^\nu}$. Then we consider the torus $\Lambda' \subset \mathbb{Z}^\nu$, $\ell_0^\nu |\Lambda'| = |\Lambda|$, each point of which is representing a block of sites in $\Lambda$ of size $\ell_0^\nu$, and identify
$$
\Omega'^{\Lambda'} \simeq \Omega^{\Lambda}.
$$
Constructing $\mathcal{H}'$ as the Hilbert space spanned by the elements of $\Omega'^{\Lambda'}$, it is clear that $\mathcal{H}'$ is isomorphic to $\mathcal{H}$. The new translation invariant interactions $\Phi'$ and $T'$ are defined by resumming, for each $A \subset \Lambda'$, the corresponding contributions with supports in the union of corresponding blocks. Notice the change in range of interactions. Namely, it decreased to $\lceil R/\ell_0\rceil$ (the smallest integer greater than or equal to $R/\ell_0$). From now on, keeping the original notation $\mathcal{H}$, $S$, …, we suppose that the Hamiltonian is translation invariant.

The partition function of a quantum model is a trace over a Hilbert space. But expanding $e^{-\beta H}$ with the help of the Duhamel formula we can reformulate it in terms of the partition function of a classical model in a space with one additional dimension (the extra dimension being continuous). In this section we present such an expansion, leading to a contour representation, of the partition function $Z^{\mathrm{per}}_\Lambda := \operatorname{Tr} e^{-\beta H_\Lambda}$ in a finite torus $\Lambda^{\mathrm{per}}$. Expansion with the help of the Duhamel formula yields

$$
e^{-\beta H_\Lambda} = \sum_{m\ge0} \sum_{\substack{A_1,\dots,A_m \\ \bar A_i\subset\Lambda^{\mathrm{per}}}} \int_{0<\tau_1<\dots<\tau_m<\beta} d\tau_1 \dots d\tau_m\; e^{-\tau_1 V_\Lambda}\, T_{A_1}\, e^{-(\tau_2-\tau_1) V_\Lambda}\, T_{A_2} \dots T_{A_m}\, e^{-(\beta-\tau_m) V_\Lambda}.
\tag{4.1}
$$

Inserting the expansion of unity $1_{\mathcal{H}_\Lambda} = \sum_{n_\Lambda} |n_\Lambda\rangle\langle n_\Lambda|$ to the right of operators $T_{A_j}$, we obtain

$$
\begin{aligned}
Z^{\mathrm{per}}_\Lambda = \sum_{n_\Lambda} e^{-\beta V_\Lambda(n_\Lambda)} + \sum_{m\ge1} \sum_{n^1_\Lambda,\dots,n^m_\Lambda} \sum_{\substack{A_1,\dots,A_m \\ \bar A_i\subset\Lambda^{\mathrm{per}}}} & \int_{0<\tau_1<\dots<\tau_m<\beta} d\tau_1 \dots d\tau_m \\
& e^{-\tau_1 V_\Lambda(n^1_\Lambda)} \langle n^1_\Lambda|\,T_{A_1}|n^2_\Lambda\rangle\, e^{-(\tau_2-\tau_1) V_\Lambda(n^2_\Lambda)} \dots \langle n^m_\Lambda|\,T_{A_m}|n^1_\Lambda\rangle\, e^{-(\beta-\tau_m) V_\Lambda(n^1_\Lambda)}.
\end{aligned}
\tag{4.2}
$$
For notational simplicity, we wrote $V_\Lambda(n_\Lambda)$ instead of $\langle n_\Lambda|\,V_\Lambda\,|n_\Lambda\rangle$. This expansion can be interpreted as a classical partition function on the $(\nu+1)$-dimensional space $\Lambda \times [0,\beta]$. Namely, calling the additional dimension “time direction”, the partition function $Z^{\mathrm{per}}_\Lambda$ is a (continuous) sum over all space-time configurations $n_\Lambda = n_\Lambda(\tau)$, $\tau \in [0,\beta]$, and all possible transitions at times corresponding to discontinuities of $n_\Lambda(\tau)$. Notice that $n_\Lambda(\tau)$ is periodic in the time direction. Thus, actually, we obtain a classical partition function on the $(\nu+1)$-dimensional torus $\mathbb{T}_\Lambda = \Lambda^{\mathrm{per}} \times [0,\beta]^{\mathrm{per}}$ with a circle $[0,\beta]^{\mathrm{per}}$ in the time direction (for simplicity we omit in $\mathbb{T}_\Lambda$ a reference to $\beta$). Introducing the quantum configuration $\omega_{\mathbb{T}_\Lambda}$ consisting of the space-time configuration $n_\Lambda(\tau)$ and the transitions $(A_i, \tau_i)$ at corresponding times, we can rewrite (4.2) in a compact form
$$
Z^{\mathrm{per}}_\Lambda = \int d\omega_{\mathbb{T}_\Lambda}\, \rho(\omega_{\mathbb{T}_\Lambda})
\tag{4.3}
$$
with $\rho(\omega_{\mathbb{T}_\Lambda})$ standing for the second line of (4.2).

Now, we are going to specify excitations within a space-time configuration $n$ and identify classes of small excitations – the loops⁷ – and large ones – the quantum contours. A configuration $n \in \Omega^{\mathbb{Z}^\nu}$ is said to be in the state $g \in G$ at site $x$ whenever $n_{U_0(x)} = g_{U_0(x)}$ (notice that, in general, $g$ is not unique). If there is no such $g \in G$, the configuration $n$ is said to be classically excited at $x$. We use $E(n)$ to denote the set of all classically excited sites of $n \in \Omega^{\mathbb{Z}^\nu}$. For any $\Lambda \subset \mathbb{Z}^\nu$, let us consider the set $\mathcal{Q}_\Lambda$ of quantum configurations on the torus $\mathbb{T}_\Lambda$. Whenever $\omega \in \mathcal{Q}_\Lambda$, its boundary $B^{(0)}(\omega) \subset \mathbb{T}_\Lambda$ is defined as the union
$$
B^{(0)}(\omega) = \bigl(\cup_{\tau\in[0,\beta]} (E(n(\tau)) \times \tau)\bigr) \cup \bigl(\cup_{i=1}^m (\bar A_i \times \tau_i)\bigr).
$$
The sets $\bar A_i \times \tau_i \subset \mathbb{T}_\Lambda$ represent the effect of the operator $T$ and for this reason are called quantum transitions. It is worth noticing that the set $B^{(0)}(\omega)$ is closed.

The next step is to identify the smallest quantum excitations – those consisting of a sequence of transitions from the list $S$. First, let us use $\mathcal{B}^{(0)}(\omega)$ to denote the set of connected components of $B^{(0)}(\omega)$ (so that $B^{(0)}(\omega) = \cup_{B\in\mathcal{B}^{(0)}(\omega)} B$). To any $B \in \mathcal{B}^{(0)}(\omega)$ that is not wrapped around the cylinder (i.e., for which there exists a time $\tau_B \in [0,\beta]^{\mathrm{per}}$ with $B \cap (\mathbb{Z}^\nu \times \tau_B) = \emptyset$) we assign its sequence of transitions, $S(B,\omega)$, ordered according to their times (starting from $\tau_B$ to $\beta$ and proceeding from $0$ to $\tau_B$) as well as the smallest box $\tilde B$ containing $B$. Here, a box is any subset of $\mathbb{T}_{\mathbb{Z}^\nu}$ of the form $A \times [\tau_1,\tau_2]$ with connected $A \subset \mathbb{Z}^\nu$ and $[\tau_1,\tau_2] \subset [0,\beta]^{\mathrm{per}}$ (if $\tau_1 > \tau_2$, we interpret the segment $[\tau_1,\tau_2]$ as that interval in $[0,\beta]^{\mathrm{per}}$ (with endpoints $\tau_1$ and $\tau_2$) that contains the point $0 \equiv \beta$). We would like to declare the excitations with $S(B,\omega) \in S$ to be small. However, we need to be sure that there are no other excitations in their close neighbourhood. If this were the case, we would “glue” the neighbouring excitations together.
This motivates the following iterative procedure. Given $\omega$, let us first consider the set $\mathcal{B}^{(0)}_0(\omega)$ of those components $B \in \mathcal{B}^{(0)}(\omega)$ that are not wrapped around the cylinder and for which $S(B,\omega) \in \bar S$, where $\bar S$ is the set of all subsequences of sequences from $S$. Next, we define the first extension of the boundary,
$$
B^{(1)}(\omega) = \bigl(\cup_{B\in\mathcal{B}^{(0)}(\omega)\setminus\mathcal{B}^{(0)}_0(\omega)} B\bigr) \cup \bigl(\cup_{B\in\mathcal{B}^{(0)}_0(\omega)} \tilde B\bigr).
$$
Using $\mathcal{B}^{(1)}(\omega)$ to denote the set of connected components of $B^{(1)}(\omega)$ and $\mathcal{B}^{(1)}_0(\omega) \subset \mathcal{B}^{(1)}(\omega)$ the set of those components $B$ in $\mathcal{B}^{(1)}(\omega)$ that are not wrapped around the cylinder and for which⁸ $S(B,\omega) \in \bar S$, we define
$$
B^{(2)}(\omega) = \bigl(\cup_{B\in\mathcal{B}^{(1)}(\omega)\setminus\mathcal{B}^{(1)}_0(\omega)} B\bigr) \cup \bigl(\cup_{B\in\mathcal{B}^{(1)}_0(\omega)} \tilde B\bigr).
$$
Iterating this procedure, it is clear that after a finite number of steps we obtain the final extension of the boundary,

$$
B(\omega) = \bigl(\cup_{B\in\mathcal{B}^{(k)}(\omega)\setminus\mathcal{B}^{(k)}_0(\omega)} B\bigr) \cup \bigl(\cup_{B\in\mathcal{B}^{(k)}_0(\omega)} B\bigr).
\tag{4.4}
$$

⁷ Even though the present framework is more general, the name comes from thinking about simplest excitations in Hubbard type models. Namely, a jump of an electron to a neighbouring site and returning afterwards to its original position.
⁸ A set $B \in \mathcal{B}^{(1)}_0(\omega)$ may actually contain several original components from $\mathcal{B}^{(0)}_0(\omega)$. We take for $S(B,\omega)$ the sequence of all transitions in all those components.
Here, every $B \in \mathcal{B}^{(k)}_0(\omega)$ is a box of the form $A \times [\tau_1,\tau_2]$ (that is not wrapped around the cylinder) and $S(B,\omega) \in \bar S$. Let us denote $\mathcal{B}(\omega) \equiv \mathcal{B}^{(k)}(\omega)$ and consider the set $\mathcal{B}_0(\omega) \subset \mathcal{B}(\omega)$ of all those sets $B \in \mathcal{B}^{(k)}_0(\omega)$ for which actually $S(B,\omega) \in S$ and, moreover, $n_A(\tau_1 - 0) = n_A(\tau_2 + 0)$. Finally, let $\mathcal{B}_l(\omega) = \mathcal{B}(\omega) \setminus \mathcal{B}_0(\omega)$ – “l” for “large”: it represents the set of all excitations of $\omega$ that are not loops.

Taking, for any closed $B \subset \mathbb{T}_\Lambda$, the restriction $n_B$ of a space-time configuration $n$ to be defined by $(n_B)_x(\tau) = n_x(\tau)$ for any $x \times \tau \in B$, we introduce the useful notion of the restriction $\omega_B$ of a quantum configuration $\omega$ to $B$ as to consist of $n_B$ and those quantum transitions from $\omega$ that are contained in $B$, $A \times \tau \subset B$ (we suppose here that $\omega$ and $B$ are such that no transition intersects both $B$ and its complement; we do not define $\omega_B$ in this case).

Now the loops and the quantum contours can be defined. First, the loops of a quantum configuration $\omega$ are the triplets $\xi \equiv (B, \omega_B, g^\xi_A)$; $B \equiv A \times [\tau_1,\tau_2] \in \mathcal{B}_0(\omega)$ is the support of the loop $\xi$ and $g^\xi_A = n_A(\tau_1 - 0) = n_A(\tau_2 + 0)$, a restriction of a configuration $g \in G$. (While the configuration $g$ is not unique, its restriction to $A$ is determined by the loop $\xi$ in a unique way.) We say that $\xi$ is immersed in $g$. Given a quantum configuration $\omega$, we obtain a new configuration $\check\omega$ by erasing all loops $(B, \omega_B, g^\xi_A)$, i.e. for each $\xi$ we remove all the transitions in its support $B$ and change the space-time configuration on $B$ into $g \in G$ into which $\xi$ is immersed. Let us remark that $\mathcal{B}(\check\omega) = \mathcal{B}_l(\omega)$. Notice that, since we started our construction from (4), we have automatically $\operatorname{diam} A > 2R_0$ for a support $A \times [\tau_1,\tau_2]$ of any loop $\xi$.

Remark. The procedure described here to identify the loops of a quantum configuration is rather intricate. This is so because we consider a quite general class of models; when studying a special model, it is possible to give a more explicit definition of the loops, and to avoid this iteration.
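The core of the iteration (take connected components, replace them by their smallest boxes, and repeat until nothing changes) can be sketched on a discretized picture. The Python code below is a strongly simplified illustration and not the construction of the paper: it assumes one space dimension, a discretized time axis, no periodicity, and it boxes every component, ignoring the distinction between components whose transitions belong to $\bar S$ and the others. All function names are hypothetical.

```python
def components(cells):
    """Connected components (4-neighbour adjacency) of a set of (x, tau) cells."""
    remaining, result = set(cells), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            x, tau = stack.pop()
            for nb in ((x + 1, tau), (x - 1, tau), (x, tau + 1), (x, tau - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        result.append(comp)
    return result

def bounding_box(comp):
    """Smallest rectangle of cells containing the component."""
    xs = [x for x, _ in comp]
    taus = [tau for _, tau in comp]
    return {(x, tau)
            for x in range(min(xs), max(xs) + 1)
            for tau in range(min(taus), max(taus) + 1)}

def extend_boundary(cells):
    """Replace components by their bounding boxes until the set is stable."""
    current = set(cells)
    while True:
        extended = set()
        for comp in components(current):
            extended |= bounding_box(comp)
        if extended == current:
            return current
        current = extended
```

In the usage below, the box of an L-shaped component touches a previously isolated cell, so the two are glued in the next step and the iteration stops only once the whole 3×3 square is covered. Termination is guaranteed because the set only grows and stays inside the bounding box of the initial cells.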
Quantum contours of a configuration $\omega$ will be constructed by extending pairs $(B, \omega_B)$ with $B \in \mathcal B_l(\omega)$ by including also the regions of nondominating states from $G$. Namely, summing over loops we will see that the "loop free energy" favours the regions with dominating configurations from $D \subset G$. However, to recognize the influence of loops, we have to look at regions of size comparable to the size of loops. This motivates the following definitions with $U(x) = \{y \in \mathbb Z^\nu, |x - y| \le R\}$ being an extension of the original neighbourhood $U_0(x)$. Thus, we enlarge the set $E(n)$ of classically excited sites to $\tilde E(n)$, with

$$\tilde E(n) = \{x \in \mathbb Z^\nu : n_{U(x)} \ne g_{U(x)} \text{ for any } g \in G\} \tag{4.5}$$

and we introduce the set $F(n)$ of softly excited sites by

$$F(n) = \{x \in \mathbb Z^\nu \setminus \tilde E(n) : n_{U(x)} \ne d_{U(x)} \text{ for any } d \in D\}. \tag{4.6}$$

Then, for a quantum configuration such that $\omega = \check\omega$, we define the new extended boundary

$$\mathcal B^e(\check\omega) = \bigcup_{\tau \in [0,\beta]_{\rm per}} \Bigl[\bigl(\tilde E(n(\tau)) \cup F(n(\tau))\bigr) \times \tau\Bigr] \;\cup\; \bigcup_{i=1}^{m} \bigcup_{x \in A_i} \bigl[U(x) \times \tau_i\bigr], \tag{4.7}$$

and if $\omega \ne \check\omega$, we set $\mathcal B^e(\omega) = \mathcal B^e(\check\omega)$. Notice that $\mathcal B(\check\omega) \subset \mathcal B^e(\omega)$, since the first set is the union of classical excitations, quantum transitions and boxes; obviously the classical excitations and the quantum transitions also belong to $\mathcal B^e(\omega)$, and the boxes being such that their diameter is smaller than $2R$ and they contain $U_0(x)$-excited sites at each time, they are $U(x)$-excited.

Decomposing $\mathcal B^e(\omega)$ into connected components, we get our quantum contours, namely $\gamma = (B, \omega_B)$. Notice that the configuration $\omega_B$ contains actually also the information determining which dominant ground state lies outside $B$. We call the set $B$ the support of $\gamma$, $B = \operatorname{supp}\gamma$, and introduce also its "truly excited part", the core, $\operatorname{core}\gamma \subset \operatorname{supp}\gamma$, by taking

$$\operatorname{core}\gamma = \operatorname{supp}\gamma \;\cap\; \Bigl[\bigcup_{\tau \in [0,\beta]_{\rm per}} \bigl(\tilde E(n(\tau)) \times \tau\bigr) \;\cup\; \bigcup_{i=1}^{m} \bigcup_{x \in A_i} \bigl(U(x) \times \tau_i\bigr)\Bigr]. \tag{4.8}$$
Finally, notice that if the contour is not wrapped around the torus in its spatial direction, there exists a space-time configuration $\omega^\gamma$ and we have $B = \mathcal B^e(\omega^\gamma)$. A set of quantum contours $\Gamma = \{\gamma_1, \dots, \gamma_k\}$ is called admissible if there exists a quantum configuration $\omega^\Gamma \in \mathcal Q_\Lambda$ which has $\Gamma$ as set of quantum contours; clearly, if $\omega^\Gamma$ exists, it is unique under the assumption that it contains no loop (i.e. $\omega^\Gamma = \check\omega^\Gamma$). We use $\mathcal D_\Lambda$ to denote the set of all collections $\Gamma$ of admissible quantum contours, and extend the notions of core and support to sets of contours, namely $\operatorname{core}\Gamma = \cup_{\gamma \in \Gamma} \operatorname{core}\gamma$, $\operatorname{supp}\Gamma = \cup_{\gamma \in \Gamma} \operatorname{supp}\gamma$. Given $\Gamma \in \mathcal D_\Lambda$, a set of loops $\Xi = \{\xi_1, \dots, \xi_\ell\}$ is said to be admissible and compatible with $\Gamma$ if there exists $\omega^{\Gamma \cup \Xi}$ which has $\Xi$ as a set of loops and $\Gamma$ as a set of quantum contours (it is also unique whenever it exists). More explicitly,

• two loops $\xi = (B, \omega_B, g_A^\xi)$ and $\xi' = (B', \omega'_{B'}, g_{A'}^{\xi'})$ are compatible, $\xi \sim \xi'$, iff $B \cup B'$ is not connected;
• a loop $\xi = (B, \omega_B, g_A^\xi)$, with $B = A \times [\tau_1, \tau_2]$, is compatible with $\Gamma$, $\xi \sim \Gamma$, iff $B \cup \mathcal B(\omega^\Gamma)$ is not connected, and $g_A^\xi = n_A^\Gamma(\tau)$ for all $\tau \in [\tau_1, \tau_2]$;
• a collection of loops $\Xi = \{\xi_1, \dots, \xi_\ell\}$ is admissible and compatible with $\Gamma$ iff any two loops from $\Xi$ are compatible and each loop from $\Xi$ is compatible with $\Gamma$.

We use $\mathcal D_\Lambda^{\rm loop}(\Gamma)$ to denote the set of all admissible collections $\Xi$ that are compatible with $\Gamma$. The conditions of admissibility and compatibility above can be, for any given set of transitions $\{A_1, \dots, A_m\}$, formulated as a finite number of restrictions on corresponding transition times $\{\tau_1, \dots, \tau_m\}$. Given the restrictions on admissibility of $\Gamma \in \mathcal D_\Lambda$, the restrictions on $\Xi$ to belong to $\mathcal D_\Lambda^{\rm loop}(\Gamma)$ factorize. As a result, the partition function $Z_\Lambda^{\rm per}$ in (4.3) can be rewritten in terms of integrations over $\mathcal D_\Lambda$ and $\mathcal D_\Lambda^{\rm loop}(\Gamma)$ [the summation over $\Gamma$ and $\Xi$ accompanied with the integration, a priori over the interval $[0, \beta]$, over times $\tau_i$ of corresponding transitions, subjected to the above formulated restrictions, cf. (4.2)]. Furthermore the contribution of $\Gamma \cup \Xi$ factorizes as a contribution of $\Gamma$ times a product of terms for $\xi \in \Xi$ [BKU1, DFF1],$^{9}$ and we get

$$Z_\Lambda^{\rm per} = \int_{\mathcal D_\Lambda} d\Gamma \int_{\mathcal D_\Lambda^{\rm loop}(\Gamma)} d\Xi\; \rho(\omega^{\Gamma \cup \Xi}) = \int_{\mathcal D_\Lambda} d\Gamma\; \rho(\omega^\Gamma) \int_{\mathcal D_\Lambda^{\rm loop}(\Gamma)} d\Xi \prod_{\xi \in \Xi} z(\xi). \tag{4.9}$$

9 For spin or boson systems factorization is true simply because any two operators with disjoint supports commute. In the case of fermion systems there is an additional sign due to anticommutation relations between creation and annihilation operators, and factorization is no more obvious. That it indeed factorizes was nicely proved in Sect. 4.2 of [DFF1].
Here, using $\{(A_i, \tau_i), i = 1, \dots, m\}$ to denote the quantum transitions of $\Gamma \cup \Xi$, we put

$$\rho(\omega^{\Gamma \cup \Xi}) = \prod_{i=1}^{m} \langle n^{\Gamma \cup \Xi}_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^{\Gamma \cup \Xi}_{A_i}(\tau_i + 0) \rangle \exp\Bigl\{ -\int_{\mathbb T_\Lambda} d(x, \tau)\, \Phi_x\bigl(n^{\Gamma \cup \Xi}_{U_0(x)}(\tau)\bigr) \Bigr\}, \tag{4.10}$$

where $\int_B d(x, \tau)$ is the shorthand for $\int_0^\beta d\tau \sum_{x : x \times \tau \in B}$ (used here for $B = \mathbb T_\Lambda$). Similarly for $\rho(\omega^\Gamma)$. Further, the weight of a loop $\xi = (B, \omega_B, g_A^\xi)$ with the set of quantum transitions $\{(A_i, \tau_i), i = 1, \dots, \ell\}$ and $n$ the space-time configuration corresponding to $\omega_B$, is

$$z(\xi) = \exp\Bigl\{ -\int_0^\beta d\tau \sum_{x, U_0(x) \subset A} \bigl[\Phi_x(n_{U_0(x)}(\tau)) - \Phi_x(g_{U_0(x)})\bigr] \Bigr\}\, \langle g_{A_1} |\, T_{A_1} \,| n_{A_1}(\tau_1 + 0) \rangle \times \langle n_{A_2}(\tau_2 - 0) |\, T_{A_2} \,| n_{A_2}(\tau_2 + 0) \rangle \cdots \langle n_{A_\ell}(\tau_\ell - 0) |\, T_{A_\ell} \,| g_{A_\ell} \rangle. \tag{4.11}$$

Given $\Gamma \in \mathcal D_\Lambda$, the second integral in (4.9) is over the collections of loops that interact only through a condition of non-intersection. This is the usual framework for applying the cluster expansion of polymers. The only technical difficulty is that the set of our loops is uncountable (the loops depend on continuous transition times), and thus we cannot simply quote the existing literature. Nevertheless, the needed extension is rather straightforward and often implicitly used. Given a collection $C = (\xi_1, \dots, \xi_n)$ of loops, we define the truncated function

$$\Phi^T(C) = \frac{1}{n!}\, \varphi^T(C) \prod_{\xi \in C} z(\xi), \tag{4.12}$$

with

$$\varphi^T(C) = \varphi^T(\xi_1, \dots, \xi_n) = \begin{cases} 1 & \text{if } n = 1, \\ \sum_{G} \prod_{e(i,j) \in G} \bigl(\mathbb I[\xi_i \sim \xi_j] - 1\bigr) & \text{if } n \ge 2, \end{cases}$$

where the sum is over all connected graphs $G$ of $n$ vertices. Notice that $\Phi^T(C) = 0$ whenever $C$ is not a cluster, i.e. if the union of the supports of its loops is not connected.

We use $\mathcal L_\Lambda$ and $\mathcal C_\Lambda$ to denote the set of all loops and clusters, respectively, and use $\int_{\mathcal C_\Lambda} dC$ as a shorthand for $\sum_{n \ge 1} \int_{\mathcal L_\Lambda} d\xi_1 \dots \int_{\mathcal L_\Lambda} d\xi_n$, in obvious meaning. Whenever $\Gamma \in \mathcal D_\Lambda$ is fixed, we use $\mathcal L_\Lambda(\Gamma)$ to denote the set of all loops compatible with $\Gamma$ and write $C \in \mathcal C_\Lambda(\Gamma)$ whenever the cluster $C$ contains only loops from $\mathcal L_\Lambda(\Gamma)$. Again, $\int_{\mathcal C_\Lambda(\Gamma)} dC$ is a shorthand for $\sum_{n \ge 1} \int_{\mathcal L_\Lambda(\Gamma)} d\xi_1 \dots \int_{\mathcal L_\Lambda(\Gamma)} d\xi_n$.

Finally, we also need similar integrals conditioned by the time of the first transition encountered in the loop $\xi$ or the cluster $C$. Namely, using $C$ also to denote the support of the cluster $C$, i.e. the union of the supports of the loops of $C$, and $I_C = [\tau_1(C), \tau_2(C)]$ to denote its vertical projection,$^{10}$ $I_C = \{\tau \in [0, \beta]_{\rm per};\ \mathbb Z^\nu \times \tau \cap C \ne \emptyset\}$, we use $\mathcal C_\Lambda^{(x,\tau)}$ for the set of all clusters $C \in \mathcal C_\Lambda$ with the first transition time $\tau_1(C) = \tau$, for which their first loop $\xi_1$ with support $B_1 = A_1 \times [\tau_1(C), \tau_2]$ contains the site $x$, $A_1 \ni x$. Then $\int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi$ and $\int_{\mathcal C_\Lambda^{(x,\tau)}} dC$ are shorthands for the corresponding integrals with first transition time fixed – formally one replaces $d\xi_1$ by $\mathbb I[A_1 \ni x]\, \delta(\tau_1(\xi_1) - \tau)\, d\xi_1$. With this notation we can formulate the cluster expansion lemma.

10 Again, if $\tau_1 > \tau_2$, the segment $[\tau_1, \tau_2] \subset [0, \beta]_{\rm per}$ contains the point $0 \equiv \beta$.
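Lemma 4.1 (Cluster expansion). For any $c \in \mathbb R$, $\alpha_1 < (4R_0)^{-\nu}$, $\alpha_2 < R^{-2\nu}\Delta_0$ and $\delta > 0$, there exists $\varepsilon_0 > 0$ such that whenever $\|T\| \le \varepsilon_0$ and $\Gamma \in \mathcal D_\Lambda$, we have the loop cluster expansion,

$$\int_{\mathcal D_\Lambda^{\rm loop}(\Gamma)} d\Xi \prod_{\xi \in \Xi} z(\xi) = \exp\Bigl\{ \int_{\mathcal C_\Lambda(\Gamma)} dC\, \Phi^T(C) \Bigr\}. \tag{4.13}$$

Moreover, the weights of the clusters are exponentially decaying (uniformly in $\Lambda$ and $\beta$):

$$\int_{\mathcal C_\Lambda} dC\; \mathbb I[C \ni (x, \tau)]\, |\Phi^T(C)| \prod_{\xi \in C} e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \delta \tag{4.14}$$

and

$$\int_{\mathcal C_\Lambda^{(x,\tau)}} dC\, |\Phi^T(C)| \prod_{\xi \in C} e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \delta \tag{4.15}$$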
for every $(x, \tau) \in \mathbb T_\Lambda$.

Proof. One can follow any standard reference concerning cluster expansions for continuum systems, for example [Bry]. We are using here [Pfi] whose formulation is closer to our purpose. Assuming that inequality (4.15) holds true, we have a finite bound

$$\sum_{n \ge 1} \frac{1}{n!} \int_{\mathcal L_\Lambda(\Gamma)^n} d\xi_1 \dots d\xi_n\, |\varphi^T(\xi_1, \dots, \xi_n)| \prod_{i=1}^{n} |z(\xi_i)| \le \delta \beta |\Lambda|. \tag{4.16}$$

Lemma 4.1 then follows from Lemma 3.1 of [Pfi]. Let us turn to the proof of the two inequalities. Let $f(\xi) = |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|}$. Skipping the conditions $\xi_j \sim \Gamma$, we define

$$I_n = n \Bigl[ \int_{\mathcal L_\Lambda} d\xi_1\, \mathbb I[B_1 \ni (x, \tau)] + \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi_1 \Bigr] \int_{\mathcal L_\Lambda^{n-1}} d\xi_2 \dots d\xi_n\, |\varphi^T(\xi_1, \dots, \xi_n)| \prod_{i=1}^{n} f(\xi_i) \tag{4.17}$$

(it does not depend on $(x, \tau) \in \mathbb T_\Lambda$). The lemma will be completed once we shall have established that $I_n \le n!\,(\tfrac12 \delta)^n$ (assuming that $\delta \le 1$; otherwise, we show that $I_n \le n!/2^n$). From Lemma 3.4 of [Pfi], we get

$$|\varphi^T(\xi_1, \dots, \xi_n)| \le \sum_{T \text{ tree on } n \text{ vertices}}\ \prod_{e(i,j) \in T} \mathbb I[B_i \cup B_j \text{ connected}]. \tag{4.18}$$
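The tree bound (4.18) can be checked by brute force on small examples. The following is a minimal sketch, not taken from the paper: loops are abstracted to vertices `0..n-1`, the hypothetical argument `incompatible` lists the pairs whose supports have a connected union, and `phi_T` evaluates the sum over connected graphs from the definition of $\varphi^T$ directly.

```python
from itertools import combinations


def is_connected(n, edges):
    """Return True if the graph on vertices 0..n-1 with these edges is connected."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def phi_T(n, incompatible):
    """Sum over connected graphs G on n vertices of prod_{e in G} (I[compatible] - 1).

    An edge between compatible loops contributes the factor 0, so only graphs
    built from incompatible pairs survive; each such edge carries a factor -1.
    """
    if n == 1:
        return 1
    edges = sorted(incompatible)
    total = 0
    for k in range(n - 1, len(edges) + 1):
        for sub in combinations(edges, k):
            if is_connected(n, sub):
                total += (-1) ** k
    return total


def count_spanning_trees(n, incompatible):
    """Number of trees on n vertices all of whose edges are incompatible pairs."""
    edges = sorted(incompatible)
    # A connected graph with exactly n - 1 edges on n vertices is a tree.
    return sum(1 for sub in combinations(edges, n - 1) if is_connected(n, sub))


def complete_graph(n):
    """All pairs incompatible: the worst case for the bound."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]
```

For mutually incompatible loops the right-hand side of (4.18) is the number of all labelled trees, $n^{n-2}$, which is the form of the bound used later in the text; for instance `phi_T(4, complete_graph(4))` is compared against `count_spanning_trees(4, complete_graph(4))`.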
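Denoting $d_1, \dots, d_n$ the incidence numbers of vertices $1, \dots, n$, we first proceed with the integration on the loops $j \ne 1$ for which $d_j = 1$; in the tree $T$, such $j$ shares an edge only with one vertex $i$. The incompatibility between $\xi_i$ and $\xi_j$, with $\xi_i = (B_i, \omega_{B_i}, g_{A_i}^{\xi_i})$, $B_i = A_i \times [\tau_1^{(i)}, \tau_2^{(i)}]$, and similarly for $\xi_j$, means that either $B_j \cup [A_i \times \tau_1^{(i)}]$ is connected, or $[A_j \times \tau_1^{(j)}] \cup B_i$ is connected. Hence, the bound for the integral over the $\xi_j$ that are incompatible with $\xi_i$ is

$$\begin{aligned} \int_{\mathcal L_\Lambda} d\xi_j\, \mathbb I[B_j \cup B_i \text{ connected}]\, f(\xi_j) &\le 2\nu |A_i| \int_{\mathcal L_\Lambda} d\xi_j\, \mathbb I[B_j \ni (x, \tau)]\, f(\xi_j) + 2\nu |B_i| \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi_j\, f(\xi_j) \\ &\le 2\nu \bigl(|A_i| + \alpha |B_i|\bigr) \Bigl\{ \int_{\mathcal L_\Lambda} d\xi_j\, \mathbb I[B_j \ni (x, \tau)]\, f(\xi_j) + \frac{1}{\alpha} \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi_j\, f(\xi_j) \Bigr\}. \end{aligned} \tag{4.19}$$

(The constant $\alpha$ has been introduced in order to match with the conditions of the next lemma.) Then

$$\begin{aligned} I_n \le n (2\nu)^{n-1} \sum_{T \text{ tree of } n \text{ vertices}} &\Bigl[ \int_{\mathcal L_\Lambda} d\xi_1\, \mathbb I[B_1 \ni (x, \tau)] + \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi_1 \Bigr] f(\xi_1) \bigl(|A_1| + \alpha |B_1|\bigr)^{d_1} \\ &\times \prod_{j=2}^{n} \Bigl\{ \int_{\mathcal L_\Lambda} d\xi_j\, \mathbb I[B_j \ni (x, \tau)]\, f(\xi_j) \bigl(|A_j| + \alpha |B_j|\bigr)^{d_j - 1} + \frac{1}{\alpha} \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi_j\, f(\xi_j) \bigl(|A_j| + \alpha |B_j|\bigr)^{d_j - 1} \Bigr\}. \end{aligned} \tag{4.20}$$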
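Now summing over all trees, knowing that the number of trees with $n$ vertices and incidence numbers $d_1, \dots, d_n$ is equal to

$$\frac{(n-2)!}{(d_1 - 1)! \dots (d_n - 1)!} \le \frac{(n-1)!}{d_1!\,(d_2 - 1)! \dots (d_n - 1)!},$$

we find a bound

$$I_n \le n!\,(2\nu)^{n-1} (1 + \alpha) \Bigl[ \int_{\mathcal L_\Lambda} d\xi\, \mathbb I[B \ni (x, \tau)]\, f(\xi)\, e^{|A| + \alpha |B|} + \frac{1}{\alpha} \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi\, f(\xi)\, e^{|A| + \alpha |B|} \Bigr]^n. \tag{4.21}$$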
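The tree count used above is Cayley's formula for labelled trees with prescribed incidence numbers. As a sanity check, here is a small sketch (our own illustration, not part of the original argument) that enumerates Prüfer codes, in which vertex $v$ appears exactly $d_v - 1$ times, and compares the result with the formula and with the upper bound quoted in the text. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import factorial


def degree_sequence_counts(n):
    """Count labelled trees on n >= 2 vertices, grouped by degree sequence.

    Uses the Pruefer bijection: each code in {0..n-1}^(n-2) is one tree, and
    the degree of vertex v is (number of occurrences of v in the code) + 1.
    """
    counts = Counter()
    for code in product(range(n), repeat=n - 2):
        occurrences = Counter(code)
        degrees = tuple(occurrences[v] + 1 for v in range(n))
        counts[degrees] += 1
    return counts


def cayley_count(degrees):
    """(n-2)! / prod_i (d_i - 1)!  for a degree sequence of a tree."""
    n = len(degrees)
    denom = 1
    for d in degrees:
        denom *= factorial(d - 1)
    return factorial(n - 2) // denom


def upper_bound(degrees):
    """(n-1)! / (d_1! (d_2 - 1)! ... (d_n - 1)!), the bound used before (4.21)."""
    n = len(degrees)
    denom = factorial(degrees[0])
    for d in degrees[1:]:
        denom *= factorial(d - 1)
    # The exponents d_1, d_2 - 1, ..., d_n - 1 sum to n - 1, so this is a
    # multinomial coefficient and the division is exact.
    return factorial(n - 1) // denom
```

The inequality between the two expressions reduces to $d_1 \le n - 1$, which holds for every vertex of a tree on $n$ vertices.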
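We conclude by using the following lemma which implies that the quantity between the brackets is small. $\square$

Lemma 4.2. Let $\alpha_1 < (4R_0)^{-\nu}$ and $\alpha_2 < R^{-2\nu}\Delta_0$. For any $c \in \mathbb R$ and $\delta > 0$, there exists $\varepsilon_0 > 0$ such that whenever $\|T\| \le \varepsilon_0$ the following inequality holds true,

$$\int_{\mathcal L_\Lambda} d\xi\, \mathbb I[B \ni (x, \tau)]\, |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} + \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi\, |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \delta,$$

where $(x, \tau)$ is any space-time site of $\mathbb T_\Lambda$.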
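Proof. Let us first consider the integral over $\xi$ such that its box contains a given space-time site. We denote by $\ell_1$ the number of quantum transitions of $\xi$ at times bigger than $\tau$, and $\ell_2$ the number of the other quantum transitions. The integral over $\xi$ can be done by summing over $(\ell_1 + \ell_2)$ quantum transitions $A^1_1, \dots, A^1_{\ell_1}, A^2_1, \dots, A^2_{\ell_2}$, by summing over $(\ell_1 + \ell_2)$ configurations $n^{i,j}_{A^i_j}$, and by integrating over times $\tau^1_1 < \dots < \tau^1_{\ell_1}$, $\tau^2_1 < \dots < \tau^2_{\ell_2}$. Let us do the change of variables $\tilde\tau^1_1 = \tau^1_1 - \tau$, $\tilde\tau^1_2 = \tau^1_2 - \tau^1_1$, ..., $\tilde\tau^1_{\ell_1} = \tau^1_{\ell_1} - \tau^1_{\ell_1 - 1}$, and $\tilde\tau^2_1 = \tau - \tau^2_1$, ..., $\tilde\tau^2_{\ell_2} = \tau^2_{\ell_2 - 1} - \tau^2_{\ell_2}$. Then we can write the following upper bound:

$$\begin{aligned} \int_{\mathcal L_\Lambda} d\xi\, \mathbb I[B \ni (x, \tau)]\, |z(\xi)|\, &e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \sum_{\ell_1, \ell_2 \ge 1}\ \sum_{\substack{A^1_1, \dots, A^2_{\ell_2} \\ \cup_{i,j} \bar A^i_j = A \ni x \\ A \text{ connected}}}\ \sum_{n^{1,1}_A, \dots, n^{2,\ell_2}_A \notin G_A} \int_0^\infty d\tilde\tau^1_1 \dots d\tilde\tau^2_{\ell_2} \\ &\times \prod_{i=1,2} \prod_{j=1}^{\ell_i} \bigl|\langle n^{i,j}_A |\, T_{A^i_j} \,| n^{i,j+1}_A \rangle\bigr|\, e^{(c - \alpha_1 \log\|T\|)|\bar A^i_j|}\, e^{-\tilde\tau^i_j \sum_{y, U_0(y) \subset A} [\Phi_y(n^{i,j}_{U_0(y)}) - \Phi_y(g_{U_0(y)})]}\, e^{\tilde\tau^i_j R^\nu \alpha_2}, \end{aligned} \tag{4.22}$$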
where $g_A \in G_A$ is the configuration in which the loop $\xi$ is immersed (if the construction does not lead to a possible loop, we find a bound by picking any $g_A \in G_A$). Remark that we neglected a constraint on the sum over configurations, namely $n^{1,1}_A = n^{2,1}_A$. It is useful to note that the sums over $\ell_1$, $\ell_2$ and over the quantum transitions are finite, otherwise they cannot constitute a loop. Using the definition (2.6) of $\|T\|$, we have $|\langle n'_A |\, T_A \,| n_A \rangle| \le \|T\|^{|A|}$. Furthermore

$$\sum_{x, U_0(x) \subset A} \bigl[\Phi_x(n^{i,j}_{U_0(x)}) - \Phi_x(g_{U_0(x)})\bigr] \ge R^{-\nu} \Delta_0,$$

as claimed in Property (2.5). Hence we have, since the number of configurations on $A$ is bounded with $S^{|A|}$,

$$\int_{\mathcal L_\Lambda} d\xi\, \mathbb I[B \ni (x, \tau)]\, |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \sum_{\ell_1, \ell_2 \ge 1}\ \sum_{\substack{A^1_1, \dots, A^2_{\ell_2} \\ \cup_{i,j} \bar A^i_j = A \ni x \\ A \text{ connected}}}\ \prod_{i=1,2} \prod_{j=1}^{\ell_i} \frac{\bigl[\|T\|^{1 - \alpha_1 (4R_0)^\nu}\, S\, e^{c (4R_0)^\nu}\bigr]^{|A^i_j|}}{R^{-\nu} \Delta_0 - R^\nu \alpha_2}. \tag{4.23}$$
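The denominator in (4.23) can be traced back to the integrals over the time increments $\tilde\tau^i_j$. The following is a sketch of that single step, under the assumption (suggested by the surrounding text) that the energy difference in the exponent of (4.22) is first bounded from below by $R^{-\nu}\Delta_0$ using Property (2.5):

```latex
\int_0^\infty d\tilde\tau \; e^{-\tilde\tau R^{-\nu}\Delta_0}\, e^{\tilde\tau R^{\nu}\alpha_2}
  \;=\; \frac{1}{R^{-\nu}\Delta_0 - R^{\nu}\alpha_2},
\qquad\text{which is finite because}\qquad
R^{-\nu}\Delta_0 - R^{\nu}\alpha_2 > 0
\iff \alpha_2 < R^{-2\nu}\Delta_0 .
```

This is where the hypothesis $\alpha_2 < R^{-2\nu}\Delta_0$ of Lemma 4.2 enters: it keeps every time integral convergent, so that each transition contributes one factor of the fraction displayed in (4.23).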
This is a small quantity since the sums are finite, by taking $\|T\|$ small enough. Now we turn to the second term, namely

$$\int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi\, |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|}.$$
The proof is similar; we first sum over the number of transitions $\ell$, then over $\ell$ transitions $A_1, \dots, A_\ell$ with $A = \cup_i \bar A_i \ni x$, $A$ connected. Then we choose $\ell - 1$ intermediate configurations. Finally, we integrate over $\ell - 1$ time intervals. The resulting equation looks very close to (4.23) and is small for the same reasons. $\square$

Now, we single out the class of small clusters. Namely, a cluster is small if the sequence of its quantum transitions belongs to the list $\mathcal S$. To be more precise, we have to specify the order of transitions: considering a cluster $C \equiv (\xi_1, \dots, \xi_k)$ and using $S(\xi^{(\ell)})$, $\ell = 1, \dots, k$, to denote the sequence of quantum transitions of the loop $\xi^{(\ell)} = (B^{(\ell)}, \omega_{B^{(\ell)}}, g_A^{\xi^{(\ell)}})$, $S(\xi^{(\ell)}) \equiv S(B^{(\ell)}, \omega_{B^{(\ell)}})$, we take the sequence $S(C)$ obtained by combining the sequences $S(\xi^{(1)}), \dots, S(\xi^{(k)})$ in this order. A cluster $C$ is said to be small if $S(C) \in \mathcal S$, it is large otherwise. We use $\mathcal C_\Lambda^{\rm small}$ to denote the set of all small clusters on the torus $\mathbb T_\Lambda$.

The local contribution to the energy at time $\tau$, when the system is in a state $n_{U_0(x)}(\tau)$, is $\Phi_x(n_{U_0(x)}(\tau))$. Similarly, we will introduce the local contribution of loops (and small clusters of loops) in the expansion of the partition function – the effective potential $\Psi_A^\beta(n_A(\tau))$. The latter is a local quantity in the sense that it depends on $n$ only on the set $A$ at time $\tau$. An explicit expression of $\Psi_A^\beta(g_A)$ with $g \in G$ is, in terms of small clusters,

$$\Psi_A^\beta(g_A) := -\int_{\mathcal C_\Lambda^{\rm small}} dC\, \frac{\Phi^T(C)}{|I_C|}\, \mathbb I[C \sim g_A,\ A_C = A,\ I_C \ni 0]. \tag{4.24}$$

Here, again, $C$ is the support of $C$, $A_C$ its horizontal projection onto $\mathbb Z^\nu$, $A_C = \{x \in \mathbb Z^\nu;\ x \times [0, \beta]_{\rm per} \cap C \ne \emptyset\}$, and $I_C$ its vertical projection, $|A_C|$ and $|I_C|$ their corresponding areas, and the condition $C \sim g_A$ means that each loop of $C$ is immersed in the ground state $g$. Notice that the "horizontal extension" of any small cluster is at most $2R$: if $C$ is a small cluster, $\operatorname{diam}(A_C) \le 2R$. The definitions introduced to write the effective potential (see the appendix) are now clear, once we identify the effective potential $\Psi$ defined in (A.1) as the limit $\beta \to \infty$ of (4.24). Namely,

$$\Psi = \lim_{\beta \to \infty} \Psi^\beta.$$

Our assumptions in Sect. 2.3 concern the limit $\beta \to \infty$ of the effective potential, but at non zero temperature we have to work with $\Psi^\beta$. To trace down the difference, we introduce $\psi^\beta = \Psi^\beta - \Psi$. Notice that (4.24) implies $\Psi_A^\beta(n_A) = 0$ whenever $n_A \notin G_A$ or $\operatorname{diam} A < 4R_0$. Recalling that if $C \subset \mathbb T_\Lambda$, $\tilde C$ is the smallest box containing $C$, we introduce, for any cluster $C \in \mathcal C_\Lambda^{\rm small}$, the function

$$\Phi^T(C; \Gamma) = \frac{\Phi^T(C)}{|I_C|} \int_{I_C} d\tau\, \Bigl( \mathbb I[C \sim \Gamma] - \mathbb I\bigl[n^\Gamma_{A_C}(\tau) \in G_{A_C},\ C \sim n^\Gamma_{A_C}(\tau)\bigr] \Bigr). \tag{4.25}$$

Here, the first indicator function in the parenthesis singles out the clusters each loop of which is compatible with $\Gamma$, while the second indicator concerns the clusters for which $n^\Gamma_{A_C}(\tau) \in G_{A_C}$ and each of their loops is immersed in the configuration $n^\Gamma_A(\tau)$ (extended as a constant to all the time interval $I_C$). Observing that $\Phi^T(C; \Gamma) = 0$ whenever $\tilde C \cap \operatorname{core}\Gamma = \emptyset$, we split the integral over small clusters into its bulk part expressed in terms of the effective potential and boundary terms "decorating" the quantum contours from $\Gamma$.
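Lemma 4.3. For any fixed $\Gamma \in \mathcal D_\Lambda$, one has

$$\int_{\mathcal C_\Lambda^{\rm small}(\Gamma)} dC\, \Phi^T(C) = -\int_{\mathbb T_\Lambda} d(A, \tau)\, \Psi_A(n^\Gamma_A(\tau)) - \int_{\mathbb T_\Lambda} d(A, \tau)\, \psi^\beta_A(n^\Gamma_A(\tau)) + \int_{\mathcal C_\Lambda^{\rm small}} dC\, \Phi^T(C; \Gamma).$$

The term $\Phi^T(C; \Gamma)$ vanishes whenever $\tilde C \cap \operatorname{core}\Gamma = \emptyset$.

Similarly as $\int d(x, \tau)$, the shorthand $\int d(A, \tau)$ means $\sum_A \int d\tau$.

Proof. To get the equality of integrals, it is enough to rewrite

$$\int_{\mathcal C_\Lambda^{\rm small}(\Gamma)} dC\, \Phi^T(C) = \int_{\mathcal C_\Lambda^{\rm small}} dC\, \Phi^T(C)\, \mathbb I[C \sim \Gamma] \tag{4.26}$$

and

$$\int_{\mathbb T_\Lambda} d(A, \tau)\, \Psi^\beta_A(n^\Gamma_A(\tau)) = -\int_{\mathcal C_\Lambda^{\rm small}} dC\, \frac{\Phi^T(C)}{|I_C|} \int_{I_C} d\tau\, \mathbb I\bigl[n^\Gamma_{A_C}(\tau) \in G_{A_C},\ C \sim n^\Gamma_{A_C}(\tau)\bigr]. \tag{4.27}$$

Moreover, whenever $\tilde C \cap \operatorname{core}\Gamma = \emptyset$, the configuration $n^\Gamma_{A_C}(\tau)$ belongs to $G_{A_C}$, and it is constant, for all $\tau \in I_C$. Under these circumstances, the condition $C \sim \Gamma$ is equivalent to $C \sim n^\Gamma_{A_C}(\tau)$ and the right hand side of (4.25) vanishes. $\square$

Whenever $\Gamma \in \mathcal D_\Lambda$ is fixed, let $W_d(\Gamma) \subset \mathbb T_\Lambda$ be the set of space-time sites in the state $d$, i.e. $W_d(\Gamma) = \{(x, \tau) \in \mathbb T_\Lambda : n^\Gamma_{U(x)}(\tau) = d_{U(x)}\}$. Notice that

$$\mathbb T_\Lambda = \operatorname{supp}\Gamma \cup \bigcup_{d \in D} W_d(\Gamma); \qquad W_d(\Gamma) \cap W_{d'}(\Gamma) = \emptyset \ \text{ if } d \ne d',$$

and the set $\operatorname{supp}\Gamma \cap W_d(\Gamma)$ is of measure zero (with respect to the measure $d(x, \tau)$ on $\mathbb T_\Lambda$). Let us recall that the equivalent potential $\Upsilon$ satisfies the equality $\sum_{x \in \Lambda} \Upsilon_x(n_{U(x)}) = \sum_{A \subset \Lambda} (\Phi_A(n_A) + \Psi_A(n_A)) + {\rm const}\,|\Lambda|$ for any configuration $n$ on the torus $\Lambda$; actually, we can take ${\rm const} = 0$, since $\Upsilon$ and $\Upsilon' = \Upsilon + {\rm const}$ are also physically equivalent, and $\Upsilon'$ satisfies the same assumptions as $\Upsilon$.

Lemma 4.4. The partition function (4.9) can be rewritten as

$$Z_\Lambda^{\rm per} = \int_{\mathcal D_\Lambda} d\Gamma \prod_{d \in D} e^{-|W_d(\Gamma)|\, e(d)} \prod_{\gamma \in \Gamma} z(\gamma)\; e^{R(\Gamma)}.$$

Here the weight $z(\gamma)$ of a quantum contour $\gamma = (B, \omega_B)$ with the sequence of transitions $(A_1, \dots, A_m)$ at times $(\tau_1, \dots, \tau_m)$ is

$$z(\gamma) = \prod_{i=1}^{m} \langle n^\gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\gamma_{A_i}(\tau_i + 0) \rangle \exp\Bigl\{ -\int_B d(x, \tau)\, \Upsilon_x(n^\gamma_{U(x)}(\tau)) \Bigr\}. \tag{4.28}$$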
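The rest $R(\Gamma)$ is given by

$$R(\Gamma) = \int_{\mathcal C_\Lambda(\Gamma) \setminus \mathcal C_\Lambda^{\rm small}(\Gamma)} dC\, \Phi^T(C) - \int_{\mathbb T_\Lambda} d(A, \tau)\, \psi^\beta_A(n^\Gamma_A(\tau)) + \int_{\mathcal C_\Lambda^{\rm small}} dC\, \Phi^T(C; \Gamma). \tag{4.29}$$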
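Proof. Using Lemmas 4.1 and 4.3 to substitute in (4.9) the contribution of loops by the action of the effective potential, we get

$$Z_\Lambda^{\rm per} = \int_{\mathcal D_\Lambda} d\Gamma\, \Bigl\{ \prod_{i=1}^{m} \langle n^\Gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\Gamma_{A_i}(\tau_i + 0) \rangle \Bigr\} \cdot \exp\Bigl\{ -\int_{\mathbb T_\Lambda} d(A, \tau) \bigl(\Phi_A(n^\Gamma_A(\tau)) + \Psi_A(n^\Gamma_A(\tau))\bigr) \Bigr\}\, e^{R(\Gamma)}. \tag{4.30}$$

Replacing $\Phi + \Psi$ by the physically equivalent potential $\Upsilon$, we get

$$Z_\Lambda^{\rm per} = \int_{\mathcal D_\Lambda} d\Gamma\, \Bigl\{ \prod_{i=1}^{m} \langle n^\Gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\Gamma_{A_i}(\tau_i + 0) \rangle \Bigr\} \exp\Bigl\{ -\int_{\operatorname{supp}\Gamma} d(x, \tau)\, \Upsilon_x(n^\Gamma_{U(x)}(\tau)) \Bigr\} \prod_{d \in D} e^{-e(d) |W_d(\Gamma)|}\, e^{R(\Gamma)}. \tag{4.31}$$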
We get our lemma by observing that the product over quantum transitions and the first exponential factorize with respect to the quantum contours, as was the case for the loops (for fermions the sign arising because of anticommutation relations also factorizes; we again refer to [DFF1] for the proof). $\square$

Our goal is to obtain a classical lattice system in $\nu + 1$ dimensions. Thus we introduce a discretization of the continuous time direction, by choosing suitable parameters $\tilde\beta > 0$ and $N \in \mathbb N$ with $\beta = N \tilde\beta / \Delta$.$^{11}$ Setting $\mathbb L_\Lambda$ to be the $(\nu + 1)$-dimensional discrete torus $\mathbb L_\Lambda = \Lambda \times \{0, 1, \dots, N - 1\}_{\rm per}$ – let us recall that $\Lambda$ has periodic boundary conditions in all spatial directions – and using $C(x, t) \subset \mathbb R^{\nu + 1}$ to denote, for any $(x, t) \in \mathbb L_\Lambda$, the cell centered in $(x, \frac{\tilde\beta}{\Delta} t)$ with vertical length $\tilde\beta / \Delta$, we have $\mathbb T_\Lambda = \cup_{(x,t) \in \mathbb L_\Lambda} C(x, t)$. For any $M \subset \mathbb L_\Lambda$, we set $C(M)$ to be the union of all cells centered at sites of $M$, $C(M) = \cup_{(x,t) \in M} C(x, t) \subset \mathbb T_\Lambda$. Conversely, if $B \subset \mathbb T_\Lambda$, we take $M(B) \subset \mathbb L_\Lambda$ to be the smallest set such that $C(M(B)) \supset B$. Given a connected$^{12}$ set $M \subset \mathbb L_\Lambda$ and a collection of quantum contours $\Gamma \in \mathcal D_\Lambda$, we define

$$\begin{aligned} \varphi(M; \Gamma) = {} & \int_{\mathcal C_\Lambda(\Gamma) \setminus \mathcal C_\Lambda^{\rm small}(\Gamma)} dC\, \mathbb I[M(C) = M]\, \Phi^T(C) + \int_{\mathcal C_\Lambda^{\rm small}} dC\, \mathbb I\bigl[M(C) = M,\ C \not\subset C(\operatorname{supp}\Gamma)\bigr]\, \Phi^T(C; \Gamma) \\ & - \int_{M(A \times \tau) = M} d(A, \tau)\, \psi^\beta_A(n^\Gamma_A(\tau)) \end{aligned} \tag{4.32}$$

and

$$\tilde R(\Gamma) = \int_{\mathcal C_\Lambda^{\rm small}} dC\, \mathbb I\bigl[C \subset C(\operatorname{supp}\Gamma)\bigr]\, \Phi^T(C; \Gamma). \tag{4.33}$$

11 Note the difference from [BKU1]; here the vertical length of a unit cell $\tilde\beta / \Delta$ depends on $\|T\|$, since so does the quantum Peierls constant $\Delta$.
12 Connectedness in $\mathbb L_\Lambda$ is meant in the standard way via nearest neighbours.
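We have separated the contributions of the small clusters inside $C(\operatorname{supp}\Gamma) \equiv C(M(\operatorname{supp}\Gamma))$, because they are not necessarily a small quantity, and it is impossible to expand them. On the contrary, $\varphi(M; \Gamma)$ is small, and hence it is natural to write

$$e^{R(\Gamma)} = e^{\tilde R(\Gamma)} \sum_{\mathcal M} \prod_{M \in \mathcal M} \bigl( e^{\varphi(M; \Gamma)} - 1 \bigr), \tag{4.34}$$

with the sum running over all collections $\mathcal M$ of connected subsets of $\mathbb L_\Lambda$. Let $\operatorname{supp}\mathcal M = \cup_{M \in \mathcal M} M$. Given a set of quantum contours $\Gamma \in \mathcal D_\Lambda$ and a collection $\mathcal M$, we introduce contours on $\mathbb L_\Lambda$ by decomposing the set $M(\operatorname{supp}\Gamma) \cup \operatorname{supp}\mathcal M$ into connected components [notice that if $(x, t) \notin M(\operatorname{supp}\Gamma) \cup \operatorname{supp}\mathcal M$, then $C(x, t) \subset \cup_{d \in D} W_d(\Gamma)$]. Namely, a contour $Y$ is a pair $(\operatorname{supp} Y, \alpha_Y)$, where $\operatorname{supp} Y \subset \mathbb L_\Lambda$ is a (non-empty) connected subset of $\mathbb L_\Lambda$, and $\alpha_Y$ is a labeling of connected components $F$ of $\partial C(\operatorname{supp} Y)$, $\alpha_Y(F) = 1, \dots, r$. We write $|Y|$ for the length (area) of the contour $Y$, i.e. the number of sites in $\operatorname{supp} Y$. A set of contours $\mathcal Y = \{Y_1, \dots, Y_k\}$ is admissible if the contours are mutually disjoint and if the labeling is constant on the boundary of each connected component of $\mathbb T_\Lambda \setminus \cup_{Y \in \mathcal Y} C(\operatorname{supp} Y)$. Finally, given an admissible set of contours $\mathcal Y$, we define $W_d(\mathcal Y)$ to be the union of all connected components $M$ of $\mathbb L_\Lambda \setminus \cup_{Y \in \mathcal Y} \operatorname{supp} Y$ such that $C(M)$ has label $d$ on its boundary.

Consider now any quantum configuration $\omega \in \mathcal Q_\Lambda$ yielding, together with a collection $\mathcal M$, a fixed set of contours $\mathcal Y$. Summing over all such configurations $\omega$ and collections $\mathcal M$, we get the weight to be attributed to the set $\mathcal Y$. Let $\Gamma^\omega$ be the collection of quantum contours corresponding to $\omega$, $\cup_{Y \in \mathcal Y} \operatorname{supp} Y = M(\operatorname{supp}\Gamma^\omega) \cup \operatorname{supp}\mathcal M$. Given that the configurations $\omega$ are necessarily constant with no transition on $\mathbb T_\Lambda \setminus C(\cup_{Y \in \mathcal Y} \operatorname{supp} Y)$, we easily see that the weight factor splits into a product of weight factors of single contours $Y \in \mathcal Y$. Namely, for the weight $z$ of a contour $Y$ we get the expression

$$z(Y) = \int_{\mathcal D_\Lambda(Y)} d\Gamma \prod_{\gamma \in \Gamma} z(\gamma) \prod_{d \in D} e^{-e(d) |W_d(\Gamma) \cap C(\operatorname{supp} Y)|}\, e^{\tilde R(\Gamma)} \sum_{\mathcal M} \mathbb I\bigl[M(\operatorname{supp}\Gamma) \cup \operatorname{supp}\mathcal M = \operatorname{supp} Y\bigr] \prod_{M \in \mathcal M} \bigl( e^{\varphi(M; \Gamma)} - 1 \bigr), \tag{4.35}$$

where $\mathcal D_\Lambda(Y)$ is the set of quantum configurations compatible with $Y$, $\Gamma \in \mathcal D_\Lambda(Y)$ if $\operatorname{supp}\Gamma \subset \operatorname{supp} Y$ and the labels on the boundary of $\operatorname{supp}\Gamma$ match with labels of $Y$. Thus, we can finally rewrite the partition function in a form that agrees with the standard Pirogov–Sinai setting, namely

$$Z_\Lambda^{\rm per} = \sum_{\mathcal Y} \prod_{d \in D} e^{-\frac{\tilde\beta}{\Delta} e(d) |W_d(\mathcal Y)|} \prod_{Y \in \mathcal Y} z(Y), \tag{4.36}$$

with the sum being over all admissible sets of contours on $\mathbb L_\Lambda$. In the next section we will evaluate the decay rate of contour weights in preparation to apply, in Sect. 6, the Pirogov–Sinai theory to prove Theorems 2.1, 2.2, and 2.3.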
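The expansion (4.34) rests on the elementary identity $\prod_M e^{\varphi_M} = \prod_M \bigl(1 + (e^{\varphi_M} - 1)\bigr) = \sum_{\mathcal M} \prod_{M \in \mathcal M} (e^{\varphi_M} - 1)$, with the sum over all sub-collections (the empty one contributing 1). Below is a minimal numerical sketch of this identity for a finite list of values; the function name and the sample numbers are ours, chosen only for illustration.

```python
from itertools import combinations
from math import exp, isclose, prod


def expanded_sum(phis):
    """Sum over all sub-collections of prod_{M} (e^{phi_M} - 1).

    `phis` plays the role of the finite family of values phi(M; Gamma);
    the empty sub-collection contributes the empty product, i.e. 1.
    """
    factors = [exp(p) - 1.0 for p in phis]
    total = 0.0
    for k in range(len(factors) + 1):
        for sub in combinations(factors, k):
            total += prod(sub)
    return total


# Hypothetical small values of phi(M; Gamma) for three connected sets M.
sample = [0.1, -0.2, 0.05]
lhs = exp(sum(sample))      # exponential of the sum of all phi(M; Gamma)
rhs = expanded_sum(sample)  # right-hand side of the expansion
```

When the values $\varphi(M; \Gamma)$ are small, each factor $e^{\varphi} - 1$ is small as well, which is what makes the collections $\mathcal M$ behave like a dilute gas of additional contour pieces.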
5. Exponential Decay of the Weight of the Contours

In this section we show that the weight $z$ has exponential decay with respect to the length of the contours. We begin with a lemma proving that the contribution of $\mathcal M$ is small, which we shall use in Lemma 5.2 below for the bound of $z$.

Lemma 5.1. Under Assumptions 1–6, for any $c < \infty$ there exist constants $\beta_0, \tilde\beta_0 < \infty$, and $\varepsilon_0 > 0$ such that for any $\beta \ge \beta_0$, $\tilde\beta_0 \le \tilde\beta < 2\tilde\beta_0$, and $\|T\|, \varepsilon_1, \varepsilon_2 \le \varepsilon_0$, one has

$$\sum_{M \ni (x,t)} \bigl| e^{\varphi(M; \Gamma)} - 1 \bigr|\, e^{c|M|} \le 1$$

for any contour $Y$ and any set of quantum contours $\Gamma \in \mathcal D_\Lambda(Y)$.

Proof. We show that

$$\sum_{M \ni (x,t)} |\varphi(M; \Gamma)|\, e^{c|M|} \le 1.$$

This implies that $|\varphi(M; \Gamma)| \le 1$ and consequently Lemma 5.1 holds – with a slightly smaller constant $c$. Let us consider separately, in (4.32), the three terms on the right hand side: (a) the integral over big clusters, (b) the integral over small clusters, and (c) the expression involving $\psi^\beta$.

(a) Big clusters. Our aim is to estimate

$$J = \sum_{M \ni (x,t)} e^{c|M|}\, \Bigl| \int_{\mathcal C_\Lambda(\Gamma) \setminus \mathcal C_\Lambda^{\rm small}(\Gamma)} dC\, \mathbb I[M(C) = M]\, \Phi^T(C) \Bigr|.$$

Since $M(C) = M$ and $M \ni (x, t)$, the cell $C(x, t)$ intersects a quantum transition of $C$, or it is contained in a box $B$ belonging to a loop of $C$ (both possibilities may occur at the same time). In the first case we start the integral over clusters by choosing the time for the first quantum transition, which yields a factor $\tilde\beta / \Delta$. In the second case we simply integrate over all loops containing the given site. At the same time, given a cluster $C = (\xi_1, \dots, \xi_n)$, $\xi_i = (B_i, \omega_{B_i}, g_{A_i}^{\xi_i})$ and $B_i = A_i \times [\tau_1^{(i)}, \tau_2^{(i)}]$, the condition $M(C) = M$ implies that

$$\sum_{i=1}^{n} \Bigl\{ |A_i| + \frac{\Delta}{\tilde\beta} |B_i| \Bigr\} \ge |M|. \tag{5.1}$$

Using it to bound $|M|$, we get the estimate

$$J \le \frac{\tilde\beta}{\Delta} \int_{\mathcal C_\Lambda^{(x,\tau)} \setminus \mathcal C_\Lambda^{\rm small}} dC\, |\Phi^T(C)| \prod_{\xi \in C} e^{c|A| + c\frac{\Delta}{\tilde\beta}|B|} + \int_{\mathcal C_\Lambda \setminus \mathcal C_\Lambda^{\rm small}} dC\, \mathbb I[C \ni (x, \tau)]\, |\Phi^T(C)| \prod_{\xi \in C} e^{c|A| + c\frac{\Delta}{\tilde\beta}|B|}. \tag{5.2}$$

Taking, in Lemma 4.1, the constant $c$ as above as well as $\alpha_1 = \frac12 (4R_0)^{-\nu}$, $\alpha_2 = c\Delta / \tilde\beta$, $\delta = 1$, and choosing the corresponding $\varepsilon_0(c, \alpha_1, \alpha_2, \delta)$, we can bound the second term of (5.2), for any $\|T\| \le \varepsilon_0$, with the help of (4.14) once $\tilde\beta$ is chosen large enough to satisfy

$$\frac{\tilde\beta}{\Delta} > \frac{c}{\Delta_0} R^{2\nu}. \tag{5.3}$$
To estimate the first term of (5.2), we first consider the contribution of those clusters for which

$$\frac{\tilde\beta}{\Delta} \le \prod_{\xi \in C} \|T\|^{-\frac12 (4R_0)^{-\nu} |A|}.$$

Applying it together with (5.3) we can directly use the bound (4.15). Thus it remains to estimate the contribution of those terms for which

$$\sum_{\xi \in C} |A| < 2 (4R_0)^\nu\, \frac{\log(\tilde\beta / \Delta)}{\log(1 / \|T\|)}. \tag{5.4}$$

Let us first fix $\tilde\beta$ and $\varepsilon_0 \le \varepsilon_0(c, \alpha_1, \alpha_2, \delta)$ with the constants $c$, $\alpha_1$, $\alpha_2$, and $\delta$ as above, so that

$$\frac{\tilde\beta}{\varepsilon_0} > \frac{c}{\Delta_0} R^{2\nu} \tag{5.5}$$

and, at the same time,

$$\tilde\beta \le \varepsilon_0^{\,k - \frac12 k' (4R_0)^{-\nu}} \tag{5.6}$$

for a suitable large $k'$ (we also assume that $\varepsilon_0 \le 1$). Here $k$ is the constant that appears in Assumption 4, $\Delta(\|T\|) \ge \|T\|^k$. Observing further that $\Delta(\|T\|)$ can be taken to increase with $\|T\|$ (one can always consider a weaker lower bound $\Delta$ when taking smaller $\|T\|$), we conclude that (5.3), as well as the condition

$$2 (4R_0)^\nu\, \frac{\log(\tilde\beta / \Delta)}{\log(1 / \|T\|)} \le k',$$

are satisfied for every $\|T\| \le \varepsilon_0$. Thus, it suffices to find an upper bound to

$$J' = \frac{\tilde\beta}{\Delta} \int_{\mathcal C_\Lambda^{(x,\tau)} \setminus \mathcal C_\Lambda^{\rm small}} dC\, |\Phi^T(C)|\; \mathbb I\Bigl[ \sum_{\xi \in C} |A| < k' \Bigr]. \tag{5.7}$$

The main problem in estimating this term stems from the factor $1/\Delta$ that may be large if $\|T\|$ is small. Thus, to have a bound valid for all small $\|T\|$, some terms, coming from the integral, that would suppress this factor must be displayed. The condition $\sum_{\xi \in C} |A| < k'$ will be used several times by applying its obvious consequences: (i) the number of loops in $C$ is smaller than $k'$, (ii) the number of transitions for each loop is smaller than $k'$, (iii) each transition $A$ is such that $|A| < k'$, and (iv) the distance between each transition and $x$ is smaller than $k'$.
Furthermore, we use Assumption 5 to bound the contribution of the transitions of $C$; recalling the definition (4.11) of the weight of $\xi$, we have, for any large $C$,

$$\prod_{\xi \in C} |z(\xi)| \le \varepsilon_1 \Delta \prod_{\xi \in C} \exp\Bigl\{ -\int_B d(x, \tau) \bigl[\Phi_x(n^\xi_{U_0(x)}(\tau)) - \Phi_x(g^\xi_{U_0(x)})\bigr] \Bigr\} \le \varepsilon_1 \Delta \prod_{\xi \in C} e^{-R^{-2\nu} \Delta_0 |B|}. \tag{5.8}$$

In the last inequality we used Assumption 2 in the form of the bound (2.5) as well as the lower bound $|\tau_2 - \tau_1| = \frac{|B|}{|A|} \ge \frac{|B|}{R^\nu}$ for the support $B = A \times [\tau_1, \tau_2]$ of the loop $\xi$. For any $\xi \in C = (\xi_1, \dots, \xi_n)$, let $\tau$ be the time at which the first transition in $C$ occurs (we assume that it happens for the "first" loop $\xi_1$) and $\tau^\xi$ be such that $\tau + \tau^\xi$ is the time at which the first transition in $\xi$ occurs ($\tau^{\xi_1} = 0$). Referring to the condition (i) on the number of loops in $C$, we get the inequality

$$\sum_{\xi \ne \xi_1} |\tau^\xi| \le k' \sum_{\xi} |B|,$$

and thus also

$$1 \le \prod_{\xi \ne \xi_1} e^{-\frac{\Delta_0}{2 k' R^{2\nu}} |\tau^\xi|} \prod_{\xi} e^{\frac12 R^{-2\nu} \Delta_0 |B|}.$$

Integrating now over the time of the first transition for each $\xi \in C$, $\xi \ne \xi_1$, and taking into account that $|\varphi^T(\xi_1, \dots, \xi_n)| \le n^{n-2}$, we get

$$J' \le \tilde\beta \varepsilon_1 \sum_{n=1}^{k'} \frac{n^{n-2}}{(n-1)!} \Bigl( \frac{2 k' R^{2\nu}}{\Delta_0} \Bigr)^{n-1} \Bigl\{ \int_{\mathcal L_\Lambda^{(x,\tau)}} d\xi\, e^{-\frac12 R^{-2\nu} \Delta_0 |B|}\, \mathbb I[\xi : k'] \Bigr\}^n. \tag{5.9}$$

Here the constraint $\mathbb I[\xi_i : k']$ means that the loop $\xi_i$ satisfies the conditions (ii)–(iv) above. We have then a finite number of finite terms, the contribution of which is bounded by a fixed number $K < \infty$ (depending on $\varepsilon_0$, $\tilde\beta$, and $k'$). Thus $J' \le \tilde\beta \varepsilon_1 K$ which we can suppose sufficiently small if $\varepsilon_1$ is small.

(b) Small clusters. Let us first notice that $|\Phi^T(C; \Gamma)| \le |\Phi^T(C)|$, and since $M(C) = M$, inequality (5.1) is valid. Moreover $C$ must contain at least one of the two boundary points $(y, t\frac{\tilde\beta}{\Delta} \pm \frac12 \frac{\tilde\beta}{\Delta})$ of some cell $C(y, t)$ for which $\operatorname{dist}(x, y) \le R$. Indeed, given that $C$ is small and at the same time $\tilde C \cap \operatorname{core}\Gamma \ne \emptyset$ (cf. Lemma 4.3), this is the only way to satisfy also $C \not\subset C(\operatorname{supp}\Gamma)$ [cf. (4.32)]. Thus it suffices to use again (4.14) and (5.3) to estimate

$$(2R)^\nu \int_{\mathcal C_\Lambda^{\rm small}} dC\, \mathbb I[C \ni (x, \tau)]\, |\Phi^T(C)| \prod_{\xi \in C} e^{c|A| + c\frac{\Delta}{\tilde\beta}|B|}.$$

(c) Bound for $\psi^\beta$. Finally, we estimate the expression involving $\psi^\beta$. We first observe that

$$e^{\alpha\beta}\, |\psi^\beta_A(g_A)| \le 1 \tag{5.10}$$
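for any $A \subset \mathbb Z^\nu$ and with $\alpha = \frac12 R^{-2\nu} \Delta_0$. Indeed,

$$\begin{aligned} e^{\alpha\beta} |\psi^\beta_A(g_A)| &= e^{\alpha\beta} |\Psi^\beta_A(g_A) - \Psi_A(g_A)| \\ &= e^{\alpha\beta}\, \Bigl| -\int_{\mathcal C_\Lambda^{\rm small}} dC\, \frac{\Phi^T(C)}{|I_C|}\, \mathbb I\bigl[C \sim g_A,\ A_C = A,\ I_C \ni 0,\ C \subset \Lambda \times [0, \beta]_{\rm per},\ |I_C| = \beta\bigr] \\ &\qquad\quad + \int_{\mathcal C_\Lambda^{\rm small}} dC\, \frac{\Phi^T(C)}{|I_C|}\, \mathbb I\bigl[C \sim g_A,\ A_C = A,\ I_C \ni 0,\ C \subset \Lambda \times [-\infty, \infty],\ |I_C| \ge \beta\bigr] \Bigr|. \end{aligned} \tag{5.11}$$

The first integral above corresponds to clusters wrapped around the torus in vertical direction, while the second one assumes integration over all clusters in $\Lambda \times [-\infty, \infty]$. For any $C$ above $|I_C| \ge \beta$ and thus

$$e^{\alpha\beta} \le \prod_{\xi \in C} e^{\alpha |B|}.$$

Observing now that every cluster in both integrals necessarily contains in its support at least one of the points $(x, 0)$, $x \in A$, and using the fact that $\operatorname{diam} A \le R$, we can bound the first integral by

$$\frac{R^\nu}{\beta} \int_{\mathcal C_\Lambda^{\rm small}} dC\, \mathbb I[C \ni (x, 0)]\, |\Phi^T(C)| \prod_{\xi \in C} e^{\alpha |B|},$$

which can be directly evaluated by (4.14). The same bound can be actually used also for the second integral, once we realize that the estimate (4.14) is uniform in $\beta$.

Using now the fact that $\psi^\beta_A = 0$ if $\operatorname{diam} A > R$, the condition $M(A \times \{\tau\}) = M$ implies that $M$ has less than $R^\nu$ sites, hence $e^{c|M|} \le e^{cR^\nu}$. Furthermore, referring to (5.10), we have

$$\int_{\mathbb T_\Lambda} d(A, \tau)\, |\psi^\beta_A(\cdot)|\; \mathbb I\bigl[M(A \times \{\tau\}) = M\bigr]\, e^{c|M|} \le \frac{\tilde\beta}{\Delta}\, e^{-\frac12 R^{-2\nu} \Delta_0 \beta + cR^\nu}, \tag{5.12}$$

which can be made small for $\beta$ sufficiently large and concludes thus the proof of the lemma. $\square$

Using Lemma 5.1 and introducing $e_0 = \min_{d \in D} e(d)$, we can estimate the weight $z$ of the contours in the discrete space of cells.

Lemma 5.2. Under Assumptions 1–6, for any $c < \infty$, there exist $\beta_0, \tilde\beta_0 < \infty$ and $\varepsilon_0 > 0$ such that for any $\beta \ge \beta_0$, $\tilde\beta_0 \le \tilde\beta < 2\tilde\beta_0$, and $\|T\|, \varepsilon_1, \varepsilon_2 \le \varepsilon_0$, one has

$$|z(Y)| \le e^{-\frac{\tilde\beta}{\Delta} e_0 |Y|}\, e^{-c|Y|}$$

for any contour $Y$.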
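Proof. For a given $\Gamma$ (such that $M(\operatorname{supp}\Gamma) \subset \operatorname{supp} Y$) with transitions $\{A_1, \dots, A_m\}$ at times $\{\tau_1, \dots, \tau_m\}$, we define $A(\Gamma) = \cup_{i=1}^{m} \cup_{x \in A_i} [U(x) \times \tau_i]$, $\mathcal A = M(A(\Gamma))$, and $\mathcal E \subset \operatorname{supp} Y \setminus \mathcal A$ to be the set of sites $(x, t)$ such that $n^\Gamma_{U(x)}(\tau) \notin D_{U(x)}$ for some $(x, \tau) \in C(x, t)$. The latter can be split into two disjoint subsets, $\mathcal E = \mathcal E^{\rm core} \cup \mathcal E^{\rm soft}$, with $(x, t) \in \mathcal E^{\rm core}$ whenever $n^\Gamma_{U(x)}(\tau) \notin G_{U(x)}$ for some $(x, \tau) \in C(x, t)$. The condition $M(\operatorname{supp}\Gamma) \cup \operatorname{supp}\mathcal M = \operatorname{supp} Y$ in (4.35) implies the inequality

$$e^{c|Y|} \le e^{c (2R)^\nu |A(\Gamma)|}\, e^{c|\mathcal E|} \prod_{M \in \mathcal M} e^{c|M|}.$$

From definitions (4.35) of $z(Y)$ and (4.28) of $z(\gamma)$, and using Assumption 4, we have

$$\begin{aligned} e^{c|Y|} |z(Y)| \le {} & \sum_{\mathcal A \subset \operatorname{supp} Y} e^{-\frac{\tilde\beta}{\Delta} e_0 |\operatorname{supp} Y \setminus \mathcal A|} \sum_{\mathcal E \subset \operatorname{supp} Y \setminus \mathcal A}\ \sum_{\mathcal E^{\rm core} \subset \mathcal E} e^{-(\tilde\beta - c) |\mathcal E \setminus \mathcal E^{\rm core}|}\, e^{-\bigl(\frac{\tilde\beta}{\Delta} \frac{\Delta_0}{2} (2R)^{-\nu} - c\bigr) |\mathcal E^{\rm core}|} \\ & \times \int_{\mathcal D_\Lambda} d\Gamma\; \mathbb I\bigl[M(A(\Gamma)) = \mathcal A,\ M(\operatorname{core}\Gamma) = \mathcal E^{\rm core}\bigr] \prod_{i=1}^{m} \bigl|\langle n^\Gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\Gamma_{A_i}(\tau_i + 0) \rangle\bigr|\, e^{c (2R)^\nu |A_i|} \\ & \times \exp\Bigl\{ -\int_{C(\mathcal A)} d(x, \tau)\, \Upsilon_x(n^\Gamma_{U(x)}(\tau)) \Bigr\}\, e^{|\tilde R(\Gamma)|} \sum_{\mathcal M,\, \operatorname{supp}\mathcal M \subset \operatorname{supp} Y}\ \prod_{M \in \mathcal M} \bigl| e^{\varphi(M; \Gamma)} - 1 \bigr|\, e^{c|M|}. \end{aligned} \tag{5.13}$$

All elements in $\mathcal M$ are different, because it is so in the expansion (4.34). Therefore we have

$$\begin{aligned} \sum_{\mathcal M,\, \operatorname{supp}\mathcal M \subset \operatorname{supp} Y}\ \prod_{M \in \mathcal M} \bigl| e^{\varphi(M; \Gamma)} - 1 \bigr|\, e^{c|M|} &\le \sum_{n \ge 0} \frac{1}{n!} \Bigl[ \sum_{M \subset \operatorname{supp} Y} \bigl| e^{\varphi(M; \Gamma)} - 1 \bigr|\, e^{c|M|} \Bigr]^n \\ &\le \sum_{n \ge 0} \frac{1}{n!} \Bigl[ |Y| \sum_{M \ni (x,t)} \bigl| e^{\varphi(M; \Gamma)} - 1 \bigr|\, e^{c|M|} \Bigr]^n, \end{aligned} \tag{5.14}$$

and using Lemma 5.1 this may be bounded by $e^{|Y|}$. In (4.33) clusters are small, and they must contain a space-time site $(x, \tau)$ such that there exists $x'$ with $(x', \tau) \in \operatorname{core}\Gamma$ and $\operatorname{dist}(x, x') < R$. So we have the bound

$$|\tilde R(\Gamma)| \le (2R)^\nu\, |\operatorname{core}\Gamma| \int_{\mathcal C_\Lambda^{\rm small}} dC\, \mathbb I[C \ni (x, \tau)]\, |\Phi^T(C)|,$$

since $|\Phi^T(C; \Gamma)| \le |\Phi^T(C)|$. Taking now, in Lemma 4.1, the constants $c = \alpha_1 = \alpha_2 = 0$ and $\delta = \frac{\Delta_0}{4 (2R)^{2\nu}}$, and choosing the corresponding $\varepsilon_0$, we apply (4.14) to get, for any $\|T\| \le \varepsilon_0$, the bound

$$|\tilde R(\Gamma)| \le \frac{\Delta_0}{4} (2R)^{-\nu} |\operatorname{core}\Gamma| \le \frac{\tilde\beta}{\Delta} \frac{\Delta_0}{4} (2R)^{-\nu} |\mathcal E^{\rm core}| + \frac{\Delta_0}{4} (2R)^{-\nu} |\operatorname{core}\Gamma \cap C(\mathcal A)|.$$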
Assuming $\tilde\beta>c$ and $\frac{\tilde\beta}{\Delta}\frac{\Delta_0}{4}>(2R)^\nu c$ [cf. (5.3)], we bound
$$e^{-(\tilde\beta-c)|E\setminus E^{\rm core}|}\,e^{-\bigl(\frac{\tilde\beta}{\Delta}\frac{\Delta_0}{4}(2R)^{-\nu}-c\bigr)|E^{\rm core}|}\le1.$$
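Inserting these estimates into (5.13), we get
$$\begin{aligned}
e^{c|Y|}|z(Y)|\le{}&e^{-\frac{\tilde\beta}{\Delta}e_0|Y|}\,e^{|Y|}\sum_{A\subset\operatorname{supp}Y}3^{|\operatorname{supp}Y\setminus A|}\int_{\mathcal D_\Lambda}d\Gamma\ I\bigl[M(A(\Gamma))=A\bigr]\prod_{i=1}^m\bigl|\langle n^\Gamma_{A_i}(\tau_i-0)|\,T_{A_i}\,|n^\Gamma_{A_i}(\tau_i+0)\rangle\bigr|\,e^{c(2R)^\nu|A_i|}\\
&\times\exp\Bigl\{-\int_{C(A)}d(x,\tau)\Bigl[\Upsilon_x\bigl(n^\Gamma_{U(x)}(\tau)\bigr)-e_0-\frac{\Delta_0}{4}(2R)^{-\nu}\,I\bigl[(x,\tau)\in\operatorname{core}\Gamma\bigr]\Bigr]\Bigr\}.
\end{aligned}\tag{5.15}$$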
To estimate the above expression, we will split the “transition part” of the considered quantum contours into connected components, to be called fragments, and deal with them separately. Even though the weight of a quantum contour cannot be partitioned into the corresponding fragments, we will get an upper bound combined from fragment bounds. Consider thus the set
$$\hat A(\Gamma)=\operatorname{core}\Gamma\cap C(A(\Gamma))$$
and the fragments $\zeta_i=(B_i,\omega_{B_i})$ on the connected components $B_i$ of $\hat A(\Gamma)$, $\hat A(\Gamma)=\cup_{i=1}^nB_i$, where $\omega_{B_i}$ is the restriction of $\omega^\Gamma$ onto $B_i$. From Assumption 4, we have
$$\int_{C(A)}d(x,\tau)\Bigl[\Upsilon_x\bigl(n^\Gamma_{U(x)}(\tau)\bigr)-e_0-\frac{\Delta_0}{4}(2R)^{-\nu}\,I\bigl[(x,\tau)\in\operatorname{core}\Gamma\bigr]\Bigr]\ge\frac14(2R)^{-\nu}\Delta_0\sum_{i=1}^n|B_i|.$$
Let us introduce a bound for the contribution of a fragment $\zeta$ with transitions $A_j$ at times $\tau_j$, $j=1,\dots,k$,
$$\hat z(\zeta)=e^{-\frac14(2R)^{-\nu}\Delta_0|B|}\prod_{j=1}^k\bigl|\langle n^\zeta_{A_j}(\tau_j-0)|\,T_{A_j}\,|n^\zeta_{A_j}(\tau_j+0)\rangle\bigr|\,e^{c(2R)^\nu|A_j|}.$$
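Then, integrating over the set $\mathcal F_{C(A)}$ of all fragments in $C(A)$, we get
$$e^{c|Y|}|z(Y)|\le e^{-\frac{\tilde\beta}{\Delta}e_0|Y|}\,e^{|Y|}\sum_{A\subset\operatorname{supp}Y}3^{|\operatorname{supp}Y\setminus A|}\sum_{n\ge0}\frac1{n!}\Bigl(\int_{\mathcal F_{C(A)}}d\zeta\,\hat z(\zeta)\Bigr)^n.$$
Anticipating the bound $\int_{\mathcal F_{C(A)}}d\zeta\,\hat z(\zeta)\le|A|$, we immediately get the claim,
$$e^{c|Y|}|z(Y)|\le e^{-\frac{\tilde\beta}{\Delta}e_0|Y|}\,e^{3|Y|},\tag{5.16}$$
with a slight change of constant $c\to c-3$.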
A bound on the integral of fragments. Let us first consider short fragments ζ = (B, ωB ) satisfying the condition k
˜ log(β/1) 1X 6 log β˜ + k |Aj | 6 2 log(1/kT k)
(5.17)
j =1
(if $\|T\|\le1$). The integral over the time of occurrence of the first transition yields the factor $\tilde\beta/\Delta$. Notice that $\zeta$ is not a loop. This follows from the construction of quantum contours and the fact that $B$ is a connected component of $\hat A(\Gamma)$, where every transition is taken together with its $R$-neighbourhood. Thus, either its sequence of transitions does not belong to $\mathcal S$, or the starting configuration does not coincide with the ending configuration. In the first case we use Assumption 5, in the second case Assumption 6, and since (5.17) means that the sum over transitions is bounded, we can write
$$\int_{\mathcal F^{\rm short}_{C(A)}}d\zeta\,\hat z(\zeta)\le\tfrac12|A|,\tag{5.18}$$
if $\varepsilon_1$ and $\varepsilon_2$ are small enough, independently of $\|T\|$. Finally, we estimate the integral over $\zeta$'s that are not short. We have
$$\int_{\mathcal F_{C(A)}\setminus\mathcal F^{\rm short}_{C(A)}}d\zeta\,\hat z(\zeta)\le|A|\,\frac{\tilde\beta}{\Delta}\int_{\mathcal F^{(x,\tau)}_{C(A)}\setminus\mathcal F^{\rm short}_{C(A)}}d\zeta\,\hat z(\zeta).\tag{5.19}$$
Here $\mathcal F^{(x,\tau)}_{C(A)}$ is the set of all fragments $\zeta$ whose first quantum transition $(A_1,\tau_1)$ is such that $x\in A_1$ and $\tau=\tau_1$. Whenever $\zeta$ is not short, we have
$$1\le\frac{\Delta}{\tilde\beta}\prod_{j=1}^k\|T\|^{-\frac12|A_j|}.$$
Thus, defining
$$\hat z'(\zeta)=e^{-\frac14(2R)^{-\nu}\Delta_0|B|}\prod_{j=1}^k\Bigl[\|T\|^{\frac12}\,e^{c(2R)^\nu+1}\Bigr]^{|A_j|},\tag{5.20}$$
we find the bound
$$|A|\int_{\mathcal F(x,\tau)}d\zeta\,\hat z'(\zeta).$$
Here, slightly overestimating, we take for $\mathcal F(x,\tau)$ the set of all fragments containing a quantum transition $(A,\tau)$ with $x\in A$. The support $B$ of a fragment $\zeta=(B,\omega_B)\in\mathcal F(x,\tau)$ is a finite union of vertical segments (i.e. sets of the form $\{y\}\times[\tau_1,\tau_2]\subset T_\Lambda$) and $k$ horizontal quantum transitions $A_1,\dots,A_k$. We will finish the proof by proving by induction the bound
$$\int_{\mathcal F(x,\tau;k)}d\zeta\,\hat z'(\zeta)\le1\tag{5.21}$$
with $\mathcal F(x,\tau;k)$ denoting the set of fragments from $\mathcal F(x,\tau)$ with at most $k$ quantum transitions.
Consider thus a fragment $\zeta$ with $k$ horizontal quantum transitions connected by vertical segments. Let $(A,\tau)$ be the transition containing the point $(x,\tau)$ and let $(A_1,\tau+\tau_1),\dots,(A_\ell,\tau+\tau_\ell)$ be the transitions that are connected by (one or several) vertical segments of the respective lengths $|\tau_1|,\dots,|\tau_\ell|$ with the transition $(A,\tau)$. If we remove all those segments, the fragment $\zeta$ will split into the “naked” transition $(A,\tau)$ and additional $\bar\ell\le\ell$ fragments $\zeta_1,\dots,\zeta_{\bar\ell}$, such that each fragment $\zeta_j$, $j=1,\dots,\bar\ell$, belongs to $\mathcal F(y_j,\tau+\tau_j;k-1)$ with $y_j\in A$. Taking into account that the number of configurations (determining the possible vertical segments attached to $A$) above and below $A$ is bounded by $S^{2|A|}$ and that the number of possibilities to choose the points $y_j$ is bounded by $|A|^{\bar\ell}$, we get
$$\begin{aligned}
\int_{\mathcal F(x,\tau;k)}d\zeta\,\hat z'(\zeta)\le{}&\sum_{A,\ \operatorname{dist}(A,x)}\Bigl(\|T\|^{\frac12}\,e^{c(2R)^\nu+1}S^2\Bigr)^{|A|}\sum_{\bar\ell\ge1}\frac{|A|^{\bar\ell}}{\bar\ell!}\int d\tau_1\cdots\int d\tau_{\bar\ell}\ e^{-\frac12(2R)^{-\nu}\Delta_0(\tau_1+\dots+\tau_{\bar\ell})}\prod_{j=1}^{\bar\ell}\int_{\mathcal F(y_j,\tau+\tau_j;k-1)}d\zeta_j\,\hat z'(\zeta_j)\\
\le{}&\sum_{A,\ \operatorname{dist}(A,x)}\Bigl(\|T\|^{\frac12}S^2\,e^{c(2R)^\nu+2}\Bigr)^{|A|}e^{2(2R)^\nu/\Delta_0}\le1
\end{aligned}\tag{5.22}$$
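once $\|T\|$ is sufficiently small. □

In the application of Pirogov–Sinai theory we shall also need a bound on derivatives of the weight of contours.

Lemma 5.3. Under Assumptions 1–7, for any $c<\infty$, there exist constants $\alpha,\beta_0,\tilde\beta_0<\infty$ and $\varepsilon_0>0$ such that if $\beta>\beta_0$, $\tilde\beta_0\le\tilde\beta<2\tilde\beta_0$, $\|T\|+\sum_{i=1}^{r-1}\bigl\|\frac{\partial}{\partial\mu_i}T\bigr\|\le\varepsilon_0$, and $\varepsilon_1,\varepsilon_2\le\varepsilon_0$, one has
$$\Bigl|\frac{\partial}{\partial\mu_i}z(Y)\Bigr|\le\alpha\tilde\beta|Y|\,e^{-\frac{\tilde\beta}{\Delta}e^\mu_0|Y|}\,e^{-c|Y|}$$
for any contour $Y$.

Proof. From the definition (4.35) of $z$, one has
$$\begin{aligned}
\Bigl|\frac{\partial}{\partial\mu_i}z(Y)\Bigr|\le{}&|z(Y)|\Bigl[\sum_{\gamma\in\Gamma}\Bigl|\frac{\partial}{\partial\mu_i}z(\gamma)\Bigr|+\sum_{d\in D}\Bigl|\frac{\partial}{\partial\mu_i}e^\mu(d)\Bigr|\,\bigl|W_d\cap C(\operatorname{supp}Y)\bigr|+\Bigl|\frac{\partial}{\partial\mu_i}\tilde R(\Gamma)\Bigr|\Bigr]\\
&+\int_{\mathcal D_\Lambda(Y)}d\Gamma\prod_{\gamma\in\Gamma}|z(\gamma)|\prod_{d\in D}e^{-e^\mu(d)|W_d\cap C(\operatorname{supp}Y)|}\,e^{|\tilde R(\Gamma)|}\sum_{\mathcal M}I\bigl[M(\operatorname{supp}\Gamma)\cup\operatorname{supp}\mathcal M=\operatorname{supp}Y\bigr]\\
&\qquad\times\sum_{M\in\mathcal M}\Bigl|\frac{\partial}{\partial\mu_i}\varphi(M;\Gamma)\Bigr|\,\bigl|e^{\varphi(M;\Gamma)}\bigr|\prod_{M'\in\mathcal M,\ M'\ne M}\bigl|e^{\varphi(M';\Gamma)}-1\bigr|.
\end{aligned}\tag{5.23}$$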
The bound for $\bigl|\frac{\partial}{\partial\mu_i}z(\gamma)\bigr|$ is standard, see [BKU1], and $\bigl|\frac{\partial}{\partial\mu_i}e^\mu(d)\bigr|$ is assumed to be bounded in Assumption 7. For the other terms we have to control clusters of loops. Since we have exponential decay for $z(\xi)$ with any strength (by taking $\beta$ large and $\|T\|$ small), we have the same for $\frac{\partial}{\partial\mu_i}z(\xi)$ (by taking $\beta$ larger and $\|T\|$ smaller). The integrals over $C$ can be estimated as before, the only effect of the derivative being an extra factor $n$ (when the clusters have $n$ loops). □
6. Expectation Values of Local Observables and Construction of Pure States

So far we have obtained an expression (4.36) for the partition function $Z^{\rm per}_\Lambda$ of the quantum model on torus $\Lambda$ in terms of that of a classical lattice contour model with the weights of the contours showing an exponential decay with respect to their length. Using the same weights $z(Y)$, we can also introduce the partition functions $Z^d_{\Lambda(L)}$ with the torus $\Lambda$ replaced by a hypercube $\Lambda(L)$ and with fixed boundary conditions $d$. Namely, we take simply the sum only over those collections $\mathcal Y$ of contours whose external contours are labeled by $d$ and are not close to the boundary.$^{13}$ Notice, however, that here we are defining $Z^d_{\Lambda(L)}$ directly in terms of the classical contour model, without ensuring existence of a corresponding partition function for the original model. We will use these partition functions only as a tool for proving our theorems that are stated directly in terms of quantum models.

To be more precise, we can extend the definition even more and consider, instead of the torus $\Lambda$, any finite set $V\subset\mathbb L=\mathbb Z^\nu\times\{0,1,\dots,N-1\}^{\rm per}$. There is a class of contours that can be viewed as having their support contained in $V\subset\mathbb L$. For any such contour $Y$ we introduce its interior $\operatorname{Int}Y$ as the union of all finite components of $\mathbb L\setminus\operatorname{supp}Y$ and $\operatorname{Int}_dY$ as the union of all components of $\operatorname{Int}Y$ whose boundary is labelled by $d$. Recalling that we assumed $\nu\ge2$, we note that the set $\mathbb L\setminus(\operatorname{supp}Y\cup\operatorname{Int}Y)$ is a connected set, implying that the label $\alpha_Y(\cdot)$ is constant on the boundary of the set $V(Y)=\operatorname{supp}Y\cup\operatorname{Int}Y$. We say that $Y$ is a $d$-contour, if $\alpha_Y=d$ on this boundary. Two contours $Y$ and $Y'$ are called mutually external if $V(Y)\cap V(Y')=\emptyset$. Given an admissible set $\mathcal Y$ of contours, we say that $Y\in\mathcal Y$ is an external contour in $\mathcal Y$, if $\operatorname{supp}Y\cap V(Y')=\emptyset$ for all $Y'\in\mathcal Y$, $Y'\ne Y$. The sets $\mathcal Y$ contributing to $Z^d_V$ are such that all their external contours are $d$-contours and $\operatorname{dist}(Y,\partial V)>1$ for every $Y\in\mathcal Y$.
In this way we find ourselves exactly in the setting of standard Pirogov–Sinai theory, or rather, the reformulation for “thin slab” (cylinder $\mathbb L$ of fixed temporal size $N$) as presented in Sects. 5–7 and Appendix of [BKU1]. In particular, for sufficiently large $\beta$ and sufficiently small $\|T\|+\sum_{i=1}^{r-1}\bigl\|\frac{\partial}{\partial\mu_i}T\bigr\|$, there exist functions $f^{\beta,\mu}(d)$, metastable free energies, such that the condition $\operatorname{Re}f^{\beta,\mu}(d)=f_0$, with $f_0\equiv f_0^{\beta,\mu}$ defined by $f_0=\min_{d'\in D}\operatorname{Re}f^{\beta,\mu}(d')$, characterizes the existence of pure stable phase $d$. Namely, as will be shown next, a pure stable phase $\langle\cdot\rangle^d_\beta$ exists and is close to the pure ground state $|d\rangle$.

There is one subtlety in the definition of $f^{\beta,\mu}(d)$. Namely, after choosing a suitable $\tilde\beta_0$, given $\beta$, there exist several pairs $(\tilde\beta,N)$ such that $\tilde\beta\in(\tilde\beta_0,2\tilde\beta_0)$ and $N\tilde\beta=\beta$. To be specific, we may agree to choose among them that one with maximal $N$. The function $f^{\beta,\mu}(d)$ is then uniquely defined for each $\beta>\beta_0$. Notice, however, that while increasing $\beta$, we pass, at the particular value $\beta_N=N\tilde\beta_0$, from discretization of temporal size $N$ to $N+1$. As a result, the function $f^{\beta,\mu}(d)$ might be discontinuous at $\beta_N$ with $\beta=\infty$ being an accumulation point of such discontinuities. Nevertheless, these discontinuities are harmless. They can appear only when $\operatorname{Re}f^{\beta,\mu}(d)>f_0$ and do not change anything in the following argument.

$^{13}$ In the terminology of Pirogov–Sinai theory we rather mean diluted partition functions – see the more precise definition below.

Before we come to the construction of pure stable phases, notice that the first claim of Theorem 2.2 (equality of $f_0$ with the limiting free energy) is now a direct consequence of the bound
$$\Bigl|Z^{\rm per}_\Lambda-|Q|\,e^{-\tilde\beta f_0NL^\nu}\Bigr|\le e^{-\tilde\beta f_0NL^\nu}\,O\bigl(e^{-{\rm const}\,L}\bigr)\tag{6.1}$$
[cf. [BKU1], (7.14)]. Here $Q=\{d;\ \operatorname{Re}f^{\beta,\mu}(d)=f_0\}$.

The expectation value of a local observable $K$ is defined as
$$\langle K\rangle^{\rm per}_\Lambda=\frac{\operatorname{Tr}K\,e^{-\beta H_\Lambda}}{\operatorname{Tr}e^{-\beta H_\Lambda}}.\tag{6.2}$$
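In Sect. 4 we have obtained a contour expression for $Z^{\rm per}_\Lambda=\operatorname{Tr}e^{-\beta H_\Lambda}$. We retrace here the same steps for $Z^{\rm per}_\Lambda(K):=\operatorname{Tr}K\,e^{-\beta H_\Lambda}$. The Duhamel expansion (4.1) for $Z^{\rm per}_\Lambda(K)$ leads to an equation analogous to (4.2),
$$\begin{aligned}
Z^{\rm per}_\Lambda(K)={}&\sum_{m\ge0}\ \sum_{n^0_\Lambda,\dots,n^m_\Lambda}\ \sum_{\substack{A_1,\dots,A_m\\ \bar A_i\subset\Lambda}}\ \int_{0<\tau_1<\dots<\tau_m<\beta}d\tau_1\dots d\tau_m\ \langle n^0_\Lambda|\,K\,|n^1_\Lambda\rangle\\
&\times e^{-\tau_1V_\Lambda(n^1_\Lambda)}\langle n^1_\Lambda|\,T_{A_1}\,|n^2_\Lambda\rangle\,e^{-(\tau_2-\tau_1)V_\Lambda(n^2_\Lambda)}\dots\langle n^m_\Lambda|\,T_{A_m}\,|n^0_\Lambda\rangle\,e^{-(\beta-\tau_m)V_\Lambda(n^0_\Lambda)}.
\end{aligned}\tag{6.3}$$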
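Configurations $n^0_\Lambda$ and $n^1_\Lambda$ match on $\Lambda\setminus\operatorname{supp}K$ ($\operatorname{supp}K\subset\Lambda$ is a finite set due to the locality of $K$), but may differ on $\operatorname{supp}K$ if $K$ is an operator with non-zero off-diagonal terms. Let $Q_\Lambda(K)$ be the set of quantum configurations with $n_\Lambda(\tau)$ that is constant except possibly at $\cup_{i=1}^m(A_i\times\tau_i)\cup(\operatorname{supp}K\times0)$. Then
$$Z^{\rm per}_\Lambda(K)=\int_{Q_\Lambda(K)}d\omega_{T_\Lambda}\,\langle n^0_\Lambda|\,K\,|n^1_\Lambda\rangle\,\rho(\omega_{T_\Lambda}).\tag{6.4}$$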
We identify loops with the same iteration scheme as in Sect. 4, starting with the set $B^{(0)}(\omega)\cup(\operatorname{supp}K\times0)$ instead of $B^{(0)}(\omega)$ only. This leads to the set $B^K(\omega)$. Removing the loops, we define $B^K_{\rm e}(\omega)$, whose connected components form quantum contours. There is one special quantum contour, namely that which contains $\operatorname{supp}K\times0$. Let us denote it by $\gamma^K$ and define its weight [see (4.28)]
$$z^K(\gamma^K)=\langle n^{\gamma^K}_{\operatorname{supp}K}(-0)|\,K\,|n^{\gamma^K}_{\operatorname{supp}K}(+0)\rangle\prod_{i=1}^m\langle n^{\gamma^K}_{A_i}(\tau_i-0)|\,T_{A_i}\,|n^{\gamma^K}_{A_i}(\tau_i+0)\rangle\exp\Bigl\{-\int_Bd(x,\tau)\,\Upsilon_x\bigl(n^{\gamma^K}_{U(x)}(\tau)\bigr)\Bigr\}.\tag{6.5}$$
Let $\Gamma^K=\{\gamma^K,\gamma_1,\dots,\gamma_k\}$ be an admissible set of quantum contours, defining a quantum configuration $\omega^{\Gamma^K}\in Q_\Lambda(K)$. Then we have an expression similar to that of Lemma 4.4,
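$$Z^{\rm per}_\Lambda(K)=\int_{\mathcal D_\Lambda(K)}d\Gamma^K\prod_{d\in D}e^{-|W_d(\Gamma^K)|e(d)}\ z^K(\gamma^K)\prod_{\gamma\in\Gamma^K\setminus\{\gamma^K\}}z(\gamma)\ e^{R(\Gamma^K)},\tag{6.6}$$
with $R(\Gamma^K)$ as in (4.29) with $\Gamma$ replaced by $\Gamma^K$.

Next step is to discretize the lattice, to expand $e^{R(\Gamma^K)}$, and if $Y^K$ is the contour that contains $\operatorname{supp}K\times0\subset\mathbb L_\Lambda$, to define $z^K(Y^K)$ [see (4.35)]:
$$\begin{aligned}
z^K(Y^K)={}&\int_{\mathcal D_\Lambda(Y^K)}d\Gamma^K\ z^K(\gamma^K)\prod_{\gamma\in\Gamma^K\setminus\{\gamma^K\}}z(\gamma)\prod_{d\in D}e^{-e(d)|W_d(\Gamma^K)\cap C(\operatorname{supp}Y^K)|}\,e^{\tilde R(\Gamma^K)}\\
&\times\sum_{\mathcal M}I\bigl[M(\operatorname{supp}\Gamma^K)\cup\operatorname{supp}\mathcal M=\operatorname{supp}Y^K\bigr]\prod_{M\in\mathcal M}\bigl(e^{\varphi(M;\Gamma^K)}-1\bigr).
\end{aligned}\tag{6.7}$$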
We also need a bound for $z^K(Y^K)$. It is clear that the situation is the same as for Lemmas 5.1 and 5.2, except for a factor $\langle n^{\gamma^K}_{\operatorname{supp}K}(-0)|\,K\,|n^{\gamma^K}_{\operatorname{supp}K}(+0)\rangle$ that is bounded by $\|K\|$. We can thus summarize:

Lemma 6.1. Under Assumptions 1–6, for any $c<\infty$, there exist $\beta_0,\tilde\beta_0<\infty$, and $\varepsilon_0>0$ such that if $\beta>\beta_0$, $\tilde\beta_0\le\tilde\beta<2\tilde\beta_0$ and $\|T\|,\varepsilon_1,\varepsilon_2\le\varepsilon_0$, we have
$$Z^{\rm per}_\Lambda(K)=\sum_{\mathcal Y^K=\{Y^K,Y_1,\dots,Y_k\}}\ \prod_{d\in D}e^{-\frac{\tilde\beta}{\Delta}e(d)|W_d(\mathcal Y^K)|}\ z^K(Y^K)\prod_{Y\in\mathcal Y^K\setminus\{Y^K\}}z(Y),\tag{6.8}$$
for every local observable $K$, with
$$|z^K(Y^K)|\le\|K\|\,e^{c|\operatorname{supp}K|}\,e^{-\frac{\tilde\beta}{\Delta}e_0|Y^K|}\,e^{-c|Y^K|}$$
for any contour $Y^K$.

In a similar manner as at the beginning of this section, we can introduce $Z^d_V(K)$ for any $V\subset\mathbb L$ by restricting ourselves in the sum (6.8) to the collections $\mathcal Y^K$ all of whose external contours are $d$-contours and $\operatorname{dist}(Y,\partial V)>1$ for every $Y\in\mathcal Y^K$. Thus we can define the expectation value
$$\langle K\rangle^d_V=\frac{Z^d_V(K)}{Z^d_V}\tag{6.9}$$
for any $V\subset\mathbb L$ and, in particular, the expectation $\langle K\rangle^d_{\Lambda(L)}$ for a hypercube $\Lambda(L)$. Again, this is exactly the setting discussed in detail in [BKU1]. We can use directly the corresponding results (cf. [BKU1], Lemma 6.1) to prove first that the limiting state $\langle\cdot\rangle^d_\beta$ exists. Further, retracing the proof of Theorem 2.2 in [BKU1] we prove that the limit
$$\langle K\rangle^{\rm per}_\beta=\lim_{\Lambda\nearrow\mathbb Z^\nu}\frac{\operatorname{Tr}K\,e^{-\beta H_\Lambda}}{\operatorname{Tr}e^{-\beta H_\Lambda}}\tag{6.10}$$
exists for every local $K$ (proving thus Theorem 2.1). Moreover,
$$\langle K\rangle^{\rm per}_\beta=\frac1{|Q|}\sum_{d\in Q}\langle K\rangle^d_\beta,\tag{6.11}$$
where, again, $Q$ denotes the set of stable phases, $Q=\{d;\ \operatorname{Re}f^{\beta,\mu}(d)=f_0\}$. Thus we proved the claim d) of Theorem 2.2. Also the assertion c) follows in standard manner from the contour representation employing directly the exponential decay of contour activities and the corresponding cluster expansion [cf. [BKU1], (2.27)].

Before passing to the proof of b), we shall verify that $\langle\cdot\rangle^d_\beta$ is actually a pure stable state according to our definition, i.e. a limit of thermodynamically stable states.$^{14}$ To this end, let us first discuss how metastable free energies $f^{\beta,\mu}(d)$ change with $\mu$. The standard construction yields $f^{\beta,\mu}(d)$ in the form of a sum $e^\mu(d)+s^{\beta,\mu}(d)$, where $s^{\beta,\mu}(d)$ is the free energy of the “truncated” contour model $K'_d(Y)$ [see [BKU1], (5.13) and (5.6)] constructed from the labelled contour model (4.36), which is under control by cluster expansions. As a result, we have bounds of the form $O\bigl(e^{-\beta}+\|T\|+\sum_{i=1}^{r-1}\bigl\|\frac{\partial T}{\partial\mu_i}\bigr\|\bigr)$ on $|s^{\beta,\mu}(d)|$ as well as on the derivatives with respect to $\mu$. Hence, in view of Assumption 7, the leading behaviour is yielded by $e^\mu(d)$.

Starting thus from a given potential $\Phi^\mu$ with $Q^\mu=\{d\in D;\ \operatorname{Re}f^{\beta,\mu}(d)=f^\mu_0\}$, one can easily add to $\Phi^\mu$ a suitable “external field” that favours a chosen $d\in Q^\mu$. For example, one can take
$$\Phi^{\mu,\alpha}_A(n)=\Phi^\mu_A(n)+\alpha\,\delta^d_A(n)$$
with $\delta^d_A$ defined by taking $\delta^d_A(n)=0$ for $n_A=d_A$ and $\delta^d_A(n)=1$ otherwise.$^{15}$ Now, since $\frac{\partial e^{\mu,\alpha}(d')}{\partial\alpha}$ is bounded from below by a positive constant for $d'\ne d$ (while $\frac{\partial e^{\mu,\alpha}(d)}{\partial\alpha}=0$), for any $\alpha>0$ the only stable phase is $d$: $\operatorname{Re}f^{\beta,\mu,\alpha}(d)=f^{\beta,\mu,\alpha}_0\equiv\min_{d'\in D}\operatorname{Re}f^{\beta,\mu,\alpha}(d')$, and, at the same time, $\operatorname{Re}f^{\beta,\mu,\alpha}(d')>f^{\beta,\mu,\alpha}_0$ for $d'\ne d$. Thus, $Q^{\mu,\alpha}=\{d\}$ and $\langle\cdot\rangle^d_{\beta,\mu,\alpha}=\langle\cdot\rangle^{\rm per}_{\beta,\mu,\alpha}$. This state is thermodynamically stable: when adding any small perturbation, metastable free energies will change only a little and the one corresponding to the state $d$ will still be the only one attaining the minimum. The fact that in the limit of vanishing perturbation we recover $\langle\cdot\rangle^d_{\beta,\mu,\alpha}$, as well as the fact that
$$\lim_{\alpha\to0+}\langle\cdot\rangle^{\rm per}_{\beta,\mu,\alpha}\equiv\lim_{\alpha\to0+}\langle\cdot\rangle^d_{\beta,\mu,\alpha}=\langle\cdot\rangle^d_{\beta,\mu},$$
follows by inspecting the contour representations of the corresponding expectations and observing that they can be expressed in terms of converging cluster expansions whose terms depend smoothly on $\alpha$ as well as on the additional perturbation.

To prove, finally, the claim b) of Theorem 2.2, it suffices to show that it is valid for $\langle\cdot\rangle^{\rm per}_{\beta,\mu,\alpha}=\langle\cdot\rangle^d_{\beta,\mu,\alpha}$ for every $\alpha>0$. Abbreviating $\langle\cdot\rangle^{\rm per}_{\beta,\mu,\alpha}=\langle\cdot\rangle^{\rm per}$ and $H^{\mu,\alpha}_\Lambda=H_\Lambda$, we first notice that the expectation value of the projector onto the configuration $d$ on $\operatorname{supp}K$, $P^d_{\operatorname{supp}K}:=|d_{\operatorname{supp}K}\rangle\langle d_{\operatorname{supp}K}|$, is close to 1, since its complement $\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}=\langle(1-P^d_{\operatorname{supp}K})\rangle^d$ is related to the presence of a contour intersecting or surrounding $\operatorname{supp}K$ (loops intersecting $\operatorname{supp}K\times\{0\}$ are considered here as part of quantum contours), whose weight is small. More precisely, for any $\delta>0$ we have
$$\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}\le\delta\,|\operatorname{supp}K|,$$

$^{14}$ Recall that, up to now, the state $\langle\cdot\rangle^d_\beta$ is defined only in terms of the contour representation [see (6.9), (6.8), and (4.36)], and the only proven connection with a state of the original quantum model is the equality (6.11).

$^{15}$ Actually, we can restrict $\delta^d_A$ only to a particular type of sets $A$, for example all hypercubes of side $R$.
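whenever $\|T\|,\varepsilon_1,\varepsilon_2$ are small enough and $\beta$ large enough. Furthermore,
$$\langle K\rangle^{\rm per}_\Lambda=\frac1{Z^{\rm per}_\Lambda}\Bigl[\operatorname{Tr}P^d_{\operatorname{supp}K}KP^d_{\operatorname{supp}K}\,e^{-\beta H_\Lambda}+\operatorname{Tr}(1-P^d_{\operatorname{supp}K})KP^d_{\operatorname{supp}K}\,e^{-\beta H_\Lambda}+\operatorname{Tr}K(1-P^d_{\operatorname{supp}K})\,e^{-\beta H_\Lambda}\Bigr]\tag{6.12}$$
and
$$\begin{aligned}
\operatorname{Tr}P^d_{\operatorname{supp}K}KP^d_{\operatorname{supp}K}\,e^{-\beta H_\Lambda}&=\langle d_\Lambda|\,K\,|d_\Lambda\rangle\operatorname{Tr}P^d_{\operatorname{supp}K}\,e^{-\beta H_\Lambda}\\
&=\langle d_\Lambda|\,K\,|d_\Lambda\rangle\Bigl[\operatorname{Tr}e^{-\beta H_\Lambda}-\operatorname{Tr}(1-P^d_{\operatorname{supp}K})\,e^{-\beta H_\Lambda}\Bigr],
\end{aligned}\tag{6.13}$$
so that we have
$$\bigl|\langle K\rangle^{\rm per}_\Lambda-\langle d_\Lambda|\,K\,|d_\Lambda\rangle\bigr|\le\bigl|\langle d_\Lambda|\,K\,|d_\Lambda\rangle\bigr|\,\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda+\bigl|\langle(1-P^d_{\operatorname{supp}K})KP^d_{\operatorname{supp}K}\rangle^{\rm per}_\Lambda\bigr|+\bigl|\langle K(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda\bigr|.\tag{6.14}$$
The mapping $(K,K')\mapsto\langle K^\dagger K'\rangle^{\rm per}_\Lambda$, with any two local operators $K,K'$, is a scalar product; therefore the Schwarz inequality yields
$$\begin{aligned}
\bigl|\langle K\rangle^{\rm per}_\Lambda-\langle d_\Lambda|\,K\,|d_\Lambda\rangle\bigr|&\le\bigl|\langle d_\Lambda|\,K\,|d_\Lambda\rangle\bigr|\,\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda+\bigl(\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda\bigr)^{\frac12}\bigl(\langle P^d_{\operatorname{supp}K}K^\dagger KP^d_{\operatorname{supp}K}\rangle^{\rm per}_\Lambda\bigr)^{\frac12}\\
&\qquad+\bigl(\langle KK^\dagger\rangle^{\rm per}_\Lambda\bigr)^{\frac12}\bigl(\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda\bigr)^{\frac12}\\
&\le\|K\|\Bigl[\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda+2\bigl(\langle(1-P^d_{\operatorname{supp}K})\rangle^{\rm per}_\Lambda\bigr)^{1/2}\Bigr]\\
&\le\|K\|\,|\operatorname{supp}K|\,(\delta+2\delta^{\frac12}).
\end{aligned}\tag{6.15}$$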
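The Schwarz step behind (6.15) can be made explicit for the middle term. The following is only a sketch; the abbreviations $P$ for $P^d_{\operatorname{supp}K}$ and $\langle\cdot\rangle$ for $\langle\cdot\rangle^{\rm per}_\Lambda$ are introduced here for brevity, and the only properties used are $P=P^\dagger=P^2$ and positivity of the state:

```latex
\begin{align*}
\bigl|\langle (1-P)\,K P\rangle\bigr|
  &= \bigl|\langle (1-P)^{\dagger}\,(K P)\rangle\bigr|
   \le \langle (1-P)^{2}\rangle^{1/2}\,\langle (KP)^{\dagger} K P\rangle^{1/2} \\
  &= \langle 1-P\rangle^{1/2}\,\langle P K^{\dagger} K P\rangle^{1/2}
   \le \|K\|\,\langle 1-P\rangle^{1/2},
\end{align*}
% the last step because  P K^\dagger K P \le \|K\|^2 P \le \|K\|^2  as operators.
```

The term $\langle K(1-P)\rangle$ is handled the same way with the roles of the two factors exchanged, which accounts for the factor 2 in front of the square root.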
The proof of the remaining Theorem 2.3 is a standard application of the implicit function theorem. Thus, for example, the point $\bar\mu_0$ of maximal coexistence, $\operatorname{Re}f^{\beta,\bar\mu_0}(d)=\operatorname{Re}f^{\beta,\bar\mu_0}(d')$ for every pair $d,d'\in D$, can be viewed as the solution of the vector equation $f(\bar\mu_0)=0$, with $f(\mu)=\bigl(\operatorname{Re}f^{\beta,\mu}(d_i)-\operatorname{Re}f^{\beta,\mu}(d_r)\bigr)_{i=1}^{r-1}$. Now, $f=e+s$, with $e(\mu)=\bigl(e^\mu(d_i)-e^\mu(d_r)\bigr)_{i=1}^{r-1}$ and $s(\mu)=\bigl(\operatorname{Re}s^{\beta,\mu}(d_i)-\operatorname{Re}s^{\beta,\mu}(d_r)\bigr)_{i=1}^{r-1}$, with $\|s\|$ as well as $\bigl\|\frac{\partial s}{\partial\mu}\bigr\|$ bounded by a small constant once $\|T\|+\sum_{i=1}^{r-1}\bigl\|\frac{\partial T}{\partial\mu_i}\bigr\|$ is sufficiently small and $\beta$ is sufficiently large. The existence of a unique solution $\bar\mu_0\in U$ then follows once we notice the existence of the solution $\mu_0\in U$ of the equation $e(\mu_0)=0$ (equivalent with $e^{\mu_0}(d)=e^{\mu_0}(d')$, $d,d'\in D$) and the fact that the mapping
$$\mathcal T:\ \mu\to A^{-1}\Bigl[\frac{\partial e}{\partial\mu}\Big|_{\mu=\mu_0}(\mu-\mu_0)-f(\mu)\Bigr],$$
with $A^{-1}$ the matrix inverse to $\frac{\partial e}{\partial\mu}$, is a contraction. To this end it is enough just to recall Assumption 7 and the bounds on $s^{\beta,\mu}(d)$, $d\in D$, and its derivatives.
A. General Expression for the Effective Potential

It is actually a cumbersome task to write down a compact formula for the effective potential in the general case. A lot of notation has to be introduced, and one pays for the generality by the fact that the resulting formulæ look rather obscure; nevertheless, the logic behind the following definitions and equations appeared rather naturally along the steps in Sect. 4. We would like to stress that for typical concrete models, it is entirely sufficient to restrict to the effective potential due to at most 4 transitions, and we can content ourselves with Eqs. (2.8)–(2.10).

We assume that a list $\mathcal S$ of sequences of quantum transitions $A$ is given to represent the leading quantum fluctuations. The particular choice of $\mathcal S$ depends on properties of the considered model. Often the obvious choice like “any sequence of transitions not surpassing a given order” is sufficient. In the general case, certain conditions (specified in Assumption 5) involving $\mathcal S$ are to be met.

For any $g_A\in G_A$, the effective potential $\Psi$ is defined to equal
$$\begin{aligned}
\Psi_A(g_A)=-\sum_{n\ge1}\frac1{n!}&\sum_{k_1,\dots,k_n\ge2}\ \sum_{\substack{(A^1_1,\dots,A^1_{k_1},A^2_1,\dots,A^n_{k_n})\in\mathcal S\\ \cup_{i,j}\bar A^i_j=A}}\ \prod_{i=1}^n\Bigl\{\sum_{n^{i,1}_A,\dots,n^{i,k_i-1}_A\notin G_A}I\bigl(A^i_1,\dots,A^i_{k_i};\,n^{i,1}_Ag_{\Lambda\setminus A},\dots,n^{i,k_i-1}_Ag_{\Lambda\setminus A}\bigr)\\
&\times\prod_{j=1}^{k_i}\langle n^{i,j-1}_A|\,T_{A^i_j}\,|n^{i,j}_A\rangle\int_{-\infty<\tau^i_1<\dots<\tau^i_{k_i}<\infty}d\tau^i_1\dots d\tau^i_{k_i}\prod_{j=1}^{k_i-1}e^{-(\tau^i_{j+1}-\tau^i_j)\sum_{x,\,U_0(x)\subset A^i}\bigl[\Phi_x(n^{i,j}_{U_0(x)})-\Phi_x(g_{U_0(x)})\bigr]}\Bigr\}\\
&\times\frac{I\bigl[\min_{i,j}\tau^i_j<0\text{ and }\max_{i,j}\tau^i_j>0\bigr]}{\max_{i,j}\tau^i_j-\min_{i,j}\tau^i_j}\ \varphi^T(B_1,\dots,B_n).
\end{aligned}\tag{A.1}$$
To begin to decode this formula, notice first that the second sum is over all sequences $(A^1_1,\dots,A^1_{k_1},A^2_1,\dots,A^n_{k_n})$ of transitions that are in the list $\mathcal S$ and are just covering the set $A$, $\cup_{i,j}\bar A^i_j=A$. The sum in the braces (for a given $i=1,\dots,n$) is taken over collections of configurations $n^{i,1}_A,\dots,n^{i,k_i-1}_A\notin G_A$ with $n^{i,0}_A\equiv n^{i,k_i}_A\equiv g_A$, while the integral is taken over “times” attributed to transitions, with the energy term in the exponent taken over the set $A^i=\cup_{j=1}^{k_i}\bar A^i_j$. Finally, there are some restrictions on the sums and integrals encoded in functions
$$\frac{I\bigl[\min_{i,j}\tau^i_j<0\text{ and }\max_{i,j}\tau^i_j>0\bigr]}{\max_{i,j}\tau^i_j-\min_{i,j}\tau^i_j},\qquad\varphi^T(B_1,\dots,B_n),\qquad\text{and}\qquad I\bigl(A^i_1,\dots,A^i_{k_i};\,n^{i,1}_Ag_{\Lambda\setminus A},\dots,n^{i,k_i-1}_Ag_{\Lambda\setminus A}\bigr).$$
The easiest is the first one. One just assumes that the interval between the first and the last of concerned “times” contains the origin and the integrand is divided by the length of this interval. The function $\varphi^T(B_1,\dots,B_n)$ in terms of the sets $B_i=A^i\times[\tau^i_1,\tau^i_{k_i}]\subset\mathbb Z^\nu\times[-\infty,\infty]$, $i=1,\dots,n$, is the standard factor from the theory of cluster expansions defined as
$$\varphi^T(B_1,\dots,B_n)=\begin{cases}1&\text{if }n=1,\\[2pt]\displaystyle\sum_G\prod_{e(i,j)\in G}\bigl(-I\bigl[B_i\cup B_j\text{ is connected}\bigr]\bigr)&\text{if }n\ge2,\end{cases}$$
with the sum over all connected graphs $G$ of $n$ vertices. Connectedness of a set $B\subset\mathbb Z^\nu\times[-\infty,\infty]$ is defined by combining connection in continuous direction with connection in slices $\{x\,|\,(x,\tau)\in B\}\subset\mathbb Z^\nu$ through pairs of sites of distance one.

The most difficult to define is the restriction given by the function $I$ that characterizes whether the collection of transitions is connected, in some generalized sense, through the intertwining configurations. A consolation might be that in lowest orders it is always true. Namely, whenever $k\le5$,
$$I\bigl(A_1,\dots,A_k;\,n^1_Ag_{\Lambda\setminus A},\dots,n^{k-1}_Ag_{\Lambda\setminus A}\bigr)=\begin{cases}1&\text{if }\cup_j\bar A_j\text{ is connected and }\prod_{j=1}^k\langle n^{j-1}_A|\,T_{A_j}\,|n^j_A\rangle\ne0,\\0&\text{if }\cup_j\bar A_j\text{ is not connected.}\end{cases}\tag{A.2}$$
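For small $n$ the cluster factor $\varphi^T$ defined above can be evaluated by brute force, which may help in checking low-order terms by hand. The sketch below is illustrative only: the function names and the input format (a symmetric boolean matrix `touch`, where `touch[i][j]` is assumed to say whether $B_i\cup B_j$ is connected) are hypothetical and not part of the paper.

```python
from itertools import combinations


def is_connected_graph(n, edges):
    """Return True if the graph on vertices 0..n-1 with the given edges is connected."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:  # adj[v] - seen is a fresh set, safe to iterate
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def phi_T(touch):
    """Sum over connected graphs G of prod_{(i,j) in G} (-I[B_i u B_j connected]).

    `touch[i][j]` is an assumed input: True when B_i and B_j together form a
    connected set. The enumeration is exponential in n, so this is only meant
    for the few-loop cases discussed in the text.
    """
    n = len(touch)
    if n == 1:
        return 1
    all_edges = list(combinations(range(n), 2))
    total = 0
    # A connected graph on n vertices needs at least n - 1 edges.
    for r in range(n - 1, len(all_edges) + 1):
        for edges in combinations(all_edges, r):
            if not is_connected_graph(n, edges):
                continue
            term = 1
            for i, j in edges:
                term *= -1 if touch[i][j] else 0
            total += term
    return total
```

For three mutually touching sets the three two-edge graphs contribute $+1$ each and the triangle contributes $-1$, so the factor equals 2; for a chain in which only neighbouring sets touch, only the path graph survives.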
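To define it in a general case, consider $A_1,\dots,A_k\subset\mathbb Z^\nu$ and $n^1,\dots,n^{k-1}\in\Omega^{\mathbb Z^\nu}$. Taking $\bar A=\cup_{x\in A}U(x)$ and $E(n)=\{x\in\Lambda:\ n_{U(x)}\ne g_{U(x)}\text{ for any }g\in G\}$, we consider the set $\hat B^{(0)}\subset\mathbb Z^{\nu+1}$,
$$\hat B^{(0)}=\bigcup_{j=1}^k\bigl[\bar A_j\times\{2j-2\}\bigr]\ \cup\ \bigcup_{j=1}^{k-1}\bigl[E(n^j)\times\{2j-1\}\bigr].$$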
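The function $I$ of (A.3) is obtained from this layered set by an iterative closure: split into connected components, replace each component by its bounding box, and repeat until nothing changes. A minimal sketch of that iteration follows. It assumes, as a simplification not stated in the source, that connectedness in $\mathbb Z^{\nu+1}$ means nearest-neighbour adjacency, and it takes the layered set directly as a set of integer tuples; all function names are hypothetical.

```python
from itertools import product


def components(cells):
    """Split a set of lattice points (integer tuples) into nearest-neighbour components."""
    remaining = set(cells)
    comps = []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            p = stack.pop()
            for axis in range(len(p)):
                for step in (-1, 1):
                    q = p[:axis] + (p[axis] + step,) + p[axis + 1:]
                    if q in remaining:
                        remaining.remove(q)
                        comp.add(q)
                        stack.append(q)
        comps.append(comp)
    return comps


def bounding_box(comp):
    """Smallest axis-parallel rectangle of lattice points containing the component."""
    dims = len(next(iter(comp)))
    ranges = [
        range(min(p[a] for p in comp), max(p[a] for p in comp) + 1)
        for a in range(dims)
    ]
    return set(product(*ranges))


def box_closure(cells):
    """Repeat 'components -> bounding boxes -> union' until the set is stable."""
    current = set(cells)
    while True:
        boxes = set()
        for comp in components(current):
            boxes |= bounding_box(comp)
        if boxes == current:
            return current
        current = boxes


def connectivity_indicator(cells):
    """Return 1 if the stabilised set is connected, 0 otherwise."""
    return 1 if len(components(box_closure(cells))) == 1 else 0
```

The loop terminates because each pass can only enlarge the set, and the set never leaves the bounding box of the initial points. In the example used to test this sketch, an L-shaped component and an isolated point start out disconnected but merge once the L is replaced by its 3 × 3 box.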
Think of layers, one on top of another: configurations on odd levels interspersed with transitions on even levels. The set $\hat B^{(0)}$ decomposes into connected components, $\hat B^{(0)}=\cup_{\ell\ge1}\hat B^{(0)}_\ell$. To any $\hat B^{(0)}_\ell$, define the box $\tilde B^{(0)}_\ell\subset\mathbb Z^{\nu+1}$ as the smallest rectangle containing $\hat B^{(0)}_\ell$. Then let $\hat B^{(1)}=\cup_{\ell\ge1}\tilde B^{(0)}_\ell$, decompose into connected components $\hat B^{(1)}=\cup_{\ell\ge1}\hat B^{(1)}_\ell$, and repeat the procedure until no change occurs any more, i.e. until $\hat B^{(m)}=\cup_{\ell\ge1}\tilde B^{(m)}_\ell$. The function $I$ characterizes whether this final set, the result of the above construction, is connected or not,
$$I(A_1,\dots,A_k;\,n^1,\dots,n^{k-1})=\begin{cases}1&\text{if }\hat B^{(m)}\text{ is connected,}\\0&\text{otherwise.}\end{cases}\tag{A.3}$$
Equations (2.8)–(2.10) are obtained from the general expression (A.1) by considering the cases with one or two loops (i.e. $n=1,2$), each loop having no more than 4 transitions ($k_i\le4$).

Acknowledgements. We are thankful to Christian Gruber for discussions. R. K. acknowledges the Institut de Physique Théorique at EPFL, and D. U. the Center for Theoretical Study at Charles University for hospitality.
References

[BI] Borgs, C. and Imbrie, J.: A unified approach to phase diagrams in field theory and statistical mechanics. Commun. Math. Phys. 123, 305–328 (1989)
[BKU1] Borgs, C., Kotecký, R. and Ueltschi, D.: Low temperature phase diagrams for quantum perturbations of classical spin systems. Commun. Math. Phys. 181, 409–446 (1996)
[BKU2] Borgs, C., Kotecký, R. and Ueltschi, D.: Incompressible phase in lattice systems of interacting bosons. Unpublished, available at http://dpwww.epfl.ch/instituts/ipt/publications.html (1997)
[BS] Bricmont, J. and Slawny, J.: Phase transitions in systems with a finite number of dominant ground states. J. Stat. Phys. 54, 89–161 (1989)
[Bry] Brydges, D.C.: A short course on cluster expansions. Proceeding of Les Houches, Session XLIII, 129–183 (1986)
[DFF1] Datta, N., Fernández, R. and Fröhlich, J.: Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. J. Stat. Phys. 84, 455–534 (1996)
[DFF2] Datta, N., Fernández, R. and Fröhlich, J.: Effective Hamiltonians and phase diagrams for tight-binding models. Preprint, math-ph/9809007 (1998)
[DFFR] Datta, N., Fernández, R., Fröhlich, J. and Rey-Bellet, L.: Low-temperature phase diagrams of quantum lattice systems. II. Convergent perturbation expansions and stability in systems with infinite degeneracy. Helv. Phys. Acta 69, 752–820 (1996)
[DMN] Datta, N., Messager, A. and Nachtergaele, B.: Rigidity of interfaces in the Falicov–Kimball model. Preprint, mp-arc 98-267 (1998)
[Dob] Dobrushin, R.L.: Existence of a phase transition in the two-dimensional and three-dimensional Ising models. Sov. Phys. Doklady 10, 111–113 (1965)
[DLS] Dyson, F.J., Lieb, E.H. and Simon, B.: Phase transitions in quantum spin systems with isotropic and nonisotropic interactions. J. Stat. Phys. 18, 335–383 (1978)
[EFS] van Enter, A.C.D., Fernández, R. and Sokal, A.D.: Regularity properties and pathologies of position-space renormalization-group transformations: scope and limitations of Gibbsian theory. J. Stat. Phys. 72, 879–1167 (1993)
[FWGF] Fisher, M.P.A., Weichman, P.B., Grinstein, G. and Fisher, D.S.: Boson localization and the superfluid-insulator transition. Phys. Rev. B 40, 546–570 (1989)
[Geo] Georgii, H.-O.: Gibbs Measures and Phase Transitions. De Gruyter studies in Mathematics, Berlin–New York: De Gruyter, 1988
[Gin] Ginibre, J.: Existence of phase transitions for quantum lattice systems. Commun. Math. Phys. 14, 205–234 (1969)
[Gri] Griffiths, R.B.: Peierls’ proof of spontaneous magnetization of a two-dimensional Ising ferromagnet. Phys. Rev. A 136, 437–439 (1964)
[GM] Gruber, Ch. and Macris, N.: The Falicov–Kimball model: a review of exact results and extensions. Helv. Phys. Acta 69, 850–907 (1996)
[KL] Kennedy, T. and Lieb, E.H.: An itinerant electron model with crystalline or magnetic long range order. Physica A 138, 320–358 (1986)
[LM] Lebowitz, J.L. and Macris, N.: Low-temperature phases of itinerant fermions interacting with classical phonons: the static Holstein model. J. Stat. Phys. 76, 91–123 (1994)
[Lieb] Lieb, E.H.: The Hubbard model: some rigorous results and open problems. In: XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge, MA: Internat. Press, 1995, pp. 392–412
[MM] Messager, A. and Miracle-Solé, S.: Low temperature states in the Falicov–Kimball model. Rev. Math. Phys. 8, 271–299 (1996)
[Pei] Peierls, R.: On the Ising model of ferromagnetism. Proceedings of the Cambridge Philosophical Society 32, 477–481 (1936)
[Pfi] Pfister, C.-E.: Large deviations and phase separation in the two-dimensional Ising model. Helv. Phys. Acta 64, 953–1054 (1991)
[PS] Pirogov, S.A. and Sinai, Ya.G.: Phase diagrams of classical lattice systems. Theoretical and Mathematical Physics 25, 1185–1192 (1975); 26, 39–49 (1976)
[Sin] Sinai, Ya.G.: Theory of Phase Transitions: Rigorous Results. Oxford–New York–etc.: Pergamon Press, 1982
[Zah] Zahradník, M.: An alternate version of Pirogov–Sinai theory. Commun. Math. Phys. 93, 559–581 (1984)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 337 – 366 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Global Foliations of Matter Spacetimes with Gowdy Symmetry
Håkan Andréasson
Department of Mathematics, Chalmers University of Technology, S-412 96 Göteborg, Sweden. E-mail: [email protected]
Received: 8 December 1998 / Accepted: 20 March 1999

Abstract: A global existence theorem, with respect to a geometrically defined time, is shown for Gowdy symmetric globally hyperbolic solutions of the Einstein–Vlasov system for arbitrary (in size) initial data. The spacetimes being studied contain both matter and gravitational waves.

1. Introduction

An important problem in classical general relativity is the question of global existence (in an appropriate sense) for globally hyperbolic solutions of the vacuum-Einstein and matter-Einstein equations. The main motivation is its relationship to the cosmic censorship conjectures. Strong cosmic censorship has, e.g. by Eardley and Moncrief [EM], been formulated as a question on global existence and asymptotic behaviour of solutions to the Einstein equations, suggesting a definite method of analytical attack. To begin studying the long-time behaviour of solutions to a complicated partial differential equation system one might focus on families of solutions with some prescribed symmetry. With the exception of the monumental work on global nonlinear stability of the Minkowski space by Christodoulou and Klainerman [CK], the practice in general relativity has for long been to study “global existence” problems under symmetric assumptions. One family of (cosmological) solutions which have been studied extensively is the Gowdy spacetimes [G]. These spacetimes are vacuum but admit gravitational waves (in contrast to e.g. spherically symmetric spacetimes). Global existence has been shown for the Gowdy spacetimes [M], strong cosmic censorship is settled in the case of polarized Gowdy spacetimes [CIM], and much is known about the subset of the Gowdy spacetimes which admit an extension across a Cauchy horizon [CI].

In this paper we show global existence, with respect to a geometrically defined time, for matter spacetimes (Einstein–Vlasov) with Gowdy symmetry and thereby we extend Moncrief’s result [M] in the vacuum case. This is the first result which provides a global
foliation of a spacetime containing both matter and gravitational waves. Moreover, for matter spacetimes there are only a few global results available altogether. Let us briefly mention some of these results. First, by matter spacetimes we have in mind spacetimes where the matter consists of massive particles. One can also consider spacetimes which only contain radiation and important results have been obtained in this direction, e.g. Christodoulou has obtained strong results in the spherically symmetric case with a scalar field as matter model (see e.g. [Cu1, Cu2] and the references therein). For spacetimes containing massive particles the main global results can be summarized as follows. Under a smallness condition on the initial data, Rein and Rendall [RR] have shown that solutions of the spherically symmetric Einstein–Vlasov system are geodesically complete. Some information on the large data problem was then obtained in [RRS]. Christodoulou has in a series of papers (see [Cu3] and the references therein) studied the Einstein–Euler equation in the spherically symmetric case for a special equation of state, adapted to understand the dynamics of a supernova explosion. He can globally control the solutions to the Cauchy problem and he finds solutions whose behaviour resembles qualitatively that of a supernova explosion. Finally, the most relevant results in the context of this paper are those on cosmological solutions by Rendall [Rl1-2] and Rein [Rn]. These are discussed in some detail in relation to our result below.

Our method of proof is inspired by a recent global foliation result for vacuum spacetimes admitting a $T^2$ isometry group, acting on $T^3$ spacelike surfaces [BCIM]. These spacetimes are more general than the Gowdy spacetimes: both families admit two commuting Killing vectors but in the Gowdy case there is the additional condition that the twists are zero. The twists are defined by
$$c_1=\epsilon_{\mu\nu\rho\delta}X^\mu Y^\nu\nabla^\rho X^\delta,\qquad c_2=\epsilon_{\mu\nu\rho\delta}X^\mu Y^\nu\nabla^\rho Y^\delta,\tag{1}$$
where X, Y are Killing vectors associated with the isometry group. It follows from the Einstein equations that in vacuum these quantities are constant throughout spacetime [G]. One difficulty in studying long-time existence problems in general relativity is the lack of having a fixed time measure. A solution which remains regular for an infinite range of one time scale may become singular within a finite range of another. In [BCIM] this problem is treated by choosing a coordinate system in which the time is fixed to the geometry of spacetime. In fact, the time is defined to be the area of the two dimensional spacelike orbits of the T 2 isometry group. These coordinates are called areal coordinates. The main theorem in [BCIM] shows that the entire maximal globally hyperbolic development of the initial hypersurface can be foliated by areal coordinates. These coordinates are however only used in a direct way in the future direction. To show that the past of the initial hypersurface is covered by areal coordinates the authors use conformal coordinates (the time is not fixed to the geometry of spacetime) in which the equations take a more suitable form for an analytical treatment. By a long chain of geometrical arguments it is then shown that the development in conformal coordinates admits a foliation by areal coordinates, and that it covers the past maximal globally hyperbolic development of the initial hypersurface. We prove that T 3 × R-matter spacetimes with Gowdy symmetry admit global foliations by areal coordinates. The matter content is described by the Vlasov equation. This is a kinetic equation and gives a statistical description of a collection of collisionless “particles”. In the cosmological case the particles are galaxies or clusters of galaxies whereas in stellar dynamics they are stars. The Vlasov equation has been shown to be
Global Foliations of Matter Spacetimes with Gowdy Symmetry
suitable in general relativity for the study of the long-time behaviour of matter in gravitational fields. In particular it rules out the formation of shell-crossing singularities. For a discussion on the choice of matter model see [Rl4] and [Rl5]. To prove the existence of a global foliation we also work directly in areal coordinates in the expanding (future) direction, and in the contracting (past) direction, we first show a global existence theorem in conformal coordinates and then we invoke the geometrical arguments in [BCIM] to complete the proof. We point out that our result depends strongly on the exact structure of the Vlasov equation and does not hold for general matter models which are only restricted by certain inequalities on the components of the energy-momentum tensor. A related and interesting result has recently been shown by Rendall [Rl1] (see also [Rl2]). He considers $T^2$ symmetric spacetimes for the Einstein–Vlasov and the Einstein–wave map equations and he shows that if such a spacetime admits at least one compact constant mean curvature (CMC) hypersurface then the past of that surface can be covered by a foliation of compact CMC hypersurfaces. The CMC and the areal coordinate foliations are both geometrically based time foliations which provide frameworks for studying strong cosmic censorship and other global issues. The main motivation for developing techniques to obtain CMC foliations is that the definition of a CMC hypersurface does not depend on any symmetry assumptions and it is hence possible that CMC foliations will exist for rather general spacetimes. The areal coordinate foliation used here is less general since it is adapted to the symmetry, but leads in the Gowdy case (note that the results in [Rl1] apply to the more general $T^2$ symmetric spacetimes, but see the remark below) to stronger results.
Namely, the arguments in [Rl1] do not show that the entire future of the initial hypersurface can be covered, and the existence of the CMC foliation is only guaranteed under the hypothesis that spacetime admits at least one such hypersurface. We also mention a result in this direction due to Rein [Rn]. He has studied cosmological Einstein–Vlasov spacetimes with stronger symmetry restrictions than in the Gowdy case (the spacetimes admit three Killing vectors). In these spacetimes gravitational waves cannot exist. For plane symmetry (the relevant case for us) he has shown existence back to the initial singularity for small initial data, and under the assumption that one of the field components is bounded, he obtains global existence for large data in the future direction. An interesting result in his work is that the initial singularity is shown to be a curvature singularity as well as a “crushing” singularity (see [ES]). Remark. We have not tried to consider the more general T 2 symmetric spacetimes, i.e. spacetimes with nonvanishing twists. However, we believe that a generalization to this case would be rather straightforward as soon as the Einstein–Vlasov system has been derived. During the work on this paper we noticed one potential problem in generalizing our proof in the future direction. This is discussed and solved in the remark following Eq. (78). The outline of the paper follows largely that of [BCIM]. In Sect. 2 we describe Gowdy symmetry and give the equations for the Einstein–Vlasov system in areal and conformal coordinates. The main theorem is formulated in Sect. 3 where we also describe the geometrical arguments in [BCIM] needed to complete the proof in the contracting direction. Section 4 is devoted to the analysis in the contracting direction. Estimates for the field components and the matter terms are derived in conformal coordinates, by using e.g. light-cone arguments and methods originally developed for the Vlasov–Maxwell equation. 
The analysis in the expanding direction is carried out in areal coordinates in
Sect. 5 where a number of estimates are derived. Light-cone arguments and an “energy” monotonicity lemma are important tools for obtaining bounds on the field components and their derivatives. The control of the matter terms and their derivatives relies on three lemmas. The first one is the “energy” monotonicity lemma just mentioned. Then, in the second lemma a careful analysis of the characteristic system associated with the Vlasov equation is carried out, which leads to a bound on the support of the momenta. The third lemma provides bounds on the derivatives of the matter terms and relies indirectly on the geodesic deviation equation. This equation relates the curvature tensor and the acceleration of nearby geodesics and has proved useful in previous studies of the Einstein–Vlasov system (see [RR, Rn] and [Rl3]).
2. The Einstein–Vlasov System with Gowdy Symmetry

Let us begin with a brief review of Gowdy symmetry. Consider a spacetime that can be foliated by a family of compact, connected, and orientable hypersurfaces. If the maximal isometry group of the spacetime is two dimensional, and if it acts invariantly and effectively on the foliation, then the isometry group must be $U(1) \times U(1)$. Moreover, the foliation surfaces must be homeomorphic to $T^3$, $S^1 \times S^2$, $S^3$ or $L(p, q)$ (the Lens space), and the action is unique up to equivalence. The Killing vector fields $X$, $Y$ associated with the isometry group have to commute in such a spacetime. We say that spacetimes satisfying the symmetry conditions above and in which both the twists $c_1$, $c_2$ (see (1)) vanish have Gowdy symmetry. We remark that the term “Gowdy spacetime” is reserved for the vacuum case. For more background on Gowdy symmetry we refer to [G, Cl]. As mentioned above there are several choices of spacetime manifolds compatible with Gowdy symmetry. In this paper we restrict our attention to the $T^3$-case. It is an interesting fact that in vacuum this is the only possibility if the condition of vanishing twists is relaxed.

The dynamics of the matter is governed by the Vlasov equation. This is a kinetic equation and models a collisionless system of particles, i.e. the particles follow the geodesics of spacetime. For a nice introduction to the Einstein–Vlasov system see [Rl3]. We also mention the survey of Ehlers [E] for more information on kinetic theory in general relativity, and the book by Binney and Tremaine [BT] for some applications of kinetic theory in stellar dynamics.

We will use two choices of coordinates, areal coordinates and conformal coordinates. It has been shown in [Cl] that, at least locally, any globally hyperbolic (non-flat) Gowdy spacetime on $T^3 \times \mathbb{R}$ admits each of these coordinates. Both sets of coordinates are chosen so that
\[ X = a\frac{\partial}{\partial x} + b\frac{\partial}{\partial y} \quad\text{and}\quad Y = c\frac{\partial}{\partial x} + d\frac{\partial}{\partial y} \]
are Killing vector fields ($a$, $b$, $c$ and $d$ are constants with $ad - bc \neq 0$), and in both cases $\theta \in S^1$ denotes the remaining spatial coordinate. Below the form of the metric and the Einstein–Vlasov system is given in areal and conformal coordinates. The functions $R, \alpha, U, A, \eta$ all depend on $t$ and $\theta$ and the function $f$ depends on $t$, $\theta$ and $v \in \mathbb{R}^3$.
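Areal Coordinates. Metric:
\[ g = -e^{2(\eta-U)}\alpha\,dt^2 + e^{2(\eta-U)}\,d\theta^2 + e^{2U}(dx + A\,dy)^2 + e^{-2U}t^2\,dy^2. \tag{2} \]
The Einstein-matter constraint equations:
\[ \frac{\eta_t}{t} = U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4t^2}(A_t^2 + \alpha A_\theta^2) + e^{2(\eta-U)}\alpha\rho, \tag{3} \]
\[ \frac{\eta_\theta}{t} = 2U_tU_\theta + \frac{e^{4U}}{2t^2}A_tA_\theta - \frac{\alpha_\theta}{2t\alpha} - e^{2(\eta-U)}\sqrt{\alpha}\,J, \tag{4} \]
\[ \alpha_t = 2t\alpha^2e^{2(\eta-U)}(P_1 - \rho). \tag{5} \]
The Einstein-matter evolution equations:
\[ \begin{aligned} \eta_{tt} - \alpha\eta_{\theta\theta} ={}& \frac{\eta_\theta\alpha_\theta}{2} + \frac{\eta_t\alpha_t}{2\alpha} - \frac{\alpha_\theta^2}{4\alpha} + \frac{\alpha_{\theta\theta}}{2} - U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4t^2}(A_t^2 - \alpha A_\theta^2) \\ & - \alpha e^{2(\eta-U)}P_3 - \frac{A^2}{t^2}\alpha e^{2(\eta+U)}P_2 - \frac{2A}{t}\alpha e^{2\eta}S_{23}, \end{aligned} \tag{6} \]
\[ \begin{aligned} U_{tt} - \alpha U_{\theta\theta} ={}& -\frac{U_t}{t} + \frac{U_\theta\alpha_\theta}{2} + \frac{U_t\alpha_t}{2\alpha} + \frac{e^{4U}}{2t^2}(A_t^2 - \alpha A_\theta^2) \\ & + \frac{1}{2}e^{2(\eta-U)}\alpha(\rho - P_1 + P_2 - P_3), \end{aligned} \tag{7} \]
\[ \begin{aligned} A_{tt} - \alpha A_{\theta\theta} ={}& \frac{A_t}{t} + \frac{\alpha_\theta A_\theta}{2} + \frac{\alpha_tA_t}{2\alpha} - 4A_tU_t + 4\alpha A_\theta U_\theta \\ & + 2t\alpha e^{2(\eta-2U)}S_{23}. \end{aligned} \tag{8} \]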
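The Vlasov equation:
\[ \begin{aligned} &\frac{\partial f}{\partial t} + \frac{\sqrt{\alpha}\,v^1}{v^0}\frac{\partial f}{\partial\theta} - \Big[\Big(\eta_\theta - U_\theta + \frac{\alpha_\theta}{2\alpha}\Big)\sqrt{\alpha}\,v^0 + (\eta_t - U_t)v^1 - \frac{\sqrt{\alpha}\,e^{2U}A_\theta}{t}\frac{v^2v^3}{v^0} \\ &\qquad + \frac{\sqrt{\alpha}\,U_\theta}{v^0}\big((v^3)^2 - (v^2)^2\big)\Big]\frac{\partial f}{\partial v^1} - \Big[U_tv^2 + \sqrt{\alpha}\,U_\theta\frac{v^1v^2}{v^0}\Big]\frac{\partial f}{\partial v^2} \\ &\qquad - \Big[\Big(\frac{1}{t} - U_t\Big)v^3 - \sqrt{\alpha}\,U_\theta\frac{v^1v^3}{v^0} + \frac{e^{2U}v^2}{t}\Big(A_t + \sqrt{\alpha}\,A_\theta\frac{v^1}{v^0}\Big)\Big]\frac{\partial f}{\partial v^3} = 0. \end{aligned} \tag{9} \]
The matter quantities:
\[ \rho(t,\theta) = \int_{\mathbb{R}^3} v^0 f(t,\theta,v)\,dv, \tag{10} \]
\[ P_k(t,\theta) = \int_{\mathbb{R}^3} \frac{(v^k)^2}{v^0} f(t,\theta,v)\,dv, \quad k = 1, 2, 3, \tag{11} \]
\[ J(t,\theta) = \int_{\mathbb{R}^3} v^1 f(t,\theta,v)\,dv, \tag{12} \]
\[ S_{23}(t,\theta) = \int_{\mathbb{R}^3} \frac{v^2v^3}{v^0} f(t,\theta,v)\,dv. \tag{13} \]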
Here the variables $v$ are related to the canonical momenta $p$ through
\[ v^0 = \sqrt{\alpha}\,e^{\eta-U}p^0, \quad v^1 = e^{\eta-U}p^1, \quad v^2 = e^Up^2 + Ae^Up^3, \quad v^3 = te^{-U}p^3, \tag{14} \]
and
\[ p^\mu := \frac{dx^\mu}{d\tau}, \qquad x^\mu = (t, \theta, x, y), \]
where $\tau$ is proper time. It is assumed that all “particles” have the same mass (normalized to one) and follow the geodesics of spacetime (collisionless particle system). Hence
\[ g_{\mu\nu}p^\mu p^\nu = -1, \]
so that
\[ v^0 = \sqrt{1 + (v^1)^2 + (v^2)^2 + (v^3)^2}. \tag{15} \]
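The step from the unit-mass condition to (15) can be checked numerically. The following Python sketch is an illustration only and is not part of the paper: the function names and the sample field values are assumptions chosen for the check. It inverts (14) to recover $p^\mu$ from $v$, with $v^0$ given by (15), and evaluates $g_{\mu\nu}p^\mu p^\nu$ for the areal metric (2).

```python
import math

def canonical_momenta(v, t, alpha, eta, U, A):
    """Invert (14): recover p^mu = (p^0, p^1, p^2, p^3) from v = (v^1, v^2, v^3)."""
    v1, v2, v3 = v
    v0 = math.sqrt(1.0 + v1 * v1 + v2 * v2 + v3 * v3)  # relation (15)
    p0 = v0 / (math.sqrt(alpha) * math.exp(eta - U))
    p1 = v1 / math.exp(eta - U)
    p3 = v3 * math.exp(U) / t
    p2 = v2 * math.exp(-U) - A * p3  # from v^2 = e^U p^2 + A e^U p^3
    return p0, p1, p2, p3

def metric_norm(p, t, alpha, eta, U, A):
    """Evaluate g_{mu nu} p^mu p^nu for the areal metric (2)."""
    p0, p1, p2, p3 = p
    conf = math.exp(2.0 * (eta - U))
    return (-conf * alpha * p0 ** 2
            + conf * p1 ** 2
            + math.exp(2.0 * U) * (p2 + A * p3) ** 2
            + math.exp(-2.0 * U) * t ** 2 * p3 ** 2)

# Hypothetical sample values for the fields at one point (t, theta).
fields = dict(t=1.7, alpha=0.6, eta=0.3, U=-0.2, A=0.9)
p = canonical_momenta((0.4, -1.1, 2.3), **fields)
print(metric_norm(p, **fields))
```

In these variables the contraction reduces to $-(v^0)^2 + (v^1)^2 + (v^2)^2 + (v^3)^2$, which is why the printed value should agree with $-1$ up to rounding.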
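In conformal coordinates the function $\alpha$ is removed, having the consequence that the orbital area function $R$ now depends on both $t$ and $\theta$ (in areal coordinates $R = t$). In these coordinates the metric and the Einstein–Vlasov system take the following form.

Conformal coordinates. Metric:
\[ g = e^{2(\eta-U)}(-dt^2 + d\theta^2) + e^{2U}(dx + A\,dy)^2 + e^{-2U}R^2\,dy^2. \tag{16} \]
The Einstein-matter constraint equations:
\[ U_t^2 + U_\theta^2 + \frac{e^{4U}}{4R^2}(A_t^2 + A_\theta^2) + \frac{R_{\theta\theta}}{R} - \frac{\eta_tR_t}{R} - \frac{\eta_\theta R_\theta}{R} = -e^{2(\eta-U)}\rho, \tag{17} \]
\[ 2U_tU_\theta + \frac{e^{4U}}{2R^2}A_tA_\theta + \frac{R_{t\theta}}{R} - \frac{\eta_tR_\theta}{R} - \frac{\eta_\theta R_t}{R} = e^{2(\eta-U)}J. \tag{18} \]
The Einstein-matter evolution equations:
\[ \begin{aligned} U_{tt} - U_{\theta\theta} ={}& \frac{U_\theta R_\theta}{R} - \frac{U_tR_t}{R} + \frac{e^{4U}}{2R^2}(A_t^2 - A_\theta^2) \\ & + \frac{1}{2}e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3), \end{aligned} \tag{19} \]
\[ A_{tt} - A_{\theta\theta} = \frac{R_tA_t}{R} - \frac{R_\theta A_\theta}{R} + 4(A_\theta U_\theta - A_tU_t) + 2Re^{2(\eta-2U)}S_{23}, \tag{20} \]
\[ R_{tt} - R_{\theta\theta} = Re^{2(\eta-U)}(\rho - P_1), \tag{21} \]
\[ \begin{aligned} \eta_{tt} - \eta_{\theta\theta} ={}& U_\theta^2 - U_t^2 + \frac{e^{4U}}{4R^2}(A_t^2 - A_\theta^2) - e^{2(\eta-U)}P_3 \\ & - \frac{A^2}{R^2}e^{2(\eta+U)}P_2 - \frac{2A}{R}e^{2\eta}S_{23}. \end{aligned} \tag{22} \]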
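The Vlasov equation:
\[ \begin{aligned} &\frac{\partial f}{\partial t} + \frac{v^1}{v^0}\frac{\partial f}{\partial\theta} - \Big[(\eta_\theta - U_\theta)v^0 + (\eta_t - U_t)v^1 - U_\theta\frac{(v^2)^2}{v^0} + \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{(v^3)^2}{v^0} - \frac{A_\theta}{R}e^{2U}\frac{v^2v^3}{v^0}\Big]\frac{\partial f}{\partial v^1} \\ &\qquad - \Big[U_tv^2 + U_\theta\frac{v^1v^2}{v^0}\Big]\frac{\partial f}{\partial v^2} \\ &\qquad - \Big[\Big(\frac{R_t}{R} - U_t\Big)v^3 - \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{v^1v^3}{v^0} + \frac{e^{2U}v^2}{R}\Big(A_t + A_\theta\frac{v^1}{v^0}\Big)\Big]\frac{\partial f}{\partial v^3} = 0. \end{aligned} \tag{23} \]
The matter quantities $\rho$, $P_k$, $J$ and $S_{23}$ are given by (10)–(13), where in this case
\[ v^0 = e^{\eta-U}p^0, \quad v^1 = e^{\eta-U}p^1, \quad v^2 = e^Up^2 + Ae^Up^3, \quad v^3 = Re^{-U}p^3, \tag{24} \]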
and (15) holds here as well.

Remark. It might be instructive to relate the metric in (16) with that used by Rein [Rn] mentioned in the introduction. By letting $A = 0$ and $U = \frac{1}{2}\ln R$ in (16) we obtain a metric which admits three Killing vectors and which depends on two field components. The distribution function $f$ depends in this case on $p^1$ and $(p^2)^2 + (p^3)^2$.

3. The Main Theorem

Let $(h, k, f_0)$ be a Gowdy symmetric initial data set on $T^3$. By this we mean that $h$ is a Riemannian metric on $T^3$, invariant under an effective $T^2$ action; $k$ is a symmetric 2-tensor on $T^3$, also invariant under the same $T^2$ group action; the twists $c_1$ and $c_2$ are both zero; the initial distribution function $f_0$ is defined on $T^3$ and is invariant under the same $T^2$ group action and possesses the following additional symmetry, which reads, in coordinates that cast the metric in the forms (2) or (16), $f_0(\theta, p^1, p^2, p^3) = f_0(\theta, p^1, -p^2, p^3) = f_0(\theta, p^1, p^2, -p^3)$ (this assumption is necessary for the Einstein–Vlasov system to be compatible with the form of the metric); and that $(h, k, f_0)$ satisfy the Einstein–Vlasov constraint equations. We also assume that $(h, k)$ are $C^\infty$ on $T^3$ and that $f_0$ is a nonnegative, not identically zero, $C^\infty$ function of compact support on the tangent bundle $T(T^3)$ of $T^3$.

Remark. The smoothness assumption on the initial data is not a necessary condition. It is included so that we can refer directly to the classical local existence theorems. However, the estimates in this paper provide the information needed for proving a local existence theorem for $C^2 \times C^1$ data $(h, k)$ and $C^1$ data $f_0$. Moreover, the assumption $f_0 \neq 0$ is here included for a technical reason and we refer to [M] or Sect. 5 in this paper for the vacuum case. Indeed, it is in this case possible to work directly in areal coordinates and the estimates derived in Sect. 5 are sufficient. See also the remark following Lemma 1 in that section.
The results by Choquet-Bruhat [CB] and Choquet-Bruhat and Geroch [CBG] show that there exists a unique maximal globally hyperbolic development $(\Sigma \times \mathbb{R}, g, f)$ of a given initial data set on a three-dimensional manifold $\Sigma$ for the Einstein–Vlasov equation. Let us briefly comment upon the initial conditions imposed. The relation between a given initial data set $(h, k)$ on a three-dimensional manifold $\Sigma$ and the metric $g$ on the spacetime manifold is that there exists an imbedding $\psi$ of $\Sigma$ into the spacetime such
that the induced metric and second fundamental form of $\psi(\Sigma)$ coincide with the result of transporting $(h, k)$ with $\psi$. For the relation of the distribution functions $f$ and $f_0$ we have to note that $f$ is defined on the mass shell (for $m = 1$ it is the set of all future pointing unit timelike vectors). The initial condition imposed is that the restriction of $f$ to the part of the mass shell over $\psi(\Sigma)$ should be equal to $f_0 \circ (\psi^{-1}, d(\psi)^{-1}) \circ \phi$, where $\phi$ sends each point of the mass shell over $\psi(\Sigma)$ to its orthogonal projection onto the tangent space to $\psi(\Sigma)$. Our main theorem now reads:

Theorem 1. Let $(h, k, f_0)$ be a smooth Gowdy symmetric initial data set on $T^3$. For some non-negative constant $c$, there exists a globally hyperbolic spacetime $(M, g, f)$ such that
(i) $M = (c, \infty) \times T^3$.
(ii) $g$ and $f$ satisfy the Einstein–Vlasov equation.
(iii) $M$ is covered by areal coordinates $(t, \theta, x, y)$, with $t \in (c, \infty)$, so the metric globally takes the form (2).
(iv) $(M, g, f)$ is isometrically diffeomorphic to the maximal globally hyperbolic development of the initial data $(h, k, f_0)$.

As described in the introduction we prove global existence in conformal and areal coordinates for the past and future directions respectively. Then, in order to prove Theorem 1 in the past direction, we need to invoke substantial geometrical arguments from [BCIM]. For the future direction only a simple geometrical argument is needed for completing the proof. It should be pointed out that even if the geometrical results in [BCIM] concern the vacuum case they are true also for matter spacetimes as long as the Einstein-matter equations form a well-posed hyperbolic system, which of course is the case here. In Sect. 4 we show that the past maximal development of $(h, k, f_0)$ in terms of conformal coordinates, which we denote by $D^-_{\mathrm{conf}}(h, k, f_0)$, has $t \to -\infty$ as long as $R$ stays bounded away from zero.
Starting from this result we briefly describe how the geometrical arguments in [BCIM] lead to a proof of Theorem 1 in the past direction. First, in [BCIM] $R$ is shown to be positive everywhere in the globally hyperbolic region of a $T^2$ symmetric spacetime. Also, along any past inextendible timelike path in $D^-_{\mathrm{conf}}(h, k, f_0)$, $R$ is shown to approach a limit $R_0 \geq 0$ (to be identified with $c$ in Theorem 1), which is independent of the choice of path. Moreover, for any $\tilde R \in (R_0, R_1)$, where $R_1$ is the minimum value of $R$ on the initial hypersurface, the level set $R = \tilde R$ in $D^-_{\mathrm{conf}}(h, k, f_0)$ is shown to be a Cauchy surface. From these facts it follows from arguments in [Cl] that $D^-_{\mathrm{conf}}(h, k, f_0)$ admits areal coordinates to the past of the hypersurface $R = R_1$. Propositions 4 and 5 in [BCIM] then show that $D^-_{\mathrm{conf}}(h, k, f_0)$ is also isometrically diffeomorphic to the maximal past development, $D^-(h, k, f_0)$, of $(h, k, f_0)$ on $T^3$.

In the future direction, global existence in areal coordinates is almost sufficient for proving Theorem 1. The only statement that remains to be proved in Theorem 1 is that the future maximal development is covered by $D^+_{\mathrm{areal}}(h, k, f_0)$. This follows from a very short geometrical argument given in the proof of Proposition 5 in [BCIM].

4. Analysis in the Contracting Direction

The local existence theorem of Choquet-Bruhat [CB] together with the result of Chrusciel (Lemma 4.2 in [Cl]) imply that for any Gowdy symmetric initial data set $(h, k, f_0)$ on
$T^3$, we can find an interval $(\hat t_1, \hat t_2)$ and $C^\infty$ functions $R, U, \eta$ on $(\hat t_1, \hat t_2) \times T^3$, and a non-negative $C^\infty$ function $f$ on $(\hat t_1, \hat t_2) \times P$ ($P$ denotes the mass shell) such that: these functions satisfy the Einstein–Vlasov equations in conformal coordinate form and for some $t_0 \in (\hat t_1, \hat t_2)$, the metric $g$ induces initial data on the $t_0$-hypersurface which is smoothly spatially diffeomorphic to $(h, k)$, and the relation between $f$ and $f_0$ given above holds.

Now, in order to show that $D^-_{\mathrm{conf}}(h, k, f_0)$ has $t \to -\infty$, as long as $R$ stays bounded away from zero, it is sufficient to prove that on any finite time interval $(\tilde t, t_0]$, the functions $R, U, A, \eta, f$ and all their derivatives are uniformly bounded and that the supremum of the support of momenta at time $t$,
\[ Q(t) := \sup\{|v| : \exists (s, \theta) \in [t, t_0] \times S^1 \text{ such that } f(s, \theta, v) \neq 0\}, \tag{25} \]
is uniformly bounded. Note that the last condition implies that the matter quantities and their derivatives are uniformly bounded (if $|\partial f/\partial x^\mu| < C$).

Step 1 (Monotonicity of R and bounds on its first derivatives). This is a key step and relies on Theorem 4.1 in [Cl] together with the arguments in [BCIM]. We have to check that the matter terms have the right signs so that these arguments still hold. The bounds on $R$ and its first derivatives will play a crucial role when we control the matter terms below. First we show that $\nabla R$ is timelike. Let us introduce the null vector fields
\[ \partial_\xi = \frac{1}{\sqrt{2}}(\partial_t + \partial_\theta), \qquad \partial_\lambda = \frac{1}{\sqrt{2}}(\partial_t - \partial_\theta), \tag{26} \]
and let us set $F_\xi = \partial_\xi F$, $F_\lambda = \partial_\lambda F$ for a function $F$. After some algebra it follows that the constraint equations (17) and (18) can be written
\[ \partial_\theta R_\xi = \eta_\xi R_\xi - RU_\xi^2 - \frac{e^{4U}}{4R}A_\xi^2 - Re^{2(\eta-U)}(\rho - J), \tag{27} \]
\[ \partial_\theta R_\lambda = \eta_\lambda R_\lambda - RU_\lambda^2 - \frac{e^{4U}}{4R}A_\lambda^2 - Re^{2(\eta-U)}(\rho + J). \tag{28} \]
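Let $h_1$ and $h_2$ be defined by
\[ h_1 := RU_\xi^2 + \frac{e^{4U}}{4R}A_\xi^2 + Re^{2(\eta-U)}(\rho - J), \]
and
\[ h_2 := RU_\lambda^2 + \frac{e^{4U}}{4R}A_\lambda^2 + Re^{2(\eta-U)}(\rho + J). \]
From (10) and (12) we have $\rho \geq |J|$, and since $R > 0$ it follows that both $h_1$ and $h_2$ are non-negative. Solving Eq. (27) gives for any $\theta_0 \in [0, 2\pi]$ (suppressing the $t$-dependence)
\[ R_\xi(\theta) = e^{\int_{\theta_0}^{\theta}\eta_\xi(\sigma)\,d\sigma}R_\xi(\theta_0) - \int_{\theta_0}^{\theta}e^{\int_{\tilde\theta}^{\theta}\eta_\xi(\sigma)\,d\sigma}h_1(\tilde\theta)\,d\tilde\theta. \tag{29} \]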
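Formula (29) is the variation-of-constants solution of the linear equation $\partial_\theta R_\xi = \eta_\xi R_\xi - h_1$. The Python sketch below is an illustration only: the coefficient `eta` and the source `h` are hypothetical stand-ins for $\eta_\xi$ and $h_1$, chosen so that the inner integral has a closed form. It compares the right-hand side of (29) with a direct Runge–Kutta integration of the equation.

```python
import math

def eta(theta):
    """Illustrative 2*pi-periodic coefficient (an assumption, not from the paper)."""
    return math.cos(theta)

def h(theta):
    """Illustrative non-negative source term, playing the role of h_1."""
    return 1.0 + math.sin(theta) ** 2

def growth(a, b):
    """exp of the integral of eta from a to b; closed form since eta = cos."""
    return math.exp(math.sin(b) - math.sin(a))

def formula_29(theta, theta0, r0, n=2000):
    """Right-hand side of (29), outer integral by the composite trapezoidal rule."""
    step = (theta - theta0) / n
    total = 0.0
    for i in range(n + 1):
        s = theta0 + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * growth(s, theta) * h(s)
    return growth(theta0, theta) * r0 - step * total

def rk4(theta, theta0, r0, n=2000):
    """Integrate R' = eta * R - h from theta0 to theta with classical RK4."""
    def rhs(s, r):
        return eta(s) * r - h(s)
    step = (theta - theta0) / n
    s, r = theta0, r0
    for _ in range(n):
        k1 = rhs(s, r)
        k2 = rhs(s + step / 2, r + step * k1 / 2)
        k3 = rhs(s + step / 2, r + step * k2 / 2)
        k4 = rhs(s + step, r + step * k3)
        r += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += step
    return r

# Start from a zero of R at theta0 = 0 and go once around the circle.
print(formula_29(2 * math.pi, 0.0, 0.0), rk4(2 * math.pi, 0.0, 0.0))
```

With a vanishing initial value and a source that is non-negative and not identically zero, the value after one period is strictly negative, so the solution cannot be periodic. This is the mechanism that excludes zeros of $R_\xi$ in the non-vacuum case.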
Since $R$ is $C^\infty$ on $S^1$ it can be identified with a periodic function on the real line. If now $R_\xi(\theta_0) = 0$ for some $\theta_0$ then $R_\xi(2\pi + \theta_0) = 0$, but from (29) this is only possible if $h_1$ vanishes identically. However, in the non-vacuum case (recall $f_0 \neq 0$) $h_1(t, \cdot)$ is strictly
positive on some open set of $[0, 2\pi]$. Therefore $R_\xi$ is nonzero and has a definite sign. The same arguments apply to $R_\lambda$, and it follows that $g^{\mu\nu}\partial_\mu R\,\partial_\nu R = -2e^{-2(\eta-U)}R_\xi R_\lambda$ is strictly positive or strictly negative. The former possibility is ruled out since $\partial_\theta R = 0$ at some point on $S^1$. Thus $\nabla R$ is timelike. This means that $\partial_t R$ is nonzero everywhere. Our choice of time corresponds to contracting $T^2$ orbits so that $\partial_t R > 0$. Next we show that $\partial_t R$ and $|\partial_\theta R|$ are bounded into the past. The evolution equation (21) can be written
\[ \partial_\lambda R_\xi = Re^{2(\eta-U)}(\rho - P_1), \tag{30} \]
or equivalently,
\[ \partial_\xi R_\lambda = Re^{2(\eta-U)}(\rho - P_1). \tag{31} \]
The right hand side is positive since $\rho \geq P_1$, see (10) and (11), and from (30) it follows that if we start at any point $(t_0, \theta_0)$ on the initial surface we obtain
\[ R_\xi(t_0 - s, \theta_0 + s) \leq R_\xi(t_0, \theta_0), \tag{32} \]
and similarly from (31),
\[ R_\lambda(t_0 - s, \theta_0 - s) \leq R_\lambda(t_0, \theta_0). \tag{33} \]
From these relations we get for any $t \in (\tilde t, t_0)$ and any $\theta \in S^1$,
\[ R_\xi(t, \theta) \leq \sup_{\theta \in S^1} R_\xi(t_0, \theta), \tag{34} \]
\[ R_\lambda(t, \theta) \leq \sup_{\theta \in S^1} R_\lambda(t_0, \theta). \tag{35} \]
This yields
\[ R_t(t, \theta) \leq \sup_{\theta \in S^1}(R_\xi + R_\lambda)(t_0, \theta), \tag{36} \]
and since $\nabla R$ is timelike everywhere we have $|R_t| > |R_\theta|$ and we find that both $R_t$ and $|R_\theta|$ are bounded into the past, so $R$ is uniformly $C^1$ bounded to the past of the initial surface.

Step 2 (Bounds on U, A and η and their first derivatives). The bounds on $U_t$, $A_t$, $U_\theta$ and $A_\theta$ to the past of the initial surface are obtained by a light-cone estimate, which in this case, with one spatial dimension, is an application of the Gronwall method on two independent null paths. Then, by combining these results, one obtains the desired estimate. Let us now define the quadratic forms $G$ and $H$ by
\[ G = \frac{1}{2}R(U_t^2 + U_\theta^2) + \frac{e^{4U}}{8R}(A_t^2 + A_\theta^2), \tag{37} \]
\[ H = RU_tU_\theta + \frac{e^{4U}}{4R}A_tA_\theta. \tag{38} \]
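A motivation for the introduction of these quadratic forms is given in [BCIM] where it is shown that $G$ and $H$ are components of an “energy-momentum tensor” of a wave map. To derive bounds on $U$ and $A$ and their first order derivatives we use the evolution equations (19) and (20) and we find
\[ \begin{aligned} \partial_\lambda(G + H) ={}& -\frac{1}{2\sqrt{2}}R_\xi\Big(U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big) \\ & + U_\xi\frac{R}{2}e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3) + \frac{e^{2U}}{2R}A_\xi Re^{2(\eta-U)}S_{23}, \end{aligned} \]
and
\[ \begin{aligned} \partial_\xi(G - H) ={}& -\frac{1}{2\sqrt{2}}R_\lambda\Big(U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big) \\ & + U_\lambda\frac{R}{2}e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3) + \frac{e^{2U}}{2R}A_\lambda Re^{2(\eta-U)}S_{23}. \end{aligned} \]
Now, integrating these equations along null paths starting at $(t_1, \theta)$ and ending at the initial $t_0$-surface, and adding the results we obtain
\[ \begin{aligned} G(t_1, \theta) ={}& \frac{1}{2}[G + H](t_0, \theta - (t_0 - t_1)) + \frac{1}{2}[G + H](t_0, \theta + (t_0 - t_1)) \\ & - \frac{1}{2}\int_{t_1}^{t_0}\big(K_1(s, \theta - (s - t_1)) + K_2(s, \theta + (s - t_1))\big)\,ds \\ & - \frac{1}{2}\int_{t_1}^{t_0}\big([U_\xi T](s, \theta - (s - t_1)) + [U_\lambda T](s, \theta + (s - t_1))\big)\,ds \\ & - \frac{1}{2}\int_{t_1}^{t_0}\Big(\Big[\frac{e^{2U}}{2R}A_\xi\tilde T\Big](s, \theta - (s - t_1)) + \Big[\frac{e^{2U}}{2R}A_\lambda\tilde T\Big](s, \theta + (s - t_1))\Big)\,ds, \end{aligned} \tag{39} \]
where we have introduced the notations
\[ K_1 = -\frac{1}{2\sqrt{2}}R_\lambda\Big(U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big), \tag{40} \]
\[ K_2 = -\frac{1}{2\sqrt{2}}R_\xi\Big(U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big), \tag{41} \]
\[ T = \frac{R}{2}e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3), \tag{42} \]
\[ \tilde T = Re^{2(\eta-U)}S_{23}. \tag{43} \]
Let us first consider the matter terms. Note that for any $t \in (\tilde t, t_0)$, the evolution equations (30) and (31) give
\[ R_\xi(t_0, \theta + (t_0 - t)) - R_\xi(t, \theta) = \sqrt{2}\int_t^{t_0}[Re^{2(\eta-U)}(\rho - P_1)](s, \theta + (s - t))\,ds, \tag{44} \]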
and
\[ R_\lambda(t_0, \theta - (t_0 - t)) - R_\lambda(t, \theta) = \sqrt{2}\int_t^{t_0}[Re^{2(\eta-U)}(\rho - P_1)](s, \theta - (s - t))\,ds. \tag{45} \]
Hence, since $R$ is uniformly $C^1$ bounded to the past of the initial surface it follows that the right-hand sides are uniformly bounded. From (10)–(11) we have $\rho \geq P_1 + P_2 + P_3$, and thus $0 \leq (\rho - P_1 + P_2 - P_3) \leq 2(\rho - P_1)$, and from (13) and the elementary inequality $2ab \leq a^2 + b^2$, $a, b \in \mathbb{R}$, we have $2|S_{23}| \leq P_2 + P_3 \leq \rho - P_1$. In view of (44) and (45) we therefore have that both
\[ \int_t^{t_0}T(s, \theta \pm (s - t))\,ds, \tag{46} \]
and
\[ \int_t^{t_0}|\tilde T(s, \theta \pm (s - t))|\,ds, \tag{47} \]
are uniformly bounded on $(\tilde t, t_0] \times S^1$. Now, by using the inequality $2ab \leq a^2 + b^2$ again, we get
\[ |U_\xi| \leq \Big(\frac{2G}{R}\Big)^{1/2}, \]
and
\[ \frac{e^{2U}|A_\xi|}{2R} \leq \Big(\frac{2G}{R}\Big)^{1/2}. \]
Z + C sup θ
+ C sup θ
Z
t0
t1 t0 t1
θ
θ
t1
θ
p p a(s)[sup G(s, ·)](T + |T˜ |)(s, θ − (s − t1 ))ds θ
p p a(s)[sup G(s, ·)](T + |T˜ |)(s, θ + (s − t1 ))ds. θ
(48)
Since the suprema with respect to $\theta$ of the last two integrals are taken over the compact set $S^1$, there exist $\theta_1, \theta_2 \in S^1$ such that the suprema of these integrals equal
\[ \begin{aligned} & C\int_{t_1}^{t_0}\sqrt{a(s)}\Big[\sup_\theta\sqrt{G(s, \cdot)}\Big](T + |\tilde T|)(s, \theta_1 - (s - t_1))\,ds \\ & + C\int_{t_1}^{t_0}\sqrt{a(s)}\Big[\sup_\theta\sqrt{G(s, \cdot)}\Big](T + |\tilde T|)(s, \theta_2 + (s - t_1))\,ds. \end{aligned} \tag{49} \]
Combining (48) and (49) we obtain a Gronwall-type inequality. Recall that
\[ \int_{t_1}^{t_0}(T + |\tilde T|)(s, \theta \pm (s - t_1))\,ds \]
are uniformly bounded on $(\tilde t, t_0] \times S^1$. Using the crude estimate $\sqrt{G} \leq (1 + G)$ we obtain a standard Gronwall inequality, which is sufficient here, but a sharper estimate is given in [MPF, p. 360]. Thus, as long as $R$ stays uniformly bounded away from zero (or equivalently that $a(t)$ is uniformly bounded on $(\tilde t, t_0]$), we conclude that $\sup_\theta G$ is uniformly bounded on $(\tilde t, t_0]$, leading to bounds on $U$ and its first order derivatives, and thus also on $A$ and its first order derivatives. The bounds on $|\eta|$, $|\eta_t|$ and $|\eta_\theta|$ are obtained in a similar way since the evolution equation (22) can be written
\[ \partial_\lambda\eta_\xi = U_\theta^2 - U_t^2 + \frac{e^{4U}}{4R^2}(A_t^2 - A_\theta^2) - e^{2(\eta-U)}\Big(P_3 + \frac{A^2}{R^2}e^{4U}P_2 + \frac{2A}{R}e^{2U}S_{23}\Big), \tag{50} \]
or equivalently,
\[ \partial_\xi\eta_\lambda = U_\theta^2 - U_t^2 + \frac{e^{4U}}{4R^2}(A_t^2 - A_\theta^2) - e^{2(\eta-U)}\Big(P_3 + \frac{A^2}{R^2}e^{4U}P_2 + \frac{2A}{R}e^{2U}S_{23}\Big). \tag{51} \]
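The Gronwall step used above for $\sup_\theta G$ has a simple discrete analogue, sketched here in Python as an illustration only (the function names and sample coefficients are assumptions, not taken from the paper). If $u_n \leq c + \Delta\sum_{j<n}a_ju_j$ with $a_j \geq 0$, then $u_n \leq c\exp(\Delta\sum_{j<n}a_j)$; the code builds the extremal sequence that satisfies the hypothesis with equality and compares it with the exponential bound.

```python
import math

def discrete_gronwall(c, a_values, step):
    """Extremal sequence with u_n = c + step * sum_{j<n} a_j * u_j."""
    u, running = [], 0.0
    for a in a_values:
        current = c + step * running
        u.append(current)
        running += a * current
    return u

def gronwall_bound(c, a_values, step):
    """Exponential bound c * exp(step * sum_{j<n} a_j) for each index n."""
    bounds, integral = [], 0.0
    for a in a_values:
        bounds.append(c * math.exp(integral))
        integral += step * a
    return bounds

# Hypothetical non-negative coefficient, playing the role of C * a(s).
a_values = [1.0 + 0.5 * math.sin(0.01 * k) for k in range(500)]
u = discrete_gronwall(2.0, a_values, 0.01)
b = gronwall_bound(2.0, a_values, 0.01)
print(u[-1], b[-1])
```

The extremal sequence equals $c\prod_{j<n}(1 + \Delta a_j)$, and $1 + x \leq e^x$ gives the bound. In the continuous setting the same mechanism shows that a bounded coefficient $a(t)$ on a finite interval yields a bounded $\sup_\theta G$.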
We found above that the integrals along null paths for the matter quantity $Re^{2(\eta-U)}(\rho - P_1)$ were bounded to the past of the initial surface. Therefore, since $0 \leq P_k \leq \rho - P_1$, $k = 2, 3$ and $|S_{23}| \leq \rho - P_1$ we have, as long as $R$ stays bounded away from zero, that the integrals along the null paths for the matter terms in the right-hand sides above are bounded as well, since $U$ and $A$ are bounded. Now, since the first order derivatives of $U$ and $A$ are uniformly bounded we immediately obtain that $|\eta_\xi|$ and $|\eta_\lambda|$ are bounded by integrating the equations for $\eta$ along null paths. Since $\eta_t = \frac{1}{\sqrt{2}}(\eta_\xi + \eta_\lambda)$ and $\eta_\theta = \frac{1}{\sqrt{2}}(\eta_\xi - \eta_\lambda)$ we find that $\eta$ is uniformly $C^1$ bounded to the past of the initial surface as long as $R$ stays bounded away from zero.

Step 3 (Bound on the support of the momentum). Note that a solution $f$ to the Vlasov equation is given by
\[ f(t, \theta, v) = f_0(\Theta(0, t, \theta, v), V(0, t, \theta, v)), \tag{52} \]
where $\Theta$ and $V$ are solutions to the characteristic system
\[ \begin{aligned} \frac{d\Theta}{ds} &= \frac{V^1}{V^0}, \\ \frac{dV^1}{ds} &= -(\eta_\theta - U_\theta)V^0 - (\eta_t - U_t)V^1 + U_\theta\frac{(V^2)^2}{V^0} - \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{(V^3)^2}{V^0} + \frac{A_\theta}{R}e^{2U}\frac{V^2V^3}{V^0}, \\ \frac{dV^2}{ds} &= -U_tV^2 - U_\theta\frac{V^1V^2}{V^0}, \\ \frac{dV^3}{ds} &= -\Big(\frac{R_t}{R} - U_t\Big)V^3 + \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{V^1V^3}{V^0} - \frac{e^{2U}}{R}\Big(A_t + A_\theta\frac{V^1}{V^0}\Big)V^2, \end{aligned} \]
and $\Theta(s, t, \theta, v)$, $V(s, t, \theta, v)$ is the solution that goes through the point $(\theta, v)$ at time $t$. Let us recall the definition of $Q(t) := \sup\{|v| : \exists (s, \theta) \in [t, t_0] \times S^1 \text{ such that } f(s, \theta, v) \neq 0\}$. If $Q(t)$ can be controlled we obtain immediately from (10)–(12) bounds on $\rho$, $J$, $S_{23}$ and $P_k$, $k = 1, 2, 3$, since $\|f\|_\infty \leq \|f_0\|_\infty$ from (52). Now, all of the field components and their first derivatives are known to be bounded on $(\tilde t, t_0]$, as long as $R$ stays bounded away from zero. Also, the distribution function has compact support on the initial surface and therefore $|V^k(t_0)| < C$. So by observing that $|V^k| < V^0$, $k = 1, 2, 3$, a simple Gronwall argument applied to the characteristic system gives uniform bounds on $|V^k(t)|$, $t \in (\tilde t, t_0]$, and it follows that $Q(t)$ is uniformly bounded on $(\tilde t, t_0]$.

Remark. By a Killing vector argument, bounds on $|V^2|$ and $|V^3|$ can be derived if merely $|U|$ and $|A|$ are bounded and $R > \epsilon > 0$. Such an argument will be used in the expanding direction.

Step 4 (Bounds on the second order derivatives of the field components and on the first order derivatives of f). From the Einstein-matter constraint equations in conformal coordinates we can express $R_{t\theta}$ and $R_{\theta\theta}$ in terms of uniformly bounded quantities, as long as $R$ stays bounded away from zero. Therefore these functions are uniformly bounded and Eq. (21) then implies that $R_{tt}$ is uniformly bounded as well. In the vacuum case one can take the derivative of the evolution equations and repeat the argument in Step 2 to obtain bounds on second order derivatives of $U$ and $A$. Here we need another argument. First we write the evolution equations for $U$ and $A$ in the forms
\[ \begin{aligned} U_{tt} - U_{\theta\theta} ={}& \frac{(R_\theta - R_t)}{2R}(U_\theta + U_t) - \frac{(R_\theta + R_t)}{2R}(U_t - U_\theta) \\ & + \frac{e^{4U}}{2R^2}(A_t - A_\theta)(A_t + A_\theta) + \frac{1}{2}e^{2(\eta-U)}\kappa, \end{aligned} \tag{53} \]
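and
\[ \begin{aligned} A_{tt} - A_{\theta\theta} ={}& \frac{(R_t - R_\theta)}{2R}(A_\theta + A_t) + \frac{(R_\theta + R_t)}{2R}(A_t - A_\theta) \\ & - 2(A_t - A_\theta)(U_\theta + U_t) - 2(A_\theta + A_t)(U_t - U_\theta) + 2Re^{2(\eta-2U)}S_{23}, \end{aligned} \tag{54} \]
where $\kappa$ denotes $\rho - P_1 + P_2 - P_3$. Taking the $\theta$-derivative of these equations gives
\[ \begin{aligned} \partial_\lambda\partial_\xi U_\theta ={}& L + \frac{R_\xi}{2R}\partial_\xi U_\theta + \frac{R_\lambda}{2R}\partial_\lambda U_\theta \\ & + \frac{e^{4U}}{2R^2}(A_\lambda\partial_\xi A_\theta + A_\xi\partial_\lambda A_\theta) + \frac{1}{4}e^{2(\eta-U)}\kappa_\theta, \end{aligned} \tag{55} \]
and
\[ \begin{aligned} \partial_\lambda\partial_\xi A_\theta ={}& L + \frac{R_\xi}{2R}\partial_\xi A_\theta - \frac{R_\lambda}{2R}\partial_\lambda A_\theta + 2U_\xi\partial_\lambda A_\theta + 2A_\lambda\partial_\xi U_\theta \\ & + 2U_\lambda\partial_\xi A_\theta + 2A_\xi\partial_\lambda U_\theta + 2Re^{2(\eta-2U)}(S_{23})_\theta. \end{aligned} \tag{56} \]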
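The regrouped forms (53) and (54) are algebraic rearrangements of (19) and (20). As a sanity check, the Python sketch below (an illustration, not part of the paper; the dictionary keys and sampling ranges are assumptions) evaluates both pairs of right-hand sides at random field values and compares them.

```python
import math
import random

def rhs_19(q):
    """Right-hand side of (19); q["kappa"] stands for rho - P1 + P2 - P3."""
    return (q["Uth"] * q["Rth"] / q["R"] - q["Ut"] * q["Rt"] / q["R"]
            + math.exp(4 * q["U"]) / (2 * q["R"] ** 2) * (q["At"] ** 2 - q["Ath"] ** 2)
            + 0.5 * math.exp(2 * (q["eta"] - q["U"])) * q["kappa"])

def rhs_53(q):
    """Right-hand side of (53)."""
    return ((q["Rth"] - q["Rt"]) / (2 * q["R"]) * (q["Uth"] + q["Ut"])
            - (q["Rth"] + q["Rt"]) / (2 * q["R"]) * (q["Ut"] - q["Uth"])
            + math.exp(4 * q["U"]) / (2 * q["R"] ** 2)
            * (q["At"] - q["Ath"]) * (q["At"] + q["Ath"])
            + 0.5 * math.exp(2 * (q["eta"] - q["U"])) * q["kappa"])

def rhs_20(q):
    """Right-hand side of (20)."""
    return (q["Rt"] * q["At"] / q["R"] - q["Rth"] * q["Ath"] / q["R"]
            + 4 * (q["Ath"] * q["Uth"] - q["At"] * q["Ut"])
            + 2 * q["R"] * math.exp(2 * (q["eta"] - 2 * q["U"])) * q["S23"])

def rhs_54(q):
    """Right-hand side of (54)."""
    return ((q["Rt"] - q["Rth"]) / (2 * q["R"]) * (q["Ath"] + q["At"])
            + (q["Rth"] + q["Rt"]) / (2 * q["R"]) * (q["At"] - q["Ath"])
            - 2 * (q["At"] - q["Ath"]) * (q["Uth"] + q["Ut"])
            - 2 * (q["Ath"] + q["At"]) * (q["Ut"] - q["Uth"])
            + 2 * q["R"] * math.exp(2 * (q["eta"] - 2 * q["U"])) * q["S23"])

def sample_state(rng):
    """Random pointwise values of the fields and their first derivatives."""
    q = {name: rng.uniform(-2.0, 2.0)
         for name in ("Rt", "Rth", "Ut", "Uth", "At", "Ath", "kappa", "S23")}
    q["R"] = rng.uniform(0.5, 2.0)
    q["U"] = rng.uniform(-0.5, 0.5)
    q["eta"] = rng.uniform(-0.5, 0.5)
    return q

q = sample_state(random.Random(7))
print(rhs_19(q) - rhs_53(q), rhs_20(q) - rhs_54(q))
```

Both differences should vanish up to rounding, since the regrouping only rewrites the first order terms as products of sums and differences of $t$- and $\theta$-derivatives.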
Here, $L$ contains only $\kappa$ and $S_{23}$, first order derivatives of $U$, $A$ and $\eta$, and first and second order derivatives of $R$, which are all known to be bounded. These equations can of course also be written in a form where the left hand sides read $\partial_\xi\partial_\lambda U_\theta$ and $\partial_\xi\partial_\lambda A_\theta$, respectively. By integrating these equations along null paths to the past of the initial surface, we get from a Gronwall argument a bound on
\[ \sup_{\theta \in S^1}\big(|\partial_\xi U_\theta| + |\partial_\lambda U_\theta| + |\partial_\xi A_\theta| + |\partial_\lambda A_\theta|\big), \]
as long as $R$ is bounded away from zero, under the hypothesis that the integral of the differentiated matter terms $\kappa_\theta$ and $(S_{23})_\theta$ can be controlled. In order to bound these integrals we make use of a device introduced by Glassey and Strauss [GS] for treating the Vlasov–Maxwell equation. It is sufficient to show how one of the differentiated matter terms can be bounded since the arguments are similar in all cases. Let us consider the integral appearing by integrating (55) along the null path defined by $\partial_\lambda$ which involves $\rho_\theta$,
\[ \frac{1}{4}\int_t^{t_0}\int_{\mathbb{R}^3}[e^{2(\eta-U)}v^0\partial_\theta f](s, \theta - (s - t), v)\,dv\,ds, \tag{57} \]
where $t \in (\tilde t, t_0]$. Next, define
\[ W = \sqrt{2}\,\partial_\lambda = \partial_t - \partial_\theta, \qquad S = \partial_t + \frac{v^1}{v^0}\partial_\theta. \]
Hence, $\partial_\theta$ and $\partial_t$ can be expressed in terms of $W$ and $S$ by
\[ \partial_\theta = \frac{v^0}{v^0 + v^1}(S - W), \tag{58} \]
\[ \partial_t = \frac{v^0}{v^0 + v^1}\Big(S + \frac{v^1}{v^0}W\Big). \tag{59} \]
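Relations (58) and (59) can be illustrated numerically. In the Python sketch below (an illustration only; the test function `f`, the step size and the sample point are assumptions) $W$ and $S$ are evaluated as central differences along the directions $(1, -1)$ and $(1, v^1/v^0)$ in the $(t, \theta)$ plane, and the partial derivatives are then recovered from (58) and (59).

```python
import math

def f(t, theta):
    """Arbitrary smooth test function of (t, theta); a stand-in for f at fixed v."""
    return math.sin(t) * math.cos(2.0 * theta) + t * theta

def derivative_along(direction, t, theta, step=1e-5):
    """Central difference of f along the direction (dt, dtheta)."""
    dt, dth = direction
    forward = f(t + step * dt, theta + step * dth)
    backward = f(t - step * dt, theta - step * dth)
    return (forward - backward) / (2.0 * step)

def recover_partials(t, theta, v1, v0):
    """Return (d_t f, d_theta f) reconstructed from W f and S f via (59) and (58)."""
    W = derivative_along((1.0, -1.0), t, theta)      # W = d_t - d_theta
    S = derivative_along((1.0, v1 / v0), t, theta)   # S = d_t + (v1/v0) d_theta
    factor = v0 / (v0 + v1)                          # finite since v0 > |v1|
    return factor * (S + v1 / v0 * W), factor * (S - W)

v = (0.3, 1.0, -2.0)
v0 = math.sqrt(1.0 + sum(c * c for c in v))
print(recover_partials(0.7, 1.2, v[0], v0))
```

The denominator $v^0 + v^1$ never vanishes because $v^0 > |v^1|$ on the mass shell. The $W$ derivative is a derivative along the null path, which is what allows the integration by parts in $s$ in the argument that follows.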
Now,
\[ [Wf](s, \theta - (s - t), v) = \partial_s[f(s, \theta - (s - t), v)], \]
and from the Vlasov equation we get
\[ [Sf](s, \theta - (s - t), v) = [-K \cdot \nabla_v f](s, \theta - (s - t), v), \]
where it is clear which terms have been denoted by $K = (K_1, K_2, K_3)$. By using (58) we can now evaluate the integral above by integrating by parts (in $s$ for the $W$-term and in $v$ for the $S$-term), so that the remaining terms only involve bounded quantities. Note in particular that the $v$-integrals are easily controlled in view of the uniform bound on $Q(t)$. Thus, the integrals of the differentiated matter terms can be controlled and the Gronwall argument referred to above goes through. So we obtain uniform bounds on $|\partial_\xi U_\theta|$, $|\partial_\lambda U_\theta|$, $|\partial_\xi A_\theta|$, and $|\partial_\lambda A_\theta|$, and therefore also on $|U_{\theta\theta}|$, $|U_{t\theta}|$, $|A_{\theta\theta}|$ and $|A_{t\theta}|$, as long as $R$ is bounded away from zero. The evolution equations (19) and (20) then give uniform bounds on $|U_{tt}|$ and $|A_{tt}|$. By differentiating Eq. (22), it is now straightforward to obtain bounds on the second order derivatives of $\eta$, using similar arguments to those already discussed here; in particular the integrals involving matter quantities can be treated as above. Bounds on the first order derivatives of the distribution function $f$ may now be obtained from the known bounds on the field components from the formula
\[ f(t, \theta, v) = f_0(\Theta(0, t, \theta, v), V(0, t, \theta, v)), \tag{60} \]
since f0 is smooth and since ∂2 and ∂V (here ∂ denotes ∂t , ∂θ or ∂v ) can be controlled by a Gronwall argument in view of the characteristic system. Step 5 (Bounds on higher order derivatives and completion of the proof). It is clear that the method described above can be continued for obtaining bounds on higher derivatives as well. Hence, we have uniform bounds on the functions R, U, A, η and f and all their derivatives on the interval (t˜, t0 ] if R > > 0. This implies that the solution extends to t → −∞ as long as R stays bounded away from zero. In view of the discussion after the statement of Theorem 1, this completes the proof of Theorem 1 in the contracting direction. 5. Analysis in the Expanding Direction To begin the analysis in the expanding direction (increasing R) in areal coordinates we need to start with data on a R =constant Cauchy surface (recall that in areal coordinates R = t). That this can be done follows from the geometrical arguments in [BCIM] (cf. the discussion following the statement of Theorem 1). There it is shown that if Gowdy symmetric (or more generally T 2 symmetric) data is given on T 3 , and if R0 is the past − and if R1 := inf T 3 R, then for every limit of R along past inextendible paths in Dconf d ∈ (R0 , R1 ), the R = d level set 6d is a Cauchy surface, and these 6d foliate the region − ∩ I − (6R1 ). Here I − (S) is the chronological past of S (see [HE]). The surfaces Dconf 6d lie to the past of the initial surface. Let us pick one of them, say 6d2 . The spacetime D − (h, k, f0 ) induces initial data for the areal component fields (U, A, η, α) and the distribution function f on 6t2 =d2 . By combining the local existence proof in harmonic coordinates [CB], and the arguments in [Cl] which show that the spacetime admits areal coordinates, we obtain local existence for the initial value problem in these coordinates. 
Now, in order to extend local existence to global existence in these coordinates, it is again sufficient to obtain uniform bounds on the field components and the distribution function and all their derivatives on a finite time interval [t2 , t3 ) on which the local solution exists.
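Step 1 (Bounds on $\alpha$, $U$, $A$ and $\tilde\eta$). In this step we first show an "energy" monotonicity lemma and then we show how this result leads to bounds on $\tilde\eta := \eta + \ln\alpha/2$ and on $U$ and $A$. Let $E(t)$ be defined by
\[ E(t) = \int_{S^1}\Big[\alpha^{-1/2}U_t^2 + \sqrt{\alpha}\,U_\theta^2 + \frac{e^{4U}}{4t^2}\big(\alpha^{-1/2}A_t^2 + \sqrt{\alpha}\,A_\theta^2\big) + \sqrt{\alpha}\,e^{2(\eta-U)}\rho\Big]d\theta. \]

Lemma 1. $E(t)$ is a monotonically decreasing function in $t$, and satisfies
\[ \frac{d}{dt}E(t) = -\frac{2}{t}\int_{S^1}\Big[\alpha^{-1/2}U_t^2 + \frac{e^{4U}}{4t^2}\sqrt{\alpha}\,A_\theta^2 + \frac{\sqrt{\alpha}}{2}e^{2(\eta-U)}(\rho + P_3)\Big]d\theta \le 0. \tag{61} \]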
Proof. This is a straightforward but somewhat lengthy computation. Let us merely sketch the steps involved. After taking the time derivative of the integrand we use the evolution equations for $U$ and $A$ to substitute for the second order derivatives, and we express $\rho_t$ by using the Vlasov equation. Integrating by parts and using the constraint equations for $\eta_t$ and $\alpha_t$ leads to (61). □

Remark. It is clear from (61) that a Gronwall argument leads to a bound on $E(t)$ also on $(0, t_2]$. For $T^2$ symmetry and vacuum, which is considered in [BCIM], this bound is not available. A natural question is then why the areal coordinates in our case have to be discarded in the analysis for the past direction. However, the analysis of the characteristic system associated to the Vlasov equation in Lemma 2 depends on the time direction.

Let us now define the quantity $\tilde\eta$ by
\[ \tilde\eta = \eta + \frac{1}{2}\ln\alpha. \tag{62} \]
From the constraint equation (4) we get
\[ \tilde\eta_\theta = 2tU_tU_\theta + \frac{e^{4U}}{2t}A_tA_\theta - t\sqrt{\alpha}\,e^{2(\eta-U)}J. \tag{63} \]
Now, from the elementary inequality $|ab| \le \frac{1}{2c}a^2 + \frac{c}{2}b^2$, for any $a, b, c \in \mathbb{R}$, $c > 0$, and from the fact that $|J| \le \rho$, it follows from Lemma 1 that for any $t \in [t_2, t_3)$,
\[ \int_{S^1}|\tilde\eta_\theta|\,d\theta \le tE(t) \le tE(t_2). \tag{64} \]
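As a sanity check on the step leading to (64), the elementary inequality can be tested numerically in its sharp form $|ab| \le a^2/(2c) + cb^2/2$; with $a = U_t$, $b = U_\theta$ and $c = \sqrt{\alpha}$ it gives $2t|U_tU_\theta| \le t(\alpha^{-1/2}U_t^2 + \sqrt{\alpha}\,U_\theta^2)$, which are the first two terms of the integrand of $tE(t)$. The sketch below is only illustrative; the function name `young_gap` and the sampling ranges are assumptions.

```python
import random

def young_gap(a, b, c):
    """Right side minus left side of |ab| <= a^2/(2c) + c*b^2/2, for c > 0."""
    return a * a / (2.0 * c) + c * b * b / 2.0 - abs(a * b)

# Random sampling over a hypothetical range of values.
random.seed(0)
worst = min(
    young_gap(random.uniform(-10, 10), random.uniform(-10, 10), random.uniform(0.01, 10))
    for _ in range(10000)
)
print(worst >= -1e-9)  # the gap equals (|a|/sqrt(c) - sqrt(c)|b|)^2 / 2
```

Equality holds exactly when $|a| = c|b|$, so the constant in (64) cannot be improved by this argument.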
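Hence, for any $\theta_1, \theta_2 \in S^1$ and for any $t \in [t_2, t_3)$ we have
\[ |\tilde\eta(t, \theta_2) - \tilde\eta(t, \theta_1)| = \Big|\int_{\theta_1}^{\theta_2}\tilde\eta_\theta\,d\theta\Big| \le \int_{S^1}|\tilde\eta_\theta|\,d\theta \le tE(t_2). \tag{65} \]
Next, using the constraint equations (3) and (5), we find that the time derivative of $\tilde\eta$ satisfies
\[ \tilde\eta_t = t\Big[U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4t^2}(A_t^2 + \alpha A_\theta^2) + \alpha e^{2(\eta-U)}P_1\Big] \ge 0. \tag{66} \]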
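This relation leads to a control of $\int_{S^1}\tilde\eta\,d\theta$ from above, namely
\[
\begin{aligned}
\int_{S^1}\tilde\eta(t, \theta)\,d\theta - \int_{S^1}\tilde\eta(t_2, \theta)\,d\theta
&= \int_{t_2}^{t}\frac{d}{ds}\int_{S^1}\tilde\eta(s, \theta)\,d\theta\,ds \\
&= \int_{t_2}^{t}\int_{S^1}\sqrt{\alpha}\,s\Big[\frac{U_t^2}{\sqrt{\alpha}} + \sqrt{\alpha}\,U_\theta^2 + \frac{e^{4U}}{4s^2}\Big(\frac{A_t^2}{\sqrt{\alpha}} + \sqrt{\alpha}\,A_\theta^2\Big) + \sqrt{\alpha}\,e^{2(\eta-U)}P_1\Big]d\theta\,ds \\
&\le \sup_{S^1}\sqrt{\alpha}(t_2, \cdot)\int_{t_2}^{t}sE(s)\,ds \le C_1\int_{t_2}^{t}sE(t_2)\,ds = C_1E(t_2)(t^2 - t_2^2)/2.
\end{aligned}
\tag{67}
\]
In the first inequality above we used that $P_1 \le \rho$ and in the second that $\alpha$ is a monotonically decreasing function in $t$ (see (5)) and $C_1 := \sup_{S^1}\sqrt{\alpha}(t_2, \cdot)$. We are now in a position to obtain an upper bound on $\tilde\eta$ itself. By letting $C_2 := \int_{S^1}\tilde\eta(t_2, \theta)\,d\theta$ we get from (67) the inequality
\[ \frac{1}{2}C_1E(t_2)(t^2 - t_2^2) + C_2 \ge \int_{S^1}\tilde\eta(t, \theta)\,d\theta \tag{68} \]
\[ = 2\pi\max_{S^1}\tilde\eta + \int_{S^1}\big(\tilde\eta - \max_{S^1}\tilde\eta\big)\,d\theta. \tag{69} \]
By applying (65) to the last term we find
\[ \frac{1}{2}C_1E(t_2)(t^2 - t_2^2) + C_2 \ge 2\pi\max_{S^1}\tilde\eta - 2\pi tE(t_2). \tag{70} \]
Therefore, for some bounded function $C(t)$, we have the upper bound
\[ \max_{S^1}\tilde\eta \le C(t), \tag{71} \]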
and since $\tilde\eta_t \ge 0$ we conclude that $\tilde\eta$ is uniformly bounded on $S^1 \times [t_2, t_3)$.

Remark. In the analysis below $C(t)$ will always denote a uniformly bounded function on $[t_2, t_3)$. Sometimes we introduce other functions with the same property, only to make some estimates more transparent.

Next we show that the boundedness of $E(t)$, together with the constraint equation (5), leads to a bound on $|U|$. For any $\theta_1, \theta_2 \in S^1$ and $t \in [t_2, t_3)$ we get by Hölder's inequality
\[ |U(t, \theta_2) - U(t, \theta_1)| = \Big|\int_{\theta_1}^{\theta_2}U_\theta(t, \theta)\,d\theta\Big| \le \Big(\int_{\theta_1}^{\theta_2}\alpha^{-1/2}\,d\theta\Big)^{1/2}\Big(\int_{\theta_1}^{\theta_2}\sqrt{\alpha}\,U_\theta^2\,d\theta\Big)^{1/2}. \tag{72} \]
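The weighted Hölder (Cauchy–Schwarz) step in (72) splits $U_\theta = \alpha^{-1/4}\cdot\alpha^{1/4}U_\theta$. A minimal numerical illustration follows, using Riemann sums over the whole circle rather than over $[\theta_1, \theta_2]$; the profiles chosen for `alpha` and `u_theta` are hypothetical and not taken from the paper.

```python
import math

def holder_sides(alpha, u_theta, n=2000):
    """Riemann sums for both sides of (72) over [0, 2*pi)."""
    h = 2.0 * math.pi / n
    thetas = [i * h for i in range(n)]
    lhs = abs(sum(u_theta(th) for th in thetas) * h)
    first = sum(alpha(th) ** -0.5 for th in thetas) * h
    second = sum(math.sqrt(alpha(th)) * u_theta(th) ** 2 for th in thetas) * h
    return lhs, math.sqrt(first) * math.sqrt(second)

# Hypothetical sample profiles: any alpha > 0 and any bounded U_theta would do.
alpha = lambda th: 1.0 + 0.5 * math.cos(th)
u_theta = lambda th: 1.0 + math.sin(3.0 * th)
lhs, rhs = holder_sides(alpha, u_theta)
print(lhs, rhs, lhs <= rhs)
```

The discrete inequality holds exactly for the sums, since it is the Cauchy–Schwarz inequality for finite sequences.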
The second factor on the right-hand side is clearly bounded by $(E(t_2))^{1/2}$. For the first factor we use the constraint equation (5). This equation can be written as
\[ \partial_t(\alpha^{-1/2}) = t\sqrt{\alpha}\,e^{2(\eta-U)}(\rho - P_1), \tag{73} \]
so that for $t \in [t_2, t_3)$,
\[ \alpha^{-1/2}(t, \theta) = \int_{t_2}^{t}s\sqrt{\alpha}\,e^{2(\eta-U)}(\rho - P_1)\,ds + \alpha^{-1/2}(t_2, \theta). \tag{74} \]
Since $\rho \ge P_1$, the integrand is nonnegative and bounded by the last term in the integrand of $E(t)$. Letting $C$ denote the supremum of $\alpha^{-1/2}(t_2, \cdot)$ over $S^1$ we get
\[ \int_{\theta_1}^{\theta_2}\alpha^{-1/2}\,d\theta \le \int_{t_2}^{t}\int_{S^1}s\sqrt{\alpha}\,e^{2(\eta-U)}\rho\,d\theta\,ds + 2\pi C \le E(t_2)(t^2 - t_2^2)/2 + 2\pi C. \tag{75} \]
Hence, for any $\theta_1, \theta_2 \in S^1$ we have
\[ |U(t, \theta_2) - U(t, \theta_1)| \le C(t). \tag{76} \]
Next we estimate $\int_{S^1}U(t, \theta)\,d\theta$. Letting $C := \int_{S^1}U(t_2, \theta)\,d\theta$, we get by Hölder's inequality
\[
\begin{aligned}
\Big|\int_{S^1}U(t, \theta)\,d\theta\Big| &= \Big|\int_{t_2}^{t}\int_{S^1}U_t(s, \theta)\,d\theta\,ds + C\Big| \\
&\le \int_{t_2}^{t}\int_{S^1}|U_t(s, \theta)|\,d\theta\,ds + |C| \\
&\le \int_{t_2}^{t}\Big(\int_{S^1}\sqrt{\alpha}\,d\theta\Big)^{1/2}\Big(\int_{S^1}\alpha^{-1/2}U_t^2\,d\theta\Big)^{1/2}ds + |C|.
\end{aligned}
\tag{77}
\]
The right-hand side is easily seen to be bounded since (5) shows that $\sqrt{\alpha}$ is monotonically decreasing and (61) gives a bound for the second factor. Therefore
\[ \Big|\int_{S^1}U(t, \theta)\,d\theta\Big| \le C(t), \]
for some uniformly bounded function $C(t)$. To obtain a uniform bound on $U$ we combine these results. Let $U_+(t) := \max_{S^1}U(t, \cdot)$, and $U_-(t) := \min_{S^1}U(t, \cdot)$. We have
\[ 2\pi U_\pm(t) = \int_{S^1}U(t, \theta)\,d\theta + \int_{S^1}(U_\pm(t) - U(t, \theta))\,d\theta, \tag{78} \]
and the right-hand side is bounded from below and above, so $U$ is uniformly bounded on $[t_2, t_3) \times S^1$. These arguments apply to $A$ as well, since the factor $e^{4U}$ is controlled by the uniform bound on $U$.

Remark. In the case studied in [BCIM], i.e. vacuum and $T^2$ symmetry, a bound on $\ln\alpha$, and thus on $\eta$, is directly available. On the other hand, the method used here to bound $U$ and $A$ does not directly apply, which would lead to a difficulty in generalizing the result in [BCIM] to matter spacetimes. However, one can in that case show that
\[ \int_{S^1}\Big[\alpha^{-1/2}U_t^2 + \sqrt{\alpha}\,U_\theta^2 + \frac{e^{4U}}{4t^2}\big(\alpha^{-1/2}A_t^2 + \sqrt{\alpha}\,A_\theta^2\big) + \sqrt{\alpha}\,e^{2(\eta-U)}\Big(\rho + \frac{K^2}{4t^4}\Big)\Big]d\theta \tag{79} \]
is monotonically decreasing. Here $K$ is the twist constant in [BCIM]. This is sufficient for obtaining bounds on $U$ and $A$ also in the more general case of $T^2$ symmetry by straightforwardly applying the arguments above.
Step 2 (Bounds on $U_t$, $U_\theta$, $A_t$, $A_\theta$, $\eta_t$, $\alpha_t$ and $Q(t)$). To bound the derivatives of $U$ we use light-cone estimates in a similar way as for the contracting direction. However, the matter terms must be treated differently and we need to carry out a careful analysis of the characteristic system associated with the Vlasov equation. Let us define
\[ G = \frac{1}{2}(U_t^2 + \alpha U_\theta^2) + \frac{e^{4U}}{8t^2}(A_t^2 + \alpha A_\theta^2), \tag{80} \]
\[ H = \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}A_tA_\theta, \tag{81} \]
and
\[ \chi = \frac{1}{\sqrt{2}}(\partial_t + \sqrt{\alpha}\,\partial_\theta), \tag{82} \]
\[ \zeta = \frac{1}{\sqrt{2}}(\partial_t - \sqrt{\alpha}\,\partial_\theta). \tag{83} \]
A motivation for the introduction of these quantities is based on similar arguments as those given in Step 2, Sect. 4. For details we refer to [BCIM].

Remark. We use the same notations, $G$ and $H$, as in the contracting direction, and below we continue to carry over the notations. The analysis in the respective direction is independent so there should be no risk of confusion.

By using the evolution equation (7), a short computation shows that
\[
\begin{aligned}
\zeta(G + H) ={}& \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G + H) - \frac{1}{\sqrt{2}\,t}\Big[U_t^2 + \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 + \sqrt{\alpha}\,A_tA_\theta\big)\Big] \\
&+ (U_t + \sqrt{\alpha}\,U_\theta)\frac{\alpha}{2\sqrt{2}}e^{2(\eta-U)}\kappa + \frac{\alpha e^{2\eta}}{2\sqrt{2}\,t}(A_t + \sqrt{\alpha}\,A_\theta)S_{23},
\end{aligned}
\tag{84}
\]
\[
\begin{aligned}
\chi(G - H) ={}& \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G - H) - \frac{1}{\sqrt{2}\,t}\Big[U_t^2 - \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 - \sqrt{\alpha}\,A_tA_\theta\big)\Big] \\
&+ (U_t - \sqrt{\alpha}\,U_\theta)\frac{\alpha}{2\sqrt{2}}e^{2(\eta-U)}\kappa + \frac{\alpha e^{2\eta}}{2\sqrt{2}\,t}(A_t - \sqrt{\alpha}\,A_\theta)S_{23}.
\end{aligned}
\tag{85}
\]
Here $\kappa = \rho - P_1 + P_2 - P_3$. Now we wish to integrate these equations along the integral curves of the vector fields $\chi$ and $\zeta$ respectively (let us henceforth call these integral curves null curves, since they are null with respect to the two-dimensional "base spacetime"). Below we show that the quantity
\[ \Gamma(t) := \sup_{\theta \in S^1}G(t, \cdot) + Q^2(t) \tag{86} \]
is uniformly bounded on $[t_2, t_3)$ by deriving the inequality
\[ \Gamma(t) \le C + \int_{t_2}^{t}\Gamma(s)\ln\Gamma(s)\,ds. \tag{87} \]
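An inequality of the form (87) excludes blow-up in finite time even though its right-hand side is superlinear. By a Bihari-type comparison argument (a standard fact that is assumed here, not taken from the text), a function with $\Gamma > 1$ satisfying (87) with $C > 1$ is dominated by the solution of $y' = y\ln y$, $y(t_2) = C$, namely $y(t) = \exp(\ln C\cdot e^{t - t_2})$. The sketch below compares this closed form with a direct numerical integration; the function names and sample values are illustrative assumptions.

```python
import math

def bihari_bound(c, t2, t):
    """Closed-form solution of y' = y*ln(y) with y(t2) = c > 1."""
    return math.exp(math.log(c) * math.exp(t - t2))

def rk4_solution(c, t2, t, steps=10000):
    """Integrate y' = y*ln(y) from t2 to t with the classical RK4 scheme."""
    f = lambda y: y * math.log(y)
    h = (t - t2) / steps
    y = c
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

# Hypothetical data: C = 2 at t2 = 1, evaluated on a finite interval.
for t in (1.5, 2.0, 3.0):
    print(t, bihari_bound(2.0, 1.0, t), rk4_solution(2.0, 1.0, t))
```

The bound grows doubly exponentially but stays finite on every finite interval $[t_2, t_3)$, which is all that the continuation argument requires.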
We begin with two observations. Let $\gamma$ and $X$ be a geodesic and a Killing vector field, respectively, in any spacetime. Then $g(\gamma', X)$ is conserved along the geodesic. Here $\gamma'$ is the tangent vector to $\gamma$. In our case we have the two Killing vector fields $\partial_x$ and $\partial_y$. The particles follow the geodesics of spacetime with tangent $p^\mu$, so $g_{\mu\nu}p^\mu(\partial_x)^\nu$ and $g_{\mu\nu}p^\mu(\partial_y)^\nu$ are conserved. Expressing $p^\mu$ in terms of $v^\mu$ (see (14)) we find that
\[ V^2(t)e^{U(t, \Theta(t))} \quad\text{and}\quad V^2(t)Ae^{U(t, \Theta(t))} + V^3(t)\,t\,e^{-U(t, \Theta(t))} \]
are conserved. Here $V^2(t)$, $V^3(t)$ and $\Theta(t)$ are solutions to the characteristic system associated to the Vlasov equation. From Step 1 we have that $U$ and $A$ are uniformly bounded on $[t_2, t_3)$. Hence $|V^2(t)|$ and $|V^3(t)|$ are both uniformly bounded on $[t_2, t_3)$, and since the initial distribution function $f_0$ has compact support we conclude that
\[ \sup\{|v^2| + |v^3| : \exists(s, \theta) \in [t_2, t] \times S^1 \text{ with } f(s, \theta, v) \ne 0\} \tag{88} \]
is uniformly bounded on $[t_2, t_3)$. Therefore, in order to control $Q(t)$ it is sufficient to control
\[ Q^1(t) := \sup\{|v^1| : \exists(s, \theta) \in [t_2, t] \times S^1 \text{ such that } f(s, \theta, v) \ne 0\}. \tag{89} \]
Below we introduce the uniformly bounded function $\gamma(t)$ to denote estimates regarding the variables $v^2$ and $v^3$. Next we observe that there is some cancellation to take advantage of in the matter term $(\rho - P_1)$ which appears in the equations for $G + H$ and $G - H$ above. This term can be estimated as follows:
\[
\begin{aligned}
0 \le (\rho - P_1)(t, \theta) &= \int_{\mathbb{R}^3}\Big(v^0 - \frac{(v^1)^2}{v^0}\Big)f(t, \theta, v)\,dv \\
&= \int_{\mathbb{R}^3}\frac{1 + (v^2)^2 + (v^3)^2}{v^0}f(t, \theta, v)\,dv \\
&\le \int_{\mathbb{R}^3}[1 + (v^2)^2 + (v^3)^2]\,|f|\,\frac{dv}{\sqrt{1 + (v^1)^2}} \\
&\le \|f_0\|_\infty\gamma(t)\int_{|v^1| \le Q^1(t)}\frac{dv^1}{\sqrt{1 + (v^1)^2}} \\
&\le C\gamma(t)\ln Q^1(t).
\end{aligned}
\tag{90}
\]
In a similar fashion we can estimate $P_2$, $P_3$ and $S_{23}$. Indeed, for $k = 2, 3$, we have
\[
\begin{aligned}
0 \le P_k(t, \theta) &= \int_{\mathbb{R}^3}\frac{(v^k)^2}{v^0}f(t, \theta, v)\,dv \\
&\le \|f_0\|_\infty\gamma(t)\int_{|v^1| \le Q^1(t)}\frac{dv^1}{\sqrt{1 + (v^1)^2}} \\
&\le C\gamma(t)\ln Q^1(t).
\end{aligned}
\tag{91}
\]
The argument is almost identical for $S_{23}$.
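The logarithmic growth used in the last step of (90) and (91) comes from the explicit value $\int_{|v^1| \le Q}(1 + (v^1)^2)^{-1/2}\,dv^1 = 2\,\mathrm{arsinh}\,Q$. The sketch below cross-checks this value against a midpoint rule and against the bound $5\ln Q$ for $Q \ge 2$; the constant 5 and the function names are choices made here for illustration, not constants from the paper.

```python
import math

def momentum_integral(q):
    """Exact value of the integral of (1 + v^2)^(-1/2) over |v| <= q."""
    return 2.0 * math.asinh(q)

def midpoint_rule(q, n=200000):
    """Midpoint-rule approximation of the same integral, as a cross-check."""
    h = 2.0 * q / n
    return sum(h / math.sqrt(1.0 + (-q + (i + 0.5) * h) ** 2) for i in range(n))

# Compare the exact value with an illustrative logarithmic bound for q >= 2.
for q in (2.0, 10.0, 1.0e3, 1.0e6):
    print(q, momentum_integral(q), 5.0 * math.log(q))
```

Since $\mathrm{arsinh}\,Q = \ln(Q + \sqrt{Q^2 + 1})$, the restriction to $Q \ge 2$ only serves to absorb an additive constant into the factor in front of $\ln Q$.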
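Remark. Since we are interested in large momenta, we have here assumed that $Q^1(t) \ge 2$ to avoid the introduction of some immaterial constants in the estimates.

Let us now derive (87). As in Step 2 in Sect. 4 we integrate the equations above for $G + H$ and $G - H$ along null paths. For $t \ge t_2$, let
\[ A(t, \theta) = \int_{t_2}^{t}\sqrt{\alpha}(s, \theta)\,ds, \]
and integrate along the two null paths defined by $\chi$ and $\zeta$, starting at $(t_2, \theta)$, and add the results. We get for $t \in [t_2, t_3)$,
\[
\begin{aligned}
G(t, \theta) ={}& \frac{1}{2}[G + H](t_2, \theta - (A(t) - t_2)) + \frac{1}{2}[G - H](t_2, \theta + (A(t) - t_2)) \\
&+ \frac{1}{2}\int_{t_2}^{t}\big[K_1(s, \theta - (A(s) - t_2)) + K_2(s, \theta + (A(s) - t_2))\big]\,ds \\
&+ \frac{1}{2}\int_{t_2}^{t}\big[L_1(s, \theta - (A(s) - t_2)) + L_2(s, \theta + (A(s) - t_2))\big]\,ds \\
&+ \frac{1}{2}\int_{t_2}^{t}\big([\chi U\,M](s, \theta - (A(s) - t_2)) + [\zeta U\,M](s, \theta + (A(s) - t_2))\big)\,ds \\
&+ \frac{1}{2}\int_{t_2}^{t}\Big(\Big[\frac{\chi A}{2t}\tilde M\Big](s, \theta - (A(s) - t_2)) + \Big[\frac{\zeta A}{2t}\tilde M\Big](s, \theta + (A(s) - t_2))\Big)\,ds,
\end{aligned}
\tag{92}
\]
where
\[ K_1 = \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G + H), \qquad K_2 = \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G - H), \tag{93} \]
\[ L_1 = -\frac{1}{\sqrt{2}\,t}\Big[U_t^2 + \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 + \sqrt{\alpha}\,A_tA_\theta\big)\Big], \tag{94} \]
\[ L_2 = -\frac{1}{\sqrt{2}\,t}\Big[U_t^2 - \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 - \sqrt{\alpha}\,A_tA_\theta\big)\Big], \tag{95} \]
\[ M = \frac{1}{2}e^{2(\tilde\eta-U)}\kappa, \qquad \tilde M = e^{2\tilde\eta}S_{23}. \tag{96} \]
Note that in the expressions for $M$ and $\tilde M$ we used $\alpha e^{2\eta} = e^{2\tilde\eta}$. It is easy to see that both $G + H$ and $G - H$ can be written as sums of two squares. From the constraint equation (5) we find that $\alpha_t/\alpha \le 0$, so that $K_1$ and $K_2$ are nonpositive. Using the elementary inequality $2ab \le a^2 + b^2$ and the fact that $|\tilde\eta|$ and $|U|$ are uniformly bounded we obtain from (92) the inequality
\[
\begin{aligned}
\sup_\theta G(t, \cdot) \le{}& \sup_\theta G(t_2, \cdot) + \sup_\theta H(t_2, \cdot) + C\int_{t_2}^{t}\frac{1}{s}\sup_\theta G(s, \cdot)\,ds \\
&+ \int_{t_2}^{t}C(s)\sup_\theta\Big[\sqrt{G(s, \cdot)}\,\big((\rho - P_1 + P_2 - P_3) + S_{23}\big)\Big]\,ds \\
\le{}& C + C(t)\int_{t_2}^{t}\Big[\sup_\theta G(s, \cdot) + \sup_\theta\sqrt{G(s, \cdot)}\,\ln Q^1(s)\Big]\,ds,
\end{aligned}
\tag{97}
\]
where (90) and (91) were used in the last inequality.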
Remark. The sign of $K_1$ and $K_2$ simplified the estimate above. This is not crucial since $|\alpha_t|/\alpha$ is bounded by $\ln Q^1(t)$, which is sufficient for obtaining a bound on $\Gamma(t)$.

Let us now derive an estimate for $Q^1$ in terms of $\sup_\theta G$.

Lemma 2. Let $Q^1(t)$ and $G(t, \theta)$ be as above. Then
\[ |Q^1(t)|^2 \le C + D(t)\int_{t_2}^{t}\Big[(Q^1(s))^2 + \sup_\theta G(s, \cdot)\Big]\,ds, \tag{98} \]
where $C$ is a constant and $D(t)$ is a uniformly bounded function on $[t_2, t_3)$.

Proof. The characteristic equation for $V^1$ associated to the Vlasov equation reads
\[
\begin{aligned}
\frac{dV^1(s)}{ds} ={}& -\Big(\eta_\theta - U_\theta + \frac{\alpha_\theta}{2\alpha}\Big)\sqrt{\alpha}\,V^0 - (\eta_t - U_t)V^1 \\
&- \frac{\sqrt{\alpha}\,U_\theta}{V^0}\big((V^3)^2 - (V^2)^2\big) + \frac{\sqrt{\alpha}\,A_\theta}{sV^0}e^{2U}V^2V^3.
\end{aligned}
\tag{99}
\]
We will now split the right-hand side into three terms to be analyzed separately. Expressing $\eta_\theta$ and $\eta_t$ by using the constraint equations (3) and (4) we obtain
\[ \frac{d}{ds}(V^1(s))^2 = 2V^1(s)\frac{d}{ds}V^1(s) = T_1 + T_2 + T_3, \tag{100} \]
where
\[
\begin{aligned}
T_1 ={}& -2V^1(s)\big[s\alpha e^{2(\eta-U)}(JV^0 + \rho V^1)\big], \\
T_2 ={}& -2V^1(s)\Big[s\Big(U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4s^2}(A_t^2 + \alpha A_\theta^2)\Big)V^1 \\
&\qquad\qquad + 2s\sqrt{\alpha}\,U_\theta U_tV^0 - \sqrt{\alpha}\,U_\theta V^0 - U_tV^1 + \frac{e^{4U}}{2s}\sqrt{\alpha}\,A_tA_\theta V^0\Big], \\
T_3 ={}& -2V^1(s)\Big[\frac{\sqrt{\alpha}\,U_\theta}{V^0}\big((V^3)^2 - (V^2)^2\big) - \frac{\sqrt{\alpha}\,A_\theta}{sV^0}e^{2U}V^2V^3\Big].
\end{aligned}
\]
Let us first estimate $T_1$. We split it into two terms
\[ T_1 = T_1^- + T_1^+ = -2sV^1(s)e^{2(\tilde\eta-U)}(I^- + I^+), \tag{101} \]
where
\[ I^- = \int_{\mathbb{R}^2}\int_{-\infty}^{0}(v^1V^0 + v^0V^1)f(s, \theta, v)\,dv^1dv^2dv^3, \]
\[ I^+ = \int_{\mathbb{R}^2}\int_{0}^{\infty}(v^1V^0 + v^0V^1)f(s, \theta, v)\,dv^1dv^2dv^3. \]
Let us now consider the two cases $V^1(s) > 0$ and $V^1(s) < 0$. On a time interval where $V^1(s) > 0$, $I^+$ is nonnegative and $T_1^+$ can therefore be discarded since it is nonpositive. The kernel in $I^-$ can be estimated as follows:
\[
\begin{aligned}
v^1V^0 + v^0V^1 &= \frac{(v^1)^2(V^0)^2 - (v^0)^2(V^1)^2}{v^1V^0 - v^0V^1} \\
&= \frac{(v^1)^2(1 + (V^2)^2 + (V^3)^2)}{v^1V^0 - v^0V^1} + \frac{(V^1)^2(1 + (v^2)^2 + (v^3)^2)}{v^0V^1 - v^1V^0}.
\end{aligned}
\]
Of course, the cancellation of the terms $(v^1)^2(V^1)^2$ is essential in this computation. The second term is positive since $V^1(s) > 0$ and $v^1 < 0$, and contributes negatively to $T_1^-$ and can be discarded. The first term is negative and the modulus can be estimated by
\[ \frac{(v^1)^2(1 + (V^2)^2 + (V^3)^2)}{|v^1|V^0 + v^0V^1} \le \frac{|v^1|(1 + (V^2)^2 + (V^3)^2)}{V^1}. \tag{102} \]
In the expression for $T_1$ we first note that $2s\alpha e^{2(\eta-U)} = 2se^{2(\tilde\eta-U)} \le C(s)$. Hence, on the time interval where $V^1(s) > 0$ we can estimate $T_1$ by
\[
\begin{aligned}
T_1 \le T_1^- &\le \|f_0\|_\infty C(s)V^1(s)\int_{\mathbb{R}^2}\int_0^{Q^1}\frac{v^1(1 + (V^2)^2 + (V^3)^2)}{V^1(s)}\,dv^1dv^2dv^3 \\
&\le \|f_0\|_\infty C(s)\gamma(s)\int_0^{Q^1}v^1\,dv^1 \le C(s)(Q^1(s))^2.
\end{aligned}
\tag{103}
\]
On a time interval where $V^1 < 0$ we see that $T_1^-$ is nonpositive and can be discarded. We can then estimate $T_1^+$ by using almost identical arguments as for $T_1^-$ and we get also on such a time interval,
\[ T_1 \le T_1^+ \le C(s)(Q^1(s))^2. \tag{104} \]
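The decomposition of the kernel $v^1V^0 + v^0V^1$ used for $T_1^-$, together with the sign claims and the bound (102), can be spot-checked numerically. The sketch below assumes unit-mass momenta with $v^0 = \sqrt{1 + |v|^2}$ and samples the case $V^1 > 0$, $v^1 < 0$; the function names and sampling ranges are hypothetical.

```python
import math
import random

def energy(v):
    """v^0 = sqrt(1 + |v|^2) for a momentum triple v = (v1, v2, v3)."""
    return math.sqrt(1.0 + v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

def kernel_terms(v, V):
    """Return the kernel v1*V0 + v0*V1 and the two terms of its decomposition."""
    v0, V0 = energy(v), energy(V)
    kernel = v[0] * V0 + v0 * V[0]
    denom = v[0] * V0 - v0 * V[0]
    first = v[0] ** 2 * (1.0 + V[1] ** 2 + V[2] ** 2) / denom
    second = V[0] ** 2 * (1.0 + v[1] ** 2 + v[2] ** 2) / (-denom)
    return kernel, first, second

def sample(rng):
    """Draw momenta in the case treated in the text: v1 < 0 and V1 > 0."""
    v = (rng.uniform(-5.0, -0.1), rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
    V = (rng.uniform(0.1, 5.0), rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
    return v, V

rng = random.Random(1)
v, V = sample(rng)
kernel, first, second = kernel_terms(v, V)
bound = abs(v[0]) * (1.0 + V[1] ** 2 + V[2] ** 2) / V[0]
print(abs(kernel - (first + second)) < 1e-9, first <= 0.0 <= second, -first <= bound)
```

The identity relies on $(V^0)^2 - (V^1)^2 = 1 + (V^2)^2 + (V^3)^2$, which is where the terms $(v^1)^2(V^1)^2$ cancel.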
Let us now consider $T_2$. We again study the cases $V^1(s) > 0$ and $V^1(s) < 0$. Assume first that $V^1(s) > 0$ on some time interval. The expression for $T_2$ can be written $T_2 = T_2^p + T_2^r$ ($p$ = principal, $r$ = rest), where
\[ T_2^p = -2(V^1(s))^2\Big[s(U_t + \sqrt{\alpha}\,U_\theta)^2 - (U_t + \sqrt{\alpha}\,U_\theta) + \frac{e^{4U}}{4s}(A_t + \sqrt{\alpha}\,A_\theta)^2\Big] \]
and
\[ T_2^r = 2(V^0(s) - V^1(s))V^1(s)\Big[\sqrt{\alpha}\,U_\theta - 2s\sqrt{\alpha}\,U_tU_\theta - \frac{e^{4U}}{2s}\sqrt{\alpha}\,A_tA_\theta\Big]. \]
For $T_2^r$ we have
\[ |T_2^r| = \frac{2(1 + (V^2)^2 + (V^3)^2)V^1(s)}{V^0 + V^1}\sqrt{\alpha}\,\Big|U_\theta - 2sU_tU_\theta - \frac{e^{4U}}{2s}A_tA_\theta\Big| \le (s + 1)\gamma(s)\sup_\theta G(s, \cdot). \tag{105} \]
Since we are interested in large $G$, we have here assumed that $\sqrt{G} \le G$. This assumption will be used below without comment. To estimate $T_2^p$ we observe that for $s \ge t_2$,
\[ sa^2 - a \ge \frac{-1}{4s} \ge \frac{-1}{4t_2}, \quad\text{for any } a \in \mathbb{R}. \]
The term involving $A$ contributes negatively and can be discarded, thus
\[ T_2^p \le \frac{1}{2t_2}(V^1(s))^2 \le C(Q^1(s))^2. \tag{106} \]
On a time interval where $V^1(s) < 0$, the same estimates hold. Indeed, we only have to write $T_2 = T_2^p + T_2^r$ in the form
\[ T_2^p = -2(V^1(s))^2\Big[s(U_t - \sqrt{\alpha}\,U_\theta)^2 - (U_t - \sqrt{\alpha}\,U_\theta) + \frac{e^{4U}}{4s}(A_t - \sqrt{\alpha}\,A_\theta)^2\Big] \]
and
\[ T_2^r = 2(V^0(s) + V^1(s))V^1(s)\Big[\sqrt{\alpha}\,U_\theta - 2s\sqrt{\alpha}\,U_tU_\theta - \frac{e^{4U}}{2s}\sqrt{\alpha}\,A_tA_\theta\Big], \]
and the same arguments apply. Therefore we have obtained
\[ T_2 \le T_2^p + |T_2^r| \le C(Q^1(s))^2 + C(s)\sup_\theta G(s, \cdot). \tag{107} \]
Finally we estimate $T_3$. It follows immediately that
\[ |T_3| \le \gamma(s)\frac{|V^1(s)|}{V^0}\sqrt{\alpha}\,\Big|U_\theta + \frac{e^{2U}}{s}A_\theta\Big| \le C(s)\sup_\theta G(s, \cdot). \tag{108} \]
The lemma now follows by adding the estimates for $T_k$, $k = 1, 2, 3$. □

Combining the estimate for $(Q^1(t))^2$ in the lemma and the estimate (97) for $\sup_\theta G(t, \cdot)$, we find that $\Gamma(t)$ satisfies the estimate (87) and is thus uniformly bounded. The constraint equation (3) now immediately shows that $|\eta_t|$ is bounded by
\[ 2tG + te^{2(\tilde\eta-U)}\rho \le C(t)\Big[\sup_\theta G(t, \cdot) + (Q(t))^3\Big], \]
since
\[ \rho = \int_{\mathbb{R}^3}f\,dv \le \|f_0\|_\infty\int_{|v| \le Q(t)}dv \le C(Q(t))^3. \]
Analogous arguments show that $|\alpha_t|$ is uniformly bounded. The uniform bound on $G$ provides bounds on $|U_t|$ and $|A_t|$, but to conclude that $|U_\theta|$ and $|A_\theta|$ are bounded we have to show that $\alpha$ stays uniformly bounded away from zero. Equation (5) is easily solved,
\[ \alpha(t, \theta) = \alpha(t_2, \theta)\,e^{\int_{t_2}^{t}F(s, \theta)\,ds}, \tag{109} \]
where
\[ F(t, \theta) := -2te^{2(\tilde\eta-U)}(\rho - P_1), \]
which is uniformly bounded from below. Hence $|U_\theta|$ and $|A_\theta|$ are bounded and Step 2 is complete.
Step 3 (Bounds on $\partial f$, $\alpha_\theta$ and $\eta_\theta$). The main goal in this step is to show that the first derivatives of the distribution function are bounded. In view of the bound on $Q(t)$ we then also obtain bounds on the first derivatives of the matter terms $\rho$, $J$, $S_{23}$ and $P_k$, $k = 1, 2, 3$. Such bounds almost immediately lead to bounds on $\alpha_\theta$ and $\eta_\theta$. Recall that the solution $f$ can be written in the form
\[ f(t, \theta, v) = f_0(\Theta(0, t, \theta, v), V(0, t, \theta, v)), \tag{110} \]
where $\Theta(s, t, \theta, v)$, $V(s, t, \theta, v)$ is the solution to the characteristic system
\[ \frac{d\Theta}{ds} = \sqrt{\alpha}\,\frac{V^1}{V^0}, \tag{111} \]
\[
\begin{aligned}
\frac{dV^1}{ds} ={}& -\Big(\eta_\theta - U_\theta + \frac{\alpha_\theta}{2\alpha}\Big)\sqrt{\alpha}\,V^0 - (\eta_t - U_t)V^1 \\
&- \sqrt{\alpha}\,U_\theta\frac{(V^3)^2 - (V^2)^2}{V^0} + \sqrt{\alpha}\,A_\theta\frac{e^{2U}}{s}\frac{V^2V^3}{V^0},
\end{aligned}
\tag{112}
\]
\[ \frac{dV^2}{ds} = -U_tV^2 - \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}, \tag{113} \]
\[ \frac{dV^3}{ds} = -\Big(\frac{1}{s} - U_t\Big)V^3 + \sqrt{\alpha}\,U_\theta\frac{V^1V^3}{V^0} - \frac{e^{2U}}{s}\Big(A_t + \sqrt{\alpha}\,A_\theta\frac{V^1}{V^0}\Big)V^2, \tag{114} \]
with the property $\Theta(t, t, \theta, v) = \theta$, $V(t, t, \theta, v) = v$. Hence, in order to establish bounds on the first derivatives of $f$ it is sufficient to bound $\partial\Theta$ and $\partial V$ since $f_0$ is smooth. Here $\partial$ denotes the first order derivative with respect to $t$, $\theta$ or $v$. Evolution equations for $\partial\Theta$ and $\partial V$ are provided by the characteristic system above. However, the right-hand sides will contain second order derivatives of the field components, but so far we have only obtained bounds on the first order derivatives (except for $\eta_\theta$, $\alpha_\theta$). Yet, certain combinations of second order derivatives can be controlled. Behind this observation lies a geometrical idea which plays a fundamental role in general relativity. An important property of curvature is its control over the relative behaviour of nearby geodesics. Let $\gamma(u, \lambda)$ be a two-parameter family of geodesics, i.e. for each fixed $\lambda$, the curve $u \mapsto \gamma(u, \lambda)$ is a geodesic. Define the variation vector field $Y := \gamma_\lambda(u, 0)$. This vector field satisfies the geodesic deviation equation (or Jacobi equation) (see e.g. [HE])
\[ \frac{D^2Y}{Du^2} = R_{Y\gamma'}\gamma', \tag{115} \]
where $D/Du$ is the covariant derivative, $R$ the Riemann curvature tensor, and $\gamma' := \gamma_u(u, 0)$. Now, the Einstein tensor is closely related to the curvature tensor, and since the Einstein tensor is proportional to the energy momentum tensor, which we can control from Step 2, it is meaningful, in view of (115) (with $Y = \partial\Theta$), to look for linear combinations of $\partial\Theta$ and $\partial V$ which satisfy an equation with bounded coefficients. More precisely, we want to use the Einstein equations to substitute for the twice differentiated field components which appear when the characteristic system is differentiated. The geodesic deviation equation has previously played an important role in studies of the Einstein–Vlasov system ([RR, Rn] and [Rl3]).
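Lemma 3. Let $\Theta(s) = \Theta(s, t, \theta, v)$ and $V^k(s) = V^k(s, t, \theta, v)$, $k = 1, 2, 3$, be a solution to the characteristic system (111)–(114). Let $\partial$ denote $\partial_t$, $\partial_\theta$ or $\partial_v$, and define
\[ \Psi = \alpha^{-1/2}\partial\Theta, \tag{116} \]
\[
\begin{aligned}
Z^1 = \partial V^1 + \Big(&\frac{\eta_tV^0}{\sqrt{\alpha}} - \frac{U_tV^0}{\sqrt{\alpha}}\,\frac{(V^0)^2 - (V^1)^2 + (V^2)^2 - (V^3)^2}{(V^0)^2 - (V^1)^2} + U_\theta\frac{V^1((V^2)^2 - (V^3)^2)}{(V^0)^2 - (V^1)^2} \\
&- \frac{A_te^{2U}}{\sqrt{\alpha}\,t}\,\frac{V^0V^2V^3}{(V^0)^2 - (V^1)^2} + A_\theta\frac{V^1V^2V^3}{(V^0)^2 - (V^1)^2}\Big)\partial\Theta,
\end{aligned}
\tag{117}
\]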
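\[ Z^2 = \partial V^2 + V^2U_\theta\,\partial\Theta, \tag{118} \]
\[ Z^3 = \partial V^3 - \Big(V^3U_\theta - \frac{e^{2U}}{s}V^2A_\theta\Big)\partial\Theta. \tag{119} \]
Then there is a matrix $A = \{a_{lm}\}$, $l, m = 0, 1, 2, 3$, such that $\Omega := (\Psi, Z^1, Z^2, Z^3)^T$ satisfies
\[ \frac{d\Omega}{ds} = A\Omega, \tag{120} \]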
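and the matrix elements $a_{lm} = a_{lm}(s, \Theta(s), V^k(s))$ are all uniformly bounded on $[t_2, t_3)$.

Sketch of proof. Once the ansatz (116)–(119) has been found this is only a lengthy calculation. To illustrate the type of calculations involved we show the easiest case, i.e. the $Z^2$ term:
\[
\begin{aligned}
\frac{dZ^2}{ds} &= \frac{d}{ds}(\partial V^2 + V^2U_\theta\,\partial\Theta) \\
&= \partial\Big(\frac{dV^2}{ds}\Big) + \frac{dV^2}{ds}U_\theta\,\partial\Theta + V^2\Big(U_{t\theta} + U_{\theta\theta}\frac{d\Theta}{ds}\Big)\partial\Theta + V^2U_\theta\,\partial\Big(\frac{d\Theta}{ds}\Big).
\end{aligned}
\tag{121}
\]
Now we use (111) and (113) to substitute for $d\Theta/ds$ and $dV^2/ds$. We find that the right-hand side equals
\[
\begin{aligned}
&\partial\Big(-U_tV^2 - \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}\Big) + \Big(-U_tV^2 - \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}\Big)U_\theta\,\partial\Theta \\
&\quad + V^2\Big(U_{t\theta} + U_{\theta\theta}\sqrt{\alpha}\,\frac{V^1}{V^0}\Big)\partial\Theta + V^2U_\theta\Big(\frac{\alpha_\theta V^1}{2\sqrt{\alpha}\,V^0}\,\partial\Theta + \sqrt{\alpha}\,\partial\Big(\frac{V^1}{V^0}\Big)\Big).
\end{aligned}
\]
Taking the $\partial$ derivative of the first term we find that all terms of second order derivatives and terms containing $\alpha_\theta$ cancel. Next, since
\[ -\sqrt{\alpha}\,U_\theta\,\partial\Big(\frac{V^1V^2}{V^0}\Big) + \sqrt{\alpha}\,U_\theta V^2\,\partial\Big(\frac{V^1}{V^0}\Big) = -\sqrt{\alpha}\,U_\theta\frac{V^1}{V^0}\,\partial V^2, \tag{122} \]
we are left with
\[ \frac{dZ^2}{ds} = -\Big(U_tV^2 + \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}\Big)U_\theta\,\partial\Theta - \Big(U_t + \sqrt{\alpha}\,U_\theta\frac{V^1}{V^0}\Big)\partial V^2. \tag{123} \]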
Finally we express this in terms of $\Psi$, $Z^1$, $Z^2$ and $Z^3$. Here this is easy and we immediately get
\[ \frac{dZ^2}{ds} = -\Big(U_t + \sqrt{\alpha}\,U_\theta\frac{V^1}{V^0}\Big)Z^2. \]
Clearly, the map $(\partial\Theta, \partial V^k) \mapsto (\Psi, Z^k)$ is invertible so that this step is easy also in the other cases. It follows that the matrix elements $a_{2m}$, $m = 0, 1, 2, 3$, are uniformly bounded on $[t_2, t_3)$ (only $a_{22}$ is nonzero here). The computations for the other terms are similar. For the $Z^1$ term we point out that the evolution equations (7) and (8) should be invoked and that the matrix element $a_{10}$ contains $\eta_\theta$ and $\alpha_\theta/2\alpha$, but they combine and form $\tilde\eta_\theta$. □

From the lemma it now immediately follows that $|\Omega|$ is uniformly bounded on $[t_2, t_3)$. Moreover, since the system (116)–(119) is invertible with uniformly bounded coefficients we also have uniform bounds on $|\partial\Theta|$ and $|\partial V^k|$, $k = 1, 2, 3$. In view of the discussion at the beginning of this section we see that the distribution function $f$ and the matter quantities $\rho$, $J$, $S_{23}$ and $P_k$ are all uniformly $C^1$ bounded. From the constraint equation (5) we now obtain a uniform bound on $\alpha_\theta$ by a simple Gronwall argument, using as usual the identity $\alpha e^{2(\eta-U)} = e^{2(\tilde\eta-U)}$. Finally this yields a uniform bound on $\eta_\theta$ since
\[ \eta_\theta = \tilde\eta_\theta - \frac{\alpha_\theta}{2\alpha} \]
and $\alpha$ stays uniformly bounded away from zero.

Step 4 (Bounds on second and higher order derivatives). It is now easy to obtain bounds on second order derivatives of $U$ and $A$ by using light cone arguments. We define $G$ and $H$ by
\[ G = \frac{1}{2}(U_{tt}^2 + \alpha U_{t\theta}^2) + \frac{e^{4U}}{8t^2}(A_{tt}^2 + \alpha A_{t\theta}^2), \tag{124} \]
\[ H = \sqrt{\alpha}\,U_{tt}U_{t\theta} + \frac{e^{4U}}{4t^2}A_{tt}A_{t\theta}, \tag{125} \]
and use the differentiated (with respect to $t$) evolution equations for $U$ and $A$ to obtain equations similar to (84) and (85). In this case a straightforward light cone argument applies since we have control of the differentiated matter terms. $U_{\theta\theta}$ and $A_{\theta\theta}$ are then uniformly bounded in view of the evolution equations (7) and (8). Bounds on second order derivatives of $f$ then follow from (120) by studying the equation for $\partial\Omega$. The only thing to notice is that $\tilde\eta_{\theta\theta}$ is controlled by (4). It is clear that this reasoning can be continued to give uniform bounds on $[t_2, t_3)$ for higher order derivatives as well. In view of the discussion after the statement of Theorem 1 in Sect. 3, this completes the proof of Theorem 1 in the expanding direction. □
Acknowledgement. I am most grateful to Alan Rendall for suggesting the problem (for small data) and for commenting on the manuscript. I also wish to thank Demetrios Christodoulou and Shadi Tahvildar-Zadeh at the Department of Mathematics at Princeton University, where this work was carried out, for interesting and stimulating discussions. This work was supported by the Swedish Foundation for International Cooperation in Research and Higher Education (STINT) and is hereby gratefully acknowledged.
References
[BCIM] Berger, B.K., Chruściel, P., Isenberg, J. and Moncrief, V.: Global foliations of vacuum spacetimes with T² isometry. Ann. Phys. 260, 117–148 (1997)
[BT] Binney, J. and Tremaine, S.: Galactic dynamics. Princeton, NJ: Princeton University Press, 1987
[CB] Choquet-Bruhat, Y.: Problème de Cauchy pour le système intégro différentiel d'Einstein–Liouville. Ann. Inst. Fourier 21, 181–201 (1971)
[CBG] Choquet-Bruhat, Y. and Geroch, R.: Global aspects of the Cauchy problem in general relativity. Commun. Math. Phys. 14, 344–357 (1969)
[Cu1] Christodoulou, D.: Examples of naked singularity formation in the gravitational collapse of a scalar field. Ann. Math. 140, 607–653 (1994)
[Cu2] Christodoulou, D.: Bounded variation solutions of the spherically symmetric Einstein-scalar field equations. Comm. Pure Appl. Math. 46, 1131–1220 (1993)
[Cu3] Christodoulou, D.: Self-gravitating relativistic fluids: The formation of a free phase boundary in the phase transition from soft to hard. Arch. Rational Mech. Anal. 134, 97–154 (1996)
[CK] Christodoulou, D. and Klainerman, S.: The global nonlinear stability of the Minkowski space. Princeton, NJ: Princeton University Press, 1993
[Cl] Chruściel, P.T.: On spacetimes with U(1) × U(1) symmetric compact Cauchy surfaces. Ann. Phys. 202, 100–150 (1990)
[CIM] Chruściel, P.T., Isenberg, J. and Moncrief, V.: Strong cosmic censorship in polarised Gowdy spacetimes. Class. Quantum Grav. 7, 1671–1680 (1990)
[EM] Eardley, D. and Moncrief, V.: The global existence problem and cosmic censorship in general relativity. Gen. Rel. Grav. 13, 887–892 (1981)
[ES] Eardley, D. and Smarr, L.: Time functions in numerical relativity: marginally bound dust collapse. Phys. Rev. D19, 2239–2259 (1979)
[E] Ehlers, J.: Survey of general relativity theory. In: W. Israel (ed.) Relativity, Astrophysics and Cosmology. Dordrecht: Reidel, 1973
[GS] Glassey, R. and Strauss, W.: Singularity formation in a collisionless plasma could only occur at high velocities. Arch. Rat. Mech. Anal. 92, 56–90 (1986)
[G] Gowdy, R.: Vacuum spacetimes and compact invariant hypersurfaces: Topologies and boundary conditions. Ann. Phys. 83, 203–24 (1974)
[HE] Hawking, S. and Ellis, G.: The large scale structure of spacetime. Cambridge: Cambridge University Press, 1973
[MPF] Mitrinović, D., Pečarić, J. and Fink, A.: Inequalities involving functions and their integrals and derivatives. Dordrecht: Kluwer Academic Publishers, 1991
[M] Moncrief, V.: Global properties of Gowdy spacetimes with T³ × R topology. Ann. Phys. 132, 87–107 (1981)
[Rn] Rein, G.: Cosmological solutions of the Vlasov–Einstein system with spherical, plane and hyperbolic symmetry. Math. Proc. Camb. Phil. Soc. 119, 739–762 (1996)
[RR] Rein, G. and Rendall, A.D.: Global existence of solutions of the spherically symmetric Vlasov–Einstein system with small initial data. Commun. Math. Phys. 150, 561–583 (1992); Erratum: Commun. Math. Phys. 176, 475–478 (1996)
[Rl1] Rendall, A.D.: Existence of constant mean curvature foliations in spacetimes with two-dimensional local symmetry. Commun. Math. Phys. 189, 145–164 (1997)
[Rl2] Rendall, A.D.: Crushing singularities in spacetimes with spherical, plane and hyperbolic symmetry. Class. Quantum Grav. 12, 1517–1533 (1995)
[Rl3] Rendall, A.D.: An introduction to the Einstein–Vlasov system. Mathematics of gravitation. Part I (Warsaw, 1996) Banach Center Publ. 41, Part I, Warsaw: Polish Acad. Sci., 1997, pp. 35–68
[Rl4] Rendall, A.D.: On the choice of matter model in general relativity. In: R. d'Inverno (ed.) Approaches to Numerical Relativity. Cambridge: Cambridge University Press, 1992
[Rl5] Rendall, A.D.: Cosmic censorship and the Vlasov equation. Class. Quantum Grav. 9, L99–L104 (1992)
[RRS] Rein, G., Rendall, A.D. and Schaeffer, J.: A regularity theorem for solutions of the spherically symmetric Vlasov–Einstein system. Commun. Math. Phys. 168, 467–478 (1995)

Communicated by H. Nicolai
Commun. Math. Phys. 206, 367 – 381 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Lie Groupoid $C^*$-Algebras and Weyl Quantization
N. P. Landsman⋆
Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands. E-mail: [email protected]
Received: 14 May 1998 / Accepted: 23 March 1999
⋆ Supported by a fellowship from the Royal Netherlands Academy of Arts and Sciences (KNAW).

Abstract: A strict quantization of a Poisson manifold $P$ on a subset $I \subseteq \mathbb{R}$ containing 0 as an accumulation point is defined as a continuous field of $C^*$-algebras $\{A_\hbar\}_{\hbar \in I}$, with $A_0 = C_0(P)$, a dense subalgebra $\tilde A_0$ of $C_0(P)$ on which the Poisson bracket is defined, and a set of continuous cross-sections $\{Q(f)\}_{f \in \tilde A_0}$ for which $Q_0(f) = f$. Here $Q_\hbar(f^*) = Q_\hbar(f)^*$ for all $\hbar \in I$, whereas for $\hbar \to 0$ one requires that $i[Q_\hbar(f), Q_\hbar(g)]/\hbar \to Q_\hbar(\{f, g\})$ in norm. For any Lie groupoid $G$, the vector bundle $\mathfrak{G}^*$ dual to the associated Lie algebroid $\mathfrak{G}$ is canonically a Poisson manifold. Let $A_0 = C_0(\mathfrak{G}^*)$, and for $\hbar \ne 0$ let $A_\hbar = C^*(G)$ be the $C^*$-algebra of $G$. The family of $C^*$-algebras $\{A_\hbar\}_{\hbar \in [0,1]}$ forms a continuous field, and we construct a dense subalgebra $\tilde A_0 \subset C_0(\mathfrak{G}^*)$ and an associated family $\{Q^W_\hbar(f)\}$ of continuous cross-sections of this field, generalizing Weyl quantization, which define a strict quantization of $\mathfrak{G}^*$. Many known strict quantizations are special cases of this procedure. On $P = T^*\mathbb{R}^n$ the maps $Q^W_\hbar(f)$ reduce to standard Weyl quantization; for $P = T^*Q$, where $Q$ is a Riemannian manifold, one recovers Connes' tangent groupoid as well as a recent generalization of Weyl's prescription. When $G$ is the gauge groupoid of a principal bundle one is led to the Weyl quantization of a particle moving in an external Yang–Mills field. In case that $G$ is a Lie group (with Lie algebra $\mathfrak{g}$) one recovers Rieffel's quantization of the Lie–Poisson structure on $\mathfrak{g}^*$. A transformation group $C^*$-algebra defined by a smooth action of a Lie group on a manifold $Q$ turns out to be the quantization of the Poisson manifold $\mathfrak{g}^* \times Q$ defined by this action.

1. Introduction

The notion of quantization to be used in this paper is motivated by the desire to link the geometric theory of classical mechanics and reduction [18,32] with the $C^*$-algebraic
formulation of quantum mechanics and induction [15], and also with non-commutative geometry [2]. Starting with Rieffel's fundamental paper [27], various $C^*$-algebraic definitions of quantization have been proposed [29,12,30,15,31]. Definition 2 below is closely related to these proposals, and is particularly useful in the context of the class of examples studied in this paper. These examples come from the theory of Lie groupoids and their Lie algebroids (cf. Sect. 2).

The idea that the $C^*$-algebra of a Lie groupoid is connected to the Poisson manifold defined by the associated Lie algebroid by (strict) quantization was conjectured in [12], and proved in special cases in [13,15]. The results of [28,29,23] also supported the claim. In this paper we prove the conjecture up to Dirac's condition (3); this is the content of Theorems 1 and 2. Following up on our work, Dirac's condition has finally been proved by Ramazan [25]. This leads to the Corollary at the end of Sect. 5, which is the main result of the paper.

Further to the examples considered in Sect. 6, it would be interesting to apply the point of view in this paper to the holonomy groupoid of a foliation [2], and to the Lie groupoid defined by a manifold with boundary [23,19]. Moreover, the approach to index theory via the tangent groupoid [2] and its recent generalization to arbitrary Lie groupoids [20] may now be seen from the perspective of "strict" quantization theory. This may be helpful also in understanding the connection between various other approaches to index theory which use (formal deformation) quantization [8,7].

The central notion in $C^*$-algebraic quantization theory is that of a continuous field of $C^*$-algebras [5]. For our purposes the following reformulation is useful [10].

Definition 1.
A continuous field of $C^*$-algebras $(C, \{A_x, \varphi_x\}_{x\in X})$ over a locally compact Hausdorff space $X$ consists of a $C^*$-algebra $C$, a collection of $C^*$-algebras $\{A_x\}_{x\in X}$, and a set $\{\varphi_x : C \to A_x\}_{x\in X}$ of surjective $*$-homomorphisms, such that for all $A \in C$,
1. the function $x \mapsto \|\varphi_x(A)\|$ is in $C_0(X)$;
2. one has $\|A\| = \sup_{x\in X} \|\varphi_x(A)\|$;
3. there is an element $fA \in C$ for any $f \in C_0(X)$ for which $\varphi_x(fA) = f(x)\varphi_x(A)$ for all $x \in X$.

The continuous cross-sections of the field in the sense of [5] consist of those elements $\{A_x\}_{x\in X}$ of $\prod_{x\in X} A_x$ for which there is a (necessarily unique) $A \in C$ such that $A_x = \varphi_x(A)$ for all $x \in X$. We refer to [18,32] for the theory of Poisson manifolds and Poisson algebras; the latter is the classical analogue of the self-adjoint part of a $C^*$-algebra [15].

Definition 2. Let $I \subseteq \mathbb{R}$ contain 0 as an accumulation point. A strict quantization of a Poisson manifold $P$ on $I$ consists of
1. a continuous field of $C^*$-algebras $(C, \{A_\hbar, \varphi_\hbar\}_{\hbar\in I})$, with $A_0 = C_0(P)$;
2. a dense subspace $\tilde{A}_0 \subset C_0(P)$ on which the Poisson bracket is defined, and which is closed under pointwise multiplication and taking Poisson brackets (in other words, $\tilde{A}_0$ is a Poisson algebra);
3. a linear map $Q : \tilde{A}_0 \to C$ which (with $Q_\hbar(f) \equiv \varphi_\hbar(Q(f))$) for all $f \in \tilde{A}_0$ and $\hbar \in I$ satisfies
\[ Q_0(f) = f, \tag{1} \]
\[ Q_\hbar(f^*) = Q_\hbar(f)^*, \tag{2} \]
Lie Groupoid C ∗ -Algebras and Weyl Quantization
and for all $f, g \in \tilde{A}_0$ satisfies Dirac's condition
\[ \lim_{\hbar\to 0} \left\| \frac{i}{\hbar}[Q_\hbar(f), Q_\hbar(g)] - Q_\hbar(\{f,g\}) \right\| = 0. \tag{3} \]
Elements of $I$ are interpreted as possible values of Planck's constant $\hbar$, and $A_\hbar$ is the quantum algebra of observables of the theory at the given value of $\hbar \neq 0$. For real-valued $f$, the operator $Q_\hbar(f)$ is the quantum observable associated to the classical observable $f$. This interpretation is possible because of condition (2) in Definition 2. In view of the comment after Definition 1, for fixed $f \in \tilde{A}_0$ each family $\{Q_\hbar(f)\}_{\hbar\in I}$ is a continuous cross-section of the continuous field in question. In view of (1) this implies, in particular, that
\[ \lim_{\hbar\to 0} \| Q_\hbar(f) Q_\hbar(g) - Q_\hbar(fg) \| = 0. \tag{4} \]
This shows that strict quantization yields asymptotic morphisms in the sense of E-theory [2]; cf. [22]. See [15] for an extensive discussion of quantization theory from the above perspective, including an interpretation of the conditions (3) and (4).

2. Lie Groupoids and Lie Algebroids

Throughout this section, the reader is encouraged to occasionally skip to Sect. 6 to have a look at some examples of the objects defined. We refer to [26,17,3,2,15,1] for the basic definitions on groupoids; here we merely establish our notation. Briefly, a groupoid is a category whose space of arrows $G$ is a set (hence the space of objects $Q$ is a set as well), and whose arrows are all invertible. The source and target projections are called $\tau_s : G \to Q$ and $\tau_t : G \to Q$, respectively. The subset of $G \times G$ on which the groupoid multiplication (i.e., the composition of arrows) is defined is called $G_2$; hence $(\gamma_1, \gamma_2) \in G_2$ iff $\tau_s(\gamma_1) = \tau_t(\gamma_2)$. The inversion $\gamma \mapsto \gamma^{-1}$ defines the unit space $G_0 = \{\gamma\gamma^{-1} \mid \gamma \in G\}$, which is related to the base space $Q$ by the "object inclusion map" $\iota : Q \hookrightarrow G$; this is a bijection between $Q$ and $\iota(Q) = G_0$. The notation $G \stackrel{\leftarrow}{\rightrightarrows} Q$ for a groupoid to some extent captures the situation.

A Lie groupoid is a groupoid $G \stackrel{\leftarrow}{\rightrightarrows} Q$, where $G$ and $Q$ are manifolds (perhaps with boundary), the maps $\tau_s$ and $\tau_t$ are surjective submersions, and multiplication and inclusion are smooth [17,3,2,15,1]. Following [15], we now sharpen Def. I.2.2 in [26].

Definition 3. A left Haar system on a Lie groupoid $G \stackrel{\leftarrow}{\rightrightarrows} Q$ is a family $\{\mu^t_q\}_{q\in Q}$ of positive measures, where the measure $\mu^t_q$ is defined on $\tau_t^{-1}(q)$, such that
1. the family is invariant under left-translation in $G$;
2. each $\mu^t_q$ is locally Lebesgue (i.e., it is equivalent to the Lebesgue measure in every co-ordinate chart; note that each fiber $\tau_t^{-1}(q)$ is a manifold);
3. for each $\mathsf{f} \in C_c^\infty(G)$ the map $q \mapsto \int_{\tau_t^{-1}(q)} d\mu^t_q(\gamma)\, \mathsf{f}(\gamma)$ from $Q$ to $\mathbb{C}$ is smooth.

Here left-invariance means invariance under all maps $L_\gamma$, defined by
\[ L_\gamma(\gamma') := \gamma\gamma' \tag{5} \]
whenever $(\gamma, \gamma') \in G_2$. Note that $L_\gamma$ maps $\tau_t^{-1}(\tau_s(\gamma))$ diffeomorphically to $\tau_t^{-1}(\tau_t(\gamma))$.
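As a small illustration of how left translation moves one target fiber onto another, the following sketch uses the pair groupoid of Example 1 (Sect. 6) over a finite base. The finite set `Q` and all function names are assumptions made for this illustration only; a finite set is of course not a manifold, so only the algebraic statement is being checked.

```python
from itertools import product

Q = [0, 1, 2]                  # hypothetical finite base space
G = list(product(Q, Q))        # arrows of the pair groupoid Q x Q


def source(a):
    """tau_s(q1, q2) = q2."""
    return a[1]


def target(a):
    """tau_t(q1, q2) = q1."""
    return a[0]


def multiply(a, b):
    """Compose arrows; defined only when tau_s(a) == tau_t(b)."""
    if source(a) != target(b):
        raise ValueError("pair is not composable")
    return (a[0], b[1])


def target_fiber(q):
    """The fiber tau_t^{-1}(q)."""
    return [a for a in G if target(a) == q]


def left_translate(a, fiber):
    """Apply L_a to every arrow of the given fiber."""
    return [multiply(a, b) for b in fiber]


a = (0, 2)
image = left_translate(a, target_fiber(source(a)))
# L_a maps tau_t^{-1}(tau_s(a)) bijectively onto tau_t^{-1}(tau_t(a)).
print(sorted(image) == sorted(target_fiber(target(a))))  # prints True
```

In this finite model the counting measure on each fiber is left-invariant for exactly this reason: $L_\gamma$ is a bijection between the two fibers.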
A Lie groupoid $G \stackrel{\leftarrow}{\rightrightarrows} Q$ has an associated Lie algebroid [17,3,15,1], which we denote by $\mathfrak{G} \stackrel{\to TQ}{\to} Q$. This is a vector bundle over $Q$, which apart from the bundle projection $\tau : \mathfrak{G} \to Q$ is equipped with a vector bundle map $\tau_a : \mathfrak{G} \to TQ$ (called the anchor), as well as with a Lie bracket $[\,,]_{\mathfrak{G}}$ on the space $\Gamma(\mathfrak{G})$ of smooth sections of $\mathfrak{G}$, satisfying certain compatibility conditions.

For our purposes, the essential point in the construction of $\mathfrak{G} \stackrel{\to TQ}{\to} Q$ from $G \stackrel{\leftarrow}{\rightrightarrows} Q$ lies in the fact that the vector bundle $\mathfrak{G}$ over $Q$ is the normal bundle $N^\iota Q$ defined by the embedding $\iota : Q \hookrightarrow G$; accordingly, the projection $\tau : N^\iota Q \to Q$ is given by $\tau_s$ or $\tau_t$ (these projections coincide on $G_0$). The tangent bundle of $G$ at the unit space has a decomposition
\[ T_{\iota(q)} G = T_{\iota(q)} G_0 \oplus T^t_{\iota(q)} G, \tag{6} \]
where $T^t G = \ker(T\tau_t)$ is a sub-bundle of $TG$. Note that $T^t_\gamma G = T_\gamma \tau_t^{-1}(\tau_t(\gamma))$. Hence $\mathfrak{G} \stackrel{\to TQ}{\to} Q$ is isomorphic as a vector bundle to the restriction $\mathfrak{G}'$ of $T^t G$ to $G_0$. Under this isomorphism the fiber $\mathfrak{G}_q$ above $q$ is mapped to the vector space $T^t_{\iota(q)} G = T_{\iota(q)} \tau_t^{-1}(q)$.

The following pleasant result was pointed out by Ramazan [25].

Proposition 1. Every Lie groupoid possesses a left Haar system.

Proof. A given strictly positive smooth density $\rho$ on the vector bundle $\mathfrak{G}$ can be (uniquely) extended to a left-invariant density $\tilde{\rho}$ on the vector bundle $T^t G$, which in turn yields a left Haar system by $\mu^t_q(\mathsf{f}) = \int_{\tau_t^{-1}(q)} \tilde{\rho}\, \mathsf{f}$. $\square$
One may canonically associate a $C^*$-algebra $C^*(G)$ to a Lie groupoid $G \stackrel{\leftarrow}{\rightrightarrows} Q$ [2], and equally canonically associate a Poisson algebra $C^\infty(\mathfrak{G}^*)$ to its Lie algebroid $\mathfrak{G} \stackrel{\to TQ}{\to} Q$ [4,3] (here $\mathfrak{G}^*$ is the dual vector bundle of $\mathfrak{G}$, with projection denoted by $\tau^*$). From the point of view of quantization theory, these constructions go hand in hand [12,13,15].

Although a left Haar system is not intrinsic, and an intrinsic definition of $C^*(G)$ may be given [2,15,25], it vastly simplifies the presentation of our results if we define this $C^*$-algebra relative to a particular choice of a left Haar system $\{\mu^t_q\}_{q\in Q}$. For $\mathsf{f}, \mathsf{g} \in C_c^\infty(G)$ the product $*$ in $C^*(G)$ is then given by the convolution [26]
\[ \mathsf{f} * \mathsf{g}(\gamma) := \int_{\tau_t^{-1}(\tau_s(\gamma))} d\mu^t_{\tau_s(\gamma)}(\gamma_1)\, \mathsf{f}(\gamma\gamma_1)\, \mathsf{g}(\gamma_1^{-1}); \tag{7} \]
the involution is defined by
\[ \mathsf{f}^*(\gamma) := \overline{\mathsf{f}(\gamma^{-1})}. \tag{8} \]
The groupoid $C^*$-algebra $C^*(G)$ is the completion of $C_c^\infty(G)$ in a suitable $C^*$-norm [2,26,15].

On the classical side, the Poisson algebra $C^\infty(\mathfrak{G}^*)$ associated to a Lie algebroid $\mathfrak{G}$ [4,3,15] is most simply defined by listing special cases which uniquely determine the Poisson bracket. These are
\[ \{f, g\} = 0; \tag{9} \]
\[ \{\tilde{s}, f\} = -(\tau_a \circ s) f; \tag{10} \]
\[ \{\tilde{s}_1, \tilde{s}_2\} = -\widetilde{[s_1, s_2]_{\mathfrak{G}}}. \tag{11} \]
Here $f, g \in C^\infty(Q)$ (regarded as functions on $\mathfrak{G}^*$ in the obvious way), and $\tilde{s} \in C^\infty(\mathfrak{G}^*)$ is defined by a section $s$ of $\mathfrak{G}$ through $\tilde{s}(\theta) = \theta(s(\tau^*(\theta)))$, etc. See [3] for an intrinsic definition.

3. A Generalized Exponential Map

Throughout the remainder of the paper, $\mathfrak{G} \stackrel{\to TQ}{\to} Q$ will be the Lie algebroid of a Lie groupoid $G \stackrel{\leftarrow}{\rightrightarrows} Q$. In order to state and prove our main results we need to construct an exponential map $\mathrm{Exp}^W : \mathfrak{G} \to G$, which generalizes the map Exp from a Lie algebra to an associated Lie group. The construction of such a map was outlined by Pradines [24], but in order to eventually satisfy the self-adjointness condition (2) on our quantization map we need a different construction [15]. As in [24], our exponential map depends on the choice of a connection on the vector bundle $\mathfrak{G}$. As before, the reader is referred to Sect. 6 for examples of the constructions below.

Lemma 1. The vector bundles $T^t G$ and $\tau_s^* \mathfrak{G}$ (over $G$) are isomorphic.

Proof. The pull-back bundle $\tau_s^* \mathfrak{G}$ is a vector bundle over $G$ with projection onto the second variable. The isomorphism is proved via the vector bundle isomorphism $\mathfrak{G} \simeq \mathfrak{G}'$; see Sect. 2. Recalling (5), one checks that $TL_{\gamma^{-1}} : T^t_\gamma G \to T^t_{\gamma^{-1}\gamma} G$ is the desired bundle isomorphism between $T^t G$ and $\tau_s^* \mathfrak{G}'$. $\square$

Let us now assume that $\mathfrak{G}$ has a covariant derivative (or, equivalently, a connection), with associated horizontal lift $\ell^{\mathfrak{G}}$. By Lemma 1 one then obtains a connection on $T^t G$ (seen as a vector bundle over $G$, whose projection is borrowed from $TG$) through pull-back. Going through the definitions, one finds that the associated horizontal lift $\ell$ of a tangent vector $X = \dot{\gamma} := d\gamma(t)/dt|_{t=0}$ in $T_\gamma G$ to $Y \in T^t_\gamma G$ is
\[ \ell_Y(\dot{\gamma}) = \frac{d}{dt}\left[ L_{\gamma(t)*}\, \ell^{\mathfrak{G}}_{TL_{\gamma^{-1}} Y}(\tau_s(\gamma(t))) \right]_{t=0}, \tag{12} \]
which is an element of $T_Y(T^t G)$ (here $\ell^{\mathfrak{G}}(\ldots)$ lifts a curve). Since the bundle $T^t G \to G$ has a connection, one can define geodesic flow $X \mapsto X(t)$ on $T^t G$ in precisely the same way as on a tangent bundle with affine connection. That is, the flow $X(t)$ is the solution of
\[ \dot{X}(t) = \ell_{X(t)}(X(t)), \tag{13} \]
with initial condition $X(0) = X$.

Definition 4. Let the Lie algebroid $\mathfrak{G} \stackrel{\to TQ}{\to} Q$ of a Lie groupoid $G \stackrel{\leftarrow}{\rightrightarrows} Q$ be equipped with a connection. Relative to the latter, the left exponential map $\mathrm{Exp}^L : \mathfrak{G} \to G$ is defined by
\[ \mathrm{Exp}^L(X) := \gamma_{X'}(1) = \tau_{T^t G \to G}(X'(1)), \tag{14} \]
whenever the geodesic flow $X'(t)$ on $T^t G$ (defined by the connection on $T^t G$ pulled back from the one on $\mathfrak{G}$) is defined at $t = 1$. Here $X' \in \mathfrak{G}' = T^t G \restriction G_0$ is the image of $X$ under the isomorphism $\mathfrak{G}' \simeq \mathfrak{G}$.

Our goal, however, is to define a "symmetrized" version of $\mathrm{Exp}^L$.
Lemma 2. For all $X \in \mathfrak{G}$ for which $\mathrm{Exp}^L(X)$ is defined one has
\[ \tau_t(\mathrm{Exp}^L(X)) = \tau(X). \tag{15} \]
Here $\tau$ is the bundle projection of the Lie algebroid.

Proof. We write $X$ for $X'$ in (14). One has $\tau_t(\gamma_X(0)) = \tau(X)$ and
\[ \frac{d}{dt} \tau_t(\gamma_X(t)) = T(\tau_t \circ \tau_{T^t G \to G})\, \ell_{X(t)}(X(t)) = T\tau_t\, X(t) = 0, \]
since $\ell_X(Y)$ covers $Y$, and $X(t) \in T^t G = \ker(T\tau_t) \cap TG$. $\square$

We combine this with the obvious $\tau(\tfrac12 X) = \tau(-\tfrac12 X)$ to infer that
\[ \tau_t(\mathrm{Exp}^L(\tfrac12 X)) = \tau_t(\mathrm{Exp}^L(-\tfrac12 X)) = \tau_s(\mathrm{Exp}^L(-\tfrac12 X)^{-1}). \]
Thus the (groupoid) multiplication in (16) below is well-defined.

Definition 5. The Weyl exponential map $\mathrm{Exp}^W : \mathfrak{G} \to G$ is defined by
\[ \mathrm{Exp}^W(X) := \mathrm{Exp}^L(-\tfrac12 X)^{-1}\, \mathrm{Exp}^L(\tfrac12 X). \tag{16} \]
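To see Definition 5 at work in the simplest case, the sketch below composes the two half-steps in the pair groupoid over $\mathbb{R}^n$ (Example 1 in Sect. 6). It assumes the flat connection, for which the left exponential map takes the form $\mathrm{Exp}^L(v, q) = (q, q + v)$ as in (30); the function names and the sample point are illustrative only.

```python
def multiply(a, b):
    """Pair-groupoid product (q1, q2).(q2, q3) = (q1, q3)."""
    if a[1] != b[0]:
        raise ValueError("arrows are not composable")
    return (a[0], b[1])


def inverse(a):
    """Pair-groupoid inverse (q1, q2)^{-1} = (q2, q1)."""
    return (a[1], a[0])


def scale(c, v):
    """Multiply the tangent vector v by the scalar c."""
    return tuple(c * vi for vi in v)


def exp_left(v, q):
    # Assumed flat-space form of Exp^L: the geodesic from q with velocity v
    # is q + t v, and the target of the arrow stays at q, as Lemma 2 requires.
    return (q, tuple(qi + vi for qi, vi in zip(q, v)))


def exp_weyl(v, q):
    # Definition 5: Exp^W(X) = Exp^L(-X/2)^{-1} Exp^L(X/2).
    return multiply(inverse(exp_left(scale(-0.5, v), q)),
                    exp_left(scale(0.5, v), q))


v, q = (2.0, -4.0), (1.0, 3.0)
print(exp_weyl(v, q))  # ((0.0, 5.0), (2.0, 1.0)), i.e. (q - v/2, q + v/2)
```

The result is symmetric about $q$, and replacing $v$ by $-v$ yields the inverse arrow; this symmetry is what the self-adjointness condition (2) relies on later.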
The following result is closely related to the tubular neighbourhood theorem.

Proposition 2. The maps $\mathrm{Exp}^L$ and $\mathrm{Exp}^W$ are diffeomorphisms from a neighbourhood $N^\iota$ of $Q \subset \mathfrak{G}$ (as the zero section) to a neighbourhood $N_\iota$ of $\iota(Q)$ in $G$, such that $\mathrm{Exp}^L(q) = \mathrm{Exp}^W(q) = \iota(q)$ for all $q \in Q$.

Proof. The property $\mathrm{Exp}^L(q) = \iota(q)$ is immediate from Definition 4. The push-forward of $\mathrm{Exp}^L$ at $q$ is $T\mathrm{Exp}^L : T_q \mathfrak{G} \to T_{\iota(q)} G$. Now recall the decomposition (6). For $X$ tangent to $Q \subset \mathfrak{G}$ one immediately sees that $T\mathrm{Exp}^L(X) = T\iota(X)$. For $X$ tangent to the fiber $\tau^{-1}(q)$, which we identify with $T^t_{\iota(q)} G$, one has $T\mathrm{Exp}^L(X) = X'$, as follows by the standard argument used to prove that $\exp_q$ in the theory of affine geodesics is a local diffeomorphism: for a curve $X(s) = sX$ in $T^t_{\iota(q)} G$ one has $\mathrm{Exp}^L(X(s)) = \gamma_{X'(s)}(1) = \gamma_{X'}(s)$, so that $d/ds[\mathrm{Exp}^L(X(s))]_{s=0} = X'$.

Since $T\mathrm{Exp}^L$ is a bijection at $q$, the inverse function theorem implies that $\mathrm{Exp}^L$ is a local diffeomorphism. Since it maps $Q$ pointwise to $\iota(Q)$, the local diffeomorphisms can be patched together to yield a diffeomorphism of the neighbourhoods stated in Proposition 2; we omit the details of this last step, since it is identical to the proof of the tubular neighbourhood theorem.

As for $\mathrm{Exp}^W$, for $X \in T_q Q \subset T_q \mathfrak{G}$ we have $T\mathrm{Exp}^W(X) = T\iota(X)$. Also,
\[ \frac{d}{ds}\left[ \mathrm{Exp}^L(-\tfrac12 sX)^{-1}\, \mathrm{Exp}^L(\tfrac12 sX) \right]_{s=0} = -\tfrac12 TI(X') + \tfrac12 X', \]
where $TI$ is the push-forward of the inversion $I$ in $G$. The right-hand side lies in $\ker(T\tau_s + T\tau_t) \subset TG$, and every element in this kernel is of the stated form. Similarly to (6), one may prove the decomposition
\[ T_{\iota(q)} G = T_{\iota(q)} G_0 \oplus \ker(T\tau_s + T\tau_t)(\iota(q)). \tag{17} \]
It follows that $T\mathrm{Exp}^W$ is a bijection at $q$, and the second part of the proposition is derived as for $\mathrm{Exp}^L$. $\square$
4. The Normal Groupoid and Continuous Fields of $C^*$-Algebras

We now come to the first part of the proof of the conjecture that $C^*(G)$ is related to the Poisson manifold $\mathfrak{G}^*$ by a strict quantization.

Theorem 1. Let $G$ be a Lie groupoid, with associated Lie algebroid $\mathfrak{G}$. Take $I = [0,1]$ and put $A_0 = C_0(\mathfrak{G}^*)$, where $\mathfrak{G}^*$ is the dual vector bundle of $\mathfrak{G}$, and $A_\hbar = C^*(G)$ for $\hbar \in I \setminus \{0\}$. There exists a $C^*$-algebra $C$ and a family of surjective $*$-homomorphisms $\{\varphi_\hbar : C \to A_\hbar\}_{\hbar\in I}$ such that $(C, \{A_\hbar, \varphi_\hbar\}_{\hbar\in I})$ is a continuous field of $C^*$-algebras.

The proof uses the normal groupoid of Hilsum and Skandalis [9] (also cf. [33,15]), re-interpreted in terms of the Lie algebroid. We recall the definition; our construction of the smooth structure is different from the one in [9]. The essence is to regard the vector bundle $\mathfrak{G}$ as a Lie groupoid under addition in each fiber, and glue it to $G$ so as to obtain a new Lie groupoid containing both $G$ and $\mathfrak{G}$.

Definition 6. Let $G \stackrel{\leftarrow}{\rightrightarrows} Q$ be a Lie groupoid with associated Lie algebroid $\mathfrak{G} \stackrel{\to TQ}{\to} Q$. The normal groupoid $G_N$ is a Lie groupoid with base $[0,1] \times Q$, defined by the following structures:
• As a set, $G_N = \mathfrak{G} \cup \{(0,1] \times G\}$. We write elements of $G_N$ as pairs $(\hbar, u)$, where $u \in \mathfrak{G}$ for $\hbar = 0$ and $u \in G$ for $\hbar \neq 0$. Thus $\mathfrak{G}$ is identified with $\{0\} \times \mathfrak{G}$.
• As a groupoid, $G_N = \{0 \times \mathfrak{G}\} \cup \{(0,1] \times G\}$. Here $\mathfrak{G}$ is regarded as a Lie groupoid over $Q$, with $\tau_s = \tau_t = \tau$ and addition in the fibers as the groupoid multiplication. The groupoid operations in $(0,1] \times G$ are those in $G$.
• The smooth structure on $G_N$, making it a manifold with boundary, is as follows. To start, the open subset $O_1 := (0,1] \times G \subset G_N$ inherits the product manifold structure. Let $Q \subset N^\iota \subset \mathfrak{G}$ and $\iota(Q) \subset N_\iota \subset G$, as in Proposition 2. Let $O$ be the open subset of $[0,1] \times \mathfrak{G}$ (equipped with the product manifold structure; this is a manifold with boundary, since $[0,1]$ is), defined as $O := \{(\hbar, X) \mid \hbar X \in N^\iota\}$. Note that $\{0\} \times \mathfrak{G} \subset O$. The map $\rho : O \to G_N$ is defined by
\[ \rho(0, X) := (0, X); \qquad \rho(\hbar, X) := (\hbar, \mathrm{Exp}^W(\hbar X)). \tag{18} \]
Since $\mathrm{Exp}^W : N^\iota \to N_\iota$ is a diffeomorphism (cf. Proposition 2) we see that $\rho$ is a bijection from $O$ to $O_2 := \{0 \times \mathfrak{G}\} \cup \{(0,1] \times N_\iota\}$. This defines the smooth structure on $O_2$ in terms of the smooth structure on $O$. Since $O_1$ and $O_2$ cover $G_N$, this specifies the smooth structure on $G_N$.

The fact that $G_N$ is a Lie groupoid eventually follows from the corresponding property of $G$. The given chart is defined in terms of the Weyl exponential, which depends on the choice of a connection in $\mathfrak{G}$. However, one may verify that any (smooth) connection, or, indeed, any ($Q$-preserving) diffeomorphism between $N^\iota$ and $N_\iota$ leads to an equivalent smooth structure on $G_N$. For example, we could have used $\mathrm{Exp}^L$ instead of $\mathrm{Exp}^W$. Also, the smoothness of $\mathrm{Exp}^W$ makes the above manifold structure on $G_N$ well defined, in that open subsets of $O_1 \cap O_2$ are assigned the same smooth structure.

Since $G_N$ is a Lie groupoid, we can form the $C^*$-algebra $C^*(G_N)$, which plays the role of $C$ in Theorem 1. To proceed, we need a result due to Lee [16].
Lemma 3. Let $C$ be a $C^*$-algebra, and let $\psi : \mathrm{Prim}(C) \to X$ be a continuous and open map from the primitive spectrum $\mathrm{Prim}(C)$ (equipped with the Jacobson topology [5]) to a locally compact Hausdorff space $X$. Define $I_x := \cap\, \psi^{-1}(x)$; i.e., $A \in I_x$ iff $\pi_{\mathcal{I}}(A) = 0$ for all $\mathcal{I} \in \psi^{-1}(x)$ (here $\pi_{\mathcal{I}}(C)$ is the irreducible representation whose kernel is $\mathcal{I}$). Note that $I_x$ is a (closed two-sided) ideal in $C$. Taking $A_x = C/I_x$ and $\varphi_x : C \to A_x$ to be the canonical projection, $(C, \{A_x, \varphi_x\}_{x\in X})$ is a continuous field of $C^*$-algebras.

For the proof cf. [6]. We apply this lemma with $C = C^*(G_N)$ and $X = I = [0,1]$. In order to verify the assumption in the lemma, we first note that $I_0 \simeq C_0((0,1]) \otimes C^*(G)$, as follows from a glance at the topology of $G_N$. Hence $\mathrm{Prim}(I_0) = (0,1] \times \mathrm{Prim}(C^*(G))$, with the product topology. Furthermore, one has $C^*(G_N)/I_0 \simeq C^*(\mathfrak{G}) \simeq C_0(\mathfrak{G}^*)$; the second isomorphism is established by the fiberwise Fourier transform (20) below (also cf. [9,2]). Hence $\mathrm{Prim}(C^*(G_N)/I_0) \simeq \mathfrak{G}^*$. Using this in Prop. 3.2.1 in [5], with $A = C_r^*(G_N)$ and $I$ the ideal $I_0$ generated by those $\mathsf{f} \in C_c^\infty(G_N)$ which vanish at $\hbar = 0$, yields the decomposition
\[ \mathrm{Prim}(C^*(G_N)) \simeq \mathfrak{G}^* \cup \{(0,1] \times \mathrm{Prim}(C^*(G))\}, \tag{19} \]
in which $\mathfrak{G}^*$ is closed. This does not provide the full topology on $\mathrm{Prim}(C^*(G_N))$, but it is sufficient to know that $\mathfrak{G}^*$ is not open. If it were, $(0,1] \times \mathrm{Prim}(C^*(G))$ would be closed in $\mathrm{Prim}(C^*(G_N))$, and this possibility can safely be excluded by looking at the topology of $G_N$ and the definition of the Jacobson topology.

Using (19), we can define a map $\psi : \mathrm{Prim}(C^*(G_N)) \to [0,1]$ by $\psi(\mathcal{I}) = 0$ for all $\mathcal{I} \in \mathfrak{G}^*$ and $\psi(\hbar, \mathcal{I}) = \hbar$ for $\hbar \neq 0$ and $\mathcal{I} \in \mathrm{Prim}(C^*(G))$. It is clear from the preceding considerations that $\psi$ is continuous and open. Using this in Lemma 3, one sees that $I_\hbar$ is the ideal in $C^*(G_N)$ generated by those $\mathsf{f} \in C_c^\infty(G_N)$ which vanish at $\hbar$. Hence $A_0 \simeq C_0(\mathfrak{G}^*)$, as above, and $A_\hbar \simeq C^*(G)$ for $\hbar \neq 0$. Theorem 1 then follows from Lemma 3.

As pointed out to the author by G. Skandalis (private communication, June 1997), similar considerations lead to the following generalization of Theorem 1. Let $\tilde{G}$ be a Lie groupoid with base $\tilde{Q}$, and let $p$ be a continuous and open map from $\tilde{Q}$ to some Hausdorff space $X$, which is $\tilde{G}$-invariant in the sense that $p \circ \tau_s = p \circ \tau_t$. Define $\tilde{G}_x := (p \circ \tau_s)^{-1}(x)$ (this is a sub-groupoid of $\tilde{G}$ because of the $\tilde{G}$-invariance of $p$), and $A^x := C^*(\tilde{G}_x)$. Then the collection $(\{A^x\}_{x\in X}, C^*(\tilde{G}))$ is a continuous field of $C^*$-algebras at those points $x$ where $C^*(\tilde{G}_x) = C_r^*(\tilde{G}_x)$. Here $\mathsf{f} \in C^*(\tilde{G})$ is understood to define a section of the field $\{A^x\}_{x\in X}$ by $\mathsf{f}(x) = \mathsf{f} \restriction \tilde{G}_x$.

We apply this to our situation by taking $\tilde{G} = G_N$ and $X = I$, hence $\tilde{Q} = I \times Q$, and $p$ is just projection onto the first variable. Continuity away from $\hbar = 0$ follows from the triviality of the field for $\hbar \neq 0$ (whether or not $C_r^*(G) = C^*(G)$). Continuity at $\hbar = 0$ follows by noticing that $C_r^*(\mathfrak{G}) = C^*(\mathfrak{G})$, both sides being equal to $C_0(\mathfrak{G}^*)$. In other words, from this point of view it is the amenability of $\mathfrak{G}$, regarded as a Lie groupoid, that lies behind Theorem 1.

5. Weyl Quantization on the Dual of a Lie Algebroid

Let $\mathfrak{G} \stackrel{\to TQ}{\to} Q$ be a Lie algebroid, with bundle projection $\tau$. We start by defining a fiberwise Fourier transform $\grave{f} \in C^\infty(\mathfrak{G})$ of suitable $f \in C^\infty(\mathfrak{G}^*)$. This transform depends on the choice of a family $\{\mu^L_q\}_{q\in Q}$ of Lebesgue measures, where $\mu^L_q$ is defined on the fiber
$\tau^{-1}(q)$. We will discuss the normalization of each $\mu^L_q$ in the proof of Theorem 2; for the moment we merely assume that the $q$-dependence is smooth in the obvious (weak) sense. For a function $\grave{f}$ on $\mathfrak{G}$ which is $L^1$ on each fiber we put
\[ f(\theta) := \int_{\tau^{-1}(q)} d\mu^L_q(X)\, e^{-i\theta(X)}\, \grave{f}(X), \tag{20} \]
where $X \in \tau^{-1}(q)$. Each $\mu^L_q$ determines a Lebesgue measure $\mu^{L*}_q$ on the fiber $\tau^{-1}_{\mathfrak{G}^* \to Q}(q)$ of $\mathfrak{G}^*$ by fixing the normalization in requiring that the inverse to (20) is given by
\[ \grave{f}(X) = \int_{\tau^{-1}_{\mathfrak{G}^* \to Q}(q)} d\mu^{L*}_q(\theta)\, e^{i\theta(X)}\, f(\theta). \tag{21} \]
Having constructed a Fourier transform, we define the class $C^\infty_{PW}(\mathfrak{G}^*)$ as consisting of those smooth functions on $\mathfrak{G}^*$ whose Fourier transform is in $C_c^\infty(\mathfrak{G})$; this generalizes the class of Paley–Wiener functions on $T^*\mathbb{R}^n \simeq \mathbb{C}^n$. We pick a function $\kappa \in C^\infty(\mathfrak{G}, \mathbb{R})$ with support in $N^\iota$ (cf. Proposition 2), equaling unity in some smaller tubular neighbourhood of $Q$, as well as satisfying $\kappa(-X) = \kappa(X)$ for all $X \in \mathfrak{G}$.

Definition 7. Let $G$ be a Lie groupoid with Lie algebroid $\mathfrak{G}$. For $\hbar \neq 0$, the Weyl quantization of $f \in C^\infty_{PW}(\mathfrak{G}^*)$ is the element $Q^W_\hbar(f) \in C_c^\infty(G)$, regarded as a dense subalgebra of $C^*(G)$, defined by $Q^W_\hbar(f)(\gamma) := 0$ when $\gamma \notin N_\iota$, and by
\[ Q^W_\hbar(f)(\mathrm{Exp}^W(X)) := \hbar^{-n} \kappa(X)\, \grave{f}(X/\hbar). \tag{22} \]
Here the Weyl exponential $\mathrm{Exp}^W : \mathfrak{G} \to G$ is defined in (16), and the cutoff function $\kappa$ is as specified above.

This definition is possible by virtue of Proposition 2. By our choice of $C^\infty_{PW}(\mathfrak{G}^*)$, the operator $Q^W_\hbar(f)$ is independent of $\kappa$ for small enough $\hbar$ (depending on $f$).

Theorem 2. Let $G$ be a Lie groupoid with Lie algebroid $\mathfrak{G} \stackrel{\to TQ}{\to} Q$, and take $\tilde{A}_0 = C^\infty_{PW}(\mathfrak{G}^*)$. For each $f \in \tilde{A}_0$ the operator $Q^W_\hbar(f)$ of Definition 7 satisfies $Q^W_\hbar(f)^* = Q^W_\hbar(f^*)$, and the family $\{Q^W_\hbar(f)\}_{\hbar\in[0,1]}$, with $Q^W_0(f) = f$, is a continuous cross-section of the continuous field of $C^*$-algebras of Theorem 1.

Proof. Writing the Poisson bracket and the pointwise product in terms of the Fourier transform, one quickly establishes that $\tilde{A}_0$ is indeed a Poisson algebra.

It is immediate from (8) and (16) that for real-valued $f \in \tilde{A}_0$ the operator $Q^W_\hbar(f)$ is self-adjoint in $C^*(G)$; this implies the first claim.

To prove the second claim, we pick a left Haar system $\{\mu^t_q\}_{q\in Q}$ on $G \stackrel{\leftarrow}{\rightrightarrows} Q$; see Proposition 1. The vector bundle $\mathfrak{G}$, regarded as a Lie groupoid under addition in each fiber (cf. Definition 6), has a left Haar system in any case, consisting of the family $\{\mu^L_q\}_{q\in Q}$ of Lebesgue measures on each fiber already used in the construction of the Fourier transform. Since we have a Lie groupoid, the Radon–Nikodym derivative $J_q(X) := d\mu^t_q(\mathrm{Exp}^W(X))/d\mu^L_q(X)$ is well defined and strictly positive on $N^\iota$ (since both measures are locally Lebesgue on spaces with the same dimension). We now fix
the normalization of the $\mu^L_q$ by requiring that $\lim_{X\to 0} J_q(X) = 1$ for all $q$. This leads to a left Haar system for $G_N$, given by
\[ \mu^t_{(0,q)} := \mu^L_q; \qquad \mu^t_{(\hbar,q)} := \hbar^{-n} \mu^t_q, \tag{23} \]
where $n$ is the dimension of the typical fiber of $\mathfrak{G}$. The factor $\hbar^{-n}$ is necessary in order to satisfy condition 3 in Definition 3 at $\hbar = 0$, as is easily verified using the manifold structure on $G_N$. Thus the $*$-algebraic structure on $C_c^\infty(G_N)$ defined by (7) and (8) with Definition 6 and (23) becomes
\[ \mathsf{f} * \mathsf{g}(0, X) = \int_{\tau^{-1}\circ\tau(X)} d\mu^L_{\tau(X)}(Y)\, \mathsf{f}(0, X - Y)\, \mathsf{g}(0, Y); \tag{24} \]
\[ \mathsf{f} * \mathsf{g}(\hbar, \gamma) = \hbar^{-n} \int_{\tau_t^{-1}(\tau_s(\gamma))} d\mu^t_{\tau_s(\gamma)}(\gamma_1)\, \mathsf{f}(\hbar, \gamma\gamma_1)\, \mathsf{g}(\hbar, \gamma_1^{-1}); \tag{25} \]
\[ \mathsf{f}^*(0, X) = \overline{\mathsf{f}(0, -X)}; \tag{26} \]
\[ \mathsf{f}^*(\hbar, \gamma) = \overline{\mathsf{f}(\hbar, \gamma^{-1})}. \tag{27} \]
One sees that, for given $f \in C^\infty_{PW}(\mathfrak{G}^*)$, the function $Q(f)$ on $G_N$ defined by $Q(f)(0, X) = \grave{f}(X)$, $Q(f)(\hbar, \mathrm{Exp}^W(X)) = \kappa(X)\grave{f}(X/\hbar)$, and $Q(f)(\hbar, \gamma) = 0$ for $\gamma \notin N_\iota$, is smooth on $G_N$; cf. Definition 6. In other words, $Q(f)$ is an element of $C^*(G_N)$.

Recall that $I_\hbar$ is the ideal in $C^*(G_N)$ generated by those functions in $C_c^\infty(G_N)$ which vanish at $\hbar$. The canonical map $\mathsf{f} \to [\mathsf{f}]_\hbar$ from $C^*(G_N)$ to $C_r^*(G_N)/I_\hbar$ is given, for $\hbar \neq 0$, by $[\mathsf{f}]_\hbar(\cdot) = \mathsf{f}(\hbar, \cdot)$. However, in view of the factor $\hbar^{-n}$ in (25), this map is only a $*$-homomorphism from $C^*(G_N)$ to $C^*(G)$ if we add a factor $\hbar^{-n}$ to the definition (7) of convolution on $G$. Since for $\hbar \neq 0$ we would like to identify $C^*(G_N)/I_\hbar$ with $C^*(G)$, in which convolution is defined in the usual, $\hbar$-independent way, we should therefore define the maps $\varphi_\hbar$ of Theorem 1 by
\[ \varphi_0(\mathsf{f}) : \theta \mapsto \acute{\mathsf{f}}(0, \theta); \qquad \varphi_\hbar(\mathsf{f}) : \gamma \mapsto \hbar^{-n}\, \mathsf{f}(\hbar, \gamma) \quad (\hbar \neq 0). \tag{28} \]
Here $\varphi_0 : C^*(G_N) \to C_0(\mathfrak{G}^*)$, and $\acute{\mathsf{f}}(0, \theta)$ and $\mathsf{f}(0, X)$ are related as $f(\theta)$ and $\grave{f}(X)$ are in (20). For $\hbar \neq 0$ one of course has $\varphi_\hbar : C^*(G_N) \to C^*(G)$. These expressions are initially defined for $\mathsf{f} \in C_c^\infty(G_N)$; since $\varphi_\hbar$ is contractive, they are subsequently extended to general $\mathsf{f} \in C^*(G_N)$ by continuity. This explains the factor $\hbar^{-n}$ in (22); the theorem then follows from the paragraph after (27). $\square$

The important calculations of Ramazan [25] show that
\[ \lim_{\hbar\to 0} \left\| \frac{i}{\hbar}[Q^W_\hbar(f), Q^W_\hbar(g)] - Q^W_\hbar(\{f,g\}) \right\| = 0 \tag{29} \]
for all $f, g \in \tilde{A}_0$; this is Dirac's condition (he in addition proves this to hold in formal deformation quantization).
Corollary 1. Let $G$ be a Lie groupoid, with associated
• Lie algebroid $\mathfrak{G} \stackrel{\to TQ}{\to} Q$;
• Poisson manifold $\mathfrak{G}^*$ (the dual bundle to $\mathfrak{G}$, with Poisson structure (9)–(11));
• normal groupoid $G_N$ (cf. Definition 6).

In the context of Definition 2, the ingredients listed below yield a strict quantization of the Poisson manifold $P = \mathfrak{G}^*$:
1. The continuous field of $C^*$-algebras given by $C = C^*(G_N)$, $A_0 = C_0(\mathfrak{G}^*)$, $A_\hbar = C^*(G)$ for $\hbar \in I \setminus \{0\}$, and $\varphi_\hbar$ as defined in (28); cf. Theorem 1.
2. The dense subspace $\tilde{A}_0 = C^\infty_{PW}(\mathfrak{G}^*)$ of fiberwise Paley–Wiener functions on $\mathfrak{G}^*$ (as defined below (21)).
3. The map $Q : C^\infty_{PW}(\mathfrak{G}^*) \to C^*(G_N)$ is defined by putting $Q_\hbar = Q^W_\hbar$ (as specified in Definition 7); this determines $Q$ by Theorem 2 and the remark after Definition 1.
6. Examples

In this section we illustrate the concepts introduced above, and show that a number of known strict quantizations are special cases of Corollary 1. Details of these examples will be omitted; see [17,3,15,1] for matters related to the Lie groupoids and Lie algebroids involved, and cf. [2,26,15,25] for the $C^*$-algebras that appear. The quantization maps are discussed in detail in [15].

It turns out that a number of examples are more naturally described by changing some signs, as follows. We denote $\mathfrak{G}^*$, seen as a Poisson manifold through (9)–(11), by $\mathfrak{G}^*_-$. Alternatively, we may insert plus signs on the right-hand sides of (10) and (11), defining the Poisson manifold $\mathfrak{G}^*_+$. The normal groupoid $G_N$ may be equipped with a different manifold structure by replacing $\mathrm{Exp}^W(\hbar X)$ in (18) by $\mathrm{Exp}^W(-\hbar X)$; the original Definition 6 yields a manifold $G_N^+$, and the modified one defines $G_N^-$. (The original smooth structure is equivalent to the modified one by the diffeomorphism $(0, X) \mapsto (0, -X)$ and $(\hbar, \gamma) \mapsto (\hbar, \gamma)$.) In (22) we may replace $\grave{f}(X/\hbar)$ by $\grave{f}(-X/\hbar)$, defining a quantization map $Q^W_\hbar(\cdot)_-$, differing from the original one $Q^W_\hbar(\cdot)_+ = Q^W_\hbar(\cdot)$. Theorems 1 and 2, Eq. (29), as well as Corollary 1 remain valid if all signs are simultaneously changed in this way.

Example 1 (Weyl quantization on a manifold). The pair groupoid $Q \times Q \stackrel{\leftarrow}{\rightrightarrows} Q$ on a set $Q$ is defined by the operations $\tau_s(q_1, q_2) := q_2$, $\tau_t(q_1, q_2) := q_1$, $\iota(q) := (q, q)$, $(q_1, q_2) \cdot (q_2, q_3) := (q_1, q_3)$, and $(q_1, q_2)^{-1} := (q_2, q_1)$. This is a Lie groupoid when $Q$ is a manifold. Any measure $\nu$ on $Q$ which is locally Lebesgue defines a left Haar system. One has $C^*(Q \times Q) \simeq B_0(L^2(Q))$, the $C^*$-algebra of all compact operators on $L^2(Q, \nu)$. The associated Lie algebroid is the tangent bundle $TQ$, with the usual bundle projection and Lie bracket, and the anchor is the identity. The Poisson bracket on $T^*Q$ is the canonical one.
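The identification of the pair groupoid algebra with an operator algebra can be made concrete in a toy model. The sketch below takes a finite set $Q$ with counting measure on each fiber (an assumption for illustration; a finite set is not a manifold) and evaluates the convolution (7) using only the groupoid operations. The involution (8) is taken with complex conjugation, as a $*$-algebra involution requires. In this model convolution reduces to matrix multiplication and the involution to the conjugate transpose. All names are hypothetical.

```python
from itertools import product

Q = [0, 1, 2]                  # hypothetical finite base space
G = list(product(Q, Q))        # arrows (q1, q2) of the pair groupoid


def source(a):
    return a[1]                # tau_s(q1, q2) = q2


def target(a):
    return a[0]                # tau_t(q1, q2) = q1


def multiply(a, b):
    assert source(a) == target(b), "arrows are not composable"
    return (a[0], b[1])


def inverse(a):
    return (a[1], a[0])


def convolve(f, g):
    """Eq. (7) with counting measure on each fiber tau_t^{-1}(q)."""
    result = {}
    for a in G:
        fiber = [b for b in G if target(b) == source(a)]
        result[a] = sum(f[multiply(a, b)] * g[inverse(b)] for b in fiber)
    return result


def involution(f):
    """Eq. (8): f*(a) is the complex conjugate of f(a^{-1})."""
    return {a: f[inverse(a)].conjugate() for a in G}


def matrix_product(f, g):
    """Ordinary matrix product, with f and g read as matrices indexed by Q."""
    return {(i, k): sum(f[(i, j)] * g[(j, k)] for j in Q) for i, k in G}


f = {(i, j): complex(i + 1, j) for i, j in G}
g = {(i, j): complex(i - j, 2 * i + j) for i, j in G}

print(convolve(f, g) == matrix_product(f, g))                               # True
print(involution(convolve(f, g)) == convolve(involution(g), involution(f)))  # True
```

The second check confirms that the involution reverses the order of the product, which is the property used when self-adjointness of $Q^W_\hbar(f)$ is discussed.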
To define $\mathrm{Exp}^W$ one chooses an affine connection $\nabla$ on $TQ$, with associated exponential map $\exp : TQ \to Q$. Then
\[ \mathrm{Exp}^L(X) = (\tau(X), \exp_{\tau(X)}(X)); \tag{30} \]
\[ \mathrm{Exp}^W(X) = (\exp_{\tau(X)}(-\tfrac12 X), \exp_{\tau(X)}(\tfrac12 X)), \tag{31} \]
where $X \in TQ$ and $\tau := \tau_{TQ\to Q}$.

On $Q = \mathbb{R}^n$ with flat metric and corresponding flat Riemannian connection this simplifies to $\mathrm{Exp}^W(v, q) = (q - \tfrac12 v, q + \tfrac12 v)$, where we have used canonical co-ordinates on $T\mathbb{R}^n$. The operator $Q^W_\hbar(f)_-$ on $L^2(\mathbb{R}^n)$ defined by (22), where one may take $\kappa = 1$, with (21), is then given by
\[ Q^W_\hbar(f)_- \Psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^n p\, d^n y}{(2\pi\hbar)^n}\, e^{ip(x-y)/\hbar}\, f(p, \tfrac12(x+y))\, \Psi(y). \tag{32} \]
This is Weyl's original prescription. The associated continuous field of $C^*$-algebras is $A_0 = C_0(T^*\mathbb{R}^n)$ and $A_\hbar = B_0(L^2(\mathbb{R}^n))$ for $\hbar \neq 0$. The fact that this quantization map is strict, and in particular satisfies (3), was proved by Rieffel [29]; also cf. [15]. Replacing $I = [0,1]$, as we have used so far in connection with Definition 2, by $I = \mathbb{R}$, the $C^*$-algebra $C$ in Definition 1 is $C^*(H_n)$, the group algebra of the simply connected Heisenberg group on $\mathbb{R}^n$ [6]. This is indeed the $C^*$-algebra of the tangent groupoid of $\mathbb{R}^n$ (see below).

When $Q$ is an arbitrary manifold, the normal groupoid $(Q \times Q)_N$ is the tangent groupoid of $Q$ [2]. If one takes the affine connection on $TQ$ to be the Levi-Civita connection given by a Riemannian metric on $Q$, one recovers the extension of Weyl's prescription considered in [12,15]. One now has $A_0 = C_0(T^*Q)$ and $A_\hbar = B_0(L^2(Q))$ for $\hbar \neq 0$, and $Q^W_\hbar$ duly satisfies (3); see [12,15], where references to alternative generalizations of Weyl's quantization prescriptions may be found.

Example 2 (Rieffel's quantization of the Lie–Poisson structure on a dual Lie algebra). A Lie group is a Lie groupoid with $Q = e$. A left-invariant Haar measure on $G$ provides a left Haar system; the ensuing convolution algebra $C^*(G)$ is the usual group algebra. The Lie algebroid is the Lie algebra. The Poisson structure on $\mathfrak{g}^*_\pm$ is the well-known Lie–Poisson structure [18,15]. No connection is needed to define the exponential map, and one has
\[ \mathrm{Exp}^L(X) = \mathrm{Exp}^W(X) = \mathrm{Exp}(X), \tag{33} \]
where $X \in \mathfrak{g}$ and $\mathrm{Exp} : \mathfrak{g} \to G$ is the usual exponential map. When $G$ is exponential (in that Exp is a diffeomorphism), one may omit $\kappa$ in (22). Taking the + sign, the function $Q^W_\hbar(f)_+ \in C^*(G)$ is then given by
\[ Q^W_\hbar(f)_+ : \mathrm{Exp}(X) \mapsto \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\langle\theta, X\rangle/\hbar}\, f(\theta). \tag{34} \]
This is Rieffel's prescription [28], who proved strictness of the quantization for nilpotent groups. When $G$ is compact one needs the cut-off function $\kappa$, obtaining another quantization already known to be strict before the present paper and [25] appeared; see [14] or [15].
Example 3 (Weyl quantization on a gauge groupoid). The gauge groupoid $\mathsf{P} \times_H \mathsf{P} \stackrel{\leftarrow}{\rightrightarrows} Q$ of a smooth principal bundle $\mathsf{P}$ over a base $Q$ with structure group $H$ is defined by the projections $\tau_s([x, y]_H) = \tau(y)$ and $\tau_t([x, y]_H) = \tau(x)$, and the inclusion $\iota(\tau(x)) = [x, x]_H$. Accordingly, the multiplication $[x, y]_H \cdot [x', y']_H$ is defined when $y$ and $x'$ lie in the same fiber of $\mathsf{P}$, in which case $[x', y']_H = [y, z]_H$ for some $z = y'h$, $h \in H$. Then $[x, y]_H \cdot [y, z]_H = [x, z]_H$. Finally, the inverse is $[x, y]_H^{-1} = [y, x]_H$. See [17].

An $H$-invariant measure $\mu$ on $\mathsf{P}$ which is locally Lebesgue produces a left Haar system. In general, each measurable section $s : Q \to \mathsf{P}$ determines an isomorphism $C^*(\mathsf{P} \times_H \mathsf{P}) \simeq B_0(L^2(Q)) \otimes C^*(H)$; this is a special case of Thm. 3.1 in [21] (also cf. [15], Thm. 3.7.1). When $H$ is compact one has $C^*(\mathsf{P} \times_H \mathsf{P}) \simeq B_0(L^2(\mathsf{P}))^H$, where $L^2(\mathsf{P})$ is defined with respect to some $H$-invariant locally Lebesgue measure on $\mathsf{P}$.

The associated Lie algebroid $(T\mathsf{P})/H \stackrel{\to TQ}{\to} Q$ is defined by the obvious projections (both inherited from the projection $\tau : \mathsf{P} \to Q$), the Lie bracket on $\Gamma((T\mathsf{P})/H)$ obtained by identifying this space with $\Gamma(T\mathsf{P})^H$, and borrowing the commutator from $\Gamma(T\mathsf{P})$; cf. [17]. The Poisson structure on $((T\mathsf{P})/H)^* = (T^*\mathsf{P})/H$ is given by the restriction of the canonical Poisson bracket on $C^\infty(T^*\mathsf{P})$ to $C^\infty(T^*\mathsf{P})^H$, under the isomorphism $C^\infty((T^*\mathsf{P})/H) \simeq C^\infty(T^*\mathsf{P})^H$.

One chooses an $H$-invariant affine connection on $T\mathsf{P}$, with exponential map $\exp : T\mathsf{P} \to \mathsf{P}$. This induces a connection on $(T\mathsf{P})/H$, in terms of which
\[ \mathrm{Exp}^L([X]_H) = [\tau(X), \exp_{\tau(X)}(X)]_H; \tag{35} \]
\[ \mathrm{Exp}^W([X]_H) = [\exp_{\tau(X)}(-\tfrac12 X), \exp_{\tau(X)}(\tfrac12 X)]_H, \tag{36} \]
where $\tau = \tau_{T\mathsf{P}\to\mathsf{P}}$, and $[X]_H \in (T\mathsf{P})/H$ is the equivalence class of $X \in T\mathsf{P}$ under the $H$-action on $T\mathsf{P}$.

In the Riemannian case, for compact $H$ the corresponding map $Q^W_\hbar(\cdot)_-$ is simply the restriction of $Q^W_\hbar(\cdot)_- : C^\infty_{PW}(T^*\mathsf{P}) \to B_0(L^2(\mathsf{P}))$ as defined in Example 1 to $C^\infty_{PW}(T^*\mathsf{P})^H$. Since $Q^W_\hbar$ is invariant under isometries [15], the image of $C^\infty_{PW}(T^*\mathsf{P})^H$ is contained in $B_0(L^2(\mathsf{P}))^H$. The ensuing quantization of $(T^*\mathsf{P})/H$ was already known to be strict; see [12,15]. Physically, this example describes the quantization of a nonabelian charged particle moving in a gravitational as well as a Yang–Mills field.

Example 4 (Transformation group $C^*$-algebras). Let a Lie group $G$ act smoothly on a set $Q$. The transformation groupoid $G \times Q \stackrel{\leftarrow}{\rightrightarrows} Q$ is defined by the operations $\tau_s(x, q) = x^{-1}q$ and $\tau_t(x, q) = q$, so that the product $(x, q) \cdot (y, q')$ is defined when $q' = x^{-1}q$. Then $(x, q) \cdot (y, x^{-1}q) = (xy, q)$. The inclusion is $\iota(q) = (e, q)$, and for the inverse one has $(x, q)^{-1} = (x^{-1}, x^{-1}q)$.

Each left-invariant Haar measure $dx$ on $G$ leads to a left Haar system. The corresponding groupoid $C^*$-algebra is the usual transformation group $C^*$-algebra $C^*(G, Q)$, cf. [26].

The Lie algebroid $\mathfrak{g} \times Q \stackrel{\to TQ}{\to} Q$ is a trivial bundle over $Q$, with anchor $\tau_a(X, q) = -\xi_X(q)$ (the fundamental vector field on $Q$ defined by $X \in \mathfrak{g}$). Identifying sections of $\mathfrak{g} \times Q$ with $\mathfrak{g}$-valued functions $X(\cdot)$ on $Q$, the Lie bracket on $\Gamma(\mathfrak{g} \times Q)$ is
\[ [X, Y]_{\mathfrak{g}\times Q}(q) = [X(q), Y(q)]_{\mathfrak{g}} + \xi_Y X(q) - \xi_X Y(q). \tag{37} \]
The associated Poisson bracket coincides with the semi-direct product bracket defined in [11].
The trivial connection on $\mathfrak{g} \times Q \to Q$ yields
$$\mathrm{Exp}^L(X,q) = (\mathrm{Exp}(X), q); \tag{38}$$
$$\mathrm{Exp}^W(X,q) = (\mathrm{Exp}(X), \mathrm{Exp}(\tfrac12 X)q). \tag{39}$$
The cutoff $\kappa$ in (22) is independent of $q$, and coincides with the function appearing in Example 2. For small enough $\hbar$ a function $f \in C^\infty_{PW}(\mathfrak{g}^* \times Q)$ is then quantized by
$$Q^W_\hbar(f)_\pm : (\mathrm{Exp}(X), q) \mapsto \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\langle\theta, X\rangle/\hbar} f(\pm\theta, \mathrm{Exp}(-\tfrac12 X)q). \tag{40}$$
When $G = \mathbb{R}^n$ and $Q$ has a $G$-invariant measure, the map $f \mapsto Q^W_\hbar(f)_\pm$ is equivalent to the deformation quantization considered by Rieffel [27], who already proved that it is strict (also cf. [15]).

Note added in proof. All results remain true when the groupoid $C^*$-algebras are replaced by reduced ones. This is clear both from the proof of Lemma 3 and from the argument at the end of Sect. 4 (which should be attributed to E. Blanchard).

References
1. Cannas da Silva, A., Hartshorn, K., Weinstein, A.: Lectures on Geometric Models for Noncommutative Algebras. Providence: AMS, 1998
2. Connes, A.: Noncommutative Geometry. San Diego: Academic Press, 1994
3. Coste, A., Dazord, P., Weinstein, A.: Groupoides symplectiques. Publ. Dépt. Math. Univ. C. Bernard-Lyon I 2A, 1–62 (1987)
4. Courant, T.J.: Dirac Manifolds. Trans. Am. Math. Soc. 319, 631–661 (1990)
5. Dixmier, J.: C*-Algebras. Amsterdam: North-Holland, 1977
6. Elliott, G.A., Natsume, T., Nest, R.: The Heisenberg group and K-theory. K-Theory 7, 409–428 (1993)
7. Elliott, G.A., Natsume, T., Nest, R.: The Atiyah–Singer index theorem as passage to the classical limit in quantum mechanics. Commun. Math. Phys. 182, 505–533 (1996)
8. Fedosov, B.V.: Deformation Quantization and Index Theory. Berlin: Akademie-Verlag 1996
9. Hilsum, M., Skandalis, G.: Morphismes K-orientés d’espaces de feuilles et fonctorialité en théorie de Kasparov. Ann. Scient. Éc. Norm. Sup. (4e s.) 20, 325–390 (1988)
10. Kirchberg, E., Wassermann, S.: Operations on continuous bundles of C*-algebras. Math. Ann. 303, 677–697 (1995)
11. Krishnaprasad, P.S., Marsden, J.E.: Hamiltonian structure and stability for rigid bodies with flexible attachments. Arch. Rat. Mech. An. 98, 137–158 (1987)
12. Landsman, N.P.: Strict deformation quantization of a particle in external gravitational and Yang–Mills fields. J. Geom. Phys. 12, 93–132 (1993)
13. Landsman, N.P.: Classical and quantum representation theory. In: de Kerf, E. A., Pijls, H.G.J. (eds.) Proc. Seminar Mathematical Structures in Field Theory, CWI-syllabus 39, Amsterdam: Mathematisch Centrum CWI, 1996, pp. 135–163
14. Landsman, N.P.: Twisted Lie group C*-algebras as strict quantizations. Lett. Math. Phys. 46, 181–188 (1998)
15. Landsman, N.P.: Mathematical Topics Between Classical and Quantum Mechanics. New York: Springer, 1998
16. Lee, R.-Y.: On the C*-algebras of operator fields. Indiana Univ. Math. J. 25, 303–314 (1976)
17. Mackenzie, K.: Lie Groupoids and Lie Algebroids in Differential Geometry. Cambridge: Cambridge University Press, 1987
18. Marsden, J.E., Ratiu, T.S.: Introduction to Mechanics and Symmetry. New York: Springer, 1994
19. Monthubert, B.: Groupoïdes et calcul pseudo-différentiel sur les variétés à coins. PhD Thesis. Paris: Université Paris VII-Denis Diderot, 1998
20. Monthubert, B., Pierrot, F.: Indice analytique et groupoïdes de Lie. C.R. Acad. Sci. Paris Série I 325, 193–198 (1997)
21. Muhly, P.S., Renault, J.N., Williams, D.P.: Equivalence and isomorphism for groupoid C*-algebras. J. Operator Th. 17, 3–22 (1987)
22. Nagy, G.: E-theory with ∗-homomorphisms. J. Funct. Anal. 140, 275–299 (1996)
23. Nistor, V., Weinstein, A., Xu, P.: Pseudodifferential operators on differential groupoids. Preprint math.OA/9702054 (1998)
24. Pradines, J.: Géométrie différentielle au-dessus d’un groupoïde. C. R. Acad. Sci. Paris A266, 1194–1196 (1968)
25. Ramazan, B.: Quantification par Déformation des variétés de Lie–Poisson. Ph.D Thesis. Orléans: Université d’Orléans, 1998
26. Renault, J.: A Groupoid Approach to C*-algebras. Lecture Notes in Mathematics 793, Berlin: Springer, 1980
27. Rieffel, M.A.: Deformation quantization of Heisenberg manifolds. Commun. Math. Phys. 122, 531–562 (1989)
28. Rieffel, M.A.: Lie group convolution algebras as deformation quantizations of linear Poisson structures. Am. J. Math. 112, 657–686 (1990)
29. Rieffel, M.A.: Deformation quantization for actions of R^d. Mem. Am. Math. Soc. 106 (506), (1993)
30. Rieffel, M.A.: Quantization and C*-algebras. In: Doran, R.S. (ed.) C*-algebras: 1943–1993. Cont. Math. 167, Providence, RI: American Mathematical Society, 1994, pp. 67–97
31. Rieffel, M.A.: Quantization and operator algebras. In: Bracken, A.J., De Wit, D., Gould, M., Pearce, P. (eds.) Proc. XIIth Int. Congress of Mathematical Physics, Brisbane 1997
32. Vaisman, I.: Lectures on the Geometry of Poisson Manifolds. Basel: Birkhäuser, 1994
33. Weinstein, A.: Blowing up realizations of Heisenberg-Poisson manifolds. Bull. Sc. math. (2) 113, 381–406 (1989)
34. Weinstein, A.: Noncommutative geometry and geometric quantization. In: Donato, P. et al. (eds.) Symplectic Geometry and Mathematical Physics, Basel: Birkhäuser, 1991, pp. 446–461

Communicated by A. Connes
Commun. Math. Phys. 206, 383 – 407 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Exact Solution of Models Based on Non-Standard Representations
J. Gruneberg
Institut für Theoretische Physik, Universität zu Köln, Zülpicher Straße 77, 50937 Köln, Germany
Received: 24 November 1998 / Accepted: 26 March 1999
Abstract: The algebraic Bethe ansatz is a powerful method to diagonalize transfer-matrices of statistical models derived from solutions of (graded) Yang–Baxter equations, connected to fundamental representations of Lie (super-)algebras and their quantum deformations respectively. It is, however, very difficult to apply it to models based on higher dimensional representations of these algebras in auxiliary space, which are not of fusion type. A systematic approach to this problem is presented here. It is illustrated by the diagonalization of a transfer-matrix of a model based on the product of two different four-dimensional representations of $U_q(\widehat{gl}{}'(2,1;\mathbb{C}))$.

1. Introduction

The starting point for the construction of (Bethe ansatz) integrable models is the famous Yang–Baxter equation (YBE) [1,2],
$$R_{12}^{VV'}(u,v)\, R_{13}^{VV''}(u,w)\, R_{23}^{V'V''}(v,w) = R_{23}^{V'V''}(v,w)\, R_{13}^{VV''}(u,w)\, R_{12}^{VV'}(u,v). \tag{1a}$$
$V$, $V'$ and $V''$ are in general three different spaces. The operators $R^{VV'}(u)$ act on the direct product, $V \times V' \to V \times V'$. Both sides of Eq. (1a) act on the three-fold product $V \times V' \times V''$. The lower indices $i, j \in \{1, 2, 3\}$ on the R-operators denote as usual the two factors in this product on which the corresponding R-operator acts non-trivially. In general the so-called spectral parameters $u$, $v$ and $w$ are complex variables. Up to now, there is no general classification of the solutions to (1a). The situation is much better understood if $V$, $V'$ and $V''$ are carrier spaces for the representation of a simple Lie algebra or its quantum deformation. The corresponding theory is mainly due to Drinfel’d [3], who also introduced the concept of the universal R-matrix. The existence of the latter guarantees the existence of R-operators as matrices acting on direct products of usually, but not always, finite dimensional carrier spaces $V$. A good account of these developments has been given by Chari and Pressley [4]. Powerful methods to construct these matrices explicitly were developed by Jimbo [5] and many others, see e.g. the book by Ma [6]. The dependence on only one complex parameter is due to the use of evaluation representations of affine algebras. In this case (1a) takes the more common difference form
$$R_{12}^{VV'}(u-v)\, R_{13}^{VV''}(u)\, R_{23}^{V'V''}(v) = R_{23}^{V'V''}(v)\, R_{13}^{VV''}(u)\, R_{12}^{VV'}(u-v). \tag{1b}$$
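As a concrete sanity check on the difference form (1b), the following minimal sketch evaluates both sides numerically for the two-dimensional 6-vertex-type R-matrix that appears later in the paper as (22), with the coefficients $a(u)$, $b(u)$, $c(u)$ listed after (6). It only illustrates the index conventions; the helper names (`kron`, `unit`, `r_matrix`, `ybe_defect`) and the sample values of $u$, $v$, $\eta$ are assumptions made for this example and do not come from the paper.

```python
import math

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def unit(i, j, dim=2):
    """Matrix unit e_ij with 1-based indices."""
    E = [[0.0] * dim for _ in range(dim)]
    E[i - 1][j - 1] = 1.0
    return E

def add(*terms):
    """Sum of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def r_matrix(u, eta):
    """4x4 matrix of the form (22): rows/columns ordered as auxiliary (x) quantum."""
    s = math.sinh(2 * eta + u)
    a = math.sinh(2 * eta) * math.exp(u) / s    # cosh(u) + sinh(u) = exp(u)
    b = math.sinh(2 * eta) * math.exp(-u) / s   # cosh(u) - sinh(u) = exp(-u)
    c = math.sinh(u) / s
    return add(
        (1.0, kron(unit(1, 1), unit(1, 1))),
        (1.0, kron(unit(2, 2), unit(2, 2))),
        (c, kron(unit(1, 1), unit(2, 2))),
        (c, kron(unit(2, 2), unit(1, 1))),
        (a, kron(unit(2, 1), unit(1, 2))),
        (b, kron(unit(1, 2), unit(2, 1))),
    )

def embed(R):
    """Return R acting on factors (1,2), (1,3) and (2,3) of a three-fold product."""
    I2 = [[1.0, 0.0], [0.0, 1.0]]
    P = add(*[(1.0, kron(unit(i, j), unit(j, i))) for i in (1, 2) for j in (1, 2)])
    P23 = kron(I2, P)                      # swaps factors 2 and 3
    R12 = kron(R, I2)
    R23 = kron(I2, R)
    R13 = matmul(P23, matmul(R12, P23))
    return R12, R13, R23

def ybe_defect(u, v, eta):
    """Largest entry of LHS - RHS of (1b); should vanish up to rounding."""
    R12 = embed(r_matrix(u - v, eta))[0]
    R13 = embed(r_matrix(u, eta))[1]
    R23 = embed(r_matrix(v, eta))[2]
    lhs = matmul(R12, matmul(R13, R23))
    rhs = matmul(R23, matmul(R13, R12))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))
```

Calling `ybe_defect(0.7, 0.3, 0.4)` returns a number at the level of floating-point rounding. The same scaffolding could in principle be reused for the graded matrices of Sect. 2, provided the sign conventions of (4f) are taken into account.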
The first space $V$ is called auxiliary space, the second, relabeled to $V^{(n)}$ and in general taken out of some countable set $\{V^{(m)}\}_{m=1}^{N}$, a (local) quantum space. An L-operator acting on the direct product of these is defined as
$$\hat L^{V}(n|u) := R^{V V^{(n)}}(u, w^{(n)}). \tag{2a}$$
It is assumed to act trivially on all other quantum spaces $V^{(m)}$ with $m \neq n$. Assuming that $w^{(n)}$ in (2a) just labels $V^{(n)}$ and that $u$ is a spectral parameter of difference type as in (1b), it is possible to introduce additional inhomogeneities $\delta^{(n)}$ into the monodromy-matrix
$$\hat T^{V}(N|u) := \hat L^{V}(N|u - \delta^{(N)}) \cdots \hat L^{V}(1|u - \delta^{(1)}). \tag{2b}$$
Here $\delta^{(n)}$ and $w^{(n)}$ will be some complex numbers.
$$\hat\tau^{V}(N|u) = \mathrm{tr}_V \left\{ \hat T^{V}(N|u) \right\} \tag{2c}$$
can be viewed as a row-to-row transfer-matrix of a two dimensional (classical) statistical model, with $N$ sites per row, acting on the (global) quantum space $V^{(N)} \times \cdots \times V^{(1)}$. If $\delta^{(n)}$ vanishes and $w^{(n)}$ is independent of $n$, the transfer-matrix (2c) is called homogeneous. In any case integrability of the latter is established via (1), written as
$$R_{12}^{VV'}(u,v)\, \hat L_1^{V}(n|u)\, \hat L_2^{V'}(n|v) = \hat L_2^{V'}(n|v)\, \hat L_1^{V}(n|u)\, R_{12}^{VV'}(u,v). \tag{3a}$$
From that the fundamental commutation relations (FCR) are obtained immediately:
$$R_{12}^{VV'}(u,v)\, \hat T_1^{V}(N|u)\, \hat T_2^{V'}(N|v) = \hat T_2^{V'}(N|v)\, \hat T_1^{V}(N|u)\, R_{12}^{VV'}(u,v). \tag{3b}$$
Provided $R_{12}^{VV'}$ is invertible, which is guaranteed for finite dimensional $V$ and $V'$, this yields
$$\left[ \hat\tau^{V}(N|u), \hat\tau^{V'}(N|v) \right] \stackrel{!}{=} 0. \tag{3c}$$
Expanding $\hat\tau^{V'}(N|v)$ in $v$ one obtains an infinite family of operators commuting with $\hat\tau^{V}(N|u)$. The question of whether this family contains the right number of “independent” integrals of motion for every finite $N$ is difficult to answer and usually taken for granted. The set of equations (2) and (3) was derived by Baxter and can be found together with the original references in his excellent book [7]. The notation here is due to Faddeev and coworkers, who created a purely algebraic way for diagonalizing $\hat\tau^{V}(N|u)$, the algebraic Bethe ansatz (ABA). Their quantum inverse scattering method (QISM) [8] provided the background for Drinfel’d’s theory [3], but is more general and in the author’s opinion not fully exploited yet. A good account including original references can be found in the book by Korepin et al. [9] and the reprint collection [10].
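The construction (2a)-(2c) and the commutativity (3c) can be made tangible with a small numerical experiment. The sketch below is only an illustration under stated assumptions: it uses the non-graded two-dimensional R-matrix of the form (22) both as L-operator and as intertwiner, takes the homogeneous case with an ordinary trace (no supertrace), and all function names and sample parameters are hypothetical choices for this example.

```python
import math

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def l_operator(u, eta):
    """L[i][j] is the 2x2 quantum-space block multiplying e_ij in auxiliary space."""
    s = math.sinh(2 * eta + u)
    a = math.sinh(2 * eta) * math.exp(u) / s
    b = math.sinh(2 * eta) * math.exp(-u) / s
    c = math.sinh(u) / s
    return [[[[1.0, 0.0], [0.0, c]], [[0.0, 0.0], [b, 0.0]]],
            [[[0.0, a], [0.0, 0.0]], [[c, 0.0], [0.0, 1.0]]]]

def transfer_matrix(u, eta, n_sites):
    """Trace over auxiliary space of the monodromy matrix (2b), homogeneous case."""
    L = l_operator(u, eta)
    # Start from the identity in auxiliary space on a trivial (1-dim) quantum space.
    T = [[[[1.0]], [[0.0]]], [[[0.0]], [[1.0]]]]
    for _ in range(n_sites):
        # Multiply in auxiliary space; each new site is put to the left, as in (2b).
        T = [[matadd(kron(L[i][0], T[0][j]), kron(L[i][1], T[1][j]))
              for j in range(2)] for i in range(2)]
    return matadd(T[0][0], T[1][1])

def commutator_norm(u, v, eta, n_sites):
    """Largest entry of [tau(u), tau(v)]; expected to vanish up to rounding."""
    tu = transfer_matrix(u, eta, n_sites)
    tv = transfer_matrix(v, eta, n_sites)
    uv, vu = matmul(tu, tv), matmul(tv, tu)
    return max(abs(x - y) for ru, rv in zip(uv, vu) for x, y in zip(ru, rv))
```

For example, `commutator_norm(0.7, -0.2, 0.4, 3)` compares two 8 x 8 transfer matrices on three sites. For the graded models considered below, the trace would have to be replaced by the supertrace (4h).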
ABA is a powerful method to construct eigenvectors and eigenvalues of $\hat\tau^{V}(N|u)$. In some sense it is more systematic than the original coordinate Bethe ansatz [11]. In general this is only true if the auxiliary space $V$ is the carrier space of the fundamental representation of a Lie (super-)algebra or a deformation of the latter. Especially if the auxiliary space $V$ is a higher dimensional carrier space of another representation of the same algebra, simplicity is lost and ABA becomes cumbersome. Drinfel’d’s theory [3] suggests that a simple generalization should exist. A systematic approach to this problem will be developed in the following.

2. Models

In the case of general graded algebras Drinfel’d’s constructions [3] are still not completely understood. However for simple (affine) Lie superalgebras and their quantum deformations a proper algebraic construction has been given by Yamane recently [12]. Also QISM and ABA are not very sensitive to grading and the graded version of the YBE has been established by Kulish and Sklyanin long ago [13].

The R-matrices, which will be used as concrete examples, are related to the “quantum universal enveloping superalgebra” $U_q(\widehat{gl}{}'(2,1|\mathbb{C}))$. No use will be made of any peculiar features of this symmetry. The interested reader is referred to the book by Cornwell [14] on Lie superalgebras, from which the notation is borrowed, the book by Kac [15] for more details on affinization and to the paper [12] for the proper construction of the q-deformed universal enveloping superalgebra.

The carrier space $V_3$ of the fundamental representation of $U_q(gl(2,1|\mathbb{C}))$ is complex and three-dimensional. Basis and cobasis will be denoted by
$$|i\rangle, \quad \langle j|i\rangle = \delta_{ij} \quad \text{for } i, j = 1, 2, 3. \tag{4a}$$
A basis of the complex carrier space $V_4$ of the four-dimensional representation will be denoted similarly. These representations are $\mathbb{Z}_2$-graded: To each basis-vector $|i\rangle$ a number $p(i) \in \{0, 1\}$ is assigned, i.e.
$$p(1) = p(2) = 0, \quad p(3) = 1 \tag{4b}$$
for $V_3$ and analogously
$$p(1) = p(2) = 0, \quad p(3) = p(4) = 1 \tag{4c}$$
for $V_4$. Local basis-vectors are divided into even (bosonic, $p = 0$) and odd (fermionic, $p = 1$) ones. Local operators acting in $V_3$ or $V_4$, etc. are expressed in the natural basis
$$e_{ij} = |i\rangle \langle j|. \tag{4d}$$
If the corresponding space is a (local) quantum space, it will be denoted with a hat, e.g. $\hat e_{ij}$, for clarity. These operators act trivially on all other (local) quantum spaces. A grading is assigned to this basis according to
$$p(e_{ij}) = [p(i) + p(j)] \bmod 2. \tag{4e}$$
It is possible to extend these definitions of grading naturally to those vectors $|\psi\rangle$ and operators $\hat a$ which are homogeneous with respect to the grading.
It is convenient to expand operators as well as vectors in the natural (tensor) product basis, which is ordered according to $V^{(N)} \times \cdots \times V^{(1)}$, see (2b). Grading imposes signs on products of homogeneous operators, i.e.:
$$(\hat a \otimes \hat b)(\hat c \otimes \hat d) = (-1)^{p(\hat b) p(\hat c)} (\hat a \hat c) \otimes (\hat b \hat d) \tag{4f}$$
or on the action of homogeneous operators on homogeneous vectors, i.e.
$$(\hat a \otimes \hat b)(|\psi\rangle \otimes |\varphi\rangle) = (-1)^{p(\hat b) p(|\psi\rangle)} (\hat a |\psi\rangle) \otimes (\hat b |\varphi\rangle). \tag{4g}$$
The only other effect of grading is that $\mathrm{tr}_V$ in (2c) has to be interpreted as supertrace:
$$\mathrm{tr}_V \left\{ \hat T^{V}(N|u) \right\} = \sum_i (-1)^{p(i)} \langle i| \hat T^{V}(N|u) |i\rangle. \tag{4h}$$
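The grading rules (4b)-(4e) and the supertrace (4h) are simple enough to state as code. The sketch below is merely illustrative: the names `P_V3`, `P_V4`, `unit_parity`, `graded_sign` and `supertrace` are hypothetical, and plain nested lists stand in for the matrices acting in auxiliary space.

```python
# Parities p(i) of the basis vectors, (4b) for V3 and (4c) for V4 (1-based labels).
P_V3 = (0, 0, 1)
P_V4 = (0, 0, 1, 1)

def unit_parity(parities, i, j):
    """Grading of the matrix unit e_ij according to (4e)."""
    return (parities[i - 1] + parities[j - 1]) % 2

def graded_sign(p_left, p_right):
    """Sign (-1)^{p p'} picked up when two homogeneous objects pass each other, cf. (4f), (4g)."""
    return -1 if (p_left * p_right) % 2 else 1

def supertrace(parities, matrix):
    """Supertrace (4h): diagonal entries weighted with (-1)^{p(i)}."""
    return sum((-1) ** p * matrix[k][k] for k, p in enumerate(parities))

identity3 = [[1 if r == c else 0 for c in range(3)] for r in range(3)]
identity4 = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
```

With these conventions the supertrace of the identity is the number of even minus the number of odd basis vectors, i.e. `supertrace(P_V3, identity3)` gives 1 and `supertrace(P_V4, identity4)` gives 0.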
Kulish and Sklyanin found [13] that additional signs, which appear in an explicit representation of the YBE (1) due to grading, can be absorbed into a redefinition of matrix elements, so that every solution of the graded YBE is equivalent to a solution of the conventional one.

The four dimensional representation can be characterized by a set of complex parameters, symbolically denoted by
$$V_4 \approx \{C, \kappa, \kappa^*, \mu, \mu^*\}. \tag{5a}$$
This is a peculiarity of Lie superalgebras [14], which is conserved under quantum deformation; $\kappa, \kappa^*$ and $\mu, \mu^*$ are not necessarily complex conjugate to each other, but related to $C$ by
$$\kappa\kappa^* = [C]_q, \quad \mu\mu^* = [C + 1]_q, \tag{5b}$$
where $q$ is the deformation parameter,
$$q := e^{2\eta}, \tag{5c}$$
and q-brackets are defined as usual by
$$[C]_q := \frac{q^{C} - q^{-C}}{q - q^{-1}} = \frac{\sinh(2\eta C)}{\sinh(2\eta)}. \tag{5d}$$
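A short numerical illustration of (5c) and (5d) may help to fix conventions. The sketch assumes real $\eta \neq 0$ and real $C$ for simplicity (the paper allows complex parameters), and the function names are hypothetical.

```python
import math

def q_bracket(C, eta):
    """[C]_q from the exponential form in (5d), with q = exp(2*eta) as in (5c)."""
    q = math.exp(2 * eta)
    return (q ** C - q ** (-C)) / (q - 1 / q)

def q_bracket_sinh(C, eta):
    """[C]_q from the hyperbolic form in (5d)."""
    return math.sinh(2 * eta * C) / math.sinh(2 * eta)

def satisfies_5b(C, eta, kappa, kappa_star, mu, mu_star, tol=1e-12):
    """Check the constraints (5b) for a given parameter set (5a)."""
    return (abs(kappa * kappa_star - q_bracket(C, eta)) < tol
            and abs(mu * mu_star - q_bracket(C + 1, eta)) < tol)
```

Both forms agree, and for small $\eta$ the bracket $[C]_q$ approaches $C$. One admissible (but by no means unique) choice in (5b) is $\kappa = \kappa^* = \sqrt{[C]_q}$, $\mu = \mu^* = \sqrt{[C+1]_q}$ whenever the brackets are positive.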
Different choices of $\kappa, \kappa^*, \mu, \mu^*$ can be related to each other by a similarity transformation of the algebra, which conserves grading, but is not unitary in general. That makes it convenient to keep these parameters. Note that the representation $V_4$ can be deformed continuously into $V_4'$, which is characterized by a set of primed parameters also connected by (5b).

A well-known solution of (1b) with $V = V' = V'' = V_3$ is
$$\begin{aligned} R^{V_3 V_3}(u) ={}& e_{11} \otimes \hat e_{11} + e_{22} \otimes \hat e_{22} - d(u)\, e_{33} \otimes \hat e_{33} \\ &+ c(u) \left[ e_{11} \otimes (\hat e_{22} + \hat e_{33}) + e_{22} \otimes (\hat e_{11} + \hat e_{33}) \right] \\ &+ a(u)\, e_{21} \otimes \hat e_{12} + b(u)\, e_{12} \otimes \hat e_{21} \\ &+ a(u) \left[ e_{31} \otimes \hat e_{13} + e_{32} \otimes \hat e_{23} \right] - b(u) \left[ e_{13} \otimes \hat e_{31} + e_{23} \otimes \hat e_{32} \right] \end{aligned} \tag{6}$$
with coefficients
$$a(u) := \frac{\sinh(2\eta)}{\sinh(2\eta + u)} \left[\cosh(u) + \sinh(u)\right], \quad b(u) := \frac{\sinh(2\eta)}{\sinh(2\eta + u)} \left[\cosh(u) - \sinh(u)\right],$$
$$c(u) := \frac{\sinh(u)}{\sinh(2\eta + u)}, \quad d(u) := \frac{\sinh(2\eta - u)}{\sinh(2\eta + u)}.$$
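To the author’s knowledge it appeared first in a different notation in the work of Perk and Schultz [16]. It is the standard q-deformation of the $Y(gl(2,1|\mathbb{C}))$-symmetric R-matrix given by Kulish and Sklyanin [13]. Kulish and Sklyanin wrote down the $Y(gl(m,n|\mathbb{C}))$-symmetric R-matrix for arbitrary positive integers $m$ and $n$. Its generalization to the $U_q(\widehat{gl}{}'(m,n|\mathbb{C}))$-symmetric case can also be taken from the paper by Perk and Schultz [16]. It is a simple generalization of (6).

The R-matrix (6) is related to the following $U_q(\widehat{gl}{}'(2,1|\mathbb{C}))$-symmetric R-matrix:
$$\begin{aligned} R^{V_3 V_4}(u) ={}& \rho(u) \left[ e_{11} \otimes (\hat e_{11} + \hat e_{33}) + e_{22} \otimes (\hat e_{11} + \hat e_{44}) \right] \\ &+ \alpha_0(u) \left[ e_{11} \otimes (\hat e_{22} + \hat e_{44}) + e_{22} \otimes (\hat e_{22} + \hat e_{33}) \right] \\ &+ e_{33} \otimes \left[ \beta_0(u) \hat e_{11} - \hat e_{22} + \gamma_0(u) (\hat e_{33} + \hat e_{44}) \right] \\ &+ \delta_1(u)\, e_{12} \otimes \hat e_{43} + \delta_2(u)\, e_{21} \otimes \hat e_{34} \\ &- \varepsilon_1(u) \left[ e_{13} \otimes \hat e_{23} + e_{23} \otimes \hat e_{24} \right] + \varepsilon_2(u) \left[ e_{31} \otimes \hat e_{32} + e_{32} \otimes \hat e_{42} \right] \\ &- \zeta_1(u) \left[ e_{13} \otimes \hat e_{41} - q^{-1} e_{23} \otimes \hat e_{31} \right] + \zeta_2(u) \left[ e_{31} \otimes \hat e_{14} - q\, e_{32} \otimes \hat e_{13} \right] \end{aligned} \tag{7}$$
with coefficients (A1), listed in Appendix A, in the sense that it fulfills the YBE (1b) with $V = V' = V_3$ and $V'' = V_4$:
$$R_{12}^{V_3 V_3}(u - v)\, R_{13}^{V_3 V_4}(u)\, R_{23}^{V_3 V_4}(v) = R_{23}^{V_3 V_4}(v)\, R_{13}^{V_3 V_4}(u)\, R_{12}^{V_3 V_3}(u - v). \tag{8}$$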
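The construction of the R-matrix (7), and the proof of (8) is standard (see e.g. [6,17]). From (7) a transfer-matrix $\hat\tau^{V_3}(N|u)$ is defined by (2). It is sufficient to consider the homogeneous case, i.e. $\delta^{(n)} = 0$ and $V^{(n)} = V_4$ for all $n$ in (2a). Integrability follows from (3c). It is easily tractable by ABA, which will be demonstrated in the next section.

Another $U_q(\widehat{gl}{}'(2,1|\mathbb{C}))$-symmetric R-matrix acting on the direct product of two different four dimensional representations, characterized by the corresponding parameter sets (5a), is given by
$$\begin{aligned} R^{V_4 V_4'}(u) ={}& f(u)\, e_{11} \otimes \hat e_{11} + g(u)\, e_{22} \otimes \hat e_{22} - e_{33} \otimes \hat e_{33} - e_{44} \otimes \hat e_{44} \\ &+ r_5\, e_{22} \otimes \hat e_{11} + r_5'\, e_{11} \otimes \hat e_{22} - r_{10} (e_{33} \otimes \hat e_{44} - e_{44} \otimes \hat e_{33}) \\ &- r_7 (e_{33} + e_{44}) \otimes \hat e_{11} - r_7'\, e_{11} \otimes (\hat e_{33} + \hat e_{44}) \\ &- r_9 (e_{33} + e_{44}) \otimes \hat e_{22} - r_9'\, e_{22} \otimes (\hat e_{33} + \hat e_{44}) \\ &+ r_1\, e_{21} \otimes \hat e_{12} + r_1'\, e_{12} \otimes \hat e_{21} - r_4\, e_{43} \otimes \hat e_{34} - r_4'\, e_{34} \otimes \hat e_{43} \\ &+ r_2 (e_{31} \otimes \hat e_{13} + e_{41} \otimes \hat e_{14}) - r_2' (e_{13} \otimes \hat e_{31} + e_{14} \otimes \hat e_{41}) \\ &+ r_3 (e_{32} \otimes \hat e_{23} + e_{42} \otimes \hat e_{24}) - r_3' (e_{23} \otimes \hat e_{32} + e_{24} \otimes \hat e_{42}) \\ &- r_6 (e_{24} \otimes \hat e_{13} - q^{-1} e_{23} \otimes \hat e_{14}) + r_8' (e_{42} \otimes \hat e_{31} - q\, e_{32} \otimes \hat e_{41}) \\ &+ r_6' (e_{13} \otimes \hat e_{24} - q\, e_{14} \otimes \hat e_{23}) - r_8 (e_{31} \otimes \hat e_{42} - q^{-1} e_{41} \otimes \hat e_{32}). \end{aligned} \tag{9}$$
The coefficients are again listed in Appendix A. The construction of this R-matrix, and a proof of (1b),
$$R_{12}^{V_4 V_4'}(u - v)\, R_{13}^{V_4 V_4''}(u)\, R_{23}^{V_4' V_4''}(v) = R_{23}^{V_4' V_4''}(v)\, R_{13}^{V_4 V_4''}(u)\, R_{12}^{V_4 V_4'}(u - v) \tag{10a}$$
or
$$R_{12}^{V_3 V_4}(u - v)\, R_{13}^{V_3 V_4'}(u)\, R_{23}^{V_4 V_4'}(v) = R_{23}^{V_4 V_4'}(v)\, R_{13}^{V_3 V_4'}(u)\, R_{12}^{V_3 V_4}(u - v) \tag{10b}$$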
with $R^{V_3 V_4}$ from (7) will be given elsewhere [17]. A special case ($V_4' = V_4$), leading to considerable simplifications, has been constructed explicitly by Gould et al. [18]. One may fix $u$ and $v$ in (10a) and regard $C$, $C'$ and $C''$ instead as spectral parameters in order to satisfy the general form (1a) of the YBE.

From (9) the transfer-matrix $\hat\tau^{V_4}(N|u)$ is defined by (2). It is again sufficient to consider the homogeneous case $\delta^{(n)} = 0$ and $V^{(n)} = V_4'$ for all $n$ in (2b). Integrability follows from (3c) with the choice between $\hat\tau^{V_4}(N|v)$ and $\hat\tau^{V_3}(N|v)$ as generating functionals for “integrals of motion”. Here ABA is not straightforward. This model requires a new strategy in order to obtain equations for all eigenvalues of $\hat\tau^{V_4}(N|u)$.

3. Algebraic Bethe Ansatz

The original recipe for ABA is simple [8]:
1. Determine a vacuum state, preferably a highest or lowest weight state of the underlying group structure, if available, which triangularizes $\hat L^{V}(n|u)$ locally, and extend it via the product structure (2b) to a global vacuum, triangularizing $\hat T^{V}(N|u)$.
2. Take the off-diagonal elements of $\hat T^{V}(N|u)$, not annihilating the global vacuum, as creation-operators and use the associative algebra defined by the FCR (3b), to generate eigenvectors to all eigenvalues of $\hat\tau^{V}(N|u)$ (2c). Equations determining the latter are also derived from the algebra.
The first point is more or less a precondition for the applicability of ABA; the second is crucial: Only if $V$ is a carrier space of the fundamental representation of a possibly deformed and graded Lie algebra, the choice of creation-operators is obvious. $\hat\tau^{V_3}(N|u)$ and $\hat\tau^{V_4}(N|u)$, as defined in the previous section, are sufficiently complex to illustrate the general situation.

Since the auxiliary space is graded, it is useful to transform the matrix-elements of $\hat L^{V_3}(n|u)$ (2a) in the $V_3$ basis according to
$$\left[ \hat L^{V_3}(n|u) \right]_{ij}^{(V_3)} \to (-1)^{p(j)[p(i) + p(j)]} \left[ \hat L^{V_3}(n|u) \right]_{ij}^{(V_3)}. \tag{11}$$
This absorbs just a troublesome minus sign from the commutation of $|3\rangle_{V_3}$ with $[\hat L^{V_3}(n|u)]_{13}$ and $[\hat L^{V_3}(n|u)]_{23}$. All four local basis vectors of $V_4$ (4a) are suitable as (local) vacuum, preferably
$$\Omega^{(n)} := |2\rangle^{(n)}. \tag{12}$$
$\Omega^{(n)}$ is a lowest weight state of the representation of $U_q(gl(2,1|\mathbb{C}))$ on $V_4^{(n)}$ and its equivalent was used by Kulish and Reshetikhin to treat the non-graded $Y(gl(3|\mathbb{C}))$-symmetric case [19]. Their calculation was generalized to the fundamental representation of $U_q(\widehat{gl}{}'(m,n))$ by Schultz [20],
$$\hat L^{V_3}(n|u)\, \Omega^{(n)} = \begin{pmatrix} \omega_1^{(n)}(u) & 0 & 0 \\ 0 & \omega_2^{(n)}(u) & 0 \\ * & * & \omega_3^{(n)}(u) \end{pmatrix} \Omega^{(n)} \tag{13}$$
with $*$ denoting non-zero entries. The vacuum-eigenvalues of the diagonal elements are given by
$$\omega_1^{(n)}(u) = \frac{\sinh(\eta C + u)}{\sinh(\eta(C + 2) - u)}, \quad \omega_2^{(n)}(u) = \omega_1^{(n)}(u), \quad \omega_3^{(n)}(u) = -1. \tag{14}$$
The index $(n)$ will be omitted, due to homogeneity. Immediately from (13), (2b) and the definition
$$|0\rangle_N = \Omega^{(N)} \otimes \Omega^{(N-1)} \otimes \cdots \otimes \Omega^{(1)} \tag{15}$$
of the (global) vacuum $|0\rangle_N$ follows
$$\hat T^{V_3}(N|u)\, |0\rangle_N = \begin{pmatrix} [\omega_1(u)]^N & 0 & 0 \\ 0 & [\omega_1(u)]^N & 0 \\ \hat C_1(u) & \hat C_2(u) & (-1)^N \end{pmatrix} |0\rangle_N, \tag{16}$$
where
$$\hat C_i(u) := [\hat T^{V_3}(N|u)]_{3i} \quad \text{for } i = 1, 2 \tag{17}$$
will later serve as creation-operators. ABA step 1 is finished: From (16), (2c) and (4h) follows the vacuum-eigenvalue of $\hat\tau^{V_3}(N|u)$:
$$\Lambda_N^{V_3}(u) = 2[\omega_1(u)]^N - (-1)^N. \tag{18}$$
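The step from (16) to (18) is just the supertrace (4h) applied to the diagonal of (16). A minimal sketch follows, assuming real parameters for simplicity and using hypothetical function names.

```python
import math

def omega1(u, C, eta):
    """Vacuum eigenvalue omega_1(u) from (14); omega_2 = omega_1 and omega_3 = -1."""
    return math.sinh(eta * C + u) / math.sinh(eta * (C + 2) - u)

def vacuum_eigenvalue_v3(u, C, eta, n_sites):
    """Supertrace (4h) of the diagonal of (16) with parities (4b) of V3."""
    parities = (0, 0, 1)
    diagonal = (omega1(u, C, eta) ** n_sites,
                omega1(u, C, eta) ** n_sites,
                (-1.0) ** n_sites)
    return sum((-1) ** p * d for p, d in zip(parities, diagonal))
```

For any number of sites $N$ this reproduces the closed form $2[\omega_1(u)]^N - (-1)^N$ of (18); the minus sign in front of $(-1)^N$ is the only trace of the grading.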
As mentioned before, Kulish and Reshetikhin solved a model built from the fundamental representation of $Y(gl(3|\mathbb{C}))$, whose R-matrix differs from the $\eta \to 0$-limit of (6) only in minor details. The FCR (3b) derived from (8):
$$R_{12}^{V_3 V_3}(u - v)\, \hat T_1^{V_3}(N|u)\, \hat T_2^{V_3}(N|v) = \hat T_2^{V_3}(N|v)\, \hat T_1^{V_3}(N|u)\, R_{12}^{V_3 V_3}(u - v) \tag{19}$$
are almost identical to the ones in [19]: Trigonometric functions in (6) do not show up, if appropriate abbreviations are used. Apart from a few signs due to grading, which was also realized in [19], the formal algebra defined by (19) becomes exactly the same. Of course it is possible to write down equations for eigenvectors and eigenvalues immediately, using the result of [19]. Again apart from a few signs, just the vacuum eigenvalues have to be replaced by (14). This is a well-known feature of ABA. However some more details will be needed, in order to tackle the more complicated problem of diagonalizing $\hat\tau^{V_4}(N|u)$ in the following section: The (nested, see below) algebraic Bethe ansatz for (right) eigenvectors of $\hat\tau^{V_3}(N|u)$ is [19]
$$|\lambda_1, \dots, \lambda_M | F\rangle = F^{a_1, \dots, a_M}\, \hat C_{a_1}(\lambda_1) \cdots \hat C_{a_M}(\lambda_M)\, |0\rangle_N, \tag{20}$$
where $\{\lambda_1, \dots, \lambda_M\}$ is some set of yet unknown parameters and $F^{a_1, \dots, a_M}$ are some coefficients, yet undetermined. Summation over repeated $a_i = 1, 2$ with $i = 1, \dots, M$ is implied. From (19) it follows immediately
$$\hat T_{33}(u)\, \hat C_i(v) = \frac{1}{c(u - v)}\, \hat C_i(v)\, \hat T_{33}(u) + \frac{a(v - u)}{c(v - u)}\, \hat C_i(u)\, \hat T_{33}(v), \tag{21a}$$
$$\hat T_{ij}(u)\, \hat C_k(v) = \frac{1}{c(u - v)} \sum_{l,m=1}^{2} r_{lm,jk}(u - v)\, \hat C_m(v)\, \hat T_{il}(u) - \frac{b(u - v)}{c(u - v)}\, \hat C_j(u)\, \hat T_{ik}(v),$$
$$\hat C_i(u)\, \hat C_j(v) = \frac{1}{d(u - v)} \sum_{k,l=1}^{2} r_{kl,ij}(u - v)\, \hat C_l(v)\, \hat C_k(u) \tag{21b}$$
with $i, j, k \in \{1, 2\}$. $a(u)$, $b(u)$, $c(u)$ and $d(u)$ originate from (6). For brevity $[\hat T^{V_3}(N|u)]_{ij}$ has been denoted by $\hat T_{ij}(u)$. In the present case $r_{ik,jl}(u)$ are elements of the non-graded $U_q(\widehat{gl}{}'(2|\mathbb{C}))$-symmetric R-matrix,
$$\begin{aligned} R^{V_2 V_2}(u) &= \sum_{i,j,k,l=1}^{2} r_{ik,jl}\, e_{ij} \otimes \hat e_{kl} \\ &= e_{11} \otimes \hat e_{11} + e_{22} \otimes \hat e_{22} + c(u) \left[ e_{11} \otimes \hat e_{22} + e_{22} \otimes \hat e_{11} \right] + a(u)\, e_{21} \otimes \hat e_{12} + b(u)\, e_{12} \otimes \hat e_{21} \end{aligned} \tag{22}$$
which acts on the direct product of two two-dimensional, purely even subspaces $V_2$ of $V_3$, spanned by $|1\rangle$ and $|2\rangle$ from (4a). It is crucial to realize the appearance of $R^{V_2 V_2}(u)$ as a proper submatrix in $R^{V_3 V_3}(u)$ (6), because it defines a simpler BA-solvable model.

Nested algebraic Bethe ansatz (NABA) is typical for models based on fundamental representations of dimension larger than 2. It was preceded by the ingenious, but complicated nested coordinate Bethe ansatz, invented by Gaudin [21] and Yang [1] independently. Their method was applied to the fundamental representation of the $Y(gl(m,n|\mathbb{C}))$-symmetric problem by Lai [23] and Sutherland [24]. The formal algebraic formulation of the method is apparently due to Takhtajan [22].

The transfer-matrix $\hat\tau^{V_3}(N|u)$ applied to the Bethe ansatz eigenvector (20) should yield
$$\hat\tau^{V_3}(N|u)\, |\lambda_1, \dots, \lambda_M | F\rangle = \Lambda^{V_3}(N|u)\, |\lambda_1, \dots, \lambda_M | F\rangle. \tag{23}$$
Leaving some technical details for Appendix B, it turns out that this is true, iff the coefficients $F$ in (20) fulfill “6-vertex-type” eigenvalue equations [19]:
$$\left[ \hat\tau^{V_2}(M|\lambda_k) \right]^{a_1, \dots, a_M}_{b_1, \dots, b_M} F^{b_1, \dots, b_M} = \frac{1}{[-\omega_1(\lambda_k)]^N}\, F^{a_1, \dots, a_M} \tag{24}$$
for $k = 1, \dots, M$, of course solvable by ABA [8]. This is the second nested Bethe ansatz. $\hat\tau^{V_2}(M|u)$ is an inhomogeneous transfer-matrix obtained according to (2) with $\delta^{(n)} = \gamma_n$ from (22). The eigenvalue of $\hat\tau^{V_2}(M|u)$ corresponding to the BA-eigenvector $F$ is given by
$$\Lambda_M^{V_2}(u; \mu_1, \dots, \mu_m) = \left( \prod_{n=1}^{M} c(u - \lambda_n) \right) \left( \prod_{\alpha=1}^{m} \frac{1}{c(u - \mu_\alpha)} \right) + \left( \prod_{\alpha=1}^{m} \frac{1}{c(\mu_\alpha - u)} \right) \tag{25}$$
with rapidities $\mu_\alpha$ ($\alpha = 1, \dots, m$), determined by the BA-equations
$$\prod_{n=1}^{M} c(\mu_\alpha - \lambda_n) = \prod_{\substack{\beta=1 \\ \beta \neq \alpha}}^{m} \frac{c(\mu_\alpha - \mu_\beta)}{c(\mu_\beta - \mu_\alpha)} \tag{26a}$$
for $\alpha = 1, \dots, m$. These and expressions for the actual BA-vectors $F$, also depending on $\mu_1, \dots, \mu_m$, may be found in the literature [8]. Using (25) the eigenvalue condition (23) reads
$$[-\omega_1(\lambda_k)]^N = \prod_{\alpha=1}^{m} c(\mu_\alpha - \lambda_k) \tag{26b}$$
for $k = 1, \dots, M$, which is the second set of BA-equations, determining $\lambda_1, \dots, \lambda_M$. Collecting the wanted terms in (B1) the eigenvalue of $\hat\tau^{V_3}(N|u)$ corresponding to the NABA-eigenvector (20) follows immediately:
$$\Lambda_N^{V_3}(u; \lambda_1, \dots, \lambda_M | \mu_1, \dots, \mu_m) = \left( \prod_{i=1}^{M} \frac{1}{c(u - \lambda_i)} \right) \times \left\{ [\omega_1(u)]^N \Lambda_M^{V_2}(u; \mu_1, \dots, \mu_m) - (-1)^N \right\}. \tag{27}$$
According to Baxter [7] BA-equations guarantee analyticity of all eigenvalues in $u$.

Here a q-deformed, graded version of the R-matrix (6) has been used and the $\hat C_i$-operators act on a different quantum space, i.e. $V_4$ instead of $V_3$. However not knowing about [20], the whole calculation has been borrowed from [19]. A highest weight state, i.e. $|1\rangle$ instead of $|2\rangle$ in (12) and (15), could have been used as vacuum, but this leads to a very similar calculation. The result (27) is new, but it differs just by the vacuum eigenvalues (14) and signs from the well-known one in [19]. It is also complete. This is not true for the set of eigenvectors (20). However the missing ones may be produced using the lowest weight property of the ABA-vectors with respect to the group action on quantum space, which can be proved by standard methods [8]. These are well-known and beautiful features of Bethe ansatz solvable systems. Also the equations for the inhomogeneous model with $w^{(n)} = C^{(n)}$ in (2a) can be written down immediately using an argument due to Baxter [7]:
$$\begin{aligned} \Lambda_N^{V_3}(u; \lambda_1, \dots, \lambda_M | \mu_1, \dots, \mu_m) ={}& \left( \prod_{n=1}^{N} \frac{\sinh(\eta C^{(n)} + u - \delta^{(n)})}{\sinh(\eta(C^{(n)} + 2) - u + \delta^{(n)})} \right) \\ &\times \left\{ \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + 2\eta)}{\sinh(u - \lambda_i)} \prod_{\alpha=1}^{m} \frac{\sinh(u - \mu_\alpha - 2\eta)}{\sinh(u - \mu_\alpha)} + \prod_{\alpha=1}^{m} \frac{\sinh(u - \mu_\alpha + 2\eta)}{\sinh(u - \mu_\alpha)} \right\} \\ &- (-1)^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + 2\eta)}{\sinh(u - \lambda_i)}. \end{aligned} \tag{28a}$$
The BA-equations (analyticity conditions) are
$$\prod_{i=1}^{M} \frac{\sinh(\mu_\alpha - \lambda_i + 2\eta)}{\sinh(\mu_\alpha - \lambda_i)} = \prod_{\substack{\beta=1 \\ \beta \neq \alpha}}^{m} \frac{\sinh(\mu_\alpha - \mu_\beta + 2\eta)}{\sinh(\mu_\alpha - \mu_\beta - 2\eta)} \tag{28b}$$
for $\alpha = 1, \dots, m$ and
$$\prod_{\alpha=1}^{m} \frac{\sinh(\mu_\alpha - \lambda_k + 2\eta)}{\sinh(\mu_\alpha - \lambda_k)} = \prod_{n=1}^{N} \frac{\sinh(\lambda_k - \delta^{(n)} - \eta(C^{(n)} + 2))}{\sinh(\lambda_k - \delta^{(n)} + \eta C^{(n)})} \tag{28c}$$
for $k = 1, \dots, M$.

The situation is different in the case of $\hat\tau^{V_4}(N|u)$, because the innocent looking change of auxiliary space requires the use of an at first sight completely different algebra. In the next section a systematic approach to this problem will be developed, which makes extensive use of the presented solution.
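For numerical work with the result of this section, the eigenvalue formula (28a) and the Bethe ansatz equations (28b), (28c) can be written as plain functions of the rapidities. The sketch below is one possible transcription under simplifying assumptions (real parameters, rapidities away from the poles); all names are hypothetical, and the residual functions simply return left-hand side minus right-hand side, so that a solution of the Bethe equations corresponds to vanishing residuals.

```python
import math

def _ratio(x, shift):
    """sinh(x + shift) / sinh(x)."""
    return math.sinh(x + shift) / math.sinh(x)

def eigenvalue_28a(u, lams, mus, Cs, deltas, eta):
    """Right-hand side of (28a); Cs and deltas list C^(n) and delta^(n), n = 1..N."""
    n_sites = len(Cs)
    vac = 1.0
    for C, d in zip(Cs, deltas):
        vac *= math.sinh(eta * C + u - d) / math.sinh(eta * (C + 2) - u + d)
    lam_factor = math.prod(_ratio(u - lam, 2 * eta) for lam in lams)
    mu_minus = math.prod(_ratio(u - mu, -2 * eta) for mu in mus)
    mu_plus = math.prod(_ratio(u - mu, 2 * eta) for mu in mus)
    return vac * (lam_factor * mu_minus + mu_plus) - (-1) ** n_sites * lam_factor

def bae_28b_residual(alpha, lams, mus, eta):
    """LHS - RHS of (28b) for the rapidity mus[alpha] (0-based index)."""
    ma = mus[alpha]
    lhs = math.prod(_ratio(ma - lam, 2 * eta) for lam in lams)
    rhs = math.prod(math.sinh(ma - mb + 2 * eta) / math.sinh(ma - mb - 2 * eta)
                    for beta, mb in enumerate(mus) if beta != alpha)
    return lhs - rhs

def bae_28c_residual(k, lams, mus, Cs, deltas, eta):
    """LHS - RHS of (28c) for the rapidity lams[k] (0-based index)."""
    lk = lams[k]
    lhs = math.prod(_ratio(mu - lk, 2 * eta) for mu in mus)
    rhs = math.prod(math.sinh(lk - d - eta * (C + 2)) / math.sinh(lk - d + eta * C)
                    for C, d in zip(Cs, deltas))
    return lhs - rhs
```

With no rapidities at all ($M = m = 0$) and a homogeneous chain, `eigenvalue_28a` reduces to the vacuum eigenvalue (18). Feeding the residual functions to a root finder of one's choice would be a natural next step, which is not attempted here.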
4. Diagonalization of $\hat\tau^{V_4'}(N|u)$

In order to understand the difficulties in diagonalizing the homogeneous version of $\hat\tau^{V_4'}(N|u)$ defined in Sect. 2, it is convenient to follow the standard procedure from the previous section as far as possible. So $V_4^{(N)} \times \cdots \times V_4^{(1)}$ will be chosen as quantum space, while $V_4'$, characterized by primed parameters (5a), will serve as auxiliary space. The sign change (11) will be applied and the local vacuum will be chosen as the lowest weight state in $V_4$ (12). Omitting the local index $(n)$, due to homogeneity, this leads to
$$\hat L^{V_4'}(n|u)\, \Omega^{(n)} = \begin{pmatrix} \omega_1(u) & 0 & 0 & 0 \\ * & \omega_2(u) & * & * \\ * & 0 & \omega_3(u) & 0 \\ * & 0 & 0 & \omega_4(u) \end{pmatrix} \Omega^{(n)} \tag{29}$$
with the new (local) vacuum eigenvalues
$$\begin{aligned} \omega_1(u) &= \frac{\sinh(\eta(C - C') + u)\, \sinh(\eta(C - C' + 2) + u)}{\sinh(\eta(C + C') - u)\, \sinh(\eta(C + C' + 2) + u)}, \\ \omega_2(u) &= \frac{\sinh(\eta(C + C' + 2) - u)}{\sinh(\eta(C + C' + 2) + u)}, \\ \omega_3(u) &= \frac{\sinh(\eta(C' - C) - u)}{\sinh(\eta(C + C' + 2) + u)}, \\ \omega_4(u) &= \omega_3(u). \end{aligned} \tag{30}$$
There are five non-vanishing entries compared to two in (13). This will be the same for the other three possible local vacua. Using (15), (2b) leads to
$$\hat T^{V_4'}(N|u)\, |0\rangle_N = \begin{pmatrix} [\omega_1(u)]^N & 0 & 0 & 0 \\ * & [\omega_2(u)]^N & * & * \\ * & 0 & [\omega_3(u)]^N & 0 \\ * & 0 & 0 & [\omega_3(u)]^N \end{pmatrix} |0\rangle_N. \tag{31}$$
From the integrability condition (3c),
$$\left[ \hat\tau^{V_4'}(N|u), \hat\tau^{V_3}(N|v) \right] = 0,$$
it is clear that $\hat\tau^{V_3}(N|v)$ and $\hat\tau^{V_4'}(N|u)$ share the same eigenvectors. The eigenvalues (27) are in general degenerate. The lowest weight property of the (global) vacuum (15), which is inherited by the BA-vectors (20) via standard arguments [8], guarantees uniqueness of these special vectors. Note that the same argument would hold also for a highest weight state as (global) vacuum, but not for any other choice. From this and (31), following Baxter [7], it can be concluded immediately that all eigenvalues of $\hat\tau^{V_4'}(N|u)$ can be represented in the form
$$\Lambda_N^{V_4'}(u; \lambda_1, \dots, \lambda_M | \mu_1, \dots, \mu_m) = [\omega_1(u)]^N F(u) + [\omega_2(u)]^N G(u) - [\omega_3(u)]^N \left\{ H(u) + J(u) \right\}, \tag{32}$$
where $F(u)$, $G(u)$, $H(u)$ and $J(u)$ are meromorphic functions in $u$, whose residues cancel if the analyticity conditions (26) hold. In order to determine these unknown functions, the FCR (3b) with $V = V_3$ and $V' = V_4'$, namely
$$R_{12}^{V_3 V_4'}(u, v)\, \hat T_1^{V_3}(N|u)\, \hat T_2^{V_4'}(N|v) = \hat T_2^{V_4'}(N|v)\, \hat T_1^{V_3}(N|u)\, R_{12}^{V_3 V_4'}(u, v) \tag{33}$$
with $R^{V_3 V_4'}(u)$ from (7) will be chosen. The reasons are
1. $R^{V_3 V_4'}$ is a $12 \times 12$-matrix while $R^{V_4 V_4'}$ is a $16 \times 16$-matrix. The choice $V = V_4$ would greatly increase the number of equations.
2. In contrast to (16), Eq. (31) does not offer a natural choice of creation-operators, so the invaluable a priori knowledge of unique eigenvectors (20) with BA-parameters obeying (26) would be lost within the alternative choice.

The R-matrices $R^{V_3 V_4'}(u)$ (7) and $R^{V_4 V_4'}(u)$ (9) do not contain $R^{V_2 V_2}(u)$ (22) as a proper submatrix. In particular unwanted terms turn out to be much more complicated. However it is possible to omit their calculation. As will be shown, the knowledge of unique eigenvectors (20) with (26) as well as some details of the calculation given in Sect. 3 are sufficient to determine the unknown functions in (32) unambiguously. For brevity (17) will be used as well as
$$\hat T_{ij}^{V_3}(u) = [\hat T^{V_3}(N|u)]_{ij}^{V_3}, \quad \hat T_{ij}^{V_4'}(u) = [\hat T^{V_4'}(N|u)]_{ij}^{V_4'}.$$
First it is convenient to list all components from (33) containing an operator $\hat C_i(u)$ (17) multiplied with a diagonal element $\hat T^{V_4'}_{jj}(v)$ from the right. From (7) and (A1) with primed parameters (5a) and (33) follows:

$$\zeta_2(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{41}(v) - \zeta_2(u-v)\,q\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{31}(v) + \beta_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{11}(v) = \rho(u-v)\,\hat T^{V_4'}_{11}(v)\hat C_1(u), \tag{34a}$$

$$-\hat C_1(u)\hat T^{V_4'}_{22}(v) = \alpha_0(u-v)\,\hat T^{V_4'}_{22}(v)\hat C_1(u) - \varepsilon_2(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{33}(u), \tag{34b}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{33}(v) = \rho(u-v)\,\hat T^{V_4'}_{33}(v)\hat C_1(u), \tag{34c}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{24}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{44}(v) = \alpha_0(u-v)\,\hat T^{V_4'}_{44}(v)\hat C_1(u) + \delta_2(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_2(u) - \zeta_2(u-v)\,\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{33}(u), \tag{34d}$$

$$\zeta_2(u-v)\,\hat T^{V_3}_{12}(u)\hat T^{V_4'}_{41}(v) - \zeta_2(u-v)\,q\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{31}(v) + \beta_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{11}(v) = \rho(u-v)\,\hat T^{V_4'}_{11}(v)\hat C_2(u), \tag{34e}$$

$$-\hat C_2(u)\hat T^{V_4'}_{22}(v) = \alpha_0(u-v)\,\hat T^{V_4'}_{22}(v)\hat C_2(u) - \varepsilon_2(u-v)\,\hat T^{V_4'}_{24}(v)\hat T^{V_3}_{33}(u), \tag{34f}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{12}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{33}(v) = \delta_1(u-v)\,\hat T^{V_4'}_{34}(v)\hat C_1(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{33}(v)\hat C_2(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{31}(v)\hat T^{V_3}_{33}(u), \tag{34g}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{24}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{44}(v) = \rho(u-v)\,\hat T^{V_4'}_{44}(v)\hat C_2(u). \tag{34h}$$
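The idea is to keep only contributions leading to wanted terms when the eigenvector (20) is applied to $\hat\tau^{V_4'}(N|u)$, and to neglect all others. The set (34) is not complete. For instance in (34a) a term $\propto \hat T^{V_3}_{11}(u)\hat T^{V_4'}_{41}(v)$ and another $\propto \hat T^{V_3}_{21}(u)\hat T^{V_4'}_{31}(v)$ occur. Both will act non-trivially on $|0\rangle_N$ from (31). However in the set (33) the relations

$$\alpha_0(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{41}(v) + \delta_1(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{31}(v) + \zeta_1(u-v)\,\hat C_1(u)\hat T^{V_4'}_{11}(v) = \rho(u-v)\,\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{11}(u)$$

and

$$\delta_2(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{41}(v) + \alpha_0(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{31}(v) - \zeta_1(u-v)\,q^{-1}\,\hat C_1(u)\hat T^{V_4'}_{11}(v) = \rho(u-v)\,\hat T^{V_4'}_{31}(v)\hat T^{V_3}_{21}(u)$$

can be found and used to eliminate these terms, leading to

$$\left(\beta_0 - \frac{\zeta_1\zeta_2\,[2\alpha_0 + q^{-1}\delta_1 + q\delta_2]}{\alpha_0^2 - \delta_1\delta_2}\right)\!(u-v)\;\hat C_1(u)\hat T^{V_4'}_{11}(v) = \rho(u-v)\,\hat T^{V_4'}_{11}(v)\hat C_1(u) - \left(\frac{\rho\zeta_2}{\alpha_0}\left[1 + \frac{\delta_2\,[\alpha_0 q + \delta_1]}{\alpha_0^2 - \delta_1\delta_2}\right]\right)\!(u-v)\;\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{11}(u) + \left(\frac{\rho\zeta_2\,[\alpha_0 q + \delta_1]}{\alpha_0^2 - \delta_1\delta_2}\right)\!(u-v)\;\hat T^{V_4'}_{31}(v)\hat T^{V_3}_{21}(u),$$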
where the dependence on difference variables has been denoted symbolically for brevity. The last two terms on the right hand side will not lead to a contribution proportional to any BA-eigenvector (20). It has been checked, and this is crucial, that these terms are not related to a proper combination of $\hat C_i$-operators by unused relations from the set (33). In conclusion they can be identified as leading to unwanted terms. In the same way two other relations from (33) may be used to eliminate from (34e) terms $\propto \hat T^{V_3}_{12}(u)\hat T^{V_4'}_{41}(v)$ and $\propto \hat T^{V_3}_{22}(u)\hat T^{V_4'}_{31}(v)$, which after omitting contributions leading to unwanted terms yield the same result with $\hat C_1(u)$ replaced by $\hat C_2(u)$, i.e.:

$$\hat T^{V_4'}_{11}(u)\hat C_i(v) = \left(\frac{\beta_0}{\rho} - \frac{\zeta_1\zeta_2\,[2\alpha_0 + q^{-1}\delta_1 + q\delta_2]}{\rho\,[\alpha_0^2 - \delta_1\delta_2]}\right)\!(v-u)\;\hat C_i(v)\hat T^{V_4'}_{11}(u) \pm \dots \tag{35a}$$

for i = 1, 2. In (34b) and (34f) terms $\propto \hat T^{V_4'}_{23}(v)\hat T^{V_3}_{33}(u)$ and $\propto \hat T^{V_4'}_{24}(v)\hat T^{V_3}_{33}(u)$ can be identified as leading to unwanted terms in the sense explained above and therefore be neglected:

$$\hat T^{V_4'}_{22}(u)\hat C_i(v) = \frac{-1}{\alpha_0(v-u)}\,\hat C_i(v)\hat T^{V_4'}_{22}(u) \pm \dots \tag{35b}$$
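for i = 1, 2. The other relations from (33) can be treated similarly, leading to

$$\hat T^{V_4'}_{33}(u)\hat C_1(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)\!(v-u)\;\hat C_1(v)\hat T^{V_4'}_{33}(u) \pm \dots, \tag{35c}$$

$$\hat T^{V_4'}_{44}(u)\hat C_2(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)\!(v-u)\;\hat C_2(v)\hat T^{V_4'}_{44}(u) \pm \dots, \tag{35d}$$

$$\hat T^{V_4'}_{44}(u)\hat C_1(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_1(v)\hat T^{V_4'}_{44}(u) - \left(\frac{\delta_2}{\alpha_0}\,\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_2(v)\hat T^{V_4'}_{43}(u) \pm \dots, \tag{35e}$$

$$\hat T^{V_4'}_{33}(u)\hat C_2(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_2(v)\hat T^{V_4'}_{33}(u) - \left(\frac{\delta_1}{\alpha_0}\,\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_1(v)\hat T^{V_4'}_{43}(u) \pm \dots. \tag{35f}$$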
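The elimination behind (35a) amounts to solving a 2 × 2 linear system. The following is a sketch only: the shorthand $X$, $Y$, $A$, $B$, $T$, $D$ is introduced here for illustration and is not part of the original notation, all coefficient functions are evaluated at $u-v$, and it is assumed that both auxiliary relations taken from (33) carry the factor $\rho$ on their right-hand sides.

$$X := \hat T^{V_3}_{11}(u)\hat T^{V_4'}_{41}(v),\quad Y := \hat T^{V_3}_{21}(u)\hat T^{V_4'}_{31}(v),\quad A := \hat T^{V_4'}_{41}(v)\hat T^{V_3}_{11}(u),\quad B := \hat T^{V_4'}_{31}(v)\hat T^{V_3}_{21}(u),\quad T := \hat T^{V_4'}_{11}(v),\quad D := \alpha_0^2 - \delta_1\delta_2 .$$

$$\begin{aligned}
\alpha_0 X + \delta_1 Y &= \rho A - \zeta_1\,\hat C_1 T, \\
\delta_2 X + \alpha_0 Y &= \rho B + \zeta_1 q^{-1}\,\hat C_1 T
\end{aligned}
\;\;\Longrightarrow\;\;
\begin{aligned}
X &= \tfrac{1}{D}\left[\alpha_0(\rho A - \zeta_1 \hat C_1 T) - \delta_1(\rho B + \zeta_1 q^{-1}\hat C_1 T)\right], \\
Y &= \tfrac{1}{D}\left[\alpha_0(\rho B + \zeta_1 q^{-1}\hat C_1 T) - \delta_2(\rho A - \zeta_1 \hat C_1 T)\right].
\end{aligned}$$

$$\zeta_2\,(X - qY) = \frac{\zeta_2}{D}\left[\rho(\alpha_0 + q\delta_2)\,A - \rho(\alpha_0 q + \delta_1)\,B - \zeta_1(2\alpha_0 + q^{-1}\delta_1 + q\delta_2)\,\hat C_1 T\right].$$

Inserting this into (34a), $\zeta_2(X - qY) + \beta_0\,\hat C_1 T = \rho\,T\hat C_1$, gives

$$\left(\beta_0 - \frac{\zeta_1\zeta_2\,[2\alpha_0 + q^{-1}\delta_1 + q\delta_2]}{D}\right)\hat C_1 T = \rho\,T\hat C_1 - \frac{\rho\zeta_2(\alpha_0 + q\delta_2)}{D}\,A + \frac{\rho\zeta_2(\alpha_0 q + \delta_1)}{D}\,B .$$

The coefficient of $A$ agrees with the bracketed form used in the text because $\alpha_0^2 = D + \delta_1\delta_2$ implies

$$\frac{1}{\alpha_0}\left[1 + \frac{\delta_2(\alpha_0 q + \delta_1)}{D}\right] = \frac{D + \alpha_0 q\delta_2 + \delta_1\delta_2}{\alpha_0 D} = \frac{\alpha_0 + q\delta_2}{D}.$$

Dropping the $A$ and $B$ terms as unwanted, dividing by $\rho$ and renaming $u \leftrightarrow v$ then reproduces the form of (35a).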
Some details of the calculations are given in Appendix C. They are tedious, but straightforward: It is trivial to identify terms proportional to a simple M = 1 eigenvector, (20), if it is applied. The remaining terms are divided into those which possibly lead to a contribution proportional to an eigenvector via the algebra (34), and others which cannot be transformed this way. The former terms have been eliminated by using convenient relations from (34) and evaluated again, till this procedure terminated, leaving only terms of the latter type, i.e. unwanted terms, which have been neglected systematically in (35). Equations (35e) and (35f) contain non-trivial terms $\propto \hat C_2(v)\hat T^{V_4'}_{43}(u)$ and $\propto \hat C_1(v)\hat T^{V_4'}_{43}(u)$.

Next it is natural to add to (34) the relations involving terms $\propto \hat T^{V_4'}_{34}(u)\hat C_i(v)$ and $\propto \hat T^{V_4'}_{43}(u)\hat C_i(v)$ with i = 1, 2, i.e.
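$$\varepsilon_2(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{24}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{34}(v) = \alpha_0(u-v)\,\hat T^{V_4'}_{34}(v)\hat C_1(u) + \delta_2(u-v)\,\hat T^{V_4'}_{33}(v)\hat C_2(u) + \zeta_2(u-v)\,\hat T^{V_4'}_{31}(v)\hat T^{V_3}_{33}(u), \tag{36a}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{43}(v) = \rho(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_1(u), \tag{36b}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{24}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{34}(v) = \rho(u-v)\,\hat T^{V_4'}_{34}(v)\hat C_2(u), \tag{36c}$$

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{43}(v) = \delta_1(u-v)\,\hat T^{V_4'}_{44}(v)\hat C_1(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_2(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{33}(u). \tag{36d}$$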
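Proceeding as above leads to

$$\hat T^{V_4'}_{34}(u)\hat C_1(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_1(v)\hat T^{V_4'}_{34}(u) - \left(\frac{\delta_2}{\alpha_0}\,\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_2(v)\hat T^{V_4'}_{33}(u) \pm \dots, \tag{37a}$$

$$\hat T^{V_4'}_{43}(u)\hat C_2(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_2(v)\hat T^{V_4'}_{43}(u) - \left(\frac{\delta_1}{\alpha_0}\,\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0^2 - \delta_1\delta_2}\right)\!(v-u)\;\hat C_1(v)\hat T^{V_4'}_{44}(u) \pm \dots, \tag{37b}$$

$$\hat T^{V_4'}_{43}(u)\hat C_1(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)\!(v-u)\;\hat C_1(v)\hat T^{V_4'}_{43}(u) \pm \dots, \tag{37c}$$

$$\hat T^{V_4'}_{34}(u)\hat C_2(v) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)\!(v-u)\;\hat C_2(v)\hat T^{V_4'}_{34}(u) \pm \dots. \tag{37d}$$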
This idea is strongly supported by a comparison of (35) and (37) with (21), used in the algebraic diagonalization of $\hat\tau^{V_3}(N|u)$, suggesting that the submatrix $\{\hat T^{V_4'}_{ij}\}$ with i, j = 3, 4 will play the same rôle as the submatrix $\{\hat T^{V_3}_{ij}\}$ with i, j = 1, 2 in the previous section. Indeed using the definitions (A1) with primed parameters (5a), (35a) and (35b) can be written

$$\hat T^{V_4'}_{11}(u)\hat C_i(v) = \frac{\sinh(u - v + \eta(C'+2))}{\sinh(u - v - \eta(C'-2))}\;\hat C_i(v)\hat T^{V_4'}_{11}(u) \pm \dots, \tag{38a}$$

$$\hat T^{V_4'}_{22}(u)\hat C_i(v) = \frac{\sinh(u - v + \eta(C'+2))}{\sinh(u - v - \eta C')}\;\hat C_i(v)\hat T^{V_4'}_{22}(u) \pm \dots \tag{38b}$$

for i = 1, 2, while the remaining equations from (35) and (37) may be noted as

$$\hat t_{ij}(u)\hat C_k(v) = \frac{\sinh(u - v + \eta(C'+2))}{\sinh(u - v - \eta C')} \times \sum_{l,m=1}^{2} r_{lm,jk}(u - v - \eta C')\;\hat C_m(v)\,\hat t_{il}(u) \pm \dots \tag{38c}$$

for i, j, k = 1, 2, where the elements $r_{ik,jl}(u)$ of the R-matrix (22) and the convenient definition

$$\begin{pmatrix} \hat t_{11}(u) & \hat t_{12}(u) \\ \hat t_{21}(u) & \hat t_{22}(u) \end{pmatrix} := \begin{pmatrix} \hat T^{V_4'}_{33}(u) & \hat T^{V_4'}_{43}(u) \\ \hat T^{V_4'}_{34}(u) & \hat T^{V_4'}_{44}(u) \end{pmatrix} \tag{38d}$$
have been used. The similarity of (38) to (21) is striking and allows one to calculate the eigenvalues of $\hat\tau^{V_4'}(N|u)$ easily. Applying the (right) eigenvector (20) to $\hat T^{V_4'}_{11}(u)$ and $\hat T^{V_4'}_{22}(u)$ using (38) and (31) yields

$$\hat T^{V_4'}_{11}(u)\,|\lambda_1,\dots,\lambda_M|F\rangle = [\omega_1(u)]^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta(C'-2))} \times |\lambda_1,\dots,\lambda_M|F\rangle \pm \dots$$

and

$$\hat T^{V_4'}_{22}(u)\,|\lambda_1,\dots,\lambda_M|F\rangle = [\omega_2(u)]^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta C')} \times |\lambda_1,\dots,\lambda_M|F\rangle \pm \dots,$$

where unwanted terms have been omitted. Applying it to $[\hat T^{V_4'}_{33}(u) + \hat T^{V_4'}_{44}(u)]$ yields

$$\left[\hat T^{V_4'}_{33}(u) + \hat T^{V_4'}_{44}(u)\right]|\lambda_1,\dots,\lambda_M|F\rangle = [\omega_3(u)]^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta C')} \times \left[\hat\tau^{V_2}(M|u - \eta C')\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M} \times \hat C_{b_1}(\lambda_1)\cdots\hat C_{b_M}(\lambda_M)\,|0\rangle_N \pm \dots,$$
where $\hat\tau^{V_2}(M|u)$ is defined by (22) via (2) with $\delta^{(n)} = \lambda_n$ as in Sect. 3. But F is a (right) eigenvector to $\hat\tau^{V_2}(M|u)$ corresponding to the eigenvalue from (25). The neglected unwanted terms vanish by construction if the supertrace (4h) is performed according to (2c). Therefore the eigenvalue of $\hat\tau^{V_4'}(N|u)$ corresponding to the (right) eigenvector (20) is given by

$$\begin{aligned}
\Lambda^{V_4'}_N(u;\lambda_1,\dots,\lambda_M|\mu_1,\dots,\mu_m) ={}& [\omega_1(u)]^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta(C'-2))} \\
&+ [\omega_2(u)]^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta C')} \\
&- [\omega_3(u)]^N \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta C')} \times \Lambda^{V_2}_M(u - \eta C';\mu_1,\dots,\mu_m)
\end{aligned} \tag{39}$$
with vacuum eigenvalues $\omega_i(u)$ (i = 1, 2, 3) from (30) and $\Lambda^{V_2}_M(u;\dots)$ from (25). The BA-parameters $\lambda_1,\dots,\lambda_M$ and $\mu_1,\dots,\mu_m$ are to be determined by the BA-equations (26). Note that these are necessary and sufficient conditions [7] for analyticity of the eigenvalues (39) in u. Since up to now no explicit use has been made of these, this is a valuable consistency check on the validity of (39). Equation (39) is clearly of the expected form (32). It is further obvious that the eigenvalues for every transfer-matrix based on auxiliary space $V_4'$ can be represented by the same formula (30), provided the (global) quantum space is a lowest weight space. Of course the vacuum eigenvalues have to be replaced by new ones, which are obviously restricted by the BA-equations (26), as discussed in Sect. 3. For completeness the trivial generalization [7] of (39) to the inhomogeneous case with $w^{(n)} = C^{(n)}$ in (2a) and $\delta^{(n)} \neq 0$ in (2b) shall be given explicitly:
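$$\begin{aligned}
\Lambda^{V_4'}_N(u;\lambda_1,\dots,\lambda_M|\mu_1,\dots,\mu_m) ={}& \prod_{n=1}^{N} \frac{\sinh(\eta(C^{(n)} - C') + u - \delta^{(n)})\,\sinh(\eta(C^{(n)} + C') - u + \delta^{(n)})}{\sinh(\eta(C^{(n)} - C' + 2) + u - \delta^{(n)})\,\sinh(\eta(C^{(n)} + C' + 2) + u - \delta^{(n)})} \times \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta(C'-2))} \\
&+ \prod_{n=1}^{N} \frac{\sinh(\eta(C^{(n)} + C' + 2) - u + \delta^{(n)})}{\sinh(\eta(C^{(n)} + C' + 2) + u - \delta^{(n)})} \times \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta C')} \\
&- \prod_{n=1}^{N} \frac{\sinh(\eta(C' - C^{(n)}) - u + \delta^{(n)})}{\sinh(\eta(C^{(n)} + C' + 2) + u - \delta^{(n)})} \times \Bigg\{ \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta C')} \times \prod_{\alpha=1}^{m} \frac{\sinh(u - \mu_\alpha - \eta(C'+2))}{\sinh(u - \mu_\alpha - \eta C')} \\
&\qquad + \prod_{i=1}^{M} \frac{\sinh(u - \lambda_i + \eta(C'+2))}{\sinh(u - \lambda_i - \eta(C'-2))} \times \prod_{\alpha=1}^{m} \frac{\sinh(u - \mu_\alpha - \eta(C'-2))}{\sinh(u - \mu_\alpha - \eta C')} \Bigg\}.
\end{aligned} \tag{40}$$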
Here the BA-parameters $\lambda_1,\dots,\lambda_M$ and $\mu_1,\dots,\mu_m$ are determined by (28b) and (28c). Equation (40) describes all eigenvalues. As mentioned above, additional eigenvectors to the same eigenvalue (40) are obtained by applying shift operators, corresponding to the representation of the group-symmetry on the (global) quantum space, to the eigenvectors (20). Completeness may be assured by the usual arguments [8].

5. Conclusion

In the previous section $\hat\tau^{V_4'}(N|u)$ has been diagonalized by NABA, combined with analyticity arguments. Obviously the method can be applied to any BA-integrable model, defined by (2), based on an arbitrary, but finite dimensional representation $V'$ of a possibly q-deformed Lie (super-)algebra as auxiliary space. Let the model based on the direct product of a fundamental representation V with itself, here defined by $R^{V_3 V_3}(u)$ and (2), be solved by (N)ABA. In order to solve the model under consideration the following scheme may be applied:

1. An auxiliary model based on V as auxiliary and the non-standard representation $V'$ as quantum space, may be constructed by standard methods and its transfer-matrix, i.e. $\hat\tau^{V_3}(N|u)$ from (6) via (2), may be diagonalized, using a (global) lowest or highest weight state, e.g. $|0\rangle_N$ (15), as (pseudo-)vacuum.
2. Vacuum eigenvalues may be calculated trivially, see (30). The transfer-matrix of the relevant model and the one of the auxiliary model commute (3c) and share all BA-eigenvectors, which dictates the form of the eigenvalue equations (32).
3. Mixed FCR (34), between creation-operators from auxiliary model (17), should be used as follows:
   (a) FCRs (34) between diagonal elements of $\hat T^{V'}(N|u)$ and creation-operators multiplied from the right on these (35) should be collected. The remaining terms in these equations are classified as wanted (leading to terms proportional to the known BA-vectors), unwanted (not related to wanted ones by FCRs) and others.
   (b) Terms of the last category have to be eliminated by use of other convenient FCRs. Unwanted terms may be neglected in final equations, i.e. (35).
   (c) Generically the final equations in step (b) involve some off-diagonal elements of $\hat T^{V'}(N|u)$ (35). They have to be complemented by all FCRs containing these off-diagonal elements, multiplied from the right with creation-operators (36), to which the same procedure has to be applied (37).
4. The relations obtained in step 3 allow the calculation of the eigenvalue equations (39), if they are written down conveniently, i.e. like (38).

Step one and two are trivial here. Step three is crucial. An unusually large number of FCRs (34) has to be used, because the mixed R-matrix (7) does not contain any smaller R-matrix like (22) as a proper submatrix, which was true e.g. for (6). The approach is systematic and avoids a complicated discussion of unwanted terms. The author has checked in a number of cases that these indeed vanish in the present application, but analyticity of the final result (39) is a very strong and usually sufficient test. Step four is simple. Some knowledge of the preceding calculations is a sufficient guideline.
A group theoretical background is not necessary, but helpful. Definitely needed is a commuting (auxiliary) model, algebraically solvable [8], and a unique identification of joint eigenvectors. The theory of quantum groups [3,4,12] provides both. In addition it is implicitly assumed that the algebra defined by the FCRs is complete, i.e. if two operators are identical, this information should be encoded within the FCRs. This is guaranteed if R has the intertwining property [3].

The more complicated problem of handling the full set of commutation relations of comparable complexity directly has been tackled more or less exactly a number of times. The algebraic solution of a statistical covering model for the one-dimensional Hubbard model, where no commuting transfer-matrix is known, was studied by Ramos and Martins [25]. Also a diagonalization of a $Y(sp(2,1))$-symmetric model by the same authors should be mentioned [26]. To the author's knowledge no systematic scheme is known and although the eigenvalues are presumably correct, the discussion of unwanted terms is not complete in these works. It is an interesting, but still unsolved question, whether solvability of some statistical model by n-fold NABA implies the existence of a commuting transfer-matrix with minimal, that is (n + 1)-dimensional, auxiliary space.

In the non-graded $U_q(\widehat{gl}{}'(N|C))$-symmetric case, the quantum-determinant, introduced by Izergin and Korepin [28] and recognized by Drinfel'd [3] to complete the center of this algebra, provides the possibility to construct functional relations [13] for the eigenvalues, extended to an "analytical Bethe ansatz" by Reshetikhin [29]. This is more elegant than the present approach, but does not generalize to the graded case, because no one-dimensional subspace can be separated from a product of transfer-matrices.

The transfer-matrix $\hat\tau^{V_4'}(N|u)$ has been used mainly for pedagogical reasons. Minus signs due to grading, even in the non-graded version [13], prevent a statistical interpretation. Nevertheless the Hamiltonian limit in the non-difference type spectral parameter (1a), as mentioned above, leads to an additional, unusual Hamiltonian, which will be discussed elsewhere [30]. Note that neither $\hat\tau^{V_3}(N|u)$ nor $\hat\tau^{V_4'}(N|u)$ is hermitian, except if further restrictions are imposed on (5a). The diagonalization of $\hat\tau^{V_4'}(N|u)$, especially the result (40), may serve as starting point for calculations on the thermodynamics of these models in the non-linear integral equation approach, pioneered by Klümper [31]. For a recent application of this technique see also [32].

The eigenvalue-equation for the transfer-matrix of some other $U_q(\widehat{gl}{}'(2,1|C))$-symmetric models with $V_4'$ as auxiliary and some lowest weight representation as quantum space may be written down by replacing the $\omega_i(u)$ (30) in (39) by new ones. De Vega and Gonzáles Ruiz [33] and Foerster and Karowski [34] generalized the ABA calculations of Schultz [20] partially to non-periodic, integrable boundary conditions. There should be no principal problem to combine their techniques with the method presented here. Perhaps the most important open question concerns the applicability of the method to models with infinite dimensional auxiliary space, which was excluded here as a precaution.

Acknowledgement. This work has been performed within the research program of the Sonderforschungsbereich 341 (Köln-Aachen-Jülich). The author thanks J. Zittartz and A. Klümper for continuous support, A. Zvyagin, G. Jüttner, Y. Kato, A. Klümper and especially A. Fujii for stimulating discussions and encouragement. Special thanks go to A. Klümper for carefully reading the manuscript and numerous useful suggestions, incorporated in the final version. The author would also like to thank a referee for pointing out reference [20] to him.
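Appendix A: Coefficients of R-matrices

The elements of the R-matrix (7) are explicitly given by

$$\rho(u) := \frac{\sinh(\eta(C+2) + u)}{\sinh(\eta(C+2) - u)}, \tag{A1a}$$
$$\alpha_0(u) := \frac{1}{[C+2]_q}\left\{[C+1]_q\,\rho(u) - 1\right\}, \tag{A1b}$$
$$\beta_0(u) := \frac{1}{[C+2]_q}\left\{[2]_q\,\rho(u) - [C]_q\right\}, \tag{A1c}$$
$$\gamma_0(u) := \frac{1}{[C+2]_q}\left\{\rho(u) - [C+1]_q\right\}, \tag{A1d}$$
$$\delta_1(u) := \frac{1}{[C+2]_q}\left\{\rho(u)\,q^{-C-1} + q\right\}, \tag{A1e}$$
$$\delta_2(u) := \frac{1}{[C+2]_q}\left\{\rho(u)\,q^{C+1} + q^{-1}\right\}, \tag{A1f}$$
$$\varepsilon_1(u) := \frac{\mu^*}{[C+2]_q}\left\{\rho(u)\,q^{-\frac{C}{2}-1} + q^{\frac{C}{2}+1}\right\}, \tag{A1g}$$
$$\varepsilon_2(u) := \frac{\mu}{[C+2]_q}\left\{\rho(u)\,q^{\frac{C}{2}+1} + q^{-\frac{C}{2}-1}\right\}, \tag{A1h}$$
$$\zeta_1(u) := \frac{\kappa^*}{[C+2]_q}\left\{\rho(u)\,q^{-\frac{C+1}{2}} + q^{\frac{C+1}{2}}\right\}, \tag{A1i}$$
$$\zeta_2(u) := \frac{\kappa}{[C+2]_q}\left\{\rho(u)\,q^{\frac{C+1}{2}} + q^{-\frac{C+1}{2}}\right\}. \tag{A1j}$$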
f(u) and g(u) in (9) are defined by

$$f(u) = \frac{\sinh(\eta(C + C') + u)}{\sinh(\eta(C + C') - u)}, \tag{A2a}$$
$$g(u) = \frac{\sinh(\eta(C + C' + 2) - u)}{\sinh(\eta(C + C' + 2) + u)}. \tag{A2b}$$

Using (5d) and the abbreviations

$$\alpha = [C + C']_q, \qquad \beta = [C + C' + 1]_q, \qquad \gamma = [C + C' + 2]_q,$$
$$\varepsilon = [C']_q\,q^{\frac{C+C'}{2}+1} - [C]_q\,q^{-\frac{C+C'}{2}-1}, \qquad \eta = [C]_q\,q^{\frac{C+C'}{2}+1} - [C']_q\,q^{-\frac{C+C'}{2}-1},$$
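the remaining coefficients of (9) can be written as:

$$r_1 = \frac{\kappa^*\mu^*\kappa'\mu'}{\alpha\beta\gamma}\left(\gamma\,q^{C+C'} f(u) + [2]_q\,\beta + \alpha\,q^{-C-C'-2}\,g(u)\right),$$
$$r_1' = \frac{\kappa\mu\,\kappa'^*\mu'^*}{\alpha\beta\gamma}\left(\gamma\,q^{-C-C'} f(u) + [2]_q\,\beta + \alpha\,q^{C+C'+2}\,g(u)\right),$$
$$r_2 = \frac{\kappa^*\kappa'}{\alpha}\left(q^{\frac{C+C'}{2}} f(u) + q^{-\frac{C+C'}{2}}\right), \qquad r_2' = \frac{\kappa\,\kappa'^*}{\alpha}\left(q^{-\frac{C+C'}{2}} f(u) + q^{\frac{C+C'}{2}}\right),$$
$$r_3 = \frac{\mu\,\mu'^*}{\gamma}\left(q^{-\frac{C+C'+2}{2}} + q^{\frac{C+C'+2}{2}}\,g(u)\right), \qquad r_3' = \frac{\mu^*\mu'}{\gamma}\left(q^{\frac{C+C'+2}{2}} + q^{-\frac{C+C'+2}{2}}\,g(u)\right),$$
$$r_4 = 1 + \frac{q^{-1}}{\alpha\beta\gamma}\left\{[C]_q\left([C']_q\,\gamma f(u) - [C+1]_q\,\beta\right) + [C'+1]_q\left([C+1]_q\,\alpha g(u) - [C']_q\,\beta\right)\right\},$$
$$r_4' = 1 + \frac{q}{\alpha\beta\gamma}\left\{[C]_q\left([C']_q\,\gamma f(u) - [C+1]_q\,\beta\right) + [C'+1]_q\left([C+1]_q\,\alpha g(u) - [C']_q\,\beta\right)\right\},$$
$$r_5 = \frac{1}{\alpha\beta\gamma}\left\{[C]_q[C+1]_q\,\gamma f(u) - [2]_q[C']_q[C+1]_q\,\beta + [C']_q[C'+1]_q\,\alpha g(u)\right\},$$
$$r_5' = \frac{1}{\alpha\beta\gamma}\left\{[C']_q[C'+1]_q\,\gamma f(u) - [2]_q[C]_q[C'+1]_q\,\beta + [C]_q[C+1]_q\,\alpha g(u)\right\},$$
$$r_6 = \frac{\mu^*\kappa'}{\alpha\beta\gamma}\left\{[C]_q\,\gamma\,q^{\frac{C+C'+1}{2}} f(u) - \beta\varepsilon\,q^{\frac{1}{2}} - [C'+1]_q\,\alpha\,q^{-\frac{C+C'+1}{2}}\,g(u)\right\},$$
$$r_6' = \frac{\kappa\,\mu'^*}{\alpha\beta\gamma}\left\{[C']_q\,\gamma\,q^{-\frac{C+C'+1}{2}} f(u) + \beta\varepsilon\,q^{-\frac{1}{2}} - [C+1]_q\,\alpha\,q^{\frac{C+C'+1}{2}}\,g(u)\right\},$$
$$r_7 = \frac{1}{\alpha}\left\{[C']_q - [C]_q f(u)\right\}, \qquad r_7' = \frac{1}{\alpha}\left\{[C]_q - [C']_q f(u)\right\},$$
$$r_8 = \frac{\kappa^*\mu'}{\alpha\beta\gamma}\left\{[C']_q\,\gamma\,q^{\frac{C+C'+1}{2}} f(u) - \beta\eta\,q^{\frac{1}{2}} - [C+1]_q\,\alpha\,q^{-\frac{C+C'+1}{2}}\,g(u)\right\},$$
$$r_8' = \frac{\mu\,\kappa'^*}{\alpha\beta\gamma}\left\{[C]_q\,\gamma\,q^{-\frac{C+C'+1}{2}} f(u) + \beta\eta\,q^{-\frac{1}{2}} - [C'+1]_q\,\alpha\,q^{\frac{C+C'+1}{2}}\,g(u)\right\},$$
$$r_9 = \frac{1}{\gamma}\left\{[C'+1]_q - [C+1]_q\,g(u)\right\}, \qquad r_9' = \frac{1}{\gamma}\left\{[C+1]_q - [C'+1]_q\,g(u)\right\},$$
$$r_{10} = \frac{1}{\alpha\beta\gamma}\left\{[C]_q\left([C+1]_q\,\beta - [C']_q\,\gamma f(u)\right) + [C'+1]_q\left([C']_q\,\beta - [C+1]_q\,\alpha g(u)\right)\right\}.$$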
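Appendix B: Some Details on ABA

Applying the ansatz (20) to the diagonal elements of $\hat T_{ij}(u)$ using (19) and (16) yields [19],

$$\begin{aligned}
\left[\hat T_{11}(u) + \hat T_{22}(u)\right]|\lambda_1,\dots,\lambda_M|F\rangle ={}& [\omega_1(u)]^N \prod_{i=1}^{M} \frac{1}{c(u - \lambda_i)}\,\left[\hat\tau^{V_2}(M|u)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M} \times \hat C_{b_1}(\lambda_1)\cdots\hat C_{b_M}(\lambda_M)\,|0\rangle_N \\
&+ \sum_{k=1}^{M} \left[\check\Lambda^{(1,2)}_k(u;\lambda_1,\dots,\lambda_M)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M} \times \hat C_{b_k}(u) \prod_{\substack{i=1 \\ i\neq k}}^{M} \hat C_{b_i}(\lambda_i)\,|0\rangle_N,
\end{aligned} \tag{B1a}$$

where $\hat\tau^{V_2}(M|u)$ is an inhomogeneous transfer-matrix obtained according to (2) with $\delta^{(n)} = \lambda_n$ from (22), and

$$\begin{aligned}
\hat T_{33}(u)\,|\lambda_1,\dots,\lambda_M|F\rangle ={}& (-1)^N \prod_{i=1}^{M} \frac{1}{c(u - \lambda_i)}\,|\lambda_1,\dots,\lambda_M|F\rangle \\
&+ \sum_{k=1}^{M} \left[\check\Lambda^{(3)}_k(u;\lambda_1,\dots,\lambda_M)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M} \times \hat C_{b_k}(u) \prod_{\substack{i=1 \\ i\neq k}}^{M} \hat C_{b_i}(\lambda_i)\,|0\rangle_N.
\end{aligned} \tag{B1b}$$

The operators $\hat C_{b_i}(\lambda_i)$ under the products in (B1) are ordered with the index increasing from left to right factors. Note that only the first terms in Eq. (B1) will contribute to the eigenvalue, while the following terms are unwanted. Their coefficients $\check\Lambda_k$ are

$$\begin{aligned}
\left[\check\Lambda^{(1,2)}_k(u;\lambda_1,\dots,\lambda_M)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} ={}& -[\omega_1(\lambda_k)]^N\,\frac{b(u - \lambda_k)}{c(u - \lambda_k)} \prod_{\substack{i=1 \\ i\neq k}}^{M} \frac{1}{c(\lambda_k - \lambda_i)} \prod_{j=1}^{k-1} \frac{1}{d(\lambda_j - \lambda_k)} \\
&\times \left[\hat L_{c_M c_{M-1}}(\lambda_k - \lambda_M)\right]^{b_M}_{a_M} \left[\hat L_{c_{M-1} c_{M-2}}(\lambda_k - \lambda_{M-1})\right]^{b_{M-1}}_{a_{M-1}} \times \cdots \times \left[\hat L_{c_{k+1} c_k}(\lambda_k - \lambda_{k+1})\right]^{b_{k+1}}_{a_{k+1}} \\
&\times \left(\prod_{l=1}^{k-1} \delta^{b_l}_{a_l}\right) \delta^{c_k}_{a_k} \left[\delta^{1}_{b_k}\delta^{c_M}_{1} + \delta^{2}_{b_k}\delta^{c_M}_{2}\right],
\end{aligned} \tag{B2a}$$

where $\hat L_{ij}(u)$ is an abbreviation for $[\hat L^{V_2}(n|u)]^{V_2}_{ij}$, derived from (22) via (2a), and

$$\left[\check\Lambda^{(3)}_k(u;\lambda_1,\dots,\lambda_M)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} = (-1)^{N+M}\,\frac{a(\lambda_k - u)}{c(\lambda_k - u)} \prod_{\substack{i=1 \\ i\neq k}}^{M} \frac{1}{c(\lambda_i - \lambda_k)} \prod_{j=k+1}^{M} d(\lambda_j - \lambda_k) \times \left[\hat S_k(\lambda_1,\dots,\lambda_k)\right]^{b_1,\dots,b_k}_{a_1,\dots,a_k} \left(\prod_{l=k+1}^{M} \delta^{b_l}_{a_l}\right). \tag{B2b}$$

Here a k-particle S-matrix has been defined via [19]

$$\left[\hat S_k(\lambda_1,\dots,\lambda_k)\right]^{b_1,\dots,b_k}_{a_1,\dots,a_k} = \delta^{c_1}_{b_k}\,\delta^{c_k}_{a_k} \prod_{i=1}^{k-1} r_{b_i c_i,\,a_i c_{i+1}}(\lambda_i - \lambda_k). \tag{B2c}$$

In (B2) summation over repeated indices $c_i = 1, 2$ is implicit. Applying the ansatz (20) to (23) forces the unwanted terms in (B1) to vanish. These equations can be transformed into 6-vertex-type eigenvalue equations (24) in Sect. 3 [19].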
Appendix C: Derivation of Commutation Relations

A few more details on the derivation of (35) are given: In (34c) the term $\propto \hat T^{V_3}_{11}(u)\hat T^{V_4'}_{23}(v)$ acts non-trivially according to (16) and (31). It has to be eliminated by use of

$$\alpha_0(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{23}(v) + \varepsilon_1(u-v)\,\hat C_1(u)\hat T^{V_4'}_{33}(v) = \rho(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{11}(u)$$

from (33), which results in (35c). Similarly (34h) can be handled, leading to (35d). In (34d) the term $\propto \hat T^{V_3}_{21}\hat T^{V_4'}_{24}$ has to be eliminated via the relation

$$\alpha_0(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{24}(v) + \varepsilon_1(u-v)\,\hat C_1(u)\hat T^{V_4'}_{44}(v) = \alpha_0(u-v)\,\hat T^{V_4'}_{24}(v)\hat T^{V_3}_{21}(u) + \delta_2(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{22}(u) + \zeta_2(u-v)\,\hat T^{V_4'}_{21}(v)\hat T^{V_3}_{32}(u)$$

from (33). According to (16) and (31) the term $\propto \hat T^{V_4'}_{43}(v)\hat C_2(u)$ also acts non-trivially on $|0\rangle_N$. It has to be eliminated, using the following relations from set (33),

$$\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{43}(v) = \delta_1(u-v)\,\hat T^{V_4'}_{44}(v)\hat C_1(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_2(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{33}(u)$$

and

$$\alpha_0(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v) + \varepsilon_1(u-v)\,\hat C_2(u)\hat T^{V_4'}_{43}(v) = \delta_1(u-v)\,\hat T^{V_4'}_{24}(v)\hat T^{V_3}_{21}(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{22}(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{21}(v)\hat T^{V_3}_{23}(u).$$

Both relations have to be used in order to prevent the appearance of $\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v)$, also acting non-trivially on $|0\rangle_N$, in the result (35e). Applying the same procedure to (34g) leads to (35f).
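As an illustration of the simplest case, the step from (34c) to (35c) can be written out as follows. This is a sketch only: the shorthand $X$ is introduced here for convenience, and all coefficient functions are understood to be evaluated at $u-v$.

$$X := \hat T^{V_3}_{11}(u)\hat T^{V_4'}_{23}(v), \qquad \alpha_0 X = \rho\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{11}(u) - \varepsilon_1\,\hat C_1(u)\hat T^{V_4'}_{33}(v).$$

Substituting $X$ into (34c), $\varepsilon_2 X + \gamma_0\,\hat C_1(u)\hat T^{V_4'}_{33}(v) = \rho\,\hat T^{V_4'}_{33}(v)\hat C_1(u)$, gives

$$\left(\gamma_0 - \frac{\varepsilon_1\varepsilon_2}{\alpha_0}\right)\hat C_1(u)\hat T^{V_4'}_{33}(v) + \frac{\varepsilon_2\,\rho}{\alpha_0}\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{11}(u) = \rho\,\hat T^{V_4'}_{33}(v)\hat C_1(u).$$

Treating the term $\propto \hat T^{V_4'}_{23}(v)\hat T^{V_3}_{11}(u)$ as unwanted and dividing by $\rho$ yields

$$\hat T^{V_4'}_{33}(v)\hat C_1(u) = \left(\frac{\alpha_0\gamma_0 - \varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)\!(u-v)\;\hat C_1(u)\hat T^{V_4'}_{33}(v) \pm \dots,$$

which is (35c) after renaming $u \leftrightarrow v$.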
References

1. Yang, C.N.: Phys. Rev. Lett. 19, 1312–1315 (1967); Yang, C.N.: Phys. Rev. 168, 1920–19233 (1968)
2. Baxter, R.J.: Ann. Phys. 70, 323–337 (1972)
3. Drinfel'd, V.G.: Quantum Groups. In: Proceedings of the International Congress of Mathematicians, Berkeley, 1986
4. Chari, V. and Pressley, A.: A Guide to Quantum Groups. New York: Cambridge University Press, 1994
5. Jimbo, M.: Commun. Math. Phys. 102, 537–547 (1986)
6. Ma, Z.-Q.: Yang–Baxter Equation and Quantum Enveloping Algebras. Singapore: World Scientific, 1993
7. Baxter, R.J.: Exactly solved Models in Statistical Mechanics. London: Academic Press, 1982
8. Sklyanin, E.K., Takhtajan, L.A. and Faddeev, L.D.: Theoret. Math. Phys. 40, 688–706 (1980); Takhtajan, L.A. and Faddeev, L.D.: Russ. Math. Surv. 34, 11–68 (1979); Faddeev, L.D.: Soviet Scientific Reviews C, 1, 107–155 (1980); Takhtajan, L.A.: Introduction to Algebraic Bethe Ansatz. In: B.S. Shastry, S.S. Jha and V. Singh (eds.): Exactly Solvable Problems in Condensed Matter and Field Theory. Lecture Notes in Physics 242, Berlin–Heidelberg: Springer, 1985, pp. 175–220
9. Korepin, V.E., Bogoliubov, N.N. and Izergin, A.G.: Quantum Inverse Scattering Method and Correlation Functions. New York: Cambridge University Press, 1993
10. Jimbo, M. (ed.): Yang–Baxter Equation in Integrable System. Singapore: World Scientific, 1989
11. Bethe, H.A.: Z. Physik 71, 205–226 (1931)
12. Yamane, H.: Preprint q-alg/9603015 (1996)
13. Kulish, P.P. and Sklyanin, E.K.: J. Soviet. Math. 19, 1596–15620 (1982)
14. Cornwell, J.F.: Group Theory in Physics, Vol. 3 – Supersymmetry and infinite dimensional Algebras. London: Academic Press, 1989
15. Kac, V.G.: Infinite dimensional Lie Algebras. 3rd ed., New York: Cambridge University Press, 1990
16. Perk, J.H.H. and Schultz, C.L.: Phys. Lett. 84 A, 407–410 (1981)
17. Gruneberg, J.: To be published
18. Gould, M.D., Hibberd, K.E., Links, J.R. and Zhang, Y.-Z.: Phys. Lett. A 212, 156–160 (1995)
19. Kulish, P.P. and Reshetikhin, N.Y.: JETP, 80, 158–183 (1981)
20. Schultz, C.L.: Physica A, 122, 71–88 (1983)
21. Gaudin, M.: Phys. Lett. A 24, 55–56 (1967)
22. Takhtajan, L.A.: LOMI-Proceedings, 1980, 101, 158–183 (1980)
23. Lai, L.A.: J. Math. Phys. 15, 1675–1676 (1974)
24. Sutherland, B.: Phys. Rev. B 12, 3795–3805 (1975)
25. Ramos and Martins: J. Phys. A 30, L195 (1997)
26. Ramos and Martins: Nucl. Phys. B 474, 678–714 (1996)
27. Kulish, P.P. and Sklyanin, E.K.: Quantum Spectral Transform Method – Recent Developments. In: J. Hietarinta and C. Montonen (eds.): Integrable Quantum Field Theories. Lecture Notes in Physics 151, Berlin–Heidelberg: Springer, 1981, pp. 61–119
28. Izergin, A.G. and Korepin, V.E.: Sov. Phys. Dokl. 26, 653–654 (1981)
29. Reshetikhin, N.Y.: Sov. Phys. JETP 57, 691–696 (1983)
30. Gruneberg, J.: To be published
31. Klümper, A.: Ann. Physik 1, 540 (1992); Klümper, A.: Z. Phys. B 91, 507 (1993)
32. Jüttner, G., Klümper, A. and Suzuki, J.: Nucl. Phys. 487, 471–502 (1998)
33. De Vega, H.J. and Gonzáles-Ruiz, A.: Nucl. Phys. B 417, 553–578 (1994); Gonzáles-Ruiz, A.: Nucl. Phys. B 424, 468–486
34. Foerster, A. and Karowski, M.: Nucl. Phys. B 408, 512–534 (1993)

Communicated by T. Miwa
Commun. Math. Phys. 206, 409 – 428 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Small-Scale Mass Concentration of Modes

John A. Toth*

Department of Mathematics and Statistics, McGill University, Montreal, Quebec, Canada H3A 2K6. E-mail: [email protected]

Received: 1 October 1998 / Accepted: 1 April 1999

Abstract: Let $P_1,\dots,P_d$ be commuting, jointly-elliptic, $\hbar$-pseudodifferential operators on a compact manifold, $X$, of dimension $n \geq d$. Suppose $\gamma$ is the ω-limit set of the bicharacteristic flow of the classical Hamiltonian, $p_1$, restricted to the variety, $\Sigma_E = \{(x,\xi) \in T^*X;\ p_1(x,\xi) - E_1 = \dots = p_d(x,\xi) - E_d = 0\}$. We discuss the corresponding concentration of mass as $\hbar \to 0$ for a subsequence of joint eigenfunctions of the $P_j$'s with eigenvalues sufficiently close to $(E_1,\dots,E_d)$.

1. Introduction

Let $X$ be a compact, $C^\infty$ Riemannian manifold of dimension $n$, and $P_1(x,\hbar D_x),\dots,P_d(x,\hbar D_x)$ functionally independent, jointly-elliptic, classical, self-adjoint $\hbar$-pseudodifferential operators of order $m$ with $1 \leq d \leq n$. For simplicity of notation, we will denote the corresponding $\hbar$-principal symbols by $p_1,\dots,p_d$. As a matter of convention, we will refer to $H := p_1$ as the classical Hamiltonian and will also assume that $[P_i, P_j] = 0$ for all $1 \leq i, j \leq d$. When $d = n$, this system is said to be quantum integrable. There is a rather rich class of examples of this sort, including many classically integrable systems such as the Euler top and geodesic flow on a quadric surface, among others (see [T1]). Consider the variety:

$$\Sigma_E = \{(x,\xi) \in T^*X;\ p_1(x,\xi) - E_1 = \dots = p_d(x,\xi) - E_d = 0\}.$$

For the purposes of this paper, the energy values $(E_1,\dots,E_n)$ of interest tend to be singular (see Sect. 4) and thus, the variety $\Sigma_E$ is, generally speaking, not a manifold. Let $\psi_j$ be $L^2$-normalized, joint eigenfunctions of $P_1,\dots,P_d$ satisfying:

$$P_k\psi_j = E_k(\hbar)\psi_j,$$

* Supported in part by NSERC Grant OGP0170280 and FCAR Grant NC-1520
where

$$E_k(\hbar) = E_k + O(\hbar^{\delta_1}).$$

Here, $0 < \delta_1 < 1$ and $k = 1,\dots,d$. Let $\gamma \subset \Sigma_E$ be a smooth, compact, embedded submanifold of $T^*X$. Given $(x,\xi) \in \Sigma_E$, suppose that the bicharacteristic curves $\phi_t(x,\xi) = \exp t\,\Xi_{p_1}(x,\xi)$ of the Hamilton vector field,

$$\Xi_{p_1} = \sum_j \frac{\partial p_1}{\partial \xi_j}\frac{\partial}{\partial x_j} - \frac{\partial p_1}{\partial x_j}\frac{\partial}{\partial \xi_j},$$

converge to $\gamma \subset \Sigma_E$ as $t \to \infty$. Our main result is a quantum analogue of this classical phenomenon; we show that there is a concentration of $L^2$ mass of the $\psi_j$'s in a tubular neighbourhood of $\gamma$ corresponding to the classical convergence of the bicharacteristics on $\Sigma_E$. Although such results are known in some specific instances, for example, one-dimensional Schrödinger operators with non-degenerate potential maxima (see [B, CP, T2] and Sect. 4), very little seems to be known in the general case. We will show (see Theorem 1) that, under a rather general assumption on the rate of classical convergence of the bicharacteristics on $\Sigma_E$ (see (H1) below), there is an analogous concentration estimate for the corresponding eigenfunctions, $\psi_j$.

The plan of the paper is as follows: At the end of Sect. 1, we give a precise statement of our main result. In Sect. 2, we give a proof of Theorem 1. In Sects. 3 and 4 we give some applications of this analysis. For the proof in Sect. 2, we will first obtain an estimate on the microlocal concentration of eigenfunctions $\psi_j$ associated with eigenvalues, $E_k(\hbar)$, satisfying $|E_k(\hbar) - E_k| = O(\hbar^{\delta_1})$ for $0 < \delta_1 \leq 1$ and $k = 1,\dots,d$. This result (see Proposition 1) basically says that the mass of such an eigenfunction is concentrated in a tube, $\Omega_E$, of radius $O(\hbar^{\delta_1/2})$ about the characteristic variety, $\Sigma_E$. Next, we apply the semiclassical, time-dependent, Egorov Theorem (see Proposition 2) with time $t \sim \log(1/\hbar)$ to transport eigenfunction mass into a tube of width $O(\hbar^{\varepsilon})$ about the limit set. Here, $\varepsilon > 0$ is determined by $\delta_1$, a number of derivatives of certain symbols, and the time-dependent divergence rate of the classical flow. This enables us to prove Theorem 1 (see below) which shows that for suitable $\varepsilon > 0$ and $\hbar$ sufficiently small, the $L^2$ mass of the $\psi_j$'s inside a tube of radius $\hbar^{\varepsilon}$ around the ω-limit set $\gamma$, is at least as large as the mass outside a tube of radius, $2\hbar^{\varepsilon}$.
It is important to note that, although the bicharacteristic curves on the variety, $\Sigma_E$, all converge to $\gamma$, for points $z \notin \Sigma_E$, the bicharacteristic emanating from $z$ need not converge to $\gamma$ as $t \to \infty$. For instance, in the example of the one-dimensional, periodic, Schrödinger operator with two non-degenerate potential maxima (see Sect. 4), the ω-limit set of bicharacteristics on $\Sigma_E$ consists of two hyperbolic fixed points. This is a manifestation of the fact that the stable manifold of the first critical point is the unstable manifold of the second (and vice versa). However, nearby bicharacteristics trace out closed ovals in a periodic motion with no nice limiting behaviour. The important point here is that, since for $\hbar$ small, $\hbar^{\delta_1/2} \ll \log(1/\hbar)$, the relevant eigenfunction mass is concentrated very close to $\Sigma_E$ relative to the time-scale of the classical flow as $\hbar \to 0$. As a consequence, we can indeed propagate for logarithmic semiclassical time and still control the diffusion of eigenfunction mass in the tube $\Omega_E \supset \Sigma_E$ as $\hbar \to 0$ (see Theorem 1). Logarithmic times are important because in general, one cannot flow for longer time and still effectively control error terms in Egorov's Theorem (see [Vo, Z]).

Following the proof in Sect. 2, we provide two applications of Theorem 1: In Sect. 3, we consider the case of an axial, closed, hyperbolic geodesic on a compact quadric surface and use Theorem 1 to show that there exist many Laplace eigenfunctions localized in an $O(\hbar^{\varepsilon})$ neighbourhood of the periodic orbit. In Sect. 4, we discuss the one-dimensional, periodic, semiclassical Schrödinger operator near a nondegenerate potential maximum (see [CP, T2]). In particular, we show directly that a finite proportion of the eigenfunction mass is already concentrated in a neighbourhood of width $O(\hbar^{\varepsilon})$ about the critical point for suitable $\varepsilon > 0$. This is consistent with Theorem 1. Finally, in Appendix A we establish the existence of eigenvalues satisfying the hypotheses in Theorem 1 (provided $0 \leq \delta_1 < 1/2$). In fact, we give a lower bound for the semiclassical spectral counting function for such $\delta_1$ under rather weak assumptions on the singularities of the variety, $\Sigma_E$.

We will now give a precise statement of our main result: To simplify the writing, we will henceforth fix the rate function:

$$m(t) = e^{-|t|}. \tag{1}$$
However, our results do generalize to include a wider class of rate functions $m(t)$ (see the remark after Theorem 1). In applying our result in these more general cases, $\hbar^{\epsilon}$ should be replaced with the more cumbersome notation $m(\epsilon \log \hbar)$.

Let $0 \le \chi(s) \in C_0^{\infty}(\mathbb{R})$ be a cutoff function which is identically 1 on the interval $[-1, 1]$ and vanishes for $|s| \ge 2$. Given $0 \le \epsilon < 1/2$, we define $\zeta(x, \xi; \hbar) := \chi(\hbar^{-2\epsilon} d^2((x, \xi), \gamma))$, where $d(\cdot, \cdot)$ denotes a fixed distance function on $T^*X$. Note that, since we are assuming that $\gamma \subset \Sigma_E$ is an embedded submanifold of $T^*X$, it follows that $\zeta \in C_0^{\infty}$ for $\hbar$ sufficiently small. Similarly, we let $0 \le \tilde{\chi}(s) \in C_0^{\infty}(\mathbb{R})$ be a cutoff which is identically 1 on the interval $[-2, 2]$ and vanishes for $|s| \ge 3$, and we define $\tilde{\zeta}(x, \xi; \hbar) := \tilde{\chi}(\hbar^{-2\epsilon} d^2((x, \xi), \gamma))$. We will also need to fix the respective tubular neighbourhoods
$$\Gamma = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) < \hbar^{2\epsilon}\} \quad\text{and}\quad \tilde{\Gamma} = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) < 2\hbar^{2\epsilon}\}.$$
Moreover, we define the neighbourhood $\Omega_E$ of $\Sigma_E$ by $\Omega_E := \{(x, \xi) \in T^*X;\ p(x, \xi) \le \hbar^{\delta_1}\}$, where
$$p(x, \xi) := \sum_{j=1}^{d} (p_j(x, \xi) - E_j)^2.$$
Suppose that the following hypotheses are satisfied (see also Sect. 4):

(H1) Assume that for $\hbar$ sufficiently small,
$$d(\phi_t(x, \xi), \gamma) = O\!\left(\frac{m(t)}{\hbar^{\epsilon}}\right) \quad \text{as } t \to \infty,$$
uniformly for $(x, \xi) \in \Sigma_E - \tilde{\Gamma}$.
(H2) Given $z \in \Omega_E - \tilde{\Gamma}$, there exists $z_0 \in \Sigma_E - \tilde{\Gamma}$ such that for $\hbar$ sufficiently small,
$$d(z, z_0) = O\big(\hbar^{\frac{\delta_1}{2} - f(\epsilon)}\big)$$
uniformly for all such $z$. Here, $f \in C^0(\mathbb{R}, \mathbb{R}^+)$ and $f(0) = 0$.

The motivation behind (H2) is roughly as follows: If $\Sigma_E$ is smooth outside $\gamma$ and the differentials $dp_j$; $j = 1, \dots, d$ are linearly independent, we can choose the $p_j$'s as coordinates in $\Omega_E$. However, in a shrinking tubular neighbourhood around $\gamma$ of radius $O(\hbar^{\epsilon})$, the gradients $\nabla p_j$ might degenerate (i.e. become dependent) as we approach $\gamma$. Condition (H2) roughly says that this degeneration occurs at a polynomial rate, $\hbar^{-f(\epsilon)}$ (see also Sect. 4).

In Sect. 2, we show that given hypotheses (H1) and (H2), $0 < \delta_1 \le 1$ and $\psi_j$ as above, there exist $\epsilon > 0$ and $\kappa = \kappa(\epsilon) > 0$ such that
$$([1 - \mathrm{Op}^F_\hbar(\tilde{\zeta})]\psi_j, \psi_j) \le (\mathrm{Op}^F_\hbar(\zeta)\psi_j, \psi_j) + O(\hbar^{\kappa}). \tag{$*$}$$
Here, $\mathrm{Op}^F_\hbar(a)$ denotes a semiclassical anti-Wick pseudodifferential operator associated with $a(x, \xi)$ (see Sect. 1). In the course of our proof, we will give estimates for $\epsilon$ and, consequently, for the error exponent $\kappa > 0$ in terms of dynamical constants and a finite number of derivatives of symbols. To give a statement of Theorem 1, we let $\pi : T^*X \to X$ denote the canonical cotangent projection map. Then, as a consequence of the estimate $(*)$, we obtain:

Theorem 1. Let $\psi_j$ be as above and assume that conditions (H1) and (H2) are satisfied. Then, there exist $\epsilon > 0$ and $\kappa(\epsilon) > 0$ such that for $\hbar > 0$ sufficiently small,
$$\int_{X - \pi(\tilde{\Gamma})} |\psi_j|^2\, dx \le \int_{\pi(\Gamma)} |\psi_j|^2\, dx + O(\hbar^{\kappa}).$$
Thus, the mass of $\psi_j$ is concentrated in a semiclassically shrinking tubular neighbourhood of radius $\hbar^{\epsilon}$ around $\pi(\gamma)$.

Remark. Egorov's Theorem with time $t = -\delta \log \hbar$ also plays an important role in Zelditch's paper [Z] on the rate of quantum ergodicity, as well as Volovoy's paper [Vo] on the error term in Weyl's law.

2. Microlocalization near $\Sigma_E$

Let $P_1(x, \hbar D_x), \dots, P_d(x, \hbar D_x)$ be self-adjoint, elliptic, classical $\hbar$-pseudodifferential operators of order $m$. This means that the respective symbols $p_1, \dots, p_d$ are required to have asymptotic expansions:
$$p_k(x, \xi) \sim \hbar^{m} \sum_{j=0}^{\infty} p_{k,j}(x, \xi)\, \hbar^{j},$$
where, for $k = 1, \dots, d$ and $j \ge 0$,
$$|\partial_x^{\alpha} \partial_\xi^{\beta} p_{k,j}(x, \xi)| \le C_{\alpha,\beta} \langle \xi \rangle^{m - j - |\beta|}.$$
Here, $\langle \xi \rangle := (1 + |\xi|^2)^{1/2}$ and $p_{k,0}(x, \xi) \ge C_1 \langle \xi \rangle$ for $|\xi| \ge C_1$. Henceforth, to simplify notation, we shall denote the $\hbar$-principal symbols by $p_1, \dots, p_d$. Since we will be working with small-scale cutoff functions like $\zeta$, we now recall the main properties of the corresponding $\hbar$-pseudodifferential calculus. For further details, we refer the reader to [Sj1].

Let $\Omega$ be an open set in $\mathbb{R}^n$ and recall that $m(t) = e^{-|t|}$. For $0 \le \epsilon < \frac{1}{2}$, let $S^m_\epsilon(\Omega \times \mathbb{R}^n) = \hbar^{-m} S^0_\epsilon(\Omega \times \mathbb{R}^n)$, where the latter is defined as follows:
$$S^0_\epsilon := \{a \in C^{\infty}(\Omega \times \mathbb{R}^n);\ |\partial_x^{\alpha} \partial_\xi^{\beta} a(x, \xi; \hbar)| \le C_{\alpha,\beta}\, \hbar^{-\epsilon(|\alpha| + |\beta|)},\ \forall (x, \xi) \in \Omega \times \mathbb{R}^n\}.$$
We will denote by $\mathrm{Op}_\hbar(a)$ the corresponding $\hbar$-Kohn–Nirenberg quantization, given locally by the integral operator:
$$(\mathrm{Op}_\hbar(a)u)(x) = (2\pi\hbar)^{-n} \iint e^{i(x-y)\xi/\hbar}\, a(x, \xi; \hbar)\, u(y)\, dy\, d\xi.$$
Such operators form a calculus [Sj1] with the usual symbolic composition formula: If $c(x, \hbar D_x) = a(x, \hbar D_x) \cdot b(x, \hbar D_x)$ with $a, b \in S^0_\epsilon$, then
$$c(x, \xi; \hbar) \sim \sum_{k=0}^{\infty} \sum_{|\alpha| = k} \frac{\hbar^{|\alpha|}}{\alpha!}\, \partial_\xi^{\alpha} a(x, \xi; \hbar) \cdot D_x^{\alpha} b(x, \xi; \hbar). \tag{2}$$
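Note that the semiclassical Calderon–Vaillancourt Theorem also holds for operators in this calculus: If $a \in S^0_\epsilon$ with $0 \le \epsilon < \frac{1}{2}$, then $\mathrm{Op}_\hbar(a) : L^2(X) \to L^2(X)$ is uniformly bounded in $\hbar$ with
$$\|\mathrm{Op}_\hbar(a)\|_{(0)} \le C(n) \sum_{|\alpha| + |\beta| \le 2n+1} \hbar^{\frac{|\alpha| + |\beta|}{2}} \sup |\partial_x^{\alpha} \partial_\xi^{\beta} a(x, \xi; \hbar)|. \tag{3}$$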
Our first order of business will be an estimate on the microlocalization of the $\psi_j$ near the variety $\Sigma_E$. For this, we shall adapt the simple and elegant argument in [Sj1] involving commutators and resolvent estimates to the case at hand. To begin, let $\chi \in C_0^{\infty}(\mathbb{R})$ be a cutoff function identically equal to 1 in $[-a, a]$ and vanishing for $|s| \ge 2a$, where $a > 0$. We define
$$p(x, \xi) := \sum_{j=1}^{d} (p_j(x, \xi) - E_j)^2. \tag{4}$$
Let $0 < \delta_1 \le 1$ and consider the following symbol:
$$\chi^{\delta_1}(x, \xi; \hbar) := \chi(\hbar^{-\delta_1} p(x, \xi)). \tag{5}$$
It is clear that $\mathrm{Op}_\hbar(\chi^{\delta_1}) \in \mathrm{Op}(S^0_{\delta_1/2})$, with $m(t) = e^{-|t|}$. Let $\psi_j \in C^{\infty}(X)$ be an $L^2$-normalized joint eigenfunction of $P_1, \dots, P_d$, satisfying $P_k \psi_j = E_k \psi_j + O(\hbar^{\delta_1})\psi_j$ for $k = 1, \dots, d$. Following [Sj1], we choose the support of $\chi$ large enough so that
$$p(x, \xi) + \chi(\hbar^{-\delta_1} p(x, \xi)) \ge (1 + 1/C)\, \hbar^{\delta_1}$$
for some $C > 0$ and $\hbar$ sufficiently small. Thus, the operator $\tilde{P} = P + \mathrm{Op}_\hbar(\chi^{\delta_1})$ satisfies $\tilde{P} \ge (1 + 1/2C)\, \hbar^{\delta_1}$. Finally, let $\tilde{\chi}$ be another cutoff function which is identically 1 on the support of $\chi$ and define $\tilde{\chi}^{\delta_1}(x, \xi; \hbar) := \tilde{\chi}(\hbar^{-\delta_1} p(x, \xi))$.

Proposition 1. Let $\psi_j$ be an $L^2$-normalized joint eigenfunction as above. Then,
$$\|(1 - \mathrm{Op}_\hbar(\tilde{\chi}^{\delta_1}))\psi_j\| = O(\hbar^{\infty}).$$

Proof. Modulo the fact that we work with a cutoff function that is localized about an arbitrary energy level set rather than the ground state, the proof follows as in [Sj1]. For the sake of completeness, we will sketch the argument. Consider the perturbed sum of squares operator:
$$\tilde{P}(x, \hbar D_x) = \sum_{j=1}^{d} (P_j(x, \hbar D_x) - E_j)^2 + \mathrm{Op}_\hbar(\chi^{\delta_1}).$$
The point of working with such an operator is that $p$ vanishes to second order on the variety $\Sigma_E$. This is the important point that enables one to estimate commutators. For the remainder of the proof, we drop the superscript $\delta_1$ and denote both the symbol and the corresponding operator by $\chi$ when the context is clear. Start with a nested sequence of cutoff functions $\chi = \chi_0, \chi_1, \chi_2, \dots, \chi_{N-1}, \chi_N = \tilde{\chi}$, with the property that $\chi_j$ is 1 near the support of $\chi_{j-1}$ for all $j = 1, \dots, N$. By the symbolic composition formula together with Calderon–Vaillancourt (3), it follows that (i) $\|[\chi_j, \chi_k]\| = O(\hbar^{\infty})$ and (ii) $\|\chi_j(1 - \chi_k)\| = O(\hbar^{\infty})$ for $k > j$. Using (i), (ii) and the commutator identity
$$[(1 - \chi_j), (\tilde{P} - \lambda)^{-1}] = (\tilde{P} - \lambda)^{-1} [\chi_j, \tilde{P}] (\tilde{P} - \lambda)^{-1},$$
we obtain, by an iteration argument, the following estimate:
$$(1 - \tilde{\chi})\psi_j = (\tilde{P} - \lambda)^{-1} [\tilde{\chi}, P] (\tilde{P} - \lambda)^{-1} \cdots (\tilde{P} - \lambda)^{-1} [\chi_1, P] (\tilde{P} - \lambda)^{-1} \chi \psi_j + O(\hbar^{\infty}).$$
Finally, to estimate the commutators $[\chi_j, P]$ in $L^2$, use the symbolic expansion of $\sigma([\chi_j, P])$ together with the fact that $|\nabla p(x, \xi)|^2 \le C p(x, \xi)$ near $\Sigma_E$ to conclude that $\|[\chi_j, P]\| = O(\hbar)$. Since $N > 0$ can be chosen arbitrarily large, we are done. □

The next step in the proof of Theorem 1 is the time-dependent, semiclassical Egorov Theorem ([PU, Z, Vo]). Let $\phi_t : T^*X \to T^*X$ denote the time-$t$ bicharacteristic flow for $H(x, \xi)$. Then, in terms of local canonical coordinates on $T^*X$, we will write $\phi_t(x, \xi) = ((\phi_t)_1, \dots, (\phi_t)_{2n})$. We begin with the following elementary lemma:

Lemma 1. There exists a constant $C_k > 0$ independent of $t$ such that
$$|\partial_x^{\alpha} \partial_\xi^{\beta} (\phi_t)_j(x, \xi)| \le \exp(C_k (|\alpha| + |\beta|)|t|)$$
locally uniformly for $(x, \xi) \in T^*X$, for all $1 \le j \le 2n$ and $0 \le |\alpha| + |\beta| \le k$.
Proof. This inequality follows from the group law $\phi_{t_1 + t_2} = \phi_{t_2} \cdot \phi_{t_1}$ together with the chain rule and an iteration argument. □

Recall that $P_1$ is assumed to be a classical, self-adjoint $\hbar$-pseudodifferential operator of order zero. It is then well known that $U(t) = e^{itP_1/\hbar}$, the corresponding solution operator of the time-dependent Schrödinger equation,
$$-i\hbar \frac{\partial}{\partial t} U(t) - P_1 U(t) = 0, \qquad U(0) = \mathrm{Id},$$
is an $\hbar$-Fourier integral operator [PU]. A principal ingredient in our argument is the following semiclassical analogue of the standard energy estimate ([Ta], Sect. 2.2) for strictly hyperbolic equations:

Lemma 2. Let $Q \in \mathrm{Op}_\hbar(S^0_\epsilon)$ with $\|Q - Q^*\| = O(\hbar)$ in $L^2$ and suppose that $u(x, t)$ solves the initial value problem:
$$i\hbar \frac{\partial u}{\partial t} + Qu = r, \qquad u(x, 0) = u_0(x).$$
Then, there exists a constant $C_1 > 0$ such that
$$\|u(x, t)\| \le \hbar^{-1} e^{C_1 |t|} (\|u_0\| + \|r\|).$$

Proof. Let $u(x, t)$ be the requisite solution. Then,
$$\begin{aligned}
\partial_t (u, u) &= (\partial_t u, u) + (u, \partial_t u) \\
&= (i\hbar^{-1} Qu - i\hbar^{-1} r, u) + (u, i\hbar^{-1} Qu - i\hbar^{-1} r) \\
&= 2\,\Re(-i\hbar^{-1} r, u) + C\|u\|^2 \le C'(\|u\|^2 + \hbar^{-2}\|r\|^2)
\end{aligned} \tag{6}$$
since $\|Q - Q^*\| = O(\hbar)$. The lemma finally follows from the Gronwall inequality ([Ta], Lemma 2.2) for first-order ODE. □

The next order of business is the semiclassical Egorov Theorem in the calculus $\mathrm{Op}(S^0_\epsilon)$:

Proposition 2. Let $0 \le \epsilon < \frac{1}{2}$, $m(t) = e^{-|t|}$ and $Q \in \mathrm{Op}_\hbar(S^0_\epsilon)$ with semiclassical principal symbol $q_0(x, \xi; \hbar)$. Then, there exists a constant $C_2 > 0$ such that for $0 < \hbar \le \hbar_0$ and $t \in \mathbb{R}$,
$$e^{-itP_1/\hbar} \cdot Q \cdot e^{itP_1/\hbar} = \mathrm{Op}_\hbar(\exp t\,\Xi^*_{p_1} q_0) + K(t; \hbar),$$
where $\|K(t; \hbar)\| \le \hbar^{1 - 2\epsilon} e^{C_2 |t|}$.
Proof. In the following, we work locally and will denote the total symbol of $Q$ by $q_0(x, \xi; \hbar)$. So,
$$Q(x, y) = (2\pi\hbar)^{-n} \int e^{i(x-y)\xi/\hbar}\, q_0(x, \xi; \hbar)\, d\xi,$$
and we will denote the conjugated operator, $e^{-itP_1/\hbar} Q\, e^{itP_1/\hbar}$, by $Q_t$. Since we are only interested in Egorov's theorem per se, following [Ta], it will be convenient to work with the induced equation for $Q_t$:
$$\hbar \frac{\partial}{\partial t} Q_t = i[P_1, Q_t]. \tag{7}$$
As usual, the idea is to construct an approximate solution, $A_t$, to (7) with error $R_t$ and then estimate the difference $\|Q_t - A_t\|$ using Lemma 2. Given
$$A_t(x, y) = (2\pi\hbar)^{-n} \int e^{i(x-y)\xi/\hbar}\, a_t(x, \xi; \hbar)\, d\xi,$$
it follows that $a_t$ must solve the initial value problem:
$$\frac{\partial}{\partial t} a_t = \{p, a_t\}, \qquad a_t|_{t=0} = q_0. \tag{8}$$
The solution to (8) is $a_t(x, \xi) = q_0(\exp t\,\Xi_{p_1}(x, \xi))$. For our purposes, it suffices to stop the symbolic manipulations at this stage. As a consequence, we put $A_t = \mathrm{Op}_\hbar(\exp t\,\Xi^*_{p_1} q_0)$ and claim that there exists a constant $C > 0$ such that
$$\|R(t; \hbar)\| = O(\hbar^{2 - 2\epsilon})\, e^{C|t|}. \tag{9}$$
To prove (9), consider the total symbol $\sigma(x, y, \xi; t, \hbar)$ of the commutator $[P_1, A_t]$. By a standard Taylor expansion and integration by parts argument ([Sh]), one obtains the usual formula for the associated semiclassical Kohn–Nirenberg symbol,
$$\sigma(x, y, \xi; t, \hbar) = \sum_{|\alpha| = 1}^{K} \frac{\hbar^{|\alpha|}}{\alpha!} \big(\partial_\xi^{\alpha} p \cdot D_x^{\alpha} a_t - \partial_\xi^{\alpha} a_t \cdot D_x^{\alpha} p\big) + e(x, y, \xi; t, \hbar), \tag{10}$$
where $e(x, y, \xi; t, \hbar) = O(\hbar^{K})$ and depends only on derivatives of $a_t$ and $p$ of order $K + 1$. By choosing $K$ sufficiently large and taking into account Lemma 1, we get $\|\mathrm{Op}_\hbar(e)\| = O(\hbar^{N}) e^{C|t|}$ for any $N > 0$. As far as the first term on the RHS of (10) goes, its principal part $\{p, a_t\}$ is cancelled by $\partial_t a_t$. So,
$$\|R(t, \hbar)\| \sim \Big\| \mathrm{Op}_\hbar\Big( \sum_{|\alpha| = 2}^{K} \hbar^{|\alpha|} \partial_\xi^{\alpha} p\, D_x^{\alpha} a_t - \hbar^{|\alpha|} \partial_\xi^{\alpha} a_t\, D_x^{\alpha} p \Big) \Big\| = O(\hbar^{2 - 2\epsilon} e^{C|t|})$$
by the Calderon–Vaillancourt theorem (3) and Lemma 1. To conclude the proof, following ([Ta], Sect. 2.2), we must estimate $\|Q_t - A_t\|$. Writing $F(t) = A_t e^{itP_1/\hbar}$ and $G(t) = Q_t e^{itP_1/\hbar}$, it follows from the unitarity of $e^{itP_1/\hbar}$ that
$$\|Q_t - A_t\| = \|F(t) - G(t)\|.$$
However, $v(t) = F(t) - G(t)$ satisfies $\hbar\, \partial_t v(t) = iP_1 v(t) + R(t; \hbar)\, e^{itP_1/\hbar}$, $v(0) = 0$. Therefore, by the energy estimate in Lemma 2, it follows that
$$\|Q_t - A_t\| = \|v(t)\| = O(\hbar^{-1} e^{C_1 |t|} \|R(t; \hbar)\|) = O(\hbar^{1 - 2\epsilon})\, e^{C_2 |t|}.$$
□

We will now apply the Egorov Theorem in Proposition 2 to the small-scale symbols localized near the limit set $\gamma \subset \Sigma_E$. Let $d(\cdot, \cdot)$ be a distance function on $T^*X$ and recall that $\zeta(x; \hbar D_x) := \mathrm{Op}_\hbar\, \chi(\hbar^{-2\epsilon} d^2((x, \xi), \gamma))$. An application of Proposition 2 with $Q = \zeta$ gives
$$\zeta_t = \mathrm{Op}_\hbar(\exp t\,\Xi^*_p \zeta) + O(\hbar^{1 - 2\epsilon})\, e^{C_2 |t|}, \tag{11}$$
where $\zeta_t = e^{-it\hbar^{-1} P_1}\, \mathrm{Op}_\hbar(\zeta)\, e^{it\hbar^{-1} P_1}$. Let $\psi_j$ be a joint eigenfunction of $P_1, \dots, P_d$ as above. Then, by Proposition 2 and the unitarity of $e^{itP_1/\hbar}$,
$$(\zeta \psi_j, \psi_j) = (\zeta_t \psi_j, \psi_j) = (\mathrm{Op}_\hbar(\exp t\,\Xi^*_p \zeta)\psi_j, \psi_j) + O(\hbar^{1 - 2\epsilon})\, e^{C_2 |t|}. \tag{12}$$
We now fix an invariant, semiclassical Friedrichs (anti-Wick) quantization map
$$\mathrm{Op}^F_\hbar : S^m_\epsilon \longrightarrow \mathrm{Op}^F_\hbar(S^m_\epsilon)$$
with the property that $\mathrm{Op}^F_\hbar(a) \ge 0$ if $a \ge 0$.

Proposition 3. Given $\zeta$ and $\psi_j$ as above,
$$(\mathrm{Op}^F_\hbar(\zeta)\psi_j, \psi_j) = (\mathrm{Op}^F_\hbar(\exp t\,\Xi^*_p \zeta)\psi_j, \psi_j) + O(\hbar^{1 - 2\epsilon})\, e^{C_3 |t|}.$$

Proof. For simplicity of notation, we will write $\sigma = \zeta$ for the remainder of the proof. In view of Proposition 2, it suffices to show that $\|\mathrm{Op}^F_\hbar(\sigma) - \mathrm{Op}_\hbar(\sigma)\| = O(\hbar^{1 - 2\epsilon})$ in $L^2$. We can represent the operator locally in terms of its Weyl quantization
$$\mathrm{Op}^F(\sigma)(x, y; \hbar) = (2\pi\hbar)^{-n} \int e^{i(x-y)\xi/\hbar}\, \sigma^w\big(\tfrac{1}{2}(x + y), \xi; \hbar\big)\, d\xi, \tag{13}$$
where $\sigma^w$ denotes the (local) Weyl symbol. Let $\sigma^F$ denote the corresponding Kohn–Nirenberg symbol, so that
$$\mathrm{Op}_\hbar(\sigma^F) = \mathrm{Op}^w_\hbar(\sigma^w) = \mathrm{Op}^F_\hbar(\sigma).$$
By the usual argument relating Weyl and Kohn–Nirenberg symbols [Sh], it follows that
$$\sigma^F(x, \xi; \hbar) = \sigma^w(x, \xi; \hbar) + O(\hbar^{1 - 2\epsilon}) \tag{14}$$
with similar estimates for the derivatives. It therefore suffices to relate the local Weyl symbol $\sigma^w$ to $\sigma$. The relevant formula is [F]:
$$\sigma^w(y, \hbar\eta; \hbar) = \iint \sigma(q, \hbar p; \hbar)\, \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q))\, dp\, dq. \tag{15}$$
Here, $\Phi$ is an even, non-negative Schwartz function with $\int \Phi = 1$. Note that, since $\sigma$ is compactly supported, the scaling by $\langle p \rangle$ in $\Phi$ is not necessary here. To estimate the difference $\sigma^w - \sigma$, we use Taylor expansion to second order:
$$\begin{aligned}
(\sigma^w - \sigma)(y, \hbar\eta; \hbar) &= \iint [\sigma(q, \hbar p; \hbar) - \sigma(y, \hbar\eta; \hbar)] \cdot \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q))\, dp\, dq \\
&= \iint [\hbar(p - \eta) \cdot \nabla_\eta \sigma + (y - q) \cdot \nabla_y \sigma + R(x, \xi, q, p; \hbar)] \cdot \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q))\, dp\, dq.
\end{aligned} \tag{16}$$
The linear terms in (16) all integrate to zero, since $\Phi$ is even. The quadratic term $R$ is bounded by
$$C\hbar\, (1 + \hbar^{1/2}|\eta - p| + \hbar^{-1/2}|y - q|)^2 \cdot \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q)) \cdot \|\sigma^w\|_{C^2} = O(\hbar^{1 - 2\epsilon}) \tag{17}$$
with similar estimates for the derivatives. □

Since our main interest here is in mass estimates, we will henceforth work with a fixed, positive quantization. For simplicity of notation, we will drop the superscript $F$. To proceed, put
$$t = \delta \log\Big(\frac{1}{\hbar}\Big) \tag{18}$$
in Proposition 3, where $\delta > 0$ is to be determined. By Lemma 1, it is clear that, for such a value of $t$,
$$\exp t\,\Xi^*_p \zeta \in S_{\epsilon + C_1 \delta} \tag{19}$$
with $m(t) = e^{-|t|}$. In order to choose $\delta > 0$, we need to combine the estimate on the $\hbar$-microsupport of the $\psi_j$ (Proposition 1) and the time-dependent Egorov theorem (Proposition 3) using hypotheses (H1) and (H2). To see how to do this, choose a cutoff function $0 \le \tilde{\chi}(s) \in C_0^{\infty}(\mathbb{R})$ which is identically equal to 1 on $[-2, 2]$ and vanishes for $|s| \ge 3$. So, in particular, $\tilde{\chi} = 1$ on $\operatorname{supp} \chi$. Recall that we have defined the associated symbol
$$\tilde{\zeta}(x, \xi) := \tilde{\chi}(\hbar^{-2\epsilon} d^2((x, \xi), \gamma)). \tag{20}$$
Fix a $\delta_1$ with $0 < \delta_1 < 1$ and recall that, by Proposition 1, the microlocal mass of eigenfunctions $\psi_j$ satisfying $P_k \psi_j = E_k(\hbar)\psi_j$
is concentrated (modulo $O(\hbar^{\infty})$) in the domain $\Omega_E = \{(x, \xi) \in T^*X;\ p(x, \xi) \le C\hbar^{\delta_1}\}$. Consider the tubular neighbourhoods $\Gamma = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) < \hbar^{2\epsilon}\}$ and $\tilde{\Gamma} = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) < 2\hbar^{2\epsilon}\}$. Given $(x, \xi) \in \Omega_E - \tilde{\Gamma}$, our objective is to choose $\delta$ so that for $t = -\delta \log \hbar$, $\phi_t(x, \xi) = \exp t\,\Xi_p(x, \xi) \in \Gamma$. To see how to do this, we first of all restrict $\epsilon > 0$ so that
$$f(\epsilon) < \frac{\delta_1}{2}. \tag{21}$$
Then, by hypothesis (H2), it is clear that for $\hbar$ sufficiently small, there exists a point $(x_0, \xi_0) \in \Sigma_E - \tilde{\Gamma}$ such that
$$d((x, \xi), (x_0, \xi_0)) \le C\hbar^{\frac{\delta_1}{2} - f(\epsilon)}. \tag{22}$$
Suppose that we now also require that
$$\delta > \epsilon. \tag{23}$$
By hypothesis (H1), it follows that for $t = -\delta \log \hbar$,
$$d(\phi_t(x_0, \xi_0), \gamma) \le C\hbar^{\delta - \epsilon}. \tag{24}$$
So, by the triangle inequality,
$$d(\phi_t(x, \xi), \gamma) \le d(\phi_t(x, \xi), \phi_t(x_0, \xi_0)) + O(\hbar^{\delta - \epsilon}). \tag{25}$$
Finally, by a first-order Taylor expansion, it follows that
$$d(\phi_t(x, \xi), \phi_t(x_0, \xi_0)) \le \sup_{\Omega_E} |\nabla_{x,\xi} \phi_t| \cdot d((x, \xi), (x_0, \xi_0)) \le \exp(C_1 |t|)\, \hbar^{\frac{\delta_1}{2} - f(\epsilon)} = \hbar^{-C_1 \delta - f(\epsilon) + \frac{\delta_1}{2}},$$
where, in the last inequality, we have used Lemma 1. The end result is that, given (21) and (23), we have for $t = -\delta \log \hbar$, $(x, \xi) \in \Omega_E - \tilde{\Gamma}$ and $\hbar$ sufficiently small,
$$d(\phi_t(x, \xi), \gamma) = O\big(\hbar^{-C_1 \delta - f(\epsilon) + \frac{\delta_1}{2}} + \hbar^{\delta - \epsilon}\big). \tag{26}$$
We would like to arrange that the bicharacteristic curve $\phi_t(x, \xi)$ be in $\Gamma$ after time $t = -\delta \log \hbar$. This will be the case provided
$$\delta < C_1^{-1}\Big(\frac{\delta_1}{2} - f(\epsilon) - \epsilon\Big), \qquad \epsilon < \frac{\delta}{2}, \qquad f(\epsilon) < \frac{\delta_1}{2}. \tag{27}$$
The only other thing we need to consider is the error term in the Egorov Theorem (Proposition 3). In order to ensure that this term does not blow up, we also require that
$$\delta < C_3^{-1}(1 - 2\epsilon). \tag{28}$$
Summing up, we have proved:
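Lemma 3. Let $(\epsilon, \delta)$ satisfy the following inequalities:
$$\delta < \min\Big(C_1^{-1}\Big(\frac{\delta_1}{2} - f(\epsilon) - \epsilon\Big),\ C_3^{-1}(1 - 2\epsilon)\Big), \qquad \epsilon < \frac{\delta}{2}, \qquad f(\epsilon) < \frac{\delta_1}{2}. \tag{29}$$
Then, for $t = -\delta \log \hbar$, $(x, \xi) \in \Omega_E - \tilde{\Gamma}$ and $\hbar$ sufficiently small, $\zeta(\exp t\,\Xi_{p_1}(x, \xi)) = 1$.

Since by assumption (H2), $f \in C^0$ and $f(0) = 0$, the system of inequalities in (29) can be solved for positive $(\epsilon, \delta)$ and clearly, the maximal such $\epsilon$ is the optimal choice, since it will give the best localization near the limit set $\gamma$. To exploit Lemma 3, we will need to discuss the pointwise behaviour of the symbols $\zeta$ and $\tilde{\zeta}$ in greater detail:

Lemma 4. Let $(\epsilon, \delta)$ satisfy the inequalities in (29). Then, for $t = -\delta \log \hbar$, $(x, \xi) \in \Omega_E$ and $\hbar$ sufficiently small,
$$[(1 - \tilde{\zeta}) \cdot (\exp t\,\Xi^*_{p_1} \zeta)](x, \xi) = (1 - \tilde{\zeta})(x, \xi). \tag{30}$$

Proof. When $(1 - \tilde{\zeta})(x, \xi) = 0$, this identity clearly holds since both sides of (30) are zero. On the other hand, if $(x, \xi) \in \operatorname{supp}(1 - \tilde{\zeta})$, then $(x, \xi) \in \Omega_E - \tilde{\Gamma}$ and so, by Lemma 3, $\zeta(\exp t\,\Xi_{p_1}(x, \xi)) = 1$. Thus, (30) is again satisfied. □

Recall that the semiclassical Egorov Theorem (Proposition 3) says that
$$(\mathrm{Op}\,\zeta\, \psi_j, \psi_j) = (\mathrm{Op}(\exp t\,\Xi^*_{p_1} \zeta)\psi_j, \psi_j) + O(\hbar^{1 - 2\epsilon - C_3 \delta}). \tag{31}$$
Since $1 - \tilde{\zeta} \le 1$ holds pointwise and we are using a non-negative, anti-Wick quantization, it follows as a consequence of Proposition 3 that
$$(\mathrm{Op}\,\zeta\, \psi_j, \psi_j) \ge (\mathrm{Op}[(1 - \tilde{\zeta}) \cdot \exp t\,\Xi^*_{p_1} \zeta]\psi_j, \psi_j) + O(\hbar^{1 - 2\epsilon - C_3 \delta}). \tag{32}$$
Next, we expand the RHS in (32):
$$\begin{aligned}
(\mathrm{Op}[(1 - \tilde{\zeta}) \cdot \exp t\,\Xi^*_{p_1} \zeta]\psi_j, \psi_j) &= (\mathrm{Op}[(1 - \tilde{\zeta}) \cdot \exp t\,\Xi^*_{p_1} \zeta] \cdot \mathrm{Op}(\tilde{\chi}^{\delta_1})\psi_j, \psi_j) \\
&\quad + (\mathrm{Op}[(1 - \tilde{\zeta}) \cdot \exp t\,\Xi^*_{p_1} \zeta] \cdot [1 - \mathrm{Op}(\tilde{\chi}^{\delta_1})]\psi_j, \psi_j).
\end{aligned} \tag{33}$$
By Proposition 1,
$$\|[1 - \mathrm{Op}(\tilde{\chi}^{\delta_1})]\psi_j\| = O(\hbar^{\infty}). \tag{34}$$
Using this estimate in (33) gives:
$$(\mathrm{Op}\,\zeta\, \psi_j, \psi_j) \ge (\mathrm{Op}[(1 - \tilde{\zeta}) \cdot (\exp t\,\Xi^*_{p_1} \zeta)] \cdot \mathrm{Op}(\tilde{\chi}^{\delta_1})\psi_j, \psi_j) + O(\hbar^{1 - 2C_0 - C_3 \delta}). \tag{35}$$
Since the symbol $\tilde{\chi}^{\delta_1}$ is supported on the domain $\Omega_E$, it follows by the pointwise identity in (30) that
$$(\mathrm{Op}\,\zeta\, \psi_j, \psi_j) \ge (\mathrm{Op}(1 - \tilde{\zeta}) \cdot \mathrm{Op}(\tilde{\chi}^{\delta_1})\psi_j, \psi_j) + O(\hbar^{1 - 2C_0 - C_3 \delta}). \tag{36}$$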
Finally, appealing again to the microlocalization result in (34), we obtain:
$$(\mathrm{Op}\,\zeta\, \psi_j, \psi_j) \ge (\mathrm{Op}(1 - \tilde{\zeta})\psi_j, \psi_j) + O(\hbar^{1 - 2C_0 - C_3 \delta}). \tag{37}$$
Our main result is now an immediate consequence of the estimate (37). Indeed, given the tubular neighbourhoods $\Gamma = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) \le \hbar^{2\epsilon}\}$, $\tilde{\Gamma} = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) \le 2\hbar^{2\epsilon}\}$ and $(\epsilon, \delta)$ satisfying the estimates in (29), we have proved:

Theorem 1. Let $P_1, \dots, P_d$; $1 \le d \le n$ be elliptic, self-adjoint classical $\hbar$-pseudodifferential operators with $\hbar$-principal symbols $p_1, \dots, p_d$, satisfying $[P_i, P_j] = 0$ for all $1 \le i, j \le d$. Let $0 < \delta_1 < 1$, $\Sigma_E = \{(x, \xi) \in T^*X;\ p_1(x, \xi) - E_1 = \dots = p_d(x, \xi) - E_d = 0\}$ and let $\psi_j$ be an $L^2$-normalized joint eigenfunction satisfying $P_k \psi_j = E_k(\hbar)\psi_j$, where $E_k(\hbar) = E_k + O(\hbar^{\delta_1})$ and $k = 1, \dots, d$. Assume that hypotheses (H1) and (H2) are satisfied. Then, given $(\epsilon, \delta)$ satisfying the estimates in (29) and $\hbar$ sufficiently small,
$$\int_{X - \pi(\tilde{\Gamma})} |\psi_j|^2\, dx \le \int_{\pi(\Gamma)} |\psi_j|^2\, dx + O(\hbar^{\kappa}),$$
where $\kappa = 1 - 2\epsilon - C_3 \delta$.

Remark. Although we have dealt throughout with the explicit rate function $m(t) = e^{-|t|}$ and the associated symbol classes $S^0_\epsilon$, our main result generalizes to include other rate functions $m(t)$ (see [Sj2]). Indeed, let $\hbar \in (0, \hbar_0]$ and assume that $\mu(\hbar) \in (0, \mu_0]$ satisfies $0 < \hbar/\mu^2 \le \hbar^{\epsilon}$ for some $\epsilon > 0$. We define the corresponding symbol classes $S^0_\mu$ as follows ([Sj2], Sect. 8): $a(x, \xi; \hbar, \mu) \in S^0_\mu$ provided
$$|\partial_x^{\alpha} \partial_\xi^{\beta} a(x, \xi; \hbar, \mu)| \le C_{\alpha,\beta}\, \mu^{-(|\alpha| + |\beta|)}.$$
Then, by standard arguments, one has the usual composition formulas for such symbols, together with Calderon–Vaillancourt $L^2$-boundedness results. In particular, $\mathrm{Op}_\hbar(S^0_{\mu_1}) \cdot \mathrm{Op}_\hbar(S^0_{\mu_2}) \subset \mathrm{Op}_\hbar(S^0_\mu)$, where $\mu = \max\{\mu_1, \mu_2\}$. Thus, if $m(t) \ge e^{-|t|}$, it follows that we can work with symbol classes defined by $\mu(\hbar) = m(\epsilon \log \hbar)$ as long as $0 \le \epsilon < 1/2$.
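The system of inequalities (29) and the resulting exponent $\kappa = 1 - 2\epsilon - C_3\delta$ are easy to explore numerically. The following Python sketch is purely illustrative: the function names are hypothetical, the constants $C_1$, $C_3$ and the value of $\delta_1$ are placeholders (the constants from Lemma 1 and Proposition 3 are not computed explicitly in the text), and the default choice $f(\epsilon) = \epsilon$ is only one admissible example.

```python
def admissible(eps, delta, delta1, C1, C3, f=lambda e: e):
    """Return True if (eps, delta) satisfies the three inequalities in (29).

    C1, C3 and f are placeholders for the constants of Lemma 1 and
    Proposition 3 and for the function f of hypothesis (H2).
    """
    if not (0 < eps < 0.5 and delta > 0 and 0 < delta1 < 1):
        return False
    bound = min((delta1 / 2 - f(eps) - eps) / C1, (1 - 2 * eps) / C3)
    return delta < bound and eps < delta / 2 and f(eps) < delta1 / 2


def kappa(eps, delta, C3):
    """Error exponent of Theorem 1: kappa = 1 - 2*eps - C3*delta."""
    return 1 - 2 * eps - C3 * delta


def largest_eps(delta1, C1, C3, f=lambda e: e, steps=2000):
    """Grid search over eps in (0, 1/2) for the largest eps admitting some delta.

    For each eps, delta must lie in the open interval (2*eps, upper);
    the midpoint is returned as a representative admissible delta.
    Returns None if no grid point is feasible.
    """
    best = None
    for i in range(1, steps):
        eps = 0.5 * i / steps
        upper = min((delta1 / 2 - f(eps) - eps) / C1, (1 - 2 * eps) / C3)
        lower = 2 * eps  # from the constraint eps < delta / 2
        if f(eps) < delta1 / 2 and lower < upper:
            best = (eps, 0.5 * (lower + upper))
    return best
```

For instance, with the hypothetical values $\delta_1 = 1/2$, $C_1 = C_3 = 1$ and $f(\epsilon) = \epsilon$, the constraints reduce to $2\epsilon < 1/4 - 2\epsilon$, so that admissible pairs exist exactly for $\epsilon < 1/16$; `largest_eps(0.5, 1.0, 1.0)` returns the largest grid value below this threshold together with a matching $\delta$.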
3. Hyperbolic Geodesics on Quadric Surfaces

In this section, we give a concrete application of Theorem 1. Let $X = \{(x_1, x_2, x_3) \in \mathbb{R}^3;\ \alpha_1 x_1^2 + \alpha_2 x_2^2 + \alpha_3 x_3^2 = 1\}$ be the standard ellipsoid with axes of length $\alpha_1^{-1} > \alpha_2^{-1} > \alpha_3^{-1} > 0$. It was shown by Jacobi ([A]) that geodesic flow on $X$ is completely integrable. In fact, this system is also quantum integrable in arbitrary dimension ([T1, T2]). In this case, we can take $P_1 = -\hbar^2 \Delta$, the standard Laplace–Beltrami operator. One can show [T1] that there exists a functionally independent, second-order, self-adjoint partial differential operator $P_2$ with the property that $[P_1, P_2] = 0$. Define $\Sigma = \{z \in T^*X;\ p_1(z) - 1 = p_2(z) - \alpha_2^{-1} = 0\}$ and denote the canonical dual coordinates to $(x_1, x_2, x_3) \in \mathbb{R}^3$ by $(\xi_1, \xi_2, \xi_3) \in \mathbb{R}^3$. It is well known [A] that the geodesics $\gamma^{\pm} = \{z \in \Sigma;\ x_2 = \xi_2 = 0\}$ are hyperbolic. Moreover, there exists a constant $C = C(\alpha_1, \alpha_2, \alpha_3)$ such that
$$d(\exp t\,\Xi_{p_1} z, \gamma^{\pm}) \le \exp(-C|t|) \cdot \hbar^{-\epsilon}, \tag{38}$$
uniformly for all $z \in \Sigma - \tilde{\Gamma}$, where $\tilde{\Gamma}$ is a neighbourhood of $\gamma^{\pm}$ defined as in Sect. 2. The following is an immediate consequence of Theorem 1:

Corollary 1. Let $\delta_1 > 0$, $E = (E_1, E_2) = (1, \alpha_2^{-1})$ and $\gamma$ be as above, and let $\psi_j$ be a normalized joint eigenfunction associated with $\Sigma_E$ as in Theorem 1. Then, given $(\epsilon, \delta)$ satisfying (29),
$$\int_{X - \pi(\tilde{\Gamma})} |\psi_j|^2\, dx \le \int_{\pi(\Gamma)} |\psi_j|^2\, dx + O(\hbar^{\kappa}).$$

Remarks. Note that applying separation of variables in this example leads to a non-Fuchsian ODE of Heun type ([T1, T2]) with an elliptic function potential $q(x)$. It is not difficult to show that to obtain smooth solutions to the Laplace eigenfunction equation on $X$, one must look for doubly-periodic solutions corresponding to a certain lattice in $\mathbb{C}$ ([T1]). Even the existence of such a solution is by no means obvious, since it is not clear that $q$ is a Picard potential [Ge]. Therefore, Corollary 1 gives a mass concentration result for separatrix eigenfunctions in a case where separation of variables is not readily applicable. There are other interesting algebraically integrable examples in arbitrary dimension satisfying (38), and hence Corollary 1 ([T1, T2]), that can be approached using ODE techniques. However, in these examples, the ODEs arising from separation of variables typically involve multiple spectral parameters and are therefore very difficult to work with directly. Moreover, the set $\gamma$ can also be rather complicated. For example, when the dimension of the hyperellipsoid is at least 3, it is not difficult to show that the projected limit sets $\pi(\gamma)$ can actually be quadric surfaces of dimension $\ge 2$.
4. The One-Dimensional Schrödinger Operator

Let $V \in C^{\infty}(\mathbb{R})$ satisfy:

(i) $V(x + 1) = V(x)$,
(ii) $V(0) = V'(0) = 0$,
(iii) $V''(0) < 0$,
(iv) $-1 \le V(x) \le 0$.

Consider the one-dimensional (reduced) Schrödinger equation
$$P(\hbar)\psi = -\hbar^2 \frac{d^2}{dx^2}\psi + V(x)\psi = \lambda\psi \tag{39}$$
on the circle $S^1 = \mathbb{R} \pmod 1$, with $\lambda = O(\hbar)$. The spectral theory of such a Schrödinger operator near a non-degenerate potential maximum is well known ([B, BPU, CP, Ma]). However, it is of interest to see how this example falls into our framework. To do this, we will need to recall here some elementary properties of the classical flow. Consider the separatrix
$$\Sigma_0 = \{(x, \xi) \in T^*S^1;\ \tfrac{1}{2}\xi^2 + V(x) = 0\}.$$
It is clear that $\Sigma_0$ consists of two pieces and we will denote the subsets corresponding to $\xi \ge 0$ and $\xi \le 0$ by $\Sigma_0^+$ and $\Sigma_0^-$ respectively. To fix matters, we focus here on $\Sigma_0^+$, the other case being similar. In this example, we define
$$\Gamma = \{(x, \xi) \in T^*S^1;\ x^2 + \xi^2 \le \hbar^{\epsilon}\} \cup \{(x, \xi) \in T^*S^1;\ (x - 1)^2 + \xi^2 \le \hbar^{\epsilon}\}$$
and
$$\tilde{\Gamma} = \{(x, \xi) \in T^*S^1;\ x^2 + \xi^2 \le 2\hbar^{\epsilon}\} \cup \{(x, \xi) \in T^*S^1;\ (x - 1)^2 + \xi^2 \le 2\hbar^{\epsilon}\}.$$
Let $(x_0, \xi_0) \in \Sigma_0^+ - \tilde{\Gamma}$ and $(x(t), \xi(t))$ be the solution curve of the Hamilton equations:
$$\frac{dx}{dt} = \xi, \qquad \frac{d\xi}{dt} = -V'(x), \tag{40}$$
with $(x(0), \xi(0)) = (x_0, \xi_0)$. Integration of the equations in (40) yields:
$$\int_{x(0)}^{x(t)} \frac{dx}{\sqrt{-V(x)}} = \sqrt{2}\, t, \qquad \xi(t) = -\int_0^t V'(x(s))\, ds + \xi_0. \tag{41}$$
Let $H_{[a,b]}$ denote the indicator function of the interval $[a, b]$. By the assumptions (i)-(iv) on the potential $V(x)$, there exist $c_1, c_2 > 0$ such that
$$c_2 H_{[0,1/2]}(x)\, x^2 + c_2 H_{[1/2,1]}(x)\, (x - 1)^2 \ge -V(x) \ge c_1 H_{[0,1/2]}(x)\, x^2 + c_1 H_{[1/2,1]}(x)\, (x - 1)^2.$$
To minimize the profusion of constants, we will assume here that $c_1 = 1$ and $c_2 = 2$. Thus,
$$2H_{[0,1/2]}(x)\, x^2 + 2H_{[1/2,1]}(x)\, (x - 1)^2 \ge -V(x) \ge H_{[0,1/2]}(x)\, x^2 + H_{[1/2,1]}(x)\, (x - 1)^2. \tag{42}$$
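The exponential approach of separatrix trajectories to the hyperbolic fixed point, which underlies hypothesis (H1) in this example, can be illustrated numerically. The sketch below is a hedged illustration only: it assumes the model potential $V(x) = -\sin^2(\pi x)$, which satisfies (i)-(iv) but not the normalization $c_1 = 1$, $c_2 = 2$ in (42), so the decay rate comes out as $\sqrt{2}\,\pi$ rather than $\sqrt{2}$. The function names and the Runge-Kutta integrator are choices made here for illustration and are not taken from the text.

```python
import math


def V(x):
    """Assumed model potential: period 1, V(0) = V'(0) = 0, V''(0) < 0, -1 <= V <= 0."""
    return -math.sin(math.pi * x) ** 2


def dV(x):
    """Derivative V'(x) of the model potential."""
    return -math.pi * math.sin(2 * math.pi * x)


def flow(x0, xi0, t, steps=20000):
    """Integrate the Hamilton equations (40), dx/dt = xi, dxi/dt = -V'(x), by classical RK4."""
    h = t / steps
    x, xi = x0, xi0
    for _ in range(steps):
        k1x, k1p = xi, -dV(x)
        k2x, k2p = xi + 0.5 * h * k1p, -dV(x + 0.5 * h * k1x)
        k3x, k3p = xi + 0.5 * h * k2p, -dV(x + 0.5 * h * k2x)
        k4x, k4p = xi + h * k3p, -dV(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        xi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, xi


def separatrix_point(x):
    """Point of the upper separatrix {xi^2/2 + V(x) = 0, xi >= 0} above x."""
    return x, math.sqrt(-2 * V(x))


def exact_x(x0, t):
    """Closed form for this V on the upper separatrix: dx/dt = sqrt(2) sin(pi x)."""
    growth = math.exp(math.sqrt(2) * math.pi * t)
    return 2 / math.pi * math.atan(math.tan(math.pi * x0 / 2) * growth)
```

Starting from `separatrix_point(0.1)` and flowing forward, the distance $|1 - x(t)|$ to the fixed point $(1, 0)$ shrinks by a factor close to $e^{-\sqrt{2}\pi s}$ over each additional time interval of length $s$ once the trajectory is near $x = 1$, in line with the estimates of type (46) and (47) for a suitably normalized potential.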
Consider the first equation in (41). Let $x_0 \ge 2\hbar^{\epsilon}$ and suppose $2\hbar^{\epsilon} \le x(t) \le 1/2$. Then, by the estimate in (42),
$$\int_{x_0}^{x(t)} \frac{dx}{x} \ge \sqrt{2}\, t. \tag{43}$$
As a consequence,
$$x(t) \ge x_0\, e^{\sqrt{2} t} \ge 2\hbar^{\epsilon} e^{\sqrt{2} t}. \tag{44}$$
Suppose now that $\sqrt{2}\, t \ge \log(4^{-1}\hbar^{-\epsilon})$, so that, in particular, $x(t) \ge 1/2$. Then, also by (42),
$$\int_{x_0}^{x(t)} \frac{dx}{1 - x} \ge \sqrt{2}\, t. \tag{45}$$
Thus, it follows that
$$|1 - x(t)| \le |1 - x_0|\, e^{-\sqrt{2} t}, \tag{46}$$
and consequently,
$$|\xi(t)| \le 2\,|1 - x_0|\, e^{-\sqrt{2} t} \tag{47}$$
for the same range of $t$. We now show that (H1) and (H2) follow from the above estimates.

Lemma 5. Let $\gamma := (0, 0) \cup (1, 0)$ and suppose that $d((x_0, \xi_0), \gamma) \ge 2\hbar^{\epsilon}$, where $\xi_0 \ge 0$. Then, for all $t > 0$ and $\hbar$ sufficiently small,
$$d((x(t), \xi(t)), \gamma) = (|1 - x(t)|^2 + \xi(t)^2)^{1/2} \le \hbar^{-\epsilon} e^{-\sqrt{2} t}.$$
A similar result holds for $\xi_0 \le 0$. Furthermore, hypothesis (H2) is also satisfied for this system with $f(\epsilon) = \epsilon$.

Proof. For $\sqrt{2}\, t \ge \log(\hbar^{-\epsilon})$ this follows from the estimates (46) and (47). On the other hand, when $\sqrt{2}\, t \le \log(\hbar^{-\epsilon})$, both $|1 - x(t)| \le \hbar^{-\epsilon} e^{-\sqrt{2} t}$ and $|\xi(t)| \le \hbar^{-\epsilon} e^{-\sqrt{2} t}$, since $0 \le x, \xi \le 1$. The second part of the lemma follows from the fact that $V(x) \sim -x^2$ near $x = 0$ and $V(x) \sim -(x - 1)^2$ near $x = 1$. □

Let $\Omega_1$ be a sufficiently small neighbourhood of $(0, 0)$ and let $\chi_1(x, \xi) \in C_0^{\infty}$ be a cutoff function supported in $\Omega_1$. Then, taking into account the microlocalization result in Proposition 1, by a quantum Birkhoff normal form construction [CP, HS], one can construct a microlocally unitary $\hbar$-Fourier integral operator $U : C_0^{\infty}(\Omega_1) \to C_0^{\infty}(\Omega_1)$ such that
$$\|\mathrm{Op}_\hbar(\zeta)[U^* F(P; \hbar) U - \hbar(D_x x + x D_x)]\| = O(\hbar^{\infty}), \tag{48}$$
where $F(x; \hbar) \sim \sum_{j=0}^{\infty} f_j(x)\hbar^{j}$ and $0 \le \epsilon < 1/2$. As a consequence of (48), it can be shown [CP] that there exist $\alpha_{\pm}$ such that for any eigenfunction $\psi_j$,
$$\|\mathrm{Op}_\hbar(\zeta)(\psi_j - \alpha_+ U u_+ - \alpha_- U u_-)\| = O(\hbar^{\infty}). \tag{49}$$
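Here,
$$u_{\pm}(x) = (2\pi)^{1/2}\, \Gamma(1/2 + i\lambda/\hbar)^{-1}\, e^{-\lambda/2\hbar}\, |\log \hbar|^{-1/2}\, H(\pm x)\, x^{\pm i\lambda/\hbar - 1/2}$$
is the distributional basis of solutions to the equation $\hbar(D_x x + x D_x)u = \lambda u$. To simplify the writing, we will put $c(\hbar) = 2\pi\, |\Gamma(1/2 + i\lambda/\hbar)|^{-2}\, e^{-\lambda/\hbar}\, |\log \hbar|^{-1}$ below. As a starting point, we will compute the microlocal mass of $u_{\pm}$ over the domain $\Omega := [-\hbar^{\epsilon}, \hbar^{\epsilon}] \times [-\hbar^{\epsilon}, \hbar^{\epsilon}] \subset [-1, 1] \times [-1, 1]$ with $0 \le \epsilon < 1/2$. Because of the symmetry of the problem [CP], it suffices to estimate the integral:
$$\int_0^{\hbar^{\epsilon}} |\hat{u}_+|^2\, d\xi = c(\hbar) \int_0^{\hbar^{\epsilon}} \Big| \int_0^{\hbar^{\epsilon}} e^{-i(x\xi - \lambda \log x)/\hbar}\, x^{-1/2}\, dx \Big|^2 d\xi. \tag{50}$$
Notice that we have chosen $c(\hbar)$ so that $\int_0^1 |\hat{u}_+|^2\, d\xi = 1$. By making the change of coordinates
$$y = \frac{x\xi}{\hbar}, \qquad \eta = \frac{\xi}{\hbar}$$
in the integral (50) we get
$$\int_0^{\hbar^{\epsilon}} |\hat{u}_+|^2\, d\xi = c(\hbar) \int_0^{\hbar^{\epsilon - 1}} \Big| \int_0^{\hbar^{\epsilon}\eta} e^{-iy}\, y^{-1/2 + i\lambda/\hbar}\, dy \Big|^2 \frac{d\eta}{\eta}. \tag{51}$$
To estimate this latter integral, we first assume that $\eta \in [0, \hbar^{-\epsilon}]$. Then, by an integration by parts:
$$\int_0^{\hbar^{\epsilon}\eta} e^{-iy}\, y^{-1/2 + i\lambda/\hbar}\, dy = O(\hbar^{\epsilon/2}\eta^{1/2}) + O(1) \int_0^{\hbar^{\epsilon}\eta} e^{-iy}\, y^{1/2 + i\lambda/\hbar}\, dy = O(\hbar^{\epsilon/2}\eta^{1/2}) + O(\hbar^{3\epsilon/2}\eta^{3/2}). \tag{52}$$
Thus, since $c(\hbar) \sim (\log \hbar)^{-1}$,
$$\int_0^{\hbar^{\epsilon}} |\hat{u}_+|^2\, d\xi = c(\hbar) \int_{\hbar^{-\epsilon}}^{\hbar^{\epsilon - 1}} \Big| \int_0^{\hbar^{\epsilon}\eta} e^{-iy}\, y^{-1/2 + i\lambda/\hbar}\, dy \Big|^2 \frac{d\eta}{\eta} + O(|\log \hbar|^{-1}), \tag{53}$$
and so,
$$\int_0^{\hbar^{\epsilon}} |\hat{u}_+|^2\, d\xi = 1 - 2\epsilon + O(|\log \hbar|^{-1}). \tag{54}$$
It follows that the mass inside $\Omega$ dominates the mass in $\Omega^c = [0, 1]^2 - \Omega$ provided $1 - 2\epsilon \ge 2\epsilon$ and so, we must choose
$$\epsilon \le \frac{1}{4}. \tag{55}$$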
Remark. Although we will not prove this here, by using the above analysis together with Taylor expansion near $(0, 0)$, it is not difficult to show that, for any $q(x, \xi) \in C_0^{\infty}(T^*S^1)$, $(\mathrm{Op}_\hbar(q)\psi_j, \psi_j) \to q(0, 0)$ as $\hbar \to 0$ (see also [CP]). We will discuss limits of quantum expected values in greater generality (e.g. near unstable orbits) elsewhere.
5. Appendix A

Fix a constant $C > 0$ and let $P(x; \hbar D_x)$ be a self-adjoint $\hbar$-pseudodifferential operator of order 1 with symbol $p(x, \xi; \hbar) \sim \sum_{j=0}^{\infty} p_j(x, \xi)\hbar^{j}$, where
$$|\partial_x^{\alpha} \partial_\xi^{\beta} p_j| \le C_{\alpha,\beta} \langle \xi \rangle^{1 - j - |\beta|}.$$
We will moreover assume that $P$ is elliptic, with $p_0(x, \xi) \ge C\langle \xi \rangle$ when $|\xi| \ge C_1$. Fix $0 < \delta_1 < 1/2$, $E_1 > 0$ and denote the number of eigenvalues of $P$ (counted with multiplicity) on the interval $[E_1 - C\hbar^{\delta_1}, E_1 + C\hbar^{\delta_1}]$ by $N_{\delta_1, E_1}(\hbar)$. Our objective here is to give an asymptotic lower bound for $N_{\delta_1, E_1}(\hbar)$ in terms of the trace of a pseudodifferential operator (the approximate spectral projector). This method is well known ([Sh, R]) and has been used in a variety of settings. Since we could not find the results of Propositions 4 and 5 explicitly in the literature, we will sketch the proofs. To define the projector, we let $\chi(t) \in C_0^{\infty}(\mathbb{R})$ be identically 1 on the interval $[-C - 1, C + 1]$ with $\operatorname{supp} \chi \subset [-2C - 2, 2C + 2]$. Define
$$\chi_{\delta_1, E_1}(t) := \chi\Big(\frac{t - E_1}{\hbar^{\delta_1}}\Big)$$
and let $\Sigma_s := \{(x, \xi) \in \Sigma_{E_1};\ dp(x, \xi) = 0\}$.

Proposition 4. Let $0 \le \delta_1 < 1/2$ and suppose $\Sigma_{E_1} - \Sigma_s$ contains an open manifold. Then, there exists a constant $C > 0$ such that
$$N_{\delta_1, E_1}(\hbar) \ge C\hbar^{-n + \delta_1}.$$

Proof. Since
$$N_{\delta_1, E}(\hbar) \ge \operatorname{Trace} \chi_{\delta_1}(P(x, \hbar D_x)), \tag{56}$$
it suffices to give a lower bound for $\operatorname{Trace} \chi_{\delta_1}(P)$. The first order of business is to show that $\chi_{\delta_1}(P)$ is an $\hbar$-pseudodifferential operator with singular symbol. One way of doing this [Do] is to use the Cauchy identity:
$$f(P) = -\pi^{-1} \lim_{\epsilon \to 0} \iint_{|\Im z| \ge \epsilon} \frac{\partial \tilde{f}}{\partial \bar{z}}\, (z - P)^{-1}\, dz\, d\bar{z}, \tag{57}$$
which is valid for all $f \in C_0^{\infty}(\mathbb{R})$. Here, $\tilde{f} \in C_0^{\infty}(\mathbb{C})$ denotes an almost-analytic extension of $f$. The resulting operator $f(P(\hbar))$ is then an $\hbar$-pseudodifferential operator with symbol
$$p_f(x, \xi; \hbar) \sim \sum_{j=0}^{\infty} p_{f,j}(x, \xi)\hbar^{j}, \tag{58}$$
where $p_{f,0} = f(p_0)$ and for $j \ge 1$,
$$p_{f,j}(x, \xi) = \sum_{k=1}^{2j - 1} d_{j,k}\, f^{(k)}(p_0), \tag{59}$$
the $d_{j,k}$ being universal polynomials in the derivatives of the $p_l$. One can put $f = \chi_{\delta_1}$ and carry out the symbolic calculations as in the standard case, except that the $p_{f,j}$ will now depend on $\hbar$. However, since $\partial^k \chi_{\delta_1} = O(\hbar^{-\delta_1 k})$, it follows that $p_{f,j}(x, \xi) = O(\hbar^{-\delta_1(2j - 1)})$. Since $\delta_1 < 1/2$, (58) still makes sense as an asymptotic expansion. Taking traces, we get the usual formula:
$$\operatorname{Tr} \chi_{\delta_1}(P(\hbar)) = (2\pi\hbar)^{-n} \iint \chi_{\delta_1}(p_0(x, \xi))\, dx\, d\xi + O(\hbar^{-n + 1 - \delta_1}). \tag{60}$$
By assumption, we can introduce $p_0$ as a radial variable in (60) on an open domain. The result follows. □

By applying the argument above with a cutoff function $\chi(t_1, \dots, t_d) \in C_0^{\infty}(\mathbb{R}^d)$ one can prove in exactly the same way:

Proposition 5. Let $P_1, \dots, P_d$ satisfy the hypotheses in Theorem 1 and suppose $\Sigma_E - \Sigma_s$ contains an open manifold. Then, for $0 \le \delta_1 < 1/2$, there exists a constant $C > 0$ such that
$$N_{\delta_1, E}(\hbar) \ge C\hbar^{-n + \delta_1 \cdot d}.$$
Here, $N_{\delta_1, E}(\hbar)$ denotes the number of $d$-tuples of eigenvalues $(\lambda_1, \dots, \lambda_d)$ of $P_1, \dots, P_d$ satisfying $|\lambda_j - E_j| \le C\hbar^{\delta_1}$, and $\Sigma_s = \{(x, \xi) \in \Sigma_E;\ dp_1, \dots, dp_d \text{ are linearly dependent at } (x, \xi)\}$.

Remark. Under the hypothesis that the joint energy levels $(E_1, \dots, E_d)$ are regular or have sufficiently tame singularities [BU, BPU, DG, GU, PU, R], there are well-known Weyl formulas for the spectral counting function that are much stronger than the lower bound in Proposition 5. The result of Proposition 5 shows that there are many eigenvalues satisfying the hypotheses of Theorem 1 (provided $0 \le \delta_1 < 1/2$) under rather weak assumptions on the singularities of the level variety $\Sigma_E$.

Acknowledgement. I wish to thank Victor Guillemin, Alex Uribe, Steve Zelditch and Maciej Zworski for many helpful comments and valuable discussions. I am also indebted to the referee for several useful comments and suggestions regarding the paper.
References

[A] Arnold, V.I.: Mathematical Methods of Classical Mechanics. Second Edition, Berlin–Heidelberg–New York: Springer-Verlag, 1987
[B] Bleher, P.: Semiclassical quantization rules near separatrices. Commun. Math. Phys. 165, 621–640 (1994)
[BU] Brummelhuis, J. and Uribe, A.: A trace formula for Schrödinger operators. Commun. Math. Phys. 136, 567–584 (1991)
[BPU] Brummelhuis, J., Paul, T. and Uribe, A.: Spectral estimates around a critical level. Duke Math. J. 78 (3), 477–530 (1995)
[CP] Colin de Verdière, Y. and Parisse, B.: Équilibre instable en régime semi-classique I: concentration microlocale. Commun. P.D.E. 19, 1535–1563 (1994)
[DG] Duistermaat, J. and Guillemin, V.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
[Do] Dozias, S.: Mémoire de Magistère de l'ENS. (1993)
[F] Folland, G.: Harmonic Analysis in Phase Space. Annals of Math. Studies 122, Princeton, NJ: Princeton Univ. Press, 1989
[Ge] Gesztesy, F.: On Picard Potentials. Differential and Integral Equations 8 (6), 1453–1476 (1995)
[GU] Guillemin, V. and Uribe, A.: Circular symmetry and the trace formula. Invent. Math. 96, 385–423 (1989)
[HS] Helffer, B. and Sjöstrand, J.: Semiclassical analysis of Harper's equation III. Bull. Soc. Math. France, Mémoire No. 39 (1990)
[Ma] März, C.: Spectral asymptotics for Hill's equation near the potential maximum. Asymptotic Analysis 5, 221–267 (1992)
[PU] Paul, T. and Uribe, A.: The semi-classical trace formula and propagation of wave packets. J. Funct. Anal. 132, 192–249 (1995)
[R] Robert, D.: Autour de l'approximation semi-classique. Progr. Math. 68, Boston: Birkhäuser, 1987
[Sh] Shubin, M.: Pseudodifferential Operators and Spectral Theory. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[Sj1] Sjöstrand, J.: Semi-excited states in nondegenerate potential wells. Asymp. Anal. 6, 29–43 (1992)
[SJ2] Sjöstrand, J.: Microlocal analysis for the periodic magnetic Schrödinger equation and related questions. CIME-lectures, Montecatini (1989), Springer Lecture Notes in Math. 1495, pp. 237–332
[Ta] Taylor, M.: Pseudodifferential Operators. Princeton, NJ: Princeton Univ. Press, 1981
[T1] Toth, J.A.: Various quantum mechanical aspects of quadratic forms. J. Funct. Anal. 130, 1–42 (1995)
[T2] Toth, J.A.: Eigenfunction localization in the quantized rigid body. J. Diff. Geom. 43 (4), 844–858 (1996)
[Vo] Volovoy, A.V.: Improved two-term asymptotics for the eigenvalue distribution function of an elliptic operator on a compact manifold. Commun. in P.D.E. 15 (11), 1509–1563 (1990)
[Z] Zelditch, S.: On the rate of quantum ergodicity. Commun. Math. Phys. 160, 81–92 (1994)
Communicated by P. Sarnak
Commun. Math. Phys. 206, 429 – 445 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
The Entropy Production of Diffusion Processes on Manifolds and Its Circulation Decompositions*

Qian Min, Wang Zheng-dong

Department of Mathematics, Peking University, Beijing 100871, P. R. China

Received: 4 November 1998 / Accepted: 7 April 1999
Abstract: In non-equilibrium statistical mechanics, the entropy production is used to describe the flow of entropy into or out of a time-dependent system. Even if a system is in a steady state (invariant in time), Prigogine suggested that there should be a positive entropy production if it is open. In 1979, the first author of this paper and Qian Min-Ping discovered that the entropy production describes the irreversibility of stationary Markov chains, and proved the circulation decomposition formula of the entropy production. They also obtained the entropy production formula for drifted Brownian motions on Euclidean space $\mathbb{R}^n$ (see a report without proof in the Proc. 1st World Congr. Bernoulli Soc.). By the topological triviality of $\mathbb{R}^n$, there is no discrete circulation associated to diffusion processes on $\mathbb{R}^n$. In this paper, the entropy production formula for stationary drifted Brownian motions on a compact Riemannian manifold $M$ is proved. Furthermore, the entropy production is decomposed into two parts: in addition to the first part, analogous to that of a diffusion process on $\mathbb{R}^n$, some discrete circulations intrinsic to the topology of $M$ appear. The first part is called the hidden circulation and is then explained as the circulation of a lifted process on $M \times S^1$ around the circle $S^1$. The main result of this paper is the circulation decomposition formula, which states that the entropy production of a stationary drifted Brownian motion on $M$ is a linear sum of its circulations around the generators of the fundamental group of $M$ and the hidden circulation.
* Project supported by the National Natural Science Foundation of China and Mathematical Center of State Education Commission.

1. Introduction

In non-equilibrium statistical mechanics, the entropy production is used to characterize how far a system is from equilibrium (see e.g., [P]). As far as we know, this idea has not been studied in probability theory with appropriate generality (see e.g. p. 207 of [Si]). At ICM 1998, G. Gallavotti brought up the topic of entropy production again in his plenary lecture (see e.g., [G]). He used it to solve some classical problems in non-equilibrium statistical mechanics. In 1979, the first author of this paper and Qian Min-Ping considered the entropy production of stationary Markov chains and found the relationship between the entropy production and circulation of Markov chains (see [QQ1]). For a sketch, we suppose that $\xi$ is a stationary Markov chain with discrete state space $S$, transition probability matrix $P = (p_{ij})_{i,j\in S}$ and initial invariant distribution $\pi = (\pi_i)_{i\in S}$. The Markov chain $\xi$ is called reversible if $\pi_i p_{ij} = \pi_j p_{ji}$ for all states $i, j \in S$. The entropy production $e_p$ of $\xi$ is defined by
\[ e_p = \frac{1}{2} \sum_{i,j\in S} (\pi_i p_{ij} - \pi_j p_{ji}) \ln \frac{\pi_i p_{ij}}{\pi_j p_{ji}}. \]
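As a numerical illustration of this definition, the following minimal sketch evaluates $e_p$ for two three-state chains. The matrices `P_reversible` and `P_rotating` and the helper names are hypothetical choices made for this example only, and the invariant distribution is approximated by power iteration, which is an assumption of the sketch rather than a method taken from [QQ1].

```python
import math

def stationary_distribution(P, iterations=1000):
    """Approximate the invariant distribution pi (pi P = pi) by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def entropy_production(P, pi):
    """e_p = 1/2 * sum_{i,j} (pi_i p_ij - pi_j p_ji) * ln(pi_i p_ij / (pi_j p_ji)).

    Pairs with a vanishing forward or backward flux are skipped; the example
    matrices below have strictly positive entries, so nothing is skipped here.
    """
    n = len(P)
    total = 0.0
    for i in range(n):
        for j in range(n):
            forward = pi[i] * P[i][j]
            backward = pi[j] * P[j][i]
            if forward > 0.0 and backward > 0.0:
                total += (forward - backward) * math.log(forward / backward)
    return 0.5 * total

# Hypothetical example 1: a symmetric matrix, so pi is uniform and
# pi_i p_ij = pi_j p_ji holds for every pair of states.
P_reversible = [[0.50, 0.25, 0.25],
                [0.25, 0.50, 0.25],
                [0.25, 0.25, 0.50]]

# Hypothetical example 2: a doubly stochastic matrix biased towards 0 -> 1 -> 2 -> 0.
P_rotating = [[0.1, 0.7, 0.2],
              [0.2, 0.1, 0.7],
              [0.7, 0.2, 0.1]]

ep_reversible = entropy_production(P_reversible, stationary_distribution(P_reversible))
ep_rotating = entropy_production(P_rotating, stationary_distribution(P_rotating))
```

For the second matrix the invariant distribution is uniform, and each of the six ordered off-diagonal pairs contributes $\frac13\cdot\frac12\ln 3.5$ to the sum, so the formula gives $e_p = \frac12\ln 3.5$.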
Clearly, $e_p$ is non-negative, and $e_p = 0$ if and only if $\xi$ is reversible. Hence the entropy production defined above can be regarded as a criterion to characterize how far a Markov chain is from being reversible. Furthermore, the frequency with which any cycle $C$ (an ordered subset of $S$) appears along an orbit of the Markov chain $\xi$ has a certain limit $W_C$. In fact the limit $W_C$ is independent of the orbit and is defined as the circulation of $\xi$ around the cycle $C$. The following circulation decomposition formula of the entropy production is proved in [QQ1]:
\[ e_p = \frac{1}{2} \sum_{C\in\mathcal{C}} (W_C - W_{-C}) \ln \frac{W_C}{W_{-C}}, \]
where $\mathcal{C}$ denotes the set of all cycles, and $-C$ represents the reverse cycle of $C$. We refer to Kalpazidou's book [K] for the further development of the circulation theory, in which the circulation is related to Carathéodory dimension, Betti numbers and Kolmogorov complexity. But all of these are limited to the case of the state space $S$ being discrete.

In this paper, we will consider the entropy production and circulation of stationary drifted Brownian motions on a compact Riemannian manifold $M$ and study the relationship between them. Let $\{x_t\}_{t\ge0}$ be a Brownian motion with drift $X$ on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$, with the state space $M$ ($X$ is a vector field on $M$). Two probability measures $p^+_{[s,t]}$ and $p^-_{[s,t]}$ can be introduced on the σ-algebra $\mathcal{F}_s^t$ generated by $x_u$ ($s \le u \le t$) as the distributions of $\{x_u, s\le u\le t\}$ and $\{x_{t+s-u}, s\le u\le t\}$. $\{x_t\}_{t\ge0}$ is called reversible if $p^+_{[s,t]} = p^-_{[s,t]}$ for any $t > s > 0$. The entropy production of $\{x_t\}_{t\ge0}$ is defined as
\[ e_p(t) = \lim_{\Delta t\to0+} \frac{1}{\Delta t}\, E^p\!\left( \ln \frac{dp^+_{[t,t+\Delta t]}}{dp^-_{[t,t+\Delta t]}} \right), \]
where $E^p$ stands for the expectation with respect to $p^+_{[t,t+\Delta t]}$. It is clear that $e_p(t) = 0$ ($\forall t > 0$) if and only if $\{x_t\}_{t\ge0}$ is reversible. By a variant of Girsanov's formula on compact manifolds, we will prove that the entropy production $e_p(t)$ of the drifted Brownian motion $\{x_t\}_{t\ge0}$ is given by:
\[ e_p(t) = \frac{1}{2} \int_M \left( \langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle - 2\,\frac{\partial \ln\rho}{\partial t} \right) \rho(x,t)\,dx, \]
where $dx$ stands for the Riemannian volume element of $M$, and $\rho(x,t)$ denotes the density of $x_t$ and satisfies $\int_M \rho(x,t)\,dx = 1$. Recall that
\[ \frac{\partial\rho}{\partial t} = \frac{1}{2}\Delta\rho - X\rho - \rho\,\operatorname{div} X. \]
If $\{x_t\}_{t\ge0}$ is stationary, i.e., $\rho(x,t) = \rho(x)$ ($\forall t\ge0$), the entropy production formula given above becomes
\[ e_p(t) = \frac{1}{2} \int_M \langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle\, \rho(x)\,dx. \tag{1.1} \]
This yields the known result: a stationary drifted Brownian motion is reversible if and only if its drift $X$ is a gradient vector field (see e.g., p. 294 of [IW]). We remark that our methods of derivation can be used to prove the entropy production formula for drifted Brownian motions on the Euclidean space $\mathbb{R}^n$, which is given in [QQ2] without proof. The definition of the entropy production given above seems closely related to the definition of Kurchan [ku], which goes back to Andrej [A], Hoover et al. (see for instance [H]) and Evans et al. (see for instance [Ev]). The comparison of our definition with the ones above will be considered in the future.

If the flow $\phi_t$ generated by the vector field $X$ is ergodic, then the rotation number of $\phi_t$ around a closed curve $\gamma$ in $M$ is given by (see e.g., p. 149 of [AA]):
\[ \alpha_\gamma = \int_M (\gamma^*, X)(m)\,d\mu(m), \tag{1.2} \]
where $\gamma^*$ is the De Rham dual of $\gamma$ in the first cohomology group $H^1(M, \mathbb{R})$, $\mu$ is the invariant probability measure of the ergodic flow $\phi_t$, and $(\gamma^*, X)(m)$ is the value of the one-form $\gamma^*$ on $X$ at the point $m$. Even if the flow $\phi_t$ is non-ergodic, the rotation number $\alpha_\gamma$ of the drifted Brownian motion $\{x_t\}_{t\ge0}$ around the closed curve $\gamma$ can be defined and is given by the formula (see [M]):
\[ \alpha_\gamma = \int_M (\gamma^*, X)(m)\,d\mu(m), \tag{1.3} \]
where the De Rham dual $\gamma^*$ of $\gamma$ is chosen to be harmonic, and $\mu$ is the invariant probability measure of $\{x_t\}_{t\ge0}$. It is remarkable that formula (1.3) takes the same form as (1.2), though $\mu$ represents different measures in these two cases. What we have in mind here is an extension of Qian-Qian's result ([QQ1]) on the circulation for Markov chains to the case of diffusion processes on manifolds. The importance of the rotation numbers (or circulation) is revealed in the fact that the irreversibility of diffusion processes can be characterized in terms of them, just as in the discrete case of Markov chains. To see this, let us consider a simple example.

Let $B_t$ be a one-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$ and let $b(x)$ be a bounded continuous function on $\mathbb{R}^1$. The solution process $\{x_t\}_{t\ge0}$ of the following stochastic differential equation with an initial condition $x_0$ gives a Brownian motion with the drift $b(x)$:
\[ dx_t = dB_t + b(x_t)\,dt. \]
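By Girsanov's formula, a new probability measure $\tilde p$ can be defined on $(\Omega, \mathcal{F}_s^t)$ such that $\frac{d\tilde p}{dp}\big|_{\mathcal{F}_s^t} = Z_{s,t}$, where
\[ Z_{s,t}(x_\cdot(\omega)) = \exp\left[ -\int_s^t b(x_u(\omega))\cdot dB_u - \frac{1}{2}\int_s^t b^2(x_u(\omega))\,du \right]. \tag{1.4} \]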
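To calculate the entropy production $e_p(t)$, for simplicity we assume that $\{x_t\}_{t\ge0}$ is a stationary process with an invariant probability measure $\rho(x)\,dx$ on $\mathbb{R}^1$. Observe that
\[ E^{p^-_{[s,t]}}[f(x_\cdot(\omega))] = E^{p}[f(x_{t+s-\cdot}(\omega))] = E^{\tilde p}\big[f(x_{t+s-\cdot}(\omega))\, Z_{s,t}^{-1}(x_\cdot(\omega))\big] \]
holds for any Borel function $f$ on $C([s,t], \mathbb{R}^1)$. Notice that $\{x_u\}_{s\le u\le t}$ is a stationary Brownian motion without drift on the new probability space $(\Omega, \mathcal{F}_s^t, \tilde p)$; thus we have (see Proposition 3.2 in Sect. 3)
\[ E^{\tilde p}\big[f(x_{t+s-\cdot}(\omega))\, Z_{s,t}^{-1}(x_\cdot(\omega))\big] = E^{\tilde p}\left[f(x_\cdot(\omega))\, Z_{s,t}^{-1}(x_{t+s-\cdot}(\omega))\, \frac{\rho(x_t(\omega))}{\rho(x_s(\omega))}\right] = E^{p^+_{[s,t]}}\left[f(x_\cdot(\omega))\, Z_{s,t}(x_\cdot(\omega))\, Z_{s,t}^{-1}(x_{t+s-\cdot}(\omega))\, \frac{\rho(x_t(\omega))}{\rho(x_s(\omega))}\right]. \]
Therefore we have
\[ \frac{dp^-_{[s,t]}}{dp^+_{[s,t]}} = Z_{s,t}(x_\cdot(\omega))\, Z_{s,t}^{-1}(x_{t+s-\cdot}(\omega))\, \frac{\rho(x_t(\omega))}{\rho(x_s(\omega))}. \tag{1.5} \]
A simple stochastic calculus (see the proof of Proposition 3.3) yields that
\[ Z_{s,t}(x_{t+s-\cdot}(\omega)) = \exp\left[ \int_s^t b(x_u(\omega))\cdot dB_u + \int_s^t \Big(\frac{3}{2} b^2 + b'\Big)(x_u(\omega))\,du \right]. \tag{1.6} \]
On the other hand, by Itô's formula and using $\frac{\partial\rho}{\partial t} = \frac{1}{2}\rho'' - (b\rho)' = 0$, we can derive
\[ \begin{aligned} \ln\frac{\rho(x_t(\omega))}{\rho(x_s(\omega))} &= \ln\rho(x_t(\omega)) - \ln\rho(x_s(\omega)) \\ &= \int_s^t (\ln\rho)'(x_u(\omega))\cdot dB_u + \int_s^t \Big(\frac{1}{2}(\ln\rho)'' + b(\ln\rho)'\Big)(x_u(\omega))\,du \\ &= \int_s^t (\ln\rho)'(x_u(\omega))\cdot dB_u + \int_s^t \Big(b' + 2b(\ln\rho)' - \frac{1}{2}\big((\ln\rho)'\big)^2\Big)(x_u(\omega))\,du. \end{aligned} \tag{1.7} \]
By (1.4)–(1.7), we get
\[ \frac{dp^-_{[s,t]}}{dp^+_{[s,t]}} = \exp\left[ -\int_s^t a(x_u(\omega))\cdot dB_u - \frac{1}{2}\int_s^t a^2(x_u(\omega))\,du \right], \]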
where $a(x) = 2b(x) - (\ln\rho)'(x)$. This yields the following entropy production formula for the drifted Brownian motion $\{x_t\}_{t\ge0}$ on $\mathbb{R}^1$:
\[ e_p(t) = \frac{1}{2} \int_{\mathbb{R}^1} a^2(x)\,\rho(x)\,dx. \]
If $b(x)$ is a continuous function on $\mathbb{R}^1$ with period $2\pi$, it can be regarded as a function $\hat b$ on the circle $S^1$ ($\hat b(e^{i\theta}) = b(\theta)$, $0\le\theta\le2\pi$). A process $\{\xi_t\}_{t\ge0}$ with the state space $S^1$ can be defined as
\[ \xi_t(\omega) = \exp(i x_t(\omega)), \qquad \omega\in\Omega,\ t\ge0. \]
Clearly $\{\xi_t\}_{t\ge0}$ is a Brownian motion with drift $\hat b$ on $S^1$. As in the discussion above, the entropy production of $\{\xi_t\}_{t\ge0}$ can also be computed easily. In fact it is given by
\[ e_p(t) = \int_0^{2\pi} \big[2b(\theta) - (\ln\rho)'(\theta)\big]^2 \rho(\theta)\,d\theta, \tag{1.8} \]
where $\rho(\theta)$ is the invariant density of $\{\xi_t\}_{t\ge0}$ and satisfies the normalization condition $\int_0^{2\pi}\rho(\theta)\,d\theta = 1$. The rotation number of $\{\xi_t\}_{t\ge0}$ around the circle $S^1$ is defined as the following limit:
\[ \alpha = \lim_{t\to\infty} \frac{1}{2\pi t}\, x_t. \]
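Observe that $x_t = x_0 + B_t + \int_0^t b(x_u)\,du$ ($B_0 = 0$ being supposed), $(x_0 + B_t)/t \to 0$, and the ergodicity of $\{\xi_t\}_{t\ge0}$ yields that
\[ \Big[\int_0^t b(x_u)\,du\Big]\Big/t = \Big[\int_0^t \hat b(\xi_u)\,du\Big]\Big/t \to \int_0^{2\pi} \hat b(e^{i\theta})\rho(\theta)\,d\theta = \int_0^{2\pi} b(\theta)\rho(\theta)\,d\theta. \]
Hence we have the rotation number formula for $\{\xi_t\}_{t\ge0}$:
\[ \alpha = \frac{1}{2\pi} \int_0^{2\pi} b(\theta)\rho(\theta)\,d\theta. \tag{1.9} \]
Set
\[ h(\theta) = \int_0^\theta [b(\phi) - c]\,d\phi, \]
where
\[ c = \frac{1}{2\pi} \int_0^{2\pi} b(\theta)\,d\theta. \]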
Clearly, $h(\theta)$ is a $C^1$ function on $\mathbb{R}^1$ with period $2\pi$ and satisfies $b(\theta) = c + h'(\theta)$. Since $(\rho' - 2b\rho)' = 0$, i.e., $\rho' - 2b\rho = \text{const.}$, and since $2h' - (\ln\rho)'$ integrates to zero over a period, the entropy production $e_p(t)$ of $\{\xi_t\}_{t\ge0}$ can be rewritten as
\[ e_p(t) = \int_0^{2\pi} \big(2b - (\ln\rho)'\big)(2b\rho - \rho')\,d\theta = 2c \int_0^{2\pi} (2b\rho - \rho')\,d\theta = 4c \int_0^{2\pi} b\rho\,d\theta. \]
Combining this with (1.9), we get the following simple relationship between the entropy production $e_p(t)$ and the circulation $\alpha$ of the drifted Brownian motion $\{\xi_t\}_{t\ge0}$ on $S^1$:
\[ e_p(t) = 8\pi c\,\alpha. \]
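The relation between (1.8) and (1.9) can be checked numerically. The sketch below is only an illustration under stated assumptions: the drift $b(\theta) = c + \varepsilon\cos\theta$ is a hypothetical choice, and the invariant density is built from the representation $\rho(\theta) \propto \int_\theta^{\theta+2\pi} \exp\big(B(\theta) - B(\phi)\big)\,d\phi$ with $B(\theta) = \int_0^\theta 2b$, which is one periodic solution of $\rho' - 2b\rho = \text{const.}$ and is not taken from the text. The derivative $(\ln\rho)'$ is approximated by central differences.

```python
import math

def circle_entropy_and_rotation(c, eps, n=400):
    """Evaluate (1.8) and (1.9) for the illustrative drift b(theta) = c + eps*cos(theta).

    Returns (e_p, alpha). The grid size n and the drift are assumptions of this sketch.
    """
    h = 2.0 * math.pi / n
    theta = [k * h for k in range(n)]
    b = [c + eps * math.cos(t) for t in theta]

    def B(t):
        # B(t) = integral_0^t 2*b(phi) d(phi), available in closed form for this drift.
        return 2.0 * c * t + 2.0 * eps * math.sin(t)

    # Unnormalized periodic solution of rho' - 2*b*rho = const:
    # rho(t) = integral_t^{t+2pi} exp(B(t) - B(phi)) d(phi), by the trapezoid rule.
    rho = []
    for t in theta:
        vals = [math.exp(B(t) - B(t + m * h)) for m in range(n + 1)]
        rho.append(h * (sum(vals) - 0.5 * (vals[0] + vals[-1])))
    mass = h * sum(rho)
    rho = [r / mass for r in rho]  # normalize so that the integral of rho is 1

    # (ln rho)' by central differences on the periodic grid.
    log_rho = [math.log(r) for r in rho]
    dlog = [(log_rho[(k + 1) % n] - log_rho[k - 1]) / (2.0 * h) for k in range(n)]

    e_p = h * sum((2.0 * b[k] - dlog[k]) ** 2 * rho[k] for k in range(n))  # formula (1.8)
    alpha = h * sum(b[k] * rho[k] for k in range(n)) / (2.0 * math.pi)     # formula (1.9)
    return e_p, alpha

c, eps = 0.5, 0.5
e_p, alpha = circle_entropy_and_rotation(c, eps)

# Gradient drift (c = 0): b = h' and rho is proportional to exp(2h).
e_p_gradient, alpha_gradient = circle_entropy_and_rotation(0.0, eps)
```

With $c = 0$ the drift is the derivative of a periodic function, and both the entropy production and the rotation number vanish up to discretization error, in line with the reversibility criterion stated for (1.1).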
Using some geometrical results, in Sect. 4, we prove that the entropy production formula (1.1) for drifted Brownian motions on $M$ can be rewritten as
\[ e_p(t) = 2\int_M (\beta, X)(x)\rho(x)\,dx + 2\int_M (\gamma, X)(x)\rho(x)\,dx, \tag{1.10} \]
where $\beta$ and $\gamma$ represent the co-exact and harmonic one-forms respectively in the Hodge decomposition of the dual one-form $X^*$ of $X$. By the rotation number formula (1.3), we see that the second term $\int_M (\gamma, X)(x)\rho(x)\,dx$ on the right-hand side of (1.10) can be represented as a linear sum of the rotation numbers of $\{x_t\}_{t\ge0}$ around some closed curves in $M$. Hence in the case of $X^*$ being closed, i.e. $\beta = 0$, by (1.10) we see clearly that the entropy production $e_p(t)$ of $\{x_t\}_{t\ge0}$ is a linear sum of its circulations. In Sect. 4, we will explain that $\int_M (\beta, X)(x)\rho(x)\,dx$ (the first term on the right-hand side of (1.10)) represents a hidden circulation of $\{x_t\}_{t\ge0}$. To be more precise, we consider a trivial principal bundle $M\times S^1$ over $M$. The diffusion process $\{x_t\}_{t\ge0}$ can be lifted to $M\times S^1$ with respect to a connection induced by the differential one-form $X^*$ on $M$ (for details see Sects. 2 and 4). We prove that the circulation $\alpha_0$ of the lifted process around the circle $S^1$ is exactly $\int_M (\beta, X)(x)\rho(x)\,dx$. This circulation cannot be observed from the rotation of $\{x_t\}_{t\ge0}$ in $M$ and is called the hidden circulation of $\{x_t\}_{t\ge0}$. By the new entropy production formula (1.10), we see clearly that the entropy production $e_p(t)$ of $\{x_t\}_{t\ge0}$ can be characterized in terms of its circulation and hidden circulation. In fact, we have
\[ e_p(t) = 2\alpha_0 + 2\sum_{i=1}^{b_1} (X^*, \omega_i)\,\alpha_i, \]
where $\alpha_1,\dots,\alpha_{b_1}$ are the rotation numbers of $\{x_t\}_{t\ge0}$ around some closed curves $\gamma_1,\dots,\gamma_{b_1}$ in $M$ (they generate the homology group $H_1(M,\mathbb{R}^1)$, $b_1$ being the first Betti number of $M$), $\omega_i$ is the harmonic one-form dual to $\gamma_i$, and $(X^*,\omega_i)$ is the Hodge inner product between $X^*$ and $\omega_i$.
2. Lifted Processes and Girsanov's Formula

Suppose that $(M, \langle\cdot,\cdot\rangle)$ is a Riemannian manifold and $X_1, X_2, \dots, X_d, Y$ are smooth vector fields on $M$. Let $B_t = (B_t^1, B_t^2, \dots, B_t^d)$ be a $d$-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$. Let us consider the following stochastic differential equation:
\[ dx_t = \sum_{j=1}^d X_j(x_t)\circ dB_t^j + Y(x_t)\,dt \tag{2.1} \]
with an initial condition $x_0$, where $\circ$ is taken in the sense of Stratonovich. The infinitesimal generator $A$ of its solution process is a second order differential operator on $C^\infty(M)$ which satisfies (see e.g. [E])
\[ Af = \frac{1}{2}\sum_{j=1}^d \langle \nabla_{X_j}(\nabla f), X_j\rangle + \Big(Y + \frac{1}{2}\sum_{j=1}^d \nabla_{X_j}X_j\Big) f \]
for all $f\in C^\infty(M)$. In the following we will always assume that the solution process of SDE (2.1) is a Brownian motion on $M$ with a drift vector field $X$. This means that
\[ \Delta = \sum_{j=1}^d \langle \nabla_{X_j}\nabla, X_j\rangle, \qquad X = Y + \frac{1}{2}\sum_{j=1}^d \nabla_{X_j}X_j; \tag{2.2} \]
here $\Delta$ is the Laplace operator on $C^\infty(M)$. We remark that in general the existence of such vector fields $X_1, X_2, \dots, X_d$ on $M$ is not known. However, there is a canonical SDE on the orthonormal frame bundle over $M$, and the solutions to this project down to give Brownian motion on $M$. This construction is due to Eells and Elworthy (see e.g., p. 362 of [E]). To simplify our discussion and make the argument more transparent, we will assume (2.2) throughout this paper.

Suppose that $\{x_t\}_{t\ge0}$ is a solution of SDE (2.1). Let us consider a lift of $\{x_t\}_{t\ge0}$ to $M\times S^1$. Let $A$ be an $\mathbb{R}^1$-valued differential one-form on $M$. Then $iA$ induces a connection of the trivial circle bundle $M\times S^1$ over $M$, and any $C^0$ vector field $Z$ on $M$ can be horizontally lifted to a vector field $\hat Z$ on $M\times S^1$. Regarding the tangent space $T_{(x,g)}(M\times S^1)$ of $M\times S^1$ at the point $(x,g)$ ($x\in M$, $g = e^{i\theta}\in S^1$) as $T_x(M)\oplus T_g S^1$, $\hat Z$ is then given by
\[ \hat Z(x,g) = Z(x) - i(A,Z)(x)\,\frac{d}{d\theta}. \tag{2.3} \]
A lift of $\{x_t\}_{t\ge0}$ to $M\times S^1$ is then defined as a solution process $\{y_t\}_{t\ge0}$ of the following SDE:
\[ dy_t = \sum_{j=1}^d \hat X_j(y_t)\circ dB_t^j + \hat Y(y_t)\,dt \tag{2.4} \]
with an initial condition $y_0 = (x_0, g_0)$. It is easy to prove that $\{y_t\}_{t\ge0}$ projects down to give $\{x_t\}_{t\ge0}$. In fact, by (2.3) and (2.4), we have $y_t = (x_t, g_t)$ with $g_t\in S^1$ satisfying
\[ dg_t = -i\sum_{j=1}^d (A, X_j)(x_t)\circ dB_t^j - i(A,Y)(x_t)\,dt \]
with a given initial condition $g_0$ (in the following discussion, $g_0 = 1$ is always assumed). Clearly, $g_t$ is then given by
\[ g_t = \exp\Big\{ -i\int_0^t \Big[\sum_{j=1}^d (A,X_j)(x_s)\circ dB_s^j + (A,Y)(x_s)\,ds\Big] \Big\}. \tag{2.5} \]
$\{(x_t, g_t)\}_{t\ge0}$ is called the horizontal lifted process of $\{x_t\}_{t\ge0}$ with respect to the connection $iA$. In Sect. 4, we will use this lifted process to define a hidden circulation of the diffusion $\{x_t\}_{t\ge0}$. Using the methods in [WGQ], the lifted process $\{(x_t,g_t)\}_{t\ge0}$ can also be used to derive the following "covariant" Feynman–Kac formula:
\[ [\exp(t(\tilde A - V))f](x) = E_{x_0=x}\Big[ g_t \exp\Big(-\int_0^t V(x_s)\,ds\Big) f(x_t) \Big] \tag{2.6} \]
for all $f$ in $C(M)$, where
\[ \tilde A = \frac{1}{2}\Delta + X - iA^* - \frac{i}{2}\operatorname{div}(A^*) - \frac{1}{2}\langle A^*, A^*\rangle - i\langle X, A^*\rangle, \]
$A^*$ being the vector field on $M$ dual to $A$. In the case of $M$ being Euclidean space, such a formula is known and can be derived by combining the Cameron–Martin–Girsanov formula and the usual version of the Feynman–Kac formula (see e.g., Sect. 15 of [S]). Other Feynman–Kac type formulas can be found in several papers (see e.g., [AHHK, AW and WGQ]).

Notice that the one-form $A$ can be regarded as a connection of the trivial principal bundle $M\times\mathbb{R}^1$ over $M$. As discussed above, we can also consider a horizontal lifted process $\{(x_t, h_t)\}_{t\ge0}$ on $M\times\mathbb{R}^1$ (with respect to the connection form $A$), where $h_t$ is given by
\[ h_t = -\int_0^t \Big[\sum_{j=1}^d (A, X_j)(x_s)\circ dB_s^j + (A,Y)(x_s)\,ds\Big]. \]
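Similar to formula (2.6), we have the following Feynman–Kac type formula:
\[ [\exp(t(\hat A - V))f](x) = E_{x_0=x}\Big[ \exp(h_t)\exp\Big(-\int_0^t V(x_s)\,ds\Big) f(x_t) \Big], \tag{2.7} \]
where $V\in C^0(M)$ is a potential function and
\[ \hat A = \frac{1}{2}\Delta + X - A^* - \frac{1}{2}\operatorname{div}(A^*) + \frac{1}{2}\langle A^*, A^*\rangle - \langle X, A^*\rangle. \]
Let $A = X^*$ be the one-form dual to $X$ and $V = -\frac{1}{2}(\operatorname{div}X + \langle X,X\rangle)$. By (2.7) and (2.2), and using Itô's formula, we get
\[ \Big[\exp\Big(\frac{t}{2}\Delta\Big)f\Big](x) = E_{x_0=x}[Z_t f(x_t)], \tag{2.8} \]
where
\[ Z_t(\omega) = \exp\Big[ -\int_0^t \sum_{j=1}^d \langle X, X_j\rangle(x_s(\omega))\cdot dB_s^j - \frac{1}{2}\int_0^t \langle X,X\rangle(x_s(\omega))\,ds \Big]. \tag{2.9} \]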
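For intuition about (2.8) and (2.9), consider the simplest one-dimensional situation, which is an assumption of this sketch and not the manifold setting of the paper: $M$ is replaced by $\mathbb{R}^1$, the drift is a constant $b$, and $x_t = B_t + bt$ with $x_0 = 0$. Then (2.9) reduces to $Z_t = \exp(-bB_t - \frac12 b^2 t)$, and one expects $E[Z_t] = 1$ and $E[Z_t x_t] = 0$, the second identity reflecting that the reweighted process has no drift. The hypothetical helper below estimates both expectations by Monte Carlo.

```python
import math
import random

def girsanov_check(b, t, samples, seed=0):
    """Monte Carlo estimates of E[Z_t] and E[Z_t * x_t] for constant drift b.

    Assumptions of this sketch: one dimension, x_0 = 0, x_t = B_t + b*t,
    and Z_t = exp(-b*B_t - 0.5*b*b*t), the constant-drift form of (2.9).
    """
    rng = random.Random(seed)
    sum_z = 0.0
    sum_zx = 0.0
    for _ in range(samples):
        B_t = rng.gauss(0.0, math.sqrt(t))      # B_t is N(0, t)
        x_t = B_t + b * t                       # drifted Brownian motion at time t
        z_t = math.exp(-b * B_t - 0.5 * b * b * t)
        sum_z += z_t
        sum_zx += z_t * x_t
    return sum_z / samples, sum_zx / samples

mean_z, mean_zx = girsanov_check(b=0.5, t=1.0, samples=200_000)
```

The two estimates should be close to 1 and 0 respectively, up to Monte Carlo error of order $1/\sqrt{\text{samples}}$.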
By our assumption (2.2), we have $\langle X,X\rangle = \sum_{j=1}^d \langle X,X_j\rangle^2$ and thus $Z_t$ is a martingale on the probability space $(\Omega,\mathcal{F},\mathcal{F}_t,p)$. So a new probability measure $\tilde p$ on $(\Omega,\mathcal{F})$ can be defined by
\[ \frac{d\tilde p}{dp}\Big|_{\mathcal{F}_t} = Z_t, \qquad \forall t>0. \]
By (2.8), we see clearly that the process $\{x_t\}_{t\ge0}$ is a Brownian motion without drift on the probability space $(\Omega,\mathcal{F},\mathcal{F}_t,\tilde p)$. We remark that (2.9) is a variant of Girsanov's formula. Its original proof can be found in [E].

3. The Entropy Production Formula

Let $\{x_t\}_{t\ge0}$ be a diffusion process on a probability space $(\Omega,\mathcal{F},\mathcal{F}_t,p)$. Define $\mathcal{F}_s^t = \vee_{s\le u\le t}\,\sigma(x_u)$ (the σ-algebra generated by $x_u$, $s\le u\le t$), $0\le s<t<\infty$. By the Kolmogorov theorem, $\{x_u\}_{s\le u\le t}$ and $\{x_{s+t-u}\}_{s\le u\le t}$ determine probability measures $p^+_{[s,t]}$ and $p^-_{[s,t]}$ on $\mathcal{F}_s^t$ respectively.
Definition 3.1. If the following limits exist:
\[ e_p(t) = \lim_{\Delta t\to0+}\frac{1}{\Delta t}\,E^p\!\left(\ln\frac{dp^+_{[t,t+\Delta t]}}{dp^-_{[t,t+\Delta t]}}\right), \]
\[ e_p(t,x) = \lim_{\Delta t\to0+}\frac{1}{\Delta t}\,E^p\!\left(\ln\frac{dp^+_{[t,t+\Delta t]}}{dp^-_{[t,t+\Delta t]}}\ \Big|\ x_t = x\right), \]
then $e_p(t)$ and $e_p(t,x)$ are called the entropy production and entropy production density of the diffusion process $\{x_t\}_{t\ge0}$ at time $t$ respectively.

A stationary diffusion process $\{x_t\}_{t\ge0}$ is called reversible if $p^+_{[s,t]} = p^-_{[s,t]}$ holds for any $0\le s<t<\infty$. The entropy production describes the irreversibility of a diffusion process. By the methods of Qian (see e.g. [QQ2]), we can prove easily that a stationary process $\{x_t\}_{t\ge0}$ is reversible if and only if its entropy production $e_p(t)$ equals zero for all $t\ge0$. The entropy production formula for diffusion processes on Euclidean space $\mathbb{R}^n$ has been given in [QQ2] without proof. In this section, we will prove the entropy production formula for drifted Brownian motions on a compact Riemannian manifold $M$.

Let $\{x_t\}_{t\ge0}$ be a diffusion process on $(\Omega,\mathcal{F},\mathcal{F}_t,p)$, with $M$ as its state space. Set $(\eta_{s,t}x)_r(\omega) = x_{t+s-r}(\omega)$ for any $\omega\in\Omega$, $0\le s\le r\le t<\infty$. Then $\{(\eta_{s,t}x)_r(\omega)\}_{s\le r\le t}$ is a diffusion process on the probability space $(\Omega,\mathcal{F},\mathcal{F}_t,p)$. Denote by $\mathcal{R}_s^t$ the set of all functions which are measurable with respect to $\mathcal{F}_s^t$. Any $f\in\mathcal{R}_s^t$ may be represented as $f(\omega) = \tilde f\circ x(\omega)$, where $\tilde f$ is measurable with respect to the σ-algebra $\mathcal{B}(\tilde W_s^t)$ of Borel sets associated to $\tilde W_s^t = C([s,t],M)$. Define a transformation $\eta_{s,t}^*: \mathcal{R}_s^t\to\mathcal{R}_s^t$ by:
\[ (\eta_{s,t}^* f)(\omega) = (\tilde f\circ\eta_{s,t}x)(\omega) \]
for any $f = \tilde f\circ x$, $f\in\mathcal{R}_s^t$.

Proposition 3.2. Suppose that $\{x_t\}_{t\ge0}$ is a Brownian motion without drift on a probability space $(\Omega,\mathcal{F},\mathcal{F}_t,\tilde p)$. Let $\rho(x,u)$ be the probability density of $x_u$, $u\ge0$. If $\rho(x,0)>0$ for any $x\in M$, then
\[ E^{\tilde p}\Big[f(\omega)\,\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)}\Big] = E^{\tilde p}\big[(\eta_{s,t}^* f)(\omega)\big] \tag{3.1} \]
holds for all $f\in\mathcal{R}_s^t$.

Proof. For any $s = t_0<t_1<\cdots<t_n = t$, and $f_0,f_1,\dots,f_n\in C(M)$, we have
\[ E^{\tilde p}\Big[\prod_{i=0}^n f_i(x_{t_i}(\omega))\,\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)}\Big] = \int_M\cdots\int_M \prod_{i=1}^n p(t_i-t_{i-1},x_{t_i},x_{t_{i-1}})\,\rho(x_t,s)\prod_{i=0}^n \big(f_i(x_{t_i})\,dx_{t_i}\big), \]
where $dx$ represents the Riemannian volume element of $M$ and $p(u,x,\cdot)$ is the transition probability density of the Brownian motion $\{x_t\}_{t\ge0}$ without drift, which satisfies $p(u,x,y) = p(u,y,x)$. Hence the right-hand side of the last equality becomes
\[ \begin{aligned} \int_M\cdots\int_M \rho(x_t,s)&\prod_{i=1}^n p(t_i-t_{i-1},x_{t_i},x_{t_{i-1}})\prod_{i=0}^n \big(f_i(x_{t_i})\,dx_{t_i}\big) \\ &= E^{\tilde p}\big[f_n(x_s(\omega))f_{n-1}(x_{t+s-t_{n-1}}(\omega))\cdots f_1(x_{t+s-t_1}(\omega))f_0(x_t(\omega))\big] \\ &= E^{\tilde p}\big[f_0((\eta_{s,t}x)_s(\omega))f_1((\eta_{s,t}x)_{t_1}(\omega))\cdots f_n((\eta_{s,t}x)_{t_n}(\omega))\big] \\ &= E^{\tilde p}\big[(\eta_{s,t}^* f)(\omega)\big]; \end{aligned} \]
here $f(\omega) = \prod_{i=0}^n f_i(x_{t_i}(\omega))$. Hence we see that (3.1) holds for all $f$ in $\mathcal{R}_s^t$. This completes the proof. □

In the following, we suppose that $\{x_t\}_{t\ge0}$ is the solution process of SDE (2.1), which is a Brownian motion with drift $X$ on the probability space $(\Omega,\mathcal{F},\mathcal{F}_t,p)$. Set
\[ Z_{s,t}(\omega) = \exp\Big[-\int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_u(\omega))\cdot dB_u^j - \frac{1}{2}\int_s^t \langle X,X\rangle(x_u(\omega))\,du\Big]. \]
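Proposition 3.3. $Z_{s,t}\in\mathcal{R}_s^t$, and the following holds:
\[ (\eta_{s,t}^* Z_{s,t})(\omega) = \exp\Big[\int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_u(\omega))\cdot dB_u^j\Big]\cdot\exp\Big[\frac{1}{2}\int_s^t (3\langle X,X\rangle + 2\operatorname{div}X)(x_u(\omega))\,du\Big]. \]

Proof. By the compactness of $M$, we may assume that $M$ is a submanifold of $\mathbb{R}^N$ for a large $N$, and the Riemannian metric $\langle\cdot,\cdot\rangle$ of $M$ is induced by the Euclidean metric in $\mathbb{R}^N$. Observe that
\[ \begin{aligned} Z_{s,t}(\omega) &= \exp\Big[-\int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_u(\omega))\circ dB_u^j\Big]\cdot\exp\Big[\frac{1}{2}\int_s^t (\operatorname{div}X + \langle X,X\rangle - 2\langle X,Y\rangle)(x_u(\omega))\,du\Big] \\ &= \exp\Big[-\int_s^t \langle X,\circ dx_u\rangle + \frac{1}{2}\int_s^t (\operatorname{div}X + \langle X,X\rangle)(x_u(\omega))\,du\Big]. \end{aligned} \]
By this expression and the stochastic calculus on $\mathbb{R}^N$, it is easy to see that $Z_{s,t}\in\mathcal{R}_s^t$. Observe that $\eta_{s,t}^*$ is a homomorphism of the algebra $\mathcal{R}_s^t$ and the following holds:
\[ \eta_{s,t}^*\Big[\int_s^t (\operatorname{div}X + \langle X,X\rangle)\,du\Big] = \int_s^t (\operatorname{div}X + \langle X,X\rangle)\,du. \]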
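Hence we see that Proposition 3.3 follows from the following:
\[ \Big[\eta_{s,t}^*\Big(-\int_s^t \langle X,\circ dx_u\rangle\Big)\Big](\omega) = \int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_u(\omega))\cdot dB_u^j + \int_s^t \Big(\frac{1}{2}\operatorname{div}X + \langle X,X\rangle\Big)(x_u(\omega))\,du. \tag{3.2} \]
By the stochastic calculus, we have
\[ \int_s^t \langle X,\cdot\,dx_u\rangle(\omega) = \lim_{n\to\infty}\sum_{k=0}^n \big\langle X(x_{u_k^{(n)}}(\omega)),\ x_{u_{k+1}^{(n)}}(\omega) - x_{u_k^{(n)}}(\omega)\big\rangle, \]
where $s = u_0^{(n)}<u_1^{(n)}<\cdots<u_n^{(n)}<u_{n+1}^{(n)} = t$ is a sequence of partitions of $[s,t]$ such that
\[ \lim_{n\to\infty}\max_{0\le k\le n}\big|u_{k+1}^{(n)} - u_k^{(n)}\big| = 0. \]
Hence we have
\[ \begin{aligned} \Big[\eta_{s,t}^*\Big(\int_s^t \langle X,\cdot\,dx_u\rangle\Big)\Big](\omega) &= \lim_{n\to\infty}\sum_{k=0}^n \big\langle X(x_{t+s-u_k^{(n)}}(\omega)),\ x_{t+s-u_{k+1}^{(n)}}(\omega) - x_{t+s-u_k^{(n)}}(\omega)\big\rangle \\ &= \lim_{n\to\infty}\Big[-\sum_{k=0}^n \big\langle X(x_{u_k^{(n)}}(\omega)),\ x_{u_{k+1}^{(n)}}(\omega) - x_{u_k^{(n)}}(\omega)\big\rangle \\ &\qquad\quad - \sum_{k=0}^n \big\langle X(x_{u_{k+1}^{(n)}}(\omega)) - X(x_{u_k^{(n)}}(\omega)),\ x_{u_{k+1}^{(n)}}(\omega) - x_{u_k^{(n)}}(\omega)\big\rangle\Big] \\ &= -\int_s^t \langle X,\cdot\,dx_u\rangle(\omega) - \int_s^t \langle dX(x_u),dx_u\rangle(\omega). \end{aligned} \]
Since $\int_s^t \langle X,\circ dx_u\rangle(\omega) = \int_s^t \langle X,\cdot\,dx_u\rangle(\omega) + \frac{1}{2}\int_s^t \langle dX(x_u),dx_u\rangle(\omega)$, we get
\[ \begin{aligned} \Big[\eta_{s,t}^*\Big(\int_s^t \langle X,\circ dx_u\rangle\Big)\Big](\omega) &= \Big[\eta_{s,t}^*\Big(\int_s^t \langle X,\cdot\,dx_u\rangle\Big)\Big](\omega) + \frac{1}{2}\int_s^t \langle dX(x_u),dx_u\rangle(\omega) \\ &= -\int_s^t \langle X,\cdot\,dx_u\rangle(\omega) - \frac{1}{2}\int_s^t \langle dX(x_u),dx_u\rangle(\omega) \\ &= -\int_s^t \langle X,\circ dx_u\rangle(\omega) \\ &= -\int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_u(\omega))\circ dB_u^j - \int_s^t \langle X,Y\rangle(x_u(\omega))\,du \\ &= -\int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_u(\omega))\cdot dB_u^j - \int_s^t \Big(\frac{1}{2}\operatorname{div}X + \langle X,X\rangle\Big)(x_u(\omega))\,du. \end{aligned} \]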
This yields (3.2) and completes the proof. □

Now we can prove the entropy production and entropy production density formula for the diffusion process $\{x_t\}_{t\ge0}$.

Theorem 3.4. Let $\rho(x,r)$ be the density of $x_r$. If $\rho(x,0) = \rho(x)>0$ for all $x\in M$, then the entropy production density $e_p(t,x)$ and entropy production $e_p(t)$ of $\{x_t\}_{t\ge0}$ can be expressed as:
\[ e_p(t,x) = \frac{1}{2}\langle 2X - \nabla\ln\rho(x,t),\ 2X - \nabla\ln\rho(x,t)\rangle - \frac{\partial\ln\rho(x,t)}{\partial t}, \]
\[ e_p(t) = \frac{1}{2}\int_M \Big(\langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle - 2\,\frac{\partial\ln\rho}{\partial t}\Big)\rho(x,t)\,dx. \]
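Proof. Define a new probability measure $\tilde p$ on $(\Omega,\mathcal{F}_s^t)$ by $\frac{d\tilde p}{dp}\big|_{\mathcal{F}_s^t} = Z_{s,t}$. Notice that
\[ E^{p^-_{[s,t]}}[f(\omega)] = E^{p}\big[(\eta_{s,t}^* f)(\omega)\big] = E^{\tilde p}\big[(\eta_{s,t}^* f)(\omega)\,Z_{s,t}^{-1}(\omega)\big] \]
holds for any $f\in\mathcal{R}_s^t$. It follows from the discussion in Sect. 2 that $\{x_r\}_{s\le r\le t}$ is a Brownian motion without drift on the new probability space $(\Omega,\mathcal{F}_s^t,\mathcal{F}_r,\tilde p)$. Observe also that we have $(\eta_{s,t}^*)^{-1} = \eta_{s,t}^*$ and $\eta_{s,t}^*(Z_{s,t}^{-1}) = (\eta_{s,t}^* Z_{s,t})^{-1}$. Thus by (3.1), we see that
\[ E^{\tilde p}\big[(\eta_{s,t}^* f)(\omega)\,Z_{s,t}^{-1}(\omega)\big] = E^{\tilde p}\Big[f(\omega)\,(\eta_{s,t}^* Z_{s,t})^{-1}(\omega)\,\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)}\Big] = E^{p^+_{[s,t]}}\Big[f(\omega)\,Z_{s,t}(\omega)\,(\eta_{s,t}^* Z_{s,t})^{-1}(\omega)\,\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)}\Big]. \]
Hence we get
\[ \frac{dp^-_{[s,t]}}{dp^+_{[s,t]}}(\omega) = Z_{s,t}(\omega)\,(\eta_{s,t}^* Z_{s,t})^{-1}(\omega)\,\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)}. \]
By Proposition 3.3, we now get
\[ \frac{dp^-_{[s,t]}}{dp^+_{[s,t]}}(\omega) = \exp\Big[-2\int_s^t\sum_{j=1}^d \langle X,X_j\rangle(x_r(\omega))\cdot dB_r^j\Big]\cdot\exp\Big[-\int_s^t (\operatorname{div}X + 2\langle X,X\rangle)(x_r(\omega))\,dr + \ln\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)}\Big]. \]
Since $\frac{\partial\rho}{\partial t} = \frac{1}{2}\Delta\rho - \langle X,\nabla\rho\rangle - \rho\operatorname{div}X$, by Itô's formula, we have
\[ \ln\frac{\rho(x_t(\omega),s)}{\rho(x_s(\omega),s)} = \int_s^t\sum_{j=1}^d \langle\nabla\ln\rho,X_j\rangle(x_r(\omega))\cdot dB_r^j + \int_s^t \Big[\Big(\frac{1}{2}\Delta + X\Big)(\ln\rho)\Big](x_r(\omega))\,dr. \]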
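Hence
\[ \begin{aligned} \frac{dp^-_{[s,t]}}{dp^+_{[s,t]}} &= \exp\Big[-\int_s^t\sum_{j=1}^d \langle 2X - \nabla\ln\rho,X_j\rangle(x_r(\omega))\cdot dB_r^j\Big]\cdot\exp\Big[-\frac{1}{2}\int_s^t \langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle(x_r(\omega))\,dr\Big] \\ &\qquad\cdot\exp\Big[\int_s^t \Big(-X\ln\rho + \frac{1}{2\rho}\Delta\rho - \operatorname{div}X\Big)(x_r(\omega))\,dr\Big] \\ &= \exp\Big[-\int_s^t\sum_{j=1}^d \langle 2X - \nabla\ln\rho,X_j\rangle(x_r(\omega))\cdot dB_r^j\Big]\cdot\exp\Big[-\frac{1}{2}\int_s^t \Big(\langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle - 2\,\frac{\partial\ln\rho}{\partial s}\Big)(x_r(\omega))\,dr\Big]. \end{aligned} \]
Therefore, Theorem 3.4 follows immediately by taking the limit $\Delta t\to0+$ in the expression of $\ln\big(dp^-_{[t,t+\Delta t]}/dp^+_{[t,t+\Delta t]}\big)$. □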
In the case of $\{x_t\}_{t\ge0}$ being stationary, by Theorem 3.4 we have the following

Corollary 3.5. If $\rho(x,0) = \rho(x)$ is an invariant density of $\{x_t\}_{t\ge0}$, then the entropy production density $e_p(t,x)$ and entropy production $e_p(t)$ of $\{x_t\}_{t\ge0}$ are given by
\[ e_p(t,x) = \frac{1}{2}\langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle(x) \]
and
\[ e_p(t) = \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle(x)\,\rho(x)\,dx \]
respectively.

By Corollary 3.5, we see that a stationary drifted Brownian motion on $M$ is reversible (i.e., its entropy production $e_p(t) = 0$) if and only if its drift $X$ is a gradient vector field. This result is of course known, see e.g., p. 294 of [IW].

4. Entropy Production and Rotation Numbers

In this section, we suppose that the solution process $\{x_t\}_{t\ge0}$ of SDE (2.1) is a Brownian motion with drift $X$ which admits an invariant initial density $\rho(x)>0$, $\forall x\in M$. Suppose that the first homology group $H_1(M,\mathbb{R}^1)$ of $M$ has a finite integral basis $\gamma_1,\dots,\gamma_{b_1}$ ($b_1$ being the first Betti number of $M$, i.e. $b_1 = \dim H_1(M,\mathbb{R}^1)$). Each $\gamma_k$ is a closed curve which can be assumed to be smooth. For any $T>0$, let $L_T = \{x_t(\omega)\mid 0\le t\le T\}$ be an orbit of $\{x_t\}_{t\ge0}$. We join the endpoints $x_0(\omega)$ and $x_T(\omega)$ of $L_T$ with the shortest geodesic arc $L_{0,T}$. Thus $\gamma(T,\omega) = L_T\cup L_{0,T}$ is a closed curve, and there exist integers $n_1(T,\omega),\dots,n_{b_1}(T,\omega)$ such that
\[ \gamma(T,\omega) = \sum_{i=1}^{b_1} n_i(T,\omega)\,\gamma_i \]
holds in the homology sense. The rotation number $\alpha_i$ of $\{x_t\}_{t\ge0}$ around the closed curve $\gamma_i$ is then defined as the following limit:
\[ \alpha_i = \lim_{T\to\infty}\frac{1}{T}\,n_i(T,\omega), \qquad i = 1,\dots,b_1. \]
It is known that these rotation numbers exist and are independent of $\omega$. In fact, they are given by (see e.g., [M])
\[ \alpha_i = \int_M (\omega_i,X)(x)\rho(x)\,dx, \qquad i = 1,\dots,b_1, \tag{4.1} \]
where $\omega_i$ denotes the harmonic one-form among the dual one-forms of $\gamma_i$. We remark that the rotation number formula can be rederived by considering a lifted process on the universal covering manifold $\tilde M$ of $M$. The rotation (also called circulation) of the diffusion process $\{x_t\}_{t\ge0}$ is closely related to its irreversibility. All the rotation numbers $\alpha_1,\dots,\alpha_{b_1}$ of a reversible diffusion process are equal to zero. Note that the converse holds only when the dual one-form $X^*$ of $X$ is closed (see e.g. [IW]).

Now we will rewrite the entropy production formula (Corollary 3.5), so that the relationship between the entropy production and circulation becomes clearer. Denote by $X^*$ the dual one-form of $X$. Let $X^* = \alpha + \beta + \gamma$ be its Hodge decomposition, with $\alpha$, $\beta$, $\gamma$ being the exact, co-exact and harmonic one-forms respectively. Now we give the following theorem, from which we can see how the rotation numbers contribute to the entropy production.

Theorem 4.1. The entropy production $e_p(t)$ of $\{x_t\}_{t\ge0}$ is given by
\[ e_p(t) = 2(\beta,\rho X^*) + 2(\gamma,\rho X^*), \]
where $(\cdot,\cdot)$ stands for the Hodge inner product.

Proof. Set $C = 2\rho X - \nabla\rho$. Denote its dual one-form by $C^*$. By Corollary 3.5, we have
\[ \begin{aligned} e_p(t) &= \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\ 2X - \nabla\ln\rho\rangle(x)\,\rho(x)\,dx \\ &= \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\ C\rangle(x)\,dx \\ &= \frac{1}{2}(2X^* - d\ln\rho,\ C^*) \\ &= \frac{1}{2}(2\alpha - d\ln\rho,\ C^*) + \frac{1}{2}(2\beta,C^*) + \frac{1}{2}(2\gamma,C^*). \end{aligned} \]
Observe that $\rho$ satisfies $\operatorname{div}(2\rho X - \nabla\rho) = 0$. Hence $\delta C^* = -\operatorname{div}C = 0$ (see e.g., p. 223 of [W]), i.e., $C^*$ is co-closed. Since $2\alpha - d\ln\rho$ is exact, this yields $(2\alpha - d\ln\rho,\ C^*) = 0$. Now we get
\[ e_p(t) = (\beta,C^*) + (\gamma,C^*). \tag{4.2} \]
Since $C^* = 2\rho X^* - d\rho$, $(\beta,d\rho) = (\delta\beta,\rho) = 0$ and $(\gamma,d\rho) = (\delta\gamma,\rho) = 0$, we see clearly that Theorem 4.1 follows from (4.2). □
By the rotation number formula (4.1), we see clearly that $(\gamma,\rho X^*)$ can be represented as a linear sum of the rotation numbers $\alpha_1,\dots,\alpha_{b_1}$ of $\{x_t\}_{t\ge0}$ around the closed curves $\gamma_1,\dots,\gamma_{b_1}$. In the following, we shall explain that $(\beta,\rho X^*)$ represents a hidden circulation $\alpha_0$ of $\{x_t\}_{t\ge0}$. Therefore the irreversibility of $\{x_t\}_{t\ge0}$ is characterized in terms of its circulations $\alpha_0,\alpha_1,\dots,\alpha_{b_1}$, just as in the case of Markov chains (see [QQ1]). When the dual one-form $X^*$ of $X$ is closed, the hidden circulation is zero and the entropy production $e_p(t)$ is then a linear sum of the rotation numbers $\alpha_1,\dots,\alpha_{b_1}$.

Define a connection of the principal bundle $M\times S^1$ over $M$ by a differential one-form $iA = 2\pi i\beta$ on $M$. With respect to this connection, the diffusion $\{x_t\}_{t\ge0}$ can be horizontally lifted to $M\times S^1$ (see Sect. 2). We define the rotation number of the horizontal lifted process around the circle $S^1$ as the hidden circulation of $\{x_t\}_{t\ge0}$. To be more precise, suppose that $\{(x_t,g_t)\}_{t\ge0}$ is the lifted process of $\{x_t\}_{t\ge0}$, $g_t = e^{i\theta_t}\in S^1$, $\theta_t$ being continuous with respect to $t$ and the initial condition $\theta_0 = 0$ being given. The hidden circulation $\alpha_0$ of $\{x_t\}_{t\ge0}$ is then defined by
\[ \alpha_0 = \lim_{t\to\infty}\frac{1}{2\pi t}\,\theta_t. \]
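Theorem 4.2. The hidden circulation $\alpha_0$ of $\{x_t\}_{t\ge 0}$ is given by $\alpha_0 = (\beta, \rho X^*)$.

Proof. By (2.5), we have
\[ \theta_t = 2\pi \int_0^t \Big[ \sum_{j=1}^d (\beta, X_j)(x_s)\circ dB_s^j + (\beta, Y)(x_s)\,ds \Big]. \]
Let $\beta^*$ be the dual vector field of $\beta$. Using (2.2) we can prove easily that
\[ \sum_{j=1}^d \langle \nabla_{X_j}\beta^*, X_j\rangle = \operatorname{div}(\beta^*). \]
By the Itô formula, we get
\begin{align*}
\theta_t &= 2\pi \int_0^t \Big[ \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + \Big( \frac12 \sum_{j=1}^d X_j(\beta, X_j) + (\beta, Y) \Big)(x_s)\,ds \Big] \\
&= 2\pi \int_0^t \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + 2\pi \int_0^t \Big( \frac12 \sum_{j=1}^d \langle \nabla_{X_j}\beta^*, X_j\rangle + \Big\langle \beta^*,\ Y + \frac12 \sum_{j=1}^d \nabla_{X_j}X_j \Big\rangle \Big)(x_s)\,ds \\
&= 2\pi \int_0^t \Big[ \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + \Big( \frac12 \operatorname{div}(\beta^*) + \langle \beta^*, X\rangle \Big)(x_s)\,ds \Big].
\end{align*}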
Since $\operatorname{div}(\beta^*) = \delta\beta = 0$, we have
\[ \theta_t = 2\pi \Big[ \int_0^t \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + \int_0^t (X, \beta)(x_s)\,ds \Big]. \tag{4.3} \]
By stochastic analysis and the compactness of $M$, it is easy to prove that
\[ \lim_{t\to\infty} E\Big( \Big| \frac1t \int_0^t (\beta, X_j)(x_s)\cdot dB_s^j \Big|^2 \Big) = 0. \]
Hence, by Chebyshev's inequality we get
\[ \lim_{t\to\infty} \frac1t \int_0^t (\beta, X_j)(x_s)\cdot dB_s^j = 0, \qquad j = 1,\dots,d. \]
On the other hand, by the ergodicity of $\{x_t\}_{t\ge 0}$, we have
\[ \lim_{t\to\infty} \frac1t \int_0^t (X, \beta)(x_s)\,ds = \int_M (X, \beta)(x)\,\rho(x)\,dx. \]
Thus by (4.3), we see that
\[ \alpha_0 = \lim_{t\to\infty} \frac{1}{2\pi t}\,\theta_t = \int_M (X, \beta)(x)\,\rho(x)\,dx, \]
which completes the proof. □

By Theorem 4.1 and Theorem 4.2, the entropy production of $\{x_t\}_{t\ge 0}$ can be represented in terms of its rotation numbers $\alpha_1,\dots,\alpha_{b_1}$ and hidden circulation $\alpha_0$. This can be stated as follows.

Theorem 4.3. The entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ is represented as
\[ e_p(t) = 2\alpha_0 + 2\sum_{i=1}^{b_1} (X^*, \omega_i)\,\alpha_i . \]
Acknowledgement. We would like to express our thanks to Professor Guo Mao-zheng for his helpful discussion.
References
[A] Andrej, L.: Phys. Lett. 111A, 45–46 (1982)
[AA] Arnold, V.I. & Avez, A.: Ergodic problems of classical mechanics. New York: W.A. Benjamin, 1968
[AHHK] Albeverio, S., Høegh-Krohn, R., Holden, H. & Kolsrud, T.: A covariant Feynman–Kac formula for unitary bundles over Euclidean space. In: Stochastic partial differential equations and its applications (G. Da Prato & L. Tubaro eds.), Lecture Notes in Mathematics 1390, Berlin: Springer-Verlag, 1989, pp. 1–12
[AW] Albeverio, S. & Wang, Zheng-dong: Representation of the propagator and Schwinger functions of Dirac fields in terms of Brownian motions. J. Math. Phys. 36, No. 10, 5207–5216 (1995)
[E] Elworthy, K.D.: Geometric aspects of diffusions on manifolds. In: École d'Été de Probabilités de Saint-Flour XV–XVII, Proceedings 1985–87 (P.L. Hennequin ed.), Lecture Notes in Mathematics 1362, Berlin: Springer-Verlag, 1988, pp. 277–425
[Ev] Evans et al: Statistical Mechanics of Non-equilibrium fluids. New York: Academic Press, 1990
[G] Gallavotti, G.: The chaotic hypothesis and universal large deviations properties. In: Abstracts of Plenary and Invited Lectures of ICM 1998, Berlin, 1998, p. 6
[H] Hoover et al: Phys. Rev. Lett. 59, 10–13 (1987)
[IW] Ikeda, N. & Watanabe, S.: Differential equations and diffusion processes (second edition). Amsterdam: North Holland-Kodansha, 1989
[K] Kalpazidou, S.: Cycle representation of Markov processes. New York: Springer-Verlag, 1995
[Ku] Kurchan: Fluctuation theorem for stochastic dynamics. J. Phys. A 31, 3719–3729 (1998)
[M] Manabe, S.: Stochastic intersection number and homological behavior of diffusion processes on manifolds. Osaka J. Math. 19, 429–457 (1982)
[P] Prigogine, I.R.: From being to becoming. San Francisco: W.H. Freeman and Company, 1980
[QQ1] Qian, Min-ping & Qian, Min: Circulation for recurrent Markov chains. Zeit. für Wahr. Ver. Gef. 59, 203–212 (1982)
[QQ2] Qian, Min-ping & Qian, Min: The entropy production and irreversibility of Markov processes. In: Proc. 1st World Congr. Bernoulli Soc., 1988, pp. 307–316
[S] Simon, B.: Functional integration and mathematical physics. New York: Academic Press, 1979
[Si] Sinai, Ya.G.: Topics in Ergodic Theory. Princeton, NJ: Princeton University Press, 1994
[W] Wu, Hong-Xi: Elements of Riemannian geometry. Beijing: Peking University Press, 1988
[WGQ] Wang, Zheng-dong, Guo, Mao-zheng & Qian, Min: Diffusion processes on principal bundles and differential operators on the associated bundles. Science in China (series A) 35, 385–398 (1992)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 447 – 462 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Entropic Repulsion for the Free Field: Pathwise Characterization in d ≥ 3
Jean-Dominique Deuschel1, Giambattista Giacomin2,*
1 Fachbereich Mathematik, TU Berlin, D-10623 Berlin, Germany. E-mail: [email protected]
2 Département de Mathématiques, EPFL, CH-1015 Lausanne, Switzerland
Received: 26 October 1998 / Accepted: 5 April 1999
Abstract: We study concentration properties of the lattice free field $\{\varphi_x\}_{x\in\mathbb{Z}^d}$ in $d\ge 3$, i.e. the centered Gaussian field with covariance given by the Green function of the (discrete) Laplacian, when constrained to be positive in a region of volume $O(N^d)$ (hard-wall condition). It has been shown in [3] that, as $N\to\infty$, the conditioned field is pushed to infinity: more precisely the typical value of the $\varphi$-variable to leading order is $c\sqrt{\log N}$, and the exact value of $c$ was found. It was moreover conjectured that the conditioned field, once this diverging height is subtracted, converges weakly to the lattice free field. Here we prove this conjecture, along with other explicit bounds, always in the direction of clarifying the intuitive idea that the free field with hard-wall conditioning merely translates away from the hard wall. We also give a proof, alternative to the one presented in [3], of the lower bound on the probability that the free field is everywhere positive in a region of volume $N^d$.

1. Introduction and Main Result

Let $\varphi = \{\varphi_x\}_{x\in\mathbb{Z}^d}$ ($d\ge 3$) be the massless free field, i.e. the Gaussian process with zero mean and covariance operator $-\Delta^{-1}$, with $\Delta$ the discrete Laplacian,
\[ \Delta f(x) = \sum_{e\in\mathbb{Z}^d:\,|e|=1} \big(f(x+e) - f(x)\big), \qquad f:\mathbb{Z}^d\to\mathbb{R}. \tag{1.1} \]
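As a small illustration of the unnormalized discrete Laplacian in (1.1), the following Python sketch sums f(x + e) − f(x) over the 2d unit lattice vectors. The helper names (`unit_vectors`, `discrete_laplacian`, `squared_norm`) are hypothetical and not from the paper; the check uses the elementary fact that the quadratic function |x|² has discrete Laplacian 2d.

```python
def unit_vectors(d):
    """Yield the 2d lattice vectors e in Z^d with |e| = 1."""
    for i in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[i] = sign
            yield tuple(e)


def discrete_laplacian(f, x):
    """Return the sum over |e| = 1 of f(x + e) - f(x), as in (1.1)."""
    fx = f(x)
    return sum(
        f(tuple(xi + ei for xi, ei in zip(x, e))) - fx
        for e in unit_vectors(len(x))
    )


def squared_norm(x):
    """f(x) = |x|^2, whose discrete Laplacian is the constant 2d."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    # In d = 3 the value is 2d = 6 at every lattice point.
    print(discrete_laplacian(squared_norm, (2, -1, 5)))
```

Note that with this normalization the operator equals 2d times (P − I), where P is the one-step transition operator of the simple random walk, which is why G is 1/2d times the random-walk Green function.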
We will denote by $G(x,y)$ the matrix element $(-\Delta^{-1})_{x,y}$ and we set $G = G(0,0)$. Observe that $G$ is $1/2d$ times the Green function of the simple random walk on $\mathbb{Z}^d$. We will denote by $P$ the probability distribution of $\varphi$ and by $E$ the corresponding expectation.

* Present address: Dipartimento di Matematica, Università di Milano, via Saldini 50, 20133 Milano, Italy. E-mail: [email protected]

$\mathbb{R}^{\mathbb{Z}^d} \equiv \Omega$ is endowed with the product topology. It is easy to check that $P\in M_1(\mathbb{R}^{\mathbb{Z}^d})$ is a Gibbs measure with formal Hamiltonian
\[ H(\varphi) = \frac14 \sum_{x,y\in\mathbb{Z}^d:\,|x-y|=1} (\varphi_x - \varphi_y)^2. \tag{1.2} \]
By this we mean that for every $x\in\mathbb{Z}^d$,
\[ P\big(d\phi \,\big|\, \mathcal{F}_{\{x\}^c}\big)(\varphi) = \frac{\exp\big(-\frac12 \sum_{y:|y-x|=1} (\phi - \varphi_y)^2\big)\,d\phi}{\int_{\mathbb{R}} \exp\big(-\frac12 \sum_{y:|y-x|=1} (\phi' - \varphi_y)^2\big)\,d\phi'}, \qquad P(d\varphi)\text{-a.s.,} \tag{1.3} \]
in which $\mathcal{F}_A$, $A\subset\mathbb{Z}^d$, is the σ-algebra generated by $\{\varphi_x\}_{x\in A}$. Note that $H(\varphi)$ is well defined if $\varphi_x = \text{const.}$ for $x$ in the complement of a finite set and that adding to such a $\varphi$ a constant (i.e. $\varphi_x\to\varphi_x + \text{const.}$ for every $x$) does not change the value of $H$. The latter property goes under the name of continuum symmetry and it gives to the model several interesting properties, like the fact that associated to $H$ there is a continuum of Gibbs measures. We refer to [8, Ch. 13] for an accurate presentation of the Gibbsian characterization of $P$ and related results (see also Sect. 2 below).

Our attention will be focused on $P$ conditioned to the entropic repulsion event
\[ \Omega_N^+ = \{\varphi\in\Omega : \varphi_x \ge 0 \text{ for all } x\in V_N\}, \tag{1.4} \]
where $V_N = NV\cap\mathbb{Z}^d$, $N\in\mathbb{Z}^+$ and $V\subset\mathbb{R}^d$ is a bounded domain which satisfies a uniform (interior) cone condition (i.e. there exists a right circular cone $K\subset\mathbb{R}^d$, $K$ open set, such that for every $r\in V$ there exists a map $S:\mathbb{R}^d\to\mathbb{R}^d$, composition of a rotation and a translation, such that $SK$ has vertex $r$ and $SK\subset V$). Therefore we set
\[ P_N^+(\cdot) = P\big(\,\cdot \,\big|\, \Omega_N^+\big). \tag{1.5} \]
This is a very simple model for an interface lying above a hard wall: $\varphi_x$ represents the height of the interface at the site $x$ and the wall is assumed to be at $\varphi\equiv 0$. What is expected is that the hard wall will push the interface away from itself, i.e. that $P_N^+$ concentrates on trajectories (in this case: interfaces) which lie further and further from $\varphi\equiv 0$, as $N$ grows (see [5,12] and [6] for physical background and some estimates on more general models). The exact distance at which the interface is pushed has been found in [3], in the case of the free field: for our purposes we need a strengthened version of the result. What we prove in Section 3, Lemma 3.3, is that
\[ \lim_{N\to\infty} \sup_{x\in V_N} \left| \frac{E_N^+(\varphi_x)}{\sqrt{4G\log N}} - 1 \right| = 0. \tag{1.6} \]
In [3, Sect. 4] the statement (1.6) had been established only in the bulk, i.e. if we replace the supremum over $x\in V_N$ with the supremum over $x\in (V_\varepsilon)_N$, $V_\varepsilon = \{r\in V : \operatorname{dist}(r, V^c) > \varepsilon\}$, with $\varepsilon > 0$. We remark that in [3] only the case $V = [-1,1]^d$ was considered. The extension of the results to a domain $V$ as considered here is straightforward (see also Sect. 4 below).

In [3] it has also been conjectured that $P_N^+$, once the diverging repulsion distance is subtracted, would converge (as $N\to\infty$) to $P$ itself. This is in fact our main result: for $a\in\mathbb{R}$, let us denote by $P^+_{a,N}$ the law of the field $\{\varphi_x - a\}_{x\in\mathbb{Z}^d}$, where $\varphi$ is distributed according to $P_N^+$. In what follows $\Rightarrow$ denotes weak convergence of measures. We have the following

Theorem 1.1. There exists a sequence of real numbers $a(N)$ satisfying
\[ \lim_{N\to\infty} \frac{a(N)}{\sqrt{4G\log N}} = 1, \tag{1.7} \]
such that
\[ P^+_{a(N),N} \stackrel{N\to\infty}{\Longrightarrow} P. \tag{1.8} \]
The proof is an immediate consequence of Proposition 2.1 and Proposition 3.1 below. In the proof we take
\[ a(N) = E_N^+(\varphi_0), \tag{1.9} \]
and (1.7) follows from (1.6). Theorem 1.1 is of a local nature, but we will establish also some more global results (see in particular Corollary 3.2 below).

We will prove Theorem 1.1 in two steps. We will first (in Sect. 2) establish the convergence of $P_N^+$, once recentered (by subtracting its mean). Then (Sect. 3) we will show that $E_N^+(\varphi_x - \varphi_y)$ tends to zero as $N\to\infty$ and this will allow us to replace the mean with a ($N$-dependent) constant, completing thus the proof of Theorem 1.1.

We observe that in spite of the fact that we start off in a Gaussian setting, due to the constraint, $P_N^+$ is of course non-Gaussian, and a central role in our analysis is played by the Brascamp–Lieb (B–L) inequality [4], which is a tool developed to deal with non-Gaussian situations. Here we will use the following form of the inequality: for every compactly supported $f:\mathbb{Z}^d\to\mathbb{R}$ and every $N\in\mathbb{Z}^+$,
\[ E_N^+\Big[ F\big( (f,\varphi) - E_N^+[(f,\varphi)] \big) \Big] \le E\big[ F((f,\varphi)) \big], \tag{1.10} \]
where $(f,\varphi) = \sum_x f(x)\varphi_x$ and $F:\mathbb{R}\to\mathbb{R}$ is either $F(r) = |r|^\beta$, any $\beta\ge 1$, or $F(r) = \exp(r)$. The proof is an application of [4, Th. 5.1]: it can be found in [6], but, for completeness, we sketch it here. The key observation is that $\infty 1_{\{\varphi_x\le 0\}}$ is a convex function and the entropic repulsion constraint can be enforced on $P$ by changing the measure with the exponential factor $\exp\big(-\sum_{x\in V_N} \infty 1_{\{\varphi_x\le 0\}}\big)$, properly normalized. To apply directly the result in [4] it is sufficient to approximate $\infty 1_{\{\varphi_x\le 0\}}$ with a $C^2$ convex function, for example $\alpha(\varphi_x)^4 1_{\{\varphi_x\le 0\}}$, $\alpha\in\mathbb{R}^+$, and to consider the centered Gaussian field $P^{(M)}$ on $\mathbb{R}^{\mathbb{Z}^d}$ with covariance given by $-\Delta_M^{-1}$, the inverse Laplacian with zero boundary conditions outside $\Lambda_M$, $\Lambda = (-1,1)^d$ and $M\in\mathbb{Z}^+$. We define $P^{(M,N,\alpha)}$ to be the probability measure satisfying $dP^{(M,N,\alpha)}/dP^{(M)} \propto \exp\big(-\alpha\sum_{x\in V_N} 1_{\{\varphi_x\le 0\}}\big)$. By [4, Th. 5.1], in the case $F(r) = |r|^\beta$ the inequality (1.10) is established uniformly in $M$, $N$ and $\alpha$, if we replace $E_N^+$ with $E^{(M,N,\alpha)}$ and $E$ with $E^{(M)}$. By letting first $M\to\infty$ and then $\alpha\to\infty$ we conclude. The case of $F(r) = \exp(r)$ is reduced to the case $F(r) = r^2$ by the differentiation–integration identity
\[ \log E^{(M,N,\alpha)}\Big[ \exp\big\{ (f,\varphi) - E^{(M,N,\alpha)}((f,\varphi)) \big\} \Big] = \int_0^1\!\!\int_0^t \operatorname{var}_{P_s^{(M,N,\alpha)}}\big((f,\varphi)\big)\,ds\,dt, \]
where $P_s^{(M,N,\alpha)}$ is the probability measure such that
\[ dP_s^{(M,N,\alpha)}/dP^{(M,N,\alpha)} \propto \exp\{s(f,\varphi)\}. \]
In fact $\operatorname{var}_{P_s^{(M,N,\alpha)}}((f,\varphi)) \le \operatorname{var}_{P^{(M)}}((f,\varphi))$ [4] and the proof of (1.10) is concluded by taking limits. An inequality similar to (1.10) holds true also in a fully non-Gaussian setting, i.e. in the case in which $H$ is the sum of convex functions with second derivative bounded away from zero. Various entropic repulsion results in this context are established in [6].

Crucial in establishing (1.6), and therefore for our result, is understanding the asymptotics of $P(\Omega_N^+)$: while for the results in Sect. 2 we will only need (roughly) that the field is pushed toward infinity by the hard wall, to establish Theorem 1.1 we need to know that the field is pushed at distance $\text{const.}\sqrt{\log N}$ and to have a relatively precise control on the value of the constant. We include in this paper (Sect. 4) a proof of the lower bound on $P(\Omega_N^+)$, alternative to the one presented in [3, Th. 1.1]: this is very close in spirit to the original proof, but it relies on a well-known technique of field theory, providing thus a bridge from [3] to the earlier literature.

2. Convergence of the Centered Field

In this section we focus on the recentered field: for each $f:\mathbb{Z}^d\to\mathbb{R}$ define the shift map $T_f:\Omega\to\Omega$ by $(T_f\varphi)_x = \varphi_x - f(x)$. The recentered field is then
\[ \hat P_N = P_N^+ T_f^{-1}, \qquad \text{with } f(x) = E_N^+(\varphi_x). \tag{2.1} \]
The main result of this section is

Proposition 2.1. With the definitions above
\[ \hat P_N \stackrel{N\to\infty}{\Longrightarrow} P. \tag{2.2} \]
We start with two preliminary lemmas.

Lemma 2.2. $\{\hat P_N\}_{N\in\mathbb{Z}^+}$ is tight and any limit point $\hat P$ satisfies
\[ \lim_{n\to\infty} \hat E\big[ (S_n(x))^2 \big] = 0, \qquad \text{for all } x\in\mathbb{Z}^d, \tag{2.3} \]
where $S_n(x) = \sum_y f_n(y)\varphi_{x+y}$ and $\{f_n\}_{n\in\mathbb{Z}^+}$ is any sequence of functions such that $\|f_n\|_1\le 1$ and $\lim_{n\to\infty}\|f_n\|_\infty = 0$.

Proof. By the B–L inequality (1.10) and the definition of $\hat P_N$ we obtain that for every $x\in\mathbb{Z}^d$,
\[ \sup_{N\in\mathbb{Z}^+} \hat E_N\big[ (\varphi_x)^2 \big] \le E\big[ (\varphi_x)^2 \big] = G(0,0) < \infty, \tag{2.4} \]
and therefore $\{\hat P_N\}_{N\in\mathbb{Z}^+}$ is tight. B–L once again gives us also that for any $n\in\mathbb{Z}^+$,
\[ \sup_{N\in\mathbb{Z}^+} \hat E_N\big[ (S_n(x))^2 \big] \le \sum_{x,y} f_n(x)G(x,y)f_n(y). \tag{2.5} \]
Denoting by $G$ the (convolution) operator $Gf(x) = \sum_y G(x,y)f(y)$, by the Hölder and the Young inequality we have that
\[ \sum_x f_n(x)(Gf_n)(x) \le \|f_n\|_q \|G\|_p \|f_n\|_1 \le \|f_n\|_q \|G\|_p, \tag{2.6} \]
whenever $1/p + 1/q = 1$. By using the decay of $G$ at infinity [11, §1.5] we have that $\|G\|_p < \infty$ if $p > d/(d-2)$ and by interpolation $\lim_{n\to\infty}\|f_n\|_q = 0$ for all $q > 1$. This establishes (2.3). □

Let us now define
\[ m_x(\varphi) = \frac{1}{2d} \sum_{y:\,|y-x|=1} \varphi_y. \tag{2.7} \]
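We have the following expression for the expectation of the Laplacian of $\varphi$:

Lemma 2.3. Let $\Phi(r) = \big(\int_{-\infty}^r \exp\{-s^2/2\}\,ds\big)/\sqrt{2\pi}$. For each $x\in\mathbb{Z}^d$,
\[ E_N^+\big[\varphi_x - m_x(\varphi)\big] = \begin{cases} \dfrac{1}{\sqrt{4\pi d}}\, E_N^+\!\left[ \dfrac{\exp\big(-d(m_x(\varphi))^2\big)}{1 - \Phi\big(\sqrt{2d}\,m_x(\varphi)\big)} \right] & \text{if } x\in V_N, \\[2ex] 0 & \text{otherwise.} \end{cases} \tag{2.8} \]
Proof. We write
\[ E_N^+\big[\varphi_x - m_x(\varphi)\big] = E_N^+\Big[ E_N^+\big[ \varphi_x - m_x(\varphi) \,\big|\, \mathcal{F}_{\{x\}^c} \big] \Big], \tag{2.9} \]
and therefore if $x\in V_N^c$ the quantity in (2.9) is equal to zero, since in this case we can take away the repulsion in the conditional expectation in the right-hand side and the result follows by the DLR characterization of the free field. If $x\in V_N$ we extract the conditioning on $\Omega_x^+ \equiv \{\varphi : \varphi_x\ge 0\}$,
\[ E_N^+\big[\varphi_x - m_x(\varphi)\big] = E_N^+\!\left[ \frac{E\big[ (\varphi_x - m_x(\varphi))\,1_{\Omega_x^+} \,\big|\, \mathcal{F}_{\{x\}^c} \big]}{P\big( \Omega_x^+ \,\big|\, \mathcal{F}_{\{x\}^c} \big)} \right]. \tag{2.10} \]
For the numerator we observe that
\begin{align*}
E\big[ (\varphi_x - m_x(\varphi))\,1_{\Omega_x^+} \,\big|\, \mathcal{F}_{\{x\}^c} \big] &= \frac{1}{\sqrt{\pi/d}} \int_0^\infty (\varphi_x - m_x(\varphi)) \exp\big\{ -d(\varphi_x - m_x(\varphi))^2 \big\}\,d\varphi_x \\
&= \frac{1}{\sqrt{4\pi d}} \exp\big\{ -d(m_x(\varphi))^2 \big\}, \tag{2.11}
\end{align*}
and for the denominator
\[ P\big( \Omega_x^+ \,\big|\, \mathcal{F}_{\{x\}^c} \big) = P\big( \varphi_x - m_x(\varphi) \ge -m_x(\varphi) \,\big|\, \mathcal{F}_{\{x\}^c} \big) = 1 - \Phi\big( \sqrt{2d}\,m_x(\varphi) \big). \tag{2.12} \]
The proof of (2.8) is therefore complete. □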
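The Gaussian integral behind (2.11) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the conditional law of the height at x given its neighbours is taken to have density proportional to exp(−d(u − m)²) as in (2.13), and the half-line integral is truncated and evaluated with a plain midpoint rule.

```python
import math


def numerator_quadrature(m, d, upper=12.0, steps=20000):
    """Midpoint rule for (1/sqrt(pi/d)) * int_0^inf (u - m) exp(-d (u - m)^2) du.

    The integral is truncated at max(m, 0) + upper, beyond which the
    integrand is negligible for d >= 1.
    """
    hi = max(m, 0.0) + upper
    h = hi / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += (u - m) * math.exp(-d * (u - m) ** 2)
    return total * h / math.sqrt(math.pi / d)


def numerator_closed_form(m, d):
    """Right-hand side of (2.11): exp(-d m^2) / sqrt(4 pi d)."""
    return math.exp(-d * m * m) / math.sqrt(4.0 * math.pi * d)


if __name__ == "__main__":
    for m in (-0.5, 0.0, 0.7):
        print(m, numerator_quadrature(m, 3), numerator_closed_form(m, 3))
```

The agreement follows from the antiderivative −exp(−d(u − m)²)/(2d) of the integrand, evaluated at u = 0.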
We are now ready to prove Proposition 2.1.

Proof of Proposition 2.1. In Lemma 2.2 we have established the tightness of $\{\hat P_N\}_{N\in\mathbb{Z}^+}$. We are therefore left with showing that any limit point $\hat P$ coincides with $P$. We will start by exhibiting the DLR equations satisfied by $\hat P$. The idea is to observe that the DLR equations for the free field can be cast in the form: for every $x\in\mathbb{Z}^d$,
\[ P\big(d\phi \,\big|\, \mathcal{F}_{\{x\}^c}\big)(\varphi) = \frac{1}{\sqrt{\pi/d}} \exp\big\{ -d(\phi - m_x(\varphi))^2 \big\}\,d\phi, \tag{2.13} \]
and we repeat the same algebraic steps for $\hat P_N$. We obtain
\[ \hat P_N\big(d\phi \,\big|\, \mathcal{F}_{\{x\}^c}\big)(\varphi) = \frac{1}{\hat Z_N(x)} \exp\Big\{ -d\big( (\phi - m_x(\varphi)) - E_N^+[\varphi_x - m_x(\varphi)] \big)^2 \Big\}\, 1_{\{\phi\ge -E_N^+(\varphi_x)\}}\,d\phi, \tag{2.14} \]
in which $\hat Z_N(x)$ is the normalization. From (1.6) (in this case the result only for $x$ away from the boundary is largely sufficient, see therefore [3, Sect. 4] for a proof, or refer directly to Lemma 3.3 below) we deduce that $\lim_{N\to\infty} E_N^+(\varphi_x) = \infty$ and therefore, to verify that $\hat P$ satisfies the same DLR equations as $P$, we are left with proving that
\[ \lim_{N\to\infty} E_N^+\big[\varphi_x - m_x(\varphi)\big] = 0, \qquad \text{for every } x\in\mathbb{Z}^d. \tag{2.15} \]
By using the explicit expression in (2.8) we obtain that if $x\in V_N\setminus\partial^- V_N$,
\[ 0 \le E_N^+\big[\varphi_x - m_x(\varphi)\big] \le \frac{1}{\sqrt{\pi d}}\, E_N^+\Big[ \exp\big( -d(m_x(\varphi))^2 \big) \Big], \tag{2.16} \]
and once again the result follows from (1.6); in fact it is sufficient to know that $P_N^+(\varphi_x < c(N))$ tends to zero for some $c(N)$ tending to infinity, as $N\to\infty$.

Now we know that (each) $\hat P$ satisfies the DLR equations of the free field, i.e. (1.3) or (2.13). We use now the fact that for the free field the set of extremal states is known [8, Ch. 13, ex. 13.29]: every extremal state $Q$ can be written as $P\circ T_h^{-1}$ ($\equiv Q_h$), where $h:\mathbb{Z}^d\to\mathbb{R}$ is a harmonic function. Therefore there exists a probability measure $\hat\nu$ on the set of extremal Gibbs states, viewed as a measurable space with the evaluation σ-algebra [8, Th. 7.26], such that $\hat P = \int Q_h\,d\hat\nu(Q_h)$.

Let us now apply the second part of Lemma 2.2, by choosing $f_n(y) = p_n(0,y)$, the probability that a simple random walk, starting at 0, exits $V_n$ at $y\in\partial^+ V_n$. Note that with this choice, by using the DLR equations and harmonicity of $h$, we have that
\[ E_{Q_h}(S_n(x)) = h(x), \tag{2.17} \]
for every $x\in\mathbb{Z}^d$ and any $n\in\mathbb{Z}^+$. Therefore by (2.3) and Fatou's Lemma we have that for every $x$
\begin{align*}
0 = \lim_{n\to\infty} \hat E\big[ (S_n(x))^2 \big] &= \lim_{n\to\infty} \int \Big[ \operatorname{var}_{Q_h}(S_n(x)) + \big( E_{Q_h}(S_n(x)) \big)^2 \Big]\,d\hat\nu(Q_h) \\
&\ge \int \lim_{n\to\infty} \Big[ \operatorname{var}_{Q_h}(S_n(x)) + (h(x))^2 \Big]\,d\hat\nu(Q_h) = \int (h(x))^2\,d\hat\nu(Q_h), \tag{2.18}
\end{align*}
where $\lim_{n\to\infty} \operatorname{var}_{Q_h}(S_n(x)) = 0$ follows by the very same argument used to obtain (2.3). But (2.18) implies that $\hat\nu$ is concentrated on $P$. □
3. Repulsion and Flatness of the Field

In this section we will require the full strength of (1.6), while in the previous one it was sufficient to know that $\lim_{N\to\infty} E_N^+(\varphi_x) = \infty$, without requiring any uniformity in $x$ or any control over the rate of divergence. Notice however that such an estimate is required only if we want to be able to choose $\delta$ arbitrarily close to 0 in Proposition 3.1 below. For the main result (Theorem 1.1) of this paper, having Proposition 3.1 just for a $\delta < 1$ suffices. It will be clear from the proof that to obtain this weaker result it suffices that the field is pushed at least at distance $\sqrt{(2G+\delta')\log N}$, for some $\delta' > 0$. The main result of this section is the following

Proposition 3.1. For every $\delta > 0$ there exists $C > 0$ such that
\[ \big| E_N^+(\varphi_x - \varphi_y) \big| \le C|x-y|N^{-1+\delta}, \tag{3.1} \]
for every $x, y\in\mathbb{Z}^d$ and every $N\in\mathbb{Z}^+$.

A straightforward application to $\hat P_N$ of the B–L inequality (1.10) with $F(r) = \exp(r)$, together with the exponential Chebychev inequality, yields the following corollary of Proposition 3.1, in which we keep the same notation as in Theorem 1.1.

Corollary 3.2. For every $r$ in the interior of $V$, every $\beta < 1/G$ and any $\delta > 0$, if we set $a(N) = E_N^+(\varphi_{[rN]})$ we have that
\[ \sup_{N\in\mathbb{Z}^+}\ \sup_{x:\,|x-rN|\le N^{1-\delta}} E_N^+\left[ \exp\left( \frac{\beta(\varphi_x - a(N))^2}{2} \right) \right] < \infty. \tag{3.2} \]
Corollary 3.2 will not be used in the sequel, but it gives a strong concentration property of the $P_N^+$-field.

In the proof of Proposition 3.1, we will make use of two lemmas, that we state and prove here. The first one is an extension of the results in [3] on the distance of the field from the hard wall, up to the boundary of the wall.

Lemma 3.3. For every $\delta > 0$ there exists $N_0\in\mathbb{Z}^+$ such that for all $N\ge N_0$,
\[ \sqrt{(4G-\delta)\log N} \le E_N^+[\varphi_x] \le \sqrt{(4G+\delta)\log N}, \qquad \text{for all } x\in V_N\cup\partial^+ V_N. \tag{3.3} \]
Proof. Let us first recall the following result from [3, Prop. 1.3 and Lemma 4.7]: for every $\varepsilon > 0$,
\[ \lim_{N\to\infty} \sup_{x\in(V_\varepsilon)_N} \left| \frac{E_N^+(\varphi_x)}{\sqrt{4G\log N}} - 1 \right| = 0. \tag{3.4} \]
From (3.4), the upper bound in (3.3) is immediate: it suffices in fact to replace $V$ with $(1+\varepsilon)V$, apply (3.4) and then use the FKG inequality. Let us turn to the lower bound. Because of (3.4), the result is already proven for $x\in(V_\varepsilon)_N$. To extend it to the whole box we proceed as follows: for $x\in V_N\setminus(V_\varepsilon)_N$ we have
\[ E_N^+[\varphi_x] = E_N^+\Big[ E_N^+\big[ \varphi_x \,\big|\, \mathcal{F}_{(V_\varepsilon)_N} \big] \Big] \ge E_N^+\Big[ E\big[ \varphi_x \,\big|\, \mathcal{F}_{(V_\varepsilon)_N} \big] \Big], \tag{3.5} \]
in which we have used the FKG inequality. Since
\[ E\big[ \varphi_x \,\big|\, \mathcal{F}_{(V_\varepsilon)_N} \big] = \sum_{y\in\partial^-(V_\varepsilon)_N} p_N^\varepsilon(x,y)\,\varphi_y, \tag{3.6} \]
in which $\{p_N^\varepsilon(x,y)\}_{y\in(V_\varepsilon)_N\cup\{\infty\}}$ is the hitting probability for a simple random walk starting at $x$, the result in $(V_\varepsilon)_N$ implies that for any $\delta' > 0$ and $N$ sufficiently large
\[ E_N^+[\varphi_x] \ge \sqrt{(4G-\delta')\log N}\ P_x\big( \tau_{(V_\varepsilon)_N^c} < \infty \big), \tag{3.7} \]
where $P_x$ is the law of $\{X(j)\}_{j\in\mathbb{Z}^+}$, the simple random walk on $\mathbb{Z}^d$, with $X(0) = x$, and $\tau_A$ is the exit time from $A\subset\mathbb{Z}^d$. We are therefore left with showing that
\[ \lim_{\varepsilon\to 0}\ \inf_{x\in\partial^+ V_N\cup V_N\setminus(V_\varepsilon)_N} P_x\big( \tau_{(V_\varepsilon)_N^c} < \infty \big) = 1. \tag{3.8} \]
Let us start with some notation: as before, we denote by $K$ a (right circular) cone and we use $h(K)$ for the height of $K$. Moreover, with respect to a fixed cone $K$ with vertex $r_0$, we define for every $R > 0$,
\[ B_R^N = \big\{ y\in\mathbb{Z}^d : |y - Nr_0| \le RN \big\}, \tag{3.9} \]
while $B_R = \{r\in\mathbb{R}^d : |r - r_0| \le R\}$. We start by claiming that there exists $\delta > 0$ such that for every $\varepsilon'\in(0, h(K)/4)$,
\[ \inf_{x\in B_{\varepsilon'}^N} P_x\big( X(\tau_{B_{2\varepsilon'}^N}) \in NK \big) \ge \delta, \tag{3.10} \]
uniformly in $N$. This holds because $f_N(x) = P_x\big( X(\tau_{B_{2\varepsilon'}^N}) \in NK \big)$ is a positive harmonic function in $B_{2\varepsilon'}^N$. Therefore, by the Harnack inequality [11, Theorem 1.7.2], there exists a constant $c_H < \infty$ such that
\[ f_N(x_1) \le c_H f_N(x_2), \tag{3.11} \]
for all $x_1, x_2\in B_{\varepsilon'}^N$. By elementary considerations
\[ \lim_{N\to\infty} f_N([Nr_0]) = |\partial B_{2\varepsilon'}\cap K|_{d-1}, \tag{3.12} \]
where $|\cdot|_{d-1}$ denotes the area of a $(d-1)$-dimensional manifold embedded in $\mathbb{R}^d$. Therefore there exists $c > 0$ such that $f_N([Nr_0]) \ge c$ for every $N\in\mathbb{Z}^+$, which, combined with (3.11), yields
\[ \inf_{x\in B_{\varepsilon'}^N} f_N(x) \ge \frac{c}{c_H} \equiv \delta, \tag{3.13} \]
for every $N\in\mathbb{Z}^+$. Therefore (3.10) is proven.

Let us consider now a point $x\in\partial^+ V_N\cup V_N\setminus(V_\varepsilon)_N$. Therefore $x\in B_{2\varepsilon}^N$, the ball centered at $Nr_0$, with $r_0\in\partial V_\varepsilon$, vertex of a cone $K$ contained in $V_\varepsilon$. Observe first of all that
\[ P_x\big( \tau_{(V_\varepsilon)_N^c} < \infty \big) \ge P_x\big( \tau_{K_N^c} < \infty \big) \ge \inf_{y\in B_{2\varepsilon}^N} P_y\big( \tau_{K_N^c} < \infty \big), \tag{3.14} \]
in which the first inequality follows from the fact that $K_N\subset(V_\varepsilon)_N$. By the strong Markov property we obtain that for $y\in B_{2\varepsilon}^N$,
\begin{align*}
P_y\big( \tau_{K_N^c} < \infty \big) &= P_y\Big( \big\{ \tau_{K_N^c} < \infty \big\} \cap \big\{ X(\tau_{B_{4\varepsilon}^N}) \in K_N \big\} \Big) + P_y\Big( \big\{ \tau_{K_N^c} < \infty \big\} \cap \big\{ X(\tau_{B_{4\varepsilon}^N}) \in K_N^c \big\} \Big) \\
&\ge P_y\big( X(\tau_{B_{4\varepsilon}^N}) \in K_N \big) + \Big[ \inf_{z\in B_{4\varepsilon}^N} P_z\big( \tau_{K_N^c} < \infty \big) \Big] \Big[ 1 - P_y\big( X(\tau_{B_{4\varepsilon}^N}) \in K_N \big) \Big]. \tag{3.15}
\end{align*}
If $4\varepsilon < h(K)$ we can apply (3.10) to obtain that $\inf_{y\in B_{2\varepsilon}^N} P_y\big( X(\tau_{B_{4\varepsilon}^N}) \in K_N \big) \ge \delta$ and therefore
\[ \inf_{y\in B_{2\varepsilon}^N} P_y\big( \tau_{K_N^c} < \infty \big) \ge \delta + (1-\delta) \inf_{y\in B_{4\varepsilon}^N} P_y\big( \tau_{K_N^c} < \infty \big). \tag{3.16} \]
From (3.16) it is clear that we can iterate the procedure $n$ times, with $2^{n+1}\varepsilon < h(K)$ and, recalling (3.14), we obtain that
\[ P_x\big( \tau_{(V_\varepsilon)_N^c} < \infty \big) \ge \delta \sum_{j=0}^n (1-\delta)^j = 1 - (1-\delta)^{n+1} \ge 1 - \left( \frac{\varepsilon}{h(K)} \right)^q, \tag{3.17} \]
for some $q > 0$, uniformly in $N\in\mathbb{Z}^+$. By the uniform interior cone assumption on $V$, $K$ (the cone used in the above procedure) can be chosen, up to translations and rotations, to be the same for each point $x\in\partial^+ V_N\cup V_N\setminus(V_\varepsilon)_N$. Therefore the estimate (3.17) is uniform in $x$ and (3.8) is proven. □

Remark. By following the arguments in the beginning of the proof of Lemma 3.3 and using the weak convergence of $X_N(t) = X([tN^2])/N$, $t\in\mathbb{R}^+$, to the standard Brownian motion one can also obtain that for every $r\in\mathbb{R}^+$,
\[ \lim_{N\to\infty} \frac{E_N^+\big( \varphi_{[rN]} \big)}{\sqrt{4G\log N}} = u(r), \tag{3.18} \]
where $u\in C^0(\mathbb{R}^d)$, $u = 1$ in $V$, $u$ harmonic outside $V$ and $\lim_{r\to\infty} u(r) = 0$.
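Lemma 3.4. There exists $C\in\mathbb{R}^+$ such that for all $N\in\mathbb{Z}^+$ and all $x\in V_N$,
\[ E_N^+\Big[ \exp\big\{ -d(m_x(\varphi))^2 \big\} \Big] \le \exp\left\{ -\frac{1}{2G}(m_x(\bar\varphi))^2 + Cm_x(\bar\varphi) \right\}, \tag{3.19} \]
where $\bar\varphi_\cdot = E_N^+(\varphi_\cdot)$.

Proof. First of all we note that
\begin{align*}
E_N^+\Big[ \exp\big\{ -d(m_x(\varphi))^2 \big\} \Big] &= E_N^+\Big[ \exp\big\{ -d(m_x(\varphi-\bar\varphi))^2 + 2d\,m_x(\bar\varphi)\,m_x(\varphi-\bar\varphi) \big\} \Big] \exp\big\{ -d(m_x(\bar\varphi))^2 \big\} \\
&= \hat E_N\Big[ \exp\big\{ -d(m_x(\varphi))^2 + 2d\,m_x(\bar\varphi)\,m_x(\varphi) \big\} \Big] \exp\big\{ -d(m_x(\bar\varphi))^2 \big\}. \tag{3.20}
\end{align*}
If we set
\[ \tilde P_N(d\varphi) = \frac{\exp\big\{ -d(m_x(\varphi))^2 \big\}}{\tilde Z_N}\,\hat P_N(d\varphi), \tag{3.21} \]
where $\tilde Z_N$ is the normalization constant, we can develop (3.20) further to obtain
\[ E_N^+\Big[ \exp\big\{ -d(m_x(\varphi))^2 \big\} \Big] = \tilde E_N\Big[ \exp\big\{ 2d\,m_x(\bar\varphi)\,m_x(\varphi-\tilde\varphi) \big\} \Big] \cdot \hat E_N\Big[ \exp\big\{ -d(m_x(\varphi))^2 \big\} \Big] \exp\big\{ 2d\,m_x(\bar\varphi)\,m_x(\tilde\varphi) \big\} \exp\big\{ -d(m_x(\bar\varphi))^2 \big\}, \tag{3.22} \]
where $\tilde\varphi_\cdot = \tilde E_N(\varphi_\cdot)$. In analogy with (3.21), we define also
\[ \tilde P(d\varphi) = \frac{\exp\big\{ -d(m_x(\varphi))^2 \big\}}{\tilde Z}\,P(d\varphi), \tag{3.23} \]
i.e. we perform the same change of measure but with respect to the free field. $\tilde P$ is a centered Gaussian field and
\[ \tilde E\big[ (m_x(\varphi))^2 \big] = \frac{1}{2d}\,\frac{2dG-1}{2dG}, \tag{3.24} \]
where we used the fact that $E[(m_x(\varphi))^2] = G - (1/2d)$. Therefore by using Jensen's inequality and the B–L inequality we obtain
\begin{align*}
m_x(\tilde\varphi) &= \frac{\hat E_N\big[ m_x(\varphi) \exp\{ -d(m_x(\varphi))^2 \} \big]}{\hat E_N\big[ \exp\{ -d(m_x(\varphi))^2 \} \big]} \\
&\le \Big( \hat E_N\big[ (m_x(\varphi))^2 \big] \Big)^{1/2} \exp\Big\{ d\,\hat E_N\big[ (m_x(\varphi))^2 \big] \Big\} \\
&\le \Big( E\big[ (m_x(\varphi))^2 \big] \Big)^{1/2} \exp\Big\{ d\,E\big[ (m_x(\varphi))^2 \big] \Big\} \\
&= \sqrt{G - \frac{1}{2d}}\ \exp\left\{ dG - \frac12 \right\} \equiv K. \tag{3.25}
\end{align*}
Finally, again by the B–L inequality,
\[ \tilde E_N\Big[ \exp\big\{ 2d\,m_x(\bar\varphi)\,m_x(\varphi-\tilde\varphi) \big\} \Big] \le \tilde E\Big[ \exp\big\{ 2d\,m_x(\bar\varphi)\,m_x(\varphi) \big\} \Big] = \exp\left\{ d\,\frac{2dG-1}{2dG}\,(m_x(\bar\varphi))^2 \right\}. \tag{3.26} \]
Inserting (3.25) and (3.26) into (3.22) we obtain
\[ E_N^+\Big[ \exp\big\{ -d(m_x(\varphi))^2 \big\} \Big] \le \exp\left\{ -\frac{(m_x(\bar\varphi))^2}{2G} + 2dK\,m_x(\bar\varphi) \right\}, \tag{3.27} \]
and the proof is complete. □

We are now ready to prove the main result of this section.

Proof of Proposition 3.1. Set $u_N(x) = E_N^+(\varphi_x)$. Let us denote by $A_N$ the discrete Laplacian of $u_N$. Proposition 3.1 follows if we show that for every $\delta > 0$ we can find $C > 0$ such that, uniformly in $x\in\mathbb{Z}^d$, $i = 1,\dots,d$ and $N\in\mathbb{Z}^+$,
\[ \big| \nabla_i\Delta^{-1}A_N(x) \big| \le \frac{C}{N^{1-\delta}}, \tag{3.28} \]
where $\nabla_i$ is the discrete gradient in the $i$-direction. We denote by $K_i(\cdot)$ the kernel of the operator $\nabla_i\Delta^{-1}$. By [11, Th. 1.5.5] there exists a constant $c_K$ such that for all $x$,
\[ |K_i(x)| \le \frac{c_K}{|x|^{d-1}}. \tag{3.29} \]
Recalling Lemma 2.3, as in (2.16), we have that if $x\in V_N\setminus\partial^- V_N$,
\[ 0 \le A_N(x) \le 2\sqrt{d/\pi}\ E_N^+\Big[ \exp\big( -d(m_x(\varphi))^2 \big) \Big], \tag{3.30} \]
and therefore, by Lemma 3.3 and Lemma 3.4, we obtain that for every $\delta > 0$ there exists $c_a\in\mathbb{R}^+$ such that for every $x\in V_N\setminus\partial^- V_N$,
\[ |A_N(x)| \le \frac{c_a}{N^{2-\delta}}. \tag{3.31} \]
Since $A_N = 0$ outside $V_N$, we are left with the case $x\in\partial^- V_N$. If we call $E_x$ the event $\{\varphi : m_x(\varphi)\le 0\}$, from (2.8) we obtain that there is a constant $c$ such that
\[ |A_N(x)| \le c\left\{ E_N^+\Big[ \exp\big( -d(m_x(\varphi))^2 \big) \Big] + E_N^+\left[ \frac{1_{E_x} \exp\big( -d(m_x(\varphi))^2 \big)}{1 - \Phi\big( \sqrt{2d}\,m_x(\varphi) \big)} \right] \right\}. \tag{3.32} \]
The first term in the right-hand side of (3.32) is bounded by $\text{const.}/N^{2-\delta}$, by the very same argument used for $x\in V_N\setminus\partial^- V_N$. For the other term we use the fact that $\exp(-r^2/2)/\int_r^\infty \exp(-s^2/2)\,ds \le 2 + r$, for $r\ge 0$, and the Hölder inequality to obtain that if $1/p + 1/q = 1$,
\[ E_N^+\left[ \frac{1_{E_x} \exp\big( -d(m_x(\varphi))^2 \big)}{1 - \Phi\big( \sqrt{2d}\,m_x(\varphi) \big)} \right] \le \big( P_N^+(E_x) \big)^{1/q}\ E_N^+\Big[ \big( 2 + \sqrt{2d}\,|m_x(\varphi)| \big)^p \Big]^{1/p}. \tag{3.33} \]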
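The elementary Gaussian tail bound used for (3.33), namely that exp(−r²/2) divided by the tail integral of exp(−s²/2) over [r, ∞) is at most 2 + r for r ≥ 0, can be checked numerically on a grid. This is a sketch only, not a proof: the function names are hypothetical, the tail integral is expressed through the complementary error function as √(π/2)·erfc(r/√2), and the grid is restricted to r ≤ 20 to stay well within floating-point range.

```python
import math


def inverse_mills_ratio(r):
    """exp(-r^2/2) divided by the tail integral int_r^inf exp(-s^2/2) ds."""
    tail = math.sqrt(math.pi / 2.0) * math.erfc(r / math.sqrt(2.0))
    return math.exp(-r * r / 2.0) / tail


def bound_holds(r_max=20.0, steps=2000):
    """Check the inequality ratio(r) <= 2 + r on an evenly spaced grid in [0, r_max]."""
    for k in range(steps + 1):
        r = k * r_max / steps
        if inverse_mills_ratio(r) > 2.0 + r:
            return False
    return True


if __name__ == "__main__":
    print(bound_holds())
    for r in (0.0, 1.0, 5.0, 20.0):
        print(r, inverse_mills_ratio(r), 2.0 + r)
```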
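Observe now that, by the exponential Chebychev inequality and the B–L inequality (1.10) with $F(r) = \exp(r)$, we have that
\begin{align*}
P_N^+(E_x) &\le P_N^+\Big( m_x(\varphi) - E_N^+(m_x(\varphi)) \le -E_N^+(m_x(\varphi)) \Big) \\
&\le \inf_{t\in\mathbb{R}} \exp\left\{ -tE_N^+(m_x(\varphi)) + \frac{t^2}{2}G \right\} = \exp\left\{ -\frac{\big( E_N^+(m_x(\varphi)) \big)^2}{2G} \right\}, \tag{3.34}
\end{align*}
and therefore, by Lemma 3.3, we have that for every $\delta > 0$, there exists $c$ such that for all $N\in\mathbb{Z}^+$,
\[ \sup_{x\in\partial^- V_N} P_N^+(E_x) \le \frac{c}{N^{2-\delta}}. \tag{3.35} \]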
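The optimization over t in (3.34) is the minimization of a quadratic exponent: −ta + t²G/2 is smallest at t = a/G, where it equals −a²/(2G). The short sketch below illustrates this with hypothetical names, writing `a` for the mean of m_x under the conditioned measure and `G` for the variance bound.

```python
import math


def chernoff_exponent(t, a, G):
    """Exponent -t*a + (t^2 / 2) * G appearing inside the infimum in (3.34)."""
    return -t * a + 0.5 * G * t * t


def optimal_bound(a, G):
    """Return the minimizer t* = a / G and the minimized value exp(exponent(t*))."""
    t_star = a / G
    return t_star, math.exp(chernoff_exponent(t_star, a, G))


if __name__ == "__main__":
    a, G = 1.5, 0.25
    t_star, value = optimal_bound(a, G)
    # The minimized value agrees with the closed form exp(-a^2 / (2G)).
    print(t_star, value, math.exp(-a * a / (2.0 * G)))
```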
The second factor in the right-hand side of (3.33) can be easily bounded by using the B–L inequality with $F(r) = |r|^p$ and by using the upper bound in Lemma 3.3: for every $N\in\mathbb{Z}^+$ and every $x\in\partial^- V_N$,
\[ E_N^+\Big[ \big( 2 + \sqrt{2d}\,|m_x(\varphi)| \big)^p \Big]^{1/p} \le c(p)\sqrt{\log N}, \tag{3.36} \]
where $c(p)$ is a constant depending only on $p$. Therefore by choosing $q$ sufficiently close to 1, we extend (3.31) to all $x\in V_N$. Note that a much rougher upper bound than the one given in Lemma 3.3 would have been sufficient.

Let us now go back to using (3.29). We obtain that for every $x$,
\[ \big| \nabla_i\Delta^{-1}A_N(x) \big| \le \frac{c_K c_a}{N^{2-\delta}} \sum_{y\in V_N} \frac{1}{|x-y|^{d-1}} \le \frac{c}{N^{1-\delta}}, \tag{3.37} \]
for some $c\in\mathbb{R}^+$, and therefore (3.28) is proven. □
4. 2-Scale Decomposition and the Lower Bound

For $f:\mathbb{R}^d\to\mathbb{R}$, let us set $Df = (\partial_1 f,\dots,\partial_d f)$.

Proposition 4.1. Let $C$ be the capacity of $V$, i.e.
\[ C \equiv \inf\Big\{ \|Dh\|^2_{L^2(\mathbb{R}^d)} : h\in H^1(\mathbb{R}^d),\ h = 1 \text{ a.e. on } V \Big\}. \tag{4.1} \]
We have the following lower bound on the probability of $\Omega_N^+$:
\[ \liminf_{N\to\infty} \frac{1}{N^{d-2}\log N} \log P\big( \varphi_x\ge 0 \text{ for all } x\in V_N \big) \ge -2GC. \tag{4.2} \]
This result can be found in [3, Prop. 2.1]. Here we present another proof, based on the following observation: we can realize the field $\varphi$ as sum of two independent Gaussian fields $\{\varphi_x^0\}_{x\in\mathbb{Z}^d}$ and $\{\varphi_x^1\}_{x\in\mathbb{Z}^d}$,
\[ \varphi_x = \varphi_x^0 + \varphi_x^1, \tag{4.3} \]
defined, once we fix $\varepsilon > 0$, by
\[ E(\varphi_x^0\varphi_y^0) = \Big[ (-\Delta)^{-1} - (\varepsilon^2 - \Delta)^{-1} \Big]_{x,y} = G^{(0)}(x,y), \tag{4.4} \]
\[ E(\varphi_x^1\varphi_y^1) = \Big[ (\varepsilon^2 - \Delta)^{-1} \Big]_{x,y}, \tag{4.5} \]
and
\[ E(\varphi_x^0) = 0, \qquad E(\varphi_x^1) = 0, \tag{4.6} \]
for all $x, y\in\mathbb{Z}^d$. We will still use $P$ ($E$) for the joint law of $\varphi^0$ and $\varphi^1$. On the other hand, we denote by $P_{\alpha,N}$ ($\alpha\in\mathbb{R}^+$) the law of the random field
\[ \Big\{ \varphi_x^0 + \sqrt{\alpha\log N},\ \varphi_x^1 \Big\}_{x\in\mathbb{Z}^d}. \tag{4.7} \]
This is still a Gaussian field, with the same covariance as $\{\varphi^0, \varphi^1\}$ under $P$, but shifted $\varphi^0$-mean.

As remarked in the introduction, the proof is inspired by the multiscale decomposition of Field Theory (see e.g. [13] and [1]). We actually need only two scales: we are in fact splitting the field into a massless ($\varphi^0$) and a massive component ($\varphi^1$). Notice that the covariance of the massive part is equal to $1/2d$ times the Green function of a simple random walk with killing of rate $\varepsilon^2/2d$. We will use the relative entropy technique: we compute the relative entropy of the original field and the field in which the massless part has been translated of a distance $\sqrt{\alpha\log N}$. The best result, optimal to leading order by the upper bound in [3], is obtained by making the massless part to be infinitesimal (i.e. $\varepsilon\to 0$) and $c$ ($> 2d$) arbitrarily close to $2d$. Once again, this gives another image of the fact that the field under the hard wall condition moves away from the wall, in order to make enough room for the fluctuations to occur.

Proof of Proposition 4.1. First of all we claim that for all $\alpha\in\mathbb{R}^+$,
\[ \lim_{N\to\infty} \frac{1}{N^{d-2}\log N}\,H_N\big( P_{\alpha,N} \,\big|\, P \big) = \frac{\alpha}{2}C, \tag{4.8} \]
where
\[ H_N\big( P_{\alpha,N} \,\big|\, P \big) = E_{\alpha,N}\left( \log \left. \frac{dP_{\alpha,N}}{dP} \right|_{\mathcal{F}_{V_N}} \right). \tag{4.9} \]
We will give the main argument and postpone the proof of (4.8) to the end. As will be clear, we do not need to establish the equality in (4.8): an upper bound, with the same right-hand side, suffices. However equality is just as easy to obtain.
Let σε = E(ϕx0 )2 (which is independent of x). It is immediate to see (for example by using the Fourier transform) that lim σε = 0.
ε→0
(4.10)
The two results (4.8) and (4.10) imply p 1 α log P ϕx0 ≥ α log N for all x ∈ VN ≥ − C, (4.11) lim inf lim inf d−2 ε→0 N→∞ N log N 2 for all x ∈ R+ . The proof of (4.11) goes as follows. First we recall the entropy inequality √ HN Pα 0 ,N |P + e−1 P(ϕx0 ≥ α log N ∀x ∈ VN ) ≥− , log √ √ Pα 0 ,N (ϕx0 ≥ α log N ∀x ∈ VN ) Pα 0 ,N (ϕx0 ≥ α log N ∀x ∈ VN ) (4.12) for α 0 ∈ R+ . Equation (4.12) is a consequence of Jensen’s inequality (see e.g. [3, p. 421]). Combining (4.8) and (4.12) we realize that, to have (4.11), it is sufficient to prove that for any α 0 > α there exists ε0 such that for all ε ≤ ε0 , p lim Pα 0 ,N ϕx0 ≥ α log N for all x ∈ VN = 1, (4.13) N→∞
which is proven by observing that √ p p √ Pα 0 ,N ϕx0 ≥ α log N ∀x ∈ VN = P ϕx0 ≥ ( α − α 0 ) log N ∀x ∈ VN
d
≥1−N P
ϕ00
s ! √ p √ √ √ log N d < ( α − α 0 ) log N = 1 − N 9 − ( α 0 − α) , σε (4.14)
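and we recall that $\Phi(r)$ is the probability that a standard normal variable is smaller than $r\in\mathbb{R}$. Equation (4.13) is then an easy consequence of (4.14) and (4.10), since the last term in (4.14) converges to 1 when $N$ goes to infinity if $\varepsilon$ is chosen such that $(\sqrt{\alpha'}-\sqrt{\alpha})^2/2\sigma_\varepsilon>d$.

We are now going to prove (4.2). We have
\[
P\bigl(\varphi_x>0\ \forall x\in V_N\bigr)\ \ge\ E\Bigl[P\bigl(\varphi^0_x+\varphi^1_x>0\ \forall x\in V_N\,\big|\,\mathcal{F}^0\bigr)\,1_{\{\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\}}\Bigr], \tag{4.15}
\]
where $\mathcal{F}^0$ is the $\sigma$-algebra generated by $\varphi^0$. By using the independence of $\varphi^0$ and $\varphi^1$, on $\{\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\}$ we have
\[
P\bigl(\varphi^0_x+\varphi^1_x>0\ \forall x\in V_N\,\big|\,\mathcal{F}^0\bigr)\ \ge\ P\bigl(\varphi^1_x>-\sqrt{\alpha\log N}\ \forall x\in V_N\bigr). \tag{4.16}
\]
By the FKG inequality for the field $\varphi^1$ (see e.g. [10]) and its translation invariance
\[
P\bigl(\varphi^1_x>-\sqrt{\alpha\log N}\ \forall x\in V_N\bigr)\ \ge\ P\bigl(\varphi^1_0>-\sqrt{\alpha\log N}\bigr)^{N^d}. \tag{4.17}
\]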
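As a numerical aside, the role of the condition $(\sqrt{\alpha'}-\sqrt{\alpha})^2/2\sigma_\varepsilon>d$ in (4.14) can be seen by evaluating the right-hand side of (4.14) directly. The sketch below is only an illustration: the function names and the parameter values ($d=3$, $\alpha=1$, $\alpha'=4$, and the two values of $\sigma_\varepsilon$) are assumptions chosen for the example and do not come from the paper.

```python
import math


def std_normal_cdf(r: float) -> float:
    """Phi(r): probability that a standard normal variable is smaller than r."""
    return 0.5 * math.erfc(-r / math.sqrt(2.0))


def union_lower_bound(n: int, d: int, alpha: float, alpha_prime: float, sigma: float) -> float:
    """Right-hand side of (4.14): 1 - N^d * Phi(-(sqrt(a') - sqrt(a)) * sqrt(log N / sigma)).

    `sigma` plays the role of the variance sigma_epsilon; all inputs are
    illustrative assumptions, not values taken from the paper.
    """
    gap = math.sqrt(alpha_prime) - math.sqrt(alpha)
    tail = std_normal_cdf(-gap * math.sqrt(math.log(n) / sigma))
    return 1.0 - (n ** d) * tail


if __name__ == "__main__":
    d, alpha, alpha_prime = 3, 1.0, 4.0
    for sigma in (0.05, 1.0):
        # gap^2 / (2 sigma) must exceed d for the bound to tend to 1.
        ratio = (math.sqrt(alpha_prime) - math.sqrt(alpha)) ** 2 / (2.0 * sigma)
        print(f"sigma={sigma}: gap^2/(2 sigma)={ratio:.2f}, d={d}")
        for n in (10, 100, 1000):
            print(f"  N={n}: bound={union_lower_bound(n, d, alpha, alpha_prime, sigma)}")
```

With the small variance the bound stays close to 1 as $N$ grows, while with the large variance the union bound becomes negative and carries no information, which is why $\varepsilon$ has to be taken small first.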
Hence we have that
\[
\frac{1}{N^{d-2}\log N}\log P\bigl(\varphi_x>0\ \forall x\in V_N\bigr)\ \ge\ \frac{N^2}{\log N}\log\Bigl(1-\Phi\Bigl(-\sqrt{\frac{\alpha\log N}{G-\sigma_\varepsilon}}\Bigr)\Bigr)+\frac{1}{N^{d-2}\log N}\log P\bigl(\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\bigr), \tag{4.18}
\]
and the result then follows by using (4.11) and (4.10), since the first term in the right-hand side of (4.18) vanishes whenever $\alpha/4(G-\sigma_\varepsilon)>1$.

We are then left with the proof of (4.8). A direct computation of the relative entropy (4.9), see [2] for similar computations, easily reduces the proof of (4.8) to proving that
\[
\lim_{N\to\infty}\frac{1}{N^{d-2}}\bigl\langle 1_{V_N},(G^{(0)}_N)^{-1}1_{V_N}\bigr\rangle_{V_N}=C, \tag{4.19}
\]
where $\langle\cdot,\cdot\rangle_A$, $A\subset\mathbb{Z}^d$, is the scalar product in $L^2(A)$ and $G^{(0)}_N$ is the matrix $G^{(0)}$ restricted to $V_N\times V_N$ (and analogous meaning below for $G$ and $G_N$). The quantity of which we are taking the limit in (4.19) can be expressed in terms of a variational problem: it is equal to
\[
\frac{1}{N^{d-2}}\sup_{f\in L^2(V_N)}2\Bigl(\langle 1_{V_N},f\rangle_{V_N}-\frac{1}{2}\langle f,G^{(0)}_N f\rangle_{V_N}\Bigr). \tag{4.20}
\]
A lower bound for the expression in (4.19) is then immediate, since $G\ge G^{(0)}$ and
\[
\lim_{N\to\infty}\frac{1}{N^{d-2}}\bigl\langle 1_{V_N},(G_N)^{-1}1_{V_N}\bigr\rangle_{V_N}=C, \tag{4.21}
\]
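which is proven in [2, Sect. 2]. For the upper bound we still use the variational formula (4.20) in the following way:
\[
\sup_{f\in L^2(V_N)}2\Bigl(\langle 1_{V_N},f\rangle_{V_N}-\frac{1}{2}\langle f,G^{(0)}_N f\rangle_{V_N}\Bigr)\ \le\ \sup_{f\in L^2(\mathbb{Z}^d)}2\Bigl(\langle h_N,f\rangle_{\mathbb{Z}^d}-\frac{1}{2}\langle f,G^{(0)}f\rangle_{\mathbb{Z}^d}\Bigr)=\bigl\langle h_N,(G^{(0)})^{-1}h_N\bigr\rangle_{\mathbb{Z}^d}, \tag{4.22}
\]
where $h_N(\cdot)=h(\cdot/N)$, $h\in C_0^\infty(\mathbb{R}^d)$ and $h=1$ on $V$. By setting $\hat f(k)=\sum_x f(x)\exp(ikx)$ for $k\in(-\pi,+\pi]^d$, we have
\[
\frac{1}{N^{d-2}}\bigl\langle h_N,[(G^{(0)})^{-1}-G^{-1}]h_N\bigr\rangle_{\mathbb{Z}^d}
=\frac{1}{N^{d-2}(2\pi)^d}\int_{\|k\|\le\pi}\frac{\mu(k)^2|\hat h_N(k)|^2}{\varepsilon^2}\,dk
=\frac{1}{\varepsilon^2N^{d-2}}\langle\Delta h_N,\Delta h_N\rangle_{L^2(\mathbb{Z}^d)}\ \le\ \frac{c(h)}{N^2\varepsilon^2}, \tag{4.23}
\]
where $\mu(k)=2\sum_{i=1}^d(1-\cos k_i)$ and $c(h)$ is a constant depending on $h$. Hence the term in (4.23) vanishes as $N\to\infty$. On the other hand
\[
\frac{1}{N^{d-2}}\langle h_N,G^{-1}h_N\rangle_{\mathbb{Z}^d}=-\frac{1}{N^{d-2}}\langle h_N,\Delta h_N\rangle_{\mathbb{Z}^d} \tag{4.24}
\]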
converges as $N\to\infty$ to its continuum analog $\int_{\mathbb{R}^d}|Dh|^2$ for any $h\in C_0^\infty$; taking the infimum over $h$ we obtain the capacity $C$ and the proof is complete. □

Acknowledgements. We are grateful to Erwin Bolthausen for his help with the proof of Lemma 3.3 and for other useful discussions. G.G. acknowledges the support of the Swiss National Science Foundation (Project 20–410 925.94).
References

1. Benfatto, G., Cassandro, M., Gallavotti, G., Niccolò, F., Olivieri, E., Presutti, E. and Scacciatelli, E.: Ultraviolet Stability in Euclidean Scalar Field Theories. Commun. Math. Phys. 71, 95–130 (1980)
2. Bolthausen, E. and Deuschel, J.D.: Critical large deviations for Gaussian fields in the phase transition regime. Ann. Prob. 21, 1876–1920 (1994)
3. Bolthausen, E., Deuschel, J.D. and Zeitouni, O.: Entropic repulsion for the lattice free field. Commun. Math. Phys. 170, 417–443 (1995)
4. Brascamp, H.J. and Lieb, E.: On extensions of the Brunn–Minkowski and Prékopa–Leindler theorems. J. Funct. Anal. 22, 366–389 (1976)
5. Bricmont, J., el Mellouki, A. and Fröhlich, J.: Random surfaces in statistical mechanics: Roughening, rounding, wetting. J. Stat. Phys. 42, 743–798 (1986)
6. Deuschel, J.D. and Giacomin, G.: Entropic Repulsion for Massless Fields. Preprint (1999)
7. Deuschel, J.D. and Stroock, D.W.: Large Deviations. Academic Press, Series in Pure and Applied Mathematics 137, 1989
8. Georgii, H.-O.: Gibbs Measures and Phase Transitions. Studies in Mathematics, 9, W. de Gruyter ed., 1988
9. Glimm, J. and Jaffe, A.: Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, Second edition, 1987
10. Herbst, I. and Pitt, L.: Diffusion equation techniques in stochastic monotonicity and positive correlations. Prob. Th. Rel. Fields 87, 275–312 (1991)
11. Lawler, G.F.: Intersections of Random Walks. In: Probability and its Applications, Basel–Boston: Birkhäuser, 1991
12. Lebowitz, J.L. and Maes, C.: The effect of an external field on an interface, entropy repulsion. J. Stat. Phys. 46, 39–49 (1987)
13. Nelson, E.: A quartic interaction in two dimensions. In: Mathematical theory of elementary particles (Goodman and Segal, eds.), Cambridge, MA: MIT Press, 1966
14. Spitzer, F.: Principles of random walks. Berlin–Heidelberg–New York: Springer-Verlag, Second edition, 1976

Communicated by J. L. Lebowitz
Commun. Math. Phys. 206, 463 – 489 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Spectrum of the Generator of an Infinite System of Interacting Diffusions

R. A. Minlos¹, Yu. M. Suhov¹,²

¹ Institute for Problems of Information Transmission, Russian Academy of Sciences, 19 Bol'shoi Karetnyi Per., Moscow, 101447, Russia
² Statistical Laboratory, DPMMS, University of Cambridge, 16 Mill Lane, Cambridge CB2 1SB, UK
Received: 6 October 1998 / Accepted: 9 April 1999
Abstract: We study the spectrum of the operator
\[
Lf(Q)=-\sum_{x\in\mathbb{Z}^d}\partial^2 f/\partial q_x^2\,(Q)-\beta\sum_{x\in\mathbb{Z}^d}(\partial H/\partial q_x)(Q)\,(\partial f/\partial q_x)(Q),\qquad Q=\{q_x\},
\]
generating an infinite-dimensional diffusion process $\Xi(t)$, in space $L_2(\mathbb{R}^{\mathbb{Z}^d},d\nu(Q))$. Here $\nu$ is a "natural" $\Xi(t)$-invariant measure on $\mathbb{R}^{\mathbb{Z}^d}$ which is a Gibbs distribution corresponding to a (formal) Hamiltonian $H$ of an anharmonic crystal, with a value of the inverse temperature $\beta>0$. For $\beta$ small enough, we establish the existence of an $L$-invariant subspace $\mathcal{H}_1\subset L_2(\mathbb{R}^{\mathbb{Z}^d},d\nu(Q))$ such that $L\restriction\mathcal{H}_1$ has a distinctive character related to a "quasi-particle" picture. In particular, $L\restriction\mathcal{H}_1$ has a Lebesgue spectrum separated from the rest of the spectrum of $L$ and concentrated near a point $\kappa_1>0$ giving the smallest non-zero eigenvalue of a limiting problem associated with $\beta=0$. An immediate corollary of our result is an exponentially fast $L_2$-convergence to equilibrium for the process $\Xi(t)$ for small values of $\beta$.
1. Introduction

In this paper we consider the problem of describing a "lower" component of the spectrum of the generator of an infinite system of interacting diffusions. The dynamics of the model are given as an infinite-dimensional Markov process $\Xi(t)=\{\xi_x(t),\ x\in\mathbb{Z}^d\}$, $t\ge 0$, with state space $\Omega=\mathbb{R}^{\mathbb{Z}^d}$, determined by a countable system of stochastic differential equations
\[
d\xi_x(t)=-\beta(\partial H/\partial q_x)(\Xi(t))\,dt+dW_x(t),\qquad \xi_x(0)=q^0_x,\quad x\in\mathbb{Z}^d, \tag{1.1}
\]
where $\{W_x,\ x\in\mathbb{Z}^d\}$ is a family of independent Wiener processes on $\mathbb{R}$ labelled by sites $x\in\mathbb{Z}^d$, and $Q^0=\{q^0_y,\ y\in\mathbb{Z}^d\}\in\Omega$ is an initial condition. Furthermore, $H(Q)$ is a formal Hamiltonian:
\[
H(Q)=\sum_{x\in\mathbb{Z}^d}q_x^{2s}+\frac{\alpha}{2}\sum_{x,x'\in\mathbb{Z}^d:\ |x-x'|=1}(q_x-q_{x'})^2,\qquad Q=\{q_y,\ y\in\mathbb{Z}^d\}, \tag{1.2}
\]
where $s$ is a natural number, the coupling constant $\alpha$ is $>0$, and $|x-x'|$ denotes the distance (Euclidean or lattice) between $x,x'\in\mathbb{Z}^d$. The value $\beta>0$ in (1.1) is interpreted as inverse temperature.

It is known (see the original papers [7, 9, 23] and a review [6]) that, for a "tempered" $Q^0\in\Omega$, there exists a (strong) solution to (1.1) which is in fact unique among tempered weak solutions. (In fact, the existence and uniqueness of such a solution can be proved under much more general assumptions about $H(Q)$.) Furthermore (see [7, 23]), for all $\beta>0$ the process $\Xi(t)$ $(=\Xi_\beta(t;Q^0))$ defined by (1.1) has a unique invariant measure $\nu$ $(=\nu_\beta)$ such that $\sup_{x\in\mathbb{Z}^d}\int q_x^2\,d\nu(Q)<\infty$. Moreover, the measure $\nu$ coincides with the Gibbs probability distribution corresponding to the Hamiltonian $H$ (see (1.2)) and the value of the inverse temperature $\beta$. The last result also establishes the uniqueness, for all $\beta>0$, of a Gibbs distribution for Hamiltonian $H$ (again within the class of probability measures on $\Omega$ with a uniformly bounded second moment). As before, this result holds true under more general assumptions about $H(Q)$. The measure $\nu$ is also invariant under space-shifts in $\Omega$.

Process $\Xi_\beta(t)$ with invariant measure $\nu$ is reversible and ergodic. The semi-group of its transition operators acting in the Hilbert space (H.s.) $\mathcal{H}:=L_2(\Omega,d\nu(Q))$ is self-adjoint; its generator $L$ $(=L_\beta)$ is also self-adjoint (and even positive definite). It is defined on a suitable dense set $D(L)\subset\mathcal{H}$ composed of "local", smooth and tempered functions $f(Q)$, $Q\in\Omega$, where it has the form
\[
Lf(Q)=-\sum_{x\in\mathbb{Z}^d}\partial^2 f/\partial q_x^2\,(Q)-\sum_{x\in\mathbb{Z}^d}\beta(\partial H/\partial q_x)(Q)\,(\partial f/\partial q_x)(Q). \tag{1.3}
\]
In particular, the function $f\equiv 1$ is in the domain $D(L)$ and is taken to zero, i.e. it is a (unique) normalised eigenvector of $L$ with eigenvalue 0. The rest of the spectrum of $L$ lies on $\mathbb{R}_+=(0,\infty)$.

The main result of this paper is as follows. For $\beta$ small enough, and under the condition
\[
s>2d+1, \tag{1.4}
\]
there exists a subspace $\mathcal{H}_1\subset\mathcal{H}$, invariant under $L$ and the space-shift unitary group $\{U_y,\ y\in\mathbb{Z}^d\}$, such that the spectrum of the restriction $L\restriction\mathcal{H}_1$ is Lebesgue and fills a segment $J\subset\mathbb{R}_+$ of length $\sim\beta$, separated by gaps of size $\sim\beta^{1/s}$ from 0 and from the rest of the spectrum of $L$ (which lies to the right of $J$). For the precise statement, see Theorem 1 below. Furthermore, let $L_2(\mathbb{T}^d,d\lambda)$ denote the space $L_2$ on the $d$-dimensional torus $\mathbb{T}^d$ with the standard Lebesgue measure. Then, under a unitary map $V:\mathcal{H}_1\to L_2(\mathbb{T}^d,d\lambda)$ (which is cyclic for group $\{U_y\}$), operator $L\restriction\mathcal{H}_1$ is taken to the operator of multiplication by a non-constant analytic function $\hat m(\lambda)$, $\lambda=(\lambda^{(1)},\ldots,\lambda^{(d)})\in\mathbb{T}^d$, with values in $J$, whereas the operators $U_y\restriction\mathcal{H}_1$, $y=(y^{(1)},\ldots,y^{(d)})$, are taken to the operators of multiplication by $\exp(i\langle y,\lambda\rangle)$, where $\langle y,\lambda\rangle=\sum_{1\le j\le d}y^{(j)}\lambda^{(j)}$. By using quantum-mechanics (or rather quantum field theory) analogies, one can interpret vectors of H.s. $\mathcal{H}_1$ as states of a certain (quasi-)"particle", and the value $\hat m(\lambda)$ as the energy of the particle with the quasi-momentum $\lambda$. (Physicists often say that the function $\hat m$ gives the dispersion rule of an individual (quantum) particle.)

An immediate corollary of this result is an exponentially fast $L_2$-convergence (for $\beta$ small enough) of the distribution of $\Xi(t)$ to measure $\nu$ as $t\to\infty$, for any initial distribution $\nu^0$ that is absolutely continuous with respect to $\nu$ and with $d\nu^0/d\nu\in\mathcal{H}$. After this paper was accepted, we learned about preprint [27] where exponential convergence was established in a different (and actually stronger) form (again under a condition that $\beta$ is small enough). The technique used in [27] (and in related preprints [25], [26]) is based on logarithmic Sobolev inequalities and direct bounds upon process $\Xi(t)$, and employs (in a rather indirect way) many properties of a Gibbs state used in this paper. See also [28]. Yet another approach was put forward in [22].

The case of small $\beta$ considered in this paper may be treated as a small (although singular) perturbation of a certain "decoupled" system corresponding to $\beta=0$; see below. In particular, the generator $K$ of the "natural" decoupled system associated with (1.1)–(1.4) has a non-negative discrete spectrum of a distinctive "additive" structure (cf. (2.4)–(2.7) and (2.12)), where any positive eigenvalues have infinite multiplicity. The quasi-particle spectrum component $L\restriction\mathcal{H}_1$ of the perturbed operator $L$ "arises" from the eigenspace corresponding to the lowest positive eigenvalue $\kappa_1$ of $K$. Such a picture is typical for "cluster" operators associated with infinite-particle systems, cf. [14].

We want to stress that, as we believe, our results hold true for a more general form of Hamiltonian $H$ than (1.2), (1.4).
In particular, one could allow the existence, for β large enough, of more than one Gibbs distribution, as we work only in the region of small β’s where uniqueness is guaranteed by appropriate high-temperature polymer (or cluster) expansions. Furthermore, the proof in such a general situation could be done along the same lines, although it would considerably lengthen the exposition. We believe the same is also true about condition (1.4) that plays in this paper an important, although purely technical, role. The problems arising from rigorous study of the spectra of generators of various stochastic dynamics (including the dynamics implemented by an infinite-volume transfer-matrix operator) have their own history going back to [18]. There exists an extensive bibliography devoted to various aspects of this problem in the case of Glauber dynamics on Zd (otherwise known as the stochastic Ising model (s.I.m.)); see, e.g., reviews [6] and [12], as well as the literature quoted in these sources). In paper [16], under the assumption that the inverse temperature β of the model is small enough, a number of “lower” invariant subspaces of the s.I.m. generator were constructed (corresponding to k-particle pictures, k ≥ 1), and the spectrum on the first of these subspaces (with k = 1) was described, in terms similar to the ones above. (It should be noted that for d ≥ 2 the s.I.m. exhibits a complicated phenomenon of non-uniqueness of invariant measures.) In the one-dimensional case (d = 1), the s.I.m. is (again for β > 0 small enough) relatively simple: in this case all invariant subspaces of the generator and the spectrum on all of them were described in [19]. In [10] and [29], a similar problem was considered for the dynamics of plane rotators; in this model one deals with a system of stochastic differential equations which is similar to (1.1), but on a compact manifold (a circle S1 ) rather than on R. 
In this case, one was able to construct one- and two-particle invariant subspaces and describe the spectrum of the generator on these subspaces. The present paper follows the general scheme employed in the above papers and in [14] (which will allow us to avoid some tedious detail), although, in view of the non-compactness of the "spin" space $\mathbb{R}$, the whole construction is still technically rather involved. We also want to refer to [1] and [2, 3] (see also the references therein) where various properties of the operator $L$ and process $\Xi(t)$ are discussed from a different point of view.

The paper is organised according to the following scheme. In Sect. 2 we state the problem and the main theorem on the one-particle invariant subspace $\mathcal{H}_1$. Section 3 contains the proof of the first part of the theorem: here, we perform a construction of space $\mathcal{H}_1$. In Sect. 4, we establish the form of the spectrum of operator $L$ on $\mathcal{H}_1$, and in Sects. 5 and 6 we prove various technical facts used in Sects. 2–4.

2. The Main Theorem

It is convenient to pass to a "modified" form of the operator $L$, by using the "multiplicative" change of variables $q_y\mapsto\beta^{1/(2s)}q_y$, $y\in\mathbb{Z}^d$. This generates a unitary transformation of H.s.'s, $R:\mathcal{H}\to\bar{\mathcal{H}}$, where
\[
\bar{\mathcal{H}}:=L_2(\Omega,d\bar\nu(Q))\quad\text{and}\quad Rf(Q)=f(\beta^{-1/(2s)}Q),\ f\in\mathcal{H}, \tag{2.1}
\]
and $\bar\nu=\bar\nu_\beta$ is the Gibbs distribution determined by the Hamiltonian
\[
\hat H(Q)=\sum_{x\in\mathbb{Z}^d}q_x^{2s}+\beta^{1-1/s}\,\frac{\alpha}{2}\sum_{x,x'\in\mathbb{Z}^d:\ |x-x'|=1}(q_x-q_{x'})^2, \tag{2.2}
\]
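with the inverse temperature one. The transformed operator $RLR^{-1}$ has the form $RLR^{-1}=\beta^{1/s}\bar L$, where
\[
\bar L\ (=\bar L_\beta)=\sum_{x\in\mathbb{Z}^d}\bigl(-\partial^2/\partial q_x^2+2sq_x^{2s-1}\,\partial/\partial q_x\bigr)+\varepsilon\sum_{x,x'\in\mathbb{Z}^d:\ |x-x'|=1}(q_x-q_{x'})\,\partial/\partial q_x, \tag{2.3}
\]
with $\varepsilon=\alpha\beta^{1-1/s}$. Operator $\bar L$ is of course self-adjoint on its domain $D(\bar L)=RD(L)$. Thus, the problem of describing the spectrum of operator $L$ is reduced to that for $\bar L$.

For small $\beta$ (and hence small $\varepsilon$) operator $\bar L$ may be considered as a perturbation of the "decoupled" self-adjoint linear operator (l.o.) $K$ in H.s. $\mathcal{F}$, corresponding to "free" dynamics:
\[
K=\sum_{x\in\mathbb{Z}^d}\bigl(-\partial^2/\partial q_x^2+2sq_x^{2s-1}\,\partial/\partial q_x\bigr),\qquad \mathcal{F}:=L_2(\Omega,d\mu(Q)),\quad \mu=\mu_0^{\mathbb{Z}^d}; \tag{2.4}
\]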
here $\mu_0$ is a probability measure on $\mathbb{R}$:
\[
d\mu_0(q)=I^{-1}\exp(-q^{2s})\,dq,\quad\text{where } I=\int d\tilde q\,\exp(-\tilde q^{2s}). \tag{2.5}
\]
Denote by $k$ the self-adjoint l.o. acting in the space $L_2(\mathbb{R},d\mu_0(q))$ by the formula
\[
k=-\frac{d^2}{dq^2}+2sq^{2s-1}\frac{d}{dq}. \tag{2.6}
\]
The spectrum of $k$ is a sequence of multiplicity one eigenvalues
\[
0=\kappa_0<\kappa_1<\ldots<\kappa_n<\ldots,\qquad \kappa_n\nearrow\infty. \tag{2.7}
\]
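For orientation only, the structure (2.6)–(2.7) can be checked exactly in the simplest case $s=1$, where $k=-d^2/dq^2+2q\,d/dq$ and the weight $\exp(-q^2)$ in (2.5) is the Hermite weight, so that the physicists' Hermite polynomials $H_n$ satisfy $kH_n=2nH_n$. Note that $s=1$ does not satisfy condition (1.4) and is used here purely as an illustration; for $s>1$ the eigenfunctions are no longer polynomials. The function names below are hypothetical.

```python
def apply_k(coeffs, s=1):
    """Apply k = -d^2/dq^2 + 2 s q^(2s-1) d/dq to the polynomial sum_m coeffs[m] q^m.

    Coefficients are listed from the lowest degree upwards.
    """
    out = [0] * (len(coeffs) + 2 * s - 2)
    for m, c in enumerate(coeffs):
        if m >= 2:
            out[m - 2] -= m * (m - 1) * c      # contribution of -d^2/dq^2
        if m >= 1:
            out[m + 2 * s - 2] += 2 * s * m * c  # contribution of 2 s q^(2s-1) d/dq
    return out


def hermite(n):
    """Coefficients (lowest degree first) of the physicists' Hermite polynomial H_n."""
    h_prev, h = [1], [0, 2]
    if n == 0:
        return h_prev
    for j in range(1, n):
        # Recurrence H_{j+1}(q) = 2 q H_j(q) - 2 j H_{j-1}(q).
        nxt = [0] + [2 * c for c in h]
        for i, c in enumerate(h_prev):
            nxt[i] -= 2 * j * c
        h_prev, h = h, nxt
    return h


if __name__ == "__main__":
    for n in range(6):
        h_n = hermite(n)
        is_eigen = apply_k(h_n) == [2 * n * c for c in h_n]
        print(f"n={n}: k H_n = {2 * n} H_n holds: {is_eigen}")
```

In this toy case the eigenvalues are $\kappa_n=2n$, which is a simple, strictly increasing sequence starting from $\kappa_0=0$, in line with (2.7).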
The unitary group $\{U_y,\ y\in\mathbb{Z}^d\}$ and the involution $J$ are given by
\[
U_yf(Q)=f(U_yQ),\qquad Jf(Q)=f(-Q), \tag{2.8}
\]
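where $U_y$ is the shift $Q=\{q_x\}\mapsto\{q'_x\}$, with $q'_x=q_{x+y}$, $x\in\mathbb{Z}^d$. Note that both $\{U_y,\ y\in\mathbb{Z}^d\}$ and $J$ commute with both $\bar L$ and $K$ (when considered in the corresponding H.s.). In particular, let $\bar{\mathcal{H}}^{ev}$ and $\bar{\mathcal{H}}^{od}$ denote, respectively, the even and odd subspaces of $\bar{\mathcal{H}}$ relative to $J$. Then $\bar L:\ \bar{\mathcal{H}}^{ev}\cap D(\bar L)\to\bar{\mathcal{H}}^{ev}$ and similarly for $\bar{\mathcal{H}}^{od}$. The same conclusion holds for operator $K$ and the even and odd subspaces $\mathcal{F}^{ev},\mathcal{F}^{od}\subset\mathcal{F}$.

Theorem 1. Given $s$ and $\alpha$ as above, $s$ satisfying (1.4), there exist constants $\beta^0,C>0$ such that, for $0<\beta<\beta^0$,

1. There exist decompositions of H.s.'s $\bar{\mathcal{H}}^{ev}$ and $\bar{\mathcal{H}}^{od}$ into $\bar{\mathcal{H}}$-orthogonal direct sums
\[
\bar{\mathcal{H}}^{ev}=\bar{\mathcal{H}}_0\oplus\bar{\mathcal{H}}^{ev}_{\ge 2},\qquad \bar{\mathcal{H}}^{od}=\bar{\mathcal{H}}_1\oplus\bar{\mathcal{H}}^{od}_{>1}, \tag{2.10}
\]
where subspaces $\bar{\mathcal{H}}_0$, $\bar{\mathcal{H}}_1$, $\bar{\mathcal{H}}^{ev}_{\ge 2}$ and $\bar{\mathcal{H}}^{od}_{>1}$ are invariant under $\bar L$ and $\{U_y\}$, and $\bar{\mathcal{H}}_0$ is a one-dimensional nil-subspace of $\bar L$ consisting of constant functions. Furthermore, (i) the spectra of the restrictions $\bar L^{ev}_{\ge 2}:=\bar L\restriction\bar{\mathcal{H}}^{ev}_{\ge 2}$ and $\bar L^{od}_{>1}:=\bar L\restriction\bar{\mathcal{H}}^{od}_{>1}$ lie in $(\bar\kappa_2-C\varepsilon,\infty)$ and $(\bar\kappa_3-C\varepsilon,\infty)$, respectively, where $\bar\kappa_2=\min[2\kappa_1,\kappa_2]$ and $\bar\kappa_3=\min[3\kappa_1,\kappa_1+\kappa_2,\kappa_3]$, and (ii) the spectrum of the restriction $\bar L_1:=\bar L\restriction\bar{\mathcal{H}}_1$ is confined to the interval $J=(\kappa_1-C\varepsilon,\kappa_1+C\varepsilon)$. In particular, the spectrum of $\bar L_1$ is separated from 0 and the spectrum of $\bar L\restriction(\bar{\mathcal{H}}^{ev}_{\ge 2}\vee\bar{\mathcal{H}}^{od}_{>1})$. (Here, $\bar{\mathcal{H}}^{ev}_{\ge 2}\vee\bar{\mathcal{H}}^{od}_{>1}$ denotes the subspace spanned by $\bar{\mathcal{H}}^{ev}_{\ge 2}$ and $\bar{\mathcal{H}}^{od}_{>1}$.)

2. There exists a unitary map $V:\bar{\mathcal{H}}_1\to L_2(\mathbb{T}^d,d\lambda)$ such that the l.o.'s $V\bar L_1V^{-1}$ and $V(U_y\restriction\bar{\mathcal{H}}_1)V^{-1}$ have the form
\[
V\bar L_1V^{-1}f(\lambda)=\hat m(\lambda)f(\lambda),\qquad V(U_y\restriction\bar{\mathcal{H}}_1)V^{-1}f(\lambda)=\exp\bigl(i\langle y,\lambda\rangle\bigr)f(\lambda),
\]
\[
\lambda=(\lambda^{(1)},\ldots,\lambda^{(d)})\in\mathbb{T}^d,\quad y=(y^{(1)},\ldots,y^{(d)})\in\mathbb{Z}^d,\quad f\in L_2(\mathbb{T}^d,d\lambda). \tag{2.11}
\]
Here $\mathbb{T}^d$ is the $d$-dimensional unit torus and $\hat m$ is a non-constant analytic function on $\mathbb{T}^d$ with values in interval $J$ specified in assertion 1(ii). In particular, the spectrum of $\bar L_1$ is Lebesgue.

We conclude this section with an observation about the spectrum of operator $K$ in H.s. $\mathcal{F}$. Denote by $\psi_n$ the (normalised) eigenvector of $k$ corresponding to $\kappa_n$, $n\in\mathbb{Z}_+:=\{0,1,\ldots\}$. For $n$ even, $\psi_n$ is an even function of $q\in\mathbb{R}$, for $n$ odd, $\psi_n$ is odd. Comparing (2.4) and (2.6), we see that the eigenfunctions and eigenvalues of $K$ are of the form
\[
\Psi_n(Q)=\prod_x\psi_{n(x)}(q_x)\quad\text{and}\quad K_n=\sum_x\kappa_{n(x)}. \tag{2.12}
\]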
Here $n$ is a non-negative integer-valued function $x\in\mathbb{Z}^d\mapsto n(x)$ (called a multi-index), with $|n|=\sum_x n(x)<\infty$; the set of such functions is denoted by $N$. Functions $\Psi_n$, $n\in N$, form an orthonormal basis in $\mathcal{F}$. Furthermore, each $\Psi_n$ is either an even or an odd vector relative to $J$, and the parity of $\Psi_n$ coincides with that of $|n|$. So, $\mathcal{F}^{ev}$ is spanned by the even and $\mathcal{F}^{od}$ by the odd $\Psi_n$'s. In particular, (a) 0 is a simple eigenvalue of $K$, (b) the lowest positive eigenvalue of $K$ is $\kappa_1$; it has an infinite multiplicity, and the corresponding eigenspace $\mathcal{F}_1$ is spanned by the odd vectors $\Psi_{e_y}$, $y\in\mathbb{Z}^d$. Here, $e_y$ denotes the multi-index with $e_y(x)=1(x=y)$, $x\in\mathbb{Z}^d$. The next eigenvalue is $\bar\kappa_2=\min[2\kappa_1,\kappa_2]$, etc. Note that each eigenspace of $K$ corresponding to a given eigenvalue $K_n$ is invariant under $\{U_y\}$. We see that in terms of the asymptotic of the spectrum, $\bar L_1$ is related to the restriction $K\restriction\mathcal{F}_1$: as $\beta\to 0$, interval $J$ shrinks to $\kappa_1$.

An agreement used for the rest of the paper is that the notation $c_0$, $c_1$, etc., is used for positive constants varying from one lemma to another (so, e.g., constant $c_0$ in Lemma 3.1 is different from that in Lemma 3.3); unless otherwise specified, these constants do not depend on variables figuring in the corresponding assertion (e.g., $c_0$ in Lemma 3.1 does not depend on $n$). Also, each time a bound includes $\varepsilon$, we assume that $\beta$ (and hence $\varepsilon$) is small enough, in the sense indicated in Theorem 1.

3. Constructing Subspace $\bar{\mathcal{H}}_1$

An important fact is that functions $\Psi_n\in\bar{\mathcal{H}}$. Furthermore,

Lemma 3.1. The $\bar{\mathcal{H}}$-norm of $\Psi_n$ obeys
\[
\|\Psi_n\|_{\bar{\mathcal{H}}}\le\prod_x\gamma_{n(x)},\quad\text{where }\gamma_0=1,\ \gamma_n=c_0\bigl(\kappa_n+1\bigr)^{2d/(4s-2)},\ n\ge 1. \tag{3.1}
\]
The proof of Lemma 3.1 is carried in Sect. 5. Consider now the system of functions 8n ∈ H¯ ∩ F, n ∈ N: .Y Y γn(x) = φn(x) (qx ), φn (q) = ψn (q)/γn , q ∈ R, n ∈ Z+ , 8n (Q) = 9n (Q) x
x
(3.2) by where γn are given by (3.1). P PDenote by L the space of functions G(Q) represented the following series: G = n gn 8n , gn ∈ C, with the norm |||G|||L = n |gn | < ∞, ¯ ||G|| ¯ ≤ |||G|||L , and so that L is isomorphic to l1 (N). In view of Lemma 3.1, L ⊂ H, H ¯ L is dense in H. ¯ n ∈ L. Lemma 3.2. Any 8n , n ∈ N, is taken by L¯ to a vector from L: L8 P ¯ n = Proof of Lemma 3.2. Consider the representation L8 m Ln , m 8m . Comparing (2.3) and (2.4) and using (3.2) we find that Lm , n = Mm , n + Wm , n , where X Mm , n =1(n = m)Kn , Wm , n = 2drn(x),m(x) 1(mx = nx ) −
X y: |y−x|=1
x∈σ (n)
pn(x),m(x) bn(x),m(x) 1(mx,y = nx,y ) .
(3.3)
Here, and below, σ (n) stands for the support of the multi-index n ∈ N: σ (n) = {y ∈ Zd : x n(y) ≥ 1}. Next, nx and ( m denote the multi-indices ( which differ from n and m n(z), m(z), z 6= x, z 6 = x, at site x only: nx (z) = mx (z) = Similarly, 0, z = x, 0, z = x. ( ( n(z), z 6 = x, z 6 = y, m(z), z 6 = x, z 6= y, mx,y (z) = Furthernx,y (z) = 0, z = x or z = y, 0, z = x or z = y. more, rn,m , pn,m and bn,m are defined by q
X X X dφn dφn rn,m φm (q), pn,m φm (q), qφn (q) = bn,m φm (q). (q) = (q) = dq dq m m m (3.4)
Lemma 3.3. The following bounds hold true: 3 2d+s−1 − 4d−1 −1 c0 |κn − κm | (κn + c1 ) 4 2(4s−2) (κm + c1 ) 4s−2 , m 6 = n, n ≥ 1, 1 1 |rn,m | ≤ c0 (κn + c1 ) 2 + 4s−2 , m = n ≥ 1, 0, n = 0, (3.5) ( |pn,m | ≤
c0 |κn − κm |−1 (κn + c1 ) 4 − 2(4s−2) (κm + c1 ) 0, n = m or n = 0, 3
4d−1
2d+s−2 4s−2
, m 6= n, n ≥ 1, (3.6)
1 2d − 2d −1 c0 |κn − κm | (κn + c1 ) 2 4s−2 (κm + c1 ) 4s−2 , m 6= n, n ≥ 1, 2d −1 (κ + c ) 4s−2 |bn,m | ≤ , m ≥ 1, n = 0, c0 κm m 1 0, m = n.
(3.7)
Moreover, X
|rn,m | ≤ c0 (κn + c1 ) 2 + 4s−2 ln (κn + c1 ), 1
1
(3.8a)
m
X
1
|pn,m | ≤ c0 (κn + c1 ) 2 ln (κn + c1 ),
(3.8b)
m
and X
1
|bn,m | ≤ c0 (κn + c1 ) 4s−2 ln (κn + c1 ).
(3.8c)
m
The proof of Lemma 3.3 is given in Sect. 6. P Remark. It is the bound for m |rn,m | in (3.8a) where condition (1.4) is essential (by using methods proposed in this paper).
The assertion of Lemma 3.2 now follows from (3.3) and bounds (3.5)–(3.8). u t ¯ = {8 ∈ L : Thus, we can consider the restriction of L¯ L, with the domain DL (L) ¯ ⊂ D ¯ (L). ¯ For simplicity, we will keep for the last operator ¯ ∈ L}. Clearly, DL (L) L8 H ¯ Space L is decomposed into a sum of its closed subspaces Lev the original notation L. and Lod spanned by the even and odd vectors 8n , respectively: L = Lev + Lod . Both Lev and Lod are invariant with respect to L¯ and {Uy }, and we set L¯ ev = L¯ Lev and L¯ od = L¯ Lod . ¯ Lemma 3.4. There exist decompositions of Lev and Lod into sums of their L-closed Land {Uy }-invariant subspaces: od od Lev = L0 + Lev ≥2 , L = L1 + L>1 ,
(3.9)
with the following properties. 1. L0 is the one-dimensional nil-subspace of L¯ formed by constant functions. ev = L ¯ ev Lev and L¯ od = L¯ od Lod are invertible l.o.’s, and 2. The restrictions L¯ ≥2 ≥2 >1 −1 >1 ev od −1 ||| ¯ |||L ≤ 1/(κ¯ 2 − c0 ), ||| L¯ >1 the L-norms of their inverses obey ||| L≥2 L ≤ 1/(κ¯ 3 − c0 ), where κ¯ 2 and κ¯ 3 are as in Theorem 1. 3. The restriction L¯ 1ev = L¯ ev L1 has the L-norm |||L¯ 1ev |||L obeying |||L¯ 1ev |||L ≤ κ1 + c0 . ¯ 4. The H-closures ev od ¯ ev = ClH¯ Lev H¯ 1 = ClH¯ L1 , H¯ ≥2 ¯ L>1 , ≥2 , H≥2 = ClH
(3.10)
are invariant under L¯ and {Uy }, and, together with H¯ 0 = L0 , form decompositions (2.10). Proof of Lemma 3.4. We begin with a construction of spaces L1 and Lod >1 . The starting point is a decomposition od (3.11) Lod = L01 + L0, >1 , nP o P 0, od where L01 = x∈Zd gx 8ex and L>1 = n∈N, |n|>1 gn 8n . Formula (3.11) induces od od L¯ 0,1 L¯ od : L0 → L0 , od ¯ the corresponding matrix representation L ' ¯ 0,0 , where L¯ 0,0 1 1 Lod L¯ od 1,0
1,1
od : L0, od → L0 , etc. We also introduce a l.o. M acting in space L0, od and defined by L¯ 0,1 1 >1 >1 M8n = Kn 8n . (The action of M is identical to that of K, but in a different space.) od ¯ ¯ Spaces L01 and L0, >1 are not L-invariant; in order to get the L-invariant decomposition (3.9) we will perform some “corrections”. Namely, spaces L1 and Lod >1 in (3.9) are sought in the form 0, od L1 = v : v = u + M−1 Su, u ∈ L01 , Lod >1 = u : u = v + Tv, v ∈ L>1 , (3.12) 0,od 0 ¯ where S: L01 → L0,od >1 and T: L>1 → L1 are bounded l.o.’s. The L-invariance of L1 od and L>1 is equivalent to the following relations upon S and T: od od od od od od od od + L¯ 1,1 M−1 S = M−1 S(L¯ 0,0 + L¯ 0,1 M−1 S), T(L¯ 1,0 T + L¯ 1,1 ) = L¯ 0,0 T + L¯ 0,1 . L¯ 1,0 (3.13)
od is invertible in L0, od , one can re-write relations (3.13) as Assuming that L¯ 1,1 >1 od −1 −1 ¯ od od −1 ¯ od od −1 −1 ¯ od ) M SL0,0 − M(L¯ 1,1 ) L1,0 + M(L¯ 1,1 ) M SL0,1 M−1 S, S = M(L¯ 1,1 (3.14a) od ¯ od −1 od od −1 od od −1 (L1,1 ) + L¯ 0,0 T(L¯ 1,1 ) − TL¯ 1,0 T(L¯ 1,1 ) . T = L¯ 0,1
(3.14b)
od od −1 ) exists and is bounded in L0, Lemma 3.5. 1. (L¯ 1,1 >1 . 0,od 0 2. There exist unique L-bounded l.o.’s S: L01 → L0,od >1 and T: L>1 → L1 satisfying ≤ c0 1/2 . (3.14a,b), and their norms obey |||S|||L , |||T|||L P 3. The m.e.’s Sx,n of S in the representation S8ex = n∈Nod Sx,n 8n have the form Sx,n P = ( 1/2 )`{x}∪σ (n) sx, n , where n∈Nod sx,n ≤ c1 1/2 .
Here, and below, given a finite set B ⊂ Zd , `B stands for the minimal length of a finite subgraph γ of Zd (taken with standard links) with B ⊆ [γ ], where [γ ] is the set of the vertices of γ . Remarks. 1. It is possible to show that norms |||S|||L and |||T|||L are actually of order . 2. The m.e.’s of T also admit a representation similar to that for S. However, we do not need such a result in this paper. od )−1 . Observe that the LProof of Lemma 3.5.1. First, we establish the existence of (L¯ 1,1 P P norm |||B|||L of a l.o. B defined by B8n = m Bn ,m 8m equals supn m |Bn ,m |. Write od = M + W, where W is the l.o. with the m.e.’s W od L¯ 1,1 n ,m , n , m ∈ N , |m|, |n| ≥ 1 (cf. od )−1 in the form (L ¯ od )−1 = M−1 (E + WM−1 )−1 , (3.3)). In other words, we seek (L¯ 1,1 1,1
od −1 have the form W −1 where E is the unit operator in L0, n ,m Kn . >1 . The m.e.’s of WM By using (3.5)–(3.8), we obtain that X 1 1 (κn(x) + c1 ) 2 + 4s−2 ln (κn(x) + c1 ) |||WM−1 |||L ≤ c2 sup Kn−1
+
n∈Nod
x
1 (κn(x) + c1 ) (κn(y) + c1 ) 4s−2 ln (κn(x) + c1 ) ln (κn(y) + c1 ) .
X
1 2
x,y: |x−y|=1
With the help of Young’s inequality we find that 1
1
(κn(x) + c1 ) 2 ln (κn(x) + c1 ) (κn(y) + c1 ) 4s−2 ln (κn(y) + c1 ) 4s+2 1 1 4s 4s (κn(x) + c1 ) 2 + 4s ln (κn(x) + c1 ) ≤ 4s + 2 2s+1 1 1 1 (κn(y) + c1 ) 2 + 2s+1 ln (κn(y) + c1 ) + ≤ c3 κn(x) + κn(y) + 2c1 , 2s + 1 od )−1 provided that s ≥ 2. Hence, |||WM−1 |||L < c4 < 1. This guarantees that (L¯ 1,1 od −1 exists and is bounded, and |||(L¯ 1,1 ) |||L ≤ 1/(κ¯ 2 − c5 ). Thus, Eqs. (3.13) are indeed equivalent to (3.14 a,b). u t
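The inversion used in the proof of Lemma 3.5.1, namely $(M+W)^{-1}=M^{-1}(E+WM^{-1})^{-1}$ once $|||WM^{-1}|||<1$, rests on the Neumann series $(E+X)^{-1}=\sum_l(-X)^l$. The following finite-dimensional sketch illustrates this mechanism with the norm $\sup_n\sum_m|B_{n,m}|$ (the maximal absolute row sum). The $3\times 3$ matrices and all function names are hypothetical stand-ins chosen for illustration; they are not the operators of the paper.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def row_sum_norm(a):
    """sup_n sum_m |a[n][m]|, the matrix analogue of the norm used for l.o.'s in L."""
    return max(sum(abs(v) for v in row) for row in a)


def neumann_inverse(x, terms=80):
    """Approximate (E + x)^{-1} by the truncated series sum_l (-x)^l; needs norm(x) < 1."""
    if row_sum_norm(x) >= 1.0:
        raise ValueError("Neumann series requires norm(x) < 1")
    n = len(x)
    neg = [[-v for v in row] for row in x]
    total, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, neg)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def inverse_of_m_plus_w(m_diag, w):
    """Return (M + W)^{-1} = M^{-1} (E + W M^{-1})^{-1} for diagonal M = diag(m_diag)."""
    n = len(m_diag)
    x = [[w[i][j] / m_diag[j] for j in range(n)] for i in range(n)]  # W M^{-1}
    series = neumann_inverse(x)
    return [[series[i][j] / m_diag[i] for j in range(n)] for i in range(n)]  # M^{-1} * series


# Hypothetical toy data: a diagonal "free" part and a small perturbation.
M_DIAG = [2.0, 3.0, 4.0]
W = [[0.1, -0.2, 0.0], [0.3, 0.1, -0.1], [0.0, 0.2, 0.1]]

if __name__ == "__main__":
    inv = inverse_of_m_plus_w(M_DIAG, W)
    full = [[W[i][j] + (M_DIAG[i] if i == j else 0.0) for j in range(3)] for i in range(3)]
    print(matmul(full, inv))  # should be close to the identity matrix
```

The same series also gives the norm bound $|||(E+X)^{-1}|||\le(1-|||X|||)^{-1}$, which is the type of estimate used repeatedly in this section.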
P P Proof of Lemma 3.5.2. In what follows, the sum n stands for n∈Nod , |n|>1 ; the same P is true for m . Consider the operator space AL0 ,L0,od consisting of the bounded l.o.’s A: 1 >1 P 1/2 )−`{x}∪σ (n) L01 → L0,od n |Ax,n | ( >1 such that the m.e.’s Ax,n satisfy the bound supx −`{x}∪σ ( n) e 1/2 < ∞. In other words, Ax,n are represented in the form Ax,n = ( ) Ax,n , P P e with supx n |Ax,n | < ∞. The norm |||A|||A of A ∈ A is defined as supx∈Zd n |Ax,n | P ex,n |. ( 1/2 )−`{x}∪σ ( n) = supx∈Zd n |A We treat the right-hand side (r.h.s) of (3.14a) as a “quadratic” map Λ: A → ΛA, where ΛA equals od −1 ¯ od od −1 −1 ¯ od od −1 −1 ¯ od ) L1,0 + M(L¯ 1,1 ) M AL0,0 + M(L¯ 1,1 ) M AL0,1 M−1 A. −M(L¯ 1,1
(3.15)
We are going to check that Λ maps A → A and is bounded in norm ||| |||A (for simplicity, we will omit the subscript A in this notation). Furthermore, we will show that Λ is a contraction on a suitably chosen subset of A. This will imply the existence (and uniqueness) of a fixed point S. To this end, we will assess each of three summands in the r.h.s. of (3.15). We begin with the analysis of the second summand which is linear in A. To start with, observe that od have the form the m.e.’s of M−1 AL¯ 0,0 (M
−1
od −1 ¯ AL0,0 )x,m = Km (κ1 + 2r1,1 )Ax, m − p1,0 b0,1
X
Ay, m .
y: |y−x|=1
This leads to the bound |||M−1 A(L¯ od )0,0 ||| ≤ κ¯ 2−1 (κ1 + c6 1/2 )|||A|||.
(3.16)
P Furthermore, the m.e.’s of WM−1 A are of the form (WM−1 A)x, m = n Kn−1 Ax, n Wn ,m . For the non-zero summands in the last sum the set-theoretical difference σ (m) \ σ (n) is either empty or contains a single point y ∈ Zd neighbouring a point of σ (n). Therefore, `{x}∪σ (n) + 1 ≥ `{x}∪σ (m) , and hence ex, m = −1/2 ex, m , where A |(WM−1 A)x, m | ≤ `{x}∪σ (m) A
X n
Kn−1 |Ax, n Wm ,n |.
P ex, m ≤ By the same argument as in the proof of Lemma 3.5.1, we conclude that m A P 1/2 1/2 −1 1/2 c7 |||A|||. This yields that |||WM A||| ≤ c7 |||A|||. n Ax, m ≤ c7 P od )−1 = (E + WM−1 )−1 into a power series −1 l Now, expanding M(L¯ 1,1 l (WM ) , we find that od −1 ) A||| < (1 − c8 1/2 )−1 |||A|||. |||M(L¯ 1,1
(3.17)
Bounds (3.16) and (3.17) together give the following bound for the norm of the second summand in (3.15): od −1 −1 ¯ od ) M AL0,0 ||| ≤ (κ1 + c7 1/2 )((1 − c8 1/2 )κ¯ 2 )−1 |||A|||, |||M(L¯ 1,1
which is ≤ η|||A|||, where 0 < η < 1 for small enough.
(3.18)
To assess the third, “quadratic”, term in the r.h.s. of (3.15), note that the operator od M−1 acts as follows: 8 L¯ 0,1 nex , for n > 1 odd, is taken to X 8ey , (3.19a) κn−1 2rn,1 8ex − pn,0 b0,1 y: |x−y|=1
and 8n1 ex
1
+n2 ex
2
, for n1 ≥ 2 even and n2 ≥ 1 odd, to
(κn1 + κn2 )−1 2rn1 ,0 8ex 1(n2 = 1) − 1(|x1 − x2 | = 1) 2 × pn1 ,1 bn2 ,0 8ex + pn2 ,0 bn1 ,1 8ex ; 1
(3.19b)
2
Nod ,
are taken to zero. the rest of the vectors 8n , n ∈ P od M−1 A) −1 ¯ od As follows from (3.19a,b), the m.e.’s (L¯ 0,1 x1 ,x2 = n (L0,1 M )x1 ,n An ,x2 admit the bound od M−1 A)x1 ,x2 | ≤ c9 ( 1/2 )|x1 −x2 | |||A|||. |(L¯ 0,1 Applying an argument similar to that used for deriving bound (3.18), with the use of an obvious inequality `x1 ∪σ (n) + |x1 − x2 | ≥ `x2 ∪σ (n) , we obtain that od −1 −1 ¯ od ) M AL0,1 M−1 A||| < c10 |||A|||2 . |||M(L¯ 1,1
(3.20)
It remains to assess the first summand in the r.h.s. of (3.15). We have that X X X od 8ex = 2 r1,m 8mex − p1,m1 b0,m2 8m1 ex +m2 ey , L¯ 1,0 m∈Z+ , m>1, m odd
m1 ,m2 ∈Z+ , m1 +m2 >1
y: |y−x|=1
od od ||| ≤ c 1/2 . An argument similar to the above ∈ A and |||L¯ 1,0 which implies that L¯ 1,0 11 again leads to the bound od −1 ¯ od ) L1,0 ||| ≤ c11 1/2 . |||M(L¯ 1,1
(3.21)
From bounds (3.18), (3.20) and (3.21) we obtain that for small enough, ∃η ∈ (0, 1) such that for any A, A1 , A2 ∈ A |||ΛA||| ≤ η|||A||| + c10 |||A|||2 + c11 1/2 ,
(3.22a)
|||Λ(A1 − A2 )||| ≤ η|||A1 − A2 ||| + 2c12 max [|||A1 |||, |||A2 |||, |||A1 − A2 |||] . (3.22b) In turn, (3.22a) means that Λ is a bounded map A → A. As to (3.22b), it guarantees that there exists a constant R (1) > 0 such that the ball BR (1) = A ∈ A : |||A||| < R (1) 1/2 is taken by map Λ into itself. Similarly, from (3.22b) we see that for small enough this map is a contraction on BR (1) . Thus, the required properties of map Λ are established. Therefore, for small enough there exists a unique S satisfying (3.14a). The existence and uniqueness of l.o. T obeying (3.14b) is established in a similar way. This completes the proof of Lemma 3.5.2. u t Proof of Lemma 3.5.3. The bounds for m.e.’s Sx,n follows directly from the above analysis of map Λ. The proof of Lemma 3.5 is now complete. u t
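The fixed-point argument in the proof of Lemma 3.5.2 can be mimicked by a scalar toy model. In the sketch below the "quadratic" map is $\Lambda(a)=c+\eta a+qa^2$, with $\eta\in(0,1)$ standing in for the linear contraction factor of (3.18), $q$ for the constant of the quadratic term (3.20), and a small $c$ for the constant term (3.21). All numerical values and names are assumptions made for illustration; the actual map $\Lambda$ acts on operators, not on numbers.

```python
import math


def fixed_point(lam, start, tol=1e-14, max_iter=1000):
    """Iterate a -> lam(a) from `start` until successive iterates differ by less than tol."""
    a = start
    for _ in range(max_iter):
        nxt = lam(a)
        if abs(nxt - a) < tol:
            return nxt
        a = nxt
    raise RuntimeError("iteration did not converge")


# Hypothetical constants of the toy model.
ETA, Q, C = 0.5, 1.0, 0.01
RADIUS = 0.1


def toy_lambda(a):
    """Scalar analogue of the map in (3.15): constant + linear + quadratic part."""
    return C + ETA * a + Q * a * a


def maps_ball_into_itself():
    """Analogue of (3.22a): |a| <= R implies |Lambda(a)| <= C + ETA*R + Q*R^2 <= R."""
    return C + ETA * RADIUS + Q * RADIUS ** 2 <= RADIUS


def contraction_constant():
    """Analogue of (3.22b): Lipschitz constant of Lambda on the ball of radius R."""
    return ETA + 2.0 * Q * RADIUS


if __name__ == "__main__":
    print("ball invariant:", maps_ball_into_itself())
    print("contraction constant:", contraction_constant())
    print("fixed point:", fixed_point(toy_lambda, 0.0))
    print("closed form:", (0.5 - math.sqrt(0.21)) / 2.0)
```

Since the ball is mapped into itself and the Lipschitz constant is below 1, the iteration converges to the unique fixed point in the ball, here the smaller root of $qa^2-(1-\eta)a+c=0$.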
We now continue with the proof of Lemma 3.4. We have constructed a pair of $\bar L^{od}$-invariant $\mathcal{L}$-closed subspaces $\mathcal{L}_1, \mathcal{L}^{od}_{>1} \subset \mathcal{L}$. As $\bar L^{od}$ commutes with $\{U_y\}$, subspaces $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$ are $\{U_y\}$-invariant. Furthermore, the intersection of $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$ is zero, and their sum coincides with $\mathcal{L}^{od}$. The proof of the last assertion is identical to that of Lemma 3.4 from [10], and we refer the reader to this paper.

We want to outline the construction of the decomposition of the even space $\mathcal{L}^{ev}$. As before (cf. (3.11), (3.12)), we start with the decomposition $\mathcal{L}^{ev} = \mathcal{L}_0 + \mathcal{L}^{0,ev}_{\ge 2}$, and the corresponding representation of $\bar L^{ev}$ as a matrix
$$\bar L^{ev} \simeq \begin{pmatrix} 0 & \bar L^{ev}_{0,1} \\ 0 & \bar L^{ev}_{1,1} \end{pmatrix},$$
where $\bar L^{ev}_{0,1}: \mathcal{L}^{0,ev}_{\ge 2} \to \mathcal{L}_0$ and $\bar L^{ev}_{1,1}: \mathcal{L}^{0,ev}_{\ge 2} \to \mathcal{L}^{0,ev}_{\ge 2}$. The one-dimensional subspace $\mathcal{L}_0$ is identified with the complex line $\mathbb{C}$; such an identification is repeatedly used below without comment. Observe that operator $\bar L^{ev}_{0,1}$ is given by
$$\bar L^{ev}_{0,1}\Phi_{ne_x} = 2r_{n,0}, \quad n \in \mathbb{Z}^1_+,\ n \text{ even},$$
$$\bar L^{ev}_{0,1}\Phi_{n_1e_{x_1}+n_2e_{x_2}} = -(p_{n_1,0}b_{n_2,0} + p_{n_2,0}b_{n_1,0})\,\mathbf{1}(|x_1 - x_2| = 1), \quad n_1, n_2 \in \mathbb{Z}^1_+,\ n_1 \text{ and } n_2 \text{ odd}, \tag{3.23}$$
$$\bar L^{ev}_{0,1}\Phi_n = 0 \quad \text{for all other } n \in N^{ev}.$$
As before, we seek the subspace $\mathcal{L}^{ev}_{\ge 2}$ in (3.9) in the form $\mathcal{L}^{ev}_{\ge 2} = \{v : v = u + F(u),\ u \in \mathcal{L}^{0,ev}_{\ge 2}\}$, where $F: \mathcal{L}^{0,ev}_{\ge 2} \to \mathbb{C}$ is a bounded linear functional. The condition of $\bar L^{ev}$-invariance of $\mathcal{L}^{ev}_{\ge 2}$ leads to the equation $F(\bar L^{ev}_{1,1}u) = \bar L^{ev}_{0,1}u$, for any vector $u \in \mathcal{L}^{0,ev}_{\ge 2}$ from the domain of l.o.'s $\bar L^{ev}_{1,1}$ and $\bar L^{ev}_{0,1}$. This yields the formula $F(v) = \bar L^{ev}_{0,1}(\bar L^{ev}_{1,1})^{-1}v$, $v \in \mathcal{L}^{0,ev}_{\ge 2}$. As before, one can check that $\bar L^{ev}_{1,1}$ is invertible in $\mathcal{L}^{0,ev}_{\ge 2}$ and $(\bar L^{ev}_{1,1})^{-1}$ is $\mathcal{L}$-bounded. Thus, the linear functional $F$ is indeed $\mathcal{L}$-bounded, and $|||F|||_{\mathcal{L}} \le c_{13}$. This completes the construction of the $\bar L^{ev}$-invariant space $\mathcal{L}^{ev}_{\ge 2}$. It is also invariant under $\{U_y\}$ and provides the decomposition $\mathcal{L}^{ev} = \mathcal{L}_0 + \mathcal{L}^{ev}_{\ge 2}$.

The inequality $|||(\bar L^{ev}_{\ge 2})^{-1}|||_{\mathcal{L}} \le 1/(\kappa_2 - c_0) \le 1/(\bar\kappa_2 - c_0)$ in assertion 3 of Lemma 3.4 may be deduced from the established facts, similarly to the analogous inequality for $|||(\bar L^{od}_{>1})^{-1}|||_{\mathcal{L}}$.

It remains to check assertion 4 of Lemma 3.4: the closures (3.10) are invariant under $\bar L$ and $\{U_y\}$ (considered as l.o.'s in $\bar{\mathcal{H}}$) and form decomposition (2.10). To this end, it is convenient to pass from the unbounded self-adjoint l.o. $\bar L$ to a bounded (in $\bar{\mathcal{H}}$) l.o. $(\bar L + aE)^{-1}$. Here, the constant $a > 0$ is chosen so that the l.o. $(\bar L_1 + aE)^{-1}$ acting in space $\mathcal{L}_1 \subset \mathcal{L}^{od}$ is $\mathcal{L}_1$-bounded. (Observe that the action of $(\bar L_1 + aE_{\mathcal{L}_1})^{-1}$ on $\mathcal{L}_1$ coincides with that of $(\bar L + aE)^{-1}$.) By virtue of the $\bar{\mathcal{H}}$-boundedness of $(\bar L + aE)^{-1}$, $\bar{\mathcal{H}}_1$ is invariant with respect to $(\bar L + aE)^{-1}$. Thus, it is invariant with respect to $\bar L$. In a similar way one can check that $\bar{\mathcal{H}}^{od}_{>1}$ and $\bar{\mathcal{H}}^{ev}_{\ge 2}$ are $\bar L$-invariant. The invariance of $\bar{\mathcal{H}}_1$, $\bar{\mathcal{H}}^{od}_{>1}$ and $\bar{\mathcal{H}}^{ev}_{\ge 2}$ under $U_y$ and decomposition (2.10) follow from the construction. This completes the proof of Lemma 3.4. □
Thus, we construct decomposition (2.10). Its spectral properties are checked in Sect. 4.
4. Spectral Properties of $\bar L_1$

Lemma 4.1. Let $B$ be a self-adjoint l.o. in H.s. $\bar{\mathcal{H}}$, $B\mathcal{L} \subseteq \mathcal{L}$, and assume that the restriction of $B$ to $\mathcal{L}$ is $\mathcal{L}$-bounded. Then $B$ is $\bar{\mathcal{H}}$-bounded, and $\|B\|_{\bar{\mathcal{H}}} \le |||B|||_{\mathcal{L}}$.

The proof of Lemma 4.1 repeats that of Lemma 3.1 from [16] and is omitted. From Lemma 4.1 and bounds of the preceding section we deduce that
$$\|\bar L_1\|_{\bar{\mathcal{H}}_1} \le \kappa_1 + c_0, \quad \|(\bar L^{od}_{>1})^{-1}\|_{\bar{\mathcal{H}}^{od}_{>1}} \le (\bar\kappa_3 - c_0)^{-1}, \quad \|(\bar L^{ev}_{\ge 2})^{-1}\|_{\bar{\mathcal{H}}^{ev}_{\ge 2}} \le (\bar\kappa_2 - c_0)^{-1}. \tag{4.1}$$
The two last bounds in (4.1) imply that the spectra of $\bar L^{od}_{>1}$ and $\bar L^{ev}_{\ge 2}$ lie to the right of points $\bar\kappa_3 - c_0$ and $\bar\kappa_2 - c_0$, respectively. This gives the proof of assertion 1(i) of Theorem 1. □

The first bound in (4.1) gives that the spectrum of $\bar L_1$ lies to the left of point $\kappa_1 + c_0$. To establish the lower bound for the spectrum of $\bar L_1$, consider a family of elements $\Theta_x$ of $\mathcal{L}_1$ of the form
$$\Theta_x = \Phi_{e_x} + M^{-1}S\Phi_{e_x}, \quad x \in \mathbb{Z}^d. \tag{4.2}$$
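Obviously, $\forall x, y \in \mathbb{Z}^d$, $U_y\Theta_x = \Theta_{x+y}$, and $|||\Theta_x|||_{\mathcal{L}} = 1 + \zeta$, where $\zeta$ does not depend on $x$ and $|\zeta| \le c_1$. Furthermore, for any $v \in \mathcal{L}_1$ we have: $v = \sum_x g_x(\Phi_{e_x} + M^{-1}S\Phi_{e_x}) = \sum_x g_x\Theta_x$, which yields that $\{\Theta_x\}$ is a basis in $\mathcal{L}_1$, and the coefficients $g_x$ obey $\sum_x |g_x| \le |||v|||_{\mathcal{L}} \le (1 + \zeta)\sum_x |g_x|$.

Let $L_{x,y}$ denote the m.e.'s of $\bar L_1$ in basis $\{\Theta_x\}$: $\bar L_1\Theta_x = \sum_y L_{x,y}\Theta_y$. As $\bar L_1$ commutes with $\{U_y\}$, $L_{x,y}$ depends only on $x - y$:
$$L_{x,y} = (\bar L^{od}_{0,0})_{x,y} + (\bar L^{od}_{0,1}M^{-1}S)_{x,y} := m(x - y), \quad x, y \in \mathbb{Z}^d. \tag{4.3}$$
From (3.3) we find that
$$(\bar L^{od}_{0,0})_{x,y} = \begin{cases} \kappa_1 + 2r_{1,1}, & \text{if } x = y, \\ -2p_{1,0}b_{0,1}, & \text{if } |x - y| = 1, \\ 0, & \text{otherwise,} \end{cases} \tag{4.4}$$
and as in the proof of Lemma 3.5.2,
$$|(\bar L^{od}_{0,1}M^{-1}S)_{x,y}| \le c_2\,(\epsilon^{1/2})^{|x-y|}. \tag{4.5}$$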
Now consider a commutative Banach algebra $\mathcal{B}$ formed by the functions $f: \mathbb{Z}^d \to \mathbb{C}$, with the $l_1(\mathbb{Z}^d)$-norm $\|f\|_1 = \sum_x |f(x)|$, where the multiplication is given by the convolution: $(f_1 * f_2)(x) = \sum_{x'} f_1(x')f_2(x - x')$. Obviously, $m \in \mathcal{B}$ and the unit of $\mathcal{B}$ is the function $e_0(x) = \mathbf{1}(x = 0)$.

Lemma 4.2. Element $m$ is invertible in $\mathcal{B}$. Furthermore, $m$ and its inverse $m^{-*1}$ have the form
$$m = \kappa_1e_0 + n, \quad m^{-*1} = \kappa_1^{-1}e_0 + p, \quad \text{where } |n(x)|, |p(x)| \le c_3\,\epsilon^{1/2}(\epsilon^{1/2})^{|x|}. \tag{4.6}$$
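Proof of Lemma 4.2. The bound on $n$ is obvious from (4.4). Consider the Fourier transform of $m$: $\hat m(\theta) = \kappa_1 + \hat p(\theta)$, where $\hat p(\theta) = \sum_x e^{i\langle\theta,x\rangle}n(x)$. Then $\hat p$ is analytic in the complex domain $\{\theta \in \mathbb{C}^d : |\Im(\theta)| < (1/2)|\ln\epsilon|\}$ (here, $|\ |$ stands for the norm both in $\mathbb{C}$ and $\mathbb{C}^d$). Furthermore, $\forall\zeta \in (0, |\ln\epsilon|/2)$, the function $\hat p$ in the domain $\{\theta \in \mathbb{C}^d : |\Im(\theta)| \le (1/2)|\ln\epsilon| - \zeta\}$ admits the bound $|\hat p(\theta)| \le c_4\epsilon^{1/2}$. Therefore,
$$1/\hat m(\theta) = 1/(\kappa_1 + \hat p(\theta)) = \kappa_1^{-1} - \hat p(\theta)\kappa_1^{-1}(\kappa_1 + \hat p(\theta))^{-1}.$$
Taking the inverse Fourier transform yields $m^{-*1}(x) = \kappa_1^{-1}e_0(x) + p(x)$, where
$$p(x) = -\kappa_1^{-1}\int_{T^d} e^{-i\langle\theta,x\rangle}\,\hat p(\theta)\,(\kappa_1 + \hat p(\theta))^{-1}\,d\theta,$$
with $d\theta$ the normalised Lebesgue measure on $T^d$. Owing to the analyticity of $\hat p$ and the above bound for it, by choosing an appropriate integration contour in the last integral, we obtain the bound (4.6) for $p$. The proof of Lemma 4.2 is now complete. □

Lemma 4.2 implies that $\bar L_1$ is invertible in $\mathcal{L}_1$, and $\bar L_1^{-1}$ acts on $\{\Theta_x\}$ as $\bar L_1^{-1}\Theta_x = \kappa_1^{-1}\Theta_x + \sum_y p(x - y)\Theta_y$. It is easy to see that for any vector $v = \sum_x g_x\Theta_x$ the norm $|||\bar L_1^{-1}v|||_{\mathcal{L}}$ is
$$\le \sum_x\Big(\kappa_1^{-1}|g_x| + \sum_y |p(x - y)||g_y|\Big)|||\Theta_x|||_{\mathcal{L}} \le (1 + c_5)\big(\kappa_1^{-1} + c_5\epsilon^{1/2}\big)|||v|||_{\mathcal{L}},$$
whence
$$|||\bar L_1^{-1}|||_{\mathcal{L}} \le \kappa_1^{-1} + c_5\epsilon^{1/2}. \tag{4.7}$$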
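The inversion in the convolution algebra can be tried out numerically in dimension one. The sketch below is an illustration under stated assumptions, not the contour argument of the proof: it represents finitely supported functions on $\mathbb{Z}$ as dictionaries, takes a hypothetical nearest-neighbour perturbation `n` of size `eps`, and builds the inverse of `m = kappa*e0 + n` by a truncated Neumann series (the helper names `convolve` and `neumann_inverse` are made up for this example).

```python
from collections import defaultdict


def convolve(f: dict, g: dict) -> dict:
    """Convolution (f*g)(x) = sum_y f(y) g(x - y) of finitely supported functions on Z."""
    out = defaultdict(float)
    for y, fy in f.items():
        for z, gz in g.items():
            out[y + z] += fy * gz
    return dict(out)


def neumann_inverse(kappa: float, n: dict, terms: int = 40) -> dict:
    """Approximate inverse of m = kappa*e0 + n in l1(Z) by a truncated Neumann series.

    Uses m^{-1} = kappa^{-1} * sum_{k>=0} (-n/kappa)^{*k}, valid when ||n||_1 < kappa.
    """
    ratio = {x: -v / kappa for x, v in n.items()}
    power = {0: 1.0}  # (-n/kappa)^{*0} is the unit e0
    total = defaultdict(float)
    for _ in range(terms):
        for x, v in power.items():
            total[x] += v / kappa
        power = convolve(power, ratio)
    return dict(total)


if __name__ == "__main__":
    kappa, eps = 1.0, 0.05
    n = {-1: -eps, 1: -eps}  # small nearest-neighbour perturbation (assumption)
    m = {0: kappa, **n}
    inv = neumann_inverse(kappa, n)
    prod = convolve(m, inv)
    print(prod[0], max(abs(v) for x, v in prod.items() if x != 0))
    print([abs(inv.get(x, 0.0)) for x in range(5)])
```

The printed values of the inverse fall off geometrically in $|x|$, which is the qualitative content of bound (4.6); the rate in this toy example depends on the chosen `eps` and is not the rate of the lemma.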
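By virtue of Lemma 3.6 we obtain that the spectrum of $\bar L_1$ in $\bar{\mathcal{H}}_1$ lies to the right of the point $\kappa_1 - c_6\epsilon^{1/2}$, which yields assertion 1(ii) of Theorem 1.1. □

To prove assertion 2, we pass from $\{\Theta_x\}$ to another, orthonormal, basis $\{\tilde\Theta_x\}$ which we construct below. In what follows, we use the symbol $\langle\ ,\ \rangle_{\bar{\mathcal{H}}}$ (and alternatively $\langle\ ,\ \rangle_{\bar\nu}$) for the scalar product in $\bar{\mathcal{H}}$. Furthermore, $\langle g\rangle_{\bar{\mathcal{H}}}$ (and alternatively $\langle g\rangle_{\bar\nu}$) stands for the integral $\int g(Q)\,d\bar\nu(Q) = \langle g, 1\rangle_{\bar{\mathcal{H}}}$. Finally, we set $\mathrm{Co}_{\bar{\mathcal{H}}}(g_1, g_2) := \langle g_1, g_2\rangle_{\bar{\mathcal{H}}} - \langle g_1\rangle_{\bar{\mathcal{H}}}\langle g_2\rangle_{\bar{\mathcal{H}}}$ and call $\mathrm{Co}_{\bar{\mathcal{H}}}(g_1, g_2)$ (alternatively denoted as $\mathrm{Co}_{\bar\nu}(g_1, g_2)$) the correlator of $g_1$ and $g_2$.

Consider the Gram matrix for $\{\Theta_x\}$, with the m.e.'s $G_{x,y} = \mathrm{Co}(\Theta_x, \Theta_y) = \langle\Theta_x, \Theta_y\rangle_{\bar{\mathcal{H}}}$ (we use here the fact that $\langle\Theta_x\rangle_{\bar{\mathcal{H}}} = 0$, $x \in \mathbb{Z}^d$). Clearly, $G_{x,y}$ is a function of $x - y$ only. Furthermore,
$$G_{x,y} = \mathrm{Co}_{\bar{\mathcal{H}}}\Big(\Phi_{e_x} + \sum_n S_{x,n}\Phi_n,\ \Phi_{e_y} + \sum_{m \in N} S_{y,m}\Phi_m\Big) = \mathrm{Co}_{\bar{\mathcal{H}}}(\Phi_{e_x}, \Phi_{e_y}) + \sum_n S_{x,n}\mathrm{Co}_{\bar{\mathcal{H}}}(\Phi_n, \Phi_{e_y}) + \sum_m S_{y,m}\mathrm{Co}_{\bar{\mathcal{H}}}(\Phi_{e_x}, \Phi_m) + \sum_{n,m} S_{x,n}S_{y,m}\mathrm{Co}_{\bar{\mathcal{H}}}(\Phi_n, \Phi_m). \tag{4.8}$$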
Lemma 4.3. For any $n, m \in N$ the following bound holds:
$$|\mathrm{Co}_{\bar{\mathcal{H}}}(\Phi_n, \Phi_m)| \le c_6^{|\sigma(n)|+|\sigma(m)|}\,(c_6\epsilon)^{\rho(n,m)}. \tag{4.9}$$
Here, and below, $|\sigma(n)|$ denotes the cardinality of $\sigma(n)$, and $\rho(n, m)$ stands for the distance $\min\{|x - y| : x \in \sigma(n),\ y \in \sigma(m)\}$.

The proof of Lemma 4.3 is carried out in Sect. 5. Formula (4.8) and bound (4.9), together with the bounds of Lemma 3.5.3 and the inequality $\ell_{\{x\}\cup\sigma(n)} + \ell_{\{y\}\cup\sigma(m)} + \rho(n, m) \ge |x - y|$, imply that function $f: \mathbb{Z}^d \to \mathbb{R}$ defined by $G_{x,y} =: f(x - y)$ belongs to algebra $\mathcal{B}$ and admits the representation $f = re_0 + h$, where (a) $r = \langle\Phi^2_{e_x}\rangle_{\bar{\mathcal{H}}} > 0$ does not depend on $x \in \mathbb{Z}^d$, and (b) $h \in l_1(\mathbb{Z}^d)$ satisfies the bound $|h(x)| \le c_7\,(c_7\epsilon^{1/4})^{|x|}$.
Repeating the argument given in the proof of Lemma 4.2, we conclude that there exist in $\mathcal{B}$ the square $*$-root $f^{*1/2}$ and its inverse $f^{-*1/2}$, and they admit the representations
$$f^{*1/2} = r^{1/2}e_0 + h_1, \quad f^{-*1/2} = r^{-1/2}e_0 + h_2, \quad \text{where } |h_1(x)|, |h_2(x)| \le c_8\,(c_8\epsilon^{1/4})^{|x|}. \tag{4.10}$$
The basis $\{\tilde\Theta_x\}$ in $\bar{\mathcal{H}}_1$, map $V: \bar{\mathcal{H}}_1 \to L_2(T^d, d\lambda)$ and function $\hat m(\lambda)$, $\lambda \in T^d$, are now defined by
$$\tilde\Theta_x = \sum_z f^{-*1/2}(x - z)\Theta_z, \quad (V\tilde\Theta_y)(\lambda) = \exp i\langle y, \lambda\rangle, \quad \hat m(\lambda) = \sum_z m(z)\exp i\langle z, \lambda\rangle. \tag{4.11}$$
Group $\{U_y\}$ acts on $\tilde\Theta_x$ by $U_y\tilde\Theta_x = \tilde\Theta_{x+y}$, and function $\hat m$ is analytic in a $\mathbb{C}^d$-neighbourhood of torus $T^d$. Finally, it is not hard to check that
$$\hat m(\lambda) = \kappa_1 + 2r_{1,1} - 2p_{1,0}b_{0,1}\sum_{1\le j\le d}\cos\lambda^{(j)} + O(\epsilon^2), \quad \lambda = (\lambda^{(1)}, \ldots, \lambda^{(d)}) \in T^d. \tag{4.12}$$
Thus, function $\hat m$ is not constant. This completes the proof of Theorem 1. □

5. Cluster Expansions

In this section we prove Lemmas 3.1 and 4.3. The proof is based on cluster expansions for measure $\bar\nu$ (see (2.2)–(2.5)) which are discussed below.

5.1. Expansion of the partition function. We begin with an expansion of the partition function $Z_V$ related to generator $\bar L$ in a finite set $V \subset \mathbb{Z}^d$:
$$Z_V = \int\exp\Big[-\epsilon\sum_{x,y\in V:\ |x-y|=1}(q_x - q_y)^2\Big]d\mu(Q). \tag{5.1a}$$
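We use a standard representation of the product $\prod_{\{x,y\}\subset V}\big(e^{-\epsilon(q_x-q_y)^2} - 1 + 1\big)$ as the sum $\sum^{(V)}_{\{\Gamma\}}\prod_\Gamma p_\Gamma(Q)$ to write
$$Z_V = \sum^{(V)}_{\{\Gamma\}}\prod_\Gamma P_\Gamma, \tag{5.1b}$$
where
$$P_\Gamma = \int p_\Gamma(Q)\,d\mu(Q), \quad p_\Gamma(Q) = \prod_{\{x,y\}\in\Gamma}\big(e^{-\epsilon(q_x-q_y)^2} - 1\big). \tag{5.1c}$$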
Here and below, the sum $\sum^{(V)}_{\{\Gamma\}}$ is taken over the finite unordered collections of pairwise disjoint $\mathbb{Z}^d$-connected sets $\Gamma$ of lattice edges $\{x, y\}$ lying in "volume" $V$ (we say that an edge $\{x, y\}$ lies in a set $O \subset \mathbb{Z}^d$ (and write $\{x, y\} \subset O$) when $x, y \in O$), and the product $\prod_\Gamma$ over the $\Gamma$'s from the given collection. Furthermore, $[\Gamma]$ denotes below the set of vertices of the edges $\{x, y\} \subset \Gamma$, $|\Gamma|$ the cardinality of set $\Gamma$ and $|[\Gamma]|$ that of $[\Gamma]$.

It turns out that the following bound holds true: $|P_\Gamma| \le (c_0\epsilon)^{|\Gamma|}$. The derivation of this bound is based on the following general fact:

Lemma 5.1 (A generalized Hölder inequality). Let $(E_t, \mathcal{E}_t, \pi_t)$, $t \in T$, be a finite family of probability spaces and $(E, \mathcal{E}, \pi) = \prod_{t\in T}(E_t, \mathcal{E}_t, \pi_t)$ their Cartesian product. Suppose that $\{f_{Y_i},\ 1 \le i \le k\}$ is a collection of functions $E \to \mathbb{C}$, indexed by subsets $Y_i \subset T$ such that each function $f_{Y_i}$ is measurable relative to the sigma-subalgebra $\mathcal{E}_{Y_i} = \prod_{t\in Y_i}\mathcal{E}_t \subseteq \mathcal{E}$. Furthermore, assume that a collection of positive numbers $r_i$, $1 \le i \le k$, is given, such that $\sum_{Y_i:\ Y_i\ni t} r_i^{-1} \le 1$ $\forall t \in T$. Then
$$\Big|\int_E\prod_{i=1}^k f_{Y_i}\,d\pi\Big| \le \prod_{i=1}^k\Big(\int_E |f_{Y_i}|^{r_i}\,d\pi\Big)^{1/r_i}.$$
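A small numerical check may help to see how Lemma 5.1 is used for edge factors. The sketch below is an illustration only: it assumes a hypothetical three-point factor space `POINTS` with the uniform measure, takes the three edges of a triangle (so each vertex lies in two edges and $r_i = 2$ satisfies the condition of the lemma), and compares both sides of the inequality for factors of the form $e^{-\epsilon(q_i-q_j)^2} - 1$.

```python
import itertools
import math

POINTS = (-1.0, 0.5, 2.0)  # each factor space: three equally likely points (assumption)


def expectation(func, dim: int = 3) -> float:
    """Average of func over the product space POINTS^dim with the uniform product measure."""
    configs = list(itertools.product(POINTS, repeat=dim))
    return sum(func(q) for q in configs) / len(configs)


def edge_factor(eps: float, i: int, j: int):
    """The function exp(-eps*(q_i - q_j)^2) - 1, which depends on coordinates i and j only."""
    return lambda q: math.exp(-eps * (q[i] - q[j]) ** 2) - 1.0


def holder_sides(eps: float, edges, r: float):
    """Return (|E prod_i f_i|, prod_i (E |f_i|^r)^(1/r)) for the given edge factors."""
    factors = [edge_factor(eps, i, j) for i, j in edges]
    lhs = abs(expectation(lambda q: math.prod(f(q) for f in factors)))
    rhs = math.prod(expectation(lambda q, f=f: abs(f(q)) ** r) ** (1.0 / r) for f in factors)
    return lhs, rhs


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]  # every vertex lies in exactly two edges
    print(holder_sides(0.1, triangle, 2.0))
```

In the lattice setting of the paper each vertex meets at most $2d$ edges, which is why the exponent $2d$ appears in the application that follows.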
For the proof of Lemma 5.1 see [20], Lemma 5.2. To apply Lemma 5.1, we set $T = [\Gamma]$ and identify $Y_i$ as the two-point subset consisting of vertices $x_i$ and $y_i$ of an edge $\{x_i, y_i\} \subset \Gamma$. As each point $x \in [\Gamma]$ is incident to not more than $2d$ edges from $\Gamma$, we can take $r_i = 2d$. Lemma 5.1 then gives that
$$|P_\Gamma| \le \prod_{\{x,y\}\subset\Gamma}\Big(\int\big|e^{-\epsilon(q_x-q_y)^2} - 1\big|^{2d}\,d\mu(Q)\Big)^{1/(2d)}.$$
To bound a single term in the last product, we use the straightforward inequalities
$$\big|e^{-\epsilon(q_x-q_y)^2} - 1\big| \le \epsilon(q_x - q_y)^2 \quad\text{and}\quad \int(q_x - q_y)^{4d}\,d\mu \le 2^{4d}\int q^{4d}\,d\mu_0(q).$$
This yields the bound $|P_\Gamma| \le (c_0\epsilon)^{|\Gamma|}$ with $c_0 = 4\big(\int q^{4d}\,d\mu_0(q)\big)^{1/(2d)}$.

We list below, without proof, some facts about the partition function $Z_V$ which may be derived from the above bound for $P_\Gamma$. For the proof, see [13], Chapter 3. First, the above expansion of $Z_V$ absolutely converges for $\epsilon$ small enough. Furthermore, given finite $V_0 \subset V$, set $\varphi^{(V)}_{V_0} = Z_V^{-1}Z_{V\setminus V_0}$. Then for any finite $V_0, V_0^{(1)}, V_0^{(2)} \subset \mathbb{Z}^d$, with $V_0^{(1)} \cap V_0^{(2)} = \emptyset$,

1) the following bounds hold true: for any finite $V \supseteq V_0, V_0^{(1)}, V_0^{(2)}$,
$$|\varphi^{(V)}_{V_0}| \le c_1\,2^{|V_0|}, \quad \Big|\varphi^{(V)}_{V_0^{(1)}\cup V_0^{(2)}} - \prod_{j=1,2}\varphi^{(V)}_{V_0^{(j)}}\Big| \le c_1\,3^{|V_0^{(1)}|+|V_0^{(2)}|}\,(c_1\epsilon)^{\rho(V_0^{(1)},V_0^{(2)})}, \tag{5.2}$$
where, as before, $\rho(V_0^{(1)}, V_0^{(2)})$ denotes the distance between sets $V_0^{(1)}$ and $V_0^{(2)}$;

2) there exists the limit $\varphi_{V_0} = \lim_{V\nearrow\mathbb{Z}^d}\varphi^{(V)}_{V_0}$, and the limiting value $\varphi_{V_0}$ satisfies bounds (5.2).
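The expansion of Sect. 5.1 starts from a purely algebraic identity at a fixed configuration $Q$: a product of factors $1 + a_e$ over edges equals the sum over all edge subsets of the products of the $a_e$, and each subset then splits into connected components. The following sketch checks this on a small graph; the edge weights and the helper names (`product_form`, `subset_sum_form`, `connected_components`) are hypothetical and chosen only for illustration.

```python
import itertools
import math


def product_form(weights: dict) -> float:
    """Left-hand side: product over edges of (1 + a_e)."""
    return math.prod(1.0 + a for a in weights.values())


def subset_sum_form(weights: dict) -> float:
    """Right-hand side: sum over all subsets S of edges of prod_{e in S} a_e."""
    edges = list(weights)
    total = 0.0
    for k in range(len(edges) + 1):
        for subset in itertools.combinations(edges, k):
            total += math.prod(weights[e] for e in subset)
    return total


def connected_components(subset):
    """Split a collection of edges into connected components (edges sharing a vertex are linked)."""
    components = []
    for edge in subset:
        touching = [c for c in components if any(set(edge) & set(e) for e in c)]
        merged = [edge] + [e for c in touching for e in c]
        components = [c for c in components if c not in touching] + [merged]
    return components


if __name__ == "__main__":
    # Edges of a unit square in Z^2 with arbitrary small weights a_e (assumption).
    square = {
        ((0, 0), (0, 1)): -0.1,
        ((0, 1), (1, 1)): -0.2,
        ((1, 1), (1, 0)): 0.05,
        ((1, 0), (0, 0)): -0.3,
    }
    print(product_form(square), subset_sum_form(square))
    print(connected_components([((0, 0), (0, 1)), ((5, 5), (5, 6))]))
```

Grouping the subsets by their connected components, and integrating over a product measure, is what turns this identity into a sum of products of the quantities $P_\Gamma$.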
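5.2. Expansion for expected values. Given a finite set $V^{(0)} \subset \mathbb{Z}^d$, suppose that $g_{V^{(0)}}$ is a complex-valued function localised in $V^{(0)}$ (i.e. depending only on the restriction $Q_{V^{(0)}}$ of a configuration $Q$ to $V^{(0)}$: $g_{V^{(0)}}(Q) = g_{V^{(0)}}(Q_{V^{(0)}})$). Assuming that $V^{(0)} \subseteq V$, consider the Gibbs distribution $\bar\nu_V$ with the density $d\bar\nu_V(Q)/d\mu(Q) = Z_V^{-1}\exp\big[-\epsilon\sum_{x,y\in V:\ |x-y|=1}(q_x - q_y)^2\big]$; see (5.1). The approach adopted in Sect. 5.1 leads to the following representation for the expected value $\langle g_{V^{(0)}}\rangle_{\bar\nu_V} := \int g_{V^{(0)}}\,d\bar\nu_V$:
$$\langle g_{V^{(0)}}\rangle_{\bar\nu_V} = \sum^{(V^{(0)})}_{\Gamma:\ [\Gamma]\subseteq V}\int g_{V^{(0)}}(Q)\,p_\Gamma(Q)\,d\mu(Q)\ \varphi^{(V)}_{V^{(0)}\cup[\Gamma]}. \tag{5.3}$$
Here, the sum $\sum^{(V^{(0)})}_{\Gamma:\ [\Gamma]\subseteq V}$ is over the sets $\Gamma$ of pairwise distinct edges of $\mathbb{Z}^d$ such that (a) each connected component $\tilde\Gamma$ of $\Gamma$ has $[\tilde\Gamma] \cap V^{(0)} \ne \emptyset$, and (b) the set of the vertices $[\Gamma]$ of the edges from $\Gamma$ is a subset of $V$. Equations (5.2) and (5.3) imply that if
$$\sum^{(V^{(0)})}_{\Gamma:\ |[\Gamma]|<\infty}\Big|\int g_{V^{(0)}}(Q)\,p_\Gamma(Q)\,d\mu(Q)\Big|\,2^{|V^{(0)}\cup[\Gamma]|} < \infty, \tag{5.4}$$
then there exists the limit $\lim_{V\nearrow\mathbb{Z}^d}\langle g_{V^{(0)}}\rangle_{\bar\nu_V}$ that coincides with $\langle g_{V^{(0)}}\rangle_{\bar\nu}$, and
$$\langle g_{V^{(0)}}\rangle_{\bar\nu} = \sum^{(V^{(0)})}_{\Gamma:\ |[\Gamma]|<\infty}\int g_{V^{(0)}}(Q)\,p_\Gamma(Q)\,d\mu(Q)\ \varphi_{V^{(0)}\cup[\Gamma]}; \tag{5.5}$$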
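here the sum $\sum^{(V^{(0)})}_{\Gamma:\ |[\Gamma]|<\infty}$ is over the finite sets $\Gamma$ which fulfill condition (a) above. In turn, condition (5.4) holds for $\epsilon$ small enough.

5.3. Expansion for correlators. Now let $g_{V_0^{(1)}}$ and $g_{V_0^{(2)}}$ be two complex-valued functions localised in disjoint finite sets $V_0^{(1)}, V_0^{(2)} \subset \mathbb{Z}^d$. We want to write an expansion for their correlator $\mathrm{Co}_{\bar\nu}(g_{V_0^{(1)}}, g_{V_0^{(2)}})$. After some straightforward (although tedious) algebra, one obtains:
$$\mathrm{Co}_{\bar\nu}(g_{V_0^{(1)}}, g_{V_0^{(2)}}) = \sum^{(V_0)}_{\Gamma:\ \Gamma_{12}\ne\emptyset}\int g_{V_0^{(1)}}(Q)\,g_{V_0^{(2)}}(Q)\,p_\Gamma(Q)\,d\mu(Q)\ \varphi_{V_0\cup[\Gamma]}$$
$$+ \sum^{(V_0)}_{\Gamma:\ \Gamma_{12}=\emptyset}\int g_{V_0^{(1)}}(Q)\,g_{V_0^{(2)}}(Q)\,p_\Gamma(Q)\,d\mu(Q)\ \Big(\varphi_{V_0\cup[\Gamma_1]\cup[\Gamma_2]} - \prod_{j=1,2}\varphi_{V_0^{(j)}\cup[\Gamma_j]}\Big)$$
$$- \sum^{(V_0^{(1)},V_0^{(2)})}_{\Gamma_1,\Gamma_2}\ \prod_{j=1,2}\int g_{V_0^{(j)}}(Q)\,p_{\Gamma_j}(Q)\,d\mu(Q)\ \varphi_{V_0^{(j)}\cup[\Gamma_j]}. \tag{5.6}$$
Here $V_0$ stands for $V_0^{(1)} \cup V_0^{(2)}$. Furthermore, (a) $\Gamma_{12}$ $(= \Gamma_{12}(\Gamma))$ is the union $\cup\tilde\Gamma$ of those connected components $\tilde\Gamma$ of set $\Gamma$ which have $[\tilde\Gamma] \cap V_0^{(j)} \ne \emptyset$, $j = 1, 2$; (b) $\Gamma_1$ $(= \Gamma_1(\Gamma))$ is the union of those connected components $\tilde\Gamma$ of $\Gamma$ which have $[\tilde\Gamma] \cap V_0^{(1)} \ne \emptyset$, but $[\tilde\Gamma] \cap V_0^{(2)} = \emptyset$; and finally (c) $\Gamma_2$ $(= \Gamma_2(\Gamma))$ is the union of those connected components $\tilde\Gamma$ of $\Gamma$ which have $[\tilde\Gamma] \cap V_0^{(1)} = \emptyset$, but $[\tilde\Gamma] \cap V_0^{(2)} \ne \emptyset$. The last sum, $\sum^{(V_0^{(1)},V_0^{(2)})}_{\Gamma_1,\Gamma_2}$, is over the pairs of finite sets $\Gamma_1, \Gamma_2$ of lattice edges such that (i) all connected components $\tilde\Gamma^{(1)}$ of set $\Gamma_1$ have $[\tilde\Gamma^{(1)}] \cap V_0^{(1)} \ne \emptyset$, and all connected components $\tilde\Gamma^{(2)}$ of set $\Gamma_2$ have $[\tilde\Gamma^{(2)}] \cap V_0^{(2)} \ne \emptyset$, and (ii) either at least one of the connected components $\tilde\Gamma^{(1)}$ of $\Gamma_1$ has $[\tilde\Gamma^{(1)}] \cap \big(V_0^{(2)} \cup [\Gamma_2]\big) \ne \emptyset$ or at least one of the connected components $\tilde\Gamma^{(2)}$ of $\Gamma_2$ has $[\tilde\Gamma^{(2)}] \cap \big(V_0^{(1)} \cup [\Gamma_1]\big) \ne \emptyset$. To simplify the notation, we have omitted in these sums the conditions $|[\Gamma]| < \infty$ and $|[\Gamma_j]| < \infty$, $j = 1, 2$; a similar agreement will also be in force in what follows. As before, the series in the r.h.s. of (5.6) converge absolutely for $\epsilon$ small enough.

Proof of Lemma 3.1. By (5.6), the integral $\int\Psi_n^2\,d\mu$ equals
$$\sum^{(\sigma(n))}_{\Gamma}\int\Psi_n^2(Q)\,p_\Gamma(Q)\,d\mu(Q)\ \varphi_{[\Gamma]\cup\sigma(n)} = \sum^{(\sigma(n))}_{\Gamma}\int p_\Gamma(Q)\,d\mu^{(n)}(Q)\ \varphi_{[\Gamma]\cup\sigma(n)}. \tag{5.7}$$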
Here, µ(n) is a new reference measure: µ(n) = ×x∈Zd µn(x) , where (cf. (2.5)) en2 (q)dq, ψ en (q) = I −1/2 exp −q 2s /2 ψn (q), q ∈ R, n ∈ Z+ . (5.8) dµn (q) = ψ R To bound the absolute value (a.v.) p0 (Q)dµ(n) (Q) , we again use Lemma 5.1 and R straightforward inequalities for exponentials. Denoting In = q 4d dµn (q), we have that the above a.v. does not exceed Y Z ((qx4d + qy4d )/2)1/(2d) dµ(n) (Q). (4)|[0]| {x,y}∈0
In turn, the last expression is ≤ (4)|[0]|
Y
((In(x) + In(y) )/2)1/(2d) .
(5.9)
{x,y}∈0
Furthermore, (In1
(In1 + 1)(In2 + 1), n1 , n2 ≥ 1, + In2 )/2 ≤ I0 × In1 + 1, n1 ≥ 1, n2 = 0, 1, n = n = 0. 1 2
(5.10)
Now we again use the fact that each point x ∈ σ ( n ∩ [0]) is incident to not |[0]|/(2d) more than 2d edges from 0. This, together with (5.10), yields thatR (5.9) is ≤ I0 Q Q 2 x∈σ (n) (In(x) + 1). Therefore, from (5.7) and (5.8) we obtain that 9n dµ is ≤ x∈σ (n) P(σ (n)) (c2 )|[0]| 2|σ (n∪[0])| . Furthermore, the last sum does not exceed In(x) + 1 0 Pσ (n) P(σ (n)) |[0]| , where we used the fact that (c2 )|[0]| 2|σ (n∪[0])| ≤ 2|σ (n)| 0 0 (4c2 ) |[0]| ≤ 2|0|. Set 0 is identified with the collection of its connected components {0 1 , . . . , 0 l }: 0 , and a point x ∈ Zd , 0i = 0i (0), 1 ≤ i ≤ l. Given a connected set of edges, e
write x ≺ e 0 if x is the lexicographically minimal point of the set [e 0 ]. It is plain that |0| P(σ (n)) is 4c 2 0 (σ (n), con)
X
Y
{0}
0
≤
(4c2 )|0| ≤
X
X
Y
X
e
(4c2 )|0 | .
(5.11)
l≥0 {x1 ,... ,xl }⊆σ (n) 1≤j ≤l e 0 : xj ≺e 0
P(σ ( n),con) is over the unordered collections of pairwise disjoint conHere, the sum {0} nected sets of edges 0 such that [0] ∩ σ (n) 6 = ∅. |e P 0| For > 0 small enough, the series e converges and is ≤ c3 . Hence, 2 0 : x≺e 0 4c P P(σ (n)) |σ (n)| s (c2 )|[0]| 2|σ (n∪[0]| does not exceed l≥0 4c3 = (1 + 4c3 )|σ (n)| . 0 s (4d)/(4s−2)
Lemma 5.2. The following bound holds true: In ≤ c4 κn
, n ∈ Z+ .
The proof of Lemma 5.2 will be given in Sect. 6. The assertion of Lemma 5.2 implies R |σ (n)| Q (4d)/(4s−2) . The assertion of Lemma 3.1 now that 9n2 dµ is ≤ c5 x 1 + c6 κn(x) follows straightforwardly. u t Proof of Lemma 4.3. First, consider the simplest case where σ (n) ∩ σ (m) 6= ∅, i.e., ρ(n, m) = 0. In this case, |h8n , 8m iH¯ |, |h8n iH¯ | ≤ 1 and thus |CoH¯ (8n , 8m )| ≤ 2, which agrees with bound (4.9). Now assume that σ (n) ∩ σ (m) = ∅. Observe that Coµ (8n 8m ) = 1/(γn γm )Coµ (9n , 9m ). To estimate Coµ (9n , 9m ), we use formula (5.6) and consider each summand in the r.h.s. separately. 1. Consider the sum (σ (n+m)) Z
X
9n (Q)9m (Q)p0 (Q)dµ(Q)ϕσ (n+m) .
0: 012 6=∅
As before, bounds for exponentials, together with the Schwarz inequality, imply that Z Z 1/2 9n (Q)9m (Q)p0 (Q)dµ(Q) ≤ ||9n ||||9m || (p0 (Q))2 dµ(Q) which is ≤ 4c7
|0|
. Note that as 012 6 = ∅, the cardinality |0| ≥ ρ(n, m), and the a.v. |0|/2 |σ (n)|+|σ (m)| |[0]| P(σ (n)∪σ (m)) P(σ (n+m)) 2 2 4c7 of the above sum 0: 012 6=∅ becomes ≤ c8 0: 012 6=∅ ρ(n,m)/2 ρ(n,m)/2 |σ (n)|+|σ (m)| ≤ c8 c7 . 4c7 c9 2. For the sum (σ (n)∪σ (m)) Z
X
0: 012 =∅
9n (Q)9m (Q)p0 (Q)dµ(Q) ϕσ (n+m)∪[01 ]∪[02 ] − ϕσ (n)∪[01 ] ϕσ (m)∪[02 ]
one can use a similar approach. The second bound in (5.2) yields that ϕσ (n+m)∪[01 ]∪[02 ] − ϕσ (n)∪[01 ] ϕσ (m)∪[02 ] ρ(σ (n)∪[01 ],σ (m)∪[02 ]) . ≤ c10 3|σ (n)|+|σ (m)|+|[01 ]|+|[02 ]| c10 Here, as before, ρ(σ (n) ∪ [01 ], σ (m) ∪ [02 ]) stands for the distance between the sets σ (n) ∪ [01 ] and σ (m) ∪ [02 ]). Clearly, |01 | + |02 | + ρ(σ (n) ∪ [01 ], σ (m) ∪ [02 ]) ≥ ρ(n, m). Repeating the argument of the previous paragraph one gets that the a.v. of the ρ(n,m)/2 |σ (n)|+|σ (m)| P(σ (n)∪σ (m)) . c13 above sum 0: 012 =∅ is ≤ c11 c12 3. For the third sum emerging from the r.h.s. of (5.6) one can obtain a similar bound. Here, one observes that for each pair (01 , 02 ) entering this sum, |01 | + |02 | ≥ ρ(n, m). The bounds thereby obtained, together with the fact that γn ≥ 1, complete the proof of Lemma 4.3. u t
6. A WKBJ-Type Analysis

In this section we prove Lemmas 3.3 and 5.2. We will need some information about the spectrum and eigenfunctions of the Schrödinger operator $h$ acting in $L_2(\mathbb{R}, dq)$:
$$h = -\frac{d^2}{dq^2} + V(q), \quad q \in \mathbb{R}, \quad\text{where } V(q) = s^2q^{4s-2} - s(2s-1)q^{2s-2}. \tag{6.1}$$
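A direct differentiation shows that the function $\exp(-q^{2s}/2)$ is annihilated by $h$ in (6.1), since its second derivative equals $V(q)\exp(-q^{2s}/2)$. The sketch below checks this numerically with a central finite difference. It is only a sanity check under stated assumptions: $s$ is taken to be a positive integer, and the helper names (`potential`, `ground_state`, `apply_h`) and the step size are choices made for this example.

```python
import math


def potential(q: float, s: int) -> float:
    """V(q) = s^2 q^(4s-2) - s(2s-1) q^(2s-2) from (6.1)."""
    return s * s * q ** (4 * s - 2) - s * (2 * s - 1) * q ** (2 * s - 2)


def ground_state(q: float, s: int) -> float:
    """Candidate zero mode exp(-q^(2s)/2), left unnormalised."""
    return math.exp(-(q ** (2 * s)) / 2.0)


def apply_h(f, q: float, s: int, step: float = 1e-4) -> float:
    """(h f)(q), with the second derivative replaced by a central difference."""
    second = (f(q + step) - 2.0 * f(q) + f(q - step)) / (step * step)
    return -second + potential(q, s) * f(q)


if __name__ == "__main__":
    s = 2
    for q in (-1.2, -0.3, 0.0, 0.7, 1.5):
        # Each residual should vanish up to discretisation and rounding error.
        print(q, apply_h(lambda t: ground_state(t, s), q, s))
```

For $s = 1$ the potential reduces to $q^2 - 1$ and the same check recovers the familiar harmonic-oscillator ground state.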
It is not hard to check that the unitary map w : L2 (R, dµ0 (q)) → L2 (R, dq), (wf )(q) = Z −1/2 exp −q 2s /2 f (q), q ∈ R, yields h = wkw−1 , where k was defined in (2.6). So, h has the same eigenvalues as k, en (see (5.8)). but “modified” (normalised) eigenfunctions ψ 6.1. Asymptotical WKBJ-formulas. We follow the line of exposition adopted in [MVZ]. en is even for n even and odd for n odd. Thus, it suffices to analyse the Like ψn , each ψ behaviour of these functions for q ≥ 0. Note that there ∃n0 such that for all n ≥ n0 the equation V (q) = κn has a unique positive root, qn0 ; it is not hard to check that qn0 = 1/(4s−2) 1/2 +1/(2s−1) + 1/O κn κn /s 2 . Divide the half-axis R+ = (0, ∞) into four intervals: I1 = (qn0 (1 + δ), ∞), I2 = (qn0 , qn0 (1 + δ)), I3 = (qn0 (1 − δ), qn0 ), I4 = (0, qn0 (1 − δ)), where δ ∈ (0, 1) is a fixed number, to be specified below. The asymptotic of the noren on these intervals, as n → ∞, is as follows: malised eigenfunction ψ R √ √ √ en (q) = (Dn /2 π 4 V (q) − κn ) exp − q0 V (t) − κn dt 1+O 1/κna , 20 on I2 : ψ qn R √ R √ en (q) = Dn Υ 3 q0 V (t) − κn dt/2 2/3 3 q0 V (t) − κn dt/2 1/6 20 on I2 : ψ qn qn −1 √ 2a/3 × 4 V (q) − κn 1 + O 1/κn ,
√ Rq √ −1 2/3 en (q) 30 on I3 : ψ = Dn 4 κn − V (q) Υ − 3 q 0 κn − V (q)dt/2 n Rq √ 1/6 4a/3 + O 1/κn , × 3 q 0 κn − V (q)dt/2 n R qn0 √ √ √ 4 en (q) = Dn / π sin κ − V (t)dt + π/4 / κ − V (q) + 40 on I4 : ψ n n q O 1/κna , Here, a = 1/2 + 1/(4s − 2), Dn is a normalising constant which will be Rassessed later, and Υ is the (real) Airy function defined, on R, by the integral Υ (z) = R exp izt + t 3 /3 dt. Note that functions Υ (z) and |z|1/4 Υ (z) are bounded on R: maxz∈R |Υ (z)|, |z|1/4 |Υ (z)| < ∞. Observe that all remainder terms O( · ) in the above formulas are uniform with respect to q.
6.2. A bound for Dn . Plainly, 1 = 2 r.h.s. of the last inequality equals 2Dn2 /π
Z
qn0 (1−δ) h
Z sin
0
+ O 1/κna
Z
qn0 (1−δ)
sin
0
qn0
q
Z q
R∞ 0
R 0 en (q) 2 dq > 2 qn (1−δ) ψ en (q) 2 dq. The ψ 0
p π i2 .p κn − V (t)dt + κn − V (q) dq 4
qn0
p κn − V (t)dt
(6.2)
.p + π/4 4 κn − V (q) dq + O 1/κn2a qn0 . With the help of the change of variables p = Z
p π i2 .p κn − V (t)dt + κn − V (q) dq 4 0 c0 (p1 − p2 ) , ≥ κn + V0
qn0 (1−δ) h 0
where p1 =
Z
R qn0 √ κn − V (t)dt + π/4 we find that q
sin
qn0
(6.3)
R qn0 √ R q0 √ κn − V (t)dt + π/4, p2 = q 0n(1−δ) κn − V (t)dt + π/4, and 0 n
q ). V0 = − min V (e
(6.4)
e q
Therefore, by virtue of the asymptotic of qn0 (see above), the difference p1 − p2 is ≥ (1 − δ)qn0
min
0
p 1/2+1/(4s−2) κn − V (t) ≥ c1 κn .
(6.5)
−(a+1/4−1/(4s−2)) whereas the third of The second summand in (6.2) is of order O κn −(2a−1/(4s−2)) order O κn . Thus, we find from (6.3) and (6.5) that −1/2+1/(4s−2)
c2 Dn2 κn
1/4−1/(8s−4)
< 1, whence |Dn | ≤ c3 κn
.
(6.6)
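6.3. Proof of Lemma 5.2. Following asymptotical formulas 1⁰–4⁰ and bound (6.6), function $\tilde\psi_n$ obeys
$$|\tilde\psi_n(q)| \le c_4\,\kappa_n^{1/4-1/(8s-4)}\big/\sqrt[4]{\kappa_n - V(q)}, \quad 0 < q < q_n^0. \tag{6.7}$$
Furthermore, the above equations and obvious inequalities
$$V(q) - \kappa_n \ge (q - q_n^0)\,V'(q_n^0) \ge c_5\,(q - q_n^0)\,\kappa_n^{(4s-3)/(4s-2)}, \quad q_n^0 < q < \infty,$$
imply that $|\tilde\psi_n(q)|$, for $q_n^0 < q < \infty$, is
$$\le c_5\Big(\kappa_n^{1/4-1/(8s-4)}\big/\sqrt[4]{V(q) - \kappa_n}\Big)\exp\Big[-c_6\,(q - q_n^0)^{3/2}\,\kappa_n^{(4s-3)/(4s-2)}\Big]. \tag{6.8}$$
Bounds (6.7), (6.8) imply in turn that $\int q^{4d}\,d\mu_n(q)$ does not exceed
$$c_7\,\kappa_n^{1/2-2/(8s-4)}\Bigg[\int_0^{q_n^0}\frac{q^{4d}\,dq}{\sqrt{\kappa_n - V(q)}} + \int_{q_n^0}^\infty\frac{q^{4d}\exp\big[-c_6\,(q - q_n^0)^{3/2}\,\kappa_n^{(4s-3)/(4s-2)}\big]}{\sqrt{V(q) - \kappa_n}}\,dq\Bigg].$$
It is not hard to check that here each integral does not exceed $c_8\,\kappa_n^{(4d+1)/(4s-2)-1/2}$. This gives the inequality $\int q^{4d}\,d\mu_n(q) \le c_9\,\kappa_n^{4d/(4s-2)}$ and the proof of Lemma 5.2. □

6.4. Proof of Lemma 3.3 Part One. According to (3.4), $r_{n,m}$, $p_{n,m}$ and $b_{n,m}$ are the m.e.'s, in basis $\{\phi_n\}$, of the l.o.'s $\mathbf{r} = \mathbf{q}\mathbf{p}$, $\mathbf{p}$ and $\mathbf{q}$, respectively, where $\mathbf{p} = d/dq$ and $\mathbf{q}$ is the multiplication by $q$. We write
$$r_{n,m} = \frac{\gamma_m}{\gamma_n}\tilde r_{n,m}, \quad p_{n,m} = \frac{\gamma_m}{\gamma_n}\tilde p_{n,m}, \quad b_{n,m} = \frac{\gamma_m}{\gamma_n}\tilde b_{n,m}, \tag{6.9}$$
and pass to the m.e.'s $\tilde r_{n,m}$, $\tilde p_{n,m}$ and $\tilde b_{n,m}$ of these operators in orthonormal basis $\{\psi_n\}$ in H.s. $L_2(\mathbb{R}, d\mu_0(q))$, where (see (5.8)) $d\mu_0(q) = \tilde\psi_0^2(q)\,dq$. So,
$$\tilde r_{n,m} = \langle\mathbf{r}\psi_n, \psi_m\rangle_{\mu_0}, \quad \tilde p_{n,m} = \langle\mathbf{p}\psi_n, \psi_m\rangle_{\mu_0}, \quad \tilde b_{n,m} = \langle\mathbf{q}\psi_n, \psi_m\rangle_{\mu_0}. \tag{6.10}$$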
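Here, and below, $\langle\ ,\ \rangle_{\mu_0}$ denotes the scalar product and $\|\ \|_{\mu_0}$ the norm in $L_2(\mathbb{R}, d\mu_0(q))$. Observe that $\tilde p_{n,m}$ and $\tilde b_{n,m}$ are nonzero only if $n$ and $m$ have different parities. Thus it suffices to consider the off-diagonal m.e.'s only.

We begin with $\tilde b_{n,m}$, $n \ne m$. For the rest of this section, $\langle\ ,\ \rangle$ denotes the scalar product and $\|\ \|$ the norm in H.s. $L_2(\mathbb{R}, dq)$. Write
$$\tilde b_{n,m} = \langle\mathbf{q}\psi_n, \psi_m\rangle_{\mu_0} = \langle\mathbf{q}\tilde\psi_n, \tilde\psi_m\rangle = (1/\kappa_m)\langle\mathbf{q}\tilde\psi_n, h\tilde\psi_m\rangle = (1/\kappa_m)\langle h\mathbf{q}\tilde\psi_n, \tilde\psi_m\rangle = (\kappa_n/\kappa_m)\langle\mathbf{q}\tilde\psi_n, \tilde\psi_m\rangle + (1/\kappa_m)\langle[h, \mathbf{q}]\tilde\psi_n, \tilde\psi_m\rangle,$$
whence $\langle\mathbf{q}\tilde\psi_n, \tilde\psi_m\rangle = (1/(\kappa_m - \kappa_n))\langle[h, \mathbf{q}]\tilde\psi_n, \tilde\psi_m\rangle$. As $[h, \mathbf{q}] = -2\mathbf{p}$, we obtain that $|\langle\mathbf{q}\tilde\psi_n, \tilde\psi_m\rangle| \le (2/(\kappa_m - \kappa_n))\|\mathbf{p}\tilde\psi_n\|$. Furthermore, $\|\mathbf{p}\tilde\psi_n\|^2 = -\langle\mathbf{p}^2\tilde\psi_n, \tilde\psi_n\rangle$ which does not exceed
$$\int\Big[-\tilde\psi_n(q)\frac{d^2}{dq^2}\tilde\psi_n(q) + V(q)|\tilde\psi_n(q)|^2 + V_0|\tilde\psi_n(q)|^2\Big]dq = \kappa_n + V_0, \tag{6.11}$$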
which leads to bound (3.7). Now pass to p en,m , n 6 = m. Proceeding in the same fashion as before, we find that hpψn , ψm iµo = (1/(κm − κn ))h[k, p]ψn , ψm iµ0 . As [k, p] = −2s(2s − 1)q2s−2 p, we obtain that |hpψn , ψm iµ0 | ≤ (c10 /(κm − κn ))||qs1 pψn ||µ0 ||qs2 ψm ||µ0 for any two integers s1 , s2 ≥ 0 such that s1 + s2 = 2s − 2. Later on, we shall choose particular values for these numbers. em , ψ em i and, To assess ||qs2 ψm ||µ0 , we write ||qs2 ψm ||2µ0 = hq2s2 ψm , ψm iµ0 = hq2s2 ψ assuming that 2s2 ≤ 4s − 2, obtain that em i| ≤ hq4s−2 ψ em i 2s2 /(4s−2) , em , ψ em , ψ |hq2s2 ψ which is ≤ c11
Z
−2
(2s2 )/(4s−2) d2 em (q) + 2V (q)ψ em (q) ψ em (q) + V0 ψ em (q)dq ψ dq 2
= c11 (2κm + V0 )2s2 /(4s−2) . (6.12) To assess ||qs1 pψn ||µ0 , observe that qs1 p = pqs1 − s1 qs1 −1 p Therefore, ||qs1 pψn ||2µ0 = ||pqs1 ψn ||2µ0 − 2s1 hpqs1 ψn , qs1 −1 ψn iµ0 + s12 ||qs1 −1 ψn ||2µ0 . (6.13) Under the assumption that 2s1 − 2 ≤ 4s − 2, we estimate individually each summand in the last expression. We start with the third term: en i ≤ c12 (2κn + V0 )(2s1 −2)/(4s−2) . en , ψ s12 ||qs1 −1 ψn ||2µ0 = s12 hq2s2 −2 ψ
(6.14)
Now write the scalar product hpqs1 ψn , pqs1 ψn iµ0 as en , qs1 ψ en i. hp∗ pqs1 ψn , qs1 ψn iµ0 = hkqs1 ψn , qs1 ψn iµ0 = hhqs1 ψ en , qs1 ψ en i equals As [h, qs1 ] = −s1 (s1 − 1)qs1 −2 − 2s1 qs1 −1 p, we find that hhqs1 ψ en , qs1 ψ en , qs1 ψ en i − s1 (s1 − 1)hqs1 −2 ψ en , qs1 ψ en i − 2s1 hqs1 −1 pψ en i. κn hqs1 ψ Furthermore, under the condition that 2s1 ≤ 4s − 2, we have: en , qs1 ψ en i| ≤ c13 κn (2κn + V0 )(2s1 )/(4s−2) , κn |hqs1 ψ en , qs1 ψ en i| ≤ c13 (2κn + V0 )(2s1 )/(4s−2) , s1 (s1 − 1)|hqs1 −2 ψ and en i| ≤ c61 ||pψ en , ψ en , qs1 ψ en ||hq4s1 −2 ψ en i 2s1 |hqs1 −1 pψ 1/2 . ≤ c13 (κn + V0 )1/2 (2κn + V0 )(4s1 −2)/(4s−2) This yields that the first term ||pqs1 ψn ||2µ0 in the r.h.s. of (6.13) does not exceed c14 (κn + V0 )1+(2s1 )/(4s−2) + (κn + V0 )(2s1 −2)/(4s−2) + (κn + V0 )1/2+(2s1 −2)/(4s−2) . (6.15)
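To estimate the second summand $2s_1\langle\mathbf{p}\mathbf{q}^{s_1}\psi_n, \mathbf{q}^{s_1-1}\psi_n\rangle_{\mu_0}$ in the r.h.s. of (6.13), write
$$\langle\mathbf{p}\mathbf{q}^{s_1}\psi_n, \mathbf{q}^{s_1-1}\psi_n\rangle_{\mu_0} = s_1\langle\mathbf{q}^{s_1-1}\psi_n, \mathbf{q}^{s_1-1}\psi_n\rangle_{\mu_0} + \langle\mathbf{p}\psi_n, \mathbf{q}^{2s_1-1}\psi_n\rangle_{\mu_0}. \tag{6.16}$$
The a.v. of the first term in the r.h.s. of (6.16) is $\le c_{15}(\kappa_n + V_0)^{(2s_1-2)/(4s-2)}$, whereas that of the second is $\le c_{15}\,\kappa_n^{1/2}(\kappa_n + V_0)^{(2s_1-1)/(4s-2)}$. With $s_1 = s$, we obtain:
$$|\langle\mathbf{p}\mathbf{q}^{s_1}\psi_n, \mathbf{q}^{s_1-1}\psi_n\rangle_{\mu_0}| \le c_{16}(\kappa_n + V_0)^{3/4+1/(2(4s-2))}. \tag{6.17}$$
Together, bounds (6.17), (6.15), (6.14) and (6.12) imply (3.6).

Finally, consider $\tilde r_{n,m}$. In the case $n \ne m$ we have $\langle\mathbf{r}\psi_n, \psi_m\rangle_{\mu_0} = (\kappa_m - \kappa_n)^{-1}\langle[k, \mathbf{r}]\psi_n, \psi_m\rangle_{\mu_0}$. Furthermore, $[k, \mathbf{r}] = 2k - 4s^2\mathbf{q}^{2s-1}\mathbf{p}$. Repeating the above argument, we find that
$$|\langle[k, \mathbf{r}]\psi_n, \psi_m\rangle_{\mu_0}| \le 4|\langle\mathbf{q}^{2s-1}\mathbf{p}\psi_n, \psi_m\rangle_{\mu_0}| \le 4\|\mathbf{q}^s\mathbf{p}\psi_n\|_{\mu_0}\|\mathbf{q}^{s-1}\psi_m\|_{\mu_0} \le c_{17}(\kappa_n + V_0)^{3/4+1/(2(4s-2))}(\kappa_m + V_0)^{(s-1)/(4s-2)},$$
i.e.,
$$|\langle\mathbf{r}\psi_n, \psi_m\rangle_{\mu_0}| \le c_{17}|\kappa_m - \kappa_n|^{-1}(\kappa_n + V_0)^{3/4+1/(2(4s-2))}(\kappa_m + V_0)^{(s-1)/(4s-2)}. \tag{6.18a}$$
In the case $n = m$,
$$|\langle\mathbf{r}\psi_n, \psi_n\rangle_{\mu_0}| \le \|\mathbf{p}\psi_n\|\,\|\mathbf{q}\psi_n\| \le c_{65}(\kappa_n + V_0)^{1/2+1/(4s-2)}. \tag{6.18b}$$
Inequalities (6.18a,b) imply bound (3.5).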
6.5. Proof of Lemma 3.3 Part Two. This part of the proof focusses on bounds (3.8). We will need some general facts about the distribution of the eigenvalues of operator $h$ (see (6.1)). These facts may be found, e.g., in the classical text [24] (see also [8] and [11]). Note that [24] considers the Schrödinger operator with a convex potential $V$ (with $V'' \ge 0$). In our case, potential $V$ differs from a convex function by a bounded function with a compact support. A straightforward analysis shows that the arguments used in [24] lead to bounds similar to the ones reproduced in this book.

6.5.1. $\exists\ c_1, c_2 > 0$ such that, $\forall n \in \mathbb{Z}_+$,
$$c_1(n - 3/2)^{(4s-2)/2s} - c_2 \le \kappa_n \le c_1(n + 3/2)^{(4s-2)/2s} + c_2.$$

6.5.2. $\exists\ c_3 > 0$ such that, $\forall n, k \in \mathbb{Z}_+$, $\kappa_{n+k} - \kappa_n \ge c_3\,k\,\kappa_n^{(s-1)/(2s-1)}$.

6.5.3. Let $N(t)$, $t \ge 0$, denote the number of eigenvalues $\kappa_n$ of $h$ with $\kappa_n \le t$. $\exists\ c_4, c_5 > 0$ such that $N(t) \le c_4t^{2s/(4s-2)} + c_5$.

Observe that bounds 6.5.1 imply that $\exists\ c_6, c_7 \in (0, 1)$, $c_6 < c_7$, such that for any $r \in (0, 1)$ there exists $n_0 = n_0(r)$ such that $\forall n > n_0$
$$c_6 \le \kappa_{[rn]}/\kappa_n \le c_7. \tag{6.19}$$
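The mechanism behind (6.19) can be seen on a model sequence. The sketch below does not compute the eigenvalues of $h$; it assumes a hypothetical spectrum `model_kappa` that grows exactly at the rate appearing in 6.5.1 and shows that the ratio of the entries with indices $[rn]$ and $n$ stays strictly between 0 and 1 and approaches $r^{(4s-2)/(2s)}$.

```python
def model_kappa(n: int, s: int) -> float:
    """Hypothetical model spectrum with the growth rate of 6.5.1: (n + 1)^((4s-2)/(2s))."""
    return (n + 1.0) ** ((4 * s - 2) / (2 * s))


def ratio(n: int, r: float, s: int) -> float:
    """kappa_[rn] / kappa_n for the model spectrum, with [.] the integer part."""
    return model_kappa(int(r * n), s) / model_kappa(n, s)


if __name__ == "__main__":
    s, r = 2, 0.5
    for n in (10, 100, 1000, 10000):
        print(n, ratio(n, r, s), r ** ((4 * s - 2) / (2 * s)))
```

For the true eigenvalues only the two-sided bounds of 6.5.1 are available, so the ratio need not converge, but it remains trapped between two constants in $(0, 1)$ for large $n$, as stated in (6.19).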
P To estimate the sum m |rn,m |, we note that by virtue of (3.5) it does not exceed c8 (κn + V0 )3/4+(4d−1)/(2(4s−2)) X 1 (κm + V0 )(2d+s−1)/(4s−2) + (κn + V0 )1/2+1/(4s−2) . × |κn − κm | m: m6=n
We partition the last series into four sums: X X X = + m: m6=n
0≤m≤[n/2]
X
+
[n/2]+1≤m≤n−1
X
+
n+1≤m6 =[3n/2]
(6.20)
m≥[3n/2]
and assess each ofP them individually. The first sum, 0≤m≤[n/2] , equals Z κ[n/2] + 1 (t + V0 )(2d+s−1)/(4s−2) dN (t) κn − t 0 2d+s−1 2d+s−1 !0 Z κ[n/2] (κ[n/2] + V0 ) 4s−2 (t + V0 ) 4s−2 − N (t) dt. = N κ[n/2] κn − κ[n/2] κn − t 0
(6.21)
In view of 6.5.1, 6.5.2 and (6.19), the first term in the r.h.s. of (6.21) is less than or equal to 2d+3s−1
c9
(κ[n/2] + V0 ) 4s−2 (κ[n/2] + V0 ) 2s−2 ≤ c9 n κ[n/2] + V0 4s−2 n
2d+s+1 4s−2
≤ c9 (κn + V0 )
2d−s+1 4s−2
The integral in the r.h.s. of (6.21), again by (6.19), is Z κ[n/2] 2d+3s−1 1 (t + V0 ) 4s−2 −1 ≤ c10 2 (κn − t) 0 2d + s − 1 3s − 2d − 1 (κn + V0 ) + | |(t + V0 ) dt. × 4s − 2 4s − 2
.
(6.22)
(6.23)
Performing the change of variables t + V0 = (κn + V0 )ξ and using (6.19), integral (6.23) is made Z c11 2d+3s−1 1 ξ (2d+3s−1)/(4s−2)−1 ≤ (κn + V0 ) 4s−2 −1 (1 − ξ )2 0 (6.24) 3s − 2d − 1 2d + s − 1 +| |ξ dξ = c12 (κn + V0 )(2d−s+1)/(4s−2) . × 4s − 2 4s − 2 Therefore, the first sum in the r.h.s. of (6.20) does not exceed
The second sum,
P
c13 (κn + V0 )(2d−s+1)/(4s−2) . [n/2]+1≤m≤n−1 ,
c14
X 1≤k≤[n/2]
in (6.20) does not exceed
(κn−k + V0 )(2d+s−1)/(4s−2) k(κm + V0 )(2s−2)/(4s−2) (2d+s−1)/(4s−2)
≤ c15 (κn + V0 )
(6.25)
ln (κn + V0 );
(6.26)
in the last inequality we used 6.5.1 and the fact that, for 0 ≤ k ≤ [n/2], 0 < c16 ≤ (κn−k + V0 ) (κn + V0 )−1 < 1. The third sum in the r.h.s. of (6.20) is assessed in a similar fashion and again does not exceed c17 (κn + V0 )(2d−s+1)/(4s−2) ln (κn + V0 ).
(6.27)
Finally, the fourth sum is estimated by means of an argument used for assessing the first sum. However, the difference with (6.24) is that now we deal with an integral Z ∞ 3s − 2d − 1 1 (2d+3s−1)/(4s−2)−1 2d + s − 1 +| |ξ dξ (6.28) ξ 2 4s − 2 4s − 2 c18 (1 − ξ ) which converges when 2d +P1 < s (see (1.4)). The ultimate bound is then identical to (6.25). We finally have that m |rn,m | ≤ c19 (κn + V0 )1/2+1/(4s−2) ln (κn + V0 ). X X |pn,m | and |bn,m | are assessed in a similar way. This completes the The sums m
proof of Lemma 3.3. u t
m
Acknowledgements. RAM acknowledges the financial support of RFFI (grants 96-01-00064 and 97-0100714). YMS acknowledges the support of EC Grant “Training Mobility and Research” (Contracts CHRX–CT 930411 and ERBMRXT–CT 960075A) and INTAS Grant “Mathematical Methods for Stochastic Discrete Event Systems” (INTAS 93–820). RAM thanks St John’s College, Cambridge, UK, for hospitality during Easter Term 1998. YMS thanks I.H.E.S., Bures-sur-Yvette, France, for hospitality during his visits in Spring and Autumn, 1998, and DIAS and Professor J. Lewis for hospitality during his visit in Autumn, 1998. The authors thank S. Shea-Simonds for checking the style of the paper.
References
[AR] Albeverio, S., Röckner, M.: Stochastic differential equations in infinite dimensions: Solution via Dirichlet forms. Prob. Theor. Rel. Fields 89, 347–385 (1991)
[AKR 1] Albeverio, S., Kondratiev, Yu.G., Röckner, M.: Ergodicity of L2-semigroups and extremality of Gibbs states. J. Funct. Anal. 144, 394–423 (1997)
[AKR 2] Albeverio, S., Kondratiev, Yu.G., Röckner, M.: Ergodicity for stochastic dynamics of quasi-invariant measures with applications to Gibbs states. J. Funct. Anal. 149, 415–469 (1997)
[BH] Bellissard, J., Hoegh-Krohn, R.: Compactness and maximal Gibbs states for Gibbs random fields on a lattice. Commun. Math. Phys. 84, 297–327 (1982)
[COPP] Cassandro, M., Olivieri, E., Pellegrinotti, A., Presutti, E.: Existence and uniqueness of DLR measures for unbounded spin systems. Z. Wahrsch. Verw. Gebiete 41, 313–334 (1978)
[DFS] Dobrushin, R.L., Fritz, J., Suhov, Yu.M.: A.N. Kolmogorov, the founder of the theory of reversible Markov processes [Russian]. Uspekhi Matem. Nauk 43 No. 6, 167–188 (1988)
[DR] Doss, H., Royer, G.: Processus de diffusion associé aux mesures de Gibbs sur $\mathbb{R}^{\mathbb{Z}^d}$. Z. Wahrsch. Verw. Gebiete 46, 107–124 (1978)
[F] Fedoryuk, M.V.: Asymptotic Analysis. Linear Ordinary Differential Equations. Berlin: Springer-Verlag, 1993
[Fr] Fritz, J.: Infinite lattice systems of interacting diffusion processes. Z. Wahrsch. Verw. Gebiete 59, 291–309 (1982)
[KM] Kondratiev, Yu.G., Minlos, R.A.: One-particle subspaces in the stochastic XY model. J. Stat. Phys. 87, no. 3/4, 613–642 (1997)
[LS] Levitan, B.M., Sargsijan, I.S.: Introduction to Spectral Theory: Selfadjoint Ordinary Differential Operators. Providence, R.I.: AMS, 1975
[L] Liggett, T.M.: Stochastic models of interacting systems. Ann. Prob. 25, 1–29 (1997)
[MM 1] Malyshev, V.A., Minlos, R.A.: Gibbs Random Fields. Cluster Expansions. Dordrecht: Kluwer Academic Publishers, 1991
[MM 2] Malyshev, V.A., Minlos, R.A.: Linear Infinite-Particle Operators. Translations of Mathematical Monographs 143. Providence, R.I.: American Mathematical Society, 1995
[M 1] Minlos, R.A.: Spectral expansion of the transfer matrices of Gibbs fields. In: Mathematical Physics Reviews. Vol. 7. Soviet. Sci. Rev. Sect. C: Math. Phys. Rev. Chur: Harwood Academic Publ. 1988, pp. 235–280
[M 2] Minlos, R.A.: Invariant subspaces of the stochastic Ising high temperature dynamics. Markov Proc. Rel. Fields 2, 263–284 (1996)
[M 3] Minlos, R.A.: Spectra of the stochastic operators of some Markov processes, and their asymptotic behavior. St Petersburg Math. J. 8, 291–301 (1996)
[MS] Minlos, R.A., Sinai, Ya.G.: Investigation of the spectra of stochastic operators that arise in lattice gas models [Russian]. Teoret. Mat. Fizika 2, 230–243 (1970)
[MT] Minlos, R.A., Trishch, A.G.: Complete spectral resolution of the generator of Glauber dynamics for the one-dimensional Ising model [Russian]. Uspekhi Matem. Nauk 49 No. 6, 209–210 (1994)
[MVZ] Minlos, R.A., Verbeure, A., Zagrebnov, V.A.: A quantum crystal model in the light mass limit: The Gibbs state. To appear in Rev. Math. Phys. 1999
[MZ] Minlos, R.A., Zhizhina, E.A.: Asymptotics of decay of correlations for lattice spin fields at high temperatures. I. J. Stat. Phys. 84 no. 1/2, 85–118 (1996)
[R] Ramirez, A.F.: Relative entropy and mixing properties of infinite-dimensional diffusions. Probab. Th. Rel. Fields 110, 369–395 (1998)
[Ro] Royer, G.: Processus de diffusion associé à certain modèles d’Ising à spin continue. Z. Wahrsch. Verw. Gebiete 46, 165–176 (1978)
[T] Titchmarsh, E.C.: Eigenfunction Expansions Associated With Second-Order Differential Equations. Oxford: Clarendon Press, 1946
[Y1] Yoshida, N.: The log-Sobolev inequality for weakly coupled lattice fields. Preprint, Division of Mathematics, School of Science, Kyoto University, 1997. To appear in Prob. Theory Rel. Fields
[Y2] Yoshida, N.: The equivalence of the log-Sobolev and a mixing condition for unbounded spin systems on the lattice. Preprint, Division of Mathematics, School of Science, Kyoto University, 1998
[Y3] Yoshida, N.: The log-Sobolev inequality for weakly coupled lattice fields. Preprint, Division of Mathematics, School of Science, Kyoto University, 1998
[Z] Zegarlinski, B.: The strong decay to equilibrium for the stochastic dynamics of unbounded spin systems. Commun. Math. Phys. 175, 401–432 (1996)
[Zh] Zhizhina, E.A.: An asymptotic formula for the decay of correlations in a stochastic model of planar rotators at high temperatures. Theoret. and Math. Phys. 112, 857–865 (1997)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 491 (1999)
Erratum
The Number-Theoretical Spin Chain and the Riemann Zeroes
Andreas Knauf
Mathematisches Institut, Universität Erlangen-Nürnberg, Bismarckstr. 1 1/2, D–91054 Erlangen, Germany. E-mail: [email protected]
Received: 18 June 1999 / Accepted: 18 June 1999
Commun. Math. Phys. 196, 703–731 (1998)
In Definition 12 of [2] I introduced three-regular finite graphs $G_d = (V, E)$, $d$ prime, whose vertex sets $V = V_+ \cup V_-$ consist of the orbits of
$$M_+ := \begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{resp.} \quad M_- := \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$$
acting on SL(2, Z/dZ). A pair $\{v_+, v_-\}$, $v_\pm \in V$, of vertices belongs to the set $E$ of edges iff $v_+ \in V_+$, $v_- \in V_-$ and the orbits $v_+$ and $v_-$ contain a common group element $g \in$ SL(2, Z/dZ). I showed in Proposition 15 that for a common $\varepsilon > 0$ the adjacency matrices of these graphs have a spectral radius smaller than $3 - \varepsilon$, omitting the eigenvalues $\pm 3$. On page 725 I conjectured that these graphs are bipartite Ramanujan, meaning that their non-trivial spectral radius is $\leq \sqrt{8}$. However, it has been shown recently by Stephan Heiss (following a suggestion of Alain Valette, at the Université de Neuchâtel) that this conjecture is wrong, d = 29 being the first prime leading to a violation of the Ramanujan estimate. Similarly, a Ramanujan estimate does not hold for the operators T¯dd . Here the first counterexample is d = 433. I would like to thank them for pointing out my erroneous conjecture, and also Peter Sarnak who independently advised me to check it.
References
1. Personal Communication. Homepage of Alain Valette: http://www.unine.ch/math/
2. Knauf, A.: The Number-Theoretical Spin Chain and the Riemann Zeroes. Commun. Math. Phys. 196, 703–731 (1998)
Communicated by P. Sarnak
Commun. Math. Phys. 206, 493 – 531 (1999)
Singular Dimensions of the N = 2 Superconformal Algebras. I
Matthias Dörrzapf1, Beatriz Gato-Rivera2,3
1 Lyman Laboratory of Physics, Harvard University, Cambridge, MA 02138, USA.
E-mail: [email protected]
2 Instituto de Matemáticas y Física Fundamental, CSIC, Serrano 123, Madrid 28006, Spain.
E-mail: [email protected]
3 NIKHEF-H, Kruislaan 409, 1098 SJ Amsterdam, The Netherlands
Received: 19 August 1998 / Accepted: 15 March 1999
Abstract: Verma modules of superconformal algebras can have singular vector spaces with dimensions greater than 1. Following a method developed for the Virasoro algebra by Kent, we introduce the concept of adapted orderings on superconformal algebras. We prove several general results on the ordering kernels associated to the adapted orderings and show that the size of an ordering kernel implies an upper limit for the dimension of a singular vector space. We apply this method to the topological N = 2 algebra and obtain the maximal dimensions of the singular vector spaces in the topological Verma modules: 0, 1, 2 or 3 depending on the type of Verma module and the type of singular vector. As a consequence we prove the conjecture of Gato-Rivera and Rosado on the possible existing types of topological singular vectors (4 in chiral Verma modules and 29 in complete Verma modules). Interestingly, we have found two-dimensional spaces of singular vectors at level 1. Finally, by using the topological twists and the spectral flows, we also obtain the maximal dimensions of the singular vector spaces for the Neveu–Schwarz N = 2 algebra (0, 1 or 2) and for the Ramond N = 2 algebra (0, 1, 2 or 3).
1. Introduction
More than two decades ago, superconformal algebras were first constructed independently and almost at the same time by Kac [21] and by Ademollo et al. [1]. Whilst Kac [21] derived them for mathematical purposes along with his classification of Lie super algebras, Ademollo et al. [1] constructed the superconformal algebras for physical purposes in order to define supersymmetric strings. Since then the study of superconformal algebras has made much progress in both mathematics and physics. On the mathematical side Kac and van de Leur [24] and Cheng and Kac [6] have classified all possible superconformal algebras and Kac recently has proved that their classification is complete (see footnote in Ref. [23]).
As far as the physics side is concerned, superconformal models are gaining increasing importance. Many areas of physics make use of superconformal
symmetries but the importance is above all due to the fact that superconformal algebras supply the underlying symmetries of Superstring Theory. The classification of the irreducible highest weight representations of the superconformal algebras is of interest to both mathematicians and physicists. After more than two decades, only the simpler superconformal highest weight representations have been fully understood. Namely, only the representations of N = 1 are completely classified and proven [2,3]. For N = 2 remarkable efforts have been taken by several research groups [5,14,29,8,10,20]. Already the N = 2 superconformal algebras contain several surprising features regarding their representation theory, most of them related to the rank 3 of the algebras, making them more difficult to study than the N=1 superconformal algebras. The rank of the superconformal algebras keeps growing with N and therefore even more difficulties can be expected for higher N. The standard procedure of finding all possible irreducible highest weight representations starts off with defining freely generated modules over a highest weight vector, denoted as Verma modules. A Verma module is in general not irreducible, but the corresponding irreducible representation is obtained as the quotient space of the Verma module divided by all its proper submodules. Therefore, the task of finding irreducible highest weight representations can be reduced to the classification of all submodules of a Verma module. Obviously, every proper submodule needs to have at least one highest weight vector different from the highest weight vector of the Verma module. These vectors are usually called singular vectors of the Verma module. Conversely, a module generated on such a singular vector defines a submodule of the Verma module. Thus, singular vectors play a crucial rôle in finding submodules of Verma modules. However, the set of singular vectors may not generate all the submodules. 
The quotient space of a Verma module divided by the submodules generated by all singular vectors may still be reducible and may hence contain further submodules that again contain singular vectors. But this time they are singular vectors of the quotient space, known as subsingular vectors of the Verma module. Repeating this division procedure successively would ultimately lead to an irreducible quotient space. On the Verma modules one introduces a hermitian contravariant form. The vanishing of the corresponding determinant indicates the existence of a singular vector. Therefore, a crucial step towards analysing irreducible highest weight representations is to compute the inner product determinant. This has been done for N = 1 [22,33,34], N = 2 [5,34,25,19,12], N = 3 [27], and N = 4 [28,32]. Once the determinant vanishes we can conclude the existence of a singular vector $\Psi_l$ at a certain level $l$, although there may still be other singular vectors at higher levels even outside the submodule generated by $\Psi_l$, the so-called isolated singular vectors. Thus the determinant may not give all singular vectors, nor does it give the dimension of the space of singular vectors at a given level $l$, since at levels where the determinant predicts one singular vector of a given type there could in fact be more than one linearly independent singular vector, as happens for the N = 2 superconformal algebras [9,20]. Therefore, the construction of specific singular vectors at levels given by the determinant formula may not be enough. One needs in addition information about the dimension of the space of singular vectors, apart from the (possible) existence of isolated singular vectors. The purpose of this paper is to give a simple procedure that derives necessary conditions on the dimensions of the spaces of singular vectors of the N = 2 superconformal algebras. This will result in an upper limit for the dimension of the spaces of singular vectors at a given level.
For most weight spaces of a Verma module these upper limits on the dimensions will be trivial and we obtain a rigorous proof that there cannot exist any
singular vectors for these weights. For some weights, however, we will find necessary conditions that allow one-dimensional singular vector spaces, as is the case for the Virasoro algebra, or even higher dimensional spaces. The method shown in this paper for the superconformal algebras originates from the method used by Kent [26] for the Virasoro algebra1 . Kent analytically continued the Virasoro Verma modules to generalised Verma modules. In these generalised Verma modules he constructed generalised singular vector expressions in terms of analytically continued Virasoro operators. Then he proved that if a generalised singular vector exists at level 0 in a generalised Verma module, then it is proportional to the highest weight vector. And consequently, if a generalised singular vector exists at a given level in a generalised Verma module, then it is unique up to proportionality. This uniqueness can therefore be used in order to show that the generalised singular vector expressions for the analytically continued modules are actually singular vectors of the Virasoro Verma module, whenever the Virasoro Verma module has a singular vector. As every Virasoro singular vector is at the same time a generalised singular vector, this implies that Virasoro singular vectors also have to be unique up to proportionality. In this paper we focus on the uniqueness proof of Kent and show that similar ideas can be applied directly to the superconformal algebras. Our procedure does not require any analytical continuation of the algebra, however, and therefore gives us a powerful method that can easily be applied to a vast number of algebras without the need of constructing singular vectors. We shall define the underlying idea as the concept of adapted orderings. For pedagogical reasons we will first apply Kent’s ordering directly to the Virasoro Verma modules. 
Then we will present adapted orderings for the topological N = 2 superconformal algebra, which is the most interesting N = 2 algebra for current research in this field. The results obtained will be translated finally to the Neveu–Schwarz and to the Ramond N = 2 algebras. In a future publication we will further apply these ideas to the twisted N = 2 superconformal algebra [13]. The paper is structured as follows. In Sect. 2 we explain the concept of adapted orderings for the case of the Virasoro algebra, which will also serve to illustrate Kent’s proof in our setting. In Sect. 3, we prove some general results on adapted orderings for superconformal algebras, which justify the use of this method. In Sect. 4 we review some basic results concerning the topological N = 2 superconformal algebra. Section 5 introduces adapted orderings on generic Verma modules of the topological N = 2 superconformal algebra (those built on G0 -closed or Q0 -closed highest weight vectors). This procedure is extended to chiral Verma modules in Sect. 6 and to no-label Verma modules in Sect. 7. Section 8 summarises the implications of the adapted orderings on the dimensions of the singular vector spaces for the corresponding topological Verma modules. Section 9 translates these results to the singular vector spaces of the Neveu– Schwarz and the Ramond N = 2 superconformal algebras. Section 10 is devoted to conclusions and prospects. The proof of Theorem 5.3 fills several pages and readers that are not interested in the details of this proof can simply continue with Theorem 5.5. In this case, the preliminary remarks to Theorems 6.1 and 7.2 should also be skipped. Nevertheless, the main idea of the concept can easily be understood from the introductory example of the Virasoro Verma modules in Sect. 2.
1 Besides the later application to the Neveu–Schwarz N = 2 algebra in Ref. [9], only one further application is known to us which has been achieved by Bajnok [4] for the W A2 algebra.
2. Virasoro Algebra
It is a well-known fact that at a given level of a Verma module of the Virasoro algebra there can be at most one singular vector, up to proportionality. This is an immediate consequence of the proof of the Virasoro embedding diagrams by Feigin and Fuchs [15]. Using an analytically continued algebra of the Virasoro algebra, Kent constructed in Ref. [26] all Virasoro singular vectors in terms of products of analytically continued operators. Although similar methods had already been used earlier on Verma modules over Kac-Moody algebras [31], the construction by Kent not only shows the existence of analytically continued singular vectors for any complex level but also their uniqueness$^2$. This issue is our main interest in this paper. We shall therefore concentrate on the part of Kent’s proof that shows the uniqueness of Virasoro singular vectors rather than the existence of analytically continued singular vectors. It turns out that the extension of the Virasoro algebra to an analytically continued algebra, although needed for the part of Kent’s proof showing the existence claim, is not necessary for the uniqueness claim on which we will focus in this paper. We will first motivate and define our concept of adapted orderings for the Virasoro algebra and will then prove some first results for the implications of adapted orderings on singular vectors. Following Kent [26] we will then introduce an ordering on the basis of a Virasoro Verma module and describe it in our framework. If we assume that a singular vector exists at a fixed level, then this total ordering will show that this singular vector has to be unique up to proportionality. The Virasoro algebra $V$ is generated by the operators $L_m$ with $m \in \mathbb{Z}$ and the central extension $C$ satisfying the commutation relations
$$[L_m, L_n] = (m - n) L_{m+n} + \frac{C}{12} (m^3 - m)\, \delta_{m+n,0}, \qquad [C, L_m] = 0, \qquad m, n \in \mathbb{Z}.$$ (1)
$V$ can be written in its triangular decomposition $V = V_- \oplus V_0 \oplus V_+$, with $V_+ = \mathrm{span}\{L_m : m \in \mathbb{N}\}$, the positive Virasoro operators, and $V_- = \mathrm{span}\{L_{-m} : m \in \mathbb{N}\}$, the negative Virasoro operators. The Cartan subalgebra is given by $V_0 = \mathrm{span}\{L_0, C\}$. For elements $Y$ of $V$ that are eigenvectors of $L_0$ with respect to the adjoint representation we call the $L_0$-eigenvalue the level of $Y$ and denote it by$^3$ $|Y|$: $[L_0, Y] = |Y|\, Y$. The same shall be used for the universal enveloping algebra $U(V)$. In particular, elements of $U(V)$ of the form $Y = L_{-p_I} \cdots L_{-p_1}$, $p_q \in \mathbb{Z}$ for $q = 1, \ldots, I$, $I \in \mathbb{N}$, are at level $|Y| = \sum_{q=1}^{I} p_q$ and we furthermore define them to be of length $\|Y\| = I$. Finally, for the identity operator we set $\|1\| = |1| = 0$. For convenience we define the graded class of subsets of operators in $U(V)$ at positive level:
$$\mathcal{S}_m = \{ S = L_{-m_I} \cdots L_{-m_1} : |S| = m;\ m_I \geq \ldots \geq m_1 \geq 2;\ m_1, \ldots, m_I, I \in \mathbb{N} \},$$ (2)
for $m \in \mathbb{N}$, $\mathcal{S}_0 = \{1\}$, and also
$$\mathcal{C}_n = \{ X = S_m L_{-1}^{n-m} : S_m \in \mathcal{S}_m,\ m \in \mathbb{N}_0,\ m \leq n \},$$ (3)
for $n \in \mathbb{N}_0$, which will serve to construct a basis for Virasoro Verma modules later on. We consider representations of $V$ for which the Cartan subalgebra $V_0$ is diagonal. Furthermore, $C$ commutes with all operators of $V$ and can hence be taken to be constant $c \in \mathbb{C}$ (in an irreducible representation). A representation with $L_0$-eigenvalues bounded from below contains a vector with $L_0$-eigenvalue $\Delta$ which is annihilated by $V_+$, a highest weight vector $|\Delta, c\rangle$:
$$V_+ |\Delta, c\rangle = 0, \qquad L_0 |\Delta, c\rangle = \Delta\, |\Delta, c\rangle, \qquad C\, |\Delta, c\rangle = c\, |\Delta, c\rangle.$$ (4)
2 The exact proof of Kent showed that generalised Virasoro singular vectors at level 0 are scalar multiples of the identity.
3 Note that positive generators $L_m$ have negative level $|L_m| = -m$. Therefore, any positive operators $\Gamma \in V_+$ have a negative level $|\Gamma|$.
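The Verma module $\mathcal{V}_{\Delta,c}$ is the left-module $\mathcal{V}_{\Delta,c} = U(V) \otimes_{V_0 \oplus V_+} |\Delta, c\rangle$. For $\mathcal{V}_{\Delta,c}$ we choose the standard basis $\mathcal{B}^{\Delta,c}$ as:
$$\mathcal{B}^{\Delta,c} = \{ S_m L_{-1}^{n} |\Delta, c\rangle : S_m \in \mathcal{S}_m,\ m, n \in \mathbb{N}_0 \}.$$ (5)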
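$\mathcal{V}_{\Delta,c}$ and $\mathcal{B}^{\Delta,c}$ are $L_0$-graded in a natural way. The corresponding $L_0$-eigenvalue is called the conformal weight and the $L_0$-eigenvalue relative to $\Delta$ is the level. Let us introduce
$$\mathcal{B}^{\Delta,c}_k = \{ X_k |\Delta, c\rangle : X_k \in \mathcal{C}_k \}, \qquad k \in \mathbb{N}_0.$$ (6)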
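As a rough illustration of Eqs. (2), (3) and (6), which is not part of the original text, the following Python sketch enumerates the index sets for small levels. The encoding of an element $S_m L_{-1}^{n-m}$ as a pair (tuple of the $m_j$ in non-increasing order, power of $L_{-1}$) and the function names `parts_ge2` and `C` are assumptions made here for illustration only.

```python
def parts_ge2(m, max_part=None):
    """Yield the partitions of m into parts >= 2 as non-increasing tuples.

    Each tuple (m_I, ..., m_1) stands for the operator L_{-m_I} ... L_{-m_1}
    of Eq. (2); the empty tuple stands for the identity (the set S_0).
    """
    if max_part is None:
        max_part = m
    if m == 0:
        yield ()
        return
    for first in range(min(m, max_part), 1, -1):
        for tail in parts_ge2(m - first, first):
            yield (first,) + tail


def C(n):
    """Elements S_m L_{-1}^{n-m} of C_n in Eq. (3), encoded as (S_m, n - m)."""
    return [(s, n - m) for m in range(n + 1) for s in parts_ge2(m)]


if __name__ == "__main__":
    print(C(4))
    print([len(C(n)) for n in range(7)])  # [1, 1, 2, 3, 5, 7, 11]
```

The counts printed in the last line are the partition numbers $p(n)$: by Eq. (6) the set $\mathcal{C}_k$ labels a basis of the level-$k$ space of the Verma module, whose dimension is the number of partitions of $k$.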
Thus, $\mathcal{B}^{\Delta,c}_k$ has conformal weight $\Delta + k$ and $\mathrm{span}\{\mathcal{B}^{\Delta,c}_k\}$ is the grade space of $\mathcal{V}_{\Delta,c}$ at level $k$. For $x \in \mathrm{span}\{\mathcal{B}^{\Delta,c}_k\}$ we again denote the level by $|x| = k$. Verma modules may not be irreducible. In order to obtain physically relevant irreducible highest weight representations one thus needs to trace back the proper submodules of $\mathcal{V}_{\Delta,c}$ and divide them out. This finally leads to the notion of singular vectors as any proper submodule of $\mathcal{V}_{\Delta,c}$ needs to contain a vector $\Psi_l$ that is not proportional to the highest weight vector $|\Delta, c\rangle$ but still satisfies the highest weight vector conditions$^4$ with conformal weight$^5$ $\Delta + l$ for some $l \in \mathbb{N}_0$:
$$V_+ \Psi_l = 0, \qquad L_0 \Psi_l = (\Delta + l)\, \Psi_l, \qquad C\, \Psi_l = c\, \Psi_l,$$ (7)
$l$ is the level of $\Psi_l$, denoted by $|\Psi_l|$. An eigenvector $\Psi_l$ of $L_0$ at level $l$ in $\mathcal{V}_{\Delta,c}$, in particular a singular vector, can thus be written using the basis (6):
$$\Psi_l = \sum_{m=0}^{l} \sum_{S_m \in \mathcal{S}_m} c_{S_m}\, S_m L_{-1}^{l-m} |\Delta, c\rangle,$$ (8)
with coefficients $c_{S_m} \in \mathbb{C}$. The basis decomposition (8) of an $L_0$-eigenvector in $\mathcal{V}_{\Delta,c}$ will be denoted the normal form of $\Psi_l$, where $S_m L_{-1}^{l-m} \in \mathcal{C}_l$ and $c_{S_m}$ will be referred to as the terms and coefficients of $\Psi_l$, respectively. A non-trivial term $Y \in \mathcal{C}_l$ of $\Psi_l$ refers to a term $Y$ in Eq. (8) with non-trivial coefficient $c_Y$. Let $\mathcal{O}$ denote a total ordering on $\mathcal{C}_l$ with global minimum. Thus $\Psi_l$ in Eq. (8) needs to contain an $\mathcal{O}$-smallest $X_0 \in \mathcal{C}_l$ with $c_{X_0} \neq 0$ and $c_Y = 0$ for all $Y \in \mathcal{C}_l$ with $Y$
a smallest non-trivial term $X_0$ of $\Psi_l$ different from $L_{-1}^l$. However, by assumption we can find a positive operator $\Gamma$ annihilating $\Psi_l$ but also creating a term that can only be generated from the term $X_0$ and from no other terms of $\Psi_l$. Thus the coefficient $c_{X_0}$ of $X_0$ is trivial and therefore $\Psi_l$ is trivial. This motivates our definition of adapted orderings for Virasoro Verma modules:
Definition 2.1. A total ordering $\mathcal{O}$ on $\mathcal{C}_l$ ($l \in \mathbb{N}_0$) with global minimum is called adapted to the subset $\mathcal{C}_l^A \subset \mathcal{C}_l$ in the Verma module $\mathcal{V}_{\Delta,c}$ if for any element $X_0 \in \mathcal{C}_l^A$ at least one positive operator $\Gamma \in V_+$ exists for which
$$\Gamma X_0 |\Delta, c\rangle = \sum_{m=0}^{l+|\Gamma|} \sum_{S_m \in \mathcal{S}_m} c^{\Gamma X_0}_{S_m}\, S_m L_{-1}^{l+|\Gamma|-m} |\Delta, c\rangle$$ (9)
contains a non-trivial term $\tilde{X} \in \mathcal{C}_{l+|\Gamma|}$ (i.e. $c^{\Gamma X_0}_{\tilde{X}} \neq 0$) such that for all $Y \in \mathcal{C}_l$ with $X_0 <_{\mathcal{O}} Y$ and $X_0 \neq Y$ the coefficient $c^{\Gamma Y}_{\tilde{X}}$ in
$$\Gamma Y |\Delta, c\rangle = \sum_{m=0}^{l+|\Gamma|} \sum_{S_m \in \mathcal{S}_m} c^{\Gamma Y}_{S_m}\, S_m L_{-1}^{l+|\Gamma|-m} |\Delta, c\rangle$$ (10)
is trivial: $c^{\Gamma Y}_{\tilde{X}} = 0$. The complement of $\mathcal{C}_l^A$, $\mathcal{C}_l^K = \mathcal{C}_l \setminus \mathcal{C}_l^A$, is the kernel with respect to the ordering $\mathcal{O}$ in the Verma module $\mathcal{V}_{\Delta,c}$.
Obviously, any total ordering on $\mathcal{C}_l$ is always adapted to the subset $\emptyset \subset \mathcal{C}_l$ with ordering kernel $\mathcal{C}_l^K = \mathcal{C}_l$, which does not give much information. For our purposes we need to find suitable ordering restrictions in order to obtain the smallest possible ordering kernels (which is not a straightforward task). In the Virasoro case we will give an ordering such that the ordering kernel for each $l \in \mathbb{N}$ has just one element: $L_{-1}^l$. As indicated in our motivation, it is then fairly simple to show that a singular vector at level $l$ needs to have a non-trivial coefficient for the term $L_{-1}^l$ in its normal form. If two singular vectors have the same coefficient for this term, then their difference is either trivial or a singular vector with trivial $L_{-1}^l$ term. Again the latter is not allowed and hence all singular vectors at level $l$ are unique up to proportionality. This will be summarised in the following theorem:
Theorem 2.2. Let $\mathcal{O}$ denote an adapted ordering in $\mathcal{C}_l^A$ at level $l \in \mathbb{N}$ with a kernel $\mathcal{C}_l^K$ consisting of just one term $K$ for a given Verma module $\mathcal{V}_{\Delta,c}$. If two vectors $\Psi_l^1$ and $\Psi_l^2$ at level $l$ in $\mathcal{V}_{\Delta,c}$, both satisfying the highest weight conditions Eq. (7), have $c_K^1 = c_K^2$, then
$$\Psi_l^1 \equiv \Psi_l^2.$$ (11)
Proof. Let us consider $\tilde{\Psi}_l = \Psi_l^1 - \Psi_l^2$. The normal form of $\tilde{\Psi}_l$ does not contain the term $K$ as $c_K^1 = c_K^2$. As $\mathcal{C}_l$ is a totally ordered set with respect to $\mathcal{O}$, the non-trivial terms of $\tilde{\Psi}_l$, provided $\tilde{\Psi}_l$ is non-trivial, need to have an $\mathcal{O}$-minimum $X_0 \in \mathcal{C}_l$. By construction, the coefficient $\tilde{c}_{X_0}$ of $X_0$ in $\tilde{\Psi}_l$ is non-trivial, hence, $X_0$ is contained in $\mathcal{C}_l^A$. As $\mathcal{O}$ is adapted to $\mathcal{C}_l^A$ we can find a positive generator $\Gamma \in V_+$ such that $\Gamma X_0 |\Delta, c\rangle$ contains a non-trivial term that cannot be created from any other term of $\tilde{\Psi}_l$ which is $\mathcal{O}$-larger than $X_0$. But $X_0$ was chosen to be the $\mathcal{O}$-minimum of the non-trivial terms of $\tilde{\Psi}_l$. Therefore, $\Gamma X_0 |\Delta, c\rangle$ contains a non-trivial term that cannot be created from any other term of $\tilde{\Psi}_l$. The coefficient of this term is obviously given by $a\, \tilde{c}_{X_0}$ with a non-trivial complex number $a$. Like $\Psi_l^1$ and $\Psi_l^2$, $\tilde{\Psi}_l$ is also annihilated by any positive generator, in particular by $\Gamma$. It follows that $\tilde{c}_{X_0} = 0$ which leads to contradiction. Thus, the set of non-trivial terms of $\tilde{\Psi}_l$ is empty and therefore $\tilde{\Psi}_l = 0$. This results in $\Psi_l^1 = \Psi_l^2$. □
Equipped with Definition 2.1 and Theorem 2.2 we can now easily prove the well-known [26,15] uniqueness of Virasoro singular vectors. Let us first review the total ordering on $\mathcal{C}_l$ defined by Kent [26]. Whilst Kent used the following ordering to show that in his generalised Virasoro Verma modules vectors at level 0 satisfying the highest weight conditions are actually proportional to the highest weight vector, we will use Theorem 2.2 to show that, furthermore, already the ordering implies that all Virasoro singular vectors are unique at their levels up to proportionality.
Definition 2.3. On the set $\mathcal{C}_l$ of Virasoro operators we introduce the total ordering $\mathcal{O}_V$ for $l \in \mathbb{N}$. For two elements $X_1, X_2 \in \mathcal{C}_l$, $X_1 \neq X_2$, with $X_i = L_{-m^i_{I_i}} \cdots L_{-m^i_1} L_{-1}^{n_i}$, $n_i = l - m^i_{I_i} - \ldots - m^i_1$, or $X_i = L_{-1}^l$, $i = 1, 2$, we define
$$X_1 <_{\mathcal{O}_V} X_2 \quad \text{if} \quad n_1 > n_2.$$ (12)
If, however, $n_1 = n_2$ we compute the index $j_0 = \min\{ j : m^1_j - m^2_j \neq 0,\ j = 1, \ldots, \min(I_1, I_2) \}$. We then define
$$X_1 <_{\mathcal{O}_V} X_2 \quad \text{if} \quad m^1_{j_0} < m^2_{j_0}.$$ (13)
For $X_1 = X_2$ we set $X_1$
Proof. The idea of the proof is a generalisation of Kent’s proof in Ref. [26]. Let us consider $X_0 = L_{-m_I} \cdots L_{-m_1} L_{-1}^{n_0} \in \mathcal{C}_l^A$, $n_0 = l - m_I - \ldots - m_1$, $m_I \geq \ldots \geq m_1 \geq 2$ for $I \in \mathbb{N}$. We then construct a vector $\Psi_l = X_0 |\Delta, c\rangle$ at level $l$ in the Verma module $\mathcal{V}_{\Delta,c}$. We apply the positive operator $L_{m_1 - 1}$ to $\Psi_l$ and write the result in its normal form,
$$L_{m_1 - 1} \Psi_l = \sum_{m=0}^{l - m_1 + 1} \sum_{S_m \in \mathcal{S}_m} c_{S_m}\, S_m L_{-1}^{l - m_1 + 1 - m} |\Delta, c\rangle,$$ (14)
following Eq. (9). Equation (14) contains a non-trivial contribution of $\tilde{S} = L_{-m_I} \cdots L_{-m_2}$, simply by commuting $L_{m_1 - 1}$ with $L_{-m_1}$ in $X_0$ which creates another operator $L_{-1}$ but lets $L_{-m_I} \cdots L_{-m_2}$ remain unchanged and thus creates the term $\tilde{X} = L_{-m_I} \cdots L_{-m_2} L_{-1}^{n_0 + 1}$. In the case $m_1 = m_2 = \ldots = m_j$ we simply obtain multiple copies of this term. However, for any other term $Y = L_{-m^Y_J} \cdots L_{-m^Y_1} L_{-1}^{n_Y}$ with $Y \in \mathcal{C}_l$ ($n_Y = l - m^Y_J - \ldots - m^Y_1$, $J \in \mathbb{N}_0$) producing the term $\tilde{X}$ under the action of $L_{m_1 - 1}$, either the term $Y$ needs to have already at least one $L_{-1}$ more than $X_0$, and would consequently be $\mathcal{O}$-smaller than $X_0$ due to Eq. (12), or $L_{m_1 - 1}$ needs to create $L_{-1}$ by commuting through $L_{-m^Y_J} \cdots L_{-m^Y_1}$. The latter, however, is only possible if $m^Y_1 < m_1$. Otherwise the commutation relations would not allow $L_{-1}$ being created from $L_{m^Y_1 - 1}$ and for $m^Y_1 = m_1$ we would ultimately find $X_0 = Y$, as both terms need to create $\tilde{X}$. Hence $m^Y_1 < m_1$ and therefore one finds $Y$
Theorem 2.4 implies as an immediate consequence the following theorem about the uniqueness of Virasoro singular vectors.
Theorem 2.5. If the Virasoro Verma module $\mathcal{V}_{\Delta,c}$ contains a singular vector $\Psi_l$ at level $l$, $l \in \mathbb{N}$, then $\Psi_l$ is unique up to proportionality. The coefficient of the term $L_{-1}^l \in \mathcal{C}_l$ in the normal form of $\Psi_l$, i.e. the coefficient of $L_{-1}^l |\Delta, c\rangle$, is non-trivial.
Proof. We first show that the $L_{-1}^l |\Delta, c\rangle$ component in the normal form of $\Psi_l$ is non-trivial. Let us assume this component is trivial. The trivial vector 0 also satisfies the highest weight conditions Eq. (7) for any level $l$ and has trivial $L_{-1}^l |\Delta, c\rangle$ component. According to Theorem 2.4, $\{L_{-1}^l\}$ is an ordering kernel for the ordering $\mathcal{O}_V$ on $\mathcal{C}_l$. Therefore, from Theorem 2.2 we know that $\Psi_l = 0$ and therefore $\Psi_l$ is not a singular vector, which is a contradiction. Hence, we obtain that the component of $L_{-1}^l |\Delta, c\rangle$ in $\Psi_l$ has to be non-trivial. Let us now assume $\Psi_l'$ is another singular vector in $\mathcal{V}_{\Delta,c}$ at the same level $l$ as $\Psi_l$. We know that the coefficients $c'_{L_{-1}^l}$ and $c_{L_{-1}^l}$ of $\Psi_l'$ and $\Psi_l$ respectively are both non-trivial. Therefore $c'_{L_{-1}^l} \Psi_l$ and $c_{L_{-1}^l} \Psi_l'$ are two singular vectors at the same level which agree in their $L_{-1}^l$ coefficient and according to Theorem 2.2 are identical. Thus, $\Psi_l$ and $\Psi_l'$ are proportional. □
Feigin and Fuchs [15] have proven for which Verma modules these unique Virasoro singular vectors do exist.
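The statement of Theorem 2.5 can be checked by hand at level 2. The Python sketch below is an illustration added here, not part of the original paper. It implements the action of the Virasoro generators on the standard basis using only the commutation relations (1) and the highest weight conditions (4), and applies $L_1$ and $L_2$ to the standard level-2 vector $\Psi_2 = \left(L_{-1}^2 - \tfrac{2(2\Delta+1)}{3} L_{-2}\right)|\Delta, c\rangle$, which is singular when $c = 2\Delta(5 - 8\Delta)/(2\Delta + 1)$. This vector is a textbook fact quoted as an assumption; it is not derived in the text. The function names `act` and `apply`, and the encoding of a basis vector $L_{-m_1} \cdots L_{-m_k}|\Delta, c\rangle$ as a non-increasing tuple `(m_1, ..., m_k)`, are hypothetical choices made for this sketch.

```python
from fractions import Fraction


def act(n, word, delta, c):
    """Return L_n applied to L_{-m_1}...L_{-m_k}|delta,c> in normal form.

    `word` is the non-increasing tuple (m_1, ..., m_k) with m_k >= 1; the
    result is a dict mapping such tuples to coefficients.
    """
    if n == 0:  # L_0 measures the conformal weight delta + level
        return {word: delta + sum(word)}
    if not word:  # highest weight vector: annihilated by L_n for n > 0
        return {} if n > 0 else {(-n,): Fraction(1)}
    if n < 0 and -n >= word[0]:  # already in normal order
        return {(-n,) + word: Fraction(1)}
    m, rest = word[0], word[1:]
    out = {}
    # L_n L_{-m} = L_{-m} L_n + [L_n, L_{-m}]
    for w, a in act(n, rest, delta, c).items():
        for w2, b in act(-m, w, delta, c).items():
            out[w2] = out.get(w2, 0) + a * b
    # [L_n, L_{-m}] = (n + m) L_{n-m} + c/12 (n^3 - n) delta_{n,m}, Eq. (1)
    for w, a in act(n - m, rest, delta, c).items():
        out[w] = out.get(w, 0) + (n + m) * a
    if n == m:
        out[rest] = out.get(rest, 0) + Fraction(c) * (n**3 - n) / 12
    return {w: a for w, a in out.items() if a != 0}


def apply(n, vec, delta, c):
    """Apply L_n to a linear combination of basis vectors."""
    out = {}
    for word, a in vec.items():
        for w, b in act(n, word, delta, c).items():
            out[w] = out.get(w, 0) + a * b
    return {w: a for w, a in out.items() if a != 0}


if __name__ == "__main__":
    delta = Fraction(1, 2)
    c = 2 * delta * (5 - 8 * delta) / (2 * delta + 1)  # equals 1/2 here
    # Normal form of Psi_2: coefficient 1 for L_{-1}^2, -4/3 for L_{-2}
    psi = {(1, 1): Fraction(1), (2,): -2 * (2 * delta + 1) / 3}
    print(apply(1, psi, delta, c), apply(2, psi, delta, c))  # {} {}
```

Since $L_1$ and $L_2$ generate all positive Virasoro operators, the two empty results show that this $\Psi_2$ satisfies the conditions (7). Its $L_{-1}^2$ coefficient is non-trivial, in line with Theorem 2.5; for a generic central charge the same vector is no longer annihilated by $L_2$.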
3. Superconformal Algebras and Adapted Orderings
A superconformal algebra is a Lie super algebra that contains the Virasoro algebra as a subalgebra. Therefore superconformal algebras are also known as super extensions of the Virasoro algebra. Thanks to Kac [21,23], Cheng and Kac [6], and Kac and van de Leur [24] all superconformal algebras are known by now. Let $A$ denote a superconformal algebra and let $U(A)$ be the universal enveloping algebra of $A$. We require that the energy operator $L_0$ of the Virasoro subalgebra is contained in our choice of the Cartan subalgebra $H_A$ of $A$. $A$ thus decomposes in $L_0$-grade spaces which we want to group according to the sign of the grade: $A = A_- \oplus A_0 \oplus A_+$, where the $L_0$-grades of $A_-$, $A_0$, or $A_+$ are positive, zero, or negative respectively$^6$. Consequently, $U(A)$ also decomposes in $L_0$-grade spaces: $U(A) = U(A)_- \oplus U(A)_0 \oplus U(A)_+$. Obviously, the Cartan subalgebra $H_A$ is contained in $A_0$ but does not need to be identical to $A_0$. The $L_0$-grade is just one component of the roots $\mu \in H_A^*$ (the dual space of $H_A$) of $A$. For simplicity, let us fix a basis for $H_A$ that contains $L_0$: $\{L_0, H^2, \ldots, H^r\}$, and hence let us denote the roots as $(\Delta, \mu)$, where $\Delta$ indicates the $L_0$ component and $\mu = (\mu_2, \ldots, \mu_r)$ the vector of all other components. Physicists are mainly interested in positive energy representations. One thus defines a highest weight vector $|\Delta, \mu\rangle$ as a simultaneous eigenvector of $H_A$ with eigenvalues, the weights, $(\Delta, \mu)$ and vanishing $A_+$ action: $A_+ |\Delta, \mu\rangle = 0$. The $L_0$-weight $\Delta = \mu(L_0)$ is the conformal weight, which is for convenience always denoted explicitly in addition to the other weights $\mu$. Depending on the algebra $A$, physical as well as mathematical applications may require highest weight vectors that satisfy additional vanishing conditions with respect to operators of $A_0$ (the zero modes) with $H_A$ normally excluded. Later on, this shall be further explained in Sect. 4 for the topological N = 2 algebra.
Definition 3.1. For a subalgebra $N$ of $A_0$ that includes the Cartan subalgebra $H_A$ we define a highest weight vector $|\Delta, \mu\rangle^N$ with weight $(\Delta, \mu)$:
$$L_0 |\Delta, \mu\rangle^N = \Delta\, |\Delta, \mu\rangle^N,$$ (15)
$$H^i |\Delta, \mu\rangle^N = \mu_i\, |\Delta, \mu\rangle^N, \qquad i = 2, \ldots, r,$$ (16)
$$A_+ |\Delta, \mu\rangle^N = 0,$$ (17)
$$\Gamma^0 |\Delta, \mu\rangle^N = 0, \qquad \forall\, \Gamma^0 \in N / H_A.$$ (18)
A Verma module is then defined analogously to the Virasoro case as the left module $\mathcal{V}^N_{\Delta,\mu} = U(A) \otimes_{N \oplus A_+} |\Delta, \mu\rangle^N$, where we use the representation Eqs. (15)–(17) to act with $N \oplus A_+$ on $|\Delta, \mu\rangle^N$. If $N = H_A$ we shall simply write $\mathcal{V}_{\Delta,\mu}$ and $|\Delta, \mu\rangle$. The Verma module $\mathcal{V}^N_{\Delta,\mu}$ is again graded with respect to $H_A$ into weight spaces $\mathcal{V}^{N,(l,q)}_{\Delta,\mu}$ with weights $(\Delta + l, \mu + q)$. For convenience we shall only use the relative weights $(l, q)$ with $q = (q_2, \ldots, q_r)$ whenever we want to refer to a weight. The $L_0$ relative weight $l$ is again called the level. Also for the universal enveloping algebra an element $Y$ with well-defined $L_0$-grade $l$ is said to be at level $|Y| = l$, i.e. $[L_0, Y] = |Y|\, Y$. As for the Virasoro case we shall define a singular vector of a Verma module $\mathcal{V}^N_{\Delta,\mu}$ to be a vector which is not proportional to the highest weight vector but satisfies the highest weight conditions Eqs. (15)–(17) with possibly different weights.
6 Note that for historical reasons the elements of $A_-$ have positive $L_0$-grade.
502
M. Dörrzapf, B. Gato-Rivera N ,(l,q)
0
N ∈V Definition 3.2. A vector 9l,q 1,µ if
is said to satisfy the highest weight conditions
0
0
N N L0 9l,q = (1 + l)9l,q , N0
(19)
N0
H i 9l,q = (µi + qi )9l,q , i = 2, . . . , r,
(20)
N0
U (A)+ 9l,q = 0, 0
0
N0 9l,q
(21) 0
= 0, ∀ 0 ∈ N /U (HA ), 0
(22)
for a subalgebra7 N 0 of U (A)0 that contains U (HA ) and may or may not be equal to N 0 is called a singular vector if in addition 9 N 0 is not proportional to the highest N . 9l,q l,q weight vector |1, µiN . N ,(l,q)
Each weight space $V^{N,(l,q)}_{\Delta,\mu}$ can be generated by choosing the subset of the root space of the universal enveloping algebra $U(A)$ with root $(l,q)$ that only consists of generators not taken from $N \oplus A^+$. From this set we choose elements such that the vectors generated by acting on the highest weight vector $|\Delta,\mu\rangle^N$ are all linearly independent and thus form a basis for $V^{N,(l,q)}_{\Delta,\mu}$. Let $C^N_{l,q}$ denote such a basis set, analogous to the basis set for the Virasoro algebra, to be specified later. Further, let $B^N_{\Delta,\mu}$ denote any basis for the Verma module $V^N_{\Delta,\mu}$. The standard basis for the weight space $V^{N,(l,q)}_{\Delta,\mu}$ shall be the basis $\tilde B^{N,(l,q)}_{\Delta,\mu}$ generated by the sets $C^N_{l,q}$ on the highest weight vector, and the basis decomposition of $\Psi_{l,q}$ with respect to $\tilde B^{N,(l,q)}_{\Delta,\mu}$ is called its normal form. As in the Virasoro case, an element $X$ of $C^N_{l,q}$ that generates a particular basis vector in $\tilde B^{N,(l,q)}_{\Delta,\mu}$ is called a term of a vector $\Psi_{l,q} \in V^{N,(l,q)}_{\Delta,\mu}$ and the corresponding coefficient $c_X$ in its basis decomposition is simply called a coefficient of $\Psi_{l,q}$. Finally, we call the term $X$ of $\Psi_{l,q}$ a non-trivial term if $c_X \neq 0$. This completes the necessary notation to define adapted orderings on $C^N_{l,q}$ just like in the Virasoro case.
Definition 3.3. A total ordering $O$ on $C^N_{l,q}$ with global minimum is called adapted to the subset $C^{N,A}_{l,q} \subset C^N_{l,q}$ in the Verma module $V^N_{\Delta,\mu}$ with annihilation operators $K \subset U(A)^+ \oplus U(A)^0/U(H_A)$ if for any element $X_0 \in C^{N,A}_{l,q}$ at least one annihilation operator $\Gamma \in K$ exists for which
$\Gamma X_0 |\Delta,\mu\rangle^N = \sum_{X \in B^N_{\Delta,\mu}} c^{\Gamma X_0}_X X$ (23)
contains a non-trivial term $\tilde X \in B^N_{\Delta,\mu}$ (i.e. $c^{\Gamma X_0}_{\tilde X} \neq 0$) such that for all $Y \in C^N_{l,q}$ with $X_0 <_O Y$ and $X_0 \neq Y$ the coefficient $c^{\Gamma Y}_{\tilde X}$ in
$\Gamma Y |\Delta,\mu\rangle^N = \sum_{X \in B^N_{\Delta,\mu}} c^{\Gamma Y}_X X$ (24)
7 Note that this time, for convenience, we have chosen the universal enveloping algebra to define the highest weight conditions which is equivalent to the earlier definition Eqs. (15)–(17).
Singular Dimensions of N = 2 Superconformal Algebras. I
503
is trivial: $c^{\Gamma Y}_{\tilde X} = 0$. The complement of $C^{N,A}_{l,q}$, $C^{N,K}_{l,q} = C^N_{l,q} \setminus C^{N,A}_{l,q}$, is the kernel with respect to the ordering $O$ in the Verma module $V^N_{\Delta,\mu}$. Here $B^N_{\Delta,\mu}$ represents a basis that can be chosen suitably for each $X_0$ and may or may not be the standard basis$^8$.
In the motivation to Theorem 2.2 we assumed the existence of an ordering with the smallest kernel consisting of one element only. For the N = 2 algebras we will find ordering kernels which contain more than one element and also ordering kernels that are trivial. We saw in Theorem 2.5 that if the ordering kernel has only one element (the global minimum $L_{-1}^l$ in the case of the Virasoro algebra) then any singular vector needs to have a non-trivial coefficient for this element. The following theorems reveal what can be implied if the ordering kernel consists of more than one element or of none at all.
Theorem 3.4. Let $O$ denote an adapted ordering on $C^{N,A}_{l,q}$ at weight $(l,q)$ with kernel $C^{N,K}_{l,q}$ for a given Verma module $V^N_{\Delta,\mu}$ and annihilation operators $K$. If two vectors $\Psi^{N',1}_{l,q}$ and $\Psi^{N',2}_{l,q}$ at the same level $l$ and weight $q$, satisfying the highest weight conditions Eqs. (19)–(22) with $N' = K$, have $c^1_X = c^2_X$ for all $X \in C^{N,K}_{l,q}$, then
$\Psi^{N',1}_{l,q} \equiv \Psi^{N',2}_{l,q}$. (25)
Proof. Let us consider $\tilde\Psi_{l,q} = \Psi^{N',1}_{l,q} - \Psi^{N',2}_{l,q}$. The normal form of $\tilde\Psi_{l,q}$ does not contain any terms of the ordering kernel $C^{N,K}_{l,q}$, simply because $c^1_X = c^2_X$ for all $X \in C^{N,K}_{l,q}$. As $C^N_{l,q}$ is a totally ordered set with respect to $O$ which has a global minimum, the non-trivial terms of $\tilde\Psi_{l,q}$, provided $\tilde\Psi_{l,q}$ is non-trivial, need to have an $O$-minimum $X_0 \in C^N_{l,q}$. By construction, the coefficient $\tilde c_{X_0}$ of $X_0$ in $\tilde\Psi_{l,q}$ in its normal form is non-trivial, hence $X_0$ is also contained in $C^{N,A}_{l,q}$. As $O$ is adapted to $C^{N,A}_{l,q}$ one can find an annihilation operator $\Gamma \in K$ such that $\Gamma X_0 |\Delta,\mu\rangle^N$ contains a non-trivial term (for a suitably chosen basis depending on $X_0$) that cannot be created by any other term of $\tilde\Psi_{l,q}$ which is $O$-larger than $X_0$. But $X_0$ was chosen to be the $O$-minimum of $\tilde\Psi_{l,q}$. Therefore, $\Gamma X_0 |\Delta,\mu\rangle^N$ contains a non-trivial term that cannot be created from any other term of $\tilde\Psi_{l,q}$. The coefficient of this term is obviously given by $a\,\tilde c_{X_0}$ with a non-trivial complex number $a$. Together with $\Psi^{N',1}_{l,q}$ and $\Psi^{N',2}_{l,q}$, $\tilde\Psi_{l,q}$ is also annihilated by any annihilation operator, in particular by $\Gamma$. It follows that $\tilde c_{X_0} = 0$, contrary to our original assumption. Thus, the set of non-trivial terms of $\tilde\Psi_{l,q}$ is empty and therefore $\tilde\Psi_{l,q} = 0$. This results in $\Psi^{N',1}_{l,q} = \Psi^{N',2}_{l,q}$. □
Theorem 3.4 states that if two singular vectors at the same level and weight agree on the ordering kernel, then they are identical. The coefficients of a singular vector with respect to the ordering kernel are therefore sufficient to distinguish singular vectors. If the ordering kernel is trivial we consequently find 0 as the only vector that can satisfy the highest weight conditions.
Theorem 3.5. Let $O$ denote an adapted ordering on $C^{N,A}_{l,q}$ at weight $(l,q)$ with trivial kernel $C^{N,K}_{l,q} = \emptyset$ for a given Verma module $V^N_{\Delta,\mu}$ and annihilation operators $K$. A vector $\Psi^{N'}_{l,q}$ at level $l$ and weight $q$ satisfying the highest weight conditions Eqs. (19)–(22) with $N' = K$, is therefore trivial. In particular, this shows that there are no singular vectors.
Proof. We again make use of the fact that the trivial vector 0 satisfies any vanishing conditions for any level $l$ and weight $q$. As the ordering kernel is trivial, the components of the vectors 0 and $\Psi^{N'}_{l,q}$ agree on the ordering kernel and using Theorem 3.4 we obtain $\Psi^{N'}_{l,q} = 0$. □
8 In the N = 2 case we will choose for most $X_0$ the standard basis with only very few but important exceptions.
We now know that singular vectors can be classified by their components on the ordering kernel. As we shall see, if the ordering kernel has $n$ elements, then the space of singular vectors for this weight is at most $n$-dimensional. Conversely, one could ask if there are singular vectors corresponding to all possible combinations of elements of the ordering kernel. In general this will not be the case. However, for the Virasoro algebra [26] and for the Neveu–Schwarz N = 2 algebra [9] it has been shown that for each element of the ordering kernel there exists a singular vector for suitably defined analytically continued Verma modules. Some of these generalised singular vectors lie in the embedded original non-continued Verma module and are therefore singular in the above sense. We finally conclude with the following theorem summarising all our findings so far.
Theorem 3.6. Let $O$ denote an adapted ordering on $C^{N,A}_{l,q}$ at weight $(l,q)$ with kernel $C^{N,K}_{l,q}$ for a given Verma module $V^N_{\Delta,\mu}$ and annihilation operators $K$. If the ordering kernel $C^{N,K}_{l,q}$ has $n$ elements, then there are at most $n$ linearly independent singular vectors $\Psi^{N'}_{l,q}$ in $V^N_{\Delta,\mu}$ with weight $(l,q)$ and $N' = K$.
Proof. Suppose there were more than $n$ linearly independent singular vectors $\Psi^{N'}_{l,q}$ in $V^N_{\Delta,\mu}$ with weight $(l,q)$. We choose $n+1$ linearly independent singular vectors among them, $\Psi_1, \ldots, \Psi_{n+1}$. The ordering kernel $C^{N,K}_{l,q}$ has the $n$ elements $K_1, \ldots, K_n$. Let $c_{jk}$ denote the coefficient of the term $K_j$ in the vector $\Psi_k$ in its standard basis decomposition. The coefficients $c_{jk}$ thus form an $n$ by $n+1$ matrix $C$. The homogeneous system of linear equations $C\lambda = 0$ thus has a non-trivial solution $\lambda^0 = (\lambda^0_1, \ldots, \lambda^0_{n+1})^T$ for the vector $\lambda$. We then form the linear combination $\Psi = \sum_{i=1}^{n+1} \lambda^0_i \Psi_i$. Obviously, the coefficient of $K_j$ in the vector $\Psi$ in its normal form is just given by the $j$-th component of the vector $C\lambda^0$, which is trivial for $j = 1, \ldots, n$. Hence, the coefficients of $\Psi$ are trivial on the ordering kernel. On the other hand, $\Psi$ is a linear combination of singular vectors and therefore also satisfies the highest weight conditions with $N' = K$ just like the trivial vector 0. Due to Theorem 3.4 one immediately finds that $\Psi \equiv 0$ and therefore $\sum_{i=1}^{n+1} \lambda^0_i \Psi_i = 0$. This, however, is a non-trivial decomposition of 0, contradicting the assumption that $\Psi_1, \ldots, \Psi_{n+1}$ are linearly independent. □
4. Topological N = 2 Superconformal Verma Modules
We will now apply the construction developed in the previous section to the topological N = 2 superconformal algebra. We first introduce an adapted ordering on the basis of the N = 2 Verma modules. Consequently the size of the ordering kernel will reveal a maximum for the degrees of freedom of the singular vectors in the same N = 2 grade space. As the representation theory of the N = 2 superconformal algebras has different types of Verma modules we will see that the corresponding ordering kernels also allow different degrees of freedom.
The topological N = 2 superconformal algebra $T_2$ is a super Lie algebra which contains the Virasoro generators $L_m$ with trivial central extension$^9$, a Heisenberg algebra $H_m$ corresponding to the U(1) current, and the fermionic generators $G_m$ and $Q_m$, $m \in \mathbb{Z}$, corresponding to two anticommuting fields with conformal weights 2 and 1 respectively. $T_2$ satisfies the (anti-)commutation relations [7]
$[L_m, L_n] = (m - n) L_{m+n}$, $[H_m, H_n] = \frac{C}{3}\, m\, \delta_{m+n}$,
$[L_m, G_n] = (m - n) G_{m+n}$, $[H_m, G_n] = G_{m+n}$,
$[L_m, Q_n] = -n\, Q_{m+n}$, $[H_m, Q_n] = -Q_{m+n}$,
$[L_m, H_n] = -n\, H_{m+n} + \frac{C}{6}(m^2 + m)\,\delta_{m+n}$,
$\{G_m, Q_n\} = 2 L_{m+n} - 2n\, H_{m+n} + \frac{C}{3}(m^2 + m)\,\delta_{m+n}$,
$\{G_m, G_n\} = \{Q_m, Q_n\} = 0$, $m, n \in \mathbb{Z}$. (26)
The central term $C$ commutes with all other operators and can therefore be fixed again as $c \in \mathbb{C}$. $H_{T_2} = \mathrm{span}\{L_0, H_0, C\}$ defines a commuting subalgebra of $T_2$, which can therefore be diagonalised simultaneously. Generators with positive index span the set of positive operators $T_2^+$ of $T_2$ and likewise generators with negative index span the set of negative operators $T_2^-$ of $T_2$:
$T_2^+ = \mathrm{span}\{L_m, H_m, G_n, Q_n : m, n \in \mathbb{N}\}$, (27)
$T_2^- = \mathrm{span}\{L_{-m}, H_{-m}, G_{-n}, Q_{-n} : m, n \in \mathbb{N}\}$. (28)
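As a hedged illustration, the relations (26) can be tabulated in a few lines of Python. The representation of generators as pairs such as `("L", m)`, the key `"one"` for the identity operator, and the function name `bracket` are assumptions made for this sketch only. The function returns the commutator, or the anticommutator when both generators are fermionic, as a dictionary of coefficients.

```python
from fractions import Fraction

FERMIONS = {"G", "Q"}

def bracket(a, b, c):
    """(Anti)commutator of two T2 generators as {basis element: coefficient}.

    Generators are pairs like ("L", m); "one" labels the identity operator
    and c is the value of the central term C. Two fermions (G, Q) give the
    anticommutator, every other pair gives the commutator.
    """
    (x, m), (y, n) = a, b
    c = Fraction(c)
    delta = 1 if m + n == 0 else 0  # delta_{m+n}
    table = {
        ("L", "L"): {("L", m + n): m - n},
        ("H", "H"): {"one": c / 3 * m * delta},
        ("L", "G"): {("G", m + n): m - n},
        ("H", "G"): {("G", m + n): 1},
        ("L", "Q"): {("Q", m + n): -n},
        ("H", "Q"): {("Q", m + n): -1},
        ("L", "H"): {("H", m + n): -n, "one": c / 6 * (m * m + m) * delta},
        ("G", "Q"): {("L", m + n): 2, ("H", m + n): -2 * n,
                     "one": c / 3 * (m * m + m) * delta},
        ("G", "G"): {},
        ("Q", "Q"): {},
    }
    if (x, y) in table:
        result = table[(x, y)]
    else:
        # Reversed order: commutators are antisymmetric, anticommutators symmetric.
        sign = 1 if x in FERMIONS and y in FERMIONS else -1
        result = {k: sign * v for k, v in bracket(b, a, c).items()}
    # Drop vanishing coefficients so results can be compared directly.
    return {k: v for k, v in result.items() if v != 0}

# BRST-exactness of the energy-momentum tensor: {G_m, Q_0} = 2 L_m.
brst_ok = all(bracket(("G", m), ("Q", 0), 5) == {("L", m): 2} for m in range(-3, 4))
```

The same table confirms the grading used throughout this section: `bracket(("L", 0), ("Q", -2), c)` gives `{("Q", -2): 2}` (level 2) and `bracket(("H", 0), ("Q", -2), c)` gives `{("Q", -2): -1}` (charge −1).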
The zero modes are spanned by $T_2^0 = \mathrm{span}\{L_0, H_0, C, G_0, Q_0\}$ such that the generators $\{G_0, Q_0\}$ classify the different choices of Verma modules. $Q_0$ has the properties of a BRST-charge [7] so that the energy-momentum tensor is BRST-exact: $L_m = \frac{1}{2}\{G_m, Q_0\}$. Using Definition 3.1 a simultaneous eigenvector $|\Delta, q, c\rangle^N$ of $H_{T_2}$ with $L_0$ eigenvalue $\Delta$, $H_0$ eigenvalue $q$, $C$ eigenvalue$^{10}$ $c$, and vanishing $T_2^+$ action is called a highest weight vector. Each representation with lower bound for the eigenvalues of $L_0$ needs to contain a highest weight vector. Additional vanishing conditions $N$ are possible only with respect to the operators $G_0$ and $Q_0$, which may or may not annihilate a highest weight vector. The different types of annihilation conditions have been analysed in Ref. [20] with the following result. One can distinguish 4 different types of highest weight vectors $|\Delta, q\rangle^N$ labeled by a superscript $N \in \{G, Q, GQ\}$, or no superscript at all: highest weight vectors $|\Delta, q\rangle^{GQ}$ annihilated by both $G_0$ and $Q_0$ (chiral)$^{11}$, highest weight vectors $|\Delta, q\rangle^G$ annihilated by $G_0$ but not by $Q_0$ ($G_0$-closed), highest weight vectors $|\Delta, q\rangle^Q$ annihilated by $Q_0$ but not by $G_0$ ($Q_0$-closed), and finally highest weight vectors $|\Delta, q\rangle$ that are neither annihilated by $G_0$ nor by $Q_0$ (no-label).
9 Note our slightly different notation for the Virasoro generators $L_n$ in the topological N = 2 case.
10 For simplicity from now on we will suppress the eigenvalue of $C$ in $|\Delta, q, c\rangle^N$ and simply write $|\Delta, q\rangle^N$.
11 Chirality conditions are important for physics [30,7,19].
Table 4.1. Topological highest weight vectors
$|0, q\rangle$: $G_0|0, q\rangle \neq 0$ and $Q_0|0, q\rangle \neq 0$ (no-label)
$|\Delta, q\rangle^G$: $G_0|\Delta, q\rangle^G = 0$ and $Q_0|\Delta, q\rangle^G \neq 0$ ($G_0$-closed)
$|\Delta, q\rangle^Q$: $G_0|\Delta, q\rangle^Q \neq 0$ and $Q_0|\Delta, q\rangle^Q = 0$ ($Q_0$-closed)
$|0, q\rangle^{GQ}$: $G_0|0, q\rangle^{GQ} = 0$ and $Q_0|0, q\rangle^{GQ} = 0$ (chiral)
Since $2L_0 = G_0 Q_0 + Q_0 G_0$, a chiral vector, annihilated by both $G_0$ and $Q_0$, necessarily has vanishing $L_0$-eigenvalue. On the other hand, any highest weight vector $|\Delta, q\rangle$ that is neither annihilated by $G_0$ nor by $Q_0$ can be decomposed into $\frac{1}{2\Delta} G_0 Q_0 |\Delta, q\rangle + \frac{1}{2\Delta} Q_0 G_0 |\Delta, q\rangle$, provided $\Delta \neq 0$. In this case the whole representation decomposes into a direct sum of two submodules, one of them containing the $G_0$-closed highest weight vector $G_0 Q_0 |\Delta, q\rangle$ and the other one containing the $Q_0$-closed highest weight vector $Q_0 G_0 |\Delta, q\rangle$. Therefore, for no-label highest weight vectors, annihilated neither by $G_0$ nor by $Q_0$, we only need to consider the cases with $\Delta = 0$; i.e. the highest weight vectors that cannot be expressed as linear combinations of $G_0$-closed and $Q_0$-closed highest weight vectors. From now on no-label will refer exclusively to such highest weight vectors with $\Delta = 0$. Hence, according to Definition 3.1, we have the following 4 different types of topological Verma modules, as shown in Table 4.1:
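$V_{0,q} = U(T_2) \otimes_{H_{T_2} \oplus T_2^+} |0, q\rangle$, (29)
$V^G_{\Delta,q} = U(T_2) \otimes_{H_{T_2} \oplus T_2^+ \oplus \mathrm{span}\{G_0\}} |\Delta, q\rangle^G$, (30)
$V^Q_{\Delta,q} = U(T_2) \otimes_{H_{T_2} \oplus T_2^+ \oplus \mathrm{span}\{Q_0\}} |\Delta, q\rangle^Q$, (31)
$V^{GQ}_{0,q} = U(T_2) \otimes_{H_{T_2} \oplus T_2^+ \oplus \mathrm{span}\{G_0, Q_0\}} |0, q\rangle^{GQ}$. (32)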
The Verma modules of types $V^G_{\Delta,q}$ and $V^Q_{\Delta,q}$, based on $G_0$-closed or $Q_0$-closed highest weight vectors, are called [20] generic Verma modules, whereas the Verma modules of types $V_{0,q}$ and $V^{GQ}_{0,q}$ are called no-label and chiral Verma modules, respectively, for obvious reasons$^{12}$. For elements $Y$ of $T_2$ which are eigenvectors of $H_{T_2}$ with respect to the adjoint representation we define similarly to the Virasoro case the level $|Y|_L$ as $[L_0, Y] = |Y|_L Y$ and in addition the charge $|Y|_H$ as $[H_0, Y] = |Y|_H Y$. In particular, elements of the form
$Y = L_{-l_L} \ldots L_{-l_1} H_{-h_H} \ldots H_{-h_1} Q_{-q_Q} \ldots Q_{-q_1} G_{-g_G} \ldots G_{-g_1}$ (33)
and any reorderings of $Y$ have level $|Y|_L = \sum_{j=1}^{L} l_j + \sum_{j=1}^{H} h_j + \sum_{j=1}^{Q} q_j + \sum_{j=1}^{G} g_j$ and charge $|Y|_H = G - Q$. For these elements we shall also define their length $\|Y\| = L + H + G + Q$. Again, we shall set $|1|_L = |1|_H = \|1\| = 0$. For convenience we define the following sets of negative operators for $m \in \mathbb{N}$:
$\mathcal{L}_m = \{Y = L_{-l_L} \ldots L_{-l_1} : l_L \geq \ldots \geq l_1 \geq 2,\ |Y|_L = m\}$, (34)
$\mathcal{H}_m = \{Y = H_{-h_H} \ldots H_{-h_1} : h_H \geq \ldots \geq h_1 \geq 1,\ |Y|_L = m\}$, (35)
$\mathcal{G}_m = \{Y = G_{-g_G} \ldots G_{-g_1} : g_G > \ldots > g_1 \geq 2,\ |Y|_L = m\}$, (36)
$\mathcal{Q}_m = \{Y = Q_{-q_Q} \ldots Q_{-q_1} : q_Q > \ldots > q_1 \geq 2,\ |Y|_L = m\}$, (37)
$\mathcal{L}_0 = \mathcal{H}_0 = \mathcal{G}_0 = \mathcal{Q}_0 = \{1\}$. (38)
12 As explained in Ref. [20], the chiral Verma modules $V^{GQ}_{0,q}$, built on chiral highest weight vectors, are not complete Verma modules because the chirality constraint is not required (just allowed) by the algebra.
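The sets (34)–(38) are labelled by partitions of the level $m$: ordinary partitions with parts at least 2 (for $L$) or at least 1 (for $H$), and partitions into distinct parts at least 2 (for $G$ and $Q$). The following Python sketch enumerates them; the function name `mode_sets` and the tuple encoding (modes listed from left to right, so the rightmost operator has the smallest mode) are assumptions of this illustration, not notation from the text.

```python
def mode_sets(m, min_part, strict):
    """Yield tuples (m_k, ..., m_1) of modes with m_k >= ... >= m_1 >= min_part
    summing to m; strict=True forbids repeated modes (fermionic generators)."""
    def rec(remaining, smallest):
        if remaining == 0:
            yield ()  # the identity operator 1
            return
        for part in range(smallest, remaining + 1):
            next_smallest = part + 1 if strict else part
            for rest in rec(remaining - part, next_smallest):
                yield rest + (part,)
    yield from rec(m, min_part)

def size_L(m):  # Eq. (34): bosonic, modes >= 2
    return len(list(mode_sets(m, 2, False)))

def size_H(m):  # Eq. (35): bosonic, modes >= 1
    return len(list(mode_sets(m, 1, False)))

def size_G(m):  # Eq. (36); Eq. (37) for Q has the same form
    return len(list(mode_sets(m, 2, True)))

example = sorted(mode_sets(6, 2, False))
```

For instance, `example` lists the four elements of $\mathcal{L}_6$ as `(2, 2, 2)`, `(3, 3)`, `(4, 2)` and `(6,)`, i.e. $L_{-2}^3$, $L_{-3}^2$, $L_{-4}L_{-2}$ and $L_{-6}$. The level of an element is the sum of its tuple, and its length is the number of entries.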
We are now able to define a graded basis for the Verma modules as described in the previous section. We choose $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$ and define:
$S^G_{m,n} = \{Y = LHGQ : L \in \mathcal{L}_l,\ H \in \mathcal{H}_h,\ G \in \mathcal{G}_g,\ Q \in \mathcal{Q}_q,\ |Y|_L = m = l + h + g + q,\ |Y|_H = n = |G|_H + |Q|_H,\ l, h, g, q \in \mathbb{N}_0\}$, (39)
$S^Q_{m,n} = \{Y = LHQG : L \in \mathcal{L}_l,\ H \in \mathcal{H}_h,\ Q \in \mathcal{Q}_q,\ G \in \mathcal{G}_g,\ |Y|_L = m = l + h + g + q,\ |Y|_H = n = |Q|_H + |G|_H,\ l, h, g, q \in \mathbb{N}_0\}$. (40)
And finally for $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$:
$C^G_{m,n} = \{S_{p,q} L_{-1}^{m-p-r_1-r_2} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3} : S_{p,q} \in S^G_{p,q},\ p \in \mathbb{N}_0,\ r_1, r_2, r_3 \in \{0,1\},\ m - p - r_1 - r_2 \geq 0,\ n = q + r_1 - r_2 - r_3\}$, (41)
$C^Q_{m,n} = \{S_{p,q} L_{-1}^{m-p-r_1-r_2} Q_{-1}^{r_1} G_{-1}^{r_2} G_0^{r_3} : S_{p,q} \in S^Q_{p,q},\ p \in \mathbb{N}_0,\ r_1, r_2, r_3 \in \{0,1\},\ m - p - r_1 - r_2 \geq 0,\ n = q - r_1 + r_2 + r_3\}$. (42)
Thus, a typical element of $C^G_{m,n}$ is of the form
$Y = L_{-l_L} \ldots L_{-l_1} H_{-h_H} \ldots H_{-h_1} G_{-g_G} \ldots G_{-g_1} Q_{-q_Q} \ldots Q_{-q_1} L_{-1}^{m'} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$, (43)
$r_1, r_2, r_3 \in \{0,1\}$, such that $|Y|_L = m$ and $|Y|_H = n$. $S_{p,q}$ of $Y \in C^G_{m,n}$ or of $Y \in C^Q_{m,n}$ is called the leading part of $Y$ and is denoted by $Y^*$. Hence, one can define the following standard bases:
$B_{|\Delta,q\rangle^G} = \{Y|\Delta,q\rangle^G : Y \in C^G_{m,n},\ m \in \mathbb{N}_0,\ n \in \mathbb{Z}\}$,
$B_{|\Delta,q\rangle^Q} = \{Y|\Delta,q\rangle^Q : Y \in C^Q_{m,n},\ m \in \mathbb{N}_0,\ n \in \mathbb{Z}\}$, (44)
obtaining finally $V^G_{\Delta,q} = \mathrm{span}\{B_{|\Delta,q\rangle^G}\}$ and $V^Q_{\Delta,q} = \mathrm{span}\{B_{|\Delta,q\rangle^Q}\}$. For Verma modules built on no-label and chiral highest weight vectors, $V_{0,q}$ and $V^{GQ}_{0,q}$, both with $\Delta = 0$, one defines in exactly the same way:
$S_{m,n} = \{Y = LHGQ : L \in \mathcal{L}_l,\ H \in \mathcal{H}_h,\ G \in \mathcal{G}_g,\ Q \in \mathcal{Q}_q,\ |Y|_L = m = l + h + g + q,\ |Y|_H = n = |G|_H + |Q|_H,\ l, h, g, q \in \mathbb{N}_0\}$, (45)
$C_{m,n} = \{S_{p,q} L_{-1}^{m-p-r_1-r_2} G_{-1}^{r_1} Q_{-1}^{r_2} G_0^{r_4} Q_0^{r_3} : S_{p,q} \in S_{p,q},\ p \in \mathbb{N}_0,\ r_1, r_2, r_3, r_4 \in \{0,1\},\ m - p - r_1 - r_2 \geq 0,\ n = q + r_1 - r_2 + r_4 - r_3\}$, (46)
$C^{GQ}_{m,n} = \{S_{p,q} L_{-1}^{m-p-r_1-r_2} G_{-1}^{r_1} Q_{-1}^{r_2} : S_{p,q} \in S_{p,q},\ p \in \mathbb{N}_0,\ r_1, r_2 \in \{0,1\},\ m - p - r_1 - r_2 \geq 0,\ n = q + r_1 - r_2\}$. (47)
$S_{p,q}$ of $Y \in C_{m,n}$ or of $Y \in C^{GQ}_{m,n}$ will also be called the leading part $Y^*$ of $Y$. One obtains the following standard bases for the modules $V_{0,q}$ and $V^{GQ}_{0,q}$:
$B_{|0,q\rangle} = \{Y|0,q\rangle : Y \in C_{m,n},\ m \in \mathbb{N}_0,\ n \in \mathbb{Z}\}$,
$B_{|0,q\rangle^{GQ}} = \{Y|0,q\rangle^{GQ} : Y \in C^{GQ}_{m,n},\ m \in \mathbb{N}_0,\ n \in \mathbb{Z}\}$. (48)
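As a plausibility check on the basis sets for the $G_0$-closed module, the sizes of $C^G_{m,n}$ in Eq. (41) can be counted mechanically. The sketch below is an illustration under stated assumptions: the helper names `by_length`, `leading_parts` and `basis_counts` are hypothetical, and only the combinatorics of Eqs. (34)–(37), (39) and (41) is used (each $G$ mode carries charge $+1$, each $Q$ mode charge $-1$).

```python
from collections import Counter
from itertools import product

def by_length(m, min_part, strict):
    """Counter {k: number of ways to write m as k modes >= min_part};
    strict=True forbids repeated modes."""
    counts = Counter()
    def rec(remaining, smallest, k):
        if remaining == 0:
            counts[k] += 1
            return
        for part in range(smallest, remaining + 1):
            rec(remaining - part, part + 1 if strict else part, k + 1)
    rec(m, min_part, 0)
    return counts

def leading_parts(p):
    """Counter {charge: number of leading parts S^G_{p,charge}}, Eq. (39)."""
    out = Counter()
    for l in range(p + 1):
        for h in range(p + 1 - l):
            for g in range(p + 1 - l - h):
                q = p - l - h - g
                n_bos = (sum(by_length(l, 2, False).values())
                         * sum(by_length(h, 1, False).values()))
                if n_bos == 0:
                    continue
                for a, n_G in by_length(g, 2, True).items():
                    for b, n_Q in by_length(q, 2, True).items():
                        out[a - b] += n_bos * n_G * n_Q
    return out

def basis_counts(m):
    """Counter {n: size of C^G_{m,n}} following Eq. (41)."""
    out = Counter()
    for p in range(m + 1):
        for charge, size in leading_parts(p).items():
            for r1, r2, r3 in product((0, 1), repeat=3):
                if m - p - r1 - r2 >= 0:
                    out[charge + r1 - r2 - r3] += size
    return out

level_one = dict(basis_counts(1))
```

At level 1 this gives one basis element at charge $+1$, three at charge 0, three at charge $-1$ and one at charge $-2$, eight in total, which agrees with freely applying $L_{-1}$, $H_{-1}$, $G_{-1}$, $Q_{-1}$ with or without $Q_0$.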
Table 4.2. Topological singular vectors
$\Psi_{l,p}$: $G_0 \Psi_{l,p} \neq 0$ and $Q_0 \Psi_{l,p} \neq 0$; $l + \Delta = 0$ (no-label)
$\Psi^G_{l,p}$: $G_0 \Psi^G_{l,p} = 0$ and $Q_0 \Psi^G_{l,p} \neq 0$ ($G_0$-closed)
$\Psi^Q_{l,p}$: $G_0 \Psi^Q_{l,p} \neq 0$ and $Q_0 \Psi^Q_{l,p} = 0$ ($Q_0$-closed)
$\Psi^{GQ}_{l,p}$: $G_0 \Psi^{GQ}_{l,p} = 0$ and $Q_0 \Psi^{GQ}_{l,p} = 0$; $l + \Delta = 0$ (chiral)
The bases Eq. (44) and Eq. (48) are naturally $\mathbb{N}_0 \times \mathbb{Z}$ graded with respect to their $H_{T_2}$ eigenvalues relative to the eigenvalues $(\Delta, q)$ of the highest weight vector. For an eigenvector $\Psi_{l,p}$ of $H_{T_2}$ in $V^N_{\Delta,q}$ the $L_0$-eigenvalue is $\Delta + l$ and the $H_0$-eigenvalue is $q + p$ with $l \in \mathbb{N}_0$ and $p \in \mathbb{Z}$. We define the level $|\Psi_{l,p}|_L = l$ and charge $|\Psi_{l,p}|_H = p$. As in the Virasoro case, we shall use $C^G_{m,n}$, $C^Q_{m,n}$, $C_{m,n}$ and $C^{GQ}_{m,n}$ in order to define the normal form of an eigenvector $\Psi_{l,p}$ of $H_{T_2}$. It is defined to be the basis decomposition with respect to the corresponding standard bases Eq. (44) and Eq. (48). Again, we call the operators $X \in C^G_{l,p}$, or $X \in C^Q_{l,p}$, or $X \in C_{l,p}$, or $X \in C^{GQ}_{l,p}$ simply the terms of $\Psi_{l,p}$ and the coefficients $c_X$ its coefficients.
We introduce topological singular vectors according to Definition 3.2 as $H_{T_2}$ eigenvectors that are not proportional to the highest weight vector but are annihilated by $T_2^+$ and may also satisfy additional vanishing conditions with respect to the operators $G_0$ and $Q_0$. Therefore one also distinguishes singular vectors $\Psi_{l,p}$, $\Psi^G_{l,p}$, $\Psi^Q_{l,p}$ and $\Psi^{GQ}_{l,p}$ carrying the superscript $G$ and/or $Q$ depending on whether the singular vector is annihilated by $G_0$ and/or $Q_0$. Obviously one obtains similar restrictions on the eigenvalues of $\Psi^{N'}_{l,p} \in V^N_{\Delta,q}$ as for the highest weight vectors, as shown in Table 4.2. As there are 4 types of topological Verma modules and 4 types of topological singular vectors one might think of 16 different combinations of singular vectors in Verma modules. However, as will be explained later, no-label and chiral singular vectors exist neither in no-label Verma modules nor in chiral Verma modules (with one exception: chiral singular vectors at level 0 in no-label Verma modules). Most of these types of singular vectors are connected via the N = 2 topological spectral flow mappings [18,17,20] which have been analysed in detail in Refs. [17,20].
5. Adapted Orderings on Generic Verma Modules $V^G_{\Delta,q}$ and $V^Q_{\Delta,q}$
We will now introduce total orderings $O_G$ and $O_Q$ on $C^G_{m,n}$ and $C^Q_{m,n}$ respectively. For convenience, however, we shall first give an ordering on the sets $\mathcal{L}_m$, $\mathcal{H}_m$, $\mathcal{G}_m$ and $\mathcal{Q}_m$.
Definition 5.1. Let $\mathcal{Y}$ denote either $\mathcal{L}$, $\mathcal{H}$, $\mathcal{G}$, or $\mathcal{Q}$ (but the same throughout this definition) and take two elements $X_i \in \mathcal{Y}_{m_i}$ for $m_i \in \mathbb{N}_0$, $i = 1, 2$, such that $X_i = Z^i_{-m^i_{\|X_i\|}} \ldots Z^i_{-m^i_1}$, $|X_i|_L = m_i$ or $X_i = 1$, $i = 1, 2$, with $Z^i_{-m^i_j}$ being an operator of the type $L_{-m^i_j}$, $H_{-m^i_j}$, $G_{-m^i_j}$, or $Q_{-m^i_j}$ depending on whether $\mathcal{Y}$ denotes $\mathcal{L}$, $\mathcal{H}$, $\mathcal{G}$ or $\mathcal{Q}$ respectively. For $X_1 \neq X_2$ we compute the index$^{13}$ $j_0 = \min\{j : m^1_j - m^2_j \neq 0,\ j = 1, \ldots, \min(\|X_1\|, \|X_2\|)\}$. $j_0$ is, if non-trivial, the index for which the levels of the operators in $X_1$ and $X_2$ first disagree when read from the right to the left. For $j_0 > 0$ we then define
$X_1 <_{\mathcal{Y}} X_2$ if $m^1_{j_0} < m^2_{j_0}$. (49)
If, however, $j_0 = 0$, we set
$X_1 <_{\mathcal{Y}} X_2$ if $\|X_1\| > \|X_2\|$. (50)
For $X_1 = X_2$ we set
$X_1 <_{\mathcal{Y}} X_2$ and $X_2 <_{\mathcal{Y}} X_1$. (51)
13 For subsets of $\mathbb{N}$ we define $\min \emptyset = 0$.
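The comparison rule of Definition 5.1 can be sketched in Python. The encoding is an assumption of this illustration: an element is a tuple $(m_1, m_2, \ldots)$ of its modes read from right to left, and the hypothetical function `compare` returns $-1$, $0$ or $1$. The sketch reads the rule as follows: the first index at which the modes differ decides, with the smaller mode giving the smaller element; if no such index exists, the longer element is the smaller one.

```python
def compare(x1, x2):
    """Compare two elements of the same set (L, H, G or Q type).

    x1, x2 are tuples (m_1, m_2, ...) of modes read from right to left.
    Returns -1 if x1 is smaller, 1 if x1 is larger, 0 if they coincide.
    """
    # j runs over 1..min(len(x1), len(x2)); zip stops at the shorter tuple.
    for a, b in zip(x1, x2):
        if a != b:
            return -1 if a < b else 1
    # No disagreement found (j0 = 0): the longer element is the smaller one.
    if len(x1) == len(x2):
        return 0
    return -1 if len(x1) > len(x2) else 1

# H_{-1}^3 versus H_{-2} H_{-1}: modes read from the right are (1, 1, 1) and (1, 2).
order = compare((1, 1, 1), (1, 2))
```

Here `order` is `-1`: the two elements first disagree at the second position, where $1 < 2$, so $H_{-1}^3$ is the smaller element.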
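We can now define an ordering on $C^G_{m,n}$ which will turn out to be adapted with a very small kernel.
Definition 5.2. On the set $C^G_{m,n}$ we introduce the total ordering $O_G$. For two elements $X_1, X_2 \in C^G_{m,n}$, $X_1 \neq X_2$, with $X_i = L^i H^i G^i Q^i L_{-1}^{k^i} G_{-1}^{r^i_1} Q_{-1}^{r^i_2} Q_0^{r^i_3}$, $L^i \in \mathcal{L}_{l_i}$, $H^i \in \mathcal{H}_{h_i}$, $G^i \in \mathcal{G}_{g_i}$, $Q^i \in \mathcal{Q}_{q_i}$ for some $l_i, h_i, g_i, q_i, k^i, r^i_1, r^i_2, r^i_3 \in \mathbb{N}_0$, $i = 1, 2$, such that $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$, we define
$X_1 <_G X_2$ if $k^1 > k^2$. (52)
For $k^1 = k^2$ we set
$X_1 <_G X_2$ if $r^1_1 + r^1_2 > r^2_1 + r^2_2$. (53)
If $r^1_1 + r^1_2 = r^2_1 + r^2_2$, then we set
$X_1 <_G X_2$ if $Q^1 <_{\mathcal{Q}} Q^2$. (54)
In the case where also $Q^1 = Q^2$ we define
$X_1 <_G X_2$ if $G^1 <_{\mathcal{G}} G^2$. (55)
If even $G^1 = G^2$ we then define
$X_1 <_G X_2$ if $L^1 <_{\mathcal{L}} L^2$. (56)
If further $L^1 = L^2$ we set
$X_1 <_G X_2$ if $H^1 <_{\mathcal{H}} H^2$, (57)
which finally has to give an answer. For $X_1 = X_2$ we define $X_1 <_G X_2$ and $X_2 <_G X_1$.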
Definition 5.2 is well-defined since one obtains an answer for any pair $X_1, X_2 \in C^G_{m,n}$, $X_1 \neq X_2$, after going through Eqs. (52)–(57), and hence the ordering $O_G$ proves to be a total ordering on $C^G_{m,n}$. Namely, if Eqs. (52)–(57) do not give an answer on the ordering of $X_1$ and $X_2$, then obviously $X_1$ and $X_2$ are of the form $X_i = LHGQ\, L_{-1}^{k} G_{-1}^{r^i_1} Q_{-1}^{r^i_2} Q_0^{r^i_3}$, with common $L$, $H$, $G$, $Q$, $k$ and also $r^1_1 + r^1_2 = r^2_1 + r^2_2$. The fact that both $X_1$ and $X_2$ have charge $n$ implies $r^1_1 - r^1_2 - r^1_3 = r^2_1 - r^2_2 - r^2_3$, and using $r^1_1 + r^1_2 = r^2_1 + r^2_2$ one obtains $2r^1_2 + r^1_3 = 2r^2_2 + r^2_3$. But this equation has solutions from $\{0,1\}$ only for $r^1_2 = r^2_2$ and consequently also $r^1_3 = r^2_3$, and hence $X_1 = X_2$.
Obviously the $O_G$-smallest element of $C^G_{m,0}$ is $L_{-1}^m$ followed by $L_{-1}^{m-1} G_{-1} Q_0$, whilst the $O_G$-smallest element of $C^G_{m,-1}$ is $L_{-1}^m Q_0$ followed by $L_{-1}^{m-1} Q_{-1}$. Similarly, for $C^G_{m,1}$ we find $L_{-1}^{m-1} G_{-1}$ as $O_G$-smallest element followed by $L_{-1}^{m-2} G_{-1}$. We will now show that the ordering $O_G$ is adapted and we will compute the ordering kernels. We do not give a theoretical proof that these kernels are the smallest possible ordering kernels. However, we shall refer later to explicit examples of singular vectors that show that for general values of $\Delta$ and $q$ most of the ordering kernels presented here cannot be smaller.
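The last step of the well-definedness argument, that the sum $r_1 + r_2$ together with the charge contribution $r_1 - r_2 - r_3$ fixes the triple $(r_1, r_2, r_3) \in \{0,1\}^3$, is easy to confirm by brute force. The following Python sketch is only an illustration; the variable names are assumptions.

```python
from itertools import product

# Map each pair (r1 + r2, r1 - r2 - r3) to the triples that produce it.
triples_by_key = {}
for r1, r2, r3 in product((0, 1), repeat=3):
    key = (r1 + r2, r1 - r2 - r3)
    triples_by_key.setdefault(key, []).append((r1, r2, r3))

# The argument in the text requires every pair to come from exactly one triple.
is_unique = all(len(triples) == 1 for triples in triples_by_key.values())
```

All eight triples give distinct pairs, so `is_unique` is `True`: two elements with common $L$, $H$, $G$, $Q$, $k$, equal $r_1 + r_2$ and equal charge must coincide.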
Theorem 5.3. If the central extension satisfies $c \neq 3$, then the ordering $O_G$ is adapted to $C^G_{m,n}$ for all Verma modules $V^G_{\Delta,q}$ and for all grades $(m,n)$ with $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$. Ordering kernels are given by the following tables for all levels$^{14}$ $m$, depending on the set of annihilation operators and depending on the charge $n$.
Table 5.1. Ordering kernels for $O_G$, annihilation operators $T_2^+$
n = +1: $\{L_{-1}^{m-1} G_{-1}\}$
n = 0: $\{L_{-1}^{m},\ H_{-1} L_{-1}^{m-1},\ L_{-1}^{m-1} G_{-1} Q_0\}$
n = −1: $\{L_{-1}^{m} Q_0,\ H_{-1} L_{-1}^{m-1} Q_0,\ L_{-1}^{m-1} Q_{-1}\}$
n = −2: $\{L_{-1}^{m-1} Q_{-1} Q_0\}$
Table 5.2. Ordering kernels for $O_G$, annihilation operators $T_2^+$ and $G_0$
n = +1: $\{L_{-1}^{m-1} G_{-1}\}$
n = 0: $\{L_{-1}^{m},\ L_{-1}^{m-1} G_{-1} Q_0\}$
n = −1: $\{L_{-1}^{m} Q_0\}$
Table 5.3. Ordering kernels for $O_G$, annihilation operators $T_2^+$ and $Q_0$
n = 0: $\{L_{-1}^{m}\}$
n = −1: $\{L_{-1}^{m} Q_0,\ L_{-1}^{m-1} Q_{-1}\}$
n = −2: $\{L_{-1}^{m-1} Q_{-1} Q_0\}$
14 Note that for levels $m = 0$ and $m = 1$ some of the kernel elements obviously do not exist.
Table 5.4. Ordering kernels for $O_G$, annihilation operators $T_2^+$, $Q_0$, and $G_0$
n = 0: $\{L_{-1}^{m}\}$
n = −1: $\{L_{-1}^{m} Q_0\}$
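Combined with Theorem 3.6, the kernels of Tables 5.1–5.4 bound the number of linearly independent singular vectors at a given charge. The following Python sketch collects the tables in a lookup; the dictionary layout, the string labels for the kernel elements and the function name `kernel_size` are assumptions of this illustration, and the sizes are those for levels high enough that all listed elements exist.

```python
# Kernel elements per charge n, keyed by the annihilation operators that are
# imposed in addition to T2^+ (labels are plain strings for illustration).
KERNELS = {
    frozenset(): {
        +1: ["L_{-1}^{m-1} G_{-1}"],
        0: ["L_{-1}^m", "H_{-1} L_{-1}^{m-1}", "L_{-1}^{m-1} G_{-1} Q_0"],
        -1: ["L_{-1}^m Q_0", "H_{-1} L_{-1}^{m-1} Q_0", "L_{-1}^{m-1} Q_{-1}"],
        -2: ["L_{-1}^{m-1} Q_{-1} Q_0"],
    },
    frozenset({"G0"}): {
        +1: ["L_{-1}^{m-1} G_{-1}"],
        0: ["L_{-1}^m", "L_{-1}^{m-1} G_{-1} Q_0"],
        -1: ["L_{-1}^m Q_0"],
    },
    frozenset({"Q0"}): {
        0: ["L_{-1}^m"],
        -1: ["L_{-1}^m Q_0", "L_{-1}^{m-1} Q_{-1}"],
        -2: ["L_{-1}^{m-1} Q_{-1} Q_0"],
    },
    frozenset({"G0", "Q0"}): {
        0: ["L_{-1}^m"],
        -1: ["L_{-1}^m Q_0"],
    },
}

def kernel_size(extra_annihilators, n):
    """Number of kernel elements at charge n; charges not listed give 0."""
    table = KERNELS[frozenset(extra_annihilators)]
    return len(table.get(n, []))

bound_no_label = kernel_size((), 0)  # 3
```

For example, `kernel_size(("G0",), 0)` is 2, so at charge 0 there can be at most two linearly independent $G_0$-closed singular vectors at a given level in $V^G_{\Delta,q}$.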
Charges that do not appear in the tables have trivial ordering kernels. As in the Virasoro case our strategy will be to find annihilation operators that are able to produce an additional $L_{-1}$. Hence, we raise the term in question to the class of terms with one additional $L_{-1}$ and try to prove that terms that can also be raised to this class of terms have to be $O_G$-smaller. That we need to focus only on operators that create $L_{-1}$ from the leading part of a term is a consequence of the following theorem, which we therefore shall prove before starting the proof of Theorem 5.3.
Theorem 5.4. Let us assume that there exists an annihilation operator $\Gamma$ that creates a term $X^\Gamma$ with $n + 1$ operators $L_{-1}$ by acting on
$X_0 = X_0^* L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3} \in C^G_{m_0,n_0}$ (58)
with $X_0^* = L^0 H^0 G^0 Q^0$,
$L^0 = L_{-m^L_{\|L^0\|}} \ldots L_{-m^L_1} \in \mathcal{L}_l$, $H^0 = H_{-m^H_{\|H^0\|}} \ldots H_{-m^H_1} \in \mathcal{H}_h$,
$G^0 = G_{-m^G_{\|G^0\|}} \ldots G_{-m^G_1} \in \mathcal{G}_g$, $Q^0 = Q_{-m^Q_{\|Q^0\|}} \ldots Q_{-m^Q_1} \in \mathcal{Q}_q$,
$l, h, g, q \in \mathbb{N}_0$ and $r_1, r_2, r_3 \in \{0,1\}$. Let us further assume that this additional $L_{-1}$ is […] $C^G_{m_0,n_0}$ with $X_0 <_G$ […]
Proof. Let us show that if $\Gamma$ does not create one additional $L_{-1}$ by commuting through $Y^*$ but still satisfies that it also creates the term $X^\Gamma$ acting on $Y$, then $Y <_G X_0$. […] $Y^*$ or from $G_{-1}^{r^Y_1} Q_{-1}^{r^Y_2} Q_0^{r^Y_3}$. For the latter case there are a few possibilities to create $L_{-1}$: only under the action of $G_0$, $Q_0$, $L_1$, or $H_1$, and depending on the values of $r^Y_1$, $r^Y_2$, and
$r^Y_3$. The operator $\Gamma$ could be one of these operators or the commutation of $\Gamma$ with $Y^*$ could produce one of them. (Note that it is not possible to create one of these operators and to create in addition an $L_{-1}$ from $Y^*$.) If now $G_0$, $Q_0$, $L_1$, or $H_1$ creates an $L_{-1}$ from $G_{-1}^{r^Y_1} Q_{-1}^{r^Y_2} Q_0^{r^Y_3}$, then we find in each case that at least one of $r^Y_1$ or $r^Y_2$ changes from 1 to 0 whilst $r^Y_3$ remains unchanged. But $X^\Gamma$ has the same $r_i$ as $X_0$ for $i = 1, 2, 3$. One therefore deduces that $r^Y_1 + r^Y_2 > r_1 + r_2$ and thus $Y <_G X_0$. □
Equipped with Theorem 5.4 we can now proceed with the proof of Theorem 5.3. Let us consider the term
$X_0 = L^0 H^0 G^0 Q^0 L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3} \in C^G_{m_0,n_0}$, (59)
with $L^0$, $H^0$, $G^0$ and $Q^0$ given above. We construct the vector $\Psi^0 = X_0 |\Delta, q\rangle^G \in V^G_{\Delta,q}$ at level $|X_0|_L = m_0$ and charge $|X_0|_H = n_0$. Let us first consider the annihilation operators to be those in $U(T_2)^+$ only. If $Q^0 \neq 1$ we act with $G_{m^Q_1 - 1} \in T_2^+$ on $\Psi^0$ and write the result again in its normal form. We will thus obtain a non-trivial term in $G_{m^Q_1 - 1} \Psi^0$ with one additional $L_{-1}$:
$X^Q = L^0 H^0 G^0 \tilde Q^0 L_{-1}^{n+1} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$, $\tilde Q^0 = Q_{-m^Q_{\|Q^0\|}} \ldots Q_{-m^Q_2}$ (60)
or, if $\|Q^0\| = 1$, $\tilde Q^0 = 1$, simply by commuting $G_{m^Q_1 - 1}$ with $Q_{-m^Q_1}$ which produces the additional operator $L_{-1}$. Any other term $Y \in C^G_{m_0,n_0}$ also producing $X^Q$ under the action of $G_{m^Q_1 - 1} \in T_2^+$ and being $O_G$-bigger than $X_0$ also needs to create one $L_{-1}$ by commuting $G_{m^Q_1 - 1}$ with $Y^*$ due to Theorem 5.4. We can therefore focus on terms $Y = L^Y H^Y G^Y Q^Y L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$. One finds that by commuting $G_{m^Q_1 - 1}$ with operators in $L^Y$ or $H^Y$ one can only produce terms of the form $G_{m'}$ with $m' < m^Q_1 - 1$. Therefore, in order to create subsequently the operator $L_{-1}$ from $G_{m'}$ or directly from $G_{m^Q_1 - 1}$, $Y$ needs to contain an operator of the form $Q_{-m^* - 1}$ that satisfies$^{15}$ $0 < m^* + 1 \leq m^Q_1$. If $m^* + 1 < m^Q_1$ one finds that $Y$ is $O_G$-smaller than $X_0$, as the equation deciding on the ordering of $X_0$ and $Y$ would in this case be Eq. (54). If $m^* + 1 = m^Q_1$, on the other hand, $Y$ must be necessarily equal to $X_0$. Note that $G_{m^Q_1 - 1}$ and $G_{m'}$ simply anticommute with operators of $G^Y$ and therefore cannot create any $L_{-1}$. Hence there are no terms $Y$ $O_G$-bigger than the terms $X_0$ producing the same terms $X^Q$ under the action of $G_{m^Q_1 - 1} \in T_2^+$. We have therefore shown that the ordering $O_G$ is adapted on the set of terms $X_0$ of the form given by Eq. (59) with $Q^0 \neq 1$ for all grades $(m_0, n_0)$ and all central terms $c \in \mathbb{C}$.
As the terms $X_0$ with $Q^0 \neq 1$ are now proven to be adapted, next we will consider the terms $X_0$ with $Q^0 = 1$ and $G^0 \neq 1$:
$X_0 = L^0 H^0 G^0 L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$. (61)
15 Note that for $m^* = 0, -1$, $Q_{-m^* - 1}$ would not be in the leading part $Y^*$.
If $G^0 \neq 1$ we act with the annihilation operator $Q_{m^G_1 - 1}$ on $\Psi^0 = X_0 |\Delta, q\rangle^G$. This produces the term
$X^G = L^0 H^0 \tilde G^0 L_{-1}^{n+1} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$ (62)
with $\tilde G^0 = G_{-m^G_{\|G^0\|}} \ldots G_{-m^G_2}$ or, if $\|G^0\| = 1$, $\tilde G^0 = 1$. Again, any other term $Y$ with $X_0 <_G Y$ […] operators $Y$ of the form $Y = L^Y H^Y G^Y Q^Y L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$ with $Q^Y = 1$ as otherwise $Y <_G X_0$ […] the form $Q_{m'}$ with $m' < m^G_1 - 1$. The operators $Q_{m^G_1 - 1}$ and $Q_{m'}$ can create $L_{-1}$ from $G^Y$ only if $G^Y$ contains $G_{-m' - 1}$ with $m' + 1 \leq m^G_1$, and therefore $Y$ is again $O_G$-smaller than or equal to $X_0$. Commuting $Q_m$ through $G^Y$ can also give rise to operators of the form $L_p$, $H_p$ and consequently even to $G_p$ with $p < m^G_1 - 1$. In order to create $L_{-1}$ from $Q^Y$ it would require that $Q^Y \neq 1$ so that one again finds $Y <_G X_0$ […]
$X_0 = L^0 H^0 L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$. (63)
If $L^0 \neq 1$ we act with the annihilation operator $L_{m^L_1 - 1} \in T_2^+$ on $\Psi^0 = X_0 |\Delta, q\rangle^G$. This produces a term of the form
$X^L = \tilde L^0 H^0 L_{-1}^{n+1} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$ (64)
with $\tilde L^0 = L_{-m^L_{\|L^0\|}} \ldots L_{-m^L_2}$ or, if $\|L^0\| = 1$, $\tilde L^0 = 1$. If $m^L_2 = m^L_1$ we may simply obtain multiple copies of the same term $X^L$. Again Theorem 5.4 allows us to focus on $O_G$-bigger terms $Y$ of the form $Y = L^Y H^Y L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3} \in C^G_{m_0,n_0}$ (in addition, $G^Y \neq 1$ or $Q^Y \neq 1$ would lead to $Y <_G X_0$). […] $H^Y$ cannot create any $L_{-1}$ and obviously, following the arguments of the Virasoro case (proof of Theorem 2.4), terms $Y$ that produce $X^L$ creating $L_{-1}$ out of $L^Y$ would again be $O_G$-smaller than $X_0$. Therefore we can state that the ordering $O_G$ is adapted on the set of terms $X_0$ of the form given by Eq. (59) with $Q^0 \neq 1$, or $G^0 \neq 1$, or $L^0 \neq 1$ for all grades $(m_0, n_0)$ and all central terms $c \in \mathbb{C}$. We are thus left with terms $X_0$ of the form
$X_0 = H^0 L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$. (65)
At this stage it is not possible to create operators $L_{-1}$ by acting directly with positive operators of $T_2^+$ on $H^0$. Therefore we cannot use Theorem 5.4 any further, which has proven to be very fruitful so far, and a different strategy must be applied. Let us first assume that $H^0$ contains operators other than $H_{-1}$ and let $j_0 \in \mathbb{N}$ be the smallest index of $H^0$ such that $m^H_{j_0} \neq 1$. There are four different cases to study depending on the values of $r_1$, $r_2$ and $r_3$. Let us start with the cases where $r_2 = 0$. Acting with$^{16}$ $G_{m^H_{j_0}} Q_{-1} \in U(T_2)^+$
16 At this stage of the proof we need the annihilation operators to be from the universal enveloping algebra. Therefore, Definition 3.3 is slightly modified compared to Definition 2.1.
on $\Psi^0 = X_0 |\Delta, q\rangle^G$ and writing the result in its normal form, one obtains a non-trivial term
$X^H = \tilde H^0 L_{-1}^{n+1} G_{-1}^{r_1} Q_0^{r_3}$, $\tilde H^0 = H_{-m^H_{\|H^0\|}} \ldots H_{-m^H_{j_0+1}} H_{-1}^{j_0 - 1}$ (66)
or, if $\|H^0\| = j_0$, $\tilde H^0 = H_{-1}^{j_0 - 1}$ (commuting $Q_{-1}$ with $H_{-m^H_{j_0}}$ produces $Q_{-m^H_{j_0} - 1}$ and subsequently the commutation with $G_{m^H_{j_0}}$ produces $L_{-1}$). If $m^H_{j_0+1} = m^H_{j_0}$ one simply obtains multiple copies of $X^H$. Now we must show that any other term $Y \in C^G_{m_0,n_0}$ also producing $X^H$ under the action of $G_{m^H_{j_0}} Q_{-1}$ is $O_G$-smaller than $X_0$. Just like in Theorem 5.4, if $Y$ already has $n + 1$ or more operators $L_{-1}$, then $Y <_G X_0$ […] $Y$. We therefore take first $Y = H^Y L_{-1}^{n} G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3} \in C^G_{m_0,n_0}$. If $G_{m^H_{j_0}} Q_{-1}$ produces one $L_{-1}$ by commuting through $H^Y$ and leaves $r_1$, $r_2$ and $r_3$ unchanged one finds $Y <_G X_0$ […] the other hand, $Q_{-1}$ acting on $G_{-1}^{r^Y_1} Q_{-1}^{r^Y_2} Q_0^{r^Y_3}$ cannot create any $L_{-1}$ but it could change $r^Y_2$ from 0 to 1 or $r^Y_1$ from 1 to 0. The first case would not produce $X^H$ as we assumed $r_2 = 0$ and in addition $G_{m^H_{j_0}}$ cannot create $L_{-1}$ from $H^Y$. The latter case can produce $X^H$ but only for $r_1 = 0$ and $r^Y_1 = r^Y_2 = 1$ ($G_2 Q_{-1}$ creating $L_{-1}$ from $G_{-1}^{r^Y_1} Q_{-1}^{r^Y_2} Q_0^{r^Y_3}$). Thus $r^Y_1 + r^Y_2 = 2 > 0 = r_1 + r_2$ resulting in $Y <_G X_0$ […] the term $X^H$ from a term $Y$ $O_G$-bigger than $X_0$. In particular, the only way $G_{-1}$ could change the triple $(r^Y_1, r^Y_2, r^Y_3)$ is by changing $r^Y_1$ from 0 to 1, which is not allowed as $r_1 = 0$, and in addition $L_{-1}$ could not be created by $Q_{m^H_{j_0}}$ acting on $H^Y$. For the case that both $r_1 = r_2 = 1$ let us first assume that $r_3 = 0$. By acting with $G_{m^H_{j_0} - 1} Q_0 \in U(T_2)^+$ one produces $L_{-1}$ from $H_{-m^H_{j_0}}$ in a similar way as before. $Q_0$ can change the triple $(r^Y_1, r^Y_2, r^Y_3)$ in two ways: it can change $r^Y_3$ from 0 to 1 or it could change $r^Y_1$ from 1 to 0. The first case, however, would not lead to the term $X^H$ as $r_3 = 0$. The latter case can only lead to $X^H$ if $G_{m^H_{j_0} - 1}$ can be converted into $G_{-1}$ which requires $H^Y$ […] Therefore we find $Y <_G X_0$ […] case $r^Y_3$ can only be changed from 1 to 0, which does not lead to $X^H$, and $r^Y_2$ can only be changed from 1 to 0, which requires $Q_{m^H_{j_0} - 1}$ to be converted into $Q_{-1}$ in order to obtain $X^H$, resulting again in $Y <_G X_0$ […]
action of $H_1$ rather concerns $G_{-1} Q_{-1}$ as this combination of operators is necessarily needed in order to create $L_{-1}$ under the action of $H_1$. If we take $r_1 = r_2 = 1$, i.e. $X_0 = H_{-1}^{m^H} L_{-1}^n G_{-1} Q_{-1} Q_0^{r_3}$, then the action of $H_1$ creates an additional $L_{-1}$ in the only possible way that $H_1$ can create $L_{-1}$, producing a term $H_{-1}^{m^H} L_{-1}^{n+1} Q_0^{r_3}$ that cannot be obtained from any other term $Y$ $O_G$-bigger than $X_0$. Thus elements of the ordering kernel are of the form $H_{-1}^{m^H} L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$ with $r_1 + r_2 < 2$ for all grades and all central terms $c \in \mathbb{C}$.

At this stage all restrictions on the ordering kernel arising from operators in $T_2^+$ which create an additional $L_{-1}$ have been used. One might think that the smallest ordering kernel has been found. However we will now show that, considering the action of two annihilation operators at the same time, we can still reduce the ordering kernel at least for central terms $c \neq 3$. Let us consider the case $X_0 = H_{-1}^{m^H} L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$ with $m^H \neq 0$ and $r_1 + r_2 < 2$. The action of $H_1 \in T_2^+$ on $\Psi_0 = X_0 |\Delta, q\rangle^G$ creates, provided $c \neq 0$, a non-trivial term of the form $X^H = H_{-1}^{m^H-1} L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}$ with one $H_{-1}$ removed but no new $L_{-1}$ created. Furthermore, as $r_1 + r_2 < 2$, $H_1$ cannot create any $L_{-1}$. Now, depending on $r_1$, $r_2$ and $r_3$ it may be possible to find terms $Y$ $O_G$-bigger than $X_0$ that do not create $L_{-1}$ but still generate $X^H$ under $H_1$. In the case $r_1 = 1$, $r_2 = 0$ one finds that $Y = H_{-1}^{m^H-1} G_{-2} L_{-1}^n Q_0^{r_3}$ is the only such term. Thus, $X_0$ and $Y$ both create the same term $X^H = H_{-1}^{m^H-1} L_{-1}^n G_{-1} Q_0^{r_3}$ under the action of $H_1$ with coefficients $m^H \frac{c}{3}$ and 1 respectively$^{17}$, with $X_0$
$(H_1 + Q_1) X_0 = m^H \frac{c}{3} X^H + 2 m^H \tilde{X}^H + \ldots,$
$(H_1 + Q_1) Y = X^H + 2 \tilde{X}^H + \ldots,$   (67)
where "..." denotes terms that are irrelevant for us$^{19}$. We now alter the standard basis of the normal form by defining new basis terms $X^1 = m^H \frac{c}{3} X^H + 2 m^H \tilde{X}^H$ and $X^2 = X^H + 2 \tilde{X}^H$. If this change of basis is possible the action of $H_1 + Q_1$ on $X_0$ thus yields a term $X^1$ that cannot be produced from any other term $Y$ unless $Y$ is $O_G$-smaller than $X_0$. The change of basis is possible if the determinant of the coefficients is non-trivial, which requires $m^H > 0$ and $c \neq 3$. In the case $r_1 = 0$, $r_2 = 1$ we can repeat exactly the same procedure with $Y = H_{-1}^{m^H-1} Q_{-2} L_{-1}^n Q_0^{r_3}$, $\tilde{X}^H = H_{-1}^{m^H-1} L_{-1}^{n+1} Q_0^{r_3}$ and $Q_1$ replaced by $G_1$. Equations (67) turn in this case into
$(H_1 + G_1) X_0 = m^H \frac{c}{3} X^H - 2 m^H \tilde{X}^H + \ldots,$
$(H_1 + G_1) Y = -X^H + 2 \tilde{X}^H + \ldots,$   (68)
and thus result in exactly the same conditions from the determinant: $m^H > 0$ and $c \neq 3$.

17 There are certainly other terms that are $O_G$-smaller than $X_0$ and create $X^H$, such as $H_{-1}^{m^H-1} L_{-1}^n G_{-1} Q_{-1}$ for $r_2 = 0$ and $r_3 = 1$. But as before, $O_G$-smaller terms are not relevant due to Definition 3.3.
18 Note that we have used the action of $Q_1$ before to rule out $Y$ in the ordering kernel.
19 Note that this is consistent for $c = 0$.
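The determinant conditions attached to Eqs. (67) and (68) can be checked with a few lines of exact arithmetic. The following is only a minimal sketch and not part of the original derivation: the function names are hypothetical, and each row of the 2x2 matrices is assumed to hold the coefficients of $X^H$ and $\tilde{X}^H$ as read off from Eqs. (67) and (68).

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det_eq67(m_h, c):
    # Rows: coefficients of (X^H, X~^H) in (H_1 + Q_1) X_0 and (H_1 + Q_1) Y.
    return det2(m_h * c / 3, 2 * m_h, 1, 2)

def det_eq68(m_h, c):
    # Rows: coefficients of (X^H, X~^H) in (H_1 + G_1) X_0 and (H_1 + G_1) Y.
    return det2(m_h * c / 3, -2 * m_h, -1, 2)

# Sample values only (an assumption of this sketch); c is kept exact.
for m_h in range(0, 4):
    for c in (Fraction(0), Fraction(3), Fraction(9), Fraction(1, 2)):
        expected = 2 * m_h * (c / 3 - 1)
        assert det_eq67(m_h, c) == expected
        assert det_eq68(m_h, c) == expected
```

In both cases the determinant equals $2 m^H (\frac{c}{3} - 1)$, so it vanishes only for $m^H = 0$ or $c = 3$, in agreement with the conditions $m^H > 0$ and $c \neq 3$ quoted in the text.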
Finally in the case $r_1 = r_2 = 0$ we are left with $X_0$ of the form

$X_0 = H_{-1}^{m^H} L_{-1}^n Q_0^{r_3}$.   (69)

For $m^H \geq 1$ one finds that the action of $H_1$ on $X_0$ and on $Y = H_{-1}^{m^H-1} L_{-1}^{n-1} G_{-1} Q_{-1} Q_0^{r_3}$ produces a term $X^H = H_{-1}^{m^H-1} L_{-1}^n Q_0^{r_3}$ with coefficients $\frac{c}{3} m^H$ and 2 respectively. As $X_0$
and thus result in the conditions: $m^H \geq 2$ and $c \neq 3$. Therefore the kernel of the ordering $O_G$ is given by

$C^K_{m_0,n_0} = \{ L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3}, \; H_{-1} L_{-1}^n Q_0^{r_3} : r_1 + r_2 < 2, \; m_0 = n + r_1 + r_2, \; n_0 = r_1 - r_2 - r_3 \},$

for all grades $(m_0, n_0)$ and all central terms $c \in \mathbb{C}$ with $c \neq 3$. This proves the results shown in Table 5.1.

For $G_0$-closed vectors $G_0$ is also in the set of annihilation operators. In this case the action of $G_0$ on $X_0$ of the form $H_{-1}^{m^H} L_{-1}^n G_{-1}^{r_1} Q_{-1} Q_0^{r_3}$ produces the term $H_{-1}^{m^H} L_{-1}^{n+1} G_{-1}^{r_1} Q_0^{r_3}$ that cannot be obtained from any other term $Y$ $O_G$-bigger than $X_0$ (commuting with $Q_{-1}$ is the only way to produce $L_{-1}$ acting with $G_0$). Thus, the ordering kernel contains no terms with $Q_{-1}$ for all complex values of $c$. If we now take $X_0 = H_{-1} L_{-1}^n Q_0^{r_3}$ we find the (unique) $O_G$-bigger term $Y = L_{-1}^{n-1} G_{-1} Q_{-1} Q_0^{r_3}$, both producing the terms $L_{-1}^n G_{-1} Q_0^{r_3}$ and $L_{-1}^n Q_0^{r_3}$ under the action of $G_0$ and $H_1$ respectively$^{20}$. As a result $H_{-1}$ can also be removed by changing the basis suitably provided $c \neq 3$. The determinant of the coefficients results in:

$\begin{vmatrix} -1 & -2 \\ \frac{c}{3} & 2 \end{vmatrix} = 2 \left( \frac{c}{3} - 1 \right),$   (71)

which is again non-trivial for $c \neq 3$. This proves the results shown in Table 5.2.

In a completely analogous way one can remove the terms containing $G_{-1}$ or $H_{-1}$ in the case of $Q_0$-closed singular vectors. The former can be done by acting with $Q_0$ on $X_0 = H_{-1}^{m^H} L_{-1}^n G_{-1} Q_{-1}^{r_2} Q_0^{r_3}$, creating the term $H_{-1}^{m^H} L_{-1}^{n+1} Q_{-1}^{r_2} Q_0^{r_3}$, whilst the latter is achieved from the action of $Q_0$ and $H_1$ on the same $X_0$ and $Y$ as above. The determinant of the coefficients is again non-trivial for $c \neq 3$:

$\begin{vmatrix} 1 & 2 \\ \frac{c}{3} & 2 \end{vmatrix} = -2 \left( \frac{c}{3} - 1 \right).$   (72)

20 The basis we have to choose in this case is not $L_0$ graded.
The results are shown in Table 5.3. Finally, combining our considerations for $G_0$-closed singular vectors and $Q_0$-closed singular vectors we obtain that the ordering kernel for the case of chiral singular vectors does not contain any operators of the form $Q_{-1}$, $G_{-1}$ or $H_{-1}$. This proves the results shown in Table 5.4 and finally completes the proof of Theorem 5.3.

By interchanging the rôles of the operators $G_n$ and $Q_n$ for all $n \in \mathbb{Z}$ we can define analogously an ordering $O_Q$ on $C^Q_{m,n}$ which is adapted for $c \neq 3$ for all levels. The corresponding ordering kernels are as follows.

Theorem 5.5. If the central extension satisfies $c \neq 3$ then the ordering $O_Q$ is adapted to $C^Q_{m,n}$ for all Verma modules $V^Q_{\Delta,q}$ and for all grades $(m, n)$ with $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$. Ordering kernels are given by the following tables for all levels $m$, depending on the set of annihilation operators and depending on the charge $n$.

Table 5.5. Ordering kernels for $O_Q$, annihilation operators $T_2^+$

  n     ordering kernel
  +2    $\{L_{-1}^{m-1} G_{-1} G_0\}$
  +1    $\{L_{-1}^m G_0, \; H_{-1} L_{-1}^{m-1} G_0, \; L_{-1}^{m-1} G_{-1}\}$
   0    $\{L_{-1}^m, \; H_{-1} L_{-1}^{m-1}, \; L_{-1}^{m-1} Q_{-1} G_0\}$
  -1    $\{L_{-1}^{m-1} Q_{-1}\}$

Table 5.6. Ordering kernels for $O_Q$, annihilation operators $T_2^+$ and $Q_0$

  n     ordering kernel
  +1    $\{L_{-1}^m G_0\}$
   0    $\{L_{-1}^m, \; L_{-1}^{m-1} Q_{-1} G_0\}$
  -1    $\{L_{-1}^{m-1} Q_{-1}\}$

Table 5.7. Ordering kernels for $O_Q$, annihilation operators $T_2^+$ and $G_0$

  n     ordering kernel
  +2    $\{L_{-1}^{m-1} G_{-1} G_0\}$
  +1    $\{L_{-1}^m G_0, \; L_{-1}^{m-1} G_{-1}\}$
   0    $\{L_{-1}^m\}$

Table 5.8. Ordering kernels for $O_Q$, annihilation operators $T_2^+$, $Q_0$, and $G_0$

  n     ordering kernel
  +1    $\{L_{-1}^m G_0\}$
   0    $\{L_{-1}^m\}$

Charges that do not appear in the tables have trivial ordering kernels.

Proof. The proof of Theorem 5.5 is completely analogous to the proof of Theorem 5.3. We just need to swap the rôles of the operators $G_n$ and $Q_n$ for all $n \in \mathbb{Z}$. □
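The way an ordering kernel table turns into a list of dimension bounds can be sketched as follows. This is an illustrative check only: the encoding of terms, the helper names and the sample level are assumptions of the sketch, and the charge assignment ($G$ modes carry charge $+1$, $Q$ modes carry charge $-1$, $L$ and $H$ modes are uncharged) is taken as given. The code verifies that every term of Tables 5.6 to 5.8 sits at level $m$ with the charge of its row, and then counts the kernel elements per charge.

```python
# Charge of each generator type (assumption of this sketch).
CHARGE = {"L": 0, "H": 0, "G": 1, "Q": -1}

def resolve(power, m):
    """Exponents are integers or the symbols 'm' and 'm-1'."""
    return {"m": m, "m-1": m - 1}.get(power, power)

def grade(term, m):
    """Return (level, charge) of a term [(name, n, power), ...],
    where (name, n, power) stands for the mode name_{-n} raised to power."""
    level = sum(n * resolve(p, m) for _, n, p in term)
    charge = sum(CHARGE[name] * resolve(p, m) for name, _, p in term)
    return level, charge

# Building blocks appearing in Tables 5.6-5.8.
Lm, Lm1 = ("L", 1, "m"), ("L", 1, "m-1")
G0, Gm1, Qm1 = ("G", 0, 1), ("G", 1, 1), ("Q", 1, 1)

TABLE_5_6 = {1: [[Lm, G0]], 0: [[Lm], [Lm1, Qm1, G0]], -1: [[Lm1, Qm1]]}
TABLE_5_7 = {2: [[Lm1, Gm1, G0]], 1: [[Lm, G0], [Lm1, Gm1]], 0: [[Lm]]}
TABLE_5_8 = {1: [[Lm, G0]], 0: [[Lm]]}

def dimension_bounds(table, m=3, charges=range(-2, 3)):
    """Check the grades of all kernel terms and count them per charge."""
    for n, terms in table.items():
        for term in terms:
            assert grade(term, m) == (m, n)
    return [len(table.get(n, [])) for n in charges]

for name, table in [("5.6", TABLE_5_6), ("5.7", TABLE_5_7), ("5.8", TABLE_5_8)]:
    print(name, dimension_bounds(table))
```

For charges $n = -2, \ldots, 2$ the counts are (0, 1, 2, 1, 0), (0, 0, 1, 2, 1) and (0, 0, 1, 1, 0), which are the numbers of kernel elements that bound the dimensions of the corresponding singular vector spaces in $V^Q_{\Delta,q}$.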
6. Adapted Orderings on Chiral Verma Modules $V^{GQ}_{0,q}$
We saw in Sect. 4 that for both chiral and no-label highest weight vectors the conformal weight is zero. This applies to Verma modules as well as to singular vectors. Thus a chiral singular vector in the chiral Verma module $V^{GQ}_{0,q}$ needs to have level 0 and the same is true for no-label singular vectors in $V^{GQ}_{0,q}$. But at level 0 there are no singular vectors in $V^{GQ}_{0,q}$, as the only state at level 0 is the highest weight state $|0, q\rangle^{GQ}$ itself. For chiral Verma modules $V^{GQ}_{0,q}$ we shall therefore only consider adapted orderings with additional annihilation conditions corresponding to $G_0$ or $Q_0$, but not to both.

In Sect. 4 we also introduced the set $C^{GQ}_{m,n}$, Eq. (48), defining the standard basis $B_{|0,q\rangle^{GQ}}$ of the chiral Verma modules $V^{GQ}_{0,q}$. $C^{GQ}_{m,n}$ can be obtained by setting $r_3 \equiv 0$ in $C^G_{m,n}$, Eq. (41). Therefore, $O_G$ is also defined on $C^{GQ}_{m,n}$, a subset of $C^G_{m,n}$. This suggests that the ordering kernels for $O_G$ on $C^{GQ}_{m,n}$ may simply be appropriate subsets of the ordering kernels of $O_G$ on $C^G_{m,n}$, given in Theorem 5.3. This can easily be shown by considering the fact that $r_3$ is never a deciding element of the ordering $O_G$ in Eqs. (52)–(57). Furthermore, during the proof of Theorem 5.3 it happens in each case that the considered term $X^\Gamma$, constructed from $X_0$ under the action of a suitable annihilation operator $\Gamma$, always has the same exponent $r_3$ of $Q_0$ as $X_0$ itself. Therefore, the whole proof of Theorem 5.3 can also be applied to $O_G$ defined on $C^{GQ}_{m,n}$ simply by imposing $r_3 \equiv 0$ in every step. As a result the new ordering kernels are simply the intersections of $C^{GQ}_{m,n}$ with the ordering kernels for $C^G_{m,n}$. Hence, we have already proven the following theorem.
Theorem 6.1. For the set of annihilation operators that contains $G_0$ or $Q_0$ but not both and for $c \neq 3$ the ordering $O_G$ is adapted to $C^{GQ}_{m,n}$ for all chiral Verma modules $V^{GQ}_{0,q}$ and for all grades $(m, n)$ with $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$. Depending on the set of annihilation operators and depending on the charge $n$, ordering kernels are given by the following tables for all levels $m$:

Table 11. Ordering kernels for $O_G$ on $C^{GQ}_{m,n}$, annihilation operators $T_2^+$ and $G_0$

  n     ordering kernel
  +1    $\{L_{-1}^{m-1} G_{-1}\}$
   0    $\{L_{-1}^m\}$

Table 12. Ordering kernels for $O_G$ on $C^{GQ}_{m,n}$, annihilation operators $T_2^+$ and $Q_0$

  n     ordering kernel
   0    $\{L_{-1}^m\}$
  -1    $\{L_{-1}^{m-1} Q_{-1}\}$

Charges that do not appear in the tables have trivial ordering kernels.
7. Adapted Orderings on No-Label Verma Modules $V_{0,q}$

We will now consider adapted orderings for no-label Verma modules $V_{0,q}$. In Sect. 4, the standard basis for $V_{0,q}$ is defined using $C_{m,n}$ of Eq. (48). No-label Verma modules have zero conformal weight, like chiral Verma modules. Consequently chiral singular vectors as well as no-label singular vectors in $V_{0,q}$ can only exist at level 0. The space of states in $V_{0,q}$ at level 0 is spanned by $\{|0, q\rangle, G_0 |0, q\rangle, Q_0 |0, q\rangle, G_0 Q_0 |0, q\rangle\}$. Therefore, there are no no-label singular vectors in $V_{0,q}$ and there is exactly one chiral singular vector in $V_{0,q}$ for all $q$ and for all central extensions $c$, namely $G_0 Q_0 |0, q\rangle$. Hence, our main interest focuses on the $G_0$-closed singular vectors and the $Q_0$-closed singular vectors in $V_{0,q}$ and we shall therefore investigate adapted orderings with the corresponding vanishing conditions.

The states $G_0 |0, q\rangle$ and $Q_0 |0, q\rangle$ satisfy $Q_0 G_0 |0, q\rangle = -G_0 Q_0 |0, q\rangle$. Consequently the norms of these states have opposite signs and can be set to zero. Clearly, $V^G_{0,q}$ is isomorphic to the quotient module of $V_{0,q}$ divided by the submodule generated by the singular vector $G_0 |0, q\rangle$, i.e. $V^G_{0,q} = \frac{V_{0,q}}{G_0 |0,q\rangle}$, and likewise $V^Q_{0,q} = \frac{V_{0,q}}{Q_0 |0,q\rangle}$. If we consider a singular vector $\Psi$ of $V_{0,q}$ then the canonical projection of $\Psi$ into $V^G_{0,q}$ is either trivial or a singular vector in $V^G_{0,q}$, and similarly for $V^Q_{0,q}$. The converse, however, is not true: a singular vector in $V^G_{0,q}$ or $V^Q_{0,q}$ does not necessarily correspond to a singular vector in $V_{0,q}$; it may only be subsingular in $V_{0,q}$. One may also ask whether all singular vectors in $V_{0,q}$ correspond to singular vectors in either $V^G_{0,q}$ or $V^Q_{0,q}$, in which case the investigation of no-label Verma modules would not give us more information than what we already know, or whether there can also be singular vectors in $V_{0,q}$ that vanish for both canonical projections into the generic Verma modules $V^G_{0,q}$ and $V^Q_{0,q}$. The latter is indeed the case, as was shown by the explicit examples at level 1 given in Ref. [20] (we will come back to this point at the end of next section).

Unlike for chiral Verma modules, the no-label Verma modules are not simply a subcase of the $G_0$-closed Verma modules with respect to the adapted ordering $O_G$. In fact, rather than $C_{m,n}$ being a subset of $C^G_{m,n}$, we find that $C^G_{m,n}$ is a subset of $C_{m,n}$ and we hence need to extend the ordering $O_G$ suitably.

Definition 7.1. On the set $C_{m,n}$ we introduce the total ordering $O_{GQ}$. For two elements $X_1, X_2 \in C_{m,n}$, $X_1 \neq X_2$ with $X_i = L^i H^i G^i Q^i L_{-1}^{k^i} G_{-1}^{r_1^i} Q_{-1}^{r_2^i} G_0^{r_4^i} Q_0^{r_3^i}$, $L^i \in L_{l_i}$, $H^i \in H_{h_i}$, $G^i \in G_{g_i}$, $Q^i \in Q_{q_i}$ for some $l_i, h_i, g_i, q_i, k^i, r_1^i, r_2^i, r_3^i, r_4^i \in \mathbb{N}_0$, $i = 1, 2$ such that $m \in \mathbb{N}_0$, $n \in \mathbb{Z}$ we define

$X_1 <_{O_{GQ}} X_2$ if $k^1 > k^2$.   (73)

For $k^1 = k^2$ we set

$X_1 <_{O_{GQ}} X_2$ if $r_1^1 + r_2^1 > r_1^2 + r_2^2$.   (74)
If r11 + r21 = r12 + r22 we set X1
(75)
In the case Q1 = Q2 we define X1
(76)
If also G1 = G2 we then define X1
(77)
X1
(78)
If further L1 = L2 we set
unless $H^1 = H^2$ in which case we set

$X_1 <_{O_{GQ}} X_2$ if $r_1^1 > r_1^2$.   (79)

If also $r_1^1 = r_1^2$ we finally define

$X_1 <_{O_{GQ}} X_2$ if $r_3^1 + r_4^1 > r_3^2 + r_4^2$,   (80)

which necessarily has to give an answer. For $X_1 = X_2$ we define $X_1 <_{O_{GQ}} X_2$. If (80) gave no answer, then $X_1$ and $X_2$ would be of the form $X_i = L H G Q L_{-1}^k G_{-1}^{r_1} Q_{-1}^{r_2} G_0^{r_4^i} Q_0^{r_3^i}$, with $r_3^1 + r_4^1 = r_3^2 + r_4^2$. Since $X_1$ and $X_2$ both have charge $n$ one has $r_3^1 - r_4^1 = r_3^2 - r_4^2$ and thus $X_1 = X_2$. Hence, Definition 7.1 is a total ordering well-defined on $C_{m,n}$.

We will now argue that the proof of Theorem 5.3 can easily be modified in such a way that exactly the same restrictions on the ordering kernels of $O_G$ extend to the ordering kernels of $O_{GQ}$. As a first step we see that Theorem 5.4 extends straightforwardly to $C_{m,n}$ simply by replacing $X_0 = X_0^* L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} Q_0^{r_3} \in C^G_{m,n}$ by $X_0 = X_0^* L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} G_0^{r_4} Q_0^{r_3} \in C_{m,n}$. Note that in the proof $r_4^Y$ would behave exactly like $r_3^Y$, which does not interfere with any arguments. As Theorem 5.4 turned out to be the key tool to remove operators of the form $L_{-n}$, $G_{-n}$, or $Q_{-n}$ from the ordering kernel of $C^G_{m,n}$, we can in exactly the same way already state that the ordering $O_{GQ}$ is for all grades $(m, n)$ and all central extensions $c \in \mathbb{C}$ adapted to the set of terms

$X_0 = L^0 H^0 G^0 Q^0 L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} G_0^{r_4} Q_0^{r_3} \in C_{m,n}$,   (81)

with $L^0 \neq 1$, or $G^0 \neq 1$, or $Q^0 \neq 1$. We can thus focus on terms $X_0$ of the form

$X_0 = H_{-m^H_I} \ldots H_{-m^H_{j_0}} H_{-1}^{j_0 - 1} L_{-1}^n G_{-1}^{r_1} Q_{-1}^{r_2} G_0^{r_4} Q_0^{r_3}$,   (82)

with $m^H_{j_0} > 1$. In the proof of Theorem 5.3 we dealt with these terms by acting with $G_{m^H_{j_0}-1} Q_{-1}$ for $r_2 = 0$. At first, the existence of $G_0$ in the no-label case seems to interfere with this argument. However, the ordering $O_{GQ}$ has been defined in such a way that $Q_{-1}$ never interacts with $G_0$, as it would simply be stuck on the left of $G_0$. Therefore, one easily sees that the same arguments as in the proof of Theorem 5.3 hold for $r_2 = 0$. In the case of $r_2 = 1$ and $r_1 = 0$ the proof even holds without any modification. For the cases $r_1 = r_2 = 1$, we act with $G_{m^H_{j_0}} Q_0$ or $Q_{m^H_{j_0}} G_0$ for $r_3 = 0$ or $r_3 = 1$ respectively. In these cases we have to consider the additional possibility that $r_4^Y$ changes from 1 to 0 or from 0 to 1 respectively. However, as $G_{m^H_{j_0}}$ or $Q_{m^H_{j_0}}$ still needs to create an $L_{-1}$ we easily
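see that any term $Y$ also satisfying the conditions of Proof 5.3 for this case must contain operators $L^Y$, $G^Y$, $Q^Y$, or $H^Y$ with $H^Y$

Table 7.1. Ordering kernels for $O_{GQ}$ on $C_{m,n}$, annihilation operators $T_2^+$ and $G_0$

  n     ordering kernel
  +2    $\{L_{-1}^{m-1} G_{-1} G_0\}$
  +1    $\{L_{-1}^m G_0, \; L_{-1}^{m-1} G_{-1}, \; L_{-1}^{m-1} G_{-1} G_0 Q_0\}$
   0    $\{L_{-1}^m, \; L_{-1}^m G_0 Q_0, \; L_{-1}^{m-1} G_{-1} Q_0\}$
  -1    $\{L_{-1}^m Q_0\}$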
Table 7.2. Ordering kernels for $O_{GQ}$ on $C_{m,n}$, annihilation operators $T_2^+$ and $Q_0$

  n     ordering kernel
  +1    $\{L_{-1}^m G_0\}$
   0    $\{L_{-1}^m, \; L_{-1}^m G_0 Q_0, \; L_{-1}^{m-1} Q_{-1} G_0\}$
  -1    $\{L_{-1}^m Q_0, \; L_{-1}^{m-1} Q_{-1}, \; L_{-1}^{m-1} Q_{-1} G_0 Q_0\}$
  -2    $\{L_{-1}^{m-1} Q_{-1} Q_0\}$

Charges that do not appear in the tables have trivial ordering kernels.

8. Dimensional Analysis

In previous sections we argued, following Ref. [20], that a naive estimate would give 16 types of singular vectors in N = 2 topological Verma modules, depending on whether the highest weight vector or the singular vector itself satisfy additional vanishing conditions with respect to the zero modes $G_0$ or $Q_0$, each of these types coming with different charges. Three of these types can be ruled out, however, simply by taking into account that chiral highest weight conditions and no-label highest weight conditions apply only to states with zero conformal weight. In chiral Verma modules this rules out chiral singular vectors as well as no-label singular vectors, whilst in no-label Verma modules the no-label singular vectors are ruled out. A fourth type of singular vectors, chiral singular
vectors in no-label Verma modules, turns out to consist of only the level zero singular vector $G_0 Q_0 |0, q\rangle$. In this section we will use Theorem 3.6, together with the results for the ordering kernels of the previous sections, as the main tools to give upper limits for the dimensions of the remaining 12 types of topological singular vectors. For most charges this procedure will even show that there are no singular vectors corresponding to them.

The dimension of the singular vector spaces in N = 2 superconformal Verma modules can be larger than one. This fact was discovered for the Neveu–Schwarz N = 2 algebra in Ref. [9]. In particular, sufficient conditions were found (and proved) to guarantee the existence of two-dimensional spaces of uncharged singular vectors. Before this had been shown, it was a false common belief that singular vectors at the same level and with the same charge would always be linearly dependent. Later some of the results in Ref. [9] were extended [20] to the topological N = 2 algebra. As a consequence two-dimensional spaces for four different types of topological singular vectors were shown to exist (those given in Table 8.1 below). However, as we will discuss, the Neveu–Schwarz counterparts of most topological singular vectors are not singular vectors themselves, but either descendants of singular vectors or subsingular vectors [11,20,16], for which very little is known. As a consequence, in order to compute the maximal dimensions for the singular vector spaces of the topological N = 2 algebra, an independent method, like the one presented in this paper, was needed.

Let us first proceed with a clear definition of what we mean by singular vector spaces.

Definition 8.1. A $G_0$-closed singular vector space of the topological Verma module $V^M_{\Delta,q}$ is a subspace of $V^M_{\Delta,q}$ of vectors at the same level and with the same charge for which each non-trivial element is a $G_0$-closed singular vector. $M$ stands for $G_0$-closed, $Q_0$-closed, chiral, or no-label.

Analogously we define $Q_0$-closed singular vector spaces, chiral singular vector spaces, and no-label singular vector spaces.

Let us denote by $\Psi^{K,n}_{m,|\Delta,q\rangle^M}$ a singular vector in the topological Verma module $V^M_{\Delta,q}$ at level $m$ and with charge $n$. $K$ denotes the additional vanishing conditions of the singular vector, with respect to $G_0$ and $Q_0$, whilst $M$ denotes the additional vanishing conditions of the highest weight vector, as introduced in Sect. 4. The ordering kernels of Theorems 5.3 and 5.5 together with Theorem 3.6 allow us to write down an upper limit for the dimensions of the singular vector spaces simply by counting the number of elements of the ordering kernels.

Theorem 8.2. For singular vectors with additional vanishing conditions in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$, $c \neq 3$, we find the following upper limits for the number of linearly independent singular vectors at the same level $m \in \mathbb{N}_0$ and with the same charge $n \in \mathbb{Z}$ ($\Delta = -m$ for chiral singular vectors). See Table 8.1. Singular vectors can only exist if they contain in their normal form at least one non-trivial term of the corresponding ordering kernel of Theorem 5.3 or 5.5. Charges $n$ that are not given have dimension 0 and hence do not allow any singular vectors.

The ordering kernels for the vanishing conditions $T_2^+$, given in Tables 5.1 and 5.5, do not include any conditions requiring the action of $G_0$ and $Q_0$ not to be trivial. As a result, the ordering kernels of Tables 5.1 and 5.5 include not only the no-label cases but also the cases of $G_0$-closed singular vectors, $Q_0$-closed singular vectors, and chiral singular vectors. However, for no-label singular vectors in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$
Table 8.1. Maximal dimensions for singular vector spaces annihilated by $G_0$ and/or $Q_0$ in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$

                                          n = -2   n = -1   n = 0   n = 1   n = 2
  $\Psi^{G,n}_{m,|\Delta,q\rangle^G}$        0        1       2       1       0
  $\Psi^{Q,n}_{m,|\Delta,q\rangle^G}$        1        2       1       0       0
  $\Psi^{GQ,n}_{m,|-m,q\rangle^G}$           0        1       1       0       0
  $\Psi^{Q,n}_{m,|\Delta,q\rangle^Q}$        0        1       2       1       0
  $\Psi^{G,n}_{m,|\Delta,q\rangle^Q}$        0        0       1       2       1
  $\Psi^{GQ,n}_{m,|-m,q\rangle^Q}$           0        0       1       1       0
we can find in addition the following restrictions. If $\Psi^n_{m,|\Delta,q\rangle^M}$ is a no-label singular vector, then $G_0 \Psi^n_{m,|\Delta,q\rangle^M}$ must be a singular vector of type $\Psi^{G,n+1}_{m,|\Delta,q\rangle^M}$. Consequently, the dimension for the space of the no-label singular vector $\Psi^n_{m,|\Delta,q\rangle^M}$ cannot be larger than the dimension for the space of the $G_0$-closed singular vector $\Psi^{G,n+1}_{m,|\Delta,q\rangle^M}$ $^{21}$. This can easily be seen as follows. Assume that $\Psi_1$ and $\Psi_2$ are two linearly independent no-label singular vectors at the same level and with the same charge, and suppose $G_0 \Psi_1$ and $G_0 \Psi_2$ are linearly dependent. Then obviously there exist numbers $\alpha, \beta$ ($\alpha\beta \neq 0$) such that $G_0 (\alpha \Psi_1 + \beta \Psi_2) = 0$ and thus the $G_0$-closed singular vector $\alpha \Psi_1 + \beta \Psi_2$ is contained in the space spanned by $\Psi_1$ and $\Psi_2$, which is therefore not a no-label singular vector space. Therefore, linearly independent singular vectors of type $\Psi^n_{m,|\Delta,q\rangle^M}$ imply linearly independent singular vectors of type $\Psi^{G,n+1}_{m,|\Delta,q\rangle^M}$. The converse is not true, however, since most $G_0$-closed singular vectors are not generated by the action of $G_0$ on a no-label singular vector (in fact there are many more $G_0$-closed singular vectors than no-label singular vectors, as was shown in Ref. [20]). Hence, the dimension for the space of singular vectors $\Psi^n_{m,|\Delta,q\rangle^M}$ is limited by the dimension for the space of singular vectors $\Psi^{G,n+1}_{m,|\Delta,q\rangle^M}$. Similarly, $Q_0 \Psi^n_{m,|\Delta,q\rangle^M}$ is a singular vector of type $\Psi^{Q,n-1}_{m,|\Delta,q\rangle^M}$. This again restricts the dimension for $\Psi^n_{m,|\Delta,q\rangle^M}$ to be less than or equal to the dimension of $\Psi^{Q,n-1}_{m,|\Delta,q\rangle^M}$.

Theorem 8.3. For no-label singular vectors in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$, $c \neq 3$, we find the following upper limits for the dimensions of singular vector spaces at level $m \in \mathbb{N}$ and with charge $n \in \mathbb{Z}$ ($\Delta = -m$ for no-label singular vectors). Singular vectors can only exist if they contain in their normal form at least one non-trivial term of the corresponding ordering kernel of Theorem 5.3 or 5.5. Charges $n$ that are not given have dimension 0 and hence do not allow any singular vectors.

21 If $\Psi$ is a no-label singular vector and $\Xi$ is a $G_0$-closed, or $Q_0$-closed, or chiral singular vector, both at the same level and with the same charge, then $\Psi + \Xi$ is again a no-label singular vector (in the sense of not being annihilated by $G_0$ or $Q_0$) which is linearly independent of $\Psi$. However, the space spanned by $\Psi$ and $\Psi + \Xi$ is not considered to be a two-dimensional no-label singular vector space as it decomposes into a one-dimensional no-label singular vector space and a one-dimensional $G_0$-closed, or $Q_0$-closed, or chiral singular vector space.
Table 8.2. Maximal dimensions for spaces of no-label singular vectors in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$

                                     n = -2   n = -1   n = 0   n = 1   n = 2
  $\Psi^{n}_{m,|-m,q\rangle^G}$         0        1       1       0       0
  $\Psi^{n}_{m,|-m,q\rangle^Q}$         0        0       1       1       0
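The two restrictions used for no-label singular vectors (the bound by the $G_0$-closed type at charge $n+1$ and by the $Q_0$-closed type at charge $n-1$) can be combined mechanically. The sketch below is illustrative only: the function name is hypothetical, the input rows are copied from Table 8.1, and the additional bound from the $T_2^+$ ordering kernels is assumed not to be more restrictive and is therefore left out.

```python
CHARGES = [-2, -1, 0, 1, 2]

def no_label_bound(g_closed, q_closed):
    """Upper limits for no-label singular vectors at each charge n:
    min(bound for G0-closed at n+1, bound for Q0-closed at n-1)."""
    g = dict(zip(CHARGES, g_closed))
    q = dict(zip(CHARGES, q_closed))
    # Charges outside the tabulated range have dimension 0.
    return [min(g.get(n + 1, 0), q.get(n - 1, 0)) for n in CHARGES]

# Rows of Table 8.1 for the G0-closed and Q0-closed types.
in_VG = no_label_bound(g_closed=[0, 1, 2, 1, 0], q_closed=[1, 2, 1, 0, 0])
in_VQ = no_label_bound(g_closed=[0, 0, 1, 2, 1], q_closed=[0, 1, 2, 1, 0])
print(in_VG)
print(in_VQ)
```

With these inputs the two results are (0, 1, 1, 0, 0) and (0, 0, 1, 1, 0) for $n = -2, \ldots, 2$, matching the two rows of Table 8.2.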
We now use the ordering kernels of Sect. 6 and Sect. 7 for chiral and no-label Verma modules. Again, simply by counting the number of elements in the ordering kernels one obtains the corresponding dimensions, given in the tables that follow.

Theorem 8.4. For singular vectors in chiral Verma modules $V^{GQ}_{0,q}$ or in no-label Verma modules $V_{0,q}$, $c \neq 3$, we find the following upper limits for the number of linearly independent singular vectors at the same level $m \in \mathbb{N}_0$ and with the same charge $n \in \mathbb{Z}$.
Table 8.3. Maximal dimensions for singular vector spaces in $V^{GQ}_{0,q}$

                                        n = -2   n = -1   n = 0   n = 1   n = 2
  $\Psi^{G,n}_{m,|0,q\rangle^{GQ}}$        0        0       1       1       0
  $\Psi^{Q,n}_{m,|0,q\rangle^{GQ}}$        0        1       1       0       0
  $\Psi^{GQ,n}_{0,|0,q\rangle^{GQ}}$       0        0       0       0       0
  $\Psi^{n}_{0,|0,q\rangle^{GQ}}$          0        0       0       0       0
Table 8.4. Maximal dimensions for singular vector spaces in $V_{0,q}$

                                   n = -2   n = -1   n = 0   n = 1   n = 2
  $\Psi^{G,n}_{m,|0,q\rangle}$        0        1       3       3       1
  $\Psi^{Q,n}_{m,|0,q\rangle}$        1        3       3       1       0
  $\Psi^{GQ,n}_{0,|0,q\rangle}$       0        0       1       0       0
  $\Psi^{n}_{0,|0,q\rangle}$          0        0       0       0       0
Singular vectors can only exist if they contain in their normal form at least one non-trivial term of the corresponding ordering kernel of Theorem 6.1 or 7.2. Charges $n$ that are not given have dimension 0 and hence do not allow any singular vectors.

Tables 8.1, 8.2, 8.3 and 8.4 prove the conjecture made in Ref. [20] about the possible existing types of topological singular vectors. Namely, using the algebraic mechanism denoted the cascade effect, the existence of 4 types of singular vectors in chiral Verma modules (the ones given in Table 8.3) and 29 types in complete Verma modules (the ones given in Tables 8.1, 8.2 and 8.4) was deduced, although not rigorously. In addition, low level examples were constructed for all these types of singular vectors, which proves that all these types do exist (already at level 1, in fact, except the type $\Psi^{GQ,n}_{0,|0,q\rangle}$ in no-label Verma modules, which only exists at level 0). We ought to mention that the dimensions given in the previous three theorems are consistent with the spectral flow box diagrams analysed in Refs. [16,19,20]. Namely, types
of singular vectors that are connected by the topological spectral flow automorphism $A$ always show the same singular vector space dimensions$^{22}$.

Finally let us consider the results of Table 8.1 in more detail for the case when the conformal weight $\Delta$ is a negative integer: $\Delta = -m \in -\mathbb{N}_0$. In this case, we easily find for each singular vector $\Psi^{G,n}_{m,|-m,q\rangle^G}$ (which has zero conformal weight) a companion $Q_0 \Psi^{G,n}_{m,|-m,q\rangle^G}$ which is of chiral type $\Psi^{GQ,n-1}_{m,|-m,q\rangle^G}$. Note that $Q_0 \Psi^{G,n}_{m,|-m,q\rangle^G}$ cannot be trivial. It is rather a secondary singular vector at level 0 with respect to the singular vector $\Psi^{G,n}_{m,|-m,q\rangle^G}$. Using the same arguments as for the no-label singular vectors of Table 8.2 we obtain that the dimension for $\Psi^{G,n}_{m,|-m,q\rangle^G}$ is restricted by the dimension for $\Psi^{GQ,n-1}_{m,|-m,q\rangle^G}$. Similarly, we can act with $G_0$ on $\Psi^{Q,n}_{m,|-m,q\rangle^G}$ in order to obtain a secondary singular vector of chiral type $\Psi^{GQ,n+1}_{m,|-m,q\rangle^G}$. The same statements are true for Verma modules of type $V^Q_{\Delta,q}$. We hence obtain the following theorem.

Theorem 8.5. For singular vectors at level $m \in \mathbb{N}$ and with charge $n \in \mathbb{Z}$ in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$, with $\Delta = -m$ and $c \neq 3$, we find the following maximum dimensions for singular vector spaces.
Table 8.5. Maximal dimensions for spaces of singular vectors at level $m$ in $V^G_{\Delta,q}$ or in $V^Q_{\Delta,q}$ with $\Delta = -m$

                                       n = -2   n = -1   n = 0   n = 1   n = 2
  $\Psi^{G,n}_{m,|-m,q\rangle^G}$         0        0       1       1       0
  $\Psi^{Q,n}_{m,|-m,q\rangle^G}$         1        1       0       0       0
  $\Psi^{Q,n}_{m,|-m,q\rangle^Q}$         0        1       1       0       0
  $\Psi^{G,n}_{m,|-m,q\rangle^Q}$         0        0       0       1       1
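The argument leading to Theorem 8.5 can be sketched in the same spirit: for $\Delta = -m$ the bound for a $G_0$-closed type at charge $n$ is additionally limited by the chiral bound at charge $n-1$ (acting with $Q_0$), and the bound for a $Q_0$-closed type by the chiral bound at charge $n+1$ (acting with $G_0$). The function name and the encoding of the shift are assumptions of this sketch; the input rows are those of Table 8.1.

```python
CHARGES = [-2, -1, 0, 1, 2]

def bound_at_minus_m(own, chiral, shift):
    """Combine a type's own bound at charge n with the chiral bound at
    charge n + shift (shift = -1 for G0-closed, +1 for Q0-closed)."""
    own_by_n = dict(zip(CHARGES, own))
    chiral_by_n = dict(zip(CHARGES, chiral))
    return [min(own_by_n[n], chiral_by_n.get(n + shift, 0)) for n in CHARGES]

CHIRAL_VG = [0, 1, 1, 0, 0]  # chiral row for V^G in Table 8.1
CHIRAL_VQ = [0, 0, 1, 1, 0]  # chiral row for V^Q in Table 8.1

rows = [
    bound_at_minus_m([0, 1, 2, 1, 0], CHIRAL_VG, shift=-1),  # G0-closed in V^G
    bound_at_minus_m([1, 2, 1, 0, 0], CHIRAL_VG, shift=+1),  # Q0-closed in V^G
    bound_at_minus_m([0, 1, 2, 1, 0], CHIRAL_VQ, shift=+1),  # Q0-closed in V^Q
    bound_at_minus_m([0, 0, 1, 2, 1], CHIRAL_VQ, shift=-1),  # G0-closed in V^Q
]
for row in rows:
    print(row)
```

The four resulting rows agree with the entries of Table 8.5, so under these assumptions no two-dimensional space survives for $\Delta = -m$.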
Charges $n$ that are not given have dimension 0 and hence do not allow any singular vectors.

The results of Table 8.5 imply that if, for example, there are two linearly independent singular vectors at level $m$ with charge 0 in $V^G_{\Delta,q}$, with $\Delta = -m$, both annihilated by $G_0$, then there exists a non-trivial linear combination of the two singular vectors that turns out to be a chiral singular vector (annihilated by $G_0$ and $Q_0$). The space spanned by these two singular vectors hence decomposes into a one-dimensional $G_0$-closed singular vector space and a one-dimensional chiral singular vector space (see Ref. [20] for examples at level 3).

Some remarks are now in order concerning the existence of the considered spaces of N = 2 singular vectors. First of all observe that the dimensions given by Tables 8.1–8.5 are the maximal possible dimensions for the spaces generated by singular vectors of the corresponding types. That is, dimension 2 for a given type of singular vector in Table 8.1 does not mean that all the spaces generated by singular vectors of such type are two-dimensional. Rather, most of them are in fact one-dimensional and only under certain conditions does one find two-dimensional spaces. The same applies to the three-dimensional spaces in Table 8.4. To be more precise, in Ref. [9] it was proved that for the Neveu–Schwarz N = 2 algebra two-dimensional spaces exist only for uncharged singular vectors and under certain conditions, starting at level 2. For the topological N = 2 algebra this implies, as was shown in Ref. [20], that the four types of two-dimensional singular vector spaces of Table 8.1 must also exist starting at level 2, provided the corresponding conditions are satisfied. (To see this [20,16] one only needs to apply the topological twists to the singular vectors of the Neveu–Schwarz N = 2 algebra and then construct the box-diagrams using $G_0$, $Q_0$ and the odd spectral flow automorphism $A$.) As a matter of fact, also in Ref. [20] several examples of these two-dimensional spaces were constructed at level 3.

For the case of the three-dimensional singular vector spaces in no-label Verma modules in Table 8.4, we do not know as yet of any conditions for them to exist. In fact these are the only spaces, among all the spaces given in Tables 8.1–8.5, which have not been observed so far, although the corresponding types of singular vectors have been constructed at level 1 generating one-dimensional [20] as well as two-dimensional spaces$^{23}$ (but not three-dimensional). The latter case is interesting, in addition, because the corresponding two-dimensional spaces exist already at level 1 (in contrast with the two-dimensional spaces given by the conditions of Ref. [9], which exist at levels 2 and higher). Namely, for $c = 9$ one can easily find two-dimensional spaces of singular vectors of types $\Psi^{Q,-1}_{1,|0,-3\rangle}$ and $\Psi^{G,0}_{1,|0,-3\rangle}$, in the no-label Verma module $V_{0,-3}$, and two-dimensional spaces of singular vectors of types $\Psi^{G,1}_{1,|0,0\rangle}$ and $\Psi^{Q,0}_{1,|0,0\rangle}$, in the no-label Verma module $V_{0,0}$, all four types of singular vectors belonging to the same box-diagram [16,19,20]. That is, the spectral flow automorphism $A$, transforming the Verma modules $V_{0,-3}$ and $V_{0,0}$ into each other, maps $\Psi^{G,0}_{1,|0,-3\rangle}$ to $\Psi^{Q,0}_{1,|0,0\rangle}$ and $\Psi^{G,1}_{1,|0,0\rangle}$ to $\Psi^{Q,-1}_{1,|0,-3\rangle}$, and the other way around, whereas $G_0$ and $Q_0$ transform the singular vectors into each other inside a given Verma module. One of these two-dimensional spaces is, for example, the space spanned by the singular vectors of type $\Psi^{Q,0}_{1,|0,0\rangle}$:

$\Psi^{Q,0}_{1,|0,0,c\rangle} = L_{-1} Q_0 G_0 |0, 0, c\rangle$   (83)

and

$\hat{\Psi}^{Q,0}_{1,|0,0,9\rangle} = [H_{-1} Q_0 G_0 + Q_{-1} G_0 - 2 Q_0 G_{-1}] |0, 0, 9\rangle$,   (84)

the latter existing only for $c = 9$. As one can see, the canonical projections of the first singular vector into the generic Verma modules $V^G_{0,0}$ and $V^Q_{0,0}$ vanish. On the contrary, the canonical projections of the second singular vector into the generic Verma modules $V^G_{0,0}$ and $V^Q_{0,0}$ are different from zero, giving rise to the singular vectors

$\Psi^{Q,0}_{1,|0,0,9\rangle^G} = Q_0 G_{-1} |0, 0, 9\rangle^G$,   $\Psi^{Q,0}_{1,|0,0,9\rangle^Q} = [Q_{-1} G_0 - 4 L_{-1}] |0, 0, 9\rangle^Q$,   (85)

respectively. These are, in turn, the particular cases for ($\Delta = 0$, $q = 0$, $c = 9$) of the general expressions for $\Psi^{Q,0}_{1,|\Delta,q\rangle^G}$ and $\Psi^{Q,0}_{1,|\Delta,q\rangle^Q}$, given in Ref. [20].

22 $A$ is the universal odd spectral flow [18,17,20], discovered in Ref. [18], which transforms any topological singular vector into another topological singular vector; in particular $A$ transforms chiral singular vectors into chiral singular vectors and no-label singular vectors into no-label singular vectors.
23 In Ref. [20] the existence of these two-dimensional spaces of singular vectors at level 1 was overlooked. We give examples of them here for the first time.
9. Dimensions for the Neveu–Schwarz and the Ramond N = 2 Algebras

Transferring the dimensions we have found for the topological N = 2 algebra to the Neveu–Schwarz and to the Ramond N = 2 algebras is straightforward. The Neveu–Schwarz N = 2 algebra is related to the topological N = 2 algebra through the topological twists $T_W^\pm$: $L_m = L_m \pm \frac{1}{2} H_m$, $H_m = \pm H_m$, $G_m = G^\pm_{m+1/2}$ and $Q_m = G^\mp_{m-1/2}$ (topological generators on the left-hand sides, Neveu–Schwarz generators on the right-hand sides). As a consequence, the (non-chiral) Neveu–Schwarz highest weight vectors correspond to $G_0$-closed topological highest weight vectors (annihilated by $G_0$), whereas the chiral and antichiral Neveu–Schwarz highest weight vectors (annihilated by $G^+_{-1/2}$ and by $G^-_{-1/2}$, respectively) correspond to chiral topological highest weight vectors (annihilated by $G_0$ and $Q_0$). This implies (see the details in Refs. [19,20]) that the standard Neveu–Schwarz singular vectors ("normal", chiral and antichiral) correspond to topological singular vectors of the types $\Psi^{G,n}_{m,|\Delta,q\rangle^G}$ and $\Psi^{GQ,n}_{m,|\Delta,q\rangle^G}$, whereas the Neveu–Schwarz singular vectors in chiral or antichiral Verma modules correspond to topological singular vectors of only the type $\Psi^{G,n}_{m,|\Delta,q\rangle^{GQ}}$ (as there are no chiral singular vectors in chiral Verma modules). To be precise, for the singular vectors of the Neveu–Schwarz N = 2 algebra one finds the following results.

Theorem 9.1. The spaces of Neveu–Schwarz singular vectors $\Psi^{n}_{m,|\Delta,q\rangle}$, $\Psi^{ch,n}_{m,|\Delta,q\rangle}$ and $\Psi^{a,n}_{m,|\Delta,q\rangle}$, where the superscripts $ch$ and $a$ stand for chiral and antichiral, respectively, have the same maximal dimensions as the spaces of topological singular vectors $\Psi^{G,\pm n}_{m \pm n/2,|\Delta \pm q/2, \pm q\rangle^G}$ and $\Psi^{GQ,\pm n}_{m \pm n/2,|-m \mp n/2, \pm q\rangle^G}$, given in Table 8.1. Therefore for singular vectors in Neveu–Schwarz Verma modules $V^{NS}_{\Delta,q}$ we find the following upper limits for the number of linearly independent singular vectors at the same level $m$ and with the same charge $n \in \mathbb{Z}$ ($m \in \mathbb{N}$ for $n = 0$ while $m \in \mathbb{N} - 1/2$ for $n = \pm 1$).
Table 9.1. Maximal dimensions for singular vector spaces in $V^{NS}_{\Delta,q}$

                                      n = -1   n = 0   n = 1
  $\Psi^{n}_{m,|\Delta,q\rangle}$        1       2       1
  $\Psi^{ch,n}_{m,|\Delta,q\rangle}$     0       1       1
  $\Psi^{a,n}_{m,|-m,q\rangle}$          1       1       0
(Chiral and antichiral singular vectors satisfy $\Delta + m = \pm\frac{1}{2}(q + n)$, respectively.) Charges n that are not given have dimension 0 and hence do not allow any singular vectors.

Theorem 9.2. The spaces of Neveu–Schwarz singular vectors $\Psi^{n}_{m,|\Delta,q\rangle^{ch}}$, $\Psi^{n}_{m,|\Delta,q\rangle^{a}}$, $\Psi^{a,n}_{m,|\Delta,q\rangle^{ch}}$ and $\Psi^{ch,n}_{m,|\Delta,q\rangle^{a}}$ in chiral or antichiral Verma modules, $V^{NS,ch}_{\Delta,q}$ or $V^{NS,a}_{\Delta,q}$ with $\Delta = \pm q/2$ respectively, have the same maximal dimensions as the spaces of topological singular vectors $\Psi^{\mathcal{G},\pm n}_{m\pm n/2,\,|0,\pm q\rangle^{\mathcal{GQ}}}$ in chiral topological Verma modules, given in Table 8.3. Therefore for Neveu–Schwarz singular vectors in chiral or antichiral Verma modules we find the following upper limits for the number of linearly independent singular vectors at the same level m and with the same charge $n \in \mathbb{Z}$ ($m \in \mathbb{N}$ for n = 0 while $m \in \mathbb{N} - 1/2$ for n = ±1). The superscripts ch and a stand for chiral and antichiral, respectively.

Table 9.2. Maximal dimensions for singular vector spaces in $V^{NS,ch}_{q/2,q}$ and $V^{NS,a}_{-q/2,q}$
                                          n = −1    n = 0    n = 1
$\Psi^{n}_{m,|q/2,q\rangle^{ch}}$            1         1        0
$\Psi^{a,n}_{m,|q/2,q\rangle^{ch}}$          1         1        0
$\Psi^{n}_{m,|-q/2,q\rangle^{a}}$            0         1        1
$\Psi^{ch,n}_{m,|-q/2,q\rangle^{a}}$         0         1        1

($\Psi^{a,n}_{m,|q/2,q\rangle^{ch}}$ and $\Psi^{ch,n}_{m,|-q/2,q\rangle^{a}}$ satisfy in addition $q = \mp m - n/2$, respectively.) Charges n that are not given have dimension 0 and hence do not allow any singular vectors.
Observe that there are no chiral singular vectors in chiral Verma modules, nor antichiral singular vectors in antichiral Verma modules; that is, there are no Neveu–Schwarz singular vectors of types $\Psi^{ch,n}_{m,|q/2,q\rangle^{ch}}$ and $\Psi^{a,n}_{m,|-q/2,q\rangle^{a}}$, which would correspond to the non-existing chiral topological singular vectors $\Psi^{\mathcal{GQ},\pm n}_{m\pm n/2,\,|0,\pm q\rangle^{\mathcal{GQ}}}$ in chiral topological Verma modules.

The first row of Table 9.1 recovers the results already proven$^{24}$ in Refs. [8,9], using adapted orderings in generalised (analytically continued) Verma modules. That is, in complete Verma modules of the Neveu–Schwarz N = 2 algebra singular vectors can only exist with charges n = 0, ±1 and, under certain conditions, there exist two-dimensional spaces of (only) uncharged singular vectors. Table 9.2 proves the conjecture, made in Refs. [19,20], that in chiral Neveu–Schwarz Verma modules $V^{NS,ch}_{q/2,q}$ the charged singular vectors are always negatively charged, with n = −1, whereas in antichiral Neveu–Schwarz Verma modules $V^{NS,a}_{-q/2,q}$ the charged singular vectors are always positively charged, with n = 1. In contrast to this, the chiral charged singular vectors in the Verma modules $V^{NS}_{\Delta,q}$ and $V^{NS,a}_{-q/2,q}$ are always positively charged, with n = 1, whereas the antichiral charged singular vectors in the Verma modules $V^{NS}_{\Delta,q}$ and $V^{NS,ch}_{q/2,q}$ are always negatively charged, with n = −1. This fact was observed also in Ref. [20] and can be deduced from the results of Ref. [9].

As to the Ramond N = 2 algebra, combining the topological twists $T_W^{\pm}$ and the spectral flows it is possible to construct a one-to-one mapping between every Ramond singular vector and every topological singular vector, at the same levels and with the same charges$^{25}$ (see the details in Ref. [12]). As a consequence, the results of Tables 8.1–8.5 can be transferred to the Ramond singular vectors simply by exchanging the labels $\mathcal{G} \to (+)$, $\mathcal{Q} \to (-)$, where the helicity (+) denotes the Ramond states annihilated by $G^{+}_{0}$ and the helicity (−) denotes the Ramond states annihilated by $G^{-}_{0}$. The no-helicity Ramond states, analogous to the no-label topological states, have been overlooked until recently in the literature (see Refs. [11,12]). They require conformal weight $\Delta + m = c/24$ in the same way that no-label states require zero conformal weight $\Delta + m = 0$.

24 In Ref. [9] it had not been explicitly stated that the results do not hold for c = 3. Also, the necessity of the change of basis in the final consideration of the proof of Theorem 5.3 had been overlooked in Ref. [9]. Nevertheless, Theorem 9.1 shows that all the results of Ref. [9] do hold.
25 We define the charges for the Ramond states in the same way as for the Neveu–Schwarz states, see the details in Refs. [19,11].
10. Conclusions and Prospects

For the study of the highest weight representations of a Lie algebra or a Lie superalgebra, the determinant formula plays a crucial rôle. However, the determinant formula does not give complete information about the submodules existing in a given Verma module. What can easily be obtained from the determinant formula is exactly which Verma modules contain proper submodules, and at which level the lowest non-trivial grade space of the biggest proper submodule is found. But it does not prove that the singular vectors obtained in that way are all the existing singular vectors, i.e. that they generate the biggest proper submodules, nor does it give the dimensions of the singular vector spaces. However, it has been shown [9,20] that singular vector spaces with more than one dimension exist already for the N = 2 superconformal algebra.

In this paper we have presented a method that can easily be applied to many Lie algebras and Lie superalgebras. This method is based on the concept of adapted ordering, which implies that any singular vector needs to contain at least one non-trivial term included in the ordering kernel. The size of the ordering kernel therefore limits the dimension of the corresponding singular vector space. Weights for which the ordering kernel is trivial do not allow any singular vectors in the corresponding weight space. On the other hand, non-trivial ordering kernels give us the maximal dimension of a possible singular vector space. Furthermore, the coefficients with respect to the ordering kernel uniquely identify a singular vector. In other words, a singular vector is already completely determined once we have found just the few coefficients with respect to the ordering kernel. In this way we can easily obtain product expressions for descendant singular vectors [13] and also settle the question of the cases in which descendant singular vectors vanish.
The framework can easily be understood using the example of the Virasoro algebra where the ordering kernel always has size one and therefore Virasoro singular vectors at the same level in the same Verma module are always proportional. In its original version, Kent [26] used the idea of an ordering for generalised (analytically extended) Virasoro Verma modules in order to show that all vectors satisfying the highest weight conditions at level 0 are proportional to the highest weight vector. As an important application of this method, we have computed the maximal dimensions of the singular vector spaces for Verma modules of the topological N = 2 algebra, obtaining maximal dimensions 0, 1, 2 or 3, depending on the type of Verma module and the type of singular vector. The results are consistent with the topological spectral flow automorphisms and with all known examples of topological singular vectors. On the one hand, singular vector spaces with maximal dimension bigger than 1 agree with explicitly computed examples found before (and during) this work, although in the case of the three-dimensional spaces in no-label Verma modules, the singular vectors of the corresponding types known so far generate only one- and two-dimensional spaces. These exist already at level 1, in contrast with the previously known two-dimensional spaces, which exist at levels 2 and higher. On the other hand, singular vector spaces with zero dimension imply that the “would-be” singular vectors of the corresponding types do not exist. As a consequence, our results provide a rigorous proof of the conjecture made in
Ref. [20] about the possible existing types of topological singular vectors: 4 types in chiral Verma modules and 29 types in complete Verma modules.

Finally we have transferred the results found for the topological N = 2 algebra to the Neveu–Schwarz and to the Ramond N = 2 algebras. In the first case we have recovered the results obtained in Ref. [9] for complete Verma modules: maximal dimensions 0, 1 or 2, the latter only for uncharged singular vector spaces, and allowed charges only 0 and ±1. In addition, we have proved the conjecture made in Refs. [19,20] on the possible existing types of Neveu–Schwarz singular vectors in chiral and antichiral Verma modules. In the case of the Ramond N = 2 algebra we have found a one-to-one mapping between the Ramond singular vectors and the topological singular vectors, so that the corresponding results are essentially the same. The only exception for which the adapted orderings presented in this paper are not suitable is the central term c = 3. This case needs a separate consideration. The application of the adapted ordering method to the twisted N = 2 algebra will be the subject of a forthcoming paper [13].

The example of the N = 2 topological Verma modules is only one out of many cases where the concept of adapted orderings can be applied. For example, Bajnok [4] showed that the analytic continuation method of Kent [26] can be extended to generalised Verma modules of the $WA_2$ algebra. Not only will the concept of adapted orderings allow us to obtain information about superconformal Verma modules with N > 2, it should also be easily applicable to any other Lie algebra whenever an adapted ordering can be constructed with small ordering kernels.

Acknowledgements. We are very grateful to Adrian Kent for numerous discussions on the N = 2 orderings and to Victor Kac for many important comments with respect to the N = 2 superconformal algebras. M.D. is supported by a DAAD fellowship and in part by NSF grant PHY-98-02709.
References

1. Ademollo, M., Brink, L., d’Adda, A., d’Auria, R., Napolitano, E., Sciuto, S., del Giudice, E., di Vecchia, P., Ferrara, S., Gliozzi, F., Musto, R. and Pettorino, R.: Supersymmetric strings and colour confinement. Phys. Lett. B62, 105 (1976)
2. Astashkevich, A.B.: On the structure of Verma modules over Virasoro and Neveu–Schwarz algebras. Commun. Math. Phys. 186, 531 (1997)
3. Astashkevich, A.B. and Fuchs, D.B.: Asymptotic of singular vectors in Verma modules over the Virasoro Lie algebra. Pac. J. Math. 177 No.2, (1997)
4. Bajnok, Z.: Singular vectors of the $WA_2$ algebra. Phys. Lett. B329, 225 (1994)
5. Boucher, W., Friedan, D. and Kent, A.: Determinant formulae and unitarity for the N = 2 superconformal algebras in two dimensions or exact results on string compactification. Phys. Lett. B172, 316 (1986)
6. Cheng, S.-L. and Kac, V.G.: A new N = 6 superconformal algebra. Commun. Math. Phys. 186, 219 (1997)
7. Dijkgraaf, R., Verlinde, E. and Verlinde, H.: Topological strings in d < 1. Nucl. Phys. B352, 59 (1991)
8. Dörrzapf, M.: Superconformal field theories and their representations. PhD thesis, University of Cambridge, September 1995
9. Dörrzapf, M.: Analytic expressions for singular vectors of the N = 2 superconformal algebra. Commun. Math. Phys. 180, 195 (1996)
10. Dörrzapf, M.: The embedding structure of unitary N = 2 minimal models. Nucl. Phys. B529, 639 (1998)
11. Dörrzapf, M. and Gato-Rivera, B.: Transmutations between singular and subsingular vectors of the N = 2 superconformal algebras. HUTP-97/A055, IMAFF-FM-97/04 preprint, hep-th/9712085, 1997, to be published in Nucl. Phys. B
12. Dörrzapf, M. and Gato-Rivera, B.: Determinant formula for the topological N = 2 superconformal algebra. HUTP-98/A055, IMAFF-98/07 preprint, to be published in Nucl. Phys. B
13. Dörrzapf, M. and Gato-Rivera, B.: Singular dimensions of the N = 2 superconformal algebras II: The twisted N = 2 algebra. DAMTP-99-19, IMAFF-FM-99/08, NIKHEF-99-06 preprint hep-th/9902044
14. Dobrev, V.K.: Characters of the unitarizable highest weight modules over the N = 2 superconformal algebras. Phys. Lett. B186, 43 (1987)
15. Feigin, B.L. and Fuchs, D.B.: Representations of Lie groups and related topics. A.M. Vershik and A.D. Zhelobenko eds., London–New York: Gordon & Breach, 1990
16. Gato-Rivera, B.: Construction formulae for singular vectors of the topological N = 2 superconformal algebra. IMAFF-FM-98/05, hep-th/9802204, (1998)
17. Gato-Rivera, B.: The even and the odd spectral flows on the N = 2 superconformal algebras. Nucl. Phys. B512, 431 (1998)
18. Gato-Rivera, B. and Rosado, J.I.: Spectral flows and twisted topological theories. Phys. Lett. B369, 7 (1996)
19. Gato-Rivera, B. and Rosado, J.I.: Chiral determinant formulae and subsingular vectors for the N = 2 superconformal algebras. Nucl. Phys. B503, 447 (1997)
20. Gato-Rivera, B. and Rosado, J.I.: Families of singular and subsingular vectors of the topological N = 2 superconformal algebra. Nucl. Phys. B514, 477 (1998)
21. Kac, V.G.: Lie superalgebras. Adv. Math. 26, 8 (1977)
22. Kac, V.G.: Contravariant form for infinite-dimensional Lie algebras and superalgebras. Lect. Notes in Phys. 94, Berlin–Heidelberg–New York: Springer, 1979
23. Kac, V.G.: Superconformal algebras and transitive groups actions on quadrics. Commun. Math. Phys. 186, 233 (1997)
24. Kac, V.G. and van de Leur, J.W.: On classification of superconformal algebras. Strings 88, Singapore: World Scientific, 1988
25. Kato, M. and Matsuda, S.: Null field construction and Kac formulae of N = 2 superconformal algebras in two dimensions. Phys. Lett. B184, 184 (1987)
26. Kent, A.: Singular vectors of the Virasoro algebra. Phys. Lett. B273, 56 (1991)
27. Kent, A., Mattis, M. and Riggs, H.: Highest weight representations of the N = 3 superconformal algebras and their determinant formulae. Nucl. Phys. B301, 426 (1988)
28. Kent, A. and Riggs, H.: Determinant formulae for the N = 4 superconformal algebras. Phys. Lett. B198, 491 (1987)
29. Kiritsis, E.: Character formula and the structure of the representations of the N = 1, N = 2 superconformal algebras. Int. J. Mod. Phys. A3, 1871 (1988)
30. Lerche, W., Vafa, C. and Warner, N.P.: Chiral rings in N = 2 superconformal theories. Nucl. Phys. B324, 427 (1989)
31. Malikov, F.G., Feigin, B.L. and Fuchs, D.B.: Singular vectors in Verma modules over Kac-Moody algebras. Funct. Anal. Appl. 20, 103 (1986)
32. Matsuda, S.: Coulomb gas representations and screening operators of the N = 4 superconformal algebras. Phys. Lett. B282, 56 (1992)
33. Meurman, A. and Rocha-Caridi, A.: Highest weight representations of the Neveu–Schwarz and Ramond algebras. Commun. Math. Phys. 107, 263 (1986)
34. Nam, S.: The Kac formula for the N = 1 and N = 2 super-conformal algebras. Phys. Lett. B172, 323 (1986)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 533 – 566 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Multidimensional Baker–Akhiezer Functions and Huygens’ Principle
O. A. Chalykh1, M. V. Feigin1,2, A. P. Veselov3,4
1 Department of Mathematics and Mechanics, Moscow State University, Moscow, 119899, Russia.
E-mail: [email protected]
2 Independent University of Moscow, Bolshoy Vlasevsky per. 11, Moscow, 121002, Russia.
E-mail: [email protected]
3 Department of Mathematical Sciences, Loughborough University, Loughborough, LE11 3TU, UK.
E-mail: [email protected]
4 Landau Institute for Theoretical Physics, Kosygina 2, Moscow, 117940, Russia
Received: 18 December 1998 / Accepted: 18 March 1999
Abstract: A notion of the rational Baker–Akhiezer (BA) function related to a configuration of hyperplanes in $\mathbb{C}^n$ is introduced. It is proved that the BA function exists only for very special configurations (locus configurations), which satisfy a certain overdetermined algebraic system. The BA functions satisfy some algebraically integrable Schrödinger equations, so any locus configuration determines such an equation. Some results towards the classification of all locus configurations are presented. This theory is applied to the famous Hadamard problem of description of all hyperbolic equations satisfying Huygens’ Principle. We show that in a certain class all such equations are related to locus configurations and the corresponding fundamental solutions can be constructed explicitly from the BA functions.
Introduction

The notion of the Baker–Akhiezer function (BA function) has been introduced by Krichever [1] in the theory of finite-gap or algebro-geometric solutions of the nonlinear PDE’s, integrable by the inverse scattering method [2]. The BA function is a far-reaching generalisation of the classical function
\[ \psi = \frac{\sigma(x - z)}{\sigma(x)\sigma(z)}\, e^{\zeta(z)x}, \]
well-known as a solution to the classical Lamé equation:
\[ L\psi = \lambda\psi, \qquad L = -\frac{d^2}{dx^2} + 2\wp(x), \qquad \lambda = -\wp(z). \]
Here $\sigma$, $\zeta$ and $\wp$ are classical Weierstrass elliptic functions (see e.g. [3]).
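In the degenerate case one has the corresponding trigonometric and rational versions:
\[ \psi_{\mathrm{trig}} = \Bigl(1 - \frac{1}{k}\cot x\Bigr) e^{kx}, \qquad L = -\frac{d^2}{dx^2} + \frac{2}{\sin^2 x}, \]
\[ \psi_{\mathrm{rat}} = \Bigl(1 - \frac{1}{kx}\Bigr) e^{kx}, \qquad L = -\frac{d^2}{dx^2} + \frac{2}{x^2}. \]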
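As a quick consistency check (a sketch added for illustration; the auxiliary symbol $p$ is introduced only here), one can verify directly that $\psi_{\mathrm{rat}}$ is an eigenfunction of $L = -d^2/dx^2 + 2/x^2$ with eigenvalue $-k^2$ by writing $\psi_{\mathrm{rat}} = p(x)e^{kx}$:

```latex
\begin{align*}
\psi_{\mathrm{rat}} &= p\,e^{kx}, \qquad p = 1 - \frac{1}{kx}, \qquad
p' = \frac{1}{kx^{2}}, \qquad p'' = -\frac{2}{kx^{3}},\\
\psi_{\mathrm{rat}}'' &= \left(p'' + 2k\,p' + k^{2}p\right)e^{kx},\\
\left(L + k^{2}\right)\psi_{\mathrm{rat}}
 &= \left(-p'' - 2k\,p' + \frac{2}{x^{2}}\,p\right)e^{kx}
 = \left(\frac{2}{kx^{3}} - \frac{2}{x^{2}} + \frac{2}{x^{2}} - \frac{2}{kx^{3}}\right)e^{kx} = 0 .
\end{align*}
```

The trigonometric case goes through in the same way with $p = 1 - k^{-1}\cot x$, $p' = 1/(k\sin^2 x)$ and $p'' = -2\cot x/(k\sin^2 x)$, the four terms again cancelling in pairs.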
Certain multidimensional versions of these functions in the rational and trigonometric cases have been introduced by Chalykh and Veselov in [4] in the theory of the quantum Calogero–Moser problem. In this paper we will restrict ourselves to the rational case only. The construction of [4] (see also [5]) relates such a BA function $\psi$ to a configuration $\mathcal{A}$ of the hyperplanes $\Pi_\alpha$ in a complex Euclidean space $\mathbb{C}^n$ given by the equations $(\alpha, x) = 0$, taken with some multiplicities $m_\alpha \in \mathbb{Z}_+$. Here $\alpha \in \mathcal{A}$, $\mathcal{A}$ is a finite set of noncollinear vectors. The function $\psi(k, x)$, $k, x \in \mathbb{C}^n$, is determined by certain analytic properties in $k$ (see Sect. 1) and exists only for very special configurations. The most important property of the BA function is that it is an eigenfunction of the multidimensional algebraically integrable Schrödinger operator $L$, which in our case has the form
\[ L = -\Delta + \sum_{\alpha\in\mathcal{A}} \frac{m_\alpha(m_\alpha + 1)(\alpha, \alpha)}{(\alpha, x)^2} \tag{1} \]
(see [4,5]). When $\mathcal{A}$ is a Coxeter configuration, i.e. $\mathcal{A}$ consists of the reflection hyperplanes for some finite reflection group $W$ with $W$-invariant multiplicities, then the corresponding operator $L$ is the Hamiltonian of the generalised quantum Calogero–Moser problem (see Olshanetsky and Perelomov [6]) with special integer-valued parameters. The existence of the BA function in this case was proved in [5] with the help of Heckman’s result [7]. At that time it was believed that the Coxeter case is the only one when $\psi$ exists, but it turned out not to be the case. The first non-Coxeter examples have been found by the authors in [8] (see also [9]). According to the general procedure proposed by Berest and Veselov in [10] this led to new examples of the hyperbolic equations satisfying Huygens’ Principle in Hadamard’s sense.

Motivated by these results Berest and Lutsenko started the investigation of the case when the potential depends on two coordinates only and found other new examples of the huygensian equations [11]. Later Berest proved [12] that they have actually found all such equations under the assumption that the potential is homogeneous of degree (−2). Since a generic Berest–Lutsenko potential could not be described by the construction [4], we were motivated to revise it. In Sect. 1 we give such a revised definition of the BA function, which can be derived from the corresponding Schrödinger equation and therefore covers all possible cases.

It is remarkable that an effective way exists to check for a given configuration whether a BA function exists or not. Namely, as we prove in Sects. 2 and 3, the following overdetermined system of algebraic equations is a necessary and sufficient condition for the existence of the Baker–Akhiezer function:
\[ \sum_{\substack{\beta\in\mathcal{A}\\ \beta\neq\alpha}} \frac{m_\beta(m_\beta + 1)(\beta, \beta)(\alpha, \beta)^{2j-1}}{(\beta, x)^{2j+1}} \equiv 0 \quad \text{on the hyperplane } (\alpha, x) = 0 \tag{2} \]
for each $\alpha \in \mathcal{A}$ and $j = 1, 2, \ldots, m_\alpha$.
They are equivalent to the vanishing of the first $m_\alpha$ odd terms in the Laurent expansion of the corresponding potential
\[ u(x) = \sum_{\alpha\in\mathcal{A}} \frac{m_\alpha(m_\alpha + 1)(\alpha, \alpha)}{(\alpha, x)^2} \tag{3} \]
at the hyperplane $(\alpha, x) = 0$. A similar characterisation of the rational finite-gap potentials in one dimension has been first proposed in the famous paper [13] by Airault, McKean and Moser, who introduced the term “locus” in this situation. We will also use this terminology, calling Eqs. (2) as well as its general affine version (see below) locus equations. Duistermaat and Grünbaum [14] discovered the interpretation of such equations as a trivial monodromy condition for the corresponding one-dimensional Schrödinger equation in the complex domain. We give a similar interpretation for our locus equations (2) in Sect. 2.

However, to describe all the configurations satisfying the locus equations (2) (the locus configurations) seems to be a very difficult problem. At the moment it is solved only in dimension 2, where the answer is given by the Berest–Lutsenko construction. In dimension n > 2 all known examples of the locus configurations are the Coxeter configurations and their special “deformations” [8,9]. In Sect. 4 we present all the results known in this direction so far.

The generalisation of our construction to the affine configurations of the hyperplanes is discussed in Sect. 5. The potential $u$ and the locus equations in that case have the form:
\[ u(x) = \sum_{i=1}^{K} \frac{m_i(m_i + 1)(\alpha_i, \alpha_i)}{((\alpha_i, x) + c_i)^2}, \tag{4} \]
\[ \sum_{j\neq i} \frac{m_j(m_j + 1)(\alpha_j, \alpha_j)(\alpha_i, \alpha_j)^{2s-1}}{((\alpha_j, x) + c_j)^{2s+1}} \equiv 0 \tag{5} \]
identically on the hyperplane $(\alpha_i, x) + c_i = 0$ for all $i = 1, \ldots, K$ and $s = 1, \ldots, m_i$. Unfortunately, so far little is known about the affine locus configurations which are not linear, i.e. with not all the hyperplanes passing through one point. Apart from the one-dimensional case investigated in [13,15], there are only some reducible examples discovered by Berest and Winternitz [16]. In fact, we show that the classification problem for the affine locus configurations can be reduced to the linear case (2) by the isotropic projectivisation procedure.

In the last section we discuss the relations of our BA function $\psi$ and locus configurations to Huygens’ Principle. The main result says that for any locus configuration in dimension $n$ the corresponding hyperbolic equation
\[ (\Box_{N+1} + u(x_1, \ldots, x_n))\phi = 0 \tag{6} \]
satisfies Huygens’ Principle for large enough odd $N$. Conversely, we show that if Eq. (6) satisfies Huygens’ Principle and all Hadamard’s coefficients are rational functions, then $u(x)$ has a form (4) for some locus configuration. We conjecture that this construction gives all huygensian equations of the form $(\Box_{N+1} + u(x_1, \ldots, x_n))\phi = 0$. In the case n = 1 it is a well-known result by Stellmacher and Lagnese [17]. When n = 2 and $u$ is homogeneous this follows from Berest’s theorem [12]. The proof of the general case would lead to the solution of the famous Hadamard problem in the class (6).
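The equivalence between the locus equations (2) and the vanishing of odd Laurent coefficients of the potential (3), stated earlier in this Introduction, can be made explicit by a short expansion. The following is only a sketch; the symbols $t$, $x^{\perp}$, $a_\beta$, $b_\beta$ are notation assumed here for the computation and do not appear in the text.

```latex
% Expand u(x) in the normal coordinate t = (\alpha, x) near the hyperplane (\alpha, x) = 0.
\begin{align*}
x &= x^{\perp} + \frac{t}{(\alpha,\alpha)}\,\alpha, \qquad (\alpha,x^{\perp}) = 0, \qquad t = (\alpha,x),\\
(\beta,x) &= b_\beta + a_\beta t, \qquad b_\beta = (\beta,x^{\perp}), \qquad
a_\beta = \frac{(\alpha,\beta)}{(\alpha,\alpha)},\\
\frac{1}{(\beta,x)^{2}} &= \sum_{k\ge 0} (-1)^{k}(k+1)\,\frac{a_\beta^{\,k}}{b_\beta^{\,k+2}}\,t^{k}
\qquad (b_\beta \neq 0),\\
\bigl[t^{2j-1}\bigr]\,u(x) &= -\frac{2j}{(\alpha,\alpha)^{2j-1}}
\sum_{\beta\neq\alpha} \frac{m_\beta(m_\beta+1)(\beta,\beta)(\alpha,\beta)^{2j-1}}{(\beta,x^{\perp})^{2j+1}} .
\end{align*}
```

The term with $\beta = \alpha$ contributes only $m_\alpha(m_\alpha+1)(\alpha,\alpha)\,t^{-2}$, so it does not enter the odd coefficients, and the coefficient of $t^{-1}$ vanishes automatically. Since $(\beta,x) = (\beta,x^{\perp})$ on the hyperplane $t = 0$, the vanishing of the coefficients of $t^{1}, t^{3}, \ldots, t^{2m_\alpha-1}$ is exactly the system (2) for $j = 1, \ldots, m_\alpha$.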
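1. Rational Baker–Akhiezer Function Related to a Configuration of Hyperplanes

Let $\mathcal{A}$ be a finite set of noncollinear vectors $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{C}^n$ with multiplicities $m_\alpha \in \mathbb{N}$. We will assume that $(\alpha, \alpha) = \sum_{i=1}^{n} \alpha_i^2 \neq 0$.

Definition. A function $\psi(k, x)$, $k, x \in \mathbb{C}^n$, will be called a Baker–Akhiezer function (BA function) if the following two conditions are fulfilled:
1) $\psi(k, x)$ has the form
\[ \psi(k, x) = \frac{P(k, x)}{A(k)}\, e^{(k,x)}, \tag{7} \]
where $A(k) = \prod_{\alpha\in\mathcal{A}} (k, \alpha)^{m_\alpha}$ and $P(k, x)$ is a polynomial in $k$ with the highest term $A(k)$;
2) for all $\alpha \in \mathcal{A}$,
\[ \partial_\alpha\bigl(\psi(k, x)(k, \alpha)^{m_\alpha}\bigr) = \partial_\alpha^{3}\bigl(\psi(k, x)(k, \alpha)^{m_\alpha}\bigr) = \ldots = \partial_\alpha^{2m_\alpha - 1}\bigl(\psi(k, x)(k, \alpha)^{m_\alpha}\bigr) \equiv 0 \tag{8} \]
on the hyperplane $\Pi_\alpha$: $(k, \alpha) = 0$, where $\partial_\alpha = (\alpha, \frac{\partial}{\partial k})$ is the normal derivative for this hyperplane.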
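To see the definition at work in the simplest situation, here is a sketch in one dimension. The choices $n = 1$, $\mathcal{A} = \{\alpha\}$ with $\alpha = 1$ and $m_\alpha = 1$ are assumptions made for this illustration; the function used is $\psi_{\mathrm{rat}}$ from the Introduction.

```latex
% One-dimensional check of conditions (7) and (8) with \alpha = 1, m_\alpha = 1.
\begin{align*}
\psi(k,x) &= \Bigl(1-\frac{1}{kx}\Bigr)e^{kx} = \frac{P(k,x)}{A(k)}\,e^{kx},
\qquad A(k) = k, \qquad P(k,x) = k - \frac{1}{x},\\
\partial_k\bigl(\psi(k,x)\,k\bigr)
 &= \partial_k\Bigl[\Bigl(k-\frac{1}{x}\Bigr)e^{kx}\Bigr]
 = \bigl(1 + kx - 1\bigr)e^{kx} = kx\,e^{kx}.
\end{align*}
```

Here $P$ is a polynomial in $k$ with highest term $A(k) = k$, as (7) requires, and the derivative in (8) vanishes on the "hyperplane" $k = 0$. With $m_\alpha = 1$ only the first normal derivative has to be checked.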
Notice that (7) means that $\psi$ is a rational function of $k$ with the prescribed poles along the hyperplanes $\Pi_\alpha$, $\alpha \in \mathcal{A}$, and with the asymptotic behaviour at infinity: $\psi = (1 + o(1))\, e^{(k,x)}$ when $k \to \infty$ along the rays outside the singularities (cf. [1]). First of all, in the same way as in [4,5] one can prove the following

Theorem 1.1. If the Baker–Akhiezer function $\psi$ exists then it is unique and satisfies the algebraically integrable Schrödinger equation
\[ L\psi = -k^2\psi, \tag{9} \]
where
\[ L = -\Delta + \sum_{\alpha\in\mathcal{A}} \frac{m_\alpha(m_\alpha + 1)(\alpha, \alpha)}{(\alpha, x)^2}. \tag{10} \]

Algebraic integrability of the operator (10) means that $L$ is a part of a rich (supercomplete) commutative ring of partial differential operators (see [5] for precise definitions). This ring is described by the following theorem.

Theorem 1.2. Let $R_{\mathcal{A}}$ be the ring of polynomials $f(k)$ satisfying the following properties:
\[ \partial_\alpha f(k) = \partial_\alpha^{3} f(k) = \ldots = \partial_\alpha^{2m_\alpha - 1} f(k) \equiv 0 \tag{11} \]
on the hyperplane $(\alpha, k) = 0$ for any $\alpha \in \mathcal{A}$. If the Baker–Akhiezer function $\psi(k, x)$ exists then for any polynomial $f(k) \in R_{\mathcal{A}}$ there exists some differential operator $L_f(x, \frac{\partial}{\partial x})$ such that $L_f \psi(k, x) = f(k)\psi(k, x)$. All such operators form a commutative ring isomorphic to the ring $R_{\mathcal{A}}$. The Schrödinger operator (10) corresponds to $f(k) = -k^2$.
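A one-dimensional illustration of Theorem 1.2 may help. It is a sketch under the assumptions $n = 1$, $\mathcal{A} = \{\alpha\}$ with $\alpha = 1$ and $m_\alpha = 1$, so that $L = -d^2/dx^2 + 2/x^2$ and condition (11) reduces to $f'(0) = 0$. The third-order operator written below is verified by direct differentiation and is not quoted from the text.

```latex
% Ring R_A and two of its operators for n = 1, \alpha = 1, m_\alpha = 1.
\begin{align*}
R_{\mathcal{A}} &= \{\, f \in \mathbb{C}[k] : f'(0) = 0 \,\}
 = \operatorname{span}\{1, k^{2}, k^{3}, k^{4}, \dots\},\\
L_{k^{2}} &= \frac{d^{2}}{dx^{2}} - \frac{2}{x^{2}} = -L, \qquad
L_{k^{3}} = \frac{d^{3}}{dx^{3}} - \frac{3}{x^{2}}\frac{d}{dx} + \frac{3}{x^{3}},\\
L_{k^{3}}\psi &= \Bigl(k^{3} - \frac{k^{2}}{x}\Bigr)e^{kx} = k^{3}\,\psi,
\qquad \psi = \Bigl(1-\frac{1}{kx}\Bigr)e^{kx}.
\end{align*}
```

The polynomial $f(k) = k$ violates $f'(0) = 0$ and so does not belong to $R_{\mathcal{A}}$, whereas $k^2$ and $k^3$ do.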
We give the proof of these statements in a more general affine situation in Sect. 5. We should note that there exists the following explicit formula for $L_f$ (due to Yu. Berest [18]).

Theorem 1.3. The commuting partial differential operators $L_f$ for $f \in R_{\mathcal{A}}$ are given by the formula
\[ L_f = c_N (\mathrm{ad}_L)^N [\hat{f}(x)], \tag{12} \]
where $c_N = \frac{(-1)^N}{2^N N!}$, $N = \deg f$, $\hat{f}$ is the operator of multiplication by $f(x)$, and $(\mathrm{ad}_L)^N$ means the $N$th iteration of the standard ad-procedure, $\mathrm{ad}_A B = AB - BA$.

The proof follows from the results of the next section (see Corollary 2.5).

We should note that originally in [4] another axiomatics for the $\psi$-function was proposed. A function $\phi(k, x)$ of the form
\[ \phi(k, x) = P(k, x)\, e^{(k,x)} \tag{13} \]
was considered, where $P(k, x)$, as in (7), is a polynomial in $k$ with the highest term $A(k)$, with the property
\[ \partial_\alpha(\phi(k, x)) = \partial_\alpha^{3}(\phi(k, x)) = \ldots = \partial_\alpha^{2m_\alpha - 1}(\phi(k, x)) \equiv 0 \tag{14} \]
at the hyperplane $\Pi_\alpha$. Comparing (13), (14) with (7), (8) we see that the difference between these two axiomatics is due to the additional factor $\prod_{\beta\neq\alpha} (k, \beta)^{m_\beta}$. In the Coxeter situation considered in [4] (see Sect. 4 below) this factor is not essential because of its symmetry. It turns out that this minor change makes the axiomatics less restrictive and leads to a richer class of the integrable Schrödinger operators. We will prove (see Corollary 2.7) that if there exists $\phi$ satisfying the conditions (13), (14) then there exists also the BA function $\psi$ with the properties (7), (8) and in that case $\psi = \frac{\phi}{A(k)}$. The converse is not true: there are configurations for which $\psi$ does exist but $\phi$ does not (see the Remark 2 after the proof of Theorem 4.4).

2. Monodromy and BA Functions

Let $L = -\Delta + u(x)$ be a Schrödinger operator with a meromorphic potential $u(x)$ having a pole along the hyperplane $\Pi_\alpha$: $(\alpha, x) = 0$, which is assumed to be non-isotropic: $(\alpha, \alpha) \neq 0$. We are looking for a formal solution $\varphi$ of the Schrödinger equation $L\varphi = \lambda\varphi$ in the form
\[ \varphi(x) = \sum_{s\ge 0} \varphi_s^{(\alpha)} (\alpha, x)^{\mu+s}, \tag{15} \]
for some $\mu$, where the coefficients $\varphi_s^{(\alpha)} = \varphi_s^{(\alpha)}(x^{\perp})$ are some analytic functions on the hyperplane $\Pi_\alpha$, $x^{\perp}$ is an orthogonal projection of $x$ onto $\Pi_\alpha$, $\varphi_0^{(\alpha)} \neq 0$.

Let’s suppose that the equation $L\varphi = \lambda\varphi$ has a solution of the form (15) with some $\mu < 0$. Then substitution into the equation gives immediately that the potential $u(x)$ must have a second order pole along $\Pi_\alpha$: the Laurent expansion in the normal direction $\alpha$ has the form
\[ u(x) = \sum_{k\ge -2} c_k^{(\alpha)} (\alpha, x)^k \tag{16} \]
with $c_{-2}^{(\alpha)} = \mu(\mu - 1)(\alpha, \alpha)$.
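The value of $c_{-2}^{(\alpha)}$ comes from balancing the most singular terms of $L\varphi = \lambda\varphi$. A minimal sketch of that step follows; the symbol $\widetilde{\Delta}$ for the Laplacian along $\Pi_\alpha$ is assumed here.

```latex
% Leading-order balance: the cross term vanishes because \nabla\varphi_0 is tangent to \Pi_\alpha.
\begin{align*}
\Delta\bigl[\varphi_0(x^{\perp})\,(\alpha,x)^{\mu}\bigr]
 &= \mu(\mu-1)(\alpha,\alpha)\,\varphi_0\,(\alpha,x)^{\mu-2}
  + (\widetilde{\Delta}\varphi_0)\,(\alpha,x)^{\mu},\\
\bigl[(\alpha,x)^{\mu-2}\bigr]:\quad
 & \bigl(-\mu(\mu-1)(\alpha,\alpha) + c^{(\alpha)}_{-2}\bigr)\varphi_0 = 0
 \;\Longrightarrow\; c^{(\alpha)}_{-2} = \mu(\mu-1)(\alpha,\alpha).
\end{align*}
```

In particular, for $\mu = -m$ with $m$ a positive integer this gives $c^{(\alpha)}_{-2} = m(m+1)(\alpha,\alpha)$, the form of the coefficient appearing in the potential (3).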
Moreover, we obtain the following recurrent relations for the coefficients $\varphi_s^{(\alpha)}$:
\[ (\alpha, \alpha)\bigl(\mu(\mu - 1) - (\mu + s)(\mu + s - 1)\bigr)\varphi_s = (\widetilde{\Delta} + \lambda)\varphi_{s-2} - \sum_{i=-1}^{s-2} c_i \varphi_{s-i-2}, \tag{17} \]
($s = 1, 2, \ldots$), where $\widetilde{\Delta}$ is the Laplacian $\Delta$ restricted to the hyperplane $\Pi_\alpha$ and we omitted all the indices $\alpha$ in the coefficients. If $2\mu \notin \mathbb{Z}$ we can determine all $\varphi_s$ from (17) and obtain the solution (15) starting from an arbitrary function $\varphi_0$ (the same procedure gives also another solution with $\mu' = 1 - \mu$). In the one-dimensional case this is a classical way (going back to Frobenius, see e.g. [27]) to construct the basis of solutions of the corresponding equation
\[ -\varphi'' + u(x)\varphi = \lambda\varphi \tag{18} \]
in the vicinity of its regular singular point.

In the case when Eq. (18) has no monodromy in the complex domain, i.e. all the solutions are single-valued, we have that
1) $\mu$ must be an integer: $\mu = -m$, $m \in \mathbb{Z}_+$,
2) the first $2m + 1$ equations from (17) must be compatible.
In case this is true for each energy level $\lambda$ we will say that the Schrödinger operator has trivial monodromy.

In the multidimensional case there exists a generalisation of Frobenius’s theory for the partial differential equations with regular singularities in the complex domain (see [28]). For the Schrödinger equation with a singularity along a hypersurface the regularity condition means that the potential has at most a second order pole. The considerations above motivate the following

Definition. We say that a Schrödinger operator $L = -\Delta + u(x)$ with meromorphic potential $u(x)$ with a second order pole along the hyperplane $\Pi_\alpha$: $(\alpha, x) = 0$ has local trivial monodromy around this hyperplane if
1) the Laurent coefficient $c_{-2}^{(\alpha)}$ in the expansion (16) has the form $c_{-2}^{(\alpha)} = m_\alpha(m_\alpha + 1)(\alpha, \alpha)$ for some $m_\alpha \in \mathbb{Z}_+$,
2) the system (17) with $\mu = -m_\alpha$ is compatible for any function $\varphi_0$ and for all $\lambda \in \mathbb{C}$.

Theorem 2.1. $L$ has local trivial monodromy around $\Pi_\alpha$ if and only if the coefficients of the normal Laurent expansion of the potential $u(x)$ near $\Pi_\alpha$,
\[ u(x) = \sum_{s\ge -2} c_s^{(\alpha)} (\alpha, x)^s, \]
satisfy the following conditions: $c_{-2}^{(\alpha)} = m_\alpha(m_\alpha + 1)(\alpha, \alpha)$ for some $m_\alpha \in \mathbb{Z}_+$, and
\[ c_{-1}^{(\alpha)} = c_{1}^{(\alpha)} = c_{3}^{(\alpha)} = \ldots = c_{2m_\alpha - 1}^{(\alpha)} \equiv 0 \quad \text{on } \Pi_\alpha. \tag{19} \]
In that case the Laurent expansions of the corresponding eigenfunctions $\varphi$ (15) satisfy the conditions
\[ \varphi_{1}^{(\alpha)} = \varphi_{3}^{(\alpha)} = \ldots = \varphi_{2m_\alpha - 1}^{(\alpha)} \equiv 0 \quad \text{on } \Pi_\alpha. \tag{20} \]

Proof. The proof is similar to the one-dimensional case considered by J. Duistermaat and A. Grünbaum [14]. Let’s demonstrate the idea in the simplest case when $m_\alpha = 1$. After substituting (15) into the Schrödinger equation, we deduce that $\mu = -1$ and derive the following recurrent relations for $\varphi_k^{(\alpha)}$:
\[ \begin{aligned}
(-2 + c_{-2})\varphi_0 &= 0,\\
2\varphi_1 + c_{-1}\varphi_0 &= 0,\\
2\varphi_2 + (-\widetilde{\Delta} - \lambda)\varphi_0 + c_0\varphi_0 + c_{-1}\varphi_1 &= 0,\\
0\cdot\varphi_3 + (-\widetilde{\Delta} - \lambda)\varphi_1 + c_1\varphi_0 + c_0\varphi_1 + c_{-1}\varphi_2 &= 0,\\
&\;\;\vdots
\end{aligned} \tag{21} \]
where $\widetilde{\Delta}$ is the Laplacian $\Delta$ restricted to the hyperplane $\Pi$ (we omitted all the subindices $\alpha$ in these formulas and assumed that $(\alpha, \alpha) = 1$). These relations allow one to find all the coefficients uniquely except $\varphi_0$ (which is an arbitrary function) and $\varphi_3$, provided the first four equations are consistent. From the first equation it follows that $c_{-2} = 2$. Expressing $\varphi_1$ and $\varphi_2$ from the second and the third equations and substituting them into the fourth one we arrive at the relation
\[ (-\widetilde{\Delta} - \lambda)\bigl(-\tfrac{1}{2} c_{-1}\varphi_0\bigr) - \tfrac{1}{2} c_{-1}(-\widetilde{\Delta} - \lambda)\varphi_0 + \bigl(c_1 - c_0 c_{-1} + \tfrac{1}{4} c_{-1}^{3}\bigr)\varphi_0 = 0, \]
which should be valid for all $\varphi_0$ and $\lambda$. Vanishing of the leading term in $\lambda$ gives $c_{-1}\varphi_0 \equiv 0$, i.e. $c_{-1} \equiv 0$. The relation reduces after that to $c_1\varphi_0 = 0$, thus $c_1 \equiv 0$. Notice that the second equation implies $\varphi_1 \equiv 0$ since $c_{-1} \equiv 0$. This completes the proof in the case when $m_\alpha = 1$. In the general case one should use induction arguments (see [14], p. 196). □

Remark. One can consider a more general case, when $u(x)$ has a singularity along an arbitrary hypersurface $\varphi(x) = 0$. However, analysis of the corresponding relations (21) shows that the hypersurface has to be a hyperplane (cf. [19]).

Now let’s consider a Schrödinger operator (1) $L$, corresponding to some Baker–Akhiezer function $\psi$. We claim that such an operator has local trivial monodromy around all the singular hyperplanes. To prove this one can consider for a given $\lambda$ the $(n-1)$-dimensional family of the solutions of the Schrödinger equation $(L - \lambda)\varphi = 0$ of the form $\varphi = \psi(k, x)$ with $k^2 = -\lambda$.
They have proper pole behaviour near the hyperplane $(\alpha, x) = 0$. Unfortunately, $\psi_0^{(\alpha)}$ depends on $k$ and is not an arbitrary function on the hyperplane, so we have to present additional arguments. We’ll prove a slightly more general result, which we will use also in Sect. 6.

Theorem 2.2. Let the Schrödinger operator $L = -\Delta + u(x)$ have an eigenfunction $\psi(k, x)$, $L\psi = -k^2\psi$, of the form $\psi = P(k, x)\, e^{(k,x)}$, where $P$ is a finite sum of some functions which are homogeneous in $k$ and meromorphic in $x$. Then the singularities of $u(x)$ are second order poles located on a union of non-isotropic hyperplanes and $L$ has local trivial monodromy around these hyperplanes.
Proof. The fact that singularities of $u(x)$ must be located on the hyperplanes was proved by Yu. Yu. Berest and A. P. Veselov in [20] under the assumption that $P$ is a polynomial in $k$, but their proof works also in the case when $P$ is a finite sum of functions homogeneous in $k$. The fact that these hyperplanes must be non-isotropic follows from the zero-residue lemma of the same paper [20] (see also [19]). Let's now prove that the conditions (19) are satisfied. After a proper choice of orthonormal basis we may assume that the hyperplane under consideration has the equation $x_1 = 0$, and let's consider the Laurent expansion for the function $\psi(k, x)$:

$$\psi(k, x) = x_1^{-m}\sum_{i=0}^{+\infty}\psi_i(k, x_2, \dots, x_n)\,x_1^{i}. \tag{22}$$

Let's prove first that $m$ has to be positive. Let $P^0$ be the highest homogeneous term of $P$; then from the Schrödinger equation we have $\sum_i k_i\,\partial P^0/\partial x_i = 0$. So $P^0(k, x + kt)$ is constant while $t$ varies, hence if $P^0$ vanishes on the hyperplane $x_1 = 0$, then it vanishes identically. Thus $P^0$, and therefore $\psi$, cannot be zero at the hyperplane, so $m$ in (22) must be positive. Substituting (22) into the Schrödinger equation immediately gives that $c_{-2} = m(m+1)$ and leads to the following recurrence relations:
$$\bigl(m(m+1) - (j+2-m)(j+1-m)\bigr)\psi_{j+2} = (\tilde\Delta - k^2)\psi_j - \sum_{i=-1}^{j} c_i\,\psi_{j-i} \qquad (j = -1, 0, 1, 2, \dots), \tag{23}$$

where $\tilde\Delta = \frac{\partial^2}{\partial x_2^2} + \dots + \frac{\partial^2}{\partial x_n^2}$. To prove (19) let's suppose that $c_{-1} = c_1 = \dots = c_{2p-3} = 0$, but $c_{2p-1} \neq 0$ for some $p < m + 1$. Considering $j = -1, 1, 3, \dots, 2p-3$, it is easy to see that $\psi_1 = \psi_3 = \dots = \psi_{2p-1} = 0$. From the form of the function $\psi$ it follows that $\psi_j = P_j(k, x_2, \dots, x_n)e^{(\tilde k,\tilde x)}$, where $P_j$ is a finite sum of homogeneous functions in $k$, $\tilde k = (k_2, \dots, k_n)$, $\tilde x = (x_2, \dots, x_n)$. Let $P_j^0$ be the highest homogeneous term of $P_j$. By induction one can prove that

$$P_{2j}^0 = (-1)^j k_1^{2j} P_0^0\, a_j \quad\text{and}\quad P_{2j-1}^0 = (-1)^{j-p} k_1^{2(j-p-1)} P_0^0\, c_{2p-1} b_j,$$

where the constant $a_j > 0$ and $b_1 = b_2 = \dots = b_p = 0$ (by assumption) and $b_j > 0$ for $m \ge j \ge p+1$. Indeed, for $P_{2j}^0$ it follows easily from the relations (23). For $P_{2j-1}^0$ one can use induction arguments similar to [14] (Prop. 3.3, p. 196). Now let's consider Eq. (23) with the resonance value $j = 2m-1$:

$$0 = (\tilde\Delta - k^2)\psi_{2m-1} - \sum_{i=-1}^{2m-1} c_i\,\psi_{2m-1-i}.$$

Since this holds identically for all $k$ the highest homogeneous term should vanish. Simple calculation shows that this term is equal to

$$-\bigl(P_{2m-1}^0 k_1^2 + P_{2m-2p}^0 c_{2p-1}\bigr) = (-1)^{m-p+1} k_1^{2(m-p)} P_0^0\,(b_m + a_m)\,c_{2p-1}.$$

Since $b_m + a_m > 0$ and $P_0^0 \neq 0$ it vanishes only if $c_{2p-1} = 0$. This completes the proof. □
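The role of the resonance value $j = 2m - 1$ in (23) can be seen by tabulating the coefficient of $\psi_{j+2}$ on the left-hand side. The short Python sketch below is an added illustration, not part of the original argument; the function names are hypothetical. It shows that the coefficient is positive for $-1 \le j \le 2m-2$, so that $\psi_{j+2}$ is determined by the recurrence, and that it vanishes exactly at $j = 2m-1$.

```python
def lhs_coefficient(m: int, j: int) -> int:
    """Coefficient of psi_{j+2} on the left-hand side of (23)."""
    return m * (m + 1) - (j + 2 - m) * (j + 1 - m)


def coefficient_table(m: int) -> list:
    """Pairs (j, coefficient) for j = -1, 0, ..., 2m - 1."""
    return [(j, lhs_coefficient(m, j)) for j in range(-1, 2 * m)]


if __name__ == "__main__":
    # For each m the last entry (j = 2m - 1) is the resonance.
    for m in (1, 2, 3):
        print(m, coefficient_table(m))
```

For $m = 1$ the table reproduces the coefficients $2, 2, 0$ of $\varphi_1, \varphi_2, \varphi_3$ in (21).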
It is remarkable that the BA function turns out to be symmetric with respect to $k$ and $x$. For Coxeter configurations this property has been established in [5].

Theorem 2.3. The Baker–Akhiezer function $\psi(k, x)$ is symmetric with respect to $x$ and $k$: $\psi(k, x) = \psi(x, k)$.

Proof. The idea is to show that $\psi(x, k)$ is also the BA function and then to use the uniqueness (Theorem 1.1). Let's prove that $\frac{A(x)P(k,x)}{A(k)}$ is a polynomial in $x$ with the highest term $A(x)$, where $A(x)$ and $P(k, x)$ are the same as in (7). For that let us consider conditions (8) for $\psi(k, x)$. They give a linear system for the coefficients of the polynomial $P$ with the coefficients, which are polynomial in $(\alpha, x)$, $\alpha \in A$. Since this system has a unique solution, the coefficients of $P$ are rational in $x$. Let's denote by $P_j(k, x)$ the homogeneous term of $\frac{P(k,x)}{A(k)}$ of degree $-j$ in $k$. In terms of $P_j(k, x)$ one can rewrite Eq. (9) in the following recurrent way:

$$L P_j(k, x) = 2\sum_{i=1}^{n} k_i \frac{\partial}{\partial x_i} P_{j+1}, \qquad P_0(k, x) = 1.$$
From this it follows by induction that all the singularities of $\psi(k, x)$ in $x$ belong to our configuration of the hyperplanes $(\alpha, x) = 0$. Analyzing Laurent expansions for $u(x)$ and $\psi(k, x)$ on these hyperplanes we conclude that $\psi(k, x)$ has a pole of order $m_\alpha$ along the hyperplanes $(\alpha, x) = 0$. All that means that $A(x)P(k, x)$ is a polynomial in $x$. But from the uniqueness of the BA function it follows easily that $P_j(k, x)$ is also homogeneous in $x$ with the same degree $-j$. Hence the highest term in $x$ of the polynomial $A(x)P(k, x)$ is equal to $A(x)A(k)$. Thus $\psi(x, k) = \frac{A(x) + \dots}{A(x)}\,e^{(k,x)}$. Properties of the Laurent expansions in $x$ follow immediately from Theorems 2.1, 2.2. So we have all the conditions for $\psi(x, k)$ to be a BA function. The theorem is proved. □

Corollary 2.4. The Baker–Akhiezer function $\psi$ satisfies the following bispectral problem

$$L\Bigl(x, \frac{\partial}{\partial x}\Bigr)\psi(k, x) = -k^2\psi(k, x), \qquad L\Bigl(k, \frac{\partial}{\partial k}\Bigr)\psi(k, x) = -x^2\psi(k, x), \tag{24}$$

where $L$ is the Schrödinger operator (10).

Now we are able to prove Theorem 1.3.

Corollary 2.5. The Baker–Akhiezer function $\psi$ is an eigenfunction of the operator (12) for any $f \in R_A$.

Proof. Due to Theorem 1.2 and to the symmetry of $\psi$, for any $f \in R_A$ there exists a differential operator $A(k, \frac{\partial}{\partial k})$ such that $A(k, \frac{\partial}{\partial k})\psi = f(x)\psi$. On the other hand, $L(x, \frac{\partial}{\partial x})\psi = -k^2\psi$. Now we can use the identity (1.8) from [14] which states in that case that $(\mathrm{ad}\,L)^r(\hat f)[\psi] = (-\mathrm{ad}\,\hat k^2)^r(A)[\psi]$ for all $r \in \mathbb{Z}_+$. For $r = N = \mathrm{ord}\,A = \deg f$ the differential operator $(-\mathrm{ad}\,\hat k^2)^r(A)$ on the right-hand side has zero order and is, in fact, the operator of multiplication by $c f(k)$ with $c = (-2)^N N!$. This means that $\psi$ is an eigenfunction of the operator $(\mathrm{ad}\,L)^r(\hat f)$ with the eigenvalue $c f(k)$. This proves Theorem 1.3. □
Now let's explain why the existence of $\varphi$ with the properties (13), (14) (our old axiomatics, see Sect. 1) implies the existence of the BA function $\psi$. This follows from the following general statement, showing that the new axiomatics is in some sense the most general one. Let $A$ be any set of noncollinear vectors, $L = -\Delta + u(x)$ be a corresponding Schrödinger operator, $A(k) = \prod_{\alpha \in A} (\alpha, k)^{m_\alpha}$. Consider the functions $\phi$ of the form

$$\phi(k, x) = \frac{P(k, x)}{A(k)A(x)}\,e^{(k,x)}, \tag{25}$$

where $P$ is some polynomial in $k$ and $x$: $P = A(k)A(x) + \dots$, where dots mean the terms of lower order both in $k$ and in $x$.

Theorem 2.6. If the Schrödinger equation $L\phi = -k^2\phi$ has a solution $\phi$ of the form (25) then $\phi(k, x)$ has to be a BA function.

Proof. The proof now is almost evident. Theorems 2.1 and 2.2 provide conditions (8) for $\phi$ in the $x$-variable, and it has the required form (7) in $x$. Hence, $\phi(x, k)$ is a BA function and according to Theorem 2.3 $\phi(x, k) = \phi(k, x)$. □

Corollary 2.7. If a function $\varphi$ satisfies conditions (13)–(14) then $\psi = A^{-1}(k)\varphi$ is the Baker–Akhiezer function (7)–(8).

Proof. As it follows from the results of the papers [4,5], the function $\varphi$ must be an eigenfunction of the same equation (9). Then the arguments we used in the proof of Theorem 2.3 show that $\phi = A^{-1}(k)\varphi$ satisfies the conditions of Theorem 2.6 and therefore is the Baker–Akhiezer function. □

3. Locus Equations and the Existence of the BA Function

Let $A$, as in Sect. 1, be a finite set of non-collinear vectors $\alpha \in \mathbb{C}^n$ with given multiplicities $m_\alpha \in \mathbb{Z}_+$, $A$ be the corresponding configuration of hyperplanes $(\alpha, k) = 0$ in $\mathbb{C}^n$ and $L = -\Delta + u(x)$ be the Schrödinger operator with the potential

$$u(x) = \sum_{\alpha \in A} \frac{m_\alpha(m_\alpha + 1)(\alpha, \alpha)}{(\alpha, x)^2}. \tag{26}$$
Theorems 2.1 and 2.2 from the previous section imply that if the BA function for the configuration $A$ exists then in the normal Laurent expansions (16) of the potential $u(x)$, the first odd terms $c_{2j-1}^{(\alpha)}$ ($j = 1, \dots, m_\alpha$) should vanish identically on the hyperplane $(\alpha, x) = 0$. More explicitly, these conditions have the form of the following highly overdetermined algebraic system:

$$\sum_{\substack{\beta \in A \\ \beta \neq \alpha}} \frac{m_\beta(m_\beta + 1)(\beta, \beta)(\alpha, \beta)^{2j-1}}{(\beta, x)^{2j+1}} \equiv 0 \quad \text{on the hyperplane } (\alpha, x) = 0 \tag{27}$$

for $j = 1, 2, \dots, m_\alpha$. We will call Eqs. (27) locus equations, following Airault, McKean and Moser [13], who used this terminology in the one-dimensional case. The configurations $A$ which satisfy the locus equations we will call locus configurations. The remarkable fact is that the locus equations (27) are not only necessary, but are also sufficient for the existence of the BA function. We will give the proof following the paper [21].
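For configurations of lines in the plane the locus equations (27) are easy to test numerically. The following Python sketch is an added illustration under stated assumptions: the helper names are hypothetical, the vectors are taken in $\mathbb{C}^2$ with the complex-bilinear form, and one sample point per line is used because every term of (27) is homogeneous in $x$. It returns the left-hand sides of (27), which should all vanish (up to rounding) for a locus configuration.

```python
import math


def dot(u, v):
    """Complex-bilinear (not Hermitian) form (u, v) on C^2."""
    return u[0] * v[0] + u[1] * v[1]


def locus_residuals(config):
    """Left-hand sides of the locus equations (27) for a planar configuration.

    `config` is a list of (vector, multiplicity) pairs with pairwise
    non-collinear, non-isotropic vectors.  For each alpha the line
    (alpha, x) = 0 is spanned by x = (-alpha_2, alpha_1).
    """
    residuals = []
    for a_idx, (alpha, m_alpha) in enumerate(config):
        x = (-alpha[1], alpha[0])
        for j in range(1, m_alpha + 1):
            total = 0.0
            for b_idx, (beta, m_beta) in enumerate(config):
                if b_idx == a_idx:
                    continue
                total += (m_beta * (m_beta + 1) * dot(beta, beta)
                          * dot(alpha, beta) ** (2 * j - 1)
                          / dot(beta, x) ** (2 * j + 1))
            residuals.append(total)
    return residuals


def lines_at_angles(angles, multiplicities):
    """Unit normal vectors (cos t, sin t) with the given multiplicities."""
    return [((math.cos(t), math.sin(t)), m)
            for t, m in zip(angles, multiplicities)]


if __name__ == "__main__":
    third = math.pi / 3
    coxeter_a2 = lines_at_angles([0.0, third, 2 * third], [2, 2, 2])
    generic = lines_at_angles([0.0, 0.9, 1.7], [1, 1, 1])
    print(max(abs(r) for r in locus_residuals(coxeter_a2)))  # rounding-level
    print(max(abs(r) for r in locus_residuals(generic)))     # clearly non-zero
```

The first example is three lines at mutual angles of $\pi/3$ with equal multiplicities, which is invariant under its own reflections; the second is a generic triple of lines, for which the equations fail.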
Theorem 3.1. For any locus configuration $A$ the BA function $\psi(k, x)$ does exist and can be given by the following Berest's formula:

$$\psi(k, x) = \bigl[(-2)^M M!\,A(k)\bigr]^{-1}(L + k^2)^M\Bigl[\prod_{\alpha \in A} (\alpha, x)^{m_\alpha} \exp(k, x)\Bigr], \tag{28}$$

where $M = \sum_{\alpha \in A} m_\alpha$, $A(k) = \prod_{\alpha \in A} (\alpha, k)^{m_\alpha}$.
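As a sanity check of formula (28), the simplest case can be worked out by hand. The computation below is an added illustration, not taken from the source; it assumes $n = 1$ and a single vector $\alpha = 1$ with multiplicity $m_\alpha = 1$, so that $M = 1$, $A(k) = k$, the locus equations (27) are vacuous, and the potential (26) is $u(x) = 2/x^2$.

```latex
\begin{align*}
L &= -\frac{d^2}{dx^2} + \frac{2}{x^2}, \qquad
\frac{d^2}{dx^2}\bigl(x e^{kx}\bigr) = (2k + k^2 x)\,e^{kx},\\
(L + k^2)\bigl[x e^{kx}\bigr]
  &= \Bigl(-2k - k^2 x + \frac{2}{x} + k^2 x\Bigr)e^{kx}
   = \Bigl(-2k + \frac{2}{x}\Bigr)e^{kx},\\
\psi(k,x) &= \bigl[(-2)^1\,1!\,k\bigr]^{-1}\Bigl(-2k + \frac{2}{x}\Bigr)e^{kx}
   = \Bigl(1 - \frac{1}{kx}\Bigr)e^{kx}.
\end{align*}
```

The result is visibly symmetric in $k$ and $x$, in line with Theorem 2.3, and a direct differentiation confirms that $-\psi'' + \frac{2}{x^2}\psi = -k^2\psi$.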
Proof. Let's consider the linear space $V$ which consists of the functions $\varphi(x)$, $x \in \mathbb{C}^n$, with the following analytic properties:

1) $\varphi(x)\prod_{\alpha \in A} (\alpha, x)^{m_\alpha}$ is holomorphic in $\mathbb{C}^n$;
2) for each $\alpha \in A$ the Laurent expansion (15) for $\varphi$ should not contain the terms of order $-m_\alpha + 2j - 1$ ($j = 1, \dots, m_\alpha$), i.e. the conditions (20) hold.

The basic observation is the following

Lemma. The space $V$ defined above is invariant under the Schrödinger operator with the potential (26) provided that the locus conditions (27) are fulfilled.

It follows easily from the imposed conditions on the Laurent expansions in the $\alpha$-direction for $u(x)$ and $\varphi \in V$. Now let's define the functions $\phi_i$ ($i = 0, 1, \dots$) in the following way:

$$\phi_0 = \prod_{\alpha \in A} (\alpha, x)^{m_\alpha} \exp(k, x) \quad\text{and}\quad \phi_{i+1} = (L + k^2)\phi_i. \tag{29}$$
It's obvious that $\phi_0$ belongs to $V$, hence by the lemma $\phi_i$ also belongs to $V$. From the definition of these functions and the property 1 of $V$ it is clear that $\phi_i$ can be presented in the form $\phi_i = R_i(k, x)\exp(k, x)$, where $R_i = Q_i \prod_{\alpha \in A} (\alpha, x)^{-m_\alpha}$ for some polynomial $Q_i(k, x)$. From (29) it follows that the degrees of the polynomials $Q_i$ in $x$ decrease: $\deg Q_{i+1} < \deg Q_i$. Therefore, for some $N$, $\phi_N \neq 0$ but $\phi_{N+1} = (L + k^2)\phi_N = 0$. Thus, $\varphi = \phi_N$ is an eigenfunction for the Schrödinger operator $L$. Let's prove that $N$ in fact equals $M = \sum_{\alpha \in A} m_\alpha$. If we denote by $R_i^0$ the highest homogeneous terms of $R_i$ in $x$, we see from (29) that

$$R_{i+1}^0 = -2\sum_{j=1}^{n} k_j \frac{\partial}{\partial x_j} R_i^0.$$

From this we obtain immediately that for $i = M = \sum_{\alpha \in A} m_\alpha$,

$$R_M^0 = (-2)^M M! \prod_{\alpha \in A} (\alpha, k)^{m_\alpha}. \tag{30}$$

From this we conclude that for $i > M$ $R_i$ (which is polynomial in $k$) will be of negative degree in $x$. Thus, it cannot be an eigenfunction for the Schrödinger operator $L$ because of the following lemma due to F. A. Berezin [22].

Lemma. If a quasipolynomial $\psi$ in $k$, $\psi = P(k, x)\exp(k, x)$, satisfies the Schrödinger equation $(-\Delta + u(x))\psi = -k^2\psi$, then the highest term in $k$ of the polynomial $P$ must be polynomial in $x$.
This contradiction proves that the last non-zero function in the sequence (29) is $\phi_M$. Moreover, since $\phi_M$ belongs to the space $V$ we obtain using (30) that $\psi(k, x) = (R_M^0)^{-1}\phi_M$ satisfies axiomatics (7), (8) in $x$ as well as in $k$ according to Theorem 2.3. So, we proved that $\psi(k, x)$ defined by formula (28) is the BA function associated to a configuration $A$. □

Remark. The remarkable formula (28) for $\psi$ was discovered by Yu. Berest ([18]), who proved that if $\psi$ does exist then it should have the form (28).

4. Analysis of the Locus Equations and Locus Configurations

The next step would be to classify all the solutions of the locus equations (locus configurations). Unfortunately, this problem seems to be very difficult. In this section we present some results in this direction and all the known examples.
4.1. Coxeter systems. The most natural examples of the locus configurations are given by the mirrors of the Coxeter groups. Recall that a Coxeter group $W$ is by definition a finite group generated by some orthogonal reflections $s_\alpha(x) = x - \frac{2(\alpha,x)}{(\alpha,\alpha)}\alpha$ with respect to hyperplanes in $\mathbb{R}^n$ (see [23]). If we consider all the reflections from the Coxeter group $W$, then the set $A$ of the corresponding hyperplanes $(\alpha, x) = 0$ will be invariant under the action of $W$. The configuration $A$ of these hyperplanes with arbitrary $W$-invariant multiplicities $m_\alpha \in \mathbb{Z}_+$ gives an example of locus configuration. This fact follows immediately from the symmetry of the corresponding potential $u(x)$ with respect to any reflection $s_\alpha$, $\alpha \in A$. In this case the Schrödinger operator $L$ is the quantum Hamiltonian of the generalised Calogero–Moser system (see [24,6]). The existence of the BA function for the root system of type $A_n$ with $m_\alpha = 1$ was proved in [4], where some explicit formula for $\psi$ has been found. This was done for the general Coxeter system in [5], using Heckman's formula [7] for the so-called shift operators in terms of the Dunkl operators [25]. Notice that our approach gives a new proof of this result.

Remark. In principle, one may try to extend these examples to the complex case, by considering a finite group generated by orthogonal reflections in complex Euclidean space. However, it is known (see e.g. [26]) that all such groups are nothing but the complexified Coxeter groups.

4.2. Deformed root systems. The first non-Coxeter locus configuration $A_n(m)$ was introduced in [8]. It consists of the following vectors in $\mathbb{R}^{n+1}$: $e_i - e_j$ with multiplicity $m$ ($1 \le i < j \le n$) and $e_i - \sqrt{m}\,e_{n+1}$ with multiplicity 1 ($i = 1, \dots, n$). Notice that for $m = 1$ we have the root system $A_n$. We can allow the parameter $m$ to be negative simply by considering the vectors $e_i - e_j$ with the multiplicity $-1 - m$ in that case (then, of course, we will have a complex configuration in $\mathbb{C}^{n+1}$).
The corresponding Schrödinger operator has the form:

$$L = -\Delta + \sum_{i<j}^{n} \frac{2m(m+1)}{(x_i - x_j)^2} + \sum_{i=1}^{n} \frac{2(m+1)}{(x_i - \sqrt{m}\,x_{n+1})^2}. \tag{31}$$
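Figure 1. The configuration $A_2(m)$: two lines of multiplicity 1 and one line of multiplicity $m$; the marked angle $\theta$ satisfies $\cos\theta = \frac{m}{m+1}$.

Figure 2. The configuration $C_2(m, l)$: lines of multiplicities $1$, $l$, $1$, $m$; the marked angle satisfies $\cos 2\phi = \frac{l - m}{l + m + 1}$.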
In the simplest nontrivial case $n = 2$ we have the following configuration (see Fig. 1). The next example is related to the root system of $C_n$-type. Let's consider the following set of vectors in $\mathbb{R}^{n+1}$:

$$C_{n+1}(m, l) = \begin{cases}
e_i \pm e_j & \text{with multiplicity } k,\\
2e_i & \text{with multiplicity } m,\\
2\sqrt{k}\,e_{n+1} & \text{with multiplicity } l,\\
e_i \pm \sqrt{k}\,e_{n+1} & \text{with multiplicity } 1,
\end{cases}$$

where $l$ and $m$ are integer parameters such that $k = \frac{2m+1}{2l+1} \in \mathbb{Z}$, $1 \le i < j \le n$. In the case of the $C_2(m, l)$-system the parameters $m, l$ can be arbitrary integers; the corresponding quantum problem was considered in [8,9]. The corresponding configuration has the form shown in Fig. 2.
For $n > 1$ the corresponding Schrödinger operator has the form:

$$L = -\Delta_{n+1} + \sum_{i<j}^{n} \frac{4k(k+1)(x_i^2 + x_j^2)}{(x_i^2 - x_j^2)^2} + \sum_{i=1}^{n} \frac{m(m+1)}{x_i^2} + \frac{l(l+1)}{x_{n+1}^2} + \sum_{i=1}^{n} \frac{4(k+1)(x_i^2 + k x_{n+1}^2)}{(x_i^2 - k x_{n+1}^2)^2}, \tag{32}$$

where $k = \frac{2m+1}{2l+1}$. In the case $l = m$ the system $C_{n+1}(m, l)$ coincides with the classical root system $C_{n+1}$ (or $D_{n+1}$ for $l = m = 0$). Again, as for the $A_n(m)$ system, the parameters $k, l, m$ may be negative; in that case the corresponding multiplicities in (4.2) should be $-1 - k$, $-1 - m$ or $-1 - l$ respectively. The simplest way to check the validity of the locus equations for these configurations is to use the following important property of system (27):

Theorem 4.1. A configuration $A$ satisfies the locus equations (27) if and only if each two-dimensional subsystem of $A$ gives a locus configuration. In other words, for each two-dimensional plane $\pi \subset \mathbb{C}^n$ the vectors $\alpha \in A \cap \pi$ with their multiplicities $m_\alpha$ must satisfy the locus equations.

Remark. Notice the analogy with the similar property of the Coxeter and root systems.

Proof. Let us denote by $\hat\beta$ the orthogonal projection of a vector $\beta$ onto the hyperplane $(\alpha, x) = 0$, then $(\hat\beta, x) \equiv (\beta, x)$ on this hyperplane. Let $\pi$ denote the two-dimensional plane spanned by $\alpha$ and $\gamma \neq \alpha$. Then the subsum of (27) over $\beta \in \pi$ becomes proportional to $(\hat\gamma, x)^{-2j-1}$ restricted to the hyperplane $(\alpha, x) = 0$. All these subsums for different two-dimensional planes are independent, so we come to the following equivalent form of (27): for any two-dimensional plane $\pi \subset \mathbb{C}^n$ and for each $\alpha \in A \cap \pi$ and $j = 1, \dots, m_\alpha$,

$$\sum_{\substack{\beta \in A \cap \pi \\ \beta \neq \alpha}} m_\beta(m_\beta + 1)(\beta, \beta)(\alpha, \beta)^{2j-1}(\beta, x)^{-2j-1} \equiv 0 \quad\text{for } (\alpha, x) = 0. \tag{33}$$
That gives the statement of the theorem. □

If we analyse the configurations $A_n(m)$ and $C_{n+1}(m, l)$ from this point of view, we will have in each two-dimensional plane either a usual root system or one of their deformations $A_2(m)$ and $C_2(m, l)$. For these two cases the locus equations can be checked by direct calculation. One can see that our configurations $A_n(m)$ and $C_{n+1}(m, l)$ have one common feature: they are obtained from Coxeter configurations by adding a special orbit of the Coxeter group with multiplicity 1 (a sort of “one-orbit deformation” of a Coxeter configuration). The following result demonstrates that such a property is not accidental: the hyperplanes with large multiplicities always form a Coxeter subsystem.

Definition. Let's say that the hyperplane $\Pi_\beta \in A$ has a large multiplicity $m_\beta$ if in each two-dimensional plane containing the vector $\beta$ there are no more than $m_\beta + 1$ vectors from $A$ (without taking into account the multiplicities).
Theorem 4.2. The set $B \subset A$ of all hyperplanes with large multiplicities forms a Coxeter configuration and all other hyperplanes and their multiplicities are invariant under the action of this Coxeter group.

Proof. We shall prove that for each $\Pi_\beta \in B$ the corresponding reflection $s_\beta$ preserves the set $A$ together with multiplicities. This implies, in particular, that $s_\beta(B) \subset B$. To prove the invariance of $A$ under $s_\beta$ let's consider as in Theorem 4.1 an arbitrary two-dimensional plane $\pi$, which contains $\beta$, and the corresponding two-dimensional locus equation (33):

$$\sum_{\substack{\gamma \in A \cap \pi \\ \gamma \neq \beta}} m_\gamma(m_\gamma + 1)(\gamma, \gamma)(\beta, \gamma)^{2j-1}(\gamma, x)^{-2j-1}\Big|_{(\beta,x)=0} \equiv 0,$$

where $j = 1, \dots, m_\beta$. Now we look at these equations for fixed generic $x$ as a linear system for unknowns $z_\gamma = m_\gamma(m_\gamma + 1)(\gamma, \gamma)(\beta, \gamma)(\gamma, x)^{-3}$ of the form

$$\sum_{\gamma \neq \beta} z_\gamma \left(\frac{(\beta, \gamma)^2}{(\gamma, x)^2}\right)^{j-1}\Bigg|_{(\beta,x)=0} \equiv 0, \qquad j = 1, \dots, m_\beta. \tag{34}$$

We need the following elementary lemma:

Lemma. If three unit vectors $\beta, \gamma, \gamma'$ belong to some two-dimensional subspace in $\mathbb{C}^n$ and

$$\frac{(\beta, \gamma)^2}{(\gamma, x)^2} = \frac{(\beta, \gamma')^2}{(\gamma', x)^2}$$

for all $x$ such that $(\beta, x) = 0$ then either $\gamma = \pm\gamma'$ or $s_\beta(\gamma) = \pm\gamma'$.

Let's regroup the terms in (34) into the groups corresponding to different values of $\frac{(\beta,\gamma)^2}{(\gamma,x)^2}$. From the properties of the Vandermonde determinant we easily conclude that the sum of $z_\gamma$ in each group should vanish. On the other hand, using the lemma we see that there are only two terms in each group, and they correspond to the pairs of vectors $\gamma, \gamma'$ with $s_\beta(\gamma) = \pm\gamma'$. Finally, we arrive at the condition $(z_\gamma + z_{\gamma'})|_{(\beta,x)=0} = 0$, which gives $m_\gamma(m_\gamma + 1) = m_{\gamma'}(m_{\gamma'} + 1)$, i.e. $m_\gamma = m_{\gamma'}$. □
Remark. The Schrödinger operators (31), (32) remain integrable in a usual (Liouville) sense for the general (non-integer) values of the parameters $m, l$: there exist at least $n = \dim V$ independent commuting operators $L_1 = L, L_2, \dots, L_n$. Indeed, for the $A_n(m)$ case ($m$ is integer) it's easy to check that the polynomials $p_s = k_1^s + k_2^s + \dots + k_n^s + m^{\frac{s-2}{2}} k_{n+1}^s$ ($s = 1, 2, \dots$) satisfy the conditions (11) and, according to Theorem 1.2, there exist differential operators $L_s$ with the highest symbols $p_s$ such that $L_s\psi = p_s\psi$ and therefore $[L_s, L_t] = 0$. Since the coefficients of these operators depend on $\sqrt{m}$ in a rational way (see the explicit formula (12)), one can define such operators for general $m$. For $s = 2$ one has the Schrödinger operator (31), and other $L_s$ give its quantum integrals. In the case of the $C_{n+1}(m, l)$-system similar arguments prove the integrability of the Schrödinger operator (32) for the general $l, m$, and the commuting quantum integrals $L_s$ have the symbols $p_s = k_1^{2s} + \dots + k_n^{2s} + q^{s-1} k_{n+1}^{2s}$ ($q = \frac{2m+1}{2l+1}$, $s = 1, 2, \dots$).
4.3. Locus configurations on the plane. Yu. Berest and I. Lutsenko [11] in the context of Huygens' Principle have introduced the following family of the real potentials $u$ on the real plane. In polar coordinates they have the form

$$u(r, \phi) = -\frac{2}{r^2}\frac{\partial^2}{\partial\phi^2} \log W[\chi_1(\phi), \dots, \chi_M(\phi)], \tag{35}$$

where $\chi_j(\phi) = \cos(k_j\phi + \theta_j)$, $k_M > \dots > k_1 > 0$, $k_j \in \mathbb{N}$, $\theta_j \in \mathbb{R}$ and $W[\chi_1, \dots, \chi_M]$ is the Wronskian of $\chi_1, \dots, \chi_M$. One can consider the natural complexification of the Berest–Lutsenko family in the following way. The set of all non-isotropic lines in $\mathbb{C}^2$ is isomorphic to the cylinder $\mathbb{C}^* \simeq \mathbb{CP}^1 \setminus \{0, \infty\}$ and can be parametrised by a complex parameter $\phi \pmod{\pi}$, $x\cos\phi + y\sin\phi = 0$. Any configuration corresponds to a finite number of points in $\mathbb{C}^*$: $\phi_1, \dots, \phi_N$ with multiplicities $m_1, \dots, m_N$. The corresponding potential has the form

$$u = \frac{1}{r^2}\sum_{j=1}^{N} \frac{m_j(m_j + 1)}{\sin^2(\phi - \phi_j)}, \tag{36}$$
where $r^2 = x^2 + y^2 \in \mathbb{C} \setminus \{0\}$ and $\phi \pmod{\pi} = \arctan\frac{y}{x}$. The complex Berest–Lutsenko potentials given by formula (35) with the complex parameters $\theta_j$ have the form (36) with $\phi_j$ being the roots of the trigonometric polynomial $W[\phi]$; their multiplicities are known to have a “triangular” form $\frac{m_j(m_j+1)}{2}$ (see [13]).

Theorem 4.3. All the locus configurations on the plane are determined by the complex Berest–Lutsenko formula (35).

Proof. First of all the locus equations (27) in this case are equivalent to the following one-dimensional locus equations (cf. [13]) for the potential $v(\phi) = \sum_{j=1}^{N} \frac{m_j(m_j+1)}{\sin^2(\phi - \phi_j)}$:

$$\left(\frac{d}{d\phi}\right)^{2s-1}\left[\sum_{j \neq i} \frac{m_j(m_j + 1)}{\sin^2(\phi - \phi_j)}\right]_{\phi = \phi_i} = 0 \qquad (i = 1, \dots, N,\ s = 1, 2, \dots, m_i).$$

Now we can use the result from [21], which says that in its turn this is equivalent to the existence of the differential operator $D$ with $\pi$-periodic coefficients, intertwining the operator $L = -\frac{d^2}{d\phi^2} + v(\phi)$ with $L_0 = -\frac{d^2}{d\phi^2}$:

$$L \circ D = D \circ L_0. \tag{37}$$
The idea of the proof is close to the one demonstrated in the proof of Theorem 3.1, and we shall not reproduce it here. So, the only remaining thing to prove is that relation (37) implies that L can be obtained from L0 by classical Darboux transformations. Let’s assume that D has the
minimal order among all the intertwiners of $L$ and $L_0$ and consider its kernel: $V = \mathrm{Ker}\,D$. As it follows from (37) $V$ is invariant under $L_0$: if $Df = 0$ then $D(L_0 f) = LDf = 0$. Due to $\pi$-periodicity of the coefficients of $D$, $\mathrm{Ker}\,D$ is also invariant under the shift $T: f(\phi) \to f(\phi + \pi)$. We would like to show that the spectrum of $L_0|_V$ is simple and has the form $(k_1^2, k_2^2, \dots, k_M^2)$, where $0 < k_1 < k_2 < \dots < k_M$ are some integers. Suppose that there exists an eigenfunction $f \in V$ with the eigenvalue $\lambda \neq k^2$, $k \in \mathbb{Z}$. Since $L_0$ commutes with $T$, we can assume that $f$ is a Bloch eigenfunction:

$$L_0 f = \lambda f, \qquad Tf = \mu f.$$

If $\lambda \neq k^2$, $f$ has to be a pure exponent: $f = Ce^{\sqrt{-\lambda}\,\phi}$ or $f = Ce^{-\sqrt{-\lambda}\,\phi}$. Since $Df = 0$ the operator $D$ can be factorised as

$$D = \tilde D \circ F, \qquad F = \frac{d}{d\phi} - \frac{f'}{f},$$

where $\tilde D$ is a $\pi$-periodic differential operator of order one less than $D$ (see e.g. [27]). When $f = Ce^{\pm\sqrt{-\lambda}\,\phi}$ we have $F = \frac{d}{d\phi} \mp \sqrt{-\lambda}$ and $L \circ \tilde D \circ F = \tilde D \circ F \circ L_0 = \tilde D \circ L_0 \circ F$. Thus $L \circ \tilde D = \tilde D \circ L_0$, so $\tilde D$ is also an intertwiner with order one less than the order of $D$, which contradicts the minimality of $D$. Thus the spectrum of $L_0|_V$ consists only of the squares of integers: $\lambda = k^2$, $k \in \mathbb{Z}$. The same arguments show that $\lambda \neq 0$. So we have only to prove that the spectrum is simple. First of all there could be only one eigenfunction, corresponding to a given $\lambda = k^2$. Indeed, otherwise $\mathrm{Ker}\,D$ contains the whole $\mathrm{Ker}(L_0 - \lambda)$ and therefore $D$ can be factorised as $D = D_1 \circ (L_0 - \lambda)$ with $D_1$ being another intertwiner of less order. Suppose that $L_0$ has a Jordan block with $\lambda = k^2$. Consider the Jordan basis $f_0, f_1, \dots$: $(L_0 - \lambda)f_0 = 0$, $(L_0 - \lambda)f_1 = f_0, \dots$. Since $f_0$ cannot be a pure exponent (see above), $f_0 = A\cos(k\phi + \theta_0)$, then $f_1 = \frac{A\phi}{2k}\sin(k\phi + \theta_0) + B\cos(k\phi + \theta_1)$. Now from the invariance of $\mathrm{Ker}\,D$ under the shift $T$ we conclude that $\frac{A\pi}{2k}\sin(k\phi + \theta_0)$ also belongs to $\mathrm{Ker}\,D$. Together with $f_0$ the last function generates $\mathrm{Ker}(L_0 - \lambda)$, which leads to factorisation $D = D_1 \circ (L_0 - \lambda)$ and reducibility of $D$. Thus we have proven that $\mathrm{Ker}\,D$ is generated by the functions $\chi_1, \dots, \chi_n$ of the form $\chi_j = \cos(k_j\phi + \theta_j)$. The general formula (see e.g. [29]) from the theory of Darboux transformations says that $u = -2\frac{d^2}{d\phi^2}\log W[\chi_1, \dots, \chi_n]$. The theorem is proven. □

We should mention that although the formula (35) is explicit, it is not so easy to extract the geometric information about the locus configurations. For example, it is not clear how to prove the following theorem using this formula. It is very easy to show that all two-line locus configurations consist of two perpendicular lines with arbitrary multiplicities.
Let's consider the first non-trivial case of three lines $(\alpha, x) = 0$, $(\beta, x) = 0$ and $(\gamma, x) = 0$, $x \in \mathbb{C}^2$ with arbitrary multiplicities $m_\alpha, m_\beta, m_\gamma \in \mathbb{Z}_+$, and ask when they form a locus configuration. Modulo the natural rotational equivalence we have the following classification.

Theorem 4.4. All the three line locus configurations are listed below:
1) the Coxeter $A_2$ configuration with multiplicities $(m, m, m)$;
2) the deformed $A_2(m)$ configuration (31) with multiplicities $(1, 1, m)$ when $m$ is positive and $(1, 1, -m-1)$ when $m$ is negative;
3) the three line complex Berest–Lutsenko configurations, which can be parametrised in this case as $\alpha = (1, a)$, $\beta = (1, b)$, $\gamma = (0, 1)$ with $a^2 - ab + b^2 + 1 = 0$, with multiplicities $(1, 1, 1)$.

Proof. Let $A$ be an arbitrary three line locus configuration. Let us consider the first case when $A$ has at least two lines with multiplicities greater than 1. Then Theorem 4.2 states that $A$ has to be a Coxeter $A_2$-system. Now let us suppose that there is only one vector $\gamma = (0, 1)$ with multiplicity $m > 1$. Theorem 4.2 states that other two vectors have to be symmetric with respect to the vector $\gamma$, so we may fix the normalisation $\alpha = (1, \lambda)$, $\beta = (1, -\lambda)$. The locus equation (27) for $\alpha$ has the form:

$$\frac{2(1 + \lambda^2)(1 - \lambda^2)}{(x - \lambda y)^3} + \frac{m(m+1)\lambda}{y^3} = 0 \quad\text{if } x + \lambda y = 0.$$

From that it immediately follows that $\lambda$ can take only the following values: $\lambda = \pm\frac{1}{\sqrt{2m+1}}, \pm\frac{i}{\sqrt{2m+1}}$, and it is easy to check that $A$ is equivalent to the system $A_2(m)$ or $A_2(-m-1)$. The last case we have to consider is the case when all three vectors $\alpha = (1, a)$, $\beta = (1, b)$, $\gamma = (0, 1)$ have multiplicity 1. The locus equation (27) for the vector $\gamma$ takes the form

$$\frac{2a(a^2 + 1)}{(x + ay)^3} + \frac{2b(b^2 + 1)}{(x + by)^3} = 0 \quad\text{if } y = 0,$$

or

$$(a + b)(a^2 + b^2 - ab + 1) = 0.$$

The locus equations (27) corresponding to $\alpha$ and $\beta$ can be written as follows:

$$\begin{cases}
(1 + a^2)(1 + ab) + b(a - b)^3 = 0,\\
(1 + b^2)(1 + ab) + a(b - a)^3 = 0.
\end{cases}$$

In the case $a + b = 0$ this system of equations is fulfilled if and only if $a^4 = \frac{1}{9}$, which implies that $A$ is either the Coxeter system $A_2$ or the deformed system $A_2(-2)$. In the case $a^2 + b^2 - ab + 1 = 0$ the above system holds automatically without any additional restrictions. Thus, the theorem is proven. □

Remark 1. We should mention that some of the configurations 3) contain an isotropic line ($a = \pm i$, $b = 0$ or $a = 0$, $b = \pm i$) and therefore actually reduce to the two-line configurations. Notice also that when $a = i/\sqrt{3} = -b$ we have an $A_2(-2)$ configuration.

Remark 2. It can be checked that for the configurations 3) from Theorem 4.4 the function $\varphi$ with the properties (13)–(14) doesn't exist. This demonstrates that the converse for the statement of Corollary 2.7 is not true.

Notice that from this result it follows that the locus of $n$ lines is non-empty only for the special sets of multiplicities. Moreover, if the locus configuration is real then the set of multiplicities determines it uniquely up to rotation due to the following result.
Theorem 4.5. There exists no more than one locus configuration in $\mathbb{R}^2$ with a given cyclically ordered set of multiplicities.

Proof. Let $A = \{\alpha_1, \dots, \alpha_N\}$ be such a configuration for a given set of multiplicities $\{m_1, \dots, m_N\}$, and let us fix the normalisation $\alpha_i = (-\sin\phi_i, \cos\phi_i)$, $0 \le \phi_1 < \phi_2 < \dots < \phi_N < \pi$. Considering the locus equations, we have, in particular, that

$$\sum_{\substack{j=1 \\ j \neq i}}^{N} \frac{m_j(m_j + 1)\cos(\phi_j - \phi_i)}{\sin^3(\phi_j - \phi_i)} = 0 \quad\text{for } i = 1, \dots, N.$$

Let's now introduce the function

$$U(\phi_1, \dots, \phi_N) = \sum_{i<j} \frac{m_i(m_i + 1)\,m_j(m_j + 1)}{\sin^2(\phi_i - \phi_j)}.$$

We conclude that if $\Phi = (\phi_1, \dots, \phi_N)$ defines a locus configuration then necessarily $\frac{\partial}{\partial\phi_i}U(\phi_1, \dots, \phi_N) = 0$. Function $U$ being a sum of convex functions is a convex function in the domain $0 \le \phi_1 < \phi_2 < \dots < \pi$. Suppose it has one more extremum in the point $\tilde\Phi = (\tilde\phi_1, \dots, \tilde\phi_N)$. Then $U(\phi_1, \dots, \phi_N)$ should be a constant along the segment $\Phi + (\tilde\Phi - \Phi)t$, $0 \le t \le 1$, as well as each function $\frac{m_i(m_i+1)\,m_j(m_j+1)}{\sin^2(\phi_i - \phi_j)}$. From that it follows that $\tilde\phi_i = \phi_i + \phi_0$ for some constant $\phi_0$ for all $i$. This means that system $\{\alpha_i\}$ is defined uniquely up to a rotation. □

Corollary 4.6. If all the multiplicities are equal then the only real configuration on the plane is Coxeter, i.e. dihedral.
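The critical-point condition used in the proof of Theorem 4.5 can be illustrated numerically. The Python sketch below is an added example with hypothetical function names; it evaluates $U$ and its analytic partial derivatives and can be used to see that the gradient vanishes (up to rounding) for the dihedral configuration $\phi_j = j\pi/N$ with equal multiplicities, in agreement with Corollary 4.6, while it does not vanish for a generic triple of lines.

```python
import math


def weights(mults):
    """The factors m_i (m_i + 1) entering U."""
    return [m * (m + 1) for m in mults]


def potential_u(phis, mults):
    """U(phi_1, ..., phi_N) from the proof of Theorem 4.5."""
    w = weights(mults)
    n = len(phis)
    return sum(w[i] * w[j] / math.sin(phis[i] - phis[j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def gradient_u(phis, mults):
    """Analytic partial derivatives dU/dphi_i.

    Uses d/dt [1 / sin^2 t] = -2 cos t / sin^3 t.
    """
    w = weights(mults)
    n = len(phis)
    grad = []
    for i in range(n):
        s = sum(w[j] * math.cos(phis[i] - phis[j])
                / math.sin(phis[i] - phis[j]) ** 3
                for j in range(n) if j != i)
        grad.append(-2.0 * w[i] * s)
    return grad


if __name__ == "__main__":
    n = 5
    dihedral = [i * math.pi / n for i in range(n)]
    print(gradient_u(dihedral, [2] * n))            # rounding-level entries
    print(gradient_u([0.0, 0.5, 1.7], [1, 1, 1]))   # clearly non-zero
```

Each component of the gradient is proportional to the left-hand side of the corresponding locus equation displayed in the proof, which is why critical points of $U$ are the only candidates for real locus configurations.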
The consideration of all two-dimensional subsystems implies the following more general result.

Corollary 4.7. Any real locus configuration in $\mathbb{R}^n$ with equal multiplicities must be Coxeter.

5. Affine Locus

In this section we present some results concerning the case when the singular set of the potential $u(x)$ of the Schrödinger operator is an affine configuration $S$ of hyperplanes. So, we consider a Schrödinger operator $L = -\Delta + u(x)$ with rational potential having second order poles along some non-isotropic hyperplanes in $\mathbb{C}^n$. Let $(\alpha_s, x) + c_s = 0$ ($s = 1, \dots, K$) be the equations of these hyperplanes. We will suppose also that the potential $u(x)$ decays at infinity, i.e. $u(x) \to 0$ while $x \to \infty$ along the rays outside singularities. Impose now the condition that $L$ has local trivial monodromy around its singularities. Then Theorem 2.1 from Sect. 2 allows us to reformulate this condition as some algebraic
conditions on the arrangement $S$ of the singular hyperplanes $(\alpha_j, x) + c_j = 0$. First of all, it follows that the potential $u(x)$ must be of the form

$$u(x) = \sum_{j=1}^{K} \frac{m_j(m_j + 1)(\alpha_j, \alpha_j)}{((\alpha_j, x) + c_j)^2} \tag{38}$$

for some integers $m_1, \dots, m_K$. Then conditions (19) imply that the Schrödinger operator with the potential of the form (38) has local trivial monodromy around its singularities if and only if the following relations are satisfied:

$$\sum_{j \neq i} \frac{m_j(m_j + 1)(\alpha_j, \alpha_j)(\alpha_i, \alpha_j)^{2s-1}}{((\alpha_j, x) + c_j)^{2s+1}} \equiv 0 \tag{39}$$
identically on the hyperplane $(\alpha_i, x) + c_i = 0$ for all $i = 1, \dots, K$ and $s = 1, \dots, m_i$. We will call the relations (39) locus equations. The equations (27) from Sect. 3 are their particular case, when all the hyperplanes pass through the origin. Sometimes we will refer to (39) and (27) as the affine and linear cases respectively. As it follows from Sect. 2, the locus equations (39) are necessary for the existence of a certain eigenfunction of the corresponding Schrödinger operator $L$ (see Theorem 2.2). As in the linear case (Sect. 3), Eqs. (39) are also sufficient for this. The following result has been proven in [21].

Theorem 5.1. Let $L = -\Delta + u(x)$ be a Schrödinger operator with the potential of the form (38) which satisfies the affine locus equations (39). Then $L$ has an eigenfunction $\varphi$ of the form $\varphi(k, x) = P(k, x)\exp(k, x)$, where $P$ is a polynomial in $k$, $L\varphi = -k^2\varphi$. This eigenfunction (up to a normalization factor) is given by Berest's formula analogous to (28):

$$\psi(k, x) = \bigl[(-2)^M M!\,C(k)\bigr]^{-1}(L + k^2)^M\Bigl[\prod_{j=1}^{K} \bigl((\alpha_j, x) + c_j\bigr)^{m_j} \exp(k, x)\Bigr], \tag{40}$$

where $M = \sum_{j=1}^{K} m_j$ and $C(k) = \prod_{j=1}^{K} (\alpha_j, k)^{m_j}$. The normalization is chosen in such a way that $\psi(k, x) = (1 + o(1))\exp(k, x)$ as $k \to \infty$.

We start the analysis of the affine locus equations and their solutions (locus configurations) from the one-dimensional case.

5.1. One-dimensional case. In this case we have a configuration of $K$ points $z_1, \dots, z_K$ with multiplicities $m_1, \dots, m_K$ on the complex plane and the potential

$$u(z) = \sum_{j=1}^{K} \frac{m_j(m_j + 1)}{(z - z_j)^2}.$$
The locus equations in this case (for mj = 1) have been introduced in the paper by Airault, McKean and Moser [13]. Duistermaat and Grünbaum [14] obtained them for the general multiplicities and proved that they are equivalent to the existence of the
Multidimensional Baker–Akhiezer Functions and Huygens’ Principle
553
differential operator $D$ with rational coefficients, intertwining $L = -\frac{d^2}{dz^2} + u(z)$ and $L_0 = -\frac{d^2}{dz^2}$:
\[ L \circ D = D \circ L_0. \]
All such operators $L$ are the results of the classical Darboux transformations applied to $L_0$, so the potential $u(z)$ can be given in this case in terms of the Wronskians by the well-known explicit formula:
\[ u(z) = -2 \frac{d^2}{dz^2} \log W[\chi_1, \dots, \chi_m], \]
where the polynomials $\chi_1, \dots, \chi_m$ are defined by the recurrence relations $\chi_1'' = 0$, $\chi_2'' = \chi_1$, ..., $\chi_m'' = \chi_{m-1}$ (see Burchnall–Chaundy [32], Adler–Moser [15]). The Wronskian is a polynomial $P_m(z, c_1, \dots, c_m)$ with the coefficients depending on the additional integration constants $c_1, \dots, c_m$ (see [15] for the details). Thus, the locus in the one-dimensional case is a union of the rational algebraic varieties of the dimensions $m = 1, 2, 3, \dots$, parametrised by $c_1, \dots, c_m$, and the locus configurations are simply the roots of the corresponding Schur polynomials $P_m(z, c_1, \dots, c_m)$. The solution $\psi$ of the corresponding Schrödinger equation $-\psi'' + u(z)\psi = -\lambda^2\psi$ has the form
\[ \psi = \Big( 1 + \sum_{i=1}^{m} a_i(z)\lambda^{-i} \Big) e^{\lambda z}. \tag{41} \]
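As an illustration of the Wronskian formula for $m = 2$, the sketch below takes $\chi_1 = z$ and $\chi_2 = z^3/6 + b$, which satisfy $\chi_1'' = 0$ and $\chi_2'' = \chi_1$, and computes $u = -2(\log W)''$ in exact rational arithmetic. The constant `b` and the helper names are hypothetical. With these choices $W = (z^3 + \tau)/3$ with $\tau = -3b$, so the potential should agree with the three-point potential $(6z^4 - 12\tau z)/(z^3 + \tau)^2$ quoted in the example of Sect. 5.3.

```python
from fractions import Fraction

def deriv(p):
    """Derivative of a polynomial stored as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out

def sub(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a - c for a, c in zip(p, q)]

def evalp(p, z):
    """Value of the polynomial p at z."""
    return sum(c * z**k for k, c in enumerate(p))

b = Fraction(-2)                                      # hypothetical integration constant
chi1 = [Fraction(0), Fraction(1)]                     # chi1 = z
chi2 = [b, Fraction(0), Fraction(0), Fraction(1, 6)]  # chi2 = z^3/6 + b

# Wronskian W[chi1, chi2] = chi1 * chi2' - chi1' * chi2
W = sub(mul(chi1, deriv(chi2)), mul(deriv(chi1), chi2))

def potential(z):
    """u(z) = -2 (log W)'' = -2 (W'' W - W'^2) / W^2."""
    W1 = deriv(W)
    W2 = deriv(W1)
    w, w1, w2 = evalp(W, z), evalp(W1, z), evalp(W2, z)
    return -2 * (w2 * w - w1 * w1) / (w * w)
```

Exact fractions are used so that the comparison with the closed-form potential is an identity check at sample points rather than a floating-point approximation.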
This is a degenerate rational case of the hyperelliptic BA function, corresponding to a general finite-gap operator [2]. These rational BA functions $\psi$ (41) are characterized by the following properties in the spectral parameter (cf. [33]). Let $\xi_1, \dots, \xi_m$ be arbitrary parameters, $\psi_s$ be the Laurent coefficients of $\psi$ at $\lambda = 0$: $\psi = \sum_{s=-m}^{+\infty} \lambda^s \psi_s(z)$. Impose the following $m$ linear conditions on the coefficients $\psi_{-m}, \dots, \psi_{m-1}$:
\[ \begin{cases} \psi_{m-1} + \sum_{s=1}^{m} \xi_s \psi_{m-2s} = 0 \\ \psi_{m-3} + \sum_{s=1}^{m-1} \xi_s \psi_{m-2s-2} = 0 \\ \psi_{m-5} + \sum_{s=1}^{m-2} \xi_s \psi_{m-2s-4} = 0 \\ \dots \\ \psi_{-m+1} + \xi_1 \psi_{-m} = 0 \end{cases} \tag{42} \]
They are equivalent to a non-degenerate system for $m$ unknown functions $a_i(z)$ and determine $\psi$ of the form (41) uniquely. The usual arguments [1,33] show that such a function satisfies the Schrödinger equation $-\psi'' + u(z)\psi = -\lambda^2\psi$ with the rational potential
\[ u(z) = 2a_1'(z). \tag{43} \]
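For $m = 1$ the system (42) consists of the single condition $\psi_0 + \xi_1\psi_{-1} = 0$. Expanding $\psi = (1 + a_1/\lambda)e^{\lambda z}$ at $\lambda = 0$ gives $\psi_{-1} = a_1$ and $\psi_0 = 1 + z a_1$, hence $a_1 = -1/(z + \xi_1)$ and, by (43), $u = 2/(z + \xi_1)^2$. The sketch below checks this simplest case numerically with central finite differences; the value of `xi`, the sample point and the step sizes are arbitrary illustrative assumptions.

```python
import cmath

xi = 0.7  # hypothetical value of the parameter xi_1

def a1(z):
    """Solution of (1 + z*a1) + xi*a1 = 0, i.e. of system (42) for m = 1."""
    return -1.0 / (z + xi)

def u(z):
    """Potential (43): u = 2 a1'(z) = 2 / (z + xi)^2."""
    return 2.0 / (z + xi) ** 2

def psi(z, lam):
    """Function of the form (41) with m = 1."""
    return (1.0 + a1(z) / lam) * cmath.exp(lam * z)

def schrodinger_residual(z, lam, h=1e-4):
    """-psi'' + u psi + lam^2 psi, with psi'' from a central difference."""
    d2 = (psi(z + h, lam) - 2 * psi(z, lam) + psi(z - h, lam)) / h**2
    return -d2 + u(z) * psi(z, lam) + lam**2 * psi(z, lam)

print(abs(schrodinger_residual(1.3, 0.9 + 0.4j)))
```

The printed residual is limited by the finite-difference error, so it should be small compared with $|\psi|$ but is not expected to vanish exactly.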
Notice that for given $\xi_1, \dots, \xi_m$ the system (42) determines an $m$-dimensional linear subspace $V(\xi_1, \dots, \xi_m)$ in $\mathbb{C}^{2m}$ and therefore corresponds to a point of the Grassmannian $\mathrm{Gr}(m, 2m)$. It is more convenient to identify the system of conditions (42) with a point of some infinite-dimensional Grassmannian $\mathrm{Gr}_0^{(2)}$ (see [33] for the details). Namely, let’s consider the linear space $\mathbb{C}[[\lambda]]$ of formal series in $\lambda$, and let $W$ be a subspace of $\mathbb{C}[[\lambda]]$ with the following properties:
554
O. A. Chalykh, M. V. Feigin, A. P. Veselov
1) $\lambda^m \mathbb{C}[\lambda] \subset W \subset \lambda^{-m} \mathbb{C}[\lambda]$, where $\mathbb{C}[\lambda]$ is the space of polynomials and both inclusions have the same codimension $m$;
2) $\lambda^2 W \subset W$.
We will suppose that the number $m = m(W)$ in 1) cannot be reduced. The set of all such subspaces for $m = 0, 1, 2, \dots$ we will denote as $\mathrm{Gr}_0^{(2)}$ following [33]. It is easy to see that the subspace of $\mathbb{C}[[\lambda]]$ consisting of all Laurent series $\psi = \sum_{s=-m}^{+\infty} \lambda^s \psi_s$ which satisfy the conditions (42) represents nothing but a general point of $\mathrm{Gr}_0^{(2)}$. In these notations the one-dimensional BA function corresponding to $W$ is the unique element $\psi_W$ of the form (41), such that its Laurent expansion at $\lambda = 0$ belongs to $W$ for each $z$. We will denote by $u_W$ the corresponding potential (43). These considerations suggest the following extension of the axiomatics (7)-(8) of the multidimensional BA function.
5.2. Equipped configurations and BA functions. Let $A$ be again a finite set of non-collinear vectors in $\mathbb{C}^n$. We will prescribe to each vector $\alpha \in A$ a subspace $W^{(\alpha)} \in \mathrm{Gr}_0^{(2)}$, and denote the corresponding integer $m(W^{(\alpha)})$ as $m_\alpha$. We will call the corresponding set of hyperplanes $\Pi_\alpha : (\alpha, k) = 0$ with the prescribed subspaces $W^{(\alpha)}$ the equipped configuration $A$.

Definition. For a given equipped configuration $A$ the function $\psi(k, x)$ is called the Baker–Akhiezer function if it satisfies the following two conditions:
1) $\psi$ has the form
\[ \psi = \frac{P(k, x)}{A(k)} e^{(k, x)}, \tag{44} \]
where $A(k) = \prod_{\alpha \in A} (\alpha, k)^{m_\alpha}$, $P$ is a polynomial in $k$ with the highest term $A(k)$;
2) for each $\alpha \in A$ the Laurent expansion of $\psi$ in $k$ in the $\alpha$-direction calculated at any point of the hyperplane $\Pi_\alpha$ must belong to $W^{(\alpha)}$.
Here by the Laurent expansion of a meromorphic function $F(k)$ in the $\alpha$-direction at a point $k_0$ we mean the Laurent expansion of the function $f(\lambda) = F(k_0 + \lambda\alpha)$ at $\lambda = 0$. If for each subspace $W^{(\alpha)}$ the corresponding parameters $\xi$ in (42) are zeros, our definition reduces to the definition of the BA function from Sect. 1. Now we will prove the analogues of Theorems 1.1, 1.2 for a general equipped configuration.

Theorem 5.2. If for a given equipped configuration $A$ there exists a BA function $\psi$ then it is unique and satisfies the Schrödinger equation
\[ \Big( -\Delta + \sum_{\alpha \in A} (\alpha, \alpha)\, u_\alpha((\alpha, x)) \Big) \psi = -k^2 \psi, \tag{45} \]
where $u_\alpha(z) = u_{W^{(\alpha)}}(z)$ are the one-dimensional potentials, corresponding to the subspaces $W^{(\alpha)}$.
Theorem 5.3. Let $R$ be the ring of polynomials $f(k)$ with the following properties: for each $\alpha \in A$ and any point $k_0 \in \Pi_\alpha$ the polynomial $f_{\alpha,k_0}(\lambda) = f(k_0 + \lambda\alpha)$ preserves the space $W^{(\alpha)}$: $f_{\alpha,k_0} W^{(\alpha)} \subset W^{(\alpha)}$. If the Baker–Akhiezer function $\psi(k, x)$ exists then for any polynomial $f(k) \in R$ there exists some differential operator $L_f(x, \frac{\partial}{\partial x})$ such that
\[ L_f \psi(k, x) = f(k)\psi(k, x). \]
All such operators form a commutative ring isomorphic to the ring $R$. The Schrödinger operator (45) corresponds to $f(k) = -k^2$.

The proofs of the theorems above follow in a standard way (cf. [4]) from the following two lemmas.

Lemma 1. If some function $\psi$ of the form (44) (without the restrictions on the highest term of the polynomial $P$) satisfies condition 2) from the definition of the BA function then the highest term in $P$ must be divisible by $A(k) = \prod_{\alpha \in A} (\alpha, k)^{m_\alpha}$.

Lemma 2. The BA function corresponding to an equipped configuration $A$ has the following asymptotic behaviour at infinity:
\[ \psi(k, x) = \exp(k, x) \Big( 1 + \sum_{\alpha \in A} \frac{(\alpha, \alpha)}{(\alpha, k)}\, a_1^{(\alpha)}((\alpha, x)) + o(k^{-1}) \Big), \]
where $a_1^{(\alpha)}(z)$ are the first coefficients in the corresponding functions (41) $\psi_\alpha = \psi_{W^{(\alpha)}}$ and $o(k^{-1})$ means the rational function of $k$ with degree less than $-1$.

To prove the lemmas, let’s expand $\psi$ in Laurent series in $(\alpha, k)$ on the hyperplane $(\alpha, k) = 0$. For convenience we may suppose that $(\alpha, \alpha) = 1$ and choose the orthonormal basis in $k$ such that $(\alpha, k) = k_1$; the other coordinates $k_2, \dots, k_n$ we shall denote by $\tilde{k}$. Then up to the non-essential factor $\exp(k_2 x_2 + \dots + k_n x_n)$ the $\psi$-function (44) takes the form:
\[ \tilde{\psi}(k, x) = e^{x_1 k_1} \sum_{s \ge -m_\alpha} k_1^s a_s(\tilde{k}, x), \tag{46} \]
and the Laurent coefficients $a_s$ are rational functions of $\tilde{k}$ with possible singularities at zeros of the homogeneous polynomial $\tilde{A}(\tilde{k}) = k_1^{-m_\alpha} A(k)|_{k_1 = 0}$. Since the sum $\sum_{s \ge -m_\alpha} k_1^s a_s$ is the Laurent expansion for $\frac{P(k, x)}{A(k)}$, the degrees in $\tilde{k}$ of its coefficients $a_s$ decrease as $s \to \infty$ (by definition, $\deg \frac{p}{q} = \deg p - \deg q$). Now we restrict our attention to the terms $k_1^s a_s$ with the maximal degree of $a_s$ in $\tilde{k}$. From the remark above it follows that we have a finite number of such terms, and if we extract the highest homogeneous part in $\tilde{k}$ in each term, we obtain the following finite expression:
\[ \tilde{\psi}_0(k_1, x) = e^{x_1 k_1} \sum_{s \ge -m_\alpha} k_1^s a_s^0(\tilde{k}, x), \tag{47} \]
where $a_s^0$ is the highest term in $a_s$ and all the $a_s^0$ have the same degree in $\tilde{k}$. It is clear now that $\tilde{\psi}_0$ constructed in that way must obey the same restrictions (42). This implies, in particular, that the sum (47) contains at least one term with $s \ge 0$. The outcome is
that if we expand $P(k, x)$ in the series in $k_1$, $P = \sum_{j \ge 0} k_1^j p_j(\tilde{k})$, and then extract from this sum the terms with the maximal degree in $\tilde{k}$, the result must contain at least one term with $j \ge m_\alpha$. Now let’s present $P$ as a sum of components $P = P_0 + P_1 + \dots$, homogeneous in $k_1, \dots, k_n$, and suppose that the highest term $P_0$ is not divisible by $k_1^{m_\alpha}$. In this case some other term $P_i$ must contain $k_1^{m_\alpha}$, but its degree in $\tilde{k}$ is clearly less than the degree of the term coming from $P_0$. This contradiction proves Lemma 1.

Moreover, in the extreme case when $P_0$ has the form $P_0 = k_1^{m_\alpha} Q_0$ with $Q_0|_{k_1 = 0} \ne 0$, the reduced $\psi$-function (47) up to a factor coincides with the one-dimensional BA function (41) $\psi(k_1, x_1)$. It’s easy to see that this factor is simply $Q_0|_{k_1 = 0}$. In particular, this implies that the second homogeneous term $P_1$ in $P(k, x)$ for the BA function $\psi$ satisfies the following condition:
\[ \big[ k_1^{1 - m_\alpha} P_1 \big]_{k_1 = 0} = a_1(x_1) \big[ k_1^{-m_\alpha} P_0 \big]_{k_1 = 0}, \]
where $a_1$ is the first coefficient in the corresponding one-dimensional BA function (41). We obtained this formula under the assumption that $(\alpha, \alpha) = 1$; in general it looks as follows:
\[ \big[ (\alpha, k)^{1 - m_\alpha} P_1 \big]_{(\alpha, k) = 0} = (\alpha, \alpha)\, a_1((\alpha, x)) \big[ (\alpha, k)^{-m_\alpha} P_0 \big]_{(\alpha, k) = 0}. \tag{48} \]
Taking into account the restrictions (48) for all the hyperplanes $(\alpha, k) = 0$, we obtain that if $P_0 = A(k) = \prod_{\alpha \in A} (\alpha, k)^{m_\alpha}$, then
\[ P_1 = A(k) \sum_{\alpha \in A} a_1^{(\alpha)}((\alpha, x)) \frac{(\alpha, \alpha)}{(\alpha, k)}, \tag{49} \]
which proves Lemma 2. Let’s consider now for a given equipped configuration A the corresponding Schrödinger operator (45). It is clear that the potential has the form (38). The corresponding affine configuration of the hyperplanes S we will call dual to the equipped configuration A. Suppose that the corresponding BA function does exist, then from Theorem 2.2 we conclude that the Schrödinger operator (45) has local trivial monodromy and hence satisfies the locus equations (39). In other words, the dual configuration S must be a locus configuration. We believe that the converse is true, that is, each locus configuration appears in such way for appropriate BA function. Part 2 of Theorem 5.6 below shows that each locus configuration is dual to some equipped configuration. So, the only problem is to check that for the function defined by the formula (40) Properties 2 from the definition of the BA function hold. Unfortunately, we couldn’t find a proof for this. We can only remark that for all known affine locus configurations it is true.
5.3. Geometry of affine locus. First of all, it is easy to check that the following operations preserve the locus equations and therefore allow to produce the locus configurations: 1) motions of the complex Euclidean space Cn ; 2) extensions of the configurations in Cn to Cm , m > n, induced by an orthogonal projection Cm → Cn ; 3) union of two configurations which are orthogonal to each other.
At the moment all known examples of the affine locus configurations can be constructed using these operations from one-dimensional affine and multidimensional linear locus configurations. In particular, this is true for the configurations corresponding to the operators introduced by Yu. Berest and P. Winternitz [16]. Analysis of these examples, however, reveals one more geometric way to produce the locus configurations.

Let $S$ be any affine configuration of hyperplanes in $\mathbb{C}^n$. Let’s imbed $\mathbb{C}^n$ in $\mathbb{C}^{n+2}$ in the following way: $x = (x_1, \dots, x_n) \to (x_1, \dots, x_n, 1, 0)$. For any hyperplane $\Pi$ in $\mathbb{C}^n$ let’s define the hyperplane $\tilde{\Pi}$ in $\mathbb{C}^{n+2}$ as a linear span of $\Pi \subset \mathbb{C}^n \subset \mathbb{C}^{n+2}$ and the isotropic vector $e = (0, \dots, 0, 1, i)$. If $(\alpha, x) + c = 0$ is the equation of $\Pi$ in $\mathbb{C}^n$ then the corresponding equation of $\tilde{\Pi}$ will be $(\alpha, x) + c(x_{n+1} + i x_{n+2}) = 0$. The corresponding configuration $\tilde{S}$ in $\mathbb{C}^{n+2}$ we will call the isotropic projectivisation of $S$.

Theorem 5.4. The isotropic projectivisation of an affine locus configuration $S$ in $\mathbb{C}^n$ is a linear locus configuration $\tilde{S}$ in $\mathbb{C}^{n+2}$.

Proof. We shall check the first of the locus equations for $\tilde{S}$; the others can be checked in the same way. So, we need to prove that on a hyperplane $(\alpha_s, x) + c_s(x_{n+1} + i x_{n+2}) = 0$ the following identity holds:
\[ \sum_{j \ne s} \frac{m_j (m_j + 1)(\tilde{\alpha}_j, \tilde{\alpha}_j)(\tilde{\alpha}_s, \tilde{\alpha}_j)}{((\alpha_j, x) + c_j(x_{n+1} + i x_{n+2}))^3} \equiv 0, \]
where $\tilde{\alpha}_j$ denotes the normal vector of the hyperplane $\tilde{\Pi}_j \subset \mathbb{C}^{n+2}$. If $\Pi_j \subset \mathbb{C}^n$ has the normal vector $\alpha_j = (\alpha_j^1, \dots, \alpha_j^n)$, then $\tilde{\alpha}_j$ is the vector $(\alpha_j^1, \dots, \alpha_j^n, c_j, i c_j)$. From that we immediately see that $(\tilde{\alpha}_j, \tilde{\alpha}_j) = (\alpha_j, \alpha_j)$ and $(\tilde{\alpha}_s, \tilde{\alpha}_j) = (\alpha_s, \alpha_j)$. Now since $\lambda = x_{n+1} + i x_{n+2} \ne 0$ almost everywhere on the hyperplane $(\alpha_s, x) + c_s(x_{n+1} + i x_{n+2}) = 0$ we come to the identity
\[ \sum_{j \ne s} \frac{m_j (m_j + 1)(\alpha_j, \alpha_j)(\alpha_s, \alpha_j)}{((\alpha_j, x) + c_j \lambda)^3} \equiv 0 \]
for $(\alpha_s, x) + c_s \lambda = 0$. But this identity after rescaling $x \to \lambda x$ takes the form
\[ \sum_{j \ne s} \frac{m_j (m_j + 1)(\alpha_j, \alpha_j)(\alpha_s, \alpha_j)}{((\alpha_j, x) + c_j)^3} \equiv 0 \quad \text{for } (\alpha_s, x) + c_s = 0, \]
which is exactly the first locus equation for the configuration $S$. □

Example. Let $S$ be a direct sum of three-point one-dimensional configurations with the corresponding potential
\[ u(x_1, \dots, x_n) = \sum_{i=1}^{n} \frac{6x_i^4 - 12\tau_i x_i}{(x_i^3 + \tau_i)^2}. \]
Then after the isotropic projectivisation we obtain the locus configuration with the potential of the form (cf. [16]):
\[ \tilde{u}(x_1, \dots, x_{n+2}) = \sum_{j=1}^{n} \frac{6x_j^4 - 12\tau_j (x_{n+1} + i x_{n+2})^3 x_j}{(x_j^3 + \tau_j (x_{n+1} + i x_{n+2})^3)^2}. \]
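The key step in the proof of Theorem 5.4 is that the lifted normals $\tilde{\alpha}_j = (\alpha_j, c_j, i c_j)$ have the same pairwise products as the original normals, because $c_s c_j + (i c_s)(i c_j) = 0$. The sketch below checks this for a few arbitrarily chosen (hypothetical) normals and shifts. Note that `dot` implements the complex-bilinear Euclidean form used throughout the paper, not a Hermitian product.

```python
def dot(a, b):
    """Complex-bilinear Euclidean form (no complex conjugation)."""
    return sum(x * y for x, y in zip(a, b))

def lift(alpha, c):
    """Normal (alpha, c, i*c) of the hyperplane (alpha, x) + c (x_{n+1} + i x_{n+2}) = 0."""
    return list(alpha) + [c, 1j * c]

# Hypothetical sample data: normals in C^2 and the shifts c_j.
normals = [(1.0, 0.0), (0.5, 2.0), (1 + 1j, -1.0)]
shifts = [0.3, -1.2, 2j]

# Isotropic vector e = (0, ..., 0, 1, i) in C^{n+2}; here n = 2.
e = [0.0, 0.0, 1.0, 1j]

lifted = [lift(a, c) for a, c in zip(normals, shifts)]
gram_before = [[dot(a, b) for b in normals] for a in normals]
gram_after = [[dot(a, b) for b in lifted] for a in lifted]
```

The vector `e` is orthogonal to every lifted normal, which reflects the fact that each lifted hyperplane contains the isotropic direction.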
In order to obtain a more general Berest–Winternitz potential [16]
\[ \tilde{u}(x_1, \dots, x_{n+2}) = \sum_{j=1}^{n} \frac{6x_j^4 - 12\tau_j (x_{n+1} + i x_{n+2} + c_j)^3 x_j}{(x_j^3 + \tau_j (x_{n+1} + i x_{n+2} + c_j)^3)^2} \]
we should shift the pairwise-orthogonal triples of hyperplanes
\[ x_j + \tau_j^{1/3} (x_{n+1} + i x_{n+2}) = 0 \quad (j = 1, \dots, n) \]
by $c_j$ in $x_{n+1}$.

Remark. The BA function in this example can be obtained easily using the following general remark. If $\psi_i = R_i(k, x)\exp(k, x)$ $(i = 1, 2)$ are given by the formula (40) for two orthogonal locus configurations $S_1$ and $S_2$ then the function $\psi = R_1 R_2 \exp(k, x)$ will correspond to the locus configuration $S = S_1 \cup S_2$. This is clear from the structure of formula (40).

Thus, iterating such geometric procedures one can construct many new affine locus configurations. However, all of them are degenerate in the following sense. Let $V(S)$ be the linear space of the normals to all the hyperplanes in $S$. We call $S$ degenerate if the restriction of the complex Euclidean form on $V(S)$ is degenerate. For a degenerate affine configuration one can define the following isotropic reduction procedure, which is inverse to the isotropic projectivisation. Let $K$ be the kernel of the restriction of the Euclidean form onto $V(S)$. Consider the orthogonal complement $V^\perp$ of $V$ in $\mathbb{C}^n$ and choose a subspace $L$ such that $V + V^\perp = K \oplus L$. By an isotropic reduction of the degenerate configuration $S$ we shall mean the configuration $S \cap \{a + L\}$, where $\{a + L\}$ is a shift of $L$ by a generic vector $a \in \mathbb{C}^n$.

Theorem 5.5. An isotropic reduction of a degenerate locus configuration is a non-degenerate locus configuration.

The proof is similar to the case of isotropic projectivisation. These results may be interpreted in two ways. First, we can say that any affine locus configuration is a result of the isotropic reduction of some (degenerate) linear configuration. So, the classification problem for affine locus configurations reduces to the linear case. On the other hand, as we have shown, to classify all locus configurations it is sufficient to consider non-degenerate configurations only. Moreover, we can consider irreducible configurations only, i.e. exclude the unions of orthogonal subconfigurations.
At the moment all the known non-degenerate irreducible locus configurations are linear or one-dimensional. These may well be the only possible examples. The following general result clarifies the geometrical structure of affine locus configurations.

Theorem 5.6. Any affine locus configuration $S$ has the following properties:
1) for each point $x_0 \in \mathbb{C}^n$ the subset $S_{x_0} \subseteq S$ of the hyperplanes passing through $x_0$ forms a linear locus configuration;
2) for each hyperplane $\Pi \in S$ the subset $S(\Pi) \subseteq S$ of the hyperplanes parallel to $\Pi$ forms an extended one-dimensional locus configuration.
Conversely, any affine configuration with properties (1), (2) belongs to the locus.

Proof. (1) Let’s consider the locus equations for some hyperplane $\Pi_i : (\alpha_i, x) + c_i = 0$ passing through $x_0$:
\[ \sum_{j \ne i} \frac{m_j (m_j + 1)(\alpha_j, \alpha_j)(\alpha_i, \alpha_j)^{2s-1}}{((\alpha_j, x) + c_j)^{2s+1}} \equiv 0 \quad \text{for } x \in \Pi_i, \tag{50} \]
$s = 1, \dots, m_i$. Now take $x = x_0 + y$, then $x \in \Pi_i$ iff $(\alpha_i, y) = 0$ and we have the following relation:
\[ \sum_{\substack{j:\, x_0 \in \Pi_j \\ j \ne i}} \frac{m_j (m_j + 1)(\alpha_j, \alpha_j)(\alpha_i, \alpha_j)^{2s-1}}{(\alpha_j, y)^{2s+1}} + \sum_{k:\, x_0 \notin \Pi_k} \frac{m_k (m_k + 1)(\alpha_k, \alpha_k)(\alpha_i, \alpha_k)^{2s-1}}{((\alpha_k, x_0) + c_k + (\alpha_k, y))^{2s+1}} \equiv 0 \]
for all $y$ such that $(\alpha_i, y) = 0$. Since the second sum is regular at $y = 0$, the first sum should vanish on the hyperplane $(\alpha_i, y) = 0$. Thus, we obtain a linear locus equation for the configuration $S_{x_0}$.

(2) To prove the second property, let’s divide all the hyperplanes which are non-parallel to $\Pi$ into the subgroups in the following way: $\Pi'$ and $\Pi''$ belong to the same group if and only if their intersection is contained in $\Pi$. Then in each group the sum of the corresponding terms in (50) should vanish due to the property (1). The remaining terms are exactly the locus equation for the set of parallel planes $S(\Pi)$. The converse statement now is clear. □

We conclude this section by some negative results about locus configurations in $\mathbb{R}^n$.

Theorem 5.7. For any locus configuration in the real plane there exists a point all the lines pass through.

Proof. First we note that parallel lines cannot appear in locus configurations in $\mathbb{R}^2$. Indeed, the subset of parallel lines according to the previous theorem must give a real solution for the one-dimensional locus equations, which is impossible. Now let’s fix some terminology: by vertices we will mean the intersection points for the lines from the configuration, and by a ray we will mean any ray from the configuration with the origin at some vertex (some rays may contain other vertices). Let’s choose an orientation on the plane. This allows us to determine the oriented angle $\varphi(l_1, l_2)$ between the ordered pair of rays $l_1, l_2$, which varies from $-\pi$ to $\pi$. We need the following property of the locus configurations in $\mathbb{R}^2$:

Lemma. For each ray $l_1$ from the locus configuration in $\mathbb{R}^2$ there exists another ray $l_2$ with the same vertex and acute angle between $l_1$ and $l_2$: $0 < \varphi(l_1, l_2) \le \frac{\pi}{2}$. Similarly, there exists a ray $l_3$ with the same vertex such that $-\frac{\pi}{2} \le \varphi(l_1, l_3) < 0$.

Proof of the lemma follows from the linear locus equations (27) for the lines passing through a given vertex: it’s clear that the sign of each term in it depends only on the sign of the cotangent of the oriented angle between $\alpha$ and $\beta$.
Lemma. Let $l_1$ and $l_2$ be chosen as in the previous lemma. Then if $l_1$ contains another vertex of the configuration, the same is true for $l_2$.

The proof follows from simple geometrical considerations. Let’s consider now any vertex and all the rays of our configuration outgoing from this vertex. As easily follows from the lemmas we have only two possibilities: 1) there are no other vertices on these rays or 2) there is at least one more vertex on each ray. Since we have a finite number of vertices, we obtain immediately that our configuration has only one vertex. The Theorem is proven. □

The same is probably true in $\mathbb{R}^n$ but at the moment we can prove this only in the special case when all the multiplicities are equal.

Theorem 5.8. Any affine locus configuration in $\mathbb{R}^n$ with equal multiplicities is a linear Coxeter configuration.

Proof. It’s sufficient to prove that the configuration must be symmetric with respect to each of its hyperplanes. Since parallel hyperplanes cannot appear in a real locus configuration, the statement follows from Theorem 5.6 and Corollary 4.7. □

6. Locus Configurations and Huygens’ Principle

Let us consider a linear hyperbolic equation
\[ L\varphi(x) = 0, \qquad L = \Box_{N+1} + u(x), \tag{51} \]
where $\Box_{N+1}$ is the D’Alembert operator,
\[ \Box_{N+1} = \frac{\partial^2}{\partial x_0^2} - \frac{\partial^2}{\partial x_1^2} - \dots - \frac{\partial^2}{\partial x_N^2}. \]
We say after J. Hadamard [30] that it satisfies Huygens’ Principle (HP) if its fundamental solution is located on the characteristic conoid, i.e. this solution vanishes in the conoid’s complement. Hadamard found a criterion for HP to be satisfied in terms of the so-called Hadamard coefficients $U_\nu(x, \xi)$. They are uniquely determined by the following system of equations:
\[ \sum_{i=0}^{N} (x_i - \xi_i) \frac{\partial U_\nu}{\partial x_i} + \nu U_\nu = -\frac{1}{2} L(U_{\nu-1}) \tag{52} \]
and the conditions that $U_0(x, \xi) \equiv 1$ and $U_\nu(x, \xi)$ are regular at $x = \xi$. These coefficients are symmetric with respect to $x$ and $\xi$: $U_\nu(x, \xi) = U_\nu(\xi, x)$ (for the details see the book [31]). Hadamard proved that Eq. (51) satisfies Huygens’ Principle if and only if $N$ is odd and $U_\nu|_\Gamma = 0$ for $\nu \ge \frac{N-1}{2}$, where $\Gamma = \{(x, \xi) : (x_0 - \xi_0)^2 - \sum_{i=1}^{N} (x_i - \xi_i)^2 = 0\}$ is the characteristic conoid. For the case when the potential $u$ (and, as a corollary, all Hadamard’s coefficients $U_\nu$) does not depend on at least one of the coordinates (say, $x_0$), Hadamard’s criterion is equivalent to the condition $U_{\frac{N-1}{2}} \equiv 0$.
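To make the chain (52) concrete, consider the simplest potential $u = 2/x_1^2$ (one hyperplane with multiplicity 1). A direct computation suggests $U_0 = 1$, $U_1 = -1/(x_1 \xi_1)$ and $U_2 = 0$: for $\nu = 1$ the right-hand side of (52) is $-\frac{1}{2}u$, and for $\nu = 2$ it is $-\frac{1}{2}L(U_1)$ with $L(U_1) = -\partial_1^2 U_1 + u U_1$, since $U_1$ depends on $x_1$ only. The sketch below is a finite-difference sanity check of these two statements, not a general solver for (52); the sample point and step sizes are arbitrary assumptions.

```python
def u(x1):
    """Potential u = 2 / x1^2."""
    return 2.0 / x1**2

def U1(x1, xi1):
    """Candidate first Hadamard coefficient, symmetric in x1 and xi1."""
    return -1.0 / (x1 * xi1)

def d1(f, x, h=1e-5):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def chain_nu1(x1, xi1):
    """(x1 - xi1) dU1/dx1 + U1 + u/2, which vanishes if (52) holds for nu = 1."""
    f = lambda t: U1(t, xi1)
    return (x1 - xi1) * d1(f, x1) + U1(x1, xi1) + 0.5 * u(x1)

def L_of_U1(x1, xi1):
    """L(U1) = -d^2 U1 / dx1^2 + u U1; if it vanishes, U2 = 0 solves (52)."""
    f = lambda t: U1(t, xi1)
    return -d2(f, x1) + u(x1) * U1(x1, xi1)

print(abs(chain_nu1(1.5, 0.8)), abs(L_of_U1(1.5, 0.8)))
```

Both printed values should be at the level of the finite-difference error. In this example the chain terminates after the first step.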
We consider Hadamard’s problem of the description of all huygensian equations of the form:
\[ (\Box_{N+1} + u(x_1, \dots, x_N))\varphi = 0. \tag{53} \]
In fact, in our case for any locus configuration in $\mathbb{C}^n$ the corresponding potential will depend only on the first $n$ coordinates: $u = u(x_1, \dots, x_n)$, $n \le N$. It turns out that huygensian equations of the form (53) are closely related to the locus configurations. For the linear locus configuration in $\mathbb{C}^n$ the corresponding potential
\[ u(x) = \sum_{\alpha \in A} \frac{m_\alpha (m_\alpha + 1)(\alpha, \alpha)}{(\alpha, x)^2} \tag{54} \]
is homogeneous of degree $-2$.

Theorem 6.1. For any real potential $u(x_1, \dots, x_n)$ related to a linear locus configuration the hyperbolic equation (53) satisfies HP if $N$ is odd and $N \ge 2\sum_{\alpha \in A} m_\alpha + 3$. In that case the fundamental solution can be expressed via the BA function. Conversely, if the hyperbolic equation (53) with homogeneous potential $u(x)$: $u(\lambda x) = \lambda^{-2} u(x)$ satisfies HP and all the Hadamard’s coefficients are rational functions, then the potential $u(x)$ must have the form (54) for some linear locus configuration.

Proof. The proof of the first statement repeats the arguments of the paper [10], where this result has been proven in the Coxeter case. It is based on the following relation between the BA function and Hadamard’s coefficients. If we have the Baker–Akhiezer function $\psi$ of the form (7), we can present it in the form
\[ \psi(\xi, x) = (U_0(\xi, x) + U_1(\xi, x) + \dots + U_M(\xi, x)) e^{(\xi, x)}, \tag{55} \]
where $U_0 = 1$, $U_\nu(x, \xi)$ is homogeneous of degree $-\nu$ in $\xi$, $M = \deg A(k) = \sum_{\alpha \in A} m_\alpha$. Since $\psi$ is symmetric in $\xi$ and $x$ (Theorem 2.3), $U_\nu$ has the same degree in $x$. From the Schrödinger equation (9) for $\psi$, $L\psi = -\xi^2\psi$, $L = -\Delta + u(x)$, we obtain:
\[ -2\sum_{i=1}^{n} \xi_i \frac{\partial}{\partial x_i} U_\nu + L[U_{\nu-1}] = 0 \quad (\nu = 1, \dots, M + 1 \text{ with } U_{M+1} = 0). \]
Since $U_\nu$ are homogeneous in $x$ this implies the relations (52), so $U_\nu$ coincide with Hadamard’s coefficients. Now since $U_{M+1} = 0$ Hadamard’s criterion guarantees HP if $N \ge 2M + 3$. Notice that it gives also the explicit formula for the Hadamard’s coefficients and the fundamental solution for (51) (see [10] for the details).

Conversely, from the chain (52) for Hadamard’s coefficients $U_\nu(x, \xi)$ for the homogeneous potential $u$ it follows that $U_\nu$ are also homogeneous in $x$ (and, therefore, in $\xi$): $U_\nu(\lambda x, \xi) = \lambda^{-\nu} U_\nu(x, \xi) = U_\nu(x, \lambda\xi)$. This can be proven by the same calculation as in Lemma 1 from [12], where the case $n = 2$ was considered. Let’s now consider the function $\psi$ defined by the formula (55). Then, from the Hadamard chain (52) and homogeneity of $U_\nu$ it follows in the same way as above that $\psi$ satisfies the Schrödinger equation $(-\Delta_N + u(x))\psi = -\xi^2\psi$.
Notice that the potential $u(x)$ must be rational since all Hadamard’s coefficients are supposed to be rational. This follows from the first equation of the Hadamard’s chain (52). Now using Theorems 2.1 and 2.2 and the fact that $u(x)$ is homogeneous of degree $-2$ we conclude that $u(x)$ has the form (54) for some locus configuration. □

Remark. In the case when $n = 2$, i.e. $u = u(x_1, x_2)$, a stronger result (namely, without the assumption that Hadamard’s coefficients are rational) follows from the results by Yu. Berest and I. Lutsenko [11,12].

Now let’s consider an arbitrary (affine) locus configuration $S$ such that the corresponding potential $u(x)$ given by the formula (38) is real for real $x$. This is equivalent to the condition $S = \bar{S}$, where $\bar{S}$ is a natural complex conjugation of a configuration $S$. The following result generalises Theorem 6.1 for the general (affine) locus configurations.

Theorem 6.2. For any affine locus configuration $S \subset \mathbb{C}^n$ with $S = \bar{S}$ the corresponding hyperbolic equation (53) satisfies Huygens’ Principle if $N$ is odd and large enough: $N \ge 2M + 3$, $M = \sum_{j=1}^{K} m_j$. Conversely, if Eq. (53) satisfies Huygens’ Principle and all Hadamard’s coefficients are rational functions, then the potential $u(x)$ must be of the form (38) for some affine locus configuration.

Proof. The first part of this theorem can be derived from Theorem 5.1 and the results by Yu. Berest [34] (see also [19]). We would like, however, to present here another, more illuminating proof. It is based on a different idea which will help us to prove the second part also. The idea is to reduce the affine case to the linear one using the isotropic projectivisation procedure. The main observation is encapsulated in the following lemma.

Let $U_\nu(x, \xi)$ $(\nu = 0, 1, \dots)$ be some analytic functions of $2n$ variables $x = (x_1, \dots, x_n)$, $\xi = (\xi_1, \dots, \xi_n)$ which satisfy Eqs. (52) with some potential $u(x)$. Let’s define now the new functions depending on $\tilde{x} = (x_1, \dots, x_n, x_{n+1}, x_{n+2})$ and $\tilde{\xi} = (\xi_1, \dots, \xi_n, \xi_{n+1}, \xi_{n+2})$:
\[ \tilde{U}_\nu(\tilde{x}, \tilde{\xi}) = (x_{n+1} + i x_{n+2})^{-\nu} (\xi_{n+1} + i \xi_{n+2})^{-\nu}\, U_\nu\Big( \frac{x}{x_{n+1} + i x_{n+2}}, \frac{\xi}{\xi_{n+1} + i \xi_{n+2}} \Big) \tag{56} \]
and
\[ \tilde{u}(\tilde{x}) = (x_{n+1} + i x_{n+2})^{-2}\, u\Big( \frac{x}{x_{n+1} + i x_{n+2}} \Big). \tag{57} \]
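Formula (57) can be tested on the three-point potential from the example of Sect. 5.3 with $n = 1$. The sketch below checks numerically that $\tilde{u}$ defined by (57) coincides with the explicit projectivised potential given there, that it is homogeneous of degree $-2$, and that it restricts to $u$ at $x_{n+1} + i x_{n+2} = 1$. The value of `tau` and the sample point are arbitrary assumptions chosen for illustration.

```python
tau = 1.5 - 0.5j  # hypothetical parameter of the three-point configuration

def u(x):
    """One-dimensional three-point potential (6x^4 - 12 tau x) / (x^3 + tau)^2."""
    return (6 * x**4 - 12 * tau * x) / (x**3 + tau) ** 2

def u_tilde(x, x2, x3):
    """Formula (57) for n = 1: lam^(-2) u(x / lam) with lam = x2 + i*x3."""
    lam = x2 + 1j * x3
    return u(x / lam) / lam**2

def u_tilde_explicit(x, x2, x3):
    """Projectivised potential as written in the example of Sect. 5.3."""
    lam = x2 + 1j * x3
    return (6 * x**4 - 12 * tau * lam**3 * x) / (x**3 + tau * lam**3) ** 2

point = (0.7, 1.1, -0.4)
t = 2.5
scaled = tuple(t * c for c in point)

print(abs(u_tilde(*point) - u_tilde_explicit(*point)))
print(abs(u_tilde(*scaled) - u_tilde(*point) / t**2))
```

Both printed differences should be at the level of rounding error.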
Lemma. The relations (52) for $U_\nu(x, \xi)$ and $u(x)$ are equivalent to the similar relations in $\tilde{x}, \tilde{\xi}$ for $\tilde{U}_\nu(\tilde{x}, \tilde{\xi})$ and $\tilde{u}(\tilde{x})$ defined by the formulas (56) and (57).

The proof is straightforward. Now suppose that we have the real potential $u(x)$ related to some affine locus configuration $S = \bar{S} \subset \mathbb{C}^n$. Then the potential $\tilde{u}(\tilde{x})$ defined by (57) corresponds to some locus configuration $\tilde{S} \subset \mathbb{C}^{n+2}$ which is exactly the result of the isotropic projectivisation defined in the previous section (see Theorem 5.4). Thus, according to Theorem 3.1
the corresponding Schrödinger operator $\tilde{L} = -\Delta_{n+2} + \tilde{u}(\tilde{x})$ in $\mathbb{C}^{n+2}$ has the BA function $\tilde{\psi}(\tilde{\xi}, \tilde{x})$ which is given by the formula (28). Therefore, $\tilde{\psi}$ can be presented in the form analogous to (55),
\[ \tilde{\psi}(\tilde{\xi}, \tilde{x}) = (\tilde{U}_0(\tilde{\xi}, \tilde{x}) + \tilde{U}_1(\tilde{\xi}, \tilde{x}) + \dots + \tilde{U}_M(\tilde{\xi}, \tilde{x})) e^{(\tilde{\xi}, \tilde{x})}, \tag{58} \]
where $\tilde{U}_0 = 1$ and the components $\tilde{U}_\nu(\tilde{x}, \tilde{\xi})$ are homogeneous of degree $-\nu$ in $\tilde{\xi}$ and $\tilde{x}$, non-singular for $\tilde{x} = \tilde{\xi}$ and satisfy the relations (52) in $\tilde{x}, \tilde{\xi}$ with the potential $\tilde{u}(\tilde{x})$. Now let’s consider their restriction for $x_{n+1} + i x_{n+2} = \xi_{n+1} + i \xi_{n+2} = 1$,
\[ U_\nu(x, \xi) = \tilde{U}_\nu(\tilde{x}, \tilde{\xi}) \Big|_{\substack{x_{n+1} + i x_{n+2} = 1 \\ \xi_{n+1} + i \xi_{n+2} = 1}}. \tag{59} \]
We claim that formula (59) determines Hadamard’s coefficients for the initial potential $u(x)$. First of all, let’s notice that this formula really determines some functions of $x, \xi$ only. This can be derived directly from the formula (28). Indeed, it’s easy to see from the inductive procedure (29) that the pre-exponent in the BA function (28) is a linear combination of the "monomial" terms $\prod_{\alpha \in A} (\alpha, x)^{p_\alpha} (\alpha, k)^{q_\alpha}$ with some integers $p_\alpha, q_\alpha$. Thus, $x_{n+1}, x_{n+2}, \xi_{n+1}, \xi_{n+2}$ will enter in $\tilde{\psi}$ only as combinations $x_{n+1} + i x_{n+2}$ and $\xi_{n+1} + i \xi_{n+2}$. This means that the coefficients $U_\nu$ defined by (59) indeed do not depend on $x_{n+1}, x_{n+2}, \xi_{n+1}, \xi_{n+2}$. As a corollary of the homogeneity of $\tilde{U}_\nu$ in $\tilde{x}$ and $\tilde{\xi}$ we may invert formula (59) and obtain that $\tilde{U}_\nu$ are related to $U_\nu$ by the formula (56). Now using the lemma we get Eqs. (52) for $U_\nu$. It is clear then from (59) that $U_0 = 1$ and $U_\nu$ are nonsingular when $x = \xi$. The last remark is that the procedure (59) gives us the real-valued functions $U_\nu$ of $x, \xi \in \mathbb{R}^n$ in the case when the initial potential $u(x)$ is real, $S = \bar{S}$. So, for any affine locus configuration we constructed Hadamard’s coefficients $U_\nu$ for the corresponding hyperbolic equation (53), and $U_{M+1} = 0$. Applying Hadamard’s criterion, we obtain the first part of the theorem.

To prove the inverse statement, we suppose that the hyperbolic equation (53) is huygensian and has rational Hadamard’s coefficients $U_\nu$ with $U_{M+1} = 0$. In that case we can define the homogeneous functions $\tilde{U}_\nu(\tilde{x}, \tilde{\xi})$ by the formula (56). According to the lemma, they obey Eqs. (52) with the homogeneous potential (57). Then in the same way as in Theorem 6.1, we conclude that the function (58) satisfies the Schrödinger equation $\tilde{L}\tilde{\psi} = -\tilde{\xi}^2 \tilde{\psi}$ with $\tilde{L} = -\Delta_{n+2} + \tilde{u}(\tilde{x})$. Now using Theorem 2.2 in the same way as in Theorem 6.1 we deduce that the potential $\tilde{u}(\tilde{x})$ must correspond to some (linear) locus configuration $\tilde{S}$ of non-isotropic hyperplanes in $\mathbb{C}^{n+2}$. But in that case the initial potential $u(x)$ (see the formula (57)) will correspond to the isotropic reduction $S$ of $\tilde{S}$ which should satisfy the locus equations due to Theorem 5.5. The theorem is proven. □

Remark. We have assumed that the potential $u$ of the hyperbolic equation does not depend on $x_0$, but essentially we have used only the fact that the sequence of Hadamard’s coefficients terminates at some step $M$. Actually all the results of this section can be generalised formally for any equation of the form (51) (even with the complex potential) which possesses the last property. In that case the singularities of the potential should satisfy the locus equations in $\mathbb{C}^{N,1}$ with the complex Euclidean structure defined by the metric $\mathrm{diag}(-1, 1, \dots, 1)$. We conjecture that any hyperbolic equation $(\Box_{N+1} + u(x))\varphi = 0$ with terminating sequence of Hadamard’s coefficients has a rational potential $u(x)$ which corresponds to some locus configuration in $\mathbb{C}^{N,1}$. We have proved this under the assumption that
the Hadamard’s coefficients are rational. The proof of this conjecture would lead to the solution of the famous Hadamard problem in the class (53). Until now this problem is solved only when $u$ depends on one of the coordinates (K. Stellmacher, J. Lagnese [17]) and when $u$ is homogeneous and depends on two of the coordinates (Yu. Berest [12]).

7. Some Other Relations and Generalisations

7.1. The Baker–Akhiezer function $\psi(k, x)$ related to an equipped configuration has the following remarkable property: it satisfies a system of differential equations not only in $x$ but also in $k$-variables. The corresponding bispectral property of the one-dimensional BA function has been observed in the fundamental paper by Duistermaat and Grünbaum [14].

Let $\psi(k, x)$ be a BA function related to some equipped configuration $A$, $S$ be the corresponding dual configuration of the poles of the potential $u(x)$ given by (38). Let $R$ be the ring of polynomials defined in Theorem 5.3. Define also the dual ring $S$ as the ring of all polynomials $q(x)$ in $x$, satisfying the relations
\[ \Big( \alpha_s, \frac{\partial}{\partial x} \Big)^{2j-1} [q(x)] \Big|_{(\alpha_s, x) + c_s = 0} \equiv 0 \]
for all $j = 1, 2, \dots, m_s$ and for all the hyperplanes of the configuration $S$.

Theorem 7.1. For any $p(k) \in R$ and $q(x) \in S$ there exist the differential operators $L_p(x, \partial/\partial x)$ and $M_q(k, \partial/\partial k)$ such that the BA function $\psi(k, x)$ satisfies the following bispectral problem:
\[ \begin{cases} L_p(x, \partial/\partial x)\psi(k, x) = p(k)\psi(k, x) \\ M_q(k, \partial/\partial k)\psi(k, x) = q(x)\psi(k, x) \end{cases} \tag{60} \]
The existence of the operator $L_p(x, \partial/\partial x)$ is claimed in Theorem 5.3. The existence of $M_q(k, \partial/\partial k)$ follows from the characterisation of $\psi$ by its analytic properties in $x$. Namely, one can show that the BA function $\psi(k, x)$ is the unique function of the form
\[ \psi = \frac{B(x) + \dots}{B(x)} e^{(k, x)}, \]
where $B(x) = \prod_{s=1}^{N} ((\alpha_s, x) + c_s)^{m_s}$ and the dots denote the polynomial in $x$ of a smaller degree, such that the following conditions are fulfilled:
\[ \Big( \alpha_s, \frac{\partial}{\partial x} \Big)^{2j-1} \big[ ((\alpha_s, x) + c_s)^{m_s} \psi \big] \Big|_{(\alpha_s, x) + c_s = 0} \equiv 0 \]
for each $j = 1, 2, \dots, m_s$ and $s = 1, \dots, N$. The fact that the BA function satisfies these conditions follows from the Schrödinger equation (45) and Theorem 2.2.

7.2. A similar approach can be developed for the trigonometric versions of our Schrödinger operators (1). As in the rational case discussed in the present paper, the axiomatics of [4] has to be amended in order to cover the most general case. We intend to discuss such axiomatics in a separate paper. The corresponding locus conditions have been described in [21]. The bispectral property for the corresponding BA functions results in difference operators in the spectral parameter, which can be viewed as deformations of the rational Ruijsenaars and Macdonald operators (see [35]).
7.3. Most of the results of this paper can be generalised to the case when the potential u(x) of the Schrödinger operator is a matrix-valued function. The locus equations for that case in dimension 1 have been described in [36]. The multidimensional case is considered in [37].

Acknowledgements. This work was partially supported by the Russian Fundamental Research Fund (grants 96-01-01404, 96-15-96027, 96-15-96037) and INTAS (grant 96-0770). O. Ch. was also supported by the Royal Society postdoctoral fellowship during 1998, which is gratefully acknowledged. O. Ch. and M. F. are grateful to Loughborough University, UK, for its hospitality during the period in which this work was being completed. Finally, we would like to thank Yuri Berest for extremely fruitful discussions.
References
1. Krichever, I.M.: Methods of algebraic geometry in the theory of nonlinear equations. Uspekhi Mat. Nauk 32 (6), 183–208 (1977)
2. Dubrovin, B.A., Matveev, V.B., Novikov, S.P.: Nonlinear equations of Korteweg–de Vries type, finite-gap linear operators and abelian varieties. Uspekhi Mat. Nauk 31 (1), 51–125 (1976)
3. Whittaker, E.T., Watson, G.N.: A course of modern analysis. Cambridge: Cambridge University Press, 1963
4. Chalykh, O.A., Veselov, A.P.: Commutative rings of partial differential operators and Lie algebras. Commun. Math. Phys. 126, 597–611 (1990)
5. Veselov, A.P., Styrkas, K.L., Chalykh, O.A.: Algebraic integrability for Schrödinger equation and finite reflection groups. Theor. Math. Phys. 94 (2), 253–275 (1993)
6. Olshanetsky, M.A., Perelomov, A.M.: Quantum integrable systems related to Lie algebras. Phys. Rep. 94, 313–404 (1983)
7. Heckman, G.J.: A remark on the Dunkl differential-difference operators. Prog. in Math. 101, 181–191 (1991)
8. Veselov, A.P., Feigin, M.V., Chalykh, O.A.: New integrable deformations of quantum Calogero–Moser problem. Russ. Math. Surv. 51 (3), 185–186 (1996)
9. Chalykh, O.A., Feigin, M.V., Veselov, A.P.: New integrable generalizations of Calogero–Moser quantum problem. J. Math. Phys. 39 (2), 695–703 (1998)
10. Berest, Yu.Yu., Veselov, A.P.: Hadamard’s problem and Coxeter groups: New examples of the huygensian equations. Funct. Anal. Appl. 28 (1), 3–15 (1994)
11. Berest, Yu.Yu., Lutsenko, I.M.: Huygens’ principle in Minkowski spaces and soliton solutions of the Korteweg–de Vries equation. Commun. Math. Phys. 190, 113–132 (1997)
12. Berest, Yu.Yu.: Solution of a Restricted Hadamard’s Problem in Minkowski Spaces. Comm. Pure Appl. Math. 50 (10), 1019–1052 (1997)
13. Airault, H., McKean, H.P., Moser, J.: Rational and elliptic solutions of the Korteweg–de Vries equation and a related many-body problem. Comm. Pure Appl. Math. 30, 95–148 (1977)
14. Duistermaat, J.J., Grünbaum, F.A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103, 177–240 (1986)
15. Adler, M., Moser, J.: On a class of polynomials connected with the Korteweg–de Vries equation. Commun. Math. Phys. 61, 1–30 (1978)
16. Berest, Yu., Winternitz, P.: Huygens’ principle and separation of variables. Preprint CRM-2379 (1996) (to appear in Commun. Math. Phys.)
17. Lagnese, J.E., Stellmacher, K.L.: A method of generating classes of Huygens’ operators. J. Math. & Mech. 17 (5), 461–472 (1967)
18. Berest, Yu.: Huygens’ principle and the bispectral problem. CRM Proceedings and Lecture Notes 14, 11–30 (1998)
19. Berest, Yu.Yu., Veselov, A.P.: On the singularities of the potentials of exactly solvable Schrödinger operators and Hadamard’s problem. Russ. Math. Surv. 53 (1), 211–212 (1998)
20. Berest, Yu., Veselov, A.: On the Structure of Singularities of Integrable Schrödinger Operators. Submitted to Lett. in Math. Physics
21. Chalykh, O.A.: Darboux transformations for multidimensional Schrödinger operators. Russ. Math. Surv. 53 (2), 167–168 (1998)
22. Berezin, F.A.: Laplace operators on semisimple Lie groups. Proc. Moscow Math. Soc. 6, 371–463 (1957)
23. Bourbaki, N.: Groupes et algèbres de Lie. Chap. VI, Paris: Masson, 1981
24. Calogero, F.: Solution of the one-dimensional n-body problem with quadratic and/or inversely quadratic pair potential. J. Math. Phys. 12, 419–436 (1971)
25. Dunkl, C.F.: Differential-difference operators associated to reflection groups. Trans. AMS. 311, 167–183 (1989)
26. Cohen, A.M.: Finite complex reflection groups. Ann. Scient. Ec. Norm. Sup. ser.4, 9, 379–436 (1976)
27. Ince, E.L.: Ordinary differential equations. New York: Dover publications, 1956
28. Oshima, T.: A definition of boundary values of solutions of partial differential equations with regular singularities. Publications of the RIMS, Kyoto Univ., 19, 1203–1230 (1983)
29. Crum, M.M.: Associated Sturm–Liouville systems. Quart. J. Math. ser.2 (6), 121–126 (1955)
30. Hadamard, J.: Lectures on Cauchy’s Problem in Linear Partial Differential Equations. New Haven: Yale Univ. Press, 1923
31. Günther, P.: Huygens’ Principle and Hyperbolic Equations. Boston: Acad. Press, 1988
32. Burchnall, J.L., Chaundy, T.W.: A set of differential equations which can be solved by polynomials. Proc. London Math. Soc. 30, 401–414 (1929–1930)
33. Segal, G., Wilson, G.: Loop groups and equations of KdV type. Publ. IHES, 61, 5–65 (1985)
34. Berest, Yu.: Hierarchies of Huygens’ operators and Hadamard’s conjecture. Acta Appl. Math. 53, 125–185 (1998)
35. Chalykh, O.A.: Duality of the generalized Calogero and Ruijsenaars problems. Russ. Math. Surv. 52 (6), 191–192 (1997)
36. Goncharenko, V.M., Veselov, A.P.: Monodromy of the matrix Schrödinger equations and Darboux transformations. J. Phys. A: Math. Gen. 31, 5315–5326 (1998)
37. Chalykh, O.A., Goncharenko, V.M., Veselov, A.P.: Multidimensional integrable Schrödinger operators with matrix potential. Accepted for publication in JMP

Communicated by T. Miwa
Commun. Math. Phys. 206, 567 – 586 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
A Nonperturbative Regularization of the Supersymmetric Schwinger Model
C. Klimčík
Institut de Mathématiques de Luminy, 163, Avenue de Luminy, 13288 Marseille, France
Received: 15 March 1999 / Accepted: 8 April 1999
Abstract: It is shown that noncommutative geometry is a nonperturbative regulator which can manifestly preserve a space supersymmetry and a supergauge symmetry while keeping only a finite number of degrees of freedom in the theory. The simplest N = 1 case of the U(1) supergauge theory on the sphere is worked out in detail.

1. Introduction

There is a widespread belief that “it is unlikely having any regulated form of the theory with the supersymmetry besides string theory” [1]. Nevertheless, we wish to show here, though only in the simplest possible case of the supersymmetric Schwinger model in two dimensions, that there exists a regulator reconciling the supersymmetry with the short distance cut-off. Moreover, this regulator has two advantages with respect to strings. First of all, it is genuinely nonperturbative and, secondly, it is more economical because it requires only a finite number of degrees of freedom.

The mathematical structure behind this regulator is noncommutative geometry. It is a discipline largely developed by A. Connes [2] and, besides its applications in the physics of the standard model, it is expected to have an important impact on the general structure of quantum field theory [3]. The motivation for this work actually came from the noncommutative rather than from the supersymmetric side. The last few years have seen a lot of activity [4–11] concerning model building on the so-called fuzzy sphere. The latter concept was probably invented by Berezin [12], but the idea of using it for the regularization of scalar field theories was first advocated, independently, in [13] and [14]. We should perhaps mention, in order to avoid a misunderstanding, that none of the quoted references studies the fuzzy sphere as such; they study field theories on it.
In the same spirit, we may say that people working on the quantum inverse scattering method in two dimensions have not been studying the two dimensional Minkowski space but the immensely rich world of two-dimensional field theoretical models living on it. So the
question: “Why devote so much time to studying the fuzzy sphere?” is not well posed; one had better ask: “Which theories can one formulate on it?”

The experience so far is encouraging: all interesting field theoretical structures can be formulated on the fuzzy sphere. We mean here theories including fermions [4,9,7,11,10], gauge fields [8,9,7,11], topologically nontrivial field configurations [5, 10] and even those incorporating superscalar fields like supersymmetric nonlinear σ models [4]. The next step to undertake is to formulate supersymmetric gauge theories. We shall do it in this paper.

The construction of supersymmetric gauge theories differs from the previously studied models in several respects. First of all, there is more structure to reconcile with the fuzziness of the space; besides the supersymmetry, there is the supergauge symmetry and a need to build up a complex of superdifferential forms on the noncommutative sphere. Secondly, the structure appears very rigid; while in the nonsupersymmetric context one can usually advocate several nonequivalent constructions, in the supersymmetric context there seems to be little room for ambiguities. Thirdly, and no less importantly, studying the supergauge theories is quite a technical task. The field content of these theories is rich and much work is needed to disentangle good theories from pathological ones by imposing suitable constraints on the superfields.

As is well known [15], the language of differential forms is the most natural one for building up supersymmetric gauge theories. However, the field strength obtained by applying the coboundary of the supersymmetric complex to the gauge field (a 1-form) cannot be arbitrary but must, in general, be constrained in a way compatible with supersymmetry. Without those constraints, the inconvenience would be much greater than just an abundance of additional propagating fields in the theory.
One rather risks the violation of the spin-statistics theorem, the presence of terms containing four bosonic derivatives and similar serious pathologies. Unfortunately, the understanding that the constraints are indispensable is not sufficient for finding them. So far no method has been found which would lead to an algorithmic identification of the constraints yielding a good supersymmetric theory. One therefore has to combine intuition and a lot of calculation to work out the content of the theory for a given set of guessed constraints.

As we have already alluded to above, in our case additional complications enter the story. Indeed, we always need to work with the global description of all the structures involved in order to ensure the possibility of a noncommutative generalization (noncommutative geometry does not know local charts). Thus, we have to find a suitable globally defined differential complex and a set of constraints which would give a good theory.

The guiding principle for seeking the complex which would underlie supersymmetric gauge theories on the sphere seems clear. One should covariantize the superderivatives present in a matter action possessing a global symmetry to be gauged. If we take the standard superscalar matter [4], those superderivatives turn out to be generators of a certain superalgebra which is not osp(2, 1), as one might have expected. What happens is that, though the resulting supersymmetric theory does turn out to have the osp(2, 1) supersymmetry, one needs an sl(2, 1) covariant structure to uncover it. The reason for this can be seen already in the standard case of the flat space (super-Poincaré) supersymmetry algebra, where one has to add to the generators of the supersymmetry (Q, Q̄) also the so-called supersymmetric covariant derivatives (D, D̄). In the flat case the derivatives (D, D̄) are detached from the superalgebra (they anticommute with the Q-generators) and it is of little use to remark that the Q’s and D’s together form an N = 2 superalgebra. However, when we move from the flat space to the sphere, it can be seen
([4]) that the supersymmetric covariant derivatives added to the superalgebra osp(2, 1) even entail the introduction of another bosonic generator, which completes the structure to that of the sl(2, 1) superalgebra. Thus the supersymmetric covariant derivatives are not detached from the supersymmetry algebra and it is natural and, in fact, inevitable to consider the bigger structure which involves them.

In what follows, we shall first introduce an algebra of superfunctions on the supersphere following [8] and then a differential complex which will underlie the notion of the supergauge field. Since the representation theory of sl(2, 1) is not as widely known as that of su(2), we shall review its basic elements. In Sect. 3, we first describe the construction of the supersymmetric Schwinger model on the ordinary sphere in a way which is most probably also original. Then we shall give an invariant description of the complex of the differential superforms. This invariant description will not only render the logic of the construction quite transparent but will also prove technically efficient in formulating the supersymmetric Schwinger model on the noncommutative sphere. In fact, the formulae which would appear in the noninvariant formulation in the noncommutative case would be exceedingly cumbersome. We shall finish with a brief outlook.

2. A Differential Complex on the Supersphere

2.1. Superfunctions on the supersphere. Consider the algebra of functions on the complex C^{2,1} superplane, i.e. the algebra generated by bosonic variables χ̄^α, χ^α, α = 1, 2 and by fermionic ones ā, a. The algebra is equipped with the graded involution
\[ (\chi^\alpha)^\ddagger = \bar\chi^\alpha, \quad (\bar\chi^\alpha)^\ddagger = \chi^\alpha, \quad a^\ddagger = \bar a, \quad \bar a^\ddagger = -a, \tag{1} \]
satisfying the following properties:
\[ (AB)^\ddagger = (-1)^{AB} B^\ddagger A^\ddagger, \quad (A^\ddagger)^\ddagger = (-1)^{A} A, \tag{2} \]
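and with the super-Poisson bracket
\[ \{f, g\} = \partial_{\chi^\alpha} f\, \partial_{\bar\chi^\alpha} g - \partial_{\bar\chi^\alpha} f\, \partial_{\chi^\alpha} g + (-1)^{f+1} \left[ \partial_a f\, \partial_{\bar a} g + \partial_{\bar a} f\, \partial_a g \right]. \tag{3} \]
We can now apply the (super)symplectic reduction with respect to the moment map χ̄^α χ^α + āa − 1. The result is a smaller algebra A∞ that, by definition, consists of all functions f with the property
\[ \{f, \bar\chi^i \chi^i + \bar a a - 1\} = 0. \tag{4} \]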
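Moreover, two functions obeying (4) are considered to be equivalent if they differ just by a product of (χ̄^α χ^α + āa − 1) with some other such function. The smaller algebra A∞ (the reason for using the subscript ∞ will become clear soon) is referred to as the algebra of superfunctions on the supersphere [8]. It is sometimes more convenient to work with a different parametrization of A∞, using rather the following coordinates:
\[ z = \frac{\chi^1}{\chi^2}, \quad \bar z = \frac{\bar\chi^1}{\bar\chi^2}, \quad b = \frac{a}{\chi^2}, \quad \bar b = \frac{\bar a}{\bar\chi^2}. \tag{5} \]
The Poisson bracket (3) then becomes
\[ \begin{aligned} \{f, g\} = {} & (1 + \bar z z)(1 + \bar z z + \bar b b)(\partial_z f\, \partial_{\bar z} g - \partial_{\bar z} f\, \partial_z g) \\ & + (1 + \bar z z)\, \bar b z \left( (-1)^f \partial_z f\, \partial_{\bar b} g - \partial_{\bar b} f\, \partial_z g \right) + (1 + \bar z z)\, b \bar z \left( \partial_b f\, \partial_{\bar z} g - (-1)^f \partial_{\bar z} f\, \partial_b g \right) \\ & + (-1)^{f+1} (1 + \bar z z - \bar b b\, \bar z z)(\partial_b f\, \partial_{\bar b} g + \partial_{\bar b} f\, \partial_b g). \end{aligned} \tag{6} \]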
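A natural Berezin integral on A∞ can be written as
\[ I[f] = -\frac{1}{4\pi^2} \int d\bar\chi^1 \wedge d\chi^1 \wedge d\bar\chi^2 \wedge d\chi^2 \wedge d\bar a \wedge da\; \delta(\bar\chi^i \chi^i + \bar a a - 1)\, f. \tag{7} \]
It can be rewritten as
\[ I[f] \equiv -\frac{i}{2\pi} \int \frac{d\bar z \wedge dz \wedge d\bar b \wedge db}{1 + \bar z z + \bar b b}\, f. \tag{8} \]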
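The normalisation of (8) can be checked by hand. The sketch below assumes the Berezin convention ∫ db̄ ∧ db b̄b = −1 and the standard orientation dz̄ ∧ dz = 2i dx ∧ dy for z = x + iy; neither is spelled out in the text, so they are assumptions of this illustration.

```latex
% Expand in the nilpotent combination \bar b b:
\frac{1}{1+\bar z z+\bar b b}
  = \frac{1}{1+\bar z z} - \frac{\bar b b}{(1+\bar z z)^2},
\qquad
\int d\bar b\wedge db\;\bar b b = -1 \quad\text{(assumed convention)} .
% The Berezin integral keeps only the \bar b b term:
I[1] = -\frac{i}{2\pi}\int \frac{d\bar z\wedge dz}{(1+\bar z z)^2}
     = -\frac{i}{2\pi}\cdot 2i \int \frac{dx\,dy}{(1+x^2+y^2)^2}
     = \frac{1}{\pi}\cdot \pi = 1 .
```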
(Note I[1] = 1.)

Now we are ready to quantize the infinite dimensional algebra A∞ with the goal of obtaining its (noncommutative) finite dimensional deformation. The quantization was actually performed in [4] using the representation theory of the sl(2, 1) superalgebra. Here we adopt a different procedure, namely the quantum symplectic reduction (or, in other words, quantization with constraints). We start with the well-known quantization of the complex plane C^{2,1}. The generators χ̄^α, χ^α, ā, a become creation and annihilation operators on the Fock space whose commutation relations are given by the standard replacement
\[ \{\cdot, \cdot\} \to \frac{1}{h} [\cdot, \cdot]. \tag{9} \]
Here h is a real parameter (we have absorbed the imaginary unit into the definition of the Poisson bracket) referred to as the “Planck constant”. Explicitly
\[ [\chi^\alpha, \bar\chi^\beta]_- = h\, \delta^{\alpha\beta}, \quad [a, \bar a]_+ = h \tag{10} \]
and all remaining graded commutators vanish. The Fock space is built up as usual, by applying the creation operators χ̄^α, ā to the vacuum |0⟩, which is in turn annihilated by the annihilation operators χ^α, a. We use here the same symbols for the classical and quantum quantities in the hope that it will always be clear from the context which usage we have in mind.

Now we perform the quantum symplectic reduction with the moment map (χ̄^α χ^α + āa). First we restrict the Hilbert space to the vectors ψ satisfying the constraint
\[ (\bar\chi^\alpha \chi^\alpha + \bar a a - 1)\, \psi = 0. \tag{11} \]
Hence operators f̂ acting on this restricted space which fulfill
\[ [\hat f, (\bar\chi^\alpha \chi^\alpha + \bar a a - 1)] = 0 \tag{12} \]
form our deformed version of A∞.

The spectrum of the operator (χ̄^α χ^α + āa − 1) in the Fock space is given by the sequence Nh − 1, where the N are non-negative integers. In order to fulfill (11) for a non-vanishing ψ, we observe that the inverse Planck constant 1/h must be an integer N. The constraint (11) then selects only the ψ’s living in the eigenspace H_N of the operator (χ̄^α χ^α + āa − 1) with the eigenvalue 0. This subspace of the Fock space has dimension 2N + 1 and the algebra A_N of operators (i.e. supermatrices) f̂ acting on it is (2N + 1)²-dimensional. When N → ∞ (the dimension (2N + 1)² then also diverges) the Planck constant approaches 0 and the algebras A_N tend to the classical limit A∞ [4].
The Hilbert space H_N is naturally graded. The even subspace H_{eN} is created from the Fock vacuum by applying only the bosonic creation operators:
\[ (\bar\chi^1)^{n_1} (\bar\chi^2)^{n_2} |0\rangle, \quad n_1 + n_2 = N, \tag{13} \]
while the odd one H_{oN} is created by applying both bosonic and fermionic creation operators:
\[ (\bar\chi^1)^{n_1} (\bar\chi^2)^{n_2}\, \bar a |0\rangle, \quad n_1 + n_2 = N - 1. \tag{14} \]
At the level of the supercomplex plane C^{2,1}, it is a textbook fact from quantum mechanics that the integral ∫ dχ̄^α dχ^α dā da (this is the Liouville integral over the superphase space) is replaced under the quantization procedure by the supertrace in the Fock space. (The supertrace is the trace over the indices of the zero-fermion states minus the trace over the one-fermion states.) The δ function of the operator (χ̄^α χ^α + āa − 1) just restricts the supertrace to the trace over the indices of H_{eN} minus the trace over the indices of H_{oN}. Hence an integration in A_N is given by the formula
\[ I[\hat f] \equiv \mathrm{STr}[\hat f], \quad \hat f \in A_N. \tag{15} \]
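The counting behind (13)–(15) is easy to reproduce. The following minimal sketch enumerates occupation numbers only (no operators); the function names are hypothetical and chosen for this illustration. It confirms that H_N has N + 1 even and N odd states, so that dim H_N = 2N + 1, dim A_N = (2N + 1)² and STr[1] = 1, in agreement with I[1] = 1.

```python
def fock_basis(N):
    """Occupation numbers (n1, n2, nf) of the states spanning H_N.

    n1, n2 count the bosonic quanta created by chi-bar^1 and chi-bar^2,
    nf in {0, 1} counts the fermionic quantum created by a-bar.
    With h = 1/N the constraint (11) reads n1 + n2 + nf = N.
    """
    return [(n1, N - nf - n1, nf)
            for nf in (0, 1)
            for n1 in range(N - nf + 1)]


def graded_dimensions(N):
    """Return (dim of even subspace, dim of odd subspace) of H_N."""
    basis = fock_basis(N)
    even = sum(1 for state in basis if state[2] == 0)
    odd = sum(1 for state in basis if state[2] == 1)
    return even, odd


def supertrace_of_identity(N):
    """STr[1] = trace over the even states minus trace over the odd states."""
    even, odd = graded_dimensions(N)
    return even - odd


if __name__ == "__main__":
    for N in (1, 2, 5):
        even, odd = graded_dimensions(N)
        dim_H = even + odd
        print(N, even, odd, dim_H, dim_H ** 2, supertrace_of_identity(N))
```

The supertrace of the unit supermatrix is independent of N, which is why no N-dependent normalisation factor appears in (15).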
The graded involution ‡ in the noncommutative algebra A_N is defined exactly as in (1).

2.2. osp(2, 1) and sl(2, 1) superalgebras and their representations. The sl(2, 1) superalgebra has a convenient basis of even generators R±, R3, Γ and odd ones V±, D±, satisfying the following (anti)commutation relations (see also [16,4]):
\[ [R_3, R_\pm] = \pm R_\pm, \quad [R_+, R_-] = 2R_3, \quad [R_i, \Gamma] = 0, \tag{16} \]
\[ [D_\pm, V_\pm]_+ = 0, \quad [D_\pm, V_\mp]_+ = \pm\tfrac{1}{4} \Gamma, \tag{17} \]
\[ [D_\pm, D_\pm]_+ = \mp\tfrac{1}{2} R_\pm, \quad [D_\pm, D_\mp]_+ = \tfrac{1}{2} R_3, \tag{18} \]
\[ [V_\pm, V_\pm]_+ = \pm\tfrac{1}{2} R_\pm, \quad [V_\pm, V_\mp]_+ = -\tfrac{1}{2} R_3, \tag{19} \]
\[ [R_3, V_\pm] = \pm\tfrac{1}{2} V_\pm, \quad [R_\pm, V_\pm] = 0, \quad [R_\pm, V_\mp] = V_\pm, \tag{20} \]
\[ [R_3, D_\pm] = \pm\tfrac{1}{2} D_\pm, \quad [R_\pm, D_\pm] = 0, \quad [R_\pm, D_\mp] = D_\pm, \tag{21} \]
\[ [\Gamma, V_\pm] = D_\pm, \quad [\Gamma, D_\pm] = V_\pm. \tag{22} \]
Here and in what follows the commutators are denoted by [., .] but the anticommutators carry a subscript, [., .]+. We reserve the notation {., .} for Poisson brackets and their noncommutative generalizations (see Sect. 3.2). If we take a Poisson bracket of two odd elements of A∞ we write {., .}+.

The superalgebra osp(2, 1) is a subsuperalgebra of sl(2, 1) generated by R_i, V±. The irreducible representations of osp(2, 1) are classified by one parameter, which may be a positive integer or a positive half-integer j and is referred to as the superspin [16]. Every irreducible j-representation is of course a (reducible) representation of the su(2) subalgebra of osp(2, 1). Its decomposition into irreducible components from the su(2) point of view is given by
\[ j_{osp(2,1)} = j_{su(2)} \oplus (j - \tfrac{1}{2})_{su(2)}, \tag{23} \]
where j_{su(2)} denotes the standard su(2) spin. The only exception to the rule (23) is the trivial superspin zero representation.

The classification of irreducible representations of sl(2, 1) is more involved [16]. There exist two types: the typical and the non-typical ones. The former are characterized by the property that they are reducible from the point of view of the osp(2, 1) superalgebra, while the latter are irreducible. The typical representation is characterized by one positive integer or half-integer j_{sl(2,1)} ≥ 1 called the sl(2, 1) superspin and by an arbitrary complex number γ ≠ ±2j which is related to Γ and may be called a Γ-spin. The typical representations considered in this paper will always have the Γ-spin equal to zero. They are 8j_{sl(2,1)} dimensional and they have the following osp(2, 1) content:
\[ j_{sl(2,1)} = j_{osp(2,1)} \oplus (j - 1/2)_{osp(2,1)}, \tag{24} \]
hence the following su(2) content:
\[ j_{sl(2,1)} = j_{su(2)} \oplus (j - 1/2)_{su(2)} \oplus (j - 1/2)_{su(2)} \oplus (j - 1)_{su(2)}. \tag{25} \]
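The dimension 8j quoted for the typical representations can be cross-checked against the contents (23)–(25). The sketch below is a bookkeeping aid only; the function names are hypothetical. It also checks that the trivial representation together with the typical ones of superspin j = 1, ..., N, as in the decomposition (30) of A_N stated further on in this section, adds up to (2N + 1)².

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def dim_su2(j):
    """Dimension 2j + 1 of the spin-j representation of su(2)."""
    return int(2 * Fraction(j) + 1)


def dim_osp(j):
    """Dimension of the superspin-j osp(2,1) irrep, from the su(2) content (23)."""
    j = Fraction(j)
    if j == 0:
        return 1  # trivial representation, the stated exception to (23)
    return dim_su2(j) + dim_su2(j - HALF)


def dim_sl_typical(j):
    """Dimension of the typical sl(2,1) irrep, from the osp(2,1) content (24)."""
    j = Fraction(j)
    return dim_osp(j) + dim_osp(j - HALF)


def dim_sl_from_su2(j):
    """The same dimension read off from the su(2) content (25)."""
    j = Fraction(j)
    return dim_su2(j) + 2 * dim_su2(j - HALF) + dim_su2(j - 1)


def dim_A_N(N):
    """Trivial representation plus typical ones with j = 1, ..., N, cf. (30)."""
    return 1 + sum(dim_sl_typical(j) for j in range(1, N + 1))


if __name__ == "__main__":
    for j in (1, Fraction(3, 2), 2, 5):
        print(j, dim_sl_typical(j), dim_sl_from_su2(j), 8 * j)
    for N in range(1, 6):
        print(N, dim_A_N(N), (2 * N + 1) ** 2)
```

Exact rational arithmetic via `Fraction` is used so that half-integer superspins are handled without floating-point rounding.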
The Lie superalgebra sl(2, 1) is naturally represented on the (graded) commutative associative superalgebra A∞ and also on its noncommutative deformations A_N. In the commutative case, this representation can be called Hamiltonian since it is generated via the super-Poisson bracket (3) by the following charges:
\[ r_+ = \bar\chi^1 \chi^2, \quad r_- = \bar\chi^2 \chi^1, \quad r_3 = \tfrac{1}{2} (\bar\chi^1 \chi^1 - \bar\chi^2 \chi^2), \quad \gamma = \bar a a + 1, \tag{26} \]
\[ 2v_+ = \bar\chi^1 a + \bar a \chi^2, \quad 2v_- = \bar\chi^2 a - \bar a \chi^1, \quad 2d_+ = \bar a \chi^2 - \bar\chi^1 a, \quad 2d_- = -\bar\chi^2 a - \bar a \chi^1. \tag{27} \]
This means that, for instance, V+ acts on an (even) element f ∈ A∞ as
\[ V_+ f = \{v_+, f\}, \tag{28} \]
and so on for every generator of sl(2, 1). In the noncommutative case, the representation of sl(2, 1) is defined by the same charges (26) and (27), but now thought of as operators acting on H_N, via scaled (graded) commutators; for instance, for even f,
\[ V_+ f = N [v_+, f], \quad f \in A_N. \tag{29} \]
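The statement that the charges (26) and (27) realise sl(2, 1) can be tested numerically. The following sketch makes several assumptions that are not in the text: it uses a Bargmann-type realisation in which χ̄^α acts as multiplication by z_α, χ^α as ∂/∂z_α, ā as multiplication by a Grassmann variable θ and a as ∂/∂θ; it sets h = 1, so that on H_N the constant 1 in γ is replaced by the total occupation number, γ → n_1 + n_2 + 2n_f; and the class and function names are hypothetical. With h = 1 the scaled commutator reduces to the plain (anti)commutator of the charges.

```python
from fractions import Fraction


def add(s, t):
    """Sum of two states stored as {monomial: coefficient}."""
    out = dict(s)
    for m, c in t.items():
        out[m] = out.get(m, 0) + c
    return {m: c for m, c in out.items() if c != 0}


class Op:
    """Linear operator defined by its action on basis monomials (n1, n2, nf)."""

    def __init__(self, action):
        self.action = action

    def __call__(self, state):
        out = {}
        for mono, c in state.items():
            for m2, c2 in self.action(mono).items():
                out[m2] = out.get(m2, 0) + c * c2
        return {m: c for m, c in out.items() if c != 0}

    def __mul__(self, other):  # composition: (A * B)(s) = A(B(s))
        return Op(lambda mono: self(other({mono: 1})))

    def __add__(self, other):
        return Op(lambda mono: add(self({mono: 1}), other({mono: 1})))

    def __sub__(self, other):
        return self + (-1) * other

    def __rmul__(self, scalar):
        return Op(lambda mono: {m: scalar * c for m, c in self({mono: 1}).items()})


def raise_b(i):  # multiplication by z_i, standing for chi-bar^i
    return Op(lambda m: {tuple(n + (k == i) for k, n in enumerate(m)): 1})


def lower_b(i):  # d/dz_i, standing for chi^i
    return Op(lambda m: {tuple(n - (k == i) for k, n in enumerate(m)): m[i]} if m[i] else {})


abar = Op(lambda m: {(m[0], m[1], 1): 1} if m[2] == 0 else {})  # theta
a = Op(lambda m: {(m[0], m[1], 0): 1} if m[2] == 1 else {})     # d/dtheta
x1, x2, y1, y2 = raise_b(0), raise_b(1), lower_b(0), lower_b(1)
half, quarter = Fraction(1, 2), Fraction(1, 4)
zero = Op(lambda m: {})

Rp, Rm = x1 * y2, x2 * y1
R3 = half * (x1 * y1 - x2 * y2)
G = x1 * y1 + x2 * y2 + 2 * (abar * a)   # h = 1 analogue of gamma on H_N (assumption)
Vp = half * (x1 * a + abar * y2)
Vm = half * (x2 * a - abar * y1)
Dp = half * (abar * y2 - x1 * a)
Dm = half * ((-1) * (x2 * a) - abar * y1)


def comm(A, B):
    return A * B - B * A


def acomm(A, B):
    return A * B + B * A


def relations():
    """Pairs (lhs, rhs) encoding the relations (16)-(22)."""
    R, V, D = {1: Rp, -1: Rm}, {1: Vp, -1: Vm}, {1: Dp, -1: Dm}
    rel = [(comm(Rp, Rm), 2 * R3), (comm(R3, G), zero)]
    for s in (1, -1):
        rel += [
            (comm(R3, R[s]), s * R[s]), (comm(R[s], G), zero),                   # (16)
            (acomm(D[s], V[s]), zero), (acomm(D[s], V[-s]), s * quarter * G),    # (17)
            (acomm(D[s], D[s]), -s * half * R[s]), (acomm(D[s], D[-s]), half * R3),  # (18)
            (acomm(V[s], V[s]), s * half * R[s]), (acomm(V[s], V[-s]), -half * R3),  # (19)
            (comm(R3, V[s]), s * half * V[s]), (comm(R[s], V[s]), zero),
            (comm(R[s], V[-s]), V[s]),                                           # (20)
            (comm(R3, D[s]), s * half * D[s]), (comm(R[s], D[s]), zero),
            (comm(R[s], D[-s]), D[s]),                                           # (21)
            (comm(G, V[s]), D[s]), (comm(G, D[s]), V[s]),                        # (22)
        ]
    return rel


def check(N):
    """True if every relation holds on all basis states of H_N."""
    basis = [(n1, N - nf - n1, nf) for nf in (0, 1) for n1 in range(N - nf + 1)]
    return all(lhs({m: 1}) == rhs({m: 1}) for lhs, rhs in relations() for m in basis)


if __name__ == "__main__":
    print([check(N) for N in (1, 2, 3)])
```

The monomial basis is not orthonormal, but the relations are basis independent, so exact integer and rational arithmetic suffices and no square roots of occupation numbers are needed.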
The explicit form of the supermatrices r_i, v_α, d_α and γ was given in [4] (Eqs. (83)–(91)). Both the commutative algebra A∞ and its noncommutative deformations A_N are completely reducible with respect to the sl(2, 1) action described above. Their decompositions into irreducible components involve only the typical representations and they are given explicitly as follows [4]:
\[ A_N = \oplus_{j=0}^{N}\, j, \quad A_\infty = \oplus_{j=0}^{\infty}\, j, \tag{30} \]
where j stands for the sl(2, 1) superspin and j = 0 means the trivial representation. The representation space of the latter consists of the constant elements of A∞ and of the constant multiples of the unit supermatrix in the case of A_N.

The description of the typical multiplet in A∞ or in A_N with the sl(2, 1) superspin equal to 1 is easy. In both cases, the commutative and the noncommutative one,
the representation space of this superspin 1 representation is spanned by the charges r_i, γ, v±, d±, which are considered as elements of A∞ and A_N, respectively. Evidently, the sl(2, 1) superspin 1 representation coincides with the adjoint representation and its dimension is 8. This fact has an interesting consequence, namely that the A_N-valued charges also provide an sl(2, 1) representation with the representation space being H_N. This representation can be shown to be the non-typical irreducible representation of sl(2, 1) and, as such, it is also an irreducible representation of osp(2, 1). Its osp(2, 1) superspin is given by N/2.

2.3. Seeking the complex. We could briefly define the differential complex on the supersphere and then construct the supergauge theories based on it, but before doing that it is perhaps desirable to indicate the way in which this complex was found. Without those indications, the interested reader could check that the construction gives correct results but might not be convinced that it is in some sense unique.

Suppose therefore that we want to construct supergauge theories with the underlying superalgebra being osp(2, 1). A natural way to do it consists in trying to “covariantize” the derivatives which appear in the action of the charged scalar superfield (we are in two dimensions). This action was constructed in [4] and is explicitly given by
\[ S = I[D_+ \Phi^\ddagger D_- \Phi - D_- \Phi^\ddagger D_+ \Phi + (1/4)\, \Gamma \Phi^\ddagger\, \Gamma \Phi], \tag{31} \]
where Φ ∈ A∞ is a complex scalar superfield on the sphere, I is the integral over A∞ defined in (7) and the derivatives D±, Γ were defined via the Poisson bracket (see (29)). If we add those derivatives to the osp(2, 1) superalgebra (which acts by means of R_i, V±) we obtain the sl(2, 1) superalgebra. As was explained in [4], using the osp(2, 1) generators alone was insufficient for constructing a theory respecting the spin-statistics theorem. It seems that our supergauge field multiplet is composed of three superfields A±, W which one has to add respectively to the three derivatives D±, Γ in order to covariantize them. The action (31) would then become
\[ S = I[(D_+ - A_+) \Phi^\ddagger (D_- + A_-) \Phi - (D_- - A_-) \Phi^\ddagger (D_+ + A_+) \Phi + (1/4)(\Gamma - W) \Phi^\ddagger (\Gamma + W) \Phi]. \tag{32} \]
However, we shall encounter a lot of trouble trying to find an osp(2, 1) invariant field strength corresponding to the gauge multiplet A±, W. It seems that this field strength should be given by an expression that contains only the first derivatives D±, Γ of the multiplet A±, W, for example something like:
\[ F = D_+ A_- - D_- A_+ + (1/4)\, \Gamma W. \tag{33} \]
F defined in this way seems to be nice, since it is indeed osp(2, 1) invariant (for consistency, the multiplet A±, W has to transform under the osp(2, 1) action in the same way as the derivatives D±, Γ, which are transformed according to (16), (17), (21) and (22)). However, the strength so defined is not gauge invariant if we impose the evident gauge transformation rule
\[ A_\pm \to A_\pm + i D_\pm \Lambda, \quad W \to W + i\, \Gamma \Lambda, \tag{34} \]
where Λ is a real scalar superfield.
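The failure of gauge invariance can be made explicit in one line. The following worked equation is a sketch that simply substitutes the transformation (34) into (33); no further conventions are assumed.

```latex
F \;\longrightarrow\; F + i\Bigl(D_+ D_- - D_- D_+ + \tfrac14\,\Gamma^2\Bigr)\Lambda .
```

The relations (16)–(22) fix only the anticommutator D_+D_- + D_-D_+ = ½R_3, so they do not force the operator in the bracket to vanish, and the extra term survives for a generic Λ.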
Let us continue our search and suppose that the needed field strength is not an osp(2, 1) singlet but some multiplet with a higher osp(2, 1) superspin. However, it is not difficult to show that no such multiplet exists which would be linear in the derivatives D±, Γ and would respect the gauge transformation (34). The same no-go theorem can be proved if we add into the game the derivatives R_i, V± acting on A±, W. Let us therefore add to the superspin 1/2 multiplet A±, W a superspin 1 multiplet of superfields C_i, B± whose gauge transformations are defined as follows:
\[ C_i \to C_i + i R_i \Lambda, \quad B_\pm \to B_\pm + i V_\pm \Lambda. \tag{35} \]
Now it turns out that an osp(2, 1) covariant multiplet of gauge invariant field strengths exists that is linear in the derivatives R_i, V±, D±, Γ and in the superfields C_i, B±, A±, W. However, the trouble reappears: firstly, it seems unnatural to have an abundance of new gauge superfields in the game which do not even interact with the matter field and which enter only the pure gauge field sector of the Lagrangian. Secondly, and even more importantly, even if we accept that abundance of fields, any polynomial osp(2, 1) invariant Lagrangian built up of that field strength leads to a pathological theory (higher derivatives, violation of spin-statistics, etc.).

It turns out, however, that even this difficulty can be circumvented by imposing suitable constraints on the supergauge field multiplet A±, W, C_i, B± which would eliminate the unwanted superfields. This strategy is, of course, standard in the superworld but not necessarily easy. The subtlety consists in ensuring that differential constraints in the superspace do not generate differential constraints in the bosonic variables on the fields which remain in the Lagrangian. At the same time one has to ensure that the constraints are compatible with the osp(2, 1) supersymmetry and the supergauge transformations (34) and (35). All these conditions are quite stringent and the fact that a solution exists even in the noncommutative case indicates the naturalness of the compatibility of noncommutative geometry and supersymmetry.

In seeking good constraints, we adopt the natural assumption that the constraints are linear both in the derivatives R_i, V±, D±, Γ and in the superfields C_i, B±, A±, W. The condition of compatibility with the gauge transformations (34) and (35) selects 32 constraints of that type, which fall into six osp(2, 1) supermultiplets. One of these multiplets has the superspin 1/2, three of them the superspin 1 and two of them the superspin 3/2. The superspin 1/2 multiplet is nothing but the field strength mentioned above. It reads
\[ F_\pm = \Gamma B_\pm - V_\pm W - 2R_\pm A_\mp + 2D_\mp C_\pm \mp 2R_3 A_\pm \pm 2D_\pm C_3 + 2A_\pm, \tag{36} \]
\[ f = 4V_+ A_- - 4V_- A_+ + 4D_- B_+ - 4D_+ B_- + 2W. \tag{37} \]
A tedious (though straightforward) inspection shows that the only viable constraint is given by the following superspin 1 multiplet, i.e.:
\[ \pm 4D_\pm A_\pm + C_\pm = 0, \quad C_3 - 2D_- A_+ - 2D_+ A_- = 0; \tag{38} \]
\[ B_\pm + D_\pm W - \Gamma A_\pm = 0. \tag{39} \]
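The compatibility of these constraints with the gauge transformations (34) and (35) can be verified directly from the relations (16)–(22). The sketch below reads the first constraint as ±4D_±A_± + C_± = 0, the form in which it appears among the components of the coboundary in (48), and uses D_±² = ½[D_±, D_±]_+ = ∓¼R_±.

```latex
% Variation of each constraint under (34) and (35):
\pm 4 D_\pm A_\pm + C_\pm \;\to\; \dots + i\bigl(\pm 4 D_\pm^2 + R_\pm\bigr)\Lambda
   = \dots + i\bigl(-R_\pm + R_\pm\bigr)\Lambda = \dots ,
\\
C_3 - 2D_-A_+ - 2D_+A_- \;\to\; \dots + i\bigl(R_3 - 2[D_+, D_-]_+\bigr)\Lambda
   = \dots + i\bigl(R_3 - R_3\bigr)\Lambda = \dots ,
\\
B_\pm + D_\pm W - \Gamma A_\pm \;\to\; \dots + i\bigl(V_\pm - [\Gamma, D_\pm]\bigr)\Lambda
   = \dots + i\bigl(V_\pm - V_\pm\bigr)\Lambda = \dots .
```

In each line the dots stand for the untransformed expression, so all five components of the superspin 1 constraint are gauge invariant.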
The explicit formulas for the field strength and the correct constraint will reappear in the following subsection in terms of the structures of the differential complex alluded to in the introduction. It should be clear that the structure of the complex that we are going to construct is implied by our previous discussion. In other words, we grasp and formalize our search of the supersymmetric field strength and the supersymmetric constraints in terms of that complex.
2.4. The complex. We shall describe the differential complex over the supersphere by working with the commutative and noncommutative cases at the same time. In fact, whenever we consider the “commutator” in the algebra A_N we shall have in mind the commutator multiplied by N (for N finite) and the Poisson bracket (3) for N = ∞. We denote the complex by Ξ_N and we define it as follows:
\[ \Xi_N = \oplus_{j=0}^{3} (\Xi_N)_j. \tag{40} \]
As usual, the elements of (Ξ_N)_j will be called the j-forms. The spaces (Ξ_N)_j have the following structure:
\[ (\Xi_N)_0 = (\Xi_N)_3 = A_N, \tag{41} \]
\[ (\Xi_N)_1 = (\Xi_N)_2 = \oplus_{i=1}^{8} (A_N)_i, \tag{42} \]
where (A_N)_i = A_N for every value of the index i. Generically, we shall denote by small (capital) Greek characters the 0-forms (3-forms) and by capital (small) Latin characters the 1-forms (2-forms). Then the associative product ∗ is given by the rules
\[ \phi * \psi = \phi\psi, \quad \phi * (A_\pm, W, C_i, B_\pm) = (\phi A_\pm, \phi W, \phi C_i, \phi B_\pm), \tag{43} \]
\[ \phi * (a_\pm, w, c_i, b_\pm) = (\phi a_\pm, \phi w, \phi c_i, \phi b_\pm), \quad \phi * \Psi = \phi\Psi, \tag{44} \]
\[ \begin{aligned} (A^1_\pm, W^1, C^1_i, B^1_\pm) * (A^2_\pm, W^2, C^2_i, B^2_\pm) = \big( & W^1 B^2_+ - W^2 B^1_+ - 2C^1_+ A^2_- + 2C^2_+ A^1_- - 2C^1_3 A^2_+ + 2C^2_3 A^1_+, \\ & W^1 B^2_- - W^2 B^1_- - 2C^1_- A^2_+ + 2C^2_- A^1_+ + 2C^1_3 A^2_- - 2C^2_3 A^1_-, \\ & -4B^1_+ A^2_- + 4B^1_- A^2_+ - 4A^1_- B^2_+ + 4A^1_+ B^2_-, \\ & -4A^1_+ A^2_+, \; 2A^1_- A^2_+ + 2A^1_+ A^2_-, \; 4A^1_- A^2_-, \\ & W^1 A^2_+ - W^2 A^1_+, \; W^1 A^2_- - W^2 A^1_- \big), \end{aligned} \tag{45} \]
\[ (A_\pm, W, C_i, B_\pm) * (a_\pm, w, c_i, b_\pm) = A_+ a_- - A_- a_+ + \tfrac{1}{4} W w - \tfrac{1}{2} C_+ c_- - \tfrac{1}{2} C_- c_+ - C_3 c_3 - B_+ b_- + B_- b_+. \tag{46} \]
The multiplication of forms by scalars from the right is defined as in (43) and (44) but with φ standing on the right. The product of a 2-form with a 1-form is given as in (46) but with the order of the small and the capital characters reversed. Finally, all other products are defined to be zero. It is important to notice that A±, B± are understood as odd elements of A_N while C_i, W are even (we did not indicate this in (42) in order to avoid too cumbersome a notation). The same is true for a±, b± and c_i, w, respectively. Φ and ψ are even. Of course, these facts play an important role in checking that the product (43)–(46) is associative and, in the case of N = ∞, also graded commutative.

The coboundary operator δ is given by the rules
\[ \delta\phi = (D_\pm \phi, \Gamma\phi, R_i \phi, V_\pm \phi), \tag{47} \]
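\[ \begin{aligned} \delta(A_\pm, W, C_i, B_\pm) = \big( & \Gamma B_\pm - V_\pm W - 2R_\pm A_\mp + 2D_\mp C_\pm \mp 2R_3 A_\pm \pm 2D_\pm C_3 + 2A_\pm, \\ & 4V_+ A_- - 4V_- A_+ + 4D_- B_+ - 4D_+ B_- + 2W, \\ & -4D_+ A_+ - C_+, \; -C_3 + 2D_- A_+ + 2D_+ A_-, \; +4D_- A_- - C_-, \\ & -B_\pm - D_\pm W + \Gamma A_\pm \big), \end{aligned} \tag{48} \]
\[ \delta(a_\pm, w, c_i, b_\pm) = D_+ a_- - D_- a_+ + \tfrac{1}{4} \Gamma w - \tfrac{1}{2} R_+ c_- - \tfrac{1}{2} R_- c_+ - R_3 c_3 - V_+ b_- + V_- b_+, \tag{49} \]
\[ \delta\Psi = 0. \tag{50} \]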
The operators $R_i$, $V_\pm$, $D_\pm$, $\Gamma$ in (47)–(49) act via the scaled commutators (29) for $N$ finite or via the Poisson brackets (3) for $N$ infinite. One easily checks that the coboundary $\delta$ is nilpotent,
$$\delta^2 = 0, \tag{51}$$
and that $\delta$ does verify the graded Leibniz rule
$$\delta(\alpha * \beta) = \delta\alpha * \beta + (-1)^{\alpha} \alpha * \delta\beta \tag{52}$$
in both commutative (infinite $N$) and noncommutative (finite $N$) cases. Now we shall give the action of the osp(2, 1) superalgebra on the complex $\Xi_N$. The osp(2, 1) action on the 0-forms is given basically in terms of the odd generators $V_\pm$. The action of the bosonic generators $R_i$ can then be derived in terms of the anticommutators (19) of the odd transformations. On the 0-forms (and 3-forms), we have the following action:
$$\Delta\phi = (\epsilon_+ V_+ + \epsilon_- V_-)\phi \equiv (\epsilon V)\phi. \tag{53}$$
Here $\epsilon_\pm$ are constant Grassmann parameters and $\Delta$ stands for an infinitesimal variation. The parameters $\epsilon_\pm$ behave with respect to the graded involution as follows:
$$\epsilon_+^\ddagger = \epsilon_-, \qquad \epsilon_-^\ddagger = -\epsilon_+. \tag{54}$$
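Note that this is the choice of a real form of osp(2, 1) with respect to the graded involution and it is dictated by the fact that a 1-form $(A_\pm, W, C_i, B_\pm)$ must fulfill the following “reality” conditions:
$$A_+^\ddagger = A_-, \qquad A_-^\ddagger = -A_+, \qquad B_+^\ddagger = -B_-, \qquad B_-^\ddagger = B_+, \tag{55}$$
$$C_+^\ddagger = C_-, \qquad C_-^\ddagger = C_+, \qquad C_3^\ddagger = C_3, \qquad W^\ddagger = W, \tag{56}$$
in order to yield the correct degrees of freedom of a (super)gauge field. The action on the 1-forms (and also on the 2-forms) is given as follows:
$$\begin{aligned}
\Delta(A_\pm, W, C_i, B_\pm) = \bigl( &(\epsilon V)A_+ - \tfrac{1}{4}\epsilon_- W,\ (\epsilon V)A_- + \tfrac{1}{4}\epsilon_+ W,\ (\epsilon V)W + \epsilon_+ A_+ + \epsilon_- A_-, \\
&(\epsilon V)C_+ + \epsilon_- B_+,\ (\epsilon V)C_- + \epsilon_+ B_-,\ (\epsilon V)C_3 + \tfrac{1}{2}\epsilon_+ B_+ - \tfrac{1}{2}\epsilon_- B_-, \\
&(\epsilon V)B_+ - \tfrac{1}{2}\epsilon_+ C_+ + \tfrac{1}{2}\epsilon_- C_3,\ (\epsilon V)B_- + \tfrac{1}{2}\epsilon_+ C_3 + \tfrac{1}{2}\epsilon_- C_- \bigr).
\end{aligned} \tag{57}$$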
Nonperturbative Regularization of Supersymmetric Schwinger Model
It can be easily checked that the coboundary $\delta$ is osp(2, 1) invariant, i.e.
$$\Delta\delta\omega = \delta\Delta\omega, \qquad \omega \in \Xi_N. \tag{58}$$
In fact, Eq. (58) becomes evident when we introduce the invariant description of the complex in Sect. 3.2 because the coboundary $\delta$ will be given in terms of invariant operators of osp(2, 1). Another important property of $\Delta$ is that it verifies the Leibniz rule in the complex $\Xi_N$, in other words,
$$\Delta(\omega_1 * \omega_2) = (\Delta\omega_1) * \omega_2 + \omega_1 * (\Delta\omega_2). \tag{59}$$
This property enables us to construct the osp(2, 1) invariants. For example, the product of a 1-form with a 2-form, given by the formula (46), is an osp(2, 1) scalar. Note, however, that the expressions
$$A_+ a_- - A_- a_+ + \tfrac{1}{4} W w \tag{60}$$
and
$$\tfrac{1}{2} C_+ c_- + \tfrac{1}{2} C_- c_+ + C_3 c_3 + B_+ b_- - B_- b_+ \tag{61}$$
are separately osp(2, 1) invariant. The last things we shall need are the “Hodge triangle” $\triangleright$ and the theory of integration on $\Xi_N$. The Hodge triangle converts an $i$-form into a $(3 - i)$-form; it is defined simply as the identity map between $\Xi^0$ and $\Xi^3$, and $\Xi^1$ and $\Xi^2$, respectively. The integral $I$ will be defined only on the 3-forms and will be given by (15) for $N$ finite and by (7) for $N$ infinite.

3. Field Theories

3.1. The commutative case. Let us consider first the pure gauge theories in the commutative case. The gauge field will be a 1-form $V \in \Xi_N$ subject to the following constraint:
$$F \equiv \delta V = (\cdot, \cdot, \cdot, 0, 0, 0, 0, 0), \tag{62}$$
in words, the first three components of the coboundary $\delta V$ are unconstrained but the remaining five have to vanish. The reader might have noticed that under the action of the subalgebra osp(2, 1) a generic 1-form (2-form) decomposes into two multiplets. The first three components form a multiplet with the osp(2, 1)-superspin 1/2 and the remaining five with superspin 1. Since the coboundary $\delta$ commutes with the action $\Delta$ of osp(2, 1), it is evident that our constraint (62) does respect the osp(2, 1) supersymmetry. The gauge symmetry of the constraint (62) is also obvious due to the nilpotency of $\delta$. It remains to see whether the constraint involves some unwanted space derivatives. Fortunately, this is not the case. Indeed, setting the last five components of (48) to zero, it is immediately evident that the constraint is resolved with respect to the “additional” superfields $C_\pm$, $C_3$, $B_\pm$:
$$C_\pm = \mp 4 D_\pm A_\pm, \qquad C_3 = 2D_- A_+ + 2D_+ A_-, \qquad B_\pm = -D_\pm W + \Gamma A_\pm. \tag{63}$$
If the field $V$ satisfies the constraint (62) then the three nonzero components of the field strength $\delta V$ will be second order expressions in the derivatives $R_i$, $V_\pm$, $D_\pm$, $\Gamma$ acting on the superspin 1/2 multiplet $A_\pm$, $W$. This may seem awkward, since a Lagrangian quadratic in the field strength will contain terms with four derivatives. It turns out, however, that working out the action in components of the superfields $A_\pm$, $W$ will give a non-pathological action. The same phenomenon takes place in the standard (super-Poincaré) supersymmetric electrodynamics in two dimensional flat space [17], where the kinetic term of the gauge field also contains expressions quartic in the supersymmetric covariant derivatives; nevertheless the action in components is the standard second-order one. Let us write a pure gauge field action on the commutative supersphere as follows:
$$S_\infty(V) = I[\alpha' \, \delta V * \triangleright\delta V + \beta' \, V * \delta V], \qquad V \in \Xi_N, \tag{64}$$
where $\alpha'$ and $\beta'$ are real parameters and the components of $V = (A_\pm, W, C_i, B_\pm)$ are supposed to satisfy the reality conditions (55) and (56). It is evident that the action is gauge invariant with respect to the transformation
$$V \to V + i\delta\Lambda, \tag{65}$$
where $\Lambda$ is a 0-form. Of course, $V$ is to satisfy the constraint (62), hence the components $C_i$, $B_\pm$ are given by (63). Having this in mind, we can evaluate the coboundary of $V$:
$$\delta V = (F_+, F_-, f, 0, 0, 0, 0, 0), \tag{66}$$
where
$$\begin{aligned}
F_\pm &= (\Gamma^2 + 2)A_\pm - (\Gamma D_\pm + V_\pm)W \mp 12 D_\mp D_\pm A_\pm \pm 12 D_\pm^2 A_\mp, \\
f &= 2W + 4(D_+ D_- - D_- D_+)W + 4(V_+ - D_+\Gamma)A_- - 4(V_- - D_-\Gamma)A_+.
\end{aligned} \tag{67}$$
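Before giving a noncommutative version of the action (64), let us write its content in a more familiar parametrization. Set
$$A_+ = \tfrac{1}{2}(A - \bar z \bar A) + \tfrac{1}{2} d_+ K, \qquad A_- = -\tfrac{1}{2}(zA + \bar A) + \tfrac{1}{2} d_- K, \qquad W = \bar b \bar A - bA + \gamma K, \tag{68}$$
where $d_\pm$, $\gamma$ were defined in (26) and (27). The components $F_\pm$, $f$ then become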
where d± , γ were defined in (26) and (27). The components F± , f then become 3 ¯ ¯ A¯ + b(D ¯ A¯ + DA) ¯ + z¯ DD + D 2 A¯ + z¯ D¯ 2 A] F+ = − n[−DDA 2 ¯ + D A) ¯ − 4d+ K + 2nd+ DDK ¯ ¯ − 2(D + z¯ D)K, + 2d+ n(DA
(69)
3 ¯ ¯ A¯ + b(D A¯ + DA) ¯ n[−zDDA − DD + zD 2 A¯ − D¯ 2 A] 2 ¯ + D A) ¯ − 4d− K + 2nd− DDK ¯ − 2(D¯ − zD)K, + 2d− n(DA
(70)
¯ DD ¯ A) ¯ − b(DDA) ¯ ¯ f = 3n[b( − 2(D A¯ + DA) + bD 2 A¯ + b¯ D¯ 2 A] ¯ + D A) ¯ + 2γ nDDK ¯ 2γ n(DA + 4(b¯ D¯ + bD)K − 4γ K.
(71)
F− =
Here $n = 1 + \bar z z + \bar b b$ and the operators $D$, $\bar D$ are the standard supersymmetric covariant derivatives in two dimensions, i.e.
$$D = \partial_b + b\partial_z, \qquad \bar D = \partial_{\bar b} + \bar b \partial_{\bar z}. \tag{72}$$
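It is perhaps worth giving formulae that express the derivatives $R_i$, $\Gamma$, $D_\pm$, $V_\pm$ in terms of the derivatives $D$, $\bar D$ and $Q$, $\bar Q$, where
$$Q = \partial_b - b\partial_z, \qquad \bar Q = \partial_{\bar b} - \bar b \partial_{\bar z}. \tag{73}$$
Here they are:
$$D_+ = \tfrac{1}{2}(D - \bar z \bar D), \qquad D_- = -\tfrac{1}{2}(\bar D + zD), \tag{74}$$
$$V_+ = \tfrac{1}{2}(Q + \bar z \bar Q), \qquad V_- = \tfrac{1}{2}(\bar Q - zQ), \tag{75}$$
$$\Gamma = \bar b \partial_{\bar b} - b\partial_b, \qquad R_3 = \bar z \partial_{\bar z} - z\partial_z + \tfrac{1}{2}\bar b \partial_{\bar b} - \tfrac{1}{2} b\partial_b, \tag{76}$$
$$R_+ = -\partial_z - \bar z^2 \partial_{\bar z} - \bar z \bar b \partial_{\bar b}, \qquad R_- = \partial_{\bar z} + z^2 \partial_z + z b \partial_b. \tag{77}$$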
Using the formulae (46), (68) and (69)–(71), we obtain the action (64) in terms of $A$, $\bar A$, $K$. The explicit formula is somewhat cumbersome and we do not list it here since it is pathological! The trouble is caused by the superfield $K$ that is contained in the action in the form $(\bar D D K)^2$. This gives the bosonic derivatives of fourth order. Fortunately, there is a manifestly supersymmetric and gauge invariant way for getting rid of the unwanted field $K$. Indeed, a constraint
$$K = 0 \tag{78}$$
simply does the job. Having in mind the noncommutative generalization, it is desirable to formulate this additional constraint in terms of the fields $A_\pm$, $W$. It reads
$$d_+ A_- - d_- A_+ + \tfrac{1}{4}\gamma W = 0. \tag{79}$$
It is not difficult to verify the gauge symmetry of (78) or (79) and the osp(2, 1) supersymmetry of (79). After imposing the additional constraint (79), the action (64) becomes
$$S_\infty = \frac{1}{2\pi i} \int d\bar z \, dz \, d\bar b \, db \, \{\alpha \, \bar D(n\omega) D(n\omega) + \beta n \omega^2\}, \tag{80}$$
where
$$\omega = \bar D A + D \bar A, \qquad n = 1 + \bar z z + \bar b b, \tag{81}$$
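and the parameters $\alpha$, $\beta$ are linear combinations of $\alpha'$, $\beta'$. It is instructive to see how the osp(2, 1) superinvariance is realized in this parametrization of the gauge field multiplet. We have
$$\Delta A = (\epsilon_+ V_+ + \epsilon_- V_-)A + \tfrac{1}{2}\epsilon_- b A, \tag{82}$$
$$\Delta \bar A = (\epsilon_+ V_+ + \epsilon_- V_-)\bar A - \tfrac{1}{2}\epsilon_+ \bar b \bar A, \tag{83}$$
and therefore
$$\Delta\omega = (\epsilon_+ V_+ + \epsilon_- V_-)\omega + \tfrac{1}{2}(\epsilon_- b - \epsilon_+ \bar b)\omega, \tag{84}$$
or, equivalently,
$$\Delta(n\omega) = (\epsilon_+ V_+ + \epsilon_- V_-)(n\omega). \tag{85}$$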
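From the last formula (85), the osp(2, 1) supersymmetry of the action (80) immediately follows. The gauge symmetry $A \to A + iD\Lambda$, $\bar A \to \bar A + i\bar D\Lambda$ is also evident. We are now ready to cast the action (80) in components. Set
$$iA = \zeta + bv + \bar b \, \frac{1}{2}\frac{(w + iu)}{1 + \bar z z} + \bar b b \Bigl( \frac{\eta}{1 + \bar z z} - \partial_z \zeta \Bigr), \tag{86}$$
$$i\bar A = \bar\zeta - b \, \frac{1}{2}\frac{(w - iu)}{1 + \bar z z} + \bar b \bar v + \bar b b \Bigl( \frac{\bar\eta}{1 + \bar z z} - \partial_{\bar z} \bar\zeta \Bigr). \tag{87}$$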
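Here $u$, $w$ are real, $v$ and $\bar v$ mutually complex conjugate and $\eta^\ddagger = \bar\eta$, $\zeta^\ddagger = \bar\zeta$. We calculate
$$i n (\bar D A + D \bar A) = iu + b\eta - \bar b \bar\eta + \bar b b \Bigl[ (1 + \bar z z)(\partial_{\bar z} v - \partial_z \bar v) + \frac{iu}{1 + \bar z z} \Bigr]. \tag{88}$$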
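Thus the fields $\zeta$, $\bar\zeta$ and $w$ have dropped out. It turns out that avoiding a mutual coupling of the fields $u$ and $v$ in the action requires setting $\beta = -2\alpha$ in the action (80). Thus we respect this most natural choice and write finally
$$S_\infty = \frac{-\alpha}{2\pi i} \int d\bar z \, dz \Bigl\{ -(1 + \bar z z)^2 (\partial_{\bar z} v - \partial_z \bar v)^2 + \partial_{\bar z} u \, \partial_z u + \frac{u^2}{(1 + \bar z z)^2} + \eta \partial_{\bar z} \eta + \bar\eta \partial_z \bar\eta + 4\frac{\bar\eta \eta}{(1 + \bar z z)} \Bigr\}. \tag{89}$$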
Note that the action (89) differs from that of the standard free supersymmetric electrodynamics in the flat Euclidean space [17] by the presence of the mass term for the dynamical fields $u$ and $\eta$, $\bar\eta$. The reader may say that it is not surprising that theories on different manifolds have different actions. Note, however, that the matter action (32), rewritten in terms of the superfields $A$, $\bar A$ and a complex matter superfield $\Phi$, is exactly of the same form as that of the matter sector of the supersymmetric Schwinger model in flat two-dimensional space [17]:
$$S_{\rm matter} = \frac{1}{2\pi i} \int d\bar z \, dz \, d\bar b \, db \, [(\bar D - \bar A)\Phi^\ddagger (D + A)\Phi + (\bar D + \bar A)\Phi (D - A)\Phi^\ddagger]. \tag{90}$$
Of course, this coincidence (due to superconformal invariance of the massless superscalar matter in two dimensions) is only formal because, in spite of the same form of the action, the superfields belong to different algebras! The matter superfield $\Phi$ is an element of the algebra of the superfunctions on the supersphere while in the flat case it would be an element of the algebra of superfunctions on the flat Euclidean superspace.
3.2. An invariant description of the complex. The complex constructed in Sect. 2.4 looks somewhat cumbersome and we may wonder whether a description exists which would be more elegant and natural. It turns out that the answer to this question is positive; in fact, we shall present the complex in a way which does not need to choose a basis in the superalgebra sl(2, 1) or a basis in the space of 1-forms. The invariant picture is entirely based on standard structures of super-Lie algebras and it is not only more esthetic but also it is efficient for technical purposes. Indeed, the noncommutative generalization of the commutative theories of the previous subsection requires adding a quadratic term in the definition of the field strength and cubic and quartic terms in the action functional of the field theory. Consulting Formula (45) the reader may easily convince himself that working with the product of forms in the previous non-invariant description would be very messy. However, the invariant picture will enable us readily to formulate the constraint (62) and even to solve it explicitly! What is even more remarkable in the story is that the complex can be constructed for a much more general class of Lie super-algebras than just sl(2, 1). Here are the details:

Definition 1. A super-Poisson algebra $\mathcal{A} = \mathcal{A}_0 \oplus \mathcal{A}_1$ is a $\mathbb{Z}_2$-graded associative algebra over the field of complex numbers $\mathbb{C}$ equipped with the structure of a super-Lie algebra with an even (super-Poisson) bracket $\{\cdot, \cdot\}$ compatible with the associative multiplication $m : \mathcal{A} \otimes \mathcal{A} \to \mathcal{A}$, i.e.
$$\{X, YZ\} = \{X, Y\}Z + (-1)^{XY} Y\{X, Z\}. \tag{91}$$
Moreover, it is required that $\mathcal{A}$ possess an even unit element $e$ such that $eX = Xe = X$ and $\{e, X\} = 0$ for all $X \in \mathcal{A}$. Finally, $\mathcal{A}$ is equipped with a linear supertrace $\mathrm{STr} : \mathcal{A} \to \mathbb{C}$, in particular $\mathrm{STr}(e) = 1$. The supertrace is supposed to vanish for Poisson brackets: $\mathrm{STr}\{X, Y\} = 0$; $X, Y \in \mathcal{A}$ and also for odd elements: $\mathrm{STr}(\mathcal{A}_1) = 0$.

Let us take as an example of $\mathcal{A}$ the algebra $\mathcal{A}_\infty$ of the superfunctions on the supersphere with the super-Poisson bracket given by (3) or (6) and the supertrace given by the integral (7). Another example is the noncommutative algebra of $(2N + 1) \otimes (2N + 1)$-supermatrices, denoted as $\mathcal{A}_N$ in Sect. 2.1. This algebra defines the fuzzy supersphere [4] and it is the noncommutative deformation (or Berezin quantization) of the algebra $\mathcal{A}_\infty$ with the value of the “Planck constant” $h = 1/N$. It should therefore be no surprise that the Lie bracket in $\mathcal{A}_N$ is not just the ordinary commutator inherited from the associative multiplication in $\mathcal{A}_N$ but the commutator multiplied by $N$ (playing the role of the inverse Planck constant):
$$\{X, Y\} = N m(X \otimes Y) - N m(Y \otimes X) = N[X, Y], \qquad X, Y \in \mathcal{A}_N. \tag{92}$$
The definition of the bracket with this normalization is crucial for verifying normalization of all formulae in this paper, in particular the important constraint (113). Note that we denote the super-Lie bracket in $\mathcal{A}_N$ by the same symbol as in $\mathcal{A}_\infty$. The reason is that the former gives the latter in the commutative limit $N \to \infty$. Finally, the supertrace STr in $\mathcal{A}_N$ is nothing but the standard supertrace over $(2N + 1) \otimes (2N + 1)$-supermatrices. It is evident from (13) and (14) that STr is correctly normalized, which means that the STr of the unit supermatrix is equal to 1.

Definition 2. We say that $(\mathcal{A}, \mathcal{G})$ is a supersymmetric double over a super-Poisson algebra $\mathcal{A}$, if $\mathcal{G} = \mathcal{G}_0 \oplus \mathcal{G}_1$ is a super-Lie subalgebra of $\mathcal{A}$ (but not necessarily an associative subalgebra of $\mathcal{A}$!) and the bilinear form $\mathrm{STr} \circ m$ restricted to $\mathcal{G}$ is non-degenerate. In this case the bilinear form $\mathrm{STr} \circ m$ determines an element $C^{\mathcal{G}} \in \mathcal{G} \otimes \mathcal{G}$ called a quadratic Casimir element of the double $(\mathcal{A}, \mathcal{G})$.
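The two properties of the finite-$N$ example used above (unit supertrace of the unit supermatrix, and compatibility (91) of the scaled bracket (92) with the matrix product) can be illustrated by a minimal sketch. It rests on assumptions not stated in this section: the $(2N+1)$-dimensional space is taken to split into an even part of dimension $N + 1$ and an odd part of dimension $N$, the supertrace is the trace of the even-even block minus the trace of the odd-odd block, and only even (block-diagonal) elements are tested, for which the sign $(-1)^{XY}$ in (91) is $+1$. All function names are hypothetical.

```python
from fractions import Fraction


def identity(size):
    """Unit matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]


def block_diag(even_block, odd_block):
    """Even supermatrix: even-even block top-left, odd-odd block bottom-right."""
    p, q = len(even_block), len(odd_block)
    out = [[Fraction(0)] * (p + q) for _ in range(p + q)]
    for i in range(p):
        for j in range(p):
            out[i][j] = Fraction(even_block[i][j])
    for i in range(q):
        for j in range(q):
            out[p + i][p + j] = Fraction(odd_block[i][j])
    return out


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def supertrace(matrix, n_even):
    """Trace over the first n_even diagonal entries minus trace over the rest
    (assumed convention; the paper's own normalization is in its (13), (14))."""
    even_part = sum(matrix[i][i] for i in range(n_even))
    odd_part = sum(matrix[i][i] for i in range(n_even, len(matrix)))
    return even_part - odd_part


def bracket(x, y, N):
    """Scaled commutator {X, Y} = N [X, Y] of (92), for even elements only."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[N * (xy[i][j] - yx[i][j]) for j in range(n)] for i in range(n)]
```

With the assumed $(N+1)\,|\,N$ split the unit supermatrix has supertrace $(N + 1) - N = 1$ for every $N$, and the supertrace of a bracket of two even elements vanishes because each diagonal block contributes the trace of a commutator.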
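Now we construct a canonical complex $\Omega(\mathcal{A}, \mathcal{G})$ over the double $(\mathcal{A}, \mathcal{G})$ as follows:
$$\Omega(\mathcal{A}, \mathcal{G}) = \oplus_{i=0}^{3} \Omega^i(\mathcal{A}, \mathcal{G}), \tag{93}$$
where
$$\begin{aligned}
\Omega^0(\mathcal{A}, \mathcal{G}) &= \Omega^3(\mathcal{A}, \mathcal{G}) = (\mathcal{A}_P)_0 \equiv e \otimes (\mathcal{A}_P)_0, \\
\Omega^1(\mathcal{A}, \mathcal{G}) &= \Omega^2(\mathcal{A}, \mathcal{G}) = (\mathcal{G}_0 \otimes (\mathcal{A}_P)_0) \oplus (\mathcal{G}_1 \otimes (\mathcal{A}_P)_1).
\end{aligned} \tag{94}$$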
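Here the notation $\mathcal{A}_P$ means that one does not consider the algebra $\mathcal{A}$ over the field $\mathbb{C}$ but one replaces $\mathbb{C}$ by a graded commutative algebra $P$. The subscript 0 in $(\mathcal{A}_P)_0$ (1 in $(\mathcal{A}_P)_1$) then means that one considers only a subspace of $\mathcal{A}_P$ which consists of all elements of $\mathcal{A}_P$ even (odd) with respect to the sum of gradings of $\mathcal{A}$ and of $P$.

Note. Considering the algebra $\mathcal{A}$ over $P$ instead of over the field of complex numbers $\mathbb{C}$ is not a useless complication, but it is dictated by the field theoretical applications. For instance, any element $f(z, \bar z, b, \bar b) \in \mathcal{A}_\infty$ can be expanded in the Taylor series in $b$, $\bar b$. The term proportional to $b$ is of the form $\psi(z, \bar z)b$. But $\psi(z, \bar z)$, being a fermion, is not a $\mathbb{C}$-valued function! It should be Grassmann-valued, which means that it should be valued in the odd part $P_1$ of some $\mathbb{Z}_2$-graded commutative algebra $P$. All this is the standard supersymmetric story so we do not give more details here.

In order for $\Omega(\mathcal{A}, \mathcal{G})$ to be a complex we need to introduce a coboundary operator $\delta^{\mathcal{G}} : \Omega^i(\mathcal{A}, \mathcal{G}) \to \Omega^{i+1}(\mathcal{A}, \mathcal{G})$ such that $(\delta^{\mathcal{G}})^2 = 0$ and an associative product $*_{\mathcal{G}} : \Omega^i(\mathcal{A}, \mathcal{G}) \otimes \Omega^j(\mathcal{A}, \mathcal{G}) \to \Omega^{i+j}(\mathcal{A}, \mathcal{G})$ compatible with $\delta^{\mathcal{G}}$. By compatibility we mean, of course, the Leibniz rule
$$\delta^{\mathcal{G}}(X^i *_{\mathcal{G}} Y^j) = \delta^{\mathcal{G}} X^i *_{\mathcal{G}} Y^j + (-1)^i X^i *_{\mathcal{G}} \delta^{\mathcal{G}} Y^j, \qquad X^i \in \Omega^i(\mathcal{A}, \mathcal{G}), \ Y^j \in \Omega^j(\mathcal{A}, \mathcal{G}). \tag{95}$$
In order to give a simple description of $\delta^{\mathcal{G}}$ and $*_{\mathcal{G}}$, we note that $\mathcal{A} \otimes \mathcal{A}$ naturally acts on itself in one of four ways: $m \otimes m$, $m \otimes \mathrm{ad}$, $\mathrm{ad} \otimes m$ and $\mathrm{ad} \otimes \mathrm{ad}$; e.g. for $X, Y \in \mathcal{A} \otimes \mathcal{A}$ we have
$$X_{m \otimes \mathrm{ad}}(Y) \equiv (-1)^{Y^{(1)} X^{(2)}} X^{(1)} Y^{(1)} \otimes \{X^{(2)}, Y^{(2)}\}, \tag{96}$$
where $X = X^{(1)} \otimes X^{(2)}$ and $Y = Y^{(1)} \otimes Y^{(2)}$. Now we define $\delta^{\mathcal{G}}$ as follows:
$$\delta^{\mathcal{G}} X^0 = C^{\mathcal{G}}_{m \otimes \mathrm{ad}} X^0, \tag{97}$$
$$\delta^{\mathcal{G}} X^1 = C^{\mathcal{G}}_{\mathrm{ad} \otimes \mathrm{ad}} X^1 + \tfrac{1}{2} d_{\mathcal{G}} X^1, \tag{98}$$
$$\delta^{\mathcal{G}} X^2 = \mathrm{ad} X^2 \equiv \{X^{2(1)}, X^{2(2)}\}, \tag{99}$$
$$\delta^{\mathcal{G}} X^3 = 0, \tag{100}$$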
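where $X^i \in \Omega^i(\mathcal{A}, \mathcal{G})$ and $d_{\mathcal{G}}$ is a “Dynkin” number which can be defined by the relation
$$\mathrm{STr}(XY) = \frac{1}{d_{\mathcal{G}}} \mathrm{sTr}(\mathrm{ad}X \, \mathrm{ad}Y). \tag{101}$$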
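Here sTr is the supertrace over supermatrices in the adjoint representation of the Lie superalgebra $\mathcal{G}$. The multiplication $*_{\mathcal{G}}$ is given by the following table:
$$\begin{array}{c|cccc}
X^i *_{\mathcal{G}} Y^j & Y^0 & Y^1 & Y^2 & Y^3 \\ \hline
X^0 & m \otimes m & m \otimes m & m \otimes m & m \otimes m \\
X^1 & m \otimes m & \mathrm{ad} \otimes m & (\mathrm{STr} \otimes \mathrm{Id})(m \otimes m) & 0 \\
X^2 & m \otimes m & (\mathrm{STr} \otimes \mathrm{Id})(m \otimes m) & 0 & 0 \\
X^3 & m \otimes m & 0 & 0 & 0
\end{array} \tag{102}$$
For example,
$$X^1 *_{\mathcal{G}} Y^1 = X^1_{\mathrm{ad} \otimes m}(Y^1), \tag{103}$$
or
$$X^1 *_{\mathcal{G}} Y^2 = (-1)^{Y^{(1)} X^{(2)}} X^{1(2)} Y^{2(2)} \, \mathrm{STr}(X^{1(1)} Y^{2(1)}). \tag{104}$$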
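As a reading aid for table (102), the following minimal sketch encodes the table as a lookup and makes its degree rule explicit: a product of an $i$-form with a $j$-form is nonzero only when $i + j \le 3$, and then it is an $(i + j)$-form. The string labels and the names `PRODUCT_RULE` and `product_degree` are illustrative assumptions; the sketch records which map is used in each slot, not the maps themselves.

```python
# Entry (i, j) names the map from table (102) used for X^i * Y^j;
# None marks the products that the table sets to zero.
M_M = "m(x)m"
AD_M = "ad(x)m"
STR_M = "(STr(x)Id)(m(x)m)"

PRODUCT_RULE = {
    (0, 0): M_M, (0, 1): M_M,   (0, 2): M_M,   (0, 3): M_M,
    (1, 0): M_M, (1, 1): AD_M,  (1, 2): STR_M, (1, 3): None,
    (2, 0): M_M, (2, 1): STR_M, (2, 2): None,  (2, 3): None,
    (3, 0): M_M, (3, 1): None,  (3, 2): None,  (3, 3): None,
}


def product_degree(i, j):
    """Degree of X^i * Y^j, or None if the table sets the product to zero."""
    if PRODUCT_RULE[(i, j)] is None:
        return None
    return i + j
```

Looping over all sixteen slots confirms that the zeros of the table are exactly the slots with $i + j > 3$, consistent with the complex having forms only in degrees 0 to 3.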
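Definition 3. We say that $(\mathcal{A}, \mathcal{G}, \mathcal{H})$ is a supersymmetric triple, if a subspace $\mathcal{H}$ of $\mathcal{A}$ exists such that
1) $\mathcal{H}$ is a super-Lie subalgebra of $\mathcal{G}$;
2) $(\mathcal{A}, \mathcal{H})$ is the supersymmetric double with the Casimir element $C^{\mathcal{H}} \in \mathcal{H} \otimes \mathcal{H}$, coboundary $\delta^{\mathcal{H}}$ and product $*_{\mathcal{H}}$;
3) An element $C \equiv C^{\mathcal{G}} - C^{\mathcal{H}}$ fulfills $m(C) \in \mathbb{C}e$;
4) $\mathrm{ad}(\mathcal{H}^\perp \otimes \mathcal{H}^\perp) \subset \mathcal{H}$, where $\mathcal{H}^\perp$ is an orthogonal complement of $\mathcal{H}$ in $\mathcal{G}$ with respect to $\mathrm{STr} \circ m$.

Now we construct a canonical complex $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ over the triple $(\mathcal{A}, \mathcal{G}, \mathcal{H})$ as follows:
$$\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H}) = \oplus_{i=0}^{3} \Omega^i(\mathcal{A}, \mathcal{G}, \mathcal{H}), \tag{105}$$
where
$$\begin{aligned}
\Omega^0(\mathcal{A}, \mathcal{G}, \mathcal{H}) &= \Omega^3(\mathcal{A}, \mathcal{G}, \mathcal{H}) = (\mathcal{A}_P)_0 \equiv e \otimes (\mathcal{A}_P)_0, \\
\Omega^1(\mathcal{A}, \mathcal{G}, \mathcal{H}) &= \Omega^2(\mathcal{A}, \mathcal{G}, \mathcal{H}) = (\mathcal{G}_0 \otimes (\mathcal{A}_P)_0) \oplus (\mathcal{G}_1 \otimes (\mathcal{A}_P)_1).
\end{aligned} \tag{106}$$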
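We define the exterior derivative $\delta$ on $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ as follows:
$$\delta X^0 = \delta^{\mathcal{G}} X^0, \qquad \delta X^2 = \delta^{\mathcal{G}} X^2, \qquad \delta X^3 = \delta^{\mathcal{G}} X^3, \tag{107}$$
$$\delta X^1 = \delta^{\mathcal{G}} X^1 - \delta^{\mathcal{H}} X^1_{\mathcal{H}}, \tag{108}$$
where $X^i \in \Omega^i(\mathcal{A}, \mathcal{G}, \mathcal{H})$ and $X^1_{\mathcal{H}}$ means the orthogonal projection of $X^1$ from $\mathcal{G} \otimes \mathcal{A}$ into $\mathcal{H} \otimes \mathcal{A}$. A product $*$ in $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ is defined by the same table as the product $*_{\mathcal{G}}$ in $\Omega(\mathcal{A}, \mathcal{G})$, except for the multiplication of 1-forms where we have
$$X^1 * Y^1 = X^1 *_{\mathcal{G}} Y^1 - X^1_{\mathcal{H}} *_{\mathcal{H}} Y^1_{\mathcal{H}}. \tag{109}$$
It is a straightforward exercise to check that the product $*$ and the coboundary $\delta$ verify the Leibniz rule.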
3.3. Noncommutative supersymmetric electrodynamics. Suppose that the super-Poisson algebra $\mathcal{A}$ is such that its Lie bracket $\{\cdot, \cdot\}$ is derived from its associative multiplication, i.e. as an $N$-multiple of its commutator like in (92). Then a noncommutative pure supersymmetric electrodynamics over $\mathcal{A}$ is a theory of 1-forms in the triple complex $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ defined by an action
$$S = \frac{1}{g^2} \mathrm{STr}\bigl[\alpha' \triangleright F * F + \beta' (V * \delta V + \tfrac{2}{3} V * V * V)\bigr], \tag{110}$$
where
$$F \equiv \delta V + V * V \tag{111}$$
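is the field strength of $V$, $\alpha'$, $\beta'$ two real parameters (cf. (64)), $g$ a coupling constant and the Hodge triangle $\triangleright$ is the identity map between $\Omega^1(\mathcal{A}, \mathcal{G}, \mathcal{H})$ and $\Omega^2(\mathcal{A}, \mathcal{G}, \mathcal{H})$. Moreover, $V$ is considered to be a real 1-form $V^\ddagger = V$ subject to two constraints
$$(\delta V + V * V)_{\mathcal{H}} = 0 \tag{112}$$
and
$$C * V_{\mathcal{H}^\perp} + V_{\mathcal{H}^\perp} * C + \frac{1}{N} \triangleright V_{\mathcal{H}^\perp} * V_{\mathcal{H}^\perp} = 0. \tag{113}$$
Here $V_{\mathcal{H}^\perp}$ means the orthogonal projection of $V$ to $\mathcal{H}^\perp$ and $C$ is viewed as a 2-form. The graded star $\ddagger$ on the complex is defined by means of the graded star $\ddagger$ on the algebra $\mathcal{A}$. For example,
$$(X^1)^\ddagger = (X^{1(1)})^\ddagger \otimes (X^{1(2)})^\ddagger, \qquad X^1 \in \Omega^1(\mathcal{A}, \mathcal{G}, \mathcal{H}). \tag{114}$$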
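The action $S$ and both constraints are invariant with respect to

1) gauge transformations
$$V \to U V U^{-1} - (\delta U) U^{-1}, \qquad U^{-1} = U^\ddagger; \tag{115}$$
2) $\mathcal{H}$-supersymmetry$^1$, where $h \in \mathcal{H}$ acts on $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ as follows:
$$h(X^{0,3}) = \mathrm{ad}(h \otimes X^{0,3}), \tag{116}$$
$$h(X^{1,2}) = (e \otimes h)_{m \otimes \mathrm{ad}} X^{1,2} + (h \otimes e)_{\mathrm{ad} \otimes m} X^{1,2}. \tag{117}$$
The constraint $(\delta V + V * V)_{\mathcal{H}} = 0$ can be solved explicitly, thanks to assumption 4) in the definition of the supersymmetric triple $(\mathcal{A}, \mathcal{G}, \mathcal{H})$ (assumption 3) is needed for the gauge invariance of the constraint (113)). The solution reads
$$V_{\mathcal{H}} = \frac{2}{d_{\mathcal{H}} - d_{\mathcal{G}}} (C_{\mathrm{ad} \otimes \mathrm{ad}} V_{\mathcal{H}^\perp} + V_{\mathcal{H}^\perp} * V_{\mathcal{H}^\perp}). \tag{118}$$
Thus we insert $V_{\mathcal{H}}$ from (118) into (110) and we obtain a theory containing only 1-forms $V_{\mathcal{H}^\perp}$.

$^1$ We mean here the real form of the $\mathcal{H}$-superalgebra which respects the reality of the 1-form $V$, cf. (54).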
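An interaction with matter can also be expressed in terms of the complex $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$. Let $\Phi$ be a complex 0-form, then
$$S_{\rm matter} = \mathrm{STr}\bigl[((\delta^{\mathcal{G}} - \delta^{\mathcal{H}} + V_{\mathcal{H}^\perp})\Phi)^\ddagger * \triangleright(\delta^{\mathcal{G}} - \delta^{\mathcal{H}} + V_{\mathcal{H}^\perp})\Phi\bigr]. \tag{119}$$
If we add $S_{\rm matter}$ to $S$ in (110) we obtain the $\mathcal{H}$-supersymmetric Schwinger model over the Poisson algebra $\mathcal{A}$. For the commutative, resp. noncommutative, fuzzy supersphere ($\mathcal{A} = \mathcal{A}_\infty$, resp. $\mathcal{A} = \mathcal{A}_N$) and the superalgebras $\mathcal{H} = $ osp(2, 1) and $\mathcal{G} = $ sl(2, 1), the complex $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ is isomorphic to the complex $\Xi_\infty$ resp. $\Xi_N$ of Sect. 2.4. We present a few formulae connecting the two presentations of the same complex. First of all the algebras $\mathcal{G}$ and $\mathcal{H}$ are generated by the Hamiltonians (26) and (27) in both commutative and noncommutative cases. In what follows, the parameter $N$ will stand for either a finite integer (the noncommutative case) or $\infty$ (the commutative one). Thus
$$V_{\mathcal{H}^\perp} = +2d_- \otimes A_+ - 2d_+ \otimes A_- - \tfrac{1}{2}\gamma \otimes W \equiv (A_+, A_-, W, 0, 0, 0, 0, 0), \tag{120}$$
$$F = +2d_- \otimes F_+ - 2d_+ \otimes F_- - \tfrac{1}{2}\gamma \otimes f \equiv (F_+, F_-, f, 0, 0, 0, 0, 0), \tag{121}$$
$$C = +2d_- \otimes d_+ - 2d_+ \otimes d_- - \tfrac{1}{2}\gamma \otimes \gamma, \tag{122}$$
$$d_{\mathcal{G}} = \frac{4}{1 + 1/N}, \qquad d_{\mathcal{H}} = \frac{6}{1 + 1/N}. \tag{123}$$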
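A small arithmetic check connects (123) with the solution (118). With the values of $d_{\mathcal{G}}$ and $d_{\mathcal{H}}$ quoted above, the prefactor $2/(d_{\mathcal{H}} - d_{\mathcal{G}})$ in (118) equals $1 + 1/N$, which tends to 1 in the commutative limit. The sketch below only evaluates these two formulas with exact rationals; the function names are hypothetical and nothing beyond (118) and (123) is assumed.

```python
from fractions import Fraction


def dynkin_numbers(N):
    """d_G and d_H of (123) for a finite positive integer N."""
    scale = 1 + Fraction(1, N)
    d_G = Fraction(4) / scale
    d_H = Fraction(6) / scale
    return d_G, d_H


def prefactor_in_118(N):
    """The coefficient 2 / (d_H - d_G) multiplying the bracket in (118)."""
    d_G, d_H = dynkin_numbers(N)
    return 2 / (d_H - d_G)
```

For example, `prefactor_in_118(1)` evaluates to `Fraction(2, 1)`, and the ratio $d_{\mathcal{H}}/d_{\mathcal{G}} = 3/2$ is independent of $N$.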
Note also that $C^\ddagger = C$ and the condition $V^\ddagger_{\mathcal{H}^\perp} = V_{\mathcal{H}^\perp}$ does indeed reproduce the reality conditions (55) and (56). It is trivial to check that the conditions 3) and 4) of Definition 3 are indeed fulfilled. For this, one uses the relations (26)–(27), (122) and (16)–(22).

The main message of this paper is that the Schwinger model formulated in this section for finite $N$ becomes in the limit $N \to \infty$ the standard commutative supersymmetric theory of Sect. 3.1. In particular, the actions (110) and (119) become (64) and (90) respectively and the constraints (112) and (113) become (62) and (79) in this limit. Moreover, the gauge transformation (115) becomes the gauge transformation (34) and (35) and the supersymmetry transformations (116) and (117) become the transformations (53) and (57). It is in this sense that we consider the theory for finite $N$ as the nonperturbative regularization of the standard commutative theory.

Note the presence of expressions like $\delta V + V * V$ in our theory, which are characteristic of non-abelian gauge theories. They appear because of the noncommutativity of the fuzzy sphere, but in the commutative limit the terms like $V * V$ disappear and we are left with the abelian theory. In fact, the coboundary $\delta$ commutes only with a one-parameter subgroup of the gauge group, consisting of the elements of the form $U = \exp(i\alpha)e$, where $e$ is the unit element of $\mathcal{A}$. This is another sign that we deal with the noncommutative deformation of a $U(1)$ gauge theory. The reader might appreciate also the economy of using the invariant formulation for writing the action functionals. In fact, the already cumbersome formulae of Sect. 3.1, which are written in noninvariant language, would be even more cumbersome in the noncommutative case due to the presence of the $V * V$ terms.
4. Conclusions and Outlook

We have constructed the supersymmetric Schwinger model on the noncommutative sphere. The theory possesses only a finite number of degrees of freedom, nevertheless it is manifestly supersymmetric and supergauge invariant. The basic structural ingredient of the construction of the model is the complex $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ for $\mathcal{G} = $ sl(2, 1) and $\mathcal{H} = $ osp(2, 1). It is remarkable that the complex can be constructed for a large class of superalgebras, hence we expect that supergauge theories could be constructed in a similar fashion for higher dimensional projective spaces. It is also not difficult to suggest a generalization to more general gauge groups than $U(1)$; however, due to the amount of work necessary for doing it we prefer to postpone it for a later publication. Note finally that 1-forms in $\Omega(\mathcal{A}, \mathcal{G})$ can be interpreted as 1-cochains in the Lie superalgebra cohomology but 1-forms in $\Omega(\mathcal{A}, \mathcal{G}, \mathcal{H})$ are not relative cochains modulo $\mathcal{H}$. This suggests that an interesting variant of the Lie superalgebra cohomology of $\mathcal{G}$ with respect to $\mathcal{H}$ may exist.

References
1. Douglas, M.: Superstring dualities, Dirichlet branes and the small scale structure of space. In: Quantum symmetries, Les Houches, Session LXIV, Eds. A. Connes, K. Gawędzki and J. Zinn-Justin, Elsevier Science B.V., 1998, p. 519
2. Connes, A.: Noncommutative geometry. London: Academic Press, 1994
3. Connes, A. and Kreimer, D.: Commun. Math. Phys. 199, 203 (1998)
4. Grosse, H., Klimčík, C. and Prešnajder, P.: Commun. Math. Phys. 185, 155 (1997)
5. Grosse, H., Klimčík, C. and Prešnajder, P.: Commun. Math. Phys. 178, 507 (1996)
6. Madore, J.: Class. Quant. Grav. 14, 3003 (1997)
7. Grosse, H. and Madore, J.: Phys. Lett. B283, 218 (1992)
8. Klimčík, C.: Commun. Math. Phys. 199, 257 (1998)
9. Carow-Watamura, U. and Watamura, S.: Noncommutative geometry and gauge theory on fuzzy sphere. hep-th/9801195
10. Baez, S., Balachandran, A.P., Idri, B. and Vaidya, S.: Monopoles and solitons in fuzzy physics. hep-th/9811169
11. Grosse, H. and Prešnajder, P.: A treatment of the Schwinger model within noncommutative geometry. hep-th/9805085
12. Berezin, F.: Commun. Math. Phys. 40, 153 (1975)
13. Hoppe, J.: MIT PhD thesis, 1982 and Elem. Part. Res. J. (Kyoto) 80, 145 (1989)
14. Madore, J.: J. Math. Phys. 32, 332 (1991) and Class. Quant. Grav. 9, 69 (1992)
15. Wess, J. and Bagger, J.: Supersymmetry and supergravity. Princeton, NJ: Princeton University Press, 1983
16. Scheunert, M., Nahm, W. and Rittenberg, V.: J. Math. Phys. 18, 155 (1977)
17. Ferrara, S.: Lett. Nuov. Cim. 13, 629 (1975)

Communicated by A. Connes
Commun. Math. Phys. 206, 587 – 601 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
An Extended Fuzzy Supersphere and Twisted Chiral Superfields
C. Klimčík
Institute de Mathématiques de Luminy, 163, Avenue de Luminy, 13288 Marseille, France
Received: 29 March 1999 / Accepted: 8 April 1999
Abstract: A noncommutative associative algebra of the N = 2 fuzzy supersphere is introduced. It turns out to possess a nontrivial automorphism which relates twisted chiral to twisted anti-chiral superfields and hence makes it possible to construct noncommutative nonlinear σ-models with extended supersymmetry.

1. Introduction

Supersymmetric nonlinear σ-models with N = 2 supersymmetry in two dimensions are important objects in modern mathematical physics. They possess a very rich structure interesting by itself and find also applications, for instance, in superstring theory. It is a well-known fact that the models with N = 1 supersymmetry can be constructed for an arbitrary geometry of the target space. However, the N = 2 case requires the target space to be Kähler [1] if we consider the case without torsion. A very convenient description of the N = 2 σ-models exists based on the N = 2 superspace. In this paper, we shall show that the N = 2 superspace can be constructed also on the noncommutative sphere. More precisely, we shall construct a noncommutative N = 2 supersphere. Note that the notation N = 1 or N = 2 refers usually to the Poincaré-like superalgebras in which the anticommutators of the supercharges are the generators of translations of the underlying bosonic space. We shall see soon, however, that due to the fact that the two-sphere is conformally flat we can keep this terminology also for spherical worldsheets.

Noncommutative geometry [2] is the generalization of the ordinary geometry in which an algebra of functions which encodes the geometry of an ordinary space is replaced by a certain noncommutative algebra. As an example we take a noncommutative (or fuzzy) sphere which is an object introduced by several researchers in the past [3–6] with various motivations. Berezin himself has quantized the standard round symplectic structure on the two-sphere and he found that this can be done only for integer values of the inverse Planck constant.
For example, if $h = 1/n$, the quantized algebra of observables (= the fuzzy sphere) coincides simply with the algebra of $(n + 1) \times (n + 1)$ matrices. When $n \to \infty$ the size of the algebra approaches infinity; in fact, one recovers the standard algebra of functions on the commutative sphere. Effectively, the quantization cuts off the large angular momenta. This fact independently led several authors [4–6] to use the fuzzy sphere as a regularization of field theories formulated on the ordinary sphere. It turns out that this regularization has an important advantage of preserving the standard SO(3) invariance of the ordinary sphere. This is a quite remarkable fact because the regulated theory contains only a finite number of degrees of freedom and, even more importantly, the regulated sphere continues to be a geometric object so it makes sense to formulate theories non-perturbatively directly on it.

The list of the virtues of the fuzzy regularization is not exhausted by the SO(3) invariance and the finite number of degrees of freedom. In fact, one can introduce fuzzy monopole configurations [7,8] and, perhaps even more remarkably, regulate supersymmetric [6] and supersymmetric gauge theories [10] while manifestly preserving supersymmetry, supergauge symmetry and the finite number of degrees of freedom. It is indeed the purpose of this paper to show that models with extended supersymmetry are also regularisable by the method. In Sect. 2, we introduce the extended N = 2 supersphere and its noncommutative deformation. Moreover, we shall identify a nontrivial automorphism of the structure which will prove very useful in constructing N = 2 theories. Section 3 presents the construction of the commutative and noncommutative N = 2 supersymmetric nonlinear σ-models on the sphere.

2. N = 2 Fuzzy Supersphere

An N = 1 fuzzy supersphere has been constructed in [6] with a goal to regularize N = 1 supersymmetric nonlinear σ-models. The reader may find an alternative more concise description of the structure in [10].
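In the N = 2 case, the construction begins in a similar way as in the N = 1 one but there is a point of departure in which new (and welcome) structural ingredients enter. Here are the details. Consider the algebra of polynomial functions on the complex $\mathbb{C}^{2,2}$ superplane, i.e. the algebra generated by finite sums of monomials in bosonic variables $\bar\chi^\alpha$, $\chi^\alpha$, $\alpha = 1, 2$ and in fermionic ones $\bar a^\alpha$, $a^\alpha$, $\alpha = 1, 2$. The algebra is equipped with the super-Poisson bracket
$$\{f, g\} = \partial_{\chi^\alpha} f \, \partial_{\bar\chi^\alpha} g - \partial_{\bar\chi^\alpha} f \, \partial_{\chi^\alpha} g + (-1)^{f+1} [\partial_{a^\alpha} f \, \partial_{\bar a^\alpha} g + \partial_{\bar a^\alpha} f \, \partial_{a^\alpha} g], \tag{1}$$
and with the graded involution [11]
$$(\chi^\alpha)^\ddagger = \bar\chi^\alpha, \quad (\bar\chi^\alpha)^\ddagger = \chi^\alpha, \quad (a^1)^\ddagger = \bar a^1, \quad (a^2)^\ddagger = -\bar a^2, \quad (\bar a^1)^\ddagger = -a^1, \quad (\bar a^2)^\ddagger = a^2, \tag{2}$$
satisfying the following properties:
$$(AB)^\ddagger = (-1)^{AB} B^\ddagger A^\ddagger, \qquad (A^\ddagger)^\ddagger = (-1)^{A} A. \tag{3}$$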
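The second property in (3) can be checked directly on the generators from the signs listed in (2). The sketch below is only a bookkeeping aid under stated assumptions: each generator is represented by a hypothetical string name, the involution is stored as a (sign, image) pair, and since all signs are real, any antilinearity of the involution plays no role in this check.

```python
# Graded involution (2) on the generators: name -> (sign, image name).
INVOLUTION = {
    "chi1": (1, "chibar1"), "chibar1": (1, "chi1"),
    "chi2": (1, "chibar2"), "chibar2": (1, "chi2"),
    "a1": (1, "abar1"),     "abar1": (-1, "a1"),
    "a2": (-1, "abar2"),    "abar2": (1, "a2"),
}

# Parity of each generator: 0 for the bosonic chi's, 1 for the fermionic a's.
PARITY = {name: (0 if name.startswith("chi") else 1) for name in INVOLUTION}


def apply_twice(name):
    """Return (sign, generator) obtained by applying the involution twice."""
    sign_1, image = INVOLUTION[name]
    sign_2, back = INVOLUTION[image]
    return sign_1 * sign_2, back
```

For instance, `apply_twice("a1")` gives `(-1, "a1")`, in agreement with $(A^\ddagger)^\ddagger = (-1)^A A$ for an odd generator, while the bosonic generators return to themselves with sign $+1$.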
We can now apply the (super)symplectic reduction with respect to a moment map $\bar\chi^\alpha \chi^\alpha + \bar a^\alpha a^\alpha - 1$. The result is a smaller algebra $\mathcal{A}_\infty$, that by definition consists of all functions $f$ with the property
$$\{f, \bar\chi^\alpha \chi^\alpha + \bar a^\alpha a^\alpha - 1\} = 0. \tag{4}$$
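Moreover, two functions obeying (4) are considered to be equivalent if they differ just by a product of $(\bar\chi^\alpha \chi^\alpha + \bar a^\alpha a^\alpha - 1)$ with some other such function. The smaller algebra $\mathcal{A}_\infty$ (the reason for using the subscript $\infty$ will become clear soon) will be referred to as the algebra of superfunctions on an N = 2 supersphere$^1$. It will be sometimes more convenient to work with a different parametrization of $\mathcal{A}_\infty$, using the following coordinates:
$$z = \frac{\chi^1}{\chi^2}, \qquad \bar z = \frac{\bar\chi^1}{\bar\chi^2}, \qquad b^\alpha = \frac{a^\alpha}{\chi^2}, \qquad \bar b^\alpha = \frac{\bar a^\alpha}{\bar\chi^2}. \tag{5}$$
The Poisson bracket (1) then becomes
$$\begin{aligned}
\{f, g\} = (1 + \bar z z + \bar b^\alpha b^\alpha) \bigl[ &(1 + \bar z z)(\partial_z f \, \partial_{\bar z} g - \partial_{\bar z} f \, \partial_z g) \\
&+ \bar b^\beta z \bigl((-1)^f \partial_z f \, \partial_{\bar b^\beta} g - \partial_{\bar b^\beta} f \, \partial_z g\bigr) + b^\beta \bar z \bigl(\partial_{b^\beta} f \, \partial_{\bar z} g - (-1)^f \partial_{\bar z} f \, \partial_{b^\beta} g\bigr) \\
&+ (-1)^{(f+1)} (-\bar b^\beta b^\gamma + \delta^{\beta\gamma})(\partial_{b^\gamma} f \, \partial_{\bar b^\beta} g + \partial_{\bar b^\beta} f \, \partial_{b^\gamma} g) \bigr].
\end{aligned} \tag{6}$$
A natural Berezin integral on $\mathcal{A}_\infty$ can be written as
$$I(f) = \frac{1}{(2\pi i)^2} \int d\bar\chi^1 \wedge d\chi^1 \wedge d\bar\chi^2 \wedge d\chi^2 \wedge d\bar a^1 \wedge da^1 \wedge d\bar a^2 \wedge da^2 \, \delta(\bar\chi^\alpha \chi^\alpha + \bar a^\alpha a^\alpha - 1) f. \tag{7}$$
It can be rewritten as
$$I(f) \equiv \frac{1}{2\pi i} \int d\bar z \wedge dz \wedge d\bar b^1 \wedge db^1 \wedge d\bar b^2 \wedge db^2 \, f. \tag{8}$$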
(Note $I(1) = 0$.) Strictly speaking, the generators $\bar z$, $z$, $\bar b^\alpha$, $b^\alpha$ are not elements of the algebra $\mathcal{A}_\infty$. What is true is that $\mathcal{A}_\infty$ is (finitely) linearly generated by the functions of the following form:
$$\frac{\bar z^{\bar k} z^k (\bar b^1)^{\bar l^1} (b^1)^{l^1} (\bar b^2)^{\bar l^2} (b^2)^{l^2}}{(1 + \bar z z + \bar b^\alpha b^\alpha)^m}, \qquad \bar k + \bar l^1 + \bar l^2, \ k + l^1 + l^2 \le m, \tag{9}$$
where $k$, $\bar k$, $l^\alpha$, $\bar l^\alpha$, $m$ are non-negative integers. It is not difficult to understand the form (9) of the elements of $\mathcal{A}_\infty$. Indeed, we first note that $\bar z$, $z$, $\bar b^\alpha$, $b^\alpha$ can also be interpreted as local chart coordinates of the N = 2 supersphere obtained by the stereographic projection from the north pole. If we do the projection from the south pole, we obtain a complementary chart with local coordinates $\bar w$, $w$, $\bar b_w^\alpha$, $b_w^\alpha$. A transition rule on the overlap of the two charts reads
$$w = 1/z, \qquad \bar w = 1/\bar z, \qquad b_w^\alpha = b^\alpha / z, \qquad \bar b_w^\alpha = \bar b^\alpha / \bar z. \tag{10}$$
It is now a simple matter to check that the functions of the form (9) will transform into ¯ ¯1 −l¯2
w¯ m−k−l
¯1
¯2
1 )l (b1 )l (b¯ 2 )l (b2 )l w m−k−l −l (b¯w w w w . α bα )m (1 + ww ¯ + b¯w w 1
2
1
2
(11)
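The check leading to (11) can be sketched in three lines, using only the transition rule (10). The shorthand $K$ and $\bar K$ for the total degrees is introduced here for convenience and is not notation from the text.

```latex
% Shorthand (an assumption of this sketch): K = k + l^1 + l^2, \bar K = \bar k + \bar l^1 + \bar l^2.
\begin{align*}
z &= \frac{1}{w}, \qquad \bar z = \frac{1}{\bar w}, \qquad
b^\alpha = \frac{b_w^\alpha}{w}, \qquad \bar b^\alpha = \frac{\bar b_w^\alpha}{\bar w}, \\
\bar z^{\bar k} z^{k} (\bar b^1)^{\bar l^1} (b^1)^{l^1} (\bar b^2)^{\bar l^2} (b^2)^{l^2}
&= \bar w^{-\bar K}\, w^{-K}\, (\bar b_w^1)^{\bar l^1} (b_w^1)^{l^1} (\bar b_w^2)^{\bar l^2} (b_w^2)^{l^2}, \\
(1 + \bar z z + \bar b^\alpha b^\alpha)^m
&= (\bar w w)^{-m}\, (1 + \bar w w + \bar b_w^\alpha b_w^\alpha)^m .
\end{align*}
```

Dividing the second line by the third moves the factor $(\bar w w)^{m}$ into the numerator, which gives the exponents $m - \bar K$ and $m - K$ appearing in (11). Since $w$ and $\bar w$ are even, no sign arises from reordering.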
1 Note that in case we did not consider the fermionic variables $\bar a^\alpha, a^\alpha$, we would obtain, as the result of the symplectic reduction, the algebra of functions on the standard bosonic sphere. In the case of considering only one pair of fermionic variables $\bar a, a$ we would obtain the $N = 1$ supersphere.
C. Klimčík
Since $0 \le m - \bar k - \bar l^1 - \bar l^2 \le m$ and $0 \le m - k - l^1 - l^2 \le m$ for $\bar k + \bar l^1 + \bar l^2,\ k + l^1 + l^2 \le m$; $k, \bar k, l^\alpha, \bar l^\alpha, m \ge 0$, we see that the elements of $A_\infty$ are form-invariant with respect to the coordinate transformation (10).

The reason to use the coordinates $\bar z, z, \bar b^\alpha, b^\alpha$ is simple: they will enable us to establish a connection between standard $N = 2$ supersymmetric nonlinear σ-models defined on the flat Euclidean space and their counterparts on the $N = 2$ supersphere. In fact, we shall see that the flat models in the coordinates $\bar z, z, \bar b^\alpha, b^\alpha$ and the spherical models in the same coordinates have the same field theoretical action! They differ, however, in the sense that the algebras of the superfields in both cases are different. In the flat case the superfield is an element of the algebra of superfunctions on the Euclidean $N = 2$ superspace while in the spherical case the superfield is an element of $A_\infty$.

Let us now introduce a Lie superalgebra $T$ which will turn out to contain all relevant structure of the $N = 2$ nonlinear σ-models on the sphere. It has seven even generators $R_\pm, R_3, Z_\pm, Z_3, C$ and eight odd ones $C_\pm, C_\pm^\ddagger, C_\pm^\circ, C_\pm^{\ddagger\circ}$. We denote the corresponding Hamiltonians by small characters; they are given by
$$ r_+ = \bar\chi^1\chi^2, \qquad r_- = \bar\chi^2\chi^1, \qquad r_3 = \tfrac12(\bar\chi^1\chi^1 - \bar\chi^2\chi^2), \tag{12} $$
$$ z_+ = \bar a^1 a^2, \qquad z_- = \bar a^2 a^1, \qquad z_3 = \tfrac12(\bar a^1 a^1 - \bar a^2 a^2), \tag{13} $$
$$ c = \bar\chi^1\chi^1 + \bar\chi^2\chi^2 + \bar a^1 a^1 + \bar a^2 a^2, \tag{14} $$
$$ c_+ = \bar a^2\chi^1 + \bar\chi^2 a^1, \qquad c_- = -\bar a^2\chi^2 + \bar\chi^1 a^1, \tag{15} $$
$$ c_+^\ddagger = \bar a^1\chi^2 + \bar\chi^1 a^2, \qquad c_-^\ddagger = \bar a^1\chi^1 - \bar\chi^2 a^2, \tag{16} $$
$$ c_+^\circ = \bar a^1\chi^1 + \bar\chi^2 a^2, \qquad c_-^\circ = -\bar a^1\chi^2 + \bar\chi^1 a^2, \tag{17} $$
$$ c_+^{\ddagger\circ} = \bar a^2\chi^2 + \bar\chi^1 a^1, \qquad c_-^{\ddagger\circ} = \bar a^2\chi^1 - \bar\chi^2 a^1. \tag{18} $$
We should remember that these Hamiltonians are actually preimages of the true Hamiltonians in the process of the symplectic reduction. Since they anyway commute with the moment map $(c - 1)$ it is possible and technically preferable to work with them. A reader who wishes to work directly with expressions in terms of $\bar z, z, \bar b^\alpha, b^\alpha$ coordinates can simply use the equations (12)–(18) and the following relation:
$$ \frac{1}{\bar\chi^2\chi^2} = 1 + \bar z z + \bar b^1 b^1 + \bar b^2 b^2. \tag{19} $$
One obtains, for example,
$$ c_+ = \frac{z\bar b^2 + b^1}{1 + \bar z z + \bar b^1 b^1 + \bar b^2 b^2}, \qquad c_- = \frac{\bar z b^1 - \bar b^2}{1 + \bar z z + \bar b^1 b^1 + \bar b^2 b^2}, \tag{20} $$
and so on for all Hamiltonians (12)–(18). In particular, the Hamiltonian $c$ becomes simply
$$ c = 1. \tag{21} $$
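As a sketch of how (19) to (21) come about, the steps below use only the constraint surface $\bar\chi^\alpha\chi^\alpha + \bar a^\alpha a^\alpha = 1$, the coordinates (5) and the Hamiltonians (14) and (15). The even variables $\chi, \bar\chi$ are moved freely past the odd ones, which is the only reordering needed.

```latex
\begin{align*}
1 &= \bar\chi^1\chi^1 + \bar\chi^2\chi^2 + \bar a^\alpha a^\alpha
   = \bar\chi^2\chi^2\,\bigl(\bar z z + 1 + \bar b^\alpha b^\alpha\bigr)
   \;\Longrightarrow\;
   \frac{1}{\bar\chi^2\chi^2} = 1 + \bar z z + \bar b^1 b^1 + \bar b^2 b^2, \\
c_+ &= \bar a^2\chi^1 + \bar\chi^2 a^1
   = \bar\chi^2\chi^2\,\bigl(\bar b^2 z + b^1\bigr)
   = \frac{z\bar b^2 + b^1}{1 + \bar z z + \bar b^1 b^1 + \bar b^2 b^2}, \\
c &= \bar\chi^\alpha\chi^\alpha + \bar a^\alpha a^\alpha = 1 .
\end{align*}
```

The second formula in (20) follows in the same way from $c_- = -\bar a^2\chi^2 + \bar\chi^1 a^1$.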
The last equation does not mean, however, that C gets detached from the superalgebra T . It rather means that T is the central extension of T /C by C.
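The (graded) commutation relations of the superalgebra $T$ are given by the Poisson brackets of the Hamiltonians (12)–(18). Though this is a correct definition, we prefer to give an explicit list of nonvanishing commutators because many results of this paper depend directly on them. Here they are
$$ \begin{aligned}
{[R_3, R_\pm]} &= \pm R_\pm, \quad [R_+, R_-] = 2R_3, & [Z_3, Z_\pm] &= \pm Z_\pm, \quad [Z_+, Z_-] = 2Z_3, \\
[R_3, C_\pm] &= \mp\tfrac12 C_\pm, & [R_3, C_\pm^\circ] &= \mp\tfrac12 C_\pm^\circ, \\
[R_3, C_\pm^\ddagger] &= \pm\tfrac12 C_\pm^\ddagger, & [R_3, C_\pm^{\ddagger\circ}] &= \pm\tfrac12 C_\pm^{\ddagger\circ}, \\
[R_\pm, C_\pm] &= C_\mp, & [R_\pm, C_\pm^\circ] &= C_\mp^\circ, \\
[R_\pm, C_\mp^\ddagger] &= -C_\pm^\ddagger, & [R_\pm, C_\mp^{\ddagger\circ}] &= -C_\pm^{\ddagger\circ}, \\
[Z_3, C_\pm] &= -\tfrac12 C_\pm, & [Z_3, C_\pm^\circ] &= \tfrac12 C_\pm^\circ, \\
[Z_3, C_\pm^\ddagger] &= \tfrac12 C_\pm^\ddagger, & [Z_3, C_\pm^{\ddagger\circ}] &= -\tfrac12 C_\pm^{\ddagger\circ}, \\
[Z_+, C_\pm] &= \pm C_\mp^\ddagger, & [Z_+, C_\pm^{\ddagger\circ}] &= \mp C_\mp^\circ, \\
[Z_-, C_\pm^\circ] &= \pm C_\mp^{\ddagger\circ}, & [Z_-, C_\pm^\ddagger] &= \mp C_\mp, \\
[C_\pm, C_\pm^\circ]_+ &= \pm 2R_\mp, & [C_\pm^\ddagger, C_\pm^{\ddagger\circ}]_+ &= \pm 2R_\pm, \\
[C_\pm, C_\pm^{\ddagger\circ}]_+ &= 2Z_-, & [C_\pm^\ddagger, C_\pm^\circ]_+ &= 2Z_+, \\
[C_\pm, C_\mp^\circ]_+ &= [C_\pm^\ddagger, C_\mp^{\ddagger\circ}]_+ = 2(R_3 \mp Z_3), & [C_\pm, C_\pm^\ddagger]_+ &= [C_\pm^\circ, C_\pm^{\ddagger\circ}]_+ = C.
\end{aligned} \tag{22–28} $$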
We note that the commutation relations of $T/C$ coincide with those of the anomaly free subalgebra of the $N = 4$ super-Virasoro algebra [12]. Let us define an automorphism ◦ of the algebra $T$ which plays a crucial role in our construction. It is easy to verify that the commutation relations of $T$ are invariant if
$$ R_\pm^\circ = R_\pm, \qquad R_3^\circ = R_3, \qquad Z_+^\circ = Z_-, \qquad Z_3^\circ = -Z_3, \qquad C^\circ = C. \tag{29} $$
The action of the automorphism on the odd generators is given by the notation itself and by the claim that the automorphism is involutive, i.e. it squares to the identity map. It is a matter of simple inspection to see that the associative algebra $A_\infty$, which defines the $N = 2$ supersphere, is linearly and multiplicatively generated by four odd variables $c_\pm, c_\pm^\ddagger$ and three even ones $l_\pm, l_3$, defined by$^{2}$
$$ l_3 = r_3 - \frac{1}{2c} c_+^\ddagger c_+ + \frac{1}{2c} c_-^\ddagger c_-, \tag{30} $$
$$ l_\pm = r_\pm + \frac{1}{c} c_\pm^\ddagger c_\mp. \tag{31} $$
2 Although $c = 1$ in $A_\infty$, it is useful to indicate $c$ in (30) and (31) because these formulae hold also in the noncommutative case where $c \ne 1$.
Note that
$$ l_+^\ddagger = l_-, \qquad l_3^\ddagger = l_3 \tag{32} $$
and that $l_\pm, l_3$ are not independent variables as they are subject to the following relation:
$$ l_3^2 + l_+ l_- = 1/4. \tag{33} $$
But the relations (32) and (33) characterize the ordinary bosonic sphere! Moreover, the only nonvanishing Poisson brackets among the generators $l_\pm, l_3, c_\pm, c_\pm^\ddagger, c$ turn out to be
$$ \{l_3, l_\pm\} = \pm l_\pm, \qquad \{l_+, l_-\} = 2l_3, \tag{34} $$
$$ \{c_\pm, c_\pm^\ddagger\} = c. \tag{35} $$
In other words, the $l$'s and $c$'s completely decouple and we see that the algebra $A_\infty$ is a direct product of the algebra $B_\infty$ of the functions on the ordinary sphere and of the Grassmann algebra $Gr_4$ with four generators $c_\pm, c_\pm^\ddagger$. This direct product concerns not only the associative multiplication but also the Poisson structure. The immediate conclusion of those facts is that it is very easy to quantize the $N = 2$ supersphere. The corresponding noncommutative algebra $A_n$ is simply the ordinary bosonic fuzzy sphere [4–6] tensored with a Clifford algebra $Cf_4$ with four generators $c_\pm, c_\pm^\ddagger$. Here and in what follows we shall often use the same symbol for non-deformed generators and for their deformed counterparts. It should be clear from the context which usage we have in mind. We recall that the bosonic fuzzy sphere is a $(n + 1) \times (n + 1)$ dimensional matrix algebra where the integer parameter $n$ plays the role of the inverse Planck constant [6,9,10]. Since the Poisson brackets (34) and (35) are to be replaced by commutators scaled by the inverse Planck constant, we get the following commutation relations for the noncommutative generators $l_\pm, l_3, c_\pm, c_\pm^\ddagger$ and $c$ of the $N = 2$ fuzzy supersphere:
$$ [l_3, l_\pm] = \pm\frac{1}{n} l_\pm, \qquad [l_+, l_-] = \frac{2}{n} l_3, \tag{36} $$
$$ [c_\pm, c_\pm^\ddagger]_+ = \frac{1}{n} c. \tag{37} $$
Moreover, we define the graded involution ‡ in the noncommutative case by (32) on $l_\pm, l_3$ and by the notation and the second property (3) on $c_\pm, c_\pm^\ddagger$. It is easy to find the explicit forms of the matrices $l_\pm, l_3, c_\pm, c_\pm^\ddagger$ and $c$:
$$ l_\pm = \frac{1}{n}(L_\pm \otimes 1), \qquad l_3 = \frac{1}{n}(L_3 \otimes 1), \qquad c = \Big(1 + \frac{1}{n}\Big)(1 \otimes 1), \tag{38} $$
$$ c_\pm^\ddagger = \frac{1}{\sqrt n}(1 \otimes \gamma_\pm^\ddagger), \qquad c_\pm = \frac{1}{\sqrt n}(1 \otimes \gamma_\pm), \tag{39} $$
where the first entry of the tensor product corresponds to the bosonic fuzzy sphere and the second entry to the Clifford algebra. $L_\pm, L_3$ are generators of the su(2) Lie algebra in the representation with spin $n/2$, and the $\gamma$'s and $\gamma^\ddagger$'s are standard Dirac matrices with respect to the Euclidean metric in four dimensions normalized according to
$$ [\gamma_\pm, \gamma_\pm^\ddagger]_+ = c = 1 + 1/n. \tag{40} $$
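The matrices in (38)–(40) are small enough to be checked numerically. The following is a minimal sketch in plain Python, under two assumptions that are not taken from the text: the Clifford generators are realized by a Jordan-Wigner construction of two fermionic oscillators rescaled by $\sqrt{c}$, and the quadratic relation of the sphere is evaluated with the symmetrized product $\tfrac12(l_+ l_- + l_- l_+)$. All function names are hypothetical helpers.

```python
import math

def identity(d):
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

def zeros(d):
    return [[0.0] * d for _ in range(d)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def lin(a, b, s):
    """Entrywise a + s * b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def comm(a, b):
    return lin(matmul(a, b), matmul(b, a), -1.0)

def anti(a, b):
    return lin(matmul(a, b), matmul(b, a), 1.0)

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def generators(n):
    """Matrices of (38)-(39) for a positive integer n (assumed realization)."""
    j, d = n / 2.0, n + 1
    L3, Lp = zeros(d), zeros(d)
    for i in range(d):
        m = j - i                       # basis ordered as m = j, j-1, ..., -j
        L3[i][i] = m
        if i > 0:
            Lp[i - 1][i] = math.sqrt((j - m) * (j + m + 1))
    Lm = transpose(Lp)
    c = 1.0 + 1.0 / n
    lower, sz = [[0.0, 1.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, -1.0]]
    # Assumption: Jordan-Wigner form of two fermionic oscillators.
    osc = {"+": kron(lower, identity(2)), "-": kron(sz, lower)}
    g = {"l3": kron(scale(1.0 / n, L3), identity(4)),
         "l+": kron(scale(1.0 / n, Lp), identity(4)),
         "l-": kron(scale(1.0 / n, Lm), identity(4))}
    for s, a in osc.items():
        gamma = scale(math.sqrt(c), a)  # so that {gamma, gamma^dagger} = c
        g["c" + s] = kron(identity(d), scale(1.0 / math.sqrt(n), gamma))
        g["c" + s + "_dag"] = transpose(g["c" + s])  # real matrices: dagger = transpose
    return g, c

def check_relations(n):
    g, c = generators(n)
    dim = 4 * (n + 1)
    one, zero = identity(dim), zeros(dim)
    casimir = lin(matmul(g["l3"], g["l3"]), anti(g["l+"], g["l-"]), 0.5)
    return {
        "[l3,l+] = l+/n": close(comm(g["l3"], g["l+"]), scale(1.0 / n, g["l+"])),
        "[l3,l-] = -l-/n": close(comm(g["l3"], g["l-"]), scale(-1.0 / n, g["l-"])),
        "[l+,l-] = 2 l3/n": close(comm(g["l+"], g["l-"]), scale(2.0 / n, g["l3"])),
        "{c+,c+dag} = c/n": close(anti(g["c+"], g["c+_dag"]), scale(c / n, one)),
        "{c-,c-dag} = c/n": close(anti(g["c-"], g["c-_dag"]), scale(c / n, one)),
        "{c+,c-} = 0": close(anti(g["c+"], g["c-"]), zero),
        "{c+,c-dag} = 0": close(anti(g["c+"], g["c-_dag"]), zero),
        "[l3,c+] = 0": close(comm(g["l3"], g["c+"]), zero),
        "symmetrized Casimir": close(casimir, scale(0.25 + 0.5 / n, one)),
    }
```

Calling `check_relations(3)` returns a dictionary of booleans, one per relation. The tensor-product form makes the decoupling of the $l$'s and $c$'s explicit: every bosonic generator acts as the identity on the four-dimensional Fock space, and every fermionic one acts as the identity on the spin-$n/2$ representation.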
Note that due to the tensor product structure of the $N = 2$ supersphere, the normalization of the central term $c$ must be 1 in the limit $n \to \infty$ but it is otherwise a free parameter of the construction. It is the choice $c = 1 + 1/n$ which makes it possible to construct $N = 2$ supersymmetric σ-models on the $N = 2$ fuzzy supersphere.

Let us study the properties of the algebra $A_n$. It turns out that we shall need a non-commutative analogue of the automorphism ◦. Actually, we defined ◦ as the automorphism of the Lie superalgebra $T$ and not yet of $A_\infty$. On the other hand, ◦ can be directly defined to act on the whole algebra $A_\infty$ as the morphism of the (graded) multiplication. For instance,
$$ (l_3 c_+)^\circ = l_3^\circ c_+^\circ. \tag{41} $$
Since we know that the Poisson bracket (6) is compatible with the associative multiplication in $A_\infty$, we conclude that ◦ is the automorphism of both the associative and Poisson structure of $A_\infty$. We can equally well use another set of generators for describing the algebra $A_\infty$, namely, the set $c_\pm^\circ, c_\pm^{\ddagger\circ}$ and $l_\pm^\circ, l_3^\circ$. Of course, $l_\pm^\circ, l_3^\circ$ are given by
$$ l_3^\circ = r_3 - \frac{1}{2c} c_+^{\ddagger\circ} c_+^\circ + \frac{1}{2c} c_-^{\ddagger\circ} c_-^\circ, \tag{42} $$
$$ l_\pm^\circ = r_\pm + \frac{1}{c} c_\pm^{\ddagger\circ} c_\mp^\circ, \tag{43} $$
and they also turn out to fulfill the relations
$$ l_+^{\circ\ddagger} = l_-^\circ, \qquad l_3^{\circ\ddagger} = l_3^\circ \tag{44} $$
and
$$ (l_3^\circ)^2 + l_+^\circ l_-^\circ = 1/4. \tag{45} $$
Moreover, the only nonvanishing Poisson brackets among the generators $l_\pm^\circ, l_3^\circ, c_\pm^\circ, c_\pm^{\ddagger\circ}$ are as follows:
$$ \{l_3^\circ, l_\pm^\circ\} = \pm l_\pm^\circ, \qquad \{l_+^\circ, l_-^\circ\} = 2l_3^\circ, \tag{46} $$
$$ \{c_\pm^\circ, c_\pm^{\ddagger\circ}\} = c. \tag{47} $$
The relations (46) and (47) are actually direct consequences of the fact that ◦ is the automorphism of the Poisson algebra $A_\infty$. Nevertheless, we prefer to state them explicitly in order to stress the equal footing of the two sets of generators. Of course, we can now construct the noncommutative deformation of the $N = 2$ supersphere by quantizing the set of the new circled generators. Thus, the $N = 2$ fuzzy supersphere will again be nothing but the tensor product of the bosonic fuzzy sphere with the Clifford algebra $Cf_4$. A question arises: how do the uncircled and the circled fuzzy superspheres fit together? Let us look for a key for answering this question in the commutative case $A_\infty$, where the circled variables can be written in terms of the non-circled ones as follows:
$$ c_\pm^\circ = \frac{1}{c}\Big(2l_3 c_\mp^\ddagger \pm 2l_\mp c_\pm^\ddagger \pm \frac{1}{c} c_\mp^\ddagger [c_\pm^\ddagger, c_\pm]\Big), \tag{48} $$
$$ c_\pm^{\ddagger\circ} = \frac{1}{c}\Big(2l_3 c_\mp \pm 2l_\pm c_\pm \pm \frac{1}{c} c_\mp [c_\pm, c_\pm^\ddagger]\Big), \tag{49} $$
$$ l_3^\circ = l_3 + \frac{1}{2c} c_+^\ddagger c_+ - \frac{1}{2c} c_-^\ddagger c_- - \frac{1}{2c} c_+^{\ddagger\circ} c_+^\circ + \frac{1}{2c} c_-^{\ddagger\circ} c_-^\circ, \tag{50} $$
$$ l_\pm^\circ = l_\pm - \frac{1}{c} c_\pm^\ddagger c_\mp + \frac{1}{c} c_\pm^{\ddagger\circ} c_\mp^\circ. \tag{51} $$
Now we take the formulae (48)–(51) as the definition of the circled variables in the non-commutative case where $l_\pm, l_3, c_\pm, c_\pm^\ddagger, c$ are given by (38) and (39). The operator formulae (48) and (49) are remarkable since they involve cubic terms in the old uncircled generators. This causes the usual ordering problem to lead in this case to an operator rather than a c-number ambiguity. Indeed, writing the cubic terms in (48) and (49) requires the fixing of a certain ordering; in fact, the commutators (not anticommutators!) $[c_\pm^\ddagger, c_\pm]$ in (48) and (49) do the job. A slight change of the ordering in any of the definitions (48), (49) would completely destroy a crucial property of this map, namely, the circled variables fulfill exactly the same properties as the noncircled ones. Explicitly,
$$ [l_3^\circ, l_\pm^\circ] = \pm\frac{1}{n} l_\pm^\circ, \qquad [l_+^\circ, l_-^\circ] = \frac{2}{n} l_3^\circ, \tag{52} $$
$$ [c_\pm^\circ, c_\pm^{\ddagger\circ}]_+ = \frac{1}{n} c \tag{53} $$
and all remaining graded commutators vanish. Recall that the relations (52) and (53) are not postulated but they are derived from the relations (36) and (37) and the definitions (48) and (49). The normalization and reality of $l_3^\circ, l_\pm^\circ$ is also correct since one can verify that
$$ (l_3^\circ)^2 + l_+^\circ l_-^\circ = l_3^2 + l_+ l_- = \frac14 + \frac{1}{2n} \tag{54} $$
and
$$ l_+^{\circ\ddagger} = l_-^\circ, \qquad l_3^{\circ\ddagger} = l_3^\circ. \tag{55} $$
Thus we conclude that the uncircled $N = 2$ fuzzy supersphere is the same thing as the circled one. The mapping ◦ preserves the commutation relations among the generators, therefore it can be extended to the whole supersphere as the automorphism of its associative product. Moreover, ◦ is an involutive automorphism since (51) and (52) are manifestly involutive and a tedious computation shows that the definitions (48)–(51) imply
$$ c_\pm = \frac{1}{c}\Big(2l_3^\circ c_\mp^{\ddagger\circ} \pm 2l_\mp^\circ c_\pm^{\ddagger\circ} \pm \frac{1}{c} c_\mp^{\ddagger\circ} [c_\pm^{\ddagger\circ}, c_\pm^\circ]\Big), \tag{56} $$
$$ c_\pm^\ddagger = \frac{1}{c}\Big(2l_3^\circ c_\mp^\circ \pm 2l_\pm^\circ c_\pm^\circ \pm \frac{1}{c} c_\mp^\circ [c_\pm^\circ, c_\pm^{\ddagger\circ}]\Big). \tag{57} $$
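For completeness, we give explicit formulae for the even Hamiltonians $r_\pm, r_3, z_\pm, z_3$ in terms of the generators $l_3, l_\pm, c_\pm, c_\pm^\ddagger$ and $l_3^\circ, l_\pm^\circ, c_\pm^\circ, c_\pm^{\ddagger\circ}$. They are valid in both commutative ($n \to \infty$) and noncommutative (finite $n$) cases:
$$ r_3 = l_3 + \frac{1}{2c} c_+^\ddagger c_+ - \frac{1}{2c} c_-^\ddagger c_- = l_3^\circ + \frac{1}{2c} c_+^{\ddagger\circ} c_+^\circ - \frac{1}{2c} c_-^{\ddagger\circ} c_-^\circ, \tag{58} $$
$$ r_\pm = l_\pm - \frac{1}{c} c_\pm^\ddagger c_\mp = l_\pm^\circ - \frac{1}{c} c_\pm^{\ddagger\circ} c_\mp^\circ, \tag{59} $$
$$ z_3 = \frac{1}{2c} c_+^\ddagger c_+ + \frac{1}{2c} c_-^\ddagger c_- - \frac{1}{2n} = \frac{1}{2n} - \frac{1}{2c} c_+^{\ddagger\circ} c_+^\circ - \frac{1}{2c} c_-^{\ddagger\circ} c_-^\circ, \tag{60} $$
$$ z_+ = \frac{1}{c} c_-^\ddagger c_+^\ddagger = \frac{1}{c} c_+^\circ c_-^\circ, \qquad z_- = \frac{1}{c} c_+ c_- = \frac{1}{c} c_-^{\ddagger\circ} c_+^{\ddagger\circ}. \tag{61} $$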
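As an illustration of how such formulae can be checked, the commutative identity $z_- = c_+ c_-/c$ follows from (13), (14) and (15) by direct multiplication. The sketch below uses only that the variables $a^\alpha, \bar a^\alpha$ anticommute and square to zero while $\chi^\alpha, \bar\chi^\alpha$ are even.

```latex
\begin{align*}
c_+ c_- &= (\bar a^2\chi^1 + \bar\chi^2 a^1)(-\bar a^2\chi^2 + \bar\chi^1 a^1) \\
        &= \bar a^2 a^1\,\bar\chi^1\chi^1 - a^1\bar a^2\,\bar\chi^2\chi^2
         = \bar a^2 a^1\,(\bar\chi^1\chi^1 + \bar\chi^2\chi^2), \\
c\, z_- &= (\bar\chi^\alpha\chi^\alpha + \bar a^1 a^1 + \bar a^2 a^2)\,\bar a^2 a^1
         = (\bar\chi^1\chi^1 + \bar\chi^2\chi^2)\,\bar a^2 a^1 .
\end{align*}
```

In the first line the terms containing $\bar a^2\bar a^2$ and $a^1 a^1$ vanish, and in the second the products $\bar a^1 a^1\,\bar a^2 a^1$ and $\bar a^2 a^2\,\bar a^2 a^1$ vanish for the same reason, so $c_+ c_- = c\, z_-$.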
The construction of the involutive automorphism ◦ is the main result of this section. In what follows, we shall always enjoy the freedom of choosing to work in one of the two ◦-related equivalent parametrizations of the $N = 2$ fuzzy supersphere.

3. N = 2 Nonlinear σ-Models

3.1. The commutative case. The basic fact of life in the $N = 2$ flat Euclidean superspace is that a Lagrangian density of a field theoretic model does not involve derivatives. All dynamics is encoded in constraints imposed on $N = 2$ superfields in a way compatible with the $N = 2$ supersymmetry. For example, the Lagrangian of an $N = 2$ supersymmetric σ-model on the Euclidean plane is given by
$$ S = \int d\bar z\, dz\, d\bar b^1 db^1 d\bar b^2 db^2\, K(\bar\Phi\Phi). \tag{62} $$
Here $\Phi(\bar z, z, \bar b^1, b^1, \bar b^2, b^2)$ and $\bar\Phi(\bar z, z, \bar b^1, b^1, \bar b^2, b^2)$ are superfields on the plane. They are subject to the following constraints:
$$ D_+\Phi = \bar D_-\Phi = 0, \qquad D_-\bar\Phi = \bar D_+\bar\Phi = 0, \tag{63} $$
and $K(\bar\Phi\Phi)$ is the Kähler potential of a target Kähler manifold with complex coordinates $\bar\Phi, \Phi$. The supersymmetric covariant derivatives are defined as
$$ D_+ = \partial_{b^2} + b^1\partial_z, \qquad D_- = \partial_{b^1} + b^2\partial_z, \qquad \bar D_+ = \partial_{\bar b^2} + \bar b^1\partial_{\bar z}, \qquad \bar D_- = \partial_{\bar b^1} + \bar b^2\partial_{\bar z}. \tag{64} $$
Note that the flat measure in the integral (62) coincides with the $N = 2$ “round” measure (8). Because of this fact, the model (62) can be reinterpreted as a model on the $N = 2$ supersphere. For this interpretation, it is sufficient to declare that both $\bar\Phi$ and $\Phi$ are not the superfields on the plane but they are rather elements of the algebra $A_\infty$, i.e. of the algebra of the superfunctions on the $N = 2$ supersphere. More precisely, the superfields are linear combinations of the elements of $A_\infty$ of the form (9) with coefficients being ordinary numbers when $\bar l^1 + \bar l^2 + l^1 + l^2$ is even, and Grassmann numbers when $\bar l^1 + \bar l^2 + l^1 + l^2$ is odd. These Grassmann numbers anticommute with the odd generators of $A_\infty$. As a result, the superfields $\bar\Phi, \Phi$ are even. This remark is important when we calculate the Poisson brackets involving the superfields or when we use the graded involution. The constraints (63) turn out to be equivalent to
$$ \{c_\pm, \Phi\} = 0, \qquad \{c_\pm^\circ, \bar\Phi\} = 0, \tag{65} $$
where $\{\cdot\,, \cdot\}$ is the “round” Poisson bracket (6) and the Hamiltonians $c_\pm, c_\pm^\circ$ are given in (15) and (17). The constraints (63) or (65) define so-called twisted chiral and twisted anti-chiral fields, respectively. In order to have another viable set of Poisson bracket constraints, giving so-called chiral and anti-chiral superfields [1], we would have to change the symplectic structure (6). This is easy but we shall not discuss it in this paper
because the resulting picture is completely analogous to the twisted one. We just remark that from the point of view of the Poisson bracket (6) on the $N = 2$ supersphere, the twisted fields are more “natural” than the untwisted ones.

There is an inconspicuous but, in fact, important detail that concerns the (graded) involution in (62) denoted by a bar. It acts on the generators $\bar z, z, \bar b^1, b^1, \bar b^2, b^2$ following the notation itself. This involution is not the same as the one denoted by ‡ in Sect. 2 (cf. (2)), although they coincide on the bosonic variables $\bar z, z$. In fact, ‡ is rather a world-sheet involution. It is with respect to ‡ that the generators $l_\pm, l_3$ or $l_\pm^\circ, l_3^\circ$ fulfill the correct reality conditions (32) of the bosonic generators of the (fuzzy) sphere. On the other hand, the bar involution sets the reality properties of fermionic fields if the supersymmetric action is written in components. These reality properties propagate to the quantization of the field theoretical model and define an involution on the Hilbert space of the quantum field theory. We remark that all this is also a standard flat space supersymmetric story although many authors do not provide a detailed discussion of the various involutions in the game. Their approach is simple and pragmatic: once a Lagrangian is worked out in components, an involution on fermions is set which makes the action real. In our case, we have to be more careful since experience [6,9] teaches us that only superfields as a whole are deformable; in other words, the notion of the component fields may lose sense after noncommutative deformation.

Once we have defined the σ-model (62) on the commutative $N = 2$ supersphere, it is natural to ask what its algebra of supersymmetry is. There is a huge formal supersymmetry algebra of the theory (62) known as the $N = 2$ super-de-Witt algebra (whose central extension is the $N = 2$ Virasoro algebra [12]).
It is actually defined as the Lie superalgebra of vector fields that preserve the constraints (63) [13] and, explicitly, it is generated by even chiral (anti-chiral) vector fields $L_k, J_k$, $k \in \mathbb{Z}$ ($\bar L_k, \bar J_k$, $k \in \mathbb{Z}$) and odd chiral (anti-chiral) ones $G^\pm_{k+\frac12}$, $k \in \mathbb{Z}$ ($\bar G^\pm_{k+\frac12}$, $k \in \mathbb{Z}$). They are given by
$$ L_k = z^{-k+1}\partial_z + \tfrac12(-k+1) z^{-k}(b^1\partial_{b^1} + b^2\partial_{b^2}), \tag{66} $$
$$ J_k = z^{-k}(b^1\partial_{b^1} - b^2\partial_{b^2}), \tag{67} $$
$$ G^+_{k+\frac12} = (z^{-k}\partial_{b^2} + k z^{-k-1} b^1 b^2 \partial_{b^2} - z^{-k} b^1 \partial_z), \tag{68} $$
$$ G^-_{k+\frac12} = (z^{-k}\partial_{b^1} + k z^{-k-1} b^2 b^1 \partial_{b^1} - z^{-k} b^2 \partial_z). \tag{69} $$
The barred generators are given by the same formulas with $\bar z, \bar b^1, \bar b^2$ replacing $z, b^1, b^2$. It turns out that only eight chiral generators $L_{\pm1}, L_0, J_0, G^\pm_{\pm\frac12}$ and eight anti-chiral ones $\bar L_{\pm1}, \bar L_0, \bar J_0, \bar G^\pm_{\pm\frac12}$ preserve the algebra $A_\infty$ of the superfields on the $N = 2$ supersphere. Obviously, they form a Lie subalgebra of the full de Witt algebra. It therefore seems that the algebra of supersymmetry of the model on the sphere has sixteen complex dimensions. However, it is not so because we have to impose two further conditions which the supersymmetry algebra has to fulfill:

1) Since we are interested in the noncommutative deformation of the $N = 2$ σ-model (62) we have to consider only those generators which act by means of the Poisson bracket (6). This means that they are the Hamiltonian vector fields and, in the noncommutative case, they will act via the commutators. This reduces the supersymmetry algebra to an eight dimensional Lie superalgebra spl(2, 1). It is generated by four even generators
$R_\pm, R_3, Z_3$ and four odd ones $C_\pm^\ddagger, C_\pm^{\ddagger\circ}$. Explicitly,
$$ C_+^\ddagger = G^-_{\frac12} + \bar G^+_{-\frac12}, \qquad C_-^\ddagger = G^-_{-\frac12} - \bar G^+_{\frac12}, \qquad C_+^{\ddagger\circ} = G^+_{\frac12} + \bar G^-_{-\frac12}, \qquad C_-^{\ddagger\circ} = G^+_{-\frac12} - \bar G^-_{\frac12}, \tag{70} $$
$$ R_+ = -L_1 - \bar L_{-1}, \qquad R_- = \bar L_1 + L_{-1}, \qquad R_3 = \bar L_0 - L_0, \qquad Z_3 = \tfrac12(\bar J_0 - J_0). \tag{71} $$
Needless to say, the Hamiltonians of these generators of spl(2, 1) are $r_\pm, r_3, z_3$ and $c_\pm^\ddagger, c_\pm^{\ddagger\circ}$ of Sect. 2, Eqs. (12), (13), (16) and (18). Clearly, spl(2, 1) is the Lie subalgebra of $T$, hence its commutation relations are contained in (22)–(28).

2) We require that a supersymmetric transformation $\delta$ realized on both superfields $\Phi$ and $\bar\Phi$ respect the conjugacy of the fields, in other words,
$$ \overline{\delta\Phi} = \delta\bar\Phi. \tag{72} $$
This reduces the supersymmetry algebra of the commutative model (62) to a certain real form of the spl(2, 1) algebra. Explicitly, the spl(2, 1) supersymmetry transformation is given by
$$ \delta\Phi = (\epsilon^+ C_+^\ddagger + \epsilon^- C_-^\ddagger + \rho^+ C_+^{\ddagger\circ} + \rho^- C_-^{\ddagger\circ} + \beta Z_3 + \alpha^3 R_3 + \alpha^+ R_+ + \alpha^- R_-)\Phi, \tag{73} $$
and in the same way for $\bar\Phi$. Eq. (72) implies that the Grassmann parameters $\epsilon^\pm, \rho^\pm$ have to fulfill
$$ \overline{\epsilon^-} = \epsilon^+, \qquad \overline{\rho^-} = \rho^+, \tag{74} $$
and the bosonic ones $\beta, \alpha^3, \alpha^\pm$,
$$ \bar\beta = -\beta, \qquad \overline{\alpha^3} = -\alpha^3, \qquad \overline{\alpha^+} = -\alpha^-. \tag{75} $$
These conditions correspond precisely to the choice of the real form of the spl(2, 1) superalgebra. We have to check that the constraints (65) are compatible with the spl(2, 1) supersymmetry. The simplest way to see it is to note that

1) the quadruples $c_\pm, z_-, c$ and $c_\pm^\circ, z_+, c$ form spl(2, 1) multiplets under the adjoint action in $T$ (22)–(28);
2) $\{c, \Phi\}$ and $\{c, \bar\Phi\}$ trivially vanish;
3) the constraints $\{c_\pm, \Phi\} = 0$ and $\{c_\pm^\circ, \bar\Phi\} = 0$ imply $\{z_-, \Phi\} = 0$ and $\{z_+, \bar\Phi\} = 0$, respectively. This is true because of the explicit formulae (61).
We conclude that the model (62) on the commutative sphere is spl(2, 1) supersymmetric, because also the measure of the integral (8) is invariant with respect to the Hamiltonian vector fields. Indeed, one can straightforwardly check that I ({t, f }) = 0 for whatever t, f ∈ A∞ .
(76)
3.2. The noncommutative case. Here are the ingredients needed for defining the noncommutative deformation of the $N = 2$ supersymmetric σ-model (62):

1) The bar involution in the noncommutative case, for it coidentifies the supersymmetry algebra and ensures the reality of the Lagrangian $K(\bar\Phi\Phi)$. It turns out that in the commutative case the bar involution can be expressed in terms of the automorphism ◦. Explicitly,
$$ \bar c_\pm = \mp c_\mp^\circ, \qquad \overline{c_\pm^\ddagger} = \pm c_\mp^{\ddagger\circ}, \qquad \bar l_\pm = l_\mp^\circ, \qquad \bar l_3 = l_3^\circ, \qquad \bar c = c. \tag{77} $$
Since the automorphism ◦ continues to make sense on the $N = 2$ fuzzy supersphere, we can use the relations (77) as the definition of the barred quantities in the noncommutative case. Note, however, that the barred involution is not an automorphism of the algebra $A_n$ although its action on the generators $l_\pm, l_3, c_\pm, c_\pm^\ddagger$ of $A_n$ is expressible in terms of the automorphism ◦. The point is that the bar involution acting on the product of two generators is not a morphism of the associative product in $A_n$ since it is defined by the first rule of (3), e.g.
$$ \overline{c_+ c_-^\ddagger} = -\overline{c_-^\ddagger}\,\bar c_+ = -c_+^{\ddagger\circ} c_-^\circ. \tag{78} $$
The automorphism ◦, in turn, does respect the multiplication, e.g.
$$ (c_+ c_-^\ddagger)^\circ = c_+^\circ c_-^{\ddagger\circ}. \tag{79} $$
We actually use the first rule of (3) to define the barred involution of all elements of $A_n$, hence the second rule of (3) has to be verified. For example, we can calculate $\overline{(c_+^\circ)}$ by using the formula (48) and the first rule of (3). But one can use also (77) and the second rule of (3), i.e.
$$ \overline{c_+^\circ} = \bar{\bar c}_- = -c_-. \tag{80} $$
The consistency of the definition requires that both ways of calculating must be equivalent. Fortunately, this is the case and we have the bar involution also in the noncommutative case.

2) We need also the noncommutative analogue of the integral (8). It is a simple exercise to show that the commutative measure in the variables $\bar z, z, \bar b^\alpha, b^\alpha$ can be rewritten in the variables $l_\pm, l_3, c_\pm, c_\pm^\ddagger$ as follows:
$$ d\bar z\, dz\, d\bar b^1 db^1 d\bar b^2 db^2 = dl_+\, dl_-\, dl_3\, \delta\big(l_+ l_- + l_3^2 - \tfrac14\big)\, dc_+\, dc_-\, dc_+^\ddagger\, dc_-^\ddagger. \tag{81} $$
Thus we see that the measure is simply the direct product of the round measure on the bosonic sphere and of the flat measure in the remaining fermionic variables. Upon the Berezin quantization, the integral over the bosonic measure becomes $\frac{1}{n+1}\mathrm{Tr}$ [6]. The fermionic measure, in turn, becomes the supertrace STr (not the trace!) over the Clifford algebra $Cf_4$. Indeed, the generators $c_\pm, c_\pm^\ddagger$ of the Clifford algebra satisfy the canonical anticommutation relation of a quantum mechanical system with two fermionic degrees of freedom. The Clifford algebra can be identified with the algebra of linear operators acting on the corresponding Fock space. The latter is naturally graded so we obtain the supertrace as
$$ \mathrm{STr}(\,\cdot\,) \equiv \mathrm{Tr}(\Gamma\,\cdot\,), \tag{82} $$
where $\Gamma$ is the grading operator. It is a textbook fact that, upon the quantization of the fermionic system, the Berezin integral becomes the supertrace. It is easy to see it directly in the case of one fermionic oscillator only. Then the only nonzero Berezin integral is the one over $c^\ddagger c$; this is also true for the supertrace in the quantum case. The integral (denote it $I_n$) in the noncommutative case has a crucial property
$$ I_n(AB - (-1)^{AB} BA) = 0 \tag{83} $$
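The one-oscillator statement can be made concrete with 2×2 matrices. The sketch below is an illustration under stated assumptions rather than the construction used in the text: the Fock space is spanned by an even vacuum and an odd one-particle state, the grading operator is taken as $\Gamma = \mathrm{diag}(1, -1)$, and the oscillator is normalized so that $c\,c^\ddagger + c^\ddagger c = 1$. It evaluates the supertrace on the basis $1, c, c^\ddagger, c^\ddagger c$ and checks property (83) on all pairs of basis elements.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def supertrace(x):
    """STr(x) = Tr(Gamma x) with the assumed grading Gamma = diag(+1, -1)."""
    return x[0][0] - x[1][1]

ONE = [[1, 0], [0, 1]]
C = [[0, 1], [0, 0]]        # annihilation operator: maps the odd state to the vacuum
C_DAG = [[0, 0], [1, 0]]    # creation operator

# Basis of the one-oscillator algebra: name -> (matrix, parity)
BASIS = {
    "1": (ONE, 0),
    "c": (C, 1),
    "c_dag": (C_DAG, 1),
    "c_dag c": (matmul(C_DAG, C), 0),
}

def graded_commutator(a, pa, b, pb):
    """AB - (-1)^{|A||B|} BA for homogeneous elements of parities pa, pb."""
    sign = -1 if (pa and pb) else 1
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - sign * ba[i][j] for j in range(2)] for i in range(2)]

supertraces = {name: supertrace(m) for name, (m, _) in BASIS.items()}

graded_cyclicity_holds = all(
    supertrace(graded_commutator(a, pa, b, pb)) == 0
    for a, pa in BASIS.values()
    for b, pb in BASIS.values()
)
```

With these conventions `supertraces` vanishes on `1`, `c` and `c_dag` and is nonzero only on `c_dag c`, mirroring the Berezin integral, and `graded_cyclicity_holds` is `True`. Since the graded commutator is bilinear, checking homogeneous basis elements covers the whole algebra.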
for any $A, B \in A_n$. This property plays the same role in the noncommutative case as (76) in the commutative one. Namely, it will ensure the supersymmetry of the following action:
$$ S_n = I_n(K(\bar\Phi\Phi)). \tag{84} $$
This is the action of the $N = 2$ supersymmetric nonlinear σ-model on the fuzzy sphere. The superfields $\bar\Phi$ and $\Phi$ are now elements of the fuzzy algebra $A_n$. More precisely, if we take any element of $A_n$, it can be written as a polynomial in the generators $l_\pm, l_3, c_\pm, c_\pm^\ddagger$. Now the coefficients in front of the odd polynomials of the superfields have to anticommute with those polynomials, for example one has
$$ \eta\, c_+ l_+ l_- = -c_+ l_+ l_-\, \eta. \tag{85} $$
The coefficients in front of the even polynomials commute with them:
$$ r\, l_- c_+ c_- = l_- c_+ c_-\, r. \tag{86} $$
These rules are, of course, standard in the superworld. In the commutative case, the $\eta$'s in (85) were the Grassmann numbers belonging to some Grassmann algebra $P$. In the noncommutative case, however, they are the Grassmann numbers tensored with the grading $\Gamma$ of the linear space $H_n$, where $A_n$ acts. The tensoring with $\Gamma$ is the representation of $P$ (due to $\Gamma^2 = 1$) and it ensures the correct commutative limit of the superfields $\bar\Phi, \Phi$. On the other hand, $r$ in (86) can be interpreted as a complex multiple of the unit element of $A_n$. An important thing is that the $\eta$'s are odd and the $r$'s even. Thus the superfields $\bar\Phi, \Phi$ are even. The constraints in the noncommutative case are defined by the formulae
$$ n[c_\pm, \Phi] = 0, \qquad n[c_\pm^\circ, \bar\Phi] = 0. \tag{87} $$
Note that the only change with respect to the commutative case is the replacement of the Poisson brackets by the commutators scaled by the inverse Planck constant $n$. Of course, the Hamiltonians $c_\pm, c_\pm^\circ$ are elements of $A_n$.

The spl(2, 1) supersymmetry transformation $\delta$ is again generated by the noncommutative Hamiltonians $r_\pm, r_3, z_3, c_\pm^\ddagger, c_\pm^{\ddagger\circ}$ given by (12), (13), (16) and (18). Explicitly
$$ \begin{aligned} \delta\Phi = n\big( & \epsilon^+[c_+^\ddagger, \Phi] + \epsilon^-[c_-^\ddagger, \Phi] + \rho^+[c_+^{\ddagger\circ}, \Phi] + \rho^-[c_-^{\ddagger\circ}, \Phi] \\ & + \beta[z_3, \Phi] + \alpha^3[r_3, \Phi] + \alpha^+[r_+, \Phi] + \alpha^-[r_-, \Phi]\big). \end{aligned} \tag{88} $$
Here $\alpha^\pm, \alpha^3, \beta$ are numbers and the parameters $\epsilon^\pm$ and $\rho^\pm$ are the quantities of the type $\eta$ in (85). This is important for ensuring that a commutator of two supertransformations with different coefficients is again a supertransformation. The parameters of $\delta$ are again
to fulfill the same relations (74) and (75) as their counterparts in the commutative case. With this assignment we can easily check that we have also in the noncommutative case
$$ \overline{\delta\Phi} = \delta\bar\Phi. \tag{89} $$
One can prove that the constraints (87) are compatible with the supersymmetry transformation in the same way as in the commutative setting. Summing up, we have constructed the $N = 2$ spl(2, 1) supersymmetric nonlinear σ-model on the noncommutative sphere. Two $N = 1$ osp(2, 1) subsupersymmetries can be obtained by setting respectively $\epsilon^\alpha = \rho^\alpha$ and $\epsilon^\alpha = -\rho^\alpha$.

3.3. Solving the constraints. In the flat space commutative case, one can do more than just define the σ-model by the action (62) and the constraints (63). Indeed, one can effectively solve (63) and cast the action (62) in terms of the solutions of the constraints. Here we are going to show that all this can be performed also on the commutative $N = 2$ supersphere and even on the noncommutative one. Indeed, due to the Poisson brackets/commutation relations (34), (35)/(36), (37), we immediately conclude that any element of $A_\infty$ or $A_n$ of the form
$$ \Phi(l_\pm, l_3, c_\pm) \tag{90} $$
solves the first set of the constraints in (87) and any element of the form
$$ \bar\Phi(l_\pm^\circ, l_3^\circ, c_\pm^\circ) \tag{91} $$
solves the second set. Let us moreover show that every solution of (87) is of the form (90). Indeed, it is a simple matter to check that every element $\Psi$ of $A_\infty$ or $A_n$ can be unambiguously written as
$$ \Psi = \Phi(l_\pm, l_3, c_\pm) + \Phi_-(l_\pm, l_3, c_\pm)\, c_-^\ddagger + \Phi_+(l_\pm, l_3, c_\pm)\, c_+^\ddagger + \Phi_{+-}(l_\pm, l_3, c_\pm)\, c_+^\ddagger c_-^\ddagger. \tag{92} $$
Now the fact that $\Phi(l_\pm, l_3, c_\pm)$ is the most general solution of (87) is the direct consequence of the Poisson brackets/commutation relations (34)–(37). The same argument holds also for the circled variables.

Finally, we remark how we can cast in components the action on the commutative $N = 2$ supersphere. We use the fact that $l_\pm, l_3 \in A_\infty$ are the generators of the ordinary sphere (cf. (32) and (33)). We can therefore introduce variables $u, u^\ddagger$ such that
$$ l_+ = \frac{u^\ddagger}{1 + u^\ddagger u}, \qquad l_- = \frac{u}{1 + u^\ddagger u}, \qquad l_3 = \frac12\,\frac{u^\ddagger u - 1}{u^\ddagger u + 1}. \tag{93} $$
Then
$$ c_+ = \frac{u\bar b^2 + b^1}{1 + u^\ddagger u}, \qquad c_- = \frac{-\bar b^2 + u^\ddagger b^1}{1 + u^\ddagger u}. \tag{94} $$
Comparing with (20), we have
$$ u = z + b^1 b^2, \qquad u^\ddagger = \bar z + \bar b^2 \bar b^1, \tag{95} $$
and we arrive at
$$ \Phi = \Phi(z + b^1 b^2,\ \bar z + \bar b^2 \bar b^1,\ b^1,\ \bar b^2). \tag{96} $$
Much in the same way, we obtain
$$ \bar\Phi = \bar\Phi(z + b^2 b^1,\ \bar z + \bar b^1 \bar b^2,\ \bar b^1,\ b^2). \tag{97} $$
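Two short consistency checks may help here. They are a sketch under two assumptions not spelled out in the text: the fermionic derivatives in (64) are taken as left derivatives, so that $\partial_{b^2}(b^1 b^2) = -b^1$, and $l_3$ is normalized with the factor $\tfrac12$ that the sphere relation (33) requires.

```latex
\begin{align*}
% (a) the arguments of \Phi in (96) are annihilated by D_+ and \bar D_- of (64)
D_+ (z + b^1 b^2) &= \partial_{b^2}(b^1 b^2) + b^1 \partial_z z = -b^1 + b^1 = 0,
& D_+ b^1 &= D_+ \bar b^2 = 0, \\
\bar D_- (\bar z + \bar b^2 \bar b^1) &= \partial_{\bar b^1}(\bar b^2 \bar b^1) + \bar b^2 \partial_{\bar z}\bar z = -\bar b^2 + \bar b^2 = 0,
& \bar D_- b^1 &= \bar D_- \bar b^2 = 0, \\
% (b) the parametrization by u, u^\ddagger satisfies (33); here x = u^\ddagger u
l_3^2 + l_+ l_- &= \frac{(x - 1)^2}{4(x + 1)^2} + \frac{x}{(1 + x)^2}
 = \frac{(x - 1)^2 + 4x}{4(x + 1)^2} = \frac14 .
\end{align*}
```

Check (a) shows that any function of the four arguments in (96) obeys the twisted chiral constraints $D_+\Phi = \bar D_-\Phi = 0$ of (63); the analogous computation with $D_-$ and $\bar D_+$ applies to (97).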
Expanding (96) and (97) in $\bar b^\alpha, b^\alpha$, inserting in the action (62) and integrating over $b^\alpha, \bar b^\alpha$, we obtain the standard action of the $N = 2$ supersymmetric σ-model in components.

4. Outlook

For the purpose of the quantization of the model, say by a path integral, it is sufficient to work directly in the superfield formalism. Nevertheless, it is perhaps of interest to know whether one can introduce the component fields also in the noncommutative case. In the $N = 1$ case it turned out [6] that one could not do that for the simple reason that the $N = 1$ supersphere is not the $N = 0$ supersphere tensored with some algebra. In the $N = 2$ case the question is more subtle. One has to find variables “between” the circled and the uncircled ones, like $z, \bar z, b^\alpha, \bar b^\alpha$ in the commutative case, such that the $N = 2$ supersphere would be the product of the bosonic fuzzy sphere and the Clifford algebra in these intermediate variables. This is needed for being able to take the supertrace over the Clifford algebra separately and cast the action as the trace over the bosonic fuzzy sphere only. It is an open problem; personally we feel that it is not possible.

Another natural question concerns coordinate transformations on the Kähler target. Although for some simple manifolds (like complex projective spaces) one can completely define the Kähler potential working in one chart, one should anyway look for a more invariant definition of the theory. Of course, this problem concerns not only the theories on the noncommutative worldsheets but arises in general in the studies of quantum theory of the nonlinear σ-models.

References

1. Sevrin, A. and Troost, J.: Nucl. Phys. B492, 623 (1997) and references therein
2. Connes, A.: Noncommutative geometry. London: Academic Press, 1994
3. Berezin, F.: Commun. Math. Phys. 40, 153 (1975)
4. Hoppe, J.: MIT PhD thesis, 1982 and Elem. Part. Res. J. (Kyoto) 80, 145 (1989)
5. Madore, J.: J. Math. Phys. 32, 332 (1991) and Class. Quant. Grav. 9, 69 (1992)
6. Grosse, H., Klimčík, C. and Prešnajder, P.: Commun. Math. Phys. 185, 155 (1997)
7. Grosse, H., Klimčík, C. and Prešnajder, P.: Commun. Math. Phys. 178, 507 (1996)
8. Baez, S., Balachandran, A.P., Idri, B. and Vaidya, S.: Monopoles and solitons in fuzzy physics. hep-th/9811169
9. Klimčík, C.: Commun. Math. Phys. 199, 257 (1998)
10. Klimčík, C.: A nonperturbative regularization of the supersymmetric Schwinger model. hep-th/9903112
11. Scheunert, M., Nahm, W. and Rittenberg, V.: J. Math. Phys. 18, 155 (1977)
12. Green, M., Schwarz, J. and Witten, E.: Superstring theory. Cambridge: Cambridge University Press, 1997
13. Schwarz, A.S.: Symplectic formalism in conformal field theory. In: Quantum symmetries, Les Houches, Session LXIV, Eds. A. Connes, K. Gawędzki and J. Zinn-Justin, London: Elsevier Science B.V., 1998, p. 957
Communicated by A. Connes
Commun. Math. Phys. 206, 603 – 637 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
String Geometry and the Noncommutative Torus Giovanni Landi1,? , Fedele Lizzi2 , Richard J. Szabo3,?? 1 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street, Cambridge CB3 9EW, UK
2 Dipartimento di Scienze Fisiche, Università di Napoli Federico II and INFN, Sezione di Napoli, Mostra
d’Oltremare Pad. 20, 80125 Napoli, Italy. E-mail: [email protected]
3 Department of Physics – Theoretical Physics, University of Oxford, 1 Keble Road, Oxford OX1 3NP, UK
Received: 26 October 1998/ Accepted: 9 April 1999
Abstract: We construct a new gauge theory on a pair of d-dimensional noncommutative tori. The latter comes from an intimate relationship between the noncommutative geometry associated with a lattice vertex operator algebra A and the noncommutative torus. We show that the tachyon algebra of A is naturally isomorphic to a class of twisted modules representing quantum deformations of the algebra of functions on the torus. We construct the corresponding real spectral triples and determine their Morita equivalence classes using string duality arguments. These constructions yield simple proofs of the O(d, d; Z) Morita equivalences between d-dimensional noncommutative tori and give a natural physical interpretation of them in terms of the target space duality group of toroidally compactified string theory. We classify the automorphisms of the twisted modules and construct the most general gauge theory which is invariant under the automorphism group. We compute bosonic and fermionic actions associated with these gauge theories and show that they are explicitly duality-symmetric. The duality-invariant gauge theory is manifestly covariant but contains highly non-local interactions. We show that it also admits a new sort of particle-antiparticle duality which enables the construction of instanton field configurations in any dimension. The duality non-symmetric on-shell projection of the field theory is shown to coincide with the standard non-abelian Yang–Mills gauge theory minimally coupled to massive Dirac fermion fields. 1. Introduction The noncommutative torus [1]–[3] is one of the basic examples of a noncommutative geometry [4,5] which captures features of the difference between an ordinary manifold ? On leave from Dipartimento di Scienze Matematiche, Università di Trieste, P.le Europa 1, 34127, Trieste, Italy. E-mail: [email protected] Also INFN, Sezione di Napoli, Napoli, Italy. ?? 
Present address: The Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, 2100 Copenhagen Ø, Denmark. E-mail: [email protected]
604
G. Landi, F. Lizzi, R. J. Szabo
and a “noncommutative space”. Recent interest in such geometries has occurred in the physics literature in the context of their relation to M-Theory [6]–[9]. As shown in the seminal paper [6], the most general solutions to the quotient conditions for toroidal compactifications of Matrix Theory satisfy the algebraic relations of gauge connections on noncommutative tori. This has led to, among other things, new physical insights into the structure of the supergravity sector of M-Theory by relating geometrical parameters of the noncommutative torus to physical parameters of the Matrix Theory gauge theories. In this paper we shall discuss the role played by the noncommutative torus in the shortdistance structure of spacetime. In particular we shall construct a new duality-symmetric gauge theory on a pair of noncommutative tori. One way to describe the noncommutative torus is to promote the ordinary coordinates x i , i = 1, . . . , d, of a d-torus to non-commuting operators xˆ i acting on an infinitedimensional Hilbert space and obeying the commutation relations i h (1.1) xˆ i , xˆ j = 2π i `ij , where `ij are real numbers. Defining the basic plane waves Ui = exp i xˆ i ,
(1.2)
it follows from the Baker-Campbell-Hausdorff formula that these operators generate the algebra ij
Ui Uj = e2πi` Uj Ui
(1.3)
and these are the basic defining relations of the noncommutative torus, when viewed as an algebra of functions with generalized Fourier series expansions in terms of the plane waves (1.2). In the limit $\ell^{ij} \to 0$, one recovers the ordinary coordinates and the plane waves (1.2) become the usual ones of the torus. The antisymmetric matrix $\ell^{ij}$ can therefore be thought of as defining the Planck scale of a compactified spacetime. The noncommutative torus then realizes old ideas of quantum gravity that at distance scales below the Planck length the nature of spacetime geometry is modified. At large distances the usual (commutative) spacetime is recovered.

On the other hand, a viable model for Planck scale physics is string theory, and its various extensions to D-brane field theory [10]–[12] and Matrix Theory [13]. In these latter extensions, the short-distance noncommutativity of spacetime coordinates is represented by viewing them as N × N matrices. The noncommutativity of spacetime in this picture is the result of massless quantum excitations yielding bound states of D-branes with broken supersymmetry [10], or alternatively of the quantum fluctuations in topology of the worldsheets of the D-branes [12].

However, it is also possible to view the spacetime described by ordinary string theory more directly as a noncommutative geometry [14]–[17]. The main idea in this context is to substitute the (commutative) algebra of continuous complex-valued functions on spacetime with the (noncommutative) vertex operator algebra of the underlying conformal field theory. Ordinary spacetime can be recovered by noticing that the algebra of continuous functions can be thought of as a subalgebra of the vertex operator algebra. The fruitfulness of this approach is that the full vertex operator algebra is naturally invariant under target space duality transformations of the string theory. The simplest such mapping relates large and small radius circles to one another and leads directly to a fundamental length scale, usually the finite size of the string. Although the duality is a symmetry of the full noncommutative string spacetime, different classical spacetimes are identified under the transformation yielding a natural geometrical origin for these quantum symmetries of compactified string theory. This point of view therefore also describes physics at the Planck scale.

In this paper we will merge these two descriptions of the short-distance structure of spacetime as described by noncommutative geometry. We will show that a particular algebra obtained naturally from the tachyon algebra, which in turn is obtained by projecting out the string oscillatory modes, defines a twisted projective module over the noncommutative torus. This algebra is, in this sense, the smallest quantum deformation of ordinary, classical spacetime, and it represents the structure of spacetime at the Planck length. Thus strings compactified on a torus have a geometry which is already noncommutative at short distances. The remainder of the full vertex operator algebra acts to yield the non-trivial gauge transformations (including duality) of spacetime. This fact yields yet another interpretation for the noncommutativity of Planck scale spacetime. In string theory spacetime is a set of fields defined on a surface, and at short distances the interactions of the strings (described by the vertex operator algebra) cause the spacetime to become noncommutative. In the case of toroidally compactified string theory, spacetime at the Planck scale is a noncommutative torus. The results of this paper in this way merge the distinct noncommutative geometry formalisms for strings by connecting them all to the geometry of the noncommutative torus. At hand is therefore a unified setting for string theory in terms of (target space) D-brane field theory, Matrix Theory compactifications, and (worldsheet) vertex operator algebras.

Aside from these physically interesting consequences, we will show that the modules we obtain from the vertex operator algebra also bear a number of interesting mathematical characteristics.
Most notably, the duality symmetries of the vertex operator algebra lead to a simple proof of the Morita equivalences of noncommutative tori with deformation parameters $\ell^{ij}$ which are related by the natural action of the discrete group O(d, d; Z) on the space of real-valued antisymmetric d × d matrices. These Morita equivalences have been established recently using more direct mathematical constructions in [18]. In [19] it was shown that Matrix Theories compactified on Morita equivalent tori are physically equivalent to one another, in that the BPS spectra of states are the same and the associated field theories can be considered to be duals of each other. Here we shall find a direct manifestation of this duality equivalence in terms of the basic worldsheet theory itself. The relationship between vertex operator algebras, duality and the noncommutative torus was originally pointed out in [20] and discussed further in [21]. In a sense, this relationship shows that the duality properties of Morita equivalences in M-Theory are controlled by the stringy sector of the dynamics. Morita equivalent noncommutative tori have also been constructed in [22] under the name discrete Heisenberg–Weyl groups and via the action of the modular group. There it was also suggested that this is the base for the duality principles which appear in string theory and conformal field theory.

We will also show that the fairly complete classification of the automorphism group of the vertex operator algebra, given in [21], can be used to characterize the symmetries of the twisted modules that we find. This immediately leads us to the construction of an action functional for this particular noncommutative geometry which is naturally invariant under the automorphism group. Generally, the action functional in noncommutative geometry can be used to construct invariants of modules of the given algebra and it presents a natural geometrical origin for many physical theories, such as the standard model [23,24] and superstring theory [15]. The spectral action principle of noncommutative geometry naturally couples gravitational and particle interactions from a very simple geometric perspective [25,26]. In the following we shall construct both fermionic and bosonic actions for the twisted modules which possess the same properties as dictated by the spectral action principle. However, since we shall be concerned neither with coupling to gravity nor with renormalization effects, we shall use a somewhat simpler definition than that proposed in [26].

A consequence of the invariance of the action under automorphisms of the algebra is that the action is explicitly duality-symmetric. The construction of explicitly electric-magnetic duality symmetric action functionals has been of particular interest over the years [27,28] and they have had applications to the physics of black holes [29] and of D-branes [30,31]. These actions are of special interest now because of the deep relevance of duality symmetries to the spacetime structure of superstring theory within the unified framework of M-Theory.

Because the derivation of the duality-symmetric action involves quite a bit of mathematics, it is worthwhile to summarize briefly the final result here. We shall show that a general gauge theory on the twisted module leads naturally to a target space Lagrangian of the form

\[ \mathcal{L} = \left(F + {}^\star F\right)_{ij} \left(F + {}^\star F\right)^{ij} - i\,\bar{\psi}_*\,\gamma^i \left(\partial_i + i\,\overleftrightarrow{A}_i\right) \psi - i\,\bar{\psi}\,\gamma_i^* \left(\partial_*^i + i\,\overleftrightarrow{A}{}_*^i\right) \psi_*, \tag{1.4} \]

where

\[ F_{ij} = \partial_i A_j - \partial_j A_i + i\left[A_i, A_j\right] - g_{ik}\, g_{jl} \left( \partial_*^k A_*^l - \partial_*^l A_*^k + i\left[A_*^k, A_*^l\right] \right) \tag{1.5} \]

and $i, j = 1, \dots, d$. Here the field theory is defined on a Lorentzian spacetime with coordinates $(x^i, x_j^*) \in \mathbb{R}^d \times (\mathbb{R}^d)^*$, and $g_{ij}$ is the (flat) metric of $\mathbb{R}^d$ while the metric on the total spacetime is $(g_{ij}, -g_{ij})$. We have defined $\partial_i = \partial/\partial x^i$ and $\partial_*^j = \partial/\partial x_j^*$. The fields $A_i(x, x^*)$ and $A_*^j(x, x^*)$ are a dual pair of gauge fields, $\psi(x, x^*)$ and $\psi_*(x, x^*)$ are dual spinor fields, and $\gamma^i$ and $\gamma_i^*$ are Dirac matrices on $\mathbb{R}^d$ and $(\mathbb{R}^d)^*$, respectively. The field strength ${}^\star F_{ij}$ is a certain “dual” to the field strength $F_{ij}$ with respect to the Lorentzian metric of the spacetime. The bars on the fermion fields denote their “adjoints” and the double arrow on the gauge potentials in the fermionic part of the action denotes their left-right symmetric action on the fermion and anti-fermion fields (in a sense which we shall define more precisely in what follows). The commutators and actions of gauge potentials on fermion fields are defined using the noncommutative tachyon algebra structure, so that the action associated with (1.4) describes a certain nonabelian gauge theory coupled to fermions in a nontrivial representation of a gauge group (again this gauge group will be described more precisely in the following).

The action corresponding to (1.4) is explicitly invariant under the interchange of starred and un-starred quantities. This symmetry incorporates the T-duality transformation which inverts the metric $g_{ij}$, and it moreover contains a particle-antiparticle duality transformation $F_{ij} \leftrightarrow {}^\star F_{ij}$ that represents a certain topological instanton symmetry of the field theory in any dimension d. However, by its very construction, it is manifestly invariant under a much larger symmetry group, including the gauge group, which we shall describe in this paper. In this sense, we will see that the action functional corresponding to (1.4) measures the amount of duality symmetry as well as the strength of the string interactions present in the given spacetime theory. Like the usual formulations of electric-magnetic symmetric actions [27] (see also [20]), (1.4) involves an O(2, R) doublet of vector potentials $(A, A_*)$.
The crucial difference between (1.4) and the usual actions is that it is also manifestly covariant, without the need of introducing auxiliary fields [28]. This general covariance follows from the fact that the diffeomorphism symmetries of the spacetime are encoded in the tachyon algebra as internal gauge symmetries, so that the gauge invariance of the action automatically makes it covariant. As a consequence of this feature, the on-shell condition for the field strengths is different from those in the usual formulations. Here it corresponds to a dimensional reduction $\mathbb{R}^d \times (\mathbb{R}^d)^* \to \mathbb{R}^d$ in which the field theory becomes ordinary Yang–Mills theory minimally coupled to massive Dirac fermions. This reduction thus yields a geometrical origin for colour degrees of freedom and fermion mass generation. The field theory (1.4) can thereby be thought of as a first stringy extension of many physical models, such as the standard model and Matrix Theory.

As a conventional field theory, however, the Lagrangian (1.4) is highly non-local because it contains infinitely many orders of derivative interactions through the definition of the commutators (derived from the algebra (1.3)). This non-locality, and its origin as an associative noncommutative product, reflects the nature of the string interactions. Thus, the natural action functional associated with the noncommutative geometry of string theory not only yields an explicitly duality-symmetric (non-local) field theory, but it also suggests a sort of noncommutative Kaluza-Klein mechanism for the origin of nonabelian gauge degrees of freedom and particle masses. The explicit invariance of (1.4) under the duality group of the spacetime thereby yields a physical interpretation of the mathematical notion of Morita equivalence. This gauge theory on the noncommutative torus is different from those discussed in the context of Matrix Theory [6]–[9] but it shares many of their duality properties. The formalism of the present paper may thus be considered as a step towards the formulation of Matrix Theory in terms of the framework of spectral triples in noncommutative geometry [11].
It is a remarkable feature that such target space dynamics can be induced so naturally at the level of a worldsheet formalism.

The structure of the remainder of this paper is as follows. All ideas and results of noncommutative geometry which we use are briefly explained throughout the paper. In Sect. 2 we will briefly define the vertex operator algebra associated with toroidally compactified string theory. In Sect. 3 we study the tachyon algebra of the lattice vertex operator algebra and show that it defines a particular twisted module over the noncommutative torus. In Sect. 4 we introduce a set of spectral data appropriate to the noncommutative geometry of string theory. In Sect. 5 we exploit the duality symmetries of this noncommutative geometry to study some basic properties of the twisted module, including its Morita equivalence classes and its group of automorphisms. In Sect. 6 the most general gauge theory on the module is constructed, and in Sect. 7 that gauge theory is used to derive the duality-symmetric action functional. In Sect. 8 we describe some heuristic, physical aspects of these twisted modules along the lines described in this section. For completeness, an appendix at the end of the paper gives a brief overview of the definition and relevant mathematical significance of Morita equivalence in noncommutative geometry.

2. Lattice Vertex Operator Algebras

In this section we will briefly review, mainly to introduce notation, the definition of a lattice vertex operator algebra (see [17,32] and references therein for more details). Let $L$ be a free infinite discrete abelian group of rank d with Z-bilinear form $\langle \cdot\,, \cdot \rangle_L : L \times L \to \mathbb{R}_+$ which is symmetric and nondegenerate. Given a basis $\{e_i\}_{i=1}^d$ of $L$, the symmetric nondegenerate tensor

\[ g_{ij} = \langle e_i, e_j \rangle_L \tag{2.1} \]

defines a Euclidean metric on the flat d-dimensional torus $T_d \equiv \mathbb{R}^d / 2\pi L$. We can extend the inner product $\langle \cdot\,, \cdot \rangle_L$ to the complexification $L^c = L \otimes_{\mathbb{Z}} \mathbb{C}$ by C-linearity. The dual lattice to $L$ is then

\[ L^* = \left\{ p \in L^c \;\middle|\; \langle p, w \rangle_L \in \mathbb{Z} \ \ \forall\, w \in L \right\} \tag{2.2} \]

which is also a Euclidean lattice of rank d with bilinear form $g^{ij}$ inverse to (2.1) that defines a metric on the dual torus $T_d^* \equiv \mathbb{R}^d / 2\pi L^*$. Given the lattice $L$ and its dual (2.2), we can form the free abelian group

\[ \Lambda = L^* \oplus L. \tag{2.3} \]
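If $\{e^i\}_{i=1}^d$ is a basis of $L^*$ dual to a basis $\{e_i\}_{i=1}^d$ of $L$, then the chiral basis of $\Lambda$ is $\{e_\pm^i\}_{i=1}^d$, where

\[ e_\pm^i = \tfrac{1}{\sqrt{2}} \left( e^i \pm g^{ij} e_j \right) \tag{2.4} \]

with an implicit sum over repeated indices always understood. Given $p \in L^*$ and $w \in L$, we write the corresponding elements of $\Lambda$ with respect to the basis (2.4) as $p^\pm$ with components

\[ p_i^\pm = \tfrac{1}{\sqrt{2}} \left( p_i \pm \langle e_i, w \rangle_L \right). \tag{2.5} \]

Then with $q_i^\pm = \tfrac{1}{\sqrt{2}} ( q_i \pm \langle e_i, v \rangle_L )$, we can define a Z-bilinear form $\langle \cdot\,, \cdot \rangle_\Lambda : \Lambda \times \Lambda \to \mathbb{Z}$ by

\[ \langle p, q \rangle_\Lambda \equiv p_i^+ g^{ij} q_j^+ - p_i^- g^{ij} q_j^- = \langle p, v \rangle_L + \langle q, w \rangle_L, \tag{2.6} \]

and this makes $\Lambda$ an integral even self-dual Lorentzian lattice of rank 2d and signature (d, d) which is called the Narain lattice. Note that, in the chiral basis (2.4), the corresponding metric tensor is

\[ \eta_{\alpha\beta}^{ij} \equiv \left\langle e_\alpha^i, e_\beta^j \right\rangle_\Lambda = \begin{cases} \pm\, g^{ij}, & \alpha = \beta = \pm \\ 0, & \alpha \neq \beta \end{cases}. \tag{2.7} \]

The commutator map

\[ c_\Lambda(p^+, p^-; q^+, q^-) = e^{i\pi \langle q, w \rangle_L} \tag{2.8} \]

on $\Lambda^c \times \Lambda^c \to \mathbb{Z}_2$ is a two-cocycle of the group algebra $\mathbb{C}[\Lambda]$ generated by the vector space $\Lambda^c$, the complexification of $\Lambda$. This two-cocycle corresponds to a central extension $\widetilde{\Lambda}^c$ of $\Lambda^c$,

\[ 1 \to \mathbb{Z}_2 \to \widetilde{\Lambda}^c \to \Lambda^c \to 1, \tag{2.9} \]

where $\widetilde{\Lambda}^c = \mathbb{Z}_2 \times \Lambda^c$ as a set and the multiplication is given by

\[ \left( \rho\,; q^+, q^- \right) \cdot \left( \sigma\,; r^+, r^- \right) = \left( c_\Lambda(q^+, q^-; r^+, r^-)\,\rho\sigma\,;\; q^+ + r^+,\; q^- + r^- \right) \tag{2.10} \]

for $\rho, \sigma \in \mathbb{Z}_2$ and $(q^+, q^-), (r^+, r^-) \in \Lambda^c$. This can be used to define the twisted group algebra $\mathbb{C}\{\Lambda\}$ associated with the double cover $\widetilde{\Lambda}^c$ of $\Lambda^c$. A realization of $\mathbb{C}\{\Lambda\}$ in terms of closed string modes is given as follows. Viewing $L^c$ and $(L^*)^c$ as abelian Lie algebras of dimension d, we can consider the corresponding affine Lie algebras $\widehat{L}^c$ and $\widehat{(L^*)^c}$, and also the affinization $\widehat{\Lambda}^c = \widehat{(L^*)^c} \oplus \widehat{L}^c$. In the basis (2.4), we then have

\[ \widehat{\Lambda}^c \cong \widehat{u(1)}{}_+^d \oplus \widehat{u(1)}{}_-^d, \tag{2.11} \]

where the generators $\alpha_n^{(\pm)i}$, $n \in \mathbb{Z}$, of $\widehat{u(1)}{}_\pm^d$ satisfy the Heisenberg algebra

\[ \left[ \alpha_n^{(\pm)i}, \alpha_m^{(\pm)j} \right] = n\, g^{ij}\, \delta_{n+m,0}. \tag{2.12} \]

The basic operators of interest to us are the chiral Fubini-Veneziano fields

\[ X_\pm^i(z_\pm) = x_\pm^i + i\, g^{ij} p_j^\pm \log z_\pm + \sum_{n \neq 0} \frac{1}{i n}\, \alpha_n^{(\pm)i}\, z_\pm^{-n} \tag{2.13} \]

and the chiral Heisenberg fields

\[ \alpha_\pm^i(z_\pm) = -i\, \partial_{z_\pm} X_\pm^i(z_\pm) = \sum_{n=-\infty}^{\infty} \alpha_n^{(\pm)i}\, z_\pm^{-n-1}\,; \qquad \alpha_0^{(\pm)i} \equiv g^{ij} p_j^\pm, \tag{2.14} \]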
where $z_\pm \in \mathbb{C} \cup \{\infty\}$. Classically, the fields (2.13) are maps from the Riemann sphere into the torus $T_d$, with $x_\pm \in T_d$. When they are interpreted as classical string embedding functions from a cylindrical worldsheet into a toroidal target space, the first two terms in (2.13) represent the center of mass (zero mode) motion of a closed string while the Laurent series represents its oscillatory (vibrational) modes. The $w^i$ in (2.5) represent the winding modes of the string about each of the cycles of $T_d$, while the fields (2.14) are the conserved currents which generate infinitesimal reparametrizations of the torus. Upon canonical quantization, the oscillatory modes satisfy (2.12) and the zero modes form a canonically conjugate pair,

\[ \left[ x_\alpha^i, p_j^\beta \right] = i\, \delta_{\alpha\beta}\, \delta^i_j, \tag{2.15} \]

where $\alpha, \beta = \pm$. Then the multiplication operators

\[ \varepsilon_{q^+ q^-}(p^+, p^-) \equiv e^{-i q_i^+ x_+^i - i q_i^- x_-^i}\; c_\Lambda(p^+, p^-; q^+, q^-) \tag{2.16} \]

generate the twisted group algebra $\mathbb{C}\{\Lambda\}$. The fields $X_\pm$ now become quantum operators which act formally on the infinite-dimensional Hilbert space,

\[ \mathcal{H} = L^2\!\left( T_d \times T_d^*\,,\ \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2} \right) \otimes \mathcal{F}^+ \otimes \mathcal{F}^-, \tag{2.17} \]

where $x^i \equiv \tfrac{1}{\sqrt{2}} (x_+^i + x_-^i)$ and $x_i^* \equiv \tfrac{1}{\sqrt{2}}\, g_{ij} (x_+^j - x_-^j)$ define, respectively, local coordinates on the torus $T_d$ and on the dual torus $T_d^*$. The $L^2$ space in (2.17) is generated by the canonical pairs (2.15) of zero modes and is subject to the various natural isomorphisms

\[ \begin{aligned} L^2\!\left( T_d \times T_d^*\,,\ \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2} \right) &\cong L^2\!\left( T_d\,,\ \prod_{i=1}^d \frac{dx^i}{2\pi} \right) \otimes_{\mathbb{C}} L^2\!\left( T_d^*\,,\ \prod_{i=1}^d \frac{dx_i^*}{2\pi} \right) \\ &\cong \bigoplus_{w \in L} L^2\!\left( T_d\,,\ \prod_{i=1}^d \frac{dx^i}{2\pi} \right) \\ &\cong \bigoplus_{p \in L^*} L^2\!\left( T_d^*\,,\ \prod_{i=1}^d \frac{dx_i^*}{2\pi} \right). \end{aligned} \tag{2.18} \]
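The Hilbert space (2.17) is thus a module for the group algebras $\mathbb{C}[L]$ and $\mathbb{C}[L^*]$, and also for $\mathbb{C}\{\Lambda\}$. The dense subspace $C^\infty(T_d \times T_d^*) \subset L^2\big( T_d \times T_d^*, \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2} \big)$ of smooth complex-valued functions on $T_d \times T_d^*$ is a unital ∗-algebra which is spanned by the eigenstates $|q, v\rangle = |q^+, q^-\rangle = e^{-i q_i x^i - i v^i x_i^*}$ of the operators $-i\, \frac{\partial}{\partial x^i}$ and $-i\, \frac{\partial}{\partial x_i^*}$ on $T_d \times T_d^*$, where $q \in L^*$ and $v \in L$. The spaces $\mathcal{F}^\pm$ are bosonic Fock spaces generated by the oscillatory modes $\alpha_n^{(\pm)i}$ which act as annihilation operators for $n > 0$ and as creation operators for $n < 0$ on some vacuum states $|0\rangle_\pm$.

The basic single-valued quantum fields which act on the Hilbert space (2.17) are the chiral tachyon operators

\[ V_{q^\pm}(z_\pm) = {}^\circ_\circ\; e^{-i q_i^\pm X_\pm^i(z_\pm)}\; {}^\circ_\circ\,, \tag{2.19} \]

where $(q^+, q^-) \in \Lambda$ and ${}^\circ_\circ \cdot {}^\circ_\circ$ denotes the Wick normal ordering defined by reordering the operators (if necessary) so that all $\alpha_n^{(\pm)i}$, $n < 0$, and $x_\pm^i$ occur to the left of all $\alpha_n^{(\pm)i}$, $n > 0$, and $p_i^\pm$. The twisted dual $\widetilde{\mathcal{H}}^*$ of the Hilbert space (2.17) is spanned by operators of the form

\[ \Psi = \varepsilon_{q^+ q^-}(p^+, p^-) \otimes \prod_k \alpha_{-n_k}^{(+)i_k} \otimes \prod_l \alpha_{-m_l}^{(-)j_l}. \tag{2.20} \]

To (2.20) we associate the vertex operator

\[ V(\Psi; z_+, z_-) = {}^\circ_\circ\; V_{q^+ q^-}(z_+, z_-) \prod_k \frac{1}{(n_k - 1)!}\, \partial_{z_+}^{\,n_k - 1} \alpha_+^{i_k}(z_+) \prod_l \frac{1}{(m_l - 1)!}\, \partial_{z_-}^{\,m_l - 1} \alpha_-^{j_l}(z_-)\; {}^\circ_\circ\,, \tag{2.21} \]

where

\[ \begin{aligned} V_{q^+ q^-}(z_+, z_-) &\equiv V\!\left( \varepsilon_{q^+ q^-}(p^+, p^-) \otimes I\,; z_+, z_- \right) \\ &= c_\Lambda(p^+, p^-; q^+, q^-)\; {}^\circ_\circ\, V_{q^+}(z_+)\, V_{q^-}(z_-)\, {}^\circ_\circ \\ &= e^{i\pi \langle q, w \rangle_L}\; {}^\circ_\circ\, e^{-i q_i^+ X_+^i(z_+) - i q_i^- X_-^i(z_-)}\, {}^\circ_\circ \end{aligned} \tag{2.22} \]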
are the basic tachyon vertex operators. This gives a well-defined linear map $\Psi \mapsto V(\Psi; z_+, z_-)$ on $\widetilde{\mathcal{H}}^* \to (\mathrm{End}\, \mathcal{H})[z_+^{\pm 1}, z_-^{\pm 1}]$, the latter being the space of endomorphism-valued Laurent polynomials in the variables $z_\pm$. This mapping is known as the operator-state correspondence. With an appropriate regularization (see the next section), the vertex operators (2.21) yield well-defined and densely-defined operators acting on $\mathcal{H}$. They generate a noncommutative unital ∗-algebra $\mathcal{A}$ with the usual Hermitian conjugation and operator norm (defined on an appropriate dense domain of bounded operators in $\mathcal{A}$). The various algebraic properties of $\mathcal{A}$ can be found in [17,32], for example. It has the formal mathematical structure of a vertex operator algebra. In this paper we shall not discuss these generic properties of $\mathcal{A}$, but will instead analyse in detail a particular “subalgebra” of it which possesses some remarkable properties.

It is important to note that $\mathcal{A} \cong \widetilde{\mathcal{E}}_\Lambda$ is the $\mathbb{Z}_2$-twist of the algebra $\mathcal{E}_\Lambda = \mathcal{E}_+ \otimes_{\mathbb{C}} \mathcal{E}_-$, where $\mathcal{E}_\pm$ are the chiral algebras generated by the operators (2.19). The lattice $L^*$ itself yields the structure of a vertex operator algebra $\mathcal{E}_{L^*}$ without any reference to its dual lattice or the Narain lattice with its chiral sectors. Indeed, given a basis $\{e^i\}_{i=1}^d$ of $L^*$ and two-cocycles $c_L(p, q) = e^{i\pi \langle q, p \rangle_{L^*}}$ corresponding to an $S^1$ covering group of the dual lattice, the operators

\[ \widetilde{V}_q(z) = e^{i\pi \langle q, p \rangle_{L^*}}\; {}^\circ_\circ\, e^{-i q_i X^i(z)}\, {}^\circ_\circ \tag{2.23} \]

generate a vertex operator algebra, acting on the Hilbert space $L^2\big( T_d^*, \prod_{i=1}^d \frac{dx_i^*}{2\pi} \big) \otimes \mathcal{F}$, in an analogous way that the operators (2.22) do [32]. Similarly one defines a vertex operator algebra $\mathcal{E}_L$ associated with the lattice $L$. Modulo the twisting factors, the algebras $\mathcal{A}$ and $\mathcal{E}_{L^*} \otimes_{\mathbb{C}} \mathcal{E}_L$ are isomorphic since the latter representation comes from changing basis on $\Lambda$ from the chiral one (2.4) to the canonical one $\{e^i\}_{i=1}^d \oplus \{e_j\}_{j=1}^d$.

Physically, the difference between working with the full chirally symmetric algebra $\mathcal{A}$ and only the chiral ones or $\mathcal{E}_{L^*}$, $\mathcal{E}_L$ is that the former represents the algebra of observables of closed strings while the latter ones are each associated with open strings. Furthermore, the algebra $\mathcal{A}$ is that which encodes the duality symmetries of spacetime and maps the various open string algebras among each other via duality transformations. Much of what is said in the following will therefore not only apply as symmetries of a closed string theory, but also as mappings among various open string theories. It is in this way that these results are applicable to D-brane physics and M-Theory.

3. Tachyon Algebras and Twisted Modules over the Noncommutative Torus

The operator product algebra of the chiral tachyon operators (2.19) can be evaluated using standard normal ordering properties to give [33]

\[ V_{q^\pm}(z_\pm)\, V_{r^\pm}(z_\pm') = \left( z_\pm - z_\pm' \right)^{q_i^\pm g^{ij} r_j^\pm}\; {}^\circ_\circ\, V_{q^\pm}(z_\pm)\, V_{r^\pm}(z_\pm')\, {}^\circ_\circ \tag{3.1} \]

which leads to the operator product

\[ V_{q^+ q^-}(z_+, z_-)\, V_{r^+ r^-}(z_+', z_-') = \left( z_+ - z_+' \right)^{q_i^+ g^{ij} r_j^+} \left( z_- - z_-' \right)^{q_i^- g^{ij} r_j^-} \times\; {}^\circ_\circ\, V_{q^+ q^-}(z_+, z_-)\, V_{r^+ r^-}(z_+', z_-')\, {}^\circ_\circ \tag{3.2} \]
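of the full vertex operators (2.22). They are the defining characteristics of the vertex operator algebra which in the mathematics literature are collectively referred to as the Jacobi identity of $\mathcal{A}$ [17,32]. The product of a tachyon operator with its Hermitian conjugate is given by

\[ V_{q^\pm}(z_\pm)\, V_{q^\pm}(z_\pm')^\dagger = \left( \frac{z_\pm}{z_\pm'} \right)^{p_i^\pm g^{ij} q_j^\pm} \left( 1 - \frac{z_\pm'}{z_\pm} \right)^{q_i^\pm g^{ij} q_j^\pm} \prod_{n>0} e^{\, q_i^\pm \frac{\alpha_{-n}^{(\pm)i}}{n} \left( z_\pm^{\,n} - z_\pm'^{\,n} \right)} \times \prod_{n>0} e^{\, q_i^\pm \frac{\alpha_n^{(\pm)i}}{n} \left( z_\pm^{-n} - z_\pm'^{-n} \right)}. \tag{3.3} \]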
The tachyon vertex operators thus form a ∗-algebra which, according to the operator-state correspondence of the previous section, is associated with the subspace $\mathcal{H}_0 \subset \mathcal{H}$ defined by

\[ \mathcal{H}_0 \equiv \bigotimes_{n>0} \bigotimes_{i=1}^d \ker \alpha_n^{(+)i} \otimes \ker \alpha_n^{(-)i} \cong L^2\!\left( T_d \times T_d^*\,,\ \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2} \right). \tag{3.4} \]

Algebraically, $\mathcal{H}_0$ is the subspace of highest weight vectors for the representation of $\widehat{u(1)}{}_+^d \oplus \widehat{u(1)}{}_-^d$ on the $L^2$ space, and it is spanned by the vectors $|q^+; q^-\rangle$. The irreducible (highest weight) representations of this current algebra are labelled by the $u(1)_\pm^d$ charges $q_i^\pm$, and $\mathcal{H}_0$ carries a representation of $u(1)_+^d \oplus u(1)_-^d$ given by the actions of $\alpha_0^{(\pm)i}$. Then,

\[ \mathcal{H} = \bigoplus_{(q^+, q^-) \in \Lambda} \mathcal{V}_{q^+ q^-} \otimes \mathcal{F}^+ \otimes \mathcal{F}^-, \tag{3.5} \]

where $q_i^\pm = \tfrac{1}{\sqrt{2}} ( q_i \pm \langle e_i, v \rangle_L )$ and $\mathcal{V}_{q^+ q^-} \cong \mathcal{V}_q \otimes \mathcal{V}_v$ are the irreducible $u(1)_+^d \oplus u(1)_-^d$ modules for the representation of the Kac-Moody algebra on $L^2\big( T_d \times T_d^*, \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2} \big)$. In this sense, the tachyon operators generate the full vertex operator algebra $\mathcal{A}$, and the algebraic relations between any set of vertex operators (2.21) can be deduced from the operator product formula (3.2) [17]. In the following we will therefore focus on the tachyon sector of the vertex operator algebra. If $P_0 : \mathcal{H} \to \mathcal{H}_0$ is the orthogonal projection onto the subspace (3.4), then the low energy tachyon algebra is defined to be

\[ \mathcal{A}_0 = P_0\, \mathcal{A}\, P_0. \tag{3.6} \]
If we consider only the zero mode part of the tachyon operators (by projecting out the oscillatory modes) then the corresponding algebra generators are given by

\[ P_0\, V_{q^+ q^-}(z_+, z_-)\, P_0 = K(z_+, z_-)\, e^{i\pi \langle q, w \rangle_L}\, e^{-i q_i x^i - i v^i x_i^*} = K(z_+, z_-)\, \varepsilon_{q^+ q^-}(p^+, p^-) \tag{3.7} \]

with $K(z_+, z_-)$ a worldsheet dependent normalization equal to unity at $z_+ = z_- = 1$. The generators of $\mathcal{A}_0$ coincide with the generators (2.16) of $\mathbb{C}\{\Lambda\}$. From the multiplication property

\[ \varepsilon_{q^+ q^-}(p^+, p^-)\, \varepsilon_{r^+ r^-}(p^+, p^-) = c_\Lambda(q^+, q^-; r^+, r^-)\, \varepsilon_{(q^+ + r^+)(q^- + r^-)}(p^+, p^-) \tag{3.8} \]

we find the clock algebra

\[ \varepsilon_{q^+ q^-}(p^+, p^-)\, \varepsilon_{r^+ r^-}(p^+, p^-) = e^{i\pi \gamma_\Lambda(q^+, q^-; r^+, r^-)}\, \varepsilon_{r^+ r^-}(p^+, p^-)\, \varepsilon_{q^+ q^-}(p^+, p^-), \tag{3.9} \]

where

\[ \gamma_\Lambda(q^+, q^-; r^+, r^-) = \langle r, v \rangle_L - \langle q, w \rangle_L \tag{3.10} \]

is a two-cocycle of $\Lambda^c$. This sector represents the extreme low energy of the string theory, in which the oscillator modes are all turned off. In this limit the geometry is a “commutative” one and the physical theory is that of a point particle theory. All stringy effects have been eliminated. In the following we will attempt to go beyond this limit.

The expressions (3.1)–(3.3) are singular at $z_\pm = z_\pm'$ and are therefore to be understood only as formal relationships valid away from coinciding positions of the operators on $\mathbb{C} \cup \{\infty\}$. Moreover the vertex operators, seen as operators on the Hilbert space, are in general not bounded, and to define a C∗-algebra one usually “smears” them¹. Nevertheless the ultraviolet divergence related to the product of the operators at coinciding points can be cured by introducing a cutoff on the worldsheet, a common practice in conformal field theory. We will implement this cutoff by considering a “truncated” algebra obtained by considering only a finite number of oscillators. With this device we solve the unboundedness problem as well. We define

\[ V_{q^\pm}(z_\pm) = \prod_{n \geq 0} W_n^\pm(z_\pm), \tag{3.11} \]

where $W_0^\pm$ contains the zero modes $x_\pm$ and $p^\pm$, while the $W_n^\pm$'s for $n \neq 0$ involve only the n-th oscillator modes $\alpha_n^{(\pm)i}$ and $\alpha_{-n}^{(\pm)i}$. The truncated vertex operator is then defined as

\[ V_{q^\pm}^N(z_\pm) = \mathcal{N}_N^\pm \prod_{n=0}^N W_n^\pm(z_\pm). \tag{3.12} \]

The quantity $\mathcal{N}_N^\pm$ is a normalization constant which we choose to be

\[ \mathcal{N}_N^\pm = \prod_{n=1}^N e^{-q_i^\pm g^{ij} q_j^\pm / n}. \]
With this normalization the operators (3.12) are unitary. There is now no problem in multiplying operators at coinciding points, but the relation (3.1) will change and be valid only in the limit $N \to \infty$.

The operator product formula (3.1) (and its modification in the finite N case) implies a cocycle relation among the operators $V^N_{e_\pm^i}$. Interchanging the order of the two operators and using standard tricks of operator product expansions [33], it follows that the truncated algebra is generated by the elements $V^N_{e_\pm^i}(z_i^\pm)$ subject to the relations

\[ V^N_{e_\pm^i}(z_i^\pm)\, V^N_{e_\mp^j}(z_j^\mp) = V^N_{e_\mp^j}(z_j^\mp)\, V^N_{e_\pm^i}(z_i^\pm), \tag{3.13} \]

\[ V^N_{e_\pm^i}(z_i^\pm)\, V^N_{e_\pm^i}(z_i^\pm)^\dagger = V^N_{e_\pm^i}(z_i^\pm)^\dagger\, V^N_{e_\pm^i}(z_i^\pm) = I, \tag{3.14} \]

\[ V^N_{e_\pm^i}(z_i^\pm)\, V^N_{e_\pm^j}(z_j^\pm) = e^{2\pi i\, \omega_{N\pm}^{ij}}\; V^N_{e_\pm^j}(z_j^\pm)\, V^N_{e_\pm^i}(z_i^\pm), \qquad i \neq j, \tag{3.15} \]

where † is the ∗-involution on $\mathcal{A}$, the $z_i^\pm$ are distinct points, and

\[ \omega_{N\pm}^{ij} = \pm\, g^{ij} \left[ \log\!\left( \frac{z_i^\pm}{z_j^\pm} \right) - \sum_{n=1}^N \frac{1}{n} \left( \left( \frac{z_j^\pm}{z_i^\pm} \right)^{\!n} - \left( \frac{z_i^\pm}{z_j^\pm} \right)^{\!n} \right) \right]. \tag{3.16} \]
The mutually commuting pair of (identical) algebras (3.14),(3.15) consists of two copies $\mathcal{A}_+^{(\omega_N)} \cong \mathcal{A}_-^{(\omega_N)} \cong \mathcal{A}^{(\omega_N)}$ of the algebra of the noncommutative d-torus $T^d_{\omega_N}$. Choosing an appropriate closure, the algebra $\mathcal{A}^{(\omega_N)}$ can be identified with the abstract unital ∗-algebra generated by elements $U_i$, $i = 1, \dots, d$, subject to the cocycle relations (3.14),(3.15),

\[ U_i U_i^\dagger = U_i^\dagger U_i = I, \tag{3.17} \]

\[ U_i U_j = e^{2\pi i\, \omega_N^{ij}}\; U_j U_i, \qquad i \neq j. \tag{3.18} \]

¹ For issues related to the boundedness of vertex operators see [34].
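For orientation, relations of the form (3.17),(3.18) can be realized concretely in the simplest special case. The sketch below is not the construction used in the paper: it assumes d = 2 and a rational deformation parameter m/n, for which the standard n × n "clock" and "shift" matrices are unitary and commute up to the phase exp(2πi m/n). All function names are illustrative.

```python
import cmath


def clock_shift(n, m=1):
    """Return (clock, shift, q) with clock*shift = q*shift*clock, q = exp(2*pi*i*m/n)."""
    q = cmath.exp(2j * cmath.pi * m / n)
    # clock is diagonal with entries q^k; shift sends basis vector l to l+1 (mod n).
    clock = [[q ** k if k == l else 0j for l in range(n)] for k in range(n)]
    shift = [[1 + 0j if k == (l + 1) % n else 0j for l in range(n)] for k in range(n)]
    return clock, shift, q


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Hermitian conjugate (conjugate transpose)."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    U1, U2, q = clock_shift(5, 2)
    lhs = matmul(U1, U2)
    rhs = [[q * x for x in row] for row in matmul(U2, U1)]
    print("commutation defect:", max_abs_diff(lhs, rhs))
```

Irrational deformation parameters admit no finite-dimensional realization of this kind, which is why the general algebra is defined abstractly by generators and relations.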
The presentation of a generic “smooth” element of $A^{(\omega_N)}$ in terms of the $U_i$ is

$$f = \sum_{p \in L^*} f_p\, U_1^{p_1} U_2^{p_2} \cdots U_d^{p_d}, \qquad f_p \in S(L^*), \qquad (3.19)$$

where $S(L^*)$ is a Schwartz space of rapidly decreasing sequences. When $\omega_N = 0$, one can identify the operator $U_i$ with the multiplication by $U_i = e^{-i x^i}$ on the ordinary torus $T_d$, so that $A^{(0)} \cong C^\infty(T_d)$, the algebra of smooth complex-valued functions on $T_d$. In this case the expansion (3.19) reduces to the usual Fourier series expansion $f(x) = \sum_{p \in L^*} f_p\, e^{-i p_i x^i}$. For a general non-vanishing antisymmetric bilinear form $\omega_N$, we shall think of the algebra $A^{(\omega_N)}$ as a quantum deformation of the algebra $C^\infty(T_d)$. While the ∗-involution is the usual complex conjugation $f^\dagger(x) = \overline{f(x)}$, given $f, g \in C^\infty(T_d)$ their deformed product is defined to be

$$(f \star_{\omega_N} g)(x) = \exp\!\left( i\pi\, \omega_N^{ij}\, \frac{\partial}{\partial x^i} \frac{\partial}{\partial x'^j} \right) f(x)\, g(x') \Big|_{x' = x}. \qquad (3.20)$$
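On finite Fourier series the deformed product (3.20) becomes a twisted convolution of coefficients. The sketch below is an illustration under stated assumptions: a function is stored as a dictionary mapping an integer momentum tuple p to its coefficient $f_p$, and, taking (3.20) with the convention $e^{-ip_i x^i}$ at face value, two modes p and q pick up the phase $\exp(-i\pi\,\omega^{ij} p_i q_j)$. The names `star`, `zero_mode` and the sample data are hypothetical, and the overall sign of the phase depends on conventions.

```python
import cmath

def star(f, g, omega):
    """Deformed product of two finite Fourier series {p: coeff}.

    Each mode pair (p, q) contributes to mode p + q with the phase
    exp(-i pi omega^{ij} p_i q_j), read off from (3.20).
    """
    d = len(omega)
    out = {}
    for p, fp in f.items():
        for q, gq in g.items():
            pairing = sum(omega[i][j] * p[i] * q[j]
                          for i in range(d) for j in range(d))
            r = tuple(pi + qi for pi, qi in zip(p, q))
            out[r] = out.get(r, 0j) + fp * gq * cmath.exp(-1j * cmath.pi * pairing)
    return out

def zero_mode(f, d):
    """Coefficient f_0 of the constant mode."""
    return f.get((0,) * d, 0j)

def is_close(f, g, tol=1e-12):
    return all(abs(f.get(k, 0j) - g.get(k, 0j)) < tol for k in set(f) | set(g))

# Sample antisymmetric deformation matrix and test functions (arbitrary values).
omega = [[0.0, 0.3], [-0.3, 0.0]]
U1 = {(1, 0): 1 + 0j}
U2 = {(0, 1): 1 + 0j}
f = {(1, 0): 2 + 1j, (0, -1): 0.5j, (2, 1): -1 + 0j}
g = {(-1, 0): 1 - 1j, (0, 1): 3 + 0j}
h = {(1, 1): 1j, (0, 0): 2 + 0j}

print(is_close(star(star(f, g, omega), h, omega),
               star(f, star(g, h, omega), omega)))
```

Because the phase is bilinear in the momenta the product is associative, and because ω is antisymmetric the zero mode of $f \star g$ equals that of $g \star f$, which is the cyclicity expected of a trace.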
Furthermore, the unique normalized trace $\tau : A^{(\omega_N)} \to \mathbb{C}$ is represented by the classical average [35]

$$\tau(f) = \int_{T_d} \prod_{i=1}^{d} \frac{dx^i}{2\pi}\, f(x), \qquad (3.21)$$

which is equivalent to projecting onto the zero mode $f_0$ in the Fourier series expansion (3.19), $\tau(f) = f_0$. This trace will be used later on to construct a duality-invariant gauge theory. Once we have established the connection (at finite $N$) between the Vertex Operator Algebra and the Noncommutative Torus we can then take the limit $N \to \infty$. Then, the cocycle $\omega_{N\pm}$ defined in (3.16) converges to

$$\omega_\pm^{ij} = \pm\, g^{ij}\, \mathrm{sgn}\!\left(\arg z_{\pm i} - \arg z_{\pm j}\right), \quad i \neq j. \qquad (3.22)$$

With $\omega_\pm^{ii} \equiv 0$ we obtain two-cocycles. We have chosen the branch on $\mathbb{C} \cup \{\infty\}$ of the logarithm function for which the imaginary part of $\log(-1)$ lies in the interval $[-i\pi, +i\pi]$. The cocycles $\omega_\pm$ depend only on the relative orientations of the phases of the given tachyon operators. If we order these phases so that $\pm \arg z_{\pm 1} > \pm \arg z_{\pm 2} > \cdots > \pm \arg z_{\pm d}$, then $\omega_\pm^{ij} = \omega^{ij} \equiv \mathrm{sgn}(j - i)\, g^{ij}$ for $i \neq j$. This choice is unique up to a permutation in $S_{2d}$ of the coordinate directions of $T_d \times T_d^*$. Thus we obtain the noncommutative torus characterized by $\omega$ defined in (3.22). We will indicate with $A^{(\omega)}$ the algebra obtained in this fashion and will consider this algebra to act on the tachyon Hilbert space, which is motivated by the fact that up to Planckian energies only the tachyonic states are excited. Of course at higher
energies also the oscillatory (Fock space) modes of the Hilbert space will have to be considered. The algebra $A^{(\omega)}$ that we are considering can also be seen as a deformation of $A_0$ defined in (3.6), in which the product of the elements has to be made in the full algebra, and then the result is to be projected on the tachyon Hilbert space. Namely, given $V_0 = P_0 V P_0$, $W_0 = P_0 W P_0 \in A_0$, the operator product expansion with the oscillators gives non-trivial relations among the tachyons which we identify with the usual relations of the noncommutative torus, so that we can define a deformed product by

$$V_0 * W_0 = P_0\, (V W)\, P_0. \qquad (3.23)$$
This is a very natural way to deform the tachyonic algebra $A_0$ which takes the presence of oscillator modes into account, the projection operators being positioned in such a way as to ensure that all oscillatory modes contribute to the product. The ∗-algebra $A^{(\omega)}$ therefore defines a module for the noncommutative torus which possesses some remarkable properties that distinguish it from the usual modules for $T^d_\omega$. Although it is related to the projective regular representation of the twisted group algebra $\mathbb{C}\{\Lambda\}$ of the Narain lattice, this is not the algebraic feature which determines the cocycle relation (3.15). The clock algebra (3.9) is very different from the algebra defined by (3.15) which arises from the operator product (3.1). The latter algebra is actually associated with the twisted chiral operators $V_{e^i_\pm}$, although their chiral-antichiral products do not yield the $\mathbb{Z}_2$-twist of the tachyon vertex operators (3.7). Thus the product of the two algebras in (3.13)–(3.15) has deformation matrix associated with the bilinear form (2.7). However, the minus sign which appears in the antichiral sector is irrelevant since $T^d_{-\omega} \cong T^d_\omega$ by a relabeling of the generators $U_i$. The algebra $A^{(\omega)}$ thus defines a 2d-dimensional $\mathbb{Z}_2$-twisted module $\tilde T^d_\omega$ over the noncommutative torus, where $\tilde T^d_\omega$ carries a double representation $T_\omega^{(+)d} \times T_\omega^{(-)d}$ of $T^d_\omega$. The non-trivial $\mathbb{Z}_2$-twist of this module is important from the point of view of the noncommutative geometry of the string spacetime. Note that its algebra is defined by computing the operator product relations in (3.13)–(3.15) in the full vertex operator algebra A, and then afterwards projecting onto the tachyon sector. Otherwise we arrive at the (trivial) clock algebra, so that within the definition of $\tilde T^d_\omega$ there is a particular ordering that must be carefully taken into account.
This module is therefore quite different from the usual projective modules over the noncommutative torus [2,9], and we shall exploit this fact dramatically in what follows. Note that if we described A using instead the canonical basis $\{e^i\}_{i=1}^d \oplus \{e_j\}_{j=1}^d$ of the Narain lattice $\Lambda$, i.e. taking as generators for A twisted products of operators of the form (2.23) and their duals, then we would have arrived at a twisted representation of $T^d_\omega \times T^d_{\omega^{-1}}$ associated with quantum deformations of the d-torus and its dual. That $T^d_{\omega^{-1}} \cong T^d_\omega$ will be a consequence of a set of Morita equivalences we shall prove in Sect. 5. In particular, this observation shows that for any vertex operator algebra associated with a positive-definite lattice L of rank d, there corresponds a module over the d-dimensional noncommutative torus $T^d_\omega$, with deformation matrix $\omega$ given as above in terms of the bilinear form of L, which is determined by an $S^1$-twisted projective regular representation of the group algebra $\mathbb{C}[L]$. We can summarize these results in the following

Proposition 1. Let L be a positive-definite lattice of rank d with bilinear form $g_{ij}$, and let A be the vertex operator algebra associated with L. Then the algebra $A^{(\omega)} \subset A$ defines a unitary equivalence class of finitely-generated self-dual $\mathbb{Z}_2$-twisted projective modules $\tilde T^d_\omega$ of the double noncommutative torus $T_\omega^{(+)d} \times T_\omega^{(-)d}$ with generators $U_i = V_{e^i_\pm}$ and antisymmetric deformation matrix

$$\omega^{ij} = \begin{cases} \mathrm{sgn}(j - i)\, g^{ij}, & i \neq j, \\ 0, & i = j. \end{cases}$$
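The deformation matrix of Proposition 1 is straightforward to build from the lattice metric. The sketch below assumes the metric is supplied as a nonsingular matrix $g_{ij}$ with rational entries and that $g^{ij}$ denotes its inverse; the helper names `inverse` and `deformation_matrix` and the sample metric are illustrative and do not come from the text.

```python
from fractions import Fraction

def inverse(m):
    """Gauss-Jordan inverse over the rationals (m is assumed nonsingular)."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def sgn(x):
    return (x > 0) - (x < 0)

def deformation_matrix(g):
    """omega^{ij} = sgn(j - i) g^{ij} for i != j, and 0 on the diagonal."""
    g_inv = inverse(g)
    d = len(g)
    return [[sgn(j - i) * g_inv[i][j] for j in range(d)] for i in range(d)]

# Sample rank-2 metric (an arbitrary positive-definite choice).
g = [[2, 1], [1, 2]]
omega = deformation_matrix(g)
print(omega)
```

Since $g^{ij}$ is symmetric, the sign factor makes the result antisymmetric automatically, as the proposition requires.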
As mentioned in Sect. 1, there is a nice heuristic interpretation of the noncommutativity of $T^d_\omega$ in the present case. The ordinary torus $T_d$ is formally obtained from $T^d_\omega$ by eliminating the deformation parameter matrix, $\omega \to 0$. In the case at hand, $\omega \sim g^{-1}$, so that formally we let the metric $g \to \infty$ become very large. As all distances in this formalism are evaluated in terms of a fundamental length which for simplicity we have set equal to 1, this simply means that at distance scales larger than this unit of length (which is usually identified with the Planck length), we recover ordinary, classical spacetime $T_d$. Thus the classical limit of this quantum deformation of the Lie algebra $u(1)^d_+ \oplus u(1)^d_-$ coincides with the decompactification limit in which the dual coordinates $x^*$, representing the windings of the string around the spacetime, delocalize and become unobservable. At very short distances ($g \to 0$), spacetime becomes a noncommutative manifold with all of the exotic duality symmetries implied by string theory. This representation of the noncommutative torus thus realizes old ideas in string theory about the nature of spacetime below the fundamental length scale $l_s$ determined by the finite size of the string. In this case $l_s$ is determined by the lattice spacing of L. Therefore, within the framework of toroidally compactified string theory, spacetime at very short length scales is a noncommutative torus.
4. Spectral Geometry of Toroidal Compactifications

From the point of view of the construction of a “space”, the pair (A, H), i.e. a ∗-algebra A of operators acting on a Hilbert space H, determines only the topology and differentiable structure of the “manifold”. To put more structure on the space, such as an orientation and a metric, we need to construct a larger set of data. This is achieved by using, for even-dimensional spaces, an even real spectral triple $(A, H, D, J, \Gamma)$ [25], where D is a generalized Dirac operator acting on H which determines a Riemannian structure, $\Gamma$ defines a Hochschild cycle for the geometry which essentially determines an orientation or Hodge duality operator, and J determines a real structure for the geometry which is used to define notions such as Poincaré duality. In this section we will construct a set of spectral data to describe a particular noncommutative geometry appropriate to the spacetime implied by string theory.

To define $(A, H, D, J, \Gamma)$, we need to introduce spinors. For this, we fix a spin structure on $T_d \times T_d^*$ and augment the $L^2$ space in (2.17) to the space $L^2(T_d \times T_d^*, S)$ of square integrable spinors, where $S \to T_d \times T_d^*$ is the spin bundle which carries an irreducible left action of the Clifford bundle $Cl(T_d \times T_d^*)$. We shall take as H the corresponding augmented Hilbert space of (2.17), so that, in the notation of the previous section (see (3.4)), $H_0 = L^2(T_d \times T_d^*, S)$. A dense subspace of $L^2(T_d \times T_d^*, S)$ is provided by the smooth spinor module $\mathcal{S} = C^\infty(T_d \times T_d^*, S)$ which is an irreducible Clifford module of rank $2^d$ over $C^\infty(T_d \times T_d^*)$. There is a $\mathbb{Z}_2$-grading $\mathcal{S} = \mathcal{S}^+ \oplus \mathcal{S}^-$ arising from the chirality grading of $Cl(T_d \times T_d^*)$ by the action of a grading operator $\Gamma$
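with $\Gamma^2 = I$. As a consequence, the augmented Hilbert space splits as

$$H = H^+ \oplus H^- \qquad (4.1)$$

into the ±1 (chiral-antichiral) eigenspaces of $\Gamma$. The spinor module $\mathcal{S}$ carries a representation of the double toroidal Clifford algebra

$$\{\gamma_i^\pm, \gamma_j^\pm\} = \pm\, 2 g_{ij}, \qquad \{\gamma_i^\pm, \gamma_j^\mp\} = 0, \qquad (\gamma_i^\pm)^\dagger = \gamma_i^\pm. \qquad (4.2)$$

The $\mathbb{Z}_2$-grading is determined by the chirality matrices

$$\Gamma_c^\pm = \frac{1}{d!\,\sqrt{\det g}}\; \epsilon^{i_1 i_2 \cdots i_d}\, \gamma_{i_1}^\pm \gamma_{i_2}^\pm \cdots \gamma_{i_d}^\pm, \qquad (4.3)$$

where $\epsilon^{i_1 \cdots i_d}$ is the antisymmetric tensor with the convention $\epsilon^{12\cdots d} = +1$. The matrices $\Gamma_c^\pm$ have the properties

$$\Gamma_c^\pm \gamma_i^\pm = -(-1)^d\, \gamma_i^\pm \Gamma_c^\pm, \qquad \Gamma_c^\pm \gamma_i^\mp = (-1)^d\, \gamma_i^\mp \Gamma_c^\pm, \qquad (\Gamma_c^\pm)^2 = (-1)^{d(d-1)/2}\, I, \qquad (4.4)$$

so that the Klein operator

$$\Gamma \equiv (-1)^d\, \Gamma_c^+ \Gamma_c^- \qquad (4.5)$$

satisfies the desired properties for a grading operator. The representation of the vertex operator algebra A on H acts diagonally with respect to the chiral decomposition (4.1), i.e.

$$[\Gamma, V] = 0 \qquad \forall V \in A. \qquad (4.6)$$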
It was shown in [17] that the two natural Dirac operators D associated with a lattice vertex operator algebra are the fields

$$\slashed{D}^\pm = \gamma_i^\pm \otimes z_\pm\, \alpha_\pm^i(z_\pm). \qquad (4.7)$$

They are self-adjoint operators on H with compact resolvent on an appropriate dense domain (after regularization). They each act off-diagonally on (4.1), taking $H^\pm \to H^\mp$,

$$\{\Gamma, \slashed{D}^\pm\} = 0. \qquad (4.8)$$

We shall use the chirally symmetric and antisymmetric self-adjoint combinations

$$\slashed{D} = \slashed{D}^+ + \slashed{D}^-, \qquad \overline{\slashed{D}} = \slashed{D}^+ - \slashed{D}^-. \qquad (4.9)$$

The final quantity we need is the operator J, which is needed to define a real structure. Since the representation of A on h is faithful (after regularization), we naturally have an injective map $A \to h$ defined by

$$V \mapsto |\psi_V\rangle \equiv \lim_{z_\pm \to 0} V(z_+, z_-)\, |0; 0\rangle \otimes |0\rangle_+ \otimes |0\rangle_- \qquad (4.10)$$

so that $V(\psi_V; z_+, z_-) \equiv V(z_+, z_-)$. This means that the (unique) vacuum state $|0; 0\rangle \otimes |0\rangle_+ \otimes |0\rangle_-$ is a cyclic separating vector for h (by the operator-state correspondence), and we can therefore define an antilinear self-adjoint unitary isometry $J_c : H \to H$ by

$$J_c |\psi_V\rangle = |\psi_{V^\dagger}\rangle, \qquad J_c\, \gamma_i^\pm = \pm\, \gamma_i^\pm J_c. \qquad (4.11)$$
Then J is defined with respect to the decomposition (4.1) by acting off-diagonally as $J_c$ on $H^+ \to H^-$ and as $(-1)^d J_c$ on $H^- \to H^+$. Note that on spinor fields $\psi \in L^2(T_d \times T_d^*, S) \subset H$ the action of the operator J is given by

$$J \psi = C\, \overline{\psi}, \qquad (4.12)$$

where C is the charge conjugation matrix acting on the spinor indices. The antilinear unitary involution J satisfies the commutation relations

$$J \slashed{D} = \slashed{D} J, \qquad J\, \overline{\slashed{D}} = \overline{\slashed{D}}\, J, \qquad J \Gamma = (-1)^d\, \Gamma J, \qquad J^2 = \epsilon(d)\, I \qquad (4.13)$$

and

$$\left[ V, J W^\dagger J^{-1} \right] = 0, \qquad \left[ [\slashed{D}, V],\, J W^\dagger J^{-1} \right] = \left[ [\overline{\slashed{D}}, V],\, J W^\dagger J^{-1} \right] = 0 \qquad (4.14)$$

for all $V, W \in A$. The mod 4 periodic function $\epsilon(d)$ is given by [24]

$$\epsilon(d) = (1, -1, -1, 1). \qquad (4.15)$$

The dimension-dependent ± signs in the definition of J arise from the structure of real Clifford algebra representations. The first condition in (4.14) implies that for all $V \in A$, $J V^\dagger J^{-1}$ lies in the commutant $A'$ of A on H, while the second condition is a generalization of the statement that the Dirac operators are first-order differential operators. The algebra $A'$ defines an anti-representation of A, and J can be thought of as a charge conjugation operator.

The spectral data $(A, H, D, J, \Gamma)$, with the relations above and D taken to be either of the two Dirac operators (4.9), determines an even real spectral triple for the geometry of the space we shall work with. As shown in [17], the existence of two “natural” Dirac operators (metrics) for the noncommutative geometry is not an ambiguous property, because the two corresponding spectral triples are in fact unitarily equivalent, i.e. there exist many unitary isomorphisms $U : H \to H$ which define automorphisms of the vertex operator algebra ($U A U^{-1} = A$) and for which

$$U \slashed{D} = \overline{\slashed{D}}\, U. \qquad (4.16)$$

This means that the two spaces are naturally isomorphic at the level of their spectral triples,

$$\left( A, H, \slashed{D}, J, \Gamma \right) \cong \left( A, H, \overline{\slashed{D}}, U J U^{-1}, U \Gamma U^{-1} \right), \qquad (4.17)$$

and so the two Dirac operators determine the same geometry. This feature, along with the even-dimensionality of the spectral triple, will have important consequences in what follows. An isomorphism of the form (4.17) is called a duality symmetry of the noncommutative string spacetime [17].

We now consider the algebra $A^{(\omega)}$ defined in the last section, and restricted to the tachyon Hilbert space in terms of the orthogonal projection $P_0 : h \to h_0$ onto the subspace (3.4). The corresponding restrictions of the Dirac operators (4.7),(4.9) are

$$\slashed{\partial} \equiv P_0\, \slashed{D}\, P_0 = \tfrac{1}{2} \left[ g^{ij} \left( \gamma_i^+ + \gamma_i^- \right) \otimes p_j + \left( \gamma_i^+ - \gamma_i^- \right) \otimes w^i \right],$$

$$\overline{\slashed{\partial}} \equiv P_0\, \overline{\slashed{D}}\, P_0 = \tfrac{1}{2} \left[ g^{ij} \left( \gamma_i^+ - \gamma_i^- \right) \otimes p_j + \left( \gamma_i^+ + \gamma_i^- \right) \otimes w^i \right]. \qquad (4.18)$$

Note that the operators J and $\Gamma$ preserve both $A^{(\omega)}$ and $H_0$. We can therefore define subspaces $(A^{(\omega)}, H_0, \slashed{\partial}, J, \Gamma)$ and $(A^{(\omega)}, H_0, \overline{\slashed{\partial}}, J, \Gamma)$ of the geometries represented by the (isomorphic) spectral triples in (4.17).
5. Duality Transformations and Morita Equivalence

We shall now describe some basic features of the twisted modules constructed in Sect. 3. An important property of $\tilde T^d_\omega$ is that there is no distinction between the torus $T_d$ and its dual $T_d^*$ within the tachyon algebra. These two commutative subspaces of the tachyon spacetime are associated with the subspaces

$$h_0^{(-)} \equiv \bigotimes_{i=1}^{d} \ker\!\left( \alpha_+^i \otimes I + I \otimes \alpha_-^i \right) \cong L^2\!\left( T_d,\ \prod_{i=1}^{d} \frac{dx^i}{2\pi} \right),$$

$$h_0^{(+)} \equiv \bigotimes_{i=1}^{d} \ker\!\left( \alpha_+^i \otimes I - I \otimes \alpha_-^i \right) \cong L^2\!\left( T_d^*,\ \prod_{i=1}^{d} \frac{dx_i^*}{2\pi} \right) \qquad (5.1)$$

of (3.4). The subspace $h_0^{(\pm)}$ is the projection of $h_0$ onto those states $|q^+; q^-\rangle$ with $q^+ = \pm q^-$ (equivalently v = 0 and q = 0, respectively), and it contains only those highest-weight modules which occur in complex conjugate pairs of left-right representations of the current algebra $\widehat{u(1)}{}^d_+ \oplus \widehat{u(1)}{}^d_-$. From (3.7) we see that if $P_0^{(\pm)} : h_0 \to h_0^{(\pm)}$ are the respective orthogonal projections, then the corresponding vertex operator subalgebras are

$$A^{(-)} \equiv P_0^{(-)} A^{(\omega)} P_0^{(-)} \cong C^\infty(T_d), \qquad A^{(+)} \equiv P_0^{(+)} A^{(\omega)} P_0^{(+)} \cong C^\infty(T_d^*), \qquad (5.2)$$

and represent ordinary (commutative) spacetimes. If we choose a spin structure on $T_d \times T_d^*$ such that the spinors are periodic along the elements of a homology basis, then the associated spin bundle is trivial and the corresponding Dirac operators (4.18) are

$$\slashed{\partial} = -i\, g^{ij} \gamma_i \otimes \frac{\partial}{\partial x^j}, \qquad \overline{\slashed{\partial}} = -i\, \gamma_i \otimes \frac{\partial}{\partial x_i^*}, \qquad (5.3)$$

where we have used the coordinate space representations

$$p_i = -i\, \frac{\partial}{\partial x^i} \ \text{ on } L^2\!\left( T_d,\ \prod_{i=1}^{d} \frac{dx^i}{2\pi} \right), \qquad w^i = -i\, \frac{\partial}{\partial x_i^*} \ \text{ on } L^2\!\left( T_d^*,\ \prod_{i=1}^{d} \frac{dx_i^*}{2\pi} \right)$$

given by the adjoint actions (2.15). We have also introduced the new Dirac matrices $\gamma_i = \tfrac{1}{2}(\gamma_i^+ + \gamma_i^-)$ and $\gamma_*^i = \tfrac{1}{2}\, g^{ij} (\gamma_j^+ - \gamma_j^-)$, and we used the fact that $\partial / \partial x^j$ is zero on $L^2(T_d^*)$ (and an analogous statement involving $\partial / \partial x_j^*$ and $T_d$). The Dirac operators (5.3) represent the canonical geometries of the (noncommutative) torus and its dual. Thus, as subspaces of the tachyon algebra, tori are identified with their duals, since, as mentioned in the previous section, there exists a unitary isomorphism of the Hilbert space H, interchanging $p \leftrightarrow w$ and $g \leftrightarrow g^{-1}$, which exchanges the two subspaces in (5.1) and (5.2) and also the two Dirac operators (5.3). Since this isomorphism does not commute with the projection operators $P_0^{(\pm)}$, distinct classical spacetimes are identified. The equivalence of them from the point of view of noncommutative geometry is
the celebrated T-duality symmetry of quantum string theory. Different choices of spin structure on $T_d$ induce twistings of the spin bundle, and, along with some modifications of the definitions of the subspaces (5.1) and (5.2), they induce projections onto different dual tori with appropriate modifications of (5.3). But, according to (4.17), these spectral geometries are all isomorphic [17].

We now turn to a more precise description of the isomorphism classes in $\tilde T^d_\omega$. We note first of all that the string geometry is not contained entirely within the tachyon sector of the vertex operator algebra A. This is because the explicit inner automorphisms of A which implement the duality symmetries are constructed from higher-order perturbations of the tachyon sector (by, for example, graviton operators). The basic tachyon vertex operators $V_i$ together with the Heisenberg fields $\alpha_\pm^i(z_\pm) = -i\, V(I \otimes \alpha_{-1}^{(\pm)i}; 1, z_\pm)$ generate an affine Lie group $\widehat{\mathrm{Inn}}{}^{(0)}(A)$ of inner automorphisms of the vertex operator algebra which is in general an enhancement of the generic affine $U(1)^d_+ \times U(1)^d_-$ gauge symmetry [21]. This property is crucial for the occurrence of string duality as a gauge symmetry of the noncommutative geometry, and the isomorphisms described above only occur when the relevant structures are embedded into the full spectral triple $(A, H, D, J, \Gamma)$.

It is well known that various noncommutative tori with different deformation parameters are equivalent to each other. (For completeness, a brief summary of the general definition and relevance of Morita equivalence for C∗-algebras is presented in Appendix A.) For instance, when d = 2 it can be shown that the abelian group $\mathbb{Z} \oplus \omega \mathbb{Z}$ is an isomorphism invariant of $A^{(\omega)}$. This means, for example, that the tori $T^2_\omega \cong T^2_{-\omega} \cong T^2_{\omega+1}$ are unitarily equivalent. Moreover, it can be shown in this case that the algebras $A^{(\omega)}$ and $A^{(\omega')}$ are Morita equivalent if and only if they are related by a discrete Möbius transformation [1]

$$\omega' = \frac{a\omega + b}{c\omega + d}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{Z}), \qquad (5.4)$$

where $GL(2, \mathbb{Z})$ is the group of 2 × 2 integer-valued matrices with determinant ±1. This then also identifies the tori $T^2_\omega \cong T^2_{-1/\omega}$. Another copy of the discrete group $SL(2, \mathbb{Z})$ appears naturally by requiring that the transformation

$$U_1 \mapsto U_1^a U_2^b, \qquad U_2 \mapsto U_1^c U_2^d \qquad (5.5)$$

be an automorphism of $A^{(\omega)}$. Indeed, the transformation (5.5) preserves the product (3.18), and so extends to an (outer) automorphism of $A^{(\omega)}$, if and only if $ad - bc = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = 1$.

The duality symmetries of the string spacetime described above can be used to identify certain isomorphisms among the modules $\tilde T^d_\omega$. These incorporate the usual isomorphisms between noncommutative tori described above as gauge symmetries, and they also naturally contain, for any d, the discrete geometrical automorphism group $SL(d, \mathbb{Z})$ of $T_d$. When the algebra $A^{(\omega)}$ of the noncommutative torus $T^d_\omega$ is embedded as described in Sect. 3 into the full vertex operator algebra A, these isomorphisms are induced by the discrete inner automorphisms of A which generate the isometry group of the Narain lattice,

$$\mathrm{Aut}(\Lambda) = O(d, d; \mathbb{Z}) \supset SL(d, \mathbb{Z}). \qquad (5.6)$$
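For d = 2 the two statements around (5.4) and (5.5) can be made concrete with exact arithmetic. In the sketch below, which is an illustration and not part of the original construction, a monomial $w^k U_1^m U_2^n$ with $w = e^{2\pi i \omega}$ is stored as the integer triple (k, m, n), and products are reordered using $U_2^n U_1^m = w^{-nm} U_1^m U_2^n$, which follows from (3.18). The function names `moebius`, `mul` and `phase_defect` are hypothetical.

```python
from fractions import Fraction

def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def moebius(m, omega):
    """omega' = (a omega + b)/(c omega + d) for an integer matrix in GL(2, Z)."""
    if det2(m) not in (1, -1):
        raise ValueError("matrix is not in GL(2, Z)")
    (a, b), (c, d) = m
    return (a * omega + b) / (c * omega + d)

def mul(x, y):
    """Multiply monomials w^k U1^m U2^n stored as (k, m, n).

    Reordering uses U2^n U1^m = w^(-n m) U1^m U2^n.
    """
    (k1, m1, n1), (k2, m2, n2) = x, y
    return (k1 + k2 - n1 * m2, m1 + m2, n1 + n2)

def phase_defect(m):
    """Integer e with U1' U2' = w^e U2' U1' for U1' = U1^a U2^b, U2' = U1^c U2^d."""
    (a, b), (c, d) = m
    u1, u2 = (0, a, b), (0, c, d)
    return mul(u1, u2)[0] - mul(u2, u1)[0]

omega = Fraction(1, 3)
print(moebius(((1, 1), (0, 1)), omega))    # the shift omega -> omega + 1
print(moebius(((0, -1), (1, 0)), omega))   # the inversion omega -> -1/omega
print(phase_defect(((2, 1), (1, 1))))
```

The exponent returned by `phase_defect` equals ad − bc, so the transformed generators obey (3.18) with the same ω, for generic ω, precisely when the determinant is 1.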
These automorphisms act on the metric tensor $g_{ij}$ by matrix-valued Möbius transformations and induce an $O(d, d; \mathbb{Z})$ symmetry on the Hilbert space acting unitarily on the Dirac operators in the sense of (4.16). Therefore, with the embedding $A^{(\omega)} \hookrightarrow A$, a much larger class of tori are identified by the action of the full duality group²,

$$T^d_\omega \cong T^d_{\omega^*} \quad \text{with} \quad \omega^* = (A\omega + B)(C\omega + D)^{-1}, \qquad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in O(d, d; \mathbb{Z}), \qquad (5.7)$$

where A, B, C, D are d × d integer-valued matrices satisfying the relations

$$A^\top C + C^\top A = 0 = B^\top D + D^\top B, \qquad A^\top D + C^\top B = I. \qquad (5.8)$$
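The block conditions (5.8) and the action (5.7) can be tested on small examples. The following sketch assumes d = 2, exact rational entries, and that $C\omega + D$ is invertible for the chosen $\omega$; the helper names `satisfies_5_8` and `act` and the sample matrices are illustrative and not taken from the text.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0] * n for _ in range(n)]

def inverse(m):
    """Gauss-Jordan inverse over the rationals (m is assumed nonsingular)."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def satisfies_5_8(A, B, C, D):
    """Check the three block relations of (5.8)."""
    n = len(A)
    At, Bt, Ct, Dt = map(transpose, (A, B, C, D))
    return (add(matmul(At, C), matmul(Ct, A)) == zeros(n)
            and add(matmul(Bt, D), matmul(Dt, B)) == zeros(n)
            and add(matmul(At, D), matmul(Ct, B)) == identity(n))

def act(A, B, C, D, omega):
    """omega* = (A omega + B)(C omega + D)^{-1}, as in (5.7)."""
    return matmul(add(matmul(A, omega), B), inverse(add(matmul(C, omega), D)))

I2, Z2 = identity(2), zeros(2)
omega = [[0, Fraction(1, 3)], [Fraction(-1, 3), 0]]
shift_B = [[0, 1], [-1, 0]]                 # antisymmetric integer shift
print(act(I2, shift_B, Z2, I2, omega))      # omega + B
print(act(Z2, I2, I2, Z2, omega))           # omega^{-1}
```

In both sample cases the image $\omega^*$ is again antisymmetric, consistent with (5.7) mapping deformation matrices to deformation matrices.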
To see this, we embed the two algebras $A^{(\omega)}$ and $A^{(\omega^*)}$ into A. Then the unitary equivalence (4.17) implies that, in A, there is a finitely-generated projective right $A^{(\omega)}$-module $E^{(\omega)}$ with $A^{(\omega^*)} \cong \mathrm{End}_{A^{(\omega)}} E^{(\omega)}$, where the ∗-isomorphism is implemented by the unitary operator U in (4.16). Thus the projections back onto these tachyonic algebras establish the Morita equivalence (5.7) between the twisted modules.³ In other words, the tori in (5.7), being Morita equivalent, are indistinguishable when embedded in A. To summarize, string duality implies the

Proposition 2. There is a natural Morita equivalence

$$T^d_\omega \cong T^d_{\omega^*}; \qquad \omega^* \equiv (A\omega + B)(C\omega + D)^{-1}, \qquad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in O(d, d; \mathbb{Z})$$

of spectral geometries.

A similar result for multi-dimensional noncommutative tori has been established recently in [18] using more explicit formal constructions of projective modules. Here we see the power of duality in establishing this strong result. It is intriguing that the symmetry group (5.6) contains the $SL(2, \mathbb{Z})$ S-duality symmetry of type-IIB superstring theory. In [19] it is shown that Matrix Theory compactifications on Morita equivalent noncommutative tori are physically equivalent, in that the associated quantum theories are dual. Here we find a similar manifestation of this property, directly in the language of string geometry. The construction in the present case exposes the natural relationship between target space duality and Morita equivalence of noncommutative geometries. It also demonstrates explicitly in what sense compactifications on Morita equivalent tori are physically equivalent, as conjectured in [6]. For instance, the equivalence $\omega \leftrightarrow \omega^{-1}$ represents the unobservability of small distances in the physical spacetime, while the equivalence $\omega \leftrightarrow \omega + 1$ (for d = 2) represents the invariance of the spacetime under a change of complex structure.

² As emphasized in [18] this $O(d, d; \mathbb{Z})$ transformation is only defined on the dense subspace of the vector space of antisymmetric real-valued tensors $\omega$, where $C\omega + D$ is invertible. We shall implicitly assume this here.

³ Strictly speaking, in order to correctly identify (5.7) as the orbits under the action of the target space duality group of the string theory, one must regard $\omega$ as the natural antisymmetric bilinear form induced on the lattice $\Gamma$ by the metric g, as defined above. This means that the string background in effect has an induced antisymmetric tensor B which parametrizes the deformation of the torus, as in [6,7] (and thus transforms in the way stated; see [17] for the details). The action of the duality group in (5.7) is then an explicit realization of the duality transformation law proposed in [6] and proven in [18,19].

As we discussed above, string duality can be represented as a gauge symmetry on the full vertex operator algebra A and, when projected onto the tachyon sector, one obtains intriguing equivalences between the twisted realizations of the noncommutative torus. In this representation of $T^d_\omega$, the automorphisms of the algebra $A^{(\omega)}$ are determined in
large part by gauge transformations, i.e. the elements of $\widehat{\mathrm{Inn}}{}^{(0)}(A)$,⁴ consistent with the corresponding results for the ordinary noncommutative torus. In particular, the usual diffeomorphism symmetries are given by inner automorphisms, i.e. gravity becomes a gauge theory on this space. The outer automorphisms of A are given by the full duality group of toroidally compactified string theory which is the semi-direct product $O(d, d; \mathbb{Z}) \rtimes O(2, \mathbb{R})$, where $O(2, \mathbb{R})$ is a worldsheet symmetry group that acts on the algebra A by rotating the two chiral sectors on $\mathbb{C} \cup \{\infty\}$ among each other (this part of the duality group does not act on the metric tensor of $T_d$). After applying the appropriate projections, we obtain canonical actions of these automorphisms on the tachyon sector. Thus, again using string duality, we arrive at the

Proposition 3. There is a natural subgroup of automorphisms of the twisted module $\tilde T^d_\omega$ given by the duality group

$$\mathrm{Aut}^{(0)}(A^{(\omega)}) = P_0 \left( \widehat{\mathrm{Inn}}{}^{(0)}(A) \rtimes \mathrm{Out}(A) \right) P_0,$$

where $\widehat{\mathrm{Inn}}{}^{(0)}(A)$ is the affine Lie group generated by the tachyon vertex operators $V_i$ and the Heisenberg fields $\alpha_\pm^i$, and $\mathrm{Out}(A) \cong O(d, d; \mathbb{Z}) \rtimes O(2, \mathbb{R})$.

6. Gauge Theory on $\tilde T^d_\omega$

In noncommutative geometry a finitely-generated projective module of an algebra replaces the classical notion of a vector bundle over a manifold [4,5]. The geometry of gauge theories can therefore also be cast into the natural algebraic framework of noncommutative geometry. In this section we will construct the gauge theories on $\tilde T^d_\omega$ which in the next section will enable us to construct an action functional that is naturally invariant under the automorphism group given by Proposition 3, and in particular under duality and diffeomorphism symmetries. The possibility of such a characterization comes from the fairly complete mathematical theory of projective modules and of connections on these modules for the noncommutative torus $T^d_\omega$ [2]. On $A^{(\omega)}$ there is a natural set of linear derivations $\Delta_i$, $i = 1, \dots, d$, defined by

$$\Delta_i(U_j) = \delta_{ij}\, U_j. \qquad (6.1)$$
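Extended by the Leibniz rule, (6.1) acts on a monomial $U_1^{p_1} \cdots U_d^{p_d}$ by multiplication with $p_i$. The sketch below checks the derivation property against a twisted convolution of Fourier coefficients. It assumes finite Fourier series stored as dictionaries {p: coefficient} and the phase $\exp(-i\pi\,\omega^{ij} p_i q_j)$ for the deformed product; the phase convention and the names `star`, `derivation` and `add` are illustrative assumptions.

```python
import cmath

def star(f, g, omega):
    """Twisted convolution of Fourier coefficients (phase convention assumed)."""
    d = len(omega)
    out = {}
    for p, fp in f.items():
        for q, gq in g.items():
            pairing = sum(omega[i][j] * p[i] * q[j]
                          for i in range(d) for j in range(d))
            r = tuple(pi + qi for pi, qi in zip(p, q))
            out[r] = out.get(r, 0j) + fp * gq * cmath.exp(-1j * cmath.pi * pairing)
    return out

def derivation(i, f):
    """Delta_i multiplies the mode U^p by its i-th momentum p_i."""
    return {p: p[i] * c for p, c in f.items()}

def add(f, g):
    return {k: f.get(k, 0j) + g.get(k, 0j) for k in set(f) | set(g)}

def is_close(f, g, tol=1e-12):
    return all(abs(f.get(k, 0j) - g.get(k, 0j)) < tol for k in set(f) | set(g))

# Arbitrary sample data for d = 2.
omega = [[0.0, 0.25], [-0.25, 0.0]]
f = {(1, 0): 1 + 2j, (0, 2): -1j, (-1, 1): 0.5 + 0j}
g = {(0, 1): 2 + 0j, (3, -1): 1 - 1j}

for i in range(2):
    lhs = derivation(i, star(f, g, omega))
    rhs = add(star(derivation(i, f), g, omega), star(f, derivation(i, g), omega))
    print(is_close(lhs, rhs))
```

The check succeeds for any bilinear phase, since momenta add under the product; this is the sense in which the $\Delta_i$ generate a $u(1)^d$ action by derivations.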
These derivations come from the derivative operators $p_i = -i\, \partial / \partial x^i$ acting on the algebra $C^\infty(T_d)$ of Fourier series (see Sect. 3). They generate a $u(1)^d$ Lie algebra. There is a natural representation of the operators $\Delta_i$ in the twisted module. The tachyon vertex operators $V_{q^+ q^-}(z_+, z_-)$ have charge $g^{ij} q_j^\pm$ under the action of the $u(1)^d_\pm$ current algebra generated by the Heisenberg fields (2.14). As such, the $\alpha_\pm^i$ generate an affine Lie algebra $\widehat{\widehat{\Lambda}}$ of automorphisms of $A^{(\omega)}$ (Proposition 3). We can define a set of linear derivations $\nabla_\pm^{(0)i}$, $i = 1, \dots, d$, acting on $A^{(\omega)}$ by the infinitesimal adjoint action of the Heisenberg fields [17,21] which is given by

$$\nabla_\pm^{(0)i}\, V_{e^j_\pm} \equiv \left[ \alpha_\pm^i, V_{e^j_\pm} \right] = \delta^i_j\, V_{e^j_\pm}. \qquad (6.2)$$

This leads us to the notion of a connection on the algebra $A^{(\omega)}$. We shall start by considering this algebra as a finitely-generated projective left module over itself.

⁴ This Lie group is described in detail in [21].

A
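(chiral) left connection on $A^{(\omega)}$ is then defined as a set of $\mathbb{C}$-linear operators $\nabla_\pm^i$, $i = 1, \dots, d$, acting on $A^{(\omega)}$ and satisfying the left Leibniz rule

$$\nabla_\pm^i (V W) = V\, \nabla_\pm^i W + \left[ \alpha_\pm^i, V \right] W, \qquad \forall V, W \in A^{(\omega)}. \qquad (6.3)$$

If $\nabla_\pm^i$ and $\nabla_\pm'^{\,i}$ are two connections on $A^{(\omega)}$, then their difference $\nabla_\pm^i - \nabla_\pm'^{\,i}$ commutes with the left action of $A^{(\omega)}$ on itself, i.e. $(\nabla_\pm^i - \nabla_\pm'^{\,i})(V W) = V (\nabla_\pm^i - \nabla_\pm'^{\,i}) W$. Thus $\nabla_\pm^i - \nabla_\pm'^{\,i}$ is an element of the endomorphism algebra $\mathrm{End}_{A^{(\omega)}} A^{(\omega)}$ which we identify with the commutant $\mathrm{End}_{A^{(\omega)}} A^{(\omega)} \cong (A^{(\omega)})' = J A^{(\omega)} J^{-1}$ of the algebra $A^{(\omega)}$, where J is the real structure introduced in Sect. 4. The derivations defined in (6.2) also satisfy the Leibniz rule (6.3) and thus define fixed fiducial elements in the space of connections on $A^{(\omega)}$. It follows that $\Omega_\pm^i = \nabla_\pm^i - \nabla_\pm^{(0)i}$ obeys

$$\Omega_\pm^i (V W) = V\, \Omega_\pm^i W, \qquad \forall V, W \in A^{(\omega)}, \qquad (6.4)$$

so that we can write an arbitrary connection in the form

$$\nabla_\pm^i = \nabla_\pm^{(0)i} + \Omega_\pm^i \quad \text{with} \quad \Omega_\pm^i \in \mathrm{End}_{A^{(\omega)}} A^{(\omega)}. \qquad (6.5)$$

We can introduce a Hermitian structure on $A^{(\omega)}$ via the $A^{(\omega)}$-valued positive-definite inner product $\langle \cdot, \cdot \rangle_{A^{(\omega)}} : A^{(\omega)} \times A^{(\omega)} \to A^{(\omega)}$ defined by

$$\langle V, W \rangle_{A^{(\omega)}} = V^\dagger W, \qquad V, W \in A^{(\omega)}. \qquad (6.6)$$

The compatibility condition with respect to this Hermitian structure for left connections,

$$- \left\langle \nabla_\pm^i V, W \right\rangle_{A^{(\omega)}} + \left\langle V, \nabla_\pm^i W \right\rangle_{A^{(\omega)}} = \nabla_\pm^{(0)i} \langle V, W \rangle_{A^{(\omega)}}, \qquad (6.7)$$

implies that $\Omega_\pm^i = (\Omega_\pm^i)^\dagger$ is self-adjoint. The minus sign in (6.7) arises from the fact that the fiducial connection defined in (6.2) anticommutes with the ∗-involution, $(\nabla_\pm^{(0)i} V)^\dagger = -\nabla_\pm^{(0)i} V^\dagger$.

Since the connection coefficients $\Omega_\pm^i$ in (6.5) are elements of the commutant $(A^{(\omega)})' \cong \mathrm{End}_{A^{(\omega)}} A^{(\omega)}$, it is natural to introduce an $A^{(\omega)}$-bimodule structure. We first define a right $A^{(\omega)}$-module which we denote by $\overline{A^{(\omega)}}$. Elements of $\overline{A^{(\omega)}}$ are in bijective correspondence with those of $A^{(\omega)}$, $\overline{A^{(\omega)}} \equiv \{ \overline{V} \mid V \in A^{(\omega)} \}$, and the right action of $V \in A^{(\omega)}$ on $\overline{W} \in \overline{A^{(\omega)}}$ is given by $\overline{W} \cdot V = \overline{V^\dagger W}$. Associated with the connection $\nabla_\pm^i$ on $A^{(\omega)}$ there is then a right connection $\overline{\nabla}{}_\pm^i$ on $\overline{A^{(\omega)}}$ defined by

$$\overline{\nabla}{}_\pm^i\, \overline{V} = - \overline{\nabla_\pm^i V}, \qquad \forall V \in A^{(\omega)}. \qquad (6.8)$$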
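The operator (6.8) is $\mathbb{C}$-linear and obeys a right Leibniz rule. Indeed, with $\overline{V} \in \overline{A^{(\omega)}}$ and $W \in A^{(\omega)}$, we have

$$\begin{aligned} \overline{\nabla}{}_\pm^i (\overline{V} \cdot W) &= \overline{\nabla}{}_\pm^i \big( \overline{W^\dagger V} \big) \\ &= - \overline{\nabla_\pm^i (W^\dagger V)} \\ &= - \overline{W^\dagger\, \nabla_\pm^i V} - \overline{\big( \nabla_\pm^{(0)i} W^\dagger \big) V} \\ &= - \overline{\nabla_\pm^i V} \cdot W + \overline{\big( \nabla_\pm^{(0)i} W \big)^\dagger V} \\ &= \overline{\nabla}{}_\pm^i\, \overline{V} \cdot W + \overline{V} \cdot \left[ \alpha_\pm^i, W \right] \end{aligned} \qquad (6.9)$$

which expresses the right Leibniz rule for $\overline{\nabla}{}_\pm^i$.

We can now combine the two connections $\nabla_\pm^i$ and $\overline{\nabla}{}_\pm^i$ to get a symmetric connection $\tilde\nabla_\pm^i$ on $\overline{A^{(\omega)}} \otimes_{A^{(\omega)}} A^{(\omega)}$ defined by

$$\tilde\nabla_\pm^i (\overline{V} \otimes W) = \overline{\nabla}{}_\pm^i\, \overline{V} \otimes W + \overline{V} \otimes \nabla_\pm^i W, \qquad \forall\, \overline{V} \otimes W \in \overline{A^{(\omega)}} \otimes_{A^{(\omega)}} A^{(\omega)}. \qquad (6.10)$$

The $A^{(\omega)}$-bimodule we are describing is just the Hilbert space $H_0$ which carries both a left representation of $A^{(\omega)}$ and a (right) anti-representation of $A^{(\omega)}$ given by $V \mapsto J V^\dagger J^{-1} \in L(H_0)$, where $L(H_0)$ is the algebra of bounded linear operators on $H_0$. In the present interpretation, the right module structure of $H_0$ comes from the right $A^{(\omega)}$-module $\overline{A^{(\omega)}}$ when mapping $\overline{A^{(\omega)}}$ into $L(H_0)$ by $\overline{V} \mapsto J V^\dagger J^{-1}$. Thus, when representing $\overline{A^{(\omega)}} \otimes_{A^{(\omega)}} A^{(\omega)}$ on $H_0$, using (6.5) and (6.8) we find that the action of the connection (6.10) restricted to $A^{(\omega)}$ can be expressed in terms of the fiducial connection (6.2) and the connection coefficients as

$$\tilde\nabla_\pm^i (I \otimes V) = I \otimes \left( \nabla_\pm^{(0)i} + \Omega_\pm^i + J (\Omega_\pm^i)^\dagger J^{-1} \right) V. \qquad (6.11)$$

The extra connection term in (6.11) achieves the desired left-right symmetric representation and can be thought of as enforcing CPT-invariance. The operator

$$\tilde{\slashed{\nabla}}{}^\pm \equiv \gamma_i^\pm \otimes \tilde\nabla_\pm^i \qquad (6.12)$$

is then a map $\overline{A^{(\omega)}} \otimes_{A^{(\omega)}} A^{(\omega)} \to \overline{A^{(\omega)}} \otimes_{A^{(\omega)}} \Omega^1_{\slashed{D}^\pm}(A^{(\omega)}) \otimes_{A^{(\omega)}} A^{(\omega)}$, where

$$\Omega^1_{\slashed{D}^\pm}(A^{(\omega)}) = \mathrm{span}_{\mathbb{C}} \left\{ V \left[ \slashed{D}^\pm, W \right] \ \middle|\ V, W \in A^{(\omega)} \right\} \qquad (6.13)$$

are linear spaces of one-forms which carry a natural $A^{(\omega)}$-bimodule structure. The action of the operator $\tilde{\slashed{\nabla}}{}^\pm$ on $A^{(\omega)}$ as defined in (6.11) can be expressed in terms of the adjoint actions of the Dirac operators (4.7) as $\tilde{\slashed{\nabla}}{}^\pm |_{A^{(\omega)}} = \mathrm{Ad}_{\slashed{D}^\pm} + A^\pm + J (A^\pm)^\dagger J^{-1}$, where $A^\pm \in \Omega^1_{\slashed{D}^\pm}(A^{(\omega)})$ are self-adjoint operators (by the compatibility condition (6.7)) which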
are called gauge potentials. In fact, all of this structure is induced by a covariant Dirac operator which is associated with the full data (A, H, J, 0) and is defined as / ± + A± + J (A± )† J −1 , D /± ∇ =D
A± ∈ 1D/± (A),
(6.14)
where 1D/± (A) are the A-bimodules of one-forms defined as in (6.13) but with the
algebra A(ω) replaced by the full vertex operator algebra A. The Dirac operator D /± ∇ ± is regarded as an internal perturbation of D / and it yields a geometry that is unitary equivalent to that determined by D / ± [25], i.e. the geometries with fixed data (A, H, J, 0) form an affine space modelled on 1D/± (A).
The spaces 1D/± (A) are free A-bimodules with bases {γi± }di=1 [17]. The gauge potentials in (6.14) can therefore be decomposed as A± = g ij γi± ⊗ A± j
with A± i ∈ A.
(6.15)
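Defining the self-adjoint elements $A_i \equiv P_0 (A_i^+ + A_i^-) P_0$ and $A_*^i = P_0\, g^{ij} (A_j^+ - A_j^-) P_0$ of $\mathcal{A}^{(\omega)}$, the covariant versions of the Dirac operators (4.18), obtained from the restrictions of (6.14) to $\mathcal{A}^{(\omega)}$, are then
$$ \begin{aligned} \slashed{\partial}_\nabla &\equiv P_0 \left( \slashed{D}^+_\nabla + \slashed{D}^-_\nabla \right) P_0 = \tfrac{1}{2} \left[ g^{ij} \left( \gamma_i^+ + \gamma_i^- \right) \otimes \left( p_j + A_j + g_{jk}\, J A_*^k J^{-1} \right) + \left( \gamma_i^+ - \gamma_i^- \right) \otimes \left( w^i + A_*^i + g^{ij}\, J A_j J^{-1} \right) \right], \\ \bar{\slashed{\partial}}_\nabla &\equiv P_0 \left( \slashed{D}^+_\nabla - \slashed{D}^-_\nabla \right) P_0 = \tfrac{1}{2} \left[ g^{ij} \left( \gamma_i^+ - \gamma_i^- \right) \otimes \left( p_j + A_j + g_{jk}\, J A_*^k J^{-1} \right) + \left( \gamma_i^+ + \gamma_i^- \right) \otimes \left( w^i + A_*^i + g^{ij}\, J A_j J^{-1} \right) \right], \end{aligned} \tag{6.16} $$
where we have used (4.11).
The final ingredient we need for a gauge theory on $\widetilde{T}_\omega^d$ is some definition of an invariant integration in order to define an action functional. The trace (3.21) yields a natural normalized trace $\mathrm{Tr} : \mathcal{A}^{(\omega)} \to \mathbb{C}$ on the twisted module defined by
$$ \mathrm{Tr}\, V = \int_{T_d} \int_{T_d^*} \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2}\; V(x, x^*). \tag{6.17} $$
This trace is $\widehat{\Lambda^c}$-invariant, $\mathrm{Tr}(\nabla_\pm^{(0)i} V) = 0 \;\; \forall V \in \mathcal{A}^{(\omega)}$, and the corresponding Gelfand-Naimark-Segal representation space $L^2(\mathcal{A}^{(\omega)}, \mathrm{Tr})$ is, by the operator-state correspondence, canonically isomorphic to the Hilbert space $h_0$. Using the trace (6.17) and the $\mathcal{A}^{(\omega)}$-valued inner product (6.6) we obtain a usual complex inner product $(\cdot, \cdot)_{\mathcal{A}^{(\omega)}} : \mathcal{A}^{(\omega)} \times \mathcal{A}^{(\omega)} \to \mathbb{C}$ defined by
$$ (V, W)_{\mathcal{A}^{(\omega)}} = \mathrm{Tr}\, \langle V, W \rangle_{\mathcal{A}^{(\omega)}} = \langle \psi_V | \psi_W \rangle_{h_0} \tag{6.18} $$
which coincides with the inner product on the Hilbert space $h_0$. Being functions which are constructed from invariant traces, the quantities (6.17) and (6.18) are naturally invariant under unitary transformations of the algebra $\mathcal{A}^{(\omega)}$, and, in particular, under the action of the inner automorphism group given in Proposition 3. This property immediately implies the manifest duality-invariance of any action functional constructed from them.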
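The normalization of the trace (6.17) can be illustrated numerically. The following is a minimal sketch, assuming d = 1 and replacing the integral by a uniform Riemann sum; the helper names `torus_trace` and `mode` are hypothetical and the plane waves stand in for generic elements of the algebra. It checks that the constant function has unit trace and that a mode with non-zero momentum or winding has vanishing trace, which is the mechanism behind the vanishing of the trace of a derivative.

```python
import cmath
import math


def torus_trace(V, n=16):
    """Riemann-sum approximation of Tr V = int dx dx*/(2 pi)^2 V(x, x*) for d = 1."""
    h = 2 * math.pi / n
    total = 0j
    for a in range(n):
        for b in range(n):
            total += V(a * h, b * h)
    # Each grid cell carries weight h^2 / (2 pi)^2 = 1 / n^2.
    return total / (n * n)


def mode(m, w):
    """Plane wave exp(i (m x + w x*)) with integer momentum m and winding w."""
    return lambda x, xs: cmath.exp(1j * (m * x + w * xs))


unit_trace = torus_trace(mode(0, 0))   # trace of the constant function 1
wave_trace = torus_trace(mode(2, -1))  # trace of a non-trivial mode
```

Since a derivative multiplies a plane wave by a factor proportional to its momentum or winding, the vanishing of `wave_trace` for all non-trivial modes is the discrete counterpart of the invariance property stated above.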
7. Duality-Symmetric Action Functional
The constructions of the preceding sections yield a completely duality-symmetric formalism, and we are now ready to define a manifestly duality-invariant action functional
$$ I[\psi, \widehat{\psi}; \nabla, \nabla_*] = I_B[\nabla, \nabla_*] + I_F[\psi, \widehat{\psi}; \nabla, \nabla_*] \tag{7.1} $$
associated with a generic gauge theory on the twisted module. The bosonic part of the action is defined as
$$ I_B[\nabla, \nabla_*] \equiv \frac{1}{2^d}\, \mathrm{Tr}_S\, \Pi \left[ \slashed{\partial}_\nabla^{\,2} - \bar{\slashed{\partial}}_\nabla^{\,2} \right]^2, \tag{7.2} $$
where $\mathrm{Tr}_S$ is the trace (6.17) including a trace over the Clifford module, and $\Pi$ is the projection operator onto the space of antisymmetric tensors (two-forms). This projection is equivalent to the quotienting by junk forms of the representation of the universal forms on the Hilbert space [4,5]. The operator $\slashed{\partial}_\nabla^{\,2} - \bar{\slashed{\partial}}_\nabla^{\,2}$ is the lowest order polynomial combination of the two Dirac operators (6.16) which lies in the endomorphism algebra $\mathrm{End}_{\mathcal{A}^{(\omega)}} \mathcal{A}^{(\omega)}$. Thus the action (7.2) comes from the lowest order polynomial multiplication operator two-form which is invariant under the duality symmetries represented by an interchange of the Dirac operators.
The fermionic action is of the form of a Dirac action. Let $\psi = (\psi^a)_{a=1}^{2^d}$ be a square-integrable section of the spin bundle over $T_d \times T_d^*$. Then using a flat metric $\delta_{ab}$ for the spinor indices, we define
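$$ I_F[\psi, \widehat{\psi}; \nabla, \nabla_*] \equiv \mathrm{Tr} \sum_{a=1}^{2^d} \left\langle J^{-1} V(\psi^a) J \,,\, \slashed{\partial}_\nabla \,,\, V(\psi^a) \right\rangle_{\mathcal{A}^{(\omega)}} = \left\langle \widehat{\psi} \left| \slashed{\partial}_\nabla \right| \psi \right\rangle_{h_0}, \tag{7.3} $$
where
$$ \widehat{\psi} = J^{-1} \psi = C^{(d)}\, \overline{\psi} \tag{7.4} $$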
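is the corresponding anti-fermion field. In (7.3) $V(\psi^a)$ denotes the map from the Hilbert space into $\mathcal{A}^{(\omega)}$. The actions (7.2) and (7.3) are both gauge-invariant and depend only on the spectral properties of the Dirac K-cycles $(\mathcal{H}_0, \slashed{\partial})$ and $(\mathcal{H}_0, \bar{\slashed{\partial}})$. A gauge transformation is an inner automorphism $\sigma_u : \mathcal{A} \to \mathcal{A}$, parametrized by a unitary element $u$ of $\mathcal{A}$, i.e. an element of the unitary group $\mathcal{U}(\mathcal{A}) = \{ u \in \mathcal{A} \mid u^\dagger u = u u^\dagger = \mathbb{I} \}$ of $\mathcal{A}$, and defined by
$$ \sigma_u(V) = u V u^\dagger, \qquad V \in \mathcal{A}. \tag{7.5} $$
It acts on gauge potentials as
$$ A^\pm \mapsto (A^\pm)^u \equiv u A^\pm u^\dagger + u \left[ \slashed{D}^\pm, u^\dagger \right] \tag{7.6} $$
and, using the $\mathcal{A}$-bimodule structure on $\mathcal{H}$, on spinor fields by the adjoint representation
$$ \psi \mapsto \psi^u \equiv u\, \psi\, u^\dagger = U \psi, \tag{7.7} $$
where
$$ U = u J u J^{-1}. \tag{7.8} $$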
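The transformation (7.7) preserves the inner product on the Hilbert space $\mathcal{H}$, $\langle \psi_1^u | \psi_2^u \rangle = \langle \psi_1 | \psi_2 \rangle$, since both $u$ and $J$ are isometries of $\mathcal{H}$. Moreover, one easily finds that, under a gauge transformation (7.6),
$$ \slashed{D}^\pm_\nabla \mapsto \slashed{D}^\pm_{\nabla^u} = U \slashed{D}^\pm_\nabla U^\dagger, \qquad \slashed{\nabla}^\pm \mapsto (\slashed{\nabla}^\pm)^u \equiv u\, \slashed{\nabla}^\pm u^\dagger, \tag{7.9} $$
which immediately shows that
$$ I[\psi^u, \widehat{\psi}^u; \nabla^u, \nabla_*^u] = I[\psi, \widehat{\psi}; \nabla, \nabla_*]. \tag{7.10} $$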
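The covariance behind (7.9) can be checked in a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: the algebra is replaced by 2 x 2 complex matrices, the Dirac operator by a fixed Hermitian matrix `D`, and the real structure J is dropped entirely, so only the $u A u^\dagger + u[D, u^\dagger]$ part of the transformation law (7.6) is tested. All helper names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def combine(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def max_abs(a):
    return max(abs(a[i][j]) for i in range(2) for j in range(2))


# Toy data (assumptions): Hermitian "Dirac operator" D and gauge potential A.
D = [[1.0 + 0j, 0.5 - 0.2j], [0.5 + 0.2j, -1.0 + 0j]]
A = [[0.3 + 0j, 0.1j], [-0.1j, 0.7 + 0j]]

# A unitary element u of the toy algebra.
c, s, phase = math.cos(0.4), math.sin(0.4), cmath.exp(0.9j)
u = [[c + 0j, -s * phase], [s * phase.conjugate(), c + 0j]]
ud = dagger(u)

# Gauge-transformed potential: A^u = u A u^dagger + u [D, u^dagger].
commutator = combine(matmul(D, ud), matmul(ud, D), sign=-1)
A_u = combine(matmul(matmul(u, A), ud), matmul(u, commutator))

# Covariance: D + A^u should equal u (D + A) u^dagger.
lhs = combine(D, A_u)
rhs = matmul(matmul(u, combine(D, A)), ud)
covariance_error = max_abs(combine(lhs, rhs, sign=-1))
hermiticity_error = max_abs(combine(A_u, dagger(A_u), sign=-1))
```

In this simplified setting the inhomogeneous term $u[D, u^\dagger]$ is exactly what turns the transformation of the potential into a conjugation of the covariant operator, so any functional built from its spectrum is unchanged.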
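Let us write the action (7.1) in a form which makes its duality symmetries explicit. Using the double Clifford algebra (4.2) and the coordinate space representations of the momentum and winding operators in (6.16), after some algebra we find
$$ \begin{aligned} \slashed{\partial}_\nabla^{\,2} - \bar{\slashed{\partial}}_\nabla^{\,2} = -\frac{i}{2}\, \gamma_+^i \gamma_-^j \otimes \Bigg( & \frac{\partial a_j}{\partial x^i} - \frac{\partial a_i}{\partial x^j} - g_{ik} g_{jl} \frac{\partial a_*^l}{\partial x_k^*} + g_{ik} g_{jl} \frac{\partial a_*^k}{\partial x_l^*} + i \left[ a_i, a_j \right] \\ & - i g_{ik} g_{jl} \left[ a_*^k, a_*^l \right] + g_{jk} \frac{\partial a_*^k}{\partial x^i} + g_{ik} \frac{\partial a_*^k}{\partial x^j} + i g_{jk} \left[ a_i, a_*^k \right] \\ & - g_{jk} \frac{\partial a_i}{\partial x_k^*} - g_{ik} \frac{\partial a_j}{\partial x_k^*} + i g_{ik} \left[ a_j, a_*^k \right] \Bigg), \end{aligned} \tag{7.11} $$
where we have defined $a_i = A_i + g_{ij}\, J A_*^j J^{-1}$ and $a_*^i = A_*^i + g^{ij}\, J A_j J^{-1}$. The projection operator $\Pi$ acting on (7.11) sends the gamma-matrix product $\gamma_+^i \gamma_-^j$ into its antisymmetric component $\frac{1}{2} [\gamma_+^i, \gamma_-^j]$, thus eliminating from (7.11) the symmetric part. Since conjugation of (the components of) a gauge potential by the real structure $J$ produces elements of the commutant $\mathcal{A}^{(\omega)\prime}$ (see (4.14)), we can write the bosonic action (7.2) in the form of a symmetrized Yang–Mills type functional
$$ I_B[\nabla, \nabla_*] = \int_{T_d} \int_{T_d^*} \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2}\; g^{ik} g^{jl} \left( F_{ij}^{\nabla,\nabla_*} F_{kl}^{\nabla,\nabla_*} + J \widetilde{F}_{ij}^{\nabla,\nabla_*} \widetilde{F}_{kl}^{\nabla,\nabla_*} J^{-1} + 2\, F_{ij}^{\nabla,\nabla_*}\, J \widetilde{F}_{kl}^{\nabla,\nabla_*} J^{-1} \right), \tag{7.12} $$
where
$$ F_{ij}^{\nabla,\nabla_*} = \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j} + i \left[ A_i, A_j \right] - g_{ik} g_{jl} \left( \frac{\partial A_*^l}{\partial x_k^*} - \frac{\partial A_*^k}{\partial x_l^*} + i \left[ A_*^k, A_*^l \right] \right), \tag{7.13} $$
$$ \widetilde{F}_{ij}^{\nabla,\nabla_*} = g_{jk} \frac{\partial A_*^k}{\partial x^i} - g_{ik} \frac{\partial A_*^k}{\partial x^j} + i g_{ik} g_{jl} \left[ A_*^k, A_*^l \right] - \left( g_{ik} \frac{\partial A_j}{\partial x_k^*} - g_{jk} \frac{\partial A_i}{\partial x_k^*} + i \left[ A_i, A_j \right] \right). \tag{7.14} $$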
Note that the field strength (7.14) is obtained from (7.13) by interchanging the gauge potentials $A_i \leftrightarrow g_{ij} A_*^j$, but not the local coordinates $(x, x^*)$. According to the description of Sect. 3 (see Proposition 1), the commutators in (7.13) and (7.14) can be defined using the Moyal bracket
$$ [A, B] \equiv \{A, B\}_\omega = A \star_\omega B - B \star_\omega A, \tag{7.15} $$
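where
$$ (A \star_\omega B)(x, x^*) = \exp \left[ i\pi\, \omega^{ij} \left( \frac{\partial}{\partial x^i} \frac{\partial}{\partial x'^j} - g_{ik} g_{jl} \frac{\partial}{\partial x_k^*} \frac{\partial}{\partial x'^*_l} \right) \right] A(x, x^*)\, B(x', x'^*) \Bigg|_{(x', x'^*) = (x, x^*)} \tag{7.16} $$
is the deformed product on $C^\infty(T_d \times T_d^*)$ (see also ref. [36]).
The fermionic action can be written in a more transparent form as follows. We fix a spin structure such that any spinor field on $T_d \times T_d^*$ can be decomposed into a periodic spinor $\chi$ and an antiperiodic spinor $\chi_*$,
$$ \psi = \chi \oplus \chi_* \tag{7.17} $$
with respect to a homology basis. They are defined by the conditions
$$ \left( \gamma_i^+ - \gamma_i^- \right) \chi = 0, \qquad \left( \gamma_i^+ + \gamma_i^- \right) \chi_* = 0 \tag{7.18} $$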
for all $i = 1, \ldots, d$. It is important to note that this periodic-antiperiodic decomposition is not the same as the chiral-antichiral one in (4.1), although its behaviour under the action of the charge conjugation operator $J$ is very similar. From (4.11) it follows that the corresponding anti-spinors $\widehat{\chi} = J^{-1} \chi$ and $\widehat{\chi}_* = J^{-1} \chi_*$ obey, respectively, antiperiodic and periodic conditions. Furthermore, as there are $2^d$ possible choices of spin structure on the d-torus $T_d$, there are many other analogous decompositions that one can make. However, these choices are all related by “partial” T-duality transformations of the noncommutative geometry [17] and hence the fermionic action is independent of the choice of particular spin structure. Here we choose the one which makes its duality symmetries most explicit. Defining, with the usual conventions, the gamma-matrices $\gamma_i = \frac{1}{2} (\gamma_i^+ + \gamma_i^-)$ and $\gamma_*^i = \frac{1}{2} g^{ij} (\gamma_j^+ - \gamma_j^-)$, after some algebra we find that the fermionic action (7.3) can be written in terms of the decomposition (7.17) as
$$ \begin{aligned} I_F[\chi, \chi_*, \widehat{\chi}, \widehat{\chi}_*; \nabla, \nabla_*] = \int_{T_d} \int_{T_d^*} \prod_{i=1}^d \frac{dx^i\, dx_i^*}{(2\pi)^2} \Bigg[ & -i\, \widehat{\chi}_*^\dagger\, g^{ij} \gamma_i \left( \frac{\partial}{\partial x^j} + i A_j \right) \chi + \chi_*^\dagger\, g_{ij} \gamma_*^i A_*^j\, \widehat{\chi} \\ & - i\, \widehat{\chi}^\dagger\, g_{ij} \gamma_*^i \left( \frac{\partial}{\partial x_j^*} + i A_*^j \right) \chi_* + \chi^\dagger\, g^{ij} \gamma_i A_j\, \widehat{\chi}_* \Bigg]. \end{aligned} \tag{7.19} $$
The (left) action of gauge potentials on fermion fields in (7.19) is given by the action of the tachyon generators
$$ \left( V_{q^\pm} f \right)(r^\pm) = e^{2\pi i\, q_i^\pm g^{ij} r_j^\pm}\, f(r^\pm + q^\pm) \tag{7.20} $$
on functions $f \in \mathcal{S}(\Lambda)$. Note that (7.19) naturally includes the (left) action of gauge potentials on anti-fermion fields.
The duality transformation is defined by interchanging starred quantities with unstarred ones. In terms of the fields of the gauge theory this is the mapping
$$ A_i \leftrightarrow g_{ij} A_*^j, \qquad \chi \leftrightarrow \chi_*, \qquad \widehat{\chi} \leftrightarrow \widehat{\chi}_*, \tag{7.21} $$
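while in terms of the geometry of the space $T_d \times T_d^*$ we have
$$ x^i \leftrightarrow g^{ij} x_j^*, \qquad \gamma_i \leftrightarrow g_{ij} \gamma_*^j \tag{7.22} $$
for all $i = 1, \ldots, d$. It is easily seen that both the bosonic and fermionic actions above are invariant under this transformation, so that
$$ I[\chi, \chi_*, \widehat{\chi}, \widehat{\chi}_*; \nabla, \nabla_*] = I[\chi_*, \chi, \widehat{\chi}_*, \widehat{\chi}; \nabla_*, \nabla]. \tag{7.23} $$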
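The Moyal product (7.16), which defines the commutators entering the actions above, acts very simply on plane waves: for $A = e^{i(m \cdot x + m_* \cdot x^*)}$ and $B = e^{i(n \cdot x + n_* \cdot x^*)}$ the exponential of derivatives reduces to the phase $\exp[-i\pi\, \omega^{ij} (m_i n_j - g_{ik} g_{jl} m_*^k n_*^l)]$ multiplying the plane wave with added momenta. The following minimal sketch evaluates this phase for d = 2. The numerical values of `OMEGA` and `G`, the antisymmetry assumed for `OMEGA`, and all helper names are assumptions made for illustration only; they are not taken from the text.

```python
import cmath
import math

OMEGA = [[0.0, 0.3], [-0.3, 0.0]]  # assumed antisymmetric deformation matrix omega^{ij}
G = [[2.0, 0.5], [0.5, 1.0]]       # assumed symmetric metric g_{ij}


def lower(v):
    """Lower an index with the metric: v_i = g_{ik} v^k."""
    return [sum(G[i][k] * v[k] for k in range(2)) for i in range(2)]


def pairing(a, b):
    """Bilinear form omega^{ij} a_i b_j."""
    return sum(OMEGA[i][j] * a[i] * b[j] for i in range(2) for j in range(2))


def star_phase(p, q):
    """Phase c(p, q) in e_p * e_q = c(p, q) e_{p+q}, with p = (m, m_star)."""
    (m, m_star), (n, n_star) = p, q
    exponent = pairing(m, n) - pairing(lower(m_star), lower(n_star))
    return cmath.exp(-1j * math.pi * exponent)


def add(p, q):
    """Add the (momentum, dual momentum) labels of two plane waves."""
    return tuple(tuple(x + y for x, y in zip(a, b)) for a, b in zip(p, q))


p = ((1, 0), (0, 1))
q = ((0, 1), (1, 0))
r = ((2, -1), (1, 1))

# Coefficient of e_{p+q} in the Moyal bracket {e_p, e_q} of (7.15).
bracket_coefficient = star_phase(p, q) - star_phase(q, p)
```

With an antisymmetric `OMEGA` the two orderings give complex-conjugate phases, so the bracket coefficient is a pure sine, and bilinearity of the exponent makes the product of plane waves associative.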
More general duality transformations can also be defined by (7.21) and (7.22) (as well as the spinor conditions (7.18)) taken over only a subset of all the coordinate directions $i = 1, \ldots, d$. In all cases we find a manifestly duality invariant gauge theory. In a four-dimensional spacetime, the left-right symmetric combination $F + J F J^{-1}$ of a field strength is relevant to the proper addition of a topological term for the gauge field to the Yang–Mills action yielding a gauge theory that has an explicit (anti-)self-dual form [37]. As we will describe in the next section, the extra terms in (7.12) incorporate, in a certain sense to be explained, the “dual” $J \widetilde{F}_{ij}^{\nabla,\nabla_*} J^{-1}$ to the field strength $F_{ij}^{\nabla,\nabla_*}$. The action (7.12) is therefore also naturally invariant under the symmetry $F_{ij}^{\nabla,\nabla_*} \leftrightarrow J \widetilde{F}_{ij}^{\nabla,\nabla_*} J^{-1}$ which can be thought of as a particle-antiparticle duality on $T_d \times T_d^*$ with respect to the chiral Lorentzian metric (2.7).
More generally, the action (7.1) is invariant under the automorphism group given in Proposition 3. The gauge group is the affine Lie group $P_0\, \widehat{\mathrm{Inn}}^{(0)}(\mathcal{A})\, P_0$ which contains the duality symmetries and also the diffeomorphisms of $T_d \times T_d^*$ generated by the Heisenberg fields. Thus the gauge invariance of (7.1) also naturally incorporates the gravitational interactions in the target space. The “diffeomorphism” invariance of the action under the group $P_0\, \mathrm{Out}(\mathcal{A})\, P_0$ naturally incorporates the $O(d, d; \mathbb{Z})$ Morita equivalences between classically distinct theories. The discrete group $O(d, d; \mathbb{Z})$ acts on the gauge potentials as
$$ \begin{pmatrix} A_i \\ A_*^i \end{pmatrix} \mapsto \begin{pmatrix} (A^\top)_i{}^j & (C^\top)_{ij} \\ (B^\top)^{ij} & (D^\top)^i{}_j \end{pmatrix} \begin{pmatrix} A_j \\ A_*^j \end{pmatrix}, \tag{7.24} $$
where the $d \times d$ matrices $A, B, C, D$ are defined as in Proposition 2. Again, this symmetry is the statement that compactifications on Morita equivalent tori are physically equivalent. The $O(2, \mathbb{R})$ part of this group yields the discrete duality symmetries described above and in general it rotates the two gauge potentials among each other as
$$ \begin{pmatrix} A_i + g_{ij} A_*^j \\ A_i - g_{ij} A_*^j \end{pmatrix} \mapsto \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} A_i + g_{ij} A_*^j \\ A_i - g_{ij} A_*^j \end{pmatrix}, \qquad \theta \in [0, 2\pi) \tag{7.25} $$
for each $i = 1, \ldots, d$. The action is in this sense a complete isomorphism invariant of the twisted module $\widetilde{T}_\omega^d$. The key feature leading to this property is that (7.1) is a spectral invariant of the Dirac operators (4.18).
8. Physical Characteristics of $\widetilde{T}_\omega^d$
Having obtained a precise duality-symmetric characterization of the twisted module $\widetilde{T}_\omega^d$ over the noncommutative torus, we now discuss some heuristic aspects of it using the gauge theory developed in the previous section. We remark first of all that the operators
(7.13) and (7.14), which can be interpreted as Yang–Mills curvatures, change sign under the above duality transformation. This signals a change of orientation of the vector bundle (represented by the finitely-generated projective module) over the tachyon algebra. Such changes of orientation of vector bundles under duality are a common feature of explicitly duality-symmetric quantum field theories [31].
It is interesting to note that in this duality symmetric framework there are analogs of the usual Yang–Mills instantons in any dimension. They are defined by the curvature condition
$$ F_{ij}^{\nabla,\nabla_*} = -J \widetilde{F}_{ij}^{\nabla,\nabla_*} J^{-1}. \tag{8.1} $$
Since the bosonic action functional (7.12) is the “square” of the operator $F^{\nabla,\nabla_*} + J \widetilde{F}^{\nabla,\nabla_*} J^{-1}$, Eqs. (8.1) determine those gauge field configurations at which the bosonic action functional attains its global minimum of 0. They therefore define instanton-like solutions of the duality-symmetric gauge theory. It is in this sense that the operator $J$ acts to map the field strength $\widetilde{F}^{\nabla,\nabla_*}$ into the dual of $F^{\nabla,\nabla_*}$. That the instanton charge (or Chern number) here is 0 follows from the fact that the gauge theory we constructed in Sect. 6 was built on a trivial vector bundle where the module of sections is the algebra itself. To obtain instanton field configurations with non-trivial topological charges one needs to use non-trivial bundles which are constructed using non-trivial projectors. It would be very interesting to generalize the gauge theory of this paper to twisted and also non-abelian modules.
Although the general solutions of Eqs. (8.1) appear difficult to deduce, there is one simple class that can be immediately identified. For this, we consider the diagonal subgroup $\Lambda_{\rm diag}$ of the Narain lattice (2.3), which we can decompose into two subgroups $\Lambda^\pm_{\rm diag}$ that are generated by the bases $\{ e^i \oplus (\pm g^{ij} e_j) \}_{i=1}^d$, respectively. These rank $d$ lattices define, respectively, self-dual and anti-self-dual d-dimensional tori $T_d^\pm \equiv \mathbb{R}^d / 2\pi \Lambda^\pm_{\rm diag} \subset T_d \times T_d^*$. Then, on these tori, the gauge potentials obey the self-duality and anti-self-duality conditions
$$ x^i = \pm\, g^{ij} x_j^*, \qquad A_i = \pm\, g_{ij} A_*^j. \tag{8.2} $$
On these gauge field configurations the curvatures (7.13) and (7.14) vanish identically,
$$ F_{ij}^{\nabla,\pm\nabla} = \widetilde{F}_{ij}^{\nabla,\pm\nabla} = 0. \tag{8.3} $$
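The statement that the integrand of (7.12) is the “square” of $F + J \widetilde{F} J^{-1}$, and therefore vanishes exactly on the instanton condition (8.1), can be illustrated in a commutative toy model. The sketch below assumes d = 3, real antisymmetric arrays in place of the operator-valued curvatures, and a trivial action of $J$; the matrix `GINV` (standing in for $g^{ij}$) and the sample values are assumptions chosen only for illustration.

```python
from itertools import product

# Assumed symmetric, positive-definite inverse metric g^{ij} for d = 3.
GINV = [[1.0, 0.2, 0.0], [0.2, 2.0, 0.1], [0.0, 0.1, 0.5]]


def contract(X, Y):
    """Scalar g^{ik} g^{jl} X_{ij} Y_{kl}."""
    return sum(GINV[i][k] * GINV[j][l] * X[i][j] * Y[k][l]
               for i, j, k, l in product(range(3), repeat=4))


def antisym(a, b, c):
    """Antisymmetric 3x3 array with independent entries a, b, c."""
    return [[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]]


F = antisym(1.0, -0.5, 2.0)        # toy stand-in for F_{ij}
F_dual = antisym(0.3, 0.7, -1.2)   # toy stand-in for J F~_{ij} J^{-1}

# Integrand of (7.12) written out term by term ...
expanded = contract(F, F) + contract(F_dual, F_dual) + 2 * contract(F, F_dual)

# ... and as the square of the sum F + F_dual.
total = [[F[i][j] + F_dual[i][j] for j in range(3)] for i in range(3)]
squared = contract(total, total)

# On the instanton condition (8.1), F_dual = -F, the integrand vanishes.
minus_F = [[-F[i][j] for j in range(3)] for i in range(3)]
at_instanton = contract(F, F) + contract(minus_F, minus_F) + 2 * contract(F, minus_F)
```

For a positive-definite metric the squared form is non-negative, which is why the configurations satisfying (8.1) sit at the global minimum of the bosonic action in this simplified picture.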
Thus the self-dual and anti-self-dual gauge field configurations provide the analog of the instanton solutions which minimize the usual Euclidean Yang–Mills action functional. The conditions (8.2) can be thought of as projections onto the classical sector of the theory in which there is only a single, physical gauge potential in a d-dimensional spacetime. In the classical theory duality symmetries are absent and so the gauge field action vanishes identically. The Yang–Mills type action functional (7.12) can therefore be thought of as measuring the amount of asymmetry between a given connection and its dual on $\widetilde{T}_\omega^d$. As such, it measures how much duality symmetry is present in the target space and hence how far away the stringy perturbation is from ordinary classical spacetime. The action (7.12) can thus be regarded as an effective measure of distance scales in spacetime.
There are other interesting physical projections of the theory that one can make which, unlike the relations (8.2), break the duality symmetries explicitly. For instance,
consider the projection $T_d \times T_d^* \to T_d^+$ along with the freezing out of the dual gauge degrees of freedom. This means that the fields now depend only on the local coordinates $x^i = g^{ij} x_j^*$, and the dual gauge potential $A_*^i$ is frozen at some constant value,
$$ (x, x^*) \to (x, x), \qquad A_*^i \to \mathrm{const}. \tag{8.4} $$
Since each $A_*^i$ is constant, it follows that $[A_*^i, A_*^j] = 0$. The field strength (7.14) is then identical to (7.13) which becomes the usual Yang–Mills curvature of the d-dimensional gauge field $A_i$ over $T_d^+$,
$$ F_{ij}[A] \equiv F_{ij}^{\nabla,\nabla_*} \Big|_{T_d^+} = -\widetilde{F}_{ij}^{\nabla,\nabla_*} \Big|_{T_d^+} = \partial_i A_j - \partial_j A_i + i [A_i, A_j]. \tag{8.5} $$
The bosonic action functional can be easily read off from (7.11),
$$ \Pi \left[ \slashed{\partial}_\nabla^{\,2} - \bar{\slashed{\partial}}_\nabla^{\,2} \right] \Big|_{T_d^+} = -\frac{i}{4} \left[ \gamma_+^i, \gamma_-^j \right] \otimes \left( F_{ij}[A] - J F_{ij}[A] J^{-1} \right). \tag{8.6} $$
When the operator (8.6) is squared, the resulting bosonic action has the form of a symmetrized Yang–Mills functional for the gauge field $A_i$ on $T_d^+$. It is remarkable that the projection (8.4) reduces the bosonic action functional (7.2) to the standard Yang–Mills action used in noncommutative geometry [37]. In the infrared limit $g \to \infty$ ($\omega \to 0$), the Moyal bracket (7.15) vanishes and the gauge theory generated by (8.6) becomes the usual electrodynamics on the commutative manifold $T_d^+$. Thus at large-distance scales, we recover the usual commutative classical limit with the canonical abelian gauge theory defined on it [4,5]. The continuous “internal” space $T_d^*$ of the string spacetime acts to produce a sort of Kaluza-Klein mechanism by inducing nonabelian degrees of freedom when the radii of compactification are made very small. This nonabelian generating mechanism is rather different in spirit than the usual ones of noncommutative geometry which extend classical spacetime, represented by the commutative algebra $C^\infty(T_d)$, by a discrete internal space, represented typically by a noncommutative finite-dimensional matrix algebra. In the present case the “internal” space comes from the natural embedding of the classical spacetime into the noncommutative string spacetime represented by the tachyon sector of the vertex operator algebra. In this context we find that the role of the noncommutativity of spacetime coordinates at very short distance scales is to induce internal (nonabelian) degrees of freedom.
As for the fermionic sector of the field theory, the projection above applied to the spinor fields is defined as
$$ \chi \big|_{T_d^+} = \chi_* \big|_{T_d^+}, \qquad \widehat{\chi} \big|_{T_d^+} = \widehat{\chi}_* \big|_{T_d^+} \tag{8.7} $$
with $\gamma_i = g_{ij} \gamma_*^j$. Then, denoting the constant value of $A_*^i$ by $M^i$, it follows that the fermionic action (7.19) becomes the usual gauged Dirac action for the fermion fields $(\widehat{\chi}, \chi)$ minimally coupled to the nonabelian Yang–Mills gauge field $A_i$ and with mass parameters $M^i$,
$$ I_F[\chi, \widehat{\chi}; \nabla, \nabla_*] \Big|_{T_d^+} = I_{\rm Dirac}[\chi, \widehat{\chi}, M; A]. \tag{8.8} $$
Thus the internal symmetries of the string geometry also induce fermion masses, and so the explicit breaking of the duality symmetries, required to project onto physical spacetime, of the twisted module acts as a sort of geometrical mass generating mechanism.
Again in the classical limit $g \to \infty$ the left action (7.20) becomes ordinary multiplication and (8.8) becomes the Dirac action for $U(1)$ fermions coupled to electrodynamics.
The fact that the Kaluza-Klein modes coming from $T_d^-$ induce nonabelian degrees of freedom and fermion masses could have important ramifications for string phenomenology. In particular, when $d = 4$, the above projections suggest a stringy origin for the canonical action of the standard model. We remark again that the nonabelian gauge group thus induced is the natural enhancement of the generic abelian $U(1)^d$ gauge symmetry within the vertex operator algebra [21]. It is intriguing that both nonabelian gauge degrees of freedom and masses are induced so naturally by string geometry, and it would be interesting to study the physical consequences of this feature in more depth.
It would also be very interesting to give a physical origin for the duality-symmetric noncommutative gauge theory developed here using either standard model or M-Theory physics. For instance, in [7] it is argued that the characteristic interaction term for gauge theory on the noncommutative torus, within the framework of compactified Matrix Theory [6], appears naturally as the worldline field theories of $N$ D-particles. This observation follows from careful consideration of the action of T-duality on superstring data in the presence of background fields. What we have shown here is that the natural algebraic framework for the noncommutative geometry of string theory (i.e. vertex operator algebras) has embedded within it a very special representation of the noncommutative torus, and that the particular module $\widetilde{T}_\omega^d$ determines a gauge theory which is manifestly duality-invariant. Thus we can derive an explicitly duality-symmetric field theory based on very basic principles of string geometry. This field theory is manifestly covariant, at the price of involving highly non-local interactions.
The non-locality arises from the algebraic relations of the vertex operator algebra and as such it reflects the nature of the string interactions. The fact that string dynamics control the very structure of the field theory should mean that it has a more direct relationship to M-Theory dynamics. One possible scenario would be to interpret the dimension $N$ of the matrices which form the dynamical variables of Matrix Theory as the winding numbers of strings wrapping around the d-torus $T_d$. In the 11-dimensional light-cone frame, $N$ is related to the longitudinal momentum as $p_+ \propto N$, so that the vertex operator algebra is a dual model for the M-Theory dynamics in which the light-cone momenta are represented by winding numbers. The limit $N \to \infty$ of infinite winding number corresponds to the usual Matrix Theory description of M-Theory dynamics in the infinite momentum frame. The origin of the noncommutative torus as the infinite winding modes of strings wrapping around $T_d$ is naturally contained within the tachyon sector of the vertex operator algebra and yields a dual picture of infinite momentum frame dynamics.
We have also seen that the relationship between lattice vertex operator algebras and the noncommutative torus implies a new physical interpretation of Morita equivalence in terms of target space duality transformations. It would be interesting to investigate the relationship between this duality and the non-classical Nahm duality which maps instantons on one noncommutative torus to instantons on another (dual) one [6,19]. It appears that the duality interpretations in the case of the twisted module $\widetilde{T}_\omega^d$ are somewhat simpler because of the special relationship between the deformation parameters $\omega^{ij}$ and the metric of the compactification lattice (see Proposition 1). This relation in essence breaks some of the symmetries of the space.
In any case, it remains to study the gauge group of the present module in more detail, given the results of [21], and hence to probe the structure of the gauge theory developed here in more depth.
Acknowledgements. We thank A. Connes, S. Majid and L. Pilo for helpful discussions. G. L. thanks all members of DAMTP and in particular Prof. M. Green for the kind hospitality in Cambridge. G. L. is a fellow of the Italian National Council of Research (CNR) under grant CNR-NATO 215.29/01. The work of R. J. S. was supported in part by the Particle Physics and Astronomy Research Council (UK).
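Appendix A. Morita Equivalence of $C^*$-Algebras
In this appendix we shall briefly describe the notion of (strong) Morita equivalence for $C^*$-algebras [38]. Additional details can be found in [5], for example. Throughout this appendix $\mathcal{A}$ is an arbitrary unital $C^*$-algebra whose norm we denote by $\| \cdot \|$. A right Hilbert module over $\mathcal{A}$ is a right $\mathcal{A}$-module $\mathcal{E}$ ⁵ endowed with an $\mathcal{A}$-valued Hermitian structure, i.e. a sesquilinear form $\langle \cdot , \cdot \rangle_{\mathcal{A}} : \mathcal{E} \times \mathcal{E} \to \mathcal{A}$ which is conjugate linear in its first argument and satisfies
$$ \langle \eta_1, \eta_2 a \rangle_{\mathcal{A}} = \langle \eta_1, \eta_2 \rangle_{\mathcal{A}}\, a, \tag{A.1} $$
$$ \langle \eta_1, \eta_2 \rangle_{\mathcal{A}}^* = \langle \eta_2, \eta_1 \rangle_{\mathcal{A}}, \tag{A.2} $$
$$ \langle \eta, \eta \rangle_{\mathcal{A}} \geq 0, \qquad \langle \eta, \eta \rangle_{\mathcal{A}} = 0 \Leftrightarrow \eta = 0 \tag{A.3} $$
for all $\eta_1, \eta_2, \eta \in \mathcal{E}$, $a \in \mathcal{A}$. We define a norm on $\mathcal{E}$ by $\| \eta \|_{\mathcal{A}} = \sqrt{\| \langle \eta, \eta \rangle_{\mathcal{A}} \|}$ for any $\eta \in \mathcal{E}$ and require that $\mathcal{E}$ be complete with respect to this norm. We also demand that the module be full, i.e. that the ideal $\mathrm{span}_{\mathbb{C}} \{ \langle \eta_1, \eta_2 \rangle_{\mathcal{A}} \mid \eta_1, \eta_2 \in \mathcal{E} \}$ is dense in $\mathcal{A}$ with respect to the norm closure. A left Hilbert module structure on a left $\mathcal{A}$-module $\mathcal{E}$ is provided by an $\mathcal{A}$-valued Hermitian structure $\langle \cdot , \cdot \rangle_{\mathcal{A}}$ on $\mathcal{E}$ which is conjugate linear in its second argument and with the condition (A.1) replaced by
$$ \langle a \eta_1, \eta_2 \rangle_{\mathcal{A}} = a\, \langle \eta_1, \eta_2 \rangle_{\mathcal{A}} \qquad \forall \eta_1, \eta_2 \in \mathcal{E},\; a \in \mathcal{A}. \tag{A.4} $$
⁵ In more simplistic terms this means that $\mathcal{E}$ carries a right action of $\mathcal{A}$.
Given a Hilbert module $\mathcal{E}$, its compact endomorphisms are obtained as usual from the “endomorphisms of finite rank”. For any $\eta_1, \eta_2 \in \mathcal{E}$ an endomorphism $|\eta_1\rangle \langle\eta_2|$ of $\mathcal{E}$ is defined by
$$ |\eta_1\rangle \langle\eta_2| (\xi) = \eta_1 \langle \eta_2, \xi \rangle_{\mathcal{A}}, \qquad \forall \xi \in \mathcal{E} \tag{A.5} $$
which is right $\mathcal{A}$-linear,
$$ |\eta_1\rangle \langle\eta_2| (\xi a) = \left( |\eta_1\rangle \langle\eta_2| (\xi) \right) a, \qquad \forall \xi \in \mathcal{E},\; a \in \mathcal{A}. \tag{A.6} $$
Its adjoint endomorphism is given by
$$ \left( |\eta_1\rangle \langle\eta_2| \right)^* = |\eta_2\rangle \langle\eta_1|, \qquad \forall \eta_1, \eta_2 \in \mathcal{E}. \tag{A.7} $$
It can be shown that for $\eta_1, \eta_2, \xi_1, \xi_2 \in \mathcal{E}$ one has the expected composition rule
$$ |\eta_1\rangle \langle\eta_2| \circ |\xi_1\rangle \langle\xi_2| = \left| \eta_1 \langle \eta_2, \xi_1 \rangle_{\mathcal{A}} \right\rangle \langle\xi_2| = |\eta_1\rangle \left\langle \langle \eta_2, \xi_1 \rangle_{\mathcal{A}}\, \xi_2 \right|. \tag{A.8} $$
It turns out that if $\mathcal{E}$ is a finitely-generated projective module then the norm closure $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$ (with respect to the natural operator norm which yields a $C^*$-algebra) of the $\mathbb{C}$-linear span of the endomorphisms of the form (A.5) coincides with the endomorphism algebra $\mathrm{End}_{\mathcal{A}}(\mathcal{E})$ of $\mathcal{E}$. In fact, this property completely characterizes finitely-generated projective modules. For then, there are two finite sequences $\{\xi_k\}$ and $\{\zeta_k\}$ of elements of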
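$\mathcal{E}$ such that the identity endomorphism $\mathbb{I}_{\mathcal{E}}$ can be written as $\mathbb{I}_{\mathcal{E}} = \sum_k |\xi_k\rangle \langle\zeta_k|$. For any $\eta \in \mathcal{E}$, we then have
$$ \eta = \mathbb{I}_{\mathcal{E}}\, \eta = \sum_k |\xi_k\rangle \langle\zeta_k|\, \eta = \sum_k \xi_k \langle \zeta_k, \eta \rangle_{\mathcal{A}}, \tag{A.9} $$
and hence $\mathcal{E}$ is finitely-generated by the sequence $\{\xi_k\}$. If $N$ is the length of the sequences $\{\xi_k\}$ and $\{\zeta_k\}$, we can embed $\mathcal{E}$ as a direct summand of $\mathcal{A}^N \equiv \bigoplus_{n=1}^N \mathcal{A}$, proving that it is projective. The embedding and surjection maps are defined, respectively, by
$$ \lambda : \mathcal{E} \to \mathcal{A}^N, \quad \lambda(\eta) = \left( \langle \zeta_1, \eta \rangle_{\mathcal{A}}, \ldots, \langle \zeta_N, \eta \rangle_{\mathcal{A}} \right), \qquad \rho : \mathcal{A}^N \to \mathcal{E}, \quad \rho\left( (a_1, \ldots, a_N) \right) = \sum_k \xi_k a_k. \tag{A.10} $$
Then, given any $\eta \in \mathcal{E}$, we have
$$ \rho \circ \lambda(\eta) = \rho\left( \left( \langle \zeta_1, \eta \rangle_{\mathcal{A}}, \ldots, \langle \zeta_N, \eta \rangle_{\mathcal{A}} \right) \right) = \sum_k \xi_k \langle \zeta_k, \eta \rangle_{\mathcal{A}} = \sum_k |\xi_k\rangle \langle\zeta_k| (\eta) = \mathbb{I}_{\mathcal{E}}(\eta) \tag{A.11} $$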
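so that $\rho \circ \lambda = \mathbb{I}_{\mathcal{E}}$, as required. The projector $p = \lambda \circ \rho$ identifies $\mathcal{E}$ as $p \mathcal{A}^N$.
For any full Hilbert module $\mathcal{E}$ over a $C^*$-algebra $\mathcal{A}$, the latter is (strongly) Morita equivalent to the $C^*$-algebra $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$ of compact endomorphisms of $\mathcal{E}$. If $\mathcal{E}$ is finitely-generated and projective, so that $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E}) = \mathrm{End}_{\mathcal{A}}(\mathcal{E})$, then the algebra $\mathcal{A}$ is strongly Morita equivalent to the whole of $\mathrm{End}_{\mathcal{A}}(\mathcal{E})$. The equivalence is expressed as follows. The idea is to construct an $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$-valued Hermitian structure on $\mathcal{E}$ which is compatible with the Hermitian structure $\langle \cdot , \cdot \rangle_{\mathcal{A}}$. Consider then a full right Hilbert module $\mathcal{E}$ over the algebra $\mathcal{A}$ with $\mathcal{A}$-valued Hermitian structure $\langle \cdot , \cdot \rangle_{\mathcal{A}}$. It follows that $\mathcal{E}$ is a left module over the $C^*$-algebra $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$. A left Hilbert module structure is constructed by inverting the definition (A.5) so as to produce an $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$-valued Hermitian structure on $\mathcal{E}$,
$$ \langle \eta_1, \eta_2 \rangle_{\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})} = |\eta_1\rangle \langle\eta_2|, \qquad \forall \eta_1, \eta_2 \in \mathcal{E}. \tag{A.12} $$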
It is straightforward to check that (A.12) satisfies all the properties of a left Hermitian structure including conjugate linearity in its second argument. It follows from the definition of a compact endomorphism that the module $\mathcal{E}$ is also full as a module over $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$. Furthermore, from the definition (A.5) we have a compatibility condition between the two Hermitian structures on $\mathcal{E}$,
$$ \langle \eta_1, \eta_2 \rangle_{\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})}\, \xi = |\eta_1\rangle \langle\eta_2| (\xi) = \eta_1 \langle \eta_2, \xi \rangle_{\mathcal{A}}, \qquad \forall \eta_1, \eta_2, \xi \in \mathcal{E}. \tag{A.13} $$
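The decomposition (A.9) and the maps $\lambda$, $\rho$ of (A.10) can be made concrete in the simplest possible case. The following minimal sketch assumes $\mathcal{A} = \mathbb{C}$ and $\mathcal{E} = \mathbb{C}^2$ with the standard Hermitian structure, and takes $\zeta_k = \xi_k$ to be a three-element tight frame; the frame, the test vector and all helper names are assumptions chosen for illustration. It checks that $\rho \circ \lambda$ is the identity on $\mathcal{E}$ and that $p = \lambda \circ \rho$ is a projector on $\mathcal{A}^N$ with N = 3.

```python
import math


def inner(eta1, eta2):
    """Hermitian structure <eta1, eta2>, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(eta1, eta2))


N = 3
# Three vectors at 120 degrees, scaled so that sum_k |xi_k><xi_k| is the identity.
frame = [[math.sqrt(2 / 3) * math.cos(2 * math.pi * k / N) + 0j,
          math.sqrt(2 / 3) * math.sin(2 * math.pi * k / N) + 0j]
         for k in range(N)]


def lam(eta):
    """Embedding E -> A^N, eta -> (<zeta_k, eta>)_k, with zeta_k = xi_k here."""
    return [inner(zeta, eta) for zeta in frame]


def rho(a):
    """Surjection A^N -> E, (a_k) -> sum_k xi_k a_k."""
    return [sum(frame[k][i] * a[k] for k in range(N)) for i in range(2)]


eta = [1.0 - 2.0j, 0.5 + 0.25j]
reconstructed = rho(lam(eta))
reconstruction_error = max(abs(x - y) for x, y in zip(reconstructed, eta))

# Matrix of p = lam o rho on A^N: p_{kl} = <zeta_k, xi_l>.
p = [[inner(frame[k], frame[l]) for l in range(N)] for k in range(N)]
p_squared = [[sum(p[k][m] * p[m][l] for m in range(N)) for l in range(N)]
             for k in range(N)]
idempotent_error = max(abs(p_squared[k][l] - p[k][l])
                       for k in range(N) for l in range(N))
rank = sum(p[k][k] for k in range(N)).real  # trace of the projector
```

The trace of the projector equals the dimension of $\mathcal{E}$, which is the toy-model expression of the identification of $\mathcal{E}$ with $p \mathcal{A}^N$.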
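The Morita equivalence is also expressed by saying that the module $\mathcal{E}$ is an $\mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$-$\mathcal{A}$ equivalence Hilbert bimodule. A $C^*$-algebra $\mathcal{B}$ is said to be Morita equivalent to the $C^*$-algebra $\mathcal{A}$ if $\mathcal{B} \cong \mathrm{End}^0_{\mathcal{A}}(\mathcal{E})$ for some $\mathcal{A}$-module $\mathcal{E}$.
Morita equivalent $C^*$-algebras have equivalent representation theories. If the $C^*$-algebras $\mathcal{A}$ and $\mathcal{B}$ are Morita equivalent with $\mathcal{B}$-$\mathcal{A}$ equivalence bimodule $\mathcal{E}$, then given a representation of $\mathcal{A}$, using $\mathcal{E}$ we can construct a unitary equivalent representation of $\mathcal{B}$. For this, let $(\mathcal{H}, \pi_{\mathcal{A}})$ be a representation of $\mathcal{A}$ on a Hilbert space $\mathcal{H}$.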
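The algebra $\mathcal{A}$ acts by bounded operators on the left on $\mathcal{H}$ via $\pi_{\mathcal{A}}$. This action can be used to construct another Hilbert space
$$ \mathcal{H}' = \mathcal{E} \otimes_{\mathcal{A}} \mathcal{H}, \qquad \eta a \otimes_{\mathcal{A}} \psi - \eta \otimes_{\mathcal{A}} \pi_{\mathcal{A}}(a) \psi = 0, \quad \forall a \in \mathcal{A},\; \eta \in \mathcal{E},\; \psi \in \mathcal{H} \tag{A.14} $$
with scalar product
$$ \left( \eta_1 \otimes_{\mathcal{A}} \psi_1, \eta_2 \otimes_{\mathcal{A}} \psi_2 \right)_{\mathcal{H}'} = \left( \psi_1, \langle \eta_1, \eta_2 \rangle_{\mathcal{A}}\, \psi_2 \right)_{\mathcal{H}}, \qquad \forall \eta_1, \eta_2 \in \mathcal{E},\; \psi_1, \psi_2 \in \mathcal{H}. \tag{A.15} $$
A representation $(\mathcal{H}', \pi_{\mathcal{B}})$ of the algebra $\mathcal{B}$ is then defined by
$$ \pi_{\mathcal{B}}(b) (\eta \otimes_{\mathcal{A}} \psi) = (b \eta) \otimes_{\mathcal{A}} \psi, \qquad \forall b \in \mathcal{B},\; \eta \otimes_{\mathcal{A}} \psi \in \mathcal{H}'. \tag{A.16} $$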
This representation is unitary equivalent to the representation $(\mathcal{H}, \pi_{\mathcal{A}})$. Conversely, starting with a representation of $\mathcal{B}$, we can use a conjugate $\mathcal{A}$-$\mathcal{B}$ equivalence bimodule $\overline{\mathcal{E}}$ to construct an equivalent representation of $\mathcal{A}$. Morita equivalence also yields isomorphic K-groups and cyclic homology, so that Morita equivalent algebras determine the same noncommutative geometry. However, the physical characteristics can be drastically different. For example, the algebras can have different (unitary) gauge groups and hence determine physically inequivalent gauge theories.
References
1. Rieffel, M.A.: C*-algebras associated with Irrational Rotations. Pac. J. Math. 93, 415 (1981)
2. Connes, A.: C*-algèbres et Géométrie Différentielle. C. R. Acad. Sci. Paris A290, 599 (1980); Connes, A. and Rieffel, M.A.: Yang–Mills for Noncommutative Two-tori. Contemp. Math. 62, 237 (1987); Rieffel, M.A.: Projective Modules over Higher-dimensional Noncommutative Tori. Can. J. Math. 40, 257 (1988)
3. Rieffel, M.A.: The Cancellation Theorem for Projective Modules over Irrational Rotation C*-algebras. Proc. London Math. Soc. 47, 285 (1983)
4. Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
5. Landi, G.: An Introduction to Noncommutative Spaces and their Geometries. Berlin–Heidelberg–New York: Springer, 1997
6. Connes, A., Douglas, M.R. and Schwarz, A.: Noncommutative Geometry and Matrix Theory: Compactification on Tori. J. High Energy Phys. 9802, 003 (1998)
7. Douglas, M.R. and Hull, C.M.: D-branes and the Noncommutative Torus. J. High Energy Phys. 9802, 008 (1998); Li, M.: Comments on Supersymmetric Yang–Mills Theory on a Noncommutative Torus. hep-th/9802052; Berkooz, M.: Nonlocal Field Theories and the Noncommutative Torus. Phys. Lett. B430, 237 (1998); Cheung, Y.-K.E. and Krogh, M.: Noncommutative Geometry from 0-branes in a Background B-field. Nucl. Phys. B528, 185 (1998); Bigatti, D.: Noncommutative Geometry and Super Yang–Mills Theory. hep-th/9804120
8. Ho, P.-M., Wu, Y.-Y. and Wu, Y.-S.: Towards a Noncommutative Geometric Approach to Matrix Compactification. Phys. Rev. D58, 026006 (1998); Kawano, T. and Okuyama, K.: Matrix Theory on Noncommutative Torus. Phys. Lett. B433, 29 (1998); Ardalan, F., Arfaei, H. and Sheikh-Jabbari, M.M.: Mixed Branes and Matrix Theory on Noncommutative Torus. hep-th/9803067; Ho, P.-M.: Twisted Bundle on Quantum Torus and BPS States in Matrix Theory. Phys. Lett. B434, 41 (1998)
9. Ho, P.-M. and Wu, Y.-S.: Noncommutative Gauge Theories in Matrix Theory. Phys. Rev. D58, 066003 (1998); Casalbuoni, R.: Algebraic Treatment of Compactification on Noncommutative Tori. Phys. Lett. B431, 69 (1998)
10. Witten, E.: Bound States of Strings and p-branes. Nucl. Phys. B460, 335 (1996); Douglas, M.R., Kabat, D., Pouliot, P. and Shenker, S.H.: D-branes and Short Distances in String Theory. Nucl. Phys. B485, 85 (1997)
11. Ho, P.-M. and Wu, Y.-S.: Noncommutative Geometry and D-branes. Phys. Lett. B398, 52 (1997)
12. Lizzi, F., Mavromatos, N.E. and Szabo, R.J.: Matrix σ-models for Multi D-brane Dynamics. Mod. Phys. Lett. A13, 829 (1998)
13. Banks, T., Fischler, W., Shenker, S.H. and Susskind, L.: M Theory as a Matrix Model: A Conjecture. Phys. Rev. D55, 5112 (1997)
14. Fröhlich, J. and Gawędzki, K.: Conformal Field Theory and Geometry of Strings. CRM Proc. Lecture Notes 7, 57 (1994)
15. Chamseddine, A.H.: The Spectral Action Principle in Noncommutative Geometry and the Superstring. Phys. Lett. B400, 87 (1997); An Effective Superstring Spectral Action. Phys. Rev. D56, 3555 (1997)
16. Lizzi, F. and Szabo, R.J.: Target Space Duality in Noncommutative Geometry. Phys. Rev. Lett. 79, 3581 (1997)
17. Lizzi, F. and Szabo, R.J.: Duality Symmetries and Noncommutative Geometry of String Spacetimes. Commun. Math. Phys. 197, 667 (1998)
18. Rieffel, M.A. and Schwarz, A.: Morita Equivalence of Multidimensional Noncommutative Tori. math.QA/9803057, to appear in Int. J. of Math.
19. Schwarz, A.: Morita Equivalence and Duality. Nucl. Phys. B534, 720 (1998)
20. Lizzi, F. and Szabo, R.J.: Electric-magnetic Duality in Noncommutative Geometry. Phys. Lett. B417, 303 (1998)
21. Lizzi, F. and Szabo, R.J.: Noncommutative Geometry and Spacetime Gauge Symmetries of String Theory. Chaos, Solitons and Fractals 10, 445 (1999)
22. Faddeev, L.D.: Discrete Heisenberg–Weyl Group and Modular Group. Lett. Math. Phys. 34, 249 (1995)
23. Connes, A. and Lott, J.: Particle Models and Noncommutative Geometry. Nucl. Phys. B (Proc. Suppl.) 18B, 29 (1990); Martín, C.P., Gracia-Bondía, J.M. and Várilly, J.C.: The Standard Model as a Noncommutative Geometry: The Low Energy Regime. Phys. Rep. 294, 363 (1998)
24. Connes, A.: Noncommutative Geometry and Reality. J. Math. Phys. 36, 619 (1995)
25. Connes, A.: Gravity Coupled with Matter and the Foundation of Noncommutative Geometry. Commun. Math. Phys. 182, 155 (1996)
26. Chamseddine, A.H. and Connes, A.: Universal Formula for Noncommutative Geometry Actions: Unification of Gravity and the Standard Model. Phys. Rev. Lett. 77, 4868 (1996); The Spectral Action Principle. Commun. Math. Phys. 186, 731 (1997)
27. Zwanziger, D.: Local-Lagrangian Quantum Field Theory of Electric and Magnetic Charges. Phys. Rev. D3, 880 (1971); Deser, S. and Teitelboim, C.: Duality Transformations of Abelian and Nonabelian Gauge Fields. Phys. Rev. D13, 1592 (1976); Schwarz, J.H. and Sen, A.: Duality Symmetric Actions. Nucl. Phys. B411, 35 (1994); Deser, S., Gomberoff, A., Henneaux, M. and Teitelboim, C.: Duality, Selfduality, Sources and Charge Quantization in Abelian N-form Theories. Phys. Lett. B400, 80 (1997)
28. Pasti, P., Sorokin, D. and Tonin, M.: Duality Symmetric Actions with Manifest Spacetime Symmetries. Phys. Rev. D52, 4277 (1995); Berkovits, N.: Manifest Electromagnetic Duality in Closed Superstring Field Theory. Phys. Lett. B388, 743 (1996); Local Actions with Electric and Magnetic Sources. B395, 28 (1997); Super-Maxwell Actions with Manifest Duality. B398, 79 (1997); Medina, R. and Berkovits, N.: Pasti–Sorokin–Tonin Actions in the Presence of Sources. Phys. Rev. D56, 6388 (1997)
29. Deser, S., Henneaux, M. and Teitelboim, C.: Electric-magnetic Black Hole Duality. Phys. Rev. D55, 826 (1997); Song, D.D. and Szabo, R.J.: Black String Entropy from Anomalous D-brane Couplings. hep-th/9805027
30. Aganagic, M., Park, J., Popescu, C. and Schwarz, J.H.: Dual D-brane Actions. Nucl. Phys. B496, 215 (1997); Nurmagambetov, A.: Duality-symmetric Three-brane and its Coupling to Type IIB Supergravity. Phys. Lett. B436, 289 (1998)
31. Cheung, Y.-K.E. and Yin, Z.: Anomalies, Branes and Currents. Nucl. Phys. B517, 69 (1998)
32. Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex Operator Algebras and the Monster. Pure Appl. Math. 134. New York: Academic Press, 1988; Gebert, R.W.: Introduction to Vertex Algebras, Borcherds Algebras and the Monster Lie Algebra. Intern. J. Mod. Phys. A8, 5441
33. Green, M.B., Schwarz, J.H. and Witten, E.: Superstring Theory. Cambridge: Cambridge University Press, 1987
34. Constantinescu, F. and Scharf, G.: Smeared and Unsmeared Chiral Vertex Operators. Commun. Math. Phys. 200, 275 (1999)
35. De Bièvre, S.: Chaos, Quantization and the Classical Limit on the Torus. Proc. XIV Workshop on Geometrical Methods in Physics, July 1995, Bialowieza, Poland, mp-arc 96–191; Várilly, J.C.: An Introduction to Noncommutative Geometry. Lectures at the EMS Summer School on Noncommutative Geometry and Applications, September 1997, Portugal, physics/9709045
36. Fairlie, D., Fletcher, P. and Zachos, C.: Trigonometric Structure Constants for New Infinite Dimensional Algebras. Phys. Lett. B218, 203 (1989); Infinite-Dimensional Algebras and a Trigonometric Basis for the Classical Lie Algebras. J. Math. Phys. 31, 1088 (1990); Fairlie, D. and Zachos, C.: Infinite-Dimensional Algebras, Sine Brackets, and SU(∞). Phys. Lett. B224, 101 (1989)
37. Gracia-Bondía, J.M., Iochum, B. and Schücker, T.: The Standard Model in Noncommutative Geometry and Fermion Doubling. Phys. Lett. B416, 123 (1998)
38. Rieffel, M.A.: Induced Representations of C*-algebras. Bull. Am. Math. Soc. 78, 606 (1972); Adv. Math. 13, 176 (1974); Morita Equivalence for Operator Algebras. In: Operator Algebras and Applications, Proc. Symp. Pure Math. 38, R.V. Kadison, ed. Providence, RI: American Mathematical Society, 1982, pp. 285–298
Communicated by A. Connes
Commun. Math. Phys. 206, 639 – 690 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
A Generalized Hypergeometric Function Satisfying Four Analytic Difference Equations of Askey–Wilson Type
S. N. M. Ruijsenaars
Centre for Mathematics and Computer Science, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
Received: 21 December 1998 / Accepted: 14 April 1999
Abstract: The hypergeometric function ${}_2F_1$ can be written in terms of a contour integral involving gamma functions. We generalize this (Barnes) representation by using a certain generalized gamma function as a building block. In this way we obtain a new ${}_2F_1$-generalization $R(a_+, a_-, c_0, c_1, c_2, c_3; v, \hat{v})$ with various symmetry features. We determine the analyticity properties of the $R$-function in all of its eight arguments, and show that it is a joint eigenfunction of four distinct Askey–Wilson type difference operators, two acting on $v$ and two on $\hat{v}$. The Askey–Wilson polynomials can be obtained by a suitable discretization of $v$ or $\hat{v}$.
Contents
1. Introduction
2. The R-Function: Automorphy and Meromorphy
3. The R-Function: AΔEs and Askey–Wilson Specialization
4. A Barnes Type Representation for Basic Hypergeometric Functions
Appendix A. The Hyperbolic Gamma Function and Related Functions
Appendix B. Analyticity Properties
Appendix C. Proof of Theorem A.1
References
1. Introduction

In this paper – the first of a series – we begin a detailed study of a novel generalization of the hypergeometric function ${}_2F_1(a, b, c; w)$. Recall the latter can be defined for $|w| < 1$ by the Gauss series
$${}_2F_1(a, b, c; w) = 1 + \frac{ab}{c}\,w + \frac{1}{2!}\,\frac{a(a+1)b(b+1)}{c(c+1)}\,w^2 + \cdots = \sum_{n=0}^{\infty} C_n(a, b, c)\,w^n, \tag{1.1}$$
$$C_n(a, b, c) \equiv \frac{1}{n!}\,\frac{\Gamma(a+n)}{\Gamma(a)}\,\frac{\Gamma(b+n)}{\Gamma(b)}\,\frac{\Gamma(c)}{\Gamma(c+n)}. \tag{1.2}$$
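The series (1.1) with coefficients (1.2) can be sketched numerically as follows. This is a minimal illustration, not part of the paper: the function names are hypothetical, the parameters are assumed real and positive where the gamma function is called, and the truncation length is an arbitrary choice.

```python
import math


def gauss_coefficients(a, b, c, n_terms):
    """Yield C_0, ..., C_{n_terms-1} of (1.2).

    Uses the ratio C_{n+1}/C_n = (a+n)(b+n)/((c+n)(n+1)), which follows
    from Gamma(x+1) = x*Gamma(x) and avoids overflow in the gamma function.
    """
    coeff = 1.0
    for n in range(n_terms):
        yield coeff
        coeff *= (a + n) * (b + n) / ((c + n) * (n + 1))


def gauss_2f1(a, b, c, w, n_terms=200):
    """Truncated Gauss series (1.1); only meaningful for |w| < 1."""
    if abs(w) >= 1:
        raise ValueError("the Gauss series converges only for |w| < 1")
    coeffs = gauss_coefficients(a, b, c, n_terms)
    return sum(cn * w**n for n, cn in enumerate(coeffs))


def coefficient_from_gammas(a, b, c, n):
    """C_n written exactly as in (1.2), for positive real a, b, c."""
    numerator = math.gamma(a + n) * math.gamma(b + n) * math.gamma(c)
    denominator = (math.factorial(n) * math.gamma(a) * math.gamma(b)
                   * math.gamma(c + n))
    return numerator / denominator
```

A quick sanity check is the elementary special case ${}_2F_1(1, 1, 2; w) = -\ln(1-w)/w$, whose coefficients are $C_n = 1/(n+1)$.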
An obvious way to generalize this function is, therefore, to generalize the coefficients $C_n$ of this power series. In particular, such generalizations can be defined within the context of basic hypergeometric series. Recalling the standard definitions ($|q| < 1$, $|w| < 1$)
$${}_{s+1}\phi_s(a_1, \ldots, a_{s+1}, b_1, \ldots, b_s; q, w) \equiv \sum_{n=0}^{\infty} \frac{(a_1; q)_n \cdots (a_{s+1}; q)_n}{(b_1; q)_n \cdots (b_s; q)_n}\,\frac{w^n}{(q; q)_n}, \tag{1.3}$$
$$(a; q)_n \equiv \prod_{j=1}^{n} (1 - a q^{j-1}), \quad n \in \mathbb{N}, \tag{1.4}$$
the oldest and most widely known generalization is Heine's function ${}_2\phi_1(a, b, c; q, w)$. Indeed, the coefficients
$$C_n(a, b, c; q) \equiv \frac{(a; q)_n (b; q)_n}{(c; q)_n (q; q)_n}, \quad |q| < 1, \tag{1.5}$$
of its power series clearly converge to the ${}_2F_1$-coefficients $C_n(a, b, c)$ (1.2) as $q \to 1$. One can also define various generalizations in terms of ${}_{s+1}\phi_s$ with $s > 1$ by letting the additional parameters depend on $q$ in suitable ways. But such functions will not generally satisfy q-contiguous relations and a second order q-difference equation, by contrast to Heine's ${}_2\phi_1$. (We refer to Ref. [1] for detailed information and references concerning various "q-world" results mentioned in this paper.)

Next, we recall that the Gauss series (1.1) can be made to break off by suitably specializing the parameters $a, b, c$. The largest family of polynomials obtained in this way (the Jacobi polynomials) still depends on two continuous parameters (in addition to the continuous variable $w$). As is well known, Askey and Wilson [2] generalized this family to a five-parameter family of polynomials defined in terms of ${}_4\phi_3$, viz.,
$$p_k(q, \alpha, \beta, \gamma, \delta; \cos v) = N_k\; {}_4\phi_3(q^{-k}, \alpha\beta\gamma\delta q^{k-1}, \alpha e^{iv}, \alpha e^{-iv}, \alpha\beta, \alpha\gamma, \alpha\delta; q, q). \tag{1.6}$$
Here, the normalization coefficient
$$N_k(q, \alpha, \beta, \gamma, \delta) \equiv \alpha^{-k} (\alpha\beta; q)_k (\alpha\gamma; q)_k (\alpha\delta; q)_k \tag{1.7}$$
ensures symmetry in $\alpha, \beta, \gamma, \delta$. These polynomials are eigenfunctions of a second-order q-difference operator and their three-term recurrence relation is explicitly known as well [2]. As a corollary of our work, we reobtain these results in a novel way. Moreover, omitting the constant $N_k$ in (1.6), we show that when $k$ is replaced by an arbitrary complex number, the resulting ${}_4\phi_3$-function no longer satisfies the Askey–Wilson q-difference equation, though it may be viewed as a ${}_2F_1$-generalization.
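The claim that the normalization (1.7) makes (1.6) symmetric in $\alpha, \beta, \gamma, \delta$ can be probed numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the parameters are real and chosen so that no denominator vanishes, and the terminating ${}_4\phi_3$ is summed directly from (1.3), (1.4).

```python
import cmath
from itertools import permutations


def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n as in (1.4)."""
    result = 1.0 + 0j
    for j in range(n):
        result *= 1 - a * q**j
    return result


def askey_wilson(k, q, alpha, beta, gamma, delta, v):
    """p_k(q, alpha, beta, gamma, delta; cos v) as in (1.6), (1.7)."""
    tops = [q**(-k), alpha * beta * gamma * delta * q**(k - 1),
            alpha * cmath.exp(1j * v), alpha * cmath.exp(-1j * v)]
    bottoms = [alpha * beta, alpha * gamma, alpha * delta]
    series = 0j
    # The factor (q^{-k}; q)_n vanishes for n > k, so the sum terminates.
    for n in range(k + 1):
        term = q**n / qpoch(q, q, n)
        for t in tops:
            term *= qpoch(t, q, n)
        for b in bottoms:
            term /= qpoch(b, q, n)
        series += term
    norm = alpha**(-k)
    for b in bottoms:
        norm *= qpoch(b, q, k)
    return norm * series


def symmetry_spread(k, q, params, v):
    """Largest deviation of p_k over all orderings of the four parameters."""
    values = [askey_wilson(k, q, *perm, v) for perm in permutations(params)]
    return max(abs(x - values[0]) for x in values)
```

For $k = 1$ the sum can be expanded by hand to $2\cos v\,(1 - \alpha\beta\gamma\delta) - e_1 + e_3$, with $e_1$ and $e_3$ the first and third elementary symmetric polynomials in the four parameters, which gives an independent check of the code.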
In recent years other generalizations of the hypergeometric function were also presented and studied by Grünbaum and Haine [3], and by Nishizawa and Ueno [4]. The main object of study in this paper is a function $R(a_+, a_-, c_0, c_1, c_2, c_3; v, \hat{v})$. It yields (in essence) the Askey–Wilson polynomials when one of its variables $v$ or $\hat{v}$ is suitably discretized, and it may be viewed as a ${}_2F_1$-generalization. However, its structure is essentially different from the generalizations already mentioned.

To lead up to its definition, we begin by recalling that the hypergeometric differential equation satisfied by ${}_2F_1$ can be transformed to a nonrelativistic Schrödinger equation via suitable substitutions. Specifically, introducing the wave function
$$\psi_{nr}(d, \tilde{d}; v, \hat{v}) \equiv {}_2F_1\bigl((d + \tilde{d} + i\hat{v})/2,\ (d + \tilde{d} - i\hat{v})/2,\ d + 1/2;\ -\mathrm{sh}^2 v\bigr), \tag{1.8}$$
one obtains
$$(H_{nr}\psi_{nr})(g/\hbar, \tilde{g}/\hbar; \nu x, p/\hbar\nu) = \frac{1}{2m}\bigl(p^2 + \nu^2 (g + \tilde{g})^2\bigr)\,\psi_{nr}(g/\hbar, \tilde{g}/\hbar; \nu x, p/\hbar\nu), \tag{1.9}$$
where $H_{nr}$ is the Hamiltonian
$$H_{nr} \equiv -\frac{\hbar}{2m}\bigl(\hbar\,\partial_x^2 + 2\nu[g\,\mathrm{cth}(\nu x) + \tilde{g}\,\mathrm{th}(\nu x)]\,\partial_x\bigr). \tag{1.10}$$
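The eigenvalue equation (1.9) can be checked numerically with finite differences. The sketch below is illustrative only: it assumes units with $\hbar = \nu = 1$ (so that $d = g$, $\tilde{d} = \tilde{g}$, $v = x$, $\hat{v} = p$, and the common factor $1/2m$ drops out), evaluates (1.8) through the truncated Gauss series, and therefore requires $\mathrm{sh}^2 x < 1$. All function names are hypothetical.

```python
import math


def gauss_2f1(a, b, c, w, n_terms=200):
    """Truncated Gauss series for complex a, b and |w| < 1."""
    total, coeff = 0j, 1 + 0j
    for n in range(n_terms):
        total += coeff * w**n
        coeff *= (a + n) * (b + n) / ((c + n) * (n + 1))
    return total


def psi_nr(d, d_tilde, v, v_hat):
    """Wave function (1.8), valid here only while sinh(v)**2 < 1."""
    a = (d + d_tilde + 1j * v_hat) / 2
    b = (d + d_tilde - 1j * v_hat) / 2
    return gauss_2f1(a, b, d + 0.5, -math.sinh(v)**2)


def schroedinger_residual(g, g_tilde, x, p, h=1e-3):
    """Relative residual of (1.9) in units hbar = nu = 1, times 2m."""
    def psi(y):
        return psi_nr(g, g_tilde, y, p)

    first = (psi(x + h) - psi(x - h)) / (2 * h)            # central difference
    second = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h**2
    drift = 2 * (g / math.tanh(x) + g_tilde * math.tanh(x))
    lhs = -(second + drift * first)                         # 2m * H_nr psi
    rhs = (p**2 + (g + g_tilde)**2) * psi(x)                # 2m * eigenvalue * psi
    return abs(lhs - rhs) / abs(rhs)
```

The residual is limited by the $O(h^2)$ error of the central differences, so it should be small but not at machine precision.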
Physically speaking, $m$ is the particle mass, $\hbar$ is Planck's constant, $g$ and $\tilde{g}$ are coupling constants with dimension [action], $x$ and $p$ are position and momentum, resp., and $\nu$ is a scale parameter with dimension [position]$^{-1}$. Thus, the parameters $d, \tilde{d}$ and variables $v, \hat{v}$ in (1.8) are dimensionless.

Our function $R(a_+, a_-, c_0, c_1, c_2, c_3; v, \hat{v})$ generalizes the reparametrized ${}_2F_1$-function (1.8). As it stands, it depends on four more (dimensionless) arguments than the rhs of (1.8), but it is scale invariant in the sense that
$$R(\lambda a_+, \lambda a_-, \lambda c; \lambda v, \lambda\hat{v}) = R(a_+, a_-, c; v, \hat{v}), \quad \lambda > 0. \tag{1.11}$$
(Here and in the sequel, we use $c$ to denote $(c_0, c_1, c_2, c_3)$.) Hence, one might dispense with the parameter $a_+$ (say) by taking $\lambda = a_+^{-1}$.

For fixed parameters $a_+, a_- > 0$ and (generic) real couplings $c_0, \ldots, c_3$, the function $R$ is meromorphic in the variables $v, \hat{v}$, with poles occurring solely on the imaginary axis. It generalizes the rhs of (1.8), since one has
$$\lim_{\lambda \to 0} R(\pi, \lambda, \lambda c; v, \lambda u) = {}_2F_1(\hat{c}_0 + iu, \hat{c}_0 - iu, c_0 + c_2 + 1/2; -\mathrm{sh}^2 v), \tag{1.12}$$
where
$$\hat{c}_0 \equiv (c_0 + c_1 + c_2 + c_3)/2, \tag{1.13}$$
in a sense discussed below.

The R-function is a joint eigenfunction of four independent analytic difference operators (henceforth AΔOs) of Askey–Wilson type, two acting on $v$ and the other two on $\hat{v}$. The four AΔOs will be detailed in Sect. 3. Here, we only mention one of the associated analytic difference equations (from now on AΔEs), trading $R$ for the wave function
$$\psi_{rel}(\lambda, c; v, \hat{v}) \equiv R(\pi, \lambda, \lambda c; v, \hat{v}/2). \tag{1.14}$$
In terms of physical quantities, this AΔE can be reformulated as
$$(H_{rel}\psi_{rel})(\beta\nu\hbar, g/\hbar; \nu x, \beta p) = \frac{1}{m\beta^2}\,\mathrm{ch}(\beta p)\,\psi_{rel}(\beta\nu\hbar, g/\hbar; \nu x, \beta p), \tag{1.15}$$
where $H_{rel}$ is given by
$$H_{rel} \equiv \frac{1}{2m\beta^2}\bigl[V(x)(T_{i\hbar\beta} - 1) + V(-x)(T_{-i\hbar\beta} - 1) + 2\cos\beta\nu(g_0 + g_1 + g_2 + g_3)\bigr], \tag{1.16}$$
with $V$ the "potential"
$$V(x) \equiv \frac{\mathrm{sh}\,\nu(x - i\beta g_0)}{\mathrm{sh}\,\nu x}\,\frac{\mathrm{ch}\,\nu(x - i\beta g_1)}{\mathrm{ch}\,\nu x}\,\frac{\mathrm{sh}\,\nu(x - i\beta g_2 - i\beta/2)}{\mathrm{sh}\,\nu(x - i\beta/2)}\,\frac{\mathrm{ch}\,\nu(x - i\beta g_3 - i\beta/2)}{\mathrm{ch}\,\nu(x - i\beta/2)}, \tag{1.17}$$
and $T_\alpha$ the translation operator
$$(T_\alpha F)(x) \equiv F(x - \alpha), \quad \alpha \in \mathbb{C}. \tag{1.18}$$
Consequently, one obtains
$$H_{rel} = \frac{1}{m\beta^2} + H_{nr} + C + O(\beta^2), \quad \beta \downarrow 0, \tag{1.19}$$
so that the second scale parameter $\beta$ with dimension [momentum]$^{-1}$ can be viewed as $1/mc$, with $c$ the speed of light. Moreover, (1.12) can be rewritten as a nonrelativistic limit,
$$\lim_{c \to \infty} \psi_{rel}(\hbar\nu/mc, (g_0, \ldots, g_3)/\hbar; \nu x, p/mc) = \psi_{nr}((g_0 + g_2)/\hbar, (g_1 + g_3)/\hbar; \nu x, p/\hbar\nu). \tag{1.20}$$
To be sure, the physical picture implied by the above notation and terminology needs further elaboration to be convincing to a physicist reader. It is however beyond the scope of this paper to supply the background material concerning relativistic Calogero–Moser N-particle systems, from which our ${}_2F_1$-generalization originated. The interested reader is referred to our 1994 lecture notes [5], in which the function $R(a_+, a_-, c; v, \hat{v})$ was already presented and discussed in relation to the latter integrable systems (introduced in Ref. [6] at the classical and in Ref. [7] at the quantum level).

From a mathematical viewpoint no physical interpretation is needed, of course. Rather, a mathematician may be inclined to try and tie in the novel function $R$ (which has various striking symmetry properties) with the representation theory of suitably constructed non-compact quantum groups. The latter viewpoint will not be further discussed here either. Indeed, we study the function $R$ in its own right, albeit with a slight bias towards results that are relevant to a quantum-mechanical/Hilbert space context.

By contrast to the previous ${}_2F_1$-generalizations mentioned above, we have not found any representation for $R$ as an explicit power series. Instead, our definition of $R$ proceeds in terms of a contour integral that generalizes the Barnes representation for ${}_2F_1$ (cf. e.g. Ref. [8]). For later purposes it is convenient to specify the latter in the factorized form
$${}_2F_1(f + iu, f - iu, d + 1/2; -\mathrm{sh}^2 v) = \int_C S_L(v, z)\,M(d; z)\,S_R(f; u, z)\,dz. \tag{1.21}$$
Here, the "side functions" are given by
$$S_L(v, z) \equiv \exp(-iz \ln(\mathrm{sh}^2 v)), \tag{1.22}$$
$$S_R(f; u, z) \equiv \frac{\Gamma(f + iu - iz)\,\Gamma(f - iu - iz)}{\Gamma(f + iu)\,\Gamma(f - iu)}, \tag{1.23}$$
and the "middle function" by
$$M(d; z) \equiv \frac{1}{2\pi}\,\frac{\Gamma(iz)\,\Gamma(d + 1/2)}{\Gamma(d + 1/2 - iz)}. \tag{1.24}$$
Taking $d, f, v > 0$ and $u \in \mathbb{R}$ (for ease of exposition), the contour $C$ runs along the real axis from $-\infty$ to $\infty$ and is indented downwards near $z = 0$ so as to avoid the simple pole of $\Gamma(iz)$.

With a suitable restriction on the parameters, the contour $C$ can also be used to define the function $R$. First, we introduce
$$a \equiv (a_+ + a_-)/2, \quad \hat{c}_0 \equiv (c_0 + c_1 + c_2 + c_3)/2, \tag{1.25}$$
$$s_1 \equiv c_0 + c_1 - a_-/2, \quad s_2 \equiv c_0 + c_2 - a_+/2, \quad s_3 \equiv c_0 + c_3, \quad s_4 \equiv a, \tag{1.26}$$
so that
$$s_1 + s_2 + s_3 + s_4 = 2c_0 + 2\hat{c}_0. \tag{1.27}$$
Next we require
$$s_j \in (-a, a), \quad j = 1, 2, 3, \tag{1.28}$$
$$c_0, \hat{c}_0 \in (0, a), \quad v, \hat{v} \in \mathbb{R}. \tag{1.29}$$
In view of (1.27), this yields a non-empty open subset of the coupling space $\mathbb{R}^4$. (The restrictions are imposed for expository convenience; they will be relaxed below.) Now we set
$$R(a_+, a_-, c; v, \hat{v}) = \frac{1}{(a_+ a_-)^{1/2}} \int_C I(a_+, a_-, c; v, \hat{v}, z)\,dz. \tag{1.30}$$
The integrand $I$ may be viewed as a product of fifteen functions of the form $G(a_+, a_-; \cdot)$, where $G$ is defined by
$$G(a_+, a_-; z) = \exp\left(i \int_0^\infty \frac{dy}{y} \left(\frac{\sin 2yz}{2\,\mathrm{sh}\,a_+ y\ \mathrm{sh}\,a_- y} - \frac{z}{a_+ a_- y}\right)\right), \quad |\mathrm{Im}\,z| < a. \tag{1.31}$$
This function (referred to as the hyperbolic gamma function) was introduced and studied in detail in Subsect. III A of Ref. [9]. It is manifestly analytic and zero-free in the strip $|\mathrm{Im}\,z| < a$, and obviously has the automorphy properties
$$G(a_-, a_+; z) = G(a_+, a_-; z), \quad G(-z) = 1/G(z), \quad G(\lambda a_+, \lambda a_-; \lambda z) = G(a_+, a_-; z), \quad \lambda > 0. \tag{1.32}$$
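The integral representation (1.31) can be evaluated by elementary quadrature, which gives a way to test (1.32) and the first-order difference equation that the paper states later as (3.6). The following is a rough sketch under stated assumptions: the function name, the cutoff and the step count are arbitrary illustrative choices, a composite Simpson rule is used, the slowly decaying $-z/(a_+a_-y^2)$ part of the integrand is integrated analytically beyond the cutoff, and $|\mathrm{Im}\,z|$ is assumed to stay well inside the strip $|\mathrm{Im}\,z| < a$.

```python
import cmath
import math


def log_hyperbolic_gamma(a_plus, a_minus, z, cutoff=40.0, n_steps=4000):
    """ln G(a+, a-; z) from (1.31) by composite Simpson quadrature."""
    pref = z / (a_plus * a_minus)

    def integrand(y):
        if y == 0.0:
            # Limit y -> 0 of the integrand, from the Taylor expansions
            # of sin and sinh.
            return -pref * (2 * z * z / 3 + (a_plus**2 + a_minus**2) / 6)
        first = cmath.sin(2 * y * z) / (
            2 * math.sinh(a_plus * y) * math.sinh(a_minus * y))
        return (first - pref / y) / y

    h = cutoff / n_steps  # n_steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(cutoff)
    for k in range(1, n_steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    integral = total * h / 3
    # Beyond the cutoff only -pref/y**2 survives; integrate it exactly.
    integral += -pref / cutoff
    return 1j * integral


def hyperbolic_gamma(a_plus, a_minus, z):
    """G(a+, a-; z) for |Im z| < (a+ + a-)/2."""
    return cmath.exp(log_hyperbolic_gamma(a_plus, a_minus, z))
```

With this helper, the ratio $G(z + ia_+/2)/G(z - ia_+/2)$ can be compared against $2\,\mathrm{ch}(\pi z/a_-)$; agreement is expected only up to the quadrature error.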
(Since $G$ is symmetric in $a_+, a_-$, we often suppress these parameters; recall we fix $a_+$ and $a_-$ in $(0, \infty)$.) We will need various non-obvious properties of the hyperbolic gamma function that are summarized in Appendix A.

We are now prepared to define the integrand in (1.30). It is convenient to factorize it as
$$I \equiv F(a_+, a_-, c_0; v, z)\,K(a_+, a_-, c; z)\,F(a_+, a_-, \hat{c}_0; \hat{v}, z), \tag{1.33}$$
where
$$F(a_+, a_-, b; y, z) \equiv \frac{G(z + y + ib - ia)}{G(y + ib - ia)}\,\frac{G(z - y + ib - ia)}{G(-y + ib - ia)}, \quad a = (a_+ + a_-)/2, \tag{1.34}$$
$$K(a_+, a_-, c; z) \equiv \frac{1}{G(z + ia)} \prod_{j=1}^{3} \frac{G(is_j)}{G(z + is_j)}. \tag{1.35}$$
(Note that $K$ is not symmetric in $a_+, a_-$, since $s_1$ and $s_2$ are not, cf. (1.26).) The restrictions (1.28) ensure that the poles of $K$ in $z$ lie above $C$. Similarly, the restrictions (1.29) ensure that the poles of the functions $F$ in $z$ lie below $C$. Using the pole sequence symbol explained in Appendix A, we have depicted the state of affairs in the z-plane in Fig. 1 (taking $s_3 < s_2 < s_1$ and $0 < v < \hat{v}$). From the G-asymptotics detailed in Appendix A one easily deduces that the integrand has asymptotics
$$I(z) = O\left(\exp\left[\mp 2\pi z\left(\frac{1}{a_+} + \frac{1}{a_-}\right)\right]\right), \quad \mathrm{Re}\,z \to \pm\infty, \tag{1.36}$$
so that $R$ is well defined. (Specifically, one need only use (A.29), (A.30) and (A.35) to check this.)

Next, we turn to the connection with the hypergeometric function, written in the Barnes representation (1.21)–(1.24). To this end we specialize (1.30) as
$$R(\pi, \lambda, \lambda c; v, \lambda u) = \int_C S_L(\lambda, c_0; v, z)\,M(\lambda, c; z)\,S_R(\lambda, \hat{c}_0; u, z)\,dz, \tag{1.37}$$
where
$$S_L(\lambda, c_0; v, z) \equiv \exp(2iz \ln 2)\,F(\pi, \lambda, \lambda c_0; v, \lambda z), \tag{1.38}$$
$$S_R(\lambda, \hat{c}_0; u, z) \equiv \exp(2iz \ln(2\lambda))\,F(\pi, \lambda, \lambda\hat{c}_0; \lambda u, \lambda z), \tag{1.39}$$
$$M(\lambda, c; z) \equiv \left(\frac{\lambda}{\pi}\right)^{1/2} \exp(-2iz \ln(4\lambda))\,K(\pi, \lambda, \lambda c; \lambda z). \tag{1.40}$$
[Figure 1: the z-plane with the contour $C$, indented below the pole at $0$; the upward pole sequences start at $ia - is_1$, $ia - is_2$, $ia - is_3$ and $0$, and the downward pole sequences start at $\pm v - ic_0$ and $\pm\hat{v} - i\hat{c}_0$.]
Fig. 1. The contour C vs. the upward and downward pole sequences in the z-plane
The point is now that the two zero step size limits (A.36) and (A.38) entail
$$\lim_{\lambda \downarrow 0} S_L(\lambda, c_0; v, z) = S_L(v, z), \quad v > 0, \tag{1.41}$$
$$\lim_{\lambda \downarrow 0} S_R(\lambda, \hat{c}_0; u, z) = S_R(\hat{c}_0; u, z), \tag{1.42}$$
$$\lim_{\lambda \downarrow 0} M(\lambda, c; z) = M(c_0 + c_2; z), \tag{1.43}$$
where the right-hand-side functions are given by (1.22)–(1.24). The limits (1.41)–(1.43) hold true uniformly on compacts around the contour $C$, but this does not yet suffice for the limit on the lhs of (1.12) to exist. For a complete proof of (1.12), an $L^1$-bound on the integrand in (1.37) that is uniform for $\lambda \in (0, 1]$ (say) would be sufficient. Unfortunately, we have been unable to supply such a bound thus far.

In this connection we mention that we are not aware of complete proofs that the basic hypergeometric function ${}_2\phi_1$ and other generalizations converge to ${}_2F_1$ in the obvious limit. Of course, termwise convergence is plain by inspection, but once more a uniform $\ell^1$-bound is needed to dominate the convergence. For ${}_2\phi_1$ this might be relatively simple, since a bound of the form $|C_n(q)| \le K^n$ (uniformly for $q \in [1/2, 1)$, say) would suffice to obtain uniform convergence on $|w| \le K^{-1} - \epsilon$ for all $\epsilon > 0$. But for the Askey–Wilson ${}_4\phi_3$-generalization discussed below far more delicate bounds seem to be needed to control the limit.
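The termwise convergence mentioned above can be illustrated numerically. The sketch below makes one assumption that the text leaves implicit: the Heine coefficient (1.5) is evaluated at the parameters $(q^a, q^b, q^c)$, which is the usual substitution under which $(q^a; q)_n/(1-q)^n$ tends to $a(a+1)\cdots(a+n-1)$ as $q \to 1$. The function names are hypothetical, and the example says nothing about the uniform bound that a full proof would need.

```python
def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n as in (1.4)."""
    result = 1.0
    for j in range(n):
        result *= 1 - a * q**j
    return result


def heine_coefficient(a, b, c, q, n):
    """C_n(a, b, c; q) as in (1.5)."""
    return qpoch(a, q, n) * qpoch(b, q, n) / (qpoch(c, q, n) * qpoch(q, q, n))


def gauss_coefficient(a, b, c, n):
    """C_n(a, b, c) as in (1.2), written as a finite product."""
    coeff = 1.0
    for j in range(n):
        coeff *= (a + j) * (b + j) / ((c + j) * (j + 1))
    return coeff


def termwise_error(a, b, c, q, n):
    """Relative gap between C_n(q^a, q^b, q^c; q) and C_n(a, b, c).

    Assumption: the q-parameters are taken as q**a, q**b, q**c.
    """
    exact = gauss_coefficient(a, b, c, n)
    approx = heine_coefficient(q**a, q**b, q**c, q, n)
    return abs(approx - exact) / abs(exact)
```

For fixed $n$ the gap shrinks roughly linearly in $1 - q$; very small values of $1 - q$ should be avoided in floating point because of cancellation in $1 - q^a$.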
Rigorous control on the limiting transitions $R \to {}_2F_1$ and ${}_4\phi_3 \to {}_2F_1$ is the only point left open in this article – complete proofs of all other results are supplied here or in our previous paper [9]. A comprehensive and detailed picture of the analyticity properties of the R-function, both in its parameters $a_+, a_-, c$ and in its variables $v, \hat{v}$, is critical not only in this paper, but also for further study. To arrive at the desired results, we need a substantial generalization of the asymptotic properties of the G-function. As already mentioned, Appendix A summarizes various results on the G-function that were obtained in Ref. [9], but our sharpening of the asymptotics (laid down in Theorem A.1) involves considerable technicalities that are relegated to Appendix C.

In Appendix B we obtain analyticity results on a class of functions that includes the a priori least accessible part of the R-function, namely, the contour integral involving the eight z-dependent G-functions, cf. (1.30). The generalization of this contour integral studied in Appendix B is more easily handled than its special case, though it involves a half-space restriction on the pertinent variables that is automatically satisfied for the R-function.

In Sect. 2 we obtain various symmetry and analyticity properties of the R-function by exploiting the information assembled in Appendixes A–C. With these properties at our disposal, the verification that $R(a_+, a_-, c; v, \hat{v})$ is a joint eigenfunction of four Askey–Wilson type AΔOs is a largely computational enterprise, which is undertaken in Sect. 3. Similarly, the analyticity features enable us to show that the dual variable $\hat{v}$ may be taken equal to $i\hat{c}_0 + ina_-$ for $n \in \mathbb{Z}$, yielding polynomials $P_n(\mathrm{ch}(2\pi v/a_+))$ for $n \in \mathbb{N}$.

In Sect. 4 we first present a Barnes type representation for ${}_{s+1}\phi_s$ (1.3), which involves the trigonometric G-function $G_t(r, a; z)$ from Ref. [9]. This representation was suggested by the above one involving the hyperbolic G-function $G(a_+, a_-; z)$. It differs from and is simpler to work with than the Barnes type representation for basic hypergeometric functions that was introduced and studied long ago by Watson [10], cf. also Ref. [1]. (The latter uses the q-gamma function, which is closely related to our trigonometric G-function.) In particular, the analytic continuation of ${}_{s+1}\phi_s$ to all of the w-plane is quite easily studied by using our new representation. Taking $s = 3$ and suitably reparametrizing, we obtain a "trigonometric" analog $R_t(r, a, c; v, \hat{v})$ of our "hyperbolic" function $R(a_+, a_-, c; v, \hat{v})$. Taking once again $\hat{v}$ equal to $i\hat{c}_0 + ina$, $n \in \mathbb{N}$, we obtain polynomials $P_n(\cos 2rv)$ whose relation to the Askey–Wilson polynomials $p_n(\cos v)$ (1.6) is quite easily established.

The function $R_t(r, a, c; v, \hat{v})$, however, does not satisfy an Askey–Wilson type difference equation. On the other hand, it fails to do so by a quite simple difference, cf. (4.51) below. Though this may not be clear at first sight, the inhomogeneous AΔE thus found is substantially equivalent to a result obtained first by Atakishiyev and Suslov [11], cf. also Theorem 3 in Ref. [3]. Our proof proceeds by specializing a more general result (cf. Lemma 4.2), whose proof is patterned after the proof of the AΔEs satisfied by $R(a_+, a_-, c; v, \hat{v})$ (cf. Theorem 3.1).
2. The R-Function: Automorphy and Meromorphy

With the restrictions (1.28), (1.29) in effect, we have already defined the R-function by (1.30) and (1.33)–(1.35). Before relaxing the restrictions, it is convenient to collect some symmetry properties that can be read off from the integrand (1.33). To this end we introduce matrices
$$I \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad J \equiv \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}, \tag{2.1}$$
satisfying
$$IJ = JI, \quad K = K^* = K^{-1}, \quad K = I, J, \tag{2.2}$$
and dual couplings given by
$$\hat{c} \equiv Jc = (\hat{c}_0, \hat{c}_1, \hat{c}_2, \hat{c}_3). \tag{2.3}$$
Setting
$$\hat{s}_1 \equiv \hat{c}_0 + \hat{c}_1 - a_-/2, \quad \hat{s}_2 \equiv \hat{c}_0 + \hat{c}_2 - a_+/2, \quad \hat{s}_3 \equiv \hat{c}_0 + \hat{c}_3, \quad \hat{s}_4 \equiv a, \tag{2.4}$$
we then have
$$\hat{s}_j = s_j, \quad j = 1, 2, 3, 4, \tag{2.5}$$
cf. (1.26). We also point out the equivalences
$$\hat{c} = c \ \Leftrightarrow\ c_0 = \hat{c}_0 \ \Leftrightarrow\ c_0 = c_1 + c_2 + c_3. \tag{2.6}$$
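The algebraic statements (1.27), (2.2) and (2.5) are finite identities and can be confirmed with exact rational arithmetic. The sketch below is only a check of those identities; the helper names are hypothetical and the sample couplings are arbitrary.

```python
from fractions import Fraction

# The matrices I and J of (2.1); J carries the overall factor 1/2.
I_MAT = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
J_MAT = [[Fraction(x, 2) for x in row] for row in
         [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]]
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]


def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]


def apply(M, c):
    """Matrix M acting on the coupling vector c = (c0, c1, c2, c3)."""
    return [sum(M[i][k] * c[k] for k in range(4)) for i in range(4)]


def s_values(a_plus, a_minus, c):
    """(s1, s2, s3, s4) as in (1.26)."""
    c0, c1, c2, c3 = c
    return [c0 + c1 - a_minus / 2, c0 + c2 - a_plus / 2, c0 + c3,
            (a_plus + a_minus) / 2]
```

Swapping $a_+ \leftrightarrow a_-$ together with $c \to Ic$ merely exchanges $s_1$ and $s_2$, which is consistent with the parameter symmetry stated next in the text.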
An inspection of (1.30)–(1.35) now shows that $R$ has the parameter symmetry
$$R(a_+, a_-, c; v, \hat{v}) = R(a_-, a_+, Ic; v, \hat{v}) \tag{2.7}$$
and self-duality property
$$R(a_+, a_-, c; v, \hat{v}) = R(a_+, a_-, \hat{c}; \hat{v}, v). \tag{2.8}$$
Combining these two transformations, one obtains
$$R(a_+, a_-, c; v, \hat{v}) = R(a_-, a_+, I\hat{c}; \hat{v}, v). \tag{2.9}$$
From the structure of the integrand (1.33) it is also clear that $R$ is even in $v$ and $\hat{v}$:
$$R(a_+, a_-, c; v, \hat{v}) = R(a_+, a_-, c; \delta_1 v, \delta_2 \hat{v}), \quad \delta_1, \delta_2 = +, -. \tag{2.10}$$
Finally, the announced scale invariance (1.11) follows via the change of variables $z \to \lambda z$, using the scale invariance of the G-function, cf. (1.32).

Clearly, these features are preserved under analytic continuation in the parameters $a_\pm, c$ and variables $v, \hat{v}$. We proceed by examining such continuation properties. Factoring off the seven z-independent G-functions whose analyticity properties are known (cf. Appendix A), we are left with the auxiliary function
$$A(a_+, a_-, c; v, \hat{v}) \equiv \int_C \mathcal{I}(a_+, a_-; u_1, \ldots, u_4, d_1, \ldots, d_4, z)\,dz, \tag{2.11}$$
with
$$\mathcal{I}(u, d, z) \equiv \prod_{j=1}^{4} G(z - u_j)/G(z - d_j), \tag{2.12}$$
$$u_{1,2} \equiv \pm v - ic_0 + ia, \quad u_{3,4} \equiv \pm\hat{v} - i\hat{c}_0 + ia, \tag{2.13}$$
$$d_j \equiv -is_j, \quad j = 1, 2, 3, 4. \tag{2.14}$$
We are rewriting the integrand in this way so as to make clear that it is of the form (B.2) with $N = 4$. In the case at hand we have, using (1.27),
$$\sum_{j=1}^{4} (u_j - d_j) = 4ia. \tag{2.15}$$
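The sum rule (2.15) is a one-line consequence of (1.27), (2.13) and (2.14); the following sketch spells it out numerically. The function name and the sample values are illustrative assumptions.

```python
def u_and_d(a_plus, a_minus, c, v, v_hat):
    """Return the lists (u_1..u_4) and (d_1..d_4) of (2.13), (2.14)."""
    c0, c1, c2, c3 = c
    a = (a_plus + a_minus) / 2
    c_hat0 = (c0 + c1 + c2 + c3) / 2
    s = [c0 + c1 - a_minus / 2, c0 + c2 - a_plus / 2, c0 + c3, a]  # (1.26)
    u = [v - 1j * c0 + 1j * a, -v - 1j * c0 + 1j * a,
         v_hat - 1j * c_hat0 + 1j * a, -v_hat - 1j * c_hat0 + 1j * a]
    d = [-1j * sj for sj in s]
    return u, d
```

Summing gives $\sum u_j = 4ia - 2i(c_0 + \hat{c}_0)$ and $\sum d_j = -2i(c_0 + \hat{c}_0)$, so the difference is $4ia$ for any $v, \hat{v}$.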
Thus we obtain an exponential decay
$$|\mathcal{I}(a_+, a_-; u, d, z)| < C_\pm \left|\exp\left(\mp 2\pi z\left(\frac{1}{a_+} + \frac{1}{a_-}\right)\right)\right|, \quad \pm z > L_\pm, \tag{2.16}$$
for all $a_\pm \in \mathrm{RHP}$, $c \in \mathbb{C}^4$ and $v, \hat{v} \in \mathbb{C}$, provided we choose $L_\pm$ large enough, cf. (B.8), (B.9). Moreover, the constants $C_\pm$ can be chosen uniformly for $a_\pm$ in RHP-compacts, $c$ in $\mathbb{C}^4$-compacts, and $v, \hat{v}$ in $\mathbb{C}$-compacts, cf. Theorem A.1.

Taking $a_\pm$ positive again, we recall that the restrictions (1.29) ensure that the pole sequences $u_j - ia - z_{kl}$ lie in the lower half plane, whereas (1.28) ensures that the pole sequences $d_j + ia + z_{kl}$ lie in the upper half plane, save for the simple pole at $d_4 + ia = 0$. The contour $C$ equals $(-\infty, \infty)$, but for a downward indentation to avoid the latter d-pole. Now we are of course free to deform any finite part of $C$ continuously without changing the value of the integral on the rhs of (2.11), as long as we do not cross any poles. (Whenever this happens, residues are picked up.) Doing so in suitable ways, it is also clear from the above decay bounds that we can freely continue $a_\pm, c_0, \ldots, c_3, v$ and $\hat{v}$ into the complex plane (keeping $a_\pm \in \mathrm{RHP}$, of course), provided the contour still separates the u-poles and d-poles. To be more specific, the function (2.11) we are starting from is real-analytic in $a_+, a_-, c_0, \ldots, c_3, v$ and $\hat{v}$ on the intervals specified in (1.28), (1.29), and can be continued analytically in all of its variables as long as the contour $C$ can be deformed so that the u-poles stay below it and the d-poles above it.

Now this much is clear from the decay bounds (2.16) and the analytic character of the integrand in (2.11). A priori, it is not at all clear, however, whether $A(a_+, a_-, c; v, \hat{v})$ (2.11) can be continued to all of the set
$$D \equiv \mathrm{RHP}^2 \times \mathbb{C}^6. \tag{2.17}$$
Indeed, whenever u-poles and d-poles collide, the contour gets pinched and singularities may arise. Collisions of the $u_1$-poles and $u_2$-poles with the d-poles can be encoded in the hyperplanes
$$H^j_{lm,\delta} \equiv \{ila_+ + ima_- - ia - \delta v - is_j + ic_0 = 0\}, \quad l, m \in \mathbb{N}^*, \quad \delta = +, -, \quad j = 1, 2, 3, \tag{2.18}$$
$$H^4_{lm,\delta} \equiv \{i(l-1)a_+ + i(m-1)a_- - \delta v + ic_0 = 0\}, \quad l, m \in \mathbb{N}^*, \quad \delta = +, -. \tag{2.19}$$
Similarly, $u_3$- and $u_4$-pole collisions with the d-poles occur on hyperplanes $\hat{H}^j_{lm,\delta}$, given by (2.18), (2.19) with $v \to \hat{v}$, $c \to \hat{c}$. Let us denote the union of the collision hyperplanes by $Z_{aux}$, and introduce the set
$$D_A \equiv D \setminus Z_{aux}. \tag{2.20}$$
Note that this set is open and connected, but not simply-connected. Now some reflection makes it plausible that $A$ can be continued to $D_A$, but whether or not multi-valuedness occurs is not readily guessed. Moreover, the character of the singularities (if any) when the collision variety $Z_{aux}$ is approached is far from clear by inspection. We have studied this circle of problems in the slightly more general setting of Appendix B. The most important results of the analysis to be found there, as applied to the case at hand, are contained in the following lemma.

Lemma 2.1. The auxiliary function $A(a_+, a_-, c; v, \hat{v})$ (2.11) extends to a (one-valued) jointly analytic function in $D_A$ (2.20). Multiplying the latter function by the E-product
$$\prod_{\delta = +, -} \prod_{j=1}^{4} E(a_+, a_-; \delta v - ic_0 + is_j)\,E(a_+, a_-; \delta\hat{v} - i\hat{c}_0 + i\hat{s}_j), \tag{2.21}$$
one obtains a function that has a jointly analytic extension to $D$ (2.17).

Proof. The assertions in this lemma easily follow from Theorem B.2, specialized to the above setting. □

Of course, Appendix B contains more information than is encoded in Lemma 2.1. In particular, the values of $A$ in $D_A$ can be written in terms of a contour integral and a finite residue sum, and the properties and explicit form of residues are further illuminated in Lemma B.1. Note also that $Z_{aux}$ is the zero locus of the E-product (2.21), and that the extension to $D$ is necessarily one-valued, since $D$ (2.17) is simply-connected.

Let us next consider the seven z-independent G-functions in the integrand (1.33). The product $G(is_1)G(is_2)G(is_3)$ occurs for purposes of normalisation, but is of course irrelevant for the analytic behavior in the variables $v$ and $\hat{v}$. The G-product
$$\prod_{\delta = +, -} G(a_+, a_-; \delta v - ic_0 + ia) \cdot (v \to \hat{v},\ c_0 \to \hat{c}_0), \tag{2.22}$$
however, gives rise to pole and zero varieties that are closely related to the hyperplanes $H^4_{lm,\delta}$ (2.19) and their duals. Indeed, writing the G-functions in terms of E-functions by using (A.40), one sees that the numerator E-functions amount to the $j = 4$ E-functions in (2.21). The denominator E-functions give rise to new singularities, as compared to the (a priori) singular locus $Z_{aux}$ of $A$.

As a notational preliminary, we introduce new couplings
$$\gamma_0 \equiv c_0 - a_+/2 - a_-/2, \quad \gamma_1 \equiv c_1 - a_-/2, \quad \gamma_2 \equiv c_2 - a_+/2, \quad \gamma_3 \equiv c_3, \tag{2.23}$$
dual couplings
$$\hat{\gamma} \equiv J\gamma = (\hat{c}_0 - a_+/2 - a_-/2,\ \hat{c}_1 - a_-/2,\ \hat{c}_2 - a_+/2,\ \hat{c}_3), \tag{2.24}$$
hyperplanes
$$H^\mu_{lm,\delta} \equiv \{ila_+ + ima_- - ia - \delta v - i\gamma_\mu = 0\}, \quad l, m \in \mathbb{N}^*, \quad \delta = +, -, \quad \mu = 0, 1, 2, 3, \tag{2.25}$$
and dual hyperplanes
$$\hat{H}^\mu_{lm,\delta} \equiv \{ila_+ + ima_- - ia - \delta\hat{v} - i\hat{\gamma}_\mu = 0\}, \quad l, m \in \mathbb{N}^*, \quad \delta = +, -, \quad \mu = 0, 1, 2, 3. \tag{2.26}$$
For $\mu = 1, 2, 3$ the hyperplanes (2.25) and (2.26) coincide with (2.18) and their duals, as anticipated by our notation. But for $\mu = 0$ the hyperplanes encode the pole locus of the G-product (2.22), whereas (2.19) and its dual encode the zero locus.

In the following theorem we collect various symmetry and analyticity properties of the R-function that are readily established from the above. As a preparation, we define the zero locus
$$Z \equiv \cup_{\mu=0}^{3} \cup_{\delta = +, -} \cup_{l, m \in \mathbb{N}^*} (H^\mu_{lm,\delta} \cup \hat{H}^\mu_{lm,\delta}), \tag{2.27}$$
and the open, connected, but not simply-connected set
$$D_{ren} = D \setminus Z. \tag{2.28}$$
Clearly, the normalisation factor $\prod_j G(is_j)$ yields zero and pole sets
$$N^j_{kl,\delta} \equiv \{ka_+ + la_- + a + \delta s_j = 0\}, \quad k, l \in \mathbb{N}, \quad \delta = +, -, \quad j = 1, 2, 3, \tag{2.29}$$
with union
$$N \equiv \cup_{j=1}^{3} \cup_{\delta = +, -} \cup_{k, l \in \mathbb{N}} N^j_{kl,\delta}, \tag{2.30}$$
and it suffices to study $R$ on the set
$$D_R \equiv D_{ren} \setminus N, \tag{2.31}$$
which is also open, connected and multiply-connected.

Theorem 2.2. The renormalized R-function
$$R_{ren}(a_+, a_-, c; v, \hat{v}) \equiv \prod_{j=1}^{3} G(a_+, a_-; -is_j) \cdot R(a_+, a_-, c; v, \hat{v}), \tag{2.32}$$
defined by (1.30) and (1.33)–(1.35) with the restrictions (1.28), (1.29) in force, has a (one-valued) jointly analytic extension to $D_{ren}$ (2.28). Multiplying the latter function by the E-product
$$\prod_{\mu=0}^{3} \prod_{\delta = +, -} E(a_+, a_-; \delta v + i\gamma_\mu)\,E(a_+, a_-; \delta\hat{v} + i\hat{\gamma}_\mu), \tag{2.33}$$
(with $\gamma_\mu, \hat{\gamma}_\mu$ given by (2.23), (2.24)), one obtains a function that has a jointly analytic extension to $D$ (2.17).

The function $R(a_+, a_-, c; v, \hat{v})$ has a (one-valued) jointly analytic extension to $D_R$ (2.31). It is scale invariant in the sense that (1.11) holds, and it satisfies (2.7)–(2.10). Fixing parameters $(a_+, a_-, c) \in (\mathrm{RHP}^2 \times \mathbb{C}^4) \setminus N$, it is meromorphic in $v$ and $\hat{v}$, with poles that can be located only at the points
$$\pm iv = la_+ + ma_- - a - \gamma_\mu, \quad l, m \in \mathbb{N}^*, \quad \mu = 0, 1, 2, 3, \tag{2.34}$$
$$\pm i\hat{v} = la_+ + ma_- - a - \hat{\gamma}_\mu, \quad l, m \in \mathbb{N}^*, \quad \mu = 0, 1, 2, 3. \tag{2.35}$$
The pole order at these points is smaller than or equal to the zero order of the E-function product (2.33).
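The candidate pole locations (2.34), (2.35) are easy to tabulate for given parameters. The sketch below lists the values of $la_+ + ma_- - a - \gamma_\mu$ for a finite range of $l, m$; it is only bookkeeping for the formulas above, with hypothetical function names, and it says nothing about whether a given candidate is an actual pole or about its order.

```python
from fractions import Fraction


def shifted_couplings(a_plus, a_minus, c):
    """(gamma_0, ..., gamma_3) as in (2.23)."""
    c0, c1, c2, c3 = c
    return [c0 - a_plus / 2 - a_minus / 2, c1 - a_minus / 2,
            c2 - a_plus / 2, c3]


def dual_couplings(gamma):
    """gamma-hat = J gamma, with J the matrix of (2.1)."""
    g0, g1, g2, g3 = gamma
    return [(g0 + g1 + g2 + g3) / 2, (g0 + g1 - g2 - g3) / 2,
            (g0 - g1 + g2 - g3) / 2, (g0 - g1 - g2 + g3) / 2]


def candidate_pole_heights(a_plus, a_minus, couplings, max_index):
    """Tuples (l, m, mu, t) with t = l*a+ + m*a- - a - couplings[mu].

    By (2.34) the candidate poles in v sit at +i*v = t or -i*v = t when the
    couplings are (gamma_mu); passing (gamma-hat_mu) gives (2.35) for v-hat.
    Only 1 <= l, m <= max_index are listed.
    """
    a = (a_plus + a_minus) / 2
    heights = []
    for mu, g in enumerate(couplings):
        for l in range(1, max_index + 1):
            for m in range(1, max_index + 1):
                heights.append((l, m, mu, l * a_plus + m * a_minus - a - g))
    return heights
```

For real parameters the listed values are real, so the candidates lie on the imaginary axis of the $v$-plane, in line with the remark made in the Introduction.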
Proof. The assertions concerning $R_{ren}$ follow from Lemma 2.1 and the salient properties of the G-product (2.22), as already explained above. The first assertion about $R$ is then an obvious corollary. The automorphy properties were already verified, subject to the restrictions (1.28), (1.29); evidently, they continue to $D_R$. Finally, the pole locations (2.34), (2.35) encode the zero locus $Z$ (2.27) of the E-product (2.33). Hence the last assertion readily follows. □

3. The R-Function: AΔEs and Askey–Wilson Specialization

With the above analyticity properties of the R-function at our disposal, we are now going to demonstrate that the R-function is a joint eigenfunction of four independent analytic difference operators of Askey–Wilson type. To define the latter it is convenient to use the notation
$$s_\delta(z) \equiv \mathrm{sh}(\pi z/a_\delta), \quad c_\delta(z) \equiv \mathrm{ch}(\pi z/a_\delta), \quad \delta = +, -, \quad a_+, a_- \in \mathrm{RHP}. \tag{3.1}$$
Now we introduce the AΔOs,
$$A_\delta(c; z) \equiv C_\delta(c; z)\bigl(T^z_{ia_{-\delta}} - 1\bigr) + C_\delta(c; -z)\bigl(T^z_{-ia_{-\delta}} - 1\bigr) + 2c_\delta(i(c_0 + c_1 + c_2 + c_3)), \quad \delta = +, -. \tag{3.2}$$
Here, the superscript $z$ on the shifts $T$ (1.18) is used to indicate that they act on the variable $z$ that also occurs in the coefficient functions
$$C_\delta(c; z) \equiv \frac{s_\delta(z - ic_0)}{s_\delta(z)}\,\frac{c_\delta(z - ic_1)}{c_\delta(z)}\,\frac{s_\delta(z - ic_2 - ia_{-\delta}/2)}{s_\delta(z - ia_{-\delta}/2)}\,\frac{c_\delta(z - ic_3 - ia_{-\delta}/2)}{c_\delta(z - ia_{-\delta}/2)}. \tag{3.3}$$

Theorem 3.1. For all parameters $a_\pm \in \mathrm{RHP}$, $c \in \mathbb{C}^4$, the function $R_{ren}(a_+, a_-, c; v, \hat{v})$ (2.32) is an eigenfunction of the AΔOs,
$$A_+(c; v), \quad A_-(Ic; v), \quad A_+(\hat{c}; \hat{v}), \quad A_-(I\hat{c}; \hat{v}), \tag{3.4}$$
(where $I, \hat{c}$ are defined by (2.1), (2.3)), with eigenvalues
$$2c_+(2\hat{v}), \quad 2c_-(2\hat{v}), \quad 2c_+(2v), \quad 2c_-(2v). \tag{3.5}$$
Proof. The analyticity properties of the coefficient functions are clear by inspection. Combining them with the analyticity properties of Rren established in Theorem 2.2, we deduce that we need only prove the assertions for parameters a± , c0 , cˆ0 > 0, sj = sˆj ∈ (−a, a), j = 1, 2, 3. Moreover, in view of the symmetry properties (2.7)–(2.9), we need ˆ on R(a+ , a− , c; v, v) ˆ for v, vˆ ∈ RH P . only show that A+ (c; v) has eigenvalue 2c+ (2v) Now with the latter restrictions on the parameters and variables in effect, the upward z-pole sequences in the integrand I (1.33) belong to i[0, ∞) and the downward ones to the right and left half planes. Thus R is given by the contour integral (1.30) whenever the downward sequences lie in the lower half plane, too. More generally, we may work with (1.30), provided we deform C in obvious ways so that the downward sequences stay below it. Moreover, we may shift the infinite tails
652
S. N. M. Ruijsenaars
of the contour over iα ∈ iR as the need arises. (Indeed, the exponential decay (1.36) is uniform for Im z varying over R-compacts, cf. Theorem A.1.) Denoting the deformed contours once again by C, we deduce that a suitable choice of C ensures that shifts of the variables v and vˆ in R over ±iaδ amount to shifting v and vˆ in the integrand I (1.33) of the R-representation (1.30). We are now prepared to consider the salient features of the integrand in regard to shifts of the variables v, vˆ and z over ±iaδ . Since it consists of a product of G-functions, these features can be derived from the G-A1Es, G(z + iaδ /2) = 2c−δ (z), δ = +, −, G(z − iaδ /2)
(3.6)
cf. (A.7), (A.9). As we are going to verify the A∆E
\[ A_+(c; v)R(c; v, \hat v) = 2c_+(2\hat v)R(c; v, \hat v), \tag{3.7} \]
with
\[ R(c; v, \hat v) = \frac{1}{(a_+ a_-)^{1/2}} \int_C F(c_0; v, z)K(c; z)F(\hat c_0; \hat v, z)\,dz, \tag{3.8} \]
we may and will restrict attention to shifts over $\pm ia_-$. First, let us observe that F (1.34) satisfies the two A∆Es
\[ \frac{F(b; y + ia_-/2, z)}{F(b; y - ia_-/2, z)} = \frac{s_+(y + z + ib - ia_-/2)}{s_+(y - z - ib + ia_-/2)}\,\frac{s_+(y - ib + ia_-/2)}{s_+(y + ib - ia_-/2)}, \tag{3.9} \]
\[ \frac{F(b; y, z - ia_-)}{F(b; y, z)} = \frac{1}{4s_+(y + z + ib - ia_-)\,s_+(y - z - ib + ia_-)}. \tag{3.10} \]
(Indeed, these result from (3.6) with $\delta = -$.) Next, we use (3.9) with $b = c_0$, $y = v$, to calculate the quotient
\[ Q(c; v, z) \equiv (A_+(c; v)F)(c_0; v, z)/F(c_0; v, z). \tag{3.11} \]
A critical fact is now that $Q(c; v, z)$ can be rewritten
\[ 2c_+(2z + 2i\hat c_0) + \frac{4\prod_{j=1}^4 c_+(z - ia_-/2 + is_j)}{s_+(v + z + ic_0 - ia_-)\,s_+(v - z - ic_0 + ia_-)}. \tag{3.12} \]
This equality amounts to a functional equation that is not at all evident, and we postpone its verification to the end of the proof for expository reasons. Taking equality of Q and (3.12) for granted, we continue by demonstrating how the equality can be exploited to prove the A∆E (3.7). The denominator in (3.12) appears in (3.10) with $b = c_0$, $y = v$. Thus we obtain
\[ A_+(c; v)F(c_0; v, z) = 2c_+(2z + 2i\hat c_0)F(c_0; v, z) + F(c_0; v, z - ia_-)\,\Pi(c; z - ia_-/2), \tag{3.13} \]
where we have introduced the product
\[ \Pi(c; z) \equiv 16 \prod_{j=1}^4 c_+(z + is_j). \tag{3.14} \]
Generalized Hypergeometric Function Satisfying Equations of Askey–Wilson Type
As a result of these calculations we obtain the identity
\[ A_+(c; v)R(c; v, \hat v) = \frac{1}{(a_+ a_-)^{1/2}} \int_C dz\,\big[2c_+(2z + 2i\hat c_0)I(c; v, \hat v, z) + F(c_0; v, z - ia_-)\,\Pi(c; z - ia_-/2)K(c; z)F(\hat c_0; \hat v, z)\big]. \tag{3.15} \]
Now the exponential increase of $c_+(2z + 2i\hat c_0)$ for Re z → ±∞ is more than neutralized by the exponential decay (1.36) of $I(c; v, \hat v, z)$. Likewise, the second term in square brackets is $O(\exp[\mp 2\pi z/a_-])$ for Re z → ±∞, uniformly for Im z in $\mathbb{R}$-compacts (cf. Theorem A.1). Hence we are entitled to take $z \to z + ia_-$ in the second term. The key property of the function $K(c; z)$ (1.35) is now that it satisfies the A∆E
\[ K(c; z + ia_-/2) = K(c; z - ia_-/2)/\Pi(c; z). \tag{3.16} \]
(As before, this is easily verified from the G-A∆E (3.6) with $\delta = -$.) Using this in the (shifted) second term, we obtain
\[ A_+(c; v)R(c; v, \hat v) = \frac{1}{(a_+ a_-)^{1/2}} \int_C dz\,\big[2c_+(2z + 2i\hat c_0)I(c; v, \hat v, z) + F(c_0; v, z)K(c; z)F(\hat c_0; \hat v, z + ia_-)\big]. \tag{3.17} \]
Now from the A∆E (3.10) with $b = \hat c_0$, $y = \hat v$ we have
\[ F(\hat c_0; \hat v, z + ia_-) = 4s_+(\hat v + z + i\hat c_0)\,s_+(\hat v - z - i\hat c_0)F(\hat c_0; \hat v, z) = 2[c_+(2\hat v) - c_+(2z + 2i\hat c_0)]F(\hat c_0; \hat v, z). \tag{3.18} \]
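The second equality in (3.18) is the hyperbolic product-to-sum identity $4\,\mathrm{sh}A\,\mathrm{sh}B = 2[\mathrm{ch}(A+B) - \mathrm{ch}(A-B)]$, and it is what makes the $c_+(2z + 2i\hat c_0)$-terms cancel against the first term of (3.17). A minimal numerical sketch in Python, assuming the conventions $s_+(z) = \mathrm{sh}(\pi z/a_+)$ and $c_+(z) = \mathrm{ch}(\pi z/a_+)$ (these are fixed outside this section and are inferred here from (3.6) and (A.7)); the function names and sample values are illustrative only.

```python
import cmath

def s_plus(z, a_plus):
    # Assumed convention: s_+(z) = sinh(pi z / a_+)
    return cmath.sinh(cmath.pi * z / a_plus)

def c_plus(z, a_plus):
    # Assumed convention: c_+(z) = cosh(pi z / a_+)
    return cmath.cosh(cmath.pi * z / a_plus)

def lhs_318(vhat, z, chat0, a_plus):
    # 4 s_+(vhat + z + i chat0) s_+(vhat - z - i chat0)
    return 4 * s_plus(vhat + z + 1j * chat0, a_plus) * s_plus(vhat - z - 1j * chat0, a_plus)

def rhs_318(vhat, z, chat0, a_plus):
    # 2 [c_+(2 vhat) - c_+(2 z + 2 i chat0)]
    return 2 * (c_plus(2 * vhat, a_plus) - c_plus(2 * z + 2j * chat0, a_plus))

def integrand_factor(vhat, z, chat0, a_plus):
    # Bracket of (3.17) divided by I: the z-dependent pieces cancel,
    # leaving the eigenvalue 2 c_+(2 vhat).
    return 2 * c_plus(2 * z + 2j * chat0, a_plus) + rhs_318(vhat, z, chat0, a_plus)
```

Calling `integrand_factor` at several values of `z` with `vhat` fixed should return the same number each time, which is the content of the step from (3.17) to (3.7).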
Substituting this in (3.17), we finally obtain the announced A∆E (3.7), cf. (3.8). It remains to prove the functional equation announced above. From the definition (3.11) and the A∆E (3.9) we deduce that $Q(c; v, z)$ equals
\[ \frac{c_+(v - ic_1)\,s_+(v - ic_2 - ia_-/2)\,c_+(v - ic_3 - ia_-/2)}{s_+(v)\,c_+(v)\,s_+(v - ia_-/2)\,c_+(v - ia_-/2)} \cdot \left[ \frac{s_+(v - z - ic_0)\,s_+(v + ic_0 - ia_-)}{s_+(v + z + ic_0 - ia_-)} - s_+(v - ic_0) \right] + (v \to -v) + 2c_+(2i\hat c_0). \tag{3.19} \]
Our remaining task is to show that this function equals the function (3.12). We begin by observing that both functions are $ia_+$-periodic, even and meromorphic in $v$. For Re v → ∞ the function (3.19) has asymptotics
\[ \exp\Big(\frac{\pi}{a_+}(-ic_1 - ic_2 - ic_3)\Big)\Big[\exp\Big(\frac{\pi}{a_+}(-2z - ic_0)\Big) - \exp\Big(\frac{\pi}{a_+}(-ic_0)\Big)\Big] + (i \to -i,\ z \to -z) + 2c_+(2i\hat c_0) = 2c_+(2z + 2i\hat c_0). \tag{3.20} \]
Obviously, this is also true for (3.12). By virtue of Liouville’s theorem, it therefore suffices to prove that both functions have equal residues at v-poles in a period strip. (Indeed, the poles are all (generically) simple.) Now the factors in (3.19) have poles for v = 0, ia+ /2, ±ia− /2 and ±i(a+ + a− )/2 that do not occur for (3.12). But the first square bracket function vanishes for v = ia− /2, i(a+ + a− )/2 and the second one for v = −ia− /2, −i(a+ + a− )/2, so no poles occur for v = ±ia− /2, ±i(a+ + a− )/2. The first and second term in (3.19) yield
functions that do have poles for $v = 0, ia_+/2$, but it is straightforward to check that the residues cancel. We are, therefore, reduced to comparing the residues of (3.12) and (3.19) at the poles $\pm v = z + ic_0 - ia_-$. By evenness, we need only calculate the residue of (3.19) for the upper sign pole. It reads
\[ \frac{c_+(z + i(c_0 + c_1 - a_-))\,s_+(z + i(c_0 + c_2 - a_-/2))\,c_+(z + i(c_0 + c_3 - a_-/2))}{4^{-1}\,s_+(2z + 2ic_0 - 2ia_-)\,s_+(2z + 2ic_0 - ia_-)} \cdot s_+(2z + 2ic_0 - ia_-)\,s_+(z). \tag{3.21} \]
Hence it equals the residue of (3.12) for $v = z + ic_0 - ia_-$, as announced. □
We continue by showing how polynomials arise upon discretizing $\hat v$. Specifically, let us consider the functions
\[ R_n(v) \equiv R(a_+, a_-, c; v, i\hat c_0 + ina_-), \quad n \in \mathbb{Z}, \tag{3.22} \]
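where we restrict the parameters by requiring
\[ e^{-i\phi_+} a_+ > 0, \quad \phi_+ \in (-\pi/2, 0), \quad a_- > 0, \tag{3.23} \]
\[ c_0, \hat c_0 \in (0, \mathrm{Re}\, a), \quad s_j \in (-\mathrm{Re}\, a, \mathrm{Re}\, a), \quad j = 1, 2, 3. \tag{3.24} \]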
These restrictions can easily be relaxed. However, they are convenient for expository reasons and guarantee that non-generic singularities are avoided. Indeed, they entail that R has no poles at the pertinent $\hat v$-values, cf. (2.35). By contrast, the $\hat v$-values $i\hat c_0 + ina_-$ with $n \in \mathbb{N}$ always belong to the singularity locus $Z_{\rm aux}$ of the auxiliary function $A(v, \hat v)$ (2.11): They amount to the hyperplanes $\hat H^{4}_{n+1,1,+}$, $n \in \mathbb{N}$, cf. (the dual of) (2.19). As such, they encode collisions of $u_3$-poles with $d_4$-poles, cf. (2.13), (2.14).
In order to elaborate on the special character of these $\hat v$-values, we start from the R-representation (1.30), with C the contour defined below (1.24), taking at first $v, \hat v \in \mathbb{R}$. We denote the contour with the opposite indentation by $C^+$. (Thus, $C^+$ runs along the real axis from −∞ to ∞, with an upward indentation at $z = 0$.) The pole of the integrand I (1.33) at $z = 0$ is simple, and using (A.20), (1.34) and (1.35) we obtain the residue $(a_+ a_-)^{1/2}/2\pi i$. Hence we deduce
\[ R(v, \hat v) = 1 + \frac{1}{(a_+ a_-)^{1/2}} \int_{C^+} I(v, \hat v, z)\,dz, \quad v, \hat v \in \mathbb{R}. \tag{3.25} \]
Now we may continue $\hat v$ to $\mathrm{Im}\,\hat v < \hat c_0$ without $\hat v$-dependent poles hitting $C^+$. Thus the representation (3.25) holds true in this $\hat v$-region, and in view of the upward indentation we may as well include an open disc around $\hat v = i\hat c_0$. Doing so, the factor $1/G(-\hat v + i\hat c_0 - ia)$ in I vanishes for $\hat v \to i\hat c_0$, and so we conclude
\[ R_0(v) = 1. \tag{3.26} \]
(Note that the $z = 0$ residue of the A-integrand I (2.12) is indeed singular for $\hat v = i\hat c_0$.) We are now prepared for the following result.
Theorem 3.2. With the parameters $a_\pm$, $c$ restricted by (3.23), (3.24), the function $R_n(v)$ (3.22) satisfies
\[ R_n(v) = P_n(\mathrm{ch}(2\pi v/a_+)), \quad n \in \mathbb{N}, \tag{3.27} \]
where $P_n(u)$ is a polynomial of degree $n$ in $u$.
Proof. As we have seen above, the relevant arguments of the R-function belong to its analyticity domain $D_R$ (2.31). Specializing the A∆E
\[ A_+(\hat c; \hat v)R(c; v, \hat v) = 2c_+(2v)R(c; v, \hat v), \tag{3.28} \]
(cf. Theorem 3.1), we obtain
\[ C_+(\hat c; i\hat c_0 + ina_-)[R_{n-1}(v) - R_n(v)] + C_+(\hat c; -i\hat c_0 - ina_-)[R_{n+1}(v) - R_n(v)] + 2c_+(2ic_0)R_n(v) = 2c_+(2v)R_n(v), \quad n \in \mathbb{Z}. \tag{3.29} \]
The crux is now that we have not only
\[ C_+(\hat c; i\hat c_0) = 0, \tag{3.30} \]
cf. (3.3), but also $R_0 = 1$, cf. (3.26). Hence the polynomial character of $R_n(v)$ follows recursively from (3.29). (Note that none of the coefficients $C_+(\hat c; -i\hat c_0 - ina_-)$ vanishes for $n \in \mathbb{N}$.) □
In view of the symmetry properties (2.7)–(2.10), there are various other ways to obtain polynomials from the R-function by discretizing $\hat v$ or $v$. Staying with the polynomials obtained above, we continue by pointing out that they satisfy the A∆E
\[ A_+(c; v)P_n(\mathrm{ch}(2\pi v/a_+)) = 2\,\mathrm{ch}(2\pi i(\hat c_0 + na_-)/a_+)\,P_n(\mathrm{ch}(2\pi v/a_+)), \tag{3.31} \]
in addition to the three-term recurrence relation (3.29). (Indeed, (3.31) is simply the A∆E (3.7) for the special $\hat v$-values at issue.) It should also be observed that the A∆O $A_-(Ic; v)$ acts trivially on the polynomials: Since they are $ia_+$-periodic in $v$, its eigenvalue is given by $2c_-(2ic_0)$ for all $n \in \mathbb{N}$. (The fourth A∆O $A_-(I\hat c; \hat v)$ in Theorem 3.1 does not act on the polynomials, since it shifts $\hat v$ by $\pm ia_+$.) From the coefficients in the recurrence (3.29) and A∆E (3.31) it can be read off that we may continue $a_+$ to all of the lower half plane. In particular, we may switch to parameters
\[ a \equiv a_-, \quad r \equiv -\pi i/a_+, \quad a, r > 0. \tag{3.32} \]
Then we obtain polynomials $P_n(r, a, c; \cos(2rv))$ that satisfy the A∆E
\[ A(r, a, c; v)P_n = 2\,\mathrm{ch}\,2r(\hat c_0 + na)\,P_n, \quad n \in \mathbb{N}, \tag{3.33} \]
and the recurrence relation
\[ a_n(r, a, c)P_{n+1} + b_n(r, a, c)P_n + c_n(r, a, c)P_{n-1} = 2\cos(2rv)P_n. \tag{3.34} \]
Here, the A∆O $A(r, a, c; v)$ is given by
\[ A \equiv C(r, a, c; v)\big(T^v_{ia} - 1\big) + C(r, a, c; -v)\big(T^v_{-ia} - 1\big) + 2\,\mathrm{ch}(2r\hat c_0), \tag{3.35} \]
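\[ C(v) \equiv \frac{\sin r(v - ic_0)\,\cos r(v - ic_1)\,\sin r(v - ic_2 - ia/2)\,\cos r(v - ic_3 - ia/2)}{\sin rv\,\cos rv\,\sin r(v - ia/2)\,\cos r(v - ia/2)}, \tag{3.36} \]
and the recurrence coefficients read
\[ a_n \equiv \frac{\mathrm{sh}\,r(2\hat c_0 + na)}{\mathrm{sh}\,r(\hat c_0 + na)} \cdot \frac{\mathrm{ch}\,r(\hat c_0 + \hat c_1 + na)}{\mathrm{ch}\,r(\hat c_0 + na)} \cdot \frac{\mathrm{sh}\,r(\hat c_0 + \hat c_2 + (n + 1/2)a)}{\mathrm{sh}\,r(\hat c_0 + (n + 1/2)a)} \cdot \frac{\mathrm{ch}\,r(\hat c_0 + \hat c_3 + (n + 1/2)a)}{\mathrm{ch}\,r(\hat c_0 + (n + 1/2)a)}, \tag{3.37} \]
\[ c_n \equiv \frac{\mathrm{sh}\,r(na)}{\mathrm{sh}\,r(\hat c_0 + na)} \cdot \frac{\mathrm{ch}\,r(\hat c_0 - \hat c_1 + na)}{\mathrm{ch}\,r(\hat c_0 + na)} \cdot \frac{\mathrm{sh}\,r(\hat c_0 - \hat c_2 + (n - 1/2)a)}{\mathrm{sh}\,r(\hat c_0 + (n - 1/2)a)} \cdot \frac{\mathrm{ch}\,r(\hat c_0 - \hat c_3 + (n - 1/2)a)}{\mathrm{ch}\,r(\hat c_0 + (n - 1/2)a)}, \tag{3.38} \]
\[ b_n \equiv 2\,\mathrm{ch}\,r(2c_0) - a_n - c_n. \tag{3.39} \]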
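The recurrence (3.34) with coefficients (3.37)–(3.39) determines the polynomials from $P_0 = 1$ alone, because $c_0$ vanishes (the factor $\mathrm{sh}\,r(na)$ is zero for $n = 0$). The following Python sketch evaluates $P_0, \ldots, P_N$ at a point $x = \cos(2rv)$. It rests on one assumption that lies outside this section: the dual parameters $\hat c$ of (2.3) are taken to be $\hat c_0 = (c_0 + c_1 + c_2 + c_3)/2$ and $\hat c_j = c_0 + c_j - \hat c_0$. All function names are illustrative.

```python
import math

def dual_parameters(c):
    # ASSUMPTION: form of (2.3), which is not part of this section:
    # chat0 = (c0+c1+c2+c3)/2 and chat_j = c0 + c_j - chat0 for j = 1, 2, 3.
    c0, c1, c2, c3 = c
    h0 = (c0 + c1 + c2 + c3) / 2
    return (h0, c0 + c1 - h0, c0 + c2 - h0, c0 + c3 - h0)

def a_coeff(n, r, a, ch):
    # Coefficient a_n of (3.37); ch = (chat0, chat1, chat2, chat3)
    h0, h1, h2, h3 = ch
    return (math.sinh(r * (2 * h0 + n * a)) / math.sinh(r * (h0 + n * a))
            * math.cosh(r * (h0 + h1 + n * a)) / math.cosh(r * (h0 + n * a))
            * math.sinh(r * (h0 + h2 + (n + 0.5) * a)) / math.sinh(r * (h0 + (n + 0.5) * a))
            * math.cosh(r * (h0 + h3 + (n + 0.5) * a)) / math.cosh(r * (h0 + (n + 0.5) * a)))

def c_coeff(n, r, a, ch):
    # Coefficient c_n of (3.38); vanishes for n = 0
    h0, h1, h2, h3 = ch
    return (math.sinh(r * n * a) / math.sinh(r * (h0 + n * a))
            * math.cosh(r * (h0 - h1 + n * a)) / math.cosh(r * (h0 + n * a))
            * math.sinh(r * (h0 - h2 + (n - 0.5) * a)) / math.sinh(r * (h0 + (n - 0.5) * a))
            * math.cosh(r * (h0 - h3 + (n - 0.5) * a)) / math.cosh(r * (h0 + (n - 0.5) * a)))

def b_coeff(n, r, a, c0, ch):
    # Coefficient b_n of (3.39)
    return 2 * math.cosh(2 * r * c0) - a_coeff(n, r, a, ch) - c_coeff(n, r, a, ch)

def polynomials(n_max, x, r, a, c):
    # Values [P_0(x), ..., P_{n_max}(x)] at x = cos(2 r v), from (3.34)
    ch = dual_parameters(c)
    values = [1.0]
    previous = 0.0  # stands in for P_{-1}; its value is irrelevant since c_0 = 0
    for n in range(n_max):
        numerator = ((2 * x - b_coeff(n, r, a, c[0], ch)) * values[-1]
                     - c_coeff(n, r, a, ch) * previous)
        previous = values[-1]
        values.append(numerator / a_coeff(n, r, a, ch))
    return values
```

For $n = 1$ the recurrence gives $P_1 = 1 + (2\cos 2rv - 2\,\mathrm{ch}\,2rc_0)/a_0$, which makes the degree-one dependence on $\cos(2rv)$ explicit.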
4. A Barnes Type Representation for Basic Hypergeometric Functions
The precise relation between the polynomials $P_n(\cos 2rv)$ just studied and the Askey–Wilson polynomials $p_n(\cos v)$ (1.6) is not yet clear at this stage. In principle, this relation can be determined directly by comparing the difference equations and three-term recurrence relations satisfied by both sets of polynomials. But the relation will be an obvious consequence of results obtained below, and these results are of independent interest. Therefore, we postpone a comparison of the polynomials to the end of this section. Indeed, guided by the above integral representation involving the hyperbolic G-function from Ref. [9], we present a novel and quite simple integral representation for the basic hypergeometric function ${}_{s+1}\phi_s$ (1.3), which involves the trigonometric G-function from Ref. [9]. Taking $s = 3$ and choosing appropriate parameters, we obtain once more an interpolation of the above polynomials $P_n(\cos 2rv)$ that is essentially self-dual. But in the present setting an Askey–Wilson type difference equation only holds true when the dual variable $\hat v$ is discretized in such a way that the polynomials arise. We prove the pertinent results in basically the same way as in Sect. 3, obtaining further explicit information in the process. Turning to the details, we collect first of all some formulas involving the trigonometric G-function from Ref. [9] that we have occasion to use. Throughout this section we fix parameters $r, a > 0$. Then the G-function is defined by
\[ G_t(r, a; z) = \prod_{m=1}^\infty (1 - \exp[-(2m - 1)ar + 2irz])^{-1}. \tag{4.1} \]
(We require positivity of $r$ and $a$ more for brevity than for necessity. The results that follow can readily be generalized to the region Re ar > 0.) Thus $G_t(r, a; \cdot)$ is meromorphic and has no zeros, while $G_t$ has simple poles at
\[ z_{jk} = j\pi/r - ia(k + 1/2), \quad j \in \mathbb{Z}, \quad k \in \mathbb{N}. \tag{4.2} \]
It is not hard to verify that $G_t$ satisfies the A∆E
\[ \frac{G_t(z + ia/2)}{G_t(z - ia/2)} = 1 - \exp 2irz, \tag{4.3} \]
and that it can be rewritten
\[ G_t(r, a; z) = \exp\left( \sum_{n=1}^\infty \frac{\exp 2inrz}{2n\,\mathrm{sh}\,nra} \right), \quad \mathrm{Im}\, z > -a/2. \tag{4.4} \]
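Both statements can be checked numerically from the definition (4.1). The Python sketch below truncates the infinite product and the exponentiated series at a fixed number of terms (the truncation orders and function names are choices made here, not part of the paper) and can be used to compare (4.1) with (4.4) inside the half plane Im z > −a/2, and to test the A∆E (4.3).

```python
import cmath
import math

def G_t_product(r, a, z, factors=200):
    # Truncation of the infinite product (4.1)
    prod = 1.0 + 0.0j
    for m in range(1, factors + 1):
        prod *= 1 - cmath.exp(-(2 * m - 1) * a * r + 2j * r * z)
    return 1 / prod

def G_t_series(r, a, z, terms=200):
    # Truncation of the exponentiated series (4.4); requires Im z > -a/2
    total = 0.0 + 0.0j
    for n in range(1, terms + 1):
        total += cmath.exp(2j * n * r * z) / (2 * n * math.sinh(n * r * a))
    return cmath.exp(total)

def ade_defect(r, a, z):
    # Left side minus right side of (4.3); should be near zero away from poles
    return (G_t_product(r, a, z + 0.5j * a) / G_t_product(r, a, z - 0.5j * a)
            - (1 - cmath.exp(2j * r * z)))
```

The product form is usable for all $z$ away from the poles (4.2), whereas the series form converges only for Im z > −a/2, which is why `ade_defect` is written in terms of the product.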
The series representation (4.4) entails that for $z = x \in \mathbb{R}$ the functions $G_t$ and $1/G_t$ have Fourier expansions of the form
\[ G_t^{\pm 1}(r, a; x) = 1 + \sum_{n \in \mathbb{N}^*} c_n^\pm(ra) \exp 2inrx, \quad x \in \mathbb{R}. \tag{4.5} \]
Moreover, it readily yields the asymptotics
\[ G_t^{\pm 1}(r, a; z) = 1 + O(\exp(-2r\,\mathrm{Im}\, z)), \quad \mathrm{Im}\, z \to \infty, \tag{4.6} \]
uniformly for Re z ∈ $\mathbb{R}$. In our previous paper Ref. [9] we obtained some further results, and we clarified the relation to the q-gamma function. But the above information on the trigonometric G-function (which is quite easily derived) is all we need. Our integral representation for ${}_{s+1}\phi_s$ involves one more ingredient, however. This is the renormalized Weierstrass σ-function
\[ s(r, a; z) \equiv \sigma(z; \pi/2r, ia/2) \exp(-\eta z^2 r/\pi), \tag{4.7} \]
which we already employed in Ref. [9]. It is an entire, odd and π/r-antiperiodic function that obeys the A∆E
\[ \frac{s(z + ia/2)}{s(z - ia/2)} = -\exp(-2irz). \tag{4.8} \]
Again, these are the only properties we need. The integrand involves quotients of G-functions, and the function
\[ E(\lambda, z) \equiv s(z + i\lambda/2r)/s(z). \tag{4.9} \]
Clearly, the latter is entire and 2iπ-antiperiodic in λ. Also, it is π/r-periodic and meromorphic in $z$, with simple poles in the period lattice $P = \pi\mathbb{Z}/r + ia\mathbb{Z}$. To be quite precise, this is the case for λ not equal to a point in the lattice $2irP$; for the latter λ-values one infers from the A∆E (4.8) that $E(\lambda, z)$ is a multiple of $\exp(2ikrz)$, $k \in \mathbb{Z}$. This A∆E also entails that E satisfies
\[ E(\lambda, z) = e^\lambda E(\lambda, z - ia). \tag{4.10} \]
Consider now the integrand
\[ I(\alpha, \beta, \lambda, z) \equiv E(\lambda, z)\,\mathcal{G}(\alpha, \beta, z), \tag{4.11} \]
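where $\mathcal{G}$ denotes the following product of G-quotients:
\[ \mathcal{G}(\alpha, \beta, z) \equiv \frac{G_t(ia/2)}{G_t(z + ia/2)} \prod_{j=1}^{s+1} \frac{G_t(z + \alpha_j)}{G_t(\alpha_j)} \prod_{k=1}^{s} \frac{G_t(\beta_k)}{G_t(z + \beta_k)}. \tag{4.12} \]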
Introducing the domains
\[ D_M \equiv \{\alpha \in \mathbb{C}^{s+1} \mid \mathrm{Im}\, \alpha_j > -a(M + 1/2),\ j = 1, \ldots, s + 1\}, \tag{4.13} \]
it is convenient to require at first $(\alpha, \beta, \lambda) \in D_0 \times \mathbb{C}^s \times \mathbb{C}$. Then the poles of the factors $G_t(\cdot + \alpha_j)$ lie in the (open) lower half plane, cf. (4.2). The poles of $E(\lambda, z)$ in the lower half plane are canceled by the zeros of $1/G_t(z + ia/2)$. Now let $\Gamma_-$ denote a contour from $z = -\pi/2r$ to $z = \pi/2r$ such that the $\alpha_j$-dependent poles are below $\Gamma_-$ and the poles at $z = ina$, $n \in \mathbb{N}$, are above $\Gamma_-$. (For instance, one can let $\Gamma_-$ run along the real axis from $-\pi/2r$ to $\pi/2r$ with a downward indentation at $z = 0$.) Then the auxiliary function
\[ A(\alpha, \beta, \lambda) \equiv \prod_{k=1}^{s} G_t(\beta_k)^{-1} \cdot \int_{\Gamma_-} I(\alpha, \beta, \lambda, z)\,dz \tag{4.14} \]
is clearly well defined and analytic in $D_0 \times \mathbb{C}^s \times \mathbb{C}$.
Theorem 4.1. The function $A(\alpha, \beta, \lambda)$ extends to a function that is analytic in $\mathbb{C}^{2s+2}$. For Re λ < 0 the function
\[ \Phi(\alpha, \beta, \lambda) \equiv [2i\pi s(i\lambda/2r)]^{-1} \prod_{k=1}^{s} G_t(\beta_k) \cdot A(\alpha, \beta, \lambda) \tag{4.15} \]
satisfies
\[ \Phi(\alpha, \beta, \lambda) = {}_{s+1}\phi_s(e^{2ir\alpha_1 - ar}, \ldots, e^{2ir\alpha_{s+1} - ar}; e^{2ir\beta_1 - ar}, \ldots, e^{2ir\beta_s - ar}; e^{-2ar}, e^{\lambda}), \tag{4.16} \]
where ${}_{s+1}\phi_s$ is defined by (1.3)–(1.4).
Proof. Fixing α in the domain $D_0$ and $\beta \in \mathbb{C}^s$, we first prove (4.16) for $\lambda \in (-2ar, 0)$. In the process, we obtain a representation for Φ from which the asserted analytic continuation property of A can be read off. We begin by introducing contours
\[ \Gamma_k = \{z = x + ia(k - 1/2) \mid x \in [-\pi/2r, \pi/2r]\}, \quad k \in \mathbb{N}^*. \tag{4.17} \]
Since the integrand in (4.14) is π/r-periodic in $z$, we may shift $\Gamma_-$ to $\Gamma_1$, picking up the residue at the simple pole $z = 0$. From (4.7) we have
\[ s(z) = z + O(z^3), \quad z \to 0, \tag{4.18} \]
so we deduce
\[ \Phi(\alpha, \beta, \lambda) = 1 + [2i\pi s(i\lambda/2r)]^{-1} \int_{\Gamma_1} I(\alpha, \beta, \lambda, z)\,dz. \tag{4.19} \]
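Next, we replace the factor $E(\lambda, z)$ in I by $e^\lambda E(\lambda, z - ia)$ (recall (4.10)), and shift $\Gamma_1$ to $\Gamma_2$. Using the G-A∆E (4.3), the values of the G-quotients at the simple pole $z = ia$ are easily calculated, yielding
\[ \Phi(\alpha, \beta, \lambda) = 1 + e^\lambda \frac{\prod_{j=1}^{s+1} (1 - e^{2ir\alpha_j - ar})}{(1 - e^{-2ar}) \prod_{k=1}^{s} (1 - e^{2ir\beta_k - ar})} + [2i\pi s(i\lambda/2r)]^{-1} e^\lambda \int_{\Gamma_2} E(\lambda, z - ia)\,\mathcal{G}(\alpha, \beta, z)\,dz. \tag{4.20} \]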
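Iterating this procedure, one readily obtains
\[ \Phi(\alpha, \beta, \lambda) = \sum_{n=0}^{N} \frac{\prod_{j=1}^{s+1} (e^{2ir\alpha_j - ar}; e^{-2ar})_n}{(e^{-2ar}; e^{-2ar})_n \prod_{k=1}^{s} (e^{2ir\beta_k - ar}; e^{-2ar})_n}\, e^{n\lambda} + R_N(\alpha, \beta, \lambda), \tag{4.21} \]
where the remainder term reads
\[ R_N(\alpha, \beta, \lambda) \equiv [2i\pi s(i\lambda/2r)]^{-1} e^{N\lambda} \int_{\Gamma_{N+1}} E(\lambda, z - iNa)\,\mathcal{G}(\alpha, \beta, z)\,dz. \tag{4.22} \]
Now on the contour $\Gamma_{N+1}$ we have from (4.17),
\[ E(\lambda, z - iNa) = \frac{s(x + ia/2 + i\lambda/2r)}{s(x + ia/2)}, \quad x \in [-\pi/2r, \pi/2r]. \tag{4.23} \]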
Thus $E(\lambda, z - iNa)$ is bounded by an N-independent constant on $\Gamma_{N+1}$. Moreover, the z-dependent G-functions in $\mathcal{G}(\alpha, \beta, z)$ converge uniformly to 1 on $\Gamma_{N+1}$ as N → ∞, cf. (4.6). Since $\lambda \in (-2ar, 0)$, we deduce that $R_N(\alpha, \beta, \lambda)$ converges to 0 for N → ∞. Thus we obtain (4.16) for $\lambda \in (-2ar, 0)$.
Next, we multiply (4.21) by $2i\pi s(i\lambda/2r) \prod_{k=1}^{s} G_t(\beta_k)^{-1}$, so that we obtain $A(\alpha, \beta, \lambda)$ on the lhs, cf. (4.15). On the rhs we get a function that is clearly analytic for $(\alpha, \beta, \lambda) \in D_{N+1} \times \mathbb{C}^s \times \mathbb{C}$. Thus the theorem readily follows. □
It should be observed that this theorem yields additional information. First, recall that the rhs of (4.16) is analytic for Re λ < 0, so that the same holds true for $\Phi(\alpha, \beta, \lambda)$. At first sight this seems to disagree with the poles of the factor $1/s(i\lambda/2r)$ for Re λ < 0. But for such λ the function $E(\lambda, z)$ reduces to a multiple of $\exp(2ikrz)$, $k \in \mathbb{N}^*$. Thus we may replace $\Gamma_-$ by $[-\pi/2r, \pi/2r]$. Recalling now (4.5), one deduces that the integral vanishes for the pertinent λ. Hence the rhs of (4.16) yields the limit value of the quotient. For λ such that Re λ ≥ 0 and $s(i\lambda/2r) = 0$, however, the integral does not vanish in general. In particular, we have
\[ \int_{\Gamma_-} I(\alpha, \beta, 0, z)\,dz = \int_{-\pi/2r}^{\pi/2r} \mathcal{G}(\alpha, \beta, x)\,dx = \frac{\pi}{r}\, G_t(ia/2) \prod_{k=1}^{s} G_t(\beta_k) \Big/ \prod_{j=1}^{s+1} G_t(\alpha_j). \tag{4.24} \]
Thus $\Phi(\alpha, \beta, \lambda)$ has a simple pole at λ = 0, with a residue given by
\[ -G_t(ia/2) \prod_{k=1}^{s} G_t(\beta_k) \Big/ \prod_{j=1}^{s+1} G_t(\alpha_j). \tag{4.25} \]
Since $A(\alpha, \beta, \lambda)$ is entire and 2iπ-antiperiodic in λ, it follows more generally from (4.15) that the function $\Phi(\alpha, \beta, \lambda)$ has simple poles at
\[ \lambda = 2ark + 2\pi il, \quad k \in \mathbb{N}, \quad l \in \mathbb{Z}, \tag{4.26} \]
and is 2πi-periodic in λ. Obviously, this has consequences for the basic hypergeometric series (1.3): The analytic function in the unit disc defined by it has a meromorphic continuation, with simple poles located only at
\[ w = q^{-k}, \quad k \in \mathbb{N}. \tag{4.27} \]
Furthermore, the residue at $w = 1$ follows from (4.25) by using
\[ q = \exp(-2ar), \quad a_j = \exp(2ir\alpha_j - ar), \quad b_k = \exp(2ir\beta_k - ar). \tag{4.28} \]
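In the simplest case $s = 0$ these statements can be cross-checked against the q-binomial theorem, ${}_1\phi_0(a_1; -; q, w) = (a_1 w; q)_\infty/(w; q)_\infty$, whose right-hand side visibly has simple poles exactly at $w = q^{-k}$. With (4.28) and the product (4.1) one has $G_t(ia/2) = 1/(q; q)_\infty$ and $G_t(\alpha_1) = 1/(a_1; q)_\infty$, so (4.25) predicts the residue $-(a_1; q)_\infty/(q; q)_\infty$ at $w = 1$. The Python sketch below is an illustration under these stated reductions; it assumes the standard form of the series (1.3) as it appears in (4.21), and all function names and truncation orders are choices made here.

```python
def q_pochhammer_inf(x, q, factors=200):
    # Truncation of the infinite q-shifted factorial (x; q)_infinity
    prod = 1.0
    for k in range(factors):
        prod *= 1 - x * q**k
    return prod

def basic_hypergeometric(upper, lower, q, w, terms=200):
    # Partial sum of the series in (4.21):
    # sum_n prod_j (a_j; q)_n / ((q; q)_n prod_k (b_k; q)_n) * w^n
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        ratio = w / (1 - q**(n + 1))
        for a_j in upper:
            ratio *= 1 - a_j * q**n
        for b_k in lower:
            ratio /= 1 - b_k * q**n
        term *= ratio
    return total

def one_phi_zero_continued(a1, q, w):
    # Meromorphic continuation of 1phi0 via the q-binomial theorem
    return q_pochhammer_inf(a1 * w, q) / q_pochhammer_inf(w, q)

def residue_at_one(a1, q, eps=1e-6):
    # Finite-difference estimate of the residue at w = 1
    return eps * one_phi_zero_continued(a1, q, 1 + eps)
```

For general $s$ no closed product formula is available, so the sketch says nothing beyond $s = 0$; it only illustrates how (4.25)–(4.28) are to be read.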
In order to tie in the above function $\Phi(\alpha, \beta, \lambda)$ with the Askey–Wilson A∆O (3.35), we proceed by specializing the above variables. Specifically, for the remainder of this section we take $s = 3$ and substitute
\[ \alpha_{1,2} \equiv \mp v + ic_0 - ia/2, \quad \alpha_{3,4} \equiv \mp\hat v + i\hat c_0 - ia/2, \tag{4.29} \]
\[ \beta_j \equiv it_j, \quad j = 1, 2, 3, \tag{4.30} \]
\[ t_1 \equiv c_0 + c_1 - i\pi/2r - a/2, \quad t_2 \equiv c_0 + c_2, \quad t_3 \equiv c_0 + c_3 + i\pi/2r, \quad t_4 \equiv a/2, \tag{4.31} \]
in $\Phi(\alpha, \beta, \lambda)$. This yields a function $\phi(c, \lambda; v, \hat v)$ that is entire in $v$ and $\hat v$ and meromorphic in $c_0, \ldots, c_3$, with poles that are obvious from (4.15) and the first assertion of Theorem 4.1. Moreover, φ is 2πi-periodic and meromorphic in λ, with simple poles located only at (4.26). It is convenient to require at first
\[ \mathrm{Re}\, c_0 > 2a, \quad \mathrm{Re}\, \hat c_0 > 2a, \quad \mathrm{Im}\, v \in (-a, a), \quad \mathrm{Im}\, \hat v \in (-a, a), \quad \lambda \in (-2ar, 0). \tag{4.32} \]
Indeed, in that case we may write
\[ \phi(c, \lambda; v, \hat v) = [2i\pi s(i\lambda/2r)]^{-1} \int_{\Gamma_-} I(c, \lambda; v, \hat v, z)\,dz, \tag{4.33} \]
where
\[ I = F(c_0; v, z)K(c, \lambda; z)F(\hat c_0; \hat v, z), \tag{4.34} \]
\[ F(b; y, z) \equiv \frac{G_t(z + y + ib - ia/2)}{G_t(y + ib - ia/2)}\,\frac{G_t(z - y + ib - ia/2)}{G_t(-y + ib - ia/2)}, \tag{4.35} \]
\[ K(c, \lambda; z) \equiv E(\lambda; z) \prod_{j=1}^{4} \frac{G_t(it_j)}{G_t(z + it_j)}. \tag{4.36} \]
Since the function $\phi(c, \lambda; v, \hat v)$ is entire in $v$, the Askey–Wilson A∆O $A(c; v)$ (3.35) has a well-defined action on it. Now the restrictions (4.32) ensure that the four downward pole sequences have a distance larger than $a$ from the real axis. Therefore, the action of the shifts $T_{\pm ia}$ amounts to taking $v \to v \mp ia$ in the integral on the rhs of (4.33). We are now prepared for the following auxiliary result.
Lemma 4.2. With the restrictions (4.32) in effect, one has
\[ A(c; v)\phi(c, \lambda; v, \hat v) = 2\cos(2r\hat v)\,\phi(c, \lambda; v, \hat v) + \frac{[e^{\lambda + 2ar} - 1]}{i\pi s(i\lambda/2r)} \int_{\Gamma_-} [\cos(2r\hat v) - \cos(2rz + 2ir\hat c_0)]\,I\,dz. \tag{4.37} \]
Proof. The proof of this lemma runs parallel to that of Theorem 3.1, with the G-A∆E (4.3) playing the role of the A∆E (3.6) with $\delta = -$, the functions
\[ c(v) \equiv \cos rv, \quad s(v) \equiv \sin rv, \tag{4.38} \]
corresponding to $c_+(v)$ and $s_+(v)$, and $a_-$ to $a$. Indeed, as the analogs of (3.9) and (3.10) we obtain
\[ \frac{F(b; y + ia/2, z)}{F(b; y - ia/2, z)} = \frac{s(y + z + ib - ia/2)}{s(y - z - ib + ia/2)}\,\frac{s(y - ib + ia/2)}{s(y + ib - ia/2)}, \tag{4.39} \]
\[ \frac{F(b; y, z - ia)}{F(b; y, z)} = \frac{\exp 2ir(-z - ib + ia)}{4s(y + z + ib - ia)\,s(y - z - ib + ia)}. \tag{4.40} \]
Calculating now
\[ Q(c; v, z) \equiv (A(c; v)F)(c_0; v, z)/F(c_0; v, z), \tag{4.41} \]
we obtain (3.19) with $c_+, s_+, a_- \to c, s, a$. Equivalently, we obtain (3.19) with
\[ a_+ = -i\pi/r, \quad a_- = a. \tag{4.42} \]
From this one easily deduces that $Q(c; v, z)$ equals (3.12) with (4.42) in effect. Recalling (1.26) and (4.31), we now obtain
\[ A(c; v)\phi(c, \lambda; v, \hat v) = [2i\pi s(i\lambda/2r)]^{-1} \int_{\Gamma_-} dz\,\big[2c(2z + 2i\hat c_0)I(c, \lambda; v, \hat v, z) + F(c_0; v, z - ia)\,\Pi(c; z - ia/2)K(c, \lambda; z)F(\hat c_0; \hat v, z)\big], \tag{4.43} \]
\[ \Pi(c; z) \equiv 16 \exp[2ir(z + ic_0 - ia/2)] \prod_{j=1}^{4} s(z + it_j), \tag{4.44} \]
as the analogs of (3.15) and (3.14). Next, we take $z \to z + ia$ in the integrand of the second term on the rhs of (4.43). (This is allowed on account of (4.32) and π/r-periodicity.) As the analog of (3.16) we calculate
\[ K(c, \lambda; z + ia/2)/K(c, \lambda; z - ia/2) = -\exp(\lambda - 2irz + 2r\hat c_0 + ar)/\Pi(c; z), \tag{4.45} \]
so we deduce
\[ A(c; v)\phi(c, \lambda; v, \hat v) = [2i\pi s(i\lambda/2r)]^{-1} \int_{\Gamma_-} dz\,\big[2c(2z + 2i\hat c_0)I(c, \lambda; v, \hat v, z) - F(c_0; v, z)\exp(\lambda - 2irz + 2r\hat c_0 + 2ar)K(c, \lambda; z)F(\hat c_0; \hat v, z + ia)\big]. \tag{4.46} \]
Finally, using (4.40) we obtain
\[ F(\hat c_0; \hat v, z + ia) = -2\exp(2ir(z + i\hat c_0))[c(2\hat v) - c(2z + 2i\hat c_0)]F(\hat c_0; \hat v, z). \tag{4.47} \]
Substituting this in (4.46), we obtain (4.37). □
We are now prepared to define and study a “trigonometric” analog of the “hyperbolic” function $R(c; v, \hat v)$: It reads
\[ R_t(c; v, \hat v) \equiv \phi(c, -2ar; v, \hat v). \tag{4.48} \]
We also define
\[ D(c; v, \hat v) \equiv \exp(2r\hat c_0) \prod_{j=1}^{4} G_t(it_j) \prod_{\delta = +, -} \big[ G_t(\delta v + ic_0 - ia/2)\,G_t(\delta\hat v + i\hat c_0 - ia/2) \big]^{-1}. \tag{4.49} \]
Theorem 4.3. The renormalized function $\prod_{j=1}^{3} G_t(it_j)^{-1} \cdot R_t(c; v, \hat v)$ is entire in $c_0, \ldots, c_3$, $v$ and $\hat v$. The function $R_t$ satisfies the self-duality relation
\[ R_t(c; v, \hat v) = R_t(\hat c; \hat v, v), \tag{4.50} \]
and the A∆Es
\[ A(c; v)R_t(c; v, \hat v) = 2\cos(2r\hat v)R_t(c; v, \hat v) + D(c; v, \hat v), \tag{4.51} \]
\[ A(\hat c; \hat v)R_t(c; v, \hat v) = 2\cos(2rv)R_t(c; v, \hat v) + D(\hat c; \hat v, v). \tag{4.52} \]
Moreover, it has a specialization
\[ R_t(c; v, i\hat c_0 + ina) = P_n(\cos(2rv)), \quad n \in \mathbb{N}, \tag{4.53} \]
where $P_0(u), P_1(u), \ldots$ are the polynomials from Theorem 3.2. Finally, one has
\[ R_t(c; v, \hat v) = {}_4\phi_3\big(e^{2ir(-v + ic_0)}, e^{2ir(v + ic_0)}, e^{2ir(-\hat v + i\hat c_0)}, e^{2ir(\hat v + i\hat c_0)}; -e^{-2r(c_0 + c_1)}, e^{-2r(c_0 + c_2 + a/2)}, -e^{-2r(c_0 + c_3 + a/2)}; e^{-2ar}, e^{-2ar}\big). \tag{4.54} \]
Proof. The first assertion follows upon specializing the first assertion of Theorem 4.1. From (4.33) we read off
\[ \phi(c, \lambda; v, \hat v) = \phi(\hat c, \lambda; \hat v, v), \tag{4.55} \]
so (4.50) is clear from (4.48). Next, we prove (4.51), using Lemma 4.2. To this end we need only show that the second term on the rhs of (4.37) has limit $D(c; v, \hat v)$ for $\lambda \downarrow -2ar$. Now the prefactor has limit $-2r/\pi s'(-ia)$. Using the s-A∆E (4.8) and (4.18), we obtain $s'(-ia) = -e^{ar}$. To determine the limit of the integral, we recall (4.34)–(4.36) and (4.9): We have
\[ E(-2ar, z) = s(z - ia)/s(z) = -e^{ar} e^{2irz}, \tag{4.56} \]
by virtue of (4.8). Therefore the limit of the second term becomes
\[ -\frac{2r}{\pi} \int_{\Gamma_-} dz\,[\cos(2r\hat v) - \cos(2rz + 2ir\hat c_0)]\,e^{2irz} F(c_0; v, z) \prod_{j=1}^{4} \frac{G_t(it_j)}{G_t(z + it_j)}\,F(\hat c_0; \hat v, z). \tag{4.57} \]
We may now replace $\Gamma_-$ by $[-\pi/2r, \pi/2r]$ and invoke (4.5) to deduce that (4.57) equals $D(c; v, \hat v)$ (4.49). Hence (4.51) follows. Clearly, (4.52) follows by combining (4.50) and (4.51), so we proceed by proving (4.53). The key point is that we have
\[ D(c; v, i\hat c_0 + ina),\ D(\hat c; i\hat c_0 + ina, v) = 0, \quad n \in \mathbb{N}, \tag{4.58} \]
cf. (4.49). Therefore, we obtain the A∆E (3.33) and recurrence (3.34) satisfied by the polynomials $P_n(\cos(2rv))$ when we specialize (4.51) and (4.52) to $\hat v = i\hat c_0 + ina$ with $n \in \mathbb{N}$. Hence we need only show
\[ R_t(c; v, i\hat c_0) = 1 \tag{4.59} \]
for (4.53) to result. To prove (4.59), we note that we may flip the indentation of $\Gamma_-$ in (4.33), picking up a residue 1. Then we can let $\hat v$ converge to $i\hat c_0$, to obtain a zero from the term $1/G_t(-\hat v + i\hat c_0 - ia/2)$. As a consequence we obtain
\[ \phi(c, \lambda; v, i\hat c_0) = 1, \tag{4.60} \]
and so (4.59) follows by taking $\lambda \downarrow -2ar$. The remaining assertion (4.54) follows by combining (4.16) with the pertinent substitutions, cf. (4.29)–(4.31). Hence the proof of the theorem is complete. □
Comparing (4.53) and (4.54) to (1.6), the relation between the polynomials $P_n(\cos 2rv)$ and the Askey–Wilson polynomials $p_n(\cos v)$ can be read off: When we take $r = 1/2$ in the former and substitute
\[ q = e^{-a}, \quad \alpha = e^{-c_0}, \quad \beta = -e^{-c_1}, \quad \gamma = e^{-c_2 - a/2}, \quad \delta = -e^{-c_3 - a/2}, \tag{4.61} \]
in the latter, we obtain
\[ N_n^{-1} p_n(\cos v) = P_n(\cos v), \quad n \in \mathbb{N}. \tag{4.62} \]
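At $\hat v = i\hat c_0 + ina$ the third upper parameter of (4.54) becomes $q^{-n}$ with $q = e^{-2ar}$, so the series terminates after $n + 1$ terms and is manifestly a polynomial in $\cos(2rv)$. The Python sketch below evaluates this terminating ${}_4\phi_3$ and applies the A∆O (3.35), (3.36) to it, so that the eigenvalue equation (3.33) can be tested numerically. It assumes $\hat c_0 = (c_0 + c_1 + c_2 + c_3)/2$, which is part of (2.3) and therefore not stated in this section, and it uses the convention, stated above Lemma 4.2, that $T_{\pm ia}$ acts by $v \to v \mp ia$. Function names are illustrative.

```python
import cmath
import math

def R_t_discrete(n, v, r, a, c):
    # Terminating 4phi3 of (4.54) at vhat = i*chat0 + i*n*a (n = 0, 1, 2, ...)
    c0, c1, c2, c3 = c
    h0 = (c0 + c1 + c2 + c3) / 2          # ASSUMPTION: chat0 from (2.3)
    q = math.exp(-2 * a * r)
    upper = [cmath.exp(2j * r * (-v + 1j * c0)),
             cmath.exp(2j * r * (v + 1j * c0)),
             q**(-n),                      # exp(2ir(-vhat + i chat0))
             math.exp(-4 * r * h0) * q**n] # exp(2ir(vhat + i chat0))
    lower = [-math.exp(-2 * r * (c0 + c1)),
             math.exp(-2 * r * (c0 + c2 + a / 2)),
             -math.exp(-2 * r * (c0 + c3 + a / 2))]
    total, term = 0.0, 1.0
    for k in range(n + 1):
        total += term
        ratio = q / (1 - q**(k + 1))
        for x in upper:
            ratio *= 1 - x * q**k
        for y in lower:
            ratio /= 1 - y * q**k
        term *= ratio
    return total

def C(v, r, a, c):
    # Coefficient function (3.36)
    c0, c1, c2, c3 = c
    num = (cmath.sin(r * (v - 1j * c0)) * cmath.cos(r * (v - 1j * c1))
           * cmath.sin(r * (v - 1j * c2 - 0.5j * a)) * cmath.cos(r * (v - 1j * c3 - 0.5j * a)))
    den = (cmath.sin(r * v) * cmath.cos(r * v)
           * cmath.sin(r * (v - 0.5j * a)) * cmath.cos(r * (v - 0.5j * a)))
    return num / den

def apply_A(f, v, r, a, c):
    # The operator (3.35) applied to f at the point v
    h0 = sum(c) / 2                        # same assumption on chat0
    return (C(v, r, a, c) * (f(v - 1j * a) - f(v))
            + C(-v, r, a, c) * (f(v + 1j * a) - f(v))
            + 2 * math.cosh(2 * r * h0) * f(v))
```

For each $n$, `apply_A` acting on `lambda v: R_t_discrete(n, v, r, a, c)` should reproduce $2\,\mathrm{ch}\,2r(\hat c_0 + na)$ times the function value, in line with (3.33) and (4.53).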
Appendix A. The Hyperbolic Gamma Function and Related Functions
In a previous paper [9] we presented a new approach to the theory of first order analytic difference equations. As an application, we introduced generalized gamma functions of hyperbolic, trigonometric and elliptic type – Euler’s gamma function $\Gamma(z)$ being of rational type. Our trigonometric gamma function is closely related to the q-gamma function, as detailed in Ref. [9]. We have recently become aware of the fact that our hyperbolic gamma function is not essentially new either: It is basically equal to Kurokawa’s “double sine function” [12,13]. In turn, the latter is a quotient of “double gamma functions” – a function introduced and studied in great detail by Barnes almost a century ago [14,15]. We will clarify the latter relations towards the end of this appendix. The main purpose of this appendix is, however, to summarize and extend the results from Ref. [9] that we need for a detailed study of our generalized hypergeometric function, which is built from hyperbolic gamma functions. The pertinent results are mostly new within the context of the double sine function, too, and indeed the viewpoint on this function coming from Ref. [9] (and on the entire function E(z) introduced later on) is quite different from that
of Barnes and later authors dealing with the double gamma and sine functions [16–18, 4]. To begin with, consider the integral
\[ \int_0^\infty \frac{dy}{y} \left( \frac{\sin 2yz}{2\,\mathrm{sh}\,a_+ y\ \mathrm{sh}\,a_- y} - \frac{z}{a_+ a_- y} \right) \equiv g(a_+, a_-; z), \tag{A.1} \]
where we take at first $a_-, a_+ \in (0, \infty)$. Defining the strip
\[ S \equiv \{z \in \mathbb{C} \mid |\mathrm{Im}\, z| < a\}, \quad a \equiv (a_+ + a_-)/2, \tag{A.2} \]
it is clear that the integral converges absolutely and uniformly on compact subsets of S. Therefore, $g(a_+, a_-; z)$ is analytic in S. The hyperbolic gamma function is now defined by
\[ G(a_+, a_-; z) \equiv \exp(ig(a_+, a_-; z)). \tag{A.3} \]
It is obviously analytic and zero-free in S. We proceed by listing further properties that are important in this paper, referring to Sect. III A of Ref. [9] for complete proofs. First of all, it is not obvious, but true that G extends to a meromorphic function of $z$. It is obvious from (A.1) that G has the features
\[ G(a_+, a_-; -z) = 1/G(a_+, a_-; z), \tag{A.4} \]
\[ G(a_-, a_+; z) = G(a_+, a_-; z), \tag{A.5} \]
\[ G(\lambda a_+, \lambda a_-; \lambda z) = G(a_+, a_-; z), \quad \lambda \in (0, \infty). \tag{A.6} \]
From our viewpoint, the key property of G is that it is the unique minimal solution to the first order A∆E
\[ \frac{G(z + ia_+/2)}{G(z - ia_+/2)} = 2\,\mathrm{ch}(\pi z/a_-) \tag{A.7} \]
that satisfies
\[ G(0) = 1, \quad G(z) > 0, \quad z \in i(-a, a). \tag{A.8} \]
(The notion of “minimal solution” is defined in Ref. [9].) In view of the $a_+ \leftrightarrow a_-$ symmetry of G, it may equivalently be defined as the unique minimal solution to
\[ \frac{G(z + ia_-/2)}{G(z - ia_-/2)} = 2\,\mathrm{ch}(\pi z/a_+) \tag{A.9} \]
satisfying (A.8). It readily follows from (A.1) that for fixed $z \in S$ the function $G(a_+, a_-; z)$ is real-analytic for $a_+, a_- \in (0, \infty)$. Far stronger analyticity properties, however, can be read off from a representation of G in terms of an infinite product of Γ-functions. In order to detail these properties, we define the quotient
\[ \rho \equiv a_-/a_+, \tag{A.10} \]
the cut plane
\[ \mathbb{C}^- \equiv \{\rho \in \mathbb{C} \mid \rho \notin (-\infty, 0]\}, \tag{A.11} \]
Fig. 2. The zero and pole sequences of $G(a_+, a_-; z)$ for $a_+, a_-$ positive
and the complex numbers
\[ z_{kl} \equiv ika_+ + ila_-, \quad k, l \in \mathbb{N}. \tag{A.12} \]
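For positive $a_+, a_-$ the integral (A.1) is easy to evaluate numerically inside the strip (A.2), which gives a direct check of the A∆Es (A.7) and (A.9). The Python sketch below is one way to do this, not the method of Ref. [9]: it uses a midpoint rule on $[0, y_{\max}]$ and adds the tail of the algebraic term $-z/(a_+ a_- y^2)$ in closed form. The step count, the cutoff and the function names are choices made here.

```python
import cmath
import math

def g(a_plus, a_minus, z, steps=20000, y_max=40.0):
    # Midpoint-rule approximation of the integral (A.1), valid for |Im z| < a
    h = y_max / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        y = (k + 0.5) * h
        first = cmath.sin(2 * y * z) / (2 * math.sinh(a_plus * y) * math.sinh(a_minus * y))
        second = z / (a_plus * a_minus * y)
        total += (first - second) / y
    # Beyond y_max the oscillatory term is exponentially small; the remaining
    # term -z/(a_+ a_- y^2) integrates exactly to -z/(a_+ a_- y_max).
    return total * h - z / (a_plus * a_minus * y_max)

def G(a_plus, a_minus, z):
    # Hyperbolic gamma function (A.3)
    return cmath.exp(1j * g(a_plus, a_minus, z))

def ade_ratio(a_plus, a_minus, x, shift):
    # G(x + i*shift/2) / G(x - i*shift/2) for real x; shift is a_+ or a_-
    return G(a_plus, a_minus, x + 0.5j * shift) / G(a_plus, a_minus, x - 0.5j * shift)
```

With `shift = a_plus` the ratio should agree with $2\,\mathrm{ch}(\pi x/a_-)$, and with `shift = a_minus` with $2\,\mathrm{ch}(\pi x/a_+)$. The sketch does not extend beyond the strip S, where the integral no longer converges.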
Provided ρ stays in $\mathbb{C}^-$, and $(a_+, a_-, z)$ stays away from the hyperplanes $z = z_{kl}^-$ in $\mathbb{C}^3$, with
\[ z_{kl}^- \equiv -ia - z_{kl}, \quad k, l \in \mathbb{N}, \tag{A.13} \]
the function $G(a_+, a_-; z)$ continues to a one-valued jointly analytic function of $(a_+, a_-, z)$, vanishing solely on the hyperplanes $z = z_{kl}^+$, with
\[ z_{kl}^+ \equiv -z_{kl}^-, \quad k, l \in \mathbb{N}. \tag{A.14} \]
Moreover, fixing $a_+, a_-$, the function $G(a_+, a_-; \cdot)$ is meromorphic with poles only for $z = z_{kl}^-$. We symbolize the pole and zero sequences of $G(a_+, a_-; \cdot)$ for positive $a_+, a_-$ in Fig. 2.
The infinite product representation also yields the pole order for $z$ equal to $z_{kl}^-$, which we denote by $O(kl)$ from now on. Specifically, the latter equals the number of distinct pairs $(m, n) \in \mathbb{N}^2$ such that $z_{mn} = z_{kl}$. In particular, all poles are simple for Im ρ ≠ 0, and also for ρ positive and irrational. We have occasion to use the inequality
\[ O(k + m, l + n) \geq O(kl) + O(mn) - 1, \quad k, l, m, n \in \mathbb{N}. \tag{A.15} \]
This inequality cannot be found in Ref. [9], so we supply its simple proof here. Clearly, (A.15) holds true when all poles are simple. Thus we need only consider the case
\[ \rho = s/t, \quad s, t \in \mathbb{N}^* \text{ coprime}. \tag{A.16} \]
Then any $z_{ij}$, $i, j \in \mathbb{N}$, can be uniquely written as $z_{\sigma\tau}$ with $\sigma \in \{0, \ldots, s - 1\}$, and we have
\[ O(ij) = [\tau/t] + 1, \tag{A.17} \]
where $[\cdot]$ is the entier function. Hence, letting
\[ z_{kl} = z_{\sigma_1, \tau_1}, \quad z_{mn} = z_{\sigma_2, \tau_2}, \quad z_{k+m, l+n} = z_{\sigma_3, \tau_3}, \quad \sigma_1, \sigma_2, \sigma_3 \in \{0, \ldots, s - 1\}, \tag{A.18} \]
the inequality (A.15) is equivalent to
\[ [\tau_3/t] \geq [\tau_1/t] + [\tau_2/t]. \tag{A.19} \]
Now if $\sigma_1 + \sigma_2 \in \{0, \ldots, s - 1\}$, then we have $\sigma_3 = \sigma_1 + \sigma_2$ and $\tau_3 = \tau_1 + \tau_2$, so (A.19) is clear. In case $\sigma_1 + \sigma_2 \in \{s, \ldots, 2s - 2\}$, we have $\sigma_3 = \sigma_1 + \sigma_2 - s$ and $\tau_3 = \tau_1 + \tau_2 + t$. Hence (A.19) follows once more, and so the proof of (A.15) is complete.
Obviously, the pole at $z_{00}^- = -ia$ is always simple. Its residue can be determined explicitly and reads
\[ r_{00} = \frac{i}{2\pi} (a_+ a_-)^{1/2}. \tag{A.20} \]
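The counting argument behind (A.15)–(A.19) is elementary enough to test by brute force. For $\rho = a_-/a_+ = s/t$ the condition $z_{mn} = z_{kl}$ in (A.12) reads $mt + ns = kt + ls$, so the pole order is the number of non-negative integer solutions $(m, n)$ of this equation. The Python sketch below counts them directly and compares with the closed form (A.17); the reduction $\sigma = k \bmod s$, $\tau = l + t\,[k/s]$ used for the closed form is spelled out here as a reading of the text, and the function names are illustrative.

```python
def pole_order_bruteforce(k, l, s, t):
    # Number of pairs (m, n) of non-negative integers with m*t + n*s = k*t + l*s,
    # i.e. with z_mn = z_kl when a_-/a_+ = s/t
    target = k * t + l * s
    return sum(1 for n in range(target // s + 1) if (target - n * s) % t == 0)

def pole_order_formula(k, l, s, t):
    # (A.17): write z_kl = z_{sigma, tau} with sigma in {0, ..., s-1};
    # then sigma = k mod s and tau = l + t * (k // s), and O = [tau/t] + 1
    tau = l + t * (k // s)
    return tau // t + 1

def inequality_holds(k, l, m, n, s, t):
    # The inequality (A.15)
    return (pole_order_bruteforce(k + m, l + n, s, t)
            >= pole_order_bruteforce(k, l, s, t) + pole_order_bruteforce(m, n, s, t) - 1)
```

For $s = t = 1$ the count is $k + l + 1$, and (A.15) holds with equality, which shows that the constant $-1$ cannot be improved.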
More generally, the pole at $z_{kl}^-$ is simple iff the quantity
\[ t_{kl} \equiv \prod_{m=1}^{k} \sin(\pi m a_+/a_-) \prod_{n=1}^{l} \sin(\pi n a_-/a_+) \tag{A.21} \]
is non-zero; assuming it is, the residue at the pole reads
\[ r_{kl} = (-)^{kl} (-1/2)^{k+l}\, r_{00}/t_{kl}. \tag{A.22} \]
Of course, corresponding properties of the zeros of $G(z)$ follow from the relation (A.4). Moreover, the latter entails that at a simple pole $z_{kl}^+$ of $1/G(z)$ the residue equals $-r_{kl}$.
Next, we turn to the asymptotics of $G(a_+, a_-; z)$ for Re z → ±∞. It so happens that we need an extension of our results obtained in Ref. [9], where we fixed $a_+, a_- \in (0, \infty)$. Here we allow $a_+, a_-$ to vary over the (open) right half plane, henceforth denoted by RHP. Accordingly, an extensive elaboration on our previous arguments regarding asymptotics appears inevitable. Recalling $G = \exp ig$, we need only study the g-asymptotics. Moreover, since $g$ is odd in $z$ (cf. (A.1)), it suffices to obtain the Re z → ∞ asymptotics. Now since we take $a_+, a_- \in \mathrm{RHP}$, the poles and zeros of $G(z)$ stay in the lower and upper half plane, and $g(z)$ extends from the real axis (where the integral representation (A.1) is still valid) to an analytic function in the complex plane with two wedges deleted. (Note that $g$ has logarithmic branch points at the poles and zeros of G.) To be specific, let us first set
\[ a_\pm = r_\pm \exp(i\phi_\pm), \quad r_\pm > 0, \quad \phi_\pm \in (-\pi/2, \pi/2), \tag{A.23} \]
\[ \phi_{\max} \equiv \max(\phi_+, \phi_-), \quad \phi_{\min} \equiv \min(\phi_+, \phi_-), \tag{A.24} \]
so that
\[ a = (a_+ + a_-)/2 = r \exp(i\phi), \quad r > 0, \quad \phi \in [\phi_{\min}, \phi_{\max}]. \tag{A.25} \]
Defining the wedge
\[ W \equiv \{iR \exp(i\eta) \mid R \in [0, \infty),\ \eta \in [\phi_{\min}, \phi_{\max}]\}, \tag{A.26} \]
it suffices to delete the two wedges
\[ W_+ \equiv ia + W, \quad W_- \equiv -W_+, \tag{A.27} \]
in the upper and lower half planes. Indeed, $g(a_+, a_-; \cdot)$ is clearly analytic in the domain
\[ D \equiv \mathbb{C} \setminus (W_+ \cup W_-). \tag{A.28} \]
In our study of the generalized hypergeometric function it is now crucial that the function
\[ f(a_+, a_-, z) \equiv g(a_+, a_-; z) - A(a_+, a_-, z), \tag{A.29} \]
where A is the dominant asymptotics function
\[ A(a_+, a_-, z) \equiv -\frac{\pi z^2}{2a_+ a_-} - \frac{\pi}{24} \left( \frac{a_+}{a_-} + \frac{a_-}{a_+} \right), \tag{A.30} \]
decays exponentially for Re z → ∞, locally uniformly in $a_+, a_-$ and Im z, with a rate involving the positive number
\[ a_m \equiv \max(r_+/\cos\phi_+, r_-/\cos\phi_-). \tag{A.31} \]
To detail this asymptotics, we introduce the domain
\[ \mathcal{A} \equiv \{a_m + R \exp(i\chi) \mid R > 0,\ |\chi| < \chi_m\}, \tag{A.32} \]
where
\[ \chi_m \equiv \min(\phi_{\min} + \pi/2, -\phi_{\max} + \pi/2). \tag{A.33} \]
Equivalently, $\chi_m$ may be defined by
\[ \cot\chi_m = \max(|\tan\phi_+|, |\tan\phi_-|), \quad \chi_m \in (0, \pi/2]. \tag{A.34} \]
Figure 3 may be helpful to vizualize the geometry. Theorem A.1. Fix compacts K+ , K− ⊂ RH P and K ⊂ R, and fix σ ∈ (1/2, 1). Then there exists a positive C = C(K+ , K− , K, σ ) such that one has |f (a+ , a− , z)| < C exp(−2π σ Re z/am ),
(A.35)
for all a+ ∈ K+ , a− ∈ K− and z ∈ A with Im z ∈ K.

Proof. The proof of this theorem is relegated to Appendix C. □

In Appendix C additional information concerning asymptotics is obtained that is of some interest in itself. As it stands, however, the uniform bound (A.35) suffices for our present purposes (as will become clear in Appendix B).

We proceed by detailing how the gamma function arises as a limit of our G-function: One has
$$\lim_{\rho \downarrow 0} \exp[iz \ln(2\pi\rho)]\, G(1, \rho; \rho z + i/2) = \frac{(2\pi)^{1/2}}{\Gamma(iz + 1/2)},$$
(A.36)
[Fig. 3. The asymptotics domain A and the zero and pole wedges W+ , W− for a+ , a− ∈ RHP]
the limit being uniform on compact subsets of C. (Note that the scale invariance (A.6) of the G-function can be used to handle the case a+ ≠ 1.) The well-known multiplication formula for the gamma function can be obtained from its generalization
$$G(a_+, a_-; Nz) = \prod_{j,k=1}^{N} G\Big(a_+, a_-; z + \frac{ia_+}{2N}(N + 1 - 2j) + \frac{ia_-}{2N}(N + 1 - 2k)\Big)$$
(A.37)
and the limit (A.36). (In fact, the G-function satisfies a multiplication formula that is more general than (A.37), cf. Eq. (3.25) in Ref. [9].) A second zero step size limit of the G-function is crucial, too: One has
$$\lim_{a \downarrow 0} \frac{G(\pi, a; z + i\lambda a)}{G(\pi, a; z + i\mu a)} = \exp((\lambda - \mu) \ln(2\,\mathrm{ch}\, z)), \quad \lambda, \mu \in \mathbb{R},$$
(A.38)
uniformly on compacts of the cut plane C(π/2), with C(d) ≡ C \ ±i[d, ∞), d > 0.
(A.39)
(Here, ln is chosen real for z real.) Note that this limit is an easy consequence of the A∆E (A.9) for λ − µ ∈ Z; the branch cut occurring for λ − µ ∉ Z is due to a coalescence of zeros and poles. Both limits (A.36) and (A.38) are crucial to appreciate why our function R(a+ , a− , c; v, v̂) may be viewed as a generalization of the hypergeometric function.

We do not need any further information on the G-function to study our R-function. But in order to obtain a clear and detailed view on the analyticity properties of the R-function in its eight variables, we have occasion to use in addition to G a function E(a+ , a− ; z) that will be introduced and studied next. The E-function has in particular the following properties: (i) E is holomorphic in (a+ , a− , z) as long as a− /a+ stays in the cut plane C− (A.11); (ii) E is related to G by
G(a+ , a− ; z) = E(a+ , a− ; z)/E(a+ , a− ; −z).
(A.40)
From these properties it is already obvious that E is non-zero unless (a+ , a− , z) belongs to the hyperplanes $z = z^+_{kl}$, cf. (A.14); moreover, for a+ , a− fixed, the multiplicity of the zeros of the entire function E(a+ , a− ; ·) equals that of the zeros of G(a+ , a− ; ·). To define E we first introduce
$$e(a_+, a_-; z) \equiv \frac{1}{4} \int_0^\infty \frac{dy}{y} \left( \frac{1 - e^{-2iyz}}{\mathrm{sh}\, a_+y\; \mathrm{sh}\, a_-y} - \frac{2iz}{a_+a_-y} - \frac{z^2}{a_+a_-}\,(e^{-2a_+y} + e^{-2a_-y}) \right).$$
(A.41)
It is straightforward to verify that e is holomorphic in (a+ , a− , z) for a+ , a− ∈ RHP and Im z < Re a. Clearly, we have (cf. (A.1))
e(a+ , a− ; z) − e(a+ , a− ; −z) = ig(a+ , a− ; z).
(A.42)
Therefore, the function E(a+ , a− ; z) ≡ exp[e(a+ , a− ; z)]
(A.43)
satisfies (A.40), and is holomorphic and non-zero for a+ , a− ∈ RH P and Im z < Re a. From (A.41) it is obvious that E is symmetric under a+ ↔ a− and scale-invariant: E(a− , a+ ; z) = E(a+ , a− ; z),
(A.44)
E(λa+ , λa− ; λz) = E(a+ , a− ; z), λ > 0.
(A.45)
The announced property (i) is now an easy consequence of scale invariance and holomorphy of E for (a+ , a− , z) ∈ RHP × RHP × C, which we demonstrate next. Indeed, the latter holomorphy property is readily obtained from the A∆E
$$\frac{E(z + ia_+/2)}{E(z - ia_+/2)} = \frac{(2\pi)^{1/2}}{\Gamma(\frac{iz}{a_-} + \frac{1}{2})} \exp(izK(a_+, a_-)),$$
(A.46)
where
$$K(a_+, a_-) \equiv \frac{1}{2a_-} \ln \frac{a_+}{a_-},$$
(A.47)
which is obeyed by E(a+ , a− ; z) for Im (z + ia+ /2) < Re a. We prove this A∆E in a moment, and explain first why it entails holomorphy of E(a+ , a− ; z) in RHP² × C.
The point is that we can iterate (A.46) to get
$$E(a_+, a_-; z + iNa_+) = E(a_+, a_-; z) \cdot \prod_{j=1}^{N} \frac{(2\pi)^{1/2} \exp(i[z + i(j - 1/2)a_+]K(a_+, a_-))}{\Gamma(\frac{i}{a_-}[z + i(j - 1/2)a_+] + \frac{1}{2})},$$
(A.48)
for a+ , a− ∈ RHP and Im (z + iNa+ ) < Re a. Now the rhs can be analytically continued to Im z < Re a. (Indeed, E(a+ , a− ; z) has this property, and by entireness of 1/Γ(z) the product has this property, too; note that (A.47) entails analyticity of K(a+ , a− ) in RHP².) Therefore, we can continue E(a+ , a− ; z) analytically to Im z < Re (a + N a+ ), and since N is arbitrary, we deduce holomorphy in RHP² × C.

Next, we prove that E(a+ , a− ; z) satisfies (A.46). This A∆E is readily obtained via (A.41): It entails
$$e\Big(z + i\frac{a_+}{2}\Big) - e\Big(z - i\frac{a_+}{2}\Big) = \int_0^\infty \frac{dt}{t} \left( -\frac{e^{-2itz/a_-}}{2\,\mathrm{sh}\, t} + \frac{1}{2t} - \frac{iz}{a_-} e^{-2t} \right) + \frac{iz}{2a_-} \int_0^\infty \frac{dy}{y}\, (e^{-2a_-y} - e^{-2a_+y})$$
$$= \ln((2\pi)^{1/2} / \Gamma(iz/a_- + 1/2)) + izK(a_+, a_-).$$
(A.49)
(Cf. e.g. Eq. (A37) in Ref. [9] for the Γ-representation used here.) Thus the proof of the property (i) of E(a+ , a− ; z) announced above is now complete.

In the course of the proof we have obtained the A∆E (A.46). This A∆E plays no further role in this paper, inasmuch as we only need the G-A∆Es (A.7) and (A.9). We add here some further remarks on it, however, and use it to explain the relation of E to Barnes’ double gamma function.

Let us point out first that the A∆Es (A.7) and (A.9) satisfied by the G-function may be obtained directly from (A.40), (A.46) and (A.44), by using the functional equation Γ(w + 1/2)Γ(−w + 1/2) = π/ cos πw. (More generally, various properties of the G-function can be obtained as corollaries of E-function features.) Just as the G-function can be characterized as the unique minimal solution to the A∆E (A.7) (with a+ , a− positive) satisfying (A.8), we may single out the E-function as the unique minimal solution to the A∆E (A.46) satisfying
E(0) = 1, E(z) > 0, z ∈ i(−∞, a).
(A.50)
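The step from the E-function to the G-function equations can be made explicit. Writing G(z) = E(z)/E(−z), cf. (A.40), the quotient G(z + ia+ /2)/G(z − ia+ /2) equals the product of the right-hand side of (A.46) taken at z and at −z; the exponentials exp(±izK) cancel, and the reflection formula turns 2π/[Γ(w + 1/2)Γ(−w + 1/2)] with w = iz/a− into 2 cos πw = 2 ch(πz/a− ). The sketch below checks the reflection formula numerically for real w only (that is, for z on the imaginary axis with a− > 0), since the Python standard library has no complex gamma function; the function names are hypothetical, and the identification of the result with the G-equations (A.7), (A.9) is an assumption, as those equations are not displayed in this excerpt.

```python
import math

def gamma_product(w):
    """Gamma(w + 1/2) * Gamma(-w + 1/2) for real w away from the half-odd integers."""
    return math.gamma(w + 0.5) * math.gamma(-w + 0.5)

def g_step_ratio(w):
    """Product of the two Gamma-quotients of (A.46) taken at z and -z, with w = iz/a_-.

    The factors exp(izK) and exp(-izK) cancel, leaving 2*pi / gamma_product(w).
    """
    return 2 * math.pi / gamma_product(w)

# Compare with pi/cos(pi w) and 2 cos(pi w) at a few sample points.
for w in (0.1, 0.3, -0.7, 1.2, 2.25):
    assert math.isclose(gamma_product(w), math.pi / math.cos(math.pi * w), rel_tol=1e-9)
    assert math.isclose(g_step_ratio(w), 2 * math.cos(math.pi * w), rel_tol=1e-9)
```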
(The ambiguity in minimal solutions is of the form c exp(2π kz/a+ ) with c ∈ C and k ∈ Z, cf. Ref. [9]. The properties (A.50) fix c and k.)

We are now prepared to tie in the E- and G-functions with Barnes’ double gamma function and Kurokawa’s double sine function. The point is that the double gamma function may be viewed as a minimal solution to the A∆E
$$\frac{\tilde{E}(z + ia_+/2)}{\tilde{E}(z - ia_+/2)} = \frac{(2\pi)^{1/2}}{\Gamma(\frac{iz}{a_-} + \frac{1}{2})} \exp\Big(-\frac{iz}{a_-} \ln a_-\Big).$$
(A.51)
Specifically, the function
$$\tilde{E}(a_+, a_-; z) \equiv 1/\Gamma_2(iz + a, (a_+, a_-))$$
(A.52)
has this property. The latter assertion is most easily established from Shintani’s paper Ref. [16]: The A∆E can be gleaned from p. 172, and from p. 179 one can deduce that the third logarithmic derivative of $\tilde{E}(a_+, a_-; z)$ equals that of E(a+ , a− ; z). From this equality it already follows that (A.52) is a minimal solution to (A.51) with a+ , a− positive, cf. Theorem II.3 in Ref. [9]. Moreover, comparing (A.46) and (A.51), we deduce that we must have
$$E(a_+, a_-; z) = c(a_+, a_-) \exp(2\pi kz/a_+ + \ln(a_+a_-)z^2/4a_+a_-)\, \tilde{E}(a_+, a_-; z),$$
(A.53)
with k ∈ Z. Now both E and $\tilde{E}$ are invariant under a+ ↔ a− . (Indeed, this property of $\tilde{E}$ follows from (A.52) and invariance of Γ2 (w, (a+ , a− )) under a+ ↔ a− , cf. e.g. p. 173 in Ref. [16].) Hence we have k = 0. Using finally E(0) = 1, we obtain
$$E(a_+, a_-; z) = \frac{\Gamma_2((a_+ + a_-)/2, (a_+, a_-))}{\Gamma_2(iz + (a_+ + a_-)/2, (a_+, a_-))} \exp(\ln(a_+a_-)z^2/4a_+a_-).$$
(A.54)
From (A.40) and (A.54) we now get
$$G(a_+, a_-; z) = \frac{\Gamma_2(a - iz, (a_+, a_-))}{\Gamma_2(a + iz, (a_+, a_-))}.$$
(A.55)
The rhs can be rewritten in terms of Kurokawa’s double sine function S2 (z|a+ , a− ), cf. Refs. [12,4]. This yields the identity G(a+ , a− ; z) = S2 (iz + (a+ + a− )/2|a+ , a− ),
(A.56)
already alluded to. It seems that Shintani was actually the first author who obtained a result on the double sine function that is not immediate from its being a quotient of double gamma functions: He proved a representation that in terms of our G-function amounts to
$$G(a_+, a_-; z) = \exp\left( -\frac{i\pi z^2}{2a_+a_-} - \frac{i\pi}{24}\Big(\frac{a_+}{a_-} + \frac{a_-}{a_+}\Big) \right) \cdot \prod_{l=0}^{\infty} \frac{1 + \exp(-\frac{2\pi}{a_+}[z - i(l + \frac{1}{2})a_-])}{1 + \exp(-\frac{2\pi}{a_-}[z + i(l + \frac{1}{2})a_+])}, \quad \mathrm{Im}\,(a_-/a_+) > 0,$$
(A.57)
cf. p. 181 in Ref. [16]. We would like to point out that this formula can be quickly rederived by exploiting the above theorem on the Re z → ∞ asymptotics of g(a+ , a− ; z).

Indeed, denoting the rhs of (A.57) by $\tilde{G}(a_+, a_-; z)$, we first point out (following Shintani) that it satisfies the two A∆Es (A.7) and (A.9). (This is easily checked.) Therefore $\tilde{G}/G$ is elliptic in z with periods ia+ , ia− . Now an inspection of the zeros and poles of $\tilde{G}$ reveals that they are simple and coincide with the simple zeros and poles of the G-function. Hence $\tilde{G}/G$ is a constant, which a priori depends on a+ , a− . However, one need only invoke the g-asymptotics for z → ∞ (recall Theorem A.1 and G = exp ig), and verify that the infinite product on the rhs of (A.57) converges to 1 for a± ∈ RHP and z → ∞ (which is routine), to deduce that this constant equals 1. Therefore, we have now proved (A.57).

Since we have studied the G-function from a quite different perspective in Ref. [9], we can combine (A.57) with our previous results to obtain some remarkable consequences. In particular, using the explicit value
$$G(a_+, a_-; i(a_+ - a_-)/2) = (a_+/a_-)^{1/2},$$
(A.58)
(cf. Eq. (3.68) in Ref. [9]), we obtain an identity that amounts to the transformation formula for the Dedekind η-function. (The latter identity was used by Shintani as an ingredient of his proof.) Similarly, using G(z)G(−z) = 1 (cf. (A.4)), we deduce another (previously known) modular transformation formula. (The connection between the double gamma function and elliptic functions was already pointed out and studied in detail by Barnes in his memoir Ref. [15].)

Another bonus of the representation (A.57) is that it gives rise to an improvement on the bound (A.35) by means of which we obtained it. More precisely, (A.57) is easily seen to entail that one can take σ = 1 in (A.35), provided a+ /a− does not become positive as (a+ , a− ) varies over K+ × K− . (Note one may switch a+ and a− on the rhs of (A.57) because of (A.5).) This sharper bound cannot be directly derived from our proof of Theorem A.1. Indeed, we use the special case of equal a+ and a− in a telescoping argument, and in this special case (A.35) is false for σ = 1. (This becomes evident in the course of the proof, cf. the paragraph in Appendix C containing (C.55).)

In this connection it is important to observe that the rhs of (A.57) has no meaning for Im (a− /a+ ) = 0. In particular, it seems quite unlikely that (A.57) leads to useful consequences for a+ , a− ∈ (0, ∞), which is the critical case for quantum-mechanical applications.

Appendix B. Analyticity Properties

In this appendix we study two complex-valued functions M and P of 2N + 2 complex variables (a+ , a− , u, d), with a± varying over the (open) right half plane RHP, and u, d over C^N. These functions will be defined and shown to be jointly analytic on domains DM and DP , with DM a proper subdomain of DP . (Throughout this appendix, “domain” stands for “open, connected set”.) On DM the function P is given in terms of M and the entire function E(a+ , a− ; ·) from Appendix A (cf. (A.43)), as follows:
$$P(a_+, a_-; u, d) \equiv M(a_+, a_-; u, d) \cdot \prod_{j,k=1}^{N} E(a_+, a_-; -ia + u_j - d_k), \quad a = (a_+ + a_-)/2.$$
(B.1)
To define M we need some preparations. The definition involves contour integrals over the auxiliary variable z in the integrand
$$I(a_+, a_-; u, d, z) \equiv \prod_{j=1}^{N} G(a_+, a_-; z - u_j)/G(a_+, a_-; z - d_j),$$
(B.2)
where G is the hyperbolic gamma function from Appendix A. Thus I(a+ , a− ; u, d, ·) is meromorphic, with u-dependent poles for z = uj − ia − zkl , j = 1, . . . , N, k, l ∈ N,
(B.3)
and d-dependent poles for z = dj + ia + zkl , j = 1, . . . , N, k, l ∈ N,
(B.4)
cf. (A.13), (A.14). The u-poles (B.3) remain disjoint from the d-poles (B.4) as long as (a+ , a− , u, d) stays away from the set
$$Z \equiv \bigcup_{j,k=1}^{N} \bigcup_{l,m \in \mathbb{N}^*} H^{jk}_{lm},$$
(B.5)
where
$$H^{jk}_{lm} \equiv \{(a_+, a_-, u, d) \in \mathrm{RHP}^2 \times \mathbb{C}^{2N} \mid ila_+ + ima_- - u_j + d_k = 0\}.$$
(B.6)
Thus the union of hyperplanes Z equals the zero locus of the E-function product in (B.1), cf. Appendix A.

A first restriction on the domain of M is now that Z is excluded from it. (As will be seen, singularities arise when Z is approached.) But we need an additional restriction to ensure that the integrand I has exponential decay for z → ±∞. This requirement can be readily studied by invoking Theorem A.1. First, fixing (a+ , a− , u, d) in RHP² × C^{2N}, we note that the set
$$A_I \equiv \Big( \bigcap_{j=1}^{N} (u_j + A(a_+, a_-)) \Big) \cap \Big( \bigcap_{j=1}^{N} (d_j + A(a_+, a_-)) \Big)$$
(B.7)
contains an interval (L(a+ , a− , u, d), ∞) ⊂ R. (This is clear from the geometry of A, cf. Fig. 3.) Then the bound (A.35) is easily seen to entail
$$|I(a_+, a_-; u, d, z)| < C_+ \Big| \exp\Big( \frac{i\pi z}{a_+a_-} \sum_{j=1}^{N} (u_j - d_j) \Big) \Big|, \quad z > L(a_+, a_-, u, d),$$
(B.8)
and using (A.4) one similarly obtains
$$|I(a_+, a_-; u, d, z)| < C_- \Big| \exp\Big( -\frac{i\pi z}{a_+a_-} \sum_{j=1}^{N} (u_j - d_j) \Big) \Big|, \quad z < -L(a_+, a_-, -d, -u).$$
(B.9)
Here, the positive constants C± can be chosen uniformly for a± varying over compacts in RHP and u1 , . . . , dN varying over compacts in C. Furthermore, these upper bounds are best possible, as I satisfies lower bounds with the same characteristics. To ensure exponential decay for z → ±∞, then, we must require that (a+ , a− , u, d) belong to the domain
$$D_P \equiv \Big\{ (a_+, a_-, u, d) \in \mathrm{RHP}^2 \times \mathbb{C}^{2N} \;\Big|\; \mathrm{Im}\, \Big( \sum_{j=1}^{N} (u_j - d_j)/a_+a_- \Big) > 0 \Big\}.$$
(B.10)
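For real z the exponentials in (B.8) and (B.9) have modulus exp(∓πz Im w), with w = Σ(uj − dj )/a+ a− , so both decay in their respective directions precisely when Im w > 0. The following sketch, with hypothetical function names and arbitrary sample data, tests membership in DP and evaluates the two bound factors.

```python
import cmath
import math

def decay_rate(a_plus, a_minus, u, d):
    """Im( sum_j (u_j - d_j) / (a_+ a_-) ), the quantity constrained in (B.10)."""
    s = sum(uj - dj for uj, dj in zip(u, d))
    return (s / (a_plus * a_minus)).imag

def in_D_P(a_plus, a_minus, u, d):
    """Membership test for D_P: a_+, a_- in the right half plane and positive decay rate."""
    return (complex(a_plus).real > 0 and complex(a_minus).real > 0
            and len(u) == len(d) and decay_rate(a_plus, a_minus, u, d) > 0)

def bound_factor(a_plus, a_minus, u, d, z, sign=+1):
    """|exp(sign * i pi z sum_j(u_j - d_j) / (a_+ a_-))|; sign=+1 as in (B.8), -1 as in (B.9)."""
    s = sum(uj - dj for uj, dj in zip(u, d))
    return abs(cmath.exp(sign * 1j * math.pi * z * s / (a_plus * a_minus)))

# Arbitrary sample point with N = 2.
a_p, a_m = 1 + 0.2j, 1.5 - 0.1j
u = [0.3 + 1.0j, -0.2 + 0.5j]
d = [0.1 - 0.4j, 0.0 - 0.2j]
```

With these sample values the point lies in DP , the factor of (B.8) decreases as z grows along the positive axis, and the factor of (B.9) decreases as z runs to −∞; interchanging u and d leaves DP .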
The function P will be defined below on the simply-connected domain DP , whereas the definition domain of M reads DM ≡ DP \ Z,
(B.11)
and hence is not simply-connected. We begin by defining M on the domain
D0 ≡ {(a+ , a− , u, d) ∈ DP | Im uj < Re a, Im dj > −Re a, j = 1, . . . , N}.
(B.12)
This is a subdomain of DM , since the imaginary part restrictions ensure that the u-poles (B.3) lie in the lower half plane and the d-poles (B.4) in the upper one. Thus the u-poles are separated from the d-poles by the contour
C0 ≡ (−∞, ∞).
(B.13)
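The separation claim can be checked on sample data. The sketch below assumes that the lattice points of (A.13)-(A.14), which are not displayed in this excerpt, have the form z_kl = ik a+ + il a− with k, l ≥ 0 (this form is consistent with the hyperplanes (B.6)); all function names are hypothetical.

```python
def z_kl(a_plus, a_minus, k, l):
    """Assumed form z_kl = i k a_+ + i l a_- of the lattice points (A.13)-(A.14)."""
    return 1j * (k * a_plus + l * a_minus)

def u_poles(a_plus, a_minus, u, n=5):
    """Sample of the u-dependent poles (B.3): z = u_j - ia - z_kl, 0 <= k, l < n."""
    ia = 1j * (a_plus + a_minus) / 2
    return [uj - ia - z_kl(a_plus, a_minus, k, l)
            for uj in u for k in range(n) for l in range(n)]

def d_poles(a_plus, a_minus, d, n=5):
    """Sample of the d-dependent poles (B.4): z = d_j + ia + z_kl, 0 <= k, l < n."""
    ia = 1j * (a_plus + a_minus) / 2
    return [dj + ia + z_kl(a_plus, a_minus, k, l)
            for dj in d for k in range(n) for l in range(n)]

def in_D0_strip(a_plus, a_minus, u, d):
    """The imaginary-part restrictions of (B.12): Im u_j < Re a and Im d_j > -Re a."""
    re_a = ((a_plus + a_minus) / 2).real
    return all(uj.imag < re_a for uj in u) and all(dj.imag > -re_a for dj in d)
```

Since Im(ia + z_kl ) = Re a + Re(k a+ + l a− ) ≥ Re a under this assumption, the restrictions of (B.12) push every u-pole strictly below and every d-pole strictly above the real axis.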
Defining now
$$M(a_+, a_-; u, d) \equiv \int_{C_0} I(a_+, a_-; u, d, z)\,dz, \quad (a_+, a_-, u, d) \in D_0,$$
(B.14)
it is not hard to deduce from the above that M is jointly analytic in D0 . Indeed, the uniform decay bounds (B.8) and (B.9), combined with the analyticity properties of the G-functions in the integrand, entail that M is jointly continuous and separately analytic, hence jointly analytic by virtue of Osgood’s lemma (cf. Ref. [19]).

As will be made clear below, M has a one-valued jointly analytic extension to DM (B.11), whereas the function P , defined at first by (B.1) on DM , has a (necessarily one-valued) jointly analytic extension to DP (B.10). But we will improve considerably on these existence assertions by making the extensions more or less explicit. For the extension of P from DM to DP we will do so towards the end of the proof of Theorem B.2 below.

We proceed by defining M on DM . (It will not be obvious from the definition that M is indeed analytic on DM ; this will also be demonstrated in the course of the proof.) To this end we fix a± ∈ RHP and suppress the dependence on a+ , a− for the moment. To satisfy the decay restriction, we must require that (u, d) ∈ C^{2N} obey
$$\arg\Big( \sum_{j=1}^{N} (u_j - d_j) \Big) - \phi_+ - \phi_- \in (0, \pi),$$
(B.15)
cf. (B.10) and (A.23). Now it is important to observe that arbitrary collisions of u-poles and d-poles are compatible with the restriction (B.15). Indeed, one has arg w − φ+ − φ− ∈ (0, π), ∀w ∈ W \ {0},
(B.16)
so that the pole wedges uj + W− , j = 1, . . . , N, and dk + W+ , k = 1, . . . , N, can have arbitrarily large intersections without violating (B.15). (Recall (A.26), (A.27) and Fig. 3.) Fixing (a+ , a− , u, d) ∈ DM , however, the u-poles are disjoint from the d-poles by definition, even though the wedges may be arbitrarily entangled. Clearly, the pole wedges uj + W− intersect the closed upper half plane in K+ (u) triangles, with K+ (u) ∈ {0, . . . , N}. (Here and below, “triangle” may reduce to “point” or “line segment”, the
latter case occurring for φ+ = φ− .) We now define a family $C_R^+$, R ≥ 0, of contours, as follows:
$C_R^+$ ≡ (−∞, −R) ∪ {z ∈ C | |z| = R, Im z ≥ 0} ∪ (R, ∞), R ≥ 0.
(B.17)
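The effect of replacing the real line by a contour of the family (B.17) can be illustrated on a toy integrand. The sketch below does not use the integrand I of (B.2); it takes the hypothetical function exp(−z²)/(z − p) with a single pole p in the upper half plane, and checks by Simpson quadrature that the integral over the real line equals the integral over the contour with semicircle radius R = 2 (which passes above p) plus 2πi times the residue at p. This is just Cauchy's theorem; the cutoff and the number of quadrature nodes are ad hoc choices.

```python
import cmath
import math

def simpson(f, a, b, n=8000):
    """Composite Simpson rule for a complex-valued f on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrate_C_R_plus(f, R, cutoff=8.0):
    """Integrate f over (-cutoff, -R), the upper semicircle |z| = R, and (R, cutoff)."""
    if R == 0:
        return simpson(f, -cutoff, cutoff)
    left = simpson(f, -cutoff, -R)
    right = simpson(f, R, cutoff)
    # Semicircle from -R to R over the top: z = R exp(i t), t running from pi to 0.
    arc = simpson(lambda t: f(R * cmath.exp(1j * t)) * 1j * R * cmath.exp(1j * t),
                  math.pi, 0.0)
    return left + arc + right

pole = 0.5 + 1.0j                                  # toy pole with Im > 0 and |pole| < 2
toy = lambda z: cmath.exp(-z * z) / (z - pole)     # Gaussian factor makes the tails negligible
residue = cmath.exp(-pole * pole)                  # residue of toy at z = pole

M_flat = integrate_C_R_plus(toy, 0.0)
M_bent = integrate_C_R_plus(toy, 2.0) + 2j * math.pi * residue
assert abs(M_flat - M_bent) < 1e-6
```

Without the residue term the two contour integrals differ by 2πi times the residue, which is why a pole left below a deformed contour has to be compensated.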
Obviously, whenever K+ (u) > 0, there exists a unique R+ (u) ≥ 0 such that at least one u-pole lies on $C^+_{R_+(u)}$, but all u-poles lie below $C^+_{R_+}$ for R+ > R+ (u). Next, we modify the contours $C^+_{R_+}$, R+ > R+ (u), depending on the (fixed) point d ∈ C^N. (For K+ (u) = 0 we modify the contour C0 (B.13) according to the following prescription.) When no d-poles lie on the contour, we keep it unchanged. Whenever d-poles lie on the contour, we indent it upwards so that the pertinent poles lie below it. (Of course, the indentations should be sufficiently small so that they avoid all other d-poles.) Proceeding in this way, we obtain a well-defined family $C^+_{R_+}(d)$, R+ > R+ (u), of contours. (For K+ (u) = 0 the modified contours are denoted $C^+_0(d)$.) Below each such contour lies a certain number N+ (R+ , d) of d-poles. Denoting the residues of I(u, d, ·) at these poles by $D_1(u, d), \ldots, D_{N_+(R_+,d)}(u, d)$, we now define
$$M(u, d) \equiv \int_{C^+_{R_+}(d)} I(u, d, z)\,dz + 2\pi i \sum_{j=1}^{N_+(R_+,d)} D_j(u, d),$$
(B.18)
with R+ > R+ (u) for K+ (u) > 0 and R+ = 0 for K+ (u) = 0. It is not hard to see that M is well defined. (That is, for K+ (u) > 0 the number on the rhs does not depend on the choice of R+ ∈ (R+ (u), ∞).) Moreover, the definition (B.18) reduces to (B.14) whenever (a+ , a− , u, d) ∈ D0 . Indeed, in that case one has K+ (u) = 0 and no d-poles lie on or below C0 , so no indentations and residues occur.

It is by no means clear, though, that M thus defined is analytic in DM , and hence yields a one-valued analytic continuation of (B.14) to DM . Likewise, the existence of a jointly analytic extension to DP for P (B.1) is not a matter of course. In the following lemma we first consider the relevant analyticity properties of an arbitrary d-pole residue. The results obtained are a crucial input for the study of M and P in Theorem B.2 below.

Lemma B.1. Fix a± ∈ RHP, d ∈ C^N, and
$$p \equiv d_1 + ia + z_{k_1 l_1}, \quad k_1, l_1 \in \mathbb{N}.$$
(B.19)
Denote C with the points
$$p - ia + z_{lm}, \quad l, m \in \mathbb{N}^*,$$
(B.20)
deleted by C(p). Letting u1 , . . . , uN ∈ C(p), the function I(a+ , a− ; u, d, z) (B.2) has a pole at z = p with a residue D(u, d) that is analytic for u ∈ C(p)^N. Moreover, the function
$$\Pi(u, d) \equiv D(u, d) \prod_{j,k=1}^{N} E(-ia + u_j - d_k)$$
(B.21)
extends to an entire function of u1 , . . . , uN .
Proof. Fixing $u^0 \in C(p)^N$, the factors $G(z - u^0_j)$ in I(u, d) have no poles at z = p, so we can find ε > 0 such that the poles of these factors lie outside the circle |z − p| = ε. Then analyticity at $u = u^0$ can be read off from the representation
$$D(u, d) = \frac{1}{2\pi i} \oint_{|z-p|=\epsilon} I(u, d, z)\,dz,$$
(B.22)
which holds for all u such that the poles of the factors G(z − uj ) lie outside |z − p| = ε.

The proof of the last assertion involves more work. Let us first consider the (generic) case where the factors 1/G(z − dj ), j > 1, have neither a pole nor a zero for z = p. Assuming a+ /a− ∉ Q, the pole of 1/G(z − d1 ) is simple and gives rise to an I-residue
$$-r_{k_1 l_1}\, G(p - u_1) \prod_{j>1} G(p - u_j)\,G(-p + d_j),$$
(B.23)
cf. the paragraph below (A.22). Moreover, the function G(p − uj ) has simple poles at $u_j = p + ia + z_{kl}$, k, l ∈ N, that are matched by simple zeros of E(−ia + uj − d1 ), cf. (B.19). Therefore, entireness of Π(u, d) (B.21) is clear.

For a+ /a− ∈ Q, however, the pole at z = p need not be simple, and some of the poles of G(p − uj ) are not simple. Assuming first O(k1 l1 ) = 1, so that z = p is still a simple pole (cf. Appendix A), we obtain once again the residue (B.23). But now the factor G(p − uj ) has a pole of order O(kl) for $u_j = p + ia + z_{kl}$. Since E(−ia + uj − d1 ) has a zero of order O(k + k1 , l + l1 ) for $u_j = p + ia + z_{kl}$, we can invoke the order inequality (A.15) to deduce that no poles in uj occur. Hence Π(u, d) is once more entire.

Consider next the case O(k1 l1 ) > 1. Then the factor 1/G(z − d1 ) has a Laurent expansion
$$G(-z + d_1) = \sum_{i=1}^{O(k_1 l_1)} c(i, k_1, l_1)(z - p)^{-i} + O(1), \quad z \to p,$$
(B.24)
and we should develop the remaining factors in a power series up to order $(z - p)^{O(k_1 l_1)}$ to obtain the residue D(u, d). Explicitly, this yields
$$\sum_{i=1}^{O(k_1 l_1)} c(i, k_1, l_1) \sum_{\substack{i_1, \ldots, i_N, j_2, \ldots, j_N \\ \sum_{l=1}^{N} i_l + \sum_{l=2}^{N} j_l = i-1}} \frac{(-)^{\sum_{l=2}^{N} j_l}}{i_1! \cdots i_N!\, j_2! \cdots j_N!} \cdot G^{(i_1)}(p - u_1) \prod_{l=2}^{N} G^{(i_l)}(p - u_l)\, G^{(j_l)}(-p + d_l).$$
(B.25)
Now each differentiation increases all pole orders by 1. The pole order of the residue at uj = p + ia + zkl equals, therefore, O(kl) + O(k1 l1 ) − 1. Since E(−ia + uj − d1 ) has zero order O(k + k1 , l + l1 ) at uj = p + ia + zkl , the inequality (A.15) entails once again entireness in uj . Thus we have now proved the last assertion for all d2 , . . . , dN ∈ C such that the factors G(−z + dj ), j > 1, have neither a pole nor a zero at z = p. We are now prepared to consider the most general situation: There may be factors G(−z + dj ), j > 1, that have a pole or a zero at z = p. Assume first that no further poles occur, whereas at least one of these factors has a zero at z = p. Then the above reasoning and residue formulas are still valid, the only difference being that the pertinent
factors vanish. In particular, when O(k1 l1 ) = 1, the residue (B.23) vanishes identically, whereas (B.25) vanishes when O(k1 l1 ) − 1 is smaller than the total zero order coming from the factors G(−z +dj ) for j > 1. (Of course, in these cases I has no pole at z = p, strictly speaking.) In any case, entireness of 5(u, d) follows, since the zero orders of the E-functions involving d1 already suffice to cancel any poles. Next, we assume that additional poles occur for z = p. Eventually relabeling, we may as well assume that we have d1 + zk1 l1 = dj + zkj lj , j = 2, . . . , M ≤ N, kj , lj ∈ N,
(B.26)
whereas the remaining factors G(−z + dj ), j > M, are regular at z = p. (Again, whether or not the latter are zero at z = p plays no role for the reasoning and formulas that follow. The only significant consequence is that when the total zero order is ≥ the total pole order, the residue vanishes and I has no pole at z = p.) When we now develop the singular factors in a Laurent expansion at z = p as in (B.24), we readily obtain
$$\prod_{j=1}^{M} G(-z + d_j) = \sum_{i=1}^{O(p)} c(i, k_1, l_1, \ldots, k_M, l_M)(z - p)^{-i} + O(1), \quad z \to p,$$
(B.27)
where
$$O(p) \equiv \sum_{j=1}^{M} O(k_j l_j),$$
(B.28)
and where the coefficients are determined in terms of the coefficients c(i, kj , lj ) with i = 1, . . . , O(kj lj ). To obtain the residue D(u, d), therefore, the remaining factors should be expanded up to order $(z - p)^{O(p)}$. This yields
$$\sum_{i=1}^{O(p)} c(i, k_1, \ldots, l_M) \sum_{\substack{i_1, \ldots, i_N, j_{M+1}, \ldots, j_N \\ \sum_{l=1}^{N} i_l + \sum_{l=M+1}^{N} j_l = i-1}} \frac{(-)^{\sum_{l=M+1}^{N} j_l}}{i_1! \cdots i_N!\, j_{M+1}! \cdots j_N!} \cdot \prod_{l=1}^{M} G^{(i_l)}(p - u_l) \prod_{l=M+1}^{N} G^{(i_l)}(p - u_l)\, G^{(j_l)}(-p + d_l).$$
(B.29)
Thus we see that also in the most general case the residue D(u, d) is a sum of products of meromorphic functions depending on only one of the variables u1 , . . . , uN . Clearly, the pole order in the variable uj at $u_j = p + ia + z_{kl}$ is less than or equal to O(kl) + O(p) − 1. Now using (B.26) we may write
$$p + ia + z_{kl} = 2ia + d_m + z_{k+k_m,\, l+l_m}, \quad m = 1, \ldots, M.$$
(B.30)
Therefore, the function E(−ia + uj − dm ) has zero order O(k + km , l + lm ) at $u_j = p + ia + z_{kl}$ for m = 1, . . . , M. For the total zero order we then have, using (A.15),
$$\sum_{m=1}^{M} O(k + k_m, l + l_m) \ge \sum_{m=1}^{M} (O(kl) + O(k_m l_m) - 1) \ge M[O(kl) - 1] + O(p).$$
(B.31)
Thus this order is larger than or equal to the maximal pole order O(kl) − 1 + O(p) of the residue (B.29). Hence the function Π(u, d) (B.21) extends to an entire function of u in the most general case, too. □

Theorem B.2. The function M defined by (B.18) is jointly analytic in DM (B.11). The function P defined by (B.1) for (a+ , a− , u, d) ∈ DM has a jointly analytic extension to DP (B.10). The functions F = M, P have the following automorphy properties:
F (a+ , a− ; σ1 (u), σ2 (d)) = F (a+ , a− ; u, d), σ1 , σ2 ∈ SN ,
(B.32)
F (a+ , a− ; −d, −u) = F (a+ , a− ; u, d),
(B.33)
F (a− , a+ ; u, d) = F (a+ , a− ; u, d),
(B.34)
F (λa+ , λa− ; λu, λd) = λF (a+ , a− ; u, d), λ ∈ (0, ∞),
(B.35)
F (a+ , a− ; u1 + r, . . . , dN + r) = F (a+ , a− ; u, d), r ∈ R.
(B.36)
Proof. Clearly, the variable transformations in (B.32)–(B.36) leave the domains DP , DM and D0 invariant. For F = M and variables in D0 , the automorphy properties are clear from (B.14), (B.2) and the G-features (A.4)–(A.6). For F = P and variables in D0 they then follow from (B.1) and the E-features (A.44), (A.45). Now the automorphy relations are preserved under analytic continuation. (In fact, one may take λ ∈ C∗ and r ∈ C.) Therefore it remains to prove the first two assertions.

To this end we begin by claiming that we may just as well define M by
$$M(u, d) \equiv \int_{C^-_{R_-}(u)} I(u, d, z)\,dz - 2\pi i \sum_{j=1}^{N_-(R_-,u)} U_j(u, d),$$
(B.37)
instead of (B.18). Here, the notation will be largely clear from context: The pole wedges dj + W+ intersect the closed lower half plane in K− (d) triangles, the contour $C^-_{R_-}(u)$ is defined by taking contours $C^-_R$ in the closed lower half plane as starting point (i.e., with Im z ≤ 0 on the rhs of (B.17)), one should take R− > R− (d) in (B.37) so that all d-poles lie above $C^-_{R_-}$, indentations to avoid u-poles are downwards, and the functions $U_1, \ldots, U_{N_-(R_-,u)}$ denote the residues of I at the u-poles lying above the contour.

To see that the right-hand sides of (B.18) and (B.37) indeed yield the same number, one need only deform the contour $C^+_{R_+}(d)$ to $C^-_{R_-}(u)$ in the obvious way. Then all of the N+ (R+ , d) d-poles in (B.18) and all of the N− (R− , u) u-poles in (B.37) are passed, whilst no other d- and u-poles are encountered. Thus the residues Dj in (B.18) are canceled and the residues Uj in (B.37) appear.

With the alternative definition (B.37) of M at our disposal, we are now going to prove that M is analytic in each of the variables a+ , a− , u1 , . . . , dN . Joint analyticity is then a consequence of Hartogs’ theorem (cf. e.g. Theorem 2.2.8 in Ref. [20]).

Our task is, then, to demonstrate that for each of the 2N + 2 variables there exists a disc around an arbitrary fixed point in DM so that M is analytic in this disc. To avoid notational confusion, we denote the fixed point by $(a^0_+, a^0_-, u^0, d^0)$.

Beginning with u1 , we start from (B.18) with $u, d \to u^0, d^0$ and parameters $a^0_+, a^0_-$ understood. Now we choose ε1 > 0 sufficiently small so that the following properties hold true for u1 varying over $|u_1 - u^0_1| \le \epsilon_1$:
(i) The points $(a^0_+, a^0_-, u_1, u^0_2, \ldots, d^0_N)$ stay in DM ;
(ii) The z-poles $u_1 - ia^0 - z^0_{kl}$ stay below $C^+_{R_+}(d^0)$ for all k, l ∈ N.
(Since DM is an open set, (i) can be achieved by letting u1 vary over $|u_1 - u^0_1| \le \epsilon$ with ε > 0 small enough. Then only finitely many points $u_1 - ia^0 - z^0_{kl}$ can cross the contour $C^+_{R_+}(d^0)$ as u1 varies, so an eventual decrease of ε ensures that none of them does. Now take ε1 = ε/2.)

A moment’s thought suffices to see that M is indeed analytic in u1 as u1 varies over the disc $|u_1 - u^0_1| < \epsilon_1$: The d-poles and contour occurring in (B.18) are constant on the disc, the contour integral is analytic in u1 due to (i), (ii) and uniform convergence for z → ±∞, and each of the residues $D_j(u_1, u^0_2, \ldots, u^0_N)$ is analytic in u1 by virtue of (i). (Indeed, since $(a^0_+, a^0_-, u_1, u^0_2, \ldots, d^0_N)$ stays away from the hyperplanes $H^{1k}_{lm}$ (B.6), we may invoke Lemma B.1.)

Of course, this reasoning can be repeated for u2 , . . . , uN . To prove separate analyticity in d1 , . . . , dN , we take (B.37) as a starting point, and adapt the argument accordingly. (In particular, Lemma B.1 has an obvious analog for an arbitrary residue U (u, d) at a u-pole.)

We continue by proving analyticity in a+ , taking (B.18) with $u, d \to u^0, d^0$ as a starting point. Making the dependence on a+ , a− explicit, it can be rewritten
$$M(a^0_+, a^0_-; u^0, d^0) = \int_{C^+_{R_+}(a^0_+, a^0_-; d^0)} I(a^0_+, a^0_-; u^0, d^0, z)\,dz + \sum_{j=1}^{N_+(a^0_+, a^0_-; R_+, d^0)} \oint_{\Gamma_j(a^0_+, a^0_-; d^0)} I(a^0_+, a^0_-; u^0, d^0, z)\,dz.$$
(B.38)
Here, the contour Γj is a circle around the pertinent d-pole, whose radius εj is chosen sufficiently small so that all other d-poles and all of the u-poles lie outside Γj .

Now we choose ε+ > 0 small enough so that when a+ varies over $|a_+ - a^0_+| \le \epsilon_+$, the following properties hold true:
(i) The points $(a_+, a^0_-, u^0, d^0)$ stay in DM ;
(ii) None of the u-poles and d-poles cross the contours $C^+_{R_+}(a^0_+, a^0_-; d^0)$ and $\Gamma_j(a^0_+, a^0_-; d^0)$, with $j = 1, \ldots, N_+(a^0_+, a^0_-; R_+, d^0)$.
(As before, openness of DM ensures one can find ε > 0 such that for $|a_+ - a^0_+| \le \epsilon$ all pertinent points stay in DM . Only finitely many points can then cross the finitely many fixed contours, so by decreasing ε one may ensure none does. Now take ε+ = ε/2.)

When we now replace $I(a^0_+, a^0_-; u^0, d^0, z)$ by $I(a_+, a^0_-; u^0, d^0, z)$ on the rhs of (B.38), then the integrals are well-defined for a+ varying over the disc $|a_+ - a^0_+| < \epsilon_+$, and yield analytic functions in this disc. Thus the rhs yields a function $R(a^0_+, a^0_-, u^0, d^0; a_+)$ that is defined and analytic in a+ for $|a_+ - a^0_+| < \epsilon_+$.

We claim that the latter function coincides with $M(a_+, a^0_-; u^0, d^0)$ as already defined via (B.18) (with parameters $a_+, a^0_-$ understood). Indeed, this is plain for $a_+ = a^0_+$. Assuming next $a_+ \ne a^0_+$, the asserted equality is easily verified whenever all of the $N_+(a^0_+, a^0_-; R_+, d^0)$ d-poles entering (B.38) are simple. As we have already seen in the proof of Lemma B.1, the poles need not be simple, however. First, there may be coinciding points
$$d^0_{j_i} + z^0_{k_i l_i}, \quad i = 1, \ldots, M, \quad M \in \{2, \ldots, N\}, \quad 1 \le j_1 < \cdots < j_M \le N.$$
(B.39)
This can happen for arbitrary $a^0_\pm \in$ RHP, of course. But when $a^0_+/a^0_- \in \mathbb{Q}$, additional multiplicity may arise from $z^0_{jk}$ being equal to $z^0_{lm}$ for j ≠ l and (hence) k ≠ m. In the latter case one obtains O(j k) distinct points whenever $a_+ \ne a^0_+$. Similarly, a pair of distinct $k_{i_1}, k_{i_2}$ in (B.39) gives rise to distinct points for $a_+ \ne a^0_+$.

Whenever one of these cases occurs, $R(a^0_+, a^0_-, u^0, d^0; a_+)$ is no longer manifestly equal to $M(a_+, a^0_-; u^0, d^0)$ as defined by (B.18), as $N_+(a_+, a^0_-; R_+, d^0) > N_+(a^0_+, a^0_-; R_+, d^0)$. But we may replace all of the circle integrals entering the definition of R by 2πi times the sum of the residues of the d-poles they enclose, and when we do, we obtain the previous definition (B.18) of M, up to relabeling.

The upshot is that we have proved analyticity in a+ . The same arguments yield analyticity in a− , so we deduce joint analyticity of M in DM , as announced. It now follows from (B.1) and joint analyticity of E(a+ , a− ; z) in RHP² × C (cf. Appendix A) that P is also jointly analytic in DM .

It remains to prove that P has a jointly analytic extension to DP . Just as for M, we do so by exhibiting an extension to DP , and then proving that the extended function is in fact analytic in each of its variables, hence jointly analytic by Hartogs’ theorem.

In order to detail the extension, we fix a point $(a^0_+, a^0_-, u^0, d^0) \in D_P \cap Z$. Then we can partition the index set {1, . . . , N} into two subsets I= and I≠ , depending on whether $u^0_j$ equals one of the points
$$d^0_k + z^0_{lm}, \quad k = 1, \ldots, N, \quad l, m \in \mathbb{N}^*,$$
(B.40)
or not. Since the fixed point belongs to Z (B.5), we have |I= | ≥ 1.

Next, for j ∈ I= we let uj vary over a punctured disc $0 < |u_j - u^0_j| \le \epsilon_j$, whereas we keep $u_j = u^0_j$ for j ∈ I≠ . Choosing the εj ’s small enough, we can ensure that the points $(a^0_+, a^0_-, u, d^0)$ stay in DM .

Now for the subset of DM thus determined, we may invoke (B.18) with $d \to d^0$, provided we choose R+ larger than all of the R+ (u) involved. Doing so, it follows from previous arguments that the contour integral yields an analytic function of uj in the disc $|u_j - u^0_j| < \epsilon_j$ for all j ∈ I= . Multiplying the integral by the E-product (cf. (B.1)), we therefore obtain an analytic function of uj that vanishes for $u = u^0$. From Lemma B.1 we now deduce that the limit
$$P(u^0, d^0) \equiv \lim_{u \to u^0} P(u, d^0)$$
(B.41)
exists and equals
$$2\pi i \sum_{j=1}^{N_+(R_+, d^0)} \lim_{u \to u^0} D_j(u, d^0) \prod_{j,k=1}^{N} E(-ia^0 + u_j - d^0_k),$$
(B.42)
with Dj given by the general formula (B.29) up to relabeling and substitutions. Moreover, Lemma B.1 entails that the function P on DP thus obtained is analytic in u1 , . . . , uN . The first assertion of the theorem has already been proved, so we may invoke the automorphy property P (u, d) = P (−d, −u) for arguments in DM . It entails that the extended function P on DP is also analytic in d1 , . . . , dN . We continue by proving that it is analytic in a+ as well.
To this end we reconsider the local behavior near the above point $(a^0_+, a^0_-, u^0, d^0)$ in $D_P \cap Z$. We let $a_+$ vary over a disc
$$D_+ \equiv \{|a_+ - a^0_+| \le \epsilon_+\}, \quad \epsilon_+ > 0, \tag{B.43}$$
with $\epsilon_+$ small enough so that:
(1) The points $(a_+, a^0_-, u, d^0)$ stay in $D_P$ as $a_+$ varies over $D_+$ and $u_j$ over $|u_j - u^0_j| \le \epsilon_j/2$, $j \in I_=$;
(2) The points $(a_+, a^0_-, u, d^0)$ stay in $D_M$ as $a_+$ varies over $D_+$ and $u_j$ over $|u_j - u^0_j| = \epsilon_j/4$, $j \in I_=$.
(Since $D_P$ is open, one can achieve (1) by letting $a_+$ vary over $D_+$ (B.43) with $\epsilon_+$ sufficiently small. Now $(a^0_+, a^0_-, u, d^0)$ stays away from $Z$ as $u_j$ varies over $|u_j - u^0_j| = \epsilon_j/4$, $j \in I_=$, so an eventual shrinking of $\epsilon_+$ ensures (2).) We are now prepared to examine the function
$$\left(\frac{1}{2\pi i}\right)^{|I_=|} \prod_{j \in I_=} \oint_{|u_j - u^0_j| = \epsilon_j/4} \frac{du_j}{u_j - u^0_j}\, P(a_+, a^0_-; u, d^0), \tag{B.44}$$
with, as before, $u_j = u^0_j$ for $j \in I_{\ne}$. Due to (2), it is analytic in $a_+$ for $|a_+ - a^0_+| < \epsilon_+$. Invoking (1), the analyticity in $u_j$ already proved, and Cauchy's formula, one sees that it is also equal to $P(a_+, a^0_-; u^0, d^0)$. Hence the announced analyticity in $a_+$ follows. Analyticity in $a_-$ can be proved analogously. $\square$
Appendix C. Proof of Theorem A.1

The geometry of the asymptotics domain $A$ (A.32) plays a crucial role in the following proof. In particular, using from now on the notation
$$\lambda = \mathrm{Re}\, z, \quad \alpha = \mathrm{Im}\, z, \tag{C.1}$$
we note that we have
$$z \in A,\ \alpha > 0 \ \Rightarrow\ z - ita_\delta \in A, \quad t \in [0, \alpha/(r_\delta \cos\phi_\delta)], \quad \delta = +, -, \tag{C.2}$$
whereas $z - ita_\delta$ does not generally belong to $A$ for $t$ larger than specified. Moreover, we have
$$z \in A,\ \alpha \ge 0 \ \Rightarrow\ \lambda > a_m + \alpha|\tan\phi_\delta|, \quad \delta = +, -. \tag{C.3}$$
(These assertions easily follow from (A.32)–(A.34), cf. also Fig. 3.) Next, we observe that for $z$ in the half strip
$$HS \equiv \{\lambda > \tfrac{1}{2} a_m,\ |\alpha| < \tfrac{4}{5} r \cos\phi\}, \tag{C.4}$$
the function $g(a_+, a_-; z)$ is given by the integral (A.1). We are going to reduce the asymptotics for arbitrary $|\alpha|$-values to the asymptotics in $HS$ by exploiting the A$\Delta$Es satisfied by $g$. To be specific, the latter entail the relations
$$f(a_+, a_-, z) = f(a_+, a_-, z + i\delta a_\tau) + i\delta \ln(1 + \exp[-2\pi(z + i\delta a_\tau/2)/a_{-\tau}]), \tag{C.5}$$
with $\delta, \tau = +, -$, whenever the points $z$, $z + i\delta a_\tau$ belong to $D$ (A.28). (Indeed, for positive $a_+, a_-$ this follows from Eqs. (3.1)–(3.4) in Ref. [9], and these A$\Delta$Es can be analytically continued.)
Turning now to the details, let us choose a point $(\tilde a_+, \tilde a_-, \tilde\alpha)$ in $K_+ \times K_- \times K$ and fix $\tau \in \{+, -\}$ such that
$$\tilde r_\tau \cos\tilde\phi_\tau = \min(\tilde r_+ \cos\tilde\phi_+,\ \tilde r_- \cos\tilde\phi_-). \tag{C.6}$$
(Here and below, we use tildes whenever the quantities (A.23)–(A.25) refer to the fixed points $\tilde a_+, \tilde a_-$.) Next, we define discs
$$D_\delta(d) \equiv \{a_\delta \in \mathbb{C} \mid |a_\delta - \tilde a_\delta| \le d\, \tilde r_\delta \cos\tilde\phi_\delta\}, \quad d \in (0, 1/2), \quad \delta = +, -, \tag{C.7}$$
(so that $D_\delta(d) \subset \mathrm{RHP}$), and an interval
$$I(b) \equiv \{\alpha \in \mathbb{R} \mid |\alpha - \tilde\alpha| \le b\, \tilde r_\tau \cos\tilde\phi_\tau\}, \quad b \in (0, 1/2), \tag{C.8}$$
where $d$ and $b$ will be further restricted shortly. By virtue of compactness of $K_\delta$ and $K$, we need only show there exists a positive $C$ (depending on $D_\pm(d)$, $I(b)$ and $\sigma$) such that the bound (A.35) holds for all $a_\pm \in D_\pm(d)$ and $z \in A$ with $\mathrm{Im}\, z \in I(b)$.
We proceed by doing so, assuming $\tilde\alpha \ge 0$ from now on. (The proof for the case $\tilde\alpha \le 0$ involves a few obvious changes.) This assumption entails that there exists a unique $N \in \mathbb{N}$ such that
$$\tilde\alpha - N \tilde r_\tau \cos\tilde\phi_\tau \in \tilde r_\tau \cos\tilde\phi_\tau\, \left(-\tfrac{1}{3}, \tfrac{2}{3}\right]. \tag{C.9}$$
With $\tau$ and $N$ fixed by (C.6) and (C.9), resp., we are now going to choose $d$ and $b$ such that for all $a_\pm \in D_\pm(d)$ and all $z \in A$ with $\alpha \in I(b)$ we have
$$z - ila_\tau \in A, \quad l = 1/2, \ldots, N - 1/2, \quad (N > 0) \tag{C.10}$$
$$z - iNa_\tau \in HS. \tag{C.11}$$
It is not obvious that such a choice can be made. In order to prove its feasibility, we begin by observing that (C.10) and (C.11) are implied by the (more convenient) requirements
$$\alpha - N r_\tau \cos\phi_\tau > -\tfrac{1}{2} r_\tau \cos\phi_\tau, \tag{C.12}$$
$$\alpha - N r_\tau \cos\phi_\tau < \tfrac{3}{4} r_\tau \cos\phi_\tau, \tag{C.13}$$
$$\tfrac{3}{4} r_\tau \cos\phi_\tau < \tfrac{4}{5} r \cos\phi. \tag{C.14}$$
Indeed, for N = 0 (C.10) is vacuous, while for N > 0 (C.10) follows from (C.12), cf. the first paragraph of this appendix. Considering next (C.11), we deduce from (C.4) that it amounts to
$$\lambda + N r_\tau \sin\phi_\tau > \tfrac{1}{2} a_m, \tag{C.15}$$
$$|\alpha - N r_\tau \cos\phi_\tau| < \tfrac{4}{5} r \cos\phi. \tag{C.16}$$
Now (C.16) is clear from (C.12)–(C.14), so we are left with (C.15). Since $\lambda + i\alpha \in A$ by assumption, we have $\lambda > a_m$, so (C.15) is plain for $N = 0$. Finally, to prove that (C.15) is implied by (C.12)–(C.14) for $N > 0$, we first note that (C.9) and (C.8) entail $\alpha > 0$ on $I(b)$. Thus we can use (C.3) to infer that (C.15) is implied by
$$a_m + \alpha|\tan\phi_\tau| > \tfrac{1}{2} a_m + N r_\tau |\sin\phi_\tau|. \tag{C.17}$$
Now (C.17) is evident for $\phi_\tau = 0$, whereas for $\phi_\tau \ne 0$ it can be rewritten
$$\alpha - N r_\tau \cos\phi_\tau > -\tfrac{1}{2} a_m \cos\phi_\tau/|\sin\phi_\tau|. \tag{C.18}$$
Using (C.12), we deduce that we need only verify
$$-r_\tau \cos\phi_\tau > -a_m \cos\phi_\tau/|\sin\phi_\tau|, \tag{C.19}$$
which amounts to
$$|\sin 2\phi_\tau| < 2 a_m \cos\phi_\tau/r_\tau. \tag{C.20}$$
On account of the definition (A.31) of $a_m$, the rhs is $\ge 2$, and so (C.20) holds true. Summarizing, we have shown that (C.10) and (C.11) are implied by (C.12)–(C.14).
We continue by restricting $d$ and $b$ so that (C.12)–(C.14) hold for all $a_\pm \in D_\pm(d)$ and $\alpha \in I(b)$. First, from (C.6) we deduce
$$\tilde r_\tau \cos\tilde\phi_\tau \le \tilde r \cos\tilde\phi, \tag{C.21}$$
and (C.7) yields
$$r_\pm \cos\phi_\pm \in \tilde r_\pm \cos\tilde\phi_\pm\, [1 - d, 1 + d]. \tag{C.22}$$
Hence we have
$$r_\tau \cos\phi_\tau \le \frac{1 + d}{1 - d}\, r \cos\phi, \tag{C.23}$$
so that (C.14) can be satisfied by requiring
$$\frac{3}{4}\, \frac{1 + d}{1 - d} < \frac{4}{5}. \tag{C.24}$$
Next, we use (C.9), (C.7) and (C.8) to obtain
$$\alpha - N r_\tau \cos\phi_\tau \ge \tilde\alpha - N \tilde r_\tau \cos\tilde\phi_\tau - |\alpha - \tilde\alpha| - N|r_\tau \cos\phi_\tau - \tilde r_\tau \cos\tilde\phi_\tau| > \tilde r_\tau \cos\tilde\phi_\tau \left(-\tfrac{1}{3} - b - Nd\right) \ge -\frac{(b + Nd + 1/3)}{(1 - d)}\, r_\tau \cos\phi_\tau, \tag{C.25}$$
and, likewise,
$$\alpha - N r_\tau \cos\phi_\tau \le \frac{(b + Nd + 2/3)}{(1 - d)}\, r_\tau \cos\phi_\tau. \tag{C.26}$$
Thus we can satisfy (C.12) and (C.13) by requiring
$$(b + Nd + 1/3)/(1 - d) < 1/2, \tag{C.27}$$
$$(b + Nd + 2/3)/(1 - d) < 3/4. \tag{C.28}$$
Clearly, we can choose $d$ and $b$ such that the restrictions (C.24), (C.27) and (C.28) are satisfied. To be specific, let us take
$$d = \frac{1}{36(N + 1)}, \quad b = \frac{1}{18}. \tag{C.29}$$
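The choice (C.29) can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original proof; the function names `restrictions_hold` and `c28_ratio` are hypothetical helpers introduced here for illustration, and the finite range of $N$ tested is an arbitrary assumption.

```python
from fractions import Fraction


def restrictions_hold(N):
    """Check (C.24), (C.27), (C.28) exactly for d = 1/(36(N+1)), b = 1/18."""
    d = Fraction(1, 36 * (N + 1))
    b = Fraction(1, 18)
    c24 = Fraction(3, 4) * (1 + d) / (1 - d) < Fraction(4, 5)
    c27 = (b + N * d + Fraction(1, 3)) / (1 - d) < Fraction(1, 2)
    c28 = (b + N * d + Fraction(2, 3)) / (1 - d) < Fraction(3, 4)
    # d and b must also lie in (0, 1/2), as demanded in (C.7) and (C.8).
    in_range = 0 < d < Fraction(1, 2) and 0 < b < Fraction(1, 2)
    return c24 and c27 and c28 and in_range


def c28_ratio(N):
    """Left-hand side of (C.28) for the choice (C.29)."""
    d = Fraction(1, 36 * (N + 1))
    return (Fraction(1, 18) + N * d + Fraction(2, 3)) / (1 - d)


# Hypothetical test range; the inequalities are checked for N = 0, ..., 1000.
all_ok = all(restrictions_hold(N) for N in range(0, 1001))
print(all_ok)
```

Under (C.29) the left-hand side of (C.28) works out to $(27N + 26)/(36N + 35)$, which approaches $3/4$ from below as $N$ grows, so (C.28) appears to be the tightest of the three restrictions.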
Then the latter restrictions are easily checked, so we are now in the position to exploit (C.10) and (C.11). Indeed, consider the A$\Delta$E (C.5) satisfied by $f(a_+, a_-, z)$, choosing $\delta = -$, $a_\pm \in D_\pm(d)$ and $z \in A$ with $\alpha \in I(b)$, and fixing $\tau$ via (C.6). When we now iterate (C.5) $N$ times, we obtain
$$f(z) = f(z - iNa_\tau) - i \sum_{j=1}^{N} \ln(1 + \exp[-2\pi(z - i(j - \tfrac{1}{2})a_\tau)/a_{-\tau}]). \tag{C.30}$$
The crux is that this reduces the argument of $f$ to $HS$ (due to (C.11)), whereas the $N$ ln-terms are of the form $\ln(1 + \exp[-2\pi w/a_{-\tau}])$ with $w \in A$ (due to (C.10)). To bound the latter function, we use (A.32) to write
$$\mathrm{Re}\,(w/a_\pm) = a_m \cos\phi_\pm/r_\pm + R\cos(\chi - \phi_\pm)/r_\pm. \tag{C.31}$$
The second term on the rhs is positive, so we deduce
$$\mathrm{Re}\,(w/a_\pm) > 1, \quad w \in A. \tag{C.32}$$
Clearly, this entails
$$|\exp(-2\pi w/a_\pm)| < \exp(-2\pi), \quad w \in A, \tag{C.33}$$
so we can use the elementary estimate
$$|\ln(1 + x)| < c_1 |x|, \quad |x| < \exp(-2\pi), \tag{C.34}$$
to obtain the desired bound
$$|\ln(1 + \exp(-2\pi w/a_{-\tau}))| < c_1 \exp(-2\pi\, \mathrm{Re}\,(w/a_{-\tau})) < c_1 \exp(2\pi[|\mathrm{Im}\, w||\sin\phi_{-\tau}|/r_{-\tau} - \mathrm{Re}\, w/a_m]). \tag{C.35}$$
For each of the $N$ logarithmic terms in (C.30) we can now use (C.35) to obtain a bound of the form $C_j(a_+, a_-, \alpha) \exp(-2\pi\lambda/a_m)$, with $C_j$ continuous on $D_+(d) \times D_-(d) \times I(b)$, and hence bounded. Thus we obtain
$$|f(z) - f(z - iNa_\tau)| \le C(d, b) \exp(-2\pi\lambda/a_m), \tag{C.36}$$
for all $a_\pm \in D_\pm(d)$ and $z \in A$ with $\alpha \in I(b)$.
A moment’s thought now shows that the proof of Theorem A.1 is complete once we succeed in showing |f (a+ , a− , z)| < C(a+ , a− , σ ) exp(−2π σ λ/am ), ∀z ∈ H S,
(C.37)
with $C(a_+, a_-, \sigma)$ continuous for $a_+, a_- \in \mathrm{RHP}$. Indeed, the argument $z - iNa_\tau$ in (C.36) remains in $HS$ as $(a_+, a_-)$ varies over $D_+(d) \times D_-(d)$ and the imaginary part of $z \in A$ varies over $I(b)$, cf. (C.11). Therefore, (C.37) entails that $|f(z - iNa_\tau)|$ is majorized by $\exp(-2\pi\sigma\, \mathrm{Re}\, z/a_m)$ times a function $C_N(a_+, a_-, \sigma)$ that is also continuous for $a_+, a_- \in \mathrm{RHP}$, hence bounded on $D_+(d) \times D_-(d)$.
Our main tools for obtaining (C.37) are a comparison argument and a suitable use of the integral representation (A.1). The comparison argument exploits the fact that the function $g(a, a; z)$ admits a second integral representation from which its asymptotics can be easily established, cf. Eqs. (3.41)–(3.45) in Ref. [9]. (Here, however, the comparison parameter $(a_+ + a_-)/2$ differs from loc. cit. Eq. (3.52).)
Turning to the details, in the present case the cited formulas entail the identity
$$\frac{a^2}{a_+ a_-}\, g(a, a; z) = A(a_+, a_-, z) - \eta(a_+, a_-) + D_1(a_+, a_-, z). \tag{C.38}$$
Here, $A$ is given by (A.30), and $\eta$ and $D_1$ are defined by
$$\eta(a_+, a_-) \equiv \frac{\pi}{48}\left(2 - \frac{a_+}{a_-} - \frac{a_-}{a_+}\right), \tag{C.39}$$
$$D_1(a_+, a_-, z) \equiv \frac{a^2}{\pi a_+ a_-} \int_{\pi z/a}^{\infty} dt\, \frac{t e^{-t}}{\operatorname{sh} t}, \quad a = (a_+ + a_-)/2. \tag{C.40}$$
Choosing at first $z \in \mathbb{R}$, we now write
$$-\frac{a^2}{a_+ a_-}\, g(a, a; z) + g(a_+, a_-; z) = \frac{1}{2} \int_0^\infty \frac{dy}{y}\, I(y) \sin 2yz, \tag{C.41}$$
where
$$I(y) \equiv \frac{1}{\operatorname{sh} a_+ y\, \operatorname{sh} a_- y} - \frac{a^2}{a_+ a_- \operatorname{sh}^2 ay}. \tag{C.42}$$
Clearly, we have
$$I(y) = 4\eta/\pi + O(y^2), \quad y \to 0. \tag{C.43}$$
Thus we can use the sine transform
$$\int_0^\infty dy\, \frac{ay \sin 2yz}{\operatorname{sh}^2 ay} = \frac{\pi}{2a}\left(\operatorname{cth}(\pi z/a) - \frac{\pi z}{a \operatorname{sh}^2(\pi z/a)}\right), \quad \mathrm{Re}\, a > 0, \quad z \in \mathbb{R}, \tag{C.44}$$
(cf. e.g. Eq. (2.66) in Ref. [9]), to obtain
$$\frac{1}{2} \int_0^\infty \frac{dy}{y}\, I(y) \sin 2yz = D_2(a_+, a_-, z) + \eta(a_+, a_-) + D_3(a_+, a_-, z), \tag{C.45}$$
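with
$$D_2(z) \equiv \eta[\operatorname{cth}(\pi z/a) - 1 - (\pi z/a)/\operatorname{sh}^2(\pi z/a)], \tag{C.46}$$
$$D_3(z) \equiv \int_{-\infty}^{\infty} dy\, J(y) e^{2iyz}, \tag{C.47}$$
$$J(y) \equiv \frac{1}{4iy}\left(I(y) - \frac{4\eta a^2 y^2}{\pi \operatorname{sh}^2 ay}\right). \tag{C.48}$$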
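The small-$y$ behavior of $I(y)$ and $J(y)$ can be probed numerically. The sketch below is an illustration only, restricted to real positive $a_\pm$ (an assumption; the text allows complex values), with hypothetical helper names `eta`, `I` and `J_abs` that mirror (C.39), (C.42) and the modulus of (C.48).

```python
import math


def eta(a_plus, a_minus):
    """eta as in (C.39), for real positive parameters (assumption)."""
    return math.pi / 48 * (2 - a_plus / a_minus - a_minus / a_plus)


def I(y, a_plus, a_minus):
    """I(y) as in (C.42), with a = (a_plus + a_minus)/2."""
    a = (a_plus + a_minus) / 2
    first = 1 / (math.sinh(a_plus * y) * math.sinh(a_minus * y))
    second = a**2 / (a_plus * a_minus * math.sinh(a * y) ** 2)
    return first - second


def J_abs(y, a_plus, a_minus):
    """Modulus of J(y) as in (C.48)."""
    a = (a_plus + a_minus) / 2
    subtraction = 4 * eta(a_plus, a_minus) * a**2 * y**2 / (math.pi * math.sinh(a * y) ** 2)
    return abs(I(y, a_plus, a_minus) - subtraction) / (4 * abs(y))


# Sample values (hypothetical choice): a_+ = 1, a_- = 2.
for y in (0.1, 0.01):
    print(y, I(y, 1.0, 2.0), 4 * eta(1.0, 2.0) / math.pi, J_abs(y, 1.0, 2.0))
```

For these sample parameters $4\eta/\pi = -1/24$, and the printed values of $I(y)$ should approach it as $y$ decreases, while `J_abs` should shrink roughly in proportion to $y$.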
The point is now that $J(y)$ is $O(y)$ for $y \to 0$ and $O(y \exp(-2r|y| \cos\phi))$ for $y \to \pm\infty$. Therefore, we may take $|\mathrm{Im}\, z| < r \cos\phi$ in the integral (C.47). Combining the above formulas, we obtain
$$f(a_+, a_-, z) = \sum_{j=1}^{3} D_j(a_+, a_-, z), \tag{C.49}$$
so we need only show that $D_1$, $D_2$ and $D_3$ satisfy a bound of the form (C.37), with $C$ continuous. In order to prove this, we begin by pointing out that one has
$$r/\cos\phi \le a_m. \tag{C.50}$$
Indeed, one readily verifies the stronger inequality
$$r/\cos\phi \le (r_+/\cos\phi_+ + r_-/\cos\phi_-)/2, \tag{C.51}$$
where equality holds iff $\phi_+ = \phi_-$. Using (C.50), we now obtain
$$\mathrm{Re}\,(z/a) = (\mathrm{Re}\, z \cos\phi + \mathrm{Im}\, z \sin\phi)/r > \frac{a_m \cos\phi}{2r} - \frac{4}{5} \cos\phi |\sin\phi| > \frac{1}{2} - \frac{2}{5} = \frac{1}{10}, \quad \forall z \in HS. \tag{C.52}$$
Hence we have $a^{-1} HS \subset \mathrm{RHP}$. Also, setting
$$w \equiv \pi z/a, \tag{C.53}$$
we deduce
$$|(1 - \exp(-2w))^{-1}| < (1 - \exp(-\pi/5))^{-1} \equiv c_2, \quad \forall w \in \pi a^{-1} HS. \tag{C.54}$$
Using this numerical majorization, the desired bound on $D_2(z)$ (C.46) is readily obtained. To handle $D_1$, we need only study
$$h(w) \equiv 2 \int_w^\infty t e^{-2t}/(1 - e^{-2t})\, dt, \quad w \in \pi a^{-1} HS. \tag{C.55}$$
Using the elementary integral
$$\int_R^\infty y e^{-y}\, dy = (R + 1) e^{-R}, \tag{C.56}$$
we first rewrite $h(w)$ as
$$h(w) = 2 \int_w^\infty t e^{-4t}/(1 - e^{-2t})\, dt + w e^{-2w} + e^{-2w}/2. \tag{C.57}$$
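The rewriting (C.57) rests on splitting $e^{-2t}/(1 - e^{-2t}) = e^{-2t} + e^{-4t}/(1 - e^{-2t})$ and applying (C.56) with $y = 2t$. As a rough numerical illustration (a sketch for real positive $w$ only, which is an assumption since $w$ is complex in the text), one can compare both sides with a composite Simpson rule; `simpson`, `h_direct`, `h_split` and the truncation `cutoff` are hypothetical names and choices made here.

```python
import math


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def h_direct(w, cutoff=40.0):
    """h(w) from (C.55), with the infinite range truncated at w + cutoff."""
    return 2 * simpson(lambda t: t * math.exp(-2 * t) / (1 - math.exp(-2 * t)), w, w + cutoff)


def h_split(w, cutoff=40.0):
    """Right-hand side of (C.57), with the same truncation."""
    tail = 2 * simpson(lambda t: t * math.exp(-4 * t) / (1 - math.exp(-2 * t)), w, w + cutoff)
    return tail + w * math.exp(-2 * w) + math.exp(-2 * w) / 2


# (C.56) for a sample value R = 1.3 (hypothetical choice).
lhs_c56 = simpson(lambda y: y * math.exp(-y), 1.3, 61.3)
rhs_c56 = (1.3 + 1) * math.exp(-1.3)
print(lhs_c56, rhs_c56)
print(h_direct(0.5), h_split(0.5))
```

The truncation error is exponentially small for the cutoffs used, so the two printed pairs should agree to many digits.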
To bound the integral, we put
$$w = s e^{i\eta}, \quad s \equiv |w|, \tag{C.58}$$
and change the contour to the ray $x e^{i\eta}$, $x \ge s$. (The 'arc from $\infty$ to $e^{i\eta}\infty$' yields zero contribution.) Then we obtain the integral
$$e^{i\eta} \int_s^\infty x \exp(-4x e^{i\eta})(1 - \exp(-2x e^{i\eta}))^{-1}\, dx. \tag{C.59}$$
Using (C.54), we infer that its modulus is majorized by
$$c_2 \int_s^\infty x \exp(-4x \cos\eta)\, dx = c_2\, \frac{(4s \cos\eta + 1)}{16 \cos^2\eta} \exp(-4s \cos\eta), \tag{C.60}$$
cf. (C.56). Now we have
$$s \cos\eta = \mathrm{Re}\, w = \mathrm{Re}\,(\pi z/a) = \pi r^{-1}(\lambda \cos\phi + \alpha \sin\phi). \tag{C.61}$$
Therefore, the first term in (C.57) gives rise to a term in $D_1(a_+, a_-, z)$ that is bounded by the rhs of (C.37) for all $\sigma < 2$. Obviously, this holds true for the third term in (C.57) as well, provided we take $\sigma \le 1$. For the second one, however, we must take $\sigma < 1$ so as to counterbalance the factor linear in $w$.
It remains to prove (C.37) with $f$ replaced by $D_3$. To this end we make the key observation that
$$4i D_3(z) = \exp(-2dz) \int_{-\infty}^{\infty} du\, M(z, d, u), \tag{C.62}$$
with
$$M(z, d, u) \equiv \frac{e^{2iuz}}{u + id}\left(\frac{1}{\operatorname{sh} a_+(u + id)\, \operatorname{sh} a_-(u + id)} - \frac{a^2}{a_+ a_- \operatorname{sh}^2 a(u + id)}\left[1 - \frac{1}{12}(a_+ - a_-)^2 (u + id)^2\right]\right), \tag{C.63}$$
for all $z \in HS$ and $d \in [0, \pi/a_m)$. Indeed, (C.62) is evident for $d = 0$. To check independence of $d$, we should first of all verify that the integrand $M$ is non-singular for $u \in \mathbb{R}$ and $d \in (0, \pi/a_m)$, and we proceed by doing so. First, the factor $\operatorname{sh} a(u + id)$ vanishes iff $u + id$ belongs to $i\pi a^{-1} \mathbb{Z}$, which amounts to
$$u = k\pi \sin\phi/r, \quad d = k\pi \cos\phi/r, \quad k \in \mathbb{Z}. \tag{C.64}$$
Since we have $d \in (0, \pi/a_m)$ and $\pi/a_m \le \pi \cos\phi/r$ (recall (C.50)), it follows that $\operatorname{sh} a(u + id)$ is non-zero. Likewise, since $1/a_m \le \cos\phi_\pm/r_\pm$, we deduce $\operatorname{sh} a_\pm(u + id) \ne 0$. Therefore, the function (C.63) has no poles on the contour in (C.62).
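Next, we note
$$M(z, d, u) = O(u \exp[2(|\alpha| - r \cos\phi)|u|]), \quad u \to \pm\infty. \tag{C.65}$$
Now for $z \in HS$ we have $5|\alpha| \le 4r \cos\phi$, so the integral (C.62) is well defined, and we may shift contours to deduce independence of $d \in [0, \pi/a_m)$. Fixing $\sigma \in (1/2, 1)$, we now let
$$d = \sigma\pi/a_m. \tag{C.66}$$
Then we obtain the bound (with $\lambda + i\alpha \in HS$)
$$|D_3(a_+, a_-, \lambda + i\alpha)| < \exp(-2\pi\sigma\lambda/a_m) \int_{-\infty}^{\infty} du\, \frac{\exp(\frac{8}{5} r|u| \cos\phi)}{(u^2 + d^2)^{1/2}} \cdot \left(|\operatorname{sh} a_+(u + id)\, \operatorname{sh} a_-(u + id)|^{-1} + \frac{|a^2|}{|a_+ a_- \operatorname{sh}^2 a(u + id)|} \cdot \left(1 + \frac{1}{12}|a_+ - a_-|^2 (u^2 + d^2)\right)\right). \tag{C.67}$$
Next, we split up the integration region $\mathbb{R}$ into three subsets: Letting
$$u_0 \equiv \frac{1}{2} + \frac{\pi}{a_m} \max(|\tan\phi_+|, |\tan\phi_-|), \tag{C.68}$$
we consider successively $u > u_0$, $u < -u_0$ and $u \in [-u_0, u_0]$. For $u > u_0$ we have
$$|\mathrm{Re}\,(a(u + id))| = r|u \cos\phi - d \sin\phi| \ge r(u_0 \cos\phi - d|\sin\phi|) \ge r\left(\frac{1}{2} \cos\phi + \frac{\pi}{a_m}|\sin\phi| - \frac{\sigma\pi}{a_m}|\sin\phi|\right) \ge \frac{1}{2} r \cos\phi. \tag{C.69}$$
Likewise, we obtain
$$|\mathrm{Re}\,(a_\pm(u + id))| \ge \frac{1}{2} r_\pm \cos\phi_\pm, \quad u > u_0. \tag{C.70}$$
Therefore, the integral over $\{u > u_0\}$ can be majorized by
$$4 \int_{1/2}^{\infty} \frac{du}{u} \exp(\tfrac{8}{5} ru \cos\phi)\, |\exp(-2a(u + id))| \cdot \left(\prod_{\delta = +, -} [1 - \exp(-r_\delta \cos\phi_\delta)]^{-1} + \frac{|a^2|}{|a_+ a_-|} [1 - \exp(-r \cos\phi)]^{-2} \cdot \left(1 + \frac{1}{12}|a_+ - a_-|^2 \left(u^2 + \frac{\pi^2}{a_m^2}\right)\right)\right). \tag{C.71}$$
Now we have
$$|\exp(-2a(u + id))| = \exp(-2ru \cos\phi + 2r\sigma\pi \sin\phi/a_m) < \exp(-2ru \cos\phi) \exp(\pi|\sin 2\phi|). \tag{C.72}$$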
Thus, (C.71) is bounded above by
$$\int_{1/2}^{\infty} \frac{du}{u} \exp(-\tfrac{2}{5} ru \cos\phi)(C_1(a_+, a_-) + C_2(a_+, a_-) u^2), \tag{C.73}$$
with $C_j(a_+, a_-)$ continuous and positive. Finally, (C.73) can be bounded by
$$\int_0^\infty du\, \exp(-\tfrac{2}{5} ru \cos\phi)(2C_1(a_+, a_-) + u C_2(a_+, a_-)) = 5C_1(a_+, a_-)/(r \cos\phi) + 25C_2(a_+, a_-)/(4r^2 \cos^2\phi) \equiv C_>(a_+, a_-), \tag{C.74}$$
with $C_>(a_+, a_-)$ continuous and positive. Likewise, the integral over $\{u < -u_0\}$ in (C.67) can be majorized by a continuous, positive function $C_<(a_+, a_-)$.
Thus we are left with the integral over $[-u_0, u_0]$. On this interval we bound the $u$-dependent terms in (C.67) by their maxima. Then the integral is bounded by
$$2u_0 d^{-1} \exp(\tfrac{8}{5} ru_0 \cos\phi)\left(\max |\operatorname{sh} a_+(u + id)\, \operatorname{sh} a_-(u + id)|^{-1} + \frac{|a^2|}{|a_+ a_-|} \max |\operatorname{sh} a(u + id)|^{-2} \left(1 + \frac{1}{12}|a_+ - a_-|^2 (u_0^2 + d^2)\right)\right). \tag{C.75}$$
Now we have already seen that the sh-factors stay away from 0 for $d \in \pi a_m^{-1}(1/2, 1)$ and $u \in \mathbb{R}$. But as $\sigma \uparrow 1$ in (C.66), the distance to at least one of the zeros goes to 0 (cf. the paragraph containing (C.64)). Therefore, (C.75) yields a function $C_{[\,]}(a_+, a_-, \sigma)$ which is continuous in $a_+, a_-$ for $\sigma \in (1/2, 1)$, but which diverges as $\sigma \uparrow 1$. The upshot is that we obtain the desired bound
$$|D_3(a_+, a_-, z)| < C_3(a_+, a_-, \sigma) \exp(-2\pi\sigma\lambda/a_m), \quad \forall z \in HS, \tag{C.76}$$
where
$$C_3 \equiv C_> + C_< + C_{[\,]} \tag{C.77}$$
is continuous in $a_+, a_-$. Therefore, the estimate (C.37) on $f(a_+, a_-, z)$ follows, which completes the proof of Theorem A.1.

Acknowledgements. Some of the results of this paper were first reported in our Banff Summer School lecture notes (1994). We would like to thank the organizers G. Semenoff and L. Vinet for their invitation. We also outlined the main results in a lecture series at the Workshop on Invariant Differential Operators, Special Functions and Representation Theory (R.I.M.S., Kyoto, 1997). We are indebted to the Organizing Committee, in particular to M. Kashiwara and T. Oshima, for inviting us, and to the R.I.M.S. for its hospitality and financial support. This paper was completed during our stay at the Universidad de Chile in Santiago. We gratefully acknowledge the financial support of Fundación Andes, and the fine hospitality and support provided by the Departamento de Matemáticas de la Facultad de Ciencias. We also thank J.F. van Diejen for his invitation and for useful discussions.
References
1. Gasper, G., Rahman, M.: Basic hypergeometric series. Encyclopedia of Mathematics and its Applications, 35, Cambridge: Cambridge Univ. Press, 1990
2. Askey, R., Wilson, J.: Some basic hypergeometric orthogonal polynomials that generalize Jacobi polynomials. Mem. Am. Math. Soc. 319, (1985)
3. Grünbaum, F.A., Haine, L.: Some functions that generalize the Askey–Wilson polynomials. Commun. Math. Phys. 184, 173–202 (1997)
4. Nishizawa, M., Ueno, K.: Integral solutions of q-difference equations of the hypergeometric type with |q| = 1. To appear in Proceedings of the 1996 Workshop “Infinite analysis”, IIAS, Japan
5. Ruijsenaars, S.N.M.: Systems of Calogero–Moser type. In: Proceedings of the 1994 Banff summer school “Particles and fields”, eds. G. Semenoff, L. Vinet. CRM Series in Mathematical Physics, Berlin–Heidelberg–New York: Springer-Verlag, 1999, pp. 251–352
6. Ruijsenaars, S.N.M., Schneider, H.: A new class of integrable systems and its relation to solitons. Ann. Phys. (N.Y.) 170, 370–405 (1986)
7. Ruijsenaars, S.N.M.: Complete integrability of relativistic Calogero–Moser systems and elliptic function identities. Commun. Math. Phys. 110, 191–213 (1987)
8. Whittaker, E.T., Watson, G.N.: A course of modern analysis. Cambridge: Cambridge Univ. Press, 1973
9. Ruijsenaars, S.N.M.: First order analytic difference equations and integrable quantum systems. J. Math. Phys. 38, 1069–1146 (1997)
10. Watson, G.N.: The continuation of functions defined by generalized hypergeometric series. Trans. Camb. Phil. Soc. 21, 281–299 (1910)
11. Atakishiyev, N.M., Suslov, S.K.: Difference hypergeometric functions. In: Progress in Approximation Theory, A.A. Gonchar and E.B. Saff, eds., Berlin–Heidelberg–New York: Springer-Verlag, 1992, pp. 1–35
12. Kurokawa, N.: Multiple sine functions and Selberg zeta functions. Proc. Jap. Acad., Ser. A 67, 61–64 (1991)
13. Kurokawa, N.: Gamma factors and Plancherel measures. Proc. Jap. Acad., Ser. A 68, 256–260 (1992)
14. Barnes, E.W.: The genesis of the double gamma functions. Proc. London Math. Soc. 31, 358–381 (1900)
15. Barnes, E.W.: The theory of the double gamma function. Phil. Trans. Royal Soc. (A) 196, 265–387 (1901)
16. Shintani, T.: On a Kronecker limit formula for real quadratic fields. J. Fac. Sci. Univ. Tokyo, Sect. 1A 24, 167–199 (1977)
17. Nishizawa, M.: On a q-analogue of the multiple gamma functions. Lett. Math. Phys. 37, 201–209 (1996)
18. Ueno, K., Nishizawa, M.: The multiple gamma function and its q-analogue. In: Quantum groups and quantum spaces, Banach Center Publications Vol. 40, Inst. of Math., Polish Ac. Sciences, Warszawa, 1997, pp. 429–441
19. Gunning, R.C., Rossi, H.: Analytic functions of several complex variables. Englewood Cliffs, N.J.: Prentice-Hall, 1965
20. Hörmander, L.: An introduction to complex analysis in several variables. London: Van Nostrand, 1966

Communicated by T. Miwa
Commun. Math. Phys. 206, 691 – 736 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
The Action of Outer Automorphisms on Bundles of Chiral Blocks
Jürgen Fuchs1,*, Christoph Schweigert2,*
1 Max-Planck-Institut für Mathematik, Gottfried-Claren-Str. 26, 53225 Bonn, Germany
2 CERN, 1211 Genève 23, Switzerland
Received: 11 May 1998 / Accepted: 17 April 1999
Abstract: On the bundles of WZW chiral blocks over the moduli space of a punctured rational curve we construct isomorphisms that implement the action of outer automorphisms of the underlying affine Lie algebra. These bundle-isomorphisms respect the Knizhnik--Zamolodchikov connection and have finite order. When all primary fields are fixed points, the isomorphisms are endomorphisms; in this case, the bundle of chiral blocks is typically a reducible vector bundle. A conjecture for the trace of such endomorphisms is presented; the proposed relation generalizes the Verlinde formula. Our results have applications to conformal field theories based on non-simply connected groups and to the classification of boundary conditions in such theories.
* Current address: Institut für Theoretische Physik, ETH-Hönggerberg, 8093 Zürich, Switzerland. E-mail: [email protected]; [email protected]

1. Introduction and Summary

1.1. Chiral blocks. Spaces of chiral blocks [1] are of considerable interest both in physics and in mathematics. In this paper we construct a natural class of linear maps on the spaces of chiral blocks of WZW conformal field theories and investigate their properties. Recall that a WZW theory is a two-dimensional conformal field theory which is characterized by an untwisted affine Lie algebra $\mathfrak{g}$ and a positive integer, the level. The chiral block spaces of a rational conformal field theory are finite-dimensional complex vector spaces. In the WZW case, one associates to each complex curve $C$ of genus $g$ with $m$ distinct smooth marked points and to each $m$-tuple $\Lambda$ of integrable weights of $\mathfrak{g}$ at level $k^\vee$ a finite-dimensional vector space $B_\Lambda$ of chiral blocks. These spaces are the building blocks for the correlation functions of WZW theories both on surfaces with and without boundaries, and on orientable as well as unorientable surfaces (see e.g. [2]). Moreover, since WZW theories underlie the construction of various other classes of two-dimensional conformal field theories, the spaces of WZW chiral blocks
enter in the description of the correlation functions of those theories as well. They also form the space of physical states in certain three-dimensional topological field theories [3]. Finally they are of interest to algebraic geometers because via the Borel--Weil--Bott theorem they are closely related to spaces of holomorphic sections in line bundles over moduli spaces of flat connections (for a review, see e.g. [4]).
It is necessary (and rather instructive) not to restrict one's attention to the case of fixed insertion points, but rather to vary both the moduli of the curve $C$ and the position of the marked points. (In conformal field theory this accounts e.g. for the dependence of correlation functions on the positions of the fields.) This way one obtains a complex vector space $B_\Lambda$ over each point in the moduli space $M_{g,m}$ of complex curves of genus $g$ with $m$ marked smooth points. It is known (see e.g. [5,6]) that these vector spaces fit together into a vector bundle $B_\Lambda$ over $M_{g,m}$, and that this vector bundle is endowed with a projectively flat connection, the Knizhnik--Zamolodchikov connection. (Sometimes the term “chiral block” is reserved for sections in the bundle $B_\Lambda$ that are flat with respect to the Knizhnik--Zamolodchikov connection.) In this paper, we will restrict ourselves mainly to curves of genus $g = 0$, but allow for an arbitrary number $m \ge 2$ of marked points. For brevity, we will denote the moduli space of $m$ distinct points on $\mathbb{P}^1$ simply by $M_m$.

1.2. Automorphisms. It has been known for a long time that certain outer automorphisms of the affine Lie algebra $\mathfrak{g}$ that underlies a WZW theory play a crucial role in the construction of several classes of conformal field theories. The automorphisms in question are in one-to-one correspondence with the elements of the center $Z$ of the relevant compact Lie group, i.e. of the real compact connected and simply connected Lie group $G$ whose Lie algebra is the compact real form of the horizontal subalgebra $\bar{\mathfrak{g}}$ of the affine Lie algebra. These automorphisms underlie the existence of non-trivial modular invariants, both of extension and of automorphism type, which describe the “non-diagonal” WZW theories that are based [7] on non-simply connected quotients of the covering group $G$. (For the modular matrices of such theories see [8].) The same class of automorphisms also plays a crucial role in the construction of “gauged” WZW models [9–11], i.e. coset conformal field theories [12]. Let us remark that the structures implied by such automorphisms have been generalized for arbitrary rational conformal field theories in the theory of so-called simple currents (for a review, see [14]).
It is also worth mentioning that in the representation theoretic approach to WZW chiral blocks [6,13] one heavily relies on the loop construction of the affine algebra $\mathfrak{g}$. That description treats the simple roots of the horizontal subalgebra $\bar{\mathfrak{g}}$ and the additional simple root of $\mathfrak{g}$ on a rather different footing and accordingly does not reflect the full structure of $\mathfrak{g}$ in a symmetric way. In particular the symmetries of the Dynkin diagram of $\mathfrak{g}$ which are associated to the (classes of) outer automorphisms of $\mathfrak{g}$ that correspond to the center $Z$ do not possess a particularly nice realization in the loop construction.

1.3. Main results. The basic purpose of this paper is to merge the two issues of chiral block spaces and outer automorphisms of affine Lie algebras. Thus our goal is to implement and describe the action of the outer automorphisms corresponding to the center $Z$ of $G$ on the spaces of chiral blocks of WZW theories. This constitutes in particular a necessary input for the definition of chiral blocks for the theories based on non-simply connected quotient groups of $G$. It also turns out to be an important ingredient in the
description of the possible boundary conditions of WZW theories with non-diagonal partition function [15]. Moreover, our results are also relevant for the quest of establishing a Verlinde formula (in the sense of algebraic geometry) for moduli spaces of flat connections with non-simply connected structure group. (This might also pave the way to a rigorous proof of the Verlinde formula for more general conformal field theory models, such as coset conformal field theories.)
Let us now summarize informally the main results of this paper. Consider the direct sum of $m$ copies of the coweight lattice of $\bar{\mathfrak{g}}$ and denote by $\Gamma_w$ the subgroup consisting of those $m$-tuples $\bar\mu = (\bar\mu_i)$ which sum to zero, $\sum_{i=1}^{m} \bar\mu_i = 0$. To each $\bar\mu \in \Gamma_w$ we associate a map $\omega^\star$ on the set of integrable weights at level $k^\vee$ and, for each $m$-tuple $\Lambda$ of integrable weights, a map
$$\Theta^\star_{\bar\mu}:\ B_\Lambda \to B_{\omega^\star\Lambda}$$
which is an isomorphism of the bundles $B_\Lambda$ of chiral blocks over the moduli space $M_m$ and which projects to the identity on $M_m$. We show that this map respects the Knizhnik--Zamolodchikov connections on $B_\Lambda$ and $B_{\omega^\star\Lambda}$. Finally, we show that the map $\Theta^\star_{\bar\mu}$ is the identity map on $B_\Lambda$ whenever the collection $\bar\mu$ of coweight lattice elements contains only elements of the coroot lattice of $\bar{\mathfrak{g}}$. We denote the sublattice of $\Gamma_w$ that consists of such $m$-tuples $\bar\mu$ by $\Gamma$. (With these properties the maps $\Theta^\star_{\bar\mu}$ constitute prototypical examples for so-called implementable automorphisms of the fusion rules. For the role of such automorphisms in the determination of possible boundary conditions for the conformal field theory, see [2,16].)
Let us now discuss a few implications of the bundle-isomorphism $\Theta^\star_{\bar\mu}$. First, when $\omega^\star\Lambda \ne \Lambda$, then $\Theta^\star_{\bar\mu}$ can be used to identify the two bundles $B_\Lambda$ and $B_{\omega^\star\Lambda}$. On the other hand, suppose that the stabilizer $\Gamma_\Lambda$, i.e. the subgroup of $\Gamma_w$ that consists of all elements which satisfy $\omega^\star\Lambda = \Lambda$, is strictly larger than $\Gamma$. In this case the finite abelian group $\Gamma_\Lambda/\Gamma$ acts projectively by endomorphisms on each fiber. We then decompose each fiber into the direct sum of eigenspaces under this action. It is now crucial to note that the action of this finite group commutes with the Knizhnik--Zamolodchikov connection; this property implies that the eigenspaces fit together into sub-vector bundles of $B_\Lambda$. In other words, the bundle of chiral blocks is in this case a reducible vector bundle. Moreover, each of these subbundles comes equipped with its own Knizhnik--Zamolodchikov connection, which is just the restriction of the Knizhnik--Zamolodchikov connection of $B_\Lambda$ to the subbundle. This implies in particular that the chiral conformal field theory that underlies the WZW model with a modular invariant of extension type possesses a Knizhnik--Zamolodchikov connection as well.
This pattern – identification, respectively splitting of fixed points into eigenspaces – is strongly reminiscent of what has been found in coset conformal field theories [12]. It is the deeper reason why similar formulæ hold [8,12] for the so-called resolution of fixed points in the definition of coset conformal field theories and in simple current modular invariants.
The rank of the subbundles is the same for all values of the moduli. This rank is related via Fourier transformation over the subgroup $(\Gamma_\Lambda/\Gamma)^\circ$ of regular elements of the finite abelian group $\Gamma_\Lambda/\Gamma$ to the traces of $\Theta^\star_{\bar\mu}$ on any fiber. Thus we can conclude that these traces do not depend on the moduli either. We present a conjecture for a formula for these traces and support it by several consistency checks. We expect that several features of our construction can be generalized to curves of arbitrary genus. So far we do not know how to prove our trace formulæ. Some traces on spaces of chiral blocks
on higher genus surfaces with only the vacuum module involved have, however, been computed [17] in some special cases, and in these cases they agree with our formula.
1.4. Organization of the paper. The rest of this paper is organized as follows. In Sect. 2 we introduce (see Eq. (2.6)) a class of automorphisms of an untwisted affine Lie algebra $\mathfrak{g}$, which we call multi-shift automorphisms. These automorphisms can be extended uniquely to the semidirect sum of $\mathfrak{g}$ with the Virasoro algebra; they depend on a collection of $m$ elements $\bar\mu_s$ of the coweight lattice, and also on a collection of pairwise distinct complex numbers $z_s$. We explain how such automorphisms are implemented on irreducible highest weight modules over the affine Lie algebra and determine their class modulo inner automorphisms.
In Sect. 3 we introduce the algebra $\bar{\mathfrak{g}} \otimes \mathcal{F}$ of Lie algebra valued meromorphic functions over $\mathbb{P}^1$ with possible finite order poles at $m$ prescribed different points, the so-called block algebra. The Lie algebra $\bar{\mathfrak{g}} \otimes \mathcal{F}$ can be regarded as a subalgebra of $\mathfrak{g}^m$, the direct sum of $m$ copies of the affine Lie algebra $\mathfrak{g}$ with all centers identified. We observe that when the complex numbers $z_s$ that enter in the definition of the multi-shift automorphisms are interpreted as the coordinates of points on $\mathbb{P}^1$, then a tensor product of the automorphisms of Sect. 2 restricts to an automorphism of the block algebra, see formula (3.9). One should regard this latter automorphism as a global object from which the former automorphisms are obtained by local expansions.
In Sect. 4 we study the special case of multi-shift automorphisms that correspond to elements in the coroot lattice. We show that they are inner automorphisms with respect to the block algebra.
In Sect. 5 we combine the implementation of these automorphisms on the modules of the affine Lie algebra at all insertion points, so as to obtain an implementation on the modules over the block algebra that arise from tensor products of the affine modules. We show that for fixed values of the insertion points these maps induce isomorphisms on the spaces of chiral blocks and prove that (upon suitably fixing the phases of the implementing maps) all these isomorphisms of chiral block spaces have finite order.
Up to this point the moduli of the problem, i.e. the positions of the punctures, are kept fixed in all our considerations. In Sect. 6 we proceed to consider the dependence of our constructions on these moduli. Our guiding principle will be that the isomorphisms on the spaces of chiral blocks are compatible with the Knizhnik--Zamolodchikov connection. We show that the implementation on the modules can indeed be chosen in such a way that the Knizhnik--Zamolodchikov connection is preserved and that for each value of the moduli the implementation on the module provides the same projective representation of the group $\Gamma_w$. We then show that for all values of the moduli the action of a multi-shift automorphism on the chiral blocks is the identity map when all coweights are actually coroots, so that in this case the implementation $\Theta^\star_{\bar\mu}$ of the automorphism on the bundle $B$ of chiral blocks has finite order.
Finally, in Sect. 7 we exploit these results to formulate a conjecture for a general formula (Eq. (7.4)) for the traces of the implementing maps $\Theta^\star_{\bar\mu}$. This conjecture suggests a generalization to curves of higher genus – see (7.7) – which is consistent with factorization. The expression (7.7) for the traces constitutes a generalization of the Verlinde formula. Surprisingly enough we also observe that in all examples that have been checked numerically, the traces turn out to be integers.
Details of various calculations are deferred to appendices.
Action of Outer Automorphisms on Bundles of Chiral Blocks
2. Action of $\Gamma_w$ on Affine Lie Algebras and Their Modules

We start our investigations by introducing a certain family of automorphisms of affine Lie algebras which we call multi-shift automorphisms. To determine the class of a multi-shift automorphism modulo inner automorphisms, we study its implementation on irreducible highest weight modules.
2.1. Multi-shift automorphisms of affine Lie algebras. In this subsection we introduce a distinguished class of automorphisms of untwisted affine Lie algebras $\mathfrak{g}$. Such a Lie algebra $\mathfrak{g}$ can be regarded as a centrally extended loop algebra, i.e.$^1$
\[
\mathfrak{g} := \bar{\mathfrak{g}} \otimes \mathbb{C}((t)) \oplus \mathbb{C}K,
\tag{2.1}
\]
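where $\bar{\mathfrak{g}}$ is a finite-dimensional simple Lie algebra and $t$ an indeterminate. We identify $\bar{\mathfrak{g}}$ with the zero mode subalgebra of $\mathfrak{g}$, i.e. $\bar{\mathfrak{g}} \ni \bar x \equiv \bar x \otimes t^0 \in \mathfrak{g}$, and refer to it as the horizontal subalgebra of $\mathfrak{g}$; the rank of $\bar{\mathfrak{g}}$ will be denoted by $r$.

We may already remark at this point that our main interest in the multi-shift automorphisms of the affine Lie algebra $\mathfrak{g}$ stems from the fact that collections of $m$ such automorphisms combine in a natural way to automorphisms of $m$ copies of $\mathfrak{g}$ with identified centers. The latter automorphisms, in turn, restrict to automorphisms of an important class of infinite-dimensional subalgebras, the so-called block algebras $\bar{\mathfrak{g}}\otimes\mathcal{F}$; those automorphisms of $\bar{\mathfrak{g}}\otimes\mathcal{F}$ will be introduced in Sect. 3. In the present section, however, we study the automorphisms of $\mathfrak{g}$ in their own right.

We will describe the automorphisms of the algebra $\mathfrak{g}$ by their action on the canonical central element $K$ and on the elements $H^i\otimes f$ and $E^{\bar\alpha}\otimes f$ of $\mathfrak{g}$, where $f \in \mathbb{C}((t))$ is arbitrary and $\{H^i \mid i = 1, 2, \dots, r\} \cup \{E^{\bar\alpha} \mid \bar\alpha \text{ a } \bar{\mathfrak{g}}\text{-root}\}$ is a Cartan–Weyl basis of $\bar{\mathfrak{g}}$. (By choosing $f = t^n$ with $n \in \mathbb{Z}$ one would obtain a (topological) basis of $\mathfrak{g}$, but for the present purposes it is more convenient to write the formulæ for general $f \in \mathbb{C}((t))$.) In terms of these elements, the relations of the Lie algebra $\mathfrak{g}$ read $[K, \cdot\,] = 0$ and
\[
\begin{aligned}
{}[H^i\otimes f, H^j\otimes g] &= G^{ij}\, \mathrm{Res}(df\, g)\, K, \\
[H^i\otimes f, E^{\bar\alpha}\otimes g] &= \bar\alpha^i\, E^{\bar\alpha}\otimes fg, \\
[E^{\bar\alpha}\otimes f, E^{\bar\beta}\otimes g] &= e_{\bar\alpha,\bar\beta}\, E^{\bar\alpha+\bar\beta}\otimes fg \qquad \text{for } \bar\alpha+\bar\beta \text{ a } \bar{\mathfrak{g}}\text{-root}, \\
[E^{\bar\alpha}\otimes f, E^{-\bar\alpha}\otimes g] &= \sum_{i=1}^{r} \bar\alpha_i\, H^i\otimes fg + \frac{2}{(\bar\alpha,\bar\alpha)}\, \mathrm{Res}(df\, g)\, K.
\end{aligned}
\tag{2.2}
\]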
Here $d \equiv \partial/\partial t$, and the residue $\mathrm{Res} \equiv \mathrm{Res}_0$ of a formal Laurent series in $t$ is defined by
\[
\mathrm{Res}\Big(\sum_{n=n_0}^{\infty} c_n t^n\Big) = c_{-1};
\tag{2.3}
\]

1 Often it is the subalgebra $\mathring{\mathfrak{g}} = \bar{\mathfrak{g}} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}K$ of $\mathfrak{g}$, generated when one allows only for Laurent polynomials rather than arbitrary Laurent series, that one refers to as the affine Lie algebra. However, in conformal field theory one deals with $\mathring{\mathfrak{g}}$-representations for which every vector of the associated module (representation space) is annihilated by all but finitely many generators of the subalgebra $\mathring{\mathfrak{g}}_+$ of $\mathring{\mathfrak{g}}$ that corresponds to the positive roots. Any such representation of $\mathring{\mathfrak{g}}$ can be naturally promoted to a representation of the larger algebra $\mathfrak{g}$.
J. Fuchs, C. Schweigert
$G^{ij}$ is the symmetrized Cartan matrix of the horizontal subalgebra $\bar{\mathfrak{g}}$, and $e_{\bar\alpha,\bar\beta}$ is the two-cocycle that furnishes structure constants of $\bar{\mathfrak{g}}$ via $[E^{\bar\alpha}, E^{\bar\beta}] = e_{\bar\alpha,\bar\beta}\, E^{\bar\alpha+\bar\beta}$ when $\bar\alpha+\bar\beta$ is a $\bar{\mathfrak{g}}$-root.

The automorphisms to be constructed are characterized by a sequence of elements $\bar\mu_s$, for $s = 1, 2, \dots, m$ with $m$ an integer $m \ge 2$, of the coweight lattice $L_w^\vee$ of $\bar{\mathfrak{g}}$ (i.e. the lattice dual to the root lattice of $\bar{\mathfrak{g}}$) that add up to zero. The set
\[
\Gamma_w := \Big\{ \bar\mu \equiv (\bar\mu_1, \bar\mu_2, \dots, \bar\mu_m) \;\Big|\; \bar\mu_s \in L_w^\vee \text{ for all } s = 1, 2, \dots, m;\ \sum_{s=1}^{m} \bar\mu_s = 0 \Big\}
\tag{2.4}
\]
of such sequences forms a free abelian group with respect to elementwise addition. As an abstract group, we have the isomorphism
\[
\Gamma_w \cong (L_w^\vee)^{m-1}.
\tag{2.5}
\]
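As a small illustration of the isomorphism (2.5), the following sketch stores a coweight as a tuple of integer coordinates with respect to some chosen basis of $L_w^\vee$. This encoding and the function names `to_free`, `from_free` and `add` are assumptions made for the example only; they do not appear in the text. Dropping the last entry of a sequence in $\Gamma_w$, and restoring it from the zero-sum constraint, gives two mutually inverse maps that respect elementwise addition.

```python
def to_free(mu):
    """Forget the last coweight; the first m-1 entries are unconstrained."""
    # Every coordinate column must add up to zero for mu to lie in Gamma_w.
    assert all(sum(column) == 0 for column in zip(*mu)), "entries must add up to zero"
    return mu[:-1]


def from_free(nu):
    """Restore the last coweight from the constraint sum_s mu_s = 0."""
    last = tuple(-sum(column) for column in zip(*nu))
    return nu + (last,)


def add(mu, mu2):
    """Elementwise addition, i.e. the group law of Gamma_w."""
    return tuple(tuple(a + b for a, b in zip(x, y)) for x, y in zip(mu, mu2))


# Example with m = 3 punctures and a rank-2 coweight lattice (hypothetical data).
mu = ((1, 0), (2, -1), (-3, 1))
nu = ((0, 4), (-1, -1), (1, -3))
```

Here `from_free(to_free(mu))` returns `mu` again, and applying `to_free` to `add(mu, nu)` gives the elementwise sum of `to_free(mu)` and `to_free(nu)`, which is the content of (2.5) in coordinates.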
In addition, the automorphisms of interest to us also depend on a sequence of pairwise distinct complex numbers $z_s$ ($s = 1, 2, \dots, m$), one of which is singled out. However, in contrast to the elements of $\Gamma_w$, in this section we regard these numbers as fixed once and for all, and accordingly we do not refer to them in our notation for the automorphisms. Thus we will deal with precisely one automorphism of $\mathfrak{g}$ for each $\bar\mu \in \Gamma_w$; we denote this map by $\sigma_{\bar\mu}$. Only in case we wish to stress the dependence of the automorphism on the label $s_\circ \in \{1, 2, \dots, m\}$ that is singled out, we employ instead the notation $\sigma_{\bar\mu;s_\circ}$.

For any $\bar\mu \in \Gamma_w$, the automorphism $\sigma_{\bar\mu} \equiv \sigma_{\bar\mu;s_\circ}$ is defined by
\[
\begin{aligned}
\sigma_{\bar\mu}(K) &:= K, \\
\sigma_{\bar\mu}(H^i\otimes f) &:= H^i\otimes f + K \sum_{s=1}^{m} \bar\mu_s^i\, \mathrm{Res}(\varphi_{s_\circ,s} f), \\
\sigma_{\bar\mu}(E^{\bar\beta}\otimes f) &:= E^{\bar\beta}\otimes f \cdot \prod_{s=1}^{m} (\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\beta)},
\end{aligned}
\tag{2.6}
\]
with
\[
\varphi_{s_\circ,s}(t) := (t + (z_{s_\circ} - z_s))^{-1}.
\tag{2.7}
\]
Here for $q \in \mathbb{Q}$ we employ the notation $f^q$ for the function with values
\[
f^q(z) = (f(z))^q
\tag{2.8}
\]
for all $z \in \mathbb{C}$. In the sequel we will also use the short-hand
\[
\Phi_{\bar\alpha} := \prod_{s=1}^{m} (\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\alpha)}
\qquad \text{for } \bar\alpha \text{ a } \bar{\mathfrak{g}}\text{-root}.
\tag{2.9}
\]
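It is readily checked that the mapping (2.6) indeed constitutes an automorphism of the affine Lie algebra $\mathfrak{g}$. Indeed, except for the bracket $[E^{\bar\alpha}\otimes f, E^{-\bar\alpha}\otimes g]$, all the relations (2.2) are trivially left invariant. As for the latter, invariance follows by the identity
\[
\begin{aligned}
\mathrm{Res}\big(d[f \cdot \Phi_{\bar\alpha}] \cdot g \cdot \Phi_{-\bar\alpha}\big)\, K
&= \mathrm{Res}\Big(df\, g + fg \cdot \sum_{s=1}^{m} (\bar\mu_s, \bar\alpha)\, \varphi_{s_\circ,s}\Big)\, K \\
&= \mathrm{Res}(df\, g)\, K + \sum_{i=1}^{r} \bar\alpha_i \sum_{s=1}^{m} \bar\mu_s^i\, \mathrm{Res}(\varphi_{s_\circ,s} f g)\, K,
\end{aligned}
\tag{2.10}
\]
which follows with the help of
\[
d\Phi_{\bar\alpha} = \Phi_{\bar\alpha} \cdot \sum_{s=1}^{m} (\bar\mu_s, \bar\alpha)\, \varphi_{s_\circ,s}.
\tag{2.11}
\]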
Moreover, it follows immediately from the definition that
\[
\sigma_{\bar\mu'} \circ \sigma_{\bar\mu} = \sigma_{\bar\mu+\bar\mu'} = \sigma_{\bar\mu} \circ \sigma_{\bar\mu'}.
\tag{2.12}
\]
Thus the automorphisms (2.6) respect the group law in $\Gamma_w$, i.e. they furnish a representation of the abelian group $\Gamma_w$ in $\mathrm{Aut}(\mathfrak{g})$. This representation provides in fact a group isomorphism; accordingly we will henceforth identify the group of automorphisms $\sigma_{\bar\mu}$ (2.6) with the group $\Gamma_w$ (2.4).

As already mentioned, we refer to the automorphisms (2.6) as multi-shift automorphisms of the affine Lie algebra $\mathfrak{g}$. The origin of this terminology is that they are close relatives of the ordinary shift automorphisms $\sigma^{(0)}_{\bar\mu}$ of $\mathfrak{g}$ that appear in the literature, e.g. in [18–20]. In the conventional notation
\[
H^i_n := H^i\otimes t^n, \qquad E^{\bar\alpha}_n := E^{\bar\alpha}\otimes t^n,
\tag{2.13}
\]
the action of such a “single-shift” automorphism $\sigma^{(0)}_{\bar\mu}$ is given by
\[
\begin{aligned}
\sigma^{(0)}_{\bar\mu}(K) &= K, \\
\sigma^{(0)}_{\bar\mu}(H^i_n) &= H^i_n + \bar\mu^i\, \delta_{n,0}\, K, \\
\sigma^{(0)}_{\bar\mu}(E^{\bar\alpha}_n) &= E^{\bar\alpha}_{n+(\bar\mu,\bar\alpha)},
\end{aligned}
\tag{2.14}
\]
where $\bar\mu \in L_w^\vee$. These ordinary shift automorphisms are recovered from the formula (2.6) as the special case where $\bar\mu_s = 0$ for $s \neq s_\circ$. In the case of (2.14), the coweight lattice element $\bar\mu$ is known as a shift vector; we will use this term also for the collection $\bar\mu$ of coweight lattice elements that characterizes the multi-shift automorphisms (2.6).

Note that the ordinary shift automorphisms do not satisfy the constraint $\sum_{s=1}^{m} \bar\mu_s = 0$, i.e. do not belong to the class of automorphisms of $\mathfrak{g}$ that we consider here. In this context we should point out that when we dispense with the restriction $\sum_{s=1}^{m} \bar\mu_s = 0$, the formula (2.6) still provides us with an automorphism of $\mathfrak{g}$. The reason why we nevertheless impose that constraint will become clear in the next section, where we glue together several of the “local” objects (2.6) to a “global” object that only exists when the constraint is satisfied.
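The single-shift formula (2.14) and the group law (2.12) can be checked mechanically in the simplest case. The sketch below works with affine $\mathfrak{sl}(2)$ in a Chevalley-type normalization, $[H_n, H_m] = 2n\,\delta_{n+m,0} K$, $[H_n, E_m] = 2E_{n+m}$, $[H_n, F_m] = -2F_{n+m}$, $[E_n, F_m] = H_{n+m} + n\,\delta_{n+m,0} K$, where $E$ and $F$ stand for $E^{\bar\alpha}$ and $E^{-\bar\alpha}$ and the integer `p` plays the role of $(\bar\mu, \bar\alpha)$. This normalization, the dictionary encoding of Lie algebra elements and all function names are assumptions of the example; they may differ from the Cartan–Weyl conventions of (2.2) by rescalings.

```python
def clean(x):
    """Drop zero coefficients so that equal elements compare equal."""
    return {k: c for k, c in x.items() if c != 0}


def basis_bracket(a, b):
    """Bracket of two basis elements ('E'|'F'|'H', n) or 'K' of affine sl(2)."""
    if a == 'K' or b == 'K':
        return {}  # K is central
    (X, n), (Y, m) = a, b
    delta = 1 if n + m == 0 else 0
    if X == 'H' and Y == 'H':
        return clean({'K': 2 * n * delta})
    if X == 'H' and Y == 'E':
        return {('E', n + m): 2}
    if X == 'H' and Y == 'F':
        return {('F', n + m): -2}
    if X == 'E' and Y == 'F':
        return clean({('H', n + m): 1, 'K': n * delta})
    if (X == 'E' and Y == 'E') or (X == 'F' and Y == 'F'):
        return {}
    # Remaining cases (E,H), (F,H), (F,E) follow from antisymmetry.
    return {k: -c for k, c in basis_bracket(b, a).items()}


def bracket(x, y):
    """Bilinear extension of basis_bracket to linear combinations."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            for k, c in basis_bracket(a, b).items():
                out[k] = out.get(k, 0) + ca * cb * c
    return clean(out)


def shift(x, p):
    """Single-shift automorphism with p = (mu, alpha), extended linearly."""
    out = {}
    for a, c in x.items():
        if a == 'K':
            image = {'K': 1}
        elif a[0] == 'E':
            image = {('E', a[1] + p): 1}
        elif a[0] == 'F':
            image = {('F', a[1] - p): 1}
        else:  # 'H': only the zero mode picks up a central term
            image = clean({a: 1, 'K': p if a[1] == 0 else 0})
        for k, ck in image.items():
            out[k] = out.get(k, 0) + c * ck
    return clean(out)


def is_automorphism(p, modes=range(-3, 4)):
    """Check shift([a, b]) == [shift(a), shift(b)] on a window of modes."""
    basis = ['K'] + [(X, n) for X in 'EFH' for n in modes]
    return all(
        shift(bracket({a: 1}, {b: 1}), p) == bracket(shift({a: 1}, p), shift({b: 1}, p))
        for a in basis for b in basis
    )
```

Within this toy setting, `is_automorphism(p)` holds for every integer `p`, and two successive shifts by `p` and `q` agree with a single shift by `p + q`, in line with (2.12).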
2.2. Implementation on modules. A natural question is to which class of outer modulo inner automorphisms a multi-shift automorphism $\sigma_{\bar\mu}$ belongs. Before we can address this issue, we first have to study the implementation of $\sigma_{\bar\mu}$ on irreducible highest weight modules. We start by observing that for any automorphism $\sigma$ of an affine Lie algebra $\mathfrak{g}$ and any irreducible highest weight module $(R, \mathcal{H})$ over $\mathfrak{g}$,
\[
(\tilde R, \mathcal{H}) := (R \circ \sigma, \mathcal{H})
\tag{2.15}
\]
furnishes again an irreducible highest weight module. Since this fact is interesting in itself, we pause to describe how it can be established. First, irreducibility of $(\tilde R, \mathcal{H})$ is immediate: if a subspace $M$ of $\mathcal{H}$ is a non-trivial submodule under $\tilde R$, then it is a non-trivial submodule under $R$ as well; but by the irreducibility of $\mathcal{H}$ such submodules do not exist. To check the highest weight property, it is sufficient to show that the module $(\tilde R, \mathcal{H})$ possesses at least one singular vector; this vector is necessarily unique (up to a scalar), since otherwise the module would not be irreducible. Now consider any Borel subalgebra $\mathfrak{b}$ of $\mathfrak{g}$; then $\sigma(\mathfrak{b})$ is a Borel subalgebra again. Since any two Borel subalgebras are conjugate under an inner automorphism of $\mathfrak{g}$ [21], it follows that there exists an inner automorphism $\tilde\sigma$ of $\mathfrak{g}$ such that $\sigma(\mathfrak{b}) = \tilde\sigma(\mathfrak{b})$. This implies that the two automorphisms $\sigma$ and $\tilde\sigma$ just differ by an automorphism that does not affect the triangular decomposition, i.e. by a so-called diagram automorphism $\omega$ (see e.g. [22]). Let us write
\[
\tilde\sigma = \overrightarrow{\prod_i}\, \exp(\mathrm{ad}_{x_i}),
\tag{2.16}
\]
where a fixed ordering of the product is understood. We consider the vector
\[
\tilde v := \overleftarrow{\prod_i}\, \exp(R(x_i))\, v,
\tag{2.17}
\]
where the opposite ordering of the product is understood and where $v$ is the highest weight vector of $(R, \mathcal{H})$. (One can think of $\tilde v$ as a Bogoliubov transform of $v$.) One can check that
\[
R(\sigma(x))\, \tilde v = R(\tilde\sigma\omega(x))\, \tilde v = \overrightarrow{\prod_i}\, \exp(R(x_i))\, R(\omega(x))\, v
\tag{2.18}
\]
(compare also the relation (5.25) below). This tells us that the highest weight properties of $v$ imply analogous highest weight properties for $\tilde v$. This concludes the argument.

In addition, by similar arguments one can show that when $\sigma$ is an inner automorphism, then the $\mathfrak{g}$-modules $(R, \mathcal{H})$ and $(\tilde R, \mathcal{H})$ are isomorphic, and hence share the same highest weight. On the other hand, they do not necessarily share the same highest weight vector, of course. Thus in general the weight of the highest weight vector $v$ of $(R, \mathcal{H})$ is different from its weight as a vector in $(\tilde R, \mathcal{H})$; but the previous statement tells us that the difference must be an element of the root lattice of $\mathfrak{g}$.

We also note that any diagram automorphism $\omega$ restricts to an automorphism of the Cartan subalgebra $\mathfrak{g}_\circ$ of $\mathfrak{g}$ and hence by duality induces an automorphism $\omega^\star$ of the weight space of $\mathfrak{g}$, which constitutes a permutation $\Lambda \mapsto \omega^\star\Lambda$ of the finitely many
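integrable weights at fixed level. It follows that for each integrable $\mathfrak{g}$-weight $\Lambda$ any multi-shift automorphism $\sigma_{\bar\mu}$ of $\mathfrak{g}$ induces a map
\[
\Theta_{\bar\mu} \equiv \Theta_{\bar\mu;s_\circ} :\quad \mathcal{H}_\Lambda \to \mathcal{H}_{\omega^\star\Lambda}
\tag{2.19}
\]
between the irreducible highest weight modules $\mathcal{H}_\Lambda$ and $\mathcal{H}_{\omega^\star\Lambda}$ over $\mathfrak{g}$ which obeys the twisted intertwining property
\[
\Theta_{\bar\mu}\, x = \sigma_{\bar\mu}(x)\, \Theta_{\bar\mu}
\tag{2.20}
\]
for all $x \in \mathfrak{g}$; the map $\Theta_{\bar\mu}$ is characterized by this property up to a scalar factor. In the special case where $\sigma_{\bar\mu}$ is an inner automorphism of $\mathfrak{g}$ and hence of the form $\sigma_{\bar\mu} = \overrightarrow{\prod}_i \exp(\mathrm{ad}_{x_i})$, the map $\Theta_{\bar\mu}$ can be implemented by
\[
\tilde\Theta_{\bar\mu} = \overrightarrow{\prod_i}\, \exp(x_i),
\tag{2.21}
\]
where the right-hand side is to be understood as an element of the universal enveloping algebra of $\mathfrak{g}$. Notice, though, that the implementation $\Theta_{\bar\mu}$ is only determined up to a phase. Indeed, owing to the fact that $K$ is central so that $\exp(\mathrm{ad}_{\xi K}) = \mathrm{id}$, we can always modify $\Theta_{\bar\mu}$ by central terms,
\[
\Theta_{\bar\mu} = \exp(fK)\, \overrightarrow{\prod_i}\, \exp(x_i),
\tag{2.22}
\]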
where $f$ is an arbitrary element of $\mathcal{F}(\mathbb{P}^1_m)$. We will make use of this freedom later on. Moreover, notice that even if all automorphisms in $\sigma_{\bar\mu} = \prod_i \exp(\mathrm{ad}_{x_i})$ commute so that this product is well-defined without specifying the ordering, this does not imply that the same holds for the product that appears in the implementation (2.21). We will therefore choose some specific ordering of the insertion points and keep this ordering fixed in the sequel.

2.3. Outer automorphism classes. We are now in a position to determine in which outer automorphism class $[\sigma_{\bar\mu}]$ a multi-shift automorphism $\sigma_{\bar\mu}$ lies. To this end we take for the Laurent series $f$ the constant function $\mathrm{id}$ and exploit the transformation property of the elements $H^i\otimes\mathrm{id}$. Because of the identity
\[
\mathrm{Res}(\varphi_{s_\circ,s}\, t^n) =
\begin{cases}
1 & \text{for } s = s_\circ,\ n = 0, \\
-(z_s - z_{s_\circ})^n & \text{for } s \neq s_\circ,\ n < 0, \\
0 & \text{else}
\end{cases}
\tag{2.23}
\]
for $n \in \mathbb{Z}$, we have in particular
\[
\mathrm{Res}(\varphi_{s_\circ,s}) = \delta_{s,s_\circ},
\tag{2.24}
\]
so that
\[
\sigma_{\bar\mu;s}(H^i\otimes\mathrm{id}) = H^i\otimes\mathrm{id} + \bar\mu_s^i\, K.
\tag{2.25}
\]
Expressed in terms of the notation (2.13), the transformation (2.25) reads
\[
H^i_0 \mapsto H^i_0 + \bar\mu_s^i\, K.
\tag{2.26}
\]
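The residue identity (2.23) lends itself to a direct check with truncated Laurent series. The sketch below expands $\varphi_{s_\circ,s}(t) = (t + z_{s_\circ} - z_s)^{-1}$ about $t = 0$ as a geometric series with exact rational coefficients and reads off the coefficient of $t^{-1}$. The dictionary encoding of a series, the truncation parameter `order` and the function names are assumptions of the example, and the inputs are restricted to rational $z_s$.

```python
from fractions import Fraction


def phi_coeffs(a, order):
    """Laurent coefficients {exponent: coeff} of (t + a)^(-1) about t = 0."""
    a = Fraction(a)
    if a == 0:
        return {-1: Fraction(1)}  # case s = s0: phi is exactly t^(-1)
    # Geometric series: 1/(t + a) = sum_{k >= 0} (-1)^k t^k / a^(k+1)
    return {k: Fraction((-1) ** k) / a ** (k + 1) for k in range(order + 1)}


def residue(series):
    """Coefficient of t^(-1), as in definition (2.3)."""
    return series.get(-1, Fraction(0))


def res_phi_tn(z_s0, z_s, n, order=12):
    """Res(phi_{s0,s} t^n), computed from the truncated expansion."""
    coeffs = phi_coeffs(Fraction(z_s0) - Fraction(z_s), order)
    shifted = {e + n: c for e, c in coeffs.items()}  # multiply by t^n
    return residue(shifted)


def closed_form(z_s0, z_s, n):
    """Right-hand side of (2.23)."""
    if z_s0 == z_s:
        return Fraction(1 if n == 0 else 0)
    if n < 0:
        return -(Fraction(z_s - z_s0) ** n)
    return Fraction(0)
```

For instance, `res_phi_tn(0, 3, -2)` evaluates to `Fraction(-1, 9)`, in agreement with $-(z_s - z_{s_\circ})^{-2}$ for $z_{s_\circ} = 0$, $z_s = 3$. The truncation is harmless as long as `order` is at least $-1-n$.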
When combined with the results of Subsect. 2.2 it follows that modulo the root lattice $L$ of $\bar{\mathfrak{g}}$ the highest weight $\Lambda$ of any irreducible highest weight module $(R, \mathcal{H})$ over $\mathfrak{g}$ and the highest weight $\tilde\Lambda$ of $(\tilde R, \mathcal{H}) \equiv (R \circ \sigma_{\bar\mu;s}, \mathcal{H})$ are related by
\[
\bar{\tilde\Lambda} - \bar\Lambda = k\, \bar\mu_s \bmod L,
\tag{2.27}
\]
where $k$ is the eigenvalue of the central element $K$. Note that this is a statement about the horizontal parts of the highest weights; the transformation of the rest of the weight is then fixed by the fact that because of $\sigma_{\bar\mu;s}(K) = K$ the level does not change, while the change in the grade is prescribed by the formula for $\sigma_{\bar\mu;s}(L_0)$ that is given in (2.32) below.$^2$

Now the quantity $\bar\lambda \bmod L$ is nothing but the conjugacy class of the $\bar{\mathfrak{g}}$-weight $\bar\lambda$. Taking also into account the relation $k = k^\vee \cdot (\bar\theta, \bar\theta)/2$ (with $\bar\theta$ the highest root of $\bar{\mathfrak{g}}$) between the eigenvalue $k$ of $K$ and the level $k^\vee$, we learn that the automorphism $\sigma_{\bar\mu;s}$ leads to a change $\delta_s$ in the conjugacy class of level-1 modules that is given by $(\bar\theta, \bar\theta)/2$ times the shift vector $\bar\mu_s$, taken modulo the root lattice $L$, or what is the same,
\[
\delta_s = \tfrac{(\bar\theta,\bar\theta)}{2}\, (\bar\mu_s \bmod L^\vee).
\tag{2.28}
\]
Furthermore, there is a natural one-to-one correspondence between conjugacy classes of $\bar{\mathfrak{g}}$-weights and classes of certain outer automorphisms of $\mathfrak{g}$. Namely, as abelian groups we have the isomorphisms
\[
L_w / L \cong L_w^\vee / L^\vee \cong \mathcal{Z}(\mathfrak{g}),
\tag{2.29}
\]
where $\mathcal{Z}(\mathfrak{g})$ is the unique maximal abelian normal subgroup of the group $\mathrm{Out}(\mathfrak{g}) = \mathrm{Aut}(\mathfrak{g})/\mathrm{Int}(\mathfrak{g})$ of outer automorphism classes of $\mathfrak{g}$. It follows in particular that $[\sigma_{\bar\mu;s}]$ is the same outer automorphism class as the class $[\sigma^{(0)}_{\bar\mu_s}]$ of the ordinary shift automorphism $\sigma^{(0)}_{\bar\mu}$ (compare formula (2.14)) that is characterized by the same shift vector $\bar\mu = \bar\mu_s$. It is important to note that in general this way one cannot obtain all outer automorphism classes of $\mathfrak{g}$, but rather only those which belong to the subgroup $\mathcal{Z}(\mathfrak{g})$.

Let us also mention that the abelian subgroup $\mathcal{Z}(\mathfrak{g})$ of the group of diagram automorphisms is naturally isomorphic to the center of the compact, connected and simply connected real Lie group whose Lie algebra is the compact real form of $\bar{\mathfrak{g}}$. Furthermore, the elements of $\mathcal{Z}(\mathfrak{g})$ are in one-to-one correspondence with the simple currents [23,24] of the WZW theory, which are those primary fields which correspond to the units of the fusion ring of the theory (see e.g. [14]).

2 Of course the property that, modulo $L$, for fixed level there is a universal shift in the horizontal part of the highest weight is not shared by arbitrary automorphisms of $\mathfrak{g}$.
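To make the finite group in (2.29) concrete, one can use the standard fact that the index of the root lattice $L$ in the weight lattice $L_w$ equals the determinant of the Cartan matrix of $\bar{\mathfrak{g}}$. The sketch below evaluates this determinant for $\bar{\mathfrak{g}} = A_r$, where it equals $r+1$, matching the order of the cyclic group $\mathbb{Z}_{r+1}$ (the center of $SU(r+1)$). The restriction to type $A_r$ and the helper names `cartan_matrix_A` and `det` are choices made for this illustration.

```python
from fractions import Fraction


def cartan_matrix_A(r):
    """Cartan matrix of A_r: 2 on the diagonal, -1 next to it, 0 elsewhere."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(r)]
            for i in range(r)]


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row with a non-zero entry in column c to use as pivot.
        pivot = next((i for i in range(c, n) if a[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= factor * a[c][j]
    return result


def number_of_conjugacy_classes_A(r):
    """Order of L_w / L for A_r, computed as the Cartan determinant."""
    return int(det(cartan_matrix_A(r)))
```

In this example `number_of_conjugacy_classes_A(2)` gives 3, the three conjugacy classes (trialities) of $\mathfrak{sl}(3)$-weights.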
2.4. Extension to the Virasoro algebra. Given a representation of the untwisted affine Lie algebra $\mathfrak{g}$, the affine Sugawara construction provides us with a representation of the Virasoro algebra $\mathrm{Vir}$. Recall that $\mathrm{Vir}$ is the Lie algebra with generators $L_n$ for $n \in \mathbb{Z}$ and $C$ and relations
\[
[L_n, L_m] = (n - m)\, L_{n+m} + \tfrac{1}{24}\, (n^3 - n)\, \delta_{n+m,0}\, C, \qquad [C, L_n] = 0.
\tag{2.30}
\]
The algebra $\mathrm{Vir}$ combines with $\mathfrak{g}$ as a semidirect sum $\mathrm{Vir} \oplus \mathfrak{g}$, with relations
\[
[L_n, \bar x\otimes f] = -\bar x \otimes t^{n+1}\, df
\tag{2.31}
\]
for $n \in \mathbb{Z}$. As we show in Appendix A, the multi-shift automorphisms $\sigma_{\bar\mu;s_\circ}$ of $\mathfrak{g}$ can be extended in a unique manner to automorphisms of $\mathrm{Vir} \oplus \mathfrak{g}$. The action of such an automorphism on the generators of $\mathrm{Vir}$ reads $\sigma_{\bar\mu;s_\circ}(C) = C$ and
\[
\sigma_{\bar\mu;s_\circ}(L_n) = L_n + \sum_{\ell\in\mathbb{Z}} \sum_{s} (\bar\mu_s, H_{n-\ell})\, \mathrm{Res}(t^{\ell} \varphi_{s_\circ,s})
+ \tfrac12\, K \sum_{s,s'} (\bar\mu_s, \bar\mu_{s'})\, \mathrm{Res}(t^{n+1} \varphi_{s_\circ,s}\, \varphi_{s_\circ,s'})
\tag{2.32}
\]
(since the extension is unique, we employ the same symbol for the automorphism of $\mathrm{Vir} \oplus \mathfrak{g}$ as for the automorphism of $\mathfrak{g}$). Note that by putting $\bar\mu_s = 0$ for $s \neq s_\circ$, this reduces to
\[
\sigma^{(0)}_{\bar\mu}(L_n) = L_n + (\bar\mu, H_n) + \tfrac12\, K\, (\bar\mu, \bar\mu)\, \delta_{n,0};
\tag{2.33}
\]
this is precisely the unique extension to the Virasoro algebra of the ordinary shift automorphism (2.14) of $\mathfrak{g}$.

For later reference we also mention that via the formula (2.20) and the affine Sugawara construction, the maps $\Theta_{\bar\mu} : \mathcal{H}_\Lambda \to \mathcal{H}_{\omega^\star\Lambda}$ defined in (2.19) obey the twisted intertwining relations
\[
\Theta_{\bar\mu;s_\circ}\, L_n = \sigma_{\bar\mu;s_\circ}(L_n)\, \Theta_{\bar\mu;s_\circ}
\tag{2.34}
\]
with regard to the Virasoro algebra.
3. Action of $\Gamma_w$ on the Block Algebra

The automorphisms constructed in Sect. 2 also provide us with corresponding automorphisms of direct sums of affine Lie algebras with identified centers. In this section we show that such an automorphism preserves a distinguished subalgebra of the latter algebra, namely the so-called block algebra which plays a crucial role in the description of chiral blocks.
3.1. The block algebra. In the previous section we have established that given a shift vector $\bar\mu \in \Gamma_w$ together with a definite choice of distinguished label $s_\circ \in \{1, 2, \dots, m\}$, the prescription (2.6) provides us with an automorphism of the affine Lie algebra $\mathfrak{g}$. But the formula (2.6) of course makes sense for every choice $s_\circ = 1, 2, \dots, m$ of this distinguished label, so that we are in fact even given a collection $\{\sigma_{\bar\mu;s_\circ}\}$ of such automorphisms, one for each value of $s_\circ$, and all with the same shift vector $\bar\mu \in \Gamma_w$. By a slight change of perspective we can rephrase this by saying that we are given an automorphism of the direct sum $\bigoplus_{s=1}^{m} \mathfrak{g}_s$ of $m$ copies $\mathfrak{g}_s \cong \mathfrak{g}$ of $\mathfrak{g}$, namely the one that on the $s$-th summand $\mathfrak{g}_s$ acts as $\sigma_{\bar\mu;s}$. Moreover, the so obtained automorphism restricts to an automorphism of the quotient
\[
\mathfrak{g}^m := \Big(\bigoplus_{s=1}^{m} \mathfrak{g}_s\Big) \Big/ \mathcal{J}, \qquad
\mathcal{J} := \langle K_s - K_{s'} \mid s, s' = 1, 2, \dots, m \rangle
\tag{3.1}
\]
of $\bigoplus_{s=1}^{m} \mathfrak{g}_s$ that is obtained by identifying the centers of the algebras $\mathfrak{g}_s$, because the ideal $\mathcal{J}$ is $\sigma_{\bar\mu}$-invariant.

While the specific form of the automorphisms is rather irrelevant for this simple observation, it will become crucial in the considerations that follow now, where we focus our attention on a different infinite-dimensional Lie algebra that can be embedded into (3.1). A convenient starting point for the description of this algebra is to recall that the automorphisms (2.6) also depend on a chosen sequence of $m$ pairwise distinct numbers $z_s \in \mathbb{C}$. We now regard these numbers as the coordinates of $m$ pairwise distinct points $p_s$ on the complex projective line $\mathbb{P}^1$, to which we refer as the insertion points or punctures. More precisely, we regard the complex curve $\mathbb{P}^1$ as $\mathbb{C} \cup \{\infty\}$ and denote by $z$ the standard global coordinate on $\mathbb{C}$; then we write
\[
z_s = z(p_s).
\tag{3.2}
\]
For the present purposes the points $p_s$ are kept fixed; later on we will allow them to vary over the whole moduli space $\mathcal{M}_m$ of $m$ pairwise distinct points on $\mathbb{P}^1$.

Given this collection of points, we can consider the space $\mathcal{F}(\mathbb{P}^1_m)$ of algebraic functions on the punctured Riemann sphere $\mathbb{P}^1_m \equiv \mathbb{P}^1 \setminus \{p_1, p_2, \dots, p_m\}$, i.e. of meromorphic functions on $\mathbb{P}^1$ whose poles are of finite order and lie at most at the punctures $\{p_1, p_2, \dots, p_m\}$. The space $\mathcal{F}(\mathbb{P}^1_m)$ is an associative algebra, the product being given by pointwise multiplication. As a consequence, the tensor product
\[
\bar{\mathfrak{g}}\otimes\mathcal{F} \equiv \bar{\mathfrak{g}} \otimes \mathcal{F}(\mathbb{P}^1_m)
\tag{3.3}
\]
of the horizontal subalgebra $\bar{\mathfrak{g}}$ of $\mathfrak{g}$ with the space $\mathcal{F}(\mathbb{P}^1_m)$ inherits a natural Lie algebra structure from the one of $\bar{\mathfrak{g}}$, with Lie bracket
\[
[\bar x\otimes f, \bar y\otimes g] = [\bar x, \bar y] \otimes fg
\tag{3.4}
\]
for $\bar x, \bar y \in \bar{\mathfrak{g}}$ and $f, g \in \mathcal{F}(\mathbb{P}^1_m)$. In terms of the $\bar{\mathfrak{g}}$-generators $H^i$ and $E^{\bar\alpha}$, this yields the same relations as in (2.2), but with the residue terms removed, and with $f, g$ elements of $\mathcal{F}(\mathbb{P}^1_m)$ rather than formal power series.

The algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$ provides a global realization of the symmetries of a WZW conformal field theory, which are locally realized through the affine Lie algebra $\mathfrak{g}$. As will be described in more detail later on, it constitutes an important ingredient in a representation theoretic description of the chiral blocks of WZW theories; for brevity, we will
therefore refer to $\bar{\mathfrak{g}}\otimes\mathcal{F}$ as the block algebra. Clearly, the block algebra is spanned by the elements $H^i\otimes f$ and $E^{\bar\alpha}\otimes f$ with $i \in \{1, 2, \dots, r\}$, $\bar\alpha$ a $\bar{\mathfrak{g}}$-root and $f \in \mathcal{F}(\mathbb{P}^1_m)$. (By allowing for arbitrary $f \in \mathcal{F}(\mathbb{P}^1_m)$ of course we do not obtain a basis of the block algebra; while a basis can easily be given, we will not need it here. Note that an automorphism of $\bar{\mathfrak{g}}\otimes\mathcal{F}$ is uniquely defined by its action on a basis, and hence a fortiori by its action on the elements $H^i\otimes f$ and $E^{\bar\beta}\otimes f$.)

Around each of the punctures $p_s$ we can choose the local coordinate
\[
\zeta_s = z - z_s.
\tag{3.5}
\]
By identifying these local coordinates with the indeterminate $t$ of the loop construction of the affine Lie algebra $\mathfrak{g}_s \cong \mathfrak{g}$, one obtains an embedding of the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$ in the direct sum of $m$ copies of the loop algebra $\bar{\mathfrak{g}}_{\mathrm{loop}} \cong \bar{\mathfrak{g}} \otimes \mathbb{C}((t))$, and thereby also an embedding $\imath$ in the algebra $\mathfrak{g}^m$ that was introduced in (3.1). This is seen as follows. For any function $f \in \mathcal{F}(\mathbb{P}^1_m)$ we denote its expansion in local coordinates around the point $p_s$, considered as a Laurent series in the variable $t \equiv \zeta_s$, by $f_{|s}$. With this notation, the image of $\bar x\otimes f \in \bar{\mathfrak{g}}\otimes\mathcal{F}$ under the embedding $\imath$ is the sequence
\[
\imath(\bar x\otimes f) = (\bar x\otimes f_{|1}, \bar x\otimes f_{|2}, \dots, \bar x\otimes f_{|m}),
\tag{3.6}
\]
where $\bar x\otimes f_{|s}$ is regarded as an element of $\mathfrak{g}_s$. In short, the embedding in the $s$-th summand is obtained by replacing the global function $f$ by its local expansion $f_{|s}$ at the puncture $p_s$, or in other words, by its germ at $p_s$. Note that $f$ is already determined completely by its germ at any single puncture. In particular, the block algebra is a proper subalgebra of $\mathfrak{g}^m$.

3.2. Automorphisms of the block algebra. Next we observe that also the functions $\varphi_{s_\circ,s}$ defined in (2.7) are the germs of globally defined functions in $\mathcal{F}(\mathbb{P}^1_m)$. Indeed, in view of (3.5) for each pair $s_\circ, s$ the function $\varphi_{s_\circ,s}$ (2.7) can be recognized as the local expansion
\[
\varphi_{s_\circ,s}(\zeta_{s_\circ}) = (\zeta_{s_\circ} + z_{s_\circ} - z_s)^{-1} = \varphi^{(s)}_{|s_\circ}(\zeta_{s_\circ})
\tag{3.7}
\]
at $p_{s_\circ}$ of the function $\varphi^{(s)} \in \mathcal{F}(\mathbb{P}^1_m)$ defined by
\[
\varphi^{(s)}(z) := (z - z_s)^{-1}.
\tag{3.8}
\]
At this point the meaning of the requirement $\sum_{s=1}^{m} \bar\mu_s = 0$ becomes clear: it ensures that the function $\prod_s (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}$ possesses poles only at the punctures $p_s$ and hence lies in $\mathcal{F}(\mathbb{P}^1_m)$. It therefore follows in particular that the functions $f \cdot \prod_s (\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\beta)}$ that appear in the automorphism (2.6) of $\mathfrak{g}$ are the local germs at $p_{s_\circ}$ of globally defined functions $f \cdot \prod_s (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}$. Of course, each of the germs individually already contains all information on the global function; this is reflected by the fact that for all choices of $s_\circ$ the Laurent series $(\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\beta)}$ all involve one and the same shift vector $\bar\mu$.

From these observations we finally deduce that to any shift vector $\bar\mu \in \Gamma_w$ we can associate a linear map of the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$ which acts as
\[
\vec\sigma_{\bar\mu}(H^i\otimes f) = H^i\otimes f, \qquad
\vec\sigma_{\bar\mu}(E^{\bar\beta}\otimes f) = E^{\bar\beta}\otimes f \cdot \prod_{s=1}^{m} (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}.
\tag{3.9}
\]
It is readily checked that this map constitutes an automorphism of $\bar{\mathfrak{g}}\otimes\mathcal{F}$. Note that on purpose here we employ a similar symbol $\vec\sigma_{\bar\mu}$ as for the automorphisms $\sigma_{\bar\mu}$ of $\mathfrak{g}$ in (2.6); indeed these maps should be regarded as the global and local realizations, respectively, of one and the same basic structure. To be precise, the local expansions of the automorphism (3.9) of the block algebra reproduce the automorphisms (2.6) of $\mathfrak{g}_s$ only up to the central terms. The latter are needed in order to really have an automorphism of $\mathfrak{g}_s$ rather than only an automorphism of the corresponding loop algebra; however, because of the identification of the centers of the subalgebras $\mathfrak{g}_s$ in $\mathfrak{g}^m$ (3.1), upon summing over all insertion points these terms cancel owing to the residue theorem:
\[
\begin{aligned}
\vec\sigma_{\bar\mu}(H^i\otimes f) \;&\hat{=}\; \vec\sigma_{\bar\mu}\Big(\sum_{s=1}^{m} H^i_{(s)}\otimes f_{|s}\Big)
= \sum_{s=1}^{m} \sigma_{\bar\mu;s}(H^i\otimes f_{|s}) \\
&\hat{=}\; H^i\otimes f + K \sum_{s,s'=1}^{m} \bar\mu_{s'}^i\, \mathrm{Res}(\varphi_{s,s'}\, f_{|s}) \\
&= H^i\otimes f + K \sum_{s'=1}^{m} \bar\mu_{s'}^i \cdot \sum_{s=1}^{m} \mathrm{Res}\big((\varphi^{(s')} f)_{|s}\big) = H^i\otimes f.
\end{aligned}
\tag{3.10}
\]
As the reader may already have noticed, for any $\bar{\mathfrak{g}}$-root $\bar\alpha$ not only the poles, but also the zeroes of the meromorphic function $\prod_s (\varphi^{(s)})^{-(\bar\mu_s,\bar\alpha)}$ occur at the punctures $p_s$. In other words, these functions belong to the subset
\[
\mathcal{F}^*(\mathbb{P}^1_m) := \{ f \in \mathcal{F}(\mathbb{P}^1_m) \mid f^{-1} \in \mathcal{F}(\mathbb{P}^1_m) \}
\tag{3.11}
\]
of $\mathcal{F}(\mathbb{P}^1_m)$, where in accordance with the prescription (2.8) the symbol $f^{-1}$ stands for the function that has values inverse to those of $f$, i.e. $f^{-1}(p) = (f(p))^{-1}$ for all $p \in \mathbb{P}^1$. The elements of this subset $\mathcal{F}^*(\mathbb{P}^1_m) \subset \mathcal{F}(\mathbb{P}^1_m)$, which is in fact a subalgebra, are called the invertible elements or units of $\mathcal{F}(\mathbb{P}^1_m)$.

At this point it is appropriate to mention that the point at infinity of $\mathbb{P}^1 \cong \mathbb{C} \cup \{\infty\}$ is by no means distinguished geometrically, but acquires its special role only through the choice of coordinates. Choosing a different quasi-global coordinate $\tilde z$ on $\mathbb{P}^1$ one would assign the coordinate value $\infty$ to a different geometrical point. Accordingly we should also allow for $z = \infty$ as the value of $z$ at one of the insertion points, for definiteness say $z_m = \infty$. As it turns out, in this situation the discussion above indeed goes through, but with some modifications.

An obvious modification consists in the fact that in place of $\zeta_s = z - z_s$ (3.5) a good local coordinate at $z_m = \infty$ is provided by $\zeta_m = z^{-1}$. Accordingly we must e.g. replace the formula (2.7) for $\varphi_{s_\circ,s}$ by
\[
\varphi_{s_\circ,s}(t) :=
\begin{cases}
(t + z_{s_\circ} - z_s)^{-1} & \text{for } z_{s_\circ} \neq \infty, \\
(t^{-1} - z_s)^{-1} & \text{for } z_{s_\circ} = \infty.
\end{cases}
\tag{3.12}
\]
However, this is not the only change that needs to be made. One finds that in fact in the formula (2.6) for the automorphisms of $\mathfrak{g}$ one must now only include the contributions from $s \in \{1, 2, \dots, m-1\}$, but not from $s = m$, i.e. one now has
\[
\sigma_{\bar\mu;s_\circ}(H^i\otimes f) := H^i\otimes f + K \sum_{s=1}^{m-1} \bar\mu_s^i\, \mathrm{Res}(\varphi_{s_\circ,s} f), \qquad
\sigma_{\bar\mu;s_\circ}(E^{\bar\beta}\otimes f) := E^{\bar\beta}\otimes f \cdot \Phi_{\bar\beta},
\tag{3.13}
\]
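where the definition (2.9) of $\Phi_{\bar\alpha}$ is to be replaced by
\[
\Phi_{\bar\alpha} := \prod_{s=1}^{m-1} (\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\alpha)}.
\tag{3.14}
\]
Moreover, the formula (3.13) applies only for $s_\circ \in \{1, 2, \dots, m-1\}$, while for $s_\circ = m$ it gets replaced by
\[
\sigma_{\bar\mu;m}(H^i\otimes f) := H^i\otimes f - K \sum_{s=1}^{m-1} \bar\mu_s^i\, \mathrm{Res}(t^{-2} \varphi_{s_\circ,s} f), \qquad
\sigma_{\bar\mu;m}(E^{\bar\beta}\otimes f) := E^{\bar\beta}\otimes f \cdot \Phi_{\bar\beta}.
\tag{3.15}
\]
Similarly, instead of (3.9), the associated automorphism of the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$ is now given by
\[
\vec\sigma_{\bar\mu}(H^i\otimes f) = H^i\otimes f, \qquad
\vec\sigma_{\bar\mu}(E^{\bar\beta}\otimes f) = E^{\bar\beta}\otimes f \cdot \prod_{s=1}^{m-1} (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}.
\tag{3.16}
\]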
After these replacements all calculations go through as before. In particular, still the functions $\prod_{s=1}^{m-1} (\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\beta)}$ are the local germs of the globally defined function $\prod_{s=1}^{m-1} (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}$ at $p_{s_\circ}$ for all $s_\circ \in \{1, 2, \dots, m\}$. Also, that function is still an element of $\mathcal{F}^*(\mathbb{P}^1_m)$. (On the other hand, of course now one can no longer invoke the identity $\sum_{s=1}^{m} \bar\mu_s = 0$ to exclude poles at infinity; but this is also no longer needed, since $\infty$ is a puncture. Explicitly, now the divisor of the function $\prod_s (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}$ consists of the points $p_s$, $s = 1, 2, \dots, m-1$, with finite value of $z_s$, which have multiplicity $-(\bar\mu_s, \bar\beta)$, and in addition of $\infty$ which has multiplicity $\sum_{s=1}^{m-1} (\bar\mu_s, \bar\beta) = -(\bar\mu_m, \bar\beta)$.) Briefly, the function $\prod_{s=1}^{m-1} (\varphi^{(s)})^{-(\bar\mu_s,\bar\beta)}$ has already by itself a pole respectively zero of the correct order at $\infty$.

Let us also note that at $z_m = \infty$ the formula (2.25) for $\sigma_{\bar\mu;s_\circ}(H^i\otimes\mathrm{id})$ still applies. Namely, in this case we have
\[
\sigma_{\bar\mu;m}(H^i\otimes\mathrm{id}) = H^i\otimes\mathrm{id} - K \sum_{s=1}^{m-1} \bar\mu_s^i\, \mathrm{Res}\big(t^{-2} (t^{-1} - z_s)^{-1}\big).
\tag{3.17}
\]
Using the fact that $\mathrm{Res}(t^{-2} (t^{-1} - z_s)^{-1}) = 1$ independently of $s$ together with the condition that the shift vectors add up to zero, this is nothing but
\[
\sigma_{\bar\mu;m}(H^i\otimes\mathrm{id}) = H^i\otimes\mathrm{id} + \bar\mu_m^i\, K.
\tag{3.18}
\]
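The step from (3.17) to (3.18) rests on the expansion $t^{-2}(t^{-1} - z_s)^{-1} = t^{-1}(1 - z_s t)^{-1} = \sum_{k \ge 0} z_s^k\, t^{k-1}$, whose coefficient of $t^{-1}$ is 1 for every $z_s$. The sketch below spells this out with exact rational arithmetic. The function names, the truncation parameter `order`, and the convention of passing the $i$-th components $(\bar\mu_1^i, \dots, \bar\mu_m^i)$ as a plain list are assumptions of the example.

```python
from fractions import Fraction


def residue_term_at_infinity(z_s, order=8):
    """Res of t^(-2) (t^(-1) - z_s)^(-1), read off from sum_k z_s^k t^(k-1)."""
    z = Fraction(z_s)
    series = {k - 1: z ** k for k in range(order + 1)}
    return series[-1]  # coefficient of t^(-1), i.e. the k = 0 term


def central_shift_at_infinity(mu_components, z_finite):
    """Coefficient of K in (3.17) for the puncture at infinity.

    mu_components: i-th components of (mu_1, ..., mu_m), adding up to zero.
    z_finite:      coordinates z_1, ..., z_{m-1} of the finite punctures.
    """
    assert sum(mu_components) == 0, "shift vectors must add up to zero"
    finite = mu_components[:-1]
    assert len(finite) == len(z_finite)
    return -sum(mu * residue_term_at_infinity(z) for mu, z in zip(finite, z_finite))
```

With the hypothetical data `mu_components = [2, -5, 3]` and `z_finite = [0, Fraction(7, 2)]`, the function returns 3, which is the last component $\bar\mu_m^i$, as stated in (3.18).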
It follows in particular that the statement that the outer automorphism class $[\sigma_{\bar\mu;s}]$ of $\sigma_{\bar\mu;s}$ can be read off the coweight lattice element $\bar\mu_s$ is still true for $s = m$ with $z_m = \infty$. We also mention that in the case $z_m = \infty$ the residue theorem still ensures the cancellation of central terms as in formula (3.10). Namely, while the summation over $s'$ in (3.10) is now only from 1 to $m-1$, the summation over $s$ still ranges from 1 to $m$; the desired result then follows after taking into account that $\mathrm{Res}((\varphi^{(s')} f)_{|m}) = \mathrm{Res}_\infty(\varphi^{(s')} f) = -\mathrm{Res}_0(t^{-2} \varphi^{(s')} f)$.
Finally we remark that the Sugawara construction provides us at each of the insertion points $p_s$ with a Virasoro algebra $\mathrm{Vir} \equiv \mathrm{Vir}_s$. We write $L^{(s)}_n$ for the generators of the Virasoro algebra $\mathrm{Vir}_s$ associated to $\mathfrak{g}_s$. Also note that in the formula (2.32) for the action of a multi-shift automorphism on $\mathrm{Vir}$ the sums over $s$ and $s'$ extend over all punctures except, when present, the one at $\infty$.

4. Inner Automorphisms

In Subsect. 2.2 we have seen how an automorphism of the affine Lie algebra $\mathfrak{g}$ can be implemented on $\mathfrak{g}$-modules. We can now implement the tensor product of $m$ multi-shift automorphisms on the tensor product of $m$ $\mathfrak{g}$-modules. We will see that this leads to a well-defined map on the spaces of chiral blocks because such a tensor product of multi-shift automorphisms leaves the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$ invariant. One convenient property that one would like to achieve for the maps on the chiral blocks is that they are of finite order. As it turns out, to this end we must find a suitable description for those multi-shift automorphisms which are inner automorphisms of $\bar{\mathfrak{g}}\otimes\mathcal{F}$.

The subset
\[
\Gamma := \Big\{ \bar\mu \equiv (\bar\mu_1, \bar\mu_2, \dots, \bar\mu_m) \;\Big|\; \bar\mu_s \in L^\vee \text{ for all } s = 1, 2, \dots, m;\ \sum_{s=1}^{m} \bar\mu_s = 0 \Big\}
\tag{4.1}
\]
of those elements of $\Gamma_w$ for which the allowed range of the shift vectors $\bar\mu_s$ is restricted to the coroot lattice $L^\vee \subseteq L_w^\vee$ of $\bar{\mathfrak{g}}$ clearly is a subgroup of $\Gamma_w$. In this section we show that the automorphisms (2.6) of $\mathfrak{g}$ and (3.9) of $\bar{\mathfrak{g}}\otimes\mathcal{F}$ associated to shift vectors that lie in $\Gamma$ are inner automorphisms of the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$, i.e. automorphisms that can be written as $\prod_i \exp(\mathrm{ad}_{x_i})$ with suitable elements $x_i$ of $\mathfrak{g}$ respectively $\bar{\mathfrak{g}}\otimes\mathcal{F}$;$^3$ a fortiori they are then also inner automorphisms of the ambient algebra $\mathfrak{g}^m \supset \bar{\mathfrak{g}}\otimes\mathcal{F}$.

For establishing this result we need to introduce some special inner automorphisms of the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$, and for these we need some particular elements of $\bar{\mathfrak{g}}\otimes\mathcal{F}$.
For convenience, and without loss of generality, in this section we assume that $0$ and $\infty$ are among the insertion points $p_s$, and number the latter in such a way that $z_m = \infty$; this has the technical advantage that for each insertion point $p_s$ with $s \in \{1, 2, \dots, m-1\}$$^4$ the function $\varphi^{(s)}$ defined by (3.8) already lies in the subalgebra $\mathcal{F}^*(\mathbb{P}^1_m)$ of units of $\mathcal{F}(\mathbb{P}^1_m)$. The relevant special elements of $\bar{\mathfrak{g}}\otimes\mathcal{F}$ are then of the form
\[
X_{\bar\alpha,f} := E^{\bar\alpha}\otimes f + E^{-\bar\alpha}\otimes f^{-1},
\tag{4.2}
\]
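where $E^{\bar\alpha} \in \bar{\mathfrak{g}}$ is a step operator of the horizontal subalgebra $\bar{\mathfrak{g}}$ and $f \in \mathcal{F}^*(\mathbb{P}^1_m)$ is a unit in $\mathcal{F}(\mathbb{P}^1_m)$. Furthermore, for any two elements $X, Y \in \bar{\mathfrak{g}}\otimes\mathcal{F}$ and any complex number $\xi$, we denote by $\mathrm{Ad}_{\xi;X,Y}$ the inner automorphism
\[
\mathrm{Ad}_{\xi;X,Y} := \exp(\mathrm{ad}_{-\xi Y}) \circ \exp(\mathrm{ad}_{\xi X})
\tag{4.3}
\]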
of the block algebra. We are interested in automorphisms of the form $\mathrm{Ad}_{\mathrm{i}\pi/2;X_{\bar\alpha,f_1},X_{\bar\alpha,f_2}}$ with an arbitrary positive $\bar{\mathfrak{g}}$-root $\bar\alpha$ and some suitable functions $f_1, f_2 \in \mathcal{F}^*(\mathbb{P}^1_m)$. By direct computation we find that
\[
\mathrm{Ad}_{\mathrm{i}\pi/2;X_{\bar\alpha,f_1},X_{\bar\alpha,f_2}}(H^i\otimes f) = H^i\otimes f
\tag{4.4}
\]

3 In the present section we only show that $\bar\mu \in \Gamma$ is sufficient for having an inner automorphism. But as we shall see later on, this is also necessary.

4 Recall that there is no function $\varphi^{(s)}$ associated to $\infty$.
for all $i = 1, 2, \ldots, r$ and
$$\mathrm{Ad}_{i\pi/2;X_{\bar\alpha,f_1},X_{\bar\alpha,f_2}}(E^{\bar\beta}\otimes f) = E^{\bar\beta}\otimes (f_1/f_2)^{-(\bar\alpha^\vee,\bar\beta)}\, f \tag{4.5}$$
for all $\bar{\mathfrak g}$-roots $\bar\beta$; here $\bar\alpha^\vee = 2\bar\alpha/(\bar\alpha,\bar\alpha) \in L^\vee$ is the coroot associated to the $\bar{\mathfrak g}$-root $\bar\alpha$. (Though straightforward, this calculation is somewhat lengthy; some relevant formulæ are collected in Appendix B.) The results (4.4) and (4.5) imply in particular that any two automorphisms of this form commute. Also, only the quotient $f_1/f_2$ appears in (4.5), so that without loss of generality we can put $f_2 = \mathrm{id}$. It follows in particular that when we compose inner automorphisms of the form $\mathrm{Ad}_{i\pi/2;X_{\bar\alpha,f_1},X_{\bar\alpha,f_2}}$ with the special choice $f_1 = \varphi^{(s)}$ as defined by (3.8) for some $s \in \{1, 2, \ldots, m-1\}$ and $f_2 = \mathrm{id}$ and with arbitrary $\bar{\mathfrak g}$-roots $\bar\alpha$, then we arrive at automorphisms of the form studied previously. Indeed,
$$\prod_{s=1}^{m-1} \prod_{i_s=1}^{\ell_s} \mathrm{Ad}_{i\pi/2;\,X_{\bar\alpha_{i_s},\varphi^{(s)}},\,X_{\bar\alpha_{i_s},1}} = \sigma_{\bar\mu}, \tag{4.6}$$
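where $\sigma_{\bar\mu}$ is the multi-shift automorphism of the block algebra $\bar{\mathfrak g}\otimes\mathcal F$ that acts as in (3.16), with
$$\bar\mu = (\bar\mu_1, \bar\mu_2, \ldots, \bar\mu_m), \qquad \bar\mu_s = \sum_{i_s=1}^{\ell_s} \bar\alpha^\vee_{i_s} \in L^\vee \quad \text{for } s = 1, 2, \ldots, m-1 \tag{4.7}$$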
(and where the ordering in the product is irrelevant). In short, for any choice of the $\bar{\mathfrak g}$-roots $\bar\alpha_{i_s}$ we obtain an inner automorphism of the form (3.16) with $\bar\mu_s \in L^\vee$. Moreover, by appropriately choosing these roots we can obtain every automorphism $\sigma_{\bar\mu}$ (3.16) for which $\bar\mu \in \Gamma$. It follows in particular that any automorphism $\sigma_{\bar\mu}$ with $\bar\mu \in \Gamma$ is an inner automorphism of the block algebra $\bar{\mathfrak g}\otimes\mathcal F$, as claimed.

When we replace the functions $f$, etc. that appeared in the considerations above, which are elements of $\mathcal F(\mathbb P^1_m)$, by functions that have the same mapping prescription but are regarded as Laurent series in the relevant local coordinate $\zeta_s$, respectively as formal Laurent series in the variable $t$, then we can immediately repeat all steps in the derivation of the formulæ (4.4) and (4.5); thereby for each summand $\mathfrak g_s$, $s = 1, 2, \ldots, m-1$, of the algebra $\bigoplus_{s=1}^m \mathfrak g_s \supset \mathfrak g^m$ we construct a certain class of commuting inner automorphisms. The action of these automorphisms differs from the one of the automorphisms of the block algebra precisely in that there arise additional terms that are proportional to the central element $K \equiv K_s$ of $\mathfrak g_s$. We find that (4.4) gets modified to (for some details see Appendix B)
$$\mathrm{Ad}_{i\pi/2;X_{\bar\alpha,f_1},X_{\bar\alpha,f_2}}(H^i\otimes f) = H^i\otimes f - (\bar\alpha^\vee)^i K \bigl(\mathrm{Res}(f_1^{-1}\,df_1\,f) - \mathrm{Res}(f_2^{-1}\,df_2\,f)\bigr), \tag{4.8}$$
while the formula (4.5) applies without any change also to the present situation. We now investigate the case where these Laurent series are precisely the local expansions of the functions $\varphi^{(s)}$ studied previously. Thus for any $s, s_\circ = 1, 2, \ldots, m-1$ we have to deal with the inner automorphism $\mathrm{Ad}_{i\pi/2;x_s,y_s}$ of $\mathfrak g_s$ with
$$x_s = E^{\bar\alpha}\otimes\varphi_{s_\circ,s} + E^{-\bar\alpha}\otimes(\varphi_{s_\circ,s})^{-1} \quad\text{and}\quad y_s = (E^{\bar\alpha} + E^{-\bar\alpha})\otimes \mathrm{id}, \tag{4.9}$$
J. Fuchs, C. Schweigert
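where $\varphi_{s_\circ,s}(t)$ is the local expansion (2.7) at $p_{s_\circ}$ of the function $\varphi^{(s)}$ as defined by (3.8). By inserting these functions in the general formulæ obtained above, we learn that the automorphism $\mathrm{Ad}_{i\pi/2;x_s,y_s}$ acts as
$$\begin{aligned}
\mathrm{Ad}_{i\pi/2;x_s,y_s}(H^i\otimes f_{|s_\circ}) &= H^i\otimes f_{|s_\circ} + (\bar\alpha^\vee)^i K\, \mathrm{Res}(\varphi_{s_\circ,s}\, f_{|s_\circ}), \\
\mathrm{Ad}_{i\pi/2;x_s,y_s}(E^{\bar\beta}\otimes f_{|s_\circ}) &= E^{\bar\beta}\otimes f_{|s_\circ}\, (\varphi_{s_\circ,s})^{-(\bar\alpha^\vee,\bar\beta)}.
\end{aligned} \tag{4.10}$$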
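The central term in (4.10) can be checked directly against (4.8). The following is a short sketch, assuming that the local expansion (2.7) has the form $\varphi_{s_\circ,s}(t) = (t + z_{s_\circ} - z_s)^{-1}$ (the form quoted for $\varphi^{[0]}_{s_\circ,s}$ in Subsect. 6.2) and that $f_2 = \mathrm{id}$ is the constant function, so that $df_2 = 0$; residues are taken at $t = 0$.

```latex
% Assumptions (not restated in this section): \varphi_{s_\circ,s}(t) = (t + z_{s_\circ} - z_s)^{-1},
% f_1 = \varphi_{s_\circ,s}, and f_2 = id is constant, hence df_2 = 0.
\begin{align*}
d\varphi_{s_\circ,s} &= -(t + z_{s_\circ} - z_s)^{-2}\,dt = -\varphi_{s_\circ,s}^{\,2}\,dt, \\
f_1^{-1}\,df_1 &= -\varphi_{s_\circ,s}\,dt, \qquad f_2^{-1}\,df_2 = 0, \\
-(\bar\alpha^\vee)^i K \bigl(\mathrm{Res}(f_1^{-1}df_1\,f) - \mathrm{Res}(f_2^{-1}df_2\,f)\bigr)
  &= +(\bar\alpha^\vee)^i K\,\mathrm{Res}(\varphi_{s_\circ,s}\,f).
\end{align*}
```

The sign flip between (4.8) and (4.10) thus comes entirely from the logarithmic derivative of $\varphi_{s_\circ,s}$.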
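Next we take the product of such automorphisms that is analogous to the product in (4.6). Thereby for each $s_\circ = 1, 2, \ldots, m-1$ we arrive at automorphisms $\sigma_{\bar\mu;s_\circ}$ of $\mathfrak g_{s_\circ}$ which act as $\sigma_{\bar\mu;s_\circ}(K) = K$ and
$$\begin{aligned}
\sigma_{\bar\mu;s_\circ}(H^i\otimes f_{|s_\circ}) &= H^i\otimes f_{|s_\circ} + K \sum_{s=1}^{m-1} \bar\mu^i_s\, \mathrm{Res}(\varphi_{s_\circ,s}\, f_{|s_\circ}), \\
\sigma_{\bar\mu;s_\circ}(E^{\bar\beta}\otimes f_{|s_\circ}) &= E^{\bar\beta}\otimes f_{|s_\circ} \cdot \prod_{s=1}^{m-1} (\varphi_{s_\circ,s})^{-(\bar\mu_s,\bar\beta)}.
\end{aligned} \tag{4.11}$$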
For $s_\circ = m$ an analogous result holds, with the formula in the first line replaced by
$$\sigma_{\bar\mu;m}(H^i\otimes f_{|m}) = H^i\otimes f_{|m} - K \sum_{s=1}^{m-1} \bar\mu^i_s\, \mathrm{Res}(t^{-2}\,\varphi_{m,s}\, f_{|m}) \tag{4.12}$$
(and with $\varphi_{m,s}$ as given in (3.12)). Thus again we have succeeded in constructing all those multi-shift automorphisms $\sigma_{\bar\mu}$ – this time the automorphisms (2.6) of the affine Lie algebra $\mathfrak g$ – for which the shift vector $\bar\mu$ lies in the subgroup $\Gamma$ of $\Gamma_w$, and hence can conclude that all such automorphisms are inner automorphisms of $\mathfrak g$.

Of course, the function $f_{|s_\circ}(\varphi_{s_\circ,s})^{-(\bar\alpha^\vee,\bar\beta)} = f_{|s_\circ}(\varphi^{(s)}_{|s_\circ})^{-(\bar\alpha^\vee,\bar\beta)}$ is nothing but the local expansion of the function $f\,(\varphi^{(s)})^{-(\bar\alpha^\vee,\bar\beta)}$ at $p_s$. Hence in particular upon summation over $s$ the central terms in the transformation (4.10) of $H^i\otimes f$ cancel owing to the residue theorem. Thus the automorphisms (4.11) are the local realizations of the automorphism (4.6) of the block algebra. One may also analyze the analogous inner multi-shift automorphisms of the Virasoro algebra. This is briefly mentioned at the end of Appendix B.
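The cancellation of the central terms can be illustrated in the simplest case $f = 1$ with a single shift $\bar\mu_s$ at a fixed finite puncture $s < m$. This is only a sketch under assumptions that are not restated in this section: the local expansions are taken to be $\varphi_{s_\circ,s}(t) = (t + z_{s_\circ} - z_s)^{-1}$ at the finite punctures, and the expansion (3.12) at $p_m = \infty$ is taken to be $\varphi_{m,s}(t) = t/(1 - z_s t)$, i.e. the local coordinate at infinity is assumed to be $t = 1/z$.

```latex
% Central terms of (4.11) and (4.12) for f = 1 and one fixed s < m.
% Assumed forms: \varphi_{s_\circ,s}(t) = (t + z_{s_\circ} - z_s)^{-1} for s_\circ < m,
%                \varphi_{m,s}(t) = t/(1 - z_s t)  (local coordinate t = 1/z at infinity).
\begin{align*}
s_\circ = s:&\quad \mathrm{Res}_{t=0}\bigl(t^{-1}\bigr) = 1, \\
s_\circ \neq s,\ s_\circ < m:&\quad \mathrm{Res}_{t=0}\bigl((t + z_{s_\circ} - z_s)^{-1}\bigr) = 0, \\
s_\circ = m:&\quad -\,\mathrm{Res}_{t=0}\Bigl(t^{-2}\,\frac{t}{1 - z_s t}\Bigr) = -1, \\
\text{sum over all } s_\circ:&\quad K\,\bar\mu^i_s\,(1 + 0 - 1) = 0 .
\end{align*}
```

The contribution at the puncture $p_s$ itself is thus compensated by the contribution at infinity, as the residue theorem requires for the global one-form $\varphi^{(s)}\,dz$.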
5. Implementation on Chiral Blocks

5.1. Implementation on tensor products. The next step is now to implement a tensor product of multi-shift automorphisms on a tensor product of modules of the affine Lie algebra. This gives rise to a projective action of the group $\Gamma_w$ on this tensor product and, of course, also to a dual action on the algebraic dual of the tensor product. As we will explain, the space $B$ of chiral blocks can be identified with a subspace in this algebraic dual. We will see that the action of $\Gamma_w$ can be restricted to $B$ and that this action has finite order on the subspace $B$.

Let us first recall the existence of the maps $\Theta_{\bar\mu} \equiv \Theta_{\bar\mu;s_\circ}$ (2.19). They act on the $s_\circ$-th factor of the tensor product
$$\mathcal H \equiv \mathcal H_\Lambda = \mathcal H_{\Lambda_1} \otimes \mathcal H_{\Lambda_2} \otimes \cdots \otimes \mathcal H_{\Lambda_m} \tag{5.1}$$
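of irreducible highest weight modules over $\mathfrak g$, i.e. $\Theta_{\bar\mu}: \mathcal H_{\Lambda_{s_\circ}} \to \mathcal H_{\omega^\star\Lambda_{s_\circ}}$, with $\omega^\star$ the permutation of integrable weights that was described before (2.19). The tensor product
$$\Theta_{\bar\mu} := \Theta_{\bar\mu;1} \otimes \Theta_{\bar\mu;2} \otimes \cdots \otimes \Theta_{\bar\mu;m} \tag{5.2}$$
of these maps for all $s_\circ = 1, 2, \ldots, m$ provides us with an analogous map between tensor products,
$$\Theta_{\bar\mu}: \ \mathcal H_\Lambda \to \mathcal H_{\omega^\star\Lambda}, \tag{5.3}$$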
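where $\omega^\star\Lambda = (\omega^\star\Lambda_1, \omega^\star\Lambda_2, \ldots, \omega^\star\Lambda_m)$. Now note that this map is defined for any $\bar\mu \in \Gamma_w$. Moreover,
$$\bar\mu \mapsto \Theta_{\bar\mu} \tag{5.4}$$
constitutes a projective representation of $\Gamma_w$, i.e. up to possibly a $U(1)$-valued cocycle $\epsilon$, the group law of $\Gamma_w$ is also realized by the action of the maps $\Theta_{\bar\mu}$ on the modules,
$$\Theta_{\bar\mu_1}\Theta_{\bar\mu_2} = \epsilon(\bar\mu_1, \bar\mu_2)\, \Theta_{\bar\mu_1+\bar\mu_2}; \tag{5.5}$$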
moreover, the cocycle $\epsilon$ on $\Gamma_w$ is uniquely determined by a cocycle on the finite group $\Gamma_w/\Gamma$.

This result can be established as follows. First we realize that the highest weight vector of the $\mathfrak g$-module $\mathcal H_\Lambda$ can be required to be normalized and then is unique up to a phase. Accordingly the maps $\Theta_{\bar\mu;s_\circ}$ are unique up to a phase, too. Our aim is to show that we can choose this phase as a function of $\bar\mu$ in a suitable manner. To this end we proceed in several steps. We first define the maps $\Theta_{\bar\mu}$ for elements $\bar\mu = \bar\gamma \in \Gamma$ of the subgroup $\Gamma$ (4.1). This group is a finitely generated free abelian group, so there is a finite set $\{\bar\gamma_{(j)}\}$ of independent generators. For instance one can choose these generators to be all of the form
$$\bar\gamma_{(j);s_1} = \bar\beta^\vee, \qquad \bar\gamma_{(j);s_2} = -\bar\beta^\vee, \qquad \bar\gamma_{(j);s} = 0 \quad \text{for } s \neq s_1, s_2 \tag{5.6}$$
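with suitable choices of $s_1, s_2 \in \{1, 2, \ldots, m\}$ with $s_1 < s_2$ and simple $\bar{\mathfrak g}$-roots $\bar\beta$. Now for each $\bar\gamma_{(j)}$ we fix the freedom in the definition of the implementing maps by explicitly prescribing a preferred implementation $\tilde\Theta_{\bar\gamma_{(j)}}$ through elements of the block algebra. Concretely, in the case of generators of the form (5.6) we set
$$\begin{aligned}
\tilde\Theta_{\bar\gamma_{(j)}} := \mathrm{id} \otimes \cdots \otimes \mathrm{id}
 &\otimes \exp\bigl(-\tfrac{i\pi}{2} X_{\bar\beta,1}\bigr) \exp\bigl(\tfrac{i\pi}{2} X_{\bar\beta,\varphi_{s_\circ,s_1}}\bigr) \otimes \mathrm{id} \otimes \cdots \\
 \cdots \otimes \mathrm{id}
 &\otimes \exp\bigl(-\tfrac{i\pi}{2} X_{\bar\beta,1}\bigr) \exp\bigl(\tfrac{i\pi}{2} X_{\bar\beta,\varphi_{s_\circ,s_2}}\bigr) \otimes \mathrm{id} \otimes \cdots \otimes \mathrm{id},
\end{aligned} \tag{5.7}$$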
where the notation $X_{\bar\alpha,f}$ is as introduced in (4.2) and where the non-trivial maps occur at the $s_1$-th and $s_2$-th factor of the tensor product. A priori it might be expected that the maps $\tilde\Theta_{\bar\gamma_{(j)}}$ for different values of $j$ do not commute, but rather satisfy
$$\tilde\Theta_{\bar\gamma_{(j)}}\, \tilde\Theta_{\bar\gamma_{(k)}} = \eta(\bar\gamma_{(j)}, \bar\gamma_{(k)})\; \tilde\Theta_{\bar\gamma_{(k)}}\, \tilde\Theta_{\bar\gamma_{(j)}}. \tag{5.8}$$
But as a matter of fact, when there are non-trivial chiral blocks in $\mathcal H^\star$ (which is the case we will be most interested in), then the numbers $\eta$ introduced this way are equal to one. To see this, we note that the fact that the implementing maps (5.7) are realized through elements of the block algebra implies (compare the remarks around (5.25) below) that each of the $\tilde\Theta_{\bar\gamma_{(j)}}$ acts as the identity on chiral blocks. The result then follows by acting with both sides of (5.8) (or more precisely, with their implementation on the blocks, as defined in formula (5.29) below) on a chiral block.

We now write any arbitrary $\bar\gamma \in \Gamma$ uniquely as a linear combination $\bar\gamma = \sum_j n_j \bar\gamma_{(j)}$ and define the image of $\bar\gamma$ under the map (5.4) as
$$\Theta_{\bar\gamma} := \prod_j \bigl(\tilde\Theta_{\bar\gamma_{(j)}}\bigr)^{n_j}. \tag{5.9}$$
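By the result just obtained, the order of the factors in this product does not matter. Also, by the same argument as before one concludes that
$$\Theta_{\bar\gamma_1}\Theta_{\bar\gamma_2} = \Theta_{\bar\gamma_1+\bar\gamma_2} = \Theta_{\bar\gamma_2}\Theta_{\bar\gamma_1} \quad \text{for all } \bar\gamma_1, \bar\gamma_2 \in \Gamma. \tag{5.10}$$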
Further, suppose that we have made a choice also for the implementations $\Theta_{\bar\mu}$ for all other $\bar\mu \in \Gamma_w$ (which we do not yet specify, because the actual choice is not relevant for the argument). Then again by the same reasoning as before, i.e. by considering the induced maps on the blocks, one deduces that (5.10) in fact generalizes to
$$\Theta_{\bar\gamma}\Theta_{\bar\mu} = \Theta_{\bar\mu}\Theta_{\bar\gamma} \quad \text{for all } \bar\gamma \in \Gamma,\ \bar\mu \in \Gamma_w. \tag{5.11}$$
Next we choose some set $M = \{\bar\kappa_{(i)}\}$ of coset representatives for $\Gamma_w/\Gamma$, which is a finite abelian group. For each of the $\bar\kappa_{(i)}$ we make some choice of the implementing map $\Theta_{\bar\kappa_{(i)}}$, with the arbitrary phase for the moment still left unspecified. Having made these choices, every $\bar\mu \in \Gamma_w$ can be uniquely written as $\bar\mu = \bar\gamma(\bar\mu) + \bar\kappa_{(i(\bar\mu))}$ with $\bar\gamma(\bar\mu) \in \Gamma$ and $\bar\kappa_{(i(\bar\mu))} \in M$; we then define
$$\Theta_{\bar\mu} := \Theta_{\bar\gamma(\bar\mu)}\, \Theta_{\bar\kappa_{(i(\bar\mu))}} \tag{5.12}$$
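(where the order of the factors is again irrelevant). Note that this way we have also defined all maps $\Theta_{\bar\kappa_{(i)}+\bar\kappa_{(j)}}$, i.e. in particular also for those cases where $\bar\kappa_{(i)}+\bar\kappa_{(j)} \notin M$. Therefore we can define phases $\epsilon(\bar\kappa_{(i)}, \bar\kappa_{(j)})$ by
$$\Theta_{\bar\kappa_{(i)}}\, \Theta_{\bar\kappa_{(j)}} = \epsilon(\bar\kappa_{(i)}, \bar\kappa_{(j)})\; \Theta_{\bar\kappa_{(i)}+\bar\kappa_{(j)}}. \tag{5.13}$$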
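Finally, combining formula (5.13) with the result (5.11), we learn that for arbitrary $\bar\mu, \bar\nu \in \Gamma_w$ we have
$$\Theta_{\bar\mu}\Theta_{\bar\nu} = \Theta_{\bar\gamma(\bar\mu)}\Theta_{\bar\kappa_{(i(\bar\mu))}}\, \Theta_{\bar\gamma(\bar\nu)}\Theta_{\bar\kappa_{(i(\bar\nu))}} = \epsilon(\bar\kappa_{(i(\bar\mu))}, \bar\kappa_{(i(\bar\nu))})\; \Theta_{\bar\gamma(\bar\mu)+\bar\gamma(\bar\nu)}\, \Theta_{\bar\kappa_{(i(\bar\mu))}+\bar\kappa_{(i(\bar\nu))}}. \tag{5.14}$$
When $\bar\kappa_{(i(\bar\mu))}+\bar\kappa_{(i(\bar\nu))} \in M$, this yields immediately
$$\Theta_{\bar\mu}\Theta_{\bar\nu} = \epsilon(\bar\kappa_{(i(\bar\mu))}, \bar\kappa_{(i(\bar\nu))})\; \Theta_{\bar\mu+\bar\nu}, \tag{5.15}$$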
while for $\bar\kappa_{(i(\bar\mu))}+\bar\kappa_{(i(\bar\nu))} \notin M$ the same result is obtained after invoking the definition (5.12) and the identity (5.10). We thus conclude that, as claimed, the group law of $\Gamma_w$ is realized by the maps $\Theta_{\bar\mu}$ up to a cocycle $\epsilon$, and furthermore this cocycle is induced from the cocycle on $\Gamma_w/\Gamma$ that was introduced in formula (5.13).

For later reference we also note that (compare e.g. [25]) the center $Z(\mathbb C_\epsilon(\Gamma_w/\Gamma))$ of the twisted group algebra $\mathbb C_\epsilon(\Gamma_w/\Gamma)$ is the ordinary group algebra $Z(\mathbb C_\epsilon(\Gamma_w/\Gamma)) = \mathbb C((\Gamma_w/\Gamma)^\circ)$ of the subgroup
$$(\Gamma_w/\Gamma)^\circ := \{[\bar\kappa_{(i)}] \in \Gamma_w/\Gamma \mid \epsilon(\bar\kappa_{(i)}, \bar\kappa_{(j)}) = \epsilon(\bar\kappa_{(j)}, \bar\kappa_{(i)}) \ \text{for all } [\bar\kappa_{(j)}] \in \Gamma_w/\Gamma\} \tag{5.16}$$
of so-called regular elements of $\Gamma_w/\Gamma$.

At this point it is appropriate to recall that so far we have left the phase choices for the maps $\Theta_{\bar\kappa_{(i)}}$ undetermined. Changing these phases will change the cocycle $\epsilon$ by a coboundary. Thus by adjusting these phases we can achieve obtaining some preferred representative cocycle in the cohomology class of $\epsilon$. From the cohomological properties of finite abelian groups (see e.g. [26]) it follows in particular that this way we can achieve the property that all the numbers $\epsilon(\bar\kappa_{(i)}, \bar\kappa_{(j)})$ are roots of unity. In the sequel we will often assume that such a phase choice has been made.

Taken together, these results will allow us to implement the multi-shift automorphisms also on chiral blocks in such a way that we even obtain a projective action of the group $\Gamma_w$ on the space of chiral blocks. To make the implementation of $\Gamma_w$ on blocks explicit, we need an appropriate description of the chiral blocks in terms of the action of the block algebra.
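The two cohomological statements used in this subsection can be made explicit in a few lines. In the sketch below the cocycle of (5.5) is written as $\epsilon$, and $c(\bar\mu) \in U(1)$ is a hypothetical name, introduced only for illustration, for the phase by which an implementing map is changed; the maps $\Theta_{\bar\mu}$ are assumed to be invertible.

```latex
% Associativity of the composition of maps, applied to (5.5):
\begin{align*}
(\Theta_{\bar\mu}\Theta_{\bar\nu})\,\Theta_{\bar\rho}
  &= \epsilon(\bar\mu,\bar\nu)\,\epsilon(\bar\mu+\bar\nu,\bar\rho)\;\Theta_{\bar\mu+\bar\nu+\bar\rho}, \\
\Theta_{\bar\mu}\,(\Theta_{\bar\nu}\Theta_{\bar\rho})
  &= \epsilon(\bar\nu,\bar\rho)\,\epsilon(\bar\mu,\bar\nu+\bar\rho)\;\Theta_{\bar\mu+\bar\nu+\bar\rho},
\end{align*}
% so \epsilon satisfies the two-cocycle condition
\[
\epsilon(\bar\mu,\bar\nu)\,\epsilon(\bar\mu+\bar\nu,\bar\rho)
  = \epsilon(\bar\nu,\bar\rho)\,\epsilon(\bar\mu,\bar\nu+\bar\rho).
\]
% Rephasing \Theta'_{\bar\mu} := c(\bar\mu)\,\Theta_{\bar\mu} (c is a hypothetical name):
\[
\Theta'_{\bar\mu}\Theta'_{\bar\nu}
  = \epsilon'(\bar\mu,\bar\nu)\;\Theta'_{\bar\mu+\bar\nu},
\qquad
\epsilon'(\bar\mu,\bar\nu)
  = \epsilon(\bar\mu,\bar\nu)\,\frac{c(\bar\mu)\,c(\bar\nu)}{c(\bar\mu+\bar\nu)} .
\]
```

The factor $c(\bar\mu)\,c(\bar\nu)/c(\bar\mu+\bar\nu)$ is a coboundary, which is the sense in which a change of phases leaves the cohomology class of the cocycle unchanged.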
5.2. Chiral blocks from co-invariants. In a representation theoretic approach, the chiral blocks of a conformal field theory are constructed with the help of co-invariants $\lfloor\mathcal H\rfloor_{\mathfrak B}$ of tensor products $\mathcal H \equiv \mathcal H_\Lambda$ (5.1) of irreducible modules over the chiral algebra with respect to a suitable block algebra $\mathfrak B$ (see e.g. [6,2]). This statement involves two new ingredients that need to be explained. First, in general, by a co-invariant of a module $V$ over some Lie algebra $\mathfrak h$ one means the quotient vector space
$$\lfloor V\rfloor_{\mathfrak h} := V / U^+(\mathfrak h)V, \tag{5.17}$$
where $U^+(\mathfrak h) = \mathfrak h\,U(\mathfrak h)$ with $U(\mathfrak h)$ the universal enveloping algebra of $\mathfrak h$. When the $\mathfrak h$-module $V$ is fully reducible, then (5.17) is just the submodule of $\mathfrak h$-singlets in $V$, but generically it is a genuine quotient which cannot be identified with a subspace of $V$. And second, the action of the block algebra $\mathfrak B = \bar{\mathfrak g}\otimes\mathcal F$ on the tensor product vector space $\mathcal H$ is defined by its expansions in local coordinates, i.e. for $X = \bar x\otimes f$ and $v = v_1\otimes v_2\otimes\cdots\otimes v_m$ one has
$$X\,v := \sum_{s=1}^m v_1\otimes v_2\otimes\cdots\otimes v_{s-1}\otimes (\bar x\otimes f_{|s})\,v_s \otimes v_{s+1}\otimes\cdots\otimes v_m, \tag{5.18}$$
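where $\bar x\otimes f_{|s}$ is regarded as an element of the loop algebra $\bar{\mathfrak g}_{\rm loop}$ and hence of $\mathfrak g$, or more precisely, as the representation matrix of that element of $\mathfrak g$ in the $\mathfrak g$-representation $R_{\Lambda_s}$. (That this yields a $\bar{\mathfrak g}\otimes\mathcal F$-representation follows by
$$\begin{aligned}
\bigl((\bar x\otimes f)(\bar y\otimes g) - (\bar y\otimes g)(\bar x\otimes f)\bigr)(v_1\otimes v_2\otimes\cdots\otimes v_m)
 &= \sum_{s=1}^m v_1\otimes v_2\otimes\cdots\otimes [\bar x\otimes f_{|s},\, \bar y\otimes g_{|s}]\,v_s\otimes\cdots\otimes v_m \\
 &= \sum_{s=1}^m v_1\otimes v_2\otimes\cdots\otimes \bigl([\bar x,\bar y]\otimes(fg)_{|s}\bigr)v_s\otimes\cdots\otimes v_m \\
 &\quad + K\,\kappa(\bar x,\bar y)\,\Bigl(\sum_{s=1}^m \mathrm{Res}_{p_s}(df\,g)\Bigr)\, v_1\otimes v_2\otimes\cdots\otimes v_m,
\end{aligned} \tag{5.19}$$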
where in the first equality one uses the fact that terms acting on different tensor factors cancel and in the second equality the bracket relations of $\mathfrak g$ are inserted. The terms in (5.19) that involve the central element $K$ cancel as a consequence of the residue formula, while the other terms add up to $[\bar x\otimes f, \bar y\otimes g]\,v$, where the Lie bracket is the one of $\bar{\mathfrak g}\otimes\mathcal F$ as defined in (3.4).)

The finite-dimensional vector spaces $\lfloor\mathcal H\rfloor_{\mathfrak B}$ of co-invariants play the role of the dual spaces $B^\star$ of the chiral block spaces $B$, i.e.
$$B_\Lambda = \bigl(\lfloor\mathcal H_\Lambda\rfloor_{\mathfrak B}\bigr)^\star. \tag{5.20}$$
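By duality, the chiral blocks can then also be regarded as the invariants (or in other words, singlet submodules) in the algebraic dual $\mathcal H^\star$ of $\mathcal H$, i.e.
$$B_\Lambda = (\mathcal H^\star_\Lambda)^{\mathfrak B}; \tag{5.21}$$
in this description, the blocks are linear forms $\beta$ on $\mathcal H$ with the property
$$\langle\beta, Xv\rangle \equiv \beta(Xv) = 0 \tag{5.22}$$
for all $X \in \bar{\mathfrak g}\otimes\mathcal F$ and all $v \in \mathcal H$.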
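The passage from (5.20) to (5.21) and (5.22) is the standard identification of the dual of a quotient with an annihilator. As a brief sketch for a general module $V$ over a Lie algebra $\mathfrak h$, with $W$ a name introduced here for the subspace that is divided out in (5.17):

```latex
% W := U^+(\mathfrak h)V = \mathfrak h\,U(\mathfrak h)V = \mathfrak h V, because U(\mathfrak h)V = V.
% A linear form on V/W is the same as a linear form on V that vanishes on W:
\begin{align*}
\bigl(V/W\bigr)^\star
  &\cong \{\beta \in V^\star \mid \beta(w) = 0 \ \text{for all } w \in W\} \\
  &= \{\beta \in V^\star \mid \beta(Xv) = 0 \ \text{for all } X \in \mathfrak h,\ v \in V\}.
\end{align*}
```

With $V = \mathcal H$ and $\mathfrak h = \bar{\mathfrak g}\otimes\mathcal F$ the last line is precisely the condition (5.22).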
5.3. Isomorphisms of chiral blocks. In (5.17) we followed the habit of suppressing the symbol $R$ for the representation by which $\mathfrak h$ acts on the vector space $V$, e.g. $\lfloor V\rfloor_{\mathfrak h}$ is a shorthand for $\lfloor V\rfloor_{R(\mathfrak h)}$. This cannot cause any confusion as long as we only deal with a single $\mathfrak h$-module $(R, V)$ which is based on the vector space $V$. On the other hand, as is easily checked, given any automorphism $\sigma$ of $\mathfrak h$, together with $(R, V)$ also $(\tilde R, V) := (R\circ\sigma, V)$ furnishes a module over $\mathfrak h$. Accordingly we are then dealing with two different actions of $\mathfrak h$ on $V$, and hence with two different spaces (5.17) of co-invariants. It is then natural to try to associate to any automorphism $\sigma$ of $\mathfrak h$ a corresponding mapping of the respective spaces of co-invariants. This is easily achieved once we are given a linear map $\Theta_\sigma: V \to V$ with the property that $\Theta_\sigma R(x)\,v = \tilde R(x)\,\Theta_\sigma v$ for all $x \in \mathfrak h$ and all $v \in V$, or in short (suppressing again the symbol $R$),
$$\Theta_\sigma\, x = \sigma(x)\, \Theta_\sigma \tag{5.23}$$
for all $x \in \mathfrak h$. Namely, we observe that via the prescription $\sigma(1) := 1$ and linearity, the automorphism $\sigma$ of $\mathfrak h$ extends to an automorphism of the enveloping algebra $U(\mathfrak h)$ that respects the filtration, hence also to an automorphism of $U^+(\mathfrak h)$. Because of (5.23) the prescription
$$\lfloor V\rfloor_{R(\mathfrak h)} \ni [v] \mapsto [\Theta_\sigma v] \in \lfloor V\rfloor_{\tilde R(\mathfrak h)} \tag{5.24}$$
then supplies us with a well-defined mapping from $\lfloor V\rfloor_{R(\mathfrak h)}$ to $\lfloor V\rfloor_{\tilde R(\mathfrak h)}$, and by construction this is in fact an isomorphism of vector spaces.

Now it is a general fact that inner automorphisms act trivially on co-invariants. Namely, an inner automorphism $\sigma$ of $\mathfrak h$ can by definition be written as a product of finitely many automorphisms $\sigma_i = \exp(\mathrm{ad}_{x_i})$ for some elements $x_i \in \mathfrak h$. Moreover, for every $y \in \mathfrak h$ we have (compare the formulæ (2.18) and (2.21)) ⁵
$$R\circ\sigma_i(y) = R(\exp(\mathrm{ad}_{x_i})(y)) = \exp(R(x_i))\, R(y)\, \exp(-R(x_i)). \tag{5.25}$$
By expanding the exponentials it then follows in particular that
$$\tilde R(y)\,v = R\circ\sigma(y)\,v = R(y)\,v \mod U^+(\mathfrak h)V \tag{5.26}$$
for all $v \in V$ and all $y \in \mathfrak h$. By the definition of co-invariants, this means that the modules $(R, V)$ and $(\tilde R, V)$ possess the same co-invariants. By duality, this also means that the dual spaces to these modules possess the same invariants.

When applied to the situation of our interest, these general observations tell us that as far as the study of chiral blocks is concerned we need to regard the automorphisms $\sigma_{\bar\mu}$ of the block algebra $\bar{\mathfrak g}\otimes\mathcal F$ that we defined in (3.16) above only modulo inner automorphisms. Accordingly we should determine the outer automorphism class to which a multi-shift automorphism $\sigma_{\bar\mu}$ of $\bar{\mathfrak g}\otimes\mathcal F$ belongs. We have already addressed this question in Sect. 2 for the case of multi-shift automorphisms of the affine Lie algebra $\mathfrak g$. Now we study the same issue for the block algebra $\bar{\mathfrak g}\otimes\mathcal F$. We first note that the result of Sect. 2 implies that whenever at least one of the vectors $\bar\mu_s$ is not a coroot, then $\sigma_{\bar\mu;s}$ is an outer automorphism of the corresponding affine Lie algebra $\mathfrak g_s$; it follows that in this case it is also an outer automorphism of $\mathfrak g^m$, and thereby a fortiori an outer automorphism of the block algebra $\bar{\mathfrak g}\otimes\mathcal F$.

In other words, the subgroup $\Gamma$ (4.1) already exhausts the set of those shift vectors $\bar\mu \in \Gamma_w$ for which $\sigma_{\bar\mu}$ is an inner automorphism of $\bar{\mathfrak g}\otimes\mathcal F$. Thus the group $\Gamma_{\rm out}$ of outer modulo inner multi-shift automorphisms of $\bar{\mathfrak g}\otimes\mathcal F$ is precisely the factor group
$$\Gamma_{\rm out} = \Gamma_w/\Gamma. \tag{5.27}$$
According to the isomorphism (2.29) between $L^\vee_w/L^\vee$ and the unique maximal abelian normal subgroup $Z(\mathfrak g)$ of $\mathrm{Out}(\mathfrak g) = \mathrm{Aut}(\mathfrak g)/\mathrm{Int}(\mathfrak g)$, we thus have
$$\Gamma_{\rm out} \cong (Z(\mathfrak g))^{m-1}. \tag{5.28}$$
In particular, $\Gamma_{\rm out}$ is a finite group of order $\mathrm{ord}(\Gamma_{\rm out}) = |L^\vee_w/L^\vee|^{m-1}$. For future reference we mention that a set of distinguished representatives of the elements of $\mathrm{Out}(\mathfrak g)$ is provided by the diagram automorphisms $\omega_{\bar\mu;s}$ of $\mathfrak g$ (see Subsect. 2.2); thus elements of $\Gamma_{\rm out}$ may be regarded as collections of suitable diagram automorphisms whose product is the identity. We also recall that when implementing the group $\Gamma_w$ through the maps $\Theta_{\bar\mu}$ which satisfy (5.5), we are effectively dealing with a two-cocycle on the finite abelian group $\Gamma_w/\Gamma = \Gamma_{\rm out}$.

Explicitly, for every $\bar\mu \in \Gamma_w$ we have a map $\Theta^\star_{\bar\mu}$ from the chiral blocks $B_\Lambda = (\mathcal H^\star_\Lambda)^{\mathfrak B}$ to $\mathcal H^\star_{\omega^\star\Lambda}$ which acts as
$$\langle\Theta^\star_{\bar\mu}(\beta), v\rangle := \langle\beta, \Theta^{-1}_{\bar\mu} v\rangle. \tag{5.29}$$
5 This may be checked by replacing $x_i$ by $\xi x_i$ and comparing both sides order by order in the dummy variable $\xi$. In the special case that $\mathfrak h$ integrates to a group, the formula is an immediate consequence of the fact that $\exp(\mathrm{ad}_x)(y) = \gamma y \gamma^{-1}$, where $\gamma$ is a group element such that $\gamma = \exp(x)$.
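The order-by-order check of (5.25) that footnote 5 refers to can be sketched as follows. The only inputs are that $R$ is a Lie algebra homomorphism and the standard expansion of a conjugation by exponentials; both sides are treated as formal power series in $\xi$, and the index $i$ is suppressed.

```latex
% Replace x by \xi x in (5.25) and expand both sides in powers of \xi.
\begin{align*}
R\bigl(\exp(\xi\,\mathrm{ad}_x)(y)\bigr)
  &= \sum_{n\ge 0} \frac{\xi^n}{n!}\; R\bigl(\mathrm{ad}_x^{\,n}(y)\bigr), \\
\exp(\xi R(x))\, R(y)\, \exp(-\xi R(x))
  &= \sum_{n\ge 0} \frac{\xi^n}{n!}\; \bigl(\mathrm{ad}_{R(x)}\bigr)^{n} R(y), \\
R\bigl(\mathrm{ad}_x^{\,n}(y)\bigr)
  &= \bigl(\mathrm{ad}_{R(x)}\bigr)^{n} R(y)
  \quad \text{by induction on } n, \text{ using } R([x,y]) = [R(x), R(y)].
\end{align*}
```

The coefficients of $\xi^n$ on the two sides therefore agree for every $n$.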
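The previous results imply, first, that the image of this map is in $B_{\omega^\star\Lambda}$, i.e. that for any block $\beta$, $\Theta^\star_{\bar\mu}(\beta)$ is again a chiral block, and second, that this map depends on $\bar\mu$ only via its class $[\bar\mu]$ in $\Gamma_{\rm out}$. In short, for each $[\bar\mu] \in \Gamma_{\rm out}$ we have constructed an isomorphism
$$\Theta^\star_{\bar\mu} \equiv \Theta^\star_{[\bar\mu]}: \ B_\Lambda \xrightarrow{\ \cong\ } B_{\omega^\star\Lambda}. \tag{5.30}$$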
(Analogously, we also have an isomorphism between the respective dual spaces, $\lfloor\mathcal H_\Lambda\rfloor_{\bar{\mathfrak g}\otimes\mathcal F} \cong \lfloor\mathcal H_{\omega^\star\Lambda}\rfloor_{\bar{\mathfrak g}\otimes\mathcal F}$.)

Note that it already follows from the Verlinde formula and elementary properties of simple currents that the spaces $B_\Lambda$ and $B_{\omega^\star\Lambda}$ have the same dimension; being finite-dimensional, it is then trivial that they are isomorphic as vector spaces. The virtue of the result (5.30) is, however, that it provides us with a canonical realization of this isomorphism. As we will soon see, this realization possesses the additional non-trivial property to be compatible with a variation of the moduli in the problem, i.e. with different choices of the insertion points. Moreover, knowing the isomorphism (5.30) one can also study the problem of fixed point resolution, which arises whenever the collection $\Lambda$ of $\mathfrak g$-weights is left invariant by $\omega^\star$; this issue will be addressed in Sect. 7 below.

Finally, according to the remarks at the end of Subsect. 5.1, we can (and do) employ the freedom in defining the maps $\Theta_{\bar\mu}$ so as to achieve the property that all cocycle factors in the projective representation of $\Gamma_w/\Gamma$ are roots of unity. Together with the fact that $\Gamma_w/\Gamma$ is a finite group, it follows that the maps $\Theta^\star_{\bar\mu}$ all have finite order.

6. Bundles of Blocks

In all the considerations above we have regarded the punctures $p_s$ as held fixed, i.e. we have analyzed block algebras and chiral blocks at a single point of the moduli space $\mathcal M \equiv \mathcal M_m$ of $m$-punctured projective curves. We now address the issues that arise when the insertion points are allowed to vary over the whole moduli space $\mathcal M$. Recall from the discussion in Subsect. 5.1 that in the implementation $\Theta_{\bar\mu}$ of an automorphism of $\mathfrak g^m$ a phase is still undetermined. Clearly, if we choose this phase at random for each point in the moduli space, we cannot expect to obtain quantities on the bundle of chiral blocks that vary smoothly with the moduli.

On the other hand, the bundle of chiral blocks carries a natural projectively flat connection, the Knizhnik–Zamolodchikov connection [27–29, 6]. Our rationale will therefore be to implement the maps $\sigma_{\bar\mu}(z)$ for different values of the moduli $z$ in such a way that they preserve the Knizhnik–Zamolodchikov connection.

6.1. Connections and bundles over the moduli space. The moduli space $\mathcal M$ is $(\mathbb P^1)^m$ minus the union of diagonals, i.e.
$$\mathcal M = \{(p_1, p_2, \ldots, p_m) \mid p_s \in \mathbb P^1,\ p_s \neq p_{s'} \ \text{for } s \neq s'\}. \tag{6.1}$$
When considered as depending on the values of the punctures, the spaces $B$ of chiral blocks combine to a vector bundle $\mathcal B$ over $\mathcal M$ [6]. Before we study that bundle, we introduce a few other bundles which in this context are of interest as well. We first consider the trivial bundle over $\mathcal M$ with fiber given by the infinite-dimensional Lie algebra $\mathfrak g^m$ defined in (3.1),
$$\mathfrak g^m \times \mathcal M \to \mathcal M. \tag{6.2}$$
This bundle carries a fiberwise action of the group $\Gamma_w$, i.e. an action by automorphisms of the total space that cover the identity on the base space $\mathcal M$. For every $\bar\mu \in \Gamma_w$ we write $\sigma_{\bar\mu}(z)$ for the map on the fiber over the point $p \in \mathcal M$ with coordinates $z \equiv (z_1, z_2, \ldots, z_m)$. On the bundle (6.2) we have a flat connection $D$ which acts on smooth sections $g(z)$ of (6.2) as
$$D_s\, g(z) := \partial_s\, g(z) + [L^{(s)}_{-1}, g(z)] \tag{6.3}$$
for $s = 1, 2, \ldots, m$, where $\partial_s \equiv \partial/\partial z_s$ and $L^{(s)}_{-1} \in \mathrm{Vir}_s$ is a generator of the Virasoro algebra associated to $\mathfrak g_s$.

Moreover, the bundle (6.2) possesses a subbundle of block algebras, which is defined by taking for each $p \in \mathcal M$ the corresponding block algebra $\bar{\mathfrak g}\otimes\mathcal F = \bar{\mathfrak g}\otimes\mathcal F(\mathbb P^1\setminus\{p_1, p_2, \ldots, p_m\}) \subset \mathfrak g^m$. The smooth sections of this bundle will be denoted by $X(z)$. It is important to realize that while the block algebras at different points $p$ are isomorphic as abstract Lie algebras, they are not naturally isomorphic, and the subbundle is not necessarily a trivial bundle. Still, by construction, the automorphisms $\sigma_{\bar\mu}$ restrict to automorphisms on the subbundle of block algebras.

A crucial property of the connection (6.3) is that it preserves the subbundle of block algebras. Explicitly, this can be seen as follows. The algebra $\mathcal F(\mathbb P^1_m)$ is generated algebraically by the constant function $\varphi^{(0)} := \mathrm{id}$ and the functions $\varphi^{(s)}$ with $s = 1, 2, \ldots, m$, where $\varphi^{(s)}(z) = (z - z_s)^{-1}$. ⁶ By induction with respect to the "length" $\ell$ of $f = \varphi^{(i_1)}\varphi^{(i_2)}\cdots\varphi^{(i_\ell)}$ and using the Leibniz rule, it follows that this property is established as soon as it is shown to hold for each of the functions $\varphi^{(s)}$. Thus we consider the element $X = \bar x\otimes\varphi^{(s)}$ of the block algebra. For $s = 0$ we trivially have $\partial_{s'}X = 0$ and $[L^{(s')}_{-1}, X] = 0$ for all $s' = 1, 2, \ldots, m$. For $s \in \{1, 2, \ldots, m\}$ and $s' \neq s$ both $\partial_{s'}X$ and $[L^{(s')}_{-1}, X]$ have a component only in $\mathfrak g_{s'}$, namely,
$$[L^{(s')}_{-1}, X]_{|s'} = \bar x\otimes(t + z_{s'} - z_s)^{-2} = -\partial_{s'}X_{|s'}, \tag{6.4}$$
so that again $DX = 0$. Finally, for $s \in \{1, 2, \ldots, m\}$ and $s' = s$, the commutator $[L^{(s')}_{-1}, X]$ is as in (6.4), but now for $\partial_{s'}X$ the component in $\mathfrak g_{s'}$ vanishes while there are additional contributions in all $\mathfrak g_{s''}$ with $s'' \neq s$, namely $\partial_{s'}X_{|s''} = \bar x\otimes(t + z_{s''} - z_s)^{-2}$. Thus in this case $D_{s'}X$ is not zero; however, it is still an element of the block algebra $\bar{\mathfrak g}\otimes\mathcal F$, namely $D_{s'}X = \bar x\otimes(z - z_s)^{-2}$, since we encountered precisely the local expansions of this element of $\bar{\mathfrak g}\otimes\mathcal F$.

Next we consider the trivial bundle over $\mathcal M$ with fiber given by the tensor product $\mathcal H \equiv \mathcal H_\Lambda$ (5.1) of irreducible highest weight modules $\mathcal H_{\Lambda_s}$ over the affine Lie algebra $\mathfrak g$,
$$\mathcal H \times \mathcal M \to \mathcal M. \tag{6.5}$$
Again this bundle is endowed with a flat connection $\nabla$, which acts on smooth sections $v(z)$ of (6.5) as
$$\nabla_s\, v(z) := (\partial_s + L^{(s)}_{-1})\, v(z) \tag{6.6}$$
for $s = 1, 2, \ldots, m$; we call this connection the Knizhnik–Zamolodchikov connection. And again on the bundle (6.5) there is a fiberwise action of a certain discrete group.

6 When the puncture $p_m$ is at $z_m = \infty$, for $s = m$ we rather have $\varphi^{(s)}(z) = z$. The corresponding changes in the arguments below are obvious, and we refrain from writing them down explicitly.
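This group is the (proper) subgroup $\Gamma_{\rm fix}$ of $\Gamma_w$ that consists of all those elements $\bar\mu$ of $\Gamma_w$ for which all associated maps $\Theta_{\bar\mu;s}$ of the tensor factors (see (2.19)) of $\mathcal H$ are endomorphisms. This group is given by
$$\Gamma_{\rm fix} = \Gamma_\Lambda := \{\bar\mu \in \Gamma_w \mid \omega^\star_{\bar\mu;s}(\Lambda_s) = \Lambda_s \ \text{for all } s = 1, 2, \ldots, m\}. \tag{6.7}$$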
Note that by the results of Subsect. 5.3, the definition of $\Gamma_\Lambda$ depends on the automorphisms $\sigma_{\bar\mu}$ only through the class $[\bar\mu]$ in $\Gamma_{\rm out}$ (5.27), or what is the same, through the associated diagram automorphisms $\omega_{\bar\mu}$. We call $\Gamma_\Lambda$ the stabilizer subgroup of $\Gamma_w$ associated to the weights $\Lambda_1, \ldots, \Lambda_m$. For every $\bar\mu \in \Gamma_\Lambda$ we write $\Theta_{\bar\mu}(z)$ for the map on the fiber over $p \in \mathcal M$ in the bundle (6.5).

6.2. Moduli dependence of the twisted intertwiners. To determine how the implementation of the automorphism depends on the moduli, we first study how the multi-shift automorphism $\sigma_{\bar\mu}$ changes when the punctures are varied (while the shift vector $\bar\mu$ is kept fixed). We denote by $p^{[0]}$ a reference point on $\mathcal M$ and study how the automorphism $\sigma_{\bar\mu} \equiv \sigma_{\bar\mu}(z)$ differs from $\sigma_{\bar\mu}(z^{[0]})$, or in other words, what the automorphism $\delta_{\bar\mu}$ defined by
$$\sigma_{\bar\mu}(z) = \delta_{\bar\mu}(z; z^{[0]}) \circ \sigma_{\bar\mu}(z^{[0]}) \tag{6.8}$$
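looks like. Note that we have to interpret $\delta_{\bar\mu}$ as an automorphism of the algebra $\mathfrak g^m$; since the block algebra, regarded as a subalgebra of $\mathfrak g^m$, varies with the moduli, it does not make sense to look for an automorphism of the block algebra. For every $s_\circ = 1, 2, \ldots, m$ the explicit form of the map $\delta_{\bar\mu} \equiv \delta_{\bar\mu;s_\circ}$ follows directly from the formula (2.6) for $\sigma_{\bar\mu} \equiv \sigma_{\bar\mu;s_\circ}$. We get $\delta_{\bar\mu}(K) = K$ and
$$\begin{aligned}
\delta_{\bar\mu}(H^i\otimes f) &= H^i\otimes f + K \sum_{s=1}^m \bar\mu^i_s\, \mathrm{Res}\bigl((\varphi_{s_\circ,s} - \varphi^{[0]}_{s_\circ,s})\, f\bigr), \\
\delta_{\bar\mu}(E^{\bar\beta}\otimes f) &= E^{\bar\beta}\otimes f \cdot \prod_{s=1}^m \bigl(\varphi_{s_\circ,s}/\varphi^{[0]}_{s_\circ,s}\bigr)^{-(\bar\mu_s,\bar\beta)}
\end{aligned} \tag{6.9}$$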
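with $\varphi_{s_\circ,s}$ as defined in (2.7) ⁷ and $\varphi^{[0]}_{s_\circ,s}(t) := (t + z^{[0]}_{s_\circ} - z^{[0]}_s)^{-1}$. For simplicity, let us for the moment restrict our attention to the particular case where $z^{[0]}_s = z_s$ for all $s$ except one fixed value $s = u \in \{1, 2, \ldots, m\}$. Then $\delta_{\bar\mu}$ acts on $\mathfrak g^m$ as $\delta_{\bar\mu}(K) = K$ and
$$\begin{aligned}
\delta_{\bar\mu}(H^i\otimes f) &= \begin{cases}
H^i\otimes f + K\,\bar\mu^i_u\, \mathrm{Res}\bigl((\varphi_{s_\circ,u} - \varphi^{[0]}_{s_\circ,u})\, f\bigr) & \text{for } s_\circ \neq u, \\[4pt]
H^i\otimes f + K \sum\limits_{s=1,\, s\neq u}^{m} \bar\mu^i_s\, \mathrm{Res}\bigl((\varphi_{u,s} - \varphi^{[0]}_{u,s})\, f\bigr) & \text{for } s_\circ = u,
\end{cases} \\[6pt]
\delta_{\bar\mu}(E^{\bar\beta}\otimes f) &= \begin{cases}
E^{\bar\beta}\otimes f \cdot \bigl(\varphi_{s_\circ,u}/\varphi^{[0]}_{s_\circ,u}\bigr)^{-(\bar\mu_u,\bar\beta)} & \text{for } s_\circ \neq u, \\[4pt]
E^{\bar\beta}\otimes f \cdot \prod\limits_{s=1,\, s\neq u}^{m} \bigl(\varphi_{u,s}/\varphi^{[0]}_{u,s}\bigr)^{-(\bar\mu_s,\bar\beta)} & \text{for } s_\circ = u.
\end{cases}
\end{aligned} \tag{6.10}$$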
7 Here and below we suppress again the obvious modifications that arise when the puncture $p_m$ is at $z_m = \infty$.
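Next we observe that a variation of the moduli does not change the class of the automorphism $\sigma_{\bar\mu}$ modulo inner automorphisms. As a consequence, $\delta_{\bar\mu}$ is in fact an inner automorphism of $\mathfrak g^m$. As can be verified by direct calculation (see Appendix C), we have in fact
$$\delta_{\bar\mu} = \begin{cases}
\exp\bigl(\mathrm{ad}_{(\bar\mu_u,H)\otimes g_{s_\circ,u}}\bigr) & \text{for } s_\circ \neq u, \\[4pt]
\exp\bigl(\mathrm{ad}_{\sum_{s\neq u} (\bar\mu_s,H)\otimes\tilde g_s}\bigr) & \text{for } s_\circ = u,
\end{cases} \tag{6.11}$$
where
$$g_{s_\circ,u}(t) := \ln\frac{t + z_{s_\circ} - z_u}{t + z_{s_\circ} - z^{[0]}_u} \quad\text{and}\quad \tilde g_s(t) := \ln\frac{t + z_u - z_s}{t + z^{[0]}_u - z_s}. \tag{6.12}$$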
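As a consistency check, the action of (6.11) on the step operators can be compared with (6.10). The sketch below treats the case $s_\circ \neq u$ and rests on two assumptions that are not restated in this section: the loop-algebra bracket $[(\bar\mu_u,H)\otimes g,\, E^{\bar\beta}\otimes f] = (\bar\mu_u,\bar\beta)\, E^{\bar\beta}\otimes gf$ (without central term), and the form $\varphi_{s_\circ,u}(t) = (t + z_{s_\circ} - z_u)^{-1}$ of the local expansion (2.7), with $g_{s_\circ,u}$ read as the logarithm of the quotient $(t + z_{s_\circ} - z_u)/(t + z_{s_\circ} - z^{[0]}_u)$.

```latex
% Case s_\circ \neq u, so that z^{[0]}_{s_\circ} = z_{s_\circ}.
% Assumed bracket: [(\bar\mu_u,H)\otimes g, E^{\bar\beta}\otimes f] = (\bar\mu_u,\bar\beta) E^{\bar\beta}\otimes gf.
\begin{align*}
\exp\bigl(\mathrm{ad}_{(\bar\mu_u,H)\otimes g_{s_\circ,u}}\bigr)(E^{\bar\beta}\otimes f)
  &= E^{\bar\beta}\otimes \exp\bigl((\bar\mu_u,\bar\beta)\, g_{s_\circ,u}\bigr)\, f \\
  &= E^{\bar\beta}\otimes f \cdot
     \Bigl(\frac{t + z_{s_\circ} - z_u}{t + z_{s_\circ} - z^{[0]}_u}\Bigr)^{(\bar\mu_u,\bar\beta)} \\
  &= E^{\bar\beta}\otimes f \cdot
     \bigl(\varphi_{s_\circ,u}/\varphi^{[0]}_{s_\circ,u}\bigr)^{-(\bar\mu_u,\bar\beta)} .
\end{align*}
```

This reproduces the corresponding line of (6.10); the case $s_\circ = u$ works in the same way with the functions $\tilde g_s$ and a product over $s \neq u$.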
(Here we choose some definite branch of the logarithm. It is readily checked that all the results below do not depend on this choice. For $z_u \neq z^{[0]}_u$, $g_{s_\circ,u}$ can be considered as an element of $\mathbb C((t))$, for which the constant part is defined only modulo $2\pi i\,\mathbb Z$.)

Now recall from Sect. 5 that associated to the automorphism $\sigma_{\bar\mu}$ of $\mathfrak g^m$ there comes the map $\Theta_{\bar\mu}$ that for each tensor factor of $\mathcal H$ is defined as in (2.19). This map is present for any $z \in \mathcal M$; moreover, according to the relation (6.8) the maps $\Theta_{\bar\mu}(z)$ and $\Theta_{\bar\mu}(z^{[0]})$ are related by an analogous map that is associated to the inner automorphism $\delta_{\bar\mu}$. Using the results (2.22) and (6.11) we learn that explicitly we have
$$\Theta_{\bar\mu;s_\circ}(z) = \exp\bigl((\bar\mu_u,H)\otimes g_{s_\circ,u} + K\,\tilde h_{s_\circ}\bigr) \circ \Theta_{\bar\mu;s_\circ}(z^{[0]}) \quad \text{for } s_\circ \neq u, \tag{6.13}$$
respectively
$$\Theta_{\bar\mu;s_\circ}(z) = \exp\Bigl(\sum_{s=1,\, s\neq u}^{m} (\bar\mu_s,H)\otimes\tilde g_s + K\,\tilde h_u\Bigr) \circ \Theta_{\bar\mu;s_\circ}(z^{[0]}) \quad \text{for } s_\circ = u. \tag{6.14}$$
At this stage the functions $\tilde h_{s_\circ}$, which are introduced here to take care of the fact that the implementation is only determined up to a phase, can still be chosen arbitrarily; we will make use of this freedom later on. Combining these results with the twisted intertwining relations (2.34), we can show that the commutator $[\nabla_u, \Theta_{\bar\mu}(z)]$ has the local expansions
$$[\nabla_u, \Theta_{\bar\mu;s_\circ}(z)]_{|z_u = z^{[0]}_u} = \bigl(-(\bar\mu_u,H)\otimes\varphi^{[0]}_{s_\circ,u} + K\,\hat g_{s_\circ}\bigr)\, \Theta_{\bar\mu;s_\circ}(z^{[0]}), \tag{6.15}$$
where
$$\hat g_{s_\circ} = \frac{\partial}{\partial z_u}\tilde h_u\Big|_{z_u = z^{[0]}_u} - \sum_{s=1,\, s\neq u}^{m} \frac{(\bar\mu_s,\bar\mu_u)}{z^{[0]}_u - z^{[0]}_s} \quad \text{for } s_\circ = u, \tag{6.16}$$
while $\hat g_{s_\circ} = \frac{\partial}{\partial z_u}\tilde h_{s_\circ}\big|_{z_u = z^{[0]}_u}$ for $s_\circ \neq u$. (The derivation of (6.15) is explained in some detail in Appendix C.) Collecting these local expansions, we learn that up to central terms we simply have
$$[\nabla_u, \Theta_{\bar\mu}(z^{[0]})] = Y_{\bar\mu;u}\, \Theta_{\bar\mu}(z^{[0]}), \tag{6.17}$$
where
$$Y_{\bar\mu;u} := -(\bar\mu_u,H)\otimes(z - z^{[0]}_u)^{-1} \tag{6.18}$$
is an element of the block algebra. Concerning the central terms, we claim that we can employ the freedom that is present in the choice of the functions $\tilde h_{s_\circ}$ so as to achieve
$$\hat g_{s_\circ} \equiv 0 \quad \text{for all } p \in \mathcal M. \tag{6.19}$$
Clearly, this is possible at the point $p = p^{[0]}$, by just choosing ⁸
$$\tilde h_{s_\circ} \equiv 0 \quad \text{for } s_\circ \neq u \qquad\text{and}\qquad \tilde h_u = \sum_{s=1,\, s\neq u}^{m} (\bar\mu_s,\bar\mu_u)\, \ln\frac{z_u - z_s}{z^{[0]}_u - z^{[0]}_s}. \tag{6.20}$$
To discuss how the situation looks globally, we first have to generalize the formula (6.15) to the case where all punctures may vary. By analogous considerations as above one can check that the generalization of (6.13) and (6.14) reads
$$\Theta_{\bar\mu;s_\circ}(z) = \exp\Bigl(\sum_{s=1}^{m} (\bar\mu_s,H)\otimes g_{s_\circ,s} + K\,h_{s_\circ}\Bigr) \circ \Theta_{\bar\mu;s_\circ}(z^{[0]}) \tag{6.21}$$
for all $s_\circ$, where
$$g_{s_\circ,s}(t) := \ln\frac{t + z_{s_\circ} - z_s}{t + z^{[0]}_{s_\circ} - z^{[0]}_s}, \tag{6.22}$$
which for all $s_\circ$ and all $s$ is a sensible Laurent series. (Also note that the notation $g_{s_\circ,u}$ is in agreement with the definition (6.12), and $g_{u,s} = \tilde g_s$; the functions $h_{s_\circ}$ are still to be determined.) The same arguments as in Appendix C then lead to a formula analogous to (6.15),
$$[\nabla_u, \Theta_{\bar\mu;s_\circ}(z)]_{|z = z^{[0]}} = \bigl(-(\bar\mu_u,H)\otimes\varphi^{[0]}_{s_\circ,u} + K\,\hat g_{s_\circ}\bigr)\, \Theta_{\bar\mu;s_\circ}(z^{[0]}), \tag{6.23}$$
with
$$\hat g_{s_\circ} = \frac{\partial}{\partial z_u} h_{s_\circ}\Big|_{z = z^{[0]}} - \sum_{s=1,\, s\neq u}^{m} \frac{(\bar\mu_s,\bar\mu_u)}{z^{[0]}_u - z^{[0]}_s} \tag{6.24}$$
for all $s_\circ$. Our aim is now to choose the functions $h_{s_\circ}$ in such a way that in the tensor product $\Theta_{\bar\mu}$ the terms proportional to $K$ in the exponent cancel so that we are again left with an element of the block algebra. Thus we need again $\sum_{s_\circ} \hat g_{s_\circ} \equiv 0$, i.e.
$$\sum_{s_\circ=1}^{m} \frac{\partial}{\partial z_u} h_{s_\circ} = \sum_{s=1,\, s\neq u}^{m} \frac{(\bar\mu_s,\bar\mu_u)}{z_u - z_s} \tag{6.25}$$
8 Here the freedom in the choice of the branch of the logarithm is not completely irrelevant. But since the level is always integral while the inner product $(\bar\mu_s, \bar\mu_u)$ is rational, the twisted intertwiner $\Theta_{\bar\mu;s_\circ}$ is determined only up to multiplication by a root of unity, and the presence of this root of unity does not destroy the important property of $\Theta_{\bar\mu;s_\circ}$ to have finite order. Moreover, the precise value of the order really matters only in the case of fixed points, and closer inspection shows that in that case the number $(\bar\mu_s, \bar\mu_u)$ is in fact always an integer so that the order is not changed at all.
(note that the appearance of the summation is in agreement with the fact that $\mathfrak g^m$ is obtained from a direct sum of affine Lie algebras by identifying the centers). The choice
$$h_{s_\circ}(z) = \frac12 \sum_{s=1,\, s\neq s_\circ}^{m} (\bar\mu_{s_\circ},\bar\mu_s)\, \ln\frac{z_{s_\circ} - z_s}{z^{[0]}_{s_\circ} - z^{[0]}_s} \tag{6.26}$$
for all $s_\circ = 1, 2, \ldots, m$ indeed satisfies this requirement: the term with $s_\circ = u$ contributes $\frac12 \sum_{s\neq u} (\bar\mu_u,\bar\mu_s)/(z_u - z_s)$ to the left hand side of (6.25), and the terms with $s_\circ \neq u$ together contribute the same amount. Note that the requirement (6.25) determines $h_{s_\circ}$ only up to a $z$-independent constant; the choice made here will be convenient later on. ⁹ We conclude that we can choose the phase of $\Theta_{\bar\mu}(z)$ in such a manner that the Knizhnik–Zamolodchikov connection is preserved at every point of the moduli space.

6.3. Transformation properties of flat sections. We are now in a position to investigate the bundle of chiral blocks and the transformation properties of those sections in the bundle which are flat with respect to the Knizhnik–Zamolodchikov connection. To this end we consider the trivial bundle
$$\mathcal H^\star \times \mathcal M \to \mathcal M, \tag{6.27}$$
where the fiber $H^*$ is the algebraic dual of $H$, and in this bundle the subbundle $B$ of chiral blocks, which are the singlets under the (dual) action of the block algebra. This implies that the smooth sections $\beta(z)$ of $B$ obey
$$\langle \beta(z)\,,\, X(z)\,v(z)\rangle = 0 \tag{6.28}$$
for all sections $X(z)$ of the bundle of block algebras and all sections $v(z)$ of the trivial bundle (6.5). This bundle $B$ is of finite rank [6]. We endow the trivial bundle (6.27) with the connection that is dual to the connection (6.6); it restricts to a connection on the subbundle $B$. We will use the term Knizhnik–Zamolodchikov connection both for the connection on the dual bundle and for its restriction to $B$. (Note, however, that a frequent convention in the literature is to reserve the term Knizhnik–Zamolodchikov connection for the connection on $B$.) An action of the group $\Gamma_{\rm fix}$ on $B$ can be defined by
$$\langle \Theta^*_{\bar\mu}(\beta(z))\,,\, v(z)\rangle := \langle \beta(z)\,,\, \Theta^{-1}_{\bar\mu}\, v(z)\rangle \tag{6.29}$$
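for all sections $v(z)$ (recall that $\Theta^*_{\bar\mu}(\beta)$ is again a chiral block). We have
$$\begin{aligned} \langle \nabla_s(\Theta^*_{\bar\mu}\beta(z))\,,\, v(z)\rangle &= \partial_s \langle \Theta^*_{\bar\mu}\beta(z)\,,\, v(z)\rangle - \langle \Theta^*_{\bar\mu}\beta(z)\,,\, \nabla_s v(z)\rangle \\ &= \partial_s \langle \beta(z)\,,\, \Theta^{-1}_{\bar\mu} v(z)\rangle - \langle \beta(z)\,,\, \Theta^{-1}_{\bar\mu}\nabla_s v(z)\rangle. \end{aligned} \tag{6.30}$$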
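Using the result (6.17), this can also be written as
$$\begin{aligned} \langle \nabla_s(\Theta^*_{\bar\mu}\beta(z))\,,\, v(z)\rangle &= \partial_s \langle \beta(z)\,,\, \Theta^{-1}_{\bar\mu} v(z)\rangle - \langle \beta(z)\,,\, \nabla_s \Theta^{-1}_{\bar\mu} v(z)\rangle + \langle \beta(z)\,,\, Y_{-\bar\mu;s}\, \Theta^{-1}_{\bar\mu} v(z)\rangle \\ &= \partial_s \langle \beta(z)\,,\, \Theta^{-1}_{\bar\mu} v(z)\rangle - \langle \beta(z)\,,\, \nabla_s \Theta^{-1}_{\bar\mu} v(z)\rangle, \end{aligned} \tag{6.31}$$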
9 Also note that in the general case considered here there is of course less freedom in the choice of these functions than we had for the analogous functions $\tilde h_{s_\circ}$ in the special case treated before. In particular, the simple choice made in (6.20) is no longer available.
J. Fuchs, C. Schweigert
where in the second line we used the fact that according to the definition (6.18), $Y_{-\bar\mu;s}$ is an element of the block algebra. Note that $\Theta_{\bar\mu}(z)$ depends smoothly on $z$, so that $\Theta^{-1}_{\bar\mu}(z)v(z)$ is a smooth section as well. Let now $\beta(z)$ be a flat section in $B$,$^{10}$ i.e. $\nabla\beta = 0$, or more explicitly,
$$0 = \langle \nabla_s\beta(z)\,,\, v(z)\rangle = \partial_s\langle\beta(z)\,,\, v(z)\rangle - \langle\beta(z)\,,\, \nabla_s v(z)\rangle \tag{6.32}$$
for all smooth sections $v(z)$ and all $s = 1, 2, \dots, m$. Applying this formula to the smooth section $\Theta^{-1}_{\bar\mu}(z)v(z)$, we see from (6.31) that
$$\langle \nabla_s(\Theta^*_{\bar\mu}(z)\beta(z))\,,\, v(z)\rangle = 0. \tag{6.33}$$
This means that together with $\beta(z)$ also the section $\Theta^*_{\bar\mu}(z)\beta(z)$ is flat.

6.4. Verification of the projective action of $\Gamma_w$. In Subsect. 5.1 we have seen that for every given value $z$ of the moduli the maps $\Theta_{\bar\mu}(z)$ that implement the automorphisms $\sigma_{\bar\mu}$ on the $\mathfrak g^m$-modules can be chosen such that they respect the group law of $\Gamma_w$ up to a two-cocycle $\epsilon$. Suppose we have made this choice for some point $z^{[0]} \in M$. To extend $\Theta_{\bar\mu}$ to other values of the moduli, we have imposed the requirement that the extension should be compatible with the Knizhnik–Zamolodchikov connection. This requirement can only be satisfied when the implementation $\Theta_{\bar\mu}(z)$ is chosen in a suitable manner. We will now show that the specific implementation $\Theta_{\bar\mu}(z)$ that we have already chosen above for each $\bar\mu \in \Gamma_w$ still respects the group law of $\Gamma_w$ at every point in $M$ up to the same cocycle $\epsilon$.

To start, we recall the formula (6.21), i.e. $\Theta_{\bar\mu;s_\circ}(z) = A_{\bar\mu;s_\circ}(z;z^{[0]})\, \Theta_{\bar\mu;s_\circ}(z^{[0]})$, with
$$A_{\bar\mu;s_\circ}(z;z^{[0]}) = \exp\Big( \sum_{s=1}^{m} (\bar\mu_s,H)\otimes g_{s_\circ,s} + K h_{s_\circ} \Big); \tag{6.34}$$
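here $g_{s_\circ,s}(t;z)$ and $h_{s_\circ}(z;\bar\mu)$ are defined by (6.22) and (6.26), respectively. It follows that
$$\begin{aligned} \epsilon(\bar\mu,\bar\nu)\, \Theta_{\bar\mu+\bar\nu;s_\circ}(z) &= A_{\bar\mu+\bar\nu;s_\circ}(z;z^{[0]}) \circ \Theta_{\bar\mu+\bar\nu;s_\circ}(z^{[0]}) \\ &= A_{\bar\mu;s_\circ}(z;z^{[0]}) \circ \exp\Big( \sum_{s=1}^{m} (\bar\nu_s,H)\otimes g_{s_\circ,s} + K\,[h_{s_\circ}(z;\bar\mu+\bar\nu) - h_{s_\circ}(z;\bar\mu)] \Big) \circ \Theta_{\bar\mu;s_\circ}(z^{[0]})\, \Theta_{\bar\nu;s_\circ}(z^{[0]}). \end{aligned} \tag{6.35}$$
By commuting the exponential through $\Theta_{\bar\mu;s_\circ}(z^{[0]})$ we arrive at a similar exponential, but with $(\sigma^{[0]}_{\bar\mu})^{-1}$ applied to the argument. With the help of the identity (D.14) we then get explicitly
$$\epsilon(\bar\mu,\bar\nu)\, \Theta_{\bar\mu+\bar\nu;s_\circ}(z) = A_{\bar\mu;s_\circ}(z;z^{[0]}) \circ \Theta_{\bar\mu;s_\circ}(z^{[0]}) \circ \exp(\tilde Y_{s_\circ}) \circ \Theta_{\bar\nu;s_\circ}(z^{[0]}) \tag{6.36}$$
10 Sometimes in the literature the term “chiral block” is reserved for such flat sections.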
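with
$$\begin{aligned} \tilde Y_{s_\circ} :={}& \sum_{s} (\bar\nu_s,H)\otimes g_{s_\circ,s} + K\Big( -\sum_{s,s'} (\bar\mu_{s'},\bar\nu_s)\, \mathrm{Res}(\varphi_{s_\circ,s'}\, g_{s_\circ,s}) \\ &\qquad + \tfrac12 \sum_{s\neq s_\circ} \ln\frac{z_{s_\circ}-z_s}{z^{[0]}_{s_\circ}-z^{[0]}_s}\; [(\bar\mu_{s_\circ}+\bar\nu_{s_\circ}, \bar\mu_s+\bar\nu_s) - (\bar\mu_{s_\circ},\bar\mu_s)] \Big) \\ ={}& \sum_{s} (\bar\nu_s,H)\otimes g_{s_\circ,s} + \tfrac12 K \sum_{s\neq s_\circ} \ln\frac{z_{s_\circ}-z_s}{z^{[0]}_{s_\circ}-z^{[0]}_s}\; (\bar\nu_{s_\circ},\bar\nu_s) \\ &\qquad + \tfrac12 K \sum_{s\neq s_\circ} \ln\frac{z_{s_\circ}-z_s}{z^{[0]}_{s_\circ}-z^{[0]}_s}\; [(\bar\nu_{s_\circ},\bar\mu_s) - (\bar\mu_{s_\circ},\bar\nu_s)]. \end{aligned} \tag{6.37}$$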
Upon exponentiation, the terms in the first line of the last expression just yield the correct factor $A_{\bar\nu;s_\circ}(z;z^{[0]})$, while the rest of the terms amount to an additional phase. Now this is the situation at the puncture $s_\circ$; taking into account that the centers of the affine algebras $\mathfrak g_s$ are identified in $\mathfrak g^m$, we finally have to add up the prefactors of $K$ for all insertion points $s_\circ = 1, 2, \dots, m$. Doing so, the two sums in the prefactor cancel each other,
$$\sum_{\substack{s_\circ,s\\ s\neq s_\circ}} \ln\frac{z_{s_\circ}-z_s}{z^{[0]}_{s_\circ}-z^{[0]}_s}\; [(\bar\mu_{s_\circ},\bar\nu_s) - (\bar\nu_{s_\circ},\bar\mu_s)] = 0. \tag{6.38}$$
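The cancellation (6.38) rests only on the fact that the logarithm is symmetric under exchanging $s_\circ$ and $s$, while the bracket is antisymmetric. A minimal numerical sketch of this, assuming that the weights can be modelled by integer vectors with the standard dot product as inner product (the function names and the random test data are illustrative, not taken from the text):

```python
import cmath
import random

def dot(a, b):
    """Standard dot product, standing in for the inner product (.,.)."""
    return sum(x * y for x, y in zip(a, b))

def central_prefactor(z, z0, mu, nu):
    """Left-hand side of (6.38): sum over ordered pairs s0 != s."""
    m = len(z)
    total = 0j
    for s0 in range(m):
        for s in range(m):
            if s == s0:
                continue
            # The ratio is unchanged when s0 and s are exchanged.
            log_ratio = cmath.log((z[s0] - z[s]) / (z0[s0] - z0[s]))
            # The bracket changes sign when s0 and s are exchanged.
            bracket = dot(mu[s0], nu[s]) - dot(nu[s0], mu[s])
            total += log_ratio * bracket
    return total

random.seed(0)
m, rank = 4, 2
z = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(m)]
z0 = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(m)]
mu = [[random.randint(-3, 3) for _ in range(rank)] for _ in range(m)]
nu = [[random.randint(-3, 3) for _ in range(rank)] for _ in range(m)]

residual = abs(central_prefactor(z, z0, mu, nu))
print(residual)  # vanishes up to rounding errors
```

Note that the sketch does not impose $\sum_s \bar\mu_s = 0$; the cancellation does not rely on that condition.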
As a consequence, we have
$$\begin{aligned} \epsilon(\bar\mu,\bar\nu)\, \Theta_{\bar\mu+\bar\nu}(z) &= A_{\bar\mu+\bar\nu}(z;z^{[0]})\, \Theta_{\bar\mu+\bar\nu}(z^{[0]}) \\ &= A_{\bar\mu}(z;z^{[0]})\, \Theta_{\bar\mu}(z^{[0]}) \circ A_{\bar\nu}(z;z^{[0]})\, \Theta_{\bar\nu}(z^{[0]}) = \Theta_{\bar\mu}(z) \circ \Theta_{\bar\nu}(z). \end{aligned} \tag{6.39}$$
We conclude that, as claimed, with our chosen implementation the group law of $\Gamma_w$ is respected at every point of the moduli space $M$ up to a $z$-independent cocycle. (Note that it is the representative cocycle itself, and not just its cohomology class, that is independent of the moduli.) Moreover, using the relation
$$\begin{aligned} \exp\big( \tfrac{i\pi}{2} (E^{\bar\alpha_s}\otimes\varphi_{s_\circ,s} + E^{-\bar\alpha_s}\otimes\varphi^{-1}_{s_\circ,s}) \big) ={}& \exp\big( -\tfrac12 H^{\bar\alpha_s}\otimes g_{s_\circ,s} \big)\, \exp\big( \tfrac{i\pi}{2} (E^{\bar\alpha_s}\otimes\varphi^{[0]}_{s_\circ,s} + E^{-\bar\alpha_s}\otimes(\varphi^{[0]}_{s_\circ,s})^{-1}) \big) \\ &\times \exp\big( \tfrac12 H^{\bar\alpha_s}\otimes g_{s_\circ,s} \big), \end{aligned} \tag{6.40}$$
one can show that the preferred implementation of inner multi-shift automorphisms introduced in Subsect. 5.1 (see (5.7)) obeys
$$\tilde\Theta_{\bar\mu;s_\circ}(z) = A_{\bar\mu;s_\circ}(z;z^{[0]})\, \tilde\Theta_{\bar\mu;s_\circ}(z^{[0]}) \tag{6.41}$$
precisely as in (6.21), which implies that at every point in the moduli space this implementation is compatible with the Knizhnik–Zamolodchikov connection. (For details, we refer to Appendix D.) We conclude in particular that in each fiber of the bundle $B$ the map $\Theta^*_{\bar\mu}$ has finite order.
7. Fixed Point Resolution

7.1. Fixed points. In the previous section we have constructed a projective action of the group $\Gamma_w$ on the tensor product $H$ in such a way that for each $\bar\mu \in \Gamma$ the twisted intertwiner $\Theta_{\bar\mu}$ is represented by the product of exponentials of elements in the block algebra $\bar{\mathfrak g}\otimes F \equiv \bar{\mathfrak g}\otimes F(z^{[0]})$. As seen in Subsect. 5.3 this has in particular the consequence that in all fibers over the moduli space $M_m$ the induced maps on the space $B \equiv B(z)$ of chiral blocks have finite order and realize, modulo the fixed cocycle $\epsilon$, the group law of the finite abelian group $\Gamma_w/\Gamma$.

A particularly interesting situation arises when the automorphism $\sigma_{\bar\mu}$ is not inner, but still does not change the isomorphism class of a module $H_\Lambda$. More precisely, given an $m$-tuple $\Lambda$ of integrable weights of the affine Lie algebra $\mathfrak g$, we associate to it the subgroup of $\Gamma_w$ that leaves each of the irreducible highest weight modules $H_{\Lambda_s}$ invariant up to isomorphism. This subgroup is precisely the stabilizer $\Gamma_\Lambda$ of $\Lambda$ as defined in (6.7). For every $m$-tuple $\Lambda$ the stabilizer $\Gamma_\Lambda$ contains $\Gamma$ as a subgroup. If it is larger than $\Gamma$, then it also contains elements which do not necessarily act as a multiple of the identity on the blocks. In this case we call the $m$-tuple $\Lambda$ of $\mathfrak g$-weights a fixed point.

Now the cocycle $\epsilon \equiv \epsilon_\Lambda$ on $\Gamma_w/\Gamma$ induces a cocycle on the subgroup $\Gamma_\Lambda/\Gamma$, which we denote by the same symbol. Further, when applied to fixed points, the results of Subsect. 5.3 tell us that each fiber of the vector bundle $B$ of chiral blocks can be split into finitely many subspaces $B_\psi$ that are invariant under the projective action of the group $\Gamma_\Lambda$, or rather, of the quotient by its subgroup $\Gamma$. These invariant subspaces, whose dimensions may be larger than one, are in correspondence with the irreducible representations of the twisted group algebra $\mathbb C_\epsilon(\Gamma_\Lambda/\Gamma)$ of the finite group $\Gamma_\Lambda/\Gamma$. Thus$^{11}$ the label $\psi$ of the invariant subspace $B_\psi$ can be taken to be a character of the center $Z(\mathbb C_\epsilon(\Gamma_\Lambda/\Gamma))$ of $\mathbb C_\epsilon(\Gamma_\Lambda/\Gamma)$, which in turn (compare the remarks around formula (5.16)) is the group algebra $Z(\mathbb C_\epsilon(\Gamma_\Lambda/\Gamma)) = \mathbb C((\Gamma_\Lambda/\Gamma)^\circ)$ of the subgroup $(\Gamma_\Lambda/\Gamma)^\circ$ of regular elements of $\Gamma_\Lambda/\Gamma$; thus in short, $\psi \in ((\Gamma_\Lambda/\Gamma)^\circ)^*$.

Now the Knizhnik–Zamolodchikov connection is preserved under the map $\Theta^*_{\bar\mu}$; therefore it restricts to the various invariant subspaces and they fit together into sub-vector bundles of $B$. In particular, the dimensions of the invariant subspaces $B_\psi$ do not depend on the moduli. Hence as soon as $(\Gamma_\Lambda/\Gamma)^\circ$ is non-trivial, the bundle $B$ of chiral blocks is reducible (as a vector bundle). To actually establish this fact, we have to show that the ranks of at least two subbundles $B_\psi$ are nonzero, which in fact follows from the conjectured formula for the rank to be discussed below.

We refer to the decomposition of the bundle $B$ into the subbundles of invariant subspaces under $\Theta_{\bar\mu}$ as the fixed point resolution. Our results imply that the Knizhnik–Zamolodchikov connection consistently restricts to these subbundles; thus even after fixed point resolution we are still given a Knizhnik–Zamolodchikov connection. The present situation should be compared with the situation in coset conformal field theories, where it is known [12] that the characters, i.e. the zero-point blocks on the torus, of fixed points decompose in a similar way under the action of an outer automorphism. This structural analogy should also explain the striking similarities in the modular matrices for extension modular invariants [8] and coset conformal field theories [12].

We pause for a side remark. One might be tempted to speculate that the converse is also true, i.e. that the bundle $B$ splits into a direct sum of subbundles only if fixed points are involved. This, however, does not seem to be true, as the following example at higher genus shows.
The bundle of zero-point blocks on the torus is given by the characters of the theory, and the representation of the mapping class group on this bundle is just given by the usual modular group representation on the characters. This bundle definitely does not involve fixed points. On the other hand it is known that the representation of the modular group is reducible if the theory contains non-trivial simple currents. (One well known example for this phenomenon is the fact that in superconformal field theories the character-valued indices of fields in the Ramond sector span a closed subspace under modular transformations.)

11 See e.g. [25] for the representation theory of twisted group algebras.

7.2. Trace formulæ. We have seen that the dimensions of the invariant subspaces $B_\psi$ do not depend on the moduli. In this subsection we present a conjecture for a general formula for these dimensions. Via Fourier transformation over the finite abelian group $(\Gamma_\Lambda/\Gamma)^\circ$ of regular elements of $\Gamma_\Lambda/\Gamma$, these dimensions are related to the traces of the implementing maps $\Theta^*_{\bar\mu}$ on the fibers. More precisely, we need to choose a representative $\bar\mu \in \Gamma_\Lambda$ for every element $\omega$ of $(\Gamma_\Lambda/\Gamma)^\circ$; the result does not depend on the choice of representative modulo $\Gamma$, because the map $\Theta^*_{\bar\mu}$ depends on $\bar\mu \in \Gamma_\Lambda$ only through its class $\omega$.

Now recall that $\omega_{\bar\mu;s}$ denotes the diagram automorphism that is in the same class as $\sigma_{\bar\mu;s}$. We have also seen that the latter is in the same class as the ordinary single-shift automorphism $\sigma^{(0)}_{\bar\mu_s}$; accordingly, instead of $\omega_{\bar\mu;s}$ we may also use the notation $\omega_{\bar\mu_s}$ or, for brevity, $\omega_s$. In this notation, the condition that $\sum_{s=1}^m \bar\mu_s = 0$ tells us that we have
$$\prod_{s=1}^{m} \omega_s = \mathrm{id}. \tag{7.1}$$
Moreover, as diagram automorphisms are in one-to-one correspondence with classes of outer automorphisms, we identify the $m$-tuple $\omega$ of outer automorphisms with the corresponding element of the group $(\Gamma_\Lambda/\Gamma)^\circ$. Also recall that the implementing maps $\Theta^*_{\bar\mu}$ were defined only up to a phase. The invariant content of the conjecture we are going to spell out consists of a formula for the dimensions of the invariant subspaces $B_\psi$. However, it will be convenient not to write down this formula for the dimension of $B_\psi$ directly, but to present instead the trace
$$T_{\Lambda;\omega} \equiv T_{\Lambda;\sigma_{\bar\mu}} := \mathrm{tr}_{B_\Lambda}\, \Theta^*_{\bar\mu} \tag{7.2}$$
for a given choice of implementation $\Theta^*_{\bar\mu}$. Of course, unlike the dimensions of the invariant subspaces themselves, these numbers do depend on the chosen implementation. For a definite choice of this phase, the relation between the dimensions $\dim B_\psi$ and traces reads
$$\dim B_\psi = |(\Gamma_\Lambda/\Gamma)^\circ|^{-1} \sum_{\omega\in(\Gamma_\Lambda/\Gamma)^\circ} \psi^*(\omega)\, T_{\Lambda;\omega}. \tag{7.3}$$
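The Fourier relation (7.3) can be illustrated in the simplest situation. The following sketch assumes, purely for illustration, that the group of regular elements is cyclic of order $n$ and that the implementing map acts on a fiber as an honest representation with prescribed eigenspace dimensions; the names `traces_from_dims` and `dims_from_traces` are hypothetical helpers, not notation from the text.

```python
import cmath

def character(label, power, n):
    """Character psi_label of the cyclic group Z_n, evaluated on omega**power."""
    return cmath.exp(2j * cmath.pi * label * power / n)

def traces_from_dims(dims):
    """Trace of the power-th power of the generator, given eigenspace dimensions."""
    n = len(dims)
    return [sum(d * character(label, power, n) for label, d in enumerate(dims))
            for power in range(n)]

def dims_from_traces(traces):
    """Inverse Fourier transform as in (7.3): dim = |G|^{-1} sum psi^*(omega) T_omega."""
    n = len(traces)
    return [sum(character(label, power, n).conjugate() * t
                for power, t in enumerate(traces)) / n
            for label in range(n)]

dims = [3, 1, 0, 2]                      # assumed dimensions of the subspaces B_psi
traces = traces_from_dims(dims)          # these play the role of T_{Lambda;omega}
recovered = [round(d.real) for d in dims_from_traces(traces)]
print(recovered)
```

In this toy model the trace at the identity element is the total dimension of the fiber, and the traces at the other group elements need not be non-negative or even real.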
The facts that the left-hand side has an independent meaning and that the traces on the right-hand side depend on the implementation are reconciled by the following observation. First, we do require that the maps $\Theta^*_{\bar\mu}$ realize the group structure of $\Gamma_w$ projectively and that they are the identity for all $\bar\mu \in \Gamma$. This already restricts the choice of possible scalar factors, but still leaves some indeterminacy. Indeed, the remaining freedom consists of modifying the implementing maps by phases that furnish a character $\varphi$ of the group $(\Gamma_\Lambda/\Gamma)^\circ$. In short, once we fix the choice of implementation in such a way that our requirements are satisfied, we can label the eigenspaces by characters $\psi$ of $(\Gamma_\Lambda/\Gamma)^\circ$, but when making an allowed change in the implementation, the labelling of the eigenspaces is by different characters $\psi' = \psi\varphi$ of $(\Gamma_\Lambda/\Gamma)^\circ$.

Having explained the dependence of the traces (7.2) on the choice of implementation, we are now in a position to present our conjecture for these numbers. The results of [8] for the S-matrix of integer spin simple current modular invariants suggest, when combined with the ordinary Verlinde formula, that there exists an allowed implementation for which the traces in the case $m = 3$ are given by
$$T_{\Lambda;\omega} = \sum_{\Lambda'} \frac{S^{\omega_1}_{\Lambda_1,\Lambda'}\, S^{\omega_2}_{\Lambda_2,\Lambda'}\, S^{\omega_3}_{\Lambda_3,\Lambda'}}{S_{\Omega,\Lambda'}}. \tag{7.4}$$
Here the summation extends over all integrable $\mathfrak g$-weights at level $k^\vee$ which are fixed under each of the permutations $\omega_s^\star$ for $s = 1, 2, 3$; $\Omega$ is the label for the vacuum primary field, while $S_{\Lambda,\Lambda'}$ denotes the entries of the modular S-matrix of the WZW theory based on $\mathfrak g$, which is given by the Kac–Peterson formula [30]
$$S_{\Lambda,\Lambda'} = N \sum_{\bar w\in\bar W} \mathrm{sign}(\bar w)\, \exp\Big[ -\frac{2\pi i}{k^\vee + g^\vee}\, (\bar w(\bar\Lambda+\bar\rho)\,,\, \bar\Lambda'+\bar\rho) \Big]. \tag{7.5}$$
And finally, $S^{\omega}_{\Lambda,\Lambda'}$ are the entries of the modular S-matrix for some other WZW theory. Namely, to each pair consisting of an affine Lie algebra $\mathfrak g$ and a diagram automorphism $\omega$ of $\mathfrak g$ one can associate another affine Lie algebra $\mathfrak g^\omega$, the so-called orbit Lie algebra; $S^\omega$ is just given by the Kac–Peterson formula for $\mathfrak g^\omega$. (For more details, in particular on how the fixed point weights $\Lambda_s$ are to be interpreted as weights of $\mathfrak g^{\omega_s}$, see [22,31]. For convenience, we have also listed in Table 7.1 the orbit Lie algebras$^{12}$ of untwisted affine Lie algebras with respect to all relevant diagram automorphisms [22].)

It is worth mentioning that in a first step, formula (7.3) should be regarded as an identity that holds in each fiber separately. But since the dimension $\dim B_\psi$ is independent of the choice of the fiber, by inverse Fourier transformation one concludes that the traces also do not depend on the point in moduli space over which the trace is taken. Accordingly, there is no moduli dependence in our conjecture (7.4).

It is a remarkable empirical observation that, in all cases that have been checked numerically, the expression (7.4) gives an integral (though not necessarily non-negative) result.$^{13}$ Notice that this property is much stronger than the obvious requirement that the dimensions $\dim B_\psi$ of the invariant subspaces must be integral. (Also, when the constraint (7.1) is not satisfied, then the expression on the right-hand side of (7.4) is zero.) We conjecture that the formula (7.4) indeed holds true and, moreover, that it directly generalizes to more complicated cases. To prepare the ground for this generalization, we write (7.4) in the equivalent form
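$$T_{\Lambda;\omega} = \sum_{\Lambda'} \frac{S^{\omega_1}_{\Lambda_1,\Lambda'}}{S_{\Omega,\Lambda'}} \cdot \frac{S^{\omega_2}_{\Lambda_2,\Lambda'}}{S_{\Omega,\Lambda'}} \cdot \frac{S^{\omega_3}_{\Lambda_3,\Lambda'}}{S_{\Omega,\Lambda'}} \cdot |S_{\Omega,\Lambda'}|^2. \tag{7.6}$$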
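For orientation, the special case in which all $\omega_s$ are the identity is the ordinary Verlinde formula, and there the integrality of (7.6) can be checked directly. The sketch below does this for $\mathfrak g = A_1^{(1)}$, using the standard closed form $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ of the su(2) S-matrix at level $k$ and the standard truncated Clebsch–Gordan fusion rule; both are well-known inputs that are not spelled out in the text, and the function names are illustrative. It does not test the fixed point formula (7.4) itself.

```python
import math

def s_matrix(k):
    """Modular S-matrix of su(2) at level k; labels a = 0..k are Dynkin labels."""
    n = k + 2
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def verlinde_dim(S, labels):
    """Genus-zero sum  sum_d |S_0d|^2 prod_s S_{a_s d} / S_{0d}  with all omega_s = id."""
    total = 0.0
    for d in range(len(S)):
        term = abs(S[0][d]) ** 2
        for a in labels:
            term *= S[a][d] / S[0][d]
        total += term
    return total

def su2_fusion_rule(a, b, c, k):
    """Truncated Clebsch-Gordan rule for the three-point fusion coefficient."""
    return int((a + b + c) % 2 == 0 and abs(a - b) <= c <= min(a + b, 2 * k - a - b))

k = 4
S = s_matrix(k)
max_deviation = max(
    abs(verlinde_dim(S, [a, b, c]) - su2_fusion_rule(a, b, c, k))
    for a in range(k + 1) for b in range(k + 1) for c in range(k + 1)
)
four_point = round(verlinde_dim(s_matrix(2), [1, 1, 1, 1]))
print(max_deviation, four_point)
```

The same function accepts any number of labels, which corresponds to replacing the product over three factors by a product over $m$ factors.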
12 $\tilde B_n^{(2)}$ stands for the unique series of twisted affine Lie algebras whose characters furnish a module of the modular group and which have simple roots of three different lengths.
13 The relevant calculations have been performed with the program kac which has been written by A.N. Schellekens and is available at http://norma.nikhef.nl/~t58/kac.html. Helpful discussions with Bert Schellekens are gratefully acknowledged.
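Table 7.1. Orbit Lie algebras of untwisted affine Lie algebras [22]

  $\mathfrak g$        $\omega$              $N$    $\mathfrak g^\omega$
  $A_{n-1}^{(1)}$      $\omega_n$            $n$    $\{0\}$
  $A_{n-1}^{(1)}$      $(\omega_n)^{n/N}$    $N$    $A_{(n/N)-1}^{(1)}$
  $B_n^{(1)}$          $\omega$              $2$    $\tilde B_{n-1}^{(2)}$
  $C_2^{(1)}$          $\omega$              $2$    $A_1^{(2)}$
  $C_{2n}^{(1)}$       $\omega$              $2$    $\tilde B_n^{(2)}$
  $C_{2n+1}^{(1)}$     $\omega$              $2$    $C_n^{(1)}$
  $D_n^{(1)}$          $\omega_v$            $2$    $C_{n-2}^{(1)}$
  $D_{2n}^{(1)}$       $\omega_s$            $2$    $B_n^{(1)}$
  $D_{2n+1}^{(1)}$     $\omega_s$            $4$    $C_{n-1}^{(1)}$
  $E_6^{(1)}$          $\omega$              $3$    $G_2^{(1)}$
  $E_7^{(1)}$          $\omega$              $2$    $F_4^{(1)}$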
The generalization to all $m \geq 3$ then simply consists in replacing the product $\prod_{s=1}^{3} S^{\omega_s}_{\Lambda_s,\Lambda'}/S_{\Omega,\Lambda'}$ by the analogous product $\prod_{s=1}^{m} S^{\omega_s}_{\Lambda_s,\Lambda'}/S_{\Omega,\Lambda'}$. The last factor on the right-hand side of the formula (7.6) should find its heuristic interpretation as a Weyl integration volume at genus zero (compare e.g. the derivation of the Verlinde formula in the framework of Chern–Simons theory [32]). Accordingly, for a surface of genus $g$ we can speculate that the exponent gets replaced by the Euler number $\chi = 2 - 2g$, leading to an expression of the form
$$\sum_{\Lambda'} |S_{\Omega,\Lambda'}|^{2-2g} \prod_{s=1}^{m} \frac{S^{\omega_s}_{\Lambda_s,\Lambda'}}{S_{\Omega,\Lambda'}}, \tag{7.7}$$
which is consistent with factorization. Unfortunately, a complete proof of (7.4) is not known at present and remains a challenge for future work. Note that such a proof would in particular include a proof of the ordinary Verlinde formula for WZW theories, namely for the special case when $\omega_s = \mathrm{id}$ for all $s$.

A. The Virasoro Algebra

In this appendix we collect some information about multi-shift automorphisms of the Virasoro algebra and its semidirect sum with the affine Lie algebra $\mathfrak g$. We first show that for an arbitrary automorphism $\sigma$ of an untwisted affine Lie algebra $\mathfrak g$, the extension to the semidirect sum with the Virasoro algebra is unique, if it exists.
To see this, consider two maps $\sigma_i: \mathrm{Vir} \to \mathrm{Vir}\oplus\mathfrak g$, $i = 1, 2$, such that both $(\sigma,\sigma_1)$ and $(\sigma,\sigma_2)$ furnish an automorphism of the semidirect sum $\mathrm{Vir}\oplus\mathfrak g$. Then we have
$$[\sigma_1(L_n), \sigma(x_m)] = -m\, \sigma(x_{m+n}) = [\sigma_2(L_n), \sigma(x_m)], \tag{A.1}$$
from which we learn that for every $n$ the combination $\sigma_1(L_n) - \sigma_2(L_n)$ commutes with all of $\mathfrak g$, so that $\sigma_1(L_n) - \sigma_2(L_n) = \xi_n K + \eta_n C$ with $\xi_n, \eta_n \in \mathbb C$ for $n \in \mathbb Z$. But this in turn implies that $[\sigma_1(L_n), \sigma_1(L_m)] = [\sigma_2(L_n), \sigma_2(L_m)]$, or more explicitly
$$(n-m)\, \sigma_1(L_{n+m}) + \tfrac{1}{24}(n^3-n)\, \delta_{n+m,0}\, C = (n-m)\, \sigma_2(L_{n+m}) + \tfrac{1}{24}(n^3-n)\, \delta_{n+m,0}\, C. \tag{A.2}$$
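This finally implies that $\xi_n = 0 = \eta_n$ for all $n$, and hence $\sigma_1 = \sigma_2$ as claimed.

Next we check that the prescription (2.32) indeed provides us with an extension of the automorphism $\sigma_{\bar\mu;s_\circ}$ of $\mathfrak g$ as defined in (2.6). We first verify that the relation (2.31) is preserved when $\bar x = H^i$. We have
$$\begin{aligned} [\sigma_{\bar\mu;s_\circ}(L_n), \sigma_{\bar\mu;s_\circ}(H^i\otimes f)] &= \Big[ L_n + \sum_{\ell\in\mathbb Z}\sum_s (\bar\mu_s, H_{n-\ell})\otimes \mathrm{Res}(t^\ell \varphi_{s_\circ,s})\,,\; H^i\otimes f \Big] \\ &= -H^i\otimes t^{n+1}\,\mathrm{d}f + K \sum_{\ell\in\mathbb Z}\sum_s \bar\mu_s^i\, \mathrm{Res}(t^\ell \varphi_{s_\circ,s})\, \mathrm{Res}(\mathrm{d}t^{n-\ell}\, f). \end{aligned} \tag{A.3}$$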
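Integrating by parts within the second residue and using the identity
$$\mathrm{Res}(fg) = \sum_{\ell\in\mathbb Z} \mathrm{Res}(t^{\ell-1} f)\, \mathrm{Res}(t^{-\ell} g) \qquad \text{for } f, g \in \mathbb C((t)) \tag{A.4}$$
(which follows immediately by substituting $f$ and $g$ by their Laurent expansions), this reduces to
$$\begin{aligned} [\sigma_{\bar\mu;s_\circ}(L_n), \sigma_{\bar\mu;s_\circ}(H^i\otimes f)] &= -H^i\otimes t^{n+1}\,\mathrm{d}f - K \sum_s \bar\mu_s^i\, \mathrm{Res}(t^{n+1} \varphi_{s_\circ,s}\, \mathrm{d}f) \\ &= -\sigma_{\bar\mu;s_\circ}(H^i\otimes t^{n+1}\,\mathrm{d}f) = \sigma_{\bar\mu;s_\circ}([L_n, H^i\otimes f]), \end{aligned} \tag{A.5}$$
which is the desired result. Similarly, with the help of the identities (2.11) and (2.23) we calculate
$$\begin{aligned} [\sigma_{\bar\mu;s_\circ}(L_n), \sigma_{\bar\mu;s_\circ}(E^{\bar\alpha}\otimes f)] &= \Big[ L_n + \sum_{\ell\in\mathbb Z}\sum_s (\bar\mu_s, H_{n-\ell})\, \mathrm{Res}(t^\ell \varphi_{s_\circ,s})\,,\; E^{\bar\alpha}\otimes f\,\Phi_{\bar\alpha} \Big] \\ &= -E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}(f\Phi_{\bar\alpha}) + \sum_{\ell\in\mathbb Z}\sum_s (\bar\mu_s,\bar\alpha)\, E^{\bar\alpha}\otimes t^{n-\ell} f\,\Phi_{\bar\alpha}\, \mathrm{Res}(t^\ell \varphi_{s_\circ,s}) \\ &= -E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}(f\Phi_{\bar\alpha}) + E^{\bar\alpha}\otimes f\,\Phi_{\bar\alpha} \sum_s (\bar\mu_s,\bar\alpha)\, \varphi_{s_\circ,s}\, t^{n+1} \\ &= -E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}(f\Phi_{\bar\alpha}) + E^{\bar\alpha}\otimes t^{n+1} f\,\mathrm{d}\Phi_{\bar\alpha} = -E^{\bar\alpha}\otimes t^{n+1} (\mathrm{d}f)\,\Phi_{\bar\alpha} \\ &= -\sigma_{\bar\mu;s_\circ}(E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f) = \sigma_{\bar\mu;s_\circ}([L_n, E^{\bar\alpha}\otimes f]). \end{aligned} \tag{A.6}$$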
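The residue identity (A.4), which is used repeatedly in these manipulations, is easy to test on Laurent polynomials. The following sketch represents a Laurent polynomial as a dictionary mapping exponents to coefficients; this representation and the helper names are choices made here for illustration only.

```python
def shift(series, power):
    """Multiply a Laurent polynomial {exponent: coefficient} by t**power."""
    return {exponent + power: coeff for exponent, coeff in series.items()}

def multiply(f, g):
    """Product of two Laurent polynomials."""
    product = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            product[ef + eg] = product.get(ef + eg, 0) + cf * cg
    return product

def residue(series):
    """Coefficient of t**(-1)."""
    return series.get(-1, 0)

def residue_by_mode_sum(f, g):
    """Right-hand side of (A.4): sum over l of Res(t^(l-1) f) Res(t^(-l) g).

    Res(t^(l-1) f) is the coefficient of t^(-l) in f, so only the values
    l = -exponent for exponents present in f can contribute.
    """
    total = 0
    for exponent in f:
        l = -exponent
        total += residue(shift(f, l - 1)) * residue(shift(g, -l))
    return total

f = {-2: 3, 0: 1, 1: -2}          # 3 t^-2 + 1 - 2 t
g = {-3: 5, -1: 4, 1: 7, 2: -1}   # 5 t^-3 + 4 t^-1 + 7 t - t^2
direct = residue(multiply(f, g))
by_modes = residue_by_mode_sum(f, g)
print(direct, by_modes)
```

For genuine elements of $\mathbb C((t))$ the same argument applies, since only finitely many values of $\ell$ contribute to the sum.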
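Finally we check that we are indeed dealing with an automorphism of $\mathrm{Vir}_{s_\circ}$. We obtain
$$\begin{aligned} [\sigma_{\bar\mu;s_\circ}(L_n), \sigma_{\bar\mu;s_\circ}(L_m)] ={}& \Big[ L_n + \sum_{\ell\in\mathbb Z}\sum_s (\bar\mu_s, H_{n-\ell})\otimes\mathrm{Res}(t^\ell\varphi_{s_\circ,s})\,,\; L_m + \sum_{\ell'\in\mathbb Z}\sum_{s'} (\bar\mu_{s'}, H_{m-\ell'})\otimes\mathrm{Res}(t^{\ell'}\varphi_{s_\circ,s'}) \Big] \\ ={}& (n-m)\, L_{n+m} + \tfrac{1}{24}(n^3-n)\, \delta_{n+m,0}\, C \\ &+ \sum_{\ell'\in\mathbb Z}\sum_{s'} (\bar\mu_{s'}, H_{m-\ell'+n})\otimes\mathrm{Res}(t^{\ell'}\varphi_{s_\circ,s'})\; [-(m-\ell')] \\ &- \sum_{\ell\in\mathbb Z}\sum_{s} (\bar\mu_{s}, H_{n-\ell+m})\otimes\mathrm{Res}(t^{\ell}\varphi_{s_\circ,s})\; [-(n-\ell)] \\ &+ K \sum_{\ell,\ell'\in\mathbb Z}\sum_{s,s'} (\bar\mu_s,\bar\mu_{s'})\, (n-\ell)\, \delta_{n-\ell+m-\ell',0}\, \mathrm{Res}(t^\ell\varphi_{s_\circ,s})\, \mathrm{Res}(t^{\ell'}\varphi_{s_\circ,s'}). \end{aligned} \tag{A.7}$$
The terms in the second and third line combine to
$$(n-m) \sum_{\ell\in\mathbb Z}\sum_s (\bar\mu_s, H_{n+m-\ell})\otimes\mathrm{Res}(t^\ell\varphi_{s_\circ,s}).$$
The central term in the fourth line can be rewritten as
$$\begin{aligned} & K \sum_{\ell,\ell'\in\mathbb Z}\sum_{s,s'} (\bar\mu_s,\bar\mu_{s'})\, \Big( \frac{n-m}{2} + \frac{\ell'-\ell}{2} \Big)\, \delta_{n+m-\ell-\ell',0}\, \mathrm{Res}(t^\ell\varphi_{s_\circ,s})\, \mathrm{Res}(t^{\ell'}\varphi_{s_\circ,s'}) \\ &\quad = \frac{n-m}{2}\, K \sum_{s,s'}\sum_{\ell\in\mathbb Z} (\bar\mu_s,\bar\mu_{s'})\, \mathrm{Res}(t^\ell\varphi_{s_\circ,s})\, \mathrm{Res}(t^{n+m-\ell}\varphi_{s_\circ,s'}) \\ &\quad = \frac{n-m}{2}\, K \sum_{s,s'} (\bar\mu_s,\bar\mu_{s'})\, \mathrm{Res}(t^{n+m+1}\varphi_{s_\circ,s}\varphi_{s_\circ,s'}), \end{aligned} \tag{A.8}$$
where in the first step we used the fact that except for the explicit factor $(\ell'-\ell)/2$ the expression is symmetric in $\ell$ and $\ell'$, and in the second step we employed again the identity (2.23). Collecting all terms, we see that indeed
$$[\sigma_{\bar\mu;s_\circ}(L_n), \sigma_{\bar\mu;s_\circ}(L_m)] = \sigma_{\bar\mu;s_\circ}([L_n, L_m]) \tag{A.9}$$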
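as required.

B. Inner Multi-Shift Automorphisms

Here we collect some details about the derivation of some of the results stated in Sect. 4. Let us consider the map $\mathrm{ad}_X$ for the element
$$X = E^{\bar\alpha}\otimes f_+ + E^{-\bar\alpha}\otimes f_- \tag{B.1}$$
of the block algebra $\bar{\mathfrak g}\otimes F$. We have
$$\begin{aligned} \mathrm{ad}_X(H^i\otimes f) &= -\bar\alpha^i\, (E^{\bar\alpha}\otimes f_+ f - E^{-\bar\alpha}\otimes f_- f), \\ \mathrm{ad}_X^2(H^i\otimes f) &= 2\bar\alpha^i\, (\bar\alpha^\vee, H)\otimes f_+ f_- f, \\ \mathrm{ad}_X^3(H^i\otimes f) &= -4\bar\alpha^i\, (E^{\bar\alpha}\otimes f_+^2 f_- f - E^{-\bar\alpha}\otimes f_+ f_-^2 f), \end{aligned} \tag{B.2}$$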
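etc., and hence
$$\begin{aligned} \exp(\mathrm{ad}_{\xi X})(H^i\otimes f) ={}& H^i\otimes f - \tfrac12 \bar\alpha^i\, (E^{\bar\alpha}\otimes f_+ - E^{-\bar\alpha}\otimes f_-)\, \sinh(2\xi\sqrt{f_+f_-})\, (f_+f_-)^{-1/2} f \\ &+ \tfrac12 \bar\alpha^i\, (\bar\alpha^\vee, H)\otimes [\cosh(2\xi\sqrt{f_+f_-}) - 1]\, f. \end{aligned} \tag{B.3}$$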
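As a sanity check of (B.3), one can replace the functions by numbers and the horizontal algebra by sl(2) in its defining representation. In the sketch below, $H$, $E$, $F$ are the usual $2\times2$ matrices, $H^i$ is taken to be the coroot (so that $\bar\alpha^i = 2$), $f = 1$, and $f_\pm$ are nonzero complex constants; all of these are simplifying assumptions made only for this illustration. The adjoint action is then conjugation, $\exp(\mathrm{ad}_{\xi X})(H) = e^{\xi X} H e^{-\xi X}$.

```python
import cmath

I2 = [[1, 0], [0, 1]]
H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=80):
    """Matrix exponential by its Taylor series (adequate for small 2x2 matrices)."""
    result, term = I2, I2
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mul(term, a)]
        result = lin(1, result, 1, term)
    return result

def conjugate_H(xi, f_plus, f_minus):
    """exp(ad_{xi X})(H) computed as exp(xi X) H exp(-xi X), X = f_plus E + f_minus F."""
    xi_x = lin(xi * f_plus, E, xi * f_minus, F)
    return mul(mul(expm(xi_x), H), expm(lin(-1, xi_x, 0, xi_x)))

def formula_b3(xi, f_plus, f_minus):
    """Right-hand side of (B.3) for H^i = coroot, alpha^i = 2, f = 1."""
    root = cmath.sqrt(f_plus * f_minus)
    sinh_part = cmath.sinh(2 * xi * root) / root
    cosh_part = cmath.cosh(2 * xi * root)
    # H - (E f+ - F f-) sinh(..)/sqrt(..) + H (cosh(..) - 1)
    return lin(cosh_part, H, -sinh_part, lin(f_plus, E, -f_minus, F))

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

generic_error = max_diff(conjugate_H(0.3 + 0.2j, 1.5, 0.4 - 0.1j),
                         formula_b3(0.3 + 0.2j, 1.5, 0.4 - 0.1j))
# For f_plus * f_minus = 1 and xi = i*pi/2 the result is -H, independently of f_plus.
weyl_error = max_diff(conjugate_H(0.5j * cmath.pi, 2.0, 0.5), lin(-1, H, 0, H))
print(generic_error, weyl_error)
```

The second check illustrates that for $f_+f_- = 1$ and $\xi = i\pi/2$ the action on the Cartan element reduces to the Weyl reflection and no longer depends on $f_+$.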
In the special case where $f_- = (f_+)^{-1}$ (recall that this restricts $f_+ \in F(\mathbb P^1_m)$ to lie in the subalgebra $F^*(\mathbb P^1_m)$) and $\xi = i\pi/2$, this reduces to
$$\exp(\mathrm{ad}_{(i\pi/2)X})(H^i\otimes f) = (\bar w_{\bar\alpha}(H))^i\otimes f, \tag{B.4}$$
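where $\bar w_{\bar\alpha}(H) = H - (\bar\alpha^\vee, H)\,\bar\alpha$ is the image of $H$ under the Weyl reflection $\bar w_{\bar\alpha}$. Note that this result does not depend on the function $f_+$ at all; as a consequence, by making use of $\bar w_{\bar\alpha}^2 = \mathrm{id}$ one immediately deduces the relation (4.4).

When applying $\mathrm{ad}_X$ to $E^{\bar\beta}\otimes f \in \bar{\mathfrak g}\otimes F$, we distinguish between several cases. First assume that $\bar\beta = \pm\bar\alpha$. Then
$$\begin{aligned} \exp(\mathrm{ad}_{\xi X})(E^{\pm\bar\alpha}\otimes f) ={}& E^{\pm\bar\alpha}\otimes f \mp \tfrac12 (\bar\alpha^\vee, H)\otimes \sinh(2\xi\sqrt{f_+f_-})\, (f_+f_-)^{-1/2} f_\mp f \\ &\pm \tfrac12 (E^{\bar\alpha}\otimes f_+ - E^{-\bar\alpha}\otimes f_-)\, [\cosh(2\xi\sqrt{f_+f_-}) - 1]\, (f_\pm)^{-1} f. \end{aligned} \tag{B.5}$$
Similarly, for $\bar\beta \neq \pm\bar\alpha$ we find the following. When neither $\bar\beta+\bar\alpha$ nor $\bar\beta-\bar\alpha$ is a $\bar{\mathfrak g}$-root, we simply have $\mathrm{ad}_X(E^{\bar\beta}\otimes f) = 0$. When either $\bar\beta+\bar\alpha$ or $\bar\beta-\bar\alpha$ (but not both) is a $\bar{\mathfrak g}$-root, while $\bar\beta\pm2\bar\alpha$ is not a $\bar{\mathfrak g}$-root, then we obtain
$$\exp(\mathrm{ad}_{\xi X})(E^{\bar\beta}\otimes f) = E^{\bar\beta}\otimes \cosh(\xi\sqrt{\eta f_+f_-})\, f + e_{\pm\bar\alpha,\bar\beta}\, E^{\bar\beta\pm\bar\alpha}\otimes \sinh(\xi\sqrt{\eta f_+f_-})\, (\eta f_+f_-)^{-1/2} f_\pm f, \tag{B.6}$$
where $\eta \equiv \eta_{\pm\bar\alpha} := e_{\pm\bar\alpha,\bar\beta}\, e_{\mp\bar\alpha,\bar\beta\pm\bar\alpha}$, with $e_{\bar\alpha,\bar\beta}$ structure constants of the horizontal subalgebra $\bar{\mathfrak g}$. Actually, from the fact that the $\bar\alpha$-string through $\bar\beta$ has only two elements, it follows that $e_{\pm\bar\alpha,\bar\beta}$ and $e_{\mp\bar\alpha,\bar\beta\pm\bar\alpha}$ can only take the values $\pm1$, and that $\bar\beta$ and $\bar\beta\pm\bar\alpha$ have the same length, so that the general identity $(\bar\beta\pm\bar\alpha, \bar\beta\pm\bar\alpha)/(\bar\beta,\bar\beta) = e_{\pm\bar\alpha,\bar\beta}/e_{\mp\bar\alpha,\bar\beta\pm\bar\alpha}$ tells us that $\eta = 1$.

When the $\bar\alpha$-string through $\bar\beta$ has more elements, the calculations become somewhat more lengthy. We refrain from describing all different possibilities, because the calculations remain straightforward. As an illustration, let $\bar\beta+\bar\alpha$ and $\bar\beta+2\bar\alpha$, but neither $\bar\beta-\bar\alpha$ nor $\bar\beta+3\bar\alpha$, be $\bar{\mathfrak g}$-roots; then we have
$$\begin{aligned} \mathrm{ad}_X(E^{\bar\beta}\otimes f) &= e_{\bar\alpha,\bar\beta}\, E^{\bar\alpha+\bar\beta}\otimes f_+ f, \\ \mathrm{ad}_X^2(E^{\bar\beta}\otimes f) &= e_{\bar\alpha,\bar\beta}\, (e_{\bar\alpha,\bar\alpha+\bar\beta}\, E^{2\bar\alpha+\bar\beta}\otimes f_+ + e_{-\bar\alpha,\bar\alpha+\bar\beta}\, E^{\bar\beta}\otimes f_-)\, f_+ f, \\ \mathrm{ad}_X^3(E^{\bar\beta}\otimes f) &= \eta'\, e_{\bar\alpha,\bar\beta}\, E^{\bar\alpha+\bar\beta}\otimes f_+^2 f_- f, \end{aligned} \tag{B.7}$$
with $\eta' := e_{\bar\alpha,\bar\alpha+\bar\beta}\, e_{-\bar\alpha,2\bar\alpha+\bar\beta} + e_{-\bar\alpha,\bar\alpha+\bar\beta}\, e_{\bar\alpha,\bar\beta}$, leading to
$$\begin{aligned} \exp(\mathrm{ad}_{\xi X})(E^{\bar\beta}\otimes f) ={}& E^{\bar\beta}\otimes f + e_{\bar\alpha,\bar\beta}\, E^{\bar\alpha+\bar\beta}\otimes \sinh(\xi\sqrt{\eta' f_+f_-})\, (\eta' f_+f_-)^{-1/2} f_+ f \\ &+ e_{\bar\alpha,\bar\beta}\, (e_{\bar\alpha,\bar\alpha+\bar\beta}\, E^{2\bar\alpha+\bar\beta}\otimes f_+ + e_{-\bar\alpha,\bar\alpha+\bar\beta}\, E^{\bar\beta}\otimes f_-)\, [\cosh(\xi\sqrt{\eta' f_+f_-}) - 1]\, (\eta' f_+f_-)^{-1} f_+ f. \end{aligned} \tag{B.8}$$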
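In the special case $f_+f_- = 1$ and $\xi = i\pi/2$, from (B.5) we get
$$\exp(\mathrm{ad}_{(i\pi/2)X})(E^{\pm\bar\alpha}\otimes f) = E^{\pm\bar\alpha}\otimes f \mp (E^{\bar\alpha}\otimes f_+ - E^{-\bar\alpha}\otimes f_-)\, (f_\pm)^{-1} f = E^{\mp\bar\alpha}\otimes f_\mp^2 f \tag{B.9}$$
for $\bar\beta = \pm\bar\alpha$. When in addition $Y \in \bar{\mathfrak g}\otimes F$ is of the same form as $X$, but with $f_\pm$ replaced by $g_\pm \in F^*(\mathbb P^1_m)$, then it follows from (B.9) that
$$\mathrm{Ad}_{i\pi/2;X,Y}(E^{\pm\bar\alpha}\otimes f) = E^{\pm\bar\alpha}\otimes (f_+)^{\mp2} (g_+)^{\pm2} f. \tag{B.10}$$
Analogously, the special case $f_+f_- = 1$ of (B.6) yields
$$\exp(\mathrm{ad}_{(i\pi/2)X})(E^{\bar\beta}\otimes f) = i\, e_{\pm\bar\alpha,\bar\beta}\, E^{\bar\beta\pm\bar\alpha}\otimes f_\pm f. \tag{B.11}$$
Similarly, from (B.8) one obtains (using also the fact that $\bar\beta$ and $\bar\beta+2\bar\alpha$ must be long roots while $\bar\alpha$ and $\bar\beta+\bar\alpha$ must be short roots, which implies the identities
$$\begin{aligned} e_{\bar\alpha,\bar\alpha+\bar\beta}\, e_{-\bar\alpha,2\bar\alpha+\bar\beta} &= (e_{\bar\alpha,\bar\alpha+\bar\beta})^2 \cdot \frac{(\bar\alpha+\bar\beta, \bar\alpha+\bar\beta)}{(2\bar\alpha+\bar\beta, 2\bar\alpha+\bar\beta)} = 2^2 \cdot \tfrac12 = 2, \\ e_{\bar\alpha,\bar\beta}\, e_{-\bar\alpha,\bar\alpha+\bar\beta} &= (e_{\bar\alpha,\bar\beta})^2 \cdot \frac{(\bar\beta,\bar\beta)}{(\bar\alpha+\bar\beta, \bar\alpha+\bar\beta)} = 1^2 \cdot 2 = 2, \\ |e_{\bar\alpha,\bar\beta}\, e_{\bar\alpha,\bar\alpha+\bar\beta}| &= 2, \end{aligned} \tag{B.12}$$
so that in particular $\eta' = 4$),
$$\exp(\mathrm{ad}_{(i\pi/2)X})(E^{\bar\beta}\otimes f) = \pm E^{2\bar\alpha+\bar\beta}\otimes f_+^2 f. \tag{B.13}$$
Taking again also $Y$ of the form described above, the relations (B.11) and (B.13) lead to
$$\mathrm{Ad}_{i\pi/2;X,Y}(E^{\bar\beta}\otimes f) = E^{\bar\beta}\otimes f_\pm g_\mp f \qquad\text{and}\qquad \mathrm{Ad}_{i\pi/2;X,Y}(E^{\bar\beta}\otimes f) = E^{\bar\beta}\otimes (f_+g_-)^2 f, \tag{B.14}$$
respectively. For the remaining cases, the calculations are completely parallel and lead to results analogous to (B.14), with $f_\pm g_\mp$ raised to the power $-(\bar\alpha^\vee,\bar\beta)$. This finally leads to the formula (4.5) of the main text.

Next we comment on the analogous calculations for the case where the functions $f$ are Laurent series (in the local coordinate $\zeta_s$, respectively the formal variable $t$). First we compute that, for $f_+f_- = 1$, the second formula in (B.2) changes to
$$\mathrm{ad}_X^2(H^i\otimes f) = 2\bar\alpha^i\, (\bar\alpha^\vee, H)\otimes f_+f_-f - 2\bar\alpha^i\, K\, \frac{2}{(\bar\alpha,\bar\alpha)}\, \mathrm{Res}(f_+ f\, \mathrm{d}f_-), \tag{B.15}$$
so that (B.3) acquires an additional contribution proportional to the central element,
$$-\tfrac12 (\bar\alpha^\vee)^i\, K\, \mathrm{Res}(f_+ f\, \mathrm{d}f_-)\, (\cosh(2\xi) - 1), \tag{B.16}$$
which when specialized to $\xi = i\pi/2$ becomes $(\bar\alpha^\vee)^i\, K\, \mathrm{Res}(f_+ f\, \mathrm{d}f_-)$. Performing the same calculation also with the analogous element $Y$ in place of $X$, we then arrive at the formula (4.8).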
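In the calculation of $\exp(\mathrm{ad}_{\xi X})(E^{\bar\beta}\otimes f)$, a change occurs only for $\bar\beta = \pm\bar\alpha$; in this case we get
$$\begin{aligned} \mathrm{ad}_X(E^{\pm\bar\alpha}\otimes f) &= \mp(\bar\alpha^\vee, H)\otimes f_\mp f - K\, \frac{2}{(\bar\alpha,\bar\alpha)}\, \mathrm{Res}(f_\mp\, \mathrm{d}f), \\ \mathrm{ad}_X^2(E^{\pm\bar\alpha}\otimes f) &= \pm2\, (E^{\bar\alpha}\otimes f_+ - E^{-\bar\alpha}\otimes f_-)\, f_\mp f, \\ \mathrm{ad}_X^3(E^{\pm\bar\alpha}\otimes f) &= \mp4\, (\bar\alpha^\vee, H)\otimes f_\mp f - 4K\, \frac{2}{(\bar\alpha,\bar\alpha)}\, \mathrm{Res}(f_\mp\, \mathrm{d}f), \end{aligned} \tag{B.17}$$
etc. The residue terms add up to $-(\bar\alpha,\bar\alpha)^{-1} K\, \mathrm{Res}(f_\mp\, \mathrm{d}f)\, \sinh(2\xi)$, which vanishes for $\xi = i\pi/2$, while the other terms reproduce (B.6).

Finally let us compute the action of $\mathrm{ad}_X$ on the Virasoro generators $L_n$. By (2.31) we find, for $f_+f_- = 1$,
$$\begin{aligned} \mathrm{ad}_X(L_n) &= E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f_+ + E^{-\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f_-, \\ \mathrm{ad}_X^2(L_n) &= -2\, (\bar\alpha^\vee, H)\otimes f_-\,\mathrm{d}f_+\, t^{n+1} + 2\, \frac{2}{(\bar\alpha,\bar\alpha)}\, K\, \mathrm{Res}(t^{n+1}\,\mathrm{d}f_+\,\mathrm{d}f_-), \\ \mathrm{ad}_X^3(L_n) &= 4\, (E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f_+ + E^{-\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f_-), \end{aligned} \tag{B.18}$$
and hence
$$\begin{aligned} \exp(\mathrm{ad}_{\xi X})(L_n) ={}& L_n + \tfrac12 \sinh(2\xi)\, (E^{\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f_+ + E^{-\bar\alpha}\otimes t^{n+1}\,\mathrm{d}f_-) \\ &- \tfrac12 [\cosh(2\xi) - 1]\, \Big( (\bar\alpha^\vee, H)\otimes f_-\,\mathrm{d}f_+\, t^{n+1} - \frac{2}{(\bar\alpha,\bar\alpha)}\, K\, \mathrm{Res}(t^{n+1}\,\mathrm{d}f_+\,\mathrm{d}f_-) \Big). \end{aligned} \tag{B.19}$$
Specializing to $\xi = i\pi/2$, we have
$$\exp(\mathrm{ad}_{(i\pi/2)X})(L_n) = L_n + (\bar\alpha^\vee, H)\otimes f_-\,\mathrm{d}f_+\, t^{n+1} - \frac{2}{(\bar\alpha,\bar\alpha)}\, K\, \mathrm{Res}(t^{n+1}\,\mathrm{d}f_+\,\mathrm{d}f_-). \tag{B.20}$$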
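C. On the Moduli Dependence of Inner Automorphisms

Our first aim in this appendix is to check the formula (6.11), which asserts that the automorphism $\delta_{\bar\mu}$ acting as in (6.10) can be represented as $\delta_{\bar\mu} = \exp(\mathrm{ad}_{(\bar\mu_u,H)\otimes g_{s_\circ,u}})$ for $s_\circ \neq u$ and as $\delta_{\bar\mu} = \exp(\mathrm{ad}_{\sum_{s\neq u}(\bar\mu_s,H)\otimes\tilde g_s})$ for $s_\circ = u$, with $g_{s_\circ,u}$ and $\tilde g_s$ as defined in (6.12). First, clearly
$$\exp(\mathrm{ad}_{(\bar\mu_u,H)\otimes g_{s_\circ,u}})(K) = K = \exp(\mathrm{ad}_{(\bar\mu_s,H)\otimes\tilde g_s})(K). \tag{C.1}$$
Second, using the relations $\mathrm{d}g_{s_\circ,u} = \varphi_{s_\circ,u} - \varphi^{[0]}_{s_\circ,u}$ and
$$[(\bar\mu_u,H)\otimes g_{s_\circ,u}\,,\, H^i\otimes f] = \bar\mu_u^i\, \mathrm{Res}(\mathrm{d}g_{s_\circ,u}\, f)\, K \tag{C.2}$$
together with $[K, \cdot\,] = 0$, we have
$$\exp(\mathrm{ad}_{(\bar\mu_u,H)\otimes g_{s_\circ,u}})(H^i\otimes f) = H^i\otimes f + K\, \bar\mu_u^i\, \mathrm{Res}((\varphi_{s_\circ,u} - \varphi^{[0]}_{s_\circ,u})\, f), \tag{C.3}$$
and similarly
$$\exp(\mathrm{ad}_{(\bar\mu_s,H)\otimes\tilde g_s})(H^i\otimes f) = H^i\otimes f + K\, \bar\mu_s^i\, \mathrm{Res}((\varphi_{u,s} - \varphi^{[0]}_{u,s})\, f). \tag{C.4}$$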
¯
¯ β ⊗gs◦ ,u f And finally, by exponentiating the identity [(µ¯ u , H )⊗gs◦ ,u , E β ⊗f ] = (µ¯ u , β)E we obtain ¯ ¯ ¯ gs◦ ,u ) f exp(ad(µ¯ u ,H )⊗gs◦ ,u )(E β ⊗f ) = E β ⊗ exp((µ¯ u , β)
¯
(C.5)
t+zu −zs (µ¯ ,β) ( t+z ) . [0] u −zs
(C.6)
¯
= Eβ ⊗ f ·
t+zs0 −zu (µ¯ ,β) ( t+z , [0] ) s −zu u
0
and analogously ¯
¯
exp(ad(µ¯ s ,H )⊗g˜s )(E β ⊗f ) = E β ⊗ f ·
s
¯
Putting these results together we see that δµ¯ is indeed given by (6.11). Next we derive the formula (6.15) for the commutator [∇u , 2µ;s ¯ ◦ ]. Recall that ∇u = (u) ∂/∂zu + L−1 . Thus we first study the effect of differentiating with respect to zu . Employing the identities ∂ ∂zu gs◦ = −ϕs◦ ,u
for s◦ 6 = u
∂ ∂zu g˜ s zu =zu[0] = ϕu,s
|
and
for s 6 = u,
(C.7)
we derive ∂ ¯ ◦ (z) zu =z[0] = ∂zu 2µ;s u
[0] ( − (µ¯ u , H ) ϕs[0]◦ ,u + K ∂z∂ u h˜ s◦|zu =zu[0] ) 2µ;s ¯ ◦ (z ) for
∂ ¯ ◦ (z) zu =z[0] = ∂zu 2µ;s u
(
| |
m X [0] (µ¯ s , H ) ϕu,s + K ∂z∂ h˜ u|z u
s=1 s6=u
[0] u =zu
[0] ) 2µ;s ¯ ◦ (z )
s◦ 6= u,
for s◦ = u. (C.8)
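Concerning the commutation with $L^{(u)}_{-1}$, we use $\Theta_{\bar\mu;s}(z)\, L^{(s)}_{-1} = \sigma_{\bar\mu;s}(L^{(s)}_{-1})\, \Theta_{\bar\mu;s}(z)$, which follows by (2.34). Also, according to (2.32) we have explicitly
$$\sigma_{\bar\mu;u}(L^{(u)}_{-1}) = L^{(u)}_{-1} + \sum_{\ell\in\mathbb Z}\sum_{s=1}^{m} (\bar\mu_s, H_{-1-\ell})\, \mathrm{Res}(t^\ell\varphi_{u,s}) + \tfrac12 K \sum_{s,s'=1}^{m} (\bar\mu_s,\bar\mu_{s'})\, \mathrm{Res}(\varphi_{u,s}\varphi_{u,s'}). \tag{C.9}$$
The two terms of the right-hand side can be rewritten with the help of
$$\sum_{\ell\in\mathbb Z} H^i_{-1-\ell}\, \mathrm{Res}(t^\ell\varphi_{u,s}) \equiv \sum_{\ell\in\mathbb Z} H^i\otimes t^{-1-\ell}\, \mathrm{Res}(t^\ell\varphi_{u,s}) = \sum_{\ell\in\mathbb Z} H^i\otimes (\varphi_{u,s})_{-1-\ell}\, t^{-1-\ell} = H^i\otimes\varphi_{u,s} \tag{C.10}$$
and
$$\sum_{s,s'=1}^{m} (\bar\mu_s,\bar\mu_{s'})\, \mathrm{Res}(\varphi_{u,s}\varphi_{u,s'}) = 2 \sum_{\substack{s=1\\ s\neq u}}^{m} \frac{(\bar\mu_s,\bar\mu_u)}{z_u - z_s} \tag{C.11}$$
so as to obtain
$$[\Theta_{\bar\mu;u}, L^{(u)}_{-1}] = (\sigma_{\bar\mu;u}(L^{(u)}_{-1}) - L^{(u)}_{-1})\, \Theta_{\bar\mu;u} = \Big( \sum_{s=1}^{m} (\bar\mu_s,H)\otimes\varphi_{u,s} + K \sum_{\substack{s=1\\ s\neq u}}^{m} \frac{(\bar\mu_s,\bar\mu_u)}{z_u - z_s} \Big)\, \Theta_{\bar\mu;u}. \tag{C.12}$$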
Combining this with (C.8), we finally get [0] ¯ u , H )⊗ϕs[0]◦ ,u 2µ;s [∇u , 2µ;s ¯ ◦ (z)] = − (µ ¯ ◦ (z )
+ K ∂z∂ h˜ s◦|z
[0] u =zu
u
[0] 2µ;s ¯ ◦ (z )
for s◦ 6= u,
(C.13)
while for s◦ = u several terms cancel, leading to [∇u , 2µ;u ¯ (z)]|z
[0] u =zu
[0] = − (µ¯ u , H )⊗t −1 2µ;u ¯ (z )
+ K ( ∂z∂ h˜ u|z u
u
− =z[0]
m X (µ¯ s ,µ¯ u )
u
s=1 s6 =u
z u − zs
[0] ) 2µ;u ¯ (z ).
(C.14)
This is indeed equivalent to the formula (6.15).

D. On the Implementation of Inner Automorphisms

In this appendix we present the calculations which show that the specific implementation of inner multi-shift automorphisms that was described in (5.7) satisfies the relation (6.41) and explain why this implies that the implementation can be consistently chosen at all points of the moduli space $M_m$. We consider first the case of a single coroot $\bar\alpha_s^\vee$ for some fixed $s$, i.e. $\bar\mu_s = \bar\alpha_s^\vee$ while $\bar\mu_{s'} = 0$ for $s' \neq s$. We write
$$X_{s_\circ,s} \equiv X_{s_\circ,s;\alpha_s}(z) := E^{\bar\alpha_s}\otimes\varphi_{s_\circ,s} + E^{-\bar\alpha_s}\otimes\varphi^{-1}_{s_\circ,s} \tag{D.1}$$
and
$$X^{[0]}_{s_\circ,s} := X_{s_\circ,s;\alpha_s}(z^{[0]}) = E^{\bar\alpha_s}\otimes\varphi^{[0]}_{s_\circ,s} + E^{-\bar\alpha_s}\otimes(\varphi^{[0]}_{s_\circ,s})^{-1}. \tag{D.2}$$
We claim that $X_{s_\circ,s}$ and $X^{[0]}_{s_\circ,s}$ are related by (6.40), which in terms of the present notations reads
$$\exp(\tfrac12 i\pi X_{s_\circ,s}) = U_{s_\circ,s}(z)\, \exp(\tfrac12 i\pi X^{[0]}_{s_\circ,s})\, U^{-1}_{s_\circ,s}(z), \tag{D.3}$$
where
$$U_{s_\circ,s}(z) \equiv U_{s_\circ,s;\bar\alpha_s^\vee}(z) := \exp\big( -\tfrac12 H^{\bar\alpha_s}\otimes g_{s_\circ,s}(z) \big) \tag{D.4}$$
with $g_{s_\circ,s}$ as defined in (6.22) and $H^{\bar\alpha_s} \equiv (\bar\alpha_s^\vee, H)$. To prove (D.3), we show that both sides satisfy the same differential equation; the relation then holds because owing to
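$U_{s_\circ,s}(z^{[0]}) = 1$ both sides are identical at $z = z^{[0]}$. To obtain the derivative of the left-hand side of (D.3) we first compute
\begin{align}
\frac{\partial}{\partial z_u} X_{s_\circ,s} &= (\delta_{u,s} - \delta_{u,s_\circ})\,\bigl(E^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}^{2} - E^{-\bar\alpha_s}\otimes\mathrm{id}\bigr) \nonumber\\
&= -\tfrac12\,(\delta_{u,s} - \delta_{u,s_\circ})\,\mathrm{ad}_{X_{s_\circ,s}}\bigl(H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}\bigr) \tag{D.5}\\
&\equiv \tfrac12\,(\delta_{u,s} - \delta_{u,s_\circ})\,\mathrm{ad}_{H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}}\bigl(X_{s_\circ,s}\bigr), \nonumber
\end{align}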
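where in the transition to the second line we used the formula (B.2). It follows immediately that
\[
\frac{\partial}{\partial z_u}\exp\bigl(\tfrac{i\pi}{2}X_{s_\circ,s}\bigr) = \tfrac12\,(\delta_{u,s} - \delta_{u,s_\circ})\,\Bigl(H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}\,\exp\bigl(\tfrac{i\pi}{2}X_{s_\circ,s}\bigr) - \exp\bigl(\tfrac{i\pi}{2}X_{s_\circ,s}\bigr)\,H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}\Bigr). \tag{D.6}
\]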
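The step from (D.5) to (D.6) only uses that a commutator acts as a derivation on products. As a sketch, with the shorthands $A := \tfrac12(\delta_{u,s}-\delta_{u,s_\circ})\,H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}$, $X := X_{s_\circ,s}$ and $c := i\pi/2$ (abbreviations introduced here for illustration, not notation of the paper), the last line of (D.5) reads $\partial X/\partial z_u = [A, X]$, and one finds term by term in the exponential series:

```latex
\begin{align*}
\frac{\partial}{\partial z_u} X^n
  &= \sum_{k=0}^{n-1} X^{k}\,[A, X]\,X^{n-1-k} = [A, X^n], \\
\frac{\partial}{\partial z_u} e^{cX}
  &= \sum_{n\ge 0} \frac{c^n}{n!}\,[A, X^n] = [A, e^{cX}] = A\,e^{cX} - e^{cX} A .
\end{align*}
```

The telescoping sum in the first line is what turns the derivative of each power into a single commutator; the $z$-dependence of $A$ itself never needs to be differentiated.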
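As for the right-hand side of (D.3), we have
\begin{align}
\frac{\partial U_{s_\circ,s}(z)}{\partial z_u} &= -\tfrac12\,U_{s_\circ,s}(z)\cdot H^{\bar\alpha_s}\otimes\frac{\partial}{\partial z_u}g_{s_\circ,s}
= -\tfrac12\,U_{s_\circ,s}(z)\cdot H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}\,(\delta_{u,s_\circ} - \delta_{u,s}) \nonumber\\
&\equiv \tfrac12\,(\delta_{u,s} - \delta_{u,s_\circ})\,H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}\cdot U_{s_\circ,s}(z) \tag{D.7}
\end{align}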
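and
\[
\frac{\partial}{\partial z_u}U^{-1}_{s_\circ,s}(z) = -U^{-1}_{s_\circ,s}(z)\,\frac{\partial U_{s_\circ,s}(z)}{\partial z_u}\,U^{-1}_{s_\circ,s}(z)
= -\tfrac12\,(\delta_{u,s} - \delta_{u,s_\circ})\,U^{-1}_{s_\circ,s}(z)\cdot H^{\bar\alpha_s}\otimes\varphi_{s_\circ,s}; \tag{D.8}
\]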
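therefore the product $U_{s_\circ,s}(z)\,\exp(\tfrac12 i\pi X^{[0]}_{s_\circ,s})\,U^{-1}_{s_\circ,s}(z)$ indeed satisfies the same differential equation as $\exp(\tfrac12 i\pi X_{s_\circ,s})$ in (D.6). Next we note that according to the results (B.4) and (B.16), we also have
\[
\exp\bigl(\mathrm{ad}_{(i\pi/2)X_{s_\circ,s}}\bigr)(H^{\bar\beta}\otimes f) = \bar w_{\bar\alpha}(H^{\bar\beta})\otimes f - (\bar\alpha^\vee, \bar\beta^\vee)\,K\,\mathrm{Res}(\varphi^{-1}_{s_\circ,s}\,d\varphi_{s_\circ,s}\,f). \tag{D.9}
\]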
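Thus in particular
\begin{align}
\exp\bigl(\mathrm{ad}_{(i\pi/2)X_{s_\circ,s}}\bigr)(H^{\bar\alpha_s}\otimes f) &= -H^{\bar\alpha_s}\otimes f - (\bar\alpha_s^\vee, \bar\alpha_s^\vee)\,K\,\mathrm{Res}(\varphi^{-1}_{s_\circ,s}\,d\varphi_{s_\circ,s}\,f), \nonumber\\
\exp\bigl(\mathrm{ad}_{-(i\pi/2)Y_s}\bigr)(H^{\bar\alpha_s}\otimes f) &= -H^{\bar\alpha_s}\otimes f, \tag{D.10}
\end{align}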
where we introduced
\[
Y_s := (E^{\bar\alpha_s} + E^{-\bar\alpha_s})\otimes\mathrm{id}. \tag{D.11}
\]
In other words, $\exp(-\tfrac12 i\pi Y_s)\,H^{\bar\alpha_s}\otimes f = -H^{\bar\alpha_s}\otimes f\,\exp(-\tfrac12 i\pi Y_s)$, which implies in particular that
\[
\exp(-\tfrac12 i\pi Y_s)\,U^{-1}_{s_\circ,s}(z) = U_{s_\circ,s}(z)\,\exp(-\tfrac12 i\pi Y_s), \tag{D.12}
\]
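while
\[
\exp\bigl(\tfrac{i\pi}{2}X_{s_\circ,s}\bigr)\,U_{s_\circ,s}(z) = U^{-1}_{s_\circ,s}(z)\,\exp\bigl(\tfrac{i\pi}{2}X_{s_\circ,s}\bigr)\cdot\exp\Bigl(-\frac{(\bar\alpha_s,\bar\alpha_s)}{2}\,K\,\mathrm{Res}(\varphi^{-1}_{s_\circ,s}\,d\varphi_{s_\circ,s}\,g_{s_\circ,s})\Bigr), \tag{D.13}
\]
which by $d\varphi_{s_\circ,s} = -\varphi^{2}_{s_\circ,s}$ and
\[
\mathrm{Res}(\varphi_{s_\circ,s'}\,g_{s_\circ,s}) =
\begin{cases}
\delta_{s_\circ,s'}\,\ln\dfrac{z_{s_\circ} - z_s}{z^{[0]}_{s_\circ} - z^{[0]}_s} & \text{for } s \neq s_\circ, \\[2ex]
0 & \text{for } s = s_\circ
\end{cases} \tag{D.14}
\]
reduces to
\[
\exp(\tfrac12 i\pi X_{s_\circ,s})\,U_{s_\circ,s}(z) = U^{-1}_{s_\circ,s}(z)\,\exp(\tfrac12 i\pi X_{s_\circ,s}). \tag{D.15}
\]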
Now according to our prescription (5.7), at each individual point $z$ in the moduli space the inner automorphism $\mathrm{Ad}_{i\pi/2;X_{s_\circ,s},Y_s}$ of the block algebra $\bar{\mathfrak{g}}\otimes\mathcal{F}$ is implemented by
\[
\tilde\Theta_{\bar\alpha_s^\vee;s_\circ}(z) = \exp(-\tfrac12 i\pi Y_s)\,\exp(\tfrac12 i\pi X_{s_\circ,s}). \tag{D.16}
\]
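The relations (D.3), (D.12) and (D.15) can be sanity-checked numerically in a toy setting. The sketch below is not the construction of the paper. It assumes the $2\times 2$ defining representation of $\mathfrak{sl}(2)$, replaces the function $\varphi_{s_\circ,s}$ by a single nonzero complex number (pointwise evaluation), drops all central terms proportional to $K$, and takes $g = \ln(\varphi^{[0]}/\varphi)$ as a stand-in for (6.22), a choice that vanishes at $z = z^{[0]}$. All helper names (`mat_mul`, `mat_exp`, `check_relations`, ...) are hypothetical.

```python
import cmath

H = [[1, 0], [0, -1]]   # H^alpha in the 2x2 representation (assumption)
Y = [[0, 1], [1, 0]]    # E^alpha + E^{-alpha}, cf. (D.11) without the "tensor id"


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scaled(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def mat_exp(a, terms=60):
    """Matrix exponential via its Taylor series (fine for small 2x2 inputs)."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = scaled(1 / n, mat_mul(term, a))
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def max_diff(a, b):
    """Largest absolute entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def check_relations(phi, phi0):
    """Return the deviations of the toy versions of (D.3), (D.12), (D.15)."""
    X = [[0, phi], [1 / phi, 0]]        # E*phi + F/phi, cf. (D.1)
    X0 = [[0, phi0], [1 / phi0, 0]]     # the same at the reference point, cf. (D.2)
    g = cmath.log(phi0 / phi)           # assumed stand-in for g of (6.22)
    U = mat_exp(scaled(-0.5 * g, H))    # cf. (D.4)
    U_inv = mat_exp(scaled(0.5 * g, H))
    c = 0.5j * cmath.pi
    exp_X, exp_X0 = mat_exp(scaled(c, X)), mat_exp(scaled(c, X0))
    exp_Y = mat_exp(scaled(-c, Y))
    theta = mat_mul(exp_Y, exp_X)       # cf. (D.16)
    theta0 = mat_mul(exp_Y, exp_X0)     # the same at the reference point
    return {
        "D.3": max_diff(exp_X, mat_mul(mat_mul(U, exp_X0), U_inv)),
        "D.12": max_diff(mat_mul(exp_Y, U_inv), mat_mul(U, exp_Y)),
        "D.15": max_diff(mat_mul(exp_X, U), mat_mul(U_inv, exp_X)),
        "combined": max_diff(theta0, mat_mul(mat_mul(U, U), theta)),
    }


deviations = check_relations(0.8 + 0.3j, 1.1 - 0.4j)
for name, dev in deviations.items():
    print(name, dev)
```

The entry `"combined"` measures the deviation of $\exp(-\tfrac12 i\pi Y)\exp(\tfrac12 i\pi X^{[0]})$ from $U^2\exp(-\tfrac12 i\pi Y)\exp(\tfrac12 i\pi X)$, which is what the three relations give when chained together. Because central terms are discarded, this toy check says nothing about the $K$-dependent factors.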
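Combining the results (D.3), (D.12) and (D.15), we learn that this twisted intertwiner satisfies
\begin{align}
\tilde\Theta_{\bar\alpha_s^\vee;s_\circ}(z^{[0]}) &= \exp(-\tfrac12 i\pi Y_s)\,\exp(\tfrac12 i\pi X^{[0]}_{s_\circ,s}) \nonumber\\
&= \exp(-\tfrac12 i\pi Y_s)\,U^{-1}_{s_\circ,s}(z)\,\exp(\tfrac12 i\pi X_{s_\circ,s})\,U_{s_\circ,s}(z) \tag{D.17}\\
&= U^{2}_{s_\circ,s}(z)\,\exp(-\tfrac12 i\pi Y_s)\,\exp(\tfrac12 i\pi X_{s_\circ,s}) = U^{2}_{s_\circ,s}(z)\,\tilde\Theta_{\bar\alpha_s^\vee;s_\circ}(z). \nonumber
\end{align}
We now return to the general case where the components of $\bar\mu$ are arbitrary elements of the coroot lattice. We note that because of
\[
\mathrm{Res}(dg_{s_\circ,s}\,g_{s_\circ,s'}) = 0 \tag{D.18}
\]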
for all values of $s, s'$ any two operators of the form (D.4) commute (when taken at the same point $z$ in the moduli space $\mathcal{M}_m$, of course). Moreover, according to (4.8), up to central terms $U_{s_\circ,s}(z)$ also commutes with the twisted intertwiner $\tilde\Theta_{\bar\beta^\vee_{s'};s_\circ}(z)$ for $s' \neq s$ and arbitrary coroots $\bar\beta^\vee \equiv \bar\beta^\vee_{s'}$; more precisely, using the identity (D.14) we find
\begin{align}
\tilde\Theta_{\bar\beta^\vee_{s'};s_\circ}(z)\,U_{s_\circ,s}(z) &= U_{s_\circ,s}(z)\,\tilde\Theta_{\bar\beta^\vee_{s'};s_\circ}(z)\cdot\exp\bigl[-\tfrac12\,(\bar\beta^\vee_{s'}, \bar\alpha^\vee_s)\,K\,\mathrm{Res}(\varphi_{s_\circ,s'}\,g_{s_\circ,s})\bigr] \nonumber\\
&= U_{s_\circ,s}(z)\,\tilde\Theta_{\bar\beta^\vee_{s'};s_\circ}(z)\cdot\exp\Bigl[-\tfrac12\,(\bar\beta^\vee_{s'}, \bar\alpha^\vee_s)\,\delta_{s_\circ,s'}\,K\,\ln\frac{z_{s_\circ} - z_s}{z^{[0]}_{s_\circ} - z^{[0]}_s}\Bigr]. \tag{D.19}
\end{align}
The final expression will involve the square of the exponential with the central element, because it is $U^{2}_{s_\circ,s}(z)$ that we commute to the left. Moreover, taking into account the
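chosen ordering of the factors in $\tilde\Theta_{\bar\mu;s_\circ}$, such a term will appear if and only if $s > s_\circ$. Thus defining $\tilde\Theta_{\bar\mu;s_\circ}$ according to the prescription (5.7), i.e. as
\[
\tilde\Theta_{\bar\mu;s_\circ}(z) := \overrightarrow{\prod_{s}}\Bigl(\overrightarrow{\prod_{i_s}}\exp(-\tfrac12 i\pi\,Y_{\alpha_{i_s};s_\circ})\,\exp(\tfrac12 i\pi\,X_{\alpha_{i_s};s,s_\circ})\Bigr) \tag{D.20}
\]
when $\bar\mu_s = \sum_{i_s}\bar\alpha^\vee_{i_s}$ for $s = 1, 2, \ldots, m-1$, Eq. (D.17) generalizes to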
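\[
\tilde\Theta_{\bar\mu;s_\circ}(z^{[0]}) = U^{2}_{s_\circ;\bar\mu}(z)\,\tilde\Theta_{\bar\mu;s_\circ}(z)\cdot\exp\Bigl[-\sum_{\substack{s=1\\ s>s_\circ}}^{m-1}(\bar\mu_{s_\circ}, \bar\mu_s)\,K\,\ln\frac{z_{s_\circ} - z_s}{z^{[0]}_{s_\circ} - z^{[0]}_s}\Bigr] \tag{D.21}
\]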
with
\[
U_{s_\circ;\bar\mu}(z) := \exp\Bigl(-\sum_{s=1}^{m-1}\tfrac12\,(\bar\mu_s, H)\otimes g_{s_\circ,s}(z)\Bigr). \tag{D.22}
\]
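Using the invariance of the exponent under exchange of $s$ with $s_\circ$, (D.21) may also be rewritten as
\[
\tilde\Theta_{\bar\mu;s_\circ}(z^{[0]}) = U^{2}_{s_\circ;\bar\mu}(z)\,\tilde\Theta_{\bar\mu;s_\circ}(z)\cdot\exp\Bigl[-\tfrac12\sum_{\substack{s=1\\ s\neq s_\circ}}^{m-1}(\bar\mu_{s_\circ}, \bar\mu_s)\,K\,\ln\frac{z_{s_\circ} - z_s}{z^{[0]}_{s_\circ} - z^{[0]}_s}\Bigr]. \tag{D.23}
\]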
We are now finally in a position to make contact with the relation (6.21) that corresponds to the compatibility with the Knizhnik–Zamolodchikov connection. First, comparison of our results with the definition (6.34)$^{14}$ (see also (6.22) and (6.26)) tells us that we may rewrite (D.23) in the form of (6.41), i.e.
\[
\tilde\Theta_{\bar\mu;s_\circ}(z) = A_{\bar\mu;s_\circ}(z; z^{[0]})\,\tilde\Theta_{\bar\mu;s_\circ}(z^{[0]}). \tag{D.24}
\]
Moreover, we already know that the implementation is consistent at the specific point $z^{[0]}$, i.e. we have $\tilde\Theta_{\bar\mu;s_\circ}(z^{[0]}) = \Theta_{\bar\mu;s_\circ}(z^{[0]})$. Together with the relation (6.21) between $\Theta_{\bar\mu;s_\circ}(z)$ and $\Theta_{\bar\mu;s_\circ}(z^{[0]})$ the result (D.24) therefore implies that in fact
\[
\tilde\Theta_{\bar\mu;s_\circ}(z) = A_{\bar\mu;s_\circ}(z; z^{[0]})\,\Theta_{\bar\mu;s_\circ}(z^{[0]}) = \Theta_{\bar\mu;s_\circ}(z) \tag{D.25}
\]
at every point $z$ in the moduli space $\mathcal{M}$ and hence that, as claimed, the implementation is consistent on all of $\mathcal{M}$.

Acknowledgement. We would like to thank A. N. Schellekens for helpful discussions, and K. Gawędzki for pointing out some errors in an earlier version of the paper and for helpful correspondence.

$^{14}$ The summation over $s$ in (6.34) now extends only up to $m-1$, owing to our decision to keep one puncture at $z_m = \infty$ when dealing with inner multi-shift automorphisms.
Communicated by G. Felder