Commun. Math. Phys. 196, 1 – 18 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
$\bar\partial$-Reductions of the Multidimensional Quadrilateral Lattice. The Multidimensional Circular Lattice
A. Doliwa 1,2, S. V. Manakov 3, P. M. Santini 1,4
1 Istituto Nazionale di Fisica Nucleare, Sezione di Roma, P-le Aldo Moro 2, I-00185 Roma, Italy. E-mail: [email protected]
2 Instytut Fizyki Teoretycznej, Uniwersytet Warszawski, ul. Hoża 69, 00-681 Warszawa, Poland. E-mail: [email protected]
3 L. D. Landau Institute for Theoretical Physics, Kosygina 2, GSP1, Moscow 117970, Russia
4 Dipartimento di Fisica, Università di Catania, Corso Italia 57, I-95129 Catania, Italy. E-mail: [email protected]; [email protected]
Received: 8 June 1997 / Accepted: 11 December 1997
Abstract: We apply a recently introduced [21, 15] reduction method, based on the $\bar\partial$ dressing, to construct a large class of integrable reductions of the equations characterizing the multidimensional quadrilateral lattice (an $N$-dimensional lattice in $\mathbb{R}^M$, $N \le M$, whose elementary quadrilaterals are planar and whose continuous limit describes submanifolds parametrized by conjugate lines [11]). We also show that, generically, in the limit of small lattice parameter, half of these reductions lead to the Darboux equations for symmetric fields and the other half lead to the generalized Lamé equations describing $N$-dimensional submanifolds of $\mathbb{E}^M$ parametrized by conjugate orthogonal systems of coordinates. We finally show that a distinguished example of the second class of reductions corresponds to the multidimensional circular lattice (an $N$-dimensional lattice in $\mathbb{E}^M$, $N \le M$, whose elementary quadrilaterals are inscribed in circles [7]).

1. Introduction

During the last few years, several interesting results in the fields of geometry and integrable systems have been obtained. Using the $\bar\partial$-dressing method [22], Zakharov and Manakov solved, about ten years ago, the Darboux equations

$$\partial_i \beta_{jk} = \beta_{ji}\beta_{ik}, \qquad i, j, k = 1, \dots, N, \quad i \ne j \ne k \ne i, \tag{1}$$

which characterize $N$-dimensional submanifolds of $\mathbb{R}^M$, $N \le M$, parametrized by conjugate coordinate systems [8], and are the compatibility conditions of the following linear system

$$\partial_i X_j = \beta_{ji} X_i, \qquad i, j = 1, \dots, N, \quad i \ne j, \tag{2}$$

involving suitable $M$-dimensional vectors $X_i$, tangent to the coordinate lines. A distinguished reduction of the multiconjugate systems on submanifolds is given by the systems of orthogonal coordinates in the $N$-dimensional Euclidean space $\mathbb{E}^N$ [14, 8], characterized by the Lamé equations
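$$\partial_i \beta_{jk} = \beta_{ji}\beta_{ik}, \qquad i, j, k = 1, \dots, N, \quad i \ne j \ne k \ne i, \tag{3}$$

$$\partial_i \beta_{ij} + \partial_j \beta_{ji} + \sum_{l=1,\; l \ne i,j}^{N} \beta_{li}\beta_{lj} = 0, \qquad i, j = 1, \dots, N, \quad i \ne j, \tag{4}$$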
which are a consequence of the additional assumption that the vectors $X_i$ form an orthonormal basis in the ambient space $\mathbb{E}^N$,

$$X_i \cdot X_j = \delta_{ij}, \qquad i, j = 1, \dots, N. \tag{5}$$
Last year Zakharov solved the Lamé Eqs. (3)–(4) by imposing a suitable linear constraint on the spectral data of the Marchenko integral equation associated with the Darboux equations (1) [21]. Motivated by this result, Manakov and Zakharov extended the method of Zakharov, obtaining a general approach, based on the $\bar\partial$ method, to construct a large class of differential reductions of integrable nonlinear equations [15].

During the last few years considerable progress has also been made in the study of the connections between discrete geometry and integrability [3, 4, 10, 9]. More recently, the two basic notions of multidimensional conjugate and orthogonal systems have found their natural generalizations to a discrete context, leading to the introduction of two new integrable lattices: the multidimensional quadrilateral lattice (MQL) [11] and the multidimensional circular lattice (MCL) [2, 7]. More precisely, based on a result by Sauer, who introduced the proper discrete analogue of a conjugate net on a surface [19], Doliwa and Santini introduced the notion of “Multidimensional Quadrilateral Lattice” (MQL), i.e. a lattice $x : \mathbb{Z}^N \to \mathbb{R}^M$, $N \le M$, with all its elementary quadrilaterals planar, which is the discrete analogue of a multidimensional conjugate net [11]. In a convenient parametrization (see also Sect. 4 and [11]), the planarity of the elementary quadrilaterals is expressed by the linear system

$$\Delta_i X_j = (T_i Q_{ji}) X_i, \qquad i, j = 1, \dots, N, \quad i \ne j, \tag{6}$$
for suitable vector functions $X_i : \mathbb{Z}^N \to \mathbb{R}^M$, where the real functions $Q_{ij} : \mathbb{Z}^N \to \mathbb{R}$ satisfy the compatibility conditions

$$\Delta_i Q_{jk} = (T_i Q_{ji}) Q_{ik}, \qquad i, j, k = 1, \dots, N, \quad i \ne j \ne k \ne i, \tag{7}$$
which provide a useful characterization of the MQL [11]. In Eqs. (6) and (7), $T_i$ is the translation operator with respect to the discrete variable $n_i \in \mathbb{Z}$:

$$T_i f(n_1, \dots, n_i, \dots, n_N) = f(n_1, \dots, n_i + 1, \dots, n_N), \qquad i = 1, \dots, N, \tag{8}$$
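and $\Delta_i := T_i - 1$ is the corresponding discrete partial derivative. In the limit of small lattice parameter $\varepsilon$, Eqs. (6) and (7) reduce, respectively, to the linear system (2) and to the Darboux Eqs. (1) in a straightforward way, with:

$$T_i = 1 + \varepsilon \partial_i + \frac{\varepsilon^2}{2} \partial_i^2 + \dots, \tag{9}$$

$$Q_{ij} = \varepsilon \beta_{ij} + \varepsilon^2 \beta^{(2)}_{ij} + \dots, \tag{10}$$

$$X_i = X_i + \dots, \tag{11}$$

where, in (11), the left-hand side is the lattice vector of (6) and the right-hand side is its continuous counterpart appearing in (2).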
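The operators $T_i$ and $\Delta_i$ and the expansion (9) can be illustrated with a minimal Python sketch. The names `shift`, `delta` and the sample function `f` are illustrative assumptions, not notation from the paper. Lattice functions are modelled as callables on integer tuples, and the lattice point $n$ is mapped to $x = \varepsilon n$; the sample is quadratic in $x_0$, so the expansion (9) truncated at second order is exact for it.

```python
from fractions import Fraction

def shift(f, i):
    """Translation operator T_i: (T_i f)(n) = f(n with n_i replaced by n_i + 1)."""
    def shifted(n):
        m = list(n)
        m[i] += 1
        return f(tuple(m))
    return shifted

def delta(f, i):
    """Discrete partial derivative Delta_i = T_i - 1."""
    t = shift(f, i)
    return lambda n: t(n) - f(n)

# Hypothetical sample: g(x) = x_0^2 * x_1 restricted to the grid x = eps * n.
eps = Fraction(1, 10)

def f(n):
    x0, x1 = eps * n[0], eps * n[1]
    return x0 ** 2 * x1

def expansion_rhs(n):
    """(1 + eps d_0 + eps^2/2 d_0^2) g at x = eps * n; exact here since g is quadratic in x_0."""
    x0, x1 = eps * n[0], eps * n[1]
    g, dg, d2g = x0 ** 2 * x1, 2 * x0 * x1, 2 * x1
    return g + eps * dg + eps ** 2 / 2 * d2g
```

Exact rational arithmetic is used so that both sides of (9) can be compared without rounding; for a non-polynomial sample the two sides would agree only up to $O(\varepsilon^3)$.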
Equations (7) were already known in the literature, having been recently obtained by Bogdanov and Konopelchenko [6], via the $\bar\partial$ method, as a natural integrable discrete analogue of the Darboux Eqs. (1); but their geometric meaning was unknown. Soon after the work [11], Mañas, Doliwa and Santini constructed a vectorial Darboux transformation for Eqs. (7), which allows one to construct an MQL from a given one [16].

The orthogonality constraint has also been successfully discretized. This discretization consists in requiring that the elementary quadrilaterals of the MQL be inscribed in circles. This notion was first proposed in [17, 18] for $N = 2$, $M = 3$, as a discrete analogue of surfaces parametrized by curvature lines (see also [4]); later by Bobenko for $N = M = 3$ [2] and, finally, for arbitrary $N \le M$, by Cieśliński, Doliwa and Santini [7]. These lattices are now called “Multidimensional Circular Lattices” (MCL) or discrete orthogonal lattices. In [7] it was also shown, in a purely geometric way, that the circularity constraint, imposed on the initial two-dimensional discrete surfaces of the MQL, is preserved during the construction of the lattice, generating an MCL. Therefore the $N$-dimensional circular lattice in $\mathbb{E}^M$, $N \le M$, is an integrable, geometrically distinguished reduction of the $N$-dimensional quadrilateral lattice. After the submission of the first version of this paper, we were informed of a very recent result of Konopelchenko and Schief [13] containing a convenient parametrization of the circular lattices in $\mathbb{E}^3$ and the associated discrete analogues of the Ribaucour transformation.

In this paper we extend the $\bar\partial$ reduction approach introduced in [21, 15] to a discrete context, obtaining the general theory of the integrable reductions of the MQL Eqs. (7). We also show that, generically, in the limit of small lattice parameter, half of these reductions lead to the Darboux Eqs. (1) for symmetric fields, $\beta_{ij} = \beta_{ji}$, while the other half lead to the “generalized” Lamé equations

$$\partial_i \beta_{jk} = \beta_{ji}\beta_{ik}, \qquad i, k = 1, \dots, N, \quad j = 1, \dots, M, \quad i \ne j \ne k \ne i, \tag{12}$$

$$\partial_i \beta_{ij} + \partial_j \beta_{ji} + \sum_{l=1,\; l \ne i,j}^{M} \beta_{li}\beta_{lj} = 0, \qquad i, j = 1, \dots, N, \quad i \ne j, \tag{13}$$
describing $N$-dimensional submanifolds of $\mathbb{E}^M$ parametrized by conjugate orthogonal systems of coordinates. Finally, as an application of the general reduction theory presented in the paper, we solve the most significant reduction of the MQL, i.e. the MCL, which is completely characterized by Eqs. (6), expressing the planarity of the elementary quadrilaterals, and by the following “circularity constraint”

$$X_i \cdot T_i X_j + X_j \cdot T_j X_i = 0, \qquad i, j = 1, \dots, N, \quad i \ne j, \tag{14}$$
which, in the continuous limit, goes directly to the orthogonality condition (5).

The general reduction theory presented in this paper may be used to classify and solve the geometrically meaningful reductions of the MQLs. The systematic study of these reductions is in progress and will be the content of a forthcoming paper. Here we only anticipate that our reduction theory allows one to solve other distinguished lattices, including: (i) the discrete analogue of the symmetric net [1]; (ii) the discrete analogue of the orthogonal net in a space of constant curvature [1]; (iii) the discrete analogue of the orthogonal coordinates of Egorov type [8].

The paper is organized in the following way. In Sect. 2 we investigate the $\bar\partial$ problem associated with the MQL Eqs. (7) and we construct the formalism corresponding to degenerate kernels and explicit solutions. In Sect. 3 we construct a large class of integrable
reductions of the MQL equations, together with the corresponding degenerate kernel formalism; we also discuss the limit of small lattice parameter. In Sect. 4 we show that the MCL corresponds to one of the reductions (the most distinguished one) obtained in Sect. 3, thus proving its analytic integrability.

2. The Nonlocal $\bar\partial$ Problem and the Multidimensional Quadrilateral Lattice

2.1. The $\bar\partial$ problem for the MQL. The matrix nonlocal $\bar\partial$ problem

$$\frac{\partial \psi(\lambda)}{\partial \bar\lambda} = \int_{\mathbb{C}} \psi(\mu) R(\mu, \lambda)\, d\mu \wedge d\bar\mu, \qquad \lambda, \mu \in \mathbb{C}, \tag{15}$$
plays a crucial role in the dressing method, being a very convenient tool to construct integrable multidimensional systems, together with large classes of solutions [22, 5, 20, 12]. In Eq. (15), $R(\mu, \lambda)$ is a given $M \times M$ matrix $\bar\partial$-datum, with $\mu, \lambda \in \mathbb{C}$. We remark that the dependence of the $M \times M$ matrices $\psi$ and $R$ on $\bar\lambda$ and $\bar\mu$, $\psi = \psi(\lambda, \bar\lambda)$, $R = R(\lambda, \bar\lambda, \mu, \bar\mu)$, will be omitted systematically throughout the paper, except when it is necessary to show it explicitly (as when one deals with the reality conditions).

Let us assume that the $\bar\partial$ problem (15) is uniquely solvable and let the $M \times M$ matrix $\psi(\lambda)$ be its (unique) solution satisfying the canonical normalization:

$$\psi(\lambda) \to I, \qquad \lambda \to \infty. \tag{16}$$
Furthermore, let us assume that $R$ depends on $N$, $N \le M$, discrete variables $n = (n_1, \dots, n_N)$, $n_i \in \mathbb{Z}$, $i = 1, \dots, N$, through the following equations:

$$T_i R(\mu, \lambda; n) = (I - P_i + \mu P_i)\, R(\mu, \lambda; n)\, (I - P_i + \lambda P_i)^{-1}, \qquad i = 1, \dots, N, \tag{17}$$
where $P_i$ is the usual $i$th projection matrix, $(P_i)_{jk} = \delta_{ij}\delta_{ik}$. Here and in the following, unless explicitly stated otherwise, the index $i$ runs from 1 to $N$, and the indices $j, k$ run from 1 to $M$. Equations (17) admit the solution:

$$R(\mu, \lambda; n) = \prod_{k=1}^{N} (I - P_k + \mu P_k)^{n_k}\; R_0(\mu, \lambda)\; \prod_{k=1}^{N} (I - P_k + \lambda P_k)^{-n_k}. \tag{18}$$
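That (18) solves (17) can be checked entrywise, since all the dressing factors are diagonal. The following is a minimal sketch for a fixed pair $(\mu, \lambda)$; the function names and the sample matrix used to exercise it are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def dressing_factor(z, n, a):
    """Entry a of the diagonal matrix prod_k (I - P_k + z P_k)^{n_k}.

    Lattice directions are a = 0..N-1 with N = len(n); the remaining
    M - N diagonal entries are untouched by any shift and equal 1.
    """
    return z ** n[a] if a < len(n) else Fraction(1)

def kernel(R0, mu, lam, n):
    """R(mu, lam; n) from Eq. (18) for fixed (mu, lam); R0 is an M x M nested list."""
    M = len(R0)
    return [[dressing_factor(mu, n, a) * R0[a][b] / dressing_factor(lam, n, b)
             for b in range(M)] for a in range(M)]

def evolved(R, mu, lam, i):
    """Right-hand side of Eq. (17): (I - P_i + mu P_i) R (I - P_i + lam P_i)^{-1}."""
    M = len(R)
    return [[(mu if a == i else 1) * R[a][b] / (lam if b == i else 1)
             for b in range(M)] for a in range(M)]
```

Comparing `kernel(R0, mu, lam, n_shifted)` with `evolved(kernel(R0, mu, lam, n), mu, lam, i)`, where `n_shifted` has $n_i$ increased by one, reproduces Eq. (17) for nonzero rational $\mu$, $\lambda$.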
The $\bar\partial$ dressing method consists in building a complete set of linear equations satisfied by the solution $\psi(\lambda)$ of (15). In our case, we have the following

Proposition 1. The solution $\psi(\lambda)$ of the $\bar\partial$-problem (15), satisfying the canonical normalization (16), solves the following linear discrete equations:

$$L_{ij}(\lambda)\psi(\lambda) = 0, \qquad i \ne j, \tag{19}$$

$$L_{ij}(\lambda)\psi(\lambda) := P_j\, T_i\psi(\lambda)\,(I - P_i + \lambda P_i) - (T_i Q_{ji})\,\psi(\lambda) - P_j\,\psi(\lambda), \tag{20}$$

where the matrix $Q$ is defined by the asymptotic expansion:

$$\psi = I + \lambda^{-1} Q + \lambda^{-2} Q^{(2)} + O(\lambda^{-3}), \qquad \lambda \to \infty, \tag{21}$$

and $Q_{ji}$ is the $(ji)$-projection of the matrix $Q$:

$$Q_{ji} := P_j Q P_i. \tag{22}$$
Proof. The proof is standard in the philosophy of the $\bar\partial$ dressing method. It is easy to show that $L_{ij}(\lambda)\psi(\lambda)$ solves the $\bar\partial$ problem (15) and that $L_{ij}(\lambda)\psi(\lambda) \to 0$ as $\lambda \to \infty$. The uniqueness of the solution of (15) implies the result.

It is convenient to exploit Eqs. (19) completely by multiplying them from the right by the projectors $P_k$, $k \ne i$, and $P_i$, obtaining, respectively:

$$\Delta_i \psi_{jk}(\lambda) = (T_i Q_{ji})\,\psi_{ik}(\lambda), \qquad j, k \ne i, \tag{23}$$

$$(\lambda T_i - 1)\,\psi_{ji}(\lambda) = (T_i Q_{ji})\,\psi_{ii}(\lambda), \qquad j \ne i, \tag{24}$$

where

$$\psi_{jk}(\lambda) := P_j\,\psi(\lambda)\,P_k. \tag{25}$$
Evaluating Eqs. (23), (24) at $\lambda = \infty$, one obtains the integrable nonlinear equations associated with the compatible linear system (19) and solved through the $\bar\partial$ problem (15). The $\lambda^{-1}$ terms of the expansions of (23) and (24) at $\lambda \to \infty$ give, respectively:

$$\Delta_i Q_{jk} = (T_i Q_{ji})\,Q_{ik}, \qquad j, k \ne i, \tag{26}$$

$$T_i Q^{(2)}_{ji} = Q_{ji} + (T_i Q_{ji})\,Q_{ii}, \qquad j \ne i, \tag{27}$$

while the $\lambda^{-2}$ term of Eq. (23) gives:

$$\Delta_i Q^{(2)}_{jk} = (T_i Q_{ji})\,Q^{(2)}_{ik}, \qquad j, k \ne i. \tag{28}$$
Equations (7), which characterize the $N$-dimensional quadrilateral lattice in $\mathbb{R}^M$, are obtained from (26) by choosing $j, k = 1, \dots, N$ and $j \ne k$, and are therefore solved by the nonlocal $\bar\partial$ problem (15).

It is also convenient to evaluate Eqs. (23)–(24) at two other values of $\lambda$: the distinguished point $\lambda = 1$ and the generic point $\lambda_0 \in \mathbb{C}$. At $\lambda = 1$, Eqs. (23)–(24) become

$$\Delta_i \psi_{jk}(1) = (T_i Q_{ji})\,\psi_{ik}(1), \qquad j \ne i, \tag{29}$$

and the comparison with Eqs. (6) leads to the identification of the $M$-dimensional vector $X_i$, $i = 1, \dots, N$, with the $i$th row of the matrix $\psi(1)$. The remaining $M - N$ rows of $\psi(1)$ can be identified with the additional elements $X_j$, $j = N + 1, \dots, M$, of the natural basis of $\mathbb{R}^M$ at the points of the MQL. Therefore Eqs. (29) can be rewritten as

$$\Delta_i X_j = (T_i Q_{ji})\,X_i, \qquad i \ne j, \tag{30}$$
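involving the full basis of $\mathbb{R}^M$, and are the extended version of the linear system (6). Finally, in the neighborhood of the generic point $\lambda_0 \in \mathbb{C}$:

$$\psi_{ji}(\lambda) = \psi_{ji}(\lambda_0) + (\lambda - \lambda_0)\,\psi^{(1)}_{ji}(\lambda_0) + O\big((\lambda - \lambda_0)^2\big), \tag{31}$$

we obtain:

$$\Delta_i \psi_{jk}(\lambda_0) = (T_i Q_{ji})\,\psi_{ik}(\lambda_0), \qquad j, k \ne i, \tag{32}$$

$$(\lambda_0 T_i - 1)\,\psi_{ji}(\lambda_0) = (T_i Q_{ji})\,\psi_{ii}(\lambda_0), \qquad j \ne i, \tag{33}$$

and

$$\Delta_i \psi^{(1)}_{jk}(\lambda_0) = (T_i Q_{ji})\,\psi^{(1)}_{ik}(\lambda_0), \qquad j, k \ne i, \tag{34}$$

$$(\lambda_0 T_i - 1)\,\psi^{(1)}_{ji}(\lambda_0) = -T_i \psi_{ji}(\lambda_0) + (T_i Q_{ji})\,\psi^{(1)}_{ii}(\lambda_0), \qquad j \ne i. \tag{35}$$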
We conclude this section by remarking that the $\bar\partial$ problem (15), (18) provides, in general, complex solutions of the MQL Eqs. (6), (7). It turns out that, in order to obtain real solutions, the $\bar\partial$-datum $R$ must satisfy the following reality constraint:

$$\overline{R(\mu, \bar\mu, \lambda, \bar\lambda)} = -R(\bar\mu, \mu, \bar\lambda, \lambda). \tag{36}$$

Proposition 2. If the $\bar\partial$-datum $R = R(\mu, \bar\mu, \lambda, \bar\lambda)$ satisfies the reality condition (36), then the solution $\psi = \psi(\lambda, \bar\lambda)$ of (15) and the MQL data $Q_{ij}$ possess the following reality properties:

$$\overline{\psi(\lambda, \bar\lambda)} = \psi(\bar\lambda, \lambda), \tag{37}$$

$$\overline{Q_{ij}} = Q_{ij}. \tag{38}$$
Proof. Taking the complex conjugate of (15) and using (36), it is possible to show that $\overline{\psi(\bar\lambda, \lambda)}$ satisfies the same $\bar\partial$-problem as $\psi(\lambda, \bar\lambda)$, with the same canonical normalization (16). Therefore uniqueness implies that these two functions coincide; Eq. (38) follows from (21) by comparing the $O(\lambda^{-1})$ terms of their large-$\lambda$ asymptotics.

2.2. Separable kernels and explicit solutions. It is well known that explicit solutions of the $\bar\partial$-problem (15) are obtained by choosing a separable kernel $R(\mu, \lambda)$ [22, 5, 20, 12]:

$$R(\mu, \lambda) = \frac{i}{2} \sum_{k=1}^{K} u_k(\mu)\, v_k(\lambda). \tag{39}$$
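With this choice, the linear integral equation

$$\psi(\lambda) = I + \frac{1}{2\pi i} \int_{\mathbb{C}} \frac{d\lambda' \wedge d\bar\lambda'}{\lambda' - \lambda} \int_{\mathbb{C}} \psi(\mu) R(\mu, \lambda')\, d\mu \wedge d\bar\mu, \tag{40}$$

which is satisfied by the solution of (15), (16), reduces to the following linear algebraic system:

$$\xi_j + \sum_{k=1}^{K} \xi_k \langle v_k\, \partial_{\bar\lambda}^{-1} u_j \rangle = \langle u_j \rangle, \qquad j = 1, \dots, K, \tag{41}$$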
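for the unknown matrices

$$\xi_j := \langle \psi u_j \rangle, \tag{42}$$

where

$$\langle f \rangle := \frac{i}{2} \int_{\mathbb{C}} d\lambda \wedge d\bar\lambda\, f(\lambda), \tag{43}$$

and

$$(\partial_{\bar\lambda}^{-1} f)(\lambda) := \frac{1}{2\pi i} \int_{\mathbb{C}} \frac{d\lambda' \wedge d\bar\lambda'}{\lambda' - \lambda}\, f(\lambda') \tag{44}$$

is the inverse of the $\partial_{\bar\lambda}$ operator. $\psi(\lambda)$ and $Q$ are finally expressed in terms of these solutions in the following way:

$$\psi(\lambda) = I + \sum_{k=1}^{K} \xi_k\, (\partial_{\bar\lambda}^{-1} v_k)(\lambda), \tag{45}$$

$$Q = \frac{1}{\pi} \sum_{k=1}^{K} \xi_k \langle v_k \rangle. \tag{46}$$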
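In particular, if $K = 1$, then

$$\xi = \langle u \rangle \big( I + \langle v\, \partial_{\bar\lambda}^{-1} u \rangle \big)^{-1}, \tag{47}$$

$$\psi(\lambda) = I + \langle u \rangle \big( I + \langle v\, \partial_{\bar\lambda}^{-1} u \rangle \big)^{-1} (\partial_{\bar\lambda}^{-1} v)(\lambda), \tag{48}$$

$$Q = \frac{1}{\pi} \langle u \rangle \big( I + \langle v\, \partial_{\bar\lambda}^{-1} u \rangle \big)^{-1} \langle v \rangle. \tag{49}$$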
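As a hedged illustration of the $K = 1$ formulas, consider a rank-one delta-function datum: $u = g\,\delta(\mu - \mu_1)$ with $g$ a column vector and $v = w\,\delta(\lambda - \lambda_1)$ with $w$ a row vector, both carrying the lattice dependence dictated by (18). Working out (49) under these assumptions (our own computation, not a formula quoted from the paper, with the factors of $1/\pi$ absorbed into the amplitudes) gives $Q_{jk}(n) = p_j q_k\, \mu_1^{n_j} \lambda_1^{-n_k} / \tau(n)$ with $\tau(n) = 1 + (\lambda_1 - \mu_1)^{-1} \sum_l p_l q_l (\mu_1/\lambda_1)^{n_l}$. The sketch below evaluates this expression and the residual of the MQL equations (26); all names and sample values are hypothetical.

```python
from fractions import Fraction

def rotation_coefficients(p, q, mu1, lam1, n):
    """Q_jk(n) = p_j q_k mu1^{n_j} lam1^{-n_k} / tau(n) for a rank-one delta datum.

    p, q: length-M sequences (amplitudes of u and v, factors of 1/pi absorbed);
    n: integer tuple of length N <= M; indices >= N carry no lattice dependence.
    """
    M, N = len(p), len(n)
    e = [n[a] if a < N else 0 for a in range(M)]      # exponents n_a (0 beyond N)
    tau = 1 + sum(p[l] * q[l] * (mu1 / lam1) ** e[l] for l in range(M)) / (lam1 - mu1)
    return [[p[j] * q[k] * mu1 ** e[j] / lam1 ** e[k] / tau for k in range(M)]
            for j in range(M)]

def mql_residual(p, q, mu1, lam1, n, i, j, k):
    """Delta_i Q_jk - (T_i Q_ji) Q_ik at the point n; expected to vanish for j, k != i."""
    n_shift = tuple(n[a] + (1 if a == i else 0) for a in range(len(n)))
    Q = rotation_coefficients(p, q, mu1, lam1, n)
    TQ = rotation_coefficients(p, q, mu1, lam1, n_shift)
    return TQ[j][k] - Q[j][k] - TQ[j][i] * Q[i][k]
```

With rational parameters the residual can be evaluated exactly; the parameters must be chosen so that $\tau(n) \ne 0$ at the lattice points of interest, otherwise the solution is singular there.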
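An interesting class of explicit solutions of the MQL equations (26), (30) corresponds to the choice

$$u_k(\mu, \bar\mu) = u^0_k\, \delta(\mu - \mu_k), \qquad v_k(\lambda, \bar\lambda) = v^0_k\, \delta(\lambda - \lambda_k), \tag{50}$$

where $\delta(\lambda - \lambda_k)$ is the Dirac delta-function and $u^0_k$, $v^0_k$ are arbitrary constant matrices. Real solutions of the MQL Eqs. (26), (30) are obtained by imposing the reality constraint (36); this implies that some of the terms of the sum (39) will satisfy the condition

$$\overline{u(\lambda, \bar\lambda)} = u(\bar\lambda, \lambda), \qquad \overline{v(\lambda, \bar\lambda)} = v(\bar\lambda, \lambda), \tag{51}$$

while others will be coupled in pairs in the following way:

$$f(\mu, \bar\mu)\, g(\lambda, \bar\lambda) + \overline{f(\bar\mu, \mu)}\;\overline{g(\bar\lambda, \lambda)} \tag{52}$$

for arbitrary matrix functions $f$ and $g$; i.e.:

$$R(\mu, \bar\mu, \lambda, \bar\lambda) = \frac{i}{2} \left\{ \sum_{k=1}^{K_s} u_k(\mu, \bar\mu)\, v_k(\lambda, \bar\lambda) + \sum_{k=1}^{K_p} \Big[ f_k(\mu, \bar\mu)\, g_k(\lambda, \bar\lambda) + \overline{f_k(\bar\mu, \mu)}\;\overline{g_k(\bar\lambda, \lambda)} \Big] \right\}, \tag{53}$$

where

$$\overline{u_k(\lambda, \bar\lambda)} = u_k(\bar\lambda, \lambda), \qquad \overline{v_k(\lambda, \bar\lambda)} = v_k(\bar\lambda, \lambda), \tag{54}$$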
$f_k$, $g_k$ are arbitrary functions and $K = K_s + 2K_p$. A systematic investigation of the explicit solutions of the MQL equations and the study of the geometric properties of the corresponding lattices is postponed to a subsequent paper.

3. Reductions of the MQL Equations

3.1. Reductions of the MQL. In this section, motivated by the results of Zakharov and Manakov [21, 15], we seek integrable reductions of the MQL Eqs. (7) by imposing a linear constraint of the following type:

$$R^T(\mu^{-1}, \lambda^{-1}) = |\mu|^4\, \bar\lambda^2\, (F(\lambda))^{-1} R(\lambda, \mu)\, F(\mu) \tag{55}$$

on the $\bar\partial$ datum $R(\mu, \lambda)$, where the superscript $T$ indicates matrix transposition. It is easy to convince oneself that the constraint (55) is acceptable if: i) it is consistent with the dependence (18) of $R(\mu, \lambda)$ on the lattice variables and ii) it is an involution.
The first condition is satisfied iff $F(\lambda)$ is a diagonal matrix, while the second condition implies the following equation:

$$\lambda F(\lambda) = \pm \lambda^{-1} F(\lambda^{-1}), \tag{56}$$

which is conveniently parametrized in the following way:

$$\lambda F(\lambda) = A(\lambda) \pm A(\lambda^{-1}) \tag{57}$$

in terms of an arbitrary diagonal matrix function $A(\lambda)$. Therefore we are led to the following

Proposition 3. The linear constraint (55) on the $\bar\partial$ datum $R(\mu, \lambda)$, where $F(\lambda)$ is given by (57), gives rise to integrable reductions of the MQL equations for any choice of the diagonal matrix function $A(\lambda)$.

It turns out that the constraint (55), (57) on $R(\mu, \lambda)$ implies a nonlocal quadratic constraint on $\psi(\lambda)$.

Proposition 4. The reduction (55) of the matrix $\bar\partial$ datum $R$ implies the following nonlocal quadratic constraint on $\psi(\lambda)$:

$$\oint_{C_\infty} \psi(\lambda) F(\lambda) \psi^T(\lambda^{-1})\, d\lambda + \int_{\mathbb{C}} \psi(\lambda) \big( \partial_{\bar\lambda} F(\lambda) \big) \psi^T(\lambda^{-1})\, d\lambda \wedge d\bar\lambda = 0, \tag{58}$$
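where $C_\infty$ is the circle with center at the origin and arbitrarily large radius, and the corresponding integration is counter-clockwise.

Proof. To prove this result, first apply the transformation $\lambda \to \lambda^{-1}$, $\mu \to \mu^{-1}$ to the $\bar\partial$ problem (15), then transpose the resulting equation and finally make use of the constraint (55), obtaining:

$$F(\lambda)\, \partial_{\bar\lambda} \big( \psi^T(\lambda^{-1}) \big) = -\int_{\mathbb{C}} R(\lambda, \mu) F(\mu) \psi^T(\mu^{-1})\, d\mu \wedge d\bar\mu, \qquad \lambda, \mu \in \mathbb{C}. \tag{59}$$

Multiplying Eq. (59) from the left by $\psi(\lambda)$, multiplying Eq. (15) from the right by $F(\lambda)\psi^T(\lambda^{-1})$, adding up the resulting equations and integrating over $\mathbb{C}$ with respect to $d\lambda \wedge d\bar\lambda$, the right-hand sides cancel, the left-hand sides combine, and one obtains:

$$\int_{\mathbb{C}} \partial_{\bar\lambda} \big( \psi(\lambda) F(\lambda) \psi^T(\lambda^{-1}) \big)\, d\lambda \wedge d\bar\lambda - \int_{\mathbb{C}} \psi(\lambda) \big( \partial_{\bar\lambda} F(\lambda) \big) \psi^T(\lambda^{-1})\, d\lambda \wedge d\bar\lambda = 0, \tag{60}$$

which is equivalent to Eq. (58), using the well-known Gauss–Green formula [12].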
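In order to use the nonlocal reduction (58) effectively, it is necessary to choose the arbitrary diagonal matrix function $A(\lambda)$ in a suitable functional class.

Lemma 1. Let $A(\lambda)$ be proportional to the identity, $A(\lambda) = a(\lambda) I$, and let $a(\lambda)$ be a rational function of $\lambda$. Then the nonlocal quadratic constraint (58) takes the following “multilocal” form:

$$\mathrm{Res}\big( f(\lambda)\psi(\lambda)\psi^T(\lambda^{-1}),\, \infty \big) + \sum_l \frac{f_l}{(\nu_l)!}\, \frac{\partial^{\nu_l}}{\partial \lambda^{\nu_l}} \Big( \psi(\lambda)\psi^T(\lambda^{-1}) \Big) \Big|_{\lambda = \lambda_l} = 0, \tag{61}$$

where the $\lambda_l$, $l = 1, 2, \dots$, are the poles of the rational function

$$f(\lambda) := \frac{1}{\lambda} \big( a(\lambda) \pm a(\lambda^{-1}) \big) \tag{62}$$

and $(\nu_l + 1)$ are the corresponding multiplicities: $f(\lambda) \simeq f_l (\lambda - \lambda_l)^{-(\nu_l + 1)}$, $\lambda \simeq \lambda_l$.

Proof. The first term in (58) is $-2\pi i$ times the residue at $\infty$; to obtain the second term of (61) we make use of the well-known formulas:

$$\frac{\partial}{\partial \bar\lambda} \frac{1}{(\lambda - \lambda_l)^{\nu_l + 1}} = \frac{(-1)^{\nu_l} \pi}{(\nu_l)!}\, \frac{\partial^{\nu_l}}{\partial \lambda^{\nu_l}}\, \delta(\lambda - \lambda_l), \qquad \nu_l = 0, 1, \dots, \tag{63}$$

$$\int_{\mathbb{C}} \delta(\lambda - \lambda_l)\, g(\lambda)\, d\lambda \wedge d\bar\lambda = -2i\, g(\lambda_l). \tag{64}$$

The second term of (58) then also carries the overall factor $-2\pi i$, which drops out of (61).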
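We now present a few explicit examples of reductions.

Example 1. Let

$$a(\lambda) = \lambda \quad \Rightarrow \quad f_\pm(\lambda) = 1 \pm \lambda^{-2}. \tag{65}$$

Then the constraint (61) reads:

$$\big( \psi^{(1)}(0) \big)^T + Q\, \psi^T(0) = \pm \big( \psi^{(1)}(0) + \psi(0)\, Q^T \big). \tag{66}$$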
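Example 2. Let

$$a(\lambda) = \frac{\lambda_1 \rho_1}{\lambda - \lambda_1} \quad \Rightarrow \quad f_\pm(\lambda) = \rho_1 \left( \frac{1}{\lambda - \lambda_1} \mp \frac{1}{\lambda - 1/\lambda_1} - \frac{1}{\lambda} \right). \tag{67}$$

Then the constraint (61) reads:

$$\psi(\lambda_1)\psi^T(\lambda_1^{-1}) \mp \psi(\lambda_1^{-1})\psi^T(\lambda_1) = \psi(0) \mp \psi^T(0), \qquad \lambda_1 \ne 0. \tag{68}$$

If $\lambda_1 = 1$ the constraint (68) takes the following simple forms:

$$\psi(0) = \psi^T(0), \qquad \text{for the upper sign}, \tag{69}$$

$$\psi(1)\psi^T(1) = \frac{1}{2} \big( \psi(0) + \psi^T(0) \big), \qquad \text{for the lower sign}. \tag{70}$$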
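The partial-fraction form of $f_\pm$ in Example 2 can be checked against the definition (62) by exact evaluation at a few rational points. This is only a sketch; the function names and the encoding of the upper and lower signs as `sign = +1` and `sign = -1` are our own conventions.

```python
from fractions import Fraction

def a(lam, lam1, rho1):
    """Single simple pole of Example 2: a(lambda) = lambda_1 rho_1 / (lambda - lambda_1)."""
    return lam1 * rho1 / (lam - lam1)

def f_from_definition(lam, lam1, rho1, sign):
    """f(lambda) = (a(lambda) + sign * a(1/lambda)) / lambda, Eq. (62); sign = +1 or -1."""
    return (a(lam, lam1, rho1) + sign * a(1 / lam, lam1, rho1)) / lam

def f_partial_fractions(lam, lam1, rho1, sign):
    """Right-hand side of Eq. (67); the upper sign of the 'minus-plus' belongs to sign = +1."""
    return rho1 * (1 / (lam - lam1) - sign / (lam - 1 / lam1) - 1 / lam)
```

The three simple poles at $\lambda_1$, $1/\lambda_1$ and $0$, with residues $\rho_1$, $\mp\rho_1$ and $-\rho_1$, are exactly the points at which $\psi$ is evaluated in the constraint (68).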
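Example 3. In the case of many simple poles:

$$a(\lambda) = \sum_{l=1}^{n} \frac{\lambda_l \rho_l}{\lambda - \lambda_l}, \qquad \lambda_l \ne 0, \tag{71}$$

the corresponding constraints (61) read:

$$\sum_{l=1}^{n} \rho_l \Big[ \psi(\lambda_l)\psi^T(\lambda_l^{-1}) \mp \psi(\lambda_l^{-1})\psi^T(\lambda_l) \Big] = \left( \sum_{l=1}^{n} \rho_l \right) \big( \psi(0) \mp \psi^T(0) \big), \qquad \lambda_l \ne 0. \tag{72}$$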
All the above constraints express a quadratic relation for the matrix eigenfunction (and its derivatives) evaluated at different points of the complex $\lambda$ plane. The “extended” MQL Eqs. (26), together with the constraint (61) and Eqs. (32)–(35), describing the connection between the value of $\psi$ at $\lambda_0$ and the fields $Q_{jk}$, provide a closed system of integrable reductions of the MQL Eqs. (7). Each of them corresponds to a particular example of constrained quadrilateral lattice. In Sect. 4 we will show that the distinguished reduction (70) provides a convenient parametrization of the MCL. In Sect. 3.3 we investigate instead the small lattice parameter limit of the above reductions.

We end this section by observing that the reality condition (36) implies that the diagonal matrix $F(\lambda, \bar\lambda)$, appearing in the reduction formulas (55), must satisfy the constraint

$$\overline{F(\lambda, \bar\lambda)} = F(\bar\lambda, \lambda). \tag{73}$$
3.2. Separable kernel formalism of the MQL reductions. In order to obtain explicit solutions of the MQL reductions introduced in Sect. 3.1, we have to study the implications of the constraints (55), (57) and (36) on the degenerate kernel (39). This is the content of this section.

It is easy to check that the constraint (55), (57)$_\pm$ can be satisfied in two ways: through a single-term mechanism, in which the term $u_k(\mu) v_k(\lambda)$ takes the form

$$\frac{1}{\bar\lambda \bar\mu}\, F(\mu)\, h^T\!\Big( \frac{1}{\mu} \Big) B\, h(\lambda), \tag{74}$$

where $h(\mu)$ is an arbitrary matrix function of $\mu$ (and $\bar\mu$) and $B$ is a constant matrix satisfying

$$B^T = \pm B, \tag{75}$$

and through a pairing mechanism, in which two terms $u_k(\mu) v_k(\lambda) + u_l(\mu) v_l(\lambda)$ are coupled in the following way:

$$\frac{1}{\bar\lambda \bar\mu}\, F(\mu) \left[ f^T\!\Big( \frac{1}{\mu} \Big) C\, g(\lambda) \pm g^T\!\Big( \frac{1}{\mu} \Big) C^T f(\lambda) \right], \tag{76}$$

where $f(\lambda)$, $g(\lambda)$ are arbitrary matrix functions of $\lambda$ (and $\bar\lambda$) and $C$ is an arbitrary constant matrix. We have therefore the following general result.

Proposition 5. A degenerate kernel (39) satisfying the reduction (55), (57)$_\pm$ consists of single terms of the type (74) and of pairs of terms of the type (76):

$$R(\mu, \lambda) = \frac{i}{2 \bar\lambda \bar\mu}\, F(\mu) \left\{ \sum_{k}^{K'_s} h_k^T\!\Big( \frac{1}{\mu} \Big) B_k h_k(\lambda) + \sum_{k}^{K'_p} \left[ f_k^T\!\Big( \frac{1}{\mu} \Big) C_k g_k(\lambda) \pm g_k^T\!\Big( \frac{1}{\mu} \Big) C_k^T f_k(\lambda) \right] \right\}, \tag{77}$$

where $h_k$, $g_k$, $f_k$ are (arbitrary) matrix functions of $\lambda$ and $\bar\lambda$, and $B_k$ and $C_k$ are constant matrices, with

$$B_k^T = \pm B_k. \tag{78}$$
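The single-term mechanism (74), (75) can be tested numerically against the constraint (55). The sketch below assumes the simplest scalar choice $A(\lambda) = \lambda I$ in (57), so that $F(\lambda) = 1 \pm \lambda^{-2}$, and uses a hypothetical $2 \times 2$ matrix function `h`; none of these specific choices come from the paper.

```python
def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def F(lam, sign):
    """Scalar F from Eq. (57) with A(lambda) = lambda I: F = 1 + sign / lambda^2."""
    return 1 + sign / lam ** 2

def h(lam):
    """Hypothetical 2 x 2 matrix function h(lambda), chosen only for illustration."""
    return [[1 + lam, 2 * lam ** 2], [lam - 3, 1 / lam]]

def kernel(mu, lam, B, sign):
    """Single-term kernel (74): F(mu) h^T(1/mu) B h(lam) / (conj(lam) conj(mu))."""
    core = matmul(matmul(transpose(h(1 / mu)), B), h(lam))
    return scale(F(mu, sign) / (lam.conjugate() * mu.conjugate()), core)

def constraint_gap(mu, lam, B, sign):
    """Largest entry of |R^T(1/mu, 1/lam) - |mu|^4 conj(lam)^2 F(lam)^{-1} R(lam, mu) F(mu)|."""
    lhs = transpose(kernel(1 / mu, 1 / lam, B, sign))
    rhs = scale(abs(mu) ** 4 * lam.conjugate() ** 2 * F(mu, sign) / F(lam, sign),
                kernel(lam, mu, B, sign))
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(2) for c in range(2))
```

For complex sample points the gap vanishes to rounding accuracy when $B$ is symmetric with the upper sign or antisymmetric with the lower sign, and is of order one when the symmetry of $B$ does not match the sign, in line with (75).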
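The additional reality constraint (36) can be achieved by imposing mechanisms analogous to those described in Sect. 2.2. Therefore there are several ways to combine the two reductions. Here we consider two illustrative examples.

i) If $K = 1$, then the two reductions (36) and (55), (57)$_\pm$ are satisfied iff

$$R(\mu, \bar\mu, \lambda, \bar\lambda) = \frac{i}{2 \bar\lambda \bar\mu}\, F(\mu, \bar\mu)\, h^T\!\Big( \frac{1}{\mu}, \frac{1}{\bar\mu} \Big) B\, h(\lambda, \bar\lambda), \tag{79}$$

where

$$B^T = \pm B, \qquad \overline{B} = B, \tag{80}$$

$$\overline{h(\lambda, \bar\lambda)} = h(\bar\lambda, \lambda), \tag{81}$$

and $F(\lambda, \bar\lambda)$ satisfies Eq. (73).

ii) If $K = 2$, we have the following four possible combinations of the two reductions:

a) Single-term mechanisms in both reductions:

$$R(\mu, \bar\mu, \lambda, \bar\lambda) = \frac{i}{2 \bar\lambda \bar\mu}\, F(\mu, \bar\mu) \sum_{k=1}^{2} h_k^T\!\Big( \frac{1}{\mu}, \frac{1}{\bar\mu} \Big) B_k\, h_k(\lambda, \bar\lambda), \tag{82}$$

where $B_k$, $h_k$, $k = 1, 2$, satisfy the conditions (80), (81).

b) Mechanisms (74) and (52):

$$R(\mu, \bar\mu, \lambda, \bar\lambda) = \frac{i}{2 \bar\lambda \bar\mu}\, F(\mu) \left[ h^T\!\Big( \frac{1}{\mu}, \frac{1}{\bar\mu} \Big) B\, h(\lambda, \bar\lambda) + \overline{h^T\!\Big( \frac{1}{\bar\mu}, \frac{1}{\mu} \Big)}\; \overline{B}\; \overline{h(\bar\lambda, \lambda)} \right], \tag{83}$$

where $B^T = \pm B$ and $h$ is an arbitrary function of $\lambda$, $\bar\lambda$.

c) Mechanisms (76) and (51):

$$R(\mu, \bar\mu, \lambda, \bar\lambda) = \frac{i}{2 \bar\lambda \bar\mu}\, F(\mu) \left[ f^T\!\Big( \frac{1}{\mu}, \frac{1}{\bar\mu} \Big) C\, g(\lambda, \bar\lambda) \pm g^T\!\Big( \frac{1}{\mu}, \frac{1}{\bar\mu} \Big) C^T f(\lambda, \bar\lambda) \right], \tag{84}$$

where $\overline{C} = C$ and $f$, $g$ satisfy (81).

d) Pairing mechanisms in both reductions:

$$R(\mu, \bar\mu, \lambda, \bar\lambda) = \frac{i}{2 \bar\lambda \bar\mu}\, F(\mu) \left[ f^T\!\Big( \frac{1}{\mu}, \frac{1}{\bar\mu} \Big) C\; \overline{f(\bar\lambda, \lambda)} + \overline{f^T\!\Big( \frac{1}{\bar\mu}, \frac{1}{\mu} \Big)}\; \overline{C}\, f(\lambda, \bar\lambda) \right], \tag{85}$$

where $\overline{C} = \pm C^T$ and $f$ is an arbitrary function of $\lambda$, $\bar\lambda$.

A systematic investigation of the explicit solutions of the MQL reductions and the study of the geometric properties of the corresponding lattices is postponed to a subsequent paper.

3.3. The continuous limit. In the limit of small lattice parameter $\varepsilon$, the lattice vectors tend to their continuous counterparts, $X_j = X_j + O(\varepsilon)$, $j = 1, \dots, M$, and Eqs. (30) reduce to the “extended version”

$$\partial_i X_j = \beta_{ji} X_i, \qquad i \ne j, \tag{86}$$

of the linear system (2), involving the vectors of the complete basis of $\mathbb{R}^M$ at the points of the submanifold.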
In order to discuss the continuous limit of the reductions of the previous section, we first observe that the spectral parameter $\lambda$ is expanded in the following form:

$$\lambda = 1 + \varepsilon \tilde\lambda + O(\varepsilon^2). \tag{87}$$

If $a(\lambda) \simeq (\lambda - 1)^{-1} + O(1)$, then

$$f(\lambda) \simeq \begin{cases} O(1), \\ 2/(\varepsilon \tilde\lambda), \end{cases} \tag{88}$$

where the upper and lower cases refer to the upper and lower signs in (62), and, as it was shown in [15], the upper case of (88) gives the Darboux equations (3) in the symmetric reduction

$$\partial_i \beta_{jk} = \beta_{ji}\beta_{ik}, \qquad \beta_{ij} = \beta_{ji}, \qquad i \ne j \ne k \ne i, \quad i, j, k = 1, \dots, N. \tag{89}$$

The lower case leads to the orthonormality condition

$$X_j \cdot X_k = \delta_{jk}, \tag{90}$$

for the $M$-dimensional basis vectors $X_j$, $j = 1, \dots, M$, implying the quadratic constraint (13). If $M = N$, these equations coincide with Eqs. (4) and are therefore the second part of the Lamé equations, which describe the $N$-orthogonal coordinate systems in $\mathbb{E}^N$ [15]. If $N < M$, Eqs. (13) involve the additional fields $\beta_{li}$, $i = 1, \dots, N$, $l = N + 1, \dots, M$, and to obtain a closed system, we have to consider the “extended” Darboux Eqs. (12). Equations (12) and (13), which we call the “generalized Lamé equations”, characterize $N$-dimensional submanifolds of $\mathbb{E}^M$ parametrized by conjugate orthogonal systems of coordinates. We remark that the classical theorem of Dupin [8] states that, for $N = M$, the orthogonality of the coordinates implies their conjugacy.

If we assume that $a(1), a'(1) \ne 0$, then:

$$f(\lambda) \sim \begin{cases} 2a(1), \\ 2\varepsilon a'(1) \tilde\lambda. \end{cases} \tag{91}$$

As it was shown in [15], the upper case leads again to (89), while the lower case leads to the reduction

$$\partial_i \beta_{jk} = \beta_{ji}\beta_{ik}, \qquad i, j = 1, \dots, N, \quad k = 1, \dots, M, \quad i \ne j \ne k \ne i, \tag{92}$$

$$\partial_i \beta_{ji} + \partial_j \beta_{ij} + \sum_{l=1,\; l \ne i,j}^{M} \beta_{il}\beta_{jl} = 0, \qquad i \ne j, \quad i, j = 1, \dots, N, \tag{93}$$
equivalent to the generalized Lamé equations (12)–(13), after “transposing” the indices.

These simple qualitative considerations allow one to establish the following result: Generically, in the continuous limit, the reductions corresponding to the upper sign in (62) go to the Darboux equations for symmetric fields (89), while the reductions corresponding to the lower sign go to the Lamé Eqs. (12)–(13). The nongeneric situations correspond to the cases in which $a(1) = 0$ and/or $a'(1) = 0$.

It is of course possible (and instructive) to verify these findings on the explicit reductions derived in the previous section. In order to do that, one needs the following asymptotics of $\psi(\lambda_0)$ and $\psi^{(1)}(\lambda_0)$:

$$\psi_{jk}(\lambda_0) = \delta_{jk} + \varepsilon\, \frac{\beta_{jk}}{\lambda_0 - 1} + \varepsilon^2 \left( \frac{\beta^{(2)}_{jk}}{\lambda_0 - 1} + \frac{\sigma_{jk}}{(\lambda_0 - 1)^2} \right) + O(\varepsilon^3), \qquad \lambda_0 \ne 1, \tag{94}$$

$$\psi^{(1)}_{jk}(\lambda_0) = \frac{\lambda_0}{\lambda_0 - 1}\, \delta_{jk} + \varepsilon\, \frac{\beta_{jk}}{\lambda_0 - 1} + \varepsilon^2 \left( \frac{\beta^{(2)}_{jk}}{\lambda_0 - 1} + \frac{\lambda_0 - 2}{(\lambda_0 - 1)^2}\, \sigma_{jk} \right) + O(\varepsilon^3), \qquad \lambda_0 \ne 1, \tag{95}$$

where $\beta^{(2)}_{jk}$ is defined in (10) and

$$\sigma_{jk} := -\partial_k \beta_{jk} + \beta_{jk}\beta_{kk}, \qquad j \ne k, \tag{96}$$

$$\partial_i \sigma_{jj} = \beta_{ji}\sigma_{ij}, \qquad i \ne j. \tag{97}$$
The asymptotics (94)–(95) follow from Eqs. (31)–(35) in the limit of small lattice parameter. Substituting these asymptotics in the constraints (66), (68) and (72), for $\lambda_l \ne 1$, one verifies that, in the upper sign case, the first nontrivial term (at $O(\varepsilon)$) gives the reduction (89); while, in the case of lower sign, the first nontrivial term (at $O(\varepsilon^2)$) gives the reduction (92)–(93). A drastically different limiting procedure characterizes the case $\lambda_0 = 1$. In this case Eq. (70) reduces to the orthogonality condition (90), implying the generalized Lamé Eqs. (12)–(13).

4. The Multidimensional Circular Lattice

In this section we show that the MCL corresponds to the distinguished reduction (70) of Sect. 3. In this way we obtain a convenient characterization of this lattice and we prove its analytic integrability through the $\bar\partial$ method of Sects. 2 and 3. Since the multidimensional circular lattice is a geometrically distinguished reduction of the multidimensional quadrilateral lattice, we first recall the basic notions and results of the theory of the MQLs [11].

Definition 1. By an $N$-dimensional quadrilateral lattice we mean a mapping $x : \mathbb{Z}^N \to \mathbb{R}^M$, $N \le M$, such that all the elementary quadrilaterals with vertices $\{x, T_i x, T_j x, T_i T_j x\}$, $i, j = 1, \dots, N$, $i \ne j$, are planar.

A convenient parametrization of the MQL is given in terms of suitably rescaled tangent vectors $X_i$, $i = 1, \dots, N$,

$$X_i = \frac{1}{T_i H_i}\, \Delta_i x, \tag{98}$$

defined in such a way that the difference $\Delta_i X_j$, $i \ne j$, is proportional to $X_i$ only (see Fig. 1):

$$\Delta_i X_j = (T_i Q_{ji}) X_i, \qquad i, j = 1, \dots, N, \quad i \ne j. \tag{99}$$

Equation (99) expresses the planarity of the $(ij)$ elementary quadrilateral. In addition, the condition $T_i T_j x = T_j T_i x$ implies that the scaling fields $H_i$ appearing in Eq. (98) must satisfy the following equation

$$\Delta_i H_j = (T_i H_i) Q_{ij}, \qquad i, j = 1, \dots, N, \quad i \ne j, \tag{100}$$

adjoint to Eq. (99).
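[Figure 1. An elementary quadrilateral with vertices $x$, $T_i x$, $T_j x$, $T_i T_j x$ and the vectors $X_i$, $X_j$, $T_i X_j$, $T_j X_i$ and $(T_i Q_{ji}) X_i$.]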
To make this construction compatible in every direction of the lattice, the fields $Q_{ij}$ defined in (99) (or in (100)) must satisfy the following nonlinear difference system:

$$\Delta_i Q_{jk} = (T_i Q_{ji}) Q_{ik}, \qquad i, j, k = 1, \dots, N, \quad i \ne j \ne k \ne i, \tag{101}$$

which we refer to as the MQL equations. It turns out that the planarity constraint allows one to build the MQL uniquely, once a suitable set of initial data is prescribed [11].

Proposition 6. An $N$-dimensional quadrilateral lattice in $\mathbb{R}^M$ is uniquely determined by assigning $\frac{N(N-1)}{2}$ arbitrary intersecting quadrilateral surfaces (2-dimensional quadrilateral lattices).

Remark 1. Since the construction of the MQL is based on a set of (linear) planarity constraints, the MQL is, by construction, integrable.

Let us consider an $N$-dimensional quadrilateral lattice in the $M$-dimensional Euclidean space $\mathbb{E}^M$ (i.e. we have equipped the linear space $\mathbb{R}^M$ with the standard scalar product). If we impose the circularity constraint on the elementary quadrilaterals of the MQL we obtain an MCL [2, 7].

Definition 2. By an $N$-dimensional circular lattice in the $M$-dimensional ($N \le M$) Euclidean space we mean a mapping $x : \mathbb{Z}^N \to \mathbb{E}^M$, such that all the elementary quadrilaterals are inscribed in circles.

Remark 2. It turns out that in an $N$-dimensional circular lattice every $K$-dimensional elementary cell ($K \le N$) can be inscribed in a $(K - 1)$-dimensional sphere.

In [7] the following results were established.

Fact 1. In the limit of small lattice parameter the $N$-dimensional circular lattice in $\mathbb{E}^M$ reduces to an $N$-dimensional submanifold of $\mathbb{E}^M$ parametrized by conjugate orthogonal coordinate lines.

Proposition 7. If the initial $\frac{N(N-1)}{2}$ quadrilateral surfaces of Proposition 6 are made of quadrilaterals inscribed in circles, then the construction of the quadrilateral lattice preserves this circularity constraint, i.e. one builds an $N$-dimensional circular lattice in $\mathbb{E}^M$.
Remark 3. While Fact 1 states that the MCL is a discrete analogue of an N -dimensional submanifold parametrized by conjugate orthogonal lines, Proposition 7 establishes geometrically its integrability. We now recall the following well known characterization of the circularity constraint (see Fig. 2). Fact 2. i) A planar convex quadrilateral can be inscribed in a circle if and only if the sum of its opposite angles equals the flat angle π. ii) A planar skew quadrilateral can be inscribed in a circle if and only if its opposite angles are equal.
[Figure 2. Convex and skew quadrilaterals inscribed in a circle, with angles $\alpha$, $\beta$, $\pi - \alpha$, $\pi - \beta$.]
It is easy to check, using the vectors $X_i$, that both characterizations i) and ii) can be expressed by the single formula (102) below.

Lemma 2. The multidimensional circular lattice is characterized, among the quadrilateral lattices, by the following constraint:
$$\cos\angle(X_j, T_j X_i) + \cos\angle(X_i, T_i X_j) = 0. \tag{102}$$
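The constraint (102) can be checked numerically on a single quadrilateral. The following is a minimal sketch, assuming that $X_i$, $X_j$, $T_jX_i$, $T_iX_j$ point along the edges $x \to T_ix$, $x \to T_jx$, $T_jx \to T_iT_jx$, $T_ix \to T_iT_jx$ (only the directions matter for the cosines); the helper names are hypothetical.

```python
import math

def sub(p, q):
    """Vector from q to p."""
    return tuple(a - b for a, b in zip(p, q))

def cos_angle(u, v):
    """Cosine of the angle between two plane vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def circularity_defect(x, xi, xj, xij):
    """Left-hand side of (102) for the quadrilateral (x, T_i x, T_j x, T_i T_j x).

    Assumption: the vectors X_i, X_j, T_j X_i, T_i X_j are taken along the edges.
    """
    X_i, X_j = sub(xi, x), sub(xj, x)
    TjX_i, TiX_j = sub(xij, xj), sub(xij, xi)
    return cos_angle(X_j, TjX_i) + cos_angle(X_i, TiX_j)

def on_circle(t):
    return (math.cos(t), math.sin(t))

# Convex quadrilateral inscribed in the unit circle
# (cyclic order of vertices: x, T_i x, T_i T_j x, T_j x).
cyclic = circularity_defect(on_circle(0.1), on_circle(1.0),
                            on_circle(4.5), on_circle(2.7))
# Same quadrilateral with the vertex T_i T_j x moved off the circle.
generic = circularity_defect(on_circle(0.1), on_circle(1.0),
                             on_circle(4.5), (-1.5, 0.9))
```

For the inscribed quadrilateral the defect vanishes up to rounding, while for the perturbed one it does not.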
A perhaps more convenient characterization of the circular lattice is given by the following

Proposition 8. The multidimensional circular lattice is characterized, among the quadrilateral lattices, by the following constraint:
$$X_i \cdot T_i X_j + X_j \cdot T_j X_i = 0, \qquad i, j = 1, \dots, N, \quad i \neq j. \tag{103}$$
Proof. ($\Rightarrow$) We present the proof for generic convex (Fig. 3) and skew (Fig. 4) quadrilaterals, leaving the analysis of all the degenerate cases to the interested reader. From the sine theorem applied to the triangle $ACE$ it follows that
$$\frac{|AE|}{\sin\beta} = \frac{|CE|}{\sin\alpha}. \tag{104}$$
The Thales theorem implies
$$\frac{|T_i X_j|}{|AE|} = \frac{|X_j|}{|CE|}, \tag{105}$$
which, combined with Eq. (104), gives
$$\frac{|T_i X_j|}{\sin\beta} = \frac{|X_j|}{\sin\alpha}. \tag{106}$$
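Figure 3. Figure 4. [Convex and skew quadrilaterals with vertices $A$, $B$, $C$, $D$, auxiliary points $E$, $F$, edge vectors $X_i$, $X_j$, $T_iX_j$, $T_jX_i$, and angles $\alpha$, $\beta$, $\pi-\beta$.]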
Similar reasoning, applied to the triangle $ABF$, leads to
$$\frac{|X_i|}{\sin\alpha} = \frac{|T_j X_i|}{\sin\beta}; \tag{107}$$
comparing (106) and (107) we obtain
$$|X_j|\,|T_j X_i| = |X_i|\,|T_i X_j|, \qquad i \neq j, \tag{108}$$
and combining Eqs. (108) and (102) we finally get the condition (103).

($\Leftarrow$) The planarity condition (99) and the constraint (103) imply
$$2 X_i \cdot X_j = -(T_j Q_{ij})|X_j|^2 - (T_i Q_{ji})|X_i|^2, \tag{109}$$
and
$$|T_i X_j|^2 = |X_j|^2 \left(1 - (T_j Q_{ij})(T_i Q_{ji})\right). \tag{110}$$
Equation (110) and its $j \leftrightarrow i$ version lead to the condition (108). This condition, together with the constraint (103), gives Eq. (102) or, equivalently, the circularity of the elementary quadrilaterals.

In order to connect with the formalism of Sects. 2 and 3, it is convenient to define the functions $\phi_{ij}$, $i, j = 1, \dots, N$, by
$$\phi_{ii} = |X_i|^2, \tag{111}$$
$$\phi_{ji} = -\phi_{ii}\,(T_i Q_{ji}), \qquad i \neq j; \tag{112}$$
then the formulas (109) and (110) can be rewritten, respectively, as
$$X_i \cdot X_j = \frac{1}{2}\left(\phi_{ij} + \phi_{ji}\right), \qquad i, j = 1, \dots, N, \tag{113}$$
$$\Delta_i \phi_{jj} = (T_i Q_{ji})\,\phi_{ij}, \qquad j \neq i. \tag{114}$$
Finally, comparing Eqs. (99), (112), (114) with, respectively, Eqs. (29), (33), (32) (for $\lambda_0 = 0$ and $k = j$), and comparing also the constraint (113) with the constraint (70), we arrive at the conclusion that the vector $X_j$ is the $j$-th row of the matrix $\psi(1)$ and the field $\phi_{ij}$ is the $(ij)$ element of the matrix $\psi(0)$, thus proving the analytic integrability, via the $\bar\partial$-problem, of the equations describing the MCL.

Acknowledgement. This research was partially supported by the INTAS grant no. 98–166 and by the Russian Foundation for Basic Research grant no. 96–01–00841. A. D. was partially supported by the Polish Committee for Scientific Research grant no. 2P03 B 18509.
References
1. Bianchi, L.: Lezioni di Geometria Differenziale. 3-a ed., Bologna: Zanichelli, 1924
2. Bobenko, A.: Discrete conformal maps and surfaces. To appear in: Symmetries and Integrability of Difference Equations II, eds. P. Clarkson and F. Nijhoff, Cambridge: Cambridge University Press
3. Bobenko, A., Pinkall, U.: Discrete Surfaces with Constant Negative Gaussian Curvature and the Hirota equation. J. Diff. Geom. 43, 527–611 (1996)
4. Bobenko, A., Pinkall, U.: Discrete isothermic surfaces. J. reine angew. Math. 475, 187–208 (1996)
5. Bogdanov, L.V., Manakov, S.V.: The nonlocal $\bar\partial$-problem and (2+1)-dimensional soliton equations. J. Phys. A: Math. Gen. 21, L537–L544 (1988)
6. Bogdanov, L.V., Konopelchenko, B.G.: Lattice and q-difference Darboux–Zakharov–Manakov systems via $\bar\partial$ method. J. Phys. A: Math. Gen. 28, L173–L178 (1995)
7. Cieśliński, J., Doliwa, A., Santini, P.M.: The Integrable Discrete Analogues of Orthogonal Coordinate Systems are Multidimensional Circular Lattices. Phys. Lett. A 235, 480–488 (1997)
8. Darboux, G.: Leçons sur les systèmes orthogonaux et les coordonnées curvilignes. 2-ème éd., complétée, Paris: Gauthier–Villars, 1910
9. Doliwa, A.: Geometric discretization of the Toda system. Phys. Lett. A 234, 187–192 (1997)
10. Doliwa, A., Santini, P.M.: Integrable dynamics of a discrete curve and the Ablowitz-Ladik hierarchy. J. Math. Phys. 36, 1259–1273 (1995)
11. Doliwa, A., Santini, P.M.: Multidimensional Quadrilateral Lattices are Integrable. Phys. Lett. A 233, 365–372 (1997)
12. Konopelchenko, B.G.: Solitons in Multidimensions. Singapore: World Scientific, 1993
13. Konopelchenko, B.G., Schief, W.K.: Three-dimensional integrable lattices in Euclidean spaces: Conjugacy and orthogonality. Preprint 1997
14. Lamé, G.: Leçons sur les coordonnées curvilignes et leurs diverses applications. Paris: Mallet–Bachelier, 1859
15. Manakov, S.V., Zakharov, V.E.: Differential reductions for multidimensional soliton equations. Preprint 1997
16. Mañas, M., Doliwa, A., Santini, P.M.: Darboux Transformations for Multidimensional Quadrilateral Lattices. I. Phys. Lett. A 232, 99–105 (1997)
17. Martin, R.R., de Pont, J., Sharrock, T.J.: Cyclic surfaces in computer aided design. In: The Mathematics of Surfaces, ed. J. A. Gregory, Oxford: Clarendon Press, 1986, pp. 253–268
18. Nutbourne, A.W.: The solution of a frame matching equation. In: The Mathematics of Surfaces, ed. J. A. Gregory, Oxford: Clarendon Press, 1986, pp. 233–252
19. Sauer, R.: Differenzengeometrie. Berlin: Springer, 1970
20. Zakharov, V.E.: On the Dressing Method. In: Inverse Method in Action, ed. P. C. Sabatier, Berlin: Springer, 1990, pp. 602–623
21. Zakharov, V.E.: On Integrability of the Equations Describing N-Orthogonal Curvilinear Coordinate Systems and Hamiltonian Integrable Systems of Hydrodynamic Type. Part 1. Integration of the Lamé Equations. Preprint 1996
22. Zakharov, V.E., Manakov, S.V.: Construction of Multidimensional Nonlinear Integrable Systems and Their Solutions. Funct. Anal. Appl. 19, 89 (1985)

Communicated by T. Miwa
Commun. Math. Phys. 196, 19 – 51 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Extended Integrability and Bi-Hamiltonian Systems*

Oleg I. Bogoyavlenskij
Department of Mathematics and Statistics, Queen's University, Kingston, Canada, K7L 3N6

Received: 15 January 1997 / Accepted: 19 December 1997

Abstract: The current notion of integrability of Hamiltonian systems was fixed by Liouville in a famous 1855 paper. It describes systems in a $2k$-dimensional phase space whose trajectories are dense on tori $\mathbb{T}^q$ or wind on toroidal cylinders $\mathbb{T}^m \times \mathbb{R}^{q-m}$. Within Liouville's construction the dimension $q$ cannot exceed $k$ and is the main invariant of the system. In this paper we generalize Liouville integrability so that trajectories can be dense on tori $\mathbb{T}^q$ of arbitrary dimensions $q = 1, \dots, 2k-1, 2k$ and an additional invariant $v$: $2(q-k) \le v \le 2[q/2]$ can be recovered. The main theorem classifies all $k(k+1)/2$ canonical forms of Hamiltonian systems that are integrable in a newly defined broad sense. An integrable physical problem having engineering origin is presented. The notion of extended compatibility of two Poisson structures is introduced. The corresponding bi-Hamiltonian systems are shown to be integrable in the broad sense.

Contents
1. Introduction
2. An Integrable Problem of Engineering Origin
3. Integrability of Bi-Hamiltonian Systems
4. Integrability After Reduction
5. Complete Classification of the Invariant Closed 2-Forms for a $\mathbb{T}^q$ Dense Integrable System
6. Canonical Forms for the Symplectic Structures in the Toroidal Domains
7. Canonical Forms for the Integrable Hamiltonian Systems
8. Abelian Lie Algebras of Symmetries
9. Applications to Bi-Hamiltonian Systems
10. Quasi-Periodic Dynamics Without Hamiltonian Structure
11. Concluding Remarks
12. Applications to Hydrodynamics

* Supported by the Natural Sciences and Engineering Research Council of Canada.
1. Introduction

I. This paper is devoted to the problem of integrability of Hamiltonian systems. We start with the problem of the dynamics of an electron on a torus $\mathbb{T}^2 \subset \mathbb{R}^3$ in an electromagnetic field. The dynamics is described by a Hamiltonian system on the symplectic manifold $T^*(\mathbb{T}^2)$ with a symplectic structure $\omega$ that is not exact. For certain values of parameters, this system has the following properties:
i) The system possesses one functionally independent first integral.
ii) The generic trajectories are quasi-periodic and dense on 3-dimensional tori $\mathbb{T}^3 \subset M^4 = T^*(\mathbb{T}^2)$.
iii) The restrictions of the symplectic structure $\omega$ to the invariant tori $\mathbb{T}^3$ have rank 2; the tori $\mathbb{T}^3$ are coisotropic with respect to $\omega$.
These properties imply that the Hamiltonian system is integrable in the conventional sense, but not in the Liouville sense. This example shows that a new concept of integrability is needed.

II. A general Hamiltonian system on a symplectic manifold $M^{2k}$ has the form
$$\dot x^\tau = V^\tau(x) = P^{\tau\nu}(x)\,\theta_\nu(x), \tag{1.1}$$
where $P(x) = \omega^{-1}(x)$, $\omega(x)$ is a non-degenerate symplectic structure on the manifold $M^{2k}$, $d\omega(x) = 0$, and $\theta(x)$ is a closed 1-form, $d\theta(x) = 0$. A Hamiltonian system (1.1) is called Liouville-integrable if Liouville's condition [19] is satisfied: there are $k$ functionally independent first integrals $F_1(x), \dots, F_k(x)$ that are in involution, $\{F_i, F_j\} = P^{\tau\nu} F_{i,\tau} F_{j,\nu} = 0$. Any Hamiltonian system that is Liouville-integrable has the canonical form
$$\dot I_j = 0, \qquad \dot\varphi_j = \frac{\partial H(I)}{\partial I_j}, \qquad \omega = \sum_{j=1}^{k} dI_j \wedge d\varphi_j, \tag{1.2}$$
where $I_1, \dots, I_k, \varphi_1, \dots, \varphi_k$ are the action-angle coordinates. The invariant submanifolds $M^k$: $F_1(x) = c_1, \dots, F_k(x) = c_k$ are tori $\mathbb{T}^k$ if the $M^k$ are compact. The Liouville involutivity condition implies that system (1.1) has an abelian Lie algebra $F_a$ of functionally independent first integrals $F_i(x)$ with respect to the Poisson brackets, $\{F_i, F_j\} = 0$, and an abelian Lie algebra $S_a$ of symmetries $U_i$ which preserve the first integrals $F_j(x)$. The symmetries $U_i$ are Hamiltonian and have the form $U_i^\tau = P^{\tau\nu} F_{i,\nu}$. In the Liouville concept the two distinct Lie algebras $F_a$ and $S_a$ are identified. We separate these two Lie algebras $F_a$, $S_a$ and study them independently. The key observations concerning the preceding integrable Hamiltonian system are:
a) The integrable Hamiltonian system must possess functionally independent first integrals $F_1(x), \dots, F_p(x)$, where $0 \le p < 2k$, and these first integrals may be non-involutive.
b) The integrable Hamiltonian system must possess an abelian $(n-p)$-dimensional Lie algebra $S_a$ of symmetries that preserve the first integrals $F_j(x)$, and these symmetries may be non-symplectic.

Definition 1. The dynamical system $\dot x^\tau = V^\tau(x^1, \dots, x^n)$ on a smooth manifold $M^n$ will be called integrable in the broad sense if it possesses:
(a) $p$ functionally independent first integrals $F_1(x), \dots, F_p(x)$, where $0 \le p < n$,
(b) an abelian $(n-p)$-dimensional Lie algebra $S_a$ of symmetries which preserve the first integrals $F_j(x)$ and which are linearly independent at the generic points $x \in M^n$.
In Liouville's concept, which is applicable to Hamiltonian systems only, the independent conditions (a) and (b) were incorporated into the single condition of the involutivity of $k$ first integrals. Let $M^{2k}$ be a symplectic manifold with a symplectic structure $\omega$. If a Hamiltonian system on $M^{2k}$ is integrable in the broad sense and its invariant submanifolds are compact, then the closures of the generic trajectories are tori $\mathbb{T}^q$. This assertion is standard; its proof, which we omit, is analogous to the "simple" part of Liouville's theorem.

Our main result consists of the complete classification of all $k(k+1)/2$ nonequivalent canonical forms of integrable Hamiltonian systems and invariant symplectic structures $\omega$. The classification is presented in Theorem 2, which is proved in Sects. 5–7. This classification extends the "difficult" part of Liouville's theorem (construction of the action-angle coordinates), which is based entirely on the involutivity of the first integrals $F_1(x), \dots, F_k(x)$. Notice that in Definition 1 neither the involutivity of the first integrals $F_1(x), \dots, F_p(x)$ nor the symplecticity of the symmetries is required.

The Hamiltonian systems integrable in the broad sense are classified by two integer topological invariants $q$ and $v$: $1 \le q \le 2k$ and $2(q-k) \le v \le q$. The first one is the maximal dimension of the tori $\mathbb{T}^q$ that are the closures of the generic trajectories. The second invariant $v$ is the maximal rank of the restrictions of the symplectic structure $\omega$ to the invariant tori $\mathbb{T}^q$. It is evident that $v$ is even and $v \le q$. Liouville's classical concept describes the very important but special integrable Hamiltonian systems for which $1 \le q \le k$ and $v = 0$. Theorem 2 below implies the following necessary and sufficient condition for the Liouville integrability:

A Hamiltonian system (1.1) is Liouville-integrable if and only if it is integrable in the broad sense and its topological invariant $v = 0$.
Hence Definition 1 is an extension of the Liouville concept of integrability.

III. It is evident that a need for any new concept is determined by its possible applications. In this paper we show that the concept of integrability in the broad sense has applications to the following problems, which are treated in Sects. 2–4, 9.
i) The dynamics of an electron on a 2-dimensional torus $\mathbb{T}^2 \subset \mathbb{R}^3$ under the influence of an electromagnetic field.
ii) The integrability of bi-Hamiltonian systems with two generic Poisson structures $P_1(x)$ and $P_2(x)$.
iii) The integrability of Hamiltonian systems after reduction with respect to an infinite discrete group of symmetries $\Gamma = \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}$.
iv) The algebraic properties of the recursion operators $A(x) = P_1(x) P_2^{-1}(x)$ and their Nijenhuis tensors.
IV. In Sect. 3, we introduce the notion of extended compatibility of two Poisson structures $P_1(x)$ and $P_2(x)$ that generalizes Magri's notion of compatibility [20]. In Theorem 1 we prove that the corresponding generic bi-Hamiltonian systems are integrable in the broad sense. All previously known integrable bi-Hamiltonian systems [7, 12, 13, 20] are integrable in Liouville's sense and are particular cases of the proposed more general construction. We present examples of bi-Hamiltonian systems on the manifold $M^{2k} = \mathbb{R}^1 \times \mathbb{T}^{2k-1}$ which are integrable in the broad sense and are not integrable in the Liouville sense. The above physical problem of the dynamics of an electron on a torus $\mathbb{T}^2 \subset \mathbb{R}^3$ in an electromagnetic field is one such bi-Hamiltonian system. In Sect. 9 we study non-degenerate Poisson structures $P_i(x)$ which are invariant with respect to a Hamiltonian system $V$ that is integrable in the broad sense and non-degenerate. We prove that any recursion operator $A(x) = P_1(x) P_2^{-1}(x)$ has $2(q-k)$ constant eigenvalues and that any $V$-invariant Poisson structures $P_1(x)$ and $P_2(x)$ in general position are compatible in the extended sense.

V. In Sect. 4, we study the integrability of Hamiltonian systems after reductions with respect to infinite discrete groups of symplectic symmetries. We consider the Liouville-integrable Hamiltonian systems on a symplectic manifold $\mathbb{R}^{4\ell}$ which possess a group of translational symmetries $\Gamma = \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}$. In Proposition 1, we prove that after the reduction we obtain Hamiltonian systems on the quotient manifold $\mathbb{R}^{4\ell}/\Gamma = T^*(\mathbb{T}^{2\ell})$ which are integrable in the broad sense. The generic trajectories of these systems are dense on the $3\ell$-dimensional invariant tori $\mathbb{T}^{3\ell} \subset T^*(\mathbb{T}^{2\ell})$; their topological invariants are $q = 3\ell$, $v = 2\ell$.

VI. Kolmogorov was the first to mention the existence of Hamiltonian systems with invariant tori of dimensions $> k$ on symplectic manifolds $M^{2k}$, see [18], p. 326. For all integrable Hamiltonian systems (1.1) that have been investigated until now, the invariant tori $\mathbb{T}^q \subset M^{2k}$ were either Lagrangian, isotropic or coisotropic with respect to the invariant symplectic structure $\omega$. A torus $\mathbb{T}^k$ ($q = k$) is a Lagrangian submanifold in $M^{2k}$ if its tangent spaces $T_x(\mathbb{T}^k)$ coincide with the $\omega$-orthogonal spaces $T_x(\mathbb{T}^k)^\perp \subset M^{2k}$. A torus $\mathbb{T}^q$ is isotropic or coisotropic if, respectively,
$$T_x(\mathbb{T}^q) \subset T_x(\mathbb{T}^q)^\perp \qquad \text{or} \qquad T_x(\mathbb{T}^q) \supset T_x(\mathbb{T}^q)^\perp.$$
The coisotropic invariant tori have been explored by Parasyuk [27-30] and Moser [24] in studying the applicability of the Kolmogorov-Arnold-Moser theory to small perturbations of the corresponding Hamiltonian systems. In [32], Zehnder constructed a Hamiltonian system on the symplectic manifold $M^4 = \mathbb{R}^1 \times \mathbb{T}^3$ that has constant coefficients and whose trajectories are dense on tori $\mathbb{T}^3$. In [14, 15], Herman constructed counterexamples to the $C^\infty$ closing lemma using Zehnder-like Hamiltonian systems on the symplectic manifolds $M^{2k} = \mathbb{R}^1 \times \mathbb{T}^{2k-1}$. The integrability of the original Hamiltonian system was the initial assumption of the cited papers, where the dynamics of small Hamiltonian perturbations has been studied. In the present paper, we derive the integrability of the Hamiltonian system under investigation from the assumptions (a) and (b) of Definition 1 and obtain a complete set of canonical forms, which contains those systems with Lagrangian, isotropic and coisotropic tori as special cases. The invariant tori of general integrable systems are not necessarily of any one of these three types.

2. An Integrable Problem of Engineering Origin

I. Let us study the non-relativistic dynamics of an electron on a torus $\mathbb{T}^2$ under the influence of an electromagnetic field. This problem arises from the engineering problem of confinement of charged particles moving in a vacuum toroidal chamber [10]. We suppose that the torus $\mathbb{T}^2$ is embedded into the toroidal chamber in the standard way:
$$x = (R + r\cos\varphi_1)\cos\varphi_2, \qquad y = (R + r\cos\varphi_1)\sin\varphi_2, \qquad z = r\sin\varphi_1.$$
Here $\varphi_1$ and $\varphi_2$ are the angular coordinates and $r$ and $R$ are the radii of the small and large circles respectively, $r < R$. The Riemannian metric that is induced on the torus $\mathbb{T}^2$ has the form $ds^2 = r^2\,d\varphi_1^2 + R^2(\varphi_1)\,d\varphi_2^2$, where $R(\varphi_1) = R + r\cos\varphi_1$.
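The induced metric can be confirmed by differentiating the embedding numerically. The sketch below uses central finite differences; the function names and the sample radii are illustrative assumptions, not taken from the paper.

```python
import math

def embed(phi1, phi2, R=3.0, r=1.0):
    """Standard embedding of the torus into R^3 (sample radii are assumptions)."""
    rho = R + r * math.cos(phi1)
    return (rho * math.cos(phi2), rho * math.sin(phi2), r * math.sin(phi1))

def induced_metric(phi1, phi2, R=3.0, r=1.0, h=1e-6):
    """Return (g11, g12, g22) of the first fundamental form via central differences."""
    def tangent(axis):
        step = [0.0, 0.0]
        step[axis] = h
        fwd = embed(phi1 + step[0], phi2 + step[1], R, r)
        bwd = embed(phi1 - step[0], phi2 - step[1], R, r)
        return [(u - v) / (2 * h) for u, v in zip(fwd, bwd)]

    e1, e2 = tangent(0), tangent(1)
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return dot(e1, e1), dot(e1, e2), dot(e2, e2)

g11, g12, g22 = induced_metric(0.8, 2.1)
```

Under these assumptions `g11` should agree with $r^2$, `g12` with $0$ and `g22` with $(R + r\cos\varphi_1)^2$ up to discretization error.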
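We suppose that the magnetic field is orthogonal to the torus and has a constant strength $B$, and that the electric field $E$ is tangent to the torus $\mathbb{T}^2$. The dynamics of an electron is described by Lagrange's equations [2, 10]
$$\frac{d}{dt}\frac{\partial L}{\partial \dot\varphi_i} = \frac{\partial L}{\partial \varphi_i}, \tag{2.1}$$
$$L(\varphi_i, \dot\varphi_i) = \frac{1}{2} m r^2 \dot\varphi_1^2 + \frac{1}{2} m R^2(\varphi_1)\,\dot\varphi_2^2 + \frac{e}{c} B \varphi_1 \dot\varphi_2 + eE_1\varphi_1 + eE_2\varphi_2.$$
Here $m$ is the electron mass, $e$ is its charge, and $c$ is the speed of light. After applying the Legendre transform
$$p_1 = m r^2 \dot\varphi_1, \qquad p_2 = m R^2(\varphi_1)\,\dot\varphi_2$$
and transforming to the new angular variables
$$\varphi_1 \to \varphi_1 + \frac{c}{eB}\,p_2, \qquad \varphi_2 \to \varphi_2 - \frac{c}{eB}\,p_1,$$
Lagrange's equations (2.1) take the form
$$\begin{aligned}
\dot p_1 &= \frac{eB}{c m R^2(p,\varphi)}\,p_2 + eE_1 + \frac{R'(p,\varphi)}{m R^3(p,\varphi)}\,p_2^2, &\qquad \dot p_2 &= -\frac{eB}{c m r^2}\,p_1 + eE_2, \\
\dot\varphi_1 &= \frac{cE_2}{B}, &\qquad \dot\varphi_2 &= -\frac{cE_1}{B} - \frac{c R'(p,\varphi)}{m e B R^3(p,\varphi)}\,p_2^2,
\end{aligned} \tag{2.2}$$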
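where $R(p,\varphi) = R + r\cos(\varphi_1 - cp_2/eB)$ and $R'(p,\varphi) = -r\sin(\varphi_1 - cp_2/eB)$. The dynamical system (2.2) has the Hamiltonian form
$$\dot x^i = P_1^{ij}\theta_j, \tag{2.3}$$
where $P_1^{ij}$ is the constant Poisson structure
$$P_1^{ij} = \begin{pmatrix} 0 & \frac{e}{c}B & 0 & 0 \\ -\frac{e}{c}B & 0 & 0 & 0 \\ 0 & 0 & 0 & -\frac{c}{eB} \\ 0 & 0 & \frac{c}{eB} & 0 \end{pmatrix},$$
and $\theta$ is the closed 1-form
$$\theta = d\left(\frac{1}{2mr^2}\,p_1^2 + \frac{1}{2mR^2(p,\varphi)}\,p_2^2 - \frac{cE_2}{B}\,p_1 + \frac{cE_1}{B}\,p_2\right) - eE_1\,d\varphi_1 - eE_2\,d\varphi_2.$$
The symplectic structure $\omega_1 = P_1^{-1}$ has the form
$$\omega_1 = -\frac{c}{eB}\,dp_1 \wedge dp_2 + \frac{eB}{c}\,d\varphi_1 \wedge d\varphi_2. \tag{2.4}$$

II. In what follows, we suppose that
$$\frac{r}{R} \ll \min\left\{1, \frac{eE_1}{mc^2}\right\}.$$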
In this case system (2.2) is a small perturbation of the Hamiltonian system
$$\begin{aligned}
\dot p_1 &= \frac{eB}{c m R^2}\,p_2 + eE_1, &\qquad \dot p_2 &= -\frac{eB}{c m r^2}\,p_1 + eE_2, \\
\dot\varphi_1 &= \frac{cE_2}{B}, &\qquad \dot\varphi_2 &= -\frac{cE_1}{B}.
\end{aligned} \tag{2.5}$$
The system (2.5) has the Hamiltonian form (2.3), where the function $R(p,\varphi)$ is constant, $R(p,\varphi) = R$. System (2.5) has the first integral
$$H(p_1, p_2) = \frac{1}{2mr^2}\,p_1^2 + \frac{1}{2mR^2}\,p_2^2 - \frac{cE_2}{B}\,p_1 + \frac{cE_1}{B}\,p_2. \tag{2.6}$$
The submanifolds of constant level of this first integral are the 3-dimensional tori $\mathbb{T}^3 = S^1 \times \mathbb{T}^2$. All trajectories of system (2.5) are quasi-periodic on the tori $\mathbb{T}^3$ and have the frequencies
$$\frac{cE_2}{B}, \qquad -\frac{cE_1}{B}, \qquad \frac{eB}{cmrR}. \tag{2.7}$$
Hence the trajectories are dense on the tori $\mathbb{T}^3$ if the three numbers (2.7) are rationally independent.
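The quasi-periodic behaviour of (2.5) can be illustrated numerically. The following sketch integrates (2.5) with a classical Runge-Kutta scheme over one period $2\pi cmrR/(eB)$ of the momentum oscillation and compares the first integral (2.6) at both ends; the parameter values and function names are arbitrary illustrative assumptions.

```python
import math

# Illustrative parameter values (assumptions): e, c, B, m, r, R, E1, E2
PARAMS = (1.0, 1.0, 2.0, 1.0, 1.0, 3.0, 0.3, 0.5)

def rhs(state, prm=PARAMS):
    """Right-hand side of system (2.5) for state (p1, p2, phi1, phi2)."""
    p1, p2, _, _ = state
    e, c, B, m, r, R, E1, E2 = prm
    return (e * B / (c * m * R**2) * p2 + e * E1,
            -e * B / (c * m * r**2) * p1 + e * E2,
            c * E2 / B,
            -c * E1 / B)

def hamiltonian(state, prm=PARAMS):
    """First integral (2.6)."""
    p1, p2, _, _ = state
    e, c, B, m, r, R, E1, E2 = prm
    return (p1**2 / (2 * m * r**2) + p2**2 / (2 * m * R**2)
            - c * E2 / B * p1 + c * E1 / B * p2)

def rk4(state, t_end, n):
    """Integrate from t = 0 to t_end with n classical Runge-Kutta steps."""
    h = t_end / n
    shift = lambda s, k, a: tuple(u + a * v for u, v in zip(s, k))
    for _ in range(n):
        k1 = rhs(state)
        k2 = rhs(shift(state, k1, h / 2))
        k3 = rhs(shift(state, k2, h / 2))
        k4 = rhs(shift(state, k3, h))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * g + d)
                      for s, a, b, g, d in zip(state, k1, k2, k3, k4))
    return state

e, c, B, m, r, R, E1, E2 = PARAMS
omega = e * B / (c * m * r * R)      # third frequency in (2.7)
period = 2 * math.pi / omega
start = (0.4, -0.2, 0.0, 0.0)
end = rk4(start, period, 4000)
```

After one period the momenta return to their initial values, while the angles advance by $cE_2/B$ and $-cE_1/B$ times the period.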
III. The Hamiltonian system (2.5) preserves three closed 1-forms $\theta$, $\zeta_1 = d\varphi_1$ and $\zeta_2 = d\varphi_2$. These differential forms satisfy the relations
$$P_1^{ij}\theta_i\zeta_{1.j} = -\frac{cE_2}{B}, \qquad P_1^{ij}\theta_i\zeta_{2.j} = \frac{cE_1}{B}, \qquad P_1^{ij}\zeta_{1.i}\zeta_{2.j} = -\frac{c}{eB}. \tag{2.8}$$
Therefore, the corresponding Hamiltonian flows commute. These flows preserve the function $H(p_1, p_2)$ (2.6). Indeed, Eqs. (2.5) and (2.6) imply $dH = \theta + eE_1\zeta_1 + eE_2\zeta_2$. Hence using Eqs. (2.8), we derive
$$P_1^{ij}\frac{\partial H}{\partial x^i}\theta_j = 0, \qquad P_1^{ij}\frac{\partial H}{\partial x^i}\zeta_{1.j} = 0, \qquad P_1^{ij}\frac{\partial H}{\partial x^i}\zeta_{2.j} = 0.$$
Thus we obtain that system (2.5) on the 4-dimensional manifold $T^*(\mathbb{T}^2)$ possesses an abelian 3-dimensional Lie algebra of symmetries $S_a$ which preserve the first integral $H(p_1, p_2)$. Hence the Hamiltonian system (2.5) is integrable in the broad sense, but not in the Liouville sense. The invariant tori $\mathbb{T}^3$ are coisotropic with respect to the symplectic structure (2.4). The topological invariants of system (2.5) are $q = 3$, $v = 2$ provided that the three numbers (2.7) are rationally independent, and $q = 2$, $v = 2$ if they satisfy one linear equation with integer coefficients.
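The relations (2.8) and the vanishing contractions with $dH$ are easy to verify by direct computation. The sketch below does this for constant $R$, with coordinates ordered as $x = (p_1, p_2, \varphi_1, \varphi_2)$; the numerical parameter values and helper names are illustrative assumptions.

```python
# Illustrative parameter values (assumptions, not from the paper).
e, c, B, m, r, R, E1, E2 = 1.0, 1.0, 2.0, 1.0, 1.0, 3.0, 0.3, 0.5

k = e * B / c
# Constant Poisson structure P_1^{ij} in coordinates (p1, p2, phi1, phi2).
P1 = [[0.0,   k,   0.0,     0.0],
      [-k,   0.0,  0.0,     0.0],
      [0.0,  0.0,  0.0, -1.0 / k],
      [0.0,  0.0, 1.0 / k,  0.0]]

def pair(a, b):
    """Contraction P_1^{ij} a_i b_j of the Poisson structure with two 1-forms."""
    return sum(P1[i][j] * a[i] * b[j] for i in range(4) for j in range(4))

def forms(p1, p2):
    """Components of dH, theta, zeta_1, zeta_2 at a point (constant R)."""
    dH = [p1 / (m * r**2) - c * E2 / B, p2 / (m * R**2) + c * E1 / B, 0.0, 0.0]
    theta = [dH[0], dH[1], -e * E1, -e * E2]
    zeta1 = [0.0, 0.0, 1.0, 0.0]
    zeta2 = [0.0, 0.0, 0.0, 1.0]
    return dH, theta, zeta1, zeta2

dH, theta, zeta1, zeta2 = forms(0.7, -1.3)
```

With these conventions `pair(theta, zeta1)`, `pair(theta, zeta2)` and `pair(zeta1, zeta2)` reproduce the three constants in (2.8), and the contractions of `dH` with `theta`, `zeta1`, `zeta2` vanish.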
IV. It is evident that system (2.5) preserves the symplectic structures
$$\omega_2 = f_0(H)\,dp_1 \wedge dp_2 + df_1(H) \wedge d\varphi_1 + df_2(H) \wedge d\varphi_2 + c_2\,d\varphi_1 \wedge d\varphi_2,$$
where $f_0(H)$, $f_1(H)$, and $f_2(H)$ are arbitrary smooth functions, $H = H(p_1, p_2)$ is the first integral (2.6), and $c_2$ is an arbitrary constant. Therefore system (2.5) has a continuum of bi-Hamiltonian forms. For the recursion operator $A = \omega_1^{-1}\omega_2 = P_1 P_2^{-1}$, a direct calculation proves that the Nijenhuis tensor $N_A$ [26]
$$N^i_{Aj\ell} = \sum_{m=1}^{2k}\left(A^m_j\frac{\partial A^i_\ell}{\partial x^m} - A^m_\ell\frac{\partial A^i_j}{\partial x^m} + A^i_m\frac{\partial A^m_j}{\partial x^\ell} - A^i_m\frac{\partial A^m_\ell}{\partial x^j}\right) \tag{2.9}$$
is equal to zero even though the functions $f_i(H)$ are arbitrary. Hence the Poisson structures $P_1$ and $P_2$ are compatible in Magri's sense [13, 20].
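The component formula (2.9) can be evaluated numerically for any (1,1) tensor field. The following is a minimal sketch using central finite differences, with the convention `A(x)[i][j]` for $A^i_j$; the two example fields are illustrative assumptions and are not the recursion operator of the paper.

```python
def nijenhuis(A, x, h=1e-5):
    """Return N[i][j][l] = N^i_{jl} of formula (2.9) at the point x.

    A is a callable x -> matrix with A(x)[i][j] = A^i_j; derivatives are
    approximated by central differences (an assumption of this sketch).
    """
    n = len(x)
    A0 = A(x)

    def dA(m):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        Ap, Am = A(xp), A(xm)
        return [[(Ap[i][j] - Am[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

    D = [dA(m) for m in range(n)]  # D[m][i][j] = dA^i_j / dx^m
    return [[[sum(A0[m][j] * D[m][i][l] - A0[m][l] * D[m][i][j]
                  + A0[i][m] * D[l][m][j] - A0[i][m] * D[j][m][l]
                  for m in range(n))
              for l in range(n)] for j in range(n)] for i in range(n)]

def scalar_field(x):
    """A = f(x) * identity; its Nijenhuis tensor vanishes identically."""
    f = 1.0 + x[0] ** 2 + x[1]
    return [[f, 0.0], [0.0, f]]

def single_entry(x):
    """A^1_1 = x^2 (other entries zero); here N^1_{12} = x^2 does not vanish."""
    return [[x[1], 0.0], [0.0, 0.0]]

N_scalar = nijenhuis(scalar_field, [0.3, 0.7])
N_single = nijenhuis(single_entry, [0.3, 0.7])
```

The second example shows that the vanishing of $N_A$ is a genuine restriction on $A$, which is why the statement above for arbitrary $f_i(H)$ is not automatic.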
3. Integrability of Bi-Hamiltonian Systems

I. Bi-Hamiltonian systems have the form
$$\dot x^\tau = V^\tau(x) = P_1^{\tau\nu}(x)\,\theta_{1.\nu}(x) = P_2^{\tau\nu}(x)\,\theta_{2.\nu}(x), \tag{3.1}$$
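where $P_1$ and $P_2$ are Poisson structures and $\theta_1$ and $\theta_2$ are closed 1-forms. Let us consider examples of bi-Hamiltonian systems on the smooth manifold $M^{2k} = \mathbb{R}^1 \times \mathbb{T}^{2k-1}$ with coordinates $I_1 \in \mathbb{R}^1$ and $\varphi_1, \dots, \varphi_{2k-1} \in \mathbb{T}^{2k-1}$. We define two symplectic structures on $M^{2k}$:
$$\omega_1 = \sum_{\alpha=1}^{2k-1} a_1^\alpha\,dI_1 \wedge d\varphi_\alpha + \sum_{\alpha,\beta=1}^{2k-1} c_{\alpha\beta}\,d\varphi_\alpha \wedge d\varphi_\beta, \tag{3.2}$$
$$\omega_2 = \sum_{\alpha=1}^{2k-1} dF_\alpha(I_1) \wedge d\varphi_\alpha + \sum_{\alpha,\beta=1}^{2k-1} \bar c_{\alpha\beta}\,d\varphi_\alpha \wedge d\varphi_\beta, \tag{3.3}$$
where $F_1(I_1), \dots, F_{2k-1}(I_1)$ are arbitrary smooth functions and the constant coefficients $a_1^\alpha$, $c_{\alpha\beta} = -c_{\beta\alpha}$, $\bar c_{\alpha\beta} = -\bar c_{\beta\alpha}$ satisfy the equations
$$c_{\alpha\beta}\,a_1^\beta = \bar c_{\alpha\beta}\,a_1^\beta = 0, \qquad \operatorname{rank}\|c_{\alpha\beta}\| = \operatorname{rank}\|\bar c_{\alpha\beta}\| = 2k - 2. \tag{3.4}$$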
Equations (3.4) imply that the closed 2-forms $\omega_1$ and $\omega_2$ are non-degenerate. A direct calculation based on Eqs. (3.4) proves that the recursion operator $A(x) = P_1(x) P_2^{-1}(x)$ possesses the following properties.
(i) The tangent subspaces $T_x(\mathbb{T}^{2k-1}) \subset T_x(M^{2k})$ are invariant with respect to the operators $A(x)$.
(ii) The Nijenhuis tensor $N^i_{Aj\ell}$ (2.9) is equal to zero even though the functions $F_1(I_1), \dots, F_{2k-1}(I_1)$ are arbitrary (see Section 9 for the proof).
Hence the symplectic structures $\omega_1$ and $\omega_2$ are compatible in Magri's sense. We define the dynamical system
$$\dot I_1 = 0, \qquad \dot\varphi_\alpha = \frac{\partial H(I_1)}{\partial I_1}\,a_1^\alpha + b_0^\alpha \tag{3.5}$$
on $M^{2k}$, where $H(I_1)$ is an arbitrary smooth function and $b_0^\alpha$ are arbitrary constants. Calculating the Lie derivatives of $\omega_1$ and $\omega_2$ and using Eqs. (3.4), we prove that system (3.5) preserves the two symplectic structures $\omega_1$ and $\omega_2$ and therefore has the bi-Hamiltonian form (3.1). The generic trajectories of system (3.5) are dense on tori $\mathbb{T}^{2k-1}$ if the constants $a_1^1, \dots, a_1^{2k-1}$ are rationally independent. Hence the bi-Hamiltonian system (3.5) is integrable in the broad sense and is not integrable in Liouville's sense.

II. Let us consider two Poisson structures $P_1(x)$ and $P_2(x)$ on a smooth manifold $M^{2k}$, where $P_2(x)$ is non-degenerate. It is well known that the characteristic polynomial of the recursion operator $A(x) = P_1(x) P_2^{-1}(x)$ is a perfect square:
$$P(\lambda, x) = \det\|A^i_j(x) - \lambda\delta^i_j\| = (Q(\lambda, x))^2. \tag{3.6}$$
The proof is based on the notion of the Pfaffian $\operatorname{Pf} S$ of a skew-symmetric matrix $S$, and on the identity $\det S = (\operatorname{Pf} S)^2$. The polynomial $Q(\lambda, x) = \operatorname{Pf}(P_1(x) - \lambda P_2(x))\,\operatorname{Pf}(P_2^{-1}(x))$. Let the recursion operator $A(x) = P_1(x) P_2^{-1}(x)$ have $p$ functionally independent invariants $\operatorname{Tr} A^m(x)$. Equation (3.6) implies $p \le k$. The generic submanifolds $M_c^q$: $\operatorname{Tr} A^m(x) = c_m$, $m = 1, \dots, k$, have dimension $q = 2k - p \ge k$.
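The identity $\det S = (\operatorname{Pf} S)^2$ behind (3.6) can be checked exactly for $4 \times 4$ matrices (the case $k = 2$). The sketch below uses integer arithmetic and the explicit $4 \times 4$ Pfaffian; the two sample skew-symmetric matrices are arbitrary illustrative choices.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def pf4(S):
    """Pfaffian of a 4x4 skew-symmetric matrix."""
    return S[0][1] * S[2][3] - S[0][2] * S[1][3] + S[0][3] * S[1][2]

def skew(a, b, c, d, e, f):
    """4x4 skew-symmetric matrix built from its six upper-triangular entries."""
    return [[0, a, b, c], [-a, 0, d, e], [-b, -d, 0, f], [-c, -e, -f, 0]]

# Two sample constant Poisson matrices (illustrative assumptions).
P1 = skew(1, 2, -1, 3, 0, 2)
P2 = skew(2, 0, 1, 1, -1, 3)

def pencil(lam):
    """The skew-symmetric matrix P1 - lam * P2."""
    return [[P1[i][j] - lam * P2[i][j] for j in range(4)] for i in range(4)]

squares_match = all(det(pencil(lam)) == pf4(pencil(lam)) ** 2 for lam in range(-3, 4))
```

Since $\det(A - \lambda) = \det(P_1 - \lambda P_2)\det(P_2^{-1})$, this is the mechanism that makes the characteristic polynomial a perfect square.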
Definition 2. Two Poisson structures $P_1(x)$ and $P_2(x)$ will be called compatible in the extended sense if the recursion operator $A(x) = P_1(x) P_2^{-1}(x)$ satisfies the conditions:
i) The tangent subspaces $T_x(M_c^q) \subset T_x(M^{2k})$ are invariant with respect to the (1,1) tensor $A$: $A(T_x(M_c^q)) \subset T_x(M_c^q)$,
ii) The Nijenhuis tensor $N_A$ vanishes on the tangent subspaces $T_x(M_c^q)$: $N_A(u, v) = 0$ for $u, v \in T_x(M_c^q)$.

The two Poisson structures $P_1 = \omega_1^{-1}$ (3.2) and $P_2 = \omega_2^{-1}$ (3.3) are compatible in the extended sense and $p = 1$, $q = 2k - 1$.

Lemma 1. Magri's compatibility of two non-degenerate Poisson structures $P_1(x)$ and $P_2(x)$ implies their compatibility in the extended sense.

Proof. Magri's compatibility is equivalent to the condition that the Nijenhuis tensor $N^i_{Aj\ell}$ vanishes [13, 20]. This condition implies the Lenard relations
$$A^i_j(x)\,\frac{\partial H_m(x)}{\partial x^i} = \frac{\partial H_{m+1}(x)}{\partial x^j}, \tag{3.7}$$
where $H_m(x) = \frac{1}{m}\operatorname{Tr} A^m(x)$. Equations (3.7) yield
$$(AU)(H_m)(x) = U(H_{m+1})(x)$$
for any tangent vector $U(x)$. Hence the extended compatibility conditions are satisfied.

Theorem 1. If two Poisson structures $P_1(x)$ and $P_2(x)$ on a smooth manifold $M^{2k}$ satisfy the conditions of extended compatibility and $p = k$, then the generic bi-Hamiltonian systems (3.1) are integrable in the broad sense.

Proof. For $p = k$, Eq. (3.6) implies that the generic submanifolds $M_c^k$: $\operatorname{Tr} A^m(x) = c_m$, $m = 1, \dots, k$, have dimension $k$ and the $k$ tangent vectors
$$V(x),\ AV(x),\ \dots,\ A^{k-1}V(x) \tag{3.8}$$
are linearly independent for the generic tangent vectors $V(x)$ (3.1). It is evident that the bi-Hamiltonian system (3.1) preserves the (1,1) tensor $A^i_j(x)$ and that the functions $H_m(x)$ are first integrals. Hence the submanifolds $M_c^k$ are invariant with respect to system (3.1): $V(x) \in T_x(M_c^k)$. Using condition (i), we define the restrictions $A_c$ of the (1,1) tensor $A$ onto the submanifolds $M_c^k$. The (1,1) tensor $A_c$ is invariant with respect to the dynamical system (3.1) on the submanifolds $M_c^k$ because the (1,1) tensor $A$ is $V$-invariant on the whole manifold $M^{2k}$. Hence for the Lie derivative we have $L_V A_c = 0$. The Nijenhuis tensor $N_{A_c}$ is defined by the formula [26]
$$N_{A_c}(u, v) = A_c^2[u, v] + [A_c u, A_c v] - A_c[A_c u, v] - A_c[u, A_c v],$$
where $u, v \in T(M_c^k)$. This expression coincides with that for the (1,1) tensor $A$, because $A_c = A$ on $T(M_c^k)$. Therefore condition (ii) implies $N_{A_c}(u, v) \equiv 0$. It is well known that the two conditions $L_V A_c = 0$ and $N_{A_c} = 0$ imply that the $k$ vector fields (3.8) pairwise commute. Indeed, the Nijenhuis formula
$$N_{A_c}(A_c^m V, u) = \left(L_{A_c^{m+1}V}A_c - A_c L_{A_c^m V}A_c\right)u$$
proves by induction that $L_{A_c^m V}A_c = 0$ for all integers $m \ge 0$. Hence the Leibniz formula for the Lie derivatives yields
$$[A_c^m V, A_c^\ell V] = L_{A_c^m V}\left(A_c^\ell V\right) = \left(L_{A_c^m V}A_c^\ell\right)V + A_c^\ell L_{A_c^m V}V = 0.$$
Thus we obtain that the bi-Hamiltonian system (3.1) has $k$ functionally independent first integrals $H_m(x)$ and possesses the $k$-dimensional abelian Lie algebra of symmetries (3.8) that preserve the first integrals $H_m(x)$. Hence conditions (a) and (b) of Definition 1 are satisfied.

III. It is easy to verify that conditions (i) and (ii) of Definition 2 are satisfied if two Poisson structures $P_1(x)$ and $P_2(x)$ are strongly dynamically compatible [7]. Therefore Theorem 1 contains all previously known bi-Hamiltonian systems that are Liouville-integrable [7, 12, 13, 20] plus new bi-Hamiltonian systems that are integrable in the broad sense and for which the Poisson structures $P_1(x)$ and $P_2(x)$ are not compatible. The topological invariants of these systems are $q = k$, $v = 0, 2, \dots, 2[k/2]$.

IV. The condition $p = k$ in Theorem 1 is essential because the commuting vector fields $V(x), AV(x), \dots, A^m V(x), \dots$ generate the abelian Lie algebra $S_a$ whose dimension is $\le k$. This result follows from Lemma 2 below. Notice that in the above examples (where $p = 1$) the closures of the orbits of the resulting action of the Lie group $\mathbb{R}^k$ are tori $\mathbb{T}^{2k-1}$.

Lemma 2. The recursion operator $A(x)$ satisfies the equation of degree $k$
$$Q(A(x), x) = 0. \tag{3.9}$$
Proof. We define two non-degenerate skew-symmetric bilinear forms on the complex linear space $\mathbb{C}^{2k}$: $(y, z)_1 = P_1^{\alpha\beta} y_\alpha z_\beta$, $(y, z)_2 = P_2^{\alpha\beta} y_\alpha z_\beta$. The operator $A = P_1 P_2^{-1}$ satisfies the equations $AP_1 = P_1 A^t$ and $AP_2 = P_2 A^t$. Hence the identities
$$(Ay, z)_1 = (y, Az)_1, \qquad (Ay, z)_2 = (y, Az)_2 \tag{3.10}$$
follow. We first consider the generic case when the operator $A$ has $k$ double-degenerate eigenvalues $\lambda_1, \dots, \lambda_k$. Then the corresponding generalized eigenspaces $L_1, \dots, L_k \subset \mathbb{C}^{2k}$ have dimension 2. Equations (3.10) imply that these subspaces are mutually orthogonal with respect to both bilinear forms: $L_1 \oplus \dots \oplus L_k = \mathbb{C}^{2k}$. Hence the forms $P_1$ and $P_2$ are non-degenerate on each $L_i$. Any two skew-symmetric forms on $L_i$ are proportional because $\dim L_i = 2$. Hence the operator $A = P_1 P_2^{-1}$ on each subspace $L_i$ is the scalar operator $\lambda_i$. Therefore the operator $A$ satisfies the equation $Q(A) = (A - \lambda_1) \cdot \ldots \cdot (A - \lambda_k) = 0$. Thus Eq. (3.9) is proved for the generic case. The operator $A = P_1 P_2^{-1}$ and the polynomial $Q(\lambda)$ depend on the matrices $P_1$ and $P_2$ continuously. Hence Eq. (3.9) is true for any skew-symmetric non-degenerate matrices $P_1$ and $P_2$ by continuity.

The Cayley–Hamilton theorem states that $P(A(x), x) = (Q(A(x), x))^2 = 0$. Lemma 2 means that the polynomial $Q(\lambda, x) = \operatorname{Pf}(P_1(x) - \lambda P_2(x))\,\operatorname{Pf}(P_2^{-1}(x))$ is the minimal polynomial for the generic recursion operator $A(x)$.
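Lemma 2 can be tested exactly on a pair of constant $4 \times 4$ skew-symmetric matrices ($k = 2$). The sketch below uses rational arithmetic; since $Q(\lambda)$ differs from $\operatorname{Pf}(P_1 - \lambda P_2)$ only by the constant factor $\operatorname{Pf}(P_2^{-1})$, it checks that the quadratic polynomial $\operatorname{Pf}(P_1 - \lambda P_2)$ annihilates $A = P_1 P_2^{-1}$. The sample matrices and helper names are illustrative assumptions.

```python
from fractions import Fraction

def skew(a, b, c, d, e, f):
    """4x4 skew-symmetric matrix built from its six upper-triangular entries."""
    return [[0, a, b, c], [-a, 0, d, e], [-b, -d, 0, f], [-c, -e, -f, 0]]

def pf4(S):
    """Pfaffian of a 4x4 skew-symmetric matrix."""
    return S[0][1] * S[2][3] - S[0][2] * S[1][3] + S[0][3] * S[1][2]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

def inverse(M):
    """Exact Gauss-Jordan inverse over the rationals."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Sample non-degenerate skew-symmetric matrices (illustrative assumptions).
P1 = skew(1, 2, -1, 3, 0, 2)
P2 = skew(2, 0, 1, 1, -1, 3)

A = matmul(P1, inverse(P2))          # recursion operator A = P1 P2^{-1}
A2 = matmul(A, A)

# Pf(P1 - lam P2) = q0 + q1 lam + q2 lam^2, because the 4x4 Pfaffian is quadratic.
q0, q2 = pf4(P1), pf4(P2)
q1 = pf4([[P1[i][j] - P2[i][j] for j in range(4)] for i in range(4)]) - q0 - q2

Q_of_A = [[q0 * int(i == j) + q1 * A[i][j] + q2 * A2[i][j] for j in range(4)]
          for i in range(4)]
```

In this example `Q_of_A` is the zero matrix, so a polynomial of degree $k = 2$ already annihilates the $4 \times 4$ operator $A$.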
4. Integrability After Reduction

I. Let us consider a Lagrangian system on the configuration space $\mathbb{R}^{2\ell}$ with a Lagrangian function
$$L(q, \dot q) = T(\dot q) + \frac{1}{2}\sum_{i,j=1}^{2\ell} c_{ij}\,q^i \dot q^j - b_i q^i, \tag{4.1}$$
where the $2\ell \times 2\ell$ matrix $c_{ij}$ is skew-symmetric and non-degenerate, and $T(\dot q)$ is a smooth function. After the Legendre transform
$$p_i = \frac{\partial T(\dot q)}{\partial \dot q^i}, \qquad H(p) = \sum_{i=1}^{2\ell} p_i \dot q^i - T(\dot q),$$
Lagrange's equations corresponding to the Lagrangian (4.1) define the dynamical system
$$\dot p_i = \sum_{j=1}^{2\ell} c_{ij}\,\frac{\partial H(p)}{\partial p_j} - b_i, \qquad \dot q^i = \frac{\partial H(p)}{\partial p_i}. \tag{4.2}$$
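System (4.2) has the Hamiltonian form
$$\dot x^\tau = P_0^{\tau\nu}\,\frac{\partial H_0(x)}{\partial x^\nu},$$
where the Poisson structure $P_0^{\tau\nu}$ and the Hamiltonian function $H_0(x)$ have the form
$$P_0 = \begin{pmatrix} c & -I \\ I & 0 \end{pmatrix}, \qquad H_0(p, q) = H(p) + \sum_{i=1}^{2\ell} b_i q^i. \tag{4.3}$$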
The Hamiltonian system (4.2) contains the closed subsystem
$$\dot p_i = \sum_{j=1}^{2\ell} c_{ij}\,\frac{\partial H_1(p)}{\partial p_j} \tag{4.4}$$
in the phase space $\mathbb{R}^{2\ell}$, where $H_1(p) = H(p) + \sum_{i,j=1}^{2\ell}(c^{-1})^{ij} b_i p_j$. The system (4.4) is Hamiltonian with respect to the Poisson structure $P_2^{ij} = c_{ij}$. Equations (4.2) are invariant with respect to the translations
$$t_m(q) = (q^1 + m_1 L_1, \dots, q^{2\ell} + m_{2\ell} L_{2\ell}), \tag{4.5}$$
where L1 , . . . , L2` are arbitrary numbers and m1 , . . . , m2` are arbitrary integers. The translations tm form the infinite discrete group 0 = Z ⊕ · · · ⊕ Z. It is evident that the dynamical system (4.2) is reducible to the quotient space R4` /0 that is the cotangent bundle of the torus T2` = R2` /0. Let us introduce the angular coordinates ϕi = ai (qi −
2` X j=1
(c−1 )ij pj ) mod (2π),
Extended Integrability and Bi-Hamiltonian Systems
29
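where $a_i = 2\pi/L_i$ and $i = 1, \dots, 2\ell$. The $4\ell$ variables $p_i$, $\varphi_j$ form coordinates on the cotangent bundle $T^*(\mathbb{T}^{2\ell}) = \mathbb{R}^{2\ell} \times \mathbb{T}^{2\ell}$. In these coordinates, system (4.2) takes the form
\[ \dot p_i = \sum_{j=1}^{2\ell} c_{ij} \frac{\partial H_1(p)}{\partial p_j}, \qquad \dot\varphi_i = \bar b_i, \tag{4.6} \]
where $\bar b_i = a_i \sum_{j=1}^{2\ell} (c^{-1})_{ij} b_j$. System (4.6) has the Hamiltonian form
\[ \dot x^\tau = P^{\tau\nu} \theta_\nu, \]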
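where the Poisson structure $P^{\tau\nu}$ and the closed 1-form $\theta$ have the form
\[ P = \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix}, \qquad \theta = dH_1(p) + \sum_{i=1}^{2\ell} (c^{-1})_{ij} \bar b_j \, d\varphi_i. \tag{4.7} \]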
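As a brief check of the reduced equations, the following worked derivation (a sketch that uses only (4.2), the definition of the angular coordinates, and the fact that the inverse of a skew-symmetric matrix is skew-symmetric) shows why the angular velocities are constant and why the subsystem (4.4) is closed. Differentiating $\varphi_i$ along (4.2) gives
\[ \dot\varphi_i = a_i \Big( \dot q^i - \sum_{j} (c^{-1})_{ij} \dot p_j \Big) = a_i \Big( \frac{\partial H}{\partial p_i} - \sum_{j,k} (c^{-1})_{ij} c_{jk} \frac{\partial H}{\partial p_k} + \sum_{j} (c^{-1})_{ij} b_j \Big) = a_i \sum_{j} (c^{-1})_{ij} b_j, \]
which is the constant on the right-hand side of the second equation in (4.6). Similarly, since $\sum_j c_{ij} (c^{-1})_{kj} = -\delta_{ik}$,
\[ \sum_{j} c_{ij} \frac{\partial H_1}{\partial p_j} = \sum_{j} c_{ij} \frac{\partial H}{\partial p_j} + \sum_{j,k} c_{ij} (c^{-1})_{kj} b_k = \sum_{j} c_{ij} \frac{\partial H}{\partial p_j} - b_i, \]
so the first equation in (4.2) coincides with (4.4).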
II. Suppose that the function $T(\dot q)$ is chosen in such a way that the Hamiltonian system (4.4) is Liouville-integrable in the phase space $\mathbb{R}^{2\ell}$ and $F_1(p), \dots, F_\ell(p)$ are its involutive functionally independent first integrals:
\[ \sum_{i,j=1}^{2\ell} c_{ij} \frac{\partial F_n(p)}{\partial p_i} \frac{\partial F_m(p)}{\partial p_j} = 0, \qquad n, m = 1, \dots, \ell. \tag{4.8} \]
Proposition 1. 1) The Hamiltonian system (4.6) on $T^*(\mathbb{T}^{2\ell})$ is integrable in the broad sense. 2) The Hamiltonian system (4.2) on $\mathbb{R}^{4\ell}$ with the Poisson structure $P_0^{\tau\nu}$ (4.3) is Liouville-integrable.

Proof. 1) The Hamiltonian system (4.6) possesses $\ell$ first integrals $F_1(p), \dots, F_\ell(p)$ (4.8) and the $3\ell$-dimensional abelian Lie algebra $S_a$ of symmetries with the generators
\[ U_m = \sum_{i,j=1}^{2\ell} c_{ij} \frac{\partial F_m(p)}{\partial p_j} \frac{\partial}{\partial p_i}, \qquad U_{\ell+i} = \frac{\partial}{\partial \varphi_i}, \tag{4.9} \]
where $m = 1, \dots, \ell$ and $i, j = 1, \dots, 2\ell$. It is evident that the symmetries (4.9) preserve the first integrals $F_1(p), \dots, F_\ell(p)$. Hence the system (4.6) is integrable in the broad sense.

2) Equations (4.2) imply
\[ \dot f_i = -b_i, \qquad f_i(p, q) = p_i - \sum_{j=1}^{2\ell} c_{ij} q^j. \tag{4.10} \]
Therefore the $2\ell - 1$ functions
\[ \bar f_i(p, q) = f_i(p, q) - \frac{b_i}{b_{2\ell}} f_{2\ell}(p, q) \tag{4.11} \]
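are first integrals of system (4.2), in addition to the $\ell$ first integrals $F_1(p), \dots, F_\ell(p)$. The Poisson structure $P_0^{\tau\nu}$ (4.3) defines the following Poisson brackets:
\[ \{F, G\} = \sum_{i,j=1}^{2\ell} c_{ij} \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial p_j} + \sum_{i=1}^{2\ell} \Big( \frac{\partial F}{\partial q^i} \frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q^i} \Big). \tag{4.12} \]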
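Equations (4.8) and (4.12) imply
\[ \{F_n, F_m\} = 0, \qquad \{F_n, f_i\} = 0, \qquad \{f_i, f_j\} = -c_{ij}. \]
These equations and Eqs. (4.11) yield
\[ \{F_n, \bar f_i\} = 0, \qquad \{\bar f_i, \bar f_j\} = \bar c_{ij} = -c_{ij} + \frac{1}{b_{2\ell}} \big( b_i c_{2\ell, j} - b_j c_{2\ell, i} \big), \tag{4.13} \]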
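where $i, j = 1, \dots, 2\ell - 1$. The latter Eqs. (4.13) define a skew-symmetric inner product in the $(2\ell - 1)$-dimensional linear space that is generated by the first integrals $\bar f_1, \dots, \bar f_{2\ell-1}$. A skew-symmetric form on an odd-dimensional space is degenerate and has rank at most $2\ell - 2$; therefore there exists an $\ell$-dimensional isotropic subspace $I$. Let the subspace $I$ be generated by some $\ell$ functions
\[ g_1 = a_{11} \bar f_1 + \cdots + a_{1, 2\ell-1} \bar f_{2\ell-1}, \quad \dots, \quad g_\ell = a_{\ell 1} \bar f_1 + \cdots + a_{\ell, 2\ell-1} \bar f_{2\ell-1}. \tag{4.14} \]
In view of Eqs. (4.8), (4.13), we have
\[ \{F_n, F_m\} = 0, \qquad \{F_n, g_i\} = 0, \qquad \{g_i, g_j\} = 0. \]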
It is evident that the $2\ell$ involutive first integrals $F_1, \dots, F_\ell, g_1, \dots, g_\ell$ are functionally independent. Therefore, the Hamiltonian system (4.2) is Liouville-integrable on the Poisson manifold $\mathbb{R}^{4\ell}$.

III. If the first subsystem (4.6) for some function $H_1(p)$ has $s$-dimensional invariant tori $\mathbb{T}^s$, then the whole Hamiltonian system (4.6) for generic constants $b_i$ has $(2\ell + s)$-dimensional tori $\mathbb{T}^{2\ell+s} \subset T^*(\mathbb{T}^{2\ell})$. Hence the topological invariants of such an integrable system (4.6) are $q = 2\ell + s$, $v = 2\ell$. The invariant tori $\mathbb{T}^{2\ell+s}$ are coisotropic if $s = \ell$ and are not coisotropic if $s < \ell$. The function $H_1(p)$ can be chosen as the Hamiltonian of any classical integrable system in the phase space $\mathbb{R}^{2\ell}$: the Kepler problem, the Toda lattice and its Lie-algebraic generalizations [4], the Volterra system, etc. Hence Proposition 1 implies: There exist at least as many Hamiltonian systems on the symplectic manifold $M^{4\ell} = T^*(\mathbb{T}^{2\ell})$ which are integrable in the broad sense with invariant tori $\mathbb{T}^q$ of dimensions $q$: $2\ell < q \le 3\ell$ as there are Liouville-integrable Hamiltonian systems on the phase space $\mathbb{R}^{2\ell}$.
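The bracket relations (4.13) used in the proof of Proposition 1 can be verified mechanically, because the functions $f_i$ and $\bar f_i$ are linear in $(p, q)$ and the bracket (4.12) then reduces to a bilinear form on their gradients. The Python sketch below does this for $\ell = 2$ in exact arithmetic; the matrix `C`, the vector `B` and the helper names are hypothetical choices made only for illustration.

```python
from fractions import Fraction

N = 4  # N = 2*l with l = 2

# Hypothetical skew-symmetric, non-degenerate matrix c_ij and constants b_i.
C = [[0, 1, 2, 0],
     [-1, 0, 0, 3],
     [-2, 0, 0, 1],
     [0, -3, -1, 0]]
B = [1, 2, 3, 4]


def bracket(grad_f, grad_g):
    """Bracket (4.12) for functions linear in (p, q), given as (dF/dp, dF/dq)."""
    fp, fq = grad_f
    gp, gq = grad_g
    c_term = sum(C[i][j] * fp[i] * gp[j] for i in range(N) for j in range(N))
    pq_term = sum(fq[i] * gp[i] - fp[i] * gq[i] for i in range(N))
    return c_term + pq_term


def grad_f(i):
    """Gradient of f_i = p_i - sum_j c_ij q^j, Eq. (4.10)."""
    dp = [Fraction(int(k == i)) for k in range(N)]
    dq = [Fraction(-C[i][k]) for k in range(N)]
    return dp, dq


def grad_fbar(i):
    """Gradient of fbar_i = f_i - (b_i / b_2l) f_2l, Eq. (4.11)."""
    ratio = Fraction(B[i], B[N - 1])
    dp_i, dq_i = grad_f(i)
    dp_n, dq_n = grad_f(N - 1)
    return ([x - ratio * y for x, y in zip(dp_i, dp_n)],
            [x - ratio * y for x, y in zip(dq_i, dq_n)])


f_brackets = [[bracket(grad_f(i), grad_f(j)) for j in range(N)] for i in range(N)]
fbar_brackets = [[bracket(grad_fbar(i), grad_fbar(j)) for j in range(N - 1)]
                 for i in range(N - 1)]

# Right-hand side of (4.13), written with 0-based indices.
expected_fbar = [[-C[i][j] + Fraction(B[i] * C[N - 1][j] - B[j] * C[N - 1][i], B[N - 1])
                  for j in range(N - 1)] for i in range(N - 1)]
```

Here `f_brackets` should reproduce $-c_{ij}$ and `fbar_brackets` the matrix $\bar c_{ij}$ of (4.13); the latter is a $3 \times 3$ skew-symmetric matrix and hence degenerate, which is what makes a 2-dimensional isotropic subspace available.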
Example 1. Let the Lagrangian (4.1) be defined on $\mathbb{R}^4$ and have the form
\[ L(q, \dot q) = T(\dot q) + \frac{1}{2} \big( q^3 \dot q^1 + q^4 \dot q^2 - q^1 \dot q^3 - q^2 \dot q^4 \big) - \sum_{i=1}^{4} b_i q^i, \]
\[ T(\dot q) = \frac{m}{2} (\dot q^1)^2 + \frac{m}{2} (\dot q^2)^2 + \big( (\dot q^3 + b_1)^2 + (\dot q^4 + b_2)^2 \big)^{1/4} - m \dot q^1 b_3 - m \dot q^2 b_4. \]
The corresponding reduced Hamiltonian system (4.4) coincides with the 2-dimensional Kepler problem. Hence the generic trajectories of the integrable Hamiltonian system (4.6) on $T^*(\mathbb{T}^4)$ are dense on tori $\mathbb{T}^5$ for $H_1(p) < 0$ and generic constants $b_i$, $L_j$. The invariant tori $\mathbb{T}^5$ are not coisotropic; the topological invariants are $q = 5$, $v = 4$. For more examples see [5, 6].
Remark 1. Reductions of Hamiltonian systems with respect to discrete groups of symplectic symmetries have been studied by Marsden [22]. In [23], Meyer proved that the fixed set of an antisymplectic involution is a Lagrangian submanifold.
IV. It is evident from Definition 1 that if a Hamiltonian system on a symplectic manifold $M^{2k}$ is integrable in the broad sense, then the induced Hamiltonian system on the universal covering symplectic manifold $\widehat M^{2k}$ is integrable also. Proposition 1 shows that the covering system can be Liouville-integrable. This is realized because the Hamiltonian system (4.2) possesses $\ell$ additional involutive first integrals (4.14) that are not invariant with respect to the action of the discrete group of symmetries $\Gamma = \pi_1(M^{4\ell}) = \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}$. Trajectories of the Hamiltonian system (4.2) on $\widehat M^{4\ell} = T^*(\mathbb{R}^{2\ell})$ move on the non-compact $2\ell$-dimensional toroidal cylinders $C^{2\ell} = \mathbb{T}^m \times \mathbb{R}^{2\ell-m} \subset \widehat M^{4\ell}$ that are defined by the $2\ell$ equations $F_m = c_m$, $g_i = c_i$. After the reduction, the toroidal cylinders $C^{2\ell}$ are projected densely into the compact $3\ell$-dimensional tori $\mathbb{T}^{3\ell} \subset M^{4\ell} = T^*(\mathbb{T}^{2\ell})$.

5. Complete Classification of the Invariant Closed 2-Forms for a $\mathbb{T}^q$-Dense Integrable System

Proposition 2. If a dynamical system $\dot x^\tau = V^\tau(x^1, \dots, x^n)$ on a smooth manifold $M^n$ is integrable in the broad sense, then the components $M_c^q$, $q = n - p$, of the generic invariant submanifolds $M^q$: $F_1(x) = c_1, \dots, F_p(x) = c_p$ are tori $\mathbb{T}^q$ if they are compact, and toroidal cylinders $\mathbb{T}^m \times \mathbb{R}^{q-m}$ if they are non-compact. In a toroidal neighbourhood $O = B_a \times \mathbb{T}^m \times \mathbb{R}^{q-m}$ of any toroidal cylinder there exist local coordinates $I_1, \dots, I_p \in B_a \subset \mathbb{R}^p$, $\varphi_1, \dots, \varphi_m \in \mathbb{T}^m$, $\rho_{m+1}, \dots, \rho_q \in \mathbb{R}^{q-m}$, where the dynamical system has the form $\dot I_j = 0$, $\dot\varphi_\alpha = \omega_\alpha(I)$, $\dot\rho_\gamma = \omega_\gamma(I)$, and hence is integrable in the conventional sense.

The proof is standard and uses the methods developed by Arnold [3], Jost [17], Nekhoroshev [25], Markus & Meyer [21] and Duistermaat [11].

A dynamical system with quasi-periodic dynamics has the form
\[ \dot I_1 = 0, \dots, \dot I_p = 0, \qquad \dot\varphi_1 = \omega_1(I), \dots, \dot\varphi_q = \omega_q(I) \tag{5.1} \]
in a toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$. The coordinates $I_j$ run over a ball $B_a$:
\[ \sum_{j=1}^{p} (I_j - I_{j0})^2 < a^2. \]
The angular coordinates $\varphi_1, \dots, \varphi_q$ run over the torus $\mathbb{T}^q$, $0 \le \varphi_j \le 2\pi$. If the generic trajectories of the system (5.1) are dense on the tori $\mathbb{T}^q$ then system (5.1) has precisely $p$ functionally independent first integrals. It is evident that all dynamical systems (5.1) are integrable in the broad sense.

Definition 3. A trajectory of the dynamical system (5.1) is called $\mathbb{T}^q$-dense if it is everywhere dense on a torus $\mathbb{T}^q$ for $I_j = c_j = \mathrm{const}$.

The frequencies $\omega_1(I), \dots, \omega_q(I)$ corresponding to a $\mathbb{T}^q$-dense trajectory are rationally independent. This means that for arbitrary integers $m_\alpha$, not all equal to zero, we have $m_1 \omega_1(I) + \dots + m_q \omega_q(I) \ne 0$. Let $X \subset B_a$ be the set of points $I \in B_a$ for which the trajectories of system (5.1) are $\mathbb{T}^q$-dense.
Definition 4. The dynamical system (5.1) is called $\mathbb{T}^q$-dense in the toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ if the set $X$ is everywhere dense in the ball $B_a$.

For example, a Liouville-integrable Hamiltonian system on a symplectic manifold $M^{2k}$ is $\mathbb{T}^k$-dense if it is non-degenerate in the Poincaré sense or in the Poincaré iso-energetic sense [9, 31].

Proposition 3. A continuous dynamical system (5.1) is $\mathbb{T}^q$-dense in the toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ if and only if the functions $\omega_\alpha(I)$ are rationally independent in any ball $B_0 \subset B_a$.

The proof is published in [8], Sect. 12. Any continuous first integral $F(I_j, \varphi_\beta)$ of a $\mathbb{T}^q$-dense dynamical system (5.1) is constant on all tori $\mathbb{T}^q$ and hence any first integral $F$ is a function of the variables $I_j$ only:
\[ \frac{dF}{dt} = 0 \implies F = F(I_1, \dots, I_p). \tag{5.2} \]
Let us consider a dynamical system that is integrable in the broad sense and has compact invariant submanifolds. In view of Proposition 3, this system has form (5.1) in the toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ for arbitrary dimensions $p$ and $q$, $p + q = n$. We suppose that system (5.1) is $\mathbb{T}^q$-dense. This assumption does not cause any loss of generality, see [8], Sect. 12.

Theorem 2, part 1. 1) In a toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ for $n = p + q$, a closed 2-form $\omega$ is invariant with respect to the $\mathbb{T}^q$-dense smooth dynamical system (5.1) if and only if it has the form
\[ \omega_c = dF_\alpha(I) \wedge d\varphi_\alpha + df_j(I) \wedge dI_j + c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta, \tag{5.3} \]
where the functions $f_j(I)$ are arbitrary, and the functions $F_\alpha(I)$ and frequencies $\omega_\alpha(I)$ satisfy the equations
\[ \omega_\alpha(I)\, dF_\alpha(I) = dH(I), \tag{5.4} \]
\[ c_{\alpha\beta}\, \omega_\beta(I) = c_\alpha, \tag{5.5} \]
where $H(I)$ is a smooth function, the constant coefficients $c_{\alpha\beta}$ form a skew-symmetric $q \times q$ matrix, $c_\alpha$ are constants, $\alpha, \beta = 1, \dots, q$ and $j = 1, \dots, p$.

2) The dynamical system (5.1) that preserves the non-degenerate symplectic structure $\omega_c$ (5.3) ($n = 2k$) has the Hamiltonian form
\[ \dot x^\tau = P_c^{\tau\nu} \theta_\nu, \qquad P_c = \omega_c^{-1}, \tag{5.6} \]
where $\theta$ is the closed 1-form
\[ \theta = dH(I) + c_\alpha\, d\varphi_\alpha. \tag{5.7} \]

3) The condition
\[ \operatorname{rank} \| c_{\alpha\beta} \| \ge q - p \tag{5.8} \]
is necessary for the non-degeneracy of the 2-form $\omega_c$ for $q > p$. If
\[ \operatorname{rank} \| c_{\alpha\beta} \| = q - p, \tag{5.9} \]
then the tori $\mathbb{T}^q$ are coisotropic. If
\[ \operatorname{rank} \| c_{\alpha\beta} \| > q - p, \tag{5.10} \]
then the tori $\mathbb{T}^q$ are not Lagrangian, isotropic or coisotropic.
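Proof. (1) Calculating the Lie derivative $L_V \omega_c = \dot\omega_c$ with respect to the dynamical system (5.1), and using Eqs. (5.4) and (5.5), we obtain
\[ \dot\omega_c = dF_\alpha(I) \wedge d\omega_\alpha(I) + 2\, d\varphi_\alpha \wedge d\big(c_{\alpha\beta}\, \omega_\beta(I)\big) = - d \circ dH(I) + 2\, d\varphi_\alpha \wedge dc_\alpha = 0. \]
Therefore all 2-forms $\omega_c$ (5.3) are invariant with respect to the dynamical system (5.1).

Let us prove that any closed 2-form $\omega$ that is invariant with respect to the dynamical system (5.1) has form (5.3). In the coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$, any differential 2-form $\omega$ is defined by the expression
\[ \omega = a_{j\ell}(I, \varphi)\, dI_j \wedge dI_\ell + b_{j\alpha}(I, \varphi)\, dI_j \wedge d\varphi_\alpha + c_{\alpha\beta}(I, \varphi)\, d\varphi_\alpha \wedge d\varphi_\beta, \tag{5.11} \]
\[ a_{j\ell}(I, \varphi) = -a_{\ell j}(I, \varphi), \qquad c_{\alpha\beta}(I, \varphi) = -c_{\beta\alpha}(I, \varphi), \]
where $a_{j\ell}(I, \varphi)$, $b_{j\alpha}(I, \varphi)$ and $c_{\alpha\beta}(I, \varphi)$ are smooth functions defined in the toroidal domain $O \subset M^n$ and $j, \ell = 1, \dots, p$, $\alpha, \beta = 1, \dots, q$. The invariant closed 2-form (5.11) satisfies the following two equations: $L_V \omega = \dot\omega = 0$ and $d\omega = 0$. The Lie derivative of the 2-form $\omega$ with respect to the system (5.1) has the form
\[ \dot\omega = \dot a_{j\ell}\, dI_j \wedge dI_\ell + \dot b_{j\alpha}\, dI_j \wedge d\varphi_\alpha + \dot c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta + b_{j\alpha}\, dI_j \wedge d\omega_\alpha(I) + c_{\alpha\beta}\, d\omega_\alpha(I) \wedge d\varphi_\beta + c_{\alpha\beta}\, d\varphi_\alpha \wedge d\omega_\beta(I). \]
Therefore, the invariance equation $L_V \omega = \dot\omega = 0$ is equivalent to the system of equations
\[ 2 \dot a_{j\ell} = b_{\ell\alpha} \frac{\partial \omega_\alpha(I)}{\partial I_j} - b_{j\alpha} \frac{\partial \omega_\alpha(I)}{\partial I_\ell}, \qquad \dot b_{j\alpha} = 2 c_{\alpha\beta} \frac{\partial \omega_\beta(I)}{\partial I_j}, \qquad \dot c_{\alpha\beta} = 0. \tag{5.12} \]
Solutions to the linear triangular dynamical system (5.12) have the form
\[ \begin{aligned} 2 a_{j\ell}(t) &= 2 \tilde c_{\alpha\beta} \frac{\partial \omega_\alpha(I)}{\partial I_j} \frac{\partial \omega_\beta(I)}{\partial I_\ell}\, t^2 + \Big( \tilde b_{\ell\alpha} \frac{\partial \omega_\alpha(I)}{\partial I_j} - \tilde b_{j\alpha} \frac{\partial \omega_\alpha(I)}{\partial I_\ell} \Big) t + 2 \tilde a_{j\ell}(I), \\ b_{j\alpha}(t) &= 2 \tilde c_{\alpha\beta} \frac{\partial \omega_\beta(I)}{\partial I_j}\, t + \tilde b_{j\alpha}(I), \qquad c_{\alpha\beta}(t) = \tilde c_{\alpha\beta}(I_1, \dots, I_p). \end{aligned} \tag{5.13} \]
The coefficients $\tilde a_{j\ell}$, $\tilde b_{j\alpha}$, $\tilde c_{\alpha\beta}$ are first integrals of the dynamical system (5.12). Therefore, using the main property of first integrals (5.2), we obtain that all coefficients $\tilde a_{j\ell}$, $\tilde b_{j\alpha}$, $\tilde c_{\alpha\beta}$ depend on the variables $I_1, \dots, I_p$ only. The components $a_{j\ell}(I, \varphi)$, $b_{j\alpha}(I, \varphi)$ and $c_{\alpha\beta}(I, \varphi)$ of the smooth invariant differential 2-form (5.11) are bounded on any torus $\mathbb{T}^q$. The exact solution (5.13) is bounded for all $t$ if and only if
\[ \tilde b_{\ell\alpha}(I) \frac{\partial \omega_\alpha(I)}{\partial I_j} - \tilde b_{j\alpha}(I) \frac{\partial \omega_\alpha(I)}{\partial I_\ell} = 0, \qquad \tilde c_{\alpha\beta}(I) \frac{\partial \omega_\beta(I)}{\partial I_j} = 0. \tag{5.14} \]
Hence using Eqs. (5.13) and the fact that generic trajectories of the system (5.1) are dense on the tori $\mathbb{T}^q$, we obtain that any invariant 2-form $\omega$ (5.11) has the form
\[ \omega = \tilde a_{j\ell}(I)\, dI_j \wedge dI_\ell + \tilde b_{j\alpha}(I)\, dI_j \wedge d\varphi_\alpha + \tilde c_{\alpha\beta}(I)\, d\varphi_\alpha \wedge d\varphi_\beta. \tag{5.15} \]
For the 2-form (5.15), the equation $d\omega = 0$ splits into the equations
\[ d\big(\tilde a_{j\ell}(I)\, dI_j \wedge dI_\ell\big) = 0, \qquad d\big(\tilde b_{j\alpha}(I)\, dI_j\big) = 0, \qquad d\tilde c_{\alpha\beta}(I) = 0. \]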
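In view of Poincaré's lemma, these equations are equivalent to the equations
\[ \tilde a_{j\ell}(I)\, dI_j \wedge dI_\ell = d\big(f_m(I)\, dI_m\big), \qquad \tilde b_{j\alpha}(I)\, dI_j = dF_\alpha(I) \tag{5.16} \]
and $\tilde c_{\alpha\beta}(I) = c_{\alpha\beta} = \mathrm{const}$. Here $f_m(I_1, \dots, I_p)$ and $F_\alpha(I_1, \dots, I_p)$ are some smooth functions. After the substitution of $\tilde c_{\alpha\beta}(I) = c_{\alpha\beta} = \mathrm{const}$, the latter equation in (5.14) implies
\[ c_{\alpha\beta}\, \omega_\beta(I) = c_\alpha = \mathrm{const}. \tag{5.17} \]
Substituting formulae (5.16) and $\tilde c_{\alpha\beta}(I) = c_{\alpha\beta}$ into (5.15), we obtain
\[ \omega = dF_\alpha(I) \wedge d\varphi_\alpha + df_j(I) \wedge dI_j + c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta. \]
Calculating the Lie derivative $L_V \omega = \dot\omega = 0$ again, we get
\[ \dot\omega = dF_\alpha(I) \wedge d\omega_\alpha(I) + 2\, d\varphi_\alpha \wedge d\big(c_{\alpha\beta}\, \omega_\beta(I)\big) = - d\big(\omega_\alpha(I)\, dF_\alpha(I)\big) = 0 \]
in view of Eqs. (5.17). Therefore, the Poincaré lemma implies $\omega_\alpha(I)\, dF_\alpha(I) = dH(I)$, where $H(I_1, \dots, I_p)$ is some smooth function. Thus we have proved that any invariant closed 2-form $\omega$ has form (5.3) and that Eqs. (5.4) and (5.5) are satisfied.

(2) Any dynamical system that preserves a symplectic structure $\omega_c$ has the form (5.6) with some closed 1-form $\theta$ [1]. Equations (5.6) imply
\[ \omega_{c\,\mu\tau}\, \dot x^\tau = \theta_\mu. \tag{5.18} \]
Let $\dot x^\tau = V^\tau$ be the vector field of the dynamical system (5.1):
\[ V^j = 0, \qquad V^{\alpha+p} = \omega_\alpha(I). \]
Equation (5.18) takes the form $\omega V = \theta$. The 2-form $\omega_c$ (5.3) has the following components:
\[ 2 \omega_{c.j.\ell} = \frac{\partial f_\ell(I)}{\partial I_j} - \frac{\partial f_j(I)}{\partial I_\ell}, \qquad \omega_{c.j.\alpha+p} = \frac{\partial F_\alpha(I)}{\partial I_j}, \qquad \omega_{c.\alpha+p.\beta+p} = c_{\alpha\beta}. \tag{5.19} \]
Hence the components of the 1-form $\theta = \omega V = \theta_j\, dI_j + \theta_{p+\alpha}\, d\varphi_\alpha$ have the form
\[ \theta_j = \omega_\alpha(I) \frac{\partial F_\alpha(I)}{\partial I_j}, \qquad \theta_{p+\alpha} = c_{\alpha\beta}\, \omega_\beta(I). \]
These expressions and Eqs. (5.4) and (5.5) imply the formula (5.7).

(3) Let $\pi_c$ be the rectangular $q \times (p + q)$ matrix formed by the last $q$ rows of the matrix $\omega_c$ (5.3). The non-degeneracy of the matrix $\omega_c$ implies $\operatorname{rank} \| \pi_c \| = q$. Hence there exist $q$ linearly independent columns of the $q \times (p + q)$ matrix $\pi_c$. Therefore at least $q - p$ linearly independent columns belong to the matrix $c_{\alpha\beta} \subset \pi_c$. Hence the necessary condition (5.8) follows. For the non-degenerate symplectic structure $\omega_c$, the $\omega_c$-orthogonal space $T_x(\mathbb{T}^q)^\perp$ has dimension $p$. For the null space $N$ of the constant skew-symmetric $q \times q$ matrix $c_{\alpha\beta}$, we have
\[ N = T_x(\mathbb{T}^q) \cap T_x(\mathbb{T}^q)^\perp. \]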
Therefore if $\dim N = r = p$, then $N = T_x(\mathbb{T}^q)^\perp$. Hence if Eq. (5.9) holds then the inclusion $T_x(\mathbb{T}^q)^\perp \subset T_x(\mathbb{T}^q)$ is realized and the tori $\mathbb{T}^q$ are coisotropic. If Eq. (5.10) holds (equivalently $r < p < q$) then
\[ \dim N = r < p, \qquad N \ne T_x(\mathbb{T}^q)^\perp. \]
This means that the tori $\mathbb{T}^q$ are not Lagrangian, isotropic or coisotropic if $r < p < q$.

6. Canonical Forms for the Symplectic Structures in the Toroidal Domains

I. We use the standard Euclidean scalar product in $\mathbb{R}^q$: $(x, y) = x^1 y^1 + \cdots + x^q y^q$. The null-space $N \subset \mathbb{R}^q$ of the skew-symmetric matrix $c_{\alpha\beta}$ has the dimension $r = q - \operatorname{rank} \| c_{\alpha\beta} \|$. The non-degeneracy condition (5.8) implies that $r \le p$. Let the vectors $a_1, \dots, a_r, b_1, \dots, b_{q-r} \in \mathbb{R}^q$ form an orthonormal basis where $a_\ell \in N$ and $b_m \in N^\perp$:
\[ c_{\alpha\beta}\, a_\ell^\beta = 0, \qquad (a_\ell, a_j) = \delta_{j\ell}, \qquad (b_i, b_m) = \delta_m^i, \qquad (a_\ell, b_m) = 0. \tag{6.1} \]
For the skew-symmetric matrix $c$ we have $(ca, b) = -(a, cb)$. Using Eqs. (6.1) and $c(N) = 0$, we obtain $c(N^\perp) = N^\perp$. Hence the operator $c$ is an isomorphism on its invariant subspace $N^\perp$; in other words, the $q - r$ vectors $c b_m \in N^\perp$ are linearly independent. The vector-function $F(I) = (F_1(I), \dots, F_q(I)) \in \mathbb{R}^q$ has the form
\[ F(I) = \sum_{\ell=1}^{r} (F(I), a_\ell)\, a_\ell + \sum_{m=1}^{q-r} g_m(I)\, b_m, \tag{6.2} \]
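where $g_m = (F(I), b_m)$. Let the $p$ vectors $d_j \in \mathbb{R}^p$ have coordinates $d_j^i = \delta_j^i$, $i, j = 1, \dots, p$. The $p$ vectors $d_j \in \mathbb{R}^p$ and the $q$ vectors $a_\ell, b_m \in \mathbb{R}^q$ form a basis in the Euclidean space $\mathbb{R}^p \oplus \mathbb{R}^q$.

Theorem 2, part 2. The non-degenerate symplectic structures $\omega_c$ (5.3) in the toroidal domains $O = B_a \times \mathbb{T}^q \subset M^{2k}$ have the $k(k + 1)/2$ canonical forms
\[ \omega_c = \sum_{\alpha=1}^{q} \sum_{\ell=1}^{r} a_\ell^\alpha\, dI_\ell \wedge d\varphi_\alpha + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta + \sum_{j=1}^{(p-r)/2} dI_{r+j} \wedge dI_{h+j} \tag{6.3} \]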
which are not isomorphic to the Liouville canonical form (1.2). Here $a_\ell \in \mathbb{R}^q$ are $r$ orthonormal null-vectors (6.1) of the $q \times q$ skew-symmetric matrix $c_{\alpha\beta}$, $r = q - \operatorname{rank} \| c_{\alpha\beta} \|$, and $h = (p + r)/2$.

Proof. (i) We present a sequence of transformations of the coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$ which transform the symplectic structure $\omega_c$ (5.3) into one of the canonical forms (6.3). Let us prove that the $r$ functions $(F(I), a_\ell)$ (6.2) are functionally independent, $\ell = 1, \dots, r \le p$. Indeed, multiplying the matrix of the non-degenerate 2-form $\omega_c$ (5.19) with the vectors $a_\ell$, we obtain the $r$ linearly independent vectors
\[ \omega_c a_\ell = \Big( \frac{\partial F_\alpha(I)}{\partial I_1} a_\ell^\alpha, \dots, \frac{\partial F_\alpha(I)}{\partial I_p} a_\ell^\alpha, 0, \dots, 0 \Big). \]
Hence we get
\[ \operatorname{rank} \Big\| \frac{\partial (F(I), a_\ell)}{\partial I_j} \Big\| = r. \]
Therefore new coordinates $J_i = J_i(I_1, \dots, I_p)$, $i = 1, \dots, p$, exist such that the first $r$ coordinates $J_1, \dots, J_r$ have the form
\[ J_1 = (F(I), a_1), \quad \dots, \quad J_r = (F(I), a_r). \tag{6.4} \]
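In view of (6.2) and (6.4), the symplectic structure $\omega_c$ (5.3) takes the form
\[ \omega_c = \Big( \sum_{\ell=1}^{r} a_\ell^\alpha\, dJ_\ell + \sum_{m=1}^{q-r} b_m^\alpha\, dg_m(J) \Big) \wedge d\varphi_\alpha + c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta + df_i(J) \wedge dJ_i. \tag{6.5} \]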
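The matrix $c_{\alpha\beta}$ defines an isomorphism of the linear space $N^\perp \subset \mathbb{R}^q$, $c(N^\perp) = N^\perp$. Therefore there exist $q - r$ vectors $d_m \in N^\perp$ that satisfy the equations
\[ b_m^\alpha = c_{\alpha\beta}\, d_m^\beta, \qquad m = 1, \dots, q - r. \tag{6.6} \]
Let us introduce the new angular coordinates
\[ \varphi_\alpha^1 = \varphi_\alpha - \frac{1}{2}\, g_m(J)\, d_m^\alpha \tag{6.7} \]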
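on the tori $\mathbb{T}^q$, $\alpha = 1, \dots, q$. After the substitution of Eqs. (6.6) and (6.7), formula (6.5) takes the form
\[ \omega_c = \sum_{\alpha=1}^{q} \sum_{\ell=1}^{r} a_\ell^\alpha\, dJ_\ell \wedge d\varphi_\alpha^1 + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha^1 \wedge d\varphi_\beta^1 + \eta, \tag{6.8} \]
\[ \eta = \frac{1}{4}\, b_m^\alpha d_j^\alpha\, dg_m(J) \wedge dg_j(J) + df_i(J) \wedge dJ_i. \]
The 2-form $\eta$ is closed and has the general form
\[ \eta = \sum_{\ell=1}^{r} \sum_{m=1}^{p} f_{\ell m}(J)\, dJ_\ell \wedge dJ_m + \sum_{a,b=r+1}^{p} f_{ab}(J)\, dJ_a \wedge dJ_b. \tag{6.9} \]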
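Let us prove that the 2-form
\[ \eta_0 = \sum_{a,b=r+1}^{p} f_{ab}(J)\, dJ_a \wedge dJ_b \tag{6.10} \]
is non-degenerate. For a given point $J_0$, we define the following transformation of the angular coordinates:
\[ \varphi_\alpha^0 = \varphi_\alpha^1 + \sum_{i=1}^{r} \sum_{m=1}^{p} a_i^\alpha f_{im}(J_0)\, J_m. \]
In the coordinates $J_i$, $\varphi_\alpha^0$, the 2-form $\omega_c(J_0)$ takes the form
\[ \omega_c(J_0) = \sum_{\ell=1}^{r} \sum_{\alpha=1}^{q} a_\ell^\alpha\, dJ_\ell \wedge d\varphi_\alpha^0 + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha^0 \wedge d\varphi_\beta^0 + \sum_{a,b=r+1}^{p} f_{ab}(J_0)\, dJ_a \wedge dJ_b. \]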
In this formula, only the last sum depends on the $p - r$ variables $J_{r+1}, \dots, J_p$. Hence the non-degeneracy of the 2-form $\omega_c(J_0)$ implies the non-degeneracy of the 2-form $\eta_0$ (6.10). We consider the 2-form $\eta_0$ as a family $\eta_0(J_1, \dots, J_r)$ of smooth non-degenerate 2-forms in the Euclidean space $\mathbb{R}^{p-r}$ which depend on the $r$ parameters $J_1, \dots, J_r$. The equation $d\eta = 0$ implies $d\eta_0(c_1, \dots, c_r) = 0$ for any constants $J_1 = c_1, \dots, J_r = c_r$. Hence the 2-forms $\eta_0(c_1, \dots, c_r)$ are closed and non-degenerate in $\mathbb{R}^{p-r}$. The integer $p - r = p - q + \operatorname{rank} \| c_{\alpha\beta} \| = 2k - 2q + \operatorname{rank} \| c_{\alpha\beta} \|$ is even. Applying Darboux's theorem, we obtain that the 2-form $\eta_0(c_1, \dots, c_r)$ has the canonical form
\[ \eta_0(c) = \sum_{j=1}^{(p-r)/2} d_c I_{r+j} \wedge d_c I_{h+j} \tag{6.11} \]
in some coordinates
\[ I_{r+1} = I_{r+1}(c_i, J_{r+1}, \dots, J_p), \quad \dots, \quad I_p = I_p(c_i, J_{r+1}, \dots, J_p). \tag{6.12} \]
Here $d_c$ is the differential at the constant values of the parameters $J_1 = c_1, \dots, J_r = c_r$ and $h = (p + r)/2$. The construction of the canonical coordinates $I_{r+1}(c, J), \dots, I_p(c, J)$ in the proof of the Darboux theorem [1] depends smoothly on the parameters $c_1, \dots, c_r$ because the 2-form $\eta_0(c)$ is a smooth function of $c_1, \dots, c_r$. Using the functions (6.12), we define the system of local coordinates
\[ I_1 = J_1, \dots, I_r = J_r, \qquad I_{r+1} = I_{r+1}(J_1, \dots, J_p), \dots, I_p = I_p(J_1, \dots, J_p) \]
in the Euclidean space $\mathbb{R}^p$. In view of Eq. (6.11), the 2-form $\eta$ (6.9) takes the form
\[ \eta_1 = \sum_{\ell=1}^{r} \sum_{a=r+1}^{p} f_{\ell a}(I)\, dI_\ell \wedge dI_a + \sum_{\ell,j=1}^{r} f_{\ell j}(I)\, dI_\ell \wedge dI_j + \sum_{j=1}^{(p-r)/2} dI_{r+j} \wedge dI_{h+j} \]
in the new coordinates $I_1, \dots, I_p$.

(ii) Let us apply to the 2-form $\omega_c$ (6.8) a sequence of transformations of the angular coordinates
\[ \varphi_\alpha^{m+1} = \varphi_\alpha^m + \sum_{j=1}^{r} g_j^m(I)\, a_j^\alpha, \tag{6.13} \]
where $g_j^m(I)$ are some smooth functions of the $p$ variables $I_1, \dots, I_p$, and $\alpha = 1, \dots, q$. After the transformation (6.13), the 2-form $\omega_c$ takes the form
\[ \omega_c = \sum_{\alpha=1}^{q} \sum_{\ell=1}^{r} a_\ell^\alpha\, dI_\ell \wedge d\varphi_\alpha^{m+1} + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha^{m+1} \wedge d\varphi_\beta^{m+1} + \eta_{m+1}, \]
where the closed 2-form $\eta_{m+1}$ has the form
\[ \eta_{m+1} = \eta_m - \sum_{\ell=1}^{r} dI_\ell \wedge dg_\ell^m(I) = \sum_{\ell=1}^{r} \sum_{a=r+1}^{p} f_{\ell a}^{m+1}(I)\, dI_\ell \wedge dI_a + \sum_{i,j=1}^{r} f_{ij}^{m+1}(I)\, dI_i \wedge dI_j + \sum_{j=1}^{(p-r)/2} dI_{r+j} \wedge dI_{h+j}. \tag{6.14} \]
We present a sequence of $p - r$ transformations (6.13) which annihilates the rectangular $r \times (p - r)$ matrix $f_{\ell a}^{p-r+1}(I)$. Applying the transformation
\[ \varphi_\alpha^2 = \varphi_\alpha^1 + g_j^1(I)\, a_j^\alpha, \qquad g_j^1(I) = \int_c^{I_p} f_{jp}(I_1, \dots, I_{p-1}, x)\, dx \]
to the symplectic structure (6.8), we obtain that the coefficients $f_{\ell p}^2(I)$ of the 2-form $\eta_2$ (6.14) vanish for all $\ell$: $1 \le \ell \le r$. The equation $d\eta_2 = 0$ implies that the remaining coefficients $f_{\ell a}^2(I)$ do not depend on the variable $I_p$.

By induction, we suppose that the $m - 1$ right columns of the rectangular matrix $f_{\ell a}^m(I)$ are equal to zero and all its entries do not depend on the $m - 1$ variables $I_{p-m+2}, \dots, I_p$. Applying the transformation
\[ \varphi_\alpha^{m+1} = \varphi_\alpha^m + g_j^m(I)\, a_j^\alpha, \qquad g_j^m(I) = \int_c^{I_{p-m+1}} f_{j.p-m+1}^m(I_1, \dots, I_{p-m}, x)\, dx \tag{6.15} \]
to the symplectic structure (6.8), we obtain that the coefficients $f_{\ell.p-m+1}^{m+1}$ of the 2-form $\eta_{m+1}$ (6.14) vanish for all $\ell$: $1 \le \ell \le r$. The transformation (6.15) does not depend on the variables $I_{p-m+2}, \dots, I_p$. Therefore it does not change the $m - 1$ zero right columns of the matrix $f_{\ell a}^m$. Hence the matrix $f_{\ell a}^{m+1}$ has $m$ zero right columns for $a = p - m + 1, \dots, p$. The equation $d\eta_{m+1} = 0$ implies that all entries $f_{\ell a}^{m+1}$ do not depend on the $m$ variables $I_{p-m+1}, \dots, I_p$. By induction, we obtain after the $p - r$ subsequent transformations (6.13) the 2-form
\[ \eta_{p-r+1} = \sum_{i,j=1}^{r} f_{ij}^{p-r+1}(I)\, dI_i \wedge dI_j + \sum_{j=1}^{(p-r)/2} dI_{r+j} \wedge dI_{h+j}. \tag{6.16} \]
(iii) We now construct a transformation (6.13) that converts the symplectic structure $\omega_c$ (6.8) to the canonical form (6.3). The equation $d\eta_{p-r+1} = 0$ implies that the coefficients $f_{ij}^{p-r+1}(I)$ (6.16) do not depend on the $p - r$ variables $I_a$, $r + 1 \le a \le p$. Applying the Poincaré lemma to the closed 2-form (6.16), we obtain
\[ \sum_{i,j=1}^{r} f_{ij}^{p-r+1}(I)\, dI_i \wedge dI_j = d \sum_{j=1}^{r} g_j(I)\, dI_j = - \sum_{j=1}^{r} dI_j \wedge dg_j(I), \]
where $g_j(I_1, \dots, I_r)$ are some smooth functions. The 2-form $\omega_c$ (6.8) takes the canonical form (6.3) in the angular coordinates $\varphi_\alpha = \varphi_\alpha^{p-r+1} - g_j(I)\, a_j^\alpha$.

(iv) Let us calculate the total number of the canonical forms (6.3) on a symplectic manifold $M^n$, $n = 2k$, that are not isomorphic to the Liouville canonical form (1.2).
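The dimension $q \ge 2$ of the invariant tori $\mathbb{T}^q$ ranges over $2, \dots, 2k$. For $k < q \le 2k$ we have $q - p \le \operatorname{rank} \| c_{\alpha\beta} \| \le q$. Therefore $\operatorname{rank} \| c_{\alpha\beta} \|$ takes $[p/2] + 1$ different even values. For $2 \le q \le k$, $\operatorname{rank} \| c_{\alpha\beta} \|$ takes $[q/2]$ even values between 2 and $q$. Hence the number $N$ of the canonical forms (6.3) is
\[ N = \sum_{p=0}^{k-1} \Big( \Big[\frac{p}{2}\Big] + 1 \Big) + \sum_{q=2}^{k} \Big[\frac{q}{2}\Big] = \frac{k(k+1)}{2}. \]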
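The counting argument above can be reproduced by brute force. The following Python sketch enumerates the admissible pairs $(q, v)$ with $v = \operatorname{rank} \| c_{\alpha\beta} \|$ according to the two cases in the text and compares the count with both the double sum and the closed form $k(k+1)/2$. The function names are illustrative assumptions.

```python
def admissible_pairs(k):
    """Pairs (q, v) of torus dimension and even rank v, following the two cases in the text."""
    pairs = []
    for q in range(2, 2 * k + 1):
        p = 2 * k - q
        if q > k:
            low = q - p          # equals 2(q - k), an even number
        else:
            low = 2              # for 2 <= q <= k the rank runs over even values from 2
        pairs.extend((q, v) for v in range(low, q + 1) if v % 2 == 0)
    return pairs


def count_by_formula(k):
    """The double sum from the text: sum over p of [p/2] + 1, plus sum over q of [q/2]."""
    first = sum(p // 2 + 1 for p in range(0, k))
    second = sum(q // 2 for q in range(2, k + 1))
    return first + second


counts = {k: (len(admissible_pairs(k)), count_by_formula(k), k * (k + 1) // 2)
          for k in range(1, 21)}
```

For each $k$ the three entries of `counts[k]` should coincide; for example, $k = 2$ gives the three pairs $(2, 2)$, $(3, 2)$ and $(4, 4)$.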
It is evident that the canonical forms (6.3) are not isomorphic to each other and to the Liouville canonical form (1.2) because they define different classes in the de Rham cohomologies $H^*(O) = H^*(\mathbb{T}^q)$, $O = B_a \times \mathbb{T}^q$. Therefore the integer parameters $q$ and $v = \operatorname{rank} \| c_{\alpha\beta} \|$ in the classification (6.3) are topological invariants.

7. Canonical Forms for the Integrable Hamiltonian Systems

Theorem 2, part 3. 1) A $\mathbb{T}^q$-dense dynamical system (5.1) preserves the non-degenerate symplectic structure $\omega_c$ (6.3) if and only if it has the canonical form
\[ \dot I_1 = 0, \dots, \dot I_p = 0, \qquad \dot\varphi_\alpha = \sum_{\ell=1}^{r} \frac{\partial H(I)}{\partial I_\ell}\, a_\ell^\alpha + b_0^\alpha, \tag{7.1} \]
where $H(I)$ is an arbitrary smooth function of the $r$ variables $I_1, \dots, I_r$, the vectors $a_\ell \in \mathbb{R}^q$ satisfy Eqs. (6.1) and $(a_\ell, b_0) = 0$. The system (7.1) has the Hamiltonian form
\[ \dot x^\tau = (\omega_c^{-1})^{\tau\nu} \theta_\nu, \qquad \theta = dH(I) + c_{\alpha\beta}\, b_0^\beta\, d\varphi_\alpha. \tag{7.2} \]
Equations (7.1) for $b_0 = 0$ present the canonical forms for the $\mathbb{T}^q$-dense Hamiltonian systems (5.1) which possess a global Hamiltonian function.

2) Suppose that the function $H(I_1, \dots, I_r)$ is generic in the sense that it does not satisfy any linear equation
\[ c_1 \frac{\partial H(I)}{\partial I_1} + \cdots + c_r \frac{\partial H(I)}{\partial I_r} = c_0 \tag{7.3} \]
with constant coefficients $c_0$, $c_\ell$ in any ball $B_0 \subset \mathbb{R}^r$. Then system (7.1) is $\mathbb{T}^q$-dense if and only if the image space $C \subset \mathbb{R}^q$ for the matrix $c_{\alpha\beta}$ contains no integer vectors $m = (m_1, \dots, m_q)$ orthogonal to the vector $b_0$.

Proof. 1) Let us prove that any system (7.1) preserves the 2-form $\omega_c$ (6.3). Differentiating the 2-form $\omega_c$ with respect to the dynamical system (7.1) and using Eqs. (6.1), we obtain
\[ \dot\omega_c = a_\ell^\alpha\, dI_\ell \wedge d\Big( \frac{\partial H(I)}{\partial I_m} a_m^\alpha \Big) + 2 c_{\alpha\beta}\, d\varphi_\alpha \wedge d\Big( \frac{\partial H(I)}{\partial I_m} a_m^\beta \Big) = - d \circ dH = 0. \]
Let us prove that any dynamical system (5.1) that preserves the symplectic structure (6.3) has the form (7.1). Theorem 2, part 1 yields that the frequencies $\omega_\alpha(I)$ (5.1) satisfy Eqs. (5.4) and (5.5). Any solution $\omega_\alpha(I)$ of Eqs. (5.5) has the form
\[ \omega_\alpha(I) = \sum_{\ell=1}^{r} \rho_\ell(I)\, a_\ell^\alpha + b_0^\alpha, \tag{7.4} \]
where $\rho_\ell(I)$ are some smooth functions and $c_{\alpha\beta}\, b_0^\beta = c_\alpha$. Equations (5.4), after substituting the formulae $F_\alpha(I) = a_m^\alpha I_m$ and (7.4), take the form
\[ \omega_\alpha(I)\, dF_\alpha(I) = \sum_{\ell=1}^{r} \rho_\ell(I)\, dI_\ell = dH(I). \tag{7.5} \]
Equation (7.5) implies that the function $H(I)$ depends on the $r$ variables $I_1, \dots, I_r$ only and the $r$ functions $\rho_\ell(I)$ satisfy the equations
\[ \rho_\ell(I) = \frac{\partial H(I)}{\partial I_\ell}, \qquad \ell = 1, \dots, r. \]
Therefore the dynamical system (5.1) takes the form (7.1). The Hamiltonian form (7.2) follows from Theorem 2, part 1.

2) Let us prove by contradiction that if the image space $C \subset \mathbb{R}^q$ contains no integer vectors $m$ that satisfy the equation $(m, b_0) = 0$, then system (7.1) is $\mathbb{T}^q$-dense. Suppose not. Then using Proposition 4, we obtain that there exists a non-zero integer vector $m = (m_1, \dots, m_q)$ such that the equation
\[ (m, \omega_\alpha(I)) = \sum_{\ell=1}^{r} \frac{\partial H(I)}{\partial I_\ell}\, (m, a_\ell) + (m, b_0) = 0 \tag{7.6} \]
holds for all points $I$ in some ball $B_1 \subset \mathbb{R}^p$. We have supposed that the function $H(I)$ does not satisfy any Eq. (7.3). Therefore Eq. (7.6) implies that the $r + 1$ coefficients $(m, a_\ell)$, $(m, b_0)$ vanish. The equations $(m, a_\ell) = 0$ mean that the integer vector $m$ is orthogonal to the null space $N$ and hence belongs to the image space of the skew-symmetric operator $c$. Hence the equation $(m, b_0) = 0$ gives a contradiction. If the image space $C \subset \mathbb{R}^q$ contains an integer vector $m$ that satisfies the equation $(m, b_0) = 0$, then Eqs. (7.1) imply $(m, \omega_\alpha(I)) = 0$. Hence the dynamical system (7.1) is not $\mathbb{T}^q$-dense. This completes the proof of the three parts of Theorem 2.

I. In Definition 1, it is supposed that the dynamical system $\dot x^\tau = V^\tau(x^1, \dots, x^n)$ possesses an abelian $(n - p)$-dimensional Lie algebra $S_a$ of symmetries which may be non-symplectic. Theorem 2 implies that any Hamiltonian system (1.1) with compact invariant submanifolds that is integrable in the broad sense possesses another abelian $(n - p)$-dimensional Lie algebra $G$ of symmetries which are symplectic. Indeed, in view of Theorem 2, such a system is reduced to a canonical form (7.1) that has the symmetries $U_\alpha = \partial/\partial\varphi_\alpha = (\omega_c^{-1})^{\alpha\nu} \theta_{\alpha.\nu}$, where $\theta_\alpha = a_\ell^\alpha\, dI_\ell + c_{\beta\alpha}\, d\varphi_\beta$, $d\theta_\alpha = 0$, $\alpha = 1, \dots, n - p$. The symmetries $U_\alpha$ define a symplectic action of the torus $\mathbb{T}^{n-p}$ that is not a Poisson action.

II. The Liouville canonical form (1.2) is a particular case of the general canonical forms (6.3), (7.1) for $a_\ell^\alpha = \delta_\ell^\alpha$, $c_{\alpha\beta} = 0$, $p = q = r = k$. There are $k - 1$ canonical forms (6.3) with coisotropic tori $\mathbb{T}^q$ for which $r = p$, $\operatorname{rank} \| c_{\alpha\beta} \| = q - p$, $k < q < 2k$; they have the form
\[ \omega_c = \sum_{\alpha=1}^{q} \sum_{\ell=1}^{p} a_\ell^\alpha\, dI_\ell \wedge d\varphi_\alpha + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta. \]
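There are $k$ canonical forms (6.3) for which $q$ is even and the skew-symmetric matrix $c_{\alpha\beta}$ is non-degenerate, $\operatorname{rank} \| c_{\alpha\beta} \| = q$; these symplectic structures (6.3) have the form
\[ \omega_c = \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta + \sum_{j=1}^{p/2} dI_j \wedge dI_{p/2+j}. \]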
For $q = 2k$, $p = 0$ we have $M^{2k} = \mathbb{T}^{2k}$. For the remaining $(k - 1)(k - 2)/2$ canonical forms (6.3) we have $q - p < \operatorname{rank} \| c_{\alpha\beta} \| < q$. Hence $0 < r < p$ and Eq. (6.3) implies that the tangent vectors $\partial/\partial I_{r+j}$ are $\omega_c$-orthogonal to $T_x(\mathbb{T}^q)$. Therefore the tori $\mathbb{T}^q$ are not coisotropic and the matrix $c_{\alpha\beta}$ is degenerate. The tori $\mathbb{T}^q$ are not Lagrangian or isotropic either.

III. The Poisson brackets of the functions $I_j$, $I_\ell$ have the form $\{I_j, I_\ell\} = -\omega_c^{-1} dI_j(I_\ell)$. The vector field $V_j = \omega_c^{-1} dI_j$ satisfies the equations $\omega_{c.\tau\nu} V_j^\nu = I_{j,\tau}$, where $\tau, \nu = 1, \dots, 2k$. Using Eqs. (6.3) and (6.1), we obtain that the vector field $V_j$ for $1 \le j \le r$ has the form
\[ V_j = a_j^1 \frac{\partial}{\partial\varphi_1} + \cdots + a_j^q \frac{\partial}{\partial\varphi_q}. \]
Hence the equations $V_j(I_\ell) = 0$ follow for $1 \le \ell \le p$. Therefore the $r$ variables $I_1, \dots, I_r$ have zero Poisson brackets with all variables $I_1, \dots, I_p$. If $\operatorname{rank} \| c_{\alpha\beta} \| = q - p$ then $r = p$ and hence the $p$ variables $I_1, \dots, I_p$ are in involution. In this case all first integrals of any $\mathbb{T}^q$-dense Hamiltonian system (7.1) are in involution. If $\operatorname{rank} \| c_{\alpha\beta} \| > q - p$ then $r < p$. In this case, the variables $I_{r+j}$ and $I_{h+j}$ are not in involution: $\{I_{r+j}, I_{h+j}\} = 1$, $1 \le j \le (p - r)/2$, $h = (p + r)/2$. Hence the Lie-Poisson algebra of first integrals of any $\mathbb{T}^q$-dense Hamiltonian system (7.1) is non-abelian if $r < p$.

8. Abelian Lie Algebras of Symmetries

Let us consider a $\mathbb{T}^q$-dense dynamical system with constant coefficients:
\[ \dot I_1 = 0, \dots, \dot I_p = 0, \qquad \dot\varphi_1 = \omega_1, \dots, \dot\varphi_q = \omega_q. \tag{8.1} \]
It is evident that system (8.1) preserves all symplectic structures (6.3). Hence the invariant $v = \operatorname{rank} \| c_{\alpha\beta} \|$ is not uniquely defined for the general integrable system (7.1). In Proposition 2 of paper [8], we have proved that the maximal integer
\[ r_m = \operatorname{rank} \Big\| \frac{\partial \omega_\alpha(I)}{\partial I_j} \Big\| \]
is an invariant of any system (5.1) which does not depend on the choice of the toroidal coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$. For the canonical form (7.1) for $q \ge k$, we obtain
\[ r_m = \operatorname{rank} \Big\| \frac{\partial^2 H(I)}{\partial I_j\, \partial I_\ell} \Big\| \le r = q - v, \tag{8.2} \]
because the vectors $a_1, \dots, a_r \in \mathbb{R}^q$ are linearly independent. Hence, applying the inequality (5.8), we find
\[ q - p \le v \le q - r_m \tag{8.3} \]
for any non-degenerate symplectic structure $\omega_c$ (5.3). The inequalities (8.3) imply that the invariant $v$ is uniquely defined by the system (7.1) if $r_m = p$ or $p - 1$, because the rank $v$ of a skew-symmetric matrix is even and $q - p = 2(q - k)$ is even. Then $v = q - p = 2(q - k)$ and the tori $\mathbb{T}^q$ are coisotropic for $q = k + 1, \dots, 2k$ and Lagrangian for $q = k$.

Definition 5. A Hamiltonian system $V$ (1.1) that is integrable in the broad sense is called non-degenerate if it is $\mathbb{T}^q$-dense, $q \ge k$, and for its canonical form (7.1) the condition
\[ \operatorname{rank} \Big\| \frac{\partial^2 H(I)}{\partial I_j\, \partial I_\ell} \Big\| = 2k - q \tag{8.4} \]
is met in a dense open domain $D \subset B_a$.

For the Hamiltonian systems that are integrable in the Liouville sense ($q = k$), the condition (8.4) coincides with the Poincaré non-degeneracy condition [31]. Applying the inequalities (8.3), we obtain that $v = 2(q - k)$ for any non-degenerate integrable Hamiltonian system.

Theorem 3. A Hamiltonian system that is integrable in the broad sense is non-degenerate if and only if its Lie algebra of symmetries is abelian.

Proof. Theorem 2 implies that the Hamiltonian system under investigation is diffeomorphically equivalent to one of the canonical forms (7.1) in the toroidal domains $O = B_a \times \mathbb{T}^q$. For the systems (7.1), we have
\[ \omega_\alpha(I) = \sum_{\ell=1}^{r} \frac{\partial H(I)}{\partial I_\ell}\, a_\ell^\alpha + b_0^\alpha, \tag{8.5} \]
where $\alpha = 1, \dots, q$ and $r \le p$, $p = 2k - q$. Applying Proposition 1 of paper [8], we obtain that the Lie algebra of symmetries of system (7.1) is abelian if and only if the system is $\mathbb{T}^q$-dense, $q \ge k$, and the non-degeneracy condition
\[ \operatorname{rank} \Big\| \frac{\partial \omega_\alpha(I)}{\partial I_j} \Big\| = 2k - q \tag{8.6} \]
is met in a dense open domain in the ball $B_a$. Hence $r = 2k - q$. For the functions $\omega_\alpha(I)$ (8.5), the non-degeneracy condition (8.6) takes the form (8.4) because the vectors $a_1, \dots, a_p \in \mathbb{R}^q$ are linearly independent.

Theorem 3 provides the invariant geometric interpretation for the non-degeneracy condition (8.4). As a consequence, this condition is invariant and depends neither on the choice of the toroidal coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$ nor on the choice of the Hamiltonian structure.
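The role of the inequalities (8.3) can be illustrated with a short enumeration. The Python sketch below lists the values of $v$ allowed by (8.3) for given $k$, $q$ and $r_m$, using the additional fact that the rank of a skew-symmetric matrix is even. The function name `admissible_ranks` and the sample parameters are assumptions made for illustration.

```python
def admissible_ranks(k, q, r_m):
    """Even values v with q - p <= v <= q - r_m, where p = 2k - q (inequalities (8.3))."""
    p = 2 * k - q
    if not (k <= q <= 2 * k and 0 <= r_m <= p):
        raise ValueError("expected k <= q <= 2k and 0 <= r_m <= p")
    return [v for v in range(q - p, q - r_m + 1) if v % 2 == 0]


# Sample case k = 3, q = 4, so p = 2 and q - p = 2(q - k) = 2.
non_degenerate = admissible_ranks(3, 4, 2)   # r_m = p
almost = admissible_ranks(3, 4, 1)           # r_m = p - 1
degenerate = admissible_ranks(3, 4, 0)       # r_m = p - 2
```

In this example the first two lists contain the single value $v = 2 = 2(q - k)$, while for $r_m = p - 2$ both $v = 2$ and $v = 4$ are allowed, so the invariant $v$ is no longer determined by the system alone.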
9. Applications to Bi-Hamiltonian Systems Theorem 4. Suppose that a Hamiltonian system V is integrable in the broad sense and non-degenerate. Let P1 (x) and P2 (x) be arbitrary V -invariant non-degenerate Poisson structures. Then: i)
The recursion operator $A(x) = P_1(x)P_2^{-1}(x)$ has $2(q - k)$ constant eigenvalues.
ii) If $q = 2k$ or $2k - 1$ then the Poisson structures $P_1(x)$ and $P_2(x)$ are compatible in Magri’s sense.
iii) If $k \le q \le 2k - 2$ then the generic $V$-invariant Poisson structures $P_1(x)$ and $P_2(x)$ are incompatible.

Proof. (i) The system $V$ is Hamiltonian with respect to the Poisson structure $P_1(x)$. Applying Theorem 2, we obtain that the symplectic structure $\omega_1(x) = P_1^{-1}(x)$ has the canonical form
$$\omega_1 = \sum_{\alpha=1}^{q}\sum_{\ell=1}^{p} a^\alpha_\ell\, dI_\ell \wedge d\varphi_\alpha + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta \tag{9.1}$$
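in some toroidal coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$, and $c_{\alpha\beta}\, a^\beta_\ell = 0$ for $\ell = 1, \dots, p$. In these coordinates, the system $V$ has the form
$$\dot I_1 = 0, \ \dots, \ \dot I_p = 0, \qquad \dot\varphi_\alpha = \omega_\alpha(I) = \sum_{\ell=1}^{p} \frac{\partial H(I)}{\partial I_\ell}\, a^\alpha_\ell + b^\alpha_0. \tag{9.2}$$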
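Applying Theorem 2, part 1, we obtain that the $V$-invariant closed 2-form $\omega_2(x) = P_2^{-1}(x)$ has the form
$$\omega_2 = dF_\alpha(I) \wedge d\varphi_\alpha + df_j(I) \wedge dI_j + b_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta, \tag{9.3}$$
where the constant coefficients $b_{\alpha\beta}$ satisfy the equations $b_{\alpha\beta}\,\omega_\beta(I) = c_\alpha$. Using Eqs. (9.2), we obtain
$$b_{\alpha\beta}\,\omega_\beta(I) = b_{\alpha\beta}\, a^\beta_\ell\, \frac{\partial H(I)}{\partial I_\ell} + b_{\alpha\beta}\, b^\beta_0 = c_\alpha. \tag{9.4}$$
Equations (9.4) and the non-degeneracy condition (8.4) imply $b_{\alpha\beta}\, a^\beta_\ell = 0$. Let the orthonormal vectors $a^\alpha_{p+1}, \dots, a^\alpha_{2k}$ be orthogonal to the vectors $a^\alpha_1, \dots, a^\alpha_p$. Let us introduce the local coordinates $\bar\varphi_\beta = a^\alpha_\beta\, \varphi_\alpha$; we have $\varphi_\alpha = a^\alpha_\gamma\, \bar\varphi_\gamma$, where $\alpha, \beta, \gamma = 1, \dots, q$. In these coordinates, the symplectic structures $\omega_1$ (9.1) and $\omega_2$ (9.3) take the form
$$\omega_1 = \sum_{\ell=1}^{p} dI_\ell \wedge d\bar\varphi_\ell + \sum_{\alpha,\beta=1}^{q} \bar c_{\alpha\beta}\, d\bar\varphi_\alpha \wedge d\bar\varphi_\beta, \tag{9.5}$$
$$\omega_2 = \sum_{\alpha=1}^{q} d\bar F_\alpha(I) \wedge d\bar\varphi_\alpha + \sum_{j=1}^{p} df_j(I) \wedge dI_j + \sum_{\alpha,\beta=1}^{q} \bar b_{\alpha\beta}\, d\bar\varphi_\alpha \wedge d\bar\varphi_\beta, \tag{9.6}$$
where $\bar c_{\alpha\beta} = a^\gamma_\alpha\, c_{\gamma\delta}\, a^\delta_\beta$, $\bar b_{\alpha\beta} = a^\gamma_\alpha\, b_{\gamma\delta}\, a^\delta_\beta$. These formulae and the equations $c_{\alpha\beta}\, a^\beta_\ell = 0$, $b_{\alpha\beta}\, a^\beta_\ell = 0$ for $1 \le \ell \le p$ imply $\bar c_{\alpha\beta} = \bar b_{\alpha\beta} = 0$ for $1 \le \alpha \le p$ or $1 \le \beta \le p$. Hence the matrices of the 2-forms $\omega_1$ (9.5) and $\omega_2$ (9.6) have the block forms
$$\omega_1 = \begin{pmatrix} 0 & E & 0 \\ -E & 0 & 0 \\ 0 & 0 & c \end{pmatrix}, \qquad \omega_2 = \begin{pmatrix} L & M & N \\ -M^t & 0 & 0 \\ -N^t & 0 & b \end{pmatrix},$$
where $E$ is the unit $p \times p$ matrix, $L = L(I)$ and $M = M(I)$ are $p \times p$ matrices, $N = N(I)$ is a $p \times (q - p)$ matrix and $c$ and $b$ are constant $(q - p) \times (q - p)$ matrices. Hence we find that the recursion operator $A(I) = P_1(I)P_2^{-1}(I) = \omega_1^{-1}(I)\,\omega_2(I)$ has the block form
$$A(I) = \begin{pmatrix} M^t & 0 & 0 \\ L & M & N \\ -c^{-1}N^t & 0 & c^{-1}b \end{pmatrix}. \tag{9.7}$$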
The characteristic polynomial of the matrix $A(I)$ (9.7) has the form
$$\det\|A(I) - \lambda\| = \det\|c^{-1}b - \lambda\|\,\bigl(\det\|M(I) - \lambda\|\bigr)^2. \tag{9.8}$$
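The factorization (9.8) can be checked mechanically in the smallest non-trivial case. The following Python sketch assumes $p = 1$, $q = 3$ (so $k = 2$), for which the block $L$ vanishes by skew-symmetry; the entries `m`, `n1`, `n2`, `c0`, `b0` are arbitrary sample values chosen for illustration and are not taken from the text. It verifies with exact rational arithmetic that the block form (9.7) satisfies $\omega_1 A = \omega_2$ and that the characteristic polynomial factorizes as in (9.8).

```python
from fractions import Fraction as Fr

def det(mat):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fr(x) for x in row] for row in mat]
    n, sign, prod = len(a), 1, Fr(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fr(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        prod *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return sign * prod

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def char_poly_at(mat, lam):
    """Value of det(mat - lam * identity)."""
    n = len(mat)
    return det([[mat[i][j] - (lam if i == j else 0) for j in range(n)]
                for i in range(n)])

# Hypothetical sample values; coordinates are ordered (I1, phi1, phi2, phi3).
m, n1, n2, c0, b0 = Fr(3), Fr(5), Fr(-2), Fr(4), Fr(7)

omega1 = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, c0], [0, 0, -c0, 0]]
omega2 = [[0, m, n1, n2], [-m, 0, 0, 0], [-n1, 0, 0, b0], [-n2, 0, -b0, 0]]

# Recursion operator assembled from the block form (9.7) with L = 0.
A = [[m, 0, 0, 0],
     [0, m, n1, n2],
     [n2 / c0, 0, b0 / c0, 0],
     [-n1 / c0, 0, 0, b0 / c0]]
cinv_b = [[b0 / c0, 0], [0, b0 / c0]]  # the constant block c^{-1} b

block_form_ok = matmul(omega1, A) == omega2
# A degree-4 polynomial identity checked at 7 points holds identically.
factorization_ok = all(
    char_poly_at(A, lam) == char_poly_at(cinv_b, lam) * (m - lam) ** 2
    for lam in map(Fr, range(-3, 4))
)
print(block_form_ok, factorization_ok)
```

In this sample the constant eigenvalue $b_0/c_0$ has multiplicity $q - p = 2$, in agreement with part (i) of Theorem 4.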
Hence we get that the $q - p = 2(q - k)$ eigenvalues of the matrix $A(I)$ are constant because they coincide with the eigenvalues of the constant matrix $c^{-1}b$.

(ii) If $q = 2k$, $p = 0$, then system (9.2) is the Kronecker flow on the torus $\mathbb{T}^{2k}$. If the frequencies $\omega_1, \dots, \omega_{2k}$ are rationally independent then all invariant Poisson structures are constant and compatible in Magri’s sense. Let $q = 2k - 1$, $p = 1$. In the local coordinates $I_1, \varphi_1, \dots, \varphi_{2k-1}$, the Nijenhuis tensor
$$N^i_{Aj\ell} = \sum_{m=1}^{2k}\left(\frac{\partial A^i_\ell}{\partial x_m} A^m_j - \frac{\partial A^i_j}{\partial x_m} A^m_\ell + \frac{\partial A^m_j}{\partial x_\ell} A^i_m - \frac{\partial A^m_\ell}{\partial x_j} A^i_m\right) \tag{9.9}$$
has the following properties. The components $N^i_{Aj\ell}$, $N^1_{Aj\ell}$, $N^1_{A1\ell}$, for $i > 1$, $j > 1$, $\ell > 1$ vanish identically in view of the block form (9.7). The components $N^i_{A1\ell}$ for $i > 1$, $\ell > 1$ have the form
$$N^i_{A1\ell} = \frac{\partial A^i_\ell}{\partial I_1} A^1_1 - \frac{\partial A^m_\ell}{\partial I_1} A^i_m.$$
In view of (9.7), this expression vanishes for $i > 2$. For $i = 2$, we obtain
$$N^2_{A1\ell} = \frac{\partial A^2_\ell}{\partial I_1} A^1_1 - \frac{\partial A^2_\ell}{\partial I_1} A^2_2.$$
This expression vanishes because $A^1_1 = A^2_2 = M(I_1)$. Thus the Nijenhuis tensor $N^i_{Aj\ell} = 0$. Hence all non-degenerate $V$-invariant Poisson structures $P_1$ and $P_2$ for $q = 2k - 1$ are pairwise compatible in Magri’s sense [13, 20].
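(iii) For $k \le q \le 2k - 2$, we consider the following component of the Nijenhuis tensor:
$$N^1_{A12} = \sum_{m=1}^{p}\left(\frac{\partial A^1_2}{\partial I_m} A^m_1 - \frac{\partial A^1_1}{\partial I_m} A^m_2 + \frac{\partial A^m_1}{\partial I_2} A^1_m - \frac{\partial A^m_2}{\partial I_1} A^1_m\right).$$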
This expression does not vanish for the generic functions F1 (I), . . . , Fq (I) (9.3). Therefore two generic V -invariant Poisson structures P1 (x) and P2 (x) are not compatible in Magri’s sense. Remark 2. Equation (9.8) proves that the non-constant eigenvalues of matrix A coincide with the eigenvalues of the p × p matrix M (I). Hence the maximal possible number of functionally independent eigenvalues of the recursion operator A(x) = P1 (x)P2−1 (x) is p = 2k − q. We say that two V -invariant Poisson structures P1 (x) and P2 (x) are in general position if they are non-degenerate and this number is achieved. Corollary 1. Any V -invariant Poisson structures P1 (x) and P2 (x) in general position are compatible in the extended sense.
Proof. Indeed, the extended compatibility conditions are satisfied: (i) The general position of $P_1(x)$ and $P_2(x)$ implies that the submanifolds $M^q_c$, $\operatorname{Tr} A^m(x) = c_m$, coincide with the tori $\mathbb{T}^q$. Hence the condition $A(T_x(M^q_c)) \subset T_x(M^q_c)$ follows from the block form (9.7). (ii) For tangent vectors $u, v \in T_x(M^q_c)$, the Nijenhuis tensor $N_A(u, v)$ vanishes because the matrix $A(I)$ (9.7) is constant on the tori $\mathbb{T}^q$.

The Nijenhuis tensor $N^i_{Aj\ell}$ defines skew-symmetric algebraic structures in the tangent spaces $T_x(M^n)$, namely $(N_A(u, v))^i = N^i_{Aj\ell}\, u^j v^\ell$ for $u, v \in T_x(M^n)$. Equations (9.7) and (9.9) imply
$$N_A(T_x(\mathbb{T}^q), T_x(\mathbb{T}^q)) = 0, \qquad N_A(T_x(M^n), T_x(\mathbb{T}^q)) \subset T_x(\mathbb{T}^q).$$
These formulae mean that the tangent subspaces $T_x(\mathbb{T}^q)$ are abelian ideals with respect to the algebraic structure $N_A(u, v)$. To study these structures for an arbitrary (1,1) tensor $A^i_j(x)$ on a smooth manifold $M^n$, we define the (1,3) tensor
$$B^i_{Nj\ell m} = \sum_{\alpha=1}^{n}\left(N^\alpha_{Aj\ell}\, N^i_{A\alpha m} + N^\alpha_{A\ell m}\, N^i_{A\alpha j} + N^\alpha_{Amj}\, N^i_{A\alpha\ell}\right),$$
that has the invariant form BN (u, v, w) = NA (NA (u, v), w) + NA (NA (v, w), u) + NA (NA (w, u), v). It is evident that tensor BN (u, v, w) is skew-symmetric: BN (u, v, w) = −BN (v, u, w),
BN (u, v, w) = BN (v, w, u) = BN (w, u, v).
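The cyclic sum $B_N$ is the Jacobiator of the skew-symmetric bracket $N_A(u, v)$, so it can be evaluated directly from the components $N^i_{Aj\ell}$. The Python sketch below is an illustration under assumptions: the two sets of structure constants (the cross product on $\mathbb{R}^3$ and a hand-made skew bracket named `bad`) are hypothetical examples standing in for a Nijenhuis tensor at one point, not objects from the paper.

```python
from itertools import product

def jacobiator(N, n):
    """B[i, j, l, m] = sum over a of the three cyclic terms, with N[i][j][l] = N^i_{jl}."""
    return {
        (i, j, l, m): sum(
            N[a][j][l] * N[i][a][m] + N[a][l][m] * N[i][a][j] + N[a][m][j] * N[i][a][l]
            for a in range(n)
        )
        for i, j, l, m in product(range(n), repeat=4)
    }

# Cross product on R^3: N^i_{jl} is the Levi-Civita symbol, a genuine Lie bracket.
cross = [[[(i - j) * (j - l) * (l - i) // 2 for l in range(3)]
          for j in range(3)] for i in range(3)]

# A skew bracket that is not a Lie bracket: [e0, e1] = e1, [e1, e2] = e0.
bad = [[[0] * 3 for _ in range(3)] for _ in range(3)]
bad[1][0][1], bad[1][1][0] = 1, -1
bad[0][1][2], bad[0][2][1] = 1, -1

B_cross = jacobiator(cross, 3)
B_bad = jacobiator(bad, 3)
print(all(v == 0 for v in B_cross.values()), B_bad[0, 0, 1, 2])
```

For the cross product every component vanishes, while for the second bracket the component with $u = e_0$, $v = e_1$, $w = e_2$ is non-zero, which is exactly the deviation from a Lie algebra structure that $B_N$ measures.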
If $B_N(u, v, w) \equiv 0$ then the Nijenhuis tensor $N_A(u, v)$ defines a Lie algebra structure in each tangent space $T_x(M^n)$. Hence the (1,3) tensor $B_N(u, v, w)$ characterizes the deviation of the algebraic structures defined by the Nijenhuis tensor from the Lie algebraic structures.

10. Quasi-Periodic Dynamics Without Hamiltonian Structure

Any dynamical system (5.1) with quasi-periodic dynamics preserves the closed differential 2-forms
$$\omega = \sum_{\alpha=1}^{q} d\omega_\alpha(I) \wedge d\varphi_\alpha + \sum_{j=1}^{p} df_j(I) \wedge dI_j, \tag{10.1}$$
where $f_j(I)$ are arbitrary smooth functions. For $q \le k$, $p = 2k - q$ and for generic functions $\omega_\alpha(I)$, the 2-forms (10.1) are non-degenerate. Therefore the generic dynamical system (5.1) for $q \le k$ possesses a continuum of non-degenerate Hamiltonian structures. We call dynamical system (5.1) generic if its frequencies $\omega_\alpha(I)$ are linearly independent in any ball $B_0 \subset B_a$.

Theorem 5. The generic dynamical system (5.1) for $k < q < 2k$ has no non-degenerate Hamiltonian structure. Any invariant closed 2-form $\omega$ is degenerate:
$$\operatorname{rank}\|\omega\| \le 2(2k - q) < 2k. \tag{10.2}$$
Proof. In view of Proposition 3, the linear independence of the functions $\omega_\alpha(I)$ implies that system (5.1) is $\mathbb{T}^q$-dense. Applying Theorem 2, part 1, we obtain that any invariant closed 2-form $\omega$ has form (5.3). Let us prove that $c_{\alpha\beta} = 0$. Indeed, Theorem 2 implies that the frequencies $\omega_\alpha(I)$ satisfy the linear Eqs. (5.5): $c_{\alpha\beta}\,\omega_\beta(I) = c_\alpha$, where $\alpha, \beta = 1, \dots, q$. If the skew-symmetric matrix $c_{\alpha\beta} \ne 0$ then $\operatorname{rank}\|c_{\alpha\beta}\| \ge 2$. Hence there exist two linearly independent rows $c_{\tau\beta}$ and $c_{\nu\beta}$. Equations (5.5) imply $(c_{\tau\beta} - \lambda c_{\nu\beta})\,\omega_\beta(I) = 0$, where $\lambda = c_\tau/c_\nu$. This contradicts the linear independence of the frequencies $\omega_\beta(I)$. Hence $c_{\alpha\beta} = 0$. Therefore Eq. (5.3) yields that any invariant closed 2-form $\omega$ has the form
$$\omega = \sum_{\alpha=1}^{q} dF_\alpha(I) \wedge d\varphi_\alpha + \sum_{j=1}^{2k-q} df_j(I) \wedge dI_j. \tag{10.3}$$
This formula implies the relation (10.2) for k < q < 2k.
Remark 3. Any dynamical system (5.1) for $q = 2k$ has constant coefficients. Hence any such system preserves the $k(2k - 1)$-dimensional family of constant non-degenerate symplectic structures on the torus $\mathbb{T}^{2k}$.

Corollary 2. The multiplication in the ring of cohomologies $H^*_B(V, M^{2k})$ [8] for the generic dynamical system (5.1) for $k < q < 2k$, $p = 2k - q < k$ has the following property:
$$u_1 \cdot \ldots \cdot u_{p+1} = 0 \tag{10.4}$$
for any $p + 1$ elements $u_1, \dots, u_{p+1} \in H^2_B(V, M^{2k})$.

Proof. Let the $V$-invariant closed 2-forms $\omega_1, \dots, \omega_{p+1}$ represent the elements $u_1, \dots, u_{p+1} \in H^2_B(V, M^{2k})$. We have proved that these 2-forms have form (10.3). Hence the identity $\omega_1 \wedge \dots \wedge \omega_{p+1} = 0$ yields Eq. (10.4).
11. Concluding Remarks In this paper, we have introduced the concept of integrability of dynamical systems in the broad sense. We have studied the dynamics of an electron on a torus T2 ⊂ R3 in an electromagnetic field and have shown that the corresponding Hamiltonian system on T ∗ (T2 ) is bi-Hamiltonian and integrable in the broad sense. Its generic trajectories are quasi-periodic and dense on the 3-dimensional tori T3 ⊂ T ∗ (T2 ). We have introduced the notion of extended compatibility of two Poisson structures. This notion contains as particular cases the compatibility in Magri’s sense [20] and strong dynamical compatibility [7]. We have proved in Theorem 1 that the corresponding biHamiltonian systems for p = k are integrable in the broad sense. Theorem 2 provides the complete classification of all k(k + 1)/2 canonical forms of integrable, in the broad sense, Hamiltonian systems and invariant symplectic structures on smooth manifolds M 2k . These canonical forms are classified by two integer topological invariants q and v: 1 ≤ q ≤ 2k, 2(q − k) ≤ v ≤ 2[q/2]. The derived classification has been applied in Theorem 4. We have proved that if a Hamiltonian system V is integrable in the broad sense and non-degenerate and the Poisson structures P1 (x) and P2 (x) are V -invariant then the recursion operator A(x) =
$P_1(x)P_2^{-1}(x)$ has $2(q - k)$ constant eigenvalues. The $V$-invariant Poisson structures in general position are compatible in the extended sense. If $q = 2k$ or $2k - 1$ then any two $V$-invariant non-degenerate Poisson structures are compatible in Magri’s sense. For $q \le 2k - 2$, the generic $V$-invariant Poisson structures are incompatible.

The results of this paper lead to the following open problem: What restrictions on the topology of a symplectic manifold $M^{2k}$ follow from the existence of a Hamiltonian system on $M^{2k}$ that is integrable in the broad sense with the topological invariant $v > 0$? One of these restrictions (for the non-compact manifolds $M^{2k}$) is that the $v/2$ even cohomology groups $H^2(M^{2k}), \dots, H^v(M^{2k})$ must be infinite and the multiplication in the cohomology ring $H^*(M^{2k})$ must be non-trivial. Indeed, the differential forms $\omega_c^\ell = \omega_c \wedge \dots \wedge \omega_c$ (6.3) have non-zero integrals over certain toroidal submanifolds $\mathbb{T}^{2\ell} \subset \mathbb{T}^q \subset M^{2k}$, where $2\ell = 2, \dots, v$. If the topological invariant $q > k$ then the cohomology groups $H^2(M^{2k}), \dots, H^{2(q-k)}(M^{2k})$ are infinite because $v \ge 2(q - k)$.

Note added in proof

12. Applications to Hydrodynamics

I. In this Section, we prove the integrability in the broad sense of the non-Hamiltonian dynamical systems that describe the axially symmetric flows in ideal hydrodynamics and viscous fluid dynamics. We consider Euler’s hydrodynamics equations
$$\frac{\partial V}{\partial t} = V \times \operatorname{curl} V - \operatorname{grad}\left(\frac{p}{\rho} + \frac{V^2}{2}\right), \qquad \operatorname{div} V = 0. \tag{12.1}$$
Here $\rho$ is the constant density of the ideal fluid. Helmholtz’s equation for vorticity has the form
$$\left[\frac{\partial}{\partial t} + V,\ \operatorname{curl} V\right] = 0. \tag{12.2}$$
The dynamics of the fluid for a given solution $V^i(t, x^1, x^2, x^3)$ of Euler’s equations is defined by the dynamical system
$$\frac{dx^1}{dt} = V^1(t, x^i), \qquad \frac{dx^2}{dt} = V^2(t, x^i), \qquad \frac{dx^3}{dt} = V^3(t, x^i), \qquad \frac{dt}{dt} = 1 \tag{12.3}$$
in the space-time $\mathbb{R}^4$. The vector field $V_{st}$ of the dynamical system (12.3) has the form $V_{st} = \partial/\partial t + V$. The Helmholtz equation (12.2) means that the vector field $\operatorname{curl} V$ is a symmetry of the dynamical system (12.3).

II. We consider axially symmetric solutions to Euler’s equations (12.1). Let $z, r, \varphi$ be the cylindrical coordinates and let $\rho F(t, z, r)$ be the density of the $z$-component of angular momentum of the fluid rotating around the vertical axis $z$. The axially symmetric vector field $V$ of fluid velocity has the form
$$V = \frac{1}{r} X(t, z, r)\frac{\partial}{\partial z} + \frac{1}{r} Y(t, z, r)\frac{\partial}{\partial r} + \frac{1}{r^2} F(t, z, r)\frac{\partial}{\partial\varphi}. \tag{12.4}$$
Theorem 6. Axially symmetric dynamics of an ideal fluid is integrable in the broad sense if $F(t, z, r) \not\equiv \mathrm{const}$.
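Proof. For the vector field (12.4), the incompressibility equation $\operatorname{div} V = 0$ takes the form $\partial X/\partial z + \partial Y/\partial r = 0$. Hence the vector field $V$ is
$$V = -\frac{1}{r}\frac{\partial H}{\partial r}\frac{\partial}{\partial z} + \frac{1}{r}\frac{\partial H}{\partial z}\frac{\partial}{\partial r} + \frac{F}{r^2}\frac{\partial}{\partial\varphi}, \tag{12.5}$$
where $H(t, z, r)$ and $F(t, z, r)$ are arbitrary smooth functions. The fluid dynamics in the space-time $\mathbb{R}^4$ is defined by the vector field $V_{st} = \partial/\partial t + V$. The axial symmetry and the Helmholtz equation (12.2) imply that the flow (12.3)–(12.5) has two symmetries
$$\frac{\partial}{\partial\varphi}, \qquad \operatorname{curl} V = \frac{1}{r}\frac{\partial F}{\partial r}\frac{\partial}{\partial z} - \frac{1}{r}\frac{\partial F}{\partial z}\frac{\partial}{\partial r} + \Phi\,\frac{\partial}{\partial\varphi}, \tag{12.6}$$
where
$$\Phi = \frac{1}{r^2}\frac{\partial^2 H}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial H}{\partial r}\right). \tag{12.7}$$
Let us prove that the three commuting vector fields $V_{st}$, $\partial/\partial\varphi$, $\operatorname{curl} V$ preserve the function $F(t, z, r)$. Euler’s equations (12.1) take the form
$$\frac{1}{r}\frac{\partial^2 H}{\partial t\,\partial z} - \frac{1}{2r^2}\frac{\partial F^2}{\partial r} - \Phi\,\frac{\partial H}{\partial r} = -\frac{\partial}{\partial r}\left(\frac{p}{\rho} + \frac{V^2}{2}\right), \tag{12.8}$$
$$\frac{1}{r}\frac{\partial^2 H}{\partial t\,\partial r} + \frac{1}{2r^2}\frac{\partial F^2}{\partial z} + \Phi\,\frac{\partial H}{\partial z} = \frac{\partial}{\partial z}\left(\frac{p}{\rho} + \frac{V^2}{2}\right), \tag{12.9}$$
$$\frac{\partial F}{\partial t} + \frac{1}{r}\left(\frac{\partial F}{\partial r}\frac{\partial H}{\partial z} - \frac{\partial F}{\partial z}\frac{\partial H}{\partial r}\right) = 0. \tag{12.10}$$
The compatibility condition for Eqs. (12.8) and (12.9) is
$$\frac{\partial\Phi}{\partial t} + \frac{1}{r}\left(\frac{\partial\Phi}{\partial r}\frac{\partial H}{\partial z} - \frac{\partial\Phi}{\partial z}\frac{\partial H}{\partial r}\right) = \frac{1}{r^4}\frac{\partial F^2}{\partial z}. \tag{12.11}$$
Equation (12.10) means that the function $F(t, z, r)$ is a first integral of the dynamical system (12.3)–(12.5). It is evident that the function $F(t, z, r)$ is preserved by the vector fields $\partial/\partial\varphi$ and $\operatorname{curl} V$ (12.6). Thus the dynamical system (12.3) in the space-time $\mathbb{R}^4$ has the first integral $F(t, z, r)$ and possesses three linearly independent symmetries $V_{st}$, $\partial/\partial\varphi$, $\operatorname{curl} V$. The symmetries are pairwise commuting and preserve the function $F(t, z, r)$. Hence the dynamical system (12.3) is integrable in the broad sense.

Remark 4. If the function $F(t, z, r) \equiv \mathrm{const}$ then Eq. (12.10) is satisfied and Eq. (12.11) implies that the function $\Phi(t, z, r)$ (12.7) is a first integral of the dynamical system (12.3). However the system has only two linearly independent symmetries $V_{st}$ and $\partial/\partial\varphi$ for $F(t, z, r) \equiv \mathrm{const}$.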
Remark 5. In view of Proposition 2 of Sect. 5, the proved integrability implies that the invariant submanifolds $F(t, z, r) = \mathrm{const}$ are toroidal cylinders $\mathbb{T}^2 \times \mathbb{R}^1$ or $S^1 \times \mathbb{R}^2$ and there are coordinates in $\mathbb{R}^4$ where dynamical system (12.3) has the form
$$\dot I_1 = 0, \qquad \dot\varphi_1 = \omega_1(I_1), \qquad \dot\varphi_2 = \omega_2(I_1), \qquad \dot\rho_3 = \omega_3(I_1). \tag{12.12}$$
Here $I_1 = F(t, z, r)$ and $\varphi_1, \varphi_2, \rho_3$ are global coordinates on the toroidal cylinders. As a consequence, we obtain that the axially symmetric dynamics of the fluid in the 3-dimensional Euclidean space $\mathbb{R}^3$ is a projection of the integrable dynamics (12.12) in the space-time $\mathbb{R}^4$. Theorem 6 leads to a conjecture that some cases of turbulent dynamics of a fluid are connected with the integrability in the broad sense.

III. Example 2. Equations (12.10), (12.11) have the following exact solutions:
$$H = H_0 - \frac{1}{2} v r^2, \qquad \Phi = -\left(4\eta^2 + \frac{\omega^2}{r^2}\right) H_0, \qquad F = \pm\omega H_0, \qquad H_0 = \sin(\omega(z - vt))\sin(\eta r^2),$$
where $v, \eta, \omega$ are arbitrary parameters. The corresponding dynamical system (12.3)–(12.5) has the form
$$\dot z = -2\eta\sin(\omega(z - vt))\cos(\eta r^2) + v, \qquad \dot r = \omega\cos(\omega(z - vt))\,\frac{\sin(\eta r^2)}{r}, \qquad \dot\varphi = \pm\omega\sin(\omega(z - vt))\,\frac{\sin(\eta r^2)}{r^2}, \qquad \dot t = 1. \tag{12.13}$$
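The conservation of $F = \pm\omega H_0$ along the flow (12.13) can be observed numerically. The following Python sketch integrates (12.13) with a classical fourth-order Runge–Kutta step and records the largest deviation of $F$ from its initial value; the parameter values, the initial point and the step size are assumptions chosen only for illustration.

```python
import math

ETA, OMEGA, V, SIGN = 1.0, 1.0, 0.3, 1.0  # hypothetical parameters; SIGN selects F = +omega*H0

def velocity(state):
    """Right-hand side of system (12.13) for state = (z, r, phi, t)."""
    z, r, _, t = state
    s = math.sin(OMEGA * (z - V * t))
    c = math.cos(OMEGA * (z - V * t))
    return (
        -2.0 * ETA * s * math.cos(ETA * r * r) + V,          # dz/dt
        OMEGA * c * math.sin(ETA * r * r) / r,               # dr/dt
        SIGN * OMEGA * s * math.sin(ETA * r * r) / (r * r),  # dphi/dt
        1.0,                                                 # dt/dt
    )

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, step):
        return tuple(x + step * kx for x, kx in zip(state, k))
    k1 = velocity(state)
    k2 = velocity(shifted(k1, h / 2))
    k3 = velocity(shifted(k2, h / 2))
    k4 = velocity(shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def first_integral(state):
    """F(t, z, r) = SIGN * omega * H0(t, z, r)."""
    z, r, _, t = state
    return SIGN * OMEGA * math.sin(OMEGA * (z - V * t)) * math.sin(ETA * r * r)

state = (0.5, 1.0, 0.0, 0.0)  # hypothetical initial point (z, r, phi, t)
f0 = first_integral(state)
drift = 0.0
for _ in range(10_000):
    state = rk4_step(state, 1e-3)
    drift = max(drift, abs(first_integral(state) - f0))
print(drift)
```

The recorded drift stays at the level of the integration error, as expected for a first integral; the initial point is taken inside one of the cells where $H_0 \ne 0$, so that $r$ stays away from zero along the trajectory.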
The fluid velocity $V$ is smooth and bounded everywhere in $\mathbb{R}^4$. In the moving frame of reference $z = vt + \mathrm{const}$, the flow (12.13) is steady and has infinitely many invariant compact domains which are separated by the invariant surfaces $H_0(t, z, r) = 0$: $z - vt = n\pi/\omega$ and $r^2 = m\pi/\eta$. Here $n$ and $m$ are arbitrary integers and $m/\eta > 0$. In each domain, trajectories are either quasi-periodic on tori $\mathbb{T}^2$ defined by the equation $H_0(t, z, r) = \mathrm{const} \ne 0$, $0 \le \varphi \le 2\pi$, or are circles $z - vt = (n + 1/2)\pi/\omega$, $r^2 = (m + 1/2)\pi/\eta$, $\dot\varphi = \pm(-1)^{n+m}\omega/r^2$. The system (12.13) is invariant with respect to the shift $t \to t + 2\pi/(v\omega)$. Hence equations (12.13) define a smooth dynamical system on the quotient manifold $M^4 = S^1 \times \mathbb{R}^3$. The system is integrable in the broad sense and its generic trajectories are dense on tori $\mathbb{T}^3$. Applying results from the end of Sect. 11, we obtain that the constructed integrable system on $M^4$ is not Hamiltonian because $H^2(M^4) = 0$.

Remark 6. Example 2 shows that there exist smooth dynamical systems on simply connected manifolds, for example on $\mathbb{R}^4$ or $\mathbb{R}^3$ (for $v = 0$), that are integrable in the broad sense and are not Liouville-integrable. For system (12.13), any invariant closed differential 1-form is exact; hence the “multi-valued first integrals” [33] do not exist.

IV. The following statement gives a simple explanation of the steady vortex motion of a viscous fluid discovered by Taylor in his 1923 experiments.

Proposition 4. Axially symmetric steady dynamics of a viscous fluid is integrable in the broad sense.
Proof. The law of conservation of mass $\partial\rho/\partial t + \operatorname{div}(\rho V) = 0$ implies for the steady case the equation $\operatorname{div}(\rho V) = 0$, where $\rho = \rho(z, r)$ is the mass density. Hence the axially symmetric vector field $\rho V$ has form (12.5). Therefore, the corresponding dynamical system (12.3) is
$$\dot z = -\frac{1}{r\rho}\frac{\partial H}{\partial r}, \qquad \dot r = \frac{1}{r\rho}\frac{\partial H}{\partial z}, \qquad \dot\varphi = \frac{F}{r^2\rho}, \tag{12.14}$$
where $H(z, r)$ and $F(z, r)$ are arbitrary smooth functions. The dynamical system (12.14) has the first integral $H(z, r)$ and two commuting symmetries $V$ and $\partial/\partial\varphi$ which preserve $H(z, r)$. Hence the viscous fluid dynamics (12.14) is integrable in the broad sense.

Remark 7. Arnold proved in [34] the integrability of the dynamics of a fluid in $\mathbb{R}^3$ assuming that the dynamics is steady, the vector fields $V$ and $\operatorname{curl} V$ are linearly independent and the fluid is ideal. In Theorem 6 and Proposition 4, we do not use these assumptions but suppose that the dynamics is axially symmetric and $F(t, z, r) \not\equiv \mathrm{const}$. Theorem 6 is true also for an ideal fluid with a variable density of mass.

Theorem 7. The four Euler equations (12.1) for the axially symmetric steady case are equivalent to the single equation
$$\frac{\partial^2 H}{\partial z^2} + x\,\frac{\partial^2 H}{\partial x^2} = a(H) + x\,b(H), \tag{12.15}$$
where $a(H)$ and $b(H)$ are arbitrary functions, $x = r^2/4$. Proof of Theorem 7 will be published elsewhere. The presence of arbitrary functions $a(H)$ and $b(H)$ in the reduced Eq. (12.15) proves that the Euler equations (12.1) are not integrable by any known analytic method.

Acknowledgement. The author thanks J. Moser and E. Zehnder for helpful discussions and the Forschungsinstitut für Mathematik ETH Zürich for its hospitality. The author thanks the referee for useful suggestions.
References
1. Abraham, R., Marsden, J. E.: Foundations of mechanics. London: The Benjamin/Cummings Publishing, 1978
2. Alfven, H., Falthammar, C.-G.: Cosmical electrodynamics. Fundamental principles. Oxford: Clarendon Press, 1963
3. Arnold, V. I., Avez, A.: Ergodic problems of classical mechanics. New York: W. A. Benjamin, Inc., 1968
4. Bogoyavlenskij, O. I.: On perturbations of the periodic Toda lattice. Commun. Math. Phys. 51, 201–209 (1976)
5. Bogoyavlenskij, O. I.: A concept of integrability of dynamical systems. Compt. Rend. Acad. Sci. Canada 18, 163–169 (1996)
6. Bogoyavlenskij, O. I.: An extended concept of integrability and its applications. I. Two topological invariants of integrable Hamiltonian systems. Preprint 1996–08. Kingston: Queen’s University, 1996
7. Bogoyavlenskij, O. I.: Theory of tensor invariants of integrable Hamiltonian systems. I. Incompatible Poisson structures. Commun. Math. Phys. 180, 529–586 (1996)
8. Bogoyavlenskij, O. I.: Theory of tensor invariants of integrable Hamiltonian systems. II. Theorem on symmetries and its applications. Commun. Math. Phys. 184, 301–365 (1997)
9. Bogoyavlenskij, O. I.: Conformal symmetries of dynamical systems and Poincaré’s 1892 concept of iso-energetic non-degeneracy. C. R. Acad. Sci. Paris, 326 S. I, 213–218 (1998)
10. Clemmow, P. C., Dougherty, J. P.: Electrodynamics of particles and plasmas. Reading, Massachusetts: Addison-Wesley Publishing Co., 1969
11. Duistermaat, J. J.: On global action-angle coordinates. Commun. Pure Appl. Math. 33, 687–706 (1980)
12. Fuchssteiner, B.: Applications of hereditary symmetries to nonlinear evolution equations. Nonlinear Analysis, Theory, Methods and Applic. 3, 849–862 (1979)
13. Gelfand, I. M., Dorfman, I. Ya.: Hamiltonian operators and algebraic structures related to them. Funct. Anal. Appl. 13, 248–262 (1979)
14. Herman, M. R.: Exemples de flots hamiltonians dont aucune perturbation en topologie $C^\infty$ n’a d’orbites périodiques sur un ouvert de surfaces d’energies. C. R. Acad. Sci. Paris 312, S. I, 989–994 (1991)
15. Herman, M. R.: Differentiabilite optimale et contre-exemples a la fermeture en topologie $C^\infty$ des orbites récurrentes de flots hamiltonians. C. R. Acad. Sci. Paris 313, S. I, 49–51 (1991)
16. Hofer, H., Zehnder, E.: Symplectic invariants and Hamiltonian dynamics. Basel: Birkhauser Verlag, 1994
17. Jost, R.: Winkel- und wirkungsvariable für allgemeine mechanische Systeme. Helvetica Physica Acta 41, 965–968 (1968)
18. Kolmogorov, A. N.: The general theory of dynamical systems and classical mechanics. In: Proc. Intern. Congr. Math. 1954, 1. Amsterdam: North-Holland Publ. Co., 1957, pp. 315–333
19. Liouville, J.: Note sur l’intégration des équations différentielles de la Dynamique. J. Math. Pures Appl. 20, 137–138 (1855)
20. Magri, F.: A simple model of an integrable Hamiltonian system. J. Math. Phys. 19, 1156–1162 (1978)
21. Markus, L., Meyer, K. R.: Generic Hamiltonian dynamical systems are neither integrable nor ergodic. Memoirs of AMS, 144, 1974
22. Marsden, J.: Lectures on mechanics. Cambridge: Cambridge University Press, 1992
23. Meyer, K. R.: Hamiltonian systems with a discrete symmetry. J. Diff. Eqs. 41, 228–238 (1981)
24. Moser, J.: Old and new applications of KAM theory. NATO Conference “Hamiltonian systems of 3 and more degrees of freedom”. Barcelona, August 1995
25. Nekhoroshev, N. N.: Action-angle variables and their generalizations. Trans. Moscow Math. Soc. 26, 180–198 (1972)
26. Nijenhuis, A.: $X_{n-1}$-forming sets of eigenvectors. Proc. Kon. Ned. Akad. Amsterdam 54, 200–212 (1951)
27. Parasyuk, I. O.: Conservation of multidimensional invariant tori of Hamiltonian systems. Ukr. Math. J. 36, 380–385 (1984)
28. Parasyuk, I. O.: Variables of the action-angle type on symplectic manifolds stratified by coisotropic tori. Ukr. Math. J. 45, 85–93 (1993)
29. Parasyuk, I. O.: Reduction and coisotropic invariant tori of Hamiltonian systems with non-Poisson commutative symmetries. I, II, Ukr. Math. J. 46, 572–580, 994–002 (1994)
30. Parasyuk, I. O.: Coisotropic invariant tori of Hamiltonian systems of the quasiclassical theory of motion of a conduction electron. Ukr. Math. J. 42, 308–312 (1990)
31. Poincaré, H.: Les méthodes nouvelles de la mécanique céleste. T. 1. Paris: Gauthier–Villars, 1892
32. Zehnder, E.: Remarks on periodic solutions on hypersurfaces. In: Periodic solutions of Hamiltonian systems and related topics. P. H. Rabinowitz (Ed.). Dordrecht: D. Reidel Publ. Co., 1987, pp. 267–279
33. Novikov, S. P.: The Hamiltonian formalism and a multi-valued analogue of Morse theory. Russ. Math. Surv. 37 (5), 1–56 (1982)
34. Arnold, V. I.: Sur la topologie des écoulements stationnaires des fluides parfaits. C. R. Acad. Sci. Paris, 261 S. I, 17–20 (1965)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 196, 53 – 65 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On the Inviscid Limit for a Fluid with a Concentrated Vorticity*
Carlo Marchioro
Dipartimento di Matematica, Università “La Sapienza”, Piazzale A. Moro 2, 00185 Roma, Italy. E-mail:
[email protected] Received: 29 September 1997 / Accepted: 19 December 1997
Abstract: We study the time evolution of a viscous incompressible fluid in $\mathbb{R}^2$ when the initial vorticity is sharply concentrated in $N$ regions of diameter $\varepsilon$. We prove that in the zero viscosity limit it converges, as $\varepsilon$ vanishes, to the inviscid fluid and in particular to the point vortex system.

1. Introduction and Main Result

We study an incompressible viscous fluid in $\mathbb{R}^2$. The Navier–Stokes equation governing the motion is:
$$\partial_t\omega(x, t) + (u \cdot \nabla)\omega(x, t) = \nu\Delta\omega(x, t), \qquad \nu > 0, \quad x = (x_1, x_2) \in \mathbb{R}^2, \tag{1.1a}$$
$$\nabla \cdot u(x, t) = 0, \qquad \omega = \partial_1 u_2 - \partial_2 u_1, \tag{1.1b}$$
$$\omega(x, 0) = \omega_0 \ \text{ and boundary conditions}, \tag{1.1c}$$
where $u = (u_1, u_2)$ denotes the velocity field, $\omega$ the vorticity and $\nu$ the viscosity. When the fluid is inviscid the motion is governed by the Euler equation:
$$\partial_t\omega(x, t) + (u \cdot \nabla)\omega(x, t) = 0, \tag{1.2a}$$
$$\nabla \cdot u(x, t) = 0, \qquad \omega = \partial_1 u_2 - \partial_2 u_1, \tag{1.2b}$$
$$\omega(x, 0) = \omega_0 \ \text{ and boundary conditions}. \tag{1.2c}$$
* Research supported by MURST (Ministero dell’Università e della Ricerca Scientifica e Tecnologica) and by CNR-GNFM (Consiglio Nazionale delle Ricerche – Gruppo Nazionale Fisica Matematica).
A classical question is the following: does the solution of the Navier–Stokes equation converge to the corresponding solution of the Euler equation as $\nu \to 0$, and in which norm? As is well known, the question is not trivial because the Navier–Stokes equation is a singular perturbation of the Euler equation: the two differ by the term $\nu\Delta\omega$, which contains the highest-order derivatives of the problem. Moreover the boundary conditions are different in the two cases: for the Navier–Stokes equation we assume the perfect adherence of the fluid to the boundary (i.e. $u = 0$ on the boundary), while in the Euler case only the normal component of the velocity vanishes (i.e. $u \cdot n = 0$, where $n$ is the normal to the boundary). This gives rise to a boundary layer problem that has so far been solved only in a special case, in which the initial data are very well prepared (analytic) and the geometry of the problem is simple [Asa88, SaC961, SaC962, CaS96]. When the boundaries are absent, so that the fluid moves in the whole of $\mathbb{R}^2$, the situation is better understood: for smooth initial data there is pointwise convergence as $\nu \to 0$ [McG68] (see also [Kat72, BeM81]), while for generic initial data in $L^1 \cap L^\infty$ there is weak convergence [MaP84]. Other results on a vortex patch and nonsmooth bounded data are discussed in [CoW95, CoW96, Che96].

In the present paper we study the inviscid limit ($\nu \to 0$) when the initial vorticity is sharply concentrated in $N$ disjoint regions of diameter $\varepsilon$, and in particular we discuss the case $\varepsilon \to 0$. In this limit it has been proved that the solution of the Euler equation converges (weakly) to the solution of the so-called “point vortex model” defined by Eq. (1.11). (For the introduction of this model see [Hel67, Kir76, Poi93, Kel10]; for the connection between this model and the Euler equation see [MaP83, MaP84, MaP86, Tur87, Mar88, MaP93, Mar96]; for a review on the topic see [MaP94].)
Does the same thing happen for the solution of the Navier–Stokes equation as $\nu \to 0$? That is, is the connection between the Euler evolution and the point vortex model stable under a small viscous perturbation? The answer is positive when the vorticity has the same sign in all parts of the plane [Mar90]; in particular it has been shown that the solution of the Navier–Stokes equation converges to the point vortex model when $\varepsilon \to 0$ and $\nu \to 0$ independently. In the present paper we study the same problem when the vorticity in different blobs may have different signs and the technique of the previous paper fails. In the sequel we must assume that $\nu \le \nu_0\varepsilon^\alpha$, and we prove the convergence for any $\nu_0$ and $\alpha$. (Of course the result is stronger the smaller $\alpha$ is.)

Remark 1. The possibility of choosing $\alpha$ as small as we want is practically equivalent to proving the convergence when the two limits ($\varepsilon \to 0$ and $\nu \to 0$) are independent. More precisely, in the sequel we will prove the result for a dependence of logarithmic type, which is weaker than any power.

We consider initial data of the form:
$$\omega_\varepsilon(x, 0) = \sum_{i=1}^{N} \omega_{\varepsilon;i}(x, 0), \tag{1.3}$$
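where $\omega_{\varepsilon;i}(x, 0)$ is a function with a definite sign supported in a region $\Lambda_{\varepsilon;i}$ such that
$$\Lambda_{\varepsilon;i} = \operatorname{supp}\omega_{\varepsilon;i} \subset \Sigma(z_i|\varepsilon); \qquad \Sigma(z_i|\varepsilon) \cap \Sigma(z_j|\varepsilon) = \emptyset \ \text{ if } i \ne j \tag{1.4}$$
for $\varepsilon$ small enough. Here $\Sigma(z|r)$ denotes the circle of center $z$ and radius $r$. We assume that
$$|\omega_{\varepsilon;i}(x, 0)| \le M\varepsilon^{-\gamma}, \qquad 0 < M < \infty, \qquad \gamma \ge 2 \tag{1.5}$$
and
$$\int dx\,\omega_{\varepsilon;i}(x, 0) \equiv a_{\varepsilon;i} \to a_i \quad \text{as } \varepsilon \to 0, \tag{1.6}$$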
where $a_i$ are $N$ real numbers called the “intensity of the vortices”. These conditions (1.3), (1.4), (1.5), (1.6) mean that
$$\omega_\varepsilon(x, 0) \xrightarrow[\varepsilon \to 0]{} \sum_{i=1}^{N} a_i\,\delta(z_i) \tag{1.7}$$
(weak convergence in the sense of measure), where $\delta(z)$ means the Dirac measure centered in $z$. For simplicity, in the present paper we suppose that $\omega_{\varepsilon;i}(x, 0) \in C^2$, so that Eqs. (1.1) make sense. This last assumption is not essential and for nonsmooth initial data we can use a weak form of the Navier–Stokes evolution. Finally, we must choose the boundary condition and we assume that
$$u(x, 0) \to 0 \quad \text{as } |x| \to \infty. \tag{1.8}$$
We are now able to state the main result of the present paper:

Theorem 1.1. Denote by $\omega_\varepsilon(x, t)$ the time evolution of $\omega_\varepsilon(x, 0)$ according to the Navier–Stokes equation. Fix any positive numbers $\nu_0$ and $\alpha$, and consider a sequence of evolutions via the Navier–Stokes equation with a viscosity $\nu$ such that
$$\nu \le \nu_0\varepsilon^\alpha. \tag{1.9}$$
Then for any positive $T$ and any time $t$ such that $0 \le t \le T$, we have
$$\omega_\varepsilon(x, t) \xrightarrow[\varepsilon \to 0]{} \sum_{i=1}^{N} a_i\,\delta(z_i(t)), \tag{1.10}$$
where $z_i(t)$ satisfy the ordinary differential equations (called the point vortex system)
$$\frac{d}{dt} z_i(t) = -\nabla^\perp \frac{1}{2\pi} \sum_{j=1;\, j \ne i}^{N} a_j \ln|z_i(t) - z_j(t)|, \qquad z_i(0) = z_i. \tag{1.11}$$
In the literature ([MaP83, MaP84, MaP86, Tur87, Mar88, MaP93, MaP94, Mar96]) it has been proved that the same limit (1.10) holds when we study the Euler evolution, and so Eq. (1.10) implies also the convergence of the Navier–Stokes solutions to the Euler ones.

Remark 2. The right-hand side of Eq. (1.11) diverges as two vortices collapse, i.e. $z_i = z_j$, $i \ne j$, and in this situation the point vortex model becomes meaningless. It has been proved that this event may happen but it is exceptional (see [MaP94]). In general Theorem 1.1 holds until the first collapse.

Remark 3. We observe that Theorem 1.1 is an asymptotic result in $\varepsilon$ as $\varepsilon \to 0$ and so, without loss of generality, we can suppose in the sequel $\nu_0$ fixed and $\varepsilon \to \varepsilon_0$ small enough.
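To make the point vortex system (1.11) concrete, the following Python sketch integrates it for two vortices with a classical Runge–Kutta step. It is an illustration under assumptions: the intensities, initial positions, step size and function names are hypothetical, and the sign convention $\nabla^\perp = (\partial_2, -\partial_1)$ is the one used later in the paper. For two vortices the mutual distance and the first moment $\sum_i a_i z_i$ are conserved by (1.11), which gives a simple consistency check.

```python
import math

def velocities(z, a):
    """Right-hand side of the point vortex system for positions z and intensities a."""
    out = []
    for i, (xi, yi) in enumerate(z):
        ux = uy = 0.0
        for j, (xj, yj) in enumerate(z):
            if j == i:
                continue
            dx, dy = xi - xj, yi - yj
            d2 = dx * dx + dy * dy
            # -grad_perp of (a_j / 2 pi) ln|z_i - z_j| with grad_perp = (d_2, -d_1)
            ux += -a[j] * dy / (2 * math.pi * d2)
            uy += a[j] * dx / (2 * math.pi * d2)
        out.append((ux, uy))
    return out

def rk4_step(z, a, h):
    """One classical Runge-Kutta step of size h."""
    def shift(k, s):
        return [(x + s * kx, y + s * ky) for (x, y), (kx, ky) in zip(z, k)]
    k1 = velocities(z, a)
    k2 = velocities(shift(k1, h / 2), a)
    k3 = velocities(shift(k2, h / 2), a)
    k4 = velocities(shift(k3, h), a)
    return [(x + h / 6 * (p[0] + 2 * q[0] + 2 * r[0] + s[0]),
             y + h / 6 * (p[1] + 2 * q[1] + 2 * r[1] + s[1]))
            for (x, y), p, q, r, s in zip(z, k1, k2, k3, k4)]

a = [1.0, -0.5]                # hypothetical intensities of opposite sign
z = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical initial positions

def first_moment(z):
    return (sum(ai * x for ai, (x, _) in zip(a, z)),
            sum(ai * y for ai, (_, y) in zip(a, z)))

m0 = first_moment(z)
for _ in range(1000):
    z = rk4_step(z, a, 0.01)

separation = math.dist(z[0], z[1])
moment_drift = math.dist(first_moment(z), m0)
print(separation, moment_drift)
```

The separation stays equal to its initial value 1 up to the integration error, so no collapse in the sense of Remark 2 occurs for this pair.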
The proof is rather technical and we summarize the main steps here. First, at time zero we partition the vorticity, considering the mass near each blob. Then we evolve each component via the Navier–Stokes equation, taking the velocity field $u$ in Eq. (1.1) to be the actual (unknown) velocity field produced by the other vortices and by the component itself. With $u$ regarded as given, Eq. (1.1a) reduces to a linear equation with time-dependent coefficients, and so the solution is the sum of the components arising from the initial partition. We study in detail the time evolution of each component under the assumption that the velocity field produced by the others is smooth. We show that under this hypothesis the main part of the vorticity remains “close” to the initial blob, which moves according to Eq. (1.11), and that the assumption, if initially valid, remains valid. This concludes the proof. In the next section we discuss the time evolution of one vortex in an external field and then we sketch the proof of Theorem 1.1.
2. One Vortex in an External Field and Proof of Theorem 1.1

In this section we study the time evolution of the vorticity $\omega_\varepsilon(x, 0)$ of nonnegative sign, which satisfies the conditions (1.4), (1.5), (1.6), and moves in an external field $F_\varepsilon(x, t)$, which simulates the effect of other vortices. For simplicity we assume $a_\varepsilon = 1$. We suppose that the field is the sum of two terms:
$$F_\varepsilon(x, t) = F_{1;\varepsilon}(x, t) + F_{2;\varepsilon}(x, t), \tag{2.1}$$
where $F_{1;\varepsilon}(x, t)$ is a smooth, divergence-free, uniformly bounded, time dependent, vector field which satisfies the Lipschitz condition
$$|F_{1;\varepsilon}(x, t) - F_{1;\varepsilon}(y, t)| \le L|x - y|, \qquad L \text{ independent of } \varepsilon. \tag{2.2}$$
$F_{2;\varepsilon}(x, t)$ is a smooth, divergence-free, uniformly bounded, time dependent, vector field which satisfies the following condition: consider the point vortex system (1.11) and denote by $R_m$ the mutual minimal distance between the point vortices $z_i(t)$ up to the time $T$. We define $R^* = R_m/10$ and we assume that
$$|F_{2;\varepsilon}(x, t)| \le \text{const.}\ \varepsilon^\alpha \quad \text{if } |x - B_\varepsilon(t)| \le 3R^*, \tag{2.3}$$
where $B_\varepsilon(t)$ is the solution of the ordinary differential equation
$$\frac{d}{dt} B_\varepsilon(t) = F_\varepsilon(B_\varepsilon(t), t), \qquad B_\varepsilon(0) = z_i, \tag{2.4}$$
and $\alpha$ is defined in Eq. (1.9). From now on const. means a constant independent of $\varepsilon$. Moreover we assume that $F(x, t)$ exists such that
$$F_\varepsilon(x, t) \to F(x, t) \quad \text{as } \varepsilon \to 0. \tag{2.5}$$
The Navier–Stokes equation reads
$$\partial_t\omega_\varepsilon(x, t) + ([u_\varepsilon + F_\varepsilon] \cdot \nabla)\,\omega_\varepsilon(x, t) = \nu\Delta\omega_\varepsilon(x, t), \qquad \nu > 0, \tag{2.6a}$$
$$\nabla \cdot u_\varepsilon(x, t) = 0, \tag{2.6b}$$
where
Fluid with Concentrated Vorticity
57
Z dyK(x − y)ωε (y, t),
uε (x, t) =
K = ∇⊥ G, ∇⊥ ≡ (∂2 , −∂1 ), 1 G(x) = − ln |x|. 2π
(2.7) (2.8) (2.9) (2.10)
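As a quick sanity check on Eqs. (2.8)–(2.10), the following Python sketch evaluates the kernel in closed form and compares it with a finite-difference version of $\nabla^{\perp}G$. The function names `G`, `K`, `K_numeric` and the step size `h` are illustrative choices and do not appear in the original text.

```python
import math

def G(x1, x2):
    """Green function G(x) = -(1/(2*pi)) * ln|x| of Eq. (2.10)."""
    return -math.log(math.hypot(x1, x2)) / (2.0 * math.pi)

def K(x1, x2):
    """Closed form of K = (d2 G, -d1 G), cf. Eqs. (2.8)-(2.9)."""
    r2 = x1 * x1 + x2 * x2
    # d1 G = -x1 / (2*pi*r^2) and d2 G = -x2 / (2*pi*r^2)
    return (-x2 / (2.0 * math.pi * r2), x1 / (2.0 * math.pi * r2))

def K_numeric(x1, x2, h=1e-6):
    """Central-difference approximation of (d2 G, -d1 G); h is an assumed step."""
    d1 = (G(x1 + h, x2) - G(x1 - h, x2)) / (2.0 * h)
    d2 = (G(x1, x2 + h) - G(x1, x2 - h)) / (2.0 * h)
    return (d2, -d1)
```

Two properties used repeatedly in the estimates can be read off the closed form: $K(x)\cdot x = 0$ (orthogonality) and $K(-x) = -K(x)$ (antisymmetry), while $|K(x)| = 1/(2\pi|x|)$ gives the $|x-y|^{-1}$ singularity.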
We prove for this evolution the equivalent of Theorem 1.1.

Theorem 2.1. Let $\omega_\varepsilon(x,t)$ be a solution of Eqs. (2.6) satisfying the initial conditions (1.4), (1.5), and let $F_\varepsilon(x,t)$ satisfy Eqs. (2.1)–(2.5). Then for any positive $T$ and any time $t$ such that $0 \le t \le T$ we have

$$\omega_\varepsilon(x,t) \xrightarrow[\varepsilon\to 0]{} \delta(B(t)), \tag{2.11}$$

where $B(t)$ satisfies the ordinary differential equation

$$\frac{d}{dt} B(t) = F(B(t)), \qquad B(0) = z. \tag{2.12}$$
Proof. We want to show that the main part of the vorticity remains close to $B(t)$ and that the vorticity going far from $B(t)$ is so small that it produces a very small velocity field. For this purpose we introduce an approximate moment of inertia $I(\varepsilon,0)$ which vanishes when the initial data are concentrated around $B(0)$. Then we evaluate its growth in time and we prove that it vanishes as $\varepsilon \to 0$. As a useful tool we introduce the following nonnegative function $W_R \in C^{\infty}(\mathbb{R}^2)$, $r \mapsto W_R(r)$, depending only on $|r|$, defined as

$$W_R(r) = \begin{cases} 1 & \text{if } |r| < R, \\ 0 & \text{if } |r| > 2R, \end{cases} \tag{2.13}$$

such that, for some $C_1 > 0$:

$$|\nabla W_R(r)| < \frac{C_1}{R}, \tag{2.14}$$

$$|\nabla W_R(r) - \nabla W_R(r')| < \frac{C_1}{R^2}\,|r - r'|, \tag{2.15}$$

$$|\Delta W_R(r)| < \frac{C_1}{R^2}. \tag{2.16}$$
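One standard way to realize a cutoff with the properties (2.13) is sketched below in Python. The text does not specify the profile of $W_R$ between $R$ and $2R$, so the construction via $e^{-1/t}$ and the helper names `_h`, `smooth_step`, `W` are assumptions made only for illustration.

```python
import math

def _h(t):
    """exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole real line."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def smooth_step(t):
    """Smooth transition: equals 0 for t <= 0 and 1 for t >= 1."""
    a, b = _h(t), _h(1.0 - t)
    return a / (a + b)  # a + b > 0 because t and 1 - t are never both <= 0

def W(r_norm, R):
    """Radial cutoff W_R evaluated at |r| = r_norm: 1 inside R, 0 outside 2R."""
    return 1.0 - smooth_step((r_norm - R) / R)
```

Because the profile depends on $|r|$ only through $(|r| - R)/R$, each derivative of $W_R$ picks up one factor $1/R$, which is the scaling expressed by the bounds (2.14)–(2.16).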
We define

$$I(\varepsilon,t) = \int dx\, \omega_\varepsilon(x,t)\,[x - B_\varepsilon(t)]^2\, W_R(x - B_\varepsilon(t)), \tag{2.17}$$

and we study its growth in time when $2R = R^*$. If $F_\varepsilon$ and $\nu$ vanished, $W_R$ were identically equal to one and $B_\varepsilon(t)$ were constant, then $I(\varepsilon,t)$ would be constant along the motion.
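In general we have

$$\begin{aligned}
\frac{d}{dt} I(\varepsilon,t) ={}& \int dx\, \partial_t\omega_\varepsilon(x,t)\,[x - B_\varepsilon(t)]^2\, W_R(x - B_\varepsilon(t)) \\
&- \int dx\, \omega_\varepsilon(x,t)\, W_R(x - B_\varepsilon(t))\, 2[x - B_\varepsilon(t)]\cdot\frac{d}{dt}B_\varepsilon(t) \\
&- \int dx\, \omega_\varepsilon(x,t)\,[x - B_\varepsilon(t)]^2\, \nabla W_R(x - B_\varepsilon(t))\cdot\frac{d}{dt}B_\varepsilon(t) \\
={}& A + D + E
\end{aligned} \tag{2.18}$$

(by using the Navier–Stokes equation (2.6) and an integration by parts), where

$$A = \int dx\, \omega_\varepsilon(x,t)\, 2[x - B_\varepsilon(t)]\cdot\Big\{u_\varepsilon(x,t) + F_\varepsilon(x,t) - \frac{d}{dt}B_\varepsilon(t)\Big\}\, W_R(x - B_\varepsilon(t)), \tag{2.19}$$

$$D = \int dx\, \omega_\varepsilon(x,t)\,[x - B_\varepsilon(t)]^2\, \Big\{u_\varepsilon(x,t) + F_\varepsilon(x,t) - \frac{d}{dt}B_\varepsilon(t)\Big\}\cdot\nabla W_R(x - B_\varepsilon(t)), \tag{2.20}$$

$$E = \nu\int dx\, \omega_\varepsilon(x,t)\,\Big\{[x - B_\varepsilon(t)]^2\,\Delta W_R(x - B_\varepsilon(t)) + 4[x - B_\varepsilon(t)]\cdot\nabla W_R(x - B_\varepsilon(t)) + 4W_R(x - B_\varepsilon(t))\Big\}. \tag{2.21}$$

We study the three terms $A$, $D$, $E$. By using Eqs. (2.4) and (2.7),

$$A = \int dx\, \omega_\varepsilon(x,t)\, W_R(x - B_\varepsilon(t))\, 2[x - B_\varepsilon(t)]\cdot\int dy\, \omega_\varepsilon(y,t)\,\big\{K(x-y) + F_\varepsilon(x,t) - F_\varepsilon(B_\varepsilon(t))\big\} \tag{2.22}$$

$$= A_1 + A_2 + A_3$$

(by using the antisymmetry of $K(x-y)$ and the decomposition (2.1)), where

$$A_1 = \int dx \int dy\, \omega_\varepsilon(x,t)\,\omega_\varepsilon(y,t)\, K(x-y)\cdot\big\{W_R(x - B_\varepsilon(t))[x - B_\varepsilon(t)] - W_R(y - B_\varepsilon(t))[y - B_\varepsilon(t)]\big\}, \tag{2.23}$$

$$A_2 = \int dx\, \omega_\varepsilon(x,t)\, W_R(x - B_\varepsilon(t))\, 2[x - B_\varepsilon(t)]\cdot\big[F_{1;\varepsilon}(x,t) - F_{1;\varepsilon}(B_\varepsilon(t))\big], \tag{2.24}$$

$$A_3 = \int dx\, \omega_\varepsilon(x,t)\, W_R(x - B_\varepsilon(t))\, 2[x - B_\varepsilon(t)]\cdot\big[F_{2;\varepsilon}(x,t) - F_{2;\varepsilon}(B_\varepsilon(t))\big]. \tag{2.25}$$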
We investigate $A_1$. We remark that the integrand vanishes when both $|x - B_\varepsilon(t)|$ and $|y - B_\varepsilon(t)|$ are less than $R$ (here we have used the orthogonality of $K(x)$ and $x$). Moreover the integrand is bounded. This is not trivial because of the singularity of $K(x-y)$, which behaves like $|x-y|^{-1}$, but this singularity is compensated by a zero of the same order. In fact

$$\begin{aligned}
&\big|W_R(x - B_\varepsilon(t))\,[x - B_\varepsilon(t)] - W_R(y - B_\varepsilon(t))\,[y - B_\varepsilon(t)]\big| \\
&\quad= \Big|\tfrac12\big[W_R(x - B_\varepsilon(t)) + W_R(y - B_\varepsilon(t))\big]\big\{[x - B_\varepsilon(t)] - [y - B_\varepsilon(t)]\big\} \\
&\qquad+ \tfrac12\big[W_R(x - B_\varepsilon(t)) - W_R(y - B_\varepsilon(t))\big]\big\{[x - B_\varepsilon(t)] + [y - B_\varepsilon(t)]\big\}\Big| \\
&\quad\le \text{const.}\ |x-y| \qquad \text{as } x \to y
\end{aligned} \tag{2.26}$$

(we have used Eq. (2.14)). In conclusion

$$|A_1| \le \text{const.}\ m_\varepsilon(R,t), \tag{2.27}$$

where

$$m_\varepsilon(R,t) = 1 - \int_{\Sigma(B_\varepsilon(t)|R)} dx\, \omega_\varepsilon(x,t) \tag{2.28}$$
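is the vorticity mass of the points farther from $B_\varepsilon(t)$ than $R$. We study $A_2$. We use the Lipschitz condition (2.2) and we reconstruct $I(\varepsilon,t)$:

$$|A_2| \le \text{const.}\ I(\varepsilon,t). \tag{2.29}$$

By using Eq. (2.3), we have

$$|A_3| \le \text{const.}\ \varepsilon^{\alpha}. \tag{2.30}$$

The study of $D$ is similar to that of $A_1$ and $A_3$. We obtain

$$|D| \le \text{const.}\ m_\varepsilon(R,t) + \text{const.}\ \varepsilon^{\alpha}. \tag{2.31}$$

Finally, by using Eqs. (2.14), (2.16), we obtain

$$|E| \le \text{const.}\ \nu. \tag{2.32}$$

We sum Eqs. (2.27), (2.29)–(2.32); we have

$$\frac{d}{dt} I(\varepsilon,t) \le \text{const.}\ m_\varepsilon(R,t) + \text{const.}\ I(\varepsilon,t) + \text{const.}\ \varepsilon^{\alpha} + \text{const.}\ \nu. \tag{2.33}$$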
Now we bound $m_\varepsilon(R,t)$ by $I(\varepsilon,t)$ and $\nu$. We define

$$I_M(\varepsilon,t) = \sup_{0\le\tau\le t} I(\varepsilon,\tau) \tag{2.34}$$

and we prove the following lemma.

Lemma 2.1. There exist three constants such that

$$m_\varepsilon(R,t) \le \text{const.}\ I_M(\varepsilon,t) + \text{const.}\ \varepsilon^{\alpha} + \text{const.}\ \nu. \tag{2.35}$$
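Proof. We define a mollified version of $m_\varepsilon(R,t)$:

$$\mu_\varepsilon(R,t) = 1 - \int dx\, \omega_\varepsilon(x,t)\, W_R(x - B_\varepsilon(t)), \tag{2.36}$$

and we compute

$$\begin{aligned}
\frac{d}{dt}\mu_\varepsilon(R,t) ={}& -\int dx\, \partial_t\omega_\varepsilon(x,t)\, W_R(x - B_\varepsilon(t)) + \int dx\, \omega_\varepsilon(x,t)\,\frac{d}{dt}B_\varepsilon(t)\cdot\nabla W_R(x - B_\varepsilon(t)) \\
={}& \int dx\, \omega_\varepsilon(x,t)\,\Big\{\frac{d}{dt}B_\varepsilon(t) - u_\varepsilon(x,t) - F_\varepsilon(x,t)\Big\}\cdot\nabla W_R(x - B_\varepsilon(t)) \\
&- \nu\int dx\, \omega_\varepsilon(x,t)\,\Delta W_R(x - B_\varepsilon(t)).
\end{aligned} \tag{2.37}$$

By the same estimates used to obtain Eqs. (2.27), (2.30), (2.32) we have

$$\frac{d}{dt}\mu_\varepsilon(R,t) \le \text{const.}\ [m_\varepsilon(R,t) - m_\varepsilon(2R,t)] + \text{const.}\ \varepsilon^{\alpha} + \text{const.}\ \nu, \tag{2.38}$$

which implies

$$\mu_\varepsilon(R,t) \le \text{const.}\int_0^t d\tau\, [m_\varepsilon(R,\tau) - m_\varepsilon(2R,\tau)] + \text{const.}\ \varepsilon^{\alpha} + \text{const.}\ \nu. \tag{2.39}$$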
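We choose $\varepsilon \ll R$ so that

$$\mu_\varepsilon\Big(\frac{R}{4},0\Big) = 0. \tag{2.40}$$

We observe that trivially

$$\mu_\varepsilon\Big(\frac{R}{4},t\Big) \ge m_\varepsilon\Big(\frac{R}{2},t\Big) \ge m_\varepsilon(R,t) \tag{2.41}$$

and

$$\frac{R^2}{4}\Big[m_\varepsilon\Big(\frac{R}{2},t\Big) - m_\varepsilon(R,t)\Big] \le I(\varepsilon,t). \tag{2.42}$$

From Eqs. (2.39), (2.41), (2.42) we obtain the proof of the lemma.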
We insert the estimate of this lemma in Eq. (2.33), we integrate and we obtain

$$|I(\varepsilon,t) - I(\varepsilon,0)| \le \text{const.}\int_0^t d\tau\, I_M(\varepsilon,\tau) + \text{const.}\int_0^t d\tau\, I(\varepsilon,\tau) + \text{const.}\ \varepsilon^{\alpha} + \text{const.}\ \nu. \tag{2.43}$$

We take the sup in $t$, we use the Gronwall Lemma and the condition on the initial data. We have

$$I_M(\varepsilon,t) \le \text{const.}\ \varepsilon^{2} + \text{const.}\ \varepsilon^{\alpha} + \text{const.}\ \nu. \tag{2.44}$$

This equation and Lemma 2.1 imply that the vorticity mass outside a disk centered at $B_\varepsilon(t)$, of radius as small as we want, goes to zero as $\varepsilon$ and $\nu$ vanish. Since $B_\varepsilon(t) \to B(t)$ in the same limit (Eq. (2.5)), the proof is achieved.
Notice that in the proof of Theorem 2.1 we have not used the fact that the viscosity vanishes faster than a power of $\varepsilon$ (Eq. (1.9)), but only the fact that it vanishes as $\varepsilon \to 0$. In the next theorem, on the contrary, we will use this property. We want to show that the vorticity going far from $B_\varepsilon(t)$ has a very small mass and so can produce only a very weak velocity field.

Theorem 2.2. Let the assumptions and the notations of Theorem 2.1 hold. Then, for any $\beta > 0$, there exists $\varepsilon_1 > 0$ such that, for $\varepsilon \le \varepsilon_1$,

$$m_\varepsilon(R^*,t) \le \text{const.}\ \varepsilon^{\beta}. \tag{2.45}$$
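Proof. In this proof for simplicity we assume $\alpha \le 2$. Of course the proof in the case $\alpha \ge 2$ is similar. We start from Eq. (2.37), which can be written

$$\frac{d}{dt}\mu_\varepsilon(R,t) = G + H + N + P, \tag{2.46}$$

where

$$G \equiv -\int dx\, \omega_\varepsilon(x,t)\, u_\varepsilon(x,t)\cdot\nabla W_R(x - B_\varepsilon(t)), \tag{2.47}$$

$$H \equiv \int dx\, \omega_\varepsilon(x,t)\,\big[F_{1;\varepsilon}(B_\varepsilon(t),t) - F_{1;\varepsilon}(x,t)\big]\cdot\nabla W_R(x - B_\varepsilon(t)), \tag{2.48}$$

$$N \equiv \int dx\, \omega_\varepsilon(x,t)\,\big[F_{2;\varepsilon}(B_\varepsilon(t),t) - F_{2;\varepsilon}(x,t)\big]\cdot\nabla W_R(x - B_\varepsilon(t)), \tag{2.49}$$

$$P \equiv -\nu\int dx\, \omega_\varepsilon(x,t)\,\Delta W_R(x - B_\varepsilon(t)). \tag{2.50}$$

We study these four terms. We investigate $G$. Using the relation between velocity and vorticity:

$$G \equiv -\int dx\, \omega_\varepsilon(x,t)\int dy\, \omega_\varepsilon(y,t)\, K(x-y)\cdot\nabla W_R(x - B_\varepsilon(t)). \tag{2.51}$$

To estimate this term, we split the integration domain into the following sets:

$$T_h = \Big\{(x,y)\ \Big|\ x \notin \Sigma(B_\varepsilon(t)|R),\ y \in \big[\Sigma(B_\varepsilon(t)|a_h) - \Sigma(B_\varepsilon(t)|a_{h-1})\big]\Big\} \qquad \text{if } h < n, \tag{2.52}$$

$$T_h = \Big\{(x,y)\ \Big|\ x \notin \Sigma(B_\varepsilon(t)|R),\ y \notin \Sigma(B_\varepsilon(t)|a_{n-1})\Big\} \qquad \text{if } h = n, \tag{2.53}$$

$$S_h = \Big\{(x,y)\ \Big|\ y \notin \Sigma(B_\varepsilon(t)|R),\ x \in \big[\Sigma(B_\varepsilon(t)|a_h) - \Sigma(B_\varepsilon(t)|a_{h-1})\big]\Big\} \qquad \text{if } h < n, \tag{2.54}$$

$$S_h = \Big\{(x,y)\ \Big|\ y \notin \Sigma(B_\varepsilon(t)|R),\ x \notin \Sigma(B_\varepsilon(t)|a_{n-1})\Big\} \qquad \text{if } h = n, \tag{2.55}$$

where $n$ is a positive integer and $a_h$ is defined as

$$a_0 = 0, \qquad a_1 = \varepsilon^{\alpha/10}, \qquad a_{h+1} = 2a_h. \tag{2.56}$$

We choose $n$ such that $a_{n+1} \le R$ and $a_{n+2} > R$.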
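Notice that the integrand in Eq. (2.51) vanishes in the complement of $\bigcup_{h=1}^{n}(T_h \cup S_h)$. Thanks to the identity $\nabla W_R(x - B_\varepsilon(t))\cdot K(x - B_\varepsilon(t)) = 0$, the contribution to the integral (2.51) due to $T_h$, $h < n$, is bounded by

$$\int dx \int_{\Sigma(B_\varepsilon(t)|a_h) - \Sigma(B_\varepsilon(t)|a_{h-1})} dy\, \omega_\varepsilon(x,t)\,\omega_\varepsilon(y,t)\,\big[K(x-y) - K(x - B_\varepsilon(t))\big]\cdot\nabla W_R(x - B_\varepsilon(t)). \tag{2.57}$$

By the explicit form of $K(x)$, we have

$$|K(x-y) - K(x)| < \text{const.}\ \frac{\gamma}{|x|\,(|x| - \gamma)} \qquad \text{if } |y| < \gamma. \tag{2.58}$$

Hence

$$(2.57) \le \text{const.}\ \frac{a_h}{R\,(R - a_h)}\, R^{-1}\, m_\varepsilon(a_{h-1},t)\, m_\varepsilon(R,t). \tag{2.59}$$

Since

$$m_\varepsilon(a_{h-1},t) \le a_{h-1}^{-2}\, I(\varepsilon,t), \tag{2.60}$$

by using estimate (2.44)

$$(2.57) \le \text{const.}\ \frac{\varepsilon^{\alpha}}{R^2\,(R - a_h)\,a_{h-1}}\, m_\varepsilon(R,t). \tag{2.61}$$

In conclusion the contribution of the set $\bigcup_{h=1}^{n-1} T_h$ to the integral in Eq. (2.51) is less than

$$\text{const.}\ \frac{\varepsilon^{9\alpha/10}}{R^3}\, m_\varepsilon(R,t). \tag{2.62}$$

We evaluate the contribution of the set $T_n$. Using the antisymmetry of $K(x-y)$:

$$G = -\frac12\int dx\, \omega_\varepsilon(x,t)\int dy\, \omega_\varepsilon(y,t)\, K(x-y)\cdot\big[\nabla W_R(x - B_\varepsilon(t)) - \nabla W_R(y - B_\varepsilon(t))\big]. \tag{2.63}$$

We proceed as in Eq. (2.27) and we obtain that this term is smaller than

$$\text{const.}\ \frac{\varepsilon^{\alpha}}{R^4}\, m_\varepsilon(R,t). \tag{2.64}$$

We can handle the term with $S_h$ in the same way. In conclusion

$$|G| \le \Big(\text{const.}\ \frac{\varepsilon^{9\alpha/10}}{R^3} + \text{const.}\ \frac{\varepsilon^{\alpha}}{R^4}\Big)\, m_\varepsilon(R,t). \tag{2.65}$$

We study now the terms $H$, $N$ and $P$. By using the Lipschitz condition

$$|H| \le \text{const.}\ m_\varepsilon(R,t) \tag{2.66}$$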
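and from Eq. (2.3)

$$|N| \le \text{const.}\ \frac{\varepsilon^{\alpha}}{R}\, m_\varepsilon(R,t) \tag{2.67}$$

and from the property $\nu \le \text{const.}\ \varepsilon^{\alpha}$ we have

$$|P| \le \text{const.}\ \frac{\varepsilon^{\alpha}}{R^2}\, m_\varepsilon(R,t). \tag{2.68}$$

We sum Eqs. (2.65)–(2.68) and we obtain:

$$\frac{d}{dt}\mu_\varepsilon(R,t) \le \Big(\text{const.}\ \frac{\varepsilon^{9\alpha/10}}{R^3} + \text{const.}\ \frac{\varepsilon^{\alpha}}{R^4} + \text{const.} + \text{const.}\ \frac{\varepsilon^{\alpha}}{R} + \text{const.}\ \frac{\varepsilon^{\alpha}}{R^2}\Big)\, m_\varepsilon(R,t) \le \text{const.}\ m_\varepsilon(R,t), \tag{2.69}$$

since $R > \varepsilon^{\alpha/10}$ from Eq. (2.56). Equation (2.69) implies

$$\mu_\varepsilon(R,t) \le \mu_\varepsilon(R,0) + \text{const.}\int_0^t d\tau\, m_\varepsilon(R,\tau). \tag{2.70}$$

We observe that the term $\mu_\varepsilon(R,0)$ vanishes and

$$\mu_\varepsilon\Big(\frac{R}{2},t\Big) \ge m_\varepsilon(R,t), \tag{2.71}$$

so that Eq. (2.70) becomes

$$\mu_\varepsilon(R,t) \le \text{const.}\int_0^t d\tau\, \mu_\varepsilon\Big(\frac{R}{2},\tau\Big). \tag{2.72}$$

We iterate Eq. (2.72) $(n-2)$ times, where $n$ is such that

$$\varepsilon^{\alpha/10}\, 2^{n+1} \le R^* \qquad \text{and} \qquad \varepsilon^{\alpha/10}\, 2^{n+2} > R^*. \tag{2.73}$$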
Hence from Eq. (2.72), by using the Stirling formula,

$$m_\varepsilon(R,t) \le \Big(\frac{\text{const.}}{n-2}\Big)^{n-2}. \tag{2.74}$$

Since from Eq. (2.73), for small $\varepsilon$,

$$n \approx \text{const.}\ |\ln\varepsilon|, \tag{2.75}$$

$m_\varepsilon(R^*,t)$ vanishes faster than any power of $\varepsilon$ as $\varepsilon$ vanishes. Hence the proof is achieved.

We show now that $m_\varepsilon(R^*,t)$ can produce a velocity field as small as we want.

Theorem 2.3. Let the assumptions and the notations of Theorem 2.1 and Theorem 2.2 hold. Then, for any $\beta' > 0$, there exists $\varepsilon_2 > 0$ such that, for $\varepsilon \le \varepsilon_2$,

$$|F_m| \le \text{const.}\ \varepsilon^{\beta'}, \tag{2.76}$$

where $F_m$ is the velocity field produced by the vorticity mass outside $\Sigma(B_\varepsilon(t)|R^*)$.

Proof.

$$|F_m| = \Big|\int_{\mathbb{R}^2 - \Sigma(B_\varepsilon(t)|R^*)} dy\, K(x-y)\,\omega_\varepsilon(y,t)\Big|. \tag{2.77}$$

The kernel is monotonically unbounded as $y \to x$, and so the maximum of the integral is obtained when we rearrange the vorticity mass as close as possible to the singularity:

$$|F_m| \le \text{const.}\ \varepsilon^{-\gamma}\int_{\Sigma(O|\eta)} dy\, |y|^{-1}, \tag{2.78}$$

where $O$ denotes the origin and $\eta$ is such that

$$M\,\varepsilon^{-\gamma}\,\pi\eta^{2} = m_\varepsilon(R^*,t). \tag{2.79}$$

By using Theorem 2.2, $|F_m|$ can be made as small as we want.
We now return to the main theorem and sketch the proof of Theorem 1.1, which is an easy consequence of the previous results. We have studied the motion of one vortex under the action of an external field. This field must have several properties, and the evolution produces an extremely small vorticity mass far from the vortex, which in turn produces a field weaker than the previous one. We model the effect of the other vortices on a particular one by such an external field, regular and weak enough for Theorems 2.1 and 2.3 to apply. We have proved that the vortex moves in such a way as to produce a good external field on the other vortices. This bootstrap procedure is surely valid at the initial time and so, by continuity, until time T. Note that in the proof we do not use the Navier–Stokes equation in the form (1.1) but only in a weak form, obtained by an integration by parts. So the generalization to nonsmooth initial data is straightforward.
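The step from Eqs. (2.73)–(2.74) to decay faster than any power can be illustrated numerically. The Python sketch below computes the iteration count $n$ fixed by Eq. (2.73) and the logarithm of the right-hand side of Eq. (2.74). The unspecified constant in (2.74) is set to `C = 1.0` and the sample values of `alpha` and `R_star` are arbitrary; all of these, like the function names, are assumptions made for illustration only.

```python
import math

def iterations(eps, alpha, R_star):
    """Integer n with eps**(alpha/10) * 2**(n+1) <= R_star < eps**(alpha/10) * 2**(n+2)."""
    a1 = eps ** (alpha / 10.0)
    return int(math.floor(math.log2(R_star / a1))) - 1

def log_bound(n, C=1.0):
    """Natural log of (C / (n - 2)) ** (n - 2), cf. Eq. (2.74); C is hypothetical."""
    k = n - 2
    return k * (math.log(C) - math.log(k))

def effective_exponent(eps, alpha, R_star, C=1.0):
    """The beta for which the bound in Eq. (2.74) equals eps ** beta."""
    return log_bound(iterations(eps, alpha, R_star), C) / math.log(eps)
```

Since $n$ grows like $|\ln\varepsilon|$, the effective exponent grows like $\ln|\ln\varepsilon|$ and therefore eventually exceeds any fixed $\beta$; the growth is slow, which the sample values make visible.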
Communicated by Ya. G. Sinai
Commun. Math. Phys. 196, 67 – 76 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Asymptotically Euclidean Manifolds and Twistor Spinors

Wolfgang Kühnel^{1,⋆}, Hans-Bert Rademacher^2

1 Mathematisches Institut B, Universität Stuttgart, 70550 Stuttgart, Germany. E-mail: [email protected]
2 Mathematisches Institut, Universität Leipzig, Augustusplatz 10/11, 04109 Leipzig, Germany. E-mail: [email protected]

Received: 25 July 1997 / Accepted: 8 January 1998
Abstract: Based on the Berger–Simons holonomy classification, we characterize all Riemannian spin manifolds carrying a twistor spinor with at least one zero. In particular, the dimension n of the manifold is either even or n = 7. Outside the set of zeros of the twistor spinor the metric is conformal to either a flat metric or a Ricci flat and locally irreducible metric.

1. Introduction and Result

On a Riemannian manifold with spin structure there lives the Dirac operator and also another natural differential operator, the twistor operator. It is also called the Penrose operator since it was first introduced in General Relativity by R. Penrose. The kernel of this operator is formed by the twistor spinors, which satisfy the twistor equation

$$\nabla_X \varphi + \frac{1}{n}\, X\cdot D\varphi = 0 \tag{1}$$

for every vector field $X$. Here $\nabla_X\varphi$ is the spinor derivative of the spinor field $\varphi$ in the direction of the vector field $X$, $D$ denotes the Dirac operator, the dot $\cdot$ denotes the Clifford multiplication and $n = \dim M$ is the dimension of the manifold. One can regard a twistor spinor on $(M,g)$ as a Killing vector field on a canonically associated supermanifold, cf. [ACDS]. Particular cases of twistor spinors are parallel spinors, which satisfy $\nabla\varphi = 0$, and real Killing spinors, which satisfy the equation $\nabla_X\varphi = \lambda X\cdot\varphi$ for all $X$ and some real number $\lambda \ne 0$. A manifold with a parallel spinor is Ricci flat and a manifold carrying a real Killing spinor is Einstein with positive scalar curvature $4\lambda^2(n-1)n$. It was remarked by Hitchin [Hi] that a simply-connected and irreducible spin manifold of dimension $n$ with a parallel spinor has special holonomy.

⋆ Supported by Max-Planck-Institute for Mathematics in the Sciences (MIS), Leipzig
In particular one concludes from the Berger–Simons theorem on holonomy: If the dimension $n \ne 7, 8$ then either the manifold has holonomy SU(m) with $n = 2m$ and carries a two-dimensional space of parallel spinors, or the manifold is hyperkählerian, i.e. has holonomy Sp(m) where $n = 4m$, and carries an $(m+1)$-dimensional space of parallel spinors, cf. [Wa]. According to Bär [Ba] one can reduce the case of manifolds carrying real Killing spinors to the case of manifolds carrying parallel spinors as follows: If a compact $(M,g)$ carries a real Killing spinor then the cone $(\mathbb{R}^+ \times_r M,\ dr^2 + r^2 g)$ over $M$ is either flat or irreducible and carries a parallel spinor.

If $u = \langle\varphi,\varphi\rangle$ is the length of the twistor spinor $\varphi$ then the function $A_\varphi = 4u\|D\varphi\|^2 - n^2\|\nabla u\|^2$ is a non-negative constant. Moreover, the conformally equivalent metric $\bar g = u^{-2}g$ is an Einstein metric with non-negative scalar curvature $\bar s = (n-1)A_\varphi/n$. The last statement holds everywhere except at the set of zeros of $\varphi$. If $A_\varphi > 0$ then the metric $\bar g$ carries a real Killing spinor, if $A_\varphi = 0$ then the metric $\bar g$ carries a parallel spinor, cf. [Fr, 5.3]. Hence the case of a manifold carrying a twistor spinor without zero can be reduced by a conformal change to the case of a manifold carrying a parallel spinor or a real Killing spinor.

In the case of a manifold carrying a twistor spinor with zero things are different. One first observes that the set of zeros of a twistor spinor is discrete. Although there are compact manifolds carrying parallel spinors, respectively real Killing spinors, which are not conformally flat, we have the following result. The proof uses the solution of the Yamabe problem, i.e., the existence of a metric of constant scalar curvature in the conformal class:

Theorem 1.1 (Lichnerowicz [Li, Thm.7]). A compact Riemannian spin manifold carrying a non-trivial twistor spinor with zero is conformally equivalent to the standard sphere.
In [KR1] we showed that a manifold carrying a twistor spinor $\varphi$ with zero is conformally flat if the associated conformal vector field $V_\varphi$ defined by the equation $\langle V_\varphi, X\rangle = \sqrt{-1}\,\langle\varphi, X\cdot\varphi\rangle$ is non-trivial. Under additional curvature and completeness assumptions this was shown before by K. Habermann [Ha]. In [KR4] the global conformal type of conformally flat manifolds with twistor spinors with zeros is investigated. The authors gave in [KR2] resp. [KR3] complete Riemannian metrics $g$ on $\mathbb{R}^4$ resp. $\mathbb{R}^{2m}$ carrying a two-dimensional space of twistor spinors with a common zero point which are not conformally flat. If a 4-dimensional manifold carries a twistor spinor then the manifold is half-conformally flat. The conformally equivalent Ricci flat metric $\bar g$ is in these examples asymptotically Euclidean. In dimension 4 it is the Eguchi–Hanson metric and in dimension $2m$ higher dimensional analogues which can be used to define complete metrics on a line bundle over $\mathbb{CP}^n$ with holonomy SU(m) due to Calabi [Ca], cf. also [FG]. Our main result is the following:

Theorem 1.2. Let $(M,g)$ be an $n$-dimensional Riemannian spin manifold carrying a twistor spinor with non-empty set $Z_\varphi$ of zeros. Then the following hold:
(a) The inverted normal coordinates around every $p \in Z_\varphi$ (as defined in Sect. 2) give an asymptotically Euclidean coordinate system of order 3 for the metric $\bar g = \|\varphi\|^{-4}g$ outside $p$.
(b) The metric $\bar g$ is either flat or Ricci flat and locally irreducible.

Using recent results by M. Herzlich [He] one obtains a partial version of a converse statement: Let $(\bar M,\bar g)$ be a Riemannian spin manifold which carries a parallel spinor and has an end with an asymptotically Euclidean coordinate system of order 2. Then by adding a point $q$ we can conformally compactify this end and obtain a $C^2$-metric on $\bar M \cup \{q\}$ with a twistor spinor having a zero at $q$.

Using the Berger–Simons theorem on holonomy as in [Hi, p.8, footnote p.54] and [Wa, Prop.], we conclude from our main result the following classification result for the manifold $(\bar M = M - Z_\varphi,\ \bar g)$:

Theorem 1.3. Let $(M,g)$ be a Riemannian spin manifold carrying a twistor spinor $\varphi$ with non-empty set $Z_\varphi$ of zeros. Then the conformally equivalent and Ricci flat metric $\bar g = \|\varphi\|^{-4}g$ on $\bar M = M - Z_\varphi$ carries a parallel spinor and is either flat or locally irreducible. If $(M,g)$ is not conformally flat the following holds: Denote by $N$ the dimension of the space of twistor spinors on $(M,g)$. Then for all twistor spinors on $(M,g)$ the set of zeros equals $Z_\varphi$ and for the restricted holonomy $\mathrm{Hol}_0$ of the conformally equivalent and Ricci flat metric $(\bar M,\bar g)$ one of the following holds:
a) $n = 2m$, $m \ge 2$, $\mathrm{Hol}_0 = \mathrm{SU}(m)$ and $N = 2$.
b) $n = 4m$, $m \ge 2$, $\mathrm{Hol}_0 = \mathrm{Sp}(m)$ and $N = m + 1$.
c) $n = 8$, $\mathrm{Hol}_0 = \mathrm{Spin}(7)$ and $N = 1$.
d) $n = 7$, $\mathrm{Hol}_0 = G_2$ and $N = 1$.
The results of [KR3] give examples for case a) in all even dimensions. Another consequence of our result is an alternative proof of Theorem 1.1 due to Lichnerowicz in the compact case. Instead of the solution of the Yamabe problem we apply the Cheeger–Gromoll splitting theorem to conclude that there is only one zero, and the rigidity part of the Bishop volume comparison theorem to conclude that the manifold with the conformally equivalent metric $\bar g$ is actually flat, see Corollary 3.6.

2. Inverted Normal Coordinates

In Riemannian normal coordinates around a point $p$ a Riemannian metric is Euclidean up to a term of order 2. By using an inversion of the Riemannian normal coordinates and a conformal change which coincides with the inversion in the case of the Euclidean space one obtains an asymptotically Euclidean metric of order 2. Here one obtains that the metric coefficients are Euclidean up to order 2, the first derivatives of the metric coefficients vanish of order 3 and the second derivatives vanish of order 4. We give a precise formulation in the particular case where the curvature tensor vanishes at $p$; then the orders can be improved.

Definition 2.1. Let $U$ be an open subset of a Riemannian manifold. We call a diffeomorphism $y \in \{u \in \mathbb{R}^n;\ \|u\|^2 > R\} \mapsto \phi(y) \in U$ an asymptotically Euclidean coordinate system of order $\tau$ if in the coordinates $(y_1,\ldots,y_n)$ we have

$$g_{ij} = \delta_{ij} + O(\rho^{-\tau}), \qquad \partial_k g_{ij} = O(\rho^{-\tau-1}), \qquad \partial_k\partial_l g_{ij} = O(\rho^{-\tau-2})$$

for $\rho = \|y\| \to \infty$. A shorthand notation for this asymptotic behaviour is $g_{ij} = \delta_{ij} + O''(\rho^{-\tau})$. A Riemannian manifold $N$ is called asymptotically Euclidean of order $\tau$ if there is a decomposition $N = N_0 \cup N_\infty$ with $N_0$ compact such that the open set $N_\infty$ carries asymptotically Euclidean coordinates of order $\tau$.
Let $x = (x_1,\ldots,x_n)$ be Riemannian normal coordinates around a point $p \in M$ at which the Riemann curvature tensor $R$ vanishes, i.e. $R(p) = 0$. Then

$$g_{ij}(x) = \delta_{ij} + h_{ij}, \qquad h_{ij} = O(r^3),$$

where $r^2 = \sum_k x_k^2$. Now introduce inverted normal coordinates $z = (z_1,\ldots,z_n)$ outside $p$ with $z_i = r^{-2}x_i$. Let $\rho^2 = \sum_k z_k^2$; then $\rho = 1/r$. Then the conformally equivalent metric $\hat g = \rho^4 g$ is given in the inverted normal coordinates $z_i$ as follows:

$$\hat g_{ij}(z) = \hat g\Big(\frac{\partial}{\partial z_i},\frac{\partial}{\partial z_j}\Big) = \delta_{ij} + h_{ij} - \frac{2}{\rho^2}\Big(z_i\sum_k z_k h_{kj} + z_j\sum_l z_l h_{il}\Big) + \frac{4}{\rho^4}\, z_i z_j\sum_{k,l} z_k z_l h_{kl}. \tag{2}$$

Then one computes that

$$\hat g_{ij}(z) = \delta_{ij} + O''(\rho^{-3}),$$

which is the shorthand notation for:

$$\hat g_{ij}(z) = \delta_{ij} + O(\rho^{-3}); \qquad \frac{\partial}{\partial z_k}\hat g_{ij}(z) = O(\rho^{-4}); \qquad \frac{\partial^2}{\partial z_k\,\partial z_l}\hat g_{ij}(z) = O(\rho^{-5}).$$
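The algebra behind Eq. (2) can be checked pointwise. The Python sketch below pulls a metric $\delta + h$ back under the inversion $x = z/\rho^2$, rescales it by $\rho^4$, and compares the result with the right-hand side of Eq. (2). The symmetric matrix `h` is treated as a fixed array of sample numbers at a single point, and the function names are illustrative assumptions.

```python
def _rho2(z):
    """Squared Euclidean norm rho^2 = sum_k z_k^2."""
    return sum(c * c for c in z)

def inversion_jacobian(z):
    """Jacobian J[a][i] = dx_a/dz_i of the inversion x = z / rho^2."""
    n = len(z)
    rho2 = _rho2(z)
    return [[((1.0 if a == i else 0.0) - 2.0 * z[a] * z[i] / rho2) / rho2
             for i in range(n)] for a in range(n)]

def rescaled_pullback(z, h):
    """hat g_ij = rho^4 * sum_{a,b} (delta_ab + h_ab) J_ai J_bj."""
    n = len(z)
    rho4 = _rho2(z) ** 2
    J = inversion_jacobian(z)
    g = [[(1.0 if a == b else 0.0) + h[a][b] for b in range(n)] for a in range(n)]
    return [[rho4 * sum(g[a][b] * J[a][i] * J[b][j]
                        for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

def formula_2(z, h):
    """Right-hand side of Eq. (2), evaluated for the same z and h."""
    n = len(z)
    rho2 = _rho2(z)
    zh = [sum(z[k] * h[k][j] for k in range(n)) for j in range(n)]  # sum_k z_k h_kj
    hz = [sum(z[l] * h[i][l] for l in range(n)) for i in range(n)]  # sum_l z_l h_il
    zhz = sum(z[k] * z[l] * h[k][l] for k in range(n) for l in range(n))
    return [[(1.0 if i == j else 0.0) + h[i][j]
             - 2.0 / rho2 * (z[i] * zh[j] + z[j] * hz[i])
             + 4.0 / rho2 ** 2 * z[i] * z[j] * zhz
             for j in range(n)] for i in range(n)]
```

With `h` equal to zero the rescaled pullback is the identity matrix, which is the statement that the conformal change $\rho^4 g$ undoes the inversion in Euclidean space.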
Hence we proved the following

Lemma 2.2 (cf. [LP, ch. 6]). If the curvature tensor vanishes at $p$ then the inverted normal coordinates are asymptotically Euclidean coordinates of order 3.

For the proof of Lichnerowicz' Theorem 1.1 using Theorem 1.2 we use the following

Lemma 2.3. Let $(M,g)$ be a Riemannian manifold with a decomposition $M = M_0 \cup M_\infty$, where $M_0$ is compact, and with a diffeomorphism $y \in \{u \in \mathbb{R}^n;\ \|u\|^2 > R_0\} \mapsto \phi(y) \in M_\infty$ for some $R_0 > 0$. With respect to these coordinates $y = (y_1,\ldots,y_n)$ the metric coefficients $g_{ij}$ for $\|y\| \to \infty$ are of the form $g_{ij} = \delta_{ij} + O(r^{-\tau})$ for some $\tau > 1$ and with $r^2 = \|y\|^2 = \sum_j y_j^2$. Then the following holds: If the Ricci curvature of $(M,g)$ is non-negative then the manifold is isometric to the Euclidean space $\mathbb{R}^n$.

Proof. Choose $p \in M_0$ and denote by $B_t(p) = \{y \in M \mid d(y,p) < t\}$ the distance ball of radius $t$ around $p$. Denote $A(t) = M_0 \cup \{\phi(y) \mid R_0 < \|y\| < t\}$. Since the closure of $A(t)$ is compact there is a $t_0 > 0$ such that $A(2R_0) \subset B_{t_0}(p)$. Let $q = \phi(y) \in A(2R_0 + t)$ for some $t > 0$. Choose the path $\gamma : s \in [2R_0, 2R_0 + t] \mapsto \phi(sy/\|y\|)$. The estimate $g(\dot\gamma(s),\dot\gamma(s)) = 1 + O(s^{-\tau})$ with $\tau > 1$ implies that there is a constant $\lambda > 0$ such that $L(\gamma) \le t + \lambda$ for all $t > 0$. The triangle inequality implies $d(p,q) \le d(p,\gamma(2R_0)) + d(\gamma(2R_0),q) \le t_0 + t + \lambda$ for all $t > 0$. Hence we have shown that there is a positive constant $\mu$ such that for all $t > 0$: $A(2R_0 + t) \subset B_{t+\mu}(p)$. The Euclidean volume element $\mathrm{vol}_{\mathbb{R}^n} = dx_1\cdots dx_n$ can be written in polar coordinates with radial coordinate $r$ in the form $r^{n-1}\,dr\,\mathrm{vol}_{S^{n-1}}$, where $\mathrm{vol}_{S^{n-1}}$ is the volume element of the standard $(n-1)$-sphere. Therefore the estimate $\sqrt{\det(g_{ij})} = 1 + O(r^{-\tau})$ implies that $\mathrm{vol}\,A(r) = \omega_n r^n(1 + O(r^{-\tau}))$; here $\omega_n$ is the volume of the unit ball in Euclidean $n$-space. Hence we conclude:

$$\lim_{t\to\infty}\frac{\mathrm{vol}\,B_{t+\mu}(p)}{\omega_n t^n} \ge \lim_{t\to\infty}\frac{\mathrm{vol}\,A(2R_0 + t)}{\omega_n t^n} = 1.$$
Then the Bishop volume comparison theorem, in the formulation due to Gromov, implies that $(M,g)$ is isometric to the Euclidean $n$-space $\mathbb{R}^n$. For the reader's convenience we state this comparison result for manifolds of non-negative Ricci curvature as the following lemma.

Lemma 2.4 (cf. [E, Thm. 5.5, Rem. 5.7]). Let $(M,g)$ be a complete Riemannian manifold of non-negative Ricci curvature and let

$$t \in \mathbb{R}^+ \mapsto f(t) = \frac{\mathrm{vol}\,B_t(p)}{\omega_n t^n} \in \mathbb{R}^+$$

be the quotient of the volume of the distance ball of radius $t$ around $p$ in $(M,g)$ and the Euclidean volume of a ball of radius $t$. Then $f : \mathbb{R}^+ \to \mathbb{R}^+$ is monotone decreasing and $\lim_{t\to 0} f(t) = 1$. If $f(t_0) = 1$ for some $t_0 > 0$ then the ball $B_{t_0}(p)$ is isometric to the Euclidean ball of the same radius.

3. Twistor Spinors with Zeros

We study the asymptotic behaviour of the length of a twistor spinor with zero in a neighborhood of the zero point. First we recall the following proposition, which we already mentioned in the introduction:

Proposition 3.1 ([Fr, 5.3]). Let $u = \langle\varphi,\varphi\rangle$ be the length of a twistor spinor; then the function

$$A_\varphi = u\|D\varphi\|^2 - \frac{n^2}{4}\|\nabla u\|^2$$

is a non-negative constant. Moreover, the conformally equivalent metric $\bar g = u^{-2}g$ is an Einstein metric with non-negative scalar curvature $\bar s = 4(n-1)n^{-1}A_\varphi$. The last statement holds everywhere except at the zeros of $\varphi$. If $\bar s > 0$ then the metric $\bar g$ carries a real Killing spinor, if $\bar s = 0$ then the metric $\bar g$ carries a parallel spinor.

Since the twistor equation is conformally invariant we can choose a convenient metric in the conformal class:

Lemma 3.2. Let $(M,g)$ be a Riemannian manifold of dimension $n \ge 3$. Then there is in a coordinate neighborhood of $p$ a conformally equivalent metric $\bar g = f^{-2}g$ such that its Ricci tensor $\overline{\mathrm{Ric}}$ vanishes at $p$.

Proof. The (0,2) Ricci tensors $\mathrm{Ric}$, $\overline{\mathrm{Ric}}$ of the metrics $g$, $\bar g$ are related by the following formula:

$$\overline{\mathrm{Ric}} - \mathrm{Ric} = f^{-2}\Big\{(n-2)\,f\,\nabla^2 f + \big[f\,\Delta f - (n-1)\,g(\nabla f,\nabla f)\big]\cdot g\Big\}, \tag{3}$$

where the gradient, the hessian and the laplacian are defined with respect to the metric $g$. Let $x = (x_1,\ldots,x_n)$ be normal coordinates in a neighborhood of the point $p$. We identify
$p$ with the zero point of the coordinates. We assume that the coordinate vectors $\partial_i = \partial/\partial x_i$ for $i = 1,\ldots,n$ form an orthonormal basis of the tangent space $T_pM$ consisting of eigenvectors of the Ricci tensor at $p$. Hence there are real numbers $\lambda_1,\ldots,\lambda_n$ such that $\mathrm{Ric}(\partial_i,\partial_j) = \lambda_i\delta_{ij}$. In a neighborhood of $p$ we define $f(x) = 1 + \sum_j \mu_j x_j^2$, where the numbers $\mu_1,\ldots,\mu_n$ are uniquely determined by the linear system

$$-\lambda_i = 2(n-2)\mu_i + 2\sum_{j=1}^{n}\mu_j$$

for $n \ge 3$. Then it follows from Eq. 3 that $\overline{\mathrm{Ric}}(p) = 0$.

Lemma 3.3. Let $\varphi$ be a twistor spinor with zero $p$ and let $u = \langle\varphi,\varphi\rangle$ be its length function. If $\mathrm{Ric}(p) = 0$ we obtain in normal coordinates $x$ around $p$ with $r^2 = \sum_k x_k^2$:

$$u(x) = c\,r^2 + O(r^5), \qquad \text{where } c = 2\|D\varphi(p)\|^2/n^2.$$

Proof. Since $u$ is a non-negative function vanishing at $p$, $\nabla u(p) = 0$. The hessian of $u$ is given by

$$\nabla^2 u(X,Y) = \langle X, L(Y)\rangle\,u + \frac{2}{n^2}\,\langle X,Y\rangle\,\|D\varphi\|^2,$$

cf. [KR1, Eq.(18)]; here

$$L(X) = \frac{1}{n-2}\Big(\frac{s}{2(n-1)}\,X - \mathrm{ric}(X)\Big)$$

is the Schouten–Weyl tensor and $\mathrm{ric}$ denotes the (1,1)-Ricci tensor. Hence at the zero point the hessian is proportional to the metric:

$$\nabla^2 u(p) = \frac{2}{n^2}\,\|D\varphi(p)\|^2\, g$$

and

$$\nabla^3 u(Z,X,Y) = \langle X,(\nabla_Z L)(Y)\rangle\,u + \langle X, L(Y)\rangle\,\langle Z,\nabla u\rangle + \frac{2}{n}\,\langle X,Y\rangle\,\mathrm{Re}\,\langle L(Z)\varphi, D\varphi\rangle.$$

Here we use that a twistor spinor satisfies

$$\nabla_Z D\varphi = \frac{n}{2}\,L(Z)\varphi, \tag{4}$$
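cf. [BFGK, (1.34)], and $\mathrm{Re}\,\langle\varphi_1,\varphi_2\rangle = \tfrac12(\langle\varphi_1,\varphi_2\rangle + \langle\varphi_2,\varphi_1\rangle)$ denotes the real part of $\langle\varphi_1,\varphi_2\rangle$. Therefore $\nabla^3 u(p) = 0$ follows. Differentiating once again we obtain

$$\begin{aligned}
\nabla^4 u(U,Z,X,Y) ={}& \langle X,(\nabla_U\nabla_Z L)(Y)\rangle\,u + \langle X,(\nabla_Z L)(Y)\rangle\,\langle U,\nabla u\rangle \\
&+ \langle X,(\nabla_U L)(Y)\rangle\,\langle Z,\nabla u\rangle + \langle X, L(Y)\rangle\,\nabla^2 u(Z,U) \\
&+ \frac{2}{n}\,\langle X,Y\rangle\,\mathrm{Re}\,\Big\{\langle(\nabla_U L)(Z)\varphi, D\varphi\rangle + \langle L(Z)\nabla_U\varphi, D\varphi\rangle + \langle L(Z)\varphi, \nabla_U D\varphi\rangle\Big\}.
\end{aligned}$$

Using again Eq. 4 and the twistor equation 1 we obtain:

$$\begin{aligned}
\nabla^4 u(U,Z,X,Y) ={}& \langle X,(\nabla_U\nabla_Z L)(Y)\rangle\,u + \langle X,(\nabla_Z L)(Y)\rangle\,\langle U,\nabla u\rangle \\
&+ \langle X,(\nabla_U L)(Y)\rangle\,\langle Z,\nabla u\rangle + \langle X, L(Y)\rangle\,\nabla^2 u(Z,U) \\
&+ \frac{1}{n^2}\,\langle X,Y\rangle\,\Big\{2\langle L(Z),U\rangle\,\|D\varphi\|^2 + n^2\langle L(Z),L(U)\rangle\,u^2 + 2n\,\mathrm{Re}\,\langle(\nabla_U L)(Z)\varphi, D\varphi\rangle\Big\}.
\end{aligned}$$

At the zero $p$ of the twistor spinor we obtain:

$$\nabla^4 u(p)(U,Z,X,Y) = \frac{2}{n^2}\,\langle U, L(Z)\rangle\,\|D\varphi(p)\|^2\,\langle X,Y\rangle;$$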
$\mathrm{Ric}(p) = 0$ implies $L(p) = 0$, hence $\nabla^4 u(p) = 0$. This proves Lemma 3.3.

Furthermore, we will use the following observation due to K. Habermann:

Lemma 3.4. Let $p$ be a zero of the twistor spinor $\varphi$ which does not vanish identically. Then the Weyl tensor $W$ vanishes at $p$.

Proof. The Equation [BFGK, (1.40)] simplifies at the point $p$ as follows: $W(X,Y)Z\cdot D\varphi(p) = 0$ for all vectors $X, Y, Z$ at $p$. The pair $(\varphi, D\varphi)$ can be seen as a parallel section of the sum $E = \Sigma M \oplus \Sigma M$ of two copies of the spinor bundle with respect to a connection $\nabla^E$. Therefore $D\varphi(p) \ne 0$, which implies that $W(p) = 0$.

With these prerequisites we are in the position to prove the main result, Theorem 1.2.

Proof of Theorem 1.2. (a) Since the twistor equation is conformally invariant we can assume without loss of generality that $\mathrm{Ric}(p) = 0$, cf. Lemma 3.2. From Lemma 3.4 it follows that $W(p) = 0$, hence also the full Riemann curvature tensor vanishes at $p$, i.e. $R(p) = 0$. This implies that the metric coefficients in the normal coordinates $x$ around $p$ with $r^2 = \sum_k x_k^2$ satisfy

$$g_{ij} = \delta_{ij} + O(r^3).$$

Lemma 3.3 implies that $r^4u^{-2} = c^{-2}(1 + O(r^3))$. Then one computes that the function $r^4u^{-2}$ in inverted normal coordinates $z_i = r^{-2}x_i$ satisfies

$$r^4u^{-2} = c^{-2}\big(1 + O''(\rho^{-3})\big),$$

i.e. $\partial_k(r^4u^{-2}) = O(\rho^{-4})$ and $\partial_k\partial_j(r^4u^{-2}) = O(\rho^{-5})$, where $\partial_j$ denotes the derivative $\partial/\partial z_j$. This and Lemma 2.2 imply that the metric coefficients $\bar g_{ij}$ of the conformally equivalent metric

$$\bar g = c^2\,\frac{g}{u^2} = c^2\,\frac{r^4}{u^2}\,\hat g$$

with respect to the inverted normal coordinates $z$ with $z_k = r^{-2}x_k$ and $\rho^2 = r^{-2} = \sum_k z_k^2$ satisfy

$$\bar g_{ij} = \delta_{ij} + O''(\rho^{-3}).$$

(b) From the last equation it follows that there is $\rho_1 \ge \rho_0$ such that for all $\rho \ge \rho_1$ all principal curvatures of the distance spheres $\{\sum_k z_k^2 = \rho^2\}$ are $\ge (2\rho)^{-1}$. This implies
that all geodesics γ which start from a distance sphere in the direction of growing ρ, i.e. $\bar g(\gamma'(0), \partial_\rho) \ge 0$ and $\rho(\gamma(0)) \ge \rho_1$, are defined for all positive real numbers and $\lim_{t\to\infty} \rho(\gamma(t)) = \infty$. In geodesic normal coordinates the Ricci-flat metric $\bar g$ is analytic, hence the Riemann curvature tensor does not vanish on an open set unless the metric itself is flat. Now we assume that $\bar g$ is non-flat and locally reducible. Then by the above consideration we can choose a geodesic $\gamma : [0, \infty) \to M - \{p\}$ with $\lim_{t\to\infty} \rho(\gamma(t)) = \infty$ which in an open neighborhood U of γ(0) lies in the factor $U_1$ of the Riemannian product $U = U_1 \times U_2$ and such that the Riemann curvature tensor $R_2$ at γ(0) of the factor $U_2$ does not vanish. Hence we can choose analytic parallel vector fields X(t), Y(t) along γ tangential to $U_2$ which span a tangent plane with non-zero sectional curvature $K(X, Y) = K_2(X, Y)$, where $K_2$ is the sectional curvature of the factor $U_2$. Hence the function $t \mapsto K(X, Y)$ is a non-zero constant κ for small t. Then the analyticity implies that $\lim_{t\to\infty} K(X(t), Y(t)) = \kappa$. But the asymptotic behaviour of the coordinate system z implies that $K(X(t), Y(t)) = O(\rho^{-5}(\gamma(t)))$, which tends to 0 as $t \to \infty$, i.e. we obtain a contradiction.

For the proof of Theorem 1.3 we need the following

Lemma 3.5. Let M be a Ricci flat manifold with a non-constant smooth function $f : M \to \mathbb{R}_{\ge 0}$. If the set $f^{-1}(0)$ of zeros of f contains the point p ∈ M and if the conformally equivalent metric $f^{-2} g$ is also Ricci flat on $M - f^{-1}(0)$, then (M, g) is an open subset of Euclidean space.

Proof of the Lemma. Let $\bar g = f^{-2} g$ with a non-constant function f. From Eq. 3 we obtain the equations
$$\nabla^2 f = \frac{\Delta f}{n} \cdot g \tag{5}$$
and
$$2 f \cdot \Delta f = n \cdot g(\nabla f, \nabla f). \tag{6}$$
We conclude from Eq. 6 that the zero point p of f is also a critical point of f. By a result due to Tashiro (cf. [Kü, Lemma 18]) Eq. 5 then implies that the zeros of f are isolated and that the metric g in geodesic polar coordinates $(r, x) \in (0, a) \times S^{n-1}$ centered at p is of the form
$$dr^2 + \frac{F'(r)^2}{F''(0)^2}\, g_1(x)$$
for a smooth function $F : (0, a) \to \mathbb{R}$ for some a > 0. Here $g_1$ denotes the standard metric on $S^{n-1}$ and F(r) = f(r, x). Equation 6 implies that F satisfies the differential equation $2 F F'' = F'^2$. Together with F(0) = F'(0) = 0 this gives $F(r) = \tfrac{1}{2} F''(0)\, r^2$, so the metric g is of the form $dr^2 + r^2 g_1(x)$, i.e. it is Euclidean. Since p is the only critical point of $f(r, x) = F(r) = \tfrac{1}{2} F''(0)\, r^2$, the manifold is an open subset of $\mathbb{R}^n$.

Proof of Theorem 1.3. Let φ be a twistor spinor with non-empty set of zeros $Z_\varphi$. Assume that ψ is a twistor spinor with a zero $q \notin Z_\varphi$. Then it follows from Proposition 3.1 that there is a non-constant function f such that $\bar g = \|\varphi\|^{-4} g$ resp. $f^{-2} \bar g = \|\psi\|^{-4} g$ is a Ricci flat metric outside $Z_\varphi$ resp. $Z_\psi$. Here $f = \|\psi\|^2 / \|\varphi\|^2$, i.e., f(q) = 0. It follows from Lemma 3.5 that (M, g) is an open subset of Euclidean space. Hence we proved that the sets of zeros of twistor spinors coincide on a manifold which is not conformally flat. The further statements follow from the holonomy classification of Berger and Simons, as in [Hi, p.8, footnote p.54] and [Wa, Prop.].
In the following corollary we obtain an elementary proof of Lichnerowicz's Theorem 1.1. More precisely, the proof does not depend on global solutions of partial differential equations like the solution of the Yamabe problem.

Corollary 3.6. A compact Riemannian spin manifold carrying a non-trivial twistor spinor with zero is conformally equivalent to the standard sphere.

Proof. Since the zero set of the twistor spinor φ is a discrete set, there are only finitely many zeros $p_1, \dots, p_m$. We conclude from Theorem 1.2 that there are open neighborhoods $U_1, \dots, U_m$ of $p_1, \dots, p_m$ which are pairwise disjoint and which satisfy the following: For the conformally equivalent metric $\bar g = u^{-2} g$ which is defined on $\bar M = M - \{p_1, \dots, p_m\}$ there is an asymptotically Euclidean coordinate system $z^i = (z^i_1, \dots, z^i_n)$ of order 3 in every neighborhood $U_i$, $i = 1, \dots, m$. Since for sufficiently large $t_0$ the levels $\rho^i = t_0$ with $(\rho^i)^2 = \sum_j (z^i_j)^2$ are convex and since $M - (U_1 \cup \dots \cup U_m)$ is compact, the Riemannian manifold $(\bar M, \bar g)$ is complete and Ricci flat. It follows from the splitting theorem due to Cheeger and Gromoll that the manifold can only have one end, i.e., m = 1. An elementary proof of the splitting theorem was given by Eschenburg and Heintze in [EH]. Hence $(\bar M, \bar g)$ is an asymptotically Euclidean manifold of order 3 as defined in Definition 2.1. Then Lemma 2.3 implies that $(M - \{p_1\}, \bar g)$ is isometric to $\mathbb{R}^n$, hence (M, g) is conformally equivalent to the standard sphere.

Acknowledgement. We acknowledge a helpful discussion with Marc Herzlich.
References

[ACDS] Alekseevsky, D.V., Cortés, V., Devchand, C., Semmelmann, U.: Killing spinors are Killing vector fields in Riemannian supergeometry. To appear in: J. Geom. Phys. (1998)
[Ba] Bär, C.: Real Killing spinors and holonomy. Commun. Math. Phys. 154, 509–521 (1993)
[BFGK] Baum, H., Friedrich, T., Grunewald, R. and Kath, I.: Twistors and Killing spinors on riemannian manifolds. Teubner Texte zur Math., vol. 124, Stuttgart, Leipzig: B.G. Teubner, 1991
[Ca] Calabi, E.: Métriques kählériennes et fibrés holomorphes. Ann. Ecol. Norm. Sup. 12, 269–294 (1979)
[E] Eschenburg, J.-H.: Comparison theorems in Riemannian geometry. Lect. Notes Ser. Univ. Trento, Math. vol. 3, 1994
[EH] Eschenburg, J.-H. and Heintze, E.: An elementary proof of the Cheeger–Gromoll splitting theorem. Ann. Global Anal. Geom. 2, 141–151 (1984)
[FG] Freedman, D.Z. and Gibbons, G.W.: Remarks on supersymmetry and Kähler geometry. In: Superspace and Supergravity, Proc. Nuffield workshop, Cambridge 1980 (S.W. Hawking, M. Roček, eds.) Cambridge: Cambridge Univ. Press, 1981
[Fr] Friedrich, T.: Dirac–Operatoren in der Riemannschen Geometrie. Advanced lectures in Math., Braunschweig, Wiesbaden: Vieweg Verlag, 1997
[Ha] Habermann, K.: Twistor–Spinoren auf Riemannschen Mannigfaltigkeiten und deren Nullstellen. Thesis, Humboldt–Universität Berlin 1992; Twistor spinors and their zeroes. J. Geom. Phys. 14, 1–24 (1994)
[He] Herzlich, M.: Compactification conforme des variétés asymptotiquement plates. Bull. Soc. Math. France 125, 55–92 (1997)
[Hi] Hitchin, N.: Harmonic spinors. Adv. Math. 14, 1–55 (1974)
[Kü] Kühnel, W.: Conformal transformations between Einstein spaces. In: Conformal Geometry. R.S. Kulkarni and U. Pinkall, eds., Aspects of math. E 12, Braunschweig–Wiesbaden: Vieweg-Verlag, 1988, pp. 105–146
[KR1] Kühnel, W. and Rademacher, H.-B.: Twistor Spinors with zeros. Int. J. Math. 5, 877–895 (1994)
[KR2] Kühnel, W. and Rademacher, H.-B.: Twistor Spinors and Gravitational Instantons. Lett. Math. Phys. 38, 411–419 (1996)
[KR3] Kühnel, W. and Rademacher, H.-B.: Conformal completion of U(n)-invariant Ricci flat Kähler metrics at infinity. Zeitschr. Anal. Anwend. 16, 113–117 (1997)
[KR4] Kühnel, W. and Rademacher, H.-B.: Twistor spinors on conformally flat manifolds. Illin. J. Math. 41, 495–503 (1997)
[LP] Lee, J.M. and Parker, T.H.: The Yamabe problem. Bull. (N.S.) Am. Math. Soc. 17, 37–91 (1987)
[Li] Lichnerowicz, A.: Killing spinors, twistor–spinors and Hijazi inequality. J. Geom. Phys. 5, 2–18 (1988)
[Wa] Wang, M.: Parallel spinors and parallel forms. Ann. Global Anal. Geom. 7, 59–68 (1989)
Communicated by H. Nicolai
Commun. Math. Phys. 196, 77 – 103 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Virasoro Character Identities and Artin L-Functions A. Taormina, S. M. J. Wilson Department of Mathematical Sciences, University of Durham, Durham DH1 3LE, England. E-mail:
[email protected];
[email protected] Received: 6 June 1997 / Accepted: 11 January 1998
Abstract: Some identities between unitary minimal Virasoro characters at levels m = 3, 4, 5 are shown to arise as a consequence of relations between Artin L-functions of different quadratic fields. The definitions and concepts of number theory necessary to present the theta function identities which can be derived from these relations are introduced. A new infinite family of identities between Virasoro characters at level 3 and level $m = 4a^2$, for a odd and $1 + 4a^2 = a'^2 p$, where p is prime, is obtained as a by-product.
1. Introduction

Two sets of intriguing identities between unitary minimal Virasoro characters were presented in [5]. They are remarkable in various respects. First, they provide a rewriting of the Virasoro characters at level m = 3 (resp. m = 4) in terms of differences of quadratic expressions in Virasoro characters at level m = 4 (resp. m = 3, 5). Second, they are stronger identities than the ones obtained by repeated use of the Goddard-Kent-Olive sumrules [2] for the cosets $[SU(2)_1 \times SU(2)_1 \times SU(2)_1]/SU(2)_3$ and $[SU(2)_1 \times SU(2)_2 \times SU(2)_1]/SU(2)_4$. Third, their generalisation to higher levels is proving to be a highly nontrivial problem. The proof given in [5] uses infinite product representations of level m = 3, 4, 5 Virasoro characters, as well as the celebrated Jacobi triple identity and standard properties of generalised theta functions [3]. However, it does not shed any light on the underlying structure of these identities, and therefore offers no clue on how to generalise them to higher values of the level.

In this paper, we use the powerful machinery of number theory¹ to prove the same identities, and we provide a solid framework within which more identities can be unveiled. We establish relations between two imaginary quadratic extensions over $\mathbb{Q}$, which we call $K$ and $K'$ throughout. Given a Galois extension N of $\mathbb{Q}$ which contains $K$ and $K'$, we define two subgroups of the Galois group $\Gamma = \mathrm{Gal}(N/\mathbb{Q})$ to be $\Delta = \mathrm{Gal}(N/K)$ and $\Delta' = \mathrm{Gal}(N/K')$. The deep roots of the Virasoro identities considered lie in the ability to identify, given N, pairs of characters $\chi_{gal}$ and $\chi'_{gal}$ of dimension one on $\Delta$ and $\Delta'$, which induce the same character on $\Gamma$,
$$\chi_{gal}\uparrow_{\Delta}^{\Gamma} \;=\; \chi'_{gal}\uparrow_{\Delta'}^{\Gamma}. \tag{1.1}$$
Now, by a standard result, the Artin L-function of an induced character coincides with the Artin L-function of the original character. So, given (1.1), one has,
$$L(\chi_{gal}) = L(\chi_{gal}\uparrow_{\Delta}^{\Gamma}) = L(\chi'_{gal}\uparrow_{\Delta'}^{\Gamma}) = L(\chi'_{gal}). \tag{1.2}$$

¹ A standard textbook discussing the tools needed is [4].
As explained later, the L-functions appearing in the expression (1.2) are related to ray class theta functions. The latter obey nontrivial identities which can be derived from the relations (1.2) as $\chi_{gal}$ and $\chi'_{gal}$ vary. Theorem 4.3 describes these identities and offers a practical way to identify relations between the generalised theta functions in terms of which the Virasoro characters can be expressed. However, its formulation requires some ground work in number theory. We introduce the necessary mathematical jargon and results without the rather technical proofs, which will be presented elsewhere [6].

The paper is organised as follows. Section 2 sets the notations for coset theta functions [3], and emphasizes their rôle in the description of the Virasoro character identities considered in [5]. Since Theorem 4.3 is a result about ray class theta functions, we proceed in Sect. 3 to rewrite coset theta functions as ray class theta functions. In order to do so, we introduce ray class groups over $K$ with conductor F. These provide a classification of ideals in $O_K$ (the ring of integers over $K$) which generalises the classification modulo n of integers prime to n. The rôle of the modulus is played, in imaginary quadratic fields, by the conductor F. The three theorems at the end of Sect. 4 are the key to understanding the underlying algebraic structure of the Virasoro identities (2.4, 2.5), but are also powerful tools in searching for more such identities. The first two subsections of Sect. 4 provide a brief discussion of the ray class characters and norm maps required in Theorems 4.2, 4.3 and 4.4. We show in Sect. 5 how the Virasoro identities presented in [5] arise as a consequence of Theorems 4.3 and 4.4 when $K = \mathbb{Q}[\sqrt{-2}]$ (resp. $K = \mathbb{Q}[\sqrt{-30}]$) and $K' = \mathbb{Q}[i]$ (resp. $K' = \mathbb{Q}[\sqrt{-10}]$). As an illustration of the power of the tools developed in this paper, we also provide an infinite family of identities between unitary minimal Virasoro characters at level m = 3 and unitary minimal Virasoro characters at level $m = 4a^2$ for a odd and $1 + 4a^2 = a'^2 p$ with p prime. The first member of this family, at a = 1, is the collection of identities (2.4).

2. Coset Theta Functions

The $m(m-1)/2$ independent unitary minimal Virasoro characters at level m ($m \ge 2$, $m \in \mathbb{N}$) are analytic functions of the variable $q = e^{2i\pi\tau}$, $\tau \in \mathbb{C}$, $\mathrm{Im}\,\tau > 0$, defined by,
$$\chi^{Vir(m)}_{r,s}(q) = \eta^{-1}(q)\left[\theta_{r(m+1)-sm,\,m(m+1)}(q) - \theta_{r(m+1)+sm,\,m(m+1)}(q)\right], \tag{2.1}$$
with the integers r and s in the following ranges,
$$r = 1, 2, \dots, m-1; \qquad s = 1, \dots, r.$$
The generalised theta functions at positive integer k are given by,
$$\theta_{\ell,k}(q) = \sum_{n\in\mathbb{Z}} q^{k\left(n+\frac{\ell}{2k}\right)^2} \tag{2.2}$$
for ℓ integer, and the Dedekind eta function η(q) can be rewritten as the difference of two generalised theta functions at level 6,
$$\eta(q) = q^{\frac{1}{24}} \prod_{n=0}^{\infty}\left(1 - q^{n+1}\right) = \theta_{1,6}(q) - \theta_{5,6}(q). \tag{2.3}$$
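The equality in (2.3) can be checked term by term on truncated q-series. The following is a minimal sketch, not part of the original argument; the function names are illustrative. It works with the integer powers of q that remain after the common factor $q^{1/24}$ is removed, using $\theta_{\ell,6}(q) = \sum_n q^{(12n+\ell)^2/24}$ from (2.2).

```python
def eta_product_coeffs(max_power):
    """Coefficients of prod_{n>=1} (1 - q^n), truncated at q^max_power."""
    coeffs = [0] * (max_power + 1)
    coeffs[0] = 1
    for n in range(1, max_power + 1):
        # Multiply in place by (1 - q^n); iterating downwards keeps the
        # not-yet-updated lower coefficients available.
        for m in range(max_power, n - 1, -1):
            coeffs[m] -= coeffs[m - n]
    return coeffs


def theta_difference_coeffs(max_power):
    """Coefficients of q^(-1/24) * (theta_{1,6} - theta_{5,6}) up to q^max_power.

    A term q^((12n + l)^2 / 24) contributes to the integer power
    m = ((12n + l)^2 - 1) / 24 once q^(1/24) is factored out.
    """
    coeffs = [0] * (max_power + 1)
    reach = max_power + 2  # crude but sufficient range for n
    for n in range(-reach, reach + 1):
        for l, sign in ((1, 1), (5, -1)):
            m = ((12 * n + l) ** 2 - 1) // 24  # always an exact division
            if m <= max_power:
                coeffs[m] += sign
    return coeffs


if __name__ == "__main__":
    print(eta_product_coeffs(12))
    print(theta_difference_coeffs(12))
```

Both lists agree up to the chosen truncation order, which is the pentagonal number theorem in the form used in (2.3).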
From (2.1) and (2.3), it is easy to see that the identities presented in [5], namely,
$$\chi^{Vir(3)}_{1,i}(q) = \epsilon_{ijk}\,(-1)^{j+k}\,\chi^{Vir(4)}_{j,4}(q)\,\chi^{Vir(4)}_{k,2}(q) \tag{2.4}$$
and
$$\begin{aligned}
\chi^{Vir(4)}_{1,1}(q) \pm \chi^{Vir(4)}_{3,1}(q) &= \left[\chi^{Vir(3)}_{1,1}(q) \pm \chi^{Vir(3)}_{2,1}(q)\right]\left[\chi^{Vir(5)}_{2,2}(q) \mp \chi^{Vir(5)}_{3,2}(q)\right],\\
\chi^{Vir(4)}_{1,2}(q) \pm \chi^{Vir(4)}_{3,2}(q) &= \left[\chi^{Vir(3)}_{1,1}(q) \pm \chi^{Vir(3)}_{2,1}(q)\right]\left[\chi^{Vir(5)}_{1,2}(q) \mp \chi^{Vir(5)}_{4,2}(q)\right],\\
\chi^{Vir(4)}_{2,1}(q) &= \chi^{Vir(3)}_{2,2}(q)\left[\chi^{Vir(5)}_{2,1}(q) - \chi^{Vir(5)}_{3,1}(q)\right],\\
\chi^{Vir(4)}_{2,2}(q) &= \chi^{Vir(3)}_{2,2}(q)\left[\chi^{Vir(5)}_{1,1}(q) - \chi^{Vir(5)}_{4,1}(q)\right],
\end{aligned} \tag{2.5}$$
can be rewritten as identities involving quadratic expressions in generalised theta functions at different levels. For instance, (2.4) becomes,
$$V(1, 2)\,V(4 - 3i, 3) = \epsilon_{ijk}\,(-1)^{j+k}\,V(5j - 16, 4)\,V(5k - 8, 4), \tag{2.6}$$
where we have introduced the V functions,
$$V(r, m) = \theta_{r,\,m(m+1)} - \theta_{r(2m+1),\,m(m+1)}, \tag{2.7}$$
with the properties,
$$V(r, m) = V(-r, m) = V(r + 2km(m+1), m), \quad k \in \mathbb{Z}; \qquad V(r, m) = -V(r(2m+1), m). \tag{2.8}$$
(Here, and in the future, we will not express the variable q.) We note that
$$\chi^{Vir(m)}_{r,s} = V(r(m+1) - sm, m)/\eta \quad\text{and}\quad \chi^{Vir(m)}_{r,r} = V(r, m)/\eta, \qquad \eta = \theta_{1,6} - \theta_{5,6} = V(1, 2). \tag{2.9}$$
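Identity (2.6) lends itself to a direct numerical check on truncated q-series. The sketch below is illustrative only and rests on one assumption: the coefficient multiplying $(-1)^{j+k}$ in (2.6) is read as the totally antisymmetric symbol $\epsilon_{ijk}$ with summation over j and k understood. Series are stored as dictionaries mapping exact rational exponents to integer coefficients; all function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def theta(l, k, bound):
    """theta_{l,k}(q) = sum_n q^((2kn + l)^2 / (4k)), keeping exponents <= bound."""
    series = defaultdict(int)
    reach = bound + abs(l) + 2  # crude but sufficient range for n
    for n in range(-reach, reach + 1):
        exponent = Fraction((2 * k * n + l) ** 2, 4 * k)
        if exponent <= bound:
            series[exponent] += 1
    return series


def combine(a, b, sign):
    """Return a + sign * b as a new series."""
    out = defaultdict(int, a)
    for exponent, coeff in b.items():
        out[exponent] += sign * coeff
    return out


def multiply(a, b, bound):
    """Product of two series, truncated at exponents <= bound."""
    out = defaultdict(int)
    for ea, ca in a.items():
        for eb, cb in b.items():
            if ea + eb <= bound:
                out[ea + eb] += ca * cb
    return out


def V(r, m, bound):
    """V(r, m) of (2.7): theta_{r, m(m+1)} - theta_{r(2m+1), m(m+1)}."""
    level = m * (m + 1)
    return combine(theta(r, level, bound), theta(r * (2 * m + 1), level, bound), -1)


def levi_civita(i, j, k):
    """Antisymmetric symbol for indices in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2


def nonzero(series):
    return {e: c for e, c in series.items() if c != 0}


def identity_2_6_holds(i, bound):
    """Compare both sides of (2.6) for a given i, up to q^bound."""
    lhs = multiply(V(1, 2, bound), V(4 - 3 * i, 3, bound), bound)
    rhs = defaultdict(int)
    for j in (1, 2, 3):
        for k in (1, 2, 3):
            eps = levi_civita(i, j, k)
            if eps == 0:
                continue
            term = multiply(V(5 * j - 16, 4, bound), V(5 * k - 8, 4, bound), bound)
            rhs = combine(rhs, term, eps * (-1) ** (j + k))
    return nonzero(lhs) == nonzero(rhs)


if __name__ == "__main__":
    print([identity_2_6_holds(i, 6) for i in (1, 2, 3)])
```

Because all exponents are non-negative, truncating each factor at the bound before multiplying does not lose any term of the product below the bound.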
We first remark that the generalised theta functions (2.2) are coset theta functions in the following sense. Let X be a suitably sparse subset of a real or hermitian, positive definite, inner product space V. Choose $d \in \mathbb{R}^+$. We define,
$$\theta(X; d) = \sum_{x \in X} q^{\|x\|^2/d}, \tag{2.10}$$
and we note that for a scalar α, we may define $\alpha X = \{\alpha x \mid x \in X\}$ and then,
$$\theta(\alpha X; |\alpha|^2 d) = \theta(X; d). \tag{2.11}$$
A finitely generated discrete subgroup of V is called a lattice. The lattice generated by the elements $v_1, \dots, v_n$ of V will be denoted $\langle v_1, \dots, v_n\rangle_{gp}$, i.e.,
$$\langle v_1, \dots, v_n\rangle_{gp} = \Big\{\sum_{i=1}^{n} a_i v_i \;\Big|\; a_i \in \mathbb{Z}\Big\}.$$
When X is a coset $v + \Lambda$ of a lattice Λ in V, we shall call the function θ(X; d) a coset theta function. If $\Lambda'$ is a lattice in another space $V'$, and if $v' \in V'$, we find that $\Lambda \times \Lambda'$ is a lattice in $V \oplus V'$, and that $(v + \Lambda) \times (v' + \Lambda') = (v, v') + \Lambda \times \Lambda'$. We then get,
$$\theta((v, v') + \Lambda \times \Lambda'; d) = \theta(v + \Lambda; d)\,\theta(v' + \Lambda'; d). \tag{2.12}$$
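From (2.2), (2.10) and (2.11), we see that,
$$\theta_{r,k} = \theta\Big(\frac{r}{2k} + \mathbb{Z};\ \frac{1}{k}\Big) = \theta\Big(\frac{r}{2\sqrt{k}} + \sqrt{k}\,\mathbb{Z};\ 1\Big),$$
i.e. the generalised theta function $\theta_{r,k}$ is a theta function for the coset $X = \frac{r}{2k} + \mathbb{Z}$ of the lattice $\mathbb{Z}$ in $V = \mathbb{R}$. Then we find, identifying $\mathbb{R} \oplus \mathbb{R}$ with $\mathbb{R} \oplus i\mathbb{R} = \mathbb{C}$,
$$\begin{aligned}
\theta_{r,k}\,\theta_{s,\ell} &\overset{(2.12)}{=} \theta\big((r/2\sqrt{k}) \pm i(s/2\sqrt{\ell}) + (\sqrt{k}\,\mathbb{Z} + i\sqrt{\ell}\,\mathbb{Z});\ 1\big)\\
&\overset{(2.11)}{=} \theta\big((r\lambda\ell_0 \pm s\mu\sqrt{D}) + 2h\lambda\ell_0\mu\,\langle \mu k_0, \lambda\sqrt{D}\rangle_{gp};\ 4k\ell\ell_0/h\big)\\
&\overset{\mathrm{def}}{=} \theta(\alpha + J; d),
\end{aligned} \tag{2.13}$$
where $k, \ell \in \mathbb{N}$, $h = \mathrm{hcf}(k, \ell)$, $\bar k = k/h = \mu^2 k_0$ and $\bar\ell = \ell/h = \lambda^2 \ell_0$ with $k_0$ and $\ell_0$ square free as is $D = -k_0 \ell_0$. So the product of two generalised theta functions $\theta_{r,k}\,\theta_{s,\ell}$ in $\mathbb{R}$ is a theta function for the coset $X = v + \Lambda \overset{\mathrm{def}}{=} \alpha + J$ of a lattice
$$J = 2h\lambda\ell_0\mu\,\langle \mu k_0, \lambda\sqrt{D}\rangle_{gp} \tag{2.14}$$
with
$$\langle \mu k_0, \lambda\sqrt{D}\rangle_{gp} = \{\mu k_0 n + \lambda\sqrt{D}\, m \mid n, m \in \mathbb{Z}\} \quad\text{and}\quad \alpha = r\lambda\ell_0 \pm s\mu\sqrt{D}.$$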
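The step labelled (2.11) in (2.13) amounts to rescaling by a single scalar. The following worked computation is a sketch of that step under the notation above; the scalar c is a name introduced here for illustration, and $\sqrt{D}$ is taken to be $i\sqrt{k_0\ell_0}$.

```latex
% Rescale by c = 2\sqrt{k}\,\lambda\ell_0, with k = h\mu^2 k_0 and \ell = h\lambda^2\ell_0.
\begin{align*}
c\cdot\frac{r}{2\sqrt{k}} &= r\lambda\ell_0, &
c\cdot\frac{i s}{2\sqrt{\ell}} &= i s\,\frac{\mu\sqrt{h k_0}}{\lambda\sqrt{h\ell_0}}\,\lambda\ell_0
   = s\mu\, i\sqrt{k_0\ell_0} = s\mu\sqrt{D},\\
c\cdot\sqrt{k} &= 2k\lambda\ell_0 = 2h\lambda\ell_0\mu\cdot\mu k_0, &
c\cdot i\sqrt{\ell} &= 2 i\sqrt{k\ell}\,\lambda\ell_0
   = 2h\lambda\ell_0\mu\cdot\lambda\sqrt{D},\\
|c|^2 &= 4k\lambda^2\ell_0^2 = \frac{4k\ell\ell_0}{h}.
\end{align*}
```

The first row gives the coset representative α, the second row the two generators of J in (2.14), and the last line the scale factor d. For k = 6 and ℓ = 12 this yields $\alpha = 2r \pm s\sqrt{-2}$, $J = 24\langle 1, \sqrt{-2}\rangle_{gp}$ and d = 96.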
The next step is to rewrite this coset theta function in terms of a ray class theta function in imaginary quadratic fields. In order to do so, we introduce in the next section the relevant definitions and properties of imaginary quadratic fields.
3. From Coset to Ray Class Theta Functions

Let K and K0 be two fields and K ⊃ K0. Then K is called an extension field of K0. The degree of the extension, noted [K : K0], is the dimension of K as a vector space over K0. In particular, if [K : K0] = 2, the extension is a quadratic extension. In this paper, we work mainly with imaginary quadratic extensions of $\mathbb{Q}$ and we introduce the basic definitions and concepts as applied to them.

3.1. Ideals and prime factorization in imaginary quadratic fields. Let D be a negative integer with no square factor other than 1, and put
$$K = \mathbb{Q}[\sqrt{D}] = \{a + b\sqrt{D} \mid a, b \in \mathbb{Q}\},$$
an imaginary quadratic extension of $\mathbb{Q}$. We define $O_K$ to be the ring of all algebraic integers in K. This means that if $D \equiv 2$ or $3 \pmod 4$,
$$O_K = \mathbb{Z}[\sqrt{D}] = \{a + b\sqrt{D} \mid a, b \in \mathbb{Z}\} = \langle 1, \sqrt{D}\rangle_{gp}, \tag{3.1}$$
and if $D \equiv 1 \pmod 4$,
$$O_K = \mathbb{Z}\big[\tfrac{1}{2}(1 + \sqrt{D})\big] = \big\langle 1, \tfrac{1}{2}(1 + \sqrt{D})\big\rangle_{gp}.$$
It is easy to see that every number in K is the ratio of two algebraic integers. A unit of $O_K$ is an element of $O_K$ whose reciprocal lies in $O_K$. The group of units of $O_K$ is denoted $O_K^\times$. We note in passing,

Proposition 3.1. The units of $O_K$ are those roots of unity which lie in K, viz. $\{\pm 1, \pm i\}$ if D = −1, the 6th roots of unity if D = −3 and $\{\pm 1\}$ otherwise.

A fractional ideal I of $O_K$ generated by $\alpha_1, \dots, \alpha_n \in K$ is the set of $O_K$-linear combinations of the generators $\alpha_i$:
$$I \overset{\mathrm{def}}{=} (\alpha_1, \dots, \alpha_n)O_K = \Big\{\sum_{i=1}^{n} \gamma_i \alpha_i \;\Big|\; \gamma_i \in O_K\Big\}.$$
In other words, I is a lattice spanning K which is closed under multiplication by elements of $O_K$. A fractional ideal $(\alpha)O_K = \alpha O_K$, generated by a single $\alpha \in K \setminus \{0\}$, is called a principal ideal. Note that
$$\alpha O_K = \beta O_K \iff \alpha/\beta \in O_K^\times. \tag{3.2}$$
We write I(K) for the set of all fractional ideals of $O_K$ and P(K) for the subset of those which are principal. I(K) is an abelian group under ideal multiplication, whose neutral element is $O_K$,
$$I.J = (\alpha_1, \dots, \alpha_n)O_K.(\beta_1, \dots, \beta_m)O_K = \Big\{\sum_{i=1}^{n}\sum_{j=1}^{m} \gamma_{ij}\alpha_i\beta_j \;\Big|\; \gamma_{ij} \in O_K\Big\}.$$
In particular, $\alpha O_K.\beta O_K = \alpha\beta O_K$, so P(K) is a subgroup of I(K). The ideal class group, C(K), is defined to be the quotient group of I(K) by P(K). If I ∈ I(K) is a subset of $O_K$, then I is said to be an integral ideal of $O_K$. A prime ideal P of $O_K$ is an integral ideal different from $O_K$ such that $O_K \setminus P$ is closed
under multiplication. We shall be interested here in prime ideals other than {0}. Such ideals are maximal ideals, i.e. there is no ideal other than $O_K$ containing them, but we retain the terminology prime ideals to emphasize the analogy with prime integers. Every fractional ideal I ∈ I(K) may be uniquely factorized as a product of integer powers of prime ideals $Q_i$,
$$I = \prod_{i=1}^{m} Q_i^{v_{Q_i}(I)}.$$
The integer $v_{Q_i}(I)$ is called the valuation of I at the prime ideal $Q_i$. For prime ideals Q other than the $Q_i$, we set $v_Q(I) = 0$. As one might expect, I ∈ I(K) is integral if and only if $v_Q(I) \ge 0$ for all prime ideals Q. If I, J ∈ I(K), we say that I divides J (we write I | J) if $JI^{-1}$ is an integral ideal. It follows that
$$v_Q(I) \le v_Q(J)\ \ \forall\, Q \iff I \mid J \iff I \supset J.$$
Therefore the least common multiple of a pair of ideals I and J is the largest ideal contained in them both and their highest common factor is the smallest lattice which contains them both. Thus
$$\mathrm{lcm}(I, J) = I \cap J \quad\text{and}\quad \mathrm{hcf}(I, J) = I + J = \{a + b \mid a \in I,\ b \in J\}.$$
I is said to be coprime to J if their hcf is $O_K$. Note that in this case
$$I \cap J = \mathrm{lcm}(I, J) = IJ. \tag{3.3}$$
A useful parameter of integral ideals is the norm. If I is an integral ideal of $O_K$, we define its norm N(I) to be the (finite) number of elements in the quotient group $O_K/I$. The norm of a principal ideal $\alpha O_K$ is easily found since $N(\alpha O_K) = |\alpha|^2$. Since any fractional ideal J is a quotient $II'^{-1}$ of integral ideals, we may define $N(J) = N(I)/N(I')$. This definition is unambiguous since, in fact, the norm is multiplicative:
$$N(IJ) = N(I)N(J). \tag{3.4}$$
From this we see, also, that the norm of an integral ideal gives a clue as to its prime factors since (in a quadratic field) the norms of these must be p or $p^2$ for some prime number p. Indeed, all non zero prime ideals of K occur as prime factors of $pO_K$ for some prime number p. If there is no ideal of norm p, $pO_K$ itself is prime and is said to be inert in $O_K$. If, on the other hand, there exists an ideal $P_p$ of norm p, one has,
$$pO_K = P_p \bar P_p. \tag{3.5}$$
When $P_p$ is a different prime ideal from $\bar P_p$, $pO_K$ is said to split in $O_K$, while it ramifies in $O_K$ when $P_p = \bar P_p$. (The term "ramifies" comes from algebraic function theory. Replacing $\mathbb{Z}$ by $\mathbb{C}[x]$ and $O_K$ by $R = \mathbb{C}[x][\sqrt{f(x)}]$, one finds that the function $\sqrt{f(x)}$ has a branch point at x = λ if (x − λ) ramifies in R, i.e. if $(x - \lambda)R = P_\lambda^2$.) In the rest of this paper $P_p$ will stand for a prime ideal of $O_K$ of norm p (if there is one). We will make no explicit choice unless this is necessary. If p is an odd prime integer, then the number of ideals of norm p in $O_K$ is the number of solutions to the congruence $x^2 \equiv D \bmod p$. (That is, $1 + \left(\frac{D}{p}\right)$, where $\left(\frac{D}{p}\right) = 0$ or $\pm 1$ is the quadratic residue symbol, cf. Subsect. 4.1.) (In all the cases that we will examine there will be just one ideal of norm 2.)
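The counting rule just stated (for an odd prime p, the number of ideals of norm p equals the number of solutions of $x^2 \equiv D \bmod p$) is easy to tabulate. A minimal sketch with illustrative function names, using brute-force counting rather than any number-theory library:

```python
def count_ideals_of_norm_p(D, p):
    """Number of x mod p with x^2 = D (mod p); p is assumed to be an odd prime."""
    return sum(1 for x in range(p) if (x * x - D) % p == 0)


def behaviour(D, p):
    """Classify p O_K in Q[sqrt(D)] from the count: 0 -> inert, 1 -> ramifies, 2 -> splits."""
    return {0: "inert", 1: "ramifies", 2: "splits"}[count_ideals_of_norm_p(D, p)]


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, behaviour(-2, p))
```

For D = −2 this reports that 3 splits, in agreement with the factorisation $3O_K = P_3\bar P_3$ used in Subsect. 3.4.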
3.2. Ray class groups. Ray class theta functions are based on ray classes which we now introduce. These classify ideals in a way which generalises the classification, modulo n, of integers prime to n. For ideals of K, the modulus n is replaced by an integral ideal F, called the conductor, and we work in the subgroup $I_F(K)$ of quotients of those ideals which are coprime to F, that is,
$$I_F(K) = \{I \in I(K) \mid v_P(I) = 0 \text{ if } P \mid F\}.$$
We identify a subgroup $K_{1,F}$ of $K^\times$ of elements which are, in some sense, 1 modulo F. These are quotients of elements of $O_K$ which are congruent mod F and coprime to F. Thus,
$$K_{1,F} = \{\lambda/\mu \mid \lambda - \mu \in F,\ \mu O_K + F = O_K,\ \lambda\mu \neq 0\}. \tag{3.6}$$
We put $P_F(K) = \{\alpha O_K \mid \alpha \in K_{1,F}\}$. Then the ray class group of $O_K$ with conductor F is the quotient group,
$$C_F(K) = \frac{I_F(K)}{P_F(K)}.$$
The ray class group is, in fact, a finite group. Note that $C_{O_K}(K) = C(K)$. Each element of $C_F(K)$ is a coset $IP_F(K)$, for some $I \in I_F(K)$. We denote this coset more compactly as $[I]_F$. We refer to it as the ray class of I with conductor F. We shall work, in particular, with $CP_F(K)$, the subgroup of $C_F(K)$ of ray classes of principal ideals. We shall shorten our notation of such classes by writing $[\gamma]_F$ for $[\gamma O_K]_F$. Clearly, $CP_F(K)$ is the kernel of the group homomorphism from $C_F(K)$ to C(K) given by $[I]_F \mapsto [I]_{O_K}$. This homomorphism is, in fact, surjective and so we have a short exact sequence of groups,
$$0 \to CP_F(K) \to C_F(K) \xrightarrow{\ \mathrm{red}^F_{O_K}\ } C(K) \to 0. \tag{3.7}$$
We conclude with two useful lemmas (with F integral, as above).

Lemma 3.2. Let H ∈ I(K).
(i) If $\alpha O_K + FH = H$, then $\alpha H^{-1} \in I_F(K)$.
(ii) If, further, $\beta - \alpha \in FH$ then $\beta/\alpha \in K_{1,F}$ and so $[\beta H^{-1}]_F = [\alpha H^{-1}]_F$.
(iii) Conversely, if α and β ∈ H and $\beta/\alpha \in K_{1,F}$ then $\beta - \alpha \in FH$.

Proof. (i) $\mathrm{hcf}(\alpha H^{-1}, F) = \alpha H^{-1} + F = (\alpha O_K + FH)H^{-1} = O_K$.
(ii) Since C(K) is finite, $H^n = \gamma O_K$ for some positive integer n and γ ∈ K. Put $\lambda = \beta\alpha^{n-1}/\gamma$ and $\mu = \alpha^n/\gamma$. Since α ∈ H and β ≡ α modulo FH, we have $\lambda \equiv \beta\alpha^{n-1}/\gamma \equiv \alpha^n/\gamma \equiv \mu$ modulo $FH^n/\gamma = F$. Moreover, $\mu O_K + F = (\alpha H^{-1})^n + F = O_K$. So $\beta/\alpha = \lambda/\mu \in K_{1,F}$.
(iii) Now $\beta = \alpha(\lambda/\mu)$, where $\lambda - \mu \in F$ and $\mu O_K + F = O_K$. Whence, by (3.3), $\mu O_K \cap F = \mu F$. But β − α ∈ H and also $\beta - \alpha = (\lambda - \mu)\alpha/\mu \in FH(1/\mu)$. So $\beta - \alpha \in H \cap FH(1/\mu) = (1/\mu)H(\mu O_K \cap F) = (1/\mu)H(\mu F) = FH$.

Lemma 3.3. Suppose that $\alpha_1, \alpha_2, \alpha_3$ and $\alpha_4 \in K \setminus FH$ and $\alpha_i O_K + FH = H$. Then
(i) $\alpha_i/\alpha_j = (\alpha_i H^{-1})(\alpha_j H^{-1})^{-1} \in I_F(K)$.
And if $\alpha_1\alpha_2 \equiv \alpha_3\alpha_4$ modulo $FH^2$ then
(ii) $[\alpha_1/\alpha_4]_F = [\alpha_3/\alpha_2]_F$.

Proof. (i) is clear from Lemma 3.2(i). From Lemma 3.2(ii), with $H^2$ instead of H, $[\alpha_1\alpha_2 H^{-2}]_F = [\alpha_3\alpha_4 H^{-2}]_F$ and (ii) follows from this.

3.3. The ray class theta function. For any ray class x in the ray class group $C_F(K)$, we define the ray class theta function of x with scale factor $d \in \mathbb{R}^+$ as follows,
$$\theta(x; d) = \sum_{I \in x,\ I \subset O_K} q^{N(I)/d}.$$
Here N(I) is the norm of I (see (3.4)). More generally, if $W = \sum_{x \in C_F(K)} n_x x$ is a formal complex linear combination of ray classes, we write,
$$\theta(W; d) = \sum_{x \in C_F(K)} n_x\,\theta(x; d). \tag{3.8}$$
Also, abusively, if X and Y are subsets of $C_F(K)$, we can regard them as standing for the sums of their elements so that, for instance,
$$\theta(X - Y; d) = \sum_{x \in X} \theta(x; d) - \sum_{y \in Y} \theta(y; d).$$
Now the quadratic field coset theta functions as in (2.13) can be rewritten as ray class theta functions in the following way. (We do not tackle here the most general cases as we assume that the lattice J of (2.14) is an ideal of $O_K$. In the general case the coset must first be split up as a union of cosets of ideals (cf. 5.19) and a sum of ray class theta functions will be obtained.) Let J ∈ I(K) and $\alpha \in K \setminus J$. Put $H = \mathrm{hcf}(\alpha O_K, J) = \alpha O_K + J$ and $F = JH^{-1}$, an integral ideal of $O_K$. Thus, by Lemma 3.2(i), $[\alpha H^{-1}]_F \in C_F(K)$. Let $(O_K^\times)_F$ be the group of those units of $O_K$ which are 1 modulo F and let $w_F$ be its order. We have the following relation between coset and ray class theta functions,

Theorem 3.4. With α, J, F, and H as above,
$$\theta(\alpha + J; d) = w_F\,\theta([\alpha H^{-1}]_F;\ d/N(H)). \tag{3.9}$$

Proof. Put Y for the set of all integral ideals in $[\alpha H^{-1}]_F$. The proof of the theorem is effected by the following lemma which sets up a $w_F$ to 1 correspondence between the terms in the two theta sums and in which, since $N(\beta H^{-1}) = |\beta|^2/N(H)$, the powers of q are scaled by 1/N(H).

Lemma 3.5. The map $f : \alpha + J \to Y$ defined by $f(\beta) = \beta H^{-1}$ is $w_F$ to 1 and surjective.
Proof. By Lemma 3.2(ii), if β ∈ α + J then $f(\beta) \in [\alpha H^{-1}]_F$. But $\beta \in \alpha O_K + J = H$. So $f(\beta) \subset O_K$. Thus f(β) ∈ Y and f is well-defined. Let I ∈ Y. Then $I(\alpha H^{-1})^{-1} \in P_F(K)$. So, $IH\alpha^{-1} = \delta O_K$, where $\delta \in K_{1,F}$. Thus $IH = \beta O_K$ with β = αδ. Now α ∈ H and β ∈ IH ⊂ H. So, by Lemma 3.2(iii), β ∈ α + FH = α + J. Of course, $f(\beta) = \beta H^{-1} = IHH^{-1} = I$. Therefore f is surjective. To complete the proof we show that $f^{-1}(I) = \beta(O_K^\times)_F$. Suppose $\omega \in (O_K^\times)_F$. Then ω ∈ 1 + F and so, since βF ⊂ HF = J, βω ∈ β + J = α + J. Moreover, by (3.2), f(ωβ) = f(β) = I. Thus $f^{-1}(I)$ contains $\beta(O_K^\times)_F$. We must show the reverse inclusion. If $\beta' \in f^{-1}(I)$ then, certainly, $\beta' O_K = \beta O_K$ so, by (3.2), $\beta' = \omega\beta$ with $\omega \in O_K^\times$. But, by Lemma 3.2(ii) with $\beta' = \alpha$, $\omega = \beta'/\beta \in K_{1,F}$. Thus, applying Lemma 3.2(iii) with $H = O_K$, ω ≡ 1 modulo F. Thus $\beta' \in \beta(O_K^\times)_F$, as required.

3.4. An example of reduction to ray classes. As an illustration of use of the above identity and of (2.13), and as a first step in proving the identities (2.4), we reduce a generic theta product occurring in the L.H.S. of (2.4) to a coset theta function and then (in the main case) to a ray class theta function. First, by (2.13),
$$\theta_{(1+6a),6}\,\theta_{s(1+6b),12} = \theta(\alpha_s(a, b) + J; d), \tag{3.10}$$
where (since, taking k = 6 and ℓ = 12, we have h = (6, 12) = 6, $\lambda = \mu = k_0 = 1$ and D = −2) $\alpha \equiv \alpha_s(a, b) = 2(1 + 6a) + s(1 + 6b)\sqrt{D}$, $J = 24\langle 1, \sqrt{D}\rangle$ and d = 96. We note that for a, b taking the values 0 or 1, and for s = 1, −2, −5, the L.H.S. of (3.10) gives all the terms in the three products V(1, 2)V(4 − 3i, 3), i = 1, 2, 3, which are the left-hand sides of (2.4). We recall that generalised theta functions satisfy the following relations,
$$\theta_{\ell,k} = \theta_{-\ell,k} = \theta_{\ell+2k,k}.$$
So we work in the imaginary quadratic field $K = \mathbb{Q}(\sqrt{-2})$ and we introduce the notation $\rho = \sqrt{-2}$. Then, from (3.1), $O_K = \langle 1, \rho\rangle$ and $J = 24O_K$. We find (see (3.5)) prime factorisations $2O_K = P_2^2$ and $3O_K = P_3\bar P_3$, where $P_2 = \rho O_K$ and $P_3 = (1 + \rho)O_K$ with norms 2 and 3, respectively. So $J = P_2^6 P_3 \bar P_3$. In order to apply (3.9), we first need to determine the ideal H, which is the hcf of J and the principal ideal generated by $\alpha_s(a, b)$. Suppose that s ≡ 1 mod 3 and assume for the moment that s is odd (so s ≡ 1 mod 6). Then α = 2 + ρ + 6β for some $\beta \in O_K$ and 2 + ρ = ρ(1 − ρ). So $\alpha O_K = P_2 \bar P_3 (1 - \rho(1 + \rho)\beta)O_K$, where the last ideal is clearly not contained in $P_2$ or $P_3$. Thus $H = P_2 \bar P_3$ and, using (3.4), the norm of H is, $N(H) = N(P_2)N(\bar P_3) = 2 \times 3 = 6$. The conductor (i.e. F of (3.9)) is
$$JH^{-1} = P_2^6 P_3 \bar P_3 (P_2 \bar P_3)^{-1} = P_2^5 P_3 = 4P_2P_3.$$
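The coset form (3.10) can be confirmed by listing exponents on both sides. The sketch below is illustrative (the function names are not from the paper). It compares the multiset of exponents of $\theta_{r,6}\,\theta_{s,12}$, computed from (2.2), with the multiset of values $|x|^2/96$ over the coset $\alpha + J$, where $\alpha = 2r + s\rho$, $J = 24\langle 1, \rho\rangle$, $\rho = \sqrt{-2}$ and $|u + v\rho|^2 = u^2 + 2v^2$.

```python
from collections import Counter
from fractions import Fraction


def theta_exponents(l, k, bound):
    """Multiset of exponents (2kn + l)^2 / (4k) <= bound appearing in theta_{l,k}."""
    exps = Counter()
    reach = bound + abs(l) + 2  # crude but sufficient range for n
    for n in range(-reach, reach + 1):
        e = Fraction((2 * k * n + l) ** 2, 4 * k)
        if e <= bound:
            exps[e] += 1
    return exps


def product_exponents(r, s, bound):
    """Exponents (with multiplicity) of theta_{r,6} * theta_{s,12}, up to bound."""
    out = Counter()
    for e1, c1 in theta_exponents(r, 6, bound).items():
        for e2, c2 in theta_exponents(s, 12, bound).items():
            if e1 + e2 <= bound:
                out[e1 + e2] += c1 * c2
    return out


def coset_exponents(r, s, bound):
    """Exponents |x|^2 / 96 for x in (2r + s*rho) + 24<1, rho>, rho = sqrt(-2)."""
    out = Counter()
    reach = bound + abs(r) + abs(s) + 2
    for n in range(-reach, reach + 1):
        for m in range(-reach, reach + 1):
            u = 2 * r + 24 * n   # rational part of x
            v = s + 24 * m       # coefficient of rho in x
            e = Fraction(u * u + 2 * v * v, 96)
            if e <= bound:
                out[e] += 1
    return out


if __name__ == "__main__":
    print(product_exponents(1, 1, 4) == coset_exponents(1, 1, 4))
```

The two multisets agree for every pair (r, s), since $((2r + 24n)^2 + 2(s + 24m)^2)/96 = (r + 12n)^2/24 + (s + 24m)^2/48$ term by term.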
Thus, by (3.9), the coset theta function (3.10) reduces, for s ≡ 1 mod 6, to the ray class theta function,
$$\theta_{(1+6a),6}\,\theta_{s(1+6b),12} = \theta\left(\left[\frac{\alpha_s(a, b)}{\rho(1 - \rho)}\right]_{4P_2P_3};\ 16\right).$$

3.5. Description of principal ray classes and change of conductor. Let K and F be again as in Subsect. 3.3. A standard result (though not difficult to prove) is the following description of the units of the quotient ring $O_K/F$:
$$(O_K/F)^\times = \{\alpha + F \mid \alpha O_K + F = O_K\}.$$
Combining Lemma 3.2 (with $H = O_K$) with (3.2), one can obtain the following exact sequence describing $CP_F(K)$,
$$0 \to (O_K^\times)_F \to O_K^\times \to (O_K/F)^\times \xrightarrow{\ \pi_F\ } CP_F(K) \to 0. \tag{3.11}$$
Here the first map is inclusion, the second takes u to u + F and $\pi_F$ takes α + F to $[\alpha]_F$. In our analysis of Sect. 5, we will need to relate ray class groups corresponding to two conductors $\tilde F$ and F such that $\tilde F \mid F$. This relation is provided by the reduction map,
$$\mathrm{red}^F_{\tilde F} : C_F(K) \to C_{\tilde F}(K) : [I]_F \mapsto [I]_{\tilde F}. \tag{3.12}$$
This is, in fact, a surjective group homomorphism of which the projection of (3.7) is a special case. Comparing (3.7) for F and $\tilde F$ it is easy to see that the kernel of the reduction map $\mathrm{red}^F_{\tilde F}$ of (3.12) lies in $CP_F(K)$. From (3.11) it now follows that
$$\ker \mathrm{red}^F_{\tilde F} = \pi_F\{\alpha + F \in (O_K/F)^\times \mid \alpha \equiv 1 \bmod \tilde F\}.$$
If $F = Q\tilde F$, and Q and $\tilde F$ have no common factors, one gets an isomorphism (the Chinese Remainder Theorem),
$$(O_K/F)^\times \to (O_K/Q)^\times \times (O_K/\tilde F)^\times : \alpha + F \mapsto (\alpha + Q,\ \alpha + \tilde F). \tag{3.13}$$
Clearly, isomorphisms of this sort are useful in the description of $CP_F(K)$ and we generalise the compact notation introduced above for classes of principal ideals in the following way. If γ + F is the pre-image of $(\alpha + Q, \beta + \tilde F)$, we write $[\alpha, \beta]_F$ for $[\gamma]_F$. Note that for $u \in (O_K)^\times$,
$$[\alpha, \beta]_F = [u\alpha, u\beta]_F. \tag{3.14}$$
It is easy to see that the kernel of $\mathrm{red}^F_{\tilde F}$ is $\{[\alpha, 1]_F \mid \alpha + Q \in (O_K/Q)^\times\}$. More generally, if $A \subset CP_{\tilde F}(K)$, then
$$(\mathrm{red}^F_{\tilde F})^{-1}(A) = \{[\alpha, \beta]_{Q\tilde F} \mid \alpha + Q \in (O_K/Q)^\times,\ \beta + \tilde F \in A\}. \tag{3.15}$$
If N(Q) = p then $\mathbb{Z}/p\mathbb{Z}$ and $O_K/Q$ have the same number p of elements. In fact, since $Q\bar Q = pO_K$ (see (3.5)), $\mathbb{Z} \cap Q = p\mathbb{Z}$ and so the map,
$$\mathbb{Z}/p\mathbb{Z} \to O_K/P_p \quad\text{by}\quad n + p\mathbb{Z} \mapsto n + P_p, \tag{3.16}$$
is a ring isomorphism. Thus in this case the elements of $CP_F(K)$ may be written $[n, \beta]_F$ with n an integer mod p.
4. Relations Between Ray Class Theta Functions of Different Quadratic Fields The identities (2.4) and (2.5) are a consequence of relations between ray class theta functions of two different imaginary quadratic fields. However, to describe these relations, which are stated in Theorem 4.3, we need to define ray class characters for these two quadratic fields and their corresponding L-functions. We also use the theory of Artin L-functions in a crucial step (Theorem (4.2)). 4.1. Ray class characters. Let χ : CF (K) → C× be a multiplicative character of the ray class group with conductor F . We define the conductor Fχ of the character χ to be the biggest of all ideals I dividing F such that the kernel of χ contains the kernel of the reduction map (3.12) from CF (K) to CI (K). In fact, Fχ is the sum of all such ideals. Since the reduction map is surjective, χ defines a character χFχ on CFχ (K) defined by the equation, χ([J]F ) = χF χ ([J]F χ ). Thus, for all I divisible by F χ, χ also defines a character χI of CI (K) by, χI ([J]I )=.χF χ ([I]F χ ). Note that χI also has conductor F χ. We refer to the collection of all such characters χI as “the ray class character χ for K”. We are now in a position to specialise to the problem at hand. Let us introduce two √ √ complex quadratic fields K = Q[ D] and K0 = Q[ D0 ], where D and D0 are negative, 0 /(D, D0 )2 , √ which is square free integers. We shall also sometimes consider D00 = DD√ 00 square free and positive, along with the quadratic field K = Q[ D00 ] = Q[ DD0 ]. Here, (a, b) is the highest common factor of the pair of integers a, b. e is defined to be D or 4D according to whether 4 divides (D − 1) or The integer D e 00 are defined similarly. (These are the discriminants of K, e 0 and D not. The integers D 0 00 K and K over Q.) In order to define a particular ray class character ψ for K, we first describe a function φ, which is, effectively, the Dirichlet character corresponding to the quadratic field K. 
This character is closely related to the Jacobi symbol, and we refer the interested reader to [1] (p. 236-238) for further details. Let $p$ be a prime number and $m$ an integer not divisible by $p$. Then the quadratic residue symbol $\left(\frac{m}{p}\right)$ is defined to be $1$ or $-1$ according to whether $x^2 \equiv m$ has a solution modulo $p$ or not. For positive integers $n$ prime to $2D$ we define $\phi(n) = \pm 1$ by the following rules:
(i) $\phi(n)$ is the quadratic residue symbol $\left(\frac{D}{n}\right)$ if $n$ is prime.
(ii) $\phi(n) = \phi(p_1)\phi(p_2)\cdots\phi(p_r)$ if the prime factorization of $n$ is $n = p_1 p_2 \cdots p_r$.
(In fact $\phi(n)$ depends only on the residue of $n$ modulo $\tilde{D}$.) The function $\phi'$ is defined similarly to $\phi$ (using $D'$ instead of $D$).

Let $F = 2DD'O_K$. We define the character $\psi_F$ on $C_F(K)$ by
$$\psi_F([I]_F) \stackrel{\mathrm{def}}{=} \phi'(N(I)),$$
where $I$ is integral and $N(I)$ is the norm of the ideal $I$, defined in the previous section before (3.4). Hence we obtain a ray class character $\psi$ for $K$. (This is the character corresponding to the field extension $KK'$ of $K$.) The conductor of this character is (as shown in [6])
$$F_\psi = 2^a\,\tilde{D}'/(\tilde{D}, \tilde{D}'), \tag{4.1}$$
where $a = 1$ if all of $\tilde{D}$, $\tilde{D}'$ and $\tilde{D}''$ are even. Otherwise $a = 0$. The ray class character $\psi'$ for $K'$ is defined similarly, and its conductor is
$$F_{\psi'} = 2^a\,\tilde{D}/(\tilde{D}', \tilde{D}). \tag{4.2}$$
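The definitions of Subsect. 4.1 can be explored numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and `conductor_generators` is one reading of (4.1) and (4.2) that returns the positive integer generating each conductor. It reproduces the values $(2)$, $(4)$ and $(2)$, $(6)$ quoted in Subsects. 5.1 and 5.4.

```python
from math import gcd


def legendre(m, p):
    """Quadratic residue symbol (m/p) for an odd prime p not dividing m."""
    return 1 if pow(m % p, (p - 1) // 2, p) == 1 else -1


def prime_factors(n):
    """Prime factors of n > 0, listed with multiplicity."""
    out, q = [], 2
    while q * q <= n:
        while n % q == 0:
            out.append(q)
            n //= q
        q += 1
    if n > 1:
        out.append(n)
    return out


def phi(n, D):
    """phi(n) built from rules (i) and (ii), for n > 0 prime to 2D."""
    assert n > 0 and gcd(n, 2 * D) == 1
    value = 1
    for p in prime_factors(n):
        value *= legendre(D, p)
    return value


def disc(D):
    """D-tilde: D if 4 divides D - 1, otherwise 4D."""
    return D if (D - 1) % 4 == 0 else 4 * D


def conductor_generators(D, Dp):
    """Integer generators of F_psi and F_psi' (assumed reading of (4.1), (4.2))."""
    Dpp = D * Dp // gcd(D, Dp) ** 2
    dt, dpt, dppt = disc(D), disc(Dp), disc(Dpp)
    a = 1 if dt % 2 == 0 and dpt % 2 == 0 and dppt % 2 == 0 else 0
    g = gcd(dt, dpt)
    return 2 ** a * abs(dpt) // g, 2 ** a * abs(dt) // g


print(conductor_generators(-2, -1))    # fields of Subsect. 5.1
print(conductor_generators(-30, -10))  # fields of Subsect. 5.4
```

For $D = -2$ the sketch also makes the remark after rule (ii) concrete: `phi(n, -2)` depends only on $n$ modulo $8 = |\tilde{D}|$.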
4.2. Galois groups and norm maps. We first identify the elements of the Galois group of the field $KK'$ over $\mathbb{Q}$. In general, if $L$ is a field extension of the field $L'$, the group of automorphisms of $L$ which leave every element of $L'$ fixed is called the Galois group of $L$ over $L'$, and is denoted $\mathrm{Gal}(L/L')$. Its order is at most the degree of the extension. In particular, the biquadratic extension
$$KK' = \{a + b\sqrt{D} + c\sqrt{D'} + d\sqrt{D}\sqrt{D'} \mid a, b, c, d \in \mathbb{Q}\}$$
has Galois group $\mathrm{Gal}(KK'/\mathbb{Q}) = \{1, \delta, \delta', \gamma\}$, where
$$\delta : \begin{cases} \sqrt{D} \mapsto \sqrt{D} \\ \sqrt{D'} \mapsto -\sqrt{D'} \end{cases}, \qquad \delta' : \begin{cases} \sqrt{D} \mapsto -\sqrt{D} \\ \sqrt{D'} \mapsto \sqrt{D'} \end{cases} \qquad \text{and} \qquad \gamma = \delta\delta'.$$
Thus $\gamma$ acts as complex conjugation on both $K$ and $K'$ and thus gives the non-trivial element of the Galois group of either field over $\mathbb{Q}$. In what follows we shall use exponential notation for the action of Galois elements. Thus $x^\delta$ means $x$ is acted on by $\delta$ and $x^{1-\gamma} = x(x^\gamma)^{-1}$.

Secondly, if $L \supset L'$ is an extension of algebraic number fields, then there exist norm homomorphisms, both denoted by $N_{L/L'}$, from the group of units of $L$ to the group of units of $L'$, and from the group of fractional ideals of $L$ to the group of fractional ideals of $L'$. Thus,
$$N_{L/L'} : L^\times \to L'^\times : \lambda \mapsto N_{L/L'}(\lambda), \qquad N_{L/L'} : I(L) \to I(L') : I \mapsto N_{L/L'}(I).$$
These norms are related by the fact that they "agree" on elements $\lambda$ of $L$. That is, $N_{L/L'}(\lambda O_L) = N_{L/L'}(\lambda)O_{L'}$. Moreover, if $I \in I(L)$, one has $N(I)\mathbb{Z} = N_{L/\mathbb{Q}}(I)$. If the extension $L$ over $L'$ is quadratic with $\mathrm{Gal}(L/L') = \{1, \sigma\}$, then, for $\lambda \in L^\times$, $N_{L/L'}(\lambda) = \lambda^{1+\sigma}$ and for $I \in I(L)$, we have
$$N_{L/L'}(I)O_L = I^{1+\sigma}.$$
This equation determines the norm map on ideals since the map $I(L') \to I(L) : J \mapsto JO_L$ is injective. In particular, if $L = KK'$ and $L' = K$, one has that, for $I \in I(KK')$ and $J \in I(K)$, $N_{KK'/K}(I)O_{KK'} = I^{1+\delta}$ and $N(J)O_K = J\bar{J}$.
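As an illustration of the element norms $\lambda \mapsto \lambda^{1+\sigma}$ in the biquadratic field, here is a small sketch assuming elements are stored as coefficient 4-tuples on the basis $1, \sqrt{D}, \sqrt{D'}, \sqrt{D}\sqrt{D'}$; the helper names are hypothetical. With $D = -2$, $D' = -1$ and $\rho = \sqrt{-2}$ it recovers the two norms of $1 - \rho$ used in Subsect. 5.1.

```python
def make_mul(D, Dp):
    """Multiplication in Q(sqrt D, sqrt D') on the basis 1, s, t, st (s^2 = D, t^2 = D')."""
    def mul(x, y):
        a1, b1, c1, d1 = x
        a2, b2, c2, d2 = y
        return (
            a1 * a2 + b1 * b2 * D + c1 * c2 * Dp + d1 * d2 * D * Dp,
            a1 * b2 + b1 * a2 + (c1 * d2 + d1 * c2) * Dp,
            a1 * c2 + c1 * a2 + (b1 * d2 + d1 * b2) * D,
            a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,
        )
    return mul


def delta(x):
    """delta fixes sqrt D and negates sqrt D' (so it fixes K)."""
    a, b, c, d = x
    return (a, b, -c, -d)


def delta_prime(x):
    """delta' negates sqrt D and fixes sqrt D' (so it fixes K')."""
    a, b, c, d = x
    return (a, -b, c, -d)


def gamma(x):
    """gamma = delta delta' acts as complex conjugation on K and K'."""
    return delta(delta_prime(x))


mul = make_mul(-2, -1)
x = (1, -1, 0, 0)                    # the element 1 - rho
norm_to_K = mul(x, delta(x))         # x^(1 + delta)
norm_to_Kp = mul(x, delta_prime(x))  # x^(1 + delta')
print(norm_to_K, norm_to_Kp)
```

The first tuple is $-(1 + 2\rho)$ and the second is the rational integer $3$.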
Finally, choose a fractional ideal $F$ in $I(K)$ which is contained in the conductor $F_\psi$ (4.1), and which is self-conjugate, i.e.,
$$F \in I(K), \qquad F \subset F_\psi \qquad \text{and} \qquad F^\gamma\ (= \bar{F}) = F.$$
We define a subgroup $A_F$ and a coset $S_F$ in the ray class group $C_F(K)$ by
$$A_F \stackrel{\mathrm{def}}{=} \ker(\psi_F)^{1-\gamma} = \{[I\bar{I}^{-1}]_F \mid \psi_F([I]_F) = 1\}, \qquad S_F \stackrel{\mathrm{def}}{=} \{[I\bar{I}^{-1}]_F \mid \psi_F([I]_F) = -1\}. \tag{4.3}$$
For $F' \subset F_{\psi'}$ we define $A'_{F'}$ and $S'_{F'} \subset C_{F'}(K')$ similarly. Note that if $\psi_F([I]_F) = -1$ then $S_F = (I/\bar{I})A_F$. Also if $F_1$ is a self-conjugate ideal contained in $F$, we find that
$$A_F = \mathrm{red}^{F_1}_F(A_{F_1}) \qquad \text{and} \qquad S_F = \mathrm{red}^{F_1}_F(S_{F_1}). \tag{4.4}$$
4.3. The three crucial theorems. To provide a link between the arithmetic of $K$ and that of $K'$ we need to give a correspondence between conductors in $O_K$ and in $O_{K'}$.

Definition 4.1. We say that a pair $(F, F')$ of self-conjugate ideals, $F \in I(K)$ and $F' \in I(K')$, is admissible if $F \subset F_\psi$, $F' \subset F_{\psi'}$ and $N(F)\tilde{D} = N(F')\tilde{D}'$.

In this subsection we present our main theorem for producing identities, Theorem 4.3, which is a practical theorem relating ray class theta functions of different quadratic fields. Given an admissible pair $(F, F')$, the theorem describes coincidences between certain combinations of ray class theta functions of $K$ with conductor $F$ and similar combinations of ray class theta functions of $K'$ with conductor $F'$. The origin of these coincidences can be described in terms of L-functions of characters for the ray class groups $C_F(K)$ and $C_{F'}(K')$. We first define such L-functions, then give a crucial relation between them in Theorem 4.2. Given the definition (3.8) of theta functions for a formal complex linear combination of ray classes $\theta(W; d)$, one obtains the corresponding L-function $L(W)$ by using the modified Mellin transform, $M_d$, which sends $q^t$ to $(td)^{-s}$,
$$M_d(\theta(W; d)) = L(W) = \sum_{x \in X} n_x L(x), \tag{4.5}$$
where $L(x) = \sum_{I \in x,\, I \subseteq O_K} N(I)^{-s}$. Now, if $\chi$ is a ray class character of $K$ with conductor $F_\chi$ dividing $F$, we may define the L-function of the ray class character $\chi$ with conductor $F$ as follows,
$$L_F(\chi) \stackrel{\mathrm{def}}{=} \sum_{x \in C_F(K)} \chi(x)L(x) = \sum_{I \in I_F(K),\, I \subseteq O_K} \chi([I]_F)N(I)^{-s}.$$
In particular, we can take $F = F_\chi$. The corresponding L-function $L_{F_\chi}(\chi)$ is the fundamental L-function corresponding to the ray class character $\chi$. We put $L(\chi) = L_{F_\chi}(\chi)$. We now have,

Theorem 4.2. Let $(F, F')$ be an admissible pair and let $\chi$ and $\chi'$ be characters of $C_F(K)$ and $C_{F'}(K')$ such that
(i) $\chi$ is $1$ on $A_F$ and $-1$ on $S_F$.
(ii) For all $I \in I_{FF'O_{KK'}}(KK')$, $\chi([N_{KK'/K}(I)]_F) = \chi'([N_{KK'/K'}(I)]_{F'})$.
Then $L_F(\chi) = L_{F'}(\chi')$.

The proof relies on the theory of Artin L-functions. In fact, for a suitable extension $N$ of $KK'$, the characters $\chi$ and $\chi'$ define, under the Artin correspondence of class field theory, characters $\chi_{\mathrm{gal}}$ of $\Delta = \mathrm{Gal}(N/K)$ and $\chi'_{\mathrm{gal}}$ of $\Delta' = \mathrm{Gal}(N/K')$. Now the Artin L-functions of $\chi_{\mathrm{gal}}$ and $\chi'_{\mathrm{gal}}$ are, by definition, the L-functions of $\chi$ and $\chi'$,
$$L(\chi_{\mathrm{gal}}) = L(\chi), \qquad L(\chi'_{\mathrm{gal}}) = L(\chi').$$
One proves [6] that, under the conditions of the theorem, $\chi_{\mathrm{gal}}$ and $\chi'_{\mathrm{gal}}$ induce the same character of $\Gamma = \mathrm{Gal}(N/\mathbb{Q})$,
$$\chi_{\mathrm{gal}}\!\uparrow_{\Delta}^{\Gamma} \; = \; \chi'_{\mathrm{gal}}\!\uparrow_{\Delta'}^{\Gamma}.$$
Since their Artin L-functions coincide with that of the induced character, one must have
$$L(\chi) = L(\chi_{\mathrm{gal}}) = L(\chi'_{\mathrm{gal}}) = L(\chi'). \tag{4.6}$$
The result follows.

We are now ready to state the main theorem of this paper. It is obtained by summing (multiplied by suitable roots of unity) all the instances of (4.6) for a given admissible pair of conductors (a sort of finite Fourier inversion) and applying the inverse of the Mellin transform given in (4.5).

Theorem 4.3. Let $(F, F')$ be admissible and let $I$ be an integral ideal of $O_{KK'}$ having no common factor with $FF'O_{KK'}$. Put $J = N_{KK'/K}(I)$ and $J' = N_{KK'/K'}(I)$. Then
$$\theta(A_F[J]_F; d) - \theta(S_F[J]_F; d) = \theta(A'_{F'}[J']_{F'}; d) - \theta(S'_{F'}[J']_{F'}; d), \tag{4.7}$$
for $d \in \mathbb{R}$.

Unfortunately, as will be exemplified in the next section when we discuss the first set of Virasoro character identities (2.4), coset theta functions may give rise, by application of (3.9), to theta functions of ray classes with respect to non-self-conjugate conductors. The following theorem describes situations where such theta functions may be replaced by theta functions of ray classes with self-conjugate conductors. It relies on the cancellation available from the relationship (for self-conjugate $F$) $L([\bar{I}]_F) = L([I]_F)$.

Theorem 4.4. Let $F$, $P$ and $J$ be (integral) ideals of $O_K$. Suppose that $F$ is self-conjugate, that $P$ is prime and coprime to $F$ and that $J$ is coprime to $PF$. Let $B$ be a self-conjugate subgroup of $C_F(K)$ containing both $[P/\bar{P}]^2_F$ and $[J/\bar{J}]_F$. Put $T$ for the coset $B[P/\bar{P}]_F$ and put $\tilde{B}$ and $\tilde{T}$ for the inverse images of $B$ and $T$ in $C_{FP}(K)$. Then
$$\theta(\tilde{B}[J]_{FP}; d) - \theta(\tilde{T}[J]_{FP}; d) = \theta(B[J]_F; d) - \theta(T[J]_F; d). \tag{4.8}$$

We shall apply this result in cases where $B = A_F$ and $T = S_F$. We note that in this case, the conditions of the second paragraph of the above theorem hold, provided $F_\psi \mid F$, $\psi(J) = 1$ and $\psi(P) = -1$. The proofs of the above theorems (or rather the corresponding theorems for L-functions) may be found in [6]. We now use them to prove the Virasoro identities (2.4, 2.5), but also to uncover a whole family of identities between Virasoro characters at higher levels, of which the identities (2.4) are the simplest example.
5. Virasoro Identities

In the next three subsections, we use the algebraic tools developed in this paper to prove the identities (2.4). We first show in Subsect. 5.1 how to relate ray class theta functions associated with the two relevant quadratic fields $K = \mathbb{Q}[\sqrt{-2}]$ and $K' = \mathbb{Q}[\sqrt{-1}]$ using Theorem 4.3. We then rewrite the identities as in (2.6). To make contact with Theorem 4.3, we express the left-hand side (resp. the right-hand side) of (2.6) in terms of differences of ray class theta functions for $C_F(K)$ (resp. $C_{F'}(K')$) for self-conjugate conductors $F$ (resp. $F'$). This is carried out in detail in Subsects. 5.2 and 5.3. Remarkably, the three distinct differences of ray class theta functions appearing in the RHS of (2.6) are obtained each time one considers quadratic expressions in Virasoro unitary minimal characters at level $m = 4a^2$ for $a$ odd and $1 + 4a^2 = a'^2 p$ with $p$ prime, in a way described in Theorem 5.1, Subsect. 5.3. This provides an infinite class of new identities between the Virasoro characters at level $m = 3$ and Virasoro characters at level $m = 36, 100, 196, \ldots$ We also remark (5.23) how the LHS of these identities may be rewritten as sums of Virasoro characters at any of an infinite series of higher levels $m = 675, 131043, \ldots$. Subsection 5.4 gives a somewhat terser account of how to prove the second set of identities (2.5). There, the two relevant quadratic fields are $K = \mathbb{Q}[\sqrt{-30}]$ and $K' = \mathbb{Q}[\sqrt{-10}]$. We recall (see Sect. 3) that in what follows, $P_p$ (resp. $P'_p$) stands for a prime ideal of $O_K$ (resp. of $O_{K'}$) of norm $p$ when $p$ is prime.

5.1. Relations between ray class theta functions. As already remarked at the end of Sect. 3, the expression on the left-hand side of the identities (2.6) is associated with the quadratic field $\mathbb{Q}[\sqrt{D}]$ with $D = -2$, while the right-hand side is associated with the quadratic field $\mathbb{Q}[\sqrt{D'}]$ with $D' = -1$. So, in the notations adopted in this paper, one has $K = \mathbb{Q}[\rho]$ with $\rho = \sqrt{-2}$, $O_K = \langle 1, \rho \rangle_{\mathrm{gp}}$, and the units of $O_K$ are $O_K^\times = \{\pm 1\}$. Also, $K' = \mathbb{Q}[i]$, $O_{K'} = \langle 1, i \rangle_{\mathrm{gp}}$ and $O_{K'}^\times = \{\pm 1, \pm i\}$. Consider the two ray class characters $\psi$ and $\psi'$ defined in Subsect. 4.1. Their conductors are given by the expressions (4.1, 4.2) with $\tilde{D} = -8$, $\tilde{D}' = -4$ (and $\tilde{D}'' = 8$, so $a = 1$). Thus,
$$F_\psi = \frac{2[8, 4]}{8}\,O_K = (2)O_K, \qquad F_{\psi'} = \frac{2[4, 8]}{4}\,O_{K'} = (4)O_{K'}.$$
(Note that, especially in subscripts, we shall write principal ideals $(\alpha)$ instead of $\alpha O_K$ or $\alpha O_{K'}$.) Both $O_K$ and $O_{K'}$ are principal ideal domains, i.e. all their ideals are principal, so that, using (3.11), we can easily identify the ray class groups of $K$ and $K'$ associated with the above conductors,
$$C_{F_\psi}(K) = C_{(2)}(K) = C^P_{(2)}(K) \simeq (O_K/(2))^\times = \langle [1 + \rho]_{(2)} \rangle_{\mathrm{gp}},$$
$$C_{F_{\psi'}}(K') = C_{(4)}(K') = C^P_{(4)}(K') \simeq (O_{K'}/(4))^\times/\{\pm 1, \pm i\} = \langle [1 + 2i]_{(4)} \rangle_{\mathrm{gp}}.$$
Both groups are of order 2, and so $\psi$ and $\psi'$ are the only non-trivial (1-dimensional) characters of $C_{F_\psi}(K)$ and $C_{F_{\psi'}}(K')$. We thus have,
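$$\psi([\alpha]_{(2)}) = \begin{cases} 1 & \text{if } \alpha \equiv 1 \bmod (2) \\ -1 & \text{otherwise,} \end{cases} \tag{5.1}$$
$$\psi'([\alpha]_{(4)}) = \begin{cases} 1 & \text{if } \alpha \equiv \pm 1 \text{ or } \pm i \bmod (4) \\ -1 & \text{otherwise.} \end{cases} \tag{5.2}$$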
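The orders of the two ray class groups above can be confirmed by brute force. The sketch below is an illustration only, with hypothetical helper names; it relies on both rings being of the form $\mathbb{Z}[w]$ with $w^2 = D$ and on the conductors being generated by a rational integer $n$. It lists the invertible residues modulo $n$ and groups them into orbits under the global units.

```python
from itertools import product


def ray_classes(n, D, global_units):
    """Classes of (Z[w]/(n))^x modulo the global units, where w^2 = D.

    Residues a + b*w are stored as pairs (a, b) with 0 <= a, b < n.
    """
    def mul(x, y):
        return ((x[0] * y[0] + D * x[1] * y[1]) % n,
                (x[0] * y[1] + x[1] * y[0]) % n)

    elems = list(product(range(n), repeat=2))
    # A residue is a unit when some residue multiplies it to 1.
    units = [x for x in elems if any(mul(x, y) == (1, 0) for y in elems)]
    gu = [(a % n, b % n) for a, b in global_units]
    return {frozenset(mul(u, x) for u in gu) for x in units}


# K = Q[rho], rho^2 = -2, conductor (2), global units +-1.
classes_K = ray_classes(2, -2, [(1, 0), (-1, 0)])
# K' = Q[i], conductor (4), global units +-1, +-i.
classes_Kp = ray_classes(4, -1, [(1, 0), (-1, 0), (0, 1), (0, -1)])
print(len(classes_K), len(classes_Kp))
```

In both cases there are two classes, and the class of $1$ is exactly the set on which (5.1), respectively (5.2), takes the value $1$.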
The next step is to calculate the $A$'s and $S$'s of (4.3) for the admissible pairs of ideals $(F, F')$ (see Definition 4.1) that we shall be using. Put $P_2 = (\rho)O_K$ as before and $P'_2 = (1 + i)O_{K'}$, the prime ideal of norm 2 in $O_{K'}$ (noting that $P'^{\,2}_2 = 2O_{K'}$). Then $(4P_2, (8)O_{K'})$ and $((4)O_K, 4P'_2)$ are admissible pairs for $K$ and $K'$. We concentrate on the first pair as the data for the second pair will come easily by (4.4). By (5.1),
$$\ker \psi_{4P_2} = \{[1 + 2\alpha]_{4P_2} \mid \alpha \in O_K\} = \{[1 + 2a + 2b\rho]_{4P_2} \mid a, b \in \mathbb{Z}\}.$$
Thus, in this case, the $A$-group is trivial,
$$A_{4P_2} = (\ker \psi_{4P_2})^{1-\gamma} = \{[1]_{4P_2}\}, \tag{5.3}$$
since $\overline{1 + 2a + 2b\rho} = 1 + 2a - 2b\rho \equiv 1 + 2a + 2b\rho \bmod 4P_2\ (= (4\rho))$ (and $\gamma$ acts as conjugation). In order to obtain the coset $S_{4P_2}$ (Definition (4.3)), we must choose an ideal $I$ in $I_{4P_2}(K)$ such that $\psi([I]_{4P_2}) = -1$. We take $I = (1 + \rho)O_K$. Then we find that $S_{4P_2}$ consists of the class
$$[I/\bar{I}]_{4P_2} = [(1 + \rho)(1 - \rho)^{-1}]_{4P_2} = [3(-1 + 2\rho)/9]_{4P_2} = [-3 + 2\rho]_{4P_2} = [5 \pm 2\rho]_{4P_2}. \tag{5.4}$$
We also need $A'_{(8)}$ and its coset $S'_{(8)}$ for $K'$. Note that now,
$$\ker \psi'_{(8)} = \{[1 + 4\alpha]_{(8)} \mid \alpha \in O_{K'}\} = \{[1 + 4a + 4bi]_{(8)} \mid a, b \in \mathbb{Z}\}, \tag{5.5}$$
and so
$$A'_{(8)} = (\ker \psi'_{(8)})^{1-\gamma} = \{[1]_{(8)}\}, \tag{5.6}$$
since, modulo $(8)$, $1 + 4a + 4bi \equiv 1 + 4a - 4bi = \overline{1 + 4a + 4bi}$. Moreover, with $I = (1 + 2i)O_{K'}$, $\psi'([I]_{(8)}) = -1$, and so we find that $S'_{(8)}$ consists of the class
$$[I/\bar{I}]_{(8)} = [(1 + 2i)(1 - 2i)^{-1}]_{(8)} = [5(1 + 2i)^2/25]_{(8)} = [-1 + 4i]_{(8)} = [1 + 4i]_{(8)}. \tag{5.7}$$
For the second pair of conductors we find that the ideal $(4)O_K$ divides the ideal $4P_2$, and $4P'_2$ divides $(8)O_{K'}$. So we can obtain the new $A$'s and $S$'s by reduction using (4.4),
$$A_{(4)} = \{[1]_{(4)}\} \quad \text{and} \quad S_{(4)} = [1 + 2\rho]_{(4)},$$
$$A'_{4P'_2} = (\ker \psi'_{4P'_2})^{1-\gamma} = \{[1]_{4P'_2}\} \quad \text{and} \quad S'_{4P'_2} = [5]_{4P'_2} = [3]_{4P'_2}. \tag{5.8}$$
Now take $I$, $F$ and $F'$ in Theorem 4.3 to be successively
$O_{KK'}$, $4P_2$ and $8O_{K'}$;
$(1 - \rho)O_{KK'}$, $4P_2$ and $8O_{K'}$;
$O_{KK'}$, $4O_K$ and $4P'_2$.
This gives the following relations between ray class theta functions of the fields $\mathbb{Q}[\rho]$ and $\mathbb{Q}[i]$,
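$$\begin{aligned}
\theta([1]_{4P_2}; 16) - \theta(S_{4P_2}; 16) &= \theta([1]_{(8)}; 16) - \theta(S'_{(8)}; 16), \\
\theta([1 + 2\rho]_{4P_2}; 16) - \theta([1 + 2\rho]_{4P_2}S_{4P_2}; 16) &= \theta([3]_{(8)}; 16) - \theta([3]_{(8)}S'_{(8)}; 16), \\
\theta([1]_{(4)}; 8) - \theta(S_{(4)}; 8) &= \theta([1]_{4P'_2}; 8) - \theta(S'_{4P'_2}; 8).
\end{aligned} \tag{5.9}$$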
Here, for the second line we have calculated the norms of $(1 - \rho)$ from $KK'$ to $K$ and from $KK'$ to $K'$ as follows,
$$N_{KK'/K}(1 - \rho) = (1 - \rho)^2 = -(1 + 2\rho), \qquad N_{KK'/K'}(1 - \rho) = (1 - \rho)(1 + \rho) = 3.$$

5.2. Reduction of V products to ray class theta functions. We have shown at the end of Sect. 3 how to rewrite a product of two generalised theta functions as a ray class theta function, namely,
$$\theta_{(1+6a),6}\,\theta_{s(1+6b),12} = \theta\!\left(\left[\frac{\alpha_s(a, b)}{\rho(1 - \rho)}\right]_{P_3F}; 16\right), \tag{5.10}$$
with $s \equiv 1 \bmod 6$ and $F = 4P_2$. From (2.7) and the above relation, we obtain the V product $V(1, 2)V(s, 3)$ as a linear combination of ray class theta functions,
$$V(1, 2)V(s, 3) = (\theta_{1,6} - \theta_{7,6})(\theta_{s,12} - \theta_{7s,12}) = \theta(W; 16), \tag{5.11}$$
where
$$W = [\beta_s]_{P_3F}\,(q(0, 0) + q(1, 1) - q(1, 0) - q(0, 1)) \in C_{P_3F}(K), \tag{5.12}$$
with $\beta_s = \alpha_s(0, 0)/((1 - \rho)\rho)$ and $q(a, b) = [\alpha_s(a, b)/\alpha_s(0, 0)]_{P_3F}$. We express the elements $[\delta]_{P_3F}$ of $C_{P_3F}(K) = C^P_{P_3F}(K)$ in the form $[\alpha, \beta]_{P_3F}$ as explained after (3.13), following the decomposition $(O_K/P_3F)^\times \cong (O_K/P_3)^\times \times (O_K/F)^\times$.

In order to apply Theorem 4.3 we shall combine the ray classes of (5.11) to make classes with respect to the self-conjugate conductor $F$ using Theorem 4.4. We take $P$, $B$ and $T$ of that theorem to be, respectively, $P_3$, $A_F$ ($= \{[1]_F\}$ by (5.3)) and $S_F$ ($= \{[5 - 2\rho]_F\}$ by (5.4)). The choice of $P = (1 + \rho)$ and $T$ are consistent since from (5.1), $\psi([1 + \rho]) = -1$. We first identify $\tilde{B}$ and $\tilde{T}$ and then compare them to the classes $q(a, b)$. We know that $N(P_3) = 3$. So $(O_K/P_3)^\times = \{\pm 1 + P_3\}$ and, by (3.15), $\tilde{B}$ and $\tilde{T}$ of Theorem 4.4 are $\{[\pm 1, 1]_{P_3F}\}$ and $\tilde{B}[5 - 2\rho]_{P_3F}$, respectively.

Now $\alpha_s(1, 1) = 7\alpha_s(0, 0)$. So $q(1, 1) = [7]_{P_3F} = [1, -1]_{P_3F} = [-1, 1]_{P_3F}$, by (3.14). Also, $7\alpha_s(1, 0) = 98 + 7s\rho \equiv \alpha_s(0, 1) \bmod 24$. So
$$\alpha_s(0, 1)/(\rho(1 - \rho)) \equiv 7\alpha_s(1, 0)/(\rho(1 - \rho)) \bmod P_3F,$$
and hence
$$q(0, 1) = [7]_{P_3F}\,q(1, 0) = [1, -1]_{P_3F}\,q(1, 0).$$
Again, $\alpha_s(0, 0)(5 - 2\rho) \equiv \cdots \equiv \alpha_s(1, 0) \bmod 24$. So, in the same way,
$$q(1, 0) = [5 - 2\rho]_{P_3F}\,q(0, 0). \tag{5.13}$$
Thus
$$\{q(0, 0), q(1, 1)\} = \tilde{B} \qquad \text{and} \qquad \{q(1, 0), q(0, 1)\} = \tilde{T}.$$
Putting this information into (5.12) we get, from (5.11),
$$V(1, 2)V(|s|, 3) = \theta([\beta_s]_{P_3F}\tilde{B}; 16) - \theta([\beta_s]_{P_3F}\tilde{T}; 16) = \theta([\beta_s]_F; 16) - \theta([\beta_s]_F S_F; 16) \tag{5.14}$$
by Theorem 4.4. Moreover, $\beta_s = 1$ if $s = 1$ and $\beta_s = 1 + 2\rho$ if $s = 5$. If $s = -2$ one can divide the coset in (3.10) by $-\rho$ to get
$$\theta_{(1+6a),6}\,\theta_{2(1+6b),12} = \theta(2(1 + 6b) - (1 + 6a)\rho + (J/P_2); 8).$$
Here we have the same situation as before with $s_{\mathrm{new}} = 1$ except that the rôles of $a$ and $b$ are reversed, the ray class conductor is $4P_3$ and the scale factor 8 instead of 16. The same analysis goes through with $F = 4O_K$ and we obtain
$$V(1, 2)V(2, 3) = \theta([1]_{(4)}; 8) - \theta(S_{(4)}; 8). \tag{5.15}$$
Thus we have shown that the left-hand sides of the identities (2.6) may be rewritten as the left-hand sides of the identities (5.9). To complete our proof of (2.6) we shall, in a similar manner, show that the right-hand sides of (2.6), which may be described by the V products
$$V(r, 4)V(rf, 4) + V(7r, 4)V(7rf, 4) \tag{5.16}$$
for $(r, f) = (1, 2)$, $(3, 2)$ and $(1, 3)$, are equal to the right-hand sides of the identities (5.9).

5.3. An infinite family of identities. In fact the expressions (5.16) are only the first set of an infinite family of quadratic expressions in the V functions which reduce to the right-hand sides of (5.9). We now prove this reduction for the whole family and thus obtain (by (2.9)) an infinite family of identities between Virasoro characters at level $m = 3$ and products of those at levels $m = 4a^2$, where $a$ is odd and $1 + 4a^2 = pa'^2$ with $p$ prime.

Theorem 5.1. Let $a$, $a'$ and $p$ be integers such that $a \equiv 1 \bmod 4$, $p$ is prime and $4a^2 + 1 = pa'^2$. Put $m = 4a^2$ and $c = aa'$. Then, for $r$ odd and prime to $p$ and $\epsilon = 0$ or $1$,
$$\sum_{u=1}^{\frac{p-1}{2}} \sum_{v=0}^{c-1} \sum_{w=0}^{c-1} V(c\hat{u}(r + 8vp), m)\,V(c\hat{u}((2a - \epsilon p)r + 8wp), m) = \theta([r]_{F'} - [r\delta]_{F'};\, 2^{4-\epsilon}), \tag{5.17}$$
where $\hat{u} = (u + 5p(1 - u))$ and the ray classes on the right are defined in $K' = \mathbb{Q}[i]$ with $F' = P'^{\,6-\epsilon}_2$, $P'_2 = (1 + i)O_{K'}$ (as in Subsect. 5.1) and $\delta = 1 + 4i$ so that $\{[\delta]_{F'}\} = S'_{F'}$ (see (5.7) and (5.8)).

Actually, the theorem gives just three different relations. These correspond to the choices $(r, \epsilon) = (1, 0)$, $(3, 0)$ and $(1, 1)$ and have as right-hand sides the three right-hand sides of (5.9). Moreover if we take $a = 1$ and $p = 5$, so $m = 4$, $a' = c = 1$ and $\hat{2} = -23$, and put $f = |2a - \epsilon p| = 2$ or $3$, then the LHS in Theorem 5.1 becomes
$$V(r, 4)V(rf, 4) + V(23r, 4)V(23rf, 4).$$
Now we may make this expression more economical by replacing 23 by 7 (since $23 \times 9 = 207 \equiv 7$ modulo $2m(m + 1) = 40$, $9 = 2m + 1$ and $V(r(2m + 1), m) = -V(r, m)$). So we have,
$$V(r, 4)V(rf, 4) + V(7r, 4)V(7rf, 4) = \theta([r]_{\tilde{F}'};\, 2^{4-\epsilon}) - \theta([r]_{\tilde{F}'}S'_{\tilde{F}'};\, 2^{4-\epsilon}). \tag{5.18}$$
Taking $(r, f) = (1, 2)$, $(3, 2)$ and $(1, 3)$, and using (5.14), (5.15) and (5.9) we get the identities (2.6).

Before we can prove Theorem 5.1, we need some preparation. Suppose that $\Lambda \supset \Lambda'$ are lattices in the inner product space $V$. Then there is a subset $T \subset \Lambda$ (a transversal of $\Lambda$ over $\Lambda'$) of the same size as the quotient group $\Lambda/\Lambda'$ such that $(w + \Lambda')$ and $(w' + \Lambda')$ are disjoint for distinct $w$ and $w'$ in $T$. Then, for any $v \in V$,
$$v + \Lambda = \bigcup_{w \in T} \big((v + w) + \Lambda'\big)$$
and so,
$$\theta(v + \Lambda; d) = \sum_{w \in T} \theta((v + w) + \Lambda'; d). \tag{5.19}$$

Lemma 5.2. Suppose that $k = c^2k'$ with $c$ and $k' \in \mathbb{N}$. Choose $b \in \mathbb{Z}$ to have no common factor with $c$. Then
$$\sum_{j=0}^{c-1} \theta_{cb(r+2jk'),k} = \theta_{br,k'}. \tag{5.20}$$

Proof. Now
$$\theta_{cb(r+2jk'),k} = \theta\!\left(\frac{cb(r + 2jk')}{2k} + \mathbb{Z};\ \frac{1}{k}\right) = \theta\!\left(\frac{br}{2k'} + bj + c\mathbb{Z};\ \frac{1}{k'}\right),$$
whereas
$$\theta_{br,k'} = \theta\!\left(\frac{br}{2k'} + \mathbb{Z};\ \frac{1}{k'}\right).$$
But since $b$ is invertible mod $c$, the set $\{0, b, \ldots, (c - 1)b\}$ is a transversal for $\mathbb{Z}$ over $c\mathbb{Z}$. So the identity follows by (5.19).

As an immediate consequence we have,

Corollary 5.3. If $m(m + 1) = k$ above then
$$\sum_{j=0}^{c-1} V(cb(r + 2jk'), m) = \theta_{br,k'} - \theta_{br(2m+1),k'}. \tag{5.21}$$

We remark, by the way, that taking $m = 242$, $c = 99$, $k' = 6$ and $b = r = 1$ gives a RHS in (5.21) of $\theta_{1,6} - \theta_{5,6} = \eta$, and we obtain,
$$\sum_{j=0}^{98} \chi^{Vir(242)}_{99(1+12j),\,99(1+12j)} = 1. \tag{5.22}$$
This is the first in an infinite series of such identities (though, presumably, not the first with RHS 1). The next, however, has $m = 23762$ and $c = 109 \times 89 = 9701$. Again, we may solve the Pellian equation
$$(2m + 1)^2 - 48c^2 = 1 \qquad (\text{i.e. } m(m + 1) = 12c^2)$$
and choose one of the (infinitely many) solutions such that $2m + 1 \equiv 7$ modulo 24 (e.g. $m = 675$, $c = 195$; $m = 131043$, $c = 37829$). In Lemma 5.2, we now have $k' = 12$ and $b = 1$ gives a RHS in (5.21) of $\theta_{r,12} - \theta_{7r,12} = V(r, 3)$ and we obtain, for instance,
$$\sum_{j=0}^{194} \chi^{Vir(675)}_{195(1+24j),\,195(1+24j)} = \chi^{Vir(3)}_{r,r}. \tag{5.23}$$
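The levels $m$ that enter Corollary 5.3 through $m(m + 1) = k'c^2$ can be enumerated from the Pell equation $(2m + 1)^2 - 4k'c^2 = 1$. The sketch below is a plain illustration with hypothetical function names: it finds the fundamental solution by brute force and generates the rest by the standard recurrence. The residue condition $2m + 1 \equiv 7 \bmod 24$ for $k' = 12$ is the one stated in the text; the condition $2m + 1 \equiv 5 \bmod 12$ used for $k' = 6$ is an assumption read off from the right-hand side $\theta_{1,6} - \theta_{5,6}$.

```python
from math import isqrt


def pell_solutions(N, count):
    """First `count` positive solutions (x, y) of x^2 - N*y^2 = 1 (N not a square)."""
    y1 = 1
    while isqrt(N * y1 * y1 + 1) ** 2 != N * y1 * y1 + 1:
        y1 += 1
    x1 = isqrt(N * y1 * y1 + 1)          # fundamental solution (x1, y1)
    sols, x, y = [], x1, y1
    for _ in range(count):
        sols.append((x, y))
        x, y = x1 * x + N * y1 * y, x1 * y + y1 * x
    return sols


def levels(k_prime, residue, modulus, count):
    """Pairs (m, c) with m(m+1) = k_prime*c^2 and 2m+1 = residue mod modulus."""
    return [((x - 1) // 2, y)
            for x, y in pell_solutions(4 * k_prime, count)
            if x % modulus == residue]


print(levels(12, 7, 24, 5))   # k' = 12, condition 2m + 1 = 7 mod 24
print(levels(6, 5, 12, 5))    # k' = 6, assumed condition 2m + 1 = 5 mod 12
```

Apart from the trivial pairs with $c = 1$, the first list contains $(675, 195)$ and $(131043, 37829)$, and the second contains $(242, 99)$ and $(23762, 9701)$.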
This is the first in an infinite series of such identities which (taking $r = 1$, $-2$ and $-5$) rewrite the left-hand sides of (2.4).

Proof of Theorem 5.1. Note the congruences $2a \equiv 2 \bmod 8$ and $p \equiv 5 \bmod 8$ (so also $5p \equiv 1 \bmod 8$). In particular, $\hat{u} \equiv u \bmod p$ and $\hat{u} \equiv 1 \bmod 8$.

(i) We first rewrite the left-hand side of (5.17) as a sum of differences of coset theta functions. Now $2m + 1 = 2p(a')^2 - 1 \equiv 2p - 1$ modulo $8p$. So by (5.21), the $u$th term in the outer summation on the LHS of (5.17) is
$$\mathrm{LHS}(u) = (\theta_{r\hat{u},4p} - \theta_{r\hat{u}(2p-1),4p})(\theta_{(2a-\epsilon p)r\hat{u},4p} - \theta_{(2a-\epsilon p)r\hat{u}(2p-1),4p}).$$
Applying (2.13) (with $k = l = h = 4p$, $\lambda = \mu = k_0 = \ell_0 = 1$ and $D = -1$),
$$\mathrm{LHS}(u) = \theta(r\hat{u}\alpha_1 + 8pO_{K'}; 16p) - \theta(r\hat{u}\alpha_2 + 8pO_{K'}; 16p) - \theta(r\hat{u}\alpha_3 + 8pO_{K'}; 16p) + \theta(r\hat{u}\alpha_4 + 8pO_{K'}; 16p), \tag{5.24}$$
where
$$\begin{aligned}
\alpha_1 &= 1 + (2a - \epsilon p)i; \\
\alpha_2 &= 1 - (2a - \epsilon p)(2p - 1)i; \\
\alpha_3 &= (2p - 1) - (2a - \epsilon p)i \equiv (2p - 1)\alpha_2 \bmod 8p; \\
\alpha_4 &= (2p - 1)(1 + (2a - \epsilon p)i) = (2p - 1)\alpha_1.
\end{aligned}$$
(Note that $(2p - 1)^2 \equiv 1 \bmod 8p$.) Now $\hat{u}$ is $u$ mod $p$ and $1$ mod $8$. So $\hat{u}(2p - 1) \equiv -u \equiv p - u \equiv \widehat{p - u} \bmod p$ and $\hat{u}(2p - 1) \equiv 1(10 - 1) \equiv 1 \equiv \widehat{p - u} \bmod 8$ and therefore $\hat{u}(2p - 1) \equiv \widehat{p - u} \bmod 8p$. Thus
$$r\hat{u}\alpha_4 \equiv r(\widehat{p - u})\alpha_1 \qquad \text{and} \qquad r\hat{u}\alpha_3 \equiv r(\widehat{p - u})\alpha_2.$$
Hence the LHS of (5.17) may be rearranged as
$$\sum_{u=1}^{\frac{p-1}{2}} \mathrm{LHS}(u) = \sum_{u=1}^{p-1} \big(\theta(r\hat{u}\alpha_1 + 8pO_{K'}; 16p) - \theta(r\hat{u}\alpha_2 + 8pO_{K'}; 16p)\big), \tag{5.25}$$
since doubling the range of the sum exactly compensates for the elimination of the last two terms of (5.24). We now prepare to express these coset theta functions as ray class theta functions using (3.9) and to reduce the natural conductors using Theorem 4.4.
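The elementary congruences used in part (i) can be checked numerically for the first few admissible parameters. The sketch below is only a sanity check under the hypotheses of Theorem 5.1; the function names are hypothetical, and the sign of $a$ is chosen so that $a \equiv 1 \bmod 4$ (so the levels $m = 36$ and $m = 196$ come from $a = -3$ and $a = -7$).

```python
from math import isqrt


def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, isqrt(n) + 1))


def theorem_5_1_parameters(limit):
    """Triples (a, p, a') with a = 1 mod 4, |a| <= limit, 4a^2 + 1 = p*a'^2, p prime."""
    out = []
    for k in range(1, limit + 1, 2):        # odd |a|
        a = k if k % 4 == 1 else -k         # sign chosen so that a = 1 mod 4
        n = 4 * a * a + 1
        # Strip the largest square factor of n, leaving n = p * a'^2.
        ap = next(s for s in range(isqrt(n), 0, -1) if n % (s * s) == 0)
        p = n // (ap * ap)
        if is_prime(p):
            out.append((a, p, ap))
    return out


def hat(u, p):
    """u-hat = u + 5p(1 - u), as in Theorem 5.1."""
    return u + 5 * p * (1 - u)


def congruences_hold(a, p):
    """Check the congruences quoted at the start of the proof and in part (i)."""
    m = 4 * a * a
    ok = p % 8 == 5 and (2 * a) % 8 == 2 and (2 * m + 1 - (2 * p - 1)) % (8 * p) == 0
    for u in range(1, p):
        ok = ok and hat(u, p) % p == u % p and hat(u, p) % 8 == 1
        ok = ok and (hat(u, p) * (2 * p - 1) - hat(p - u, p)) % (8 * p) == 0
    return ok


params = theorem_5_1_parameters(9)
print(params)
print(all(congruences_hold(a, p) for a, p, _ in params))
```

The case $a = 9$ is the first with $a' \neq 1$, since $4 \cdot 81 + 1 = 325 = 13 \cdot 5^2$.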
We note first that pOK0 = P P , where P = (1 − 2ai)OK0 . So (cf. 3.5), P and P are distinct prime ideals of norm p and by (3.16), (OK0 /P )× = {u + P | u ∈ Z, 1 ≤ u ≤ p − 1}.
(5.26)
We shall apply Theorem 4.4 with K, P , F , B and T of that theorem being respectively K0 , P , F 0 = (P20 )6− , A0F 0 (= {[1]F 0 } by 5.6) and SF0 0 (= {[δ]F 0 } by 5.7). The choice of P and T are consistent since 1 − 2ai ≡ 1 − 2i mod 4 and so, from (5.2), ψ([P ]) = −1. We express the elements of CP F 0 (K0 ) in the form [α, β]P F 0 as explained after (3.13), following the decomposition (OK0 /P F 0 )× ∼ = (OK0 /P )× × (OK0 /F 0 )× . Then, by (3.15) and (5.26), B˜ of Theorem 4.4 is −1 def F0 B˜ = redP (B) = {[u, 1]P F 0 | 1 ≤ u ≤ p − 1} = {[u] ˆ P F 0 | 1 ≤ u ≤ p − 1} 0 F ˜ 0 ]P F 0 , provided [δ 0 ]F 0 = [δ]F 0 . and T˜ = B[δ (ii) We examine first the case when = 0. Then α1 OK0 = P and so, for β ∈ OK0 coprime to 2P , the highest common factor H of (βα1 )OK0 and (8p)OK0 = 8P P is P . Therefore, using the relation (3.9) between coset and ray class theta functions for the ray class group CP F 0 (K0 ), we get, θ(βα2 + 8pOK0 ; 16p) = θ([β]P F 0 ; 16).
(5.27)
˜ 2 /α1 ]P F 0 , since α2 /α1 = 1 − 4(1 − 2ai)i is coprime to P and congruent Now T˜ = B[α to δ modulo 8. Thus, applying (5.27) to (5.25) (with β = ruˆ or ruα ˆ 2 /α1 ) we find that the LHS of (5.17) is p−1 X
˜ P F 0 , 16) − θ(T˜ [r]P F 0 , 16) ˆ 2 /α1 ]P F 0 ; 16) = θ(B[r] θ([ru] ˆ P F 0 ; 16) − θ([ruα
u=1
= θ([r]F − [rδ]F ; 16),
(5.28)
by Theorem 4.4, as required. (iii) Now consider the case when = 1. For β ∈ OK0 coprime to 2P , the highest common factor, H, of (β(1+2ai)(1+i))OK0 = 6 βP P2 and (8p)OK0 = P20 P P is P P20 . Therefore, using (3.9) again, we get θ(β(1 + 2ai)(1 + i) + 8pOK0 ; 16p) = θ([β]P F 0 ; 8).
(5.29)
Put βj = αj /((1 + 2ai)(1 + i)), for j = 1, 2. Then, since 2a ≡ 2 mod 8, (1 + i)β1 =
1 + 2ai − pi = 1 − (1 − 2ai)i ≡ −1 − i mod 8. (1 + 2ai)
So β1 ≡ −1 modulo (8/(1 + i))OK0 = F 0 and so [β1 ]F 0 = [1]F 0 and (from the third ˜ 1 ]P F 0 = B. ˜ expression) β1 is coprime to P . Thus B[β Again (1 + i)β2 =
1 + 2ai − ip(1 − 2p + 4a) = 1 − i(1 − 2ai)(1 − 2p + 4a). (1 + 2ai)
So β2 is coprime to P and (1 + i)β2 ≡ 1 − (i + 2)(1 − 2 + 4) ≡ −2 − 3(1 + i) modulo 8. So β2 ≡ −(1 − i) − 3 = −(4 + i) ≡ −iδ mod 4P20 . ˜ 2 ]P F 0 = T˜ . Thus, applying (5.29) to (5.25) (with β = ruβ Hence B[β ˆ 1 or ruβ ˆ 2 ) we find that the LHS of (5.17) is p−1 X
˜ P F 0 , 8) − θ(T˜ [r]P F 0 , 8) (θ([ruβ ˆ 2 ]P F 0 ; 8)) = θ(B[r] ˆ 1 ]P F 0 ; 8) − θ([ruβ
u=1
= θ([r]F − [rδ]F ; 8), by Theorem 4.4, as required.

5.4. The second set of identities. The proof of the second set of identities (2.5) relies on Theorem 4.3, as did the proof of the identities (2.4) in Subsect. 5.1. However now, the relevant quadratic fields are $K = \mathbb{Q}[\sqrt{-30}]$ and $K' = \mathbb{Q}[\sqrt{-10}]$. By (3.1), their rings of algebraic integers are $O_K = \mathbb{Z}[\sqrt{-30}]$ and $O_{K'} = \mathbb{Z}[\sqrt{-10}]$. By (4.1, 4.2), the conductors of the ray class characters $\psi$ and $\psi'$ introduced in Subsect. 4.1 are given by,
$$F_\psi = (2^a)O_K = (2)O_K = P_2^2 \qquad \text{and} \qquad F_{\psi'} = (3 \times 2^a)O_{K'} = (6)O_{K'} = 3P'^{\,2}_2, \tag{5.30}$$
since $D'' = 3$ and so $\tilde{D}'' = 12$, and $a = 1$.

The calculation of $\psi$ and $\psi'$ is complicated by the fact that ideals in $O_K$ and $O_{K'}$ need not be principal ideals: $C(K)$ is of order 4 and $C(K')$ is of order 2. In fact $[P_{11}]_{O_K}$ and $[P_{13}]_{O_K}$ are generators of $C(K)$ so that, by (3.7), $[P_{11}]_{(2)}$ and $[P_{13}]_{(2)}$ together with $C^P_{(2)}(K)$ generate $C_{(2)}(K)$. We find that $\psi = 1$ on $[P_{11}]_{(2)}$ and $[P_{13}]_{(2)}$ and on $[\gamma]_{(2)}$ if $\gamma \equiv 1 \bmod P_2$ and these values determine $\psi$.

Similarly, taking $[P'_{13}]_{O_{K'}}$ as generator of $C(K')$, we eventually conclude that $\psi' = 1$ on $[P'_{13}]_{(6)}$ and on $[\alpha]_{(6)}$, for $\alpha \in O_{K'} \setminus (P'_2 \cup 3O_{K'})$, if either both or neither of $\alpha \equiv 1 \bmod 2$ and $\alpha \equiv \pm 1$ or $\pm\sqrt{-10} \bmod 3$ are satisfied (this is the only way to ensure that the conductor is no larger than $6O_{K'}$; note that $(O_{K'}/(3))^\times$ is cyclic of order 8 generated by the coset of $\mu' = 1 + 2\sqrt{-10}$). Again these values determine $\psi'$.

We use the admissible pair $(F, F')$ of conductors where $F = P_5P_3\,4P_2$ and $F' = P'_5(3)4P'_2$. Thus $C^P_F(K)$ is the image of
$$(O_K/F)^\times \cong (O_K/P_5)^\times \times (O_K/P_3)^\times \times (O_K/4P_2)^\times, \tag{5.31}$$
(using (3.13) twice) and we denote its elements [n, m, α]F , accordingly, slightly generalising the notation introduced before (3.14). We find, ker ψF = h[P11 ]F , [P13 ]F , [n, m, 1 + 2α]F | n ∈ Z − 5Z, m ∈ Z − 3Zigp (recalling (3.16) we see that we can take n and m to be integers). The classes √ [n, m, 1 + 2α]1−γ turn out to be trivial, and for µ = 1 + 2 −30, one has, F = [µ/11]F = [11µ]F = [1, −1, 3µ]F , [P11 ]1−γ F 2 since P11 = µOK and 112 ≡ 1 mod F . Similarly,
√ [P13 ]1−γ = [(7 + 2 −30)/13]F = [−1, 1, 3µ]F . F So, in the notation of Theorem 4.3, AF = (ker ψF )1−γ = h[1, −1, 3µ)]F , [−1, 1, 3µ]F igp = {[1]F , [1, 1, −1]F , [−1, 1, 3µ]F , [−1, 1, −3µ]F }.
(5.32)
Also, [1, 1, 1 + ρ]1−γ = [1, 1, (1 + ρ)2 (31)−1 ]F = [1, 1, −3µ]F , and F Again ker ψF0 0
SF = AF [1, 1, −3µ]F = AF [−1, 1, 1]F . √ = B ∪ B[1, µ0 , 1 + −10]F 0 , where
(5.33)
√ 0 ]F 0 , [n, β, 1 + 2α]F 0 | n ∈ Z − 5Z, β ≡ ±1 or ± −10 mod 3igp . B = h[P13 √ 0 1−γ ]F 0 = [1, −10, −1]F 0 , [n, β, 1 + 2α]1−γ We find [P13 F 0 = [1, ±1, 1]F 0 , and √ √ 0 0 = [1, − −10, −3µ ] . So [1, µ0 , 1 + −10]1−γ 0 F F √ √ A0F 0 = h[1, −10, −1]F 0 , [1, −10, −3µ0 ]F 0 , [1, −1, 1]F 0 igp (5.34) √ √ = {[1, ±1, 1]F 0 , [1, ± −10, −1]F 0 , [1, ±1, 3µ0 ]F 0 , [1, ± −10, −3µ0 ]F 0 }. √ 0 Also, [1, 1, 1 + −10]1−γ F 0 = [1, 1, −3µ ]F 0 , and SF0 0 = A0F 0 [1, 1, −3µ0 ]F 0 = A0F 0 [1, ±1, −1]F 0 .
(5.35)
Now (because there are ideals of norm 13 in both $O_K$ and $O_{K'}$) there is an ideal $\tilde{P}$ in $O_{KK'}$ such that
$$N_{KK'/K}(\tilde{P}) = P_{13} \qquad \text{and} \qquad N_{KK'/K'}(\tilde{P}) = P'_{13}.$$
Again, let $p$ be a prime such that $p \equiv 1 \bmod 12$. Then, by quadratic reciprocity,
$$\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right) = 1.$$
So $O_{K''}$ has an ideal $P''_p$ of norm $p$. It follows (from the identities of Sect. 4.2) that
$$N_{KK'/K}(P''_pO_{KK'}) = pO_K \qquad \text{and} \qquad N_{KK'/K'}(P''_pO_{KK'}) = pO_{K'}.$$
Now, if $n$ is prime to 5 and congruent to 1 mod 12 we may choose (by Dirichlet's theorem) $p \equiv n \bmod 120$. Then, taking $I$ of Theorem 4.3 to be $\tilde{P}P''_p$, we find that
$$[N_{KK'/K}(I)]_F = [pP_{13}]_F = [nP_{13}]_F \qquad \text{and} \qquad [N_{KK'/K'}(I)]_{F'} = [pP'_{13}]_{F'} = [nP'_{13}]_{F'}.$$
So that, by Theorem 4.3,
$$\theta([nP_{13}]_F(A_F - S_F); d) = \theta([nP'_{13}]_{F'}(A_{F'} - S_{F'}); d). \tag{5.36}$$
(Here sets $A_F$, $S_F$, etc. stand for the sums of their elements.) In particular, let $r \equiv \zeta \bmod 4$, where $\zeta = \pm 1$; let $2 - s \equiv \zeta r \bmod 8$; and let $t \equiv \zeta s \not\equiv 0 \bmod 5$. Then, choosing $n$ congruent to $s$ (and $\zeta t$) mod 5, to 1 mod 3 and to $\zeta r$ mod 8, we have
$$\theta([P_{13}]_F[s, 1, 2 - s]_F(A_F - S_F); d) = \theta([P'_{13}]_{F'}[t, 1, r]_{F'}(A_{F'} - S_{F'}); d), \tag{5.37}$$
where we have used the fact that $[\zeta, 1, \zeta]_{F'} = [1, \zeta, 1]_{F'}$ lies in $A'_{F'}$. Note that we can take
$$(s, r, t) = \begin{cases} (1, 1, 1) & \ldots\ \text{(i)} \\ (-11, -5, 1) & \ldots\ \text{(ii)} \\ (-3, -5, 13) & \ldots\ \text{(iii)} \\ (-7, 1, 13) & \ldots\ \text{(iv).} \end{cases} \tag{5.38}$$
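The choice of $n$ and $p$ for each row of (5.38) can be made explicit. The following sketch uses hypothetical helper names and picks the smallest prime in each residue class, which is only one of infinitely many admissible choices. It checks the congruence conditions on $(s, r, t)$, solves for $n$ modulo 120, finds a prime $p \equiv n \bmod 120$, and confirms via Euler's criterion that 3 is a square modulo $p$.

```python
from math import isqrt


def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, isqrt(n) + 1))


def find_n(s, r, t):
    """Residue n mod 120 with n = s mod 5, n = 1 mod 3, n = zeta*r mod 8."""
    assert r % 2 == 1
    zeta = 1 if r % 4 == 1 else -1           # r = zeta mod 4
    # Conditions stated before (5.37).
    assert (2 - s - zeta * r) % 8 == 0
    assert (t - zeta * s) % 5 == 0 and t % 5 != 0
    for n in range(120):
        if n % 5 == s % 5 and n % 3 == 1 and n % 8 == (zeta * r) % 8:
            return n


def prime_in_class(n):
    """Smallest prime congruent to n mod 120 (one exists by Dirichlet's theorem)."""
    p = n
    while not is_prime(p):
        p += 120
    return p


rows = [(1, 1, 1), (-11, -5, 1), (-3, -5, 13), (-7, 1, 13)]
choices = []
for s, r, t in rows:
    n = find_n(s, r, t)
    p = prime_in_class(n)
    # p = 1 mod 12, and 3 is a quadratic residue mod p (Euler's criterion).
    assert p % 12 == 1 and pow(3, (p - 1) // 2, p) == 1
    choices.append((n, p))
print(choices)
```

All four rows pass the congruence checks, so each of them admits a prime $p$ of the required kind.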
We now set about reducing the identities (2.5) between Virasoro characters to identities like (5.37). As a first step, we rewrite the former using the V functions defined in (2.7). This gives,
$$\begin{aligned}
\eta[V(1, 4) \pm V(11, 4)] &= [V(1, 3) \pm V(5, 3)][V(2, 5) \mp V(8, 5)], \\
\eta[V(-3, 4) \pm V(7, 4)] &= [V(1, 3) \pm V(5, 3)][V(-4, 5) \mp V(14, 5)]
\end{aligned} \tag{5.39}$$
and
$$\begin{aligned}
\eta V(2, 4) &= V(2, 3)[V(1, 5) - V(19, 5)], \\
\eta V(6, 4) &= V(2, 3)[V(7, 5) - V(13, 5)].
\end{aligned} \tag{5.40}$$
We concentrate on the four identities (5.39) here, since the last two (5.40) can be treated in a very similar way. Using the properties of V functions described in (2.8, 2.9), the relations (5.39) can be expressed in the following compact form,
$$V(1, 2)V(s, 4) = VV(r, t), \tag{5.41}$$
where
$$VV(r, t) \stackrel{\mathrm{def}}{=} V(r, 3)V(2t, 5) + V(-5r, 3)V(32t, 5),$$
with $(s, r, t)$ taking the values (5.38). These equations and hence the first four identities in (2.5) will follow from (5.37) when we show that, for the $(s, r, t)$ of (5.38) and with $d = 240$,
$$2V(1, 2)V(s, 4) = \text{LHS of (5.37)} \tag{5.42}$$
and
$$2VV(r, t) = \text{RHS of (5.37)}. \tag{5.43}$$
We tackle (5.42) first. Applying (2.13) (both signs), we find def
2θr,6 θs,20 = T (α) = θ(α + J; d) + θ(α¯ + J; d), where α = 10r + sρ, J = 40P3 and d = 2400. Assume r prime to 6 and s to 10. We find that the h.c.f of αOK and J is H = P2 P5 and, since F = JH −1 , T (α) = θ([αH −1 ]F + [αH ¯ −1 ]F ; 240) = θ([αH −1 ]F ([1]F + [α/α] ¯ F ); 240). Now put β = 10 + ρ and, choosing b prime to 10, put αˆ = 10r + bsρ. Then αβˆ − αβ ˆ = 10(b − 1)(r − s)ρ ∈ 10F. Since H 2 = 10OK , it follows from Lemma 3.3(ii) that
ˆ [α/α] ˆ F = [β/β]F = · · · = [b, 1, 2 − b + (b − 1)ρ]F = (b), say, using the 3-component notation developed above. Also, N (βH −1 ) = 13. So we may take P13 = βH −1 and then [αH −1 ]F = [P13 ]F (s). We have now T (α) = θ([P13 ]F (s)([1]F + (−1)); 240). Now, since −31 ≡ 2m + 1 mod 2m(m + 1) for m = 2 and m = 4, (2.7) can be rewritten for these m values as, V (r, m) = θr,m(m+1) − θ31r,m(m+1) . So, taking b = 31 and r = 1 in α and α, ˆ 2V (1, 2)V (s, 4) = T (α) + T (31α) − T (α) ˆ − T (31α) ˆ = θ(X(s); 240), where
X(s) = [P13 ]F (s)([1]F + [31]F )([1]F + (−1))([1]F − (31)).
Now, [31]F = [1, 1, −1]F , (−1) = [−1, 1, 3µ]F and (31) = [1, 1, 3µ] ∈ SF . So X(s) = [P13 ]F (s)(AF − SF ) = [P13 ]F [s, 1, 2 − s]F (AF − SF ), if s ≡ 1 mod 4. Thus we have proved (5.42). Applying (2.13) again, we find def
2θr,12 θ2t,30 = T (α) = θ(α + J; d) + θ(α¯ + J; d), √ where α = 5r + 2t −10, J = 60P20 and d = 1200. We assume r ≡ 1 mod 6 and t prime to 5 and t ≡ 1 mod 3.
(5.44)
We find that the h.c.f of αOK and J is H = P50 and, since F 0 = JH −1 , T (α) = θ([αH −1 ]F 0 + [αH ¯ −1 ]F 0 ; 240) = θ([αH −1 ]F 0 ([1]F 0 + [α/α] ¯ F 0 ); 240). (5.45) √ Now put √ β = 5 + 2 −10 and, choosing a ≡ 1 mod 6 and b prime to 15, put αˆ = 5ar + 2bt −10. Then √ αβˆ − αβ ˆ = 10(b − a)(r − t) −10 ∈ 5F 0 , since r ≡ t mod 3. Since H 2 = 5OK , it follows from Lemma 3.3(ii) that 0 ˆ [α/α] ˆ F 0 = [β/β]F 0 = · · · = [1, 1, a]F 0 (b),
where
(5.46)
√ √ 0 (b) = [b, −(1 + b) − (b − 1) −10, 1 + 2(b − 1) −10]F 0 = [b, 1, 1]F 0 , if b ≡ 1 mod 6.
In particular, from (5.45), T (α) = θ([αH −1 ]F 0 ([1]F 0 + 0 (−1)); 240). Now, from (2.7),
(5.47)
V (r, 3) = θr,12 − θ7r,12
and
V (2t, 5) = θ2t,30 − θ2(−11t),30 ,
(5.48)
where we took the minus sign so that 7 ≡ −11 ≡ 1 mod 3. This ensures that if r = r0 and t = t0 satisfy (5.44) then so do all the pairs (r, t) of the products T (α) = 2θr,12 θ2t,30 in the expansion of V V (r0 , t0 ) using (5.48). Expressing each such product as in (5.47) and using√(5.46) several times with different choices of a and b, we find, writing α0 = 5r0 + 2t0 −10, that V V (r0 , t0 ) = θ(X; 240), (5.49) where X = [α0 H −1 ]F 0 ([1]F 0 + [1, 1, −5]F 0 0 (16))× × ([1]F 0 − 0 (−11) − [1, 1, 7]F 0 + [1, 1, 7]F 0 0 (−11))([1]F 0 + 0 (−1)) . . . = [α0 H −1 ]F 0 (AF 0 − SF 0 ). 0 Now, N (β) = 25 + 40 = 65. So N (βH −1 ) = 13. Hence we may choose P13 = βH −1 and then 0 0 [α0 H −1 ]F 0 = [P13 ]F 0 [α0 /β]F 0 = [P13 ]F 0 [1, 1, r0 ](t0 ).
So, if t0 ≡ 1 mod 6, 0 ]F 0 [t0 , 1, r0 ]F 0 (AF 0 − SF 0 ). X = [P13
Thus we have proved (5.43) and, as observed there, the first and second lines of (2.5) now follow.

6. Conclusions

Over the years, two-dimensional conformal field theory has proven to be a true goldmine for those studying string theory as well as statistical mechanics. Its underlying algebraic structure is the infinite dimensional Virasoro algebra. Although its representation theory has been thoroughly analysed, it is remarkable that identities between unitary minimal Virasoro characters of low level, of the kind discussed in this paper, have not been of use in any "physical" context we are aware of. These identities could therefore be regarded as mathematical curiosities, but our aim here has been to provide a solid mathematical framework within which they naturally appear as the consequence of relations between two "well chosen" imaginary quadratic extensions over $\mathbb{Q}$. The formalism used is borrowed from number theory, and provides, together with a new proof of the identities (2.4, 2.5), a new infinite family of identities between Virasoro characters at level 3 and level $m = 4a^2$, for $a$ odd and $1 + 4a^2 = a'^2p$, where $p$ is prime.

From the number theory point of view, the interesting result is Theorem 4.3, which describes relations between ray class theta functions of two different imaginary quadratic fields $K$ and $K'$, under a certain number of constraints presented in Sect. 4. That these relations, when the pair $(K, K')$ is different from $(\mathbb{Q}[\sqrt{-2}], \mathbb{Q}[\sqrt{-1}])$ and $(\mathbb{Q}[\sqrt{-30}], \mathbb{Q}[\sqrt{-10}])$ but still obeys the constraints of Sect. 4, lead to other identities between unitary minimal Virasoro characters, is neither proven nor disproven at this stage.

Acknowledgement. One of us (A.T.) acknowledges the U.K. Engineering and Physical Sciences Research Council for the award of an Advanced Fellowship.
References 1. Borevich, Z.I. and Shafarevich, I.R.: Number Theory. London–New York: Academic Press, 1966 2. Goddard, P., Kent, A., Olive, D.: Unitary representations of the Virasoro and Super-Virasoro algebras. Commun. Math. Phys. 103, 105 (1986) 3. Kac, V.G. and Peterson, D.H.: Infinite-dimensional Lie algebras, theta functions and modular forms. Adv. in Math. 53, 125 (1984) 4. Lang, S.: Algebraic number theory. Reading, MA: Addison-Wesley, 1970 5. Taormina, A.: New identities between unitary minimal Virasoro characters. Commun. Math. Phys. 165, 69 (1994) 6. Wilson, S.M.J.: Relations between ray class L-functions of different quadratic fields. In preparation Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 196, 105 – 131 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Transport Properties of Markovian Anderson Model Serguei Tcheremchantsev Department of Mathematics, University of Orléans, 45067 Orléans Cedex, France. E-mail:
[email protected] Received: 3 December 1997 / Accepted: 15 January 1998
Abstract: We consider the Anderson model in $l^2(\mathbf{Z}^d)$, $d \ge 1$, with potentials whose values at any site of the lattice are Markovian independent random functions of time. Upper and lower bounds for the moments $|X|^p(t,\omega)$ are obtained with probability 1. We also obtain upper and lower bounds for the averaged diffusion constant and upper bounds for the correlation function. The results show diffusive behaviour in dimensions $d = 1, 2$ up to logarithmic factors.

1. Introduction

Since the 1950s the Anderson model has been one of the basic models for studying transport phenomena in random environments:
$$i\frac{\partial\psi}{\partial t} = -\Delta\psi + Q_\omega(n)\psi, \qquad \psi|_{t=0} = \psi_0(n), \quad \psi_0 \in l^2(\mathbf{Z}^d),$$
where the values of the potential $Q$ are i.i.d. random variables. Much less attention has been devoted to models with random potentials depending on time. There are physical motivations to consider such models (see [1–3]), but there are almost no mathematical results (see, however, [4] for references). In [4] the Markovian time-dependent version of the Anderson model was considered, and some results about the behaviour of solutions as $t$ tends to infinity were proven. In the present paper we shall study the transport properties of this model.

Before describing the model and discussing the results we shall say a few words about the mathematical problems related to the "classic" Anderson model. When considering random Schrödinger operators, one can obtain in general two kinds of results:

1. Spectral properties of random Schrödinger operators $H_\omega = -\Delta + Q_\omega$ with probability 1. In particular, the nature of the spectrum and properties of eigenfunctions are of interest.
2. Dynamical behaviour of solutions $\psi(t) = \exp(-itH_\omega)\psi_0$. Most interesting is the behaviour of the mean square displacement
$$X^2(t) = \sum_{n \in \mathbf{Z}^d} |n|^2 |\psi(t,n)|^2$$
as $t \to \infty$, which determines the transport properties of the model. If in some sense $X^2(t) \sim t^{2\alpha}$, then $\alpha \in [0,1]$ is called a diffusion exponent. If $X^2(t) \le C$, one speaks of dynamical localization.

As to the spectral properties of random Schrödinger operators, this is the field which has been the most intensively explored by mathematicians during the last 20 years. The main result, called mathematical localization, is the pure point nature of the spectrum and the exponential decay of eigenfunctions at infinity. This result was proven in great generality in dimension $d = 1$ and in dimension $d \ge 2$ for big disorder or low energy. One expects that continuous (absolutely continuous?) spectrum may exist in dimension $d \ge 3$; however, there is still no mathematical proof of this.

The spectral properties of the self-adjoint operator $H$ are closely related to the behaviour in time of solutions $\psi(t) = \exp(-itH)\psi_0$. The classic result is given by the so-called RAGE Theorem [5]. If $\psi_0 \in H_{pp}$, where $H_{pp}$ is the subspace of pure point spectrum of $H$, one has
$$\lim_{R \to +\infty} \sup_t \|F(|n| \ge R)\psi(t)\| = 0, \qquad (1.1)$$
where $F$ is a characteristic function of the outside of the ball. The solutions $\psi(t)$ which satisfy (1.1) are called bound states. If $\psi_0 \in H_c$, where $H_c$ is the continuous subspace of $H$, then for any $R > 0$ one has
$$\lim_{T \to +\infty} T^{-1} \int_0^T \|F(|n| \le R)\psi(t)\|^2\,dt = 0. \qquad (1.2)$$
The solutions $\psi(t)$ which satisfy (1.2) are called propagating states. The definitions (1.1), (1.2) are valid also in the case of time-dependent Hamiltonians [6]. The relations between spectral properties of $H$ and the dynamical behaviour of solutions $\psi(t)$ become less obvious when one considers quantities like $X^2(t)$. In the case of $\psi_0 \in H_c$, (1.2) implies
$$\langle X^2\rangle_T = T^{-1}\int_0^T X^2(t)\,dt \to +\infty,$$
and the problem is to determine the corresponding diffusion exponent $\alpha$. When the spectrum of $H$ is pure point, the most that one can claim is the absence of ballistic motion [7]: $\lim_{t \to +\infty} X^2(t)/t^2 = 0$. The example of [8] shows that even the exponential decay of eigenfunctions does not guarantee dynamical localization, so mathematical localization does not imply dynamical localization (the converse being true). To prove dynamical localization, one needs some additional information on the spatial behaviour of eigenfunctions [8, 9] or some bounds for expectations of the resolvent [10, 8]. For the Anderson model these conditions have been verified in some situations. One can expect that in the near future dynamical localization will be established in most cases where one has mathematical localization.
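The two quantities introduced above, the mean square displacement and the diffusion exponent, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the helper names are hypothetical, $|n|$ is taken to be the Euclidean norm, and the exponent is read off from a least-squares fit of $\log X^2$ against $\log t$ on synthetic data, which is only one informal way to give meaning to "$X^2(t) \sim t^{2\alpha}$ in some sense".

```python
import math


def mean_square_displacement(psi):
    """X^2 = sum_n |n|^2 |psi(n)|^2 for psi given as {site tuple: amplitude}.

    |n| is taken to be the Euclidean norm (an assumption of this sketch).
    """
    return sum(sum(c * c for c in n) * abs(a) ** 2 for n, a in psi.items())


def diffusion_exponent(times, x2_values):
    """Least-squares slope of log X^2 versus log t, halved (X^2 ~ t^(2*alpha))."""
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in x2_values]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope /= sum((x - mx) ** 2 for x in xs)
    return slope / 2.0


# A normalised state spread over the four nearest neighbours of the origin in Z^2.
psi = {(1, 0): 0.5, (-1, 0): 0.5, (0, 1): 0.5j, (0, -1): -0.5}
print(mean_square_displacement(psi))

# Synthetic data: X^2 = 3 t (diffusive) and X^2 = 0.2 t^2 (ballistic).
times = [10.0 * 2 ** k for k in range(8)]
alpha_diffusive = diffusion_exponent(times, [3.0 * t for t in times])
alpha_ballistic = diffusion_exponent(times, [0.2 * t ** 2 for t in times])
print(alpha_diffusive, alpha_ballistic)
```

On exact power-law data the fit returns the exponent itself; on real or simulated data it is only a rough indicator, and logarithmic corrections of the kind appearing in this paper would not be detected by it.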
There still remain two fundamental unresolved mathematical problems for the Anderson model in dimensions $d \ge 2$:
1. The existence of the continuous spectrum and its properties.
2. The possible diffusive behaviour of solutions $\psi(t)$.

It should be said that in the last 10 years many results have been proved on the relations between properties of the continuous spectrum of abstract self-adjoint operators and the dynamical behaviour of solutions (see, for instance, [11] for a review of recent papers). These results, however, are rather theoretical. To apply them in concrete cases (in particular, for the Anderson model), one should first establish the existence and the properties of the continuous spectrum, which is a very difficult problem. Generally speaking, there are only a few quantum models of random media for which the existence of propagating states with a nontrivial diffusion exponent $\alpha \in (0,1)$ has been mathematically proven. The Markovian Anderson model we present below is one of these examples, the diffusive behaviour in time of solutions being generic.

Before describing the model, let us make the following general observation concerning Schrödinger equations with random time-dependent potentials. For such models the behaviour of solutions $\psi(t)$ to the time-dependent equation $i\psi_t = H(t)\psi$ is not related to the spectral properties of the operators $H(t)$, even in the sense of (1.1), (1.2). This follows from the results of [4] for $d = 1$, where it was shown that with probability 1 all the states are propagating. At the same time it is well known that the spectrum of the operators $H(t)$ for any $t$ is pure point. Therefore the dynamical behaviour of solutions $\psi(t)$ must be studied directly, which differs considerably from the spectral methods used in the case of static potentials. On the other hand, the results we obtain for $\psi(t)$ do not give information about the spectral properties of the operators $H(t)$ for a fixed $t$.
For instance, we shall prove the existence of diffusion in dimension $d = 2$ (up to logarithmic factors), but this says nothing about the spectrum of the two-dimensional static Anderson model.

Let us now describe the model we shall consider. Let $V : E \to \mathbf{R}$ be some real bounded continuous function on the topological space $E$ with Borel $\sigma$-algebra $\mathcal{B}$. Let $u(t,\omega)$ be a stationary Markov process with values in $E$: $u : \mathbf{R} \times \Omega \to E$, where $(\Omega, \mathcal{A}, P)$ is the probability space with probability measure $P$. Let $\mathbf{\Omega} = \Omega^{\mathbf{Z}^d}$ with corresponding product probability measure $\mathbf{P}$ and expectation $\mathbf{E}$, the elements of $\mathbf{\Omega}$ being $\omega = \{\omega_n : \omega_n \in \Omega,\ n \in \mathbf{Z}^d\}$. One considers the time-dependent Schrödinger equation on the lattice $\mathbf{Z}^d$:
$$i\frac{\partial\psi(t,n)}{\partial t} = H\psi(t,n) + V(u(t,\omega_n))\psi(t,n), \qquad \psi|_{t=0} = \psi_0(n), \quad t \ge 0. \qquad (1.3)$$
Here $\psi_0 \in l^2(\mathbf{Z}^d)$, $H = -\Delta + Q_0(n)$, $\Delta$ is the discrete Laplacian:
$$-\Delta f(n) = \sum_{p : |p| = 1} f(n+p),$$
and $Q_0 : \mathbf{Z}^d \to \mathbf{R}$ is some nonrandom bounded function.
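To make the model (1.3) concrete, here is a minimal numerical sketch in dimension $d = 1$. It is an illustration under stated assumptions, not the method of the paper: the lattice is truncated to a finite chain, $Q_0 = 0$, the Markov process at each site is taken to be a symmetric two-state (telegraph) process $u \in \{-1, +1\}$ with $V(u) = u$ (so that $V$ has zero mean and positive variance under the uniform invariant measure), the potential is frozen during each time step, and the time stepping is a first-order operator splitting into exactly unitary factors. The function name `simulate` and all parameter values are hypothetical.

```python
import cmath
import math
import random


def simulate(n_sites=201, t_final=20.0, dt=0.02, flip_rate=1.0, seed=0):
    """Evolve a delta state under i dpsi/dt = -Delta psi + V(u(t, omega_n)) psi.

    Returns a list of (t, norm^2, X^2) samples taken every 100 steps.
    """
    rng = random.Random(seed)
    centre = n_sites // 2
    psi = [0j] * n_sites
    psi[centre] = 1.0 + 0j
    # Independent two-state processes started from the invariant (uniform) law.
    u = [rng.choice((-1, 1)) for _ in range(n_sites)]
    # Probability that a rate-`flip_rate` symmetric two-state chain has changed after dt.
    p_change = 0.5 * (1.0 - math.exp(-2.0 * flip_rate * dt))
    # Phase factors exp(-i dt V(u)) for V(u) = u.
    phase = {1: cmath.exp(-1j * dt), -1: cmath.exp(1j * dt)}
    c, s = math.cos(dt), math.sin(dt)

    samples = []
    n_steps = int(round(t_final / dt))
    for step in range(n_steps + 1):
        if step % 100 == 0:
            norm2 = sum(abs(a) ** 2 for a in psi)
            x2 = sum((j - centre) ** 2 * abs(a) ** 2 for j, a in enumerate(psi))
            samples.append((step * dt, norm2, x2))
        # Potential step: multiplication by a unit-modulus phase at each site.
        for j in range(n_sites):
            psi[j] *= phase[u[j]]
        # Hopping step: (-Delta f)(n) = f(n+1) + f(n-1), split into even and odd bonds.
        # On one bond the exact propagator is cos(dt) I - i sin(dt) sigma_x.
        for start in (0, 1):
            for j in range(start, n_sites - 1, 2):
                a, b = psi[j], psi[j + 1]
                psi[j] = c * a - 1j * s * b
                psi[j + 1] = -1j * s * a + c * b
        # Markov step: each site flips independently.
        for j in range(n_sites):
            if rng.random() < p_change:
                u[j] = -u[j]
    return samples


samples = simulate()
for t, norm2, x2 in samples:
    print(f"t={t:6.2f}  norm^2={norm2:.12f}  X^2={x2:.4f}")
```

Each factor of the splitting is unitary, so the $l^2$ norm is conserved up to rounding; the printed values of $X^2$ come from a single realisation on a short time window and should not be read as a check of the asymptotic bounds proved in the paper.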
We shall assume that the Markov process $u(t,\omega)$ satisfies the following conditions:
1. The paths $u(\cdot,\omega)$ are $P$-a.s. right continuous and have a finite number of discontinuities on any compact time interval.
2. There exists a unique invariant measure $\mu$ on $(E, \mathcal{B})$ which gives the stationary distribution of the Markov process $u$.
3. Let $B$ be the generator of the Markov process. One assumes that $B$ is a self-adjoint nonnegative operator in $L^2(E, \mathcal{B}, \mu)$ with discrete spectrum: $0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le \cdots$, the only eigenfunction corresponding to $\lambda = 0$ being $e_0(u) \equiv 1$, $u \in E$.

The last condition on the Markov process implies the exponential decay in time of correlations of the potential [4]. Finally, we suppose that $V : E \to \mathbf{R}$ is a bounded continuous function such that
$$\int_E V(u)\,d\mu(u) = 0, \qquad \int_E V^2(u)\,d\mu(u) > 0.$$
One can note that for any fixed $t$ the expression $V(u(t,\omega_n))$ gives some realisation of the Anderson potential. Therefore we can consider our model as a dynamical generalisation of the usual Anderson model. Let $\psi(t,n,\omega)$ be solutions to Eq. (1.3). The main result shown in [4] is the following (Theorem 1): with probability 1 for any $\psi_0 \in l^2(\mathbf{Z}^d)$,
$$\lim_{T \to +\infty} T^{-1}\int_0^T \|F(|n| \ge a(t))\psi(t,n)\|^2\,dt = 0, \qquad (1.4)$$
$$\lim_{T \to +\infty} T^{-1}\int_0^T \|F(|n| \le b(t))\psi(t,n)\|^2\,dt = 0, \qquad (1.5)$$
where $b(t) = t^\beta$, $\beta < 1/2$ for $d = 1, 2$ and $\beta < 1/d$ for $d \ge 3$; $a(t) = t^\alpha$, $\alpha > 1/2$ for any $d$. This means, in particular, that all the states are propagating ones (1.2). However, the equalities (1.4), (1.5) give much more information about the behaviour of solutions, namely, that the quantum particle spends most of the time (on average) in the region $\{n : b(t) < |n| < a(t)\}$. We shall call (1.4) and (1.5) RAGE-like results for the outside and inside occupation time respectively. As was stressed in [4], (1.4) and (1.5) do not directly determine the behaviour of $X^2(t)$ as $t \to \infty$. In the present paper we shall study the behaviour of the time-averaged moments
$$\langle |X|^p\rangle_{T,\omega} = T^{-1}\int_0^T \sum_n |n|^p |\psi(t,n,\omega)|^2\,dt, \qquad p > 0.$$
At the same time, we shall considerably improve (1.4) and (1.5), especially in dimensions $d \ge 3$.

Let us now present the main results of the paper. First, we improve the RAGE-like results (1.4), (1.5) for the outside and inside occupation time. Namely, we show (Theorems 3.10 and 5.5) that (1.4) and (1.5) are true with probability 1 for any $\psi_0 \in l^2(\mathbf{Z}^d)$ with the following choice of $a(t)$, $b(t)$:
$$a(t) = t^{1/2}(\log(t+1))^{\nu},$$
where $\nu > 0$ for $d = 1, 2$ and $\nu > 1$ for $d \ge 3$,
$$b(t) = t^{1/2}(\log(t+2))^{-r},$$
where $r > 1$, $d = 1$; $r > 3/4$, $d = 2$, and
$$b(t) = t^{\frac{2}{d+2}}(\log(t+2))^{-r}, \qquad r > \frac{2}{d+2}, \quad d \ge 3.$$
The second group of results concerns the behaviour of the time-averaged moments. We show (Theorem 5.8) that with probability 1 for any $\psi_0 \in l^2(\mathbf{Z}^d)$ for $T$ sufficiently large
$$\langle |X|^p\rangle_{T,\omega} \ge \frac{1}{2}\|\psi_0\|^2\, b^p(T),$$
where $b(T)$ was defined above. We show also (Theorem 3.9) that for any $\psi_0$ exponentially decreasing at infinity with probability 1
$$\langle |X|^p\rangle_{T,\omega} \le C_{r,p}(\psi_0,\omega)\, T^{p/2}(\log T)^r,$$
where $r > 1$ for $d = 1, 2$ and $r > p + 1$ for $d \ge 3$.

The third group of results gives the upper and lower bounds for the averaged diffusion constant
$$D(z) = \mathbf{E}\Big[z^2\int_0^{+\infty} dt\,\exp(-tz)\,|X|^2(t,\cdot)\Big], \qquad \psi_0 \ne 0.$$
We show (Corollaries 3.5 and 5.4) the following estimates, uniform in $z \in (0, 1/3]$, with some positive constants $C_1$, $C_2$:
$$0 < C_1 \le D(z) \le C_2 < +\infty, \qquad d = 1,$$
$$C_{1,r}|\log z|^{-r} \le D(z) \le C_2(\log|\log z|)^2, \quad r > 1/2, \qquad d = 2,$$
$$C_1 z^{\frac{d-2}{d+2}} \le D(z) \le C_2(\log z)^2, \qquad d \ge 3.$$

Finally, we obtain upper bounds for the correlation function
$$C_{\psi_0,\psi_1}(T,\omega) = T^{-1}\int_0^T dt\,|(\psi(t,\omega),\psi_1)|^2.$$
We show (Theorem 5.9) that for any $\psi_0, \psi_1 \in l^2(\mathbf{Z}^d)$ with probability 1 the following estimates, uniform in $T \ge 3$, hold:
$$C(T,\omega) \le C_r T^{-1/2}(\log T)^r, \quad r > 1, \qquad d = 1,$$
$$C(T,\omega) \le C_r T^{-1}(\log T)^r, \quad r > 1, \qquad d = 2,$$
$$C(T,\omega) \le C T^{-1}, \qquad d \ge 3.$$
That gives us the following result for the autocorrelation function $C_{\psi_0,\psi_0}(T,\omega)$:
$$\limsup_{T \to +\infty} \frac{\log C(T,\omega)}{\log T} \le -\frac{1}{2}, \quad d = 1; \qquad \lim_{T \to +\infty} \frac{\log C(T,\omega)}{\log T} = -1, \quad d \ge 2, \ \psi_0 \ne 0.$$
We also obtain (Theorem 5.6) bounds for the rate of decay of the average occupation time
$$\langle W_R\rangle_{T,\omega} = T^{-1}\int_0^T dt \sum_{n : |n| \le R} |\psi(t,n,\omega)|^2, \qquad R > 0.$$
In particular (Corollary 5.7), in dimensions $d \ge 3$ the total occupation time of any ball is finite with probability 1:
$$\int_0^{+\infty} dt \sum_{n : |n| \le R} |\psi(t,n,\omega)|^2 < +\infty.$$
2. Bounds for the Average Density Matrix in Weighted Spaces

The result of [4] which will be the basis for our considerations concerns the Laplace transform of the average density matrix. It was obtained with the so-called generalised Feynman-Kac formula [12, 4]. Let $\psi(t,n,\omega)$ be the solution to Eq. (1.3). We set
$$f(n,m,z) = \int_0^{+\infty} dt\,\exp(-tz)\,\mathbf{E}[\psi(t,n,\cdot)\overline{\psi(t,m,\cdot)}], \qquad z > 0. \qquad (2.1)$$
It was shown in [4] that for any $z > 0$ the function $f$ satisfies the following equation in $l^2(\mathbf{Z}^{2d})$: $Y(z)f = g$, where $g(n,m) = \psi_0(n)\overline{\psi_0(m)}$,
$$Y(z) = i\Delta + iQ + z + G(z),$$
$$(\Delta f)(n,m) = \sum_{p : |p| = 1} f(n+p,m) - \sum_{p : |p| = 1} f(n,m+p),$$
$$(Qf)(n,m) = (Q_0(n) - Q_0(m))f(n,m),$$
and $G(z)$ is some family of bounded operators in $l^2(\mathbf{Z}^{2d})$. Consider for any $\alpha \ge 0$ the weighted space
$$l^2_\alpha(\mathbf{Z}^{2d}) = \Big\{\psi : \psi \in l^2(\mathbf{Z}^{2d}),\ \sum_{n,m}\exp(\alpha(|n| + |m|))|\psi(n,m)|^2 < +\infty\Big\},$$
where $l^2_\alpha(\mathbf{Z}^{2d}) \subset l^2_0(\mathbf{Z}^{2d}) = l^2(\mathbf{Z}^{2d})$. We shall denote the inner product in $l^2_\alpha(\mathbf{Z}^{2d})$ as $\langle\cdot,\cdot\rangle_\alpha$ and the norm as $\|\cdot\|_\alpha$. It has been proven in [4] (Lemma 3.1, Corollary 3.2, Lemma 4.2) that the operators $\Delta$, $Q$, $G(z)$ have the following properties:
1. Operators $\Delta$, $Q$ are bounded in $l^2_\alpha(\mathbf{Z}^{2d})$ for any $\alpha \ge 0$ and $\|\Delta\|_\alpha \le C$, $\|Q\|_\alpha \le C$ with constants uniform in $\alpha \in [0,1]$.
2. The operator $Q$ is self-adjoint in $l^2_\alpha(\mathbf{Z}^{2d})$ and for $\Delta$ one has $\|\Im\Delta\|_\alpha \le C\alpha$ with $C$ uniform in $\alpha \in [0,1]$.
3. Let $T$ be the operator of multiplication by the Kronecker symbol: $(Tf)(n,m) = \delta_{n,m}f(n,m)$, $(T'f)(n,m) = (1 - \delta_{n,m})f(n,m)$. Operators $T$, $T'$ are self-adjoint in $l^2_\alpha(\mathbf{Z}^{2d})$, $T + T' = I$, $\|T\|_\alpha \le 1$, $\|T'\|_\alpha \le 1$. The identities hold: $T^2 = T$, $(T')^2 = T'$, $TQ = QT = 0$, $T\Delta T = 0$.
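4. Operators $G(z)$ are bounded in $l^2_\alpha(\mathbf{Z}^{2d})$ for any $\alpha \in [0,\nu]$ with some $\nu(d) > 0$: $\|G(z)\|_\alpha \le C$ with $C$ uniform in $z \in (0,1]$, $\alpha \in [0,\nu]$.
5. For any $z > 0$ the identities hold: $TG(z) = G(z)T = 0$.
6. For any $z \in (0,1]$, $\alpha \in [0,\nu]$ the estimate holds:
$$\Re\langle G(z)f, f\rangle_\alpha \ge C_1\|T'f\|_\alpha^2, \qquad C_1 > 0.$$

Using Properties 1–6, one can prove the necessary estimates for $f = Y^{-1}(z)g$ in $l^2_\alpha(\mathbf{Z}^{2d})$. Of main interest for us will be quantities like
$$\rho(z) = \langle Y^{-1}(z)g, h\rangle_\alpha, \qquad g(n,m) = \psi_0(n)\overline{\psi_0(m)}.$$
We shall take $h(n,m) = \delta_{n,m}F(|n| \ge A)\exp(-\alpha|n|)|n|^p$, $\alpha = C\sqrt{z}$, to get the upper bounds for the moments $|X|^p$ and bounds (1.4) for the outside occupation time. To get the lower bounds for $|X|^p$ and bounds (1.5) for the inside occupation time, we shall take $h(n,m) = \delta_{n,m}F(|n| \le A)$, $\alpha = 0$. Finally, we choose $h(n,m) = \psi_1(n)\overline{\psi_1(m)}$ to get upper bounds for the correlation function. Some of the results we prove in this section were obtained in [4]. However, the proofs we present here are much simpler.

Lemma 2.1. For any $C_2 > 0$, $\delta > 0$ we consider the set in $\mathbf{R}^2$:
$$D(C_2,\delta) = \{(z,\alpha) : z \in (0,\delta],\ \alpha \in [0, C_2\sqrt{z}]\}.$$
There exist $C_2 > 0$, $\delta \in (0,1/2)$ such that the bounded inverse $Y^{-1}(z)$ exists in $l^2_\alpha(\mathbf{Z}^{2d})$ for any $(z,\alpha) \in D(C_2,\delta)$. The following estimates, uniform in $(z,\alpha) \in D(C_2,\delta)$, hold:
$$\|Y^{-1}(z)\|_\alpha \le Cz^{-1}, \qquad \|T'Y^{-1}(z)\|_\alpha \le Cz^{-1/2}, \qquad (2.2)$$
$$\|\Delta TY^{-1}(z)\|_\alpha \le Cz^{-1/2}, \qquad \|\Delta^* TY^{-1}(z)\|_\alpha \le Cz^{-1/2}, \qquad (2.3)$$
where $\Delta^*$ is the adjoint of $\Delta$ in $l^2_\alpha(\mathbf{Z}^{2d})$.

Proof. Let $f \in l^2_\alpha(\mathbf{Z}^{2d})$, $H = \Delta + Q$. One can write the equality:
$$\|Y(z)f\|_\alpha^2 = \langle(iH + z + G(z))f, (iH + z + G(z))f\rangle_\alpha = \|(iH + G(z))f\|_\alpha^2 + z^2\|f\|_\alpha^2 + 2z\,\Re\langle(iH + G(z))f, f\rangle_\alpha. \qquad (2.4)$$
Properties 6 and 2 imply that for $\alpha \le \nu$,
$$\Re\langle(iH + G(z))f, f\rangle_\alpha \ge C_1\|T'f\|_\alpha^2 - |\langle(\Im\Delta)f, f\rangle_\alpha|, \qquad (2.5)$$
where
$$\Im\Delta = \frac{1}{2i}(\Delta - \Delta^*). \qquad (2.6)$$
As $T\Delta T = 0$ and $T^* = T$ (Property 3), we have also $T\Delta^* T = 0$. Therefore one can rewrite (2.6) as follows:
$$2i\langle(\Im\Delta)f, f\rangle_\alpha = \langle(T + T')(\Delta - \Delta^*)(T + T')f, f\rangle_\alpha = \langle T(\Delta - \Delta^*)T'f, f\rangle_\alpha + \langle T'(\Delta - \Delta^*)f, f\rangle_\alpha.$$
Properties 2 and 3 imply the estimate, uniform in $\alpha$,
$$|\langle(\Im\Delta)f, f\rangle_\alpha| \le C\alpha\|T'f\|_\alpha\|f\|_\alpha. \qquad (2.7)$$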
With (2.4)–(2.7) we estimate for $\alpha \in [0,\nu]$:
$$\|Y(z)f\|_\alpha^2 \ge z(za^2 + 2C_1b^2 - 2C\alpha ab), \qquad (2.8)$$
where $a = \|f\|_\alpha$, $b = \|T'f\|_\alpha$. Obviously, there exists $C_2 > 0$ such that if $\alpha \le C_2\sqrt{z}$, then for any $a$, $b$
$$\frac{z}{2}a^2 + C_1b^2 - 2C\alpha ab \ge 0.$$
For such $(z,\alpha)$, (2.8) yields:
$$\|Y(z)f\|_\alpha^2 \ge \frac{z^2}{2}\|f\|_\alpha^2 + C_1z\|T'f\|_\alpha^2. \qquad (2.9)$$
If we take $\delta$ sufficiently small such that $C_2\sqrt{\delta} \le \nu$, (2.9) will be valid for any $(z,\alpha) \in D(C_2,\delta)$. In the same manner one can show that for the adjoint operator in $l^2_\alpha(\mathbf{Z}^{2d})$, for any $(z,\alpha) \in D(C_2,\delta)$,
$$\|Y^*(z)f\|_\alpha^2 \ge \frac{z^2}{2}\|f\|_\alpha^2 + C_1z\|T'f\|_\alpha^2. \qquad (2.10)$$
The operators $Y(z)$, $Y^*(z)$ being bounded in $l^2_\alpha(\mathbf{Z}^{2d})$, we obtain from (2.9), (2.10) that $\mathrm{Ker}\,Y(z) = \mathrm{Ker}\,Y^*(z) = \{0\}$, $\mathrm{Ran}\,Y(z) = \mathrm{Ran}\,Y^*(z) = l^2_\alpha(\mathbf{Z}^{2d})$ and
$$\|Y^{-1}(z)\|_\alpha \le \sqrt{2}z^{-1}, \qquad \|(Y^*)^{-1}(z)\|_\alpha \le \sqrt{2}z^{-1}. \qquad (2.11)$$
One obtains also
$$\|T'Y^{-1}(z)\|_\alpha \le Cz^{-1/2}, \qquad \|T'(Y^*)^{-1}(z)\|_\alpha \le Cz^{-1/2} \qquad (2.12)$$
for any $(z,\alpha) \in D(C_2,\delta)$. Let us now write the equation for $f = Y^{-1}(z)g$. As $QT = 0$, $TG(z) = G(z)T = 0$, one has
$$i\Delta f + iQT'f + zf + T'G(z)T'f = g.$$
The estimates (2.11), (2.12) together with the uniform boundedness of $G(z)$ imply
$$\|\Delta f\|_\alpha \le Cz^{-1/2}\|g\|_\alpha.$$
As $\Delta f = \Delta Tf + \Delta T'f$, we get immediately
$$\|\Delta Tf\|_\alpha \le Cz^{-1/2}\|g\|_\alpha. \qquad (2.13)$$
As $\|\Delta - \Delta^*\|_\alpha \le C'\alpha \le C\sqrt{z}$, one obtains also
$$\|\Delta^* Tf\|_\alpha \le Cz^{-1/2}\|g\|_\alpha. \qquad (2.14)$$
The statement of the lemma follows now from (2.11), (2.12), (2.13) and (2.14).

For any $z > 0$ define the following functions $l_d(z)$: $l_d(z) = z^{-3/4}$, $d = 1$; $l_d(z) = z^{-1/2}|\log z|^{1/2}$, $d = 2$; and $l_d(z) = z^{-1/2}$, $d \ge 3$. We shall denote by $D$ the set $D(C_2,\delta)$ from Lemma 2.1.
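The step in the proof of Lemma 2.1 asserting that a suitable $C_2$ exists can be made explicit. The quadratic form $\frac{z}{2}a^2 + C_1b^2 - 2C\alpha ab$ is nonnegative for all real $a$, $b$ exactly when its discriminant is nonpositive, that is when $(2C\alpha)^2 \le 4\cdot\frac{z}{2}\cdot C_1$, which gives the admissible choice $C_2 = \sqrt{C_1/2}/C$. The short Python sketch below checks this numerically; the constants $C_1 = 0.5$, $C = 3$ are arbitrary illustrative values, not constants from the paper, and the function names are hypothetical.

```python
import math


def admissible_c2(c1, c):
    """C2 = sqrt(c1/2)/c, from the discriminant condition (2*c*alpha)^2 <= 2*z*c1."""
    return math.sqrt(c1 / 2.0) / c


def form(z, alpha, a, b, c1, c):
    """The quadratic form (z/2) a^2 + c1 b^2 - 2 c alpha a b from the proof."""
    return 0.5 * z * a * a + c1 * b * b - 2.0 * c * alpha * a * b


c1, c = 0.5, 3.0  # illustrative constants only
c2 = admissible_c2(c1, c)

# Smallest value of the form on a grid, at the extreme allowed alpha = C2*sqrt(z).
grid = [k / 10.0 for k in range(0, 51)]
worst = min(
    form(z, c2 * math.sqrt(z), a, b, c1, c)
    for z in (1e-4, 1e-2, 0.3)
    for a in grid
    for b in grid
)

# With alpha twice as large the form becomes negative along b = c*alpha*a/c1.
z = 1e-2
alpha_big = 2.0 * c2 * math.sqrt(z)
violation = form(z, alpha_big, 1.0, c * alpha_big / c1, c1, c)
print(c2, worst, violation)
```

At $\alpha = C_2\sqrt{z}$ the form is a perfect square, so the grid minimum is zero up to rounding, while doubling $\alpha$ produces a negative value.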
Lemma 2.2. Let $(z,\alpha) \in D$, $g \in l^2_\alpha(\mathbf{Z}^{2d})$, $n \in \mathbf{Z}^d$. The following uniform estimates hold:
$$|(Y^{-1}(z)g)(n,n)| \le Cl_d(z)\exp(-\alpha|n|)\|g\|_\alpha, \qquad (2.15)$$
$$|((Y^*)^{-1}(z)g)(n,n)| \le Cl_d(z)\exp(-\alpha|n|)\|g\|_\alpha. \qquad (2.16)$$

Proof. One can easily see ([4], Lemma 5.1) that $(\Delta TY^{-1}(z)g)(n,m) = 0$ for $|n - m| \ne 1$ and $(\Delta TY^{-1}(z)g)(n,m) = \rho(m) - \rho(n)$ for $|n - m| = 1$, where $\rho(n) = (Y^{-1}(z)g)(n,n)$. Therefore
$$\|\Delta TY^{-1}(z)g\|_\alpha^2 = \sum_{n,p : |p| = 1} |\rho(n+p) - \rho(n)|^2\exp(\alpha(|n| + |n+p|)). \qquad (2.17)$$
Consider the function $h(n) = \rho(n)\exp(\alpha|n|)$. The result of Lemma 2.1 implies
$$\|h\|_0^2 = \sum_n |h(n)|^2 = \|TY^{-1}(z)g\|_\alpha^2 \le Cz^{-2}\|g\|_\alpha^2. \qquad (2.18)$$
Let us fix $p : |p| = 1$. One can write the identity
$$h(n) - h(n+p) = (\rho(n) - \rho(n+p))\exp(\alpha|n|) + \rho(n+p)(\exp(\alpha|n|) - \exp(\alpha|n+p|)).$$
As $|p| = 1$ and $0 \le \alpha \le C_2\sqrt{z} \le \nu$, one can easily see with (2.17), (2.18) and (2.3) that
$$\sum_n |h(n) - h(n+p)|^2 \le C\sum_n |\rho(n) - \rho(n+p)|^2\exp(\alpha(|n| + |n+p|)) + C\alpha^2\sum_n |\rho(n+p)|^2\exp(2\alpha|n+p|)$$
$$\le C\|\Delta TY^{-1}(z)g\|_\alpha^2 + C\alpha^2\|TY^{-1}(z)g\|_\alpha^2 \le Cz^{-1}\|g\|_\alpha^2.$$
Therefore,
$$\sum_{n,p : |p| = 1} |h(n) - h(n+p)|^2 \le Cz^{-1}\|g\|_\alpha^2. \qquad (2.19)$$
The bounds (2.18) and (2.19) allow us to estimate $\max_n |h(n)|$ in exactly the same manner as in [4], Lemmas 5.1-5.2. The result is
$$\max_n |h(n)| \le Cl_d(z)\|g\|_\alpha$$
with $l_d(z)$ defined above, which gives immediately (2.15). For the adjoint operator the proof is the same.

Theorem 2.3. Let $h \in l^2_\alpha(\mathbf{Z}^{2d})$, $g(n,m) = \psi_0(n)\overline{\psi_0(m)}$, where $\sum_n |\psi_0(n)|^2\exp(a|n|) < +\infty$ for some $a > 0$. There exists $\delta > 0$ such that for any $(z,\alpha) \in D(C_2,\delta)$ the following uniform estimate holds:
$$|\langle Y^{-1}(z)g, h\rangle_\alpha| \le C(\psi_0)l_d(z)\|h\|_\alpha. \qquad (2.20)$$
Proof. One can write the identity:
$$\langle Y^{-1}(z)g, h\rangle_\alpha = \langle Tg, T(Y^*)^{-1}(z)h\rangle_\alpha + \langle T'g, T'(Y^*)^{-1}(z)h\rangle_\alpha \equiv r_1(z) + r_2(z).$$
The estimate (2.16) of Lemma 2.2 implies
$$|r_1(z)| \le Cl_d(z)\sum_n \exp(2\alpha|n|)|\psi_0(n)|^2\exp(-\alpha|n|)\|h\|_\alpha \le C(\psi_0)l_d(z)\|h\|_\alpha \qquad (2.21)$$
if $\alpha$ is sufficiently small ($\alpha \le a$). The estimate $\|T'(Y^*)^{-1}(z)\|_\alpha \le Cz^{-1/2}$ of Lemma 2.1 yields
$$|r_2(z)| \le Cz^{-1/2}\|h\|_\alpha \cdot \|T'g\|_\alpha \le C(\psi_0)l_d(z)\|h\|_\alpha, \qquad (2.22)$$
again if $\alpha \le a$. The statement of the theorem follows from (2.21) and (2.22).

The result of this theorem for the maximal allowed value of $\alpha$ will be used to obtain the upper bounds for $|X|^p$ and bounds for the outside occupation time (1.4). In principle, taking $\alpha = 0$, one can obtain the lower bounds for $|X|^p$ and bounds (1.5) for the inside occupation time. In fact, this was done in [4] (Lemma 5.3, Corollary 5.4) for the inside occupation time. There exists, however, another method which gives better results. We shall discuss it in Sect. 4.
3. Upper Bounds for the Moments

In this section we shall suppose that $\psi_0$ satisfies the condition of Theorem 2.3, namely, that
$$\sum_n |\psi_0(n)|^2\exp(a|n|) < +\infty \qquad (3.1)$$
for some $a > 0$. Let $\psi(t,n,\omega)$ be the solution to the Schrödinger equation (1.3). For any $p > 0$, $N > 0$ define the positive functionals
$$|X|^p_N(t,\omega) = \sum_{n : |n| \le N} |n|^p|\psi(t,n,\omega)|^2.$$
Obviously,
$$|X|^p_N(t,\omega) \le N^p\|\psi_0\|^2 \qquad (3.2)$$
for any $t$, $\omega$. Define now
$$|X|^p(t,\omega) = \sum_n |n|^p|\psi(t,n,\omega)|^2 = \lim_{N \to +\infty} |X|^p_N(t,\omega),$$
where it is possible that $|X|^p = +\infty$ for some $t$, $\omega$. For $\mathbf{P}$-a.e. $\omega$ the functions $|X|^p_N(\cdot,\omega)$ are continuous, as $\psi(\cdot,n,\omega)$ are continuous (see [4]). Therefore, with probability 1 the functions $|X|^p(\cdot,\omega)$ are Lebesgue measurable and one can consider the integrals:
$$\eta_p(z,\omega) = \int_0^{+\infty}\exp(-tz)|X|^p(t,\omega)\,dt, \qquad z > 0.$$
Again, it is possible that $\eta_p(z,\omega) = +\infty$.
Lemma 3.1. For any $z > 0$, $N > 0$ the equality holds:
$$\mathbf{E}\int_0^{+\infty} dt\,\exp(-tz)|X|^p_N(t,\cdot) = \sum_{n : |n| \le N} |n|^pf(n,n,z), \qquad (3.3)$$
where $f(z) = Y^{-1}(z)g$, $g(n,m) = \psi_0(n)\overline{\psi_0(m)}$.

Proof. First, for any $t$ the functionals $|X|^p_N$ are $\mathbf{P}$-measurable, as for any $t$, $n$ the functionals $\psi(t,n,\cdot)$ are measurable [4]. Moreover, for $\mathbf{P}$-a.e. $\omega$ the functions $|X|^p_N(\cdot,\omega)$ are continuous. The result of Lemma 2.5 of [4] and (3.2) imply that the functionals
$$\eta_{p,N}(z,\omega) = \int_0^{+\infty} dt\,\exp(-tz)|X|^p_N(t,\omega)$$
are $\mathbf{P}$-integrable for any positive $p$, $N$, $z$ and
$$\mathbf{E}[\eta_{p,N}(z,\cdot)] = \int_0^{+\infty} dt\,\exp(-tz)\,\mathbf{E}[|X|^p_N(t,\cdot)] = \int_0^{+\infty} dt\,\exp(-tz)\sum_{n : |n| \le N} |n|^p\,\mathbf{E}[|\psi(t,n,\cdot)|^2] = \sum_{n : |n| \le N} |n|^pf(n,n,z),$$
where $f$ is defined by (2.1). The proof is completed.

Lemma 3.2. Let us fix some $\psi_0$, $p$. Consider the function
$$\rho_p(z) = \sum_n |n|^pf(n,n,z).$$
Suppose that $\rho_p(z) < +\infty$ for any $z > 0$. Then
1. $\mathbf{P}$-almost surely for any $z > 0$, $\eta_p(z,\omega) < +\infty$.
2. For any $z > 0$ the functionals $\eta_p(z,\cdot)$ are $\mathbf{P}$-integrable and $\mathbf{E}[\eta_p(z,\cdot)] = \rho_p(z)$.

The proof is based on the monotone convergence theorem (the limit $N \to +\infty$) and is rather straightforward, so we shall omit it.

Corollary 3.3. If $\rho_p(z) < +\infty$ for any $z > 0$, then
1. $\mathbf{P}$-almost surely $|X|^p(t,\omega) < +\infty$ for Lebesgue-almost every $t$.
2. $\mathbf{P}$-almost surely the functions $\eta_p(\cdot,\omega)$ are continuous in $z > 0$.

Now we shall estimate the functions $\rho_p(z)$.

Theorem 3.4. Let $p > 0$ and $\psi_0$ satisfy (3.1) for some $a > 0$. The following estimates, uniform in $z \in (0,1/3)$, hold:
$$\rho_p(z) \le Cz^{-p/2-1}, \qquad d = 1,$$
$$\rho_p(z) \le Cz^{-p/2-1}(\log|\log z|)^p, \qquad d = 2,$$
$$\rho_p(z) \le Cz^{-p/2-1}|\log z|^p, \qquad d \ge 3.$$
Proof. Let $A > 0$. One can write the identity:
$$\rho_p(z) = \Big(\sum_{n : |n| \ge A} + \sum_{n : |n| < A}\Big)|n|^pf(n,n,z) \equiv a(z) + b(z).$$
One can write $a(z)$ as
$$a(z) = \sum_n \exp(2\alpha|n|)F(|n| \ge A)\exp(-2\alpha|n|)|n|^pf(n,n,z) = \langle f(z), h_A\rangle_\alpha,$$
where $\alpha \ge 0$, $h_A(n,m) = \delta_{n,m}F(|n| \ge A)|n|^p\exp(-2\alpha|n|)$. For any $z \in (0,\delta]$, where $\delta$ is the number from Theorem 2.3, we take $\alpha = C_2\sqrt{z}$. The estimate (2.20) of this theorem implies
$$|a(z)| \le C(\psi_0)l_d(z)\|h_A\|_\alpha.$$
The calculation shows that
$$\|h_A\|_\alpha^2 = \sum_{n : |n| \ge A} |n|^{2p}\exp(-2\alpha|n|) \le C\alpha^{-2p-d}\exp(-A\alpha).$$
This gives us
$$|a(z)| \le C(\psi_0,p,d)\,z^{-p/2-d/4}\,l_d(z)\exp(-CA\sqrt{z}) \qquad (3.4)$$
with some $C > 0$. To estimate $b(z)$, we observe first that $f(n,n,z) \ge 0$ and
$$\sum_n f(n,n,z) = \sum_n \int_0^{+\infty} dt\,\exp(-tz)\,\mathbf{E}[|\psi(t,n,\cdot)|^2] = z^{-1}\|\psi_0\|^2.$$
Therefore,
$$|b(z)| \le A^p\sum_{n : |n| < A} f(n,n,z) \le A^pz^{-1}C(\psi_0). \qquad (3.5)$$
To get the optimal bound for $\rho_p(z)$, we take $A = 0$ for $d = 1$, $A = Dz^{-1/2}\log|\log z|$ for $d = 2$ and $A = Dz^{-1/2}|\log z|$ for $d \ge 3$, where $D$ is sufficiently big. As $\rho_p(z)$ are decreasing in $z$, the bounds (3.4), (3.5) give the statement of the Theorem.

Corollary 3.5. Let $D(z) = \mathbf{E}[z^2\int_0^{+\infty} dt\,\exp(-tz)|X|^2(t,\cdot)]$ be the average diffusion constant. The following estimates, uniform in $z \in (0,1/3]$, hold:
$$D(z) \le C, \quad d = 1; \qquad D(z) \le C(\log|\log z|)^2, \quad d = 2; \qquad D(z) \le C(\log z)^2, \quad d \ge 3.$$

Let $T > 0$, $p > 0$. Consider the time-averaged quantities
$$\langle |X|^p\rangle_{T,\omega} = T^{-1}\int_0^T dt\,|X|^p(t,\omega).$$
To obtain the upper bounds with probability 1 for $\langle |X|^p\rangle_{T,\omega}$, we shall use the following elementary fact.
Lemma 3.6. Let
$$h(z,\omega) = z^\beta|\log z|^{-\gamma}\int_0^{+\infty} dt\,\exp(-tz)g(t,z,\omega),$$
where $\beta \ge 0$, $\gamma \ge 0$, $z \in (0,1/3]$, $g(t,z,\omega) \ge 0$ and for $\mathbf{P}$-a.e. $\omega$ and any $t > 0$ the function $g(t,\cdot,\omega)$ is decreasing. Suppose that $h(z,\cdot)$ is $\mathbf{P}$-integrable for any $z > 0$.
1. If the estimate uniform in $z \in (0,1/3]$ holds:
$$\mathbf{E}[h(z,\cdot)] \le C|\log z|^{-1}(\log|\log z|)^{-r} \qquad (3.6)$$
with some $r > 1$, then $\lim_{z \to +0} h(z,\omega) = 0$ $\mathbf{P}$-almost surely.
2. If $\beta = 0$ and the uniform estimate holds: $\mathbf{E}[h(z,\cdot)] \le C(\log|\log z|)^{-r}$ with some $r > 1$, then $\lim_{z \to +0} h(z,\omega) = 0$ $\mathbf{P}$-almost surely.
3. If $\beta = \gamma = 0$, $g$ does not depend on $z$ and the uniform estimate holds: $\mathbf{E}[h(z,\cdot)] \le C$, then $\mathbf{P}$-almost surely for any $z \in (0,1/3]$,
$$h(z,\omega) \le \int_0^{+\infty} g(t,\omega)\,dt < +\infty.$$

Proof. We shall use the same idea as in [8], proof of Theorem 7.6 (one could use also the Borel-Cantelli Theorem, see [13], proof of Theorem 4.4.2). If $\beta > 0$ and (3.6) holds, we set $z_k = \exp(-k)$, $k = 1, 2, \ldots$ and define the functional
$$Q(\omega) = \sum_{k=1}^{+\infty} h(z_k,\omega).$$
One sees easily that $\mathbf{E}[Q(\cdot)] < +\infty$. Therefore, for $\mathbf{P}$-a.e. $\omega$ we have $Q(\omega) < +\infty$ and
$$\lim_{k \to +\infty} h(z_k,\omega) = 0. \qquad (3.7)$$
On the other hand, if $z \in [z_{k+1}, z_k]$, one can estimate:
$$h(z,\omega) \le z_k^\beta|\log z_k|^{-\gamma}\int_0^{+\infty} dt\,\exp(-tz_{k+1})g(t,z_{k+1},\omega) = h(z_{k+1},\omega)\Big(\frac{z_k}{z_{k+1}}\Big)^\beta\Big(\frac{\log z_{k+1}}{\log z_k}\Big)^\gamma \le e^\beta 2^\gamma h(z_{k+1},\omega). \qquad (3.8)$$
It follows from (3.7) and (3.8) that $\lim_{z \to +0} h(z,\omega) = 0$. In the case $\beta = 0$ one takes $z_k = \exp(-\exp k)$, the proof being the same. Finally, if
$$\mathbf{E}\Big[\int_0^{+\infty} dt\,\exp(-tz)g(t,\cdot)\Big] \le C,$$
the monotone convergence theorem implies $\int_0^{+\infty} g(t,\omega)\,dt < +\infty$ for $\mathbf{P}$-a.e. $\omega$.
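Two elementary facts carry the first case of the proof of Lemma 3.6: along $z_k = \exp(-k)$ the right-hand side of (3.6) becomes $Ck^{-1}(\log k)^{-r}$, which is summable for $r > 1$, and the interpolation factor in (3.8) never exceeds $e^\beta 2^\gamma$. The Python sketch below checks both numerically. It is only an illustration: the function names are hypothetical, the constant $C$ is set to 1, the sum starts at $k = 3$ so that $z_k \le 1/3$ and $\log k > 0$, and the comparison with $\int_2^\infty dx/(x(\log x)^r) = (\log 2)^{1-r}/(r-1)$ is the standard integral test rather than anything taken from the paper.

```python
import math


def interpolation_factor(k, beta, gamma):
    """(z_k / z_{k+1})^beta * (log z_{k+1} / log z_k)^gamma for z_k = exp(-k)."""
    zk, zk1 = math.exp(-k), math.exp(-(k + 1))
    return (zk / zk1) ** beta * (math.log(zk1) / math.log(zk)) ** gamma


def partial_sum(r, n_max):
    """Sum over 3 <= k <= n_max of 1 / (k (log k)^r), i.e. (3.6) at z_k with C = 1."""
    return sum(1.0 / (k * math.log(k) ** r) for k in range(3, n_max + 1))


def integral_bound(r):
    """Integral of 1 / (x (log x)^r) over [2, infinity), valid for r > 1."""
    return math.log(2.0) ** (1.0 - r) / (r - 1.0)


beta, gamma = 1.5, 2.0  # illustrative exponents
largest_factor = max(interpolation_factor(k, beta, gamma) for k in range(1, 200))
print(largest_factor, math.e ** beta * 2.0 ** gamma)

r = 2.0
print(partial_sum(r, 100000), integral_bound(r))
```

The partial sums stay below the integral bound however far they are extended, which is the summability used to conclude $\mathbf{E}[Q] < +\infty$.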
Corollary 3.7. Let $G(z,\omega) = \int_0^{+\infty} dt\,\exp(-tz)g(t,z,\omega)$, where $g$ is the same as in Lemma 3.6. Suppose that for any $z > 0$ the functionals $G(z,\cdot)$ are $\mathbf{P}$-integrable.
1. If $\mathbf{E}[G(z,\cdot)] \le Cz^{-\beta}|\log z|^\gamma$, where $\beta > 0$, $\gamma \ge 0$, then with probability 1 the following estimate, uniform in $z \in (0,1/3]$, holds:
$$G(z,\omega) \le C_r(\omega)z^{-\beta}|\log z|^r, \qquad (3.9)$$
where $r > \gamma + 1$.
2. If $\mathbf{E}[G(z,\cdot)] \le C|\log z|^\gamma$ with some $\gamma > 0$, then with probability 1 the following uniform estimate holds:
$$G(z,\omega) \le C_r(\omega)|\log z|^r, \qquad (3.10)$$
where $r > \gamma$.

Proof. One takes $h(z,\omega) = z^\beta|\log z|^{-r}G(z,\omega)$, where $r > \gamma + 1$ in the first case and $h(z,\omega) = |\log z|^{-r}G(z,\omega)$, where $r > \gamma$ in the second case. Lemma 3.6 implies in both cases $\lim_{z \to +0} h(z,\omega) = 0$ with probability 1, which gives immediately (3.9), (3.10).

Now we can estimate the functionals $\eta_p(z,\omega)$ with probability 1.

Lemma 3.8. Let $p > 0$ and $\psi_0$ satisfy (3.1) for some $a > 0$. For any $r > 1$ with probability 1 the following estimates, uniform in $z \in (0,1/3]$, hold:
$$\eta_p(z,\omega) \le C(r,\omega)z^{-p/2-1}|\log z|^r, \qquad d = 1, 2,$$
$$\eta_p(z,\omega) \le C(r,\omega)z^{-p/2-1}|\log z|^{p+r}, \qquad d \ge 3.$$

Proof. We take $G(z,\omega) = \eta_p(z,\omega)$. The result follows directly from Lemma 3.2, Theorem 3.4 and the first statement of Corollary 3.7.

Theorem 3.9. Let $\psi_0$ be the same as above, $p > 0$, $r > 1$. With probability 1 the following estimates, uniform in $T \ge 3$, hold:
$$\langle |X|^p\rangle_{T,\omega} \le C(r,\omega)T^{p/2}(\log T)^r, \qquad d = 1, 2,$$
$$\langle |X|^p\rangle_{T,\omega} \le C(r,\omega)T^{p/2}(\log T)^{p+r}, \qquad d \ge 3.$$

Proof. We can estimate:
$$\langle |X|^p\rangle_{T,\omega} \le eT^{-1}\int_0^T dt\,\exp(-t/T)|X|^p(t,\omega) \le eT^{-1}\int_0^{+\infty} dt\,\exp(-t/T)|X|^p(t,\omega) = ez\eta_p(z,\omega), \qquad (3.11)$$
where $z = T^{-1}$. The result follows from (3.11) and Lemma 3.8.
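The inequality (3.11) rests only on the fact that $e\exp(-t/T) \ge 1$ for $0 \le t \le T$, so it holds for any nonnegative integrand. As a hedged illustration, take the model moment $g(t) = t^q$ (an assumption of this sketch; with $q = p/2$ it mimics diffusive growth). Both sides are then available in closed form: the time average is $T^q/(q+1)$, and $ez\int_0^\infty \exp(-tz)t^q\,dt = e\,\Gamma(q+1)\,T^q$ with $z = 1/T$. The function names below are hypothetical.

```python
import math


def time_average(q, T):
    """T^{-1} * integral over [0, T] of t^q dt, in closed form."""
    return T ** q / (q + 1.0)


def laplace_bound(q, T):
    """e * z * integral over [0, inf) of exp(-t z) t^q dt with z = 1/T."""
    return math.e * math.gamma(q + 1.0) * T ** q


def time_average_numeric(q, T, n=20000):
    """Midpoint-rule approximation of the same time average, as a cross-check."""
    h = T / n
    return sum(((k + 0.5) * h) ** q for k in range(n)) * h / T


for q in (0.5, 1.0, 2.0, 3.5):
    for T in (3.0, 10.0, 1000.0):
        print(q, T, time_average(q, T), laplace_bound(q, T))
```

For this model integrand the ratio of the two sides is $e\,\Gamma(q+2)$, independent of $T$, so the Laplace-transform bound loses only a constant factor; this is why the passage from $\eta_p(z,\omega)$ to $\langle|X|^p\rangle_{T,\omega}$ preserves the power of $T$.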
With this result one can obtain bounds for the outside occupation time (1.4). Let us fix some $\psi_0 \in l^2(\mathbf{Z}^d)$ and consider
$$W(t,\omega) = \sum_{n : |n| \ge a(t)} |\psi(t,n,\omega)|^2,$$
where $a(t)$ is some increasing function. In [4] it was shown that for $a(t) = t^\alpha$, $\alpha > 1/2$, in any dimension with probability 1
$$\lim_{T \to +\infty}\langle W\rangle_{T,\omega} = 0 \qquad (3.12)$$
for any $\psi_0 \in l^2(\mathbf{Z}^d)$. We shall now improve this result.

Theorem 3.10. Let $a(t) = t^{1/2}(\log(t+1))^\nu$, where $\nu > 0$ for $d = 1, 2$ and $\nu > 1$ for $d \ge 3$. With probability 1 for any $\psi_0 \in l^2(\mathbf{Z}^d)$, (3.12) holds.

Proof. Suppose that $\psi_0$ is finite, so the result of Theorem 3.9 holds. For any $p > 0$ we have
$$|X|^p(t,\omega) \ge a^p(t)\sum_{n : |n| \ge a(t)} |\psi(t,n,\omega)|^2.$$
Therefore,
$$W(t,\omega) \le a^{-p}(t)|X|^p(t,\omega). \qquad (3.13)$$
As $\lim_{T \to +\infty} T^{-1}\int_0^3 dt\,W(t,\omega) = 0$, it is sufficient to consider
$$L(T,\omega) = T^{-1}\int_3^T dt\,W(t,\omega).$$
Let
$$F(T,\omega) = \int_3^T dt\,|X|^p(t,\omega).$$
Theorem 3.9 yields with probability 1
$$F(T,\omega) \le CT^{1+p/2}(\log T)^\gamma, \qquad (3.14)$$
where $\gamma > 1$ for $d = 1, 2$ and $\gamma > p + 1$ for $d \ge 3$. One can estimate $L$ using (3.13):
$$L(T,\omega) \le T^{-1}\int_3^T a^{-p}(t)F'(t,\omega)\,dt = T^{-1}F(T,\omega)a^{-p}(T) - T^{-1}\int_3^T F(t,\omega)s(t)\,dt, \qquad (3.15)$$
where $s(t) = (a^{-p}(t))'$. It is not difficult to show that (3.14) and (3.15) imply
$$\lim_{T \to +\infty} L(T,\omega) = 0$$
if $p\nu > \gamma$. This condition is satisfied for $p$ sufficiently big in any dimension. So, we have proved (3.12) for any finite $\psi_0$ with probability 1. By standard density arguments (see [4], Lemma 2.4), (3.12) holds with probability 1 for any $\psi_0 \in l^2(\mathbf{Z}^d)$.
4. Ameliorated Bounds in $l^2(\mathbf{Z}^{2d})$

In this section we shall denote by $(f,g)$ and $\|f\|$ the scalar product and the norm in $l^2(\mathbf{Z}^{2d})$. To obtain the upper bounds for $\rho(z) = (Y^{-1}(z)g, h)$, we shall consider the operators
$$X(z) = i\Delta + z + T',$$
where the operators $\Delta$, $T'$ were defined in Sect. 2. The idea consists in comparing the inverse $Y^{-1}(z)$ with $X^{-1}(z)$ and showing that these two operators have similar behaviour as $z \to +0$. The advantage of this method is that the inverse operators $X^{-1}(z)$ can be explicitly calculated (in Fourier representation), which gives good estimates for $\rho(z)$, better than that of Theorem 2.3 for $\alpha = 0$.

Some comments about the operators $X(z)$ are in order. Let us consider the potential $pV(u)$, and take $\lambda B$ as the generator of the Markov process, $p$, $\lambda$ being positive parameters. If $p/\lambda$ is small, one can show using the results of [4] that for the corresponding operators $G(z)$ one has:
$$G(z) = Cp^2\lambda^{-1}T' + T'R(z)T',$$
where $\|R(z)\| \le Cp^3\lambda^{-2}$. If one takes the limit $p^2 = \lambda \to +\infty$, one gets $G(z) = CT'$ with some constant $C$. As $\lambda \to +\infty$, the correlation time of the Markov process tends to 0. Therefore the equation $X(z)f = g$ can be interpreted as the equation for the Laplace transform of the average density matrix in the case of $\delta$-correlated potentials (see also [14]). One can explicitly calculate the inverse $X^{-1}(z)$ and show that the averaged solutions present exactly diffusive behaviour in any dimension.

Before calculating the inverse $X^{-1}(z)$, we shall first establish the relation between $X^{-1}(z)$ and $Y^{-1}(z)$.

Lemma 4.1. The following estimates, uniform in $z \in (0,1]$, hold:
$$\|T'X^{-1}(z)T'\| \le C, \qquad \|T'Y^{-1}(z)T'\| \le C. \qquad (4.1)$$

Proof. First, it is easy to see that the operators $X(z)$ have all the properties 1–6 (Sect. 2) of $Y(z)$. Therefore, all the results of Sect. 2 are valid for $X(z)$. In particular, the inverse operators $X^{-1}(z)$ exist in $l^2(\mathbf{Z}^{2d})$. Let us now write the equation for $f = Y^{-1}(z)T'g$:
$$(i\Delta + iQ + z + G(z))f = T'g.$$
As $TQ = QT = 0$, $TG(z) = G(z)T = 0$, one can write
$$i\Delta f + iT'QT'f + zf + T'G(z)T'f = T'g. \qquad (4.2)$$
Let us apply $T$ to both sides of (4.2). As $T\Delta T = 0$, one obtains
$$iT\Delta T'f + zTf = 0. \qquad (4.3)$$
Now apply $T'$ to both sides of (4.2):
$$iT'\Delta Tf + iT'\Delta T'f + iT'QT'f + zT'f + T'G(z)T'f = T'g. \qquad (4.4)$$
From (4.3) we obtain $Tf = -iz^{-1}T\Delta T'f$, and (4.4) implies the following equation for $T'f$:
$$N(z)T'f \equiv z^{-1}T'\Delta T\Delta T'f + iT'\Delta T'f + iT'QT'f + zT'f + T'G(z)T'f = T'g,$$
where the operators N (z) act in H0 = T 0 l2 (Z2d ), the closed subspace of l2 (Z2d ). One has T 0 1T 1T 0 = (T 1T 0 )∗ (T 1T 0 ) ≥ 0. As operators T 0 1T 0 , T 0 QT 0 are self-adjoint in l2 (Z2d ), with Property 6 of G(z) one estimates for any h ∈ H0 : <(N (z)h, h) ≥ <(T 0 G(z)T 0 h, h) ≥ C1 khk2 with some positive uniform constant C1 . Therefore, kT 0 gk2 = kN (z)T 0 f k2 = k(N (z) − C1 )T 0 f + C1 T 0 f k2 ≥ C12 kT 0 f k2 .
(4.5)
The estimate (4.5) gives the statement of the lemma for $Y^{-1}(z)$. For the inverse $X^{-1}(z)$ the proof is the same.

Lemma 4.2. The following equality holds:
\[ Y^{-1}(z) = X^{-1}(z) + X^{-1}(z)T'M(z)T'X^{-1}(z), \]
where $\|M(z)\| \le C$ uniformly in $z \in (0, 1]$.

Proof. Let $K$, $D$ be some bounded operators such that bounded inverses $(K + D)^{-1}$ and $K^{-1}$ exist. It is easy to check the identity:
\[ (K + D)^{-1} = K^{-1} - K^{-1}(D - D(K + D)^{-1}D)K^{-1}. \tag{4.6} \]
One takes $K = X(z)$, $D = Y(z) - X(z) = iQ + G(z) - T'$. As $TQ = QT = 0$, $TG(z) = G(z)T = 0$, $TT' = T'T = 0$, we have $TD = DT = 0$. Therefore, (4.6) implies
\[ Y^{-1}(z) = X^{-1}(z) - X^{-1}(z)T'(D - DT'Y^{-1}(z)T'D)T'X^{-1}(z). \tag{4.7} \]
It follows directly from the properties of $G(z)$, $Q$ that
\[ \|D\| \le C \tag{4.8} \]
uniformly in $z \in (0, 1]$. The result of the lemma follows from (4.1), (4.7) and (4.8).
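The operator identity (4.6) is purely algebraic, so it can be sanity-checked on small matrices. The sketch below is an illustration only: the 2x2 matrices `K` and `D` are arbitrary choices, not objects from the paper, and exact rational arithmetic is used so that both sides can be compared for equality.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b, sign=1):
    """Entrywise a + sign*b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def matinv(a):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def identity_4_6_sides(K, D):
    """Return (left, right) sides of identity (4.6) for 2x2 matrices K, D."""
    Kinv = matinv(K)
    KDinv = matinv(matadd(K, D))
    left = KDinv
    # inner = D - D (K+D)^{-1} D
    inner = matadd(D, matmul(matmul(D, KDinv), D), sign=-1)
    # right = K^{-1} - K^{-1} inner K^{-1}
    right = matadd(Kinv, matmul(matmul(Kinv, inner), Kinv), sign=-1)
    return left, right

# Arbitrary illustrative matrices (hypothetical, not taken from the paper);
# both K and K + D are invertible.
K = [[Fr(2), Fr(1)], [Fr(0), Fr(3)]]
D = [[Fr(1), Fr(-1)], [Fr(2), Fr(1, 2)]]
left, right = identity_4_6_sides(K, D)
print(left == right)
```

The identity is obtained by applying the second resolvent identity twice, which is why no special structure of $K$ and $D$ is needed beyond invertibility of $K$ and $K + D$.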
Theorem 4.3. Let $\rho(z) = (Y^{-1}(z)g, h)$, where $g(n, m) = \psi_0(n)\overline{\psi_0(m)}$, $\psi_0 \in l^2(\mathbb{Z}^d)$, $h \in l^2(\mathbb{Z}^{2d})$. The estimate uniform in $z$, $\psi_0$, $h$ holds:
\[ |\rho(z)| \le C\|\psi_0\|^2 \max_{n,m} |((X^*)^{-1}(z)Th)(n, m)| + C(\|T'g\| + \|T'X^{-1}(z)Tg\|)(\|T'h\| + \|T'(X^*)^{-1}(z)Th\|). \]

Proof. Lemma 4.2 and $T + T' = I$ imply
\[ \rho(z) = ((T + T')X^{-1}(z)(T + T')g, h) + (X^{-1}(z)T'M(z)T'X^{-1}(z)g, h). \]
As $\|M(z)\| \le C$ (Lemma 4.2) and $\|T'X^{-1}(z)T'\| \le C$ (Lemma 4.1), one estimates:
\[ |\rho(z)| \le |(TX^{-1}(z)Tg, h)| + |(T'X^{-1}(z)Tg, h)| + |(TX^{-1}(z)T'g, h)| + C\|T'g\|\|T'h\| + C\|T'X^{-1}(z)g\|\|T'(X^*)^{-1}(z)h\|. \tag{4.9} \]
Further,
\[ (TX^{-1}(z)Tg, h) = (Tg, T(X^*)^{-1}(z)Th) = \sum_n |\psi_0(n)|^2 ((X^*)^{-1}(z)Th)(n, n). \tag{4.10} \]
Using again the result of Lemma 4.1 for operators $X(z)$, $X^*(z)$, one estimates:
\[ \|T'X^{-1}(z)g\| \le C(\|T'g\| + \|T'X^{-1}(z)Tg\|), \tag{4.11} \]
\[ \|T'(X^*)^{-1}(z)h\| \le C(\|T'h\| + \|T'(X^*)^{-1}(z)Th\|). \tag{4.12} \]
The equality (4.10) and estimates (4.9), (4.11), (4.12) give the statement of the theorem.

It follows from the theorem that the key quantities to estimate are
\[ J_1 = \max_{n,m} |((X^*)^{-1}(z)Th)(n, m)|, \qquad J_2 = \|T'X^{-1}(z)Tg\|, \qquad J_3 = \|T'(X^*)^{-1}(z)Th\|. \]
It turns out that the functions $X^{-1}(z)Tg$, $(X^*)^{-1}(z)Th$ can be calculated explicitly in Fourier representation. Let $f \in l^2(\mathbb{Z}^{2d})$ and let $\hat f$ be its Fourier transform:
\[ \hat f(s, k) = \sum_{n,m} \exp(-i(s, n) - i(k, m)) f(n, m), \]
\[ f(n, m) = (2\pi)^{-2d} \int_{[-\pi,\pi]^d} ds \int_{[-\pi,\pi]^d} dk\, \exp(i(s, n) + i(k, m)) \hat f(s, k). \]
As the function $\exp(i(k, m))\hat f(s, k)$ is $2\pi$-periodic in $k_j$, $j = 1, \dots, d$ for any $s$, one can change the variables $r = k + s$ and write
\[ \int_{[-\pi,\pi]^d} dk\, \exp(i(s, n) + i(k, m)) \hat f(s, k) = \int_{[-\pi,\pi]^d + s} dr\, \exp(i(s, n - m) + i(r, m)) \hat f(s, r - s) = \exp(i(s, n - m)) \int_{[-\pi,\pi]^d} dr\, \exp(i(r, m)) \hat f(s, r - s). \]
We shall use the variables $s$, $r$ and for any $f \in l^2(\mathbb{Z}^{2d})$ we shall write $\tilde f(s, r) \equiv \hat f(s, r - s)$, so
\[ f(n, m) = (2\pi)^{-2d} \int_{[-\pi,\pi]^{2d}} ds\, dr\, \exp(i(s, n - m) + i(r, m)) \tilde f(s, r), \tag{4.13} \]
\[ \|f\|^2 = (2\pi)^{-2d} \int_{[-\pi,\pi]^{2d}} ds\, dr\, |\tilde f(s, r)|^2. \tag{4.14} \]
If one takes $n = m$ in (4.13), one obtains
\[ \widetilde{Tf}(s, r) = (2\pi)^{-d} \int_{[-\pi,\pi]^d} ds\, \tilde f(s, r), \tag{4.15} \]
i.e. $\widetilde{Tf}(s, r)$ does not depend on $s$. We shall write therefore $\widetilde{Tf}(r)$. One can show also from (4.13) that
\[ (\widetilde{\Delta f})(s, r) = L(s, r)\tilde f(s, r), \tag{4.16} \]
where
\[ L(s, r) = 4 \sum_{j=1}^d \sin(r_j/2) \sin(s_j - r_j/2), \qquad r = (r_1, \dots, r_d),\ s = (s_1, \dots, s_d). \]
We can now calculate $X^{-1}(z)Tf$, $(X^*)^{-1}(z)Tf$ for any $f$.

Theorem 4.4. Let $0 < z \le 1$, $f \in l^2(\mathbb{Z}^{2d})$, $u = X^{-1}(z)Tf$, $v = (X^*)^{-1}(z)Tf$. The identities hold:
\[ \tilde u(s, r) = \frac{\widetilde{Tf}(r)}{1 - \eta_z(r)} (iL(s, r) + z + 1)^{-1}, \tag{4.17} \]
\[ \tilde v(s, r) = \frac{\widetilde{Tf}(r)}{1 - \eta_z(r)} (-iL(s, r) + z + 1)^{-1}, \tag{4.18} \]
where $\eta_z(r)$ is some real function such that for any $r$, $z$,
\[ 0 < \eta_z(r) \le 1 - C(z + r^2) \tag{4.19} \]
with positive uniform constant $C$.

Proof. As $X(z)u = Tf$, one can write with (4.15), (4.16) the equation for $u$ in variables $s$, $r$:
\[ (iL(s, r) + z + 1)\tilde u(s, r) - (2\pi)^{-d} \int_{[-\pi,\pi]^d} ds\, \tilde u(s, r) = \widetilde{Tf}(r). \tag{4.20} \]
Let $p(r) = \widetilde{Tu}(r)$. Equations (4.20) and (4.15) yield
\[ p(r) = (\widetilde{Tf}(r) + p(r))\eta_z(r), \tag{4.21} \]
where
\[ \eta_z(r) = (2\pi)^{-d} \int_{[-\pi,\pi]^d} ds\, (iL(s, r) + z + 1)^{-1}. \]
Using the periodicity of $L(s, r)$ in $s_j$, one can write
\[ \eta_z(r) = (2\pi)^{-d} \int_{[-\pi,\pi]^d} ds \Bigl( 4i \sum_{j=1}^d \sin(r_j/2) \sin(s_j) + z + 1 \Bigr)^{-1}. \]
Let $K(s, r) = 4 \sum_j \sin(r_j/2) \sin(s_j)$. As $K(-s, r) = -K(s, r)$, we obtain
\[ \eta_z(r) = (z + 1)(2\pi)^{-d} \int_{[-\pi,\pi]^d} ds \bigl( (z + 1)^2 + K^2(s, r) \bigr)^{-1}. \tag{4.22} \]
One sees that $\eta_z(r)$ is real and $0 < \eta_z < (z + 1)^{-1}$ for any $r$, $z$. Further, one can write
\[ (z + 1)^{-1} - \eta_z(r) = (2\pi)^{-d}(z + 1)^{-1} \int_{[-\pi,\pi]^d} ds\, K^2(s, r) \bigl( (z + 1)^2 + K^2(s, r) \bigr)^{-1}. \tag{4.23} \]
As $K^2(s, r) \le 16d^2$, $z \in (0, 1]$, (4.23) implies
\[ (z + 1)^{-1} - \eta_z(r) \ge C \int_{[-\pi,\pi]^d} ds\, K^2(s, r) \tag{4.24} \]
with some positive uniform constant $C$. The cross terms in $K^2$ integrate to zero, and the calculation gives
\[ \int_{[-\pi,\pi]^d} ds\, K^2(s, r) = 16\pi(2\pi)^{d-1} \sum_{j=1}^d \sin^2(r_j/2) \ge Cr^2, \qquad r \in [-\pi, \pi]^d. \tag{4.25} \]
The estimates (4.24), (4.25) imply the estimate (4.19) for $\eta_z(r)$. Now we go back to Eq. (4.21) and find
\[ p(r) = \widetilde{Tf}(r)\eta_z(r)(1 - \eta_z(r))^{-1}. \tag{4.26} \]
The identities (4.20) and (4.26) yield (4.17). For the adjoint operator the proof of (4.18) is the same.

Corollary 4.5. The estimate uniform in $n$, $m$, $z$ holds:
\[ |((X^*)^{-1}(z)Th)(n, m)| \le C \int_{[-\pi,\pi]^d} dr\, |\widetilde{Th}(r)|(r^2 + z)^{-1}. \]
Proof. As $L(s, r)$ is real, the result follows directly from Theorem 4.4 and (4.13).

Corollary 4.6. The estimates uniform in $z$ hold:
\[ \|T'X^{-1}(z)Tg\|^2 \le C \int_{[-\pi,\pi]^d} dr\, |\widetilde{Tg}(r)|^2 r^2 (r^2 + z)^{-2}, \tag{4.27} \]
\[ \|T'(X^*)^{-1}(z)Th\|^2 \le C \int_{[-\pi,\pi]^d} dr\, |\widetilde{Th}(r)|^2 r^2 (r^2 + z)^{-2}. \tag{4.28} \]

Proof. Let $u = X^{-1}(z)Tg$. The equalities (4.17) and (4.26) imply
\[ \widetilde{T'u}(s, r) = \frac{\widetilde{Tg}(r)}{1 - \eta_z(r)} \bigl( (iL(s, r) + z + 1)^{-1} - \eta_z(r) \bigr). \tag{4.29} \]
As $|L(s, r)| \le C|r|$ uniformly in $s$, $r$, it is easy to show that
\[ |(iL(s, r) + z + 1)^{-1} - \eta_z(r)| \le C|r|. \tag{4.30} \]
The equality (4.29) and the estimates (4.30), (4.19) yield
\[ |\widetilde{T'u}(s, r)| \le C|\widetilde{Tg}(r)| \cdot |r|(r^2 + z)^{-1}, \]
which together with (4.14) gives (4.27). For the adjoint operator the proof of (4.28) is the same.
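As a numerical sanity check on the function $\eta_z(r)$ of Theorem 4.4, the following sketch evaluates its defining integral in dimension $d = 1$ by the midpoint rule and compares it with the closed form $((z+1)^2 + 16\sin^2(r/2))^{-1/2}$. That closed form is a standard one-dimensional integral and is not stated in the paper; the function names and the sample values of $z$, $r$ are illustrative assumptions.

```python
import math

def eta_numeric(z, r, n=4000):
    """Midpoint-rule value of (2*pi)^{-1} * integral over [-pi, pi] of
    (4i sin(r/2) sin(s) + z + 1)^{-1} ds, i.e. eta_z(r) for d = 1."""
    b = 4.0 * math.sin(r / 2.0)
    h = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        s = -math.pi + (k + 0.5) * h
        total += 1.0 / complex(z + 1.0, b * math.sin(s))
    return total * h / (2.0 * math.pi)

def eta_closed(z, r):
    """Closed form for d = 1 (assumption of this sketch, not from the paper)."""
    return 1.0 / math.sqrt((z + 1.0) ** 2 + 16.0 * math.sin(r / 2.0) ** 2)

# Sample points chosen only for illustration.
for z, r in [(0.1, 0.5), (0.5, 2.0), (1.0, math.pi)]:
    eta = eta_numeric(z, r)
    print(z, r, eta.real, eta.imag, eta_closed(z, r))
```

In each case the imaginary part vanishes up to rounding and the real part lies strictly between 0 and $(z+1)^{-1}$, in line with the remark following (4.22).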
5. Bounds for Inside Occupation Time and Lower Bounds for Moments

First, we shall obtain lower bounds for the expectations
\[ \rho_p(z) = E \int_0^{+\infty} dt\, \exp(-tz)|X|^p(t, \omega) = \sum_n |n|^p f(n, n, z). \]
To do this, one should be able to estimate the quantities like
\[ M_B(z) = \sum_{n : |n| \le B} f(n, n, z). \]

Lemma 5.1. Let $\psi_0 \in l^2(\mathbb{Z}^d)$. The estimates uniform in $z \in (0, 1/3]$, $B \ge 2$, $zB^2 \le 1$ hold:
\[ |M_B(z)| \le CBz^{-1/2}, \quad d = 1; \]
\[ |M_B(z)| \le CB^2 \sqrt{|\log z|(1 + |\log zB^2|)}, \quad d = 2; \]
\[ |M_B(z)| \le CB^{\frac{d+2}{2}}, \quad d \ge 3. \]

Proof. First, as $f(n, n, z) \ge 0$, we have
\[ M_B(z) \le \sum_{n : \forall j, |n_j| \le B} f(n, n, z) \equiv N(z). \tag{5.1} \]
One can write $N(z)$ as $N(z) = (TY^{-1}(z)g, h)$, where
\[ g(n, m) = \psi_0(n)\overline{\psi_0(m)}, \qquad h(n, m) = \delta_{n,m} \prod_{j=1}^d F(|n_j| \le B). \]
As $T'h = 0$, Theorem 4.3 gives the estimate:
\[ |N(z)| \le CJ_1(z) + (C(\psi_0) + J_2(z))J_3(z), \tag{5.2} \]
where
\[ J_1(z) = \max_{n,m} |((X^*)^{-1}(z)Th)(n, m)|, \qquad J_2(z) = \|T'X^{-1}(z)Tg\|, \qquad J_3(z) = \|T'(X^*)^{-1}(z)Th\|. \]
Corollaries 4.5 and 4.6 imply
\[ J_1(z) \le C \int_{[-\pi,\pi]^d} dr\, |\widetilde{Th}(r)|(r^2 + z)^{-1}, \tag{5.3} \]
\[ J_2^2(z) \le C \int_{[-\pi,\pi]^d} dr\, |\widetilde{Tg}(r)|^2 r^2 (r^2 + z)^{-2}, \tag{5.4} \]
\[ J_3^2(z) \le C \int_{[-\pi,\pi]^d} dr\, |\widetilde{Th}(r)|^2 r^2 (r^2 + z)^{-2}. \tag{5.5} \]
One can explicitly calculate the function $\widetilde{Th}(r)$, $r \ne 0$:
\[ \widetilde{Th}(r) = \sum_n \exp(-i(r, n))h(n, n) = \prod_{j=1}^d \sum_{n_j : |n_j| \le B} \exp(-in_jr_j) = \prod_{j=1}^d \frac{\sin(r_jB_1)}{\sin(r_j/2)}, \]
where $B_1 \in [B - 1/2, B + 1/2]$. Using the uniform estimate $|\sin x| \ge C|x|$, $x \in [-\pi/2, \pi/2]$ and changing the variable $\rho = B_1r$, we obtain:
\[ J_1(z) \le C \int_{[-\pi,\pi]^d} dr\, (r^2 + z)^{-1} \prod_{j=1}^d \frac{|\sin(r_jB_1)|}{|r_j|} = CB_1^2 \int_{[-\pi B_1, \pi B_1]^d} d\rho\, (\rho^2 + zB_1^2)^{-1} \prod_{j=1}^d \frac{|\sin \rho_j|}{|\rho_j|}. \tag{5.6} \]
It is easy to estimate the integral in (5.6) and to show that
\[ J_1(z) \le CBz^{-1/2}, \ d = 1; \qquad J_1(z) \le CB^2(|\log(zB^2)| + 1), \ d = 2; \qquad J_1(z) \le CB^2, \ d \ge 3. \tag{5.7} \]
Estimating $J_3^2(z)$ in the same manner, one obtains
\[ J_3^2(z) \le CB^2z^{-1/2}, \ d = 1; \qquad J_3^2(z) \le CB^4(|\log(zB^2)| + 1), \ d = 2; \qquad J_3^2(z) \le CB^{d+2}, \ d \ge 3. \tag{5.8} \]
Let us now estimate $J_2(z)$. As $\widetilde{Tg}(r) = \sum_n \exp(-i(n, r))|\psi_0(n)|^2$, we have $|\widetilde{Tg}(r)| \le \|\psi_0\|^2$ uniformly in $r$. Therefore, one obtains from (5.4):
\[ J_2^2(z) \le Cz^{-1/2}, \ d = 1; \qquad J_2^2(z) \le C|\log z|, \ d = 2; \qquad J_2^2(z) \le C, \ d \ge 3. \tag{5.9} \]
The result of the lemma follows now from (5.1), (5.2) and (5.7)–(5.9).
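The closed form used for the transform of $h$ in the proof above is the classical Dirichlet kernel. The sketch below checks the one-dimensional factor numerically, under the assumption (made here for illustration) that $B_1 = \lfloor B \rfloor + 1/2$, which indeed lies in $[B - 1/2, B + 1/2]$; the function names and sample values are hypothetical.

```python
import cmath
import math

def indicator_sum(r, B):
    """Direct sum of exp(-i n r) over integers n with |n| <= B (d = 1)."""
    N = math.floor(B)
    return sum(cmath.exp(-1j * n * r) for n in range(-N, N + 1))

def dirichlet_closed(r, B):
    """sin(r*B1)/sin(r/2) with B1 = floor(B) + 1/2, for r not in 2*pi*Z."""
    B1 = math.floor(B) + 0.5
    return math.sin(r * B1) / math.sin(r / 2.0)

# Sample (B, r) pairs chosen only for illustration.
for B, r in [(2.0, 0.3), (5.7, 1.1), (10.2, -2.5)]:
    print(B, r, indicator_sum(r, B).real, dirichlet_closed(r, B))
```

For $d > 1$ the transform factorises over the coordinates, so the same check applies to each factor of the product.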
Comment 5.2. It follows from Theorem 2.3 for $\alpha = 0$ that $|M_B(z)| \le Cl_d(z)B^{d/2}$. One can verify that the estimates of Lemma 5.1 give better results (see later) with probability 1 in dimensions $d = 1, 2$ and significantly better results for the averaged and with probability 1 for $d \ge 3$.

Theorem 5.3. Let $\psi_0 \in l^2(\mathbb{Z}^d)$, $\psi_0 \ne 0$. The following estimates hold with positive constants uniform in $z \in (0, 1/3]$:
\[ \rho_p(z) \ge Cz^{-1-\frac{p}{2}}, \quad d = 1; \]
\[ \rho_p(z) \ge Cz^{-1-\frac{p}{2}}|\log z|^{-p\alpha}, \ \alpha > \tfrac14, \quad d = 2; \]
\[ \rho_p(z) \ge Cz^{-1-\frac{2p}{d+2}}, \quad d \ge 3. \]

Proof. For any $B > 0$ one can estimate:
\[ \rho_p(z) \ge \sum_{n : |n| > B} |n|^p f(n, n, z) \ge B^p \Bigl( z^{-1}\|\psi_0\|^2 - \sum_{n : |n| \le B} f(n, n, z) \Bigr) = B^p (z^{-1}\|\psi_0\|^2 - M_B(z)). \tag{5.10} \]
We have again used the identity $\sum_n f(n, n, z) = z^{-1}\|\psi_0\|^2$. We choose $B$ big but such that
\[ |M_B(z)| \le \tfrac12 z^{-1}\|\psi_0\|^2. \tag{5.11} \]
To do it, we take
\[ B = \delta z^{-\frac12}, \ d = 1; \qquad B = \delta z^{-\frac12}|\log z|^{-\alpha}, \ \alpha > \tfrac14, \ d = 2; \qquad B = \delta z^{-\frac{2}{d+2}}, \ d \ge 3. \]
For $\delta$ sufficiently small and depending only on $\psi_0$, Lemma 5.1 yields (5.11). The result of the theorem follows now from (5.10).

Corollary 5.4. The following uniform estimates for the averaged diffusion constant hold:
\[ D(z) \ge C, \ d = 1; \qquad D(z) \ge C_\alpha|\log z|^{-2\alpha}, \ \alpha > \tfrac14, \ d = 2; \qquad D(z) \ge Cz^{\frac{d-2}{d+2}}, \ d \ge 3. \]
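To make the choice of $B$ in the proof of Theorem 5.3 concrete, here is the case $d = 1$ worked out. The symbol $C_0$ for the constant of Lemma 5.1 is introduced only for this illustration, and $z$ is assumed small enough that $B \ge 2$, as Lemma 5.1 requires. With $B = \delta z^{-1/2}$ one has $zB^2 = \delta^2$, and
\[ |M_B(z)| \le C_0Bz^{-1/2} = C_0\delta z^{-1} \le \tfrac12 z^{-1}\|\psi_0\|^2 \quad \text{as soon as} \quad \delta \le \min\{1, \|\psi_0\|^2/(2C_0)\}, \]
which is (5.11). Inserting this into (5.10) gives
\[ \rho_p(z) \ge B^p \bigl( z^{-1}\|\psi_0\|^2 - \tfrac12 z^{-1}\|\psi_0\|^2 \bigr) = \tfrac12 \delta^p \|\psi_0\|^2 z^{-1-p/2}, \]
in agreement with the first estimate of Theorem 5.3. The cases $d = 2$ and $d \ge 3$ follow the same pattern with the corresponding bounds of Lemma 5.1.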
The lower bounds for expectations $\rho_p(z) = E[\eta_p(z, \cdot)]$ do not give, however, lower bounds for $\eta_p(z, \omega)$ with probability 1. To obtain such bounds, one shall use the RAGE-like results for the inside occupation time, namely,
\[ \lim_{T \to +\infty} T^{-1} \int_0^T dt \sum_{n : |n| \le b(t)} |\psi(t, n, \omega)|^2 = 0 \tag{5.12} \]
for some increasing $b(t)$ with probability 1. The result we prove below improves significantly the corresponding result of [4].

Theorem 5.5. Let $b(t) = t^{\frac12}(\log(t + 2))^{-r}$, where $r > 1$, $d = 1$ and $r > \frac34$, $d = 2$; $b(t) = t^{\frac{2}{d+2}}(\log(t + 2))^{-r}$, $r > \frac{2}{d+2}$, $d \ge 3$. With probability 1 for any $\psi_0 \in l^2(\mathbb{Z}^d)$,
\[ \lim_{T \to +\infty} L(T, \omega) \equiv \lim_{T \to +\infty} T^{-1} \int_0^T dt \sum_{n : |n| \le b(T)} |\psi(t, n, \omega)|^2 = 0. \tag{5.13} \]
In particular, (5.12) holds.
Proof. Let us fix some $\psi_0$ and estimate the functional $L(T, \omega)$. As in (3.11), one has
\[ L(T, \omega) \le CzG(z, \omega), \tag{5.14} \]
where $z = T^{-1}$,
\[ G(z, \omega) = \int_0^{+\infty} dt\, \exp(-tz) \sum_{n : |n| \le b(z^{-1})} |\psi(t, n, \omega)|^2. \]
One can calculate expectations of $G(z, \cdot)$ in the same manner as for $\eta_p(z, \cdot)$. One gets easily
\[ E[G(z, \cdot)] = \sum_{n : |n| \le b(z^{-1})} f(n, n, z). \]
With the choice of functions $b(t)$ in the statement of the theorem, one can verify that the estimates of Lemma 5.1 yield in any dimension:
\[ E[zG(z, \cdot)] \le C|\log z|^{-r} \]
with some $r > 1$. As the function
\[ g(t, z, \omega) = \sum_{n : |n| \le b(z^{-1})} |\psi(t, n, \omega)|^2 \]
is decreasing in $z$ for small $z$, we can apply Lemma 3.6 to the functional $h(z, \omega) = zG(z, \omega)$ and obtain
\[ \lim_{z \to +0} zG(z, \omega) = 0 \tag{5.15} \]
with probability 1. It follows from (5.14) and (5.15) that (5.13) holds. As the function $b(t)$ is increasing for sufficiently large $t$, (5.13) implies (5.12). The equalities (5.12), (5.13) are proven for any $\psi_0 \in l^2(\mathbb{Z}^d)$ for P-a.e. $\omega$. By density arguments, (5.12) and (5.13) hold with probability 1 for any $\psi_0$.

It follows from this theorem that for any ball $K_R = \{n : |n| \le R\}$, $R > 0$, the average occupation time tends to 0 (in the Cesaro sense) as $T \to +\infty$:
\[ \lim_{T \to +\infty} \langle W_R \rangle_{T,\omega} = 0, \qquad W_R(t, \omega) = \sum_{n : |n| \le R} |\psi(t, n, \omega)|^2. \]
One can estimate the rate of decay of $\langle W_R \rangle_{T,\omega}$ as $T \to +\infty$.

Theorem 5.6. Let $\psi_0 \in l^2(\mathbb{Z}^d)$, $R > 0$, $r > 1$. With probability 1 the following estimates uniform in $T \ge 3$ hold:
\[ \langle W_R \rangle_{T,\omega} \le CT^{-1/2}(\log T)^r, \ d = 1; \qquad \langle W_R \rangle_{T,\omega} \le CT^{-1}(\log T)^r, \ d = 2; \qquad \langle W_R \rangle_{T,\omega} \le CT^{-1}, \ d \ge 3. \]
Proof. We can show in exactly the same manner as in the proof of Theorem 5.5 that
\[ \langle W_R \rangle_{T,\omega} \le CzG(z, \omega), \tag{5.16} \]
where $z = T^{-1}$,
\[ G(z, \omega) = \int_0^{+\infty} dt\, \exp(-tz) \sum_{n : |n| \le R} |\psi(t, n, \omega)|^2, \qquad E[G(z, \cdot)] = \sum_{n : |n| \le R} f(n, n, z) = M_R(z). \]
The statement of the theorem follows directly from (5.16), Lemma 5.1 and Corollary 3.7 in dimensions $d = 1, 2$ and the third statement of Lemma 3.6 in dimensions $d \ge 3$.

Corollary 5.7. In dimensions $d \ge 3$ for any $\psi_0$, $R$ with probability 1 the total occupation time of the ball $K_R$ is finite:
\[ \int_0^{+\infty} dt \sum_{n : |n| \le R} |\psi(t, n, \omega)|^2 < +\infty. \]
In particular, for any $n \in \mathbb{Z}^d$,
\[ \int_0^{+\infty} dt\, |\psi(t, n, \omega)|^2 < +\infty. \]
Now we can prove the lower bounds for the moments $|X|^p$ with probability 1.

Theorem 5.8. With probability 1 for any $\psi_0 \in l^2(\mathbb{Z}^d)$, $\psi_0 \ne 0$ for $T$ sufficiently large the uniform estimate holds:
\[ \langle |X|^p \rangle_{T,\omega} \ge \tfrac12 \|\psi_0\|^2 \cdot b^p(T), \]
where $b(T)$ was defined above.

Proof. As in the proof of Theorem 5.3, one estimates:
\[ \langle |X|^p \rangle_{T,\omega} \ge b^p(T) \bigl( \|\psi_0\|^2 - L(T, \omega) \bigr), \]
where $L(T, \omega)$ was defined in Theorem 5.5. The statement of the theorem follows directly from (5.13).

Finally, we shall discuss the behaviour of the correlation function:
\[ C_{\psi_0,\psi_1}(T, \omega) = T^{-1} \int_0^T |(\psi(t, \omega), \psi_1)|^2 dt, \]
where $\psi_0, \psi_1 \in l^2(\mathbb{Z}^d)$.
Theorem 5.9. Let $\psi_0, \psi_1 \in l^2(\mathbb{Z}^d)$, $r > 1$. With probability 1 the following estimates uniform in $T \ge 3$ hold:
\[ C(T, \omega) \le C_rT^{-\frac12}(\log T)^r, \ d = 1; \qquad C(T, \omega) \le C_rT^{-1}(\log T)^r, \ d = 2; \qquad C(T, \omega) \le CT^{-1}, \ d \ge 3. \]

Proof. Again,
\[ C(T, \omega) \le CzG(z, \omega), \tag{5.17} \]
where $z = T^{-1}$,
\[ G(z, \omega) = \int_0^{+\infty} dt\, \exp(-tz)|(\psi(t, \omega), \psi_1)|^2, \]
and
\[ \hat G(z) = E[G(z, \cdot)] = \bigl( Y^{-1}(z)g, h \bigr), \]
where $g(n, m) = \psi_0(n)\overline{\psi_0(m)}$, $h(n, m) = \psi_1(n)\overline{\psi_1(m)}$. As in the proof of Lemma 5.1,
\[ \hat G(z) \le CJ_1 + (C + J_2)(C + J_3), \]
where $J_1$, $J_2$, $J_3$ are estimated as in (5.3)–(5.5) and $C$ are some constants depending on $\psi_0$, $\psi_1$ but uniform in $z$. As $|\widetilde{Tg}(r)| \le \|\psi_0\|^2$, $|\widetilde{Th}(r)| \le \|\psi_1\|^2$, one can easily show that
\[ \hat G(z) \le Cz^{-\frac12}, \ d = 1; \qquad \hat G(z) \le C|\log z|, \ d = 2; \qquad \hat G(z) \le C, \ d \ge 3. \tag{5.18} \]
One can apply Corollary 3.7 for $d = 1, 2$ and Lemma 3.6 for $d \ge 3$ and show that (5.17)–(5.18) imply the statement of the theorem.

Corollary 5.10. For $d \ge 3$ with probability 1,
\[ \int_0^{+\infty} |(\psi(t, \omega), \psi_1)|^2 dt < +\infty. \tag{5.19} \]

Comment 5.11. For the Schrödinger equation with Hamiltonian $H$ which does not depend on time, where $\psi(t) = \exp(-itH)\psi_0$, (5.19) implies the absolute continuity of spectral measures $\mu_{\psi_0,\psi_1}$, $\mu_{\psi_0}$.

Corollary 5.12. Let $\psi_1 = \psi_0 \ne 0$. Consider the autocorrelation function
\[ C_{\psi_0}(T, \omega) = T^{-1} \int_0^T |(\psi(t, \omega), \psi_0)|^2 dt. \]
For $d \ge 2$ with probability 1,
\[ \lim_{T \to +\infty} \frac{\log C(T, \omega)}{\log T} = -1. \]
For $d = 1$ with probability 1,
\[ \limsup_{T \to +\infty} \frac{\log C(T, \omega)}{\log T} \le -1/2. \]
Proof. As $|(\psi(0, \omega), \psi_0)| = \|\psi_0\|^2 > 0$, by continuity in time
\[ C(T, \omega) \ge \delta T^{-1} \tag{5.20} \]
with some $\delta > 0$. The result of the corollary follows directly from (5.20) and Theorem 5.9.

One can conjecture that for $d = 1$,
\[ \limsup_{T \to +\infty} \frac{\log C(T, \omega)}{\log T} = -1/2. \]
References

1. Paquet, D., Leroux-Hugon, P.: Electron propagation in a Markovian time-fluctuating medium: A dynamical generalisation of the coherent-potential approximation. Phys. Rev. B29, 593–608 (1984)
2. Golubovic, L., Feng, S., Zeng, F.-A.: Classical and quantum superdiffusion in a time-dependent random potential. Phys. Rev. Lett. 67(16), 2115–2118 (1991)
3. Lebedev, N., Maass, P., Feng, S.: Diffusion and superdiffusion of a particle in a random potential with finite correlation time. Phys. Rev. Lett. 74(11), 1895–1899 (1995)
4. Tcheremchantsev, S.: Markovian Anderson model: Bounds for the rate of propagation. Commun. Math. Phys. 187, 441–469 (1997)
5. Reed, M., Simon, B.: Methods of modern mathematical physics, Vol. 3. New York: Academic Press, 1979
6. Enss, V., Veselic, D.: Bound states and propagating states for time-dependent Hamiltonians. Ann. Inst. H. Poincaré A39, 159–191 (1983)
7. Simon, B.: Absence of ballistic motion. Commun. Math. Phys. 134, 209–212 (1990)
8. Del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank-one perturbations, and localization. J. d'Anal. Math. 69, 153–200 (1996)
9. Germinet, F., De Bièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Preprint LPTM/97/10, Université Paris 7 (1997)
10. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
11. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996)
12. Pillet, C.-A.: Some results on the quantum dynamics of a particle in a Markovian potential. Commun. Math. Phys. 102, 237–254 (1985)
13. Barbaroux, J.-M.: Dynamique quantique des milieux désordonnés. Ph.D. thesis, Toulon (1997)
14. Stintzing, S.: Quantenmechanik eines Teilchens in einem zeitabhängigen Zufallspotential. Diplomarbeit, München (1992)

Communicated by B. Simon
Commun. Math. Phys. 196, 133 – 143 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Faltings’ Construction of the K-Z Connection*

T. R. Ramadas^{1,2}

1 School of Mathematics, TIFR, Homi Bhabha Road, Bombay 400 005, India. E-mail: [email protected]
2 Department of Mathematics, MIT, Cambridge, MA 02139, USA
Received: 3 March 1997 / Accepted: 19 January 1998
Abstract: We recast Faltings’ construction of the K-Z connection in complex-analytic terms. This suggests an approach to the problem of unitarity.
1. Introduction

Let $\Sigma$ be a compact (connected) oriented two-manifold of genus $g > 2$. Given an almost complex structure $J$ (compatible with orientation), $\Sigma$ acquires the structure of a riemann surface, which we denote $X$. In fact $X$ has the canonical structure of a complex, projective, (irreducible and) smooth curve. Let us for the moment work in the algebraic category. Let $L$ be a linebundle on $X$, and let $SU_X \equiv SU_X(r, L)$ denote the moduli space of (s-equivalence classes of) semistable vector bundles $F$ such that rank $F = r$ and $\det F = L$. It is known [?] that the picard group of $SU_X$ is $\mathbb{Z}$. Let $\Theta_X$ denote the ample generator, and consider the space of sections $V_k \equiv H^0(SU_X, \Theta_X^k)$. These are the generalised theta functions of level $k$ (associated to the curve $X$).

Consider the relative case. Let $\mathcal{X} \to T$ be a family of curves (of genus $g$) over a base $T$, and assume $\mathcal{X}$ is smooth and projective over $T$. Fix a linebundle on $\mathcal{X}$ and let $\Pi : SU_{\mathcal{X}} \to T$ denote the corresponding family of relative moduli spaces. If $\Theta_{\mathcal{X}}$ is a linebundle on $SU_{\mathcal{X}}$ restricting to Theta on each fibre of $\Pi$, we can consider the direct image sheaves $V_k \equiv \Pi_*\Theta_{\mathcal{X}}^k$. These are in fact vector bundles, well-defined up to tensoring by linebundles from $T$. We have then the following well-known result.

Theorem. The vector bundle $V_k$ carries a natural projective connection $\nabla$. The curvature of this connection vanishes.

By natural we mean functorial with respect to maps of the parameter spaces $T$.

* This research was partly funded by DOE grant DE-FG02-88ER25066.
The existence of a projective flat connection was first proved, following ideas from Conformal Field Theory, in [?]. An algebraic-geometric proof was given by Hitchin in [?], and a differential-geometric proof in [?]. Yet another proof, “inspired somehow by the paper of N.J. Hitchin, but mostly copying the approach of Witten” appeared in the paper of Faltings [?]. This note grew out of an attempt to understand this last paper.

In the rest of this paper we restrict to the rank 2 case, and further take $L = O_X$. Hitchin’s proof works by going to the cotangent space of $SU_X$. In contrast, Faltings works on the space of pairs $(F, D)$ where $F$ is a stable bundle and $D$ an algebraic connection on $F$. This space depends on the curve $X$. Instead we go to the complex-analytic category, where it becomes an open set of the moduli space $R$ of complex representations of the fundamental group $\pi_1(\Sigma)$.

1) This latter space $R$ is a complex manifold independent of $J$.

Consider the map $\phi : R \to SU_X$, given by associating to each representation the corresponding holomorphic bundle (this is not everywhere defined since this holomorphic bundle need not be stable, but we ignore this for the present). There is a similar map $\bar\phi : R \to SU_{\bar X}$, where $\bar X$ is the curve got by taking as complex structure $-J$.

2) The fibres of $\phi$ and $\bar\phi$ are transversal.

Our constructions depend on (1) and (2). This point of view suggests an approach to the question of unitarity, which is the content of the last section. It is not clear what, if any, relationship exists between this approach and the work of Gawedzki [?].
2. Preliminaries

2a. Representations of $\pi_1(\Sigma)$. Let $E$ denote the trivial $C^\infty$ hermitian rank 2 vector bundle on $\Sigma$, with a fixed trivialisation. Let $\mathcal{A}$ denote the space of (not necessarily hermitian) connections on $E$ such that the induced connection on $\det E$ is trivial; denote by ${}^u\!\mathcal{A}$ the subspace of unitary connections. Let $\mathcal{A}_{fl}$ denote the subspace of $\mathcal{A}$ consisting of connections $A$ such that $F_A = 0$, $F_A$ denoting the curvature of $A$. Let $\mathcal{A}^{irr}_{fl}$ denote the subspace of $\mathcal{A}_{fl}$ such that the associated monodromy representation $\pi_1(\Sigma) \to SL(2, \mathbb{C})$ is irreducible. The group of automorphisms (“complex gauge transformations”) of $E$ acts on $\mathcal{A}^{irr}_{fl}$ and the quotient is a complex manifold of dimension $6g - 6$. We let $R$ denote this quotient. Clearly, $R$ is the moduli space of irreducible representations into $SL(2, \mathbb{C})$ of the fundamental group of $\Sigma$, and is therefore an open subset of an affine algebraic variety. It carries a holomorphic symplectic form, which we denote $\Omega$. In fact $R$ is a holomorphic symplectic quotient, and $\Omega$ is the corresponding symplectic form.

We let ${}^u\!R \hookrightarrow R$ denote the (totally real) submanifold given by flat unitary connections. Equivalently, this is the moduli space of irreducible unitary representations. The symplectic form $\Omega$ restricts to ${}^u\!R$ as a real symplectic form. (We will normalise $\Omega$ so that it corresponds to the positive generator of $H^2({}^u\!R, \mathbb{Z}) = \mathbb{Z}$. Note that an orientation of $\Sigma$ induces one on ${}^u\!R$.) Chern-Weil theory yields:

Proposition 2.1. There is a holomorphic line bundle $L$ on $R$ with holomorphic connection $\nabla$ such that the corresponding curvature is $\Omega$.
The space $R$ has a real structure $\sigma$ ([?]), induced by the map $A \mapsto -A^\dagger$; ${}^u\!R$ is one of the components of the fixed-point set. This structure lifts to an antilinear (on the fibres) isomorphism $\tilde\sigma : L \to L^{-1}$. For $\tilde\rho \in H^0(R, L)$, define $\sigma^*\tilde\rho \in H^0(R, L^{-1})$ by
\[ \sigma^*\tilde\rho \equiv \tilde\sigma \circ \tilde\rho \circ \sigma. \]
Restricted to ${}^u\!R$, the map $\tilde\sigma$ induces a hermitian structure^1 on $L$, and the connection of Proposition 2.1 preserves this structure. (A construction of this hermitian linebundle, with a very explicit form for the connection, is given in [?]. This construction works equally well on $R$, and yields the line bundle $L$, with connection $\nabla$. The same construction, with orientation on $\Sigma$ reversed, yields the dual line bundle, with the dual connection.)

2b. Holomorphic structures on $E$. Suppose now that $\Sigma$ is endowed with a complex structure $J$ – let $X$ denote the corresponding riemann surface. A connection $A$ determines a holomorphic structure on $E$ (such that the local holomorphic sections are those annihilated by $d_A^{0,1}$). The variety $R$ therefore parametrises a family of holomorphic bundles. (An actual family exists only if we consider the adjoint bundle, but this need not cause us concern.) By the fundamental theorem of Narasimhan and Seshadri [?], points of ${}^u\!R$ correspond bijectively to the set of isomorphism classes of stable bundles $F$ such that $\det F$ is trivial. Since in any holomorphic family stable bundles form an open set, there exists an open set $U$ such that ${}^u\!R \subset U \subset R$ such that the corresponding bundles are stable, and there exists an induced map $\phi : U \to SU_X^s$, the open subset of $SU_X$ parametrising stable bundles. (The map $\phi$, restricted to ${}^u\!R$, is a real-analytic isomorphism.)

Reversing the complex structure (and also the orientation) on $\Sigma$ – i.e., considering the complex structure $-J$ – we get another riemann surface, which we denote $\bar X$. The corresponding moduli space, which is obtained by just reversing the complex structure on $SU_X^s$, will be denoted $SU_{\bar X}^s$. As above, we have a map $\bar\phi : U \to SU_{\bar X}^s$. We emphasize that the maps $\phi$ as well as $\bar\phi$ are both holomorphic.

Proposition 2.2. By shrinking $U$ if necessary we can ensure that
(1) $U$ is $\sigma$-stable,
(2) the maps $\phi$ and $\bar\phi$ are smooth,
(3) the fibres are contractible, and
(4) the corresponding distributions (which we will denote by $V$ (“vertical”) and $H$ (“horizontal”) respectively) are transversal.

In fact the two distributions are lagrangian for $\Omega$. Note that the differential of $\phi$ sets up an isomorphism
\[ H \to \phi^*T_{SU_X^s}. \tag{2.1} \]

Proposition 2.3. There are unique (up to nonzero scalar) isomorphisms
\[ \phi^*\Theta = L, \qquad \bar\phi^*\Theta = L^{-1}, \]
such that pull-back sections are covariant constant along the fibres of the respective maps.

The following fact is almost tautologous.

^1 References to the hermitian structure, here and elsewhere in the text, are only relevant to the last section.
Proposition 2.4. There is a complex antilinear isomorphism
\[ H^0(SU_X^s, \Theta_X^k) \to H^0(SU_{\bar X}^s, \Theta_{\bar X}^k), \qquad \rho \mapsto \bar\rho. \]

Remark 2.5. We have the following commutative diagram:
\[ \begin{array}{ccc} U & \xrightarrow{\ \sigma\ } & U \\ \phi \downarrow & & \downarrow \bar\phi \\ SU_X^s & \xrightarrow{\text{Identity}} & SU_{\bar X}^s \end{array} \]
where the vertical arrows are holomorphic maps and the horizontal maps antiholomorphic. In addition,
\[ \bar\phi^*\bar\rho = \sigma^*\phi^*\rho. \tag{2.2} \]

Remark 2.6. Finally, we note the following facts.
(1) $SU_X$ is a normal Cohen-Macaulay, unirational projective variety (with rational singularities), and (if $g > 2$) $SU_X^s$ is its smooth locus. The singular locus has codimension $2g - 3$.
(2) $H^0(SU_X, \Theta_X^k) = H^0(SU_X^s, \Theta_X^k)$.
(3) $H^1(SU_X^s, O) = 0$.
(4) By Hitchin [?], there exists an isomorphism $H : H^1(X, TX) \to H^0(SU_X^s, S^2T(SU_X^s))$.
(5) The hermitian structure on $\Theta$ extends continuously to $SU_X$.

3. Some Differential Geometry on $R$

All forms and functions on $R$ occurring below will be holomorphic.

Notation 3.1. Set $6g - 6 = 2m$. For any integer $k$, denote by $\nabla^k$ the connection on $L^k$ induced by $\nabla$.

We let $\kappa_1$ denote the pull-back to $U$ of the canonical bundle of $SU_X^s$. The latter is a (negative) power of the Theta bundle:
\[ \kappa_{SU_X^s} = \Theta^{-c}, \tag{3.1} \]
where $c = 4$. Therefore, by Proposition 2.3, $\kappa_1$ is similarly a power of $L$ (the isomorphism being well-defined up to a non-zero scalar). Thus there is on $\kappa_1$ a (holomorphic) connection with curvature proportional to $\Omega$. Our first task is to derive a formula for this connection.

Let $\tau_0$ denote the “holomorphic volume form” corresponding to $\Omega$ – the section of the canonical bundle $\kappa_R$ of $R$ obtained by raising $\Omega$ to the $(2m)$th-power. We take $\tau = e^f\tau_0$, where $f$ is to be determined. Thus $\tau$ is another volume form which, unlike $\tau_0$, could depend on $J$. Define a function $u \mapsto \ell(u)$ from the sheaf of vector fields on $R$ to the sheaf of functions by
\[ L_u\tau = \ell(u)\tau, \]
where $L_u$ is the Lie derivative w.r.to $u$. Note that $\ell$ is not $O$-linear – in fact $\ell(hu) = h\ell(u) + u(h)$, for $h$ any function.
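The Leibniz-type rule for $\ell$ can be checked in one line from Cartan's formula. The following short derivation is added for convenience and uses only that $\tau$ is a form of top degree, so that $d\tau = 0$ and $dh \wedge \tau = 0$:
\[ L_{hu}\tau = d(i_{hu}\tau) = d(h\, i_u\tau) = dh \wedge i_u\tau + h\, L_u\tau, \qquad 0 = i_u(dh \wedge \tau) = u(h)\,\tau - dh \wedge i_u\tau. \]
Hence $L_{hu}\tau = (h\ell(u) + u(h))\tau$, i.e. $\ell(hu) = h\ell(u) + u(h)$. In other words $\ell$ behaves like a divergence with respect to the volume form $\tau$.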
Remark 3.2. One defines a connection on $\kappa_R$ by defining $\nabla_u(\gamma) = (L_u - \ell(u))\gamma$ for any section $\gamma$. This goes over to the trivial connection on $O_R$ under the isomorphism $O_R \to \kappa_R$ given by multiplication by $\tau$.

There is a canonical isomorphism $\kappa_1 = (\wedge^m H^*)$. Note also that $(\wedge^m H^*)$ is a direct summand of $\wedge^m T^*R$. An $m$-form $\chi$ belongs to $\wedge^m H^*$ iff $i_v\chi = 0$ for every vertical $v$. We let $p$ denote the projection $\wedge^m T^*R \to \wedge^m H^*$. We will now define a connection ${}^\tau\nabla$ on $\kappa_1$ as follows.
(1) ${}^\tau\nabla_v\chi \equiv L_v\chi$ for any vertical $v$, and
(2) ${}^\tau\nabla_u\chi \equiv p(L_u - \ell(u))\chi$ for any horizontal $u$.
One easily checks that this defines a connection on $\kappa_1$. We let ${}^\tau F$ denote the curvature of this connection. By interchanging the roles of $X$ and $\bar X$, we can also define a connection ${}^\tau\bar\nabla$ on $\kappa_{\bar 1}$, where we define $\kappa_{\bar 1}$ to be the pullback of the canonical bundle of $SU_{\bar X}^s$.

Remark 3.3. Note the pairing $\kappa_1 \otimes \kappa_{\bar 1} \to \kappa_R \to O$, where the last isomorphism is given by the trivialisation defined by $\tau$:
\[ \langle \chi, \eta \rangle_\tau\, \tau = \chi \wedge \eta. \]

Proposition 3.4.
(1) ${}^\tau\nabla$ and ${}^\tau\bar\nabla$ are dual connections with respect to the pairing $\langle , \rangle_\tau$.
(2) $V$ and $H$ are isotropic distributions for ${}^\tau F$.
(3) Pull-back sections of $\kappa_1$ are covariant constant along $V$.
(4) Pull-back sections of $\kappa_{\bar 1}$ are covariant constant along $H$.

Composing isomorphisms, we have an isomorphism $\kappa_1 \to L^{-c}$, and pulling back $\nabla^{-c}$ we get a second connection on $\kappa_1$. Clearly the two connections agree along $V$. Our key result is

Theorem 1. There exists a choice of trivialisation $\tau$ such that $\nabla^{-c} = {}^\tau\nabla$.

Proof. Let us compute the curvature ${}^\tau F$ of ${}^\tau\nabla$, working locally on $R$. We already know that ${}^\tau F$ vanishes on $H$ and $V$. Let $v$ be a vertical vector field and $u$ a horizontal one. We assume $u$ is a pull-back – this ensures that the commutator $[v, u]$ is vertical. Let $\chi$ be a pullback section of $\kappa_1$; note that such sections are covariant constant in the vertical direction. One computes:
\[ {}^\tau F(v, u)\chi = L_v\, p(L_u - \ell(u))\chi = -v(\ell(u))\chi. \]
This yields the formula
\[ {}^\tau F(v, u) = -v(\ell(u)) \tag{3.2} \]
valid for $v$ vertical, $u$ pulled back. On the other hand, we have ${}^{\tau_0}\nabla = \nabla^{-c} + \zeta$, for some 1-form $\zeta$. Since $\zeta$ is trivial on $V$ and $d\zeta|_H = 0$, there exists for small enough open $W \subset SU_{\bar X}^s$ a function $f$ on $\bar\phi^{-1}(W)$ such that $\zeta(u) = u(f)$ for horizontal $u$. Using the fact that $H^1(SU_{\bar X}^s, O) = 0$, we see that $f$ can be globally defined. We then have
\[ {}^{\tau_0}F(v, u) = -c\,\Omega(v, u) + v(u(f)), \]
which yields the desired result, provided we set $\tau = e^f\tau_0$.
The following lemma will be relevant to the last section.

Lemma 3.5. The function $e^f$ in the above proof can be chosen to be real on ${}^u\!R$.

Proof. Restricted to ${}^u\!R$, $\kappa_1$ has two sesquilinear forms preserved by the connection $\nabla^{-c} = {}^\tau\nabla$. The first is the inner product induced by the isomorphism $\kappa_1 \to L^{-c}$, and the second arises from the pairing of Remark 3.3. (Noting that on ${}^u\!R$, $\bar\kappa_1 = \kappa_{\bar 1}$.) These two must be proportional, and can be made equal by an appropriate (constant) rescaling of $\tau$. Referring again to 3.3, we see that the corresponding $e^f$ is real.

Remark 3.6. Consider a point $(F, \nabla)$ in $U$, where $F$ is a stable bundle, and $\nabla$ a holomorphic connection. Then the vertical tangent space here can be canonically identified with $H^0(X, \mathrm{End}\, F \otimes K_X)$ and the horizontal tangent space with $H^1(X, \mathrm{End}\, F)$. The canonical pairing agrees, up to a constant, with the pairing given by $\Omega$.

Remark 3.7. Using the results of Zograf-Takhtajan [?] one sees that $e^f$ is essentially the regularised determinant of the $d_A^{0,1}$ Laplacian.

4. The Connection

The construction is essentially that of [?]; the use of the “horizontal distribution” $H$ and a careful global choice of volume-form (Theorem 1 above) are the only innovations.

We start with a generality. Suppose given a manifold $M$ with a volume-form $\tau$. Suppose further that $(N, D)$ is a linebundle with connection. Then the symbol map $\mathrm{Diff}^2(N) \to S^2TM$ has a splitting (which is, however, not $O$-linear), given by $u \otimes_s w \mapsto D[u \otimes_s w]$, where
\[ D[u \otimes_s w] \equiv (D_w + \ell(w))D_u + (D_u + \ell(u))D_w. \]
Here $\otimes_s$ denotes the symmetric tensor product (over the ring of functions) of vector fields.

Consider now the complex manifold $U$, with the trivialisation of $\kappa_U$ defined by $\tau$. Take for $N$ the linebundle $L^k$, and for the connection, $\nabla^k$. Using the isomorphism (2.1) we obtain a linear map
\[ D : H^0(SU_X^s, S^2T(SU_X^s)) \to H^0(U, \mathrm{Diff}^2(L^k)). \]
We next define a map
\[ T : H^0(SU_X^s, S^2T(SU_X^s)) \to H^0(U, O)/\{\text{constants}\} \]
as follows. Consider an open set $U \subset SU_X^s$ over which $T(SU_X^s)$ is trivial, and define
\[ T_U : H^0(U, S^2T(SU_X^s)) \to H^0(\phi^{-1}(U), O)/H^0(U, O) \]
by
\[ T_U[u \otimes_s w] = u(\ell(w)) + w(\ell(u)) + \ell(u)\ell(w) \mod H^0(U, O), \]
where on the right we mean the horizontal lifts of $u$ and $w$ to $\phi^{-1}(U)$. Noting that $H^0(SU_X^s, O) = \mathbb{C}$ and $H^1(SU_X^s, O) = 0$, we see that the $T_U$ globalise to yield a map $T$.

Let $g$ be a function on $U$. For future reference we define a third operator
\[ S^g : H^0(SU_X^s, S^2T(SU_X^s)) \to H^0(U, \mathrm{Diff}^1(\Lambda^{2m}, \Lambda^{2m-1})). \]
Locally as above,
\[ S^g_U[u \otimes_s w](\gamma) = i_u(L_w - \ell(w))\gamma - i_u w(g)\gamma + i_w(L_u - \ell(u))\gamma - i_w u(g)\gamma, \]
for $\gamma$ a $(2m)$-form. Again one checks that this is well-defined.

We are now ready to define the connection $\nabla$. Consider the situation as in the Introduction, with a family of compact riemann surfaces, parametrised by a complex manifold, which in fact we can take to be the unit disk. We let $t$ denote the natural coordinate. We can assume, after some shrinking if necessary, that the following situation holds:
\[ \begin{array}{ccc} T \times U & \xrightarrow{\ \Phi\ } & SU_{\mathcal{X}} \\ p_1 \downarrow & & \downarrow \Pi \\ T & \xrightarrow{\ =\ } & T \end{array} \]
where $\Phi$ is a relative version of $\phi$, and $p_1$ the projection onto the first factor. Let $L_{\mathcal{X}} \equiv p_2^*L$, where $p_2$ denotes the projection to the second factor; then $L_{\mathcal{X}}$ has a pull-back connection, which we continue to denote $\nabla$. The vector field $\partial_t = \frac{\partial}{\partial t}$ on $T$ lifts naturally to a vector field on the product $T \times U$; we continue to use the same notation for this lift. Since the connection on $L_{\mathcal{X}}$ is a pullback, so is the curvature, and (again retaining the relevant notation) $\Omega$ satisfies
\[ i_{\partial_t}\Omega = 0. \tag{4.1} \]
Suppose that we are given a section of the direct image bundle $V_k$ on $T$, or equivalently, a section $\psi$ of $\Theta_{\mathcal{X}}^k$ on $SU_{\mathcal{X}}^s$. We wish to define the covariant derivative $\nabla_{\partial_t}\psi$. Let $\tilde\psi$ denote the corresponding section of $L_{\mathcal{X}}^k$ on $T \times U$. We now set
\[ \widetilde{\nabla_{\partial_t}\psi} = \nabla_{\partial_t}\tilde\psi + \frac{1}{2(k + c/2)} \Bigl( D[H'(\partial_t)] - \frac{k}{c}\, T[H'(\partial_t)] \Bigr) \tilde\psi. \tag{4.2} \]
Here $H' \equiv H \circ KS$, where $KS$ is the Kodaira-Spencer map for the family of curves $\mathcal{X} \to T$, and the map $H : H^1(X, TX) \to H^0(SU_X^s, S^2T(SU_X^s))$ is the Hitchin isomorphism. (The appropriate normalisation of $H$ will be determined below.)

Proposition 4.1. Equation (4.2) defines a connection for an appropriate normalisation of $H$.

Proof. We need to check that the RHS of (4.2) is covariant constant in the vertical direction. Locally over $U \subset SU_X^s$ we write $H'(\partial_t) = \sum_j u_j \otimes_s w_j$. Let $v$ be a vertical vector field on $\phi^{-1}(U)$. It is straightforward to check
\[ \nabla^k_v(\mathrm{RHS}) = \nabla^k_{[v, \partial_t]}\tilde\psi + \nabla^k_{v_t}\tilde\psi, \qquad \text{where } v_t \equiv \sum_j \Omega(v, u_j)w_j + \Omega(v, w_j)u_j. \]
From this we see that the RHS is covariant constant along $V$ (and therefore defines a section of $\Theta_{\mathcal{X}}^k$ on $SU_{\mathcal{X}}^s$, and by extension on $SU_{\mathcal{X}}$) if
\[ v_t = [\partial_t, v] \pmod{V}. \]
We now use the following facts:
(1) We have $[\partial_t, v] = KS(\partial_t) \otimes v \pmod{V}$, where by $KS(\partial_t) \otimes v$ we mean the product of $KS(\partial_t) \in H^1(X, TX)$ and $v \in H^0(\mathrm{End}\, F \otimes K_X)$.
(2) If the normalisation of H is appropriately chosen, X (v, uj )wj + (v, wj )uj . KS(∂t ) ⊗ v = j
To see this, evaluate both sides on v, and use Remark 3.6 and the definition of the Hitchin map. 1/2
Remark 4.2. Let ψ be a local section of the square-root of the canonical bundle κSU s = X
s 2−c/2 over an open set U ⊂ SUX , and u ⊗s w a section of S 2 T U . Consider on φ−1 (U ), the section 1 ˜ (D[u ⊗s w] + TU [u ⊗s w])ψ. 2 The same computation as in the above proof shows that this is covariant constant along 1/2 V, and thus defines a new section of κSU s on U . Thus the symbol sequence X
0→
1/2 1/2 Dif f 1 (κSU s , κSU s )/O X X
1/2
1/2
s → Dif f 2 (κSU s , κSU s )/O → S 2 T (SUX )→0 X
X
splits. 5. The Hermitian Structure on Vk In this section we will assume that the various integrals encountered converge, and that integration by parts is permissible. Except in the context of Remark 5.7 below this convergence should not be difficult to prove, and the whole question can be avoided by considering the “coprime case” when the space of stable bundles is compact. Given sections ψ, ρ of 2kX , define their inner product to be Z ˜ ρ˜¯ > τk0 , < ψ, (5.1) (ψ, ρ) ≡ uR
where τk0 = eg τ , g a holomorphic function on U, which is real on u R. Note that this is the integral of a holomorphic form over a real cycle. The “reality condition” on g ensures that τk0 restricts to a real volume form on u R. Proposition 5.1. The equation (5.1) defines an inner product on Vk . Proof. This follows from the following facts: ˜ ψ˜¯ > is a holomorphic function on R, (1) < ψ, (2) its restriction to u R is a real nonnegative function, and (3) u R is totally real, of real dimension 2m. (When g = 0, the convergence of the integral is not difficult to see, using Remark 2.6.) Before stating the next lemma, recall that the operator Sg was defined in Sect. 4; it takes values in (2m − 1)-forms. We introduce one more piece of notation: we denote by Dκ the operator D acting on sections of κR , the corresponding connection having been defined in Remark 3.2. Note that in Eq. (5.2) below, the second term on the right is a total derivative.
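Lemma 5.2.
$$\langle D[H'(\partial_t)]\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k = \langle\tilde\psi,\bar{\tilde\rho}\rangle\, D_\kappa[H'(\partial_t)]\tau'_k + d\{S_g[H'(\partial_t)](\langle\tilde\psi,\bar{\tilde\rho}\rangle\,\tau'_k)\}. \tag{5.2}$$
Proof. Over $\phi^{-1}(U)$, with $U$ an open subset of $SU_X$ as in the proof of Proposition 4.1,
$$\begin{aligned}
\langle D[H'(\partial_t)]\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k
&= \sum_j \langle(\nabla^k_{w_j} + \ell(w_j))\nabla^k_{u_j}\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k + \langle(\nabla^k_{u_j} + \ell(u_j))\nabla^k_{w_j}\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k \\
&= \sum_j \{(L_{w_j} + \ell(w_j))L_{u_j}\langle\tilde\psi,\bar{\tilde\rho}\rangle\}\tau'_k + \{(L_{u_j} + \ell(u_j))L_{w_j}\langle\tilde\psi,\bar{\tilde\rho}\rangle\}\tau'_k \\
&= \langle\tilde\psi,\bar{\tilde\rho}\rangle \sum_j \big[L_{u_j}(L_{w_j} - \ell(w_j))\tau'_k + L_{w_j}(L_{u_j} - \ell(u_j))\tau'_k\big] + d\{S_g[H'(\partial_t)](\langle\tilde\psi,\bar{\tilde\rho}\rangle\,\tau'_k)\},
\end{aligned}$$
where we have used the fact that $\bar{\tilde\rho}$ is covariant constant along H. This proves the lemma.
Proposition 5.3. The inner product (5.1) is (projectively) preserved by the connection ∇ provided $\tau'_k$ satisfies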
$$L_{\partial_t}(\tau'_k) = \frac{1}{2(k + c/2)}\Big(D_\kappa[H'(\partial_t)] - \frac{k}{c}\,T[H'(\partial_t)]\Big)\tau'_k. \tag{5.3}$$
Proof. Let ψ and ρ be sections of 2kX , which are covariant constant regarded as sections ˜ ρ˜ denote the corresponding sections of Lk . Note that ∇k annihilates of Vk over T . Let ψ, X ∂t¯ these, where we set ∂t¯ = ∂∂t¯ ; also ∇k∂t ρ˜¯ = 0. Using (4.2) we see that ψ˜ satisfies
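$$\nabla_{\partial_t}\tilde\psi = \frac{-1}{2(k + c/2)}\Big(D[H'(\partial_t)] - \frac{k}{c}\,T[H'(\partial_t)]\Big)\tilde\psi \tag{5.4}$$
and $\tilde\rho$ satisfies a similar equation. We now use Lemma 5.2, and compute
$$\begin{aligned}
\partial_t(\psi,\rho)
&= \int_{\mathcal{U}_{\mathbb R}} \langle\nabla^k_{\partial_t}\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k + \int_{\mathcal{U}_{\mathbb R}} \langle\tilde\psi, \bar{\tilde\rho}\rangle\, L_{\partial_t}(\tau'_k) \\
&= -\frac{1}{2(k + c/2)}\int_{\mathcal{U}_{\mathbb R}} \langle D[H'(\partial_t)]\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k + \frac{k}{2c(k + c/2)}\int_{\mathcal{U}_{\mathbb R}} T[H'(\partial_t)]\,\langle\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k + \int_{\mathcal{U}_{\mathbb R}} \langle\tilde\psi, \bar{\tilde\rho}\rangle\, L_{\partial_t}(\tau'_k) \\
&= -\frac{1}{2(k + c/2)}\int_{\mathcal{U}_{\mathbb R}} \langle\tilde\psi, \bar{\tilde\rho}\rangle\, D_\kappa[H'(\partial_t)]\tau'_k + \frac{k}{2c(k + c/2)}\int_{\mathcal{U}_{\mathbb R}} T[H'(\partial_t)]\,\langle\tilde\psi, \bar{\tilde\rho}\rangle\,\tau'_k + \int_{\mathcal{U}_{\mathbb R}} \langle\tilde\psi, \bar{\tilde\rho}\rangle\, L_{\partial_t}(\tau'_k),
\end{aligned}$$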
where the second and third equalities hold modulo (ψ, ρ). A similar computation, with the roles of ψ and ρ interchanged, works for $\partial_{\bar t}(\psi, \rho)$. We record a formula for the variation of τ with t.
Lemma 5.4. Define $\ell(\partial_t)$ by $L_{\partial_t}\tau = \ell(\partial_t)\tau$. We have then
$$\ell(\partial_t) = -\frac{1}{c}\,T[H'(\partial_t)] + \text{constant}. \tag{5.5}$$
Proof. Note first that for a section χ of κ1 , pL∂t χ = ∇−c ∂t χ + `(∂t )χ. This follows by taking inner products with a section η of κ1¯ and using Remark 3.3. On the other hand, we have, using (4.1) (and omitting the superscript {}−c for the present) [∇v , ∇∂t ] = ∇[v,∂t ] , v vertical. Working over a small open set U ⊂ SUX∫ , let χ be a section of the relative canonical bundle κSU ∫ /T , pulled up to 8−1 (U). From the above equation we get X
Lv (pL∂t − `(∂t ))χ = p(L[v,∂t ] − `([v, ∂t ]))χ for any vertical v. Locally on 8−1 (U), we can ensure that [Lv , p] = 0, and for such a local vector field, we get 1 v(`(∂t )) = − v T[H 0 (∂t )] , c which proves the lemma.
Remark 5.5. The reality condition on eg is equivalent to σ ∗ τ¯k0 = τk0 . This, together with (5.3), determines the evolution of τk0 .
Remark 5.6. It is not clear if a solution $\tau'_k$ exists even locally with respect to a variation of complex structure. In fact this would be sufficient to yield a new proof of flatness of the connection ∇.
Remark 5.7. The involution σ has other components to its fixed-point set, and one may consider integrals of the type (5.1) over any one of these. The form $\tau'_k$ need be defined only in a neighbourhood of the corresponding component. In particular one can consider the “Teichmüller component” [?], which is just a (6g − 6)-dimensional real vector space.
References
[A-D-W] Axelrod, S., Della Pietra, S., Witten, E.: Geometric quantisation of Chern–Simons gauge theory. J. Diff. Geom. 33, 787–902 (1991)
[D-N] Drezet, J.M. and Narasimhan, M.S.: Groupe de Picard des variétés de fibrés semistables sur les courbes algébriques. Invent. Math. 97, 53–94 (1989)
[F] Faltings, G.: Stable G-bundles and projective connections. J. Alg. Geom. 2, 507–568 (1993)
[G] Gawedzki, K.: SU(2) WZW theory at higher genera. Commun. Math. Phys. 169, 329–371 (1995)
[H1] Hitchin, N.J.: Flat connections and geometric quantisation. Commun. Math. Phys. 131, 347–380 (1990)
[H2] Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. 55, 59–126 (1987)
[H3] Hitchin, N.J.: Lie groups and Teichmüller space. Topology 31, 449–473 (1992)
[N-S] Narasimhan, M.S. and Seshadri, C.S.: Stable and unitary vector bundles on a compact Riemann surface. Ann. of Math. 82, 540–567 (1965)
[R-S-W] Ramadas, T.R., Singer, I.M. and Weitsman, J.: Some comments on Chern–Simons gauge theory. Commun. Math. Phys. 126, 409–420 (1989)
[T-U-Y] Tsuchiya, A., Ueno, K. and Yamada, Y.: Conformal field theory on universal family of stable curves with gauge symmetries. Advanced Studies in Pure Mathematics 19, 459–566 (1989)
[Z-T] Zograf, P.G. and Takhtadzhyan, L.A.: The geometry of moduli spaces of vector bundles over a Riemann surface. Math. USSR-Izv. 35, 83–100 (1990)
Communicated by A. Jaffe
Commun. Math. Phys. 196, 145 – 173 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Pointwise Estimates of Diffusion Wave for the Navier–Stokes Systems in Odd Multi-Dimensions
Tai-Ping Liu 1,*, Weike Wang 2,**
1 Department of Mathematics, Stanford University, Stanford, CA 94305, USA
2 Department of Mathematics, Wuhan University, Wuhan 430072, China
Received: 23 June 1997 / Accepted: 19 January 1998
* Supported in part by NSF Grant DMS-9623025 (T.-P. Liu). ** Supported by National Natural Science Foundation of China (W. Wang).

Abstract: We study the dissipation of solutions of the isentropic Navier–Stokes equations in odd multi-dimensions. Pointwise estimates of the time-asymptotic shape of the solutions are obtained and shown to exhibit the generalized Huygens' principle. Our approach is based on a detailed analysis of the Green function of the linearized system. This is used to study the coupling of nonlinear diffusion waves.

1. Introduction

In this paper, we are interested in the time-asymptotic behavior of solutions to the isentropic Navier–Stokes equations in several space dimensions. Through pointwise estimates of the Green function of the linearized system and an analysis of the coupling of nonlinear diffusion waves, we obtain explicit expressions for the time-asymptotic behavior of the solutions. For pointwise estimates of solutions to general hyperbolic-parabolic systems in one space dimension see Liu and Zeng [5]. In several space variables, Kawashima [3] has studied general hyperbolic-parabolic systems and obtained $L^2$ estimates. The isentropic Navier–Stokes equations have been studied by Hoff and Zumbrun in [1], where $L^p$ ($p \ge 1$) estimates were obtained. Our pointwise bounds for the Green's function of the linearized Navier–Stokes system are consistent with those identified by Hoff and Zumbrun for the related linear artificial viscosity system [2]. Due to the effect of the nonlinearity, the solution of the full nonlinear system exhibits a weaker Huygens' principle than that of the Green's function. To exhibit the effect of Huygens' principle, we only consider the case in which the space dimension is odd. The more difficult case of even space dimension is left to the future. Consider the isentropic Navier–Stokes equations
$$\begin{cases}
\rho_t + \operatorname{div} m = 0, \\
m^j_t + \operatorname{div}\Big(\dfrac{m^j m}{\rho}\Big) + P(\rho)_{x_j} = \varepsilon\Delta\Big(\dfrac{m^j}{\rho}\Big) + \eta\,\operatorname{div}\Big(\dfrac{m}{\rho}\Big)_{x_j}.
\end{cases} \tag{1.1}$$
Here $\rho(x,t)$, $m(x,t) = (m^1(x,t), \cdots, m^n(x,t))^\tau$, and $P = P(\rho)$ represent the fluid density, momentum density, and pressure. $\varepsilon > 0$ and $\eta \ge 0$ are viscosity constants, and div and $\Delta$ are the usual spatial divergence and Laplace operators. We assume throughout that the pressure $P$ is a smooth function of the density $\rho$ in a neighborhood of a constant density $\rho^*$ and that $c$ is the sound speed, $c^2 = P'(\rho^*) > 0$. The system linearized about the constant state $(\rho^*, m^*)^\tau$, taken to be $(1, 0)^\tau$ without loss of generality, is
$$\begin{cases}
\rho_t + \operatorname{div} m = 0, \\
m_t + c^2\nabla\rho = \varepsilon\Delta m + \eta\nabla\operatorname{div} m.
\end{cases} \tag{1.2}$$
The symbol of the operator for system (1.2) is
$$l(\tau,\xi) = \sqrt{-1}\,\tau I + \sqrt{-1}\begin{pmatrix} 0 & \xi^\tau \\ c^2\xi & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & \varepsilon|\xi|^2 I + \eta\,\xi\xi^\tau \end{pmatrix}, \tag{1.3}$$
where $\tau$ and $\xi = (\xi_1, \cdots, \xi_n)$ are dual variables to $t$ and $x$ respectively. Direct calculations [1] show that the Fourier transform $\hat G$ of the Green's matrix for the linearized Navier–Stokes system (1.2) is given by
$$\hat G(\xi,t) = \begin{pmatrix}
\dfrac{\lambda_+ e^{\lambda_- t} - \lambda_- e^{\lambda_+ t}}{\lambda_+ - \lambda_-} & -\sqrt{-1}\,\dfrac{e^{\lambda_+ t} - e^{\lambda_- t}}{\lambda_+ - \lambda_-}\,\xi^\tau \\[2ex]
-\sqrt{-1}\,c^2\,\dfrac{e^{\lambda_+ t} - e^{\lambda_- t}}{\lambda_+ - \lambda_-}\,\xi & e^{-\varepsilon|\xi|^2 t} I + \Big(\dfrac{\lambda_+ e^{\lambda_+ t} - \lambda_- e^{\lambda_- t}}{\lambda_+ - \lambda_-} - e^{-\varepsilon|\xi|^2 t}\Big)\dfrac{\xi\xi^\tau}{|\xi|^2}
\end{pmatrix}, \tag{1.4}$$
where
$$\lambda_\pm = -\frac{1}{2}\nu|\xi|^2 \pm \frac{1}{2}\sqrt{\nu^2|\xi|^4 - 4c^2|\xi|^2}$$
and $\nu = \varepsilon + \eta$. The above representation holds for $|\xi| \ne 0, 2c/\nu$.

One of the main steps in our study of the solution of the nonlinear system (1.1) is the pointwise estimate of the Green function $G$ for (1.2). The Green function $G^*$ for the system with artificial viscosity for (1.1),
$$\begin{cases}
\rho_t + \operatorname{div} m = \frac{1}{2}(\varepsilon + \eta)\Delta\rho, \\
m_t + c^2\nabla\rho = \varepsilon\Delta m + \frac{1}{2}(\eta - \varepsilon)\nabla\operatorname{div} m,
\end{cases} \tag{1.5}$$
has been studied in Hoff and Zumbrun [2]. There, $G^*$ has a simplified form, which allows for explicit computations of the inverse Fourier transform. It is shown that $G^*$ contains the convolution of the fundamental solution of the wave operator with the heat kernel. The Green function $G$ for the linearized system with physical viscosity matrix (1.2) is estimated in Sect. 3.
For this, we need to identify the elements in G which correspond to the wave operator and also those which correspond to the dissipation. What distinguishes our analysis from that of [2] is that this identification for the whole Green function G is much less clear than for its principal part G∗ . This decomposition allows us to identify the hyperbolic and dissipative aspects of the Green function G. In Sect. 2, we establish some lemmas in preparation for the estimation of the Green’s function in Sect. 3. Instead of the explicitly computed exponential decaying term about the sound cone for G∗ in [2], the estimate for G is made possible by allowing an algebraic decaying term, (2.6). The pointwise estimate for the solution of (1.1) is carried out in Sect. 4 using Duhamel’s principle. For this we need to estimate the convolution of a non-linear source with Green function G. This last technical step is postponed till Sect. 5.
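As a sanity check on the closed form (1.4), the following minimal sketch compares it with the matrix exponential of the Fourier-transformed linearized system (1.2). It assumes the Fourier convention in which $\partial_x$ becomes multiplication by $\sqrt{-1}\,\xi$; the function names `green_hat`, `generator`, and `expm_taylor`, and the sample parameter values, are illustrative choices and not part of the paper.

```python
import numpy as np

def green_hat(xi, t, c, eps, eta):
    """Closed form (1.4) for the (n+1) x (n+1) matrix G^(xi, t)."""
    xi = np.asarray(xi, dtype=float)
    n = xi.size
    r2 = float(xi @ xi)                                  # |xi|^2
    nu = eps + eta
    disc = np.sqrt(complex(nu**2 * r2**2 - 4.0 * c**2 * r2))
    lam_p = -0.5 * nu * r2 + 0.5 * disc
    lam_m = -0.5 * nu * r2 - 0.5 * disc
    d = lam_p - lam_m                                    # nonzero for |xi| != 0, 2c/nu
    e_p, e_m = np.exp(lam_p * t), np.exp(lam_m * t)
    proj = np.outer(xi, xi) / r2                         # xi xi^T / |xi|^2
    G = np.zeros((n + 1, n + 1), dtype=complex)
    G[0, 0] = (lam_p * e_m - lam_m * e_p) / d
    G[0, 1:] = -1j * (e_p - e_m) / d * xi
    G[1:, 0] = -1j * c**2 * (e_p - e_m) / d * xi
    G[1:, 1:] = (np.exp(-eps * r2 * t) * (np.eye(n) - proj)
                 + (lam_p * e_p - lam_m * e_m) / d * proj)
    return G

def generator(xi, c, eps, eta):
    """Matrix A with d/dt (rho^, m^) = A (rho^, m^) for system (1.2)."""
    xi = np.asarray(xi, dtype=float)
    n = xi.size
    A = np.zeros((n + 1, n + 1), dtype=complex)
    A[0, 1:] = -1j * xi
    A[1:, 0] = -1j * c**2 * xi
    A[1:, 1:] = -(eps * (xi @ xi) * np.eye(n) + eta * np.outer(xi, xi))
    return A

def expm_taylor(M, terms=60):
    """Plain Taylor series for exp(M); adequate for matrices of moderate norm."""
    out = np.eye(M.shape[0], dtype=complex)
    term = out.copy()
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

if __name__ == "__main__":
    xi, t, c, eps, eta = [0.3, -0.5, 0.7], 0.8, 1.2, 0.4, 0.3
    diff = green_hat(xi, t, c, eps, eta) - expm_taylor(t * generator(xi, c, eps, eta))
    print("max entrywise difference:", np.abs(diff).max())
```

The formula is symmetric under exchanging $\lambda_+$ and $\lambda_-$, so the branch chosen for the complex square root does not matter in this check.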
2. Some Lemmas

Lemmas 2.1 and 2.5 below are due to Hoff and Zumbrun [2] and [1] respectively. Lemma 2.3 is analogous to Lemma 2.3 of [2], with the Gaussian kernel replaced by $B_N$ (see (2.6)).

Lemma 2.1. Let $\hat w = (2\pi)^{-n/2}(\sin c|\xi|t)/(c|\xi|)$ and $\hat w_t = (2\pi)^{-n/2}\cos(c|\xi|t)$. Then there are constants $a_\alpha$, $b_\alpha$ such that for a smooth function $f(x)$ and odd integer $n$,
$$w * f = \sum_{0 \le |\alpha| \le (n-3)/2} a_\alpha t^{|\alpha|+1} \int_{|y|=1} D^\alpha f(x + cty)\,y^\alpha\,dS_y, \tag{2.1}$$
$$w_t * f = \sum_{0 \le |\alpha| \le (n-1)/2} a_\alpha t^{|\alpha|} \int_{|y|=1} D^\alpha f(x + cty)\,y^\alpha\,dS_y. \tag{2.2}$$
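Proof. From [6] and [7], we have
$$w * f = C_n \Big(\frac{1}{t}\frac{\partial}{\partial t}\Big)^{\frac{n-3}{2}} \Big(t^{n-2}\int_{|z-x|=ct} f(z)\,dS_z\Big), \tag{2.3}$$
and
$$w_t * f = C_n \frac{\partial}{\partial t}\Big(\frac{1}{t}\frac{\partial}{\partial t}\Big)^{\frac{n-3}{2}} \Big(t^{n-2}\int_{|z-x|=ct} f(z)\,dS_z\Big). \tag{2.4}$$
For any smooth $h = h(t)$, we have
$$\Big(\frac{1}{t}\frac{\partial}{\partial t}\Big)^{k} h(t) = \sum_{l=0}^{k-1} b_l\, t^{-(k+l)} h^{(k-l)}, \qquad \Big(\frac{1}{t}\frac{\partial}{\partial t}\Big)^{k} (t^{n-2}h(t)) = \sum_{l=0}^{k} b'_l\, t^{-(k+l-n+2)} h^{(k-l)}.$$
Letting $y = \frac{z-x}{ct}$, and using (2.3), (2.4) and the above formula, the lemma is proved.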
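For $n = 3$ the sum in (2.1) reduces to the single term $\alpha = 0$, which is Kirchhoff's formula: with the multiplier normalized as $\sin(c|\xi|t)/(c|\xi|)$ (that is, dropping the $(2\pi)^{-n/2}$ convention factor), $w * f$ equals $t$ times the average of $f$ over the sphere $|z - x| = ct$. The sketch below checks this numerically on the plane wave $f(z) = \cos(k\cdot z)$, for which the multiplier acts by $\sin(c|k|t)/(c|k|)$. The quadrature routine `spherical_mean` and all sample values are illustrative assumptions.

```python
import numpy as np

def spherical_mean(f, x, radius, n_u=32, n_phi=64):
    """Average of f over the sphere |z - x| = radius in R^3.

    Gauss-Legendre nodes in u = cos(theta), periodic trapezoid rule in phi.
    f must accept an array of shape (n_phi, 3) and return shape (n_phi,).
    """
    u, w = np.polynomial.legendre.leggauss(n_u)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    s = np.sqrt(1.0 - u**2)
    total = 0.0
    for ui, si, wi in zip(u, s, w):
        y = np.stack([si * np.cos(phi), si * np.sin(phi), np.full(n_phi, ui)], axis=1)
        total += wi * np.mean(f(x + radius * y))
    return 0.5 * total            # Gauss-Legendre weights sum to 2

if __name__ == "__main__":
    x = np.array([0.2, -0.1, 0.4])
    k = np.array([1.0, 0.5, -0.3])
    c, t = 1.5, 0.7
    kirchhoff = t * spherical_mean(lambda z: np.cos(z @ k), x, c * t)
    knorm = np.linalg.norm(k)
    multiplier = np.cos(x @ k) * np.sin(c * knorm * t) / (c * knorm)
    print("difference:", abs(kirchhoff - multiplier))
```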
Lemma 2.2. If $\hat f(\xi,t)$ satisfies
$$|D^{2\beta}\partial_t^l(\xi^\alpha \hat f(\xi,t))| \le C|\xi|^{|\alpha|+k-2|\beta|+2l}\,(1 + (t|\xi|^2))^m\, e^{-\nu|\xi|^2 t/2},$$
for any integers l and m, and multi-indexes α, β with |β| ≤ N , then |D α f (x, t)| ≤ CN t−
n+|α|+k +l 2
BN (|x|, t),
(2.5)
where $N$ is any fixed integer, and
$$B_N(|x|,t) = \Big(1 + \frac{|x|^2}{1+t}\Big)^{-N}. \tag{2.6}$$
Proof. For $2N < k + |\alpha| + n$, we have, by direct calculation,
$$\begin{aligned}
|x^{2\beta} D^\alpha f(x,t)| &= C\Big|\int e^{\sqrt{-1}x\xi}\, D^{2\beta}\xi^\alpha \hat f(\xi,t)\,d\xi\Big| \\
&\le C\int |\xi|^{|\alpha|+k-2|\beta|}(1 + (t|\xi|^2))^m e^{-\nu|\xi|^2 t/2}\,d\xi \\
&\le C t^{-(|\alpha|+n+k)/2}(1+t)^{|\beta|}.
\end{aligned} \tag{2.7}$$
In particular,
$$\lim_{t\to+\infty} |D^\alpha f(x,t)| = 0. \tag{2.8}$$
If $2N \ge k + |\alpha| + n$, we choose $l$ such that $2N < k + |\alpha| + n + 2l$. Similar to the proof of (2.7), we get
$$|x^{2\beta}\partial_t^l D^\alpha f(x,t)| \le C t^{-(|\alpha|+n+k)/2}(1+t)^{|\beta|-l},$$
or,
$$\pm x^{2\beta}\partial_t^l D^\alpha f(x,t) \le C t^{-(|\alpha|+n+k)/2}(1+t)^{|\beta|-l}.$$
By integrating from $t$ to $+\infty$ on both sides of the above inequality, we have from (2.8) that
$$\pm x^{2\beta}\partial_t^{l-1} D^\alpha f(x,t) \le C t^{-(|\alpha|+n+k)/2}(1+t)^{|\beta|-l+1}.$$
By induction on $l$ we establish (2.7) for this case. If $|x|^2 \le t+1$, we take $\beta = 0$; if $|x|^2 > t+1$, $|\beta| = N$, and obtain from (2.7),
$$|D^\alpha f(x,t)| \le C t^{-(|\alpha|+n+k)/2}\min\big(1, ((1+t)/|x|^2)^N\big). \tag{2.9}$$
Since
$$1 + \frac{|x|^2}{1+t} \le 2\begin{cases} 1, & |x|^2 \le t+1, \\ |x|^2/(1+t), & |x|^2 > t+1, \end{cases}$$
we have
$$\min\big(1, ((1+t)/|x|^2)^N\big) \le \frac{2^N}{(1 + (|x|^2/(1+t)))^N}.$$
Finally, (2.5) follows from (2.9) and the above inequality.
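The last step of the proof, which trades the truncated power $\min(1, ((1+t)/|x|^2)^N)$ for $2^N B_N(|x|,t)$, can be spot-checked on a grid. This is only a numerical illustration under arbitrary sample ranges; the helper names `B` and `truncated_power` are not from the paper.

```python
import numpy as np

def B(N, r, t):
    """B_N(|x|, t) from (2.6), with r = |x|."""
    return (1.0 + r**2 / (1.0 + t)) ** (-N)

def truncated_power(N, r, t):
    """min(1, ((1 + t) / |x|^2)^N), the factor appearing in (2.9); needs r > 0."""
    return np.minimum(1.0, ((1.0 + t) / r**2) ** N)

def inequality_holds(N, r, t, rel_tol=1e-12):
    """Check min(1, ((1+t)/r^2)^N) <= 2^N B_N(r, t) up to rounding."""
    return bool(np.all(truncated_power(N, r, t) <= 2.0**N * B(N, r, t) * (1.0 + rel_tol)))

if __name__ == "__main__":
    r = np.logspace(-3, 3, 400)
    for N in range(1, 9):
        for t in (0.0, 0.5, 10.0, 1.0e3):
            assert inequality_holds(N, r, t)
    print("inequality verified on the sample grid")
```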
Lemma 2.3. (1) For $2N \ge n$, we have
$$I = \int_{|y|=1} B_{2N}(|x + cty|, t)\,y^\alpha\,dS_y \le C t^{-(n-1)/2} B_N(|x| - ct, t). \tag{2.10}$$
(2) Let $\tilde E^\mu(x,t) = e^{-|x|^2/\mu t}$, then
$$I = \int_{|y|=1} \tilde E^\mu(x + cty, t)\,y^\alpha\,dS_y \le C t^{-(n-1)/2} e^{-(|x|-ct)^2/3\mu t}. \tag{2.11}$$
Proof. Let $\cos\theta = \frac{x\cdot y}{|x||y|}$; we have
$$I \le \int_{|y|=1} B_N(|x + cty|, t)\,dS_y = I_1 + I_2,$$
with
$$I_1 = \int_{|y|=1,\ \cos\theta \ge 0} B_N(|x + cty|, t)\,dS_y, \qquad I_2 = \int_{|y|=1,\ \cos\theta < 0} B_N(|x + cty|, t)\,dS_y.$$
Clearly $I_1 \le I_2$, thus we only need to estimate $I_2$. For $\cos\theta < 0$,
$$(x + cty)^2 = |x|^2 + 2ct|x||y|\cos\theta + c^2t^2|y|^2 = (|x| - ct|y|)^2 + 2c^2t^2|y|^2(1 + \cos\theta) + 2ct(|x| - ct|y|)|y|(1 + \cos\theta),$$
$$\big|2ct(|x| - ct|y|)|y|(1 + \cos\theta)\big| \le \frac{2}{3}(|x| - ct|y|)^2(1 + \cos\theta) + \frac{3}{2}c^2t^2|y|^2(1 + \cos\theta),$$
and thus, since $0 \le 1 + \cos\theta \le 1$,
$$(x + cty)^2 \ge \frac{1}{3}(|x| - ct|y|)^2 + \frac{1}{2}c^2t^2|y|^2(1 + \cos\theta). \tag{2.12}$$
Without loss of generality, we may assume that $x = (|x|, 0, \cdots, 0)$, and so $\cos\theta = y_1/|y|$. If $|y| = 1$, we have
$$(x + cty)^2 \ge \frac{1}{3}(|x| - ct)^2 + \frac{1}{2}c^2t^2(1 + y_1).$$
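Inequality (2.12) with $|y| = 1$ depends only on $r = |x|$, the product $ct$, and $\cos\theta \in [-1, 0)$, so it is easy to test on random samples. The sketch below is an illustration, not a proof; the function names and sampling ranges are arbitrary choices.

```python
import numpy as np

def lhs_212(r, ct, cos_theta):
    """|x + c t y|^2 for |x| = r, |y| = 1, and angle theta between x and y."""
    return r**2 + 2.0 * ct * r * cos_theta + ct**2

def rhs_212(r, ct, cos_theta):
    """Right-hand side of (2.12) with |y| = 1."""
    return (r - ct) ** 2 / 3.0 + 0.5 * ct**2 * (1.0 + cos_theta)

def min_gap(samples=200_000, seed=0):
    """Smallest observed value of lhs - rhs over random samples with cos(theta) < 0."""
    rng = np.random.default_rng(seed)
    r = rng.uniform(0.0, 10.0, samples)
    ct = rng.uniform(0.0, 10.0, samples)
    cos_theta = rng.uniform(-1.0, 0.0, samples)
    return float(np.min(lhs_212(r, ct, cos_theta) - rhs_212(r, ct, cos_theta)))

if __name__ == "__main__":
    print("smallest gap lhs - rhs:", min_gap())
```

Algebraically the gap equals $\frac{2}{3}a^2 + u(2ab + \frac{3}{2}b^2)$ with $a = |x| - ct$, $b = ct$, $u = 1 + \cos\theta$, which is nonnegative by the elementary inequality used in the text.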
Since $|y_1| \le 1$, we know that $\frac{1}{2}c^2t^2(1 + y_1) \ge 0$. Let
$$P^2 = (x + cty)^2/t, \qquad Q^2 = \frac{1}{3t}(|x| - ct)^2, \qquad R^2 = \frac{1}{2}c^2t(1 + y_1).$$
$P$, $Q$ and $R$ are non-negative functions for $t > 0$, and
$$(1 + Q^2)(1 + R^2) \le 1 + P^2 + P^4 \le (1 + P^2)^2.$$
Thus
$$\frac{1}{(1 + ((x + cty)^2/t))^{2N}} \le \frac{1}{(1 + ((|x| - ct)^2/(3t)))^N}\cdot\frac{1}{(1 + (c^2t(1 + y_1)/2))^N},$$
and
$$I_2 \le C\,3^N B_N(|x| - ct, t)\int_{|y|=1,\ y_1<0} \frac{dS_y}{(1 + (c^2t(1 + y_1)/2))^N}.$$
For $y_1 > -1/2$, we have
$$I_2 \le C_N B_N(|x| - ct, t)\,t^{-N} \le C_N B_N(|x| - ct, t)\,t^{-(n-1)/2}.$$
For $y_1 < -1/2$, let $y = (-\sqrt{1 - |w|^2}, w)$, $w \in R^{n-1}$, $|w| \le \sqrt{3}/2$, then
$$1 + y_1 = 1 - \sqrt{1 - |w|^2} = \frac{|w|^2}{1 + \sqrt{1 - |w|^2}} \ge \frac{|w|^2}{2}.$$
Thus
$$I_2 \le C_N B_N(|x| - ct, t)\int_{|w| \le \sqrt{3}/2}\Big(1 + \frac{c^2t|w|^2}{4}\Big)^{-N} dw \le C_N B_N(|x| - ct, t)\,t^{-(n-1)/2}.$$
This proves (1). For the proof of (2), we just need to consider
$$I_2 = \int_{|y|=1,\ \cos\theta<0} \tilde E^\mu(x + cty, t)\,dS_y.$$
Using (2.12), we have
$$I_2 \le e^{-(|x|-ct)^2/3\mu t}\int_{|y|=1,\ y_1<0} e^{-c^2t(1+y_1)/2\mu}\,dS_y.$$
As in the proof of (1), let $y = (-\sqrt{1 - |w|^2}, w)$ and we have
$$\int_{|y|=1,\ y_1<0} e^{-c^2t(1+y_1)/2\mu}\,dS_y \le C t^{-(n-1)/2}.$$
This proves (2).
Lemma 2.4. For $|y| \le M$, $t > 4M^2$, $p \ge 0$ and $N > 0$, we have
$$(1 + (|x - y| - pt)^2/t)^{-N} \le C_N(1 + (|x| - pt)^2/t)^{-N}. \tag{2.13}$$
Proof. If $||x| - pt| \le \sqrt{t}$, we have $(|x| - pt)^2/t \le 1$, and
$$\frac{1}{(1 + (|x| - pt)^2/t)^N} \ge 2^{-N}.$$
On the other hand,
$$\frac{1}{(1 + (|x - y| - pt)^2/t)^N} \le 1.$$
Thus we have
$$\frac{1}{(1 + (|x - y| - pt)^2/t)^N} \le 2^N\frac{1}{(1 + (|x| - pt)^2/t)^N}.$$
For $|x| - pt > \sqrt{t}$, $|x - y| - pt \ge |x| - |y| - pt \ge \sqrt{t} - |y| \ge 0$, and
$$\begin{aligned}
(|x - y| - pt)^2 &\ge (|x| - |y| - pt)^2 = (|x| - pt)^2 - 2|y|(|x| - pt) + |y|^2 \\
&= (|x| - pt)^2/2 + (|x| - pt - 2|y|)^2/2 - |y|^2 \ge (|x| - pt)^2/4 + (t/4 - M^2).
\end{aligned}$$
This yields (2.13) since $(t/4 - M^2) \ge 0$. For $|x| - pt < -\sqrt{t}$, $pt - |x - y| \ge pt - |x| - |y| \ge \sqrt{t} - |y| \ge 0$, and
$$(pt - |x - y|)^2 \ge (pt - |x| - |y|)^2 = (|x| - pt)^2/2 + (|x| - pt + 2|y|)^2/2 - |y|^2 \ge (|x| - pt)^2/4 + (t/4 - M^2),$$
whence we also have (2.13).
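Tracing the constants through the proof suggests that (2.13) holds with $C_N = 4^N$: the first case gives $2^N$, and in the other two cases $(|x-y| - pt)^2 \ge (|x| - pt)^2/4$ gives $4^N$. The following sketch tests that specific constant on random samples in $R^3$. The choice $C_N = 4^N$ is a reading of the proof rather than a statement from the paper, and the helper names and sampling ranges are illustrative.

```python
import numpy as np

def weight(z_norm, p, t, N):
    """(1 + (|z| - p t)^2 / t)^(-N), the weight on both sides of (2.13)."""
    return (1.0 + (z_norm - p * t) ** 2 / t) ** (-N)

def worst_ratio(samples=100_000, seed=1):
    """Largest observed lhs / (4^N rhs) of (2.13) under |y| <= M, t > 4 M^2."""
    rng = np.random.default_rng(seed)
    M = rng.uniform(0.1, 3.0, samples)
    t = 4.0 * M**2 * (1.0 + rng.uniform(1e-3, 50.0, samples))
    p = rng.uniform(0.0, 3.0, samples)
    N = rng.integers(1, 7, samples)
    # x: random direction, length spread around the cone |x| = p t.
    x = rng.normal(size=(samples, 3))
    x *= (rng.uniform(0.0, 2.0, samples) * (p * t + np.sqrt(t))
          / np.linalg.norm(x, axis=1))[:, None]
    # y: random direction, length at most M.
    y = rng.normal(size=(samples, 3))
    y *= (rng.uniform(0.0, 1.0, samples) * M / np.linalg.norm(y, axis=1))[:, None]
    lhs = weight(np.linalg.norm(x - y, axis=1), p, t, N)
    rhs = weight(np.linalg.norm(x, axis=1), p, t, N)
    return float(np.max(lhs / (4.0**N * rhs)))

if __name__ == "__main__":
    print("max of lhs / (4^N rhs):", worst_ratio())
```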
Lemma 2.5. Assume that supp fˆ(ξ) ⊂ OR = {ξ, |ξ| > R}, fˆ(ξ) ∈ L∞ ∩ C n+1 (OR ) and that |fˆ(ξ)| ≤ C0 , |Dξα (fˆ(ξ))| ≤ C0 |ξ|−|α|−1 , (|α| ≥ 1), then f (x) ∈ L1 and kf kL1 ≤ CC0 for some constant C depending only on n.
Proof. For $|\alpha| \ge 1$, we have
$$I = |x^\alpha f(x)| = (2\pi)^{-n}\Big|\int e^{\sqrt{-1}x\xi} D_\xi^\alpha \hat f(\xi)\,d\xi\Big|.$$
Taking $|\alpha| = n + 1$,
$$I \le (2\pi)^{-n}\int_{|\xi|>R} |\xi|^{-n-2}\,d\xi \le C.$$
Thus we have
$$|f(x)| \le CC_0|x|^{-(n+1)},$$
and kf (x)kL1 (|x|≥R) ≤ CC0 . On the other hand, by Holder’s inequality kf kL1 ≤ Ckxα f kLp kx−α kLq , with p = 2n, q = 2n/(2n − 1). Taking |α| = n − 1, we know that kxα f kLp ≤ CkDξα fˆkLq ≤ CC0 , kx−α kLq (|x|
3. Pointwise Estimates of Green’s Function for odd n In order to estimate the Green function G with its Fourier transform Gˆ given in ˆ Let wˆ = (2π)−n/2 (sin c|ξ|t)/(c|ξ|), wˆ t = (1.4), we first give a decomposition for G. (2π)−n/2 cos(c|ξ|t), and √
λ+ e(λ− + Hˆ 1 (ξ, t) =
√
e(λ− + Hˆ 2 (ξ, t) = λ+ e(λ+ − Hˆ 3 (ξ, t) =
√
− λ− e(λ+ − λ + − λ−
−1c|ξ|)t
√ λ+ e(λ− + −1
− e(λ+ − λ + − λ−
−1c|ξ|)t
, √
−1c|ξ|)t
+ λ− e(λ+ − λ + − λ−
−1c|ξ|)t
+ e(λ+ − λ + − λ−
−1c|ξ|)t
,
− λ− e(λ− + λ + − λ−
√
√ e(λ− + Fˆ2 (ξ, t) = −1
√
−1c|ξ|)t
√
Fˆ1 (ξ, t) =
√
−1c|ξ|)t
√
−1c|ξ|)t
, √
−1c|ξ|)t
c|ξ|,
−1c|ξ|)t
c|ξ|,
√ λ+ e(λ+ − Fˆ3 (ξ, t) = −1
√
√
−1c|ξ|)t
+ λ− e(λ− + λ + − λ−
−1c|ξ|)t
c|ξ|,
$$\hat E^\mu(\xi, t) = e^{-\mu|\xi|^2 t}.$$
In terms of these expressions, we can decompose (1.4) as follows: ˆ t) = G(ξ, ˆ ˆ √wˆ t H1 − wˆ F1 2 ˆ −c −1(wˆ t H2 − wˆ Fˆ2 )ξ
√ − −1(wˆ t Hˆ 2 − wˆ Fˆ2 )ξ τ . (wˆ t Hˆ 3 − wˆ Fˆ3 )(ξξ τ )/|ξ|2 + Eˆ ε (I − (ξξ τ )/|ξ|2 ) (3.1) This can also be written as Gˆ = Gˆ # + Gˆ ∗ with Gˆ # (ξ, t) √ wˆ t (Hˆ 1 − Eˆ ν/2 ) − wˆ Fˆ1 − −1(wˆ t Hˆ 2 − w( ˆ Fˆ2 − c−1 Eˆ ν/2 ))ξ τ , √ = −c2 −1(wˆ t Hˆ 2 − w( ˆ Fˆ2 − c−1 Eˆ ν/2 ))ξ (wˆ t (Hˆ 3 − Eˆ ν/2 ) − wˆ Fˆ3 )(ξξ τ )/|ξ|2 and Gˆ ∗ (ξ, t) =
ˆ ν/2 √ wˆ t E ν/2 ( −1cwˆ Eˆ )ξ
√
( c−1 wˆ Eˆ ν/2 )ξ τ wˆ t Eˆ ν/2 (ξξ τ )/|ξ|2 + Eˆ ε (I − (ξξ τ )/|ξ|2 ).
.
By direct calculation, we can see that G∗ is the Green’s matrix of (1.5). Sometimes we also write Gˆ = Gˆ + + Gˆ − with Gˆ + =
λ+ t √−η−2e λ t − −1c η0 e + ξ
√ − −1η0 eλ+ t ξ τ η+ eλ+ t (ξξ τ )/|ξ|2
and Gˆ − =
√
η+ e λ− t −1c2 η0 eλ− t ξ
e−ε|ξ|
2
√ − −1η0 eλ− t ξ τ , 2 t I − (η− eλ− t + e−ε|ξ| t )(ξξ τ )/|ξ|2
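where $\eta_0(\xi) = (\lambda_+(\xi) - \lambda_-(\xi))^{-1}$, $\eta_\pm(\xi) = \lambda_\pm(\xi)\eta_0(\xi)$. Before we establish the pointwise estimates for $G(x,t)$, we need the following lemmas.

Lemma 3.1. For small $|\xi|$,
$$D^\beta(\lambda_\pm \mp \sqrt{-1}\,c|\xi|) = -\frac{1}{2}\nu D^\beta(|\xi|^2) + O(|\xi|^{3-|\beta|}), \tag{3.2}$$
$$|\partial_t^l D_\xi^\beta(e^{\tau_\pm t} - e^{-\nu|\xi|^2 t/2})| \le C|\xi|^{1-|\beta|+2l}(|\xi|^2 t)(1 + |\xi|^2 t)^{|\beta|} e^{-\nu|\xi|^2 t/2}, \tag{3.3}$$
$$\frac{\sqrt{-1}\,|\xi|}{\lambda_+ - \lambda_-} = \frac{1}{2c} + O(|\xi|^2), \tag{3.4}$$
where $\tau_\pm = \lambda_\pm \mp \sqrt{-1}\,c|\xi|$.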
Proof. Let
$$\tau_\pm^\alpha(\xi) = \Big(\frac{1}{2}\big(-\nu\alpha \pm \sqrt{\nu^2\alpha^2 - 4c^2}\big) \mp \sqrt{-1}\,c\Big)|\xi|.$$
For $\alpha$ small enough, we have $\tau_\pm^\alpha(\xi) = -\frac{1}{2}\nu\alpha|\xi| + r(|\xi|, \alpha)$, where $r(|\xi|, \alpha) \sim \alpha^2$ and $r(|\xi|, \alpha)$ is 1-homogeneous in $|\xi|$. Taking $\alpha = |\xi|$, we have, for small $|\xi|$,
$$\tau_\pm = \lambda_\pm \mp \sqrt{-1}\,c|\xi| = -\frac{1}{2}\nu|\xi|^2 + O(|\xi|^3). \tag{3.5}$$
Equation (3.2) follows by partial differentiation on both sides of (3.5). We also obtain from (3.5),
$$e^{(\lambda_\pm \mp \sqrt{-1}c|\xi|)t} - e^{-\nu|\xi|^2 t/2} = e^{-\nu|\xi|^2 t/2}\,O(|\xi|^3)t.$$
Again by partial differentiation on both sides of the above formula, we obtain (3.3). From $\sqrt{-1}\big(\sqrt{\nu^2\alpha^2 - 4c^2}\big)^{-1} = (2c)^{-1} + O(\alpha^2)$, we have
$$\frac{\sqrt{-1}\,|\xi|}{\lambda_+ - \lambda_-} = \frac{\sqrt{-1}}{\sqrt{\nu^2|\xi|^2 - 4c^2}} = \frac{1}{2c} + O(|\xi|^2),$$
and Lemma 3.1 follows.

Lemma 3.2. For small $|\xi|$,
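$$|\partial_t^l D_\xi^\beta(\xi^\alpha(\hat H_1(\xi,t) - e^{-\nu|\xi|^2 t/2}))| \le C|\xi|^{|\alpha|+1-|\beta|+2l}(1 + (|\xi|^2 t))^{|\beta|+1} e^{-\nu|\xi|^2 t/2}, \tag{3.6}$$
$$|\partial_t^l D_\xi^\beta(\xi^\alpha \hat H_2(\xi,t))| \le C|\xi|^{|\alpha|-|\beta|+2l}(1 + (|\xi|^2 t))^{|\beta|+1} e^{-\nu|\xi|^2 t/2}, \tag{3.7}$$
$$|\partial_t^l D_\xi^\beta(\xi^\alpha(\hat H_3(\xi,t) - e^{-\nu|\xi|^2 t/2}))| \le C|\xi|^{|\alpha|+1-|\beta|+2l}(1 + (|\xi|^2 t))^{|\beta|+1} e^{-\nu|\xi|^2 t/2}, \tag{3.8}$$
$$|\partial_t^l D_\xi^\beta(\xi^\alpha \hat F_1(\xi,t))| \le C|\xi|^{|\alpha|+2-|\beta|+2l}(1 + (|\xi|^2 t))^{|\beta|} e^{-\nu|\xi|^2 t/2}, \tag{3.9}$$
$$|\partial_t^l D_\xi^\beta(\xi^\alpha(\hat F_2(\xi,t) - c^{-1}e^{-\nu|\xi|^2 t/2}))| \le C|\xi|^{|\alpha|+1-|\beta|+2l}(1 + (|\xi|^2 t))^{|\beta|} e^{-\nu|\xi|^2 t/2}, \tag{3.10}$$
$$|\partial_t^l D_\xi^\beta(\xi^\alpha \hat F_3(\xi,t))| \le C|\xi|^{|\alpha|+2-|\beta|+2l}(1 + (|\xi|^2 t))^{|\beta|} e^{-\nu|\xi|^2 t/2}. \tag{3.11}$$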
Proof. By (3.3) we have |∂ l Dβ (ξ α (Hˆ 1 − e−ν|ξ| t/2 ))| Pt ξ ≤ |β1 |+|β2 |+|β3 |=|β| ||ξ|2l Dβ1 ξ α (Dβ2 η+ Dβ3 eτ− t 2 −Dβ2 η− Dβ3 eτ+ t − Dβ2 +β3 e−ν|ξ| t/2 )| P 2 2 ≤ |β1 |+|β2 |+|β3 |=|β| |ξ|2l |Dβ1 ξ α (Dβ2 (η+ − η− )Dβ3 e−ν|ξ| t/2 − Dβ2 +β3 e−ν|ξ| t/2 )| 2 +|Dβ1 ξ α (Dβ2 η+ )|ξ|1−|β3 | (|ξ|2 t)(1 + |ξ|2 t)|β3 | e−ν|ξ| t/2 ) 2 −Dβ1 ξ α (Dβ2 η− )|ξ|1−|β3 | (|ξ|2 t)(1 + |ξ|2 t)|β3 | e−ν|ξ| t/2 )| . 2
From η+ − η− = 1, we get (3.6). For (3.7), we have X |ξ|2l |Dβ1 ξ α Dβ2 (λ+ − λ− )−1 Dβ3 eτ− t − eτ+ t |, |∂tl Dβ (ξ α Hˆ 2 )| ≤ |β1 |+|β2 |+|β3 |=|β|
and so (3.7) follows from (3.3) and the easily established Dβ2 (λ+ − λ− )−1 O(|ξ|−1−β2 ). The proof of (3.8) is similar to that of (3.6). For (3.9), noting that (η+ + η− ) = O(|ξ|), we have from (3.3),
=
|∂ l Dβ (|ξ|α Fˆ1 (ξ, t))| = |∂tl Dξβ (|ξ|ξ α (η1 eτ− t + η2 eτ+ t ))| Pt ξ ≤ |β1 |+|β2 |+|β3 |=|β| |ξ|2l |Dβ1 (|ξ|ξ α )(Dβ2 η− Dβ3 eτ− t + Dβ2 η+ Dβ3 eτ+ t )| 2 ≤ C|ξ||α|+2−|β|+2l (1 + (|ξ|2 t))|β|+1 e−ν|ξ| t/2 . The proof of (3.11) is similar to that of (3.9). Finally, we consider (3.10). Note first that 2 |∂tl Dβ (ξ α (Fˆ2 − c1 e−ν|ξ| /2 ))| P ≤ |β1 |+|β2 |+|β3 |=|β| |ξ|2l |Dβ1 ξ α Dβ2 2 −Dβ2 c1 Dβ3 (e−ν|ξ| /2 ) |.
√ −1|ξ| β3 τ− t λ+ −λ− D (e
+ e τ+ t )
With this, (3.10) follows from (3.3) and (3.4). This complete the proof of the lemma.
Let χ1 (ξ) =
1, |ξ| < χ3 (ξ) = 0, |ξ| > 2,
1, |ξ| > R + 1 0, |ξ| < R,
be cut-off functions, where 2 < R. Set χ2 = 1 − χ1 − χ3 and Hˆ j,i (ξ, t) = χi Hˆ j (ξ, t), Eˆ µ (ξ, t) = χi Eˆ µ (ξ, t), i
Fˆj,i (ξ, t) = χi Fˆj (ξ, t), (j = 1, 2, 3; i = 1, 2, 3).
The decay property is related to the behavior for |ξ| < , Lemma 3.3, while the smoothness property is related to that of |ξ| > R, Lemma 3.4. Lemma 3.3. For sufficiently small , ν/2
|Dα (H1,1 − χ1 (D)E1 )| ≤ Ct− |Dα H2,1 | ≤ Ct− α
|D (H3,1 −
ν/2 χ1 (D)E1 )|
|Dα F1,1 | ≤ Ct
n+|α| 2
ν/2
−
|Dα F3,1 | ≤ Ct
(3.12)
n+|α|+1 2
(3.13)
BN (|x|, t),
(3.14)
BN (|x|, t),
|Dα F2,1 − c−1 χ1 (D)E1 | ≤ Ct n+|α|+2 − 2
BN (|x|, t),
BN (|x|, t),
≤ Ct
n+|α|+2 − 2
n+|α|+1 2
n+|α|+1 − 2
(3.15)
BN (|x|, t),
(3.16)
BN (|x|, t).
(3.17)
Proof. We will only prove (3.12), since the proofs of the others are similar. First, we have X |Dβ1 χ1 Dβ2 (ξ α (Hˆ 1 − Eˆ ν/2 ))|. |Dβ (χ1 ξ α (Hˆ 1 − Eˆ ν/2 ))| ≤ |β1 |+|β2 |=|β|
Since |Dβ1 χ1 | ≤ C and |ξ|−|β2 | ≤ |ξ|−|β| , we also have from (3.6), |∂tl Dξβ (χ1 ξ α (Hˆ 1 (ξ, t) − e−ν|ξ|
2
t/2
)| ≤ C|ξ||α|+1−|β|+2l (1 + (|ξ|2 t))|β|+1 e−ν|ξ|
With the above, (3.12) follows from Lemma 2.2.
2
t/2
.
ˆ t), Gˆ #j = χj (ξ)Gˆ # (ξ, t), Gˆ ∗j = χj (ξ)Gˆ ∗ (ξ, t) and Gˆ ± = Letting Gˆ j = χj (ξ)G(ξ, j χj (ξ)Gˆ ± , we have ˆ+ Gˆ = Gˆ ∗ + Gˆ #1 + (Gˆ 2 − Gˆ ∗2 − Gˆ ∗3 ) + Gˆ − 3 + G3 , or
G = G∗ + G#1 + (G2 − G∗2 − G∗3 ) + G+3 + G− 3 .
(3.18)
Proposition 3.1. There exist positive constants C and τ , such that |Dxα G∗ (x, t)| ≤ Ct−(n+|α|)/2 (BN (|x|, t) + t−(n−1)/4 BN (|x| − ct, t)).
(3.19)
Proof. First, it is easy to see that E µ (x, t) = Ct−n/2 e−|x| where µ = ε or ν/2. Then from |Dxα E µ (x, t)| ≤ Ct−n/2
2
/(4µt)
,
(3.20)
X Y 2 Dαj (−|x|2 /(µt)) e−|x| /(4µt) , αj
j
we get Let Rˆ =
|Dxα E µ (x, t)| ≤ Ct−n/2 t−|α|/2 e−|x|
τ
ξξ |ξ|2 ,
2
/(8µt)
.
(3.21)
it is easy to see that ˆ ≤ C|ξ||α|−|β|+2l (1 + |ξ|2 t)|β|+1 e−ν|ξ|2 t/2 . |∂tl Dβ (ξ α (Eˆ µ R))|
From Lemma 2.2, we know that |Dxα (E µ ∗ R)| ≤ Ct−(n+|α|)/2 BN (|x|, t). By (2.2), (3.20) and (3.21), |Dxα wt
∗E
ν/2
|=C
X
aγ t
|γ|
|y|=1
|γ|≤(n−1)/2
By (2.11) |Dxα wt ∗ E ν/2 | ≤ C
X
Z
(3.22)
t−(n+|α|+|γ|)/2 E ν/2 (x + cty, t)dSy .
t|γ|/2 t−(n+|α|)/2 t−(n−1)/2 e−(|x|−ct)
2
/12νt
.
(3.23)
|γ|≤(n−1)/2
By (2.2), (3.22) and (2.10), we have X |Dxα wt ∗ E ν/2 ∗ R| ≤ C t|γ|/2 t−(n+|α|)/2 t−(n−1)/2 BN (|x| − ct, t). (3.24) |γ|≤(n−1)/2
By (2.1), (3.22) and (2.10), we can get X |Dxα w ∗ Dxj E ν/2 ∗ R| ≤ C
t(|γ|+1)/2 t−(n+|α|)/2 t−(n−1)/2 BN (|x| − ct, t).
|γ|≤(n−3)/2
(3.25) Thus, summing up (3.22)–(3.25) and choosing a suitable constant τ , we have |Dxα G∗ (x, t)| ≤ Ct−(n+|α|)/2 (e−τ |x|
2
/t
+ t−(n−1)/4 (e−τ (|x|−ct)
Since e−τ |x| /t ≤ CBN (|x|, t) and e−τ (|x|−ct) (3.19) holds. 2
2
/t
2
/t
+ BN (|x| − ct, t))).
≤ CBN (|x| − ct, t), we conclude that
Proposition 3.2. For sufficiently small , |Dxα G#1 (x, t)| ≤ Ct−(n+|α|)/2 t−(n+1)/4 BN (|x| − ct, t).
(3.26)
Proof. By (2.3) and Lemma 3.3, we have |wt ∗ (Dα (H1,1 − E ν/2R))| P ≤ 0≤|γ|≤(n−1)/2 aγ t|γ| |y|=1 t−(n+|α|+|γ|+1)/2 B2N (|x + cty|, t)dSy , and so from (2.10), |wt ∗ (Dα (H1,1 − E ν/2 ))| ≤ Ct−(n+|α|)/2−(n+1)/4 BN (|x| − ct, t).
(3.27)
Since |ξξ τ /|ξ|2 | ≤ C, we also have |wt ∗ (Dα (H3,1 − E ν/2 ) ∗ R)| ≤ Ct−(n+|α|)/2−(n+1)/4 BN (|x| − ct, t).
(3.28)
Similarly, we have |wt ∗ (Dα Dxj (H2,1 ))| ≤ Ct−(n+|α|)/2−(n+1)/4 BN (|x| − ct, t). By (2.1), Lemmas 3.1 and 2.4, we obtain P |w ∗ (Dα (F1,1 ))| ≤ 0≤|γ|≤(n−3)/2 aγ t|γ|+1 R t−(n+|α|+|γ|+2)/2 B2N (|x + cty|, t)dSy |y|=1 −(n+|α|)/2−(n+1)/4 ≤ Ct BN (|x| − ct, t).
(3.29)
(3.30)
Similarly, |w ∗ (Dα (F3,1 ) ∗ R)| ≤ Ct−(n+|α|)/2−(n+1)/4 BN (|x| − ct, t), ν/2
|w ∗ (Dα Dxj (F2,1 − c−1 E1 ))| ≤ Ct−(n+|α|)/2−(n+1)/4 BN (|x| − ct, t). Summing up (3.27)–(3.32), the proposition is proved.
(3.31) (3.32)
Proposition 3.3. For fixed and R, there exist positive constants b and C, so that |Dxα G2 (x, t)|, |Dxα G∗2 (x, t)|, |Dxα G∗3 (x, t)| ≤ Ct−(n+|α|)/2 e−bt BN (|x|, t).
(3.33)
Proof. Note that if |ξ| ∈ (, R + 1), we have Reλ± ≤ −2θ|ξ|2 for some positive constant θ. By Theorem 3.2 of [1], it is easy to see that 2 2 |Dξ2β ξ α Gˆ 2 | ≤ C(1+|ξ|)|α| (1+t|ξ|)2|β| e−2θ|ξ| t ≤ C(1+|ξ|)|α| (1+t|ξ|)2|β| e−2bt e−θ|ξ| t .
Then, as in the proof of Lemma 2.2, we have |Dxα G2 (x, t)| ≤ Ct−(n+|α|)/2 e−bt BN (|x|, t). By the definition of Gˆ ∗ , we also know that 2 |Dξ2β ξ α Gˆ ∗2 | + |Dξ2β ξ α Gˆ ∗3 | ≤ C(1 + |ξ|)|α| e−2bt (1 + t|ξ|)2|β| e−θ|ξ| t .
By the same method, we also obtain (3.33) for G∗2 and G∗3 and the proposition is proved.
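Now we consider $G_3^\pm$ for sufficiently large $|\xi|$. First, we denote
$$\tilde\lambda_\pm(\beta) = -\frac{1}{2}\big(\nu \mp \sqrt{\nu^2 - 4c^2\beta}\big),$$
then $|\xi|^2\tilde\lambda_\pm(|\xi|^{-2}) = \lambda_\pm(\xi)$. For $\beta$ sufficiently small, we have
$$\tilde\lambda_+(\beta) = -\frac{c^2}{\nu}\beta + \sum_{j=1}^m a_j^+\beta^{j+1} + O(\beta^{m+2}), \qquad \tilde\lambda_-(\beta) = -\nu + \frac{c^2}{\nu}\beta + \sum_{j=1}^m a_j^-\beta^{j+1} + O(\beta^{m+2}).$$
Thus
$$\lambda_+(\xi) = -\frac{c^2}{\nu} + \sum_{j=1}^m a_j^+|\xi|^{-2j} + O(|\xi|^{-2(m+1)}), \qquad \lambda_-(\xi) = -\nu|\xi|^2 + \frac{c^2}{\nu} + \sum_{j=1}^m a_j^-|\xi|^{-2j} + O(|\xi|^{-2(m+1)}).$$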
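The leading terms of these high-frequency expansions, $\lambda_+ \to -c^2/\nu$ and $\lambda_- + \nu|\xi|^2 \to c^2/\nu$, can be checked numerically. A minimal sketch follows; it assumes $|\xi| > 2c/\nu$ so that the square root is real, and it evaluates $\lambda_+$ through the product relation $\lambda_+\lambda_- = c^2|\xi|^2$ to avoid the cancellation that the direct formula suffers at large $|\xi|$. The function name and sample values are illustrative.

```python
import numpy as np

def lambdas_large_xi(r, c, nu):
    """lambda_+ and lambda_- at |xi| = r, assuming r > 2 c / nu (real roots)."""
    r2 = r * r
    disc = np.sqrt(nu**2 * r2**2 - 4.0 * c**2 * r2)
    lam_m = -0.5 * nu * r2 - 0.5 * disc
    lam_p = c**2 * r2 / lam_m        # from lambda_+ * lambda_- = c^2 |xi|^2
    return lam_p, lam_m

if __name__ == "__main__":
    c, nu = 1.2, 0.7
    for r in (10.0, 100.0, 1000.0):
        lam_p, lam_m = lambdas_large_xi(r, c, nu)
        print(r, lam_p + c**2 / nu, lam_m + nu * r * r - c**2 / nu)
```

Both printed differences should shrink like $|\xi|^{-2}$, in line with the first correction terms $a_1^\pm|\xi|^{-2}$.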
Let η± = λ± /(λ+ − λ− ), η0 = (λ+ − λ− )−1 . As above we have Pm η+ (ξ) = j=1 b+j |ξ|−2j + O(|ξ|−2(m+1) ), Pm η− (ξ) = −1 + j=1 b− |ξ|−2j + O(|ξ|−2(m+1) ), Pm 0 −2jj + O(|ξ|−2(m+1) ). η0 (ξ) = j=1 bj |ξ| Proposition 3.4. For sufficiently large R, there exist positive constants b and C, such that −(n+|α|)/2 −bt e BN (|x|, t). (3.34) |Dxα G− 3 (x, t)| ≤ Ct Proof. Since eλ− (ξ)t ≤ Ce−ν|ξ| t , by the definition of Gˆ − , we have 2
2 |α| −bt |Dξ2β ξ α Gˆ − (1 + t|ξ|2 )2|β| e−θ|ξ| t . 3 | ≤ C|ξ| e
Using the same method as in the proofs of Proposition 3.3, we can prove (3.34). For
G+3 ,
since
eλ+ (ξ)t = e−c
2
t/ν
1+(
m X
a+j |ξ|−2j )t+· · ·+
j=1
m
1 X + −2j m m ( aj |ξ| ) t +R(t)O(|ξ|−2(m+1) ) , m! j=1
where R(t) is a polynomial in t of degree not more than m + 1, we have Pm + 2 −2j η+ eλ+ t = e−c t/ν + p+m+1 (t)O(|ξ|−2(m+1) ) , j=1 pj (t)|ξ| Pm 2 −2j −2(m+1) + p− ) , η− eλ+ t = e−c t/ν − 1 + j=1 p− j (t)|ξ| m+1 (t)O(|ξ| P 2 m 0 −2j + p0m+1 (t)O(|ξ|−2(m+1) ) . η0 eλ+ t = e−c t/ν j=1 pj (t)|ξ| 0 Here p+j (t), p− j (t) and pj (t) are polynomials in t of degree no larger than j. Let − −pj (t)4−j 0 ~ L2j (t, D) = ~ τD ~ 0 −p+j (t)4−(j+1) D
and ~ = L2j−1 (t, D) ~ = (Dx , · · · , Dx ). where D 1 n
! √ ~ 0 − −1p0j (t)4−j D √ , ~τ − −1c2 p0j (t)4−j D 0
Lemma 3.4. For R sufficiently large, there exist distributions Fl (x, t) =
n−1+l X
Lj δ(x) +
j=1
10 00
2 δ(x) e−c t/ν ,
and a constant b > 0, such that |4l/2 (G+3 − χ3 (D)Fl (x, t))| ≤ Ce−2bt BN (|x|, t), l even |Dxj 4(l−1)/2 (G+3 − χ3 (D)Fl (x, t))| ≤ Ce−2bt BN (|x|, t), l odd.
(3.35)
Proof. We first consider the case when $l$ is even. By the definitions of $G_3^+$ and $F_l$, we know that
\[
|D_\xi^\beta |\xi|^l(\hat G_3^+ - \chi_3(\xi)\hat F_l(\xi,t))_{i,j}| \le Ce^{-c^2t/\nu}(1+t)^{n+l}|\xi|^{-(n+|\beta|+1)}
\]
for $i=j=1$ or $2\le i,j\le n+1$. Taking $|\beta|=0$ or $|\beta|=2N$, we have
\[
|\Delta^{l/2}(G_3^+ - \chi_3(D)F_l(x,t))_{i,j}| \le Ce^{-2bt}B_N(|x|,t).
\]
For the other cases (i.e. $i=1$ or $j=1$ but $i\ne j$), we must make a more precise estimate. In fact
\[
I_{1,2} = |\Delta^{l/2}(G_3^+ - \chi_3(D)F_l(x,t))_{1,2}| = \Big|\int e^{-\sqrt{-1}x_1\xi_1}e^{-\sqrt{-1}x'\xi'}|\xi|^lQ(\xi)\xi_1\,d\xi_1d\xi'\Big|,
\]
where $x'=(x_2,\cdots,x_n)$, $\xi'=(\xi_2,\cdots,\xi_n)$ and
\[
Q(\xi) = \chi_3(\xi)\Big(\eta_0e^{\lambda_+t} - \sum_{j=1}^{(n-1+l)/2}p_j^0(t)|\xi|^{-2j}\Big).
\]
Both $|\xi|^l$ and $Q(\xi)$ are even functions of $\xi_j$ ($j=1,\cdots,n$). Thus
\[
\begin{aligned}
I_{1,2} &= \Big|\int_{\mathbb R^{n-1}}e^{-\sqrt{-1}x'\xi'}\Big(\int_R^{\infty}+\int_{-\infty}^{-R}\Big)e^{-\sqrt{-1}x_1\xi_1}|\xi|^lQ(\xi)\xi_1\,d\xi_1d\xi'\Big|\\
&= \Big|\int_{\mathbb R^{n-1}}e^{-\sqrt{-1}x'\xi'}\int_R^{\infty}\frac{\sin(x_1\xi_1)}{\xi_1}|\xi|^lQ(\xi)\xi_1^2\,d\xi_1d\xi'\Big|.
\end{aligned}
\]
Since $Q(\xi,t)$ does not change sign for $\xi_1\in(R,\infty)$, and
\[
-Ce^{-c^2t/\nu}(1+t)^{(n+1+l)/2}|\xi|^{-(n+1)} \le |\xi|^lQ(\xi) \le Ce^{-c^2t/\nu}(1+t)^{(n+1+l)/2}|\xi|^{-(n+1)},
\]
we have
\[
|I_{1,2}| \le Ce^{-c^2t/\nu}(1+t)^{(n+l+1)/2}\Big|\int_{\mathbb R^{n-1}}e^{-\sqrt{-1}x'\xi'}\int_R^{\infty}\frac{\sin(x_1\xi_1)}{\xi_1}|\xi|^{-(n+1)}\xi_1^2\,d\xi_1d\xi'\Big|.
\]
Let $\xi'=|\xi_1|\eta'$; then $|\xi|^2=\xi_1^2(1+|\eta'|^2)$. Noting that $e^{-c^2t/\nu}(1+t)^{(n+l+1)/2}\le Ce^{-2bt}$, we have
\[
|I_{1,2}| \le Ce^{-2bt}\Big|\int_R^{\infty}\Big(\int_{\mathbb R^{n-1}}e^{-\sqrt{-1}|\xi_1|x'\eta'}(1+|\eta'|^2)^{-(n+1)/2}d\eta'\Big)\frac{\sin(x_1\xi_1)}{\xi_1}\,d\xi_1\Big|.
\]
Since
\[
\Big|\int_{\mathbb R^{n-1}}e^{-\sqrt{-1}|\xi_1|x'\eta'}(1+|\eta'|^2)^{-(n+1)/2}d\eta'\Big| \le C
\]
Pointwise Estimates of Diffusion Wave for Navier–Stokes Systems
and
\[
\Big|\int_R^{\infty}\frac{\sin(x_1\xi_1)}{\xi_1}\,d\xi_1\Big| \le C|x_1|,
\]
we have
\[
|I_{1,2}| \le Ce^{-2bt}|x_1|.
\]
By the same method, we obtain $|I_{1,j}|\le Ce^{-2bt}|x_{j-1}|$ and $|I_{i,1}|\le Ce^{-2bt}|x_{i-1}|$. Thus
\[
|\Delta^{l/2}(G_3^+ - \chi_3(D)F_l(x,t))| \le Ce^{-2bt}(1+|x|).
\tag{3.36}
\]
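On the other hand,
\[
\begin{aligned}
|x^\beta\Delta^{l/2}(G_3^+ - \chi_3(D)F_l(x,t))| &\le Ce^{-c^2t/\nu}\,\|D^\beta(|\xi|^l(\hat G_3^+ - \chi_3(\xi)\hat F_l(\xi,t)))\|_{L^1}\\
&\le Ce^{-2bt}\int_{|\xi|\ge R}|\xi|^{-(n+|\beta|)}\,d\xi.
\end{aligned}
\]
Taking $|\beta|=2N$, for $|x_j|\ge 1$, we have
\[
|\Delta^{l/2}(G_3^+ - \chi_3(D)F_l(x,t))| \le Ce^{-2bt}|x|^{-2N}.
\tag{3.37}
\]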
Based on (3.36) and (3.37), we get (3.35) for even $l$. If $l$ is odd, for $i\ne j$ and $i=1$ or $j=1$, it is easy to see that
\[
|D_{x_j}\Delta^{(l-1)/2}(G_3^+ - \chi_3(D)F_l(x,t))_{i,j}| \le Ce^{-2bt}B_N(|x|,t).
\]
But for $i=j=1$ or $2\le i,j\le n+1$, we need a more precise estimate as above. Since the proof is very similar, we omit it.

For convenience, we denote
\[
L_0 = \begin{pmatrix}1&0\\0&0\end{pmatrix}.
\]
By Lemma 3.4, it is easy to see that

Proposition 3.5. For $R$ sufficiently large, there exist distributions
\[
F_l(x,t) = \Big(\sum_{j=0}^{n-1+l}L_j\,\delta(x)\Big)e^{-c^2t/\nu},
\]
such that for $|\alpha|=l$,
\[
|D_x^\alpha(G_3^+ - \chi_3(D)F_l(x,t))| \le Ce^{-2bt}B_N(|x|,t).
\tag{3.38}
\]

Theorem 3.1. For $x\in\mathbb R^n$, $t>0$ and $|\alpha|=l$, we have
\[
|D_x^\alpha(G(x,t) - \chi_3(D)F_l(x,t))| \le C_\alpha t^{-(n+|\alpha|)/2}\big(t^{-(n-1)/4}B_N(|x|-ct,t) + B_N(|x|,t)\big).
\tag{3.39}
\]
Proof. As in (3.18), we can write
\[
G(x,t) - \chi_3(D)F_l(x,t) = G^* + G_1^{\#} + (G_2 - G_2^* - G_3^*) + G_3^- + (G_3^+ - \chi_3(D)F_l).
\]
By Propositions 3.1 to 3.5, we conclude that (3.39) holds.
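Remark. The estimate (3.39) in Theorem 3.1 can be proved directly, without resorting to $G^*$. In fact, such a proof is somewhat shorter. Here we obtain a faster decay rate for $G^{\#} = G - G^*$.

4. Decay of Solution of Non-Linear System

We denote $u = (\rho-\rho^*, m-m^*)^\tau$, $u_0 = (\rho_0-\rho^*, m_0-m^*)^\tau$ and rewrite (1.1) as
\[
\begin{cases}
\rho_t + \operatorname{div} m = 0,\\
m_t + c^2\nabla\rho = \varepsilon\Delta m + \eta\nabla\operatorname{div} m + Q,
\end{cases}
\tag{4.1}
\]
where $Q = (Q_1,\ldots,Q_n)^\tau$, and
\[
Q_j(x,t) = c^2\partial_{x_j}\rho - \operatorname{div}\Big(\frac{m\,m_j}{\rho}\Big) - P(\rho)_{x_j} + \varepsilon\Delta\frac{m_j}{\rho} + \eta\Big(\operatorname{div}\frac{m}{\rho}\Big)_{x_j} - \varepsilon\Delta m_j - \eta\nabla\operatorname{div} m_j.
\]
For $|u|$ small enough, we may write
\[
Q = \sum_j D_{x_j}Q_j = \sum_j D_{x_j}\Big(q_j + \sum_l D_{x_l}q_{j,l}\Big),
\]
where $q_j = O(|u|^2)$, $q_{j,l} = O(|u|^2)$. We consider the Cauchy problem
\[
\begin{cases}
\partial_tu + A(D_x)u + B(D_x)u = Q,\\
u|_{t=0} = u_0,
\end{cases}
\tag{4.2}
\]
where the symbols of $A$ and $B$ are
\[
A(\xi) = -\sqrt{-1}\begin{pmatrix}0&\xi\\ c^2\xi^\tau&0\end{pmatrix},\qquad
B(\xi) = \begin{pmatrix}0&0\\ 0&\varepsilon\xi\xi^\tau+\eta\xi^\tau\xi\end{pmatrix}.
\]
As in [3] and [1], we have

Theorem 4.1. Suppose that $u_0\in H^{s+l}(\mathbb R^n)$, $s=[n/2]+1$, $l$ is a nonnegative integer, and that $\|u_0\|_{H^{s+l}}$ is sufficiently small. Then there exists a unique, global, classical solution $u\in H^{s+l}$ of (1.1), satisfying
\[
\left.\begin{array}{ll}
\|D_x^\alpha u\|_{L^2}(t), & 0\le|\alpha|\le s+l\\[2pt]
\big(\int_0^\infty\|D_x^\alpha u\|_{L^2}^2(t)\,dt\big)^{1/2}, & 1\le|\alpha|\le s+l\\[2pt]
\|D_x^\alpha u\|_{L^\infty}, & 0\le|\alpha|\le l
\end{array}\right\} = O(1)\|u_0\|_{H^{s+l}}.
\]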
Let $E \equiv \max\{\|u_0\|_{H^{s+l}}, \|u_0\|_{W^{1,l}}\}$. By Theorem 4.1 we have $\|u_0\|_{W^{\infty,l}}\le CE$. Using interpolation we know that $\|u_0\|_{W^{p,l}}\le CE$ ($1\le p\le\infty$). Now we give a pointwise estimate for the solution $u$ of (4.1). Applying $D_x^\alpha$ to (4.1) and using Duhamel's principle, we obtain
\[
D_x^\alpha u = D_x^\alpha G(t)*u_0 + \int_0^t G(t-s)*D_x^\alpha Q(s)\,ds = R_1^\alpha + R_2^\alpha.
\tag{4.3}
\]
We first consider $R_1^\alpha$. By Theorem 3.1,
\[
\begin{aligned}
R_1^\alpha &= D_x^\alpha(G-\chi_3(D)F_{|\alpha|})*u_0 + D_x^\alpha\chi_3(D)(F_{|\alpha|}*u_0)\\
&\le C_\alpha t^{-(n+|\alpha|)/2}\int_{\mathbb R^n}\big(B_N(|x-y|,t) + t^{-(n-1)/4}B_N(|x-y|-ct,t)\big)|u_0(y)|\,dy\\
&\quad + \chi_3(D)\Big(\sum_{j=0}^{n-1+|\alpha|}L_jD^\alpha u_0\Big)e^{-c^2t/\mu}\\
&= R_{1,1}^\alpha + R_{1,2}^\alpha.
\end{aligned}
\]
Using Lemma 2.4 and assuming that $\operatorname{supp}u_0(y)\subset\{|y|\le M\}$, we have
\[
R_{1,1}^\alpha \le C_\alpha t^{-(n+|\alpha|)/2}\big(B_N(|x|,t) + t^{-(n-1)/4}B_N(|x|-ct,t)\big)\|u_0\|_{L^1}.
\]
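Denote the symbol of the operator $L_j(t,D)$ by $l_j(t,\xi)$. Then
\[
l_{2j}(t,\xi) = \begin{pmatrix}-p_j^-(t)|\xi|^{-2j}&0\\ 0&-p_j^+(t)|\xi|^{-2(j+1)}\xi^\tau\xi\end{pmatrix},\quad
l_{2j-1}(t,\xi) = \begin{pmatrix}0&-\sqrt{-1}\,p_j^0(t)|\xi|^{-2j}\xi\\ -\sqrt{-1}\,c^2p_j^0(t)|\xi|^{-2j}\xi^\tau&0\end{pmatrix},\quad
l_0(t,\xi) = \begin{pmatrix}1&0\\0&0\end{pmatrix}.
\]
Since
\[
\begin{aligned}
|x^\beta L_j(t,D)u_0| &= (2\pi)^{-n}\Big|x^\beta\int e^{\sqrt{-1}x\xi}l_j(t,\xi)\hat u_0\,d\xi\Big|\\
&= (2\pi)^{-n}\Big|\int e^{\sqrt{-1}x\xi}D_\xi^\beta\big(l_j(t,\xi)\hat u_0\big)\,d\xi\Big| \le C\int\big|D_\xi^\beta\big(l_j(t,\xi)\hat u_0\big)\big|\,d\xi \le C,
\end{aligned}
\]
taking $|\beta|=0$ and $|\beta|=2N$, we get $|L_j(t,D)u_0(x)|\le C(1+|x|)^{-2N}$. Therefore
\[
\|R_{1,2}^\alpha\|_{L^\infty} \le C(1+|x|)^{-2N}e^{-c^2t/(2\mu)} \le C(1+t)^{-(n+|\alpha|)/2}B_N(|x|,t).
\]
Thus
\[
R_1^\alpha \le C_\alpha t^{-(n+|\alpha|)/2}\big(B_N(|x|,t) + t^{-(n-1)/4}B_N(|x|-ct,t)\big)\|u_0\|_{L^1}.
\tag{4.4}
\]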
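Next, we consider $R_2^\alpha$ and rewrite it as follows:
\[
\begin{aligned}
R_2^\alpha &= \int_0^t\chi_3(D)F_{|\alpha|}(t-s)*D_x^\alpha Q(u(s))\,ds + \int_0^t(G-\chi_3(D)F_{|\alpha|})(t-s)*D_x^\alpha Q(u(s))\,ds\\
&= R_{2,1}^\alpha + R_{2,2}^\alpha.
\end{aligned}
\]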
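Set
\[
\begin{aligned}
\varphi_\alpha(x,t) &= (1+t)^{n/2+\nu(|\alpha|)}\Big(t^{-(n-1)/4}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n/2} + \Big(1+\frac{|x|^2}{1+t}\Big)^{-n/2}\Big)^{-1},\\
M(t) &= \sup_{0\le\tau\le t,\ |\alpha|\le l}\ \max_{x\in\mathbb R^n}|D_x^\alpha u(x,\tau)|\,\varphi_\alpha(x,\tau),\\
\nu(k) &= \begin{cases}\tfrac12k, & k\le l-2,\\ 0, & k>l-2.\end{cases}
\end{aligned}
\]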
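As a purely illustrative aid, the weight $\varphi_\alpha$ and the exponent $\nu(k)$ just defined can be evaluated numerically. The sketch below assumes $B_N(r,t) = (1 + r^2/(1+t))^{-N}$, which is the form suggested by the definition of $\varphi_\alpha$ but is not restated in this section; the function names and the sample values of $n$, $c$, $l$ are hypothetical.

```python
def nu(k, l):
    """nu(k) = k/2 for k <= l-2 and 0 otherwise, as in the definition above."""
    return 0.5 * k if k <= l - 2 else 0.0


def B(N, r, t):
    """Assumed decay factor B_N(r, t) = (1 + r^2/(1+t))^(-N)."""
    return (1.0 + r * r / (1.0 + t)) ** (-N)


def profile(x_abs, t, n, c):
    """t^{-(n-1)/4} B_{n/2}(|x|-ct, t) + B_{n/2}(|x|, t), for t > 0."""
    wave = t ** (-(n - 1) / 4.0) * B(n / 2.0, x_abs - c * t, t)
    heat = B(n / 2.0, x_abs, t)
    return wave + heat


def phi(x_abs, t, n, c, alpha_len, l):
    """Weight phi_alpha(x, t) = (1+t)^{n/2 + nu(|alpha|)} / profile(x, t)."""
    return (1.0 + t) ** (n / 2.0 + nu(alpha_len, l)) / profile(x_abs, t, n, c)


if __name__ == "__main__":
    n, c, l = 3, 1.0, 5  # hypothetical sample values
    for x_abs in (0.0, 5.0, 10.0, 20.0):
        print(x_abs, phi(x_abs, 10.0, n, c, 1, l))
```

With this notation, the definition of $M(t)$ says that $|D_x^\alpha u(x,\tau)| \le M(t)/\varphi_\alpha(x,\tau)$ for $0\le\tau\le t$, so the weight is smallest near the origin and near the wave cone $|x| = ct$.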
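Proposition 4.1. If $|\alpha|\le l$,
\[
|R_{2,1}^\alpha| \le C(EM + M^{|\alpha|})(1+t)^{-(n/2+\nu(|\alpha|))}\big(t^{-(n-1)/4}B_{n/2}(|x|-ct,t) + B_{n/2}(|x|,t)\big).
\tag{4.5}
\]

Proof. By the definition of $F_{|\alpha|}$, we have
\[
|R_{2,1}^\alpha| \le C\sum_{j=0}^{n+|\alpha|-1}\int_0^t(t-s)^j\|\chi_3\Delta^{-j/2}D^\alpha Q(u(s))\|_{L^\infty}e^{-c^2(t-s)/\nu}\,ds.
\]
Letting $\hat P_j = \chi_3(\xi)|\xi|^{-j}$, we can write
\[
|R_{2,1}^\alpha| \le C\sum_{j=0}^{n+|\alpha|-1}\int_0^t(t-s)^j\|P_j*D^\alpha Q(u(s))\|_{L^\infty}e^{-c^2(t-s)/\nu}\,ds.
\]
By Lemma 2.5, we have
\[
|R_{2,1}^\alpha| \le C\sum_{j=0}^{n+|\alpha|-1}\int_0^t\|D^\alpha Q(u(s))\|_{L^\infty}\,ds.
\]
Since
\[
|D_x^\alpha Q(u(x))| \le C\sum_{\sum|\alpha_j|\le|\alpha|+2}|D^{\alpha_1}u|\,|D^{\alpha_2}u|\prod_{j\ge3}|D^{\alpha_j}u|,
\]
by the definition of $M$, we get (4.5).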
Proposition 4.2. Suppose that
\[
|D_x^\alpha H(x,t)| \le C(1+t)^{-(n+|\alpha|+1)/2}\big(B_N(|x|,t) + t^{-(n-1)/4}B_N(|x|-ct,t)\big)
\]
and
\[
|D_x^\alpha S(x,t)| \le C(1+t)^{-(2n+|\alpha|)/2}\big(B_n(|x|,t) + t^{-(n-1)/2}B_n(|x|-ct,t)\big).
\]
Then, for $N$ sufficiently large,
\[
\Big|D_x^\alpha\Big(\int_0^tH(t-s)*S(s)\,ds\Big)\Big| \le C(1+t)^{-(n+|\alpha|)/2}\big(B_{n/2}(|x|,t) + t^{-(n-1)/4}B_{n/2}(|x|-ct,t)\big).
\]
(The proof of these crucial estimates on the convolution of forward and backward viscous wave cones will be given in Sect. 5.)
Proposition 4.3. For $|\alpha|\le l$ and $l\ge n$,
\[
|R_{2,2}^\alpha| \le C(M^2 + M^{|\alpha|})(1+t)^{-(n/2+\nu(|\alpha|))}\big(t^{-(n-1)/4}B_{n/2}(|x|-ct,t) + B_{n/2}(|x|,t)\big).
\tag{4.6}
\]

Proof. For $|\alpha|\le l-2$, write
\[
R_{2,2}^\alpha = \int_0^t\sum_{j=1}^n\Big(D^\alpha\big(D_{x_j}(G-\chi_3(D)F_{|\alpha|})(t-s)*q_j(s)\big) + \sum_{i=1}^nD^\alpha\big(D_{x_j}D_{x_i}(G-\chi_3(D)F_{|\alpha|})(t-s)*q_{j,i}(s)\big)\Big)ds.
\]
By Theorem 3.1,
\[
|D_x^\alpha D_{x_j}(G-\chi_3(D)F_{|\alpha|})| \le C(1+t)^{-(n+|\alpha|+1)/2}\big(t^{-(n-1)/4}B_N(|x|-ct,t) + B_N(|x|,t)\big).
\]
From $q_j = O(u^2)$, we have
\[
\begin{aligned}
|D_x^\alpha q_j| &\le C\sum_{\sum|\alpha_j|\le|\alpha|}|D^{\alpha_1}u|\,|D^{\alpha_2}u|\prod_{j\ge3}|D^{\alpha_j}u|\\
&\le C(M^2+M^{|\alpha|})(1+t)^{-(2n+|\alpha|)/2}\big(t^{-(n-1)/2}B_n(|x|-ct,t) + B_N(|x|,t)\big).
\end{aligned}
\]
We have a similar estimate for $D_{x_i}D_{x_j}(G-\chi_3(D)F_{|\alpha|})$ and $q_{j,i}$. Thus Proposition 4.2 shows that (4.6) is valid for $|\alpha|\le l-2$. For $|\alpha|=l-1$ or $|\alpha|=l$, we rewrite
\[
R_{2,2}^\alpha = \int_0^t\sum_{j=1}^n\Big(\big(D^\beta D_{x_j}(G-\chi_3(D)F_{|\alpha|})(t-s)\big)*\big(D^{\tilde\alpha}q_j(s)\big) + \sum_{i=1}^n\big(D^\beta D_{x_j}D_{x_i}(G-\chi_3(D)F_{|\alpha|})(t-s)\big)*\big(D^{\tilde\alpha}q_{j,i}(s)\big)\Big)ds.
\]
Here $|\tilde\alpha|=l-2$ and $|\beta|=|\alpha|-|\tilde\alpha|$. In Proposition 4.2, we replace $H$ by $D^\beta D_{x_j}(G-\chi_3(D)F_{|\alpha|})$ or $D^\beta D_{x_j}D_{x_i}(G-\chi_3(D)F_{|\alpha|})$, and $S$ by $D^{\tilde\alpha}q_j$ or $D^{\tilde\alpha}q_{j,i}$. This yields (4.6) for $l-2\le|\alpha|\le l$.

Combining (4.3), (4.4), (4.5) and (4.6), we get
\[
|D_x^\alpha u(x,t)| \le C(E+M^2+M^{|\alpha|})(1+t)^{-(n/2+\nu(|\alpha|))}\big(t^{-(n-1)/4}B_{n/2}(|x|-ct,t) + B_{n/2}(|x|,t)\big).
\]
Thus we have
\[
M(t) \le C\big(E + M^2(t) + M^{|\alpha|}(t)\big).
\]
Taking $E$ small enough and using continuity of $M(t)$ and induction, we conclude that $M(t)\le CE$ and, for $|\alpha|\le l-2$,
\[
|D_x^\alpha u(x,t)| \le CE(1+t)^{-(n+|\alpha|)/2}\big(t^{-(n-1)/4}B_{n/2}(|x|-ct,t) + B_{n/2}(|x|,t)\big).
\tag{4.7}
\]
Thus we have proved the main result of this paper:

Theorem 4.2. Suppose that $u_0\in H^{s+l}(\mathbb R^n)$, $s>[n/2]+1$, $l>n+1$, that $E$ is small enough and $|\alpha|\le l-2$. Then the solution $u(x,t)$ of (1.1) satisfies
\[
|D_x^\alpha u(x,t)| \le C(1+t)^{-(n+|\alpha|)/2}\big((1+t)^{-(n-1)/4}B_{n/2}(|x|-ct,t) + B_{n/2}(|x|,t)\big).
\tag{4.8}
\]

Remark. The solution of the nonlinear system (1.1) has the decay factor $B_{n/2}$, which depends on the space dimension $n$. On the other hand, the Green function has the decay factor $B_N$ of arbitrary order $N>0$ (Theorem 3.1). In other words, due to the effect of nonlinearity, the solution exhibits a much weaker form of Huygens' principle.
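The continuity argument that turns $M(t) \le C(E + M^2(t) + M^{|\alpha|}(t))$ into $M(t)\le CE$ can be illustrated numerically. The sketch below is a simplification, not the paper's argument: it keeps only the quadratic term (the $M^{|\alpha|}$ term is dropped as an assumption), and the sample values of $C$ and $E$ are hypothetical. For $4C^2E<1$ the inequality $M\le C(E+M^2)$ holds only on $[0,M_-]\cup[M_+,\infty)$, so a continuous $M(t)$ that starts below $M_-$ cannot cross the gap and stays below $M_-\le 2CE$.

```python
import math


def bootstrap_bound(C, E):
    """Smaller root M_- of M = C*(E + M^2); requires 4*C^2*E < 1."""
    disc = 1.0 - 4.0 * C * C * E
    if disc <= 0.0:
        raise ValueError("E is not small enough: need 4*C^2*E < 1")
    # Cancellation-free form of (1 - sqrt(disc)) / (2C).
    return 2.0 * C * E / (1.0 + math.sqrt(disc))


def admissible(M, C, E):
    """True when M satisfies the a priori inequality M <= C*(E + M^2)."""
    return M <= C * (E + M * M)


if __name__ == "__main__":
    C, E = 3.0, 0.01  # hypothetical constants
    m_minus = bootstrap_bound(C, E)
    print("M_- =", m_minus, " 2CE =", 2.0 * C * E)
    for M in (0.5 * m_minus, 0.1, 0.4):
        print(M, admissible(M, C, E))
```

Values of $M$ strictly between the two roots violate the inequality, which is exactly what excludes a jump from the small branch to the large one.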
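5. The Proof of Proposition 4.2

In this section, we always denote $\Omega = [0,t]\times\mathbb R^n$,
\[
\begin{aligned}
\theta_1(t,s) &= (1+(t-s))^{-(3n+1)/4}(1+s)^{-(3n-1)/2}, & \theta_2(t,s) &= (1+(t-s))^{-(n+1)/2}(1+s)^{-(3n-1)/2},\\
\theta_3(t,s) &= (1+(t-s))^{-(3n+1)/4}(1+s)^{-n}, & \theta_4(t,s) &= (1+(t-s))^{-(n+1)/2}(1+s)^{-n},
\end{aligned}
\]
and
\[
\begin{aligned}
P_1(t,s,x,y) &= \Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-N}\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n},\\
P_2(t,s,x,y) &= \Big(1+\frac{|y-x|^2}{1+t-s}\Big)^{-N}\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n},\\
P_3(t,s,x,y) &= \Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-N}\Big(1+\frac{|y|^2}{1+s}\Big)^{-n},\\
P_4(t,s,x,y) &= \Big(1+\frac{|y-x|^2}{1+t-s}\Big)^{-N}\Big(1+\frac{|y|^2}{1+s}\Big)^{-n}.
\end{aligned}
\]
Set
\[
\bigtriangledown(t,x) = \Big|D_x^\alpha\Big(\int_0^tH(t-s)*S(s)\,ds\Big)\Big|,
\tag{5.1}
\]
and
\[
\bigtriangledown_j(t,x) = \int_\Omega\theta_j(t,s)P_j(t,s,x,y)\,ds\,dy.
\tag{5.2}
\]
By the hypothesis of Proposition 4.2,
\[
\bigtriangledown(t,x) \le C(1+t)^{-|\alpha|/2}\big(\bigtriangledown_1(t,x)+\bigtriangledown_2(t,x)+\bigtriangledown_3(t,x)+\bigtriangledown_4(t,x)\big).
\tag{5.3}
\]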
Evidently, Proposition 4.2 will follow from the following estimates:
\[
\bigtriangledown_j \le C(1+t)^{-n/2}\Big((1+t)^{-(n-1)/4}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n/2} + \Big(1+\frac{|x|^2}{1+t}\Big)^{-n/2}\Big),
\tag{5.4}
\]
for $j=1,2,3,4$. We now prove some lemmas which will be used in establishing (5.4). The proof of the first lemma is straightforward.

Lemma 5.1. (1) For $\tau\in[0,t]$ and $A^2\ge t$, we have
\[
\Big(1+\frac{A^2}{1+\tau}\Big)^{-n} \le 3^n\Big(\frac{1+\tau}{1+t}\Big)^{n}\Big(1+\frac{A^2}{1+t}\Big)^{-n};
\tag{5.5}
\]
(2) for $A^2\le t$,
\[
1 \le 2^n\Big(1+\frac{A^2}{1+t}\Big)^{-n};
\tag{5.6}
\]
(3) for any real functions $A$, $B$,
\[
\frac{A^2}{1+t-s} + \frac{B^2}{1+s} \ge \frac{(A+B)^2}{2+t};
\tag{5.7}
\]
(4) for $A_j>0$ ($j=1,2,3$) and $A_1+A_2>A_3$,
\[
(1+A_1)^{-n}(1+A_2)^{-n} \le (1+A_3)^{-n}.
\tag{5.8}
\]
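The four elementary inequalities of Lemma 5.1 are easy to sanity-check on random samples. The sketch below is only a numerical spot check under the reading of (5.5)–(5.8) given above, not a proof; the sampling ranges, the tolerance and the function name are arbitrary choices. Inequality (5.7) is the Cauchy–Schwarz inequality in the form $a^2/p + b^2/q \ge (a+b)^2/(p+q)$ with $p+q = 2+t$.

```python
import random


def check_lemma_5_1(trials=10000, n=3, seed=0):
    """Return True if (5.5)-(5.8) hold on all random samples."""
    rng = random.Random(seed)
    tol = 1e-12
    for _ in range(trials):
        t = rng.uniform(0.0, 50.0)
        tau = rng.uniform(0.0, t)
        s = rng.uniform(0.0, t)

        # (5.5): tau in [0, t], A^2 >= t
        A2 = t + rng.uniform(0.0, 100.0)
        lhs = (1.0 + A2 / (1.0 + tau)) ** (-n)
        rhs = 3.0 ** n * ((1.0 + tau) / (1.0 + t)) ** n * (1.0 + A2 / (1.0 + t)) ** (-n)
        if lhs > rhs * (1.0 + tol):
            return False

        # (5.6): A^2 <= t
        A2 = rng.uniform(0.0, t)
        if 1.0 > 2.0 ** n * (1.0 + A2 / (1.0 + t)) ** (-n) * (1.0 + tol):
            return False

        # (5.7): arbitrary real A, B and 0 <= s <= t
        A = rng.uniform(-10.0, 10.0)
        Bv = rng.uniform(-10.0, 10.0)
        lhs = A * A / (1.0 + t - s) + Bv * Bv / (1.0 + s)
        rhs = (A + Bv) ** 2 / (2.0 + t)
        if lhs < rhs - 1e-9:
            return False

        # (5.8): A_j > 0 and A_3 not exceeding A_1 + A_2
        A1 = rng.uniform(1e-6, 10.0)
        A2p = rng.uniform(1e-6, 10.0)
        A3 = rng.uniform(1e-6, A1 + A2p)
        if (1.0 + A1) ** (-n) * (1.0 + A2p) ** (-n) > (1.0 + A3) ** (-n) * (1.0 + tol):
            return False
    return True


if __name__ == "__main__":
    print(check_lemma_5_1())
```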
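Lemma 5.2. For any $a=a(t,s,x)\ge0$, $b=b(t,s,x)\ge0$,
\[
\int_{\mathbb R^n}\Big(1+\frac{(|y|-a)^2}{1+b}\Big)^{-n}dy \le C\big((1+b)^{n/2} + (1+b)^{1/2}|a|^{n-1}\big).
\tag{5.9}
\]

Proof. Denote the left-hand side of (5.9) by $I$. Passing to polar coordinates and writing $r=|y|/\sqrt{1+b}$, $\bar a=a/\sqrt{1+b}$, we have
\[
\begin{aligned}
I &\le C(1+b)^{n/2}\int_0^\infty\big(1+(r-\bar a)^2\big)^{-n}r^{n-1}\,dr\\
&\le C(1+b)^{n/2}\int_0^\infty\big(1+(r-\bar a)^2\big)^{-n}\big(|r-\bar a|^{n-1}+\bar a^{\,n-1}\big)\,dr\\
&\le C(1+b)^{n/2}\Big(\int_0^\infty\big(1+(r-\bar a)^2\big)^{-(n+1)/2}dr + \bar a^{\,n-1}\int_0^\infty\big(1+(r-\bar a)^2\big)^{-n}dr\Big).
\end{aligned}
\]
Thus we get
\[
I \le C\big((1+b)^{n/2} + (1+b)^{1/2}|a|^{n-1}\big)
\]
and the lemma is proved.

Lemma 5.3. For $x=(|x|,0,\cdots,0)$,
\[
\Big|\int_\Omega\Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-n}\Big(1+\frac{(|y-x|+|y|-ct\pm2c\tau)^2}{4(1+t)}\Big)^{-n}d\tau\,dy\Big| \le C(1+t)^n.
\tag{5.10}
\]

Proof. Set
\[
\begin{aligned}
y_1 &= \tfrac r2\cos\varphi_1 + \tfrac{|x|}2,\\
y_2 &= \tilde r\sin\varphi_1\cos\varphi_2,\\
&\ \,\vdots\\
y_{n-1} &= \tilde r\sin\varphi_1\sin\varphi_2\cdots\sin\varphi_{n-2}\cos\varphi_{n-1},\\
y_n &= \tilde r\sin\varphi_1\sin\varphi_2\cdots\sin\varphi_{n-2}\sin\varphi_{n-1},
\end{aligned}
\]
where $\tilde r=\tfrac12\sqrt{r^2-|x|^2}$. It is easy to see that $r=|y-x|+|y|$. Consider the transformation $T:(\tau,y_1,y_2,\cdots,y_n)\to(\tau,r,\varphi_1,\cdots,\varphi_{n-1})$, which maps $\Omega$ onto $\Omega'$, where
\[
\Omega' = \big\{(\tau,r,\varphi_1,\cdots,\varphi_{n-1});\ 0\le\tau\le t,\ |x|\le r<\infty,\ 0\le\varphi_j\le\pi\ (j=1,\cdots,n-2),\ 0\le\varphi_{n-1}\le2\pi\big\}.
\]
It is easy to see that $J=\big|\frac{\partial(\tau,y_1,y_2,\cdots,y_n)}{\partial(\tau,r,\varphi_1,\cdots,\varphi_{n-1})}\big|\le Cr^{n-1}$. Denoting the left-hand side of (5.10) by $I$, one has
\[
I \le C\int_0^t\int_{|x|}^{+\infty}\Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-n}\Big(1+\frac{(r-ct\pm2c\tau)^2}{4(1+t)}\Big)^{-n}r^{n-1}\,dr\,d\tau.
\]
Since
\[
\begin{aligned}
\int_{|x|}^{+\infty}\Big(1+\frac{(r-ct\pm2c\tau)^2}{4(1+t)}\Big)^{-n}r^{n-1}\,dr
&\le C(1+t)^{n/2}\int_0^{+\infty}\Big(1+\Big(\frac{r-ct\pm2c\tau}{2\sqrt{1+t}}\Big)^2\Big)^{-n}\Big(\Big|\frac{r-ct\pm2c\tau}{2\sqrt{1+t}}\Big|^{n-1}+\Big|\frac{ct\mp2c\tau}{2\sqrt{1+t}}\Big|^{n-1}\Big)\,d\frac{r}{\sqrt{1+t}}\\
&\le C(1+t)^{n/2}\Big(1+\Big|\frac{ct\mp2c\tau}{2\sqrt{1+t}}\Big|^{n-1}\Big),
\end{aligned}
\]
we know that
\[
\begin{aligned}
I &\le C(1+t)^{(n+1)/2}\int_0^{+\infty}\Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-n}\Big(1+\Big|\frac{ct\mp2c\tau}{2\sqrt{1+t}}\Big|^{n-1}\Big)\,d\frac{\tau}{\sqrt{1+t}}\\
&\le C(1+t)^{(n+1)/2}\Big(1+\Big(\frac{ct}{\sqrt{1+t}}\Big)^{n-1}\Big) \le C(1+t)^n.
\end{aligned}
\]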
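Lemma 5.4. For $0\le s\le t$, $|x|\le ct$, $N\ge n$, we have
\[
\int_\Omega\Big(1+\frac{(|y-x|-c(t-s))^2}{1+t}\Big)^{-N}\Big(1+\frac{(|y|-cs)^2}{1+t}\Big)^{-n}ds\,dy \le C(1+t)^{(n+1)/2}\Big(1+\frac{(ct-|x|)^2}{1+t}\Big)^{(n-1)/2}.
\tag{5.11}
\]

Proof. Without loss of generality, we can assume that $x=(|x|,0,\cdots,0)$. Set $\tau=(s-\tfrac t2)+\frac{|y-x|-|y|}{2c}$, $r=|y-x|+|y|$ and denote
\[
\begin{aligned}
\Omega_1 &= \{(s,y)\in\Omega;\ \tau\ge0,\ r\le ct\}, & \Omega_2 &= \{(s,y)\in\Omega;\ \tau\ge0,\ r\ge ct\},\\
\Omega_3 &= \{(s,y)\in\Omega;\ \tau\le0,\ r\le ct\}, & \Omega_4 &= \{(s,y)\in\Omega;\ \tau\le0,\ r\ge ct\}.
\end{aligned}
\]
If $(s,y)\in\Omega_1$, we have
\[
\begin{aligned}
cs-|y| &= cs-\frac{|y-x|+|y|}{2}+\frac{|y-x|-|y|}{2} \ge cs-\frac{ct}{2}+\frac{|y-x|-|y|}{2} = c\tau,\\
|y-x|-c(t-s) &= c\tau-\frac{ct}{2}+\frac{|y-x|+|y|}{2} = c\tau-\frac{ct}{2}+\frac r2.
\end{aligned}
\]
Setting
\[
F(t,s,x,y) = \Big(1+\frac{(|y-x|-c(t-s))^2}{1+t}\Big)^{-N}\Big(1+\frac{(|y|-cs)^2}{1+t}\Big)^{-n},
\]
we have
\[
F|_{\Omega_1} \le \Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-n}\Big(1+\frac{(r-ct+2c\tau)^2}{4(1+t)}\Big)^{-N}.
\]
Similarly, we also have
\[
F|_{\Omega_4} \le \Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-n}\Big(1+\frac{(r-ct+2c\tau)^2}{4(1+t)}\Big)^{-N},
\]
and
\[
F|_{\Omega_j} \le \Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-N}\Big(1+\frac{(r-ct-2c\tau)^2}{4(1+t)}\Big)^{-n}
\]
for $j=2,3$. If $N\ge n$, we always have
\[
\int_{\Omega_j}F\,ds\,dy \le \int_\Omega\Big(1+\frac{(2c\tau)^2}{4(1+t)}\Big)^{-n}\Big(1+\frac{(|y-x|+|y|-ct\pm2c\tau)^2}{4(1+t)}\Big)^{-n}d\tau\,dy
\]
for $j=1,2,3,4$. Thus the lemma follows from Lemma 5.3.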
Lemma 5.5. For $(|x|-ct)^2>t$,
\[
\int_\Omega\theta_1(t,s)R_j(t,x,s,y)\,ds\,dy \le C(1+t)^{-(3n-1)/4}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n},
\tag{5.12}
\]
where
\[
\begin{aligned}
R_1(t,x,s,y) &= \Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-n}\Big(1+\frac{(|x|-ct)^2}{1+s}\Big)^{-n},\\
R_2(t,x,s,y) &= \Big(1+\frac{(|x|-ct)^2}{1+t-s}\Big)^{-n}\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n}.
\end{aligned}
\]

Proof. First, we write
\[
\int_\Omega\theta_1(t,s)R_j(t,x,s,y)\,ds\,dy = \int_{\Omega_1}\theta_1(t,s)R_j(t,x,s,y)\,ds\,dy + \int_{\Omega_2}\theta_1(t,s)R_j(t,x,s,y)\,ds\,dy \equiv I_{j,1}+I_{j,2},
\]
where $\Omega_1=\Omega\cap\{s\ge t/2\}$, $\Omega_2=\Omega\cap\{s\le t/2\}$. From Lemma 5.2,
\[
\int_{\mathbb R^n}\Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-n}dy \le C(1+t-s)^{(2n-1)/2}.
\tag{5.13}
\]
Next we estimate $I_{1,l}$ ($l=1,2$). Note that
\[
\theta_1(t,s) \le \begin{cases}C(1+t)^{-(3n-1)/2}(1+(t-s))^{-(3n+1)/4}, & \text{if } s\ge t/2,\\ C(1+t)^{-(3n+1)/4}(1+s)^{-(3n-1)/2}, & \text{if } s\le t/2.\end{cases}
\tag{5.14}
\]
By (5.13) and (5.14), we get
\[
\begin{aligned}
I_{1,1} &\le C(1+t)^{-(3n-1)/2}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n}\int_{t/2}^t(1+(t-s))^{-(3n+1)/4+(2n-1)/2}\,ds\\
&\le C(1+t)^{-(5n-1)/4}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n}.
\end{aligned}
\]
Using (5.13) and (5.14) again, we also get
\[
\begin{aligned}
I_{1,2} &\le C(1+t)^{-(3n+1)/4}\int_0^{t/2}(1+s)^{-(3n-1)/2}\Big(1+\frac{(|x|-ct)^2}{1+t}\,\frac{1+t}{1+s}\Big)^{-n}(1+(t-s))^{(2n-1)/2}\,ds\\
&\le C(1+t)^{-(3n-1)/4}\Big(\frac{(|x|-ct)^2}{1+t}\Big)^{-n}.
\end{aligned}
\]
Since $(|x|-ct)^2>t$, we have
\[
\Big(\frac{(|x|-ct)^2}{t}\Big)^{-n} \le 2^n\Big(1+\frac{(|x|-ct)^2}{t}\Big)^{-n}.
\]
Thus
\[
I_{1,2} \le C(1+t)^{-(3n-1)/4}\Big(1+\frac{(|x|-ct)^2}{t}\Big)^{-n}.
\]
For the same reason, we also have
\[
I_2 \le C(1+t)^{-(3n-1)/4}\Big(1+\frac{(|x|-ct)^2}{t}\Big)^{-n}.
\]

Finally we show the estimate (5.4).
The estimate of $\bigtriangledown_1$.

Case 1.1. $(|x|-ct)^2\le t$. It is easy to see that
\[
\bigtriangledown_1 \le C\int_{\Omega_1}\theta_1(t,s)\Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-N}ds\,dy + \int_{\Omega_2}\theta_1(t,s)\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n}ds\,dy.
\]
By Lemma 5.2, following the same procedure as in the proof of Lemma 5.5, we get $\bigtriangledown_1\le C(1+t)^{-(3n-1)/4}$. Since $(|x|-ct)^2\le t$, combining (5.6) and the above inequality, we have
\[
\bigtriangledown_1 \le C\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n}(1+t)^{-(3n-1)/4}.
\tag{5.15}
\]

Case 1.2. $|x|\ge ct+\sqrt t$. We first have
\[
P_1(t,x,s,y) \le C\Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-N}\Big(1+\frac{(|x|-ct)^2}{1+s}\Big)^{-n}
\tag{5.16}
\]
when $|y|-cs\ge\tfrac12(|x|-ct)$. On the other hand, when $|y|-cs\le\tfrac12(|x|-ct)$, we have
\[
P_1(t,x,s,y) \le C\Big(1+\frac{(|x|-ct)^2}{1+t-s}\Big)^{-N}\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n}.
\tag{5.17}
\]
Thus by (5.16), (5.17) and Lemma 5.5 we see that
\[
\bigtriangledown_1 \le C(1+t)^{-(3n-1)/4}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n}.
\tag{5.18}
\]

Case 1.3. $|x|\le ct-\sqrt t$ and $(s,y)\in\tilde\Omega=\Omega\cap\{t_1\le s\le t_2,\ r\le2ct-|x|\}$ with $t_1=(ct-|x|)/(4c)$, $t_2=t-(ct-|x|)/(4c)$. If $||y|-cs|\ge\sqrt{1+t}$, from $t-t_2=t_1$, it is easy to see that
\[
\begin{aligned}
\bigtriangledown_1 &\le C\Big((1+t_1)^{-(3n+1)/4}\big(1+\tfrac t2\big)^{-(3n-1)/2} + (1+t)^{-n}\big(1+\tfrac t2\big)^{-(3n+1)/4}(1+t_1)^{-(n-1)/2}\Big)\int F(t,x,s,y)\,ds\,dy\\
&\le C(1+t)^{-(3n-1)/2}(1+(ct-|x|))^{-(3n+1)/4}\int F(t,x,s,y)\,ds\,dy.
\end{aligned}
\]
Using Lemma 5.4, we get
\[
\bigtriangledown_1 \le C(1+t)^{-(3n-1)/2}(1+(ct-|x|))^{-(3n+1)/4}(1+t)^n \le C(1+t)^{-(3n-1)/4}\Big(1+\frac{(ct-|x|)^2}{1+t}\Big)^{-n/2}.
\tag{5.19}
\]
If $||y|-cs|<\sqrt{1+t}$, we just note that $\operatorname{Vol}(\Omega_s)\le C(1+t)^{n/2}(ct-|x|)^{(n+1)/2}$, where $\Omega_s=\{y;\ (s,y)\in\tilde\Omega,\ ||y|-cs|<\sqrt{1+t}\}$. With this, we also have (5.19).

Case 1.4. $|x|\le ct-\sqrt t$ and $(s,y)\in\tilde\Omega^c$, $\tilde\Omega^c=\Omega\setminus\tilde\Omega$. Denote
\[
D_1=\{(s,y);\ s\le t_1\};\qquad D_2=\{(s,y);\ s\ge t_2\};\qquad D_3=\{(s,y);\ r\ge2ct-|x|\}.
\]
We see that $\tilde\Omega^c\subset D_1\cup D_2\cup D_3$. If $(s,y)\in D_1$ and $|y|\le(ct-|x|)/2$, we have
\[
c(t-s)-|y-x| \ge c(t-s)-|y|-|x| \ge ct-(ct-|x|)/4-(ct-|x|)/2-|x| = (ct-|x|)/4.
\]
If $(s,y)\in D_1$ and $|y|\ge(ct-|x|)/2$, we have
\[
|y|-cs \ge (ct-|x|)/2-(ct-|x|)/4 = (ct-|x|)/4.
\]
Thus, for $(s,y)\in D_1$,
\[
P_1(t,x,s,y) \le \Big(1+\frac{(|y-x|-c(t-s))^2}{1+t-s}\Big)^{-N}\Big(1+\frac{(|x|-ct)^2}{1+s}\Big)^{-n},
\tag{5.20}
\]
or
\[
P_1(t,x,s,y) \le \Big(1+\frac{(|x|-ct)^2}{1+t-s}\Big)^{-N}\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n}.
\tag{5.21}
\]
For $(s,y)\in D_2$, we consider the cases $|y-x|\le(ct-|x|)/2$ and $|y-x|\ge(ct-|x|)/2$ separately:
\[
\begin{aligned}
cs-|y| &= (cs-|y-x|)+(|y-x|-|y|) \ge (ct-|x|)/4,\\
|y-x|-c(t-s) &\ge (ct-|x|)/2-ct+ct-(ct-|x|)/4 = (ct-|x|)/4.
\end{aligned}
\]
Thus for $(s,y)\in D_2$, (5.20) or (5.21) also holds. For $(s,y)\in D_3$, we consider the cases $\tau\ge0$ and $\tau<0$ separately and again get (5.20) or (5.21). By (5.20), (5.21) and Lemma 5.5, we see that for $(s,y)\in\tilde\Omega^c$,
\[
\bigtriangledown_1 \le C(1+t)^{-(3n-1)/4}\Big(1+\frac{(|x|-ct)^2}{1+t}\Big)^{-n}.
\tag{5.22}
\]
Summing the above cases, we see that (5.4) is valid for $\bigtriangledown_1$.

The estimate of $\bigtriangledown_2$.

Case 2.1. $|x|^2\le t$. As in the proof of Case 1.1, we first have
\[
\begin{aligned}
\bigtriangledown_2 &\le \int_{\Omega_1}\theta_2\Big(1+\frac{|y-x|^2}{1+t-s}\Big)^{-N}ds\,dy + \int_{\Omega_2}\theta_2\Big(1+\frac{(|y|-cs)^2}{1+s}\Big)^{-n}ds\,dy\\
&\le C\big((1+t)^{-(3n-2)/2}+(1+t)^{-(n+1)/2}\big).
\end{aligned}
\tag{5.23}
\]
Noting that $|x|^2\le t$ and using (5.6), we get
\[
\bigtriangledown_2 \le C\Big(1+\frac{|x|^2}{1+t}\Big)^{-n}(1+t)^{-(n+1)/2}.
\tag{5.24}
\]
Case 2.2. (|x| − ct)2 ≤ (c/4)2 t. Same as for (5.23), we have Z 52,1 = θ2 P2 dsdy ≤ C(1 + t)−(3n−1)/2 . But for 52,2 =
R
1
θ2 P2 dsdy, we need a better estimate than (5.23). In fact, since −n −N 2 |y−x|2 C 1 + (|x|−cs) 1 + , if|y| ≥ |x|+cs 1+s 2 , 1+t−s (5.25) P2 ≤ −N −n 2 2 |y−x| |x|+cs C 1 + (|x|−cs) , if|y| ≤ , 1 + 1+t−s 1+s 2 2
and |x| − cs ≥ t/4 if s ≤ t/2, we have
52,2 ≤ C(1 + t)−n ( ≤ C(1 + t)−n .
R t/2 0
(1 + s)−(3n−1)/2 ds +
R t/2 0
(1 + t − s)−(n+1)/2 (1 + s)−(n−2)/2 ds)
Using (5.6) again, one concludes (|x| − ct)2 −n 52 ≤ C 1 + (1 + t)−n . 1+t √ Case 2.3. |x| ≥ ct + (c/4) t. Since C|x|2 , if s ≤ t/2, 2 (|x| − cs) ≥ C(|x| − ct)2 , if s ≥ t/2,
(5.26)
it follows from (5.25) that 52 ≤ −n R −N N −n 2 |y−x|2 (|y|−cs)2 1+t−s C 1 + (|x|−ct) 1 + 1 + dsdy θ + 1 2 1+t 1+t−s 1+t 1+s −n R n −N −n 2 2 2 1+s +C 1 + |x| θ 1 + |y−x| dsdy. + 1 + (|y|−cs) 1+t 1+t 1+t−s 1+s 2 2 By Lemma 5.2, we see that (|x| − ct)2 −n |x|2 −n 52 ≤ C (1 + t)−(2n−1)/2 1 + . (5.27) + (1 + t)−(n+1)/2 1 + 1+t 1+t √ √ Case 2.4. t ≤ |x| ≤ ct − (c/4) t. Let D1 = {(s, y) ∈ , 0 ≤ s ≤ t1 }, D2 = {(s, y) ∈ , t1 ≤ s ≤ t2 }, D3 = {(s, y) ∈ , t2 ≤ s ≤ t}, with t1 =
|x| 2c , t2
=
t 2
+
|x| 2c .
52 =
Then we may write
3 Z X j=1
Since
Dj
θ2 (t, s)P2 (t, s, x, y)dsdy =
3 X
52,j .
j=1
−N −N 2 |x|2 ≤ C 1 + 1+t−s , if |y − x| ≥ 1 + |y−x| 1+t−s −N −N (|y|−cs)2 |x|2 1 + 1+s ≤ C 1 + 1+s , if |y − x| ≤
|x| 4 , |x| 4 ,
we have R −N −n 2 2 52,1 ≤ C(1 + t)−(n+1)/2 D1 1 + |x| (1 + s)−(3n−1)/2 1 + (|y|−cs) dsdy 1+t 1+s −N −n R 2 2 1+t + D1 1 + |y−x| (1 + s)−(3n−1)/2 1 + |x| dsdy 1+t−s 1+t 1+s −n 2 ≤ C(1 + t)−(n+1)/2 1 + |x| . 1+t For 52,3 , as for 52,1 above, we only consider the case |y − x| ≤ (ct − |x|)/4, s ≥ t2 , cs − |y| ≥ cs − |x| − |y − x| ≥ (ct − |x|)/4,
and conclude 52,3
R −N −n 2 (ct−|x|)2 −(n+1)/2 ≤ C(1 + t)−(3n−1)/2 D3 1 + |y−x| (1 + t − s) dsdy 1 + 1+t−s 1+t−s −N −n R 2 2 1+t + D3 1 + (ct−|x|) (1 + t − s)−(n+1)/2 1 + (|y|−cs) dsdy 1+t 1+t−s 1+s −n 2 ≤ C(1 + t)−(2n+1)/2 1 + (ct−|x|) . 1+t For 52,2 , it is easy to see that C(1 + (ct − |x|))−(n+1)/2 (1 + t)−(3n−1)/2 , t/2 ≤ s ≤ t θ2 (t, s) ≤ 0 ≤ s ≤ t/2, C(1 + t)−(n+1)/2 (1 + |x|)−(3n−1)/2 , and
R
R
1 + |y−x| 1+t D2 √ R ≤ C 1 + t Rn 1 +
P dsdy ≤ C D2 2
2
−N
−N 2 1 + (|y|−cs) dsdy 1+t −N 2 |y−x| dy ≤ C(1 + t)(n+1)/2 . 1+t
Thus, noting that (ct − |x|) ≤ C(1 + t), we get 52,2 ≤ C((1 + |x|)−(3n−1)/2 + (1 + t)−(n−1) (1 + (ct − |x|))−(n+1)/2 ) −(3n−1)/4 −(3n−1)/4 2 (ct−|x|)2 . + 1 + ≤ C(1 + t)−(3n−1)/4 1 + |x| 1+t 1+t Summing above inequalities, we have (ct − |x|)2 −n/2 |x|2 −n/2 52 ≤ C (1 + t)−n/2 1 + . (5.28) + (1 + t)−(3n−1)/4 1 + 1+t 1+t From the above cases, we conclude that (5.4) always holds for 52 . The estimate of 53 . Cases 3.1 and 3.2. |x|2 ≤ t or (|x| − ct)2 ≤ t. By the same method as in Cases 2.1 and 2.2, we have |x|2 −n |x|2 −n + (1 + t)−(3n−1)/4 1 + . 53 ≤ C (1 + t)−n/2 1 + 1+t 1+t √ Case 3.3. |x| ≥ ct + t. We first note that
(5.29)
|y − x| − c(t − s))2 −N (|x| − ct)2 −n P3 ≤ C 1 + 1+ 1+t−s 1+t for |y| ≥ |x| − ct. On the other hand, for |y| ≤ |x| − ct, since |y − x| − c(t − s) ≥ |x| − |y| − ct ≥ 0, by (5.7), we have |y|2 (|x| − ct − |y|))2 |y|2 (|x| − ct)2 (|y − x| − c(t − s))2 + ≥ + ≥ . 1+t−s 1+s 1+t−s 1+s 2+t Using the inequality (5.8), we have (|x| − ct)2 −n |y − x| − c(t − s))2 −N +n P3 ≤ C 1 + 1+ . 1+t−s 1+t
If N ≥ 2n, as in the proof of Case 2.3, we obtain (|x| − ct)2 −n 53 ≤ C(1 + t)−(3n−1)/4 1 + . 1+t √ √ Case 3.4. t ≤ |x| ≤ ct − t. As in Case 2.4, we may write 53 =
3 Z X j=1
Dj
θ3 (t, s)P3 (t, s, x, y)dsdy =
3 X
(5.30)
53,j ,
j=1
|x| but with t1 = 2t − |x| 2c , t2 = t − 4c . For 53,1 , since c(t − s) − |y − x| ≥ (ct − |x|)/4 for |y| ≤ (ct − |x|)/4, we see that R −N −n 2 2 53,1 ≤ C(1 + t)−(3n+1)/4 D1 1 + (ct−|x|) (1 + s)−n 1 + |y| dsdy 1+t 1+s −n R 2 2 1+t + D1 1 + (c(t−s)−|y−x|) (1 + s)−n 1 + (ct−|x|) 1+t−s 1+t 1+s −n 2 ≤ C(1 + t)−(3n+1)/4 1 + (ct−|x|) . 1+t
For 53,3 , we see that |y − x| − ct + cs ≥ |x| 2 − ct + ct − |y| ≥ |x| − |y − x| ≥ |x| 2 ,
|x| 4
≥
|x| 4 ,
if |x| ≥ if |x| ≤
ct 2, ct 2.
Thus by Lemma 5.2, 53 ≤ C(1 + t)−n 1 + +(1 + t − s)
−n R t |x|2 ((1 1+t t2 n−1/2−(3n−1)/4
≤ C(1 + t)−(3n−1)/4
+ s)n/2 (1 + t − s)N −(3n−1)/4 (1 + t)−N
(1 + s)n (1 + t)−n )ds) −n . 1 + |x| 1+t
2
For 53,2 , it is easy to see that C(1 + (ct − |x|))−n (1 + t)−(3n+1)/4 , t/2 ≤ s ≤ t θ3 (t, s) ≤ 0 ≤ s ≤ t/2, C(1 + t)−(3n+1)/4 (1 + |x|)−n , and
−N −n R (|y−x|−c(t−s))2 |y|2 P dsdy ≤ C 1 + 1 + dsdy 3 1+t 1+t D2 D2 −n √ R 2 ≤ C 1 + t Rn 1 + |y| dy ≤ C(1 + t)(n+1)/2 . 1+t
R
Thus 53,2 ≤ C((1 + t)−(n−1)/4 (1 + (ct − |x|))−n + (1 + |x|)−n (1 + t)−(n−1)/4 ) −n/2 −n/2 2 2 + 1 + |x| . ≤ C(1 + t)−(3n−1)/4 1 + (ct−|x|) 1+t 1+t Summing above inequalities, we have (ct − |x|)2 −n/2 |x|2 −n/2 53 ≤ C (1 + t)−n/2 1 + + (1 + t)−(3n−1)/4 1 + . (5.31) 1+t 1+t Thus, (5.4) is also valid for 53 .
The estimate of 54 . Case 4.1. |x|2 ≤ t. Since Z Z |y|2 −n |y − x|2 −N θ4 1 + dsdy + θ4 1 + dsdy ≤ C(1 + t)−(n+1)/2 , 54 ≤ 1+s 1+t−s 1 2 using (5.6), one has |x|2 −n . 54 ≤ C(1 + t)−(n+1)/2 1 + 1+t Case 4.2. |x|2 ≥ t. Since C 1 + P4 ≤ C 1 +
−n 2 1 + |y| , if |y| ≥ 1+s −N −n 2 2 |y−x| , if |y| ≤ 1 + |x| 1+t−s 1+s |x|2 1+t−s
−N
|x| 2 , |x| 2 ,
by Lemma 5.2, we have N −N −n |x|2 |y|2 −(n+1)/2 −n 1+t−s 1 + 1 + (1 + t − s) (1 + s) dsdy 1+t 1+t 1+s N −N −n R 2 2 + (1 + t − s)−(n+1)/2 (1 + s)−n 1+s 1 + |y−x| 1 + |x| dsdy 1+t 1+t 1+s −n 2 . ≤ C(1 + t)−(n+1)/2 1 + |x| 1+t
54 ≤ C
R
It follows from the above inequalities that (5.4) is valid for $\bigtriangledown_4$. This completes the proof of Proposition 4.2.

References

1. Hoff, D. and Zumbrun, K.: Multi-dimensional diffusion wave for the Navier–Stokes equations of compressible flow. Indiana Univ. Math. J. 44, No. 2, 603–676 (1995)
2. Hoff, D. and Zumbrun, K.: Pointwise decay estimates for multidimensional Navier–Stokes diffusion waves. Z. angew. Math. Phys. 48, 1–18 (1997)
3. Kawashima, S.: System of a hyperbolic-parabolic composite type, with applications to the equations of magnetohydrodynamics. Thesis, Kyoto Univ. (1983)
4. Kawashima, S.: Large-time behavior of solutions to hyperbolic-parabolic systems of conservation laws and applications. Proc. Roy. Soc. Edinburgh 106A, 169–194 (1987)
5. Liu, T.P. and Zeng, Y.: Large time behavior of solutions for general quasilinear hyperbolic-parabolic systems of conservation laws. A. M. S. Memoirs, 599 (1997)
6. Treves, F.: Basic linear partial differential equations. London: Academic Press, Inc., 1975
7. Evans, L.C.: Partial differential equations. Berkeley Mathematics Lecture Notes, V3A

Communicated by A. Jaffe
Commun. Math. Phys. 196, 175 – 202 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Trace Formulas and Inverse Spectral Theory for Jacobi Operators

Gerald Teschl*

Institut für Reine und Angewandte Mathematik, RWTH Aachen, 52056 Aachen, Germany

Received: 5 November 1997 / Accepted: 27 January 1998
Abstract: Based on high energy expansions and Herglotz properties of Green and Weyl m-functions we develop a self-contained theory of trace formulas for Jacobi operators. In addition, we consider connections with inverse spectral theory, in particular uniqueness results. As an application we work out a new approach to the inverse spectral problem of a class of reflectionless operators producing explicit formulas for the coefficients in terms of minimal spectral data. Finally, trace formulas are applied to scattering theory with periodic backgrounds.

1. Introduction

Trace formulas have a long history in the theory of one-dimensional second order equations. One case of particular importance is that of periodic potentials. Let
\[
(Hf)(n) = a(n)f(n+1) + a(n-1)f(n-1) + b(n)f(n),\qquad n\in\mathbb Z,
\tag{1.1}
\]
be our Jacobi operator with $a(n+N)=a(n)$, $b(n+N)=b(n)$ for some $N\in\mathbb N$. Then, using Floquet theory (cf., e.g., [7], Appendix B, [30, 32]) one can show that the spectrum $\sigma(H)$ of $H$ consists of $N$ bands (some of which might collide)
\[
\sigma(H) = \bigcup_{j=1}^{N}[E_{2j-2},E_{2j-1}],\qquad E_0<E_1\le E_2\cdots<E_{2N-1}.
\tag{1.2}
\]
Next, we consider finite matrices associated with $H$ obtained by restricting $H$ to finite intervals from $n_0$ to $n_0+N$ and imposing boundary conditions at the endpoints. Denote the matrix obtained with Dirichlet boundary conditions (i.e., $f(n_0)=0$, $f(n_0+N)=0$) by $\tilde H^\infty_{n_0}$ and the one obtained with periodic/antiperiodic boundary conditions (i.e., $f(n_0)=\pm f(n_0+N)$, $f(n_0+1)=\pm f(n_0+N+1)$) by $\tilde H^\pm_{n_0}$. The eigenvalues of $\tilde H^+_{n_0}$, $\tilde H^-_{n_0}$ are precisely the even, odd band edges $E_{2j-2}$, $E_{2j-1}$, $1\le j\le N$, respectively. The eigenvalues of $\tilde H^\infty_n$ are denoted by $\mu_j(n)$, $1\le j\le N-1$. Since $\operatorname{tr}(\tilde H^\pm_n)=\sum_{j=0}^{N-1}b(n+j)$ and $\operatorname{tr}(\tilde H^\infty_n)=\sum_{j=1}^{N-1}b(n+j)$ we infer from $b(n)=\operatorname{tr}(\tilde H^\pm_n-\tilde H^\infty_n)=\operatorname{tr}(\tilde H^+_n+\tilde H^-_n)/2-\operatorname{tr}\tilde H^\infty_n$ by elementary linear algebra
\[
b(n) = \frac12\sum_{j=0}^{2N-1}E_j - \sum_{j=1}^{N-1}\mu_j(n).
\tag{1.3}
\]

* Current address: Institut für Mathematik, Strudlhofgasse 4, 1090 Wien, Austria. E-mail: [email protected]
Similarly, considering $\operatorname{tr}((\tilde H^+_n)^\ell+(\tilde H^-_n)^\ell)/2-\operatorname{tr}(\tilde H^\infty_n)^\ell$, $\ell\in\mathbb N$, one can obtain higher order trace relations. The corresponding formulas for $\ell=1$ (i.e., (1.3)) and $\ell=2$ were first given in [32]. Formula (1.3) plays a key role in the inverse spectral theory of periodic operators and the reconstruction of $a$, $b$ from suitable spectral data. Those ingredients form the basis for the solution of the periodic initial value problem of the Toda equations (cf., e.g., [7, 10, 39]). Moreover, relation (1.3) was extended to certain reflectionless operators in [2] and successfully used in [2, 22] to solve inverse spectral problems for these operators. To generalize trace formulas to arbitrary operators one invokes the measure $d\rho_\delta$ of $H$ associated with the vector $\delta\in\ell^2(\mathbb Z)$ (cf. Lemma 3.1) by the spectral theorem. Choosing, e.g., $\delta=\delta_n$ (the standard basis of $\ell^2(\mathbb Z)$) we immediately obtain
\[
\langle\delta_n,H^\ell\delta_n\rangle = \int_{\mathbb R}\lambda^\ell\,d\rho_{\delta_n}(\lambda),
\tag{1.4}
\]
connecting the matrix elements $\langle\delta_n,H^\ell\delta_n\rangle$ with the moments of the measure $d\rho_{\delta_n}$. In the special case where $H$ has purely discrete spectrum, the integral can be evaluated,
\[
\langle\delta_n,H^\ell\delta_n\rangle = \sum_{\lambda\in\sigma(H)}\gamma(\lambda,n,n)\lambda^\ell,
\tag{1.5}
\]
where $-\gamma(\lambda,n,m)$ is the residue of $G(z,n,m)$ at $z=\lambda\in\sigma(H)$, that is,
\[
\gamma(\lambda,n,m) = \frac{u(\lambda,n)u(\lambda,m)}{\|u(\lambda)\|^2},\qquad\lambda\in\sigma(H),
\tag{1.6}
\]
where $u(\lambda)$ is the eigenvector corresponding to $\lambda\in\sigma(H)$. In particular, for $\ell=1$ this gives the interesting result that (for $H$ with purely discrete spectrum) $b(n)$ is equal to the sum over all eigenvalues of $H$ weighted by $\gamma(\lambda,n,n)$. However, generalizations of (1.3) cannot be obtained in this way. This can be done by using the exponential measure $\xi\,d\lambda$ (cf. Appendix A) associated with $d\rho(\lambda)$ as was discovered by F. Gesztesy and B. Simon in [17]. There they extended the analog of (1.3) for Schrödinger operators to a much larger class of potentials (in essence, only semiboundedness of the potential is needed) based on the theory of the Krein spectral shift [29]. In a subsequent series of papers [18], [19, 21, 25], and [26] they, together with H. Holden and Z. Zhao, exploit the ideas of [17] and extend them in various directions. In [17] they also give a generalization of (1.3) to arbitrary bounded Jacobi operators. However, a comprehensive treatment of trace formulas for Jacobi operators is still missing. Since it is desirable, for further work in inverse spectral theory, to have these powerful tools at one's disposal, one goal of the present paper is to fill this gap.
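The periodic trace formula (1.3) can be checked numerically on a small example. The sketch below is illustrative only: the matrix conventions (periodic and antiperiodic matrices on the sites $n_0,\dots,n_0+N-1$, the Dirichlet matrix on $n_0+1,\dots,n_0+N-1$) are an assumption chosen to be consistent with the trace identities quoted before (1.3), the coefficients are made-up sample data, and the small cyclic Jacobi-rotation eigenvalue solver is a generic textbook method, not something taken from the paper.

```python
import math


def jacobi_eigenvalues(A, sweeps=100, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                d = A[q][q] - A[p][p]
                sgn = 1.0 if d >= 0.0 else -1.0
                # Rotation angle in [-pi/4, pi/4] with tan(2*theta) = 2*A_pq / d.
                theta = 0.5 * math.atan2(2.0 * sgn * A[p][q], abs(d))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))


def periodic_matrix(a, b, n0, sign):
    """N x N matrix on sites n0..n0+N-1 with f(n0+N) = sign * f(n0)."""
    N = len(a)
    H = [[0.0] * N for _ in range(N)]
    for i in range(N):
        H[i][i] = b[(n0 + i) % N]
    for i in range(N - 1):
        H[i][i + 1] = H[i + 1][i] = a[(n0 + i) % N]
    H[0][N - 1] += sign * a[(n0 + N - 1) % N]
    H[N - 1][0] += sign * a[(n0 + N - 1) % N]
    return H


def dirichlet_matrix(a, b, n0):
    """(N-1) x (N-1) matrix on sites n0+1..n0+N-1 with f(n0) = f(n0+N) = 0."""
    N = len(a)
    H = [[0.0] * (N - 1) for _ in range(N - 1)]
    for i in range(N - 1):
        H[i][i] = b[(n0 + 1 + i) % N]
    for i in range(N - 2):
        H[i][i + 1] = H[i + 1][i] = a[(n0 + 1 + i) % N]
    return H


def b_from_spectral_data(a, b, n0):
    """Right-hand side of (1.3): half the sum of the E_j minus the sum of the mu_j(n0)."""
    E = (jacobi_eigenvalues(periodic_matrix(a, b, n0, +1.0))
         + jacobi_eigenvalues(periodic_matrix(a, b, n0, -1.0)))
    mu = jacobi_eigenvalues(dirichlet_matrix(a, b, n0))
    return 0.5 * sum(E) - sum(mu)


if __name__ == "__main__":
    a = [1.0, 2.0, 0.5, 1.5]   # hypothetical period-4 coefficients a(n) != 0
    b = [0.3, -1.2, 2.0, 0.7]  # hypothetical period-4 coefficients b(n)
    for n0 in range(len(a)):
        print(n0, b[n0], b_from_spectral_data(a, b, n0))
```

Note that the $E_j$ do not depend on $n_0$ while the Dirichlet eigenvalues $\mu_j(n_0)$ do, which is why (1.3) needs the $\mu_j(n)$ for every $n$.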
Furthermore, we want to point out an annoying mismatch in formula (1.3). In order to express $b(n)$ for all $n\in\mathbb Z$ one needs $\{E_j\}_{0\le j\le 2N-1}$, $\{\mu_j(n)\}_{1\le j\le N}$ for all $n\in\mathbb Z$. On the other hand, it is well-known that the spectral data $\{E_j\}_{0\le j\le 2N-1}$, $\{\mu_j(n_0)\}_{1\le j\le N}$ plus some additional signs $\{\sigma_j(n_0)\}_{1\le j\le N}$ for one fixed $n_0\in\mathbb Z$ already determine $a(n)^2$, $b(n)$ for all $n\in\mathbb Z$. Hence it must be, in principle, possible to express $a(n)^2$, $b(n)$ in terms of these spectral data for one $n_0\in\mathbb Z$. This naturally raises the question whether one might be able to find explicit expressions of $a(n)^2$, $b(n)$ in terms of suitable minimal spectral data for certain classes of operators. To the best of our knowledge a problem of this kind has not been solved yet. Combining the approach of (1.4), the theory of [17], Weyl–Titchmarsh theory, and the moment problem we will add a new wrinkle to the theory of trace formulas and give a solution to this problem for a certain class of bounded reflectionless Jacobi operators in Sect. 6. To give the reader an overview of the results established, we briefly summarize the content of the remaining sections. Section 2 introduces all the necessary notation and is mainly added to make the paper self-contained and easier to read. Section 3 contains a comprehensive treatment of asymptotic expansions for Weyl m and Green functions. We establish that expansions for these objects always exist up to arbitrary order. In addition, recursion relations for the expansion coefficients are derived. Section 4 contains an alternate (recursive) approach to inverse spectral theory which gives simple proofs for standard uniqueness theorems. Moreover, new uniqueness results are established as well. In Sect. 5 we derive infinite series of trace formulas for Jacobi operators in the spirit of [17, 25]. The basic ingredients are the asymptotic expansions of Sect. 3 and Herglotz properties of these objects.
In particular, we extend (1.3) to (i) arbitrary order $\ell\in\mathbb N$, (ii) arbitrary Jacobi operators, and (iii) general boundary conditions. Section 6 applies the results of Section 5 to the theory of reflectionless Jacobi operators, producing formulas of type (1.3) plus an explicit expression of the coefficients $a^2$, $b$ in terms of minimal spectral data. Section 7 considers scattering theory with periodic backgrounds. Basic objects like transmission and reflection coefficients are introduced. In addition, the analog of a trace formula for Schrödinger operators involving the reflection coefficient is obtained. Finally, an appendix collects some properties of Herglotz functions needed in the main body of the paper.

2. Jacobi Operators, Resolvents, Green's Functions and All That

Throughout this paper we denote by $\ell(I)=\ell(M,N)$, $I=\{n\in\mathbb Z\,|\,M<n<N\}$, $M,N\in\mathbb Z\cup\{\pm\infty\}$, the set of complex-valued sequences $\{u(n)\}_{n\in I}$ and by $\ell^p(I)$, $1\le p\le\infty$, the sequences $u\in\ell(I)$ such that $|u|^p$ is summable over $I$. The scalar product in the Hilbert space $\ell^2(I)$ will be denoted by
\[
\langle u,v\rangle = \sum_{n\in I}\overline{u(n)}\,v(n),\qquad u,v\in\ell^2(I).
\tag{2.1}
\]
We will be concerned with operators on `2 (Z) associated with the difference expression
178
G. Teschl
(τ f )(n) = a(n)f (n + 1) + a(n − 1)f (n − 1) + b(n)f (n),
(2.2)
where a, b ∈ `(Z) satisfy Hypothesis 2.1. Suppose a(n) ∈ R\{0},
b(n) ∈ R,
n ∈ Z.
(2.3)
If $\tau$ is limit point (l.p.) at both $\pm\infty$ (cf., e.g., [5, 6]), then $\tau$ gives rise to a unique self-adjoint operator $H$ when defined maximally. Otherwise, we need to fix a boundary condition at each endpoint where $\tau$ is limit circle (l.c.) (cf., e.g., [5, 6]). Throughout this paper we denote by $u_\pm(z,.)$, $z \in \mathbb{C}$, nontrivial solutions of $\tau u = zu$ which satisfy the boundary condition at $\pm\infty$ (if any) with $u_\pm(z,.) \in \ell^2_\pm(\mathbb{Z})$, respectively. Here $\ell^2_\pm(\mathbb{Z})$ denotes the sequences in $\ell(\mathbb{Z})$ being $\ell^2$ near $\pm\infty$. The solution $u_\pm(z,.)$ might not exist for $z \in \mathbb{R}$ (cf. [37], Lemma A.1), but if it exists it is unique up to a constant multiple. Picking a fixed $z_0 \in \mathbb{C}\setminus\mathbb{R}$ we can characterize $H$ by
$$H : D(H) \to \ell^2(\mathbb{Z}), \qquad f \mapsto \tau f, \tag{2.4}$$
where the domain of $H$ is explicitly given by
$$D(H) = \{ f \in \ell^2(\mathbb{Z}) \mid \tau f \in \ell^2(\mathbb{Z}),\ \lim_{n\to+\infty} W_n(u_+(z_0), f) = 0,\ \lim_{n\to-\infty} W_n(u_-(z_0), f) = 0 \} \tag{2.5}$$
and
$$W_n(f,g) = a(n)\bigl( f(n) g(n+1) - f(n+1) g(n) \bigr) \tag{2.6}$$
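denotes the (modified) Wronskian. The boundary condition at $\pm\infty$ imposes no additional restriction on $f$ if $\tau$ is l.p. at $\pm\infty$ and can hence be omitted in this case. Next, consider the sequence
$$\delta^\beta_{n_0} = \cos(\alpha)\,\delta_{n_0} + \sin(\alpha)\,\delta_{n_0+1}, \qquad \beta = \cot(\alpha),\ \alpha \in [0,\pi), \tag{2.7}$$
where $\delta_{n_0}(n)$ is 1 for $n = n_0$ and 0 otherwise. Restrict $H$ to the orthogonal complement of $\delta^\beta_{n_0}$ in $\ell^2(\mathbb{Z})$ and denote this restriction by $H^\beta_{n_0}$, that is,
$$H^\beta_{n_0} f = \tau f, \qquad f \in D(H^\beta_{n_0}) = \{ f \in D(H) \mid \langle \delta^\beta_{n_0}, f\rangle = 0 \}. \tag{2.8}$$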
Clearly $H^\beta_{n_0}$ is self-adjoint on the subspace $\{f\in\ell^2(\mathbb{Z}) \mid \langle\delta^\beta_{n_0}, f\rangle = 0\}$ but not on $\ell^2(\mathbb{Z})$ since $D(H^\beta_{n_0})$ is not dense. Now we turn to resolvents and introduce the Green's function
$$G(z,m,n) = \langle \delta_m, (H-z)^{-1}\delta_n\rangle = \frac{1}{W(u_-(z),u_+(z))} \begin{cases} u_+(z,n)\,u_-(z,m) & \text{for } m \le n, \\ u_+(z,m)\,u_-(z,n) & \text{for } n \le m, \end{cases} \tag{2.9}$$
where $z \in \mathbb{C}\setminus\sigma(H)$ and $\sigma(H)$ denotes the spectrum of $H$. For later use we also introduce the convenient abbreviations
$$g(z,n) = G(z,n,n) = \frac{u_+(z,n)\,u_-(z,n)}{W(u_-(z),u_+(z))}, \tag{2.10}$$
$$h(z,n) = 2a(n)G(z,n,n+1) - 1 = \frac{a(n)\bigl(u_+(z,n)\,u_-(z,n+1) + u_+(z,n+1)\,u_-(z,n)\bigr)}{W(u_-(z),u_+(z))}. \tag{2.11}$$
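Similarly, the corresponding object for $H^\beta_{n_0}$ (viewed as a self-adjoint operator on $\{f\in\ell^2(\mathbb{Z}) \mid \langle\delta^\beta_{n_0}, f\rangle = 0\}$) reads
$$G^\beta_{n_0}(z,m,n) = \langle\delta_m,(H^\beta_{n_0}-z)^{-1}\delta_n\rangle = G(z,m,n) - \gamma^\beta(z,n_0)^{-1}\bigl(G(z,m,n_0+1) + \beta G(z,m,n_0)\bigr)\bigl(G(z,n_0+1,n) + \beta G(z,n_0,n)\bigr), \tag{2.12}$$
where
$$\gamma^\beta(z,n) = \frac{\bigl(u_+(z,n+1) + \beta u_+(z,n)\bigr)\bigl(u_-(z,n+1) + \beta u_-(z,n)\bigr)}{W(u_-(z),u_+(z))} = g(z,n+1) + \frac{\beta}{a(n)}\,h(z,n) + \beta^2 g(z,n). \tag{2.13}$$
The minus sign in (2.12) is the one for which $G^\beta_{n_0}(z,.,n)$ satisfies the boundary condition $f(n_0+1) + \beta f(n_0) = 0$; in the Dirichlet limit $\beta\to\infty$ it reduces to $G(z,m,n) - G(z,m,n_0)G(z,n_0,n)/g(z,n_0)$, which vanishes at $m = n_0$.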
The quantities $g(z,n)$ and $\gamma^\beta(z,n)$ are most important for our purpose and satisfy the following recurrence equations which can be verified by tedious but straightforward calculations. We use the shortcuts $(f^-)(n) = f(n-1)$, $(f^+)(n) = f(n+1)$, $(f^{++})(n) = f(n+2)$, etc.

Lemma 2.2. Let $u$, $v$ be two solutions of $\tau u = zu$. Then $g(n) = u(n)v(n)$ satisfies
$$\frac{(a^+)^2 g^{++} - a^2 g}{z - b^+} + \frac{a^2 g^+ - (a^-)^2 g^-}{z - b} = (z-b^+)\,g^+ - (z-b)\,g, \tag{2.14}$$
and
$$\bigl(a^2 g^+ - (a^-)^2 g^- + (z-b)^2 g\bigr)^2 = (z-b)^2\bigl(W(u,v)^2 + 4a^2 g g^+\bigr). \tag{2.15}$$
Moreover, set $\gamma^\beta(n) = (u(n+1) + \beta u(n))(v(n+1) + \beta v(n))$, then we have
$$\bigl((a^+A^-)^2(\gamma^\beta)^+ - (a^-A)^2(\gamma^\beta)^- + B^2\gamma^\beta\bigr)^2 = (A^-B)^2\Bigl(\bigl(\tfrac{A}{a}\,W(u,v)\bigr)^2 + 4(a^+)^2\gamma^\beta(\gamma^\beta)^+\Bigr), \tag{2.16}$$
with
$$A = a + \beta(z-b^+) + \beta^2 a^+, \tag{2.17}$$
$$B = a^-(z-b^+) + \beta\bigl((z-b^+)(z-b) + a^+a^- - a^2\bigr) + \beta^2 a^+(z-b). \tag{2.18}$$
Remark 2.3. Equations (2.14) and (2.15) are the analogs of well-known differential equations for the diagonal Green function of one-dimensional Schrödinger operators (cf., e.g., [14, 24], Eqs. (5.19) and (5.20)). Equation (2.16) is the analog of Eq. (5.18) in [24].
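The identities (2.14) and (2.15) of Lemma 2.2 can be spot-checked in exact rational arithmetic. The following is only a minimal sketch: the coefficient values, the spectral parameter `z`, and the helper names `solve`, `wronskian`, `holds_2_14`, `holds_2_15` are illustrative assumptions, not taken from the paper. It also checks that the Wronskian (2.6) of two solutions does not depend on $n$.

```python
from fractions import Fraction

# Hypothetical coefficients on n = 0..7: a(n) real and nonzero, b(n) real.
a = [Fraction(x) for x in (2, -1, 3, 1, 2, -3, 1, 2)]
b = [Fraction(x) for x in (1, 0, -2, 4, 1, 3, -1, 2)]
z = Fraction(5, 7)  # spectral parameter, chosen so that z != b(n)

def solve(u0, u1, last=7):
    """Solve tau u = z u forward from the initial values u(0), u(1)."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(1, last):
        # a(n)u(n+1) = (z - b(n))u(n) - a(n-1)u(n-1), cf. (2.2)
        u.append(((z - b[n]) * u[n] - a[n - 1] * u[n - 1]) / a[n])
    return u

u, v = solve(1, 0), solve(2, 3)
g = [x * y for x, y in zip(u, v)]

def wronskian(n):
    """Modified Wronskian (2.6) of u and v at n."""
    return a[n] * (u[n] * v[n + 1] - u[n + 1] * v[n])

W = wronskian(0)

def holds_2_14(n):
    lhs = ((a[n + 1] ** 2 * g[n + 2] - a[n] ** 2 * g[n]) / (z - b[n + 1])
           + (a[n] ** 2 * g[n + 1] - a[n - 1] ** 2 * g[n - 1]) / (z - b[n]))
    rhs = (z - b[n + 1]) * g[n + 1] - (z - b[n]) * g[n]
    return lhs == rhs

def holds_2_15(n):
    lhs = (a[n] ** 2 * g[n + 1] - a[n - 1] ** 2 * g[n - 1]
           + (z - b[n]) ** 2 * g[n]) ** 2
    rhs = (z - b[n]) ** 2 * (W ** 2 + 4 * a[n] ** 2 * g[n] * g[n + 1])
    return lhs == rhs

print(all(wronskian(n) == W for n in range(7)))
print(all(holds_2_14(n) for n in range(1, 6)))
print(all(holds_2_15(n) for n in range(1, 7)))
```

Exact fractions are used so that the comparison is an identity test rather than a floating-point tolerance check.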
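Finally, we turn to half line restrictions $H_{\pm,n_0}$ of $H$ defined by
$$H_{\pm,n_0} : D(H_{\pm,n_0}) \to \ell^2(n_0,\pm\infty), \qquad f \mapsto \tau f \tag{2.19}$$
and
$$D(H_{\pm,n_0}) = \{ f\in \ell^2(n_0,\pm\infty) \mid \tau f \in \ell^2(n_0,\pm\infty),\ \lim_{n\to\pm\infty} W_n(u_\pm(z_0),f) = 0\}, \tag{2.20}$$
where we set $f(n_0) = 0$ in the definition of $(\tau f)(n_0\pm1)$. The corresponding Green functions read
$$G_{\pm,n_0}(z,m,n) = \frac{\pm 1}{W(s(z),u_\pm(z))} \begin{cases} s(z,n,n_0)\,u_\pm(z,m) & \text{for } m \mathrel{\substack{\ge \\ \le}} n, \\ s(z,m,n_0)\,u_\pm(z,n) & \text{for } n \mathrel{\substack{\ge \\ \le}} m, \end{cases} \tag{2.21}$$
where $s(z,.,n_0)$ is the solution of $\tau u = zu$ satisfying the Dirichlet boundary condition $s(z,n_0,n_0) = 0$. Here and below, stacked symbols such as $\substack{\ge \\ \le}$ or $\substack{0 \\ 1}$ are read with the upper entry for the upper sign and the lower entry for the lower sign. The analogous quantities of $g(z,n)$ are the Weyl $m$-functions
$$m_\pm(z,n) = \langle\delta_{n\pm1},(H_{\pm,n}-z)^{-1}\delta_{n\pm1}\rangle = G_{\pm,n}(z,n\pm1,n\pm1) = -\frac{u_\pm(z,n\pm1)}{a(n - \substack{0 \\ 1})\,u_\pm(z,n)}, \tag{2.22}$$
which satisfy
$$a(n - \substack{0 \\ 1})^2\, m_\pm(z,n) + \frac{1}{m_\pm(z,n\mp1)} = b(n) - z. \tag{2.23}$$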
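Remark 2.4. We can also consider half line operators $H^\beta_{\pm,n_0}$ on $\ell^2(n_0,\pm\infty)$ associated with the general boundary condition
$$f(n_0+1) + \beta f(n_0) = 0, \qquad \beta \in \mathbb{R}\cup\{\infty\} \tag{2.24}$$
at $n_0$ rather than only the Dirichlet boundary condition $f(n_0) = 0$. We set
$$H^0_{+,n_0} = H_{+,n_0+1}, \qquad H^\beta_{+,n_0} = H_{+,n_0} - a(n_0)\beta^{-1}\langle\delta_{n_0+1},.\rangle\delta_{n_0+1}, \quad \beta \neq 0, \tag{2.25}$$
and
$$H^\infty_{-,n_0} = H_{-,n_0}, \qquad H^\beta_{-,n_0} = H_{-,n_0+1} - a(n_0)\beta\langle\delta_{n_0},.\rangle\delta_{n_0}, \quad \beta \neq \infty, \tag{2.26}$$
implying $H^\beta_{n_0} \cong H^\beta_{-,n_0} \oplus H^\beta_{+,n_0}$.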
3. Asymptotic Expansions

In the sequel, asymptotic expansions for $g(z,n) = G(z,n,n)$ and $\gamma^\beta(z,n)$ will turn out to be very useful. Both quantities are Herglotz functions as can be seen from
$$g(z,n) = \langle\delta_n,(H-z)^{-1}\delta_n\rangle, \tag{3.1}$$
$$\gamma^\beta(z,n) = (1+\beta^2)\langle \delta^\beta_n,(H-z)^{-1}\delta^\beta_n\rangle - \frac{\beta}{a(n)} \tag{3.2}$$
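(we note that, by (2.13), $h(z,n)$ is the difference of two Herglotz functions) and the following lemma which is immediate from the spectral theorem.

Lemma 3.1. Suppose $\delta\in\ell^2(\mathbb{Z})$ with $\|\delta\| = 1$. Then
$$g(z) = \langle\delta,(H-z)^{-1}\delta\rangle \tag{3.3}$$
is Herglotz, that is,
$$g(z) = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\rho_\delta(\lambda), \tag{3.4}$$
where $d\rho_\delta(\lambda) = d\langle\delta,P_{(-\infty,\lambda]}(H)\delta\rangle$ is the spectral measure of $H$ associated to the sequence $\delta$. Moreover,
$$\mathrm{Im}(g(z)) = \mathrm{Im}(z)\,\|(H-z)^{-1}\delta\|^2 \tag{3.5}$$
and
$$g(\bar z) = \overline{g(z)}, \qquad |g(z)| \le \|(H-z)^{-1}\| \le \frac{1}{|\mathrm{Im}(z)|}. \tag{3.6}$$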
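Next, we turn to asymptotic expansions for $g(z,n)$, $h(z,n)$, and $\gamma^\beta(z,n)$.

Theorem 3.2. The quantities $g(z,n)$, $h(z,n)$, and $\gamma^\beta(z,n)$ have the following asymptotic expansions as $|z|\to\infty$ with $|\mathrm{Im}(z)|\ge\varepsilon$, for arbitrary $\varepsilon>0$:
$$g(z,n) \sim -\sum_{j=0}^\infty \frac{g_j(n)}{z^{j+1}}, \qquad g_0 = 1, \tag{3.7}$$
$$h(z,n) \sim -1 - \sum_{j=0}^\infty \frac{h_j(n)}{z^{j+1}}, \qquad h_0 = 0, \tag{3.8}$$
$$\gamma^\beta(z,n) \sim -\frac{\beta}{a(n)} - \sum_{j=0}^\infty \frac{\gamma^\beta_j(n)}{z^{j+1}}, \qquad \gamma^\beta_0 = 1+\beta^2. \tag{3.9}$$
Moreover, the coefficients are given by
$$g_j(n) = \langle\delta_n, H^j\delta_n\rangle, \qquad j\in\mathbb{N}_0, \tag{3.10}$$
$$h_j(n) = 2a(n)\langle\delta_{n+1}, H^j\delta_n\rangle, \qquad j\in\mathbb{N}_0, \tag{3.11}$$
$$\gamma^\beta_j(n) = \langle(\delta_{n+1}+\beta\delta_n), H^j(\delta_{n+1}+\beta\delta_n)\rangle = g_j(n+1) + \frac{\beta}{a(n)}\,h_j(n) + \beta^2 g_j(n), \qquad j\in\mathbb{N}_0. \tag{3.12}$$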
Proof. We only carry out the proof for $g(z,n)$ since the remaining expansions are similar. Rewriting $g(z,n)$ as
$$g(z,n) = \langle\delta_n,(H-z)^{-1}\delta_n\rangle = -\sum_{j=0}^{N-1}\frac{1}{z^{j+1}}\langle\delta_n,H^j\delta_n\rangle + \frac{1}{z^N}\langle\delta_n,H^N(H-z)^{-1}\delta_n\rangle, \qquad N\in\mathbb{N}, \tag{3.13}$$
shows that it suffices to verify that the last term is $O(z^{-N})$. This follows from
$$|\langle\delta_n,H^N(H-z)^{-1}\delta_n\rangle| \le \frac{\|H^N\delta_n\|}{|\mathrm{Im}(z)|} \le \frac{\|H^N\delta_n\|}{\varepsilon}. \tag{3.14}$$
Remark 3.3. (i) If $H$ is bounded, the above expansions are in fact Laurent series converging for $|z| > \|H\|$.
(ii) Pick $\varepsilon(n)\in\{-1,+1\}$ and introduce $a_\varepsilon(n) = \varepsilon(n)a(n)$ and $b_\varepsilon(n) = b(n)$. Then the operator $H_\varepsilon$ associated with $a_\varepsilon$, $b_\varepsilon$ is unitarily equivalent to $H$. Indeed, take the unitary operator $U_\varepsilon = \{\tilde\varepsilon(n)\delta_{m,n}\}_{m,n\in\mathbb{Z}}$, where $\tilde\varepsilon(n+1)\tilde\varepsilon(n) = \varepsilon(n)$; then $H_\varepsilon = U_\varepsilon H U_\varepsilon^{-1}$. In particular, this shows that $g(n)$, $h(n)$ do not depend on the sign of $a$, that is, they only depend on $a^2$.

The following lemma ([7], Lemma 2.1) shows how to compute $g_j$, $h_j$ recursively.

Lemma 3.4. The coefficients $g_j(n)$ and $h_j(n)$ for $j\in\mathbb{N}_0$ satisfy the following recursion relations:
$$g_{j+1} = \frac{h_j + h_j^-}{2} + b\,g_j, \tag{3.15}$$
$$h_{j+1} - h_{j+1}^- = 2\bigl(a^2 g_j^+ - (a^-)^2 g_j^-\bigr) + b\bigl(h_j - h_j^-\bigr). \tag{3.16}$$

Proof. The first equation follows from
$$g_{j+1}(n) = \langle H\delta_n, H^j\delta_n\rangle = \frac{h_j(n) + h_j(n-1)}{2} + b(n)\,g_j(n). \tag{3.17}$$
Similarly,
$$\begin{aligned} h_{j+1}(n) &= b(n)h_j(n) + 2a(n)^2 g_j(n+1) + 2a(n-1)a(n)\langle\delta_{n+1},H^j\delta_{n-1}\rangle \\ &= b(n+1)h_j(n) + 2a(n)^2 g_j(n) + 2a(n)a(n+1)\langle\delta_{n+2},H^j\delta_n\rangle. \end{aligned} \tag{3.18}$$
Eliminating $\langle\delta_{n+1},H^j\delta_{n-1}\rangle$ completes the proof.
This system does not determine $g_j(n)$, $h_j(n)$ uniquely since it requires solving a first-order recurrence relation at each step, producing an unknown summation constant each time. To determine these constants we assign the weight one to $a(n)$ and $b(n)$, $n\in\mathbb{Z}$. Then $g_{j+1}(n)$ and $h_j(n)$ have weight $j+1$, fixing the summation constants. To avoid this drawback we advocate a different approach using (2.15). First observe that $h_j(n)$ can be determined if $g_j(n)$ is known using
$$h_{j+1} = b\,h_j + g_{j+2} - 2b\,g_{j+1} + a^2 g_j^+ - (a^-)^2 g_j^- + b^2 g_j, \qquad j\in\mathbb{N}_0, \tag{3.19}$$
which follows after inserting (3.15) into (3.16). In addition, inserting the expansion (3.7) for $g(z,n)$ into (2.15) and comparing coefficients of $z^j$ one infers
$$g_0 = 1, \quad g_1 = b, \quad g_2 = a^2 + (a^-)^2 + b^2, \quad g_3 = a^2(b^+ + 2b) + (a^-)^2(2b + b^-) + b^3, \tag{3.20}$$
and
$$\begin{aligned} g_{j+1} = {}& 2b\,g_j - a^2 g_{j-1}^+ + (a^-)^2 g_{j-1}^- - b^2 g_{j-1} - \frac12\sum_{\ell=0}^{j-1} k_{j-\ell-1}k_\ell \\ & + 2a^2\Bigl(\sum_{\ell=0}^{j-1} g_{j-\ell-1}g_\ell^+ - 2b\sum_{\ell=0}^{j-2} g_{j-\ell-2}g_\ell^+ + b^2\sum_{\ell=0}^{j-3} g_{j-\ell-3}g_\ell^+\Bigr), \end{aligned} \tag{3.21}$$
for $j\ge3$, where $k_0(n) = -b(n)$ and
$$k_j = a^2 g_{j-1}^+ - (a^-)^2 g_{j-1}^- + b^2 g_{j-1} - 2b\,g_j + g_{j+1}, \qquad j\in\mathbb{N}. \tag{3.22}$$
Analogously, one can get a recurrence relation for $\gamma^\beta_j$ using (2.16). Since this approach gets too cumbersome we omit further details at this point but note that $\gamma^\beta_j$ can be computed from (3.12). Invoking (3.19) one explicitly obtains
$$h_0 = 0, \qquad h_1 = 2a^2, \qquad h_2 = 2a^2(b^+ + b) \tag{3.23}$$
and hence
$$\gamma^\beta_0 = 1+\beta^2, \qquad \gamma^\beta_1 = b^+ + 2a\beta + b\beta^2, \qquad \gamma^\beta_2 = (a^+)^2 + a^2 + (b^+)^2 + 2a(b^+ + b)\beta + \bigl(a^2 + (a^-)^2 + b^2\bigr)\beta^2. \tag{3.24}$$
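The closed forms (3.20) and (3.23) can be compared with the defining matrix elements (3.10) and (3.11). The sketch below is illustrative only: the integer data and the helper names `jacobi_section` and `moments` are assumptions. It uses a finite section of $H$, which reproduces $\langle\delta_m,H^j\delta_n\rangle$ exactly as long as $j$ is smaller than the distance of the sites involved from the boundary of the section.

```python
def jacobi_section(a, b):
    """Finite section of H: b on the diagonal, a on both off-diagonals."""
    size = len(b)
    H = [[0] * size for _ in range(size)]
    for i in range(size):
        H[i][i] = b[i]
        if i + 1 < size:
            H[i][i + 1] = H[i + 1][i] = a[i]
    return H

def moments(H, n, a_n, jmax):
    """g_j = <delta_n, H^j delta_n> and h_j = 2a(n)<delta_{n+1}, H^j delta_n>."""
    x = [0] * len(H)
    x[n] = 1                      # x holds H^j delta_n, starting with j = 0
    g, h = [], []
    for _ in range(jmax + 1):
        g.append(x[n])
        h.append(2 * a_n * x[n + 1])
        x = [sum(row[k] * x[k] for k in range(len(x))) for row in H]
    return g, h

# Hypothetical integer data; a[i] couples sites i and i+1.
a = [1, 2, 3, 1, 2, 1, 3, 2]
b = [0, 1, -1, 2, 3, -2, 1, 0, 1]
n = 4
g, h = moments(jacobi_section(a, b), n, a[n], 3)

am, a0, bm, b0, bp = a[n - 1], a[n], b[n - 1], b[n], b[n + 1]
expected_g = [1, b0, a0**2 + am**2 + b0**2,
              a0**2 * (bp + 2 * b0) + am**2 * (2 * b0 + bm) + b0**3]   # (3.20)
expected_h = [0, 2 * a0**2, 2 * a0**2 * (bp + b0)]                      # (3.23)
print(g == expected_g, h[:3] == expected_h)
```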
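Remark 3.5. Instead of (3.19) and (3.21) one can also use (3.15) together with
$$h_{j+1} = 2a^2\sum_{\ell=0}^{j} g_{j-\ell}\,g_\ell^+ - \frac12\sum_{\ell=0}^{j} h_{j-\ell}\,h_\ell, \qquad j\in\mathbb{N}, \tag{3.25}$$
to determine $g_j$, $h_j$. The above equation follows as before using (4.6) below.

Next we turn to Weyl $m$-functions. As before we obtain

Lemma 3.6. The quantities $m_\pm(z,n)$ have the asymptotic expansions, as $|z|\to\infty$ with $|\mathrm{Im}(z)|\ge\varepsilon$,
$$m_\pm(z,n) \sim -\sum_{j=0}^\infty \frac{m_{\pm,j}(n)}{z^{j+1}}, \qquad m_{\pm,0}(n) = 1. \tag{3.26}$$
The coefficients $m_{\pm,j}(n)$ are given by
$$m_{\pm,j}(n) = \langle\delta_{n\pm1},(H_{\pm,n})^j\delta_{n\pm1}\rangle, \qquad j\in\mathbb{N}, \tag{3.27}$$
and satisfy
$$m_{\pm,0} = 1, \qquad m_{\pm,1} = b^\pm, \qquad m_{\pm,j+1} = b^\pm m_{\pm,j} + \begin{Bmatrix}(a^+)^2 \\ (a^{--})^2\end{Bmatrix}\sum_{\ell=0}^{j-1} m_{\pm,j-\ell-1}\,m^\pm_{\pm,\ell}, \qquad j\in\mathbb{N}, \tag{3.28}$$
where the upper entry $(a^+)^2$ belongs to $m_+$ and the lower entry $(a^{--})^2$ to $m_-$.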
Remark 3.7. As in Remark 3.3 we have that (3.26) converges for $|z| > \|H_{\pm,n}\|$ if $H_{\pm,n}$ is bounded and $m_\pm(z,n)$ depend only on $a^2$.
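The recursion of Remark 3.5, that is (3.15) combined with (3.25), can be sketched as follows. This is an illustration under stated assumptions: the coefficient functions `a` and `b` are hypothetical test data, `make_coefficients` is an illustrative name, and (3.25) is also applied for $j = 0$, where it reproduces $h_1 = 2a^2$ from (3.23).

```python
from fractions import Fraction
from functools import lru_cache

def make_coefficients(a, b):
    """Return memoized functions g(j, n), h(j, n) for callables a, b on the integers."""
    @lru_cache(maxsize=None)
    def g(j, n):
        if j == 0:
            return Fraction(1)                       # g_0 = 1
        # (3.15): g_{j+1} = (h_j + h_j^-)/2 + b g_j
        return (h(j - 1, n) + h(j - 1, n - 1)) / 2 + b(n) * g(j - 1, n)

    @lru_cache(maxsize=None)
    def h(j, n):
        if j == 0:
            return Fraction(0)                       # h_0 = 0
        k = j - 1
        # (3.25): h_{k+1} = 2a^2 sum g_{k-l} g_l^+ - (1/2) sum h_{k-l} h_l
        s1 = sum(g(k - l, n) * g(l, n + 1) for l in range(k + 1))
        s2 = sum(h(k - l, n) * h(l, n) for l in range(k + 1))
        return 2 * a(n) ** 2 * s1 - s2 / 2

    return g, h

def a(n):
    return Fraction(1 + n % 3)    # hypothetical nonzero off-diagonal entries

def b(n):
    return Fraction(n)            # hypothetical diagonal entries

g, h = make_coefficients(a, b)
n = 2
g2 = a(n) ** 2 + a(n - 1) ** 2 + b(n) ** 2
g3 = (a(n) ** 2 * (b(n + 1) + 2 * b(n))
      + a(n - 1) ** 2 * (2 * b(n) + b(n - 1)) + b(n) ** 3)
print(g(2, n) == g2, g(3, n) == g3, h(2, n) == 2 * a(n) ** 2 * (b(n + 1) + b(n)))
```

Because each $h_{j+1}(n)$ only needs $g_\ell$ at $n$ and $n+1$ with $\ell\le j$, no summation constants appear, in contrast to the system (3.15), (3.16).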
4. Inverse Spectral Theory

In this section we present a simple recursive method of reconstructing the sequences $a^2$, $b$ when the Weyl matrix
$$M(z,n) = \begin{pmatrix} G(z,n,n) & G(z,n,n+1) \\ G(z,n+1,n) & G(z,n+1,n+1) \end{pmatrix} - \frac{1}{2a(n)}\begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix} = \begin{pmatrix} g(z,n) & \dfrac{h(z,n)}{2a(n)} \\ \dfrac{h(z,n)}{2a(n)} & g(z,n+1)\end{pmatrix}, \qquad z\in\mathbb{C}\setminus\sigma(H), \tag{4.1}$$
is known for one fixed $n\in\mathbb{Z}$. As a consequence, we are led to several uniqueness results. From the previous section we know
$$g(z,n) = -\frac1z - \frac{b(n)}{z^2} + O\Bigl(\frac{1}{z^3}\Bigr), \tag{4.2}$$
$$h(z,n) = -1 - \frac{2a(n)^2}{z^2} + O\Bigl(\frac{1}{z^3}\Bigr). \tag{4.3}$$
Here and in the remainder of this paper all $O(z^{-\ell})$ terms apply for $|z|\to\infty$, $|\mathrm{Im}(z)|\ge\varepsilon>0$. Hence
$$b(n) = -\lim_{z\to i\infty} z\bigl(1 + z\,g(z,n)\bigr), \tag{4.4}$$
$$a(n)^2 = -\frac12\lim_{z\to i\infty} z^2\bigl(1 + h(z,n)\bigr). \tag{4.5}$$
Moreover, we have the useful identities (use (2.10) and (2.11))
$$4a(n)^2 g(z,n)\,g(z,n+1) = h(z,n)^2 - 1 \tag{4.6}$$
and
$$h(z,n+1) + h(z,n) = 2\bigl(z - b(n+1)\bigr)g(z,n+1), \tag{4.7}$$
which show that $g(z,n)$ and $h(z,n)$ together with $a(n)^2$ and $b(n)$ can be determined recursively if, say, $g(z,n_0)$ and $h(z,n_0)$ are given. In addition, we infer that $a(n)^2$, $g(z,n)$, $g(z,n+1)$ determine $h(z,n)$ up to one sign,
$$h(z,n) = \bigl(1 + 4a(n)^2 g(z,n)\,g(z,n+1)\bigr)^{1/2}, \tag{4.8}$$
since $h(z,n)$ is holomorphic with respect to $z\in\mathbb{C}\setminus\sigma(H)$ and $h(\bar z,n) = \overline{h(z,n)}$. However, this sign can be determined from the asymptotic behavior $h(z,n) = -1 + O(z^{-2})$. Hence we have reproved the well-known result that $M(z,n_0)$ determines the sequences $a^2$, $b$. In fact, we have proved the slightly stronger result:

Theorem 4.1. Any one of the following sets of data
(i) $g(.,n_0)$ and $h(.,n_0)$
(ii) $g(.,n_0+1)$ and $h(.,n_0)$
(iii) $g(.,n_0)$, $g(.,n_0+1)$, and $a(n_0)^2$
for one fixed $n_0\in\mathbb{Z}$ uniquely determines the sequences $a^2$ and $b$.
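The recursive scheme behind Theorem 4.1 (i) can be illustrated at the level of the expansions (3.7), (3.8). Writing $w = 1/z$, $\mathcal{G}(w) = \sum_j g_j w^j$ and $\mathcal{H}(w) = \sum_k h_{k+1} w^k$, the identities (4.6) and (4.7) become $4a^2\mathcal{G}\mathcal{G}^+ = 2\mathcal{H} + w^2\mathcal{H}^2$ and $w^2(\mathcal{H} + \mathcal{H}^+) = 2(1 - b^+w)\mathcal{G}^+ - 2$. The sketch below is an assumption-laden illustration rather than the method of the paper: the test data and the helper names `moments`, `mul`, `div`, `step` are hypothetical, and truncated power series replace the limits (4.4), (4.5). Each step moves from site $n$ to $n+1$ and uses up two orders of the expansion.

```python
from fractions import Fraction

def moments(a, b, n, jmax):
    """Exact g_j(n), h_j(n) of (3.10), (3.11) from a finite section of H."""
    size = len(b)
    x = [Fraction(0)] * size
    x[n] = Fraction(1)
    g, h = [], []
    for _ in range(jmax + 1):
        g.append(x[n])
        h.append(2 * a[n] * x[n + 1])
        y = [b[i] * x[i] for i in range(size)]
        for i in range(size - 1):
            y[i] += a[i] * x[i + 1]
            y[i + 1] += a[i] * x[i]
        x = y
    return g, h

def mul(p, q):
    """Product of two truncated power series in w = 1/z."""
    m = min(len(p), len(q))
    return [sum(p[i] * q[k - i] for i in range(k + 1)) for k in range(m)]

def div(p, q):
    """Quotient p/q of truncated power series, assuming q[0] != 0."""
    r = []
    for k in range(min(len(p), len(q))):
        r.append((p[k] - sum(r[i] * q[k - i] for i in range(k))) / q[0])
    return r

def step(G, Hs):
    """Pass from site n to n+1; G[j] = g_j(n), Hs[k] = h_{k+1}(n)."""
    a2 = Hs[0] / 2                                        # a(n)^2, cf. (4.3)
    sq = mul(Hs, Hs)
    num = [Hs[k] + (sq[k - 2] / 2 if k >= 2 else 0) for k in range(len(Hs))]
    Gp = [c / (2 * a2) for c in div(num, G)]              # series of g(., n+1), from (4.6)
    bp = Gp[1]                                            # b(n+1), cf. (4.2)
    T = [2 * (Gp[k] - (bp * Gp[k - 1] if k >= 1 else 0)) for k in range(len(Gp))]
    Hp = [T[k + 2] - Hs[k] for k in range(len(T) - 2)]    # series of h(., n+1), from (4.7)
    return Gp, Hp

# Hypothetical data on sites 0..11; a[i] couples sites i and i+1.
a = [Fraction(x) for x in (1, 2, 1, 3, 2, 1, 2, 3, 1, 2, 1)]
b = [Fraction(x) for x in (0, 1, -1, 2, 0, 3, -2, 1, 4, -1, 2, 0)]
n0 = 5
g, h = moments(a, b, n0, 9)
G, Hs = g, h[1:]
recovered = []                                            # pairs (a(n)^2, b(n)), n = n0, n0+1, ...
while True:
    recovered.append((Hs[0] / 2, G[1]))
    if len(Hs) < 3:
        break
    G, Hs = step(G, Hs)
print(recovered == [(a[n] ** 2, b[n]) for n in range(n0, n0 + 5)])
```

Only $a(n)^2$ is recovered, in line with Remark 3.3 (ii): the sign of $a(n)$ is not encoded in $g$ and $h$.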
Remark 4.2. (i) We want to emphasize that the diagonal elements $g(z,n_0)$ and $g(z,n_0+1)$ alone plus $a(n_0)^2$ are sufficient to reconstruct $a(n)^2$, $b(n)$. This is in contradistinction to the case of one-dimensional Schrödinger operators, where the diagonal elements of the Weyl matrix determine the potential only up to reflection. It is not clear to me whether this different behavior of Jacobi operators has been previously noted in the literature. The reader might wonder what the Weyl matrix of the operator $H_R$ associated with the (at $n_0$) reflected coefficients $a_R$, $b_R$ (i.e., $a_R(n_0-k-1) = a(n_0+k)$, $b_R(n_0-k) = b(n_0+k)$, $k\in\mathbb{Z}$) looks like. Since reflection at $n_0$ exchanges $m_\pm(z,n_0)$ (i.e., $m_{R,\pm}(z,n_0) = m_\mp(z,n_0)$) we infer
$$g_R(z,n_0) = g(z,n_0), \tag{4.9}$$
$$h_R(z,n_0) = -h(z,n_0) + 2\bigl(z-b(n_0)\bigr)g(z,n_0), \tag{4.10}$$
$$g_R(z,n_0+1) = \frac{a(n_0)^2}{a(n_0-1)^2}\,g(z,n_0+1) + \frac{z-b(n_0)}{a(n_0-1)^2}\Bigl(-h(z,n_0) + \bigl(z-b(n_0)\bigr)g(z,n_0)\Bigr), \tag{4.11}$$
in obvious notation. (The sign of $h(z,n_0)$ in (4.11) is the one that follows from (4.13) below, since $g_R(z,n_0+1) = g(z,n_0-1)$.)
(ii) Remark 3.3 (ii) shows that the sign of $a(n)$ cannot be determined from either $g(z,n_0)$, $h(z,n_0)$, or $g(z,n_0+1)$.
(iii) Clearly, if $H$ is l.c. at $\pm\infty$ the corresponding boundary condition is determined by $M(z,n)$ as well.
(iv) Equation (4.6) is equivalent to $\det M(z,n) = -1/(2a(n))^2$. The analogous equation for the Schrödinger case was first used by Rofe-Beketov in connection with inverse problems (see [31], Sect. 7.3).

The off-diagonal Green function can be recovered as follows,
$$G(z,n,n+k) = g(z,n)\prod_{j=n}^{n+k-1}\frac{1 + h(z,j)}{2a(j)\,g(z,j)}, \qquad k>0, \tag{4.12}$$
and we remark
$$a(n)^2 g(z,n+1) - a(n-1)^2 g(z,n-1) + \bigl(z-b(n)\bigr)^2 g(z,n) = \bigl(z-b(n)\bigr)h(z,n). \tag{4.13}$$
A similar procedure works for $H_+$. The asymptotic expansion
$$m_+(z,n) = -\frac1z - \frac{b(n+1)}{z^2} - \frac{a(n+1)^2 + b(n+1)^2}{z^3} + O(z^{-4}) \tag{4.14}$$
shows that $a(n+1)^2$, $b(n+1)$ can be recovered from $m_+(z,n)$. In addition, (2.23) shows that $m_+(z,n_0)$ determines $a(n)^2$, $b(n)$, $m_+(z,n)$, $n>n_0$. Similarly (by reflection), $m_-(z,n_0)$ determines $a(n-1)^2$, $b(n)$, $m_-(z,n-1)$, $n<n_0$. Hence both $m_\pm(z,n_0)$ determine $a(n)^2$, $b(n)$ except for $a(n_0-1)^2$, $a(n_0)^2$, $b(n_0)$. However, introducing $\tilde m_\pm(z,n) = \mp u_\pm(z,n+1)/(a(n)u_\pm(z,n))$ and considering
$$\tilde m_+(z,n) = m_+(z,n), \qquad \tilde m_-(z,n) = \frac{z - b(n) + a(n-1)^2 m_-(z,n)}{a(n)^2}, \tag{4.15}$$
we see that $\tilde m_-(z,n_0)$ determines $a(n_0-1)^2$, $a(n_0)^2$, $b(n_0)$ and $m_-(z,n_0)$. Summarizing:
Theorem 4.3. The quantities $\tilde m_\pm(z,n_0)$ uniquely determine $a(n)^2$, $b(n)$ for all $n\in\mathbb{Z}$. Moreover, we have
$$g(z,n) = \frac{-a(n)^{-2}}{\tilde m_+(z,n) + \tilde m_-(z,n)}, \qquad g(z,n+1) = \frac{\tilde m_+(z,n)\,\tilde m_-(z,n)}{\tilde m_+(z,n) + \tilde m_-(z,n)}, \qquad h(z,n) = \frac{\tilde m_+(z,n) - \tilde m_-(z,n)}{\tilde m_+(z,n) + \tilde m_-(z,n)}, \tag{4.16}$$
and conversely
$$\tilde m_\pm(z,n) = -\frac{1\pm h(z,n)}{2a(n)^2 g(z,n)} = \frac{2g(z,n+1)}{1\mp h(z,n)}. \tag{4.17}$$
Next we recall the function $\gamma^\beta(z,n)$ introduced in (2.13) with asymptotic expansion
$$\gamma^\beta(z,n) = -\frac{\beta}{a(n)} - \frac{1+\beta^2}{z} - \frac{b(n+1) + 2\beta a(n) + \beta^2 b(n)}{z^2} + O\Bigl(\frac{1}{z^3}\Bigr). \tag{4.18}$$
Our goal is to prove

Theorem 4.4. Let $\beta_{1,2}\in\mathbb{R}\cup\{\infty\}$ with $\beta_1\neq\beta_2$ be given. Then $\gamma^{\beta_j}(.,n_0)$, $j=1,2$, for one fixed $n_0\in\mathbb{Z}$ uniquely determines $a(n)^2$, $b(n)$ for all $n\in\mathbb{Z}$ (set $\gamma^\infty(z,n) = g(z,n)$) unless $(\beta_1,\beta_2) = (0,\infty), (\infty,0)$. In the latter case $a(n_0)^2$ is needed in addition. More explicitly, we have
$$g(z,n) = \frac{\gamma^{\beta_1}(z,n) + \gamma^{\beta_2}(z,n) + 2R(z)}{(\beta_2-\beta_1)^2}, \tag{4.19}$$
$$g(z,n+1) = \frac{\beta_2^2\gamma^{\beta_1}(z,n) + \beta_1^2\gamma^{\beta_2}(z,n) + 2\beta_1\beta_2 R(z)}{(\beta_2-\beta_1)^2}, \tag{4.20}$$
$$h(z,n) = \frac{\beta_2\gamma^{\beta_1}(z,n) + \beta_1\gamma^{\beta_2}(z,n) + (\beta_1+\beta_2)R(z)}{(-2a(n))^{-1}(\beta_2-\beta_1)^2}, \tag{4.21}$$
where $R(z)$ is the branch of
$$R(z) = \Bigl(\frac{(\beta_2-\beta_1)^2}{4a(n)^2} + \gamma^{\beta_1}(z,n)\,\gamma^{\beta_2}(z,n)\Bigr)^{1/2} = \frac{\beta_1+\beta_2}{2a(n)} + O\Bigl(\frac1z\Bigr), \tag{4.22}$$
which is holomorphic for $z\in\mathbb{C}\setminus\mathbb{R}$ and has asymptotic behavior as indicated. If one of the numbers $\beta_{1,2}$ equals $\infty$, one has to replace all formulas by their limit using $g(z,n) = \lim_{\beta\to\infty}\beta^{-2}\gamma^\beta(z,n)$.
Proof. Clearly, if $(\beta_1,\beta_2)\neq(0,\infty),(\infty,0)$ we can determine $a(n)$ from (4.18). Hence by Theorem 4.1 it suffices to show (4.19) – (4.21). Since (4.19) follows from (4.6) and the other two, it remains to establish (4.20) and (4.21). This will follow if we prove that the system
$$(g^+)^2 + 2\,\frac{\beta_j}{2a(n)}\,h\,g^+ + \frac{\beta_j^2}{4a(n)^2}\,(h^2-1) = g^+\gamma^{\beta_j}(z,n), \qquad j=1,2, \tag{4.23}$$
has a unique solution $(g^+,h) = (g(z,n+1),h(z,n))$ for $|z|$ large enough, $|\mathrm{Im}(z)|\ge\varepsilon$, which is holomorphic with respect to $z$ and satisfies the asymptotic requirements from above. We first consider the case $\beta_j\neq0,\infty$. Changing to new variables $(x_1,x_2)$, $x_j = (2a(n)/\beta_j)g^+ + h$, our system reads
$$x_j^2 - 1 = \frac{\beta_1\beta_2}{\beta_2-\beta_1}\,\frac{2a(n)\gamma^{\beta_j}(z,n)}{\beta_j^2}\,(x_1-x_2), \qquad j=1,2. \tag{4.24}$$
Picking $|z|$ large enough we can assume $\gamma^{\beta_j}(z,n)\neq0$ and the solution set of the new system is given by the intersection of two parabolas. In particular, (4.23) has at most four solutions. Two of them are clearly $g^+ = 0$, $h = \pm1$. But they do not have the correct asymptotic behavior and hence are of no interest to us. The remaining two solutions are given by (4.20) and (4.21) with the branch of $R(z)$ arbitrary. However, we only get correct asymptotics ($g^+ = -z^{-1} + O(z^{-2})$ resp. $h = -1 + O(z^{-2})$) if we fix the branch as in (4.22). This shows that $g(z,n+1)$, $h(z,n)$ can be reconstructed from $\gamma^{\beta_j}$, $j=1,2$, and we are done. The remaining cases can be treated similarly.

Corollary 4.5. Suppose $H$ has purely discrete spectrum. Then $a(n_0)$, $\sigma(H)$ plus $\beta_j$, $\sigma(H^{\beta_j}_{n_0})$, $j=1,2$, for two values $\beta_1\neq\beta_2$ uniquely determine the coefficients $a(n)^2$, $b(n)$ (and the boundary condition at $\pm\infty$ if any).

Proof. Since $H$ has purely discrete spectrum the same is true for $H^\beta_{n_0}$. Hence $\gamma^\beta(z,n_0)$ is meromorphic with poles at the eigenvalues of $H$ and zeros at the eigenvalues of $H^\beta_{n_0}$, following from (2.13) (if eigenvalues of $H$ and $H^\beta_{n_0}$ coincide we have a double zero in the numerator of (2.13) and a single zero in the denominator). Thus we know when $\gamma^\beta(z,n_0)$ changes sign, implying that we know the exponential Herglotz measure of $\gamma^\beta(z,n_0)$ (cf. (A.2)). The remaining constant $c$ in (A.2) follows from the asymptotic behavior (see also (5.19)). Hence we can reconstruct $\gamma^\beta(z,n_0)$ from $a(n_0)$, $\sigma(H)$ and $\beta$, $\sigma(H^\beta_{n_0})$, completing the proof.

Finally, let us turn to half line operators $H^\beta_+ = H^\beta_{+,0}$ (cf. Remark 2.4). Since the dependence on $a(0)$ can be removed by scaling $\beta$, we assume without restriction $a(0) = 1$ for the remainder of this section. We will now prove the following generalization of a result by Fu and Hochstadt [13] (where the special case $\beta_1 = 0$, $\beta_2 = \infty$ is proved under somewhat more restrictive conditions).
Theorem 4.6. Suppose the spectrum of $H^\beta_+$ is purely discrete for one $\beta\in\mathbb{R}\cup\{\infty\}$ (and hence for all $\beta$) and let $\beta_j$, $j=1,2$, be two different values which have opposite signs if $0<|\beta_j|<\infty$. Then $\beta_j$ plus $\sigma(H^{\beta_j}_+)$, $j=1,2$, uniquely determine the coefficients $a(n)^2$, $b(n)$ (and the boundary condition at $+\infty$ if any).

Proof. Without restriction we suppose $\beta_2\neq0$ and $\beta_1\neq\infty$. Then F (z) = −
β2 (m+ (z) − β1 ) = m+ (z) − β2
1 β2
−1 β1 β 2 + 1 m (z) − β2 − m+ (z) +
(4.25)
is a meromorphic Herglotz function since $m_+(z) = m_+(z,0)$ is. Moreover, since $m_+(z) = -u_+(z,1)/u_+(z,0)$ by (2.22) (where $u_+(z,0)$ has to be defined as $-a(1)u_+(z,2) + (z-b(1))u_+(z,1)$; recall our convention $a(0) = 1$), we infer that the zeros of $F(z)$ are given by the eigenvalues of $H^{\beta_1}_+$ and the poles by the eigenvalues of $H^{\beta_2}_+$. Thus we know the exponential Herglotz measure $\xi(\lambda)$ of $F(z)$ (cf. (A.2)). The remaining constant $c$ in (A.2) can be determined from the asymptotic behavior $F(z) = -\beta_1 - (1-\beta_1\beta_2^{-1})z^{-1} + O(z^{-2})$. Thus $F(z)$ is known and solving $F(z)$ for $m_+(z)$ finishes the proof.
5. General Trace Formulas and ξ Functions

In this section we will investigate trace formulas for Jacobi operators $H$. We will essentially follow the philosophy of [17, 25] and use the exponential Herglotz representation (A.2) rather than (A.1). This will produce generalizations of the formula (1.3). To avoid the Abelian limits of [17] we will first consider the case where $H$ (and thus $a$, $b$) is bounded. We abbreviate
$$E_0 = \inf\sigma(H), \qquad E_\infty = \sup\sigma(H), \tag{5.1}$$
and note that $G(\lambda,n,n)>0$ for $\lambda<E_0$, which follows from $(H-\lambda)>0$ (implying $(H-\lambda)^{-1}>0$). Similarly, $G(\lambda,n,n)<0$ for $\lambda>E_\infty$, following from $(H-\lambda)<0$. Our main tool will be the following exponential representation of the Herglotz function $g(z,n) = G(z,n,n)$ (cf. Theorem A.2)
$$g(z,n) = |g(i,n)|\exp\Bigl(\int_{\mathbb{R}}\Bigl(\frac{1}{\lambda-z} - \frac{\lambda}{1+\lambda^2}\Bigr)\xi(\lambda,n)\,d\lambda\Bigr), \qquad z\in\mathbb{C}\setminus\sigma(H), \tag{5.2}$$
where the ξ function $\xi(\lambda,n)$ is defined by
$$\xi(\lambda,n) = \frac1\pi\lim_{\varepsilon\downarrow0}\arg g(\lambda+i\varepsilon,n), \qquad \arg(.)\in(-\pi,\pi]. \tag{5.3}$$
In addition, $\xi(\lambda,n)$ (which is only defined a.e.) satisfies $0\le\xi(\lambda,n)\le1$,
$$\int_{\mathbb{R}}\frac{\xi(\lambda,n)}{1+\lambda^2}\,d\lambda = \arg g(i,n), \qquad\text{and}\qquad \xi(\lambda,n) = \begin{cases} 0 & \text{for } \lambda<E_0, \\ 1 & \text{for } \lambda>E_\infty. \end{cases} \tag{5.4}$$
Using (5.4) together with the asymptotic behavior of $g(.,n)$ we infer
$$g(z,n) = \frac{1}{E_\infty - z}\exp\Bigl(\int_{E_0}^{E_\infty}\frac{\xi(\lambda,n)\,d\lambda}{\lambda-z}\Bigr). \tag{5.5}$$
Theorem 5.1. Suppose $H$ is bounded and let $\xi(\lambda,n)$ be defined as above. Then we have the following trace formula:
$$b^{(\ell)}(n) = E_\infty^\ell - \ell\int_{E_0}^{E_\infty}\lambda^{\ell-1}\xi(\lambda,n)\,d\lambda, \tag{5.6}$$
where
$$b^{(1)}(n) = b(n), \qquad b^{(\ell)}(n) = \ell\,g_\ell(n) - \sum_{j=1}^{\ell-1} g_{\ell-j}(n)\,b^{(j)}(n), \quad \ell>1. \tag{5.7}$$

Proof. The claim follows after expanding both sides of
$$\ln\bigl((E_\infty - z)\,g(z,n)\bigr) = \int_{E_0}^{E_\infty}\frac{\xi(\lambda,n)\,d\lambda}{\lambda-z} \tag{5.8}$$
and comparing coefficients using the following connections between the series of $g(z)$ and $\ln(1+g(z))$ (cf., e.g., [33]). Let $g(z)$ have the asymptotic expansion
$$g(z) = \sum_{\ell=1}^\infty\frac{g_\ell}{z^\ell} \tag{5.9}$$
as $z\to\infty$. Then we have
$$\ln(1+g(z)) = \sum_{\ell=1}^\infty\frac{c_\ell}{z^\ell}, \tag{5.10}$$
where
$$c_1 = g_1, \qquad c_\ell = g_\ell - \sum_{j=1}^{\ell-1}\frac{j}{\ell}\,g_{\ell-j}\,c_j, \quad \ell\ge2. \tag{5.11}$$
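The recursion (5.11) for the coefficients of $\ln(1+g(z))$ is easy to sketch and to test against series whose logarithm is known in closed form. The function name `log_series` and the test value `x` below are illustrative assumptions.

```python
from fractions import Fraction

def log_series(g):
    """Given g[l-1] = g_l for l = 1..L, return c with c[l-1] = c_l from (5.11)."""
    c = []
    for l in range(1, len(g) + 1):
        corr = sum(Fraction(j, l) * g[l - j - 1] * c[j - 1] for j in range(1, l))
        c.append(g[l - 1] - corr)
    return c

x = Fraction(3, 2)
L = 6

# Case 1: g_l = x^l, so 1 + g(z) = 1/(1 - x/z) and ln(1 + g(z)) = sum x^l / (l z^l).
c_geometric = log_series([x ** l for l in range(1, L + 1)])
print(c_geometric == [x ** l / l for l in range(1, L + 1)])

# Case 2: g(z) = x/z, so ln(1 + x/z) = sum (-1)^(l+1) x^l / (l z^l).
c_linear = log_series([x] + [Fraction(0)] * (L - 1))
print(c_linear == [(-1) ** (l + 1) * x ** l / l for l in range(1, L + 1)])
```

Multiplying (5.11) by $\ell$ and setting $b^{(\ell)} = \ell c_\ell$ gives exactly the form of the recursion (5.7).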
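We remark that the special case $\ell = 1$ of Eq. (5.6),
$$b(n) = E_\infty - \int_{E_0}^{E_\infty}\xi(\lambda,n)\,d\lambda = \frac{E_0+E_\infty}{2} + \frac12\int_{E_0}^{E_\infty}\bigl(1 - 2\xi(\lambda,n)\bigr)\,d\lambda, \tag{5.12}$$
has first been given in [17]. Next we turn to unbounded operators. In order to avoid Abelian limits here as well, we resort to a little trick. This will also show how our investigations tie in with the theory of Krein [29] and rank one perturbations (see also [17], Appendix A, [19, 34]). Consider
$$H_{n,\theta} = H + \theta\langle\delta_n,.\rangle\delta_n, \qquad \theta\ge0. \tag{5.13}$$
Then, as in [17], Appendix A, one computes
$$\mathrm{tr}\bigl((H-z)^{-1} - (H_{n,\theta}-z)^{-1}\bigr) = \frac{d}{dz}\ln\bigl(1+\theta g(z,n)\bigr) = \int_{\mathbb{R}}\frac{\xi_\theta(\lambda,n)}{(\lambda-z)^2}\,d\lambda, \tag{5.14}$$
where
$$1 + \theta g(z,n) = \exp\Bigl(\int_{\mathbb{R}}\frac{\xi_\theta(\lambda,n)}{\lambda-z}\,d\lambda\Bigr), \qquad \xi_\theta(\lambda,n) = \frac1\pi\lim_{\varepsilon\downarrow0}\arg\bigl(1+\theta g(\lambda+i\varepsilon,n)\bigr). \tag{5.15}$$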
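By Theorem A.2 (iii) all moments of $\xi_\theta(\lambda,n)\,d\lambda$ are finite and $\int_{\mathbb{R}}\xi_\theta(\lambda,n)\,d\lambda = \theta$. Taking logarithms in (5.15) and expanding yields as before

Theorem 5.2. Let $\xi_\theta(\lambda,n)$ be defined as above. Then we have
$$b^{(\ell)}_\theta(n) = (\ell+1)\int_{\mathbb{R}}\lambda^\ell\,\xi_\theta(\lambda,n)\,d\lambda, \tag{5.16}$$
with
$$b^{(0)}_\theta(n) = \theta, \qquad b^{(\ell)}_\theta(n) = \theta(\ell+1)\,g_\ell(n) + \theta\sum_{j=1}^{\ell} g_{\ell-j}(n)\,b^{(j-1)}_\theta(n), \quad \ell\in\mathbb{N}. \tag{5.17}$$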
Again, in the special case $\ell = 1$ we obtain
$$b(n) = \frac1\theta\int_{\mathbb{R}}\lambda\,\xi_\theta(\lambda,n)\,d\lambda - \frac\theta2. \tag{5.18}$$
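For a finite Jacobi matrix the recursion (5.17) can be tested without computing $\xi_\theta$. The sketch below relies on a standard fact that is not stated in the text and is used here as an assumption: for matrices, Krein's trace formula gives $(\ell+1)\int\lambda^\ell\xi_\theta(\lambda,n)\,d\lambda = \mathrm{tr}(H_{n,\theta}^{\ell+1}) - \mathrm{tr}(H^{\ell+1})$. The matrix entries and the helper names `matmul`, `power_data`, `b_theta` are hypothetical.

```python
def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def power_data(A, n, kmax):
    """Return ([tr A^k], [(A^k)[n][n]]) for k = 0..kmax."""
    size = len(A)
    P = [[int(i == j) for j in range(size)] for i in range(size)]
    traces, diag = [], []
    for _ in range(kmax + 1):
        traces.append(sum(P[i][i] for i in range(size)))
        diag.append(P[n][n])
        P = matmul(P, A)
    return traces, diag

def b_theta(g, theta, lmax):
    """Coefficients b_theta^{(l)}(n), l = 0..lmax, from the recursion (5.17)."""
    bt = [theta]
    for l in range(1, lmax + 1):
        bt.append(theta * (l + 1) * g[l]
                  + theta * sum(g[l - j] * bt[j - 1] for j in range(1, l + 1)))
    return bt

# Hypothetical 5x5 Jacobi matrix, site n = 2, coupling constant theta = 3.
a, b, n, theta, lmax = [1, 2, 1, 3], [0, 1, -1, 2, 1], 2, 3, 5
H = [[0] * 5 for _ in range(5)]
for i in range(5):
    H[i][i] = b[i]
for i in range(4):
    H[i][i + 1] = H[i + 1][i] = a[i]
H_theta = [row[:] for row in H]
H_theta[n][n] += theta            # H + theta <delta_n, .> delta_n, cf. (5.13)

tr, g = power_data(H, n, lmax + 1)
tr_theta, _ = power_data(H_theta, n, lmax + 1)
from_recursion = b_theta(g, theta, lmax)
from_traces = [tr_theta[l + 1] - tr[l + 1] for l in range(lmax + 1)]
print(from_recursion == from_traces)
```

For $\ell = 1$ both sides equal $2\theta b(n) + \theta^2$, which is (5.18) rearranged.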
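In addition, we remark that letting the coupling constant $\theta$ tend to $\infty$ implies $H_{n,\theta}\to H^\infty_n$ in a suitable sense (i.e., norm resolvent sense on $\{f\in\ell^2(\mathbb{Z}) \mid \langle\delta_n,f\rangle = 0\}$, cf. [19]). Similarly, $H^\beta_{n_0}$ can be obtained as the limit of the operator $H + \theta\langle\delta^\beta_n,.\rangle\delta^\beta_n$ as $\theta\to\infty$. Clearly, the same procedure can be applied to (cf. Theorem A.2 (i), (iii))
$$\gamma^\beta(z,n) = -\frac{\beta}{a(n)}\exp\Bigl(\int_{\mathbb{R}}\frac{\xi^\beta(\lambda,n)\,d\lambda}{\lambda-z}\Bigr), \qquad z\in\mathbb{C}\setminus\sigma(H^\beta_n),\ \beta\in\mathbb{R}\setminus\{0\}, \tag{5.19}$$
where
$$\xi^\beta(\lambda,n) = \frac1\pi\lim_{\varepsilon\downarrow0}\arg\gamma^\beta(\lambda+i\varepsilon,n) - \delta^\beta, \qquad \delta^\beta = \begin{cases} 0, & \beta a(n)<0, \\ 1, & \beta a(n)>0, \end{cases} \tag{5.20}$$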
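and $0\le\mathrm{sgn}(-a(n)\beta)\,\xi^\beta(\lambda,n)\le1$. This yields as before

Theorem 5.3. Let $\xi^\beta(\lambda,n)$ be defined as above. Then we have
$$b^{\beta,(\ell)}(n) = (\ell+1)\,\frac{\beta}{a(n)}\int_{\mathbb{R}}\lambda^\ell\,\xi^\beta(\lambda,n)\,d\lambda, \qquad \ell\in\mathbb{N}, \tag{5.21}$$
where
$$b^{\beta,(0)}(n) = 1+\beta^2, \qquad b^{\beta,(\ell)}(n) = (\ell+1)\,\gamma^\beta_\ell(n) - \frac{\beta}{a(n)}\sum_{j=1}^{\ell}\gamma^\beta_{\ell-j}(n)\,b^{\beta,(j-1)}(n), \quad \ell\in\mathbb{N}. \tag{5.22}$$
Again specializing for $\ell = 0$ in (5.21) we obtain
$$a(n) = \frac{1}{\beta+\beta^{-1}}\int_{\mathbb{R}}\xi^\beta(\lambda,n)\,d\lambda. \tag{5.23}$$
Finally, we want to find out when $\xi^{\beta_j}(\lambda,n_0)$, $j=1,2$, for one fixed $n_0$ determines $a(n)$, $b(n)$, $n\in\mathbb{Z}$. Since $\xi^\beta(.,n_0)$, $\beta\in\mathbb{R}$, and $a(n_0)$ determine $\gamma^\beta(z,n_0)$ by (5.19), we conclude from Theorem 4.4

Corollary 5.4. Let $\beta_{1,2}\in\mathbb{R}\cup\{\infty\}$ be given. Then $(\beta_j,\xi^{\beta_j}(.,n_0))$, $j=1,2$, and $a(n_0)$ for one fixed $n_0\in\mathbb{Z}$ uniquely determine $a(n)^2$, $b(n)$ for all $n\in\mathbb{Z}$.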
6. Reflectionless Operators

Reflectionless operators have attracted a considerable amount of interest recently in connection with inverse spectral theory [2, 22, 35, 36] and completely integrable lattices [7, 32]. In this section we show that the trace formulas of the previous section become particularly transparent in this case. We will assume that $H$ is a bounded self-adjoint Jacobi operator. Hence its spectrum can be written as the complement of a countable union of disjoint open intervals, that is,
$$\sigma(H) = \mathbb{R}\setminus\bigcup_{j\in J_0\cup\{\infty\}}\rho_j, \tag{6.1}$$
where
$$\begin{gathered} J\subseteq\mathbb{N}, \quad J_0 = J\cup\{0\}, \quad \rho_0 = (-\infty,E_0), \quad \rho_\infty = (E_\infty,\infty), \\ E_0\le E_{2j-1}<E_{2j}\le E_\infty, \quad \rho_j = (E_{2j-1},E_{2j}), \quad j\in J, \\ -\infty<E_0<E_\infty<\infty, \quad \rho_j\cap\rho_k = \emptyset \text{ for } j\neq k. \end{gathered} \tag{6.2}$$
In addition, we will require that $H$ is reflectionless, that is, for all $n\in\mathbb{Z}$,
$$\xi(\lambda,n) = \frac12 \quad\text{for a.e. } \lambda\in\sigma_{ess}(H). \tag{6.3}$$
By [22], Lemma 3.3 the requirement (6.3) is equivalent to one of the following:
(i) For some $n_0\in\mathbb{Z}$, $n_1\in\mathbb{Z}\setminus\{n_0,n_0+1\}$, $\xi(\lambda,n_0) = \xi(\lambda,n_0+1) = \xi(\lambda,n_1) = \frac12$ for a.e. $\lambda\in\sigma_{ess}(H)$.
(ii) For some $n_0\in\mathbb{Z}$, $\tilde m_+(\lambda+i0,n_0) = \overline{\tilde m_-(\lambda+i0,n_0)}$ for a.e. $\lambda\in\sigma_{ess}(H)$, where $\tilde m_-(\lambda+i0,n_0)$ abbreviates $\lim_{\varepsilon\downarrow0}\tilde m_-(\lambda+i\varepsilon,n_0)$.
The last equation implies
$$u_+(\lambda+i0,n) = \overline{u_-(\lambda+i0,n)} \quad\text{for a.e. } \lambda\in\sigma_{ess}(H) \tag{6.4}$$
for $u_\pm(z,n) = c(z,n,n_0) + a(n_0)\tilde m_\pm(z,n_0)s(z,n,n_0)$, where $c$, $s$ are the solutions of $\tau u = zu$ corresponding to the initial conditions $c(z,n_0,n_0) = s(z,n_0+1,n_0) = 1$, $s(z,n_0,n_0) = c(z,n_0+1,n_0) = 0$. The name reflectionless will become clear in the next section. There the above conditions will turn out to be equivalent to the vanishing of the reflection coefficients $R_\pm(z)$ (cf. (7.16)). For instance periodic operators, operators with purely discrete spectrum, and stationary solutions of the Toda hierarchy are special cases of reflectionless operators.

Next we turn to Dirichlet eigenvalues associated with $\tau$ corresponding to a Dirichlet boundary condition at $n\in\mathbb{Z}$. Associated with each spectral gap $\rho_j$ we set
$$\mu_j(n) = \sup\bigl(\{E_{2j-1}\}\cup\{\lambda\in\rho_j \mid g(\lambda,n)<0\}\bigr)\in\overline{\rho_j}, \qquad j\in J. \tag{6.5}$$
The numbers $\mu_j(n)$ are called Dirichlet eigenvalues of $H$ since we have
$$\sigma(H^\infty_n) = \sigma_{ess}(H)\cup\{\mu_j(n)\}_{j\in J}. \tag{6.6}$$
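However, we want to point out that $\mu_j(n)$ is not necessarily an eigenvalue of $H^\infty_n$ unless $\mu_j(n)\notin\sigma_{ess}(H)$. The strict monotonicity of $g(\lambda,n)$ with respect to $\lambda\in\rho_j$, that is,
$$\frac{d}{d\lambda}g(\lambda,n) = \langle\delta_n,(H-\lambda)^{-2}\delta_n\rangle = \sum_{m\in\mathbb{Z}}G(\lambda,n,m)^2>0, \qquad \lambda\in\rho_j, \tag{6.7}$$
then yields
$$g(\lambda,n)<0, \quad \lambda\in(E_{2j-1},\mu_j(n)), \qquad g(\lambda,n)>0, \quad \lambda\in(\mu_j(n),E_{2j}), \qquad j\in J. \tag{6.8}$$
Thus we conclude $\xi(\lambda,n) = 1$, $\lambda\in(E_{2j-1},\mu_j(n))$, and $\xi(\lambda,n) = 0$, $\lambda\in(\mu_j(n),E_{2j})$, $j\in J$. Using this information to evaluate the exponential Herglotz representation of $g(z,n)$ then implies ([22], Lemma 1.1)
$$g(z,n) = \frac{-1}{\sqrt{z-E_0}\sqrt{z-E_\infty}}\prod_{j\in J}\frac{z-\mu_j(n)}{\sqrt{z-E_{2j-1}}\sqrt{z-E_{2j}}}, \tag{6.9}$$
where the square root branch used is defined as $\sqrt z = |\sqrt z|\exp(i\arg(z)/2)$, $-\pi<\arg(z)\le\pi$. In addition, denoting by $\chi_\Omega(.)$ the characteristic function of the set $\Omega\subset\mathbb{R}$, one can represent $\xi(\lambda,n)$ by
$$\begin{aligned} \xi(\lambda,n) &= \frac12\bigl(\chi_{(E_0,\infty)}(\lambda) + \chi_{(E_\infty,\infty)}(\lambda)\bigr) + \frac12\sum_{j\in J}\bigl(\chi_{(E_{2j-1},\infty)}(\lambda) + \chi_{(E_{2j},\infty)}(\lambda) - 2\chi_{(\mu_j(n),\infty)}(\lambda)\bigr) \\ &= \frac12\chi_{(E_0,E_\infty)}(\lambda) + \frac12\sum_{j\in J}\bigl(\chi_{(E_{2j-1},\mu_j(n))}(\lambda) - \chi_{(\mu_j(n),E_{2j})}(\lambda)\bigr) + \chi_{(E_\infty,\infty)}(\lambda) \quad\text{for a.e. } \lambda\in\mathbb{R}. \end{aligned} \tag{6.10}$$
Evaluation of (5.6) shows
$$b^{(\ell)}(n) = \frac12\Bigl(E_0^\ell + E_\infty^\ell + \sum_{j\in J}\bigl(E_{2j-1}^\ell + E_{2j}^\ell - 2\mu_j(n)^\ell\bigr)\Bigr) \tag{6.11}$$
and in the special case $\ell = 1$
$$b(n) = \frac12\Bigl(E_0 + E_\infty + \sum_{j\in J}\bigl(E_{2j-1} + E_{2j} - 2\mu_j(n)\bigr)\Bigr). \tag{6.12}$$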
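The trace formula (6.11) can be checked in the simplest situation, where there are no gaps ($J = \emptyset$) and it reads $b^{(\ell)}(n) = \frac12(E_0^\ell + E_\infty^\ell)$. The sketch below assumes, as a standard fact not derived in the text, that the Jacobi operator with constant coefficients $a(n) = a$, $b(n) = b$ is reflectionless with spectrum $[b-2|a|,\,b+2|a|]$. The numerical values and the names `diagonal_moments`, `trace_coefficients` are illustrative.

```python
def diagonal_moments(a, b, lmax, size=15):
    """g_l = <delta_n, H^l delta_n> for constant coefficients a(n) = a, b(n) = b."""
    n = size // 2                  # middle site; exact as long as lmax < size - 1
    x = [0] * size
    x[n] = 1
    g = []
    for _ in range(lmax + 1):
        g.append(x[n])
        x = [b * x[i]
             + (a * x[i - 1] if i > 0 else 0)
             + (a * x[i + 1] if i + 1 < size else 0) for i in range(size)]
    return g

def trace_coefficients(g, lmax):
    """b^{(l)}, l = 1..lmax, from the recursion (5.7)."""
    bl = {1: g[1]}
    for l in range(2, lmax + 1):
        bl[l] = l * g[l] - sum(g[l - j] * bl[j] for j in range(1, l))
    return bl

a, b, lmax = 1, 3, 6               # hypothetical constant coefficients
E0, Einf = b - 2 * abs(a), b + 2 * abs(a)
bl = trace_coefficients(diagonal_moments(a, b, lmax), lmax)
print(all(2 * bl[l] == E0 ** l + Einf ** l for l in range(1, lmax + 1)))
```

For $\ell = 2$ this is the identity $4a^2 + b^2 = \frac12\bigl((b-2a)^2 + (b+2a)^2\bigr)$.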
The formulas for $\ell = 1,2$ were first given in [2], Theorem 5.2. Next, we want to address the problem of expressing $a(n)^2$ as a function of $E_j$ and $\mu_j(n)$. This endeavor turns out to be impossible unless we introduce additional data. This will be done first by defining
$$\{\tilde\mu_j(n)\}_{j\in\tilde J} = \{\mu_j(n)\}_{j\in J}\cup\sigma_p(H^\infty_n), \qquad \tilde J\subseteq\mathbb{N}, \tag{6.13}$$
and
$$\tilde E_0 = E_0, \quad \tilde E_\infty = E_\infty, \quad \tilde E_{2j-1} = \sup\{E\in\sigma(H) \mid E<\tilde\mu_j(n)\}, \quad \tilde E_{2j} = \inf\{E\in\sigma(H) \mid \tilde\mu_j(n)<E\}. \tag{6.14}$$
A few remarks are in order:

Remark 6.1. (i) We note that $\tilde\mu_j = \mu_k$ implies $\tilde E_{2j-1} = E_{2k-1}$, $\tilde E_{2j} = E_{2k}$, and $\tilde E_{2j-1}<\tilde E_{2j}$ implies $\tilde\mu_j(n) = \mu_k(n)$ for some $k\in J$. Indeed, if $\tilde E_{2j-1}<\tilde E_{2j}$ we infer $\lim_{\lambda\to\tilde\mu_j(n),\ \lambda\in(\tilde E_{2j-1},\tilde E_{2j})} g(\lambda,n) = 0$ and hence $\tilde\mu_j(n) = \mu_k(n)$ for some $k\in J$ by monotonicity of $g(.,n)$ in spectral gaps. In other words, computing all previous formulas with $\mu_j(n)$, $E_j$ replaced by $\tilde\mu_j(n)$, $\tilde E_j$ leaves them unchanged since the new factors drop out.
(ii) Our notation concerning $\tilde E_j$ is imprecise since the list of numbers $\{\tilde E_j\}_{j\in\tilde J}$ might, in general, depend on $n$. Suppose for instance that $\tilde\mu_j(n)$ is also an eigenvalue of $H$ such that $\tilde E_{2j-1} = \tilde\mu_j(n) = \tilde E_{2j}$. Then the pair $\tilde E_{2j-1}$, $\tilde E_{2j}$ shows up in the list corresponding to $n$ but not in the one corresponding to $n+1$ since the eigenfunction for $\tilde\mu_j(n)$ cannot vanish at two consecutive points.

Moreover, following [22], we introduce the numbers
$$\tilde R_j(n) = \lim_{\varepsilon\downarrow0} i\varepsilon\,g(\tilde\mu_j(n)+i\varepsilon,n)^{-1}\ge0, \tag{6.15}$$
and
$$\tilde\sigma_j(n) = \begin{cases} \lim_{\varepsilon\downarrow0} h(\tilde\mu_j(n)+i\varepsilon,n) & \text{if } \tilde R_j(n)>0, \\ 2 & \text{if } \tilde R_j(n) = 0. \end{cases} \tag{6.16}$$
The actual value of $\tilde\sigma_j(n)$ if $\tilde R_j(n) = 0$ is immaterial and is chosen in accordance with [22]. The above limits exist if $\tilde\mu_j\in\sigma(H^\infty_n)$ (i.e., if $\tilde R_j(n)>0$) and $\tilde\sigma_j(n)$ is either $\pm1$ (depending on whether $\tilde\mu_j$ is an eigenvalue of $H_{\pm,n}$) or in $(-1,+1)$ (if $\tilde\mu_j$ is an eigenvalue of both $H_{\pm,n}$ and hence also of $H$). For more details see [22]. The numbers $\tilde R_j(n)$ can be evaluated using (6.9),
$$\tilde R_j(n) = \frac{\sqrt{\tilde\mu_j(n)-E_0}\,\sqrt{\tilde\mu_j(n)-E_\infty}\,\sqrt{\tilde\mu_j(n)-E_{2j-1}}\,\sqrt{\tilde\mu_j(n)-E_{2j}}}{\displaystyle\prod_{k\in J\setminus\{j\}}\frac{\tilde\mu_j(n)-\mu_k(n)}{\sqrt{\tilde\mu_j(n)-E_{2k-1}}\,\sqrt{\tilde\mu_j(n)-E_{2k}}}}. \tag{6.17}$$
If $\tilde\mu_j = \mu_k = E_{2k} = E_{2j-1}$ for some $k$ (resp. $\tilde\mu_j = \mu_k = E_{2k-1} = E_{2j}$) the vanishing factors $\tilde\mu_j - \mu_k$ in the denominator and $\tilde\mu_j - E_{2j}$ (resp. $\tilde\mu_j - E_{2j-1}$) in the numerator have to be omitted. In particular, we want to point out that $\tilde R_j(n)$ depend on $E_j$, $\mu_j$ only.

In addition, we require that the singularly continuous spectrum of $H^\infty_n$ is empty (the absolutely continuous spectrum being taken care of by the reflectionless condition). Then it is shown in [22] that the spectral data $E_j$, $j\in J\cup\{0,\infty\}$, plus $\mu_j(n_0)$, $j\in J$, plus $\tilde\sigma_j(n_0)$, $j\in\tilde J$, for one fixed $n_0\in\mathbb{Z}$ are minimal and uniquely determine $a(n)^2$, $b(n)$. (To be precise, the class of operators considered here is slightly larger than the one in [22]; however, the same proof applies.) Moreover, necessary and sufficient conditions for given spectral data to be the spectral data of some Jacobi operator were derived. Here we want to focus on the reconstruction of $a(n)^2$, $b(n)$ from given spectral data as above and present an explicit expression of $a(n)^2$, $b(n)$ in terms of the spectral data. Our point of departure will be the formulas (use (4.15) and (4.17))
$$a(n)^2 m_+(z,n)\pm a(n-1)^2 m_-(z,n) = \mp z\pm b(n) - \begin{Bmatrix}\dfrac{1}{g(z,n)} \\[2mm] \dfrac{h(z,n)}{g(z,n)}\end{Bmatrix} = -\sum_{j=0}^\infty\frac{c_{\pm,j}(n)}{z^{j+1}}, \tag{6.18}$$
where the coefficients $c_{\pm,j}(n)$ are to be determined. Arguing similarly as for (1.4) one obtains
$$c_{\pm,\ell}(n) = \int_{\mathbb{R}}\lambda^\ell\bigl(a(n)^2\,d\rho_{+,n}(\lambda)\pm a(n-1)^2\,d\rho_{-,n}(\lambda)\bigr), \qquad \ell\in\mathbb{N}_0, \tag{6.19}$$
where $d\rho_{\pm,n}(\lambda)$ are the spectral measures of $H_{\pm,n}$ associated with the vector $\delta_{n\pm1}$. The evaluation of this integral will now be done for the minus sign. Due to the reflectionless condition, the integral over the (absolutely) continuous spectrum is zero (there is no singularly continuous part by assumption) and it remains to evaluate the pure point part. To do this it suffices to know the jumps of the measure which are given by the residues of the corresponding Herglotz function. Evaluating the residues (using (6.18) plus the notation from above) shows
$$c_{-,\ell}(n) = \sum_{j\in\tilde J}\tilde\sigma_j(n)\,\tilde R_j(n)\,\tilde\mu_j(n)^\ell, \qquad \ell\in\mathbb{N}_0. \tag{6.20}$$
Clearly it suffices to sum over all $\tilde\mu_j(n)\in\sigma_p(H^\infty_n)$ since for all other terms we have $\tilde R_j(n) = 0$. Next we turn to the coefficients $c_{+,\ell}(n)$. They can be determined from (cf. (5.5))
$$\frac{1}{g(z,n)} = -z\exp\Bigl(-\sum_{\ell=1}^\infty\frac{b^{(\ell)}(n)}{\ell z^\ell}\Bigr), \tag{6.21}$$
which implies
$$c_{+,-2}(n) = -1, \qquad c_{+,\ell-2}(n) = -\frac1\ell\sum_{j=1}^{\ell}c_{+,\ell-j-2}(n)\,b^{(j)}(n), \quad \ell\in\mathbb{N}. \tag{6.22}$$
(The signs in (6.22) are fixed by (6.18) and (6.21); they are the ones consistent with (6.23) and (6.24) below, e.g. $c_{+,0} = \frac12(b^{(2)} - b^2) = a^2 + (a^-)^2$.) Thus $c_{+,\ell}(n)$ are expressed in terms of $E_j$, $\mu_j(n)$. Here $c_{+,-2}(n)$ and $c_{+,-1}(n)$ have been introduced for notational convenience only. In particular, combining the case $\ell = 0$ with our previous results we obtain
$$a(n - \substack{0 \\ 1})^2 = \frac{b^{(2)}(n) - b(n)^2}{4}\pm\sum_{j\in\tilde J}\frac{\tilde\sigma_j(n)}{2}\,\tilde R_j(n). \tag{6.23}$$
Similarly, for $\ell = 1$,
$$b(n\pm1) = \frac{1}{a(n - \substack{0 \\ 1})^2}\Bigl(\frac{2b^{(3)}(n) - 3b(n)b^{(2)}(n) + b(n)^3}{12}\pm\sum_{j\in\tilde J}\frac{\tilde\sigma_j(n)}{2}\,\tilde R_j(n)\,\tilde\mu_j(n)\Bigr). \tag{6.24}$$
However, these formulas are only the tip of the iceberg. Combining
$$c_{\pm,\ell}(n) = a(n)^2 m_{+,\ell}(n)\pm a(n-1)^2 m_{-,\ell}(n) \tag{6.25}$$
with some basic facts from the moment problem we obtain our main result:
Theorem 6.2. Let $H$ be a given bounded reflectionless Jacobi operator. Suppose the singularly continuous spectrum of $H_n^\infty$ is empty and the spectral data corresponding to $H$ (as above) are given for one fixed $n\in\mathbb{Z}$. Then the sequences $a^2$, $b$ can be expressed explicitly in terms of the spectral data as follows:
$$a(n\pm k-{}^{0}_{1})^2=\frac{C_{\pm,n}(k+1)\,C_{\pm,n}(k-1)}{C_{\pm,n}(k)^2}, \tag{6.26}$$
$$b(n\pm k)=\frac{D_{\pm,n}(k)}{C_{\pm,n}(k)}-\frac{D_{\pm,n}(k-1)}{C_{\pm,n}(k-1)},\qquad k\in\mathbb{N}, \tag{6.27}$$
where $C_{\pm,n}(0)=1$, $D_{\pm,n}(0)=0$,
$$C_{\pm,n}(k)=\det\begin{pmatrix} m_{\pm,0}(n) & m_{\pm,1}(n) & \cdots & m_{\pm,k-1}(n)\\ m_{\pm,1}(n) & m_{\pm,2}(n) & \cdots & m_{\pm,k}(n)\\ \vdots & \vdots & \ddots & \vdots\\ m_{\pm,k-1}(n) & m_{\pm,k}(n) & \cdots & m_{\pm,2k-2}(n)\end{pmatrix}, \tag{6.28}$$
$$D_{\pm,n}(k)=\det\begin{pmatrix} m_{\pm,0}(n) & m_{\pm,1}(n) & \cdots & m_{\pm,k-2}(n) & m_{\pm,k}(n)\\ m_{\pm,1}(n) & m_{\pm,2}(n) & \cdots & m_{\pm,k-1}(n) & m_{\pm,k+1}(n)\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ m_{\pm,k-1}(n) & m_{\pm,k}(n) & \cdots & m_{\pm,2k-3}(n) & m_{\pm,2k-1}(n)\end{pmatrix}, \tag{6.29}$$
and
$$m_{\pm,\ell}(n)=\frac{c_{+,\ell}(n)\pm c_{-,\ell}(n)}{2a(n-{}^{0}_{1})^2}.$$
The quantities $a(n)^2$, $a(n-1)^2$, and $c_{\pm,\ell}(n)$ have to be expressed in terms of the spectral data using (6.23), (6.22), (6.20) and (6.11).

Proof. It remains to show the expressions (6.26) and (6.27) for $a(n)$ and $b(n)$ in terms of the moments $M_{\pm,\ell}(n_0)$, $\ell\in\mathbb{N}$. Both can be found in [1] (first equation on p. 5). However, the equation for $b(n)$ here differs from the one in [1] since we have performed the integration (see [38], Sect. 2.5 for details).

In the special case of periodic Jacobi operators, the formula (6.23) was first given in [7]. In addition, we get a discrete version of Borg's theorem.

Corollary 6.3. Let $H$ be a reflectionless Jacobi operator with spectrum consisting of only one band, that is $\sigma(H)=[E_0,E_\infty]$. Then the sequences $a(n)^2$, $b(n)$ are necessarily constant,
$$a(n)^2=\frac{(E_\infty-E_0)^2}{16},\qquad b(n)=\frac{E_0+E_\infty}{2}. \tag{6.30}$$
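As a numerical sanity check of the determinant formulas (6.26)–(6.29), and of the constant case (6.30), the following sketch recovers Jacobi coefficients from moments. It makes simplifying assumptions that are not part of the text: a finite Jacobi matrix stands in for the half-line operator $H_{+,n}$, the moments are taken as $m_\ell=\langle e_0,J^\ell e_0\rangle$, and index $k$ in the code plays the role of site $n+k$. All function names are illustrative.

```python
import numpy as np

def jacobi_matrix(a, b):
    """Finite Jacobi matrix with diagonal b and off-diagonal a."""
    return np.diag(b) + np.diag(a, 1) + np.diag(a, -1)

def moments(J, count):
    """m_l = <e_0, J^l e_0> for l = 0, ..., count - 1."""
    e0 = np.zeros(J.shape[0])
    e0[0] = 1.0
    v = e0.copy()
    out = []
    for _ in range(count):
        out.append(float(e0 @ v))
        v = J @ v
    return out

def hankel_C(m, k):
    """C(k): determinant of the k x k Hankel matrix (m_{i+j}); C(0) = 1."""
    if k == 0:
        return 1.0
    H = np.array([[m[i + j] for j in range(k)] for i in range(k)])
    return float(np.linalg.det(H))

def hankel_D(m, k):
    """D(k): like C(k), but the last column uses m_{i+k}; D(0) = 0."""
    if k == 0:
        return 0.0
    cols = list(range(k - 1)) + [k]
    H = np.array([[m[i + j] for j in cols] for i in range(k)])
    return float(np.linalg.det(H))

def recover_coefficients(m, kmax):
    """Apply the analogues of (6.26) and (6.27) for k = 1, ..., kmax."""
    a2 = [hankel_C(m, k + 1) * hankel_C(m, k - 1) / hankel_C(m, k) ** 2
          for k in range(1, kmax + 1)]
    b = [hankel_D(m, k) / hankel_C(m, k) - hankel_D(m, k - 1) / hankel_C(m, k - 1)
         for k in range(1, kmax + 1)]
    return a2, b

# Generic coefficients (hypothetical test data).
a_true = [1.0, 0.5, 2.0, 1.5]
b_true = [0.3, -1.0, 0.7, 0.2, -0.4]
m_generic = moments(jacobi_matrix(a_true, b_true), 7)
a2_rec, b_rec = recover_coefficients(m_generic, 3)

# Constant coefficients as in (6.30) with E_0 = -1, E_infinity = 3.
E0, Einf = -1.0, 3.0
a_const, b_const = (Einf - E0) / 4.0, (E0 + Einf) / 2.0
m_const = moments(jacobi_matrix([a_const] * 7, [b_const] * 8), 7)
a2_const, b_const_rec = recover_coefficients(m_const, 3)
```

Hankel determinants become ill-conditioned quickly as $k$ grows, so a sketch of this kind is only meant for small $k$.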
The special case where $H$ is periodic seems due to [12] (Proposition 2 on p. 451). The formula for $b(n)$ also follows directly from (5.12).

Remark 6.4. (i) If $J$ is finite, that is, $H$ has only finitely many spectral gaps, then $\{\tilde\mu_j(n)\}_{j\in\tilde J}=\{\mu_j(n)\}_{j\in J}$ and we can forget about the additional $\mu$'s. (ii) The reader might wonder whether a similar procedure for one-dimensional Schrödinger operators $H=-\frac{d^2}{dx^2}+V(x)$ is possible. This is in fact the case but under more restrictive conditions on $V(x)$. Without going into technical details we remark that in the continuous case the asymptotic expansions of the Weyl m-functions contain the information of all derivatives of $V$ at the base point. Hence if $V$ is assumed real analytic (e.g., finite gap) it can be expressed in terms of its derivatives using Taylor's formula. (iii) Concerning general Jacobi operators we note that Theorem 4.4 indicates that $a(n_0)^2$, $\gamma_\ell^{\beta_j}(n_0)$, $j=1,2$, $\ell\in\mathbb{N}$ is solvable for $a(n)^2$, $b(n)$ as well.

Finally, we turn to general eigenvalues associated with $H_n^\beta$. Associated with each spectral gap $\rho_j$ we set
$$\lambda_j^\beta(n)=\sup\{E_{2j-1}\}\cup\{\lambda\in\rho_j\,|\,\gamma^\beta(\lambda,n)<0\}\in\overline{\rho_j},\qquad j\in J. \tag{6.31}$$
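The strict monotonicity of $\gamma^\beta(\lambda,n)$ with respect to $\lambda\in\rho_j$, $j\in J_0\cup\{\infty\}$, that is,
$$\frac{d}{d\lambda}\gamma^\beta(\lambda,n)=(1+\beta^2)\langle\delta_n^\beta,(H-\lambda)^{-2}\delta_n^\beta\rangle,\qquad\lambda\in\rho_j, \tag{6.32}$$
then yields
$$\gamma^\beta(\lambda,n)<0,\quad\lambda\in(E_{2j-1},\lambda_j^\beta(n)),\qquad\gamma^\beta(\lambda,n)>0,\quad\lambda\in(\lambda_j^\beta(n),E_{2j}),\qquad j\in J. \tag{6.33}$$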
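Since $\gamma^\beta(\lambda,n)$ is positive (resp. negative) for $a(n)\beta>0$ (resp. $a(n)\beta<0$) as $\lambda\to\infty$ (resp. $\lambda\to-\infty$), there must be an additional zero $\lambda_\infty^\beta$ for $\lambda\ge E_\infty$ (resp. $\lambda\le E_0$). Summarizing, $\xi^\beta(\lambda,n)$ is given by
$$\xi^\beta(\lambda,n)=\frac12\chi_{(E_0,E_\infty)}(\lambda)+\frac12\sum_{j\in J}\Big(\chi_{(E_{2j-1},\lambda_j^\beta(n))}(\lambda)-\chi_{(\lambda_j^\beta(n),E_{2j})}(\lambda)\Big)+\chi_{(E_\infty,\lambda_\infty^\beta)}(\lambda),\qquad a(n)\beta>0, \tag{6.34}$$
and
$$\xi^\beta(\lambda,n)=-\frac12\chi_{(E_0,E_\infty)}(\lambda)+\frac12\sum_{j\in J}\Big(\chi_{(E_{2j-1},\lambda_j^\beta(n))}(\lambda)-\chi_{(\lambda_j^\beta(n),E_{2j})}(\lambda)\Big)-\chi_{(\lambda_\infty^\beta,E_0)}(\lambda),\qquad a(n)\beta<0. \tag{6.35}$$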
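Thus we have for $\beta\neq0,\infty$,
$$\gamma^\beta(z,n)=\frac{z-\lambda_\infty^\beta(n)}{\sqrt{z-E_0}\sqrt{z-E_\infty}}\prod_{j\in J}\frac{z-\lambda_j^\beta(n)}{\sqrt{z-E_{2j-1}}\sqrt{z-E_{2j}}}, \tag{6.36}$$
and we remark that the numbers $\lambda_j^\beta(n)$ are related to the spectrum of $H_n^\beta$ as follows:
$$\sigma(H_n^\beta)=\sigma_{ess}(H)\cup\{\lambda_j^\beta(n)\}_{j\in J\cup\{\infty\}}. \tag{6.37}$$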
Again we point out that $\lambda_j^\beta(n)$ is not necessarily an eigenvalue of $H_n^\beta$ unless $\lambda_j^\beta(n)\notin\sigma_{ess}(H)$. Evaluation of (5.6) shows
$$b^{\beta,(\ell)}(n)=\frac{-\beta}{2a(n)}\bigg(E_0^{\ell+1}+E_\infty^{\ell+1}-2\lambda_\infty^\beta(n)^{\ell+1}+\sum_{j\in J}\big(E_{2j-1}^{\ell+1}+E_{2j}^{\ell+1}-2\lambda_j^\beta(n)^{\ell+1}\big)\bigg) \tag{6.38}$$
and in the special case $\ell=0$,
$$a(n)=\frac{1}{2(\beta+\beta^{-1})}\bigg(E_0+E_\infty-2\lambda_\infty^\beta(n)+\sum_{j\in J}\big(E_{2j-1}+E_{2j}-2\lambda_j^\beta(n)\big)\bigg). \tag{6.39}$$

7. Scattering Theory

One important class of Jacobi operators are periodic ones. In this section we want to consider scattering theory with periodic background operators and apply the results of Sect. 5. Even though this problem arises naturally if one considers an infinite harmonic crystal (with $N$ atoms in the base cell) with impurities, not too many articles are available on this problem (cf., e.g., [15, 28]). The case with constant background (i.e., only one atom in the base cell) is treated, for instance, in [9, 27]. For a comprehensive treatment in the case of Schrödinger operators with fairly arbitrary backgrounds we refer the reader to [23] and the references therein.

We first recall some basic facts from the theory of periodic operators (cf., e.g., [7], Appendix B, [30, 32]). Let $H_p$ be a Jacobi operator associated with periodic sequences $a_p\neq0$, $b_p$, that is,
$$a_p(n+N)=a_p(n),\qquad b_p(n+N)=b_p(n), \tag{7.1}$$
for some fixed $N\in\mathbb{N}$. The spectrum of $H_p$ is purely absolutely continuous and consists of a finite number of bands, that is,
$$\sigma(H_p)=\bigcup_{j=1}^{N}[E_{p,2j-2},E_{p,2j-1}],\qquad E_{p,0}<\cdots<E_{p,2N-1}. \tag{7.2}$$
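The band structure in (7.2) can be illustrated numerically. The sketch below is not taken from the text: it assumes the standard fact that the $2N$ band edges of a period-$N$ Jacobi operator are the eigenvalues of the $N\times N$ Floquet matrices with periodic ($\theta=0$) and antiperiodic ($\theta=\pi$) boundary conditions, and the function names are hypothetical. When a gap closes, two neighbouring edges coincide and the strict inequalities in (7.2) fail.

```python
import numpy as np

def floquet_matrix(a, b, theta):
    """N x N matrix of the periodic Jacobi operator restricted to
    sequences with u(n + N) = exp(i*theta) * u(n)."""
    N = len(a)
    H = np.zeros((N, N), dtype=complex)
    for n in range(N):
        H[n, n] += b[n]
        m = (n + 1) % N
        # Only the bond that wraps around the cell picks up the Bloch phase.
        phase = np.exp(1j * theta) if n == N - 1 else 1.0
        H[n, m] += a[n] * phase
        H[m, n] += a[n] * np.conj(phase)
    return H

def band_edges(a, b):
    """Sorted list E_{p,0} <= ... <= E_{p,2N-1} of candidate band edges."""
    periodic = np.linalg.eigvalsh(floquet_matrix(a, b, 0.0))
    antiperiodic = np.linalg.eigvalsh(floquet_matrix(a, b, np.pi))
    return np.sort(np.concatenate([periodic, antiperiodic]))

# Period 2 example: two bands [-sqrt(5), -1] and [1, sqrt(5)].
edges_period2 = band_edges([1.0, 1.0], [1.0, -1.0])

# Constant background (N = 1): a single band [b - 2a, b + 2a].
edges_constant = band_edges([1.0], [0.5])
```

Consecutive pairs `(edges[2j-2], edges[2j-1])` then correspond to the bands $[E_{p,2j-2},E_{p,2j-1}]$ of (7.2).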
Moreover, Floquet theory implies the existence of solutions $u_{p,\pm}(z,.)$ of $\tau_pu=zu$, $z\in\mathbb{C}$ ($\tau_p$ the difference expression corresponding to $H_p$) satisfying
$$u_{p,\pm}(z,n+N)=m^{\pm}(z)\,u_{p,\pm}(z,n) \tag{7.3}$$
and hence
$$u_{p,\pm}(z,n)=p_\pm(z,n)\exp(\pm iq(z)n),\qquad p_\pm(z,n)=p_\pm(z,n+N), \tag{7.4}$$
where $m^\pm(z)=\exp(\pm iq(z)N)\in\mathbb{C}$ are called Floquet multipliers and $q(z)$ is called Floquet momentum ($m^\pm(z)$ is not related to the Weyl m-function $m_\pm(z,n)$). $m^\pm(z)$ satisfy $m^+(z)m^-(z)=1$, $m^\pm(z)^2=1$ for $z\in\{E_{p,j}\}_{j=0}^{2N-1}$, $|m^\pm(z)|=1$ for $z\in\sigma(H_p)$, and $|m^+(z)|<1$ for $z\in\mathbb{C}\setminus\sigma(H_p)$. (This says in particular, that $u_{p,\pm}(z,.)$ are bounded for $z\in\sigma(H_p)$ and linearly independent for $z\in\mathbb{C}\setminus\{E_j\}_{j=0}^{2N-1}$.) Requiring $m^\pm(\lambda)=\lim_{\varepsilon\downarrow0}m^\pm(\lambda+i\varepsilon)$, $\lambda\in\sigma(H_p)$, determines $m^\pm(z)$ uniquely.

We are going to investigate scattering theory for the pair $(H,H_p)$, where $H$ is a Jacobi operator satisfying
$$\sum_{n\in\mathbb{Z}}|n(a(n)-a_p(n))|<\infty,\qquad\sum_{n\in\mathbb{Z}}|n(b(n)-b_p(n))|<\infty. \tag{7.5}$$
By [37], Theorem 5.1 the requirement (7.5) implies that the essential spectrum of H is equal to σ(Hp ) and purely absolutely continuous. Moreover, the point spectrum of H is finite and confined to the spectral gaps of Hp , that is, σp (H) ⊂ R\σ(Hp ).
As in the proof of [37], Theorem 5.1 one can use the sum equation
$$u_\pm(z,n)=\frac{a_p(n-{}^{0}_{1})}{a(n-{}^{0}_{1})}\,u_{p,\pm}(z,n)\mp\sum_{m={n+1\atop-\infty}}^{{\infty\atop n-1}}\frac{a_p(n-{}^{0}_{1})}{a(n-{}^{0}_{1})}\,K(z,n,m)\,u_\pm(z,m), \tag{7.6}$$
where
$$K(z,n,m)=\frac{((\tau-\tau_p)u_{p,-}(z))(m)\,u_{p,+}(z,n)-u_{p,-}(z,n)\,((\tau-\tau_p)u_{p,+}(z))(m)}{W_p(u_{p,-}(z),u_{p,+}(z))}$$
$$=\frac{s_p(\lambda,n,m+1)}{a_p(m+1)}(a(m)-a_p(m))+\frac{s_p(\lambda,n,m)}{a_p(m)}(b(m)-b_p(m))+\frac{s_p(\lambda,n,m-1)}{a_p(m-1)}(a(m-1)-a_p(m-1)) \tag{7.7}$$
($W_p(.,..)$ denotes the Wronskian formed with $a_p$ rather than $a$) to show the existence of solutions $u_\pm(z,.)$ of $\tau u=zu$ satisfying
$$\lim_{n\to\pm\infty}\exp(\mp\mathrm{Im}(q(z))n)\,|u_\pm(z,n)-u_{p,\pm}(z,n)|=0,\qquad z\in\mathbb{C}. \tag{7.8}$$
Since we are most of the time interested in the case $z\in\sigma(H_p)$ we shall normalize $u_{p,\pm}(\lambda,0)=1$ for $\lambda\in\sigma(H_p)$. In what follows we will freely use the notation and results found in [7], Appendix B. In particular, note that we have $\overline{u_{p,\pm}(\lambda)}=u_{p,\mp}(\lambda)$, where the bar denotes complex conjugation. Since one computes
$$W(u_\pm(\lambda),\overline{u_\pm(\lambda)})=W_p(u_{p,\pm}(\lambda),u_{p,\mp}(\lambda))=\mp\frac{2i\sin(q(\lambda)N)}{s_p(\lambda,N)},\qquad\lambda\in\sigma(H_p) \tag{7.9}$$
($s_p(\lambda,n)$ is the solution of $\tau_pu=zu$ corresponding to the initial condition $s_p(\lambda,0)=0$, $s_p(\lambda,1)=1$) we conclude that $u_\pm(\lambda)$, $\overline{u_\pm(\lambda)}$ are linearly independent for $\lambda$ in the interior of $\sigma(H_p)$ (if two bands collide at $E$, numerator and denominator of (7.9) both approach zero when $\lambda\to E$ and have a nonzero limit). Hence we might set
$$u_\pm(\lambda,n)=\alpha(\lambda)\,\overline{u_\mp(\lambda,n)}+\beta_\mp(\lambda)\,u_\mp(\lambda,n),\qquad\lambda\in\sigma(H_p), \tag{7.10}$$
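where
$$\alpha(\lambda)=\frac{W(u_\mp(\lambda),u_\pm(\lambda))}{W(u_\mp(\lambda),\overline{u_\mp(\lambda)})}=\frac{s_p(\lambda,N)}{2i\sin(q(\lambda)N)}\,W(u_-(\lambda),u_+(\lambda)), \tag{7.11}$$
$$\beta_\pm(\lambda)=\frac{W(u_\mp(\lambda),\overline{u_\pm(\lambda)})}{W(u_\pm(\lambda),\overline{u_\pm(\lambda)})}=\pm\frac{s_p(\lambda,N)}{2i\sin(q(\lambda)N)}\,W(u_\mp(\lambda),\overline{u_\pm(\lambda)}). \tag{7.12}$$
The function $\alpha(\lambda)$ can be defined for all $\lambda\in\mathbb{C}\setminus\{E_{p,j}\}$. Note that we have
$$|\alpha(\lambda)|^2=1+|\beta_\pm(\lambda)|^2\quad\text{and}\quad\beta_\pm(\lambda)=-\overline{\beta_\mp(\lambda)}. \tag{7.13}$$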
Using (7.6) one can also show
$$W(u_-(\lambda),u_+(\lambda))=W_p(u_{p,-}(\lambda),u_{p,+}(\lambda))+\sum_{n\in\mathbb{Z}}u_\pm(\lambda,n)\,((\tau-\tau_p)u_{p,\mp}(\lambda))(n) \tag{7.14}$$
and
$$W(\overline{u_\mp(\lambda)},u_\pm(\lambda))=\mp\sum_{n\in\mathbb{Z}}u_\pm(\lambda,n)\,((\tau-\tau_p)u_{p,\pm}(\lambda))(n). \tag{7.15}$$
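We now define the scattering matrix
$$S(\lambda)=\begin{pmatrix}T(\lambda) & R_-(\lambda)\\ R_+(\lambda) & T(\lambda)\end{pmatrix},\qquad\lambda\in\sigma(H_p) \tag{7.16}$$
of the pair $(H,H_p)$, where $T(\lambda)=\alpha(\lambda)^{-1}$ and $R_\pm(\lambda)=\alpha(\lambda)^{-1}\beta_\pm(\lambda)$. The matrix $S(\lambda)$ is easily seen to be unitary since by (7.13) $|T(\lambda)|^2+|R_\pm(\lambda)|^2=1$ and $T(\lambda)\overline{R_+(\lambda)}=-\overline{T(\lambda)}R_-(\lambda)$. The quantities $T(\lambda)$ and $R_\pm(\lambda)$ are called transmission and reflection coefficients respectively. The following equation further explains this notation:
$$T(\lambda)u_\pm(\lambda,n)=\begin{cases}T(\lambda)u_{p,\pm}(\lambda,n), & n\to\pm\infty,\\ u_{p,\pm}(\lambda,n)+R_\mp(\lambda)u_{p,\mp}(\lambda,n), & n\to\mp\infty,\end{cases}\qquad\lambda\in\sigma(H_p). \tag{7.17}$$
Clearly (6.4) implies $R_\pm(\lambda)=0$, explaining the term reflectionless in the previous section. The quantities $T(\lambda)$ and $R_\pm(\lambda)$ can be expressed in terms of $\tilde m_\pm(z)=\tilde m_\pm(z,0)$ as follows:
$$T(\lambda)=\frac{\overline{u_\pm(\lambda,0)}}{u_\mp(\lambda,0)}\,\frac{2i\,\mathrm{Im}(\tilde m_\pm(\lambda+i0))}{\tilde m_-(\lambda+i0)+\tilde m_+(\lambda+i0)}, \tag{7.18}$$
$$R_\pm(\lambda)=-\frac{\overline{u_\pm(\lambda,0)}}{u_\pm(\lambda,0)}\,\frac{\tilde m_\mp(\lambda+i0)+\overline{\tilde m_\pm(\lambda+i0)}}{\tilde m_-(\lambda+i0)+\tilde m_+(\lambda+i0)},\qquad\lambda\in\sigma(H_p). \tag{7.19}$$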
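In addition, one verifies
$$g(\lambda+i0,n)=\frac{u_-(\lambda,n)u_+(\lambda,n)}{W(u_-(\lambda),u_+(\lambda))}=T(\lambda)\frac{s_p(\lambda,N)}{2i\sin(q(\lambda)N)}\,u_-(\lambda,n)u_+(\lambda,n)=\frac{s_p(\lambda,N)}{2i\sin(q(\lambda)N)}\,|u_\pm(\lambda,n)|^2\Big(1+R_\pm(\lambda)\frac{u_\pm(\lambda,n)}{\overline{u_\pm(\lambda,n)}}\Big),\qquad\lambda\in\sigma(H_p). \tag{7.20}$$
Construct the list $(E_j)_{j=0}^{2M+1}$ by taking all $E_{p,j}$ plus two copies of each eigenvalue of $H$. We can assume $E_0\le E_1<E_2\le\cdots<E_{2M}\le E_{2M+1}$ and equality holds if and only if $E_{2j}=E_{2j+1}$ is an eigenvalue of $H$. Define the Dirichlet eigenvalues $\mu_j(n)$ associated with each spectral gap $(E_{2j+1},E_{2j+2})$ as in (6.5). Then we infer
$$\xi(\lambda,n)=\frac12\chi_{(E_0,E_\infty)}(\lambda)+\frac12\sum_{j=1}^{M}\Big(\chi_{(E_{2j-1},\mu_j(n))}(\lambda)-\chi_{(\mu_j(n),E_{2j})}(\lambda)\Big)+\chi_{(E_\infty,\infty)}(\lambda)+\frac1\pi\arg\Big(1+R_\pm(\lambda)\frac{u_\pm(\lambda,n)}{\overline{u_\pm(\lambda,n)}}\Big)\chi_{\sigma(H_p)}(\lambda) \tag{7.21}$$
since we have
$$\xi(\lambda,n)=\frac12+\frac1\pi\arg\Big(1+R_\pm(\lambda)\frac{u_\pm(\lambda,n)}{\overline{u_\pm(\lambda,n)}}\Big),\qquad\lambda\in\sigma(H_p). \tag{7.22}$$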
Hence we obtain from (5.6)
$$b^{(\ell)}(n)=\frac12\sum_{j=0}^{2M+1}E_j^{\ell}-\sum_{j=1}^{M-1}\mu_j(n)^{\ell}+\frac{\ell}{\pi}\int_{\sigma(H_p)}\lambda^{\ell-1}\arg\Big(1+R_\pm(\lambda)\frac{u_\pm(\lambda,n)}{\overline{u_\pm(\lambda,n)}}\Big)d\lambda, \tag{7.23}$$
and in the special case $\ell=1$
$$b(n)=\frac12\sum_{j=0}^{2M+1}E_j-\sum_{j=1}^{M-1}\mu_j(n)+\frac1\pi\int_{\sigma(H_p)}\arg\Big(1+R_\pm(\lambda)\frac{u_\pm(\lambda,n)}{\overline{u_\pm(\lambda,n)}}\Big)d\lambda. \tag{7.24}$$
The analog of (7.24) in the case of Schrödinger operators with constant background and no eigenvalues was first derived in [11]. The general case for Schrödinger operators can be found in [21]. For further trace formulas in the constant background case, in particular in connection with the Toda lattice, we refer the reader to [8, 16].

Remark 7.1. If $R_\pm(\lambda)=0$ then $H$ can be obtained from $H_p$ by inserting the corresponding number of eigenvalues using the double commutation method provided in [20] since this transformation is easily seen to preserve the reflectionless property.

Acknowledgement. I thank the referee for making several valuable suggestions.
A. Herglotz Functions

The results stated in this section can be found in [4] (see also [3]). We set $\mathbb{C}_\pm=\{z\in\mathbb{C}\,|\,\pm\mathrm{Im}(z)>0\}$. A function $F:\mathbb{C}_+\to\mathbb{C}_+$ is called a Herglotz function (sometimes also Pick or Nevanlinna–Pick function), if $F$ is analytic in $\mathbb{C}_+$. For convenience one usually defines $F$ on $\mathbb{C}_-$ by $F(\bar z)=\overline{F(z)}$. Herglotz functions can be characterized by

Theorem A.1. $F$ is a Herglotz function if and only if
$$F(z)=a+bz+\int_{\mathbb{R}}\Big(\frac{1}{\lambda-z}-\frac{\lambda}{1+\lambda^2}\Big)d\rho(\lambda),\qquad z\in\mathbb{C}_+, \tag{A.1}$$
where $a=\mathrm{Re}\,F(i)\in\mathbb{R}$, $b\ge0$, and $\rho$ is a measure on $\mathbb{R}$ which satisfies $\int_{\mathbb{R}}(1+\lambda^2)^{-1}d\rho(\lambda)<\infty$.

Let $\ln(z)$ be defined such that $\ln(z)=\ln|z|+i\arg(z)$, $-\pi<\arg(z)\le\pi$. Then $\ln(z)$ is holomorphic and $\mathrm{Im}\ln(z)>0$ for $z\in\mathbb{C}_+$. Hence $\ln(z)$ is a Herglotz function. The sum of two Herglotz functions is again a Herglotz function; similarly the composition of two Herglotz functions is Herglotz. In particular, if $F(z)$ is a Herglotz function, the same holds for $\ln F(z)$ and $-\frac{1}{F(z)}$. Thus, using the representation (A.1) for $\ln F(z)$, we get another representation for $F(z)$.
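Theorem A.2. (i) $F$ is a Herglotz function if and only if it has the representation
$$F(z)=\exp\Big\{c+\int_{\mathbb{R}}\Big(\frac{1}{\lambda-z}-\frac{\lambda}{1+\lambda^2}\Big)\xi(\lambda)\,d\lambda\Big\},\qquad z\in\mathbb{C}_+, \tag{A.2}$$
where $c=\ln|F(i)|\in\mathbb{R}$, $\xi\in L^1(\mathbb{R},(1+\lambda^2)^{-1}d\lambda)$ real-valued and $\xi$ is not identically zero. Moreover,
$$\xi(\lambda)=\frac1\pi\lim_{\varepsilon\downarrow0}\mathrm{Im}\ln F(\lambda+i\varepsilon)=\frac1\pi\lim_{\varepsilon\downarrow0}\arg F(\lambda+i\varepsilon) \tag{A.3}$$
for a.e. $\lambda\in\mathbb{R}$, and $0\le\xi(\lambda)\le1$ for a.e. $\lambda\in\mathbb{R}$. Here $-\pi<\arg(F(\lambda+i\varepsilon))\le\pi$ according to the definition of $\ln(z)$.

(ii) Fix $n\in\mathbb{N}$ and set $\xi_+(\lambda)=\xi(\lambda)$, $\xi_-(\lambda)=1-\xi(\lambda)$. Then
$$\int_{\mathbb{R}}|\lambda|^n\xi_\pm(\lambda)\,d\lambda<\infty \tag{A.4}$$
if and only if
$$\int_{\mathbb{R}}|\lambda|^n\,d\rho(\lambda)<\infty\quad\text{and}\quad\lim_{z\to i\infty}\pm F(z)=\pm a\mp\int_{\mathbb{R}}\frac{\lambda\,d\rho(\lambda)}{1+\lambda^2}>0. \tag{A.5}$$

(iii) We have
$$F(z)=\pm1+\int_{\mathbb{R}}\frac{d\rho(\lambda)}{\lambda-z}\quad\text{with}\quad\int_{\mathbb{R}}d\rho(\lambda)<\infty \tag{A.6}$$
if and only if
$$F(z)=\pm\exp\Big(\pm\int_{\mathbb{R}}\xi_\pm(\lambda)\frac{d\lambda}{\lambda-z}\Big)\quad\text{with}\quad\xi_\pm\in L^1(\mathbb{R}) \tag{A.7}$$
($\xi_\pm$ from above). In this case
$$\int_{\mathbb{R}}d\rho(\lambda)=\int_{\mathbb{R}}\xi_\pm(\lambda)\,d\lambda. \tag{A.8}$$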
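To make (A.3) concrete, the following sketch evaluates $\xi$ numerically for two simple rational Herglotz functions. The functions `F_pole` and `F_two_poles` are illustrative choices, not examples from the text, and the limit $\varepsilon\downarrow0$ is replaced by a small fixed $\varepsilon$. For a rational Herglotz function $\xi$ takes only the values 0 and 1 away from poles and zeros, switching at each of them.

```python
import cmath
import math

def xi(F, lam, eps=1e-9):
    """Approximate (A.3): (1/pi) * arg F(lam + i*eps) for a small eps > 0."""
    return cmath.phase(F(complex(lam, eps))) / math.pi

def F_pole(z):
    """F(z) = 1/(1 - z): a single point mass at lambda = 1."""
    return -1.0 / (z - 1.0)

def F_two_poles(z):
    """F(z) = (1/2)/(1 - z) + (1/2)/(3 - z), with a zero at z = 2."""
    return -(z - 2.0) / ((z - 1.0) * (z - 3.0))

# xi jumps from 0 to 1 at the pole of F_pole.
xi_pole_values = [xi(F_pole, lam) for lam in (0.0, 2.0)]

# For F_two_poles, xi is the indicator of (1, 2) united with (3, infinity).
xi_two_pole_values = [xi(F_two_poles, lam) for lam in (0.5, 1.5, 2.5, 3.5)]
```

The alternating pattern of the second example mirrors the way $\xi$-functions in the main text are built from characteristic functions of intervals between band edges and Dirichlet-type eigenvalues.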
References
1. Akhiezer, N.: The Classical Moment Problem. London: Oliver and Boyd, 1965
2. Antony, A.J. and Krishna, M.: Inverse spectral theory for Jacobi matrices and their almost periodicity. Proc. Indian Acad. Sci. (Math. Sci.) 104:4, 777–818 (1994)
3. Aronszajn, N.: On a problem of Weyl in the theory of singular Sturm–Liouville equations. Am. J. Math. 79, 597–610 (1957)
4. Aronszajn, N. and Donoghue, W.: On the exponential representation of analytic functions in the upper half-plane with positive imaginary part. J. Analyse Mathematique 5, 321–388 (1956-57)
5. Atkinson, F.: Discrete and Continuous Boundary Problems. New York: Academic Press, 1964
6. Berezanskii, J.: Expansions in Eigenfunctions of Self-adjoint Operators. Transl. Math. Monographs, vol. 17, Providence, R.I.: Am. Math. Soc., 1968
7. Bulla, W., Gesztesy, F., Holden, H. and Teschl, G.: Algebro-Geometric Quasi-Periodic Finite-Gap Solutions of the Toda and Kac-van Moerbeke Hierarchies. Memoirs of the Am. Math. Soc. (to appear)
8. Case, K.M.: Orthogonal polynomials II. J. Math. Phys. 16, 1435–1440 (1975)
9. Case, K.M. and Kac, M.: A discrete version of the inverse scattering problem. J. Math. Phys. 14, 594–603 (1973)
10. Date, E. and Tanaka, S.: Analogue of inverse scattering theory for discrete Hill's equations and exact solutions for the periodic Toda lattice. Prog. Th. Phys. 59, 457–465 (1976)
11. Deift, P. and Trubowitz, E.: Inverse scattering on the line. Comm. Pure Appl. Math. 32, 121–251 (1979)
12. Flaschka, H.: Discrete and periodic illustrations of some aspects of the inverse method. In: Dynamical Systems: Theory and Applications (ed. J. Moser), Lecture Notes in Physics 38, Berlin: Springer, 1975, pp. 441–466
13. Fu, L. and Hochstadt, H.: Inverse theorems for Jacobi matrices. J. Math. Anal. Appl. 47, 162–168 (1974)
14. Gel'fand, I.M. and Dikii, L.A.: Asymptotic behavior of the resolvent of Sturm-Liouville equations and the algebra of the Korteweg-de Vries equations. Russian Math. Surv. 30:5, 77–113 (1975)
15. Geronimo, J.S. and Van Assche, W.: Orthogonal polynomials with asymptotically periodic recurrence coefficients. J. App. Th. 46, 251–283 (1986)
16. Gesztesy, F. and Holden, H.: Trace formulas and conservation laws for nonlinear evolution equations. Rev. Math. Phys. 6, 51–95 (1994)
17. Gesztesy, F. and Simon, B.: The xi-function. Acta Math. 176, 49–71 (1996)
18. Gesztesy, F. and Simon, B.: Uniqueness theorems in inverse spectral theory for one-dimensional Schrödinger operators. Trans. Am. Math. Soc. 348, 349–373 (1996)
19. Gesztesy, F. and Simon, B.: Rank one perturbations at infinite coupling. J. Funct. Anal. 128, 245–252 (1995)
20. Gesztesy, F. and Teschl, G.: Commutation methods for Jacobi operators. J. Diff. Eqs. 128, 252–299 (1996)
21. Gesztesy, F., Holden, H. and Simon, B.: Absolute summability of the trace relation for certain Schrödinger operators. Com. Math. Phys. 168, 137–161 (1995)
22. Gesztesy, F., Krishna, M. and Teschl, G.: On isospectral sets of Jacobi operators. Commun. Math. Phys. 181, 631–645 (1996)
23. Gesztesy, F., Nowell, R. and Pötz, W.: One-dimensional scattering for quantum systems with nontrivial spatial asymptotics. Diff. Integral Eqs. 10, 521–546 (1997)
24. Gesztesy, F., Ratnaseelan, R. and Teschl, G.: The KdV hierarchy and associated trace formulas. In: Proceedings of the International Conference on Applications of Operator Theory (eds. I. Gohberg, P. Lancaster, and P. N. Shivakumar), Oper. Theory Adv. Appl. 87, Basel: Birkhäuser, 1996, pp. 125–163
25. Gesztesy, F., Holden, H., Simon, B. and Zhao, Z.: Higher order trace relations for Schrödinger operators. Rev. Math. Phys. 7, 893–922 (1995)
26. Gesztesy, F., Holden, H., Simon, B. and Zhao, Z.: A trace formula for multidimensional Schrödinger operators. J. Funct. Anal. 141, 449–465 (1996)
27. Guseinov, G.S.: The inverse problem of scattering theory for a second-order difference equation on the whole axis. Soviet Math. Dokl. 17, 1684–1688 (1976)
28. Klaus, M.: On bound states of the infinite harmonic crystal. Hel. Phys. Acta 51, 793–803 (1978)
29. Krein, M.G.: Perturbation determinants and a formula for the traces of unitary and self-adjoint operators. Soviet Math. Dokl. 3, 707–710 (1962)
30. Krichever, I.M.: Algebro-geometric spectral theory of the Schrödinger difference operator and the Peierls model. Soviet Math. Dokl. 26, 194–198 (1982)
31. Levitan, B.M.: Inverse Sturm–Liouville Problems. Utrecht: VNU Science Press, 1987
32. van Moerbeke, P.: The spectrum of Jacobi Matrices. Inv. Math. 37, 45–81 (1976)
33. Olver, F.W.J.: Asymptotics and Special Functions. New York: Academic Press, 1974
34. Simon, B.: Spectral Analysis of Rank One Perturbations and Applications. Proceedings, Mathematical Quantum Theory II: Schrödinger Operators (eds. J. Feldman, R. Froese, and L. M. Rosen), CRM Proc. Lecture Notes 8, 1995, pp. 109–149
35. Sodin, M.L. and Yuditskiĭ, P.M.: Infinite-zone Jacobi matrices with pseudo-extendible Weyl functions and homogeneous spectrum. Russ. Acad. Sci. Dokl. Math. 49, 364–368 (1994)
36. Sodin, M. and Yuditskii, P.: Almost periodic Jacobi matrices with homogeneous spectrum, infinite dimensional Jacobi inversion, and Hardy spaces of character-automorphic functions. Preprint
37. Teschl, G.: Oscillation theory and renormalized oscillation theory for Jacobi operators. J. Diff. Eqs. 129, 532–558 (1996)
38. Teschl, G.: Jacobi Operators and Completely Integrable Lattices. In preparation, available from http://www.mat.univie.ac.at/~gerald/ftp/book-jac/.
39. Toda, M.: Theory of Nonlinear Lattices. 2nd enl. edition, Berlin: Springer, 1989

Communicated by B. Simon
Commun. Math. Phys. 196, 203 – 247 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On the Structure of Correlation Functions in the Normal Matrix Model Ling-Lie Chau1 , Oleg Zaboronsky2 1 Department of Physics, University of California at Davis, Davis, CA 95616, USA. E-mail: [email protected] 2 School of Mathematics, The Institute for Advanced Study, Princeton, NJ 08540, USA. E-mail: [email protected]
Received: 7 September 1997 / Accepted: 28 January 1998
Abstract: We study the structure of the normal matrix model (NMM). We show that all correlation functions of the model with an axially symmetric potential can be expressed in terms of a single holomorphic function of one variable. This observation is used to demonstrate the exact solvability of the model. The two-point correlation function is calculated in the continuum limit. The answer is proven to be universal, i.e. potential independent up to a change of the scale. In connection with NMM with a general polynomial potential we have developed a two-dimensional free fermion formalism and constructed a family of completely integrable hierarchies of non-linear differential equations, which we call the extended-KP (N ) hierarchies. The well-known KP hierarchy is a lower-dimensional reduction of this family. The extended-KP (1) hierarchy contains the (2+1)-dimensional Burgers equations. The partition function of the (N × N ) NMM is the τ function of the extended-KP (N ) hierarchy which is invariant with respect to a subalgebra of an algebra of all infinitesimal diffeomorphisms of the plane.
1. Introduction

The Normal Matrix Model (NMM) was introduced in [5] and its connection to the Quantum Hall Effect was indicated. In [6] this connection was given a precise form by observing that the partition function of NMM coincides with the zero-temperature partition function of two-dimensional electrons in the strong magnetic field. In the present paper we study the mathematical structure of the NMM. The distinct feature of the NMM, in contrast to other matrix models (Hermitian, unitary, etc.), is its relation to two-plus-one-dimensional [(2+1)-d] physical systems, rather than the one-plus-one-dimensional [(1+1)-d] ones. For example, we will exploit the equivalence of NMM and the system of (2+1)-d Coulomb particles to compute correlation functions of the NMM in the continuum limit. We will also show that the partition function of NMM is a τ-function of an integrable hierarchy containing (2+1)-d Burgers equations.
The paper is organized as follows. To make the present exposition self-contained we reproduce in Sect. 2 the results of [6], devoted to the definition of the model, the derivation of the eigenvalue formula for its partition function and the integrability of the model. To emphasize the (2+1)-d nature of the NMM we notice that the partition function of the model can be interpreted as a classical partition function of Coulomb particles in a plane. In contrast, the partition function of the Hermitian matrix model (HMM) coincides with the classical partition function of Coulomb particles constrained to a string in a plane. We show that the NMM partition function can be written in a determinant form. This enables us to conclude that the partition function of the model is a τ -function of Toda lattice with respect to holomorphic-antiholomorphic variations of the potential. In Sect. 3 we study the NMM with axially symmetric potentials. First we derive a determinant representation for the correlation functions of the model. We notice that the orthogonal polynomials associated with the integration measure are just powers of the complex variable z. This simplifies the analysis of the correlation functions considerably. We compute the correlation functions for the case when the matrix model potential is a monomial, V (M, M † ) = (M · M † )k . The answer is expressed through degenerate hypergeometric functions. Thus NMM provides us with an example of the matrix model which is exactly (and explicitly) solvable beyond the Gaussian case even before taking the continuum limit. The two-point function of Gaussian NMM decreases exponentially at infinity, which should be compared to the power decay of the two-point function of HMM. 
While the latter can be obtained from quantum-mechanical computation in the system of free (1 + 1)-d identical fermions in the external potential (see [31] for details), the former follows from the analogous computation in the system of free (2+1)-d identical fermions placed in the external magnetic field. In general we find that all correlation functions can be expressed in terms of a holomorphic function in one variable. This observation permits us to close a BBGKY1 chain of equations for the correlation functions and obtain a closed integral-differential equation for the two-point function. The existence of a single equation which determines the two-point function and thus, through determinant representation, all correlation functions, manifests the exact solvability of the NMM with an axially symmetric potential. In Sect. 4 we use the holomorphic property of the NMM with axially symmetric potentials to compute the continuum N → ∞ limit of the two-point function. The answer is proven to be universal, i.e. potential-independent up to a change of scale. Starting with Sect. 5 we consider NMM with general polynomial potentials. We develop a (2 + 1)-d free fermion formalism and construct the free fermion representation of the partition function of NMM. This construction has its analogy in the theory of HMM, in which the partition function admits a (1 + 1)-d free fermion representation. In Sect. 6, we use the formalism developed in Sect. 5 to construct a family of completely integrable hierarchies of non-linear differential equations, labeled by an integer N . We show that the N = 1 hierarchy contains the (2 + 1)-d Burgers equations, which is a multidimensional generalization [28] of the original Burgers equation [4], see [18] for a review. We thus call this N = 1 hierarchy the Burgers hierarchy. We give explicit solutions to it. These solutions generalize the Hopf-Cole solutions to the (2+1)-d Burgers equations, [19] and [7]. 
The family constructed provides a multidimensional extension of KP hierarchy which can be explained as follows. It is well known that the KP hierarchy can be formulated in terms of the pseudodifferential operator $W(t,\partial)=1+w_1(t)\partial^{-1}+w_2(t)\partial^{-2}+\cdots$ ([43, 44], see [10] for a review). It is also known that KP hierarchy admits a reduction specified by the condition that $W(t,\partial)\partial^N$ is a differential operator, see [35] and [36]. We call it the KP(N) hierarchy. The set of equations of KP hierarchy coincides with that of KP(N) hierarchy with $N\to\infty$. We show that the Nth representative of our family of hierarchies of non-linear differential equations is a multidimensional extension of KP(N) hierarchy. Thus we call it the extended-KP(N) hierarchy. In the simplest $N=1$ case the extended-KP(1), or equivalently the Burgers hierarchy, contains the (2+1)-d Burgers equations while the KP(1) hierarchy contains the (1+1)-d Burgers equation. We then classify all formal solutions to the extended-KP(N) hierarchy. We find the one-to-one correspondence between the set of formal solutions and an open subset of an infinite-dimensional Grassmann manifold $Gr(\infty,N)$. This subset consists of all N-dimensional subspaces of an infinite-dimensional complex linear space having a nondegenerate projection onto a fixed N-dimensional subspace. Next we show that the partition function of the (N × N) NMM is a τ-function of the extended-KP(N) hierarchy. This is achieved by using the bosonization formula which can be regarded as a (2+1)-d generalization of (1+1)-d bosonization formulae. Finally in Sect. 7 we discuss the Ward identities for the NMM. It is known that matrix model solutions to the KP hierarchy (like HMM) satisfy Virasoro constraints, i.e. they are annihilated by an infinite set of differential operators spanning a subalgebra of Virasoro algebra. This subalgebra is isomorphic to an algebra of infinitesimal holomorphic polynomial diffeomorphisms of the complex plane.

1 Born, Bogoliubov, Green, Kirkwood and Yvon, [33].
We show that the partition function of NMM is a special solution to the extended-KP (N ) hierarchy annihilated by the set of differential operators generating an algebra of all infinitesimal polynomial diffeomorphisms of the plane. The results of this section can be used in the further analysis of the continuum limit of the NMM. The proofs of lemmas and theorems presented in the paper are placed between “♦” signs. References are ordered alphabetically.
2. The Eigenvalue and the Determinant Forms of the Normal Matrix Model

In this section we will give the definition of the normal matrix model and derive the eigenvalue and determinant formulae for the partition function. We will be utilizing the standard methods of the theory of matrix models (see [34] for review). In the end of this section, we will derive a connection between NMM and the Toda lattice hierarchy. The partition function of NMM has the following form:
$$Z_N=\int_{\{\Gamma:[M,M^\dagger]=0\}}d\mu(\Gamma)\,e^{-\mathrm{tr}V(M,M^\dagger)}, \tag{1}$$
where $\Gamma$ denotes the set of $N\times N$ normal matrices, $d\mu(\Gamma)$ is a measure on $\Gamma$ induced by flat metric on the space of all $N\times N$ complex matrices and $V(z,\bar z)$ is a function on $\mathbb{C}$ such that (1) exists. As usual, integral (1) can be reduced to the integral over eigenvalues $\{z_i\}_{i=1}^N$ and $\{\bar z_i\}_{i=1}^N$ of matrices $M$ and $M^\dagger$ respectively. A corresponding calculation can be easily performed given the explicit expression for $d\mu(\Gamma)$ in the appropriate local coordinates on $\Gamma$. So let us sketch the calculation of $d\mu(\Gamma)$.
First we notice that any normal matrix $M$ can be presented in the form $M=UDU^\dagger$, where $D$ is a diagonal matrix and $U\in U(N)$ is a diagonalizing matrix. This decomposition is unique up to multiplying $U$ by a diagonal unitary matrix from the right. The corresponding equivalence class of $U$ together with $D$ defines a convenient coordinate system on $\Gamma$. Using these coordinates one can calculate the Riemannian metric induced on $\Gamma$ by flat metric on the space of all matrices:
$$\|\delta M\|^2\equiv\mathrm{tr}(\delta M\cdot\delta M^\dagger)=\mathrm{tr}(\delta D\cdot\delta D^\dagger)+2\,\mathrm{tr}(\delta u\cdot D\cdot\delta u\cdot D^\dagger-\delta u\cdot\delta u\cdot D\cdot D^\dagger), \tag{2}$$
where $\delta u=U^\dagger\delta U$ is an invariant (Haar) length element on $U(N)$. Due to the $U(N)$-invariance $\delta u$ is well-defined in terms of our coordinates on $\Gamma$. Expressing $\|\delta M\|^2$ through eigenvalues and matrix elements $(\delta u)_{ij}\equiv(U^\dagger\delta U)_{ij}$ we get
$$\|\delta M\|^2=\sum_{i=1}^{N}\delta z_i\,\delta\bar z_i-\sum_{i,j=1}^{N}\delta u_{ij}\,\delta u_{ji}\,|z_i-z_j|^2. \tag{3}$$
On the other hand, $\|M\|^2=G_{ab}l^al^b$, where $\{l^a\}$ is a cumulative notation for the local coordinates on $\Gamma$, $G_{ab}$ is an induced metric on $\Gamma$. Then $\mu(\Gamma)=\det(G)^{1/2}\prod_a dl^a$ (see e.g. [11]). Combining these two formulae with (3) we obtain
$$d\mu(\Gamma)=dU\prod_{i=1}^{N}dz_i\,d\bar z_i\,|\Delta(z)|^2, \tag{4}$$
where $dU=\prod_{i\neq j}du_{ij}\,du_{ji}$ is Haar measure on $U(N)$ and $\Delta(z)\equiv\det[z_i^{j-1}]_{1\le i,j\le N}=\prod_{i>j}(z_i-z_j)$ is the Van der Monde determinant; $dz\,d\bar z\equiv d\mathrm{Im}(z)\,d\mathrm{Re}(z)$. Finally, substituting (4) into (1) and integrating over the unitary group we arrive at the eigenvalue expression for $Z_N$:
$$Z_N=c(N)\int_{\mathbb{C}^N}\Big(\prod_{i=1}^{N}dz_i\,d\bar z_i\,e^{-V(z_i,\bar z_i)}\Big)|\Delta(z)|^2, \tag{5}$$
where $c(N)$ is the volume of the unitary group, a constant factor independent of $V$. From now on we replace it by unity. Next we will deduce a determinant formula for the integral (5) and relate the result to the solutions to the Toda lattice hierarchy of differential equations. The reformulation of (5) in determinant form is standard and is based on the following formula:
$$\det[M_{ik}]\cdot\det[M_{jk}]=\sum_{\sigma\in S_N}\det[M_{i\sigma(j)}M_{j\sigma(j)}], \tag{6}$$
where $M$ is any $N\times N$ matrix, $\sigma$ is an element of the symmetric group $S_N$. Applying (6) to the product of Van der Monde determinants $\Delta(z)\overline{\Delta(z)}$ in (5) we arrive at the desired form of the partition function of NMM:
$$Z_N=N!\cdot\det[Z_{ij}]_{1\le i,j\le N}, \tag{7}$$
where
$$Z_{ij}\equiv\int_{\mathbb{C}}dz\,d\bar z\,e^{-V(z,\bar z)}\,z^{i-1}\bar z^{j-1}. \tag{8}$$
Let us consider the potential $V(z,\bar z)$ in the form
$$V_t(z,\bar z)=U(z,\bar z)-\sum_{k>0}(t_kz^k+\bar t_k\bar z^k) \tag{9}$$
with $t_k=0=\bar t_k$ for $k\gg1$. We are interested in the behavior of $Z_N[V_t]$ with respect to variations of $t$'s and $\bar t$'s. Note that $Z_N[V_t]$ can be considered as a generating function of correlators in NMM with the potential $U(z,\bar z)$:
$$\langle(\mathrm{tr}M^{i_1})^{j_1}\cdot(\mathrm{tr}M^{\dagger i_2})^{j_2}\cdots\rangle_{NMM}\equiv\frac{1}{Z_N}\int_{\mathbb{C}^N}\prod_{i=1}^{N}\big(dz_i\,d\bar z_i\,e^{-V(z_i,\bar z_i)}\big)|\Delta(z)|^2\,(\mathrm{tr}M^{i_1})^{j_1}\cdot(\mathrm{tr}M^{\dagger i_2})^{j_2}\cdots=\frac{1}{Z_N}\Big(\frac{\partial}{\partial t_{i_1}}\Big)^{j_1}\cdot\Big(\frac{\partial}{\partial\bar t_{i_2}}\Big)^{j_2}\cdots Z_N[V_t]\Big|_{t,\bar t=0}.$$
From (8) with potential given by (9) we can easily deduce that
$$\frac{\partial Z_{ij}}{\partial t_k}=Z_{(i+k),j},\qquad\frac{\partial Z_{ij}}{\partial\bar t_k}=Z_{i,(j+k)},\qquad i,j\ge1. \tag{10}$$
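Before drawing the consequences of (7) and (10), the determinant formula (7)–(8) can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not the authors' procedure: the potential is taken axially symmetric, $V=V(|z|^2)$, the moment integrals (8) are computed by a plain polar-grid quadrature, and $c(N)$ is set to unity as in the text. For the Gaussian potential $V=|z|^2$ one expects $Z_{ij}=\pi\,(i-1)!\,\delta_{ij}$ and hence $Z_N=N!\,\pi^N\prod_{m=0}^{N-1}m!$.

```python
import math
import numpy as np

def moment_matrix(N, V, r_max=10.0, n_r=20000, n_theta=32):
    """Approximate Z_ij of (8) for V = V(|z|^2), with zero-based i, j,
    i.e. entry [i, j] is the integral of exp(-V) * z**i * conj(z)**j."""
    h = r_max / n_r
    r = (np.arange(n_r) + 0.5) * h                  # midpoint rule in r
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    radial_weight = np.exp(-V(r ** 2)) * r * h      # exp(-V) * r dr
    Z = np.zeros((N, N), dtype=complex)
    for i in range(N):
        for j in range(N):
            radial = np.sum(radial_weight * r ** (i + j))
            angular = np.sum(np.exp(1j * (i - j) * theta)) * (2.0 * np.pi / n_theta)
            Z[i, j] = radial * angular
    return Z

def partition_function(N, V):
    """Z_N = N! * det[Z_ij], formula (7)."""
    return math.factorial(N) * np.linalg.det(moment_matrix(N, V)).real

def gaussian(x):
    """Gaussian potential V(|z|^2) = |z|^2, with x = |z|^2."""
    return x

Z1 = partition_function(1, gaussian)
Z3 = partition_function(3, gaussian)
```

For $N=2$ the same value, $2\pi^2$, also follows directly from the eigenvalue integral (5) by expanding $|z_1-z_2|^2$, which gives an independent check of (7).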
Equation (7) together with (10) mean that $Z_N[V_t]$ is an Nth τ-function of Toda lattice (with respect to complex variables $t$'s and $\bar t$'s, see [30, 49] for details). Note that the real version of the integral (5) together with potential (9) was studied in [34], where it was referred to as a "scalar product model". In Sect. 6 we will discuss a relation between NMM with an arbitrary potential and integrable systems which include the one described above as a particular case.

3. Correlation Functions of the Normal Matrix Model

In this section we will analyze the structure of correlation functions of NMM with arbitrary axially symmetric polynomial potentials, $V(z,\bar z)=V(|z|^2)$. We will see in this case that the two-point correlation function can be expressed in terms of the (analytical continuation of) one-point correlation function, thus leading to the exact solvability of the model. It follows from (5) that the probability distribution function of the eigenvalues in NMM with an axially symmetric potential is
$$P_N(z_1,\bar z_1,\cdots,z_N,\bar z_N|V)=\frac{1}{Z_N}\,|\Delta(z)|^2\,e^{-\sum_{i=1}^{N}V(|z_i|^2)}, \tag{11}$$
where $Z_N$ is the partition function of NMM. The n-point correlation functions (also called reduced distribution functions in statistical physics) are defined as follows:
$$R_N^{(n)}(z_1,\bar z_1,\cdots,z_n,\bar z_n|V)\equiv\frac{N!}{(N-n)!}\int_{\mathbb{C}^{N-n}}\prod_{i=n+1}^{N}dz_i\,d\bar z_i\,P_N(z_1,\bar z_1,\cdots,z_N,\bar z_N|V), \tag{12}$$
where the combinatorial prefactor accounts for the symmetry of the integrand. Note that
$$R_N^{(1)}(z,\bar z|V)=\Big\langle\sum_{k=1}^{N}\delta(z-z_k)\,\delta(\bar z-\bar z_k)\Big\rangle_{NMM} \tag{13}$$
and coincides therefore with the density of eigenvalues of the NMM. The density of eigenvalues is defined as the number of eigenvalues per unit area. It is usually referred to as the "level density", and this is the term we are going to use everywhere below.
Correlation functions (12) can be presented in determinant form. To prove this we note that the expression (11) for the probability distribution function can be rewritten as
$$P_N(z_1, \bar z_1, \cdots, z_N, \bar z_N | V) = \frac{1}{N!} \det\left[ K_N(\bar z_i, z_i, z_j, \bar z_j | V) \right]_{1 \le i,j \le N}, \tag{14}$$
where
$$K_N(\bar z_i, z_i, z_j, \bar z_j | V) \equiv \sum_{k=0}^{N-1} \phi_k(\bar z_i, z_i)\, \phi_k(z_j, \bar z_j), \tag{15}$$
and
$$\phi_m(z, \bar z) \equiv \sqrt{c_m[V]}\; z^m\, e^{-\frac{1}{2} V(|z|^2)}, \tag{16}$$
where $c_m[V] > 0$. The functions $\phi_m(z, \bar z)$ introduced above are the orthogonal functions of the problem, normalized by the following condition:
$$\int_{\mathbb{C}} dz\, d\bar z\; \phi_m(\bar z, z)\, \phi_{m'}(z, \bar z) = \delta_{m,m'}, \tag{17}$$
which yields
$$\pi\, c_m[V] \int_0^{\infty} dx\, x^m e^{-V(x)} = 1. \tag{18}$$
Apparently, the orthogonal polynomials associated with the NMM with an axially symmetric potential are just monomials in $z$. This explains the relative simplicity of the model. It is easy to check that the matrix $K_N(\bar z_i, z_i, z_j, \bar z_j | V)$ is Hermitian and that it can be considered as the kernel of a projection operator, i.e. the relation
$$\int_{\mathbb{C}} dv\, d\bar v\; K_N(\bar u, u, v, \bar v | V)\, K_N(\bar v, v, w, \bar w | V) = K_N(\bar u, u, w, \bar w | V) \tag{19}$$
is satisfied. Therefore one can make use of the well-known result from the theory of matrix models (see [31], p. 89, Theorem 5.2.1) to obtain the following representation of the $n$-point correlation functions defined in (12):
$$R_N^{(n)}(z_1, \bar z_1, \cdots, z_n, \bar z_n | V) = \det\left[ K_N(\bar z_i, z_i, z_j, \bar z_j | V) \right]_{1 \le i,j \le n}. \tag{20}$$
It is convenient to rewrite the expression for $K_N(\bar z_i, z_i, z_j, \bar z_j | V)$ by substituting (16) into (15):
$$K_N(\bar z_i, z_i, z_j, \bar z_j | V) = \sum_{m=0}^{N-1} c_m[V] \cdot (\bar z_i \cdot z_j)^m\, e^{\frac{1}{2}\left( -V(|z_i|^2) - V(|z_j|^2) \right)} \equiv k_N(\bar z_i \cdot z_j | V)\, e^{\frac{1}{2}\left( -V(|z_i|^2) - V(|z_j|^2) \right)}. \tag{21}$$
Therefore, up to a factor which is explicitly expressed through the potential, $K_N(\bar z, z, w, \bar w | V)$ is completely determined by the function $k_N(\bar z \cdot w | V)$ of one complex variable.
To illustrate the formulae obtained above let us consider the NMM with the Gaussian potential, $V(|z|^2) = |z|^2$, in the limit $N \to \infty$. A simple calculation gives the following answer for the kernel:
$$K_\infty\left(\bar z, z, w, \bar w \,\big|\, V(x) = x\right) = \frac{1}{\pi}\, e^{\left( \bar z w - \frac{1}{2}|z|^2 - \frac{1}{2}|w|^2 \right)}. \tag{22}$$
Substituting (22) into (20) we conclude that in the limit $N \to \infty$ the one-point correlation function is constant and equal to $\frac{1}{\pi}$, whereas the 2-point correlation function in the same limit is
$$R_\infty^{(2)}\left(z, \bar z, w, \bar w \,\big|\, V(x) = x\right) = \frac{1}{\pi^2} \left[ 1 - e^{-|z-w|^2} \right]. \tag{23}$$
Recall that the connected part of the two-point correlation function is defined as a difference between the two-point correlation function and a product of two one-point functions. It follows from the last two results that the connected part of the 2-point correlation function is equal to $\frac{1}{\pi^2} \exp[-|z-w|^2]$ and decays exponentially at infinity. Let us note that the Gaussian NMM is equivalent to the Gaussian Complex matrix model (see [31], Chapter 15). However, this is no longer true for a non-Gaussian matrix model potential.
Computability of the NMM goes beyond the Gaussian case. For example, it is possible to obtain a closed expression for $K_\infty(\bar z, z, w, \bar w | V)$ when the NMM potential is a monomial in $|z|^2$: $V(z, \bar z) = |z|^{2k}$, where $k$ is a positive integer. The corresponding integral in (18) is easily calculated to give the following expression for $c_m[V(x) = x^k]$:
$$c_m[V(x) = x^k] = \frac{k}{\pi}\, \frac{1}{\Gamma\left( \frac{m+1}{k} \right)}. \tag{24}$$
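As a numerical cross-check of (24), the normalization condition (18) can be evaluated by quadrature and compared with the closed form. The Python sketch below is only illustrative: the function names, the integration cutoff `upper` and the number of Simpson steps are assumptions introduced here and are not part of the original derivation.

```python
import math


def radial_moment(m, k, upper=40.0, steps=20000):
    """Composite Simpson estimate of the integral of x^m exp(-x^k) over [0, upper].

    The cutoff `upper` is an assumption: the integrand is negligible beyond it
    for the small values of m and k used here. `steps` must be even.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * x**m * math.exp(-(x**k))
    return total * h / 3.0


def c_from_normalization(m, k):
    """c_m[V] obtained from (18): pi * c_m * integral = 1, with V(x) = x^k."""
    return 1.0 / (math.pi * radial_moment(m, k))


def c_closed_form(m, k):
    """Closed form (24): c_m = k / (pi * Gamma((m + 1) / k))."""
    return k / (math.pi * math.gamma((m + 1) / k))


if __name__ == "__main__":
    for k in (1, 2, 3):
        for m in range(5):
            print(k, m, c_from_normalization(m, k), c_closed_form(m, k))
```

For $k = 1$ the closed form gives $c_m = 1/(\pi\, m!)$, which is the Gaussian case used in (22).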
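Substituting (24) into (21) one obtains the following answer for the kernel at $N = \infty$:
$$K_\infty\left(\bar z, z, w, \bar w \,\big|\, V(x) = x^k\right) = \frac{k}{\pi} \sum_{p=0}^{k-1} \frac{(\bar z w)^p}{\Gamma\left( \frac{p+1}{k} \right)}\, F_{1,1}\left( 1; \frac{p+1}{k}; (\bar z w)^k \right) e^{\left( -\frac{1}{2}|z|^{2k} - \frac{1}{2}|w|^{2k} \right)}, \tag{25}$$
where $F_{1,1}(a; b; z)$ is the degenerate hypergeometric function, the solution to the Kummer equation which is analytic at $z = 0$:
$$z \frac{d^2 \omega}{dz^2} + (b - z) \frac{d\omega}{dz} - a\omega = 0. \tag{26}$$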
For $k = 1$, (25) reduces to (22). It is well known (see e.g. [17]) that $F_{1,1}(a; b; z)$ has a Taylor expansion in $z$ with an infinite radius of convergence and therefore defines an entire function on the complex plane. We use this remark below to analyze the correlation functions of the NMM in the case of an arbitrary polynomial potential $V$.
Suppose that $V(x) = \sum_{n=1}^{d} t_n x^n$ with $t_d > 0$. Then there are positive constants $a_1$ and $a_2$ such that
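$$V(x) \le a_1 + a_2 x^d, \qquad x \ge 0. \tag{27}$$
This estimate, inserted into (18), gives $c_m[V] \le e^{a_1} c_m[a_2 x^d]$ and can therefore be used to prove that for any complex number $u$ and a positive number $R$ such that $|u| \le R$,
$$\left| k_N\Big( u \,\Big|\, V(x) = \sum_{n=1}^{d} t_n x^n \Big) \right| \le e^{a_1}\, k_N\Big( R \,\Big|\, V(x) = a_2 x^d \Big). \tag{28}$$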
But the r.h.s. of (28) is nothing but a partial sum of a series which, by the remark above, converges for any $0 \le R < \infty$. Therefore $k_\infty(u|V) \equiv \lim_{N \to \infty} k_N(u|V)$ exists for any $u$ such that $|u| < \infty$ and defines an entire function on the complex plane. But an entire function is uniquely determined by its values on, say, the positive part of the real line, which provides us with the connection between 2- and 1-point correlation functions. To make this connection explicit we notice that
$$R_N^{(1)}(z, \bar z | V) = k_N\left( |z|^2 \big| V \right) e^{-V(|z|^2)}, \qquad 0 < N \le \infty. \tag{29}$$
Denote by $R_N^{(1)}(u|V)$ an entire function of a complex variable $u$ which coincides with the 1-point correlation function above for $u = |z|^2$, $R_N^{(1)}\left( |z|^2 \big| V \right) = R_N^{(1)}(z, \bar z | V)$. Such a function exists as $V$ is a polynomial and $k_N(u|V)$ has been proven to be an entire function for any $N$: $0 < N \le \infty$. Moreover it is unique, and therefore determined completely by the 1-point correlation function. We finally notice that the connected part of the 2-point correlation function can be expressed in turn in terms of $R_N^{(1)}(u|V)$:
$$R_{c\,N}^{(2)}(z, \bar z, w, \bar w | V) = \left| R_N^{(1)}(\bar z w | V) \right|^2 e^{-\left( V(|z|^2) + V(|w|^2) - V(\bar z w) - V(z \bar w) \right)}, \qquad 0 < N \le \infty, \tag{30}$$
which states the claimed property of the NMM.
Relation (30) leads to the exact solvability of the NMM. Let us discuss this point in some more detail. It follows from (20) and (21) that all correlation functions of the model can be expressed in terms of $R_N^{(1)}(u|V)$. On the other hand they can be considered as a solution to a certain chain of integro-differential equations which is discussed below. These two remarks permit one to derive a closed equation for the function $R_N^{(1)}(u|V)$. The existence of such an equation means the exact solvability of the NMM. To obtain this equation we rewrite the expression for the probability distribution function (11) in the following way:
$$P_N(z_1, \bar z_1, \cdots, z_N, \bar z_N | V) = \frac{1}{Z_N} \exp\Big( -\sum_{i=1}^{N} V(|z_i|^2) + 2 \sum_{i<j} \ln|z_i - z_j| \Big). \tag{31}$$
The right-hand side of (31) can be identified with the probability distribution function of a classical two-dimensional Coulomb gas in thermal equilibrium. The analogous equivalence has been widely exploited in the study of conventional matrix ensembles (see [31] for a review). Such an identification allows one to use Liouville's Theorem of classical mechanics to derive a BBGKY chain of equations for the correlation functions (12) (see e.g. [33]). In particular we have the following equation, connecting one- and two-point correlation functions:
$$\frac{\partial R_N^{(1)}\left( |z|^2 \big| V \right)}{\partial z} + \frac{\partial V(|z|^2)}{\partial z}\, R_N^{(1)}\left( |z|^2 \big| V \right) = \int dw\, d\bar w\; R_N^{(2)}(z, \bar z, w, \bar w | V)\, \frac{\partial}{\partial z} \ln|z - w|^2, \tag{32}$$
and another one obtained from (32) by means of complex conjugation. Equation (32) and its complex conjugate form the first pair of equations of the BBGKY chain for the equilibrium distribution functions in the configuration space. These equations can be derived from the corresponding equations of the BBGKY chain for the equilibrium correlation functions in the phase space by integrating out the momentum variables. The details of the derivation can be found in [45], p. 48.
Substituting (30) into (32) we get a closed equation for $R_N^{(1)}(u|V)$:
$$\frac{\partial R_N^{(1)}\left( |z|^2 \big| V \right)}{\partial z} + \frac{\partial V(|z|^2)}{\partial z}\, R_N^{(1)}\left( |z|^2 \big| V \right) = R_N^{(1)}\left( |z|^2 \big| V \right) \int dw\, d\bar w\; R_N^{(1)}\left( |w|^2 \big| V \right) \frac{\partial}{\partial z} \ln|z - w|^2$$
$$- \int dw\, d\bar w\; \left| R_N^{(1)}(\bar z w | V) \right|^2 e^{-\left( V(|z|^2) + V(|w|^2) - V(\bar z w) - V(z \bar w) \right)} \frac{\partial}{\partial z} \ln|z - w|^2. \tag{33}$$
Equation (33), together with the reality condition $\overline{R_N^{(1)}(u|V)} = R_N^{(1)}(\bar u|V)$ and the normalization condition $\int dz\, d\bar z\, R_N^{(1)}\left( |z|^2 \big| V \right) = N$, can be used to determine the entire function $R_N^{(1)}(u|V)$ and thus all correlation functions of the NMM. In what follows we will use (33) to complement our study of the continuum limit of the NMM.
4. NMM in the Continuum Limit

The present section is devoted to the study of the continuum limit of the NMM. We will compute the corresponding limits of the level density (one-point correlation function) and of the two-point correlation function. We will see that in the regions where the limiting level density is non-zero, the answer for the connected part of the two-point correlation function is universal, i.e. independent of the NMM potential up to a rescaling. It is appropriate to mention that our search for universal answers in the NMM has been motivated by the study of universality in the Hermitian matrix models, see e.g. [2].
Consider the $N \times N$ normal matrix model with the potential $N \cdot V(|z|^2)$, where $V(x)$ is a polynomial with real non-negative coefficients and a positive coefficient in front of the monomial of the highest degree. As is easy to check, such polynomials satisfy the following inequality: for any $z, w \in \mathbb{C}$,
$$V(|z|^2) + V(|w|^2) - V(\bar z w) - V(z \bar w) \ge 0, \tag{34}$$
and the equality in (34) is reached iff $z = w$. Setting $w = z + \epsilon$ and expanding the l.h.s. of (34) around $\epsilon = 0$ we get an infinitesimal version of the condition (34):
$$\Delta V(|z|^2) \ge 0, \tag{35}$$
where $\Delta \equiv \partial \bar\partial$ is the two-dimensional Laplace operator.
We are interested in the asymptotic properties of the model as $N \to \infty$. As we will see shortly, the level density for large $N$ is of order $N$, given that it is non-zero. Recall also that the NMM is equivalent to the system of two-dimensional Coulomb particles in thermal equilibrium. Given that the density is high enough, this system can be approximated by a continuous medium. We therefore refer to the large-$N$ limit of the $N \times N$ NMM with the potential $N \cdot V(|z|^2)$ as the continuum limit.
One more preliminary remark is needed. In order to get a non-trivial answer for the connected part of the two-point correlation function $R_{c\,N}^{(2)}(z, \bar z, w, \bar w | N \cdot V)$ in the large-$N$ limit, one has to bring the points $z$ and $w$ closer and closer to each other as $N$ grows. Speaking metaphorically, we should regard the system under a microscope: the larger $N$ is, the stronger the microscope must be. This remark motivates the following adjustment to the definition of the two-point correlation function:
$$R_N^{(2)}\left( \xi, \bar\xi, \eta, \bar\eta; z, \bar z \,\big|\, N \cdot V \right) \equiv R_N^{(2)}\left( z \cdot e^{\xi/\sqrt{N}},\ \bar z \cdot e^{\bar\xi/\sqrt{N}},\ z \cdot e^{\eta/\sqrt{N}},\ \bar z \cdot e^{\bar\eta/\sqrt{N}} \,\big|\, N \cdot V \right). \tag{36}$$
According to the definition above, the distance between the points whose correlation we measure decreases as $1/\sqrt{N}$. Such a dependence of the distance on $N$ leads to a finite non-trivial $N \to \infty$ limit of the exponent in (30). As we will see below, this secures the existence of a non-zero asymptotic expansion of the two-point correlation function in inverse powers of $\sqrt{N}$. Now we are ready to present an answer for the one- and two-point correlation functions in the continuum limit:

Theorem 1. Let $V(x) = \sum_{k=1}^{d} t_k x^k$ be a polynomial of positive degree $d$ with non-negative coefficients. Then
$$R_N^{(1)}(z, \bar z | N \cdot V) \sim \begin{cases} \dfrac{N}{\pi} \cdot \Delta V(|z|^2) \cdot \left( 1 + O(1/\sqrt{N}) \right), & 0 < |z|^2 < x_0, \\ 0, & |z|^2 > x_0, \end{cases} \tag{37}$$
where $x_0$ is the unique positive solution of the equation
$$x_0 \cdot V'(x_0) = 1. \tag{38}$$
Moreover, for any $z$: $0 < |z|^2 < x_0$,
$$R_{c\,N}^{(2)}(\xi, \bar\xi, \eta, \bar\eta; z, \bar z | N \cdot V) \sim \frac{N^2}{\pi^2} \cdot \left( \Delta V(|z|^2) \right)^2 \cdot e^{-\Delta V(|z|^2) \cdot |\xi - \eta|^2} \cdot \left( 1 + O(1/\sqrt{N}) \right). \tag{39}$$
Here and everywhere below, the symbol "$\sim$" denotes the asymptotic equivalence in the limit $N \to \infty$; $0$ in the right-hand side of (37) means that the l.h.s. decreases with $N$ faster than any finite power of $N$; in general, $F(N) \sim 0$ iff $\lim_{N \to \infty} (N^p \cdot F(N)) = 0$ for any $p \in \mathbb{R}$.
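To make the role of $x_0$ in Theorem 1 concrete, the following Python sketch solves $x_0 V'(x_0) = 1$ by bisection for a given coefficient list and checks that the limiting density $\frac{1}{\pi} \Delta V(|z|^2)$ integrates to one over the disk $|z|^2 < x_0$. It uses $\Delta V(|z|^2) = V'(x) + x V''(x)$ at $x = |z|^2$, which follows from $\Delta = \partial \bar\partial$. The function names, the coefficient convention `t[k-1]` $= t_k$ and the sample potentials are assumptions made for this illustration only.

```python
def v_prime(t, x):
    """V'(x) for V(x) = sum_{k>=1} t[k-1] * x**k."""
    return sum(k * tk * x ** (k - 1) for k, tk in enumerate(t, start=1))


def v_second(t, x):
    """V''(x); the k = 1 term is skipped because it vanishes identically."""
    return sum(
        k * (k - 1) * tk * x ** (k - 2) for k, tk in enumerate(t, start=1) if k >= 2
    )


def laplacian_v(t, x):
    """Delta V(|z|^2) evaluated at x = |z|^2, with Delta = d/dz d/dzbar."""
    return v_prime(t, x) + x * v_second(t, x)


def edge_x0(t, iterations=200):
    """Bisection for the positive root of x * V'(x) = 1.

    Assumes non-negative coefficients with at least one positive entry,
    so that x * V'(x) increases from 0 without bound.
    """
    def f(x):
        return x * v_prime(t, x) - 1.0

    lo, hi = 0.0, 1.0
    while f(hi) < 0.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def total_charge(t, steps=1000):
    """Integral of Delta V / pi over the disk |z|^2 < x0.

    For a radial function the area element is pi dx, so this equals the
    integral of Delta V(x) over 0 < x < x0 (composite Simpson rule).
    """
    x0 = edge_x0(t)
    h = x0 / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * laplacian_v(t, i * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(edge_x0([1.0]))            # Gaussian potential V(x) = x
    print(edge_x0([0.0, 1.0]))       # V(x) = x^2
    print(total_charge([0.5, 0.3, 0.2]))
```

Since $\Delta V = (x V')'$, the integral over the disk equals $x_0 V'(x_0)$, so the normalization to one and the edge equation (38) are the same statement.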
We conclude from (39) that the continuum limit of the connected part of the two-point correlation function of the NMM with fixed $z$: $0 < |z|^2 < x_0$, exhibits universal behavior, since the dependence on the potential (which enters the answer through $\Delta V(|z|^2)$) can be eliminated by changing the units in which we measure the distances between points and the units in which we measure the two-point correlation function itself. After all these redefinitions the answer becomes identical to that of the Gaussian NMM, see (23).
A different interpretation of (37) and (39) is also possible: let $\rho\left( |z|^2 \big| V \right) = \frac{1}{\pi} \cdot \Delta V(|z|^2)$ and $C\left( |z|^2, |\xi - \eta|^2 \big| V \right) = \frac{1}{\pi^2} \cdot \left( \Delta V(|z|^2) \right)^2 \cdot e^{-\Delta V(|z|^2) \cdot |\xi - \eta|^2}$ be the coefficients of the leading terms in the large-$N$ asymptotics of the one- and two-point correlation functions, respectively. It is natural to think of $\rho\left( |z|^2 \big| V \right)$ and $C\left( |z|^2, |\xi - \eta|^2 \big| V \right)$ as the continuum limits of the one- and two-point correlation functions of the NMM. Combining these expressions together we see that
$$C\left( |z|^2, |\xi - \eta|^2 \big| V \right) = \rho\left( |z|^2 \big| V \right)^2 e^{-\pi \cdot \rho(|z|^2 | V) \cdot |\xi - \eta|^2}. \tag{40}$$
Equation (40) can be interpreted as a completely universal relation between the continuum limits of the one- and two-point correlation functions of the NMM. A similar phenomenon was previously observed in [3] for the Hermitian matrix model and is referred to as "universality of the second type".
Note finally that the point $z = 0$ is excluded from the statement of the Theorem. In the vicinity of this point the two-point functions also exhibit universal behavior, but there are distinct universality classes classified by the order of the critical point of the potential $V(|z|^2)$ at the origin. If near the origin $V(x) \sim C \cdot x^k$, then the corresponding two-point correlation function is expressed in terms of the kernel (25).
The proof of Theorem 1 is sketched in the Appendix. This proof, however, is both very technical and complicated. Below we present a simple derivation of (37) and (39) which is based on the study of Eq. (33).
This derivation uses several plausible but unproven assumptions on the way. Nevertheless, it provides in our opinion a good feeling about the nature of the continuum limit of the NMM and about the answers one might expect to get in this limit. Therefore we include it in the text.
We start with the assumption that the large-$N$ asymptotics of $R_N^{(1)}(u | N \cdot V)$ is algebraic, that is $R_N^{(1)}(u | N \cdot V) \sim N^p \cdot \rho(u)$, where $p$ is a real number. Let us substitute this asymptotics in Eq. (33) with $V(|z|^2)$ replaced by $N \cdot V(|z|^2)$. Comparing the asymptotics of the r.h.s. and l.h.s. of the result we see that $p = 1$, i.e.
$$R_N^{(1)}(u | N \cdot V) \sim N \cdot \rho(u). \tag{41}$$
We also get the following equation for $\rho(|z|^2)$:
$$\rho(|z|^2) \left( \frac{\partial V(|z|^2)}{\partial z} - \int dw\, d\bar w\; \rho(|w|^2)\, \frac{\partial}{\partial z} \ln|z - w|^2 \right) = 0,$$
which implies that
$$\frac{\partial V(|z|^2)}{\partial z} = \int dw\, d\bar w\; \rho(|w|^2)\, \frac{\partial}{\partial z} \ln|z - w|^2, \tag{42}$$
given that $\rho(|z|^2) \ne 0$. To derive (42) from (33) we have used the condition (34) satisfied by the potential $V$ to show that in the continuum limit the second integral in the r.h.s. of (33) vanishes. Note that Eq. (42) can be interpreted as an extremum condition for the potential energy $U[\rho]$ of the two-dimensional charged continuous medium with charge density $\rho(|z|^2)$ placed in the external potential $V(|z|^2)$:
$$U[\rho] = \int dz\, d\bar z\; \rho(|z|^2)\, V(|z|^2) - \frac{1}{2} \int\!\!\int dz\, d\bar z\, dw\, d\bar w\; \rho(|z|^2)\, \rho(|w|^2)\, \ln|z - w|^2. \tag{43}$$
One can say therefore that in the continuum limit the distribution of eigenvalues of the NMM is determined by the electrostatics of the charged medium with the potential energy (43). Note that the extremum of the one-dimensional counterpart of the functional (43) gives a correct answer for the continuum limit of the level density of the Hermitian matrix model, see e.g. [31].
Equation (42) is easy to solve. Let us differentiate both sides of this equation with respect to $\bar z$. Noticing that $\ln|z - w|^2$ is the Green's function of the two-dimensional Laplacian, i.e. $\Delta \ln|z - w|^2 = \pi \delta^{(2)}(z - w)$, we get
$$\rho(|z|^2) = \frac{1}{\pi} \Delta V(|z|^2), \tag{44}$$
which should be the correct answer given that $\rho(|z|^2) \ne 0$. The physical meaning of (44) is the following: the charged medium which describes the continuum limit of the NMM tends to screen the external potential completely.
Having described the spectrum of eigenvalues locally, one can derive a complete picture assuming that the NMM level density corresponds to the global minimum of the potential energy (43). In our case $V$ is convex and the minimal energy assumption implies that $\rho(|z|^2) \ne 0$ if $|z|^2 < x_0$, where $x_0$ can be determined from the normalization condition
$$\int_{|z|^2 < x_0} dz\, d\bar z\; R_N^{(1)}(z, \bar z | N \cdot V) = N. \tag{45}$$
Substituting (41) with $\rho(|z|^2)$ given by (44) into (45) and computing the resulting integral we find that $x_0$ must satisfy Eq. (38). Therefore, we have recovered the first line of (37). We also get $\rho(|z|^2) = 0$ for $|z|^2 > x_0$, i.e. $\lim_{N \to \infty} R_N^{(1)}(z, \bar z | N \cdot V)/N = 0$, a weaker statement than the one given by the second line of (37), which is however strong enough to claim that the spectrum of eigenvalues of the NMM in the continuum limit is supported on the disk of radius $\sqrt{x_0}$ centered at the origin of the $z$-plane.
In order to derive (39) we assume that the continuum limit preserves the holomorphy property of the NMM proven in Sect. 3, at least locally. Namely, we assume that for any $x$: $0 < x < x_0$, there is an open neighborhood $U_x$ of $x$ in the complex $u$-plane such that the function $\rho$ from (41) is holomorphic in $U_x$. Now, $\rho(u)$ is equal, by the definition of asymptotic equivalence, to the limit of the sequence of holomorphic functions $R_N^{(1)}(u | N \cdot V)/N$, $N = 1, 2, \ldots$. The holomorphy of the limit implies that on each closed subdomain of $U_x$ the convergence is uniform. It is not difficult to verify the following abstract statement. Let $\{g_N(u)\}$ be a sequence of continuous functions on some closed domain $U$ such that $\lim_{N \to \infty} g_N(u) = g(u)$ uniformly in $U$. Then $\lim_{N \to \infty} g_N(u \cdot f_N) = g(u)$ for
any sequence $\{f_N\}$ of complex numbers converging to 1 and such that $g_N(u \cdot f_N)$ is defined for each $N$. Applying this statement to the sequence of continuous functions $\{R_N^{(1)}(u | N \cdot V)/N\}$ in $U \subset U_x$ and the sequence of complex numbers $\{e^{(\bar\xi + \eta)/\sqrt{N}}\}$ we see that
$$\lim_{N \to \infty} R_N^{(1)}\left( |z|^2 e^{\frac{\bar\xi + \eta}{\sqrt{N}}} \,\Big|\, N \cdot V \right) \Big/ N = \lim_{N \to \infty} R_N^{(1)}\left( |z|^2 \,\big|\, N \cdot V \right) \Big/ N \overset{(37)}{=} \frac{1}{\pi} \Delta V(|z|^2). \tag{46}$$
The following computation concludes our derivation of (39):
$$R_{c\,N}^{(2)}(\xi, \bar\xi, \eta, \bar\eta; z, \bar z | N \cdot V) \overset{(30)}{=} \left| R_N^{(1)}\left( |z|^2 e^{\frac{\bar\xi + \eta}{\sqrt{N}}} \,\Big|\, N \cdot V \right) \right|^2 \cdot e^{-N \cdot \left( V\left( |z|^2 e^{\frac{\xi + \bar\xi}{\sqrt{N}}} \right) + V\left( |z|^2 e^{\frac{\eta + \bar\eta}{\sqrt{N}}} \right) - V\left( |z|^2 e^{\frac{\bar\xi + \eta}{\sqrt{N}}} \right) - V\left( |z|^2 e^{\frac{\xi + \bar\eta}{\sqrt{N}}} \right) \right)}$$
$$\overset{(46)}{\sim} \frac{N^2}{\pi^2} \cdot \left( \Delta V(|z|^2) \right)^2 \cdot e^{-N \cdot \left( V\left( |z|^2 e^{\frac{\xi + \bar\xi}{\sqrt{N}}} \right) + V\left( |z|^2 e^{\frac{\eta + \bar\eta}{\sqrt{N}}} \right) - V\left( |z|^2 e^{\frac{\bar\xi + \eta}{\sqrt{N}}} \right) - V\left( |z|^2 e^{\frac{\xi + \bar\eta}{\sqrt{N}}} \right) \right)}$$
$$\sim \frac{N^2}{\pi^2} \cdot \left( \Delta V(|z|^2) \right)^2 \cdot e^{-\Delta V(|z|^2) \cdot |\xi - \eta|^2},$$
where the last equality can be verified by a direct computation.
In conclusion of the section, we would like to remind the reader that the presented derivation of (37) and (39) is based on several unproven, however natural, assumptions, and refer to the Appendix for a complete proof.

5. Free Fermion Representation of NMM

The aim of the present section is to rewrite the partition function (1) of the NMM in the form of a free fermion correlator. This will permit us to describe an arbitrary variation of the NMM potential by means of an integrable system of differential equations and therefore to generalize the results of Sect. 2. The technique we use is a straightforward generalization of the free fermion methods developed in [41] and reviewed in part in [10], and of their application to the theory of matrix models due to the ITEP group (see [34] for a review).
Consider an abelian group $\mathbb{Z} \times \mathbb{Z}$ consisting of pairs of integers. The corresponding group multiplication will be denoted by "+". To simplify notations we refer to this group as $G$. An abelian group $\mathbb{Z}$ acts on $G$ as follows:
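$$G \times \mathbb{Z} \to G, \qquad (g = (i, j),\ m) \mapsto g \cdot m = (i \cdot m,\ j \cdot m). \tag{47}$$
Alternatively, $G$ can be presented in the following form:
$$G = \oplus_{m \in \mathbb{Z}}\, G_m = G_- \oplus G_+, \tag{48}$$
where
$$G_- = \oplus_{m < 0}\, G_m; \qquad G_+ = \oplus_{m \ge 0}\, G_m, \tag{49}$$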
and
$$G_m = \left\{ g \in G \ \big|\ g = (l + m, -l);\ l \in \mathbb{Z} \right\}, \qquad m \in \mathbb{Z}. \tag{50}$$
Note that $G_0$ can be visualized as an antidiagonal of $G = \mathbb{Z} \times \mathbb{Z}$. It is also worth mentioning that if $g \in G_m$ and $h \in G_n$, then $g + h \in G_{n+m}$. Therefore, $G$ is $\mathbb{Z}$-graded and (48) is a decomposition of $G$ into a sum of components of a fixed degree.
Let us consider an infinite-dimensional Clifford algebra $A(G)$ over the complex numbers which corresponds to $G$. We define it by means of the following set of generators and relations:
$$A(G) = \left\langle 1, \psi_g, \bar\psi_h;\ g, h \in G \ \Big|\ \{\psi_g, \bar\psi_h\} = \delta_{g,h},\ \{\psi_g, \psi_h\} = 0 = \{\bar\psi_g, \bar\psi_h\} \right\rangle. \tag{51}$$
Let $F_R$ be the right Fock space, an irreducible left $A(G)$-module obtained by applying $A(G)$ to the right vacuum vector $|vac\rangle$, which is defined as follows:
$$\psi_g |vac\rangle = 0, \quad g \in G_-; \qquad \bar\psi_h |vac\rangle = 0, \quad h \in G_+. \tag{52}$$
In the same fashion we introduce the left Fock space $F_L$, which is generated by the right action of $A(G)$ on the left vacuum vector $\langle vac|$ defined below:
$$\langle vac| \bar\psi_g = 0, \quad g \in G_-; \qquad \langle vac| \psi_h = 0, \quad h \in G_+. \tag{53}$$
There is a natural non-degenerate pairing between the left and right Fock spaces, $\langle vac| a,\ b |vac\rangle \mapsto \langle vac| a \cdot b |vac\rangle$, where $a$ and $b$ are elements of $A(G)$. This pairing is normalized as follows:
$$\langle vac| 1 \cdot 1 |vac\rangle = 1. \tag{54}$$
From now on we will think of $A(G)$ as an algebra of linear transformations of $F_R$ or $F_L$ and refer to elements of $A(G)$ as operators. To each element $a \in A(G)$ one can assign its normal reordering with respect to the vacuum $|vac\rangle$, an element of $A(G)$ denoted as $:a:$. It is obtained from the element $a$ by permuting the generators in each term of the sum constituting $a$ in such a way that the fermion generators annihilating the right vacuum stand to the right of the rest of the generators.
A variety of vectors in the Fock space is provided by means of the following construction. To each finite set $S \subset G$ we assign the following vector in $F_R$:
$$|S\rangle = \prod_{g \in S} o_g |vac\rangle, \tag{55}$$
where $o_g = \psi_g$ if $g \in G_+ \cap S$ and $o_g = \bar\psi_g$ if $g \in G_- \cap S$. Note that without fixing a linear order in the set $\mathbb{Z} \times \mathbb{Z}$, the element (55) is defined up to a sign only. The same construction can be applied to obtain a vector in $F_L$ which we will denote as $\langle S|$,
$$\langle S| = \langle vac| \prod_{g \in S} \bar o_g, \tag{56}$$
where $\bar o_g = \bar\psi_g$ if $g \in G_+ \cap S$ and $\bar o_g = \psi_g$ if $g \in G_- \cap S$. It is easy to see that any element of $F_R$ or $F_L$ can be presented as a linear combination of vectors (55) or (56), respectively.
Consider a semigroup $Q \subset G$ consisting of all pairs $g = (m, n)$, where $m$ and $n$ are non-negative integers. We introduce the following element of the algebra $A(G)$:
$$H(t) = \sum_{g \in Q} t_g J_g \equiv \sum_{g \in Q} t_g \Big( \sum_{h \in G} \psi_h \bar\psi_{h+g} \Big), \tag{57}$$
where
$$t_g = 0 \quad \text{if} \quad \deg(g) \gg 0, \tag{58}$$
and we always suppose that $t_0 = 0$. We will call $H(t)$ a Hamiltonian operator or simply a Hamiltonian. It is easy to check that the operators $J_g$ introduced in (57) commute, i.e. $[J_g, J_h] \equiv J_g \cdot J_h - J_h \cdot J_g = 0$, for $g, h \in Q \setminus \{0\}$. It is also a matter of simple computation to verify that $[J_g, \psi_h] = \psi_{h-g}$ and $[J_g, \bar\psi_h] = -\bar\psi_{h+g}$.
Before we introduced the Hamiltonian (57), our considerations had not been different from the standard fermion construction, as $A(G)$ is isomorphic to a standard Clifford algebra whose generators are labeled by elements of $\mathbb{Z}$, the isomorphism being established by ordering the set $\mathbb{Z} \times \mathbb{Z}$. The new development starts with the introduction of the Hamiltonian (57), since the presented commutation relations between the $J_g$'s and the fermion generators depend on the group structure of $G = \mathbb{Z} \times \mathbb{Z}$, but $G$ is not isomorphic to $\mathbb{Z}$ as a group.
It follows from the definition (57) that the Hamiltonian operator annihilates the vacuum,
$$H(t) |vac\rangle = 0. \tag{59}$$
Define a $t$-evolution of an element $a \in A(G)$ as the following element of (the formal completion of) $A(G)$:
$$a(t) = e^{H(t)}\, a\, e^{-H(t)}. \tag{60}$$
Due to (59) the $t$-evolution preserves the normal ordering, $:a(t): \,=\, :a:(t)$. It is not difficult to derive the following expressions for the $t$-evolution of the fermionic generators of $A(G)$:
$$\psi_g(t) = \sum_{h \in Q} p_h(t)\, \psi_{g-h}, \tag{61}$$
$$\bar\psi_g(t) = \sum_{h \in Q} p_h(-t)\, \bar\psi_{g+h}, \tag{62}$$
where $\{p_g\}_{g \in Q}$ are generalized Schur polynomials,
$$p_g(t) = \delta_{g,0} + t_g + \frac{1}{2!} \sum_{\substack{h', h'' \in Q; \\ h' + h'' = g}} t_{h'} t_{h''} + \frac{1}{3!} \sum_{\substack{h', h'', h''' \in Q; \\ h' + h'' + h''' = g}} t_{h'} t_{h''} t_{h'''} + \cdots. \tag{63}$$
The set $\{p_g\}_{g \in Q}$ defined above indeed consists of polynomials: the sum in the r.h.s. of (63) is finite as there is a finite number of ways to present an element of $Q$ as a sum
of elements of $Q$ of positive degree. It is also worth mentioning that one can grade the polynomial ring $\mathbb{C}[t_h, h \in Q]$ by setting $\deg(t_h) = \deg(h)$. Under such an assignment $p_g(t)$ becomes a homogeneous polynomial of degree equal to $\deg(g)$.
We will also need a generating function for the Schur polynomials. Let $z, \bar z$ be complex coordinates in the plane. We set $\vec z^{\,g} \equiv z^a \cdot \bar z^b$, where $g = (a, b)$. Let also
$$V(t, \vec z) = \sum_{g \in Q} t_g\, \vec z^{\,g}. \tag{64}$$
Then
$$e^{V(t, \vec z)} = \sum_{g \in Q} p_g(t)\, \vec z^{\,g}. \tag{65}$$
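The equivalence between the explicit sum (63) and the generating function (65) can be checked on a small example. The Python sketch below computes the $p_g(t)$ in two ways, as coefficients of a truncated exponential series and directly from ordered decompositions of $g$, using exact rational arithmetic. The dictionary representation of $t$, the function names and the sample values of $t_g$ are assumptions made for this illustration; only finitely many $t_g$ are non-zero and $t_0 = 0$, as in the text.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def poly_mul(a, b, max_deg):
    """Multiply bivariate polynomials {(i, j): coeff}, dropping total degree > max_deg."""
    out = {}
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            g = (i1 + i2, j1 + j2)
            if g[0] + g[1] <= max_deg:
                out[g] = out.get(g, Fraction(0)) + c1 * c2
    return out


def schur_from_exponential(t, max_deg):
    """Coefficients p_g of exp(V) in (65), for all g of total degree <= max_deg.

    Every label in t has positive degree, so V**n only contributes
    to degrees >= n and the series can be cut at n = max_deg.
    """
    result = {(0, 0): Fraction(1)}
    power = {(0, 0): Fraction(1)}
    for n in range(1, max_deg + 1):
        power = poly_mul(power, t, max_deg)
        for g, c in power.items():
            result[g] = result.get(g, Fraction(0)) + c / factorial(n)
    return result


def schur_direct(t, g):
    """Formula (63): sum over ordered n-tuples of labels adding up to g, weighted by 1/n!."""
    total = Fraction(1) if g == (0, 0) else Fraction(0)
    labels = list(t)
    for n in range(1, g[0] + g[1] + 1):
        for combo in product(labels, repeat=n):
            if (sum(h[0] for h in combo), sum(h[1] for h in combo)) == g:
                term = Fraction(1, factorial(n))
                for h in combo:
                    term *= t[h]
                total += term
    return total


# Hypothetical sample couplings t_g, g = (a, b) standing for z^a * zbar^b.
SAMPLE_T = {
    (1, 0): Fraction(2),
    (0, 1): Fraction(3),
    (1, 1): Fraction(5),
    (2, 0): Fraction(7),
}

if __name__ == "__main__":
    p = schur_from_exponential(SAMPLE_T, 3)
    for g in sorted(p):
        print(g, p[g], schur_direct(SAMPLE_T, g))
```

For instance, $p_{(1,1)} = t_{(1,1)} + t_{(1,0)} t_{(0,1)}$: the factor $\frac{1}{2!}$ in (63) is compensated by the two orderings of the pair $(1,0), (0,1)$.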
The generating function (64) can be used to rewrite relations (61) and (62) in a compact form. Let
$$\psi(\vec z) = \sum_{g \in G} \psi_g\, \vec z^{\,g} \tag{66}$$
be a free fermion field operator. Then a direct computation shows that
$$e^{H(t)}\, \psi(\vec z)\, e^{-H(t)} = e^{V(t, \vec z)}\, \psi(\vec z). \tag{67}$$
To derive a free fermion representation of the partition function of the NMM we will need a pair of operators called fermionic projectors:
$$P_+ = \ :e^{\sum_{g \in G_-} \psi_g \bar\psi_g}:, \tag{68}$$
$$P_- = \ :e^{-\sum_{g \in G_+} \psi_g \bar\psi_g}:. \tag{69}$$
One can verify the following properties of the fermionic projectors:
$$P_+ \bar\psi_g = \psi_g P_+ = 0, \quad \text{if } g \in G_-; \tag{70}$$
$$P_- \psi_g = \bar\psi_g P_- = 0, \quad \text{if } g \in G_+; \tag{71}$$
$$P_+^2 = P_+; \qquad P_-^2 = P_-. \tag{72}$$
Equation (72) is a direct consequence of (70) and (71), which in turn result from the following equivalent representation of the fermionic projectors:
$$P_+ = \prod_{g \in G_-} (1 - \bar\psi_g \psi_g), \qquad P_- = \prod_{g \in G_+} (1 - \psi_g \bar\psi_g).$$
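The product representation reduces properties (70) to (72) to a statement about a single fermionic mode, since factors with different labels commute. The Python sketch below checks that single-mode statement with an explicit $2 \times 2$ matrix realization of one pair $\psi, \bar\psi$. The particular matrices and helper names are assumptions chosen for this illustration; the infinite product over $G_\pm$ itself is not represented.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

# One fermionic mode: psi * psi_bar + psi_bar * psi = 1, psi^2 = psi_bar^2 = 0.
PSI = [[0, 1], [0, 0]]
PSI_BAR = [[0, 0], [1, 0]]

# Single factors of the two product formulas.
P_PLUS_FACTOR = matsub(IDENTITY, matmul(PSI_BAR, PSI))    # 1 - psi_bar psi
P_MINUS_FACTOR = matsub(IDENTITY, matmul(PSI, PSI_BAR))   # 1 - psi psi_bar

if __name__ == "__main__":
    print(matadd(matmul(PSI, PSI_BAR), matmul(PSI_BAR, PSI)))  # anticommutator
    print(matmul(P_PLUS_FACTOR, PSI_BAR), matmul(PSI, P_PLUS_FACTOR))
    print(matmul(P_MINUS_FACTOR, PSI), matmul(PSI_BAR, P_MINUS_FACTOR))
```

In this realization each factor is a diagonal matrix with entries 0 and 1, which makes the projector property evident.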
Finally we are ready to introduce the main object of our interest. Consider the following fermionic correlator:
$$\tau\left( U, A(\Phi) \,\big|\, t \right) = \langle U | e^{H(t)} A(\Phi) | U \rangle, \tag{73}$$
where $U \subset Q$ is a finite subset of $Q \subset G$,
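$$A(\Phi) = \ :\exp\left\{ \int_{\mathbb{C}^2} dz\, d\bar z\, dw\, d\bar w\; \Phi(\vec z, \vec w)\, \psi_+(\vec z)\, \bar\psi_+\!\left( \vec w^{\,-(1,1)} \right) - \sum_{g \in G_+} \psi_g \bar\psi_g \right\}:, \tag{74}$$
$$\psi_+(\vec z) = \sum_{g \in G_+} \psi_g\, \vec z^{\,g}, \qquad \bar\psi_+(\vec z) = \sum_{g \in G_+} \bar\psi_g\, \vec z^{\,-g}, \tag{75}$$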
and $\Phi$ is a real function (or a distribution) on $\mathbb{C}^2$. Our aim is to compute the correlation function (73). Consider the following matrix of infinite size:
$$A(\Phi)_{g,h} = \int_{\mathbb{C}^2} dz\, d\bar z\, dw\, d\bar w\; \Phi(\vec z, \vec w)\, \vec w^{\,g}\, \vec z^{\,h}, \quad h, g \in G_+; \qquad A(\Phi)_{g,h} = \delta_{g,h}, \quad h, g \in G_-, \tag{76}$$
and $A(\Phi)_{g,h} = 0$ in all other cases. One can verify the following set of identities involving the matrix $A(\Phi)_{g,h}$:
$$\sum_{h \in G} A(\Phi)_{g,h}\, \psi_h\, A(\Phi) = A(\Phi)\, \psi_g, \qquad g \in G. \tag{77}$$
To prove (77) we consider separately the cases $g \in G_-$ and $g \in G_+$. In the former case relation (77) reduces to $\psi_g A(\Phi) = A(\Phi) \psi_g$, $g \in G_-$, which is true, as the operator $A(\Phi)$ depends on fermionic generators labeled by elements of $G_+$ only. The latter case follows in a straightforward way from the representation of the operator $A(\Phi)$ given below:
$$A(\Phi) = \sum_{m=0}^{\infty} \frac{1}{m!} \int dz_1\, d\bar z_1\, dw_1\, d\bar w_1 \cdots dz_m\, d\bar z_m\, dw_m\, d\bar w_m\; \Phi(\vec z_1, \vec w_1) \cdots \Phi(\vec z_m, \vec w_m)$$
$$\times\ \psi_+(\vec z_1) \cdots \psi_+(\vec z_m)\, P_-\, \bar\psi_+\!\left( \vec w_m^{\,(-1,-1)} \right) \cdots \bar\psi_+\!\left( \vec w_1^{\,(-1,-1)} \right). \tag{78}$$
In what follows we will always require the non-degeneracy of the matrix $A(\Phi)_{g,h}$. We also wish to remark that the operator $A(\Phi)$ from (74) is a counterpart of the operator used in [23] to obtain a free fermion representation of conventional one-matrix models.
Using relations (77) and the fact that $A(\Phi) |vac\rangle = |vac\rangle$, and applying Wick's theorem, one can express $\tau(U, A(\Phi) | t)$ in terms of two-point correlators:
$$\tau(U, A(\Phi) | t) = \det\Big[ \langle vac | \bar\psi_g(-t) \sum_{k \in G_+} A(\Phi)_{h,k}\, \psi_k | vac \rangle \Big]_{g,h \in U}, \tag{79}$$
where the determinant is taken with respect to some linear ordering of $U$. Note however that the change of order is a unitary transformation which does not change the determinant. It is not difficult to compute the two-point correlation functions entering (79). Substituting (76) into (79) and using relation (67) we arrive at the following expression for $\tau(U, A(\Phi) | t)$:
$$\tau(U, A(\Phi) | t) = \det[Z_{g,h}(t)]_{g,h \in U}, \tag{80}$$
where
$$Z_{g,h}(t) = \int_{\mathbb{C}^2} dz\, d\bar z\, dw\, d\bar w\; \vec z^{\,g}\, \vec w^{\,h}\, \Phi(\vec z, \vec w)\, e^{V(t, \vec z)}. \tag{81}$$
Suppose now that the set $U \subset G$ and the function $\Phi$ which parameterize the $\tau$-function are chosen to be the following:
$$U_N = \left\{ g \in G \ \big|\ g = (n, 0),\ n = 0, 1, \cdots, N - 1 \right\}, \tag{82}$$
$$\Phi(\vec z, \vec w) = \Phi_{NMM}(\vec z, \vec w) = \delta(z - \bar w) \cdot \delta(\bar z - w) \cdot e^{-|z|^2}. \tag{83}$$
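Substituting (82) and (83) into (80) and (81), we see that
$$\tau(U_N, A(\Phi_{NMM}) | t) = \det[Z_{i,j}(t)]_{0 \le i,j \le N-1}, \qquad Z_{i,j}(t) = \int_{\mathbb{C}} dz\, d\bar z\; z^i \bar z^j\, e^{\sum_{m,n \ge 0} \left( t_{(m,n)} - \delta_{m,1} \delta_{n,1} \right) z^m \bar z^n}, \tag{84}$$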
which coincides (up to a nonessential factor) with the determinant formula (7) and (8) for the partition function of the NMM with a polynomial potential equal to $-V(t, \vec z) + |z|^2$. Thus we proved that
$$Z_N = \frac{1}{N!}\, \tau\left( U_N, A(\Phi_{NMM}) \,\big|\, t \right). \tag{85}$$
To conclude the discussion of the present section we would like to mention that the partition functions of conventional matrix models (such as Hermitian, Unitary, Complex one-matrix models and Hermitian two-matrix models) can be presented in the form (73) under appropriate choices of $U$ and $\Phi$. To obtain the partition function of the Hermitian matrix model, for example, set
$$\Phi(\vec z, \vec w) = \Phi_{HMM}(\vec z, \vec w) = \delta(z - \bar w) \cdot \delta(\bar z - w) \cdot \delta(z - \bar z) \cdot e^{-|z|^2}. \tag{86}$$
Then
$$\tau\left( U_N, A(\Phi_{HMM}) \,\big|\, t \right) = \det[Z_{i,j}]_{0 \le i,j \le N-1} \tag{87}$$
with
$$Z_{i,j} = \int_{-\infty}^{\infty} dx\; x^{i+j}\, e^{\sum_{m>0} T_m x^m} \qquad \text{and} \qquad T_m = \sum_{\substack{(k,l) \in Q \\ k + l = m}} t_{(k,l)} - \delta_{m,1}.$$
Equation (87) is the determinant form of the partition function of the Hermitian one-matrix model with the potential $-\sum_{p>0} T_p x^p$.

6. The Partition Function of NMM as a $\tau$-Function of the Extended-KP($N$) Hierarchy

In this section we are going to derive a system of differential equations associated with the correlation function (73). By virtue of (85) all results of the present section apply as well to the partition function of the NMM with an arbitrary polynomial potential.
The part of our construction dealing with free fermions relies heavily on methods developed in [21], [8] and [9], see [10] for a review. Our subsequent analysis of the emerging hierarchy of differential equations shows that the original approach of [43]
and [44] to the theory of KP equations can be extended to the multidimensional case as well.
Let $U_N$ and $A$ be a subset of $Q$ and an element of the Clifford algebra given by (82) and (74), respectively (to simplify notations, from now on we denote $A(\Phi)$ and $A(\Phi)_{g,h}$ as $A$ and $A_{g,h}$ respectively). The following functions depending on $\vec z$ and the $t$'s are called wave functions:
$$w_p(\vec z, t) = \frac{\langle U_N | \bar\psi_{N \cdot p}\, e^{H(t)}\, \psi(\vec z)\, A | U_N \rangle}{\langle U_N | e^{H(t)} A | U_N \rangle}, \tag{88}$$
$$w_q(\vec z, t) = \frac{\langle U_N | \bar\psi_q\, e^{H(t)}\, \psi(\vec z)\, A | U_N \rangle}{\langle U_N | e^{H(t)} A | U_N \rangle}, \tag{89}$$
where $\psi(\vec z) = \sum_{g \in G} \psi_g\, \vec z^{\,g}$ is a free field operator; $p = (1, 0) \in G$, $q = (0, 1) \in G$. In our notations for the wave functions we suppress the dependence on $N$ and $A$, which are supposed to be fixed. We assume that the common denominator in (88) and (89) is not equal to zero when all $t$'s are equal to 0. Then the wave functions make sense as formal power series in the $t$'s.
There is a linear relation imposed on the wave functions which follows from the identity (77). Let us explore it. Consider the Fourier decomposition of the wave functions with respect to $\vec z$:
$$w_p(\vec z, t) = \sum_{g \in G} w_p^g(t)\, \vec z^{\,g}, \qquad w_q(\vec z, t) = \sum_{g \in G} w_q^g(t)\, \vec z^{\,g}. \tag{90}$$
g∈G
It is easy to prove the following set of relations between coefficients of such decomposition: [ X X Ag,h wph (t) = 0 , Ag,h wqh (t) = 0 if g ∈ G− UN . (91) h∈G
h∈G
Here $A_{g,h}$ is the matrix defined by (76). To prove (91) we observe that $w_p^h(t)$ is proportional to $\langle U_N | \bar\psi_{N \cdot p}\, e^{H(t)}\, \psi_h A | U_N \rangle$. Multiplying it by $A_{g,h}$, summing over $h$, and using (77) we get $\langle U_N | \bar\psi_{N \cdot p}\, e^{H(t)} A \psi_g | U_N \rangle$, which is zero if $g \in G_- \cup U_N$. Similar arguments apply if we replace $w_p^h(t)$ with $w_q^h(t)$. Therefore, (91) is proven.
Relations (91) have their counterpart in the theory of the KP hierarchy (see e.g. [10]) and are of prime importance for our further considerations. Before one can make use of them, however, it is desirable to rewrite Eqs. (91) in the form of linear relations between a finite number of unknowns. This is possible because of the following representation of the wave functions (88) and (89):

$$w_p(\vec z,t) = \Big(\vec z^{\,N\cdot p} + \sum_{g\in U_N} a_g(t)\,\vec z^{\,g}\Big)\,e^{V(t,\vec z)}, \tag{92}$$

$$w_q(\vec z,t) = \Big(\vec z^{\,q} + \sum_{g\in U_N} b_g(t)\,\vec z^{\,g}\Big)\,e^{V(t,\vec z)}, \tag{93}$$
where the coefficients $\{a_g(t), b_g(t)\}_{g\in U_N}$ depend on the $t$'s and on the choice of $A$. It is easy to express them in terms of free fermion correlators, but we will not need the explicit expressions. To verify (92) one can perform the following computation: commute the field operator $\psi(\vec z)$ in the numerator of (88) with $e^{H(t)}$ using (67). Then notice that

$$\langle U_N|\bar\psi_{N\cdot p}\,\psi_g = 0,\quad g\in G_+\setminus\big(U_N\cup\{N\cdot p\}\big), \qquad\text{and}\qquad \psi_h\,e^{H(t)}A|U_N\rangle = 0,\quad h\in G_-,$$

in which the first equality follows from the definition (56) of the generating vectors of FL, while the other one is a consequence of our choice of the operator $A$ defined in (74) and the law (61) of evolution of fermionic generators:

$$\psi_h\,e^{H(t)}A|U_N\rangle = e^{H(t)}\sum_{g\in Q} A\,\psi_{h-g}\,p_g(-t)|U_N\rangle = 0 \quad\text{if } h\in G_-.$$

It only remains to show that the coefficient in front of $\vec z^{\,N\cdot p}$ in (92) is indeed 1. But according to (88) this coefficient is equal to

$$\frac{\langle U_N|\bar\psi_{N\cdot p}\,\psi_{N\cdot p}\,e^{H(t)}A|U_N\rangle}{\langle U_N|e^{H(t)}A|U_N\rangle},$$

which is 1. Representation (92) has been derived. The derivation of (93) can be performed along the same lines.

Using the fact that $e^{V(t,\vec z)} = \sum_{g\in G} p_g(t)\,\vec z^{\,g}$, where we assume that $p_g = 0$ if $g\in G\setminus Q$, and comparing the representations (90) and (92), (93) of the wave functions, one can rewrite the relations (91) in terms of the coefficients $\{a_g, b_g\}_{g\in U_N}$:

$$\sum_{h\in G} A_{g,h}\Big(p_{h-N\cdot p}(t) + \sum_{k\in U_N} a_k(t)\,p_{h-k}(t)\Big) = 0 \quad\text{if } g\in G_-\cup U_N, \tag{94}$$

$$\sum_{h\in G} A_{g,h}\Big(p_{h-q}(t) + \sum_{k\in U_N} b_k(t)\,p_{h-k}(t)\Big) = 0 \quad\text{if } g\in G_-\cup U_N. \tag{95}$$
Looking closer at relations (94) and (95) we see that the only non-trivial ones are those corresponding to $g\in U_N$. Indeed, if $g\in G_-$ then $A_{g,h} = 0$ for $h\in G_+$ in accordance with definition (76). Thus the left-hand sides of relations (94) and (95) vanish identically if $g\in G_-$. Let us now consider the following pair of differential operators:

$$W_p(t,\vec\partial) = \vec\partial^{\,N\cdot p} + \sum_{g\in U_N} a_g(t)\,\vec\partial^{\,g}, \tag{96}$$

$$W_q(t,\vec\partial) = \vec\partial^{\,q} + \sum_{g\in U_N} b_g(t)\,\vec\partial^{\,g}, \tag{97}$$

which we call wave operators. Here $\vec\partial^{\,g} \equiv \dfrac{\partial^{g_p}}{\partial t_p^{\,g_p}}\dfrac{\partial^{g_q}}{\partial t_q^{\,g_q}}$ and $g = g_p\cdot p + g_q\cdot q$ is an arbitrary element of $G$ decomposed in terms of $p$ and $q$. In what follows we will denote the complex variables $t_p$ and $t_q$ as $x$ and $y$, respectively. Thus $W_p$ and $W_q$ are polynomials in $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ with coefficients depending on the "times" $t$. Evidently,

$$w_p(t,\vec z) = W_p(t,\vec\partial)\,e^{V(t,\vec z)}, \qquad w_q(t,\vec z) = W_q(t,\vec\partial)\,e^{V(t,\vec z)}. \tag{98}$$
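Noting also that $\vec\partial^{\,g} p_h(t) = p_{h-g}(t)$ for $g\in Q$, we can rewrite relations (94) and (95) in terms of wave operators, which provides us with an alternative form of Eqs. (91) suitable for our needs:

$$W_p h_g(t) = 0 = W_q h_g(t), \qquad g\in U_N, \tag{99}$$

where $h_g(t) = \sum_{h\in G} A_{g,h}\,p_h(t)$. Relations (99) can be viewed as two $N\times N$ systems of linear equations with respect to the coefficients of the wave operators. Here is an explicit solution:

$$a_{n\cdot p}(t) = -\frac{\det\big(\vec\partial^{\,0\cdot p}\vec h,\ \vec\partial^{\,1\cdot p}\vec h,\ \dots,\ \vec\partial^{\,(n-1)\cdot p}\vec h,\ \vec\partial^{\,N\cdot p}\vec h,\ \vec\partial^{\,(n+1)\cdot p}\vec h,\ \dots,\ \vec\partial^{\,(N-1)\cdot p}\vec h\big)}{\det\big(\vec\partial^{\,0\cdot p}\vec h,\ \dots,\ \vec\partial^{\,(N-1)\cdot p}\vec h\big)}, \tag{100}$$

$$b_{n\cdot p}(t) = -\frac{\det\big(\vec\partial^{\,0\cdot p}\vec h,\ \vec\partial^{\,1\cdot p}\vec h,\ \dots,\ \vec\partial^{\,(n-1)\cdot p}\vec h,\ \vec\partial^{\,q}\vec h,\ \vec\partial^{\,(n+1)\cdot p}\vec h,\ \dots,\ \vec\partial^{\,(N-1)\cdot p}\vec h\big)}{\det\big(\vec\partial^{\,0\cdot p}\vec h,\ \dots,\ \vec\partial^{\,(N-1)\cdot p}\vec h\big)}, \tag{101}$$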
where $n = 0, 1, \dots, N-1$ and $\vec h$ is a column vector with elements $h_g$, $g\in U_N$. It follows from the results of the previous section that the common denominator of (100) and (101) is equal exactly to $\langle U_N|e^{H(t)}A|U_N\rangle$, and is therefore an invertible power series in the $t$'s. Thus the wave operators are uniquely determined by conditions (99) and their coefficients are formal power series in the "times".

Let $B$ be a ring of differential operators in $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ with coefficients being formal series in the variables $t$. The statement below is a natural consequence of the formalism developed:

Lemma 1. Let $O\in B$ be a differential operator such that

$$O h_g = 0, \qquad g\in U_N. \tag{102}$$

Then there are differential operators $b_p\in B$ and $b_q\in B$ such that

$$O = b_p W_p + b_q W_q, \tag{103}$$

in other words $O = 0|_{\mathrm{mod}\ W_p, W_q}$. Moreover one can choose $b_p$ to be of zeroth order in $\frac{\partial}{\partial y}$.

The proof of Lemma 1 is given in the Appendix. The system of non-linear equations satisfied by the coefficients of the wave operators is a direct consequence of the formulated statement. To see this we differentiate each of the relations (99) with respect to $t_h$ and use the fact that $\frac{\partial}{\partial t_h}h_g(t) = \vec\partial^{\,h}h_g(t)$, a property of Schur polynomials. As a result we obtain the following set of identities:

$$\Big(\frac{\partial W_{p(q)}(t,\vec\partial)}{\partial t_h} + W_{p(q)}(t,\vec\partial)\,\vec\partial^{\,h}\Big)h_g = 0 \quad\text{if } g\in U_N. \tag{104}$$
It also follows from (99) that

$$[W_p, W_q]\,h_g = 0, \qquad g\in U_N. \tag{105}$$

Relations (104) and (105) state that the operators $\frac{\partial W_{p(q)}(t,\vec\partial)}{\partial t_h} + W_{p(q)}(t,\vec\partial)\,\vec\partial^{\,h}\in B$ and $[W_p, W_q]\in B$ annihilate the set of functions $h_g$, $g\in U_N$. Thus it follows from Lemma 1 that

$$\Big(\frac{\partial W_p(t,\vec\partial)}{\partial t_h} + W_p(t,\vec\partial)\,\vec\partial^{\,h}\Big)\Big|_{\mathrm{mod}\ W_p, W_q} = 0, \qquad h\in Q, \tag{106}$$

$$\Big(\frac{\partial W_q(t,\vec\partial)}{\partial t_h} + W_q(t,\vec\partial)\,\vec\partial^{\,h}\Big)\Big|_{\mathrm{mod}\ W_p, W_q} = 0, \qquad h\in Q, \tag{107}$$

$$[W_p, W_q] = Y\cdot W_p = 0\big|_{\mathrm{mod}\ W_p}, \tag{108}$$

where $Y\in B$ is completely determined by $W_p$ and $W_q$. An explicit expression for the operator $Y$ in terms of wave operators will be derived later and the answer is given in (130). Note that the r.h.s. of (108) is proportional to $W_p$ only, which follows from the fact that $[W_p, W_q]$ is an operator in $\frac{\partial}{\partial x}$ only.

For a fixed $N$, relations (106) and (107) constitute the system of non-linear differential equations for the unknown functions $\{a_n(t), b_n(t)\}_{n=0}^{N-1}$ subject to the constraint (108). A more explicit form of this system will be presented later. At this point we must comment on the correctness of the definition (106) and (107) of the equations of our hierarchy. Depending on how we mod out the parts proportional to $W_p$ and $W_q$ we can obtain seemingly different answers for the remainder. The reason for such ambiguity is purely algebraic: the left $B$-ideal $I = \{O\in B \mid O h_g = 0,\ g\in U_N\}$ consisting of all differential operators annihilating $h_g$ with $g\in U_N$ is not freely generated by $W_p$, $W_q$; there is the relation (108) between the generators. Moreover, this relation is the only one. In other words, we have

Lemma 2. The left ideal $I\subset B$ can be described as follows in terms of generators and relations:

$$I = \big\langle W_p, W_q \ \big|\ [W_p, W_q] = 0|_{\mathrm{mod}\ W_p}\big\rangle, \tag{109}$$

which means that for any $O\in I$ there are $b_p\in B$ and $b_q\in B$ such that $O = b_p W_p + b_q W_q$. Moreover, the expressions $b_p W_p + b_q W_q$ and $b'_p W_p + b'_q W_q$ are presentations of the same element of $I$ if and only if there is an element $c\in B$ such that $b_p = b'_p - c\cdot(W_q + Y)$ and $b_q = b'_q + c\cdot W_p$. Here $Y\in B$ is the operator defined in (108).

A proof of Lemma 2 is given in the Appendix.
Therefore we conclude that all possible ways to write the remainders in (106) and (107) lead to the same answer if the relation (108) is taken into account; thus the hierarchy of equations we care about is completely determined by (106), (107) and (108). The practical way of deducing differential equations from the identities (106), (107) and (108) can be extracted from the proof of Lemma 1 in the Appendix. To illustrate the result we present explicitly the simplest equations among (106), (107) and (108) in the case when the set $U_N$ consists of one point, i.e. $N = 1$. The expressions for the wave operators in this case are

$$W_p = \frac{\partial}{\partial x} + a(t), \tag{110}$$

$$W_q = \frac{\partial}{\partial y} + b(t). \tag{111}$$

The corresponding equations for $h = (2,0)$, $(1,1)$ and $(0,2)$ can be written as follows:

$$\frac{\partial a}{\partial t_{(2,0)}} + 2(\partial_x a)\,a = \partial_x^2 a, \tag{112}$$

$$\frac{\partial b}{\partial t_{(2,0)}} + 2(\partial_x b)\,a = \partial_x^2 b; \tag{113}$$

$$\frac{\partial a}{\partial t_{(1,1)}} + (\partial_x a)\,b + (\partial_y a)\,a = \partial_x\partial_y a, \tag{114}$$

$$\frac{\partial b}{\partial t_{(1,1)}} + (\partial_x b)\,b + (\partial_y b)\,a = \partial_x\partial_y b; \tag{115}$$

$$\frac{\partial a}{\partial t_{(0,2)}} + 2(\partial_y a)\,b = \partial_y^2 a, \tag{116}$$

$$\frac{\partial b}{\partial t_{(0,2)}} + 2(\partial_y b)\,b = \partial_y^2 b. \tag{117}$$

The additional condition (108) takes the form

$$\partial_y a - \partial_x b = 0. \tag{118}$$
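The N = 1 equations can be spot-checked numerically. The sketch below is an illustration, not part of the original derivation. It assumes that (99) with the wave operators (110), (111) gives a = -∂_x h / h and b = -∂_y h / h for a single function h obeying ∂h/∂t_g = ∂^g h. It also assumes that (115) is read in the form symmetric to (114), with the term (∂_y b) a. The particular h (a sum of three exponentials), the sample point, and all function names are hypothetical choices made for the example.

```python
import math

# Illustrative data (an assumption, not taken from the paper): h is a positive
# sum of exponentials exp(k*x + l*y + k^2*t20 + k*l*t11 + l^2*t02).  Each term
# satisfies dh/dt_g = d^g h for g = (2,0), (1,1), (0,2).
MODES = [(1.0, 0.3, 0.5), (0.7, -0.4, 1.2), (1.5, 0.9, -0.6)]  # (weight, k, l)
X, Y, T20, T11, T02 = range(5)  # positions of the variables inside a point


def h_xy(pt, nx=0, ny=0):
    """d_x^nx d_y^ny h at pt = (x, y, t20, t11, t02), computed analytically."""
    x, y, t20, t11, t02 = pt
    return sum(
        c * k**nx * l**ny
        * math.exp(k * x + l * y + k * k * t20 + k * l * t11 + l * l * t02)
        for c, k, l in MODES
    )


def a(pt):
    """Coefficient of W_p = d_x + a, fixed by W_p h = 0."""
    return -h_xy(pt, nx=1) / h_xy(pt)


def b(pt):
    """Coefficient of W_q = d_y + b, fixed by W_q h = 0."""
    return -h_xy(pt, ny=1) / h_xy(pt)


def d1(f, pt, axis, eps=1e-3):
    """Central finite difference of f along one variable."""
    up, down = list(pt), list(pt)
    up[axis] += eps
    down[axis] -= eps
    return (f(tuple(up)) - f(tuple(down))) / (2 * eps)


def d2(f, pt, ax1, ax2, eps=1e-3):
    """Second (possibly mixed) derivative as a nested central difference."""
    return d1(lambda q: d1(f, q, ax1, eps), pt, ax2, eps)


def residuals(pt):
    """Left-hand side minus right-hand side of (112)-(118) at the point pt."""
    av, bv = a(pt), b(pt)
    ax, ay = d1(a, pt, X), d1(a, pt, Y)
    bx, by = d1(b, pt, X), d1(b, pt, Y)
    return {
        "112": d1(a, pt, T20) + 2 * ax * av - d2(a, pt, X, X),
        "113": d1(b, pt, T20) + 2 * bx * av - d2(b, pt, X, X),
        "114": d1(a, pt, T11) + ax * bv + ay * av - d2(a, pt, X, Y),
        "115": d1(b, pt, T11) + bx * bv + by * av - d2(b, pt, X, Y),
        "116": d1(a, pt, T02) + 2 * ay * bv - d2(a, pt, Y, Y),
        "117": d1(b, pt, T02) + 2 * by * bv - d2(b, pt, Y, Y),
        "118": ay - bx,
    }


if __name__ == "__main__":
    for label, value in residuals((0.2, -0.1, 0.05, 0.0, 0.1)).items():
        print(label, value)
```

All seven residuals should vanish up to finite-difference error. This is a consistency check at one sample point, not a proof.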
Let us rewrite Eqs. (112)–(117) and (118) in terms of real variables and real-valued unknown functions. It follows from the reality condition imposed on the potential $V(t,\vec z)$ that $t_{(2,0)} = \bar t_{(0,2)}$, $t_{(1,1)} = \bar t_{(1,1)}$ and $x = t_{(1,0)} = \bar t_{(0,1)} = \bar y$. Thus $a(t) = \bar b(t)$ and we introduce the following new real variables: $v_1(t) \equiv a(t) + b(t)$, $i v_2(t) \equiv b(t) - a(t)$, $r_1 \equiv x + y$, $i r_2 \equiv x - y$, $\tau_1 \equiv t_{(1,1)}$, $t_{(2,0)} \equiv 2(\tau_2 + i\tau_3)$, $\bar t_{(2,0)} \equiv 2(\tau_2 - i\tau_3)$. Note that as far as transformation properties are concerned, $v_i(t)$ with $i = 1, 2$ is a covector field (one-form). Relation (118) states that this one-form is closed:

$$d\big(v_i(t)\,dr^i\big) = 0. \tag{119}$$

Equations (112)–(117) written in terms of $v_i(r,\tau)$ acquire the form

$$\frac{\partial\vec v_\alpha}{\partial t_\alpha} + (\vec v_\alpha\cdot\nabla)(\vec v_\alpha) = \Delta_{g_\alpha}\vec v_\alpha, \qquad \alpha = 1, 2, 3, \tag{120}$$
where $\nabla = \big(\frac{\partial}{\partial r_1}, \frac{\partial}{\partial r_2}\big)$ is the gradient operator and the $g_\alpha$ are metric tensors, $(g_1)_{ij} = \delta_{ij}$, $(g_2)_{ij} = \frac12(\sigma_3)_{ij}$, $(g_3)_{ij} = (\sigma_1)_{ij}$, with $\sigma_3 = \mathrm{diag}(1,-1)$, $\sigma_1 = \mathrm{antidiag}(1,1)$. The matrices $\sigma$ are the Pauli matrices. The operator $\Delta_{g_\alpha} = g_\alpha^{ij}\partial_i\partial_j$ is the Laplace operator of the two-dimensional space equipped with the metric $g_\alpha$. Finally, $\vec v_\alpha(t) = \big(v_\alpha^1(t), v_\alpha^2(t)\big)$ is the vector field corresponding to the covector field $\big(v_1(t), v_2(t)\big)$ in the presence of the metric $g_\alpha$, $v_\alpha^i = g_\alpha^{ij}v_j$. Note that $g_1$ is a Euclidean metric, while $g_2$ and $g_3$ are equivalent Minkowski metrics. Relations (120) for $\alpha = 1$ (the Euclidean case) are called two-dimensional Burgers equations [28]. As a result we see that two-dimensional Burgers equations are included in an infinite hierarchy of non-linear differential Eqs. (106)–(107) and (108) for $N = 1$. This hierarchy is completely integrable in the sense that we can find all solutions in the class of formal power series, see Theorem 2 below (footnote 2). Burgers equations in one, two and three spatial dimensions together with the continuity equation and the potentiality condition (119) can be used to model potential turbulence. We refer the reader to [18] for a review and original references. This book also describes an application of the three-dimensional analogue of Eqs. (120) to the study of the large scale structure of the Universe. The integrable structure we have discussed can prove useful in the systematic analysis of the development of turbulence in models based on the Burgers equation: the integrable structure of the Burgers hierarchy implies that the dynamics of the system is constrained to the invariant subspaces of the phase space (or "state space" in the terminology of [29]). The addition of a small perturbation in the form of a random force (see e.g. [48]) destroys integrability and the system moves towards chaos through the deterioration of invariant subspaces in accordance with Kolmogorov–Arnold–Moser theory.
The presented picture is very close to the existing scenarios of the development of turbulence (the Hopf–Landau scenario for example, see [29] for details) but has a chance to admit a complete qualitative treatment. Moreover, integrability (in the sense of the presence of integrals of motion) plays an important role in the description of fully developed turbulence of [39] and [40]. For example, Polyakov's anomaly introduced in [40] is an anomaly of the conservation law. We hope to investigate the consequences of complete integrability of the Burgers hierarchy for the theory of Burgers turbulence in the near future. In the meantime let us discuss the solutions to (120) subject to the condition (119). Solutions to (119) which are defined at every point of the $(r_1, r_2)$-plane are of the following form:

$$v_i(t) = \partial_i\Psi(t), \tag{121}$$

where $\Psi(t)$ is an arbitrary function of the $t$'s. It is called a potential of the covector field $v_i(t)$. Comparing (121) with (100) and (101) we see that the whole class of solutions to equations (120) is given by

$$\Psi(t) = \ln\langle U_1|e^{H(t)}A|U_1\rangle = \ln\int dz\,d\bar z\,\Big[\int dw\,d\bar w\,\Phi(\vec z,\vec w)\Big]\,e^{V(t,\vec z)}. \tag{122}$$

We see that $\Psi(t)$ is a generalization of the Hopf–Cole ([19, 7]) solution to the Burgers equations (120). The quantity inside the square brackets in the r.h.s. of the above equation is determined by initial conditions. The choice of $\Phi$ from (74) in accordance with (83) makes it clear that the $(1\times1)$ NMM solves the (2+1)-dimensional Burgers hierarchy. We also see that the Hopf–Cole substitution

$$v_i(t) = \partial_i\Psi(t) = \partial_i\ln(\tau(t)) \tag{123}$$

linearizes the (2+1)-d Burgers hierarchy, (106) and (107) at $N = 1$, subject to the constraint (119). Note that the function $\tau$ appearing in (123) can be considered as a generating function for the solutions of the (2+1)-d Burgers hierarchy and in this sense plays the role of the so-called $\tau$-function of this hierarchy, see [41] where the notion of the $\tau$-function was introduced.

Footnote 2: By the way, the original (i.e. one-dimensional) Burgers equation [4] can also be included in the integrable hierarchy constituting a certain reduction of the KP hierarchy. This statement can be easily extracted from the results of [36].

Now we wish to analyze in some detail the structure of Eqs. (106), (107) and (108) for an arbitrary $N$. Their equivalent form is the following:

$$\frac{\partial W_p(t,\vec\partial)}{\partial t_h} + W_p(t,\vec\partial)\,\vec\partial^{\,h} = O^h_{p,p}W_p + O^h_{p,q}W_q, \qquad h\in Q, \tag{124}$$

$$\frac{\partial W_q(t,\vec\partial)}{\partial t_h} + W_q(t,\vec\partial)\,\vec\partial^{\,h} = O^h_{q,p}W_p + O^h_{q,q}W_q, \qquad h\in Q, \tag{125}$$

$$[W_p, W_q] = Y W_p, \tag{126}$$
where $Y$ and $O^h_{i,j}$ with $i, j = p, q$ are elements of $B$ and, according to Lemma 1, we can choose these operators in such a way that $O^h_{i,p}$ with $i = p$ and $q$ are of zeroth order in $\frac{\partial}{\partial y}$. Such a choice permits us to express the r.h.s. of (124)–(126) in terms of wave operators alone. To obtain such an expression we first have to define the right inverses of the wave operators. A way to do this is the following. We set

$$W_p^{-1} = \partial_x^{-N}\sum_{n\ge0} d_n\,\partial_x^{-n}, \qquad W_q^{-1} = \partial_y^{-1}\sum_{n\ge0}\tilde e_n\,\partial_y^{-n},$$

where $\{d_n\}_{n=0}^{\infty}$ are formal power series in the $t$'s, while $\{\tilde e_n\}_{n=0}^{\infty}$ are differential operators in $\partial_x$ with coefficients being formal power series in the $t$'s. The operators $\{\tilde e_n\}_{n=0}^{\infty}$ and the series $\{d_n\}_{n=0}^{\infty}$ are uniquely determined by the equations

$$W_p\cdot W_p^{-1} = 1 = W_q\cdot W_q^{-1}. \tag{127}$$

The multiplication operation "$\cdot$" in the above equations is defined by the Leibniz rule. Relations (127), considered as equations with respect to the unknown quantities $\{d_n\}_{n=0}^{\infty}$ and $\{\tilde e_n\}_{n=0}^{\infty}$, possess a unique solution.

Multiplying Eqs. (124) and (125) by $W_q^{-1}$ from the right, extracting the differential parts and using the fact that the $O^h$'s are already differential operators and the $O^h_{i,p}$'s, $i = p$ and $q$, are of zeroth order with respect to $\partial_y$, we find

$$O^h_{i,q} = \big(W_i\,\vec\partial^{\,h}\,W_q^{-1}\big)_+, \qquad i = p \text{ and } q, \tag{128}$$

where the subscript "plus" denotes the operation of extracting the differential part of an operator, i.e. if $O(t,\vec\partial) = \sum_{g\in G}c_g\,\vec\partial^{\,g}$ then $O(t,\vec\partial)_+ = \sum_{g\in Q}c_g\,\vec\partial^{\,g}$.
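As a small illustration of (127), the first coefficients $d_n$ can be computed by hand for $N = 1$, where $W_p = \partial_x + a$. The sketch below assumes the standard rule $\partial_x^{-1}f = \sum_{k\ge0}(-1)^k(\partial_x^k f)\,\partial_x^{-1-k}$ for moving a function through $\partial_x^{-1}$. This rule is not spelled out in the text and is used here as an assumption.

$$(\partial_x + a)\,\partial_x^{-1}\sum_{n\ge0}d_n\,\partial_x^{-n} = \sum_{m\ge0}\Big(d_m + a\sum_{n+k=m-1}(-1)^k\,\partial_x^k d_n\Big)\partial_x^{-m} = 1 .$$

Matching powers of $\partial_x^{-1}$ gives

$$d_0 = 1, \qquad d_1 = -a, \qquad d_2 = a^2, \qquad d_3 = -a^3 - a\,\partial_x a, \qquad \dots$$

Each $d_m$ is fixed by the lower ones, consistent with the uniqueness statement for (127).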
To compute the operators $O^h_{p,p}$ and $O^h_{q,p}$ we substitute (128) back into (124) and (125). Multiplying the resulting equations by $W_p^{-1}$ and projecting them onto differential operators in $\partial_x$ only, we get:

$$O^h_{i,p} = \Big(\big(W_i\,\vec\partial^{\,h}\,W_q^{-1}\big)_-\,W_q\,W_p^{-1}\Big)_{(+,0)}, \qquad i = p, q, \tag{129}$$

where the subscripts denote the following operations: for any pseudodifferential operator $O$, $O_- = O - O_+$ and $O_{(+,0)}$ is the projection of $O$ onto differential operators in $\partial_x$. It remains to compute the operator $Y$ entering the r.h.s. of (126). This differential operator is of zeroth order in $\partial_y$ and is therefore equal to

$$Y = \big([W_p, W_q]\,W_p^{-1}\big)_{(+,0)}. \tag{130}$$

Substituting (128)–(130) into (124)–(126) we obtain a system of equations governing the evolution of the wave operators:

$$\frac{\partial W_p}{\partial t_g} + W_p\,\vec\partial^{\,g} = \Big(\big(W_p\,\vec\partial^{\,g}\,W_q^{-1}\big)_-\,W_q\,W_p^{-1}\Big)_{(+,0)}W_p + \big(W_p\,\vec\partial^{\,g}\,W_q^{-1}\big)_+W_q, \qquad g\in Q, \tag{131}$$

$$\frac{\partial W_q}{\partial t_g} + W_q\,\vec\partial^{\,g} = \Big(\big(W_q\,\vec\partial^{\,g}\,W_q^{-1}\big)_-\,W_q\,W_p^{-1}\Big)_{(+,0)}W_p + \big(W_q\,\vec\partial^{\,g}\,W_q^{-1}\big)_+W_q, \qquad g\in Q, \tag{132}$$

$$[W_p, W_q] = \big([W_p, W_q]\,W_p^{-1}\big)_{(+,0)}W_p. \tag{133}$$
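For $N = 1$ the constraint (133) can be checked by hand. The following sketch uses only the wave operators (110) and (111) and the form of $W_p^{-1}$ given before (127). It is meant as an illustration rather than as part of the original derivation.

$$[W_p, W_q] = [\partial_x + a,\ \partial_y + b] = \partial_x b - \partial_y a,$$

which is a function, i.e. an operator of order zero. Since $W_p^{-1} = \partial_x^{-1}\sum_{n\ge0}d_n\,\partial_x^{-n}$ contains only negative powers of $\partial_x$, so does $(\partial_x b - \partial_y a)\,W_p^{-1}$, and hence

$$Y = \big([W_p, W_q]\,W_p^{-1}\big)_{(+,0)} = 0 .$$

Equation (133) then reduces to $\partial_x b - \partial_y a = 0$, in agreement with (118).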
Equations (131) and (132) together with the constraint (133) constitute a hierarchy of non-linear differential equations which can be viewed as a generalization of the Sato equations in the theory of KP equations (see [36] for a review). Burgers equations (120) with the condition (119) give the simplest examples of Eqs. (131)–(133) for $N = 1$. For large enough $N$, Eqs. (131) and (132) contain the first $n$ equations of the KP hierarchy, $n \ll N$. To see this, consider Eq. (131) for $W_p$ when $g = n\cdot p$ with $n > 1$. In order to present the answer in the standard form we set $W_p \equiv W$ and $t_n \equiv t_{n\cdot p}$ and obtain

$$\frac{\partial W}{\partial t_n} + W\,\partial_x^n = \big(W\,\partial_x^n\,W^{-1}\big)_+W, \qquad n > 1. \tag{134}$$

The set of equations (134) constitutes a certain reduction of the KP hierarchy, which we describe as follows. Take a solution to the KP hierarchy ([10]),

$$\tilde W = 1 + w_1(t)\,\partial_x^{-1} + w_2(t)\,\partial_x^{-2} + \cdots, \tag{135}$$

which satisfies the additional condition that $\tilde W\cdot\partial_x^N$ is a differential operator. Then the operator $W = \tilde W\cdot\partial_x^N$ solves Eqs. (134). We call the hierarchy of equations (134) the $KP^{(N)}$ hierarchy. This hierarchy has been described in detail in [36]. The KP hierarchy itself can be viewed as a limit of $KP^{(N)}$ as $N$ tends to infinity. Such a limit makes sense due to the stabilization of equations in the $KP^{(N)}$ hierarchy: its $n$th equation is independent of $N$ if $n \ll N$. However, the hierarchy (134) at finite $N$ can be of independent interest as well. For instance, the exact version of the statement of footnote 2 is that $KP^{(1)}$ is an integrable system containing the (1+1)-d Burgers equation. We can interpret Eqs. (131) and (132) with (133) as an integrable extension of the $KP^{(N)}$ hierarchy (134) to higher dimensions. Therefore, we call such a system the
extended-$KP^{(N)}$ hierarchy. The $N = 1$ example considered above supports such an interpretation: the extended-$KP^{(1)}$ hierarchy, or equivalently the (2+1)-d Burgers hierarchy, is a natural extension of the $KP^{(1)}$ hierarchy, or equivalently the (1+1)-d Burgers hierarchy. Our main result about the extended-$KP^{(N)}$ hierarchy is that it is a completely integrable extension of the $KP^{(N)}$ hierarchy, i.e. all solutions to the extended-$KP^{(N)}$ hierarchy are of the form (100) and (101). To be precise we have the following theorem.

Theorem 2. Let $\mathbb{C}[[t]]$ be the ring of formal power series in $t_g$, $g\in Q$, with complex coefficients. Let $B$ be the ring of differential operators in $\partial_x$ and $\partial_y$ with coefficients belonging to $\mathbb{C}[[t]]$. Operators $W_p$ and $W_q\in B$ of the form

$$W_p(t,\partial_x,\partial_y) = \partial_x^N + \sum_{n=1}^{N}a_n(t)\,\partial_x^{\,n-1}, \tag{136}$$

$$W_q(t,\partial_x,\partial_y) = \partial_y + \sum_{n=1}^{N}b_n(t)\,\partial_x^{\,n-1}, \tag{137}$$

where $a_n(t)$ and $b_n(t)\in\mathbb{C}[[t]]$ for $n = 1, \dots, N$, solve the system (131)–(133) if and only if there exists a set $\{h_n(t)\}_{n=1}^{N}$ of $N$ elements of $\mathbb{C}[[t]]$ such that

the Wronskian $W_x(h_1,\dots,h_N)(t)$ of $h_1,\dots,h_N$ with respect to the variable $x$ is an invertible element of $\mathbb{C}[[t]]$, (138)

$$\frac{\partial}{\partial t_g}h_n(t) = \vec\partial^{\,g}h_n(t), \qquad n = 1,\dots,N;\ g\in Q, \tag{139}$$

and

$$W_p h_n(t) = 0 = W_q h_n(t), \qquad n = 1,\dots,N. \tag{140}$$
Note that (i) the operators $W_p$ and $W_q$ defined in (136) and (137) are the explicit form of the wave operators defined in (96) and (97); (ii) the condition (138) implies in particular the linear independence of $h_1(t),\dots,h_N(t)\in\mathbb{C}[[t]]$; (iii) relations (140) constitute a system of linear algebraic equations for the coefficients $a_1(t),\dots,a_N(t)$ and $b_1(t),\dots,b_N(t)$. This system has a unique solution due to the linear independence of the functions $h_1(t),\dots,h_N(t)$. Thus Theorem 2 indeed describes all solutions to (131)–(133) and the explicit form of these solutions is given by (100) and (101). The proof of Theorem 2 is based on the following lemma.

Lemma 3. Let $W_p$ and $W_q$ be any two elements of $B$ of the form (136) and (137) satisfying the relation (133). Then $\dim(\mathrm{Ker}\,W_p\cap\mathrm{Ker}\,W_q) = N$ over the ring $\mathbb{C}[[\hat t_p,\hat t_q,t]]$ of formal power series depending on $\{t_g \mid g\in Q\setminus\{p,q\}\}$. Moreover, the basis in $\mathrm{Ker}\,W_p\cap\mathrm{Ker}\,W_q$ can be chosen to satisfy condition (138).

The proof of Lemma 3 is presented in the Appendix. However, the proof of Theorem 2 based on Lemma 3 is so short and transparent that we present it here.

♦ Suppose that the operators $W_p$ and $W_q\in B$ solve the hierarchy (131)–(133). Let us fix a basis $\tilde h_1(t),\dots,\tilde h_N(t)$ of $\mathrm{Ker}\,W_p\cap\mathrm{Ker}\,W_q$ which satisfies condition (138). Lemma 3 states that any element of $\mathbb{C}[[t]]$ annihilated by $W_p$ and $W_q$ can be decomposed into a linear combination of $\tilde h_1(t),\dots,\tilde h_N(t)$ with coefficients depending on all $t_g$'s, $g\in Q\setminus\{p,q\}$. Let us apply the operator equalities (131) and (132) to $\tilde h_1(t),\dots,\tilde h_N(t)$. The r.h.s.'s of the results are identically 0; subtracting $0\equiv\frac{\partial(W_{p(q)}\tilde h_i(t))}{\partial t_g}$ from the l.h.s.'s, we get:

$$W_p D_g\tilde h_i(t) = 0 = W_q D_g\tilde h_i(t), \qquad g\in Q,\ i = 1,\dots,N, \tag{141}$$

where $D_g\equiv\frac{\partial}{\partial t_g} - \vec\partial^{\,g}$. Therefore $D_g\tilde h_i(t)\in\mathrm{Ker}\,W_p\cap\mathrm{Ker}\,W_q$ for each $i$ and $g$. Then there exist $N\times N$ matrices $A_g(t)$ with $g\in Q$, independent of $x$ and $y$, such that

$$D_g\tilde h_i(t) = \sum_j[A_g(t)]_i^{\,j}\,\tilde h_j(t). \tag{142}$$
These matrices are not unrelated. The fact that $[D_g, D_h] = 0$ together with the linear independence of the basis elements $\tilde h_1(t),\dots,\tilde h_N(t)$ yields

$$\frac{\partial A_g}{\partial t_h} - \frac{\partial A_h}{\partial t_g} + [A_g, A_h] = 0, \qquad g, h\in Q\setminus\{p,q\}. \tag{143}$$

These zero-curvature-like conditions imply that there is an $(x,y)$-independent non-degenerate matrix $B(t)$ such that

$$A_g = \frac{\partial B^{-1}}{\partial t_g}\,B. \tag{144}$$

Non-degeneracy of $B(t)$ means that $\det B(t)$ is an invertible element of $\mathbb{C}[[t]]$. Consider a new basis of $\mathrm{Ker}\,W_p\cap\mathrm{Ker}\,W_q$ defined by

$$h_i(t) = \sum_j B_i^{\,j}(t)\,\tilde h_j(t), \qquad i = 1,\dots,N. \tag{145}$$

Substituting (144) and (145) into (142) we see that

$$D_g h_i(t) = 0, \qquad i = 1,\dots,N. \tag{146}$$
Therefore the condition (139) of the theorem is satisfied by the elements $h_1(t),\dots,h_N(t)$. The condition (138) is satisfied as well, since $W_x(h_1,\dots,h_N)(t) = \det B(t)\cdot W_x(\tilde h_1,\dots,\tilde h_N)(t)$, and the r.h.s. of this relation is invertible in $\mathbb{C}[[t]]$. Thus we have proven the "only if" part of the theorem (for each pair $W_p$ and $W_q$ solving (131)–(133), there exists a set of linearly independent functions $h_1(t),\dots,h_N(t)$ satisfying conditions (138)–(139)). From the considerations which led to the hierarchy (131)–(133) we know that the converse statement is also true: any pair of operators $W_p$ and $W_q$ annihilating $N$ functions that satisfy conditions (138) and (139) solves the extended-$KP^{(N)}$ hierarchy. This concludes the proof of Theorem 2. ♦

Theorem 2 implies a geometric description of the space of solutions to the extended-$KP^{(N)}$ hierarchy. Before we can give such a description some additional notation has to be introduced. Let $L\subset\mathbb{C}[[t]]$ be the complex linear space consisting of the elements of $\mathbb{C}[[t]]$ which are annihilated by the operators $D_g\equiv\frac{\partial}{\partial t_g} - \vec\partial^{\,g}$, $g\in Q$. Consider the set of all $N$-dimensional linear subspaces of $L$, which we identify with an infinite-dimensional Grassmann manifold $Gr(\infty,N)$. Recall that $Gr(\infty,N)$ is defined as the set of all $N$-dimensional linear subspaces of $\mathbb{C}^\infty$ (understood as a Tychonoff product), see [14] for details. We equip $Gr(\infty,N)$ with the structure of a topological space by declaring that
the set $B_\gamma$ consisting of the $N$-dimensional linear subspaces of $L$ having non-degenerate projections on the finite-dimensional linear subspace $\gamma\subset\mathbb{C}[[t]]$ is open. It follows from standard theorems of analysis (see e.g. [26]) that the set $\mathcal{B} = \{B_\gamma\}_{\gamma\subset\mathbb{C}[[t]]}$ together with the empty set $\emptyset$ constitutes a base for the topology on $Gr(\infty,N)$. Let $(Gr(\infty,N))_0$ be the open subset of $Gr(\infty,N)$ consisting of all $N$-dimensional linear subspaces of $L$ having a non-degenerate projection on the subspace $\Pi$ of $\mathbb{C}[[t]]$ spanned by $\{x^n\}_{n=0}^{N-1}$. The space of solutions to the extended-$KP^{(N)}$ hierarchy can now be described as follows.

Corollary 1. There is a one-to-one correspondence between the set of solutions to (131)–(133) and the points of $(Gr(\infty,N))_0$.
♦ First of all let us prove that two sets of functions $h_1(t),\dots,h_N(t)$ and $\tilde h_1(t),\dots,\tilde h_N(t)$, both satisfying (138) and (139), determine the same pair of operators $W_p$ and $W_q$ from (136) and (137) which annihilate them if and only if these two sets of functions are related by a constant non-degenerate linear transformation. The "if" statement is a direct consequence of Cramer's formula (see (100) and (101)). Conversely, suppose that $W_p$ and $W_q$ annihilate two sets of functions $h_1(t),\dots,h_N(t)$ and $\tilde h_1(t),\dots,\tilde h_N(t)$ satisfying (138) and (139). Each of these sets spans $\mathrm{Ker}\,W_p\cap\mathrm{Ker}\,W_q$. Therefore, there is a non-degenerate $N\times N$ matrix $M(t)$ independent of $x$, $y$ such that $h_i(t) = \sum_j M_{ij}(t)\,\tilde h_j(t)$. Applying the operator $D_g$ to both sides of the last equality and using the fact that $D_g h_i(t) = D_g\tilde h_i(t) = 0$, $i = 1,\dots,N$, and the linear independence of the elements $\tilde h_1,\dots,\tilde h_N$, we get $\frac{\partial M(t)}{\partial t_g} = 0$, $g\in Q$. Therefore, $M$ is a constant non-degenerate matrix. Thus any two sets of $N$ elements of $\mathbb{C}[[t]]$ obeying (138) and (139) satisfy (140) iff they are related by a constant non-degenerate linear transformation. Theorem 2 implies that we have just constructed a one-to-one map from the space of solutions to the extended-$KP^{(N)}$ hierarchy to the set of $N$-dimensional linear subspaces of $L$ or, equivalently, $Gr(\infty,N)$. This map is not onto: a linear subspace of $L$ belongs to the image iff one can find a basis $\{h_n\}_{n=1}^{N}$ of this subspace such that (138) is satisfied. But this condition is equivalent to the non-degeneracy of the projection of our subspace onto the subspace $\Pi$ described above. Thus the image of the map in question is exactly $(Gr(\infty,N))_0$ and the corollary is proved. ♦

Finally, let us discuss the relation between the solutions (100) and (101) to the extended-$KP^{(N)}$ hierarchy and the partition function of the NMM or, more generally, the fermionic correlators (73) with $U = U_N$. Consider the so-called vertex operator:

$$X(t,\vec z) = e^{V(t,\vec z)}\,e^{-V\left(\frac1n\frac{\partial}{\partial t_{n\cdot p}},\ \vec z^{\,-p}\right)}, \tag{147}$$

where $V\big(\frac1n\frac{\partial}{\partial t_{n\cdot p}}, \vec z^{\,-p}\big) = \sum_{n>0}\frac1n\frac{\partial}{\partial t_{n\cdot p}}\,\vec z^{\,-n\cdot p}$. It is proved in the Appendix that the wave function $w_p$, which encodes half of the solution to the extended-$KP^{(N)}$ hierarchy, can be expressed through the fermionic correlator:

$$w_p(t,\vec z) = \frac{X(t,\vec z)\,\tau(U_N,A,t)}{\tau(U_N,A,t)}. \tag{148}$$
This equation can be viewed as a generalization of previously known one-dimensional bosonization formulae [10]. Knowing the fermionic correlator (73) we can compute $w_p(\vec z,t)$ or, equivalently, the set of functions $\{a_n(t)\}_{n=0}^{N-1}$. The other half of the solution is given by the set $\{b_n(t)\}_{n=0}^{N-1}$. Even though we do not have at the moment explicit formulae for the functions $b(t)$ in terms of the fermionic correlators (73), we believe that the functions $b(t)$ are also determined by $\tau(U_N,A,t)$. The reason for such a conjecture is very simple: all solutions at hand are determined completely by the matrix $A_{g,h}$ with $g\in U_N$ and $h\in Q$, which in turn can be restored from $\tau(U_N,A,t)$. This conjecture has been verified in the case $N = 1$, see (121). So we conclude that the fermionic correlators $\tau(U_N,A,t)$ with operators $A$ from (74) play the role of $\tau$-functions for the solutions (100) to the extended-$KP^{(N)}$ hierarchy. We hope to continue the investigation of the structure of the hierarchies (131)–(133). One of the most important questions to be answered here is the following: is there a universal integrable system from which one can obtain all hierarchies (131) and (132) for $N = 1, 2, \dots$ by means of reductions? The answer to this question is not clear. The reason is that Eqs. (131) and (132) of the extended-$KP^{(N)}$ hierarchy do not stabilize as $N$ becomes larger. This is readily seen from the fact that the number of elements of $Q$ of a given degree $d$ grows with $d$. Thus the relation between the extended-$KP^{(N)}$ hierarchies and this hypothetical universal structure should be different from the known relation between the $KP^{(N)}$ hierarchies and the KP hierarchy.
7. Ward Identities in the NMM

It is well-known (see [32, 16, 20], see [34] for a review) that the partition functions of Hermitian, unitary, complex, etc., matrix models exhibit invariance with respect to a subalgebra of the algebra of holomorphic diffeomorphisms of the complex plane, or the Virasoro algebra with zero central charge. This invariance can be presented in the form of the so-called Virasoro constraints imposed on the partition function of a matrix model. It is also known that the Virasoro constraints can be rewritten in the form of loop equations [37, 22], see [24] for a review. These equations happen to be exactly solvable in certain scaling limits, thus providing a powerful tool for the study of matrix models. The aim of the present section is to demonstrate that the partition function of the NMM is also subject to an infinite set of constraints. These constraints generate a subalgebra of the $\mathrm{Diff}(\mathbb{C})$ algebra of all infinitesimal diffeomorphisms of the complex plane. We start with some heuristic considerations intended to unveil the reasons for the appearance of a subalgebra of the $\mathrm{Diff}(\mathbb{C})$ algebra in the normal matrix model. Consider a family of 0-dimensional field theories with action $V$ parameterized by the set of coupling constants $\{t_{kl}\}_{k,l\ge0}$, where $t_{k,l} = \bar t_{l,k}$:

$$V = \sum_{i=1}^{N}\sum_{k,l\ge0}t_{kl}\,z_i^{k}\,\bar z_i^{\,l}. \tag{149}$$
This family is equivalent in the field-theoretical framework to the NMM itself, since for each fixed set of coupling constants (149) gives an NMM potential entering (5). The algebra of the following reparameterizations acts in the space of field theories (149):

$$z_i\to z_i + \epsilon\,z_i^{m+1}\,\bar z_i^{\,n}, \tag{150}$$

$$\bar z_i\to\bar z_i + \bar\epsilon\,\bar z_i^{\,m+1}\,z_i^{n}, \tag{151}$$

where $m, n\ge0$. These reparameterizations can be presented as vector fields on the parameter space of the NMM with the potential (149):

$$\delta_{m,n}V = \epsilon\,w_{m,n}V + \bar\epsilon\,\bar w_{m,n}V, \tag{152}$$

where

$$w_{m,n} = \sum_{k,l\ge0}k\,t_{kl}\frac{\partial}{\partial t_{k+m,l+n}}, \qquad \bar w_{m,n} = \sum_{k,l\ge0}l\,t_{kl}\frac{\partial}{\partial t_{k+n,l+m}}. \tag{153}$$
These vector fields obey the following commutation relations:

$$[w_{m,n}, w_{p,q}] = (p-m)\,w_{m+p,n+q}, \qquad [\bar w_{m,n},\bar w_{p,q}] = (p-m)\,\bar w_{m+p,n+q}, \qquad [w_{m,n},\bar w_{p,q}] = q\cdot\bar w_{n+p,m+q} - n\cdot w_{m+q,n+p}, \tag{154}$$

in which we recognize the relations for the $\mathrm{Diff}(\mathbb{C})$ algebra generated by the operators $z^{n+1}\bar z^{m}\frac{\partial}{\partial z}$ and $\bar z^{n+1}z^{m}\frac{\partial}{\partial\bar z}$. The operators (153) span a subalgebra of $\mathrm{Diff}(\mathbb{C})$ consisting of all polynomial reparameterizations of the plane. So we conclude that this subalgebra of $\mathrm{Diff}(\mathbb{C})$ describes classical symmetries of the NMM (footnote 3).

Footnote 3: The reasonings above are inspired by [32].

Having discovered the $\mathrm{Diff}(\mathbb{C})$ symmetry on the "classical" level we gain the hope that it translates to the "quantum" level in the form of corresponding constraints (Ward identities) on the partition function (1). To see this directly let us perform the change of variables given by (150), (151) in the integral (5). As a consequence of the fact that the integral doesn't depend on the choice of the integration variables we obtain the following set of identities:

$$\int\prod_{i=1}^{N}dz_i\,d\bar z_i\,|\Delta(z)|^2\,e^{-\mathrm{tr}\sum_{k,l\ge0}t_{kl}M^k\bar M^l}\Big(-\sum_{p,q\ge0}p\,t_{pq}\,\mathrm{tr}\big(M^{m+p}\bar M^{n+q}\big) + \frac12(m+1)\,\mathrm{tr}\big(M^m\bar M^n\big) + \frac12\sum_{p=0}^{m}\mathrm{tr}\big(M^{m-p}\bar M^n\big)\,\mathrm{tr}M^p + \frac12\sum_{p=0}^{n-1}\sum_{k\ne l}^{N}\bar z_k^{\,n-1-p}\,\bar z_l^{\,p}\,z_l^{\,m+1}\,\frac{\bar z_k-\bar z_l}{z_k-z_l}\Big) = 0, \tag{155}$$

and its complex conjugate. Here $M$ and $\bar M$ are diagonal matrices with $M_{ii} = z_i$, $\bar M_{ii} = \bar z_i$. Our goal is to rewrite (155) in the form of differential constraints applied to the partition function itself. The obvious obstacle to doing so is the term with the double summation in the left-hand side of (155). To overcome it we present the fraction $\frac{\bar z_k-\bar z_l}{z_k-z_l}$ in the following form:

$$\frac{\bar z_k-\bar z_l}{z_k-z_l} = (\bar z_k-\bar z_l)^2\int_0^\infty d\omega\,e^{-\omega(|z_k-z_l|^2+\eta)}, \tag{156}$$

where $\eta > 0$ is an infinitesimal parameter. Note that if $z_k = z_l$ the right-hand side of (156) is equal to $\frac{0}{\eta} = 0$, therefore (156) can be considered as a continuation of the fraction at hand to all values of $z_k$, $z_l$. The replacement of $\frac{\bar z_k-\bar z_l}{z_k-z_l}$ with the integral (156) in (155) alters the integrand on a set of measure 0 and doesn't change the integral itself. Using this remark we can substitute (156) into (155) to arrive at the desired result:

$$W_{mn}Z_N = 0; \qquad \bar W_{mn}Z_N = 0, \tag{157}$$
where m, n ≥ 0 and Wm,n =
X
k tkl
k,l≥0
∂ ∂tm+k,n+l
∂ 1 + − (m + 1) 2 ∂tm,n
m n−1 Z 1X 1X ∞ ∂ ∂ ∂ ∂ ˆ + dωe−·ω e−ωD 2 ∂tm−p,n ∂tp,0 2 ∂t ∂t 0,n+1−p m+1,p p=0 p=0 0 ∂ ∂ ∂ ∂ ˆ ˆ + e−ωD −2 e−ωD , (158) ∂t0,n−1−p ∂tm+1,p+2 ∂t0,n−p ∂tm+1,p+1 +
W m,n =
X k,l≥0
l tkl
∂ ∂tn+k,m+l
∂ 1 + − (m + 1) 2 ∂tn,m
m n−1 Z 1X 1X ∞ ∂ ∂ ∂ ∂ ˆ + dωe−·ω × e−ωD 2 ∂tn,m−p ∂t0,p 2 ∂t ∂t n+1−p,0 p,m+1 p=0 p=0 0 ∂ ∂ ∂ ∂ ˆ ˆ −ω D −ω D + e −2 e . (159) ∂tn−1−p,0 ∂tp+2,m+1 ∂tn−p,0 ∂tp+1,m+1 +
Here ←←
→←
←→
→→
ˆ ≡ δ δ¯ − δ δ¯ − δ δ¯ + δ δ¯ , D
(160)
and δ and δ¯ are the shift operators defined as follows: ∂ ∂ )≡ , ∂tm,n ∂tm+1,n ∂ ¯ ∂ )≡ δ( . ∂tm,n ∂tm,n+1
δ(
(161) (162)
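The ω-integrals in (158) and (159) originate from the regularized representation (156) of the fraction $(\bar z_k - \bar z_l)/(z_k - z_l)$. A minimal numerical sketch of that representation is given below; the function names, the midpoint quadrature and the sample points are illustrative assumptions, not part of the original derivation.

```python
import math

def omega_integral(a, steps=20000):
    """Midpoint-rule estimate of the integral of exp(-a*w) over w in [0, inf), for a > 0."""
    upper = 40.0 / a          # exp(-40) is about 4e-18, so the truncated tail is negligible
    h = upper / steps
    return h * sum(math.exp(-a * (i + 0.5) * h) for i in range(steps))

def regularized_fraction(zk, zl, eta):
    """Right-hand side of (156): (conj(zk) - conj(zl))^2 times the omega-integral."""
    d = zk - zl
    return d.conjugate() ** 2 * omega_integral(abs(d) ** 2 + eta)

def exact_fraction(zk, zl):
    """Left-hand side of (156), defined only for zk != zl."""
    d = zk - zl
    return d.conjugate() / d

if __name__ == "__main__":
    zk, zl = 1.0 + 2.0j, 0.5 - 1.0j      # arbitrary sample points (assumed)
    for eta in (1e-2, 1e-4, 1e-8):
        gap = abs(regularized_fraction(zk, zl, eta) - exact_fraction(zk, zl))
        print(eta, gap)                   # the gap shrinks as eta -> 0
    print(regularized_fraction(zk, zk, 1e-8))   # coincident points give zero
```

For distinct points the regularized expression approaches the fraction as η decreases, while at coincident points it vanishes identically, which is the continuation used in the text.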
The arrows above the operators $\delta$, $\bar\delta$ in (160) indicate the direction in which these operators act. The operators $e^{-\omega\hat D}$ in (158) and (159) act within the brackets encompassing them. Note that the first terms in the expressions (158) and (159) for the operators $W_{mn}$ and $\bar W_{mn}$ coincide with the "classical" Diff(C) generators (153). The additional "anomalous" terms in the curly brackets of the expressions for $W_{mn}$ and $\bar W_{mn}$ appear from the variation of the measure in the integral (5). One can check directly that operators (158) and (159) satisfy the commutation relations (154), thus providing us with the Diff(C) constraints for the NMM.

It follows from (154) that the operators $W_{m,0}$ and $\bar W_{n,0}$ generate two commuting copies of a subalgebra of the Virasoro algebra. Thus the partition function of the NMM obeys Virasoro constraints as well. There is a nice feature of the Virasoro generators $W_{m,0}$ and $\bar W_{n,0}$ following from their representation (158) and (159): the sum over $p$ that carries the integrals over ω is empty in this case, so the expressions for $W_{m,0}$ and $\bar W_{n,0}$ are similar to the ones appearing in the conventional matrix models (cf. [34]):

$$W_{m,0} = \sum_{k,l\ge 0} k\, t_{kl} \frac{\partial}{\partial t_{m+k,l}} - \frac{1}{2}(m+1)\frac{\partial}{\partial t_{m,0}} + \frac{1}{2}\sum_{p=0}^{m} \frac{\partial}{\partial t_{m-p,0}}\frac{\partial}{\partial t_{p,0}}, \qquad (163)$$

$$\bar W_{n,0} = \sum_{k,l\ge 0} l\, t_{kl} \frac{\partial}{\partial t_{k,n+l}} - \frac{1}{2}(n+1)\frac{\partial}{\partial t_{0,n}} + \frac{1}{2}\sum_{p=0}^{n} \frac{\partial}{\partial t_{0,n-p}}\frac{\partial}{\partial t_{0,p}}. \qquad (164)$$
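As an illustration of the constraint generated by (163), consider the simplest case N = 1. Assuming that each derivative $\partial/\partial t_{kl}$ acts on the eigenvalue integral by inserting $-z^k \bar z^l$, the two anomalous terms of (163) add up to $(m+1)\langle z^m\rangle$, and $W_{m,0} Z_1 = 0$ becomes the integration-by-parts identity $\langle (m+1) z^m - \sum_{k,l} k\, t_{kl} z^{m+k}\bar z^l \rangle = 0$. The sketch below checks this on a grid; the couplings in `T`, the grid size and the function names are assumptions chosen only for the example.

```python
import math

# Illustrative couplings t_{kl} (assumed): V = |z|^2 + 0.2 (z^2 + zbar^2) + 0.1 |z|^4, a real confining potential.
T = {(1, 1): 1.0, (2, 0): 0.2, (0, 2): 0.2, (2, 2): 0.1}

def potential(z):
    zb = z.conjugate()
    return sum(t * z ** k * zb ** l for (k, l), t in T.items())

def virasoro_residual(m, half_width=5.0, n=160):
    """Normalized average of (m+1) z^m - sum_{k,l} k t_kl z^(m+k) zbar^l with weight exp(-V), for N = 1."""
    h = 2.0 * half_width / n
    num, den = 0j, 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            z = complex(x, y)
            zb = z.conjugate()
            w = math.exp(-potential(z).real)      # V is real up to rounding
            inside = (m + 1) * z ** m - sum(
                k * t * z ** (m + k) * zb ** l for (k, l), t in T.items()
            )
            num += w * inside
            den += w
    return num / den

if __name__ == "__main__":
    for m in (0, 1, 2):
        print(m, abs(virasoro_residual(m)))       # each residual should be numerically negligible
```

The midpoint grid is adequate here because the integrand is smooth and decays rapidly; for larger N the Vandermonde factor and the product terms of (163) would have to be included.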
The generators of Diff(C) given by (150) and (151) can be presented in the form $w_{m,n} = \bar z^{n} l_m$ and $\bar w_{m,n} = z^{n} \bar l_m$, where $l_n$, $\bar l_m$ generate two commuting copies of the Virasoro algebra. In this representation the invariance with respect to both Virasoro algebras implies Diff(C)-invariance. One might hope that, correspondingly, the invariance of the partition function with respect to $W_{m,0}$ and $\bar W_{n,0}$ of Eqs. (163) and (164) implies the invariance with respect to Diff(C) generated by (158) and (159). Presently we do not have any results which could justify such a hope.

To conclude the present section we would like to make the following comment. Usually the derivation of Ward identities in the conventional matrix models deals with the partition function written as an integral over a subset of the set of all matrices. Unfortunately, such a representation of the partition function of the NMM is unknown due to the non-linearity of the space of normal matrices. However, we have managed to derive Diff(C)-constraints for NMM starting from the eigenvalue form (5). Actually, this approach has its own advantages: we are now able to apply similar considerations to analyze the structure of Ward identities in models which are genuinely non-matrix. For example, consider a one-dimensional β-model, which was originally introduced in [12]; see [31] for a review. This and similar models reappeared in recent literature in quite different contexts such as the study of chaotic systems [47] and fractional statistics [46, 25]. The partition function of the model is

$$Z_N(\beta, t) = \int \prod_{i=1}^{N} dx_i\, |\Delta(x)|^{\beta}\, e^{-\sum_{j=1}^{N}\sum_{k=1}^{\infty} t_k x_j^k}, \qquad (165)$$
where β is any non-negative number and the integration goes over $\mathbb{R}^N$. The orthogonal, Hermitian and quaternionic matrix models are the particular cases of the β-model for the values of β equal to 1, 2 and 4, respectively. In general (165) can be thought of as a classical partition function of two-dimensional Coulomb beads threaded on a string; β is identified in this case with the charge of a bead. The expression in the exponent has the meaning of an external potential. Exploiting the invariance of the integral (165) under the change of variables $x_i \to x_i + \epsilon\, x_i^{n+1}$, $n \ge -1$, we obtain the following set of Virasoro constraints satisfied by the model:

$$L_n^{\beta} Z_N(\beta, t) = 0, \quad n \ge -1, \qquad (166)$$

$$L_n^{\beta} \equiv \frac{\beta}{2}\sum_{k=0}^{n} \frac{\partial}{\partial t_k}\frac{\partial}{\partial t_{n-k}} - \Big(1 - \frac{\beta}{2}\Big)(n+1)\frac{\partial}{\partial t_n} + \sum_{k=1}^{\infty} k\, t_k \frac{\partial}{\partial t_{k+n}}. \qquad (167)$$

To the best of our knowledge this simple observation has not been made before.
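The constraints (166), (167) can be probed numerically for N = 2. Under the assumption that $\partial/\partial t_k$ inserts $-p_k$ with $p_k = \sum_i x_i^k$ (so that $\partial/\partial t_0$ acts as $-N$), the statement $L_n^{\beta} Z_N = 0$ is equivalent to the vanishing of the average of $(1-\beta/2)(n+1)p_n + (\beta/2)\sum_{a=0}^{n} p_a p_{n-a} - \sum_k k\, t_k p_{k+n}$. The sketch below evaluates this average on a grid; the couplings in `T`, the grid and all function names are illustrative assumptions.

```python
import math

T = {1: 0.3, 2: 0.5, 4: 0.1}     # illustrative couplings t_k (assumed values)

def potential(x):
    return sum(t * x ** k for k, t in T.items())

def virasoro_residual(n, beta, half_width=5.0, m=160):
    """Normalized average, for N = 2, of the integrand equivalent to L_n^beta Z_N; expected to vanish."""
    h = 2.0 * half_width / m
    xs = [-half_width + (i + 0.5) * h for i in range(m)]
    ws = [math.exp(-potential(x)) for x in xs]
    kmax = max(T) + max(n, 0)
    num = den = 0.0
    for x1, w1 in zip(xs, ws):
        for x2, w2 in zip(xs, ws):
            weight = abs(x1 - x2) ** beta * w1 * w2
            p = [x1 ** k + x2 ** k for k in range(kmax + 1)]   # power sums, p[0] = N = 2
            # Jacobian of x -> x + eps x^(n+1), combined with the diagonal part of the Vandermonde variation
            jacobian = (1.0 - beta / 2.0) * (n + 1) * p[n] if n >= 0 else 0.0
            # remaining variation of |Delta(x)|^beta
            vandermonde = beta / 2.0 * sum(p[a] * p[n - a] for a in range(n + 1))
            # variation of the potential term
            force = sum(k * t * p[k + n] for k, t in T.items())
            num += weight * (jacobian + vandermonde - force)
            den += weight
    return num / den

if __name__ == "__main__":
    for beta in (1, 2, 4):
        print(beta, [virasoro_residual(n, beta) for n in (-1, 0, 1, 2)])
```

For even β the integrand is smooth and the residuals are at the level of rounding error; for β = 1 the kink of $|x_1 - x_2|$ on the diagonal limits the accuracy of this simple quadrature to roughly the square of the grid spacing.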
8. Conclusion

We see that NMM is a highly tractable yet nontrivial model possessing rich structures. The correlation functions of NMM with an axially symmetric potential can be expressed in terms of a holomorphic function of one variable. This holomorphic property leads to the universality of the correlation functions in the scaling limit. In the particular case of monomial potentials this holomorphic function is expressible in terms of degenerate hypergeometric functions. The NMM admits a free fermion representation. Using it we find that the behavior of the partition function of the NMM with respect to an arbitrary variation of the potential is governed by a completely integrable system of non-linear differential equations. This integrable system constitutes a multidimensional extension of the KP(N) hierarchy. In the simplest case, N = 1, it contains the (2+1)-d Burgers equations. The partition function of the NMM is subject to Diff(C)-constraints which reflect the symmetries of the model.

From the results of this paper we can foresee the following possible generalizations and developments. For example, there is an integrable hierarchy corresponding to an appropriate subset $U \subset Q$ which is different from the set $U_N$ used in this paper. By "appropriate" we mean that the ideal of the ring B consisting of the operators annihilating the set $h_g(t)$, $g \in U$, is finitely generated. An example of such a set is $\{g \in Q \mid \deg(g) \le n\}$. Another way to generalize our construction is to start with an arbitrary finitely generated Z-graded abelian group G and fix a semigroup $Q \subset G$ in such a way that the Z-gradation of G induces a $Z_+$-gradation of Q. Let us take for example $G = Z \times Z \times Z$, $Q = \{(m, n, p) \in G \mid m, n, p \ge 0\}$. In the simplest case, analogous to the case N = 1 considered in the paper, we will have three wave operators $W_i = \partial_i + a_i(t)$, $i = 1, 2, 3$, subject to the constraints $[W_i, W_j] = 0$.
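For this three-index example the commuting wave operators can be written in potential form, $a_i = \partial_i \Psi$, and the simplest flow of the resulting hierarchy can be tested numerically. The sketch below is an illustration under stated assumptions: the measure Φ(u, v, w) is replaced by a few positive point masses (`POINTS`), the sign convention is $\Psi = -\ln F$ with $F = \sum c\, e^{xu + yv + zw + uvw\,t}$ (so that $e^{-\Psi}$ solves the linear equation $\partial_t F = \partial_x\partial_y\partial_z F$), and the cubic term is taken to be $(\partial_x\Psi)(\partial_y\Psi)(\partial_z\Psi)$. Derivatives are approximated by central finite differences.

```python
import math

# Discrete stand-in for the measure Phi(u, v, w): tuples (c, u, v, w) with c > 0 (assumed values).
POINTS = [(1.0, 0.3, -0.5, 0.7), (0.4, -0.2, 0.6, 1.1), (2.0, 0.5, 0.1, -0.4)]

def psi(x, y, z, t):
    """Psi = -ln F, where F is the discretized integral of Phi * exp(xu + yv + zw + uvw t)."""
    f = sum(c * math.exp(x * u + y * v + z * w + u * v * w * t) for c, u, v, w in POINTS)
    return -math.log(f)

def partial(f, axes, h=1e-2):
    """Return a function approximating the mixed derivative of f along the given argument indices."""
    if not axes:
        return f
    first, rest = axes[0], axes[1:]
    g = partial(f, rest, h)
    def df(*p):
        up, dn = list(p), list(p)
        up[first] += h
        dn[first] -= h
        return (g(*up) - g(*dn)) / (2.0 * h)
    return df

def residual(x, y, z, t):
    """Psi_t + Psi_x Psi_yz + Psi_y Psi_xz + Psi_z Psi_xy - Psi_xyz - Psi_x Psi_y Psi_z."""
    d = lambda *axes: partial(psi, axes)(x, y, z, t)   # argument order: 0 = x, 1 = y, 2 = z, 3 = t
    px, py, pz = d(0), d(1), d(2)
    return (d(3) + px * d(1, 2) + py * d(0, 2) + pz * d(0, 1)
            - d(0, 1, 2) - px * py * pz)

if __name__ == "__main__":
    print(residual(0.2, -0.1, 0.3, 0.1))   # small, limited by the O(h^2) finite-difference error
```

With the opposite sign convention, $\Psi = +\ln F$, the quadratic terms enter with the opposite relative sign and the same residual does not vanish.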
These constraints have a solution $a_i(t) = \partial_i \Psi(t)$, and one of the simplest equations of the corresponding hierarchy, written in terms of the potential Ψ, reads

$$\frac{\partial \Psi}{\partial t_{(1,1,1)}} + (\partial_x \Psi)(\partial_y\partial_z \Psi) + (\partial_y \Psi)(\partial_x\partial_z \Psi) + (\partial_z \Psi)(\partial_x\partial_y \Psi) = (\partial_x\partial_y\partial_z \Psi) + (\partial_x \Psi)(\partial_y \Psi)(\partial_z \Psi).$$

This equation has the following solution:

$$\Psi = -\ln \int du\, dv\, dw\, \Phi(u, v, w)\, e^{xu + yv + zw + uvw\, t_{(1,1,1)}},$$

which can be verified by direct substitution: the function $e^{-\Psi}$, which is annihilated by the wave operators $W_i = \partial_i + \partial_i\Psi$, satisfies the linear equation $\partial_{t_{(1,1,1)}} e^{-\Psi} = \partial_x\partial_y\partial_z\, e^{-\Psi}$.

It is worth mentioning that integrable systems in 2+1 dimensions which are different from the KP hierarchy have been investigated recently using the Lax formalism (see e.g. [1] and [50]): generalized Schroedinger equations, the Davey-Stewartson equation (a generalization of the non-linear Schroedinger equation to (2+1) dimensions) and their close relatives. It will be interesting to study the Lax structure of the (2+1)-dimensional Burgers hierarchy (extended-KP(1)) introduced in Sect. 6.

The following question seems to be extremely interesting to explore: is there a possible connection between NMM and multidimensional gravity analogous to the well-known relations between HMM and two-dimensional gravity? To address this question we need to formulate an analog of Feynman rules for the computation of the coefficients of the asymptotic expansion of the partition function of NMM in the case when coupling constants are large. The corresponding analysis is complicated by the non-linearity of the space of normal matrices, but there are no obstructions of principle to progress. The asymptotic expansion of the partition function of the Hermitian matrix model (with a non-polynomial potential) carries topological information about the moduli space of Riemann surfaces [38]. What can be said in this respect about the partition function of NMM?

As far as algebraic geometry is concerned, it will be interesting to study an analog of quasiperiodic solutions to the KP hierarchy [27] in the context of extended-KP(N) hierarchies. Such solutions could correspond to algebraic surfaces on the one hand and to points of Gr(N, ∞) on the other, thus providing us with a sort of multidimensional Krichever construction [15].

9. Appendix

9.1. The proof of Theorem 1. The single quantity which determines both one- and two-point correlation functions is the analytical continuation of the one-point correlation function,

$$R_N^{(1)}(u|N\cdot V) = \sum_{m=0}^{N-1} c_m[N\cdot V]\, u^m e^{-N\cdot V(u)} = \frac{1}{\pi}\sum_{m=0}^{N-1} \frac{u^m e^{-N\cdot V(u)}}{\int_0^\infty dx\, x^m e^{-N\cdot V(x)}}. \qquad (168)$$

The one-point correlation function is equal to $R_N^{(1)}(|z|^2 | N\cdot V)$, while the expression of the connected part of the two-point function in terms of $R_N^{(1)}(u|N\cdot V)$ is given in (30).

Let $x_0$ be the unique positive solution of Eq. (38), $x_0 V'(x_0) = 1$. To prove Theorem 1 it is sufficient to compute $R_N^{(1)}(u|N\cdot V)$ for $u \in D_V \cup (x_0, +\infty)$, where

$$D_V = \Big\{ u \in \mathbb{C} \;\Big|\; 0 < |u| < x_0;\ |\mathrm{Arg}(u)| < \frac{\pi}{2\cdot d};\ \mathrm{Re}\Big(V(x_0) - \ln\Big(\frac{x_0}{u}\Big) - V(u)\Big) < 0 \Big\}, \qquad (169)$$

where $d$ is the degree of the polynomial $V(x) = \sum_{k=1}^{d} t_k x^k$, $t_k \ge 0$ for $k = 1, 2, \cdots, d-1$, $t_d > 0$. Note that $D_V$ contains neither the origin nor closed loops around the origin. Therefore $\ln(u)$, $u \in D_V$, is well defined by means of analytical continuation from $\mathbb{R}_+$; $\ln(x_0/u)$ in (169) is understood in the sense of such continuation. We will not need a detailed description of the shape of the domain $D_V$; it will be important, however, that the interval $(0, x_0)$ is contained in $D_V$ in such a way that each point belonging to the interval is contained in $D_V$ together with an open neighborhood.

The computation of $R_N^{(1)}(u|N\cdot V)$ is based on the following
Proposition 1. For any $u \in D_V$, there is $\theta_0 \in (0, 1)$ independent of N such that for any $\theta$: $0 < \theta < \theta_0$,

$$R_N^{(1)}(u|N\cdot V) \sim \frac{N}{\pi}\int_\theta^1 d\mu\, \frac{u^{N\cdot\mu}}{\int_0^\infty dx\, x^{N\cdot\mu} e^{-N\cdot V(x)}}\, e^{-N\cdot V(u)} \cdot \big(1 + O(1/\sqrt{N})\big), \qquad (170)$$

where $u \in D_V \cup (0, \infty)$.
The proof of Proposition 1 is too long to be reproduced here. We only wish to make the following remark concerning the statement of Proposition 1. Equation (170) is actually a consequence of two statements: first, the summation from 0 to N − 1 in the r.h.s. of (168) can be replaced with the summation from θ · N to N without changing the asymptotics of the l.h.s. of (168). Second, the resulting sum can be replaced with an integral, $\sum_{n > N\cdot\theta}^{N} \to N\cdot\int_\theta^1 d\mu$. The reader must be warned that in our situation the possibility of replacing summation with integration is not evident at all: the summand in (168) depends on N and this dependence becomes more and more pronounced as N grows. Our proof of Proposition 1 is based on the application of the Poisson summation formula to the r.h.s. of (168) and follows the approach of [13], Chapter 3, which deals with cases similar to ours.

Our main task is to compute the integral in the r.h.s. of (170). The computation of the integral $\int_0^\infty dx\, x^{N\cdot\mu} e^{-N\cdot V(x)}$ is an exercise in applications of the Laplace method; $\int_0^\infty dx\, x^{N\mu} e^{-N\cdot V(x)} = \int_0^\infty dx\, e^{-N\cdot S(\mu,x)}$, where $S(\mu, x) = V(x) - \mu\cdot\ln(x)$; $S(\mu, x)$ as a function of x tends to +∞ as x tends to 0 and +∞ and has a global minimum at the point $x_c(\mu) > 0$:

$$x_c(\mu) V'(x_c(\mu)) = \mu. \qquad (171)$$
For any positive μ, Eq. (171) has a unique positive solution, $x_c(\mu)$, which is a monotonically increasing function of μ. The second derivative of $S(\mu, x)$ with respect to x taken at the critical point is equal to $V''(x_c(\mu)) + \frac{V'^2(x_c(\mu))}{\mu}$, a positive number for any positive μ. Thus, the application of the Laplace formula to the integral at hand is justified and gives

$$\int_0^\infty dx\, e^{-N\cdot S(\mu,x)} \sim e^{-N\cdot\big(V(x_c(\mu)) - \mu\cdot\ln(x_c(\mu))\big)} \left[ \sqrt{\frac{2\pi}{N\Big(V''(x_c(\mu)) + \frac{V'^2(x_c(\mu))}{\mu}\Big)}} + O(N^{-3/2}) \right].$$

Substituting this answer into (170) we get

$$R_N^{(1)}(u|N\cdot V) \sim \frac{N}{\pi}\sqrt{\frac{N}{2\pi}} \int_\theta^1 d\mu\, \sqrt{V''(x_c(\mu)) + \frac{V'^2(x_c(\mu))}{\mu}}\; e^{N\cdot\big(V(x_c(\mu)) - \mu \ln(x_c(\mu)/u)\big)}\, e^{-N\cdot V(u)}. \qquad (172)$$
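The Laplace estimate used above can be checked numerically. The sketch below compares a direct quadrature of $\int_0^\infty dx\, e^{-N(S(\mu,x) - S(\mu,x_c))}$ with the Gaussian prefactor $\sqrt{2\pi/(N S'')}$, where $S'' = V'' + V'^2/\mu$ at the critical point (171). The quadratic test potential, the bisection solver and the integration window are assumptions made for the illustration.

```python
import math

T1, T2 = 1.0, 0.5                     # illustrative potential V(x) = T1*x + T2*x^2 (assumed)

def V(x):
    return T1 * x + T2 * x * x

def dV(x):
    return T1 + 2.0 * T2 * x

def d2V(x):
    return 2.0 * T2

def critical_point(mu):
    """Solve x V'(x) = mu for x > 0 by bisection; the left-hand side is increasing in x."""
    lo, hi = 0.0, 1.0
    while hi * dV(hi) < mu:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * dV(mid) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def laplace_ratio(N, mu, steps=20000):
    """Ratio of the rescaled integral to the leading Laplace term sqrt(2 pi / (N S''))."""
    xc = critical_point(mu)
    S = lambda x: V(x) - mu * math.log(x)
    s2 = d2V(xc) + dV(xc) ** 2 / mu                   # second derivative of S at the minimum
    sigma = 1.0 / math.sqrt(N * s2)
    a = max(xc - 12.0 * sigma, 1e-12)                 # stay inside x > 0
    b = xc + 12.0 * sigma
    h = (b - a) / steps
    integral = h * sum(math.exp(-N * (S(a + (i + 0.5) * h) - S(xc))) for i in range(steps))
    return integral / math.sqrt(2.0 * math.pi / (N * s2))

if __name__ == "__main__":
    for N in (50, 100, 400):
        print(N, laplace_ratio(N, 0.5))               # tends to 1 as N grows
```

The ratio approaches 1 with a relative correction of order 1/N, consistent with the $O(N^{-3/2})$ remainder quoted next to the $O(N^{-1/2})$ leading term.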
To simplify the integral in the r.h.s. of (172) we change the integration variable from μ to $x = x_c(\mu)$. Note that $d\mu = \Delta V(x_c(\mu))\, dx_c(\mu)$, where $\Delta V(x) \equiv V'(x) + x V''(x) > 0$ for any x > 0. Note that $\Delta V(|z|^2) = \partial\bar\partial V(|z|^2)$, which justifies our choice of notation. Taking into account that $x_c(\mu)$ satisfies Eq. (171) we get

$$R_N^{(1)}(u|N\cdot V) \sim \frac{N}{\pi}\sqrt{\frac{N}{2\pi}} \int_{\epsilon}^{x_0} \frac{dx}{\sqrt{x}}\, \sqrt{\Delta V(x)^3}\; e^{N\cdot\big(V(x) - x V'(x)\ln(x/u)\big)}\, e^{-N\cdot V(u)}, \qquad (173)$$

where $\epsilon = x_c(\theta) > 0$ and $x_0 = x_c(1)$. Note that $x_0$ solves Eq. (171) for μ = 1 and therefore Eq. (38). The integral in the r.h.s. of (173) can be computed using the steepest descent method. Let $T(x, u) = V(x) - x V'(x)\ln(x/u)$. As is easy to check,

$$T'(x, u) = -\Delta V(x)\ln\Big(\frac{x}{u}\Big), \qquad T(x = u, u) = V(u), \qquad T''(x = u, u) = -\frac{\Delta V(u)}{u}. \qquad (174)$$
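The relations (174) follow from differentiating $T(x, u) = V(x) - x V'(x)\ln(x/u)$ and can be confirmed by finite differences. The sketch below does this for real positive x and u; the cubic test potential in `T` and the helper names are assumptions of the example.

```python
import math

T = {1: 0.5, 2: 0.3, 3: 0.2}          # illustrative coefficients t_k of V(x) = sum t_k x^k (assumed)

def V(x):
    return sum(t * x ** k for k, t in T.items())

def dV(x):
    return sum(k * t * x ** (k - 1) for k, t in T.items())

def d2V(x):
    return sum(k * (k - 1) * t * x ** (k - 2) for k, t in T.items())

def delta_V(x):
    """Delta V(x) = V'(x) + x V''(x)."""
    return dV(x) + x * d2V(x)

def T_func(x, u):
    return V(x) - x * dV(x) * math.log(x / u)

def first_derivative_gap(x, u, h=1e-5):
    """Central-difference T'(x, u) minus the closed form -Delta V(x) ln(x/u)."""
    numeric = (T_func(x + h, u) - T_func(x - h, u)) / (2.0 * h)
    return numeric - (-delta_V(x) * math.log(x / u))

def second_derivative_gap(u, h=1e-4):
    """Central-difference T''(x = u, u) minus the closed form -Delta V(u)/u."""
    numeric = (T_func(u + h, u) - 2.0 * T_func(u, u) + T_func(u - h, u)) / (h * h)
    return numeric - (-delta_V(u) / u)

if __name__ == "__main__":
    print(first_derivative_gap(0.5, 0.8), second_derivative_gap(0.8))
    print(T_func(0.8, 0.8) - V(0.8))   # T(u, u) = V(u) exactly, since ln(1) = 0
```

The factor 1/u in $T''(u,u)$ is what combines with $dx/\sqrt{x}$ and $\sqrt{\Delta V^3}$ in (173) to produce a result proportional to $\Delta V(u)$ in the steepest descent evaluation.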
First let us consider the case when $u \in (x_0, \infty)$. Then the global maximum of $T(x, u)$ is reached at $x = u > x_0$. But $x_0$ is the upper limit of integration in (173). Thus, it follows from (174) that $T(x, u) - V(u) = T(x, u) - T(x = u, u) < 0$ for $x \in (\epsilon, x_0)$. Therefore, there is a positive constant C, which may depend on u, such that

$$\left| \frac{N}{\pi}\sqrt{\frac{N}{2\pi}} \int_{\epsilon}^{x_0} \frac{dx}{\sqrt{x}}\, \sqrt{\Delta V(x)^3}\; e^{N\cdot\big(V(x) - x V'(x)\ln(x/u)\big)}\, e^{-N\cdot V(u)} \right| \le N^{3/2}\cdot C\cdot e^{N\cdot M} \sim 0,$$

where $M = \max_{x\in(\epsilon, x_0)} \big(T(x, u) - V(u)\big) < 0$. So we conclude from (173) that $R_N^{(1)}(u|N\cdot V) \sim 0$ for $u \in (x_0, \infty)$. This proves, upon the identification $u = |z|^2$, the second line of (37).

Second, let us suppose that $u \in D_V$. According to (174), $x = u$ is the only critical point of $T(x, u)$ in the open right half of the complex x-plane. Using the definition of $D_V$ it is easy to show that $\mathrm{Re}\, V(u) > \mathrm{Re}\, T(x, u)$ for $x \in (x_0, \infty)$. A simple analysis then shows that the integration in (173) can be extended to infinity without changing the large-N asymptotics of $R_N^{(1)}(u|N\cdot V)$. Next we deform the contour of integration from $(\epsilon, \infty)$ to Γ, where the contour Γ is a union of two pieces: a piece of the curve $\mathrm{Re}\, T(x, u) = \mathrm{const}$ which goes from the point ε to the point ε′ of intersection with the ray $\mathrm{Arg}(x) = \mathrm{Arg}(u)$ emerging from the origin, and a piece of the ray $\mathrm{Arg}(x) = \mathrm{Arg}(u)$ from the point ε′ to infinity. According to Proposition 1 we can always choose ε > 0 as small as we want; it is easy to check that one can choose ε small enough so that the contour Γ is well defined and contains the point u. Thus,

$$R_N^{(1)}(u|N\cdot V) \sim \frac{N}{\pi}\sqrt{\frac{N}{2\pi}} \int_{\Gamma} \frac{dx}{\sqrt{x}}\, \sqrt{\Delta V(x)^3}\; e^{N\cdot\big(T(x,u) - V(u)\big)}, \qquad (175)$$

where we used the fact that $V(x)$ is a polynomial of degree d and the definition of $D_V$ to prove that the integral over the segment connecting at infinity the real positive half axis and the ray $\mathrm{Arg}(x) = \mathrm{Arg}(u)$ vanishes. Then (175) is a consequence of the Cauchy theorem. A simple analysis shows that $\mathrm{Re}\big(T(x, u) - V(u)\big)|_{x\in\Gamma} \le 0$ with the equality reached at the single point $x = u$. Therefore we are entitled to use the steepest descent
formula to evaluate the integral (175). The main contribution comes from the critical point $x = u$. Using the information collected in (174) we get

$$R_N^{(1)}(u|N\cdot V) \sim \frac{N}{\pi}\cdot \Delta V(u)\cdot \big(1 + O(1/\sqrt{N})\big), \quad u \in D_V. \qquad (176)$$

Thus, specifying $u = |z|^2$ and using the fact that $D_V$ contains $(0, x_0)$, we get from (176) the first line of (37).

The proof of (39) is achieved by noticing that according to (176) the leading term of the large-N asymptotics of $R_N^{(1)}(u|N\cdot V)$ is a holomorphic function on $D_V$. On the other hand, each point of the interval $(0, x_0)$ is contained in $D_V$ together with an open neighborhood. Thus the assumption which was used in the main text to derive (39) is proven and so is Theorem 1.

9.2. The proof of Lemma 1. Recall that the wave operator $W_p$ defined by (96) is of order N in $\partial_x$ while $W_q$ from (97) is of the first order in $\partial_y$. Any operator $O \in B$ can be presented in the form

$$O = \sum_{m_1=0}^{M_1}\sum_{m_2=0}^{M_2} C_{m_1,m_2}\, \partial_x^{m_1}\partial_y^{m_2},$$

where the coefficients C are elements of $\mathbb{C}[[t]]$. Therefore,

$$O = \sum_{m_1}^{M_1} C_{m_1,M_2}\, \partial_x^{m_1}\partial_y^{M_2} + \text{terms of lower order in } \partial_y.$$

Or,

$$O = \sum_{m_1}^{M_1} C_{m_1,M_2}\, \partial_x^{m_1}\partial_y^{M_2-1} W_q + \text{terms of lower order in } \partial_y.$$

So, applying induction we see that $O = b_q W_q + O'$, where $O' \in B$ is of zeroth order in $\partial_y$. If the degree of $O'$ with respect to $\partial_x$ is less than N we are done. If not, we have $O' = C_K \partial_x^K + \text{lower order terms} = C_K \partial_x^{K-N} W_p + \text{lower order terms}$. If the order of the omitted terms is greater than N we continue the process. After a finite number of steps we will arrive at the following representation of the operator $O'$:

$$O' = b_p W_p + \sum_{g\in U_N} c_g(t)\, \vec\partial^{\,g},$$

where $b_p$ is a differential operator in $\partial_x$ only. Therefore an operator O can be presented as follows:

$$O = b_p W_p + b_q W_q + \sum_{g\in U_N} c_g(t)\, \vec\partial^{\,g}. \qquad (177)$$

Applying equality (177) to the function $h_h(t)$ with $h \in U_N$ we see that

$$\sum_{g\in U_N} c_g(t)\, (\vec\partial^{\,g} h_h) = 0, \quad h \in U_N. \qquad (178)$$
But (178) is a system of linear homogeneous algebraic equations with respect to $\{c_g\}_{g\in U_N}$ whose determinant is an invertible formal power series. Thus $c_g = 0$, $g \in U_N$, and Lemma 1 is proved.

9.3. The proof of Lemma 2. First of all let us introduce the grading of the ring B by setting $\deg(\partial_x) = 1$, $\deg(\partial_y) = N$. The degree of an element of B is then by definition the maximal degree of the monomials in the linear combination constituting this element. We will prove Lemma 2 on the basis of the following proposition:

Lemma 4. Any element $O \in B$ admits a unique representation of the form

$$O = \sum_{n_1,n_2\ge 0} c_{n_1,n_2}\, W_q^{n_2} W_p^{n_1}, \qquad (179)$$

where $\deg(c_{n_1,n_2}) < N$ and $c_{n_1,n_2} = 0$ if $n_1 \gg 0$ or $n_2 \gg 0$.

We will prove Lemma 4 later. Now let us show how the statement of Lemma 2 can be deduced from Lemma 4. Consider the following sequence of maps:

$$0 \to B \xrightarrow{\ \alpha\ } B \oplus B \xrightarrow{\ \beta\ } I \to 0, \qquad (180)$$

where $\alpha(b) = \big(-b(W_q + Y),\, bW_p\big)$, $b \in B$, and $\beta(c_1, c_2) = c_1 W_p + c_2 W_q$, $(c_1, c_2) \in B \oplus B$. Lemma 2 is actually equivalent to the statement of exactness of this sequence. So, let us check the exactness of (180).

First, we see that the map α is a monomorphism. Indeed, $\alpha(b) = 0$ implies in particular that $bW_p = 0$, which yields $b = 0$, as the ring B is an integral domain. Thus, $\mathrm{Ker}(\alpha) = 0$. Second, the map β is an epimorphism. Indeed, it follows from Lemma 1 that for any $O \in I$ there are $c_1, c_2 \in B$ such that $O = c_1 W_p + c_2 W_q = \beta(c_1, c_2)$. Thus, β is onto. Finally, we have to verify that $\mathrm{Im}(\alpha) = \mathrm{Ker}(\beta)$. It is easy to see that $\mathrm{Im}(\alpha) \subset \mathrm{Ker}(\beta)$. Indeed, $\beta\circ\alpha(b) = \beta\big(-b\cdot(W_q + Y),\, b\cdot W_p\big) = b\cdot([W_p, W_q] - Y W_p)$, which is zero by virtue of the relation (133).

Let us prove the opposite inclusion, $\mathrm{Ker}(\beta) \subset \mathrm{Im}(\alpha)$. If $(c_1, c_2) \in \mathrm{Ker}(\beta)$ then for any $b \in B$, $(c_1, c_2) - \alpha(b) = \big(c_1 + b(W_q + Y),\, c_2 - bW_p\big) \in \mathrm{Ker}(\beta)$. By Lemma 4, $c_2 = \sum_{n_1,n_2\ge 0} q_{n_1,n_2} W_q^{n_2} W_p^{n_1}$. Choose $b = \sum_{n_1,n_2\ge 0} q_{n_1+1,n_2} W_q^{n_2} W_p^{n_1}$. Then $(c_1, c_2) - \alpha(b) = \big(c_1 + b(W_q + Y),\, \sum_{n\ge 0} q_{0,n} W_q^{n}\big) \in \mathrm{Ker}(\beta)$. Again, by Lemma 4, $c_1 + b(W_q + Y) = \sum_{n_1,n_2\ge 0} p_{n_1,n_2} W_q^{n_2} W_p^{n_1}$. So,

$$\sum_{n_1,n_2\ge 0} p_{n_1,n_2}\, W_q^{n_2} W_p^{n_1+1} + \sum_{n\ge 0} q_{0,n}\, W_q^{n+1} = 0.$$

But the l.h.s. of the equation above is a representation of 0 in the form (179); it follows then from the uniqueness of such a representation that $p_{n_1,n_2} = 0$ for $n_1, n_2 \ge 0$ and
$q_{0,n} = 0$ for $n \ge 0$. Going back we conclude that $(c_1, c_2) - \alpha(b) = 0$, which proves that $\mathrm{Ker}(\beta) \subset \mathrm{Im}(\alpha)$.

It remains to prove Lemma 4, which states the existence and uniqueness of the decomposition (179). Take any element $O \in B$ of order less than $(n+1)\cdot N$ with $n \ge 0$. It can be presented in the following form:

$$O = \sum_{k=0}^{n} q_k\, \partial_x^{k\cdot N}\partial_y^{n-k} + O', \qquad (181)$$

where the degree of $q_k$ is less than N and the degree of $O'$ is less than $n\cdot N$. Thus,

$$O - O' = q_0 W_q^n + q_0(\partial_y^n - W_q^n) + \cdots + q_n \partial_x^{n\cdot N} = q_0 W_q^n + q_1' \partial_y^{n-1}\partial_x^{N} + \cdots + q_n' \partial_x^{n\cdot N} + D_1, \qquad (182)$$

where $\deg(q_k') < N$ and $\deg(D_1) < n\cdot N$. Suppose that we have proven that for some m such that $0 \le m < n$,

$$O - O' = c_0 W_q^n + \cdots + c_m W_q^{n-m} W_p^{m} + d_{m+1}\partial_y^{n-m-1}\partial_x^{(m+1)\cdot N} + \cdots + d_n \partial_x^{n\cdot N} + D_{m+1}, \qquad (183)$$

where the degrees of the operators $c_i$ and $d_j$ are less than N and $\deg(D_{m+1}) < n\cdot N$. Then, setting $c_{m+1} = d_{m+1}$, we see that

$$O - O' = c_0 W_q^n + \cdots + c_m W_q^{n-m} W_p^{m} + c_{m+1} W_q^{n-m-1} W_p^{m+1} + d_{m+1}\big(\partial_y^{n-m-1}\partial_x^{(m+1)\cdot N} - W_q^{n-m-1} W_p^{m+1}\big) + \cdots + d_n \partial_x^{n\cdot N} + D_{m+1}$$
$$= c_0 W_q^n + \cdots + c_{m+1} W_q^{n-m-1} W_p^{m+1} + d_{m+2}'\partial_y^{n-m-2}\partial_x^{(m+2)\cdot N} + \cdots + d_n' \partial_x^{n\cdot N} + D_{m+2},$$

where the degrees of the operators $c_i$ and $d_j'$ are less than N and $\deg(D_{m+2}) < n\cdot N$. Thus we have proved by induction, the base of which is (182) and the induction hypothesis is (183), that

$$O = \sum_{k=0}^{n} c_k\, W_q^{n-k} W_p^{k} + O'', \qquad (184)$$

where $\deg(O'') < n\cdot N$. Note that if $\deg(O) < N$, then the existence of (179) is clear. So, using (184) to generate induction in the degree, we verify the existence of the representation (179) for any $O \in B$.

To prove the uniqueness of (179) we must show that the equality

$$\sum_{n_1,n_2\ge 0} c_{n_1,n_2}\, W_q^{n_2} W_p^{n_1} = 0 \qquad (185)$$

implies $c_{n_1,n_2} = 0$ for all $n_1, n_2 \ge 0$. We assume of course that $\deg(c_{n_1,n_2}) < N$. Suppose that $n\cdot N \le \deg\big(\sum_{n_1,n_2\ge 0} c_{n_1,n_2} W_q^{n_2} W_p^{n_1}\big) < (n+1)\cdot N$ with $n > 0$. Then

$$\sum_{n_1,n_2\ge 0} c_{n_1,n_2}\, W_q^{n_2} W_p^{n_1} \equiv \sum_{n_1+n_2=n} c_{n_1,n_2}\, W_q^{n_2} W_p^{n_1} + O''' = 0, \quad \text{with } \deg(O''') < n\cdot N.$$
Thus $c_{0,n}\,\partial_y^{n} + (\text{terms of lesser degree in } \partial_y) = 0$. Therefore, $c_{0,n} = 0$. Continuing step by step the consideration of the terms of the highest degree we see that $c_{n_1,n_2} = 0$ for $n_1 + n_2 = n$. But this contradicts the assumption that $\deg\big(\sum_{n_1,n_2\ge 0} c_{n_1,n_2} W_q^{n_2} W_p^{n_1}\big) \ge n\cdot N$ with $n > 0$. Thus all coefficients $c_{n_1,n_2}$ in the l.h.s. of (185) for $n_1 + n_2 > 0$ are equal to 0 and we are left with the equality $c_{0,0} = 0$. The uniqueness of (179) is thus proved and so is Lemma 4.

9.4. The proof of Lemma 3. First we will construct N linearly independent elements of $\mathbb{C}[[t]]$ annihilated by $W_p$ and $W_q$ and satisfying condition (138). Then we will prove that any element of $\mathbb{C}[[t]]$ annihilated by $W_p$ and $W_q$ is a linear combination of these elements with coefficients independent of x, y.

Let $\tilde h_1, \cdots, \tilde h_N$ be N linearly independent elements of $\mathbb{C}[[t]]$ generating $\mathrm{Ker}\, W_p$. They can be chosen to have the following form: $\tilde h_k = x^{k-1} + (\text{higher order terms in } x)$ with $k = 1, \cdots, N$, so that the condition (138) is satisfied. Applying (133) to these elements, we see that $W_q \tilde h_i(t) \in \mathrm{Ker}\, W_p$. Therefore, there exists an N × N matrix $E(t)$ independent of x such that

$$W_q \tilde h_i(t) = \sum_j E(t)_{ij}\, \tilde h_j(t). \qquad (186)$$

Consider an invertible matrix $F(t)$ solving the equation

$$\partial_y F(t)\cdot F(t)^{-1} = E(t). \qquad (187)$$

Such a solution always exists in the formal category: it is easy to verify that $F(t) = I + F_1(t)\cdot y + F_2(t)\cdot y^2 + \cdots$, where I is the N × N identity matrix and $F_m(t)$, $m > 0$, are N × N matrices independent of y. They are determined one by one from the recursion relation $F_{m+1}(t) = \frac{1}{m+1}\sum_{i+j=m} E_i(t)\cdot F_j(t)$ with $m \ge 0$, $E(t) \equiv \sum_{i\ge 0} E_i(t)\, y^i$ and $F_0(t) \equiv I$. The evaluation of our solution on complex numbers for y can be written as usual in the form of a path-ordered exponent. Consider now a set of functions $h_1, \cdots, h_N$ defined from:

$$\tilde h_i(t) = \sum_j [F(t)]_{ji}\, h_j(t), \quad i = 1, \cdots, N.$$
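The recursion $F_{m+1} = \frac{1}{m+1}\sum_{i+j=m} E_i F_j$ with $F_0 = I$ can be run directly on Taylor coefficients. The following sketch uses exact rational arithmetic and two hand-checkable choices of $E(y)$; the helper names and the sample matrices are assumptions made for the illustration, with the matrices depending on y only.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def series_F(E, order):
    """Given Taylor coefficients E = [E_0, E_1, ...] of E(y), return [F_0, ..., F_order]
    solving dF/dy = E(y) F with F_0 = I via F_{m+1} = (1/(m+1)) * sum_{i+j=m} E_i F_j."""
    n = len(E[0])
    F = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    for m in range(order):
        acc = [[Fraction(0)] * n for _ in range(n)]
        for i in range(min(m, len(E) - 1) + 1):
            term = mat_mul(E[i], F[m - i])
            acc = [[acc[r][c] + term[r][c] for c in range(n)] for r in range(n)]
        F.append([[entry / (m + 1) for entry in row] for row in acc])
    return F

if __name__ == "__main__":
    # Constant generator of rotations: F(y) = [[cos y, sin y], [-sin y, cos y]].
    rotation = series_F([[[0, 1], [-1, 0]]], 4)
    print(rotation[2], rotation[3])
    # E(y) = y * diag(1, 2): F(y) = diag(exp(y^2/2), exp(y^2)).
    diagonal = series_F([[[0, 0], [0, 0]], [[1, 0], [0, 2]]], 4)
    print(diagonal[2], diagonal[4])
```

When the coefficients $E_i$ do not commute, the same recursion produces the Taylor coefficients of the path-ordered exponent mentioned in the text.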
Then $W_q h_i(t) = 0$, $i = 1, \cdots, N$, due to (186) and the fact that the matrix $F(t)$ solves Eq. (187); and $W_p h_i(t) = 0$, $i = 1, \cdots, N$, due to the fact that $F(t)$ is independent of x. Thus $h_1, \cdots, h_N \in \mathrm{Ker}\, W_p \cap \mathrm{Ker}\, W_q$. Due to the non-degeneracy of the matrix $F(t)$ and the fact that the elements $\tilde h_1, \cdots, \tilde h_N$ were chosen to satisfy the condition (138), it follows that the elements $h_1, \cdots, h_N$ also satisfy the condition (138). Moreover, these elements generate the intersection of the kernels: suppose there is $f(t) \in \mathbb{C}[[t]]$ such that $f(t) \in \mathrm{Ker}\, W_p \cap \mathrm{Ker}\, W_q$. Then $f(t) \in \mathrm{Ker}\, W_p$ as well. Therefore, there are coefficients $d_1(t), \cdots, d_N(t)$ independent of x such that

$$f(t) = \sum_i d_i(t)\, h_i(t). \qquad (188)$$

Thus all we have to prove is that the coefficients $d_i$ are independent of y. Applying $W_q$ to (188) we obtain that $\sum_i \big(\partial_y d_i(t)\big)\, h_i(t) = 0$. Then by linear independence
of the $h_i$ over formal power series depending on y and $t_g$, $g \in Q \setminus \{p, q\}$, we conclude $\partial_y(d_i(t)) = 0$, $i = 1, \cdots, N$. Lemma 3 is proved.

9.5. The proof of the bosonization relation. Here we present the proof of (148). First, consider the operators $P_+(-t) \equiv e^{-H(t)} P_+ e^{H(t)}$ and $P_-(-t) \equiv e^{-H(t)} P_- e^{H(t)}$. Following [21], we will call them Clifford operators. The identities below [8] express the properties of the Clifford operators which will be important for us:

$$\langle \mathrm{vac}|P_+(-t) = \langle \mathrm{vac}|e^{H(t)} = \langle \mathrm{vac}|P_-(-t). \qquad (189)$$

The proof of (189) is not completely straightforward and was not explained in [8]. Therefore let us outline the proof. Consider the state

$$\langle \epsilon| = \langle \mathrm{vac}|e^{\epsilon H(t)} P_+(-t). \qquad (190)$$

We wish to compute $\frac{d}{d\epsilon}\langle \epsilon|$. Note that $\mathcal{H}(t) \equiv e^{-\epsilon H(t)} H(t) e^{\epsilon H(t)} = \sum_{g\in G_+} t_g J_g(-\epsilon t)$, where we used the definition (57) and the agreement (60). Therefore

$$\frac{d}{d\epsilon}\langle \epsilon| = \langle \mathrm{vac}|e^{\epsilon H(t)}\, \mathcal{H}(t)\, P_+(-t) = \sum_{g\in G_+} t_g \sum_{h\in G_+} \langle \mathrm{vac}|e^{\epsilon H(t)}\, \psi_h(-\epsilon t)\bar\psi_{g+h}(-\epsilon t)\, P_+(-t) = \sum_{g\in G_+} t_g \sum_{h\in G_+} \langle \mathrm{vac}|\psi_h \bar\psi_{g+h}\, e^{\epsilon H(t)} P_+(-t) = 0,$$

where the first equality is due to the properties (70)-(72) of projector operators and the last equality follows from (53). Therefore the state $\langle \epsilon|$ is in fact independent of ε. Finally, observing that $\langle \mathrm{vac}|P_+ = \langle \mathrm{vac}|$ we obtain the desired result:

$$\langle \mathrm{vac}|e^{H(t)} = \langle \mathrm{vac}|e^{H(t)} P_+(-t) = \langle \epsilon = 1| = \langle \epsilon = 0| = \langle \mathrm{vac}|P_+(-t).$$

The first equality in (189) is proved. The second one can be verified along the same lines. Next we need to prove that

$$\langle U_N|e^{H(t)} \psi(\vec z) A P_+ = X(t, \vec z)\, \vec z^{\,(N-1)\cdot p}\, \langle U_N|\psi_{(N-1)\cdot p}\, e^{H(t)} A P_+. \qquad (191)$$
The proof is the following. Using (67) we can perform the following computation:

$$[P_+(-t)\hat U_N(-t), \psi(\vec z)] A P_+ = e^{V(t,\vec z)} e^{-H(t)} [P_+ \hat U_N, \psi(\vec z)] e^{H(t)} A P_+$$
$$= e^{V(t,\vec z)} e^{-H(t)}\, \vec z^{\,(N-1)\cdot p} \sum_{n\ge 0} \vec z^{\,-n\cdot p}\, [P_+ \hat U_N, \psi_{(N-1-n)\cdot p}]\, e^{H(t)} A P_+$$
$$= e^{V(t,\vec z)} e^{-H(t-\mu)}\, \vec z^{\,(N-1)\cdot p}\, [P_+(-\mu)\hat U_N(-\mu), \psi_{(N-1)\cdot p}]\, e^{H(t-\mu)} A P_+$$
$$= X(t, \vec z)\, e^{-H(t)}\, \vec z^{\,(N-1)\cdot p}\, [P_+(-\mu)\hat U_N(-\mu), \psi_{(N-1)\cdot p}]\, e^{H(t)} A P_+,$$

where $\prod_{g\in U_N} \bar\psi_g = \hat U_N$ and $\mu_g = \frac{1}{n}$ if $g = n\cdot p$ with $n > 0$ and $\mu_g = 0$ otherwise. A calculation shows that

$$P_+(-\mu) = P_+ \cdot \exp\Big(\vec z^{\,-p} \sum_{g\in G_{-1}} \psi_g \bar\psi_{g+p}(-\mu)\Big).$$

Using this result and observing that $\psi_h P_+ = 0$, $h \in G_-$, we find that

$$[P_+(-t)\hat U_N(-t), \psi(\vec z)] A P_+ = X(t, \vec z)\, \vec z^{\,(N-1)\cdot p}\, e^{-H(t)} P_+ [\hat U_N(-\mu), \psi_{(N-1)\cdot p}]\, e^{H(t)} A P_+. \qquad (192)$$

Applying the operator equality (192) to the left vacuum and using the property (189) of Clifford operators we see that

$$\langle \mathrm{vac}|\hat U_N e^{H(t)} \psi(\vec z) A P_+ = X(t, \vec z)\, \vec z^{\,(N-1)\cdot p}\, \langle \mathrm{vac}|\hat U_N \psi_{(N-1)\cdot p}\, e^{H(t)} A P_+, \qquad (193)$$

where we used that $\langle \mathrm{vac}|\hat U_N(-\mu)\psi_{(N-1)\cdot p} = \langle \mathrm{vac}|\hat U_N \psi_{(N-1)\cdot p}$, which can be verified directly. Taking into account that $\langle \mathrm{vac}|\hat U_N \equiv \langle U_N|$, we obtain (191). Finally let us apply (191) to $|U_N\rangle$ and use the fact that $P_+|U_N\rangle = |U_N\rangle$. Dividing the result by $\tau(U_N, A, t)$ we arrive at Eq. (148).

Acknowledgement. We are grateful to D. Fuchs, J. Hunter, A. Konechny, G. Kuperberg, M. Mulase, A. Schwarz, Ya. Sinai, P. Vanhaecke and V. Zakharov for numerous discussions and reading the manuscript. This research was partially supported by the U. S. Department of Energy and National Science Foundation. The NSF grant number is DMS 9304580.
References 1. Ablowitz, M. and Fokas, A.: Method of solution for a class of multidimensional non-linear evolution equations. Phys. Rev. Lett. 51 (1), 7 (1983) 2. Brezin, E. and Zee, A.: Universality of the correlations between eigenvalues of large random matrices. Nucl. Phys. B402, 613 (1993) 3. Brezin, E. and Zee, A.: Universal relation between Green’s functions in random matrix theory. E-print archive: cond-mat/9507032, Nucl. Phys. B453, 531 (1995) 4. Burgers, J.: The non-linear diffusion equation: Asymptotic solutions and statistical problems, Dordrecht (Holland)–Boston: D. Reidel Pub. Co., 1974 5. Chau, L.-L. and Yu, Y.: Unitary polynomials in normal matrix model and wave functions for the fractional quantum Hall effect. Phys. Lett. A167, 452 (1992) 6. Chau, L.-L. and Zaboronsky, O.: Normal matrix model, Toda lattice hierarchy, and the two-dimensional electron gas in the strong magnetic field. Proceedings in memory of Professor Wolfgang Kroll, ed. J. P. Hsu et. al., Singapore: World Scientific, 1997 7. Cole, J.: Quart. Appl. Math. 9, 225 (1951) 8. Date, E., Kashiwara, M. and Miwa, T.: Transformation groups for soliton equations. II. Proc. Japan Acad. 57, Ser. A, 387 (1981) 9. Date, E., Jimbo, M., Kashiwara, M. and Miwa, T.: Transformation groups for soliton equations. III. J. Phys. Soc. Jpn., 50, Ser. A, 342 (1981) 10. Date, E., Jimbo, M., Kashiwara, M. and Miwa, T: Integrable systems. Proc. RIMS Symp. “Non-linear integrable systems – classical and quantum theory”. p. 39, 1983 11. Dubrovin, B., Fomenko, A., Novikov, S.: Modern geometry – methods and applications, New-York: Springer-Verlag, 1990 12. Dyson, F.: Statistical theory of energy levels of complex systems, I, II and III. Jour. Math. Phys. 3, 140, 157 and 166 (1962) 13. Evgrafov, M.: Asymptotic estimates and entire functions. Translated by Allen L. Shields. Russian Tracts on Advanced Mathematics and Physics, Vol. IV; New York: Gordon and Breach, Inc., 1961 14. Fomenko, A., Fuchs, D. 
and Gutenmacher, V.: Homotopic topology, Budapest: Akademia Kiado: [Distributors, Kultura], 1986 15. Friedlander, L. and Schwarz, A.: Grassmannian and elliptic operators. E-print archive: funct-an/9704003
246
L.-L. Chau, O. Zaboronsky
16. Fukuma, M., Kawai, H. and Nakayama, R.: Continuum Schwinger-Dyson equations and universal structures in two-dimensional quantum gravity. Int. J. Mod. Phys. A6, 1385 (1991) 17. Gasper, G., Rahman, M.: Basic hypergeometric series. Cambridge [England]–New York: Cambridge University Press, 1990 18. Gurbatov, S., Malakhov, A., Saichev, A.: Nonlinear random waves and turbulence in nondispersive media:waves, rays, particles. New York: Manchester University press, 1991 19. Hopf, E.: Comm. Pure Appl. Math. 3, 201 (1950) 20. Itoyama, H. and Matsuo, Y.: Noncritical Virasoro algebra of the d < 1 matrix model and the quantized string field. Phys. Lett. B255, 202 (1991) 21. Kashiwara, M. and Miwa, T.: Transformation groups for soliton equations. I. Proc. Japan Acad. 57, Ser. A, 342 (1981) 22. Kazakov, V.: The appearance of matter fields from quantum fluctuations of 2D-gravity. Mod. Phys. Lett. A4, 2125 (1989) 23. Kharchev, S., Marshakov, A., Mironov, A. and Morozov, A.: Generalized Kontsevich model versus Toda hierarchy and discrete matrix models. Nucl. Phys. B397, 339 (1993) 24. Makenko, Yu.: Loop equations in matrix models and in 2D gravity. Mod. Phys. Lett. A6, (no. 21), 1901 (1991) 25. Kogan, I., Semenoff, G.: Fractional spin, magnetic moment and the short range interactions of anyons. Nucl. Phys. B368, 718 (1992) 26. Kolmogorov, A., Fomin, S.: Introductory real analysis. New York: Dover Publications Inc., 1975 27. Krichever, I.: Russ. Math. Surveys, 32 (185) 28. Kuznetsov, N. and Rozhdestvensky, B.: ZhVMMF 1 (2), 217 (1961) 29. Landau, L. and Lifshitz, E.: Fluid Mechanics. London: Pergamon Press, 1987 30. Leznov, A. and Saveliev, M.: Theory of group representations and integration of non-linear systems Xa,zz¯ = exp (kx)a , Physica 3D, 62 (1981) 31. Mehta, M.: Random matrices. San-Diego: Academic Press, 1991 32. Mironov, A. and Morozov, A.: On the origin of Virasoro constraints in matrix models: Lagrangian approach. Phys. Lett. B252 47 (1990) 33. 
Mohling, F.: Statistical mechanics: methods and applications. Jamaica, Queens, N. Y.: Publishers Creative Services Inc., 1982 34. Morozov, A.: Integrability and matrix models. E-print archive: hep-th/9303139, Phys. Usp. 1 1 (1994) 35. Mulase, M.: Math. Sci. 228, (1982) and private communication 36. Ohta, Y., Satsuma, J., Takakashi, D. and Tokihiro, T.: An elementary introduction to Sato theory. Progress of Theoretical Physics Supplement 94, 210 (1988) 37. Paffuti, G. and Rossi, P.: A solution to Wilson’s loop equation in lattice QCD2 . Phys. Lett B92, 321 (1980) 38. Penner, R.: Perturbative series and the moduli space of Riemann surfaces. Commun. Math. Phys. 113, 229 (1987) 39. Polyakov, A.: The theory of turbulence in two dimensions. Nucl. Phys. B396, 367 (1993) 40. Polyakov, A.: Turbulence without pressure. PUPT-1546, Jun 1995, 13, e-print archive: hep-th/9506189 41. Sato, M., Miwa, T. and Jimbo, M.: Holonomic quantum fields. Publ. RIMS, Kyoto Univ. 14, 223 (1978); 15, 201, 577, 871 (1979); 16, 531 (1980) 42. Sato, M., Jimbo, M., Miwa, T. and Mori, Y.: Holonomic quantum fields: the unanticipated link between deformation theory of differential equations and quantum fields. RIMS-305 (1979); Lausanne Math. Phys. 119 (1979) 43. Sato, M.: Soliton equations as dynamical systems on infinite dimensional Grassmann manifold. RIMS Kokyuroku, 439, 30 (1981) 44. Sato, M. and Sato, Y. (Mori): Nonlinear partial differential equations in Applied Science. ed. H. Fujita, Lax and G. Strang, Tokyo: Kinokuniya/North-Holland, p. 259, 1983 45. Schram, J.M.: Kinetic Theory of Gases and Plasmas. Fundamental Theories of Physics Series, vol. 46, Dordrecht, Boston, London: Kluwer Academic Publishers, 1991 46. Semenoff, G.: Anyons and Chern-Simons theory: A review. PRINT-91-0208 (BRITISH-COLUMBIA), Feb 1991. 32pp. Presented at Karpacz Winter School for Theoretical Physics, Karpacz, Poland, Feb 18– Mar 1, 1991 47. 
Simons, B., Lee, P. and Altshuler, B.: Matrix models, one-dimensional fermions, and quantum chaos. Phys. Rev. Lett. 72 (1), 64 (1994) 48. Sinai, Ya.: Two results concerning asymptotic behavior of solutions of the Burgers equation with force. J. Stat. Phys. 64, 1 (1991)
49. Ueno, K. and Takasaki, K.: Toda lattice hierarchy. I and II. Proceedings of the Japan Academy, Ser. A, 59 (5), 167 and ibid. 59 (6), 215 (1983) 50. Zakharov, V. and Manakov, S.: Construction of the multidimensional integrable systems and their solutions. Funktz. Anal Prilozh. 19 (2), 11 (1985) Communicated by T. Miwa
Commun. Math. Phys. 196, 249 – 288 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Deformed Virasoro Algebra at Roots of Unity
Peter Bouwknegt1, Krzysztof Pilch2
1 Department of Physics and Mathematical Physics, University of Adelaide, Adelaide, SA 5005, Australia
2 Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089-0484, USA
Received: 23 October 1997 / Accepted: 26 January 1998
Abstract: We discuss some aspects of the representation theory of the deformed Virasoro algebra $\mathrm{Vir}_{p,q}$. In particular, we give a proof of the formula for the Kac determinant and then determine the center of $\mathrm{Vir}_{p,q}$ for $q$ a primitive $N$th root of unity. We derive explicit expressions for the generators of the center in the limit $t = qp^{-1} \to \infty$ and elucidate the connection to the Hall–Littlewood symmetric functions. Furthermore, we argue that for $q = \sqrt[N]{1}$ the algebra describes “Gentile statistics” of order $N-1$, i.e., a situation in which at most $N-1$ particles can occupy the same state.

1. Introduction

In recent years it has been realized (see, in particular, [7]) that the theory of off-critical integrable models of statistical mechanics is intimately connected to the theory of infinite-dimensional quantum algebras and that such theories can be studied in close parallel to their critical counterparts, i.e., conformal field theories. Algebras of particular interest are the so-called deformed Virasoro algebra, $\mathrm{Vir}_{p,q}$, introduced in [11, 34] (see [4] for a review), its higher rank generalizations, the deformed W-algebras [9, 2, 3, 13], as well as their linearized versions [17, 18]. In particular, it has been argued [25, 26] that the deformed Virasoro algebra plays the role of the dynamical symmetry algebra in the Andrews-Baxter-Forrester RSOS models [1].

For generic values of the deformation parameters$^1$ the representation theory of the deformed Virasoro algebra $\mathrm{Vir}_{p,q}$ closely parallels that of the undeformed Virasoro algebra, as manifested, e.g., by the Kac determinant formula [34], Drinfel’d-Sokolov reductions [14, 33] and the existence of Felder type free field resolutions [26, 21, 10]. In this paper we will discuss some aspects of the representation theory of $\mathrm{Vir}_{p,q}$. First we complete the proof of the Kac determinant conjectured in [34], where also the hardest part of the proof, the construction of a sufficient number of vanishing lines, was already established. Then we proceed to discuss the representation theory for $q$ a primitive $N$th root of unity, i.e., a complex number $q$ such that $q^N = 1$ and $q^k \neq 1$ for all $0 < k < N$. (Henceforth, we use the notation $q = \sqrt[N]{1}$ for $q$ a primitive $N$th root of unity.) As was already observed in [34], it follows from the Kac determinant formula that, for $q = \sqrt[N]{1}$, Verma modules contain many singular vectors irrespective of the highest weight. We first analyze the case $q = -1$ in detail and find a close relation of $\mathrm{Vir}_{p,q}$ to the free fermion algebra. For $q = \sqrt[N]{1}$, $N > 2$, we give a generating series for all singular vectors. We derive explicit expressions of all singular vectors (for generic highest weight) in the limit $t = q/p \to \infty$. For generic $t$, the singular vectors are deformations of these, as we illustrate in various examples. We show that the existence of these generic singular vectors is a consequence of the fact that for $q = \sqrt[N]{1}$ (and $t \to \infty$) the algebra $\mathrm{Vir}_{p,q}$ has a large center. We compute this center by exploiting the isometry from the Verma module to the Hall–Littlewood symmetric polynomials [20].

$^1$ Here, “generic values of the deformation parameters” will always stand for “not a root of unity”.

The paper is organized as follows. In Sect. 2 we recall the definition of the deformed Virasoro algebra $\mathrm{Vir}_{p,q}$ and its highest weight modules, prove some simple properties and discuss the free field realization. In Sect. 3 we prove the Kac determinant formula and in Sect. 4 we discuss the representation theory of $\mathrm{Vir}_{p,q}$ for $q = \sqrt[N]{1}$. This section is divided into three parts. First we discuss the case of $q = -1$ and then proceed to the general case of $q = \sqrt[N]{1}$ for all $N \in \mathbb{N}$. Finally, we make the previous analysis more explicit in the limit $t = q/p \to \infty$. We conclude with some general comments in Sect. 5. In particular, we point out an interesting relation between $\mathrm{Vir}_{p,q}$ at $q = \sqrt[N]{1}$ and so-called Gentile statistics of order $N-1$. The latter is a generalization of Fermi statistics ($N = 2$) in which at most $N-1$ particles can occupy the same state [16].
Three appendices follow. In Appendix A we give basic definitions and summarize some results from the theory of symmetric functions that are used throughout the paper. In Appendix B we provide some explicit examples of singular vectors at $q = \sqrt[N]{1}$ for $N = 3, 4$, and in Appendix C we establish some elementary identities for the products of generators of $\mathrm{Vir}_{p,q}$ in the limit $t \to \infty$ which, when specialized to $q = \sqrt[N]{1}$, yield another derivation of the center of $\mathrm{Vir}_{p,q}$.

2. The Deformed Virasoro Algebra $\mathrm{Vir}_{p,q}$

2.1. Definition. The deformed Virasoro algebra [34, 4], $\mathrm{Vir}_{p,q}$, is defined to be the associative algebra generated by $\{T_n,\ n \in \mathbb{Z}\}$, with relations
$$\sum_{l \ge 0} f_l \left( T_{m-l} T_{n+l} - T_{n-l} T_{m+l} \right) = c_m\, \delta_{m+n,0}, \tag{2.1}$$
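where $f_l$ is determined through
$$f(x) \equiv \sum_{l \ge 0} f_l x^l = \exp\left( \sum_{n \ge 1} \frac{(1-q^n)(1-t^{-n})}{1+p^n}\, \frac{x^n}{n} \right) = \frac{1}{1-x}\, \frac{(qx;p^2)_\infty\, (q^{-1}px;p^2)_\infty}{(qpx;p^2)_\infty\, (q^{-1}p^2x;p^2)_\infty}, \tag{2.2}$$
and
$$c_m = \zeta\,(p^m - p^{-m}), \qquad \zeta = -\frac{(1-q)(1-t^{-1})}{1-p}. \tag{2.3}$$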
Here, $p, q \in \mathbb{C}$ with $p$ not a root of $-1$. The series (2.2) are to be understood as formal power series and the equality holds in the region where both converge. For convenience we have introduced a third parameter $t$ by $t = qp^{-1}$. Also
$$(x;q)_M = \prod_{k=1}^{M} (1 - x q^{k-1}), \qquad (q)_M = (q;q)_M. \tag{2.4}$$
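The two expressions for $f(x)$ in (2.2) can be compared numerically. The following is a minimal sketch, not part of the original paper: the parameter values $p = 0.3$, $q = 0.7$, the truncation orders and all function names are illustrative assumptions, and the product form is truncated under the assumption $|p| < 1$.

```python
from math import isclose


def f_from_exponential(p, q, order):
    """Coefficients f_0..f_order from the exponential form of (2.2)."""
    t_inv = p / q  # t = q / p
    g = [0.0] + [
        (1 - q**n) * (1 - t_inv**n) / ((1 + p**n) * n)
        for n in range(1, order + 1)
    ]
    f = [1.0] + [0.0] * order
    # f = exp(g) implies f' = g' f, i.e. l * f_l = sum_k k * g_k * f_{l-k}.
    for l in range(1, order + 1):
        f[l] = sum(k * g[k] * f[l - k] for k in range(1, l + 1)) / l
    return f


def times_linear(series, a):
    """Multiply a truncated power series by (1 - a x)."""
    return [c - a * (series[i - 1] if i else 0.0) for i, c in enumerate(series)]


def over_linear(series, a):
    """Divide a truncated power series by (1 - a x)."""
    out = []
    for i, c in enumerate(series):
        out.append(c + a * (out[i - 1] if i else 0.0))
    return out


def f_from_product(p, q, order, factors=60):
    """Same coefficients from the product form of (2.2); assumes |p| < 1."""
    series = over_linear([1.0] + [0.0] * order, 1.0)  # prefactor 1 / (1 - x)
    for k in range(factors):
        p2k = p ** (2 * k)
        series = times_linear(series, q * p2k)         # (q x; p^2)
        series = times_linear(series, p * p2k / q)     # (q^-1 p x; p^2)
        series = over_linear(series, q * p * p2k)      # (q p x; p^2)
        series = over_linear(series, p * p * p2k / q)  # (q^-1 p^2 x; p^2)
    return series


if __name__ == "__main__":
    p, q = 0.3, 0.7
    print(f_from_exponential(p, q, 6))
    print(f_from_product(p, q, 6))
```

The two printed lists should agree up to rounding. The first nontrivial coefficient is $f_1 = (1-q)(1-t^{-1})/(1+p)$, which is the quantity entering (2.22).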
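In terms of formal series
$$T(z) = \sum_{n \in \mathbb{Z}} T_n z^{-n}, \qquad \delta(z) = \sum_{n \in \mathbb{Z}} z^n, \tag{2.5}$$
the relations (2.1) read
$$f\!\left(\frac{w}{z}\right) T(z)T(w) - f\!\left(\frac{z}{w}\right) T(w)T(z) = \zeta \left( \delta\!\left(p\,\frac{w}{z}\right) - \delta\!\left(p^{-1}\frac{w}{z}\right) \right). \tag{2.6}$$
For future convenience we also introduce a generating series for $c_m = -c_{-m}$,
$$c(x) \equiv \sum_{m \ge 0} c_m x^m = \zeta \left( \frac{1}{1-px} - \frac{1}{1-p^{-1}x} \right). \tag{2.7}$$
Note that the algebra $\mathrm{Vir}_{p,q}$ is invariant under $(p,q) \leftrightarrow (p^{-1}, qp^{-1})$ and $(p,q) \leftrightarrow (p^{-1}, q^{-1})$ and carries a $\mathbb{C}$-linear anti-involution $\omega$ defined by
$$\omega(T_n) = T_{-n}. \tag{2.8}$$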
From the second expression for $f(x)$ in (2.2) one can easily derive the recurrence relation
$$f(x) f(px) = \frac{(1-qx)(1-q^{-1}px)}{(1-x)(1-px)} = 1 - \zeta \left( \frac{1}{1-x} - \frac{1}{1-px} \right), \tag{2.9}$$
which is also the requirement that the algebra generated by $T(z)$ be associative [4]. Clearly, for generic $p$, (2.9) has a unique solution given by (2.2).

Remark. We have defined $\mathrm{Vir}_{p,q}$ for any $p \in \mathbb{C}$ by means of (2.1)–(2.3) considered as formal power series, as long as $p$ is not a root of $-1$. For $p$ a root of $-1$, the expressions (2.2) are ill-defined. In this case one may still define $\mathrm{Vir}_{p,q}$ in terms of a solution of the recurrence relation (2.9). However, such a solution is not unique, as can be seen by taking different limits in (2.2) to obtain inequivalent solutions to (2.9). It is rather straightforward to show by iterating (2.9) that, for $p$ an $N$th root of $-1$, a necessary and sufficient condition for the existence of a solution is $q^{2N} = 1$. We have not found a succinct way to classify all the solutions that arise then. In this paper we will focus on the case where $p$ is not a root of $-1$, except for brief remarks in Sects. 2.2 and 4.1.

The algebra $\mathrm{Vir}_{p,q}$ can be considered to be a deformation of the Virasoro algebra Vir
$$[L_m, L_n] = (m-n) L_{m+n} + \frac{c}{12}\, m(m^2-1)\, \delta_{m+n,0}, \tag{2.10}$$
in the following sense:
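Theorem 2.1 [34]. In the limit $q = e^{\hbar} \to 1$ with $t = q^{\beta}$ ($\beta$ fixed) we have
$$T(z) = 2 + \beta \left( L(z) + \frac{(1-\beta)^2}{4\beta} \right) \hbar^2 + O(\hbar^4), \tag{2.11}$$
where $L(z) = \sum_{n \in \mathbb{Z}} L_n z^{-n}$ satisfies the Virasoro algebra (2.10) with
$$c = 1 - \frac{6(1-\beta)^2}{\beta}. \tag{2.12}$$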
2.2. Modules. In this paper we will consider the class of modules of $\mathrm{Vir}_{p,q}$ that are analogues, e.g., deformations, of the highest weight modules of the undeformed algebra. In particular, we are interested in studying those modules by means of standard techniques based on characters and contravariant forms. For that reason we extend $\mathrm{Vir}_{p,q}$ by a derivation $d$ satisfying
$$[d, T_n] = -n T_n, \tag{2.13}$$
and define the category $\mathcal{O}$ of $\mathrm{Vir}_{p,q}$ modules as the set of $d$-diagonalizable modules $V = \coprod_{z \in \mathbb{C}} V_{(z)}$, such that each $d$-eigenspace $V_{(z)}$ is finite dimensional. Let $P(V) = \{ z \in \mathbb{C} \mid V_{(z)} \neq 0 \}$. We also require that there exists a finite set $z_1, \ldots, z_s \in \mathbb{C}$ such that $P(V) \subset \bigcup_{i=1}^{s} \{ z_i + \mathbb{N} \}$. Then, the character of a $\mathrm{Vir}_{p,q}$ module $V \in \mathcal{O}$ is defined by
$$\mathrm{ch}_V(x) = \mathrm{Tr}\, x^d = \sum_{z \in \mathbb{C}} \dim(V_{(z)})\, x^z. \tag{2.14}$$
One can show that the so-defined category $\mathcal{O}$ contains, in particular, highest weight modules, that are defined in the usual manner and include, among others, Verma modules and their (irreducible) quotients. The Verma module $M(h)$ [34, 9], with highest weight state $|h\rangle$ satisfying $T_0 |h\rangle = h |h\rangle$ and $T_n |h\rangle = 0$, $n > 0$, has a basis indexed by partitions $\lambda = (\lambda_1, \lambda_2, \ldots)$, $\lambda_1 \ge \lambda_2 \ge \ldots > 0$, i.e., a basis of $M(h)_{(n)}$ is given by the vectors
$$|\lambda; h\rangle \equiv T_{-\lambda} |h\rangle \equiv T_{-\lambda_1} \cdots T_{-\lambda_\ell} |h\rangle = T_{-n}^{m_n(\lambda)} \cdots T_{-1}^{m_1(\lambda)} |h\rangle, \tag{2.15}$$
where $\lambda$ runs through all partitions of $n$. We use the notation $\lambda \vdash n$. Furthermore, $m_i(\lambda) = \#\{ j : \lambda_j = i \}$ denotes the number of parts of length $i$ in $\lambda$. For $\lambda \vdash n$ we also use $|\lambda| = n$ and define the length of $\lambda$ by $\ell(\lambda) = \sum_{i \ge 1} m_i(\lambda)$. The following two orderings are used: the (reverse) lexicographic ordering, i.e., $\lambda > \mu$ if the first nonvanishing difference $\lambda_1 - \mu_1, \lambda_2 - \mu_2, \ldots$, is positive, and the natural (partial) ordering, i.e., $\lambda \succeq \mu$ if $\sum_{j=1}^{i} \lambda_j \ge \sum_{j=1}^{i} \mu_j$ for all $i \ge 1$. From the previous discussion it follows that the character of the Verma module $M(h)$ is given by
$$\mathrm{ch}_M(x) = \prod_{n \ge 1} \frac{1}{1 - x^n} = \sum_{n \ge 0} p(n)\, x^n, \tag{2.16}$$
where $p(n)$ is the number of partitions of $n$. The anti-involution $\omega$ of $\mathrm{Vir}_{p,q}$ (see (2.8)) determines a bilinear contravariant form (“Shapovalov form”) on $M(h)$, uniquely defined by
$$G_{\lambda\mu} \equiv \langle \lambda; h | \mu; h \rangle = \langle h | \omega(T_{-\lambda})\, T_{-\mu} | h \rangle, \tag{2.17}$$
and
$$\langle h | h \rangle = 1. \tag{2.18}$$
In particular we have $G_{\lambda\mu} = 0$ for $|\lambda| \neq |\mu|$. In Sect. 3 we compute the so-called Kac determinant $G^{(n)}$, i.e., the determinant of the form $G_{\lambda\mu}$ on $M(h)_{(n)}$.

As an aside, one might wonder whether the category $\mathcal{O}$ contains any finite dimensional irreducible representations of $\mathrm{Vir}_{p,q}$. Clearly, $\mathrm{Vir}_{p,q}$ is abelian for $q = 1$ and/or $t = 1$ and therefore has a wealth of one dimensional irreducible representations in these cases. The complete result is:

Theorem 2.2. The only irreducible finite dimensional $\mathrm{Vir}_{p,q}$ modules in $\mathcal{O}$ are one dimensional and occur for
(i) $q = 1$ and/or $t = 1$, and $h \in \mathbb{C}$ arbitrary.
(ii) $p = q^{1/3}$ or $p = q^{-1/2}$ and $h^2 = p^{-1}(1+p)^2$.

Remark. In case (i) we have $c(x) = 0$ and $f(x) = 1$. Case (ii) corresponds to a $c(x) \neq 0$ deformation of the $c = 0$ Virasoro module with $h^2 = h_{1,1}^2$ (cf. (2.12) and (3.6)).

Proof. Let $|h\rangle$ be the highest weight state of a finite dimensional irreducible module in $\mathcal{O}$. Define $a_n = \langle h | T_n T_{-n} | h \rangle$, $n \ge 0$. Then $a_0 = h^2$, and, as follows from the commutation relations (2.1),
$$a_n + \sum_{l=1}^{n} a_{n-l}\, f_l = c_n, \qquad n \ge 1. \tag{2.19}$$
Since the module is finite dimensional, we must have $a_n = 0$ for $n$ sufficiently large, say $n > n_0$. Multiplying (2.19) by $x^n$ and summing over $n \ge n_0 + 1$, we find
$$\left( \sum_{i=0}^{n_0} a_i x^i \right) f(x) = c(x) + a_0. \tag{2.20}$$
In arriving at (2.20) we have also used (2.19) with $n \le n_0$ to simplify some intermediate expressions. First consider the case $c(x) = 0$. This can occur only for the following values of the deformation parameters: $q = 1$, $p$ arbitrary; $p = q$, $q$ arbitrary; $p = -1$, $q$ arbitrary. (In view of the remark in Section 2.1, the last case is in fact covered by the first two.) We conclude that for $c(x) = 0$, $\mathrm{Vir}_{p,q}$ is an abelian algebra and its irreducible modules are one dimensional as described by case (i). Now, suppose $c(x) \neq 0$. In this case not all $a_n$, $n \ge 0$, vanish and $f(x)$ is a rational function. It follows from the recurrence relation (2.9) that $\lim_{x \to \infty} f(x) = \pm 1$. Since $\lim_{x \to \infty} c(x) = 0$, we must have $a_n = 0$ for $n \ge 1$, so that (2.20) becomes
$$h^2 \left( f(x) - 1 \right) = c(x). \tag{2.21}$$
In terms of modes this is simply $h^2 f_n = c_n$. We also deduce from $a_1 = 0$ that
$$h^2 = \frac{c_1}{f_1} = \frac{(1+p)^2}{p}. \tag{2.22}$$
After solving (2.21) for $f(x)$, we find that (2.9) holds if and only if $q$ and $p$ satisfy one of the conditions in (i) or (ii). In particular, in case (ii),
$$f(x) = \frac{(1 - p^{-2}x)(1 - p^{2}x)}{(1 - p^{-1}x)(1 - px)}. \tag{2.23}$$
We must still show that the resulting module is one dimensional. Let $m_0$ be the smallest $m > 0$ for which $T_{-m}|h\rangle \neq 0$. We have shown already that $T_{m_0} T_{-m_0} |h\rangle = 0$. For $n = 1, \ldots, m_0 - 1$ the relations (2.1) give
$$T_n T_{-m_0} |h\rangle = - \sum_{l=1}^{m_0 - 1} f_l\, T_{n-l}\, T_{-m_0+l} |h\rangle - h f_{m_0}\, T_{n-m_0} |h\rangle = 0, \tag{2.24}$$
where every term vanishes by the minimality of $m_0$. Thus $T_{-m_0}|h\rangle$ is a singular vector and must vanish.
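Case (ii) of Theorem 2.2 can be checked in exact arithmetic by solving the recursion (2.19). The sketch below is an illustration only, not taken from the paper: the function names and the sample value $p = 2/3$ are assumptions, and the coefficients $f_l$ are generated from the exponential form of (2.2).

```python
from fractions import Fraction


def f_coefficients(p, q, order):
    """Exact f_0..f_order from the exponential form of (2.2)."""
    t_inv = p / q  # t = q / p
    g = [Fraction(0)] + [
        (1 - q**n) * (1 - t_inv**n) / ((1 + p**n) * n)
        for n in range(1, order + 1)
    ]
    f = [Fraction(1)] + [Fraction(0)] * order
    # f = exp(g) implies l * f_l = sum_k k * g_k * f_{l-k}.
    for l in range(1, order + 1):
        f[l] = sum(k * g[k] * f[l - k] for k in range(1, l + 1)) / l
    return f


def norms(p, q, h_squared, order):
    """a_n = <h|T_n T_-n|h> from the recursion (2.19), with a_0 = h^2."""
    f = f_coefficients(p, q, order)
    zeta = -(1 - q) * (1 - p / q) / (1 - p)  # (2.3) with t^-1 = p / q
    a = [h_squared]
    for n in range(1, order + 1):
        c_n = zeta * (p**n - p**(-n))
        a.append(c_n - sum(a[n - l] * f[l] for l in range(1, n + 1)))
    return a


if __name__ == "__main__":
    p = Fraction(2, 3)
    h_squared = (1 + p) ** 2 / p  # the value (2.22)
    print(norms(p, p**3, h_squared, 5))             # case (ii): p = q^(1/3)
    print(norms(p, Fraction(1, 2), h_squared, 5))   # a generic q for comparison
```

For $q = p^3$ (equivalently $p = q^{1/3}$) and for $q = p^{-2}$ all $a_n$ with $n \ge 1$ come out as exactly zero, in line with (2.21), while for a generic $q$ only $a_1$ vanishes.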
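2.3. The $q$-Heisenberg algebra $\mathcal{H}_{p,q}$ and the free field realization of $\mathrm{Vir}_{p,q}$. The $q$-Heisenberg algebra, $\mathcal{H}_{p,q}$, is the associative algebra with generators $\{\alpha_n,\ n \in \mathbb{Z}\}$ and relations
$$[\alpha_m, \alpha_n] = m\, \frac{(1-q^m)(1-t^{-m})}{1+p^m}\, \delta_{m+n,0}. \tag{2.25}$$
Let $U(\mathcal{H}_{p,q})_{\mathrm{loc}}$ denote the local completion of $\mathcal{H}_{p,q}$ (see [8]). Furthermore, let $\omega_{\mathcal{H}}$ denote the $\mathbb{C}$-linear anti-involution of $U(\mathcal{H}_{p,q})_{\mathrm{loc}}$ defined by
$$\omega_{\mathcal{H}}(\alpha_m) = p^{-m} \alpha_{-m}, \quad m \neq 0, \qquad \omega_{\mathcal{H}}(q^{\alpha_0}) = p\, q^{-\alpha_0}. \tag{2.26}$$
Again, we can add a derivation $d$ to $\mathcal{H}_{p,q}$ defined by
$$[d, \alpha_m] = -m\, \alpha_m, \qquad m \in \mathbb{Z}. \tag{2.27}$$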
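For any $\alpha \in \mathbb{C}$ we have an $\mathcal{H}_{p,q}$ module $F(\alpha)$, the so-called Fock space, which is irreducible for generic $p$ and $q$, and decomposes as
$$F(\alpha) = \coprod_{n \ge 0} F(\alpha)_{(n)}, \tag{2.28}$$
under the action of $d$. The module $F(\alpha)_{(n)}$ has a basis indexed by partitions $\lambda \vdash n$,
$$|\lambda, \alpha\rangle \equiv \alpha_{-\lambda} |\alpha\rangle \equiv \alpha_{-\lambda_1} \cdots \alpha_{-\lambda_\ell} |\alpha\rangle, \tag{2.29}$$
and where the highest weight vector (the “vacuum”) satisfies
$$\alpha_0 |\alpha\rangle = \alpha |\alpha\rangle, \qquad \alpha_m |\alpha\rangle = 0, \quad m > 0. \tag{2.30}$$
The anti-involution $\omega_{\mathcal{H}}$ of (2.26) induces a unique contravariant bilinear form $\langle - | - \rangle_F$ on $F(\alpha') \times F(\alpha)$ such that $\langle \alpha' | \alpha \rangle = 1$, where $\alpha'$ is determined by $q^{\alpha'} = p\, q^{-\alpha}$, i.e.,
$$g_{\lambda\mu} \equiv \langle \lambda; \alpha' | \mu; \alpha \rangle_F = \langle \alpha' | \omega_{\mathcal{H}}(\alpha_{-\lambda})\, \alpha_{-\mu} | \alpha \rangle_F. \tag{2.31}$$
We compute $g_{\lambda\mu}$ explicitly in Sect. 3. We now recall the free field realization of $\mathrm{Vir}_{p,q}$.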
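Theorem 2.3 [9, 34]. We have a homomorphism of algebras $\imath : \mathrm{Vir}_{p,q} \to U(\mathcal{H}_{p,q})_{\mathrm{loc}}$ defined by
$$\imath(T(z)) = \Lambda^+(z) + \Lambda^-(z), \tag{2.32}$$
where
$$\Lambda^+(z) = p^{-\frac12} q^{\alpha_0} \exp\left( \sum_{n \ge 1} \frac{\alpha_{-n}}{n} z^n \right) \exp\left( - \sum_{n \ge 1} \frac{\alpha_n}{n} z^{-n} \right),$$
$$\Lambda^-(z) = p^{\frac12} q^{-\alpha_0} \exp\left( - \sum_{n \ge 1} \frac{\alpha_{-n}}{n} (p^{-1}z)^n \right) \exp\left( \sum_{n \ge 1} \frac{\alpha_n}{n} (p^{-1}z)^{-n} \right). \tag{2.33}$$
Note that we can write
$$\Lambda^+(z) = {:}\Lambda(z){:}, \qquad \Lambda^-(z) = {:}\Lambda(p^{-1}z)^{-1}{:}, \tag{2.34}$$
where
$$\Lambda(z) = p^{-\frac12} q^{\alpha_0} \exp\left( - \sum_{n \neq 0} \frac{\alpha_n}{n} z^{-n} \right). \tag{2.35}$$
Furthermore
$$\imath \circ \omega = \omega_{\mathcal{H}} \circ \imath. \tag{2.36}$$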
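We sketch the proof since some intermediate results will be useful in later sections.

Proof. By means of standard free field techniques we find for $|z_1| \gg |z_2|$ and $\epsilon_i \in \{\pm\}$,
$$\Lambda^{\epsilon_1}(z_1) \Lambda^{\epsilon_2}(z_2) = f^{\epsilon_1 \epsilon_2}\!\left(\frac{z_2}{z_1}\right) {:}\Lambda^{\epsilon_1}(z_1) \Lambda^{\epsilon_2}(z_2){:}, \tag{2.37}$$
with
$$f^{++}(x) = f^{--}(x) = f(x)^{-1}, \qquad f^{+-}(x) = f(p^{-1}x), \qquad f^{-+}(x) = f(px), \tag{2.38}$$
where $f(x)$ is defined in (2.2). Thus we have
$$f\!\left(\frac{z_2}{z_1}\right) T(z_1) T(z_2) = \sum_{\epsilon_1, \epsilon_2} F^{\epsilon_1 \epsilon_2}\!\left(\frac{z_2}{z_1}\right) {:}\Lambda^{\epsilon_1}(z_1) \Lambda^{\epsilon_2}(z_2){:}, \tag{2.39}$$
with
$$F^{++}(x) = F^{--}(x) = 1, \qquad F^{+-}(x) = f(x) f(p^{-1}x), \qquad F^{-+}(x) = f(x) f(px). \tag{2.40}$$
Subtracting the term with $z_1 \leftrightarrow z_2$ gives, with $x = z_2/z_1$,
$$f\!\left(\frac{z_2}{z_1}\right) T(z_1)T(z_2) - f\!\left(\frac{z_1}{z_2}\right) T(z_2)T(z_1) = \left( F^{+-}(x) - F^{-+}\!\left(\tfrac{1}{x}\right) \right) {:}\Lambda^+(z_1) \Lambda^-(z_2){:} + \left( F^{-+}(x) - F^{+-}\!\left(\tfrac{1}{x}\right) \right) {:}\Lambda^-(z_1) \Lambda^+(z_2){:}. \tag{2.41}$$
Using (2.9), with $F^{\epsilon_1\epsilon_2}(x)$ expanded in powers of $x$ and $F^{\epsilon_1\epsilon_2}(1/x)$ in powers of $1/x$, and with $\zeta$ as in (2.3), we find
$$F^{+-}(x) - F^{-+}\!\left(\tfrac{1}{x}\right) = \zeta \left( \delta(x) - \delta(p^{-1}x) \right), \qquad F^{-+}(x) - F^{+-}\!\left(\tfrac{1}{x}\right) = \zeta \left( \delta(px) - \delta(x) \right), \tag{2.42}$$
so that the $\delta(x)$ terms cancel in (2.41),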
while (2.33) gives
$${:}\Lambda^+(z) \Lambda^-(pz){:} = 1. \tag{2.43}$$
This completes the proof.
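The free field realization $\imath : \mathrm{Vir}_{p,q} \to U(\mathcal{H}_{p,q})_{\mathrm{loc}}$ equips the $U(\mathcal{H}_{p,q})_{\mathrm{loc}}$ module $F(\alpha)$ with the structure of a $\mathrm{Vir}_{p,q}$ module. In fact, from Theorem 2.3 it follows

Corollary 2.4. Let $\alpha \in \mathbb{C}$ and $p, q \in \mathbb{C}$ arbitrary. Define
$$h(\alpha) = p^{-\frac12} q^{\alpha} + p^{\frac12} q^{-\alpha}. \tag{2.44}$$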
There exists a unique homomorphism $\imath$ of $\mathrm{Vir}_{p,q}$ modules, $\imath : M(h(\alpha)) \to F(\alpha)$, such that $\imath(|h(\alpha)\rangle) = |\alpha\rangle$. The homomorphism $\imath$ is an isometry with respect to the bilinear contravariant forms defined on $M(h)$ and $F(\alpha)$.

Remark. Using (2.32) and (2.37) one can show that $\imath(T(z))$ satisfies, in the sense of meromorphically continued products of operators, the following exchange relation [9],
$$\imath(T(z))\, \imath(T(w)) = S_{TT}\!\left(\frac{w}{z}\right) \imath(T(w))\, \imath(T(z)), \tag{2.45}$$
where
$$S_{TT}(x) = f\!\left(\tfrac{1}{x}\right) f(x)^{-1}, \tag{2.46}$$
is a solution to the Yang-Baxter equation. This observation is further developed in [12], where a proposal is made for the definition of a deformed chiral algebra (DCA). In this formalism the deformed Virasoro algebra, $\mathrm{Vir}_{p,q}$, is naturally defined as a subalgebra of the DCA corresponding to the $q$-deformed Heisenberg algebra, $\mathcal{H}_{p,q}$. While some of our discussion can be naturally recast in the language of DCAs, for most of our purposes the algebraic setup of Sect. 2.1 is sufficient.

3. The Kac Determinant

An explicit formula for the Kac determinant of the Verma modules of $\mathrm{Vir}_{p,q}$ was conjectured in [34]. In this section we present its proof by using the isometry $\imath : M(h(\alpha)) \to F(\alpha)$. Let us first consider the determinant of the bilinear form on $F(\alpha)$. For partitions $\lambda, \mu \vdash n$ we have
$$g^{(n)}_{\lambda\mu} = \langle \alpha' | \omega_{\mathcal{H}}(\alpha_{-\lambda})\, \alpha_{-\mu} | \alpha \rangle_F = \delta_{\lambda\mu}\, z_\lambda\, p^{|\lambda|} \prod_{i=1}^{\ell(\lambda)} \frac{(1 - q^{\lambda_i})(1 - t^{-\lambda_i})}{1 + p^{\lambda_i}}, \tag{3.1}$$
where
$$z_\lambda = \prod_{i \ge 1} i^{m_i(\lambda)}\, m_i(\lambda)!. \tag{3.2}$$
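Theorem 3.1.
$$g^{(n)} \equiv \det g^{(n)}_{\lambda\mu} = C_n \prod_{\substack{r,s \ge 1 \\ rs \le n}} \left( p^r\, \frac{(1-q^r)(1-t^{-r})}{1+p^r} \right)^{p(n-rs)}, \tag{3.3}$$
where $C_n = \prod_{\lambda \vdash n} z_\lambda$ is a constant independent of $p$, $q$ and $q^\alpha$.

In deriving Theorem 3.1 we have used the following elementary result.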
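Lemma 3.2. Let $f$ be a (complex valued) function on $\mathbb{N}$. Then, for all $n \ge 0$,
$$\prod_{\lambda \vdash n} \prod_{i=1}^{\ell(\lambda)} f(\lambda_i) = \prod_{\lambda \vdash n} \prod_{i=1}^{n} \prod_{j_i=1}^{m_i(\lambda)} f(j_i) = \prod_{\substack{r,s \ge 1 \\ rs \le n}} f(r)^{p(n-rs)}. \tag{3.4}$$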
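Lemma 3.2 is easy to test by brute force for small $n$. The following sketch (illustrative only; the helper names and the test function $f(r) = (r+1)/(2r+1)$ are arbitrary assumptions) enumerates all partitions of $n$ and evaluates the three expressions in (3.4) exactly.

```python
from fractions import Fraction
from math import prod


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def left_side(f, n):
    """Product of f(lambda_i) over all parts of all partitions of n."""
    return prod(f(part) for lam in partitions(n) for part in lam)


def middle(f, n):
    """Product of f(1) f(2) ... f(m_i) over all multiplicities m_i."""
    total = Fraction(1)
    for lam in partitions(n):
        for i in set(lam):
            for j in range(1, lam.count(i) + 1):
                total *= f(j)
    return total


def right_side(f, n):
    """Product of f(r)^p(n - r s) over r, s >= 1 with r s <= n."""
    p = [sum(1 for _ in partitions(k)) for k in range(n + 1)]
    return prod(
        f(r) ** p[n - r * s]
        for r in range(1, n + 1)
        for s in range(1, n // r + 1)
    )


if __name__ == "__main__":
    f = lambda r: Fraction(r + 1, 2 * r + 1)
    for n in range(7):
        print(n, left_side(f, n) == middle(f, n) == right_side(f, n))
```

Taking $f(r) = p^r (1-q^r)(1-t^{-r})/(1+p^r)$ in the same check reproduces the step from (3.1) to (3.3).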
Proof. Consider the left hand side of (3.4). Fix $r \ge 1$. Consider all partitions $\lambda \vdash n$ with a row of length $r$. Since there are $p(n-r)$ partitions with at least one row of length $r$, the first such row contributes a factor of $f(r)^{p(n-r)}$. There are $p(n-2r)$ partitions with at least two rows of length $r$, so the second row of length $r$ contributes $f(r)^{p(n-2r)}$. Iterating this we conclude that the rows of length $r$ in all partitions of $n$ contribute a factor $\prod_s f(r)^{p(n-rs)}$. This proves equality of the left hand side to the right hand side. The equality of the middle formula to the right hand side is proved similarly, but now considering rows of length $s$.

We are now ready for the main result of this section.

Theorem 3.3. The Kac determinant of $M(h)_{(n)}$ is given by
$$G^{(n)} = C_n \prod_{\substack{r,s \ge 1 \\ rs \le n}} (h^2 - h_{r,s}^2)^{p(n-rs)} \left( \frac{(1-q^r)(1-t^{-r})}{1+p^r} \right)^{p(n-rs)}, \tag{3.5}$$
where
$$h_{r,s} = t^{\frac{r}{2}} q^{-\frac{s}{2}} + t^{-\frac{r}{2}} q^{\frac{s}{2}} = p^{-\frac{r}{2}} q^{\frac{r-s}{2}} + p^{\frac{r}{2}} q^{\frac{s-r}{2}}, \tag{3.6}$$
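and $C_n$ is a constant independent of $p$, $q$ and $h$.

Proof. Fix $n \in \mathbb{N}$. For partitions $\lambda, \mu \vdash n$, define a matrix $\Pi^{(n)}_{\lambda\mu} = \Pi^{(n)}_{\lambda\mu}(p, q, q^\alpha)$ by
$$\imath(T_{-\lambda}) |\alpha\rangle = \sum_{\mu \vdash n} \Pi^{(n)}_{\lambda\mu}\, \alpha_{-\mu} |\alpha\rangle, \tag{3.7}$$
and let
$$\Pi^{(n)} = \det \Pi^{(n)}_{\lambda\mu}. \tag{3.8}$$
Clearly, $\Pi^{(n)}(p, q, q^\alpha)$ is a Laurent polynomial in $p$, $q$ and $q^\alpha$. Using (2.26), we have
$$G^{(n)} = \Pi^{(n)}(p, q, p q^{-\alpha})\; g^{(n)}\; \Pi^{(n)}(p, q, q^\alpha), \tag{3.9}$$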
where $g^{(n)}$ is given in Theorem 3.1. The crucial step in the proof is the construction of a sufficient set of vanishing lines of $G^{(n)}$. From [34] it is known that for every $\alpha$ of the form $\alpha = \alpha_{r,s} = \frac12 (r-1)\beta - \frac12 (s-1)$, with $r, s \ge 1$, we can construct a singular vector in $F(\alpha'_{r,s})$ (where, as before, $\alpha' = -\alpha + 1 - \beta$) at level $rs$. This singular vector can be obtained as the image of a Fock space highest weight vector under an appropriately defined composition of screening operators and can be expressed in terms of Macdonald polynomials (see [34] for more details). Due to the non-degenerate pairing between $F(\alpha')$ and $F(\alpha)$ (for generic values of $p$, $q$ and $q^\alpha$) there must exist a vector in the cokernel of the map $\imath : M(h(\alpha_{r,s})) \to F(\alpha_{r,s})$ at level $rs$. Therefore, $\Pi^{(n)}(p, q, q^\alpha)$ has vanishing lines
$$p^{\frac{r-1}{2}} q^{\frac{s-r}{2}} q^{\alpha} - p^{-\frac{r-1}{2}} q^{\frac{r-s}{2}} q^{-\alpha},$$
and thus, because of (3.9), $G^{(n)}$ has vanishing lines
$$h^2 - h_{r,s}^2 = \left( p^{\frac{r-1}{2}} q^{\frac{s-r}{2}} q^{\alpha} - p^{-\frac{r-1}{2}} q^{\frac{r-s}{2}} q^{-\alpha} \right) \times \left( p^{-\frac{r+1}{2}} q^{\frac{r-s}{2}} q^{\alpha} - p^{\frac{r+1}{2}} q^{\frac{s-r}{2}} q^{-\alpha} \right), \tag{3.10}$$
where $h_{r,s} \equiv h(\alpha_{r,s})$, i.e., $G^{(n)}$ has a factor
$$\prod_{\substack{r,s \ge 1 \\ rs \le n}} (h^2 - h_{r,s}^2)^{p(n-rs)}.$$
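Thus, $G^{(n)}$ is given by (3.5), up to a Laurent polynomial $C_n$ in $p$, $q$ and $q^\alpha$. The proof will now be completed if we can show that $C_n$ is actually a constant. As a Laurent polynomial in $q^\alpha$, the leading term of $\Pi^{(n)}(p, q, q^\alpha)$ is easily computed. It arises from $\Pi^{(n)+}(p, q, q^\alpha) = \det \Pi^{(n)+}_{\lambda\mu}(p, q, q^\alpha)$, where
$$\Lambda^+_{-\lambda} |\alpha\rangle = \sum_{\mu} \Pi^{(n)+}_{\lambda\mu}\, \alpha_{-\mu} |\alpha\rangle. \tag{3.11}$$
This determinant can be computed by first going to a basis of $F(\alpha)_{(n)}$ given by the vectors
$$A_{-\lambda} |\alpha\rangle \equiv A_{-\lambda_1} \cdots A_{-\lambda_l} |\alpha\rangle, \tag{3.12}$$
where $\lambda$ runs over all partitions of $n$ and $A_{-m}$ is defined through
$$\exp\left( \sum_{m \ge 1} \frac{\alpha_{-m}}{m} z^m \right) = \sum_{m \ge 0} A_{-m} z^m. \tag{3.13}$$
The transition matrix between the basis (2.29) and (3.12) is obviously independent of $p$, $q$ and $q^\alpha$, and non-degenerate. In the basis (3.12) the matrix $\Pi^{(n)+}_{\lambda\mu}(p, q, q^\alpha)$ is upper triangular (i.e., $\Pi^{(n)+}_{\lambda\mu} = 0$ unless $\lambda \succeq \mu$), and the diagonal elements are easily computed. We find
$$\Pi^{(n)+}(p, q, q^\alpha) = \prod_{\lambda \vdash n} (p^{-\frac12} q^{\alpha})^{\ell(\lambda)} = \prod_{\substack{r,s \ge 1 \\ rs \le n}} (p^{-\frac12} q^{\alpha})^{p(n-rs)} = \prod_{\substack{r,s \ge 1 \\ rs \le n}} \left( p^{-\frac{r}{2}}\, p^{\frac{r-1}{2}} q^{\frac{s-r}{2}} q^{\alpha} \right)^{p(n-rs)}, \tag{3.14}$$
where we have used Lemma 3.2. Similarly, the leading term in $q^\alpha$ of $\Pi^{(n)}(p, q, p q^{-\alpha})$ arises from $\Pi^{(n)-}(p, q, q^\alpha) = \det \Pi^{(n)-}_{\lambda\mu}(p, q, q^\alpha)$, where
$$\Lambda^-_{-\lambda} |\alpha\rangle = \sum_{\mu} \Pi^{(n)-}_{\lambda\mu}\, \alpha_{-\mu} |\alpha\rangle, \tag{3.15}$$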
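and is given by
$$\Pi^{(n)-}(p, q, p q^{-\alpha}) = \prod_{\substack{r,s \ge 1 \\ rs \le n}} \left( p^{-\frac{r}{2}}\, p^{-\frac{r+1}{2}} q^{\frac{r-s}{2}} q^{\alpha} \right)^{p(n-rs)}. \tag{3.16}$$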
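This, together with the factorization (3.9), shows that the prefactor $C_n$ in (3.5) is actually independent of $p$, $q$ and $q^\alpha$.

As a consequence of the previous proof we also have

Corollary 3.4.
$$\Pi^{(n)}(p, q, q^\alpha) = D_n \prod_{\substack{r,s \ge 1 \\ rs \le n}} \left( p^{-\frac{r}{2}} \left( p^{\frac{r-1}{2}} q^{\frac{s-r}{2}} q^{\alpha} - p^{-\frac{r-1}{2}} q^{\frac{r-s}{2}} q^{-\alpha} \right) \right)^{p(n-rs)}, \tag{3.17}$$
where $D_n$ is a constant independent of $p$, $q$ and $q^\alpha$.

Proof. The Laurent polynomial
$$\Pi^{\prime(n)}(p, q, q^\alpha) = \left( \prod_{\substack{r,s \ge 1 \\ rs \le n}} p^{\frac{r}{2}\, p(n-rs)} \right) \Pi^{(n)}(p, q, q^\alpha), \tag{3.18}$$
inherits the following duality invariances from $\mathrm{Vir}_{p,q}$:
$$\Pi^{\prime(n)}(p, q, q^\alpha) = \Pi^{\prime(n)}(p^{-1}, q^{-1}, q^{-\alpha}), \qquad \Pi^{\prime(n)}(p, q, q^\alpha) = \Pi^{\prime(n)}(p^{-1}, p^{-1}q, q^{-\alpha}). \tag{3.19}$$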
This, together with the factorization (3.9), then uniquely determines $\Pi^{(n)}(p, q, q^\alpha)$.

From the Kac determinant (3.5) it follows that for generic $p$ and $q$ the category $\mathcal{O}$ of $\mathrm{Vir}_{p,q}$ modules is isomorphic to that of Vir. In particular, the construction of singular vectors [34] and a Felder type resolution of the irreducible modules [26, 21, 10] are $q$-deformations of the corresponding constructions for Vir. At $q = \sqrt[N]{1}$, however, the Kac determinant displays a large number of additional vanishing lines irrespective of the highest weight $h$. This indicates the existence of additional, $h$-independent, singular vectors, and suggests that $\mathrm{Vir}_{p,q}$ at $q = \sqrt[N]{1}$ has a large center. It is important to note, though, that Corollary 3.4 implies that the homomorphism $\imath : M(h(\alpha)) \to F(\alpha)$, for generic $p$ and $\alpha$, is a bijection even for $q = \sqrt[N]{1}$. In the following section we use this fact to establish results for $\mathrm{Vir}_{p,q}$ at $q = \sqrt[N]{1}$ by using the free field realization.

4. The Center and Representations of $\mathrm{Vir}_{p,q}$ at Roots of Unity

In this section we analyze $\mathrm{Vir}_{p,q}$ and its representations for $q$ a primitive $N$th root of unity. To illustrate the main features we first discuss the simplest case, $N = 2$, before we proceed to general $N$. In the last part of this section we make the analysis more explicit in the limit $t \to \infty$.
4.1. Representations of $\mathrm{Vir}_{p,q}$ for $q = -1$. For $q = -1$ we have
$$f(x) = \frac{1+x}{1-x}, \tag{4.1}$$
i.e., $f_0 = 1$ and $f_l = 2$ for all $l \ge 1$. Thus $\mathrm{Vir}_{p,q}$ at $q = -1$ is given by
$$[T_m, T_n] + 2 \sum_{l > 0} \left( T_{m-l} T_{n+l} - T_{n-l} T_{m+l} \right) = c_m\, \delta_{m+n,0}, \tag{4.2}$$
where (cf. (2.3))
$$c_m = -2\, \frac{1+p}{1-p}\, (p^m - p^{-m}). \tag{4.3}$$
Note that, for all $m$ and $n$, the sum over $l$ in (4.2) is actually finite.
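Theorem 4.1. For $q = -1$, $\mathrm{Vir}_{p,q}$ is equivalent to the algebra defined by the relations
$$\{T_m, T_n\} = \begin{cases} 2(-1)^{\frac{m-n}{2}}\, T_{\frac{m+n}{2}} T_{\frac{m+n}{2}} + \tilde{c}_m\, \delta_{m+n,0} & \text{for } m + n \in 2\mathbb{Z}, \\ 0 & \text{otherwise}, \end{cases} \tag{4.4}$$
where
$$\tilde{c}_m = 2 \left( p^{\frac{m}{2}} - (-1)^m p^{-\frac{m}{2}} \right)^2. \tag{4.5}$$
The sign $(-1)^m$ is needed for odd $m$: for $m = 1$ the relations (4.2) and (4.3) give $\tilde{c}_1 = c_1 = 2(1+p)^2/p$.

Proof. The proof is straightforward if one uses the identity
$$\sum_{l=0}^{m} (-1)^l f_l\, c_{m-l} = \tilde{c}_m, \tag{4.6}$$
which is proved by induction.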
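The identity (4.6) can be spot-checked in exact arithmetic. The sketch below is illustrative only and is not the authors' proof: the function names and the sample value $p = 3/5$ are assumptions. It generates $f_l$ at $q = -1$ from the exponential form of (2.2), confirms $f_l = 2$ for $l \ge 1$, and compares the left-hand side of (4.6) with $-2(-1)^m h_{m,1}^2$, where $h_{m,1}^2 = t^m q^{-1} + 2 + t^{-m} q$ follows from (3.6).

```python
from fractions import Fraction


def f_coefficients(p, q, order):
    """Exact f_0..f_order from the exponential form of (2.2)."""
    t_inv = p / q  # t = q / p
    g = [Fraction(0)] + [
        (1 - q**n) * (1 - t_inv**n) / ((1 + p**n) * n)
        for n in range(1, order + 1)
    ]
    f = [Fraction(1)] + [Fraction(0)] * order
    # f = exp(g) implies l * f_l = sum_k k * g_k * f_{l-k}.
    for l in range(1, order + 1):
        f[l] = sum(k * g[k] * f[l - k] for k in range(1, l + 1)) / l
    return f


def check_q_minus_one(p, order):
    """Return f_l at q = -1 and pairs (lhs of (4.6), -2 (-1)^m h_{m,1}^2)."""
    q = Fraction(-1)
    t = q / p
    f = f_coefficients(p, q, order)
    zeta = -(1 - q) * (1 - p / q) / (1 - p)  # (2.3)
    c = [zeta * (p**m - p**(-m)) for m in range(order + 1)]
    pairs = []
    for m in range(order + 1):
        lhs = sum((-1) ** l * f[l] * c[m - l] for l in range(m + 1))
        h_squared = t**m / q + 2 + q / t**m  # h_{m,1}^2, integer powers only
        pairs.append((lhs, -2 * (-1) ** m * h_squared))
    return f, pairs


if __name__ == "__main__":
    f, pairs = check_q_minus_one(Fraction(3, 5), 6)
    print(f)
    print(all(a == b for a, b in pairs))
```

Working with $h_{m,1}^2$ rather than $h_{m,1}$ avoids any choice of square roots of $q$ and $t$.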
Remark. It is possible to arrive at the commutators (4.4) directly, using the free field realization, by exploiting a different factorization of the exchange matrix $S_{TT}(x) = f(x)^{-1} f(1/x) = -1$, namely $S_{TT}(x) = g_+(x)^{-1} g_-(1/x)$ with $g_+(x) = -g_-(x) = 1$. See [12] for details regarding this procedure.

Note that the commutation relations (4.2) can be considered as an equation for the symmetrization of the product of two generators. In Sect. 4.3 we generalize this, in the limit $t \to \infty$, to the symmetrization of a product of $N$ generators for arbitrary $q \in \mathbb{C}$. The next result is a simple consequence of the commutation relations (4.2).

Theorem 4.2. The elements $(T_n)^2$, $n \in \mathbb{Z}$, are in the center of $\mathrm{Vir}_{p,q}$ at $q = -1$. In particular, this implies that the vectors $(T_{-n})^2 |h\rangle$ are singular for all $p, h \in \mathbb{C}$ and $n \in \mathbb{N}$.

Remark. On the basis (2.15) of $M(h)$, the action of $T_0$ is upper-triangular, i.e.,
$$T_0 |\lambda; h\rangle = (-1)^{\ell(\lambda)} h\, |\lambda; h\rangle + \sum_{\mu < \lambda} c_{\lambda\mu} |\mu; h\rangle, \tag{4.7}$$
e.g., at level 2 on the basis $\{ T_{-1} T_{-1} |h\rangle,\ T_{-2} |h\rangle \}$ we have
$$T_0 = \begin{pmatrix} h & -2 \\ 0 & -h \end{pmatrix}. \tag{4.8}$$
In particular note that $T_0$ in (4.8) is not diagonalizable for $h = 0$.
Let $M'(h)$ be the module obtained from $M(h)$ by dividing out the submodule generated by the singular vectors $(T_{-n})^2 |h\rangle$, $n \ge 1$.

Theorem 4.3. The Kac determinant $G^{\prime(n)}$ of $M'(h)_{(n)}$ is given by
$$G^{\prime(n)} = C_n \prod_{1 \le r \le n} (h^2 - h_{r,1}^2)^{q_2(r;\, n-r)}, \tag{4.9}$$
where $C_n$ is a constant independent of $p$ and $h$, and the integers $q_2(r; n)$ are determined by
$$\prod_{\substack{n \ge 1 \\ n \neq r}} (1 + x^n) = \sum_{n \ge 0} q_2(r; n)\, x^n. \tag{4.10}$$

Proof. Observe that
$$\tilde{c}_m = 2 \left( p^{\frac{m}{2}} - (-1)^m p^{-\frac{m}{2}} \right)^2 = -2(-1)^m h_{m,1}^2, \tag{4.11}$$
while the character of the submodule of $M'(h)$ generated by $T_{-r} |h\rangle$ is given by (4.10).

Theorem 4.4. Let $M'(h)$ be the quotient module defined as above.
(i) $M'(h)$ is irreducible provided $h^2 \neq h_{r,1}^2$ for all $r \ge 1$.
(ii) The character of $M'(h)$ is given by
$$\mathrm{ch}_{M'}(x) = \prod_{n \ge 1} (1 + x^n) = \prod_{n \ge 1} \frac{1 - x^{2n}}{1 - x^n}. \tag{4.12}$$
(iii) The “Witten index” of $M'(h)$ is given by
$$\mathrm{Tr}(T_0\, x^d) = h \prod_{n \ge 1} (1 - x^n). \tag{4.13}$$
Proof. (i) Follows from Theorem 4.3. (ii) In $M'(h)$ we have an (orthogonal) basis of monomials
$$T_{-\lambda_1} \cdots T_{-\lambda_\ell} |h\rangle, \qquad \lambda_1 > \ldots > \lambda_\ell > 0, \tag{4.14}$$
indexed by partitions with no equal parts. (iii) Observe that, in $M'(h)$,
$$T_0\, T_{-\lambda_1} \cdots T_{-\lambda_\ell} |h\rangle = (-1)^{\ell(\lambda)} h\; T_{-\lambda_1} \cdots T_{-\lambda_\ell} |h\rangle. \tag{4.15}$$
This concludes the proof.
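The counting behind (4.12) and (4.13) can be illustrated by enumerating partitions with distinct parts, which label the basis (4.14). The sketch below is an illustration under that assumption; the function names and the truncation order are hypothetical choices.

```python
def strict_partitions(n, largest=None):
    """Yield the partitions of n into distinct parts, cf. the basis (4.14)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in strict_partitions(n - first, first - 1):
            yield (first,) + rest


def product_one_plus(sign, order):
    """Coefficients of prod_{n>=1} (1 + sign * x^n) up to x^order."""
    series = [1] + [0] * order
    for n in range(1, order + 1):
        # Multiply in place by (1 + sign * x^n), highest degree first.
        for k in range(order, n - 1, -1):
            series[k] += sign * series[k - n]
    return series


def quotient_form(order):
    """Coefficients of prod_{n>=1} (1 - x^(2n)) / (1 - x^n) up to x^order."""
    series = [1] + [0] * order
    for n in range(1, order + 1):
        for k in range(order, 2 * n - 1, -1):  # times (1 - x^(2n))
            series[k] -= series[k - 2 * n]
        for k in range(n, order + 1):          # divided by (1 - x^n)
            series[k] += series[k - n]
    return series


if __name__ == "__main__":
    order = 10
    dims = [sum(1 for _ in strict_partitions(n)) for n in range(order + 1)]
    signed = [
        sum((-1) ** len(lam) for lam in strict_partitions(n))
        for n in range(order + 1)
    ]
    print(dims == product_one_plus(1, order) == quotient_form(order))
    print(signed == product_one_plus(-1, order))
```

Here `dims` lists the level-by-level dimensions of $M'(h)$ and `signed` the coefficients of the index (4.13) divided by $h$, the sign $(-1)^{\ell(\lambda)}$ coming from (4.15).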
Remark. Clearly, the module $M'(h)$ can be realized in terms of free (Ramond) fermions, and is isomorphic to the irreducible Vir module at $c = \frac12$ and $\Delta = \frac{1}{16}$. We thus have an action of $\mathrm{Vir}_{p,q}$, for $q = -1$, on a Virasoro minimal model module.

Now, if $h = \pm h_{m,1}$ for some $m$, then $M'(h)$ is reducible. We leave the general analysis of this situation for further study. Here, let us just remark that if in addition $t$ is a primitive $M$th root of unity$^2$, e.g., $t = \exp(\frac{2\pi i}{M})$, then if the equation
$$h = \pm h_{m,1} = \pm 2 \cos\left( \pi \left( \frac{m}{M} - \frac12 \right) \right) = \pm 2 \sin\left( \frac{\pi m}{M} \right), \tag{4.16}$$
holds for one particular $m = m_0$, it holds for all $m = m_0 \pmod M$ and $m = M - m_0 \pmod M$. It is straightforward to verify the following theorem.

Theorem 4.5. Suppose $t = \exp(\frac{2\pi i}{M})$ and $h = \pm 2 \sin(\frac{\pi m_0}{M})$ for some $m_0 \in \{0, 1, \ldots, M-1\}$. Then the vectors $T_{-m} |h\rangle$ are singular in $M'(h)$ for all $m = m_0 \pmod M$ and $m = M - m_0 \pmod M$. Let $M''(h)$ be the module obtained from $M'(h)$ by dividing out the ideal generated by these singular vectors. The module $M''(h)$ is irreducible and has character
$$\mathrm{ch}_{M''}(x) = \prod_{n \ge 1} \frac{1 + x^n}{1 + x^{m_0 + nM}}, \tag{4.17}$$
for $m_0 = 0$ or $m_0 = M/2$, and
$$\mathrm{ch}_{M''}(x) = \prod_{n \ge 1} \frac{1 + x^n}{(1 + x^{m_0 + nM})(1 + x^{(M - m_0) + nM})}, \tag{4.18}$$
otherwise.

Remark. The irreducibility of $M''(h)$ for, e.g., $M = 3$ and $h = h_{1,1}$ is in apparent conflict with case (ii) of Theorem 2.2. However, this is precisely a manifestation of the fact that the algebra is not uniquely defined for $p$ a root of $-1$.

$^2$ Note that in this case, $p$ is a root of $-1$ and thus the algebra $\mathrm{Vir}_{p,q}$ is not uniquely defined (cf. the remark in Sect. 2.1). The particular algebra we are discussing here corresponds to first taking $q \to -1$ and then $t \to \sqrt[M]{1}$, i.e., to the solution (4.1) of the recursion relations (2.9).
Remark. Note that the vectors $T_{-m} |h\rangle$ do not have to be singular in $M(h)$. For example, consider $M = 3$, $h = 0$ ($m_0 = 0$). Then
$$T_1 T_{-3} |h\rangle = 2\, T_{-1} T_{-1} |h\rangle, \tag{4.19}$$
which means that $T_{-3} |h\rangle$ is primitive, but non-singular, in $M(h)$. As a consequence, the Verma module $M(h = 0)$ for $q^2 = t^3 = 1$ has a submodule that is not generated by singular vectors, namely, the submodule generated by $T_{-3} |h\rangle$ and $T_{-1} T_{-1} |h\rangle$. This situation does not occur for Verma modules of the Virasoro algebra, but is common in the case of W-algebras [5].

4.2. Representations of $\mathrm{Vir}_{p,q}$ for $q = \sqrt[N]{1}$. In this section we consider the case where $q$ is an arbitrary primitive $N$th root of unity with $N > 2$. Our main goal is to construct explicitly the series of singular vectors in the Verma module $M(h)$ for generic $p$, that are independent of $h$, and to analyze the structure of the resulting quotient module $M'(h)$. The results, given in Theorems 4.8 and 4.12, generalize those from the previous section. However, unlike for $q = -1$, the commutation relations of $\mathrm{Vir}_{p,q}$ are extremely cumbersome for $q = \sqrt[N]{1}$, $N > 2$, and it is difficult to establish any of these results directly. For that reason we resort to the free field realization of Sect. 2.3, which considerably simplifies the entire analysis, as we now show.

Lemma 4.6. For $q = \sqrt[N]{1}$, the oscillators $\alpha_m$ with $m = 0 \bmod N$ generate the center of the $q$-Heisenberg algebra, $\mathcal{H}_{p,q}$. In particular, they therefore also commute with $\imath(T(z))$.

Proof. This result is an obvious consequence of the commutation relations (2.25) in $\mathcal{H}_{p,q}$ for $q = \sqrt[N]{1}$.

Since polynomials in the oscillators $\alpha_{-nN}$, $n \in \mathbb{N}$, commute with $\imath(T(z))$, upon acting on the vacuum they give rise to singular vectors in the Fock space $F(\alpha)$. Given that $\imath : M(h) \to F(\alpha)$ is an isomorphism (cf. Sect. 3), we conclude that there are corresponding singular vectors in the Verma module $M(h)$, that are independent of a particular value of $h$. We will now proceed to compute those vectors explicitly.

Lemma 4.7. For $q = \sqrt[N]{1}$ and $p$ generic
$$\lim_{z_i \to z q^{N-i}} \left( \prod_{i<j} f\!\left(\frac{z_j}{z_i}\right) \right) \imath\left( T(z_1) \cdots T(z_N) \right) = {:}\Lambda^+(z q^{N-1}) \cdots \Lambda^+(z){:} + {:}\Lambda^-(z q^{N-1}) \cdots \Lambda^-(z){:}. \tag{4.20}$$

Proof.
Following the steps in the proof of Theorem 2.3, we find
$$\left( \prod_{i<j} f\!\left(\frac{z_j}{z_i}\right) \right) \imath\left( T(z_1) \cdots T(z_N) \right) = \sum_{\epsilon_1, \ldots, \epsilon_N} \left( \prod_{i<j} F^{\epsilon_i \epsilon_j}\!\left(\frac{z_j}{z_i}\right) \right) {:}\Lambda^{\epsilon_1}(z_1) \cdots \Lambda^{\epsilon_N}(z_N){:}. \tag{4.21}$$
For generic $p$, $F^{\epsilon_i \epsilon_j}(x)$ has a well-defined limit for $x \to q^k$ as long as $q^k \neq 1$. It follows that the right-hand side of (4.21) has a well-defined limit for $z_i \to z q^{N-i}$. In this limit we find (cf. [12], Sect. 6.1)
$$(4.21) \to {:}\Lambda^+(z q^{N-1}) \cdots \Lambda^+(z){:} + {:}\Lambda^-(z q^{N-1}) \cdots \Lambda^-(z){:} + \sum_{k=1}^{N-1} b_{k,N}(q)\, {:}\Lambda^+(z q^{N-1}) \cdots \Lambda^+(z q^{N-k})\, \Lambda^-(z q^{N-(k+1)}) \cdots \Lambda^-(z){:}, \tag{4.22}$$
where
$$b_{k,N}(q) = \prod_{i=1}^{k} \prod_{j=k+1}^{N} F^{+-}(q^{i-j}) = \prod_{i=1}^{k} \prod_{j=k+1}^{N} \frac{(1 - q^{i-j-1})(1 - p^{-1} q^{i-j+1})}{(1 - q^{i-j})(1 - p^{-1} q^{i-j})}. \tag{4.23}$$
To establish (4.22), we first consider terms in (4.21) that have $\Lambda^-(z_i)$ to the left of some $\Lambda^+(z_j)$. Choosing the rightmost such $\Lambda^-(z_i)$ we see that there is a factor
$$\ldots F^{-+}\!\left(\frac{z_{i+1}}{z_i}\right) \ldots \Lambda^-(z_i) \Lambda^+(z_{i+1}) \ldots,$$
which vanishes in the limit $z_i \to z q^{N-i}$ because $F^{-+}(q^{-1}) = 0$. The remaining terms yield (4.22). Finally, we note that for $q = \sqrt[N]{1}$, we have
$$b_{k,N}(q) = 0, \qquad k = 1, \ldots, N-1, \tag{4.24}$$
because of the factor $(1 - q^{-N})$ that arises by setting $i = 1$ and $j = N$ in (4.23).

Now, it follows from the explicit expression (2.33) that for $q = \sqrt[N]{1}$, the terms
: 3± (zq N −1 ) . . . 3± (z) : in (4.20) have an expansion in terms of {αnN , n ∈ Z}. This observation, together with Lemma 4.7, yields the following theorem. √ Theorem 4.8. For q = N 1 and p generic Y zj f ( ) T (z1 ) . . . T (zN )|hi, (4.25) 9(z) = lim zi zi →zq N −i i<j is a well-defined generating series of singular vectors in M (h). Proof. The result of Lemma 4.7 is that ı(9(z)) is a well-defined series of singular vectors in F (α). Since ı is an isomorphism, this also proves the theorem. Although (4.25) defines 9(z) as a series in the products of modes of T (z) acting on the vacuum, interpreting this result directly within the Verma module must be done with some caution. On the one hand, upon expanding T (zi ) into power series, we find that the resulting modes Tm have both m ≤ 0 and m > 0. In the latter case one can commute those modes to the right until they annihilate the vacuum. However, as one can see from (2.1), this yields additional infinite summations. Therefore, the product T (zq N −1 ) . . . T (z) turns out to be a divergent series, even when acting on the vacuum. On √ the other hand, it follows directly from the second expression in (2.2) that, for q = N 1, (4.26) f (x)f (xq) . . . f (xq N −1 ) = 1. Since the product of f (zj /zi ) in (4.25) has a factor of f (q) . . . f (q N −1 ) = f (1)−1 = 0 in the limit, it vanishes.
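As a quick numerical sanity check of the vanishing (4.24) used in the proof of Lemma 4.7, the following Python sketch evaluates the product (4.23) at $q = e^{2\pi i/N}$. The function names and the test value of $p$ are illustrative assumptions, not part of the paper; powers of $q$ are reduced modulo $N$ so that $q^{-N} = 1$ holds exactly rather than up to rounding.

```python
import cmath

def root_power(k, N):
    """q**k for q = exp(2*pi*i/N); returns exactly 1 when N divides k."""
    k %= N
    return 1 + 0j if k == 0 else cmath.exp(2j * cmath.pi * k / N)

def b(k, N, p):
    """Evaluate b_{k,N}(q) of (4.23) at q = exp(2*pi*i/N) for a numeric p."""
    result = 1 + 0j
    for i in range(1, k + 1):
        for j in range(k + 1, N + 1):
            e = i - j  # exponent i - j, always between 1 - N and -1
            num = (1 - root_power(e - 1, N)) * (1 - root_power(e + 1, N) / p)
            den = (1 - root_power(e, N)) * (1 - root_power(e, N) / p)
            result *= num / den
    return result

# Arbitrary "generic" value of p (an assumption of this sketch).
p = 0.37 + 0.11j
for N in range(3, 8):
    for k in range(1, N):
        # The factor with i = 1, j = N contributes (1 - q^{-N}) = 0.
        assert abs(b(k, N, p)) < 1e-12
```

The denominators never vanish here because $i - j$ is never a multiple of $N$ and $|p| \neq 1$, so the zero comes entirely from the single factor singled out in the text.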
Remark. Note that, by using (4.26) and the exchange relations (2.45), the operator $T(zq^{N-1}) \cdots T(zq)T(z)$ is formally in the center of $\mathrm{Vir}_{p,q}$. However, as we have seen above, it diverges. The products of $f(z_j/z_i)$ in (4.25) provide a convenient regularizing factor (see also [12]).

Now, let us carefully expand (4.25) in modes. We have
$$\prod_{i<j} f\Big(\frac{z_j}{z_i}\Big)\, T(z_1) \cdots T(z_N)|h\rangle = \sum_{m_1,\ldots,m_N \in \mathbb{Z}} z_1^{m_1} \cdots z_N^{m_N} \sum_{l_{ij} \ge 0}\ \prod_{i<j} f_{l_{ij}} \times T_{-m_1 - l_{12} - \ldots - l_{1N}}\, T_{-m_2 + l_{12} - l_{23} - \ldots - l_{2N}} \cdots T_{-m_N + l_{1N} + \ldots + l_{N-1,N}}|h\rangle. \tag{4.27}$$
Upon introducing Littlewood's raising operators $R_{ij}$, $i < j$ [24, 27], acting on monomials as
$$R_{ij}\, T_{-m_1} \cdots T_{-m_N}|h\rangle = T_{-m_1} \cdots T_{-(m_i+1)} \cdots T_{-(m_j-1)} \cdots T_{-m_N}|h\rangle, \tag{4.28}$$
we can write (4.27) more succinctly,
$$\text{(4.27)} \;\to\; \sum_{m_1,\ldots,m_N \in \mathbb{Z}} z_1^{m_1} \cdots z_N^{m_N}\ \prod_{i<j} f(R_{ij})\ T_{-m_1} \cdots T_{-m_N}|h\rangle, \tag{4.29}$$
where
$$f(R_{ij}) = \sum_{l \ge 0} f_l\, (R_{ij})^l. \tag{4.30}$$
Note that in this notation the commutation relations (2.1) simply become
$$f(R_{12})\, T_m T_n = f(R_{12})\, T_n T_m + c_m\, \delta_{m+n,0}. \tag{4.31}$$
Now consider (4.29). Clearly, at a given level in the Verma module, say level $d$, only the terms satisfying $\sum m_i = d$ contribute. There are only a finite number of such terms that have all $m_i \ge 0$. In all other terms we can commute the $T_{-m_i}$ with $m_i < 0$ to the right, where they annihilate the vacuum. In fact, this would have been quite simple if there was no central term in (4.31), as in this case there would be a full symmetry in $m_1, \ldots, m_N$. However, because of the central term in the commutation relations, we do obtain subleading terms, i.e., with the product of a smaller number of the $T_{m_i}$. Moreover, those central terms result in infinite sums over $m_i < 0$. It is easy to see that for $|p|$ sufficiently small, one can sum up those series, and the final expression is manifestly well-defined in the limit $z_i \to q^{N-i} z$. To illustrate this procedure let us consider the simpler case with $N = 2$. Here we find
$$\sum_{m_1, m_2 \in \mathbb{Z}} z_1^{m_1} z_2^{m_2}\, f(R_{12})\, T_{-m_1} T_{-m_2}|h\rangle = \bigg( \sum_{\lambda} m_\lambda(z_1, z_2)\, f(R_{12})\, T_{-\lambda_1} T_{-\lambda_2} + c\Big(\frac{z_2}{z_1}\Big) \bigg) |h\rangle, \tag{4.32}$$
where $m_\lambda(z_1, z_2)$ are the monomial symmetric polynomials and the sum is over all partitions. For $N > 2$ it becomes considerably more difficult to carry out this calculation. In Appendix B we summarize the result for $N = 3$. Here let us only say that, as follows from the discussion above, the leading term in (4.25), i.e., the term with $N$ generators $T_n$, is given by
$$\sum_{\substack{\lambda \\ \ell(\lambda) \le N}} m_\lambda(z, zq, \ldots, zq^{N-1})\ \prod_{i<j} f(R_{ij})\ T_{-\lambda_1} \cdots T_{-\lambda_N}|h\rangle, \tag{4.33}$$
where the sum is over all partitions $\lambda$. In particular, it follows from (4.33) that the singular vectors $\Psi_{-d}|h\rangle$ for $d = mN$, $m \in \mathbb{N}$, are non-vanishing and independent, i.e., none of the $\Psi_{-d}|h\rangle$ is in the submodule generated by the other ones. The vanishing of the leading term for $d \neq 0 \bmod N$ follows from part (i) of the following lemma.

Lemma 4.9. For $q = \sqrt[N]{1}$ we have

(i)
$$m_\lambda(z, zq, \ldots, zq^{N-1}) = \begin{cases} q^{\frac{1}{2} mN(N-1)}\, \big(K(1)^{-1} K(q)\big)_{\lambda,(m^N)}\, z^{|\lambda|} & \text{if } \lambda \vdash mN \text{ for some } m \in \mathbb{Z}_{\ge 0}, \\ 0 & \text{otherwise}, \end{cases} \tag{4.34}$$
where $K_{\lambda\mu}(q)$ is the Kostka polynomial (see App. A).

(ii)
$$m_{(\lambda_1+n, \ldots, \lambda_N+n)}(z, zq, \ldots, zq^{N-1}) = m_{(\lambda_1, \ldots, \lambda_N)}(z, zq, \ldots, zq^{N-1}), \qquad \forall n \in \mathbb{N}. \tag{4.35}$$
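Proof. For (i), use the following expansion of $m_\lambda(x)$ in terms of Hall–Littlewood polynomials which immediately follows from (A.11) and (A.12) in Appendix A:
$$m_\lambda(x) = \sum_{\mu} \big(K(1)^{-1} K(q)\big)_{\lambda\mu}\, P_\mu(x; q). \tag{4.36}$$
Then use (cf. [27])
$$P_\lambda(z, zq, \ldots, zq^{N-1}; q) = q^{n(\lambda)} \begin{bmatrix} N \\ m(\lambda) \end{bmatrix} z^{|\lambda|}, \tag{4.37}$$
where
$$n(\lambda) = \sum_{i \ge 1} (i-1)\lambda_i, \tag{4.38}$$
and
$$\begin{bmatrix} N \\ m(\lambda) \end{bmatrix} \equiv \frac{(q)_N}{(q)_{m_1(\lambda)} \cdots (q)_{m_N(\lambda)}} = \begin{bmatrix} N \\ m_1(\lambda) \ldots m_N(\lambda) \end{bmatrix}. \tag{4.39}$$
Now, for $q = \sqrt[N]{1}$, the $q$-multinomial (4.39) vanishes unless $m_i(\lambda) = N$ for some $i \in \mathbb{Z}_{\ge 0}$, i.e., only the terms for which $\mu = (m^N)$ for some $m \in \mathbb{Z}_{\ge 0}$ contribute in (4.36). This proves (i). Part (ii) is obvious from the definition of $m_\lambda(x)$.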
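The vanishing statement in Lemma 4.9 (i) can be probed numerically. The sketch below is only an illustration (the helper names are hypothetical): it evaluates $m_\lambda$ at the points $(1, q, \ldots, q^{N-1})$, i.e., at $z = 1$, by brute-force summation over distinct permutations, and checks that the value vanishes whenever $|\lambda|$ is not a multiple of $N$. It does not compute the Kostka factor in (4.34).

```python
import cmath
from itertools import permutations

def monomial_sym(lam, xs):
    """m_lambda(xs): sum of x^alpha over distinct permutations alpha of lam."""
    padded = tuple(lam) + (0,) * (len(xs) - len(lam))
    total = 0
    for alpha in set(permutations(padded)):
        term = 1
        for x, a in zip(xs, alpha):
            term *= x ** a
        total += term
    return total

def partitions(n, max_len, max_part=None):
    """Yield partitions of n (non-increasing tuples) with at most max_len parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_len == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_len - 1, first):
            yield (first,) + rest

N = 4
q = cmath.exp(2j * cmath.pi / N)      # a primitive N-th root of unity
points = [q ** i for i in range(N)]   # (1, q, ..., q^{N-1}), i.e. z = 1

for d in range(1, 9):
    for lam in partitions(d, N):
        if d % N != 0:
            # "otherwise" branch of (4.34)
            assert abs(monomial_sym(lam, points)) < 1e-9
```

For $\lambda = (1^4)$ and $N = 4$ the same routine returns $q^{6} = -1$, in agreement with the prefactor $q^{\frac{1}{2} mN(N-1)}$ in (4.34) for $m = 1$.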
We have not succeeded in finding a more explicit, but still tractable, expression for the singular vectors of (4.25) for arbitrary $N$. From the explicit expressions of some singular vectors at $q = \sqrt[N]{1}$, $N = 3, 4$, in Appendix B, it is however clear that the expressions drastically simplify in the limit $t \to \infty$ (or, equivalently, $t \to 0$). Indeed, in all examples only the leading term $(T_{-n})^N|h\rangle$ survives. In Sect. 4.3 we analyze this limit in more detail.

For $q = \sqrt[N]{1}$, let $H'_{p,q}$ denote the reduced $q$-Heisenberg algebra, i.e., $H_{p,q}$ with the oscillators $\{\alpha_n \mid n = 0 \bmod N\}$ removed, and denote by $F'(\alpha)$ the Fock space of $H'_{p,q}$.

Theorem 4.10. For $q = \sqrt[N]{1}$, and generic $p$, we have a realization of $\mathrm{Vir}_{p,q}$ on the sub Fock space $F'(\alpha) \subset F(\alpha)$. This realization is irreducible for generic $\alpha, p \in \mathbb{C}$ and the character is given by
$$\mathrm{ch}_{F'}(x) = \prod_{n \ge 1} \frac{1 - x^{nN}}{1 - x^n} = \sum_{n \ge 0} p_N(n)\, x^n, \tag{4.40}$$
where $p_N(n)$ is the number of partitions of $n$ with parts not equal to a multiple of $N$.

The proof of this theorem is clear except for the irreducibility of $F'(\alpha)$. This will follow from the result of Theorem 4.12. Let $M'(h)$ denote the module obtained from $M(h)$ by dividing out the submodule generated by the singular vectors of (4.25). To investigate the irreducibility of $M'(h)$ we make use of the following

Theorem 4.11. The Kac determinant of the module $M'(h)$ is given by
$$\widetilde{G}'^{(n)} = C_n \prod_{\substack{r \ge 1,\ 1 \le s \le N-1 \\ rs \le n}} (h^2 - h_{r,s}^2)^{q_N(r,s;n-rs)} \prod_{\substack{r,s \ge 1,\ rs \le n \\ r \neq 0 \bmod N}} \bigg( \frac{1 - t^{-r}}{1 + p^r} \bigg)^{p_N(n-rs)}, \tag{4.41}$$
where $C_n$ is a constant independent of $p$ and $h$, $p_N(n)$ is defined in (4.40), and the integers $q_N(r,s;n)$ are determined by
$$(1 + x^r + \ldots + x^{r(N-1-s)}) \prod_{\substack{n \ge 1 \\ n \neq r}} (1 + x^n + \ldots + x^{n(N-1)}) = \sum_{n \ge 0} q_N(r,s;n)\, x^n. \tag{4.42}$$

Proof. The proof is completely analogous to the proof of Theorem 3.3. In particular, the second term arises from the Kac determinant $g'^{(n)}$ of $F'(\alpha)$ while the first term is a remnant of the vanishing lines (3.10). Note that $h_{r,s} = \pm h_{r,s+N}$ for $q = \sqrt[N]{1}$.

The generalization of Theorem 4.4 reads

Theorem 4.12. Let $q = \sqrt[N]{1}$ and let $M'(h)$ be defined as above, then

(i) $M'(h)$ is irreducible provided $h^2 \neq h_{r,s}^2$ for all $r \ge 1$ and $1 \le s \le N-1$.

(ii) The character of $M'(h)$ is given by
$$\mathrm{ch}_{M'}(x) = \prod_{n \ge 1} (1 + x^n + \ldots + x^{n(N-1)}). \tag{4.43}$$
Proof of Theorem 4.10. It remains to prove the irreducibility of $F'(\alpha)$ for generic $\alpha$ and $p$. This follows from the irreducibility of $M'(h(\alpha))$ (Theorem 4.12 (i)) and the equality of characters
$$\prod_{n \ge 1} \frac{1 - x^{nN}}{1 - x^n} = \prod_{n \ge 1} (1 + x^n + \ldots + x^{n(N-1)}), \tag{4.44}$$
which imply that $\imath : M'(h(\alpha)) \to F'(\alpha)$ is an isomorphism.
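The equality of characters (4.44) says that partitions of $n$ with no part divisible by $N$ are equinumerous with partitions of $n$ in which no part is repeated more than $N-1$ times. A small Python sketch (function names are illustrative, not from the paper) confirms this level by level for a few values of $N$ by expanding both products as power series with integer coefficients.

```python
def count_no_multiples(n, N):
    """Coefficient of x^n in prod_k (1 - x^{kN}) / (1 - x^k), i.e. p_N(n) of (4.40)."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        if part % N == 0:
            continue  # parts divisible by N are not allowed
        for total in range(part, n + 1):
            ways[total] += ways[total - part]  # unlimited copies of `part`
    return ways[n]

def count_bounded_multiplicity(n, N):
    """Coefficient of x^n in prod_k (1 + x^k + ... + x^{k(N-1)})."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        new = [0] * (n + 1)
        for total in range(n + 1):
            for copies in range(N):  # 0, 1, ..., N-1 copies of `part`
                if copies * part > total:
                    break
                new[total] += ways[total - copies * part]
        ways = new
    return ways[n]

for N in range(2, 6):
    for n in range(0, 25):
        assert count_no_multiples(n, N) == count_bounded_multiplicity(n, N)
```

For $N = 2$ this is Euler's classical identity between partitions into odd parts and partitions into distinct parts.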
It is useful to have an explicit expression for the image of a Verma module monomial $T_{-\lambda}|h\rangle$ under the map $\imath : M(h(\alpha)) \to F(\alpha)$. To describe the result, let us identify $F(\alpha)$ with the ring of symmetric functions $\Lambda \otimes \mathbb{Q}(p,q)$ over $\mathbb{Q}(p,q)$ through the isomorphism $\jmath$,
$$\jmath(\alpha_{-\lambda_1} \cdots \alpha_{-\lambda_n}|\alpha\rangle) = p_\lambda, \tag{4.45}$$
where $p_\lambda$ are the power sum symmetric functions (see Appendix A). We then have

Theorem 4.13. Let $\lambda$ be a partition of length $\ell(\lambda) = n$, then
$$\imath(T_{-\lambda_1} \cdots T_{-\lambda_n}|h\rangle) = \sum_{\epsilon_1,\ldots,\epsilon_n} \Lambda^{\epsilon_1}_{-\lambda_1} \cdots \Lambda^{\epsilon_n}_{-\lambda_n}|\alpha\rangle, \tag{4.46}$$
while
$$\Lambda^{\epsilon_1}_{-\lambda_1} \cdots \Lambda^{\epsilon_n}_{-\lambda_n}|\alpha\rangle = (p^{-\frac{1}{2}} q^{\alpha})^{\sum_i \epsilon_i}\ \prod_{i<j} f(R_{ij})^{-\epsilon_i \epsilon_j}\ h^{\epsilon_1}_{\lambda_1}(x) \cdots h^{\epsilon_n}_{\lambda_n}(x), \tag{4.47}$$
where
$$h^+_n(x) = h_n(x), \qquad h^-_n(x) = p^{-n} e_n(-x), \tag{4.48}$$
and $f(x)$ as in (2.2). We recall that $h_n(x)$ and $e_n(x)$ are, respectively, the completely symmetric functions and the elementary symmetric functions. [The expression (4.47) is to be understood as follows: first replace $h^{\epsilon_i}_{n_i}(x)$ by the appropriate expression in (4.48), then act with the $f(R_{ij})$ on any combination of $h_n$'s and $e_n$'s as in (4.28). In particular, the $f(R_{ij})$ act only on the subscript of the symmetric function, and not on the prefactor $p^{-n}$.]

Proof. As in [20], Proposition 3.9.
As an application of Theorem 4.13, consider the leading order term (4.33) of the singular vector $\Psi(z)$. Applying the map $\imath$ we see that the term of leading order in $q^\alpha$ (i.e., the term of order $O((q^\alpha)^N)$) is proportional to
$$\sum_{\substack{\lambda \\ \ell(\lambda) = N}} m_\lambda(z, zq, \ldots, zq^{N-1})\, h_\lambda(x) = \sum_{m_1,\ldots,m_N \in \mathbb{Z}_{\ge 0}} h_{m_1}(x)\, h_{m_2}(xq) \cdots h_{m_N}(xq^{N-1})\, z^{m_1 + \ldots + m_N} = \sum_{m \ge 0} h_m(x^N)\, z^{mN}, \tag{4.49}$$
where, in the last step, we have used
$$\prod_{i=1}^{N} H(xq^{i-1}; t) = H(x^N; t^N). \tag{4.50}$$
Now,
$$h_m(x^N) = \sum_{\lambda \vdash m} z_\lambda^{-1}\, p_\lambda(x^N) = \sum_{\lambda \vdash m} z_\lambda^{-1}\, p_{N\lambda}(x), \tag{4.51}$$
so that indeed (cf. (4.45)) the vector (4.33) gives rise to a singular vector at the leading order in $q^\alpha$. A similar computation also shows that the leading order in $q^{-\alpha}$ works out. The subleading terms in (4.25) are required to make the subleading orders in $q^\alpha$ work. As a second application of Theorem 4.13 one can verify that, for $q = -1$, the vectors $(T_{-n})^2|h\rangle$ are indeed singular (cf. Theorem 4.2).

4.3. Representations of $\mathrm{Vir}_{p,q}$ for $q = \sqrt[N]{1}$ in the limit $t \to \infty$. In the previous section we have analyzed $\mathrm{Vir}_{p,q}$ and its representations for $q = \sqrt[N]{1}$ and generic $p$. In this section we will make the analysis even more explicit in the limit $t \to \infty$ (or, equivalently, the limit $t \to 0$), where, as we have emphasized already, we expect some drastic simplifications. At the same time the limit $t \to \infty$ is a generic point, in the sense that the structure of the algebra and the modules is similar to that at other generic values of $t$, e.g., multiplicities of singular vectors at $t = \infty$ are equal to the multiplicities at a generic $t$. It turns out that the most efficient method of studying this case is to exploit an interesting connection to Hall–Littlewood polynomials, whose basic properties have been summarized in Appendix A.

The deformed Virasoro algebra $\mathrm{Vir}_{p,q}$ at $t = \infty$, denoted by $\widetilde{\mathrm{Vir}}_q$, is generated by $\{\widetilde{T}_n,\ n \in \mathbb{Z}\}$, where
$$\widetilde{T}_m = \lim_{t \to \infty} T_m\, p^{\frac{|m|}{2}}, \tag{4.52}$$
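satisfy the relations given by the following theorem (cf. [4]):

Theorem 4.14. In the limit $t \to \infty$ we have
$$[\widetilde{T}_m, \widetilde{T}_n]_q = (q - q^{-1}) \sum_{l \ge 1} q^l\, \widetilde{T}_{n-l} \widetilde{T}_{m+l} + (1-q)\, \delta_{m+n,0}, \qquad \text{if } m > 0 > n, \tag{4.53}$$
$$[\widetilde{T}_m, \widetilde{T}_n]_q = -(1-q) \sum_{l=1}^{m-n-1} \widetilde{T}_{m-l} \widetilde{T}_{n+l}, \qquad \text{if } m > n > 0 \text{ or } 0 > m > n, \tag{4.54}$$
$$[\widetilde{T}_0, \widetilde{T}_m]_q = -(1-q) \sum_{l=1}^{-m-1} \widetilde{T}_{-l} \widetilde{T}_{m+l} + (q - q^{-1}) \sum_{l \ge 1} q^l\, \widetilde{T}_{m-l} \widetilde{T}_l, \qquad \text{if } 0 > m, \tag{4.55}$$
$$[\widetilde{T}_m, \widetilde{T}_0]_q = -(1-q) \sum_{l=1}^{m-1} \widetilde{T}_{m-l} \widetilde{T}_l + (q - q^{-1}) \sum_{l \ge 1} q^l\, \widetilde{T}_{-l} \widetilde{T}_{m+l}, \qquad \text{if } m > 0, \tag{4.56}$$
where $[-,-]_q$ is the $q$-commutator,
$$[x, y]_q = xy - q\,yx. \tag{4.57}$$

Proof. By a direct expansion of the commutation relation (2.1) to the leading order in $t$ using Lemma 4.15 below.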
Lemma 4.15. For $t \to \infty$ we have $f_0 = 1$, and
$$f_l = (1-q) + O(t^{-1}), \tag{4.58}$$
$$f_l - f_{l+m} = -q^{2l-1} (1 - q^2)\, t^{-l} + O(t^{-l-1}), \tag{4.59}$$
for all $m, l \ge 1$.

Remark. Note that (4.58) is equivalent to
$$f(x) = \frac{1 - qx}{1 - x} + O(t^{-1}). \tag{4.60}$$

Proof. By an explicit expansion of (2.2).
Let us point out some obvious simplifications of the relations (4.53)–(4.56) over (2.1). First, unlike in $\mathrm{Vir}_{p,q}$, we have subalgebras $\widetilde{\mathrm{Vir}}^{\pm}_q$ of $\widetilde{\mathrm{Vir}}_q$ generated by the $\widetilde{T}_m$ with $\pm m > 0$. The relation (4.54) that defines those subalgebras has only a finite number of terms on the right-hand side. Moreover, (4.54) is invariant under the "shift transformation" induced by $(m, n) \to (m+k, n+k)$, $k \in \mathbb{Z}$, as long as one remains within a given subalgebra. Secondly, the commutation relations between the positive and negative mode generators do not produce the $\widetilde{T}_0$, and the right hand side of (4.53) is already in ordered form. Finally, the form of (4.55) and (4.56) suggests that one should be able to represent $\widetilde{T}_0$ as a sum $\widetilde{T}_0 = \widetilde{T}_0^+ + \widetilde{T}_0^-$, where $\widetilde{T}_0^{\pm}$ extend $\widetilde{\mathrm{Vir}}^{\pm}_q$ by a zero mode generator.

Now, let us consider the Verma module $\widetilde{M}(h) = \coprod_{n \ge 0} \widetilde{M}(h)^{(n)}$, in which we introduce a basis
$$|\lambda; h\rangle = \widetilde{T}_{-\lambda_1} \cdots \widetilde{T}_{-\lambda_\ell}|h\rangle, \qquad \lambda \vdash n, \tag{4.61}$$
of $\widetilde{M}(h)^{(n)}$ as in (2.15).

Lemma 4.16 (cf. [20], Proposition 2.20). Let $\lambda, \mu \vdash n$. Then $\widetilde{G}^{(n)}_{\lambda\mu} \equiv \langle\lambda; h|\mu; h\rangle$ is given by
$$\widetilde{G}^{(n)}_{\lambda\mu} = \delta_{\lambda\mu}\, b_\lambda(q), \tag{4.62}$$
where $b_\lambda(q) = \prod_{i \ge 1} (q)_{m_i(\lambda)}$, and is independent of $h$.

Proof. Consider the reverse lexicographic ordering on the set of partitions of $n$. It is clear that $\langle\lambda; h|\mu; h\rangle = 0$ for $\lambda > \mu$. Indeed, start by moving $T_{\lambda_1}$ to the right. This will produce terms with $T_m$, $m > \lambda_1$ and eventually give a vanishing contribution when acting on the vacuum; unless $T_{\lambda_1}$ is killed by a corresponding $T_{-\lambda_1}$, in which case we have $\lambda_1 = \mu_1$. And so on. Because of symmetry, we also have $\langle\lambda; h|\mu; h\rangle = 0$ for $\lambda < \mu$. The diagonal terms are then easily calculated. The $h$-independence is a consequence of the second simplification as discussed above.

It immediately follows from Lemma 4.16 that the Kac determinant at level $n$ is given by
$$\widetilde{G}^{(n)} = \prod_{\lambda \vdash n}\ \prod_{i \ge 1} (q)_{m_i(\lambda)} = \prod_{\substack{r,s \ge 1 \\ rs \le n}} (1 - q^r)^{p(n-rs)}, \tag{4.63}$$
where we have used Lemma 3.2. One can verify that (4.63), indeed, agrees with the leading order behaviour of the Kac determinant of Theorem 3.3.
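The product identity in (4.63) can be tested exactly at a rational value of $q$. The following sketch is an illustration only (helper names are hypothetical, and the choice $q = 2/7$ is arbitrary): it multiplies the diagonal Gram entries $b_\lambda(q)$ of (4.62) over all partitions of $n$ and compares with the right-hand side of (4.63), using exact rational arithmetic.

```python
from collections import Counter
from fractions import Fraction

def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def q_factorial(m, q):
    """(q)_m = (1 - q)(1 - q^2)...(1 - q^m)."""
    out = Fraction(1)
    for k in range(1, m + 1):
        out *= 1 - q ** k
    return out

def b(lam, q):
    """b_lambda(q) = prod_i (q)_{m_i(lambda)}, the diagonal entry in (4.62)."""
    out = Fraction(1)
    for mult in Counter(lam).values():
        out *= q_factorial(mult, q)
    return out

def kac_lhs(n, q):
    """Product of b_lambda(q) over all partitions lambda of n."""
    out = Fraction(1)
    for lam in partitions(n):
        out *= b(lam, q)
    return out

def kac_rhs(n, q):
    """prod_{r,s >= 1, rs <= n} (1 - q^r)^{p(n - rs)}."""
    p = [sum(1 for _ in partitions(k)) for k in range(n + 1)]
    out = Fraction(1)
    for r in range(1, n + 1):
        for s in range(1, n // r + 1):
            out *= (1 - q ** r) ** p[n - r * s]
    return out

q = Fraction(2, 7)
for n in range(1, 9):
    assert kac_lhs(n, q) == kac_rhs(n, q)
```

The same `b` function makes the mechanism behind Theorem 4.17 visible in the simplest case $N = 2$: `b((1, 1), Fraction(-1))` is zero because $(q)_2$ contains the factor $1 - q^2$, while `b((2, 1), Fraction(-1))` is not.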
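Theorem 4.17. Let $q = \sqrt[N]{1}$. The vectors
$$|(n^N); h\rangle = (\widetilde{T}_{-n})^N|h\rangle, \qquad n \in \mathbb{N}, \tag{4.64}$$
are singular for any $h \in \mathbb{C}$.

Proof. From Lemma 4.16 it follows that, for $q = \sqrt[N]{1}$, the vectors $(\widetilde{T}_{-n})^N|h\rangle$ are orthogonal to any vector in $\widetilde{M}(h)$, i.e., they are in the radical of $\widetilde{M}(h)$, and hence are primitive.$^3$ However, from the explicit commutation relations (4.53) it easily follows that $\widetilde{T}_m (\widetilde{T}_{-n})^N|h\rangle = 0$ for $m > 0$. Indeed, commuting $\widetilde{T}_m$ to the right will not produce terms containing $\widetilde{T}_{-p}$ for $p < n$, hence $\widetilde{T}_m (\widetilde{T}_{-n})^N|h\rangle$ cannot be in the submodule generated by the $(\widetilde{T}_{-p})^N|h\rangle$ with $p < n$. In other words, it has to vanish.

$^3$ See, e.g., [5] for the definition and properties of primitive vectors.

As expected, the existence of $h$-independent singular vectors is a consequence of a large center of $\widetilde{\mathrm{Vir}}_q$, as described by the following main theorem of this section.

Theorem 4.18. For $q = \sqrt[N]{1}$, $N > 2$, the center of $\widetilde{\mathrm{Vir}}_q$ is generated by the elements $(\widetilde{T}_n)^N$, $n \in \mathbb{Z} \setminus \{0\}$.

One can prove this theorem directly using the relations (4.53)–(4.56). We have summarized this rather tedious argument in Appendix C. A more elegant proof will be given in due course using the free field realization and symmetric functions.

The free field realization of $\widetilde{\mathrm{Vir}}_q$ is obtained by taking the limit $t \to \infty$ in the free field realization of Sect. 2.3, after rescaling the generators of $H_{p,q}$,
$$q^{\beta_0} = p^{-\frac{1}{2}} q^{\alpha_0}, \qquad \beta_m = -\alpha_m\, p^{\frac{m}{2}}, \quad m \neq 0. \tag{4.65}$$
In terms of the $q$-Heisenberg algebra $\widetilde{H}_q$,
$$[\beta_m, \beta_n] = m\, (1 - q^{|m|})\, \delta_{m+n,0}, \tag{4.66}$$
we have a realization $\tilde{\imath} : \widetilde{\mathrm{Vir}}_q \to \widetilde{H}_q$ defined by
$$\tilde{\imath}(\widetilde{T}_m) = \widetilde{\Lambda}^+_m, \qquad \tilde{\imath}(\widetilde{T}_{-m}) = \widetilde{\Lambda}^-_{-m}, \qquad m > 0, \qquad\quad \tilde{\imath}(\widetilde{T}_0) = \widetilde{\Lambda}^+_0 + \widetilde{\Lambda}^-_0, \tag{4.67}$$
where
$$\widetilde{\Lambda}^+(z) = \lim_{t \to \infty} \Lambda^+(z p^{-\frac{1}{2}}) = q^{\beta_0}\, {:}\exp\Big( \sum_{n \neq 0} \frac{\beta_n}{n} z^{-n} \Big){:}\,, \qquad \widetilde{\Lambda}^-(z) = \lim_{t \to \infty} \Lambda^-(z p^{\frac{1}{2}}) = q^{-\beta_0}\, {:}\exp\Big( -\sum_{n \neq 0} \frac{\beta_n}{n} z^{-n} \Big){:}\,. \tag{4.68}$$
The contraction identities now read (cf. (2.37))
$$\widetilde{\Lambda}^{\epsilon_1}(z_1)\, \widetilde{\Lambda}^{\epsilon_2}(z_2) = \tilde{f}^{\epsilon_1 \epsilon_2}\Big(\frac{z_2}{z_1}\Big)\ {:}\widetilde{\Lambda}^{\epsilon_1}(z_1)\, \widetilde{\Lambda}^{\epsilon_2}(z_2){:}\,, \tag{4.69}$$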
with
$$\tilde{f}^{+-}(x) = \tilde{f}^{-+}(x) = \tilde{f}(x), \qquad \tilde{f}^{++}(x) = \tilde{f}^{--}(x) = \tilde{f}(x)^{-1}, \tag{4.70}$$
where (cf. (4.60)),
$$\tilde{f}(x) = \frac{1 - qx}{1 - x}. \tag{4.71}$$
The modes of $\widetilde{\Lambda}^+(z)$ and $\widetilde{\Lambda}^-(z)$ satisfy the commutation relations
$$[\widetilde{\Lambda}^+_m, \widetilde{\Lambda}^+_n]_q = -[\widetilde{\Lambda}^+_{n+1}, \widetilde{\Lambda}^+_{m-1}]_q, \tag{4.72}$$
$$[\widetilde{\Lambda}^+_m, \widetilde{\Lambda}^-_n]_q = -[\widetilde{\Lambda}^-_{n-1}, \widetilde{\Lambda}^+_{m+1}]_q + (1-q)^2\, \delta_{m+n,0}, \tag{4.73}$$
$$[\widetilde{\Lambda}^-_m, \widetilde{\Lambda}^-_n]_q = -[\widetilde{\Lambda}^-_{n+1}, \widetilde{\Lambda}^-_{m-1}]_q. \tag{4.74}$$
Note that (4.72) and (4.74) are both equivalent to (4.54) but with $m, n \in \mathbb{Z}$.

Remark. The algebra of the vertex operators $\widetilde{\Lambda}^{\pm}(z)$ has been studied in [20] (and in [19] for $q = -1$) in the context of the Hall–Littlewood symmetric functions. In (4.66) we have chosen a slightly different normalization for $\widetilde{H}_q$, that is more suitable to discuss the case $q = \sqrt[N]{1}$ than the one in [20].

Let $F(\beta)$ be the Fock space of the Heisenberg algebra $\widetilde{H}_q$. Since the structure of $\widetilde{\mathrm{Vir}}_q$ modules is independent of $h$, we can set $\beta = 0$ without loss of generality. As in Sect. 4.2 we identify $F(0)$ with the space of symmetric functions $\Lambda[q]$ by means of
$$\tilde{\jmath}(\beta_{-\lambda_1} \cdots \beta_{-\lambda_\ell}|0\rangle) = p_\lambda(x). \tag{4.75}$$
Note that $\tilde{\jmath}$ is an isometry in the scalar product on $\Lambda[q]$ defined by (A.7). The specialization of Theorem 4.13 to the case $t \to \infty$ gives an explicit identification of $\tilde{\imath}(\widetilde{M}(h))$ with symmetric functions (cf. [20], Proposition 3.9).

Theorem 4.19. For any partition $\lambda$, we have
$$\tilde{\jmath}(\widetilde{\Lambda}^-_{-\lambda_1} \cdots \widetilde{\Lambda}^-_{-\lambda_n}|0\rangle) = Q'_\lambda(x; q), \tag{4.76}$$
where
$$Q'_\lambda(x; q) = \prod_{i<j} \frac{1 - R_{ij}}{1 - qR_{ij}}\ h_{\lambda_1}(x) \cdots h_{\lambda_n}(x), \tag{4.77}$$
is a Milne polynomial.

Proof. For a definition of Milne polynomials, see Appendix A. The proof is essentially the same as in [20] and can be found in [15].

Remark. Note that (4.76) makes sense for any sequence $\lambda = (\lambda_1, \ldots, \lambda_n)$, where the $\lambda_i$ are nonnegative, but not necessarily in descending order. Thus one can use (4.76) to define $Q'_\lambda(x; q)$ for such a general sequence $\lambda$. In Appendix A we have introduced Milne polynomials for arbitrary sequences using their relation to Hall–Littlewood polynomials. It is easy to see that the two extensions are exactly the same, since the reordering identity (A.15) is identical with the commutation relations (4.72) (cf. [27], Example 2, p. 213). Moreover, (4.77) holds in this more general situation.

A consequence of Theorem 4.19 is the following result, which explicitly relates two types of ordering of products of modes of the vertex operators $\widetilde{\Lambda}^{\pm}(z)$.
Corollary 4.20.
$${:}\widetilde{\Lambda}^-(z_1) \cdots \widetilde{\Lambda}^-(z_n){:} = \sum_{\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n} P_\lambda(z; q)\ \widetilde{\Lambda}^-_{-\lambda_1} \cdots \widetilde{\Lambda}^-_{-\lambda_n}, \tag{4.78}$$
where the sum is over all ordered sequences $\lambda$ (not necessarily positive).

Proof. Write
$${:}\widetilde{\Lambda}^-(z_1) \cdots \widetilde{\Lambda}^-(z_n){:} = \sum_{\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n} a_\lambda(z; q)\ \widetilde{\Lambda}^-_{-\lambda_1} \cdots \widetilde{\Lambda}^-_{-\lambda_n}. \tag{4.79}$$
Using (4.68) we have
$${:}\widetilde{\Lambda}^-(z_1) \cdots \widetilde{\Lambda}^-(z_n){:}\,|0\rangle = \exp\Big( \sum_{k \ge 1} \frac{\beta_{-k}}{k} (z_1^k + \ldots + z_n^k) \Big)|0\rangle\ \overset{\tilde{\jmath}}{=}\ \exp\Big( \sum_{k \ge 1} \frac{1}{k}\, p_k(x)\, p_k(z) \Big) = \sum_{\lambda} P_\lambda(z; q)\, Q'_\lambda(x; q), \tag{4.80}$$
where, in addition, we have used
$$\exp\Big( \sum_{k \ge 1} \frac{1}{k}\, p_k(x)\, p_k(z) \Big) = \prod_{i,j} \frac{1}{1 - x_i z_j}, \tag{4.81}$$
and the completeness relation (A.26). Now, on the one hand, Theorem 4.19 implies that, for partitions $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n \ge 0$ we have $a_\lambda(z; q) = P_\lambda(z; q)$. On the other hand, from the "shift invariance" of the commutators (4.72), it follows that $a_{\lambda_1+k,\ldots,\lambda_n+k}(z; q) = (z_1 \cdots z_n)^k\, a_{\lambda_1,\ldots,\lambda_n}(z; q)$ for all $k \in \mathbb{Z}$. This proves the corollary.

We may now return to the main subject of this section, which is the center of $\widetilde{\mathrm{Vir}}_q$ and the proof of Theorem 4.18.

Proof of Theorem 4.18. Let $q = \sqrt[N]{1}$. Consider the expansion (4.78) in Corollary 4.20 for $n = N$ and $z_i = zq^{i-1}$, $i = 1, \ldots, N$. By the same argument as in the proof of Lemma 4.9, we conclude that the coefficients $P_\lambda(z, zq, \ldots, zq^{N-1}; q)$ vanish, unless $m_i(\lambda) = N$ for some $i \in \mathbb{Z}_{\ge 0}$, i.e., only the terms for which $\lambda = (m^N)$ for some $m \in \mathbb{Z}$ contribute to the expansion. Thus we find
$${:}\widetilde{\Lambda}^-(zq^{N-1}) \cdots \widetilde{\Lambda}^-(zq)\, \widetilde{\Lambda}^-(z){:} = \sum_{m \in \mathbb{Z}} q^{\frac{1}{2} mN(N-1)}\, (\widetilde{\Lambda}^-_{-m})^N\, z^{mN}. \tag{4.82}$$
Since $1 + q^n + \ldots + q^{n(N-1)} = 0$ for $n \neq 0 \bmod N$, we also have
$${:}\widetilde{\Lambda}^-(zq^{N-1}) \cdots \widetilde{\Lambda}^-(zq)\, \widetilde{\Lambda}^-(z){:} = {:}\exp\Big( \sum_{k \neq 0} \frac{\beta_{-kN}}{k}\, z^{kN} \Big){:}\,. \tag{4.83}$$
This shows that $(\widetilde{\Lambda}^-_m)^N$, $m \in \mathbb{Z}$, are in the center of $\widetilde{H}_q$. Since the free field realization acts faithfully on $F(0)$, this also shows that (cf. (4.67)) $(T_m)^N$, $m < 0$, are in the center of $\widetilde{\mathrm{Vir}}_q$, while repeating the same analysis for $\widetilde{\Lambda}^+(z)$ extends this claim to $(T_m)^N$, $m > 0$.
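Two elementary root-of-unity facts carry this part of the argument: the geometric sum $1 + q^n + \ldots + q^{n(N-1)}$ vanishes unless $N$ divides $n$, and the prefactor $q^{\frac{1}{2} mN(N-1)}$ in (4.82) reduces to the sign $(-1)^{m(N-1)}$ that reappears in (4.86). The following Python sketch, with illustrative function names and the specific primitive root $q = e^{2\pi i/N}$ taken as an assumption, checks both numerically.

```python
import cmath

def geometric_sum(n, N):
    """1 + q^n + ... + q^{n(N-1)} for q = exp(2*pi*i/N)."""
    q = cmath.exp(2j * cmath.pi / N)
    return sum(q ** (n * i) for i in range(N))

def prefactor(m, N):
    """q^{m N (N-1) / 2} for q = exp(2*pi*i/N); N(N-1) is even, so the exponent is an integer."""
    q = cmath.exp(2j * cmath.pi / N)
    return q ** (m * N * (N - 1) // 2)

for N in range(2, 9):
    for n in range(-12, 13):
        expected = N if n % N == 0 else 0
        assert abs(geometric_sum(n, N) - expected) < 1e-9
    for m in range(-4, 5):
        assert abs(prefactor(m, N) - (-1) ** (m * (N - 1))) < 1e-9
```

Only multiples of $N$ survive in the exponent of the normal-ordered product, which is why only the central oscillators $\beta_{kN}$ appear in (4.83).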
One can check explicitly that, except for $N = 2$, the $(T_0)^N$ are not in the center, see the example in Appendix C. It remains to verify that we have identified the entire center. In fact this follows from the irreducibility of the Verma module $M'(h)$, which is proved by taking the $t \to \infty$ limit in Theorem 4.12, or directly from the Kac determinant in (4.63). This concludes the proof of the theorem.

As a consequence of Theorems 4.18 and 4.19, we obtain the following interesting identity for Milne polynomials at $q = \sqrt[N]{1}$, which was first discussed in [22, 23] (see also [27], p. 234):

Corollary 4.21. Let $q = \sqrt[N]{1}$ and $n \in \mathbb{N}$, then
$$Q'_{\lambda \cup (n^N)}(x; q) = Q'_\lambda(x; q)\, Q'_{(n^N)}(x; q), \tag{4.84}$$
for any partition $\lambda$.

Proof. Suppose $\lambda_i \ge n$ for $i \le k$ and $\lambda_i < n$ for $i \ge k+1$. Then, by Theorem 4.18, we have
$$\widetilde{\Lambda}^-_{-\lambda_1} \cdots \widetilde{\Lambda}^-_{-\lambda_k}\, (\widetilde{\Lambda}^-_{-n})^N\, \widetilde{\Lambda}^-_{-\lambda_{k+1}} \cdots \widetilde{\Lambda}^-_{-\lambda_\ell}|0\rangle = \widetilde{\Lambda}^-_{-\lambda_1} \cdots \widetilde{\Lambda}^-_{-\lambda_\ell}\, (\widetilde{\Lambda}^-_{-n})^N|0\rangle. \tag{4.85}$$
Since $(\widetilde{\Lambda}^-_{-n})^N$ is in the center of $\widetilde{H}_q$, the action of $\widetilde{\Lambda}^-_{-\lambda_1} \cdots \widetilde{\Lambda}^-_{-\lambda_\ell}$ on the right hand side is only through the creation operators, $\beta_{-m}$, $m > 0$, and as a result $\tilde{\jmath}$ factorizes, which proves (4.84).
Furthermore, note that in the context of the free field realization, we obtain another simple proof of Theorem 4.17 by using Theorem 4.19, the faithfulness of $\tilde{\imath}$ and the identity
$$Q'_{(n^N)}(x; q) = (-1)^{n(N-1)}\, h_n(x^N) = (-1)^{n(N-1)} \sum_{\lambda \vdash n} z_\lambda^{-1}\, p_\lambda(x^N) = (-1)^{n(N-1)} \sum_{\lambda \vdash n} z_\lambda^{-1}\, p_{N\lambda}(x), \tag{4.86}$$
which holds for $q = \sqrt[N]{1}$ (cf. [27], p. 235).

In Sect. 4.1 we have found that the center of $\mathrm{Vir}_{p,q}$ at $q = -1$ could be calculated in terms of symmetrized products of the generators as given in (4.4). We will now show that a generalization of this result holds for $\widetilde{\mathrm{Vir}}^{\pm}_q$ at $q = \sqrt[N]{1}$. In fact, this is a simple consequence of the following symmetrization theorem which holds for an arbitrary $q$.

Theorem 4.22. For a partition $\lambda = (\lambda_1, \ldots, \lambda_n)$,
$$\sum_{\sigma \in S_n / S_n^\lambda} \widetilde{T}_{-\sigma\lambda_1} \cdots \widetilde{T}_{-\sigma\lambda_n} = \sum_{\mu} \begin{bmatrix} n \\ m(\mu) \end{bmatrix} M_{\lambda\mu}(q)\ \widetilde{T}_{-\mu_1} \cdots \widetilde{T}_{-\mu_n}, \tag{4.87}$$
where $\sigma$ runs over all inequivalent permutations of the sequence $(\lambda_1, \ldots, \lambda_n)$, $\mu$ over all partitions such that $|\mu| = |\lambda|$ and $\ell(\mu) = \ell(\lambda)$, and the matrix $M(q)$ is given by $M(q) = K(1)^{-1} K(q)$, where $K(q)$ is the Kostka-Foulkes matrix.
Proof. Clearly, it is sufficient to verify (4.87) when both sides act on the vacuum in a Verma module. Then, by mapping the Verma module to the symmetric functions, we find that in fact the present theorem is equivalent to (the symmetrization) Lemma A.1 for Milne polynomials in Appendix A.

Remark. There is an obvious counterpart of this result for $\widetilde{\mathrm{Vir}}^{+}_q$. However, unlike in Sect. 4.1, there is no simple symmetrization identity that would mix the positive and negative mode generators of $\widetilde{\mathrm{Vir}}_q$. This might be one reason why there seems to be no simple generalization of this theorem to generic $t$.

5. Discussion

In Sect. 4.2 (see Eq. (4.44)) we have seen that, for $q = \sqrt[N]{1}$, the character of the reduced Verma modules $M'(h)$ is given by
$$\prod_{n \ge 1} \frac{1 - x^{nN}}{1 - x^n} = \prod_{n \ge 1} (1 + x^n + \ldots + x^{n(N-1)}). \tag{5.1}$$
While the left hand side of this equation has a natural interpretation in terms of the character of a bosonic Fock space with the oscillators $\alpha_{nN}$, $n \in \mathbb{Z}$, removed, the right hand side has a natural interpretation in terms of the character of the Fock space of a so-called Gentile parafermion of order $N-1$ [16], i.e., a generalization of a fermion ($N = 2$) defined by the property that at most $N-1$ particles can occupy the same state, as embodied in the equation $(\widetilde{T}_n)^N = 0$, $n \in \mathbb{Z}$, for $t \to \infty$ or a deformation thereof for generic $t$. In other words, the left hand side of this equation counts the number of partitions without parts of length $0 \bmod N$, while the right hand side counts the number of partitions with at most $N-1$ equal parts. This equality is of course well-known in combinatorics.

Free fermions, of course, are well-known to give rise to conformal field theories. Indeed, as we have noted in Sect. 4.2, we have a (non-canonical) action of $\mathrm{Vir}_{p,q}$ at $q = \sqrt[2]{1} = -1$ on certain modules over the (undeformed) Virasoro algebra Vir at central charge $c = \frac{1}{2}$. This rather remarkable feature, that we have an action of a quantum group on modules over an undeformed algebra, resembles the action of Yangians on affine Lie algebra modules (see [6] and references therein), and was one of our motivations to study the algebra $\mathrm{Vir}_{p,q}$ in the first place. Of course, there is a crucial difference between the two results, in that the affine Lie algebra modules are fully reducible into finite dimensional irreducible representations of this Yangian, while the $c = \frac{1}{2}$ Vir module carries an infinite dimensional irreducible representation of $\mathrm{Vir}_{p,q}$ at $q = -1$. In fact, in Theorem 2.2 we have seen that $\mathrm{Vir}_{p,q}$ does not have any finite dimensional irreducible representations in the category $\mathcal{O}$ except for some trivial ones. It is an interesting open question whether $\mathrm{Vir}_{p,q}$ possesses any other nontrivial finite dimensional irreducible representations (e.g., cyclic representations).

In addition, one may wonder whether, analogous to $q = -1$, the algebra $\mathrm{Vir}_{p,q}$ at $q = \sqrt[N]{1}$ can also be realized on modules of Vir. Clearly, the character (5.1) corresponds to the character of a conformal field theory of central charge $c = (N-1)/N$ and is thus non-unitary (for most $N$), as is a well-known property of Gentile parafermions for $N \neq 2$ (see, e.g., [31] and references therein). This is left for further study as well. Interestingly, a collection of $N$ such Gentile parafermions does realize a unitary conformal field theory, namely $\widehat{sl}_N$ at level 1 [32].
Acknowledgement. P.B. is supported by a QEII research fellowship from the Australian Research Council and K.P. is supported in part by the U.S. Department of Energy Contract #DE-FG03-84ER-40168. P.B. would like to thank the University of Southern California for hospitality during the initial stages of this work and K. Schoutens for discussions.
Appendix A. Symmetric functions

In this appendix we give basic definitions and summarize some results from the theory of symmetric functions that are used throughout the paper. For further details and omitted proofs the reader should consult the monograph [27] and/or the references cited below.

A.1. Symmetric functions. Let $\Lambda$ be the ring of symmetric functions in countably many independent variables $x = \{x_1, x_2, \ldots\}$. Throughout the paper we are using the following standard bases in $\Lambda$ indexed by partitions, $\lambda$:

(i) monomial symmetric functions, $m_\lambda$,
(ii) elementary symmetric functions, $e_\lambda$,
(iii) complete symmetric functions, $h_\lambda$,
(iv) Schur functions, $s_\lambda$.

One defines those functions as the limit, when the number of variables goes to infinity, of the corresponding symmetric polynomials that are stable with respect to the adjunction of variables. For instance, the monomial symmetric function, $m_\lambda$, is the limit, $m_\lambda(x) = \lim_{n \to \infty} m_\lambda(x_1, \ldots, x_n)$, of the monomial symmetric polynomials
$$m_\lambda(x_1, \ldots, x_n) = \sum_{\alpha} x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \tag{A.1}$$
where the sum in (A.1) runs over all inequivalent permutations $\alpha = (\alpha_1, \ldots, \alpha_n)$ of the partition $\lambda = (\lambda_1, \ldots, \lambda_n)$. The elementary symmetric functions and the complete symmetric functions are defined by
$$e_\lambda = e_{\lambda_1} e_{\lambda_2} \cdots, \qquad h_\lambda = h_{\lambda_1} h_{\lambda_2} \cdots, \tag{A.2}$$
where $e_r$ and $h_r$ are determined from their generating series
$$E(x; t) = \sum_{n \ge 0} e_n(x)\, t^n = \prod_{i \ge 1} (1 + x_i t), \qquad H(x; t) = \sum_{n \ge 0} h_n(x)\, t^n = \prod_{i \ge 1} (1 - x_i t)^{-1}. \tag{A.3}$$
Finally, the Schur functions, $s_\lambda$, can be defined as polynomials in the elementary symmetric functions, $e_r$, or, equivalently, as polynomials in the complete symmetric functions, $h_r$, namely
$$s_\lambda = \det(e_{\lambda'_i - i + j})_{1 \le i,j \le m}, \qquad s_\lambda = \det(h_{\lambda_i - i + j})_{1 \le i,j \le n}, \tag{A.4}$$
where $m \ge \ell(\lambda')$ and $n \ge \ell(\lambda)$, respectively, and $\lambda'$ is the partition conjugate to $\lambda$.

Upon extension of $\Lambda$ to $\Lambda_{\mathbb{Q}} = \Lambda \otimes_{\mathbb{Z}} \mathbb{Q}$, the ring of symmetric functions with rational coefficients, one can introduce yet another basis, namely the power sum symmetric functions, $p_\lambda$. Those are defined by
$$p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots, \qquad p_r = \sum_i x_i^r = m_{(r)}. \tag{A.5}$$
The standard scalar product on $\Lambda_{\mathbb{Q}}$ is defined by
$$\langle p_\lambda, p_\mu \rangle = z_\lambda\, \delta_{\lambda\mu}, \tag{A.6}$$
where $z_\lambda$ is given in (3.2). In fact (A.6) induces a well-defined scalar product on $\Lambda$, with respect to which the Schur functions, $s_\lambda$, form an orthonormal basis.

A.2. Hall–Littlewood and Milne symmetric functions. The Hall–Littlewood (HL) symmetric functions, $P_\lambda(x; q)$, form an orthogonal basis in the space of one parameter symmetric functions $\Lambda[q] = \Lambda \otimes_{\mathbb{Z}} \mathbb{Z}[q]$ with the scalar product defined by
$$\langle p_\lambda, p_\mu \rangle_q = z_\lambda\, \delta_{\lambda\mu} \prod_{i=1}^{\ell(\lambda)} (1 - q^{\lambda_i}). \tag{A.7}$$
They can be calculated by applying the Gram-Schmidt orthogonalization algorithm to the basis of Schur functions and are given explicitly as the limit of the corresponding HL polynomials [24]
$$P_\lambda(x_1, \ldots, x_n; q) = \frac{1}{v_\lambda(q)} \sum_{w \in S_n} w\Big( x_1^{\lambda_1} \cdots x_n^{\lambda_n} \prod_{i<j} \frac{x_i - qx_j}{x_i - x_j} \Big), \tag{A.8}$$
where
$$v_m(q) = \frac{(q)_m}{(1-q)^m}, \qquad v_\lambda(q) = \prod_{i \ge 0} v_{m_i}(q). \tag{A.9}$$
It is understood in (A.8) that the permutations $w$ act on the variables $x_1, \ldots, x_n$. The functions $P_\lambda(x; q)$ interpolate between the Schur functions, $s_\lambda(x)$, and the monomial symmetric functions, $m_\lambda(x)$, namely
$$P_\lambda(x; 0) = s_\lambda(x), \qquad P_\lambda(x; 1) = m_\lambda(x). \tag{A.10}$$
The transition matrix $K(q)$ defined by
$$s_\lambda(x) = \sum_{\mu} K_{\lambda\mu}(q)\, P_\mu(x; q), \tag{A.11}$$
is strictly upper unitriangular with respect to the natural order of partitions, i.e., $K_{\lambda\mu}(q) = 0$ unless $|\lambda| = |\mu|$ and $\lambda \ge \mu$, and $K_{\lambda\lambda}(q) = 1$. The entries $K_{\lambda\mu}(q) \in \mathbb{Z}[q]$ are called the Kostka-Foulkes polynomials. One can show that their coefficients are non-negative integers. It follows from (A.10) that $K(1)$ is the transition matrix between the monomial symmetric functions and the Schur functions,
$$s_\lambda(x) = \sum_{\mu} K_{\lambda\mu}(1)\, m_\mu(x). \tag{A.12}$$
Its entries are called the Kostka numbers.

Another family of symmetric functions, $Q_\lambda(x; q)$, also referred to as HL symmetric functions, are scalar multiples of the $P_\lambda(x; q)$, defined by
$$Q_\lambda(x; q) = b_\lambda(q)\, P_\lambda(x; q), \tag{A.13}$$
where $b_\lambda(q) = \prod_{i \ge 1} (q)_{m_i(\lambda)}$, so that
$$\langle P_\lambda, Q_\mu \rangle_q = \delta_{\lambda\mu}. \tag{A.14}$$
An obvious consequence of the definition (A.13) (see also (A.18)) is that for $q = \sqrt[N]{1}$ the functions $Q_{(k^N)}(x; q)$ vanish.

It has already been observed in [24] that the definition of the $Q_\lambda(x; q)$ in (A.8) and (A.13) makes sense for any sequence of integers $(\lambda_1, \ldots, \lambda_n)$ that are not necessarily in a descending order and/or are not positive. For such generalized $Q_\lambda$ one can prove [24] a reordering identity that allows one to reduce $Q_\lambda$, when the $\lambda_i$ are not in descending order, to a linear combination of the $Q_\mu$, where the $\mu_i$ are in descending order. For a two-term sequence the formula is (see [27], p. 214)
$$Q_{(m,n)} - q\, Q_{(n,m)} = -Q_{(n-1,m+1)} + q\, Q_{(m+1,n-1)}. \tag{A.15}$$
The same equality holds within sequences $\lambda$ of length greater than two.

Now, let us consider the symmetric polynomials corresponding to $Q_\lambda$. It is clear that a particularly simple polynomial arises if we set the number of variables equal to the length of the partition, namely
$$Q_\lambda(x_1, \ldots, x_n; q) = (1-q)^n \sum_{w \in S_n} w\Big( x_1^{\lambda_1} \cdots x_n^{\lambda_n} \prod_{i<j} \frac{x_i - qx_j}{x_i - x_j} \Big), \tag{A.16}$$
where $\ell(\lambda) = n$. The main result of this section is a symmetrization lemma for such polynomials (cf. [30], Theorem 1):

Lemma A.1. Let $\lambda = (\lambda_1, \ldots, \lambda_n)$ be a partition and $S_n / S_n^\lambda$ the set of distinct permutations of the sequence $\lambda$. Then
$$\sum_{\sigma \in S_n / S_n^\lambda} Q_{\sigma\lambda}(x_1, \ldots, x_n; q) = \sum_{\mu} \begin{bmatrix} n \\ m(\mu) \end{bmatrix} M_{\lambda\mu}(q)\ Q_\mu(x_1, \ldots, x_n; q), \tag{A.17}$$
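where $M(q) = K(1)^{-1}K(q)$, and the sum on the r.h.s. in (A.17) runs only over partitions $\mu$ such that $|\mu| = |\lambda|$ and $\ell(\mu) = \ell(\lambda)$.

Proof. By applying the symmetrization in $\lambda_1,\ldots,\lambda_n$ to the right-hand side of (A.16), we obtain
\[ \begin{aligned} \sum_{\sigma\in S_n/S_n^\lambda} Q_{\sigma\lambda}(x_1,\ldots,x_n;q) &= (1-q)^n \sum_{\sigma\in S_n/S_n^\lambda} \sum_{w\in S_n} w\Big( x_1^{\sigma\lambda_1}\cdots x_n^{\sigma\lambda_n} \prod_{i<j}\frac{x_i - q x_j}{x_i - x_j} \Big) \\ &= (1-q)^n \sum_{w\in S_n} w\Big( \prod_{i<j}\frac{x_i - q x_j}{x_i - x_j} \Big) \sum_{\sigma\in S_n/S_n^\lambda} x_1^{\sigma\lambda_1}\cdots x_n^{\sigma\lambda_n} \\ &= (q)_n\, m_\lambda(x_1,\ldots,x_n), \end{aligned} \tag{A.18} \]
where we used (A.1) and the identity (see [27], p. 207)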
Deformed Virasoro Algebra at Roots of Unity
\[ \sum_{w\in S_n} w\Big( \prod_{i<j} \frac{x_i - q x_j}{x_i - x_j} \Big) = v_n(q). \tag{A.19} \]
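The identity (A.19) is easy to test numerically. The sketch below assumes the standard convention $v_n(q) = \prod_{i=1}^{n} (1-q^i)/(1-q)$ from Macdonald's book [27] (the definition (A.1) is not reproduced in this appendix, so this is an assumption), which is also what makes $(1-q)^n v_n(q) = (q)_n$ in (A.18). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def symmetrized_product(xs, q):
    """Sum over w in S_n of w(prod_{i<j} (x_i - q x_j)/(x_i - x_j))."""
    n = len(xs)
    total = Fraction(0)
    for w in permutations(range(n)):
        term = Fraction(1)
        for i in range(n):
            for j in range(i + 1, n):
                # w acts by permuting the variables.
                xi, xj = xs[w[i]], xs[w[j]]
                term *= (xi - q * xj) / (xi - xj)
        total += term
    return total

def v(n, q):
    """Assumed convention: v_n(q) = prod_{i=1}^n (1 - q^i)/(1 - q)."""
    result = Fraction(1)
    for i in range(1, n + 1):
        result *= (1 - q**i) / (1 - q)
    return result

xs = [Fraction(2), Fraction(3), Fraction(5)]
q = Fraction(1, 3)
assert symmetrized_product(xs, q) == v(3, q)
```

The left-hand side is independent of the chosen distinct values `xs`, which is the content of the identity.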
Now use the fact that
\[ m_\lambda(x_1,\ldots,x_n) = m_\lambda(x_1,\ldots,x_m)\big|_{x_{n+1}=\ldots=x_m=0}, \tag{A.20} \]
and
\[ Q_\mu(x_1,\ldots,x_m)\big|_{x_{n+1}=\ldots=x_m=0} = \begin{cases} Q_\mu(x_1,\ldots,x_n) & \text{if } \ell(\mu)\leq n, \\ 0 & \text{if } \ell(\mu) > n, \end{cases} \tag{A.21} \]
together with (A.11), (A.12) and (A.13) to find
\[ m_\lambda(x_1,\ldots,x_n) = \sum_{\mu} M_{\lambda\mu}(q)\, \frac{1}{b_\mu(q)}\, Q_\mu(x_1,\ldots,x_n;q), \tag{A.22} \]
where
\[ M(q) = K(1)^{-1}K(q). \tag{A.23} \]
The restriction on the range of the sum over $\mu$ is then a straightforward consequence of the strict upper unitriangularity of $K(q)$.

We also need to introduce another family of symmetric functions, $Q'_\lambda(x;q)$, called Milne's symmetric functions [28, 29] (see also [15, 22, 23]). They are related to the HL functions, $Q_\lambda(x;q)$, by a change of variables
\[ Q'_\lambda(x;q) = Q_\lambda\Big(\frac{x}{1-q};q\Big), \tag{A.24} \]
which has to be understood in the sense of the $\lambda$-ring notation. This means that $Q'_\lambda(x;q)$ is the image of $Q_\lambda(x;q)$ by the ring homomorphism of $\Lambda[q]$ that sends $p_r(x)$ to $(1-q^r)^{-1}p_r(x)$. A defining property of Milne's functions is their orthogonality to the HL functions,
\[ \langle P_\lambda, Q'_\mu \rangle = \delta_{\lambda\mu}, \tag{A.25} \]
with respect to the product (A.6). This is equivalent to the completeness relation
\[ \sum_{\lambda} P_\lambda(x;q)\, Q'_\lambda(y;q) = \prod_{i,j} \frac{1}{1 - x_i y_j}. \tag{A.26} \]
Finally, an expansion in terms of Schur functions yields
\[ Q'_\lambda(x;q) = \sum_{\mu} K_{\mu\lambda}(q)\, s_\mu(x). \tag{A.27} \]
One can use (A.24) to extend the definition of $Q'_\lambda(x;q)$ to arbitrary sequences $\lambda$. Given that the transformation defined in (A.24) is a ring homomorphism, it is clear that the so-defined $Q'_\lambda(x;q)$ will satisfy both the reordering identity (A.15) and (A.17) of the symmetrization lemma.
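Appendix B. Explicit Singular Vectors for $q = \sqrt[N]{1}$

In Theorem 4.8 of Sect. 4.2 we have seen that
\[ \Psi(z) = \lim_{z_i \to z q^{N-i}} \prod_{i<j} f\Big(\frac{z_j}{z_i}\Big)\, T(z_1)\cdots T(z_N)|h\rangle \tag{B.1} \]
is a (well-defined) generating series of singular vectors in $M(h)$ for $q = \sqrt[N]{1}$. We have also outlined how to make sense of this expression. Here we carry out the procedure for $N = 3$. First,

Theorem B.1. We have
\[ \begin{aligned} \prod_{i<j} f\Big(\frac{z_j}{z_i}\Big)\, T(z_1)T(z_2)T(z_3)|h\rangle &= \sum_{m_1,m_2,m_3\geq 0} z_1^{m_1} z_2^{m_2} z_3^{m_3} \prod_{i<j} f(R_{ij})\, T_{-m_1}T_{-m_2}T_{-m_3}|h\rangle \\ &\quad + \zeta\Big( \frac{p(z_3/z_2)}{1-p(z_3/z_2)}\, F^{-+}\Big(\frac{z_3}{z_1}\Big) - \frac{p^{-1}(z_3/z_2)}{1-p^{-1}(z_3/z_2)}\, F^{+-}\Big(\frac{z_3}{z_1}\Big) \Big) T(z_1)|h\rangle \\ &\quad + \zeta\Big( \frac{p(z_3/z_1)}{1-p(z_3/z_1)}\, F^{-+}_{(d)}\Big(\frac{z_3}{z_2}\Big) - \frac{p^{-1}(z_3/z_1)}{1-p^{-1}(z_3/z_1)}\, F^{+-}_{(d)}\Big(\frac{z_3}{z_2}\Big) \Big) T(z_2)|h\rangle \\ &\quad + \zeta\Big( \frac{1}{1-p(z_2/z_1)}\, F^{+-}\Big(p\frac{z_3}{z_1}\Big) - \frac{1}{1-p^{-1}(z_2/z_1)}\, F^{-+}\Big(p^{-1}\frac{z_3}{z_1}\Big) \Big) T(z_3)|h\rangle, \end{aligned} \tag{B.2} \]
where
\[ F^{\epsilon_1\epsilon_2}_{(d)}(x) = \sum_{0\leq m\leq d} F^{\epsilon_1\epsilon_2}_m x^m. \tag{B.3} \]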
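We sketch the proof, which is based on the following

Lemma B.2. Let
\[ c_{m,n} = \zeta\,(p^m - p^{-m})\,\delta_{m+n,0}, \tag{B.4} \]
then we have
\[ \begin{aligned} \sum_{l_1,l_2\geq 0} f_{l_1} f_{l_2}\, T_{-m_1-l_1-l_2}\, c_{-m_2+l_1,-m_3+l_2} &= \zeta\big( p^{-m_2} F^{-+}_{m_2+m_3} - p^{m_2} F^{+-}_{m_2+m_3} \big)\, T_{-m_1-m_2-m_3}, \\ \sum_{l_1,l_2\geq 0} f_{l_1} f_{l_2}\, c_{-m_1-l_1,-m_2-l_2}\, T_{-m_3+l_1+l_2} &= \zeta\big( p^{-m_1} F^{+-}_{-m_1-m_2} - p^{m_1} F^{-+}_{-m_1-m_2} \big)\, T_{-m_1-m_2-m_3}, \end{aligned} \tag{B.5} \]
where $F^{\pm\mp}(x) = \sum_m F^{\pm\mp}_m x^m$ is defined in (2.40).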
Proof. We have
\[ \begin{aligned} \sum_{l_1,l_2\geq 0} f_{l_1} f_{l_2}\, T_{-m_1-l_1-l_2}\, c_{-m_2+l_1,-m_3+l_2} &= \zeta \sum_{l_1,l_2\geq 0} f_{l_1} f_{l_2}\, (p^{-m_2+l_1} - p^{m_2-l_1})\, \delta_{-m_2-m_3+l_1+l_2,0}\, T_{-m_1-l_1-l_2} \\ &= \zeta\big( p^{-m_2} f(px)f(x)\big|_{x^{m_2+m_3}} - p^{m_2} f(p^{-1}x)f(x)\big|_{x^{m_2+m_3}} \big)\, T_{-m_1-m_2-m_3}. \end{aligned} \tag{B.6} \]
The other identity is shown similarly.

Proof of Theorem B.1. Writing out the left-hand side of (B.2) in modes we distinguish three cases: (i) $m_1 \geq 0$, $m_2 \geq 0$, $m_3 \geq 0$, (ii) $m_2 < 0$, $m_3 \geq 0$, (iii) $m_1 < 0$, $m_2 \geq 0$, $m_3 \geq 0$. In case (i) only a finite number of terms contribute. This gives the first term on the right-hand side of (B.2). In case (ii) we move the term $T_{-m_2+l_{12}-l_{23}}$ to the right using the commutator (2.1). Then use Lemma B.2 to find
\[ \zeta \sum_{m_1} \sum_{m_2\leq -1} \sum_{m_3\geq 0} z_1^{m_1} z_2^{m_2} z_3^{m_3} \big( p^{-m_2} F^{-+}_{m_2+m_3} - p^{m_2} F^{+-}_{m_2+m_3} \big)\, T_{-m_1-m_2-m_3}|h\rangle. \]
The sum over $m_1$ is unrestricted. The nonvanishing terms must have $m_2 + m_3 \geq 0$ because of the moding of $F^{+-}$ and $F^{-+}$. Thus, with the restriction on $m_2$ in place, we may let the sum on $m_3$ run over all integers. This allows for a change in the summation
\[ m' = m_2 + m_3, \qquad m'' = m_1 + m_2 + m_3, \tag{B.7} \]
after which all sums can be performed and we obtain
\[ \zeta\Big( \frac{p(z_3/z_2)}{1-p(z_3/z_2)}\, F^{-+}\Big(\frac{z_3}{z_1}\Big) - \frac{p^{-1}(z_3/z_2)}{1-p^{-1}(z_3/z_2)}\, F^{+-}\Big(\frac{z_3}{z_1}\Big) \Big) T(z_1)|h\rangle. \]
The remaining case (iii) is analyzed similarly.

By taking the limit $z_i \to z q^{N-i}$ in Theorem B.1 and using $F^{-+}(q^{-1}) = F^{-+}(p^{-1}q^{-2}) = F^{+-}(q^{-2}) = 0$, we obtain
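Corollary B.3. The following is an explicit form for the level $d$ singular vector at $q = \sqrt[3]{1}$ and arbitrary $h \in \mathbb{C}$:
\[ \begin{aligned} \Psi_d &= \sum_{\substack{m_1,m_2,m_3\geq 0 \\ m_1+m_2+m_3=d}} q^{2m_1+m_2} \prod_{i<j} f(R_{ij})\, T_{-m_1}T_{-m_2}T_{-m_3}|h\rangle \\ &\quad + \zeta\Big[ q^{2d}\, \frac{1}{qp^{-1}-1}\, F^{-+}(q^{-2})\, T_{-d} + q^{d}\Big( \frac{1}{q^2p^{-1}-1}\, F^{-+}_{(d)}(q^{-1}) - \frac{1}{q^2p-1}\, F^{+-}_{(d)}(q^{-1}) \Big) T_{-d} \\ &\qquad\quad + \frac{qp^{-1}}{qp^{-1}-1}\, F^{+-}(pq^{-2})\, T_{-d} \Big] |h\rangle. \end{aligned} \tag{B.8} \]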
The expression $\Psi_d$ vanishes for $d \neq 0 \bmod 3$ (we have explicitly verified this for small values of $d$).

From the structure of $f_l$, and the duality symmetry $(p,q) \to (p^{-1},q^{-1})$, it follows that the singular vectors at $q = \sqrt[N]{1}$ for $d = 0 \bmod N$ can be written in terms of the following set of invariants:
\[ \Delta^{rs}_{m_1\ldots m_k} = a_{rs}\, \frac{p^r q^s\,(1 + p^{n-2r} q^{N-2s})}{\prod_{i=1}^{k} (1+p^i)^{m_i}}, \qquad n = \sum_i i\, m_i, \tag{B.9} \]
where the normalization factor
\[ a_{rs} = \begin{cases} \tfrac{1}{2} & \text{for } 2r = n \text{ and } 2s = 0 \bmod N, \\ 1 & \text{otherwise.} \end{cases} \tag{B.10} \]
Using (B.8) we now find the following singular vectors at $q = \sqrt[3]{1}$ for arbitrary $h \in \mathbb{C}$. At level $d = 3$,
\[ \Psi_3 = \big( T_{-1}T_{-1}T_{-1} + a_{210}\, T_{-2}T_{-1}T_0 + a_{300}\, T_{-3}T_0T_0 + a_3\, T_{-3} \big)|h\rangle, \tag{B.11} \]
with
\[ a_{210} = -3\Delta^{10}_{2}, \qquad a_{300} = -3\Delta^{21}_{31}, \qquad a_3 = 3\Delta^{11}_{11}, \tag{B.12} \]
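and, at level $d = 6$,
\[ \begin{aligned} \Psi_6 = \big( &T_{-2}T_{-2}T_{-2} + a_{321}\, T_{-3}T_{-2}T_{-1} + a_{411}\, T_{-4}T_{-1}T_{-1} + a_{330}\, T_{-3}T_{-3}T_0 \\ &+ a_{420}\, T_{-4}T_{-2}T_0 + a_{510}\, T_{-5}T_{-1}T_0 + a_{600}\, T_{-6}T_0T_0 + a_6\, T_{-6} \big)|h\rangle, \end{aligned} \tag{B.13} \]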
where a321 = −3110 2 , a411 = −3121 31 , a330 = −3121 31 , 30 21 a420 = −3140 42 + 3141 − 3131 , 31 a510 = −6140 42 + 3152 , 60 70 50 61 51 41 a600 = 27180 6301 + 12163 + 1816201 + 6162 − 315301 − 315301 − 315301 , 20 40 50 30 51 41 31 a6 = −3160 4201 + 314201 − 6142 − 1214101 + 6141 − 314201 + 314201 + 614201 . (B.14) The equality of the coefficients a210 = a321 and a300 = a411 in the singular vectors at d = 3 and d = 6 is a direct consequence of Lemma 4.9 (ii) and the expression (4.33). √ In addition we have computed the level d = 4 singular vector at q = 4 1, 94 V = T−1 T−1 T−1 T−1 +a2110 T−2 T−1 T−1 T0 +a2200 T−2 T−2 T0 T0 +a3100 T−3 T−1 T0 T0 + a4000 T−4 T0 T0 T0 + a22 T−2 T−2 + a31 T−3 T−1 + a40 T−4 T0 |hi, (B.15)
where 11 a2110 = −4110 2 − 413 ,
a2200 = 8120 4 , 21 a3100 = 12120 4 + 415 , 31 a4000 = −8130 501 − 81501 ,
a22 = a31 = a40 =
(B.16)
−8110 2 , 11 −4110 2 + 413 , 21 8120 301 − 81301 .
Observe that for $t \to \infty$ we have $\Delta^{r,s}_{m_1\ldots m_k} \to 0$ for $0 < r \leq 2n$. Thus, in all the examples above, in the limit $t \to \infty$ only the leading term $(T_{-n})^N|h\rangle$ survives. Furthermore, note that, both for $N = 3$ and $N = 4$, all the factors in front of the fundamental invariants $\Delta^{rs}_{m_1\ldots m_k}$ are a multiple of $N$ (except for the one of the leading term). Thus, it appears that calculating modulo $N$, in the appropriate sense, is somehow equivalent to considering the $t \to \infty$ limit.
Appendix C. The Center Revisited

In this appendix we establish in a direct way some elementary identities for the products of generators of $\widetilde{\mathrm{Vir}}_q^{\pm}$ for a generic $q$. Specialization of those results to $q = \sqrt[N]{1}$ yields a direct proof of Theorem 4.14.

C.1. Preliminaries. The defining relations (4.54) of $\widetilde{\mathrm{Vir}}_q^{\pm}$ are invariant under the rescaling $\widetilde{T}_m \to a^{|m|}\widetilde{T}_m$, $a \in \mathbb{C}$, while the remaining relations in (4.53)–(4.56) can be further simplified by a judicious choice of $a$. In the following we will find it convenient to work with the generators
\[ t_m = q^{\frac{|m|}{2}}\, \widetilde{T}_m, \qquad m \in \mathbb{Z}. \tag{C.1} \]
Then the relations involving the generators of $\widetilde{\mathrm{Vir}}_q^{+}$ are given by
\[ t_m t_n = q\, t_n t_m - (1-q) \sum_{l=1}^{m-n-1} t_{m-l}\, t_{n+l}, \qquad m > n \geq 1, \tag{C.2} \]
\[ t_m t_0 = q\, t_0 t_m - (1-q) \sum_{l=1}^{m-1} t_{m-l}\, t_l + (q - q^{-1}) \sum_{l=1}^{\infty} t_{-l}\, t_{m+l}, \qquad m \geq 1, \tag{C.3} \]
\[ t_m t_{-n} = q\, t_{-n} t_m + (q - q^{-1}) \sum_{l=1}^{\infty} t_{-n-l}\, t_{m+l} + (1-q)\, q^{m}\, \delta_{m,n}, \qquad m, n \geq 1, \tag{C.4} \]
while the remaining ones have a similar form and can easily be worked out.
C.2. Identities in $\widetilde{\mathrm{Vir}}_q^{\pm}$. Now let us consider the subalgebras $\widetilde{\mathrm{Vir}}_q^{\pm}$ in more detail. We will discuss explicitly only the case of $\widetilde{\mathrm{Vir}}_q^{+}$, with the extension to $\widetilde{\mathrm{Vir}}_q^{-}$ being obvious. The relation (C.2) implies that two subsequent generators satisfy
\[ t_{m+1} t_m = q\, t_m t_{m+1}. \tag{C.5} \]
For the generators $t_{m+k}$ and $t_m$ with $k \geq 2$, there are additional terms, though it is still possible to rewrite (C.2) in a symmetric form
\[ \sum_{j=1}^{k} t_{m+j}\, t_{m+k-j} = q \sum_{j=1}^{k} t_{m+k-j}\, t_{m+j}. \tag{C.6} \]
In view of (C.6) it is then natural to consider the following sums of generators (see footnote 4)
\[ s_m = t_m + t_{m+1} + \ldots, \tag{C.7} \]
in terms of which (C.2) is equivalent to
\[ s_{m+1} t_m = q\, t_m s_{m+1} - (1-q)\, s_{m+1} s_{m+1}. \tag{C.8} \]
The infinite sums and their products here and below should be understood in the graded sense. By iterating (C.8) we prove

Lemma C.1. For $n \geq 1$,
\[ (s_{m+1})^n\, t_m = q^n\, t_m (s_{m+1})^n - (1-q^n)(s_{m+1})^{n+1}, \tag{C.9} \]
\[ s_{m+1} (t_m)^n = \sum_{j=0}^{n} (-1)^j q^{n-j} \begin{bmatrix} n \\ j \end{bmatrix} (q)_j\, (t_m)^{n-j} (s_{m+1})^{j+1}. \tag{C.10} \]
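For the reader's convenience, the induction step behind (C.9) can be written out as follows. This is a sketch using only (C.8), with the abbreviations $s = s_{m+1}$ and $t = t_m$ introduced here for brevity.

```latex
% Induction step for (C.9): assume s^n t = q^n t s^n - (1 - q^n) s^{n+1}.
\begin{aligned}
s^{n+1}\, t
  &= s \left( q^{n}\, t\, s^{n} - (1-q^{n})\, s^{n+1} \right) \\
  &= q^{n}\, (s\, t)\, s^{n} - (1-q^{n})\, s^{n+2} \\
  &= q^{n} \left( q\, t\, s - (1-q)\, s^{2} \right) s^{n} - (1-q^{n})\, s^{n+2}
     && \text{by (C.8)} \\
  &= q^{n+1}\, t\, s^{n+1} - \left( q^{n}(1-q) + 1 - q^{n} \right) s^{n+2} \\
  &= q^{n+1}\, t\, s^{n+1} - (1-q^{n+1})\, s^{n+2}.
\end{aligned}
```

The case $n = 1$ is (C.8) itself, so (C.9) follows for all $n \geq 1$.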
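Then, a straightforward induction yields

Lemma C.2. For $n \geq 1$,
\[ (t_m + s_{m+1})^n = \sum_{j=0}^{n} q^{\frac{1}{2}j(j-1)} \begin{bmatrix} n \\ j \end{bmatrix} (t_m)^{n-j} (s_{m+1})^{j}, \tag{C.11} \]
\[ \big( q\, t_m + (q - q^{-1})\, s_{m+1} \big)^n = \sum_{j=0}^{n} (-1)^j\, \frac{q^{n-2j}\, (q)_{j+1}}{1-q} \begin{bmatrix} n \\ j \end{bmatrix} (t_m)^{n-j} (s_{m+1})^{j}. \tag{C.12} \]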
For a partition $\lambda$ of length $\ell(\lambda) = n$, let $\lambda^{op} = (\lambda^{op}_1,\ldots,\lambda^{op}_n)$ be the increasing sequence of positive integers, $\lambda^{op}_i = \lambda_{n-i}$, $i = 1,\ldots,n$. For such a sequence we define
\[ \mathrm{ht}(\lambda^{op}) = \sum_{i=1}^{\ell(\lambda)} (\ell(\lambda) - i)(\lambda^{op}_i - 1) = n(\lambda) - \frac{1}{2}\,\ell(\lambda)(\ell(\lambda)-1). \tag{C.13} \]
Finally, in terms of multiplicities, we can write $(\lambda_1,\ldots,\lambda_n) = (1^{m_1} 2^{m_2} \ldots)$, where $m_i = m_i(\lambda)$. Using Lemma C.2 we will now establish the main result of this section, which gives the expansion of powers of $s_m$ into ordered products of generators.

Theorem C.3. For $n \geq 1$,
\[ (s_m)^n = \sum_{\{\lambda \,|\, \ell(\lambda) = n\}} q^{\mathrm{ht}(\lambda^{op})} \begin{bmatrix} n \\ m(\lambda) \end{bmatrix} t_{m+\lambda^{op}_1-1} \cdots t_{m+\lambda^{op}_n-1}. \tag{C.14} \]

Footnote 4: Since (C.2) is invariant under rescaling, there seems to be no advantage in working with the generating series $t_+(z) = \sum_{n\geq 1} t_n z^{-n}$.
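Proof. We write $s_m = t_m + s_{m+1}$, and expand $(s_m)^n = (t_m + s_{m+1})^n$ using (C.11),
\[ (s_m)^n = \sum_{m_1+n_2=n} q^{\mathrm{ht}(1^{m_1} 2^{n_2})} \begin{bmatrix} n \\ m_1, n_2 \end{bmatrix} (t_m)^{m_1} (s_{m+1})^{n_2}. \tag{C.15} \]
Since at a given level only products of a finite number of generators, $t_m, t_{m+1}, \ldots, t_{m+s-1}$, can appear, after repeating this expansion $s-1$ times we obtain a multiple sum
\[ (s_m)^n = \sum_{m_1+\ldots+m_s=n} q^{\sum_{i=1}^{s-1} \mathrm{ht}(1^{m_i} 2^{n_{i+1}})} \begin{bmatrix} n \\ m_1, n_2 \end{bmatrix} \begin{bmatrix} n_2 \\ m_2, n_3 \end{bmatrix} \cdots \begin{bmatrix} n_{s-1} \\ m_{s-1}, m_s \end{bmatrix} (t_m)^{m_1} \cdots (t_{m+s-1})^{m_s}, \tag{C.16} \]
where $n_i = m_i + \ldots + m_s$. It follows from definition (C.13) that
\[ \sum_{i=1}^{s-1} \mathrm{ht}(1^{m_i} 2^{n_{i+1}}) = \mathrm{ht}(1^{m_1} 2^{m_2} \ldots s^{m_s}), \tag{C.17} \]
and it is obvious that
\[ \begin{bmatrix} n \\ m_1, n_2 \end{bmatrix} \begin{bmatrix} n_2 \\ m_2, n_3 \end{bmatrix} \cdots \begin{bmatrix} n_{s-1} \\ m_{s-1}, m_s \end{bmatrix} = \begin{bmatrix} n \\ m_1, \ldots, m_s \end{bmatrix}. \tag{C.18} \]
This completes the proof of the theorem.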
C.3. Proof of Theorem 4.18.

Theorem C.4. Let $q = \sqrt[N]{1}$. Then $(t_m)^N$, $m \geq 1$, belong to the center of $\widetilde{\mathrm{Vir}}_q$.

Proof. We will show separately that
\[ (t_m)^N t_n = t_n (t_m)^N, \qquad m \geq 1, \tag{C.19} \]
for $\pm n > 0$ and $n = 0$.

Case 1. $n > 0$. From Lemma C.1 we find
\[ (s_{m+1})^N t_m = t_m (s_{m+1})^N, \tag{C.20} \]
and
\[ s_{m+1} (t_m)^N = (t_m)^N s_{m+1}, \tag{C.21} \]
for arbitrary $m \geq 1$. Since, by Theorem C.3,
\[ (s_m)^N = \sum_{k=0}^{\infty} q^{\frac{1}{2}kN(N-1)}\, (t_{m+k})^N, \tag{C.22} \]
where each term in the sum is at a different level, this implies (C.19).

Case 2. $-n > 0$. First rewrite (C.4) in a more convenient form,
\[ t_m t_{-n} = \sum_{l=0}^{\infty} a_l\, t_{-n-l}\, t_{m+l} + c_n\, \delta_{m,n}, \tag{C.23} \]
where $a_0 = q$, $a_l = q - q^{-1}$, $l \geq 1$, and $c_n = (1-q)q^{-n}$. By repeated use of (C.23) we obtain
\[ \begin{aligned} (t_m)^N t_{-n} &= c_m (t_m)^{N-1} \delta_{m,n} \\ &\quad + \sum_{k=1}^{N-1} (t_m)^{N-k-1} \sum_{l_1,\ldots,l_k=0}^{\infty} a_{l_1} \cdots a_{l_k}\, c_{n+l_1+\ldots+l_k}\, t_{m+l_1} \cdots t_{m+l_k}\, \delta_{m,n+l_1+\ldots+l_k} \\ &\quad + \sum_{l_1,\ldots,l_N=0}^{\infty} a_{l_1} \cdots a_{l_N}\, t_{-n-l_1-\ldots-l_N}\, t_{m+l_1} \cdots t_{m+l_N}. \end{aligned} \tag{C.24} \]
This may be recognized as
\[ \begin{aligned} (t_m)^N t_{-n} &= c_m (t_m)^{N-1} \delta_{m,n} + c_m \sum_{k=1}^{N-1} (t_m)^{N-k-1} \Big[ q\, t_m + (q - q^{-1})\, s_{m+1} \Big]^{k}_{(k+1)m-n} \\ &\quad + \sum_{l \geq 0} t_{-n-l} \Big[ q\, t_m + (q - q^{-1})\, s_{m+1} \Big]^{N}_{Nm+l}, \end{aligned} \tag{C.25} \]
where the subscripts on the brackets indicate the level. For $m < n$, we find using (C.12) that only the last term in (C.25) with $l = 0$ contributes, giving $t_{-n}(t_m)^N$. For $m = n$, all terms in (C.25) contribute; however, we may set $s_{m+1} = 0$ in the second term. Thus the central charge term has an overall factor of $1 + q + \ldots + q^{N-1} = 0$, while the last term is the same as above. For $m > n$, the first term does not contribute. Let us first consider the second term proportional to the central charge. Using (C.12) we can rewrite it as
\[ c_m \sum_{k=1}^{N-1} \sum_{j=0}^{k} (-1)^j\, \frac{q^{k-2j}\, (q)_{j+1}}{1-q} \begin{bmatrix} k \\ j \end{bmatrix} (t_m)^{N-j-1} (s_{m+1})^{j} \Big|_{Nm-n}, \]
where only the terms at the level $Nm - n$ contribute. In particular, at this level we must have $j \geq 1$. Changing the order of summation we obtain
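\[ \begin{aligned} &c_m \sum_{j=1}^{N-1} (-1)^j\, \frac{q^{-j}\, (q)_{j+1}}{1-q} \sum_{k=j}^{N-1} q^{k-j} \begin{bmatrix} k \\ j \end{bmatrix} (t_m)^{N-j-1} (s_{m+1})^{j} \Big|_{Nm-n} \\ &\quad = c_m \sum_{j=1}^{N-1} (-1)^j\, \frac{q^{-j}\, (q)_{j+1}}{1-q} \begin{bmatrix} N \\ j+1 \end{bmatrix} (t_m)^{N-j-1} (s_{m+1})^{j} \Big|_{Nm-n}. \end{aligned} \tag{C.26} \]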
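Two facts enter (C.26): the summation identity $\sum_{k=j}^{N-1} q^{k-j} \begin{bmatrix} k \\ j \end{bmatrix} = \begin{bmatrix} N \\ j+1 \end{bmatrix}$ and the behaviour of the factor $(q)_{j+1} \begin{bmatrix} N \\ j+1 \end{bmatrix}$ at $q^N = 1$. The sketch below checks both, assuming the bracket denotes the Gaussian ($q$-) binomial coefficient and $(q)_n = \prod_{i=1}^{n}(1-q^i)$; the function names are hypothetical.

```python
import cmath
from fractions import Fraction
from functools import lru_cache

def gaussian_binomial(n, k, q):
    """Evaluate the q-binomial [n choose k]_q at the number q.

    Uses the q-Pascal rule [a, b] = [a-1, b-1] + q^b [a-1, b], which
    needs no division and so stays valid at roots of unity.
    """
    @lru_cache(maxsize=None)
    def rec(a, b):
        if b < 0 or b > a:
            return 0
        if b == 0 or b == a:
            return 1
        return rec(a - 1, b - 1) + q**b * rec(a - 1, b)
    return rec(n, k)

def q_pochhammer(n, q):
    """(q)_n = (1 - q)(1 - q^2)...(1 - q^n)."""
    result = 1
    for i in range(1, n + 1):
        result *= 1 - q**i
    return result

def hockey_stick_gap(N, j, q):
    """sum_{k=j}^{N-1} q^(k-j) [k, j] minus [N, j+1]; zero if the identity holds."""
    lhs = sum(q**(k - j) * gaussian_binomial(k, j, q) for k in range(j, N))
    return lhs - gaussian_binomial(N, j + 1, q)

def c26_factor(N, j, q):
    """The factor (q)_{j+1} [N, j+1] multiplying the j-th term of (C.26)."""
    return q_pochhammer(j + 1, q) * gaussian_binomial(N, j + 1, q)

# Exact check of the summation identity at a generic rational q.
assert hockey_stick_gap(6, 2, Fraction(2, 3)) == 0

# Numerical check that every term of (C.26) vanishes at a primitive 5th root of unity.
q5 = cmath.exp(complex(0, 2 * cmath.pi / 5))
assert all(abs(c26_factor(5, j, q5)) < 1e-9 for j in range(1, 5))
```

Note that for $j = N-1$ the $q$-binomial equals 1 and the vanishing comes from $(q)_N$, while for smaller $j$ the product equals $\prod_{i=N-j}^{N}(1-q^i)$ as a polynomial, which contains the factor $1 - q^N$.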
Clearly all terms in the sum vanish. Thus once more the only contribution arises from the last term in (C.25) and it is the same as above.

Case 3. $n = 0$. Using (C.3) we find
\[ \begin{aligned} (t_m)^N t_0 &= q^N t_0 (t_m)^N + \sum_{k=1}^{N} q^{k-1} (t_m)^{N-k} \Big[ (q-1) \sum_{l=1}^{m-1} t_{m-l}\, t_l \Big] (t_m)^{k-1} \\ &\quad + \sum_{k=1}^{N} q^{k-1} (t_m)^{N-k} \Big[ (q - q^{-1}) \sum_{l=1}^{\infty} t_{-l}\, t_{m+l} \Big] (t_m)^{k-1}. \end{aligned} \tag{C.27} \]
This expression is clearly the same as the one obtained by setting $t_0 = t_0^{+} + t_0^{-}$, where $t_0^{+}$ satisfies (C.2) with $n = 0$ and $t_0^{-}$ satisfies (C.4) with $n = 0$. Since the proof of the two cases above does not depend on the specific value of $n$, by exactly the same algebra we show that $t_0$ commutes with $(t_m)^N$.

An obvious modification of the above argument proves that $(T_m)^N$ for $m < 0$ lies in the center. This then concludes the proof of Theorem 4.18.

Remark. Note that $(t_0)^N$, ($N \geq 3$), is not in the center, as the following example for $N = 3$ shows,
\[ \begin{aligned} (t_0)^3 t_{-2} &= q^3\, t_{-2} (t_0)^3 - (1-q)\, q^2 (1 + q + q^2)\, (t_{-1})^2 (t_0)^2 \\ &\quad + (1-q)^3 q (2+q)\, t_{-2} t_0 - (1-q)^4 (1+q)\, (t_{-1})^2 + \ldots, \end{aligned} \tag{C.28} \]
where the dots stand for terms with strictly positive modes on the right.

References

1. Andrews, G.E., Baxter, R.J. and Forrester, P.J.: Eight-vertex SOS model and generalized Rogers-Ramanujan-type identities. J. Stat. Phys. 35, 193–266 (1984)
2. Awata, H., Kubo, H., Odake, S. and Shiraishi, J.: Quantum $W_N$ algebras and Macdonald polynomials. Commun. Math. Phys. 179, 401–416 (1996), q-alg/9508011
3. Awata, H., Kubo, H., Odake, S. and Shiraishi, J.: Quantum deformation of the $W_N$ algebra. q-alg/9612001
4. Awata, H., Kubo, H., Odake, S. and Shiraishi, J.: Virasoro-type symmetries in solvable models. To appear in the CRM series in Mathematical Physics, Springer Verlag, hep-th/9612233
5. Bouwknegt, P., McCarthy, J. and Pilch, K.: The $W_3$ algebra; modules, semi-infinite cohomology and BV-algebras. Lect. Notes in Physics Monographs, m42, Berlin: Springer Verlag, 1996
6. Bouwknegt, P. and Schoutens, K.: Spinon decomposition and Yangian structure of $\widehat{sl}_N$ modules. In "Geometric Analysis and Lie Theory in Mathematics and Physics", Lecture Notes Series of the Australian Mathematical Society, Cambridge: Cambridge University Press, 1997, q-alg/9703021
7. Davies, B., Foda, O., Jimbo, M., Miwa, T. and Nakayashiki, A.: Diagonalization of the XXZ Hamiltonian by vertex operators. Commun. Math. Phys. 151, 89–153 (1993)
8. Feigin, B. and Frenkel, E.: Affine Kac-Moody algebras at the critical level and Gelfand-Dikii algebras. Int. J. Mod. Phys. A7 (Suppl. A1), 197–215 (1992)
9. Feigin, B. and Frenkel, E.: Quantum W-algebras and elliptic algebras. Commun. Math. Phys. 178, 653–678 (1996), q-alg/9508009
10. Feigin, B., Jimbo, M., Miwa, T., Odesskii, A. and Pugai, Y.: Algebra of screening operators for the deformed $W_n$ algebra. q-alg/9702029
11. Frenkel, E. and Reshetikhin, N.: Quantum affine algebras and deformations of the Virasoro and W-algebras. Commun. Math. Phys. 178, 237–264 (1996), q-alg/9505025
12. Frenkel, E. and Reshetikhin, N.: Towards deformed chiral algebras. q-alg/9706023
13. Frenkel, E. and Reshetikhin, N.: Deformations of W-algebras associated to simple Lie algebras. q-alg/9708006
14. Frenkel, E., Reshetikhin, N. and Semenov-Tian-Shansky, M.A.: Drinfeld–Sokolov reduction for difference operators and deformations of W-algebras, I. The case of Virasoro algebra. q-alg/9704011
15. Garsia, A.M.: Orthogonality of Milne's polynomials and raising operators. Discr. Math. 99, 247–264 (1992)
16. Gentile, G.: Osservazioni sopra le statistiche intermedie. Nuovo Cimento 17, 493–497 (1940)
17. Hou, B.-Y. and Yang, W.-L.: A $\hbar$-deformed Virasoro algebra as a hidden symmetry of the restricted Sine–Gordon model. hep-th/9612235
18. Hou, B.-Y. and Yang, W.-L.: An $\hbar$-deformation of the $W_N$ algebra and its vertex operators. J. Phys. A: Math. Gen. 30, 6131–6145 (1997), hep-th/9701101
19. Jing, N.: Vertex operators, symmetric functions and the spin group $\Gamma_n$. J. Algebra 138, 340–398 (1991)
20. Jing, N.: Vertex operators and Hall–Littlewood symmetric functions. Adv. Math. 87, 226–248 (1991)
21. Jimbo, M., Lashkevich, M., Miwa, T. and Pugai, Y.: Lukyanov's screening operators for the deformed Virasoro algebra. Phys. Lett. 229A, 285–292 (1997), hep-th/9607177
22. Lascoux, A., Leclerc, B. and Thibon, J.-Y.: Fonctions de Hall–Littlewood et polynômes de Kostka-Foulkes aux racines de l'unité. C.R. Acad. Sci. Paris 316, 1–6 (1993)
23. Lascoux, A., Leclerc, B. and Thibon, J.-Y.: Green polynomials and Hall–Littlewood functions at roots of unity. Europ. J. Combinatorics 15, 173–180 (1994)
24. Littlewood, D.E.: On certain symmetric functions. Proc. London Math. Soc. 11, 485–498 (1961)
25. Lukyanov, S. and Pugai, Y.: Bosonization of ZF algebras: Direction toward deformed Virasoro algebra. J. Exp. Theor. Phys. 82, 1021–1045 (1996), hep-th/9412128
26. Lukyanov, S. and Pugai, Y.: Multi-point local height probabilities in the integrable RSOS model. Nucl. Phys. B473, 631–658 (1996), hep-th/9602074
27. Macdonald, I.G.: Symmetric functions and Hall polynomials. Oxford: Oxford University Press, 1995
28. Milne, S.C.: Classical partition functions and the $U(n+1)$ Rogers-Selberg identity. Discr. Math. 99, 199–246 (1992)
29. Milne, S.C.: The $C_\ell$ Rogers-Selberg identity. SIAM J. Math. Anal. 25, 571–595 (1994)
30. Morris, A.O.: On an algebra of symmetric functions. Quart. J. Math. Oxford 16, 53–64 (1965)
31. Polychronakos, A.: Path integrals and parastatistics. Nucl. Phys. B474, 529–539 (1996), hep-th/9603179
32. Schoutens, K.: Exclusion statistics in conformal field theory spectra. Phys. Rev. Lett. 79, 2608–2611 (1997), cond-mat/9706166
33. Semenov-Tian-Shansky, A.M. and Sevostyanov, A.V.: Drinfeld–Sokolov reduction for difference operators and deformations of W-algebras, II. General semisimple case. q-alg/9702016
34. Shiraishi, J., Kubo, H., Awata, H. and Odake, S.: A quantum deformation of the Virasoro algebra and the Macdonald symmetric functions. Lett. Math. Phys. 38, 33–51 (1996), q-alg/9507034
Communicated by G. Felder
Commun. Math. Phys. 196, 289 – 318 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Level Spacings for Integrable Quantum Maps in Genus Zero*

Steve Zelditch
Johns Hopkins University, Baltimore, Maryland 21218, USA

Received: 11 September 1997 / Accepted: 30 January 1998

* Partially supported by NSF grant #DMS-9404637.

Abstract: We study the pair correlation function for a variety of completely integrable quantum maps in one degree of freedom. For simplicity we assume that the classical phase space $M$ is the Riemann sphere $CP^1$ and that the classical map is a fixed-time map $\exp t\Xi_H$ of a Hamilton flow. The quantization is then a unitary $N \times N$ matrix $U_{t,N}$ and its pair correlation measure $\rho^{(N)}_{2,t}$ gives the distribution of spacings between eigenvalues in an interval of length comparable to the mean level spacing ($\sim 1/N$). The physicists' conjecture (Berry–Tabor conjecture) is that as $N \to \infty$, $\rho^{(N)}_{2,t}$ should converge to the pair correlation function $\rho_2^{POISSON} = \delta_0 + 1$ of a Poisson process. For any 2-parameter family of Hamiltonians of the form $H_{\alpha,\beta} = \alpha\phi(\hat I) + \beta\hat I$ with $\phi'' \neq 0$ we prove that this conjecture is correct for almost all $(\alpha,\beta)$ along the subsequence of Planck constants $N_m = [m(\log m)^5]$. In the addendum to this paper [Z. Addendum], we further show that for polynomial phases $\phi$ the a.e. convergence to Poisson holds along the full sequence of Planck constants for the Cesaro means of $\rho^{(N)}_{2;(t,\alpha,\beta)}$.

Contents
0 Introduction
1 Toeplitz Quantization
1.1 Toeplitz quantization in genus zero
1.2 Quantum maps
2 Density of States, Pair Correlation Function and Number Variance
2.1 DOS
2.2 The pair correlation function and number variance
2.3 Asymptotics of traces and exponential sums
3 PCF for Hamiltonians: Proof of Theorem A
4 PCF for Quantized Perfect Hamiltonian Flows on $CP^1$: Proof of Theorem B
4.1 Classical Hamiltonian $S^1$ actions and perfect Morse Hamiltonians on $CP^1$
4.2 Quantum $S^1$ actions over $CP^1$
4.3 WKB for quantum completely integral Hamiltonians in genus zero
4.4 The quantizations $U_{\chi_t,N}$ and $U_{t,N}$
4.5 Final form of $\mathrm{Tr}\, U^{\ell}_{t,N}$
4.6 Proof of Theorem B: pair correlation on average
4.7 Proof of Theorem B (a): Number variance on average
5 Mean Square Poisson Statistics for Quantum Spin Evolutions: Proof of Theorem B (b)
5.1 Proof of Theorem B(b) for general non-degenerate phases
6 Appendix: Linear and Quadratic Cases
6.1 Poisson on average in t
0. Introduction

In this paper we shall be concerned with the fine structure of the spectra of some completely integrable quantum maps in genus zero, that is, with quantizations of integrable symplectic maps $\chi$ on the Riemann sphere $M = CP^1$, equipped with its standard (Fubini-Study) form $\omega$ of area one. For any positive integer $N$, $(CP^1, N\omega)$ is quantized by the Hilbert space $\mathcal{H}_N \cong \Gamma(L^N)$ of holomorphic sections of the $N$th power of the hyperplane section line bundle. The quantum system then consists of a sequence of unitary operators $\{U_{\chi,N}\}$ on $\mathcal{H}_N$, a Hilbert space of dimension $N$. For simplicity, we restrict attention to quantizations $U_{t,N}$ of Hamilton flows $\chi_t = \exp t\Xi_H$, where the Hamiltonian $H$ has no separatrix levels. Our interest is in the semiclassical asymptotics ($N \to \infty$) of the pair correlation function $\rho^{(N)}_{2,t}$ and number variance $\Sigma^{(N)}_{2,t}(L)$ of the quantum systems. We first show that the time-averages of these objects tend to the Poisson limits
\[ \frac{1}{b-a} \int_a^b \rho^{(N)}_{2,t}\, dt \to \rho_2^{POISSON} = \delta_0 + 1, \qquad \frac{1}{b-a} \int_a^b \Sigma^{(N)}_{2,t}(L)\, dt \to L \]
as $N \to \infty$. This is consistent with the Berry–Tabor conjecture [B.T] that eigenvalues of completely integrable quantum systems behave like random numbers (waiting times of a Poisson process). However, it is only a weak test of the conjecture since the averaging process itself induces a good deal of the randomness. A much stronger test is whether the variance tends to zero. For special 2-parameter families of Hamiltonians $H_{\alpha,\beta} = \alpha\phi(\hat I) + \beta\hat I$ (see Sect. 2 for the definition) we show that the variance tends to zero at the rate $\frac{(\log N)^3}{N}$. This implies that the individual systems are almost always Poisson along a slightly sparse subsequence of Planck constants. In an addendum [Z.Addendum] we will further show that when $\phi$ is a polynomial, then the Cesaro means in $N$ of the pair correlation function $\rho^{(N)}_{2,t,\alpha,\beta}$ tend almost always to $\rho_2^{POISSON}$.

Before describing the models and results more precisely, let us recall what the level spacings problems are about. In the quantization $H \to \hat H^{(N)}$ of Hamiltonians on compact phase spaces $M$ of dimension $2f$, the "Planck constant" is constrained to the values $h = 1/N$ and the spectrum of $\hat H^{(N)}$ consists of $d_N \sim N^f$ eigenvalues $\{\lambda_{N,j}\}$ in a bounded
interval $[\min H, \max H]$. Similarly, the spectrum of a quantum map $U_{\chi,N}$ consists of $d_N$ eigenvalues $\{e^{i\theta_{N,j}}\}$ on the unit circle $S^1$. The density of states in degree $N$,
\[ d\rho_1^{(N)} = \frac{1}{d_N} \sum_{j=1}^{d_N} \delta(\lambda_{N,j}), \qquad \text{resp.} \qquad d\rho_1^{(N)} = \frac{1}{d_N} \sum_{j=1}^{d_N} \delta(e^{i\theta_{N,j}}), \]
has a well defined weak limit as $N \to \infty$ which may be calculated by standard methods of microlocal analysis (Sect. 2). According to the physicists, there also exist asymptotic patterns in the spectra on the much smaller length scale of the mean level spacing $\frac{1}{d_N}$ between consecutive eigenvalues. The pair correlation function $\rho_2$, for instance, is the limit distribution of spacings between all pairs of normalized eigenvalues $d_N\lambda_{N,j}$. The length scale $\frac{1}{d_N}$ is usually below the resolving power of microlocal methods. Hence the problem of rigorously determining the limit, or even of determining whether it exists, has remained open for almost all quantum systems. The sole exceptions are the cases of almost all flat 2-tori [Sa.2] (see also [Bl.L]) and Zoll surfaces [U.Z]. For other rigorous results on level spacings for Laplacians on surfaces with completely integrable geodesic flow, see [S, K.M.S, Bl.K.S]. On the other hand, there exist numerous computer studies of eigenvalue spacings in the physics literature which indicate that limit PCFs often exist. The following conjectures give a rough guideline towards the expected shape of the level spacings statistics:

• When the classical system is generic chaotic, $\rho_2 = \rho_2^{GOE}$, where $\rho_2^{GOE}$ is the limit expected PCF for $N \times N$ random matrices in the Gaussian orthogonal ensemble;
• When the classical system is generic completely integrable, $\rho_2 = \rho_2^{POISSON} := 1 + \delta_0$.

That is, at least on the level of the PCF, the normalized spacings between eigenvalues behave like waiting times of a Poisson process. The term $\delta_0$ comes from the diagonal, while the term 1 reflects that any spacing between distinct pairs is as likely as any other. These conjectures should not be taken too literally, and indeed cannot be since the term "generic" is not precisely defined. Our main purpose in this article is to test the Poisson conjecture against quantized Hamilton flows in one degree of freedom on the compact Kähler phase space $CP^1$.
Of course, they are necessarily completely integrable. It might also be suspected that quantized Hamilton flows in one degree of freedom are necessarily trivial, but this is not the case: as will be seen, PCFs of toral completely integrable systems on $CP^1$ are almost always Poisson along a slightly sparse subsequence of Planck constants. It should also be recalled that many of the model quantum chaotic systems, such as kicked tops and rotors and cat maps, take place in one degree of freedom and still defy rigorous analysis [Iz, Kea]. The quantum maps studied in this paper thus join a growing list of integrable quantum systems whose level spacings have been shown rigorously to exhibit some degree of Poisson statistics. On the other hand, it is clear that not all of the quantum maps in our 2-parameter families exhibit Poisson behaviour (see the Appendix for a counterexample). Rather, our results tend to corroborate the (still rather vague) picture that (some level of) Poisson statistics occurs almost everywhere in an n-parameter family of non-degenerate integrable systems, but that a dense exceptional set of non-Poisson systems occurs as well. This probabilistic revision of the Berry–Tabor conjecture seems to have been first proposed by Sinai in his study of a closely related lattice point problem [S]. It was also stated clearly by Sarnak [Sa.2] in his proof that almost every flat 2-torus has a Poisson PCF but that a dense residual set of flat tori had no well-defined PCF. We should also emphasize that all of the rigorous results at the present time on eigenvalue spacings of completely integrable systems only pertain to the 2-level correlation
function; neither the $k$-level correlation functions for $k \geq 3$ nor the (nearest-neighbor) level spacings distribution have been proved to be Poisson. In this connection it is interesting to recall a suggestion of Sinai concerning the "degree" of Poisson behaviour of typical members in an n-parameter family of quantum systems (e.g. as measured by the largest $k$ so that the $k$-level correlation function is Poisson): namely, that the typical degree could depend on the number $n$ of parameters in the system. Thus our 2-parameter families (and the 2-parameter family of flat 2-tori) exhibit Poisson PCFs a.e., but it is unknown whether their higher-level correlation functions or level spacings distribution are also Poisson.

Now let us be more precise about the models we will study. In the usual Kähler quantization of $(CP^1, N\omega)$, $\mathcal{H}_N$ may be identified with the space $\mathcal{P}_N$ of homogeneous holomorphic polynomials $f(z_1,z_2)$ of degree $N$ on $\mathbb{C}^2$. A classical Hamiltonian $H \in C^\infty(CP^1)$ is then quantized as a self-adjoint Toeplitz operator
\[ \hat H^{(N)} := \Pi_N H \Pi_N : \mathcal{H}_N \to \mathcal{H}_N, \qquad \psi_{N,j} \to \Pi_N H \psi_{N,j}, \]
where $\Pi_N$ is the (Cauchy–Szegö) orthogonal projection on $\mathcal{H}_N$ (Sect. 1). Hence the quantum Hamiltonian system amounts to the eigenvalue problem:
\[ \hat H^{(N)} \phi_{N,j} = \lambda_{N,j}\, \phi_{N,j}, \qquad -\|H\|_\infty \leq \lambda_{N,1} \leq \lambda_{N,2} \leq \cdots \leq \lambda_{N,N} \leq \|H\|_\infty. \]
For a fixed value $h = \frac{1}{N}$ of the Planck constant, the distribution of normalized spacings between all possible pairs of eigenvalues of $\hat H^{(N)}$ is given by the $N$th pair correlation "function" (measure)
\[ d\rho_2^{(N)}(x) = \frac{1}{N} \sum_{i,j=1}^{N} \delta\big(x - N(\lambda_{N,i} - \lambda_{N,j})\big). \]
Here, the eigenvalues are rescaled, $\lambda_{N,j} \to N\lambda_{N,j}$, to have unit mean level spacing, i.e. so that $N(\lambda_{N,i+1} - \lambda_{N,i}) \sim 1$ on average. Our first result gives an explicit formula for the limit pair correlation function
\[ d\rho_2 = \lim_{N\to\infty} d\rho_2^{(N)}(x) \]
of a quantized Hamiltonian $\hat H^{(N)}$. It is of a similar nature to the pair correlation function for a Zoll Laplacian ([U.Z]) and involves dynamical invariants of the classical Hamiltonian flow $\exp t\Xi_H$ generated by $H$ on the classical phase space. Under some generic hypotheses (which will be stated precisely in Sect. 2), the formula is given by:

Theorem A. For the generic $H \in C^\infty(M)$, the limit pair correlation function for the system $\hat H^{(N)}$ is given by:
\[ \rho_2(f) = V \hat f(0) + \sum_{k\in\mathbb{Z}} \sum_{\nu=1}^{M} \sum_{j=1}^{N(\nu)} \int_{(c_\nu, c_{\nu+1})} \hat f\big(k T_j^\nu(E)\big)\, T_j^\nu(E)^2\, dE, \]
where:
(i) $V = \mathrm{vol}\{(z_1,z_2) \in M \times M : H(z_1) = H(z_2)\}$;
(ii) $\{c_\nu\}$ is the set of critical values of $H$;
(iii) for a regular value $E \in (c_\nu, c_{\nu+1})$, $H^{-1}(E)$ is a union of periodic orbits $\{\gamma_j^\nu\}_{j=1}^{N(\nu)}$ of $\exp t\Xi_H$ and $T_j^\nu(E)$ is the minimal positive period of the $j$th component.
It follows that the pair correlation function of quantized Hamiltonians in one degree of freedom is quite deterministic. On the other hand, the eigenvalues of the associated quantized Hamiltonian flow are much more random. Before describing the results, let us recall the definition of a quantum map and of its pair correlation function. Suppose that $\chi_o$ is a symplectic map of a compact symplectic manifold $(M,\omega)$. It is called quantizable if it can be lifted to a contact transformation $\chi$ of the prequantum $S^1$ bundle $\pi : (X,\alpha) \to (M,\omega)$, where $d\alpha = \pi^*\omega$. We are mainly interested here in Hamiltonian flows $\chi_t$ and these are always quantizable (Sect. 1). We then define the quantization of the map $\chi_o$ to be $U_{\chi,N} := \Pi_N \sigma_\chi T_\chi \Pi_N : \mathcal{H}_N \to \mathcal{H}_N$, where $T_\chi$ is the translation operator by $\chi$ on $\mathcal{H}_N$ and $\sigma_\chi \in C^\infty(M)$ is the "symbol", designed to make $U_{\chi,N}$ unitary. All of the usual quantum maps, e.g. "cat maps" and kicked rotors, can be obtained by this method [Z]. Since the eigenvalues lie on the unit circle, the rescaling to unit mean level spacing leads to the (periodic) pair correlation functions
\[ \begin{aligned} d\rho_2^{(N)}(x) &:= \frac{1}{N} \sum_{i,j=1}^{N} \sum_{\ell\in\mathbb{Z}} \delta\big(x - N(\theta_{N,i} - \theta_{N,j} - \ell N)\big) \\ &= \frac{1}{N^2} \sum_{i,j=1}^{N} \sum_{\ell\in\mathbb{Z}} \delta\Big(\theta_{N,i} - \theta_{N,j} - 2\pi\ell - \frac{x}{N}\Big). \end{aligned} \]
The large $N$ behaviour of $d\rho_2^{(N)}$ is quite “random” in general because the rescaling destroys the Lagrangian nature of $U_{\chi,N}$. The question is whether there is some asymptotic pattern to the randomness. As mentioned above, we restrict attention in this paper to a simple but reasonably representative case of the question, namely to Hamiltonian flows generated by perfect Morse functions $H$ on $CP^1$. The reason for restricting attention to $CP^1$ is that it is the only symplectic surface carrying a Hamiltonian $S^1$ action (i.e. it is a “toric variety”), namely the usual rotation of the sphere about an axis. The moment map is known as an action variable $I$. Any perfect Morse function may be written as a function $H = \phi(I)$ of a global action variable. Any toral action can be quantized and in particular $I$ can be quantized as an operator $\hat I^{(N)}$ whose spectrum lies on a one dimensional “lattice” $\{\frac{j}{N} : j = -N, \dots, N\}$. It follows that $H$ is quantized as an operator of the form $\phi(\hat I)$ and its flow can be quantized as a unitary group of the form $U_{t,N} = \Pi_N e^{it\mathcal N\hat H}\Pi_N$, where $\mathcal N$ equals $N$ on $H_N$. Hence the eigenangles have the form $tN\phi(\frac{j}{N})$ and the asymptotics of the PCF can be reduced to the study of exponential sums of the form
$$S(N; \ell, t) = \sum_{j=1}^{N} e^{2\pi i N \ell \phi\left(\frac{j}{N}\right) t}.$$
The Poisson conjecture is essentially that these exponential sums behave like random walks. It is too difficult to analyse the individual exponential sums, but we can successfully analyse some typical behaviour in families of such systems. The first result is about the mean behaviour as the t parameter varies.
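As a small numerical illustration of these exponential sums, the following sketch evaluates $S(N;\ell,t)$ directly from its definition. The two phase functions are hypothetical choices made only for this example, not Hamiltonians singled out by the paper: a linear phase, for which every term equals 1 at $t = \ell = 1$ so that no cancellation occurs, and the phase $\phi(x) = x^2$, for which the sum at $t = \ell = 1$ and odd $N$ is a classical quadratic Gauss sum of modulus $\sqrt N$, the size expected of a random walk with $N$ steps.

```python
import cmath
import math


def exponential_sum(N, ell, t, phi):
    """Return S(N; ell, t) = sum_{j=1}^{N} exp(2*pi*i * N * ell * phi(j/N) * t)."""
    return sum(
        cmath.exp(2j * math.pi * N * ell * phi(j / N) * t)
        for j in range(1, N + 1)
    )


def linear_phase(x):
    # Hypothetical linear phase: at t = ell = 1 every term of the sum equals 1.
    return x


def quadratic_phase(x):
    # Hypothetical phase with nonvanishing second derivative.
    return x * x


if __name__ == "__main__":
    N = 101
    for name, phi in [("linear", linear_phase), ("quadratic", quadratic_phase)]:
        modulus = abs(exponential_sum(N, 1, 1.0, phi))
        print(name, modulus, "random-walk scale:", math.sqrt(N))
```

The contrast between the two moduli is only meant to make the phrase "behave like random walks" concrete; it says nothing about the averaged statements proved below.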
Theorem B(a). Suppose $H : M \to \mathbb R$ is a perfect Morse function on $CP^1$. Then the limit PCF $\rho_{2,t}$ and number variance $\Sigma_{2,t}(L)$ for $U_{t,N}$ are Poisson on average in the sense:
$$\lim_{N\to\infty} \frac{1}{b-a}\int_a^b \rho_{2,t}^{(N)}\, dt = \rho_2^{POISS} := 1 + \delta_0$$
and
$$\lim_{N\to\infty} \frac{1}{b-a}\int_a^b \Sigma_{2,t}^{(N)}(L)\, dt = \Sigma_2^{POISS}(L) := L$$
for any interval $[a, b]$ of $\mathbb R$.

This result applies to the case of linear Hamiltonians and their Hamilton flows, whose pair correlation functions are clearly not individually Poisson (cf. Sect. 6). For Poisson level spacings, we make some further assumptions on the Hamiltonian (or phase $\phi$). Our main result concerns the mean and variance of a 2-parameter family of Hamiltonians:

Theorem B(b). Let $I$ denote an action variable on $CP^1$ and let $H_{\alpha,\beta} = \alpha\phi(I) + \beta I$ with $|\phi''| > 0$. Denote by $\rho^{(N)}_{2;(t,\alpha,\beta)}$ the pair correlation measure for the quantum map $U_{(t,\alpha,\beta),N} = \exp(itN\hat H_{(\alpha,\beta;N)})$. Then for any $t \neq 0$, any $T > 0$ and any $f \in \mathcal S(\mathbb R)$ with $\hat f \in C_o^\infty(\mathbb R)$ we have
$$\frac{1}{(2T)^2}\int_{-T}^{T}\int_{-T}^{T} \big|\rho^{N}_{2;(t,\alpha,\beta)}(f) - \rho_2^{POISSON}(f)\big|^2\, d\alpha\, d\beta = O\Big(\frac{(\log N)^3}{N}\Big).$$
Thus, the mean pair correlation function in the family is Poisson and the variance tends to zero at the rate $\frac{(\log N)^3}{N}$. Following [Sa.2], we conclude:

Corollary. Let $N_m = [m(\log m)^5]$. Then, for almost all $(\alpha, \beta)$, $\rho^{N_m}_{2;(t,\alpha,\beta)} \to \rho_2^{POISSON}$.
It would be interesting to study quantizations of Hamilton flows in the case where the Hamiltonian has saddle levels, as must happen if the genus is > 0. It would also be interesting to study completely integrable maps which are not Hamilton flows. We hope to extend our methods and results to these cases in the future.

1. Toeplitz Quantization

We now review the basics of Toeplitz quantization on $CP^1$. For further background on Kähler quantization, we refer to [G.S]; for general Toeplitz quantization we refer to [B.G, Z]. Toeplitz quantization is a form of Kähler quantization, that is, of quantization of symplectic manifolds in the presence of a holomorphic structure. The basic idea is that the quantum system is the restriction of the classical system to holomorphic functions. To be more precise, let $(M, \omega)$ be a compact Kähler manifold with integral symplectic form. Then there is a positive hermitian holomorphic line bundle $L \to M$ with connection 1-form $\alpha$ whose curvature equals $\omega$. In Kähler quantization, the phase space $(M, \omega)$ is quantized as the sequence of finite dimensional Hilbert spaces $\Gamma(L^{\otimes N})$, where $\Gamma$ denotes the holomorphic sections. In Toeplitz quantization, these spaces are put together as the Hardy space $H^2(X)$ of CR functions on the unit circle bundle $X$ in $L^*$. Thus, the setting for Toeplitz quantization is a compact contact manifold $(X, \alpha)$ whose contact flow
$$\phi^\theta : X \to X, \qquad \phi^{\theta*}\alpha = \alpha \tag{1}$$
defines a free $S^1$-action with quotient a Kähler manifold $M$ whose Kähler form $\omega$ pulls back to $d\alpha$. The Kähler structure on $M$ also induces a CR structure on $X$. The corresponding Hardy space $H^2(X)$ is the space of boundary values of holomorphic functions on the disc bundle of $L^*$ which lie in $L^2(X)$. The orthogonal (Cauchy–Szegő) projector $\Pi : L^2(X) \to H^2(X)$ defines a Toeplitz structure on $X$ in the sense of [B.G]. From the symplectic point of view, $H^2(X)$ is viewed as the quantization of the symplectic cone $\Sigma = \{(x, r\alpha_x) : r \in \mathbb R^+\} \subset T^*X - 0$. To be precise, $\Pi$ is a Hermite Fourier integral operator with wave front set on the isotropic submanifold $\Sigma^* := \{(\sigma, -\sigma) : \sigma \in \Sigma\} \subset T^*(X \times X)$. The CR structure corresponds to a positive definite Lagrangian sub-bundle $\Lambda$ of $T\Sigma^\perp$, the symplectic normal bundle of $\Sigma$. The vector fields generating $\Lambda$ annihilate a ground state $e_\Lambda$ in the quantization of $T\Sigma^\perp$. The symbol of $\Pi$ is the orthogonal projection $\pi = e_\Lambda \otimes e_\Lambda^*$ onto this ground state. For a detailed account of these objects we refer to [B.G].

1.1. Toeplitz quantization in genus zero. In the case of $M = CP^1$, the contact manifold $X$ may be identified with $SU(2)$ and the Hardy space $H^2(X)$ may be identified with the space of lowest weight vectors for the right action of $SU(2)$ on $L^2(SU(2))$. To make the Toeplitz theory more concrete, let us recall how these identifications are made. We first recall [G.H, Sect. I.3] that the holomorphic line bundles over $CP^1$ are all powers $H^{\otimes N}$ of the hyperplane bundle $H \to CP^1$, whose fiber over $V \in CP^1$ is the space $V^*$ of linear functionals on the line through $V$. The Chern class of $H$ is the Fubini–Study form $\omega_{FS}$, which generates $H^2(CP^1)$. The holomorphic sections $H_N$ of $H$ are given by the linear functionals $L$ on $\mathbb C^2$ by setting $s_L(V) = L|_V$. More generally, the holomorphic sections of $H^{\otimes N}$ correspond to homogeneous holomorphic polynomials of degree $N$ on $\mathbb C^2$.
The principal $S^1$ bundle associated to $H$ is evidently the unit sphere $S^3 \subset \mathbb C^2$, which we identify with $SU(2)$. As the boundary of the unit ball $B \subset \mathbb C^2$, it has a natural CR structure. The associated Hardy space is the usual space of boundary values of holomorphic functions on $B$. Under the $S^1$ action $e^{i\theta}(z_1, z_2) = (e^{i\theta}z_1, e^{i\theta}z_2)$ of $S^3 \to CP^1$, it is evident that the holomorphic functions transforming by $e^{iN\theta}$ are given by homogeneous holomorphic polynomials of degree $N$. The Cauchy–Szegő kernel is given by $\Pi_N(z, w) = \langle z, w\rangle^N$. We also recall that the irreducible representations $H_N$ of $SU(2)$ are given by its actions on homogeneous polynomials. By the Plancherel theorem, $L^2(SU(2)) = \bigoplus_{N=1}^{\infty} H_N \otimes H_N^*$. The CR structure induced on $SU(2)$ by the identification $S^3 = \partial B \equiv SU(2)$ is equivalent to that given by the lowering operator $L_-$ for the right action. The Szegő projector $\Pi_N$ is then the orthogonal projection onto $H_N \otimes \psi_N$, where $\psi_N$ is the lowest weight vector in $H_N^*$.

Below we will often refer to an action operator $\hat I^{(N)}$ on $CP^1$. It may be identified with the Planck constant $\frac{1}{N}$ times any generator (e.g. $L_z$) of a Cartan subgroup of $SU(2)$. Thus its eigenvalues in $H_N$ are the weights $\frac{j}{N}$.

1.2. Quantum maps. Symplectic maps $\chi_o$ on $CP^1$ may be quantized by the Toeplitz method as long as $\chi_o$ lifts to a contact transformation $\chi$ of $(X, \alpha)$. The Toeplitz quantization is almost the translation operator $T_\chi$ by $\chi$ compressed to the Hardy space $H^2(X)$. Since $T_\chi$ does not usually preserve $H^2(X)$, $\Pi T_\chi \Pi$ is not generally unitary; to unitarize it one must equip it with a symbol. In [Z] it is described how to construct a symbol $\sigma_\chi$ on $M$ for any quantizable symplectic map on any compact symplectic $M$ so that $U_\chi := \Pi\sigma_\chi T_\chi\Pi$ is unitary. We will describe the symbol in some detail in Sect. 2.3. It automatically commutes with the $S^1$ action, so is the direct sum of the finite unitary operators $U_{\chi,N}$ on $H_N$. We define $U_{\chi,N}$ to be the quantization of $\chi_o$ with semiclassical parameter $1/N$. Its eigenvalues have the form
$$\mathrm{Sp}(U_{\chi,N}) = \{e^{2\pi i\theta_{N,j}} : j = 1, \dots, d_N\}, \tag{2}$$
where $d_N = \dim H_N = N$. Consider now the case of Hamilton flows $\chi_o^t = \exp t\Xi_H$ on a general symplectic manifold $(M, \omega)$.

Proposition 1.2.1. Hamilton flows are always quantizable.

Proof. What needs to be proved is that $\exp t\Xi_H$ always lifts to a contact flow $\chi_t : X \to X$, equivalently that $\Xi_H$ lifts to a contact vector field, say $X_H$. We prove this by lifting $\exp t\Xi_H$ to a homogeneous Hamilton flow $\exp t\bar\Xi_{\bar H}$ on the symplectic cone $\Sigma$. Let us define the function
$$r : \Sigma \to \mathbb R^+, \qquad r(x, r\alpha_x) = r.$$
Thus, $\Sigma \cong X \times \mathbb R^+$, $X \cong \{r = 1\}$ and the $\mathbb R^+$ action is generated by the vector field $R = r\frac{\partial}{\partial r}$. The natural symplectic structure $\omega$ on $\Sigma$ is the restriction of the canonical symplectic structure $\omega_{T^*X}$ on $T^*X$, which is homogeneous of degree 1. Denoting by $\pi : X \to M$ the projection, we have:
$$\omega = r\pi^*\omega_M + dr \wedge \alpha. \tag{3}$$
The proof is simply that $\omega_{T^*X} = d\alpha_{T^*X}$, where $\alpha_{T^*X}$ is the action 1-form. This equation restricts to $\Sigma$, where $\alpha_{T^*X} = r\alpha$. Taking the exterior derivative gives the formula.

Now return to $H \in C^\infty(M)$ and consider the Hamiltonian $\bar H(x, r) = r\pi^*H(x)$ on $\Sigma$. It is homogeneous of degree 1, so its Hamilton vector field $\bar\Xi_{\bar H} = \omega^{-1}(d\bar H)$ is homogeneous of degree zero and then its Hamilton flow $\exp t\bar\Xi_{\bar H}$ is homogeneous of degree one. We claim that (i) the flow preserves $X$; and (ii) its restriction $\chi_t$ to $X$ is a contact flow lifting $\exp t\Xi_H$. Indeed, we have
$$\iota_{\bar\Xi_{\bar H}}\omega = d(rH) = r\,dH + H\,dr = r\,\iota_{\bar\Xi_{\bar H}}\omega_M + \iota_{\bar\Xi_{\bar H}}\,dr\wedge\alpha = r\,\iota_{\bar\Xi_{\bar H}}\omega_M - \alpha(\bar\Xi_{\bar H})\,dr + dr(\bar\Xi_{\bar H})\,\alpha.$$
Since all terms except $dr(\bar\Xi_{\bar H})\alpha$ are $d\theta$-independent we must have $dr(\bar\Xi_{\bar H}) = 0$. Here $d\theta$ denotes the vertical one form of $X$. It is then obvious that
$$-\alpha(\bar\Xi_{\bar H}) = H, \qquad \iota_{\bar\Xi_{\bar H}}\omega_M = dH.$$
The second equation says that $\bar\Xi_{\bar H}$ projects to $\Xi_H$, i.e. $\bar\Xi_{\bar H}$ is a lift of $\Xi_H$. Since
$$L_{\bar\Xi_{\bar H}}\alpha = \iota_{\bar\Xi_{\bar H}}\,d\alpha + d(\iota_{\bar\Xi_{\bar H}}\alpha) = dH - dH$$
we also see that $\bar\Xi_{\bar H}$ is a contact vector field (here, $L$ is the Lie derivative).
2. Density of States, Pair Correlation Function and Number Variance

2.1. DOS. Before considering the pair correlation function, we first describe the limit density of states (DOS) of quantum Hamiltonians and quantum maps in the Toeplitz setting. They can be easily determined from the trace formulae of [B.G] and indeed the calculation is carried out in [Z, Theorem A]. Let us recall the results. In the case of Hamiltonians, the DOS in degree $N$ is defined by
$$d\rho_1^{(N)}(\lambda) := \frac{1}{d_N}\sum_{j=1}^{d_N} \delta(\lambda - \lambda_{N,j}). \tag{4}$$
By [B.G, Theorem 13.13] we have:

Proposition 2.1.1. The limit DOS is given by
$$\beta_o(f) = \int_M f(H)\,\omega \qquad (f \in C(\mathbb R)).$$
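The limit in Proposition 2.1.1 can be checked by hand in the simplest model case. The sketch below makes several assumptions that are not part of the statement above: it takes $H = x_3$ on the round sphere, uses the lattice spectrum $\lambda_{N,j} = j/N$, $j = -N, \dots, N$, of the action operator described elsewhere in the paper, and normalizes the symplectic area to 1, so that the push-forward of $\omega$ under $H$ is the uniform measure $\frac12\,dE$ on $[-1, 1]$. The function names are hypothetical.

```python
def dos_average(f, N):
    """Average of f over the model spectrum lambda_{N,j} = j/N, j = -N..N."""
    eigenvalues = [j / N for j in range(-N, N + 1)]
    return sum(f(lam) for lam in eigenvalues) / len(eigenvalues)


def limit_dos(f, steps=20000):
    """Midpoint-rule value of (1/2) * integral of f(E) over [-1, 1].

    For H = x_3 on the sphere with total area normalized to 1, this is the
    integral of f(H) against the symplectic form.
    """
    h = 2.0 / steps
    return 0.5 * h * sum(f(-1.0 + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    def square(E):
        return E * E

    for N in (10, 100, 1000):
        print(N, dos_average(square, N), limit_dos(square))
```

For $f(E) = E^2$ the finite average equals $(N+1)/(3N)$, which visibly tends to the limit value $1/3$.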
In the case of quantum maps the DOS in degree $N$ is defined by
$$d\rho_1^{(N)}(z) := \frac{1}{d_N}\sum_{j=1}^{d_N} \delta(z - e^{2\pi i\theta_{N,j}}), \qquad z \in S^1. \tag{5}$$
The limit DOS $\beta_o$ is determined in [Z, Theorem A] and depends on whether the classical map is periodic or aperiodic (i.e. the set of periodic points has measure zero).

Proposition 2.1.2. Let $\chi$ be a symplectic map of $(M, \omega)$.
(a) In the aperiodic case, $\beta = c_o\,d\theta$, where $c_o$ is the constant $\left(\int_M \sigma\,d\mu\right)$ with $\sigma$ the symbol of $U_\chi$.
(b) If $\chi^k = \mathrm{id}$, then $\beta$ is a linear combination of delta functions at the $k$th roots of unity.

2.2. The pair correlation function and number variance. We recall here the definitions of the pair correlation function and number variance for quantum maps $U_{\chi,N}$ in $f$ degrees of freedom. Then $\dim H_N = d_N \sim N^f$ and the spectrum has the form $\mathrm{Sp}(U_{\chi,N}) = \{e^{i\theta_{Nj}} : j = 1, \dots, d_N\}$. The spectrum may be identified with the periodic sequence $\{\theta_{Nj} + 2\pi n : n \in \mathbb Z,\ j = 1, \dots, d_N\}$ and then rescaled to give a periodic sequence of period $N$ and mean level spacing one: $\{d_N\theta_{Nj} + 2\pi n d_N : n \in \mathbb Z,\ j = 1, \dots, d_N\}$.

Definition 2.2.1. The pair correlation function of level $N$ of a quantum map in $f$ degrees of freedom is the measure on $\mathbb R$ given by
$$d\rho_2^{(N)}(x) := \frac{1}{d_N}\sum_{j,k=1}^{d_N}\sum_{n\in\mathbb Z} \delta\big(d_N(\theta_{Nj} - \theta_{Nk}) + 2\pi n d_N - x\big).$$
The limit pair correlation function is then:
$$d\rho_2(x) = \operatorname*{w-lim}_{N\to\infty}\, d\rho_2^{(N)}(x). \tag{6}$$
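We often write the integral $\rho_2^{(N)}(f) = \int_{\mathbb R} f\,d\rho_2^{(N)}\,dx$ as
$$\frac{1}{d_N}\sum_{j,k=1}^{d_N}\sum_{n\in\mathbb Z} f\big(d_N(\theta_{Nj} - \theta_{Nk}) + 2\pi n d_N\big).$$
By the Poisson summation formula we have:
$$\rho_2^{(N)}(f) = \frac{1}{d_N^2}\sum_{\ell\in\mathbb Z} \hat f\Big(\frac{2\pi\ell}{d_N}\Big)\sum_{j,k=1}^{d_N} e^{i\ell(\theta_{N,j} - \theta_{N,k})} = \frac{1}{d_N^2}\sum_{\ell} \hat f\Big(\frac{2\pi\ell}{d_N}\Big)\,\big|\mathrm{Tr}\,U_{\chi,N}^\ell\big|^2. \tag{7}$$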
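The last equality in (7) rests on the elementary identity $\sum_{j,k} e^{i\ell(\theta_j - \theta_k)} = |\mathrm{Tr}\,U^\ell|^2$ for a unitary $U$ with eigenvalues $e^{i\theta_j}$. A minimal numerical check, with hypothetical function names and an arbitrary list of eigenangles chosen only for illustration:

```python
import cmath


def trace_power(eigenangles, ell):
    """Tr U^ell for a unitary matrix with eigenvalues exp(i * theta_j)."""
    return sum(cmath.exp(1j * ell * theta) for theta in eigenangles)


def double_sum(eigenangles, ell):
    """sum over all pairs (j, k) of exp(i * ell * (theta_j - theta_k))."""
    return sum(
        cmath.exp(1j * ell * (a - b))
        for a in eigenangles
        for b in eigenangles
    )


if __name__ == "__main__":
    thetas = [0.1, 1.3, 2.9, 4.0]  # arbitrary sample eigenangles
    for ell in range(-3, 4):
        print(ell, double_sum(thetas, ell).real, abs(trace_power(thetas, ell)) ** 2)
```

The double sum costs $d_N^2$ operations per $\ell$ while the trace costs $d_N$, which is one practical reason to work with the right-hand side of (7).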
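A closely related spectral statistic is the number variance for the quantum map $U_{\chi,N}$. It is defined as follows (cf. [Kea]): First, define the density of the scaled eigenangles by
$$\rho_s^{(N)}(\theta) = \sum_{j=1}^{d_N}\sum_{n\in\mathbb Z} \delta(\theta - d_N\theta_{Nj} + 2\pi d_N n) = \frac{1}{d_N}\sum_{\ell\in\mathbb Z}\sum_{j=1}^{d_N} e^{2\pi i\ell\theta_{N,j}}\, e^{-2\pi i\ell\theta/d_N} = 1 + \frac{2}{d_N}\sum_{\ell=1}^{\infty} \mathrm{Re}\,\mathrm{Tr}\,U_{\chi,N}^\ell\, e^{-2\pi i\ell\theta/d_N}.$$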
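Then:

Definition 2.2.2. The number variance of $U_{\chi,N}$ is defined by:
$$\Sigma_2^{(N)}(L) = \frac{1}{N}\int_0^N \Big|\int_{x-L/2}^{x+L/2} \rho_s^{(N)}(y)\,dy - L\Big|^2 dx = \frac{2}{\pi^2}\sum_{\ell=1}^{\infty} \frac{1}{\ell^2}\sin^2\Big(\frac{\pi\ell L}{d_N}\Big)\,\big|\mathrm{Tr}\,U_{\chi,N}^\ell\big|^2.$$
We observe that $\Sigma_2^{(N)}(L)$ is similar to $\rho_2^{(N)}(f)$ for $\hat f = \frac{\sin x}{x}$ except that the $\ell = 0$ term has been removed.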
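To see how the series for the number variance produces the Poisson value $L$, one can feed it an idealized form factor. The sketch below reads the series as $\frac{2}{\pi^2}\sum_{\ell\ge1}\ell^{-2}\sin^2(\pi\ell L/d_N)\,|\mathrm{Tr}\,U^\ell|^2$ and assumes, purely for illustration, that $|\mathrm{Tr}\,U^\ell|^2 = d_N$ for every $\ell \ge 1$; this is a hypothetical stand-in for Poisson statistics, not the spectrum of any particular quantum map. Under that assumption the classical Fourier series $\sum_{\ell\ge1}\sin^2(\ell x)/\ell^2 = x(\pi - x)/2$ for $0 \le x \le \pi$ gives the closed form $L(1 - L/d_N)$, which tends to $L$ as $d_N \to \infty$.

```python
import math


def number_variance(L, d_N, trace_sq, terms=200000):
    """Truncated series (2/pi^2) * sum_{l=1}^{terms} sin^2(pi*l*L/d_N) / l^2 * trace_sq(l).

    trace_sq(l) should return |Tr U^l|^2; the truncation length is a
    numerical convenience, not part of the definition.
    """
    total = 0.0
    for ell in range(1, terms + 1):
        total += math.sin(math.pi * ell * L / d_N) ** 2 / ell ** 2 * trace_sq(ell)
    return 2.0 / math.pi ** 2 * total


if __name__ == "__main__":
    d_N = 50

    def poisson_like(ell):
        # Idealized assumption: |Tr U^ell|^2 = d_N for all ell >= 1.
        return d_N

    for L in (1.0, 3.0, 10.0):
        print(L, number_variance(L, d_N, poisson_like), L * (1 - L / d_N))
```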
2.3. Asymptotics of traces and exponential sums. Before getting down to our specific models, let us make some general remarks about the exponential sums $S(N, \ell) := \mathrm{Tr}\,U_{\chi,N}^\ell$. First, the traces $\mathrm{Tr}\,U_{\chi,N}^\ell$ have complete asymptotic expansions as $N \to \infty$. To state the results, we need some notation. Recall that the $S^1$ action on $X$ is denoted $\phi^\theta$. For each $\ell$ put
$$\Theta_{\chi,\ell} = \{\theta_j \bmod 2\pi : \mathrm{Fix}(\phi^{\theta_j} \circ \chi) \neq \emptyset\}. \tag{8}$$
Assuming (as we will) that the maps have clean fixed point sets, the set $\Theta_{\chi,\ell}$ is finite and $\mathrm{Fix}(\phi^{\theta_j} \circ \chi)$ is a conic submanifold of $\Sigma$. We denote its dimension by $e_j$ and its base $\mathrm{Fix}(\phi^{\theta_j} \circ \chi) \cap X$ by $S\mathrm{Fix}(\phi^{\theta_j} \circ \chi)$. The trace asymptotics then have the form:
Proposition 2.3.1. For each $\ell \neq 0$,
$$\mathrm{Tr}\,U_{\chi,N}^\ell \sim \sum_{\theta_j\in\Theta_{\chi,\ell}}\sum_{r=0}^{\infty} a_{\ell,j,r}\, N^{\frac{e_j - 1}{2} - r}\, e^{iN\theta_j}$$
for certain coefficients $a_{\ell,j,r}$. The leading coefficients are given by:
$$a_{\ell,j,0} = \int_{S\mathrm{Fix}(\phi^{\theta_j}\circ\chi^\ell)} d\mu_{\chi^\ell},$$
where $d\mu_{\chi^\ell}$ is the canonical density on $S\mathrm{Fix}(\phi^{\theta_j} \circ \chi^\ell)$.

Before sketching the proof, let us describe in more detail the ingredients that go into the principal coefficients. Roughly speaking, the quantized map $U_{\chi,N}$ involves two pieces of data in addition to the Szegő projector: the map $\chi$ and the symbol $\sigma_\chi$. We recall from [Z] that the scalar principal symbol of $U_\chi$ is given by $\sigma_\chi = \langle\chi_* e_\Lambda, e_\Lambda\rangle^{-1}$, where $e_\Lambda$ is the section of the “bundle of ground states” corresponding to $\Pi$. See [B.G, Sect. 11] for the definition and properties of the ground states (normal Gaussians). Since $\chi$ is rarely holomorphic, it will generally not commute with $\Pi$ and will take $e_\Lambda$ to another ground state $\chi_* e_\Lambda$. After trivializing the “bundle of ground states” the map $\chi_*$ may be described as follows: the derivative $d\chi$ defines a linear symplectic map $d\chi|_{\Sigma^\perp}$ on the symplectic normal bundle $T\Sigma^\perp$ of $\Sigma$. The quantization of the normal space is a space of Schwartz functions and the quantization of $d\chi|_{\Sigma^\perp}$ is its image $\mathcal M(d\chi|_{\Sigma^\perp})$ under the metaplectic representation. Then $\chi_* = \mathcal M(d\chi|_{\Sigma^\perp})$.

As a Fourier integral Toeplitz operator, $U_\chi \in I^o(X \times X, \mathrm{graph}(\chi)')$, where $\mathrm{graph}(\chi)' = \{(\sigma, -\chi(\sigma)) : \sigma \in \Sigma\}$. Hence its symbol is a symplectic spinor on the graph. Under the natural parametrization $j : \Sigma \to \mathrm{graph}(\chi)$, the symbol may be viewed as a symplectic spinor on $\Sigma$ and, as discussed in [Z], unitarity of $U_\chi$ forces it to have the form
$$j^*\sigma_{U_\chi} = \langle\chi_* e_\Lambda, e_\Lambda\rangle^{-1}\,|d\sigma|^{\frac12} \otimes e_\Lambda \otimes e_\Lambda^*, \tag{9}$$
where $|d\sigma|$ is the symplectic volume density on $\Sigma$. Hence as a symplectic spinor on $\mathrm{graph}(\chi)$ it has the form
$$\sigma_{U_\chi} = \langle\chi_* e_\Lambda, e_\Lambda\rangle^{-1}\,\big(\chi_*|d\sigma|^{\frac12} \otimes |d\sigma|^{\frac12}\big) \otimes \chi_* e_\Lambda \otimes e_\Lambda^*. \tag{10}$$
Now let us describe the symbolic aspects of the trace, assuming that $\chi$ has clean fixed point sets. Then each component of $\mathrm{Fix}(\chi|_\Sigma)$ carries a canonical density $dV$ and by inserting the radial vector field $R$ we get a canonical Liouville density $d\mu_\chi = i_R\,dV$ on the base $S\mathrm{Fix}(\chi)$ of the fixed point set. Moreover, in the normal direction we have the symplectic spinor factor $\chi_* e_\Lambda \otimes e_\Lambda^*$. Taking the trace, we get the matrix element $\langle\chi_* e_\Lambda, e_\Lambda\rangle$. This cancels the scalar principal symbol, so the symbolic trace just gives the Liouville volume of the fixed point set, as stated above.

Having discussed the ingredients in the above proposition, we now sketch the proof. For further details we refer to [B.G, Theorem 12.9].
Sketch of Proof. Consider the Fourier series
$$\Upsilon_{\chi,\ell}(\theta) = \sum_{N=1}^{\infty} \mathrm{Tr}\,U_{\chi,N}^\ell\, e^{iN\theta} = \mathrm{Tr}\,U_\chi^\ell\, e^{i\theta\mathcal N}\,\Pi. \tag{11}$$
By the composition theorem of [B.G], $\Upsilon_\ell$ is a Lagrangian distribution on $S^1$ with singularities at the values $\theta_j \in \Theta_{\chi,\ell}$ and with singularity degrees beginning at $\frac{e_j}{2}$. Hence:
$$\Upsilon_{\chi,\ell} \sim \sum_{\theta_j\in\Theta_{\chi,\ell}}\sum_{r=0}^{\infty} a_{\theta_j,\ell,r}\, u_{\frac{e_j}{2} - r}(\theta - \theta_j), \tag{12}$$
where $u_m(\theta) = \sum_{N=1}^{\infty} N^{m-1} e^{iN\theta}$. Note that $u_m$ is a periodic distribution with the same singularity at $\theta = 0$ as the homogeneous distribution $(\theta - \theta_j + i0)^{\frac{e_j}{2} - r}$. The leading coefficients are given by the principal symbols of $\Upsilon_{\chi,\ell}(\theta)$ at the singularities. From the symbol calculus of [B.G], the coefficients are given by the symbolic traces described above. The expansion stated in the proposition then follows by matching Fourier series.

Examples. Let us consider the form of the trace for quantum maps in one degree of freedom:

(a) Suppose that $\chi_t = \exp t\Xi_H$ is the fixed time map of a Hamilton flow. The fixed point set then consists of a finite number of level sets $\{H = E_j(t)\}$, which must be periodic orbits of period $t$. Pick a base point $m_j$ on each orbit and lift it to a point $x_j$ lying over $m_j$ in $X$. Then the lift of $\{H = E_j(t)\}$ to $x_j$ is a curve which begins and ends on $\pi^{-1}(m_j)$. The difference in the initial and terminal angle is of course given by the holonomy with respect to $\alpha$. This holonomy angle $\theta_j$ is independent of the choice of $m_j$ and of $x_j$, and $\Theta_{\chi,t}$ is the set of these holonomy angles. Then $S\mathrm{Fix}(\chi_t \circ \phi^{\theta_j})$ is two dimensional and hence $e = 3$. As will be seen below, expressing $\mathrm{Tr}\,U_{\chi_t,N}^\ell$ in terms of its eigenvalues produces a classical exponential sum in the completely integrable case. The trace expansion produces a dual exponential sum involving dynamical data. The principal term can be obtained by applying the van der Corput method [G.K, H.1], i.e. Poisson summation followed by stationary phase. However, the existence of a complete asymptotic expansion might not be so clear without the Toeplitz machinery. The trace expansion (or at least the principal term) is often used in the physics literature under the name of the Gutzwiller trace formula. Some remarks on its limitations are included below.

(b) Suppose next that $\chi_o$ has only isolated non-degenerate fixed points. Then $\chi$ fixes the entire fiber over each fixed point. The only singularity occurs at $\theta = 0$ and the dimension of $\mathrm{Fix}\,\chi$ equals one. Hence $e = 2$.

Special case: Quantum cat maps $U_{g,N}$. These are the most familiar examples of quantum maps, so let us see what the above proposition says about them. In this case, the trace can be calculated exactly and equals the character of the finite metaplectic representations. For the exact calculation in the Toeplitz setting, see [Z]. For the physics style calculation, see [Kea]. Here we do the calculation asymptotically.

First we observe that if $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is hyperbolic then it has non-degenerate fixed points at $(x, \xi) \in \mathbb R^2/\mathbb Z^2$ such that $g(x, \xi) \equiv (x, \xi) \bmod \mathbb Z^2$, i.e. at the points $(g - I)^{-1}\mathbb Z^2$.
The lifted map $\chi_g$ actually has no fixed points. However, $\chi_g \circ \phi^\theta$ has fixed points if and only if $[(g\cdot(x, \xi), e^{2\pi i(t+\theta)})] = [(x, \xi, e^{2\pi i\theta})]$, where the bracket denotes the equivalence in the quotient space. Since $g\cdot(x, \xi) = (x, \xi) + (m, n)$ for some $(m, n) \in \mathbb Z^2$ we get that $[((x, \xi) + (m, n), e^{2\pi i(t+\theta)})] = [(x, \xi, e^{2\pi i\theta})]$. But
$$[((x, \xi) + (m, n), e^{2\pi i(t+\theta)})] = [((x, \xi), e^{2\pi i(t+\theta)}\, e^{i\pi mn}\, e^{i\pi\omega((x,\xi),(m,n))})].$$
It follows that
$$\theta_{mn} = -\tfrac12\big(mn + \omega((x, \xi), (m, n))\big).$$
Hence
$$\Theta_{g,1} = \Big\{-\tfrac12\big(mn + \omega((x, \xi), (m, n))\big) : g(x, \xi) = (x, \xi) + (m, n)\Big\}.$$
The fixed point set of $\chi_g$ is therefore clean and has dimension one (the fibers over the fixed points of $g$ on $\mathbb R^2/\mathbb Z^2$). The fixed points are non-degenerate in the directions $d\theta$, where $d\theta$ is the normal to the fibers, so the canonical density is given by $\frac{1}{\sqrt{\det(I - g)}}$ times the invariant measure on the fibers. Furthermore, the scalar principal symbol of $U_{g,N}$ is given by $m(g) = \langle\mathcal M(g)e_\Lambda, e_\Lambda\rangle^{-1} = 2^{-\frac12}(a + d + i(b - c))^{\frac12}$ [Z, Sect. 5]. This factor is cancelled in the trace operation, leaving
$$\mathrm{Tr}\,U_{g,N} = \frac{1}{\sqrt{\det(I - g)}}\sum_{(m,n)\in\mathbb Z^2/(I-g)\mathbb Z^2} e^{i\pi(mn + \omega((m,n),(1-g)(m,n)))}.$$
The traces $\mathrm{Tr}\,U_{g,N}^\ell$ can be analysed in the same way because in this example $U_{g,N}^\ell = U_{g^\ell,N}$. Indeed, $g \to U_{g,N}$ is the metaplectic representation of $SL(2, \mathbb Z/N\mathbb Z)$ realized on spaces of theta-functions [Z]. This connection makes it possible to get exact results on the level spacings of $U_{g,N}$ even though it is a quantum chaotic map. On the other hand, the results are quite different from the GOE behaviour conjectured for generic quantum maps (cf. [Kea]).
Remarks on the trace expansion. We do not actually use the trace expansion in this paper. This is for two reasons. The main one is that, due to the eigenangle rescaling, the significant powers $U_{\chi,N}^\ell$ of the quantum map in the formula for the PCF are those for which $\ell$ is on the order of $N$. But the trace expansion as it stands is only an asymptotic expansion as $N \to \infty$ for fixed $\ell$. The “standard wisdom” regarding exponential sums (see [B.I, H.1, G.K]) is that the van der Corput method produces a dual exponential sum which is only simpler than the original (in general) when the dual sum has fewer terms. In the case of the trace expansion, this gain in simplicity only occurs if the number of fixed points of $\phi^\ell$ is slowly growing in $\ell$. Only if the topological entropy of $\phi$ equals zero can this be the case. The most favorable case is probably that of completely integrable systems, for which the $\ell$th term has roughly $\ell$ critical points. Then the periodic orbit sum contains only $\ell$ terms. But since the significant terms occur when $\ell \sim N$ this is no simplification. The trace expansion is not necessary in the integrable case because the eigenangles can be written down explicitly by the WKB method. This is the essential use of complete integrability in this paper and also in [Sa.2]. The chaotic case is much more difficult because no explicit formula for the eigenangles is known and because the trace formula leads to a dual sum with an exponentially growing number of terms.
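The exponential growth of the dual sum in the chaotic case can be made concrete for cat maps, where the fixed points of $g^\ell$ on $\mathbb R^2/\mathbb Z^2$ are the points of $(g^\ell - I)^{-1}\mathbb Z^2$ modulo $\mathbb Z^2$. The following brute-force sketch counts them; the matrix $g = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ and the helper names are choices made for this illustration only. It uses the fact that every fixed point has coordinates in $\frac{1}{D}\mathbb Z$ with $D = |\det(g^\ell - I)|$, so it suffices to test the $D^2$ points of that grid.

```python
def mat_mul(a, b):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def mat_pow(g, ell):
    """ell-th power of a 2x2 integer matrix, ell >= 0."""
    result = [[1, 0], [0, 1]]
    for _ in range(ell):
        result = mat_mul(result, g)
    return result


def count_fixed_points(g):
    """Count points p of R^2/Z^2 with g p = p mod Z^2 by testing the grid (1/D) Z^2."""
    m = [[g[0][0] - 1, g[0][1]], [g[1][0], g[1][1] - 1]]  # m = g - I
    D = abs(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    if D == 0:
        raise ValueError("g - I is singular: fixed points are not isolated")
    count = 0
    for a in range(D):
        for b in range(D):
            # p = (a, b)/D is fixed exactly when (g - I)(a, b) is divisible by D.
            if (m[0][0] * a + m[0][1] * b) % D == 0 and (m[1][0] * a + m[1][1] * b) % D == 0:
                count += 1
    return count


if __name__ == "__main__":
    g = [[2, 1], [1, 1]]
    for ell in range(1, 5):
        print(ell, count_fixed_points(mat_pow(g, ell)))
```

The counts agree with $|\det(g^\ell - I)|$ and grow geometrically in $\ell$, in contrast with the roughly $\ell$ critical points per term quoted above for the integrable case.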
3. PCF for Hamiltonians: Proof of Theorem A

As mentioned in the introduction, the pair correlation problem for quantized Hamiltonians is similar to that for Zoll surfaces as presented in [U.Z]. We can therefore follow the exposition in [U.Z, Sect. 3] to the extent possible and omit details which are essentially similar to the Zoll case. There is no difference in this problem between the case of $M = CP^1$ and the general case of symplectic surfaces. So in this section $M$ can be any closed symplectic surface. However, we will make some generic simplifying assumptions on the Hamiltonian. The first one is

Assumption MORSE. $H : M \to \mathbb R$ is a Morse function.

Let $\mathcal E$ denote the set of values of $H$ and let $c_1 < c_2 < \dots < c_{M+1}$ denote its set of critical values. Then the inverse image $H^{-1}(c_\nu, c_{\nu+1})$ consists of a finite number $N(\nu)$ of connected components $X_j^\nu$, each diffeomorphic to $(c_\nu, c_{\nu+1}) \times S^1$. Hence for $E \in (c_\nu, c_{\nu+1})$, $H^{-1}(E) \cap X_j^\nu$ consists of a periodic orbit $\gamma_j^\nu(E)$ of $\exp t\Xi_H$. Its minimal positive period will be denoted by $T_j^\nu(E)$. For the sake of simplicity we will make a second assumption:

Assumption Q. $T_j^\nu$ and $T_k^\nu$ are independent over $\mathbb Q$ if $j \neq k$.

Under this (generic) condition, the manifold $\mathcal P$ of periodic points which arises in the calculation of $\rho_2$ has the minimal number of components. In the general case, the formula for $\rho_2$ is given by a sum over connected components with dependent period functions [U.Z, Theorem 3.3]. In the simpler case (which exhibits all of the ideas) we will prove:

Theorem 3.0.2. With assumptions MORSE and Q on $H$, the limit pair correlation function is given by
$$\int_{\mathbb R} f(x)\,d\rho_2(x) = V\hat f(0) + \sum_{k\in\mathbb Z}\sum_{\nu=1}^{M}\sum_{j=1}^{N(\nu)} \int_{(c_\nu,\,c_{\nu+1})} \hat f\big(kT_j^\nu(E)\big)\, T_j^\nu(E)^2\, dE$$
for any $f$ such that $\hat f \in C_o^\infty(\mathbb R)$.

Proof. To determine the asymptotics of the sequence $\rho_2^{(N)}(f)$ we form the generating function
$$\Upsilon_f(\theta) := \sum_{N=1}^{\infty} \rho_2^{(N)}(f)\, e^{iN\theta}.$$
We wish to show that $\Upsilon_f$ is a classical Hardy-Lagrangian distribution on $S^1$. The asymptotics of $\rho_2^{(N)}(f)$ can then be determined from the singularity data of $\Upsilon_f$. To show that $\Upsilon_f(\theta)$ is a Lagrangian distribution, we write it as the trace of a Toeplitz type Fourier Integral operator, defined as follows: We form the product manifold $X \times X$ and consider the product Szegő projector
$$\Pi \otimes \Pi : L^2(X \times X) \to H^2(X) \otimes H^2(X)$$
and the diagonal projector
$$\Pi_{diag} := \bigoplus_{N=1}^{\infty} \Pi_N \otimes \Pi_N : L^2(X \times X) \to \bigoplus_{N=1}^{\infty} H_N \otimes H_N.$$
We next observe that
$$\rho_2^{(N)}(f) = \mathrm{Tr}\,\Pi_N \otimes \Pi_N \int_{\mathbb R} \hat f(t)\, e^{itN(\hat H_N\otimes I - I\otimes\hat H_N)}\,dt.$$
Noting that $N$ is the eigenvalue of the number operator $\mathcal N$ we can rewrite this in the form
$$\mathrm{Tr}\,\Pi_N \otimes \Pi_N \int_{\mathbb R} \hat f(t)\, e^{it(\mathcal N\hat H_N\otimes I - I\otimes\mathcal N\hat H_N)}\,dt.$$
The generating function is then given by
$$\Upsilon_f(\theta) = \sum_{N=1}^{\infty} \mathrm{Tr}\, e^{i\theta[\mathcal N\otimes I]}\,\Pi_N \otimes \Pi_N \int_{\mathbb R} \hat f(t)\, e^{it(\mathcal N\hat H_N\otimes I - I\otimes\mathcal N\hat H_N)}\,dt = \mathrm{Tr}\,\Pi_{diag}\, e^{i\theta[\mathcal N\otimes I]} \int_{\mathbb R} \hat f(t)\, e^{it(\mathcal N\hat H\otimes I - I\otimes\mathcal N\hat H)}\,dt.$$
We recall here that $\hat H = \Pi H\Pi$, where $H$ is the pull back to $X$ of the function so denoted on $M$. We now have to analyse each operator which occurs under the trace sign.

(a) $\Pi_{diag}$: This operator is the composition of the product Szegő projector $\Pi \otimes \Pi$ with the full diagonal weight projection
$$P_{diag} : L^2(X \otimes X) \to L^2_N(X) \otimes L^2_N(X),$$
where $L^2_N(X)$ is the eigenspace of $\mathcal N$ of eigenvalue $N$. From [G.S.2, U.Z] it follows that $P_{diag}$ is a Fourier Integral operator in the class $I^0(X^{(2)} \times X^{(2)}, \tilde\Gamma)$, where $X^{(2)} = X \times X$ and where $\tilde\Gamma$ is the flow-out of the coisotropic cone
$$\tilde\Theta = \{(\zeta_1, \zeta_2) \in T^*(X^{(2)}) : H(\zeta_1) - H(\zeta_2) = 0\}.$$
That is, let $\Phi^t = \exp t\Xi_H \times \exp -t\Xi_H$ denote the Hamilton flow generated on $T^*(X^{(2)})$ by $H(\zeta_1) - H(\zeta_2)$. Then in a well-known way, the map
$$i_{\tilde\Theta} : \mathbb R \times \tilde\Theta \to T^*(X^{(2)}) \times T^*(X^{(2)}), \qquad (t, \zeta_1, \zeta_2) \to ((\zeta_1, \zeta_2), \Phi^t(\zeta_1, \zeta_2))$$
defines a Lagrange immersion with image equal (by definition) to the flow out $\tilde\Gamma$. On the other hand $\Pi \otimes \Pi$ is the exterior tensor product of two Toeplitz (hence Hermite type Fourier Integral) operators. According to [B.G, Theorem 9.3], we therefore have
$$\Pi \otimes \Pi = \alpha + \beta$$
with $\alpha \in I^o(X^{(2)} \times X^{(2)}, \Sigma \times \Sigma)$ and with $WF(\beta)$ contained in a small conic neighborhood $C$ of $\Sigma \times 0 \cup 0 \times \Sigma$. Moreover, the symbol of $\Pi \otimes \Pi$ is given by
$$\sigma(\Pi \otimes \Pi) = \sigma(\Pi) \otimes \sigma(\Pi)$$
on $\Sigma \times \Sigma - C$. Hence $\Pi \otimes \Pi$ is essentially a Toeplitz structure on the symplectic cone $\Sigma \times \Sigma \subset T^*(X^{(2)})$. The complication due to $C$ will ultimately prove to be irrelevant in the analysis of the trace since it will not contribute to the singularities along the diagonal. Hence we can (and will) pretend that this component of $WF(\Pi \otimes \Pi)$ does not occur. By the composition theorem for Hermite and ordinary Fourier Integral operators [B.G, Theorem 7.5] it follows that (modulo the term $P_{diag} \circ \beta$)
$$\Pi_{diag} \in I^o(X^{(2)} \times X^{(2)}, \Gamma),$$
where $\Gamma$ is the flowout Lagrangian in $\Sigma \times \Sigma$ for the co-isotropic subcone $\Theta := \tilde\Theta \cap \Sigma \subset \Sigma$. That is, the map $i_\Theta := i_{\tilde\Theta}|_{\mathbb R\times\Theta} : \mathbb R \times \Theta \to \Sigma \times \Sigma$ is a Lagrange immersion with respect to the symplectic cone $\Sigma \times \Sigma$ and $\Gamma$ is its image; of course it is only an isotropic immersion with respect to $T^*(X^{(2)} \times X^{(2)})$.

(b) $e^{i\theta[\mathcal N\otimes I]}$: This operator does not require a fancy analysis since $\mathcal N$ is simply the differentiation operator by the generator $\frac{\partial}{\partial\theta}$ of the contact flow. Hence $e^{i\theta[\mathcal N\otimes I]}$ is the translation operator $F(x, y) \to F(\phi^\theta(x), y)$ by $\phi^\theta \times \mathrm{id}$ on $X^{(2)}$.

(c) $\Pi \otimes \Pi \int_{\mathbb R} \hat f(t)\, e^{it(\mathcal N\hat H\otimes I - I\otimes\mathcal N\hat H)}\,dt$: Here we have inserted the factor $\Pi \otimes \Pi$, as we may, to simplify the discussion. Since $[\mathcal N, \Pi] = 0 = [\mathcal N, \hat H]$, we may remove the projection $\Pi$ from the exponent. Then the unitary under the integral is given by $e^{itH\mathcal N} \otimes e^{-itH\mathcal N}$. Each factor is the exponential of a pseudodifferential operator of real principal type and is therefore Fourier Integral. It follows again from the composition theorem [B.G, Theorem 7.5] that
$$\Pi e^{itH\mathcal N} \in I^{-\frac14}(X \times X, C \cap \mathbb R \times \Sigma \times \Sigma),$$
$$C := \{(t, \tau, x, \xi, y, \eta) : \tau + \sigma_{\mathcal N}(x, \xi)H(x) = 0,\ \psi^t(x, \xi) = (y, \eta)\},$$
where $\psi^t$ is the Hamilton flow on $T^*X$ generated by $\sigma_{\mathcal N}(x, \xi)H$. Note that $\sigma_{\mathcal N}(x, \xi) = \langle\xi, \frac{\partial}{\partial\theta}\rangle$, which generates the lift of the central circle action on $X$ to $T^*X$. Also, the Hamilton flow of $H$ on $T^*X$ is the two-fold lift of the Hamilton flow of $H$ on $M$: first from $M$ to $X$ and then from $X$ to $T^*X$. Since the Hamilton vector field of $\sigma_{\mathcal N}(x, \xi)H$ on $T^*X$ is given by
$$\Xi_{\sigma_{\mathcal N}(x,\xi)H} = H\,\Xi_{\sigma_{\mathcal N}(x,\xi)} + \sigma_{\mathcal N}(x, \xi)\,\Xi_H$$
and since the Lie bracket of the two terms is zero, we have $\psi^t = \exp tH\Xi_{\sigma_{\mathcal N}(x,\xi)} \circ \exp t\sigma_{\mathcal N}(x, \xi)\Xi_H$. The isotropic cone $C \cap \Sigma \times \Sigma$ can be parametrized by
$$i_{C_\Sigma} : \mathbb R \times \Sigma \to \mathbb R \times \Sigma \times \Sigma, \qquad (t, \zeta) \to (t, \sigma_{\mathcal N}H(\zeta), \zeta, \psi^t(\zeta)).$$
The symbol of $\Pi \otimes \Pi \int_{\mathbb R} \hat f(t)\, e^{it(\mathcal N\hat H\otimes I - I\otimes\mathcal N\hat H)}\,dt$ may then be identified with the spinor $\hat f(t)|dt|^{\frac12} \otimes \sigma_\Pi$.

Putting the above together we see that up to the factor of $P_{diag}$ the operator under the trace is a Hermite Fourier Integral operator associated to the graph of
$$\phi^\theta \times \mathrm{id} \circ \psi^t \times \psi^{-t}$$
on $\Sigma \times \Sigma$. The effect of the $P_{diag}$ factor is to reduce this torus action to the quotient of $\Sigma \times \Sigma$ by the diagonal contact flow. The remainder of the calculation is similar to the Zoll case, so we will just summarize the key ideas and refer to [U.Z] for details.

First, the singularities of $\Upsilon$ are caused by the periodic orbits of the reduced flow $\exp t\Xi_H \times \exp -t\Xi_H$ on the characteristic variety $V := \{H(z_1) - H(z_2) = 0\} \subset M \times M$. The singularities occur at the holonomy angles of the lifts of these periodic orbits to the prequantum $S^1$ bundle of $M \times M$. Since the periodic orbits of the product (or difference) flow are products of periodic orbits of the factors, the holonomies of the periodic orbits are ratios of holonomies of the factors. In particular, if for each period there is just one periodic orbit, then the ratio is always one and there is only a single singularity at $\theta = 0$. The general case is discussed in detail in the analogous situation of [U.Z].

The set of relevant periodic points and their periods is given by
$$\mathcal P = \{(z_1, z_2, T) \in V \times \mathrm{supp}\,\hat f : \exp T\Xi_H(z_j) = z_j\}.$$
Observe that each component of $\mathcal P$ is of dimension 3: the $T = 0$ component equals $V$, and the others consist of continuous unions of products $\alpha \times \beta$ of periodic orbits $\alpha, \beta$ with the same (or rationally related) periods (see [U.Z, Lemma 3.7]). Under assumption Q, only “squares” $\alpha \times \alpha$ of periodic orbits arise. It follows that each component contributes with equal strength to the pair correlation function.

To complete the calculation we need to determine the canonical density on $\mathcal P$ whose integral gives the principal symbol of $\Upsilon_f$ at $\theta = 0$ (or at other possible singular angles). On the $T = 0$ component it is just the Liouville density of $V$ and hence the contribution of this component is the volume of $V$. For the $T > 0$ components we use the Morse theory of $H$ to break up $M$ into local action-angle charts $X_j^\nu$. We then wish to determine the surface measures on the surfaces $V \cap \{T_j^\nu(z_1) = T_j^\nu(z_2)\} \subset X_j^\nu \times X_j^\nu$, where as above $T_j^\nu(z)$ is the minimal period of the periodic orbit through $z$. We will express the result in terms of the local action angle coordinates $(I_1, \phi_1, I_2, \phi_2)$ on $X_j^\nu \times X_j^\nu$. The coordinates $(I_1, \phi_1, \phi_2)$ are independent on $V \cap \{T_j^\nu(z_1) = T_j^\nu(z_2)\}$ and (dropping the sub and superscripts) the Liouville measure in these coordinates equals $T(dI_1 \wedge d\phi_1 \wedge d\phi_2)$ [U.Z, Lemma 3.11]. On the level sets $T(z_1) = T(z_2)$ the surface measure is then given by $T^2\big(\frac{d^2I_1}{dH^2}\big)^{-1} d\phi_1\,d\phi_2$. The measure of the $T > 0$ component of $\mathcal P$ is then the integral over the set of periods $T$ of this form times $dT$. Changing variables from $T(E)$ to $E$, summing over the components $X_j^\nu$ (and reinstating the sub and superscripts) gives the stated formula.

4. PCF for Quantized Perfect Hamiltonian Flows on $CP^1$: Proof of Theorem B

In this section we consider general Hamiltonians $H$ on $CP^1$ which are perfect Morse functions. Our purpose is to show that the pair correlation functions of their quantized Hamiltonian flows are Poisson on average.

4.1. Classical Hamiltonian $S^1$ actions and perfect Morse Hamiltonians on $CP^1$. Up to symplectic equivalence, $CP^1$ carries a unique Hamiltonian $S^1$ action. The model example is that of rotations of $S^2 \subset \mathbb R^3$ around the $x_3$-axis, whose Hamiltonian is the $x_3$ coordinate. In general, a function $I : CP^1 \to \mathbb R$ is called an “action” variable if its Hamilton flow $\exp t\Xi_I$ is $2\pi$-periodic. Any such global action variable on $CP^1$ must have precisely
two critical points which are fixed points of its flow. This flow has a global transversal connecting the fixed points. The travel time from this transversal defines an angle variable $\theta$ symplectically dual to $I$ and the transformation $\chi(I, \theta) = (x_3, \theta)$ is a global symplectic transformation to the model case.

Now suppose that $H : \mathbb{CP}^1 \to \mathbb{R}$ is a perfect Morse function, with a non-degenerate minimum value equal to $-1$ and a non-degenerate maximum value equal to $1$. Then let $\mu$ denote the distribution function of $H : \mathbb{CP}^1 \to [-1, 1]$, i.e. $\mu(E) = |\{H \le E\}|$, where $|\cdot|$ denotes the symplectic area. Under our assumptions it is a strictly increasing smooth function on $(-1, 1)$. Let $I : \mathbb{CP}^1 \to [0, 4\pi]$ be defined by $I(z) = \mu(H(z))$. Then $I$ is an action variable for $H$: as functions on the symplectic $\mathbb{CP}^1$, $I$ and $H$ Poisson commute, $\{I, H\} = 0$, and the Hamilton flow of $I$ is $2\pi$-periodic. It is obvious that $I$ generates the algebra $a = \{f \in C^\infty(\mathbb{CP}^1) : \{f, H\} = 0\}$ and hence we may write $H = \phi(I)$, where $\phi$ is the inverse function to $\mu$. Non-degeneracy of the critical points forces $d\phi \neq 0$ since
\[ d^2 H = \phi'(I)\, d^2 I + \phi''(I)\, dI \otimes dI \tag{13} \]
so that $d^2 H = \phi'(I)\, d^2 I$ at critical points.

4.2. Quantum $S^1$ actions over $\mathbb{CP}^1$. It is a general fact that compact Hamilton group actions can be quantized [B.G]. In the model case, the quantization is given by the generator of a linear $S^1$ action, such as rotations around the $x_3$-axis (if we think of $\mathbb{CP}^1$ as the embedded $S^2$). In general (without conjugation to the model case), the quantization can be constructed as follows: The Toeplitz quantization $\Pi I \Pi$ of $I$ is a positive operator satisfying $e^{2\pi i \mathcal{N} [\Pi I \Pi]} = I + K$, where $K$ is a Toeplitz operator of order $-1$. In a well-known way [CV], we may add lower order terms to $\Pi I \Pi$ to arrive at a positive operator $\hat I$ satisfying $[\hat I, \mathcal{N}] = 0$ and $e^{2\pi i \mathcal{N} \hat I} = I$. Therefore $\mathcal{N}\hat I$ has only integral eigenvalues. Since $\mathcal{N} = N$ on $H^2(N)$, it follows that
\[ \mathrm{Sp}(\hat I|_{H^2(N)}) = \left\{ \frac{j}{N} : j = -N, \dots, N \right\}. \tag{14} \]

4.3. WKB for quantum completely integrable Hamiltonians in genus zero. We now come to the definition of the quantum systems which concern us. In the homogeneous setting we say:

Definition 4.3.1. A quantum completely integrable Hamiltonian is a Toeplitz Hamiltonian $\hat H$ on $H^2(X)$ which is given by a polyhomogeneous function
\[ \hat H = \Phi(\hat I) \sim \phi(\hat I) + \phi_{-1}(\hat I)\mathcal{N}^{-1} + \phi_{-2}(\hat I)\mathcal{N}^{-2} + \dots \]
of a global action operator $\hat I$, i.e. a generator of a quantum Hamiltonian $S^1$ action.

Automatically, $[\hat H, \mathcal{N}] = 0$ and we write $\hat H^{(N)}$ for $\Pi_N \hat H \Pi_N$. The principal symbol $\phi(I)$ defines a perfect Morse function on $\mathbb{CP}^1$ if $\phi' \neq 0$. Conversely, suppose that $H$ is a perfect Morse function on $M$, so that there exists a global action $I$ and monotone $\phi$ for which $H = \phi(I)$. The Toeplitz quantization $\Pi_N H \Pi_N$ of $H$ does not necessarily commute with the quantization $\hat I$ of $I$, so it is not quantum completely integrable in the strong sense of the definition above. To make it commute it is only necessary to average $\Pi_N H \Pi_N$ with respect to the quantum $S^1$-action $\exp it\hat I$. Since this does not change the principal symbol it may be regarded as a quantization
of $H$ and we will reserve the notation $\hat H$ for it. Since $\hat H$ commutes with $\hat I$ and since $\hat I$ has a simple spectrum, $\hat H = \Phi(\hat I)$ for some function $\Phi$. It must be a polyhomogeneous function since $\hat H$ is a Toeplitz operator. We therefore come back to the same definition as above. The following proposition is an immediate consequence of the polyhomogeneity:

Proposition 4.3.2. Denote the eigenvalues of $\hat H^{(N)}$ by $\mathrm{Sp}(\hat H^{(N)}) = \{\lambda_{N,j}\}$. Then:
\[ \lambda_{N,j} = \phi\!\left(\frac{j}{N}\right) + N^{-1}\phi_{-1}\!\left(\frac{j}{N}\right) + N^{-2}\phi_{-2}\!\left(\frac{j}{N}\right) + \dots \]

4.4. The quantizations $U_{\chi_t,N}$ and $U_{t,N}$. There are two approaches to the quantization of a Hamiltonian flow $\chi_t = \exp t\Xi_H$: (i) by first exponentiating and then quantizing or (ii) by first quantizing and then exponentiating. As above we would like to produce a quantum completely integrable system, so that $U_{\chi_t}$ should commute with $\hat I$. This could be done by quantizing $H \to \hat H$ as above and then exponentiating to get $U_{t,N} = \Pi_N e^{itN\hat H}\Pi_N$. This automatically produces a quantum completely integrable unitary group. Alternatively, we could exponentiate $H$ to get the flow $\exp t\Xi_H$ and then quantize the flow by the Toeplitz method. By Proposition 2.3, $\exp t\Xi_H$ lifts to a contact transformation $\chi_t$ on $X$, so by [Z, Sect. 3] there exists a canonical symbol $\sigma_t \in S^0(T^*(X))$ such that $U_{\chi_t,N} := \Pi_N \sigma_t \chi_t \Pi_N$ is unitary. There is no guarantee that the resulting quantization commutes with the quantum $S^1$ action generated by $\hat I$ nor that it is a unitary group. The first defect can be overcome, as above, by first averaging against the $S^1$ action and then applying the functional calculus as in [Z] to make the quantization unitary. We henceforth use the notation $U_{\chi_t,N}$ to refer to the result. To compare the methods we prove:

Proposition 4.4.1. $U_{t,N} \cong U_{\chi_t,N}$ modulo operators of order $-1$.

Proof. To be cautious, we first convince ourselves that $\Pi_N$ and $e^{itN\Pi H\Pi}$ commute. Indeed, $\Pi_N = \Pi \circ \int_{S^1} \mathrm{Ad}(e^{i\theta \mathcal{N}})\, e^{-iN\theta}\, d\theta$ and the averaging operator commutes with both $\Pi$ and multiplication by $H$. It follows that
\[ U_{t,N}^* U_{t,N} = \Pi_N e^{-itN\Pi H\Pi}\, \Pi_N\, e^{itN\Pi H\Pi} \Pi_N = \Pi_N e^{-itN\Pi H\Pi} e^{itN\Pi H\Pi} \Pi_N = \Pi_N. \]
Thus, $U_{t,N}$ is unitary on $H_N$.

We then observe that $U_t = \Pi e^{it\mathcal{N}\Pi H\Pi}\Pi = \Pi e^{it\mathcal{N}H}\Pi$ is a unitary group of Fourier-Toeplitz integral operators whose underlying canonical relation is the lift of the Hamilton flow of $H$ to the symplectic cone $\Sigma \subset T^*(X) = \{(x, \xi, r\alpha) : r \in \mathbb{R}^+\}$. This follows very much as in the proof that exponentials of first order pseudodifferential operators are groups of Fourier integral operators. Indeed, $\Pi \mathcal{N} H \Pi$ is a first order Toeplitz (pseudodifferential) operator with principal symbol $rH$. Hence its exponential is a Toeplitz Fourier integral operator with bicharacteristic flow equal to the Hamilton flow of $rH$ on $\Sigma$. By Proposition 2.3, this Hamilton flow is the lift of $\exp t\Xi_H$ to $\Sigma$.

On the semi-classical (non-homogeneous) level, we note that $U_{t,N} = \Pi_N e^{itN\Pi H\Pi}\Pi_N$ forms a non-homogeneous Hermite-Fourier integral distribution as $N$ varies. By non-homogeneous is meant an oscillatory integral (with complex phase) associated to a non-homogeneous Lagrangian; here it equals the graph of $\chi_t$ on $X$. Both $U_{t,N}$ and $U_{\chi_t,N}$ are Hermite-Fourier integral distributions associated to the same Lagrangian, the graph of $\chi_t$, and both have principal symbols equal to the graph 1/2-density. It follows
that they can only differ by operators in the same class and of order $-1$. In fact both are functions of $\hat I$ of the form $\exp(i\Phi_t(\hat I))$. The only possible difference is in the subprincipal terms in the polyhomogeneous expansions of the $\Phi$ functions.

It seems most natural to concentrate on the unitary groups $U_{t,N}$, so we will always assume our quantum maps have the form $\exp it\Phi(\hat I)$ for some polyhomogeneous $\Phi$. It is straightforward to extend the results to the more general quantum maps $U_{\chi_t,N}$.

4.5. Final form of $\mathrm{Tr}\, U^\ell_{t,N}$. It follows from the above that the pair correlation function for the completely integrable quantum map $\exp(itN\phi(\hat I))$ is given by
\[ \rho^{(N)}_{2,t}(f) = \hat f(0) + \frac{1}{N^2} \sum_{\ell \neq 0} \hat f\!\left(\frac{2\pi\ell}{N}\right) |S_t(N, \ell)|^2, \]
where $\phi' \neq 0$ and where
\[ S_t(N, \ell) = \sum_{j=1}^{N} e^{it\ell N \phi\left(\frac{j}{N}\right)}. \tag{15} \]
We will refer to this as the “homogeneous” case since the Hamiltonian has the form $\hat H^{(N)} = N\phi(\hat I^{(N)})$. In the general polyhomogeneous case, the exponential sum has the form
\[ S_t(N, \ell) = \sum_{j=1}^{N} e^{it\ell N \Phi\left(\frac{j}{N}, N\right)} \tag{16} \]
with $\Phi\left(\frac{j}{N}, N\right) = \phi\left(\frac{j}{N}\right) + \frac{\phi_{-1}\left(\frac{j}{N}\right)}{N} + \cdots$. For the sake of simplicity we often assume the Hamiltonian is homogeneous, but we would like to note that the results also hold in the general case since the lower order terms do not affect the lattice point counting arguments that are the heart of the matter. As regards the individual sums we also note that the terms of order $\phi_{-3}/N^3$ or lower in $\Phi\left(\frac{j}{N}, N\right)$ cannot affect the pair correlation function. That is, let us put:
\[ Z_t(N, \ell) = \sum_{j=1}^{N} e^{it\ell N \Phi_2\left(\frac{j}{N}, N\right)} \]
with
\[ \Phi_2\!\left(\frac{j}{N}, N\right) := \phi\!\left(\frac{j}{N}\right) + \frac{\phi_{-1}\left(\frac{j}{N}\right)}{N} + \frac{\phi_{-2}\left(\frac{j}{N}\right)}{N^2}. \]
The following is obvious:

Proposition 4.5.1. For $f$ with $\hat f \in C_o^\infty(\mathbb{R})$,
\[ \frac{1}{N} \sum_{\ell \neq 0} \hat f\!\left(\frac{2\pi\ell}{N}\right) \frac{1}{N} \left| |S_t(N, \ell)|^2 - |Z_t(N, \ell)|^2 \right| = O(1/N). \tag{17} \]
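As a rough numerical companion to (15) and the pair correlation formula of Sect. 4.5, the sums can be evaluated directly. The sketch below is not part of the paper: the names `exp_sum` and `pair_correlation`, the sample phase $\phi(x) = x^2$, and the triangular function used in place of $\hat f$ (compactly supported but not smooth) are assumptions chosen only for illustration.

```python
import numpy as np

def exp_sum(phi, t, N, ell):
    """S_t(N, ell) = sum_{j=1}^N exp(i t ell N phi(j/N)), as in (15)."""
    j = np.arange(1, N + 1)
    return np.exp(1j * t * ell * N * phi(j / N)).sum()

def pair_correlation(phi, t, N, f_hat, ell_max):
    """f_hat(0) + N^-2 * sum over 0 < |ell| <= ell_max of
    f_hat(2 pi ell / N) * |S_t(N, ell)|^2."""
    total = f_hat(0.0)
    for ell in range(-ell_max, ell_max + 1):
        if ell == 0:
            continue
        weight = f_hat(2 * np.pi * ell / N)
        total += weight * abs(exp_sum(phi, t, N, ell)) ** 2 / N**2
    return total

# Hypothetical test data: a triangular bump on [-1, 1] standing in for f_hat.
def tri(x):
    return max(0.0, 1.0 - abs(x))

# Hypothetical sample phase.
def phi(x):
    return x**2

print(pair_correlation(phi, 1.0, 200, tri, ell_max=200))
```

Choosing `ell_max` at least $N/(2\pi)$ covers the support of this particular stand-in for $\hat f$; a different test function would need a different cutoff.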
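4.6. Proof of Theorem B: pair correlation on average. We now come to the main results. In the first we allow for general polyhomogeneous exponents.

Theorem 4.6.1. Let $U_{t,N} = \exp(itN\Phi(\hat I))$ as above, with $|\phi'(x)| \ge C_1$. Then the limit pair correlation function $\rho_{2,t}$ for $U_{t,N}$ is Poisson on average:
\[ \frac{1}{b-a} \int_a^b \rho_{2,t}(f)\, dt = f(0) + \hat f(0) \]
for any interval $[a, b] \subset \mathbb{R}$ and any $f \in S(\mathbb{R})$ with $\hat f \in C_0^\infty(\mathbb{R})$.

Proof. By the above, it suffices to show that
\[ \lim_{N \to \infty} \frac{1}{b-a} \int_a^b \frac{1}{N^2} \sum_{\ell \neq 0} \hat f\!\left(\frac{2\pi\ell}{N}\right) |S_t(N, \ell)|^2\, dt = f(0). \]
To prove this, we use the Hilbert inequality (cf. [Mo, Sect. 7.6 (28)])
\[ \frac{1}{b-a} \int_a^b \Big| \sum_{j=1}^{N} e^{2\pi i \mu_j t} \Big|^2 dt = N + O\Big( \sum_{j=1}^{N} \frac{1}{\delta_j} \Big) \]
with
\[ \delta_j = \min_{1 \le k \le N,\ k \neq j} |\mu_j - \mu_k|. \]
The two terms correspond respectively to the diagonal and to the off-diagonal in the square, and the O-symbol constant is an absolute constant (which can be taken to be 3/2). In the case at hand, $\mu_j = N\ell \left[ \phi\left(\frac{j}{N}\right) + \frac{\phi_{-1}\left(\frac{j}{N}\right)}{N} + \frac{\phi_{-2}\left(\frac{j}{N}\right)}{N^2} \right]$ so that
\[ \delta_{N,j} \ge \ell \left( \min_{x \in [a,b]} |\phi'(x)| + O(1/N) \right). \]
It follows that if $|\phi'(x)| \ge C > 0$ then
\[ \frac{1}{b-a} \int_a^b |S_t(N, \ell)|^2\, dt = N + O\!\left(\frac{N}{\ell}\right). \]
Therefore the limit equals
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{\ell \neq 0} \hat f\!\left(\frac{2\pi\ell}{N}\right) + O\Big( \limsup_{N \to \infty} \frac{1}{N} \sum_{\ell \neq 0} \frac{1}{\ell}\, \hat f\!\left(\frac{2\pi\ell}{N}\right) \Big). \]
The first term tends to $\int_{\mathbb{R}} \hat f(x)\, dx = f(0)$. When $\mathrm{supp}\, \hat f \subset [-C, C]$ the second is
\[ \ll \frac{1}{N} \sum_{\ell \le CN} \frac{1}{\ell} \ll \frac{\log N}{N}. \]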
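The diagonal/off-diagonal split behind the Hilbert inequality can be checked numerically, since the time average of $|\sum_j e^{2\pi i \mu_j t}|^2$ has a closed form for distinct frequencies. The following sketch is only an illustration under that assumption (distinct $\mu_j$); the helper names `mean_square` and `frequencies` and the sample phase are hypothetical and not taken from the paper.

```python
import numpy as np

def mean_square(mu, a, b):
    """Exact value of (1/(b-a)) * int_a^b |sum_j exp(2 pi i mu_j t)|^2 dt
    for pairwise distinct frequencies mu_j.
    The diagonal contributes len(mu); each off-diagonal pair is integrated
    in closed form."""
    mu = np.asarray(mu, dtype=float)
    n = len(mu)
    diff = mu[:, None] - mu[None, :]        # mu_j - mu_k for all pairs
    d = diff[~np.eye(n, dtype=bool)]        # keep only j != k
    off = (np.exp(2j * np.pi * d * b) - np.exp(2j * np.pi * d * a)) \
        / (2j * np.pi * d * (b - a))
    return n + off.sum().real

def frequencies(phi, N, ell):
    """mu_j = N * ell * phi(j/N), the homogeneous case of the text."""
    j = np.arange(1, N + 1)
    return N * ell * phi(j / N)

# Hypothetical example: quadratic phase, N = 40, ell = 3, averaged over [0, 2].
mu = frequencies(lambda x: x**2, 40, 3)
print(mean_square(mu, 0.0, 2.0))   # compare with the diagonal value N = 40
```

For integer frequencies averaged over a unit interval the off-diagonal part vanishes identically, which gives a simple sanity check of the closed form.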
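4.7. Proof of Theorem B (a): Number variance on average. Next, we prove that the number variance is Poisson on average:

Theorem 4.7.1. With the same hypotheses as above, we have
\[ \lim_{N \to \infty} \frac{1}{b-a} \int_a^b \Sigma^{(N)}_{2,t}\, dt = L. \]

Proof. We have
\begin{align}
\frac{1}{b-a} \int_a^b \Sigma^{(N)}_{2,t}\, dt &= \frac{2}{\pi^2} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2} \sin^2\!\left(\frac{\pi\ell L}{N}\right) \frac{1}{b-a} \int_a^b |\mathrm{Tr}\, U^\ell_{t,N}|^2\, dt \tag{18} \\
&= N \frac{2}{\pi^2} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2} \sin^2\!\left(\frac{\pi\ell L}{N}\right) + \frac{2}{\pi^2} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2} \sin^2\!\left(\frac{\pi\ell L}{N}\right) \frac{1}{b-a} \int_a^b \sum_{j \neq k,\ j,k=1}^{N} e^{2\pi i t N \ell \left[ \phi\left(\frac{j}{N}\right) - \phi\left(\frac{k}{N}\right) \right]}\, dt. \tag{19}
\end{align}
The first term is a Riemann sum and we have
\[ \frac{1}{N} \frac{2}{\pi^2} \sum_{\ell=1}^{\infty} \frac{N^2}{\ell^2} \sin^2\!\left(\frac{\pi\ell L}{N}\right) \to \frac{2}{\pi} \int_0^\infty \left( \frac{\sin Lx}{x} \right)^2 dx = L. \]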
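The convergence of the first term to $L$ can be made quantitative with the classical Fourier series identity $\sum_{\ell \ge 1} \sin^2(\ell x)/\ell^2 = x(\pi - x)/2$ for $0 \le x \le \pi$, which gives the value $L(1 - L/N)$ exactly when $0 \le L \le N$. The sketch below compares a truncated series with this closed form; the function name `first_term` and the truncation length are assumptions made for illustration.

```python
import math

def first_term(N, L, terms=200_000):
    """Truncated value of N * (2/pi^2) * sum_{ell>=1} sin^2(pi ell L / N) / ell^2."""
    s = sum(math.sin(math.pi * ell * L / N) ** 2 / ell**2
            for ell in range(1, terms + 1))
    return N * 2.0 / math.pi**2 * s

for N in (100, 1000):
    L = 3.0
    # Closed form from the Fourier series identity: L * (1 - L/N).
    print(N, first_term(N, L), L * (1 - L / N))
```

The truncation error is at most about $N \cdot (2/\pi^2)/\texttt{terms}$, so the number of terms should grow with $N$ if more digits are wanted.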
The second can be explicitly evaluated as above and is bounded by
\[ N \log N \sum_{\ell=1}^{\infty} \frac{1}{\ell^3} \sin^2\!\left(\frac{\pi\ell L}{N}\right) = O\!\left( \frac{(\log N)^2}{N} \right). \]

5. Mean Square Poisson Statistics for Quantum Spin Evolutions: Proof of Theorem B (b)

We have just seen that averages of the pair correlation function $\rho^{(N)}_{2,t}$ of quantized Hamilton flows $U_{t,N}$ converge to $\rho_2^{\mathrm{POISSON}}$ as long as $H$ is a perfect Morse function. In particular, the result is true for linear functions $H = \alpha I$ in the action. Since exponential sums with linear phases are far from random (see Sect. 6), it is evident that the averaging is the agent producing the random number behaviour. A much stronger test of the Poisson behaviour of $U_{t,N}$ is whether the variance
\[ \frac{1}{b-a} \int_a^b |\rho^{(N)}_{2,t}(f) - \rho_2^{\mathrm{POISSON}}(f)|^2\, dt \]
tends to zero as $N \to \infty$. It is easy to see that quantum Hamiltonian flows with linear Hamiltonians do not have this property (Sect. 6). We therefore turn to quadratic Hamiltonians $\hat H = \alpha \hat I^2 + \beta \hat I$. Our next result shows that the pair correlation functions of their quantized Hamilton flows converge in mean square to Poisson. The proof is based on techniques from mean value estimates on exponential sums; see [B.I, G.K, H.1] for background.
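Theorem 5.0.2. Let $\hat H_{\alpha,\beta} = \alpha \hat I^2 + \beta \hat I$ and let $\rho^{(N)}_{2;(t,\alpha,\beta)}$ be the pair correlation measure for the quantum map $U_{(t,\alpha,\beta),N} = \exp(itN\hat H_{(\alpha,\beta;N)})$. Then for any $t \neq 0$, any $T > 0$ and any $f \in S(\mathbb{R})$ with $\hat f \in C_o^\infty(\mathbb{R})$ we have
\[ \frac{1}{(2T)^2} \int_{-T}^{T} \int_{-T}^{T} \left| \rho^{(N)}_{2;(t,\alpha,\beta)}(f) - \rho_2^{\mathrm{POISSON}}(f) \right|^2 d\alpha\, d\beta = O\!\left( \frac{(\log N)^3}{N} \right). \]

Proof. Removing the $\ell = 0$ and diagonal terms as above, it suffices to show that
\[ \frac{1}{(2T)^2} \int_{-T}^{T} \int_{-T}^{T} \Big| \frac{1}{N^2} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \sum_{j \neq k,\ j,k=1}^{N} e\Big( t\ell \Big[ \alpha \frac{j^2 - k^2}{N} + \beta (j - k) \Big] \Big) \Big|^2 d\alpha\, d\beta = O\!\left( \frac{(\log N)^3}{N} \right). \tag{20} \]
Here, $e(x) = e^{2\pi i x}$. To prove this, we use the Beurling–Selberg function $B_{T,|t|\delta}$. It has the following properties (see [G.K]):
• $B_{T,|t|\delta} \ge \chi_{[-T,T]}$;
• $\mathrm{Supp}\, \hat B_{T,|t|\delta} \subset (-|t|\delta, |t|\delta)$.
Here, $\chi_{[-T,T]}$ is the characteristic function of $[-T, T]$. Then the integral in (20) is bounded above by
\[ \int_{\mathbb{R}} \int_{\mathbb{R}} B_{T,|t|\delta}(\alpha) B_{T,|t|\delta}(\beta) \Big| \frac{1}{N^2} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \sum_{j \neq k,\ j,k=1}^{N} e\Big( t\ell \Big[ \alpha \frac{j^2 - k^2}{N} + \beta (j - k) \Big] \Big) \Big|^2 d\alpha\, d\beta. \tag{21} \]
Squaring and evaluating the Fourier transforms gives
\[ \frac{1}{N^4} \sum_{\ell_1 \neq 0} \sum_{\ell_2 \neq 0} \hat f\!\left(\frac{\ell_1}{N}\right) \bar{\hat f}\!\left(\frac{\ell_2}{N}\right) \sum_{j_1 \neq k_1,\ j_1,k_1=1}^{N}\ \sum_{j_2 \neq k_2,\ j_2,k_2=1}^{N} \hat B_{T,|t|\delta}\Big( t \Big( \ell_1 \frac{j_1^2 - k_1^2}{N} - \ell_2 \frac{j_2^2 - k_2^2}{N} \Big) \Big)\, \hat B_{T,|t|\delta}\big( t ( \ell_1 (j_1 - k_1) - \ell_2 (j_2 - k_2) ) \big). \tag{22} \]
By the support properties of $\hat B_{T,|t|\delta}$, the latter expression is bounded above by
\[ \frac{1}{N^4} \sum_{\ell_1 \neq 0} \sum_{\ell_2 \neq 0} \left| \hat f\!\left(\frac{\ell_1}{N}\right) \right| \left| \hat f\!\left(\frac{\ell_2}{N}\right) \right| I(N, \ell_1, \ell_2) \tag{23} \]
with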
\[ I(N, \ell_1, \ell_2) = \# \Big\{ (j_1, k_1, j_2, k_2) \in [1, N]^4 : j_i \neq k_i :\ \Big| \ell_1 \frac{j_1^2 - k_1^2}{N} - \ell_2 \frac{j_2^2 - k_2^2}{N} \Big| \le \delta,\ |\ell_1 (j_1 - k_1) - \ell_2 (j_2 - k_2)| \le \delta \Big\}. \tag{24} \]
Introduce new variables $h_i = j_i - k_i$, $m_i = j_i + k_i$ so that the conditions read
\[ \Big| \ell_1 \frac{h_1 m_1}{N} - \ell_2 \frac{h_2 m_2}{N} \Big| \le \delta, \qquad |\ell_1 h_1 - \ell_2 h_2| \le \delta. \]
The change of variables is invertible so $I(N, \ell_1, \ell_2)$ is the number of integer solutions $(h_1, h_2, m_1, m_2)$ with $|m_i| \le 2N$, $|h_i| \le N - |m_i|$. Since $|\ell_1 h_1 - \ell_2 h_2| \in \mathbb{N}$ it can only be $< \delta$ if it vanishes. Therefore the second condition is equivalent to
\[ \langle \ell, h \rangle = 0 \ \Rightarrow\ h_2 = \frac{\ell_1 h_1}{\ell_2}. \]
Here we assume $|h_1| \ge |h_2|$ so that $|h_1| = \max\{|h_1|, |h_2|\} \sim |h|$ and we abbreviate $\ell = (\ell_1, \ell_2)$ etc. Substituting in the first condition we get $\ell_1 h_1 (m_1 - m_2) = O(N\delta)$.

Now let us count solutions. We split them up into two classes: (a) homogeneous solutions with $m_1 - m_2 = 0$ and (b) inhomogeneous solutions with $m_1 - m_2 \neq 0$. The homogeneous solutions are lattice points $(h_1, m_1, \ell_1; h_2, m_2, \ell_2)$ which solve the system
\[ \ell_1 h_1 = \ell_2 h_2 \ (h_1 \neq 0,\ h_2 \neq 0); \qquad m_1 = m_2. \]
Clearly there are $2N$ solutions of $m_1 = m_2$. For each integer $s$ in $[-N^2, N^2]$ there are $\le d(s)$ ways of writing $s$ as a product $h_1 \ell_1$ with $h_1, \ell_1 \in [-N, N]$. Here, $d(s)$ is the divisor function (the number of non-trivial divisors of $s$). Hence the number of solutions of the homogeneous system is $O\big(N \sum_{s=1}^{N^2} d(s)^2\big) = O(N^3 (\log N)^3)$.

Now let us count inhomogeneous solutions. We write $I_{ih}(N, \ell_1, \ell_2)$ for the number of inhomogeneous solutions with fixed $(\ell_1, \ell_2)$. Since $h_2$ is determined from $(h_1, m_1, m_2)$ it suffices to count these triples. First, there are $O(N)$ choices of $m_1$. Then put $m_2 = m_1 + M$ with $M \ge 1$ so that $h_1 M = O\big(\frac{N\delta}{\ell_1}\big)$. From $M \le 2N$ the number of pairs $(h_1, M)$ is bounded above by
\[ \sum_{M=1}^{2N} \frac{N}{\ell_1} \frac{1}{M} = O\!\left( \frac{N}{\ell_1} \log N \right). \]
Hence $I_{ih}(N, \ell_1, \ell_2) \ll \frac{N^2 \log N}{\ell_1}$. It follows that for $\hat f \in C_o^\infty(\mathbb{R})$,
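\begin{align}
\frac{1}{N^4} \sum_{\ell_1 \neq 0} \sum_{\ell_2 \neq 0} \left| \hat f\!\left(\frac{\ell_1}{N}\right) \right| \left| \hat f\!\left(\frac{\ell_2}{N}\right) \right| I(N, \ell_1, \ell_2)
&\ll O\!\left( \frac{(\log N)^3}{N} \right) + \frac{\log N}{N^2} \sum_{\ell_1 \neq 0} \sum_{\ell_2 \neq 0} \left| \hat f\!\left(\frac{\ell_1}{N}\right) \right| \left| \hat f\!\left(\frac{\ell_2}{N}\right) \right| \frac{1}{\ell_1} \notag \\
&\ll O\!\left( \frac{(\log N)^3}{N} \right) + O\!\left( \frac{(\log N)^2}{N} \right) = O\!\left( \frac{(\log N)^3}{N} \right). \tag{25}
\end{align}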
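For small $N$ the counting function $I(N, \ell_1, \ell_2)$ of (24) can be computed by brute force, which is a convenient way to see the homogeneous solutions $m_1 = m_2$ appear. This is only an illustrative sketch: the function name `count_I` and the sample parameters are assumptions, and the $O(N^4)$ loop is far too slow for anything but toy sizes.

```python
from itertools import product

def count_I(N, l1, l2, delta):
    """Brute-force count of (j1, k1, j2, k2) in [1, N]^4 with j_i != k_i
    satisfying both inequalities in (24)."""
    count = 0
    for j1, k1, j2, k2 in product(range(1, N + 1), repeat=4):
        if j1 == k1 or j2 == k2:
            continue
        # First condition: | l1 (j1^2 - k1^2)/N - l2 (j2^2 - k2^2)/N | <= delta
        quad = abs(l1 * (j1 * j1 - k1 * k1) - l2 * (j2 * j2 - k2 * k2)) / N
        # Second condition: | l1 (j1 - k1) - l2 (j2 - k2) | <= delta
        lin = abs(l1 * (j1 - k1) - l2 * (j2 - k2))
        if quad <= delta and lin <= delta:
            count += 1
    return count

# Hypothetical sample values.
print(count_I(6, 1, 1, 0.5), count_I(6, 1, 2, 0.5))
```

With $\ell_1 = \ell_2 = 1$ and a very small $\delta$, only the solutions with $(j_1, k_1) = (j_2, k_2)$ survive, so the count reduces to $N(N-1)$.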
5.1. Proof of Theorem B(b) for general non-degenerate phases. We now consider a much more general class of Hamiltonians for which a similar result holds:

Theorem 5.1.1. Let $\hat H_{\alpha,\beta} = \alpha\phi(\hat I) + \beta \hat I$, where $|\phi''| > C > 0$ on $[-1, 1]$ and let $\rho^{(N)}_{2;(t,\alpha,\beta)}$ be the pair correlation measure for the quantum map $U_{(t,\alpha,\beta),N} = \exp itN\hat H_{(\alpha,\beta;N)}$. Then for any $t \neq 0$, any $T > 0$ and any $f \in S(\mathbb{R})$ with $\hat f \in C_o^\infty(\mathbb{R})$ we have
\[ \frac{1}{(2T)^2} \int_{-T}^{T} \int_{-T}^{T} \left| \rho^{(N)}_{2;(t,\alpha,\beta)}(f) - \rho_2^{\mathrm{POISSON}}(f) \right|^2 d\alpha\, d\beta = O\!\left( \frac{(\log N)^3}{N} \right). \]

Proof. The previous argument now leads to the lattice point problem:
\[ \ell_1 h_1 = \ell_2 h_2, \qquad \ell_1 \Big[ \phi\Big(\frac{j_1}{N}\Big) - \phi\Big(\frac{k_1}{N}\Big) \Big] - \ell_2 \Big[ \phi\Big(\frac{j_2}{N}\Big) - \phi\Big(\frac{k_2}{N}\Big) \Big] = O(1/N). \]
Here as above $h_i = j_i - k_i$. By the mean value theorem there exist $\xi_{j_i k_i} \in [j_i, k_i]$ such that $\phi\big(\frac{j_i}{N}\big) - \phi\big(\frac{k_i}{N}\big) = \frac{1}{N} \phi'(\xi_{j_i k_i}/N)(j_i - k_i)$. As above we then get the system of constraints:
\[ \ell_1 h_1 = \ell_2 h_2 \ (\ell_i \neq 0,\ h_i \neq 0), \qquad h_1 \ell_1 \big( \phi'(\xi_{j_1 k_1}/N) - \phi'(\xi_{j_2 k_2}/N) \big) = O(1). \]
Then writing $\phi'(\xi_{j_1 k_1}/N) - \phi'(\xi_{j_2 k_2}/N) = \phi''(\xi_{j_1 k_1 j_2 k_2}/N)(\xi_{j_1 k_1} - \xi_{j_2 k_2})/N$ we get the system:
\[ \ell_1 h_1 = \ell_2 h_2 \ (\ell_i \neq 0,\ h_i \neq 0), \qquad h_1 \ell_1 \phi''(\xi_{j_1 k_1 j_2 k_2}/N)(\xi_{j_1 k_1} - \xi_{j_2 k_2}) = O(N). \]
By the assumption $|\phi''| > C > 0$ this gives
\[ \ell_1 h_1 = \ell_2 h_2 \ (\ell_j \neq 0,\ h_j \neq 0), \qquad h_1 \ell_1 (\xi_{j_1 k_1} - \xi_{j_2 k_2}) = O(N). \]
Let us change to the variables $(h_i, m_i)$ as in the quadratic case and write $\xi_{j_i k_i} = \xi(h_i, m_i)$. We wish to count the number of lattice points $(\ell_1, h_1, m_1, \ell_2, h_2, m_2)$ satisfying the system
\[ \ell_1 h_1 = \ell_2 h_2 \ (\ell_j \neq 0,\ h_j \neq 0), \qquad |\xi(h_1, m_1) - \xi(h_2, m_2)| \le \frac{N}{h_1 \ell_1}. \tag{26} \]
As in the quadratic case, we regard $(h_1, \ell_1, m_1)$ as independent variables, so that the first equation is a constraint on $(h_2, \ell_2)$. We also regard the second constraint $|\xi(h_1, m_1) - \xi(h_2, m_2)| \le \frac{N}{h_1 \ell_1}$ as a constraint on $|m_2|$. To put it in a more convenient form we consider $\xi(h_2, m_2)$ as a function $\xi_{h_2}(m_2)$ and invert the function $\xi_{h_2}$. An easy calculation gives
\[ \xi_h(m) = N (\phi')^{-1} \Big( \int_0^1 \phi'\Big( \frac{m}{2N} + (2s - 1) \frac{h}{2N} \Big) ds \Big) \tag{27} \]
hence
\[ \frac{\partial}{\partial m} \xi_h(m) = \frac{1}{2} \Big\{ \big( (\phi')^{-1} \big)' \Big( \int_0^1 \phi'\Big( \frac{m}{2N} + (2s - 1) \frac{h}{2N} \Big) ds \Big) \Big\} \cdot \Big[ \int_0^1 \phi''\Big( \frac{m}{2N} + (2s - 1) \frac{h}{2N} \Big) ds \Big]. \tag{28} \]
Since $C \le |\phi''(x)| \le C'$ for $x \in [-1, 1]$ and certain positive constants $C, C'$, it follows that $|\frac{\partial}{\partial m} \xi_h(m)| \ge \delta > 0$ ($\forall m \in [0, 2N]$). Therefore a smooth inverse function $\xi_h^{-1}$ exists on the range of $\xi$. In particular, $\xi_h^{-1}$ is Lipschitz, so (26) is equivalent to the system
\[ \ell_1 h_1 = \ell_2 h_2 \ (\ell_j \neq 0,\ h_j \neq 0), \qquad |m_2 - \xi_{h_2}^{-1}(\xi(h_1, m_1))| \le C' \frac{N}{h_1 \ell_1}. \tag{29} \]
The situation is now very close to the quadratic case: There are $d(h_1 \ell_1)$ solutions $(h_2, \ell_2)$ of the first equation and $N$ values of $m_1$. For given $(h_1, \ell_1, h_2, \ell_2, m_1)$ there are $\le 1 + O\big(\frac{N}{h_1 \ell_1}\big)$ solutions $m_2$ of the second equation. Summing 1 over the relevant $(h_1, \ell_1, h_2, \ell_2, m_1)$ gives $O(N^3 (\log N)^3)$ as in the homogeneous part of the quadratic case. Summing $O\big(\frac{N}{h_1 \ell_1}\big)$ over these lattice points gives $O(N^2 (\log N)^2)$ as in the inhomogeneous part of the quadratic case.

Corollary 5.1.2. Let $N_m = [m (\log m)^5]$ ($[\cdot]$ = integer part). Then for almost all $(\alpha, \beta)$ with respect to Lebesgue measure and all $t \neq 0$ we have
\[ \lim_{m \to \infty} \rho^{N_m}_{2;(t,\alpha,\beta)} = \rho_2^{\mathrm{POISSON}}. \]

Proof. By the above,
\[ \sum_{m=1}^{\infty} \frac{1}{(2T)^2} \int_{-T}^{T} \int_{-T}^{T} \left| \rho^{(N_m)}_{2;(t,\alpha,\beta)}(f) - \rho_2^{\mathrm{POISSON}}(f) \right|^2 d\alpha\, d\beta < \infty. \]
Since the terms are positive it follows that for almost all $(\alpha, \beta)$,
\[ \sum_{m=1}^{\infty} |\rho^{(N_m)}_{2;(t,\alpha,\beta)}(f) - \rho_2^{\mathrm{POISSON}}(f)|^2 < \infty \]
and for these $(\alpha, \beta)$ the $m$th term tends to zero. The set of such $(\alpha, \beta)$ a priori depends on the smooth function $f$. However by a standard diagonal argument one can find a set of full measure that works for every continuous $f$ (see [Sa.2, R.S] for further details).
Remark 1. In this corollary we have adapted an argument from [Sa.2, R.S], where the pair correlation problem is studied for flat tori and for some homogeneous integrable systems. Their main result was that the relevant pair correlation functions are almost everywhere Poisson. After proving the almost everywhere convergence to Poisson along a slightly sparse subsequence (as in the above corollary), they show that for $N_m < M < N_{m+1}$,
\[ \rho^{(M)}_{2;(t,\alpha,\beta)}(f) - \rho^{(N_m)}_{2;(t,\alpha,\beta)}(f) = o(1) \]
as $m \to \infty$. This last step seems to be much more difficult in our problem. The difference is that the spectra in [Sa.2][R.S] increase with increasing $N$ and the common terms cancel in the difference above. On the other hand, our spectra change rapidly with $N$ and there are no (obvious) common terms to cancel. In [Z.Addendum] we will show that an a.e. result comparable to that of [R.S] holds for the average in $N$ of $\rho^{(N)}_{2;(t,\alpha,\beta)}$ if the phase is a polynomial.

6. Appendix: Linear and Quadratic Cases

In the case of linear and pure quadratic Hamiltonians, the exponential sums discussed above are classical and there are many prior results in the literature. We briefly discuss what is known and add a few observations of our own.

First, the pair correlation problem for linear Hamiltonians $H = \alpha I$ has been studied since the fifties. See [Bl.2, R.S, Ca.Gu.Iz] for discussion and references to the literature. The main result is that only three level spacings can occur for a given $\alpha$ and the pair correlation function is not even mean square Poisson.

In the case of quadratic Hamiltonians, we get the incomplete Gauss sums:
\[ S_t(N; \ell) = \sum_{j=1}^{N} e^{2\pi i t N \ell \left[ \left(\frac{j}{N}\right)^2 + \alpha \frac{j}{N} \right]}. \]
In the special case $\alpha = 0$ and $t = 1$ they are classical complete Gauss sums
\[ G(\ell, 0; N) = \sum_{j=1}^{N} e^{2\pi i \ell \frac{j^2}{N}}. \]
If $(\ell, N) = 1$ then
\[ |G(\ell, 0, N)| = \begin{cases} \sqrt{N} & \text{if } N \equiv 1 \ (\mathrm{mod}\ 2) \\ \sqrt{2N} & \text{if } N \equiv 0 \ (\mathrm{mod}\ 4) \\ 0 & \text{if } N \equiv 2 \ (\mathrm{mod}\ 4). \end{cases} \]
In general
\[ G(\ell, 0, N) = (\ell, N)\, G\!\left( \frac{\ell}{(\ell, N)}, 0; \frac{N}{(\ell, N)} \right). \]
Hence the values of
\[ I_N = \frac{1}{N^2} \sum_{\ell \neq 0} \hat f\!\left(\frac{2\pi\ell}{N}\right) |S_t(N, \ell)|^2 \]
depend on the residue class of $N$ modulo 4. If $N \equiv 2$ (mod 4) then $I_N = 0$. If $N$ is odd, then
\[ I_N = \frac{1}{N} \sum_{\ell \neq 0} (\ell, N)\, \hat f\!\left(\frac{2\pi\ell}{N}\right) = \frac{1}{N} \sum_{k \in \mathbb{Z} : k \neq 0} k \sum_{\ell : (\ell, N) = k} \hat f\!\left(\frac{2\pi\ell}{N}\right). \]
When $N = p$, a prime number, then $(\ell, p) = 1$ except for multiples $kp$ with $k \in \mathrm{supp}(\hat f)$. They make a negligible contribution, so
\[ I_p = \frac{1}{p} \sum_{\ell \neq 0} \hat f\!\left(\frac{2\pi\ell}{p}\right) \to \int_{\mathbb{R}} \hat f(x)\, dx = f(0). \]
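The moduli of the complete Gauss sums quoted above, and the reduction formula for $(\ell, N) > 1$, are easy to confirm numerically for small $N$. The sketch below is an illustration only; the function name `gauss_sum` is an assumption, and the exponent is reduced modulo $N$ before exponentiating purely to limit rounding error.

```python
import cmath
import math

def gauss_sum(ell, N):
    """G(ell, 0; N) = sum_{j=1}^N exp(2 pi i ell j^2 / N)."""
    return sum(cmath.exp(2j * math.pi * ((ell * j * j) % N) / N)
               for j in range(1, N + 1))

# Coprime cases: N odd, N = 0 mod 4, N = 2 mod 4.
for ell, N in [(3, 7), (3, 8), (5, 6)]:
    print(ell, N, abs(gauss_sum(ell, N)))

# Reduction when gcd(ell, N) = d > 1: G(ell, 0; N) = d * G(ell/d, 0; N/d).
ell, N = 4, 12
d = math.gcd(ell, N)
print(abs(gauss_sum(ell, N) - d * gauss_sum(ell // d, N // d)))
```

The three coprime examples correspond to the three residue classes in the case distinction, and the last line should print a value at rounding-error level.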
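Thus the prime sequence is Poisson. At the opposite extreme, suppose $N = p^M$ for some prime $p$. Then
\[ I_{p^M} = \frac{1}{p^M} \sum_{k=0}^{M} p^k \sum_{q=1,\ (q,p)=1}^{p^{M-k}} \hat f\!\left( \frac{q}{p^{M-k}} \right). \]
The sums
\[ \frac{1}{p^{M-k}} \sum_{q=1,\ (q,p)=1}^{p^{M-k}} \hat f\!\left( \frac{q}{p^{M-k}} \right) \]
have the form of Riemann sums for $f(0)$ as $M - k \to \infty$ except that the terms with $p | q$ are omitted. These terms also resemble Riemann sums for $f(0)$, multiplied by $1/p$. Hence each term is roughly $1 - 1/p$ times $f(0)$. Since there are $M$ such terms, the coefficient of $f(0)$ tends to infinity and the pair correlation function cannot be Poisson.

6.1. Poisson on average in $t$. If we allow $t$ to vary then we do have an average Poisson behaviour:

Proposition 6.1.1. For any interval $[-T, T]$, the average PCF of $\exp itN\hat I^2$ is Poisson, i.e.
\[ \frac{1}{N^2} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \frac{1}{2T} \int_{-T}^{T} \sum_{j \neq k} e^{i\ell t \frac{j^2 - k^2}{N}}\, dt = o(1). \]

Proof. The integral equals
\[ \frac{1}{N} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \frac{1}{\ell} \sum_{j \neq k} \frac{\sin\big( \ell T \frac{j^2 - k^2}{N} \big)}{j^2 - k^2} = \frac{1}{N} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \sum_{m=1}^{2N} \sum_{0 < |h| \le N - |m|} \frac{\sin\big( \ell T \frac{hm}{N} \big)}{hm\ell}, \]
where as above $h = j - k$, $m = j + k$. Using just that $\sin x \ll 1$ this is
\[ \ll \frac{1}{N} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \sum_{m=1}^{2N} \sum_{0 < |h| \le N - |m|} \frac{1}{hm\ell} \ll \frac{(\log N)^2}{N} \sum_{\ell \neq 0} \hat f\!\left(\frac{\ell}{N}\right) \frac{1}{\ell} = O\!\left( \frac{(\log N)^3}{N} \right). \]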
We have not determined whether the variance tends to zero in this case.

Acknowledgement. This article was completed during visits to the Australian National University and to the Newton Institute. We thank A. Hassell and Z. Rudnick for comments on the proof of Theorem B and J. Marklof for discussions of incomplete theta series. We particularly thank M. Zworski for pointing out several errors and omissions in the first version of this paper. Some further improvements will appear in [Z.Zw].
References

[B.T] Berry, M.V. and Tabor, M.: Level clustering in the regular spectrum. Proc. R. Soc. Lond. A356, (375) (1977)
[Bl] Bleher, P.: Distribution of energy levels of a quantum free particle on a surface of revolution. Duke Math. J. 74, 45–93 (1994)
[Bl.L] Bleher, P. and Lebowitz, J.L.: Energy-level statistics of model quantum systems: universality and scaling in a lattice point problem. J. Stat. Phys. 74, 167–217 (1994)
[Bl.K.S] Bleher, P., Kosygin, D. and Sinai, Ya.G.: Distribution of energy levels of a quantum free particle on a Liouville surface and trace formulae. Commun. Math. Phys. 179, 375–403 (1995)
[B.I] Bombieri, E. and Iwaniec, H.: Some mean value theorems for exponential sums. In: Annali Scuola Normale Superiore – Pisa (4), 13, 633–649 (1986)
[B.G] Boutet de Monvel, L. and Guillemin, V.: The Spectral Theory of Toeplitz Operators. Ann. Math. Studies 99, Princeton, NJ: Princeton U. Press, 1981
[Ca.Gu.Iz] Casati, G., Guarneri, I. and Izrailev, F.M.: Statistical properties of the quasi-energy spectrum of a simple integrable system. Phys. Lett. A 124 (4,5), 263–66 (1987)
[CV] Colin de Verdiere, Y.: Spectre conjoint d’operateurs pseudo-differentiels qui commutent II. Le cas integrable. Math. Zeit. 171, 51–73 (1980)
[D.G] Duistermaat, J.J. and Guillemin, V.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
[G.K] Graham, S.W. and Kolesnik, G.: Van der Corput’s Method of Exponential Sums. London Math. Soc. Lecture Note Series 126, Cambridge: Cambridge U. Press, 1991
[G.H] Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. N.Y.: Wiley-Interscience, 1978
[G.1] Guillemin, V.: Toeplitz operators in n-dimensions. Int. Eq. Op. Theory 7, 145–205 (1984)
[G.2] Guillemin, V.: Moment Maps and Combinatorial Invariants of Hamiltonian T^n-Spaces. Progress in Math. 122, Basel: Birkhauser, 1994
[G.S] Guillemin, V. and Sternberg, S.: Symplectic Techniques in Physics. Cambridge: Cambridge U. Press, 1984
[G.S.2] Guillemin, V. and Sternberg, S.: Some problems in integral geometry and some related problems in microlocal analysis. Am. J. Math. 101, 915–955 (1979)
[G.U] Guillemin, V. and Uribe, A.: Monodromy in the quantum mechanical spherical pendulum. Commun. Math. Phys. 122, 563–574 (1989)
[H.1] Huxley, M.N.: Area, Lattice Points, and Exponential Sums. London Math. Soc. Monographs, New Series 13, New York: Clarendon Press, 1996
[H.2] Huxley, M.N.: The fractional parts of a smooth sequence. Mathematika 35, 292–296 (1988)
[Iz] Izrailev, F.M.: Limiting quasi-energy statistics for simple quantum systems. Phys. Rev. Letts. 56, 541–544 (1986)
[Kea] Keating, J.: The cat maps: Quantum mechanics and classical motion. Nonlinearity 4, 309–341 (1991)
[K.M.S] Kosygin, D.V., Minasov, A.A. and Sinai, Ya.G.: Statistical properties of the spectra of Laplace-Beltrami operators on Liouville surfaces. Russ. Math. Surv. 48, 1–142 (1993)
[Lox.1] Loxton, J.H.: The graphs of exponential sums. Mathematika 30, 153–163 (1983)
[Lox.2] Loxton, J.H.: The distribution of exponential sums. Mathematika 32, 16–25 (1985)
[M] Mendes-France, M.: Entropie, dimension et thermodynamique des courbes planes. In: Seminar on Number Theory, Paris 1981-2, Progress. Math. 38, Boston: Birkhauser, 1983
[Mo] Montgomery, H.L.: Ten Lectures on the Interface Between Analytic Number Theory and Harmonic Analysis. CBMS Lecture Series 84, Providence, RI: AMS Publications, 1990
[P.Z] Porod, U. and Zelditch, S.: Semi-classical limit of random walks, II. Preprint, 1997
[R.S] Rudnick, Z. and Sarnak, P.: The pair correlation function of fractional parts of polynomials. Commun. Math. Phys. To appear
[Sa.1] Sarnak, P.: Arithmetic quantum chaos. In: Israel Mathematical Conference Proc. Vol. 8, 183–236 (1995)
[Sa.2] Sarnak, P.: Values at integers of binary quadratic forms. In: Harmonic analysis and number theory, CMS Conf. Proc. 21, Providence, RI: AMS, 1997, pp. 181–203
[S] Sinai, Ya.G.: Poisson distribution in a geometrical problem. Adv. Sov. Math. 3, 199–214 (1991)
[U.Z] Uribe, A. and Zelditch, S.: Spectral statistics on Zoll surfaces. Commun. Math. Phys. 154, 313–346 (1993)
[Z] Zelditch, S.: Index and dynamics of quantized contact transformations. Annales Inst. Fourier, Grenoble 47, 305–363 (1997)
[Z.Addendum] Zelditch, S.: Addendum: Level spacings for integrable quantum maps in genus zero. Commun. Math. Phys. 196, 319–329 (1998)
[Z.Zw] Zelditch, S. and Zworski, M.: Spacings between phase shifts in a simple scattering problem. In preparation

Communicated by Ya. G. Sinai
Commun. Math. Phys. 196, 319 – 329 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Addendum
Level Spacings for Integrable Quantum Maps in Genus Zero*
Steve Zelditch
Johns Hopkins University, Baltimore, Maryland 21218, USA
Received: 11 September 1997 / Accepted: 30 January 1998

In this note we continue our study [Z] of the pair correlation functions (PCF’s)
\[ \rho_2^n(f) = \frac{1}{n^2} \sum_{\ell} \hat f\!\left(\frac{\ell}{n}\right) |\mathrm{Tr}\, U_n^\ell|^2 = \frac{1}{n^2} \sum_{\ell} \hat f\!\left(\frac{\ell}{n}\right) \Big| \sum_{k=1}^{n} e^{i\ell n \left[ \alpha\phi\left(\frac{k}{n}\right) + \beta \frac{k}{n} \right]} \Big|^2 \]
of completely integrable quantum maps over $\mathbb{CP}^1$. To be specific, the quantum maps are assumed to have the form $U_{n,\alpha,\beta} = e^{in(\alpha\phi(\hat I) + \beta \hat I)}$, where $\hat I$ is an action operator (i.e. an angular momentum operator) with eigenvalues $\frac{k}{n}$ ($k = -n, \dots, n$), acting on the quantum Hilbert space $H_n$ of $n$th degree spherical harmonics at Planck constant $\frac{1}{n}$. Also $\phi$ is a smooth function satisfying $\phi'' \neq 0$ on $[-1, 1]$. Our main result was

Theorem 0.0.1 (Z). Let $n_m = [m (\log m)^5]$. Then for almost all $(\alpha, \beta)$ (in the Lebesgue sense), $\rho^{n_m}_{2,\alpha,\beta} \to \rho_2^{\mathrm{POISSON}}$ as $m \to \infty$.

Our aim in this addendum is to strengthen this result to almost everywhere convergence to Poisson along the entire sequence of Planck constants. The price we pay is that the results apply not to the individual $\rho_2^n$’s but to the average
\[ \bar\rho^N_{2,\alpha,\beta} := \frac{1}{N} \sum_{n=1}^{N} \rho^n_{2,\alpha,\beta}. \tag{1} \]
Here we change the notation from $\rho_2^N$ in [Z] to $\rho_2^n$ so that $N$ is reserved for the cumulative PCF $\bar\rho_2^N$ up to level $N$.

Theorem 0.0.2. Suppose that $\phi(x)$ is a polynomial satisfying $\phi'' \neq 0$ on $[-1, 1]$. Then, for almost all $(\alpha, \beta)$ we have: $\bar\rho^N_{2,\alpha,\beta} \to \rho_2^{\mathrm{POISSON}}$.

* Partially supported by NSF grant #DMS-9703775.
This addendum was motivated by a comparison of the results of [Z] with those of Rudnick-Sarnak [R.S] on the PCF of fractional parts of polynomials. Independently, both [R.S] and [Z] established mean square convergence to Poisson of their respective PCF’s. However, [R.S] went on to prove a.e. convergence. Their technique was first to prove that the local PCF’s $\rho^{n_m}_{2,\alpha,\beta}$ tend to Poisson almost everywhere along a sparse subsequence $\{n_m\}$ of Planck constants, and then to show that for $n \in [n_m, n_{m+1}]$ the oscillation $\rho_2^n - \rho_2^{n_m}$ was relatively small and hence the full sequence converged to Poisson. This latter step seemed (and still seems) intractable in the quantum maps situation [Z]. The main difference is that the local spectra in [R.S] increase with $n$, whereas for quantum maps [Z] they change in rather uncontrollable ways. However we can re-establish a parallel to their situation by focussing on the mean PCF’s $\bar\rho^N_{2,\alpha,\beta}$ rather than the individual $\rho_2^n$’s. Our spectra then increase with $N$ and there is much less oscillation between Planck constants. As in [R.S], the proof of this last step is based on the use of Weyl estimates of exponential sums and seems limited to polynomial phases. In addition to the Weyl method, it also uses some considerations from the measure theory of continued fractions.

1. Preliminary Results on $\bar\rho^N_{2,\alpha,\beta}$

Up until the last step, the analysis of $\bar\rho^N_{2,\alpha,\beta}$ is analogous to the analysis of $\rho^N_{2,\alpha,\beta}$ in [Z]. As in [Z, Theorem (5.1.1)] we have:

Theorem 1.0.3. Let $\hat H_{\alpha,\beta} = \alpha\phi(\hat I) + \beta \hat I$, where $|\phi''| \ge C_o > 0$ on $[-1, 1]$. Let $\bar\rho^N_{2,\alpha,\beta}$ be as above. Then for any $f$ with $\mathrm{supp}\, \hat f$ compact:
\[ \int_{-T}^{T} \int_{-T}^{T} |\bar\rho^N_{2,\alpha,\beta}(f) - \rho_2^{\mathrm{POISSON}}(f)|^2\, d\alpha\, d\beta = O\!\left( \frac{(\log N)^3}{N} \right). \]

Corollary 1.0.4. Let $N_m = [m (\log m)^5]$. Then for almost all $(\alpha, \beta)$ in the Lebesgue sense,
\[ \lim_{m \to \infty} \bar\rho^{N_m}_{2,\alpha,\beta}(f) = \rho_2^{\mathrm{POISSON}}(f). \]

To fill in the gaps in the sparse subsequence $\{N_m\}$, consider $\bar\rho^M_{2,\alpha,\beta}$ for $N_m < M < N_{m+1}$. Obviously,
\[ \bar\rho^M_{2,\alpha,\beta}(f) - \bar\rho^{N_m}_{2,\alpha,\beta}(f) = \frac{N_m - M}{M} \bar\rho^{N_m}_{2,\alpha,\beta}(f) + \frac{1}{M} \sum_{n=N_m}^{M} \rho^n_{2,\alpha,\beta}(f). \tag{2} \]
We have $M - N_m \ll (N_{m+1} - N_m) \sim (m + 1)(\log(m + 1))^5 - m (\log m)^5 \ll (\log m)^5$. So in the first sum $\frac{N_m - M}{M} \ll m^{-1+\epsilon}$. In the second we have $O((\log m)^5)$ terms. Under the assumption $\mathrm{supp}\, \hat f \subset [-1, 1]$ the trivial bound $\rho_2^n(f) \ll n$ already gives
\[ \frac{N_m - M}{M} \bar\rho^{N_m}_{2}(f) + \frac{1}{M} \sum_{n=N_m}^{M} \rho^n_{2,\alpha,\beta}(f) \ll (M - N_m) \ll (\log m)^5. \tag{3} \]
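The size of the gaps $N_{m+1} - N_m$ in the sparse sequence $N_m = [m(\log m)^5]$ can be inspected directly. The following sketch is an illustration, not part of the argument: the helper names `N_m` and `gap_ratio` and the range of $m$ tested are assumptions.

```python
import math

def N_m(m):
    """N_m = [m (log m)^5], with [.] the integer part."""
    return math.floor(m * math.log(m) ** 5)

def gap_ratio(m):
    """(N_{m+1} - N_m) / (log m)^5, which stays bounded if the gap is << (log m)^5."""
    return (N_m(m + 1) - N_m(m)) / math.log(m) ** 5

for m in (100, 1000, 10000):
    print(m, N_m(m + 1) - N_m(m), round(gap_ratio(m), 3))
```

Heuristically the ratio behaves like $1 + 5/\log m$, the derivative of $x(\log x)^5$ divided by $(\log x)^5$, so it decreases slowly towards 1.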
So we just need a tiny improvement on the trivial bound to prove that these terms tend to zero. In the following section we will prove that for almost all $(\alpha, \beta)$, $\rho^n_{2,\alpha,\beta}(f) \le C(\alpha, \beta)\, n^{1 - \frac{2}{K} + \epsilon}$, where $K = 2^{k-1}$ with $k$ the degree of $\phi$. From this it also follows by standard density arguments that $\bar\rho^N_{2,\alpha,\beta}[a, b] \to \rho_2^{\mathrm{POISSON}}[a, b]$ for all intervals $[a, b]$. We refer to [R.S] for the details of the density argument.
2. The Main Lemma The purpose of this section is to prove: Lemma 2.0.5. Suppose that φ is a polynomial of degree k satisfying the hypotheses: (i) |φ00 | > 0 and (ii) |αφ0 + β| > 0 on [−1, 1], . Then for any fˆ ∈ Co (R) and almost all 2 1− K + , where K = 2k−1 . (α, β), we have: n2 ρ¯(n) 2,α,β (f ) ≤ C(α, β)n Recall that the local PCF’s have the form 2 n X ` X k k fˆ +β e αn` φ ρn2,α,β = . n n n `∈Z
k=1
Since $\hat f$ is compactly supported, the $\ell$-sum runs over an interval of integers of the form $[-Cn, Cn]$ for some $C > 0$. For simplicity of notation, and with no loss of generality, we will assume the sum over $\ell$ runs over the interval $[-n, n]$. Throughout we use the notation $e(x) = e^{2\pi i x}$.
2.1. The quadratic case. The case of quadratic polynomials is more elementary than that of polynomials of general degree and we can prove our main result without analysing continued fraction convergents to $\alpha$. Hence we begin by discussing this case. The relevant exponential sum is
\[ \Big|\sum_{k=1}^{n} e\Big(n\ell\Big[\alpha\,\phi\Big(\frac{k}{n}\Big) + \beta\,\frac{k}{n}\Big]\Big)\Big|^2 = \sum_{h=-n}^{n}\sum_{x=1}^{2n} e\Big(\ell h\Big(\alpha\frac{x}{n} + \beta\Big)\Big). \]
For $f$ with $\mathrm{supp}\,\hat f$ in $[-1,1]$ we have
\[ n^2 \rho^n_{2,\alpha,\beta}(f) \ll \sum_{|\ell|\le n}\ \sum_{h=-n}^{n} \Big|\sum_{x=1}^{2n} e\Big(\ell h\Big(\alpha\frac{x}{n} + \beta\Big)\Big)\Big| . \]
The following estimate is weaker than that claimed in the Main Lemma but is sufficient for the proof of the theorem.
Lemma 2.1.1. Let $\alpha$ be a diophantine number satisfying $|\alpha - \frac{a}{q}| \ge \frac{K(\alpha)}{q^{2+\epsilon}}$ for any rational number $\frac{a}{q}$. Then for all $\beta$,
\[ \rho^n_{2,\alpha,\beta}(f) \ll n^{\frac12+\epsilon}. \]
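Proof. We begin with the standard estimate (e.g. [K, Lemma 1])
\[ \Big|\sum_{x=1}^{2n} e\Big(\ell h\Big(\alpha\frac{x}{n} + \beta\Big)\Big)\Big| = \Big|\sum_{x=1}^{2n} e\Big(\ell h\,\alpha\frac{x}{n}\Big)\Big| \le \min\Big(2n, \frac{1}{2\|\ell h\frac{\alpha}{n}\|}\Big), \]
where $\|\cdot\|$ denotes the distance to the nearest integer. This gives
\[ n^2 \rho^n_{2,\alpha,\beta}(f) \ll \sum_{|\ell|\le n}\ \sum_{h=-n}^{n} \min\Big(2n, \frac{1}{2\|\ell h\frac{\alpha}{n}\|}\Big). \]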
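The standard estimate quoted above can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names and the choice $\alpha = \sqrt{2}$ in the usage note are assumptions made only for illustration.

```python
import cmath
import math


def e(x: float) -> complex:
    """e(x) = exp(2*pi*i*x), the notation used in the text."""
    return cmath.exp(2j * math.pi * x)


def dist_to_nearest_int(t: float) -> float:
    """||t||: the distance from t to the nearest integer."""
    return abs(t - round(t))


def linear_sum_abs(theta: float, length: int) -> float:
    """|sum_{x=1}^{length} e(theta * x)|, computed term by term."""
    return abs(sum(e(theta * x) for x in range(1, length + 1)))


def geometric_bound(theta: float, length: int) -> float:
    """min(length, 1 / (2 ||theta||)); the trivial bound is used when ||theta|| = 0."""
    d = dist_to_nearest_int(theta)
    if d == 0.0:
        return float(length)
    return min(float(length), 1.0 / (2.0 * d))
```

With `theta = l * h * alpha / n` and `length = 2 * n` this is the inequality used in the proof; for instance, looping over all `|l|, |h| <= n` with `alpha = math.sqrt(2)` should never produce a sum exceeding the bound, up to floating-point tolerance.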
The variable $x = h\ell$ runs over $[-n^2, n^2]$; when $x \ne 0$, the multiplicity $c_x = \#\{(h,\ell) : h\ell = x\}$ is well-known to have order $n^{\epsilon}$ (e.g. [V, Lemma 2.5]). Then there are $2n$ terms where $h\ell = 0$, each contributing $n$ to the sum. Hence,
\[ n^2 \rho^n_{2,\alpha,\beta}(f) \ll n^2 + n^{\epsilon}\sum_{x=-n^2}^{n^2} \min\Big(2n, \frac{1}{2\|x\frac{\alpha}{n}\|}\Big). \tag{4} \]
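The multiplicity count behind (4) can be illustrated by brute force. This is a small sketch with hypothetical helper names: it tabulates $c_x$ over the square $|h|, |\ell| \le n$ and compares it with the divisor function, which is what gives the $O(n^{\epsilon})$ bound for $x \ne 0$.

```python
from collections import Counter


def product_multiplicities(n: int) -> Counter:
    """c_x = #{(h, l) : |h| <= n, |l| <= n, h * l = x} for every attained x."""
    counts = Counter()
    for h in range(-n, n + 1):
        for l in range(-n, n + 1):
            counts[h * l] += 1
    return counts


def divisor_count(x: int) -> int:
    """Number of positive divisors of |x| (x != 0), by trial division."""
    x = abs(x)
    return sum(1 for d in range(1, x + 1) if x % d == 0)


def max_nonzero_multiplicity(n: int) -> int:
    """Largest c_x over x != 0; grows far more slowly than n."""
    counts = product_multiplicities(n)
    return max(c for x, c in counts.items() if x != 0)
```

For $x \ne 0$ each representation is determined by a signed divisor $h$ of $x$, so $c_x \le 2d(|x|)$. With both variables ranging over $[-n, n]$ the exact number of pairs with $h\ell = 0$ is $4n + 1$, which is of the order $n$ used in the text.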
At this point we are close to the well-known estimate (e.g. Korobov [K, Lemma 14])
\[ \sum_{x=1}^{Q} \min\Big(P, \frac{1}{\|\alpha x + \beta\|}\Big) \ll \Big(1 + \frac{Q}{q}\Big)(P + q\log P), \]
where $\alpha = \frac{a}{q} + \frac{\theta}{q^2}$ with $|\theta| < 1$ and with $(a,q) = 1$. In our situation $Q = n^2$, $P = n$, giving $(1 + \frac{n^2}{q})(n + q\log n)$, but the estimate does not apply because our '$\alpha$' is $\frac{\alpha}{n}$; the rational approximation $\frac{a}{qn}$ to $\frac{\alpha}{n}$ has a remainder of only $\frac{1}{nq^2}$ rather than $\frac{1}{(nq)^2}$. This complicates the argument and worsens the resulting estimate. Since we do not know the continued fraction expansion of $\frac{\alpha}{n}$, we use the rational approximation $\frac{\alpha}{n} = \frac{a}{qn} + \frac{\theta}{nq^2}$. It is not necessary that $(a,n) = 1$ so we rewrite $\frac{a}{qn} = \frac{a'}{qn'}$ with $(a',n') = 1$ (hence $(a', n'q) = 1$). Then
\[ \frac{\alpha}{n} = \frac{a'}{n'q} + \frac{\theta}{nq^2}, \qquad (a',n') = 1, \qquad |\theta| < 1. \]
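Now break up $[-n^2, n^2]$ into blocks of length $n'q$. There are at most $2[\frac{n^2}{n'q}] + 1$ such blocks. Hence
\[ n^{\epsilon}\sum_{x=-n^2}^{n^2} \min\Big(2n, \frac{1}{2\|x\frac{\alpha}{n}\|}\Big) \ll n^{\epsilon}\sum_{y=0}^{[\frac{n^2}{n'q}]+1}\ \sum_{x=1}^{n'q} \min\Big(2n, \frac{1}{2\|(x + yqn')\frac{\alpha}{n}\|}\Big). \tag{5} \]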
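The above rational approximation brings
\[ \frac{\alpha x}{n} + yqn'\frac{\alpha}{n} = \frac{a'x}{n'q} + \frac{x\theta}{nq^2} + ya' + \frac{yn'\theta}{nq}. \]
Hence
\[ \Big\|\frac{\alpha x}{n} + yqn'\frac{\alpha}{n}\Big\| = \Big\|\frac{a'x}{n'q} + \frac{x\theta}{nq^2} + \beta\Big\|, \]
where $\beta = \{\frac{yn'\theta}{nq}\}$. Write $\beta = \frac{b(y)}{n'q} + \frac{\theta_1}{n'q}$ with $b(y) \in \mathbb{Z}$ and with $|\theta_1| < 1$. Since $|x| \le n'q$ we have
\[ \Big\|\frac{a'x + b(y)}{n'q}\Big\| = \Big\|x\frac{\alpha}{n} + yqn'\frac{\alpha}{n} - \frac{x\theta}{nq^2} - \frac{\theta_1}{n'q}\Big\| \le \Big\|x\frac{\alpha}{n} + yqn'\frac{\alpha}{n}\Big\| + \frac{n'}{nq} + \frac{1}{n'q}. \]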
The remainder $\frac{n'}{nq}$ is much larger than occurs in the standard argument and since it is possible that $n' = n$ we can only be sure that the remainder is $O(\frac{1}{q})$. Therefore we are only sure that our sum is
\[ \ll n^{\epsilon}\sum_{y=0}^{[\frac{n^2}{n'q}]+1}\ \sum_{x=1}^{n'q} \min\Big(2n, \frac{1}{2\|\frac{a'x + b(y)}{n'q} + O(\frac{1}{q})\|}\Big). \]
Since $(a', n'q) = 1$, the numbers $a'x + b(y)$ run through a complete residue system modulo $n'q$ as $x$ runs through $1, \dots, n'q$. Hence, the $x$-sum is independent of $a'$, $b(y)$ and we may rewrite it as
\[ \ll \Big(\frac{n^{2+\epsilon}}{n'q} + 1\Big)\sum_{2\le x\le n'q-1} \min\Big(2n, \frac{2}{\|\frac{x}{n'q} + O(\frac{1}{q})\|}\Big). \]
The distance $\|\frac{x}{n'q} + O(\frac{1}{q})\|$ can be less than $\frac{1}{n}$ over the range of terms $x \in [0, Cn']$ and $x \in [n'q - Cn', n'q]$, where $C$ is the implicit constant in $O(\frac{1}{q})$. For these we must take $n$ in the minimum. Since there are $O(n)$ such terms in the $x$-sum, their contribution to the entire sum is $\ll n^{2+\epsilon}\frac{n^2}{n'q}$. For the remaining terms we use that $\min(2n, \frac{2}{\|\frac{x}{n'q}\|})$ is an even function of $x$ to put the $x$-sum in the form
\[ \sum_{Cn'\le x\le \frac{qn'}{2}} \min\Big(2n, \frac{2}{\|\frac{x}{n'q} + O(\frac{1}{q})\|}\Big). \]
The minimum is now surely attained by $\frac{2}{\|\frac{x}{n'q} + O(\frac{1}{q})\|}$, and since $\frac{x}{n'q}$ stays in the left half of the interval we have
\[ \frac{1}{\|\frac{x}{n'q} + O(\frac{1}{q})\|} = \frac{1}{\frac{x}{n'q} + O(\frac{1}{q})}. \]
Therefore
\[ \sum_{Cn'\le x\le \frac{qn'}{2}} \min\Big(2n, \frac{2}{\|\frac{x}{n'q} + O(\frac{1}{q})\|}\Big) \ll n'q \sum_{Cn'\le x\le \frac{qn'}{2}} \frac{1}{x - O(n')} \ll n'q\log(n'q). \]
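The whole $x$-sum is therefore $\ll (\frac{n^2}{n'q} + 1)[n^2 + n'q\log(n'q)]$. In sum, we have
\[ n^{\epsilon}\sum_{x=-n^2}^{n^2} \min\Big(2n, \frac{1}{\|x\frac{\alpha}{n}\|}\Big) \ll \Big(\frac{n^{2+\epsilon}}{n'q} + 1\Big)[n^2 + qn'\log(n'q)]. \]
Hence
\[ \rho^n_{2,\alpha,\beta} \ll 1 + \Big(\frac{n^{\epsilon}}{n'q} + n^{-2}\Big)[n^2 + qn'\log(n'q)]. \]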
The first parenthetical term is of size $n^{1+\epsilon}/q$ when $n' = n$ while the trivial bound was $n$. It is at this point that we must restrict to diophantine numbers satisfying $|\alpha - \frac{a}{q}| \ge \frac{K(\alpha)}{q^{2+\epsilon}}$ for all rational $\frac{a}{q}$. By Dirichlet's box principle there exists $q \le n^r$ and a rational $\frac{a}{q}$ with $(a,q) = 1$ such that $|\alpha - \frac{a}{q}| \le \frac{1}{qn^r}$. It follows that $q > n^{r-\epsilon}$. Substituting into our estimate, we get
\[ \rho^n_{2,\alpha,\beta} \ll 1 + \Big(\frac{n^{-r+\epsilon}}{n'} + n^{-2}\Big)[n^2 + n^r n'\log(n)] \ll n^{\epsilon}\Big((a,n)\,n^{1-r} + \frac{n^{-1+r}}{(a,n)}\Big). \]
Since $1 \le (a,n) \le n$ the final estimate is $\ll n^{\epsilon}(n^{2-r} + n^{-1+r})$. The terms balance when $r = \frac{3}{2}$ to give $\rho^n_{2,\alpha,\beta}(f) \ll n^{\frac12+\epsilon}$.
Remark 1. In the next section we will see that there are rational numbers $\frac{a}{q}$ satisfying the above requirements and also satisfying $(a,n) \le C(\alpha)n^{\epsilon}$. This changes the final estimate to $\ll n^{\epsilon}(n^{1-r} + n^{-1+r})$ and gives $\rho^n_{2,\alpha,\beta}(f) \ll n^{\epsilon}$.
2.2. The general polynomial case. Now let $\phi(x) = \alpha_o x^k + \alpha_1 x^{k-1} + \cdots + \alpha_k$ be a general polynomial. We would like to estimate
\[ \rho^n_2(f) = \frac{1}{n^2}\sum_{\ell\in\mathbb{Z}} \hat f\Big(\frac{\ell}{n}\Big)\Big|\sum_{x=1}^{n} e\Big(n\ell\,\phi\Big(\frac{x}{n}\Big)\Big)\Big|^2 . \]
As in the classical Weyl inequality (cf. [V, Lemma 2.4]) we will estimate
\[ \Big|\sum_{x=1}^{n} e\Big(n\ell\,\phi\Big(\frac{x}{n}\Big)\Big)\Big|^2 \]
by squaring and differencing repeatedly until we reach the linear case. Let $\Delta_j$ be the $j$-th iterate of the forward difference operator, so that $\Delta_1\phi(x;h) = \phi(x+h) - \phi(x)$, $\Delta_{j+1}\phi(x; h_1,\dots,h_{j+1}) = \Delta_1(\Delta_j\phi(x; h_1,\dots,h_j); h_{j+1})$. We recall (cf. [V, Lemma 2.3]):
Lemma 2.2.1. We have
\[ \Big|\sum_{x=1}^{n} e(f(x))\Big|^{2^j} \le (2n)^{2^j - j - 1}\sum_{|h_1|<n}\cdots\sum_{|h_j|<n}\Big[\sum_{x\in I_j} e(\Delta_j f(x; h_1,\dots,h_j))\Big], \]
where the intervals $I_j = I_j(h_1,\dots,h_j)$ satisfy $I_1 \subset [1,n]$, $I_j \subset I_{j-1}$.
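The differencing step can be made concrete with exact rational arithmetic. The sketch below is an illustration under stated assumptions, with hypothetical function names: it builds $f(x) = n\ell\,\phi(x/n)$ for a coefficient list $[\alpha_o, \dots, \alpha_k]$ and compares the $(k-1)$-fold forward difference with $h_1\cdots h_{k-1}\,\ell$ times the linear polynomial $k!\,n^{-k+1}\alpha_o(x + \tfrac12 h_1 + \cdots + \tfrac12 h_{k-1}) + (k-1)!\,n^{-k+2}\alpha_1$.

```python
from fractions import Fraction
from math import factorial, prod


def forward_difference(f, h):
    """Delta_1 f(x; h) = f(x + h) - f(x)."""
    return lambda x: f(x + h) - f(x)


def iterated_difference(f, hs):
    """Delta_j f(x; h_1, ..., h_j), applying one forward difference per step h."""
    g = f
    for h in hs:
        g = forward_difference(g, h)
    return g


def scaled_phase(coeffs, n, l):
    """f(x) = n * l * phi(x / n) with phi(t) = sum_i coeffs[i] * t**(k - i)."""
    k = len(coeffs) - 1

    def f(x):
        t = Fraction(x, n)
        return n * l * sum(c * t ** (k - i) for i, c in enumerate(coeffs))

    return f


def linear_factor(coeffs, n, x, hs):
    """k! n^(1-k) a_0 (x + sum(hs)/2) + (k-1)! n^(2-k) a_1, for degree k >= 2."""
    k = len(coeffs) - 1
    a0, a1 = coeffs[0], coeffs[1]
    leading = factorial(k) * Fraction(1, n ** (k - 1)) * a0 * (x + Fraction(sum(hs), 2))
    lower = factorial(k - 1) * Fraction(1, n ** (k - 2)) * a1
    return leading + lower
```

For integer `x`, `n`, `l` and a list `hs` of `k - 1` integer steps, `iterated_difference(scaled_phase(coeffs, n, l), hs)(x)` should equal `prod(hs) * l * linear_factor(coeffs, n, x, hs)` exactly, since the lower-order coefficients are annihilated by the differencing.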
Now let
\[ T(\phi; n, \ell) = \sum_{x=1}^{n} e\Big(n\ell\,\phi\Big(\frac{x}{n}\Big)\Big) \]
with $\phi(x) = \alpha_o x^k + \cdots + \alpha_k$ and put $K = 2^{k-1}$. Apply the previous lemma with $j = k-1$ to get:
\[ |T(\phi; n, \ell)|^K \ll n^{K-k}\sum_{h_1}\cdots\sum_{h_{k-1}}\ \sum_{x\in I_{k-1}} e\big(h_1\cdots h_{k-1}\,\ell\, p_{k-1}(x; h_1,\dots,h_{k-1}; n, \ell)\big). \]
Here, the sum runs over $h_j$ with $|h_j| \le n$ and
\[ p_{k-1}(x; h_1,\dots,h_{k-1}; n) = k!\,n^{-k+1}\alpha_o\Big(x + \frac12 h_1 + \cdots + \frac12 h_{k-1}\Big) + (k-1)!\,n^{-k+2}\alpha_1 . \]
This is just as in the standard Weyl estimate ([V, D, §3]) except for the powers of $n$ in the coefficients of $p_{k-1}$. Then write
\[ \rho^n_2(f) = \frac{1}{n}\sum_{\ell} \hat f\Big(\frac{\ell}{n}\Big)\Big[\frac{1}{n}|T(\phi; n, \ell)|^2\Big] \ll \frac{1}{n}\sum_{\ell\le n}\Big(\frac{1}{n}|T(\phi; n, \ell)|^2\Big). \tag{6} \]
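Since the $\ell$-sum is an average, we may apply Hölder's inequality with exponent $\frac{K}{2}$ to get
\[ \rho^n_2(f) \ll \Big[\frac{1}{n}\sum_{\ell\le n}\Big|\frac{1}{\sqrt n}\,T(\phi; n, \ell)\Big|^K\Big]^{\frac{2}{K}}. \tag{7} \]
Therefore
\[ [\rho^n_2(f)]^{\frac{K}{2}} \ll n^{K-k}\, n^{-\frac{K}{2}-1}\sum_{\ell\le n}\sum_{h_1}\cdots\sum_{h_{k-1}}\ \sum_{x\in I_{k-1}} e\big(h_1\cdots h_{k-1}\,\ell\, p_{k-1}(x; h_1,\dots,h_{k-1}; n)\big). \]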
There are $n^{k-1}$ terms with $h_1\cdots h_{k-1}\ell = 0$, each contributing $n$ to the $x$-sum. So the contribution of such terms to the total sum is $O(n^k)$, and we get
\[ [\rho^n_2(f)]^{\frac{K}{2}} \ll n^{\frac{K}{2}-k-1}\Big[n^k + \sum_{\ell\le n}\ {\sum_{h,x}}' e\big(h_1\cdots h_{k-1}\,\ell\, p_{k-1}(x; h_1,\dots,h_{k-1}; n)\big)\Big], \tag{8} \]
where the primed sum runs only over non-zero values of $h_1\cdots h_{k-1}\ell$. As in the case with $k = 2$ above we sum over $x$ to get
\[ [\rho^n_2(f)]^{\frac{K}{2}} \ll n^{\frac{K}{2}-k-1}\Big[n^k + \sum_{\ell\le n}\ {\sum_{h}}' \min\Big(n, \frac{1}{\|k!\,h_1\cdots h_{k-1}\,\ell\, n^{-k+1}\alpha\|}\Big)\Big], \tag{9} \]
and then rewrite the variable $k!\,h_1\cdots h_{k-1}\ell$ as a new variable $x$ ranging over $[0, k!\,n^k]$. As before, the number $c_x$ of ways of representing $x \ne 0$ as a product $k!\,h_1\cdots h_{k-1}\ell$ is $O(n^{\epsilon})$ so
\[ [\rho^n_2(f)]^{\frac{K}{2}} \ll n^{\frac{K}{2}-k-1+\epsilon}\Big[n^k + \sum_{x\le k!\,n^k} \min\Big(n, \frac{1}{\|x\,n^{-k+1}\alpha\|}\Big)\Big]. \tag{10} \]
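We now repeat the steps of the quadratic case but with $\frac{\alpha}{n^{k-1}}$ replacing $\frac{\alpha}{n}$. Thus, the rational approximation $\alpha = \frac{a}{q} + \frac{\theta}{q^2}$ gives the approximation $\frac{\alpha}{n^{k-1}} = \frac{a}{n^{k-1}q} + \frac{\theta}{n^{k-1}q^2}$ and hence requires us to break up the sum over $[0, k!\,n^k]$ into blocks of size $n^{k-1}q/(a, n^{k-1})$. Precisely the same argument (with $n'_k = \frac{n^{k-1}}{(a, n^{k-1})}$) then gives
\[ \sum_{x\le k!\,n^k} \min\Big(n, \frac{1}{\|x\,n^{-k+1}\alpha\|}\Big) \ll \Big(\frac{n^k}{qn'_k} + 1\Big)\big(n^k + qn'_k\log(qn'_k)\big). \]
Hence we get
\[ [\rho^n_2(f)]^{\frac{K}{2}} \ll n^{\frac{K}{2}-k-1+\epsilon}\Big[n^k + \Big(\frac{n^k}{qn'_k} + 1\Big)\big(n^k + qn'_k\log(qn'_k)\big)\Big] \ll n^{\frac{K}{2}-k-1+\epsilon}\Big[n^k + \frac{n^{2k}}{qn'_k} + qn'_k\Big]. \tag{11} \]
Recalling that $n'_k = \frac{n^{k-1}}{(a, n^{k-1})}$ the last expression is
\[ \ll n^{\frac{K}{2}-1+\epsilon}\Big[1 + \frac{n^k (a, n^{k-1})}{qn^{k-1}} + \frac{q}{n(a, n^{k-1})}\Big]. \]
Thus,
\[ [\rho^n_2(f)] \ll n^{1-\frac{2}{K}+\epsilon}\Big[1 + \frac{n(a, n^{k-1})}{q} + \frac{q}{n(a, n^{k-1})}\Big]^{\frac{2}{K}}. \tag{12} \]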
The exponent of the right side will be less than one if and only if the exponent of $[1 + \frac{n(a, n^{k-1})}{q} + \frac{q}{n(a, n^{k-1})}]$ is less than one. Thus we are in very much the same situation as in the quadratic case (although the resulting exponent will be increasingly bad as $K \to \infty$). However, the estimate $(a,n) \le n$ used in the quadratic case does not generalize well to higher degree: in higher degree, the estimate $(a, n^{k-1}) \le n^{k-1}$ leads to $r = \frac{k+1}{2}$ and an exponent larger than one. Therefore we need to choose a rational approximation satisfying $(a,q) = 1$ and $|\alpha - \frac{a}{q}| < \frac{1}{q^2}$ and with low value of $(a, n^{k-1})$. The natural candidates for such numbers are the continued fraction convergents $\frac{p_m}{q_m} = [a_o, a_1, \dots, a_m]$ to $\alpha = [a_o, a_1, \dots]$. Therefore we need to study the behaviour of
\[ f_n(\alpha) := \min\Big\{\frac{n(p_m(\alpha), n^{k-1})}{q_m(\alpha)} + \frac{q_m(\alpha)}{n(p_m(\alpha), n^{k-1})}\Big\}. \tag{13} \]
Since $\frac{p_m}{q_m} = \alpha + O(\frac{1}{q_m^2})$ we can (and will) replace the $q_m$ in this definition by $p_m$. Since it is presumably hard to arrange for $(p_m(\alpha), n^{k-1})$ to be large, we will require that $p_m(\alpha) \in [n^{r-\epsilon}, n^r]$ for some exponent $r$ to be determined later. Before proceeding let us recall how the index $m$ is related to $n$, $r$.
Proposition 2.2.2. For any $r, \epsilon > 0$, any $M \in \mathbb{N}$ and almost any $\alpha \in \mathbb{R}$, there exists $n_o \in \mathbb{N}$ with the following property: for $n \ge n_o$ there exist at least $M$ consecutive convergents $p_{m-M}(\alpha), p_{m-M+1}(\alpha), \dots, p_m \in [n^{r-\epsilon}, n^r]$ with $m \le C(\alpha)\log n$.
Proof. By a theorem of Khinchin and Levy [Kh], one knows that for almost all $\alpha$ the convergents satisfy
\[ \lim_{m\to\infty} q_m^{\frac{1}{m}} = \gamma, \qquad \gamma := e^{\frac{\pi^2}{12\log 2}}. \tag{14} \]
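The limit (14) can be explored numerically. The following sketch is only a heuristic illustration: it assumes that a random rational with many binary digits (here 4000 bits, an arbitrary choice) behaves like a typical real number for its first few hundred partial quotients, and all function names are hypothetical.

```python
import math
import random
from fractions import Fraction

# The almost-everywhere limit of q_m^(1/m), approximately 3.2758.
LEVY_LIMIT = math.exp(math.pi ** 2 / (12 * math.log(2)))


def partial_quotients(alpha: Fraction, count: int) -> list:
    """First `count` partial quotients a_0, a_1, ... of a rational alpha."""
    quotients = []
    x = alpha
    for _ in range(count):
        a = x.numerator // x.denominator
        quotients.append(a)
        frac = x - a
        if frac == 0:
            break  # the expansion of a rational number terminates
        x = 1 / frac
    return quotients


def convergents(quotients: list) -> list:
    """Pairs (p_m, q_m) from p_m = a_m p_{m-1} + p_{m-2}, q_m = a_m q_{m-1} + q_{m-2}."""
    p_prev, p = 1, quotients[0]
    q_prev, q = 0, 1
    result = [(p, q)]
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append((p, q))
    return result


def levy_estimate(seed: int = 0, bits: int = 4000, count: int = 300) -> float:
    """Estimate q_m^(1/m) for a pseudo-random alpha in [0, 1)."""
    rng = random.Random(seed)
    alpha = Fraction(rng.getrandbits(bits), 2 ** bits)
    quotients = partial_quotients(alpha, count)
    m = len(quotients) - 1
    q_m = convergents(quotients)[-1][1]
    return math.exp(math.log(q_m) / m)
```

Because the convergence in (14) is slow, `levy_estimate()` should only be expected to land in a neighbourhood of `LEVY_LIMIT`, not to reproduce it to several digits.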
The first claim is equivalent to the statement that there exists $m$ such that, for $0 \le j \le M$, $(r-\epsilon)\log n < \log p_{m-j} = m\log\gamma + o(m) < r\log n$. Evidently there exists $C(\alpha) > 0$ such that $m \le rC(\alpha)\log n$, proving the second claim. The first claim states that for sufficiently large $n$, there are at least $M$ consecutive solutions $m$ of
\[ \Big[\frac{r-\epsilon}{\log\gamma} + o(1)\Big]\log n \le m \le \Big[\frac{r}{\log\gamma} + o(1)\Big]\log n. \]
This is obvious since the width of the interval equals $[\frac{\epsilon}{\log\gamma} + o(1)]\log n$, which is positive and unbounded.
We then have:
Proposition 2.2.3. Fix $k, r, \epsilon > 0$. Then for almost all $\alpha \in \mathbb{R}$ there exists a convergent $\frac{p_m(\alpha)}{q_m(\alpha)}$ with $p_m(\alpha) \in [n^{r-\epsilon}, n^r]$ and with $(p_m(\alpha), n^{k-1}) \le n^{\epsilon}$.
Proof. By the previous proposition, for any $M > 0$, there are at least $M$ consecutive $p_m$'s in $[n^{r-\epsilon}, n^r]$ for sufficiently large $n$. Our goal is to find one satisfying $(p_m(\alpha), n^{k-1}) \le n^{\epsilon}$. To this end we recall [Kh] that
\[ p_m = a_m p_{m-1} + p_{m-2}, \qquad q_m = a_m q_{m-1} + q_{m-2} \]
and hence that $p_m q_{m-1} - p_{m-1} q_m = \pm 1$. It follows that $p_m(\alpha)$, $p_{m-1}(\alpha)$ are relatively prime. This pattern continues in a sufficiently useful way. By a simple induction we find that for $k < m$,
\[ p_m q_{m-k} - p_{m-k} q_m = \pm E_{k-1}(a_m, a_{m-1}, \dots, a_{m-k+1}), \tag{15} \]
where $E_0 = 1$, $E_1(a_m) = a_m$, $E_2(a_m, a_{m-1}) = a_m a_{m-1} + 1$ and where
\[ E_k(a_m, a_{m-1}, \dots, a_{m-k}) = a_{m-k} E_{k-1}(a_m, a_{m-1}, \dots, a_{m-k+1}) + E_{k-2}(a_m, a_{m-1}, \dots, a_{m-k+2}). \]
Hence any common divisor of $p_m$ and $p_{m-2}$ is a divisor of $a_m$, and so on. We now claim that for the $M$ consecutive $p_m$'s in $[n^{r-\epsilon}, n^r]$ we have:
\[ (p_{m-M}, n^{k-1})(p_{m-M+1}, n^{k-1})\cdots(p_m, n^{k-1}) \le n^{k-1}\prod_{j=0}^{M}\prod_{\ell=1}^{M-j} E_\ell(a_{m-j}, a_{m-j-1}, \dots, a_{m-j-\ell+1}). \tag{16} \]
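The recurrences behind (15) are easy to test on explicit continued fractions. The sketch below is an illustration with hypothetical names; it uses the indexing convention in which $E_j$ takes exactly $j$ arguments, so that $|p_m q_{m-k} - p_{m-k} q_m| = E_{k-1}(a_m, \dots, a_{m-k+2})$, which may differ by one from the way the arguments are listed in the text.

```python
from math import gcd


def convergents(quotients: list) -> list:
    """Pairs (p_m, q_m) from p_m = a_m p_{m-1} + p_{m-2}, q_m = a_m q_{m-1} + q_{m-2}."""
    p_prev, p = 1, quotients[0]
    q_prev, q = 0, 1
    result = [(p, q)]
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append((p, q))
    return result


def continuant(values: list) -> int:
    """E_0 = 1, E_1(x1) = x1, E_j(x1..xj) = xj * E_{j-1}(x1..x_{j-1}) + E_{j-2}(x1..x_{j-2})."""
    e_prev, e = 0, 1
    for v in values:
        e_prev, e = e, v * e + e_prev
    return e


def cross_term(conv: list, m: int, k: int) -> int:
    """|p_m q_{m-k} - p_{m-k} q_m| for 1 <= k <= m."""
    p_m, q_m = conv[m]
    p_mk, q_mk = conv[m - k]
    return abs(p_m * q_mk - p_mk * q_m)


def identity_holds(quotients: list) -> bool:
    """Check the continuant identity for every admissible pair (m, k)."""
    conv = convergents(quotients)
    for m in range(1, len(conv)):
        for k in range(1, m + 1):
            args = [quotients[m - j] for j in range(k - 1)]  # a_m, ..., a_{m-k+2}
            if cross_term(conv, m, k) != continuant(args):
                return False
    return True
```

In particular the case $k = 2$ gives $|p_m q_{m-2} - p_{m-2} q_m| = a_m$, and since $p_m = a_m p_{m-1} + p_{m-2}$ with $(p_{m-1}, p_{m-2}) = 1$, the greatest common divisor of $p_m$ and $p_{m-2}$ divides $a_m$; this is the divisibility used in the counting argument for (16).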
The idea of the argument is that, were all the $p_{m-j}$'s relatively prime, then each $(p_{m-j}, n^{k-1})$ would contribute a distinct factor of $n^{k-1}$ and hence the product would be $\le n^{k-1}$. The $p_{m-j}$'s are of course not relatively prime but (15) gives an upper bound on the greatest common divisors of each pair. Thus, let us start with $p_m$ and consider the degree to which factors in $(p_m, n^{k-1})$ are replicated by the lower $(p_{m-j}, n^{k-1})$'s. Since $(p_m, p_{m-1}) = 1$ there is no duplication of factors due to the nearest neighbor. Since $(p_m, p_{m-2}) \mid a_m$ the greatest common factor of $(p_{m-2}, n^{k-1})$, $(p_m, n^{k-1})$ is at most $(a_m, n^{k-1})$ and hence at most $a_m$. Similarly the greatest common factor of $(p_{m-3}, n^{k-1})$, $(p_m, n^{k-1})$ is at most $E_2(a_m, a_{m-1})$. In all, the product $(p_{m-M}, n^{k-1})(p_{m-M+1}, n^{k-1})\cdots(p_m, n^{k-1})$ replicates factors of $(p_m, n^{k-1})$ by at most $E_1(a_m)\cdots E_M(a_m, a_{m-1}, \dots, a_{m-M+1})$. Next, move on to $(p_{m-1}, n^{k-1})$. These factors of $n^{k-1}$ can get duplicated in $(p_{m-3}, n^{k-1})$ and so on down to $(p_{m-k}, n^{k-1})$. One gets a similar estimate as in the first case but with the indices lowered by one. Proceeding down to $(p_{m-M}, n^{k-1})$ proves the claim.
To complete the proof of the proposition, we use another fact from the metric theory of continued fractions [Kh, Theorem 30]: For almost any $\alpha \in \mathbb{R}$, there exists $C(\alpha) > 0$ such that $a_m(\alpha) \le C(\alpha)m^{1+\epsilon}$. By Proposition 2.2.2, the relevant values of $m$ are of order $\log(n)$. Therefore, for the $p_m, p_{m-1}, \dots, p_{m-M}$ under consideration we have $a_{m-j} \ll \log n$. Since $E_\ell$ is a polynomial in the $a_{m-j}$'s of degree $\ell$, we have $E_\ell(a_{m-j}, a_{m-j-1}, \dots, a_{m-j-\ell+1}) \ll (\log n)^{\ell}$. Therefore
\[ \prod_{j=0}^{M}\prod_{\ell=1}^{M-j} E_\ell(a_{m-j}, a_{m-j-1}, \dots, a_{m-j-\ell+1}) \ll (\log n)^{M^3}. \tag{17} \]
It follows that
\[ \prod_{j=0}^{M} (p_{m-j}, n^{k-1}) \le C(\alpha)\, n^{k-1} (\log n)^{M^3}. \tag{18} \]
Hence at least one factor must be $\le C(\alpha)^{1/M}\, n^{\frac{k-1}{M}} (\log n)^{M^2}$. The proposition follows from the fact that $M$ can be arbitrarily large.
We now complete the proof of the lemma and of our main result. We have proved the existence of $(p_m, q_m)$ with all the necessary properties and such that $q_m \in [n^{r-\epsilon}, n^r]$, $(p_m, n^{k-1}) \ll n^{\epsilon}$. It follows that
\[ \frac{n(p_m, n^{k-1})}{q_m} + \frac{q_m}{n(p_m, n^{k-1})} \ll n^{1+\epsilon-r} + n^{r-1}. \tag{19} \]
The terms balance when $r = 1$ and give the power $n^{\epsilon}$. It follows from (12) that $\rho^n_2(f) \ll n^{1-\frac{2}{K}+\epsilon}$.

References
[D] Davenport, H.: Analytic methods for Diophantine equations and Diophantine inequalities. Ann Arbor: Ann Arbor Publishers, 1962
[Kh] Khinchin, A.Ya.: Continued Fractions. Chicago: Univ. Chicago Press, 1964
[K] Korobov, N.M.: Exponential sums and their applications. Math. and its Appl. 80, Amsterdam: Kluwer Academic Publishers, 1992
[L] Lang, S.: Introduction to Diophantine Approximations. New Expanded ed., Berlin–Heidelberg–New York: Springer-Verlag, 1995
[R.S] Rudnick, Z. and Sarnak, P.: The pair correlation function of fractional parts of polynomials. Commun. Math. Phys. (to appear)
[V] Vaughan, R.C.: The Hardy–Littlewood Method. Second Edition, Cambridge: Cambridge Univ. Press, 1997
[Z] Zelditch, S.: Level spacings for integrable quantum maps in genus zero. Commun. Math. Phys. 196, 289–318 (1998)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 196, 331 – 361 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Orbifold Subfactors from Hecke Algebras II – Quantum Doubles and Braiding David E. Evans1 , Yasuyuki Kawahigashi2 1 School of Mathematics, University of Wales, Cardiff, PO Box 926, Senghennydd Road, Cardiff CF2 4YH, Wales, UK 2 Department of Mathematical Sciences, University of Tokyo, Komaba, Tokyo, 153-8914, Japan. E-mail: [email protected]
Received: 21 January 1997 / Accepted: 31 January 1998
Abstract: A. Ocneanu has observed that a mysterious orbifold phenomenon occurs in the system of the M∞ -M∞ bimodules of the asymptotic inclusion, a subfactor analogue of the quantum double, of the Jones subfactor of type A2n+1 . We show that this is a general phenomenon and identify some of his orbifolds with the ones in our sense as subfactors given as simultaneous fixed point algebras by working on the Hecke algebra subfactors of type A of Wenzl. That is, we work on their asymptotic inclusions and show that the M∞ -M∞ bimodules are described by certain orbifolds (with ghosts) for SU (3)3k . We actually compute several examples of the (dual) principal graphs of the asymptotic inclusions. As a corollary of the identification of Ocneanu’s orbifolds with ours, we show that a non-degenerate braiding exists on the even vertices of D2n , n > 2.
1. Introduction
In the theory of subfactors initiated by V. F. R. Jones in [17], Ocneanu's paragroup theory [30] is fundamental in descriptions of the combinatorial structures arising from subfactors. Ocneanu's construction of the asymptotic inclusions, introduced in [30], has recently caught much attention as a subfactor analogue of the quantum double construction of Drinfel'd in [4]. (See [7, 10, 16, 21, 22, 26, 28, 41] on asymptotic inclusions.) As noted by Ocneanu, if we start with a subfactor $N \subset M = N \rtimes G$, where $N$ is a hyperfinite II$_1$ factor with a free action of a finite group $G$ on $N$, then the resulting asymptotic inclusion $M \vee (M' \cap M_\infty) \subset M_\infty$ gives the tensor category of the quantum double of $G$ as that of the $M_\infty$-$M_\infty$ bimodules. (See [11, Sect. 12.8] and [24] for example.) Since a paragroup, arising from a subfactor, is a certain "quantization" of an ordinary group, Ocneanu's construction of the asymptotic inclusion can be regarded as a subfactor analogue of the quantum double construction. (See [11, Sects. 12.6, 12.7, 13.5] for its relation to topological quantum field theory and rational conformal field theory.) The asymptotic inclusion can be also regarded as an analogue of the quantum double construction, since the tensor category of the $M_\infty$-$M_\infty$ bimodules gives a natural braiding as an analogue of the $R$-matrix, as in [36], [11, Sect. 12.7]. The dual principal graphs of the asymptotic inclusions are hard to compute, in general, while their principal graphs are easy to compute, as in [10], [11, Sect. 12.6], [31, 32, 33], as long as we know the fusion rule of the $M$-$M$ bimodules of the original subfactor $N \subset M$. From the above viewpoint of the quantum double, it is the dual principal graph, or the system of its even vertices, strictly speaking, that is more important of the two graphs. (In some sense, the principal graph represents just a double "without quantum". See [11, Sect. 12.6].) So it would be interesting to have concrete descriptions of the dual principal graphs (or their even vertices) of the asymptotic inclusions of concrete examples of subfactors, other than the ones of the form $N \subset M = N \rtimes G$ arising from genuine groups. Other "easy" examples of subfactors of finite depth arising from actions of finite groups include subgroup-group subfactors $N = R \rtimes H \subset M = R \rtimes G$ and Wassermann type subfactors $(\mathbb{C} \otimes R)^K \subset (M_n(\mathbb{C}) \otimes R)^K$, where $R$ is the hyperfinite II$_1$ factor, $H \subset G$ are finite groups acting freely on $R$, and $K$ acts on $R = \bigotimes_k M_n(\mathbb{C})$ as a product type action. These, however, do not give anything new in the tensor categories of their $M_\infty$-$M_\infty$ bimodules, because the $M$-$M$ bimodules of these subfactors are given by the tensor category of $\hat G$ and then they give the same $M_\infty$-$M_\infty$ bimodules as the subfactor $N = R \rtimes H \subset M = R \rtimes G$, as seen from [11, Sect. 12.6]. In this sense, "classical" subfactors do not give interesting asymptotic inclusions. The easiest subfactors among "quantum" subfactors are the Jones subfactors $N \subset M$ of type $A_n$, as introduced in [17].
They are described as $N = \langle e_2, e_3, e_4, \dots\rangle$, $M = \langle e_1, e_2, e_3, e_4, \dots\rangle$, where $\{e_j\}_{j\ge 1}$ is a sequence of projections satisfying the following relations:
\[ e_j e_k = e_k e_j, \quad |j-k| \ne 1, \qquad e_j e_{j\pm 1} e_j = \Big(4\cos^2\frac{\pi}{n+1}\Big)^{-1} e_j . \]
Then it is easy to see that the asymptotic inclusion $M \vee (M' \cap M_\infty) \subset M_\infty$ is given as $M \vee (M' \cap M_\infty) = \langle \dots, e_{-2}, e_{-1}, e_1, e_2, \dots\rangle$, $M_\infty = \langle \dots, e_{-2}, e_{-1}, e_0, e_1, e_2, \dots\rangle$, where $\{e_j\}_{j\in\mathbb{Z}}$ is a (double-sided) sequence of projections satisfying the same relations as above. The Jones indices of these subfactors were first computed by M. Choda in [2]. It is quite non-trivial to describe the dual principal graphs of these asymptotic inclusions, while the general theory mentioned above gives the principal graphs easily from the fusion rules. Ocneanu has announced a description of the $M_\infty$-$M_\infty$ bimodules in [36] and at several other conferences. If we apply the construction of the asymptotic inclusion to a finite dimensional Hopf $C^*$-algebra already having a non-degenerate braiding, then the resulting tensor category of the $M_\infty$-$M_\infty$ bimodules is just a "double" of the original category and nothing interesting happens in this procedure. (This fact was first noticed by Ocneanu in [36] and we will see this in more detail in Sect. 2.) One might suspect we have something similar and not interesting for these asymptotic inclusions of the Jones subfactors, because the Jones subfactors correspond to the quantum groups $U_q(sl_2)$ in some sense. This, however, is not true. We have a more subtle and interesting situation due to a certain degeneracy condition of the braiding in the sense of Ocneanu [36].
The non-degeneracy condition of this kind was first introduced by Reshetikhin– Turaev [39] in their construction of topological invariants of 3-manifolds realizing the physical prediction of Witten [49]. From the topological viewpoint, this condition is quite natural and it is this non-degeneracy that leads the Turaev–Viro invariant [43] of a 3-manifold being the square of the absolute value of the Reshetikhin–Turaev invariant as in [42]. (See also [36] for an operator algebraic account of this theorem of Turaev.) Ocneanu has observed that a certain orbifold construction, similar to the orbifold construction in our sense in [8] as simultaneous crossed products, is invoked in the process of the asymptotic inclusions of the Jones subfactors if the above non-degeneracy condition fails. In this paper, we will generalize his construction to the case of the Hecke algebra subfactors of Wenzl [48] and show that this is a general phenomenon in the following sense. The asymptotic inclusion produces a non-degenerate system of bimodules in the sense of Ocneanu [36]. From the viewpoint of [36], we can say that our orbifold construction [8] removes the degeneracy. So if we apply the construction of the asymptotic inclusion to a subfactor having a degenerate system of bimodules, the orbifold construction is invoked automatically in order to remove the degeneracy in the procedure of making the “double”. In this way, we get another series of orbifold subfactors from Hecke algebras of type A as a continuation of our work in [8]. The asymptotic inclusions of the Hecke algebra subfactors are described naturally as follows. The original subfactor of Wenzl is described as N = hg2 , g3 , g4 , . . .i, M = hg1 , g2 , g3 , g4 , . . .i, where {gj }j≥1 is a sequence of the Hecke generators satisfying the relations of the Hecke algebras of type A as in [48]. 
The series of the commuting squares giving this subfactor is not canonical in the sense of Popa, because this series has a period larger than two. Still, one can identify the asymptotic inclusion of this subfactor as follows, which is similar to the above description of the asymptotic inclusions of the Jones subfactors : M ∨ (M 0 ∩ M∞ ) = h. . . , g−2 , g−1 , g1 , g2 , . . .i, M∞ = h. . . , g−2 , g−1 , g0 , g1 , g2 , . . .i, where {gj }j∈Z is a (double-sided) sequence of the Hecke generators satisfying the same relations. This subfactor was first constructed with a double sided sequence of the Hecke generators by Erlijman [6]. Later it was identified with the asymptotic inclusion of the Hecke algebra subfactor of Wenzl by Goto [15] and Erlijman [7] independently. (Goto’s proof works in a quite general setting, while Erlijman directly works on the Hecke algebras.) So the asymptotic inclusions of the Hecke algebra subfactors have natural constructions in terms of generators and commuting squares parallel to the case of the Jones subfactors of type An . In Sect. 2, we explain Ocneanu’s basic properties of braiding on a system of bimodules in his sense [36] and its relation to his tube algebras. We continue the study of the tube algebras for the Hecke algebra subfactors of Wenzl [48] in Sect. 3. This gives the basic properties of the tube algebra and enables us to use Ocneanu’s general machinery on asymptotic inclusions and tube algebras in [35] and [10]. The dual principal graphs of the asymptotic inclusions of the Jones subfactors of type An are described in Sect. 4. This covers the case announced by Ocneanu in [36]. These for the Hecke algebra subfactors with indices converging to 9 are dealt with in Sect. 5. This Section describes our main results. In Sect. 6, we study a relation between the orbifold phenomena Ocneanu has observed and the orbifold construction in our sense [8] for the SU (2)2k case. In the last Sect. 
7, we study the orbifold construction with braiding in our setting and get a non-degenerate braiding on the even vertices of D2n , n > 2.
2. Braiding and a Tube Algebra – Non-Degenerate Case
We start with a finite braided system of bimodules $\mathcal{M} = \{x_i\}_{i\in I}$ in the sense of Ocneanu [36]. (The original references for Ocneanu's theory used in this paper are [30–36]. See also [9, 10] and [11, Chap. 12].) An important example of such a system is obtained from the WZW-models $SU(n)_k$ with Ocneanu's surface bimodule construction as in [33, 34, 35]. (See also [10] or [11, Chap. 12].) We may have such a system from a subfactor $N \subset M$ with finite index and finite depth. Note that even when we have an abstract system of bimodules, we can realize the system as a system of bimodules arising from a single hyperfinite (possibly reducible) subfactor $N \subset M$ of type II$_1$ with finite index and finite depth. This is possible by a minor variation of the construction in [1]. That is, instead of choosing a primary field $\Phi$ on p. 281 of [1], we choose $\bigoplus_{i\in I} x_i$ as the generator to construct a paragroup. In this way, we get a (possibly reducible) subfactor $N \subset M$ for which the system of the $M$-$M$ bimodules arising from the subfactor is given by $\mathcal{M} = \{x_i\}_{i\in I}$. (We have learned this construction from S. Yamagami. See also [9, Sect. 4] or [11, Sect. 12.5].) So we may and do assume that our system $\mathcal{M}$ arises from a hyperfinite subfactor $N \subset M$ of type II$_1$ with finite index and finite depth. Define the global index $[\mathcal{M}]$ of the system by $[\mathcal{M}] = \sum_{x\in\mathcal{M}} [x]$, where $[x]$ denotes the Jones index of the bimodule $x$. This is also the global index of the subfactor $N \subset M$ in the sense of Ocneanu [30]. We would like to study the system of the $M_\infty$-$M_\infty$ bimodules corresponding to the asymptotic inclusion $M \vee (M' \cap M_\infty) \subset M_\infty$ arising from this subfactor $N \subset M$ in the sense of Ocneanu [30]. The construction of this system of the $M_\infty$-$M_\infty$ bimodules can be regarded as a subfactor analogue of the quantum double construction of Drinfel'd in [4]. (This analogy has been noted by Ocneanu. See [11, Chap. 12] for the basic theory of asymptotic inclusions and this analogy.)
In order to study this system, we work on Ocneanu's tube algebra Tube $\mathcal{M}$ and study the center of the tube algebra in the sense of [33, 36]. (See also [11, Chap. 12] for tube algebras.) Recall that an element $x$ in the braided system $\mathcal{M}$ is called degenerate in the sense of Ocneanu [36] if it satisfies the identity in Fig. 1, where the dashed circle denotes the summation over all the labels $x \in \mathcal{M}$ with coefficient $[x]^{1/2}/[\mathcal{M}]$ as in Fig. 2. Such a dashed ring is called a killing ring in Ocneanu's terminology. (This has been already used in the topology literature e.g. [44].)
Fig. 1. A degenerate element $x$
Fig. 2. A killing ring
Fig. 3. The Ocneanu projection $p_{a,b}$
In this section, we suppose that the braiding on $\mathcal{M}$ is non-degenerate in the sense that 0 is the only degenerate element. (We remark that 0 is always degenerate by definition.) We note that we can use a graphical expression as in [19, 50] for elements in the tube algebra Tube $\mathcal{M}$ because we have a braided system of bimodules. (We need to orient edges, since bimodules are not now self-contragredient in general. For simplicity, as in [51], we also drop labels for intertwiners on triple points arising from multiplicities in the fusion rules.) In the tube algebra, we define the Ocneanu projection $p_{a,b} \in$ Tube $\mathcal{M}$ for $a, b \in \mathcal{M}$ as in Fig. 3. In this picture, the left half is a coefficient represented diagrammatically as in [19], where the horizontal bar represents a fraction, and the right half is an element in the tube algebra Tube $\mathcal{M}$, where the top and the bottom of the dashed line are identified in the tube picture. The dashed line again denotes the killing ring. (If we use the convention of writing an element in a tube algebra as in [10], the right half of the picture is interpreted as in Fig. 4. Here the two triangles denote intertwiners and the two trapezoids denote intertwiners arising from braiding. The diagram represents the composition of these four intertwiners, and gives an element in the tube algebra as in the definition of the multiplication in the tube algebra. The dashed line does not need an orientation because we take a summation over all the labels.)
Fig. 4. An element in the tube algebra
Fig. 5. Before a handle slide
We recall from [19, Sect. 12.3] that we can perform the graphical operation called a handle slide against a killing ring without changing the number or operator represented by the figure. We give an example of a handle slide in Figs. 5, 6. In this situation here, we assume that the link components on the right hand side are killing rings. (We remark that we have to regard a diagram of a link as a framed link diagram now.) Note that this handle slide is valid regardless of the non-degeneracy condition. (See [19, Sect. 12.3].)
Fig. 6. After a handle slide
The following theorem is due to Ocneanu. He presented this theorem and a proof in his talk in the Taniguchi Symposium in Japan in July, 1993. The proof here, except for the last paragraph, is his proof, which we include for the sake of completeness.
Theorem 2.1. The above elements $p_{a,b}$ give a system of mutually orthogonal minimal central projections in the tube algebra with $\sum_{a,b\in\mathcal{M}} p_{a,b} = 1$.
Before the proof, note that this system of the minimal central projections describes the system of the $M_\infty$-$M_\infty$ bimodules and that these bimodules give the even vertices of the dual principal graph of the asymptotic inclusion, by Ocneanu's theorem in [35] (see [10, Theorem 4.3] or [11, Theorem 12.28]).
Proof. First we prove that the $p_{a,b}$'s give a system of mutually orthogonal projections. It is clear that each $p_{a,b}$ is self-adjoint, so we will prove that $p_{a,b}\, p_{a',b'} = \delta_{a,a'}\delta_{b,b'}\, p_{a,b}$ first. This proof is given as in Fig. 7, where we compute $p_{a,b}\, p_{a',b'}$ graphically. In the first equality, we have used a handle slide. (The handle slide is against the left dashed line, which is a link because its top and bottom are connected.) In the third equality, the non-degeneracy assumption implies that the label $d$ should be 0.
Fig. 7. Orthogonality of $p_{a,b}$
We next prove that each $p_{a,b}$ is central. It is enough to prove that $p_{a,b}/([a][b])^{1/2}$ commutes with any element in the tube algebra. This proof is given again graphically. First we compute the product of $p_{a,b}/([a][b])^{1/2}$ and a generic element in the tube algebra as in Fig. 8, where the top and the bottom of the generic element labeled with $z$ are identified again. We next compute the product in the reverse order as in Fig. 9. Then the coefficients in Figs. 8 and 9 turn out to be the same, so we have the desired centrality. The proof of $\sum_{a,b\in\mathcal{M}} p_{a,b} = 1$ is given graphically in Fig. 10.
Fig. 8. Centrality of pa,b (1)
We finally have to show that each $p_{a,b}$ is a minimal central projection. By Ocneanu's theorem mentioned above just before the proof, it is enough to show that the corresponding $M_\infty$-$M_\infty$ bimodules are all irreducible. By the proof of Ocneanu's theorem in [35] (see [10, Theorem 4.3] or [11, Theorem 12.28]), we know that the $M_\infty$-$M_\infty$ bimodule corresponding to $p_{a,b}$ decomposes as $\bigoplus_{c\in\mathcal{M}} N_{ab}^{c} B_c$ as an $M \vee (M' \cap M_\infty)$-$M_\infty$ bimodule after restricting the left action to $M \vee (M' \cap M_\infty)$, where $B_c$ denotes the $M \vee (M' \cap M_\infty)$-$M_\infty$ bimodule labeled with $c$. This shows that the Jones index of the $M_\infty$-$M_\infty$ bimodule corresponding to $p_{a,b}$ is $[a][b]$. Thus if all the bimodules corresponding to $p_{a,b}$ are irreducible, we get the global index equal to $\sum_{a,b\in\mathcal{M}} [a][b]$, which is the correct global index. If one or more of the bimodules is reducible, we would get a smaller global index, which is impossible. Thus we conclude that all the bimodules are irreducible.
Fig. 9. Centrality of $p_{a,b}$ (2)
We now recall that the principal graph of the asymptotic inclusion M ∨ (M′ ∩ M∞) ⊂ M∞ is given by the fusion graph of the original system M by Ocneanu's theorem in [35] (see [10, Theorem 4.1] or [11, Theorem 12.25]). That is, the set of the odd vertices of the principal graph is labeled with M, the set of the even vertices is labeled by pairs (a, b) with a, b ∈ M, and the number of edges between the odd vertex labeled with c and the even vertex labeled with (a, b) is given by $N_{ab}^c$, the multiplicity of c in the relative tensor product $a \otimes_M b$. The connected component of this graph containing the even vertex labeled with (∗, ∗) is called the fusion graph of the system M. (See [31], [10, p. 220], [11, Sect. 12.6].) Combining these pieces of information, we get the following proposition.

Proposition 2.2. Let N ⊂ M be a hyperfinite type II$_1$ subfactor with finite index and finite depth. Suppose that the system of the M-M bimodules arising from this subfactor has a non-degenerate braiding. Then the dual principal graph of the asymptotic inclusion M ∨ (M′ ∩ M∞) ⊂ M∞ is the fusion graph of the original system, the same as the principal graph.

Proof. As above, we know that the even vertices of the dual principal graph are labeled with pairs (a, b) for a, b ∈ M and the odd vertices with c ∈ M. It is thus enough to show that the number of edges connecting the vertices labeled with (a, b) and c is indeed $N_{ab}^c$. This follows from the above proof of Theorem 2.1.
Fig. 10. $\sum_{a,b} p_{a,b} = 1$
3. Braiding for $SU(n)_k$ and a tube algebra

We now work on the WZW-model $SU(n)_k$. Let N ⊂ M be the corresponding Wenzl subfactor in [48] constructed as in [1, Sect. 4]. Note that the fusion rule algebra for the WZW-model $SU(n)_k$ has a natural Z/nZ-grading and that the fusion rule subalgebra given by the grade 0 elements corresponds to the fusion rule algebra of the M-M bimodules arising from this subfactor N ⊂ M. (This correspondence also follows from [1, Sect. 4].) We denote the grading of a primary field a in the model $SU(n)_k$ by gr(a) ∈ Z/nZ. Then this system is often degenerate in the sense of the previous section. Our next aim is to study the asymptotic inclusions for these degenerate cases. Some statements in this section hold for a general RCFT in the sense of [29] rather than only for the WZW-models, so we make a general statement in such a case. First we have the following general proposition.

Proposition 3.1. Let M be a braided system of M-M bimodules. A bimodule x ∈ M is degenerate if and only if the bimodule x satisfies the equality in Fig. 11 for all y ∈ M.

Proof. It is trivial that if we have the equality in Fig. 11, then we have the degeneracy condition in Fig. 1.
Fig. 11. The degeneracy condition
Fig. 12. The converse direction
For the converse direction, we use a graphical argument as in Fig. 12, where we have used a handle slide against a killing ring.

We record the following straightforward lemma just to fix the normalization constants for an RCFT.

Lemma 3.2. The number represented by Fig. 13 is $S_{xy}/S_{00}$, where S denotes the S-matrix of the RCFT.
Fig. 13. The Hopf link
Proof. This is standard. See [50] for example.
Let M be a subsystem of an RCFT. (A typical case will be the subsystem of all grade 0 elements in a WZW-model $SU(n)_k$.) We first have the following lemma.

Lemma 3.3. Let x be an element of the subsystem M. If we have $S_{0y} = S_{xy}$ for all y ∈ M, then x is degenerate in M.

Proof. This follows from a graphical argument as in Fig. 14.
Lemma 3.4. Suppose that x is degenerate in M. Then for all y ∈ M, we have

$$\frac{S_{xy}}{S_{00}} = \frac{S_{x0}}{S_{00}}\,\frac{S_{y0}}{S_{00}}. \tag{1}$$
Proof. Suppose that the identity (1) fails for some y ∈ M. Then we have the graphical relation of Fig. 15. This, together with the identity of Fig. 16 given by the handle slide, gives the identity of Fig. 17, which is a contradiction.

Fig. 14. Degeneracy of x

Fig. 15.

Fig. 16.

Fig. 17.

In the rest of this section, we work on the WZW-model $SU(n)_k$ with n | k, because it will turn out that this case is a typical degenerate case related to the orbifold construction. Let M be the subsystem of the WZW-model $SU(n)_k$ consisting of the elements with grading 0. (Note that if n and k are relatively prime, then M is non-degenerate by [23, Sect. 2].) In this case, the subsystem $\{x \in M \mid S_{0x} = S_{00}\}$ of M is isomorphic to Z/nZ. (They are called simple currents. See [11, Sect. 8.8], [12, pp. 327, 365] for example.) We choose and fix an element σ in this subsystem of M so that this subsystem is given as $\{0, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$.

Lemma 3.5. For σ as above and an arbitrary y ∈ M, we have $S_{0y} = S_{\sigma y}$.

Proof. This follows from a standard property of the S-matrix. See [12, (5.5.25)], for example.

This lemma, together with Lemma 3.3, shows that any element in $\{0, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$ is degenerate in M. We next show the converse as follows.

Proposition 3.6. If x ∈ M is degenerate in M, then $x \in \{0, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$.

Proof. For y ∈ M, let $\Gamma_y$ be the matrix for the multiplication by y on the entire fusion rule algebra of the model $SU(n)_k$. That is, each entry $(\Gamma_y)_{ab}$ is given by $N_{ay}^b$ for any primary fields a, b in the model $SU(n)_k$. We also define a vector v by $v = (S_{yx})_y$ for any primary field y in the model $SU(n)_k$. According to the grading of y, we split the vector v into n pieces and write $v = (v_0, v_1, \dots, v_{n-1})$, where $v_j$ denotes the vector component corresponding to y with gr(y) = j. By the Verlinde identity [46], [11, Sect. 8.6], we get

$$\Gamma_z v_j = \frac{S_{zx}}{S_{0x}}\, v_{j+1},$$

where z ∈ M and j ∈ Z/nZ. Lemma 3.4 implies that we have $S_{zx} \neq 0$ for any z ∈ M with grading 1. Then we get

$$\frac{S_{zx}}{S_{0x}}\,\|v_{j+1}\|_2^2 = (\Gamma_z v_j, v_{j+1}) = (v_j, \Gamma_{\bar z} v_{j+1}) = \frac{\overline{S_{\bar z x}}}{S_{0x}}\,\|v_j\|_2^2 = \frac{S_{zx}}{S_{0x}}\,\|v_j\|_2^2,$$
which implies $\|v_j\|_2 = \|v_{j+1}\|_2$. Since this is true for all j ∈ Z/nZ and the matrix S is unitary, we get $\|v_j\|_2 = 1/\sqrt{n}$ for all j ∈ Z/nZ. Let $w_0$ be the vector defined by $(w_0)_y = S_{y0}$ for y ∈ M. Lemma 3.4 implies that $w_0 = v_0$ and thus $S_{0x} = S_{00}$. This means that the Perron–Frobenius weight of the element x is 1 and thus x is in $\{0, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$, which is the conclusion of the proposition.

We now extend the definition of the Ocneanu projection $p_{a,b}$ in Fig. 3. Suppose that a, b are primary fields in the model $SU(n)_k$ with gr(a) + gr(b) = 0 ∈ Z/nZ. Then the graphical formula in Fig. 3 still defines an element in the tube algebra Tube M for the subsystem M of the elements with grading 0, because we have gr(c) = 0 for any c appearing in Fig. 3. Note that $p_{a,b}$ may not be a projection any more. We call this element $p_{a,b}$ an Ocneanu element.

Lemma 3.7. For primary fields a, b as above, the element $p_{a,b}$ is central in the tube algebra Tube M.

Proof. The same argument as in Fig. 8 works.
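Returning to Lemma 3.5 and Proposition 3.6, a small numerical sanity check is possible in the simplest case n = 2. The sketch below assumes the standard closed form of the S-matrix of $SU(2)_k$, $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ with labels 0, ..., k; this formula, the function names and the numerical tolerance are assumptions of the sketch and are not taken from the argument above. It lists the grade 0 (even) labels x for which identity (1) holds for all grade 0 labels y. By Lemma 3.4 every degenerate element passes this test, so the output bounds the set of degenerate elements from above.

```python
import math


def s_matrix(k):
    """S-matrix of SU(2)_k with primary fields labeled 0..k (standard closed form)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)]
            for a in range(k + 1)]


def satisfies_identity_1(k, tol=1e-9):
    """Even labels x with S_xy * S_00 == S_x0 * S_y0 for every even label y.

    This is identity (1) multiplied through by S_00^2.
    """
    S = s_matrix(k)
    even = range(0, k + 1, 2)
    return [x for x in even
            if all(abs(S[x][y] * S[0][0] - S[x][0] * S[y][0]) < tol
                   for y in even)]


if __name__ == "__main__":
    for k in (3, 4, 5, 6, 8):
        print(k, satisfies_identity_1(k))
```

For even k the list consists of 0 and the simple current σ = k, and for odd k only of 0, which is in line with Proposition 3.6 and with the non-degeneracy of M when n and k are relatively prime.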
Lemma 3.8. For primary fields a, b as above, we have $p_{a,b} = p_{\sigma a,\sigma^{n-1} b}$ in the tube algebra Tube M.

Proof. First consider $p_{\sigma,\sigma^{n-1}}$. If c ≠ 0 in Fig. 3, then the term corresponding to c is 0. So we have a single term for this Ocneanu element. The degeneracy of σ, proved in Lemmas 3.5 and 3.3, easily implies $p_{\sigma,\sigma^{n-1}} = p_{0,0}$. Next note that we have $S(x_1 * x_2)S(x_3) = S(x_1)S(x_2 * x_3)$ for $x_1, x_2, x_3 \in H_{S^1\times S^1}$ as in [10, Theorem 5.1], [11, Theorem 12.29], where S means the action of the S-matrix in PSL(2, Z). (This is a direct analogue of the Verlinde formula [46]. Actually, the formula in Theorem 5.1 in [10] is slightly incorrect because normalizing coefficients are missing there. Theorem 12.29 in [11] is correct.) By Lemma 3.7, we can apply this identity to the Ocneanu elements. Then we have $p_{a,b} = p_{a,b} * p_{0,0} = p_{a,b} * p_{\sigma a,\sigma^{n-1} b} = p_{\sigma a,\sigma^{n-1} b}$.
Lemma 3.9. Let a, b, a′, b′ be primary fields in the model $SU(n)_k$ with gr(a) + gr(b) = gr(a′) + gr(b′) = 0 ∈ Z/nZ. We suppose that $(a', b') \neq (\sigma^j a, \sigma^{n-j} b)$ for all j ∈ Z/nZ. Then we have $p_{a,b}\,p_{a',b'} = 0$.

Proof. We compute $p_{a,b}\,p_{a',b'}$ as in Fig. 7. The computation is the same up to the third line of Fig. 7. Then in the third line, the picture represents the value 0 for any choice of d. Thus we have $p_{a,b}\,p_{a',b'} = 0$.

Note that we have a unique primary field f with σf = f, because n | k.

Lemma 3.10. If primary fields a, b as above satisfy (a, b) ≠ (f, f), then the element $p_{a,b}$ is a projection in the tube algebra Tube M. Let $P = \{p_{a,b} \mid \mathrm{gr}(a) + \mathrm{gr}(b) = 0 \in \mathbb{Z}/n\mathbb{Z},\ (a, b) \neq (f, f)\}$. Then we have $\sum_{p\in P} p + p_{f,f}/n = 1$, which implies that $p_{f,f}/n$ is a central projection.
Fig. 18. $\sum_{p\in P_0} p = n$
Proof. Suppose (a, b) ≠ (f, f). We compute $p_{a,b}^2$ as in Fig. 7. Then in the third line, we have only the terms with d in $\{0, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$. Since (a, b) ≠ (f, f), none of these d, except for d = 0, gives a non-zero value. For d = 0, we have the original $p_{a,b}$. This shows that $p_{a,b}$ is a projection.

Set $P_0 = \{p_{a,b} \mid \mathrm{gr}(a) + \mathrm{gr}(b) = 0 \in \mathbb{Z}/n\mathbb{Z}\}$. We compute $\sum_{p\in P_0} p$ as in Fig. 18. The second equality follows since the entire system of the primary fields in the model $SU(n)_k$ is non-degenerate, which follows from unitarity of the S-matrix. (The coefficient n comes from the ratio of the global indices of M and the entire system.) This implies $\sum_{p\in P} p + p_{f,f}/n = 1$. The last assertion on $p_{f,f}/n$ now follows from Lemmas 3.9, 3.7, 3.10.

Lemma 3.11. If primary fields a, b as above satisfy (a, b) ≠ (f, f), then the projection $p_{a,b}$ in the tube algebra Tube M is minimal.
Proof. We define $p_{a,b}^{(c,d)}$ as in Fig. 19.

Fig. 19. Element $p_{a,b}^{(c,d)}$

We compute $p_{a,b}^{(c,d)}\,p_{a,b}^{(d,e)}$ graphically as in Fig. 20, where we have used (a, b) ≠ (f, f) in the second line.

Fig. 20. Product $p_{a,b}^{(c,d)}\,p_{a,b}^{(d,e)}$
Let M(a, b) be the set of the primary fields c satisfying $p_{a,b}^{(c,c)} \neq 0$. If c, d ∈ M(a, b), then the computation in Fig. 20 shows that $p_{a,b}^{(c,d)} \neq 0$. If c, d, e ∈ M(a, b), then the computation in Fig. 20 also shows that $p_{a,b}^{(c,d)}\,p_{a,b}^{(d,e)} \neq 0$. We thus have a system of matrix units $\{\lambda_{c,d}\, p_{a,b}^{(c,d)}\}_{c,d}$ for $p_{a,b}(\mathrm{Tube}\, M)p_{a,b}$, where the $\lambda_{c,d}$ are some positive numbers. This shows that the algebra $p_{a,b}(\mathrm{Tube}\, M)p_{a,b}$ is a full matrix algebra and thus the central projection $p_{a,b}$ is minimal in the center of the tube algebra Tube M.

Again by Ocneanu's theorem in [35] (see [10, Theorem 4.3] or [11, Theorem 12.28]), we get irreducible M∞-M∞ bimodules corresponding to $p_{a,b}$ with (a, b) ≠ (f, f). Note that we have corresponding bimodules even when gr(a), gr(b) ≠ 0. The principal graphs of the asymptotic inclusions are determined only by the primary fields with grading 0, but the dual principal graphs have vertices related to the primary fields with other gradings. The primary fields with non-zero grading are the ghosts of the system M in the sense of Ocneanu [36].

We work on the irreducible decompositions of these M∞-M∞ bimodules after restricting the left action to M ∨ (M′ ∩ M∞) as follows. This gives partial information about the dual principal graph of the asymptotic inclusion.

Lemma 3.12. Let a, b be primary fields as above satisfying (a, b) ≠ (f, f) and $X_{a,b}$ the irreducible M∞-M∞ bimodule corresponding to the minimal central projection $p_{a,b}$ in Tube M. If we restrict the left action to M ∨ (M′ ∩ M∞), we get a decomposition $X_{a,b} = \bigoplus_{c\in M} N_{ab}^c X_c$ as an M ∨ (M′ ∩ M∞)-M∞ bimodule, where $X_c$ is the M ∨ (M′ ∩ M∞)-M∞ bimodule corresponding to c ∈ M.

Proof. An argument similar to the one in the proof of Theorem 2.1 works.
We next work on the case (a, b) = (f, f). We still have an M∞-M∞ bimodule in such a case, though this bimodule might not be irreducible, and we get the following lemma in the same way.

Lemma 3.13. Let $X_{f,f}$ be the M∞-M∞ bimodule corresponding to the central projection $p_{f,f}/n$ in Tube M. If we restrict the left action to M ∨ (M′ ∩ M∞), we get a decomposition $X_{f,f} = \bigoplus_{c\in M} N_{ff}^c X_c$ as an M ∨ (M′ ∩ M∞)-M∞ bimodule, where $X_c$ is the M ∨ (M′ ∩ M∞)-M∞ bimodule corresponding to c ∈ M.

Proof. An argument similar to the one in the proof of Theorem 2.1 again works.
We would like to get a full description of the dual principal graph, but the bimodule $X_{f,f}$ plays a quite subtle role. So we first make the following assumption and later prove that this assumption holds in some cases.

Assumption 3.14. The M∞-M∞ bimodule $X_{f,f}$ decomposes into n irreducible bimodules and each has the same dimension.

In this assumption, by the "dimension" of a bimodule we mean the square root of the Jones index of the corresponding subfactor. A. Ocneanu has observed that this assumption holds for $SU(2)_{2k}$ and we will prove that it also holds for $SU(3)_{3k}$ in a general framework. We conjecture that this assumption holds for any $SU(n)_{nk}$, but combinatorial complexity has prevented us from proving it so far. A simple computation gives the following lemma.
Lemma 3.15. Assumption 3.14 gives the correct global index for the asymptotic inclusion.

Consider the dual principal graph of the asymptotic inclusion. To each M∞-M∞ or M ∨ (M′ ∩ M∞)-M∞ bimodule, we assign its dimension, as usual. This gives a Perron–Frobenius weight. That is, for an M ∨ (M′ ∩ M∞)-M∞ bimodule corresponding to c ∈ M, we get $[c]^{1/2}[M]^{1/2}$ and for an M∞-M∞ bimodule corresponding to $p_{a,b}$ with arbitrary a, b in the model $SU(n)_k$ with (a, b) ≠ (f, f), we get $[a]^{1/2}[b]^{1/2}$. We also note that the Perron–Frobenius eigenvalue for this weight is $[M]^{1/2}$.

Lemma 3.16. These Perron–Frobenius weights on the M ∨ (M′ ∩ M∞)-M∞ bimodules are compatible with Assumption 3.14.

Proof. For c ∈ M, we denote by $X_c$ the corresponding M ∨ (M′ ∩ M∞)-M∞ bimodule. We easily get $[X_c] = [c][M]$. We can also form a fusion graph using all the primary fields in the model $SU(n)_k$. From the Perron–Frobenius property of this graph, we get

$$n[M][c]^{1/2} = \sum_{a,b} N_{ab}^c\,[a]^{1/2}[b]^{1/2},$$

where a, b are arbitrary primary fields in the model $SU(n)_k$. Let L be a set of representatives of the equivalence classes on all the pairs of arbitrary primary fields in the model $SU(n)_k$ excluding (f, f) for the equivalence relation $(a, b) \sim (\sigma^j a, \sigma^{n-j} b)$ with j ∈ Z/nZ. Then the right-hand side of the equality is equal to

$$n\left(\sum_{(a,b)\in L} N_{ab}^c\,[a]^{1/2}[b]^{1/2} + N_{ff}^c\,\frac{[f]}{n}\right),$$

by Assumption 3.14. This gives

$$[M]^{1/2}\,[c]^{1/2}[M]^{1/2} = \sum_{(a,b)\in L} N_{ab}^c\,[a]^{1/2}[b]^{1/2} + N_{ff}^c\,\frac{[f]}{n},$$

which is the conclusion, because we have Lemmas 3.12, 3.13.
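The first identity in the proof of Lemma 3.16 can be checked numerically in the case n = 2. The following is a minimal sketch for $SU(2)_{2n'}$ (the level is written 2n′ here to keep it apart from n = 2), assuming the usual statistical dimensions $[a]^{1/2} = \sin(\pi(a+1)/(k+2))/\sin(\pi/(k+2))$ and the truncated Clebsch–Gordan fusion rule of $SU(2)_k$; the function names are hypothetical and the closed forms are assumptions of the sketch rather than statements taken from the proof.

```python
import math


def fusion(j, k, l, level):
    """Fusion coefficient N_{jk}^l of SU(2)_level (truncated Clebsch-Gordan rule)."""
    allowed = (abs(j - k) <= l <= j + k
               and (j + k + l) % 2 == 0
               and j + k + l <= 2 * level)
    return 1 if allowed else 0


def dim(a, level):
    """Dimension [a]^{1/2} of the primary field a (square root of the Jones index)."""
    return (math.sin(math.pi * (a + 1) / (level + 2))
            / math.sin(math.pi / (level + 2)))


def pf_check(half_level, c):
    """Return both sides of 2 [M] [c]^{1/2} = sum_{a,b} N_ab^c [a]^{1/2} [b]^{1/2}."""
    level = 2 * half_level
    labels = range(level + 1)
    # [M] is the global index of the grade 0 (even label) subsystem.
    global_index_m = sum(dim(a, level) ** 2 for a in labels if a % 2 == 0)
    lhs = 2 * global_index_m * dim(c, level)
    rhs = sum(fusion(a, b, c, level) * dim(a, level) * dim(b, level)
              for a in labels for b in labels)
    return lhs, rhs


if __name__ == "__main__":
    for half_level in range(1, 5):
        for c in range(0, 2 * half_level + 1, 2):
            print(half_level, c, pf_check(half_level, c))
```

The sum on the right runs over all primary fields, including those of odd label, which is why the factor n = 2 appears on the left.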
4. Dual Principal Graphs of the Asymptotic Inclusions – $SU(2)_k$ Case

With the preliminaries of the previous section, we compute the dual principal graphs of the asymptotic inclusions of the $SU(2)_{2n}$ subfactors, that is, the Jones subfactors of type $A_{2n+1}$ constructed in [17], with n > 1. These results were first claimed by Ocneanu. We present a complete proof here, because we will generalize the results in the next section. First label the primary fields in $SU(2)_{2n}$ with 0, 1, . . . , 2n as usual. Recall that the fusion rule is given as

$$N_{jk}^l = \begin{cases} 1, & \text{if } |j-k| \le l \le j+k,\ j+k+l \in 2\mathbb{Z},\ j+k+l \le 4n,\\ 0, & \text{otherwise.} \end{cases}$$

Note that all the bimodules in M are labeled with even integers and they are all self-contragredient. The M ∨ (M′ ∩ M∞)-M∞ bimodules arising from the asymptotic inclusion are labeled with 0, 2, . . . , 2n and the M ∨ (M′ ∩ M∞)-M ∨ (M′ ∩ M∞)
bimodules are labeled with pairs of even integers 0, 2, . . . , 2n. This implies that all the M ∨ (M′ ∩ M∞)-M ∨ (M′ ∩ M∞) bimodules arising from the asymptotic inclusion are also self-contragredient.

On the even vertices of the dual principal graph, we do not know how the M∞-M∞ bimodule corresponding to $p_{n,n}/2$ in Tube M decomposes into irreducible ones, but the fusion rule as above shows, by Lemma 3.13, that this bimodule contains exactly one copy of $X_0$ when we restrict the left action to M ∨ (M′ ∩ M∞). Then Lemma 3.16 implies that the M∞-M∞ bimodule corresponding to $p_{n,n}/2$ is not irreducible and that it contains at least one irreducible bimodule whose dimension is half the dimension of this M∞-M∞ bimodule. Then Lemma 3.15 implies that the M∞-M∞ bimodule corresponding to $p_{n,n}/2$ decomposes into exactly two irreducible bimodules with equal Jones indices and thus Assumption 3.14 holds, because we would have a smaller global index otherwise. We label these two bimodules with $(n, n)_+$ and $(n, n)_-$.

We will now determine the dual principal graph of the asymptotic inclusion. By Lemma 3.12, it is enough to determine how the even vertices labeled with $(n, n)_+$ and $(n, n)_-$ are connected to the odd vertices. Since the odd vertex labeled with 0 is connected to one of these two even vertices, we may assume that $(n, n)_+$ is connected to 0.

Lemma 4.1. The M∞-M∞ bimodules labeled with $(n, n)_\pm$ are self-contragredient.

Proof. First note that the other M∞-M∞ bimodules are self-contragredient by Lemma 3.8. We count the number of the paths of length 2 connecting the odd vertex 0 to itself on the principal graph of the asymptotic inclusion, which is the fusion graph of M, via the contragredient map, because the fusion graph is now connected. By the fusion rule described above, we can go from 0 back to 0 on the principal graph through (0, 0), (2, 2), . . . , (2n, 2n). This implies that the number of the paths is n + 1.
We know that the number of paths of length 2 connecting the odd vertex 0 to itself on the dual principal graph of the asymptotic inclusion via the contragredient map is also equal to n + 1 by (bi)unitarity of the connection arising from the asymptotic inclusion [30, p. 130] (or [11, Sect. 10.3]). The M∞-M∞ bimodules labeled with (0, 0) = (2n, 2n), (1, 1) = (2n − 1, 2n − 1), . . . , (n − 1, n − 1) = (n + 1, n + 1) give n paths from 0 back to 0 on the dual principal graph. (Here an equality such as (0, 0) = (2n, 2n) means that the bimodule labeled with $p_{0,0}$ is equal to that with $p_{2n,2n}$ because of Lemma 3.8.) This means that we still have another path from 0 back to 0 on the dual principal graph through the even vertex labeled with $(n, n)_+$. This means that the M∞-M∞ bimodule labeled with $(n, n)_+$, hence that with $(n, n)_-$, is contragredient to itself.

We next count the number of paths connecting the odd vertices 0 and 2 on the principal graph of the asymptotic inclusion. (In this kind of counting in the rest of this paper, by a "path" we mean a path of length 2 on the graph via the contragredient map.) Again by the fusion rule, we can go from 0 to 2 on the principal graph through (2, 2), (4, 4), . . . , (2n − 2, 2n − 2). This implies that the number of the paths is n − 1. Again by unitarity, the number of the paths connecting the odd vertices 0 and 2 on the dual principal graph is also equal to n − 1. The even vertices labeled with (1, 1), (2, 2), . . . , (n − 1, n − 1) are connected both to the odd vertices 0 and 2 by Lemma
3.12. These already give the correct number of paths, so this fact means that the even vertex $(n, n)_+$ is not connected to the odd vertex 2. Then Lemma 3.13 implies that the even vertex $(n, n)_-$ is connected to the odd vertex 2. Similarly, we can count the number of paths from 0 to 4, 6, . . . on the principal/dual principal graphs with the fusion rule. Then unitarity gives the following description of the dual principal graph.

Theorem 4.2. Let N ⊂ M be the subfactor corresponding to $SU(2)_{2n}$. Then the even vertex $(n, n)_+$ of the dual principal graph of the asymptotic inclusion is connected to the odd vertices 0, 4, . . .. The even vertex $(n, n)_-$ of the dual principal graph is connected to the odd vertices 2, 6, . . ..

As a corollary of the above description, we get the following, which was announced by Ocneanu in [36, p. 41]. Note that this corollary gives the number of the even vertices of the dual principal graph of the asymptotic inclusions. These are also the dimensions of the Hilbert spaces $H_{S^1\times S^1}$ in the corresponding topological quantum field theories.

Corollary 4.3. Let N ⊂ M be the subfactor corresponding to $SU(2)_k$, that is, the Jones subfactor of type $A_{k+1}$. Assume k > 2. Then the number of the irreducible M∞-M∞ bimodules arising from the asymptotic inclusion is given as follows.
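$$\begin{cases} \left(\dfrac{k+1}{2}\right)^2, & \text{if } k \text{ is odd},\\[2mm] \dfrac{k^2}{4} + \dfrac{k}{2} + 2, & \text{if } k \text{ is even}. \end{cases}$$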
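The path counts in the proof of Lemma 4.1 and the formula of Corollary 4.3 can be reproduced by direct enumeration. The sketch below is a rough check under stated assumptions, not the method of the proof: it uses the fusion rule quoted at the beginning of this section, takes the simple current of $SU(2)_k$ to act on labels by a ↦ k − a, and counts the even vertices of the dual principal graph for even k as the classes of pairs (a, b) with a + b even under (a, b) ∼ (k − a, k − b), with the class of (k/2, k/2) replaced by two vertices. All function names are hypothetical.

```python
def fusion(j, k, l, level):
    """Fusion coefficient N_{jk}^l of SU(2)_level (truncated Clebsch-Gordan rule)."""
    allowed = (abs(j - k) <= l <= j + k
               and (j + k + l) % 2 == 0
               and j + k + l <= 2 * level)
    return 1 if allowed else 0


def principal_paths(n, c):
    """Length 2 paths from the odd vertex 0 to the odd vertex c on the
    principal graph for SU(2)_{2n}: sum of N_ab^0 * N_ab^c over even a, b."""
    level = 2 * n
    even = range(0, level + 1, 2)
    return sum(fusion(a, b, 0, level) * fusion(a, b, c, level)
               for a in even for b in even)


def even_vertices(k):
    """Number of even vertices of the dual principal graph for SU(2)_k."""
    labels = range(k + 1)
    if k % 2 == 1:
        # Non-degenerate case: pairs of grade 0 (even) labels.
        m = sum(1 for a in labels if a % 2 == 0)
        return m * m
    n = k // 2
    pairs = {(a, b) for a in labels for b in labels if (a + b) % 2 == 0}
    pairs.discard((n, n))
    # Identify (a, b) with (k - a, k - b), as in Lemma 3.8 for n = 2.
    orbits = {frozenset({(a, b), (k - a, k - b)}) for (a, b) in pairs}
    # The pair (n, n) contributes the two vertices (n, n)+ and (n, n)-.
    return len(orbits) + 2


def corollary_4_3(k):
    """Closed formula of Corollary 4.3."""
    if k % 2 == 1:
        return ((k + 1) // 2) ** 2
    return k * k // 4 + k // 2 + 2


if __name__ == "__main__":
    for k in range(3, 11):
        print(k, even_vertices(k), corollary_4_3(k))
```

For k = 4 both counts give 8, which is the number of even vertices in the example for $SU(2)_4$ discussed next.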
We list some examples of the dual principal graphs. The first one is for $SU(2)_4$, which is the Jones subfactor of type $A_5$. It is well known that this subfactor of index 3 is of the form $R \rtimes S_2 \subset R \rtimes S_3$, where $S_2$ and $S_3$ are the symmetric groups of degree 2 and 3 respectively and these groups act freely on the hyperfinite II$_1$ factor R. (See [30].) Thus the paragroup of the asymptotic inclusion is given by that of the subfactor $R^{S_3\times S_3} \subset R^{S_3}$, where $S_3$ is diagonally embedded into $S_3 \times S_3$ and the group $S_3$ acts freely on R, by Ocneanu's theorem. (See [21, Lemma 2.15], [22, Appendix], [11, Sect. 12.8].) So the (dual) principal graphs of the asymptotic inclusion can be described with Ocneanu's theorem on subfactors of the form $R^G \subset R^H$, where G is a finite group acting freely on a II$_1$ factor R and H is a subgroup of G. (See [25] for this type of computation.) Of course, this method gives the same result as in Fig. 21.

Fig. 21. Dual principal graphs for $SU(2)_4$, $A_5$
Fig. 22. Dual principal graphs for $SU(2)_6$, $A_7$
A more complicated example of the dual principal graph of the asymptotic inclusion is given in Fig. 22. In the graphs in Figs. 21, 22, the vertices labeled with pairs of odd numbers arise, while the original M-M bimodules are labeled with only even numbers. These odd numbers correspond to the ghosts in the terminology of Ocneanu [36].

5. Dual Principal Graphs of the Asymptotic Inclusions – $SU(3)_k$ Case

Now we work on the asymptotic inclusions of the $SU(3)_{3k}$-subfactors and give our main results in this paper. We have to determine how the central projection $p_{f,f}/3$ decomposes into minimal central projections in the tube algebra Tube M. Lemmas 3.13 and 3.16 imply that $p_{f,f}/3$ contains at least one minimal central projection $p_{f,f}^{(0)}$ such that the dimension of the corresponding irreducible M∞-M∞ bimodule is one third of that of the M∞-M∞ bimodule corresponding to $p_{f,f}/3$. This argument also shows that the odd vertex of the dual principal graph labeled with 0 is connected to the even vertex labeled with $p_{f,f}^{(0)}$ with exactly one edge and also that the odd vertex 0 is not connected to the other even vertices arising from the decomposition of $p_{f,f}/3$.

Lemma 5.1. The irreducible M∞-M∞ bimodule corresponding to $p_{f,f}^{(0)}$ is contragredient to itself.

Proof. We again count the number of the appropriate paths as in the proof of Lemma 4.1. We first count the number of the paths connecting the odd vertex 0 to itself on the principal graph of the asymptotic inclusion. Let l be the number of the primary fields in the WZW-model $SU(3)_{3k}$. Then it is easy to see that the number of the primary fields in M is (l + 2)/3. By the fusion rule, the number of paths from 0 back to 0 on the principal graph is (l + 2)/3.

It is also easy to see that the number of paths connecting the odd vertex 0 to itself on the dual principal graph of the asymptotic inclusion without going through the even vertices corresponding to some minimal central projection appearing in the decomposition of $p_{f,f}/3$ is (l − 1)/3.
These mean that we still have one more path from 0 back to 0 on the dual principal graph, which must go through the even vertex corresponding to $p_{f,f}^{(0)}$. That is, the M∞-M∞ bimodule corresponding to $p_{f,f}^{(0)}$ is self-contragredient.

Lemma 5.2. If c ∈ M satisfies $N_{ff}^c = 1$, then the odd vertex of the dual principal graph labeled with c is connected to the even vertex labeled with $p_{f,f}^{(0)}$ with exactly one edge.

Proof. We count the number of appropriate paths again. The number of paths connecting the odd vertex 0 to the odd vertex c on the principal graph of the asymptotic inclusion is $\sum_{a\in M} N_{a\bar a}^c$, because the principal graph is the fusion graph, which is now connected. Let l be the number of the edges connecting the odd vertex of the dual principal graph labeled with c and the even vertex labeled with $p_{f,f}^{(0)}$. Lemma 3.13 implies that l is 0 or 1, because $N_{ff}^c = 1$. We next count the number of paths connecting the odd vertex 0 to the odd vertex c on the dual principal graph of the asymptotic inclusion. This number is equal to $\left(\sum_a N_{a\bar a}^c - N_{ff}^c\right)/3 + l$, where the summation is over all the primary fields a of the WZW-model $SU(3)_{3k}$. Since the two numbers are equal, we get

$$2\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^c = \sum_{\mathrm{gr}(a)=1,2} N_{a\bar a}^c - 1 + 3l = 2\sum_{\mathrm{gr}(a)=1} N_{a\bar a}^c - 1 + 3l,$$

which implies that 3l − 1 is even. That is, we get l = 1.
As in Sect. 3, we know that the subsystem $\{x \in M \mid S_{0x} = S_{00}\}$ of M is given as $\{0, \sigma, \sigma^2\}$. The above lemma gives the following.

Corollary 5.3. Each of the odd vertices of the dual principal graph labeled with 0, σ, σ² is connected to the even vertex labeled with $p_{f,f}^{(0)}$ with exactly one edge.

Proof. This follows from the above lemma, because we have $N_{ab}^c = N_{a\sigma(b)}^{\sigma(c)}$ by [47].
We now need some lemmas for the fusion rule of the WZW-model $SU(3)_{3k}$, which has been obtained by Goodman–Wenzl in [14] as a quantum version of the classical Littlewood–Richardson rule. Each primary field is represented by a Young diagram and we denote a primary field by the corresponding Young diagram. Below, (3) denotes the Young diagram with one row of three boxes, (21) the Young diagram with two boxes in the first row and one box in the second row, and $\square$ the Young diagram with a single box.

Lemma 5.4. We have the following fusion rule in the WZW-model $SU(3)_{3k}$: $N_{ff}^{(3)} = 1$.

Proof. By [13], we can apply the fusion rule described in [14]. By the Young–Pieri rule in [14, Prop. 2.6 (a)] and the classical Littlewood–Richardson rule (see [27, Sect. 1.9], for example), we get the conclusion.

Lemma 5.5. We have the following fusion rule in the WZW-model $SU(3)_{3k}$: $N_{ff}^{(21)} = 2$.

Proof. This follows from Lemma 5.4 and

$$\square^3 = \emptyset + (3) + 2\,(21),$$

because $f \cdot \square^3$ contains 6 copies of f.
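A quick dimension count supports the decomposition used in the proof of Lemma 5.5, read here as the classical decomposition of the third tensor power of the fundamental representation of SU(3) into the trivial representation, the representation with Young diagram (3) and two copies of the representation with Young diagram (21). That reading of the diagrams is an assumption of this sketch, as is the translation of a Young diagram with rows (l, m) into Dynkin labels (l − m, m); the function name is hypothetical.

```python
def su3_dim(p, q):
    """Dimension of the irreducible SU(3) representation with Dynkin labels (p, q)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2


fundamental = su3_dim(1, 0)  # single box
trivial = su3_dim(0, 0)      # empty diagram
sym3 = su3_dim(3, 0)         # Young diagram (3)
mixed = su3_dim(1, 1)        # Young diagram (21)

# Both sides of the decomposition must have the same total dimension.
print(fundamental ** 3, trivial + sym3 + 2 * mixed)
```

The same count explains the number 6 in the proof: with multiplicity 1 for the empty diagram and for (3), six copies of f leave multiplicity 2 for (21).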
Lemma 5.6. We have the following identity in the WZW-model $SU(3)_{3k}$, where $\square$ denotes the Young diagram with a single box:

$$\sum_{\mathrm{gr}(a)=0} N_{a\,\square^3}^{a} = \sum_{\mathrm{gr}(b)=1} N_{b\,\square^3}^{b}.$$

Proof. Since the level is 3k, the numbers of the primary fields of grade 0, 1, 2 are 3k(k + 1)/2 + 1, 3k(k + 1)/2, 3k(k + 1)/2 respectively. Recall that the primary fields are arrayed in a triangular picture for $SU(3)_k$ as in Fig. 3.8. We have three primary fields of grade 0 at the three corners of the triangle. The contribution of these terms on the left-hand side of the identity in this lemma is 3. (See [14], [27, Sect. 1.9], [47] again for the computations of the fusion rule.) We have 3(k − 1) primary fields of grade 0 on the three edges of the triangle with the three corners excluded. Each term gives a contribution of 3 on the left-hand side of the identity, so we get 9(k − 1) as the total contribution. We next have $3k^2/2 - 3k/2 + 1$ primary fields of grade 0 inside the triangle. Each term gives a contribution of 6 on the left-hand side of the identity, so we get $9k^2 - 9k + 6$ as the total contribution. The sum of these three contributions is $9k^2$ and this is the number on the left-hand side of the identity.

We similarly evaluate the right-hand side of the identity. We have 3k primary fields of grade 1 on the three edges of the triangle. Each term gives a contribution of 3 on the right-hand side of the identity, so we get 9k as the total contribution. We next have $3k^2/2 - 3k/2$ primary fields of grade 1 inside the triangle. Each term gives a contribution of 6 on the right-hand side of the identity, so we get $9k^2 - 9k$ as the total contribution. The sum of these two contributions is $9k^2$, which is equal to the left-hand side.

Lemma 5.7. We have the following identity in the WZW-model $SU(3)_{3k}$, where (3) denotes the Young diagram with one row of three boxes:

$$\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{(3)} = \sum_{\mathrm{gr}(b)=1} N_{b\bar b}^{(3)} + 1.$$

Proof. Let α be the Young diagram (3). We next count the number of the paths from the odd vertex 0 to the odd vertex labeled with α on both the principal and the dual principal graphs. The number $\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{\alpha}$ gives the number of the paths on the principal graph. Let l be the number of the edges connecting the odd vertex α and the even vertex labeled with $p_{f,f}^{(0)}$ on the dual principal graph. By Lemmas 3.13 and 5.4, we know that l is 0 or 1. Lemmas 3.12, 3.13, and 5.4 imply that the number of the paths connecting 0 and α on the dual principal graph is $\left(\sum_b N_{b\bar b}^{\alpha} - 1\right)/3 + l$, where the summation is over all the primary fields b in the model $SU(3)_{3k}$. Since the two numbers of the paths are equal, we get

$$2\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{\alpha} = \sum_{\mathrm{gr}(b)=1,2} N_{b\bar b}^{\alpha} - 1 + 3l = 2\sum_{\mathrm{gr}(b)=1} N_{b\bar b}^{\alpha} - 1 + 3l.$$
This implies l = 1 because both sides are even numbers. We then get the conclusion.

Lemma 5.8. We have the following identity in the WZW-model $SU(3)_{3k}$, where (21) denotes the Young diagram with two boxes in the first row and one box in the second row:

$$\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{(21)} = \sum_{\mathrm{gr}(b)=1} N_{b\bar b}^{(21)} - 1.$$

Proof. Recall that we have

$$\square^3 = \emptyset + (3) + 2\,(21), \tag{2}$$

where $\square$ denotes the Young diagram with a single box and (3) the Young diagram with one row of three boxes. Lemma 5.6 and Frobenius reciprocity imply

$$\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{\square^3} = \sum_{\mathrm{gr}(b)=1} N_{b\bar b}^{\square^3}. \tag{3}$$

We also have the easy identity

$$\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{\emptyset} = \sum_{\mathrm{gr}(b)=1} N_{b\bar b}^{\emptyset} + 1, \tag{4}$$

since both sides are equal to 3k(k + 1)/2 + 1. Identities (2), (3), (4) and Lemma 5.7 imply the conclusion.

Lemma 5.9. The odd vertex of the dual principal graph labeled with (21) is not connected to the even vertex labeled with $p_{f,f}^{(0)}$.

Proof. We count the number of appropriate paths again. The number of paths connecting the odd vertex 0 to the odd vertex (21) on the principal graph of the asymptotic inclusion is $\sum_{\mathrm{gr}(a)=0} N_{a\bar a}^{(21)}$ because the principal graph is the fusion graph. Let l be the number of edges connecting the odd vertex of the dual principal graph labeled with (21) to the even vertex labeled with $p_{f,f}^{(0)}$. The number of paths connecting the odd vertex 0 to the odd vertex (21) on the dual principal graph of the asymptotic inclusion is $\left(\sum_b N_{b\bar b}^{(21)} - N_{ff}^{(21)}\right)/3 + l$. Lemmas 5.5, 5.8 show l = 0.

We finally prove the main theorem in this section as follows.

Theorem 5.10. For the subfactor N ⊂ M arising from the WZW-model $SU(3)_{3k}$, Assumption 3.14 holds.

Proof. Since we have Lemmas 3.13, 5.5, we have one of the following two cases.

1. We have a minimal central projection $p_{f,f}^{(1)}$ majorized by $p_{f,f}/3$ such that the odd vertex labeled with (21) is connected to the even vertex labeled with $p_{f,f}^{(1)}$ on the dual principal graph of the asymptotic inclusion by exactly two edges.
2. We have two minimal central projections $p_{f,f}^{(1)}$, $p_{f,f}^{(2)}$ majorized by $p_{f,f}/3$ such that the odd vertex labeled with (21) is connected to each of the even vertices labeled with $p_{f,f}^{(1)}$, $p_{f,f}^{(2)}$ on the dual principal graph of the asymptotic inclusion by exactly one edge.
Orbifold Subfactors from Hecke Algebras II
355
Suppose that we have Case 1. Lemma 3.16 shows that the dimension of the bimodule corresponding to $p^{(1)}_{ff}$ is equal to that of the bimodule corresponding to $p^{(0)}_{ff}$. Lemma 3.15 implies that the central projection $p^{(2)}_{ff} = p_{ff}/3 - p^{(0)}_{ff} - p^{(1)}_{ff}$ is minimal and that the dimension of the bimodule corresponding to $p^{(2)}_{ff}$ is also equal to that of the bimodule corresponding to $p^{(0)}_{ff}$.

Next suppose that we have Case 2. Lemma 3.16 implies that the sum of the dimensions of the bimodules corresponding to $p^{(1)}_{ff}$, $p^{(2)}_{ff}$ is equal to twice that of the bimodule corresponding to $p^{(0)}_{ff}$. This shows $p_{ff}/3 = p^{(0)}_{ff} + p^{(1)}_{ff} + p^{(2)}_{ff}$. Lemma 3.15 then implies that these two dimensions have to be equal.

In any case, we have a decomposition $p_{ff}/3 = p^{(0)}_{ff} + p^{(1)}_{ff} + p^{(2)}_{ff}$ into minimal central projections and each of the three minimal central projections has the same corresponding dimension. This completes the proof.

This theorem implies the following by a simple computation. This corollary is a generalized version of Corollary 4.3. Again note that this corollary gives the number of even vertices of the dual principal graph of the asymptotic inclusions and that these are also the dimensions of the Hilbert spaces $H_{S^1 \times S^1}$ in the corresponding topological quantum field theories for the original subfactors.

Corollary 5.11. Let $N \subset M$ be the subfactor corresponding to $SU(3)_k$ with $k > 2$. Then the number of the irreducible $M_\infty$-$M_\infty$ bimodules arising from the asymptotic inclusion is given as follows.
$$
\begin{cases}
\dfrac{(k+1)^2 (k+2)^2}{36}, & \text{if } k \not\equiv 0 \bmod 3, \\[2ex]
\dfrac{k^4 + 6k^3 + 13k^2 + 12k + 108}{36}, & \text{if } k \equiv 0 \bmod 3.
\end{cases}
$$
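The two cases of Corollary 5.11 can be evaluated directly. The following is a minimal sketch, not part of the original argument; the function name `num_bimodules` is ours. It uses the expansion $(k+1)^2(k+2)^2 = k^4 + 6k^3 + 13k^2 + 12k + 4$, so the two numerators differ by exactly 104.

```python
def num_bimodules(k: int) -> int:
    """Count of irreducible bimodules as stated in Corollary 5.11.

    Illustrative helper (name and interface are assumptions, not from the paper).
    """
    base = (k + 1) ** 2 * (k + 2) ** 2  # equals k^4 + 6k^3 + 13k^2 + 12k + 4
    # For k divisible by 3 the numerator is k^4 + 6k^3 + 13k^2 + 12k + 108.
    numerator = base + 104 if k % 3 == 0 else base
    if numerator % 36 != 0:
        raise ArithmeticError("numerator is not divisible by 36")
    return numerator // 36


if __name__ == "__main__":
    for level in (3, 4, 5, 6):
        print(level, num_bimodules(level))
```

For $k = 6$ the function returns 90, which agrees with the number of even vertices of the dual principal graph quoted for $SU(3)_6$ in the examples of this section.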
As examples, we work out the dual principal graphs for small $k$ such as $k = 3, 6$ in the rest of this section. First, we label the primary fields of $SU(3)_3$ as in Fig. 23.

Fig. 23. Primary fields for $SU(3)_3$
Then the principal graph of the asymptotic inclusion of the subfactor corresponding to $SU(3)_3$ is given as the fusion graph as in the upper half of Fig. 24. For the dual principal graph, we know the graph except for the edges connected to the three vertices $(99)_0$, $(99)_1$, $(99)_2$. From the Perron–Frobenius property, we can determine these edges as in the bottom half of Fig. 24. These edges are marked thick.

Fig. 24. (Dual) principal graphs for $SU(3)_3$

Since the subfactor corresponding to $SU(3)_3$ has index 4 and is described as $R \rtimes A_3 \subset R \rtimes A_4$, where $A_3$ and $A_4$ are the alternating groups of degree 3 and 4 respectively and these groups act freely on the hyperfinite II$_1$ factor $R$, the paragroup of the asymptotic inclusion is given by that of the subfactor $R^{A_4 \times A_4} \subset R^{A_4}$, where $A_4$ is diagonally embedded into $A_4 \times A_4$ and the group $A_4$ acts freely on $R$, by Ocneanu’s theorem again. (See [21, Lemma 2.15], [22, Appendix], [11, Sect. 12.8].) So the (dual) principal graphs of the asymptotic inclusion can be described with Ocneanu’s theorem again. (See [25].) Of course, this method gives the same result as in Fig. 24.

The next example is $SU(3)_6$. In this case, the system $\mathcal{M}$ has 10 primary fields and thus the principal graph of the asymptotic inclusion has 100 even vertices, and the dual principal graph has 90 even vertices. Since these graphs are too complicated, we draw only the edges concerned with the three even vertices $p^{(0)}_{ff}$, $p^{(1)}_{ff}$, $p^{(2)}_{ff}$. Then the Perron–Frobenius property and counting of paths with unitarity give the graph as in Fig. 25. In this figure, the symbol $(lm)$ denotes the Young diagram with $l$ boxes in the first row and $m$ boxes in the second row.

Fig. 25. Part of the dual principal graphs for $SU(3)_6$
6. Orbifold Subfactors

In Sects. 4, 5, we have observed that the even vertices of the dual principal graphs of the asymptotic inclusions are given by merging/splitting of the vertices with symmetries on pairs of the original labels. In the $SU(2)_k$ case, Ocneanu has noticed that this situation is similar to the orbifold construction for subfactors studied by us in [8, 20]. (See also [15, 51].) However, the dual principal graphs we have studied in Sects. 4, 5 are not orbifold graphs in the sense of [8, 20, 51], because we have merging/splitting of the vertices only for the even vertices. In this section, we study a relation of this orbifold phenomenon of Ocneanu to the orbifold construction in our sense.

Let $N \subset M$ be the Jones subfactor of type $A_{4n-3}$. That is, it is the hyperfinite type II$_1$ subfactor corresponding to $SU(2)_{4n-4}$. To avoid disconnectedness of the fusion graph, we assume that $n > 2$. (If $n = 2$, we get a subfactor arising from a free action of the group $S_3$, so everything can be studied with classical methods on group actions.) As in [20], we get the orbifold subfactor $P = N \rtimes_\sigma \mathbb{Z}/2\mathbb{Z} \subset Q = M \rtimes_\sigma \mathbb{Z}/2\mathbb{Z}$ of type $D_{2n}$, where $\sigma$ gives a non-strongly-outer action of $\mathbb{Z}/2\mathbb{Z}$ on the subfactor $N \subset M$ in the sense of [3]. (See also [15].) Let $\alpha$ be the dual action of $\sigma$ on $P \subset Q$. Then we have $N = P^\alpha$ and $M = Q^\alpha$, of course. Then the asymptotic inclusion $M \vee (M' \cap M_\infty) \subset M_\infty$ is described as $Q^\alpha \vee (Q' \cap Q_\infty)^\alpha \subset Q_\infty^\alpha$. Putting $R = (Q \vee (Q' \cap Q_\infty))^\alpha$, we get $Q^\alpha \vee (Q' \cap Q_\infty)^\alpha \subset R \subset Q_\infty^\alpha$ and $[R : Q^\alpha \vee (Q' \cap Q_\infty)^\alpha] = 2$. This intermediate subfactor corresponds to the intermediate subfactor $(M_\omega)^\sigma$ of the central sequence subfactor $N^\omega \cap M' \subset M_\omega$ described in [21, Sect. 3], [22, Sect. 4] in the correspondence of Ocneanu [31, p. 42], [22, Theorem 4.1]. (Here $\omega$ is a free ultrafilter over $\mathbb{N}$. See also [11, Theorem 15.32].) We use the notation $[[M : N]]$ for the global index of $N \subset M$ as in [40].
We easily get $[[M_\infty : M \vee (M' \cap M_\infty)]]/4 = [[Q_\infty : Q \vee (Q' \cap Q_\infty)]]$ from the description of the principal graph as the fusion graph. Note that $R \subset Q_\infty^\alpha$ is given as the simultaneous fixed point algebras of $Q \vee (Q' \cap Q_\infty) \subset Q_\infty$ by the action $\alpha$. By looking at $\alpha$, we can conclude that we have one of the following three cases.
1. $[[M_\infty : R]] = [[M_\infty : M \vee (M' \cap M_\infty)]]/4$,
2. $[[M_\infty : R]] = [[M_\infty : M \vee (M' \cap M_\infty)]]/2$,
3. $[[M_\infty : R]] = [[M_\infty : M \vee (M' \cap M_\infty)]]/8$.
That is, if the action is strongly outer in the sense of [3] and has a trivial Loi invariant, then we get Case 1; if the action is strongly outer and has a non-trivial Loi invariant, then we have Case 2; and if the action is not strongly outer, then we have Case 3.

Since the fusion rule algebra of the $M_\infty$-$M_\infty$ bimodules arising from $R \subset M_\infty$ is a fusion rule subalgebra of those arising from $M \vee (M' \cap M_\infty) \subset M_\infty$ (see [40, Lemma 2.4], for example), we look for a fusion rule subalgebra of that of the $M_\infty$-$M_\infty$ bimodules arising from the asymptotic inclusion of $N \subset M$. We first study fusion rule subalgebras of the WZW-model $SU(2)_{2k}$.

Lemma 6.1. Let $\mathcal{N}$ be a closed subsystem of primary fields under fusion of the WZW-model $SU(2)_{2k}$ labeled as $\{0, 1, 2, \dots, 2k\}$. Then $\mathcal{N}$ is one of the following: $\{0\}$, $\{0, 2k\}$, $\{0, 2, 4, \dots, 2k\}$, $\{0, 1, 2, \dots, 2k\}$.

Proof. It is clear that these four indeed give subsystems. Suppose that $\mathcal{N} \neq \{0\}$. Let $l$ be the smallest non-zero label appearing in $\mathcal{N}$. If $l = 1$, $l = 2$, or $l = 2k$, then we clearly have $\mathcal{N} = \{0, 1, 2, \dots, 2k\}$, $\mathcal{N} = \{0, 2, 4, \dots, 2k\}$,
$\mathcal{N} = \{0, 2k\}$, respectively. If $2 < l < 2k$, then we would have $N_{ll}^{2} = 1$, which implies $2 \in \mathcal{N}$ and thus a contradiction.

Lemma 6.2. Let $\mathcal{N}$ be the system of $M_\infty$-$M_\infty$ bimodules arising from the asymptotic inclusion of the subfactor $N \subset M$ of type $A_{4n-3}$. Let $\gamma$ be the global index of this system. Suppose we have a subsystem $\mathcal{N}_0$ of $\mathcal{N}$ with global index equal to one of $\gamma/2$, $\gamma/4$, $\gamma/8$. Then $\mathcal{N}_0$ is a subsystem of the $M_\infty$-$M_\infty$ bimodules labeled with pairs of even numbers as in Sect. 4 and its global index is $\gamma/2$.

Proof. It is clear that the subsystem of the $M_\infty$-$M_\infty$ bimodules labeled with pairs of even numbers has global index $\gamma/2$. Suppose that $\mathcal{N}_0$ contains a bimodule labeled with a pair of odd numbers. By taking an appropriate tensor power of this bimodule, we have $(2, 2)$ in this system $\mathcal{N}_0$. Let $X$ be the set of labels of pairs of integers appearing in $\mathcal{N}_0$ and set $Y = \{l \mid (0, l) \in X\}$. Then $Y$ gives a subsystem of the original WZW-model $SU(2)_{4n-4}$. By Lemma 6.1, we have four cases for $Y$. The assumption on the global index forces $Y = \{0, 2, 4, \dots, 4n-4\}$ and $[\mathcal{N}_0] = \gamma/2$. Then we have the conclusion.

Let $\beta$ be the $M_\infty$-$M_\infty$ bimodule labeled with $(0, 4n-4) = (4n-4, 0)$. It is clear that this bimodule has dimension 1. We can apply the orbifold construction for tensor categories as in [52] and get the following lemma.

Lemma 6.3. Let $N \subset M$, $P \subset Q$ be as above. Let $\mathcal{N}_0$ be the system of $M_\infty$-$M_\infty$ bimodules arising from $R \subset Q_\infty^\alpha = M_\infty$. Let $\mathcal{N}_1$ be the system of $Q_\infty$-$Q_\infty$ bimodules arising from the asymptotic inclusion of the subfactor $P \subset Q$ of type $D_{2n}$. Then the system $\mathcal{N}_1$ is given as the orbifold construction of $\mathcal{N}_0$ with $\beta$ as above.

Proof. Let $\mathcal{N}$ be the system of $M_\infty$-$M_\infty$ bimodules arising from the asymptotic inclusion of the subfactor $N \subset M$ of type $A_{4n-3}$. Lemma 6.2 implies that $[\mathcal{N}_1] = [\mathcal{N}_0]/2$, which gives the conclusion.

Theorem 6.4. Let $N \subset M$ be the Jones subfactor of type $A_{4n-3}$ with $n > 2$.
Let $\mathcal{N}_0$ be the subsystem of $M_\infty$-$M_\infty$ bimodules arising from the asymptotic inclusion $M \vee (M' \cap M_\infty) \subset M_\infty$ labeled with pairs of even integers as in Sect. 4. Let $\sigma$ be the outer, non-strongly-outer automorphism of order 2 of $N \subset M$. The system $\mathcal{N}_0$ is isomorphic to the system of $(M \otimes M)^{\sigma \otimes \sigma}$-$(M \otimes M)^{\sigma \otimes \sigma}$ bimodules arising from the orbifold subfactor $(N \otimes N)^{\sigma \otimes \sigma} \subset (M \otimes M)^{\sigma \otimes \sigma}$.

Proof. This follows from Lemma 6.3.
The meaning of the above theorem is as follows. When we apply the “quantum double” construction to a degenerate system, it is not enough to take a simple “double” because of degeneracy. Pairs labeled with ghosts appear so that the non-degeneracy is recovered, but then we have too many bimodules and the global index, giving the size of the system, becomes too large. Then the orbifold construction removes this redundancy and the correct global index is realized. The bimodules labeled with pairs of ghosts disappear when we remove an intermediate subfactor of index 2, which is the order of the orbifold construction.
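The classification of closed subsystems in Lemma 6.1 can be checked by brute force for small levels. The sketch below is an illustration only. It assumes the standard $SU(2)$ fusion rule at level $K$ with labels $0, \dots, K$, namely $N_{ab}^{c} = 1$ exactly when $|a-b| \le c \le \min(a+b,\, 2K-a-b)$ and $a+b+c$ is even; all function names are ours.

```python
from itertools import combinations


def fusion(a: int, b: int, level: int) -> set:
    """Labels c with N_ab^c = 1 under the assumed SU(2) level-`level` fusion rule."""
    top = min(a + b, 2 * level - a - b)
    # c runs from |a - b| to top in steps of 2, so a + b + c stays even.
    return set(range(abs(a - b), top + 1, 2))


def closed_subsystems(level: int) -> set:
    """All subsets of {0, ..., level} that contain 0 and are closed under fusion."""
    found = set()
    nonzero = range(1, level + 1)
    for size in range(level + 1):
        for extra in combinations(nonzero, size):
            subset = {0, *extra}
            if all(fusion(a, b, level) <= subset for a in subset for b in subset):
                found.add(frozenset(subset))
    return found


def lemma_6_1_list(level: int) -> set:
    """The four subsystems named in Lemma 6.1 (some may coincide for small level)."""
    return {
        frozenset({0}),
        frozenset({0, level}),
        frozenset(range(0, level + 1, 2)),
        frozenset(range(level + 1)),
    }


if __name__ == "__main__":
    for k in range(1, 6):
        print(2 * k, closed_subsystems(2 * k) == lemma_6_1_list(2 * k))
```

The search is exponential in the level, so it is only meant as a check for the first few values of $k$, not as a replacement for the proof.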
7. Orbifold Construction for Braiding

Theorem 7.1. The system of the $M$-$M$ bimodules arising from a subfactor $N \subset M$ of type $D_{2n}$, $n > 2$, has a non-degenerate braiding.

Proof. The system of the $Q_\infty$-$Q_\infty$ bimodules has a non-degenerate braiding by Ocneanu’s general theory. (See [11, Sect. 12.7], for example.) Lemma 6.3 implies that the system given by the orbifold construction on the system $(0, 0), (0, 2), \dots, (0, 4n-4) = \beta$ with $\beta$ is a subsystem of the $Q_\infty$-$Q_\infty$ bimodules. We thus get a braiding naturally. The non-degeneracy is also easy to see, because if we had degeneracy, then the degenerate subsystem would give a finite abelian group by [5], which is impossible by $n > 2$.

Corollary 7.2. The dual principal graph of the asymptotic inclusion of the hyperfinite II$_1$ subfactor $N \subset M$ with principal graph $D_{2n}$ is the fusion graph of the system of $M$-$M$ bimodules.

Proof. This follows from Theorem 7.1 and Proposition 2.2.
Remark 7.3. Ocneanu has constructed a braiding on the even vertices of D2n with an entirely different method in [37]. His theory in [37] also shows that his braiding and ours must be the same. Turaev and Wenzl [45] have worked on a similar construction to our orbifold construction in categories of tangles. In their approach to the Reshetikhin–Turaev type topological quantum field theory [39], they need a certain non-degeneracy and make some construction similar to our orbifold construction to remove the degeneracy. It seems that their construction, in particular, gives a braiding on the even vertices of D2n and we expect that their braiding is also the same as ours, but the actual relation is not clear. The basic idea is that the orbifold construction can be performed when we have some kind of degeneracy and this degeneracy is removed by the orbifold construction. Acknowledgement. This work was done at University of Wales, Swansea while the second author visited there on the joint research program of the Royal Society and the Japan Society for Promotion of Sciences. The second author thanks these societies for the financial support. The second author also acknowledges financial support from the Inamori Foundation during this research. We thank Dr. Maxim Nazarov for his kind explanation of the Littlewood–Richardson rule.
References
1. de Boer, J. and Goeree, J.: Markov traces and II$_1$ factors in conformal field theory. Commun. Math. Phys. 139, 267–304 (1991)
2. Choda, M.: Index for factors generated by Jones’ two sided sequence of projections. Pac. J. Math. 139, 1–16 (1989)
3. Choda, M. and Kosaki, H.: Strongly outer actions for an inclusion of factors. J. Funct. Anal. 122, 315–332 (1994)
4. Drinfel’d, V.G.: Quantum groups. Proc. ICM-86, Berkeley, pp. 798–820
5. Doplicher, S. and Roberts, J.E.: A new duality theory for compact groups. Invent. Math. 98, 157–218 (1989)
6. Erlijman, J.: New subfactors from braid group representations. Ph. D. dissertation at University of Iowa (1995)
7. Erlijman, J.: Two-sided braid subfactors and asymptotic inclusions. Preprint 1996
8. Evans, D.E. and Kawahigashi, Y.: Orbifold subfactors from Hecke algebras. Commun. Math. Phys. 165, 445–484 (1994)
9. Evans, D.E. and Kawahigashi, Y.: From subfactors to 3-dimensional topological quantum field theories and back – A detailed account of Ocneanu’s theory. Internat. J. Math. 6, 537–558 (1995)
10. Evans, D.E. and Kawahigashi, Y.: On Ocneanu’s theory of asymptotic inclusions for subfactors, topological quantum field theories and quantum doubles. Internat. J. Math. 6, 205–228 (1995)
11. Evans, D.E. and Kawahigashi, Y.: Quantum symmetries on operator algebras. Oxford: University Press, 1998
12. Fuchs, J.: Affine Lie algebras and quantum groups. Cambridge: Cambridge University Press, 1992
13. Goodman, F. and Nakanishi, T.: Fusion algebras in integrable systems in two dimensions. Phys. Lett. B262, 259–264 (1991)
14. Goodman, F. and Wenzl, H.: Littlewood Richardson coefficients for Hecke algebras at roots of unity. Adv. Math. 82, 244–265
15. Goto, S.: Orbifold construction for non-AFD subfactors. Internat. J. Math. 5, 725–746 (1994)
16. Goto, S.: Quantum double construction for subfactors arising from periodic commuting squares. Preprint 1996
17. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–15 (1983)
18. Kac, V.: Infinite dimensional Lie algebras. Cambridge: Cambridge University Press, 1990
19. Kauffman, L. and Lins, S.L.: Temperley–Lieb recoupling theory and invariants of 3-manifolds. Princeton, NJ: Princeton University Press, 1994
20. Kawahigashi, Y.: On flatness of Ocneanu’s connections on the Dynkin diagrams and classification of subfactors. J. Funct. Anal. 127, 63–107 (1995)
21. Kawahigashi, Y.: Centrally trivial automorphisms and an analogue of Connes’s χ(M) for subfactors. Duke Math. J. 71, 93–118 (1993)
22. Kawahigashi, Y.: Orbifold subfactors, central sequences and the relative Jones invariant κ. Internat. Math. Res. Notices, 129–140 (1995)
23. Kohno, T. and Takata, T.: Symmetry of Witten’s 3-manifold invariants for sl(n, C). J. Knot Theory Ramif. 2, 149–169 (1993)
24. Kosaki, H., Munemasa, A. and Yamagami, S.: On fusion algebras associated to finite group actions. Pac. J. Math. 177, 269–290 (1997)
25. Kosaki, H. and Yamagami, S.: Irreducible bimodules associated with crossed product algebras. Internat. J. Math. 3, 661–676 (1992)
26. Longo, R. and Rehren, K.-H.: Nets for subfactors. Rev. Math. Phys. 7, 567–597 (1995)
27. Macdonald, I.G.: Symmetric functions and Hall polynomials. Oxford Mathematical Monographs, New York: Oxford University Press, 1995
28. Masuda, T.: An analogue of Longo’s canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Internat. J. Math. 8, 249–265 (1997)
29. Moore, G. and Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)
30. Ocneanu, A.: Quantized group string algebras and Galois theory for algebras. In: Operator algebras and applications, Vol. 2 (Warwick, 1987), London Math. Soc. Lect. Note Series Vol. 136, Cambridge: Cambridge University Press, 1988, pp. 119–172
31. Ocneanu, A.: Quantum symmetry, differential geometry of finite graphs and classification of subfactors. University of Tokyo Seminary Notes 45, (Notes recorded by Y. Kawahigashi), 1991
32. Ocneanu, A.: An invariant coupling between 3-manifolds and subfactors, with connections to topological and conformal quantum field theory. Preprint 1991
33. Ocneanu, A.: Operator algebras, 3-manifolds and quantum field theory. OHP sheets for the Istanbul talk, July, 1991
34. Ocneanu, A.: Lectures at Collège de France, Fall 1991
35. Ocneanu, A.: Seminar talk at University of California, Berkeley, June 1993
36. Ocneanu, A.: Chirality for operator algebras. In: Subfactors (ed. H. Araki, et al.), Singapore: World Scientific, 1994, pp. 39–63
37. Ocneanu, A.: Paths on Coxeter diagrams: From Platonic solids and singularities to minimal models and subfactors. In preparation
38. Popa, S.: Correspondences. Preprint, 1986
39. Reshetikhin, N.Yu. and Turaev, V.G.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103, 547–597 (1991)
40. Sato, N.: Two subfactors arising from a non-degenerate commuting square – An answer to a question raised by V. F. R. Jones. Pac. J. Math. 180, 369–376 (1997)
41. Sato, N.: Two subfactors arising from a non-degenerate commuting square – Tensor categories and TQFT’s. Internat. J. Math. 8, 407–420 (1997)
42. Turaev, V.G.: Topology of shadows. Preprint, 1991
43. Turaev, V.G. and Viro, O.Y.: State sum invariants of 3-manifolds and quantum 6j-symbols. Topology 31, 865–902 (1992)
44. Turaev, V.G. and Wenzl, H.: Quantum invariants of 3-manifolds associated with classical simple Lie algebras. Internat. J. Math. 4, 323–358 (1993)
45. Turaev, V.G. and Wenzl, H.: Semisimple and modular categories from link invariants. Preprint 1996
46. Verlinde, E.: Fusion rules and modular transformation in 2D conformal field theory. Nucl. Phys. B300, 360–376 (1988)
47. Walton, M.: Fusion rules of Wess–Zumino–Witten models. Nucl. Phys. B340, 777–789 (1990)
48. Wenzl, H.: Hecke algebras of type A and subfactors. Invent. Math. 92, 345–383 (1988)
49. Witten, E.: Topological quantum field theory. Commun. Math. Phys. 117, 353–386 (1988)
50. Witten, E.: Gauge theories and integrable lattice models. Nucl. Phys. B322, 629–697 (1989)
51. Xu, F.: Orbifold construction in subfactors. Commun. Math. Phys. 166, 237–254 (1994)
52. Yamagami, S.: Group symmetries in tensor categories. Preprint 1995

Communicated by H. Araki
Commun. Math. Phys. 196, 363 – 383 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On the Contribution of Degenerate Periodic Trajectories to the Wave-Trace
Georgi Popov*,**
Institute of Mathematics, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria. E-mail: [email protected]
Received: 11 July 1997 / Accepted: 4 February 1998
Abstract: This paper is concerned with the wave trace $Z(t)$ of a selfadjoint elliptic pseudodifferential operator $A$ on a compact manifold. It is devoted to the contribution of degenerate periodic trajectories of the associated Hamiltonian flow to the singularities of $Z(t)$. Given a periodic trajectory $\gamma$ and a point $\nu$ on it, we obtain a trace formula which relates microlocally the trace of the unitary group of $A$ in a neighborhood of $\nu$ to an oscillatory integral, the phase and the amplitude of which are written in terms of the Poincaré map of $\gamma$. For suitable isolated but degenerate $\gamma$, this enables us to obtain complete asymptotic expansions applying already known results about asymptotics of oscillatory integrals. As an application we obtain lower bounds on the resonances close to the real axis for the Laplace-Beltrami operator in $\mathbb{R}^n$, $n \ge 2$, for suitable compactly supported perturbations of the Euclidean metric.

* Current address: UMR 6629, Université de Nantes – CNRS, DMI Faculté des Sciences et des Techniques, BP 92208, 44322 Nantes-Cedex 03, France
** Author partially supported by grant MM-706/97 with MES, Bulgaria

The purpose of this article is to present some new results on the singularities of the wave-trace. Let $X$ be a compact smooth manifold of dimension $n \ge 2$. Consider a selfadjoint elliptic pseudodifferential operator $A$ of order one, semi-bounded from below, acting on the space of half-densities $C^\infty(X, \Omega^{1/2})$ with a classical symbol given locally by
$$a(x, \xi) \sim a_1(x, \xi) + a_0(x, \xi) + \cdots + a_{-j}(x, \xi) + \cdots,$$
where $a_j$ are smooth homogeneous functions of order $j$ with respect to $\xi \neq 0$. We are going to investigate the singularities of the wave-trace
$$Z(t) = \sum_{\lambda_k \in \mathrm{Spec}(A)} e^{-i\lambda_k t},$$
where the summation is taken over all the eigenvalues $\lambda_k$ of $A$ counted with multiplicities. Denote by $\Sigma$ the cosphere bundle associated to the principal symbol $a_1$ of $A$, namely, $\Sigma = \{(x, \xi) \in T^*(X) : a_1(x, \xi) = 1\}$. Let $\Xi$ be the Hamiltonian vector field of $a_1$, and let $\gamma$ be a periodic trajectory of $\Xi$ in $\Sigma$ of period $T_\gamma \neq 0$, not necessarily primitive. In other words, $\exp(T_\gamma \Xi)(\nu) = \nu$ for each $\nu \in \gamma$. The contribution of $\gamma$ to the wave-trace of $A$ is well-known in case $\gamma$ is nondegenerate or, more generally, if it belongs to a clean fixed point manifold of the flow [6], [8] (see also [10, 23, 26] and the references there). It is given by a Fourier distribution in $\mathbb{R}$ with Lagrangian manifold $\{(T_\gamma, \tau) : \tau < 0\}$. To study the singularities of $Z$ it is natural to apply it to certain test functions. Given a function $\zeta$ in the Schwartz space $S(\mathbb{R})$ with Fourier transform $\hat\zeta \in C_0^\infty(\mathbb{R})$, we consider the trace
$$\mathrm{Tr}_\zeta(\lambda) = \mathrm{tr}\,(\zeta(\lambda - A)) = \sum_{\lambda_k \in \mathrm{Spec}(A)} \zeta(\lambda - \lambda_k) \quad \text{as } \lambda \to +\infty.$$
Obviously we have
$$\mathrm{Tr}_\zeta(\lambda) = \frac{1}{2\pi} \int e^{i\lambda t}\, \hat\zeta(t)\, Z(t)\, dt. \tag{0.1}$$
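Identity (0.1) can be illustrated numerically on a toy finite spectrum. The sketch below is only a consistency check under stated assumptions: the "spectrum" is a short hypothetical list, the Fourier transform convention is $\hat\zeta(t) = \int e^{-itx}\zeta(x)\,dx$, and $\zeta$ is a Gaussian, so $\hat\zeta$ is not compactly supported as required in the text but decays fast enough for a truncated trapezoid rule. All names are ours.

```python
import cmath
import math


def zeta(x: float) -> float:
    """Gaussian test function (an assumption made for this illustration)."""
    return math.exp(-x * x / 2.0)


def zeta_hat(t: float) -> float:
    """Fourier transform of zeta with the convention int e^{-itx} zeta(x) dx."""
    return math.sqrt(2.0 * math.pi) * math.exp(-t * t / 2.0)


def wave_trace(t: float, spectrum) -> complex:
    """Z(t) = sum over eigenvalues of exp(-i * lambda_k * t)."""
    return sum(cmath.exp(-1j * mu * t) for mu in spectrum)


def trace_direct(lam: float, spectrum) -> float:
    """Left-hand side of (0.1): sum of zeta(lam - lambda_k)."""
    return sum(zeta(lam - mu) for mu in spectrum)


def trace_fourier(lam: float, spectrum, cutoff: float = 12.0, steps: int = 4800) -> float:
    """Right-hand side of (0.1), by the trapezoid rule on [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0j
    for i in range(steps + 1):
        t = -cutoff + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp(1j * lam * t) * zeta_hat(t) * wave_trace(t, spectrum)
    return (total * h / (2.0 * math.pi)).real


if __name__ == "__main__":
    toy_spectrum = [1.0, 2.5, 2.5, 4.0]  # hypothetical eigenvalues, one with multiplicity 2
    print(trace_direct(3.0, toy_spectrum), trace_fourier(3.0, toy_spectrum))
```

The two printed values should agree up to quadrature error, which reflects nothing more than Fourier inversion applied term by term to the sum defining $Z(t)$.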
Our main purpose is to investigate the contribution of degenerate periodic trajectories to the asymptotics of $\mathrm{Tr}_\zeta(\lambda)$ as $\lambda \to +\infty$, starting from a suitable microlocal trace formula as in [5], [15] and [16]. Fix $\nu_0 \in \gamma$ and choose a pseudodifferential operator $B$ of order zero with wave-front set $\mathrm{WF}(B)$ in a neighborhood of $\nu_0$. The microlocal trace formula is concerned with the asymptotics of the trace
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \mathrm{tr}\,(\zeta(\lambda - A)\, B) \quad \text{as } \lambda \to +\infty, \tag{0.2}$$
where $\mathrm{WF}(B)$ and $\mathrm{supp}\,\hat\zeta$ are contained in sufficiently small neighborhoods of $\nu_0 \in T^*(X)$ and $T_\gamma \in \mathbb{R}$ respectively. Without any additional assumptions on the structure of the set of periodic points of the flow, we are going to write the microlocal trace (0.2) as a suitable oscillatory integral which involves naturally the Poincaré map $P$ of $\gamma$.

Let us recall the definition of $P$. Choose a smooth local section $Y \subset \Sigma$ of $\gamma$ at $\nu_0$ which is transversal to $\gamma$ and denote by $\omega_0$ the pull-back to $Y$ of the symplectic two-form of $T^*(X)$ via the inclusion map. Then $(Y, \omega_0)$ becomes a symplectic manifold of dimension $2n - 2$. By the implicit function theorem, there exists a smooth function $t(\nu)$ in a neighborhood $Y_0$ of $\nu_0$ in $Y$ such that $t(\nu_0) = T_\gamma$ and
$$P(\nu) \overset{\mathrm{def}}{=} \exp(t(\nu)\Xi)(\nu) \in Y \quad \text{for any } \nu \in Y_0.$$
The function $t$ is called the return time function while $P : Y_0 \to Y$ is the (local) Poincaré map associated to $\gamma$. It is well-known that $P$ is a symplectic map in $(Y, \omega_0)$ (see [1]). Then $P$ can be defined by a suitable generating function as follows: Let $(q, p) : Y \to W \subset T^*(\mathbb{R}^{n-1})$ be symplectic (Darboux) coordinates in $(Y, \omega_0)$ such that $q(\nu_0) = 0$ and $p(\nu_0) = 0$. Given $(q, p) \in W$ we denote the corresponding point on $Y$ by $\nu = \kappa(q, p)$. Let $P^0 : W_0 \to W$ be the Poincaré map in the new coordinates, where $W_0 \subset W$ is a neighborhood of $0$. Then $P^0 = \kappa^{-1} \circ P \circ \kappa$ is a symplectic map which admits a generating function $L_0$, choosing appropriately the local coordinates $q \in \mathbb{R}^{n-1}$. In other words, we have
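$$P^0\Bigl(q + \frac{\partial L_0}{\partial p}(q, p),\; p\Bigr) = \Bigl(q,\; p + \frac{\partial L_0}{\partial q}(q, p)\Bigr), \quad (q, p) \in W_0, \tag{0.3}$$
and
$$J(q, p) \overset{\mathrm{def}}{=} \det\Bigl(I + \frac{\partial^2 L_0}{\partial q\, \partial p}(q, p)\Bigr) \neq 0, \tag{0.4}$$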
where $I$ stands for the identity matrix in $\mathbb{R}^{2n-2}$ (for more details see Sect. 1). We normalize $L_0$ by $L_0(0, 0) = T_\gamma$. Next we set
$$\nu(q, p) = \kappa\Bigl(q + \frac{\partial L_0}{\partial p}(q, p),\; p\Bigr), \tag{0.5}$$
and denote by
$$a_0'(x, \xi) = a_0(x, \xi) + \frac{i}{2} \sum_{j=1}^{n} \frac{\partial^2 a_1}{\partial x_j\, \partial \xi_j}(x, \xi)$$
the subprincipal symbol of $A$, which is invariantly defined on $T^*(X)$ (see [11], Theorem 18.1.33). We can state now the “microlocal trace formula”:

Theorem 1. Let $\zeta \in S(\mathbb{R})$ have Fourier transform $\hat\zeta$ in $C_0^\infty(\mathbb{R})$ and let $B$ be a pseudodifferential operator of order 0 with principal symbol $b_0$. Suppose that $\mathrm{supp}\,\hat\zeta$ is contained in a sufficiently small neighborhood $O_1 \subset \mathbb{R}$ of $T_\gamma$ and that $\mathrm{WF}(B)$ is contained in a sufficiently narrow conic neighborhood $O_2 \subset T^*(X) \setminus 0$ of $\nu_0$. Then we have
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \frac{\lambda^{n-1}}{(2\pi)^n} \int_W e^{i\lambda L_0(q,p) + i\pi\mu_\gamma/2}\, \psi(q, p, \lambda)\, |J(q, p)|^{1/2}\, dq\, dp + O(\lambda^{-\infty}) \tag{0.6}$$
as $\lambda \to +\infty$, where $\mu_\gamma$ is the Maslov index of $\gamma$, and $\psi(q, p, \lambda) = \sum_{j=0}^{\infty} \psi_j(q, p)\lambda^{-j}$ is a classical symbol with respect to $\lambda$. Moreover, $\mathrm{supp}\,\psi_j \subset W_0$, $j \ge 0$, and
$$\psi_0(q, p) = \hat\zeta(t(\nu)) \int_{\mathbb{R}} e^{-i\int_0^{t(\nu)} a_0'(\exp((s+u)\Xi)(\nu))\, du}\; b_0(\exp(s\Xi)(\nu))\, ds, \tag{0.7}$$
where $\nu = \nu(q, p)$ is given by (0.5).

Formulae (0.3)–(0.7) contain all the geometric information about the Poincaré map that we need. Indeed, the set $\mathrm{Crit}(L_0)$ of critical points of $L_0$ in $W_0$ coincides with that of fixed points $\mathrm{Fix}(P^0)$ of $P^0$ in view of (0.3). Hence, $\nu = \kappa(q, p)$ is a periodic point of $\exp(t\Xi)$ for each $(q, p) \in \mathrm{Crit}(L_0)$ and $t(\nu)$ is a period of it. Denote by $\gamma(\nu)$ the corresponding periodic trajectory ($T_{\gamma(\nu)} = t(\nu)$) of $\Xi$.

The pseudodifferential operator $B$ in Theorem 1 has wave front set in a small conic neighborhood of a given point $\nu_0 \in \gamma$. For the applications it will be convenient to extend Theorem 1 to any pseudodifferential operator $B$ of order 0 with $\mathrm{WF}(B)$ contained in a conic neighborhood of the whole trajectory $\gamma$. Let $T_\gamma^\#$ be the period of the primitive periodic trajectory $\gamma^\#$ passing through $\nu_0 \in \gamma$. Then $\gamma = \ell\gamma^\#$ and $T_\gamma = \ell T_\gamma^\#$, where $\ell \neq 0$ is an integer. Let $P^\# : Y_0 \to Y$ be the Poincaré map associated to $\gamma^\#$. In other words, $P^\#(\nu) = \exp(t^\#(\nu)\Xi)(\nu) \in Y$ for any $\nu \in Y_0$, where $t^\# : Y_0 \to \mathbb{R}_+$ is a smooth function such that $t^\#(\nu_0) = T_\gamma^\#$. Obviously, $P = (P^\#)^\ell$. Then $P^\#(\mathrm{Fix}(P)) = \mathrm{Fix}(P)$ but $P^\#$ may differ from the identity map on $\mathrm{Fix}(P)$. As a consequence of Theorem 1 we obtain:
Corollary 2. There exists a neighborhood $I$ of $T_\gamma$ and a conic neighborhood $O$ of $\gamma$ such that if $\hat\zeta \in C_0^\infty(I)$ and $\mathrm{WF}(B) \subset O$, then $\mathrm{Tr}_{\zeta,B}(\lambda)$ admits an asymptotic expansion of the form (0.6) with a leading term $\psi_0(q, p) = \hat\zeta(t(\nu))\, \Psi_0(\nu)$, $\nu = \nu(q, p)$, where $\Psi_0 \in C_0^\infty(Y_0)$ is independent of $\zeta$. Moreover, for any $(q, p) \in \mathrm{Crit}(L_0) = \mathrm{Fix}(P^0)$ we have $L_0(q, p) = t(\nu) = T_{\gamma(\nu)}$ and
$$\sum_{j=0}^{|\ell|-1} \Psi_0\bigl((P^\#)^j(\nu)\bigr) = e^{-i\int_0^{t(\nu)} a_0'(\exp(u\Xi)(\nu))\, du} \int_0^{|t(\nu)|} b_0(\exp(s\Xi)(\nu))\, ds, \tag{0.8}$$
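where $\ell = T_\gamma / T_\gamma^\#$ and $\nu = \kappa(q, p)$.

To obtain an asymptotic expansion of (0.6) we need more information about the Poincaré map. As in [5] one can easily get from (0.6) and (0.8) a complete asymptotic expansion of $\mathrm{Tr}_{\zeta,B}(\lambda)$ at infinity in case $\gamma$ is nondegenerate, or more generally, when the fixed points of the flow in $\mathrm{WF}(B)$ belong to a clean fixed point manifold [8]. Indeed, differentiating (0.3) we get
$$I - P_\gamma = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} I + b & c \\ 0 & I \end{pmatrix}^{-1},$$
where $P_\gamma$ is the linear Poincaré map associated to $\gamma$ and
$$a = \frac{\partial^2 L_0}{\partial q^2}(0), \quad b = \frac{\partial^2 L_0}{\partial q\, \partial p}(0), \quad c = \frac{\partial^2 L_0}{\partial p^2}(0). \tag{0.9}$$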
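The factorization of $I - P_\gamma$ can be checked by hand in the lowest-dimensional case. The sketch below assumes $n - 1 = 1$, so that $a$, $b$, $c$ in (0.9) are scalars, and linearizes (0.3) at $0$: an input increment $((1+b)\,\delta q + c\,\delta p,\ \delta p)$ is mapped to $(\delta q,\ a\,\delta q + (1+b)\,\delta p)$. The function names are ours, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def linear_poincare_map(a, b, c):
    """2x2 linearization of P^0 at 0 obtained from (0.3) when n - 1 = 1.

    Illustrative helper: it returns M2 * M1^{-1} with
    M1 = [[1 + b, c], [0, 1]] and M2 = [[1, 0], [a, 1 + b]].
    """
    j = 1 + b  # this is J(0, 0) from (0.4) in the scalar case
    if j == 0:
        raise ValueError("J(0,0) = 1 + b must be nonzero, cf. (0.4)")
    return [[1 / j, -c / j],
            [a / j, j - a * c / j]]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det_identity_minus(m):
    """det(I - m) for a 2x2 matrix m."""
    return (1 - m[0][0]) * (1 - m[1][1]) - m[0][1] * m[1][0]


if __name__ == "__main__":
    a, b, c = Fraction(2), Fraction(1, 2), Fraction(3)  # arbitrary sample Hessian entries
    P = linear_poincare_map(a, b, c)
    hessian_det = a * c - b * b
    print(det2(P))                                        # symplectic in dimension 2
    print(det_identity_minus(P), hessian_det / (1 + b))   # det(I - P) = det(Hess) / J
```

In this scalar case $\det(I - P_\gamma) = (ac - b^2)/(1 + b)$, so $\gamma$ is nondegenerate exactly when the Hessian of $L_0$ at $0$ is nondegenerate, which is the relation between the density $|\det(I - P_\gamma)|^{-1/2}$ and the Hessian used in the stationary phase argument.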
Suppose that $\gamma$ is nondegenerate. Then zero is the only stationary point of the phase $L_0$ and it is nondegenerate. Thus, we can apply the stationary phase method to (0.6). The corresponding density is
$$|\det(I - P_\gamma)|^{-1/2} = |J(0, 0)|^{1/2} \left| \det \begin{pmatrix} a & b \\ b & c \end{pmatrix} \right|^{-1/2}.$$
Moreover, $L_0(0, 0) = T_\gamma$, $P^\#(\nu_0) = \nu_0$, where $\nu_0 = \kappa(0, 0)$, and (0.8) implies
$$\Psi_0(\nu_0) = e^{-i\int_0^{T_\gamma} a_0'(\exp(u\Xi)(\nu_0))\, du} \int_0^{T_\gamma^\#} b_0(\exp(s\Xi)(\nu_0))\, ds,$$
and we get the coefficient of the leading term. In the same way we obtain the asymptotics of (0.6) in the case of clean fixed point manifolds (see [5]).

Our aim is to investigate the asymptotics of $\mathrm{Tr}_{\zeta,B}(\lambda)$ for degenerate periodic trajectories $\gamma$. To do this we apply to (0.6) certain results of Varchenko [3, 24]. Suppose that $0$ is a critical point of $L_0$ of finite multiplicity. Assume that the principal part of the Taylor series of $L_0$ at $0$ is $\mathbb{R}$-nondegenerate and that the corresponding Newton polyhedron is remote (the notions of “finite multiplicity”, “$\mathbb{R}$-nondegenerate” and “remote” will be explained in Sect. 2). Then we obtain a complete asymptotic expansion of the oscillatory integral (0.6) and describe the leading term.

As an application of the trace formulae we get lower bounds on the resonances near the real axis for the Laplace-Beltrami operator $-\Delta_g$ in $\mathbb{R}^n$, $n \ge 2$, where $g$ is a suitable
Riemannian metric in $\mathbb{R}^n$ which coincides with the Euclidean one outside a compact set. If the Liouville measure $\vartheta(\Pi(T))$ on the cosphere bundle $\Sigma$ of the union of all periodic geodesics with a given period $T > 0$ is positive, we prove that for any $\varrho > 0$, the number $N_\varrho(r)$ of the resonances $\lambda_j$ lying in the domain $\{z \in \mathbb{C} : 0 < \mathrm{Im}\, z \le \varrho \log |z|,\ 1 < \mathrm{Re}\, z \le r\}$ can be estimated from below by
$$(C_0 + o_\varrho(1))\, r^n \quad \text{as } r \to +\infty,$$
where $C_0 = \vartheta(\Pi(T))(2\pi)^{-n} n^{-1}$. This generalizes a result of Sjöstrand and Zworski [22]. Upper bounds of the form $N_\varrho(r) \le C r^n$ were obtained by Vodev [25].

The idea first to present microlocally the wave trace as an oscillatory integral and then to investigate it was proposed by Marvizi and Melrose [4] in connection with the “inverse” Poisson relation for strictly convex planar domains. A similar idea was used in [17] and [18], where the “inverse” Poisson relation was investigated for closed bicharacteristics accumulating near a family of KAM tori. The analysis in [17], Sect. 5, could be simplified considerably by Theorem 1. A semiclassical analog of Theorem 1 was obtained in [5] (see also [16]). In [12] Ikawa applied a result of Varchenko to obtain an asymptotic trace formula in the case of two convex and bounded obstacles in $\mathbb{R}^3$ when the minimal broken geodesic is isolated but degenerate. The results in the present paper could be extended also to the case of smooth manifolds with boundary to treat the contribution of broken closed bicharacteristics to the wave trace.

The paper is organized as follows. Section 1 is devoted to the proof of the microlocal trace formula given by Theorem 1. Using a symplectic “straightening out” theorem we find a “normal form” of the Lagrangian manifold of the Fourier integral operator (FIO) $\exp(-itA)$ over $(T_\gamma, \nu_0)$. This allows us to obtain phase functions for $\exp(-itA)$ which are adapted to the geometry of the Poincaré map. In Sect. 2 we investigate the asymptotics of the oscillatory integral (0.6) assuming that $0$ is a critical point of finite multiplicity. In Sect. 3 we obtain lower bounds for the resonances.

1. Proof of the Microlocal Trace Formula

The proof of Theorem 1 is close to that of Theorem 5.3, [5] (see also [16]). One of the main differences is that we are going to deal with FIOs with homogeneous phase functions. The proof is divided into several steps.

1.1. FIOs and Lagrangian manifolds.
Using the functional calculus we write
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \frac{1}{2\pi}\, \mathrm{tr} \int e^{i\lambda t}\, \hat\zeta(t)\, e^{-itA} B\, dt. \tag{1.1}$$
Suppose that $\mathrm{WF}(B)$ is contained in a conic neighborhood $\mathcal{U}$ of $\nu_0$. Then $U = e^{-itA} B$ is a FIO in the sense of Hörmander with a kernel in the class $I^{1/4}(\mathbb{R} \times X \times X, \Lambda')$ (see [11]), associated to the conic Lagrangian manifold
$$\Lambda' = \{ (t, x, y, \tau, \xi, -\eta) \in T^*(\mathbb{R} \times X \times X) : \tau + a_1(y, \eta) = 0,\ (x, \xi) = \exp(t\Xi)(y, \eta),\ (t, y, \eta) \in \mathbb{R} \times \mathcal{U} \}. \tag{1.2}$$
As usual we denote the corresponding canonical relation by
$$\Lambda = \{(t, x, \tau, \xi, y, \eta) : (t, x, y, \tau, \xi, -\eta) \in \Lambda'\}.$$
Conjugating $\Lambda$ with a suitable canonical relation, we are going to find a simple form for it. That will allow us to represent microlocally $U$ near $\nu_0$ as an oscillatory integral with a simple phase function.

Let $N \subset T^*(X)$ be a smooth conic section transversal to $\gamma$ at $\nu_0$ and let $N_0$ be a conic neighborhood of $\nu_0$ in $N$ such that $Y = N \cap \Sigma$ and $Y_0 = N_0 \cap \Sigma$. We extend the return time function $t$ to a homogeneous function of order 0 in $N$ and consider the return map
$$N_0 \ni \nu \longmapsto R(\nu) = \exp(t(\nu)\Xi)(\nu) \in N.$$
Obviously, the restriction of the return map to $Y_0$ coincides with the local Poincaré map $P$. Let $f$ be a smooth real valued function in $T^*(X) \setminus 0$, homogeneous of order zero, which defines $N$, i.e., $f = 0$ and $df \neq 0$ on $N$. We can choose $f$ as a local solution of the Cauchy problem $\Xi f(\nu) = 1$, $\nu \in \mathcal{U}$, $f|_N = 0$, where $\mathcal{U}$ is a conic neighborhood of $N$. Then the Poisson bracket $\{a_1, f\}(\nu) = 1$, $\nu \in \mathcal{U}$.

Let $(q, p) : Y \to W \subset T^*(\mathbb{R}^{n-1})$ be symplectic (Darboux) coordinates in $(Y, \omega_0)$ such that $q(\nu_0) = 0$ and $p(\nu_0) = 0$, and let $\kappa : W \to Y$ be the inverse map. Consider the symplectic map $P^0 = \kappa^{-1} \circ P \circ \kappa$ mapping a neighborhood $W_0 \subset W$ of $0$ to $W$. Then
$$\mathrm{graph}(P^0) = \{(q, \tilde p, \tilde q, p) \in W \times W_0 : (q, \tilde p) = P^0(\tilde q, p)\}$$
is a Lagrangian submanifold of $W \times W_0$. As in Proposition 25.3.3 [11] we can choose the local coordinates $q \in \mathbb{R}^{n-1}$ at $0$ such that the projection
$$\mathrm{graph}(P^0) \ni (q, \tilde p, \tilde q, p) \longmapsto (q, p) \in T^*(\mathbb{R}^{n-1}) \tag{1.3}$$
becomes a local diffeomorphism. In particular, since $\mathrm{graph}(P^0)$ is a Lagrangian manifold, there exists a smooth function $L_0(q, p)$ satisfying (0.3) and (0.4). In other words, $L_0$ is a generating function of $P^0$ in $W_0$. We normalize it by $L_0(0) = T_\gamma$.

Now we choose local symplectic coordinates $(x, \xi)$ in a neighborhood of $\nu_0$ as follows: At first we set $\xi_n = a_1(y, \eta)$ and $x_n = f(y, \eta)$ for $(y, \eta) \in \mathcal{U}$. Then we find a smooth map $(x', \xi') : N \to T^*(\mathbb{R}^{n-1})$, $x' = (x_1, \dots, x_{n-1})$, $\xi' = (\xi_1, \dots, \xi_{n-1})$, such that $(x', \xi') = (q, p)$ on $Y$ and $\{x_n, x'\} = \{x_n, \xi'\} = 0$ on $N$. Finally we extend $(x', \xi')$ in $\mathcal{U}$ so that
$$\{\xi_n, x'\} = \{\xi_n, \xi'\} = 0 \quad \text{and} \quad \{x_n, x'\} = \{x_n, \xi'\} = 0.$$
In this way we obtain symplectic coordinates $(x, \xi) : \mathcal{U} \to \mathcal{U}_0 \subset T^*(\mathbb{R}^n)$, mapping $\nu_0$ to $\nu^0 = (x^0, \xi^0)$, where $x^0 = 0$, $\xi^0 = (0, 1)$, and $\mathcal{U}_0 \subset T^*(\mathbb{R}^n)$ is a neighborhood of $\nu^0$. Denote by $\chi : \mathcal{U}_0 \to \mathcal{U}$ the inverse map. Then the conic section $S = \chi^{-1}(N)$ is defined by $\{x_n = 0\} \cap \mathcal{U}_0$ and $S_0 = \chi^{-1}(N_0)$ is a neighborhood of $\nu^0$ in
$S$. Moreover, $\Sigma_0 = \chi^{-1}(\Sigma)$ is given by $\Sigma_0 = \{\xi_n = 1\} \cap \mathcal{U}_0$, and we can suppose that $W = \{x_n = 0, \xi_n = 1\} \cap \mathcal{U}_0$. Obviously, the restriction of $\chi$ to $W$ coincides with $\kappa$. Introduce the canonical relations
$$C_0 = \{(\chi(x, \xi), (x, \xi)) : (x, \xi) \in \mathcal{U}_0\}$$
and
$$C_1 = \{(t, \tilde x, \tau, \tilde\xi;\ t, x, \tau, \xi) : (\tilde x, \tilde\xi) = \chi(x, \xi),\ (t, \tau, x, \xi) \in T^*(\mathbb{R}) \times \mathcal{U}_0\}.$$
Denote by $C_1^{-1}$ the canonical relation inverse to $C_1$ and consider the transversal composition of canonical relations $\Lambda_0 = C_1^{-1} \circ \Lambda \circ C_0$. Let $Q$ be a FIO with a kernel belonging to $I^0(X \times \mathbb{R}^n, C_0')$. Suppose that $Q$ is microlocally unitary in a neighborhood $\mathcal{U}_1 \subset \mathcal{U}_0$ of $\nu^0$, i.e., $\mathrm{WF}(Q^*Q - \mathrm{Id}) \cap \mathcal{U}_1 = \emptyset$. Then
$$\widetilde U = Q^* \circ U \circ Q$$
is a FIO with a kernel belonging to $I^{1/4}(\mathbb{R}^{n+1} \times \mathbb{R}^n, \Lambda_0')$. Choose the neighborhoods $O_1$ of $T_\gamma$ and $O_2$ of $\nu_0$ in Theorem 1 as follows: take $O_2 \subset \chi(\mathcal{U}_1)$ and suppose that $\exp(t\Xi)(O_2)$ is a subset of $\chi(\mathcal{U}_1)$ for each $t \in O_1$. Then we have
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \frac{1}{2\pi}\,\mathrm{tr} \int e^{i\lambda t}\,\hat\zeta(t)\,\widetilde U\, dt + O(\lambda^{-\infty}). \tag{1.4}$$
In what follows we shall find a simple phase function generating the Lagrangian manifold $\Lambda_0'$ of $\widetilde U$.

1.2. Phase functions generating $\Lambda_0'$. Set
$$\varphi^t(y, \eta) = (\chi^{-1} \circ \exp(t\Xi) \circ \chi)(y, \eta), \qquad (t, y, \eta) \in V,$$
where $V$ is a conic neighborhood of $(T_\gamma, \nu^0)$ in $\mathbb{R} \times T^*(\mathbb{R}^n)$ such that $\varphi^t(y, \eta)$ belongs to $\mathcal{U}_1$ for each $(t, y, \eta)$ in $V$. Suppose also that $V$ contains $O_1 \times \chi^{-1}(O_2)$. Then we have
$$\Lambda_0' = \{(t, x, y, \tau, \xi, -\eta) \in T^*(\mathbb{R}^{n+1} \times \mathbb{R}^n) : \tau + \eta_n = 0,\ (x, \xi) = \varphi^t(y, \eta),\ (t, y, \eta) \in V\}. \tag{1.5}$$
Obviously $\xi_n = \eta_n$ on $\Lambda_0'$. Denote by $s(y', \eta)$ and $R^0 = \chi^{-1} \circ R \circ \chi : S_0 \to S$ the corresponding return time function and return map in the new coordinates. In other words,
$$s(y', \eta) = t(\chi(y', 0, \eta)) \quad \text{and} \quad R^0(y', 0, \eta) = \varphi^{s(y', \eta)}(y', 0, \eta) \in S, \tag{1.6}$$
for each $(y', 0, \eta) \in S_0$.
Lemma 1.1. For any $(t, y, \eta) \in V$ we have
$$\varphi^t(y, \eta) = \exp\Big(u\,\frac{\partial}{\partial y_n}\Big)\, w,$$
where $u = t + y_n - s(y', \eta)$ and $w = R^0(y', 0, \eta)$.

To prove the lemma we notice that the push-forward of the Hamiltonian vector field $\Xi$ of $a_1$ under $\chi^{-1}$ is equal to $\partial/\partial y_n$ in $\mathcal{U}_0$, i.e. $(\chi^{-1})_*(\Xi) = \partial/\partial y_n$ in $\mathcal{U}_0$ (see [5, 16]). We write the return map $R^0$ in the form
$$R^0(y', 0, \eta', \eta_n) = (X'(y', \eta), 0, Z'(y', \eta), \eta_n), \qquad (y', 0, \eta) \in S_0,$$
where $X'$ and $Z'$ are smooth functions. In particular,
$$P^0(y', \eta') = (X'(y', \eta', 1), Z'(y', \eta', 1)), \qquad (y', \eta') \in W_0. \tag{1.7}$$
Now (1.5) and Lemma 1.1 imply
$$\Lambda_0' = \{(t, x, y, \tau, \xi, -\eta) \in T^*(\mathbb{R}^{n+1} \times \mathbb{R}^n) : \tau = -\eta_n = -\xi_n,\ x' = X'(y', \eta),\ x_n = t + y_n - s(y', \eta),\ \xi' = Z'(y', \eta),\ (t, y, \eta) \in V\}. \tag{1.8}$$
One can think of $\Lambda_0'$ as a "normal form" of $\Lambda'$. We are going to find a homogeneous non-degenerate phase function $\Psi$ which defines the Lagrangian manifold $\Lambda_0'$ in a conic neighborhood of $\varrho^0 = (T_\gamma, x^0, x^0, -1, \xi^0, -\xi^0)$. Notice that the projection
$$\Lambda_0' \ni (t, x, y, \tau, \xi, -\eta) \longmapsto (t, x, \eta) \tag{1.9}$$
is a local diffeomorphism, because the map (1.3) is such. Hence, we can suppose that $\Lambda_0'$ is parameterized by $(t, x, \eta)$. On the other hand, the pull-back of the canonical one-form of $T^*(\mathbb{R} \times X \times X)$ to $\Lambda_0'$ is exact. Then the one-form $\tau\, dt + \xi\, dx + y\, d\eta$ is exact on $\Lambda_0'$, and there exists a smooth function $Q(t, x, \eta)$ such that $\tau = \partial Q/\partial t$, $\xi = \partial Q/\partial x$ and $y = \partial Q/\partial \eta$ on $\Lambda_0'$. Taking into account (1.8) we get
$$Q(t, x, \eta) = (x_n - t)\eta_n + \langle x', \eta' \rangle + L(x', \eta),$$
where $L$ is a smooth function and $L(\nu^0) = T_\gamma$. In this way we obtain a phase function $\Psi$ generating $\Lambda_0'$ which is of the form
$$\Psi(t, x, y, \eta) = (x_n - y_n - t)\eta_n + \langle x' - y', \eta' \rangle + L(x', \eta). \tag{1.10}$$
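In other words we have
$$\Lambda_0' = \Big\{\Big(t, x, x' + \frac{\partial L}{\partial \eta'}(x', \eta),\ x_n - t + \frac{\partial L}{\partial \eta_n}(x', \eta),\ -\eta_n,\ \eta' + \frac{\partial L}{\partial x'}(x', \eta),\ \eta_n,\ -\eta\Big) : (t, x, \eta) \in V\Big\}. \tag{1.11}$$
On the other hand, the projection $\Lambda_0' \ni (t, x, y, \tau, \xi, -\eta) \to (t, x, \xi)$ is a local diffeomorphism, as is (1.9), hence
$$\widetilde J(x', \eta) \overset{\mathrm{def}}{=} \det\Big(I + \frac{\partial^2 L}{\partial x' \partial \eta'}(x', \eta)\Big) \neq 0. \tag{1.12}$$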
Comparing (1.8) with (1.11) at $\Lambda_0' \cap \{x_n = y_n = 0\}$ and taking into account (1.7) we obtain that $L(x', \eta', 1)$ is just the generating function $L_0(x', \eta')$ (uniquely defined by construction) of the Poincaré map $P^0$ in $W$. Moreover,
$$s\Big(x' + \frac{\partial L}{\partial \eta'}(x', \eta),\ \eta\Big) = \frac{\partial L}{\partial \eta_n}(x', \eta). \tag{1.13}$$
On the other hand, $L$ is homogeneous of order one, and we obtain
$$t(\nu(x', \eta')) = L_0(x', \eta') - \langle \eta', \partial L_0/\partial \eta'(x', \eta') \rangle, \qquad (x', \eta') \in W_0, \tag{1.14}$$
where $\nu(x', \eta') = \kappa(x' + \partial L/\partial \eta'(x', \eta), \eta')$.

1.3. Computation of the trace. Now we can write the kernel of $\widetilde U$ in the form
$$\widetilde U(t, x, y) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{i\Psi(t, x, y, \eta)}\, u(t, x, \eta)\, |\widetilde J(x', \eta)|^{1/2}\, d\eta, \tag{1.15}$$
where
$$u(t, x, \eta) = \sum_{j=0}^{\infty} u_{-j}(t, x, \eta)$$
is a classical symbol. Moreover, parameterizing $\Lambda_0'$ by $(t, x, \eta)$ as above, we can identify $u_0$ with the principal symbol of $\widetilde U$, dividing the latter by the half-density $\tilde\sigma_2 = |dt \wedge dx \wedge d\eta|^{1/2}$. We are going to calculate $u_0$ using standard arguments from the theory of FIOs. Recall that parameterizing $\Lambda'$ by $(t, y, \eta)$, the principal symbol of $U$ becomes
$$\sigma(U)(t, y, \eta) = \exp\Big(-i \int_0^t a_0'(\exp(u\Xi)(y, \eta))\, du\Big)\, b_0(y, \eta)\, \sigma_1 \otimes \sigma_2,$$
where $\sigma_1$ is a suitable section of Maslov's bundle and $\sigma_2 = |dt \wedge dy \wedge d\eta|^{1/2}$. Consider the principal symbol of $\widetilde U = Q^* \circ U \circ Q$, parameterizing as above the corresponding Lagrangian manifold $\Lambda_0'$ by $(t, x, \eta)$. After dividing it by $\tilde\sigma_2$ we obtain
$$u_0(t, x, \eta) = e^{i\pi\mu_\gamma/2} \exp\Big(-i \int_0^t a_0'(\exp(u\Xi)(\chi(y(t, x, \eta), \eta)))\, du\Big)\, b_0(\chi(y(t, x, \eta), \eta)), \tag{1.16}$$
where $\mu_\gamma$ is the Maslov index of $\gamma$ and $y(t, x, \eta)$ is given by
$$y' = x' + \frac{\partial L}{\partial \eta'}(x', \eta), \qquad y_n = x_n - t + \frac{\partial L}{\partial \eta_n}(x', \eta).$$
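Plugging (1.15) into (1.4) and changing the $\eta$ variables we get
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \frac{\lambda^n}{(2\pi)^{n+1}} \int_{\mathbb{R}^{2n}} \int_{\mathbb{R}} e^{i\lambda(t(1 - \eta_n) + L(x', \eta))}\, \hat\zeta(t)\, |\widetilde J(x', \eta)|^{1/2}\, u(t, x, \lambda\eta)\, d\eta\, dx\, dt + O(\lambda^{-\infty}). \tag{1.17}$$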
The stationary points of the phase with respect to $(t, \eta_n)$ are given by
$$\eta_n = 1, \qquad t = \frac{\partial L}{\partial \eta_n}(x', \eta', 1). \tag{1.18}$$
Applying the stationary phase method and integrating with respect to $x_n$, we obtain
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \frac{\lambda^{n-1}}{(2\pi)^n} \int_W e^{i\lambda L_0(x', \eta') + i\pi\mu_\gamma/2}\, \psi(x', \eta', \lambda)\, |J(x', \eta')|^{1/2}\, d\eta'\, dx' + O(\lambda^{-\infty}),$$
where $L_0(x', \eta') = L(x', \eta', 1)$ and $J(x', \eta') = \widetilde J(x', \eta', 1)$. Moreover,
$$\psi(x', \eta', \lambda) = \sum_{j=0}^{\infty} \psi_j(x', \eta')\, \lambda^{-j}, \qquad (x', \eta') \in W,$$
is a classical symbol with respect to $\lambda$ with a leading term
$$\psi_0(x', \eta') = \hat\zeta\big(\partial L/\partial \eta_n(x', \eta', 1)\big)\, e^{-i\pi\mu_\gamma/2} \int_{\mathbb{R}} u_0\big(\partial L/\partial \eta_n(x', \eta', 1), x, \eta', 1\big)\, dx_n.$$
Thus setting $q = x'$ and $p = \eta'$ we obtain (0.6). On the other hand, (1.13) implies
$$t = \frac{\partial L}{\partial \eta_n}(x', \eta', 1) = s(y'(x', \eta'), \eta', 1) = t(\nu(x', \eta'))$$
at the stationary points (1.18). Now we take into account (1.16), observing that (1.17) and (1.18) together yield $y_n(t, x, \eta) = x_n$. Then setting $s = x_n$ we get
$$\psi_0(q, p) = \hat\zeta(t(\nu(q, p))) \int_{\mathbb{R}} \exp\Big(-i \int_0^{t(\nu(q, p))} a_0'\big(\exp(u\Xi)(\chi(y'(q, p), s, p, 1))\big)\, du\Big)\, b_0(\chi(y'(q, p), s, p, 1))\, ds.$$
Finally we notice that
$$\chi(y'(q, p), s, p, 1) = (\exp(s\Xi) \circ \chi)(y'(q, p), 0, p, 1) = \exp(s\Xi)(\nu(q, p))$$
for any $s$ with $|s|$ sufficiently small. The proof of Theorem 1 is complete.

Proof of Corollary 2. Consider a microlocal partition $\sum_{k=1}^{d} \phi_k$ of the identity in a conic neighborhood of $\gamma^{\#}$ by zero order pseudodifferential operators $\phi_k$, $1 \le k \le d$, with sufficiently small wave front sets and principal symbols $\varphi_k$. Then $\sum_{k=1}^{d} \varphi_k = 1$ in a conic neighborhood of $\gamma^{\#}$ and the principal symbol of $B_k = B\phi_k$ is $b_0 \varphi_k$. We can assume that the set $O_2$ in Theorem 1 is given by $O_2 = \{\exp(s\Xi)(\nu) : (s, \nu) \in (-\delta, \delta) \times Y\}$ for some $\delta > 0$ sufficiently small. Moreover, we suppose that there exists a partition $0 = s_1 < \dots < s_k < \dots < s_d < T_\gamma^{\#}$ such that the intervals $(s_k - \delta, s_k + \delta)$, $1 \le k \le d$, cover $[0, T_\gamma^{\#}]$ and
$$\mathrm{WF}(\phi_k) \subset \exp(s_k \Xi)(O_2) \tag{1.19}$$
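for each $k$. According to Egorov's theorem, $\widetilde B_k = \exp(i s_k A) \circ B_k \circ \exp(-i s_k A)$ is a pseudodifferential operator with a wave front set $\mathrm{WF}(\widetilde B_k)$ contained in $O_2$, and its principal symbol is equal to $\tilde b_{k0}(y, \eta) = (b_0 \varphi_k)(\exp(s_k \Xi)(y, \eta))$. Moreover,
$$\mathrm{Tr}_{\zeta,B_k}(\lambda) = \frac{1}{2\pi}\,\mathrm{tr} \int e^{i\lambda t}\, \hat\zeta(t)\, e^{-itA}\, \widetilde B_k\, dt = \mathrm{Tr}_{\zeta,\widetilde B_k}(\lambda),$$
which allows us to use Theorem 1. More precisely, setting $\widetilde B = \sum_{k=1}^{d} \widetilde B_k$ we investigate $\mathrm{Tr}_{\zeta,B}(\lambda) = \mathrm{Tr}_{\zeta,\widetilde B}(\lambda)$. We have $\mathrm{WF}(\widetilde B) \subset O_2$ and the principal symbol of $\widetilde B$ is given by
$$\tilde b_0(y, \eta) = \sum_{k=1}^{d} (b_0 \varphi_k)(\exp(s_k \Xi)(y, \eta)).$$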
Hence $\mathrm{Tr}_{\zeta,\widetilde B}(\lambda)$ can be written as an oscillatory integral of the form (0.6), where $\psi_0$ is given by (0.7) with $b_0$ replaced by $\tilde b_0$. In this way we obtain $\psi_0(q, p) = \hat\zeta(t(\nu))\, \Psi_0(\nu)$, $\nu = \nu(q, p)$, where $\Psi_0 \in C_0^\infty(Y_0)$ is independent of $\zeta$. Moreover, $\psi_0$ can be further simplified at any critical point $(q, p)$ of $L_0$. To do this, we recall that in view of (0.3) the set of critical points $\mathrm{Crit}(L_0)$ of $L_0$ in $W_0$ coincides with the set of fixed points $\mathrm{Fix}(P^0)$ of $P^0$. Using (1.19) we get for any $(q, p) \in \mathrm{Crit}(L_0) = \mathrm{Fix}(P^0)$ the equality
$$\Psi_0(\nu) = \exp\Big(-i \int_0^{t(\nu)} a_0'(\exp(u\Xi)(\nu))\, du\Big) \sum_{k=1}^{d} \int_{s_k - \delta}^{s_k + \delta} (b_0 \varphi_k)(\exp(s\Xi)(\nu))\, ds, \tag{1.20}$$
where $\nu = \kappa(q, p)$. Applying (1.20) to each $\nu_j = (P^{\#})^j(\nu)$, $0 \le j \le |\ell| - 1$, and summing up with respect to $j$ we prove (0.8). Moreover, (1.14) implies
$$L_0(q, p) = t(\nu) = T_{\gamma(\nu)}, \qquad \nu = \kappa(q, p), \quad \forall\, (q, p) \in \mathrm{Crit}(L_0),$$
that is, the critical value of $L_0$ at any $(q, p) \in \mathrm{Crit}(L_0)$ is just the period of the corresponding periodic trajectory $\gamma(\nu)$ passing through $\nu = \kappa(q, p)$.
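The passage from (1.13) to (1.14), and from (1.14) to the critical-value identity just stated, rests only on Euler's identity for the homogeneous function $L$. The following is a short sketch of that step in the notation above. It assumes nothing beyond (1.13), the identification $L(x', \eta', 1) = L_0(x', \eta')$, and the definition of $\nu(x', \eta')$.

```latex
% Euler's identity for L, homogeneous of order one in \eta = (\eta', \eta_n):
L(x', \eta) = \Big\langle \eta', \frac{\partial L}{\partial \eta'}(x', \eta) \Big\rangle
            + \eta_n \frac{\partial L}{\partial \eta_n}(x', \eta).

% Restrict to \eta_n = 1, where L(x', \eta', 1) = L_0(x', \eta'), and use (1.13):
t(\nu(x', \eta')) = \frac{\partial L}{\partial \eta_n}(x', \eta', 1)
                  = L_0(x', \eta') - \Big\langle \eta', \frac{\partial L_0}{\partial \eta'}(x', \eta') \Big\rangle .

% At a critical point (q, p) of L_0 the gradient vanishes, so \nu(q, p) = \kappa(q, p) and
L_0(q, p) = t(\kappa(q, p)), \qquad (q, p) \in \operatorname{Crit}(L_0).
```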
2. On the Contribution of Isolated Periodic Trajectories

The purpose of this section is to obtain asymptotics of $\mathrm{Tr}_{\zeta,B}(\lambda)$ for certain degenerate but isolated $\gamma$, using suitable asymptotic expansions for the oscillatory integral (0.6). The hypotheses that we need involve higher derivatives of the Poincaré map $P$ at $\nu_0$, and it is natural to formulate them in terms of the generating function $L_0$ of $P$. Unfortunately, it would be difficult to put all the conditions that we need in an invariant form. We are going to apply to (0.6) certain results of Varchenko [3] and [24]. First we recall some basic notions which can be found in [3]. Set $d = 2n - 2$ and denote $z = (z_1, \dots, z_d) = (q, p)$. The critical point 0 of $L_0(z)$ is said to be of finite multiplicity $m$
if the local algebra of the gradient map $\nabla L_0 = (\partial L_0/\partial z_1, \dots, \partial L_0/\partial z_d)$ has a finite dimension
$$m = \dim \mathbb{R}[[z_1, \dots, z_d]]/(\nabla L_0) < \infty.$$
In this case $L_0$ is right equivalent to its Taylor polynomial $g(z)$ of order $m + 1$ according to Tougeron's theorem (see [2], Sect. 6.3). In other words, there exists a local diffeomorphism $\varphi$ in a neighborhood of 0 such that $L_0(\varphi(z)) = g(z)$ and $\varphi(0) = 0$. Moreover, the point 0 turns out to be an isolated critical point of $L_0$ in this case (Proposition 7.2.2, [9]). Changing the variables $z = \varphi(q, p)$ we replace $L_0(z)$ in (0.6) by its Taylor polynomial $g(z)$ of order $m + 1$. Then the asymptotic behavior of (0.6) depends on the Newton polyhedron $\Gamma$ of the polynomial
$$g(z) = \sum_{|\alpha| \le m+1} c_\alpha z^\alpha. \tag{2.1}$$
By definition, $\Gamma$ is the smallest closed convex set in $\mathbb{R}^n_+$ which contains each $\alpha \neq 0$ with $c_\alpha \neq 0$ together with the positive orthant parallel translated to $\alpha$. Denote by $\Delta$ the union of all compact faces of $\Gamma$. The principal part of (2.1) is defined by
$$g_\Delta(z) = \sum_{\alpha \in \Delta} c_\alpha z^\alpha.$$
Moreover, given a face $\delta$ of $\Gamma$, we set
$$g_\delta(z) = \sum_{\alpha \in \delta} c_\alpha z^\alpha.$$
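To make the definition of the compact faces concrete, here is a minimal Python sketch for the two-variable case ($d = 2$). It is an illustration only and not part of the original argument. The function names `cross` and `newton_polygon_vertices` are hypothetical, and the sample exponents (monomials $q^4$, $p^2$, $pq^3$ and a few higher-order terms) are chosen purely for illustration.

```python
def cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def newton_polygon_vertices(exponents):
    """Vertices of the 2-D Newton polygon, by increasing first coordinate.

    The polygon is the convex hull of the union of the quadrants
    alpha + R_+^2 over the exponents alpha != 0; its compact faces are
    the edges joining consecutive vertices in the returned list.
    """
    pts = sorted({(a, b) for (a, b) in exponents if (a, b) != (0, 0)})
    # Keep only exponents not dominated by another one: after sorting by
    # the first coordinate, the second coordinate must strictly decrease.
    staircase = []
    for a, b in pts:
        if not staircase or b < staircase[-1][1]:
            staircase.append((a, b))
    # Lower-left convex hull of the staircase (monotone chain).
    hull = []
    for p in staircase:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


# Exponents (power of q, power of p) of q^4, p^2, p*q^3, q^5, p^2*q, p^3.
sample = [(4, 0), (0, 2), (3, 1), (5, 0), (1, 2), (0, 3)]
vertices = newton_polygon_vertices(sample)
print(vertices)
```

For this sample the only compact face is the segment joining $(0, 2)$ and $(4, 0)$. The exponent $(3, 1)$ lies strictly above that segment, so the monomial $pq^3$ does not enter the principal part.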
The principal part of (2.1) is said to be $\mathbb{R}$-nondegenerate if for any compact face $\delta$ of $\Gamma$ the polynomials $\partial g_\delta/\partial z_1, \dots, \partial g_\delta/\partial z_d$ do not have common zeros in $(\mathbb{R} \setminus 0)^d$. The bisector $\{(t, \dots, t) : t > 0\}$ intersects $\Delta$ for some $t = \ell \ge 1$. The negative rational $r = -1/\ell$ is called the remoteness of the Newton polyhedron and its multiplicity is defined by $K = d - \dim(\delta) - 1$, where $\delta$ is the open face containing the point $(\ell, \dots, \ell) \in \Delta$. The Newton polyhedron is said to be remote if $\ell > 1$. Combining Corollary 2 with Theorem 6.4, [3] we obtain:

Theorem 2.1. Let 0 be a critical point of $L_0$ of finite multiplicity $m$. Suppose that there are local coordinates $z$ with $z(0) = 0$ such that the principal part of the Taylor polynomial $g(z)$ of order $m + 1$ of $L_0(q(z), p(z))$ at 0 is $\mathbb{R}$-nondegenerate. Let the Newton polyhedron of $g(z)$ be remote with remoteness $-1/\ell$ and multiplicity $K$. Then there exists a neighborhood $I$ of $T_\gamma$ and a conic neighborhood $O$ of $\gamma$ such that for any $\zeta \in \mathcal{S}(\mathbb{R})$ with Fourier transform $\hat\zeta \in C_0^\infty(I)$ and for any pseudodifferential operator $B$ of order zero with $\mathrm{WF}(B) \subset O$ we have
$$\mathrm{Tr}_{\zeta,B}(\lambda) = e^{i\lambda T_\gamma + i\pi\mu_\gamma/2} \sum_{j \in M} \sum_{k=0}^{2n-3} w_{k,j}\, \lambda^j (\log \lambda)^k \quad \text{as } \lambda \to +\infty. \tag{2.2}$$
Here $M$ is a finite union of arithmetic progressions of rational numbers $< n - 1$ depending only on the Taylor polynomial of $L_0$ at 0 of degree $m + 1$. Moreover, the
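exponents in the leading term of (2.2) are $j = n - 1 - 1/\ell$ and $k = K$. The corresponding coefficient has the form
$$w_{K,\,n-1-1/\ell} = Q\,(2\pi)^{-n}\, \hat\zeta(T_\gamma)\, \exp\Big(-i \int_0^{T_\gamma} a_0'(\exp(s\Xi)(\nu_0))\, ds\Big) \int_0^{T_\gamma^{\#}} b_0(\exp(s\Xi)(\nu_0))\, ds, \tag{2.3}$$
where $b_0$ is the principal symbol of $B$ and $Q \neq 0$ is a constant which depends only on the Taylor series of $L_0$ at 0.

Thus, under the conditions of Theorem 2.1 we obtain
$$\mathrm{Tr}_{\zeta,B}(\lambda) = Q\,(2\pi)^{-n}\, \hat\zeta(T_\gamma)\, \exp\Big(-i \int_0^{T_\gamma} a_0'(\exp(s\Xi)(\nu_0))\, ds\Big) \int_0^{T_\gamma^{\#}} b_0(\exp(s\Xi)(\nu_0))\, ds \times e^{i\lambda T_\gamma + i\pi\mu_\gamma/2}\, \lambda^{n-1-1/\ell} (\log \lambda)^K + R(\lambda),$$
where $Q \neq 0$ is a constant which depends only on the Taylor series of $L_0$ at 0 and
$$R(\lambda) = \begin{cases} O\big(\lambda^{n-1-1/\ell} (\log \lambda)^{K-1}\big) & \text{if } K \ge 1, \\ O\big(\lambda^{n-1-1/\ell-\varepsilon}\big) & \text{if } K = 0, \end{cases}$$
for some $\varepsilon > 0$.

Example 2.2. Consider a smooth compact surface $X$ in $\mathbb{R}^3$ defined by
$$D \ni (u, \varphi) \longmapsto r(u, \varphi) = (g(u, \varphi) \cos \varphi,\ g(u, \varphi) \sin \varphi,\ u) \in X,$$
where $g$ is a continuous function in $D = [-2, 2] \times \mathbb{R}$ and $2\pi$-periodic with respect to $\varphi$. We assume that $g$ is positive and smooth in the interior of $D$ and
$$g(u, \varphi) = \sqrt{4 - u^2} + O\big((4 - u^2)^\infty\big) \quad \text{as } |u| \to 2.$$
Moreover, we suppose that
$$g(u, \varphi) = \varrho(u) + O(u^{k+1}), \qquad \varrho(u) = 1 - C u^k, \qquad \text{as } u \to 0,$$
where $C \neq 0$ and $k$ is an integer, $k \ge 3$. In this way we obtain a smooth compact surface in $\mathbb{R}^3$ which can be considered as a perturbation of the surface of revolution $\{(\varrho(u) \cos \varphi, \varrho(u) \sin \varphi, u) : \varphi \in \mathbb{R}, |u| \le \varepsilon\}$, $\varepsilon > 0$, near the equator $\{(\cos \varphi, \sin \varphi, 0) : \varphi \in \mathbb{R}\}$. Consider the first order pseudodifferential operator $A = \sqrt{-\Delta}$, where $\Delta$ is the Laplace-Beltrami operator on $X$. The metric tensor on $X$ is of the form
$$dx^2 + dy^2 + dz^2 = \big(1 + O(u^{k+1})\big)\, du^2 + \big(1 - 2Cu^k + O(u^{k+1})\big)\, d\varphi^2 + O(u^{k+1})\, du\, d\varphi.$$
The corresponding Hamiltonian is given by
$$H(u, \varphi, v, w) = \big(1 + O(u^{k+1})\big)\, v^2 + \big(1 + 2Cu^k + O(u^{k+1})\big)\, w^2 + O(u^{k+1})\, vw,$$
where $(u, \varphi, v, w) \in T^*(\mathbb{R} \times (\mathbb{R}/2\pi\mathbb{Z}))$. Moreover, $a_1 = \sqrt{H}$ is the principal symbol of $A$ and the subprincipal symbol $a_0'$ of $A$ is zero. Denote by $\Xi$ the Hamiltonian vector field of $a_1$ and set $\Sigma = \{H = 1\}$.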
Fix $0 \neq j \in \mathbb{Z}$, denote $\gamma = \gamma_j = \{(0, \varphi, 0, 1) : 0 \le \pm\varphi \le 2\pi|j|\}$, where the sign is positive if $j > 0$ and negative if $j < 0$, and set $T_\gamma = 2\pi j$. Then $\gamma$ is a closed trajectory of $\Xi$ in $\Sigma$. We are going to investigate the contribution of $\gamma$ to the singularities of $Z$. Consider the trajectory $\exp(t\Xi)(u_0, 0, v_0, w_0) = (u(t), \varphi(t), v(t), w(t))$, $t \in \mathbb{R}$, issuing from $\nu_0 = (u_0, 0, v_0, w_0) \in \Sigma$. The corresponding Hamiltonian system has the form
$$du/dt = v + O(u^{k+1}), \qquad dv/dt = -kCu^{k-1}w^2 + O(u^k),$$
$$d\varphi/dt = w + O(u^k), \qquad dw/dt = O(u^{k+1}).$$
The initial data belong to $\Sigma$ and we have $w_0^2 = 1 + O(v_0^2) + O(u_0^k)$ as $(u_0, v_0) \to (0, 0)$. In the same way we get $w^2 = 1 - v^2 + O(u^k)$. It is easy to see that for each fixed $t_1 > 2\pi|j|$ and any $t$ with $|t| \in [0, t_1]$ we have
$$\begin{aligned}
u(t) &= u_0 + t v_0 + O(u_0^{k-1}) + v_0\, O(u_0^{k-2}) + O(v_0^2), \\
v(t) &= v_0 - ktCu_0^{k-1} + O(u_0^k) + v_0\, O(u_0^{k-2}) + O(v_0^2), \\
\varphi(t) &= w_0 t + O(u_0^{k}) + v_0\, O(u_0^{k-1}) + O(v_0^2), \\
w(t) &= w_0 + O(u_0^k) + O(v_0^2).
\end{aligned}$$
Hereafter, the $O(\cdot)$ terms depend on $j$. We are ready to find the Taylor expansion of the generating function $L_0$ of the Poincaré map. Denote by $Y$ a neighborhood of $\nu_0 = (0, 0, 0, 1)$ in $\Sigma \cap \{\varphi = 0\}$. The pull-back of the symplectic two-form of $T^*(X)$ to $Y$ is just $du \wedge dv$. Consider the Poincaré map $P : Y_0 \to Y$ associated to $\gamma$, where $Y_0 \subset Y$ is a sufficiently small neighborhood of $\nu_0$. It is given by $P(u_0, v_0) = (u(T), v(T))$, where $T = T(u_0, v_0)$ is a smooth function such that $\varphi(T) = T_\gamma = 2\pi j$. Then
$$T(u_0, v_0) = T_\gamma + O(u_0^k) + v_0\, O(u_0^{k-1}) + O(v_0^2),$$
and we get
$$\begin{aligned}
u(T) &= u_0 + T_\gamma v_0 + O(u_0^{k-1}) + v_0\, O(u_0^{k-2}) + O(v_0^2), \\
v(T) &= v_0 - kCT_\gamma u_0^{k-1} + O(u_0^k) + v_0\, O(u_0^{k-2}) + O(v_0^2).
\end{aligned}$$
Set $\tilde q = u_0$, $\tilde p = v(T)$, $q = u(T)$, $p = v_0$. Then we get
$$\begin{aligned}
\partial L_0/\partial p &= \tilde q - q = -T_\gamma p + O(q^{k-1}) + p\, O(q) + O(p^2), \\
\partial L_0/\partial q &= \tilde p - p = -kCT_\gamma q^{k-1} + O(q^k) + p\, O(q^{k-2}) + O(p^2).
\end{aligned}$$
Hence, the generating function $L_0(q, p)$ normalized by $L_0(0, 0) = 2\pi j$ has a Taylor expansion of the form
$$L_0(q, p) = T_\gamma - CT_\gamma q^k - \frac{T_\gamma}{2} p^2 + M p q^{k-1} + O(q^{k+1}) + p^2 O(q) + O(p^3),$$
where $M$ is a constant. It is easy to see that 0 is a critical point of $L_0$ of multiplicity $m = k - 1$. The only compact face of the corresponding Newton polyhedron is $\delta = \{(kt, 2 - 2t) : 0 \le t \le 1\}$. Suppose that both $j$ and $C$ are positive. Since $k \ge 3$, by a linear change of the variables, the principal part of $L_0$ becomes
$$T_\gamma - q^k - p^2 \tag{2.4}$$
which is $\mathbb{R}$-nondegenerate. Moreover, the Newton polyhedron is remote with remoteness $r = -1/2 - 1/k$, the multiplicity of which is 0, and one can apply Theorem 2.1. In our case, however, the asymptotics of (0.6) can be obtained by a simpler argument. It is easy to see that $L_0$ is right equivalent to its principal part (2.4), changing the variables as follows:
$$y = (\pi j)^{1/2}\, p\,(1 + O(p) + O(q)) - \frac{M}{2} (\pi j)^{-1/2} q^{k-1}, \qquad x = (2\pi j C)^{1/k}\, q\,(1 + O(q) + O(p)).$$
Then (0.6) becomes
$$\mathrm{Tr}_{\zeta,B}(\lambda) = \lambda\, e^{i\lambda T_\gamma + i\pi\mu_\gamma/2} \int_W e^{-i\lambda(x^k + y^2)}\, \tilde\psi(x, y, \lambda)\, dx\, dy + O(\lambda^{-\infty}) \quad \text{as } \lambda \to \infty,$$
where $\tilde\psi(x, y, \lambda)$ is a classical symbol of order 0 of the form
$$\tilde\psi(x, y, \lambda) = \sum_{j=0}^{\infty} \tilde\psi_j(x, y)\, \lambda^{-j}$$
with $\operatorname{supp} \tilde\psi_j \subset W_0$. Moreover,
$$\tilde\psi_0(0, 0) = j^{-1/2-1/k}\, 2^{-2-1/k}\, \pi^{-5/2-1/k}\, C^{-1/k}\, \hat\zeta(2\pi j) \int_0^{2\pi} b_0(0, \varphi, 0, 1)\, d\varphi.$$
Set
$$\widetilde C = \pi^{-2-1/k}\, k^{-1}\, \Gamma(1/k)\, C^{-1/k}\, e^{-i\pi(1/4 + 1/k)}. \tag{2.5}$$
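Using a lemma of Erdélyi (see also [11], (7.7.30) and (7.7.31)), we obtain
$$\mathrm{Tr}_{\zeta,B}(\lambda) = e^{i\lambda T_\gamma + i\pi\mu_\gamma/2} \sum_{\ell, m = 0}^{\infty} C_{\ell,m}\, \lambda^{(1-\ell)/2 - (1+m)/k} \quad \text{as } \lambda \to \infty, \tag{2.6}$$
where $T_\gamma = 2\pi j$ and
$$C_{0,0} = (2j)^{-1/2-1/k}\, \widetilde C\, \sin\frac{(k-1)\pi}{2k}\, \hat\zeta(2\pi j) \int_0^{2\pi} b_0(0, \varphi, 0, 1)\, d\varphi$$
for $k$ odd, and
$$C_{0,0} = (2j)^{-1/2-1/k}\, \widetilde C\, e^{i\pi/2k}\, \hat\zeta(2\pi j) \int_0^{2\pi} b_0(0, \varphi, 0, 1)\, d\varphi$$
for $k$ even. Suppose also that $\gamma_j$ is the only closed geodesic of $X$ with a period $2\pi j$. Then we obtain a complete asymptotic expansion of $\mathrm{Tr}_\zeta(\lambda)$ if $\operatorname{supp} \hat\zeta$ is contained in a sufficiently small neighborhood $I$ of $2\pi j$.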
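The exponents and the trigonometric factors in Example 2.2 can be cross-checked with a few lines of Python. This is a sketch under stated assumptions, not the author's computation. The helper names are hypothetical, and the Newton polygon is assumed to have the single compact edge joining $(k, 0)$ and $(0, 2)$. The closed form used for the model integral $\int_{\mathbb{R}} e^{-ix^k}\,dx$ comes from the standard contour rotation $x = e^{-i\pi/(2k)} s$, which is not spelled out in the text.

```python
import cmath
import math
from fractions import Fraction


def remoteness_two_term(k):
    """Remoteness r = -1/ell for the polygon with vertices (k, 0) and (0, 2).

    The bisector point (ell, ell) lies on the edge x/k + y/2 = 1,
    hence 1/ell = 1/k + 1/2.
    """
    return -(Fraction(1, k) + Fraction(1, 2))


def leading_exponent(n, k):
    """Leading power of lambda, n - 1 - 1/ell, as in Theorem 2.1."""
    return (n - 1) + remoteness_two_term(k)


def real_axis_integral(k, steps=20000, upper=10.0):
    """Midpoint-rule value of the integral of exp(-s**k) over [0, upper]."""
    h = upper / steps
    return h * sum(math.exp(-((i + 0.5) * h) ** k) for i in range(steps))


def model_integral(k):
    """Closed form of the integral of exp(-i x**k) over the real line.

    On the positive half-line the contour rotation gives
    exp(-i*pi/(2k)) * Gamma(1 + 1/k). The negative half-line contributes
    the same value for even k and the complex conjugate for odd k.
    """
    half = cmath.exp(-1j * math.pi / (2 * k)) * math.gamma(1 + 1 / k)
    return 2 * half if k % 2 == 0 else half + half.conjugate()


k = 3
print(remoteness_two_term(k))   # remoteness for the model phase
print(leading_exponent(2, k))   # exponent of lambda for a surface (n = 2)
print(model_integral(k))        # real for odd k
```

For $n = 2$ the value `leading_exponent(2, k)` agrees with the exponent $1/2 - 1/k$ of the term $\ell = m = 0$ in (2.6). For odd $k$ the model integral is real and proportional to $\cos(\pi/2k) = \sin((k-1)\pi/2k)$, the trigonometric factor that appears in $C_{0,0}$.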
3. Lower Bounds on the Resonances

Let $g$ be a smooth Riemannian metric in $\mathbb{R}^n$, $n \ge 2$, which coincides with the Euclidean metric outside a ball $|x| \le \varrho_0$, $\varrho_0 \gg 1$. Denote by $-\Delta_g$ the corresponding Laplace-Beltrami operator and by $-\Delta$ the Laplacian on $\mathbb{R}^n$. Choose a smooth function $\psi \in C_0^\infty(\mathbb{R}^n)$, $\psi = 1$ in a neighborhood of $|x| \le \varrho_0$, and consider the "outgoing" cut-off resolvent
$$R_\psi(\lambda) = \psi(-\Delta_g - \lambda^2)^{-1}\psi : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n), \qquad \operatorname{Im} \lambda \le 0.$$
It is known that $R_\psi(\lambda)$ is analytic in $\{\lambda \in \mathbb{C} : \lambda \neq 0, \operatorname{Im} \lambda \le 0\}$ and that it admits a meromorphic continuation in the complex plane when $n$ is odd and in the logarithmic plane when $n$ is even. The poles $\lambda_j$ of $R_\psi(\lambda)$ are called resonances and the multiplicity of each $\lambda_j \neq 0$ is defined by
$$m(\lambda_j) = \operatorname{rank} \int_{\gamma_j} R_\psi(z)\, z\, dz,$$
where $\theta \to \gamma_j(\theta) = \lambda_j + \varepsilon e^{i\theta}$ and $0 < \varepsilon \ll 1$. The definition of the resonances does not depend on $\psi$. As in [20] we fix $C > 1$ and set $\Lambda = \{\lambda_j : |\lambda_j| \ge C,\ 0 \le \arg \lambda_j \le 1/C\}$. For $\varrho > 0$, we define $\Lambda_\varrho = \{\lambda_j \in \Lambda : \operatorname{Im} \lambda_j \le \varrho \log |\lambda_j|\}$ and
$$N_\varrho(r) = \#\{\lambda_j \in \Lambda_\varrho : \operatorname{Re} \lambda_j \le r\}, \qquad r > 1,$$
where the resonances are counted with multiplicities. Denote by $H$ the Hamiltonian in $T^*(\mathbb{R}^n)$ corresponding to the metric $g$ and by $\Xi$ the Hamiltonian vector field of $a_1 = \sqrt{H}$. Set $\Sigma = \{a_1(x, \xi) = 1\}$. Given $T \neq 0$, denote by $\Pi(T)$ the set of all $\nu \in \Sigma$ such that $\exp(T\Xi)(\nu) = \nu$. Then $\Pi(T)$ (if not empty) is the union of all (multiple) periodic trajectories $\gamma$ of $\Xi$ with period $T$. Notice that the projection of any such $\gamma$ to $\mathbb{R}^n$ is a closed geodesic in $(\mathbb{R}^n, g)$ with a length $|T|$. Let $d\vartheta$ be the Liouville measure on the cosphere bundle $\Sigma$. The following theorem is a generalization of a result of Sjöstrand and Zworski [22].

Theorem 3.1. Suppose that $\vartheta(\Pi(T_0)) > 0$ for some $T_0 > 0$. Then for each $\varrho > 0$, we have
$$N_\varrho(r) \ge \frac{C_0}{2\pi n}\,\big(1 - o_\varrho(1)\big)\, r^n, \qquad r \to +\infty,$$
where $C_0 = \vartheta(\Pi(T_0))\,(2\pi)^{1-n}$.

Proof. We are going to use a trace formula relating the resonances to the distribution
$$\langle u, \varphi \rangle = 2\,\mathrm{tr} \int \varphi(t) \Big(\cos t\sqrt{-\Delta_g} - \cos t\sqrt{-\Delta}\Big)\, dt, \qquad \varphi \in C_0^\infty(\mathbb{R}).$$
It is well known (see [20, 21, 27], and the references there) that in the case when $n$ is odd $\ge 3$ we have the following Poisson type formula:
$$u(t) = \sum_j e^{i\lambda_j t} \quad \text{in } \mathcal{D}'((0, +\infty)),$$
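where the resonances are counted according to their multiplicities. Fix some $0 < \alpha < 1 < \beta$ and $\gamma > 0$ and set $\Omega = \{z \in \mathbb{C} : \alpha < \operatorname{Re} z < \beta,\ 0 < \operatorname{Im} z < \gamma\}$. Then for any $\varphi \in C_0^\infty((0, +\infty))$ we have
$$\widehat{\varphi u}(\lambda) = \sum_{\lambda_j \in \lambda\Omega} \hat\varphi(\lambda - \lambda_j) + O(\lambda^{-\infty}), \qquad \lambda \to +\infty, \tag{3.1}$$
where the resonances are counted with multiplicities. Sjöstrand proved in [20] that (3.1) is valid in any dimension $n \ge 2$. For any $p \in \mathbb{N}$ we set $T_p = 2^p T_0$. According to [20], Theorem 10.1 (see also [22]), it suffices to find a sequence $p_j \to +\infty$ in $\mathbb{N}$ and neighborhoods $I_j$ of $T_{p_j}$ such that
$$|\widehat{\varphi u}(\lambda)| \ge \big(C_0 - o_\varphi(1)\big)\, \lambda^{n-1}, \qquad \lambda \to +\infty, \tag{3.2}$$
for any $\varphi \in C_0^\infty(I_j)$ with $\varphi(T_{p_j}) = 1$.

To prove (3.2) we shall use Theorem 1 for the operator $A = \sqrt{-\Delta_g}$. Using the finite speed of propagation for the wave equation as well as the propagation of the singularities we obtain
$$\widehat{\varphi u}(\lambda) = 2\pi\, \mathrm{Tr}_{\zeta,\Psi}(\lambda) + O(\lambda^{-\infty}), \qquad \lambda \to +\infty,$$
where $\varphi(-t) = \hat\zeta(t)$ is supported in a small neighborhood $I_j$ of $-T_{p_j}$ and $\Psi$ stands for the operator of multiplication with a function $\Psi \in C_0^\infty(\mathbb{R}^n)$ such that $\Psi(x) = 1$ in $\{|x| \le \varrho_1\}$, $\varrho_1 \gg 1$. Let $\Pi(T) \neq \emptyset$ for some $T \neq 0$. Given $\nu \in \Pi(T)$, we denote by $\mu_\gamma = \mu(T, \nu)$ the Maslov index of the closed trajectory $\gamma = \{\exp(t\Xi)(\nu) : 0 \le \pm t \le |T|\}$, where the sign is positive if $T > 0$ and negative if $T < 0$. We denote by $\Pi_+(T)$ ($\Pi_-(T)$) the set of all $\nu \in \Pi(T)$ such that $\mu(T, \nu)/2$ is even (odd). Obviously,
$$\Pi_+(-T) = \Pi_+(T), \qquad \Pi_-(-T) = \Pi_-(T), \qquad \Pi(T) \subset \Pi_+(2T) \subset \Pi(2T).$$
Suppose that $\vartheta(\Pi(T_0)) > 0$. We claim that for each $\varepsilon > 0$ there exists $p_0 \ge 1$ such that $\vartheta(\Pi_-(2^p T_0)) < \varepsilon$ for any integer $p \ge p_0$. If $\vartheta(\Pi_-(2^{p_j} T_0)) \ge \varepsilon$ for an increasing sequence of positive integers $\{p_j\}_{j=0}^\infty$, then
$$\vartheta(\Sigma) \ge \vartheta(\Pi(2^{p_j} T_0)) \ge \vartheta(\Pi_+(2^{p_j} T_0)) + \varepsilon \ge \vartheta(\Pi(2^{p_j - 1} T_0)) + \varepsilon \ge \vartheta(\Pi(T_0)) + r_j \varepsilon,$$
where $r_j \to +\infty$, which leads to a contradiction. Hence, for each $0 < \varepsilon < \frac{1}{2}\vartheta(\Pi(T_0))$ there exists $p_0 \ge 1$ such that
$$0 \le \vartheta(\Pi_-(2^p T_0)) < \varepsilon < \vartheta(\Pi_+(2^p T_0)) \tag{3.3}$$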
for any integer $p \ge p_0$. Fix an integer $p \ge p_0$ and set $T = -2^p T_0$. Pick $I = (T - \delta, T + \delta)$, $0 < \delta \le \delta_0$, where $\delta_0$ is sufficiently small, and choose finitely many closed trajectories $\gamma_k$, $1 \le k \le N$, of $\Xi$ of period $T$, conic neighborhoods $O_k$ of $\gamma_k$ and pseudodifferential operators $B^{(k)}$ of order zero with $\mathrm{WF}(B^{(k)}) \subset O_k$ satisfying the assumptions of Corollary 2 and such that the union of $\Pi(t)$, $t \in [T - \delta_0, T + \delta_0]$, does not intersect $\mathrm{WF}\big(\sum_{k=1}^{N} B^{(k)} - I\big)$. Moreover, we suppose that the support of $\hat\zeta(t) = \varphi(-t)$ is contained in $I$. Then we obtain
$$\widehat{\varphi u}(\lambda) = 2\pi \sum_{k=1}^{N} \mathrm{Tr}_{\zeta,B^{(k)}}(\lambda) + O(\lambda^{-\infty}), \qquad \lambda \to +\infty.$$
To simplify the notation let us drop the index $k$. Now we set $I_\delta = (T - \delta, T) \cup (T, T + \delta)$, and write the domain of integration $W$ in (0.6) as $W = W_1 \cup W_2 \cup W_3$, where $W_1 = W_0 \cap \Pi(T)$, $W_2 = W_2(\delta) = W_0 \cap \tilde t^{-1}(I_\delta)$ and $\tilde t(q, p) = t(\nu(q, p))$, $\nu(q, p)$ being defined by (0.5). Since $\operatorname{supp} \hat\zeta \subset (T - \delta, T + \delta)$, using Corollary 2 we obtain
$$\mathrm{Tr}_{\zeta,B}(\lambda) = Q_1(\lambda) + Q_2(\lambda) + O(\lambda^{n-2}), \qquad \lambda \to +\infty,$$
where
$$Q_j(\lambda) = \frac{\lambda^{n-1}}{(2\pi)^n} \int_{W_j} e^{i\lambda L_0(q, p) + i\pi\mu_\gamma/2}\, \hat\zeta(t(\nu(q, p)))\, \Psi_0(q, p)\, |J(q, p)|^{1/2}\, ds\, dq\, dp.$$
Obviously $W_2(\delta) \subset W_2(\delta')$ for $0 < \delta < \delta' \le \delta_0$. Moreover, the intersection of the open sets $W_2(\delta)$, $0 < \delta \le \delta_0$, is empty, hence there exists $\delta > 0$ such that
$$|Q_2(\lambda)| \le \frac{\varepsilon}{N}\, \lambda^{n-1}.$$
The first integral can be evaluated as in [5] and [16] (see also [13, 19, Theorem 4.4.2], and references there for a slightly different method involving the so-called absolutely periodic trajectories). We have $W_1 = \mathrm{Fix}(P^0) = \mathrm{Crit}(L_0)$. Moreover, $L_0(q, p) = T(\kappa(q, p)) = T$ for all $(q, p) \in W_1$, and $\nu(q, p) = \kappa(q, p)$ for $(q, p) \in W_1$. Denote by $F \subset W_1$ the set of all $\nu \in W_1$ such that for any neighborhood $U$ of $\nu$ in $W$ the Lebesgue measure of $W_1 \cap U$ is positive (see [16]). By definition, $F$ and $W_1$ have the same Lebesgue measure in $W$. Hence, we can replace the domain of integration in $Q_1(\lambda)$ by $F$. On the other hand, the set $F$ enjoys the following property (see Lemma 2.1, [14]): any smooth function in $W$ which vanishes on $F$ is in fact flat at $F$, i.e.
$$f(z) = 0 \ \ \forall z \in F \quad \Longrightarrow \quad \partial_z^\alpha f(z) = 0 \ \ \forall z \in F,\ \forall \alpha.$$
Since $F \subset \mathrm{Crit}(L_0)$, we obtain $\partial_z^\alpha L_0(z) = 0$ for all $z = (q, p) \in F$ and all $\alpha \neq 0$, which implies $J(q, p) = 1$ in $F$.
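On the other hand, $a_0' = 0$, and for any $(q, p) \in \mathrm{Fix}(P^0)$ we can rewrite (0.8) as follows:
$$\sum_{j=0}^{|\ell|-1} \Psi_0(\nu_j) = \sum_{j=0}^{|\ell|-1} \int_0^{t^{\#}(\nu_j)} b_0(\exp(s\Xi)(\nu_j))\, ds, \qquad \nu_j = (P^{\#})^j(\kappa(q, p)).$$
Integrating the last equality in $W_1$ and using that $P^{\#}$ and $\kappa$ are symplectic, hence volume preserving, we obtain
$$Q_1(\lambda) = \frac{\lambda^{n-1}}{(2\pi)^n}\, e^{i\lambda T}\, \hat\zeta(T) \int_{W_1} e^{i\pi\mu_\gamma/2} \int_0^{t^{\#}(\nu)} b_0(\exp(s\Xi)(\nu))\, ds\, dq\, dp, \qquad \nu = \kappa(q, p).$$
On the other hand, it is easy to see that $ds\, dq\, dp$ gives locally the Liouville measure on $\Sigma$. Namely, given $s_0 \in \mathbb{R}$, we have $ds\, dq\, dp = \psi^*(d\vartheta)$ in a neighborhood of $(s_0, 0, 0)$, where $\psi(s, q, p) = \exp(s\Xi)(\kappa(q, p)) = \chi(q, s, p, 1)$ (see Sect. 1.1). Indeed, the Liouville measure on $\Sigma$ is just the density $d\vartheta = |j^*\Omega|$, where $j : \Sigma \to T^*(X)$ is the natural inclusion map and $(n!)\, da_1 \wedge \Omega = \omega^n$ in a neighborhood of $\Sigma$, $\omega$ being the symplectic two-form on $T^*(X)$. Since $\chi^*(a_1) = \xi_n$ and $\chi^*(\omega) = \sum_{j=1}^{n} dx_j \wedge d\xi_j$, we obtain
$$\psi^*(j^*\Omega) = (-1)^{n-1}\, ds \wedge dq_1 \wedge \dots \wedge dq_{n-1} \wedge dp_1 \wedge \dots \wedge dp_{n-1}.$$
Then
$$Q_1(\lambda) = \frac{\lambda^{n-1}}{(2\pi)^n}\, e^{i\lambda T}\, \hat\zeta(T) \int_{\Pi(T)} e^{i\pi\mu(T, \nu)/2}\, b_0(\nu)\, d\vartheta + o(\lambda^{n-1}).$$
Since $\hat\zeta(T) = 1$, summing up and taking into account (3.3) we get
$$\widehat{\varphi u}(\lambda) = e^{i\lambda T} \Big(\frac{\lambda}{2\pi}\Big)^{n-1} \vartheta(\Pi(T)) + \varepsilon\, O(\lambda^{n-1}) + o(\lambda^{n-1}),$$
which proves the theorem.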
Example 3.2. Consider a smooth surface $Y = X_1 \cup X_2$ in $\mathbb{R}^3$, where $X_2 = \{x \in \mathbb{R}^3 : x_3 = 0, |x| \ge 2\}$ and $X_1$ is parameterized by
$$D \ni (u, \varphi) \longmapsto r(u, \varphi) = (g(u, \varphi) \cos \varphi,\ g(u, \varphi) \sin \varphi,\ u) \in X_1,$$
where $g$ is a suitable continuous function in $D = [0, 2] \times \mathbb{R}$, which is $2\pi$-periodic with respect to $\varphi$ and positive and smooth in the interior of $D$, and $g(0, \varphi) = 2$, $g(2, \varphi) = 0$. We suppose that
$$\frac{\partial g}{\partial u}(u, \varphi) < 0 \quad \text{for } u \in (0, 1) \cup (1, 2) \tag{3.4}$$
and that
$$g(u, \varphi) = 1 - C(1 - u)^{2k+1} + O((1 - u)^{2k+2}) \quad \text{as } u \to 1,$$
where $C > 0$ and $k$ is an integer, $k \ge 1$. Notice that the only (primitive) closed geodesic on $X$ is $\gamma_1 = \{(\cos \varphi, \sin \varphi, 1) : 0 \le \varphi \le 2\pi\}$ because of (3.4). Taking into account (2.6) and using [20], Theorem 10.1 (see also [22]), we obtain for any $j \in \mathbb{N}$ and any $\varrho > (3/4 + 1/(4k + 2))\pi^{-1} j^{-1}$
$$N_\varrho(r) \ge (2j)^{-1/2-1/k}\, C_1 \sin\frac{(k-1)\pi}{2k}\, (1 + o(1))\, r^{3/2 - 1/(2k+1)}, \qquad r \to +\infty,$$
where $C_1 = (3/2 - 1/(2k+1))\,|\widetilde C|$; here $\widetilde C$ is given by (2.5) and does not depend on $j$. This example shows how sensitive the resonances can be to perturbations of the metric. Indeed, deforming the surface $Y$ slightly we obtain $\partial g/\partial u(u, \varphi) < 0$ for $u \in (0, 2)$. Then the corresponding metric becomes "non-trapping" and there exist $\varrho > 0$ and $C > 0$ such that $N_\varrho(r) \le C$, $\forall r \ge 1$.

Acknowledgement. Most of this work was written while I was visiting the Department of Mathematics, University of Nantes, and I would like to thank my colleagues there, especially Didier Robert and Anne-Marie Charbonnel, for the helpful discussions and the hospitality.
References
1. Abraham, R., Marsden, J.: Foundations of mechanics. Second edition, Reading: Addison-Wesley, 1978
2. Arnold, V., Gusein-Zade, S., Varchenko, A.: Singularities of Differentiable Maps. Vol. I, Boston: Birkhäuser, 1985
3. Arnold, V., Gusein-Zade, S., Varchenko, A.: Singularities of Differentiable Maps. Vol. II, Boston: Birkhäuser, 1985
4. Marvizi, Sh., Melrose, R.: Spectral invariants of convex planar regions. J. Differ. Geom. 17, 475–502 (1982)
5. Charbonnel, A.-M., Popov, G.: Semiclassical asymptotics for several commuting operators. Rapport de Recherche 96/11-3, Université de Nantes
6. Colin de Verdière, Y.: Spectre du Laplacien et longueurs des géodésiques périodiques I, II. Compositio Math. 27, 80–106, 159–184 (1973)
7. Duistermaat, J.: Oscillatory integrals, Lagrange immersions and unfolding of singularities. Commun. in Pure and Appl. Math. 27, 207–281 (1974)
8. Duistermaat, J., Guillemin, V.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
9. Golubitsky, M., Guillemin, V.: Stable mappings and their singularities. Berlin–Heidelberg–New York: Springer, 1973
10. Guillemin, V., Uribe, A.: Circular symmetry and the trace formula. Invent. Math. 96, 385–423 (1989)
11. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol. I, III, IV. Berlin–Heidelberg–New York: Springer, 1985
12. Ikawa, M.: Trapping obstacles with a sequence of poles of the scattering matrix converging to the real axis. Osaka J. Math. 22, 657–689 (1985)
13. Petkov, V.: Weyl asymptotics of the scattering phase for metric perturbations. Asympt. Analysis 10 (3), 245–263 (1995)
14. Petkov, V., Popov, G.: On the Lebesgue measure of the periodic points of a contact manifold. Math. Z. 218, 91–102 (1995)
15. Petkov, V., Popov, G.: Une formule de trace semi-classique et asymptotiques de valeurs propres de l'opérateur de Schrödinger. C. R. Acad. Sci. Paris, t. 323, Série I, 163–168 (1996)
16. Petkov, V., Popov, G.: Semi-classical trace formula and clustering of eigenvalues for Schrödinger operators. Ann. Inst. Henri Poincaré, Phys. Theor. 68 (1), 17–83 (1998)
17. Popov, G.: Length spectrum invariants of Riemannian manifolds. Math. Z. 213, 311–351 (1993)
18. Popov, G.: Invariants of the length spectrum and spectral invariants for convex planar domains. Commun. Math. Phys. 161, 335–364 (1994)
19. Safarov, Yu., Vassiliev, D.: The asymptotic distribution of eigenvalues of partial differential operators. Translations of Mathematical Monographs, AMS, vol. 155, 1996
20. Sjöstrand, J.: A trace formula and review of some estimates for resonances. In: L. Rodino (ed.) Microlocal analysis and spectral theory. Nato ASI Series C: Mathematical and Physical Sciences, 490, Dordrecht: Kluwer Academic Publishers, 1997, pp. 377–437
21. Sjöstrand, J., Zworski, M.: Complex scaling and the distribution of scattering poles. J. Am. Math. Soc. 4, 729–769 (1991)
22. Sjöstrand, J., Zworski, M.: Lower bounds on the number of scattering poles. Commun. P.D.E. 18 (5–6), 847–857 (1993)
23. Uribe, A., Zelditch, S.: Spectral Statistics on Zoll Surfaces. Commun. Math. Phys. 154, 313–346 (1993)
24. Varchenko, A.: Newton polyhedra and estimation of oscillatory integrals. Funct. Anal. Appl. 10, 175–196 (1976)
25. Vodev, G.: Sharp polynomial bounds of the number of scattering poles for metric perturbations of the Laplacian in $\mathbb{R}^n$. Math. Ann. 291, 39–49 (1991)
26. Zelditch, S.: Kuznecov sum formulae and Szegö limit formulae on manifolds. Comm. Partial Diff. Equations 17, 221–260 (1992)
27. Zworski, M.: Poisson formulae for resonances. Séminaire EDP, École Polytechnique, Mai, 1997

Communicated by B. Simon
Commun. Math. Phys. 196, 385 – 398 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Relations Between the Correlators of the Topological Sigma-Model Coupled to Gravity
M. Kontsevich1, Yu. Manin2
1 IHES, 35 route de Chartres, 91440 Bures-sur-Yvette, France. E-mail: [email protected]
2 MPIM, Gottfried-Claren-Str. 26, 53225 Bonn, Germany. E-mail: [email protected]
Received: 19 September 1997 / Accepted: 4 February 1998
Abstract: We prove a new recursive relation between the correlators hτd1 γ1 . . . τdn γn ig,β , which together with known relations allows one to express all of them through the full system of Gromov–Witten invariants in the sense of Kontsevich–Manin and the intersection indices of tautological classes on M g,n , effectively calculable in view of earlier results due to Mumford, Kontsevich, Getzler, and Faber. This relation shows that a linear change of coordinates of the big phase space transforms the potential with gravitational descendants to another function defined completely in terms of the Gromov–Witten correspondence and the intersection theory on V n × M g,n . We then extend the formalism of gravitational descendants from quantum cohomology to more general Frobenius manifolds and Cohomological Field Theories.
0. Introduction
This note furnishes a list of relations between the correlators of the topological sigma-model coupled to the topological gravity $\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta}$. Here $\gamma_i\in H^*(V)$, where $V$ is a smooth projective algebraic manifold, the target space of the model. These relations allow one to express all the correlators through the following data:
(i) The (full) quantum cohomology of $V$ in the sense of [KM], consisting of the maps $I^V_{g,n,\beta}: H^*(V^n)\to H^*(\overline{M}_{g,n})$.
(ii) The intersection indices of tautological classes on $\overline{M}_{g,n}$, effectively calculable in view of the known results of Mumford, Kontsevich, Getzler, and Faber (cf. [F]).
The correlators in question, for the physical discussion of which we refer to [W1, W2, DijW], as well as to the more recent works [EHX1, EHX2], are polylinear functions on the extended phase space $\oplus_{d=0}^{\infty}H^*(V)[d]$, which is the infinite sum of copies of $H^*(V)$. The elements $\gamma\in H^*(V)$ are called the primary fields, whereas the respective elements $\tau_d\gamma\in H^*(V)[d]$ for $d\ge 1$ are called the gravitational descendants. The mathematical definition of the correlators, spelled out in (1) below, is given in terms of the intersection theory on the moduli stack $\overline{M}_{g,n}(V,\beta)$ of stable maps to $V$.
Our choice of this interpretation of the physical correlators differs from that made in [RT] in the context of symplectic geometry (see the last seven lines of [RT], p. 458) in the following way: Ruan and Tian use the monomials in Chern classes of the tautological bundles on $V^n\times\overline{M}_{g,n}$ (“downstairs”) whereas we use their analogs on $\overline{M}_{g,n}(V,\beta)$ (“upstairs”). The latter classes are not the lifts of the former ones, and the discrepancy between the two is the source of the divergence of the definitions (see Theorem 1.1 and its proof). Ruan’s and Tian’s correlators are called here modified correlators. In the notation of (2) below they are $\langle\tau_{0,e_1}\gamma_1\dots\tau_{0,e_n}\gamma_n\rangle_{g,\beta}$.
As a justification of our interpretation of physicists’ constructions, we may refer to [W2], Subsect. 3c, pp. 275–276. Witten speaks there explicitly about the space of stable maps rather than the space of maps of stable curves, even if the former notion was not mathematically defined before [KM]. More to the point, the Virasoro constraint $L_0e^F=0$ for the standard generating function of the correlators and with the standard choice of the operator $L_0$ (see e.g. [EHX1]) holds for the upstairs correlators but fails for the downstairs (modified) correlators (the standard $L_{-1}$ works for the modified correlators as well). This argument, which was conveyed to us by R. Pandharipande, unambiguously favors our definition.
In [M2] the formula $L_0e^F=0$ was checked in the algebraic geometric context.
The first result of this note consists in establishing the exact relationship between the correlators and the modified correlators. Essentially, they are related by an overall invertible linear transformation $T$ of the extended phase space (cf. Theorem 2.1 and the Remark after it). So it might seem that there is not much point in insisting upon either choice, except for comparison with the physicists’ usage. However, there is a hidden subtlety which is worth looking into more closely. The point is that the natural definition domains of the correlators and the modified correlators are slightly different: the latter ones are directly defined only in the stable range $2g-2+n>0$ because in the unstable range $\overline{M}_{g,n}$ is empty. But the matrix coefficients of $T$ are genus zero two-point correlators and so belong to the unstable range (see (20) below). The correlators can in fact be extended to the unstable range either by passing to $\overline{M}_{g,n}(V,\beta)$, or formally, by using a generalization of the Divisor Axiom of [KM], which by now is of course proved in both algebraic-geometric and symplectic contexts (Lemma 1.4 below). The latter trick is necessary if we want to calculate the operator $T$ itself without appealing to the space of stable maps.
This remark makes it possible to approach the problem of coupling to topological gravity of those theories which do not necessarily come from the topological sigma-models. The largest natural class of such theories in genus zero essentially coincides with that of Frobenius manifolds ([D, M1]), locally given by the solutions of the Witten–Dijkgraaf–Verlinde–Verlinde Associativity Equations (potentials). Coefficients of the Taylor series of the potential in flat coordinates are the genus zero correlators of primary fields. The Second Reconstruction Theorem of [KM] (for the detailed proof see [KMK])
then allows one to construct the modified genus zero correlators with gravitational descendants in the stable range for arbitrary Frobenius manifolds, which solves a part of the coupling problem (see Proposition 3.1). If we insist on non-modified correlators, we have to provide the operator $T$, that is, two-point correlators. But the potential is defined only up to terms of degree $\le 2$. It can be normalized further on a subclass of Frobenius manifolds which we introduce in Sect. 3 and call the manifolds of qc type (for Quantum Cohomology). The additional structure postulated for such manifolds generalizes the Divisor Axiom. This provides the operator $T$ valid for any genus. However, the problem of higher genus correlators for general Frobenius manifolds seems to be wide open. Even if we somehow construct the correlators of the primary fields for higher genus, this would not suffice for the reconstruction of the full-fledged Cohomological Field Theory of [KM] which we and [RT] use to calculate the modified correlators with descendants. For some interesting recent work on the genus 1 and 2 numerical geometry see [Ge1, Ge2, KT, Z]; this might give a clue for generalizations.
Now a few words about the plan of this note. Our main trick consists in introducing the generalized correlators which we denote $\langle\tau_{d_1,e_1}\gamma_1\dots\tau_{d_n,e_n}\gamma_n\rangle_{g,\beta}$ and in deriving for them a general recursion relation. This is the content of Theorem 1.1 which is the central result of Sect. 1. In the remaining part of Sect. 1 we collect some further (and well known) recursion formulas for the reader’s convenience: cf. [W1, W2, DijW, DijVV, EHX1, EHX2]. Taken together, they provide transparent computation algorithms. In Sect. 2 we apply these formulas to the comparison of two generating functions involving the upstairs and downstairs gravitational descendants respectively.
We prove that the two functions are related by an invertible linear transformation $T$ of the big phase space, common for all genera, and defined entirely in terms of two-point genus zero correlators with descendants at one point. This might shed some light on the problem of Virasoro constraints, cf. [EHX1, EHX2]. In fact, any system of differential equations for any generating function of the correlators with descendants serves equally well for the modified correlators after the coordinate change defined by $T$. On the other hand, the transition to the modified correlators almost decouples the $a$-indices from the $d$-indices: cf. formula (27) below. As an example, applying $T$ to the standard Virasoro generators $L_i$ we easily see that $L_{-1}$ remains of the same form, but in $L_0$ the cup product by the canonical class gets replaced by the quantum product. And although the Virasoro constraints $L_n$ look more complicated when written with respect to the downstairs descendants, certain vanishing results are clearer in this picture: for example, in a recent article, Eguchi and Xiong [EX] make use of the vanishing of correlation functions with more than $3g-3+n$ descendants to obtain simple topological recursion relations for the generating functions of the theory.
Finally, in Sect. 3 we extend the new formalism of the gravitational descendants from quantum cohomology to the more general Frobenius manifolds and Cohomological Field Theories as was explained above.
1. Generalized Correlators
1.1. The setting. The mathematical definition of the conventional correlators in the notation of [BM] is
$$\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta} := \int_{J_{g,n}(V,\beta)} c_1(L_{1;g,n}(V,\beta))^{d_1}\cup ev_1^*(\gamma_1)\cup\dots\cup c_1(L_{n;g,n}(V,\beta))^{d_n}\cup ev_n^*(\gamma_n), \tag{1}$$
where $J_{g,n}(V,\beta)\in A_*(\overline{M}_{g,n}(V,\beta))$ is the virtual fundamental class, the line bundle $L_{i;g,n}(V,\beta)$, $i=1,\dots,n$, has the geometric fiber $T^*_{x_i}C$ at the point $[(C,x_1,\dots,x_n,f:C\to V)]$, and $ev_i$ sends this point to $f(x_i)$. Recall also that $\beta$ varies in the semigroup of the effective algebraic classes of $H_2(V,\mathbf{Z})/(\mathrm{tors})$. Put $\psi_i := c_1(L_{i;g,n}(V,\beta))$. In the stable range $2g-2+n>0$ we have the absolute stabilization map $st:\overline{M}_{g,n}(V,\beta)\to\overline{M}_{g,n}$, and the respective bundles $L_i$ on $\overline{M}_{g,n}$. Put $\phi_i := st^*(c_1(L_i))$. Our generalized correlators, by definition, are:
$$\langle\tau_{d_1,e_1}\gamma_1\dots\tau_{d_n,e_n}\gamma_n\rangle_{g,\beta} := \int_{J_{g,n}(V,\beta)} \psi_1^{d_1}\phi_1^{e_1}\cup ev_1^*(\gamma_1)\cup\dots\cup\psi_n^{d_n}\phi_n^{e_n}\cup ev_n^*(\gamma_n). \tag{2}$$
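Since $\overline{M}_{0,2}(V,0)=\emptyset$, we have
$$\langle\tau_{d_1}\gamma_1\,\tau_{d_2}\gamma_2\rangle_{0,0} = 0. \tag{3}$$
Furthermore, in the stable range we have
$$\Big\langle\prod_{i=1}^{n}\tau_{d_i,0}\gamma_i\Big\rangle_{g,\beta} = \Big\langle\prod_{i=1}^{n}\tau_{d_i}\gamma_i\Big\rangle_{g,\beta}.$$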
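Theorem 1.1. If $2g-2+n>0$, then for any $j$ with $d_j\ge 1$ we have
$$\Big\langle\prod_{i=1}^{n}\tau_{d_i,e_i}\gamma_i\Big\rangle_{g,\beta} = \Big\langle\prod_{i=1}^{n}\tau_{d_i-\delta_{ij},\,e_i+\delta_{ij}}\gamma_i\Big\rangle_{g,\beta} + \sum_{a,\ \beta_1+\beta_2=\beta}\pm\langle\tau_{d_j-1}\gamma_j\,\tau_0\Delta^a\rangle_{0,\beta_1}\Big\langle\tau_{0,e_j}\Delta_a\prod_{i:\ i\ne j}\tau_{d_i,e_i}\gamma_i\Big\rangle_{g,\beta_2}. \tag{4}$$
Here $(\Delta_a)$, $(\Delta^a)$ are Poincaré dual bases of $H^*(V)$, and the sign arises from permuting $\gamma_j$ with $\gamma_i$ for all $i<j$.
Corollary to Theorem 1.1. For $g=0$, $n=3$, $d_1\ge 1$ we have:
$$\langle\tau_{d_1}\gamma_1\,\tau_{d_2}\gamma_2\,\tau_{d_3}\gamma_3\rangle_{0,\beta} = \sum_{a,\ \beta_1+\beta_2=\beta}\langle\tau_{d_1-1}\gamma_1\,\tau_0\Delta^a\rangle_{0,\beta_1}\langle\tau_0\Delta_a\,\tau_{d_2}\gamma_2\,\tau_{d_3}\gamma_3\rangle_{0,\beta_2}. \tag{4a}$$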
In fact, $\phi_i=0$ here, so one should put $e_i=0$ in (4), and the first summand will vanish. This is a well known identity.
Sketch of proof. Consider the morphism of universal curves $\widetilde{st}: C_{g,n}(V,\beta)\to C_{g,n}$ covering $st$. It induces the morphism of relative 1-form sheaves $\omega\to\omega(V,\beta)$, at least at the complement of singular points of the fiber. Restricting the latter to the $j$-th section ($j\in S$ being fixed), we get the morphism $st^*(L_{j;g,n})\to L_{j;g,n}(V,\beta)$ on $\overline{M}_{g,n}(V,\beta)$. It is a local isomorphism everywhere except for the points in this stack over which the $j$-th section lies on the component of the fiber which gets contracted by $\widetilde{st}$. These points constitute the union of boundary strata $\overline{M}(V,\sigma(\beta_1,\beta_2))$, where $\sigma(\beta_1,\beta_2)$ is a one-edge, two-vertex $n$-graph with one vertex of genus 0, class $\beta_1$, with tail $j$, and another of genus $g$, class $\beta_2$, with tails $\ne j$. Naively, one would expect that all these boundaries are divisors,
and over them sections of $st^*(L_{j;g,n})$ have an extra zero of the first order. Hence in (2) we could replace one factor $\psi_j$ by $\phi_j+\sum_{\beta_1+\beta_2=\beta}[\overline{M}(V,\sigma(\beta_1,\beta_2))]$. Then the restriction to the boundary would give (4). A more precise reasoning uses the pullback property of the virtual fundamental classes $J(V,\sigma)$ similar to Lemma 10 of [B]. The details will be treated in [M2].
Clearly, these relations allow us to reduce all the generalized correlators (in particular, the conventional ones) to those with $\beta=0$, to the conventional ones in the unstable range and to the generalized ones with all $d_i=0$ in the stable range. Using (2) and the projection formula, one can rewrite the latter in the form
$$\langle\tau_{0,e_1}\gamma_1\dots\tau_{0,e_n}\gamma_n\rangle_{g,\beta} := \int_{I_{g,n}(V,\beta)} c_1(pr_2^*(L_1))^{e_1}\cup pr_1^*(\gamma_1)\cup\dots\cup c_1(pr_2^*(L_n))^{e_n}\cup pr_1^*(\gamma_n),$$
where this time the integration refers to $V^n\times\overline{M}_{g,n}$, $I=(ev,st)_*J$ is the Gromov–Witten correspondence, and $pr_i$ are the two projections. Hence the correlators in the stable range with $d_i=0$ are calculable if we know the full (not just top) Gromov–Witten invariants. We will call the expressions above the modified correlators.
Notice that for $\beta=0$ we have $\psi_i=\phi_i$, hence $\tau_{d,e}=\tau_{d+e}$, so that (4) gives no new information and is tautologically true because of (3). So we will recall what happens in the case $\beta=0$, $\dim V>0$ separately.
1.2. The mapping to a point case. Recall that $\overline{M}_{g,n}(V,0)$ is canonically isomorphic to $\overline{M}_{g,n}\times V$, and with this identification,
$$[\overline{M}_{g,n}(V,0)]^{virt} = J_{g,n}(V,0) = c_G(E\boxtimes T_V)\cap[\overline{M}_{g,n}\times V], \tag{5}$$
where $E=R^1\pi_*\mathcal{O}_C$, $\pi: C\to\overline{M}_{g,n}$ is the universal curve, and $G=g\dim V$. Consider the Chern classes and Chern roots of $E$ and $T_V$:
$$c_t(E) = \prod_{i=1}^{g}(1+a_it) = \sum_{i=0}^{g}(-1)^i\lambda_{i;g,n}t^i,$$
where $\lambda_i$ are Mumford’s tautological classes defined as Chern classes of $\pi_*(\omega_\pi)$,
$$c_t(T_V) = \prod_{j=1}^{\delta}(1+v_jt) = \sum_{j=0}^{\delta}c_j(V)t^j,\qquad \delta=\dim V.$$
Then we get
$$c_G(E\boxtimes T_V) = \prod_{i=1}^{g}\prod_{j=1}^{\delta}(a_i\otimes 1+1\otimes v_j) = \prod_{j=1}^{\delta}\sum_{i=0}^{g}(-1)^i\lambda_{i;g,n}\otimes v_j^{g-i}$$
$$= \sum_{(i_1,\dots,i_\delta)}(-1)^{i_1+\dots+i_\delta}\lambda_{i_1;g,n}\dots\lambda_{i_\delta;g,n}\otimes v_1^{g-i_1}\dots v_\delta^{g-i_\delta}$$
$$= (-1)^G\sum_{0\le i_1\le\dots\le i_\delta\le g}\lambda_{i_1;g,n}\dots\lambda_{i_\delta;g,n}\otimes m_{g-i_1,\dots,g-i_\delta}(c_0(V),\dots,c_\delta(V)). \tag{6}$$
Here $m_{g-i_1,\dots,g-i_\delta}$ is the symmetric function obtained by symmetrization of the obvious monomial in $-v_j$ and expressed via the Chern classes of $V$.
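The second equality in (6) rests on the elementary expansion $\prod_i(a_i+v)=\sum_i e_i(a)\,v^{g-i}$ with $e_i(a)=(-1)^i\lambda_i$. The following is only a numerical sanity check of that expansion, with ordinary integers standing in for the Chern roots; the function names and the sample values are illustrative assumptions, not part of the text.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(roots, k):
    """k-th elementary symmetric polynomial e_k of the given numbers."""
    return sum(prod(c) for c in combinations(roots, k))


def product_over_roots(roots, v):
    """Left-hand side: prod_i (a_i + v)."""
    return prod(a + v for a in roots)


def expansion_in_v(roots, v):
    """Right-hand side: sum_i e_i(a) * v**(g - i), with g = len(roots)."""
    g = len(roots)
    return sum(elementary_symmetric(roots, i) * v ** (g - i) for i in range(g + 1))


a = [2, 3, 5]  # integer stand-ins for the Chern roots a_i (hypothetical values)
v = 7          # integer stand-in for one Chern root v_j (hypothetical value)
lhs = product_over_roots(a, v)
rhs = expansion_in_v(a, v)
```

Here `lhs` and `rhs` agree; taking the product over all $j$ then gives the sum over multi-indices $(i_1,\dots,i_\delta)$ in (6).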
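Furthermore, $L_{i;g,n}(V,0)$ is the lift of $L_{i;g,n}$ with respect to the projection $\overline{M}_{g,n}\times V\to\overline{M}_{g,n}$ and $ev_i$ is the projection $\overline{M}_{g,n}\times V\to V$. Hence we get
$$\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,0} = (-1)^G\sum_{0\le i_1\le\dots\le i_\delta\le g}\int_{\overline{M}_{g,n}}\lambda_{i_1;g,n}\dots\lambda_{i_\delta;g,n}\,\psi_{1;g,n}^{d_1}\dots\psi_{n;g,n}^{d_n}\times\int_V m_{g-i_1,\dots,g-i_\delta}(c_0(V),\dots,c_\delta(V))\,\gamma_1\dots\gamma_n, \tag{7}$$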
where $\psi_{i;g,n}=c_1(L_{i;g,n})$. The generalized correlators give nothing new: $\tau_{d,e}=\tau_{d+e}$. Most of the correlators (7) vanish for dimensional reasons. Here is the list of those that may remain.
Proposition 1.2. The correlators (7) identically vanish except for the following cases:
a) $g=0$, $n\ge 3$, $\sum d_i=n-3$, $\sum|\gamma_i|=2\delta$, where $\gamma\in H^{|\gamma|}(V)$, $\delta=\dim V$:
$$\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{0,0} = \frac{(d_1+\dots+d_n)!}{d_1!\dots d_n!}\int_V\gamma_1\dots\gamma_n. \tag{8}$$
b) $g=1$, $n\ge 1$, $\sum d_i=n$ (resp. $n-1$), $\sum|\gamma_i|=0$ (resp. 2):
$$\langle\tau_{d_1}1\dots\tau_{d_n}1\rangle_{1,0} = \deg c_\delta(V)\int_{\overline{M}_{1,n}}\psi_{1;1,n}^{d_1}\dots\psi_{n;1,n}^{d_n}, \tag{9}$$
$$\langle\tau_{d_1}\gamma\,\tau_{d_2}1\dots\tau_{d_n}1\rangle_{1,0} = -(c_{\delta-1}(V),\gamma)\int_{\overline{M}_{1,n}}\lambda_{1;1,n}\,\psi_{1;1,n}^{d_1}\dots\psi_{n;1,n}^{d_n} \tag{10}$$
for $|\gamma|=2$.
c) $g\ge 2$, $n\ge 0$, $\sum|\gamma_i|/2\le\delta\le 3$, $\sum(d_i+|\gamma_i|/2)=(g-1)(3-\delta)+n$. In particular, the $g\ge 2$, $\beta=0$ correlators vanish for $\dim V\ge 4$.
Proof. First of all, $E=E_{g,n}$ is lifted from $\overline{M}_{\ge 2,0}$, $\overline{M}_{1,1}$ or $\overline{M}_{0,3}$. For $g=0$, $E$ is the zero bundle, and $J_{0,n}(V,0)=[\overline{M}_{0,n}\times V]$. Formula (8) follows from this and from the known expression for the $g=0$, $V=$ a point correlators:
$$\int_{\overline{M}_{0,n}}\psi_{1;0,n}^{d_1}\dots\psi_{n;0,n}^{d_n} = \frac{(d_1+\dots+d_n)!}{d_1!\dots d_n!}. \tag{11}$$
For $g=1$, (6) becomes $c_\delta(E\boxtimes T_V)=c_\delta(V)\otimes 1-c_{\delta-1}(V)\otimes\lambda_{1;1,n}$, from which (9) and (10) follow. Finally, for $g\ge 2$ one sees that the virtual fundamental class can be non-zero only if the virtual dimension for $n=0$ is non-negative, which means that $\dim V\le 3$. The remaining inequalities follow from the dimension matching.
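Formula (11) can be cross-checked by a short computation. The sketch below is not taken from the text: it assumes the standard genus zero string equation $\langle\tau_0\prod_i\tau_{d_i}\rangle_0=\sum_i\langle\tau_{d_i-1}\prod_{j\ne i}\tau_{d_j}\rangle_0$ together with the normalization $\langle\tau_0\tau_0\tau_0\rangle_0=1$, and compares the result with the multinomial coefficient in (11). The function names are hypothetical.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def genus_zero_psi_integral(ds):
    """<tau_{d_1} ... tau_{d_n}>_0 for a point target, via the string equation.

    `ds` must be a sorted tuple of non-negative integers.
    """
    n = len(ds)
    if n < 3 or sum(ds) != n - 3:
        return 0  # vanishes for dimensional reasons
    if n == 3:
        return 1  # <tau_0 tau_0 tau_0>_0 = 1
    # sum(ds) = n - 3 forces some d_i = 0; since ds is sorted, ds[0] == 0.
    rest = ds[1:]
    total = 0
    for k, d in enumerate(rest):
        if d >= 1:
            lowered = rest[:k] + (d - 1,) + rest[k + 1:]
            total += genus_zero_psi_integral(tuple(sorted(lowered)))
    return total


def multinomial(ds):
    """(d_1 + ... + d_n)! / (d_1! ... d_n!), the right-hand side of (11)."""
    out = factorial(sum(ds))
    for d in ds:
        out //= factorial(d)
    return out


value = genus_zero_psi_integral((0, 0, 0, 1, 1, 1))
```

For the sample input the recursion and the closed formula both give $3!=6$.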
One can further specialize (7) and write formulas similar to (8)–(10) separately for curves, surfaces and threefolds, $g\ge 2$.
1.3. Unstable range case. If $2g-2+n\le 0$, we cannot use the absolute stabilization morphism as in Theorem 1.1 and Subsect. 1.2 because $\overline{M}_{g,n}$ is empty, whereas for $\beta\ne 0$, the stack $\overline{M}_{g,n}(V,\beta)$ may well be non-empty. Always assuming this (otherwise the relevant correlators vanish), we will use instead the forgetful morphism $\overline{M}_{g,n+1}(V,\beta)\to\overline{M}_{g,n}(V,\beta)$ to produce recursion.
Proposition 1.3. All the unstable range correlators can be calculated through the genus zero and one primary ($d_i=0$) stable range correlators, and the $\beta=0$ correlators.
Proof. We will be considering the cases $(g,n)=(0,2),(0,1),(0,0),(1,0)$ in this order, reducing each in turn to the previously treated ones.
Lemma 1.4. Let $\gamma_0$ be a divisor class on $V$ or, more generally, a class in $H^2(V)$. Then we have
$$\langle\gamma_0\,\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta} = (\gamma_0,\beta)\,\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta} + \sum_{k:\ d_k\ge 1}\langle\tau_{d_1}\gamma_1\dots\tau_{d_k-1}(\gamma_0\cup\gamma_k)\dots\tau_{d_n}\gamma_n\rangle_{g,\beta}. \tag{12}$$
(We sometimes omit $\tau_0$ in the notation.) This is a generalization of the Divisor Axiom in [KM] following from the properties of $J(V,\beta)$.
To treat the two-point correlators with, say, $d_1>0$, we first use (12) and write for some $\gamma_0$ with $(\gamma_0,\beta)\ne 0$:
$$\langle\tau_{d_1}\gamma_1\,\tau_{d_2}\gamma_2\rangle_{0,\beta} = \frac{1}{(\gamma_0,\beta)}\Big[\langle\gamma_0\,\tau_{d_1}\gamma_1\,\tau_{d_2}\gamma_2\rangle_{0,\beta} - \langle\tau_{d_1-1}(\gamma_0\cup\gamma_1)\,\tau_{d_2}\gamma_2\rangle_{0,\beta} - \langle\tau_{d_1}\gamma_1\,\tau_{d_2-1}(\gamma_0\cup\gamma_2)\rangle_{0,\beta}\Big]. \tag{13}$$
The last two terms in (13) contain only two-point correlators with the smaller sum $d_1+d_2-1$. To the first term we apply the Corollary to Theorem 1.1, that is, (4a):
$$\langle\gamma_0\,\tau_{d_1}\gamma_1\,\tau_{d_2}\gamma_2\rangle_{0,\beta} = \sum_{a,\ \beta_1+\beta_2=\beta}\langle\tau_{d_1-1}\gamma_1\,\Delta^a\rangle_{0,\beta_1}\langle\Delta_a\,\gamma_0\,\tau_{d_2}\gamma_2\rangle_{0,\beta_2}. \tag{14}$$
The right-hand side contains only two-point correlators with the smaller sum $d_1-1$ and three-point correlators with at most one $\tau_d$, $d\ne 0$. If necessary, we can again apply (14) to the three-point correlators there, again reducing the order of the gravitational descendants involved. Iterating this procedure, we will arrive at the expressions containing only primary correlators. Finally, the two-point primary correlators can be reduced to the three-point stable range ones:
$$\langle\gamma_1\gamma_2\rangle_{0,\beta} = \frac{1}{(\gamma_0,\beta)}\langle\gamma_0\gamma_1\gamma_2\rangle_{0,\beta}. \tag{15}$$
For later use, we register the following explicit reduction of some two-point correlators to the three-point ones following from (13):
$$\langle\tau_d\gamma_1\,\tau_0\gamma_2\rangle_{0,\beta} = \sum_{j=1}^{d+1}(-1)^{j+1}(\gamma_0,\beta)^{-j}\langle\gamma_0\,\tau_{d+1-j}\gamma_1\,\tau_0(\gamma_0^{j-1}\cup\gamma_2)\rangle_{0,\beta}. \tag{15a}$$
Clearly, one can invoke (12) in the same way in order to calculate the one-point and zero-point correlators. Alternatively, one can exploit the following identity, called the dilaton equation:
Lemma 1.5. We have
$$\langle\tau_1 1\,\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta} = (2g-2+n)\,\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta}.$$
This again follows from the axioms for $J(V,\beta)$ stated in [BM] and proved in [B].
1.4. Correlators for zero-dimensional V. This case is covered by the Witten–Kontsevich theory and additional relations summarized in [F].
2. Generating Functions on the Big Phase Space
2.1. The big phase space. The conventional gravitational potential is a generating series for the correlators (1) considered as a formal function on the extended phase (super)space $\oplus_{d=0}^{\infty}H^*(V)[d]$. The $d$-th copy of $H^*(V)$ accommodates the $\tau_d\gamma$’s. Thus the symbol $\tau_d$ acquires an independent meaning as the linear operator identifying $H^*(V)=H^*(V)[0]$ with $H^*(V)[d]$ or even shifting each $H^*(V)[e]$ to $H^*(V)[e+d]$ so that we can write $\tau_d=\tau_1^d$.
For convenience choose a basis $\{\Delta_a\mid a=0,\dots,r\}$ of $H^*(V,\mathbf{C})$. Denote by $\{x_{d,a}\}$ the dual coordinates to $\{\tau_d\Delta_a\}$ and by $\Gamma=\sum_{a,d}x_{d,a}\tau_d\Delta_a$ the generic even element of the extended phase superspace. As usual, $x_{d,a}$ has the same $\mathbf{Z}_2$-parity as $\Delta_a$, and the odd coordinates anticommute. The formal functions we will be considering are formal series in weighted variables, where the weight of $x_{d,a}$ is $d$.
We need the universal character $B(V)\to\Lambda: \beta\mapsto q^\beta$ with values in the Novikov ring $\Lambda$ which is the completed semigroup ring of $B(V)$ eventually localized with respect to the multiplicative system $q^\beta$. It is topologically spanned by the monomials $q^\beta=q_1^{b_1}\dots q_m^{b_m}$, where $\beta=(b_1,\dots,b_m)$ in a basis of the numerical class group of 1-cycles, and $(q_1,\dots,q_m)$ are independent formal variables. We will not need the genus expansion parameter because our main formula does not mix genera. We now put formally
$$F_g(x) = \sum_\beta q^\beta\langle e^\Gamma\rangle_{g,\beta} = \sum_\beta q^\beta\sum_n\frac{\langle\Gamma^{\otimes n}\rangle_{g,\beta}}{n!} = \sum_{n,(a_1,d_1),\dots,(a_n,d_n)}\epsilon(a)\frac{x_{d_1,a_1}\dots x_{d_n,a_n}}{n!}\sum_\beta q^\beta\langle\tau_{d_1}\Delta_{a_1}\dots\tau_{d_n}\Delta_{a_n}\rangle_{g,\beta}, \tag{16}$$
where $\epsilon(a)$ is the standard sign in superalgebra. We define $F_g^{st}(x)$ by the same formula in which the last summation is restricted to the stable range of $(g,n)$, that is, $n\ge 3$ for $g=0$ and $n\ge 1$ for $g=1$.
We will introduce the generating function $G_g(x)$ for modified correlators by the same formula as $F_g^{st}$ in which every $\tau_d$ in the stable range correlators is replaced by $\tau_{0,d}$:
$$G_g(x) = \sum_{n,(a_1,d_1),\dots,(a_n,d_n)}\epsilon(a)\frac{x_{d_1,a_1}\dots x_{d_n,a_n}}{n!}\sum_\beta q^\beta\langle\tau_{0,d_1}\Delta_{a_1}\dots\tau_{0,d_n}\Delta_{a_n}\rangle_{g,\beta}. \tag{17}$$
We will prove that the two functions are connected by a linear change of coordinates of the big phase space.
Theorem 2.1. We have for all $g\ge 0$,
$$F_g^{st}(x) = G_g(y), \tag{18}$$
where
$$y_{c,b} = x_{c,b} + \sum_{(a,d),\ d\ge c+1}\ \sum_\beta q^\beta x_{d,a}\langle\tau_{d-c-1}\Delta_a\,\tau_0\Delta^b\rangle_{0,\beta}. \tag{19}$$
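The change of variables (19) is triangular in the gravitational weight: $y_{c,b}$ involves only $x_{d,a}$ with $d\ge c$. A minimal sketch of what this implies, under simplifying assumptions that are not in the text: the weights are truncated to $0,\dots,D$, $H^*(V)$ is replaced by a rank-one toy model, and the list `u` holds made-up numbers standing in for the two-point coefficients in (19). Under these assumptions the map can be inverted by back-substitution from the top weight down.

```python
def apply_T(x, u):
    """y_c = x_c + sum_{d > c} x_d * u[d - c] on weights 0..D (rank-one toy model)."""
    D = len(x) - 1
    return [x[c] + sum(x[d] * u[d - c] for d in range(c + 1, D + 1))
            for c in range(D + 1)]


def apply_T_inverse(y, u):
    """Solve y = T(x) by back-substitution, starting from the top weight D."""
    D = len(y) - 1
    x = [0] * (D + 1)
    for c in range(D, -1, -1):
        x[c] = y[c] - sum(x[d] * u[d - c] for d in range(c + 1, D + 1))
    return x


u = [None, 2, -1, 5]  # hypothetical coefficients u[1..3]; u[0] is never used
x = [1, 4, 0, 3]      # sample coordinates x_0..x_3
y = apply_T(x, u)
x_back = apply_T_inverse(y, u)
```

The round trip recovers `x` exactly, with no division needed, because the diagonal part of the toy map is the identity.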
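Proof. For $d\ge 1$, define the linear operators $U_d: H^*(V,\Lambda)\to H^*(V,\Lambda)$ by the formula
$$U_d(\gamma) := \sum_{a,\beta}q^\beta\langle\tau_{d-1}\gamma\,\tau_0\Delta^a\rangle_{0,\beta}\,\Delta_a \tag{20}$$
and put $U_0(\gamma)=\gamma$. The formula (4) means that in the stable range and for $d\ge 1$ the correlator of any element of the form
$$\tau_{d,e}\gamma - \tau_{d-1,e+1}\gamma - \tau_{0,e}(U_d(\gamma))$$
with any product of other $\tau_{d_i,e_i}\gamma_i$ vanishes; the same is true for $d=0$ by the definition of $U_0$. Hence by induction, in any stable range correlator we can replace any expression $\tau_{d,0}\gamma$ by $\sum_{j=0}^{d}\tau_{0,j}(U_{d-j}(\gamma))$ without changing the value of the correlator. In particular,
$$F_g^{st}(x) = \sum_{n,\beta}\frac{q^\beta}{n!}\Big\langle\prod_{i=1}^{n}\sum_{a_i,d_i}x_{d_i,a_i}\tau_{d_i}\Delta_{a_i}\Big\rangle_{g,\beta}$$
$$= \sum_{n,\beta}\frac{q^\beta}{n!}\Big\langle\prod_{i=1}^{n}\sum_{a_i,d_i}\sum_{j_i=0}^{d_i}x_{d_i,a_i}\tau_{0,j_i}(U_{d_i-j_i}(\Delta_{a_i}))\Big\rangle_{g,\beta}$$
$$= \sum_{n,\beta}\frac{q^\beta}{n!}\Big\langle\prod_{i=1}^{n}\sum_{c_i,b_i}y_{c_i,b_i}\tau_{0,c_i}\Delta_{b_i}\Big\rangle_{g,\beta} = G_g(y).$$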
To obtain the last equality, use (20) in order to represent each sum in the correlator product as a linear combination of terms $\tau_{0,c}\Delta_b$. The straightforward calculation of coefficients furnishes (19).
Remark. The operator $T$ defined by $y=T(x)$ is a linear transformation of the big phase space with coefficients in $\Lambda$ defined entirely in terms of genus zero two-point correlators. It is invertible, because (19) shows that it is the sum of identity and the operator which strictly raises the gravitational weight $c$. Hence we may define the corrected version of $G_g(x)$ by $\widetilde{G}_g(x) := F_g(T^{-1}(x))$. Equivalently, we can extend the modified correlators to the unstable range keeping the natural functional equations. One can also use these formulas in order to give independent meaning to the symbols $\tau_{0,d}$ as linear operators on the infinite sum of the $\Lambda$-modules $H^*(V,\Lambda)[d]$.
2.2. Expressing T through the three-point primary correlators. Formulas (16) and (19) make the following definition natural:
$$\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_g := \sum_\beta q^\beta\langle\tau_{d_1}\gamma_1\dots\tau_{d_n}\gamma_n\rangle_{g,\beta}. \tag{21}$$
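We will write simply $\langle\dots\rangle$ when $g=0$. These correlators are $\Lambda$-polylinear functions on the $\Lambda$-module $\oplus_{d\ge 0}H^*(V,\Lambda)[d]$. Setting $d_2=0$ in (14), multiplying by $q^\beta$ and summing, we obtain:
$$\langle\gamma_0\,\tau_d\gamma_1\,\gamma_2\rangle = \sum_a\langle\tau_{d-1}\gamma_1\,\Delta^a\rangle\langle\Delta_a\,\gamma_0\,\gamma_2\rangle. \tag{22}$$
Put
$$\gamma_0\cdot\gamma_2 := \sum_a\Delta^a\langle\Delta_a\,\gamma_0\,\gamma_2\rangle \tag{23}$$
(this is essentially the product in “small” quantum cohomology where the structure constants are the third derivatives of the genus zero potential restricted to $H^2$). Then we can rewrite (22) as
$$\langle\gamma_0\,\tau_d\gamma_1\,\gamma_2\rangle = \langle\tau_{d-1}\gamma_1\ \gamma_0\cdot\gamma_2\rangle. \tag{24}$$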
Now let $l$ be any linear function on $H_2(V,\Lambda)$. It defines the derivation $\partial_l:\Lambda\to\Lambda$, $\partial_lq^\beta := l(\beta)\,q^\beta$. We extend it to formal series over $\Lambda$ coefficientwise. If $\gamma_0$ is an ample divisor class considered as a linear function on $H_2$, we write $\partial_{\gamma_0}$ for this derivation. Turning now to Eq. (15a), multiply it by $q^\beta$ and sum over all $\beta$. The left-hand side of (15a) vanishes for $\beta=0$, and the right-hand side does not make sense, so we get:
$$\langle\tau_d\gamma_1\,\gamma_2\rangle = \sum_{j=1}^{d+1}(-1)^{j+1}\partial_{\gamma_0}^{-j}\big[\langle\gamma_0\,\tau_{d+1-j}\gamma_1\,\tau_0(\gamma_0^{j-1}\cup\gamma_2)\rangle - \langle\gamma_0\,\tau_{d+1-j}\gamma_1\,\tau_0(\gamma_0^{j-1}\cup\gamma_2)\rangle_{0,0}\big].$$
To interpret this, notice that since $(\gamma_0,\beta)\ne 0$ for all algebraic effective non-zero 2-homology classes on $V$, $\partial_{\gamma_0}^{-1}F$ makes sense for any series $F$ whose coefficients are correlators not involving the $\beta=0$ ones. As the result of this “integration” we take the series again not involving the $\beta=0$ terms. Actually, in view of (8), the $\beta=0$ terms vanish unless $j=d+1$. Separating this summand and replacing the remaining triple correlators with the help of (24), we get the following result.
Proposition 2.2. The matrix coefficients of $T$ can be expressed inductively through the triple primary correlators, that is, Gromov–Witten invariants, of genus zero: for $d\ge 1$,
$$\langle\tau_d\gamma_1\,\gamma_2\rangle = \sum_{j=1}^{d}(-1)^{j+1}\partial_{\gamma_0}^{-j}\langle\tau_{d-j}\gamma_1\ \gamma_0\cdot(\gamma_0^{j-1}\cup\gamma_2)\rangle + (-1)^d\,\partial_{\gamma_0}^{-(d+1)}\big[\langle\gamma_0\,\gamma_1\ \gamma_0^d\cup\gamma_2\rangle - \langle\gamma_0\,\gamma_1\ \gamma_0^d\cup\gamma_2\rangle_{0,0}\big]. \tag{25}$$
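The operators $\partial_{\gamma_0}$ and $\partial_{\gamma_0}^{-1}$ entering (25) act coefficientwise on series in $q^\beta$. The following sketch illustrates this under assumptions of our own: a series is stored as a dictionary from integer tuples $\beta$ to coefficients, $\gamma_0$ is an integer tuple, and $(\gamma_0,\beta)$ is the dot product. The names `pair`, `d_gamma0` and `d_gamma0_inverse` are hypothetical.

```python
from fractions import Fraction


def pair(gamma0, beta):
    """(gamma0, beta) for gamma0 and beta given as integer coordinate tuples."""
    return sum(g * b for g, b in zip(gamma0, beta))


def d_gamma0(series, gamma0):
    """Apply the derivation q^beta -> (gamma0, beta) q^beta coefficientwise."""
    return {beta: pair(gamma0, beta) * c for beta, c in series.items()}


def d_gamma0_inverse(series, gamma0):
    """Inverse derivation, defined on series whose terms have (gamma0, beta) != 0."""
    out = {}
    for beta, c in series.items():
        p = pair(gamma0, beta)
        if p == 0:
            raise ValueError("(gamma0, beta) must be non-zero for every term")
        out[beta] = Fraction(c) / p
    return out


gamma0 = (1, 1)             # hypothetical ample class in coordinates
F = {(1, 0): 3, (1, 2): 5}  # sample series 3 q^(1,0) + 5 q^(1,2), no beta = 0 term
dF = d_gamma0(F, gamma0)
F_back = d_gamma0_inverse(dF, gamma0)
```

A series containing a $\beta=0$ term is rejected by `d_gamma0_inverse`, which mirrors the reason the $\beta=0$ correlators are subtracted before “integrating” in (25).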
3. Coupling of Frobenius Manifolds and Cohomological Field Theories to Topological Gravity
3.1. Coupling of Frobenius manifolds to topological gravity. The restriction $\Phi(x)$ to the small phase space ($x_{d,a}=0$ for $d>0$) of the genus zero potential $F_0(x)$ from (16) satisfies the so-called Associativity Equations and defines on $H^*(V,\Lambda)$ the structure of the formal Frobenius manifold, or the tree level quantum cohomology of $V$. The notion of Frobenius manifold was axiomatized and studied by B. Dubrovin in [D]. There are many interesting examples which do not come from quantum cohomology. In Sect. of [D] Dubrovin sets out to reconstruct the whole potential with gravitational descendants from its small phase space part. Our previous discussion shows how one can do it for quantum cohomology potentials. In this subsection we show how to do this for a wider class of formal Frobenius manifolds which are not supposed to come from quantum cohomology. Our approach differs considerably from that of [D]. It would be important to relate it to the integrable hierarchies as in [D].
We will divide our discussion into two steps. First, we will introduce the modified potential with gravitational descendants which reduces to $G_0(x)$ in the quantum cohomology case. Second, we will discuss the additional conditions needed to define the analog of the linear transformation $T$ and the conventional potential with gravitational descendants $F_0(x) := G_0(T(x))$.
3.1.1. The big phase space and the modified potential. We will use the formalism of Frobenius manifolds as it was presented in [M1]. Let $\Lambda$ be a $\mathbf{Q}$-algebra (playing the role of the Novikov ring), $H$ a free $\mathbf{Z}_2$-graded $\Lambda$-module of finite rank (in the quantum cohomology case $H=H^*(V,\Lambda)$), $\eta$ a symmetric non-degenerate pairing on $H$ replacing the Poincaré form.
To keep intact as much notation as possible, we introduce formally the big phase space as the linear infinite dimensional formal supermanifold $\oplus_{d\ge 0}H[d]$ with basis $\tau_d\Delta_a$ and coordinates $x_{d,a}$ as in Sect. 2 above. Put $x_a=x_{0,a}$, $x=\{x_a\}$. By definition, a Frobenius potential on $(H,\eta)$ is a formal series $\Phi(x)\in\Lambda[[x]]$ whose third derivatives $\Phi_{ab}{}^{c}$ (with one index raised by $\eta$) form the structure constants of the commutative, associative $\Lambda[[x]]$-module spanned by $\partial_a := \partial/\partial x_a$. Finally, any such triple $M=(H,\eta,\Phi)$ is called a formal Frobenius manifold (over $\Lambda$).
The primary correlators of $M$ are by definition the symmetric polylinear functions $H^{\otimes n}\to\Lambda$, $n\ge 3$, whose values on the tensor products of $\tau_0\Delta_a$ are essentially the coefficients of $\Phi$ written as in (16):
$$\Phi(x) = \sum_{n,a_1,\dots,a_n}\epsilon(a)\frac{x_{a_1}\dots x_{a_n}}{n!}\langle\tau_0\Delta_{a_1}\dots\tau_0\Delta_{a_n}\rangle. \tag{26}$$
In the case of quantum cohomology this agrees with our notation (21). Notice that the Associativity Equations do not constrain the terms of $\Phi$ of degree $\le 2$. In this subsection we will use only correlators with $\ge 3$ arguments.
In order to extend the potential $\Phi$ to a formal function on the big phase space which in the quantum cohomology case will coincide with $G_0$, we will use the Second Reconstruction Theorem of [KM], proved in [KMK] and [M1]:
Proposition 3.1. For any Frobenius manifold $M$ as above, there exists a unique sequence of $\Lambda$-linear maps $I_n^M: H^{\otimes n}\to H^*(\overline{M}_{0,n},\Lambda)$, $n\ge 3$, satisfying the following properties:
(i) $S_n$-invariance and compatibility with restriction to boundary divisors (cf. [KM] or [M1], p. 101).
(ii) The top degree term of $I_n^M$ capped with the fundamental class is the correlator of $M$ with $n$ arguments.
Moreover, in the quantum cohomology case
$$I_n^M = \sum_\beta q^\beta I^V_{0,n,\beta},$$
where $I^V_{0,n,\beta}$ are the genus zero Gromov–Witten invariants discussed in [KM].
We now define the modified $M$-correlators with gravitational descendants by
$$\langle\tau_{0,d_1}\Delta_{a_1}\dots\tau_{0,d_n}\Delta_{a_n}\rangle := \int_{[\overline{M}_{0,n}]}I_n^M(\tau_0\Delta_{a_1}\otimes\dots\otimes\tau_0\Delta_{a_n})\,c_1(L_1)^{d_1}\dots c_1(L_n)^{d_n}. \tag{27}$$
Finally put
$$G_0^M(x) = \sum_{n,(a_1,d_1),\dots,(a_n,d_n)}\epsilon(a)\frac{x_{d_1,a_1}\dots x_{d_n,a_n}}{n!}\langle\tau_{0,d_1}\Delta_{a_1}\dots\tau_{0,d_n}\Delta_{a_n}\rangle, \tag{28}$$
where this time $x$ denotes coordinates on the big phase space. Clearly, if $M$ is quantum cohomology, we have reproduced (17).
The expressions (27) are universal polynomials in the coefficients of $\Phi$ and $\eta^{ab}$ depending only on the superrank of $H$ and $(a_i,d_i)$. They can be calculated using some results of [Ka]. To explain this, recall that $H_*(\overline{M}_{0,n})$ is spanned by the classes of the boundary strata $\overline{M}_{0,\tau}$ indexed by trees whose tails are labelled by $\{1,\dots,n\}$. Any cohomology class is uniquely defined by its values on these classes. For $I_n^M$ these values are given in [KMK], (0.7). For $\phi_1^{d_1}\dots\phi_n^{d_n}$ they are products of multinomial coefficients over all vertices of $\tau$: put on each flag $d_i$ if this is a tail with label $i$, 1 otherwise, and divide the factorial of the sum of labels at each vertex by the product of factorials of labels. It remains to calculate the cup product of the described classes. This problem was solved in [Ka]. Admittedly, the explicit formula is rather complicated.
3.1.2. Higher genus case. If $I_n^M=I_{0n}^M$ is extended to a Cohomological Field Theory $I_{gn}^M$, as defined in [KM], one can use the evident version of formula (26) in order to define the modified correlators and functions $G_g(x)$ of any genus. However, unlike the genus zero case, a CohFT cannot be reconstructed only from its primary correlators.
3.2. The operator T on the big phase space. If we want to extrapolate the construction of $T$ from the case of quantum cohomology to more general Frobenius manifolds, we encounter several difficulties. The basic problem is that the inductive formula (25) for the coefficients of $T$ involves some additional structures, not required in the general definition of formal Frobenius manifolds. Namely, we need submodules $H_2$ and $H^2$ in $H$, a semigroup in $H_2$ with indecomposable zero accommodating $\beta$, and the ring $\Lambda$ with derivatives $\partial_{\gamma_0}$. All of these structures must satisfy several conditions, ensuring in particular the independence of the right-hand side of (25) from the choice of $\gamma_0$. The following seems to be the most straightforward way to describe the additional restrictions starting with the more conventional data on $M=(H,\eta,\Phi)$.
(i) Assume that $M$ is endowed with the flat identity $e$ and an Euler vector field $E$, such that $\mathrm{ad}\,E$ is semisimple on $H$. Assume that the spectrum $D$, $(d_a)$ belongs to $\Lambda$ (see [M1], Ch. 1, Sect. 2 for precise definitions).
(ii) Denote by $H^2\subset H$ the submodule of $H$ corresponding to the zero eigenvalue of $\mathrm{ad}\,E$. Assume that it is a free direct submodule. Denote by $H_2\subset H$ the submodule of $H$ corresponding to the eigenvalue $-D$ of $\mathrm{ad}\,E$. Assume that it is a free direct submodule, and that $\eta$ makes $H_2$ strictly dual to $H^2$.
(iii) Assume that a semigroup $B\subset H_2$ with indecomposable zero and finite decomposition is given such that $\Phi(x)$ can be expanded into a formal Fourier series with respect to the part of the coordinates dual to a basis of $H_2$, with coefficients vanishing outside $B$. Denote by $\Psi$ the part corresponding to $\beta\ne 0$. Assume finally that $\Phi=\Psi+c$, where $c$ is a cubic form, $E\Psi=(D+d_0)\Psi$ (without additional terms of degree $\le 2$, cf. [M1], Ch. 1, (2.7)) and $E_1c=(D+d_0)c$, where $E_1$ is the projection of $E$ to the orthogonal complement to $H^2$.
These structures allow us to imitate the constructions of Sect. 2, starting with the $\beta$-decomposition of the primary correlators, and to define $T$ via (25). For more details, see [M3], Sect. 1. Notice that the cup product on $H$ and the $\langle\dots\rangle_{0,0}$ correlators are defined using the constant terms of the relevant Fourier decomposition. The independence of (25) from the choice of $\gamma_0$ follows from the postulated properties.
Acknowledgement. One of the authors (Yu.M.) is thankful to Ezra Getzler for stimulating discussions which prompted him to focus on this part of a larger project [M2], and to Kai Behrend for his explanations about the proof of Lemma 10 in [B] and the restriction formula for virtual fundamental classes. We also thank Ezra Getzler and Rahul Pandharipande for the discussion which allowed us to follow the suggestion of a referee and to clarify the relation of our results to those in the physical literature.
References
[B] Behrend, K.: Gromov–Witten invariants in algebraic geometry. Inv. Math. 127, 601–617 (1997)
[BF] Behrend, K., Fantechi, B.: The intrinsic normal cone. Inv. Math. 128, 45–88 (1997)
[BM] Behrend, K., Manin, Yu.: Stacks of stable maps and Gromov–Witten invariants. Duke Math. J. 85:1, 1–60 (1996)
[DijVV] Dijkgraaf, R., Verlinde, E., Verlinde, H.: Loop equations and Virasoro constraints in non-perturbative two-dimensional quantum gravity. Nucl. Phys. B348, 435–456 (1991)
[DijW] Dijkgraaf, R., Witten, E.: Mean field theory, topological field theory, and multimatrix models. Nucl. Phys. B342, 486–522 (1990)
[D] Dubrovin, B.: Geometry of 2D topological field theories. In: Springer LNM 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996, pp. 120–348
[EHX1] Eguchi, T., Hori, K., Xiong, Ch.-Sh.: Gravitational quantum cohomology. Preprint UT-753, hep-th/9605225
[EHX2] Eguchi, T., Hori, K., Xiong, Ch.-Sh.: Quantum cohomology and Virasoro algebra. Preprint UT-769, hep-th/9703086
[EX] Eguchi, T., Xiong, Ch.-Sh.: Quantum Cohomology at Higher Genus: Topological Recursion Relations and Virasoro Conditions. Preprint hep-th/9801010
[F] Faber, C.: Algorithms for computing intersection numbers on moduli spaces of curves, with an application to the class of the locus of Jacobians. Preprint, 1997
[Ge1] Getzler, E.: Intersection theory on $\overline{M}_{1,4}$ and elliptic Gromov–Witten invariants. J. Am. Math. Soc. 10, 973–998 (1997), alg-geom/9612004
[Ge2] Getzler, E.: Topological recursion relations in genus 2. Preprint math.AG/9801003
[KT] Kabanov, A., Kimura, T.: Intersection numbers and rank one cohomological field theories in genus one. Preprint alg-geom/9706003
[Ka] Kaufmann, R.: The intersection form in $H^*(\overline{M}_{0n})$ and the explicit Künneth formula in quantum cohomology. Int. Math. Res. Notices 19, 929–952 (1996)
[KM] Kontsevich, M., Manin, Yu.: Gromov–Witten classes, quantum cohomology, and enumerative geometry. Commun. Math. Phys. 164:3, 525–562 (1994)
[KMK] Kontsevich, M., Manin, Yu. (with Appendix by R. Kaufmann): Quantum cohomology of a product. Inv. Math. 124:1–3, 313–340 (1996)
[M1] Manin, Yu.: Frobenius manifolds, quantum cohomology, and moduli spaces (Chapters I, II, III). Preprint MPI 96–113, 1996
[M2] Manin, Yu.: Algebraic geometric introduction to the gravitational quantum cohomology. In preparation
[M3] Manin, Yu.: Three constructions of Frobenius manifolds: A comparative study. Preprint MPIM, 1998 (10), math.QA/9801006
[RT] Ruan, Y., Tian, G.: Higher genus symplectic invariants and sigma models coupled with gravity. Inv. Math. 130, 455–516 (1997), alg-geom/9601005
[W1] Witten, E.: On the structure of the topological phase of two-dimensional gravity. Nucl. Phys. B340, 281–332 (1990)
[W2] Witten, E.: Two-dimensional gravity and intersection theory on moduli space. Surveys in Diff. Geometry 1, 243–310 (1991)
[Z] Zograf, P.: Weil–Petersson volumes of low genus moduli spaces. Preprint, 1998
Communicated by A. Jaffe
Commun. Math. Phys. 196, 399 – 410 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Systems of Correlators and Solutions of the WDVV Equations
Sergei Natanzon¹, Vladimir Turaev²
¹ Independent University of Moscow, B. Vlas'evskii per. 11, 121001 Moscow, Russia. E-mail: [email protected]
² Institut de Recherche Mathématique Avancée, Université Louis Pasteur – C.N.R.S., 7 rue René Descartes, F-67084 Strasbourg, France. E-mail: [email protected]
Received: 16 October 1997 / Accepted: 5 February 1998
Abstract: We construct new solutions of the Witten–Dijkgraaf–E. Verlinde–H. Verlinde equations.
Introduction In 1990 E. Witten [6] and R. Dijkgraaf, E. Verlinde, H. Verlinde [1] introduced a remarkable system of partial differential equations (the WDVV equations) determining deformations of 2-dimensional topological field theories. The WDVV equations consist of three groups of differential equations called the associativity equations, the normalisation equations, and the quasihomogeneity equations. B. Dubrovin [2] observed that solutions of these equations lead to new deep differential-geometrical structures on manifolds. He found that such structures naturally arise in various branches of mathematics including algebraic geometry, integrable systems, Coxeter groups, quantum cohomology, etc. The theory of the WDVV equations is closely related to the theory of Frobenius algebras. Namely, every solution of the associativity equations gives rise to a family of Frobenius algebras. In this paper we show that conversely every Frobenius algebra A over C gives rise in a purely algebraic way to a family of holomorphic solutions of the associativity equations. More precisely, we construct an infinite family of holomorphic functions A → C satisfying the associativity equations with respect to any choice of linear coordinates on A. If A has a unity 1 = 1A , then this family is parametrised by infinite sequences of elements of A. If A has no unity, then this family is parametrised by infinite sequences of A-linear homomorphisms A → A. The functions A → C constructed here in general do not satisfy the normalisation equations. We show how to modify them in order to satisfy these equations as well. We show finally that if the Frobenius algebra is graded, then the functions constructed in this paper are homogeneous and thus satisfy all the WDVV equations. Examples of
graded Frobenius algebras are provided by cohomology of closed oriented manifolds; in this case our methods yield polynomial solutions of the WDVV equations. Note that our solutions are in general different from those appearing in the theory of quantum cohomology, see [4]. It should be stressed that our aim here is not to recover the quantum cohomology but rather to study solutions of WDVV arising in an algebraic way via the theory of Frobenius algebras. The constructions of this paper can be easily generalised to produce solutions of the super-WDVV equations from Frobenius super-algebras [3]. We study solutions of WDVV in terms of the coefficients of their Taylor expansions, called correlators (cf. [2]). Section 1 is concerned with preliminaries on the associativity equations and correlators. In Sect. 2 we derive from a Frobenius algebra A a family of holomorphic solutions of the associativity equations. In Sect. 3 we show how to modify these solutions in order to satisfy the normalisation equations. In Sect. 4 we discuss the quasihomogeneity equations and consider examples of normalized quasihomogeneous systems of correlators related to the ordinary and quantum cohomology of manifolds.
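1. Frobenius Algebras, Associativity Equations and Correlators
Frobenius algebras. By a Frobenius algebra over $\mathbb{C}$ we mean a commutative associative finite dimensional algebra $A$ over $\mathbb{C}$ provided with a non-degenerate symmetric bilinear form $\eta : A \times A \to \mathbb{C}$ such that $\eta(ab, c) = \eta(a, bc)$ for any $a, b, c \in A$. The form $\eta$ is called the inner product on $A$. If $A$ has a unity $1 = 1_A$, then the inner product is determined by the linear functional $a \mapsto \eta(a, 1) : A \to \mathbb{C}$ because $\eta(a, b) = \eta(ab, 1)$.
The associativity equations. Let $(\eta^{\lambda\mu})_{\lambda,\mu=1}^N$ be a non-degenerate symmetric $N \times N$ matrix over $\mathbb{C}$. Let $F = F(t_1, \ldots, t_N)$ be a $\mathbb{C}$-valued function on $\mathbb{C}^N$, where $t_1, \ldots, t_N$ are the coordinates on $\mathbb{C}^N$. The associativity equations (associated with $(\eta^{\lambda\mu})$) are the following nonlinear partial differential equations:

$$\sum_{\lambda,\mu=1}^N \frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \sum_{\lambda,\mu=1}^N \frac{\partial^3 F}{\partial t_\beta \partial t_\gamma \partial t_\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F}{\partial t_\alpha \partial t_\delta \partial t_\mu}, \tag{1.1}$$

where $\alpha, \beta, \gamma, \delta$ run over $\{1, \ldots, N\}$. Every solution $F$ of (1.1) gives rise to a family of Frobenius algebras $\{A_t = A_t(F)\}$ parametrized by $t \in \mathbb{C}^N$. Let $e_1, \ldots, e_N$ be the canonical basis of $\mathbb{C}^N$ (determining the coordinates $t_1, \ldots, t_N$). Then $A_t = \mathbb{C}^N$ with multiplication

$$e_\alpha \cdot_{F,t} e_\beta = \sum_{\lambda,\mu=1}^N \frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\lambda}(t)\, \eta^{\lambda\mu}\, e_\mu. \tag{1.2}$$
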
It is clear that this multiplication is commutative. Equation (1.1) guarantees its associativity. The inner product in $A_t$ is given by the matrix $(\eta_{\lambda\mu}) = (\eta^{\lambda\mu})^{-1}$. (The inner product does not depend on $t$.) The algebra $A_0(F)$ at $t = 0$ will be especially important in the sequel.
Systems of correlators. For $m \geq 2$, denote by $S_m$ the group of permutations $\{1, \ldots, m\} \to \{1, \ldots, m\}$. By a system of correlators (associated with $(\eta^{\lambda\mu})$), we mean a function assigning to every sequence $i_1, \ldots, i_m \in \{1, \ldots, N\}$ (with $m \geq 3$) a number $\langle i_1, \ldots, i_m \rangle \in \mathbb{C}$ such that $\langle i_{\sigma(1)}, \ldots, i_{\sigma(m)} \rangle = \langle i_1, \ldots, i_m \rangle$ for any $\sigma \in S_m$ and

$$\sum_{\sigma \in S_m} \sum_{k=0}^m \sum_{\lambda,\mu=1}^N \langle \alpha, \beta, i_{\sigma(1)}, \ldots, i_{\sigma(k)}, \lambda \rangle \langle \gamma, \delta, i_{\sigma(k+1)}, \ldots, i_{\sigma(m)}, \mu \rangle\, \eta^{\lambda\mu} = \sum_{\sigma \in S_m} \sum_{k=0}^m \sum_{\lambda,\mu=1}^N \langle \beta, \gamma, i_{\sigma(1)}, \ldots, i_{\sigma(k)}, \lambda \rangle \langle \alpha, \delta, i_{\sigma(k+1)}, \ldots, i_{\sigma(m)}, \mu \rangle\, \eta^{\lambda\mu} \tag{1.3}$$

for any $\alpha, \beta, \gamma, \delta, i_1, \ldots, i_m \in \{1, \ldots, N\}$. A system of correlators as above determines a Frobenius algebra $A$ with basis $e_1, \ldots, e_N$, multiplication

$$e_\alpha e_\beta = \sum_{\lambda,\mu=1}^N \langle \alpha, \beta, \lambda \rangle\, \eta^{\lambda\mu}\, e_\mu$$

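and inner product $\eta : A \times A \to \mathbb{C}$ such that $(\eta(e_\lambda, e_\mu))$ is the inverse matrix to $(\eta^{\lambda\mu})$.
Theorem 1.1. Let $F = F(t_1, \ldots, t_N)$ be a holomorphic $\mathbb{C}$-valued function on $\mathbb{C}^N$ with Taylor series

$$F(t_1, \ldots, t_N) = \sum_{m=3}^\infty \frac{1}{m!} \sum_{i_1,\ldots,i_m=1}^N a_{i_1 \ldots i_m}\, t_{i_1} \cdots t_{i_m} \tag{1.4}$$

whose coefficients $a_{i_1 \ldots i_m}$ are invariant under any permutations of the indices: $a_{i_{\sigma(1)} \ldots i_{\sigma(m)}} = a_{i_1 \ldots i_m}$ for any $\sigma \in S_m$. The function $F$ satisfies Eqs. (1.1) associated with $(\eta^{\lambda\mu})$ if and only if the numbers

$$\langle i_1, \ldots, i_m \rangle = \frac{1}{(m-3)!}\, a_{i_1 \ldots i_m}$$

form a system of correlators associated with $(\eta^{\lambda\mu})$. If it is the case, then the Frobenius algebra $A_0(F)$ coincides with the algebra determined by the system of correlators $\{\langle i_1, \ldots, i_m \rangle \in \mathbb{C}\}$.
Proof. A direct calculation shows that

$$\frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\gamma} = \sum_{m=0}^\infty \frac{1}{m!} \sum_{i_1,\ldots,i_m=1}^N a_{\alpha\beta\gamma i_1 \ldots i_m}\, t_{i_1} \cdots t_{i_m} = \sum_{m=0}^\infty \sum_{i_1,\ldots,i_m=1}^N \langle \alpha, \beta, \gamma, i_1, \ldots, i_m \rangle\, t_{i_1} \cdots t_{i_m}. \tag{1.5}$$

Thus

$$\sum_{\lambda,\mu=1}^N \frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \sum_{k,\ell \geq 0}\ \sum_{i_1,\ldots,i_k,j_1,\ldots,j_\ell,\lambda,\mu=1}^N \langle \alpha, \beta, \lambda, i_1, \ldots, i_k \rangle \langle \gamma, \delta, \mu, j_1, \ldots, j_\ell \rangle\, \eta^{\lambda\mu}\, t_{i_1} \cdots t_{i_k} t_{j_1} \cdots t_{j_\ell}.$$

Exchanging $\alpha$ and $\gamma$ we obtain a similar expression for the right-hand side of (1.1). Now, it is easy to see that (1.1) is equivalent to (1.3). To prove the last claim of the theorem, note that

$$e_\alpha \cdot_{F,0} e_\beta = \sum_{\lambda,\mu=1}^N \frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\lambda}(0)\, \eta^{\lambda\mu}\, e_\mu = \sum_{\lambda,\mu=1}^N \langle \alpha, \beta, \lambda \rangle\, \eta^{\lambda\mu}\, e_\mu = e_\alpha e_\beta.$$
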
Corollary 1.2. Any holomorphic solution of the associativity equations on $\mathbb{C}^N$ has the form

$$F(t_1, \ldots, t_N) = f + \sum_{m=3}^\infty \frac{1}{m(m-1)(m-2)} \sum_{i_1,\ldots,i_m=1}^N \langle i_1, \ldots, i_m \rangle\, t_{i_1} \cdots t_{i_m},$$

where $f$ is a quadratic polynomial in $t_1, \ldots, t_N$ and $\{\langle i_1, \ldots, i_m \rangle \in \mathbb{C}\}$ is a system of correlators.
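The associativity equations (1.1) can be checked mechanically for a polynomial $F$. The following is a minimal sketch, not part of the original paper: polynomials are stored as dictionaries from exponent tuples to exact coefficients, and the helper names (`diff`, `mul`, `satisfies_associativity`) as well as the two test functions are illustrative assumptions. `F_GOOD` is the function obtained from the algebra $\mathbb{C}[x]/(x^3)$ with basis $1, x, x^2$ and inner product "coefficient of $x^2$ in $ab$", using only cubic and quartic terms; `F_BAD` is an arbitrary cubic that violates (1.1).

```python
from fractions import Fraction
from itertools import product

# A polynomial in t_1..t_N is a dict {exponent tuple: coefficient}.

def diff(p, i):
    """Partial derivative of p with respect to the i-th variable (0-based)."""
    out = {}
    for exps, c in p.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] = c * exps[i]
    return out

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (ep, cp), (eq, cq) in product(p.items(), q.items()):
        e = tuple(a + b for a, b in zip(ep, eq))
        out[e] = out.get(e, 0) + cp * cq
    return out

def add_scaled(p, q, s):
    """Return p + s*q with zero coefficients dropped."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + s * c
    return {e: c for e, c in out.items() if c != 0}

def third(F, a, b, c):
    """Third partial derivative of F in the variables a, b, c."""
    return diff(diff(diff(F, a), b), c)

def satisfies_associativity(F, eta_up):
    """Check Eq. (1.1) for every choice of alpha, beta, gamma, delta."""
    n = len(eta_up)
    for a, b, c, d in product(range(n), repeat=4):
        defect = {}  # left-hand side minus right-hand side of (1.1)
        for l, m in product(range(n), repeat=2):
            if eta_up[l][m] == 0:
                continue
            lhs = mul(third(F, a, b, l), third(F, c, d, m))
            rhs = mul(third(F, b, c, l), third(F, a, d, m))
            defect = add_scaled(defect, lhs, eta_up[l][m])
            defect = add_scaled(defect, rhs, -eta_up[l][m])
        if defect:
            return False
    return True

# Matrix (eta^{lambda mu}) for C[x]/(x^3); it equals its own inverse.
ETA_UP = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]

# F = t1^2 t3/2 + t1 t2^2/2 + t1^3 t3/6 + t1^2 t2^2/4 (assumed example)
F_GOOD = {(2, 0, 1): Fraction(1, 2), (1, 2, 0): Fraction(1, 2),
          (3, 0, 1): Fraction(1, 6), (2, 2, 0): Fraction(1, 4)}

# F = t1^3/6 + t3^3/6, which is not a solution for this eta
F_BAD = {(3, 0, 0): Fraction(1, 6), (0, 0, 3): Fraction(1, 6)}
```

Calling `satisfies_associativity(F_GOOD, ETA_UP)` confirms the equations for the first function, while `F_BAD` fails already for $\alpha = \beta = 1$, $\gamma = \delta = 3$.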
2. Holomorphic Deformations of Frobenius Algebras
A-linear homomorphisms. Let $A$ be an $N$-dimensional Frobenius algebra over $\mathbb{C}$ (possibly without unity) with inner product $\eta : A \times A \to \mathbb{C}$. A $\mathbb{C}$-linear homomorphism $f : A \to A$ is said to be $A$-linear if $f(ab) = a f(b)$ for any $a, b \in A$. Note that for such $f$, we have $a f(b) = f(ab) = f(ba) = b f(a)$. The set of $A$-linear homomorphisms $A \to A$ will be denoted by End$(A)$. This is a finite-dimensional complex vector space. For example, multiplication by a complex number and multiplication by an element of $A$ are $A$-linear homomorphisms $A \to A$. If $A$ has a unity 1, then any $A$-linear homomorphism $f : A \to A$ is multiplication by $f(1)$, so that End$(A) = A$.
A set $\{f_i\}_i$ of $A$-linear homomorphisms $A \to A$ is said to be bounded if the absolute values of the matrix coefficients of $\{f_i\}_i$ (with respect to a basis of $A$ over $\mathbb{C}$) are bounded from above by a positive constant. The boundedness does not depend on the choice of a basis in $A$. An equivalent definition can be given in purely topological terms: a subset of End$(A)$ is bounded if it lies in a compact subset of End$(A)$. For instance, any finite subset of End$(A)$ is bounded. The following theorem is the main result of this paper.
Theorem 2.1. Let $f_3, f_4, \ldots$ be a bounded sequence of $A$-linear operators $A \to A$. For $t \in A$, the series

$$F(t) = \sum_{m=3}^\infty \frac{1}{m!}\, \eta(t^{m-1}, f_m(t)) \tag{2.1}$$

is absolutely convergent and defines a holomorphic function $F : A \to \mathbb{C}$. This function satisfies the associativity equations (1.1) for the linear coordinates $t_1, \ldots, t_N$ on $A$ determined by an arbitrary basis $e_1, \ldots, e_N$ of $A$ and the matrix $(\eta^{\lambda\mu})$ inverse to $(\eta_{\lambda\mu}) = (\eta(e_\lambda, e_\mu))_{\lambda,\mu=1}^N$. If $f_3 = \mathrm{id}_A$ then $A_0(F) = A$.
As $f_3, f_4, \ldots$ we can take multiplications by elements $a_3, a_4, \ldots$ of $A$. If these elements lie in a compact subset of $A$ (i.e., if their coordinates with respect to a basis of $A$ are bounded from above and from below), then $f_3, f_4, \ldots$ is a bounded sequence in End$(A)$ and Theorem 2.1 provides a holomorphic solution of the associativity equations. Taking as $f_3, f_4, \ldots$ multiplications by complex numbers, we obtain the following corollary.
Corollary 2.2. Let $k_3, k_4, \ldots$ be a bounded sequence of complex numbers. For $t \in A$, the series

$$F(t) = \sum_{m=3}^\infty \frac{k_m}{m!}\, \eta(t^{m-1}, t)$$

is absolutely convergent and defines a holomorphic function $F : A \to \mathbb{C}$. This function satisfies the associativity Eqs. (1.1) for the linear coordinates $t_1, \ldots, t_N$ on $A$ determined by an arbitrary basis $e_1, \ldots, e_N$ of $A$ and the matrix $(\eta^{\lambda\mu})$ inverse to $(\eta_{\lambda\mu}) = (\eta(e_\lambda, e_\mu))_{\lambda,\mu=1}^N$. If $k_3 = 1$ then $A_0(F) = A$.
The remaining part of Sect. 2 is concerned with the proof of Theorem 2.1. We begin with two lemmas.
Lemma 2.3. For any $f, g \in$ End$(A)$ and $a, b, c, d \in A$,

$$\eta(ab, f(c)) = \eta(f(ab), c), \tag{2.2}$$
$$\eta(ab, f(c)) = \eta(bc, f(a)), \tag{2.3}$$
$$\eta(f(ab), g(cd)) = \eta(f(bc), g(ad)). \tag{2.4}$$

Proof. The proof amounts to a direct application of definitions using the commutativity of multiplication, the identity $\eta(ab, c) = \eta(a, bc)$ and the $A$-linearity of $f, g$. Proof of (2.2) and (2.3): $\eta(ab, f(c)) = \eta(b, a f(c)) = \eta(b, c f(a)) = \eta(bc, f(a)) = \eta(c, b f(a)) = \eta(c, f(ab)) = \eta(f(ab), c)$. Proof of (2.4): $\eta(f(ab), g(cd)) = \eta(a f(b), c g(d)) = \eta(c f(b), a g(d)) = \eta(f(bc), g(ad))$.
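The identities of Lemma 2.3 are easy to test on a small example. The sketch below is an illustration under stated assumptions, not part of the original text: it takes $A = \mathbb{C}[x]/(x^3)$ with the inner product "coefficient of $x^2$ in $ab$" and uses multiplications by fixed elements as $A$-linear maps. Since all maps involved are linear, it is enough to test basis vectors. The function names are hypothetical.

```python
from itertools import product

DIM = 3  # A = C[x]/(x^3); an element is a coefficient list [a0, a1, a2]
BASIS = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def mult(a, b):
    """Product in C[x]/(x^3): multiply and drop powers x^3 and higher."""
    out = [0] * DIM
    for i, j in product(range(DIM), repeat=2):
        if i + j < DIM:
            out[i + j] += a[i] * b[j]
    return out

def eta(a, b):
    """Assumed inner product: the coefficient of x^2 in the product ab."""
    return mult(a, b)[DIM - 1]

def mult_by(w):
    """Multiplication by a fixed element w, an A-linear map A -> A."""
    return lambda a: mult(w, a)

def satisfies_lemma_2_3(f, g):
    """Test (2.2), (2.3), (2.4) on all basis vectors (f, g assumed linear)."""
    for a, b, c, d in product(BASIS, repeat=4):
        ab, bc, cd, ad = mult(a, b), mult(b, c), mult(c, d), mult(a, d)
        if eta(ab, f(c)) != eta(f(ab), c):            # (2.2)
            return False
        if eta(ab, f(c)) != eta(bc, f(a)):            # (2.3)
            return False
        if eta(f(ab), g(cd)) != eta(f(bc), g(ad)):    # (2.4)
            return False
    return True

def reverse(a):
    """A linear map that is not A-linear: it reverses the coefficients."""
    return a[::-1]
```

Multiplication operators such as `mult_by([2, -1, 3])` pass the test, whereas `reverse` violates (2.3) for $a = 1$, $b = c = x$, which shows that $A$-linearity is really used in the proof.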
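Lemma 2.4. Let $f_3, f_4, \ldots$ be $A$-linear operators $A \to A$. Let $e_1, \ldots, e_N$ be a basis of $A$ and $(\eta^{\lambda\mu})$ be the matrix inverse to $(\eta_{\lambda\mu}) = (\eta(e_\lambda, e_\mu))_{\lambda,\mu=1}^N$. For any $i_1, \ldots, i_m \in \{1, \ldots, N\}$ with $m \geq 3$, set $\langle i_1, \ldots, i_m \rangle = \eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m})) \in \mathbb{C}$. Then:
(i) we have the identity

$$\sum_{\lambda,\mu=1}^N \langle i_1, \ldots, i_s, \lambda \rangle \langle i_{s+1}, \ldots, i_m, \mu \rangle\, \eta^{\lambda\mu} = \eta(f_s(e_{i_1} \cdots e_{i_s}), f_{m-s}(e_{i_{s+1}} \cdots e_{i_m})); \tag{2.5}$$

(ii) the function $(i_1, \ldots, i_m) \mapsto \langle i_1, \ldots, i_m \rangle$ is a system of correlators.
Proof. It follows from (2.2) that for $m \geq 3$,

$$\eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m})) = \eta(f_m(e_{i_1} \cdots e_{i_{m-1}}), e_{i_m}). \tag{2.6}$$

Consider the expansion

$$f_m(e_{i_1} \cdots e_{i_{m-1}}) = \sum_{j=1}^N \varphi^j_{i_1 \ldots i_{m-1}}\, e_j,$$

where $\varphi^j_{i_1 \ldots i_{m-1}} \in \mathbb{C}$. Then

$$\langle i_1, \ldots, i_m \rangle = \sum_{j=1}^N \varphi^j_{i_1 \ldots i_{m-1}}\, \eta_{j i_m} = \sum_{j=1}^N \varphi^j_{i_1 \ldots i_{m-1}}\, \eta_{i_m j}.$$

This implies the first claim of the lemma:

$$\sum_{\lambda,\mu=1}^N \langle i_1, \ldots, i_s, \lambda \rangle \langle i_{s+1}, \ldots, i_m, \mu \rangle\, \eta^{\lambda\mu} = \sum_{j,k,\lambda,\mu=1}^N \varphi^j_{i_1 \ldots i_s}\, \eta_{j\lambda}\, \varphi^k_{i_{s+1} \ldots i_m}\, \eta_{\mu k}\, \eta^{\lambda\mu} = \sum_{j,k=1}^N \varphi^j_{i_1 \ldots i_s}\, \varphi^k_{i_{s+1} \ldots i_m}\, \eta_{jk}$$
$$= \eta\Big(\sum_j \varphi^j_{i_1 \ldots i_s} e_j,\ \sum_k \varphi^k_{i_{s+1} \ldots i_m} e_k\Big) = \eta(f_s(e_{i_1} \cdots e_{i_s}), f_{m-s}(e_{i_{s+1}} \cdots e_{i_m})).$$

The symmetry of $\langle i_1, \ldots, i_m \rangle$ under permutations follows from (2.3). To verify the second equation in the definition of correlators, we apply (2.5) to both sides. After that, it suffices to prove that the tensor $\eta(f_s(e_{i_1} \cdots e_{i_s}), f_{m-s}(e_{i_{s+1}} \cdots e_{i_m}))$ is invariant under arbitrary permutations of the indices $i_1, \ldots, i_m$. This follows from (2.4).
Proof of Theorem 2.1. Fix a basis $e_1, \ldots, e_N$ of $A$ and denote by $t_1, \ldots, t_N$ the corresponding linear coordinates on $A$. Observe that for $t = t_1 e_1 + \ldots + t_N e_N \in A$,

$$t^{m-1} = \sum_{i_1,\ldots,i_{m-1}=1}^N t_{i_1} \cdots t_{i_{m-1}}\, e_{i_1} \cdots e_{i_{m-1}}$$

so that

$$\eta(t^{m-1}, f_m(t)) = \sum_{i_1,\ldots,i_{m-1}=1}^N t_{i_1} \cdots t_{i_{m-1}}\, \eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(t_1 e_1 + \ldots + t_N e_N)) = \sum_{i_1,\ldots,i_m=1}^N t_{i_1} \cdots t_{i_m}\, \eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m})).$$

Thus,

$$F(t_1, \ldots, t_N) = \sum_{m=3}^\infty \frac{1}{m!} \sum_{i_1,\ldots,i_m=1}^N \eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m}))\, t_{i_1} \cdots t_{i_m}. \tag{2.7}$$
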
Let $M$ be a positive real number majorating $|\eta^{\lambda\mu}|$, $|\eta_{\lambda\mu}|$ for all $\lambda, \mu = 1, \ldots, N$. Applying Lemma 2.4 (i) to the sequence $f_3 = f_4 = \ldots = \mathrm{id}_A$ we obtain

$$\sum_{\lambda,\mu=1}^N \eta(e_{i_1} \cdots e_{i_s}, e_\lambda)\, \eta(e_{i_{s+1}} \cdots e_{i_m}, e_\mu)\, \eta^{\lambda\mu} = \eta(e_{i_1} \cdots e_{i_s}, e_{i_{s+1}} \cdots e_{i_m}) = \eta(e_{i_1} \cdots e_{i_{m-1}}, e_{i_m}).$$

Using this formula and an easy induction (beginning with $m = 2$) we obtain that $|\eta(e_{i_1} \cdots e_{i_{m-1}}, e_{i_m})| < M (MN)^{m-2}$. Assume that $M$ majorates also the matrix coefficients of the operators $f_3, f_4, \ldots$ and that $M > 1$. If $(a^j_i)$ is the matrix of $f_m$ then

$$\eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m})) = \sum_{j=1}^N a^j_{i_m}\, \eta(e_{i_1} \cdots e_{i_{m-1}}, e_j).$$

Therefore

$$|\eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m}))| < M (MN)^{m-1} < (MN)^m. \tag{2.8}$$

It is clear that

$$\prod_{i=1}^N \exp(NM t_i) = \sum_{m=0}^\infty \frac{1}{m!} \sum_{i_1,\ldots,i_m=1}^N (MN)^m\, t_{i_1} \cdots t_{i_m}.$$

The convergence of this series and (2.8) imply the convergence of the series (2.7). Hence, $F : A \to \mathbb{C}$ is a well defined holomorphic function.
Applying Lemma 2.4 (ii) to the sequence $f'_m = ((m-3)!)^{-1} f_m$, $m = 3, 4, \ldots$, we obtain that the numbers $((m-3)!)^{-1}\, \eta(e_{i_1} \cdots e_{i_{m-1}}, f_m(e_{i_m}))$ form a system of correlators. Therefore, by Theorem 1.1, the series $F$ given by (2.7) satisfies the associativity equations. If $f_3 = \mathrm{id}_A$, then

$$e_\alpha \cdot_{F,0} e_\beta = \sum_{\lambda,\mu=1}^N \frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\lambda}(0)\, \eta^{\lambda\mu}\, e_\mu = \sum_{\lambda,\mu=1}^N \eta(e_\alpha e_\beta, e_\lambda)\, \eta^{\lambda\mu}\, e_\mu = e_\alpha e_\beta.$$

Thus $A_0 = A$.
Remark. The function $F$ in Theorem 2.1 is a polynomial iff $f_m(A^{m-1}) = 0$ for all sufficiently big $m$ (cf. (2.6) and (2.7)). The easiest way to ensure this condition is to take $f_m = 0$ for all sufficiently big $m$. If $A$ is nilpotent, i.e., $A^d = 0$ for a certain $d \geq 2$, then the function $F$ in Theorem 2.1 is a polynomial of degree $\leq d$. It is determined by the operators $f_3, f_4, \ldots, f_d$ and does not depend on the choice of $f_{d+1}, f_{d+2}, \ldots$.
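The convergence statement of Theorem 2.1 can be observed numerically when $A$ is not nilpotent. The following sketch is an illustration with assumed data, not taken from the paper: $A = \mathbb{C}[x]/(x^3)$ with inner product "coefficient of $x^2$", and $f_m = k_m\,\mathrm{id}_A$ with $k_m = 1$ for all $m$. In that case the series sums to the coefficient of $x^2$ in $e^t - 1 - t - t^2/2$, which for $t = t_1 + t_2 x + t_3 x^2$ equals $e^{t_1}(t_3 + t_2^2/2) - t_3 - t_1 t_3 - t_2^2/2$; the code compares partial sums with this closed form.

```python
import math

DIM = 3  # A = C[x]/(x^3); an element is a coefficient list [t1, t2, t3]

def mult(a, b):
    """Product in C[x]/(x^3)."""
    out = [0.0] * DIM
    for i in range(DIM):
        for j in range(DIM - i):
            out[i + j] += a[i] * b[j]
    return out

def eta(a, b):
    """Assumed inner product: coefficient of x^2 in ab."""
    return mult(a, b)[DIM - 1]

def partial_sum(t, k, last_m):
    """Sum of k(m)/m! * eta(t^(m-1), t) for m = 3, ..., last_m."""
    total = 0.0
    power = mult(t, t)      # t^(m-1) for m = 3
    factorial = 6.0         # 3!
    for m in range(3, last_m + 1):
        total += k(m) * eta(power, t) / factorial
        power = mult(power, t)
        factorial *= m + 1
    return total

def closed_form(t):
    """Limit of the series for k(m) = 1 (derived in the lead-in above)."""
    t1, t2, t3 = t
    return math.exp(t1) * (t3 + t2 * t2 / 2) - t3 - t1 * t3 - t2 * t2 / 2

POINT = [0.3, -1.2, 0.7]
```

With these choices `partial_sum(POINT, lambda m: 1.0, 40)` agrees with `closed_form(POINT)` to floating-point accuracy, and the limit is an entire function that is not a polynomial, in line with the Remark.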
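3. Normalisation
Normalised solutions. Let $F(t_1, \ldots, t_N)$ be a solution of the associativity equations associated with a matrix $(\eta^{\lambda\mu})$. It is called normalised (with respect to the first coordinate) if

$$\frac{\partial^3 F}{\partial t_1 \partial t_\lambda \partial t_\mu} = \eta_{\lambda\mu} \tag{3.1}$$

for all $\lambda, \mu = 1, \ldots, N$. This condition may be reformulated in terms of the Frobenius algebras $A_t = A_t(F)$, see Sect. 1. Comparing (1.2) and (3.1) we observe that $F$ is normalised if and only if the vector $(1, 0, \ldots, 0) \in \mathbb{C}^N$ is the unity of $A_t$ for all $t \in \mathbb{C}^N$.
The normalisation condition may be easily reformulated in terms of correlators. Let us say that a system of correlators $\{\langle i_1, \ldots, i_m \rangle \in \mathbb{C}\}$ (with $i_1, \ldots, i_m \in \{1, \ldots, N\}$) is normalised if $\langle 1, \lambda, \mu \rangle = \eta_{\lambda\mu}$ for all $\lambda, \mu$ and $\langle 1, i_1, \ldots, i_m \rangle = 0$ for $m > 2$. If $F$ is given by its Taylor series (1.4), then $F$ is normalised if and only if the system of correlators $\langle i_1, \ldots, i_m \rangle = ((m-3)!)^{-1} a_{i_1 \ldots i_m}$ is normalised. This follows directly from formula (1.5).
Normalisation of Frobenius algebras. There is a simple procedure allowing one to derive from any Frobenius algebra $(A, \eta)$ (possibly without unity) a Frobenius algebra $(\widetilde{A}, \widetilde{\eta})$ with unity. As a vector space, $\widetilde{A} = \mathbb{C} \oplus A \oplus \mathbb{C}$. Let $e \in \mathbb{C} \subset \widetilde{A}$ be the unity of the first copy of $\mathbb{C}$ and $f \in \mathbb{C} \subset \widetilde{A}$ be the unity of the second copy of $\mathbb{C}$. Multiplication in $\widetilde{A}$ is defined by the following rules: $e$ is the unity of $\widetilde{A}$; $f^2 = fA = Af = 0$; for $a, b \in A$, we have $ab = a \cdot b + \eta(a, b) f$, where $\cdot$ denotes multiplication in $A$. The inner product $\widetilde{\eta}$ in $\widetilde{A}$ extends the one in $A$ by $\widetilde{\eta}(e, f) = \widetilde{\eta}(f, e) = 1$ and

$$\widetilde{\eta}(e, e) = \widetilde{\eta}(f, f) = \widetilde{\eta}(e, A) = \widetilde{\eta}(A, e) = \widetilde{\eta}(f, A) = \widetilde{\eta}(A, f) = 0.$$

It is clear that $\widetilde{A}$ is a Frobenius algebra with unity. It is called the normal extension of $A$. The next lemma shows that a similar technique can be applied to solutions of the associativity equations (1.1) to produce normalised solutions.
Lemma 3.1. Let $F(t_2, \ldots, t_N)$ be a function of $N - 1$ complex variables satisfying Eqs. (1.1) associated with a matrix $(\eta^{\lambda\mu})_{\lambda,\mu=2}^N$. Set

$$\widetilde{F}(t_1, t_2, \ldots, t_{N+1}) = F(t_2, \ldots, t_N) + \frac{1}{2} t_1^2 t_{N+1} + \frac{t_1}{2} \sum_{\lambda,\mu=2}^N \eta_{\lambda\mu}\, t_\lambda t_\mu,$$

where $(\eta_{\lambda\mu})$ is the inverse matrix to $(\eta^{\lambda\mu})$. Then $\widetilde{F}$ is a normalised solution of Eqs. (1.1) associated with the matrix $(\widetilde{\eta}^{\lambda\mu})_{\lambda,\mu=1}^{N+1}$, where $\widetilde{\eta}^{\lambda\mu} = \eta^{\lambda\mu}$ for $N \geq \lambda, \mu \geq 2$, $\widetilde{\eta}^{1,N+1} = \widetilde{\eta}^{N+1,1} = 1$ and $\widetilde{\eta}^{\lambda\mu} = 0$ otherwise. The Frobenius algebra $A_t(\widetilde{F})$ derived from $\widetilde{F}$ at a point $t = (t_1, \ldots, t_{N+1}) \in \mathbb{C}^{N+1}$ is the normal extension of the Frobenius algebra $A_{t'}(F)$, where $t' = (t_2, \ldots, t_N) \in \mathbb{C}^{N-1}$.
Proof. It is obvious that for any $\lambda, \mu \in \{1, \ldots, N+1\}$,

$$\frac{\partial^3 \widetilde{F}}{\partial t_1 \partial t_\lambda \partial t_\mu} = \widetilde{\eta}_{\lambda\mu},$$

where $(\widetilde{\eta}_{\lambda\mu})$ is the inverse of the matrix $(\widetilde{\eta}^{\lambda\mu})$. This implies the normalisation condition. Let us check that

$$\sum_{\lambda,\mu=1}^{N+1} \frac{\partial^3 \widetilde{F}}{\partial t_\alpha \partial t_\beta \partial t_\lambda}\, \widetilde{\eta}^{\lambda\mu}\, \frac{\partial^3 \widetilde{F}}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \sum_{\lambda,\mu=1}^{N+1} \frac{\partial^3 \widetilde{F}}{\partial t_\beta \partial t_\gamma \partial t_\lambda}\, \widetilde{\eta}^{\lambda\mu}\, \frac{\partial^3 \widetilde{F}}{\partial t_\alpha \partial t_\delta \partial t_\mu}.$$

Denote the left- and right-hand sides of this formula by $L$ and $R$, respectively. We consider several cases. Let $\alpha = 1$. Then

$$L = \sum_{\lambda,\mu=1}^{N+1} \frac{\partial^3 \widetilde{F}}{\partial t_1 \partial t_\beta \partial t_\lambda}\, \widetilde{\eta}^{\lambda\mu}\, \frac{\partial^3 \widetilde{F}}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \sum_{\lambda,\mu=1}^{N+1} \widetilde{\eta}_{\beta\lambda}\, \widetilde{\eta}^{\lambda\mu}\, \frac{\partial^3 \widetilde{F}}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \sum_{\mu=1}^{N+1} \delta_\beta^\mu\, \frac{\partial^3 \widetilde{F}}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \frac{\partial^3 \widetilde{F}}{\partial t_\gamma \partial t_\delta \partial t_\beta}.$$

A similar computation for $R$ gives the same result. Thus, $L = R$. The cases $\beta = 1$, $\gamma = 1$, or $\delta = 1$ are considered similarly.
Consider the case where $\alpha, \beta, \gamma, \delta \geq 2$. Assume that $\alpha = N + 1$. Then $\partial^3 \widetilde{F} / \partial t_\alpha \partial t_\beta \partial t_\lambda = 0$ for any $\lambda$ and $\partial^3 \widetilde{F} / \partial t_\alpha \partial t_\delta \partial t_\mu = 0$ for any $\mu$. Therefore $L = 0 = R$. The cases $\beta = N + 1$, $\gamma = N + 1$, or $\delta = N + 1$ are considered similarly. In the case where $N \geq \alpha, \beta, \gamma, \delta \geq 2$ we have

$$L = \sum_{\lambda,\mu=2}^N \frac{\partial^3 F}{\partial t_\alpha \partial t_\beta \partial t_\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F}{\partial t_\gamma \partial t_\delta \partial t_\mu} = \sum_{\lambda,\mu=2}^N \frac{\partial^3 F}{\partial t_\beta \partial t_\gamma \partial t_\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F}{\partial t_\alpha \partial t_\delta \partial t_\mu} = R.$$

The last claim of the lemma follows directly from definitions.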
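The normal extension is simple enough to implement directly. The sketch below is illustrative only, with an assumed input algebra: $A$ is spanned by $h$ and $h^2$ with $h \cdot h = h^2$, all other products zero, and $\eta(h, h^2) = 1$. An element of $\widetilde{A} = \mathbb{C} \oplus A \oplus \mathbb{C}$ is stored as a triple `(u, a, v)`. The code checks on basis vectors that the extended product is associative, that $e$ is a unity, and that $\widetilde{\eta}(xy, z) = \widetilde{\eta}(x, yz)$.

```python
from itertools import product

# Assumed example algebra A = span{h, h^2}; an element is a list [c1, c2].
def mult_A(a, b):
    """Product in A: h*h = h^2, every other product of basis vectors is 0."""
    return [0, a[0] * b[0]]

def eta_A(a, b):
    """Inner product in A: eta(h, h^2) = 1, eta(h, h) = eta(h^2, h^2) = 0."""
    return a[0] * b[1] + a[1] * b[0]

def mult_ext(x, y):
    """Product in the normal extension; x = (u, a, v) stands for u*e + a + v*f."""
    (u, a, v), (u2, a2, v2) = x, y
    inner = mult_A(a, a2)
    middle = [u * q + u2 * p + r for p, q, r in zip(a, a2, inner)]
    return (u * u2, middle, u * v2 + u2 * v + eta_A(a, a2))

def eta_ext(x, y):
    """Extended inner product: pairs e with f and restricts to eta on A."""
    (u, a, v), (u2, a2, v2) = x, y
    return u * v2 + v * u2 + eta_A(a, a2)

E = (1, [0, 0], 0)
H = (0, [1, 0], 0)
H2 = (0, [0, 1], 0)
F_TOP = (0, [0, 0], 1)
EXT_BASIS = [E, H, H2, F_TOP]

def is_frobenius_with_unit():
    """Check unity, associativity and invariance of eta_ext on basis vectors."""
    for x in EXT_BASIS:
        if mult_ext(E, x) != x:
            return False
    for x, y, z in product(EXT_BASIS, repeat=3):
        if mult_ext(mult_ext(x, y), z) != mult_ext(x, mult_ext(y, z)):
            return False
        if eta_ext(mult_ext(x, y), z) != eta_ext(x, mult_ext(y, z)):
            return False
    return True
```

In this example the extension is isomorphic to $\mathbb{C}[x]/(x^4)$ via $e \mapsto 1$, $h \mapsto x$, $h^2 \mapsto x^2$, $f \mapsto x^3$; in particular `mult_ext(H, H2)` returns `F_TOP`, reflecting the rule $ab = a \cdot b + \eta(a, b) f$.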
Theorem 3.2. Let $A$ be a Frobenius algebra over $\mathbb{C}$ with inner product $\eta : A \times A \to \mathbb{C}$. Let $f_3, f_4, \ldots$ be a bounded sequence of $A$-linear operators $A \to A$. For $t \in A$, $u, v \in \mathbb{C}$, set

$$\widetilde{F}(u \oplus t \oplus v) = \sum_{m=3}^\infty \frac{1}{m!}\, \eta(t^{m-1}, f_m(t)) + \frac{1}{2} u^2 v + \frac{u}{2}\, \eta(t, t) \in \mathbb{C}. \tag{3.2}$$

Then $\widetilde{F} : \widetilde{A} = \mathbb{C} \oplus A \oplus \mathbb{C} \to \mathbb{C}$ is a normalised holomorphic solution of the associativity equations. If $f_3 = \mathrm{id}_A$, then $A_0(\widetilde{F}) = \widetilde{A}$.
This theorem is obtained by applying the construction of Lemma 3.1 to the function $F$ defined in Theorem 2.1. The last claim follows from the equalities $A_0(\widetilde{F}) = \widetilde{A_0(F)} = \widetilde{A}$.

4. The WDVV Equations
The quasihomogeneity equations. A function $F(t_1, \ldots, t_N)$ of $N$ complex variables is called quasihomogeneous if for certain complex numbers $d_F, d_1, \ldots, d_N, r_1, \ldots, r_N$, the derivative of $F$ with respect to the linear vector field $E = \sum_{\alpha=1}^N (d_\alpha t_\alpha + r_\alpha) \frac{\partial}{\partial t_\alpha}$ on $\mathbb{C}^N$ equals $d_F F$. Briefly, $EF = d_F F$. The function $F$ is quasihomogeneous if and only if

$$F(c^{d_1} t_1 + r_1 c, \ldots, c^{d_N} t_N + r_N c) = (c^{d_F} - 1) F(t_1, \ldots, t_N) + F(t_1 + r_1, \ldots, t_N + r_N)$$

for any non-zero $c \in \mathbb{C}$. The quasihomogeneity condition can be reformulated in terms of correlators. Let us say that a system of correlators $\{\langle i_1, \ldots, i_m \rangle \in \mathbb{C}\}$ is quasihomogeneous under $E$ if for any $i, j \in \{1, \ldots, N\}$,

$$\sum_{\alpha=1}^N r_\alpha \langle \alpha, i, j \rangle = 0,$$

and for a certain $d_F \in \mathbb{C}$ and any $i_1, \ldots, i_m \in \{1, \ldots, N\}$ with $m \geq 3$,

$$\Big(\sum_{k=1}^m d_{i_k} - d_F\Big) \langle i_1, \ldots, i_m \rangle + (m - 2) \sum_{\alpha=1}^N r_\alpha \langle \alpha, i_1, \ldots, i_m \rangle = 0.$$

Using the obvious formula

$$\Big(\sum_{\alpha=1}^N (d_\alpha t_\alpha + r_\alpha) \frac{\partial}{\partial t_\alpha}\Big)(t_{i_1} \cdots t_{i_m}) = \sum_{k=1}^m \big(d_{i_k}\, t_{i_1} \cdots t_{i_m} + r_{i_k}\, t_{i_1} \cdots t_{i_{k-1}} t_{i_{k+1}} \cdots t_{i_m}\big),$$

it is easy to show that a solution of the associativity equations with Taylor series (1.4) is quasihomogeneous if and only if the system of correlators $\langle i_1, \ldots, i_m \rangle = ((m-3)!)^{-1} a_{i_1 \ldots i_m}$ is quasihomogeneous.
The WDVV equations. The system of differential equations consisting of the associativity Eqs. (1.1), the normalisation Eqs. (3.1) and the quasihomogeneity equation $EF = d_F F$ is called the WDVV equations. The pair $(F, E)$ is called a solution of the WDVV equations. Summing up the results obtained above we can state the following theorem.
Theorem 4.1. Let $F = F(t_1, \ldots, t_N)$ be a holomorphic $\mathbb{C}$-valued function on $\mathbb{C}^N$ with Taylor series

$$F(t_1, \ldots, t_N) = \sum_{m=3}^\infty \frac{1}{m!} \sum_{i_1,\ldots,i_m=1}^N a_{i_1 \ldots i_m}\, t_{i_1} \cdots t_{i_m}$$

whose coefficients $a_{i_1 \ldots i_m}$ are invariant under permutations of indices. Let $E = \sum_{\alpha=1}^N (d_\alpha t_\alpha + r_\alpha) \frac{\partial}{\partial t_\alpha}$ be a vector field on $\mathbb{C}^N$ with $d_\alpha, r_\alpha \in \mathbb{C}$, $\alpha = 1, \ldots, N$. The pair $(F, E)$ satisfies the WDVV equations if and only if the numbers

$$\langle i_1, \ldots, i_m \rangle = \frac{1}{(m-3)!}\, a_{i_1 \ldots i_m}$$

form a normalised system of correlators quasihomogeneous under $E$.
Example 4.1. Let $A$ be a Frobenius algebra with inner product $\eta : A \times A \to \mathbb{C}$. Assume that $A$ splits as a direct sum of linear subspaces $A_1, A_2, \ldots$ such that (i) $A_i A_j \subset A_{i+j}$ for any $i, j$ and (ii) there is an integer $d \geq 3$ such that $\eta(A_i, A_j) = 0$ for $i + j \neq d$. Under these conditions we say that $A$ is a graded Frobenius algebra. Condition (ii) and the non-degeneracy of the inner product imply that $A_i = 0$ for $i \geq d$. Therefore $A^d = 0$, so that $A$ is nilpotent. Let $f_3, f_4, \ldots, f_d$ be $A$-linear operators $A \to A$. Theorem 3.2 yields a function $\widetilde{F} : \widetilde{A} = \mathbb{C} \oplus A \oplus \mathbb{C} \to \mathbb{C}$ which is a normalised solution of the associativity equations. (Formally speaking, we apply Theorem 3.2 to the bounded sequence $f_3, f_4, \ldots, f_d, 0, 0, \ldots$.) It is clear that $\widetilde{F}$ is a polynomial of degree $\leq d$ in the linear coordinates on $\widetilde{A}$.
Assume now that $f_m(A_i) \subset A_i$ for all $m, i$. Choose a basis $e_1, \ldots, e_{N+1}$ of $\widetilde{A}$ such that $e_1$ is the unit of the first copy of $\mathbb{C}$ in $\widetilde{A}$, $e_{N+1}$ is the unit of the second copy of $\mathbb{C}$ in $\widetilde{A}$, and each vector $e_\alpha$ with $\alpha = 2, \ldots, N$ lies in a certain $A_{d_\alpha}$ with $d_\alpha \in \{1, \ldots, d-1\}$. Denote by $t_1, \ldots, t_{N+1}$ the corresponding linear coordinates on $\widetilde{A}$. Set $d_1 = 0$ and $d_{N+1} = d$. Consider the vector field

$$E = \sum_{\alpha=1}^{N+1} d_\alpha t_\alpha \frac{\partial}{\partial t_\alpha}$$

on $\widetilde{A}$. We claim that $E\widetilde{F} = d \cdot \widetilde{F}$, so that the pair $(\widetilde{F}, E)$ satisfies all the WDVV equations. It suffices to check that $\widetilde{F}(t_1, c^{d_2} t_2, c^{d_3} t_3, \ldots, c^{d_N} t_N, c^d t_{N+1}) = c^d \widetilde{F}(t_1, \ldots, t_{N+1})$. This follows directly from formulas (2.7), (3.2) and the assumptions.
Example 4.2. Let $M$ be a closed connected oriented manifold of even dimension $d > 2$ such that $H^r(M; \mathbb{C}) = 0$ for all odd $r$. We provide the vector space $A = \oplus_{r=1}^{d-1} H^r(M; \mathbb{C})$ with multiplication and inner product as follows. The product of homogeneous elements $a \in H^p(M; \mathbb{C})$ and $b \in H^q(M; \mathbb{C})$ (with $1 \leq p, q \leq d - 1$) is defined by $ab = a \cup b$ if $p + q < d$ and $ab = 0$ otherwise. The inner product is defined by $\eta(a, b) = (a \cup b)([M])$ if $p + q = d$ and $\eta(a, b) = 0$ otherwise. It is clear that $A$ is a graded Frobenius algebra. Observe that the normal extension $\widetilde{A}$ of $A$ is the full cohomology algebra of $M$:

$$\widetilde{A} = \mathbb{C} \oplus A \oplus \mathbb{C} = \oplus_{r=0}^{d} H^r(M; \mathbb{C})$$

with usual multiplication and inner product $\eta(a, b) = (a \cup b)([M])$. For any $k_3, k_4, \ldots, k_d \in \mathbb{C}$, consider a polynomial function $\widetilde{F} : \widetilde{A} \to \mathbb{C}$ defined by

$$\widetilde{F}(u \oplus t \oplus v) = \Big(\sum_{m=3}^{d} \frac{k_m}{m!}\, t^m + \frac{u^2 v + u t^2}{2}\Big)([M]), \tag{4.1}$$

where $u \in H^0(M; \mathbb{C}) = \mathbb{C}$, $t \in A = \oplus_{r=1}^{d-1} H^r(M; \mathbb{C})$, and $v \in H^d(M; \mathbb{C}) = \mathbb{C}$. The product of the cohomology classes in formula (4.1) is the usual cup-product in $H^*(M; \mathbb{C})$. By Corollary 2.2, the function $\widetilde{F}$ is a solution of the associativity equations. By Theorem 3.2 and Example 4.1, this function satisfies also the normalisation equations and the quasihomogeneity equation. This implies that for any 1-variable polynomial $P$ with complex coefficients, the polynomial function on $\widetilde{A}$ defined by

$$u \oplus t \oplus v \mapsto \Big(P(t) + \frac{u^2 v + u t^2}{2}\Big)([M])$$

with $u, t, v$ as above is a solution of all the WDVV equations. This example can be easily extended to closed oriented manifolds with non-trivial odd-dimensional cohomology; in this case one has to use the language of Frobenius super-algebras and super-WDVV equations.
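Formula (4.1) and the scaling check of Example 4.1 can be made concrete for a specific manifold. The sketch below is an illustration under assumptions not made in the paper: it takes $M = \mathbb{CP}^3$, whose cohomology ring is $\mathbb{C}[x]/(x^4)$ with $x$ in degree 2, so that $d = 6$, $t = s_1 x + s_2 x^2$, and evaluation on $[M]$ reads off the coefficient of $x^3$. The names `cup` and `F_tilde` and the coefficients in `K` are hypothetical. The weights are $0, 2, 4, 6$ for $u, s_1, s_2, v$.

```python
from fractions import Fraction
from math import factorial

TOP = 3  # H^*(CP^3; C) = C[x]/(x^4); evaluation on [M] is the x^3 coefficient

def cup(a, b):
    """Cup product of two classes given as coefficient lists in 1, x, x^2, x^3."""
    out = [Fraction(0)] * (TOP + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= TOP:
                out[i + j] += ai * bj
    return out

def F_tilde(u, s1, s2, v, k):
    """Formula (4.1) for integer arguments u, s1, s2, v and coefficients k[m]."""
    t = [Fraction(0), Fraction(s1), Fraction(s2), Fraction(0)]
    power = cup(t, t)                          # t^2
    u_t2 = [u * c for c in power]              # u * t^2
    total = [Fraction(0)] * (TOP + 1)
    for m in range(3, 2 * TOP + 1):            # m = 3, ..., d with d = 6
        power = cup(power, t)                  # t^m
        weight = Fraction(k.get(m, 0), factorial(m))
        total = [x + weight * y for x, y in zip(total, power)]
    total[TOP] += Fraction(u * u * v, 2)       # u^2 v / 2, with v in top degree
    total = [x + y / 2 for x, y in zip(total, u_t2)]
    return total[TOP]                          # evaluate on [M]

K = {3: 1, 4: 2, 5: -3, 6: 4}  # arbitrary illustrative coefficients k_m
```

For instance, `F_tilde(2, 5, -1, 7, K)` equals $149/6$, and rescaling the arguments with $c = 3$ according to the weights multiplies the value by $c^6$, as the quasihomogeneity equation requires. In this small example only $k_3$ contributes, since $t^m = 0$ for $m \geq 4$.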
Example 4.3. For the sake of completeness, we briefly describe the solutions of the WDVV equations arising in the theory of quantum cohomology. These solutions are different from those obtained above. In particular, they are series rather than polynomials. Let $(M, \omega)$ be a symplectic semi-positive manifold such that $H^{2k+1}(M; \mathbb{C}) = 0$ and $H^{2k}(M; \mathbb{Z})$ is a free abelian group for all $k \geq 0$. Consider a homogeneous basis $e_1, \ldots, e_N$ in $H_*(M; \mathbb{Z})$ and set

$$\langle i_1, \ldots, i_m \rangle = \frac{1}{(m-3)!} \sum_{A \in H_2(M; \mathbb{Z})} \Phi_{A,\omega,0}(e_{i_1}, e_{i_2}, e_{i_3} \,|\, e_{i_4}, \ldots, e_{i_m})\, e^{-\langle \omega, A \rangle},$$

where $\Phi_{A,\omega,0}$ is the Gromov–Witten invariant, see [4]. It follows from [4, Theorem 7.3] that $\langle i_1, \ldots, i_m \rangle$ is a normalised system of correlators. By [5], it is quasihomogeneous with respect to the linear vector field $E = \sum_{\alpha=1}^N (d_\alpha t_\alpha + r_\alpha) \frac{\partial}{\partial t_\alpha}$, where $t_1, \ldots, t_N$ are the linear coordinates in $H_*(M; \mathbb{C})$ determined by the basis $e_1, \ldots, e_N$,

$$d_\alpha = 1 - \frac{1}{2} \dim e_\alpha \quad \text{and} \quad r_\alpha = c_1(M)(e_\alpha).$$

Here $c_1(M)$ is the first Chern class of the manifold $M$ endowed with an almost complex structure compatible with $\omega$, and $c_1(M)(e_\alpha)$ is the value of $c_1(M) \in H^2(M; \mathbb{Z})$ on the homology class $e_\alpha$.

References
1. Dijkgraaf, R., Verlinde, E., Verlinde, H.: Nucl. Phys. B352, 59 (1991); Notes on topological string theory and 2D quantum gravity. Preprint PUPT-1217, IASSNS-HEP-90/80, November 1990
2. Dubrovin, B.: Geometry of 2D topological field theories. Lect. Notes in Math. 1620, 120–348 (1996)
3. Kontsevich, M., Manin, Yu.: Gromov–Witten classes, quantum cohomology and enumerative geometry. Commun. Math. Phys. 164, 525–562 (1994)
4. Ruan, Y., Tian, G.: A mathematical theory of quantum cohomology. J. Diff. Geom. 42, 259–367 (1995)
5. Tian, G.: Quantum cohomology and its associativity. Preprint
6. Witten, E.: On the structure of the topological phase of two-dimensional gravity. Nucl. Phys. B340, 281–332 (1990)

Communicated by G. Felder
Commun. Math. Phys. 196, 411 – 428 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A Derivation of the Cyclic Form Factor Equation
Max R. Niedermaier
Max-Planck-Institut für Physik (Werner Heisenberg Institut), Föhringer Ring 6, D-80805 Munich, Germany
Received: 7 July 1997 / Accepted: 9 February 1998
Abstract: A derivation of the cyclic form factor equation from quantum field theoretical principles is given; form factors being the matrix elements of a field operator between scattering states. The scattering states are constructed from Haag–Ruelle type interpolating fields with support in a "comoving" Rindler spacetime. The cyclic form factor equation then arises from the KMS property of the modular operators $\Delta$ associated with the field algebras of these Rindler wedges. The derivation in particular shows that the equation holds in any massive 1+1 dim. relativistic QFT, regardless of its integrability.
1. Introduction
Form factors of a 1+1 dim. massive quantum field theory (QFT) and modular structures in the sense of algebraic QFT are apparently unrelated concepts. Form factors are matrix elements of some field operator between an asymptotic multi-particle state and the physical vacuum. As such they are parametrized by a set of (initially real) rapidity variables $\theta_j$, in which they admit a meromorphic continuation, and possibly by a set of internal quantum numbers $a_j$, $1 \leq j \leq n$. We shall write $F_{a_n \ldots a_1}(\theta_n, \ldots, \theta_1) = \langle 0 | \mathcal{O} | \theta_n, a_n; \ldots; \theta_1, a_1 \rangle$, where $\mathcal{O}$ is a field operator (obtained by smearing a relativistic field, possibly nonlocal and charged, with a test function) and each rapidity $\theta$ parametrizes an on-shell momentum in the usual way, $p_0 = m\,\mathrm{ch}\,\theta$, $p_1 = m\,\mathrm{sh}\,\theta$. In theories with a factorized scattering operator there exists a system of functional equations for these form factors, which entail that the Wightman functions built from them have all the required properties, and which in principle allow one to compute the former exactly. Knowing the form factors, the Wightman functions can be reconstructed through convergent series expansions, which arise from inserting a resolution of the identity in terms of scattering states. Truncating the series at a finite particle number provides a powerful solution technique that produces non-perturbative results difficult or impossible to obtain otherwise. The most innocent looking of these functional equations is the cyclic form factor equation, stating that

$$F_{a_n \ldots a_1}(\theta_n, \ldots, \theta_2, \theta_1 + 2\pi i) = \eta\, F_{a_1 a_n \ldots a_2}(\theta_1, \theta_n, \ldots, \theta_2), \tag{1.1}$$

where $\eta$ is a phase and the shift by $2\pi i$ is understood in the sense of analytic continuation. Originally Eq. (1.1) was found in the context of the Sine–Gordon model [1], improving on earlier attempts to generalize Watson's equation [3]. Subsequently Smirnov promoted it to an axiom for the form factors of an integrable QFT, which together with the other equations implies locality [2]. The purpose of this paper is to give a derivation of Eq. (1.1) from quantum field theoretical principles. The derivation shows in particular that (1.1) holds in any massive 1+1 dim. relativistic QFT, regardless of its integrability. The crucial tools are the modular structures (in the sense of algebraic QFT) in a "Rindler wedge" situation, where they have geometrical significance.
Modular structures in the context of von Neumann algebras are a pair of operators $(J, \Delta)$ that can be associated to any von Neumann algebra $\mathcal{M}$ with cyclic and separating vector $\Omega$. The latter means that there exists a Hilbert space $\mathcal{H}$ such that both $\mathcal{M}\Omega$ and $\mathcal{M}'\Omega$ are dense subspaces of $\mathcal{H}$, where $\mathcal{M}'$ is the commutant of $\mathcal{M}$ (the set and von Neumann algebra of all bounded operators on $\mathcal{H}$ commuting with $\mathcal{M}$). The operator $J$ is an anti-unitary involution with respect to the inner product on $\mathcal{H}$, and $\Delta$ is a positive selfadjoint (in general unbounded) operator. The defining relations for $(J, \Delta)$ are: $J \Delta^{1/2} X \Omega = X^* \Omega$ for $X \in \mathcal{M}$ and $J \Delta^{-1/2} X' \Omega = X'^* \Omega$ for $X' \in \mathcal{M}'$, where $J \Delta J = \Delta^{-1}$. The Tomita–Takesaki theorem [8] states that $J \mathcal{M} J = \mathcal{M}'$ and that for all real $\lambda$ the mapping $D_\lambda(X) = \Delta^{i\lambda} X \Delta^{-i\lambda}$ defines an automorphism group of both $\mathcal{M}$ and $\mathcal{M}'$. From the defining relations one can deduce the following "KMS property" of $\Delta$ [7, 8]:

$$(\Omega, Y \Delta X \Omega) = (\Omega, XY \Omega), \quad X, Y \in \mathcal{M}. \tag{1.2}$$

Heuristically one can thus think of $\Delta$ as being an unbounded density operator for which the defining relations and (1.2) provide a substitute for the cyclic property of the trace.
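The role of the KMS property as a substitute for the cyclicity of the trace can be seen in a finite-dimensional toy model. The sketch below is only an illustration under assumptions that are not part of the paper: the algebra is $\mathcal{M} = M_2(\mathbb{C}) \otimes 1$ acting on $\mathbb{C}^2 \otimes \mathbb{C}^2$, the vector is $\Omega = \sum_i \sqrt{p_i}\, e_i \otimes e_i$ with $p = (9/25, 16/25)$, and the modular operator is taken in its standard form $\Delta = \rho \otimes \rho^{-1}$ with $\rho = \mathrm{diag}(p)$. In this model $(\Omega, (x \otimes 1)\Omega) = \mathrm{Tr}(\rho x)$, and the KMS identity reduces to $\mathrm{Tr}(y \rho x) = \mathrm{Tr}(\rho x y)$. All names in the code are hypothetical, and real matrix entries are used so that no complex conjugation is needed.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def apply(m, v):
    """Matrix times vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def inner(v, w):
    """Inner product for real vectors (no conjugation needed here)."""
    return sum(x * y for x, y in zip(v, w))

SQRT_P = [Fraction(3, 5), Fraction(4, 5)]           # square roots of p_1, p_2
IDENTITY = [[1, 0], [0, 1]]
RHO = [[SQRT_P[0] ** 2, 0], [0, SQRT_P[1] ** 2]]    # density matrix diag(p)
RHO_INV = [[1 / RHO[0][0], 0], [0, 1 / RHO[1][1]]]
OMEGA = [SQRT_P[0], 0, 0, SQRT_P[1]]                # sum_i sqrt(p_i) e_i (x) e_i
DELTA = kron(RHO, RHO_INV)                          # assumed modular operator

def kms_sides(x, y):
    """Return ((Omega, Y Delta X Omega), (Omega, X Y Omega)) for X = x (x) 1 etc."""
    X, Y = kron(x, IDENTITY), kron(y, IDENTITY)
    lhs = inner(OMEGA, apply(matmul(matmul(Y, DELTA), X), OMEGA))
    rhs = inner(OMEGA, apply(matmul(X, Y), OMEGA))
    return lhs, rhs

def without_delta(x, y):
    """(Omega, Y X Omega), for comparison with the KMS expression."""
    X, Y = kron(x, IDENTITY), kron(y, IDENTITY)
    return inner(OMEGA, apply(matmul(Y, X), OMEGA))
```

For $x = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $y = \begin{pmatrix} 0 & 1 \\ 5 & -2 \end{pmatrix}$ both sides of the KMS identity equal $2/5$, whereas dropping $\Delta$ gives $59/25$, so the insertion of $\Delta$ is what permits the reordering of $X$ and $Y$.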
In algebraic QFT one deals with a net of von Neumann algebras $\mathcal{M}(K)$ associated to bounded regions $K$ (double cones) of the Minkowski spacetime and (by the Reeh–Schlieder theorem) for each $\mathcal{M}(K)$ the vacuum provides a cyclic and separating vector. Hence the Tomita–Takesaki theory applies. The same holds when $K$ is replaced with a Rindler wedge, in which case the modular symmetries have geometrical significance. Basically $J$ acts as a reflection and exchanges the left and the right Rindler wedge, and $-\frac{1}{2\pi}\ln\Delta$ can be identified with the generator of Lorentz boosts along a direction that leaves the wedge invariant. Heuristically one can think of $\Delta$ as an unbounded operator implementing Lorentz boosts with purely imaginary parameter and of $J$ as being related to the CPT operator. In the framework of the Wightman QFT the above result is essentially due to Bisognano and Wichmann [5], while in the more general algebraic setting an analogous 1+1 dim. result has more recently been proved by Borchers [6].

In this context Eq. (1.1) is clearly reminiscent of the “KMS property” (1.2) of the modular operator $\Delta$. For an actual derivation of (1.1) based on (1.2) one has to deal with three aspects of the problem. First, one has to make sure that the action of the modular operator is defined on (vectors generated by) appropriate operators localized in a wedge domain and having sharply peaked momentum transfer. Second, one has to show that these operators generate the usual scattering states of a Minkowski space QFT. Third, in order to cover reasonably generic QFTs, soliton sectors should be taken into account. This is because in 1+1 dim. massive particles often have soliton properties, i.e. interpolate between inequivalent vacua, and if they are excluded asymptotic completeness cannot be expected to hold; see e.g. [14]. It is the combination of these aspects which renders the derivation of (1.1) technically a bit subtle.
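The heuristic reading of $\Delta$ as a boost with purely imaginary parameter can be illustrated at the level of on-shell momenta. The following is a minimal numerical sketch, not part of the derivation: it assumes the parametrization $p_0 = m\,\mathrm{ch}\,\theta$, $p_1 = m\,\mathrm{sh}\,\theta$ from the Introduction, and `on_shell_momentum` is a hypothetical helper name. It checks that the shift $\theta \to \theta + 2\pi i$ appearing in (1.1) leads back to the same momentum, while $\theta \to \theta + i\pi$ reverses its sign.

```python
import cmath
import math


def on_shell_momentum(mass, theta):
    """Return (p0, p1) = (m cosh(theta), m sinh(theta)); theta may be complex."""
    return mass * cmath.cosh(theta), mass * cmath.sinh(theta)


# Illustrative values only (assumptions of this sketch).
mass, theta = 1.5, 0.7

p = on_shell_momentum(mass, theta)
p_half = on_shell_momentum(mass, theta + 1j * math.pi)  # shift by i*pi
p_full = on_shell_momentum(mass, theta + 2j * math.pi)  # shift by 2*pi*i

# Deviations from the expected relations; all should be at rounding level.
sign_flip_error = max(abs(p_half[0] + p[0]), abs(p_half[1] + p[1]))
periodicity_error = max(abs(p_full[0] - p[0]), abs(p_full[1] - p[1]))
mass_shell_error = abs(p_half[0] ** 2 - p_half[1] ** 2 - mass ** 2)
```

Since the momentum itself is insensitive to the $2\pi i$ shift, the content of (1.1) lies in the analytic continuation of the form factor and in the cyclic permutation of its arguments.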
In the next section we describe the required general QFT framework in the presence of soliton sectors. In Sect. 3 we discuss some
Derivation of Cyclic Form Factor Equation
aspects of a Haag–Ruelle type scattering theory tailored towards the use of modular structures. The derivation proper of (1.1) is given in Sect. 4.

2. QFT Framework Including Solitons

Since Eq. (1.1) is a statement about matrix elements of scattering states, it is clear that the proper QFT framework for its derivation must ensure that the QFT under consideration has a well-behaved scattering theory. Apart from the set-up in which a Haag–Ruelle type scattering theory is formulated in higher dimensions, in 1+1 dim. this requires the inclusion of soliton sectors, because otherwise asymptotic completeness cannot be expected to hold. A model independent understanding of the appropriate QFT framework in the presence of soliton sectors was obtained only recently by Rehren and Müger [18, 20] based on early work by Fröhlich [14]. We shall adopt here a version of this framework suiting our purposes, the guideline being simplicity rather than minimality of the assumptions. The paragraphs containing major assumptions on the QFT considered are numbered (1)–(6).

(1) The QFT is supposed to be described in terms of a net of local observables $K \to \mathcal{A}(K)$ satisfying isotony and locality [11]. For simplicity we also require covariance with respect to the action of the proper 1+1 dim. Poincaré group $P_+$. This means that there exists a representation of $P_+$ by automorphisms $p \to \alpha_p$ such that $\alpha_p \mathcal{A}(K) = \mathcal{A}(pK)$. Elements $p = p(y, \lambda, r) \in P_+$ can be parametrized by triples $(y, \lambda, r)$, where $y \in \mathbb{R}^{1,1}$ is a translation parameter, $\lambda \in \mathbb{R}$ is a boost parameter and $r \in \{\pm 1\}$ is a sign. Our conventions are $p(y, \lambda, 1)x = y + x(\lambda)$, with $x^0(\lambda) = x^0\,\mathrm{ch}\,\lambda + x^1\,\mathrm{sh}\,\lambda$, $x^1(\lambda) = x^0\,\mathrm{sh}\,\lambda + x^1\,\mathrm{ch}\,\lambda$, and $p(0, 0, \pm 1)x = \pm x$. The subgroup generated by $p(y, \lambda, 1)$ is the restricted Poincaré group, denoted by $P_+^\uparrow$. For arguments and indices referring to $p \in P_+$ we will use the shorthands $y = p(y, 0, 1)$, $\lambda = p(0, \lambda, 1)$ and $r = p(0, 0, r)$. The $C^*$-algebra associated with a double cone $K$ of 1+1 dim.
Minkowski space is denoted by $\mathcal{A}(K)$ and is assumed to be a factor. In 1+1 dim. each double cone $K$ is an intersection of two translated wedges $K = (L + x) \cap (R + y)$, where $L = \{x \in \mathbb{R}^{1,1} \mid |x^0| < -x^1\}$ and $R = \{x \in \mathbb{R}^{1,1} \mid |x^0| < x^1\}$ are the left and right Rindler wedge. For an unbounded region $G$ let $\mathcal{A}(G)$ denote the algebra obtained by taking the norm closure ($C^*$-inductive limit) of $\bigcup_{K \subset G} \mathcal{A}(K)$; in particular $\mathcal{A} = \mathcal{A}(\mathbb{R}^{1,1})$ is the algebra of quasilocal observables. For any state $\omega$ over $\mathcal{A}(G)$ we write $(\mathcal{H}_\omega, \pi_\omega, \Omega)$ for the GNS triple of $\omega$ and denote by $\mathcal{M}_\omega(G)$ the von Neumann algebra $\pi_\omega(\mathcal{A}(G))''$, where the double prime denotes the weak closure in the $C^*$-algebra of bounded operators on $\mathcal{H}_\omega$. For the states $\omega$ of interest $\mathcal{H}_\omega$ carries a positive energy representation of the Poincaré group. If $P_\omega$ denotes the generator of the translation subgroup in $\mathcal{H}_\omega$ this means that its spectrum $\mathrm{Sp}(P_\omega)$ is contained in the closed forward lightcone (spectrum condition).

(2) Specifically we assume that the QFT under consideration has both massive 1-particle and massive vacuum states. These concepts are defined as follows [13]. A massive 1-particle state is defined to be a pure translation covariant state on $\mathcal{A}$ such that the spectrum $\mathrm{Sp}(P_\omega)$ on $\mathcal{H}_\omega$ consists of the mass shell $\{p \mid p_0 > 0,\ p^2 = m^2\}$ and a subset of the continuum $\{p \mid p_0 > 0,\ p^2 \ge (m + \mu)^2\}$, for some $\mu > 0$. Similarly a massive vacuum state is defined, except that $\mathrm{Sp}(P_\omega)$ now consists of the value 0 and a subset of $\{p \mid p_0 > 0,\ p^2 \ge \mu^2\}$, where $\mu > 0$ is called the mass gap. The unitary equivalence class of irreducible GNS representations associated with a given massive 1-particle state is called a massive 1-particle sector and will be denoted by $a$ or $[\omega_a]$. The set of massive 1-particle sectors is assumed to be finite and is denoted by $I$. Similarly massive vacuum sectors $[\omega_\alpha]$ are defined, of which there may be infinitely many. For these vacuum
sectors we shall assume that they obey wedge duality, i.e.
$$\mathcal{M}_\alpha(L + c) = \mathcal{M}_\alpha(R + c)', \quad \forall c \in \mathbb{R}^{1,1}, \tag{2.1}$$
where as usual the prime denotes the commutant in the algebra of bounded operators on the (separable) GNS Hilbert space. One can interpret $\mathcal{A}(L + c)$ as a weakly dense subalgebra of $\mathcal{M}_\alpha(L + c)$ and similarly for the right wedges. We do not require Haag duality. If (2.1) is replaced with Haag duality, this is roughly also the 1+1 dim. specialization of the set-up in which superselection sectors in d+1 dim. in the sense of DHR and BF are discussed [12, 13, 11]. A peculiarity of 1+1 dim. is that non-trivial superselection sectors in this sense do not exist [18] (under certain conditions which are supposed to be satisfied in massive QFTs). The massive 1-particle states can however have soliton character, i.e. interpolate between two inequivalent vacua at positive or negative spacelike infinity. This is related to a topological speciality of 1+1 dim. Minkowski space: the spacelike complement of any double cone has two disconnected components, a left and a right component. Associated with any massive 1-particle state $\omega_a$ is therefore a pair of massive vacuum states $\omega_{\alpha_L}$ and $\omega_{\alpha_R}$ [15].

(3) Concerning the vacuum structure we assume that the different vacua arise (exclusively) from a spontaneously broken internal symmetry group $G$, as in [14]. For reasons that will become clear later we take the internal symmetry group to be abelian. More precisely $G$ is supposed to satisfy the following conditions:
(G1) Elements $g \in G$ are ∗-automorphisms of $\mathcal{A}$ preserving the net structure, i.e. $g(\mathcal{A}(K)) = \mathcal{A}(K)$ for all double cones $K$.
(G2) Elements $g \in G$ commute with the Poincaré group: $\alpha_p \circ g = g \circ \alpha_p$, $p \in P_+$.
(G3) $G$ is abelian, finitely generated and has trivial second cohomology group. In concrete terms the latter means that every 2-cocycle $e : G \times G \to \mathbb{C}$ is a 2-coboundary, i.e.
$$e(g_2, g_3)\, e(g_1 g_2, g_3)^*\, e(g_1, g_2 g_3)\, e(g_1, g_2)^* = 1, \tag{2.2}$$
with $|e(g_1, g_2)| = 1$ and $e(g, \mathbb{1}) = 1$, implies $e(g, h) = \lambda(g)\lambda(h)\lambda(gh)^*$ for some 1-cocycle $\lambda$ of $G$. Examples are the cyclic groups $Z$ and $Z_N$, $N > 0$.
(G4) The vacuum states $\omega_\alpha$ and $\omega_\alpha \circ g$ are unitarily inequivalent for all $g \neq \mathbb{1}$.

As indicated, it is convenient to fix a reference vacuum state $\omega_\alpha$ and label all other vacua by group elements. In particular we shall write $(\mathcal{H}_{\alpha\circ g}, \pi_{\alpha\circ g}, \Omega)$ for the GNS triple of $\omega_\alpha \circ g$. The representation of the Poincaré group is the same in all vacuum representations and is denoted by $p \to U(p)$, with the shorthands $U(x) = U(p(x, 0, 1))$, etc. As described before, massive 1-particle states will now interpolate between two such vacua at left and right spacelike infinity. In many situations one will be interested only in the interpolation properties of a state, not in its particle properties. This motivates us to define kink states as follows: A state $\omega$ over $\mathcal{A}$ is called a kink state, interpolating between vacuum states $\omega_\alpha \circ g$ and $\omega_\alpha \circ h$, if it is a translation covariant state satisfying the spectrum condition and if it has the property
$$\pi_\omega\big|_{\mathcal{A}(L)} \sim \pi_{\alpha\circ g}\big|_{\mathcal{A}(L)}, \quad \pi_\omega\big|_{\mathcal{A}(R)} \sim \pi_{\alpha\circ h}\big|_{\mathcal{A}(R)}, \tag{2.3}$$
where “∼” denotes unitary equivalence. Naturally a kink sector is an equivalence class $[\omega]$ of kink states. Note that massive 1-particle states are special kink states. Following Fröhlich [14] we next assume that (all) kink states can be constructed from vacuum states by means of suitable automorphisms of $\mathcal{A}$ whose existence is postulated.

(4) For any pair $(g, h) \in G \times G$ we assume that there exists a ∗-automorphism $\rho$ (a “kink automorphism of type $(g, h)$”) enjoying the following properties:
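(ρ1) There exists a bounded double cone $K$ such that
$$\rho\big|_{\mathcal{A}(K_L)} = g\big|_{\mathcal{A}(K_L)} \quad\text{and}\quad \rho\big|_{\mathcal{A}(K_R)} = h\big|_{\mathcal{A}(K_R)}, \tag{2.4}$$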
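where $K_L$ and $K_R$ are the left and right spacelike complement of $K$, respectively. The region $K$ is called the interpolation region of $\rho$.
(ρ2) $\rho$ commutes with the symmetries: $\rho g = g\rho$.
(ρ3) There exists a strongly continuous map $\gamma_\rho : P_+^\uparrow \to \mathcal{A}$ (a cocycle) such that
$$\gamma_\rho(p)\, A\, \gamma_\rho(p)^{-1} = (\alpha_p \circ \rho \circ \alpha_p^{-1} \circ \rho^{-1})(A), \quad A \in \mathcal{A}, \tag{2.5a}$$
$$\gamma_\rho(p_2 p_1) = \alpha_{p_2}(\gamma_\rho(p_1))\,\gamma_\rho(p_2), \tag{2.5b}$$
$$g\gamma_\rho(p) = \gamma_\rho(p), \quad \gamma_g(p) = \mathbb{1}, \quad g \in G. \tag{2.5c}$$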
(ρ4) There exists a group homomorphism $G \times G \ni (\mathbb{1}, h) \to \rho$, denoted by $h \to \rho_h$.

Let us add a few comments. The absence of non-trivial DHR sectors [18] implies that $\rho$ is determined by its type $(g, h)$ up to unitary equivalence. That is to say, if $\rho_1$, $\rho_2$ are two automorphisms of type $(g, h)$ then $\rho_1\rho_2^{-1}$ is an inner automorphism of $\mathcal{A}$. Condition (ρ4) thus says that out of each unitary equivalence class one can pick a representative such that $h \to \rho_h$ becomes a group homomorphism. Then ${}_g\rho_h := \rho_{hg^{-1}}\, g$ is a (preferred) kink automorphism of type $(g, h)$. The supplementary condition (2.5c) on the cocycles is only included for convenience; it could be relaxed and then follows from the other properties. Concerning (ρ1) one verifies that $\gamma_\rho(p)$, $p = p(x, \lambda, 1)$, has interpolation region $x + p(0, \lambda, 1)K$, if $K$ is the interpolation region of $\rho$. Further one checks that the set of kink automorphisms forms a group with respect to composition. In particular the inverse of a kink automorphism of type $(g, h)$ is of type $(g^{-1}, h^{-1})$ and has the same interpolation region.

Parallel to the DHR case [12] one can show that two kink automorphisms commute if their interpolation regions are spacelike separated. Clearly a necessary condition for this to happen is that the group $G$ is abelian, which supplemented by (ρ2) also turns out to be sufficient [20]. If the interpolation regions of $\rho_1$ and $\rho_2$ are not spacelike separated, $\rho_1\rho_2$ and $\rho_2\rho_1$ are related by a unitary “statistics operator” as in the DHR case, $\rho_1\rho_2 = \mathrm{Ad}\,\varepsilon(\rho_1, \rho_2) \circ \rho_2\rho_1$. The latter is defined by separating the interpolation regions by means of unitary, gauge invariant charge transporters; by (2.5c) the cocycles serve that purpose. It then follows that $\varepsilon(\rho_1, \rho_2)$ depends at most on the orientation of the auxiliary spacelike separated regions employed in the separation process. In particular it satisfies $\varepsilon({}_{g_1}\rho_{h_1}, {}_{g_2}\rho_{h_2}) = \varepsilon(\rho_{h_1 g_1^{-1}}, \rho_{h_2 g_2^{-1}})$, and turns out always to be a complex phase. The statistics phase proper is defined as $\kappa_\rho = \varepsilon(\rho, \rho)$ and obeys $\kappa_\rho = \kappa_{\rho^{-1}} = \kappa_{\rho\circ g}$, $\forall g \in G$. From (ρ4), the trivial second cohomology of $G$, and the properties of the statistics operator one can show that there exists a choice of square roots $\sqrt{\kappa_\rho}$ such that
$$\varepsilon(\rho_1, \rho_2) = \frac{\sqrt{\kappa_{\rho_1\rho_2}}}{\sqrt{\kappa_{\rho_1}}\,\sqrt{\kappa_{\rho_2}}}. \tag{2.6}$$
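The coboundary form that appears in condition (G3) and again in (2.6) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it takes $G = Z_N$ written additively, chooses random phases $\lambda(g)$ with $\lambda(0) = 1$, and verifies that $e(g, h) = \lambda(g)\lambda(h)\lambda(gh)^*$ satisfies the identity (2.2) together with the normalizations $|e| = 1$ and $e(g, \mathbb{1}) = 1$. The names `coboundary` and `cocycle_defect` are hypothetical helpers. Only this easy direction is tested; the sketch does not show that every 2-cocycle is of this form.

```python
import cmath
import itertools
import random


def coboundary(lam, n):
    """Build e(g, h) = lam(g) * lam(h) * conj(lam(g + h mod n)) on Z_n."""
    return {
        (g, h): lam[g] * lam[h] * lam[(g + h) % n].conjugate()
        for g in range(n)
        for h in range(n)
    }


def cocycle_defect(e, n):
    """Largest deviation from the identity (2.2) over all triples in Z_n."""
    worst = 0.0
    for g1, g2, g3 in itertools.product(range(n), repeat=3):
        lhs = (
            e[(g2, g3)]
            * e[((g1 + g2) % n, g3)].conjugate()
            * e[(g1, (g2 + g3) % n)]
            * e[(g1, g2)].conjugate()
        )
        worst = max(worst, abs(lhs - 1))
    return worst


# Illustrative choice (assumption): Z_5 with random phases, lam(0) = 1.
n = 5
rng = random.Random(0)
lam = [1.0 + 0j] + [
    cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi)) for _ in range(n - 1)
]

e = coboundary(lam, n)
defect = cocycle_defect(e, n)
```

The defect vanishes up to rounding because every phase $\lambda(\cdot)$ enters the product in (2.2) once together with its complex conjugate.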
The main use of the kink automorphisms lies in the fact that they generate kink states from vacuum states. In detail, let ρ be a kink automorphism of type (g, h) and let ωα be a massive vacuum state. Then the state ωα ◦ ρ is a kink state interpolating the vacuum sectors [ωα ◦ g] and [ωα ◦ h] (∗). Let us briefly comment on the proof of (∗). The fact
that the state $\omega_\alpha \circ {}_g\rho_h$ has the correct interpolation property is manifest. Its translation covariance follows from the translation part of the identity
$$(\pi_\alpha \circ \rho \circ \alpha_p)(A) = U_\rho(p)\,(\pi_\alpha \circ \rho)(A)\,U_\rho(p)^{-1}, \quad\text{where}\quad U_\rho(p) := U(p)\,\pi_\alpha(\gamma_\rho(p^{-1})). \tag{2.7}$$
Here $p \to U_\rho(p)$ is the representation of the (restricted) Poincaré group in $\pi_{\alpha\circ\rho} := \pi_\alpha \circ \rho$, which can thus be constructed from $U$ and the cocycle. To complete the proof of (∗) it remains to establish the spectrum condition, which is done in [14, 17].

Using the kink automorphisms, all superselection sectors, that is all of the relevant representations of $\mathcal{A}$, can therefore be realized on a fixed reference Hilbert space, as in the DHR case [12, 13]. In detail, pick a reference vacuum state and let $(\mathcal{H}_\alpha, \pi_\alpha, \Omega)$ denote its GNS triple. For an interpolating automorphism $\rho$ consider the representation $\pi_\alpha \circ \rho$ with Hilbert space $\mathcal{H}_{\alpha\circ\rho}$. An automorphism $\hat\rho = \mathrm{Ad}V \circ \rho$ unitarily related to $\rho$ induces a unitarily equivalent representation $\pi \circ \hat\rho = \mathrm{Ad}W \circ (\pi \circ \rho)$, $W = \pi(V)$, and vice versa. The space of cone-localized unitary intertwiners $V : \rho \to \hat\rho$ is denoted by $(\rho|\hat\rho)$. A generalized state is an equivalence class of pairs
$$(\pi, \psi_\alpha) \sim (\mathrm{Ad}W \circ \pi,\ W\psi_\alpha), \quad W = \pi(V),\ V \in (\rho|\hat\rho), \tag{2.8}$$
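where $\psi_\alpha \in \mathcal{H}_\alpha$ and $\pi = \pi_\alpha \circ \rho$ for some interpolating automorphism $\rho$. We shall use $[\pi, \psi_\alpha]$ to denote the equivalence class generated by the pair $(\pi, \psi_\alpha)$. Each equivalence class $[\pi, \psi_\alpha]$ defines a state over $\mathcal{A}$ by means of the assignment
$$\mathcal{A} \ni A \longrightarrow \frac{(\psi_\alpha, \pi(A)\psi_\alpha)}{(\psi_\alpha, \psi_\alpha)}. \tag{2.9}$$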
The classes $\Omega_{\alpha\circ g} := [\pi_\alpha \circ g, \Omega]$ play the role of the vacua. The inner product of two pairs $(\pi, \psi)$ and $(\pi', \psi')$ is declared to vanish when $\pi$ and $\pi'$ are not unitarily equivalent. Otherwise it is defined to be $(\psi, \psi')$ for representatives such that $\pi = \pi'$. In particular the norm of $(\pi, \psi)$ is the norm of $\psi$. The space of all pairs $(\pi, \psi_\alpha)$ equipped with this inner product and norm is called the state bundle and is denoted by $\mathcal{H}$. The assumption (ρ4) allows one to choose a global section in this bundle. To see this let $\rho_h$, $h \in G$, be the collection of automorphisms forming a representation of $G$. As noted before, ${}_g\rho_h := \rho_{hg^{-1}}\, g$ is then an automorphism of type $(g, h)$ and any other of the same type is unitarily equivalent to it. Letting now $(g, h)$ run through $G \times G$, each sector is visited once and only once. If we denote by ${}_g\mathcal{H}_h$ the Hilbert space of pairs $({}_g\rho_h, \psi_\alpha)$, $\psi_\alpha \in \mathcal{H}_\alpha$, the direct sum $\bigoplus_{g,h} {}_g\mathcal{H}_h$ provides a global section through the state bundle $\mathcal{H}$.

Next we define an extension of the observable algebra acting irreducibly on $\mathcal{H}$. It is obtained from pairs $(\rho, A)$ consisting of a kink automorphism and a quasilocal operator $A \in \mathcal{A}$. They act on pairs $(\pi, \psi_\alpha)$ by $(\rho, A)(\pi, \psi_\alpha) = (\pi \circ \rho, \pi(A)\psi_\alpha)$. The associated generalized state $[(\rho, A) \circ (\pi, \psi_\alpha)]$ defines a kink state over $\mathcal{A}$ by (2.9). The equivalence relation (2.8) induces a corresponding one $(\mathrm{Ad}V \circ \rho, VA) \sim (\rho, A)$ on the pairs $(\rho, A)$. The set of such pairs can be given the structure of an associative ∗-algebra with multiplication and ∗-operation given by
$$(\rho_2, A_2)(\rho_1, A_1) = (\rho_1\rho_2,\ \rho_1(A_2)A_1),$$
$$(\rho, A)^* = (\rho^{-1}, \rho^{-1}(A^*)) \sim (\bar\rho,\ \bar\rho(A^*)V_r). \tag{2.10}$$
The automorphism $\bar\rho$ entering in the second line is defined by $\bar\rho = \alpha_r \circ \rho \circ \alpha_r \circ (gh)^{-1}$, where $\rho$ is of type $(g, h)$. Further $\alpha_r$ is the automorphism of $\mathcal{A}$ associated with the reflection $r = p(0, 0, -1) \in P_+$ and $V_r$ is a unitary. The automorphism $\bar\rho$ is of type $(g^{-1}, h^{-1})$ and has interpolation region $-K$, if $K$ is the interpolation region of $\rho$. The former implies that $\bar\rho$ is unitarily related to $\rho^{-1}$, i.e. $\bar\rho = \mathrm{Ad}V_r \circ \rho^{-1}$, but has reflected interpolation region. In either version, the ∗-operation is compatible with the inner product on $\mathcal{H}$. We shall refer to this algebra as the “kink algebra” and denote it by $\mathcal{F}$. It can be given a net structure satisfying Poincaré covariance and isotony. The action of the (restricted) Poincaré group is
$$(\rho, A)\ \xrightarrow{\;p\;}\ (\rho, A_\rho(p)) =: U(p)(\rho, A)U(p)^{-1}, \quad\text{with}\quad A_\rho(p) := \gamma_\rho(p)^*\alpha_p(A), \tag{2.11}$$
where $\gamma_\rho$ is a $P_+^\uparrow$-cocycle for $\rho$. As anticipated by the notation, the action (2.11) of the Poincaré group on pairs $(\rho, A)$ commutes with the composition law and the ∗-operation (2.10) due to the cocycle identity (2.5a). Concerning the localization properties, we say that a kink operator $(\rho, A)$ is localized in a double cone $K$ if there exists a representative $(\hat\rho, \hat A)$ in the unitary equivalence class $(\mathrm{Ad}V \circ \rho, VA) \sim (\rho, A)$ such that $\hat\rho$ has interpolation region $K$ and $\hat A \in \mathcal{A}(K)$. With this definition one shows that $(\rho, A_\rho(x))$ is localized in $x + K$, if $(\rho, A)$ is localized in $K$, and that the product of two kink operators is localized in the smallest double cone containing the localization regions of the individual operators. The ∗-algebra of kink operators localized in $K$ is denoted by $\mathcal{F}(K)$, the subspace where $\rho$ is of type $(g, h)$ by ${}_g\mathcal{F}_h(K)$.

Further the kink algebra carries two (mutually commuting) internal group actions. First, the original spontaneously broken symmetry, which is implemented unitarily on $\mathcal{F}$ via
$$(\rho, A)\ \xrightarrow{\;g\;}\ (\rho, g(A)) = Q_g^{-1}(\rho, A)Q_g =: g(\rho, A), \tag{2.12}$$
with $Q_g = (g, \mathbb{1})$. In particular $Q_g$ connects the different vacuum sectors in $\mathcal{H}$ by $Q_g\Omega_\alpha = \Omega_{\alpha\circ g}$. Second, there is an unbroken dual symmetry acting on $\mathcal{F}$ by $({}_g\rho_h, A) \to \chi(g, h)({}_g\rho_h, A)$, where $\chi(g, h)$ is a character of $G \times G$. Both symmetries preserve the localization and commute with the ∗-operation and Poincaré transformations.

So far we have concentrated on the interpolation properties of the elements of the kink algebra $\mathcal{F}$. We now select those elements of $\mathcal{F}$ that generate interesting 1-particle states from a vacuum $\Omega_{\alpha\circ g}$. First this requires $\rho$ to be such that $\omega_\alpha \circ g\rho$ is a massive 1-particle state. In addition $\Phi \in \mathcal{A}$ must be chosen such that the spectral support of this state is contained in the mass shell $\{p \mid p_0 > 0,\ p^2 = m_a^2\}$. Finally it is natural to assume that $\Phi$ transforms irreducibly both under the action of the Lorentz group and under the action of $G$. Whence
$$U(\lambda)\mathbf{A}U(\lambda)^{-1} = e^{s_a\lambda}\mathbf{A}, \quad Q_g^{-1}\mathbf{A}Q_g = \chi_a(g)\,\mathbf{A}, \tag{2.13}$$
where $s_a \in \mathbb{R}$ is called the spin of $\mathbf{A}$ and $\chi_a \in \hat G$ the character of $\mathbf{A}$. We reserve an extra symbol $\mathbf{A} = (\rho, \Phi)$ for these elements of $\mathcal{F}$ and call them soliton operators, or 1-kink operators, of type $a = (g, h; m_a, s_a, \chi_a)$. (The special case where $\rho$ actually interpolates between equivalent vacua, i.e. where $\mathbf{A}$ is not a soliton operator proper, is included in this terminology.) The set of 1-kink operators $\mathbf{A}$ of type $a$ with interpolation region $K$ is denoted by $\mathcal{F}_a(K)$. By construction a soliton operator generates a 1-particle state from a vacuum sector. The 1-particle subspace of $\mathcal{H}$ of type $a$ is denoted by $\mathcal{H}_a^{(1)}$. For the soliton operators one computes the following exchange relations [20]:
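$$\mathbf{A}\,\mathbf{B} = \delta_{ab}(\pm)\,\mathbf{B}\,\mathbf{A}, \quad \pm(K_a - K_b) \succ 0,$$
$$\delta_{ab}(+) = \varepsilon(\rho_a, \rho_b)\,\chi_b^*(g_a)\chi_a(h_b), \quad \delta_{ab}(-) = \varepsilon(\rho_b, \rho_a)^*\,\chi_b^*(h_a)\chi_a(g_b). \tag{2.14}$$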
Here “$\succ$” denotes the partial ordering for double cones, i.e. $\tilde K \succ K \Leftrightarrow \tilde K - K \subset R$. The relations (2.13) and (2.14) clearly generalize to kink operators which are arbitrary products of soliton operators and which carry the induced quantum numbers.

(5) For the discussion of Spin-Statistics and the construction of a CPT operation in this context it seems indispensable at present to assume that the kink algebra $\mathcal{F}$ is generated by non-local Wightman fields (see however [16] for the case of conformal QFTs). In short, the assumption is that the relation between the algebra generated by these quantum fields and the kink algebra is a straightforward generalization of the relation between Wightman fields and the observable algebra. A detailed account and references on the latter can be found in [4]. Here a few remarks may be sufficient. Let $K_\nu$, $\nu \in \mathbb{N}$, be a sequence of double cones shrinking to a point, which we take to be the origin of the coordinate system. Let $\omega(H)$ be a function of the Hamiltonian (the generator of the time translations in (2.11)) which grows like $H^\alpha$, $\alpha < 1$, for large (spectral) values of $H$. We assume that for $A_\nu \in \mathcal{F}(K_\nu)$ the weak limit
$$\mathrm{w\text{-}}\lim_{\nu\to\infty}\, e^{-\omega(H)} A_\nu\, e^{-\omega(H)} =: e^{-\omega(H)} F\, e^{-\omega(H)} \tag{2.15}$$
exists and that $F(f) := \int d^2x\, f(x)\, U(x) F U(x)^{-1}$ defines a sesquilinear form on some dense subset of $\mathcal{H} \times \mathcal{H}$, with $F(x) = U(x) F U(x)^{-1}$ being a non-local Wightman field. These fields then inherit the algebraic structures on $\mathcal{F}$ (group actions, multiplication and ∗-operation, interpolation properties, exchange relations etc.). In particular to each Wightman field $F(x)$ a kink automorphism $\rho$ (with pointlike interpolation region) is associated and, after decomposition into irreducible components, also a group character $\chi$ and a Lorentz spin $s$. Cone-localized operators $F$ can be obtained by averaging with appropriate test functions. We retain the previous terminology by saying that $F$ is localized in $K$, if $F$ arises from averaging a field $F(x)$ with a test function supported in $K$ and if the kink automorphism associated with $F$ can be chosen to have interpolation region $K$. In a slight abuse of notation we also write $K \to \mathcal{F}(K)$ for the net (of unbounded operators) generated thereby, and continue to call the elements kink operators. Roughly speaking the original kink algebra of bounded operators can be recovered by taking the weak commutant [4], generalized to the kink algebra case. For the purposes here the distinction between both descriptions is only important for the construction of a CPT operation. In preparation of the latter, let $F_a(x)$, $F_b(x)$ be two (Wightman) soliton fields of type $a$, $b$, respectively. Their two-point function obeys [20]
$$(F_a(x)\Omega_\alpha,\ F_b(y)\Omega_\alpha) = \omega_a\omega_b^*\,(F_b(-y)^*\Omega_\alpha,\ F_a(-x)^*\Omega_\alpha), \quad (x - y)^2 < 0,$$
where
$$\omega_a = \frac{e^{i\pi s}}{\sqrt{\kappa_\rho}}\,\chi(h)^* = e^{-i\pi s}\sqrt{\kappa_\rho}\,\chi(g)^*. \tag{2.16}$$
Both expressions for $\omega_a$ coincide by the following spin-statistics relation [20]:
$$e^{2\pi i s} = \kappa_\rho\,\chi(hg^{-1}). \tag{2.17}$$
This is to say, the spin of a soliton field of type $(g, h; m, s, \chi)$ is determined up to an integer by $gh^{-1}$ and the character $\chi$. In order to get non-vanishing matrix elements in (2.16) $\rho_a$ and $\rho_b$ have to be of the same type $(g, h)$, in which case the phase in (2.16) only depends on the common unitary equivalence class $\rho \sim \mathrm{Ad}V \circ \rho$. In contrast, the character is not superselected, i.e. $\chi_a \neq \chi_b$, $\chi_a, \chi_b \in \hat G$, does not force the matrix element (2.16) to vanish. If the character were superselected, only “neutral” operators of trivial character could have a non-vanishing vacuum expectation value and consequently the vacuum states $\omega_{\alpha\circ g}$ would all be equal. From (2.16) one can anticipate the proper definition of the CPT operation, which for expositional reasons we defer to Sect. 4.

(6) Finally we assume “completeness of the particle picture” in the following sense. First, there are sufficiently many soliton operators/fields in $\mathcal{F}_a(K)$ such that upon averaging the translated operators with rapidly decaying wave functions all of $\mathcal{H}_a^{(1)}$ can be generated from a vacuum sector. Second, we assume the following version of asymptotic completeness. Given the collection of 1-particle Hilbert spaces $\mathcal{H}_a^{(1)}$, $a \in I$, one can apply a standard second quantization procedure to them, resulting in a Fock space. For the purposes here it is convenient to work with the free (“unsymmetrized”) Fock space $F$. A priori the Fock space $F$ is completely unrelated to the physical Hilbert space $\mathcal{H}$. The Haag–Ruelle theory in this context provides a constructive way to isometrically embed two distinguished (proper or improper) subspaces $\mathcal{H}^{ex}$, $ex = in/out$, of $\mathcal{H}$ into $F$. We assume that the image in $F$ can be identified with subspaces of “rapidity ordered” wave functions $F^{ex}$, $ex = in/out$; cf. Sect. 3, step 3. Since $F^{in}$ and $F^{out}$ are isometric this entails asymptotic completeness, i.e. $\mathcal{H}^{in} = \mathcal{H}^{out}$.

3. Aspects of a Haag–Ruelle Scattering Theory in 1 + 1 Dimensions

Here we describe those aspects of a Haag–Ruelle type scattering theory in 1+1 dimensions required for the derivation in Sect. 4. Compared to 3+1 dimensions there are two technical complications. First, the convergence for $t \to \pm\infty$ of the states built from the multi-particle interpolating fields is only guaranteed for velocity ordered configurations. Second, the particle concept itself is more complicated due to the existence of solitons. As remarked before, the inclusion of soliton states is crucial for the discussion of scattering theory, because otherwise asymptotic completeness cannot be expected to hold. The assumption (5) is not needed here and the DHR description is used throughout this section. The construction basically involves three steps:
1. Construction of 1-particle interpolating fields.
2. Construction of multi-particle scattering states.
3. Verification that the norms of these states factorize, yielding isometric embeddings $\mathcal{H}^{ex} \to F$.
Tailored towards the use of geometric modular structures we wish to use ingredients localized in a wedge domain, which requires an approximation procedure. In the following we describe the so-adapted Steps 1–3 consecutively.

1. Construction of the 1-particle interpolating fields: Let $(\rho, \Phi) \in \mathcal{F}_a(K)$ be a soliton operator of type $a$ and let $\hat\Phi_\rho(p) := \int d^2x\, e^{-ip\cdot x}\,\Phi_\rho(x)$ be the Fourier transform of the translated operator $\Phi_\rho(x)$. We define the 1-particle interpolating field by
$$\mathbf{A}(f^t|\theta) = \big(\rho, A(f^t|\theta)\big), \quad\text{where}\quad A(f^t|\theta) = \int \frac{d^2p}{(2\pi)^2}\,\hat f^t(p)\,\hat\Phi_\rho(p),$$
$$\hat f^t(p) = \hat f(p)\, e^{i(p_0 - \omega(p_1))t}, \quad \omega(p_1) = \sqrt{p_1^2 + m_a^2}. \tag{3.1}$$
Here $\hat f(p)$ is an energy-momentum distribution with the following features: It is smooth (infinitely differentiable) with compact support in $\mathbb{R}^{1,1}$ and non-vanishing connected intersection with the mass hyperboloid $p^2 = m_a^2$. For $\delta > 0$ we define the velocity support of $f$ by
$$v_\delta(f) = \{v(p) := p_1/\omega(p_1) \mid \|p - k\| \le \delta,\ k \in \mathrm{supp}(\hat f)\}, \tag{3.2}$$
where $v(p)$ is the velocity with respect to the Lorentz frame determined by the $x^0$ coordinate. In 1+1 dim. it is convenient to use coordinates $p_0 = \mu\cosh\theta$, $p_1 = \mu\sinh\theta$ on the forward lightcone, in which case the velocity is parametrized by the rapidity, $v(p) = \tanh\theta$. In particular $v_\delta(f)$ determines some closed rapidity interval. We shall refer to the center of this rapidity interval as the “average rapidity” $\hat\theta$. We also find it convenient to split the information contained in $\hat f(p)$ into two parts: first, an equivalence class of translated functions $\theta \to \hat f(\mu\cosh(\theta - \lambda), \mu\sinh(\theta - \lambda))$ for some $\lambda \in \mathbb{R}$; and second, the average rapidity $\hat\theta$ of $\hat f(p)$, which determines a unique member of this equivalence class. In the notation $A(f^t|\hat\theta)$ adopted in (3.1), the first argument refers to the equivalence class and the second to the average rapidity. The advantage of this notation is that Lorentz boosts act on the fields (3.1) basically by shifting the average rapidity, i.e.
$$\gamma_\rho(\lambda)^*\,\alpha_\lambda\big(A(f^t|\theta)\big) = e^{s_a\lambda}\,A(f^t|\theta + \lambda). \tag{3.3}$$
Let us now address the localization properties of the 1-particle interpolating field $A(f^t|\theta)$. In position space the expression (3.1) for $A(f^t|\theta)$ becomes
$$A(f^t|\theta) = \int d^2x\, f^t(x)\,\Phi_\rho(x), \quad\text{where}$$
$$f^t(x) = \int \frac{d^2p}{(2\pi)^2}\,\hat f^t(p)\, e^{-ip\cdot x} = \int d^2y\, D^t(x - y)\, f(-y), \quad D^t(x) = \int \frac{d^2p}{(2\pi)^2}\, e^{i(p_0 - \omega(p_1))t}\, e^{-ip\cdot x}. \tag{3.4}$$
Here $f$ is the Fourier transform of $\hat f$ (but for notational simplicity $f^t$ is the Fourier transform of $\hat f^t$ with sign reversed arguments). Since $\hat f$ has compact support in momentum space, $f$ and $f^t$ will not have compact support in position space, but will only be of “fast decrease”. In particular $A(f^t|\theta)$ is only a quasilocal field, not an element of any algebra $\mathcal{A}(K)$ associated with a bounded double cone $K$. With a view to the application of modular operators in a Rindler wedge situation we wish to approximate $A(f^t|\theta)$ by local fields. In preparation let us examine the decay properties of $f^t(x^0, x^1)$ in more detail. A standard integration by parts argument shows that it decays faster than any power of $|t - x^0|^{-1}$ for $|t - x^0| \to \infty$ with $x^1$ fixed. Similarly, for fixed $t$ it decays faster than any inverse power of $x^1$ for $x^1 \to \infty$. Of particular interest is the limit along trajectories of the form $x^0 = t$, $x^1 = -vt$, with $v \notin v_\delta(f)$. Ruelle's lemma [10] states that $f^t(t, -vt)$ decays faster than any inverse power of $t$ for $|t| \to \infty$. (A quick check on the signs is via the stationary phase approximation.) This motivates us to introduce compact regions
$$G^{t,\delta}(f) = \{x \in \mathbb{R}^{1,1} \mid x^0 \in [t - \delta, t + \delta],\ x^1 \in -x^0\, v_\delta(f)\}, \tag{3.5}$$
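whose spatial extension grows linearly in $|t|$. For a soliton operator $(\rho, \Phi) \in \mathcal{F}_a(K)$ then define
$$\mathbf{A}^\delta(f^t|\theta) = \big(\rho, A^\delta(f^t|\theta)\big), \quad A^\delta(f^t|\theta) = \int_{G^{t,\delta}(f)} d^2x\, f^t(x)\,\Phi_\rho(x), \tag{3.6}$$
and the bounded double cone
$$K^{t,\delta} = \mathrm{cone}\Big(\bigcup_{x \in G^{t,\delta}(f)} (x + K)\Big), \tag{3.7}$$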
where $\mathrm{cone}(G)$ denotes the smallest double cone containing the set $G$. One can then show:
(a) $\mathbf{A}^\delta(f^t|\theta)$ has interpolation region $K^{t,\delta}$.
(b) The norm of the difference of the fields (3.4) and (3.6) is bounded by some rapidly decaying function $d(t)$, i.e. $\|A(f^t|\theta) - A^\delta(f^t|\theta)\| < d(t)$.
The proof of (a) can be found in [17]; we only add that the use of Haag duality can be avoided, consistent with our assumptions in Sect. 2. In a slight abuse of notation we shall temporarily use $(\rho, A^\delta(f^t|\theta))$ also to denote the representative $(\mathrm{Ad}V^t \circ \rho,\ V^t A^\delta(f^t|\theta))$ ($V^t$ a cone-localized unitary) for which the automorphism has interpolation region $K^{t,\delta}$ and the operator is an element of $\mathcal{A}(K^{t,\delta})$. Given Ruelle's lemma in the form
$$\int_{\mathbb{R}^{1,1}\setminus G^{t,\delta}(f)} d^2x\, |f^t(x)| < d(t), \tag{3.8}$$
the proof of (b) amounts to
$$\|A(f^t|\theta) - A^\delta(f^t|\theta)\| = \Big\|\int_{\mathbb{R}^{1,1}\setminus G^{t,\delta}(f)} d^2x\, f^t(x)\,\Phi_\rho(x)\Big\| \le \|\Phi\| \int_{\mathbb{R}^{1,1}\setminus G^{t,\delta}(f)} d^2x\, |f^t(x)| < d(t)\,\|\Phi\|. \tag{3.9}$$
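The rapidity bookkeeping used in step 1 can be made concrete with a short numerical sketch. It is an illustration only, under stated assumptions: the helper names `momentum`, `velocity` and `boost` are hypothetical, and applying the boost matrix of Sect. 2 (written there for $x(\lambda)$) to the components $(p_0, p_1)$ is a convention chosen for this sketch. The code checks that $v(p) = p_1/\omega(p_1)$ equals $\tanh\theta$ on the mass shell and that a boost with parameter $\lambda$ shifts the rapidity additively, which is the mechanism behind (3.3).

```python
import math


def momentum(mass, theta):
    """On-shell momentum (p0, p1) = (m cosh(theta), m sinh(theta))."""
    return mass * math.cosh(theta), mass * math.sinh(theta)


def velocity(p1, mass):
    """Velocity v(p) = p1 / omega(p1), with omega(p1) = sqrt(p1^2 + m^2)."""
    return p1 / math.sqrt(p1 * p1 + mass * mass)


def boost(p0, p1, lam):
    """Boost with parameter lam, using the matrix form of x(lambda) in Sect. 2."""
    return (
        p0 * math.cosh(lam) + p1 * math.sinh(lam),
        p0 * math.sinh(lam) + p1 * math.cosh(lam),
    )


# Illustrative values only (assumptions of this sketch).
mass, theta, lam = 2.0, 0.4, 0.9

p0, p1 = momentum(mass, theta)
boosted = boost(p0, p1, lam)
shifted = momentum(mass, theta + lam)

velocity_error = abs(velocity(p1, mass) - math.tanh(theta))
shift_error = max(abs(boosted[0] - shifted[0]), abs(boosted[1] - shifted[1]))
recovered_theta = math.atanh(velocity(boosted[1], mass))
```

Because $\tanh$ is monotonic, a closed velocity interval $v_\delta(f)$ corresponds to a closed rapidity interval, and a boost translates that interval rigidly by $\lambda$.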
2. Construction of multi-particle scattering states: Let $(\rho_j, \Phi_j) \in \mathcal{F}_{a_j}(K_j)$, $1 \le j \le n$, be a collection of soliton operators with interpolation regions $K_j$ to be specified later. Let $\mathbf{A}_j(f^t|\theta) = (\rho_j, A_j(f^t|\theta))$ be the associated 1-particle interpolating fields and $\mathbf{A}^\delta_j(f^t|\theta) = (\rho_j, A^\delta_j(f^t|\theta))$ be the approximants (3.6). Using the composition law (2.10) the product fields can be computed, for which we introduce the shorthands
$$\mathbf{X}^{t,\delta} = \mathbf{A}^\delta_n(f_n^t|\theta_n) \ldots \mathbf{A}^\delta_1(f_1^t|\theta_1) =: (\rho_1 \ldots \rho_n,\ X^{t,\delta}),$$
$$\mathbf{X}^t = \mathbf{A}_n(f_n^t|\theta_n) \ldots \mathbf{A}_1(f_1^t|\theta_1) =: (\rho_1 \ldots \rho_n,\ X^t), \tag{3.10}$$
for the restricted and unrestricted case, respectively. For the reference vacuum $\Omega_\alpha = [\pi_\alpha, \Omega]$ consider the states $\mathbf{X}^t\Omega_\alpha$ and $\mathbf{X}^{t,\delta}\Omega_\alpha$. We wish to arrange the data on which these states depend such that for $t \to \infty$ they converge in norm to states in the physical Hilbert space $\mathcal{H}$. The norm of pairs $(\rho, A)$ or $(\pi, \Psi)$ here is simply defined as the norm of the second entry of the pair. The convergence can be achieved by an appropriate choice of the localization regions $K_j$ of the 1-particle operators and the velocity supports $v_\delta(f_j)$ of the wave functions. The proper requirements are
$$K_n \prec K_{n-1} \prec \ldots \prec K_1, \tag{3.11a}$$
$$v_n < v_{n-1} < \ldots < v_1, \quad \forall v_j \in v_\delta(f_j). \tag{3.11b}$$
The states $\mathbf{X}^t\Omega_\alpha$ with data (3.11) are the 1+1 dim. version of Hepp-Ruelle “non-overlapping states”. For the restricted fields $\mathbf{X}^{t,\delta}$ the condition (3.11b) guarantees that the ordering (3.11a) translates into an ordering of the bounded interpolation regions (3.7)
$$K_n^{t,\delta} \prec K_{n-1}^{t,\delta} \prec \ldots \prec K_1^{t,\delta}, \tag{3.12}$$
for large enough $t > 0$. Further the spatial distance between these double cones tends to infinity as $t \to \infty$. On the other hand one has the multi-particle generalization of (3.9),
$$\|\mathbf{X}^{t,\delta}\Omega_\alpha - \mathbf{X}^t\Omega_\alpha\| < d(t). \tag{3.13}$$
Combining (3.12) and (3.13) one can follow the classic arguments [10, 12] to show that
$$\Big\|\frac{d}{dt}\,\mathbf{X}^{t}\Omega_\alpha\Big\| < d(t), \tag{3.14}$$
for some rapidly decreasing function $d(t)$. A more detailed account can be found in Sect. 6.3 of [17]. From (3.14) one concludes that the family of vectors $\mathbf{X}^t\Omega_\alpha$ converges strongly for $t \to \infty$ to a vector in $\mathcal{H}$, which is the sought-for candidate for an n-particle “out” scattering state. It turns out to depend only on the 1-particle input data; in particular it is easily checked to be independent of the choice of the Lorentz frame used. Further, by (3.13) the restricted interpolating fields generate the same scattering states. To adhere to the rapidity notation usually employed in the context of form factors, we shall describe the limits in terms of improper momentum eigenstates as follows
$$\Psi^{out} := \lim_{t\to\infty} \mathbf{X}^{t,\delta}\Omega_\alpha = \lim_{t\to\infty} \mathbf{X}^{t}\Omega_\alpha =: \int \frac{d^n\theta}{(4\pi)^n}\, f_n(\theta_n) \cdots f_1(\theta_1)\, |\theta_n, a_n; \ldots; \theta_1, a_1\rangle_{out}, \tag{3.15}$$
where $f_j(\theta)$ stands for $f_j(m_{a_j}\cosh\theta,\ m_{a_j}\sinh\theta)$, $j = 1, \ldots, n$, and the massive 1-particle representations are $\pi_{a_j} = \pi_\alpha \circ \rho_j$. For simplicity we treat only “out” scattering states here. For “in” scattering states some of the ordering relations have to be reversed. Since we assume asymptotic completeness, it is convenient to treat them as CPT transforms of the “out” states, as we shall do later.
3. Isometric embedding $\mathcal{H}^{out} \to \mathcal{F}$: It remains to show that the norm of the limiting vectors (3.15) factorizes into a product of terms depending only on the 1-particle input data. As usual this follows from clustering, since the conditions (3.11) ensure that the spatial distances of the essential support regions (3.5) tend to infinity as $|t| \to \infty$ [10, 12, 13]. Details in the case at hand can be found in [17]. This factorization entails that the limiting states (3.15) can be identified with certain Fock space vectors having the same norm. In the momentum space description used before, an n-particle vector is represented by a wave function $f^{(n)}(\theta_n, a_n; \ldots; \theta_1, a_1)$ with ordered and separated rapidities $\theta_n < \ldots < \theta_1$, together with an assignment $(a_n, \ldots, a_1)$ to particle types. The space of sequences of such functions forms a subspace of the free Fock space $\mathcal{F}$ (where no relations among the creation and annihilation operators are imposed) built from the 1-particle Hilbert spaces. Explicitly
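$$\mathcal{F}^{out} \subset \mathcal{F} = \bigoplus_{n=1}^{\infty} \bigoplus_{I_n} \mathcal{H}^{(1)}_{a_n} \otimes \ldots \otimes \mathcal{H}^{(1)}_{a_1}, \qquad \mathcal{F}^{out} = \Big\{ (f^{(n)})_{n \ge 0} \ \Big|\ \mathrm{supp}\, f^{(n)} \in \{\theta \in \mathbb{R}^n \,|\, \theta_n < \ldots < \theta_1\} \Big\}. \tag{3.16}$$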
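The inner product on $\mathcal{F}$ is inherited from the 1-particle sectors. The inner product on 1-particle states of type a, b is
$$(A(f_1|\theta_1),\, A(f_2|\theta_2)) = \delta_{a,b} \int_{-\infty}^{\infty} \frac{d\theta}{4\pi}\, f_1^*(\theta) f_2(\theta) \iff {}_{out}\langle \theta_1, a | \theta_2, b \rangle_{out} = 4\pi\, \delta_{ab}\, \delta(\theta_1 - \theta_2). \tag{3.17}$$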
The isometric embedding $\mathcal{H}^{out} \to \mathcal{F}$ obtained thereby is a somewhat weaker result than in the usual Haag–Ruelle theory. The reason is that in $\mathcal{H}^{out}$ additional relations among the state vectors exist, which result from the exchange relations (2.14) of the field operators (in coordinate space) used in their construction. Correspondingly the image $\mathcal{F}^{sym}$ of $\mathcal{H}^{out}$ in $\mathcal{F}$ induced by (3.15) will consist of sequences of wave functions obeying certain “symmetry” relations. In momentum space their explicit description may be cumbersome. Nevertheless one expects that these relations allow one to extend the domain of definition of an n-particle momentum space wave function from, say, the sector $\{\theta \in \mathbb{R}^n \,|\, \theta_n < \ldots < \theta_1\}$ to all of $\mathbb{R}^n$, while preserving the norm. This is what the assumption in paragraph (6) of Sect. 2 amounts to. In other words, we can view the exchange relations (2.14) as defining an isometry between $\mathcal{F}^{sym}$ and $\mathcal{F}^{out}$. For the purposes here only the final isometry between $\mathcal{H}^{out}$ and $\mathcal{F}^{out}$ matters.

4. Cyclic Form Factor Equation and Modular Structures

After these lengthy preparations we now turn to the derivation proper of (1.1). The idea is to use the modular operators of a family of (right) wedge domains $R^t = c^t + R$ shifted along a path $t \to c^t \in \mathbb{R}^{1,1}$, such that the restricted interpolating fields at time t have support in $R^t = c^t + R$ and the action of geometric modular operators is defined. In this way Eq. (1.1) arises from the “KMS property” (1.2) of the modular operator $\Delta$. As a guideline let us recall how (1.2) arises from the defining relations of $(J, \Delta)$. The latter are: $J\Delta^{1/2} X\Omega = X^*\Omega$ for $X \in \mathcal{M}$ and $J\Delta^{-1/2} X'\Omega = X'^*\Omega$ for $X' \in \mathcal{M}'$, where $J\Delta J = \Delta^{-1}$. From this and the anti-unitarity of J one obtains (1.2) via [7, 8]
$$(\Omega,\, Y\Delta X\Omega) = (\Delta^{1/2} Y^*\Omega,\, \Delta^{1/2} X\Omega) = (JY\Omega,\, JX^*\Omega) = (X^*\Omega,\, Y\Omega) = (\Omega,\, XY\Omega), \qquad X, Y \in \mathcal{M}. \tag{4.1}$$
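The chain of equalities (4.1) can be checked numerically in a finite-dimensional toy model. The following is only an illustrative sketch and not part of the construction in the text: it assumes the standard matrix-algebra realization $\mathcal{M} = M_n(\mathbb{C}) \otimes 1$ acting on $\mathbb{C}^n \otimes \mathbb{C}^n$, with cyclic and separating vector $\Omega = \sum_i \sqrt{p_i}\, e_i \otimes e_i$ and modular operator $\Delta = \rho \otimes \rho^{-1}$, $\rho = \mathrm{diag}(p)$. The function name `kms_check` and all parameters are hypothetical.

```python
import numpy as np


def kms_check(n=3, seed=0):
    """Compare (Omega, Y Delta X Omega) with (Omega, X Y Omega) in a toy model.

    Assumed setup (illustration only): M = M_n(C) (x) 1 on C^n (x) C^n,
    Omega = sum_i sqrt(p_i) e_i (x) e_i, Delta = rho (x) rho^{-1}.
    """
    rng = np.random.default_rng(seed)

    # Strictly positive probabilities, so that Omega is cyclic and separating.
    p = rng.random(n) + 0.1
    p /= p.sum()
    rho = np.diag(p)

    # Omega = sum_i sqrt(p_i) e_i (x) e_i in the basis ordering used by np.kron.
    omega = np.zeros(n * n, dtype=complex)
    for i in range(n):
        omega[i * n + i] = np.sqrt(p[i])

    # Modular operator of the pair (M, Omega) in this model.
    delta = np.kron(rho, np.linalg.inv(rho))

    # Two arbitrary elements X, Y of M, embedded as X (x) 1 and Y (x) 1.
    eye = np.eye(n)
    X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Y = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    X_hat, Y_hat = np.kron(X, eye), np.kron(Y, eye)

    # np.vdot conjugates its first argument, matching an inner product
    # that is antilinear in the first slot.
    lhs = np.vdot(omega, Y_hat @ delta @ X_hat @ omega)
    rhs = np.vdot(omega, X_hat @ Y_hat @ omega)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = kms_check()
    print(abs(lhs - rhs))
```

In this model both sides reduce to $\mathrm{Tr}(\rho XY)$, which is why the difference vanishes up to rounding.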
The aim in the following is to transfer this computation to the situation at hand. To this end one first has to ensure that the n-particle interpolating fields have support in $R^t$ and that the action of $\Delta^{1/2}$ is defined on the vectors generated by them. Let $\mathbf{A}^\delta_j(f^t_j|\theta_j)$, $1 \le j \le n$, be a collection of 1-particle interpolating fields with data satisfying (3.11). Let $(\rho_j, A^\delta_j(f^t_j|\theta_j))$ be the representatives for which $\rho_j$ has interpolation region $K_j^{t,\delta}$ and $A^\delta_j(f^t_j|\theta_j) \in \mathcal{A}(K_j^{t,\delta})$. For a suitably chosen $z^t \in \mathbb{R}^{1,1}$ we shall be interested in the analyticity properties in $\lambda$ of vectors of the form $U_n(\lambda) U_n(-z^t) \pi_\alpha(X^{t,\delta})\Omega$, with $X^{t,\delta}$ as in (3.10). Writing this vector out explicitly, the kink representations $\pi_j := \pi_\alpha \circ \rho_1 \ldots \rho_j$ appear and we will use $U_j$ to denote $U_{\rho_1 \ldots \rho_j}$ in (2.7). One finds
$$U_n(\lambda) U_n(-z^t) \pi_\alpha(X^{t,\delta})\Omega = e^{(s_{a_n} + \ldots + s_{a_1})\lambda} \int_{G_n^t} d^2y_n \ldots \int_{G_1^t} d^2y_1\, f_n^t(y_n) \ldots f_1^t(y_1)\, U_n(y_n(\lambda) - z^t(\lambda)) \times \pi_{n-1}(\Phi_n)\, U_{n-1}(y_{n-1}(\lambda) - y_n(\lambda)) \ldots \pi_1(\Phi_2)\, U_1(y_1(\lambda) - y_2(\lambda))\, \pi_\alpha(\Phi_1)\Omega. \tag{4.2}$$
Here $z^t \in \mathbb{R}^{1,1}$ is chosen such that $-z^t + G_n^t \subset R$ and to simplify the notation we wrote $G_j^t$ for $G^{t,\delta}(f_j)$, $1 \le j \le n$. The guideline to determine the analyticity properties of (4.2) is the following simple fact. Let $p \to U(p)$ be a strongly continuous unitary representation of $P_+^\uparrow$ on a separable Hilbert space obeying the spectrum condition. Consider $U(x(\lambda)) = U(\lambda) U(x) U(\lambda)^{-1}$ for $x \in \mathbb{R}^{1,1}$, with the notation $x^0(\lambda) = x^0\, \mathrm{ch}\lambda + x^1\, \mathrm{sh}\lambda$, $x^1(\lambda) = x^0\, \mathrm{sh}\lambda + x^1\, \mathrm{ch}\lambda$. Then
$$\lambda \to U(x(\lambda)) \ \text{is analytic in} \ \begin{cases} 0 < \mathrm{Im}\,\lambda < \pi, & \text{if } x \in R, \\ -\pi < \mathrm{Im}\,\lambda < 0, & \text{if } x \in L, \end{cases} \tag{4.3}$$
and continuous on the boundary of these strips. Further $U(x(\lambda))$ is a bounded operator when $\lambda$ lies in one of the strips. Applied to the vector (4.2) one sees that the dependence on $\lambda$ is analytic in the strip $0 < \mathrm{Im}\,\lambda < \pi$. Indeed, since the spatial distance between the regions $G_j^t$ increases with t, there exists a $t_0 > 0$ such that $\mathrm{cone}(G_n^t) \prec \ldots \prec \mathrm{cone}(G_1^t)$ for $t \ge t_0$, which implies $y_j - y_{j+1} \in R$ for $j = 1, \ldots, n-1$ with $y_j \in G_j^t$. For the argument of $U_n$ the condition $y_n - z^t \in R$ holds by definition of $z^t$. In summary, we found that for a suitably chosen $z^t \in \mathbb{R}^{1,1}$ the support regions $G_j^t = G^{t,\delta}(f_j)$ are contained in a shifted right wedge domain $z^t + R$, for $t > t_0$. In this shifted wedge $U_{n,z^t}(\lambda) := U_n(z^t) U_n(\lambda) U_n(-z^t)$ plays the role of the Lorentz boost generator and acts consistently on the vector $\pi_\alpha(X^{t,\delta})\Omega$ for $0 \le \mathrm{Im}\,\lambda \le \pi$. The localization regions $K_j^{t,\delta}$ of the 1-particle interpolating fields are not necessarily contained in $z^t + R$. However, since they are likewise compact regions, related to $G_j^t$ by (3.7), one can find a $c^t \in \mathbb{R}^{1,1}$ (timelike and future-pointing) and $t_1 \ge t_0$ such that $K_j^{t,\delta} \subset c^t + R$ for all $t > t_1$. With this definition of $R^t := c^t + R$ one has $A^\delta_j(f^t_j|\theta_j) \in \mathcal{A}(K_j^t) \subset \mathcal{A}(R^t)$. It is easy to see that such localization properties are preserved under composition of kink operators.
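The mechanism behind the analyticity statement (4.3) can be illustrated with a small numerical sketch. It assumes, as a convention not fixed in the text above, that translations act as $U(x) = e^{iP\cdot x}$ with $P \cdot x = P^0 x^0 - P^1 x^1$, so that a spectral value $p$ contributes a factor of modulus $e^{-p \cdot \mathrm{Im}\, x(\lambda)}$. For $x$ in the right wedge and $0 < \mathrm{Im}\,\lambda < \pi$ the imaginary part of $x(\lambda)$ lies in the forward light cone, so this exponent is positive for every $p$ in the spectrum; the helper names below are hypothetical.

```python
import cmath
import math


def boost(x, lam):
    """Complexified boost x -> x(lam), in the convention quoted before (4.3)."""
    x0, x1 = x
    ch, sh = cmath.cosh(lam), cmath.sinh(lam)
    return x0 * ch + x1 * sh, x0 * sh + x1 * ch


def on_shell(mass, theta):
    """Momentum (p0, p1) of rapidity theta on the mass shell."""
    return mass * math.cosh(theta), mass * math.sinh(theta)


def damping_exponent(p, x, lam):
    """Return p . Im x(lam) with the Minkowski product p0*y0 - p1*y1.

    Assumption (illustration only): U(x) = exp(i P.x), so that
    |exp(i p.x(lam))| = exp(-damping_exponent(p, x, lam)).
    """
    y0, y1 = boost(x, lam)
    return p[0] * y0.imag - p[1] * y1.imag


if __name__ == "__main__":
    x_right = (0.3, 1.0)    # x^1 > |x^0|: a point of the right wedge R
    x_left = (0.3, -1.0)    # x^1 < -|x^0|: a point of the left wedge L
    for theta in (-2.0, 0.0, 2.0):
        p = on_shell(1.0, theta)
        print(theta,
              damping_exponent(p, x_right, 0.7 + 1.2j),
              damping_exponent(p, x_left, 0.7 - 1.2j))
```

Explicitly, for $\lambda = s + i\varphi$ one finds $\mathrm{Im}\, x^0(\lambda) \pm \mathrm{Im}\, x^1(\lambda) = \sin\varphi\, (x^1 \pm x^0)\, e^{\pm s}$, which is positive in both cases covered by (4.3).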
For the multiparticle interpolating fields (3.10) one can thus choose representatives $\rho_X = \mathrm{Ad}V^t \rho_1 \ldots \rho_n$ ($V^t$ a cone-localized unitary) having interpolation region $\mathrm{cone}(K_1^{t,\delta} \cup \ldots \cup K_n^{t,\delta})$ and $X(R^t) := V^t X^{t,\delta} \in \mathcal{A}(R^t)$. Having ensured that such a choice of representatives is possible, the outcome of the previous discussion is conveniently recast in terms of kink operators and generalized states. Set
$$\mathbf{U}_t(\lambda) := \mathbf{U}(c^t)\, \mathbf{U}(\lambda)\, \mathbf{U}(-c^t), \qquad \Delta_R^s := \mathbf{U}_t(2\pi i s), \qquad \Delta_L^s := \mathbf{U}_t(-2\pi i s), \quad s > 0. \tag{4.4}$$
Proposition 1. (a) Let $\mathbf{A}^\delta_j(f^t_j|\theta_j)$, $1 \le j \le n$, be restricted 1-particle interpolating fields with data satisfying (3.11). Then there exist wedge domains $R^t = c^t + R$ and $t_1 > 0$ such that the restricted n-particle interpolating field
$$\mathbf{X}(R^t) := \mathbf{A}^\delta_n(f^t_n|\theta_n) \ldots \mathbf{A}^\delta_1(f^t_1|\theta_1) = (\rho_X,\ X(R^t)), \tag{4.5}$$
has bounded interpolation region in $R^t$ for all $t > t_1$. Symbolically $\mathbf{X}(R^t) \in \mathcal{F}(R^t)$.
(b) $\Delta_R^s$ is a positive densely defined operator on $\mathcal{F}(R^t)\Omega_\alpha$, for all $0 \le s \le 1/2$.
(c) The generalized states $\mathbf{X}(R^t)\Omega_\alpha$ converge strongly to scattering states in $\mathcal{H}$ for $t \to \infty$.
Until here the assumption (5) did not enter. Now we employ it to construct a CPT operation on $\mathcal{F}$ in its Wightman version. Recall the notation $Q_g = (g, \mathbf{1})$ and let Q be the unitary involution on $\mathcal{H}$, acting like $Q_{(gh)^{-1}}$ on the sector ${}_g\mathcal{H}_h$. Define an operator $\Theta$ on $\mathcal{H}$ and $\mathrm{Ad}\Theta$ on $\mathcal{F}$ by
$$\Theta\, \mathbf{F}(x)\Omega_\alpha = \omega_a\, Q\, \mathbf{F}(-x)^*\Omega_\alpha, \qquad \Theta\, \mathbf{F}(x)\, \Theta = \omega_a\, Q_{gh}\, \mathbf{F}(-x)^*, \tag{4.6}$$
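where in the second equation $\mathbf{F}(x)$ is of type (g, h) and hence $Q_{gh}\mathbf{F}(-x)^*$ is of type (h, g). Then $\Theta$ has the following properties:
Proposition 2. (a) $\mathrm{Ad}\Theta$ is an antilinear ∗-automorphism of $\mathcal{F}$ and an involution. Further $\Theta$ is antiunitary w.r.t. the inner product on $\mathcal{H}$.
(b) The following commutation relations hold
$$\Theta\, \mathbf{U}(\pm i\pi) = \mathbf{U}(\mp i\pi)\, \Theta, \qquad \Theta\, \mathbf{U}(\lambda) = \mathbf{U}(\lambda)\, \Theta, \qquad \Theta\, Q_g = Q_g\, \Theta, \qquad \Theta\, \mathbf{U}(x) = \mathbf{U}(-x)\, \Theta. \tag{4.7}$$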
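(c) $\Theta\, \mathcal{F}(R)\, \Theta = \mathcal{F}(L)$ and vice versa.
The proof can be adapted from Rehren [20]. This CPT operator arises, up to a unitary factor, from the polar decomposition of the following Tomita operators:
$$S_+ \mathbf{F}\Omega_\alpha = Q\, h(\mathbf{F}^*)\Omega_\alpha, \quad \mathbf{F} \in {}_g\mathcal{F}_h(R), \qquad S_- \mathbf{F}\Omega_\alpha = Q\, g(\mathbf{F}^*)\Omega_\alpha, \quad \mathbf{F} \in {}_g\mathcal{F}_h(L). \tag{4.8}$$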
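In fact, the closures of the operators (4.8) can be seen to have adjoints related by $(S_\pm)^* = \kappa^{\pm 1} S_\mp$ and to admit polar decompositions,
$$S_\pm = J_\pm\, \mathbf{U}(\pm i\pi), \quad \text{with } J_\pm = \sqrt{\kappa}^{\,\pm 1}\, \Theta. \tag{4.9}$$
Here the unitary operator $\sqrt{\kappa}$ is declared to act by multiplication with $\sqrt{\kappa_\rho}$ on $\mathcal{H}_{\alpha\circ\rho}$. The origin of the unitary factor $\sqrt{\kappa}^{\,\pm 1}$ can be understood from the relations
$$\Theta\, \mathbf{U}(i\pi)\, \mathbf{F}\Omega_\alpha = \sqrt{\kappa}^{\,-1}\, Q\, h(\mathbf{F}^*)\Omega_\alpha, \quad \mathbf{F} \in {}_g\mathcal{F}_h(R), \qquad \Theta\, \mathbf{U}(-i\pi)\, \mathbf{F}\Omega_\alpha = \sqrt{\kappa}\, Q\, g(\mathbf{F}^*)\Omega_\alpha, \quad \mathbf{F} \in {}_g\mathcal{F}_h(L). \tag{4.10}$$
Next we show that the CPT operation declared via (4.6) on the kink operators induces a CPT operation on scattering states having all the required properties. The CPT conjugate of a 1-particle (Wightman) interpolating field of type (g, h) naturally is
$$\Theta\, (\rho,\ A^\delta(f^t|\theta))\, \Theta = (j(\rho),\ A^\delta_{CPT}(f^{*\,-t}|\bar\theta)), \quad \text{where} \quad j(\rho) = \bar\rho\, gh, \quad A^\delta_{CPT}(f^{*\,-t}|\bar\theta) = \omega_a \int_{G^{-t,\delta}(f^*)} d^2y\, f^{*\,-t}(y)\, \Phi^*(y). \tag{4.11}$$
We have displayed the representatives localized in $-K^{t,\delta} = -G^{t,\delta}$ and rewrote the operator such that the time reversal is manifest. The complex conjugate $f^*$ of $f$ plays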
the role of the charge conjugate wave function, whose average rapidity is denoted by $\bar\theta$. Of course the spacetime reflection here is with respect to the origin of the chosen coordinate system and exchanges R with L, rather than the “comoving” wedge domains $R^t = c^t + R$ and $c^t + L =: L^t$. A CPT operation doing the latter is
$$\Theta_t := \mathbf{U}(c^t)\, \Theta\, \mathbf{U}(-c^t). \tag{4.12}$$
Let then $\mathbf{X}(R^t)$ be an n-particle interpolating field as in (4.5) and consider its CPT conjugate
$$\Theta_t\, \mathbf{X}(R^t)\, \Theta_t = (j(\rho_n) \ldots j(\rho_1),\ X(R^t)_{CPT}) =: \mathbf{X}(R^t)_{CPT}. \tag{4.13}$$
One easily sees that there exist representatives, displayed in the middle term, for which $X(R^t)_{CPT} \in \mathcal{A}(L^t)$ and $j(\rho_1) \ldots j(\rho_n)$ has bounded interpolation region in $L^t := c^t + L$ for $t < -t_1$. As before one can use them to study the analyticity properties in $\lambda$ of the Lorentz boosted state $\mathbf{U}(\lambda)\mathbf{U}(-c^t)\, \mathbf{X}(R^t)_{CPT}\Omega_\alpha$ as in (4.2). With the data for $\mathbf{X}(R^t)$ as in Proposition 1, the dependence on $\lambda$ is found to be analytic in the strip $-\pi < \mathrm{Im}\,\lambda < 0$. It follows that the action of $\Delta_L^s$, $0 \le s \le 1/2$, is defined on $\mathbf{X}(R^t)_{CPT}$. Since the latter generate $\mathcal{F}(L^t)$ one concludes that
$$\Theta_t\, \mathcal{F}(R^t)\, \Theta_t = \mathcal{F}(L^t), \qquad \Theta_t\, \Delta_L^s\, \Theta_t = \Delta_R^s, \quad 0 \le s \le 1/2, \ \text{on } \mathcal{F}(R^t)\Omega_\alpha. \tag{4.14}$$
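Further
$$S_R \mathbf{X}\Omega_\alpha = Q\, h(\mathbf{X}^*)\Omega_\alpha, \quad \mathbf{X} \in {}_g\mathcal{F}_h(R^t), \quad \text{with } S_R = \sqrt{\kappa}\, \Theta_t\, \Delta_R^{1/2},$$
$$S_L \mathbf{X}\Omega_\alpha = Q\, g(\mathbf{X}^*)\Omega_\alpha, \quad \mathbf{X} \in {}_g\mathcal{F}_h(L^t), \quad \text{with } S_L = \sqrt{\kappa}^{\,-1}\, \Theta_t\, \Delta_L^{1/2}. \tag{4.15}$$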
In particular the state $\mathbf{X}(R^t)_{CPT}\Omega_\alpha$ converges strongly to a scattering state in $\mathcal{H}$ for $t \to \infty$. On the improper scattering states (3.15) the following CPT operation is induced:
$$J\, |\theta_n, a_n; \ldots; \theta_1, a_1\rangle_{out} = |\bar\theta_1, \bar a_1; \ldots; \bar\theta_n, \bar a_n\rangle_{in} = |\bar\theta_n, j(a_n); \ldots; \bar\theta_1, j(a_1)\rangle_{in}. \tag{4.16}$$
Here $a_k$, $\bar a_k$ and $j(a_k)$ refer to the massive 1-particle representations $\pi_\alpha \circ \rho_k$, $\pi_\alpha \circ \bar\rho_k$ and $\pi_\alpha \circ j(\rho_k)$, respectively. Further $\theta$ and $\bar\theta$ are the average rapidities of a momentum space wave function $\hat f$ and its complex conjugate, respectively. From (4.16) one readily checks that J has all the familiar properties of a CPT operation on scattering states. In particular it leaves the scattering operator invariant, $JSJ = S^{-1}$, and the scattering operator S itself can be written as a product of J and the free CPT operator on the Fock space; see also [19]. Having all these ingredients at our disposal we can eventually transfer the computation (4.1) to the case at hand. Introduce generalized operators $\mathbf{X}(R^t)$, $\mathbf{Y}(R^t)$ by
$$\mathbf{X}(R^t) := \mathbf{A}^\delta_n(f^t_n|\theta_n) \ldots \mathbf{A}^\delta_{n-k+1}(f^t_{n-k+1}|\theta_{n-k+1}) = (\rho_X,\ X(R^t)), \qquad \mathbf{Y}(R^t) := \mathbf{A}^\delta_{n-k}(f^t_{n-k}|\theta_{n-k}) \ldots \mathbf{A}^\delta_1(f^t_1|\theta_1) = (\rho_Y,\ Y(R^t)), \tag{4.17}$$
where the data $f_j$ and $K_j$, $1 \le j \le n$, are as in Proposition 1 and the terms on the right denote the representatives with interpolation region $K_X := \mathrm{cone}(K_n^{t,\delta} \cup \ldots \cup K_{n-k+1}^{t,\delta})$ and $K_Y := \mathrm{cone}(K_{n-k}^{t,\delta} \cup \ldots \cup K_1^{t,\delta})$, respectively. Further let $\mathbf{O} = (\rho_O, O)$ be a kink operator and choose $d^t \in \mathbb{R}^{1,1}$ such that the translated operator $\mathbf{O}(d^t) := \mathbf{U}(d^t)\, \mathbf{O}\, \mathbf{U}(-d^t)$ has interpolation region $K_O$ satisfying $K_X \prec K_O \prec K_Y$, for large t. In order to have nonvanishing matrix elements of the form required, these operators have to satisfy an appropriate “charge balance” condition. Explicitly, for some $k \in G$ we assume that
$$\rho_Y\, \rho_O\, \rho_X = k, \tag{4.18}$$
and write $\Omega_\beta := \Omega_{\alpha\circ k}$. Further we abbreviate momentarily $\mathbf{X} = \mathbf{X}(R^t)$, $\mathbf{Y} = \mathbf{Y}(R^t)$. Using (4.18) one verifies that all the matrix elements in the following chain of equalities are well-defined
$$\big(\Delta_R^{1/2}\, \mathbf{Y}^*\mathbf{O}^*(d^t)\Omega_\beta,\ \Delta_R^{1/2}\, \mathbf{X}\Omega_\alpha\big) = \big(\Theta_t \sqrt{\kappa}^{\,-1} S_R\, \mathbf{Y}^*\mathbf{O}^*(d^t)\Omega_\beta,\ \Theta_t \sqrt{\kappa}^{\,-1} S_R\, \mathbf{X}\Omega_\alpha\big) = \big(S_R \mathbf{X}\Omega_\alpha,\ S_R \mathbf{Y}^*\mathbf{O}^*(d^t)\Omega_\beta\big) = \big(k^{-1} h_X(\mathbf{X}^*)\Omega_\beta,\ h_Y^{-1} h_O^{-1}(\mathbf{O}(d^t)\mathbf{Y})\Omega_\alpha\big). \tag{4.19}$$
In the last expression we extract the character phases using (2.13) and then exchange the order of $\mathbf{X}$ and $\mathbf{O}(d^t)$ using (2.14). Reinserting into (4.19) results in the following identity
$$\eta\, \big(\mathbf{O}^*(d^t)\mathbf{X}^*\Omega_\beta,\ \mathbf{Y}\Omega_\alpha\big) = \eta\, \big(\Omega_\beta,\ \mathbf{O}(d^t)\mathbf{X}\mathbf{Y}\Omega_\alpha\big) = \big(\mathbf{Y}^*\mathbf{O}^*(d^t)\Omega_\beta,\ \mathbf{U}_t(2\pi i)\, \mathbf{X}\Omega_\alpha\big), \tag{4.20}$$
where $\eta = \chi_{XOY}(h_X k^{-1})\, \delta_{XO}(-)$ is the accumulated phase, depending both on the statistics phases and the group characters of the involved kink operators. The first expression in (4.20) in particular shows that the $t \to \infty$ limit of these matrix elements exists and yields well-defined matrix elements between scattering states. Since $\mathbf{O}$ is a cone-localized operator each of the matrix elements is separately well-defined also for $d^t = 0$. On the other hand $\eta$ depends (for given kink operators) only on the orientation of the interpolating automorphisms and in particular is independent of $d^t$. The identity (4.20) thus remains valid when sending $d^t$ to zero. Adopting the notation from (3.15), (3.3) and (4.16) one arrives at
$$\eta\ {}_{in}\langle \bar\theta_{n-k+1} - i\pi, \bar a_{n-k+1}; \ldots; \bar\theta_n - i\pi, \bar a_n |\, \mathbf{O}\, | \theta_{n-k}, a_{n-k}; \ldots; \theta_1, a_1 \rangle_{out} = \eta\ {}_{out}\langle 0 |\, \mathbf{O}\, | \theta_n, a_n; \ldots; \theta_1, a_1 \rangle_{out} = {}_{out}\langle 0 |\, \mathbf{O}\, | \theta_{n-k}, a_{n-k}; \ldots; \theta_1, a_1; \theta_n + i2\pi, a_n; \ldots; \theta_{n-k+1} + i2\pi, a_{n-k+1} \rangle_{out} \tag{4.21}$$
for ordered and separated rapidities, i.e. $\theta_j - \theta_{j+1} > \epsilon$, $j = 1, \ldots, n-1$, with some positive constant $\epsilon$. Both the “crossing relation” and the “cyclic form factor equation” are special cases of (4.21). For example one has
$${}_{in}\langle \bar\theta_n, \bar a_n |\, \mathbf{O}\, | \theta_{n-1}, a_{n-1}; \ldots; \theta_1, a_1 \rangle_{out} = {}_{out}\langle 0 |\, \mathbf{O}\, | \theta_n + i\pi, a_n; \ldots; \theta_1, a_1 \rangle_{out},$$
$${}_{out}\langle 0 |\, \mathbf{O}\, | \theta_{n-1}, a_{n-1}; \ldots; \theta_1, a_1; \theta_n + 2\pi i, a_n \rangle_{out} = \eta\ {}_{out}\langle 0 |\, \mathbf{O}\, | \theta_n, a_n; \ldots; \theta_1, a_1 \rangle_{out}. \tag{4.22}$$
Analogous formulae with “in” and “out” scattering states exchanged follow from (4.16). The purpose of this paper was to provide a quantum field theoretical derivation of the cyclic form factor Eq. (1.1) or (4.21). The derivation given shows that it is a generic feature – not tied to integrability – of massive 1+1 dim. QFTs with a proper relativistic scattering theory. Keeping this in mind, we propose retaining the term “cyclic form
factor equation” for it. The main technical tool in the derivation was the use of a family of Rindler spacetimes $t \to R^t$, comoving with the essential support regions of the interpolating quantum fields, to transfer the action of geometric modular structures to scattering states. We expect that a 4-dim. counterpart of the cyclic form factor equation can be derived along similar lines, to which we intend to return elsewhere.
Acknowledgement. I wish to thank H.-J. Borchers and K.-H. Rehren for important discussions, the latter in particular for clarifying much of the material of Sect. 2. The author acknowledges support by the Reimar Lüst fellowship of the Max-Planck-Society.
References
1. Smirnov, F.A.: A general formula for soliton form factors in the quantum Sine–Gordon model. J. Phys. A19, L575–L578 (1986)
2. Smirnov, F.A.: Form Factors in Completely Integrable Models of Quantum Field Theory. Singapore: World Scientific, 1992
3. Karowski, M. and Weisz, P.: Exact form factors in 1 + 1 dim. field theoretical models with soliton behaviour. Nucl. Phys. B139, 455–476 (1978)
4. Baumgärtel, H. and Wollenberg, M.: Causal nets of Operator Algebras. Berlin: Akademie Verlag, 1992
5. Bisognano, J. and Wichmann, E.: On the duality condition for a hermitian scalar field. J. Math. Phys. 16, 985–1007 (1975); Bisognano, J. and Wichmann, E.: On the duality condition for quantum fields. J. Math. Phys. 17, 303–321 (1976)
6. Borchers, H.-J.: The CPT theorem in 2-dim. theories of local observables. Commun. Math. Phys. 143, 315–332 (1992)
7. Haag, R., Hugenholtz, W. and Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215–236 (1967)
8. Takesaki, M.: Tomita’s theory of modular Hilbert algebras and its application. Lecture Notes in Mathematics, Berlin–Heidelberg–New York: Springer, 1970
9. Bratteli, O. and Robinson, D.: Operator Algebras and Quantum Statistical Mechanics, Vol. 1,2; 2nd ed., Berlin–Heidelberg–New York: Springer, 1987
10. Haag, R.: Quantum field theories with composite particles and asymptotic conditions. Phys. Rev. 112, 669 (1958); Ruelle, D.: On the asymptotic condition in quantum field theory. Helv. Phys. Acta 35, 147 (1962)
11. Haag, R.: Local Quantum Physics, 2nd ed., Berlin–Heidelberg–New York: Springer, 1996
12. Doplicher, S., Haag, R. and Roberts, J.: Local observables and particle statistics I, II. Commun. Math. Phys. 23, 199–230 (1971) and 35, 49–85 (1974)
13. Buchholz, D. and Fredenhagen, K.: Locality and the structure of particle states. Commun. Math. Phys. 84, 1 (1982)
14. Fröhlich, J.: New super-selection sectors (soliton states) in 2-dim bose quantum field models. Commun. Math. Phys. 47, 269–310 (1976)
15. Fredenhagen, K.: Superselection sectors in low dim. QFT. J. Geom. Phys. 11, 337 (1993)
16. Guido, D. and Longo, R.: The Conformal Spin-Statistics Theorem. Commun. Math. Phys. 181, 11 (1996)
17. Schlingemann, D.: On the algebraic theory of kink sectors: Application to quantum field theory models and collision theory. PhD thesis 1996, DESY-96-228
18. Müger, M.: Superselection structure of massive QFTs in 1+1 dim., DESY 97-081, hep-th/9705019
19. Schroer, B.: Modular localization and the bootstrap-form factor program. Nucl. Phys. B499, 547 (1997)
20. Rehren, K.-H.: Spin-Statistics and CPT for Solitons, hep-th/9711085

Communicated by A. Connes
Commun. Math. Phys. 196, 429 – 443 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A q-Deformation of the Parastatistics and an Alternative to the Chevalley Description of $U_q[osp(2n + 1/2m)]$
T. D. Palev*
International Centre for Theoretical Physics, 34100 Trieste, Italy
Received: 2 September 1997 / Accepted: 15 February 1998
Abstract: The paper contains essentially two new results. Physically, a deformation of the parastatistics in the sense of quantum groups is carried out. Mathematically, an alternative to the Chevalley description of the quantum orthosymplectic superalgebra $U_q[osp(2n+1/2m)]$ in terms of m pairs of deformed parabosons and n pairs of deformed parafermions is outlined.

1. Introduction

In this paper we give an alternative to the Chevalley definition of the quantum superalgebra $U_q[osp(2n + 1/2m)]$ in terms of generators and relations (see Eqs. (30)). We generalize to the quantum case a result we have recently obtained [20], namely that the universal enveloping algebra $U[osp(2n + 1/2m)]$ of the orthosymplectic Lie superalgebra $osp(2n + 1/2m)$ is an associative unital algebra with generators, called Green generators (operators),
$$a_1^\pm,\ a_2^\pm,\ \ldots,\ a_{m-1}^\pm,\ a_m^\pm,\ a_{m+1}^\pm,\ \ldots,\ a_{m+n}^\pm \equiv a_N^\pm, \tag{1}$$
and relations
$$[[a^\eta_{N-1}, a^\eta_N], a^\eta_N] = 0, \qquad [\![\,[\![a_i^\eta, a_j^{-\eta}]\!],\, a_k^\eta]\!] = 2\eta^{\langle k \rangle}\, \delta_{jk}\, a_i^\eta, \quad \forall\, |i - j| \le 1, \quad \eta = \pm, \tag{2}$$
where
$$\deg(a_i^\pm) \equiv \langle i \rangle = \begin{cases} \bar 1, & \text{for } i \le m, \\ \bar 0, & \text{for } i > m. \end{cases} \tag{3}$$
Here and throughout $[\![a, b]\!] = ab - (-1)^{\deg(a)\deg(b)}\, ba$, $[a, b] = ab - ba$, $\{a, b\} = ab + ba$.

* Permanent address: Institute for Nuclear Research and Nuclear Energy, 1784 Sofia, Bulgaria. E-mail: [email protected]
The motivation for the present work stems from the observation that the Green operators provide a description of $osp(2n + 1/2m)$ via generators, which, contrary to the Chevalley elements, have a direct physical significance. As was shown in Ref. 20, $a_1^\pm, a_2^\pm, \ldots, a_{m-1}^\pm, a_m^\pm$ (resp. $a_{m+1}^\pm, \ldots, a_{m+n}^\pm \equiv a_N^\pm$) are para-Bose (pB) (resp. para-Fermi (pF)) operators. These operators were introduced in quantum field theory as a possible generalization of the statistics of the tensor (resp. spinor) fields [10]. Therefore what we are actually doing here is a simultaneous deformation of the pB and the pF operators (2) in the sense of quantum groups [6,14]. The fact that n pairs of pF creation and annihilation operators (CAOs) generate the Lie algebra $so(2n+1)$ was first observed in Refs. 16 and 33. It took some time to incorporate the para-Bose statistics into an algebraic structure: $a_1^\pm, \ldots, a_m^\pm$ are odd elements, generating a Lie superalgebra [19] isomorphic to $osp(1/2m)$ [9]. It is usually assumed that the pB operators commute with pF operators. Other possibilities were also investigated [11]. In Ref. 21 it was indicated that the relations between the pB and pF operators can be selected in such a way that they generate $osp(2n + 1/2m)$. The identification of the parastatistics with a well known algebraic structure has far reaching consequences. Firstly, it indicates that the representation theory of n pairs of pF operators (or of m pairs of pB operators) is completely equivalent to the representation theory of $so(2n + 1)$ (resp. of $osp(1/2m)$). In this way one may enlarge considerably [2, 22] the class of the known representations, those corresponding to a fixed order of the statistics. In particular, since the (complex) Lie algebra $so(2n + 1)$ has infinite-dimensional representations, so do the pF operators. Similarly $osp(1/2m)$ has finite-dimensional representations (for instance the defining one) and therefore the pB operators also have such representations. Secondly, it provides a natural background for further generalizations of the quantum statistics. In order to give a hint of where this possibility comes from, consider in the frame of quantum field theory a field $\Psi(x)$. In momentum space the translation invariance of the field is expressed as a commutator between the energy-momentum $P^m$, $m = 0, 1, 2, 3$, and the CAOs $a_i^\pm$ of the field:
$$[P^m, a_i^\pm] = \pm k_i^m\, a_i^\pm, \tag{4}$$
where the index i replaces all (continuous and discrete) indices of the field and
$$P^m = \sum_i k_i^m H_i. \tag{5}$$
In the case of the pF statistics $H_i = \frac{1}{2}[a_i^+, a_i^-]$, whereas for pB fields $H_i = \frac{1}{2}\{a_i^+, a_i^-\}$. In a unified form $H_i = \frac{1}{2}[\![a_i^+, a_i^-]\!]$ with the pF considered as even elements and the pB as odd. To quantize the field means, loosely speaking, to find solutions of Eqs. (4) and (5), where the unknowns are the CAOs $a_i^\pm$. The first opportunity for further generalizations is based on the observation that both $so(2n + 1)$ and $osp(1/2m)$ belong to the class B superalgebras in the classification of Kac [15]. Therefore it is natural to try to satisfy the quantization Eqs. (4) and (5) with CAOs generating superalgebras from the classes A, C or D [15] or generating other superalgebras from the class B. It turns out this is indeed possible. Examples of this kind, notably the A-statistics, related to the completion and the central extension of $sl_\infty$, were studied in Refs. 23 and 24 (see also Example 2 in Ref. 25 and the other references in that paper). The Wigner quantum systems (WQSs), introduced in Refs. 26 and 27, are also examples of this kind, however in the frame of a noncanonical quantum mechanics. Some of these systems possess quite unconventional physical features, properties which cannot
be achieved in the frame of the quantum mechanics. The (n + 1)-particle WQS, based on $sl(1/3n) \in A$ [28], exhibits a quark like structure: the composite system occupies a small volume around the centre of mass and within it the geometry is noncommutative. The underlying statistics is a Haldane exclusion statistics [13], a subject of considerable interest in condensed matter physics. The $osp(3/2) \in B$ WQS, studied in Ref. 29, leads to a picture where two spinless point particles, curling around each other, produce an orbital (internal angular) momentum 1/2. The second opportunity for generalization of the statistics is based on deformations of the relations (2). Assume only for simplicity that in (5) $i = 1, \ldots, n$ (the considerations remain valid for $n = \infty$). Then both in the pF and the pB case $H_i$ are elements from the Cartan subalgebra H of $so(2n + 1)$ and $osp(1/2n)$, respectively. The CAOs are root vectors of these (super)algebras (see Ref. 24 for more detailed discussions). The important point now comes from the observation that the commutation relations between the Cartan elements and the root vectors, in particular the quantization Eqs. (4) and (5), remain unaltered upon q-deformations. Therefore one can satisfy the quantization Eqs. (4) and (5) also with deformed pF (resp. pB) operators. Certainly in this case the relations $H_i = \frac{1}{2}[\![a_i^+, a_i^-]\!]$ cannot be preserved anymore. One has to postulate the expression (5), introducing the additional (Cartan) generators $H_i$ similarly as in the case of a deformed harmonic oscillator, where one is forced to introduce also number operators [18, 1, 35]. The conclusion is that the deformed pF operators $\{a_i^\pm, H_i \,|\, i = 1, \ldots, n\}$, being solutions of the quantization Eqs. (4) and (5), enlarge the class of the possible statistics. It turns out these are the operators which provide an alternative to the Chevalley description of $U_q[so(2n + 1)]$. This was shown in Ref. 30. A similar problem for $U_q[osp(1/2m)]$, corresponding to a q-deformation of the pB operators, was first carried out for m = 1 [4] and then for any m [31, 12, 32]. Here we generalize the results for any $U_q[osp(2n + 1/2m)]$, n, m > 1, namely when both para-Bose and para-Fermi operators are involved. This amounts to a simultaneous deformation of the parabosons and the parafermions as one single supermultiplet. In Sect. 2, after recalling the definition of $U_q[osp(2n + 1/2m)]$, we introduce the deformed Green generators (13) and derive the relations (30) they satisfy. In Sect. 3 we solve the inverse problem. We express the Chevalley elements via the Green generators and show that the relations among the Chevalley generators follow from the properties of the Green operators. This leads to the conclusion that the deformed Green operators provide an alternative description of $U_q[osp(2n + 1/2m)]$.
Throughout the paper we use the notation (some of it standard):
$\mathbb{C}$ – all complex numbers;
$\mathbb{C}[[h]]$ – the ring of all formal power series in h over $\mathbb{C}$;
$q = e^h \in \mathbb{C}[[h]]$, $\bar q = q^{-1}$;
$[\![a, b]\!] = ab - (-1)^{\deg(a)\deg(b)}\, ba$, $[a, b] = ab - ba$, $\{a, b\} = ab + ba$;
$[\![a, b]\!]_x = ab - (-1)^{\deg(a)\deg(b)}\, x\, ba$, $[a, b]_x = ab - x\, ba$, $\{a, b\}_x = ab + x\, ba$;
$\deg(a_i^\pm) \equiv \langle i \rangle = \bar 1$ for $i \le m$, $\bar 0$ for $i > m$;
$q_i = q^{(-1)^{\langle i+1 \rangle}}$, i.e. $q_i = \bar q$ for $i < m$, $q_i = q$ for $i \ge m$.
For the convenience of further references we list here some deformed identities, which will be often used (Id(3) follows from Id(2)).
Id(1): If $[a, c] = 0$, then
$$(x + x^{-1})\, [b, [a, [b, c]_x]_x] = [a, [b, [b, c]_x]_{x^{-1}}]_{x^2} - [[b, [b, a]_x]_{x^{-1}}, c]_{x^2};$$
Id(2): If B or C is an even element, then for any values of x, y, z, t, r, s subject to the relations $x = zs$, $y = zr$, $t = zsr$,
$$[\![A, [B, C]_x]\!]_y = [\![\,[\![A, B]\!]_z, C]\!]_t + (-1)^{\deg(A)\deg(B)}\, z\, [\![B, [\![A, C]\!]_r]\!]_s;$$
Id(3): If C is an even element and $[A, C] = 0$, then
$$[\![A, [B, C]_x]\!]_y = [\,[\![A, B]\!]_y, C]_x.$$

2. Deformed Green Generators and their Relations

The q-deformed superalgebra $U_q[osp(2n + 1/2m)]$, a Hopf superalgebra, is by now a classical concept. See, for instance, Refs. 5, 3, 7, 17, where all Hopf algebra operations are explicitly given. Here, following Ref. 17, we write only the algebra operations. Let $(\alpha_{ij})$, $i, j = 1, \ldots, m + n = N$, be an $N \times N$ symmetric Cartan matrix chosen as:
$$(\alpha_{ij}) = (-1)^{\langle j \rangle}\, \delta_{i+1,j} + (-1)^{\langle i \rangle}\, \delta_{i,j+1} - [(-1)^{\langle j+1 \rangle} + (-1)^{\langle j \rangle}]\, \delta_{ij} + \delta_{i,m+n}\, \delta_{j,m+n}. \tag{6}$$
For instance the Cartan matrix corresponding to m = n = 4 is the 8 × 8 matrix:
$$(\alpha_{ij}) = \begin{pmatrix}
2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -2 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -2 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & -2 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix}. \tag{7}$$
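As a cross-check of formula (6) against the example (7), the Cartan matrix can be generated programmatically. The following is a minimal sketch; the function names `parity` and `cartan_matrix` are hypothetical, and the only convention used is $\langle i \rangle = 1$ for $i \le m$ and $0$ for $i > m$ (so that $\langle N+1 \rangle = 0$).

```python
def parity(i, m):
    """<i> = 1 for i <= m (para-Bose index), 0 for i > m (para-Fermi index)."""
    return 1 if i <= m else 0


def cartan_matrix(m, n):
    """Build the N x N matrix (alpha_ij) of Eq. (6), with N = m + n.

    Indices i, j run from 1 to N as in the text; the returned nested list
    is 0-based.
    """
    N = m + n

    def sign(i):
        # (-1)^{<i>}
        return -1 if parity(i, m) else 1

    alpha = [[0] * N for _ in range(N)]
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            value = 0
            if i + 1 == j:
                value += sign(j)
            if i == j + 1:
                value += sign(i)
            if i == j:
                value -= sign(j + 1) + sign(j)
            if i == N and j == N:
                value += 1
            alpha[i - 1][j - 1] = value
    return alpha


if __name__ == "__main__":
    for row in cartan_matrix(4, 4):
        print(row)
```

Running the script prints the rows of the matrix for m = n = 4, which can be compared entry by entry with (7).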
Definition 1. $U_q[osp(2n + 1/2m)]$ is a Hopf algebra, which is a topologically free module over $\mathbb{C}[[h]]$ (complete in h-adic topology), with (Chevalley) generators $h_i, e_i, f_i$, $i = 1, \ldots, N$, and
1. Cartan–Kac relations:
$$[h_i, h_j] = 0, \tag{8a}$$
$$[h_i, e_j] = \alpha_{ij}\, e_j, \qquad [h_i, f_j] = -\alpha_{ij}\, f_j, \tag{8b}$$
$$[\![e_i, f_j]\!] = \delta_{ij}\, \frac{k_i - \bar k_i}{q - \bar q}, \qquad k_i = q^{h_i}, \quad \bar k_i = k_i^{-1} = q^{-h_i}; \tag{8c}$$
2. e-Serre relations
$$[\![e_i, e_j]\!] = 0, \quad |i - j| \ne 1, \tag{9a}$$
$$[e_i, [e_i, e_{i\pm 1}]_{\bar q}]_q \equiv [e_i, [e_i, e_{i\pm 1}]_q]_{\bar q} = 0, \quad i \ne m,\ i \ne N, \tag{9b}$$
$$\{[e_m, e_{m-1}]_q, [e_m, e_{m+1}]_{\bar q}\} \equiv \{[e_m, e_{m-1}]_{\bar q}, [e_m, e_{m+1}]_q\} = 0, \tag{9c}$$
$$[e_N, [e_N, [e_N, e_{N-1}]_{\bar q}]]_q \equiv [e_N, [e_N, [e_N, e_{N-1}]_q]]_{\bar q} = 0; \tag{9d}$$
3. f-Serre relations
$$[\![f_i, f_j]\!] = 0, \quad |i - j| \ne 1, \tag{10a}$$
$$[f_i, [f_i, f_{i\pm 1}]_{\bar q}]_q \equiv [f_i, [f_i, f_{i\pm 1}]_q]_{\bar q} = 0, \quad i \ne m,\ i \ne N, \tag{10b}$$
$$\{[f_m, f_{m-1}]_q, [f_m, f_{m+1}]_{\bar q}\} \equiv \{[f_m, f_{m-1}]_{\bar q}, [f_m, f_{m+1}]_q\} = 0, \tag{10c}$$
$$[f_N, [f_N, [f_N, f_{N-1}]_{\bar q}]]_q \equiv [f_N, [f_N, [f_N, f_{N-1}]_q]]_{\bar q} = 0. \tag{10d}$$
The grading on $U_q[osp(2n + 1/2m)]$ is induced from:
$$\deg(h_j) = \bar 0 \ \ \forall j, \qquad \deg(e_m) = \deg(f_m) = \bar 1, \qquad \deg(e_i) = \deg(f_i) = \bar 0 \ \text{ for } i \ne m. \tag{11}$$
The (9c) and (10c) relations are the additional Serre relations [17, 8, 34], which were initially omitted. We do not write the other Hopf algebra maps $(\Delta, \varepsilon, S)$ [17] since we will not use them. They are certainly also a part of the definition of $U_q[osp(2n + 1/2m)]$. From (8a,b) and the definition of $k_i$ one derives:
$$k_i k_i^{-1} = k_i^{-1} k_i = 1, \qquad k_i k_j = k_j k_i, \tag{12a}$$
$$k_i e_j = q^{\alpha_{ij}}\, e_j k_i, \qquad k_i f_j = q^{-\alpha_{ij}}\, f_j k_i. \tag{12b}$$
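For completeness, the step from (8b) to (12b) can be spelled out. The following is a short sketch of the standard argument, using only $k_i = q^{h_i} = e^{h h_i}$ understood as a formal power series in h and the Cartan matrix entries $\alpha_{ij}$ of (6); the summation index p is introduced here for illustration.

```latex
% Sketch: derivation of (12b) from (8b), with k_i = q^{h_i} = e^{h h_i}.
\begin{align*}
h_i e_j &= e_j\,(h_i + \alpha_{ij})
  && \text{by } [h_i, e_j] = \alpha_{ij} e_j, \\
h_i^{\,p} e_j &= e_j\,(h_i + \alpha_{ij})^{p}
  && \text{by induction on } p \ge 0, \\
k_i e_j = \sum_{p \ge 0} \frac{h^{p}}{p!}\, h_i^{\,p} e_j
  &= e_j\, e^{h (h_i + \alpha_{ij})}
   = q^{\alpha_{ij}}\, e_j k_i .
\end{align*}
```

The relation for $f_j$ follows in the same way with $\alpha_{ij}$ replaced by $-\alpha_{ij}$.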
Introduce the following 3N elements in $U_q[osp(2n + 1/2m)]$ ($i = 1, \ldots, N - 1$):
$$a_i^- = (-1)^{(m-i)\langle i \rangle} \sqrt{2}\, [e_i, [e_{i+1}, [\ldots, [e_{N-2}, [e_{N-1}, e_N]_{q_{N-1}}]_{q_{N-2}} \ldots]_{q_{i+2}}]_{q_{i+1}}]_{q_i}, \qquad a_N^- = \sqrt{2}\, e_N, \tag{13a}$$
$$a_i^+ = (-1)^{N-i+1} \sqrt{2}\, [[[\ldots [f_N, f_{N-1}]_{\bar q_{N-1}}, f_{N-2}]_{\bar q_{N-2}} \ldots]_{\bar q_{i+2}}, f_{i+1}]_{\bar q_{i+1}}, f_i]_{\bar q_i}, \qquad a_N^+ = -\sqrt{2}\, f_N, \tag{13b}$$
$$H_i = h_i + h_{i+1} + \ldots + h_N \quad (\text{including } i = N). \tag{13c}$$
We refer to the above operators as (deformed) Green generators since in the nondeformed case they coincide with the Green generators of $U[osp(2n + 1/2m)]$ (see Eq. (36) in Ref. 20). Therefore $a_1^\pm, a_2^\pm, \ldots, a_{m-1}^\pm, a_m^\pm$ (resp. $a_{m+1}^\pm, \ldots, a_{m+n}^\pm \equiv a_N^\pm$) can be viewed as deformed para-Bose (pB) (resp. deformed para-Fermi (pF)) operators. Our aim is to show that $U_q[osp(2n + 1/2m)]$ can be described entirely via the generators (13). One can write (13a) and (13b) as
$$a_i^- = (-1)^{(m-i)\langle i \rangle + (m-j)\langle j \rangle}\, [e_i, [e_{i+1}, [\ldots [e_{j-2}, [e_{j-1}, a_j^-]_{q_{j-1}}]_{q_{j-2}} \ldots]_{q_{i+1}}]_{q_i}, \quad i < j < N, \tag{14a}$$
$$a_i^+ = (-1)^{i+j}\, [[\ldots [[a_j^+, f_{j-1}]_{\bar q_{j-1}}, f_{j-2}]_{\bar q_{j-2}} \ldots]_{\bar q_{i+2}}, f_{i+1}]_{\bar q_{i+1}}, f_i]_{\bar q_i}, \quad i < j < N. \tag{14b}$$
Taking into account (9a), (10a) and applying repeatedly Id(3) one rewrites the Green generators also as ($i < j < N$)
$$a_i^- = (-1)^{(m-i)\langle i \rangle + (m-j)\langle j \rangle}\, [[e_i, [e_{i+1}, [e_{i+2}, \ldots [e_{j-3}, [e_{j-2}, e_{j-1}]_{q_{j-2}}]_{q_{j-3}} \ldots]_{q_{i+2}}]_{q_{i+1}}]_{q_i}, a_j^-]_{q_{j-1}}, \tag{15a}$$
$$a_i^+ = (-1)^{i+j}\, [a_j^+, [\ldots [[f_{j-1}, f_{j-2}]_{\bar q_{j-2}}, f_{j-3}]_{\bar q_{j-3}} \ldots]_{\bar q_{i+2}}, f_{i+1}]_{\bar q_{i+1}}, f_i]_{\bar q_i}]_{\bar q_{j-1}}. \tag{15b}$$
The next proposition plays an important role in several intermediate calculations.
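Proposition 1. The following “mixed” relations between the Chevalley and the Green generators take place:
\[ [[e_i, a_j^+]] = -\delta_{ij}(-1)^{\langle i+1\rangle} k_i a_{i+1}^+, \quad i \neq N, \tag{16} \]
\[ [[a_j^-, f_i]] = \delta_{ij}\, a_{i+1}^- \bar k_i, \quad i \neq N, \tag{17} \]
\[ [[e_i, a_j^-]] = 0, \ \text{ if } i < j-1 \text{ or } i > j, \quad i \neq N, \tag{18a} \]
\[ [[e_i, a_{i+1}^-]]_{q_i} = (-1)^{\langle i+1\rangle} a_i^-, \quad i \neq N, \tag{18b} \]
\[ [[e_i, a_i^-]]_{\bar q_{i-1}} = 0, \quad i \neq N, \tag{18c} \]
\[ [[a_j^+, f_i]] = 0, \ \text{ if } i < j-1 \text{ or } i > j, \quad i \neq N, \tag{19a} \]
\[ [[a_{i+1}^+, f_i]]_{\bar q_i} = -a_i^+, \quad i \neq N, \tag{19b} \]
\[ [[a_i^+, f_i]]_{q_{i-1}} = 0, \quad i \neq N. \tag{19c} \]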
Proof. We stress some of the intermediate steps in the proof.
1. Begin with (16).
(i) Let $i < j$. Then from (13b) and (8c) one immediately has $[[e_i, a_j^+]] = 0$.
(ii) Let $i > j$. From (14b)
\[ [[e_i, a_j^+]] \sim [[e_i,[[\dots[[[a_{i+1}^+,f_i]_{\bar q_i},f_{i-1}]_{\bar q_{i-1}},f_{i-2}]_{\bar q_{i-2}}\dots]_{\bar q_{j-1}},f_j]_{\bar q_j}]] \]
(applying repeatedly Id(3) and (8c))
\[ = [\dots[A,f_{i-2}]_{\bar q_{i-2}}\dots]_{\bar q_{j-1}},f_j]_{\bar q_j}, \]
where
\[ A = [[[e_i,[a_{i+1}^+,f_i]_{\bar q_i}]],f_{i-1}]_{\bar q_{i-1}} \ \text{(from (i) and Id(3))} \ = [[a_{i+1}^+,[[e_i,f_i]]]_{\bar q_i},f_{i-1}]_{\bar q_{i-1}} \sim [a_{i+1}^+ k_i,f_{i-1}]_{\bar q_{i-1}} = \bar q_{i-1}[a_{i+1}^+,f_{i-1}]k_i = 0, \]
since evidently $f_{i-1}$ commutes with $a_{i+1}^+$ (see (13b)). Hence
\[ [[e_i, a_j^+]] = 0 \quad \text{for } i > j. \tag{20} \]
(iii) Let $i = j$. Then
\[ [[e_i, a_i^+]] = -[[e_i,[a_{i+1}^+,f_i]_{\bar q_i}]] \ \text{(from (i) and Id(3))} \ = -[a_{i+1}^+,[[e_i,f_i]]]_{\bar q_i} = -\Big[a_{i+1}^+,\frac{k_i-\bar k_i}{q-\bar q}\Big]_{\bar q_i} = -(-1)^{\langle i+1\rangle}\bar q_i\, a_{i+1}^+ k_i = -(-1)^{\langle i+1\rangle} k_i a_{i+1}^+. \]
The unification of (i)–(iii) yields (16).
2. Equation (17) is proved in a similar way.
3. We pass on to prove (18a).
(i) The case $i < j-1$ is evident.
(ii) Take $i = m > j$. Note first that according to (9) and Id(3)
\[ [[e_m,[e_{m-1},[e_m,e_{m+1}]_q]_{\bar q}]] = [[e_m,[[e_{m-1},e_m]_{\bar q},e_{m+1}]_q]] \ \text{(using Id(2))} \]
\[ = \{[e_m,e_{m-1}]_q,[e_m,e_{m+1}]_{\bar q}\} - [e_{m+1},\{e_m,[e_{m-1},e_m]_{\bar q}\}_q] = \{[e_m,e_{m-1}]_q,[e_m,e_{m+1}]_{\bar q}\} - [e_{m+1},\,q e_{m-1}e_m^2 - \bar q e_m^2 e_{m-1}] = 0, \]
according to (9a) and (9c), i.e.,
\[ B \equiv [[e_m,[e_{m-1},[e_m,e_{m+1}]_q]_{\bar q}]] = 0. \tag{21} \]
If $m = N-1$, then $[[e_m, a_{m-1}^-]] = B = 0$. Let $m < N-1$. From (15a) and Id(3), $[[e_m, a_{m-1}^-]] \sim [B, a_{m+2}^-]_q = 0$.
(iii) Let $i \neq m > j$. From (15b) $a_{i-1}^- \sim [[e_{i-1},[e_i,e_{i+1}]_{q_i}]_{q_{i-1}},a_{i+2}^-]_{q_{i+1}}$. Then Id(3) yields $[e_i, a_{i-1}^-] \sim [z, a_{i+2}^-]_{q_{i+1}}$ with $z = [e_i,[e_{i-1},[e_i,e_{i+1}]_{q_i}]_{q_i}]$. If $i > m$, then using Id(1),
\[ z = [e_i,[e_{i-1},[e_i,e_{i+1}]_q]_q] \sim [e_{i-1},[e_i,[e_i,e_{i+1}]_q]_{\bar q}]_{q^2} - [[e_i,[e_i,e_{i-1}]_q]_{\bar q},e_{i+1}]_{q^2} = 0 \]
from (9b). If $i < m$, again from Id(1), $z = 0$. Hence $[e_i, a_{i-1}^-] = 0$, if $m \neq i$. So far we have from (ii) and (iii) that
\[ [e_i, a_{i-1}^-] = 0. \tag{22} \]
The rest of the proof is by induction. Assume $[e_i, a_j^-] = 0$ for a certain $i > j$, $i \neq N$. Then from (9a), (14a) and Id(3)
\[ [e_i, a_{j-1}^-] \sim [e_i,[e_{j-1},a_j^-]_{q_{j-1}}] = [e_{j-1},[e_i,a_j^-]]_{q_{j-1}} = 0. \]
Therefore $[e_i, a_j^-] = 0$ for any $i > j$, $i \neq N$. Combining the last with (i), one obtains (18a).
4. Equation (18b) follows from the definition of $a_i^-$ and the observation that $(-1)^{(m-i-1)\langle i+1\rangle-(m-i)\langle i\rangle} = (-1)^{\langle i+1\rangle}$.
5. It remains to verify (18c). Since $a_i^- \sim [[e_i,e_{i+1}]_{q_i},a_{i+2}^-]_{q_{i+1}}$,
\[ [[e_i,a_i^-]]_{\bar q_{i-1}} \sim [[e_i,[[e_i,e_{i+1}]_{q_i},a_{i+2}^-]_{q_{i+1}}]]_{\bar q_{i-1}} \ \text{(from (18a) and Id(3))} \ = [z, a_{i+2}^-]_{q_{i+1}}, \]
where $z = [[e_i,[e_i,e_{i+1}]_{q_i}]]_{\bar q_{i-1}}$. If $i > m$, $z = [e_i,[e_i,e_{i+1}]_q]_{\bar q} = 0$ (see (9b)); if $i < m$, $z = [e_i,[e_i,e_{i+1}]_{\bar q}]_q = 0$ again from (9b); if $i = m$, $z = \{e_m,[e_m,e_{m+1}]_q\}_q = e_m^2 e_{m+1} - q^2 e_{m+1}e_m^2 = 0$, since according to (9a) $e_m^2 = 0$. Hence $[[e_i,a_i^-]]_{\bar q_{i-1}} = 0$.
6. Eqs. (19) are proved in a similar way as Eqs. (18). This completes the proof of the proposition.

Proposition 2. The deformed Green operators (13) generate $U_q[osp(2n+1/2m)]$.
Proof. The proof is an immediate consequence of the relations:
\[ [[a_i^-, a_{i+1}^+]] = 2L_{i+1}e_i, \quad i = 1, 2, \dots, N-1, \tag{23a} \]
\[ [[a_{i+1}^-, a_i^+]] = -2(-1)^{\langle i+1\rangle} f_i \bar L_{i+1}, \quad i = 1, 2, \dots, N-1, \tag{23b} \]
\[ [[a_i^-, a_i^+]] = -2\,\frac{L_i - \bar L_i}{q - \bar q}, \quad L_i = q^{H_i}, \quad \bar L_i = q^{-H_i}, \quad i = 1, \dots, N. \tag{23c} \]
These equations are proved by induction on $i$.
1. The Serre relation (8c) together with the definitions of $a_N^\pm$ and $L_N$ immediately yield $[[a_N^-, a_N^+]] = -2\,\frac{L_N - \bar L_N}{q - \bar q}$. From (18b) $a_{N-1}^- = [[e_{N-1}, a_N^-]]_{q_{N-1}} = [e_{N-1}, a_N^-]_{q_{N-1}}$. Taking into account that $[e_{N-1}, a_N^+] = 0$ and Id(3), one has
\[ [[a_{N-1}^-, a_N^+]] = [a_{N-1}^-, a_N^+] = [[e_{N-1},a_N^-]_{q_{N-1}},a_N^+] = [e_{N-1},[a_N^-,a_N^+]]_{q_{N-1}} = 2L_N e_{N-1}, \]
i.e.,
\[ [[a_{N-1}^-, a_N^+]] = 2L_N e_{N-1}. \tag{24a} \]
In a similar way one shows that
\[ [[a_N^-, a_{N-1}^+]] = -2 f_{N-1}\bar L_N. \tag{24b} \]
In order to compute $[[a_{N-1}^-, a_{N-1}^+]]$ set from (13b) $a_{N-1}^+ = -[a_N^+, f_{N-1}]_{\bar q_{N-1}}$. Since $n \geq 1$, $\bar q_{N-1} = \bar q$. Therefore $[[a_{N-1}^-, a_{N-1}^+]] = -[[a_{N-1}^-,[a_N^+,f_{N-1}]_{\bar q}]]$. Apply to the last supercommutator the identity Id(2) with $y = z = r = 1$ and $x = s = t = \bar q$, namely
\[ [[A,[B,C]_{\bar q}]] = [[[[A,B]],C]]_{\bar q} + (-1)^{\deg(A)\deg(B)}[[B,[[A,C]]]]_{\bar q}, \tag{25} \]
where $A = a_{N-1}^-$, $B = a_N^+$, $C = f_{N-1}$. Then
\[ [[a_{N-1}^-, a_{N-1}^+]] = -[[[a_{N-1}^-,a_N^+],f_{N-1}]]_{\bar q} - [a_N^+,[[a_{N-1}^-,f_{N-1}]]]_{\bar q} \ \text{(from (17) and (24a))} \]
\[ = -[[2k_N e_{N-1},f_{N-1}]]_{\bar q} - [[a_N^+,a_N^-\bar k_{N-1}]]_{\bar q} = -2k_N[[e_{N-1},f_{N-1}]] + [a_N^-,a_N^+]\bar k_{N-1} \]
\[ = -2k_N\,\frac{k_{N-1}-\bar k_{N-1}}{q-\bar q} - 2\,\frac{k_N-\bar k_N}{q-\bar q}\,\bar k_{N-1} = -2\,\frac{k_N k_{N-1}-\bar k_N\bar k_{N-1}}{q-\bar q}, \]
i.e.,
\[ [[a_{N-1}^-, a_{N-1}^+]] = -2\,\frac{L_{N-1}-\bar L_{N-1}}{q-\bar q}. \tag{26} \]
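The last step of the computation leading to (26) is a purely commutative rearrangement, since the $k_i$ commute with each other by (12a). The following minimal sketch checks that rearrangement on a few sample values, treating $q$, $k_N$ and $k_{N-1}$ as nonzero rational numbers. The function names and the sample values are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction


def two_term_form(q, k_N, k_N1):
    """-2 k_N (k_{N-1} - 1/k_{N-1})/(q - 1/q) - 2 (k_N - 1/k_N)/(q - 1/q) * (1/k_{N-1})."""
    denom = q - 1 / q
    first = -2 * k_N * (k_N1 - 1 / k_N1) / denom
    second = -2 * (k_N - 1 / k_N) / denom * (1 / k_N1)
    return first + second


def closed_form(q, k_N, k_N1):
    """-2 (L - 1/L)/(q - 1/q) with L = k_N * k_{N-1}, the scalar analogue of L_{N-1}."""
    L = k_N * k_N1
    return -2 * (L - 1 / L) / (q - 1 / q)


# Hypothetical sample points; q must differ from +1 and -1 so that q - 1/q != 0.
samples = [
    (Fraction(3, 2), Fraction(5, 7), Fraction(2, 3)),
    (Fraction(7, 5), Fraction(4), Fraction(1, 9)),
    (Fraction(-2, 3), Fraction(11, 4), Fraction(-3, 5)),
]
identity_holds = all(two_term_form(*s) == closed_form(*s) for s in samples)
print(identity_holds)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.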
From (24) and (26) we conclude that Eqs. (23) are fulfilled for $i = N-1$.
2. Assume Eqs. (23) hold for $i$ replaced by $i+1$:
\[ [[a_{i+1}^-, a_{i+2}^+]] = 2L_{i+2}e_{i+1}, \tag{27a} \]
\[ [[a_{i+2}^-, a_{i+1}^+]] = -2(-1)^{\langle i+2\rangle} f_{i+1}\bar L_{i+2}, \tag{27b} \]
\[ [[a_{i+1}^-, a_{i+1}^+]] = -2\,\frac{L_{i+1}-\bar L_{i+1}}{q-\bar q}. \tag{27c} \]
We proceed to show that then Eqs. (23) hold too. Set from (13b) $a_i^- = (-1)^{\langle i+1\rangle}[e_i, a_{i+1}^-]_{q_i}$. Take into account that according to (16) $[e_i, a_{i+1}^+] = 0$ and Id(3). Then
\[ [[a_i^-, a_{i+1}^+]] = (-1)^{\langle i+1\rangle}[[[e_i,a_{i+1}^-]_{q_i},a_{i+1}^+]] = (-1)^{\langle i+1\rangle}[e_i,[[a_{i+1}^-,a_{i+1}^+]]]_{q_i} \ \text{(from (27a))} \ = 2(\bar q - q)^{-1}(-1)^{\langle i+1\rangle}[e_i, L_{i+1}-\bar L_{i+1}]_{q_i}, \]
which after some rearrangement of the multiples finally yields (23a). The verification of (23b) is similar. So far we have derived from (27) that
\[ [[a_i^-, a_{i+1}^+]] = 2L_{i+1}e_i, \quad i = 1, 2, \dots, N-1, \tag{28a} \]
\[ [[a_{i+1}^-, a_i^+]] = -2(-1)^{\langle i+1\rangle} f_i\bar L_{i+1}, \quad i = 1, 2, \dots, N-1. \tag{28b} \]
Set in $[[a_i^-, a_i^+]]$ $a_i^+ = -[a_{i+1}^+, f_i]_{\bar q_i}$. Either $a_i^+$ or $f_i$ is an even element. Therefore applying again the identity (25), one has
\[ [[a_i^-, a_i^+]] = -[[a_i^-,[a_{i+1}^+,f_i]_{\bar q_i}]] = -[[[[a_i^-,a_{i+1}^+]],f_i]]_{\bar q_i} - (-1)^{\langle i\rangle\langle i+1\rangle}[[a_{i+1}^+,[[a_i^-,f_i]]]]_{\bar q_i} \ \text{(from (28a) and (17))} \]
\[ = -2[[L_{i+1}e_i,f_i]]_{\bar q_i} - (-1)^{\langle i\rangle\langle i+1\rangle}[[a_{i+1}^+,a_{i+1}^-\bar k_i]]_{\bar q_i} = -2L_{i+1}[[e_i,f_i]] + [[a_{i+1}^-,a_{i+1}^+]]\bar k_i = -2\,\frac{L_i-\bar L_i}{q-\bar q}. \]
Hence (23c) holds too. From here and (28) we conclude that, if Eqs. (27) hold, then Eqs. (23) are fulfilled too. This completes the proof of the validity of Eqs. (23). From (13) and (23a,b) one obtains ($i = 1, \dots, N-1$):
\[ h_i = H_i - H_{i+1}, \quad H_N = h_N, \tag{29a} \]
\[ e_i = \tfrac12 \bar L_{i+1}[[a_i^-, a_{i+1}^+]], \quad e_N = \tfrac{1}{\sqrt2}\,a_N^-, \tag{29b} \]
\[ f_i = -\tfrac12(-)^{\langle i+1\rangle}[[a_{i+1}^-, a_i^+]]L_{i+1} = \tfrac12[[a_i^+, a_{i+1}^-]]L_{i+1}, \quad f_N = -\tfrac{1}{\sqrt2}\,a_N^+. \tag{29c} \]
Since the Chevalley elements generate $U_q[osp(2n+1/2m)]$, so do the Green operators. This completes the proof.

Proposition 3. The Green generators $H_i$, $a_i^\pm$, $i = 1, \dots, N$ satisfy the following relations ($i, j = 1, \dots, N$, $\xi, \eta = \pm$ or $\pm 1$):
\[ [H_i, H_j] = 0, \tag{30a} \]
\[ [H_i, a_j^\pm] = \pm\delta_{ij}(-1)^{\langle i\rangle} a_j^\pm, \tag{30b} \]
\[ [[a_i^-, a_i^+]] = -2\,\frac{L_i-\bar L_i}{q-\bar q}, \quad L_i = q^{H_i}, \quad \bar L_i = q^{-H_i}, \tag{30c} \]
\[ [[a_{N-1}^\xi, a_N^\xi], a_N^\xi]_{\bar q} = 0, \tag{30d} \]
\[ [[[[a_i^\eta, a_{i+\xi}^{-\eta}]], a_j^\eta]]_{q^{-\xi(-1)^{\langle i\rangle}\delta_{ij}}} = 2(\eta)^{\langle j\rangle}\delta_{j,i+\xi}\,L_j^{-\xi\eta}\,a_i^\eta. \tag{30e} \]
Proof. The commutation relations (30a) are evident. Equation (30b) follows from the definitions of the Green generators, the Cartan relations (8a,b) and the observation that
\[ \sum_{s=i}^{N}\sum_{r=j}^{N}\alpha_{sr} = -(-1)^{\langle i\rangle}\delta_{ij}. \tag{31} \]
Equation (30c) was derived in Proposition 2. The equation $[[a_{N-1}^-, a_N^-], a_N^-]_{\bar q} = 0$ is the same as the Serre relation (9b), if one takes into account that $[e_{N-1}, e_N]_q \sim a_{N-1}^-$ and $e_N \sim a_N^-$. Similarly one shows that (30d) with $\xi = +$ is the same as (10d).
The proof of (30e) is based on case by case considerations ($\xi, \eta = \pm$). To this end one has to replace $e_i$ and $f_i$ in Eqs. (16)–(19) with their expressions through the CAOs from (29). Using the relations (which follow from (30b))
\[ L_i a_j^\pm = a_j^\pm L_i, \quad \bar L_i a_j^\pm = a_j^\pm \bar L_i, \quad i \neq j = 1, \dots, N, \tag{32a} \]
\[ L_i a_i^\pm = q^{\pm(-1)^{\langle i\rangle}} a_i^\pm L_i, \quad \bar L_i a_i^\pm = q^{\mp(-1)^{\langle i\rangle}} a_i^\pm \bar L_i, \quad i = 1, \dots, N, \tag{32b} \]
after long, but simple calculations one verifies (30c).
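The passage from (30b) to (32b) can be illustrated on a toy model. The sketch below is an assumption-laden example, not a representation taken from the paper: it uses a single truncated "number operator" $H$ and a raising matrix $a^+$ with $[H, a^+] = a^+$ (the case $(-1)^{\langle i\rangle} = +1$), and checks that $L = q^H$ then satisfies $L a^+ = q\, a^+ L$. The dimension `DIM` and the value of `q` are arbitrary choices.

```python
from fractions import Fraction

DIM = 5               # hypothetical truncation size
q = Fraction(3, 2)    # hypothetical generic value of q


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * x for x in row] for row in A]


zero = Fraction(0)
# H = diag(0, 1, ..., DIM-1); a_plus maps basis vector k to basis vector k+1.
H = [[Fraction(i) if i == j else zero for j in range(DIM)] for i in range(DIM)]
a_plus = [[Fraction(1) if i == j + 1 else zero for j in range(DIM)] for i in range(DIM)]
# L = q^H is diagonal with entries q^0, q^1, ..., q^(DIM-1).
L = [[q ** i if i == j else zero for j in range(DIM)] for i in range(DIM)]

HA, AH = matmul(H, a_plus), matmul(a_plus, H)
# Toy analogue of (30b): [H, a+] = a+.
commutator_ok = all(HA[i][j] - AH[i][j] == a_plus[i][j]
                    for i in range(DIM) for j in range(DIM))
# Toy analogue of (32b): L a+ = q a+ L.
exchange_ok = matmul(L, a_plus) == scale(q, matmul(a_plus, L))
print(commutator_ok, exchange_ok)
```

The same argument with $[H, a^-] = -a^-$ gives $L a^- = q^{-1} a^- L$, which is the lower sign in (32b).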
3. Description of $osp_q(2n+1/2m)$ via Deformed Green Generators
So far we have established that the Green generators (13) satisfy the relations (30). Here we solve the inverse problem: we show that the operators $H_i$, $a_i^\pm$, $i = 1, \dots, N$ subject to the relations (30) provide an alternative description of $U_q[osp(2n+1/2m)]$. In Sect. 2 we have derived the relations (16)–(19) from the definition (13) of the Green generators and the Cartan–Kac and the Serre relations, satisfied by the Chevalley generators. Now as a first step we derive (16)–(19) on the ground of Eqs. (30).

Proposition 4. The “mixed” relations (16)–(19) follow from (29) and (30).
Proof. Consider Eq. (16). Since $i \neq N$, from (29b),
\[ [[e_i, a_j^+]] = \tfrac12\bar L_{i+1}[[a_i^-,a_{i+1}^+]]a_j^+ - \tfrac12 a_j^+\bar L_{i+1}[[a_i^-,a_{i+1}^+]]. \tag{33} \]
(i) If $i+1 < j$ or $i > j$, then $\bar L_{i+1}$ and $a_j^+$ commute (see (32)). Therefore, using (30e),
\[ [[e_i, a_j^+]] = \tfrac12\bar L_{i+1}[[[[a_i^-,a_{i+1}^+]],a_j^+]] = -\tfrac12(-1)^{\langle i+1\rangle}\bar L_{i+1}[[[[a_{i+1}^+,a_i^-]],a_j^+]] = 0. \]
(ii) If $i = j$, again from (30e),
\[ [[e_i, a_i^+]] = -\tfrac12(-1)^{\langle i+1\rangle}\bar L_{i+1}[[[[a_{i+1}^+,a_i^-]],a_i^+]] = -(-1)^{\langle i+1\rangle}\bar L_{i+1}L_i a_{i+1}^+ = -(-1)^{\langle i+1\rangle}k_i a_{i+1}^+. \]
(iii) If $i+1 = j$, from (33) and taking into account (32b),
\[ [[e_i, a_{i+1}^+]] = \tfrac12\bar L_{i+1}\big([[a_i^-,a_{i+1}^+]]a_{i+1}^+ - q^{(-1)^{\langle i+1\rangle}}a_{i+1}^+[[a_i^-,a_{i+1}^+]]\big) = \tfrac12\bar L_{i+1}[[[[a_i^-,a_{i+1}^+]],a_{i+1}^+]]_{q^{(-1)^{\langle i+1\rangle}}} \]
\[ = \tfrac12\bar L_{i+1}(-1)^{\langle i\rangle\langle i+1\rangle}[[[[a_{i+1}^+,a_i^-]],a_{i+1}^+]]_{q^{(-1)^{\langle i+1\rangle}}} = 0. \]
The unification of (i)–(iii) yields (16). The remaining equalities (17)–(19) are proved in a similar way.
Proposition 5. The Cartan–Kac relations are a consequence of the relations (30). Proof. The first two Eqs. (8a) and (8b) are easily verified. We proceed to prove (8c).
1. The case $i = j$. If $i = N$, then (8c) is the same as (30c). Let $i < N$. From (29c) and the graded Leibnitz rule
\[ [[e_i, f_i]] = \tfrac12[[e_i,[[a_i^+,a_{i+1}^-L_{i+1}]]]] = \tfrac12\Big\{[[[[e_i,a_i^+]],a_{i+1}^-L_{i+1}]] + (-1)^{(\langle i\rangle+\langle i+1\rangle)\langle i\rangle}[[a_i^+,[[e_i,a_{i+1}^-L_{i+1}]]]]\Big\}. \]
Insert above $[[e_i, a_i^+]] = -(-1)^{\langle i+1\rangle}k_i a_{i+1}^+$ and
\[ [[e_i, a_{i+1}^-L_{i+1}]] = [[e_i,a_{i+1}^-]]_{q_i}L_{i+1} \ \text{(from (18b))} \ = (-1)^{\langle i+1\rangle}a_i^-L_{i+1}. \]
After some rearrangement of the multiples one obtains:
\[ [[e_i, f_i]] = \tfrac12[[a_{i+1}^-,a_{i+1}^+]]L_i - \tfrac12[[a_i^-,a_i^+]]L_{i+1} \ \text{(from (23c))} \ = \frac{k_i-\bar k_i}{q-\bar q}. \]
Hence
\[ [[e_i, f_i]] = \frac{k_i-\bar k_i}{q-\bar q}, \quad i = 1, \dots, N. \tag{34} \]
2. The case $i \neq j$. Equation (8c) is easily verified for $i = N$ or $j = N$. We consider $i \neq j \neq N$. From (29c)
\[ [[e_i, f_j]] = \tfrac12[[e_i,[[a_j^+,a_{j+1}^-L_{j+1}]]]] = \tfrac12\Big\{[[[[e_i,a_j^+]],a_{j+1}^-L_{j+1}]] + (-1)^{(\langle i\rangle+\langle i+1\rangle)\langle j\rangle}[[a_j^+,[[e_i,a_{j+1}^-L_{j+1}]]]]\Big\}. \]
The first term in the r.h.s. cancels out, since $[[e_i, a_j^+]] = 0$ according to (16). From the second term evaluate only the internal supercommutator $A = [[e_i, a_{j+1}^-L_{j+1}]]$.
(i) If $i < j$ or $i > j+1$, $A = [[e_i, a_{j+1}^-]]L_{j+1} = 0$ according to (18a);
(ii) If $i = j+1$, then $L_i e_i = \bar q_{i-1}e_iL_i$. Therefore $A = [[e_i, a_i^-]]_{\bar q_{i-1}}L_i = 0$ according to (18c).
Hence $[[e_i, f_j]] = 0$, if $i \neq j = 1, \dots, N$. The latter together with (34) shows that also the last Cartan–Kac relation (8c) is fulfilled. This completes the proof.

Proposition 6. The Serre relations (9) and (10) are a consequence of the relations (30).
Proof. 1. First we prove that $[e_i, e_j] = 0$, if $|i-j| > 1$. Assume for definiteness that $i+1 < j$.
(i) Let $i+1 < j = N$. From (29b) and the observation that $\bar L_{i+1}$ commutes with $a_N^-$ one has
\[ [e_i, e_N] \sim [\bar L_{i+1}[[a_i^-,a_{i+1}^+]],a_N^-] = \bar L_{i+1}[[[a_i^-,a_{i+1}^+]],a_N^-] \ \text{(from (30e))} \ = 0. \]
(ii) Let $i+1 < j < N$. From (29b)
\[ [e_i, e_j] \sim [\bar L_{i+1}[[a_i^-,a_{i+1}^+]],\bar L_{j+1}[[a_j^-,a_{j+1}^+]]] \]
($\bar L_{j+1}$ commutes with $a_i^-$ and $a_{i+1}^+$, $\bar L_{i+1}$ commutes with $a_j^-$ and $a_{j+1}^+$)
\[ = \bar L_{i+1}\bar L_{j+1}[[[[a_i^-,a_{i+1}^+]],[[a_j^-,a_{j+1}^+]]]] = \bar L_{i+1}\bar L_{j+1}[[[[[[a_i^-,a_{i+1}^+]],a_j^-]],a_{j+1}^+]] + (-1)^{(\langle i\rangle+\langle i+1\rangle)\langle j\rangle}\bar L_{i+1}\bar L_{j+1}[[a_j^-,[[[[a_i^-,a_{i+1}^+]],a_{j+1}^+]]]] = 0, \]
since $[[[[a_i^-,a_{i+1}^+]],a_j^-]] = 0$ and $[[[[a_i^-,a_{i+1}^+]],a_{j+1}^+]] = 0$ according to (30e).
2. The proof of $[f_i, f_j] = 0$, if $|i-j| > 1$, is similar.
3. Proof of $[e_i,[e_i,e_{i+1}]_{q'}]_{\bar q'} = 0$, $i \neq m$, $i \neq N$ and $q' = q$ or $q' = \bar q$. We choose $q' = q_{i-1} = q^{(-1)^{\langle i\rangle}}$. Therefore the relation to be proved is
\[ [e_i,[e_i,e_{i+1}]_{q_{i-1}}]_{\bar q_{i-1}} = 0, \quad i \neq m,\ i \neq N. \tag{35} \]
As a preliminary step compute $[e_i,e_{i+1}]_{q_{i-1}}$ (see (29b)) $= \tfrac14[\bar L_{i+1}[[a_i^-,a_{i+1}^+]],\bar L_{i+2}[[a_{i+1}^-,a_{i+2}^+]]]_{q_{i-1}}$. From (32) one has $[[a_i^-,a_{i+1}^+]]\bar L_{i+2} = \bar L_{i+2}[[a_i^-,a_{i+1}^+]]$ and $[[a_{i+1}^-,a_{i+2}^+]]\bar L_{i+1} = q^{-(-1)^{\langle i+1\rangle}}\bar L_{i+1}[[a_{i+1}^-,a_{i+2}^+]] = \bar q_i\bar L_{i+1}[[a_{i+1}^-,a_{i+2}^+]]$. Therefore,
\[ [e_i,e_{i+1}]_{q_{i-1}} = \tfrac14\bar L_{i+1}\bar L_{i+2}\big([[a_i^-,a_{i+1}^+]][[a_{i+1}^-,a_{i+2}^+]] - q_{i-1}\bar q_i[[a_{i+1}^-,a_{i+2}^+]][[a_i^-,a_{i+1}^+]]\big). \]
Since $i \neq m$, $q_{i-1}\bar q_i = 1$. Thus,
\[ [e_i,e_{i+1}]_{q_{i-1}} = \tfrac14\bar L_{i+1}\bar L_{i+2}[[[a_i^-,a_{i+1}^+]],[[a_{i+1}^-,a_{i+2}^+]]] = \tfrac14\bar L_{i+1}\bar L_{i+2}[[[[[[a_i^-,a_{i+1}^+]],a_{i+1}^-]],a_{i+2}^+]] + \tfrac14\bar L_{i+1}\bar L_{i+2}(-1)^{(\langle i\rangle+\langle i+1\rangle)\langle i+1\rangle}[[a_{i+1}^-,[[[[a_i^-,a_{i+1}^+]],a_{i+2}^+]]]]. \]
The second term in the r.h.s. is zero, since from (30c) $[[[[a_{i+1}^+,a_i^-]],a_{i+2}^+]] = 0$. Again from (30c) $[[[[a_i^-,a_{i+1}^+]],a_{i+1}^-]] = 2(-1)^{\langle i+1\rangle}L_{i+1}a_i^-$. Therefore
\[ [e_i,e_{i+1}]_{q_{i-1}} = \tfrac12(-1)^{\langle i+1\rangle}\bar L_{i+2}[[a_i^-,a_{i+2}^+]]. \tag{36} \]
Insert $e_i$ from (29b) and $[e_i,e_{i+1}]_{q_{i-1}}$ from (36) in the l.h.s. of (35):
\[ [e_i,[e_i,e_{i+1}]_{q_{i-1}}]_{\bar q_{i-1}} = \big[\tfrac12\bar L_{i+1}[[a_i^-,a_{i+1}^+]],\ \tfrac12(-1)^{\langle i+1\rangle}\bar L_{i+2}[[a_i^-,a_{i+2}^+]]\big]_{\bar q_{i-1}} = \tfrac14(-1)^{\langle i+1\rangle}\bar L_{i+1}\bar L_{i+2}[[[a_i^-,a_{i+1}^+]],[[a_i^-,a_{i+2}^+]]]_{\bar q_{i-1}} \]
(use the circumstance that $[[[a_i^-,a_{i+1}^+]],a_{i+2}^+] = 0$ and Id(3))
\[ = \tfrac14(-1)^{\langle i+1\rangle}\bar L_{i+1}\bar L_{i+2}[[[[[a_i^-,a_{i+1}^+]],a_i^-]_{\bar q_{i-1}},a_{i+2}^+]] = 0, \]
since $[[[a_i^-,a_{i+1}^+]],a_i^-]_{\bar q_{i-1}} = [[[a_i^-,a_{i+1}^+]],a_i^-]_{q^{-(-1)^{\langle i\rangle}}} = 0$ according to (30e). Hence the Serre relation (35) holds.
4. The proof of $[e_i,[e_i,e_{i-1}]_{q'}]_{\bar q'} = 0$, $i \neq 1$, $i \neq m$, $i \neq N$ and $q' = q$ or $q' = \bar q$ is similar. For $q'$ one has to take $q' = \bar q_{i-1} = q^{-(-1)^{\langle i\rangle}}$.
5. The proof of the Serre relations (10b) is similar to that of (9b).
6. Proof of $e_m^2 = 0$ (i.e., of (9a) for $i = j = m$).
\[ e_m^2 \sim [[e_m,e_m]]_{q^2} \ \text{(use (29b))} \ \sim [[\bar L_{m+1}[[a_m^-,a_{m+1}^+]],\bar L_{m+1}[[a_m^-,a_{m+1}^+]]]]_{q^2} = q\bar L_{m+1}^2[[[[a_m^-,a_{m+1}^+]],[[a_m^-,a_{m+1}^+]]]]_{q^2}. \]
In order to evaluate $[[[[a_m^-,a_{m+1}^+]],[[a_m^-,a_{m+1}^+]]]]_{q^2}$ set $A = [[a_m^-,a_{m+1}^+]]$, $B = a_m^-$, $C = a_{m+1}^+$. Note that $\deg(A) = \deg(B) = \bar 1$, $\deg(C) = \bar 0$ and use the identity Id(2) with $x = 1$, $y = q^2$, $z = r = t = q$, $s = \bar q$. It yields
\[ [[[[a_m^-,a_{m+1}^+]],[[a_m^-,a_{m+1}^+]]]]_{q^2} = [[[[[[a_m^-,a_{m+1}^+]],a_m^-]]_q,a_{m+1}^+]]_q - q[[a_m^-,[[[[a_m^-,a_{m+1}^+]],a_{m+1}^+]]_q]]_{\bar q} = 0, \]
since, as it follows from (30e), $[[[[a_m^-,a_{m+1}^+]],a_m^-]]_q = 0$ and $[[[[a_m^-,a_{m+1}^+]],a_{m+1}^+]]_q = 0$.
7. The proof of $f_m^2 = 0$ is similar.
8. Proof of $\{[e_m,e_{m-1}]_q,[e_m,e_{m+1}]_{\bar q}\} = 0$. So far we have proved the validity of the Cartan–Kac relations (8) and of the Serre relations (9a,b) and (10a,b). Therefore we can refer to them. In particular Eqs. (15) hold. Using (15a), write $a_{m-1}^- = -[[e_{m-1},[e_m,e_{m+1}]_q]_{\bar q},a_{m+2}^-]_q$. According to (18a), which was proved in Proposition 4, $[[e_m, a_{m-1}^-]] = 0$. Therefore, $0 = [[e_m,a_{m-1}^-]] = -[[e_m,[[e_{m-1},[e_m,e_{m+1}]_q]_{\bar q},a_{m+2}^-]_q]]$ and since, again from (18a), $[e_m, a_{m+2}^-] = 0$, applying Id(3), we have $[[e_m, a_{m-1}^-]] = -[y, a_{m+2}^-]_q$, where $y = [[e_m,[e_{m-1},[e_m,e_{m+1}]_q]_{\bar q}]]$, which can be written also as
\[ y = [[e_m,[[e_{m-1},e_m]_{\bar q},e_{m+1}]_q]] \tag{37} \]
and
\[ [y, a_{m+2}^-]_q = 0. \tag{38} \]
From (13b), (37) and (8c) one immediately concludes that $[y, a_{m+2}^+] = 0$. Therefore, applying Id(3), one has $0 = [[y,a_{m+2}^-]_q,a_{m+2}^+] = [y,[a_{m+2}^-,a_{m+2}^+]]_q$ (use (30c)) $= -2(q-\bar q)^{-1}[y, L_{m+2}-\bar L_{m+2}]_q$, which after pushing $L_{m+2}$ and $\bar L_{m+2}$ to the right, yields $(1-q^2)\,yL_{m+2} = 0$. Hence
\[ y = [[e_m,[[e_{m-1},e_m]_{\bar q},e_{m+1}]_q]] = 0. \tag{39} \]
Set in (39) $A = e_m$, $C = [e_{m-1},e_m]_{\bar q}$, $B = e_{m+1}$ and use the following identity, which follows from Id(2): if $B$ is an even element, then
\[ [[A,[C,B]_q]] = -q[[[A,B]_{\bar q},C]] - [B,[[A,C]]_q]. \tag{40} \]
This yields
\[ y = -q[[[e_m,e_{m+1}]_{\bar q},[e_{m-1},e_m]_{\bar q}]] - [e_{m+1},[[e_m,[e_{m-1},e_m]_{\bar q}]]_q] = \{[e_m,e_{m-1}]_q,[e_m,e_{m+1}]_{\bar q}\} - [e_{m+1},\{e_m,[e_{m-1},e_m]_{\bar q}\}_q] = 0. \]
The second term in the r.h.s. above is zero, since $\{e_m,[e_{m-1},e_m]_{\bar q}\}_q = q e_{m-1}e_m^2 - \bar q e_m^2 e_{m-1}$ and $e_m^2 = 0$. Therefore, $\{[e_m,e_{m-1}]_q,[e_m,e_{m+1}]_{\bar q}\} = 0$, which proves (9c).
9. The proof of (10c) is similar.
10. Proof of $[e_N,[e_N,[e_N,e_{N-1}]_{\bar q}]]_q = 0$. From (29b) $e_{N-1} = \tfrac12\bar L_N[a_{N-1}^-,a_N^+]$ and $e_N = \tfrac{1}{\sqrt2}a_N^-$. Therefore,
\[ [e_N,e_{N-1}]_{\bar q} = \big[\tfrac{1}{\sqrt2}a_N^-,\ \tfrac12\bar L_N[a_{N-1}^-,a_N^+]\big]_{\bar q} = \tfrac{1}{2\sqrt2}\,\bar q\,\bar L_N[a_N^-,[a_{N-1}^-,a_N^+]], \]
which, applying (30e), yields:
\[ [e_N,e_{N-1}]_{\bar q} = -\frac{\bar q}{\sqrt2}\,a_{N-1}^-. \tag{41} \]
Therefore,
\[ [e_N,[e_N,[e_N,e_{N-1}]_{\bar q}]]_q = -\frac{\bar q}{2\sqrt2}[a_N^-,[a_N^-,a_{N-1}^-]]_q = -\frac{1}{2\sqrt2}[[a_{N-1}^-,a_N^-],a_N^-]_{\bar q} = 0, \]
according to (30d). Hence, (9d) holds.
11. The proof of (10d) is similar. This completes the proof of Proposition 6.

The relations (29), written in the form ($i = 1, \dots, N-1$)
\[ h_i = H_i - H_{i+1}, \quad H_N = h_N, \tag{42a} \]
\[ e_i = \tfrac12 q^{-H_{i+1}}[[a_i^-, a_{i+1}^+]], \quad e_N = \tfrac{1}{\sqrt2}\,a_N^-, \tag{42b} \]
\[ f_i = \tfrac12[[a_i^+, a_{i+1}^-]]\,q^{H_{i+1}}, \quad f_N = -\tfrac{1}{\sqrt2}\,a_N^+, \tag{42c} \]
indicate that the Chevalley elements are functions of the Green generators. More precisely, $h_i$, $e_i$, $f_i$ are in the closure of the subalgebra of all polynomials of the Green operators over $C[[h]]$.

So far we were considering $U_q[osp(2n+1/2m)]$ as a topologically free module over the ring $C[[h]]$ of the formal power series over an indeterminate $h$. Due to this, for instance, $q^{H_i}$ is a well defined element of $U_q[osp(2n+1/2m)]$. It is important to note however that all our considerations remain true if one goes to the factor algebra $U_{q_c}[osp(2n+1/2m)]$, replacing $h$ by a complex number $h_c$ such that $h_c \notin i\pi Q$ ($Q$ being the set of all rational numbers), namely considering $q$ to be a number $q_c$ which is not a root of 1. Then in the limit $h_c \to 0$ the deformed Green generators become ordinary parabosons and parafermions. This is the justification for calling the operators (13) deformed Green generators, and the statistics corresponding to them a quantum deformation of the parastatistics. We conclude the paper by formulating our main result as a theorem.

Theorem 1. $U_q[osp(2n+1/2m)]$ is a topologically free $C[[h]]$ module and an associative unital algebra with generators $H_i$, $a_i^\pm$, $i = 1, \dots, N$ and relations (30). The generators consist of $m$ pairs of deformed parabosons and $n$ pairs of deformed parafermions.

This theorem establishes a link between the quantum groups in the sense of Drinfeld–Jimbo [6, 14] and the quantum statistics in the sense of Green [10].

Acknowledgement. I am grateful to Prof. Randjbar-Daemi for the kind hospitality at the High Energy Section of ICTP. Constructive discussions with Dr. N.I. Stoilova are gratefully acknowledged. This work was supported by the Grant 8-416 of the Bulgarian Foundation for Scientific Research.
References
1. Biedenharn, L.C.: The quantum group SU_q(2) and q-analogue of the boson operators. J. Phys. A: Math. Gen. 22, L873–L878 (1989)
2. Bracken, A.J. and Green, H.S.: Parastatistics and the quark model. J. Math. Phys. 14, 1784–1793 (1973)
3. Bracken, A.J., Gould, M.D. and Zhang, R.B.: Quantum supergroups and solutions of the Yang-Baxter equation. Mod. Phys. Lett. A 5, 831–840 (1990)
4. Celeghini, E., Palev, T.D. and Tarlini, M.: The quantum superalgebra B(0/1) and q-deformed creation and annihilation operators. Mod. Phys. Lett. B 5, 187–193 (1991)
5. Chaichian, M. and Kulish, P.: Quantum Lie superalgebras and q-oscillators. Phys. Lett. 234B, 72–80 (1990)
6. Drinfeld, V.: Quantum groups. ICM proceedings, Berkeley, 1986, pp. 798–820
7. Floreanini, R., Spiridonov, V.P. and Vinet, L.: q-oscillator realizations of the quantum superalgebras sl_q(m, n) and osp_q(m, 2n). Commun. Math. Phys. 137, 149–160 (1990)
8. Floreanini, R., Leites, D.A. and Vinet, L.: On the defining relations of quantum superalgebras. Lett. Math. Phys. 23, 127–131 (1991)
9. Ganchev, A.Ch. and Palev, T.D.: A Lie superalgebraical analysis of the para-Bose statistics. J. Math. Phys. 21, 797–799 (1980)
10. Green, H.S.: A generalized method of field quantization. Phys. Rev. 90, 270–273 (1953)
11. Greenberg, O.W. and Messiah, A.M.: Selection rules for parafields and absence of para particles in nature. Phys. Rev. 138, B1155–1167 (1965)
12. Hadjiivanov, L.K.: Quantum deformation of Bose parastatistics. J. Math. Phys. 34, 5476–5492 (1993)
13. Haldane, F.D.M.: “Fractional statistics” in arbitrary dimensions: A generalization of the Pauli principle. Phys. Rev. Lett. 67, 937–940 (1991)
14. Jimbo, M.: A q-analogue of U(gl(n + 1)), Hecke algebra, and Yang-Baxter equation. Lett. Math. Phys. 11, 247–252 (1986)
15. Kac, V.G.: Representations of classical Lie superalgebras. Lecture Notes in Math. 676, Berlin–Heidelberg–New York: Springer, 1979, pp. 597–626
16. Kamefuchi, S. and Takahashi, Y.: A generalization of field quantization and statistics. Nucl. Phys. 36, 177–206 (1962)
17. Khoroshkin, S.M. and Tolstoy, V.N.: Universal R-matrix for quantized (super)algebras. Commun. Math. Phys. 141, 599–617 (1991)
18. Macfarlane, A.J.: On q-analogues of the quantum harmonic oscillator and the quantum group SU(2)_q. J. Phys. A: Math. Gen. 22, 4581–4588 (1989)
19. Omote, M., Ohnuki, Y. and Kamefuchi, S.: Fermi-Bose similarity. Progr. Theor. Phys. 56, 1948–1964 (1976)
20. Palev, T.D.: A description of the superalgebra osp(2n + 1/2m) via Green generators. J. Phys. A: Math. Gen. 29, L171–L176 (1996)
21. Palev, T.D.: Para-Bose and para-Fermi operators as generators of orthosymplectic Lie superalgebras. J. Math. Phys. 23, 1100–1102 (1982)
22. Palev, T.D.: Vacuum-like state analysis of the representations of the para-Fermi operators. Ann. Inst. Henri Poincare XXIII, 49–60 (1975)
23. Palev, T.D.: Lie algebraical aspects of the quantum statistics. Thesis, Institute of Nuclear Research and Nuclear Energy, Sofia, 1976
24. Palev, T.D.: Lie algebraical aspects of quantum statistics. Unitary quantization (A-quantization). Preprint JINR E17-10550 (1977) and hep-th/9705032
25. Palev, T.D.: Lie superalgebras, infinite-dimensional algebras and quantum statistics. Rep. Math. Phys. 31, 241–262 (1992)
26. Palev, T.D.: On dynamical quantization. Czech. J. Phys. B32, 680–687 (1982)
27. Palev, T.D.: Wigner approach to quantization. Noncanonical quantization of two particles interacting via a harmonic potential. J. Math. Phys. 23, 1778–1784 (1982)
28. Palev, T.D. and Stoilova, N.I.: Many-body Wigner quantum systems. J. Math. Phys. 38, 2506–2523 (1997) and hep-th/9606011
29. Palev, T.D. and Stoilova, N.I.: Wigner quantum oscillators. Osp(3/2) oscillators. J. Phys. A: Math. Gen. 27, 7387 (1994) and hep-th/9405125
30. Palev, T.D.: Quantization of U_q[so(2n + 1)] with deformed para-Fermi operators. Lett. Math. Phys. 31, 151–157 (1994) and hep-th/9311163
31. Palev, T.D.: Quantization of U_q[osp(1/2n)] with deformed para-Bose operators. J. Phys. A: Math. Gen. 26, L1111–L1116 (1993) and hep-th/9306016
32. Palev, T.D. and Van der Jeugt, J.: The quantum superalgebra U_q[osp(1/2n)]: deformed para-Bose operators and root of unity representations. J. Phys. A: Math. Gen. 28, 2605–2616 (1995) and q-alg/9501020
33. Ryan, C. and Sudarshan, E.C.G.: Representation of parafermi rings. Nucl. Phys. 47, 207–211 (1963)
34. Scheunert, M.: Serre-type relations for special linear Lie superalgebras. Lett. Math. Phys. 24, 173–181 (1992)
35. Sun, C.P. and Fu, H.C.: The q-deformed boson realization of the quantum group SU(n)_q and its representations. J. Phys. A: Math. Gen. 22, L983–L986

Communicated by G. Felder
Commun. Math. Phys. 196, 445 – 459 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Hidden Symmetry Algebras of a Class of Quasi-Exactly Solvable Multi Dimensional Operators*
Y. Brihaye, J. Nuyts
Department of Mathematical Physics, University of Mons, Av. Maistriau, B-7000 Mons, Belgium. E-mail: [email protected]; [email protected]
Received: 30 January 1997/ Accepted: 18 February 1998
* Work supported in part by the Belgian Fonds National de la Recherche Scientifique.

Abstract: Let $P(N, V)$ denote the vector space of polynomials of maximal degree less than or equal to $N$ in $V$ independent variables. This space is preserved by the enveloping algebra generated by a set of linear differential operators representing the Lie algebra $gl(V+1)$. We establish the counterpart of this property for the vector space $P(M, V) \oplus P(N, V)$ for any values of the integers $M, N, V$. We show that the operators preserving $P(M, V) \oplus P(N, V)$ generate an abstract superalgebra (non-linear if $\Delta = |M - N| \geq 2$). A family of algebras is also constructed, extending this particular algebra by $\Delta - 1$ arbitrary complex parameters.

1. Introduction
Quasi-exactly solvable (QES) equations refer to a class of spectral differential equations for which a part of the spectrum can be obtained by solving algebraic equations [1, 2, 3]. Linear differential operators preserving a finite dimensional space of smooth functions constitute in this respect a basic ingredient in the topic of QES equations.
In the case of operators of one real variable acting on a scalar function, the possibilities of finite dimensional invariant vector spaces are rather limited [4]. Up to a change of the variable and a redefinition of the function, the vector space can only be the set $P(N)$ of polynomials with degree less than or equal to $N$. The relevant operators are the elements of the enveloping algebra of $sl(2)$ whose generators are suitably represented by three differential operators [1, 4]. The QES equations they define therefore possess an $sl(2)$ hidden symmetry.
When the number of variables and/or the number of components of the function is larger than one, the number of possible invariant vector spaces and hidden symmetry algebras increases considerably. The scalar QES operators in two variables were classified in [5, 6]. Seven inequivalent spaces of functions appear to be possible.
Correspondingly, the hidden symmetry algebras can be of several types, e.g. $sl(2)$, $sl(2) \otimes sl(2)$, $sl(3)$. A few cases of this classification generalise easily to operators involving an arbitrary number of variables. In particular, the case labelled 2.3 in ref. [6] can be extended to the space $P(N, V)$ of polynomials of maximal degree less than or equal to $N$ in their $V$ independent variables. The related algebra is $sl(V+1)$.
The construction of the matrix operators in one variable preserving the direct sum $P(N_1) \oplus \dots \oplus P(N_k)$ has also been considered [7, 8]. These operators are closely related to graded algebras. As an example, the case $P(N) \oplus P(N+1)$ is related to the graded Lie algebra $osp(2, 2)$ [9].
The purpose of this paper is to classify the operators preserving the vector space $P(M, V) \oplus P(N, V)$ for arbitrary values of the integers $M, N, V$ and to construct a series of associative algebras corresponding to the hidden symmetries of these operators.
In Sect. 2 we fix the notations and point out the relevant representations of the algebra $gl(n)$. The $2 \times 2$ matrix operators preserving the space $P(N, V) \oplus P(M, V)$ are constructed in Sect. 3 and are shown to obey a set of normal ordering rules. In Sect. 4, these ordering rules are modified into sets of commutation and anticommutation relations which fulfill all Jacobi identities. We obtain in this way a series of associative abstract algebras which appear to be labelled by $V$, by $\Delta = |N - M|$ and by $\Delta - 1$ arbitrary complex parameters. The technical details related to the proof of our main result are given in Sect. 5.

2. Operators Preserving P(N, V)
Let $N, V$ be two positive integers. Let $x_i$ ($i = 1, \dots, V$) represent $V$ independent real variables. We define the finite dimensional vector space $P(N, V)$ of polynomials in the variables $x_i$ and of maximal total degree $N$,
\[ P(N, V) = \mathrm{span}\,\{x_1^{n_1}x_2^{n_2}\cdots x_V^{n_V}\}, \quad 0 \leq \sum_{j=1}^{V} n_j \leq N, \tag{1} \]
\[ P(N, 1) \equiv P(N) = \mathrm{span}\,\{1, x, \dots, x^N\}. \tag{2} \]
The dimension of $P(N, V)$ is given by
\[ 1 + V + \frac{V(V+1)}{2} + \dots + \frac{V(V+1)\cdots(V+N-1)}{N!} = C^{V}_{N+V}. \tag{3} \]
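The count in (3) can be checked directly for small cases. The sketch below, given only as an illustration, compares three numbers: a brute-force enumeration of exponent tuples with total degree at most $N$, the term-by-term sum on the left of (3), and the binomial coefficient on the right. The function names are hypothetical.

```python
from itertools import product
from math import comb


def dim_by_enumeration(N, V):
    """Count exponent tuples (n_1, ..., n_V) with n_1 + ... + n_V <= N."""
    return sum(1 for n in product(range(N + 1), repeat=V) if sum(n) <= N)


def dim_by_series(N, V):
    """Left-hand side of (3): the degree-d term V(V+1)...(V+d-1)/d! equals C(V+d-1, d)."""
    return sum(comb(V + d - 1, d) for d in range(N + 1))


def dim_closed_form(N, V):
    """Right-hand side of (3): the binomial coefficient C(N+V, V)."""
    return comb(N + V, V)


formula_ok = all(
    dim_by_enumeration(N, V) == dim_by_series(N, V) == dim_closed_form(N, V)
    for N in range(5) for V in range(1, 4)
)
print(formula_ok)
```

For $V = 1$ the count reduces to $N + 1$, the dimension of $P(N)$ in (2).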
Following the ideas of Turbiner [1, 4], the following lemma is easy to prove.
Lemma 1. The set of linear differential operators preserving $P(N, V)$ is the enveloping algebra of the following operators:
\[ J_0^0(N) = D - N, \quad D \equiv \sum_{j=1}^{V} x_j\frac{\partial}{\partial x_j}, \]
\[ J_0^k(N) = \frac{\partial}{\partial x_k}, \quad k = 1, \dots, V, \]
\[ J_k^0(N) = -x_k(D - N), \quad k = 1, \dots, V, \]
\[ J_k^l(N) = -x_k\frac{\partial}{\partial x_l}, \quad k, l = 1, \dots, V. \tag{4} \]
Simple computations show that
Lemma 2. The $(V+1)^2$ independent operators (4) fulfill the commutation relations of the Lie algebra $gl(V+1)$,
\[ [J_a^b, J_c^d] = \delta_a^d J_c^b - \delta_c^b J_a^d, \quad a, b, c, d = 0, 1, \dots, V. \tag{5} \]
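The "simple computations" behind Lemma 2 can be mechanised. The following sketch is an illustration under stated assumptions: it represents a polynomial as a dictionary from exponent tuples to integer coefficients, reads the first index of $J$ as the lower one and the second as the upper one, and implements the operators (4) for hypothetical small values $V = 2$, $N = 3$. It then tests relation (5) on monomials and checks that monomials of degree at most $N$ stay in $P(N, V)$.

```python
from itertools import product

V, N = 2, 3  # hypothetical small sizes chosen for the check


def add(p, r, c=1):
    """Return p + c*r for polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for mono, coeff in r.items():
        out[mono] = out.get(mono, 0) + c * coeff
        if out[mono] == 0:
            del out[mono]
    return out


def times_x(p, k):
    """Multiply by x_k (k = 1..V)."""
    return {m[:k - 1] + (m[k - 1] + 1,) + m[k:]: c for m, c in p.items()}


def d_dx(p, k):
    """Partial derivative with respect to x_k (k = 1..V)."""
    return {m[:k - 1] + (m[k - 1] - 1,) + m[k:]: c * m[k - 1]
            for m, c in p.items() if m[k - 1] > 0}


def d_minus_n(p):
    """Apply D - N, where D is the Euler operator sum_j x_j d/dx_j."""
    return {m: c * (sum(m) - N) for m, c in p.items() if sum(m) != N}


def J(a, b, p):
    """Apply J_a^b of Eq. (4) to p (a is the lower index, b the upper index)."""
    if a == 0 and b == 0:
        return d_minus_n(p)
    if a == 0:
        return d_dx(p, b)
    if b == 0:
        return add({}, times_x(d_minus_n(p), a), -1)
    return add({}, times_x(d_dx(p, b), a), -1)


def relation_5_holds(p):
    """Check [J_a^b, J_c^d] p = delta_a^d J_c^b p - delta_c^b J_a^d p for all indices."""
    for a, b, c, d in product(range(V + 1), repeat=4):
        lhs = add(J(a, b, J(c, d, p)), J(c, d, J(a, b, p)), -1)
        rhs = {}
        if a == d:
            rhs = add(rhs, J(c, b, p))
        if c == b:
            rhs = add(rhs, J(a, d, p), -1)
        if lhs != rhs:
            return False
    return True


monomials = [{m: 1} for m in product(range(N + 2), repeat=V)]
commutators_ok = all(relation_5_holds(p) for p in monomials)
preserved = all(sum(m) <= N
                for p in monomials if sum(next(iter(p))) <= N
                for a, b in product(range(V + 1), repeat=2)
                for m in J(a, b, p))
print(commutators_ok, preserved)
```

The only degree-raising operators are the $J_k^0$, and the factor $D - N$ makes them vanish on monomials of degree $N$, which is why the space is preserved.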
Lemma 3. Acting on the finite dimensional space $P(N, V)$, the operators (4) realize $gl(V+1)$ irreducibly. Within the representation (4), the Casimir operators of $gl(V+1)$
\[ C_p \equiv \sum_{a_1,\dots,a_p=0}^{V} J_{a_1}^{a_2}J_{a_2}^{a_3}\cdots J_{a_p}^{a_1}, \quad p = 1, \dots, V+1 \tag{6} \]
have the values $C_p = (-1)^p N(N+V)^{p-1}$.
The operators $J'^{\,b}_a$ defined by
\[ J'^{\,b}_a = J_a^b + C\delta_a^b, \tag{7} \]
where $C$ is any operator which commutes with all $J$'s, also satisfy the relations (5). For instance, this is the case for the $(V+1)^2 - 1$ independent operators
\[ \tilde J_a^b = J_a^b - \frac{1}{V+1}C_1\delta_a^b \tag{8} \]
which form an irreducible representation of $sl(V+1)$ (since $\tilde C_1 \equiv \tilde J_a^a = 0$) when acting on $P(N, V)$. The usual form [1] of the $sl(2)$ generators
\[ J_+(N) = -\tilde J_1^0 = x(x\partial_x - N), \quad J_0(N) = -\tilde J_1^1 = \Big(x\partial_x - \frac{N}{2}\Big), \quad J_-(N) = \tilde J_0^1 = \partial_x \tag{9} \]
is recovered for $V = 1$. These operators play a major role in the topic of quasi-exactly solvable equations. More generally, an element, say $A$, of the enveloping algebra constructed over the $J_a^b$ (or the $\tilde J_a^b$) is a quasi-exactly solvable operator preserving $P(N, V)$. That is to say that the spectral equation
\[ Ap = \lambda p, \quad p \in P(N, V) \tag{10} \]
admits $C^{V}_{V+N}$ solutions. Recently, the Calogero and Sutherland quantum hamiltonians were shown to be expressible in terms of the operators $J_a^b$ [10]; this result reveals the hidden symmetries of these models.
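For $V = 1$ the statements above can be verified by hand on monomials. The sketch below illustrates this with a hypothetical value $N = 4$ and one-variable polynomials stored as dictionaries from degree to coefficient. It checks the commutators $[J_0, J_\pm] = \pm J_\pm$ and $[J_+, J_-] = -2J_0$ that follow from the explicit form (9), and that $J_+$ annihilates $x^N$ so that $P(N)$ is preserved.

```python
from fractions import Fraction

N = 4  # hypothetical degree bound


def clean(p):
    """Drop zero coefficients from {degree: coefficient}."""
    return {d: c for d, c in p.items() if c != 0}


def J_plus(p):
    """x (x d/dx - N)."""
    return clean({d + 1: (d - N) * c for d, c in p.items()})


def J_zero(p):
    """x d/dx - N/2."""
    return clean({d: (d - Fraction(N, 2)) * c for d, c in p.items()})


def J_minus(p):
    """d/dx."""
    return clean({d - 1: d * c for d, c in p.items() if d > 0})


def sub(p, r):
    """Return p - r."""
    out = dict(p)
    for d, c in r.items():
        out[d] = out.get(d, 0) - c
    return clean(out)


def scale(s, p):
    """Return s * p."""
    return clean({d: s * c for d, c in p.items()})


def commutator(A, B, p):
    """Return (AB - BA) p."""
    return sub(A(B(p)), B(A(p)))


monomials = [{d: Fraction(1)} for d in range(N + 4)]
sl2_ok = all(
    commutator(J_zero, J_plus, p) == J_plus(p)
    and commutator(J_zero, J_minus, p) == scale(-1, J_minus(p))
    and commutator(J_plus, J_minus, p) == scale(-2, J_zero(p))
    for p in monomials
)
top_killed = J_plus({N: Fraction(1)}) == {}
print(sl2_ok, top_killed)
```

The commutators hold on monomials of any degree; only the invariance of $P(N)$ depends on the specific value of $N$ in $J_+$.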
3. Operators Preserving P(M,V) ⊕ P(N,V) We now put the emphasis on the 2×2 matrix operators which preserve the vector space P (M, V ) ⊕ P (N, V ) ,
1 ≡ N − M.
(11)
Without loss of generality, we assume the integer $\Delta$ to be non-negative. In order to classify the operators preserving (11) we define a list of generators. First the "diagonal" generators, which we choose as
$$J_a^b(N, \Delta) = \begin{pmatrix} J_a^b(N-\Delta) & 0 \\ 0 & J_a^b(N) \end{pmatrix} - \frac{\delta_a^b}{2} \begin{pmatrix} \Delta+1 & 0 \\ 0 & 1-\Delta \end{pmatrix} \tag{12}$$
for $0 \le a, b \le V$. They are built as a direct sum of two operators of the type (4)-(7), translated by (7) in such a way that $J_0^0(N, \Delta)$ is proportional to the unit matrix. The interest of this translation will appear later. The "non-diagonal" generators naturally split into "$Q$ operators", proportional to the matrix $\sigma_-$ (as usual $\sigma_\pm = (\sigma_1 \pm i\sigma_2)/2$), and "$\bar{Q}$ operators", proportional to the matrix $\sigma_+$. It is convenient to write them by using a multi-index $[A] \equiv a_1, a_2, \ldots, a_\Delta$. For later convenience we also define $[\hat{A}_i]$ as the set $[A]$ where the index $a_i$ has been removed. We choose the non-diagonal generators respectively as
$$Q_{[A]} = (-1)^{\delta}\, x_{a_1} \cdots x_{a_\Delta}\, \sigma_-, \qquad 0 \le a_i \le V, \qquad x_0 \equiv 1, \tag{13}$$
where δ represents the overall degree in $x_1, \ldots, x_V$ of the monomial $x_{a_1} \cdots x_{a_\Delta}$, and
$$\bar{Q}^{[B]} = q^{[B]}\, \sigma_+, \qquad 0 \le b_i \le V, \tag{14}$$
where the scalar operators $q^{[B]}$, fully symmetric in their $\Delta$ indices $b_k$, are defined by
$$q^{[B]} = \begin{cases} \partial_{b_1} \cdots \partial_{b_\Delta} & \text{if } 0 < b_1 \le b_2 \le b_3 \le \cdots \le b_\Delta, \\ (D - N + \Delta - 1)\, \partial_{b_2} \cdots \partial_{b_\Delta} & \text{if } 0 = b_1 < b_2 \le b_3 \le \cdots \le b_\Delta, \\ (D - N + \Delta - 1)(D - N + \Delta - 2)\, \partial_{b_3} \cdots \partial_{b_\Delta} & \text{if } 0 = b_1 = b_2 < b_3 \le \cdots \le b_\Delta, \\ (D - N + \Delta - 1)(D - N + \Delta - 2) \cdots (D - N) & \text{if } 0 = b_1 = b_2 = \cdots = b_\Delta. \end{cases} \tag{15}$$
The operators $Q_{[A]}$ (and similarly the $\bar{Q}^{[A]}$) are fully symmetric in their $\Delta$ indices $a_k$. Hence there are $C_{V+\Delta}^{\Delta}$ independent operators of each type. We then have the following proposition.

Proposition 1. The operators preserving the space $P(N - \Delta, V) \oplus P(N, V)$ are the elements of the enveloping algebra constructed over the generators (12),(13),(14).

This result (whose demonstration follows the same lines as in the scalar case [4]) allows one to write formally all the operators preserving (11). However, in order to classify these operators, it is useful to set up normal ordering rules between the generators. In particular, these rules allow one to write any product of operators (the enveloping algebra) in a canonical form, e.g. as a sum of terms where, in each term, the $Q$ operators (if any) are written on the left, the $\bar{Q}$ operators (if any) on the right and the $J$ operators in between. As we show next, such rules exist for the operators (12),(13),(14).
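The count of independent Q operators quoted above is the number of symmetric multi-indices of length Δ with entries in 0, ..., V. A minimal sketch, assuming only that full symmetry lets each multi-index be represented by a non-decreasing tuple (the function name is hypothetical), enumerates them and compares with the binomial coefficient:

```python
from itertools import combinations_with_replacement
from math import comb


def independent_multi_indices(V, Delta):
    """Non-decreasing tuples (a_1 <= ... <= a_Delta) with 0 <= a_i <= V.

    Each tuple labels one independent fully symmetric operator Q_[A].
    """
    return list(combinations_with_replacement(range(V + 1), Delta))


if __name__ == "__main__":
    for V, Delta in [(1, 2), (2, 2), (3, 4)]:
        labels = independent_multi_indices(V, Delta)
        print(V, Delta, len(labels), comb(V + Delta, Delta))
```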
Normal ordering rules. The operators (12) obey the commutation rules (5) and assemble into a reducible representation of gl(V + 1) when acting on the vector space (11). Its dimension is
$$C_{N+V}^{V} + C_{N+V-\Delta}^{V}. \tag{16}$$
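By construction, the operators $Q_{[A]}$ (resp. $\bar{Q}^{[A]}$) transform as an irreducible multiplet of dimension $C_{V+\Delta}^{\Delta}$ under the adjoint action of the generators $J_a^b(N, \Delta)$. More precisely, we have
$$[J_a^b, Q_{[A]}] = k\, \delta_a^b\, Q_{[A]} - \sum_{i=1}^{\Delta} \delta_{a_i}^{b}\, Q_{[\hat{A}_i, a]}, \tag{17}$$
$$[J_a^b, \bar{Q}^{[A]}] = -k\, \delta_a^b\, \bar{Q}^{[A]} + \sum_{i=1}^{\Delta} \delta_{a}^{a_i}\, \bar{Q}^{[\hat{A}_i, b]}. \tag{18}$$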
The explicit form of the generators leads to the value $k = \Delta$. The first Casimir constructed with (12), i.e.
$$T \equiv \sum_{a=0}^{V} J_a^a(N, \Delta), \tag{19}$$
plays the role of a grading operator:
$$[T, J_a^b] = 0, \qquad [T, Q_{[A]}] = \Delta V\, Q_{[A]}, \qquad [T, \bar{Q}^{[A]}] = -\Delta V\, \bar{Q}^{[A]}. \tag{20}$$
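The eigenvalue in (20) can be recovered by tracing (17) over a = b. The short derivation below is a sketch that assumes (17) holds with a generic parameter k and uses the symmetry of Q in its indices, so that $Q_{[\hat{A}_i, a_i]} = Q_{[A]}$; it is added here as a consistency check.

```latex
\begin{aligned}
[T, Q_{[A]}] &= \sum_{a=0}^{V} [J_a^a, Q_{[A]}]
  = k \sum_{a=0}^{V} \delta_a^a \, Q_{[A]}
    - \sum_{a=0}^{V} \sum_{i=1}^{\Delta} \delta_{a_i}^{a} \, Q_{[\hat{A}_i, a]} \\
 &= \bigl( k (V+1) - \Delta \bigr) \, Q_{[A]} .
\end{aligned}
```

The grading $\Delta V$ therefore corresponds to the value $k = \Delta$ of the parameter.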
The product of any two operators $Q$ (and separately of two $\bar{Q}$'s) vanishes, hence also their anticommutator
$$\{Q_{[A]}, Q_{[C]}\} = 0, \qquad \{\bar{Q}^{[B]}, \bar{Q}^{[D]}\} = 0. \tag{21}$$
The evaluation of the anticommutator $\{Q, \bar{Q}\}$ is more involved. Its form can be guessed from the covariance under gl(V + 1), from the symmetries of $Q$ and $\bar{Q}$ in their indices and from the fact that the anticommutator involves derivatives of order at most $\Delta$. It is therefore likely that the anticommutator $\{Q, \bar{Q}\}$ should be expressed as a combination of the tensors
$$W_{[A]}^{[B]}(k) \equiv \frac{1}{(\Delta!)^2}\, S[A]\, S[B] \left( J_{a_1}^{b_1} J_{a_2}^{b_2} \cdots J_{a_k}^{b_k}\, \delta_{a_{k+1}}^{b_{k+1}} \cdots \delta_{a_\Delta}^{b_\Delta} \right), \tag{22}$$
where the operator S[.] denotes the sum over all permutations of all indices entering in the argument [.]. After calculation, we found the following relations between (12),(13),(14),
$$\{Q_{[A]}, \bar{Q}^{[B]}\} = \sum_{k=0}^{\Delta} \alpha_k\, W_{[A]}^{[B]}(k), \tag{23}$$
and the parameters $\alpha_k$ are numbers which are uniquely determined by the polynomial equation
$$\prod_{j=0}^{\Delta-1} (y + j) = \sum_{k=0}^{\Delta} \alpha_k \left( y + \frac{\Delta-1}{2} \right)^{k}. \tag{24}$$
As a consequence of (24), the right-hand side of (23) is an even (resp. odd) polynomial in the operators J if $\Delta$ is even (resp. odd). We would like to stress that this particularly simple expression is due to the labelling of the generators and to the translation used in (12). A priori, the undetermined coefficients $\alpha_k$ could be 2 × 2 diagonal matrices. The non-vanishing parameters $\alpha_k$ appear only for $k = \Delta, \Delta - 2, \Delta - 4, \ldots$ and read as follows for the first few values of $\Delta$:
$$\begin{array}{ll}
\Delta = 1, & \alpha_k : 1, \\
\Delta = 2, & \alpha_k : 1,\ -\tfrac{1}{4}, \\
\Delta = 3, & \alpha_k : 1,\ -1, \\
\Delta = 4, & \alpha_k : 1,\ -\tfrac{5}{2},\ \tfrac{9}{16}, \\
\Delta = 5, & \alpha_k : 1,\ -5,\ 4, \\
\Delta = 6, & \alpha_k : 1,\ -\tfrac{35}{4},\ \tfrac{259}{16},\ -\tfrac{225}{64}.
\end{array} \tag{25}$$
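The values in (25) follow mechanically from (24) by expanding the product in the shifted variable z = y + (Δ − 1)/2. The following sketch (pure Python with exact rationals; the function names are illustrative) performs that expansion and returns the coefficients α_0, ..., α_Δ.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def alphas(Delta):
    """Coefficients [alpha_0, ..., alpha_Delta] solving eq. (24).

    The product over j of (y + j) is rewritten in z = y + (Delta - 1)/2,
    so each factor becomes z + (j - (Delta - 1)/2).
    """
    shift = Fraction(Delta - 1, 2)
    poly = [Fraction(1)]
    for j in range(Delta):
        poly = poly_mul(poly, [Fraction(j) - shift, Fraction(1)])
    return poly


if __name__ == "__main__":
    for Delta in range(1, 7):
        nonzero = [a for a in reversed(alphas(Delta)) if a != 0]
        print(Delta, [str(a) for a in nonzero])
```

Listing the non-zero coefficients from the highest power downwards gives the rows of (25); the vanishing of every second coefficient reflects the symmetry of the shifted factors under z → −z.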
4. Abstract Algebras

We now investigate the possibility that the operators (12),(13),(14) represent the generators of an abstract associative algebra. We will see that there are two types of such algebras, which we denote generically $A(V, \Delta)$ and $B(V, \Delta)$. A few cases are known to coincide with Lie algebras [9, 7],
$$A(1, 0) \simeq sl(2) \otimes sl(2), \tag{26}$$
$$A(1, 1) \simeq osp(2, 2) \simeq spl(2, 1). \tag{27}$$
For $\Delta > 1$, $A(1, \Delta)$ corresponds to a non-linear superalgebra [7]. The algebra A(1, 2) was treated in great detail in [11]. Here we want to move away from the case V = 1 in order to access the hidden symmetries of the operators preserving (11) in general. With the aim of promoting the normal ordering rules of the previous section into a set of relations defining an abstract associative algebra, we first note that the operators $J_a^b$ (resp. $Q$, $\bar{Q}$) should naturally be interpreted as the bosonic (resp. fermionic) generators of the algebra (this refers of course to the most natural choice of the commutator or of the anticommutator used to exchange the order between these generators). Therefore, we expect some graded algebras to come out. However, it is well known that the knowledge of a particular representation (here (12),(13),(14)) is not sufficient in general to infer the whole algebraic structure: the Jacobi identities are not automatically fulfilled. In the present case, the identities which are not obeyed are those involving a $\{Q, Q\}$ (or a $\{\bar{Q}, \bar{Q}\}$) anticommutator (remember that they vanish). Although we could try to modify the whole set of (anti)commutation relations, we limit our search for the underlying abstract algebras to relaxing only the relation (21). In order to present the way to modify it, a few notations are worth introducing. Due to its symmetry in the indices [A], the representation defined by the $Q_{[A]}$ (and similarly by the $\bar{Q}^{[A]}$) corresponds to a Young diagram with one line of $\Delta$ boxes. The products
$$Q_{a_1 \ldots a_\Delta}\, Q_{c_1 \ldots c_\Delta} \tag{28}$$
assemble into a representation of gl(V + 1) under the adjoint action of the operators J. This representation can be decomposed into irreducible pieces. The symmetry of Q is such that the irreducible representations appearing in the decomposition of (28) correspond to the Young diagrams consisting of two lines with a total number of $2\Delta$ boxes. When applied to the anticommutators
$$Q_{a_1 \ldots a_\Delta}\, Q_{c_1 \ldots c_\Delta} + Q_{c_1 \ldots c_\Delta}\, Q_{a_1 \ldots a_\Delta}, \tag{29}$$
the same decomposition selects only the representations which are symmetric under the exchange [A] ↔ [C]. In terms of Young diagrams they correspond to the diagrams with two lines and a total number of $2\Delta$ boxes; the upper line is of length $2\Delta - 2p$ (with $2p \le \Delta$) and the lower line is of even length 2p. One Young tableau, corresponding to this Young diagram with fixed p, is obtained by filling the first (resp. the second) line with
$$[a_1, a_2, \ldots, a_\Delta, c_{2p+1}, c_{2p+2}, \ldots, c_\Delta] \quad (\text{resp. } [c_1, c_2, \ldots, c_{2p}]). \tag{30}$$
The Young element $S_Y$ corresponding to this Young tableau reads
$$S_Y = S[a_1, \ldots, a_\Delta, c_{2p+1}, \ldots, c_\Delta]\; S[c_1, \ldots, c_{2p}]\; E_x, \tag{31}$$
with
$$E_x \equiv \prod_{k=1}^{2p} \bigl( E - (a_k, c_k) \bigr), \tag{32}$$
where the operator S[.] (defined previously) denotes the sum over all permutations of all indices appearing in the argument [.], where (a, b) denotes the transposition a ↔ b and where E is the identity operator. With these notations, we are ready to describe the conditions we choose for the anticommutators of two Q operators. We restrict them by imposing
$$S_Y\, \{Q_{[A]}, Q_{[C]}\} = 0. \tag{33}$$
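The operator E_x of (32) is a product of 2p commuting factors, so it expands into 2^{2p} signed products of disjoint transpositions. The sketch below is illustrative only: the string labels and function names are assumptions. It lists these terms and applies one of them to a tuple of index labels.

```python
from itertools import product


def expand_Ex(pairs):
    """Expand prod_k (E - (a_k, c_k)) into (sign, transpositions) terms.

    `pairs` is a list of (a_k, c_k) label pairs. Each factor contributes
    either the identity (sign +1) or the transposition (sign -1).
    """
    terms = []
    for choice in product((False, True), repeat=len(pairs)):
        swaps = [pair for pair, used in zip(pairs, choice) if used]
        terms.append(((-1) ** len(swaps), swaps))
    return terms


def apply_term(swaps, labels):
    """Apply a set of disjoint transpositions to a tuple of labels."""
    mapping = {}
    for a, c in swaps:
        mapping[a], mapping[c] = c, a
    return tuple(mapping.get(x, x) for x in labels)


if __name__ == "__main__":
    pairs = [("a1", "c1"), ("a2", "c2")]  # the case 2p = 2
    for sign, swaps in expand_Ex(pairs):
        print(sign, apply_term(swaps, ("a1", "a2", "c1", "c2")))
```

Because the transpositions act on disjoint pairs of labels, the order in which they are applied does not matter.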
This corresponds to the vanishing of a particular representation contained in the decomposition of the symmetrized product of $Q_{[A]}$ with $Q_{[C]}$ into irreducible representations of gl(V + 1). The absent representation is exactly the one related to the Young diagram defined above, characterized by p and by the Young element (31). We have studied the associativity conditions, i.e. the Jacobi identities, compatible with the relation (5), the relations (17),(18) for an arbitrary value of the parameter k, the relation (23) for arbitrary values of the parameters $\alpha_k$ and the relation (33) for an arbitrary integer p (in fact $2p \le \Delta$). The result of our calculation is summarized by the following proposition:

Proposition 2. The set of relations (5),(17),(18),(23),(33) is compatible with all the Jacobi identities in two cases only:
1. $p = 0$ and $k = \Delta$,
2. $\Delta$ even, $\Delta = 2p$ and $k = -1$.
In both cases, the anticommutation relations of two $\bar{Q}$'s have to follow the same symmetry pattern as the anticommutation relations (33) of two $Q$'s. Associativity is realized irrespective of the values of the parameters $\alpha_k$. That is to say, we obtained two families of associative algebras A, B, each indexed by $\Delta + 1$ parameters and by the integers V and $\Delta$. By a suitable choice of the normalisation of the $Q$'s and/or of the $\bar{Q}$'s one can set $\alpha_\Delta = 1$ in (23). One can also set $\alpha_{\Delta-1} = 0$ by using an appropriate translation (7) on the operators $J(N, \Delta)$. Before presenting the proof of this result, let us discuss a few properties of the algebras.

Case 1. The abstract algebra $A(V, \Delta, \alpha_k)$. In the case $p = 0$, $k = \Delta$, the constraint (33) reads
$$S[A, C]\, \{Q_{[a_1, a_2 \ldots a_\Delta]}, Q_{[c_1, c_2 \ldots c_\Delta]}\} = 0 \tag{34}$$
and the same relation has to be imposed on the operators $\bar{Q}$. Using some combinatorics, one can show that (34) encodes a total number of $C_{2\Delta+V}^{V}$ independent relations among the anticommutators of two $Q$'s. Remembering that there are $C_{V+\Delta}^{V}$ operators $Q$, we see easily that the number of constraints is lower than the number of independent anticommutators, that is to say, not all anticommutators are constrained. The operators (12),(13),(14) constitute a particular representation of the algebras of this type: the ones corresponding to the values $\alpha_k$ determined by (24). For these operators the conditions (34) are trivially realized. In the case $\Delta = 1$, the relation (33) just implies that all anticommutators of two operators $Q$ vanish (and similarly for two $\bar{Q}$). The algebra A(V, 1) is linear; it coincides with the Lie superalgebra denoted spl(V + 1, 1) in the classification [12]. If V = 1 one recovers the algebra osp(2, 2) (remember the equivalence of osp(2, 2) with spl(2, 1)).

Case 2. The abstract algebra $B(V, \Delta, \alpha_k)$. If $\Delta$ is even and if $p = \Delta/2$ the constraints (33) on the $Q$'s can be set in the form
$$\{Q_{a_1 a_2 \ldots a_\Delta}, Q_{a_{\Delta+1} a_{\Delta+2} \ldots a_{2\Delta}}\} = \{Q_{\sigma(a_1)\sigma(a_2) \ldots \sigma(a_\Delta)}, Q_{\sigma(a_{\Delta+1})\sigma(a_{\Delta+2}) \ldots \sigma(a_{2\Delta})}\} \tag{35}$$
for any permutation σ of the $2\Delta$ indices. The total number of independent constraints is not as easy to find as in Case 1; we obtained it in two particular cases:
$$\frac{\Delta(\Delta - 1)}{2} \quad \text{if } V = 1 \tag{36}$$
and
$$\frac{V(V + 1)(V^2 + 9V - 4)}{12} \quad \text{if } \Delta = 2. \tag{37}$$

Exceptional solutions. It should be stressed that associative algebras could also exist with the same structure as above, i.e. with (5), (17), (18), (33) and (23), but where some of the parameters $\alpha_k$ are 2 × 2 diagonal matrices. That is to say, they depend on the Casimir operators constructed with the gl(V + 1) subalgebra generated by the operators (12). We could not solve this problem for general values of $\Delta$ but we studied completely the cases $\Delta = 1, 2, 3$. We obtained one new solution in the case $\Delta = 2$, V = 1, p = 1. The most general relation for $\{Q, \bar{Q}\}$ which is compatible with associativity depends on four parameters. It is of the form
$$\{Q_{a_1 a_2}, \bar{Q}^{b_1 b_2}\} = \sum_{j=0}^{2} \alpha_j\, W_{a_1 a_2}^{b_1 b_2}(j) + \beta \left( C_1\, W_{a_1 a_2}^{b_1 b_2}(1) + (4 C_2 - 3 C_1^2)\, W_{a_1 a_2}^{b_1 b_2}(0) \right), \tag{38}$$
where β is the additional parameter while $C_1$, $C_2$ represent the Casimir operators (6) computed in the representation (12).

5. Proof of Proposition 2

Let us come to the proof of Proposition 2. The relevant Jacobi identities are
$$\bigl[ \{Q_{[A]}, \bar{Q}^{[B]}\}, Q_{[C]} \bigr] + \bigl[ \{Q_{[C]}, \bar{Q}^{[B]}\}, Q_{[A]} \bigr] + \bigl[ \{Q_{[A]}, Q_{[C]}\}, \bar{Q}^{[B]} \bigr] = 0. \tag{39}$$
The application of $S_Y$ (see (31)) to this equation and the use of (33) lead to the necessary and sufficient conditions
$$S_Y \Bigl( \bigl[ \{Q_{[A]}, \bar{Q}^{[B]}\}, Q_{[C]} \bigr] + \bigl[ \{Q_{[C]}, \bar{Q}^{[B]}\}, Q_{[A]} \bigr] \Bigr) = 0. \tag{40}$$
Moreover $S_Y$ (with an even second line) applied to a tensor $T_{[A,C]}$, symmetrical in [A] on one side and in [C] on the other side, selects automatically the piece in T symmetrical under the exchange [A] ↔ [C]. Hence, the necessary and sufficient condition becomes simply
$$S_Y \bigl[ \{Q_{[C]}, \bar{Q}^{[B]}\}, Q_{[A]} \bigr] = 0. \tag{41}$$
Let us first suppose that the anticommutation relations of $Q$ and $\bar{Q}$ take the form
$$\{Q_{[A]}, \bar{Q}^{[B]}\} = S[A]\, S[B]\, J_{a_1}^{b_1} J_{a_2}^{b_2} \cdots J_{a_\Delta}^{b_\Delta} \tag{42}$$
rather than the more general one (23). Using (42) together with (17), and separating the terms, say $X'$, which come out without k (through (17)) from the terms, say $kY'$, which come out linear in k, the expression (41) becomes
$$X' + kY' = 0, \tag{43}$$
where
$$X' = -S_Y\, S[B] \sum_{i=1}^{\Delta} \sum_{j=1}^{\Delta} \delta_{a_i}^{b_1}\, Q_{[c_j, \hat{A}_i]}\, S[\hat{C}_j]\, J_{c_1}^{b_2} J_{c_2}^{b_3} \cdots J_{c_\Delta}^{b_\Delta} - \ldots, \tag{44}$$
$$Y' = S_Y\, S[B] \sum_{j=1}^{\Delta} \delta_{c_j}^{b_1}\, Q_{[A]}\, S[\hat{C}_j]\, J_{c_1}^{b_2} J_{c_2}^{b_3} \cdots J_{c_\Delta}^{b_\Delta} + \ldots. \tag{45}$$
In (44) and (45), the . . . refer to the terms where the Q does not appear as the first operator, but rather after a J operator. Remark also that the index $c_j$ is absent in the set $[\hat{C}_j]$ and accordingly does not appear as a lower index in the J's. It follows that, as it should, the number $(\Delta - 1)$ of indices $b_k$ in the product of the J's matches the number of $c_m$ indices. Since $X' + kY'$ has to be zero identically, every coefficient of every (independent) operator entering in it has to be zero. This allows a great simplification in the necessary and sufficient conditions.
– The terms labeled . . . in (44) and (45) can be forgotten altogether. Indeed, the terms where the Q's are in the first position are independent of the terms where they are not.
– The symmetry on the [B] can also be eliminated. Every term, for every value of the indices $b_k$, has to vanish on its own.
– Let us introduce the notations
$$W(a_i) = \delta_{a_i}^{b_1} \tag{46}$$
for some arbitrary fixed value of $b_1$ and
$$V[\hat{C}_k] = J_{c_1}^{b_2} J_{c_2}^{b_3} \cdots J_{c_\Delta}^{b_\Delta}, \tag{47}$$
where $c_k$ is absent as a lower index and $b_2, \ldots, b_\Delta$ have also fixed values.

With these simplifications, the condition $X' + kY' = 0$ reduces to the necessary and sufficient condition $X + kY = 0$ with
$$X = -S_Y \sum_{i=1}^{\Delta} \sum_{j=1}^{\Delta} W(a_i)\, Q_{[c_j, \hat{A}_i]}\, S[\hat{C}_j]\, V[\hat{C}_j] \tag{48}$$
and
$$Y = S_Y \sum_{j=1}^{\Delta} W(c_j)\, Q_{[A]}\, S[\hat{C}_j]\, V[\hat{C}_j]. \tag{49}$$
The operator $X + kY$ is composed of exactly two types of independent operators. They can be written canonically as
$$O_1 = W(c_1)\, Q_{[A]}\, V[\hat{C}_1], \tag{50}$$
$$O_2 = W(c_\Delta)\, Q_{[A]}\, V[\hat{C}_\Delta]. \tag{51}$$
Indeed:
– The indices of the Q operator have to be completely symmetrical. Hence they must belong to the first line of the Young tableau and by symmetry of $S_Y$ can be chosen as the [A] set.
– If the index in W is taken in the first line, it can be chosen to be $c_\Delta$. This is due to the fact that any of the indices (except those belonging to the set [A] which already pertain to the Q) in the first line is equivalent by symmetry to any other in the first line. The remaining indices for the V can be chosen in any order and for example in the natural order.
– If the index in W is taken in the second line, it can be chosen to be $c_1$. Indeed any of the indices in the second line is equivalent by symmetry to any other in the second line. The remaining indices for the V can again be chosen in any order and for example in the natural order.

The remaining task is to extract in X and in Y the number of times the operators $O_1$ and $O_2$ occur. This is a rather delicate operation in terms of the symmetries involved. Let us call $X_i$ (resp. $Y_i$) with i = 1, 2 the coefficient of the operator $O_i$ in X (resp. Y). With these notations the condition $X + kY = 0$ becomes equivalent to
$$X_1 + kY_1 = 0, \qquad X_2 + kY_2 = 0. \tag{52}$$
To compute these four coefficients, we will make use of the fundamental theorem of group theory which states that, if P is any permutation of the elements in [A],
$$S[A] = P\, S[A] = S[A]\, P. \tag{53}$$
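Identity (53) states that the sum over all permutations absorbs any single permutation multiplied from either side. A small sketch, assuming permutations are encoded as tuples of images and group-algebra elements as `Counter` objects (an illustrative encoding), makes this concrete:

```python
from collections import Counter
from itertools import permutations


def compose(p, q):
    """Return the permutation 'p after q' (tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))


def symmetrizer(n):
    """Group-algebra element S: the sum of all permutations of n objects."""
    return Counter(permutations(range(n)))


def left_mul(p, elem):
    """Multiply a group-algebra element by the permutation p on the left."""
    out = Counter()
    for g, coeff in elem.items():
        out[compose(p, g)] += coeff
    return out


def right_mul(elem, p):
    """Multiply a group-algebra element by the permutation p on the right."""
    out = Counter()
    for g, coeff in elem.items():
        out[compose(g, p)] += coeff
    return out


def absorbs_every_permutation(n):
    """Check S = P S = S P for every permutation P of n objects."""
    S = symmetrizer(n)
    return all(left_mul(p, S) == S and right_mul(S, p) == S
               for p in permutations(range(n)))
```

This is the property used repeatedly below to factor transpositions out of the S operators "at no cost".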
Computation of $X_1$. Let us rewrite X as
$$X = -\sum_{i=1}^{\Delta} \sum_{j=1}^{\Delta} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, E_x\, W(a_i)\, Q_{[c_j, \hat{A}_i]}\, S[\hat{C}_j]\, V[\hat{C}_j], \tag{54}$$
where we have interchanged the finite summation on i and j with the symmetry operations. First, the $a_i$ in $W(a_i)$, which belongs to the first line, has to be replaced by a c belonging to the second line. This can be done through the intervention of the operator $E_x$ only. At the same time none of the other $a_j$, $j \neq i$, in Q should be replaced by an element of the second line. Hence, from the $2^{2p}$ terms in $E_x$ we can restrict ourselves to the transposition $(a_i, c_i)$, which comes with a minus sign. At the same time i can be restricted to the range 1, ..., 2p. The summation on j then has one term with j = i. For the terms with $j \neq i$, the $c_j$ in Q has to belong to the set $j = 2p + 1, \ldots, \Delta$ in order to be able to replace it by an a by the first symmetry operator S in (54). Hence the restricted part of X, say $\hat{X}$, is composed of two pieces, say $\hat{X}_\alpha$ and $\hat{X}_\beta$,
$$\begin{aligned}
\hat{X}_\alpha &= \sum_{i=1}^{2p} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, (a_i, c_i)\, W(a_i)\, Q_{[c_i, \hat{A}_i]}\, S[\hat{C}_i]\, V[\hat{C}_i] \\
&= \sum_{i=1}^{2p} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, W(c_i)\, Q_{[A]}\, S[\hat{C}_i]\, V[\hat{C}_i] \\
&= \sum_{i=1}^{2p} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, (c_1, c_i)\, W(c_i)\, Q_{[A]}\, S[\hat{C}_i]\, V[\hat{C}_i] \\
&= S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}] \sum_{i=1}^{2p} W(c_1)\, Q_{[A]}\, S[\hat{C}_1]\, V[\hat{C}_1]
\end{aligned} \tag{55}$$
and
$$\hat{X}_\beta = \sum_{i=1}^{2p} \sum_{j=2p+1}^{\Delta} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, (a_i, c_i)\, W(a_i)\, Q_{[c_j, \hat{A}_i]}\, S[\hat{C}_j]\, V[\hat{C}_j]. \tag{56}$$
Using (53), we easily conclude that in $\hat{X}_\alpha$ the following coefficient appears:
$$(\Delta)!\, (2p)!\, (\Delta - 2p)!. \tag{57}$$
The first factor $(\Delta)!$ comes from the permutation of the [A] set which always contributes an equal factor due to the symmetry of the Q. The second term $(2p)!$ comes from the product of the summation over i (a factor 2p) and of a factor $(2p - 1)!$ coming
from the repetition of the symmetries in $[c_2, \ldots, c_{2p}]$ contained in the second and in the third S factors. The last term $(\Delta - 2p)!$ comes from the repetition of the symmetries in $[c_{2p+1}, \ldots, c_\Delta]$ contained in the first and in the third S factors. Let us now focus our attention on the $\hat{X}_\beta$ term. Using (53) we can factor out of $S[A, c_{2p+1}, \ldots, c_\Delta]$, at no cost, a transposition factor $(a_i, c_j)$ and from $S[c_1, \ldots, c_{2p}]$ a factor $(c_1, c_i)$. The product of these two transpositions together with the transposition in $\hat{X}_\beta$ leads to the cyclic permutation $(a_i, c_1, c_i, c_j)$ and the relevant part $\hat{X}_\beta$ becomes
$$\hat{X}_\beta = \sum_{i=1}^{2p} \sum_{j=2p+1}^{\Delta} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, W(c_1)\, Q_{[A]}\, S[\hat{C}_1]\, V[\hat{C}_1]. \tag{58}$$
The following coefficient then appears:
$$(\Delta - 2p)\, (\Delta)!\, (2p)!\, (\Delta - 2p)!. \tag{59}$$
The extra factor as compared to the coefficient coming out of $\hat{X}_\alpha$ is due to the extra summation over j. Summing up the results (57) and (59), we find
$$X_1 = (1 + \Delta - 2p)\, (\Delta)!\, (2p)!\, (\Delta - 2p)!. \tag{60}$$
Computation of $Y_1$. The same technique applied to $Y_1$ is much simpler as the relevant term in $E_x$ is simply the identity. Hence
$$\begin{aligned}
\hat{Y} &= S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}] \sum_{j=1}^{2p} W(c_j)\, Q_{[A]}\, S[\hat{C}_j]\, V[\hat{C}_j] \\
&= S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}] \sum_{j=1}^{2p} W(c_1)\, Q_{[A]}\, S[\hat{C}_1]\, V[\hat{C}_1].
\end{aligned} \tag{61}$$
To pass from the first to the second line we have factored out of $S[c_1, \ldots, c_{2p}]$ the transposition $(c_1, c_j)$. Collecting again the factors, we find
$$Y_1 = (\Delta)!\, (2p)!\, (\Delta - 2p)!. \tag{62}$$
Computation of $X_2$. The relevant term in $E_x$ is again the identity and the reduced part of X which can lead to a term of the form $O_2$ (51) is
$$\hat{X} = -\sum_{i=1}^{\Delta} \sum_{j=1}^{\Delta} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, W(a_i)\, Q_{[c_j, \hat{A}_i]}\, S[\hat{C}_j]\, V[\hat{C}_j]. \tag{63}$$
The summation on j has to be restricted to those values in the first line of the Young diagram. A transposition $(a_i, c_j)$ can then be factored out of $S[A, c_{2p+1}, \ldots, c_\Delta]$ as well as a transposition $(a_i, c_\Delta)$, i.e. in total a cyclic permutation $(a_i, c_\Delta, c_j)$. We find
$$\begin{aligned}
\hat{X} &= -\sum_{i=1}^{\Delta} \sum_{j=2p+1}^{\Delta} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, (a_i, c_\Delta, c_j)\, W(a_i)\, Q_{[c_j, \hat{A}_i]}\, S[\hat{C}_j]\, V[\hat{C}_j] \\
&= -\sum_{i=1}^{\Delta} \sum_{j=2p+1}^{\Delta} S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}]\, W(c_\Delta)\, Q_{[A]}\, S[\hat{C}_\Delta]\, V[\hat{C}_\Delta].
\end{aligned} \tag{64}$$
Collecting the factors as usual, we find
$$X_2 = -\Delta\, (\Delta)!\, (2p)!\, (\Delta - 2p)!. \tag{65}$$
The extra factor $(\Delta)$ comes from the summation over i.

Computation of $Y_2$. In this last case the relevant term in $E_x$ is again the identity and the reduced part of Y which can lead to a term of the form $O_2$ (51) is
$$\begin{aligned}
\hat{Y} &= S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}] \sum_{j=2p+1}^{\Delta} W(c_j)\, Q_{[A]}\, S[\hat{C}_j]\, V[\hat{C}_j] \\
&= S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}] \sum_{j=2p+1}^{\Delta} (c_j, c_\Delta)\, W(c_j)\, Q_{[A]}\, S[\hat{C}_j]\, V[\hat{C}_j] \\
&= S[A, c_{2p+1}, \ldots, c_\Delta]\, S[c_1, \ldots, c_{2p}] \sum_{j=2p+1}^{\Delta} W(c_\Delta)\, Q_{[A]}\, S[\hat{C}_\Delta]\, V[\hat{C}_\Delta].
\end{aligned} \tag{66}$$
Collecting the terms, we find
$$Y_2 = (\Delta)!\, (2p)!\, (\Delta - 2p)!. \tag{67}$$
We can now summarize the conditions coming from (52):
1. The conditions coming from the Jacobi identities are two in number if $p \neq 0$ (the condition for the operator $O_1$ to be defined) and if $\Delta \neq 2p$ (the condition for $O_2$ to be defined). These conditions,
$$k = -(1 + \Delta - 2p), \tag{68}$$
$$k = \Delta, \tag{69}$$
are incompatible.
2. More generally, the anticommutators of two Q's cannot vanish for more than one representation.
3. If p = 0 the only condition comes from the $O_2$ operator. It is
$$k = \Delta, \tag{70}$$
which is a solution to our problem. The corresponding Young diagram has only one line of length $2\Delta$.
4. If $\Delta$ is even and $\Delta = 2p$ the only condition comes from the $O_1$ operator. It is
$$k = -1, \tag{71}$$
which is a second solution to our problem. The corresponding Young diagram has two lines of equal length $\Delta$.

This completes the proof of the proposition when the anticommutator $\{Q, \bar{Q}\}$ is restricted to (42). It is easy to see that the other allowed terms in the anticommutator of the $Q$'s with the $\bar{Q}$'s, i.e. those which do not involve J's only but products of J's and δ's as in (23), lead to exactly the same restrictions. Hence they can all be present at the same time, leaving us with the form (23) with the $\Delta + 1$ arbitrary coefficients. The conditions coming from the Jacobi identities involving two $\bar{Q}$ and one $Q$ also lead to exactly the same conditions. Hence the anticommutators which are chosen to be zero for two $Q$'s on one side or for two $\bar{Q}$'s on the other side must be identical.
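The case analysis following (52) can be reproduced numerically. The sketch below is illustrative only: it takes the coefficients as they are read from (60), (62), (65) and (67), with Δ ≥ 1 and 0 ≤ 2p ≤ Δ assumed, and the function names are hypothetical. It solves $X_i + kY_i = 0$ for each operator $O_i$ that exists and reports the common value of k when there is one.

```python
from fractions import Fraction
from math import factorial


def jacobi_conditions(Delta, p):
    """Values of k required by X_i + k Y_i = 0 for the operators that exist."""
    common = factorial(Delta) * factorial(2 * p) * factorial(Delta - 2 * p)
    X1 = (1 + Delta - 2 * p) * common   # eq. (60)
    Y1 = common                         # eq. (62)
    X2 = -Delta * common                # eq. (65)
    Y2 = common                         # eq. (67)
    ks = []
    if p != 0:            # O_1 needs a non-empty second line
        ks.append(Fraction(-X1, Y1))
    if Delta != 2 * p:    # O_2 needs an index c in the first line
        ks.append(Fraction(-X2, Y2))
    return ks


def allowed_k(Delta, p):
    """Return the unique compatible k, or None if the conditions clash."""
    ks = set(jacobi_conditions(Delta, p))
    return ks.pop() if len(ks) == 1 else None


if __name__ == "__main__":
    for Delta in range(1, 7):
        for p in range(0, Delta // 2 + 1):
            print(Delta, p, allowed_k(Delta, p))
```

Only p = 0 (giving k = Δ) and Δ = 2p (giving k = −1) return a value, in line with the two cases of Proposition 2.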
6. Summary and Conclusions

The operators preserving globally a system of two polynomials in V variables (V ≥ 1) and of degrees N and $N - \Delta$ ($\Delta \ge 0$) respectively can be constructed as the elements of the enveloping algebra of certain superalgebras. In this paper, we have constructed a family of such associative, non-linear superalgebras. Any of these algebras is specified by V, by $\Delta$ and by a set of $\Delta + 1$ complex numbers denoted $\alpha_k$ with $k = 0, 1, \ldots, \Delta$. They are generated by $(V + 1)^2$ (bosonic) operators
$$J_a^b, \qquad a, b = 0, 1, \ldots, V \tag{72}$$
and by two sets of $C_{V+\Delta}^{\Delta}$ (fermionic) operators
$$Q_{[a_1, \ldots, a_\Delta]}, \qquad \bar{Q}^{[a_1, \ldots, a_\Delta]}, \qquad a_k = 0, 1, \ldots, V \tag{73}$$
symmetric in their $\Delta$ indices. The bosonic generators obey the commutation relations of the Lie algebra gl(V + 1). The operators $Q$ (and separately the $\bar{Q}$) assemble into a specific representation of gl(V + 1) under the adjoint action of $J_a^b$ (see (17),(18)). The anticommutators $\{Q, \bar{Q}\}$ are polynomials of degree at most $\Delta$ in the bosonic operators. The arbitrariness of the polynomials is coded in the $\Delta + 1$ parameters $\alpha_k$ (23). All the supplementary conditions on the products of the operators $Q$ (and of the operators $\bar{Q}$) necessary to guarantee associativity (equivalent to the generalised Jacobi identities) are given by our Proposition 2. For all fixed values of the integers V and $\Delta$ and of the complex parameters $\alpha_k$ we denote by $A(V, \Delta, \alpha_k)$ the algebra corresponding to Case 1 of Proposition 2. If $\Delta$ is even, a supplementary algebraic structure, which we denote $B(V, \Delta, \alpha_k)$, is also possible, as predicted by Case 2 of Proposition 2. Referring to the general definition of a W-algebra given recently in [13], it is natural to classify $A(V, \Delta, \alpha_k)$ and $B(V, \Delta, \alpha_k)$ as "finite $W_{\Delta+1}$-superalgebras". An analysis of the representations of $A(1, 2, \alpha_0, \alpha_1, \alpha_2)$, performed recently [11], leads to a rather rich set of inequivalent irreducible, finite-dimensional representations.
Let us stress again that the operators in the enveloping algebras that we have constructed are directly relevant for the study of quasi-exactly solvable systems of equations.

Note added. After submission of this article (q-alg/9701016), we have found in the preprint hep-th/9705219, by Brink, Turbiner and Wyllard [14], an application of our results. Indeed these authors have shown that the Hamiltonian of the super-Calogero model, after a suitable change of variables, is expressible in terms of the generators of our algebra for $\Delta = 1$.

References
1. Turbiner, A.V.: Commun. Math. Phys. 119, 467 (1988)
2. Ushveridze, A.G.: Quasi-exact solvability in quantum mechanics. Institute of Physics Publishing, Bristol and Philadelphia, 1993
3. Shifman, M.A.: Int. J. Mod. Phys. A4, 2897 (1989)
4. Turbiner, A.V.: J. Phys. A 25, L1087 (1992)
5. Gonzalez-Lopez, A., Kamran, N., Olver, P.J.: J. Phys. A 24, 3995 (1991)
6. Gonzalez-Lopez, A., Kamran, N., Olver, P.J.: Phil. Trans. R. Soc. Lond. A 354, 1165 (1996)
7. Brihaye, Y., Kosinski, P.: J. Math. Phys. 35, 3089 (1994)
8. Brihaye, Y., Giller, S., Gonera, C., Kosinski, P.: J. Math. Phys. 36, 4340 (1995)
9. Shifman, M.A., Turbiner, A.V.: Commun. Math. Phys. 120, 347 (1989)
10. Ruhl, W., Turbiner, A.V.: Mod. Phys. Lett. A 10, 2213 (1995)
11. Brihaye, Y., Giller, S., Kosinski, P., Nuyts, J.: Commun. Math. Phys. 187, 202 (1997)
12. Kac, V.G.: Commun. Math. Phys. 53, 31 (1977)
13. Barbarin, F., Ragoucy, E., Sorba, P.: Czech. J. Phys. 46, 1165 (1996)
14. Brink, L., Turbiner, A., Wyllard, N.: J. Math. Phys. 39, 1285 (1998)

Communicated by H. Araki
Commun. Math. Phys. 196, 461 – 476 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Twisted Quantum Affine Algebras
Vyjayanthi Chari1, Andrew Pressley2
1 Department of Mathematics, University of California, Riverside, CA 92521, USA. E-mail: [email protected]
2 Department of Mathematics, King's College, Strand, London WC2R 2LS, UK. E-mail: [email protected]
Received: 20 October 1996/ Accepted: 19 February 1998
Abstract: We give a highest weight classification of the finite-dimensional irreducible representations of twisted quantum affine algebras. As in the untwisted case, such representations are in one-to-one correspondence with n-tuples of monic polynomials in one variable. But whereas in the untwisted case n is the rank of the underlying finite-dimensional complex simple Lie algebra g, in the twisted case n is the rank of the subalgebra of g fixed by the diagram automorphism. The way in which such an n-tuple determines a representation is also more complicated than in the untwisted case.

Introduction

Quantum affine algebras are one of the most important classes of quantum groups. Their finite-dimensional representations lead to solutions of the quantum Yang–Baxter equation which are trigonometric functions of the spectral parameter (see [7], Sect. 12.5 B) and are thus related to various types of integrable models in statistical mechanics and field theory. Quantum affine algebras have also been shown to arise as "quantum symmetry groups" of certain integrable quantum field theories, such as affine Toda field theories (see [2] and [10]). More precisely, there is an affine Toda field theory associated to any affine Lie algebra k, and this theory admits as a quantum symmetry group the quantum affine algebra $U_q(k^*)$, where $k^*$ is the affine Lie algebra dual to k (whose Dynkin diagram is obtained from that of k by reversing the arrows). Since $k^*$ is often twisted even if k is untwisted, this shows that the representation theory of twisted quantum affine algebras is, in this context at least, just as important as that of untwisted ones. However, there appear to be virtually no results in the literature on the twisted case.
The only exceptions appear to be [12] and [14], which prove the existence of finite-dimensional irreducible representations of twisted quantum affine algebras $U_q(k)$ which are irreducible under certain subalgebras of the form $U_q(m)$, where m is a finite-dimensional Lie subalgebra of k, and [15] and [17] which construct, by vertex operator
methods, quantum analogues of the standard modules (which are, of course, infinite-dimensional). In [6] and [8], we gave a classification of the finite-dimensional irreducible representations of untwisted quantum affine algebras in terms of their highest weights, which are in one-to-one correspondence with n-tuples of polynomials in one variable with constant coefficient one (n being the rank of the underlying finite-dimensional Lie algebra). The purpose of this paper is to extend this result to the twisted case. We find that the finite-dimensional irreducible representations of twisted quantum affine algebras are again parametrized by n-tuples of polynomials. But n is now the rank of the fixed point subalgebra of the diagram automorphism, and the way in which such an n-tuple determines a highest weight is more complicated than in the untwisted case.

In the analogous classical situation, we classified in [8] the finite-dimensional irreducible representations of the twisted affine Lie algebra $\hat{g}^\sigma$, associated to a diagram automorphism σ of a finite-dimensional complex simple Lie algebra g, by using the canonical embedding of $\hat{g}^\sigma$ in the untwisted affine Lie algebra $\hat{g}$. Namely, we showed that every finite-dimensional irreducible representation of $\hat{g}$ decomposes under $\hat{g}^\sigma$ into a finite direct sum of irreducibles, and that every finite-dimensional irreducible representation of $\hat{g}^\sigma$ arises in this way. Together with the results of [6] and [7], this gave the desired classification. In the quantum case, Jing [16] has shown how to embed $U_q(\hat{g}^\sigma)$ into $U_q(\hat{g})$, but this embedding is not as simple as in the classical case and we have preferred to use a direct approach, following the method used for untwisted quantum affine algebras in [6] and [8]. Since the proofs are similar to those for the untwisted case, we omit many of the details.
1. Twisted Quantum Affine Algebras

Let g be a finite-dimensional complex simple Lie algebra with Cartan matrix $A = (a_{ij})_{i,j \in I}$. Let σ : I → I be a bijection such that $a_{\sigma(i)\sigma(j)} = a_{ij}$ for all i, j ∈ I, and let m be the order of σ; we assume that m > 1 (thus, m = 2 or 3). We also denote by σ the corresponding Lie algebra automorphism of g. Fix a primitive mth root of unity $\omega \in \mathbb{C}^\times$. For r ∈ Z/mZ, let $g_r$ be the eigenspace of σ on g with eigenvalue $\omega^r$. Then,
$$g = \bigoplus_{r \in \mathbb{Z}/m\mathbb{Z}} g_r$$
is a Z/mZ-gradation of g (see [18], Chapter 8). The fixed point set $g_0$ of σ is a simple Lie algebra. The nodes of its Dynkin diagram are naturally indexed by $I_\sigma$, the set of σ-orbits on I. Moreover, $g_1$ is an irreducible representation of $g_0$. Let $\{\alpha_i\}_{i \in I_\sigma}$ be a set of simple roots of $g_0$, and let θ be the highest weight of $g_1$ as a representation of $g_0$. Let $\{n_i\}_{i \in I_\sigma}$ be the positive integers such that
$$\theta = \sum_{i \in I_\sigma} n_i \alpha_i.$$
The twisted affine Lie algebra $\hat{g}^\sigma$ is the universal central extension (with one-dimensional centre) of the twisted loop algebra
$$L(g)^\sigma = \{ f \in \mathbb{C}[t, t^{-1}] \otimes g \mid f(\omega t) = \sigma(f(t)) \},$$
where t is an indeterminate. It is well known (see [18]) that $\hat{g}^\sigma$ is a symmetrizable Kac–Moody algebra whose Dynkin diagram has nodes indexed by $\hat{I}_\sigma = I_\sigma \sqcup \{0\}$. (Note: the node labelled 0 here is labelled in [18].) Let $A^\sigma = (a^\sigma_{ij})_{i,j \in \hat{I}_\sigma}$ be the (generalized) Cartan matrix of $\hat{g}^\sigma$, and let $\{d_i\}_{i \in \hat{I}_\sigma}$ be the coprime positive integers such that the matrix $(d_i a^\sigma_{ij})$ is symmetric. Setting $n_0 = 1$, we have
$$\sum_{i \in \hat{I}_\sigma} n_i d_i a^\sigma_{ij} = 0 \quad \text{for all } j \in \hat{I}_\sigma. \tag{1}$$
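Since $\hat{\mathfrak{g}}^\sigma$ is a symmetrizable Kac–Moody algebra there is, according to Drinfel'd and Jimbo, a corresponding quantum group $U_q(\hat{\mathfrak{g}}^\sigma)$. Namely, let $q$ be a non-zero complex number, assumed throughout this paper not to be a root of unity. Let $q_i = q^{d_i}$ for $i \in \hat{I}_\sigma$. If $n \in \mathbb{Z}$, set
$$[n]_q = \frac{q^n - q^{-n}}{q - q^{-1}},$$
and for $n \ge r \ge 0$,
$$[n]_q! = [n]_q [n-1]_q \cdots [2]_q [1]_q, \qquad \begin{bmatrix} n \\ r \end{bmatrix}_q = \frac{[n]_q!}{[r]_q!\,[n-r]_q!}.$$

Proposition 1.1. There is a Hopf algebra $U_q(\hat{\mathfrak{g}}^\sigma)$ over $\mathbb{C}$ which is generated as an algebra by elements $e_i^\pm$, $k_i^{\pm 1}$ ($i \in \hat{I}_\sigma$), with the following defining relations:
$$k_i k_i^{-1} = k_i^{-1} k_i = 1; \qquad k_i k_j = k_j k_i;$$
$$k_i e_j^\pm k_i^{-1} = q_i^{\pm a^\sigma_{ij}} e_j^\pm;$$
$$[e_i^+, e_j^-] = \delta_{ij}\,\frac{k_i - k_i^{-1}}{q_i - q_i^{-1}};$$
$$\sum_{r=0}^{1-a^\sigma_{ij}} (-1)^r \begin{bmatrix} 1 - a^\sigma_{ij} \\ r \end{bmatrix}_{q_i} (e_i^\pm)^r e_j^\pm (e_i^\pm)^{1-a^\sigma_{ij}-r} = 0 \quad \text{if } i \ne j.$$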
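The comultiplication $\Delta$ of $U_q(\hat{\mathfrak{g}}^\sigma)$ is given by
$$\Delta(e_i^+) = e_i^+ \otimes k_i + 1 \otimes e_i^+, \qquad \Delta(e_i^-) = e_i^- \otimes 1 + k_i^{-1} \otimes e_i^-, \qquad \Delta(k_i^{\pm 1}) = k_i^{\pm 1} \otimes k_i^{\pm 1}.$$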
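The $q$-integers, $q$-factorials and $q$-binomial coefficients entering the Serre-type relation of Proposition 1.1 can be evaluated exactly for a rational sample value of $q$. The following is only a sketch: the function names are ours, and $q$ is assumed to be a rational number different from $0$ and $\pm 1$.

```python
from fractions import Fraction

def q_int(n, q):
    """q-integer [n]_q = (q^n - q^-n) / (q - q^-1); n may be negative."""
    q = Fraction(q)
    return (q ** n - q ** (-n)) / (q - 1 / q)

def q_factorial(n, q):
    """q-factorial [n]_q! = [n]_q [n-1]_q ... [1]_q, with [0]_q! = 1."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= q_int(k, q)
    return result

def q_binomial(n, r, q):
    """q-binomial coefficient [n]_q! / ([r]_q! [n-r]_q!) for n >= r >= 0."""
    return q_factorial(n, q) / (q_factorial(r, q) * q_factorial(n - r, q))

print(q_int(2, 2), q_binomial(3, 1, 2))
```

Because these expressions are symmetric under $q \mapsto q^{-1}$, evaluating them at $q$ and at $q^{-1}$ gives the same value, which is a convenient check on an implementation.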
It follows from (1) that
$$c = \prod_{i\in\hat{I}_\sigma} k_i^{n_i}$$
lies in the centre of $U_q(\hat{\mathfrak{g}}^\sigma)$. Let $U_q(L(\mathfrak{g})^\sigma)$ be the quotient of $U_q(\hat{\mathfrak{g}}^\sigma)$ by the ideal generated by $c - 1$. Note that, since $c$ is group-like, $U_q(L(\mathfrak{g})^\sigma)$ inherits a natural Hopf algebra structure from $U_q(\hat{\mathfrak{g}}^\sigma)$.

The following theorem is an analogue of a result of Drinfel'd ([13], Theorem 4). To state it, we introduce a quantity $\lambda_{\mathfrak{g}}$, which is equal to 2 if $\mathfrak{g}$ is of type $A_{2n}$ for some
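$n \ge 1$, and equal to 1 otherwise. Further, let $u_1$ and $u_2$ be independent indeterminates and, for $i, j \in I$, define $d_{ij} \in \mathbb{Q}$ and $P^\pm_{ij}, F^\pm_{ij}, G^\pm_{ij} \in \mathbb{C}[u_1, u_2]$ as follows:

if $\sigma(i) = i$, then $d_{ij} = \frac{1}{2}$, $P^\pm_{ij}(u_1, u_2) = 1$;

if $a_{i\sigma(i)} = 0$ and $\sigma(j) \ne j$, then $d_{ij} = \frac{1}{4m}$, $P^\pm_{ij}(u_1, u_2) = 1$;

if $a_{i\sigma(i)} = 0$ and $\sigma(j) = j$, then $d_{ij} = \frac{1}{2}$, $P^\pm_{ij}(u_1, u_2) = \dfrac{u_1^m q^{\pm 2m} - u_2^m}{u_1 q^{\pm 2} - u_2}$;

if $a_{i\sigma(i)} = -1$, then $d_{ij} = \frac{1}{8}$, $P^\pm_{ij}(u_1, u_2) = u_1 q^{\pm 1} + u_2$;

$$F^\pm_{ij}(u_1, u_2) = \prod_{r\in\mathbb{Z}/m\mathbb{Z}} \left(u_1 - \omega^r q^{\pm\lambda_{\mathfrak{g}} a_{i\sigma^r(j)}} u_2\right);$$
$$G^\pm_{ij}(u_1, u_2) = \prod_{r\in\mathbb{Z}/m\mathbb{Z}} \left(u_1 q^{\pm\lambda_{\mathfrak{g}} a_{i\sigma^r(j)}} - \omega^r u_2\right).$$

We note that $G^\pm_{ij}(u_1, u_2) = -F^\pm_{ji}(u_2, u_1)$.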
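Definition 1.2. For $i \in I$, let $\bar{i} \in I_\sigma$ be the $\sigma$-orbit of $i$. Let $D_q(\mathfrak{g})^\sigma$ be the associative algebra over $\mathbb{C}$ with generators $X^\pm_{i,k}$ ($i \in I$, $k \in \mathbb{Z}$), $H_{i,k}$ ($i \in I$, $k \in \mathbb{Z}\setminus\{0\}$), $K_i^{\pm 1}$ ($i \in I$), and the following defining relations:
$$X^\pm_{\sigma(i),k} = \omega^k X^\pm_{i,k}; \qquad H_{\sigma(i),k} = \omega^k H_{i,k}; \qquad K^{\pm 1}_{\sigma(i)} = K_i^{\pm 1};$$
$$K_i K_i^{-1} = K_i^{-1} K_i = 1; \qquad K_i K_j = K_j K_i; \qquad H_{i,k} H_{j,l} = H_{j,l} H_{i,k}; \qquad K_i H_{j,l} = H_{j,l} K_i;$$
$$K_i X^\pm_{j,k} K_i^{-1} = q^{\pm\lambda_{\mathfrak{g}} \sum_{r\in\mathbb{Z}/m\mathbb{Z}} a_{i\sigma^r(j)}} X^\pm_{j,k};$$
$$[H_{i,k}, X^\pm_{j,l}] = \pm\frac{1}{k} \sum_{r\in\mathbb{Z}/m\mathbb{Z}} [k a_{i\sigma^r(j)}/d_i]_{q_i}\, \omega^{kr} X^\pm_{j,k+l};$$
$$[X^+_{i,k}, X^-_{j,l}] = \left(\sum_{r\in\mathbb{Z}/m\mathbb{Z}} \delta_{\sigma^r(i),j}\, \omega^{rl}\right) \frac{\Psi^+_{i,k+l} - \Psi^-_{i,k+l}}{q_i - q_i^{-1}},$$
where the $\Psi^\pm_{i,k}$ are defined by
$$\sum_{k=0}^\infty \Psi^\pm_{i,\pm k} u^k = K_i^{\pm 1} \exp\left(\pm(q_i - q_i^{-1}) \sum_{l=1}^\infty H_{i,\pm l} u^l\right),$$
$u$ being an indeterminate, and $\Psi^\pm_{i,k} = 0$ if $\mp k > 0$;
$$F^\pm_{ij}(u_1, u_2) X_i^\pm(u_1) X_j^\pm(u_2) = G^\pm_{ij}(u_1, u_2) X_j^\pm(u_2) X_i^\pm(u_1),$$
where
$$X_i^\pm(u) = \sum_{k\in\mathbb{Z}} X^\pm_{i,k} u^{-k};$$
$$\mathrm{Sym}\{P^\pm_{ij}(u_1, u_2)(X_j^\pm(v) X_i^\pm(u_1) X_i^\pm(u_2) - (q^{2md_{ij}} + q^{-2md_{ij}}) X_i^\pm(u_1) X_j^\pm(v) X_i^\pm(u_2) + X_i^\pm(u_1) X_i^\pm(u_2) X_j^\pm(v))\} = 0$$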
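if $a_{ij} = -1$ and $\sigma(i) \ne j$, where $u_1$, $u_2$ and $v$ are independent indeterminates and Sym denotes symmetrization over $u_1$, $u_2$;
$$\mathrm{Sym}\{(q^{3\lambda_{\mathfrak{g}}/2} u_1^{\mp 1} - (q^{\lambda_{\mathfrak{g}}/2} + q^{-\lambda_{\mathfrak{g}}/2}) u_2^{\mp 1} + q^{-3\lambda_{\mathfrak{g}}/2} u_3^{\mp 1}) X_i^\pm(u_1) X_i^\pm(u_2) X_i^\pm(u_3)\} = 0 \tag{2$^\pm$}$$
and
$$\mathrm{Sym}\{(q^{-3\lambda_{\mathfrak{g}}/2} u_1^{\pm 1} - (q^{\lambda_{\mathfrak{g}}/2} + q^{-\lambda_{\mathfrak{g}}/2}) u_2^{\pm 1} + q^{3\lambda_{\mathfrak{g}}/2} u_3^{\pm 1}) X_i^\pm(u_1) X_i^\pm(u_2) X_i^\pm(u_3)\} = 0 \tag{3$^\pm$}$$
if $a_{i\sigma(i)} = -1$, where Sym denotes symmetrization over the independent indeterminates $u_1$, $u_2$, $u_3$.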
Theorem 1.3. There exists an isomorphism of algebras between $U_q(L(\mathfrak{g})^\sigma)$ and $D_q(\mathfrak{g})^\sigma$ such that
$$e^+_{\bar{i}} = X^+_{i,0}, \qquad e^-_{\bar{i}} = \frac{1}{p_i} X^-_{i,0}, \qquad k_{\bar{i}} = K_i,$$
where $i \in I$ belongs to the $\sigma$-orbit $\bar{i}$.

Remarks. 1. There is a similar realization of $U_q(\hat{\mathfrak{g}}^\sigma)$. Theorem 1.3 is, however, sufficient for our purposes since it can be shown (cf. [7], Proposition 12.2.3) that the central element $c$ acts as one on every finite-dimensional representation of $U_q(\hat{\mathfrak{g}}^\sigma)$.
2. Relations (2) and (3) are present only when $\hat{\mathfrak{g}}^\sigma$ is of type $A^{(2)}_{2n}$. Drinfel'd ([13], Theorem 4) has analogues of only two of these four relations (namely (2$^-$) and (3$^+$)). The other two can be shown to be consequences of these together with the other defining relations of $D_q(\mathfrak{g})^\sigma$. We have included all four partly for reasons of symmetry, and partly because they are all needed in subsequent calculations.
3. The isomorphism in 1.3 depends on the choice of the section $\bar{i} \mapsto i$ of the canonical projection $I \to I_\sigma$, but any two such isomorphisms differ only by a rescaling on the generators $e^\pm_{\bar{i}}$ ($\bar{i} \in \hat{I}_\sigma$).

For a proof of this theorem, and an explicit description of the isomorphism, see [16], Theorem 3.1 and [17], Proposition 2.1. However, in the $A^{(2)}_{2n}$ case, the $q$ in [16] and [17] must be replaced by $q^2$ to get the algebras denoted here by $U_q(L(\mathfrak{g})^\sigma)$ and $D_q(\mathfrak{g})^\sigma$. Compare also [1] and [11] for analogous results in the untwisted case.

For later use, we record here the defining relations, and the form of the isomorphism in 1.3, for the simplest twisted quantum affine algebra $U_q(L(sl_3)^\tau)$, where $\tau$ is the nontrivial diagram automorphism of $sl_3(\mathbb{C})$. In this case, we may drop the index $i$ from the generators of $U_q(L(sl_3)^\tau)$ (since $|I_\tau| = 1$). The generalized Cartan matrix is
$$A^\tau = \begin{pmatrix} 2 & -1 \\ -4 & 2 \end{pmatrix},$$
so that $d_0 = 4$ and $d_1 = 1$. The defining relations are as follows:
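$$KK^{-1} = K^{-1}K = 1, \qquad KH_k = H_k K, \qquad H_k H_l = H_l H_k,$$
$$KX_k^\pm K^{-1} = q^{\pm 2} X_k^\pm, \qquad [X_k^+, X_l^-] = \frac{\psi^+_{k+l} - \psi^-_{k+l}}{q - q^{-1}},$$
where
$$\sum_{k=0}^\infty \psi^\pm_{\pm k} u^k = K^{\pm 1} \exp\left(\pm(q - q^{-1}) \sum_{l=1}^\infty H_{\pm l} u^l\right),$$
$$[H_k, X_l^\pm] = \pm\frac{[2k]_q}{k}\left(q^{2k} + q^{-2k} + (-1)^{k+1}\right) X^\pm_{k+l} \quad \text{if } k \ne 0,$$
$$X^\pm_{k+2} X^\pm_l + (q^{\mp 2} - q^{\pm 4}) X^\pm_{k+1} X^\pm_{l+1} - q^{\pm 2} X^\pm_k X^\pm_{l+2} = q^{\pm 2} X^\pm_l X^\pm_{k+2} + (q^{\pm 4} - q^{\mp 2}) X^\pm_{l+1} X^\pm_{k+1} - X^\pm_{l+2} X^\pm_k,$$
$$\mathrm{Sym}\left(q^3 X^\pm_{k\mp 1} X^\pm_l X^\pm_m - (q + q^{-1}) X^\pm_k X^\pm_{l\mp 1} X^\pm_m + q^{-3} X^\pm_k X^\pm_l X^\pm_{m\mp 1}\right) = 0,$$
$$\mathrm{Sym}\left(q^{-3} X^\pm_{k\pm 1} X^\pm_l X^\pm_m - (q + q^{-1}) X^\pm_k X^\pm_{l\pm 1} X^\pm_m + q^3 X^\pm_k X^\pm_l X^\pm_{m\pm 1}\right) = 0,$$
where Sym means the sum over all permutations of $k$, $l$, $m$. The isomorphism in 1.3 is given by
$$e_0^+ = K^{-2}(X_0^- X_1^- - q^2 X_1^- X_0^-), \qquad e_0^- = \frac{1}{[4]_q^2}(X^+_{-1} X^+_0 - q^{-2} X^+_0 X^+_{-1}) K^2, \qquad k_0 = K^{-2},$$
$$e_1^+ = X_0^+, \qquad e_1^- = X_0^-, \qquad k_1 = K.$$

Let $U^\sigma_+$ (resp. $U^\sigma_-$, $U^\sigma_0$) be the subalgebras of $U^\sigma$ generated by the $X^+_{i,k}$ (resp. by the $X^-_{i,k}$, by the $\Psi^\pm_{i,k}$) for $i \in I$, $k \in \mathbb{Z}$.

Proposition 1.4. $U^\sigma = U^\sigma_- . U^\sigma_0 . U^\sigma_+$.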
The proof is straightforward.

2. Some Subalgebras of $U_q(L(\mathfrak{g})^\sigma)$

The study of untwisted quantum affine algebras can be reduced, to some extent at least, to the case of quantum affine $sl_2$, by noting that any algebra of the former type can be generated by finitely many copies of the latter (see [1], Proposition 3.8). In the twisted case, one needs $U_q(L(sl_3)^\tau)$ in addition, where $\tau$ is the unique non-trivial diagram automorphism of $sl_3(\mathbb{C})$. We recall the definition of quantum affine $sl_2$:

Definition 2.1. $U_q(L(sl_2))$ is the associative algebra with generators $X_k^\pm$ ($k \in \mathbb{Z}$), $H_k$ ($k \in \mathbb{Z}\setminus\{0\}$), $K^{\pm 1}$, and the following defining relations:
$$KK^{-1} = K^{-1}K = 1; \qquad KH_k = H_k K; \qquad H_k H_l = H_l H_k; \qquad KX_k^\pm K^{-1} = q^{\pm 2} X_k^\pm;$$
$$[H_k, X_l^\pm] = \pm\frac{1}{k}[2k]_q X^\pm_{k+l};$$
$$X^\pm_{k+1} X^\pm_l - q^{\pm 2} X^\pm_l X^\pm_{k+1} = q^{\pm 2} X^\pm_k X^\pm_{l+1} - X^\pm_{l+1} X^\pm_k;$$
$$[X_k^+, X_l^-] = \frac{\Psi^+_{k+l} - \Psi^-_{k+l}}{q - q^{-1}},$$
where
$$\sum_{k=0}^\infty \Psi^\pm_{\pm k} u^k = K^{\pm 1} \exp\left(\pm(q - q^{-1}) \sum_{l=1}^\infty H_{\pm l} u^l\right),$$
and $\Psi^\pm_k = 0$ if $\mp k > 0$.

See [7], Theorem 12.2.1; we have set the central element $C^{1/2}$ equal to one. The result we need is the following:

Proposition 2.2. Let $i \in I$.
(i) If $\sigma(i) \ne i$ and $a_{i\sigma(i)} \ne 0$, there is a homomorphism of algebras $\varphi_i : U_q(L(sl_3)^\tau) \to U_q(L(\mathfrak{g})^\sigma)$ such that
$$\varphi_i(X_k^\pm) = X^\pm_{i,k}, \quad \varphi_i(H_k) = H_{i,k}, \quad \varphi_i(\Psi^\pm_k) = \Psi^\pm_{i,k}, \quad \varphi_i(K) = K_i.$$
(ii) If $\sigma(i) \ne i$ and $a_{i\sigma(i)} = 0$, there is a homomorphism of algebras $\varphi_i : U_q(L(sl_2)) \to U_q(L(\mathfrak{g})^\sigma)$ such that
$$\varphi_i(X_k^\pm) = X^\pm_{i,k}, \quad \varphi_i(H_k) = H_{i,k}, \quad \varphi_i(\Psi^\pm_k) = \Psi^\pm_{i,k}, \quad \varphi_i(K) = K_i.$$
(iii) If $\sigma(i) = i$, there is a homomorphism of algebras $\varphi_i : U_{q^m}(L(sl_2)) \to U_q(L(\mathfrak{g})^\sigma)$ such that
$$\varphi_i(X_k^+) = \frac{1}{m} X^+_{i,mk}, \quad \varphi_i(X_k^-) = X^-_{i,mk}, \quad \varphi_i(H_k) = H_{i,mk}, \quad \varphi_i(\Psi^\pm_k) = \Psi^\pm_{i,mk}, \quad \varphi_i(K) = K_i.$$
Proof. Straightforward verification, using 1.3 and 2.1.
Remarks. 1. We have dropped the subscript $i$ from the generators of $U_q(L(sl_3)^\tau)$ in (i), since $|I_\tau| = 1$.
2. In (iii), the generators $X^\pm_{i,k}$, $H_{i,k}$, $\Psi^\pm_{i,k}$ of $U_q(L(\mathfrak{g})^\sigma)$ vanish if $k$ is not a multiple of $m$.
3. We expect that the homomorphisms $\varphi_i$ are injective, but we shall not need this.

3. Finite-Dimensional Representations

A representation $V$ of $U^\sigma$ (i.e. a left $U^\sigma$-module) is said to be of type I if each $k_i$ ($i \in \hat{I}_\sigma$) acts semisimply on $V$ with eigenvalues which are integer powers of $q_i$. It is not difficult to show that every finite-dimensional irreducible representation of $U^\sigma$ can be obtained from a type I representation by twisting with an automorphism of $U^\sigma$ of the form $e_i^+ \mapsto \epsilon_i e_i^+$, $e_i^- \mapsto e_i^-$, $k_i \mapsto \epsilon_i k_i$, where each $\epsilon_i = \pm 1$ (cf. [7], Proposition 12.2.3).

If $V$ is a type I representation of $U^\sigma$, a vector $v \in V$ is said to be a highest weight vector if $v$ is annihilated by the $X^+_{i,k}$ for all $i \in I$, $k \in \mathbb{Z}$, and is a simultaneous eigenvector for the elements of $U^\sigma_0$. If, in addition, $V = U^\sigma . v$, then $V$ is said to be a highest weight representation. Moreover, if $\{\psi^\pm_{i,k}\}_{i\in I, k\in\mathbb{Z}}$ are the scalars such that
$$\Psi^\pm_{i,k} . v = \psi^\pm_{i,k} v,$$
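the pair of $(I \times \mathbb{Z})$-tuples $\psi^\pm = \{\psi^\pm_{i,k}\}_{i\in I, k\in\mathbb{Z}}$ is called the highest weight of $V$ (or the weight of $v$). Note that we necessarily have
$$\psi^\pm_{i,k} = 0 \ \text{ if } \mp k > 0, \qquad \psi^+_{i,0}\psi^-_{i,0} = 1, \qquad \psi^\pm_{\sigma(i),k} = \omega^k \psi^\pm_{i,k} \tag{4}$$
for all $i \in I$, $k \in \mathbb{Z}$.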
Conversely, by the usual Verma module construction, it is easy to show that, for any $\psi^\pm = \{\psi^\pm_{i,k}\}_{i\in I, k\in\mathbb{Z}}$ satisfying (4), there is, up to isomorphism, exactly one irreducible representation $V(\psi^\pm)$ with highest weight $\psi^\pm$.

The following theorem is the main result of this paper:

Theorem 3.1. (i) Every finite-dimensional irreducible type I representation of $U_q(L(\mathfrak{g})^\sigma)$ is highest weight.
(ii) If $\psi^\pm = \{\psi^\pm_{i,k}\}_{i\in I, k\in\mathbb{Z}}$, the highest weight representation $V(\psi^\pm)$ of $U_q(L(\mathfrak{g})^\sigma)$ is finite-dimensional if and only if there exist polynomials $P_i \in \mathbb{C}[u]$ ($i \in I$) with constant coefficient one such that
$$\sum_{k=0}^\infty \psi^+_{i,k} u^k = \sum_{k=0}^\infty \psi^-_{i,-k} u^{-k} = \begin{cases} q^{m \deg P_i}\, \dfrac{P_i(q^{-2m}u)}{P_i(u)} & \text{if } \sigma(i) \ne i \text{ and } a_{i\sigma(i)} \ne 0, \\[2ex] q^{\deg P_i}\, \dfrac{P_i(q^{-2}u)}{P_i(u)} & \text{if } \sigma(i) \ne i \text{ and } a_{i\sigma(i)} = 0, \\[2ex] q^{m \deg P_i}\, \dfrac{P_i(q^{-2m}u^m)}{P_i(u^m)} & \text{if } \sigma(i) = i, \end{cases}$$
in the sense that the first two terms are the Laurent expansions of the third term about $u = 0$ and $u = \infty$, respectively.

The proof of (i) is straightforward (cf. [7], Proposition 12.2.3). The proof of (ii) will occupy the next two sections.

Remark. Since $\Psi^\pm_{\sigma(i),k} = \omega^k \Psi^\pm_{i,k}$, the polynomials $P_i$, if they exist, necessarily satisfy the condition
$$P_{\sigma(i)}(u) = P_i(\omega u). \tag{5}$$
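To make the phrase "Laurent expansions about $u = 0$ and $u = \infty$" in Theorem 3.1(ii) concrete, the following sketch expands $q^{s \deg P/2} P(q^{-s}u)/P(u)$ in both directions for a sample polynomial, with $s = 2$ for the $sl_2$-type case and $s = 4$ for the $sl_3^\tau$-type case with $m = 2$. The helper names and the sample values of $q$ and $P$ are our own choices; the only input taken from the text is the shape of the rational function.

```python
from fractions import Fraction

def series_quotient(num, den, order):
    """First `order` coefficients of num(w)/den(w) as a power series in w.

    num and den are coefficient lists, lowest degree first; den[0] != 0."""
    coeffs = []
    for k in range(order):
        acc = num[k] if k < len(num) else Fraction(0)
        for j in range(1, min(k, len(den) - 1) + 1):
            acc -= den[j] * coeffs[k - j]
        coeffs.append(acc / den[0])
    return coeffs

def weight_series(P, q, order, shift=2):
    """Expand q^(shift*deg P/2) * P(q^-shift u) / P(u) about u = 0 and u = infinity.

    P lists the coefficients of a polynomial with P[0] = 1 and P[-1] != 0.
    Returns (plus, minus): plus[k] is the coefficient of u^k and
    minus[k] is the coefficient of u^-k."""
    q = Fraction(q)
    deg = len(P) - 1
    pref = q ** (shift * deg // 2)
    scaled = [Fraction(c) * q ** (-shift * k) for k, c in enumerate(P)]
    plain = [Fraction(c) for c in P]
    plus = [pref * c for c in series_quotient(scaled, plain, order)]
    # About infinity, put u = 1/w and clear w^-deg from numerator and
    # denominator, which reverses both coefficient lists.
    minus = [pref * c for c in series_quotient(scaled[::-1], plain[::-1], order)]
    return plus, minus

plus, minus = weight_series([1, -3], 2, 4)
print(plus[0], minus[0], plus[0] * minus[0])
```

The leading coefficients of the two expansions are $q^{\deg P}$ and $q^{-\deg P}$ (for $s = 2$), so their product is one, in agreement with the condition $\psi^+_{i,0}\psi^-_{i,0} = 1$ in (4).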
Let $\Pi$ be the set of $I$-tuples of polynomials $P_i \in \mathbb{C}[u]$ with constant coefficient one satisfying (5). If $\mathbf{P} = \{P_i\}_{i\in I} \in \Pi$, we denote by $V(\mathbf{P})$ the irreducible highest weight representation $V(\psi^\pm)$ of $U^\sigma$ (abusing notation), the relation between $\mathbf{P}$ and $\psi^\pm$ being as in 3.1.

Proposition 3.2. Let $\mathbf{P} = \{P_i\}_{i\in I}$, $\mathbf{Q} = \{Q_i\}_{i\in I} \in \Pi$, and let $v_{\mathbf{P}} \in V(\mathbf{P})$, $v_{\mathbf{Q}} \in V(\mathbf{Q})$ be highest weight vectors. Then, $v_{\mathbf{P}} \otimes v_{\mathbf{Q}}$ is a highest weight vector in $V(\mathbf{P}) \otimes V(\mathbf{Q})$ of weight $\phi^\pm$, where $\phi^\pm$ is related to the $I$-tuple $\{P_i Q_i\}_{i\in I}$ in the same way as $\psi^\pm$ is related to $\{P_i\}_{i\in I}$ in 3.1.

This will be proved in Sects. 4 and 5. The following corollary is immediate:

Corollary 3.3. Let the notation be as in 3.2, and denote the $I$-tuple $\{P_i Q_i\}_{i\in I}$ by $\mathbf{P} \otimes \mathbf{Q}$. Then, $V(\mathbf{P} \otimes \mathbf{Q})$ is isomorphic as a representation of $U_q(L(\mathfrak{g})^\sigma)$ to a subquotient of $V(\mathbf{P}) \otimes V(\mathbf{Q})$.
4. The $U_q(L(sl_3)^\tau)$ Case

In this section, we prove 3.1 and 3.2 for $U_q(L(sl_3)^\tau)$, where $\tau$ is the non-trivial diagram automorphism of $sl_3(\mathbb{C})$, and we denote $U_q(L(sl_3)^\tau)$ by $U^\tau$. The explicit form of the generators and relations of $U^\tau$ was given at the end of Sect. 1. It will be convenient to set
$$\tilde{e}_0 = X_0^- X_1^- - q^2 X_1^- X_0^-,$$
so that, in the isomorphism in 1.3, $e_0^+ = K^{-2}\tilde{e}_0$, and to write
$$(X_k^\pm)^{(r)} = \frac{(X_k^\pm)^r}{[r]_q!}, \qquad (\tilde{e}_0)^{(r)} = \frac{(\tilde{e}_0)^r}{[r]_{q^4}!}.$$
The crucial result for the proof of 3.1 in this case is the next proposition.

Definition 4.1. Define elements $\{\mathcal{P}_r\}_{r\in\mathbb{N}}$ in $U^\tau_0$ by $\mathcal{P}_0 = 1$ and
$$\mathcal{P}_r = -\frac{1}{1 - q^{-4r}} \sum_{j=0}^{r-1} \Psi^+_{j+1} \mathcal{P}_{r-j-1} K^{-1}. \tag{6}$$

If we introduce the formal power series
$$\mathcal{P}(u) = \sum_{r=0}^\infty \mathcal{P}_r u^r, \qquad \Psi^\pm(u) = \sum_{r=0}^\infty \Psi^\pm_{\pm r} u^{\pm r}$$
in an indeterminate $u$, then 4.1 is equivalent to saying that $\mathcal{P}(u)$ has constant coefficient one and that
$$\Psi^+(u) = K\,\frac{\mathcal{P}(q^{-4}u)}{\mathcal{P}(u)}.$$

Let $X^+$ be the linear subspace of $U^\tau$ spanned by the $X_k^+$ for $k \in \mathbb{Z}$.

Proposition 4.2. For all $r \in \mathbb{N}$, we have the following congruences (mod $U^\tau X^+$):
(i)$_r$ $(X_0^+)^{(2r+2)} (\tilde{e}_0)^{(r+1)} \equiv (-1)^{r+1} q^{-2(r+1)(2r+1)} [4]_q^{r+1} \mathcal{P}_{r+1} K^{2r+2}$;
(ii)$_r$ $(X_0^+)^{(2r+1)} (\tilde{e}_0)^{(r+1)} \equiv (-1)^r q^{-4r(r+1)} [4]_q^{r+1} \sum_{j=0}^r X^-_{j+1} \mathcal{P}_{r-j} K^{2r+1}$;
(iii)$_r$ $(X_0^+)^{(2r+1)} (\tilde{e}_0)^{(r+1)} \equiv q^{-6r+2} [4]_q K X_1^- (X_0^+)^{(2r)} (\tilde{e}_0)^{(r)} + q^{-8r+4} \dfrac{[4]_q}{[2]_q [3]_q} K^2 [H_1, (X_0^+)^{(2r-1)} (\tilde{e}_0)^{(r)}]$.
Proof. All three congruences are easily checked when r = 0. Assuming (iii)r , one deduces (ii)r from (i)r−1 , (ii)r−1 and (6), and then (i)r follows from (ii)r by multiplying on the left by X0+ and using (6) again. Thus, the main point is to prove (iii)r . For this, one needs identities (7)–(16) below: (X0+ )(r) X1+ = −q −3r [r − 1]q X1+ (X0+ )(r) + q −3r+3 X0+ X1+ (X0+ )(r−1) ;
[H1 , (X0+ )(r) ]
q −3r+3 + q −r+3 − q −r+1 − q −r−1 = [3]q q − q −1 + q −2r+4 X0+ X1+ (X0+ )(r−2) ;
(7)
X1+ (X0+ )(r−1) (8)
[(X0+ )(r) , X1− ] = q −r+1 KH1 (X0+ )(r−1) −2r+1 + q −2r−1 − q −2r+5 − q −4r+5 q + KX1+ (X0+ )(r−2) q − q −1 − q −3r+5 KX0+ X1+ (X0+ )(r−3) ;
(9)
[(X0+ )(r) , e˜0 ] = q −r+3 [4]q KX1− (X0+ )(r−1) + q −2r+4 (q 2 + q −2 )K 2 H1 (X0+ )(r−2) −5 q + q −3 − q 3 − q −2r+3 −3r+6 K 2 X1+ (X0+ )(r−3) +q q − q −1 − q −4r+8 K 2 X0+ X1+ (X0+ )(r−4) ;
(10)
e˜0 X1− = q 4 X1− e˜0 ;
(11)
−4r+5 [H1 , e˜(r) (q − q −1 )[3]q [4]q e˜(r−1) (X1− )2 ; 0 ] = −q 0
(12)
−4r+4 [X0+ , e˜(r) [4]q e˜(r−1) X1− K ; 0 ]=q 0
(13)
[(X0+ )(r+1) , e˜0 ] = q −r+4 [2]q KX1− (X0+ )(r) + q −r [2]q K(X0+ )(r) X1− +
q −2r 2 K [H1 , (X0+ )(r−1) ] + q −2r+3 (q − q −1 )K 2 H1 (X0+ )(r−1) ; [3]q (14)
X1− e˜(r) 0 K ≡
(X1− )2 ≡ e˜(r−1) 0
1 X + e˜(r+1) [4]q 0 0
(mod U τ X + ) ;
q 8r−2 + 2 (r+1) −2 q 4r−2 (r) (X0 ) e˜0 K − e˜ H1 [4]2q [4]q 0
(mod U τ X + ) .
(15)
(16)
Identities (7)–(10) are proved successively by induction on r; (11) is a consequence of (3− ); (12) and (13) are proved by induction on r using (11); (14) follows from (8) and (9); congruence (15) follows from (11) and (13); and (16) follows by a double application of (13). Finally, to prove (iii)r , we compute ] [r + 1]q4 [(X0+ )(2r+1) , e˜(r+1) 0 −2r = q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) [2]q K(X0+ )(2r) X1− e˜(r) 0 +q 0
+
q −4r 2 −4r+3 K [H1 , (X0+ )(2r−1) ]e˜(r) (q − q −1 )K 2 H1 (X0+ )(2r−1) e˜(r) 0 +q 0 [3]q (by (14))
−2r = q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) [2]q K(X0+ )(2r) X1− e˜(r) 0 +q 0
+
q −4r 2 q −4r 2 + (2r−1) K [H1 , (X0+ )(2r−1) e˜(r) ] − K (X0 ) [H1 , e˜(r) 0 0 ] [3]q [3]q
−4r+3 + q −4r+3 (q − q −1 )K 2 [H1 , (X0+ )(2r−1) e˜(r) (q − q −1 )K 2 (X0+ )(2r−1) e˜(r) 0 ]+q 0 H1 −2r = q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) [2]q K(X0+ )(2r) X1− e˜(r) 0 +q 0
+
q −4r+6 2 q −4r 2 + (2r−1) K [H1 , (X0+ )(2r−1) e˜(r) K (X0 ) [H1 , e˜(r) 0 ]− 0 ] [3]q [3]q
+ q −4r+3 (q − q −1 )K 2 (X0+ )(2r−1) e˜(r) 0 H1 −2r−2 [2]q [2r + 1]q ≡ q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) (X0+ )(2r+1) e˜(r+1) 0 +q 0 [4]q +
q −4r+6 2 q −4r 2 + (2r−1) K [H1 , (X0+ )(2r−1) e˜(r) K (X0 ) [H1 , e˜(r) 0 ]− 0 ] [3]q [3]q
+ q −4r+3 (q − q −1 )K 2 (X0+ )(2r−1) e˜(r) (mod U τ X + ) 0 H1 ( by (15) applied to the second term) −2r−2 [2]q [2r + 1]q (X0+ )(2r+1) e˜(r+1) ≡ q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) 0 +q 0 [4]q +
q −4r+6 2 −8r+5 K [H1 , (X0+ )(2r−1) e˜(r) (q − q −1 )[4]q K 2 (X0+ )(2r−1) e˜(r−1) (X1− )2 0 ]+q 0 [3]q
+ q −4r+3 (q − q −1 )K 2 (X0+ )(2r−1) e˜(r) (mod U τ X + ) 0 H1 (by (12) applied to the fourth term) −2r−2 [2]q [2r + 1]q (X0+ )(2r+1) e˜(r+1) ≡ q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) 0 +q 0 [4]q +
q −4r+6 2 K [H1 , (X0+ )(2r−1) e˜(r) 0 ] [3]q
+ q −8r+5 (q − q −1 )[4]q K 2 (X0+ )(2r−1)
q 8r−2 + 2 (r+1) −2 q 4r−2 (r) (X0 ) e˜0 K − e˜ H1 [4]2q [4]q 0
+ q −4r+3 (q − q −1 )K 2 (X0+ )(2r−1) e˜(r) (mod U τ X + ) 0 H1 (by (16) applied to the fourth term) −2r−2 [2]q [2r + 1]q (X0+ )(2r+1) e˜(r+1) ≡ q −2r+4 [2]q KX1− (X0+ )(2r) e˜(r) 0 +q 0 [4]q +
q −4r+6 2 1 − q −2 K [H1 , (X0+ )(2r−1) e˜(r) [2r + 1]q [2r]q (X0+ )(2r+1) e˜(r+1) 0 ]+ 0 [3]q [4]q ( mod U τ X + ).
Collecting the terms involving (X0+ )(2r+1) e˜(r+1) on the left-hand side and simplifying 0 gives (iii)r . Let V be the finite-dimensional irreducible type I representation of U τ with highest weight given by the pair of Z-tuples {ψk± }k∈Z , and let v be a highest weight vector in V . We have
$$k_0 . v = q^{4r_0} v, \qquad k_1 . v = q^{r_1} v$$
for some $r_0, r_1 \in \mathbb{Z}$. Note that $r_1 = -2r_0$; in particular, $r_1$ is even. Write $r_0 = -r$, $r_1 = 2r$ from now on.

Regarding $V$ as a representation of the $U_q(sl_2)$ subalgebra of $U^\tau$ generated by $e_1^\pm$ and $k_1^{\pm 1}$, $v$ is a highest weight vector (so that $r \ge 0$), and we have a direct sum decomposition
$$V \cong \bigoplus_{p\in\mathbb{N}} V_p^{\oplus n_p},$$
where $V_p$ is the irreducible representation of $U_q(sl_2)$ of dimension $p + 1$ (and on which $k_1$ acts with eigenvalues in $q^{\mathbb{Z}}$), and the $n_p \ge 0$ are certain multiplicities. By 1.4, $n_p = 0$ if $p$ is odd or $> 2r$. Applying both sides of (i)$_s$ in 4.2 to $v$, it follows that $\mathcal{P}_s . v = 0$ if $s > r$. Hence, $\mathcal{P}(u) . v = P(u) v$, where $P \in \mathbb{C}[u]$ is a polynomial with constant coefficient 1 and degree $\le r$. By the remarks following 4.1,
$$\Psi^+(u) . v = q^{2r}\,\frac{P(q^{-4}u)}{P(u)}\, v. \tag{17}$$
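On a highest weight vector all the operators in (6) act by scalars, so the recursion of Definition 4.1 can be run on eigenvalues. The sketch below takes the eigenvalues of the $\Psi^+_j$ from the right-hand side of (17) with $r = \deg P$ and checks that recursion (6) returns the coefficients of $P$. The function names and the sample data are assumptions made for illustration.

```python
from fractions import Fraction

def psi_plus(P, q, order):
    """Coefficients of q^(2 deg P) * P(q^-4 u) / P(u) as a power series in u.

    P is a coefficient list with P[0] = 1."""
    q = Fraction(q)
    deg = len(P) - 1
    num = [Fraction(c) * q ** (-4 * k) for k, c in enumerate(P)]
    out = []
    for k in range(order):
        acc = num[k] if k <= deg else Fraction(0)
        for j in range(1, min(k, deg) + 1):
            acc -= Fraction(P[j]) * out[k - j]
        out.append(acc)  # P[0] = 1, so no division is needed
    return [q ** (2 * deg) * c for c in out]

def recover_polynomial(psi, q, order):
    """Run recursion (6) on scalars.

    psi[j] is the eigenvalue of Psi^+_j on v; psi[0] is the eigenvalue of K."""
    q = Fraction(q)
    p = [Fraction(1)]
    for r in range(1, order):
        total = sum(psi[j + 1] * p[r - j - 1] for j in range(r))
        p.append(-total / ((1 - q ** (-4 * r)) * psi[0]))
    return p

P = [1, -3, 2]
psi = psi_plus(P, 2, 6)
print(recover_polynomial(psi, 2, 6))
```

The recovered list ends in zeros once the index exceeds $\deg P$, which is the scalar counterpart of the statement $\mathcal{P}_s . v = 0$ for $s > r$.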
Multiplying both sides of (ii)$_r$ on the left by $X^+_{-n-1}$, where $n \in \mathbb{N}$, we see that
$$\sum_{j=n}^r \Psi^+_{j-n} \mathcal{P}_{r-j} . v = \sum_{j=0}^n \Psi^-_{j-n} \mathcal{P}_{r-j} . v \tag{18}$$
if $n \le r$, and
$$\sum_{j=0}^r \Psi^-_{j-n} \mathcal{P}_{r-j} . v = 0 \tag{19}$$
if $n > r$. By 4.1, (18) is equivalent to
$$q^{2r} q^{-4(r-n)} \mathcal{P}_{r-n} . v = \sum_{j=0}^n \Psi^-_{j-n} \mathcal{P}_{r-j} . v. \tag{20}$$
Equations (19) and (20) are together equivalent to
$$\Psi^-(u) . v = q^{2r}\,\frac{P(q^{-4}u)}{P(u)}\, v. \tag{21}$$
Finally, to compute $\deg P$, note that if $\deg P = s$, Eq. (21) implies that
$$\Psi^-(u) . v = q^{2r-4s}\,\frac{(q^{-4}u)^{-s} P(q^{-4}u)}{u^{-s} P(u)}\, v,$$
and hence that $K^{-1} . v = q^{2r-4s} v$. But from Eq. (17) we have $K . v = q^{2r} v$, so $s = r$. This completes the proof of the "only if" part of 3.1(ii) in the $U^\tau$ case.

Before proving the "if" part, we prove 3.2 in the $U^\tau$ case. This depends on the following proposition.
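Proposition 4.3. Let $k \ge 0$.
(i) $\Delta(X_k^+) \equiv \sum_{j=0}^k X^+_{k-j} \otimes \Psi^+_j + 1 \otimes X_k^+$ (mod $U^\tau (X^+)^2 \otimes U^\tau$);
(ii) $\Delta(\Psi^+_k) \equiv \sum_{j=0}^k \Psi^+_j \otimes \Psi^+_{k-j}$ (mod $U^\tau X^+ \otimes U^\tau + U^\tau \otimes U^\tau X^+$).

Proof. Making use of 1.1 and the isomorphism in 1.3, one computes that
$$\Delta(H_1) = H_1 \otimes 1 + 1 \otimes H_1 - (q - q^{-1})[2]_q [3]_q X_0^+ \otimes X_1^- + q^{-1}(q - q^{-1})[3]_q (X_0^+)^2 \otimes K^2 e_0^+.$$
The formula in (i) now follows by an easy induction on $k$. Then (ii) follows from (i) by using $\Psi^+_k = (q - q^{-1})[X_k^+, X_0^-]$.

Part (ii) of 4.3 implies that, when acting on a tensor product of two highest weight vectors, $\Psi^+(u)$ acts as a group-like element of the formal power series Hopf algebra $U^\tau[[u]]$. Proposition 3.2 follows.

To prove the "if" part of 3.1(ii) for $U^\tau$, it suffices by 3.3 to show that $V(P)$ is finite-dimensional when $\deg P = 1$. This is accomplished in the next proposition.

Proposition 4.4. The following is a representation of $U^\tau$, for any $a \in \mathbb{C}^\times$:
$$X_k^+ \mapsto a^k [2]_q \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & (-1)^k q^{2k} \\ 0 & 0 & 0 \end{pmatrix}, \qquad X_k^- \mapsto a^k \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & (-1)^k q^{2k} & 0 \end{pmatrix},$$
$$K \mapsto \begin{pmatrix} q^2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & q^{-2} \end{pmatrix}, \qquad H_k \mapsto a^k \frac{[2k]_q}{k} \begin{pmatrix} q^{-2k} & 0 & 0 \\ 0 & (-1)^k - q^{2k} & 0 \\ 0 & 0 & (-1)^{k+1} q^{4k} \end{pmatrix},$$
$$\Psi^+_k \mapsto (q^2 - q^{-2}) a^k \begin{pmatrix} 1 & 0 & 0 \\ 0 & (-1)^k q^{2k} - 1 & 0 \\ 0 & 0 & (-1)^{k+1} q^{2k} \end{pmatrix} \quad \text{if } k > 0,$$
$$\Psi^-_k \mapsto -(q^2 - q^{-2}) a^k \begin{pmatrix} 1 & 0 & 0 \\ 0 & (-1)^k q^{2k} - 1 & 0 \\ 0 & 0 & (-1)^{k+1} q^{2k} \end{pmatrix} \quad \text{if } k < 0.$$

Proof. Straightforward verification.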
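Part of the "straightforward verification" for Proposition 4.4 can be delegated to exact arithmetic. The sketch below encodes the $3 \times 3$ matrices as we read them from the display (in particular, the factor $[2]_q$ is placed on $X_k^+$; moving it to $X_k^-$ would not affect the relations tested here) and checks, for sample rational values of $q$ and $a$, the relations $K X_k^\pm K^{-1} = q^{\pm 2} X_k^\pm$, the commutator $[X_k^+, X_l^-]$, and $[H_k, X_l^\pm]$ from Sect. 1. The quadratic and cubic relations in the $X_k^\pm$ are not tested, and all helper names are ours.

```python
from fractions import Fraction
from itertools import product

Q = Fraction(3, 2)  # sample value of q (assumed; any non-root of unity)
A = Fraction(5)     # sample value of the non-zero parameter a

def sign(k):
    return -1 if k % 2 else 1

def s(k):
    return sign(k) * Q ** (2 * k)  # the entry (-1)^k q^(2k)

def q_int(n):
    return (Q ** n - Q ** (-n)) / (Q - 1 / Q)

def diag(x, y, z):
    return [[x, 0, 0], [0, y, 0], [0, 0, z]]

def scale(c, m):
    return [[c * e for e in row] for row in m]

def mul(x, y):
    return [[sum(x[i][t] * y[t][j] for t in range(3)) for j in range(3)]
            for i in range(3)]

def comm(x, y):
    xy, yx = mul(x, y), mul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]

K = diag(Q ** 2, Fraction(1), Q ** -2)
K_INV = diag(Q ** -2, Fraction(1), Q ** 2)

def x_plus(k):
    return scale(A ** k * q_int(2), [[0, 1, 0], [0, 0, s(k)], [0, 0, 0]])

def x_minus(k):
    return scale(A ** k, [[0, 0, 0], [1, 0, 0], [0, s(k), 0]])

def h(k):
    return scale(A ** k * q_int(2 * k) / k,
                 diag(Q ** (-2 * k), sign(k) - Q ** (2 * k), -sign(k) * Q ** (4 * k)))

def psi(k, plus):
    """Psi^+_k if plus is True, otherwise Psi^-_k."""
    if k == 0:
        return K if plus else K_INV
    if (k > 0) != plus:
        return diag(0, 0, 0)
    c = (Q ** 2 - Q ** -2) * A ** k
    return scale(c if plus else -c, diag(1, s(k) - 1, -s(k)))

def relations_hold(bound=3):
    idx = range(-bound, bound + 1)
    for k, l in product(idx, idx):
        diff = [[p - m for p, m in zip(rp, rm)]
                for rp, rm in zip(psi(k + l, True), psi(k + l, False))]
        if comm(x_plus(k), x_minus(l)) != scale(1 / (Q - 1 / Q), diff):
            return False
        if k != 0:
            c = q_int(2 * k) / k * (Q ** (2 * k) + Q ** (-2 * k) - sign(k))
            if comm(h(k), x_plus(l)) != scale(c, x_plus(k + l)):
                return False
            if comm(h(k), x_minus(l)) != scale(-c, x_minus(k + l)):
                return False
    return all(mul(mul(K, x_plus(k)), K_INV) == scale(Q ** 2, x_plus(k)) and
               mul(mul(K, x_minus(k)), K_INV) == scale(Q ** -2, x_minus(k))
               for k in idx)

print(relations_hold())
```

A further consistency check is that the first-order term of the exponential formula, $\Psi^+_1 = (q - q^{-1}) K H_1$, holds for these matrices.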
The representation defined in 4.4, say $V_a$, is clearly irreducible and of type I. Moreover, if $\{\psi_k^\pm\}_{k\in\mathbb{Z}}$ is its highest weight, we have
$$\sum_{k=0}^\infty \psi_k^+ u^k = q^2 + \sum_{k=1}^\infty (q^2 - q^{-2}) a^k u^k = q^2\,\frac{P(q^{-4}u)}{P(u)},$$
where $P(u) = 1 - au$. Thus, we have exhibited a finite-dimensional irreducible type I representation of $U^\tau$ with highest weight given (as in 3.1) by an arbitrary polynomial of degree one. This completes the proof of 3.1(ii) in the $U^\tau$ case. (Note that, if $P = 1$ is the constant polynomial, then $V(P)$ is the trivial representation.)
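The identity between the two expressions for $\sum_k \psi_k^+ u^k$ can be confirmed coefficient by coefficient. The sketch below divides the numerator $1 - a q^{-4} u$ by $1 - au$ as a power series and compares the result with the closed form $q^2$, $(q^2 - q^{-2}) a^k$; the function names and sample values are illustrative assumptions.

```python
from fractions import Fraction

def expand_ratio(q, a, order):
    """Coefficients of q^2 * P(q^-4 u) / P(u), P(u) = 1 - a u, as a series in u."""
    q, a = Fraction(q), Fraction(a)
    numerator = [Fraction(1), -a * q ** -4]
    series = []
    for k in range(order):
        term = numerator[k] if k < 2 else Fraction(0)
        if k > 0:
            term += a * series[k - 1]  # power-series division by 1 - a u
        series.append(term)
    return [q ** 2 * c for c in series]

def closed_form(q, a, order):
    """The coefficients q^2, (q^2 - q^-2) a^k (k >= 1) stated in the text."""
    q, a = Fraction(q), Fraction(a)
    return [q ** 2] + [(q ** 2 - q ** -2) * a ** k for k in range(1, order)]

print(expand_ratio(3, Fraction(2, 5), 4) == closed_form(3, Fraction(2, 5), 4))
```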
5. The general case

In this section, we outline the proofs of 3.1 and 3.2 for an arbitrary twisted quantum affine algebra $U_q(L(\mathfrak{g})^\sigma)$, which we denote by $U^\sigma$.

Suppose then that $V$ is a finite-dimensional highest weight representation of $U^\sigma$ with highest weight vector $v$ and highest weight $\{\psi^\pm_{i,k}\}_{i\in I, k\in\mathbb{Z}}$. We consider three cases, as in Proposition 2.2:

Case (i). $\sigma(i) \ne i$ and $a_{i\sigma(i)} \ne 0$. Note that $m = 2$ in this case. Using the homomorphism $\varphi_i$ described in 2.2(i), we can view $V$ as a representation of $U_q(L(sl_3)^\tau)$, and as such $v$ is still a highest weight vector in $V$. By the $U_q(L(sl_3)^\tau)$ case of 3.1, proved in the previous section, there exists $P_i \in \Pi$ such that
$$\sum_{k=0}^\infty \psi^+_{i,k} u^k = \sum_{k=0}^\infty \psi^-_{i,-k} u^{-k} = q^{2\deg P_i}\,\frac{P_i(q^{-4}u)}{P_i(u)}.$$

Case (ii). $\sigma(i) \ne i$ and $a_{i\sigma(i)} = 0$. This time, we can view $V$ as a representation of $U_q(L(sl_2))$. By [6], Theorem 3.4, there exists $P_i \in \Pi$ such that
$$\sum_{k=0}^\infty \psi^+_{i,k} u^k = \sum_{k=0}^\infty \psi^-_{i,-k} u^{-k} = q^{\deg P_i}\,\frac{P_i(q^{-2}u)}{P_i(u)}.$$

Case (iii). $\sigma(i) = i$. Viewing $V$ as a representation of $U_{q^m}(L(sl_2))$, there exists $P_i \in \Pi$ such that
$$\sum_{k=0}^\infty \psi^+_{i,mk} u^k = \sum_{k=0}^\infty \psi^-_{i,-mk} u^{-k} = q^{m\deg P_i}\,\frac{P_i(q^{-2m}u)}{P_i(u)}.$$
Noting that $\Psi^\pm_{i,k} = 0$ unless $k$ is divisible by $m$, we find that
$$\sum_{k=0}^\infty \psi^+_{i,k} u^k = \sum_{k=0}^\infty \psi^-_{i,-k} u^{-k} = q^{m\deg P_i}\,\frac{P_i(q^{-2m}u^m)}{P_i(u^m)}.$$
This proves the "only if" part of 3.1(ii). The "if" part is proved by an argument similar to that used in the untwisted case in [8], Sect. 5. In that case, the crucial point was to establish the result for $U_q(L(sl_2))$. In the present case, we also need the result for $U_q(L(sl_3)^\tau)$, which was proved at the end of Sect. 4. We omit further details.

Finally, to prove 3.2 in the general case, one uses the methods of [9], Sect. 2. Let $U_i$ denote $U_q(L(sl_3)^\tau)$, $U_q(L(sl_2))$ or $U_{q^m}(L(sl_2))$ in cases (i), (ii) or (iii) of 2.2, respectively. If $V$ is an irreducible highest weight representation of $U^\sigma$ with highest weight vector $v$, denote the representation $\varphi_i(U_i) . v$ of $U_i$ by $V_i$.

Lemma 5.1. With the above notation, $V_i$ is an irreducible representation of $U_i$ with highest weight vector $v$.

The proof is similar to that of Lemma 2.3 in [9]. In particular, for any $\mathbf{P} = \{P_i\}_{i\in I} \in \Pi$, $V(\mathbf{P})_i \cong V(P_i)$, where $V(P_i)$ is the finite-dimensional irreducible representation of $U_i$ associated to the polynomial $P_i$ as in 3.1(ii) if $U_i = U_q(L(sl_3)^\tau)$, and as in Theorem 3.4 in [6] if $U_i = U_q(L(sl_2))$ or $U_{q^m}(L(sl_2))$.

If $V$ and $W$ are two irreducible highest weight representations of $U^\sigma$ with highest weight vectors $v$ and $w$, then, for any $i \in I$, $V_i \otimes W_i$ is a representation of $U_i$ via $(\varphi_i \otimes \varphi_i) \circ$
$\Delta_i$. On the other hand, it is not difficult to show that the subspace $V_i \otimes W_i$ of $V \otimes W$ is preserved by $(\Delta \circ \varphi_i)(U_i)$, giving a second way of viewing $V_i \otimes W_i$ as a representation of $U_i$. We denote these representations by $V_i \otimes_i W_i$ and $V_i \otimes W_i$, respectively.

Lemma 5.2. With the above notation, the identity map $V_i \otimes_i W_i \to V_i \otimes W_i$ is an isomorphism of representations of $U_i$.

The proof is similar to that of Proposition 2.2 in [9]. The necessary facts about the comultiplication of $U_q(L(\mathfrak{g})^\sigma)$ can be established by computations similar to those used to prove Theorem 2.2 in [17].

Now let $\mathbf{P} = \{P_i\}_{i\in I}$, $\mathbf{Q} = \{Q_i\}_{i\in I} \in \Pi$. Then, by the $U_q(L(sl_3)^\tau)$ case of 3.2, proved in the previous section, and the analogous result for $U_q(L(sl_2))$ (Proposition 4.3 in [6]), we have the following isomorphisms of representations of $U_i$:
$$V(\mathbf{P})_i \otimes_i V(\mathbf{Q})_i \cong V(P_i) \otimes_i V(Q_i) \cong V(P_i Q_i).$$
If $v_{\mathbf{P}}$ and $v_{\mathbf{Q}}$ are highest weight vectors in $V(\mathbf{P})$ and $V(\mathbf{Q})$, it follows from 5.2 that
$$\Psi^\pm_{i,k} . (v_{\mathbf{P}} \otimes v_{\mathbf{Q}}) = \psi^\pm_{i,k} (v_{\mathbf{P}} \otimes v_{\mathbf{Q}}),$$
the action on the left being given by $\Delta$, where $\{\psi^\pm_{i,k}\}_{i\in I, k\in\mathbb{Z}}$ corresponds to the polynomial $P_i Q_i$ as in the $U_q(L(sl_3)^\tau)$ case of 3.1(ii) (proved in the previous section) or the analogous result for $U_q(L(sl_2))$ or $U_{q^m}(L(sl_2))$ (Theorem 3.4 in [6]). This proves 3.2 for $U^\sigma$.
References

1. Beck, J.: Braid group action and quantum affine algebras. Commun. Math. Phys. 165, 555–568 (1994)
2. Bernard, D. and LeClair, A.: Quantum group symmetries and non-local currents in 2D QFT. Commun. Math. Phys. 142, 99–138 (1991)
3. Chari, V.: Integrable representations of affine Lie algebras. Invent. Math. 85, 317–335 (1986)
4. Chari, V. and Pressley, A. N.: New unitary representations of loop groups. Math. Ann. 275, 87–104 (1986)
5. Chari, V. and Pressley, A. N.: Integrable representations of twisted affine Lie algebras. J. Algebra 113, 438–464 (1988)
6. Chari, V. and Pressley, A. N.: Quantum affine algebras. Commun. Math. Phys. 142, 261–283 (1991)
7. Chari, V. and Pressley, A. N.: A Guide to Quantum Groups. Cambridge: Cambridge University Press, 1994
8. Chari, V. and Pressley, A. N.: Quantum affine algebras and their representations. Canadian Math. Soc. Conf. Proc. 16, 59–78 (1995)
9. Chari, V. and Pressley, A. N.: Minimal affinizations of representations of quantum groups: the simply-laced case. J. Algebra 184, 1–30 (1996)
10. Chari, V. and Pressley, A. N.: Yangians, integrable quantum systems and Dorey's rule. Commun. Math. Phys. 181, 265–302 (1996)
11. Damiani, I.: A basis of type Poincaré–Birkhoff–Witt for the quantum algebra of $\hat{sl}(2)$. J. Algebra 161, 291–310 (1993)
12. Delius, G. W., Gould, M. D. and Zhang, Y.-Z.: Twisted quantum affine algebras and solutions to the Yang–Baxter equation. Preprint KCL-TH-95-8, q-alg/9508012
13. Drinfel'd, V. G.: A new realization of Yangians and quantized affine algebras. Soviet Math. Dokl. 36, 212–216 (1988)
14. Gandenberger, G. M., McKay, N. J. and Watts, G. M. T.: Twisted algebra R-matrices and S-matrices for $b_n^{(1)}$ affine Toda solitons and their bound states. Preprint DAMTP-95-49, CRM-2314, hep-th/9509007
15. Jing, N.-H.: Twisted vertex representations of quantum affine algebras. Invent. Math. 102, 663–690 (1990)
16. Jing, N.-H.: On Drinfeld realization of quantum affine algebras. Preprint q-alg/9610035
17. Jing, N.-H. and Misra, K. C.: Vertex operators for twisted quantum affine algebras. Preprint q-alg/9701034
18. Kac, V. G.: Infinite dimensional Lie algebras. 3rd edition, Cambridge: Cambridge University Press, 1990

Communicated by T. Miwa
Commun. Math. Phys. 196, 477 – 483 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Singular Continuous Spectrum for the Period Doubling Hamiltonian on a Set of Full Measure

David Damanik

Fachbereich Mathematik, Johann Wolfgang Goethe-Universität, 60054 Frankfurt/Main, Germany. E-mail: [email protected]

Received: 11 December 1997 / Accepted: 24 February 1998
Abstract: We consider the discrete one-dimensional Schrödinger operator with potential generated by the period doubling substitution. We show that for almost every element in the hull $\Omega$, with respect to the unique ergodic measure $\mu$ on $\Omega$, there are no eigenvalues. Combining this with a result proven by Kotani, we establish purely singular continuous spectrum on a set of full measure.

1. Introduction

The framework for our study will be provided by discrete one-dimensional Schrödinger operators
$$(H_\omega \phi)(n) = \phi(n+1) + \phi(n-1) + V_\omega(n)\phi(n) \tag{1}$$
in $\ell^2(\mathbb{Z})$, where the potential $V_\omega$ depends on an element $\omega$ from a probability space $(\Omega, \mu)$ in such a way that the family $(H_\omega)_{\omega\in\Omega}$ is ergodic with respect to the shift $(TV_\omega)(n) = V_\omega(n+1)$. Many interesting spectral properties are independent of the parameter $\omega$ if one ignores sets of $\mu$-measure zero. In particular, there exist sets $\Sigma$, $\Sigma_{pp}$, $\Sigma_{sc}$, $\Sigma_{ac}$ such that the equalities $\sigma(H_\omega) = \Sigma$, $\sigma_\varepsilon(H_\omega) = \Sigma_\varepsilon$, $\varepsilon \in \{pp, sc, ac\}$, hold $\mu$-almost everywhere [6]. We shall often write a.e. instead of $\mu$-almost everywhere.

A subclass of such operators has received considerable attention in the past, which is due to a very nice observation by Kotani [19]. If the potentials $V_\omega$ take only finitely many values, then the following very strong implication holds: $\Sigma_{ac} = \emptyset$ if the $V_\omega$ are aperiodic. If, in addition, the family $(V_\omega)_{\omega\in\Omega}$ is minimal in the sense that, for every pair $\omega_1, \omega_2 \in \Omega$, $V_{\omega_1}$ is the pointwise limit of a suitable sequence of translates of $V_{\omega_2}$, then, by a recent Last–Simon result [20], absence of absolutely continuous spectrum even holds for every rather than almost every $\omega$. Thus, within the class of ergodic, aperiodic, minimal, finitely valued potentials the standard spectral classification reduces to the distinction between point spectrum and singular continuous spectrum.
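The action of the operator (1) on a finitely supported sequence can be sketched as follows. This is only an illustration: sequences are stored as dictionaries from sites to real values, the potential is passed as an arbitrary real-valued function of the site, and the helper names are ours.

```python
def apply_hamiltonian(phi, potential):
    """Apply (H phi)(n) = phi(n+1) + phi(n-1) + V(n) phi(n).

    phi is a finitely supported real sequence stored as {site: value};
    potential is a function n -> V(n)."""
    result = {}
    for n, value in phi.items():
        # phi(n) contributes to (H phi)(n-1), (H phi)(n+1) and (H phi)(n)
        for site, weight in ((n - 1, 1), (n + 1, 1), (n, potential(n))):
            result[site] = result.get(site, 0) + weight * value
    return {site: v for site, v in result.items() if v != 0}

def inner(phi, psi):
    """Real inner product of two finitely supported sequences."""
    return sum(value * psi.get(site, 0) for site, value in phi.items())

delta0 = {0: 1}
print(apply_hamiltonian(delta0, lambda n: 0))
```

For a real potential the operator is symmetric, so `inner(apply_hamiltonian(phi, V), psi)` and `inner(phi, apply_hamiltonian(psi, V))` agree for finitely supported `phi` and `psi`.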
Two classes of such potentials have been studied in the past. The so-called circle sequences have the form
$$V_\omega(n) = \lambda \chi_{[1-\beta,1)}(n\alpha + \omega \bmod 1),$$
where $\lambda > 0$, $0 < \beta < 1$, $\alpha$ irrational and $\omega \in \Omega = [0, 1]$. Here, the probability measure $\mu$ on $\Omega$ is just the Lebesgue measure.

A primitive substitution $S$, which is a map from a finite set $A$, called the alphabet, to the set $A^*$ of finite words over $A$, generates such potentials in the following way. Put discrete topology on $A$. Choose a fixed point $u \in A^{\mathbb{N}}$ of $S$ (existence can be ensured by conditions which are either satisfied by $S$ itself or a suitable power of $S$), extend it to the left arbitrarily giving $\hat{u} \in A^{\mathbb{Z}}$ and define $\Omega$ to be the set of accumulation points of the translates of $\hat{u}$ with respect to pointwise convergence,
$$\Omega \equiv \{\omega \in A^{\mathbb{Z}} : \omega = \lim T^{n_i}\hat{u},\ n_i \to \infty\},$$
where the shift $T$ on $A^{\mathbb{Z}}$ is defined by $(T\tau)_n = \tau_{n+1}$, $\tau \in A^{\mathbb{Z}}$. Primitivity of $S$ (i.e., there exists $k \in \mathbb{N}$ such that, for every $a \in A$, $S^k(a)$ contains every symbol from $A$) yields the existence of a unique probability measure $\mu$ on the Borel sets of $\Omega$ which makes the shift $T$ ergodic [21]. Finally, choose a function $f : A \to \mathbb{R}$. The potentials $V_\omega$ are then given by $V_\omega(n) = f(\omega_n)$. As long as $f$ is not too degenerate (we could, for example, require $f$ to be one-to-one), the qualitative spectral properties of the resulting operators do not depend on the particular choice of $f$. In order to emphasize that the spectral phenomena are determined by the self-similarity generated by $S$ rather than by the actual numerical values the potential takes, the $V_\omega$ are usually defined in the abstract way as presented above. Some prominent examples of primitive substitutions (in each case $A = \{a, b\}$) are

$a \mapsto ab$, $b \mapsto a$: Fibonacci substitution,
$a \mapsto ab$, $b \mapsto aa$: Period doubling substitution,
$a \mapsto ab$, $b \mapsto ba$: Thue–Morse substitution.

Let us remark that the Fibonacci case can also be described as a circle sequence ($\alpha = \beta = \frac{\sqrt{5}-1}{2}$). There are many papers which study the spectral properties of such operators; we mention [1, 2, 3, 4, 5, 7, 11, 12, 15, 16, 18, 26, 27]. It should be emphasized that no example with non-empty point spectrum is known so far.
Thus, purely singular continuous spectrum seems to be the rule. However, in contrast to the Kotani–Last–Simon result, empty point spectrum could not be established in full generality. Let us therefore be explicit about previous results on absence of eigenvalues.

For circle sequences a very general result was obtained by Delyon–Petritis [11]. The point spectrum is empty for any $\lambda$, any $\beta$, a.e. $\alpha$ and then for a.e. $\omega$. This result was even extended by Kaminaga [18]. He enlarged the class in such a way that it also contains the Fibonacci sequence. Thus, from the ergodic point of view, in these cases absence of eigenvalues was shown on sets of full $\mu$-measure, that is, the authors were able to identify the set $\Sigma_{pp}$ as the empty set.

On the other hand, there are several works which establish absence of eigenvalues for both substitution and circle sequences for one element $\omega_0 \in \Omega$ [26, 3, 1, 12, 4, 5, 16, 7]. This is very much due to the fact that certain elements in $\Omega$ exhibit nice symmetry properties such as self-similarity or a strong palindromic structure. By general principles (minimality and Simon's categorial approach [22]), this gives empty point spectrum for a dense $G_\delta$ in $\Omega$.

The main result of our article is the following.
Theorem 1. Let $(H_\omega)_{\omega\in\Omega}$ be the ergodic family of Schrödinger operators generated by the period doubling substitution. Then, for almost every $\omega \in \Omega$, $H_\omega$ has no eigenvalues.

Remarks. 1. To the best of the author's knowledge, this is the first result on the absence of eigenvalues for a non-quasiperiodic substitution Hamiltonian which holds on a set of full measure.
2. Theorem 1 was implicitly used by Guille-Biel in [14] in order to show absence of eigenvalues for $p$-sparse versions of the period doubling Hamiltonian, see also [8].
3. The function $f : \{a, b\} \to \mathbb{R}$ from above can be chosen arbitrarily.

Combining this result with the one obtained by Kotani [19], we get

Corollary 1. Let $(H_\omega)_{\omega\in\Omega}$ be the ergodic family of Schrödinger operators generated by the period doubling substitution. Then, for almost every $\omega$, $H_\omega$ has purely singular continuous spectrum.

Remarks. 1. Singular continuous Schrödinger operator spectra have become quite fashionable in the past couple of years. In a series of papers [22, 10, 17, 9, 25, 23, 24], Simon and co-workers have revealed the generic aspect of this spectral type. Moreover, they have found many concrete operators with purely singular continuous spectrum. In this regard, Corollary 1 adds to this list.
2. For this result to hold, we require $f : \{a, b\} \to \mathbb{R}$ to be one-to-one, for otherwise the potentials would be periodic.

The organization is as follows. Section 2 recalls basic concepts and results related to substitution sequences. Theorem 1 will be proven in Sect. 3.

2. Preliminary Remarks on Substitution Sequences

Let $A = \{a_1, \ldots, a_s\}$ be a finite set, called the alphabet. The $a_i$ are called symbols or letters, the elements of $A^* \equiv \bigcup_{k\ge 1} A^k$ are called words. We denote by $|B|$ the length of a word $B \in A^*$ (i.e., $|B| = l$ iff $B \in A^l$). If $B_1$, $B_2$ are words, we denote by $\#_{B_1} B_2$ the number of occurrences of $B_1$ in $B_2$. A substitution $S$ is a map $S : A \to A^*$. $S$ can be extended homomorphically to $A^*$ (resp., $A^{\mathbb{N}}$) by $S(b_1 \ldots b_n) \equiv S(b_1) \ldots S(b_n)$ (resp., $S(b_1 b_2 b_3 \ldots) \equiv S(b_1) S(b_2) S(b_3) \ldots$). A fixed point $u \in A^{\mathbb{N}}$ of $S$ is called substitution sequence. The existence of such a fixed point follows from the following conditions:
1. There exists a letter $a \in A$ such that the first letter of $S(a)$ is $a$.
2. $\lim_{n\to\infty} |S^n(a)| = \infty$.
In this case, $u \equiv \lim_{n\to\infty} S^n(a)$ exists and is a substitution sequence. If the substitution $S$ is primitive, which means that there exists $k \in \mathbb{N}$ such that, for every $a \in A$, $S^k(a)$ contains every symbol from $A$, then these conditions are satisfied in the sense that they hold for a suitably chosen power of $S$ (which is primitive, too). For primitive $S$ it is shown in [21] that, for any $B \in A^l$ which occurs in a fixed point of $S$ and any $b \in A$, the limits
$$d(B) \equiv \lim_{n\to\infty} \frac{\#_B S^n(b)}{|S^n(b)|}$$
exist, they are strictly positive, and they are independent of $b$. The numbers $d(B)$ have a natural interpretation as frequencies and they can in fact be computed easily. For the corresponding procedure we refer the reader to [21]. As the other central result, we want
to mention the connection between these frequencies and the unique ergodic measure $\mu$ on $\Omega$. The Borel σ-algebra is generated by the cylinder sets $[b_0 \ldots b_{l-1}]_{[m,m+l-1]} \equiv \{\omega \in \Omega : \omega_{m+i} = b_i,\ 0 \le i \le l-1\}$. Now, again by [21], $\mu$ obeys
$$\mu([b_0 \ldots b_{l-1}]_{[m,m+l-1]}) = d(b_0 \ldots b_{l-1}). \tag{2}$$
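The frequencies $d(B)$ can be approximated numerically from long words $S^n(b)$. The following Python sketch is only an illustration and is not the procedure of [21]: it iterates the period doubling substitution $S(a) = ab$, $S(b) = aa$ (the substitution studied in Sect. 3) and evaluates the quotient $\#_B(S^n(b))/|S^n(b)|$ for a finite $n$. All function names are hypothetical and chosen for this example.

```python
from fractions import Fraction

# Period doubling substitution S(a) = ab, S(b) = aa.
PERIOD_DOUBLING = {"a": "ab", "b": "aa"}


def substitute(word, rule=PERIOD_DOUBLING):
    """Apply the substitution letter by letter (homomorphic extension)."""
    return "".join(rule[c] for c in word)


def iterate(letter, n, rule=PERIOD_DOUBLING):
    """Return S^n(letter)."""
    word = letter
    for _ in range(n):
        word = substitute(word, rule)
    return word


def count_occurrences(block, word):
    """Number of occurrences of block in word, overlaps allowed."""
    return sum(1 for i in range(len(word) - len(block) + 1)
               if word.startswith(block, i))


def frequency_estimate(block, n, letter="a"):
    """Finite-n quotient #_B(S^n(letter)) / |S^n(letter)| approximating d(B)."""
    word = iterate(letter, n)
    return Fraction(count_occurrences(block, word), len(word))


if __name__ == "__main__":
    print(iterate("a", 4))                        # a prefix of the fixed point u
    print(float(frequency_estimate("aaa", 14)))   # approximates d(aaa)
```

Since the limit is independent of the starting letter, calling `frequency_estimate("aaa", 14, "b")` should give a value close to the one obtained from `"a"`.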
3. The Three-Block Method and its Applicability

In this section, we want to exclude decaying solutions for the period doubling operator in order to prove absence of eigenvalues. For the rest of the paper, we therefore restrict ourselves to $S$ being the period doubling substitution $S(a) = ab$, $S(b) = aa$. Let $u$, $\Omega$, $d$, $\mu$, $(H_\omega)_{\omega\in\Omega}$ be induced by $S$ as described above. Let us first define the set of $\omega$'s where the point spectrum is indeed empty,
$$C \equiv \{\omega : \sigma_{pp}(H_\omega) = \emptyset\}.$$
Next, we need a method to prove that the eigensolutions actually do not decay. We will rely on a classical approach that could be called the three-block method. Define the following sets,
$$B(n) \equiv \{\omega : V_\omega(-2^n + k) = V_\omega(k) = V_\omega(2^n + k),\ 1 \le k \le 2^n\}.$$

Lemma 3.1. $\limsup B(n) \subseteq C$.

Proof. This is a standard application of an idea originally due to Gordon [13], see also [11].

Thus, we are led to the investigation of the lim sup above. The following definition will prove to be useful,
$$G(n) \equiv \{\text{words in } u \text{ of the form } vvv \text{ where } |v| = 2^n\}.$$
The key lemma in our proof of Theorem 1 is now the following. The final steps will then be straightforward.

Lemma 3.2. For every $n$, the following holds:
1. $|G(n)| = 2^n$, $|\cdot|$ denoting cardinality.
2. For every $vvv \in G(n)$, $d(vvv) = \frac{1}{6} \cdot \frac{1}{2^n}$.

Proof by induction. Consider the case $n = 0$.
1. One checks that the set of words of length 3 occurring in $u$ is given by $\{aba, baa, aaa, aab, bab\}$, since the word $bb$ does not occur in $u$. Hence, $|G(0)| = 1$.
2. A straightforward application of the standard procedure from [21] shows $d(aaa) = \frac{1}{6}$.
Let us now assume that the hypothesis holds for $n$.
1. We have $|G(n)| = 2^n$. We will show that every word $vvv \in G(n)$ gives rise to two words $l(v)l(v)l(v)$, $r(v)r(v)r(v) \in G(n+1)$ such that the words obtained in this way are mutually different. Moreover, $G(n+1)$ does not contain any other words. It then follows that $|G(n+1)| = 2 \cdot |G(n)| = 2^{n+1}$. The word $l(v)$ is simply given by $l(v) \equiv S(v)$. By one of the key properties of the period doubling substitution (every $S(c)$, $c \in \{a, b\}$, has $a$ as its first letter), $l(v)$ is of the form $l(v) = a\,\bar{l}(v)$. Define $r(v) \equiv \bar{l}(v)\,a$. We now have to show
a) $l(v)l(v)l(v) \in G(n+1)$,
b) $r(v)r(v)r(v) \in G(n+1)$,
c) $l(v_1) \ne l(v_2)$ if $v_i v_i v_i \in G(n)$, $v_1 \ne v_2$,
d) $r(v_1) \ne r(v_2)$ if $v_i v_i v_i \in G(n)$, $v_1 \ne v_2$,
e) $l(v_1) \ne r(v_2)$ if $v_i v_i v_i \in G(n)$,
f) no other word is in $G(n+1)$.

a) The assertion follows from $S(u) = u$.
b) Look at an occurrence of $vvv$ in $u$. Apply $S$ and look at the corresponding occurrence of $S(v)S(v)S(v)$ in $S(u) = u$. The letter immediately to the right of this block in $u$ is an $a$, since it is the first letter of some $S(c)$. We therefore have
$$\underbrace{a \ldots}_{l(v)}\ \underbrace{a \ldots}_{l(v)}\ \underbrace{a \ldots}_{l(v)}\ a,$$
which can also be interpreted as
$$a\ \underbrace{\ldots a}_{r(v)}\ \underbrace{\ldots a}_{r(v)}\ \underbrace{\ldots a}_{r(v)}.$$
c) This is immediate from the definition of $l(v_i)$.
d) This is also immediate from the definition.
e) By definition, the words $l(v_i)$ have the form $a \cdot a \cdot a \cdot \ldots$ There is at least one $b$ in every $l(v_i)$ (for $n \ge 1$, this follows from the fact that $aaaa$ does not occur in $u$; for $n = 0$ this follows from $G(0) = \{aaa\}$), at an even position of course. Thus, every $r(v_i)$ has at least one $b$, this time at an odd position. This proves the claim.
f) The argument from (b) can be reversed since any word in $G(n+1)$ has length $3 \cdot 2^{n+1}$. Let $www \in G(n+1)$. Look at an occurrence of $www$ in $u$. Either it starts at an odd position $2l - 1$ or at an even position $2l$. In the first case it has the form $l(v)l(v)l(v)$ (just consider the inverse image under $S$). In the latter case it has the form $r(v)r(v)r(v)$. In both cases there is a block $vvv$ in $u$ starting at position $l$.

2. Let $w$ be any one of the $l(v_i)$, $r(v_i)$ and let $v$ be the $v_i$ it is coming from. Obviously, for every $m$, $\#_{www}(S^m(a)) + 1 \ge \#_{vvv}(S^{m-1}(a))$. Thus,
$$d(www) = \lim_{m\to\infty} \frac{\#_{www}(S^m(a))}{|S^m(a)|} \ge \liminf_{m\to\infty} \frac{\#_{vvv}(S^{m-1}(a)) - 1}{2^m} = \frac{1}{2} \lim_{m\to\infty} \frac{\#_{vvv}(S^{m-1}(a))}{|S^{m-1}(a)|} = \frac{1}{2} \cdot \frac{1}{6} \cdot \frac{1}{2^n}.$$
On the other hand, $\#_{www}(S^m(a)) \le \#_{vvv}(S^{m-1}(a))$. A similar calculation yields
$$d(www) \le \frac{1}{2} \cdot \frac{1}{6} \cdot \frac{1}{2^n}.$$
This completes the proof of the lemma.
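Lemma 3.2 can be checked by brute force for small $n$. The sketch below is an illustration, not part of the proof: it assumes that the prefix $S^{12}(a)$ of $u$ is long enough to contain every word $vvv$ with $|v| \le 4$, collects these words, and compares their empirical frequencies in the prefix with $\frac{1}{6} \cdot \frac{1}{2^n}$. The function names are hypothetical.

```python
def period_doubling_prefix(n):
    """Return S^n(a) for S(a) = ab, S(b) = aa; this word is a prefix of u."""
    word = "a"
    for _ in range(n):
        word = "".join("ab" if c == "a" else "aa" for c in word)
    return word


def cubes(word, n):
    """Set of factors of `word` of the form vvv with |v| = 2**n."""
    p = 2 ** n
    found = set()
    for i in range(len(word) - 3 * p + 1):
        v = word[i:i + p]
        if word[i + p:i + 2 * p] == v and word[i + 2 * p:i + 3 * p] == v:
            found.add(v * 3)
    return found


def empirical_frequency(block, word):
    """Occurrences of `block` in `word` (overlaps allowed) divided by |word|."""
    hits = sum(1 for i in range(len(word) - len(block) + 1)
               if word.startswith(block, i))
    return hits / len(word)


if __name__ == "__main__":
    u = period_doubling_prefix(12)
    for n in range(3):
        g = cubes(u, n)
        # Print n, the number of cubes found, and their empirical frequencies.
        print(n, len(g), sorted(empirical_frequency(w, u) for w in g))
```

The finite prefix only gives approximate frequencies, so the printed values should be compared with $\frac{1}{6} \cdot \frac{1}{2^n}$ up to a small error.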
Lemma 3.3. For every $n$, $\mu(B(n)) = \frac{1}{6}$.

Proof. The set $B(n)$ is the finite union of the cylinder sets $[vvv]_{[-2^n+1,\,2^n+2^n]}$, $vvv \in G(n)$. Of course, these cylinder sets are mutually disjoint. Thus, by Lemma 3.2 and (2),
$$\mu(B(n)) = \sum_{vvv\in G(n)} \mu([vvv]_{[-2^n+1,\,2^n+2^n]}) = \sum_{vvv\in G(n)} d(vvv) = \frac{1}{6}.$$
Proof of Theorem 1. Lemma 3.3 yields
$$\mu(C) \ge \mu(\limsup B(n)) \ge \limsup \mu(B(n)) = \frac{1}{6} > 0.$$
Now the assertion follows from shift-invariance of C and ergodicity of µ.
Acknowledgement. The author would like to thank A. Kechris and B. Simon for the hospitality of the Mathematics Department at Caltech where parts of this work were done. Financial support from DAAD (Doktorandenstipendium HSP III) is gratefully acknowledged.
References
1. Bellissard, J., Bovier, A., Ghez, J.-M.: Spectral properties of a tight binding hamiltonian with period doubling potential. Commun. Math. Phys. 135, 379–399 (1991)
2. Bellissard, J., Bovier, A., Ghez, J.-M.: Gap labelling theorems for one-dimensional discrete Schrödinger operators. Rev. Math. Phys. 4, 1–37 (1992)
3. Bellissard, J., Iochum, B., Scoppola, E., Testard, D.: Spectral properties of one-dimensional quasicrystals. Commun. Math. Phys. 125, 527–543 (1989)
4. Bovier, A., Ghez, J.-M.: Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 158, 45–66 (1993)
5. Bovier, A., Ghez, J.-M.: Erratum: Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 166, 431–432 (1994)
6. Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
7. Damanik, D.: α-continuity properties of one-dimensional quasicrystals. Commun. Math. Phys., to appear
8. Damanik, D.: On p-sparse Schrödinger operators with quasiperiodic potentials. Preprint
9. del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank one perturbations, and localization. J. d'Analyse Math. 69, 153–200 (1996)
10. del Rio, R., Makarov, N., Simon, B.: Operators with singular continuous spectrum, II. Rank one operators. Commun. Math. Phys. 165, 59–67 (1994)
11. Delyon, F., Petritis, D.: Absence of localization in a class of Schrödinger operators with quasiperiodic potential. Commun. Math. Phys. 103, 441–444 (1986)
12. Delyon, F., Peyrière, J.: Recurrence of the eigenstates of a Schrödinger operator with automatic potential. J. Stat. Phys. 64, 363–368 (1991)
13. Gordon, A.: On the point spectrum of the one-dimensional Schrödinger operator. Usp. Math. Nauk 31, 257 (1976)
14. Guille-Biel, C.: Sparse Schrödinger operators. Rev. Math. Phys. 9, 315–341 (1997)
15. Hof, A.: Some remarks on discrete aperiodic Schrödinger operators. J. Stat. Phys. 72, 1353–1374 (1993)
16. Hof, A., Knill, O., Simon, B.: Singular continuous spectrum for palindromic Schrödinger operators. Commun. Math. Phys. 174, 149–159 (1995)
17. Jitomirskaya, S., Simon, B.: Operators with singular continuous spectrum: III. Almost periodic Schrödinger operators. Commun. Math. Phys. 165, 201–205 (1994)
18. Kaminaga, M.: Absence of point spectrum for a class of discrete Schrödinger operators with quasiperiodic potential. Forum Math. 8, 63–69 (1996)
19. Kotani, S.: Jacobi matrices with random potentials taking finitely many values. Rev. Math. Phys. 1, 129–133 (1989)
20. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. Preprint
21. Queffélec, M.: Substitution Dynamical Systems – Spectral Analysis. Lecture Notes in Mathematics, Vol. 1284, Berlin–Heidelberg–New York: Springer (1987)
22. Simon, B.: Operators with singular continuous spectrum: I. General operators. Ann. of Math. 141, 131–145 (1995)
23. Simon, B.: Operators with singular continuous spectrum: VI. Graph Laplacians and Laplace-Beltrami operators. Proc. Amer. Math. Soc. 124, 1177–1182 (1996)
24. Simon, B.: Operators with singular continuous spectrum: VII. Examples with borderline time decay. Commun. Math. Phys. 176, 713–722 (1996)
25. Simon, B., Stolz, G.: Operators with singular continuous spectrum: V. Sparse potentials. Proc. Amer. Math. Soc. 124, 2073–2080 (1996)
26. Sütő, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987)
27. Sütő, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian. J. Stat. Phys. 56, 525–531 (1989)

Communicated by B. Simon
Commun. Math. Phys. 196, 485 – 521 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A Low Temperature Expansion for Classical N -Vector Models III. A Complete Inductive Description, Fluctuation Integrals Tadeusz Balaban Rutgers University, Department of Mathematics, Hill Center, New Brunswick, NJ. 08903, USA Received: 13 December 1996 / Accepted: 25 September 1997
Abstract: We give a complete inductive description of the effective densities generated by applications of the renormalization transformations to the densities of the classical N -vector models described in the paper [1], and we formulate the main theorems of this sequence of papers. This description includes now all the “large field” domains, which we’ve excluded from the discussions of [1, 4]. We apply the renormalization transformation to a density satisfying the inductive hypotheses, we introduce “large field-small field” partitions, and we determine and analyze the fluctuation integrals in the general setting.
1. Introduction In this and the following two papers we complete the construction of the low temperature expansion for the considered class of the N-vector models in full generality, including the “large field” domains. This construction is based on the renormalization group approach introduced in [1], which was completed in [4] under the assumption that “small field” domains are considered only. In that paper we have given an inductive description of the effective actions generated by the renormalization transformations under the above assumption, and we have proved the main theorem stating that a successive transformation preserves the form and bounds of the actions. The inductive description and the proof were based on more technical results obtained in the previous papers [1–3]. The results of [2, 3] concerned constructions and properties of various “background” configurations, and have been obtained in full generality needed here. We use them extensively, as well as the results of [1] on bounds for the effective actions. We follow very closely the general line of arguments of paper [4]. We start with an inductive description of effective densities, and we formulate the main result of these papers again as a theorem stating that a successive renormalization transformation preserves the form and bounds of these densities. In this inductive description we use the inductive hypotheses
formulated in [4] without repeating them, and we formulate only hypotheses containing some new elements connected with new issues which are specific for the general case considered here, for example with the densities on the “large field” regions. Also we follow closely [4] performing the same operations in the same order, and we consider mainly new problems arising in this case, relying heavily on the descriptions, definitions and formulas related to these operations and given in [4]. We repeat only a few of them if it is necessary to reformulate them to adapt to a new situation, or to incorporate some changes. A main emphasis in these papers is on a new operation connected with the densities on large field regions, which we will comment on in a little bit more detail below. Another new feature is the presence of a variety of restrictions on spin variables on various scales. They are introduced through nonlinear functions of these variables, typically like the functions χk in [4], and we spend a great deal of effort to study relations between them, bounds which they imply, and so on. Let us remark here that the inductive description yields a form of partial low temperature expansions, corresponding to partially integrated out “degrees of freedom” through the renormalization transformations. After a last step, when we complete the integrations, we obtain a complete low temperature expansion for the generating functional and correlation functions for the considered class of models. In a following paper we will discuss some consequences of this expansion, in particular the so called “spin wave picture”, or “Goldstone bosons picture”, which is connected with some more detailed properties of one-point and two-point correlation functions. We now present a plan of these three papers. In this introduction we give the inductive description of the effective densities, and we formulate the main theorems. In Sect. 
2 we apply a next renormalization transformation, and we discuss some basic issues connected with it, like the introduction of restrictions on “new” spin variables, definition of a new small field region and separation into “background” configuration and “fluctuation” variables together with setting up a fluctuation integral. In Sect. 3, the last section of this paper, we discuss the fluctuation integral and new contributions to effective actions determined by it. We classify these contributions and formulate their properties. In Sect. 2 of the next paper the renormalization of the new contributions is discussed. These three sections are based on the small field analysis in [4] and are rather brief, except the part connected with the new characteristic functions, and new “large field”, “small field” regions, which must be discussed in detail. In Sect. 3 of the next paper some remaining operations and definitions leading to a new density satisfying the inductive hypotheses are discussed, together with bounds on all new contributions. Some crucial technical problems connected with the whole method, mainly localization expansions of various expressions created in successive stages of the procedure, are the subject of the last Sect. 4. In the third paper a new operation is performed, a renormalization operation of large field densities. It is denoted by $R^{(k)}$, and it is an essential part of the method: it allows us to improve bounds on large field expressions. The most important aspects of this operation are discussed in Sect. 2; the remaining ones, like some localization expansions and other related more technical issues, some inductive bounds, etc., are discussed in Sect. 3, which also concludes the proof of the main theorems. In the short Sect. 4 we consider the last step of the procedure, which is a final integration with respect to the remaining degrees of freedom, and a discussion of some simple issues connected with it.
In order to justify to some extent our inductive description of the densities let us recall and develop further the very general discussion of some aspects of our method given in the Introduction of paper [1]. In the first step, after the first renormalization transformation T = T (0) given by (1.11) [1] we obtain the integral (1.13) [1]. We decompose the domain of integration, which is the vector space of all spin configurations, into subdomains
determined either by the “small field” conditions (1.15) [1], or their complements, on all blocks, bonds and points of the unit lattice. For example we have either the small field condition $|\psi(y)-(Q\phi)(y)| < \beta_0^{-1/2} p_0(\beta_0)$, or the large field condition $|\psi(y)-(Q\phi)(y)| \ge \beta_0^{-1/2} p_0(\beta_0)$ on blocks $B(y)$ of the lattice, either $|(\partial\phi)(b)| < \beta_0^{-1/2} p_0(\beta_0)$, or $|(\partial\phi)(b)| \ge \beta_0^{-1/2} p_0(\beta_0)$ on bonds of the lattice, etc. Each subdomain determines a region $Z_1$ of the unit lattice $T_1$, which is defined as a union of all large cubes of the partition $\pi_1$, i.e. the cubes of the size $M$, such that at least one large field condition is satisfied on a block, a bond or a point of the cube. The region $Z_1$ is the “large field” region, its complement $Z_1^c$ is the “small field” region in the first step. We leave the integration over the region $Z_1$ unchanged, and on the small field region $Z_1^c$ we perform the operations described in [4], i.e. we calculate the fluctuation integral generating this way a new effective action on $Z_1^c$, and we renormalize it. Thus we obtain a new effective density which is represented as a sum of terms over the admissible regions $Z_1$. Each term is a partial density which has the same representation as the corresponding small field density in [4], but restricted to the region $Z_1^c$. This density restricted to $Z_1$ is given by the integral mentioned above, and although its detailed structure is important in the analysis of the operation $R^{(k)}$, generally only a few properties are needed, and they are traced through the inductive procedure; the most important one is that the density has a sufficiently small bound.

In the next step we apply the next renormalization transformation, and we repeat the above procedure, but only on the small field regions $Z_1^c$. Thus for each term with a region $Z_1^c$ we introduce new decompositions of the domain of integration into subdomains corresponding to new small field - large field conditions, and we obtain new large field regions, which we combine with the previous region $Z_1$, or rather its neighbourhood. We obtain a larger region $Z_2$, and we leave the integration of the renormalization transformation unchanged over the regions $Z_2$. On the small field regions $Z_2^c$ we perform again the same operations as before, i.e. we calculate the fluctuation integral, we generate a new effective action on $Z_2^c$, the same as the corresponding action in [4] restricted to $Z_2^c$, and we renormalize it. This way new effective densities are constructed, and they have the same general structure as before, after the first step. They are parametrized by the large field regions $Z_2$, and on $Z_2^c$ they coincide with the small field density constructed in [4], but restricted to $Z_2^c$, and on $Z_2$ they have the same general properties as before, in particular they have small bounds.

This procedure is continued inductively step by step, as in [4]. There is one additional problem with it, and it is connected with the densities on the large field regions. They are controlled only by small factors coming from bounds connected with large field restrictions. If for a large number of steps no new large field restrictions are created in the neighbourhoods of the large field regions, or some of their connected components, then we may lose the control over the bounds on those components. This is a focal point of these papers, and we will explain and analyze the problem in the Introduction to the third paper, when we will understand the form and properties of the representations of the effective densities in full detail. Now let us mention only that to improve the bounds we have to renormalize the densities, and this new renormalization operation is constructed and discussed in Sects. 2, 3 of the third paper. Let us remark that this part of the method is closest to the “phase space cell” analysis, as presented for example in the papers [9–11], in that we consider expressions on a large number of scales, in distinction to the one-scale analysis of other renormalization group approaches of, for example, [7]. We hope that the above remarks help to justify and to clarify some features of the inductive description given below. It is probably needless to say that they present a very
idealized and simplified picture; the actual description and procedure involve many other new features and problems. For a more detailed, but still rather nontechnical description the reader is referred to the survey paper presented at the Conference on Constructive Physics in Palaiseau, France, 1994.

We formulate now the inductive hypotheses describing a space of densities, which by assumption contains the density obtained by applying $k$ renormalization steps to the density (1.1) [1]. The first hypothesis describes their general form.

H 1. A density from the space has the representation
$$\rho_k(\psi_k, h, g) = \sum_{Z_k} \rho_k(Z_k; \psi_k, h, g) = \sum_{Z_k} \rho'_k(Z_k; \psi_k, h, g) \cdot \chi_k(Z_k^c) \exp\Big( A_k(Z_k^c; \psi_k, h) + F_k(Z_k^c; \psi_k, h, g) + R^{(k)}(Z_k^c; \psi_k, h, g) \Big), \tag{1.1}$$
where the sum is over $Z_k$ such that each component of $Z_k^c$ belongs to $D_k$.

The exponential above is roughly the “small field” density analyzed in papers [1, 4], but restricted and localized to the small field region $Z_k^c$ in a way described precisely in the next hypothesis. There is a new contribution $R^{(k)}$ which is connected with the new renormalization operation of the large field densities mentioned above. The densities $\rho'_k(Z_k)$ are the large field densities, which are constructed by composing renormalization group transformations and the characteristic functions restricted to large field regions in successive steps. The characteristic functions $\chi_k(Z_k^c)$ are as in [4], only localized to $Z_k^c$, and with some modifications depending on $k$. We will define them later in this section.

To formulate the next hypothesis it is convenient to introduce some new definitions of a geometric character, and some new notations. We will use them extensively in the rest of the paper. We have used the operation “∼” defined as adding to a given domain one layer of large cubes in a proper scale. We define now a similar operation “≈”, but we want to add a thick layer. Take a domain $X$ which is a union of large cubes from the partition $\pi_j$, and define
$$X^{\approx} = \bigcup \{\square \in \pi_j : \mathrm{dist}_j(\square, X) \le M R_{j-1}\} = X^{\sim([R_{j-1}]+1)} = \bigcup_{\square\in\pi_j:\ \square\subset X} R_j(\square), \quad \text{where } R_{j-1} = R_0 (\log a\beta_{j-1})^2, \text{ and} \tag{1.2}$$
$$R_j(\square) = \bigcup \{\square' \in \pi_j : \mathrm{dist}_j(\square', \square) \le M R_{j-1}\}. \tag{1.3}$$
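The operation “≈” is purely combinatorial once the large cubes are labelled by integer vectors. The following Python sketch illustrates it under assumptions that are not part of the paper: cubes of $\pi_j$ are indexed by tuples in $\{0, \ldots, n-1\}^d$ on a torus with $n$ cubes per side, distances are measured in the sup-norm, and the value of $R_{j-1}$ is simply passed in as a number `r`. Two closed cubes whose indices differ by $k$ in the sup-norm are at distance $M \max(k-1, 0)$, so the condition $\mathrm{dist}_j \le M R_{j-1}$ amounts to adding $[R_{j-1}] + 1$ layers, in agreement with the second equality in (1.2). All names are hypothetical.

```python
import itertools
import math


def torus_index_distance(p, q, n):
    """Sup-norm distance between cube indices p and q on a torus with n cubes per side."""
    return max(min(abs(a - b), n - abs(a - b)) for a, b in zip(p, q))


def add_layers(cubes, layers, n):
    """Add `layers` layers of neighbouring cubes to a set of cube indices."""
    if not cubes:
        return set()
    dim = len(next(iter(cubes)))
    offsets = list(itertools.product(range(-layers, layers + 1), repeat=dim))
    result = set()
    for cube in cubes:
        for off in offsets:
            # Indices are taken modulo n because the domain lives on a torus.
            result.add(tuple((c + o) % n for c, o in zip(cube, off)))
    return result


def thick_neighbourhood(cubes, r, n):
    """Thick neighbourhood of a union of cubes: add floor(r) + 1 layers."""
    return add_layers(cubes, math.floor(r) + 1, n)


if __name__ == "__main__":
    x = {(5, 5)}
    print(len(add_layers(x, 1, 20)))             # one layer around a single cube
    print(len(thick_neighbourhood(x, 2.5, 20)))  # floor(2.5) + 1 = 3 layers
```

With `layers = 1` the function `add_layers` plays the role of the operation “∼”, and iterating it reproduces the thick layer.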
The distance $\mathrm{dist}_j$ is considered in the norm $|\cdot|_\infty$ on $\mathbb{R}^d$, i.e.
$$|x|_\infty = \max_{1\le\mu\le d} |x_\mu|, \qquad \mathrm{dist}_j(X_1, X_2) = \inf_{x_1\in X_1,\ x_2\in X_2} |x_1 - x_2|_\infty,$$
and the domains are in the ξ-scale, $\xi = L^{-j}$, which means that large cubes from $\pi_j$ have the size $M$. Let us recall that we consider all domains in the continuous torus $T \subset \mathbb{R}^d$. To describe in detail the expressions in (1.1) we introduce the functions
$$\phi_k = \phi(B_k(W_k); \tilde\psi_k), \quad \psi_k^{(j)} = \psi(B_j(W_k), B_k(W_k); \tilde\psi_k), \quad W_k = (Z_k^c)^{\approx 2}, \quad \tilde\psi_k = Q^*(T_1^{(k)}, B_k(W_k))\psi_k, \tag{1.4}$$
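determined by the corresponding variational problems (1.6), (1.9), (4.14), (4.15) [2]. They are equal to the functions $\phi_k$, $\psi_k^{(j)}$ considered in [1, 4], but restricted to the neighborhood of the small field region $Z_k^c$ by the boundary conditions introduced by $B_k((Z_k^c)^{\approx 2})$. We consider the main action (1.1) [4] restricted to domains $\Omega$, which are unions of unit cubes with centers at points of the lattice $T_1^{(k)}$. Let us recall that the set of bonds $\mathrm{st}(\Omega)$ is defined as $\mathrm{st}(\Omega) = \{\langle x, x'\rangle \subset T_\eta : x \in \Omega \cap T_\eta,\ x' \in \Omega^c \cap T_\eta\}$. More generally, for a domain $\Omega$ and a generating set $B_k$ we define
$$A^*_k(\Omega, B_k; \psi, \phi) = \frac{1}{2} \langle \psi - Q(B_k)\phi,\ a(B_k)(\psi - Q(B_k)\phi) \rangle_{\Omega\cap B_k} + \frac{1}{2} \|\partial^\eta \phi\|^2_* + \frac{\lambda_k}{8} \big\| |\phi|^2 - 1 \big\|^2 + \frac{\nu_k}{2} \|\phi - h\|^2, \quad \text{where}$$
$$\|\partial^\eta \phi\|^2_* = \sum_{\langle x,x'\rangle \subset \Omega\cap T_\eta} |(\partial^\eta \phi)(\langle x, x'\rangle)|^2 + \frac{1}{2} \sum_{\langle x,x'\rangle \in \mathrm{st}(\Omega)} |(\partial^\eta \phi)(\langle x, x'\rangle)|^2, \quad \eta = L^{-k}. \tag{1.5}$$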
This means that we divide the derivatives squared, corresponding to the boundary bonds connecting two adjacent domains, in a symmetric way between the norms squared corresponding to the domains. We formulate now the second inductive hypothesis describing the effective action $A_k$ restricted to the small field region $Z_k^c$.

H 2. There exist positive coefficients $\beta_k$, $a_k$, $\lambda_k$, $\nu_k$ satisfying the conditions in (H.1) [4], constants $E_k$, $\alpha_0$, $\gamma$, and functions $\mathcal{E}^{(j)}(\psi_j, h)$, $j = 1, \cdots, k$, satisfying the inductive hypotheses (H.2)-(H.5) [4], such that
$$A_k(Z_k^c; \psi_k, h) = -\beta_k A^*_k(Z_k^c; \psi_k, \phi_k) + \mathcal{E}_k(Z_k^c; \psi_k, h) - E_k |Z_k^c|, \tag{1.6}$$
where the function $\mathcal{E}_k(Z_k^c)$ is defined by the following modification of (H.2) [4]:
$$\mathcal{E}_k(Z_k^c; \psi_k, h) = \sum_{j=1}^{k} \mathcal{E}^{(j)}(Z_k^c; \psi_k^{(j)}(\tilde\psi_k), h), \tag{1.7}$$
$$\mathcal{E}^{(j)}(Z_k^c; \psi_j, h) = \sum_{y\in Z_k^c\cap T^{(j)}}\ \sum_{X\in D_j:\ y\in X\subset (Z_k^c)^{\approx}} \mathcal{E}^{(j)}(y, X; \psi_j, h) - \sum_{y\in Z_k^c\cap T^{(j)}}\ \sum_{x\in (Z_k^c)^{\approx}\cap T^{(j)}} \Big( \frac{1}{2} v_1^{(j)}(y, x)\big(\psi_j^2(x) - \psi_j^2(y)\big) + v_2^{(j)}(y, x)\big(\nu_j h\cdot(\psi_j)'(x) - \nu_j h\cdot(\psi_j)'(y)\big) \Big). \tag{1.8}$$
The functions $v_1^{(j)}$, $v_2^{(j)}$ on $T^{(j)} \times T^{(j)}$ are determined by the functions $\mathcal{E}^{(j)}(y; \psi_j, h)$, considered on the whole lattice $T^{(j)}$, by the formulas (3.106) [1]. The configurations $\phi_k$, $\psi_k^{(j)}$ are defined in two different ways depending on a relation between $\beta_k$ and $\nu_k$. If
$$\nu_k \beta_k \le 1, \tag{1.9}$$
then in the variational problems defining the configurations we omit the term with $\nu_k$, or we simply put $\nu_k = 0$. If the opposite inequality holds, then we take the usual complete definitions as in [4].

We shall now make several remarks on various assumptions in the above hypothesis, trying to explain their meaning in more detail. First, let us stress again that we have assumed that the actions $\mathcal{E}^{(j)}$ are defined on the whole lattices, and satisfy all the relevant conditions in the inductive hypotheses (H.2)–(H.5) [4] formulated on the whole lattices. The action $A_k(Z_k^c)$ is obtained from $A_k$ in [4] by restricting correspondingly the main action and the sums in (H.2), (H.4) [4], subtracting the second sum in (1.8), and restricting the background configurations (1.4) to the neighbourhood of $Z_k^c$. Terms of the sums in (1.8) do not depend on the domain $Z_k^c$. This is important, because one of the fundamental aspects of our procedure is that a term $\mathcal{E}^{(k+1)}(z, Y)$ of a new action $\mathcal{E}^{(k+1)}$ created by the next renormalization step depends on elements of the action (1.6) localized in the domain $Y$, so it does not depend on any domain $Z_k^c$ containing $Y$ and used for its construction. Let us remark also that the second sum in (1.8) appeared in the expansions of [1], but it was equal to 0 there, as a trace of an antisymmetric matrix over the whole lattice. Here it does not vanish because the arguments are summed over different sets, and we need it to cancel the corresponding “relevant” and “marginal” terms in the expansions of the effective actions $\mathcal{E}^{(j)}$. Without the cancellations we do not have the uniform bounds (3.128), (3.129) [1] for the whole action $\mathcal{E}_k$, which are crucial for the whole approach.

The next remark is on the condition (1.9) and the corresponding definitions of the configurations $\phi_k$, $\psi_k^{(j)}$. Consider the constants $\beta_k\nu_k$. They form an increasing sequence, in fact the quotient of the $(k+1)$st and $k$th constants is approximately equal to $L^d$.
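The behaviour of the sequence $\beta_k\nu_k$ can be illustrated by a small numerical sketch. It rests on an idealization that is not made in the paper: the quotient of consecutive constants is taken to be exactly $L^d$, so that $\beta_k\nu_k = \beta_0\nu_0 L^{dk}$. Under this assumption the function below, whose name and parameters are hypothetical, returns the last index at which condition (1.9) still holds.

```python
def crossover_index(beta0_nu0, L, d, max_steps=1000):
    """Largest k with beta_k * nu_k <= 1, assuming beta_k * nu_k = beta0_nu0 * L**(d*k)."""
    if beta0_nu0 > 1:
        return None  # condition (1.9) already fails at k = 0
    value = beta0_nu0
    for k in range(max_steps):
        # value equals beta_k * nu_k; check whether the next constant exceeds 1.
        if value * L ** d > 1:
            return k
        value *= L ** d
    raise ValueError("no crossover found within max_steps")


if __name__ == "__main__":
    k = crossover_index(1e-6, L=2, d=3)
    print(k, 1e-6 * 8 ** k, 1e-6 * 8 ** (k + 1))
```

In this idealized model the constant at the returned index lies in $(L^{-d}, 1]$ and the next one lies in $(1, L^d]$.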
From this it follows that there exists exactly one index $k_0$ such that the condition (1.9) is satisfied for $k \le k_0$, and the opposite inequality holds for $k > k_0$. Then $\beta_{k_0}\nu_{k_0} > \frac{7}{8} L^{-d}$, $\beta_{k_0+1}\nu_{k_0+1} < \frac{9}{8} L^{d}$. The definition of the configurations $\phi_k$, $\psi_k^{(j)}$ means that for $k \le k_0$ they do not depend on $\nu_k$, $h$, and the corresponding term in the main action is treated as a perturbation. It could be included in $\mathcal{E}_k$, but for the renormalization procedure, which is the same for all steps, we keep it in the main action. In the $k_0$th step, i.e. transforming the $k_0$th into the $(k_0+1)$st density, we have to change the “incomplete” into the “complete” configurations. It is a simple matter, and it is included into the renormalization procedure, i.e. changing the unrenormalized into the renormalized configurations. The distinction between the two cases is made in order to simplify our analysis of large field contributions; there was no need for it in the small field analysis in [1, 4].

The second term in the exponential in (1.1) is the generating functional described previously in [1, 4], but restricted to a neighbourhood of $Z_k^c$. We assume the following hypothesis.

H 3. There exist functions $F^{(j)}(\psi_j, h, g)$, $j = 1, \cdots, k$, satisfying all the assumptions of the hypothesis (H.7) [4]. The restricted function in (1.1) is defined by the formulas
$$F_k(Z_k^c; \psi_k, h, g) = \langle \chi_{Z_k^c} g, \phi_k \rangle_1 + \sum_{j=1}^{k} F^{(j)}(Z_k^c; \psi_k^{(j)}(\tilde\psi_k), h, g) = \langle \chi_{Z_k^c} g, M_k(Z_k^c; \psi_k, h, g) \rangle_1, \tag{1.10}$$
where
$$F^{(j)}(Z_k^c; \psi_j, h, g) = \langle \chi_{Z_k^c} g, M^{(j)}(Z_k^c; \psi_j, h, g) \rangle_1, \tag{1.11}$$
$$M^{(j)}(x, Z_k^c; \psi_j, h, g) = \sum_{X\in D_j:\ x\in X\subset (Z_k^c)^{\approx}} M^{(j)}(x, X; \psi_j, h, g), \quad \text{hence} \tag{1.12}$$
$$M_k(x, Z_k^c; \psi_k, h, g) = \phi_k(x; \tilde\psi_k) + \sum_{j=1}^{k} M^{(j)}(x, Z_k^c; \psi_k^{(j)}(\tilde\psi_k), h, g). \tag{1.13}$$
Notice that the above notation is slightly imprecise, for example $M^{(j)}(x, Z_k^c)$ denotes the sum in (1.12), but it might be also a term of this sum corresponding to $X = Z_k^c$. This will never cause any confusion, the meaning will always be clear from the context.

The third term in (1.1) is a new term in comparison with the effective actions considered in [1, 4]. The most important contributions to it come from the renormalization operations of the large field expressions, the operations described in the last paper, but there are also contributions coming from fluctuation integrals. This term can be distinguished from the previous “regular” terms in (1.1) by the fact that it has reduced symmetry properties, and it is very small, basically it can be bounded by an arbitrary power of $\beta_k^{-1}$. We assume that it is a sum of $m$ terms, where $m$ is the positive integer determined by the equality $L^m = M$, $M$ is the fixed size of the large cubes, and their properties are described in the following inductive hypothesis.

H 4. There exist functions $R_n^{(k)}(\psi_k, h, g)$, $n = 1, 2, \cdots, m$, defined and real analytic on the space $\Xi_k(1, \varepsilon_k) \times \{g : \|g\|_{\ell^1} < 1\}$ on the whole lattice, and satisfying the symmetry conditions
$$R_n^{(k)}(R\psi_k, Rh, Rg) = R_n^{(k)}(\psi_k, h, g) \quad \text{for all } R \in O(N), \tag{1.14}$$
$$R_n^{(k)}(r\psi_k, h, rg) = R_n^{(k)}(\psi_k, h, g) \tag{1.15}$$
for all Euclidean transformations $r$ of the torus $T$ leaving invariant the lattice $T_{L^n}^{(k+n)}$, or equivalently leaving invariant the rescaled partition $\pi_{k-m+n}$ of $T$ into $L^n$-cubes. These functions have localization expansions of the usual form described in (H.4) [4], which satisfy all the locality, analyticity and symmetry properties there, and the bounds with the constant $E_0$ replaced by $\beta_k^{-n-1}$. The restricted function $R^{(k)}(Z_k^c)$ in (1.1) is defined by the formulas
$$R^{(k)}(Z_k^c; \psi_k, h, g) = \sum_{n=1}^{m} R_n^{(k)}(Z_k^c; \psi_k, h, g), \tag{1.16}$$
$$R_n^{(k)}(Z_k^c; \psi_k, h, g) = \sum_{X\in D_k:\ X\subset Z_k^c} R_n^{(k)}(X; \psi_k, h, g). \tag{1.17}$$
Notice that the functions $R_n^{(k)}$ satisfy almost the same conditions as $\mathcal{E}^{(k)}$, except that the Euclidean symmetry properties are slightly reduced, the functions depend on $g$, and they are not “pre-localized”, i.e. they are not represented by sums over points $y \in T_1^{(k)}$. The symmetry properties improve with $n$ decreasing, and a renormalization transformation together with the scaling improve the properties by one scale, so they decrease the index $n$ by 1, roughly speaking. The picture is that we first create expressions with the worst symmetry properties, and we include them into $R_m^{(k)}$. The next renormalization steps improve the properties and decrease the index $n$. After $m$ steps we obtain expressions with all the properties of $\mathcal{E}^{(k)}$, and we finally include them into $\mathcal{E}^{(k)}$. Let us remark also that the powers of $\beta_k^{-1}$ in the bounds are to some extent arbitrary, and we could have
chosen much higher powers, but each choice imposes some restrictions on $\beta_k$; a higher power imposes stronger restrictions. The choice in (H.4) is one of the simplest possible. In front of the exponential in (1.1) there is the characteristic function $\chi_k(Z_k^c)$. It is defined by the formulas
\[ \chi_k(Z_k^c) = \prod_{\square \in \pi_k :\, \square \subset Z_k^c} \chi_k(\square), \tag{1.18} \]
\[ \chi_k(\square) = \chi\Bigl(\Bigl\{ |\psi_k - Q_k\phi_k(\square;\tilde\psi_k)|,\ |\partial^\eta\phi_k(\square;\tilde\psi_k)|,\ |\Delta^\eta\phi_k(\square;\tilde\psi_k)|,\ |\alpha_k(\square;\tilde\psi_k)| < \delta_k \text{ on } \square^{\sim},\ |\phi_k(\square;\tilde\psi_k) - h| < \frac{\delta_k}{\sqrt{\nu_k}} \text{ on } \square^{\sim},\ |h^2 - 1| < \frac{\delta_k^2}{\sqrt{\nu_k}} \Bigr\}\Bigr), \]
\[ \alpha_k(\square;\tilde\psi_k) = \frac{\lambda_k}{2}\bigl(\phi_k^2(\square;\tilde\psi_k) - 1\bigr), \qquad \phi_k(\square;\tilde\psi_k) = \phi_k(B_k(\square^{\sim 2});\tilde\psi_k). \tag{1.19} \]
The above definition is written for the case $k > k_0$; if $k \le k_0$ then we omit the last two conditions in the curly brackets above, and of course we take the function $\phi_k(\square;\tilde\psi_k)$ with $\nu_k = 0$. We have finished the inductive description of the “small field” effective densities given by the product of the characteristic function and the exponential in (1.1).

Now we describe the “large field” effective densities $\rho'_k(Z_k)$. This is in a sense a more complicated problem, because we need to know much more of the multi-scale structure of these densities. This is connected with the fact that at each step we create in the exponential terms with localization domains intersecting the small field region $Z_k^c$ and the large field region $Z_k$. Actually they may intersect some, or even all, of the large field regions in the previous steps, so these terms “remember” the corresponding part of the multi-scale structure. Most of these terms participate in a fluctuation integral connected with a next renormalization transformation, so we have to describe them in order to analyze their contribution to this integral. On the other hand they do not participate in any other operation, in particular in the renormalization operation in the next step, so we need only their most general and simplest properties. Before we formulate them let us explain briefly what we understand by the phrase “multi-scale structure”. The most important element of this structure is a sequence $\{\Omega_j\}$, or the corresponding generating set $B_k$, satisfying conditions (1.1)–(1.3) [2], actually more restrictive conditions, as will be described in the next section.
This sequence is connected with fluctuation integrals: on $\Omega_j$ we perform operations leading to a $j$th fluctuation integral, and on $\Omega_j^c$ we leave the corresponding renormalization transformation unchanged, as explained before. As in [2, 3] we denote by $\psi$ the spin variable defined on $B_k$ by the equalities $\psi = \psi_j$ on $\Lambda_j$, $j = 0, 1, \dots, k$, where $\psi_j$ is the spin variable on the lattice $T^{(j)}$, as in (1.8). Let us recall that $\Lambda_j = (\Omega_j \setminus \Omega_{j+1})^{(j)} = (\Omega_j \setminus \Omega_{j+1}) \cap T^{(j)}$, where we take $\Omega_0 = \Omega_1^{\sim}$, $\Omega_{k+1} = \emptyset$. With the sequence $\{\Omega_j\}$ there is connected another one denoted by $\{\Omega''_j\}$, where $\Omega''_j$ is a subdomain of $\Omega_j$ on which the fluctuation variable is small and we perform the fluctuation integral. The variable is left unintegrated on the complement of this domain, and we denote by $\psi'$ a fluctuation variable defined on the subset of $B_k$ by the equalities $\psi' = \psi'_j$ on $(\Omega_j \setminus \Omega''_j) \cap T^{(j)}$, $j = 1, \dots, k$. There is also the already mentioned sequence $\{Z_j\}$ of large field regions, or the sequence $\{Z_j^c\}$ of the corresponding small field regions, denoted by $C_k$. The three sequences satisfy the inclusions
\[ \Omega_1 \supset \Omega''_1 \supset Z_1^c \supset \Omega_2 \supset \cdots \supset \Omega_j \supset \Omega''_j \supset Z_j^c \supset \Omega_{j+1} \supset \cdots \supset \Omega_k \supset \Omega''_k \supset Z_k^c, \tag{1.20} \]
and other conditions which will be introduced in the next section. It is convenient to introduce a multi-index $A_k$ which includes the three sequences, and also other sequences of regions introduced in the next sections. A representation of the large field density given below is expressed as a sum over admissible multi-indices. The multi-index $A_k$ will be defined inductively; the remaining parts of the definition will be given in the third sections of the next two papers. Here we have described only its most important elements relevant in the description of the above mentioned terms. Let us formulate also the following important property of the multi-index: all its elements are determined completely by their restrictions to the large field region $Z_k$, like for example the above discussed sequences $B_k$, $\{\Omega''_j\}$, $C_k$. This means that we may localize $A_k$ to components of $Z_k$, or to some unions of components, so we can take the localized multi-index $A_k \cap Z$, where $Z$ is a union of connected components of $Z_k$. We formulate now an inductive hypothesis describing the large field densities.

H 5. The large field density $\rho'_k(Z_k)$ has the representation
\[ \rho'_k(Z_k; \psi_k; h; g) = \sum_{A_k} \mathbf{T}_k(Z_k, A_k; \psi_k, h, g) \exp B^{(k)}(Z_k, A_k; \psi, \psi', h, g), \tag{1.21} \]
where the sum is over certain admissible multi-indices, and the integral operator $\mathbf{T}_k(Z_k, A_k)$ acts on functions of $\psi, \psi'$ and yields functions of $\psi_k$, with a real and non-negative kernel depending on the variables $\psi_k, \psi, \psi', g$ restricted to the domain $U_k^{\sim}$, $U_k = Z_k^{\approx}$, and on the multi-index $A_k$. The kernel is invariant with respect to the orthogonal transformations applied to the vector variables $\psi, \psi', h, g$. Consider the Euclidean transformations leaving invariant the lattice $T_M^{(k+m)}$, or the partition $\pi_k$. These transformations are applied to the variables and to all domains forming the multi-indices $A_k$. The kernel is invariant with respect to all these transformations also.

The above hypothesis plays the same role as (H.1); it establishes a general form of the densities, and their most basic properties. There are three ingredients of the representation (1.21): the integral operators $\mathbf{T}_k(Z_k, A_k)$, the admissible multi-indices $A_k$ and the functions $B^{(k)}(Z_k, A_k)$, called the boundary terms. The integral operators are treated in a completely different way than the other elements of the inductive description. We do not describe them by introducing some space of such operators determined by some general properties, but we define them quite explicitly, although still inductively, in the course of our constructions, together with the multi-indices $A_k$. Crudely speaking they are compositions of the renormalization transformations restricted to the domains $\Omega_j^c$, unintegrated fluctuation integrals restricted to $(\Omega_j \setminus \Omega''_j) \cap \Lambda_j$, certain characteristic functions and small field densities restricted to $Z_{j-1}^c \cap Z_j$. This definition is long and complicated, but it arises quite naturally as a result of various operations we perform, so we prefer to discuss it in proper places in the following sections, in order to avoid unnecessary repetitions of many tedious details.
We use only a small part of this definition, in some special cases connected with the large field renormalization operation $\mathbf{R}^{(k+1)}$, but in these cases we use its explicit form. In other operations, in particular in the analysis of the next renormalization transformation $T^{(k)}$, we use only the general property on the supports of kernels of the integral operators formulated in (H.5). From this we obtain the following factorization property: if $Z_k = Z' \cup Z''$, $U' = Z'^{\approx}$, $U'' = Z''^{\approx}$, and $U'^{\sim} \cap U''^{\sim} = \emptyset$, then
\[ \mathbf{T}_k(Z_k, A_k) = \mathbf{T}_k(Z', A_k \cap Z')\,\mathbf{T}_k(Z'', A_k \cap Z''), \tag{1.22} \]
and the two operators on the right-hand side commute. This is an important property, which will play a crucial role in the operation $\mathbf{R}^{(k+1)}$. There is also another factorization property, in “levels” of the multi-scale structure determined by $A_k$. For a component $Z$ of $Z_k$, after the last fluctuation integral we have
\[ \mathbf{T}_k(Z, A_k \cap Z) = \mathbf{T}^{(k-1)}(Z, A_k \cap (Z \cap Z_{k-1}^c))\,\mathbf{T}_{k-1}(Z \cap Z_{k-1}, A_{k-1} \cap (Z \cap Z_{k-1})), \tag{1.23} \]
where the operator $\mathbf{T}^{(k-1)}$ is explicitly defined, see Sect. 3 of the next paper. This property may be destroyed in those components where the operation $\mathbf{R}^{(k+1)}$ is performed. Because the operators are explicitly defined, their properties are not really a part of the inductive assumptions; they are statements which are consequences of the definition, and are proved in the proper places. We formulate here the important ones, those which are directly connected with basic properties of the effective densities, and which will be used in future constructions, like the two properties above. The most important property is that these operators have sufficiently small bounds. We now write a general form of these bounds, and mention a few of their properties. We will discuss them more extensively later on. First we have to characterize a domain of integration for $\mathbf{T}_k(Z_k, A_k)$. In the variables $\psi$ it is contained in
\[ \bigcap_{j=0}^{k-1}\ \bigcap_{\square \in \pi_{j+1},\ \square \subset \Omega_{j+1}^{\sim} \cap Z_{j+1}} \tilde\Psi_j(B_k \cap \square^{\sim}; 3dL\delta_j)\ \cap\ \bigcap_{j=1}^{k}\ \bigcap_{\square \in \pi_j,\ \square \subset Z_j^c \cap (\Omega_{j+1}^{\sim})^c} \tilde\Psi_j(\square^{\sim}; 3\delta_j), \tag{1.24} \]
where we put $\Omega_{k+1}^{\sim} = \emptyset$. For a definition of the above spaces see Sect. 1 of [2]. In the variables $\psi'$ it is contained in
\[ \{\psi' : |\psi'_j| < \mathrm{const.}\, p_0(\beta_j) \ \text{on}\ \Omega_j \setminus \Omega''_j,\ j = 1, \dots, k\}, \tag{1.25} \]
where const. is built in a simple way from the constants $K_1$, $B_5$, etc. The above statements are consequences of considerations in the next section. Denote by $\chi$ the characteristic function of the above domains. The operator restricted to a component $Z$ of the region $Z_k$ satisfies the inequality
\[ \int d\psi_k|_Z\, |\mathbf{T}_k(Z, A_k \cap Z)F| \le \exp(-\kappa_k(Z, A_k \cap Z)) \sup_{\psi, \psi'} \chi|F|, \tag{1.26} \]
where $F$ is a bounded function of $\psi, \psi'$. Obviously this inequality has little meaning without defining the function $\kappa_k$, or at least describing its properties. This function is explicitly defined along with the integral operator; it is built of various small factors connected with large field restrictions mentioned at the beginning of this section, and other bounds. The definition is inductive, and it becomes simple and natural only after constructions leading to a new small field region in the next step, so we postpone its formulation to Sect. 3 of the next paper, where we discuss also all relevant properties of this function. Now we write the most important one connected with the summation in (1.21), but first we have to introduce some additional definitions. It is convenient to formulate the bounds in terms of some “combinatoric factors” $C(A_k \cap Z)$ which control sums over $A_k \cap Z$. These factors are defined on a maximal set of the multi-indices, i.e. on all which satisfy only the simplest geometric restrictions (1.20), and the ones for other regions introduced in the following sections. They satisfy the inequality
\[ \sum_{A_k \cap Z} C^{-1}(A_k \cap Z) \le 1, \tag{1.27} \]
where the sum is over the maximal set of the multi-indices. These factors are defined inductively in Sect. 3 of the next paper. It is a very simple and obvious definition. Let us mention here only that they satisfy the inequality k−1 X 1 5 1 1 |Z1 ∩ Z|1 + |Z c ∩ Zj+1 |ξ . (1.28) C(Ak ∩ Z) ≤ exp 3 Md 3 Md j j=1
We sum the inequalities (1.26) over the admissible $A_k \cap Z$, multiply the terms under the sum on the right-hand side by $C^{-1}(A_k \cap Z)\,C(A_k \cap Z)$, and bound the sum by the sum in (1.27) multiplied by the supremum of the remaining products. This yields the inequality
\[ \sum_{A_k \cap Z} \int d\psi_k|_Z\, |\mathbf{T}_k(Z, A_k \cap Z)F| \le \sup_{A_k \cap Z} C(A_k \cap Z) \exp(-\kappa_k(Z, A_k \cap Z)) \cdot \sup_{A_k \cap Z,\, \psi,\, \psi'} \chi|F|, \tag{1.29} \]
where we have taken into account the possibility that $F$ may depend on $A_k \cap Z$ also. Let us stress again that the sum and the suprema above are over the set of admissible multi-indices. It is a special subset of the set of all multi-indices, which is defined by some geometric conditions for sequences of regions forming the multi-indices, for example conditions (1.1)–(1.3) [2], (1.20), but also other conditions connected with bounds of the product under the first supremum in (1.29). We formulate now the most important one. We assume that there exists an integer $p_2 \ge 2d + 2$ and sufficiently large constants $A_2$, $H_0$ such that one of the following inequalities holds for all admissible $A_k \cap Z$:
\[ C(A_k \cap Z) \exp(-\kappa_k(Z, A_k \cap Z)) \le \exp\bigl(-\ell\, p_2(\beta_k) - H_k |Z|_\xi\bigr), \tag{1.30} \]
where $\ell = 1$ or $2$, $p_2(\beta_k) = A_2 (\log \beta_k)^{p_2}$, $H_k = H_0 \log \beta_k$. The numbers $p_2$, $A_2$ can be explicitly expressed in terms of $p_0$, $A_0$ and some other constants, like $L$, $M$, $K_0$, $K_1$, etc.; these relations follow from bounds discussed in the next two papers, for example, we can take $p_2 = p_0 - 2d - 2$. The constant $H_0$ can be expressed in terms of $E_0$, $B_0$, and is important in one bound only, in Sect. 3 of the third paper. Combining the inequalities (1.29), (1.30) we obtain
\[ \sum_{A_k \cap Z} \int d\psi_k|_Z\, |\mathbf{T}_k(Z, A_k \cap Z)F| \le \exp\bigl(-\ell\, p_2(\beta_k) - H_k |Z|_\eta\bigr) \sup_{A_k \cap Z,\, \psi,\, \psi'} \chi|F|. \tag{1.31} \]
It is a crucial inequality and will be used to estimate integrals of the effective densities. It has a simple and clear meaning; unfortunately we cannot take it as a basis of the inductive analysis, because it does not behave in a simple way under the induction steps. Instead the analysis is based on properties of the expression on the left-hand side of the inequality (1.30). We consider properties which behave “regularly” under the induction steps, and which imply (1.30). They are formulated in Proposition 5.2 in Sect. 3 of the next paper in terms of yet another function of the multi-indices, which we need also to formulate the next inductive hypothesis. It is defined on components $Z$ of $Z_k$, and on the restrictions $A_k \cap Z$, it has values in non-negative integers, and we denote it by $K(Z, A_k \cap Z)$. Intuitively, it is a maximal number of additional renormalization steps which we can perform and still have the inequality (1.30) for $\ell = 2$ in components containing $Z$,
assuming that no new large field restrictions have been created in these components. A more explicit and precise definition will be given in Sect. 3 of the next paper, after an inductive construction of new multi-indices connected with a successive renormalization step. We formulate now the next inductive hypothesis, but we will use it only in the third paper, after the definition of this function. H 6. For k ≤ k0 and for each connected component Z of Zk the condition K(Z, Ak ∩ Z) > 0
(1.32)
holds for all admissible multi-indices in the sum (1.21). This assumption is slightly stronger than the inequality (1.30), but more importantly it is easier to control in our inductive procedure. The large field renormalization operations are introduced to assure that condition (1.32) is preserved in successive renormalization steps. It holds also in the preceding steps, which means that we have $K(Z_j \cap Z, A_k \cap (Z_j \cap Z)) > 0$ for all $j \le k$. It is one of the admissibility conditions for the multi-indices $A_k$ mentioned above. Let us mention that this part of the considerations connected with the above new definitions and the inequalities (1.27)–(1.32) is independent of the particular model; it is quite universal and involves only geometric combinatoric arguments. They have been discussed extensively in many papers, and in particular we will use here the results of [5d].

Consider now the function $B^{(k)}(Z_k, A_k)$ in (1.21). It is treated in the same way as the functions in the small field density; we introduce a space of such functions satisfying some inductive assumptions. Unfortunately the description is more complicated now, because of the complicated geometric structure connected with $A_k$, although relevant properties are quite trivial, and the functions do not play any role in constructions; they are just their necessary outcome. To formulate an inductive hypothesis describing them we have to introduce several new definitions. We start with geometric ones. Consider a domain $\Omega$ whose components belong to $D_k$, and define
\[ D_k(\mathrm{mod}\ \Omega^c) = \{X \in D_k : \text{any component of } \Omega^c \text{ is either contained in } X, \text{ or disjoint with } X\}. \tag{1.33} \]
For localization domains in the above class we introduce the following modification of the linear size function $d_k$:
\[ d_k(X\ \mathrm{mod}\ \Omega^c) = \inf\Bigl\{ \frac{1}{M}|\Gamma|_\eta : \Gamma \text{ is a connected tree graph in the continuous space, contained in the domain } X \text{ and intersecting every cube from the cover } \pi'_k \text{ which intersects } X \cap \Omega \Bigr\}. \tag{1.34} \]
Obviously for $X \subset \Omega$ we have $d_k(X\ \mathrm{mod}\ \Omega^c) = d_k(X)$. For $X$ containing components of $\Omega^c$ the above function measures the linear size of components of $X \cap \Omega$, together with distances between the components. Because of this we have again the bound
\[ \sum_{X \in D_k(\mathrm{mod}\ \Omega^c) :\, \square \subset X} \exp(-d_k(X\ \mathrm{mod}\ \Omega^c)) < K_0, \qquad \square \in \pi_k,\ \square \subset \Omega, \tag{1.35} \]
with a slightly bigger constant $K_0$ than before. We choose it universally for all bounds of this type. Notice that by definition (1.33) the sum above is in reality restricted to the sum over intersections $X \cap \Omega$.
It is much more awkward to define analyticity properties and spaces, because the variable $\psi$ is a multi-scale variable. We use localized spaces defined in the same way as the characteristic functions $\chi_j(\square)$, $\square \in \pi_j$. We introduce the spaces $\Xi^c_j(\square; 1, \varepsilon_j)$ defined by the obvious “complexification” of the conditions in (1.19), with $j$ and $\varepsilon_j$ instead of $k$ and $\delta_k$. For cubes close to the boundary $\partial\Omega_j$ we modify slightly this definition. We introduce the spaces $\Xi'^c_j(\square; (\tfrac{3}{4}, \tfrac{5}{4}), \varepsilon_{j-1})$ modeled on the definition of the characteristic functions $\chi'_{k+1}(\square')$ given in the next section in (2.11)–(2.13), in which we have to take $j$ and $\varepsilon_{j-1}$ instead of $k+1$ and $\delta_k$. There is no point in formulating this straightforward, but rather awkward, definition here, and we refer the reader for details to the next section, in particular for an explanation of the places where we put the factors $\tfrac{3}{4}$, $\tfrac{5}{4}$. For a domain $X \in D_k(\mathrm{mod}\ \Omega_k^c)$ we define
\[ \Xi^c(X, B_k; \{\varepsilon_j\}) = \bigcap_{j=1}^{k} \Bigl[\ \bigcap_{\square \subset [X \cap (\Omega_{j+1}^{\sim 3})^c] \setminus (\Omega_j^c)^{\sim 3}} \Xi^c_j(\square; 1, \varepsilon_j)\ \cap \bigcap_{\square \subset X \cap \Omega_j^{\sim 3} \cap (\Omega_j^c)^{\sim 3}} \Xi'^c_j\bigl(\square; (\tfrac{3}{4}, \tfrac{5}{4}), \varepsilon_{j-1}\bigr) \Bigr]. \tag{1.36} \]
This lengthy and complicated looking definition has a very simple interpretation: it means that a configuration $\psi$ from the space has almost the same local regularity properties on $X \cap \Omega_j \cap \Omega_{j+1}^c$ as an element of the space $\Xi^c_j(1, \varepsilon_j)$, with slight modifications near the boundaries $\partial\Omega_j$ and $\partial\Omega_{j+1}$. We also have to introduce proper analyticity domains for the fluctuation variables $\psi'$. Fortunately this is simple. We introduce the spaces of complex configurations $\psi'$ defined by
\[ \{\psi' : |\psi'_j| < \varepsilon_{1,j} \ \text{on}\ X \cap (\Omega_j \setminus \Omega''_j),\ j = 1, \dots, k\}, \tag{1.37} \]
where $\varepsilon_{1,j}$ is equal to $\varepsilon_j$ multiplied by a sufficiently small number, for example we may take $\varepsilon_{1,j} = \frac{A_1}{A_0}\varepsilon_j$. Now we are ready to formulate the inductive hypothesis describing the functions $B^{(k)}(Z_k, A_k)$.

H 7. The function in the exponential in (1.21) has a localization expansion of the form
\[ B^{(k)}(Z_k, A_k; \psi, \psi', g) = \sum_{X \in D_k(\mathrm{mod}\ \Omega_k^c) :\, X \cap Z_k \ne \emptyset,\ X \cap Z_k^c \ne \emptyset} B^{(k)}(X, A_k; \psi, \psi', g). \tag{1.38} \]
Terms of this expansion satisfy the usual locality properties with respect to $\psi, \psi', g$, and have analytic extensions onto the spaces (1.36), (1.37), $\{g : \|g\|_{\ell^1} < 1\}$. The extended functions satisfy the bounds
\[ |B^{(k)}(X, A_k; \psi, \psi', g)| < B_0 \exp(-\kappa\, d_k(X\ \mathrm{mod}\ \Omega_k^c)), \tag{1.39} \]
with a sufficiently large positive constant $B_0$, and the symmetry properties described in (H.5).

The constant $B_0$ plays the same role as the constant $E_0$. We will obtain a number of restrictions on it, from which we could determine it explicitly, as $E_0$ in [4]. With the above hypothesis we have finished the inductive description of the effective densities, except for a few technical points and definitions connected with the integral operators,
which for reasons of simplicity and convenience are postponed to Sects. 3 of the next two papers.

As we know from paper [4] the effective action $\mathcal{A}_k(Z_k^c)$, or the whole lattice action $\mathcal{A}_k$, is not uniquely defined; it depends on several choices, among them on the choice of the renormalization group equations (1.9) [4], and in particular the equations involving the constants $c_{k+1}$, $d_{k+1}$. We have mentioned there several simple choices of these constants, each determining its own renormalization group flow for the “running” coefficients of the model. Now we fix one which is particularly convenient in the large field analysis. We take $c_{k+1} = 0$, $d_{k+1} = b_{k+1}$, and more generally $c_j = 0$, $d_j = b_j$ for all $j \le k + 1$. By (4.1)[4] these equations mean that
\[ \beta_{j+1} a_{j+1} = \beta_{j+1,u}\, a_{j+1,u} = \beta_j L^{d-2} \frac{a\, a_j}{a_j + aL^{-2}}, \qquad \lambda_{j+1} = \lambda_{j+1,u} = \lambda_j L^2, \tag{1.40} \]
so the renormalization group flow for $\beta_j a_j$, $\lambda_j$ is the “free” flow. Let us study consequences of these equations, having in mind composition formulas for products of renormalization transformations. Let us recall that the coefficients $a_j$ in $\mathcal{A}_k(B_k(W_k); \tilde\psi, \phi)$ are determined by the coefficient $a_k$ through the equations $a_j = \frac{1 - L^{-2k}}{1 - L^{-2j}} a_k$, and the same relations hold in the previous steps. We have not determined yet what the constant $a$ is in the definition of the next renormalization transformation $T^{(k)}$. Now we fix it as equal to $a = a_1 = \frac{1 - L^{-2k}}{1 - L^{-2}} a_k$. This assures that $a_{k+1,u} = \frac{1 - L^{-2k}}{1 - L^{-2(k+1)}} a_k$. Of course we take the same definition for the previous renormalization transformations, i.e. for $T^{(j)}$ we take $a = \frac{1 - L^{-2j}}{1 - L^{-2}} a_j$, and we have $a_{j+1,u} = \frac{1 - L^{-2j}}{1 - L^{-2(j+1)}} a_j$. At this moment we should introduce a double index for all these constants, e.g. the constant $a$ in the definition of $T^{(j)}$ actually depends on $j$, but for simplicity we keep the slightly imprecise old notation. The transformation $T^{(j)}$ is determined by the constant $\beta_j a$, and from the above equations we obtain by an easy induction
\[ \beta_j a = \beta_j a_j \frac{1 - L^{-2j}}{1 - L^{-2}} = \beta_{j+n}\, a_{j+n}\, L^{-n(d-2)} \frac{1 - L^{-2(j+n)}}{1 - L^{-2}}, \]
hence, taking $n = k - j$,
\[ \beta_j a = \beta_k a_k L^{-(k-j)(d-2)} \frac{1 - L^{-2k}}{1 - L^{-2}} = \beta_k a\, (L^j \eta)^{d-2}, \tag{1.41} \]
where the constant $a$ on the right-hand side is determined by $a_k$. This way we have written all the renormalization transformations $T^{(j)}$ in terms of the common constants $\beta_k$, $a$, and the scaling factors. Having this form of the transformations we can apply all the composition formulas proved in Sect. 4 [2]. They will be used in Sect. 2 of the third paper in the analysis of the operation $\mathbf{R}^{(k+1)}$. Equations (1.41) give simple scaling relations between the constants $\beta_j a$. We can use these constants in various definitions instead of the coefficients $\beta_j$, for example we define $p_0(\beta_j) = A_0 (\log \beta_j a)^{p_0}$, $\delta_j = (\beta_j a)^{-\frac{1}{2}} p_0(\beta_j)$, etc. With such definitions it is easier to study relations between values of these expressions for different $j$'s. In the future we will always use these definitions, even if for simplicity we will not write the coefficients $a$.

To formulate the main theorems of this paper it is again convenient to introduce spaces of densities satisfying the inductive hypotheses, as in [4]. These spaces are determined by the same parameters as in [4], with an addition of the constant $B_0$ in (1.39), and we define

$\mathcal{R}_k(\beta, a, \lambda, \nu; B, \bar\nu) = \{\rho_k : \rho_k$ satisfies the hypotheses (H.1)–(H.7), which include also the hypotheses (H.1)–(H.7) [4], determined by the constants in the parentheses above, together with $M, \kappa, \alpha_0, \alpha_1, \gamma, E_0, c_8, c_9, B_0$, with $\beta_k > B$, $0 < \nu_k \le \bar\nu\}$, (1.42)

and where the integral operators are explicitly defined, the definitions being discussed in the next papers. By the hypotheses (H.2)–(H.4) the small field effective action and generating functional $E_k$, $F_k$, $\mathbf{R}^{(k)}$ defined on the whole tori form a part of the above description. We separate this part and define spaces of these elements. It is the definition (1.6) [4], in which we include also the functions $(\mathbf{R}^{(k)}_n)_{n \le m}$ satisfying the inductive hypothesis (H.4). It does not introduce new parameters in their description, so we use the same symbol (1.6) [4] for these extended spaces. Now we formulate the first main theorem of this paper in the following way.

Theorem 1. Under the assumptions of Theorem 1 [4], i.e. for any constants $M, \kappa, \alpha_1, \gamma, c_8, c_9$, where $M, \kappa$ are sufficiently large, $\alpha_1, c_8, c_9$ are positive and sufficiently small, there exist constants $B, \alpha_0, E_0, B_0$ such that if $\bar\nu \le \beta_k^{-1} \le \frac{7}{8} L^{-2}$, which implies $k \le k_0$, then the transformation $S^{(k)}T^{(k)}$ maps the space (1.42) into such a space with $k + 1$, $BL^{d-2}$, $\bar\nu L^2$ instead of $k$, $B$, $\bar\nu$. The remaining constants are the same, except that Hypothesis (H.6) may not be satisfied, only the weaker hypothesis with the inequality “≥” in (1.32), and Hypotheses (H.2), (H.3), (H.4), (H.7) are satisfied with the constant $\frac{2}{3}\kappa$ instead of $\kappa$ for functions with the superscript $j = k + 1$. This transformation determines uniquely the transformation $S^{(k)}T^{(k)}$ introduced in Sect. 1 [4] and defined on the extended space (1.6) [4]. It satisfies all the conclusions of Theorem 2 [4], i.e. it establishes the mapping (1.7) [4] between the extended spaces, which satisfies the Eqs. (1.8) [4], (1.9) [4] with $c_{k+1} = 0$, $d_{k+1} = b_{k+1}$, and the inequalities (1.10) [4].
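The scaling relations (1.40) and (1.41) lend themselves to a quick numerical sanity check. The sketch below is only an illustration under explicit assumptions that are not spelled out in this section: that (1.40) splits into the separate flows $\beta_{j+1} = L^{d-2}\beta_j$ and $a_{j+1} = a\,a_j/(a_j + aL^{-2})$, that $\eta = L^{-k}$, and that the numerical values of `L`, `d`, `k`, `beta_k`, `a` are arbitrary test numbers. All function names are hypothetical.

```python
import math


def a_by_recursion(L, k, a):
    """Return [a_1, ..., a_k] from a_{j+1} = a*a_j / (a_j + a*L**-2), a_1 = a.

    Assumption: this is the 'a' part of the free flow (1.40) with c_j = 0.
    """
    values = [a]
    for _ in range(k - 1):
        a_j = values[-1]
        values.append(a * a_j / (a_j + a * L ** -2))
    return values


def a_closed_form(L, j, a):
    """Closed form a_j = a (1 - L^-2) / (1 - L^-2j), i.e. a = a_j (1 - L^-2j)/(1 - L^-2)."""
    return a * (1 - L ** -2) / (1 - L ** (-2 * j))


def beta_a_two_ways(L, d, k, j, beta_k, a):
    """Evaluate both sides of (1.41) for beta_j * a.

    Assumptions: beta_j = beta_k * L^{-(k-j)(d-2)} and eta = L^-k.
    """
    beta_j = beta_k * L ** (-(k - j) * (d - 2))
    a_j = a_closed_form(L, j, a)
    lhs = beta_j * a_j * (1 - L ** (-2 * j)) / (1 - L ** -2)
    eta = L ** -k
    rhs = beta_k * a * (L ** j * eta) ** (d - 2)
    return lhs, rhs


if __name__ == "__main__":
    L, d, k, beta_k, a = 3, 3, 6, 50.0, 1.7  # arbitrary test values
    for j, a_j in enumerate(a_by_recursion(L, k, a), start=1):
        lhs, rhs = beta_a_two_ways(L, d, k, j, beta_k, a)
        print(j, a_j, a_closed_form(L, j, a), lhs, rhs)
```

Under these assumptions the recursion reproduces the closed form for $a_j$, so the constant $a$ computed from any $a_j$ is the same, which is what allows all transformations $T^{(j)}$ to be written in terms of the common constants $\beta_k$, $a$.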
This theorem is very similar to the combined Theorems 1, 2 in [4], and we refer the reader to the discussion given after their formulations in Sect. 1 [4], which applies and is important here also. The only difference between the above theorem and Theorem 1 in [4] is that now one of the inductive hypotheses may not be satisfied by the density $S^{(k)}T^{(k)}\rho_k$. To obtain a density satisfying all the hypotheses, in particular the improved bound (1.32), we introduce the large field renormalization operation $\mathbf{R}^{(k+1)}$. More precisely we introduce this operation for $k < k_0$. For $k \ge k_0$ we can control the remaining renormalization steps by weaker bounds. This operation is defined again quite explicitly, but the definition is long and complicated, so we give it in the course of constructions of the third paper, and here we formulate a theorem which states the existence of this construction as well as other conclusions.

Theorem 2. Under the assumptions of Theorem 1, and for $k < k_0$, there exists a transformation $\mathbf{R}^{(k+1)}$ defined on the densities $\rho_{k+1} = S^{(k)}T^{(k)}\rho_k$ in the image of $S^{(k)}T^{(k)}$ considered on the space (1.42), such that $\mathbf{R}^{(k+1)}\rho_{k+1}$ satisfies all the inductive hypotheses (H.1)–(H.7) for $k + 1$. More precisely it belongs to the space (1.42) with $k + 1$, $BL^{d-2}$, $\bar\nu L^2$ instead of $k$, $B$, $\bar\nu$. This transformation satisfies the basic normalization property
\[ \int d\psi_{k+1}\, \mathbf{R}^{(k+1)}\rho_{k+1} = \int d\psi_{k+1}\, \rho_{k+1}. \tag{1.43} \]
It satisfies also a much stronger normalization property. It determines uniquely a transformation $\mathbf{R}^{(k+1)}$ of the corresponding space (1.6) [4] of the effective actions and
generating functionals. $\mathbf{R}^{(k+1)}$ is equal to the identity transformation on $(E^{(j)})_{j \le k+1}$, $(\mathbf{M}^{(j)})_{j \le k+1}$, $(\mathbf{R}^{(k+1)}_n)_{n < m}$, and it generates a new function $\mathbf{R}''^{(k+1)}_m(\psi_{k+1}, g)$ which is added to $\mathbf{R}^{(k+1)}_m(\psi_{k+1}, g)$. This new function satisfies the same properties as $\mathbf{R}^{(k+1)}_m(\psi_{k+1}, g)$, and the sum of the two satisfies all the properties and bounds required by Hypothesis (H.4). For $k \ge k_0$ we take $\mathbf{R}^{(k+1)}$ as the identity operator, so $\mathbf{R}''^{(k+1)}_m = 0$ in these cases.

Combining the two theorems we obtain

Theorem 3. Under the assumptions of Theorem 1 and for $k < k_0$ the transformation $\mathbf{R}^{(k+1)}S^{(k)}T^{(k)}$ maps the space (1.42) into the space determined by $k + 1$, $BL^{d-2}$, $\bar\nu L^2$, with the remaining constants unchanged. The corresponding transformation $\mathbf{R}^{(k+1)}S^{(k)}T^{(k)}$ satisfies all the conclusions of Theorem 1.

The above theorems hold for $k < k_0$, so the image of the last transformation is in the space (1.42) with $k = k_0$. We can apply the transformation $S^{(k_0)}T^{(k_0)}$ and Theorem 1 still holds, but densities in the image of this transformation generally do not satisfy the hypothesis (H.6), so the image is not contained in the space (1.42) with $k = k_0 + 1$. Actually this theorem holds for arbitrary $k$, but because the images are not contained in the corresponding spaces we cannot iterate the procedure. We want to apply the same renormalization transformations $S^{(k)}T^{(k)}$, with $\mathbf{R}^{(k+1)}$ equal to the identity, and we want to use the same general form of the effective densities described in (H.1), (H.5), with the small field densities satisfying the same inductive hypotheses (H.2)–(H.4), but obviously we have to change the assumptions on the large field densities, in particular on the integral operators. Fortunately the problem is simplified by the fact that we need to perform only a relatively small number of further renormalization steps. We start with $\nu_{k_0}$ satisfying $\nu_{k_0} > \frac{7}{8} L^{-d} \beta_{k_0}^{-1}$, and as in Sect.
6 [4] we finish when $\nu_k$ satisfies for the first time the inequality $\nu_k > \frac{7}{8} L^{-2}$. We have
\[ 1 \ge \nu_k > \nu_{k_0} L^{2(k-k_0)} > \tfrac{7}{8} L^{-d} L^{2(k-k_0)} \beta_{k_0}^{-1}, \quad \text{hence} \quad L^{2(k-k_0)} < \tfrac{8}{7} L^d \beta_{k_0}, \]
\[ 2(k - k_0) < \frac{1}{\log L}\Bigl(\log \tfrac{8}{7} L^d + \log \beta_{k_0}\Bigr) < 2 \log \beta_{k_0}, \quad \text{and} \quad k - k_0 < \log \beta_{k_0}. \]
Because of this the bounds of the large field densities for $k = k_0$ are enough to control all the remaining steps. We change now the approach. We do not introduce any new spaces of densities, but we consider specifically the densities $\rho_k$ obtained by applying the renormalization transformations to densities $\rho_{k_0}$ in the space (1.42) with $k = k_0$, or more precisely the densities
\[ \rho_k = \prod_{j=k-1}^{k_0} S^{(j)}T^{(j)} \rho_{k_0}, \qquad \rho_{k_0} \in \mathcal{R}_{k_0}(\beta, a, \lambda, \nu; B, \beta_{k_0}^{-1}). \tag{1.44} \]
Their properties are described in the following theorem.

Theorem 4. The densities (1.44) satisfy all the inductive hypotheses (H.1)–(H.5), (H.7), and the integral operators $\mathbf{T}_k(Z, A_k \cap Z)$ satisfy the inequality (1.31) with $\ell = 1$, as long as $\nu_k \le 1$.
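The bound on the number of remaining steps, $k - k_0 < \log \beta_{k_0}$, can be illustrated with a small numerical experiment. This is a sketch only, under assumptions not stated in the text: that $\nu$ is multiplied by exactly $L^2$ at each step, and that $\nu_{k_0}$ sits just above its lower bound $\frac{7}{8}L^{-d}\beta_{k_0}^{-1}$ (the factor `slack` and the function name are hypothetical).

```python
import math


def remaining_steps(L, d, beta_k0, slack=1.0001):
    """Count steps k - k0 until nu_k first exceeds (7/8) L^-2.

    Assumptions: nu_{j+1} = L^2 * nu_j, and nu_{k0} = slack * (7/8) L^-d / beta_k0.
    Returns the number of steps and the final value nu_k.
    """
    nu = slack * (7 / 8) * L ** -d / beta_k0
    steps = 0
    while nu <= (7 / 8) * L ** -2:
        nu *= L ** 2
        steps += 1
    return steps, nu


if __name__ == "__main__":
    for L, d, beta in [(3, 3, 1e6), (2, 3, 1e4)]:
        steps, nu = remaining_steps(L, d, beta)
        print(L, d, beta, steps, nu, math.log(beta))
```

For the sample parameters the step count stays below $\log \beta_{k_0}$ and the final $\nu_k$ does not exceed 1, in line with the condition $\nu_k \le 1$ of Theorem 4; the comparison is meaningful only for $\beta_{k_0}$ large, as assumed throughout.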
This theorem ends the description of the whole sequence of densities obtained by applying the renormalization transformations to the original density (1.1)[1]. From a technical mathematical point of view this is the most important part of the analysis of the models, and most of the statements connected with the low temperature “spin wave” picture described in the Introduction to paper [1] are simple consequences of this description. Let us formulate some of them. By the normalization properties of the renormalization transformations we have
\[ \int d\psi_k\, \rho_k = \int d\psi\, \rho_0 = \exp F(T; h, g; \nu), \tag{1.45} \]
where $F(T; h, g; \nu)$ is the generating functional of truncated correlation functions on the torus $T$. We show in Sect. 4 of the third paper that if $k$ is the final index defined above, for which $\nu_k > \frac{7}{8} L^{-2}$, then the first integral in (1.45) has the same properties, and can be performed in the same way, as the previous integrals connected with the renormalization transformations. From this we obtain that the generating functional $F(T; h, g; \nu)$ has the representation described in the inductive hypothesis (H.7)[4], but with $\psi_k = h$, which means that also $\phi_k = h$ and $\psi_k^{(j)} = h$. From the properties of this representation it follows immediately that there exists the limit
\[ F(h, g; 0+) = \lim_{\nu \searrow 0}\ \lim_{T \to \mathbb{Z}^d} F(T; h, g; \nu), \tag{1.46} \]
which has the same representation but with $k = \infty$ and with the classes $D_j(T)$ of the localization domains replaced by $D_j(\mathbb{R}^d)$. From such a representation we obtain that the one-point function $\langle\phi(x)\rangle$ is non-zero, actually close to $h$ for $\beta$ large, and the remaining truncated correlation functions converge to 0 as distances between points increase to $\infty$, with a “power-law” convergence. This means that the correlation functions determine the pure phase of the model. To obtain the more precise properties of the functions, formulated in [1] as the “spin-wave picture”, like “Goldstone boson” behavior etc., we have to develop a still more precise description of the generating functional. This will be done in the forthcoming last paper of the series, written together with Michael O’Carroll. We prove Theorems 1, 4 in the first two papers, Theorem 2 in the third paper, and Theorem 3 is a simple conclusion of the previous two.

2. The Transformation $T^{(k)}$: Introduction of Characteristic Functions and Determination of New “Small Field” Regions and Fluctuation Integrals

We apply the transformation $T^{(k)}$ to a density $\rho_k$, and we write
\[ T^{(k)}\rho_k = \sum_{Z_k, A_k} T^{(k)}(U_k^{\sim})\, \mathbf{T}_k(Z_k, A_k)\, T^{(k)}((U_k^{\sim})^c)\, \chi_k(Z_k^c) \exp\Bigl( \mathcal{A}_k(Z_k^c) + F_k(Z_k^c) + \mathbf{R}^{(k)}(Z_k^c) + B^{(k)}(Z_k, A_k) \Bigr), \qquad U_k = Z_k^{\approx}, \tag{2.1} \]
where we have used the fact that the integrations in the transformation $T^{(k)}((U_k^{\sim})^c)$ are independent of the variables in the other two integral transformations. The transformation $T^{(k)}(U_k^{\sim})$ becomes a part of the next integral operator $\mathbf{T}^{(k)}$. Our first goal is to introduce a new small field region and proper restrictions on a new spin variable $\theta$ on this region.
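Take the next partition $\pi_{k+1}$ into the large cubes in the next scale $L^{-1}\eta$, i.e. $LM$-cubes in the scale $\eta$, and define
\[ U_k' = \bigcup \{ \Box' : \Box' \in \pi_{k+1},\ \Box' \cap U_k \neq \emptyset \}. \tag{2.2} \]
From now on the operations “∼” and “≈” are taken in the $(k+1)$st scale, i.e. with respect to $LM$-cubes of the partition $\pi_{k+1}$. Next we define the domain
\[ \Omega_{k+1}' = \bigcup \{ \Box'' : \Box'' \in \pi_{k+1}',\ \mathrm{dist}_k(\Box'', U_k'^{\sim 3}) > LMR_k \}. \tag{2.3} \]
Let us recall that $\pi_{k+1}'$ is a cover of the torus $T$ by $L^2M$-cubes $\Box''$, which are unions of $L^d$ cubes from the partition $\pi_{k+1}$. Notice that
\[ \Omega_{k+1}' = \Bigl( \bigl( U_k'^{\sim 3+[R_k]+1+\frac{L-1}{2}} \bigr)^c \Bigr)^{\sim \frac{L-1}{2}}, \quad \text{hence } \Omega_{k+1}' \subset \bigl( U_k'^{\sim 4+[R_k]} \bigr)^c \ \text{ and } \ \bigl( U_k'^{\sim 4+[R_k]+\frac{L-1}{2}} \bigr)^c \subset \Omega_{k+1}'. \tag{2.4} \]
Notice also that for $\Box' \subset \Omega_{k+1}'$ we have $R_{k+1}(\Box') \subset (U_k'^{\sim 3})^c$ and $R_{k+1}(\Box')^{\sim} \subset (U_k'^{\sim 2})^c$. Now we define the functions
\[ \chi'^{(k)}(\Box') = \chi(\{ |\theta - Q\psi_k| < 2\delta_k \text{ on } \Box' \}) \tag{2.5} \]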
and we introduce the decomposition of unity on $\Omega_{k+1}'$,
\[ 1 = \prod_{\Box' \in \pi_{k+1} :\, \Box' \subset \Omega_{k+1}'} \bigl( \chi'^{(k)}(\Box') + \chi'^{(k)c}(\Box') \bigr) = \sum_{P_{k+1} \subset \Omega_{k+1}'} \chi'^{(k)}(P_{k+1}^c \cap \Omega_{k+1}')\, \chi'^{(k)c}(P_{k+1}) \tag{2.6} \]
under the integral transformation $T^{(k)}((U_k^{\sim})^c)$. The sum above is over subdomains $P_{k+1}$ of $\Omega_{k+1}'$, which are unions of large cubes from $\pi_{k+1}$. The function $\chi'^{(k)}(P_{k+1}^c \cap \Omega_{k+1}')$ is defined as a product of (2.5) over the cubes contained in $P_{k+1}^c \cap \Omega_{k+1}'$. The function $\chi'^{(k)c}(P_{k+1})$ is a product of the complementary functions $\chi'^{(k)c}(\Box') = 1 - \chi'^{(k)}(\Box') = \chi(\{ |\theta - Q\psi_k| \geq 2\delta_k \text{ on a block in } \Box' \})$ over the cubes contained in $P_{k+1}$. On the domain $P_{k+1}^c \cap \Omega_{k+1}'$ we have both “small field” characteristic functions $\chi_k$, $\chi'^{(k)}$. Hence by Lemma 3.1 [1] we have that $\psi_k \in \tilde{\Psi}_k(P_{k+1}^c \cap \Omega_{k+1}'; 3\delta_k)$, and repeating the arguments of this lemma for the configurations $\theta$, $\psi_k$ we obtain the restrictions (2.3) in [4], that is
\[ \theta \in \tilde{\Psi}_k(P_{k+1}^c \cap \Omega_{k+1}'; (3L+4)\delta_k), \quad |\theta(y) - \psi_k(x)| < 3d(L-1)\delta_k \ \text{for } x \in B(y), \]
\[ \text{hence } |\theta(y) - \psi_k(x')| < 3dL\delta_k \ \text{if } x' \text{ is a nearest neighbour of an } x \in B(y). \tag{2.7} \]
For a term of the decomposition (2.6), i.e. for a domain $P_{k+1}$, we define
\[ \Omega_{k+1} = \bigcup \{ \Box'' : \Box'' \in \pi_{k+1}',\ \Box'' \subset \Omega_{k+1}',\ \Box''^{\sim 3} \cap P_{k+1} = \emptyset \}. \tag{2.8} \]
This is the next, $(k+1)$st domain we add to the sequence $\{\Omega_j\}$. Obviously it satisfies the conditions (1.1)–(1.3) [2], so the sequence $\{\Omega_1, \ldots, \Omega_k, \Omega_{k+1}\}$ determines a generating set $B_{k+1}$. Again it is easy to see that
\[ \Omega_{k+1} = \Omega_{k+1}' \cap \Bigl( \bigl( P_{k+1}^{\sim 3+\frac{L-1}{2}} \bigr)^c \Bigr)^{\sim \frac{L-1}{2}}, \quad \text{hence } \Omega_{k+1} \subset \Omega_{k+1}' \cap \bigl( P_{k+1}^{\sim 3} \bigr)^c \ \text{ and } \ \Omega_{k+1}' \cap \bigl( P_{k+1}^{\sim 3+\frac{L-1}{2}} \bigr)^c \subset \Omega_{k+1}. \tag{2.9} \]
Notice that the operation “∼3” in the above definitions is to some extent arbitrary. We apply it universally in geometric definitions, although we really need it only in some cases involving functions of the type $\chi_k$. The domain $\Omega_{k+1}$ plays a fundamental role in our constructions. We define on it a decomposition of $\psi_k$ into a fluctuation variable and a background configuration. This background configuration depends on a variable which we denote by $\tilde{\theta}$ from now on, and which is equal to $\theta$ on $\Lambda_{k+1} = \Omega_{k+1}^{(k+1)}$, and to $\psi_k$ on $\Lambda_k \cap \Omega_{k+1}^c$, actually on a subset of it, as will become clear below. Let us clarify that now we take the definition of the sets $\Lambda_j$, recalled before (1.20), for $k+1$ instead of $k$. Before defining this decomposition we introduce some stronger restrictions on $\theta$, i.e. stronger than (2.7). Notice that from (2.7) and the characteristic functions $\chi_k(Z_k^c)$ we conclude
\[ \theta \in \tilde{\Psi}_k(B_{k+1} \cap W_k^{\sim}; 3dL\delta_k), \quad W_k = (Z_k^c)^{\approx 2}, \tag{2.10} \]
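on the domain of the characteristic functions involved. Take a cube $\Box' \subset \Omega_{k+1}^{\sim 3}$. We would like to define on $\Box'^{\sim 2}$ a function of the type $\phi_{k+1}$ of the configuration $\theta$, so we must define a corresponding generating set taking into account the different scales on $\Omega_{k+1}$ and $\Omega_{k+1}^c$. We take the usual generating set $B_{k+1}(\Box'^{\sim 2})$ defined by (4.4)–(4.6) [3], with the size of large cubes equal to the fixed constant $M_1$. This set determines a sequence of domains $\{\Omega_j(\Box'^{\sim 2})\}$, and we define
\[ B_{k+1}(\Omega_{k+1} \cap \Box'^{\sim 2}, \Box'^{\sim 2}) \ \text{is the generating set determined by the sequence of domains } \{ \Omega_1(\Box'^{\sim 2}), \ldots, \Omega_k(\Box'^{\sim 2}), \Omega_{k+1} \cap \Box'^{\sim 2} \}. \tag{2.11} \]
We construct the configurations $\tilde{\theta}$ on this set in the usual way. Every point $x$ of this set uniquely determines either a unit cube $\Delta_k(y)$, if $x \in \Omega_{k+1}^c$, or an $L$-cube $\Delta_{k+1}(z)$, if $x \in \Omega_{k+1}$, to which it belongs. We put in the first case $\tilde{\theta}(x) = \theta(y) = \psi_k(y)$, and in the second case $\tilde{\theta}(x) = \theta(z)$. For the generating sets (2.11) and the configurations $\tilde{\theta}$ we define the functions
\[ \phi_{k+1}(\Box'; \tilde{\theta}) = \phi_{k+1}(B_{k+1}(\Omega_{k+1} \cap \Box'^{\sim 2}, \Box'^{\sim 2}); \tilde{\theta}), \tag{2.12} \]
and the characteristic functions
\[ \begin{aligned} \chi_{k+1}'(\Box') = \chi\Bigl( \Bigl\{ & |\tilde{\theta} - Q(B_{k+1})\phi_{k+1}(\Box'; \tilde{\theta})| < \tfrac{3}{4}\delta_k \ \text{on } \Box'^{\sim} \cap \Omega_{k+1}^c \ \text{and } < \tfrac{5}{4}\delta_k \ \text{on } \Box'^{\sim} \cap \Omega_{k+1}, \\ & |\partial^{\eta}\phi_{k+1}(\Box'; \tilde{\theta})|,\ |\Delta^{\eta}\phi_{k+1}(\Box'; \tilde{\theta})|,\ |\alpha_{k+1}(\Box'; \tilde{\theta})| < \tfrac{3}{4}\delta_k, \\ & |\phi_{k+1}(\Box'; \tilde{\theta}) - h| < \tfrac{3}{4}\frac{\delta_k}{\sqrt{\nu_k}} \ \text{on } \Box'^{\sim}, \quad |h^2 - 1| < \tfrac{9}{16}\frac{\delta_k^2}{\nu_k} \Bigr\} \Bigr). \end{aligned} \tag{2.13} \]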
If k ≤ k0 , that is if (2.9) holds, then we omit the expressions and the conditions involving the vector h in the variational problem determining the functions (2.12), and
in the characteristic functions (2.13). Notice that only the characteristic functions for $\Box'$ in the boundary layer $\Omega_{k+1}^{\sim 3} \cap (\Omega_{k+1}^c)^{\sim 3}$ depend on the variables $\theta$ for both scales. For the remaining cubes we have functions similar to $\chi_k$, but for the $(k+1)$st scale and with different bounds. If we took $\Box'$ outside $\Omega_{k+1}^{\sim 3}$, then we would obtain basically a function $\chi_k$, but on a larger cube and with stronger conditions. Now we introduce the second decomposition of unity
\[ 1 = \prod_{\Box' \in \pi_{k+1} :\, \Box' \subset \Omega_{k+1}^{\sim 3}} \bigl( \chi_{k+1}'(\Box') + \chi_{k+1}'^{c}(\Box') \bigr) = \sum_{Q_{k+1} \subset \Omega_{k+1}^{\sim 3}} \chi_{k+1}'(Q_{k+1}^c \cap \Omega_{k+1}^{\sim 3})\, \chi_{k+1}'^{c}(Q_{k+1}), \tag{2.14} \]
where the domains $Q_{k+1}$ are again unions of cubes from $\pi_{k+1}$. We insert this decomposition under the integral transformations in (2.1). Let us consider again the formula (2.1) with the above decompositions introduced into it. The sum over $P_{k+1}$ can be written as a double sum, the first over the domains $\Omega_{k+1}$ and the second over subdomains $P_{k+1}$ determining a fixed $\Omega_{k+1}$. We separate also a part of the last renormalization transformation in (2.1) localized outside $\Omega_{k+1}$ and combine it with the first one, and we obtain the equality
\[ \begin{aligned} T^{(k)}\rho_k = \sum_{\Omega_{k+1}, P_{k+1}, Q_{k+1}}\ \sum_{Z_k, A_k} & T^{(k)}(\Omega_{k+1}^c)\, \chi_k(Z_k^c \cap (\Omega_{k+1}^{\sim 3})^c)\, \chi'^{(k)c}(P_{k+1}) \\ & \cdot \chi'^{(k)}(P_{k+1}^c \cap \Omega_{k+1}' \cap \Omega_{k+1}^c)\, \mathbf{T}_k(Z_k, A_k) \Bigl\{ T^{(k)}(\Omega_{k+1})\, \chi_k(\Omega_{k+1}^{\sim 3})\, \chi'^{(k)}(\Omega_{k+1}) \\ & \cdot \chi_{k+1}'^{c}(Q_{k+1})\, \chi_{k+1}'(Q_{k+1}^c \cap \Omega_{k+1}^{\sim 3}) \\ & \cdot \exp\bigl( A_k(Z_k^c) + F_k(Z_k^c) + R^{(k)}(Z_k^c) + B^{(k)}(Z_k, A_k) \bigr) \Bigr\}. \end{aligned} \tag{2.15} \]
The domains $\Omega_{k+1}$, $P_{k+1}$, $Q_{k+1}$ become a part of a new “multi-index” $A_{k+1}$; in particular $\Omega_{k+1}$ becomes a part of $B_{k+1}$. The transformation $T^{(k)}(\Omega_{k+1}^c)$ together with the following characteristic functions form a part of an integral transformation $\mathbf{T}^{(k)}$. Consider now a term of the above sum and the expression in the curly brackets. This expression is given by the integral
\[ \begin{aligned} \int d\psi_k\!\restriction_{\Omega_{k+1}} & \chi_k(\Omega_{k+1}^{\sim 3})\, \chi'^{(k)}(\Omega_{k+1}) \cdot \chi_{k+1}'^{c}(Q_{k+1})\, \chi_{k+1}'(Q_{k+1}^c \cap \Omega_{k+1}^{\sim 3}) \\ & \cdot \exp\Bigl[ -\beta_k \Bigl\{ \tfrac{1}{2} a L^{-2} \|\theta - Q\psi_k\|^2_{\Lambda_{k+1}} + A_k^*(Z_k^c; \psi_k, \phi_k) \Bigr\} + \bigl( E_k(Z_k^c) + F_k(Z_k^c) + R^{(k)}(Z_k^c) + B^{(k)}(Z_k, A_k) \bigr) \\ & \qquad - E_k |Z_k^c| + \tfrac{1}{2} L^{-d} N \log\frac{\beta_k a L^{d-2}}{2\pi}\, |\Omega_{k+1}| \Bigr]. \end{aligned} \tag{2.16} \]
According to the method of [4] we should take a minimum of the function in the curly brackets with respect to $\psi_k$ restricted to $\Omega_{k+1}$, and expand the whole expression in the exponential around the minimum introducing proper fluctuation variables. This way we would obtain a fluctuation integral, which would be a starting point of further analysis. Unfortunately there are two problems with this prescription. The function in the curly brackets does not have a minimum defined naturally in terms of the variational
problems studied in [2], because the summation in $A_k^*(Z_k^c)$ is restricted to $Z_k^c$, so it does not cover the whole generating set $B_k((Z_k^c)^{\approx 2})$. If it did cover it, then we would have a special case of the variational problems (4.14) [2], and the unique minimum would be equal to
\[ \psi^{(k)}(\tilde{\theta}) = \psi(B_k(W_k), (B_k(W_k) \cap \Omega_{k+1}^c) \cup \Lambda_{k+1}; \tilde{\theta}), \tag{2.17} \]
where $W_k = (Z_k^c)^{\approx 2}$. Notice that the above configuration is equal to $\tilde{\theta}$ on $B_k(W_k) \cap \Omega_{k+1}^c$. Generally this configuration is not a minimum of the function in the curly brackets; nevertheless we can take it as a good approximation of the minimum, because $Z_k^c$ covers a very large neighbourhood of $\Omega_{k+1}$.

There is a second problem with the function (2.17). If we use it in a definition of fluctuation variables, then we introduce this function into the characteristic functions under the integral (2.16). Generally not all of these functions can be removed by introducing restrictions on fluctuation variables, and we have a serious problem because the remaining functions depend non-locally on the whole configuration $\tilde{\theta}$ on $(B_k(W_k) \cap \Omega_{k+1}^c) \cup \Lambda_{k+1}$ through the function (2.17). This destroys locality and analyticity properties of expressions obtained in the inductive procedure. To avoid this problem we replace (2.17), which is only an approximate minimum, by its approximation which is of sufficiently short range in $\tilde{\theta}$. We can obtain such an approximation easily by using the localization extensions of the function (2.17) constructed in Sect. 2 of [3], see Proposition 2.3. For a cube $\Box' \subset \Omega_{k+1}$ we take $X_0 = \Box'$, $X_1 = R_{k+1}(\Box')$. This pair of domains obviously satisfies the condition (2.14) [3], and we take the localization extension $\psi^{(k)}(\tilde{\theta}, s)$ constructed for this pair. The configuration
\[ \psi^{(k)}(\Box'; \theta) = \psi^{(k)}(\tilde{\theta}, 0) \ \text{on } \Box' \tag{2.18} \]
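is a very good approximation of (2.17) restricted to $\Box'$, because by the property (2.60) [3] in Proposition 2.3 [3] we have
\[ \begin{aligned} & |\delta\psi^{(k)}(\Box'; \tilde{\theta})| < \exp\bigl( -\tfrac{1}{4}\gamma_0 (LMR_k - 2M_1) \bigr) K_4\, 3dL\delta_k < \exp(-R_k) < \beta_k^{-K} \ \text{on } \Box', \\ & \delta\psi^{(k)}(\Box'; \tilde{\theta}) = \psi^{(k)}(\tilde{\theta}) - \psi^{(k)}(\tilde{\theta}, 0) \ \text{on } \Box', \end{aligned} \tag{2.19} \]
where $K$ is an arbitrary number, for $\beta_k$ large enough. Now we finally define a decomposition of $\psi_k$ on $\Omega_{k+1}^{(k)}$ into an approximate background configuration and fluctuation variables by the equality
\[ \psi_k = \psi^{(k)}(\Box'; \theta) + \beta_k^{-\frac{1}{2}} \psi' \ \text{on } \Box', \quad \Box' \subset \Omega_{k+1}. \tag{2.20} \]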
The fluctuation variables $\psi'$ are defined on the whole set $\Omega_{k+1}^{(k)}$, and it is easy to see from the restrictions introduced by the characteristic functions that $|\psi'| = O(p_0(\beta_k))$. Later we will introduce much stronger restrictions on $\psi'$. The background configuration (2.18) on $\Box'$ depends on $\theta$ restricted to the domain $R_{k+1}(\Box')$. This is the short range dependence which saves the inductive procedure. We make the substitution (2.20) in the integral (2.16), and we express this integral in terms of the fluctuation variables. This changes the constant in the exponential: we add the constant $-\frac{1}{2} N \log\beta_k\, |\Omega_{k+1}|$, and in all functions of $\psi_k$ we substitute the expression on the right hand side of (2.20). We study separately the functions in the curly brackets and in the parentheses. For the first function we write (2.20) in the form
\[ \begin{aligned} & \psi^{(k)}(\Box'; \theta) + \beta_k^{-\frac{1}{2}}\psi' = \psi^{(k)}(\tilde{\theta}) + \beta_k^{-\frac{1}{2}}\psi' - \delta\psi^{(k)}(\Box'; \theta) = \psi^{(k)}(\tilde{\theta}) + \delta\psi, \\ & \text{where } \delta\psi = \beta_k^{-\frac{1}{2}}\psi' - \delta\psi^{(k)}(\Box'; \theta) \ \text{on } \Box'. \end{aligned} \tag{2.21} \]
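This is a more convenient representation, because we can use various composition formulas and known expansions. In particular we have
\[ \begin{aligned} \phi_k(\psi^{(k)}(\tilde{\theta}) + \delta\psi) & = \phi(B_k(W_k); \psi(B_k(W_k), (B_k(W_k) \cap \Omega_{k+1}^c) \cup \Lambda_{k+1}; \tilde{\theta})) + \delta\phi(B_k(W_k); \delta\psi) \\ & = \phi((B_k(W_k) \cap \Omega_{k+1}^c) \cup \Lambda_{k+1}; \tilde{\theta}) + \delta\phi_k(\delta\psi) = \phi_{k+1}(\tilde{\theta}) + \delta\phi_k(\delta\psi), \end{aligned} \tag{2.22} \]
and an identical expansion for $\alpha_k(\psi^{(k)}(\tilde{\theta}) + \delta\psi)$. Next, we expand the main action in (2.16) using the ones above. Such an expansion will be used many times in the future, and we will write a general formula covering these cases. Let us take a generating set $B_k$ and a domain $\Omega$ such that $\Omega \cap (\Omega_j \setminus \Omega_{j+1})$ is a union of a set of $j$-cubes, i.e. $L^j\eta$-cubes in the scale $\eta$ with centers at points of $\Lambda_j$. Let us assume that configurations $\psi$, $\phi$ are defined on neighbourhoods of $B_k$, $\Omega_1$ correspondingly, and that $\psi = \psi_0 + \delta\psi$, $\phi = \phi_0 + \delta\phi$, where $\phi_0$ is a solution of the variational problem (1.6) [2] determined by a generating set $B_k$ and a configuration $\psi_0$. Then
\[ \begin{aligned} A_k^*(\Omega, B_k; \psi, \phi) = {} & A_k^*(\Omega, B_k; \psi_0, \phi_0) + \langle \delta\psi, a(B_k)(\psi_0 - Q(B_k)\phi_0) \rangle_{\Omega \cap B_k} + \delta A_k^*(\Omega, B_k; \delta\psi, \delta\phi) \\ & + \nu_k \langle \delta\phi, \phi_0 - h \rangle_{\Omega} + \bigl\langle \tfrac{1}{2}[(\delta\phi)_+ + (\delta\phi)_-], \partial^{\eta}\phi_0 \bigr\rangle_{st(\Omega)}, \end{aligned} \tag{2.23} \]
where
\[ \begin{aligned} \delta A_k^*(\Omega, B_k; \delta\psi, \delta\phi) = {} & \tfrac{1}{2} \langle \delta\psi - Q(B_k)\delta\phi, a(B_k)(\delta\psi - Q(B_k)\delta\phi) \rangle_{\Omega \cap B_k} + \tfrac{1}{2} \|\partial^{\eta}\delta\phi\|^2_{\Omega^*} \\ & + \tfrac{1}{2} \langle \delta\phi, (\alpha_0 + \nu_k)\delta\phi \rangle_{\Omega} + \frac{1}{2\lambda_k} \|\delta\alpha\|^2_{\Omega}, \\ \alpha_0 = \frac{\lambda_k}{2}(|\phi_0|^2 - 1), \quad & \delta\alpha = \frac{\lambda_k}{2}(2\phi_0 \cdot \delta\phi + |\delta\phi|^2), \end{aligned} \tag{2.24} \]
and
\[ \begin{aligned} & \bigl\langle \tfrac{1}{2}[(\delta\phi)_+ + (\delta\phi)_-], \partial^{\eta}\phi_0 \bigr\rangle_{st(\Omega)} = \sum_{\langle x, x' \rangle \in st(\Omega)} \eta^{d-1}\, \tfrac{1}{2}(\delta\phi(x') + \delta\phi(x)) \cdot (\partial^{\eta}\phi_0)(\langle x, x' \rangle), \\ & st(\Omega) = \{ \langle x, x' \rangle : x, x' \in T_{\eta},\ x, x' \text{ are nearest neighbours},\ x \in \Omega,\ x' \in \Omega^c \}. \end{aligned} \tag{2.25} \]
The first formula is written in the case when we do not include the term with $\nu_k$ in the variational problem; of course it is always included into the action. When we consider the full variational problem, i.e. for $k > k_0$, then the fourth term with $\nu_k$ does not appear on the right hand side of (2.23). We apply the above formulas to the main action in (2.16) and the expansions (2.21), (2.22), and we obtain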
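\[ \begin{aligned} & \tfrac{1}{2} a L^{-2} \|\theta - Q\psi_k\|^2_{\Lambda_{k+1}} + A_k^*(Z_k^c; \psi_k, \phi_k) = \tfrac{1}{2} a L^{-2} \|\theta - Q\psi^{(k)}(\tilde{\theta})\|^2_{\Lambda_{k+1}} \\ & \quad - a L^{-2} \langle Q\delta\psi, \theta - Q\psi^{(k)}(\tilde{\theta}) \rangle_{\Lambda_{k+1}} + \tfrac{1}{2} a L^{-2} \|Q\delta\psi\|^2_{\Lambda_{k+1}} \\ & \quad + A_k^*(Z_k^c; \psi^{(k)}(\tilde{\theta}), \phi_{k+1}(\tilde{\theta})) + a_k \langle \delta\psi, \psi^{(k)}(\tilde{\theta}) - Q_k\phi_{k+1}(\tilde{\theta}) \rangle_{B(\Lambda_{k+1})} \\ & \quad + \delta A_k^*(Z_k^c; \delta\psi, \delta\phi_k(\delta\psi)) + \nu_k \langle \delta\phi_k(\delta\psi), \phi_{k+1}(\tilde{\theta}) - h \rangle_{Z_k^c} \\ & \quad + \bigl\langle \tfrac{1}{2}[(\delta\phi_k)_+ + (\delta\phi_k)_-], \partial^{\eta}\phi_{k+1}(\tilde{\theta}) \bigr\rangle_{st(Z_k^c)} \\ & = A_{k+1}^{\eta *}(Z_k^c, B_{k+1}; \theta, \phi_{k+1}(\tilde{\theta})) + \nu_k \langle \delta\phi_k(\delta\psi), \phi_{k+1}(\tilde{\theta}) - h \rangle_{Z_k^c} \\ & \quad + \Bigl\{ \tfrac{1}{2} a L^{-2} \|Q\delta\psi\|^2_{\Lambda_{k+1}} + \delta A_k^*(Z_k^c; \delta\psi, \delta\phi_k(\delta\psi)) \Bigr\} \\ & \quad + \bigl\langle \tfrac{1}{2}[(\delta\phi_k)_+ + (\delta\phi_k)_-], \partial^{\eta}\phi_{k+1}(\tilde{\theta}) \bigr\rangle_{st(Z_k^c)}. \end{aligned} \tag{2.26} \]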
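The mechanism behind (2.26), namely that expanding a quadratic action around its minimum removes the terms linear in the fluctuation, can be checked on a one-variable toy model. The sketch below is only an illustration under strong simplifying assumptions: the scalar action `0.5*a*(theta - psi)**2 + 0.5*b*psi**2` is a hypothetical stand-in for the full expression, and the names `action`, `minimizer` and `expanded_action` are not from the paper.

```python
def action(theta, psi, a, b):
    """Toy scalar action: a block-averaging term plus a quadratic term in psi."""
    return 0.5 * a * (theta - psi) ** 2 + 0.5 * b * psi ** 2


def minimizer(theta, a, b):
    """Background configuration: the unique minimum of the toy action in psi."""
    return a * theta / (a + b)


def expanded_action(theta, delta_psi, a, b):
    """Action written as (value at the minimum) + (quadratic form in delta_psi).

    No term linear in delta_psi appears, because the expansion point is the minimum.
    """
    psi_star = minimizer(theta, a, b)
    return action(theta, psi_star, a, b) + 0.5 * (a + b) * delta_psi ** 2


if __name__ == "__main__":
    theta, a, b, beta = 1.3, 2.0, 0.7, 50.0
    psi_prime = 0.4                      # fluctuation variable
    delta_psi = beta ** -0.5 * psi_prime  # psi = psi_star + beta^(-1/2) * psi'
    psi = minimizer(theta, a, b) + delta_psi
    print(action(theta, psi, a, b), expanded_action(theta, delta_psi, a, b))
```

In the toy model the two printed numbers agree up to rounding; in the text the same cancellation of linear terms holds only approximately, since (2.17) is an approximate minimum.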
Here we have used the fact that the first and fourth terms on the right hand side of the first equality above combined together yield the action $A_{k+1}^{\eta *}$. This follows from the arguments and formulas between (2.4) and (2.19) in [4], which can be easily generalized to an arbitrary $B_k$, but also from the representation (4.21) [2] for the function $\psi^{(k)}(\tilde{\theta})$, and from the equation (2.9) [4] satisfied by the coefficients $a_k$, $a_{k+1}$. The sum of the two linear terms in $\delta\psi$ on the right-hand side of the first equality vanishes, because we expand around the minimum in $\psi_k$, but also because of the same representation and equation as above. The term with $\nu_k$ again does not appear if $k > k_0$.

Let us analyze now the terms on the right-hand side of the last equality. The first term is obvious: it determines the $(k+1)$st main action, after additional restrictions and transformations discussed later, and the corresponding exponential factor can be taken before the integral in (2.16). The expression in the curly brackets is represented by the formula (2.24) in terms of the functions $\delta\phi_k(\delta\psi)$, $\delta\alpha_k(\delta\psi)$. We expand them into linear and higher order functions, exactly as in (2.20) [4], using corresponding variational equations, and we obtain the formulas (2.21), (2.22) [4] with the operator $G_k(\alpha)$ given now by
\[ G_k(\alpha) = \bigl( -\Delta^{\eta, D}_{\Omega_1(W_k)} + Q^*(B_k)\, a(B_k)\, Q(B_k) + \alpha \bigr)^{-1}, \quad B_k = B_k(W_k), \tag{2.27} \]
where for $k > k_0$ we replace $\alpha$ by $\nu_k + \alpha$. Substituting these expansions into the formula (2.24) we obtain
\[ \delta A_k^*(Z_k^c; \delta\psi, \delta\phi_k(\delta\psi)) = \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(Z_k^c)\delta\psi \rangle + V^{(k)}(Z_k^c; \delta\psi), \tag{2.28} \]
\[ \begin{aligned} \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(Z_k^c)\delta\psi \rangle = {} & \tfrac{1}{2} a_k \|\delta\psi - Q_k\delta\phi_k^{(1)}(\delta\psi)\|^2_{Z_k^c} + \tfrac{1}{2} \|\partial^{\eta}\delta\phi_k^{(1)}(\delta\psi)\|^2_{Z_k^{c*}} \\ & + \tfrac{1}{2} \langle \delta\phi_k^{(1)}(\delta\psi), (\alpha_{k+1}(\tilde{\theta}) + \nu_k)\delta\phi_k^{(1)}(\delta\psi) \rangle_{Z_k^c} + \frac{1}{2\lambda_k} \|\delta\alpha_k^{(1)}(\delta\psi)\|^2_{Z_k^c}, \end{aligned} \tag{2.29} \]
\[ \begin{aligned} V^{(k)}(Z_k^c; \delta\psi) = {} & -a_k \langle \delta\psi - Q_k\delta\phi_k^{(1)}(\delta\psi), Q_k\delta\phi_{k,2}(\delta\psi) \rangle_{Z_k^c} + \langle \partial^{\eta}\delta\phi_k^{(1)}(\delta\psi), \partial^{\eta}\delta\phi_{k,2}(\delta\psi) \rangle_{Z_k^{c*}} \\ & + \langle \delta\phi_k^{(1)}(\delta\psi), (\alpha_{k+1}(\tilde{\theta}) + \nu_k)\delta\phi_{k,2}(\delta\psi) \rangle_{Z_k^c} + \frac{1}{\lambda_k} \langle \delta\alpha_k^{(1)}(\delta\psi), \delta\alpha_{k,2}(\delta\psi) \rangle_{Z_k^c} \\ & + \delta A_k^*(Z_k^c; 0, \delta\phi_{k,2}(\delta\psi)). \end{aligned} \tag{2.30} \]
The function (2.30) has most of the general properties of the function (2.27) [4] discussed in [4], with some obvious modifications, so we do not repeat them here. We will come back to them in the next paper, where we construct the remaining localization expansions. There is a problem with the quadratic form (2.29), because the domain of “integration” $Z_k^c$ does not cover the support $\Omega_1(W_k)$ of the generating set $B_k$, as is needed for formulas (3.7)–(3.9) [3] to hold. We could generalize the construction of the localization extensions in Sect. 3 of [3] to the operator given by (2.29), but the simplest way is to extend the expression on the right hand side of (2.29) to the whole domain $\Omega_1(W_k)$ by adding and subtracting the corresponding norms restricted to $\Omega_1(W_k) \cap Z_k$, or to $Z_k$, because all the functions are naturally supported in $\Omega_1(W_k)$ by (2.27). Thus we write
\[ \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(Z_k^c)\delta\psi \rangle = \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}\delta\psi \rangle - \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(\Omega_1(W_k) \cap Z_k)\delta\psi \rangle, \tag{2.31} \]
and the quadratic form $\frac{1}{2} \langle \delta\psi, \Delta^{(k)}\delta\psi \rangle$ is given by the second equality in (2.23) [4] with the operator $G_k(\alpha)$ given by (2.27) above, or by the integral representation (3.7) [3]. The second quadratic form is given by (2.29) with the norms restricted to $Z_k$, the configuration $\delta\psi$ is supported in $\Omega_{k+1}$, and the domains $Z_k$, $\Omega_{k+1}$ are well separated, by a distance greater than $(L+1)MR_k + 2LM$. Hence by the exponential decay of the functions in (2.29) the matrix elements of the form are very small, of the order $O(\exp(-R_k))$. This quadratic form contributes to the “boundary terms” only, and it is convenient to write it as a sum of forms localized to cubes $\Box \in \pi_k$ such that $\Box \subset Z_k$, $\Box \cap \Omega_1(W_k) \neq \emptyset$, i.e. we write
\[ \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(\Omega_1(W_k) \cap Z_k)\delta\psi \rangle = \sum_{\Box \in \pi_k :\, \Box \subset W_k^{\sim} \cap Z_k} \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(\Box)\delta\psi \rangle. \tag{2.32} \]
We can construct localization expansions for the localized forms on the right-hand side, using the results of [3], and it is easy to see that $\frac{1}{2} \langle \delta\psi, \Delta^{(k)}(\Box)\delta\psi \rangle = O(\exp(-R_k)|\delta\psi|^2)$.

Consider finally the last term in (2.26). It is a surface term, a sum over all bonds intersecting the surface $\partial Z_k^c$, where we consider the domain in the continuous space $T$. It is convenient to decompose this surface into walls of large cubes from $\pi_k$, and this yields a decomposition of this term into a sum of surface terms, for which we sum over bonds intersecting a given wall. The point is that the function $\frac{1}{2}[\delta\phi_k(x'; \delta\psi) + \delta\phi_k(x; \delta\psi)]$ for such a bond is very small, because the bond $\langle x, x' \rangle$ is well separated from the support of the configuration $\delta\psi$ by a distance greater than $(L+1)MR_k + 2LM$. Thus by (2.55) [3] the surface term corresponding to a given wall can be bounded by
\[ \tfrac{3}{2} M^{d-1} K_3 \exp\bigl( -\tfrac{1}{4}\gamma_0 ((L+1)MR_k + 2LM - 2M_1) \bigr) |\delta\psi| \leq \exp(-R_k)|\delta\psi|. \]
Because of the localization the surface term contributes to the boundary terms. Combining now all the above expansions and formulas we obtain the following representation for the main action in (2.16):
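\[ \begin{aligned} & \tfrac{1}{2} a L^{-2} \|\theta - Q\psi_k\|^2_{\Lambda_{k+1}} + A_k^*(Z_k^c; \psi_k, \phi_k) = A_{k+1}^{\eta *}(B_{k+1}, Z_k^c; \theta, \phi_{k+1}(\tilde{\theta})) \\ & \quad + \tfrac{1}{2} \bigl\langle \delta\psi, (a L^{-2} Q^*Q + \Delta^{(k)})\!\restriction_{B(\Lambda_{k+1})} \delta\psi \bigr\rangle - \tfrac{1}{2} \langle \delta\psi, \Delta^{(k)}(\Omega_1(W_k) \cap Z_k)\delta\psi \rangle_{B(\Lambda_{k+1})} \\ & \quad + \nu_k \langle \delta\phi_k(\delta\psi), \phi_{k+1}(\tilde{\theta}) - h \rangle_{Z_k^c} + V^{(k)}(Z_k^c; \delta\psi) \\ & \quad + \bigl\langle \tfrac{1}{2}[(\delta\phi_k(\delta\psi))_+ + (\delta\phi_k(\delta\psi))_-], \partial^{\eta}\phi_{k+1}(\tilde{\theta}) \bigr\rangle_{st(Z_k^c)}. \end{aligned} \tag{2.33} \]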
The first quadratic form on the right-hand side above is the basic one, determining a Gaussian integral with the covariance $C^{(k)}_{B(\Lambda_{k+1})}$ defined and analyzed in Sect. 3 [3]. As in that section we omit the set $B(\Lambda_{k+1})$, and we denote it simply by $C^{(k)}$. Now according to the method of [4] we should make the change of variables $\psi' = C^{(k)\frac{1}{2}}\psi$. There are the same problems with it as with the function (2.17) before. If we introduce it into the characteristic functions in (2.16) we obtain the same non-local dependence on the whole configuration $\tilde{\theta}$. To avoid this problem we again use a short range part of the operator. In Sect. 3 of [3] we have constructed the “localization extensions” of this operator, which we combine with the corresponding localization extensions of the functions $\phi_{k+1}(\tilde{\theta})$, $\alpha_{k+1}(\tilde{\theta})$ on which it depends. For a cube $\Box' \subset \Omega_{k+1}$ we take $X_0 = \Box'$, $X_1 = R_{k+1}(\Box')$, and we construct the localization extension for this pair of domains, which we denote by $C^{(k)\frac{1}{2}}(\Box', s)$. The operator $C^{(k)\frac{1}{2}}(\Box', 0)$ is a very good approximation of $C^{(k)\frac{1}{2}}$ on $\Box'$; in fact we have again
\[ \bigl| \chi_{\Box'} \bigl( C^{(k)\frac{1}{2}} - C^{(k)\frac{1}{2}}(\Box', 0) \bigr)\psi \bigr| \leq \exp\bigl( -\tfrac{1}{4}\gamma_0 (LMR_k - 2M_1) \bigr) B_5 |\psi| \leq \exp(-R_k)|\psi|. \tag{2.34} \]
The kernel of $C^{(k)}(\Box', 0)$ has a support in $R_{k+1}(\Box') \cap B(\Lambda_{k+1})$, and it depends on $\theta$ restricted to $R_{k+1}(\Box')$. Taking this approximation of $C^{(k)}$ we make the change of variables
\[ \psi' = C^{(k)\frac{1}{2}}(\Box', 0)\psi \ \text{on } \Box', \quad \Box' \subset \Omega_{k+1}. \tag{2.35} \]
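The role of the exact change of variables $\psi' = C^{\frac{1}{2}}\psi$ can be illustrated numerically in two dimensions. The following sketch is not part of the construction: the $2 \times 2$ matrix `quad` is an arbitrary stand-in for the operator $aL^{-2}Q^*Q + \Delta^{(k)}$, the covariance is taken to be its inverse, and all helper names are hypothetical. It checks that the substitution turns the quadratic form $\frac{1}{2}\langle \psi', C^{-1}\psi' \rangle$ into the ultra-local form $\frac{1}{2}|\psi|^2$, and that the Jacobian contributes $\log\det C^{\frac{1}{2}} = -\frac{1}{2}\log\det C^{-1}$.

```python
import math


def sqrt_spd_2x2(m):
    """Principal square root of a symmetric positive definite 2x2 matrix.

    Uses the closed form sqrt(M) = (M + s*I) / t with s = sqrt(det M) and
    t = sqrt(trace M + 2*s), which follows from the Cayley-Hamilton theorem.
    """
    (a, b), (c, d) = m
    s = math.sqrt(a * d - b * c)
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def whitened_form(quad, psi):
    """Value of (1/2) <psi', quad psi'> after substituting psi' = C^(1/2) psi."""
    root = sqrt_spd_2x2(inverse_2x2(quad))
    psi_prime = apply(root, psi)
    return 0.5 * dot(psi_prime, apply(quad, psi_prime))


def log_jacobian(quad):
    """log det C^(1/2), the constant produced by the change of variables."""
    root = sqrt_spd_2x2(inverse_2x2(quad))
    return math.log(root[0][0] * root[1][1] - root[0][1] * root[1][0])


if __name__ == "__main__":
    quad = [[2.0, 0.5], [0.5, 1.0]]
    psi = [0.3, -1.2]
    print(whitened_form(quad, psi), 0.5 * dot(psi, psi))
    print(log_jacobian(quad), -0.5 * math.log(2.0 * 1.0 - 0.5 * 0.5))
```

The localized operator $C^{(k)\frac{1}{2}}(\Box', 0)$ used in (2.35) only approximates this exact square root, which is why additional small terms appear after the substitution.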
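For the configuration $\delta\psi$ given by the second formula in (2.21) we obtain
\[ \begin{aligned} \delta\psi & = \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}(\Box', 0)\psi - \delta\psi^{(k)}(\Box') = \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}\psi - \beta_k^{-\frac{1}{2}} \delta C^{(k)\frac{1}{2}}(\Box')\psi - \delta\psi^{(k)}(\Box') \\ & = \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}\psi - \delta'^{(k)}(\beta_k^{-\frac{1}{2}}\psi) \ \text{on } \Box', \ \text{where} \\ \delta C^{(k)\frac{1}{2}}(\Box') & = \chi_{\Box'} \bigl( C^{(k)\frac{1}{2}} - C^{(k)\frac{1}{2}}(\Box', 0) \bigr), \\ \delta'^{(k)}(\psi) & = \delta C^{(k)\frac{1}{2}}(\Box')\psi + \delta\psi^{(k)}(\Box') \ \text{on } \Box'. \end{aligned} \tag{2.36} \]
The function $\delta'^{(k)}(\psi)$ is an affine function of $\psi$, which is uniformly very small, of the order $O(\exp(-R_k))$, as follows from (2.19), (2.34). We substitute the decomposition (2.36) into the function on the right-hand side of (2.33). We do not write the whole expression, only the part coming from the first quadratic form. It is equal to
\[ \begin{aligned} & \tfrac{1}{2} \beta_k^{-1} \|\psi\|^2 - \beta_k^{-\frac{1}{2}} \bigl\langle \psi, C^{(k)\frac{1}{2}} (a L^{-2} Q^*Q + \Delta^{(k)})\, \delta'^{(k)}(\beta_k^{-\frac{1}{2}}\psi) \bigr\rangle \\ & \quad + \tfrac{1}{2} \bigl\langle \delta'^{(k)}(\beta_k^{-\frac{1}{2}}\psi), (a L^{-2} Q^*Q + \Delta^{(k)})\, \delta'^{(k)}(\beta_k^{-\frac{1}{2}}\psi) \bigr\rangle, \end{aligned} \tag{2.37} \]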
where we have omitted the subscript “$B(\Lambda_{k+1})$” in the norm and the scalar products. The first term above is the basic ultra-local quadratic form defining the Gaussian fluctuation integral, the same as in (2.41) [4], except that it is localized now to $B(\Lambda_{k+1})$. The remaining two terms are small, as are all the remaining terms on the right hand side of (2.33), for various reasons discussed before. Let us denote the sum of all these small terms by $V^{(k)}_{\mathrm{eff}}(\beta_k^{-\frac{1}{2}}\psi)$; of course it is also a function of $\tilde{\theta}$. It is defined in terms of the background configurations and their expansions, so it is an analytic function of $\theta$ on a rather large analyticity domain, e.g. on $\tilde{\Psi}^c(W_k^{\sim}; \delta, \varepsilon)$, $\delta \leq c_7$, $\varepsilon \leq c_0$. We do not discuss the analyticity problems in this section, except for occasional remarks like the above one, but we will come back to them in the “localization” section in the next paper. Combining all the operations and expansions we obtain the final form of the expansion of the main action in (2.16),
\[ \tfrac{1}{2} a L^{-2} \|\theta - Q\psi_k\|^2_{\Lambda_{k+1}} + A_k^*(Z_k^c; \psi_k, \phi_k) = A_{k+1}^{\eta *}(B_{k+1}, Z_k^c; \theta, \phi_{k+1}(\tilde{\theta})) + \tfrac{1}{2} \beta_k^{-1} \|\psi\|^2_{B(\Lambda_{k+1})} + V^{(k)}_{\mathrm{eff}}(\beta_k^{-\frac{1}{2}}\psi). \tag{2.38} \]
Unfortunately only the notation is simple here; the function $V^{(k)}_{\mathrm{eff}}$ has a quite complicated structure, its terms have different properties, and we will have to analyze them again in the future. Let us recall that the change of the variables $\psi_k$ is given by
\[ \begin{aligned} & \psi_k = \psi^{(k)}(\Box'; \theta) + \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}(\Box', 0)\psi \ \text{on } \Box', \quad \Box' \subset \Omega_{k+1}, \ \text{or} \\ & \psi_k = \psi^{(k)}_{\mathrm{loc}}(\theta) + \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}_{\mathrm{loc}}\psi = \psi^{(k)}(\tilde{\theta}) + \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}\psi - \delta'^{(k)}(\beta_k^{-\frac{1}{2}}\psi), \end{aligned} \tag{2.39} \]
with the obvious definitions of the function and the operator with the subscript “loc”. This change of variables performed in the integral (2.16) yields the following new term in the exponential:
\[ -\tfrac{1}{2} N \log\beta_k\, |\Omega_{k+1}| + \log\det C^{(k)\frac{1}{2}}_{\mathrm{loc}}. \tag{2.40} \]
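The operator $C^{(k)\frac{1}{2}}_{\mathrm{loc}}$ is a small perturbation of $C^{(k)\frac{1}{2}}$, and we would like to represent the second term above as a small perturbation of the corresponding term in (2.41) [4]. We have
\[ \begin{aligned} \log\det C^{(k)\frac{1}{2}}_{\mathrm{loc}} & = \log\det C^{(k)\frac{1}{2}} + \int_0^1 dt\, \frac{d}{dt} \log\det\bigl( t C^{(k)\frac{1}{2}}_{\mathrm{loc}} + (1-t) C^{(k)\frac{1}{2}} \bigr) \\ & = \mathrm{Tr}\log C^{(k)\frac{1}{2}} - \int_0^1 dt\, \mathrm{Tr}\bigl[ \delta C^{(k)\frac{1}{2}} \bigl( t C^{(k)\frac{1}{2}}_{\mathrm{loc}} + (1-t) C^{(k)\frac{1}{2}} \bigr)^{-1} \bigr] \\ & = \tfrac{1}{2} \mathrm{Tr}\log C^{(k)} - \int_0^1 dt\, \mathrm{Tr}\bigl[ (C^{(k)})^{-\frac{1}{2}} \delta C^{(k)\frac{1}{2}} \bigl( 1 - t (C^{(k)})^{-\frac{1}{2}} \delta C^{(k)\frac{1}{2}} \bigr)^{-1} \bigr] \\ & = \tfrac{1}{2} \mathrm{Tr}\log C^{(k)} - \sum_{n=0}^{\infty} \frac{1}{n+1} \mathrm{Tr}\bigl[ (C^{(k)})^{-\frac{1}{2}} \delta C^{(k)\frac{1}{2}} \bigr]^{n+1} \\ & = -\tfrac{1}{2} \mathrm{Tr}\log\bigl( (a L^{-2} Q^*Q + \Delta^{(k)})\!\restriction_{B(\Lambda_{k+1})} \bigr) - \delta D^{(k)} = -\tfrac{1}{2} D^{(k)} - \delta D^{(k)}. \end{aligned} \tag{2.41} \]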
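The series step in (2.41) rests on the identity $\log\det(A - D) = \log\det A - \sum_{n \geq 0} \frac{1}{n+1} \mathrm{Tr}[(A^{-1}D)^{n+1}]$, valid when $A^{-1}D$ is small. A minimal numerical check for $2 \times 2$ matrices is sketched below; the matrices `root` (playing the role of $C^{(k)\frac{1}{2}}$) and `delta` (playing the role of $\delta C^{(k)\frac{1}{2}}$) are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def trace(a):
    return a[0][0] + a[1][1]


def log_det_via_series(root, delta, terms=60):
    """log det(root - delta) computed as
    log det(root) - sum_{n>=0} Tr[X^(n+1)] / (n+1), with X = root^(-1) delta.
    The truncated series converges only when X is small."""
    x = matmul(inverse(root), delta)
    power = x                      # holds X^(n+1) at step n
    total = math.log(det(root))
    for n in range(terms):
        total -= trace(power) / (n + 1)
        power = matmul(power, x)
    return total


def log_det_direct(root, delta):
    diff = [[root[i][j] - delta[i][j] for j in range(2)] for i in range(2)]
    return math.log(det(diff))


if __name__ == "__main__":
    root = [[2.0, 0.3], [0.3, 1.5]]
    delta = [[0.05, -0.02], [0.01, 0.03]]
    print(log_det_via_series(root, delta), log_det_direct(root, delta))
```

In the text the perturbation is of the order $O(\exp(-R_k))$ by (2.34), so the analogous series defining $\delta D^{(k)}$ converges very rapidly.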
The term $D^{(k)}$ is a trace of the second operator (3.22) [3], which we have analyzed in [3], and whose localization extension was constructed there. The term $\delta D^{(k)}$ is very small, as follows from (2.34), and its localization extension can be easily constructed on the basis of the methods and results of [3]. Using these we construct localization expansions for these terms, in particular for $-\frac{1}{2} D^{(k)}$ and for the term $\frac{1}{2} \log\det C^{(k)}$ in (2.41) [4], where the operator $C^{(k)}$ is defined on the whole lattice. The important point here is that terms of these two localization expansions, with the same localization domains contained inside the domain $\Omega_{k+1}$, are equal. This property holds for other expressions also, and we will discuss it again in more detail.

Our next step is to introduce stronger restrictions on the ultra-local fluctuation variables $\psi$, restrictions of the form (2.40) [4]. On cubes $\Box'$ contained in the domain $\Omega_{k+1}$ we define
\[ \chi^{(k)}(\Box') = \chi(\{ |\psi| < p_1(\beta_k) \text{ on } \Box' \}), \ \text{where } p_1(\beta_k) = A_1 (\log\beta_k)^{p_1}, \quad p_1 \leq p_0, \tag{2.42} \]
and we introduce the decomposition of unity
\[ 1 = \prod_{\Box' \in \pi_{k+1} :\, \Box' \subset \Omega_{k+1}} \bigl( \chi^{(k)}(\Box') + \chi^{(k)c}(\Box') \bigr) = \sum_{R_{k+1} \subset \Omega_{k+1}} \chi^{(k)}(R_{k+1}^c \cap \Omega_{k+1})\, \chi^{(k)c}(R_{k+1}) \tag{2.43} \]
under the integral (2.16). In [4] we have shown that the characteristic functions (2.42), or (2.40) [4], allowed us to remove the functions $\chi_k$, $\chi'^{(k)}$, because they are equal to 1 on the domain of the remaining characteristic functions. This removed a dependence on the new variables $\theta$ from the characteristic functions, which was an important part of the proof of analyticity properties. The functions (2.42) are introduced for the same reason here, except that we expect to have these properties outside the large field regions. To make them easier to see and prove we consider cubes that are well separated from these regions, e.g. by the distance $LMR_k$. We have the following lemma.

Lemma 2.1. If $\Box \in \pi_k$, $\Box \subset \Box' \in \pi_{k+1}$, $\Box' \subset \Omega_{k+1}^{\sim 3}$, $\Box' \subset Q_{k+1}^c$, and either $\Box'^{\sim} \cap \Omega_{k+1} = \emptyset$, or $R_{k+1}(\Box') \cap R_{k+1}^{\sim} = \emptyset$, then $\chi_k(\Box) = 1$. Similarly, if $\Box' \in \pi_{k+1}$, $\Box' \subset \Omega_{k+1}$, $\Box' \subset Q_{k+1}^c$ and $R_{k+1}(\Box') \cap R_{k+1}^{\sim} = \emptyset$, then $\chi'^{(k)}(\Box') = 1$.

Actually the lemma is true under a much weaker assumption that $\Box'^{\sim 2} \cap (Q_{k+1} \cup R_{k+1}) = \emptyset$, but it is not important; we have to take a next domain within the large distance from the large field regions anyway. We do not prove the lemma here. We will have to prove a more general statement in Sect. 2 of the third paper, from which the lemma will follow. We conclude from it that among the functions $\chi_k(\Omega_{k+1}^{\sim 3})$, $\chi'^{(k)}(\Omega_{k+1})$ the only ones left are the functions
\[ \chi_k\bigl( Q_{k+1} \cup ((R_{k+1}^{\sim})^{\approx} \cap \Omega_{k+1}^{\sim}) \bigr), \quad \chi'^{(k)}\bigl( (Q_{k+1} \cup (R_{k+1}^{\sim})^{\approx}) \cap \Omega_{k+1} \bigr). \tag{2.44} \]
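Notice that the domain under the second function above is contained in the domain under the first. By the definition of $\chi_k$ and the properties of the substitution (2.39) these functions depend on the variables $\theta$, $\psi$ restricted to the domain
\[ \bigl[ Q_{k+1} \cup ((R_{k+1}^{\sim})^{\approx} \cap \Omega_{k+1}^{\sim}) \bigr]^{\sim} \cup \Bigl( \bigl[ (Q_{k+1} \cup (R_{k+1}^{\sim})^{\approx}) \cap \Omega_{k+1} \bigr]^{\sim} \cap \Omega_{k+1}^{\sim} \Bigr)^{\approx}. \]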
It is a complicated formula, which does not make clear the structure of this domain. Fortunately we need only a part of it contained in $\Omega_{k+1}$, and this part is contained in $[(Q_{k+1}^{\sim})^{\approx} \cup (R_{k+1}^{\sim 2})^{\approx 2}] \cap \Omega_{k+1}$. This domain has a simple enough structure, and it suggests the following definition of a small field region:
\[ \Omega_{k+1}'' = \bigcup \{ \Box'' : \Box'' \in \pi_{k+1}',\ \Box'' \subset \Omega_{k+1},\ \Box'' \cap [(Q_{k+1}^{\sim})^{\approx} \cup (R_{k+1}^{\sim 2})^{\approx 2}] = \emptyset \}. \tag{2.45} \]
We could take also the same operations applied to both domains, so the condition would then be $\Box'' \cap ((Q_{k+1} \cup R_{k+1})^{\sim 2})^{\approx 2} = \emptyset$. The domain $\Omega_{k+1}''$ could be written in a similar way as $\Omega_{k+1}'$ in (2.4), but what we need is a simple characterization of $\Omega_{k+1} \cap \Omega_{k+1}''^{c}$ in terms of $Q_{k+1}$, $R_{k+1}$. By (2.45) we obtain
\[ \Omega_{k+1} \cap \Omega_{k+1}''^{c} \subset \bigl[ (Q_{k+1}^{\sim})^{\approx} \cup (R_{k+1}^{\sim 2})^{\approx 2} \bigr]^{\sim \frac{L-1}{2}} \cap \Omega_{k+1} \subset (Q_{k+1} \cup R_{k+1})^{\sim 4+\frac{L-1}{2}+2[R_k]} \cap \Omega_{k+1}. \tag{2.46} \]
On the domain $\Omega_{k+1}''$ we have the functions $\chi^{(k)}(\Omega_{k+1}'')$ only. All the other characteristic functions left under the integral (2.16) involve the variables $\theta$, $\psi$ restricted to $\Omega_{k+1}''^{c}$. This is the crucial property which allows us to perform the fluctuation integral on $\Omega_{k+1}''$ and to obtain an effective action with good analyticity properties. The domain $\Omega_{k+1} \cap \Omega_{k+1}''^{c}$ is included into the large field region, and the ultra-local Gaussian integration restricted to it is included into an operator $\mathbf{T}^{(k)}$. We have finished all the preparatory steps leading to the fluctuation integrals, so we have determined also a part of this operator given by the remaining integrations and characteristic functions. It is defined by the formula
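\[ \begin{aligned} & \mathbf{T}'^{(k)}(Z_k^c, P_{k+1}, Q_{k+1}, R_{k+1}, \Omega_{k+1}, \Omega_{k+1}''^{c}) = T^{(k)}(\Omega_{k+1}^c)\, \chi_k(Z_k^c \cap (\Omega_{k+1}^{\sim 3})^c) \\ & \quad \cdot \chi'^{(k)c}(P_{k+1})\, \chi'^{(k)}(P_{k+1}^c \cap \Omega_{k+1}' \cap \Omega_{k+1}^c)\, \chi_{k+1}'^{c}(Q_{k+1})\, \chi_{k+1}'(Q_{k+1}^c \cap \Omega_{k+1}^{\sim 3} \cap \Omega_{k+1}''^{c}) \\ & \quad \cdot (2\pi)^{-\frac{1}{2} N |\Omega_{k+1} \cap \Omega_{k+1}''^{c}|} \int d\psi\!\restriction_{\Omega_{k+1} \cap \Omega_{k+1}''^{c}} \exp\bigl( -\tfrac{1}{2} \|\psi\|^2 \bigr) \\ & \quad \cdot \chi_k\bigl( Q_{k+1} \cup ((R_{k+1}^{\sim})^{\approx} \cap \Omega_{k+1}^{\sim}) \bigr)\, \chi'^{(k)}\bigl( (Q_{k+1} \cup (R_{k+1}^{\sim})^{\approx}) \cap \Omega_{k+1} \bigr) \\ & \quad \cdot \chi^{(k)c}(R_{k+1})\, \chi^{(k)}(R_{k+1}^c \cap \Omega_{k+1} \cap \Omega_{k+1}''^{c}). \end{aligned} \tag{2.47} \]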
Let us consider the equality (2.15) with the integrals (2.16) after all the above expansions and transformations. We decompose the sum of boundary terms $B^{(k)}(Z_k, A_k)$ into two parts. In the first part we include the sum of all terms with localization domains $X \subset \Omega_{k+1}^c$. We denote this part now by $B'^{(k)}(\Omega_{k+1}^c, A_k)$. It does not depend on the variables $\psi_k$ on $B(\Lambda_{k+1}) = \Omega_{k+1}^{(k)}$. The second part is the sum of all the remaining terms, i.e. the terms $B^{(k)}(X, A_k)$ with $X \cap \Omega_{k+1} \neq \emptyset$. We denote it by $B''^{(k)}(Z_k, A_k; \psi_k)$, displaying the dependence on the variables $\psi_k$ restricted to $B(\Lambda_{k+1})$. With these notations we have
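\[ \begin{aligned} T^{(k)}\rho_k = {} & \sum_{\Omega_{k+1}, \Omega_{k+1}''}\ \sum_{P_{k+1}, Q_{k+1}, R_{k+1}}\ \sum_{Z_k, A_k} \mathbf{T}'^{(k)}(Z_k^c, P_{k+1}, Q_{k+1}, R_{k+1}, \Omega_{k+1}, \Omega_{k+1}''^{c}) \\ & \cdot \mathbf{T}_k(Z_k, A_k)\, \chi_{k+1}'(\Omega_{k+1}'') \exp\Bigl[ -\beta_k A_{k+1}^{\eta *}(Z_k^c, B_{k+1}; \theta, \phi_{k+1}(\tilde{\theta})) - \tfrac{1}{2} D^{(k)} - \delta D^{(k)} \\ & \qquad + E_k(Z_k^c; \psi^{(k)}(\tilde{\theta})) + F_k(Z_k^c; \psi^{(k)}(\tilde{\theta})) + B'^{(k)}(\Omega_{k+1}^c, A_k) \\ & \qquad - E_k |Z_k^c| - \tfrac{1}{2} N \log\frac{\beta_k}{2\pi}\, |\Omega_{k+1}| + \tfrac{1}{2} N L^{-d} \log\frac{\beta_k a L^{d-2}}{2\pi}\, |\Omega_{k+1}| \Bigr] \\ & \cdot \Bigl\{ (2\pi)^{-\frac{1}{2} N |\Omega_{k+1}''|} \int d\psi\!\restriction_{\Omega_{k+1}''} \exp\bigl( -\tfrac{1}{2} \|\psi\|^2 \bigr)\, \chi^{(k)}(\Omega_{k+1}'') \exp\Bigl[ -\beta_k V^{(k)}_{\mathrm{eff}}(\beta_k^{-\frac{1}{2}}\psi) \\ & \qquad + \delta E_k\bigl( Z_k^c; \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}_{\mathrm{loc}}\psi - \delta\psi^{(k)}(\tilde{\theta}) \bigr) + \delta F_k\bigl( Z_k^c; \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}_{\mathrm{loc}}\psi - \delta\psi^{(k)}(\tilde{\theta}) \bigr) \\ & \qquad + R^{(k)}\bigl( Z_k^c; \psi^{(k)}_{\mathrm{loc}}(\theta) + \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}_{\mathrm{loc}}\psi \bigr) + B''^{(k)}\bigl( Z_k, A_k; \psi^{(k)}_{\mathrm{loc}}(\theta) + \beta_k^{-\frac{1}{2}} C^{(k)\frac{1}{2}}_{\mathrm{loc}}\psi \bigr) \Bigr] \Bigr\}, \end{aligned} \tag{2.48} \]
where
\[ \delta E_k(Z_k^c; \delta\psi) = E_k(Z_k^c; \psi^{(k)}(\tilde{\theta}) + \delta\psi) - E_k(Z_k^c; \psi^{(k)}(\tilde{\theta})) = \int_0^1 dt\, \Bigl\langle \Bigl( \frac{\partial}{\partial\psi_k} E_k \Bigr)(Z_k^c; \psi^{(k)}(\tilde{\theta}) + t\delta\psi), \delta\psi \Bigr\rangle, \tag{2.49} \]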
and of course the same formula for $\delta F_k$. In the above formulas we have suppressed the dependence on $\psi_k$ outside the domain $\Omega_{k+1}$, which is unchanged by the above operations and the same as for the original action, and also on other variables. It becomes clear how a new $(k+1)$st density emerges from the right hand side of (2.48), but actually none of the expressions has a correct form yet, and we have to perform several additional operations to reach the form described in the inductive hypotheses for $k+1$. In the next section we discuss the fluctuation integral, i.e. the integral in the curly brackets in (2.48), and we construct its preliminary contributions to various terms of the new action. In the next paper we renormalize the new contributions, and we perform some final operations bringing the new density to its inductive form.

3. The Fluctuation Integral, a Classification of its Contributions to the New Effective Action, and Their Localization Expansions

The fluctuation integral now has a more complicated structure than the integral (2.41) [4]. There are some new types of terms, like the boundary terms, the “large field” terms $R^{(k)}$, etc., but we perform basically the same operations as in Sect. 2 [4]. Namely we take the logarithm of this integral and we separate and localize various contributions to a new effective density. We do it using simple generalizations of the formulas (2.52)–(2.55) [4]. Also as in Sect. 2 of [4] we formulate several propositions on localization expansions of the new contributions, and their bounds, and we discuss them to some extent, but we postpone complete proofs to the “localization section” in the next paper. At first let us notice that the expressions in the two exponentials in (2.48) depend on $Z_k^c$, $\Omega_{k+1}$ only, the boundary terms on $A_k$, $\Omega_{k+1}$. They do not depend on the large field regions $P_{k+1}$, $Q_{k+1}$, $R_{k+1}$, or on $\Omega_{k+1}''$. The dependence on these is in the characteristic functions and integrations only.
It is an important remark, which allows us to treat these
expressions and define the operations below uniformly on the whole domain k+1 , on which the transformation (2.39) was performed. The additional dependence on 00k+1 is introduced only by the integration in the curly brackets in (2.48). At first we separate from the fluctuation integral contributions determined by the boundary terms. We give a slightly different meaning to the symbols in (2.48), we take the third and the sixth terms on the right hand side of (2.33), which have been 1 (k) 1 ˜ and we shift them included into V (k) after substituting δψ = β − 2 C 2 ψ − δψ (k) (θ), loc
ef f
to the boundary terms B 00(k) . The reason for this should be obvious. These terms are partially localized, or “pre-localized”, on the subsets of Zk , so their localization expansions contribute only to the boundary terms. Notice that now B 00(k) does not depend on −1
(k) 1
(k) the expression ψloc (θ) + βk 2 Cloc 2 ψ because of the new terms. We introduce the new −1
−1
notation B 00(k) (Zk , Ak , k+1 ; βk 2 ψ) displaying the dependence on βk 2 ψ and k+1 . For simplicity we denote the Gaussian measure together with the characteristic functions in the fluctuation integral by dµ0 (00k+1 ; ψ), and we write the first separation formula Z − 21 (k) c c c log dµ0 (00k+1 ; ψ) exp −βk Vef f (Zk ; βk ψ) + δEk (Zk ) + δFk (Zk ) + − 1 (k) 1 −1 (k) + R(k) (Zkc ; ψloc (θ) + βk 2 Cloc 2 ψ) + B 00(k) (Zk , Ak , k+1 ; βk 2 ψ) = (3.1) Z − 21 (k) c c c = log dµ0 (00k+1 ; ψ) exp −βk Vef (Z ; β ψ) + δE (Z ) + δF (Z ) + k k k k k k f − 1 (k) 1 (k) + R(k) (Zkc ; ψloc (θ) + βk 2 Cloc 2 ψ) + B 00(k+1) (Ak , k+1 , 00k+1 ; θ, ψ), where B
00(k+1)
(Ak , k+1 , 00k+1 ; θ, ψ)
Z = 0
1
D E00 −1 du B 00(k) (Zk , Ak , k+1 ; βk 2 ψ) . u
(3.2)
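The separation formula (3.1), (3.2) rests on the elementary interpolation identity $\log\int d\mu\, e^{A+B} - \log\int d\mu\, e^{A} = \int_0^1 du\,\langle B\rangle_u$, where $\langle\cdot\rangle_u$ is the expectation with respect to the normalized measure proportional to $d\mu\, e^{A+uB}$. The following sketch checks this identity numerically for a toy measure supported on three points. The weights, the values of $A$ and $B$, and all function names are hypothetical and serve only as an illustration; they are not objects of the paper.

```python
import math

def log_partition(weights, exponents):
    """log of sum_i w_i * exp(e_i) for a finite positive measure."""
    return math.log(sum(w * math.exp(e) for w, e in zip(weights, exponents)))

def expectation_u(weights, a_vals, b_vals, u):
    """<B>_u for the normalized measure proportional to w_i * exp(a_i + u * b_i)."""
    dens = [w * math.exp(a + u * b) for w, a, b in zip(weights, a_vals, b_vals)]
    total = sum(dens)
    return sum(d * b for d, b in zip(dens, b_vals)) / total

def integrate_01(f, steps=2000):
    """Composite Simpson rule on [0, 1]; steps must be even."""
    h = 1.0 / steps
    s = f(0.0) + f(1.0)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3.0

# Toy data (assumed for illustration): a three-point measure, an "action" A
# and a small perturbation B playing the role of the boundary terms.
weights = [0.2, 0.5, 0.3]
a_vals = [-1.0, 0.3, 0.7]
b_vals = [0.05, -0.02, 0.01]

# Left hand side: logarithm of the integral with B in the exponential.
lhs = log_partition(weights, [a + b for a, b in zip(a_vals, b_vals)])
# Right hand side: logarithm without B, plus the u-integral of <B>_u.
rhs = log_partition(weights, a_vals) + integrate_01(
    lambda u: expectation_u(weights, a_vals, b_vals, u)
)
print(lhs, rhs)
```

The two printed numbers agree up to quadrature error, because the derivative in $u$ of $\log\int d\mu\, e^{A+uB}$ is exactly $\langle B\rangle_u$.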
The expectation value $\langle\cdot\rangle''_u$ is with respect to the probability measure $d\mu''_u(\psi)$, which is obtained by normalizing the measure after the first integral in (3.1), in which the function $B^{\prime\prime(k)}$ is multiplied by the variable $u$. Notice that the function (3.2) still depends on $\psi$ restricted to the domain $\Omega_{k+1}\setminus\Omega''_{k+1}$. This function has some obvious symmetry properties. It is invariant with respect to orthogonal transformations $R \in O(N)$ applied to all vector valued variables and functions, and it is invariant with respect to Euclidean transformations of the lattice $T^{(k+1+m)}_{LM}$, or the partition $\pi_{k+1}$. The last property was formulated in (H.7), but let us recall that it means invariance if a transformation is applied to the whole geometric structure also.

We need a localization expansion of this function, its analyticity properties and bounds. We formulate a corresponding proposition below, but first let us describe very briefly the main steps in the construction of such an expansion. The function $B^{\prime\prime(k)}$ is a sum of localized terms in which we make the substitution (2.39). We construct localization expansions for these terms based on Proposition 4.3 [3], but slightly modified and adapted to the case of boundary terms, and we resum over original domains, with fixed final localization domains. Thus we represent $B^{\prime\prime(k)}$ as a sum of localized terms, with localizations with respect to both variables $\theta$, $\psi$. We do the same for all expressions in the exponential defining the measure $d\mu''_u(\psi)$, i.e. we write the whole expression there as a sum of localized terms. Then we substitute the localization expansion of the function $B^{\prime\prime(k)}$ into the expectation value in (3.2), and we obtain the sum of the expectation values of the localized terms. For each term of the last sum we construct a “cluster expansion” of the expectation value. It is a very simple form of such an expansion, and it yields a localization expansion of the term. We fix again final localization domains, and we perform another resummation over all terms with these domains. This yields a localization expansion of (3.2).

We fill in all details in the next paper, but even from the above description it should be clear that bounds of terms of the last expansion depend on bounds of the terms of the original expansion for $B^{\prime\prime(k)}$. Take a term $B^{\prime\prime(k)}(X)$, then by the definition of $B^{\prime\prime(k)}$ we have $X \cap \Omega_{k+1} \neq \emptyset$ and $X \cap Z_k \neq \emptyset$. By the construction of the domains $Z_k$, $\Omega_{k+1}$ we have $d_k(X \bmod \Omega_k^c) \geq LR_k - (2L-1) > R_k$. Using a small part of the exponential decay bound for $B^{\prime\prime(k)}(X)$ we obtain an additional factor $\exp(-R_k) < \beta_k^{-K}$, which makes bounds for terms of the above expansion very small. The above remarks should help to understand and justify the following proposition.

Proposition 3.1. The function defined by (3.2) has the localization expansion

$$
B^{\prime\prime(k+1)}(A_k,\Omega_{k+1},\Omega''_{k+1};\theta,\psi)=\sum_{\substack{Y\in D_{k+1}(\bmod\,\Omega^c_{k+1}):\\ Y\cap\Omega^c_{k+1}\neq\emptyset,\ Y\cap\Omega_{k+1}\neq\emptyset}} B^{\prime\prime(k+1)}(Y,(A_k,\Omega_{k+1},\Omega''_{k+1});\theta,\psi),
\tag{3.3}
$$

whose terms have the usual analyticity and symmetry properties described in (H.7), they depend on $\theta$ restricted to $Y$ and on $\psi$ restricted to $Y\cap(\Omega_{k+1}\setminus\Omega''_{k+1})$, and they satisfy the bounds

$$
|B^{\prime\prime(k+1)}(Y,(A_k,\Omega_{k+1},\Omega''_{k+1});\theta,\psi)|<\beta_k^{-1}\exp\Big(-\frac32\kappa\, d_{k+1}(Y \bmod \Omega^c_{k+1})\Big)
\tag{3.4}
$$
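on the analyticity domains.

Let us remark that $\beta_k^{-1}$ in the above bound can be replaced by an arbitrary power of $\beta_k^{-1}$ or any other small constant, and $\frac32\kappa$ can be replaced by $2\kappa - 5 - 8\kappa_0$. It is clear that all the terms in (3.3) contribute to new boundary terms only. The function (3.2) can be written in the first exponential in (2.48), and we are left with the logarithm on the right-hand side of (3.1).

We want to improve symmetry properties of the next contributions, so we have to separate the “Euclidean symmetry breaking” terms. They are connected with the functions with the “loc” subscript, like $\psi^{(k)}_{\mathrm{loc}}$, $C^{(k)\frac12}_{\mathrm{loc}}\psi$, or $\delta\psi^{(k)}$, etc. We use formulas (2.36), (2.39), and we expand around the fully symmetric functions. Let us write these expansions at first for terms of the effective action in the fluctuation integral. We have

$$
\begin{aligned}
V^{(k)}_{\mathrm{eff}}(Z_k^c;\beta_k^{-\frac12}\psi)
&=V^{(k)}(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi)
+\Big\langle-\beta_k^{-\frac12}C^{(k)\frac12}\psi,\ (aL^{-2}Q^*Q+\Delta^{(k)})\,\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\Big\rangle\\
&\quad+\frac12\Big\langle\delta_0^{(k)}(\beta_k^{-\frac12}\psi),\ (aL^{-2}Q^*Q+\Delta^{(k)})\,\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\Big\rangle\\
&\quad-\int_0^1 dt\,\Big\langle\Big(\frac{\partial}{\partial\delta\psi}V^{(k)}\Big)\big(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi-t\,\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big),\ \delta_0^{(k)}(\beta_k^{-\frac12}\psi)\Big\rangle\\
&=V^{(k)}(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi)+\delta V^{(k)}_{\mathrm{eff}}\big(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi,\ \delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big),
\end{aligned}
\tag{3.5}
$$

and the more obvious expansions

$$
\delta E_k\big(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}_{\mathrm{loc}}\psi-\delta\psi^{(k)}(\tilde\theta)\big)=\delta E_k(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi)+\delta E_k'\big(Z_k^c;-\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big),
$$

where

$$
\begin{aligned}
\delta E_k'(Z_k^c;\delta\psi)
&=E_k\big(Z_k^c;\psi^{(k)}(\tilde\theta)+\beta_k^{-\frac12}C^{(k)\frac12}\psi+\delta\psi\big)-E_k\big(Z_k^c;\psi^{(k)}(\tilde\theta)+\beta_k^{-\frac12}C^{(k)\frac12}\psi\big)\\
&=\int_0^1 dt\,\Big\langle\Big(\frac{\partial}{\partial\psi_k}E_k\Big)\big(Z_k^c;\psi^{(k)}(\tilde\theta)+\beta_k^{-\frac12}C^{(k)\frac12}\psi+t\,\delta\psi\big),\ \delta\psi\Big\rangle,
\end{aligned}
\tag{3.6}
$$

the same for $\delta F_k(Z_k^c)$, and

$$
R^{(k)}\big(Z_k^c;\psi^{(k)}_{\mathrm{loc}}(\theta)+\beta_k^{-\frac12}C^{(k)\frac12}_{\mathrm{loc}}\psi\big)=R^{(k)}\big(Z_k^c;\psi^{(k)}(\theta)+\beta_k^{-\frac12}C^{(k)\frac12}\psi\big)+\delta R^{(k)\prime}\big(Z_k^c;-\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big).
\tag{3.7}
$$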
Using these expansions we write the second separation formula

$$
\begin{aligned}
&\log\int d\mu_0(\Omega''_{k+1};\psi)\,\exp\Big(-\beta_k V^{(k)}_{\mathrm{eff}}(Z_k^c;\beta_k^{-\frac12}\psi)+\delta E_k(Z_k^c)+\delta F_k(Z_k^c)\\
&\qquad\qquad+R^{(k)}\big(Z_k^c;\psi^{(k)}_{\mathrm{loc}}(\tilde\theta)+\beta_k^{-\frac12}C^{(k)\frac12}_{\mathrm{loc}}\psi\big)\Big)\\
&\quad=\log\int d\mu_0(\Omega''_{k+1};\psi)\,\exp\Big(-\beta_k V^{(k)}(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi)+\delta E_k(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi)\\
&\qquad\qquad+\delta F_k(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi)+R^{(k)}\big(Z_k^c;\psi^{(k)}(\tilde\theta)+\beta_k^{-\frac12}C^{(k)\frac12}\psi\big)\Big)\\
&\qquad+R^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi),
\end{aligned}
\tag{3.8}
$$

where

$$
\begin{aligned}
R^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)
=\int_0^1 du\,\Big\langle&-\beta_k\,\delta V^{(k)}_{\mathrm{eff}}\big(Z_k^c;\beta_k^{-\frac12}C^{(k)\frac12}\psi,\ \delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big)
+\delta E_k'\big(Z_k^c;-\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big)\\
&+\delta F_k'\big(Z_k^c;-\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big)
+\delta R^{(k)\prime}\big(Z_k^c;-\delta_0^{(k)}(\beta_k^{-\frac12}\psi)\big)\Big\rangle'_u .
\end{aligned}
\tag{3.9}
$$

The expectation value $\langle\cdot\rangle'_u$ is with respect to the probability measure $d\mu'_u(\psi)$, which is obtained in a rather obvious way. We use the expansions (3.5)–(3.7) for the terms in the first exponential in (3.8), we multiply the second terms on the right hand sides of these expansions by the variable $u$, and we normalize the obtained measure. The function (3.9) still has the same symmetry properties as the function (3.2). Each term in the expectation value can be represented as a scalar product of some function on $B(\Lambda_{k+1})$ with $\delta_0^{(k)}(\beta_k^{-\frac12}\psi)$, so by (2.19), (2.34) it can be bounded by a constant multiplied by $\exp(-R_k)$. Thus the function (3.9) is again very small: it can be bounded by an arbitrary power of $\beta_k^{-\frac12}$. We construct a localization expansion in the same way as in the previous case, and we obtain the following proposition.
Proposition 3.2. The function defined by (3.9) has a localization expansion of the form (3.3), but with $Z_k^c$ instead of $A_k$, and without the restriction $Y\cap\Omega^c_{k+1}\neq\emptyset$. Terms of this expansion have all the properties described in Proposition 3.1, and they satisfy the bounds (3.4) with $\frac12\beta_k^{-m-1}$ instead of $\beta_k^{-1}$. In addition, if $Y\subset\Omega''_{k+1}$, then the corresponding term does not depend on $Z_k^c$, $\Omega_{k+1}$, $\Omega''_{k+1}$, and is equal to the term with the same localization domain constructed in the case when the domains $Z_k^c$, $\Omega_{k+1}$, $\Omega''_{k+1}$ are equal to the whole torus.

Let us notice that the function (3.9) depends on the configuration $\tilde\theta$, hence on $\theta$ restricted to the domain $W_k^{\sim}$. The sum in (3.3) may be actually restricted to $Y\in D_{k+1}$ contained in the above domain, or in $\Omega_k$, but it is not important. Terms of this expansion will be divided later into two groups, one contributing to the boundary terms $B^{(k+1)}$, and another to $R_m^{(k+1)}$.

Consider now the first expression on the right-hand side of (3.8), and in particular the expressions in the exponential. If the domains $Z_k^c$, $\Omega_{k+1}$ were equal to the whole torus, then these expressions would have Euclidean symmetry properties determined by the properties of the corresponding terms of the original action and the functions $\psi^{(k)}(\theta)$, $C^{(k)\frac12}\psi$. This means that the expressions connected with $V^{(k)}$, $E_k$, $F_k$ would be invariant with respect to Euclidean transformations of the lattice $T_L^{(k+1)}$, or the lattice $T_1^{(k+1)}$ after the next rescaling. The expression connected with $R_n^{(k)}$ would be invariant with respect to Euclidean transformations of the lattice $T_{L^n}^{(k+n)}$, or the lattice $T_{L^{n-1}}^{(k+1+(n-1))}$ after the rescaling, $n = 1, 2, \ldots, m$. For these expressions the symmetry properties improve with the rescaling, but they are still worse than the properties of the previous expressions, at least for $n > 1$; for $n = 1$ they are exactly the same.
Now we want to separate from the considered fluctuation integral contributions determined by the last expressions above, in the order of improving symmetry properties. We have generally for $1 < n \leq m$,

$$
\begin{aligned}
&\log\int d\mu_0(\Omega''_{k+1};\psi)\,\exp\Big(-\beta_k V^{(k)}(Z_k^c)+\delta E_k(Z_k^c)+\delta F_k(Z_k^c)+\sum_{p=1}^{n}R_p^{(k)}(Z_k^c)\Big)\\
&\quad=\log\int d\mu_0(\Omega''_{k+1};\psi)\,\exp\Big(-\beta_k V^{(k)}(Z_k^c)+\delta E_k(Z_k^c)+\delta F_k(Z_k^c)+\sum_{p=1}^{n-1}R_p^{(k)}(Z_k^c)\Big)\\
&\qquad+R_{n-1}^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi),
\end{aligned}
\tag{3.10}
$$

where

$$
R_{n-1}^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)=\int_0^1 du\,\Big\langle R_n^{(k)}\big(Z_k^c;\psi^{(k)}(\tilde\theta)+\beta_k^{-\frac12}C^{(k)\frac12}\psi\big)\Big\rangle^{(n)}_u .
\tag{3.11}
$$

In (3.10) we have suppressed the dependence on $\tilde\theta$, $\psi$ in the expressions in the exponentials, because it is the same as in the exponential on the right-hand side of (3.8). A definition of the expectation value $\langle\cdot\rangle^{(n)}_u$ is again obvious. A probability measure $d\mu^{(n)}_u(\psi)$ is determined by the measure after the symbol of integral on the left hand side of (3.10), in which the expression $R_n^{(k)}(Z_k^c)$ is multiplied by the variable $u$. Consider the function (3.11) in the case when the domains $Z_k^c$, $\Omega_{k+1}$, $\Omega''_{k+1}$ are equal to the whole torus. It is a function of the configurations $\theta$, $g$ defined on the whole lattices $T_L^{(k+1)}$, $T_\eta$, and it is invariant with respect to Euclidean transformations of the lattice $T_{L^n}^{(k+n)}$. It is so because all the expressions determining the expectation value in (3.11) have this property. After the next rescaling the function will become invariant with respect to Euclidean transformations of the lattice $T_{L^{n-1}}^{(k+n)} = T_{L^{n-1}}^{(k+1+(n-1))}$, so the symmetry properties will improve by one scale in comparison with the symmetry properties of the function $R_n^{(k)}$. In particular for $n = 1$ we would obtain the same properties as for the “regular” terms of the action, but this case has to be treated in a slightly different way, as will be discussed below. In the general case we now have a different form of a localization expansion. The difference is connected with the fact that the terms $R_n^{(k)}(X)$ of the expansion (1.17) with $X\cap\Omega_{k+1}=\emptyset$ are not changed by taking the expectation value in (3.11), so we have two types of terms in an expansion of (3.11).

Proposition 3.3. The function defined by (3.11) has a localization expansion which can be written as a sum of two parts. The first part is the sum of terms of the expansion (1.17) with localization domains $X$ satisfying $X\cap\Omega_{k+1}=\emptyset$. The second part has the same form and the same properties as the expansion described in Proposition 3.2, except that the bounds (3.4) hold with $\beta_k^{-1}$ replaced by const. $\beta_k^{-n-1}$.

Let us remark that the constant in the bounds is constructed in a simple way in terms of the previously introduced constants, like $K_0$, $K_1$, $K_2$, $K_3$, $B_5$, $L$, $M$, $\alpha_0$, $\gamma$, etc. Let us again stress the important property that if a localization domain $Y\subset\Omega''_{k+1}$, then the corresponding term is equal to the term with the same domain constructed in the case when $Z_k^c=\Omega_{k+1}=\Omega''_{k+1}=T$. This is important because of the symmetry properties discussed above. The terms of the localization expansion of the function (3.11) contribute again to the boundary terms, and to the function $R_{n-1}^{(k+1)}$, which is now defined by (3.11) in the case of the whole torus.

Consider for a moment the equalities (3.10), (3.11) for $n = 1$.
They would determine the functions $R_0^{\prime(k+1)}$, or $R_0^{(k+1)}$ for the whole torus, and the last one has the same symmetry and other properties as the contribution from the “regular” part of the action, so we would like to combine them together. To do this we would have to separate the part independent of $g$, which is simple, and to derive the special “pre-localized” representation for it, which is needed for hypothesis (H.2), and which demands a slightly different treatment than the one in (3.10), (3.11). It is simpler to reverse the procedure. We expand $R_1^{(k)}(Z_k^c)$ up to the first order in $g$, and we include the term independent of $g$ into $\delta E_k(Z_k^c)$, and the term of at least first order in $g$ into $\delta F_k(Z_k^c)$. For simplicity again we keep the same notation for the new expressions.

Finally we consider the first expression on the right hand side of (3.10) without the last sum in the exponential. In the case when the domains are equal to the whole torus this expression has exactly the same form as the logarithm of the integral in (2.41) [4]. We follow now the procedure of Sect. 2 in [4], and we have

$$
\begin{aligned}
&\log\int d\mu_0(\Omega''_{k+1};\psi)\,\exp\Big(-\beta_k V^{(k)}(Z_k^c)+\delta E_k(Z_k^c)+\delta F_k(Z_k^c)\Big)\\
&\quad=e(N,p_1(\beta_k))\,|\Omega''_{k+1}|+E_1^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)+F_0^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi),
\end{aligned}
\tag{3.12}
$$

where

$$
e(N,p_1(\beta_k))=\log\Big((2\pi)^{-\frac N2}\int_{\mathbb{R}^N}d\psi\,\chi(\{|\psi|<p_1(\beta_k)\})\,e^{-\frac12|\psi|^2}\Big)=O\big(e^{-\frac14 p_1(\beta_k)^2}\big),
\tag{3.13}
$$

$$
\begin{aligned}
E_1^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)
=\int_0^1 du\,\Big\langle&-\beta_k\Big\langle\Big(\frac{\partial}{\partial\delta\psi}V^{(k)}\Big)(Z_k^c;u\beta_k^{-\frac12}C^{(k)\frac12}\psi),\ \beta_k^{-\frac12}C^{(k)\frac12}\psi\Big\rangle\\
&+\Big\langle\Big(\frac{\partial}{\partial\delta\psi}\delta E_k\Big)(Z_k^c;u\beta_k^{-\frac12}C^{(k)\frac12}\psi),\ \beta_k^{-\frac12}C^{(k)\frac12}\psi\Big\rangle\Big\rangle_{u,0},
\end{aligned}
\tag{3.14}
$$

$$
F_0^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)
=\int_0^1 du\,\Big\langle\Big\langle\Big(\frac{\partial}{\partial\delta\psi}F_k\Big)(Z_k^c;u\beta_k^{-\frac12}C^{(k)\frac12}\psi),\ \beta_k^{-\frac12}C^{(k)\frac12}\psi\Big\rangle\Big\rangle_{1,u},
\tag{3.15}
$$

and the expectation values are with respect to the measure

$$
\begin{aligned}
d\mu_{t,u}(\psi)=Z_{t,u}^{-1}(\Omega''_{k+1})\,d\mu_0(\Omega''_{k+1};\psi)\,\exp\Big(&-\beta_k V^{(k)}(Z_k^c;t\beta_k^{-\frac12}C^{(k)\frac12}\psi)\\
&+\delta E_k(Z_k^c;t\beta_k^{-\frac12}C^{(k)\frac12}\psi)+\delta F_k(Z_k^c;u\beta_k^{-\frac12}C^{(k)\frac12}\psi)\Big).
\end{aligned}
\tag{3.16}
$$
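The estimate in (3.13) says that the logarithm of the standard Gaussian mass of the ball $\{|\psi| < p\}$ is exponentially small in $p^2$. As a sanity check only, the sketch below evaluates this quantity in closed form for $N = 1$ and $N = 2$ and compares it with $e^{-p^2/4}$ for a few moderate values of $p$. The function name `e_small_field`, the restriction to $N \le 2$, and the choice of the constant 1 in front of the bound are assumptions of this illustration, not statements from the paper.

```python
import math

def e_small_field(n, p):
    """log of the standard Gaussian mass of the ball {|psi| < p} in R^n.

    Closed forms are used, so only n = 1 and n = 2 are supported here:
      n = 1: mass = erf(p / sqrt(2)) = 1 - erfc(p / sqrt(2))
      n = 2: mass = 1 - exp(-p^2 / 2)
    log1p keeps precision when the mass is very close to 1.
    """
    if n == 1:
        return math.log1p(-math.erfc(p / math.sqrt(2.0)))
    if n == 2:
        return math.log1p(-math.exp(-0.5 * p * p))
    raise ValueError("closed form implemented only for n = 1, 2")

def bound(p):
    """The comparison quantity exp(-p^2 / 4) from (3.13), with constant 1."""
    return math.exp(-0.25 * p * p)

for n in (1, 2):
    for p in (2.0, 3.0, 4.0, 6.0):
        print(n, p, e_small_field(n, p), bound(p))
```

In every printed row the value of $e$ is negative and smaller in absolute value than the comparison bound; for small $p$ (for example $p = 1$, $N = 2$) this is no longer true with constant 1, which is consistent with (3.13) being an asymptotic statement for large $p_1(\beta_k)$.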
The above definitions on the whole torus obviously coincide with the corresponding definitions (2.55), (2.52), (2.51) in [4]. For the function $F_0^{\prime(k+1)}$ we can introduce the representation (2.52) [4] in terms of $M_0^{\prime(k+1)}$ given by a formula corresponding to (2.53) [4]. The function $M_0^{\prime(k+1)}$ depends obviously on the same domains and variables as the function $F_0^{\prime(k+1)}$, and it is represented by the same formula (3.15) with $F_k(Z_k^c)$ replaced by $M_k(Z_k^c)$.

The function $E_1^{\prime(k+1)}$ has to be “pre-localized”. We do it in a very simple way by introducing at first slightly more general functions $E_1^{\prime(k+1)}(\Omega,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)$, where $\Omega$ is a domain contained in $Z_k^c$, which is a union of the unit cubes $\Delta_k(y)$, $y\in Z_k^c\cap T_1^{(k)}$. These functions are defined by the same formula (3.14), in which the functions $V^{(k)}(Z_k^c)$, $\delta E_k(Z_k^c)$ inside the expectation value are replaced correspondingly by $V^{(k)}(\Omega)$, $\delta E_k(\Omega,(Z_k^c)^{\approx})$. The meaning of $V^{(k)}(\Omega)$ is obvious, and the second symbol denotes the function represented by (1.7), (1.8), where the sums in (1.8) are over $y\in\Omega\cap T^{(j)}$, and over $X$, $x$ as written in (1.8). The measure (3.16) determining the expectation value is unchanged. We have now

$$
\begin{aligned}
E_1^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)
&=E_1^{\prime(k+1)}(Z_k^c\cap\Omega^c_{k+1},\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)\\
&\quad+\sum_{z\in\Lambda_{k+1}}E_1^{\prime(k+1)}(\Delta_{k+1}(z),\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi).
\end{aligned}
\tag{3.17}
$$

We combine the determinant term $-\frac12 D^{(k)}$ in the first exponential in (2.48) with the function (3.14), defining $E_0^{\prime(k+1)} = -\frac12 D^{(k)} + E_1^{\prime(k+1)}$, as in (2.54) [4]. The trace in the definition (2.41) of $-\frac12 D^{(k)}$ is written as in (2.58) [4], and we construct a pre-localization of this term as in (2.60) [4]. These pre-localizations are combined with corresponding terms of the sum above, again as in (2.60) [4]. This way we obtain a decomposition of the form (3.17) for the function $E_0^{\prime(k+1)}$, in which the first term on the right hand side is equal to the first term in (3.17). We have the following generalization of Propositions 2.1, 2.2 in [4].
Proposition 3.4. The functions $E_1^{\prime(k+1)}(Z_k^c\cap\Omega^c_{k+1})$, $E_0^{\prime(k+1)}(z)$, (3.15) or $M_0^{\prime(k+1)}(x)$ have localization expansions having the same form and properties as the one described in Proposition 3.2, but with the following modifications. The expansion of the first function is over the domains $Y$ containing at least one component of $\Omega^c_{k+1}$, and intersecting $\Omega_{k+1}$, and terms of this expansion satisfy the bounds (3.4) with 1 instead of $\beta_k^{-1}$. The sum in the localization expansion of $E_0^{\prime(k+1)}(z)$ is over the domains $Y$ satisfying the condition $z\in Y$, and the constant in the bounds (3.4) is equal to $(NL^dB_5+1)K_0$. Terms of the localization expansion of (3.15) satisfy the bounds with $\beta_k^{-1}$ replaced by const. $\beta^{-\frac58+\alpha}(L^{-1}\eta)^{\frac{d-2}{2}+\gamma-\alpha_1}\leq\frac12\beta^{-\frac14}(L^{-1}\eta)^{\frac{d-2}{2}+\gamma-\alpha_1}$, and terms of the expansion of $M_0^{\prime(k+1)}(x)$ satisfy the same bounds, with the localization domains satisfying the condition $x\in Y$.

Let us stress once more that a part of the above proposition is the following important statement: if $Y\subset\Omega''_{k+1}$, then the terms of the above expansions with this localization domain are equal to the terms of the localization expansions constructed on the whole torus $T$, with the same domain $Y$. This property is particularly important for a renormalization of the functions $E_0^{\prime(k+1)}(z)$. It allows us to introduce the renormalization constants uniformly for all terms in the representation (2.48) of the density $T^{(k)}\rho_k$.

Let us mention again that proofs of the above propositions are given in the next paper. Actually we do not give separate proofs of them, but we study a sufficiently general case, and we construct a “generic” localization expansion along the lines sketched in the paragraph before Proposition 3.1. All these propositions are simple conclusions of this construction. There we obtain also various additional restrictions on the coefficients determining the model, and the constants determining our method. The most important are restrictions on $\beta$.
We have not formulated any of them in the above proposition in order to make them as simple and straightforward as possible, but obviously they are needed here.

We have finished the analysis of the fluctuation integral, and we can write a newly obtained representation of the density $T^{(k)}\rho_k$. We do it performing also the next scaling operation $S^{(k)}$, which is the rescaling of the lattice $T_L^{(k+1)}$ to $T_1^{(k+1)}$, and all other rescalings it implies. We have discussed this operation at the beginning of Sect. 3 [4], and there is no need to add anything here. We write the representation formula using some simplified notations. We combine the term $-\delta D^{(k)}$ together with $R^{\prime(k+1)}$, and we denote the sum by $R_m^{\prime(k+1)}$. It satisfies all the conclusions of Proposition 3.2. We denote the configurations occurring in the “old” action by $\phi_{k+1,u}(\tilde\theta)$, $\psi_u^{(k)}(\tilde\theta)$, as in [4], stressing the fact that they are unrenormalized. We use also the constants $E_k''$, $E_k'''$, similar to the ones defined in (2.41), (2.63) [4], but with obvious modifications connected with the powers of $2\pi$. We have

$$
\begin{aligned}
S^{(k)}T^{(k)}\rho_k=\ &\sum_{\Omega''_{k+1},\Omega_{k+1}}\ \sum_{P_{k+1},Q_{k+1},R_{k+1}}\ \sum_{Z_k,A_k}T^{\prime(k)}(Z_k^c,P_{k+1},Q_{k+1},R_{k+1},\Omega_{k+1},\Omega^{\prime\prime c}_{k+1})\\
&\cdot T_k(Z_k,A_k)\,\chi'_{k+1}(\Omega''_{k+1})\,\exp\Big(-\beta_{k+1,u}A^*_{k+1,u}(Z_k^c,B_{k+1};\theta,\phi_{k+1,u}(\tilde\theta))\\
&\quad+E_k(Z_k^c;\psi_u^{(k)}(\tilde\theta))+F_k(Z_k^c;\psi_u^{(k)}(\tilde\theta))+B^{\prime(k)}(A_k,\Omega^c_{k+1})\\
&\quad+E_0^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)+F_0^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)\\
&\quad+\sum_{n=1}^{m}R_n^{\prime(k+1)}(Z_k^c,\Omega_{k+1},\Omega''_{k+1};\tilde\theta,\psi)+B^{\prime\prime(k+1)}(A_k,\Omega_{k+1},\Omega''_{k+1};\theta,\psi)\\
&\quad-E_kL^d\,|Z_k^c\setminus\Omega_{k+1}|-E_k''L^d\,|\Omega_{k+1}\setminus\Omega''_{k+1}|-E_k'''L^d\,|\Omega''_{k+1}|\Big),
\end{aligned}
\tag{3.18}
$$

where all norms, volumes, etc. are written in the $L^{-1}\eta$-scale. The expressions in the exponential do not have the form needed for the inductive hypotheses, e.g. they do not have right localizations; they all contribute to boundary terms. We still have to perform some additional operations, the most important of which is the renormalization operation. We discuss it in the next paper.

References

1. Balaban, T.: A Low Temperature Expansion for Classical N-Vector Models. I. A Renormalization Group Flow. Commun. Math. Phys. 167, 103–154 (1995)
2. Balaban, T.: The Variational Problems for Classical N-Vector Models. Commun. Math. Phys. 175, 607–642 (1996)
3. Balaban, T.: Localization Expansions. I. Functions of the “Background” Configurations. Commun. Math. Phys. 182, 33–82 (1996)
4. Balaban, T.: A Low Temperature Expansion for Classical N-Vector Models. II. Renormalization Group Equations. Commun. Math. Phys. 182, 675–721 (1997)
5. Balaban, T.: Commun. Math. Phys., a) 89, 571–597 (1983); b) 119, 243–285 (1988); c) 122, 175–202 (1989); d) 122, 355–392 (1989)
6. Brydges, D.: A Short Course in Cluster Expansions. In: Critical Phenomena, Random Systems, Gauge Theories, Les Houches (1984). Elsevier Science Publishers, 1986
7. Brydges, D., Dimock, J., Hurd, T.: Weak Perturbations of Gaussian Measures. In: Mathematical Quantum Theory I: Field Theory and Many Body Theory, CRM Proceedings Lecture Notes
8. Cammarota, C.: Commun. Math. Phys. 85, 517–528 (1982)
9. Feldman, J., Magnen, J., Rivasseau, V., Sénéor, R.: Commun. Math. Phys. 109, 473 (1987)
10. Abdesselam, A., Rivasseau, V.: An Explicit Large Versus Small Field Multiscale Cluster Expansion. Rev. Math. Phys. (to appear)
11. Rivasseau, V.: Cluster Expansions with Small/Large Field Conditions. In: Mathematical Quantum Theory I: Field Theory and Many Body Theory, CRM Proceedings Lecture Notes
12. Glimm, J., Jaffe, A.: Quantum Physics: Functional Integral Point of View. New York: Springer, 1989

Communicated by D. C. Brydges
Commun. Math. Phys. 196, 523 – 533 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A Conjectural Generating Function for Numbers of Curves on Surfaces
Lothar Göttsche
International Center for Theoretical Physics, Strada Costiera 11, P.O. Box 586, 34100 Trieste, Italy. E-mail: [email protected]
Received: 24 November 1997 / Accepted: 11 January 1998

Abstract: I give a conjectural generating function for the numbers of $\delta$-nodal curves in a linear system of dimension $\delta$ on an algebraic surface. It reproduces the results of Vainsencher [V] for the case $\delta \leq 6$ and Kleiman–Piene [K-P] for the case $\delta \leq 8$. The numbers of curves are expressed in terms of five universal power series, three of which I give explicitly as quasimodular forms. This gives in particular the numbers of curves of arbitrary genus on a K3 surface and an abelian surface in terms of quasimodular forms, generalizing the formula of Yau–Zaslow for rational curves on K3 surfaces. The coefficients of the other two power series can be determined by comparing with the recursive formulas of Caporaso–Harris for the Severi degrees in $\mathbb{P}^2$. We verify the conjecture for genus 2 curves on an abelian surface. We also discuss a link of this problem with Hilbert schemes of points.

1. Introduction

Let $L$ be a line bundle on a projective algebraic surface $S$. In the case $\delta \leq 6$ Vainsencher [V] proved formulas for the numbers $t^S_\delta(L)$ of $\delta$-nodal curves in a general $\delta$-dimensional sub-linear system of $|L|$. By refining Vainsencher's approach Kleiman–Piene [K-P] extended the results to $\delta \leq 8$. The formulas hold under the assumption that $L$ is a sufficiently high power of a very ample line bundle. In this paper we want to give a conjectural generating function for the numbers $t^S_\delta(L)$. We will have only partial success: We are able to express the $t^S_\delta(L)$ in terms of five universal generating functions in one variable $q$. Three of these are Fourier developments of (quasi-)modular forms, the other two we have not been able to identify: the formulas of [C-H] for the Severi degrees on $\mathbb{P}^2$ yield an algorithm for computing their coefficients and I computed them up to degree 28. As the functions are universal, one would hope that there exists a nice closed expression for them.
If the canonical divisor of the surface S is numerically trivial, only the quasimodular forms appear in the generating function. Thus we obtain (conjecturally) the numbers
of curves of arbitrary genus on a K3 surface and on an abelian surface as the Fourier coefficients of quasimodular forms. The formulas generalize the calculation of [Y-Z] for the numbers of rational curves on K3 surfaces. The fact that for K3 surfaces and abelian surfaces the numbers can be expressed solely in terms of quasimodular forms might be related to physical dualities. The numbers tSδ (L) will on a general surface (including also P2 ) count all curves with the prescribed numbers of nodes, including the reducible ones. If |L| contains only reduced curves, it seems however that on an abelian surface one is actually counting irreducible curves, and in the case of K3 surfaces one can restrict attention to the case that |L| contains only irreducible reduced curves. Both in the case of abelian surfaces and K3 surfaces I do then expect the generating function to count the curves of given genus in sub-linear systems of |L|, even if L is only assumed to be very ample and not a high multiple of an ample line bundle. The curves can then have worse singularities than nodes and a curve C should be counted with a multiplicity determined by the singularities of C, as in [B, F-G-vS]. In particular nodal (or more generally immersed) curves should count with multiplicity 1. In the case of K3 surfaces and L a primitive line bundle the numbers of these curves have in the meantime been computed in [Br-Le]. The coefficients of the unknown power series are by the recursion of [C-H] the solutions to a highly overdetermined system of linear equations; the same is true for a similar recursion obtained by Vakil [Va] for rational ruled surfaces and the results of [Ch] on P2 and P1 × P1 . This gives an additional check of the conjecture. Finally we compute the numbers of genus 2 curves on an abelian surface. I thank B. Fantechi for many useful discussions without which this paper could not have been written and P. Aluffi for pointing out [V] to me. 
I thank Don Zagier for useful comments and discussions which improved the formulation of Conjecture 2.4, R. Vakil for giving me a preliminary version of [Va] and S. Kleiman, R. Piene and O. Debarre for very useful comments. The work on this paper was begun during my stay at the Mittag Leffler Institute.
2. Statement of the Conjecture

Let $S$ be a projective algebraic surface and $L$ a line bundle on $S$. In this paper by a curve on $S$ we mean an effective reduced divisor on $S$. A nodal curve on $S$ is a reduced (not necessarily irreducible) divisor on $S$, which has only nodes as singularities. We denote by $K_S$ the canonical bundle and by $c_2(S)$ the degree of the second Chern class. For two line bundles $L$ and $M$ let $LM$ denote the degree of $c_1(L)\cdot c_1(M)\in H^4(S,\mathbb{Z})$. In [V] (for $\delta \leq 6$) and [K-P] (for $\delta \leq 8$) formulas for the number $a^S_\delta(L)$ of $\delta$-nodal curves in a general $\delta$-dimensional sub-linear system $V$ of $|L|$ were proved. Here general means that $V$ lies in an open subset of the Grassmannian of $\delta$-dimensional subspaces of $|L|$. However (see [K-P]), one can just take $V$ to be the linear system of divisors in $|L|$ passing through $\dim(|L|)-\delta$ general points on $S$. The number is expressed as a polynomial in $c_2(S)$, $K_S^2$, $LK_S$ and $L^2$ of degree $\delta$. The formulas are valid if $L$ is a sufficiently high multiple of an ample line bundle. In other words, for such $L$, the locally closed subset $W^S_\delta(L)$ (with the reduced structure) of elements in $|L|$ defining $\delta$-nodal curves has codimension $\delta$, and its degree is given by a polynomial as above. It is clear and was noted before [K-P] that there should be a formula for all $\delta$:

Conjecture 2.1. For all $\delta\in\mathbb{Z}$ there exist universal polynomials $T_\delta(x,y,z,t)$ of degree $\delta$ ($T_\delta = 0$ if $\delta < 0$) with the following property. Given $\delta$ and a pair of a surface $S$ and a
very ample line bundle $L_0$ there exists an $m_0 > 0$ such that for all $m \geq m_0$ and for all very ample line bundles $M$ the line bundle $L := L_0^{\otimes m}\otimes M$ satisfies

$$
a^S_\delta(L) = T_\delta(L^2, LK_S, K_S^2, c_2(S)).
$$

Remark 2.2. Note that the statement is slightly stronger than that of [V] in the case $\delta \leq 6$. We expect $L$ does not have to be a high power of a very ample line bundle but that it suffices that $L$ is sufficiently ample. In fact Conjecture 5.3 below implies that if $L$ is $(5\delta-1)$-very ample (see Sect. 5), then the formula of Conjecture 2.1 holds for up to $\delta$ nodes.

Assuming Conjecture 2.1 we will in the future just write

$$
T^S_\delta(x,y) := T_\delta(x,y,K_S^2,c_2(S)),\qquad
t^S_\delta(L) := T^S_\delta(L^2,LK_S),\qquad
T(S,L) := \sum_{\delta\geq 0} t^S_\delta(L)\,x^\delta .
$$
The aim of this note is to give a conjectural formula for the generating function $T(S,L)$, and to give some evidence for it. We start by noting that Conjecture 2.1 imposes rather strong restrictions on the structure of $T(S,L)$. The point is that the conjecture applies to all surfaces, including those with several connected components. In this case we will write $|L|$ for $\mathbb{P}(H^0(L))$. By definition $W^S_\delta(L)$ includes only $f\in|L|$ which do not vanish identically on a component of $S$.

Proposition 2.3. Assume Conjecture 2.1. Then there exist universal power series $A_1, A_2, A_3, A_4\in\mathbb{Q}[[x]]$ such that

$$
T(S,L) = A_1^{L^2}A_2^{LK_S}A_3^{K_S^2}A_4^{c_2(S)} .
$$

Proof. Fix $\delta_0\in\mathbb{Z}_{>0}$. It is enough to show the result up to order $\delta_0$ in $x$. Assume that $S = S_1\sqcup S_2$ and that $L_1 := L|_{S_1}$ and $L_2 := L|_{S_2}$ are both sufficiently ample so that the $W^{S_i}_\delta(L_i)$ have codimension $\delta$ and degree $t^{S_i}_\delta(L_i)$ in $|L_i|$ for $i = 1, 2$ and $\delta < \delta_0$. Fix $\delta < \delta_0$. The map $(f+g)\mapsto(f,g)$ defines a surjective morphism $p : U\to|L_1|\times|L_2|$, defined on the open subset $U\subset|L|$ where neither $f$ nor $g$ vanishes identically. The fibres of $p$ are lines in $|L|$. Obviously

$$
W^S_\delta(L) = p^{-1}\Big(\coprod_{\delta_1+\delta_2=\delta}W^{S_1}_{\delta_1}(L_1)\times W^{S_2}_{\delta_2}(L_2)\Big).
$$

In particular $W^S_\delta(L)$ has codimension $\delta$ in $|L|$ and modulo the ideal generated by $x^{\delta_0}$ we have $T(S,L)\equiv T(S_1,L_1)\,T(S_2,L_2)$. Now choose $n\in\mathbb{Z}_{>0}$ such that Conjecture 2.1 holds for $Z_1 := (\mathbb{P}^2,\mathcal{O}(n))$, $Z_2 := (\mathbb{P}^2,\mathcal{O}(2n))$, $Z_3 := (\mathbb{P}^2,\mathcal{O}(3n))$ and $Z_4 := (\mathbb{P}^1\times\mathbb{P}^1,\mathcal{O}(n,n))$ for all $\delta\leq\delta_0$. Let $S_n = S_n(a_1,a_2,a_3,a_4)$ be the disjoint union of $a_i$ copies of each of the $Z_i$ with $a_i\in\mathbb{Z}_{\geq 0}$. Then by the above argument $T(S_n,L)\equiv\prod_i T(Z_i)^{a_i}$ up to order $\delta_0$ in $x$. Note that the 4-tuple $(L^2, LK_S, K_S^2, c_2(S))$ takes on the $Z_i$ the linearly independent values $(n^2,-3n,9,3)$, $(4n^2,-6n,9,3)$, $(9n^2,-9n,9,3)$, $(2n^2,-4n,8,4)$. Therefore there are power series $A_{1,n}, A_{2,n}, A_{3,n}, A_{4,n}\in\mathbb{Q}[[x]]$ such that $T(S_n,L)\equiv A_{1,n}^{L^2}A_{2,n}^{LK_S}A_{3,n}^{K_S^2}A_{4,n}^{c_2(S)}$ up to degree $\delta_0$ in $x$. On the other hand the image of the $S_n(a_1,a_2,a_3,a_4)$ is Zariski-dense in $\mathbb{Q}^4$, therefore this formula also holds for arbitrary $S$ up to degree $\delta_0$ in $x$. Finally put $A_i = \lim_{n\to\infty}A_{i,n}$ ($i = 1,\ldots,4$), and the result follows.
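The proof uses that the four vectors of invariants $(L^2, LK_S, K_S^2, c_2(S))$ attached to $(\mathbb{P}^2,\mathcal{O}(n))$, $(\mathbb{P}^2,\mathcal{O}(2n))$, $(\mathbb{P}^2,\mathcal{O}(3n))$ and $(\mathbb{P}^1\times\mathbb{P}^1,\mathcal{O}(n,n))$ are linearly independent. This can be confirmed by an exact determinant computation; the sketch below does this for a few values of $n$. The helper names `det` and `invariants` are hypothetical and chosen only for this illustration.

```python
def det(m):
    """Exact determinant by Laplace expansion along the first row.

    Integer arithmetic only, so there is no rounding; fine for a 4x4 matrix.
    """
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def invariants(n):
    """Rows (L^2, L.K_S, K_S^2, c_2(S)) for the four pairs used in the proof."""
    return [
        [n * n, -3 * n, 9, 3],      # (P^2, O(n))
        [4 * n * n, -6 * n, 9, 3],  # (P^2, O(2n))
        [9 * n * n, -9 * n, 9, 3],  # (P^2, O(3n))
        [2 * n * n, -4 * n, 8, 4],  # (P^1 x P^1, O(n, n))
    ]

for n in (1, 2, 3):
    print(n, det(invariants(n)))
```

The determinant equals $72n^3$, which is nonzero for every $n > 0$, so the four rows are indeed linearly independent.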
We recall some facts about quasimodular forms from [K-Z]. We denote by H := {τ ∈ C | ℑ(τ) > 0} the complex upper half plane, and for τ ∈ H we write q := e^{2πiτ}. A modular form of weight k for SL(2, Z) is a holomorphic function f on H satisfying
\[ f\Bigl(\frac{a\tau+b}{c\tau+d}\Bigr) = (c\tau+d)^k f(\tau), \qquad \tau\in H,\quad \begin{pmatrix} a & b\\ c & d\end{pmatrix} \in SL(2,\mathbb{Z}), \]
and having a Fourier series f(τ) = Σ_{n=0}^∞ a_n q^n. Writing σ_k(n) := Σ_{d|n} d^k, the Eisenstein series
\[ G_k(\tau) = -\frac{B_k}{2k} + \sum_{n>0}\sigma_{k-1}(n)\,q^n, \qquad k\ge 2,\quad B_k = k\text{th Bernoulli number}, \]
are for even k > 2 modular forms of weight k, while G2(τ) is only a quasimodular form. Another important modular form is the discriminant
\[ \Delta(\tau) = q\prod_{k>0}(1-q^k)^{24} = \eta(\tau)^{24}, \]
where η(τ) is the Dedekind η function. For the precise definition of quasimodular forms see [K-Z]. They are essentially the holomorphic parts of almost holomorphic modular forms. The ring of modular forms for SL(2, Z) is just Q[G4, G6], while the ring of quasimodular forms is Q[G2, G4, G6]. We denote by D the differential operator
\[ D := \frac{1}{2\pi i}\frac{d}{d\tau} = q\,\frac{d}{dq}. \]
Unlike the ring of modular forms the ring of quasimodular forms is closed under differentiation, i.e. for a quasimodular form f of weight k the derivative Df is a quasimodular form of weight k + 2. In fact every quasimodular form has a unique representation as a sum of derivatives of modular forms and of G2 (see [K-Z]).
Conjecture 2.4. There exist universal power series B1, B2 in q such that
\[ \sum_{\delta\in\mathbb{Z}} t_\delta^S(L)\,(DG_2(\tau))^\delta = \frac{(DG_2(\tau)/q)^{\chi(L)}\, B_1(q)^{K_S^2}\, B_2(q)^{LK_S}}{\bigl(\Delta(\tau)\, D^2G_2(\tau)/q^2\bigr)^{\chi(\mathcal{O}_S)/2}}. \]
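The series entering Conjecture 2.4 are easy to expand explicitly. The following Python sketch is an illustration only (the function names and the truncation order `N` are choices made for this example): it computes the first q-coefficients of G2, DG2, D²G2 and of the discriminant, and multiplies the discriminant by D²G2 so that one can see that DG2/q and the discriminant times D²G2/q² both start with 1.

```python
from fractions import Fraction

N = 8  # keep coefficients of q^0 .. q^N (arbitrary choice for this sketch)

def sigma(k, n):
    """Divisor power sum sigma_k(n)."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def G2():
    # Constant term -B_2/(2*2) = -1/24, since B_2 = 1/6.
    return [Fraction(-1, 24)] + [Fraction(sigma(1, n)) for n in range(1, N + 1)]

def D(series):
    """D = q d/dq multiplies the coefficient of q^n by n."""
    return [n * c for n, c in enumerate(series)]

def discriminant():
    """Coefficients of q * prod_{k>=1} (1 - q^k)^24 up to q^N."""
    prod = [1] + [0] * N
    for k in range(1, N + 1):
        for _ in range(24):
            # In-place multiplication by (1 - q^k); descending order keeps old values intact.
            for i in range(N, k - 1, -1):
                prod[i] -= prod[i - k]
    return [0] + prod[:N]

def mul(a, b):
    """Product of two series truncated at q^N."""
    out = [0] * (N + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= N:
                out[i + j] += x * y
    return out

DG2 = D(G2())
D2G2 = D(DG2)
Delta = discriminant()

print("DG2          :", [int(c) for c in DG2[:6]])
print("D^2 G2       :", [int(c) for c in D2G2[:6]])
print("Delta        :", Delta[:6])
print("Delta*D^2 G2 :", [int(c) for c in mul(Delta, D2G2)[:6]])
```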
Remark 2.5. (1) Using the fact that DG2(τ)/q, B1(q), B2(q), Δ(τ)D²G2(τ)/q² are power series in q starting with 1, and by the standard formulas χ(O_S) = (K_S² + c2(S))/12, χ(L) = (L² − LK_S)/2 + χ(O_S) one sees that Conjecture 2.4 expresses the t_δ^S(L) as polynomials of degree δ in L², K_S L, K_S², c2(S). (2) I have checked that Conjecture 2.4 reproduces the formulas of Vainsencher and Kleiman–Piene for δ ≤ 8. This determines B1(q) and B2(q) up to degree 8 in q. In Remark 4.2 below we use the formulas of [C-H] for the Severi degrees in P² to determine the coefficients of B1(q) and B2(q) up to degree 28 (they are given here up to degree 20).
B1(q) ≡ 1 − q − 5 q^2 + 39 q^3 − 345 q^4 + 2961 q^5 − 24866 q^6 + 207759 q^7 − 1737670 q^8 + 14584625 q^9 − 122937305 q^10 + 1040906771 q^11 − 8852158628 q^12 + 75598131215 q^13 − 648168748072 q^14 + 5577807139921 q^15 − 48163964723088 q^16 + 417210529188188 q^17 − 3624610235789053 q^18 + 31575290280786530 q^19 − 275758194822813754 q^20 + O(q^21),
B2(q) ≡ 1 + 5 q + 2 q^2 + 35 q^3 − 140 q^4 + 986 q^5 − 6643 q^6 + 48248 q^7 − 362700 q^8 + 2802510 q^9 − 22098991 q^10 + 177116726 q^11 − 1438544962 q^12 + 11814206036 q^13 − 97940651274 q^14 + 818498739637 q^15 − 6888195294592 q^16 + 58324130994782 q^17 − 496519067059432 q^18 + 4247266246317414 q^19 − 36488059346439524 q^20 + O(q^21).
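As a sanity check of how the conjectural generating function produces numbers, one can extract t_δ^S(L) from it by machine. Taking the coefficient of (DG2)^δ by the usual residue argument, t_δ^S(L) is the coefficient of q^δ in B1^{K_S²} · B2^{LK_S} · (DG2/q)^r · (D²G2/q) · (Δ D²G2/q²)^{−χ(O_S)/2} with r = χ(L) − 1 − δ. The Python sketch below implements this under stated assumptions: the function names, the truncation order `N` and the use of the J.C.P. Miller recurrence for rational powers are choices of this example, and only the coefficients of B1, B2 up to q^6 quoted above are used. For the plane it reproduces 3(d − 1)² for δ = 1 and the values 21 and 225 for 2-nodal cubics and quartics.

```python
from fractions import Fraction

N = 6  # truncation order in q, so delta <= N

# Coefficients of B1 and B2 up to q^6, copied from Remark 2.5.
B1 = [1, -1, -5, 39, -345, 2961, -24866]
B2 = [1, 5, 2, 35, -140, 986, -6643]

def sigma1(n):
    return sum(d for d in range(1, n + 1) if n % d == 0)

def mul(a, b):
    """Product of two series truncated at q^N."""
    out = [Fraction(0)] * (N + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= N:
                out[i + j] += x * y
    return out

def power(f, alpha):
    """f**alpha for a series with f[0] == 1 and rational alpha (Miller recurrence)."""
    alpha = Fraction(alpha)
    g = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        g[n] = sum((k * alpha - n + k) * f[k] * g[n - k] for k in range(1, n + 1)) / n
    return g

# Normalised series, each starting with 1.
DG2_over_q = [(j + 1) * sigma1(j + 1) for j in range(N + 1)]
D2G2_over_q = [(j + 1) ** 2 * sigma1(j + 1) for j in range(N + 1)]
Delta_over_q = [1] + [0] * N
for k in range(1, N + 1):
    for _ in range(24):
        for i in range(N, k - 1, -1):  # multiply in place by (1 - q^k)
            Delta_over_q[i] -= Delta_over_q[i - k]

def t(delta, L2, LK, K2, c2):
    """Conjectural t_delta as a function of (L^2, L.K_S, K_S^2, c_2(S))."""
    chi_O = Fraction(K2 + c2, 12)
    chi_L = Fraction(L2 - LK, 2) + chi_O
    r = chi_L - 1 - delta
    series = power(B1, K2)
    for factor in (power(B2, LK),
                   power(DG2_over_q, r),
                   D2G2_over_q,
                   power(mul(Delta_over_q, D2G2_over_q), -chi_O / 2)):
        series = mul(series, factor)
    return series[delta]

def t_plane(delta, d):
    """Specialisation to (P^2, O(d)): L^2 = d^2, L.K = -3d, K^2 = 9, c_2 = 3."""
    return t(delta, d * d, -3 * d, 9, 3)

for d in range(1, 6):
    print(d, t_plane(1, d), t_plane(2, d))
```

For δ = 1 the output agrees with the classical polynomial 3L² + 2LK_S + c2(S), which on the plane is 3(d − 1)².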
A Conjectural Generating Function for Numbers of Curves
Remark 2.6. We give a reformulation of the conjecture. We define, for all l, m, r ∈ Z,
\[ n_r^S(l,m) := T^S_{l+\chi(\mathcal{O}_S)-1-r}(2l+m,\, m), \qquad m_g^S(l,m) := n^S_{g-m-2+\chi(\mathcal{O}_S)}(l,m). \]
If L is sufficiently ample with respect to δ = χ(L) − 1 − r and S (and thus in particular χ(L) = H^0(S, L) and r ≥ 0), then n_r^S((L² − LK_S)/2, LK_S) counts the δ-nodal curves in a general r-codimensional sub-linear system of |L|. Then
\[ \sum_{l\in\mathbb{Z}} n_r^S(l,m)\,q^l = B_1(q)^{K_S^2}\, B_2(q)^m\, \frac{(DG_2(\tau))^r\, D^2G_2(\tau)}{\bigl(\Delta(\tau)\,D^2G_2(\tau)\bigr)^{\chi(\mathcal{O}_S)/2}}. \]
An irreducible δ-nodal curve C on S has geometric genus g(C) = (L² + LK_S)/2 + 1 − δ. We take the same definition also if C is reducible. For L sufficiently ample m_g^S((L² − LK_S)/2, LK_S) counts the nodal curves C with g(C) = g in a general (g − LK_S + χ(O_S) − 2)-codimensional sub-linear system of |L|.
Proof. If f(q) and g(q) are power series in q and g(q) starts with q, then we can develop f(q) as a power series in g(q) and
\[ \mathrm{Coeff}_{g(q)^k} f(q) = \mathrm{Res}_{g(q)=0}\, \frac{f(q)\,dg(q)}{g(q)^{k+1}} = \mathrm{Coeff}_{q^0}\, \frac{f(q)\,Dg(q)}{g(q)^{k+1}}. \]
We apply this with g(q) = DG2(τ).
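The coefficient extraction used in this proof can be tested numerically. The following Python sketch is an illustration under stated assumptions (all function names, the truncation order and the test series are choices of this example): it expands a power series f in powers of g = DG2 in two ways, once by peeling off powers of g term by term and once through the formula Coeff_{q^0} f·Dg/g^{k+1}, so that the two lists of coefficients can be compared.

```python
from fractions import Fraction

def sigma1(n):
    return sum(d for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two series truncated at q^N."""
    out = [Fraction(0)] * (N + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= N:
                out[i + j] += x * y
    return out

def inverse(u, N):
    """Multiplicative inverse of a series u with u[0] != 0, truncated at q^N."""
    inv = [Fraction(1) / u[0]] + [Fraction(0)] * N
    for n in range(1, N + 1):
        s = sum(u[i] * inv[n - i] for i in range(1, min(n, len(u) - 1) + 1))
        inv[n] = -s / u[0]
    return inv

def coeff_by_residue(f, g, k, N):
    """Coeff_{q^0} of f * Dg / g^(k+1), for g = q * u with u[0] != 0 and k + 1 <= N."""
    Dg = [n * c for n, c in enumerate(g)]
    u_inv = inverse(g[1:], N)
    w = [Fraction(1)] + [Fraction(0)] * N
    for _ in range(k + 1):
        w = mul(w, u_inv, N)
    # Dividing by q^(k+1) shifts the wanted coefficient to position k + 1.
    return mul(mul(f, Dg, N), w, N)[k + 1]

def expansion_coeffs(f, g, K, N):
    """Coefficients c_0..c_K with f = sum_k c_k g^k, found by successive subtraction."""
    rem = [Fraction(c) for c in f] + [Fraction(0)] * (N + 1 - len(f))
    gk = [Fraction(1)] + [Fraction(0)] * N
    coeffs = []
    for k in range(K + 1):
        c = rem[k] / gk[k]          # g^k starts with g[1]^k * q^k
        coeffs.append(c)
        rem = [r - c * x for r, x in zip(rem, gk)]
        gk = mul(gk, g, N)
    return coeffs

N = 7
g = [0] + [n * sigma1(n) for n in range(1, N + 1)]   # DG2 = q + 6q^2 + 12q^3 + ...
f = [1, 2, 3, 4, 5, 6, 7, 8]                          # an arbitrary test series
direct = expansion_coeffs(f, g, N - 1, N)
residue = [coeff_by_residue(f, g, k, N) for k in range(N)]
print(direct)
print(residue)
```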
3. Counting Curves on K3 Surfaces and Abelian Surfaces
Let now S be a surface with numerically trivial canonical divisor. We denote n_r^S(l) := n_r^S(l, 0), i.e. for L sufficiently ample n_r^S(L²/2) is the number of (χ(L) − r − 1)-nodal curves in an r-codimensional sub-linear system of |L|. n_r^S(l) can be expressed in terms of quasimodular forms. For S a K3 surface, A an abelian surface and F an Enriques or bielliptic surface we get
\[ \sum_{l\in\mathbb{Z}} n_r^S(l)\,q^l = (DG_2(\tau))^r/\Delta(\tau), \tag{3.0.1} \]
\[ \sum_{l\in\mathbb{Z}} n_r^A(l)\,q^l = (DG_2(\tau))^r\, D^2G_2(\tau), \tag{3.0.2} \]
\[ \sum_{l\in\mathbb{Z},\, r\ge 0} n_r^A(l)\,q^l\,\frac{z^r}{r!} = \frac{1}{z}\, D\exp\bigl(DG_2(\tau)\,z\bigr), \]
\[ \sum_{l\in\mathbb{Z}} n_r^F(l)\,q^l = (DG_2(\tau))^r\,\bigl(D^2G_2(\tau)/\Delta(\tau)\bigr)^{1/2}. \tag{3.0.3} \]
Note that m_g^S(l) = n_g^S(l), m_g^F(l) = n_{g−1}^F(l), m_g^A(l) = n_{g−2}^A(l).
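Formulas (3.0.1) and (3.0.2) are explicit enough to tabulate. The Python sketch below is an illustration only (function names and the truncation order `N` are choices of this example): it lists the first coefficients of (DG2)^r/Δ and of (DG2)^r·D²G2. For r = 0 the K3 series starts 1, 24, 324, 3200, the counts of rational curves of [Y-Z] and [B], and the abelian series has coefficient l²σ1(l) at q^l.

```python
N = 6  # number of coefficients kept beyond the leading one (arbitrary)

def sigma1(n):
    return sum(d for d in range(1, n + 1) if n % d == 0)

def mul(a, b):
    """Product of two series truncated after N + 1 coefficients."""
    out = [0] * (N + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= N:
                out[i + j] += x * y
    return out

def q_over_delta():
    """Coefficients of q/Delta = prod_{k>=1} (1 - q^k)^(-24)."""
    prod = [1] + [0] * N
    for k in range(1, N + 1):
        for _ in range(24):
            for i in range(k, N + 1):   # in-place multiplication by 1/(1 - q^k)
                prod[i] += prod[i - k]
    return prod

DG2_OVER_Q = [(j + 1) * sigma1(j + 1) for j in range(N + 1)]
D2G2_OVER_Q = [(j + 1) ** 2 * sigma1(j + 1) for j in range(N + 1)]

def k3_counts(r):
    """Entry j is n_r^S(l) for l = r - 1 + j, read off from (DG2)^r / Delta."""
    series = q_over_delta()
    for _ in range(r):
        series = mul(series, DG2_OVER_Q)
    return series

def abelian_counts(r):
    """Entry j is n_r^A(l) for l = r + 1 + j, read off from (DG2)^r * D^2 G2."""
    series = list(D2G2_OVER_Q)
    for _ in range(r):
        series = mul(series, DG2_OVER_Q)
    return series

print("K3, r = 0     :", k3_counts(0))
print("K3, r = 1     :", k3_counts(1))
print("abelian, r = 0:", abelian_counts(0))
```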
Remark 3.1. In the case of an abelian surface or a K3 surface we expect that the numbers n_r^A(l) and n_r^S(l) have a more interesting geometric significance. (1) Let (S, L) be a polarized K3 surface with Pic(S) = ZL. Then the linear system |L| contains only irreducible curves. The number n_r^S(L²/2) is a count of curves C ∈ |L| of geometric genus r passing through r general points on S. In this case the numbers of rational curves have been calculated in [Y-Z] and [B] and (3.0.1) is a generalization to arbitrary genus. The rational curves that are counted are not necessarily nodal. In the count a rational curve C is assigned the Euler number e(J̄C) of its compactified Jacobian (which is 1 if C is immersed) as multiplicity. Let M̄_{g,n}(S, β) be the moduli space of n-pointed genus g stable maps of homology class β (see [K-M]). It comes equipped with an evaluation map µ to S^n. In [F-G-vS] it is shown that for a rational curve C on a K3 surface e(J̄C) is just the multiplicity of M̄_{0,0}(S, [L]) at the point defined by the normalization of C. Here [L] denotes the homology class Poincaré dual to c1(L). In other words n_0^S(L²/2) is just the length of the 0-dimensional scheme M̄_{0,0}(S, [L]). I expect that for curves of arbitrary genus the corresponding result should hold: If S and L are general and x is a general point in S^r then the fiber µ^{−1}(x) ⊂ M̄_{r,r}(S, [L]) should be a finite scheme and n_r^S(L²/2) should just be its length. More generally n_r^S(L²/2) should be a generalized Gromov-Witten invariant as defined and studied in [Br-Le] in the symplectic setting and in [Be-F] in the algebraic geometric setting. In the meantime this invariant has been computed in [Br-Le] for curves of arbitrary genus on K3 surfaces confirming the conjecture. (2) Let A be an abelian surface with a very ample line bundle L. We claim that in general all the curves counted in n_r^A(L²/2) will be irreducible: The set of δ-nodal curves in |L| has expected dimension χ(L) − δ − 1 = L²/2 − δ − 1. On the other hand the set of reducible δ-nodal curves C1 + . . . + Cn ∈ |L| with Ci ∈ |Li| has expected dimension
\[ \sum_{i=1}^{n} (L_i^2/2 - 1) - \delta + \sum_{1\le i<j\le n} L_iL_j = L^2/2 - \delta - n. \]
Therefore in general n_r^A(L²/2) should count the irreducible curves C ∈ |L| of geometric genus r + 2 passing through r general points. If |L| does not contain nonreduced curves (or more generally δ is smaller than the codimension of the locus of nonreduced curves in |L|) we expect that this result holds in a modified form also if L is not required to be sufficiently ample, and if not all the curves in |L| are immersed. The moduli space M̄_{g,n}(A, [L]) is naturally fibered over Pic^L(A). The fibers are the spaces M̄_{g,n}(A, |M|) of stable maps ϕ : W → A with ϕ_*(W) a divisor in |M|, where c1(M) = c1(L). Again for A and L general and x a general point in A^r the number n_r^A(L²/2) should be the length of the fiber µ^{−1}(x) ⊂ M̄_{r+2,r}(A, |L|). The condition that |L| does not contain nonreduced curves is necessary: for instance if L is p times a principal polarization (for p a prime) on a general abelian surface, then the number of genus 2 curves in |L| is p^4(1 + p + p^2 + p^3) instead of p^4(1 + p + p^2) as pointed out to me by O. Debarre. By Sect. 5 below nonreduced curves occurring in codimension ≤ δ in |L| should contribute to t_δ^A(L).
We want to show Conjecture 2.4 and the expectations of Remark 3.1 for abelian surfaces in a special case. Essentially the same result has been shown independently in [De].
Theorem 3.2. Let A be an abelian surface with an ample line bundle L such that c1(L) is a polarization of type (1, n). Assume that A does not contain elliptic curves. Write again σ1(n) = Σ_{d|n} d. Then the number of genus 2 curves in |L| is n²σ1(n). Moreover all these curves are irreducible and immersed, and the moduli space M̄_{2,0}(A, |L|) consists of n²σ1(n) points corresponding to their normalizations.
Proof. We will denote by [C] the homology class of a curve C and by [L] the Poincaré dual of c1(L). For a divisor D we write c1(D) for c1(O(D)). Let ϕ : C → A be a morphism from a connected nodal curve of arithmetic genus 2 to A, with ϕ_*([C]) = [L]. Put D := ϕ(C). As A does not contain curves of genus 0 or 1, C must be irreducible and smooth, and ϕ must be generically injective. In particular [D] = [L]. Let J(C) be the Jacobian of C. We freely use standard results about Jacobians of curves, see ([L-B] Chap. 11) for reference. For each c ∈ C the Abel-Jacobi map α_c : C → J(C) is an embedding with α_c(c) = 0. We write C_c := α_c(C) and θ_C := c1(α_c(C)). By the Torelli theorem the isomorphism class of C is determined uniquely by the pair (J(C), θ_C). For all a ∈ A we denote by t_a the translation by a. By the universal property of the Jacobian there is a unique isogeny ϕ̃ : J(C) → A such that ϕ = t_{ϕ(c)} ∘ ϕ̃ ∘ α_c for all c ∈ C. As ϕ̃ is étale, ϕ is an immersion and ϕ : C → D is the normalization map. We also see that ϕ̃^*(c1(L)) = nθ_C. On the other hand, let (B, γ) be a principally polarized abelian surface and ψ : B → A an isogeny with ψ^*(c1(L)) = nγ. By the criterion of Matsusaka-Ran and the assumption that A does not contain elliptic curves we obtain that (B, γ) = (J(C), θ_C) for C a smooth curve of genus 2 and ψ = ϕ̃ for a morphism ϕ : C → A with ϕ_*([C]) = [L]. ϕ̃ depends only on ϕ up to composition with a translation in A and ϕ = t_{ϕ(c)} ∘ ϕ̃ ∘ α_c is determined by ϕ̃ up to translation in A. By the universal property of J(C) applied to the embedding α_c, an automorphism ψ of C induces an automorphism ψ̂ of J(C). If an automorphism of J(C) preserves θ_C under pullback, then it is ψ̂ for some automorphism ψ of C.
(H^0(J(C), O(C_c)) = 1, and a ↦ t_a^*(O(C_c)) defines an isomorphism J(C) → Pic^{θ_C}(J(C)).) Therefore we see that the set M1 of morphisms ϕ : C → A from curves of genus 2 with ϕ_*([C]) = [L] modulo composition with automorphisms of C and with translations of A can be identified with the set M2 of morphisms ψ : B → A from a principally polarized abelian surface (B, γ), such that ψ^*(c1(L)) = nγ modulo composition with automorphisms η : B → B with η^*(γ) = γ. We write A = C²/Γ and B = C²/Λ. Then c1(L) is given by an alternating form a : Γ × Γ → Z such that there is a basis x1, x2, y1, y2 of Γ with a(x1, y1) = 1, a(x2, y2) = n, a(x1, x2) = a(y1, y2) = 0. A homomorphism ψ : B → A is given by a linear map ψ̂ : C² → C² with ψ̂(Λ) ⊂ Γ. We see that M2 can be identified with the set M3 of sublattices Λ ⊂ Γ of index n with a(Λ, Λ) ⊂ nZ. We claim that M3 has σ1(n) elements. First we want to see that this claim implies the theorem. Let Pic^L(A) be the group of line bundles on A with first Chern class c1(L). By [L-B] Proposition 4.9 the morphism ϕ_L : A → Pic^L(A); a ↦ t_a^*L is étale of degree n². By the claim this means that for each L1 ∈ Pic^L(A) the linear system |L1| contains precisely n²σ1(n) curves of genus 2. Finally we show the claim. Via the basis x1, y1, x2, y2 (in that order) we identify Γ with Z² × Z². We see that Λ must be of the form Λ′ × Z², where Λ′ is a sublattice of index n in Z², satisfying b(Λ′, Λ′) ⊂ nZ, for the alternating form b defined by a(x1, y1) = 1. Let Λ′ be a sublattice of Z² of index n. We claim that b(Λ′, Λ′) ⊂ nZ. Then the result follows by the well-known fact that the number of sublattices of index n in a rank two lattice is σ1(n). Let L1 := p2(Λ′) for the second projection Z² → Z and put L2 := ker(p2|Λ′). Then L1 = d1Z and L2 = d2Z for d1, d2 ∈ Z with d1d2 = n. Choose x = (k, d1) ∈ Λ′ ∩ p2^{−1}(d1). Then Λ′ is generated by L2 and x, and in particular b(Λ′, Λ′) ⊂ nZ.
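The lattice count at the end of this proof can be reproduced by brute force. The Python sketch below is an illustration only (the function names are assumptions of this example): following the argument above, it lists every index-n sublattice of Z² by a basis (d2, 0), (k, d1) with d1·d2 = n and 0 ≤ k < d2, checks that these lattices are pairwise distinct by reducing them modulo n, verifies that the alternating form b takes values in nZ on each of them, and prints the resulting number n²σ1(n) of genus 2 curves.

```python
def sigma1(n):
    return sum(d for d in range(1, n + 1) if n % d == 0)

def index_n_sublattices(n):
    """Bases ((d2, 0), (k, d1)) with d1 * d2 == n and 0 <= k < d2."""
    bases = []
    for d1 in range(1, n + 1):
        if n % d1 == 0:
            d2 = n // d1
            for k in range(d2):
                bases.append(((d2, 0), (k, d1)))
    return bases

def b(v, w):
    """Standard alternating form on Z^2, normalised so that b((1, 0), (0, 1)) = 1."""
    return v[0] * w[1] - v[1] * w[0]

def image_mod_n(basis, n):
    """Image of the lattice in (Z/n)^2; it determines the lattice because it contains n*Z^2."""
    u, v = basis
    return frozenset(((i * u[0] + j * v[0]) % n, (i * u[1] + j * v[1]) % n)
                     for i in range(n) for j in range(n))

def genus2_curve_count(n):
    """Number of genus 2 curves in |L| for a polarization of type (1, n), as in Theorem 3.2."""
    return n * n * len(index_n_sublattices(n))

for n in range(1, 9):
    lattices = index_n_sublattices(n)
    distinct = {image_mod_n(basis, n) for basis in lattices}
    print(n, len(distinct), sigma1(n), genus2_curve_count(n))
```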
4. Severi Degrees on P² and Rational Ruled Surfaces
The Severi degree N^{d,δ} is the number of plane curves of degree d with δ nodes passing through (d² + 3d)/2 − δ general points. In [R] a recursive procedure for determining the N^{d,δ} is given, and in [C-H] a different recursion formula is proven. Reducible curves are also included in the number N^{d,δ}; they however occur only if d ≤ δ + 1. Furthermore the numbers of irreducible curves can be determined from them ([C-H] see also [Ge]). For simplicity I will write t_δ(d) := t_δ^{P²}(O(d)). If d is sufficiently large with respect to δ, then N^{d,δ} should be equal to t_δ(d):
Conjecture 4.1. If δ ≤ 2d − 2, then N^{d,δ} = t_δ(d).
Remark 4.2. (1) Conjecture 4.1 and [C-H] provide an effective method of determining the coefficients of the two unknown power series B1(q) and B2(q). Using a suitable program I computed the N^{d,δ} via the recursive formula from [C-H] for d ≤ 16 and δ ≤ 30. We write x = DG2(τ). By Conjectures 2.4 and 4.1 one has for all d > 0 modulo the ideal generated by x^{2d−1} the identity
\[ \sum_{\delta\in\mathbb{Z}} N^{d,\delta}\,x^\delta \equiv \exp\bigl(d^2 C_1(x) + d\,C_2(x) + C_3(x)\bigr). \tag{4.2.1} \]
Here C1(x) is known by Conjecture 2.4 and the first k coefficients of C2(x) and C3(x) determine the first k coefficients of B1(q) and B2(q). Taking logarithms on both sides gives, for any two degrees d1 < d2 and δ ≤ 2d1 − 2, a system of two linear equations for the coefficients of x^δ in C2(x) and C3(x). Note that this also gives a test of the conjecture. It is a nontrivial fact that the generating function has the special shape (4.2.1). In particular each pair d1 < d2 with δ ≤ 2d1 − 2 already determines the coefficients of x^δ. (2) The conjecture implies in particular that for δ ≤ 2d − 2 the numbers N^{d,δ} are given by a polynomial P_δ(d) of degree 2δ in d. This was already conjectured in [D-I]. Denote by p_µ(δ) the coefficient of d^{2δ−µ} in P_δ. In [D-I] a conjectural formula for the leading coefficients p_µ(δ) for µ ≤ 6 is given. Kleiman and Piene have determined p_7(δ) and p_8(δ) [K-P]. Using Conjectures 2.4 and 4.1 there is an algorithm to determine the p_µ(δ). Again we use the formula (4.2.1), and collect terms. From knowing the coefficients of C2(x) and C3(x) up to degree 28 we get the p_µ(δ) for µ ≤ 28. Independently in [Ch] the recursive method of [R] is used to compute many of the N^{d,δ} and to show that for δ ≤ d the number N^{d,δ} is indeed given by a polynomial P_δ of degree 2δ in d. He then also uses a Maple program to compute the p_µ(δ) for µ ≤ 12. Our formulas for P_δ and p_µ(δ) coincide with those from [D-I, Ch] and [K-P] for δ ≤ 12. Let [ ] denote the integer part. For µ ≤ 28 we observe that p_µ(δ) is of the form
\[ p_\mu(\delta) = \frac{3^{\delta-[\mu/2]}}{\mu!\,(\delta-[\mu/2])!}\; Q_\mu(\delta), \]
where Q_µ(δ) is a polynomial of degree [µ/2] in δ with integer coefficients, which have only products of powers of 2 and 3 as common factors. In particular
Q_8(δ) = −2^4 (282855 δ^4 − 931146 δ^3 + 417490 δ^2 + 425202 δ + 1141616),
Q_9(δ) = −2^3 3^2 (128676 δ^4 + 268644 δ^3 − 1011772 δ^2 − 1488377 δ − 1724779),
Q_10(δ) = 2^4 3^2 (4345998 δ^5 − 15710500 δ^4 − 3710865 δ^3 + 7300210 δ^2 + 57779307 δ + 98802690).
Note that for d > δ + 1 all the curves are irreducible, so that in this case we get a conjectural formula for the stable Gromov-Witten invariants of P², i.e. the numbers of irreducible curves C of degree d with g(C) ≥ \binom{d−2}{2}.
Remark 4.3. Let Σ_e be a rational ruled surface, E the curve with E² = −e and F a fiber of the ruling. For simplicity denote t^e_δ(n, m) := t_δ^{Σ_e}(O(nF + mE)). [Ch] determined the N^{2,m}_{0,δ} for δ ≤ 9 as polynomials of degree δ in m and several numbers N^{3,m}_{0,δ}. In [Va] a recursion formula very similar to that of [C-H] is proved for the generalized Severi degrees N^{n,m}_{e,δ}, i.e. the number of δ-nodal curves in |nF + mE| which do not contain E as a component. Using a suitable program I computed the N^{n,m}_{e,δ} for e ≤ 4, δ ≤ 10, n ≤ 11 and m ≤ 8. These results are compatible with Conjecture 2.4, if one conjectures that for (n, m) ≠ (1, 0) one has N^{n,m}_{e,δ} = t^e_δ(n, m) if and only if δ ≤ min(2m, n − em) or δ ≤ min(2m, 2n) in case e = 0.
Remark 4.4. A slightly sharpened version of Conjecture 4.1 can be reformulated as saying that N^{d,δ} = t_δ(d) if and only if H^0(P², O(d)) > δ and the locus of nonreduced curves in |O(d)| has codimension bigger than δ. In a similar way the conjecture of Remark 4.3 can be reformulated as saying that N^{n,m}_{e,δ} = t^e_δ(n, m) if and only if H^0(Σ_e, O(nF + mE)) > δ and the locus of curves in |O(nF + mE)| which are nonreduced or contain E as a component has codimension bigger than δ. In view of Proposition 5.2 below one would expect that nonreduced curves contribute to the count of nodal curves, and the recursion formula of [Va] only counts curves not containing E.
Therefore it seems that, at least in the case of P² and of rational ruled surfaces, t_δ^S(L) is the actual number of δ-nodal curves in a general δ-codimensional linear system, unless this cannot be expected for obvious geometrical reasons.
5. Connection with Hilbert Schemes of Points
Let again S be an algebraic surface and let L be a line bundle on S. Let S^{[n]} be the Hilbert scheme of finite subschemes of length n on S, and let Z_n(S) ⊂ S × S^{[n]} denote the universal family with projections p_n : Z_n(S) → S, q_n : Z_n(S) → S^{[n]}. Then L_n := (q_n)_*(p_n)^*(L) is a locally free sheaf of rank n on S^{[n]}.
Definition 5.1. Let S_2^δ ⊂ S^{[3δ]} be the closure (with the reduced induced structure) of the locally closed subset S^δ_{2,0} which parametrizes subschemes of the form ⊔_{i=1}^{δ} Spec(O_{S,x_i}/m²_{S,x_i}), where x1, . . . , xδ are distinct points in S. It is easy to see that S_2^δ is birational to S^{[δ]}. We put
\[ d_n(L) := \int_{S_2^\delta} c_{2\delta}(L_{3\delta}). \]
Following [B-S] we call L k-very ample if for all subschemes Z ⊂ S of length k + 1 the natural map H 0 (S, L) → H 0 (L⊗OZ ) is surjective. In that case it defines a morphism ϕk+1,L : S [k+1] → Grass(k + 1, |L|) to the Grassmannian of (k + 1)-codimensional sublinear systems of |L|. If L and M are very ample, then L⊗k ⊗ M ⊗l is (k + l)-very ample.
Proposition 5.2. Assume L is (3δ − 1)-very ample, then a general δ-dimensional sub-linear system V ⊂ |L| contains precisely d_n(L) curves C_i with ≥ δ singularities. If furthermore L is (5δ − 1)-very ample (5-very ample if δ = 1), then the C_i have precisely δ nodes as singularities.
Proof. Assume first that L is (3δ − 1)-very ample. We apply the Thom-Porteous formula to the restrictions of the evaluation map H^0(S, L) ⊗ O_{S^{[3δ]}} → L_{3δ} to S_2^δ and to S_2^δ \ S^δ_{2,0}. As L is (3δ − 1)-very ample the evaluation map is surjective. Then ([Fu] Ex. 14.3.2) applied to S_2^δ gives that for a general δ-dimensional sub-linear system V ⊂ |L| the class d_n(L) is represented by the class of the finite scheme W of Z ∈ S_2^δ with Z ⊂ D for D ∈ V. The application of ([Fu] Ex. 14.3.2) to S_2^δ \ S^δ_{2,0} and a dimension count give that W lies entirely in S^δ_{2,0}. We note that W is the preimage under ϕ_{3δ,L} of the Schubert cycle σ(W) = {U ∈ Grass(3δ, |L|) | U ∩ V ≠ ∅}. By [Kl] W will be smooth for general V. Now assume that L is (5δ − 1)-very ample. Let V ⊂ |L| again be a general δ-dimensional subsystem of |L|. The Porteous formula applied to the restriction of L_{3δ+3} to S_2^{δ+1} and a dimension count shows that there will be no curves in V with more than δ singularities. Let S^δ_{3,0} ⊂ S^{[5δ]} be the locus of schemes of the form Z1 ⊔ Z2 . . . ⊔ Zδ, where each Z_i is of the form Spec(O_{S,x_i}/(m³ + xy)) with x, y local parameters at x_i and let S_3^δ be the closure. If a curve C with precisely δ singularities does not contain a subscheme corresponding to a point in S_3^δ \ S^δ_{3,0}, then it has δ nodes as only singularities. It is easy to see that S^δ_{3,0} is smooth of dimension 4δ. Applying the Porteous formula to the restriction of L_{5n} to S_3^δ \ S^δ_{3,0} and a dimension count we see that all the curves in V with δ singularities have precisely δ nodes. A similar argument has been used in [H-P] to calculate the numbers t_δ(d) on P² for δ ≤ 3.
The argument that the curves count with multiplicity 1 is taken from there.
Conjecture 5.3. d_δ(L) is a polynomial of degree δ in L², LK_S, K_S², c2(S).
By Proposition 5.2, Conjecture 2.1 follows from Conjecture 5.3. Conjecture 5.3 also gives the hope of proving Conjecture 2.4 via the study of the cohomology of Hilbert schemes of points.
Remark 5.4. Note that one can generalize the above to singular points of arbitrary order: Let µ = (m1, . . . , m_{l(µ)}) where m_i ∈ Z≥2. Let N(µ) := Σ_{i=1}^{l(µ)} \binom{m_i+1}{2} and let S_µ be the closure in S^{[N(µ)]} of the subset of schemes of the form ⊔_{i=1}^{l(µ)} Spec(O_{S,x_i}/m^{m_i}_{S,x_i}). Denote
\[ d_\mu(L) := \int_{S_\mu} c_{2l(\mu)}(L_{N(\mu)}). \]
We call a curve D ∈ |L| of type µ if there are distinct points x1, . . . , x_{l(µ)} in S, such that the ideal I_{D,x_i} is contained in m^{m_i}_{S,x_i}. A straightforward generalization of the proof of Proposition 5.2 shows that, for V a general (N(µ) − 2l(µ))-dimensional sub-linear system of an N(µ)-very ample line bundle |L|, d_µ(L) counts the finite number of curves of type µ in V. Again I expect d_µ(L) to be a polynomial of degree l(µ) in L², LK_S, K_S² and c2(S). In a similar way one can also deal with cusps instead of nodes.
References
[B] Beauville, A.: Counting curves on K3 surfaces. Preprint alg-geom/9701019
[Be-F] Behrend, K., Fantechi, B.: In preparation
[B-S] Beltrametti, M., Sommese, A.J.: Zero cycles and kth order embeddings of smooth projective surfaces. In: Problems on surfaces and their classification, INDAM, London–New York: Academic Press, 1991, pp. 33–44
[Br-Le] Bryan, J., Leung, C.: The enumerative geometry of K3 surfaces and modular forms. Preprint alg-geom/9711031
[C-H] Caporaso, L., Harris, J.: Counting plane curves of any genus. Preprint alg-geom/9608025
[Ch] Choi, Y.: On the degree of Severi varieties. Preprint 1997
[De] Debarre, O.: On the Euler Characteristic of Generalized Kummer Varieties. Preprint alg-geom/9711035
[D-I] Di Francesco, P., Itzykson, C.: Quantum intersection rings. In: The moduli space of curves, eds. R. Dijkgraaf, C. Faber, G. van der Geer, Boston: Birkhäuser, 1995, pp. 81–148
[F-G-vS] Fantechi, B., Göttsche, L., van Straten, D.: Euler number of the compactified Jacobian and multiplicity of rational curves. Preprint 1997
[Fu] Fulton, W.: Intersection theory. Ergebnisse der Mathematik und ihrer Grenzgebiete, Berlin: Springer-Verlag, 1984
[Ge] Getzler, E.: Intersection theory on M1,4 and elliptic Gromov–Witten invariants. Preprint alg-geom/9612004
[H-P] Harris, J., Pandharipande, R.: Severi degrees in Cogenus 3. Preprint alg-geom/9504003
[K-Z] Kaneko, M., Zagier, D.: A generalized Jacobi theta function and quasimodular forms. In: The moduli space of curves, eds. R. Dijkgraaf, C. Faber, G. van der Geer, Boston: Birkhäuser, 1995, pp. 165–172
[Kl] Kleiman, S.: The transversality of a general translate. Compositio Math. 38, 287–297 (1974)
[K-P] Kleiman, S., Piene, R.: Private communication
[K-M] Kontsevich, M., Manin, Y.: Gromov-Witten classes, quantum cohomology and enumerative geometry. Commun. Math. Phys. 164, 525–562 (1994)
[L-B] Lange, H., Birkenhake, C.: Complex Abelian Varieties. Grundlehren der mathematischen Wissenschaften 302, Berlin: Springer-Verlag, 1992
[R] Ran, Z.: Enumerative geometry of singular plane curves. Invent. Math. 97, 447–469 (1989)
[Va] Vakil, R.: Counting curves of any genus on rational ruled surfaces. Preprint alg-geom/9709003
[V] Vainsencher, I.: Enumeration of n-fold tangent hyperplanes to a surface. J. Alg. Geom. 4, 3, 503–526 (1995)
[Y-Z] Yau, S.T., Zaslow, E.: BPS states, string duality, and nodal curves on K3. Nucl. Phys. B 471, 503–512 (1996)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 196, 535 – 570 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Analyticity Properties and Thermal Effects for General Quantum Field Theory on de Sitter Space-Time
Jacques Bros1, Henri Epstein2,3, Ugo Moschella1,3,4
1 Service de Physique Théorique, C.E. Saclay, 91191 Gif-sur-Yvette, France
2 C.N.R.S., France
3 Institut des Hautes Etudes Scientifiques, 91440 Bures-sur-Yvette, France
4 Istituto di Scienze Matematiche Fisiche e Chimiche, Via Lucini 3, 22100 Como, and INFN sez. di Milano, Italy
Received: 1 August 1997 / Accepted: 19 January 1998
Abstract: We propose a general framework for quantum field theory on the de Sitter space-time (i.e. for local field theories whose truncated Wightman functions are not required to vanish). By requiring that the fields satisfy a weak spectral condition, formulated in terms of the analytic continuation properties of their Wightman functions, we show that a geodesical observer will detect in the corresponding “vacuum” a blackbody radiation at temperature T = 1/2πR. We also prove the analogues of the PCT, Reeh-Schlieder and Bisognano-Wichmann theorems.
1. Introduction It is known that, when quantizing fields on a gravitational background, it is generally impossible to characterize the physically relevant vacuum states as the fundamental states for the energy in the usual sense, since there is no such thing as a global energy operator. In the absence of the analogue of an energy-momentum spectrum condition [28, 35], several authors have formulated various alternative prescriptions to select, among the possible representations (vacua) of a quantum field theory, those which can have a meaningful physical interpretation; the adiabatic prescription, the local Hadamard condition, and the conformal criterion, (see [3, 29] and references therein) have proven to be useful for characterizing linear field theories on various kinds of space-times. It is worthwhile to stress immediately that the relevant choice of the vacuum of a quantum field theory on a curved space-time has striking consequences even in the case of free fields: the most celebrated examples are the Hawking thermal radiation on a black-hole background [24, 25, 33], the Unruh effect [36] and the Gibbons and Hawking thermal effect on a de Sitter space-time [19]. As regards interacting field theories on a gravitational background, (i.e. field theories with non-vanishing truncated n-point functions), much less is known. While the
J. Bros, H. Epstein, U. Moschella
property of locality (or local commutativity) of the field observable algebra (i.e. the commutativity of any couple of field observables localized in space-like separated regions) remains a reasonable postulate for all space-time manifolds which are globally hyperbolic, the problem of specifying a representation of the field algebra becomes still more undetermined than in the free-field case. In the latter, the indeterminacy is confined in the two-point functions of the fields, namely in the splitting of the given (c-number) commutators into the correlators at permuted couples of points. In the general interacting case, the indeterminacy of the possible representations is now encoded (in an unknown way) in the properties of the whole sequence of n-point functions of the fields. If the gravitational background is only considered as a general (globally hyperbolic and pseudo-Riemannian) differentiable manifold, this huge indeterminacy cannot be completely reduced by imposing a (well-justified) principle of stability which postulates the existence at each point of the manifold of a Minkowskian scaling limit of the theory satisfying the spectral condition (see [22] and references therein); nor can it be reduced in an operational way by adding the general requirement of a local definiteness criterion (based on the principles of local quantum physics [22]). In such a general context, one should however mention the recent use of microlocal analysis which has allowed the introduction of a wave front set approach to the spectral condition [32, 13]; after having supplied a simple characterization of the free-field Hadamard states, this promising approach has in its program to give information on the n-point functions of interacting fields in perturbation theory. 
On the other side, starting from the remark that in Minkowskian theories the spectral condition can be expressed in terms of analyticity properties of the n-point functions in the complexified space-time manifold [28, 35], one can defend the viewpoint that it may be of particular interest to study quantum field theory on an analytic gravitational background. As a matter of fact, there is one model of an analytic curved universe, and actually the simplest one, that offers the possibility of formulating a global spectral condition for interacting fields which is very close to the usual spectral condition of Minkowski QFT: this is the de Sitter space-time. The de Sitter space-time can be represented as a d-dimensional one-sheeted hyperboloid embedded in a Minkowski ambient space Rd+1 and it can also be seen as a one-parameter deformation of a d-dimensional Minkowski space-time involving a length R. The Lorentz group of the ambient space acts as a relativity group for this space-time, and the very existence of this (maximal) symmetry group explains the popularity of the de Sitter universe as a convenient simple model for developing techniques of QFT on a gravitational background. Moreover, there has been regained interest in the de Sitter metric in the last years, since it has been considered to play a central role in the inflationary cosmologies (see [30] and references therein): a possible explanation of phenomena occurring in the very early universe then relies on an interplay between space-time curvature and thermodynamics and a prominent role is played by the mechanisms of symmetry breaking and restoration in a de Sitter QFT. The geometrical properties of de Sitter space-time and of its complexification actually make it possible to formulate a general approach to QFT on this universe which closely parallels the Wightman approach [28, 35] to the Minkowskian QFT. 
In fact, it is not only the existence of a simple causal structure (inherited from the ambient Minkowski space) and of a global symmetry group (playing the same role as the Poincar´e group) on the real space-time manifold which are similar; but the complexified manifold itself is equipped with domains which are closely similar to the tube domains of the complex Minkowski space. Since these Minkowskian tubes play a crucial role for expressing the spectral condition in terms of analyticity properties of the n-point functions of the the-
Analyticity and Thermal Effects for de Sitter QFT
ory, the previous geometrical remark strongly suggests that analogous complex domains might be used for a global formulation of the spectral condition in de Sitter quantum field theory. This approach has been in fact introduced and used successfully in a study of general two-point functions on de Sitter space-time [7, 9, 10, 31]. As a by-product, it has been shown [10] that a satisfactory characterization of generalized free fields (GFF) on de Sitter space-time, including the preferred family of de Sitter invariant Klein-Gordon field theories (known as Euclidean [19] or Bunch-Davies [12] vacua) can be given in terms of the global analytic structure of their two-point functions in the complexified de Sitter manifold. (For an interesting characterization of these states based on the Schr¨odinger functional picture, in the Klein-Gordon case, see [18].) Moreover, all these theories of GFF were shown to be equivalently characterized by the existence of thermal properties of Gibbons-Hawking type, the temperature T = (2πR)−1 being induced by the curvature of the space. In this paper, we will show that the same ideas and methods can be applied with similar results to a general approach to the theory of interacting quantum fields in de Sitter space-time. In fact, we shall work out an axiomatic program (already sketched at the end of [10]) in which the “spectral condition” is replaced by appropriate global analyticity properties of the n-point vacuum expectation values of the fields (or “Wightman distributions”) in the complexified de Sitter manifold. These postulated analyticity properties are similar to those implied by the usual spectral condition in the Minkowskian case, according to the standard Wightman axiomatic framework. For simplicity, we shall refer to them as to the “weak spectral condition”. 
As a physical support to our weak spectral condition, we shall establish that all interacting fields which belong to this general framework admit a Gibbons-Hawking-type thermal interpretation with the same specifications as the one obtained for GFF’s in [10]. In spite of this remarkable interpretative discrepancy with respect to the Minkowskian quantum fields satisfying the usual spectral condition, we shall see that such basic structural properties as the PCT and Reeh-Schlieder theorems are still valid in this general approach to de Sitter QFT. Furthermore, our global analytic framework also supplies an analytic continuation of the theory to the “Euclidean sphere” of the complexified de Sitter space-time, which is the analogue of the (purely imaginary time) “Euclidean subspace” of the complexified Minkowskian space-time. We will also show that the Wick powers of generalized free fields fit within the framework and we have some indication that our approach is relevant for the study of perturbation theory. The latter will be developed elsewhere. From a methodological viewpoint, one can distinguish (as in the Minkowskian case) two types of developments which can be called according to traditional terminology the “linear” and “non-linear programs”. The linear program, which deals exclusively with the exploitation of the postulates of locality, de Sitter covariance and spectral condition (expressed by linear relations between the various permuted n-point functions, for each fixed value of n) results in the definition of primitive analyticity domains Dn for all the n-point (holomorphic) functions Wn (z1 , ..., zn ) of the theory. Each domain Dn is an open connected subset of the topological product of n copies of the complexified de Sitter hyperboloid. 
As in the Minkowskian case, each primitive domain Dn is not a “natural holomorphy domain”, but it turns out that new regions of analyticity of the functions Wn (contained in the respective holomorphy envelopes of the domains Dn and obtained by geometrical techniques of analytic completion) yield important consequences for the corresponding field theories. A specially interesting example is the derivation of analyticity properties of the functions Wn with respect to any subset of points zi = zi (t) varying simultaneously
on complex hyperbolae interpreted as the (complexified) trajectories of a given timelike Killing vector field on the de Sitter universe. The periodicity with respect to the imaginary part of the corresponding time-parameter t directly implies the interpretation of the obtained analyticity properties of the functions Wn as a KMS-type condition; in view of the general analysis of [23], this gives a thermal interpretation to all the de Sitter field theories considered. Since the above mentioned analyticity property is completely similar to the one which emerges from the Bisognano-Wichmann results in the Minkowskian theory [4] (see our comments below), we shall call the previous result “Bisognano-Wichmann analyticity property of the n-point functions”. The non-linear program, which exploits the Hilbert-space structure of the theory, relies in an essential way on the (quadratic) “positivity inequalities” to be satisfied by the whole sequence of n-point Wightman distributions of the fields; these inequalities just express the existence of the vector-valued distributions defined by the action of field operator products on the “GNS-vacuum state” of the theory. An important issue to be recovered is the fact that these distributions are themselves the boundary values of vectorvalued holomorphic functions from certain complex domains; it is this mathematical fact which is directly responsible for such important features of the theory as the ReehSchlieder property. In the Minkowskian case, this vectorial analyticity is readily obtained from the spectral condition by an argument based on the Laplace transformation. Here, we shall apply an alternative method for establishing vectorial analyticity which directly makes use of the analyticity and positivity properties of the n-point functions. It is based on a general study by V. 
Glaser [20, 21] of positive-type sequences of holomorphic kernels in domains of Cm × Cn , whereby the analyticity of the Wightman n-point functions “propagates” their positivity properties to the complex domain. This method is therefore applicable not only to the Minkowskian and de Sitter QFT but also, in principle, to QFT on more general holomorphic (or real-analytic) space-time manifolds for which the spectral condition would be replaced by an appropriate (possibly local) version of the analyticity properties of the Wightman functions. The structure of the paper is the following: in Sect. 2, we introduce the notations and recall some properties of the de Sitter spacetime and of its complexification; we then formulate our general principles for the interacting fields on this universe, giving a special emphasis on the spectral condition which we propose. In Sect. 3 we explore various consequences of our general principles which are the analogues of standard results of the Minkowskian QFT in the Wightman framework. In particular, we establish the existence of an analytic continuation of the Wightman n-point functions to corresponding primitive domains of analyticity. The PCT property is also shown. In Sect. 4 we come to the physical interpretation of the spectral condition. We first extend the analytic aspect of the Bisognano-Wichmann theorem [4] to the de Sitter case. Then we show that the thermal interpretation, already known for free field theories [19, 10], is still valid in this more general case. In Sect. 5 we prove the validity of the Reeh and Schlieder property. The proof of the relevant vectorial analyticity is given as an application of the above mentioned theorem of Glaser. The paper ends with three appendices where we discuss some more technical points.
2. QFT on the de Sitter Space-Time: The Spectral Condition

We start with some notations and some well-known facts. The (d + 1)-dimensional real (resp. complex) Minkowski space is $\mathbb{R}^{d+1}$ (resp. $\mathbb{C}^{d+1}$) equipped with the scalar product $x \cdot y = x^{(0)} y^{(0)} - x^{(1)} y^{(1)} - \ldots - x^{(d)} y^{(d)}$ with, as usual, $x^2 = x \cdot x$. We thus distinguish a particular Lorentz frame and denote by $e_\mu$ the (d + 1)-vector with $e_\mu^{(\nu)} = \delta_{\mu\nu}$. In this special Lorentz frame, we also distinguish the $(e_0, e_d)$-plane and the corresponding light-like coordinates u and v, namely we put:

$$x = (x^{(0)}, \vec{x}, x^{(d)}), \qquad \vec{x} = (x^{(1)}, \ldots, x^{(d-1)}), \tag{1}$$

$$u = u(x) = x^{(0)} + x^{(d)}, \qquad v = v(x) = x^{(0)} - x^{(d)}, \tag{2}$$

and we introduce, for each $\lambda \in \mathbb{C} \setminus \{0\}$, the special Lorentz transformation $[\lambda]$ such that

$$u([\lambda]x) = \lambda\, u(x), \qquad v([\lambda]x) = \lambda^{-1} v(x), \qquad \vec{x}([\lambda]x) = \vec{x}(x). \tag{3}$$
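As an illustration of Eq. (3), the following minimal sketch implements $[\lambda]$ through the light-like coordinates u, v and checks numerically that it preserves $x \cdot x = uv - \vec{x}^2$, for real as well as complex $\lambda$. The function names `minkowski_sq` and `special_lorentz` and the sample point are our own illustrative choices, not notation from the text.

```python
def minkowski_sq(x):
    """x.x = x0^2 - x1^2 - ... - xd^2 (bilinear, so complex entries are allowed)."""
    return x[0] * x[0] - sum(c * c for c in x[1:])


def special_lorentz(lam, x):
    """Apply [lam] of Eq. (3): u -> lam*u, v -> v/lam, transverse part unchanged."""
    u = x[0] + x[-1]
    v = x[0] - x[-1]
    u_new, v_new = lam * u, v / lam
    # Invert u = x0 + xd, v = x0 - xd to recover the first and last components.
    return [(u_new + v_new) / 2] + list(x[1:-1]) + [(u_new - v_new) / 2]


x = [0.3, -1.2, 0.7, 2.0]  # arbitrary sample point of R^4 (d = 3)
for lam in (2.5, -0.4, 1j, 0.8 + 0.6j):
    y = special_lorentz(lam, x)
    # The product u*v is unchanged, hence so is x.x
    assert abs(minkowski_sq(y) - minkowski_sq(x)) < 1e-12
```

For complex $\lambda$ the image point is complex, which is why the quadratic form is taken bilinear rather than Hermitian.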
The future cone is defined in the real Minkowski space $\mathbb{R}^{d+1}$ as the subset $V_+ = -V_- = \{x \in \mathbb{R}^{d+1} : x^{(0)} > 0,\ x \cdot x > 0\}$ and the future light cone as $C_+ = \partial V_+ = -C_-$. We denote by $x \le y$ the partial order (called causal order) defined by the closure of $V_+$, i.e. $x \le y \Leftrightarrow y - x \in \overline{V_+}$. The d-dimensional real (resp. complex) de Sitter space-time with radius R is identified with the subset of the real (resp. complex) Minkowski space consisting of the points x such that $x^2 = -R^2$ and is denoted $X_d(R)$ or simply $X_d$ (resp. $X_d^{(c)}$). Thus $X_d$ is the one-sheeted hyperboloid

$$X_d = X_d(R) = \{x \in \mathbb{R}^{d+1} : x^{(0)2} - x^{(1)2} - \ldots - x^{(d)2} = -R^2\}. \tag{4}$$
The causal order on $\mathbb{R}^{d+1}$ induces the causal order on $X_d$. The future and past shadows of a given event x in $X_d$ are given by $\Gamma^+(x) = \{y \in X_d : y \ge x\}$, $\Gamma^-(x) = \{y \in X_d : y \le x\}$. Note that if $x^2 = -R^2$ and $r^2 = 0$, then $(x + r)^2 = -R^2$ is equivalent to $x \cdot r = 0$, and remains true if r is replaced with tr for any real t (the same holds in the complex domain). Hence the boundary set

$$\partial\Gamma(x) = \{y \in X_d : (x - y)^2 = 0\} \tag{5}$$

of $\Gamma^+(x) \cup \Gamma^-(x)$ is a cone (“light-cone”) with apex x, the union of all linear generators of $X_d$ containing the point x. Two events x and y of $X_d$ are in “acausal relation”, or “space-like separated”, if $y \notin \Gamma^+(x) \cup \Gamma^-(x)$, i.e. if $x \cdot y > -R^2$.

The relativity group of the de Sitter space-time, called “de Sitter group” in the following, is the connected Lorentz group of the ambient Minkowski space, i.e. $L_+^\uparrow = SO_0(1, d)$, leaving invariant each of the sheets of the cone $C = C_+ \cup C_-$. The connected complex Lorentz group in d + 1 dimensions is denoted $L_+(\mathbb{C})$. We denote by $\sigma$ the $L_+^\uparrow$-invariant volume form on $X_d$ given by

$$\int f(x)\, d\sigma(x) = \int f(x)\, \delta(x^2 + R^2)\, dx^{(0)} \wedge \ldots \wedge dx^{(d)}. \tag{6}$$

$L_+^\uparrow$ acts transitively on $X_d$ and $L_+(\mathbb{C})$ on $X_d^{(c)}$.
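The space-like separation criterion stated above rests on the identity $(x - y)^2 = -2R^2 - 2\,x \cdot y$, valid for x, y on the hyperboloid. The sketch below checks this identity and the resulting equivalence numerically for d = 2 and R = 1; the chart `point_on_X2` and the sample points are hypothetical choices made only for this illustration.

```python
import math

R = 1.0


def dot(x, y):
    """Ambient Minkowski product x.y."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))


def point_on_X2(a, theta):
    """Illustrative chart of the d = 2 hyperboloid: x.x = -R^2 for all (a, theta)."""
    return [R * math.sinh(a),
            R * math.cosh(a) * math.cos(theta),
            R * math.cosh(a) * math.sin(theta)]


def interval_sq(x, y):
    """(x - y)^2 computed in the ambient space."""
    diff = [a - b for a, b in zip(x, y)]
    return dot(diff, diff)


x = point_on_X2(0.0, 0.0)
for y in (point_on_X2(0.0, 1.0), point_on_X2(2.0, 0.0), point_on_X2(-0.7, 2.5)):
    # (x - y)^2 = x^2 + y^2 - 2 x.y = -2R^2 - 2 x.y on the hyperboloid
    assert abs(interval_sq(x, y) - (-2 * R * R - 2 * dot(x, y))) < 1e-9
    # hence (x - y)^2 < 0 exactly when x.y > -R^2
    assert (interval_sq(x, y) < 0) == (dot(x, y) > -R * R)
```

The second sample point lies on the same time-like geodesic as x, so it fails the criterion, while the other two are space-like separated from x.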
The familiar forward and backward tubes are defined in complex Minkowski space as $T_\pm = \mathbb{R}^{d+1} \pm iV_+$, and we denote

$$\mathcal{T}_+ = T_+ \cap X_d^{(c)}, \qquad \mathcal{T}_- = T_- \cap X_d^{(c)}. \tag{7}$$

Since $T_+ \cup T_-$ contains $E_{d+1} = \{z = (iy^{(0)}, x^{(1)}, \ldots, x^{(d)}) : (y^{(0)}, x^{(1)}, \ldots, x^{(d)}) \in \mathbb{R}^{d+1}\}$, the “Euclidean subspace” of the complex Minkowski space-time $\mathbb{C}^{d+1}$, the subset $\mathcal{T}_+ \cup \mathcal{T}_-$ of $X_d^{(c)}$ contains the “Euclidean sphere” $S_d = \{z = (iy^{(0)}, x^{(1)}, \ldots, x^{(d)}) : y^{(0)2} + x^{(1)2} + \ldots + x^{(d)2} = R^2\}$.

We denote by $\mathcal{D}(X_d^n)$ (resp. $\mathcal{S}(X_d^n)$) the space of functions on $X_d^n$ which are restrictions to $X_d^n$ of functions belonging to $\mathcal{D}(\mathbb{R}^{n(d+1)})$ (resp. $\mathcal{S}(\mathbb{R}^{n(d+1)})$). As in the Minkowskian case, the Borchers algebra $\mathcal{B}$ is defined as the tensor algebra over $\mathcal{D}(X_d)$. Its elements are terminating sequences of test-functions $f = (f_0, f_1(x_1), \ldots, f_n(x_1, \ldots, x_n), \ldots)$, where $f_0 \in \mathbb{C}$ and $f_n \in \mathcal{D}(X_d^n)$ for all n ≥ 1, the product and ⋆ operations being given by

$$(fg)_n = \sum_{\substack{p,\,q \in \mathbb{N} \\ p+q=n}} f_p \otimes g_q, \qquad (f^\star)_n(x_1, \ldots, x_n) = \overline{f_n(x_n, \ldots, x_1)}.$$
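The algebraic structure of the Borchers algebra can be illustrated by a finite toy model. In the sketch below, which is only an illustration under simplifying assumptions, the points of $X_d$ are replaced by string labels and each component $f_n$ by a dictionary from n-tuples of labels to complex values; the involution is taken to include complex conjugation, as is standard for Borchers algebras. The names `product` and `star` are hypothetical.

```python
from collections import defaultdict


def product(f, g):
    """(fg)_n = sum over p + q = n of f_p (x) g_q, for terminating sequences f, g."""
    h = [defaultdict(complex) for _ in range(len(f) + len(g) - 1)]
    for p, f_p in enumerate(f):
        for q, g_q in enumerate(g):
            for xs, a in f_p.items():
                for ys, b in g_q.items():
                    # (f_p (x) g_q)(x_1..x_p, y_1..y_q) = f_p(x_1..x_p) * g_q(y_1..y_q)
                    h[p + q][xs + ys] += a * b
    return [dict(component) for component in h]


def star(f):
    """(f*)_n(x_1..x_n) = conjugate of f_n(x_n..x_1)."""
    return [{xs[::-1]: complex(v).conjugate() for xs, v in f_n.items()} for f_n in f]


# Toy elements: f has components f_0, f_1; g has components g_0, g_1, g_2.
f = [{(): 2 + 0j}, {("a",): 1 + 2j, ("b",): -1j}]
g = [{(): 1j}, {("a",): 3 + 0j}, {("a", "b"): 1 - 1j}]

# The star operation is an anti-automorphism: (fg)* = g* f*
assert star(product(f, g)) == product(star(g), star(f))
```

The final assertion is the property that makes the positivity condition $\mathcal{W}(f^\star f) \ge 0$ meaningful as a statement about a ⋆-algebra.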
The action of the de Sitter group $L_+^\uparrow$ on $\mathcal{B}$ is defined by $f \mapsto f_{\{\Lambda_r\}}$, where

$$f_{\{\Lambda_r\}} = (f_0, f_{1\{\Lambda_r\}}, \ldots, f_{n\{\Lambda_r\}}, \ldots), \qquad f_{n\{\Lambda_r\}}(x_1, \ldots, x_n) = f_n(\Lambda_r^{-1}x_1, \ldots, \Lambda_r^{-1}x_n), \tag{8}$$

$\Lambda_r$ denoting any (real) de Sitter transformation.

A quantum field theory (we consider a single scalar field for simplicity) is specified by a continuous linear functional $\mathcal{W}$ on $\mathcal{B}$, i.e. a sequence $\{\mathcal{W}_n \in \mathcal{D}'(X_d^n)\}_{n \in \mathbb{N}}$, where $\mathcal{W}_0 = 1$ and the $\{\mathcal{W}_n\}_{n>0}$ are distributions (Wightman functions) required to possess the following properties:

1. (Covariance). Each $\mathcal{W}_n$ is de Sitter invariant, i.e.

$$\langle \mathcal{W}_n, f_{n\{\Lambda_r\}} \rangle = \langle \mathcal{W}_n, f_n \rangle \tag{9}$$

for all de Sitter transformations $\Lambda_r$. (This is equivalent to saying that the functional $\mathcal{W}$ itself is invariant, i.e. $\mathcal{W}(f) = \mathcal{W}(f_{\{\Lambda_r\}})$ for all $\Lambda_r$.)

2. (Locality).

$$\mathcal{W}_n(x_1, \ldots, x_j, x_{j+1}, \ldots, x_n) = \mathcal{W}_n(x_1, \ldots, x_{j+1}, x_j, \ldots, x_n) \tag{10}$$

if $(x_j - x_{j+1})^2 < 0$, 1 ≤ j < n.

3. (Positive Definiteness). For each $f \in \mathcal{B}$, $\mathcal{W}(f^\star f) \ge 0$. Explicitly, given $f_0 \in \mathbb{C}$, $f_1 \in \mathcal{D}(X_d)$, ..., $f_k \in \mathcal{D}(X_d^k)$, then

$$\sum_{n,m=0}^{k} \langle \mathcal{W}_{n+m}, f_n^\star \otimes f_m \rangle \ge 0. \tag{11}$$
As in the Minkowskian case [37, 6, 28], the GNS construction yields a Hilbert space $\mathcal{H}$, a unitary representation $\Lambda_r \mapsto U(\Lambda_r)$ of $L_+^\uparrow$, a vacuum vector $\Omega \in \mathcal{H}$ invariant under U, and an operator valued distribution $\phi$ such that

$$\mathcal{W}_n(x_1, \ldots, x_n) = (\Omega, \phi(x_1) \ldots \phi(x_n)\,\Omega). \tag{12}$$

The GNS construction also provides the vector valued distributions $\Phi_n^{(b)}$ such that

$$\langle \Phi_n^{(b)}, f_n \rangle = \int f_n(x_1, \ldots, x_n)\, \phi(x_1) \ldots \phi(x_n)\,\Omega\; d\sigma(x_1) \ldots d\sigma(x_n) \tag{13}$$

and a representation $f \to \Phi(f)$ (by unbounded operators) of $\mathcal{B}$ of which the field $\phi$ is a special case: $\phi(f_1) = \int \phi(x) f_1(x)\, d\sigma(x) = \Phi((0, f_1, 0, \ldots))$. For every open set $\mathcal{O}$ of $X_d$ the corresponding polynomial algebra $\mathcal{P}(\mathcal{O})$ of the field $\phi$ is then defined as the subalgebra of $\Phi(\mathcal{B})$ whose elements $\Phi(f_0, f_1, \ldots, f_n, \ldots)$ are such that for all n ≥ 1, $\operatorname{supp} f_n(x_1, \ldots, x_n) \subset \mathcal{O}^n$. The set $D = \mathcal{P}(X_d)\,\Omega$ is a dense subset of $\mathcal{H}$ and one has (for all elements $\Phi(f), \Phi(g) \in \mathcal{P}(X_d)$):

$$\mathcal{W}(f^\star g) = (\Phi(f)\,\Omega, \Phi(g)\,\Omega). \tag{14}$$
Properties 1–3 are literally carried over from the Minkowskian case, but no literal or unique adaptation exists for the usual spectral property. In the (d + 1)-dimensional Minkowskian case, the latter is equivalent to the following: for each n ≥ 2, $\mathcal{W}_n$ is the boundary value in the sense of distributions of a function holomorphic in the tube

$$T_n = \{z = (z_1, \ldots, z_n) \in \mathbb{C}^{n(d+1)} : \operatorname{Im}(z_{j+1} - z_j) \in V_+,\ 1 \le j \le n - 1\}. \tag{15}$$

In the case of the de Sitter space $X_d$ (embedded in $\mathbb{R}^{d+1}$), a natural substitute for this property is to assume that $\mathcal{W}_n$ is the boundary value in the sense of distributions of a function holomorphic in

$$\mathcal{T}_n = X_d^{(c)n} \cap T_n. \tag{16}$$

It will be shown below that $\mathcal{T}_n$ is a domain and a tuboid in the sense of [10], namely a domain which is bordered by the reals in such a way that the notion of “distribution boundary value of a holomorphic function from this domain” remains meaningful. It is thus possible to impose:

4. (Weak spectral condition). For each n > 1, the distribution $\mathcal{W}_n$ is the boundary value of a function $W_n$ holomorphic in the subdomain $\mathcal{T}_n$ of $X_d^{(c)n}$.

It may seem unnatural, in the absence of translational invariance, to postulate analyticity properties in terms of the difference variables $(z_j - z_k)$. Note however that a Lorentz invariant holomorphic function on a subdomain of $X_d^{(c)n}$ depends only on the invariants $z_j \cdot z_k$. Among these the $z_j \cdot z_j$ are fixed and equal to $-R^2$. Such a function therefore depends only on the $(z_j - z_k)^2$.

In the same way as in the Minkowskian case, it may be useful to relax some of the hypotheses 1–3. One may also want to impose:

5. (Temperedness Condition). For each n > 1, there are constants M(n) ≥ 0 and L(n) ≥ 0 such that the distribution $\mathcal{W}_n$ is the boundary value of a function $W_n$ holomorphic in the subdomain $\mathcal{T}_n$ of $X_d^{(c)n}$ satisfying

$$|W_n(x + iy)| \le M(n)\,\bigl(1 + \|x + iy\| + \operatorname{dist}(z, \partial T_n)^{-1}\bigr)^{L(n)}. \tag{17}$$
This global bound (which includes the behaviour of $W_n$ at infinity) will not be indispensable in this paper, but the local part of it (indicating a power behaviour near each point x for y tending to zero) is in fact equivalent to the distribution character of the boundary value of $W_n$ postulated in 4 (see our Remark 1 below).

For completeness, we now recall the definition of tuboids on manifolds (given in [10]). Let $\mathcal{M}$ be a real n-dimensional analytic manifold, $T\mathcal{M} = \bigcup_{x \in \mathcal{M}} (x, T_x\mathcal{M})$ the tangent bundle to $\mathcal{M}$ and $\mathcal{M}^{(c)}$ a complexification of $\mathcal{M}$. If $x_0$ is any point in $\mathcal{M}$, $U_{x_0}$ and $U_{x_0}^{(c)}$ will denote open neighborhoods of $x_0$, respectively in $\mathcal{M}$ and $\mathcal{M}^{(c)}$, such that $U_{x_0} = U_{x_0}^{(c)} \cap \mathcal{M}$; a corresponding neighborhood of $(x_0, 0)$ with basis $U_{x_0}$ in $T\mathcal{M}$ will be denoted $T_{loc}U_{x_0}$.

Definition 1. We call admissible local diffeomorphism at a point $x_0$ any diffeomorphism $\delta$ which maps some neighborhood $T_{loc}U_{x_0}$ of $(x_0, 0)$ in $T\mathcal{M}$ onto a corresponding neighborhood $U_{x_0}^{(c)}$ of $x_0$ in $\mathcal{M}^{(c)}$ (considered as a 2n-dimensional $C^\infty$ manifold) in such a way that the following properties hold:
a) $\forall x \in U_{x_0}$, $\delta[(x, 0)] = x$;
b) $\forall (x, y) \in T_{loc}U_{x_0}$, with $y \neq 0$ ($y \in T_x\mathcal{M}$), the differentiable function $t \to z(t) = \delta[(x, ty)]$ is such that

$$\frac{1}{i}\frac{dz}{dt}(t)\Big|_{t=0} = \alpha y, \quad \text{with } \alpha > 0. \tag{18}$$

A tuboid can now be described as a domain in $\mathcal{M}^{(c)}$ which is bordered by the real manifold $\mathcal{M}$ and whose “shape” near each point of $\mathcal{M}$ is (in the space of Im z and for Im z → 0) very close to a given cone $\Lambda_x$ of the tangent space $T_x\mathcal{M}$ to $\mathcal{M}$ at the point x. The following more precise definitions are needed.

Definition 2. We call “profile” above $\mathcal{M}$ any open subset $\Lambda$ of $T\mathcal{M}$ which is of the form $\Lambda = \bigcup_{x \in \mathcal{M}} (x, \Lambda_x)$, where each fiber $\Lambda_x$ is a non-empty cone with apex at the origin in $T_x\mathcal{M}$ ($\Lambda_x$ can be the full tangent space $T_x\mathcal{M}$).

It is convenient to introduce the “projective representation” $\dot{T}\mathcal{M}$ of $T\mathcal{M}$, namely $\dot{T}\mathcal{M} = \bigcup_{x \in \mathcal{M}} (x, \dot{T}_x\mathcal{M})$, with $\dot{T}_x\mathcal{M} = T_x\mathcal{M} \setminus \{0\} / \mathbb{R}^+$. The image of each point $y \in T_x\mathcal{M}$ in $\dot{T}_x\mathcal{M}$ is $\dot{y} = \{\lambda y;\ \lambda > 0\}$. Each profile $\Lambda$ can then be represented by an open subset $\dot{\Lambda} = \bigcup_{x \in \mathcal{M}} (x, \dot{\Lambda}_x)$ of $\dot{T}\mathcal{M}$ (each fiber $\dot{\Lambda}_x = \Lambda_x / \mathbb{R}^+$ being now a relatively compact set). We also introduce the complement of the closure of $\dot{\Lambda}$ in $\dot{T}\mathcal{M}$, namely the open set $\dot{\Lambda}' = \dot{T}\mathcal{M} \setminus \overline{\dot{\Lambda}} = \bigcup_{x \in \mathcal{M}} (x, \dot{\Lambda}'_x)$ (note that $\dot{\Lambda}'_x \subset \dot{T}_x\mathcal{M} \setminus \overline{\dot{\Lambda}_x}$).

Definition 3. A domain $\Theta$ of $\mathcal{M}^{(c)}$ is called a tuboid with profile $\Lambda$ above $\mathcal{M}$ if it satisfies the following property. For every point $x_0$ in $\mathcal{M}$, there exists an admissible local diffeomorphism $\delta$ at $x_0$ such that:
a) Every point $(x_0, \dot{y}_0)$ in $\dot{\Lambda}$ admits a compact neighborhood $K(x_0, \dot{y}_0)$ in $\dot{\Lambda}$ such that $\delta[\{(x, y);\ (x, \dot{y}) \in K(x_0, \dot{y}_0),\ (x, y) \in T_{loc}U_{x_0}\}] \subset \Theta$.
b) Every point $(x_0, \dot{y}'_0)$ in $\dot{\Lambda}'$ admits a compact neighborhood $K'(x_0, \dot{y}'_0)$ in $\dot{\Lambda}'$ such that $\delta[\{(x, y);\ (x, \dot{y}) \in K'(x_0, \dot{y}'_0),\ (x, y) \in T'_{loc}U_{x_0}\}] \cap \Theta = \emptyset$.
In a) and b), $T_{loc}U_{x_0}$ and $T'_{loc}U_{x_0}$ denote sufficiently small neighbourhoods of $(x_0, 0)$ in $T\mathcal{M}$ which may depend respectively on $y_0$ and $y'_0$, but always satisfy the conditions of Definition 1 with respect to $\delta$. Each fiber $\Lambda_x$ of $\Lambda$ will also be called the profile at x of the tuboid $\Theta$.
Using these notions and the results in Appendix A of [10], we will show the following

Theorem 1. i) Let $z_k = x_k + iy_k \in X_d^{(c)}$, 1 ≤ k ≤ n. The set

$$\mathcal{T}_n = \{z = (z_1, \ldots, z_n) \in X_d^{(c)n};\ y_{j+1} - y_j \in V^+,\ 1 \le j \le n - 1\} \tag{19}$$

is a domain of $X_d^{(c)n}$.
ii) $\mathcal{T}_n$ is a tuboid above $X_d^n$, with profile

$$\Lambda^n = \bigcup_{x \in X_d^n} (x, \Lambda^n_x), \tag{20}$$

where, for each $x = (x_1, \ldots, x_n) \in X_d^n$, $\Lambda^n_x$ is a non-empty open convex cone with apex at the origin in $T_x X_d^n$ defined as follows:

$$\Lambda^n_x = \{\underline{y} = (\underline{y}_1, \ldots, \underline{y}_n);\ \underline{y}_k \in T_{x_k}X_d,\ 1 \le k \le n;\ \underline{y}_{j+1} - \underline{y}_j \in V^+,\ 1 \le j \le n - 1\}. \tag{21}$$
Proof.
a) Let $C_n$ be the open convex cone in $\mathbb{R}^{n(d+1)}$ defined by

$$C_n = \{\underline{y} = (\underline{y}_1, \ldots, \underline{y}_n);\ \underline{y}_k \in \mathbb{R}^{d+1},\ 1 \le k \le n;\ \underline{y}_{j+1} - \underline{y}_j \in V^+,\ 1 \le j \le n - 1\}. \tag{22}$$

The set $\Lambda^n$ defined in Eqs. (20) and (21) can then be seen as the restriction of the open subset $X_d^n \times C_n$ of $X_d^n \times \mathbb{R}^{n(d+1)}$ to the algebraic set with equations $x_j \cdot \underline{y}_j = 0$, 1 ≤ j ≤ n, which represents $TX_d^n$ as a submanifold of $X_d^n \times \mathbb{R}^{n(d+1)}$; $\Lambda^n$ is therefore an open subset of $TX_d^n$. Moreover, for every point $x = (x_1, \ldots, x_n) \in X_d^n$, the set $\Lambda^n_x$ defined in Eq. (21) is an open convex cone in $T_xX_d^n$, being the intersection of the latter with $C_n$. For every x, this cone is non-empty since one can determine at least one vector $\underline{y} = (\underline{y}_1, \ldots, \underline{y}_n) \in \Lambda^n_x$ as follows: $\underline{y}_1$ being chosen arbitrarily in $T_{x_1}X_d$, we can always find $\underline{y}_2 \in T_{x_2}X_d \cap \{\underline{y}_1 + V^+\}$, and then by recursion $\underline{y}_{j+1} \in T_{x_{j+1}}X_d \cap \{\underline{y}_j + V^+\}$, for j ≤ n − 1, because for every point $x_j \in X_d$, $T_{x_j}X_d \cap V^+$ is a non-empty convex cone.

b) Let

$$\Lambda^n_R = \bigcup_{x \in X_d^n} (x, \Lambda^n_{x,R}), \qquad \Lambda^n_{x,R} = \{\underline{y} = (\underline{y}_1, \ldots, \underline{y}_n) \in \Lambda^n_x,\ \underline{y}_j^2 < R^2,\ 1 \le j \le n\}. \tag{23}$$

$\Lambda^n_R$ is (like $\Lambda^n$) an open subset of $TX_d^n$; each fiber $\Lambda^n_{x,R}$ is a non-empty domain in $T_xX_d^n$. This results from the property of $\Lambda^n_x$ proved in a), since the existence of a point in $\Lambda^n_{x,R}$, or of an arc connecting two arbitrary points inside $\Lambda^n_{x,R}$, follows from the corresponding property of $\Lambda^n_x$ by using the dilatation invariance of the latter. It follows that $\Lambda^n_R$ is (like $\Lambda^n$) a connected set and therefore a domain in $TX_d^n$.
c) We now show that there exists a continuous mapping $\mu$ which is one-to-one from $\Lambda^n_R$ onto the set $\mathcal{T}_n \setminus Y^n_R$, where $Y^n_R$ denotes the following subset of real codimension d of $X_d^{(c)n}$:

$$Y^n_R = \{z = (z_1, \ldots, z_n) \in X_d^{(c)n};\ \exists \text{ at least one } j_0 : x_{j_0} = 0\}. \tag{24}$$

Let us consider the following mapping $\mu$:

$$\mu(x, \underline{y}) = z = (z_1, \ldots, z_n), \qquad z_j = x_j + iy_j = \frac{\sqrt{R^2 - \underline{y}_j^2}}{R}\, x_j + i\underline{y}_j, \quad 1 \le j \le n. \tag{25}$$

$\mu$ is defined on the subset $\{TX_d^n\}_R$ of all the elements $(x, \underline{y})$ of $TX_d^n$ such that $\underline{y}_j^2 < R^2$ for 1 ≤ j ≤ n; Eq. (25) implies that (for all j) $z_j^2 = -R^2$ and therefore that $\mu$ is a global diffeomorphism from $\{TX_d^n\}_R$ onto the subset $Z^n_R = X_d^{(c)n} \setminus Y^n_R$ of $X_d^{(c)n}$; clearly, this diffeomorphism maps $\Lambda^n_R$ onto $\mathcal{T}_n \setminus Y^n_R$, and therefore (in view of b)), $\mathcal{T}_n \setminus Y^n_R$ is a domain of $X_d^{(c)n}$. Since all points of $\mathcal{T}_n$ are either interior points or boundary points of $\mathcal{T}_n \setminus Y^n_R$, and since $\mathcal{T}_n = X_d^{(c)n} \cap T_n$ is an open set, it is a domain of $X_d^{(c)n}$.

d) In order to show that $\mathcal{T}_n$ is a tuboid with profile $\Lambda^n$ above $X_d^n$, one just notices that the global diffeomorphism $\mu$ provides admissible local diffeomorphisms (by local restrictions) at all points x in $X_d^n$. Properties a) and b) of Definition 3 are then satisfied by $\mathcal{T}_n$ (with respect to all these local diffeomorphisms) as an obvious by-product of Eq. (25).
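The key computation of step c), namely that the mapping of Eq. (25) sends a tangent vector at a point of the real hyperboloid to a point of the complex hyperboloid, can be checked numerically. The sketch below does this for n = 2, d = 2, R = 1, and also verifies the tuboid condition of Eq. (19) on the imaginary parts. The base points, the tangent vectors and the helper names `cdot`, `mu`, `in_future_cone` are illustrative assumptions.

```python
import math

R = 1.0


def cdot(z, w):
    """Bilinear (not Hermitian) extension of the Minkowski product to C^{d+1}."""
    return z[0] * w[0] - sum(a * b for a, b in zip(z[1:], w[1:]))


def mu(x, y):
    """One component of Eq. (25): z = sqrt(R^2 - y.y)/R * x + i*y.

    Assumes x.x = -R^2, x.y = 0 (y tangent at x) and y.y < R^2.
    """
    s = math.sqrt(R * R - cdot(y, y)) / R
    return [s * a + 1j * b for a, b in zip(x, y)]


def in_future_cone(y):
    """Membership in the open future cone: y0 > 0 and y.y > 0."""
    return y[0] > 0 and cdot(y, y) > 0


theta = 0.5
x1, y1 = [0.0, 0.0, R], [0.1, 0.3, 0.0]
x2 = [0.0, R * math.sin(theta), R * math.cos(theta)]
y2 = [0.6, 0.2 * math.cos(theta), -0.2 * math.sin(theta)]

z1, z2 = mu(x1, y1), mu(x2, y2)
for x, y, z in ((x1, y1, z1), (x2, y2, z2)):
    assert abs(cdot(x, x) + R * R) < 1e-12   # x lies on the real hyperboloid
    assert abs(cdot(x, y)) < 1e-12           # y is tangent at x
    assert abs(cdot(z, z) + R * R) < 1e-12   # z lies on the complex hyperboloid

# Im z_2 - Im z_1 lies in the future cone, so (z1, z2) belongs to the set (19)
im_diff = [b.imag - a.imag for a, b in zip(z1, z2)]
assert in_future_cone(im_diff)
```

The check works because $z^2 = \frac{R^2 - \underline{y}^2}{R^2}\,x^2 + 2i\,\frac{\sqrt{R^2 - \underline{y}^2}}{R}\,x \cdot \underline{y} - \underline{y}^2$, where the first term equals $-(R^2 - \underline{y}^2)$ and the middle term vanishes by tangency.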
Remark 1. By an application of Theorem A.2. of [10], the weak spectral condition implies that for every x there is some local tube $\Omega_x + i\Gamma_x$ around x in any chosen system of local complex coordinates on $X_d^{(c)n}$ whose image in $X_d^{(c)n}$ is contained in $\mathcal{T}_n$ and has a profile very close to the profile of $\mathcal{T}_n$ (restricted to a neighborhood of x), from which the boundary value equation $\mathcal{W}_n = \mathrm{b.v.}\,W_n$ can be understood in the usual sense. It implies equivalently that, in a complex neighborhood of each point $x = (x_1, \ldots, x_n) \in X_d^n$, the analytic function $W_n(z_1, \ldots, z_n)$ is of moderate growth (i.e. bounded by a power of $\|y\|^{-1}$, where $\|y\|$ denotes any local norm of $y = \operatorname{Im} z = (y_1, \ldots, y_n)$) when the point $z = (z_1, \ldots, z_n)$ tends to the reals inside $\mathcal{T}_n$.

Remark 2. An important difference with respect to the Minkowski case is that the reals (i.e. $X_d^n$) are not a distinguished boundary for the tuboid $\mathcal{T}_n$.

3. Consequences of Locality, Weak Spectral Condition and de Sitter Covariance

Most of the well-known properties of the Wightman distributions in the flat Minkowskian case ([35, 28]) hold without change in the de Sitterian case under our assumptions, and their proofs mostly carry over literally. A few points, however, require some attention. For each permutation $\pi$ of (1, ..., n), the permuted Wightman distribution

$$\mathcal{W}_n^{(\pi)}(x_1, \ldots, x_n) = \mathcal{W}_n(x_{\pi(1)}, \ldots, x_{\pi(n)}) \tag{26}$$
is the boundary value of a function $W_n^{(\pi)}(z_1, \ldots, z_n)$ holomorphic in the “permuted tuboid”

$$\mathcal{T}_n^\pi = \{z = (z_1, \ldots, z_n) \in X_d^{(c)n};\ y_{\pi(j+1)} - y_{\pi(j)} \in V^+,\ 1 \le j \le n - 1\}. \tag{27}$$

If two permutations $\pi$ and $\sigma$ differ only by the exchange of the consecutive indices j and k, then, by the locality condition, $\mathcal{W}_n^{(\pi)}$ and $\mathcal{W}_n^{(\sigma)}$ coincide in

$$\mathcal{R}_{jk} = X_d^n \cap R_{jk}, \qquad R_{jk} = \{x \in \mathbb{R}^{n(d+1)} : (x_j - x_k)^2 < 0\}. \tag{28}$$

Let $\mathcal{R}$ be a non-empty region which is the intersection of a subset of $\{\mathcal{R}_{jk} : j \neq k\}$. By the edge-of-the-wedge theorem (in its version for tuboids, see Theorem A3 of [10]), any maximal set of permuted Wightman distributions which coincide on this region are the boundary value, in $\mathcal{R}$, of a common function holomorphic in a tuboid above $\mathcal{R}$ whose profile is obtained by taking at each point $x \in \mathcal{R}$ the convex hull of the profiles at x of the corresponding permuted tuboids. In particular all the permuted Wightman distributions coincide in the intersection $\Omega_n$ of all the $\mathcal{R}_{jk}$, and it follows that they all are boundary values of a common function $W_n(z_1, \ldots, z_n)$, holomorphic in a primitive analyticity domain $D_n$. $W_n$ is the common analytic continuation of all the holomorphic functions $W_n^{(\pi)}$ and the domain $D_n$ is the union of all the permuted tuboids $\mathcal{T}_n^\pi$ and of the above mentioned local tuboids associated (by the edge-of-the-wedge theorem) with finite intersections of the $\mathcal{R}_{jk}$. In particular $D_n$ contains a complex neighborhood of $\Omega_n$ since the tuboids $\mathcal{T}_n^\pi$ and $\mathcal{T}_n^{\pi_{inv}}$ are opposite (where $\pi_{inv} = (\pi(n), \ldots, \pi(1))$).

For each permutation $\pi$ we denote by $\mathcal{T}_n^{\pi\,ext}$ the extended permuted tuboid

$$\mathcal{T}_n^{\pi\,ext} = \bigcup_{\Lambda_c \in L_+(\mathbb{C})} \Lambda_c \mathcal{T}_n^\pi = \bigcup_{\Lambda_c \in L_+(\mathbb{C})} \Lambda_c (T_n^\pi \cap X_d^{(c)n}) = X_d^{(c)n} \cap \bigcup_{\Lambda_c \in L_+(\mathbb{C})} \Lambda_c T_n^\pi = X_d^{(c)n} \cap T_n^{\pi\,ext}. \tag{29}$$
3.1. The Jost points and the Glaser–Streater theorem. The set of real points of $T_n^{ext} = T_n^{1\,ext}$ (Jost points in the ambient space) is denoted $J_n$. Its intersection $\mathcal{J}_n$ with $X_d^n$ will be called the set of Jost points associated with the tuboid $\mathcal{T}_n$. It is contained in the set $\Omega_n$ of all totally space-like configurations. The set $\mathcal{J}_n$ is generated (like $J_n$) by the action of the connected group $L_+^\uparrow$ on a special subset of Jost points associated with a given maximal space-like cone such as the “right-wedge” $W_{(r)}$ of the ambient space:

$$W_{(r)} = -W_{(l)} = \{x \in \mathbb{R}^{d+1} : u(x) > 0,\ v(x) < 0\}, \tag{30}$$

the notations u, v being those of Eq. (2). The corresponding special Jost subset $\mathcal{J}_n^{(r)}$ is defined by

$$\mathcal{J}_n^{(r)} = J_n^{(r)} \cap X_d^n, \tag{31}$$

with

$$J_n^{(r)} = \{(x_1, \ldots, x_n) \in \mathbb{R}^{n(d+1)} : x_1 \in W_{(r)},\ (x_2 - x_1) \in W_{(r)},\ \ldots,\ (x_n - x_{n-1}) \in W_{(r)}\}. \tag{32}$$
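That configurations of the special Jost subset are totally space-like can be seen from Eq. (30): the wedge is a convex cone, so every difference $x_k - x_j$ with k > j lies in it, and any w in the wedge has $w^2 = u(w)v(w) - \vec{w}^2 < 0$. The sketch below checks this numerically on three sample points of $X_2$ with R = 1; the points and the function names are illustrative assumptions.

```python
import math

R = 1.0


def msq(x):
    """Ambient Minkowski square x.x."""
    return x[0] * x[0] - sum(c * c for c in x[1:])


def in_right_wedge(x):
    """Eq. (30): u(x) > 0 and v(x) < 0, with u, v as in Eq. (2)."""
    u, v = x[0] + x[-1], x[0] - x[-1]
    return u > 0 and v < 0


def in_special_jost_set(points):
    """Eq. (32): x_1 and all successive differences lie in the right wedge."""
    previous = [0.0] * len(points[0])
    for p in points:
        if not in_right_wedge([a - b for a, b in zip(p, previous)]):
            return False
        previous = p
    return True


# Three points of X_2 (x.x = -R^2) at time x^(0) = 0, with increasing last coordinate
points = [[0.0, math.sqrt(R * R - rho * rho), rho] for rho in (0.3, 0.6, 1.0)]

assert all(abs(msq(p) + R * R) < 1e-12 for p in points)
assert in_special_jost_set(points)
# Every pair of points is space-like separated
assert all(msq([a - b for a, b in zip(p, q)]) < 0
           for i, p in enumerate(points) for q in points[i + 1:])
```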
The fact that $\mathcal{J}_n$ is a non-empty and, if d > 2, connected set is then a consequence of the connectedness of $\mathcal{J}_n^{(r)}$. The latter property can be checked as follows. The projection
$[\mathcal{J}_n^{(r)}]_{u,v}$ of $\mathcal{J}_n^{(r)}$ onto the space $\mathbb{R}^{2n}$ of the (u, v)-coordinates is the intersection of the convex cone ($u_1 > 0$, $v_1 < 0$, $u_{j+1} - u_j > 0$, $v_{j+1} - v_j < 0$, 1 ≤ j ≤ n − 1) (here we have put $u_j = u(x_j)$, $v_j = v(x_j)$) with the set ($u_1v_1 > -R^2, \ldots, u_nv_n > -R^2$) which is preserved by the contractions; therefore, any couple of points in $[\mathcal{J}_n^{(r)}]_{u,v}$ can be connected by a broken line contained in this set. Considering now $\mathcal{J}_n^{(r)}$ as a fiber space over its projection $[\mathcal{J}_n^{(r)}]_{u,v}$, we see that it is locally trivialized with a toroidal fiber of the form $\vec{x}_j^{\,2} = \text{constant}$, 1 ≤ j ≤ n, which is connected provided d is larger than 2; the connectedness of $\mathcal{J}_n^{(r)}$ follows correspondingly.

As in the Minkowskian case, one can then state a de Sitterian version of the Glaser–Streater property, according to which any function holomorphic in $\mathcal{T}_n \cup -\mathcal{T}_n \cup \mathcal{J}_n$ has a single-valued analytic continuation in $\mathcal{T}_n^{ext} = \mathcal{T}_n^{1\,ext}$ (see e.g. [8, 28, 34]). Hence every permuted Wightman distribution is the boundary value of a function holomorphic in the corresponding extended permuted tuboid $\mathcal{T}_n^{\pi\,ext}$; this function is in fact an analytic continuation of $W_n^{(\pi)}$ and thereby of the common holomorphic n-point function $W_n$.

Remark 3. The proof of the Glaser–Streater property is based on a lemma of analytic completion in the orbits of the complex Lorentz group and this is why it holds for the complexified de Sitter space (since $X_d^{(c)n}$ is a union of such orbits), the connectedness of the set of orbits generated by the Jost points being of course crucial. To be complete, one must also point out that it requires the following strong form of the Bargmann–Hall–Wightman lemma ([26], pp. 95–97, [35] pp. 67–70), proved for d + 1 ≤ 4 in these references, and extended to all dimensions in [27]. An alternative proof of the latter is given in Appendix B.

Lemma 1 (Bargmann–Hall–Wightman–Jost). Let $M \in L_+(\mathbb{C})$ be such that $T_+ \cap M^{-1}T_+ \neq \emptyset$. There exists a continuous map $t \mapsto M(t)$ of [0, 1] into $L_+(\mathbb{C})$ such that M(0) = 1, M(1) = M, and that, for every $z \in T_+ \cap M^{-1}T_+$ and t ∈ [0, 1], $M(t)z \in T_+$.

3.2. The PCT-property. The standard proof of the PCT theorem (see [28, 35] and references therein) extends in a straightforward way to the de Sitterian case under the assumptions of covariance, weak spectral condition, and locality. The latter can be relaxed to the condition of weak locality [14, 28, 35], namely:

Weak locality condition. For every Jost point $(r_1, \ldots, r_n) \in \mathcal{J}_n$,

$$W_n(r_1, \ldots, r_n) = W_n(r_n, \ldots, r_1). \tag{33}$$
This obviously follows from locality.

Proposition 1 (PCT invariance). From the weak spectral condition, the covariance condition, and the weak locality condition, it follows that

$$\mathcal{W}_n(x_1, \ldots, x_n) = \mathcal{W}_n(I_0x_n, \ldots, I_0x_1) \tag{34}$$

(in the sense of distributions) holds at every real $x \in X_d^n$, where $I_0 = -1$ if d is odd and, if d is even, for every $z \in \mathbb{C}^{d+1}$,

$$(I_0z)^{(\mu)} = -z^{(\mu)} \ \text{ for } 0 \le \mu < d, \qquad (I_0z)^{(d)} = z^{(d)}. \tag{35}$$

If moreover the positivity condition holds, there exists an antiunitary operator $\Theta : \mathcal{H} \to \mathcal{H}$ such that

$$\Theta\,\Omega = \Omega, \qquad \Theta\,\langle \Phi_n^{(b)}, f \rangle = \langle \Phi_n^{(b)}, f^\star_{I_0} \rangle, \quad \text{where } f^\star_{I_0}(x_1, \ldots, x_n) = \bar{f}(I_0x_n, \ldots, I_0x_1). \tag{36}$$
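The parity dependence of $I_0$ in Eq. (35) can be made concrete. In both cases $I_0$ is a diagonal matrix with an even number of entries equal to −1, so its determinant is +1 and it preserves the quadratic form $z^2$; this is the elementary fact behind its belonging to the complex Lorentz group. A minimal sketch follows, with `I0`, `det_I0` and `msq` as illustrative names.

```python
def I0(z):
    """Eq. (35): I0 = -1 for d odd; for d even all components but the last change sign."""
    d = len(z) - 1  # z has d + 1 components
    if d % 2 == 1:
        return [-c for c in z]
    return [-c for c in z[:-1]] + [z[-1]]


def det_I0(d):
    """Determinant of the diagonal matrix representing I0 in dimension d + 1."""
    diagonal = [-1] * (d + 1) if d % 2 == 1 else [-1] * d + [1]
    det = 1
    for entry in diagonal:
        det *= entry
    return det


def msq(z):
    """Bilinear Minkowski square of a (possibly complex) vector."""
    return z[0] * z[0] - sum(c * c for c in z[1:])


# The number of sign flips is always even, so the determinant is +1 for every d
assert all(det_I0(d) == 1 for d in range(1, 10))

z = [0.5 + 1j, -2.0, 0.25j, 1.0, 3.0]  # sample vector with d = 4
assert abs(msq(I0(z)) - msq(z)) < 1e-12  # I0 preserves z.z, hence the hyperboloid
```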
One notices that, in this statement, it is the symmetry $I_0$ (which depends on the parity of the dimension) which has to be used (as it is also the case for (d + 1)-dimensional Minkowskian theories). This is due to the fact that $I_0$ always belongs to the corresponding complex connected group $L_+(\mathbb{C})$ under which the functions $W_n$ are invariant. Since the mapping $(z_1, \ldots, z_n) \to (I_0z_n, \ldots, I_0z_1)$ is always (for every n) an automorphism of the tuboid $\mathcal{T}_n$, the standard analytic continuation argument [28, 35] applies to the proof of Eq. (34). Now, it is interesting to note that for d even (in particular in the “physical case” d = 4) $I_0$ does have the interpretation of a space-time inversion in a local region of the de Sitter universe around the base point $x_0$ with coordinates (0, ..., 0, R), considered as playing the role of the origin in Minkowski space. In fact, the stabilizer of $x_0$ (inside the de Sitter group) is the analogue of the Lorentz group (inside the Poincaré group) and indeed it acts as the latter in the (Minkowskian) tangent space to $X_d$ at $x_0$; $I_0$ then appears as the corresponding space-time inversion (contained in the complexified stabilizer of $x_0$). This means that (for d even) the previous proposition can be seen as introducing a PCT-symmetry relative to the point $x_0$; analogous symmetry operators could be associated with all points of the de Sitter manifold.

3.3. Euclidean points. In the ambient complex Minkowskian space-time $\mathbb{C}^{n(d+1)}$, the union $\bigcup_\pi T_n^{\pi\,ext}$ of the permuted extended n-point tubes contains all non-coinciding Euclidean points. Since the intersection of this union with $X_d^{(c)n}$ is the union of all permuted extended tuboids $\mathcal{T}_n^{\pi\,ext}$, it follows that the domain of analyticity of $W_n$ contains the set of all non-coinciding points of the product of n Euclidean spheres. (For a constructive approach based on the Euclidean sphere see [17].)

3.4. The case n = 2. Generalized free fields and their Wick powers. The extended tube $T_2^{ext}$ is equal to $\{(z_1, z_2) \in \mathbb{C}^{2(d+1)} : (z_1 - z_2)^2 \notin \mathbb{R}_+\}$. Hence

$$\mathcal{T}_2^{ext} = \{(z_1, z_2) \in X_d^{(c)2} : (z_1 - z_2)^2 \notin \mathbb{R}_+\}. \tag{37}$$

In particular $W_2(z_1, z_2) - W_2(z_2, z_1)$ is analytic, antisymmetric, and Lorentz invariant at real space-like separations, hence vanishes there even without the locality assumption. Thus under the assumptions of weak spectral condition and covariance, $W_2(z_1, z_2)$ defines an “invariant perikernel” in the sense of [11] which can be represented by a function $w(\zeta)$ of the single complex variable $\zeta = 1 + (z_1 - z_2)^2 / 2R^2 = -z_1 \cdot z_2 / R^2$, holomorphic in the cut-plane $\mathbb{C} \setminus [1, \infty)$. Any such two-point function completely determines a generalized free field A whose Wightman functions are obtained by the same formulae as in the Minkowskian case (see [10] for a detailed study of all that). A can also be seen as the restriction of a generalized free field on the ambient Minkowski space, in general with an indefinite metric (see also in this connection Subsect. 5.4 of [10]). Wick monomials in A have well-defined Wightman functions, again given by the same formulae as in the Minkowskian case, i.e. as sums of products of two-point functions. Since these Wightman functions can be obtained as limits of Wightman functions of Wick monomials of group-regularizations of A, they satisfy all the conditions 1–5 (in particular positivity) provided A does. In particular the Wick monomials in A are unbounded distribution valued operators in the Fock space of A, and provide examples of theories satisfying all the axioms.

4. Physical Interpretation of the Weak Spectral Condition

In this section, we are still in the Lorentz coordinate frame $\{e_0, \ldots, e_d\}$ in the ambient real Minkowski space; the notations u, v, $[\lambda]$ are as in Eqs. (2) and (3).
Let us now discuss the physical interpretation of the spectral condition we have introduced. Following the pioneering approach of Unruh [36], Gibbons and Hawking [19], we adopt the viewpoint of a geodesical observer, namely the one moving on the geodesic $h(x_0)$ of the base point $x_0$ contained in the $(x^{(0)}, x^{(d)})$-plane, which we parametrize as follows:
\[ h(x_0) = \Big\{ x = x(\tau);\ x^{(0)} = R \sinh\frac{\tau}{R},\ x^{(1)} = \cdots = x^{(d-1)} = 0,\ x^{(d)} = R \cosh\frac{\tau}{R} \Big\}. \tag{38} \]
The parameter τ of the representation (38) is the proper time of the observer and the base point $x_0$ is the event for which τ = 0. The set of all events of $X_d$ which can be connected with the observer by the reception and the emission of light-signals is the region:
\[ U_{h(x_0)} = \{ x \in X_d :\ x^{(d)} > |x^{(0)}| \} = W_{(r)} \cap X_d. \tag{39} \]
Points in $U_{h(x_0)}$ can be parametrized by $(\tau, \mathbf{x})$ as follows:
\[ x(\tau, \mathbf{x}) = \begin{cases} x^{(0)} = \sqrt{R^2 - \mathbf{x}^2}\, \sinh\frac{\tau}{R} \\ (x^{(1)}, \ldots, x^{(d-1)}) = \mathbf{x} \\ x^{(d)} = \sqrt{R^2 - \mathbf{x}^2}\, \cosh\frac{\tau}{R} \end{cases} \qquad \tau \in \mathbb{R},\ \mathbf{x}^2 < R^2. \tag{40} \]
$U_{h(x_0)}$ is the intersection of the hyperboloid with the wedge $W_{(r)}$ of the ambient space and admits two boundary parts $H^{+}_{h(x_0)}$ and $H^{-}_{h(x_0)}$, respectively called the “future” and “past horizons” of the geodesical observer:
\[ H^{\pm}_{h(x_0)} = \{ x \in X_d :\ x^{(0)} = \pm x^{(d)},\ x^{(d)} \geq 0 \}. \tag{41} \]
$U_{h(x_0)}$ is stable under the transformations (3), for $\lambda = e^{t/R} > 0$. These transformations constitute a subgroup $T_{h(x_0)}$ of $L_+^{\uparrow}$. The action of $T_{h(x_0)}(t)$ on $U_{h(x_0)}$ written in terms of the parameters t and τ can be interpreted as a “time-translation”:
\[ T_{h(x_0)}(t)[x(\tau, \mathbf{x})] = x(t + \tau, \mathbf{x}) \equiv x^{t}. \tag{42} \]
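The geometric statements around Eqs. (38)–(42) can be checked numerically. The following is only a minimal sketch under stated assumptions: the ambient metric is taken with signature (+, −, …, −) so that $X_d$ is the set $x \cdot x = -R^2$, and the transformation $[\lambda]$ of Eq. (3), which is not reproduced in this section, is assumed to act as $u \to \lambda u$, $v \to v/\lambda$ on $u = x^{(0)} + x^{(d)}$, $v = x^{(0)} - x^{(d)}$. All function names are hypothetical.

```python
import math

R = 2.0  # de Sitter radius; arbitrary illustrative value


def msq(x):
    """Ambient Minkowski square x.x = (x0)^2 - (x1)^2 - ... - (xd)^2 (assumed signature)."""
    return x[0] ** 2 - sum(c ** 2 for c in x[1:])


def point(tau, xs):
    """x(tau, xs) of Eq. (40); xs holds the d-1 transverse coordinates, |xs| < R."""
    rho = math.sqrt(R ** 2 - sum(c ** 2 for c in xs))
    return [rho * math.sinh(tau / R)] + list(xs) + [rho * math.cosh(tau / R)]


def geodesic(tau, d=4):
    """x(tau) of Eq. (38): the special case xs = 0."""
    return point(tau, [0.0] * (d - 1))


def in_region(x):
    """Membership test for the region of Eq. (39): x^(d) > |x^(0)|."""
    return x[-1] > abs(x[0])


def boost(lam, x):
    """Assumed action of [lam]: u -> lam*u, v -> v/lam, with u = x0 + xd, v = x0 - xd."""
    u, v = lam * (x[0] + x[-1]), (x[0] - x[-1]) / lam
    return [(u + v) / 2] + list(x[1:-1]) + [(u - v) / 2]


def time_translate(t, x):
    """T(t) of Eq. (42), realised as the boost with lam = exp(t/R)."""
    return boost(math.exp(t / R), x)


def proper_time_step(tau, h):
    """Minkowski length of the chord x(tau+h) - x(tau) along the geodesic."""
    a, b = geodesic(tau), geodesic(tau + h)
    return math.sqrt(msq([q - p for p, q in zip(a, b)]))


def close(a, b, tol=1e-9):
    """Componentwise comparison of two points."""
    return all(abs(p - q) <= tol * (1 + abs(p)) for p, q in zip(a, b))
```

With these definitions, `time_translate(t, point(tau, xs))` agrees with `point(t + tau, xs)` up to rounding, and `proper_time_step(tau, h)` is close to `h` for small `h`, which is the sense in which τ is the proper time.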
$T_{h(x_0)}$ thus defines a group of isometric automorphisms of $U_{h(x_0)}$ whose orbits are all branches of hyperbolae of $U_{h(x_0)}$ in two-dimensional plane sections parallel to the $(x^{(0)}, x^{(d)})$-plane (see [29] for a general discussion of this kind of structure). Before discussing the physical interpretation of the spectral condition, we need to extend to the de Sitter case one aspect of a well-known result of Bisognano and Wichmann [BW] which concerns analyticity properties in orbits of the complexified group $T^{(c)}_{h(x_0)}$ of $T_{h(x_0)}$.

4.1. Bisognano-Wichmann analyticity. For every function $g_n$ in $D(X_d^n)$ or $S(X_d^n)$ and every $\lambda \in \mathbb{R} \setminus \{0\}$, $[\lambda]$ as in Eq. (3), we denote (with a simplified form of (8))
\[ g_{n\lambda}(x_1, \ldots, x_n) = g_n([\lambda^{-1}]x_1, \ldots, [\lambda^{-1}]x_n) \tag{43} \]
and
\[ g_n^{\leftarrow}(x_1, \ldots, x_n) = g_n(x_n, \ldots, x_1). \tag{44} \]
Then one has:
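Theorem 2. If a set of Wightman distributions satisfies the locality and weak spectral conditions, then for all $m, n \in \mathbb{N}$, $f_m \in D(W^m_{(r)} \cap X_d^m)$ and $g_n \in D(W^n_{(r)} \cap X_d^n)$, there is a function $G_{(f_m, g_n)}(\lambda)$ holomorphic on $\mathbb{C} \setminus \mathbb{R}_+$ with continuous boundary values $G^{\pm}_{(f_m, g_n)}$ on $(0, +\infty)$ from the upper and lower half-planes such that:
a) for all $\lambda \in (0, +\infty)$,
\[ G^{+}_{(f_m, g_n)}(\lambda) = \langle W_{m+n}, f_m \otimes g_{n\lambda} \rangle, \qquad G^{-}_{(f_m, g_n)}(\lambda) = \langle W_{m+n}, g_{n\lambda} \otimes f_m \rangle. \tag{45} \]
b) for all $\lambda \in (-\infty, 0)$,
\[ G_{(f_m, g_n)}(\lambda) = \langle W_{m+n}, f_m \otimes g^{\leftarrow}_{n\lambda} \rangle = \langle W_{m+n}, g^{\leftarrow}_{n\lambda} \otimes f_m \rangle. \tag{46} \]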
This theorem requires neither positivity nor Lorentz covariance. It expresses a property of the domain of holomorphy of the Wightman functions, and of the boundary values from this domain. In fact, it states that appropriate boundary values of the (m + n)-point holomorphic function $W_{m+n}$, taken in the region where all the variables $w_1, \ldots, w_m, z_1, \ldots, z_n$ belong to $W_{(r)} \cap X_d$, are holomorphic with respect to the group variable λ (for $\lambda \in \mathbb{C} \setminus \mathbb{R}_+$) in the orbits $(w, x) \mapsto (w, [\lambda]x)$ of $T^{(c)}_{h(x_0)}$ (with $w = (w_1, \ldots, w_m)$, $x = (x_1, \ldots, x_n)$, $\lambda = e^{t/R}$) and such that:
\[ W_{m+n}(w, [\lambda + i0]x) = W_{m+n}(w, [\lambda]x), \quad W_{m+n}(w, [\lambda - i0]x) = W_{m+n}([\lambda]x, w) \quad \text{for } \lambda > 0, \tag{47} \]
\[ W_{m+n}(w, [\lambda]x) = W_{m+n}(w, [\lambda]x^{\leftarrow}) = W_{m+n}([\lambda]x^{\leftarrow}, w) \quad \text{for } \lambda < 0, \tag{48} \]
where we put $x^{\leftarrow} = (x_n, \ldots, x_1)$; the latter equality is a direct consequence of locality (since $x \in W^n_{(r)}$ and λ < 0 imply $[\lambda]x^{\leftarrow} \in W^n_{(l)}$). The theorem will be proved here under the simplifying assumption that the temperedness condition (17) holds.

Proof. Four permuted branches of the function $W_{m+n}$ are involved in the proof. The variables $w = (w_1, \ldots, w_m)$ will always be kept real in $W^m_{(r)} \cap X_d^m$, while the variables $z = (z_1, \ldots, z_n)$ are complex (in $X_d^{(c)n}$) and we denote y = Im z. The corresponding analyticity domains in the variables z (described below) are obtained in the boundaries (i.e. in the “face” Im w = 0) of four permuted tuboids $T^{\pi}_{m+n}$ according to the prescription of our weak spectral condition. In view of the distribution boundary value procedure, restricted to the subset of variables w, these analyticity domains are obtained whenever one smears out the permuted functions $W^{(\pi)}_{m+n}$ under consideration with a fixed function $f_m \in D(W^m_{(r)} \cap X_d^m)$ (this function being understood as the function named $f_m$ in the statement of the theorem). These four branches are:
i) $W_{m+n}(w_1, \ldots, w_m, z_1, \ldots, z_n) = W_{m+n}(w, z)$, holomorphic in the tuboid: $Z_{n+} = \{ z \in X_d^{(c)n};\ y_1 \in V_+,\ y_j - y_{j-1} \in V_+,\ j = 2, \ldots, n \}$;
ii) $W_{m+n}(z_n, \ldots, z_1, w_1, \ldots, w_m) = W_{m+n}(z^{\leftarrow}, w)$, holomorphic in the opposite tuboid: $Z_{n-} = \{ z \in X_d^{(c)n};\ y_1 \in V_-,\ y_j - y_{j-1} \in V_-,\ j = 2, \ldots, n \}$;
iii) $W_{m+n}(z_1, \ldots, z_n, w_1, \ldots, w_m) = W_{m+n}(z, w)$, holomorphic in the tuboid: $Z'_{n+} = \{ z \in X_d^{(c)n};\ y_n \in V_-,\ y_j - y_{j-1} \in V_+,\ j = 2, \ldots, n \}$;
iv) $W_{m+n}(w_1, \ldots, w_m, z_n, \ldots, z_1) = W_{m+n}(w, z^{\leftarrow})$, holomorphic in the opposite tuboid: $Z'_{n-} = \{ z \in X_d^{(c)n};\ y_n \in V_+,\ y_j - y_{j-1} \in V_-,\ j = 2, \ldots, n \}$.
Correspondingly, with the fixed function $f_m \in D(W^m_{(r)} \cap X_d^m)$ we associate the following four functions $z \mapsto F_{\pm}(f_m; z)$ and $z \mapsto F'_{\pm}(f_m; z)$:
\[ F_{+}(f_m; z) = \int_{X_d^m} W_{m+n}(w, z)\, f_m(w)\, d_m\sigma(w), \qquad F_{-}(f_m; z) = \int_{X_d^m} W_{m+n}(z^{\leftarrow}, w)\, f_m(w)\, d_m\sigma(w), \tag{49} \]
\[ F'_{+}(f_m; z) = \int_{X_d^m} W_{m+n}(z, w)\, f_m(w)\, d_m\sigma(w), \qquad F'_{-}(f_m; z) = \int_{X_d^m} W_{m+n}(w, z^{\leftarrow})\, f_m(w)\, d_m\sigma(w), \tag{50} \]
which are respectively holomorphic in $Z_{n+}$, $Z_{n-}$, $Z'_{n+}$ and $Z'_{n-}$. By letting the variables z tend to the reals from the respective tuboids $Z_{n+}$, $Z_{n-}$, $Z'_{n+}$ and $Z'_{n-}$, and taking the corresponding boundary values $F^{(b)}_{\pm}(f_m; x)$ and $F'^{(b)}_{\pm}(f_m; x)$ of $F_{\pm}$ and $F'_{\pm}$ on $X_d^n$ in the sense of distributions, one then obtains for every $g_n \in D(X_d^n)$ the following relations which involve the (m + n)-point Wightman distributions considered in the statement of the theorem:
\[ \int_{X_d^n} F^{(b)}_{+}(f_m; x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, f_m \otimes g_n \rangle, \tag{51} \]
\[ \int_{X_d^n} F^{(b)}_{-}(f_m; x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, g_n^{\leftarrow} \otimes f_m \rangle, \tag{52} \]
\[ \int_{X_d^n} F'^{(b)}_{+}(f_m; x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, g_n \otimes f_m \rangle, \tag{53} \]
\[ \int_{X_d^n} F'^{(b)}_{-}(f_m; x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, f_m \otimes g_n^{\leftarrow} \rangle. \tag{54} \]
We now notice that, in view of local commutativity, $F^{(b)}_{+}(f_m; x)$ and $F^{(b)}_{-}(f_m; x)$ coincide in the sense of distributions on the set of special Jost points $J_n^{(l)} = -J_n^{(r)} = \{ (x_1, \ldots, x_n) \in X_d^n;\ 0 > u_1 > \ldots u_{n-1} > u_n,\ 0 < v_1 < \ldots v_{n-1} < v_n \}$; therefore, in view of the edge-of-the-wedge theorem, the functions $z \mapsto F_{+}(f_m; z)$ and $z \mapsto F_{-}(f_m; z)$ have a common holomorphic extension, denoted $F(f_m; z)$, in $\Delta = Z_{n+} \cup Z_{n-} \cup V$, where V is a complex neighborhood of $J_n^{(l)}$, such that $[\lambda]V = V$ for all λ > 0 (in particular $F^{(b)}_{+}(f_m; x)$ and $F^{(b)}_{-}(f_m; x)$ are continuous on $J_n^{(l)}$). By a similar use of local commutativity for $F'^{(b)}_{+}$ and $F'^{(b)}_{-}$, which coincide on the set of special Jost points $J'^{(l)}_n = \{ (x_1, \ldots, x_n) \in X_d^n;\ 0 > u_n > u_{n-1} \ldots > u_1,\ 0 < v_n < v_{n-1} \ldots < v_1 \}$, we also notice that the functions $z \mapsto F'_{+}(f_m; z)$ and $z \mapsto F'_{-}(f_m; z)$ have a common holomorphic extension, denoted $F'(f_m; z)$, in $\Delta' = Z'_{n+} \cup Z'_{n-} \cup V'$, where $V'$ is a complex neighborhood of $J'^{(l)}_n$, such that $[\lambda]V' = V'$ for all λ > 0. Moreover, if the temperedness condition (17) is satisfied by the function $W_{m+n}$, it can be checked that similar inequalities are satisfied by the holomorphic functions $F(f_m; z)$ and $F'(f_m; z)$ with respect to the variables z in their respective tuboids $Z_{n\pm}$ and $Z'_{n\pm}$. At this point, we shall rely on the following basic lemma which provides analytic completion in the orbits of the group $\{ z \mapsto [\lambda]z \}$ (for $\lambda \in \mathbb{C}_{\pm}$) and whose proof is given below (after the end of our argument).

Lemma 2. a) Given any function H(z) holomorphic in Δ, the function $(z, \lambda) \mapsto H([\lambda]z)$ is holomorphic in $Z_{n+} \times \mathbb{C}_+$. Moreover, if H(x + iy) satisfies majorizations of the form (17) in the tuboids $Z_{n+}$ and $Z_{n-}$ allowing one to define the boundary values $H^{(b)}_{+}$ and $H^{(b)}_{-}$ of H from $Z_{n+}$ and $Z_{n-}$ as tempered distributions, then the function $(z, \lambda) \mapsto H([\lambda]z)$ admits a distribution boundary value on $X_d^n \times \overline{\mathbb{C}}_+$ (still denoted $H([\lambda]x)$); the latter is a distribution in x with values in the functions of λ which are holomorphic in $\mathbb{C}_+$ and continuous in $\overline{\mathbb{C}}_+ \setminus \{0\}$ and one has:
\[ H([\pm\lambda]x) = H^{(b)}_{\pm}([\pm\lambda]x) \quad \text{for } \lambda > 0 \tag{55} \]
(the latter being identities between distributions in x with values in the continuous functions of λ).
b) Similarly, given any function $H'(z)$ holomorphic in $\Delta'$, the function $(z, \lambda) \mapsto H'([\lambda]z)$ is holomorphic in $Z'_{+} \times \mathbb{C}_-$. Moreover, if $H'(x + iy)$ satisfies majorizations of the form (17) in the tuboids $Z'_{+}$ and $Z'_{-}$ allowing one to define the boundary values $H'^{(b)}_{+}$ and $H'^{(b)}_{-}$ of $H'$ from $Z'_{+}$ and $Z'_{-}$ as tempered distributions, then the function $(z, \lambda) \mapsto H'([\lambda]z)$ admits a distribution boundary value on $X_d^n \times \overline{\mathbb{C}}_-$, holomorphic in $\mathbb{C}_-$ and continuous in $\overline{\mathbb{C}}_- \setminus \{0\}$, and one has:
\[ H'([\pm\lambda]x) = H'^{(b)}_{\pm}([\pm\lambda]x) \quad \text{for } \lambda > 0. \tag{56} \]
Since the function $F(f_m; z)$ satisfies the analyticity and temperedness properties of the function H(z) of Lemma 2 a), it follows that one can take the boundary value onto $X_d^n \times \overline{\mathbb{C}}_+$ from $Z_{n+} \times \mathbb{C}_+$ of the holomorphic function $(z, \lambda) \mapsto F(f_m; [\lambda]z)$ and obtain for every $g_n \in D(X_d^n)$ the following relations (deduced from Eq. (55) after taking into account Eqs. (51) and (52)):
\[ \int_{X_d^n} F(f_m; [\lambda]x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, f_m \otimes g_{n\lambda} \rangle \quad \text{for } \lambda > 0, \tag{57} \]
\[ \int_{X_d^n} F(f_m; [\lambda]x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, g^{\leftarrow}_{n\lambda} \otimes f_m \rangle \quad \text{for } \lambda < 0. \tag{58} \]
Similarly, one can apply the results of Lemma 2 b) to the function $H'(z) = F'(f_m; z)$; one can thus take the boundary value onto $X_d^n \times \overline{\mathbb{C}}_-$ from $Z'_{n+} \times \mathbb{C}_-$ of the holomorphic function $(z, \lambda) \mapsto F'(f_m; [\lambda]z)$ and obtain for every $g_n \in D(X_d^n)$ the following relations (deduced from Eq. (56) after taking into account Eqs. (53) and (54)):
\[ \int_{X_d^n} F'(f_m; [\lambda]x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, g_{n\lambda} \otimes f_m \rangle \quad \text{for } \lambda > 0, \tag{59} \]
\[ \int_{X_d^n} F'(f_m; [\lambda]x)\, g_n(x)\, d_n\sigma(x) = \langle W_{m+n}, f_m \otimes g^{\leftarrow}_{n\lambda} \rangle \quad \text{for } \lambda < 0. \tag{60} \]
The l.h.s. of Eqs. (57) (or (58)) and (59) (or (60)) are respectively the boundary values of the holomorphic functions
\[ G_{(f_m, g_n)}(\lambda) = \int_{X_d^n} F(f_m; [\lambda]x)\, g_n(x)\, d_n\sigma(x) \tag{61} \]
defined for $\lambda \in \mathbb{C}_+$ and
\[ G'_{(f_m, g_n)}(\lambda) = \int_{X_d^n} F'(f_m; [\lambda]x)\, g_n(x)\, d_n\sigma(x) \tag{62} \]
defined for $\lambda \in \mathbb{C}_-$. For an arbitrary function $g_n \in D(X_d^n)$, these two holomorphic functions are distinct from each other. Now, if $g_n$ is taken in $D(U^n_{h(x_0)})$, the r.h.s. of Eqs. (58) and (60) coincide in view of local commutativity, and therefore these two holomorphic functions admit a common holomorphic extension $G_{(f_m, g_n)}(\lambda)$ in $\mathbb{C} \setminus \mathbb{R}_+$ whose boundary values on $\mathbb{R} \setminus 0$ satisfy the properties a) and b) of the theorem (in view of Eqs. (57)–(60)).

Proof of Lemma 2. We concentrate on part a) of the lemma, part b) being completely similar. At first, the fact that the function $(z, \lambda) \mapsto H([\lambda]z)$ can be analytically continued in $Z_{n+} \times \mathbb{C}_+$ is a result of purely geometrical nature (based on the tube theorem) which can be obtained as a direct application of Lemma 3 (ii) of Appendix A. In fact, for each point $x \in J_n^{(r)}$, the set $\{ z = [\lambda]x;\ \lambda \in \mathbb{C}_+ \}$ is contained in Δ (namely in $Z_{n+}$, as it directly follows from Eq. (3) and from the definitions of $J_n^{(r)}$ and $Z_{n+}$). One can even check that each point $x \in J_n^{(r)}$ is on the edge of a small open tuboid τ(x) contained in $Z_{n+}$ such that $\{ z = [\lambda]z';\ z' \in \tau(x),\ \lambda \in \mathbb{C}_+ \} \subset Z_{n+} \cup V \subset \Delta$. On the other hand, for each point $z \in Z_{n+}$ there exists a neighbourhood $\delta_+(z)$ of the real positive axis and a neighbourhood $\delta_-(z)$ of the real negative axis in the complex λ-plane, such that the set $\{ [\lambda]z;\ \lambda \in \delta_+(z) \cup \delta_-(z) \}$ is contained in Δ: for $\lambda \in \delta_+(z)$ and $\lambda \in \delta_-(z)$ the corresponding subsets are respectively contained in $Z_{n+}$ and in $Z_{n-}$. Therefore, the assumptions of Lemma 3 (ii) of Appendix A are fulfilled (by choosing the set Q of the latter as a subset of τ(x) and D0 = $Z_{n+}$ after an appropriate adaptation of the variables). In order to see that the new domain thus obtained (i.e. 
$\{ z = [\lambda]z';\ z' \in Z_{n+},\ \lambda \in \mathbb{C}_+ \}$) yields an enlargement of Δ, it is sufficient to notice that every real point x such that at least one component $x_j - x_{j-1}$ is time-like is transformed by any complex transformation $[\lambda]$ into a point outside $Z_{n\pm}$ and this is of course also true for all points $z \in Z_{n+}$ tending to such real (boundary) points (the neighbourhoods $\delta_{\pm}(z)$ becoming arbitrarily thin in such limiting configurations). The second statement of the lemma precisely deals with these limiting real configurations and with the fact that the analyticity of the boundary value $H([\lambda]x)$ in $\{ \lambda \in \mathbb{C}_+ \}$ is maintained for all $x \in X_d^n$. The boundary value relations (55) then follow from the fact that every point x is a limit of points $z \in Z_{n+}$ and that the latter are always such that $[\lambda]z \in Z_{n+}$ for λ > 0 and $[\lambda]z \in Z_{n-}$ for λ < 0. In order to avoid too subtle an argument for justifying the analyticity of the limit $H([\lambda]x)$ in $\{ \lambda \in \mathbb{C}_+ \}$, we prefer to rely on an assumption of tempered growth (of the form (17))
for H; the latter allows one to give an alternative version of the analytic completion procedure which is based on the Cauchy integral representation, and thereby includes the treatment of the boundary values. For $z = (z_1, \ldots, z_n) \in \mathbb{C}^{nd}$, we adopt the coordinates
\[ \zeta_1 = z_1, \qquad \zeta_k = z_k - z_{k-1} \ \text{ for } 1 < k \leq n, \tag{63} \]
\[ u_j = \zeta_j^{(0)} + \zeta_j^{(d)}, \quad v_j = \zeta_j^{(0)} - \zeta_j^{(d)}, \quad r_j = (\zeta_j^{(1)}, \ldots, \zeta_j^{(d-1)}), \ \text{ for } 1 \leq j \leq n. \tag{64} \]
For every $z = (z_1, \ldots, z_n) \in Z_{n+}$, we define $G(z, \lambda) = H([\lambda]z)$. Easy computations using the tempered growth condition show that $G(z, \lambda)$ is a holomorphic function of z and $\lambda = \rho e^{i\theta}$ for $z \in Z_{n+}$, $\rho \in (0, +\infty)$ and
\[ |\sin\theta| < \frac{\kappa}{2(1 + 2M)}, \qquad \kappa = \min_j \left( 1 - \frac{\| \mathrm{Im}\, r_j \|^2}{\mathrm{Im}\, u_j\, \mathrm{Im}\, v_j} \right), \tag{65} \]
with
\[ M = \frac{1}{\mu} \max_j \max\{ |\mathrm{Re}\, u_j|, |\mathrm{Re}\, v_j| \}, \qquad \mu = \min_j \min\{ \mathrm{Im}\, u_j, \mathrm{Im}\, v_j \}, \]
which (for such values) satisfies bounds of the following form:
\[ |G(z, \lambda)| \leq K_1 (|\lambda| + 1/|\lambda|)^L \left( \frac{1}{\mu\kappa} + \max_j |\zeta_j| \right)^L \tag{66} \]
\[ \leq K_2 (|\lambda| + 1/|\lambda|)^L \left( \mathrm{dist}(z, \partial T_n)^{-1} + \max_j |\zeta_j| \right)^L, \tag{67} \]
where $K_1$, $K_2$ are suitable constants. On the other hand if z is real and $z \in J_n^{(r)}$, i.e. $u_j > 0$ and $v_j < 0$ for all j (with the notations of Eq. (63)), then $[\lambda]z \in Z_{n+}$ whenever Im λ > 0 and
\[ |H([\lambda]z)| \leq K (|\lambda| + 1/|\lambda|)^L \left( \frac{1}{\mathrm{Im}\, \lambda} \max_j (1/\mathrm{Re}\, u_j - 1/\mathrm{Re}\, v_j) + \max_j |\zeta_j| \right)^L. \tag{68} \]
This shows that $H([\lambda + i0]z)$ is a tempered distribution in $\lambda \in \mathbb{R}$ with values in the polynomially bounded functions of z on $J_n^{(r)}$ (actually in the $C^{\infty}$ functions of z, as the z derivatives of H and G satisfy similar bounds). When λ < 0, as already noted, one has $[\lambda]z \in V$ and $G(z, \lambda) = H([\lambda]z)$ is analytic in z and λ. Hence, for $z \in J_n^{(r)}$, $G(z, \lambda + i0)$ is well-defined as a tempered distribution in $\lambda \in \mathbb{R}$ with values in the polynomially bounded functions of z on $J_n^{(r)}$ and is the boundary value of a function holomorphic in $\mathbb{C}_+$ and bounded by the r.h.s. of Eq. (68). For $\lambda \in \mathbb{C}_+$ this function can be computed by the Cauchy formula:
\[ G(z, \lambda) = \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{G(z, \lambda' + i0)\, (i + \lambda - 1/\lambda)^{2L+2}}{(i + \lambda' - 1/\lambda')^{2L+2}\, (\lambda' - \lambda)}\, d\lambda'. \tag{69} \]
As shown by Eq. (67), the r.h.s. of this formula continues to make sense for $z \in Z_{n+}$ and defines a function of z holomorphic and of tempered growth in $Z_{n+}$, with values
in the functions of λ holomorphic in $\mathbb{C}_+$ and continuous on $\overline{\mathbb{C}}_+ \setminus \{0\}$. Therefore it has a boundary value in the sense of distributions as z tends to the reals. For real λ ≠ 0, this boundary value coincides with $G(z, \lambda)$ (in the sense of distributions) when $z \in J_n^{(r)}$, hence (in view of the analytic continuation principle extended by the edge-of-the-wedge theorem) the r.h.s. of Eq. (69) coincides with $G(z, \lambda)$ for all $z \in Z_{n+}$ and all real λ ≠ 0. The formula (69) thus holds for all $z \in Z_{n+}$, $\lambda \in \mathbb{C}_+$ and, in the sense of distributions, when z tends to the reals; moreover, the relations (55) hold in this limit as explained above in the geometrical analysis. The previous argument could be identically repeated for part b) of the lemma, replacing $Z_{n\pm}$ by $Z'_{n\pm}$ etc. and $\mathbb{C}_+$ by $\mathbb{C}_-$, since (as one can check directly) for each point $x \in J'^{(r)}_n = -J'^{(l)}_n$, the set $\{ z = [\lambda]x;\ \lambda \in \mathbb{C}_- \}$ is contained in $Z'_{n+}$.

Remark 4. Using the vector-valued analyticity provided by Theorem 5 below, it is possible to carry over the analysis of Bisognano and Wichmann without change to the de Sitterian case. The above proof (also valid in the Minkowskian case) aims at a clear distinction of the part of this theory which does not depend on positivity.

4.2. Physical interpretation. The following theorem gives a thermal physical interpretation to the weak spectral condition we have introduced.

Theorem 3 (KMS condition). For every pair of bounded regions $O_1$, $O_2$ of $U_{h(x_0)}$, the correlation functions between elements of the corresponding polynomial algebras $P(O_1)$, $P(O_2)$ of a field on $X_d$ satisfying the previous postulates enjoy a KMS condition with respect to the time-translation group $T_{h(x_0)}$ whose temperature is T = 1/2πR.

Proof. Given any general correlation function $(\Omega, \Phi(f)\Phi(g)\Omega)$ between arbitrary elements $\Phi(f) \in P(O_1)$ and $\Phi(g) \in P(O_2)$, with $f = (f_0, f_1, \ldots, f_m, \ldots)$, $g = (g_0, g_1, \ldots, g_n, \ldots)$, ($f_m \in D(O_1^m)$, $g_n \in D(O_2^n)$), we consider, for each “time-translation” $T_{h(x_0)}(t)$, the transformed quantities
\[ W_{(f,g)}(t) = (\Omega, \Phi(f)\Phi(g_{\{e^{t/R}\}})\Omega) \tag{70} \]
and
\[ W'_{(f,g)}(t) = (\Omega, \Phi(g_{\{e^{t/R}\}})\Phi(f)\Omega) \tag{71} \]
(the notation $g_{\{e^{t/R}\}}$ being as in Eq. (8), with $\Lambda_r = [\lambda]$, $\lambda = e^{t/R}$). In view of Theorem 2, one can introduce the function $G_{(f,g)}(\lambda)$ defined as follows: $G_{(f,g)}(\lambda) = \sum_{m,n} G_{(f_m, g_n)}(\lambda)$. $G_{(f,g)}(\lambda)$ is holomorphic for $\lambda = e^{t/R} \in \mathbb{C} \setminus \mathbb{R}_+$ and admits continuous boundary values $G^{\pm}_{(f,g)}$ on $(0, +\infty)$ from the upper and lower half-planes given respectively (in view of Eqs. (45) and (14)) by:
\[ G^{+}_{(f,g)}(\lambda) = \sum_{m,n} \langle W_{m+n}, f_m \otimes g_{n\lambda} \rangle = (\Omega, \Phi(f)\Phi(g_{\{e^{t/R}\}})\Omega), \tag{72} \]
\[ G^{-}_{(f,g)}(\lambda) = \sum_{m,n} \langle W_{m+n}, g_{n\lambda} \otimes f_m \rangle = (\Omega, \Phi(g_{\{e^{t/R}\}})\Phi(f)\Omega). \tag{73} \]
This readily implies that the function $W_{(f,g)}(t) = G_{(f,g)}(e^{t/R})$ is holomorphic in the strip 0 < Im t < 2πR and that it admits continuous boundary values on the edges of this strip which are:
\[ \lim_{\varepsilon \to 0^+} W_{(f,g)}(t + i\varepsilon) = W_{(f,g)}(t), \qquad \lim_{\varepsilon \to 0^+} W_{(f,g)}(t + 2i\pi R - i\varepsilon) = W'_{(f,g)}(t). \tag{74} \]
The latter express the fact that all the field observables localized in $U_{h(x_0)}$ and submitted to the time-translation group $T_{h(x_0)}$ satisfy a KMS-condition at temperature $T = (2\pi R)^{-1}$. The previous property must be completed by the following results:
i) Periodicity in the complex time variable. Since f and g are localized respectively in $O_1$ and $O_2$, it follows from local commutativity that the function $W_{(f,g)}(t)$ can be analytically continued across the part of the line Im t = 0 (and therefore Im t = 2nπR, $n \in \mathbb{Z}$) on which the two matrix elements of Eqs. (70) and (71) are equal. One concludes that the function $W_{(f,g)}(t)$ is holomorphic and periodic with period 2iπR in the following cut-plane $\mathbb{C}^{cut}(O_1, O_2)$ which is connected (in particular) if $O_1$ and $O_2$ are space-like separated:
\[ \mathbb{C}^{cut}(O_1, O_2) = \bigcap_{x_1 \in O_1,\ x_2 \in O_2} \mathbb{C}^{cut}_{x_1, x_2}, \tag{75} \]
where
\[ \mathbb{C}^{cut}_{x_1, x_2} = \{ t \in \mathbb{C};\ \mathrm{Im}\, t \neq 2n\pi R,\ n \in \mathbb{Z} \} \cup \{ t;\ t - 2in\pi R \in I_{x_1, x_2},\ n \in \mathbb{Z} \}, \tag{76} \]
and for any pair $(x_1, x_2)$ we have set
\[ I_{x_1, x_2} = \{ t \in \mathbb{R} :\ (x_1 - [e^{-t/R}]x_2)^2 < 0 \}. \tag{77} \]
ii) The antipodal condition. The following property, relating by analytic continuation the field observables localized in the region $U_{h(x_0)}$ with those localized in the antipodal region
\[ \check{U}_{h(x_0)} = \{ x \in X_d,\ -x \in U_{h(x_0)} \} = \{ x = (x^{(0)}, \mathbf{x}, x^{(d)}) \in X_d,\ \check{x} = (-x^{(0)}, \mathbf{x}, -x^{(d)}) \in U_{h(x_0)} \} \tag{78} \]
can also be obtained as a by-product of Theorem 2.
With each sequence $g = (g_0, g_1, \ldots, g_n, \ldots)$ such that $g_n \in D(U^n_{h(x_0)})$ let us associate the sequence $\check{g} = (\check{g}_0, \check{g}_1, \ldots, \check{g}_n, \ldots)$, where $\check{g}_n(x_1, \ldots, x_n) = g_n^{\leftarrow}(\check{x}_1, \ldots, \check{x}_n) = g_n(\check{x}_n, \ldots, \check{x}_1)$. Since (for each n) one has $\check{g}_n \in D(\check{U}^n_{h(x_0)})$, it follows that $\Phi(\check{g})$ belongs to $P(\check{U}_{h(x_0)})$. Let us also note that for the Lorentz transformation $[\lambda] = [-1]$, one has $g^{\leftarrow}_{n,-1} = \check{g}_n$ and therefore, for all λ > 0, $g^{\leftarrow}_{n,-\lambda} = \check{g}_{n\lambda}$. We then see that the holomorphic function $G_{(f,g)}(\lambda)$ introduced above satisfies (in view of Eq. (46)) the following relations: for all λ > 0,
\[ G_{(f,g)}(-\lambda) = \sum_{m,n} \langle W_{m+n}, f_m \otimes g^{\leftarrow}_{n,-\lambda} \rangle = \sum_{m,n} \langle W_{m+n}, g^{\leftarrow}_{n,-\lambda} \otimes f_m \rangle, \tag{79} \]
and therefore in view of Eq. (14):
\[ G_{(f,g)}(-e^{t/R}) = (\Omega, \Phi(f)\Phi(\check{g}_{\{e^{t/R}\}})\Omega) = (\Omega, \Phi(\check{g}_{\{e^{t/R}\}})\Phi(f)\Omega). \tag{80} \]
We can then state the following
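Proposition 2 (Antipodal condition). Given arbitrary observables $\Phi(f)$ and $\Phi(g)$ in $P(U_{h(x_0)})$ and the corresponding observable $\Phi(\check{g})$ in $P(\check{U}_{h(x_0)})$, the following identities hold:
\[ \forall\, t \in \mathbb{R}, \quad W_{(f,g)}(t + i\pi R) = (\Omega, \Phi(f)\Phi(\check{g}_{\{e^{t/R}\}})\Omega) = (\Omega, \Phi(\check{g}_{\{e^{t/R}\}})\Phi(f)\Omega). \tag{81} \]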
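The role of the change of variable $\lambda = e^{t/R}$ in Theorem 3 and Proposition 2 can be made concrete with a small numerical sketch. It only illustrates the elementary map from the strip 0 < Im t < 2πR to the cut plane $\mathbb{C} \setminus \mathbb{R}_+$; the value of R and the helper names `lam` and `region` are illustrative assumptions, not part of the original argument.

```python
import cmath
import math

R = 1.0  # de Sitter radius; arbitrary illustrative value


def lam(t):
    """Group variable lambda = exp(t/R) for a complex time t."""
    return cmath.exp(t / R)


def region(t, tol=1e-12):
    """Locate lam(t) relative to the cut plane C minus the positive real axis."""
    z = lam(t)
    if abs(z.imag) <= tol * abs(z):
        return "positive real axis (cut)" if z.real > 0 else "negative real axis"
    return "upper half-plane" if z.imag > 0 else "lower half-plane"
```

Under these assumptions, Im t = 0 and Im t = 2πR both land on the cut λ > 0 (approached from above and from below, respectively), which is the origin of the two boundary values in Eq. (74), while Im t = πR lands on the negative real axis, where Eq. (46) and hence the antipodal condition (81) apply.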
The geodesic and antipodal spectral conditions. We can introduce an “energy operator” $E_{h(x_0)}$ associated with the geodesic $h(x_0)$ by considering in H the continuous unitary representation $\{ U^{t}_{h(x_0)};\ t \in \mathbb{R} \}$ of the time-translation group $T_{h(x_0)}$ and its spectral resolution
\[ U^{t}_{h(x_0)} = \int_{-\infty}^{\infty} e^{i\omega t}\, dE_{h(x_0)}(\omega). \tag{82} \]
This defines (on a certain dense domain of H containing $\Phi(B)\Omega$) the self-adjoint operator
\[ E_{h(x_0)} = \int_{-\infty}^{\infty} \omega\, dE_{h(x_0)}(\omega). \tag{83} \]
For any pair of vector states $\Psi^{(1)} = \Phi(f^{\star})\Omega$, $\Psi^{(2)} = \Phi(g)\Omega$, the corresponding correlation function given in Eq. (70) can be written as follows:
\[ W_{(f,g)}(t) = (\Phi(f^{\star})\Omega,\ U^{t}_{h(x_0)}\, \Phi(g)\Omega), \tag{84} \]
which shows that $W_{(f,g)}(t)$ is a continuous and bounded function. In view of Eq. (82) it can be expressed as the Fourier transform of the bounded measure
\[ \widetilde{W}_{(f,g)}(\omega) = (\Phi(f^{\star})\Omega,\ dE_{h(x_0)}(\omega)\, \Phi(g)\Omega). \tag{85} \]
Similarly, one has:
\[ W'_{(f,g)}(t) = (\Phi(g^{\star})\Omega,\ U^{-t}_{h(x_0)}\, \Phi(f)\Omega), \tag{86} \]
which is the Fourier transform of
\[ \widetilde{W}'_{(f,g)}(\omega) = (\Phi(g^{\star})\Omega,\ dE_{h(x_0)}(-\omega)\, \Phi(f)\Omega). \tag{87} \]
Equations (85) and (87) are valid for arbitrary f and g in B. Now, if f and g have supports in $U_{h(x_0)}$, the functions $W_{(f,g)}(t)$ and $W'_{(f,g)}(t)$ satisfy the KMS relations (74) and their Fourier transforms satisfy (as bounded measures) the following relation which is equivalent to Eq. (74):
\[ \widetilde{W}'_{(f,g)}(\omega) = e^{-2\pi R \omega}\, \widetilde{W}_{(f,g)}(\omega). \tag{88} \]
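The equivalence between the boundary-value form (74) of the KMS condition and its Fourier form (88) can be illustrated on a toy example. The sketch below replaces the spectral measure by a hypothetical finite sum of point masses (the list `atoms` is invented for illustration and carries no physical meaning), defines $W'$ through Eq. (88), and evaluates both sides of the second relation in Eq. (74).

```python
import cmath
import math

R = 1.0
BETA = 2 * math.pi * R  # inverse temperature 1/T, with T = 1/(2 pi R)

# Hypothetical discrete stand-in for the measure W~(omega): (omega_k, weight_k).
atoms = [(-1.2, 0.05), (0.0, 0.4), (0.7, 1.0), (2.5, 0.3)]


def W(t):
    """Fourier transform of the toy measure; t may be complex."""
    return sum(c * cmath.exp(1j * w * t) for w, c in atoms)


def W_prime(t):
    """Fourier transform of exp(-2 pi R omega) times the toy measure, as in Eq. (88)."""
    return sum(c * math.exp(-BETA * w) * cmath.exp(1j * w * t) for w, c in atoms)


def kms_defect(t):
    """|W(t + 2 i pi R) - W'(t)|, which should vanish up to rounding."""
    return abs(W(t + 1j * BETA) - W_prime(t))
```

For any real `t`, `kms_defect(t)` is at rounding level: shifting t by 2iπR multiplies each Fourier mode $e^{i\omega t}$ by $e^{-2\pi R\omega}$, which is exactly the factor in Eq. (88).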
Moreover, if we rewrite the antipodal condition (81) as follows (with notations similar to those of Eqs. (70) and (71)):
\[ W_{(f,g)}(t + i\pi R) = W_{(f,\check{g})}(t) = W_{(\check{g},f)}(t) \tag{89} \]
we see that the corresponding Fourier transforms satisfy the following equivalent relations:
\[ \widetilde{W}_{(f,\check{g})}(\omega) = \widetilde{W}_{(\check{g},f)}(\omega) = e^{-\pi R \omega}\, \widetilde{W}_{(f,g)}(\omega). \tag{90} \]
We have thus proved the
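Theorem 4. i) For every pair of states $\Psi^{(1)} = \Phi(f^{\star})\Omega$, $\Psi^{(2)} = \Phi(g)\Omega$ in $P(U_{h(x_0)})\Omega$, the corresponding matrix elements of the spectral measure $dE_{h(x_0)}(\omega)$ satisfy the following geodesic spectral condition:
\[ (\Phi(g^{\star})\Omega,\ dE_{h(x_0)}(-\omega)\, \Phi(f)\Omega) = e^{-2\pi R \omega}\, (\Phi(f^{\star})\Omega,\ dE_{h(x_0)}(\omega)\, \Phi(g)\Omega). \tag{91} \]
ii) Moreover, the previous matrix elements of the spectral measure are also related to a third one which involves the antipodal state $\Phi(\check{g})\Omega$ in $P(\check{U}_{h(x_0)})\Omega$, by the following antipodal spectral condition:
\[ (\Phi(f^{\star})\Omega,\ dE_{h(x_0)}(\omega)\, \Phi(\check{g})\Omega) = e^{-\pi R \omega}\, (\Phi(f^{\star})\Omega,\ dE_{h(x_0)}(\omega)\, \Phi(g)\Omega). \tag{92} \]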
Remark 5. The geodesic spectral condition (91) gives a precise content to the statement that in the region $U_{h(x_0)}$ corresponding to an observer living on the geodesic $h(x_0)$, the energy measurements (relative to this observer) give exponentially damped expectation values in the range of negative energies. In the limit of flat space-time the l.h.s. of Eq. (91) would be equal to zero for ω > 0, which corresponds to recovering the usual spectral condition of “positivity of the energy”.

Remark 6. The antipodal spectral condition (92) asserts that the spectral measure $dE_{h(x_0)}$ has exponentially damped matrix elements, in the high energy limit, between states localized in the mutually antipodal regions $U_{h(x_0)}$ and $\check{U}_{h(x_0)}$.

Remark 7. All the features that have been discussed in this section are also naturally interpreted in terms of the existence of an antiunitary involution J relating the algebras $P(U_{h(x_0)})$ and $P(\check{U}_{h(x_0)})$ and the validity of the corresponding Bisognano–Wichmann duality theorem for the von Neumann algebras $A(U_{h(x_0)})$ and $A(\check{U}_{h(x_0)})$ [1, 4].

5. A Consequence of Positivity and Weak Spectral Condition: the Reeh-Schlieder Property

In this section we wish to show that the vector-valued distributions $f_n \mapsto \langle \Phi^{(b)}_n, f_n \rangle$ (which are provided by the GNS construction, see Eq. (13)) are boundary values of vector-valued functions holomorphic in the tuboids $Z_n = Z_{n+} = Z_{n,d+1} \cap X_d^{(c)n}$, where
\[ Z_{n,d+1} = \{ z \in \mathbb{C}^{n(d+1)};\ y_1 \in V_+,\ y_j - y_{j-1} \in V_+,\ j = 2, \ldots, n \}, \tag{93} \]
with, in particular, the Reeh-Schlieder property as a consequence. We will also use $Z'_n = Z'_{n+} = \{ z \in X_d^{(c)n};\ y_n \in V_-,\ y_j - y_{j-1} \in V_+,\ j = 2, \ldots, n \}$. In the Minkowskian, flat, d-dimensional case, assuming the temperedness condition, as a consequence of the spectral condition (see e.g. [28]), the vector-valued distribution $\Phi^{(b)}_n$ is the Fourier transform of a vector-valued tempered distribution with support in the cone dual to the base of the tube $Z_{n,d}$. 
Hence $\Phi^{(b)}_n$ is the boundary value of a function holomorphic in $Z_{n,d}$. This fact can also be seen, in this case, by using the maximum principle and the fact that the distinguished boundary of $Z_{n,d}$ is $\mathbb{R}^{dn}$. These tools are not available in the de Sitterian case, but, as mentioned before, a theorem of V. Glaser, stated below, can be used in conjunction with the positivity and weak spectral conditions, to prove:
Theorem 5. There exists, for each n ≥ 1, a function $\Phi_n$ holomorphic in $Z_n$ with values in H such that $\Phi^{(b)}_n$ is the boundary value of $\Phi_n$ in the sense of distributions and of the Hilbert space topology.

Theorem 5 implies the Reeh-Schlieder property:

Theorem 6 (Reeh-Schlieder). For every open subset O of $X_d$, the vacuum is cyclic for the algebra of all field polynomials localized in O.

Proof. For every $\Psi \in H$ and every n ≥ 1, the distribution $(\Psi, \Phi^{(b)}_n)$ is the boundary value of the function $z \mapsto (\Psi, \Phi_n(z))$, holomorphic in $Z_n$. If O is an open subset of $X_d$ such that $(\Psi, \langle \Phi^{(b)}_n, \varphi \rangle)$ vanishes for every $\varphi \in D(O^n)$ then it vanishes for all $\varphi \in D(X_d^n)$ by analytic continuation, and since the vector space $P(X_d)\Omega$ is dense in H, this implies that Ψ = 0. Therefore the vector space generated by $\{ \langle \Phi^{(b)}_n, \varphi \rangle :\ \varphi \in D(O^n),\ n \in \mathbb{N} \}$ is dense in H.

To prove Theorem 5 we shall make use of the following immediate consequence of the weak spectral condition.

Proposition 3. For each pair of integers (m, n), the function $(w, z) \mapsto W_{m+n}(w, z)$, ($w \in X_d^{(c)m}$, $z \in X_d^{(c)n}$), is holomorphic in the corresponding topological product $Z'_m \times Z_n$.

We are now in a position to apply the following theorem proved by V. Glaser in [20] (see also a restatement in [21]). We suppose given a finite sequence of non-empty domains $U_n \subset \mathbb{C}^{N_n}$, 1 ≤ n ≤ M, where the $N_n$ are integers and $N_n \geq 1$. We set $N_0 = 0$, i.e. $U_0$ can be considered as consisting of a single point. $U_n^{*}$ will denote the complex conjugate domain of $U_n$. For n ≥ 1, $\lambda_n$ denotes the Lebesgue measure in $\mathbb{C}^{N_n} \equiv \mathbb{R}^{2N_n}$.

Glaser’s Theorem 1. For each pair of integers (n, m) with 0 ≤ n, m ≤ M, let $(p_n, q_m) \mapsto A_{nm}(p_n, q_m)$ be a holomorphic function on $U_n \times U_m^{*}$. (In particular $A_{00}$ is just a complex number.) 
Then the following properties are equivalent:
(G.0) For each $n \in [1, M]$, there is an open neighborhood $V_n$ of 0 in $\mathbb{R}^{N_n}$ and a point $p_n \in U_n$ such that $p_n + V_n \subset U_n$ and, for each sequence $\{f_n\}_{0 \leq n \leq M}$, $f_0 \in \mathbb{C}$, $f_n \in D(V_n)$ for n > 0,
\[ \sum_{0 \leq n, m \leq M} \int_{\mathbb{R}^{N_n} \times \mathbb{R}^{N_m}} A_{nm}(p_n + h_n, \bar{p}_m + k_m)\, \bar{f}_n(h_n)\, f_m(k_m)\, dh_n\, dk_m \geq 0 \tag{94} \]
(with an obvious meaning when n or m is equal to 0).
(G.1) For every sequence $\{g_n\}_{0 \leq n \leq M}$, $g_0 \in \mathbb{C}$, $g_n \in D(U_n)$ for n > 0,
\[ \sum_{0 \leq n, m \leq M} \int_{U_n \times U_m} A_{nm}(p_n, \bar{q}_m)\, \bar{g}_n(p_n)\, g_m(q_m)\, d\lambda_n(p_n)\, d\lambda_m(q_m) \geq 0. \tag{95} \]
(G’.1) For each $n \in [1, M]$, there is an open subset $\omega_n$ of $U_n$ such that for every sequence $\{g_n\}_{0 \leq n \leq M}$, $g_0 \in \mathbb{C}$, $g_n \in D(\omega_n)$ for n > 0,
\[ \sum_{0 \leq n, m \leq M} \int_{\omega_n \times \omega_m} A_{nm}(p_n, \bar{q}_m)\, \bar{g}_n(p_n)\, g_m(q_m)\, d\lambda_n(p_n)\, d\lambda_m(q_m) \geq 0. \tag{96} \]
(G.2) There is a sequence $\{f_{\nu, 0}\}_{\nu \in \mathbb{N}} \subset \mathbb{C}$ and, for each $n \in [1, M]$, a sequence $\{f_{\nu, n}\}_{\nu \in \mathbb{N}}$ of functions holomorphic in $U_n$, such that
\[ A_{nm}(p_n, q_m) = \sum_{\nu \in \mathbb{N}} f_{\nu, n}(p_n)\, \overline{f_{\nu, m}(\bar{q}_m)} \tag{97} \]
holds in the sense of uniform convergence on every compact subset of $U_n \times U_m^{*}$, again with an obvious meaning when n or m is equal to 0.
(G.3) For every sequence $\{p_n \in U_n\}_{1 \leq n \leq M}$, and every finite sequence $\{a_{(n)\alpha}\}$ of complex numbers,
\[ Q_p(a, a) = \sum_{n, m} \sum_{\alpha, \beta} \frac{a_{(n)\alpha}\, \bar{a}_{(m)\beta}}{\alpha!\, \beta!}\, \partial^{\alpha}_{p_n} \partial^{\beta}_{\bar{p}_m} A_{nm}(p, \bar{p}) \geq 0. \tag{98} \]
(G.4) There is a particular sequence $\{p_n \in U_n\}_{1 \leq n \leq M}$ such that, for every finite sequence $\{a_{(n)\alpha}\}$ of complex numbers, $Q_p(a, a) \geq 0$.

The following striking theorem, also proved in [20], is mentioned here for completeness although it is not used in the proof of Theorem 5:

Glaser’s Theorem 2. Let U be a non-empty simply connected domain in $\mathbb{C}^N$ (with N ≥ 1), and F a distribution over U, such that, for every finite sequence $\{a_{\alpha}\}$ of complex numbers indexed by N-multiindices,
\[ \sum_{\alpha, \beta} \frac{a_{\alpha}\, \bar{a}_{\beta}}{\alpha!\, \beta!}\, \partial^{\alpha} \bar{\partial}^{\beta} F \geq 0 \tag{99} \]
(in the sense of distributions). Then there is a function $(p, q) \mapsto A(p, q)$, holomorphic on $U \times U^{*}$ and possessing the properties (G.1)–(G.3) of Glaser’s Theorem 1 (in the case M = 1), such that F coincides with $p \mapsto A(p, \bar{p})$.

Remark 8. The statement of Glaser’s Theorem 1 does not literally coincide with the original in [20], but it follows from the proofs given there.

Remark 9. In the condition (G.1) one can equivalently require the $g_n$ to be arbitrary complex measures with compact support contained in $U_n$. Since any measure can be weakly approximated by finite linear combinations of Dirac measures, the condition (G.1) is equivalent to
(G”.1) For every finite sequence $\{ (c_{n,l}, t_{n,l}) :\ c_{n,l} \in \mathbb{C},\ t_{n,l} \in U_n,\ 0 \leq n \leq M,\ 1 \leq l \leq L \}$,
\[ \sum_{n, m = 0}^{M} \sum_{l, k = 1}^{L} \bar{c}_{n,l}\, c_{m,k}\, A_{nm}(t_{n,l}, \bar{t}_{m,k}) \geq 0. \tag{100} \]
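The link between the sum-of-squares representation (G.2) and the pointwise positivity condition (G”.1) can be seen on a toy kernel. The sketch below is an illustration only, under the following assumptions: a single domain (the open unit disc, standing in for $U_1$), the hypothetical kernel $A(p, q) = 1/(1 - pq) = \sum_{\nu} p^{\nu} q^{\nu}$, which has the form (97) with $f_{\nu}(p) = p^{\nu}$, and conjugations placed as in Eq. (95). None of these choices comes from the original text.

```python
import random


def A(p, q):
    """Toy kernel A(p, q) = 1 / (1 - p q), holomorphic for |p|, |q| < 1."""
    return 1.0 / (1.0 - p * q)


def quadratic_form(coeffs, points):
    """sum_{l,k} conj(c_l) c_k A(t_l, conj(t_k)): the single-domain analogue of (G''.1)."""
    return sum(
        cl.conjugate() * ck * A(tl, tk.conjugate())
        for cl, tl in zip(coeffs, points)
        for ck, tk in zip(coeffs, points)
    )


def sum_of_squares(coeffs, points, terms=120):
    """Truncated sum over nu of |sum_l conj(c_l) t_l**nu|**2, from f_nu(p) = p**nu."""
    return sum(
        abs(sum(c.conjugate() * t ** nu for c, t in zip(coeffs, points))) ** 2
        for nu in range(terms)
    )


def sample(n, seed=0):
    """Random coefficients and points with |t|**2 <= 0.5, so the series converges fast."""
    rng = random.Random(seed)
    coeffs = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    points = [complex(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(n)]
    return coeffs, points
```

For `coeffs, points = sample(5)`, the value of `quadratic_form(coeffs, points)` is real and non-negative up to rounding and agrees with `sum_of_squares(coeffs, points)`; this is the elementary mechanism by which a representation of type (97) forces positivity of type (100).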
Remark 10. Apart from condition (G.0), the properties mentioned in these theorems are essentially invariant under holomorphic self-conjugated coordinate changes and in fact the various $U_n$ can be replaced by connected complex manifolds which are separable at infinity (i.e. are unions of increasing sequences of compacts) as it can be seen from the sketch of the proof given in Appendix C.

Proof of Theorem 5. Taking into account Remark 10, we shall apply Glaser’s Theorem 1 to the case when each $U_n$ is the domain $Z_n$ of the corresponding manifold $X_d^{(c)n}$ and
\[ A_{00} = 1, \qquad A_{nm}(z, w) = W_{m+n}(w^{\leftarrow}, z) = W_{m+n}(w_m, \ldots, w_1, z_1, \ldots, z_n), \tag{101} \]
with $n, m \in [0, M]$, M being any fixed integer. In fact, in view of Proposition 3 and of the remark that
\[ Z_m^{*} = \{ w = (w_1, \ldots, w_m) \in X_d^{(c)m};\ w^{\leftarrow} = (w_m, \ldots, w_1) \in Z'_m \}, \tag{102} \]
it follows that for all pairs of integers (n, m) the functions of (z, w) defined by Eq. (101) are holomorphic in the domains $Z_n \times Z_m^{*}$. Now our aim is to prove that, as a consequence of the positivity property (11), these functions possess the properties (G.0)–(G.4) of Glaser’s Theorem 1. Let a be a particular point of $X_d$ (e.g. a = (0, . . . , 0, R)). It is clear that we can define, for each n ≥ 1, a holomorphic diffeomorphism $\sigma_n$ of an open ball centered at 0 in $\mathbb{C}^{nd}$ onto a complex neighborhood $N_n$ of $a_n = (a, a, \ldots, a)$ in $X_d^{(c)n}$ with the following properties:
1. $\sigma_n$ is self-conjugate, i.e. $\sigma_n(\bar{z}) = \overline{\sigma_n(z)}$ for all z.
2. $\sigma_n(0) = a_n$.
3. $\sigma_n$ maps the “local tube”
\[ \{ z = x + iy \in \mathbb{C}^{nd} :\ |z_j| < 1,\ 0 < y_j,\ 1 \leq j \leq nd \} \tag{103} \]
into $Z_n \cap N_n$.
In $\sigma_n^{-1}(N_n \cap Z_n) \times \sigma_m^{-1}(N_m \cap Z_m^{*})$, there holds (in view of the distribution character of the boundary values of the $A_{nm}$ on $X_d^{n+m}$):
\[ |A_{nm}(\sigma_n(z), \sigma_m(z'))| \leq K \Big( \sum_j |\mathrm{Im}\, z_j|^{-r} + \sum_j |\mathrm{Im}\, z'_j|^{-r} \Big), \tag{104} \]
where K > 0 and r ≥ 0 may be taken independent of $n, m \in [1, M]$. By composing $\sigma_n$ with $z_j = \mathrm{th}(\zeta_j/2)$, we obtain a self-conjugate holomorphic diffeomorphism $\tau_n$ of the tube
\[ \{ \zeta = \xi + i\eta \in \mathbb{C}^{nd} :\ |\eta_j| < \pi/2,\ 1 \leq j \leq nd \} \tag{105} \]
onto a complex neighborhood of $a_n$ in $X_d^{(c)n}$ such that $\tau_n(0) = a_n$ and the image of the tube
\[ \Theta_n = \{ \zeta = \xi + i\eta \in \mathbb{C}^{nd} :\ 0 < \eta_j < \pi/2,\ 1 \leq j \leq nd \} \tag{106} \]
is contained in $Z_n$. Let
\[ B_{nm}(\zeta, \zeta') = A_{nm}(\tau_n(\zeta), \tau_m(\zeta')). \tag{107} \]
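The functions $B_{nm}$ are holomorphic in $\Theta_n \times \Theta_m^*$. Since for $\zeta = \xi + i\eta \in \mathbf{C}$,
\[ \mathrm{th}(\zeta/2) = \frac{\mathrm{sh}\,\xi + i \sin\eta}{2\,|\mathrm{ch}(\zeta/2)|^2}, \tag{108} \]
the $B_{nm}$ satisfy
\[ |B_{nm}(\zeta, \zeta')| \le K' \Big( \sum_j \frac{e^{|\xi_j|}}{|\sin\eta_j|} \Big)^r + K' \Big( \sum_j \frac{e^{|\xi'_j|}}{|\sin\eta'_j|} \Big)^r, \qquad \forall \zeta = \xi + i\eta \in \Theta_n,\ \zeta' = \xi' + i\eta' \in \Theta_m^*. \tag{109} \]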
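The identity (108) is what turns the polynomial bound (104) into the exponential bound (109), so it may be worth checking numerically. The following is only a minimal sketch: the function names and the sample points are illustrative choices, not part of the original argument.

```python
import cmath
import math

def th_half_direct(zeta):
    """tanh(zeta / 2) computed directly."""
    return cmath.tanh(zeta / 2)

def th_half_formula(zeta):
    """Right-hand side of Eq. (108): (sh xi + i sin eta) / (2 |ch(zeta/2)|^2)."""
    xi, eta = zeta.real, zeta.imag
    denom = 2 * abs(cmath.cosh(zeta / 2)) ** 2
    return complex(math.sinh(xi), math.sin(eta)) / denom

# Sample points with 0 < eta < pi/2 (illustrative values only).
samples = [complex(0.3, 0.7), complex(-1.2, 1.5), complex(2.0, 0.1)]
max_err = max(abs(th_half_direct(z) - th_half_formula(z)) for z in samples)
images = [th_half_direct(z) for z in samples]
```

For points of the strip $0 < \eta < \pi/2$ the images have positive imaginary part and modulus less than 1, which is consistent with the tube (106) being sent into the local tube (103).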
They have boundary values $B^{(v)}_{nm}$ in the sense of generalized functions over test-functions of faster than exponential decrease. These boundary values satisfy, for each finite sequence $\{f_n\}$, $f_0 \in \mathbf{C}$, $f_n \in \mathcal{D}(\mathbf{R}^{nd})$ for $n \ge 1$,
\[ \sum_{n, m} \int B^{(v)}_{nm}(\xi, \xi')\, f_n(\xi)\, \overline{f_m(\xi')}\, d\xi\, d\xi' \ge 0. \tag{110} \]
Let now
\[ \rho_{n,\varepsilon}(\xi) = C(\varepsilon) \exp\Big( - \sum_{j=1}^{nd} \xi_j^2/\varepsilon \Big), \tag{111} \]
where $C(\varepsilon)$ is chosen so that $\int \rho_{n,\varepsilon}(\xi)\, d\xi = 1$. For each $\mu \in \mathbf{C}^{nd}$, the function $\xi \mapsto \rho_{n,\varepsilon}(\xi + \mu)$ is of gaussian decrease, and depends holomorphically on $\mu$. In particular if $\mu_n \in \Theta_n$, $\mu'_m \in \Theta_m^*$,
\[ \int B^{(v)}_{nm}(t, t')\, f_n(\xi)\, \rho_{n,\varepsilon}(t - \mu_n - \xi)\, \overline{f_m(\xi')\, \rho_{m,\varepsilon}(t' - \bar\mu'_m - \xi')}\, dt\, dt'\, d\xi\, d\xi' = \int B_{nm}(t + \mu_n, t' + \mu'_m)\, f_n(\xi)\, \rho_{n,\varepsilon}(t - \xi)\, \overline{f_m(\xi')}\, \rho_{m,\varepsilon}(t' - \xi')\, dt\, dt'\, d\xi\, d\xi', \tag{112} \]
since both sides define analytic functions in $\Theta_n \times \Theta_m^*$ whose boundary values for real $\mu_n$, $\mu'_m$ coincide (the integration is over $\mathbf{R}^{nd} \times \mathbf{R}^{md}$). The lhs satisfies the positivity conditions, by virtue of Eq. (110), if we choose $\mu'_n = \bar\mu_n$ for all $n$. It follows, by letting $\varepsilon$ tend to 0 in the rhs, that the functions $B_{nm}$ have the property (G.0) of Glaser's Theorem 1 and therefore all the properties (G.0)–(G.4) in the sequence of domains $\{\Theta_n\}$. Coming back to the original variables, Glaser's Theorem 1 now shows that the same properties, in particular (G.2), extend to the entire tuboid $\{Z_n\}$. We have thus proved the following

Proposition 4. For any integer $M \ge 1$, there exist a sequence $\{F_{\nu,0} \in \mathbf{C}\}_{\nu \in \mathbf{N}}$ and, for each integer $n \in [1, M]$, a sequence $\{F_{\nu,n}\}_{\nu \in \mathbf{N}}$ of functions holomorphic in $Z_n$, such that, for every $n$ and $m$ in $[1, M]$, $z \in Z_n$, $w \in Z_m^*$,
\[ W_{m+n}(w^{\leftarrow}, z) = \sum_{\nu \in \mathbf{N}} \overline{F_{\nu,m}(\bar w)}\, F_{\nu,n}(z), \tag{113} \]
where the convergence is uniform on every compact subset of $Z_m^* \times Z_n$.
In particular
\[ W_{2n}(\bar z^{\leftarrow}, z) = \sum_{\nu \in \mathbf{N}} |F_{\nu,n}(z)|^2, \tag{114} \]
(so that if the temperedness condition holds, each $F_{\nu,n}$ has polynomial behavior at infinity and near the reals).

Let now $\{\varphi_m\}_{1 \le m \le M}$ be a sequence of test-functions, $\varphi_m \in \mathcal{D}(X_d^m)$, $\varphi_0 \in \mathbf{C}$. We continue to denote $\varphi_m$ a $C^\infty$ extension of $\varphi_m$ with compact support over $X_d^{(c)m}$. Let $C(m, \varepsilon)$ be, for each $m \in [1, M]$ and $\varepsilon \ge 0$, an $(md)$-cycle, contained in $Z_m$ for $\varepsilon > 0$, equal to $X_d^m$ for $\varepsilon = 0$, and continuously depending on $\varepsilon$. Using Proposition 4 and Schwarz's inequality, we find, for any $z \in Z_n$ ($n$ being fixed and $M$ chosen arbitrarily such that $n \le M$),
\[
\begin{aligned}
\Big| \sum_{0 \le m \le M} \int_{C(m,\varepsilon)} \varphi_m(w)\, W_{m+n}(\bar w^{\leftarrow}, z)\, d\bar w_1 \wedge \ldots \wedge d\bar w_m \Big|^2
&= \Big| \sum_{\nu \in \mathbf{N}} \Big[ \sum_{0 \le m \le M} \int_{C(m,\varepsilon)} \varphi_m(w)\, \overline{F_{\nu,m}(w)}\, d\bar w_1 \wedge \ldots \wedge d\bar w_m \Big] F_{\nu,n}(z) \Big|^2 \\
&\le \sum_{\nu \in \mathbf{N}} |F_{\nu,n}(z)|^2 \times \Big[ \sum_{0 \le m, k \le M} \int_{C(m,\varepsilon) \times C(k,\varepsilon)} \varphi_m(w)\, \overline{\varphi_k(w')}\, W_{m+k}(\bar w^{\leftarrow}, w')\, d\bar w_1 \wedge \ldots \wedge d\bar w_m \wedge dw'_1 \wedge \ldots \wedge dw'_k \Big].
\end{aligned}
\tag{115}
\]
Taking Eq. (114) into account and letting $\varepsilon$ tend to 0 then yield:
\[ \Big| \sum_{0 \le m \le M} \int_{X_d^m} \varphi_m(w)\, W_{m+n}(w^{\leftarrow}, z)\, dw_1 \ldots dw_m \Big|^2 \le W_{2n}(\bar z^{\leftarrow}, z)\, \Big\| \sum_{0 \le m \le M} \int_{X_d^m} \Phi^{(b)}_m(w)\, \varphi_m(w)\, dw \Big\|^2. \tag{116} \]
Since the latter holds for any (arbitrarily large) value of $M$, i.e. for a dense set of vectors $\Phi(\varphi)$ in $\mathcal{H}$, this shows that for every $n$ there is a vector $\Phi_n(z) \in \mathcal{H}$ such that
\[ \Big( \int_{X_d^m} \Phi^{(b)}_m(w)\, \varphi_m(w)\, dw,\ \Phi_n(z) \Big) = \int_{X_d^m} W_{m+n}(w^{\leftarrow}, z)\, \varphi_m(w)\, dw. \tag{117} \]
Integrating similarly in $z$ over a cycle such as $C(n, \varepsilon)$, and letting $\varepsilon$ tend to 0 show that $\Phi_n$ admits $\Phi^{(b)}_n$ as its boundary value in the sense of distributions and Theorem 5 follows.

Remark 11. This proof is valid for some other spaces besides de Sitter space. What is really used is that the space is real-analytic and that the Wightman distributions are boundary values of functions $W_{m+n}(w^{\leftarrow}, z)$ holomorphic in products of the form $U_m^* \times U_n$, where the $U_n$ are connected complex tuboids.

Remark 12. Neither temperedness nor locality have been used.

Remark 13. By using the PCT property, the BW analyticity and the Reeh-Schlieder property it is possible to recover the full Bisognano–Wichmann theorem in the de Sitter case. We do not give the details here.
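A. Appendix. A Lemma of Analytic Completion

In this appendix we prove a simple lemma of analytic completion by applying the convex tube theorem, according to which any function which is holomorphic in a tube $\mathbf{R}^n + iB$, where $B$ is a domain in $\mathbf{R}^n$, can be analytically continued in the convex hull of this tube (see [5, 38, 16, 15]). $\mathbf{C}_+$ denotes the upper half-plane.

Lemma 3. (i) Let
\[ P = \{ z \in \mathbf{C}^N : |z_j| < 1,\ \mathrm{Im}\, z_j > 0,\ \forall j = 1, \ldots, N \}. \tag{118} \]
Let $D$ be a domain in $\mathbf{C}^N$, containing $P$, and $\Omega$ a domain in $\mathbf{C} \times \mathbf{C}^N$ of the form
\[ \Omega = \mathcal{N} \cap (\mathbf{C}_+ \times D), \tag{119} \]
where $\mathcal{N}$ is an open neighborhood, in $\mathbf{C}^{1+N}$, of the set
\[ \big( (\mathbf{R} \setminus \{0\}) \times D \big) \cup \big( (\overline{\mathbf{C}_+} \setminus \{0\}) \times \{ z \in \mathbf{C}^N : |z_j| < 1,\ \mathrm{Im}\, z_j = 0,\ \forall j = 1, \ldots, N \} \big). \tag{120} \]
Then any function holomorphic in $\Omega$ has a holomorphic extension in $\mathbf{C}_+ \times D$.

(ii) Let $D'$ be a domain in $\mathbf{C}^N$, containing
\[ Q = \{ z \in \mathbf{C}^N : |z_j| < 1,\ \forall j = 1, \ldots, N \}, \tag{121} \]
and $\Omega'$ a domain in $\mathbf{C} \times \mathbf{C}^N$ of the form
\[ \Omega' = (\mathbf{C}_+ \times Q) \cup (\mathcal{N}' \cap (\mathbf{C}_+ \times D')), \tag{122} \]
where $\mathcal{N}'$ is an open neighborhood, in $\mathbf{C}^{1+N}$, of $(\mathbf{R} \setminus \{0\}) \times D'$. Then any function holomorphic in $\Omega'$ has a holomorphic extension in $\mathbf{C}_+ \times D'$.

Remark 14. By setting $w = e^{\pi\sigma}$ the upper half-plane can be replaced by the strip $\{\sigma : 0 < \mathrm{Im}\, \sigma < 1\}$, and $\mathbf{R} \setminus \{0\}$ by the boundary of that strip.

1. We start by proving Lemma 3 (i) for the case when $D = P$. This follows from:

Lemma 4. Let $a \in (0, 1)$ and $\Delta'_a$ a domain in $\mathbf{C} \times \mathbf{C}^N$ of the form $V \cap (\mathbf{C}_+ \times P)$, where $V$ is an open neighborhood in $\mathbf{C}^{1+N}$ of
\[ \{ (w, z) \in \mathbf{C}^{1+N} : w \in \mathbf{R},\ a < |w| < 1/a,\ z \in P \} \cup \{ (w, z) \in \mathbf{C}^{1+N} : w \in \mathbf{C}_+ \cup (-1/a, -a) \cup (a, 1/a),\ |z_j| < 1,\ \mathrm{Im}\, z_j = 0,\ \forall j = 1, \ldots, N \}. \tag{123} \]
Then any function $f$ holomorphic in $\Delta'_a$ has a holomorphic extension in the domain
\[ \Delta_a = \bigcup_{0 < \theta < \pi} W_a(\theta) \times Z(\theta), \tag{124} \]
where:
\[ Z(\theta) = \Big\{ z \in \mathbf{C}^N : \forall j = 1, \ldots, N,\ \mathrm{Im}\, z_j > 0,\ 2\, \mathrm{Im} \log \frac{1 + z_j}{1 - z_j} < \theta \Big\}, \tag{125} \]
\[ W_a(\theta) = \{ w \in \mathbf{C} : 0 < \mathrm{Im}\, w,\ 0 < \mathrm{Im}\, \Phi(w, a) < \pi - \theta \}, \tag{126} \]
\[ \Phi(w, a) = i\pi - \log \frac{w - a^{-1}}{w - a} - \log \frac{w + a}{w + a^{-1}}, \qquad (\mathrm{Im}\, w \neq 0). \tag{127} \]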
Remark 15. The function $w \mapsto \mathrm{Im}\, \Phi(w, a)$ is the bounded harmonic function in the upper half-plane with boundary values equal to 0 on the real segments $(-a^{-1}, -a)$ and $(a, a^{-1})$, and to $\pi$ on the other real points. $\pi - \mathrm{Im}\, \Phi(w, a)$ is the sum of the angles under which these two segments are seen from the point $w$.

Proof. We shall make use (at several places and in several complex variables) of the following conformal map. For $A > 0$ and $B > 0$, we denote $L(A, B)$ the open lunule in the $W$-plane bounded by the real segment $[-A, A]$ and the circular arc going through the points $-A$, $iB$, and $A$. This domain is conformally mapped onto the strip $\{\lambda \in \mathbf{C} : 0 < \mathrm{Im}\, \lambda < 2\, \mathrm{Arctg}(B/A)\}$ by the map
\[ W \mapsto \lambda = \log \frac{A + W}{A - W}, \tag{128} \]
whose inverse is
\[ \lambda \mapsto W = A\, \mathrm{th}(\lambda/2). \tag{129} \]
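As a quick numerical illustration of the lunule map (128) and its inverse (129), the sketch below checks that the apex $iB$ lands on the upper edge of the strip and that an interior point stays inside it. The values of $A$, $B$ and the test point are arbitrary illustrative choices, and the principal branch of the logarithm is assumed.

```python
import cmath
import math

def to_strip(W, A):
    """Eq. (128): lambda = log((A + W) / (A - W)), principal branch."""
    return cmath.log((A + W) / (A - W))

def from_strip(lam, A):
    """Eq. (129): W = A th(lambda / 2)."""
    return A * cmath.tanh(lam / 2)

A, B = 2.0, 1.0                        # illustrative lunule L(A, B)
width = 2 * math.atan(B / A)           # claimed width of the image strip

apex_image = to_strip(complex(0.0, B), A)   # image of the apex iB
W0 = complex(0.5, 0.3)                      # a point inside L(A, B)
lam0 = to_strip(W0, A)
round_trip = from_strip(lam0, A)
```

The apex is sent to the purely imaginary point $2i\, \mathrm{Arctg}(B/A)$, in agreement with the stated width of the strip.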
Both the hypotheses of Lemma 4 and the function $\Phi$ are left invariant by the transformation $w \mapsto -1/w$. In fact, denoting $b = a^{-1}$ and
\[ \mu(w) = w - 1/w \tag{130} \]
we have
\[ \frac{(w - b)(w + a)}{(w - a)(w + b)} = \frac{\mu - (b - a)}{\mu + (b - a)} \equiv -1/\varphi(\mu), \tag{131} \]
\[ \Phi(w, a) = \log \varphi(\mu). \tag{132} \]
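The relations (130)–(132) can be spot-checked numerically. The sketch below compares $\Phi$ computed from (127) with $\log \varphi(\mu)$ and tests the invariance under $w \mapsto -1/w$; the function names, the sample values of $a$ and $w$, and the use of principal logarithms are assumptions made for illustration.

```python
import cmath
import math

def mu(w):
    """Eq. (130): mu(w) = w - 1/w."""
    return w - 1 / w

def phi(w, a):
    """phi(mu) from Eq. (131), i.e. -(mu + (b - a)) / (mu - (b - a)) with b = 1/a."""
    b = 1 / a
    m = mu(w)
    return (b - a + m) / (b - a - m)

def Phi_from_mu(w, a):
    """Eq. (132): Phi = log phi(mu), principal branch."""
    return cmath.log(phi(w, a))

def Phi_from_w(w, a):
    """Eq. (127), written directly in the variable w."""
    b = 1 / a
    return (1j * math.pi
            - cmath.log((w - b) / (w - a))
            - cmath.log((w + a) / (w + b)))

a = 0.5                      # illustrative value in (0, 1)
w = complex(0.4, 0.9)        # illustrative point of the upper half-plane
diff_forms = abs(Phi_from_mu(w, a) - Phi_from_w(w, a))
diff_inversion = abs(Phi_from_mu(-1 / w, a) - Phi_from_mu(w, a))
im_Phi = Phi_from_mu(w, a).imag
```

For $w$ in the upper half-plane the imaginary part of $\Phi$ falls strictly between 0 and $\pi$, as Remark 15 describes.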
A function $f$ holomorphic in $\Delta'_a$ can be rewritten in the form:
\[ f(w, z) = f_s(w, z) + (w + w^{-1})\, f_a(w, z), \tag{133} \]
where
\[ f_s(w, z) = \frac{1}{2} \big( f(w, z) + f(-w^{-1}, z) \big), \qquad f_a(w, z) = \frac{1}{2(w + w^{-1})} \big( f(w, z) - f(-w^{-1}, z) \big). \tag{134} \]
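The decomposition (133)–(134) is elementary, and a short numerical sketch may make the invariance of both pieces under $w \mapsto -w^{-1}$ concrete. The test function `f` and the sample point below are hypothetical choices, not taken from the paper.

```python
def split(f):
    """Return (f_s, f_a) as in Eq. (134) for a function f(w, z)."""
    def f_s(w, z):
        return 0.5 * (f(w, z) + f(-1 / w, z))

    def f_a(w, z):
        return (f(w, z) - f(-1 / w, z)) / (2 * (w + 1 / w))

    return f_s, f_a

# Hypothetical test function and sample point (w must avoid 0 and +-i).
f = lambda w, z: w ** 3 + z * w
f_s, f_a = split(f)
w, z = complex(0.7, 1.1), complex(0.2, -0.3)

recombined = f_s(w, z) + (w + 1 / w) * f_a(w, z)   # Eq. (133)
err_recombine = abs(recombined - f(w, z))
err_sym_s = abs(f_s(-1 / w, z) - f_s(w, z))        # f_s is invariant
err_sym_a = abs(f_a(-1 / w, z) - f_a(w, z))        # f_a is invariant
```

The factor $w + w^{-1}$ changes sign under $w \mapsto -w^{-1}$, which is why dividing the antisymmetric part by it produces an invariant function.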
Both $f_s$ and $f_a$ are easily seen to have the properties postulated for $f$ itself, and they are moreover invariant under the transformation $w \mapsto -w^{-1}$. They can therefore be written as holomorphic functions $F_{s,a}(\mu, z)$ of $\mu = w - w^{-1}$ and $z$. We now perform the change of coordinates
\[ \mu \mapsto \omega = \log \varphi(\mu), \qquad z_j \mapsto \zeta_j = 2 \log \big( (1 + z_j)/(1 - z_j) \big), \tag{135} \]
i.e. define
\[ G_{s,a}(\omega, \zeta) = F_{s,a}(\mu, z) \tag{136} \]
with
\[ \mu = (b - a)\, \mathrm{th}(\omega/2), \tag{137} \]
\[ z_j = \mathrm{th}(\zeta_j/4). \tag{138} \]
The functions $G_{s,a}$ are holomorphic in a domain of the following form, which is the image of the domain $\Delta'_a$ into the space of variables $(\omega, \zeta)$ (after taking the successive maps given in Eqs. (130), (131) and (135) into account):
\[ U_1 = V_1 \cap \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : 0 < \mathrm{Im}\, \omega < \pi,\ 0 < \mathrm{Im}\, \zeta_j < \pi \}, \tag{139} \]
where $V_1$ is an open neighborhood, in $\mathbf{C}^{1+N}$, of the set
\[
\begin{aligned}
S_1 &= \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : \omega \in \mathbf{R},\ 0 < \mathrm{Im}\, \zeta_j < \pi \} \cup \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : 0 \le \mathrm{Im}\, \omega < \pi,\ \mathrm{Im}\, \zeta_j = 0 \} \\
&= \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : \omega \in \mathbf{R},\ 0 \le \mathrm{Im}\, \zeta_j < \pi \} \cup \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : 0 \le \mathrm{Im}\, \omega < \pi,\ \mathrm{Im}\, \zeta_j = 0 \}.
\end{aligned}
\tag{140}
\]
The domain $U_1$ of Eq. (139) is not a tube. We can however inscribe in it increasing unions of topological products of lunules which are isomorphic to tubes. In fact, for every $A > 1/\pi$, there exists an $\varepsilon > 0$ such that $U_1$ contains
\[ V_A = \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : \omega \in L(A, \varepsilon),\ \zeta_j \in L(A, \pi - 1/A) \} \cup \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : \omega \in L(A, \pi - 1/A),\ \zeta_j \in L(A, \varepsilon) \}. \tag{141} \]
Using the conformal map (128) in all variables, we can map $V_A$ into a tube whose holomorphy envelope is its convex hull. Returning to the variables $(\omega, \zeta)$, and taking the limit $A \to \infty$ shows that the functions $G_{s,a}$ are holomorphic in the interior of the convex hull of $S_1$, namely
\[ \bigcup_{0 < \theta < \pi} \{ (\omega, \zeta) \in \mathbf{C}^{1+N} : 0 < \mathrm{Im}\, \omega < \pi - \theta,\ 0 < \mathrm{Im}\, \zeta_j < \theta,\ \forall j \}. \tag{142} \]
This set is the image of the domain $\Delta_a$ introduced in Eq. (124) under the mapping $w \mapsto \mu = w - w^{-1} \mapsto \omega$, $z \mapsto \zeta$ defined in Eq. (135), and therefore the assertion of Lemma 4 follows. Lemma 3 (i) in the special case $D = P$ follows from the latter by letting $a$ tend to 0.

2. We now prove Lemma 3 (ii) in the case when $D' = \rho\, Q$ for some real $\rho > 1$. The proof of this is the same as that of Lemma 4, except that the change of coordinates (138) is replaced by
\[ z_j = \exp(i\zeta_j), \qquad (1 \le j \le N). \tag{143} \]
This again allows the use of the tube theorem.

3. Lemma 3 (ii) follows from this by using chains of polydisks, and (i) follows in the same way from the special case $D = P$ and (ii).
B. Appendix. A Lemma of Hall and Wightman

In [26], Hall and Wightman prove the following lemma.

Lemma 5. Let $M \in L_+(\mathbf{C})$ be such that $T_+ \cap M^{-1} T_+ \neq \emptyset$. There exists a continuous path $t \mapsto M(t)$ from the interval $[0, 1]$ into $L_+(\mathbf{C})$ such that $M(0) = 1$, $M(1) = M$ and that, for every $z \in T_+ \cap M^{-1} T_+ \subset \mathbf{C}^{d+1}$, $M(t) z \in T_+$ holds for all $t \in [0, 1]$.

This lemma is proved in [26] for the case $d + 1 \le 4$ (a very clear exposition also appears in [35]). It is extended to all dimensions in [27]. We give another proof based on holomorphic continuation. As noted in the above references, if $M \in L_+(\mathbf{C})$ is such that the statement in Lemma 5 holds, then it holds for $\Lambda_1 M \Lambda_2$ for any $\Lambda_1, \Lambda_2 \in L_+^{\uparrow}$, as well as for $M^{-1}$. It is therefore sufficient to consider the case when $M$ is one of the normal forms classified by Jost in [27]. $M$ can then be written in the form:
\[ M = \begin{pmatrix} M_1(i) & 0 \\ 0 & M_2(i) \end{pmatrix}, \tag{144} \]
where $t \mapsto M_1(t)$ is a one-parameter subgroup of the $p \times p$ Lorentz group, real for real $t$, with $p \le 3$, and $t \mapsto M_2(t)$ is a one-parameter subgroup of the $(d + 1 - p) \times (d + 1 - p)$ orthogonal group, real for real $t$. In the generic case $p \le 2$, $M_1(t) = 1$ if $p = 1$ and, if $p = 2$, $M_1(t) = [\exp at]$ for some real $a$ with $|a| \le \pi$. We focus on this case first. Replacing $M$ by $M^{-1}$ if necessary, we may assume $0 < a \le \pi$. For any $z \in T_+$ the set $\Delta(z, M) = \{ t \in \mathbf{C} : M(t) z \in T_+ \}$ is invariant under real translations, i.e. is a union of open strips parallel to the real axis. Let $E(M) = T_+ \cap M^{-1} T_+ = \{ z \in \mathbf{C}^{d+1} : \mathbf{R} \cup (i + \mathbf{R}) \subset \Delta(z, M) \}$. Denote $z(s) = (z^{(0)}, z^{(1)}, s z^{(2)}, \ldots, s z^{(d)})$. If $z \in E(M)$, then $z(s) \in E(M)$ for all $s \in [0, 1]$. The set $\Delta(z(0), M)$ contains the segment $i[0, 1]$, and hence $i[0, 1] \subset \Delta(z', M)$ for all $z'$ in a sufficiently small neighborhood $N$ of $z(0)$. For $n \in \overline{V_+} \setminus \{0\}$ and $b \in \mathbf{C}$, with $\mathrm{Im}\, b \ge 0$, the function $(t, z) \mapsto H_{n,b}(t, z) = (n \cdot M(t) z + b)^{-1}$ is holomorphic in $\{ (t, z) : z \in E(M),\ t \in \Delta(z, M) \}$.
Applying Lemma 3 (ii), with $w$ replaced by the variable $\sigma$ of Remark 14, it follows that $H_{n,b}$ is holomorphic in $\{ t : 0 < \mathrm{Im}\, t < 1 \} \times E(M)$. Let us now assume that for some $z \in E(M)$ and some $t \in i[0, 1]$ the corresponding point $\zeta = M(t) z$ belongs to the complement of $T_+$; then, as explained below, one can determine $n$ and $b$ satisfying the previous conditions and such that $n \cdot \zeta + b = 0$, which therefore contradicts the previously proved analyticity property of the corresponding function $H_{n,b}$. In fact, for any complex point $\zeta = \xi + i\eta$ in the complement of $T_+$ (i.e. $\eta \notin V_+$), one can find $n \in \mathbf{R}^{d+1}$ and $c \in \mathbf{R}$ such that $n \cdot \eta + c = 0$, while $n \cdot r + c > 0$ for all $r \in V_+$. This implies $n \in \overline{V_+} \setminus \{0\}$ and $c \ge 0$. Hence there is a $b \in \mathbf{C}$ with $\mathrm{Im}\, b = c \ge 0$ such that $n \cdot \zeta + b = 0$ (while $\mathrm{Im}\, (n \cdot q + b) > 0$ for all $q \in T_+$). This proves Lemma 5 for all dimensions.

C. Appendix. Sketch of the Proof of Glaser's Theorem 1

This section closely follows the original [20] with a few unimportant alterations, mainly intended for the cases when $U_n$ might not be simply connected. The notations are those of Glaser's Theorem 1, and we also denote $\tilde{U}_n$ the universal covering space of $U_n$, $\iota_n$ the canonical projection of $\tilde{U}_n$ onto $U_n$. If $V$ is a complex manifold, $A(V)$ denotes the
set of holomorphic functions on $V$. It is clear that (G.1) $\Rightarrow$ (G'.1). The latter implies (G.0), by inserting $g_n(z_n) = f(z_n - p_n)\, \delta(\mathrm{Im}\, (z_n - p_n))$ in (G'.1). In turn (G.0) implies (G.4) by inserting $f_n(h_n) = \sum_\alpha a_\alpha(n)\, \partial^\alpha \delta(h_n)$ in (G.0).

1. The first step of the proof is to show that (G.4) $\Rightarrow$ (G.3). Assume that (G.4) holds. Let, for each $n \in [1, M]$, $R_n > 0$ be such that the closure of the polydisk $P_n = \{ z_n \in \mathbf{C}^{N_n} : |z_{n,j} - p_{n,j}| < R_n\ \forall j \}$ is contained in $U_n$. For any $\{ z_n \in P_n \}_{1 \le n \le M}$ and every finite sequence $b(n)_\alpha$,
\[ Q_z(b, b) = Q_p(a, a), \tag{145} \]
\[ a(n)_\alpha = \alpha! \sum_{\gamma \le \alpha} \frac{b_\gamma\, (z_n - p_n)^{\alpha - \gamma}}{\gamma!\, (\alpha - \gamma)!}. \tag{146} \]
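The coefficient change (146) is the usual re-expansion of Taylor data from the point $z_n$ to the center $p_n$. The sketch below illustrates it in one complex variable on a polynomial, under the assumption (not spelled out in this appendix) that the coefficients are paired with normalized derivatives $\partial^\alpha/\alpha!$; the helper names and sample data are hypothetical.

```python
from math import factorial

def deriv_at(coeffs, k, x):
    """k-th derivative at x of the polynomial sum_j coeffs[j] * t**j."""
    return sum(coeffs[j] * (factorial(j) // factorial(j - k)) * x ** (j - k)
               for j in range(k, len(coeffs)))

def reexpand(b, z, p, n_terms):
    """One-variable version of Eq. (146): coefficients a_alpha at p from b_gamma at z."""
    a = []
    for alpha in range(n_terms):
        s = sum(b[g] * (z - p) ** (alpha - g) / (factorial(g) * factorial(alpha - g))
                for g in range(min(alpha, len(b) - 1) + 1))
        a.append(factorial(alpha) * s)
    return a

coeffs = [1.0, -2.0, 0.5, 3.0]         # hypothetical cubic test polynomial
b = [1.0, 2.0j, -1.0]                  # hypothetical finite sequence b_gamma
z, p = complex(0.3, 0.2), 0.1

a = reexpand(b, z, p, len(coeffs))     # derivatives of order >= 4 vanish
lhs = sum(b[g] * deriv_at(coeffs, g, z) / factorial(g) for g in range(len(b)))
rhs = sum(a[k] * deriv_at(coeffs, k, p) / factorial(k) for k in range(len(a)))
```

For a general holomorphic function the sequence $a_\alpha$ is infinite, which is the point addressed by the limiting argument in the text.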
Although the sequence $\{a(n)_\alpha\}_{\alpha \in \mathbf{N}^{N_n}}$ is infinite, the convergence of the power series for $Q_z(b, b)$ and a limiting argument show that $Q_z(b, b) \ge 0$. Thus the property (G.4) propagates everywhere, i.e. (G.3) holds.

2. As our next step, we prove that (G.4) implies that (G'.1) holds within the same sequence of polydisks $\{P_n\}_{n \in [1, M]}$ just used. We first prove this in the form of condition (G''.1), i.e. in case $g_n$ is a finite linear combination of Dirac measures, i.e.
\[ g_n(z_n) = \sum_{r=1}^{L} c_{n,r}\, \delta(z_n - t_{n,r}), \tag{147} \]
with $t_{n,r} \in P_n$. Indeed
\[ \sum_{n, m = 0}^{M}\ \sum_{r, s = 1}^{L} c_{n,r}\, \bar c_{m,s}\, A_{nm}(t_{n,r}, \bar t_{m,s}) = Q_p(a, a), \tag{148} \]
with
\[ a(n)_\alpha = \sum_{r=1}^{L} c_{n,r}\, (t_{n,r} - p_n)^\alpha. \tag{149} \]
The sequence $\{a(n)_\alpha\}$ is again infinite, but we can still conclude that $Q_p(a, a) \ge 0$, i.e. that (G''.1), and hence (G'.1) hold in the sequence of domains $\{P_n\}_{n \in [1, M]}$.

3. To prove that (G'.1) in the sequence of polydisks $\{P_n\}_{n \in [1, M]}$ implies the property (G.2) within the same sequence, we introduce a Hilbert space $E = E_0 \oplus \ldots \oplus E_M$ as follows: $E_0 = \mathbf{C}$. For each $n \in [1, M]$, $E_n = A(U_n) \cap L^2(P_n, \lambda_n)$. It is well-known (see [2]) that $E_n$ is a closed subspace of $L^2(P_n, \lambda_n)$, and that the convergence of a sequence $\{\psi_\nu \in E_n\}_{\nu \in \mathbf{N}}$ in the sense of $E_n$ implies its uniform convergence on every compact subset of $P_n$. The operator $A_P$ defined on $E$ by
\[ (g, A_P f) = \sum_{m, n = 0}^{M} \int_{P_n \times P_m} \overline{g_n(p_n)}\, A_{nm}(p_n, \bar q_m)\, f_m(q_m)\, d\lambda_n(p_n)\, d\lambda_m(q_m) \tag{150} \]
is Hilbert–Schmidt and positive by virtue of the property (G'.1). The spectral decomposition of this operator therefore yields the existence of a sequence $\{\varphi_\nu = (\varphi_{\nu,0}, \ldots, \varphi_{\nu,M}) : \nu \in \mathbf{N}\}$ of eigenvectors of $A_P$ corresponding to non-negative eigenvalues. Hence there is a sequence $\{f_\nu \in E\}$ such that
\[ A_{nm}(p_n, \bar q_m) = \sum_{\nu \in \mathbf{N}} f_{\nu,n}(p_n)\, \overline{f_{\nu,m}(q_m)} \tag{151} \]
holds for all $n, m \in [0, M]$, uniformly on every compact subset of $P_n \times P_m$.

4. We now show that for each $n$ and $\nu$, $f_{\nu,n}$ extends to a function holomorphic on $\tilde{U}_n$. Let $z_n \in P_n$ and suppose that the closure of a polydisk $P'_n$ with radius $R'$ centered at $z_n$ is contained in $U_n$. The Taylor coefficients of $A_{nn}$ at $z_n$ satisfy
\[ \sum_\nu \Big| \frac{\partial^\alpha f_{\nu,n}(z_n)}{\alpha!} \Big|^2 = \frac{\partial^\alpha \bar\partial^\alpha}{\alpha!^2}\, A_{nn}(z_n, \bar z_n) \le C\, R'^{-2|\alpha|}. \tag{152} \]
Hence the power series for $f_{\nu,n}$ at $(z_n)$ converges in $P'_n$. Moreover the expansion (151) continues to hold in $P'_n \times P'_m$. To see this it suffices, by Schwarz's inequality, to prove that $\sum_\nu |f_{\nu,n}(w_n)|^2$ converges in $P'_n$. We fix $n \ge 1$ and temporarily denote $g_{\nu,\alpha} = \partial^\alpha f_{\nu,n}(z_n)/\alpha!$. For any $\kappa \in (0, 1)$, by Schwarz's inequality,
\[ \Big( \sum_\alpha |g_{\nu,\alpha}|\, (\kappa^2 R')^{|\alpha|} \Big)^2 \le (1 - \kappa^2)^{-N_n} \sum_\alpha |g_{\nu,\alpha}|^2\, (\kappa^2 R'^2)^{|\alpha|}, \tag{153} \]
and by Eq. (152),
\[ \sum_\nu \Big( \sum_\alpha |g_{\nu,\alpha}|\, (\kappa^2 R')^{|\alpha|} \Big)^2 \le C\, (1 - \kappa^2)^{-2N_n}. \tag{154} \]
In particular for any $\varepsilon > 0$, it is possible to choose $S$ such that
\[ \sum_{\nu \ge S} \Big( \sum_\alpha |g_{\nu,\alpha}|\, (\kappa^2 R')^{|\alpha|} \Big)^2 \le \varepsilon \tag{155} \]
so that, for any $\zeta$ with $|\zeta| < \kappa^2 R'$,
\[ \sum_{\nu \ge S} |f_{\nu,n}(z_n + \zeta)|^2 \le \varepsilon, \tag{156} \]
\[ \sum_\nu |f_{\nu,n}(z_n + \zeta)|^2 \le C\, (1 - \kappa^2)^{-2N_n}. \tag{157} \]
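The Schwarz step (153) amounts to splitting $(\kappa^2 R')^{|\alpha|}$ as $(\kappa R')^{|\alpha|} \cdot \kappa^{|\alpha|}$ and summing the geometric series $\sum_\alpha \kappa^{2|\alpha|} = (1 - \kappa^2)^{-N_n}$. A minimal numerical sketch in the one-variable case $N_n = 1$, with made-up coefficients standing in for the $g_{\nu,\alpha}$, could look as follows.

```python
def schwarz_sides(g, kappa, R):
    """Both sides of Eq. (153) for one variable (N_n = 1) and finitely many coefficients."""
    lhs = sum(abs(c) * (kappa ** 2 * R) ** k for k, c in enumerate(g)) ** 2
    rhs = (sum(abs(c) ** 2 * (kappa ** 2 * R ** 2) ** k for k, c in enumerate(g))
           / (1 - kappa ** 2))
    return lhs, rhs

# Hypothetical Taylor coefficients g_alpha, alpha = 0..4.
g = [1.0, -0.5 + 0.2j, 0.25, 2.0j, -0.1]
lhs, rhs = schwarz_sides(g, kappa=0.6, R=1.5)
```

With finitely many coefficients the geometric sum is even smaller than $(1 - \kappa^2)^{-1}$, so the inequality holds with room to spare.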
Therefore there exists a function $\tilde{f}_{\nu,n}$ holomorphic on $\tilde{U}_n$ and a component $\hat{P}_n$ of $\iota_n^{-1}(P_n)$ such that $\tilde{f}_{\nu,n}$ coincides with $f_{\nu,n} \circ \iota_n$ on $\hat{P}_n$. The expansion
\[ \tilde{A}_{nm}(\zeta_n, \bar\zeta_m) \stackrel{\mathrm{def}}{=} A_{nm}(\iota_n(\zeta_n), \overline{\iota_m(\zeta_m)}) = \sum_\nu \tilde{f}_{\nu,n}(\zeta_n)\, \overline{\tilde{f}_{\nu,m}(\zeta_m)} \tag{158} \]
holds uniformly on every compact subset of $\tilde{U}_n \times \tilde{U}_m$. Therefore the sequence $\{\tilde{A}_{nm}\}$ possesses the property (G''.1) in $\{\tilde{U}_n\}$. It follows that the $\{A_{nm}\}$ possess the property (G''.1) in $\{U_n\}$, since the points $t_{n,r}$ can be lifted in an arbitrary way to points in $\tilde{U}_n$.
5. For each $n \in [1, M]$, we now define a measure $\rho_n$ on $U_n$ as follows. Let first $d\mu_n(p_n) = e^{-\varphi_n(p_n)}\, d\lambda_n(p_n)$ where the smooth real function $\varphi_n$ is chosen such that $\int_{U_n} d\mu_n(p_n) = 1$. We then define $d\rho_n(p_n) = (1 + A_{nn}(p_n, \bar p_n))^{-1}\, d\mu_n(p_n)$. Let $F_0 = \mathbf{C}$, and, for each $n \in [1, M]$, let $F_n = A(U_n) \cap L^2(U_n, \rho_n)$. Note, e.g. that for any fixed $q_n \in U_n$, the function $p_n \mapsto A_{nn}(p_n, \bar q_n)$ belongs to $F_n$. Let $F = F_0 \oplus \ldots \oplus F_M$. The operator $A$ defined on $F$ by
\[ (g, A f) = \sum_{m, n = 0}^{M} \int_{U_n \times U_m} \overline{g_n(p_n)}\, A_{nm}(p_n, \bar q_m)\, f_m(q_m)\, d\rho_n(p_n)\, d\rho_m(q_m) \tag{159} \]
is Hilbert-Schmidt and positive, and we again conclude that there exists a sequence $\{h_\nu \in F : \nu \in \mathbf{N}\}$ such that
\[ A_{nm}(p_n, \bar q_m) = \sum_{\nu \in \mathbf{N}} h_{\nu,n}(p_n)\, \overline{h_{\nu,m}(q_m)} \tag{160} \]
holds uniformly on every compact subset of $U_n \times U_m$ as well as in the sense of $F_n \otimes F_m$. This concludes the proof of Glaser's Theorem 1.

Remark 16. The extension to the case when the $U_n$ are complex manifolds which are separable at infinity is straightforward.

Remark 17. The requirement that $U$ be simply connected in Glaser's Theorem 2 is necessary as the following example shows. Let $U = \mathbf{C} \setminus \{0\}$ and $A(p, \bar p) = |p|$. This satisfies the assumptions of Glaser's Theorem 2 since $A(p, \bar p) = \sqrt{p}\, \sqrt{\bar p}$. But there cannot be a sequence $f_\nu$ of functions holomorphic on $U$ such that $A(p, \bar p) = \sum_\nu |f_\nu(p)|^2$, since $|f_\nu(p)| \le \sqrt{|p|}$ implies that $f_\nu$ is analytic at 0, hence entire and necessarily 0.

References
1. Araki, H.: J. Math. Phys. 5, 1 (1964)
2. Bergman, S.: The kernel function and conformal mapping. Mathematical Surveys, No. 5. Providence, Rhode Island: American Mathematical Society, 1980
3. Birrell, N.D., Davies, P.C.W.: Quantum fields in curved space. Cambridge: Cambridge University Press, 1982
4. Bisognano, J.J., Wichmann, E.H.: On the duality condition for a Hermitian scalar field. J. Math. Phys. 16, 985 (1975)
5. Bochner, S., Martin, W.T.: Several complex variables. (Princeton Mathematical Series. 10) Princeton, NJ: Princeton University Press, 1948
6. Borchers, H.J.: On the structure of the algebra of field observables. Nuovo Cimento 24, 214 (1962)
7. Bros, J.: Complexified de Sitter space: analytic causal kernels and Källén-Lehmann-type representation. Nucl. Phys. (Proc. Suppl.) 18 B, 22 (1990)
8. Bros, J., Epstein, H., Glaser, V.: Connection between analyticity and covariance of Wightman functions. Commun. Math. Phys. 6, 77 (1967)
9. Bros, J., Gazeau, J.-P., Moschella, U.: Phys. Rev. Lett. 73, 1746 (1994)
10. Bros, J., Moschella, U.: Two-point functions and de Sitter quantum fields. Rev. Math. Phys. 8, 324 (1996). Moschella, U.: New results on de Sitter Quantum Field Theory. Ann. Inst. Henri Poincaré 63, 411–426 (1995)
11. Bros, J., Viano, G.A.: Connection between the harmonic analysis on the sphere and the harmonic analysis on the one-sheeted hyperboloid: an analytic continuation viewpoint. Forum Mathematicum 8, 621–658 and 659–722 (1996) and 9, 165–191 (1997)
12. Bunch, T.S., Davies, P.C.W.: Quantum field theory in de Sitter space: Renormalization by point splitting. Proc. R. Soc. Lond. A 360, 117 (1978)
13. Brunetti, R., Fredenhagen, K., Kohler, M.: The microlocal spectrum condition and Wick polynomials of quantum fields on curved space-times. Commun. Math. Phys. 180, 633 (1996)
14. Dyson, F.J.: Connection between local commutativity and regularity of Wightman functions. Phys. Rev. 110, 579 (1958)
15. Epstein, H.: CTP invariance of the S-matrix in a theory of local observables. J. Math. Phys. 8, 750 (1967)
16. Epstein, H.: Some analytic properties of scattering amplitudes in quantum field theory. In: Axiomatic Field Theory, M. Chretien & S. Deser eds., New York: Gordon & Breach, 1966
17. Figari, R., Hoegh-Krohn, R., Nappi, C.R.: Interacting relativistic boson fields in the de Sitter universe with two space-time dimensions.
18. Floreanini, R., Hill, C.T., Jackiw, R.: Functional representation for the isometries of de Sitter space. Ann. Phys. 175, 345 (1987)
19. Gibbons, G.W., Hawking, S.W.: Cosmological event horizons, thermodynamics and particle creation. Phys. Rev. D 10, 2378 (1977)
20. Glaser, V.: The positivity condition in momentum space. In: Problems in Theoretical Physics. Essays dedicated to N. N. Bogoliubov. D. I. Blokhintsev et al. eds. Moscow: Nauka 1969
21. Glaser, V.: On the equivalence of the Euclidean and Wightman formulations of field theory. Commun. Math. Phys. 37, 257–272 (1974)
22. Haag, R.: Local Quantum Physics, Fields, Particles, Algebras. Berlin: Springer, 1992
23. Haag, R., Hugenholtz, N.M., Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215 (1967)
24. Hawking, S.W.: Black Hole Explosions? Nature 248, 30 (1974)
25. Hawking, S.W.: Particle creation by Black Holes. Commun. Math. Phys. 43, 199 (1975)
26. Hall, D., Wightman, A.S.: A theorem on invariant analytic functions with applications to relativistic quantum field theory. Mat. Fys. Medd. Dan. Vid. Selsk. 31, 3 (1957)
27. Jost, R.: Die Normalform einer komplexen Lorentztransformation. Helv. Phys. Acta 33, 773–782 (1960)
28. Jost, R.: The general theory of quantized fields. Providence, RI: A.M.S., 1965
29. Kay, B.S., Wald, R.: Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon. Phys. Rep. 207, 49 (1991)
30. Linde, A.: Particle Physics and Inflationary Cosmology. Chur: Harwood Academic Publishers, 1990
31. Moschella, U.: Quantization curvature and temperature: The de Sitter space-time. In: Quantization and Infinite-Dimensional Systems, J-P. Antoine, S. Twareque Ali, W. Lisiecki, I.M. Mladenov, A. Odzijewicz eds., New York: Plenum, 1994, p. 183
32. Radzikowski, M.-J.: The Hadamard condition and Kay's conjecture in (axiomatic) quantum field theory on curved space-time, Ph.D. thesis, Princeton University, October 1992
33. Sewell, G.L.: Quantum fields on manifolds: PCT and gravitationally induced thermal states. Ann. Phys. 141, 201 (1982)
34. Streater, R.F.: Analytic properties of products of field operators. J. Math. Phys. 3, 256 (1962)
35. Streater, R.F., Wightman, A.S.: PCT, Spin and Statistics, and all that. New York: W.A. Benjamin, 1964
36. Unruh, W.G.: Notes on black-hole evaporation. Phys. Rev. D 14, 870 (1976)
37. Wightman, A.S.: Quantum field theory in terms of vacuum expectation values. Phys. Rev. 101, 860 (1956)
38. Wightman, A.S.: Analytic functions of several complex variables. In: Relations de dispersion et particules élémentaires, C. De Witt and R. Omnes eds. Paris: Hermann, 1960

Communicated by A. Jaffe
Commun. Math. Phys. 196, 571 – 590 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Convergence of Anti-Self-Dual Connections on SU(n)-Bundles Over Product of Two Riemann Surfaces

Jingyi Chen*

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. E-mail: [email protected]

Received: 17 September 1997 / Accepted: 19 January 1998

* The author is supported partially by an NSF grant

Abstract: We obtain a convergence result on anti-self-dual connections over a product of two Riemann surfaces in the adiabatic limit.

1. Introduction

In this paper, we study the asymptotic behavior of the anti-self-dual Yang-Mills equations on an SU(n)-bundle over a product of two compact Riemann surfaces. The blowing-up process is taken along one of the two Riemann surfaces by multiplying the Riemannian metric on this factor by increasing constants. We consider the "adiabatic limit" of the instanton equations.

When the base manifold is a homological 3-sphere represented by a Heegaard splitting, Atiyah in [A] introduced an idea of stretching the neck for the splitting to study the Casson invariants and Floer homology. This idea was successfully carried further by Dostoglou and Salamon in their proof in [DS] of a version of the Atiyah-Floer conjecture. In particular, if $P$ is a nontrivial SO(3)-bundle over a compact Riemann surface, they showed that the self-dual instantons on the product of the mapping cylinder of $P$ and $\mathbf{R}$ became holomorphic curves in the moduli space of flat connections on $P$. When the base manifold is a product of two compact Riemann surfaces, physicists Bershadsky, Johansen, Sadov and Vafa in [BJSV] investigated the Yang-Mills instantons in the limiting process and observed, among other things, a topological reduction of 4-dimensional supersymmetric Yang-Mills theory to 2-dimensional $\sigma$-models. We provide a mathematical proof of the convergence process.

Our main result is as follows (Theorem 4.10). Let $E$ be an SU(n)-bundle over $\Sigma_1 \times \Sigma_2$ which is given a Riemannian metric $g_1 \oplus g_2$, where $\Sigma_1$, $\Sigma_2$ are two compact Riemann surfaces. Let $\mathcal{M}(E, \Sigma_2)$ be the moduli space of flat connections over $\Sigma_2$. If $A_\lambda$ is a sequence of anti-self-dual connections on $E$, which corresponds to a family of metrics $\lambda^2 g_1 \oplus g_2$, where $\lambda \to \infty$, then a subsequence of the $\Sigma_2$-components of $A_\lambda$ converges to
a holomorphic map from the union of $\Sigma_1$ and finitely many rational curves to $\mathcal{M}(E, \Sigma_2)$, in Hausdorff topology after fixing a sequence of gauges.

In Sect. 2, we discuss the behavior of the Yang-Mills functional in different metrics $\lambda^2 g_1 \oplus g_2$. Also, when a connection over $\Sigma_1 \times \Sigma_2$ is decomposed into 1-forms over $\Sigma_1$ and $\Sigma_2$ accordingly, the component over $\Sigma_2$ is parameterized by $\Sigma_1$. We then derive a Cauchy-Riemann type equation for this component from the anti-self-dual equation. In Sect. 3, we recall some facts about the moduli space $\mathcal{M}$ of flat connections over a Riemann surface and introduce a Cauchy-Riemann operator from $\Sigma_2$ into the moduli space. Section 4 is the main theme of this paper. We first construct holomorphic maps from discs in $\Sigma_1$ into $\mathcal{M}$ by using a well-known result of Nijenhuis and Woolf ([NW]). Then we show the holomorphic maps can be lifted to a slice in the space of flat connections over $\Sigma_2$. An $L^2$-version of the classical three circles theorem and elliptic theory allows us to prove that the $\Sigma_2$-component of a subsequence of the original ASD connections converges to the lifted holomorphic maps in $C^0$-topology. The summation of the Dirichlet energy of the holomorphic maps over the discs, which form a countable cover of $\Sigma_1$ with possibly finitely many points deleted where curvature concentrates, is bounded by a multiple of the second Chern number of the SU(n)-bundle with some constant. Then a diagonal process implies the existence of a holomorphic map from $\Sigma_1$ together with finitely many rational curves (bubbles) into $\mathcal{M}$, by Gromov's compactness theorem and Sacks-Uhlenbeck's removable singularity theorem. The main result is Theorem 4.10.

The phenomenon for the Yang-Mills instantons we are encountering here bears strong analogy with asymptotic behavior of harmonic mappings from Riemann surfaces with uniformly bounded total energy. In fact, the product of two Riemann surfaces with enlarging metrics on one factor plays the role of the "necks", which connects the regular part (no energy density concentration) with the bubbles, in the analysis for 2-dimensional harmonic maps (cf. [ChT, DT, P, QT, S], etc.). Also, we would like to mention that a higher dimensional holomorphic gauge theory has been introduced recently by S.K. Donaldson and R.P. Thomas in [DTh]. We note that $C^0$-convergence is important to study the topology of the moduli spaces. Moreover, it is known that the anti-self-dual connections over Kähler surfaces can be identified with the stable holomorphic bundles (cf. [DK] and the references therein). As for non-minimizing critical points of the Yang-Mills functional over a product of two compact Riemann surfaces in the same setting, we believe that our method should imply convergence to a harmonic map instead of a holomorphic one.

The author would like to thank Professor Gang Tian for suggesting the problem and for many stimulating conversations and valuable comments.
2. Instantons Over Product of Two Riemann Surfaces

Consider a compact Riemannian 4-manifold $M$ which is topologically a product of two compact Riemann surfaces $\Sigma_1$ and $\Sigma_2$. We shall use $(x, y)$ and $(s, t)$ to denote local coordinates on $\Sigma_1$ and $\Sigma_2$ respectively. Let $g_1$ and $g_2$ be fixed Riemannian metrics on $\Sigma_1$ and $\Sigma_2$ respectively. Take the Riemannian metric on $M$ to be $g = g_1 \oplus g_2$. With respect to $g$, the space of 2-forms on $M$ is decomposed into orthogonal subspaces
\[ \Lambda^2(T^*M) = \Lambda^2(T^*\Sigma_1) \oplus \Lambda^2(T^*\Sigma_2) \oplus (T^*\Sigma_1 \wedge T^*\Sigma_2). \]
Let $E$ be a vector bundle over $M$ and $\{\sigma_\alpha\}$ be orthonormal frames on $E$ with respect to a fixed metric on $E$. Let $\Gamma(E)$ be the set of smooth sections of $E$. Take a connection
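$A \in C^\infty(\Gamma(E), \Gamma(T^*M) \otimes E)$ on $E$. If $U$ is a local coordinate chart on $M$, $A$ can be written as
\[ A_\alpha = A\sigma_\alpha = (\Gamma^\beta_{\alpha x}\, dx + \Gamma^\beta_{\alpha y}\, dy + \Gamma^\beta_{\alpha s}\, ds + \Gamma^\beta_{\alpha t}\, dt) \otimes \sigma_\beta, \]
where the $\Gamma$'s are smooth functions on $U$ and hence depend on $(x, y, s, t)$, where we use $x, y$ and $s, t$ to denote coordinates on $\Sigma_1$ and $\Sigma_2$ respectively. So $A$ can be decomposed as differential forms into two pieces
\[ A = A_1 + A_2, \]
where $A_i$ is the component of $A$ on $\Sigma_i$ which can be viewed as a connection on $E$ over $\Sigma_i$ for $i = 1, 2$. This induces a decomposition of the curvature 2-form $F_A$
\[ F_A = F_{A_1} + F_{A_2} + d_{A_1} A_2 + d_{A_2} A_1, \]
where $d_{A_i}$ is the covariant differentiation with respect to $A_i$. Consider a family of smooth metrics on $M$ defined by
\[ g_\lambda = \lambda^2 g_1 \oplus g_2, \]
where $\lambda \in [1, \infty)$. The Yang-Mills functional with respect to the metric $g_\lambda$ on the $E$-valued connection 1-form $A$ is given by
\[ \mathrm{YM}(A, g_\lambda) = \int_M \Big( \lambda^{-2} |F_{A_1}|^2_{g_1} + \lambda^2 |F_{A_2}|^2_{g_2} + |d_{A_1} A_2 + d_{A_2} A_1|^2_g \Big)\, d\mu_g, \tag{2.1} \]
where $d\mu_g$ is the volume element on $M$ with respect to the metric $g$ and the trace in the bundle $E$ is taken in the square norm of curvature.

For each scaled metric $g_\lambda$, let $A_\lambda$ be an anti-self-dual (ASD) connection on $E$ over $M$. The ASD equation
\[ (1 + *_\lambda)\, d_{A_\lambda} A_\lambda = 0 \tag{2.2} \]
combined with the gauge fixing condition
\[ d^{*_\lambda}_{A_\lambda} A_\lambda = 0 \tag{2.3} \]
yields an elliptic system, where $*_\lambda$ is the Hodge star operator and $d^{*_\lambda}_{A_\lambda}$ the adjoint operator of $d_{A_\lambda}$, both with respect to the metric $g_\lambda$. Since $A_\lambda$ can be written as $A_{\lambda 1} + A_{\lambda 2}$ and the coefficients $\Gamma$ depend on $(x, y, s, t)$, for each fixed pair $(x, y) \in \Sigma_1$, $A_{\lambda 2}$ is a connection over $\Sigma_2$. Therefore we obtain a map
\[ X_\lambda : \Sigma_1 \to \mathcal{A}(E, \Sigma_2), \]
where $\mathcal{A}(E, \Sigma_2)$ is the set of connections on $E$ over $\Sigma_2$. In particular, we write
\[ A_{\lambda 2} = X_\lambda = X^s_\lambda\, ds + X^t_\lambda\, dt. \]
Write maps $X^s_\lambda, X^t_\lambda : \Gamma(E) \to \Gamma(E)$ as
\[ X^s_\lambda(\sigma_\alpha) = \sum_{\beta=1}^{\dim E} X^s_{\lambda\alpha\beta}\, \sigma_\beta, \qquad X^t_\lambda(\sigma_\alpha) = \sum_{\beta=1}^{\dim E} X^t_{\lambda\alpha\beta}\, \sigma_\beta, \]
where $X^s_{\lambda\alpha\beta}$, $X^t_{\lambda\alpha\beta}$, $\alpha, \beta = 1, \ldots, \dim E$, are smooth functions defined locally by
\[ \sum_{\beta=1}^{\dim E} \big( X^s_{\lambda\alpha\beta}\, ds \otimes \sigma_\beta + X^t_{\lambda\alpha\beta}\, dt \otimes \sigma_\beta \big) = \sum_{\beta=1}^{\dim E} \big( \Gamma^\beta_{\alpha s}\, ds \otimes \sigma_\beta + \Gamma^\beta_{\alpha t}\, dt \otimes \sigma_\beta \big). \]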
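Here the functions $\Gamma$ depend on $\lambda$. As for $A_{\lambda 1}$, for convenience in writing differential equations later, we set $A_{\lambda 1} = d_1 + a^x_\lambda\,dx + a^y_\lambda\,dy$, where $d_1$ is the exterior differential operator on $\Sigma_1$. Write the maps $a^x_\lambda, a^y_\lambda : \Gamma(E) \to \Gamma(E)$ as
\[ a^x_\lambda(\sigma_\alpha) = \sum_{\beta=1}^{\dim E} a^x_{\lambda\alpha\beta}\,\sigma_\beta, \qquad a^y_\lambda(\sigma_\alpha) = \sum_{\beta=1}^{\dim E} a^y_{\lambda\alpha\beta}\,\sigma_\beta, \]
where $a^x_{\lambda\alpha\beta}, a^y_{\lambda\alpha\beta}$, $\alpha, \beta = 1, ..., \dim E$, are smooth functions defined locally by
\[ a^x_\lambda(\sigma_\alpha) \otimes dx + a^y_\lambda(\sigma_\alpha) \otimes dy = \sum_{\beta=1}^{\dim E} \left( \Gamma^\beta_{\alpha x}\,dx \otimes \sigma_\beta + \Gamma^\beta_{\alpha y}\,dy \otimes \sigma_\beta \right) - dx \otimes \sigma_\alpha - dy \otimes \sigma_\alpha. \]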
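Also, for the sake of simplicity, we shall use $X^s_\lambda, X^t_\lambda, a^x_\lambda, a^y_\lambda$ to denote the entries in the matrices $(X^s_{\lambda\alpha\beta}), (X^t_{\lambda\alpha\beta}), (a^x_{\lambda\alpha\beta}), (a^y_{\lambda\alpha\beta})$. Therefore, they are regarded as smooth functions defined locally. It then follows from (2.2) that
\begin{align*}
(1 + *_\lambda)\, d_{A_{\lambda 1}} X_\lambda ={}& \left( \frac{\partial X^s_\lambda}{\partial x} - \frac{\partial X^t_\lambda}{\partial y} - a^x_\lambda X^s_\lambda + a^y_\lambda X^t_\lambda \right) dx \wedge ds \\
&+ \left( \frac{\partial X^s_\lambda}{\partial y} + \frac{\partial X^t_\lambda}{\partial x} - a^y_\lambda X^s_\lambda - a^x_\lambda X^t_\lambda \right) dy \wedge ds \\
&+ \left( \frac{\partial X^t_\lambda}{\partial x} + \frac{\partial X^s_\lambda}{\partial y} - a^y_\lambda X^s_\lambda - a^x_\lambda X^t_\lambda \right) dx \wedge dt \\
&+ \left( \frac{\partial X^t_\lambda}{\partial y} - \frac{\partial X^s_\lambda}{\partial x} + a^x_\lambda X^s_\lambda - a^y_\lambda X^t_\lambda \right) dy \wedge dt. \tag{2.4}
\end{align*}
Note that $X^s_\lambda, X^t_\lambda, a^x_\lambda, a^y_\lambda$ are all regarded as matrix-valued functions. Therefore, we have
\[ \frac{\partial X^s_\lambda}{\partial x} - \frac{\partial X^t_\lambda}{\partial y} - a^x_\lambda X^s_\lambda + a^y_\lambda X^t_\lambda = f^1_\lambda, \tag{2.5} \]
\[ \frac{\partial X^t_\lambda}{\partial x} + \frac{\partial X^s_\lambda}{\partial y} - a^y_\lambda X^s_\lambda - a^x_\lambda X^t_\lambda = f^2_\lambda, \tag{2.6} \]
where $f^1_\lambda$ and $f^2_\lambda$ are the corresponding components of $(1 + *_\lambda)\, d_{X_\lambda} A_{\lambda 1}$.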
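The sign pattern of the four components in (2.4), and the factors $\lambda^{-2}$ and $\lambda^{2}$ in (2.1), both come from how the Hodge star of $g_\lambda$ acts on 2-forms. The following is a minimal sketch, not taken from the paper: it assumes local coordinates in which $g_\lambda = \mathrm{diag}(\lambda^2, \lambda^2, 1, 1)$ and the orientation $dx \wedge dy \wedge ds \wedge dt$, and the function names are hypothetical.

```python
def perm_sign(p):
    """Sign of a permutation of (0, 1, ..., n-1), computed by sorting with swaps."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]  # puts the value j into slot j
            sign = -sign
    return sign


COORDS = ("x", "y", "s", "t")


def hodge_star_2form(i, j, lam):
    """Hodge star of d(COORDS[i]) ^ d(COORDS[j]) for the assumed metric
    diag(lam^2, lam^2, 1, 1) and orientation dx^dy^ds^dt.

    Returns (k, l, c) with k < l, meaning *(d_i ^ d_j) = c * d_k ^ d_l.
    """
    scale = [lam, lam, 1.0, 1.0]  # e^a = scale[a] * d(COORDS[a]) is orthonormal
    k, l = [a for a in range(4) if a not in (i, j)]
    c = perm_sign((i, j, k, l)) * scale[k] * scale[l] / (scale[i] * scale[j])
    return k, l, c


if __name__ == "__main__":
    for i, j in [(0, 2), (0, 3), (1, 2), (1, 3), (0, 1), (2, 3)]:
        k, l, c = hodge_star_2form(i, j, lam=3.0)
        print(f"*(d{COORDS[i]}^d{COORDS[j]}) = {c:+g} d{COORDS[k]}^d{COORDS[l]}")
```

Under these assumptions the star on the mixed forms ($dx \wedge ds$, $dx \wedge dt$, $dy \wedge ds$, $dy \wedge dt$) does not depend on $\lambda$, while $dx \wedge dy$ and $ds \wedge dt$ pick up the factors $\lambda^{-2}$ and $\lambda^{2}$.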
Convergence of Anti-Self-Dual Connections Over Riemann Surfaces
3. Flat Connections Over Riemann Surfaces and Cauchy-Riemann Operators

Denote the flat connections of $E$ over $\Sigma_2$ by $\mathcal{A}_{\mathrm{flat}}(E, \Sigma_2)$ and the moduli space of these flat connections by $\mathcal{M}(E, \Sigma_2) = \mathcal{A}_{\mathrm{flat}}(E, \Sigma_2)/\mathcal{G}_0$, where $\mathcal{G}_0$ is the component of the group of gauge transformations which contains the identity. Its tangent space $T_{[A_2]}\mathcal{M}(E, \Sigma_2)$ at $[A_2]$ consists of 1-forms in $\ker d_{A_2} \cap \ker d^{*_{g_2}}_{A_2}$, where $d^{*_{g_2}}_{A_2} = -*_{g_2} d_{A_2} *_{g_2}$ is the $L^2$ adjoint of $d_{A_2}$ and $*_{g_2}$ is the Hodge star operator on $\Sigma_2$ in the metric $g_2$. Therefore, the tangent space at $[A_2]$ can be identified with the de Rham cohomology $H^1_{A_2}(\Sigma_2)$. The space $\mathcal{M}(E, \Sigma_2)$ is a compact manifold of dimension $6l - 6$, where $l$ is the genus of $\Sigma_2$. There is a natural symplectic structure on $\mathcal{M}(E, \Sigma_2)$ defined by
\[ \omega(\alpha, \beta) = \int_{\Sigma_2} \alpha \wedge \beta \]
and a Weil-Petersson metric
\[ G(\alpha, \beta) = \int_{\Sigma_2} \alpha \wedge *\beta \]
for $\alpha, \beta \in H^1_{A_2}(\Sigma_2)$. Since $\dim_{\mathbb{R}} \Sigma_2 = 2$, $*_{g_2}$ maps 1-forms to 1-forms and $*^2_{g_2} = -1$. Every conformal structure on $\Sigma_2$ determines a Hodge $*_{g_2}$-operator, hence in turn an almost complex structure on $\mathcal{M}(E, \Sigma_2)$. Further, the almost complex structure is compatible with $\omega$ (cf. [AB, DS]). For each connection $A_1$ on $E$ over $\Sigma_1$, we define a Cauchy-Riemann operator for maps from $(\Sigma_1, J_1)$ to $(\mathcal{M}(E, \Sigma_2), *_{g_2})$ by
\[ \bar\partial_{A_1} = d_{A_1} \circ J_1 - *_{g_2} d_{A_1}, \tag{3.1} \]
where $J_1$ is a complex structure on $\Sigma_1$. In particular, if $A_1 + A_2$ is a connection over $\Sigma_1 \times \Sigma_2$ and $A_2$ is flat over $\Sigma_2$, then in local coordinates, $\bar\partial_{A_1} A_2$ is precisely the right hand side of (2.4). So we have
\[ \bar\partial_{A_1} A_2 = (1 + *_g)\, d_{A_1} A_2. \tag{3.2} \]
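The two facts used above about the Hodge star on a surface, namely that it squares to $-1$ on 1-forms and that $\alpha \wedge *\beta$ is a positive definite pairing, can be checked pointwise. The sketch below is only an illustration under simplifying assumptions: it works with scalar-valued 1-forms $a\,ds + b\,dt$ at a single point in an oriented orthonormal coframe (so $*ds = dt$, $*dt = -ds$), ignores the bundle-valued trace, and uses hypothetical function names.

```python
def star(alpha):
    """Hodge star of the 1-form a*ds + b*dt, stored as the pair (a, b)."""
    a, b = alpha
    return (-b, a)


def wedge(alpha, beta):
    """Coefficient of ds ^ dt in alpha ^ beta."""
    return alpha[0] * beta[1] - alpha[1] * beta[0]


def dot(alpha, beta):
    """Pointwise inner product of two 1-forms in the orthonormal coframe."""
    return alpha[0] * beta[0] + alpha[1] * beta[1]


if __name__ == "__main__":
    alpha, beta = (3, 5), (2, -1)
    print(star(star(alpha)))                          # the star applied twice
    print(wedge(alpha, star(beta)), dot(alpha, beta)) # integrand of G versus inner product
```

In this pointwise model the integrand of $G(\alpha, \beta)$ equals the inner product of $\alpha$ and $\beta$, which is why $G$ is a metric and why $*$ acts as an almost complex structure compatible with $\omega$.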
If we use $z = x + \sqrt{-1}\,y$ to denote the complex coordinate on $\Sigma_1$, and $w = s + \sqrt{-1}\,t$ the complex coordinate on $\Sigma_2$, then (3.2) can be written as $\bar\partial_{A_{\lambda 1}} X^w_\lambda = \partial_z X^w_\lambda + a^z X^w$. Since the $\Lambda^1(T\Sigma_2)$-component of $d_{A_2} A_1 \in \Lambda^1(T\Sigma_1) \wedge \Lambda^1(T\Sigma_2)$ is exact, i.e., of the form $d_{A_2} f$ for some matrix-valued function $f$, it is equal to 0 in $H^1_{A_2}(\Sigma_2)$. Therefore, a holomorphic map $Y$ (with respect to $\bar\partial_{A_1}$) from $\Sigma_1$ into $\mathcal{M}(E, \Sigma_2)$ satisfies
\[ (1 + *_g)\, d_{A_1} Y + (1 + *_g)\, d_Y A_1 = 0. \]
J. Chen
4. Limiting Instantons and Convergence

For the $SU(n)$-bundle $E$, it is well known that for any anti-self-dual connection $A_\lambda$ in the metric $g_\lambda$,
\[ \int_M |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda} = 8\pi^2 c_2(E), \qquad c_2(E) \in \mathbb{Z}, \]
where $c_2$ is the second Chern class of $E$, which depends only on the topology of $E$ and $M$ (cf. [DK]). Suppose there is no curvature concentration in the $L^2$-norm, namely, for any given $\epsilon > 0$ there exist some $r, \lambda_0 > 0$ such that for any $\lambda \ge \lambda_0$ and any $x \in M$,
\[ \|F_{A_\lambda}\|_{L^2(B_x(r, g_\lambda))} \le \epsilon. \tag{4.1} \]
According to Uhlenbeck's regularity theorem ([U1, U2]) for anti-self-dual connections, there exists a gauge transformation $\sigma_\lambda$ in $L^2_2(B_x(2, g_\lambda))$ such that $A_\lambda$ is gauge equivalent to another connection $A^{\sigma_\lambda}_\lambda = d + a_\lambda(\sigma_\lambda)$ and
\[ \|a^{\sigma_\lambda}_\lambda\|_{C^k(B_x(\frac{1}{2}, g_\lambda))} \le C(k)\, \|F_{A^{\sigma_\lambda}_\lambda}\|_{L^2(B_x(1, g_\lambda))} \tag{4.2} \]
for any $k \ge 0$, where $d$ is the exterior differentiation on $\Sigma_1 \times \Sigma_2$. In particular, $\epsilon$ in (4.1) can be taken approaching 0 as $\lambda$ goes to infinity. Moreover, we can patch the connections $A^{\sigma_\lambda}_\lambda$, which are obtained on small balls, together over $M$ as in Uhlenbeck's proof of the removable singularity theorem. In particular, after fixing this gauge, the $L^2_2$ norm of the connection $A^{\sigma_\lambda}_\lambda$ is small for sufficiently large $\lambda$. We shall still use $A_\lambda$ instead of $A^{\sigma_\lambda}_\lambda$ for simplicity. In particular, the following elliptic estimate holds:
\[ \sup_{B(r, \lambda^2 g_1)} |\nabla X_\lambda|^2_{\lambda^2 g_1} \le \frac{C}{V(B(2r, \lambda^2 g_1) \times \Sigma_2)} \int_{B(2r, \lambda^2 g_1) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda}. \tag{4.3} \]
This implies
\[ |\nabla X_\lambda|^2_{\lambda^2 g_1}(0) \le \frac{C}{r^2}. \tag{4.4} \]
Note again that these elliptic regularity estimates hold on regions where curvature does not concentrate as $\lambda \to \infty$. There is no loss of generality in assuming that the injectivity radius of $(M, g)$ is bigger than 2. Fix a point $(p, q) \in \Sigma_1 \times \Sigma_2$. Recall that a well-known result of Nijenhuis and Woolf ([NW]) asserts the following local existence of holomorphic discs with prescribed initial data at the center of the discs. Note that the size of the discs and the upper bounds on the first order derivatives of the holomorphic maps depend on the initial conditions. The detailed dependence is discussed in 5.2a of [NW], and also in Proposition 3.2.1 in [Mc].

Lemma 4.1. Let $M$ be a compact Kähler manifold. For any given point $x \in M$ and $v \in T_x M \otimes \mathbb{C}$, there exists a holomorphic map $f$ from $D(R)$ into $M$ such that $f(0) = x$, $\partial f(0) = v$, where $D(R)$ is the disc centered at 0 in $\mathbb{C}$ of radius $R$ and $R|v| < C$ for some uniform constant $C$. Furthermore, $|df|(z) \le 2|v|$ for all $z \in D(R)$.
In the scaled metric, $X_\lambda$ can be regarded as maps from $B_p(\lambda, \lambda g_1)$ by setting $X'_\lambda(z) = \lambda^{-1} X_\lambda(\lambda^{-1} z)$. Since the $L^2$-norm of the curvature of $A_\lambda$ in the metric $g_\lambda$ is a fixed topological quantity over $\Sigma_1 \times \Sigma_2$, from (4.3) we also have
\[ |\nabla X'_\lambda|^2(0) \le \frac{C}{r^2 \lambda^2}, \]
where the norm is in the (almost) Euclidean metric on $B_p(\lambda) \subset \mathbb{R}^2$. Recall that the elliptic estimates for ASD connections satisfying the Coulomb gauge condition imply, via Sobolev embedding and the Ascoli–Arzela theorem, that any sequence of ASD connections with small curvature in $L^2$ over a small geodesic ball contains a subsequence which is gauge equivalent to a converging sequence of connections in $C^\infty$ (cf. [DK]). In particular, $X'_\lambda$, passing to a subsequence if necessary, converges in the $C^\infty$-topology on the unit disc measured in the (almost) Euclidean metric for each $\lambda$ and centered at $q$. The limit connection $X'_\infty$ over $\Sigma_2$ is flat. $X'_\infty$ is parameterized by the Euclidean unit disc centered at $q$. Note that in the unscaled metric $g_1$ the radius of the unit disc is $\lambda^{-1}$, hence the discs are shrinking to the center $q$ as $\lambda \to \infty$. Let $[X'_\infty(z)]$ be the equivalence class of $X'_\infty(z)$ modulo $\mathcal{G}_0$, hence a point in $\mathcal{M}$. Let $v_{\lambda,q}$ be a complex tangent vector of $\mathcal{M}$ at $[X'_\lambda(q)]$ such that $|v_{\lambda,q}| \le |\partial_z X'_\lambda(q)| \le C\lambda^{-1}$. By Lemma 4.1, there exists a holomorphic map $Z_\lambda$ from $B_p(\lambda)$ into $\mathcal{M}$ with the initial condition
\[ Z_\lambda(q) = [X'_{\infty,q}], \qquad \partial_z Z_\lambda(q) = v_{\lambda,q}. \]
Since $\lambda |dZ_\lambda|$ is uniformly bounded by a fixed constant $C$, there exists a constant $C'$ independent of $\lambda$ such that the image set $Z_\lambda(D(C'\lambda))$ is contained in a fixed open set $U$ of $\mathcal{M}$ and $U$ can be lifted to a slice in $\mathcal{A}_{\mathrm{flat}}(E, \Sigma_2)$ which is determined by
\[ d^*_{Z_\lambda(0)} B = 0. \tag{4.5} \]
This equation also determines a slice in $\mathcal{A}(E, \Sigma_2)$, if $B$ is taken to be in the larger space $\mathcal{A}(E, \Sigma_2)$ rather than $\mathcal{A}_{\mathrm{flat}}(E, \Sigma_2)$. By composing with a gauge transformation if necessary, we may assume that the elliptic estimates for ASD connections hold on this slice. Since we shall deal with inhomogeneous Cauchy-Riemann type equations on the chosen slice, to get the same scaling factor as $X_\lambda$, we set $Y_\lambda(z) = Z_\lambda(\lambda z)$ for any $z \in B_p(C')$.

Proposition 4.2. The Cauchy-Riemann equation $\bar\partial_z Y_\lambda = 0$ can be lifted to
\[ (1 + *)\left( d_{d_1} \tilde{Y}_\lambda + d_{\tilde{Y}_\lambda} d_1 \right) = 0 \]
on the slice in $\mathcal{A}_{\mathrm{flat}}(E, \Sigma_2)$ determined by (4.5).
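Proof. Note that curvature transforms as a tensor under bundle automorphisms (the gauge group): $F_{u(A)} = u F_A u^{-1}$ for $u \in \mathcal{G}$ and any connection $A$ over $\Sigma_1 \times \Sigma_2$. Note that $[\tilde{Y}_\lambda] = Y_\lambda$. Observe that the 2-form
\[ d_{d_1} \tilde{Y}_\lambda + d_{\tilde{Y}_\lambda} d_1 = F(d_1 + \tilde{Y}_\lambda) - F(d_1) - F(\tilde{Y}_\lambda). \]
Therefore $d_{d_1} \tilde{Y}_\lambda + d_{\tilde{Y}_\lambda} d_1$ transforms as a tensor since the curvature does. So its anti-self-dual part is preserved by gauge transformations. Namely, if $u \in \mathcal{G}_0$,
\begin{align*}
(1 + *)\left( d_{u(d_1)} u(\tilde{Y}_\lambda) + d_{u(\tilde{Y}_\lambda)} u(d_1) \right) &= (1 + *)\left( F(u(d_1 + \tilde{Y}_\lambda)) - F(u(d_1)) - F(u(\tilde{Y}_\lambda)) \right) \\
&= (1 + *)\left( u\,(F(d_1 + \tilde{Y}_\lambda) - F(d_1) - F(\tilde{Y}_\lambda))\,u^{-1} \right) \\
&= (1 + *)\left( u\,(d_{d_1} \tilde{Y}_\lambda + d_{\tilde{Y}_\lambda} d_1)\,u^{-1} \right) \\
&= u\,(1 + *)\left( d_{d_1} \tilde{Y}_\lambda + d_{\tilde{Y}_\lambda} d_1 \right) u^{-1}.
\end{align*}
Recall that $d_{Y_\lambda}$-exact forms are equal to 0 in $H^1_{Y_\lambda}(\Sigma_2)$. The Cauchy-Riemann equation reads $(1 + *)(d_{d_1} Y_\lambda + d_{Y_\lambda} d_1) = 0$ for maps $Y_\lambda$ from $D(C')$ into $\mathcal{M}$. This equation is preserved by gauge transformations, hence can be lifted to the chosen slice.

The lifting of the metric from $U$ in $\mathcal{M}$ to the slice in $\mathcal{A}(E, \Sigma_2)$ in general is not isometric. However, since $\mathcal{M}$ is compact, we may cover it by finitely many open neighborhoods $U$ such that each one of them can be lifted as in Proposition 4.2. Therefore the lifted metric is quasi-isometric to the one on $\mathcal{M}$. This is sufficient for us to estimate various norms. Similar to the rescaling of $X_\lambda$, we set $Y'_\lambda(z) = \tilde{Y}_\lambda(\lambda^{-1} z)$ for any $z \in B_p(C'\lambda)$. The equation for $Y'_\lambda$ is
\[ \bar\partial_z Y'_\lambda + \lambda^{-1} (1 + *)\, d_{Y'_\lambda} d_1 = 0. \]
The equation for $X'_\lambda$ from $B_p(C'\lambda)$ into the slice is
\[ \bar\partial_z X'_\lambda + \lambda^{-1} a'_\lambda X'_\lambda + \lambda^{-1} (1 + *)\, d_{X'_\lambda}(d_1 + a'_\lambda) = 0, \]
where
\[ a'_\lambda(z, w) = a_\lambda(\lambda^{-1} z, w). \]
Also, we notice that $*_g = *_{g_\lambda}$ on 2-forms in $T^*\Sigma_1 \wedge T^*\Sigma_2$. Therefore, for the matrix-valued functions $X'_\lambda$ and $Y'_\lambda$ from $B_p(C'\lambda)$, we have
\[ \bar\partial_z (X'_\lambda - Y'_\lambda) + \lambda^{-1} a'_\lambda (X'_\lambda - Y'_\lambda) + \lambda^{-1} (1 + *)\, d_{X'_\lambda - Y'_\lambda}(d_1 + a'_\lambda) = -\lambda^{-1} a'_\lambda Y'_\lambda - \lambda^{-1} (1 + *)\, d_{Y'_\lambda} a'_\lambda. \tag{4.6} \]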
In particular, (4.6) yields differential equations for the R-valued entries of Xλ − Yλ . We shall still use Xλ0 and Yλ0 to denote the entries of the matrix-valued functions Xλ0 − Yλ0 for the sake of simplicity.
Take a sequence of annuli $A(q, \lambda, g_1) = \{ y \in B_q(1, g_1) : \lambda^{-1} \le d_{g_1}(y, q) \le 1 \}$, and map it conformally onto a 2-dimensional cylinder $[0, \log\lambda] \times S^1$ by $\phi : (r, \theta) \to (-\log r, \theta) = (t, \theta)$. The conformal map $\phi$ pulls back the flat metric on the cylinder to the Euclidean metric with a conformal factor by $\phi^*(dt^2 + d\theta^2) = r^{-2} ds^2$, where $ds^2$ is the Euclidean metric on $\mathbb{R}^2$. The length of the cylinder from the scaled region $A(q, \lambda) = \{ y \in B(1, g_1) : 1 \le d_{\lambda g_1}(y, q) \le \lambda \}$ is the same as that of the unscaled one, and rescaling in the Cartesian coordinates only results in a shift in cylindrical coordinates by the logarithm of the scaling factor. Divide the cylinder into equal length units, each with length $L$, and assume that $\log\lambda = mL$, $m \in \mathbb{Z}^+$ and $m$ is large when $\lambda$ is large. Define
\[ \|u\|^2_i = \int_{[(i-1)L, iL] \times S^1} |u|^2\, dt\, d\theta, \]
\[ \|u\|^2_{1,i} = \|u\|^2_{L^\infty([(i-1)L, iL] \times S^1)} + \|u\|^2_i + \|\nabla u\|^2_i. \]
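The change of variables above can be made concrete with a few lines of code. This is a minimal sketch, assuming flat Euclidean coordinates centered at $q$ (the paper's metric $g_1$ is only almost Euclidean on small balls); the helper names `to_cylinder` and `segment_index` are hypothetical.

```python
import math


def to_cylinder(x, y):
    """Map a point of the punctured plane to cylinder coordinates (t, theta) = (-log r, theta)."""
    r = math.hypot(x, y)
    return -math.log(r), math.atan2(y, x)


def segment_index(t, L):
    """1-based index i of the unit [(i-1)L, iL) of the cylinder that contains t >= 0."""
    return int(t // L) + 1


if __name__ == "__main__":
    lam = 100.0
    t, theta = to_cylinder(0.03, 0.04)                   # a point of the annulus 1/lam <= r <= 1
    t_scaled, _ = to_cylinder(lam * 0.03, lam * 0.04)    # the same point after scaling by lam
    print(t, t - t_scaled, math.log(lam))                # scaling shifts t by log(lam)
    print(segment_index(t, L=1.0))
```

Scaling the Cartesian coordinates by $\lambda$ leaves $\theta$ unchanged and shifts $t$ by $\log\lambda$, which is the shift described in the text; the annulus $\lambda^{-1} \le r \le 1$ corresponds to $0 \le t \le \log\lambda$.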
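The $L^2$ version of the classical three circles theorem for elliptic equations of second order over 2-dimensional cylinders (cf. [ChT, QT, S]) states the following Proposition 4.3, and for the sake of completeness we shall sketch a proof.

Proposition 4.3. Suppose that $u \in L^2_1([0, mL] \times S^1)$ satisfies
\[ \Delta u + A \cdot \nabla u + B \cdot u = h, \tag{4.7} \]
where $m \in \mathbb{Z}^+$ and $L$ are large fixed numbers. Then there exists a positive number $\epsilon$ such that if
\[ \|h\|_{L^2([0, mL] \times S^1)} \le \epsilon \max_{1 \le i \le m} \|u\|_{1,i} \]
and
\[ \|A\|_{L^2([0, mL] \times S^1)} + \|B\|_{L^2([0, mL] \times S^1)} \le \epsilon, \tag{4.8} \]
then for $2 \le i \le m - 1$ the following alternatives hold:
(1) $\|u\|_{i+1} \le e^{-\frac{1}{2}L} \|u\|_i$ implies $\|u\|_i \le e^{-\frac{1}{2}L} \|u\|_{i-1}$,
(2) $\|u\|_{i-1} \le e^{-\frac{1}{2}L} \|u\|_i$ implies $\|u\|_i \le e^{-\frac{1}{2}L} \|u\|_{i+1}$, and
(3) if
\[ \left| \int_{[(i-1)L, iL] \times S^1} u\, dt\, d\theta \right| \le \epsilon \max_{1 \le i \le m} \|u\|_{1,i}, \tag{4.9} \]
\[ \left| \int_{[iL, (i+1)L] \times S^1} u\, dt\, d\theta \right| \le \epsilon \max_{1 \le i \le m} \|u\|_{1,i}, \tag{4.10} \]
then either $\|u\|_i \le e^{-\frac{1}{2}L} \|u\|_{i-1}$ or $\|u\|_i \le e^{-\frac{1}{2}L} \|u\|_{i+1}$.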
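The decay factor $e^{-\frac{1}{2}L}$ in these alternatives can be seen on a single Fourier mode. The sketch below is an illustration only, not part of the argument: it takes the harmonic function $u = e^{-nt}\cos n\theta$ on the flat cylinder (an assumed test case, with $n \ge 1$) and evaluates $\|u\|^2_i$ in closed form; the function name is hypothetical.

```python
import math


def mode_norm_sq(n, i, L):
    """Squared L2 norm of u = exp(-n t) cos(n theta) over [(i-1)L, iL] x S^1, for n >= 1.

    Uses the integral of cos^2(n theta) over the circle (= pi) and the
    integral of exp(-2 n t) over [(i-1)L, iL].
    """
    a, b = (i - 1) * L, i * L
    return math.pi * (math.exp(-2 * n * a) - math.exp(-2 * n * b)) / (2 * n)


if __name__ == "__main__":
    L = 1.0
    for n in (1, 2, 3):
        ratio = math.sqrt(mode_norm_sq(n, 3, L) / mode_norm_sq(n, 2, L))
        print(n, ratio, math.exp(-0.5 * L))  # ratio of consecutive norms vs. exp(-L/2)
```

For such a mode the ratio $\|u\|_{i+1}/\|u\|_i$ equals $e^{-nL}$, which is at most $e^{-\frac{1}{2}L}$ for every $n \ge 1$; the constant and linear modes $a_0 + b_0 t$ are the ones that do not decay, which is why condition (3) controls the mean values.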
Proof. Consider first a harmonic function $u$ on $[0, mL] \times S^1$ satisfying
\[ \int_{[(i-1)L, iL] \times S^1} u\, dt\, d\theta = 0 = \int_{[iL, (i+1)L] \times S^1} u\, dt\, d\theta. \]
Separating variables for harmonic functions on the flat 2-dimensional cylinder leads to
\[ u = a_0 + b_0 t + \sum_{n=1}^{\infty} \left[ e^{nt}(a_n \cos n\theta + b_n \sin n\theta) + e^{-nt}(a_{-n} \cos n\theta + b_{-n} \sin n\theta) \right]. \]
It follows that
\[ \|u\|^2_i = \frac{\pi}{2} \sum_{n=1}^{\infty} \frac{e^{2nL} - 1}{n} \left[ e^{2(i-1)nL}(a_n^2 + b_n^2) + e^{-2(i-1)nL}(a_{-n}^2 + b_{-n}^2) \right] + 2\pi \left( a_0^2 L + a_0 b_0 L^2 (2i - 1) + \tfrac{1}{3} b_0^2 (3i^2 - 3i + 1) \right). \]
These two conditions imply that $a_0 = 0 = b_0$. We may take $L$ large such that $e^L > 4$. Then
\[ \|u\|^2_i < \tfrac{1}{2} \left( e^{L} \|u\|^2_{i+1} + e^{-L} \|u\|^2_{i-1} \right). \]
Now (1) follows for the harmonic function $u$, and (2) follows from (1) by a reflection $R(t) = 2t_{i-1} - t$ about $t_{i-1} = (i - 1)L$. The two integral conditions force $a_0 = 0 = b_0$, and hence
\[ \|u\|^2_i < \tfrac{1}{2} e^{-L} \left( \|u\|^2_{i+1} + \|u\|^2_{i-1} \right), \]
which implies (3) for $u$. For the general case, if the proposition were false, then there would be a sequence $\epsilon_k$ tending to 0 and a sequence of solutions $u_k$ satisfying the PDE with
\[ \|h\|_{L^2([0, mL] \times S^1)} = \epsilon_k \max_{1 \le i \le m} \|u_k\|_{1,i} \]
and
\[ \|A\|_{L^2([0, mL] \times S^1)} + \|B\|_{L^2([0, mL] \times S^1)} \le \epsilon_k. \]
We can normalize $u_k$ by dividing the equation by $\max_{1 \le i \le m} \|u_k\|_{1,i}$, such that
\[ 1 \le \sum_{i=1}^{m} \|u_k\|_{1,i} \le m. \]
Then uk , possibly a subsequence, converges in W 2,2 to a vector-valued harmonic function by elliptic theory (cf. [GT]). This would contradict the above discussion for harmonic functions. Next, we would like to illustrate how Proposition 4.3 is used in our case. For simplicity, we take L = 1 when applying Proposition 4.3 without losing any generality. Also, we remark that so far the mapping Yλ is defined on the regions without curvature concentration. Proposition 4.4. Xλ converges to Yλ in C 0 -topology.
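Proof. First, by differentiating (4.6), we observe that $u_\lambda = X'_\lambda - Y'_\lambda$ satisfies the following second order elliptic equation:
\[ \partial_z \bar\partial_z u_\lambda + \lambda^{-1} a'_\lambda \partial_z u_\lambda + \lambda^{-1} \partial_z a'_\lambda\, u_\lambda + \lambda^{-1} \partial_z \{ (1 + *)\, d_{u_\lambda}(d_1 + a'_\lambda) \} = -\lambda^{-1} \partial_z \{ a'_\lambda Y'_\lambda + (1 + *)\, d_{Y'_\lambda} a'_\lambda \}. \tag{4.11} \]
Note that $\bar\partial_z$ in polar and cylindrical coordinates is given by
\[ \frac{\partial}{\partial \bar z} = e^{\sqrt{-1}\theta} \left( \frac{\partial}{\partial r} + \frac{\sqrt{-1}}{r} \frac{\partial}{\partial \theta} \right) = e^{t} e^{\sqrt{-1}\theta} \left( -\frac{\partial}{\partial t} + \sqrt{-1} \frac{\partial}{\partial \theta} \right). \]
Also, the Laplace operator is
\[ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = e^{2t} \left( \frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial \theta^2} \right). \]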
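The identity $\Delta = e^{2t}(\partial_t^2 + \partial_\theta^2)$ under $r = e^{-t}$ can be checked numerically. The following is a rough sketch using central finite differences; the test function $f(x, y) = x^3 y$ (whose Laplacian is $6xy$), the step size and the function names are all choices made here for illustration.

```python
import math


def f(x, y):
    """Test function with Laplacian 6*x*y in Cartesian coordinates."""
    return x ** 3 * y


def g(t, theta):
    """The same function written in cylinder coordinates, with r = exp(-t)."""
    r = math.exp(-t)
    return f(r * math.cos(theta), r * math.sin(theta))


def cylinder_laplacian(t, theta, h=1e-4):
    """exp(2t) * (g_tt + g_theta_theta), with second derivatives by central differences."""
    g_tt = (g(t + h, theta) - 2 * g(t, theta) + g(t - h, theta)) / h ** 2
    g_thth = (g(t, theta + h) - 2 * g(t, theta) + g(t, theta - h)) / h ** 2
    return math.exp(2 * t) * (g_tt + g_thth)


if __name__ == "__main__":
    t, theta = 0.3, 0.7
    x, y = math.exp(-t) * math.cos(theta), math.exp(-t) * math.sin(theta)
    print(cylinder_laplacian(t, theta), 6 * x * y)  # the two values should agree closely
```

The factor $e^{2t} = r^{-2}$ is the conformal factor of the map $\phi$; dividing an equation by it is what produces the weights $e^{-t}$ in front of first order terms when passing to the cylinder.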
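Therefore, in the cylindrical coordinates, the equation for $u_\lambda$ (in fact $u(\phi^{-1})$) over the cylinder $[0, m] \times S^1$ is
\[ (\partial_t^2 + \partial_\theta^2) u_\lambda + \lambda^{-1} e^{-t} a'_\lambda \partial u_\lambda + \lambda^{-1} e^{-t} \partial a'_\lambda\, u_\lambda + \lambda^{-1} e^{-t} \partial \{ (1 + *)\, d_{u_\lambda}(d_1 + a'_\lambda) \} = -\lambda^{-1} e^{-t} \partial \{ a'_\lambda Y'_\lambda + (1 + *)\, d_{Y'_\lambda} a'_\lambda \}, \tag{4.12} \]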
where $\partial$ is the first order operator $-e^{-\sqrt{-1}\theta} \left( \frac{\partial}{\partial t} + \sqrt{-1} \frac{\partial}{\partial \theta} \right)$. There are two cases according to whether Proposition 4.3 can be applied or not. Here the situation is very similar to the harmonic map heat flow from surfaces (cf. [QT]).

Case 1. There is a constant $C$ independent of $\lambda$ and a constant $\lambda_0$, such that for any $\lambda \ge \lambda_0$,
\[ \max_{1 \le i \le m} \|X'_\lambda - Y'_\lambda\|_{1,i} \le C \left( \sup_{[i_0, i_0+1] \times S^1} |X'_\lambda - Y'_\lambda| + \|h_\lambda\|_{L^2} \right), \]
where $[i_0, i_0 + 1] \times S^1$ contains the point $\phi(p)$ (recall $\phi$ is the conformal map from the disc to the cylinder). Note that Proposition 4.3 is not applicable to $X'_\lambda - Y'_\lambda$ in this case since condition (3) does not hold. However, since the derivatives of $X'_\lambda$ and $Y'_\lambda$ are going to 0 as $\lambda \to \infty$ on $[i_0, i_0 + 1] \times S^1$ and $X'_\lambda(q) - Y'_\lambda(q)$ also tends to 0, $\sup_{[i_0, i_0+1] \times S^1} |X'_\lambda - Y'_\lambda|$ goes to 0. Then the above assumption implies
\[ \sup_{[0, m] \times S^1} |X'_\lambda - Y'_\lambda| \le C \left( \sup_{[i_0, i_0+1] \times S^1} |X'_\lambda - Y'_\lambda| + \|h_\lambda\|_{L^2} \right). \tag{4.13} \]
Therefore if $x = (r, \theta) \ne (0, 0)$ in $g_1$, there is $\lambda_0 > 0$ such that for any $\lambda > \lambda_0$ the point $x$ is in the annulus $A(q, \lambda)$. It then follows from $2k < t = -\log r$ and (4.13) that
\[ |X_\lambda(x) - Y_\lambda(x)| = |X'_\lambda(\lambda x) - Y'_\lambda(\lambda x)| \le C \left( \sup_{[i_0, i_0+1] \times S^1} |X'_\lambda - Y'_\lambda| + \|h_\lambda\|_{L^2} \right). \]
Since the right-hand side of the above inequality goes to 0 as $\lambda \to \infty$, $X_\lambda$ converges to $Y_\lambda$ in the $C^0$ topology in Case 1.

Case 2. For any $C$, there is a sequence $\lambda_n \to \infty$ such that for some $j_0$,
\[ \max_{1 \le i \le m} \|X'_\lambda - Y'_\lambda\|_{1,i} \ge C \left( \sup_{[j_0, j_0+1] \times S^1} |X'_\lambda - Y'_\lambda| + \|h_\lambda\|_{L^2([0, m] \times S^1)} \right). \]
Due to the smallness of the derivatives of $X'_\lambda$ and $Y'_\lambda$, we have
\[ C^{-1} \max_{1 \le i \le m} \|X'_\lambda - Y'_\lambda\|_{1,i} \ge \left| \int_{[j_0, j_0+1] \times S^1} (X'_\lambda - Y'_\lambda)\, dt\, d\theta \right| + \left| \int_{[j_0-1, j_0] \times S^1} (X'_\lambda - Y'_\lambda)\, dt\, d\theta \right| + \|h_\lambda\|_{L^2([0, m] \times S^1)}. \]
We can assume that the initial point $q$ is in $[j_0 - 1, j_0] \times S^1$. In Eq. (4.12), the coefficients $A, B$ in front of $u_\lambda$, $\nabla u_\lambda$ and the inhomogeneous term $h_\lambda = -\lambda^{-1} e^{-t} \partial \{ a'_\lambda Y'_\lambda + (1 + *)\, d_{Y'_\lambda} a'_\lambda \}$ involve $a'_{\lambda 1}$, $da'_{\lambda 1}$, $d^2 a'_{\lambda 1}$, $a'_\lambda\, dY'_\lambda$, $da'_\lambda\, Y'_\lambda$, $d^2 a'_\lambda\, Y'_\lambda$, where $d$ stands for derivatives on the cylinder. The elliptic regularity for ASD connections implies that their $L^2$-norms approach 0 as $\lambda \to \infty$ over the fixed length cylinder. We have a number $\epsilon$ in Proposition 4.3 over $[i_0 - k, i_0 + k] \times S^1$. Then we can take $C$ so large that $C > \epsilon^{-1}$. So the initial conditions in (3) and the condition on $h_\lambda$ of Proposition 4.3 are satisfied. Now (3) in Proposition 4.3 implies that either
\[ \|X'_\lambda - Y'_\lambda\|_i \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i+1} \]
or
\[ \|X'_\lambda - Y'_\lambda\|_i \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i-1}. \]
And then (1) or (2) implies that the exponential decay (growth if viewed backward) holds all the way up to one end of the cylinder. In order to have these decay estimates from both sides of $[i, i + 1] \times S^1$, we need to make an adjustment about $i$ by moving the balls around when we construct holomorphic maps into $\mathcal{M}$. To achieve this, we take the smallest (largest) $i$ such that the exponential decay holds from the right (left) end of the cylinder to $[i, i + 1] \times S^1$. In fact, for any point $y \in B(\frac{1}{4}, g_1)$, Lemma 4.1 implies that there exists a holomorphic map $Y'_\lambda$ from $B_y(\frac{1}{2}, g_1)$ into $\mathcal{M}(E, \Sigma_2)$ with initial conditions for $Y'_\lambda$ such that
\[ Y'_\lambda(y) = [X'_{\infty,y}], \qquad \partial_{A_{\lambda 1}} Y'_\lambda(y) = v_{\lambda,y}, \]
with $\lambda |v_{\lambda,y}| \le C$ as before. Again, by lifting we identify the slice in $\mathcal{A}(E, \Sigma_2)$ with $U$ in $\mathcal{M}$. We point out that when we move the center $y$ the resulting holomorphic map $Y'_\lambda$ depends on $y$, but for simplicity we use $Y'_\lambda$ instead of $Y'_{\lambda,y}$ to denote the holomorphic map. For each $y$, Proposition 4.3 applies to $X'_\lambda - Y'_\lambda$. Hence there exists some $i(y, \lambda)$ given by (3) in Proposition 4.3. Without loss of generality, we may assume that
\[ \|X'_\lambda - Y'_\lambda\|_{i(y,\lambda)} \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i(y,\lambda)+1}. \]
Therefore (2) in Proposition 4.3 implies that
\[ \|X'_\lambda - Y'_\lambda\|_j \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{j+1} \]
for any $j \ge i(y)$. Now we set
\[ i_\lambda = \min_{y \in B_p(\frac{1}{2}, g_1)} i(y). \]
Due to the compactness of $B_p(\frac{1}{2}, g_1)$, there is some $y_0$ such that $i(y_0, \lambda) = i_\lambda$. By our construction,
\[ \|X'_\lambda - Y'_\lambda\|_j \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{j+1} \]
for any $j \ge i_\lambda$. If $i_\lambda$ is bigger than 2, then we can apply (3) in Proposition 4.3 over $[i_\lambda - 2, i_\lambda - 1] \times S^1$. Now (1) in Proposition 4.3 must happen since we pick the smallest $i$ for type (2) in Proposition 4.3. Hence we conclude the claimed exponential decay from both sides toward $[i_\lambda, i_1] \times S^1$. If $i_\lambda \le 2$ for a sequence $\lambda \to \infty$, then we have the situation that the exponential decay holds from one direction all the way up to one end. Proposition 4.4 will follow immediately from the lemma below.

Lemma 4.5. Let $X'_\lambda, Y'_\lambda$ be as before over $[0, \log k] \times S^1$. Assume that the two conditions (4.9), (4.10) in (3) of Proposition 4.3 hold and $|\nabla X'_\lambda|, |\nabla Y'_\lambda| \le \delta \ll \epsilon$, where $\epsilon$ is the one in Proposition 4.3. Then there is some $i_0 \ge 2$ such that
\[ |X'_\lambda(t, \theta) - Y'_\lambda(t, \theta)| + |\nabla X'_\lambda(t, \theta) - \nabla Y'_\lambda(t, \theta)| \le \epsilon_1 C e^{-\frac{1}{2} t}, \quad \text{if } t < i_0, \]
\[ |X'_\lambda(t, \theta) - Y'_\lambda(t, \theta)| + |\nabla X'_\lambda(t, \theta) - \nabla Y'_\lambda(t, \theta)| \le \epsilon_1 C e^{-\frac{1}{2}(\log k - t)}, \quad \text{if } t > i_0, \tag{4.15} \]
where $C$ is a universal constant and $\epsilon_1 = \frac{\epsilon}{100}$.
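Proof. By the previous discussion, for any fixed $i_0 \ge 2$ either
\[ \|X'_\lambda - Y'_\lambda\|_{i_0} \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i_0+1} \]
or
\[ \|X'_\lambda - Y'_\lambda\|_{i_0} \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i_0-1}. \]
Suppose the first case is true. Then by (2) in Proposition 4.3,
\[ \|X'_\lambda - Y'_\lambda\|_{i+1} \ge e^{\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_i \]
for any $i \ge i_0$. We claim
\[ \|X'_\lambda - Y'_\lambda\|_m \le \epsilon_1 = \frac{\epsilon}{100}. \]
If this were not the case, then there would be some $i > i_0$ such that
\[ \|X'_\lambda - Y'_\lambda\|_i \ge \epsilon_1. \]
On the other hand,
\[ \|X'_\lambda - Y'_\lambda\|^2_{i+1} \le \|X'_\lambda - Y'_\lambda\|^2_i + \|\nabla(X'_\lambda - Y'_\lambda)\|^2_{C^0} \cdot 2\pi \le \|X'_\lambda - Y'_\lambda\|^2_i + 2\pi(1 + C^2)\delta^2. \]
It follows that
\[ (e - 1)\epsilon_1^2 \le (e - 1)\|X'_\lambda - Y'_\lambda\|^2_i \le \|X'_\lambda - Y'_\lambda\|^2_{i+1} - \|X'_\lambda - Y'_\lambda\|^2_i \le 2\pi(1 + C^2)\delta^2. \tag{4.14} \]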
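This is impossible since $\delta \ll \epsilon_1$. Therefore by iteration, we have
\[ \|X'_\lambda - Y'_\lambda\|_{i_0} \le e^{-\frac{1}{2}(m - i_0)} \|X'_\lambda - Y'_\lambda\|_m \le \epsilon_1 e^{-\frac{1}{2}(m - i_0)}. \]
For the second case, we can argue similarly. Indeed, if
\[ \|X'_\lambda - Y'_\lambda\|_{i_0} \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i_0-1}, \]
we conclude first that
\[ \|X'_\lambda - Y'_\lambda\|^2_0 \le \epsilon_1, \]
and then
\[ \|X'_\lambda - Y'_\lambda\|_{i_0} \le \epsilon_1 e^{-\frac{1}{2} i_0}. \]
Choose $i_0$ to be the smallest one so that
\[ \|X'_\lambda - Y'_\lambda\|_{i_0} \le e^{-\frac{1}{2}} \|X'_\lambda - Y'_\lambda\|_{i_0+1}. \]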
Note that after this $i_0$, the above inequality still holds. Then the foregoing discussion shows the following. As $m \to \infty$, or equivalently as $\log k$ goes to infinity, $X'_\lambda$ decays exponentially to $Y'_\lambda$ from the left if $i \le i_0$ and from the right if $i \ge i_0$. Then by the $L^p$-theory for elliptic partial differential equations ([GT]) with $p = 2$, $\|X'_\lambda - Y'_\lambda\|_{W^{2,2}}$ can be bounded by the $L^2$-norm of $X'_\lambda - Y'_\lambda$. The Sobolev embedding theorem implies that $\|X'_\lambda - Y'_\lambda\|_{C^0}$ is bounded by the $L^2$-norm. So for any $p > 2$, $\|X'_\lambda - Y'_\lambda\|_{L^p}$ is bounded by $\|X'_\lambda - Y'_\lambda\|_{L^2}$. Applying the $L^p$-theory again, we see that $\|X'_\lambda - Y'_\lambda\|_{W^{2,p}}$ is bounded by $\|X'_\lambda - Y'_\lambda\|_{L^p}$, hence by $\|X'_\lambda - Y'_\lambda\|_{L^2}$. Applying the Sobolev embedding theorem again, the $C^1$-norm of $X'_\lambda - Y'_\lambda$ is bounded by $\|X'_\lambda - Y'_\lambda\|_{L^2}$. Note that $\epsilon$ goes to 0 as $\lambda \to \infty$. Transferring back to Cartesian coordinates yields the desired $C^0$ convergence.

Now we analyze the case when there are points where curvature concentrates.

Lemma 4.6. The curvature $F_{A_\lambda}$ can concentrate only at finitely many points in $\Sigma_1 \times \Sigma_2$.

Proof. Suppose there is some $\epsilon_0 > 0$ such that for any $r > 0$, we can find a sequence of $\lambda$ with $\lambda \to \infty$ and
\[ \int_{B_x(r, g_\lambda)} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda} \ge \epsilon_0. \]
Let $B_p(r, \lambda^2 g_1)$ be the ball in $\Sigma_1$ centered at $p$ with radius $r$ in the metric $\lambda^2 g_1$. Denote $B_p(r, \lambda^2 g_1) \times (\Sigma_2 \cap B_x(r, g_\lambda))$ by $\Omega_{r,\lambda}$. It follows that
\[ \max_{y \in \Omega_{r,\lambda}} |F_{A_\lambda}|^2_{g_\lambda}(y) \ge \mathrm{Vol}(\Omega_{r,\lambda})^{-1} \int_{\Omega_{r,\lambda}} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda} \ge C \epsilon_0 r^{-4}. \]
This implies that
\[ \lim_{\lambda \to \infty} m^2_\lambda = \infty, \]
where we have set
\[ m^2_\lambda = \max_{y \in B_x(1, g)} |F_\lambda|_{g_\lambda}(y). \]
We can employ the well-known blowing up process. Recall that the injectivity radius of $M$ is larger than 2 in the metric $g$. Define a new metric by $g'_\lambda = m^2_\lambda \phi_\lambda g_\lambda$, where the cut-off function $\phi_\lambda$ satisfies
\[ \phi_\lambda(y) = \begin{cases} 1 & \text{if } y \in B_x(1, g_\lambda), \\ 0 & \text{if } y \in M \setminus B_x(2, g_\lambda), \end{cases} \]
and $0 \le \phi_\lambda \le 1$ on $M$. Note that $g'_\lambda$ and $g_\lambda$ are conformally equivalent. Since the anti-self-dual equations are conformally invariant, $A_\lambda$ is anti-self-dual in the scaled metric $g'_\lambda$. Moreover, $\|F_\lambda\|_{L^2(B_x(1, g), g'_\lambda)}$ remains bounded uniformly from above due to the invariance of the $L^2$-norm of curvature under conformal transformations. It follows that
\[ \max_{y \in B_x(1, g)} |F_\lambda|_{g'_\lambda}(y) = 1. \]
Now we are essentially back to the previous case, i.e., no curvature concentration. Again, we go upstairs to the tangent space. Let $\exp_x$ be the exponential map from the Euclidean unit ball $B_x(1) \subset \mathbb{R}^4$ to $(\mathbb{R}^2 \times \Sigma_2, g_{\mathrm{euclid}} \oplus g_2)$ and let $S_\delta$ be the dilation on $\mathbb{R}^4$ by $\delta$ which starts from $x = (p, q)$. Let $\exp_p$ be the exponential map from $\mathbb{R}^2$ to $\Sigma_1$ from $p$ and $T_\delta$ be the dilation on $\mathbb{R}^2$ by $\delta$ from $p$. Define a sequence of maps by
\[ h_{\lambda,x} = \exp_p \circ T_{\lambda^{-1}} \circ \exp_x \circ S_{m^{-1}_\lambda} : B_x(\lambda m_\lambda) \to B_p(1, g_1) \times (\Sigma_2 \cap B_q(1, g_2)). \]
The pullback connection $h^*_{\lambda,x} A_\lambda$ is anti-self-dual in the Euclidean metric over the enlarged region $B_x(\lambda m_\lambda)$. Note that the two pieces of $\mathbb{R}^2 \times \mathbb{R}^2$ scale at different rates $m_\lambda$, $m_\lambda \lambda \to \infty$. Uhlenbeck's compactness theorem implies that $A_\lambda$ converges in $C^\infty(B_x(K))$ to an anti-self-dual connection $A_{\infty,K}$, where $B_x(K)$ is the ball in $\mathbb{R}^4$ centered at $x$ with radius $K$, for every fixed $K$ as before. A diagonal process allows us to have a subsequence of $A_\lambda$ which converges simultaneously for every $K$, hence yielding an anti-self-dual connection $A_{\infty, B_x(1,g)}$ on $\mathbb{R}^4$. Further, the removable singularity theorem ([U1]) says that $A_{\infty, B_x(1,g)}$ extends to an anti-self-dual connection on $S^4$. But $\|F_{A_{S^4}}\|_{L^2} \ge 4\pi^2$, while $c_2(E)$ is a fixed topological quantity. It follows that curvature can concentrate near at most finitely many points $\{y_1, ..., y_k\}$.

Proposition 4.7. The Dirichlet energy of $Y_\lambda$ is uniformly bounded on each ball in the metric $g_1$. A subsequence of $\{Y_\lambda\}$ converges to a holomorphic map from the ball to $\mathcal{M}(E, \Sigma_2)$, possibly together with finitely many rational curves. Moreover, the limit maps can be patched together by the quotient of the group of gauge transformations $\mathcal{G}_0$ to yield a holomorphic map from $\Sigma_1$ to $\mathcal{M}(E, \Sigma_2)$, possibly together with finitely many rational curves.

Proof. Denote the points in $\Sigma_1 \times \Sigma_2$ where curvature accumulates by $(w_i, z_i) \in \Sigma_1 \times \Sigma_2$ for $i = 1, ..., k$. Let $\{B_{p_j}(r_j, g_1)\}$ be a countable cover of $\Sigma_1 \setminus \{w_1, ..., w_k\}$ by small balls $B_{p_j}(r_j)$, $j = 1, 2, ...$, such that each ball $B_{p_j}(2r_j)$ is contained in a single coordinate chart of $\Sigma_1$, and moreover each point in $M \setminus \{y_1, ..., y_k\}$ is in at most a fixed number of balls $B_{p_j}(r_j)$. We also assume that the balls are so small that there exists a holomorphic map
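from each ball $B_{p_j}(2r_j)$ into a region in $\mathcal{M}$ which is liftable to a slice in $\mathcal{A}_{\mathrm{flat}}(E, \Sigma_2)$. In the following discussion, we shall drop the subscript $j$ for $p_j, r_j$ for simplicity. It then follows from (4.2) and the invariance of the $L^2$ norm of $d_{A_{\lambda 1}} X_\lambda$ under conformal changes on the 2-dimensional balls that
\begin{align*}
\int_{B_p(r)} |d_{A_{\lambda 1}} X_\lambda|^2_{g_1}\, d\mu_{g_1} &= \int_{B_p(\lambda r)} |d_{A'_{\lambda 1}} X'_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} \tag{4.16} \\
&\le C_1 \int_{B_p(\lambda r)} \left( \sup_{B_p(\lambda r) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda} \right) d\mu_{\lambda^2 g_1} \\
&\le C_2 \int_{B_p(\lambda r)} \left( \frac{1}{\lambda^2 r^2 \mathrm{Area}(\Sigma_2)} \int_{B_p(2\lambda r) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda} \right) d\mu_{\lambda^2 g_1} \\
&\le C_3 \int_{B_p(2\lambda r) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda}, \tag{4.17}
\end{align*}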
where $C_1, C_2, C_3$ are universal constants independent of $\lambda$. To estimate the energy of $Y_\lambda$ on each small ball, we shall lift $Y_\lambda$ to the chosen slice. The lifting of the metric on the moduli space $\mathcal{M}$ is in general not isometric. But since $\mathcal{M}$ is compact, we may assume that $\mathcal{M}$ is covered by finitely many open sets $U$, each of which is liftable. Also, we recall that the image of the enlarged balls $B_p(\lambda r)$ under $Y'_\lambda$ lies in a fixed $U$ by the construction. Therefore, there is a constant $C > 1$ such that
\[ C^{-1} \int_{B_p(\lambda r)} |d_1 Y_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} \le \int_{B_p(\lambda r)} |d_1 Y'_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} \le C \int_{B_p(\lambda r)} |d_1 Y_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1}. \]
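Again by the invariance of the $L^2$ norm of the corresponding integrals under conformal changes on the domain, we have
\begin{align*}
C^{-1} \int_{B_p(r)} |d_{A_{\lambda 1}} Y_\lambda|^2\, d\mu_{g_1} &= C^{-1} \int_{B_p(\lambda r)} |d_{A'_{\lambda 1}} Y'_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} \\
&\le \int_{B_p(\lambda r)} |d_{A_{\lambda 1}} Y'_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} \\
&\le \int_{B_p(\lambda r)} |d_{A'_{\lambda 1}} (Y'_\lambda - X'_\lambda)|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} + \int_{B_p(\lambda r)} |d_{A'_{\lambda 1}} X'_\lambda|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} \\
&\le \int_{B_p(\lambda r)} |d_{A'_{\lambda 1}} (Y'_\lambda - X'_\lambda)|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} + C_3 \int_{B_p(2\lambda r) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda}.
\end{align*}
By Schauder's interior estimates (cf. [GT]), we see that
\begin{align*}
\int_{B_p(\lambda r)} |d_{A_{\lambda 1}} (Y'_\lambda - X'_\lambda)|^2_{\lambda^2 g_1}\, d\mu_{\lambda^2 g_1} &= \int_{B_p(r)} |d_{A_{\lambda 1}} (\tilde{Y}_\lambda - X_\lambda)|^2\, d\mu_{g_1} \\
&\le \frac{C}{r^2} \int_{B_p(2r)} |\tilde{Y}_\lambda - X_\lambda|^2\, d\mu_{g_1} \\
&\le C \sup_{B_p(2r)} |\tilde{Y}_\lambda - X_\lambda|^2
\end{align*}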
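for some uniform constant $C > 0$. Proposition 4.4 asserts that $\tilde{Y}_\lambda - X_\lambda$ tends to 0 as $\lambda \to \infty$. So we have
\[ \int_{B_p(r)} |d_{A_{\lambda 1}} Y_\lambda|^2\, d\mu_{g_1} \le C \int_{B_p(2\lambda r) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda} + C\delta(\lambda), \tag{4.18} \]
where
\[ \lim_{\lambda \to \infty} \delta(\lambda) = 0. \]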
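The anti-self-duality of $A_\lambda$ implies
\[ (1 + *)(F_{A_{\lambda 1}} + F_{A_{\lambda 2}}) + (1 + *)(d_{A_{\lambda 1}} A_{\lambda 2} + d_{A_{\lambda 2}} A_{\lambda 1}) = 0. \]
But we also have
\[ \lim_{\lambda \to \infty} F_{A_{\lambda 2}} = 0. \]
Observe that $F_{A_{\lambda 1}}, *F_{A_{\lambda 2}} \in \Lambda^2(T^*\Sigma_1 \otimes E)$, $F_{A_{\lambda 2}}, *F_{A_{\lambda 1}} \in \Lambda^2(T^*\Sigma_2 \otimes E)$ and $(1 + *)(d_{A_{\lambda 1}} A_{\lambda 2} + d_{A_{\lambda 2}} A_{\lambda 1}) \in \Lambda^1(T^*\Sigma_1 \otimes E) \oplus \Lambda^1(T^*\Sigma_2 \otimes E)$. It follows that $A_{\lambda 1}$ approaches a flat connection over $\mathbb{R}^2$, i.e., as $\lambda \to \infty$,
\[ F_{A_{\lambda 1}} = -*F_{A_{\lambda 2}} \to 0. \]
Also,
\[ |F_{A_{\lambda 1}}|^2 = |*F_{A_{\lambda 2}}|^2 = |F_{A_{\lambda 2}}|^2 \]
goes to 0 as $\lambda \to \infty$ at the rate $\lambda^{-1}$. It then follows from the regularity for anti-self-dual connections that there exists an $L^2_2$ gauge $\sigma$ such that $A_{\lambda 1} = d_1 + a_\lambda(\sigma)$, where $a_\lambda(\sigma)$ is smooth and satisfies the elliptic estimate
\[ \|a_\lambda(\sigma)\|_{C^k(B(\lambda r))} \le \frac{C(k)}{\lambda^2 r^2} \|F_{A_\lambda}\|_{L^2(B(2\lambda r) \times \Sigma_2)}. \]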
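By the triangle inequality, the boundedness of $Y_\lambda$ (as functions into the compact manifold $\mathcal{M}$) and the conformal invariance of the $L^2$ integral of differential 2-forms in $\Lambda^1(T^*\Sigma_1) \wedge \Lambda^1(T^*\Sigma_2)$,
\begin{align*}
\int_{B_p(r)} |d_1 Y_\lambda|^2 &\le \int_{B_p(r)} |d_{A_{\lambda 1}} Y_\lambda|^2 + \int_{B_p(r)} |a_\lambda(\sigma) Y_\lambda|^2 \\
&\le \int_{B_p(r)} |d_{A_{\lambda 1}} Y_\lambda|^2 + C \|F_{A_\lambda}\|_{L^2(B_p(2\lambda r) \times \Sigma_2)}. \tag{4.19}
\end{align*}
Therefore (4.18) and (4.19) imply
\[ \int_{B_p(r)} |d_1 Y_\lambda|^2\, d\mu_{g_1} \le C \int_{B_p(2\lambda r) \times \Sigma_2} |F_{A_\lambda}|^2_{g_\lambda}\, d\mu_{g_\lambda} + \delta(\lambda) \tag{4.20} \]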
for some constant $C$ independent of $\lambda$. Note that the integral on the right hand side of (4.20) is bounded above by $c_2(E)$. If there are no curvature concentration points on $\Sigma_1 \times \Sigma_2$, then we can cover $\Sigma_1$ by finitely many balls satisfying all the requirements in the previous discussion. If there are finitely many points where curvature accumulates, then we cover the complement of these points by countably many balls chosen as before. So by a diagonal process, we can arrange $\delta(\lambda)$ in (4.20) to be $\frac{\delta(\lambda)}{2^2}, \frac{\delta(\lambda)}{3^2}, ..., \frac{\delta(\lambda)}{n^2}, ...$ on the corresponding balls $B_{p_j}(r_j)$. Applying (4.20) to each ball $B_{p_i}(r_i, g_1)$ and then summing over $i$, we conclude
\[ \int_{\Sigma_1 \setminus \{w_1, ..., w_k\}} |dY_\lambda|^2_{g_1}\, d\mu_{g_1} \le C c_2(E) + C\delta(\lambda) \]
for sufficiently large $\lambda$ and some uniform positive constant $C$. Recall that Gromov's Compactness Theorem for pseudoholomorphic curves ([Gr, PW, RT, Ye]) states

Proposition 4.8. Let $\Sigma$ be a closed surface and $J_n$ a sequence of complex structures on $\Sigma$ and let $(N, J, \nu)$ be a compact Hermitian manifold. Assume that $f_n : (\Sigma, J_n) \to (N, J, \nu)$ is a sequence of $(J_n, J)$-holomorphic maps with
\[ \int_\Sigma |df_n|^2_{g_n}\, dv_{g_n} \le C \]
for some constant $C$ independent of $n$, where $g_n$ is the Hermitian metric for $J_n$ on $\Sigma$. Then there is a subsequence of $\{f_n\}$ which converges to a cusp curve in $(N, J)$.

A cusp curve in $(N, J)$ is a holomorphic map into $(N, J)$ from a disjoint union of finitely many closed Riemann surfaces with a finite number of points on them identified. The cusps come from degeneration of conformal structures. In our case, the complex structure does not change under scaling by constants. $Y_\lambda$ is holomorphic with respect to $\bar\partial_z$. So there is no degeneration of conformal structures. Also, in the regions without curvature concentration, the energy density of $Y_\lambda$ is uniformly bounded on each compact subset. Therefore there are no bubbles on this region when we pass to a converging subsequence of $Y_\lambda$. However, bubbling may still occur since the energy density $|dY_\lambda|^2$ may blow up when we approach the points $w_1, ..., w_k$ where curvature accumulates. The rescaling of metrics in the proof of Lemma 4.6 yields holomorphic maps from $\mathbb{C}$ into $\mathcal{M}$ with finite energy (bounded by a constant times $\|F_{A_\lambda}\|_{S^4}$). These holomorphic maps can be extended to holomorphic maps from $S^2$ into $\mathcal{M}$ by the removable singularity theorem for harmonic or holomorphic mappings, see below. Nevertheless, we conclude that there is a subsequence of $Y_\lambda$ converging to a holomorphic map from the disc into $\mathcal{M}$ with possibly finitely many rational curves. It is well known that each bubble (harmonic $S^2$ in $\mathcal{M}$) utilizes a definite amount of Dirichlet energy (cf. [SU, RT], etc.). We see that there are at most finitely many rational curves since the energy of $Y_\lambda$ over $\Sigma_1 \setminus \{w_1, ..., w_k\}$ is finite. Moreover the limit holomorphic map extends to the entire $\Sigma_1$ across the finitely many points $\{w_1, ..., w_k\}$ where the curvature $F_{A_\lambda}$ concentrates, according to the following lemma ([PW, RT]), which is essentially the Removable Singularity Theorem of Uhlenbeck.

Proposition 4.9. Let $f$ be any $J$-holomorphic map from a punctured disc $D \setminus \{0\}$ in a Riemann surface into a compact symplectic manifold. If
\[ \int_D |df|^2_\mu\, d\mu < \infty, \]
then $f$ extends to a smooth $J$-holomorphic map on $D$.

In fact, we used this removable singularity result to obtain the rational curves.
Then we can use a diagonal process to patch the holomorphic maps on the balls together to obtain a holomorphic map from Σ1, possibly together with finitely many rational curves, into M. Denote the holomorphic map by X∞. Transferring back to the Cartesian coordinates, Proposition 4.4 says |Xλ (x) − X∞ (x)| ≤ C|x|c . It follows immediately from Proposition 4.7 that Xλ converges in Hausdorff topology to a holomorphic map from Σ1 to M(E, Σ2), with possibly finitely many rational curves (holomorphic S2 in M). We can now state the main theorem as follows.

Theorem 4.10. Let M = Σ1 × Σ2 be a product of two compact Riemann surfaces equipped with a Riemannian metric g1 ⊕ g2. Let E be an SU(n)-bundle over M. Suppose that {Aλ} is a sequence of anti-self-dual connections on E over M in the metric λ2 g1 ⊕ g2 where λ → ∞. Then there exists a holomorphic curve from the union of Σ1 and finitely many smooth rational curves to the moduli space of flat connections on E over Σ2, which is the limit in Hausdorff topology of a subsequence of Aλ2 in a suitable choice of gauges, where Aλ2 is the component of Aλ in Σ2.

References
[A] Atiyah, M.F.: New invariants of three and four dimensional manifolds. Proc. Symp. Pure Math. 48, 285–299 (1988)
[AB] Atiyah, M.F. and Bott, R.: The Yang-Mills equations over Riemann surfaces. Phil. Trans. R. Soc. Lond. A 308, 523–615 (1982)
[BJSV] Bershadsky, M., Johansen, A., Sadov, V. and Vafa, C.: Topological reduction of 4D SYM to 2D σ-models. hep-th/950196 v4
[ChT] Chen, J.Y. and Tian, G.: Compactification of moduli space of harmonic mappings. Preprint (1996)
[DK] Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford Mathematical Monographs, Oxford: Clarendon Press, 1990
[DS] Dostoglou, S. and Salamon, D.: Self-dual instantons and holomorphic curves. Ann. of Math. 139, 581–640 (1994)
[DTh] Donaldson, S.K. and Thomas, R.P.: Gauge theory in higher dimensions. Preprint
[DT] Ding, W.Y. and Tian, G.: Energy identity for a class of approximate harmonic maps from surfaces. Commun. in Anal. and Geom. Vol. 4, No. 3, 543–554 (1995)
[FU] Freed, D. and Uhlenbeck, K.: Instantons and four-manifolds. M.S.R.I. Publications, Vol. 1, New York: Springer-Verlag, 1984
[Gr] Gromov, M.: Pseudo holomorphic curves in symplectic manifolds. Invent. Math. 82, 307–374 (1985)
[GT] Gilbarg, D. and Trudinger, N.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer-Verlag, 1983
[Mc] McDuff, D.: Singularities and positivity of intersections of J-holomorphic curves. Progress in Mathematics, Vol. 117, Basel–Boston–Berlin: Birkhäuser Verlag, 1994
[NW] Nijenhuis, A. and Woolf, W.B.: Some integration problems in almost-complex and complex manifolds. Ann. of Math. 77, No. 3, 424–489 (1963)
[P] Parker, T.: Bubble tree convergence for harmonic maps. J. Diff. Geom. 44, 595–633 (1996)
[PW] Parker, T. and Wolfson, J.: A compactness theorem for Gromov’s moduli space. J. Geom. Anal. 3, 63–98 (1993)
[QT] Qing, J. and Tian, G.: Bubbling of the heat flows for harmonic maps from surfaces. Comm. in Pure Appl. Math. 50, No. 4, 295–310 (1997)
[RT] Ruan, Y.B. and Tian, G.: A mathematical theory of quantum cohomology. J. Diff. Geom. Vol. 42, No. 2, 259–367 (1995)
[S] Simon, L.: Asymptotics for a class of nonlinear evolution equations, with applications in geometric problems. Ann. of Math. 118, No. 3, 535–571 (1983)
[SU] Sacks, J. and Uhlenbeck, K.: The existence of minimal immersions of 2 spheres. Ann. of Math. 113, 1–24 (1981)
[T] Taubes, C.: Self-dual Yang–Mills connections on non-self-dual four manifolds. J. Diff. Geom. 17, 139–170 (1982)
[U1] Uhlenbeck, K.: Removable singularities in Yang–Mills fields. Commun. Math. Phys. 83, 11–29 (1982)
[U2] Uhlenbeck, K.: Connections with Lp bounds on curvature. Commun. Math. Phys. 83, 31–42 (1982)
[Ye] Ye, R.G.: Gromov’s compactness theorem for pseudo-holomorphic curves. Trans. Am. Math. Soc. 342, No. 2, 671–694 (1994)
Communicated by A. Jaffe
Commun. Math. Phys. 196, 591 – 640 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Solutions of the Quantum Dynamical Yang–Baxter Equation and Dynamical Quantum Groups? Pavel Etingof1 , Alexander Varchenko2 1 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected] 2 Department of Mathematics, Phillips Hall, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3250, USA. E-mail: [email protected]
Received: 19 August 1997 / Accepted: 21 January 1998
Abstract: The quantum dynamical Yang–Baxter (QDYB) equation is a useful generalization of the quantum Yang–Baxter (QYB) equation. This generalization was introduced by Gervais, Neveu, and Felder. Unlike the QYB equation, the QDYB equation is not an algebraic but a difference equation, with respect to a matrix function rather than a matrix. The QDYB equation and its quasiclassical analogue (the classical dynamical Yang–Baxter equation) arise in several areas of mathematics and mathematical physics (conformal field theory, integrable systems, representation theory). The most interesting solution of the QDYB equation is the elliptic solution, discovered by Felder. In this paper, we prove the first classification results for solutions of the QDYB equation. These results are parallel to the classification of solutions of the classical dynamical Yang–Baxter equation, obtained in our previous paper. All solutions we found can be obtained from Felder’s elliptic solution by a limiting process and gauge transformations. Fifteen years ago the quantum Yang–Baxter equation gave rise to the theory of quantum groups. Namely, it turned out that the language of quantum groups (Hopf algebras) is the adequate algebraic language to talk about solutions of the quantum Yang–Baxter equation. In this paper we propose a similar language, originating from Felder’s ideas, which we found to be adequate for the dynamical Yang–Baxter equation. This is the language of dynamical quantum groups (or h-Hopf algebroids), which is the quantum counterpart of the language of dynamical Poisson groupoids, introduced in our previous paper. Introduction This paper is devoted to the quantum dynamical Yang–Baxter equation, its solutions, and the related algebraic structures (quantum groupoids, Hopf algebroids); abusing language, we will call these structures by the collective name “dynamical quantum groups”. ?
The authors were supported in part by an NSF postdoctoral fellowship and NSF grant DMS-9501290.
Let h be a finite dimensional commutative Lie algebra over C, V a semisimple finite dimensional h-module, and γ a complex number. The quantum dynamical Yang–Baxter (QDYB) equation is the equation
\[ R^{12}(\lambda - \gamma h^{(3)})\, R^{13}(\lambda)\, R^{23}(\lambda - \gamma h^{(1)}) = R^{23}(\lambda)\, R^{13}(\lambda - \gamma h^{(2)})\, R^{12}(\lambda) \tag{1} \]
with respect to a meromorphic function R : h∗ → End(V ⊗ V ), where by definition R12 (λ − γh(3) )(v1 ⊗ v2 ⊗ v3 ) := (R12 (λ − γµ)(v1 ⊗ v2 )) ⊗ v3 if v3 has weight µ, and R13 (λ − γh(2) ), R23 (λ − γh(1) ) are defined analogously. It is also useful to consider the quantum dynamical Yang–Baxter equation with spectral parameter, with respect to a meromorphic function R : C × h∗ → End(V ⊗ V ). By definition, the QDYB equation with spectral parameter is just Eq. (1), with Rij (∗) replaced by Rij (zi − zj , ∗), where z1 , z2 , z3 ∈ C. Solutions of the QDYB equation which are invariant under h are called quantum dynamical R-matrices. A brief history of the QDYB equation is as follows. The QDYB equation was proposed by Felder [F2] as a quantization of the classical dynamical Yang–Baxter equation [F1], but it also appeared earlier in physical literature [GN]. Examples of dynamical Rmatrices appeared in [Fad1, AF]). As Felder showed [F2], the QDYB equation is equivalent to the star-triangle relation in statistical mechanics. The most interesting known solution of the QDYB equation with spectral parameter is the elliptic solution given in [F1, F2]. As was shown in [TV], this solution arises when one studies monodromies of the quantum KZ equation introduced in [FR], see also [FTV1-2]. The algebraic structure corresponding to this solution was described in [F1,F2, FV1-3] and called “the elliptic quantum group”. Although the elliptic quantum group is not a Hopf algebra, it is very similar to a Hopf algebra in many respects. For example, its category of representations, with a suitable definition of the tensor product, is a tensor category, which was studied in [FV1, FV2]. This paper has two goals. 1. To classify quantum dynamical R-matrices in the case when h ⊂ End(V ) is the algebra of all diagonal operators in some basis. 2. To describe the axiomatics of the algebraic structure corresponding to a quantum dynamical R-matrix. 
The first goal is partially attained in Chapters 1 and 2. In Chapter 1, we study dynamical R-matrices without spectral parameter. We define the notion of a dynamical R-matrix of Hecke type which is a dynamical R-matrix satisfying a generalized unitarity condition. Then we define gauge transformations, which map the set of such dynamical R-matrices to itself. After this, we classify dynamical Rmatrices of Hecke type, with h as above. The answer turns out to be completely parallel to the classical case ([EV], Chapter 3). In particular, any classical dynamical r-matrix from [EV] without spectral parameter (for the Lie algebra glN ) can be quantized. In Chapter 2, we study dynamical R-matrices with spectral parameter, satisfying the unitarity condition. As in Chapter 1, we define gauge transformations, which map the set of such dynamical R-matrices to itself. After this, we list all known examples, and give a partial classification result (for R-matrices given by a power series in γ, which are quantizations of elliptic r-matrices from [EV], Chapter 4). As before, the results are parallel to the classical case. In particular, any classical dynamical r-matrix from [EV] with spectral parameter (for the Lie algebra glN ) can be quantized.
Remark. We were not able to obtain a nice classification result for dynamical R-matrices with spectral parameter and numerical γ, since we do not understand what is the correct analogue of the residue condition in [EV]. However, we expect that such a result can be obtained along the same lines as in Chapter 4 of [EV], and Chapter 1 of this paper. The second goal is attained in Chapters 3–6. In Chapter 3, we explain the connection between dynamical R-matrices and monoidal categories. We introduce the tensor category of h-vector spaces, and show that a tensor functor from a braided monoidal category to the category of h-vector spaces gives a dynamical R-matrix, in the same way as a tensor functor from a braided monoidal category to the category of vector spaces gives a usual R-matrix. We also attach to every dynamical R-matrix a tensor category of its representations, following the ideas of [F1, F2, FV1, FV2]. This category is nontrivial (for example, it contains the basic representation), has natural notions of the left and right dual objects, and is equipped with a canonical tensor functor to h-vector spaces. In Chapter 4 we introduce the notions of an h-algebra, h-bialgebroid, and h-Hopf algebroid, which are generalizations of the notions of an algebra, bialgebra, and Hopf algebra. We define the notion of a dynamical representation of an h-algebra, and show that the category of dynamical representations Rep(A) of an h-bialgebroid A is a tensor category with a natural tensor functor to h-vector spaces. If A is an h-Hopf algebroid, this category in addition has natural notions of the left and right dual representation. Using a generalization of the Faddeev–Reshetikhin–Sklyanin–Takhtajan formalism [FRT, FT] which assigns a Hopf algebra to any R-matrix, we assign an h-bialgebroid AR to any dynamical R-matrix R. If R has an additional rigidity property, then AR is an h-Hopf algebroid. We call the bialgebroid AR the dynamical quantum group associated to R. 
We show that the category of representations of R is equivalent to the category Rep(AR ) as a tensor category with duality and with a functor to h-vector spaces. In Chapter 5, we define quantum counterparts of the quasiclassical objects defined in [EV] (in the setting of perturbation theory). More specifically, we define the notions of a biequivariant algebra (biequivariant quantum space), a biequivariant Hopf algebroid (biequivariant quantum groupoid), a dynamical Hopf algebroid (dynamical quantum groupoid), which are the quantum analogues of the notions of a biequivariant Poisson algebra (biequivariant Poisson manifold), a biequivariant Poisson–Hopf algebroid (biequivariant Poisson groupoid), a dynamical Poisson–Hopf algebroid (dynamical Poisson groupoid), introduced in [EV]. We introduce the notion of quantization for biequivariant and dynamical objects, and conjecture that any dynamical Poisson groupoid can be quantized. This material is a generalization of the material of Chapter 4, because, as we explain in Sect. 5.5, the notion of an h-algebra (h-bialgebroid, h-Hopf algebroid) is essentially a special case of the notion of a biequivariant algebra (bialgebroid, Hopf algebroid). Remark. The general notion of a Hopf algebroid was introduced by J. H. Lu [Lu]. It is easy to check that biequivariant and dynamical Hopf algebroids as defined in Chapter 5 of our paper are Hopf algebroids in the sense of Lu. However, the notion considered in [Lu] is more general than the one considered in this paper. In Chapter 6, we study h-bialgebroids associated to dynamical R-matrices of strong Hecke type. Using the semisimplicity of the Hecke algebra for a generic value of the parameter, we prove a Poincare–Birkhoff–Witt theorem for such bialgebroids. This result explains the meaning of the Hecke type condition, which was artificially introduced in Chapter 1. Using the same method, we show that the h-Hopf algebroid associated to a
dynamical R-matrix of Hecke type of the form R = 1 − γr + ... is a flat deformation (quantization) of the Poisson–Hopf algebroid corresponding to r. In the next papers, we plan to develop the theory of dynamical quantum groups. We plan to describe the infinite-dimensional dynamical quantum groups associated to dynamical R-matrices with spectral parameter, and dynamical quantum groups (both finite and infinite dimensional) associated to Lie groups other than GLN . We plan to develop the representation theory of dynamical quantum groups, and explain its connection with exchange (Zamolodchikov) algebras, Kazhdan–Lusztig functors, KZ and quantum KZ equations.
1. Classification of Quantum Dynamical R-matrices without Spectral Parameter

1.1. Quantum dynamical R-matrix. Let h be an abelian finite dimensional Lie algebra. A finite dimensional diagonalizable h-module is a complex finite dimensional vector space V with a weight decomposition $V = \oplus_{\mu\in h^*} V[\mu]$, such that h acts on V[µ] by xv = µ(x)v, where x ∈ h, v ∈ V[µ]. Let $V_i$, i = 1, 2, 3, be finite dimensional diagonalizable h-modules,
\[ R_{V_iV_j} : h^* \to \mathrm{End}(V_i\otimes V_j), \qquad 1 \le i < j \le 3, \]
meromorphic functions, γ a nonzero complex number. The equation in $\mathrm{End}(V_1\otimes V_2\otimes V_3)$,
\[ R^{12}_{V_1V_2}(\lambda-\gamma h^{(3)})\, R^{13}_{V_1V_3}(\lambda)\, R^{23}_{V_2V_3}(\lambda-\gamma h^{(1)}) = R^{23}_{V_2V_3}(\lambda)\, R^{13}_{V_1V_3}(\lambda-\gamma h^{(2)})\, R^{12}_{V_1V_2}(\lambda) \tag{1.1.1} \]
is called the quantum dynamical Yang–Baxter equation with step γ (QDYB equation). Here we use the following notation. If $X \in \mathrm{End}(V_i)$, then we denote by $X^{(i)} \in \mathrm{End}(V_1\otimes\cdots\otimes V_n)$ the operator $\cdots\otimes \mathrm{Id}\otimes X\otimes \mathrm{Id}\otimes\cdots$, acting non-trivially on the ith factor of a tensor product of vector spaces, and if $X = \sum X_k\otimes Y_k \in \mathrm{End}(V_i\otimes V_j)$, then we set $X^{ij} = \sum X_k^{(i)} Y_k^{(j)}$. The shift of λ by $\gamma h^{(i)}$ is defined in the standard way. For instance, $R^{12}_{V_1V_2}(\lambda-\gamma h^{(3)})$ acts on a tensor $v_1\otimes v_2\otimes v_3$ as $R^{12}_{V_1V_2}(\lambda-\gamma\mu_3)\otimes \mathrm{Id}$ if $v_3$ has weight $\mu_3$. A function $R_{V_iV_j} : h^* \to \mathrm{End}(V_i\otimes V_j)$ is called a function of zero weight if
\[ [R_{V_iV_j}(\lambda),\, h\otimes 1 + 1\otimes h] = 0 \tag{1.1.2} \]
for all h ∈ h, λ ∈ h∗. A solution $\{R_{V_iV_j}\}_{1\le i<j\le 3}$ of the QDYB equation is called a solution of zero weight if each of the functions is of zero weight. If all the spaces $V_i$ are equal to a space V, then consider the QDYB equation on one function R : h∗ → End(V ⊗ V),
\[ R^{12}(\lambda-\gamma h^{(3)})\, R^{13}(\lambda)\, R^{23}(\lambda-\gamma h^{(1)}) = R^{23}(\lambda)\, R^{13}(\lambda-\gamma h^{(2)})\, R^{12}(\lambda). \tag{1.1.3} \]
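To make the shift notation concrete, the three shifted factors in (1.1.3) can be written out with weight projectors. The symbol $\pi_\mu$ (the projector of V onto V[µ] along the other weight spaces) is notation introduced here, not the paper's; the display only unpacks the definition given above.

```latex
\begin{align*}
R^{12}(\lambda-\gamma h^{(3)}) &= \sum_{\mu} R(\lambda-\gamma\mu)\otimes \pi_\mu ,\\
R^{23}(\lambda-\gamma h^{(1)}) &= \sum_{\mu} \pi_\mu \otimes R(\lambda-\gamma\mu) ,\\
R^{13}(\lambda-\gamma h^{(2)}) &= \sum_{\mu} R^{13}(\lambda-\gamma\mu)\,\pi_\mu^{(2)} .
\end{align*}
```

Each sum runs over the weights µ of V. The shifted operator acts as the identity on the tensor factor whose weight is read off, so it commutes with the projector and the order of the two factors in each term does not matter.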
An invertible function R of zero weight satisfying the QDYB Eq. (1.1.3) is called a quantum dynamical R-matrix. 1.2. Quantization and quasiclassical limit. Let x1 , ..., xN be a basis in h. The basis defines a linear system of coordinates on h∗ . For any λ ∈ h∗ , set λi = xi (λ), i = 1, ..., N .
Let $R_\gamma$ : h∗ → End(V ⊗ V) be a smooth family of solutions to the QDYB equation with step γ such that
\[ R_\gamma(\lambda) = 1 - \gamma\, r(\lambda) + O(\gamma^2). \tag{1.2.1} \]
Then the function r : h∗ → End(V ⊗ V) satisfies the classical dynamical Yang–Baxter equation (CDYB),
\[ \sum_{i=1}^{N} x_i^{(1)} \frac{\partial r^{23}}{\partial x_i} + \sum_{i=1}^{N} x_i^{(2)} \frac{\partial r^{31}}{\partial x_i} + \sum_{i=1}^{N} x_i^{(3)} \frac{\partial r^{12}}{\partial x_i} + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0. \tag{1.2.2} \]
A function r of zero weight satisfying the CDYB equation is called a classical dynamical r-matrix. The function r in (1.2.1) is called the quasiclassical limit of R, and the function R is called a quantization of r.

Let U ⊂ h∗ be an open set, and let R : U → End(V ⊗ V) be a zero weight meromorphic function on U. We will say that R is a quantum dynamical R-matrix on U if the QDYB equation is satisfied for R whenever it makes sense.

Remark. If U is a bounded set, this notion is only interesting for small γ, so that the QDYB equation makes sense on a nonempty open set U′ ⊂ U.

A classical dynamical r-matrix r(λ) on U is called quantizable if there exists a power series in γ,
\[ R_\gamma(\lambda) = 1 - \gamma\, r(\lambda) + \sum_{n=2}^{\infty} \gamma^n r_n(\lambda), \tag{1.2.3} \]
convergent for small |γ| for any fixed λ ∈ U and such that $R_\gamma(\lambda)$ is a quantum dynamical R-matrix on U with step γ.

1.3. Quantum dynamical R-matrices of Hecke type. Let h be an abelian Lie algebra of dimension N. Let V be a diagonalizable h-module of the same dimension N such that its weights $\omega_1, ..., \omega_N$ form a basis in h∗. Let $x_1, ..., x_N$ be the dual basis of h. Let $v_1, ..., v_N$ be an eigenbasis for h in V such that $x_i v_j = \delta_{ij} v_j$. Then the h-module V ⊗ V has the weight decomposition,
\[ V\otimes V = \bigoplus_{a=1}^{N} V_{aa} \;\oplus\; \bigoplus_{a<b} V_{ab}, \tag{1.3.1} \]
where $V_{aa} = \mathbb{C}\, v_a\otimes v_a$ and $V_{ab} = \mathbb{C}\, v_a\otimes v_b \oplus \mathbb{C}\, v_b\otimes v_a$. Introduce a basis $E_{ij}$ in End(V) by $E_{ij} v_k = \delta_{jk} v_i$. A quantum dynamical R-matrix R : h∗ → End(V ⊗ V) for these h and V will be called an R-matrix of $gl_N$ type. The zero weight condition implies that the R-matrix preserves the weight decomposition (1.3.1) and has the form
\[ R(\lambda) = \sum_{a,b=1}^{N} \alpha_{ab}(\lambda)\, E_{aa}\otimes E_{bb} + \sum_{a\ne b} \beta_{ab}(\lambda)\, E_{ba}\otimes E_{ab}, \tag{1.3.2} \]
where $\alpha_{ab}, \beta_{ab}$ : h∗ → C are suitable meromorphic functions. Let P ∈ End(V ⊗ V) be the permutation of factors. Set $R^\vee = P R$. Let p, q be nonzero complex numbers, p ≠ −q. A function R : h∗ → End(V ⊗ V) will be called a function of Hecke type with parameters p, q if
(1.3.3) The function preserves the weight decomposition (1.3.1).
(1.3.4) For any a = 1, ..., N and λ ∈ h∗, we have $R^\vee(\lambda)\, v_a\otimes v_a = p\, v_a\otimes v_a$.
(1.3.5) For any a ≠ b and λ ∈ h∗, the operator $R^\vee(\lambda)$ restricted to the two dimensional space $V_{ab}$ has eigenvalues p and −q.

A function R : h∗ → End(V ⊗ V) will be called a function of weak Hecke type with parameters p, q if it preserves the weight decomposition (1.3.1) and for any λ ∈ h∗ satisfies the equation
\[ (R^\vee(\lambda) - p)\,(R^\vee(\lambda) + q) = 0. \tag{1.3.6} \]
A relation between Hecke types is given by the following simple observation. Let $R_t$ : h∗ → End(V ⊗ V), t ∈ [0, 1], be a continuous family of meromorphic functions, which is analytic when t ∈ (0, 1). Assume that for any t the function $R_t$ is of weak Hecke type and $R_{t=0} = \mathrm{Id}$. Then $R_t$ is of Hecke type for any t. In fact, the matrix $R^\vee_{t=0} = P$ satisfies (1.3.4–5) and hence $R^\vee_t$ satisfies (1.3.4–5) for any t. In the following sections we classify quantum dynamical R-matrices of $gl_N$ Hecke type.

1.4. Gauge transformations and multiplicative closed 2-forms. In this subsection we introduce gauge transformations of quantum dynamical R-matrices of Hecke type. We shall use the notion of a multiplicative form. A multiplicative k-form on a vector space with a linear coordinate system $\lambda_1, ..., \lambda_N$ is a collection, $\varphi = \{\varphi_{a_1,...,a_k}(\lambda_1, ..., \lambda_N)\}$, of meromorphic functions, where $a_1, ..., a_k$ run through all ordered k element subsets of {1, ..., N}, such that for any subset $a_1, ..., a_k$ and any i, 1 ≤ i < k, we have
\[ \varphi_{a_1,...,a_{i+1},a_i,...,a_k}(\lambda_1, ..., \lambda_N)\; \varphi_{a_1,...,a_k}(\lambda_1, ..., \lambda_N) = 1. \]
Let $\Omega^k$ be the set of all multiplicative k-forms. If ϕ and ψ are multiplicative k-forms, then $\{\varphi_{a_1,...,a_k}\cdot\psi_{a_1,...,a_k}\}$ and $\{\varphi_{a_1,...,a_k} / \psi_{a_1,...,a_k}\}$ are multiplicative k-forms. This gives an abelian group structure on $\Omega^k$. The zero element in $\Omega^k$ is the form $\{\varphi_{a_1,...,a_k}(\lambda_1, ..., \lambda_N) \equiv 1\}$. Fix a nonzero complex number γ.
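For any a = 1, ..., N, introduce an operator $\delta_a$ on the space of meromorphic functions $f(\lambda_1, ..., \lambda_N)$ by
\[ \delta_a : f(\lambda_1, ..., \lambda_N) \mapsto f(\lambda_1, ..., \lambda_N) \,/\, f(\lambda_1, ..., \lambda_a - \gamma, ..., \lambda_N) \]
and an operator $d_\gamma : \Omega^k \to \Omega^{k+1}$, $\varphi \mapsto d_\gamma\varphi$, from multiplicative k-forms to multiplicative (k+1)-forms, by
\[ (d_\gamma\varphi)_{a_1,...,a_{k+1}}(\lambda_1, ..., \lambda_N) = \prod_{i=1}^{k+1} \bigl(\delta_{a_i}\, \varphi_{a_1,...,a_{i-1},a_{i+1},...,a_{k+1}}(\lambda_1, ..., \lambda_N)\bigr)^{(-1)^{i+1}}. \]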
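A low-degree illustration of these definitions may help. The shift operator $T_a f(\lambda) = f(\lambda_1, ..., \lambda_a - \gamma, ..., \lambda_N)$ is notation introduced here for convenience, so that $\delta_a f = f / T_a f$; the computation is a routine check, not taken from the paper.

```latex
\begin{align*}
k=0:&\quad (d_\gamma f)_a = \delta_a f = \frac{f}{T_a f},\\
k=1:&\quad (d_\gamma \varphi)_{ab} = \frac{\delta_a \varphi_b}{\delta_b \varphi_a},\\
&\quad (d_\gamma d_\gamma f)_{ab} = \frac{\delta_a\delta_b f}{\delta_b\delta_a f}
 = \frac{f\; T_aT_b f}{T_a f\; T_b f}\cdot\frac{T_a f\; T_b f}{f\; T_bT_a f} = 1 .
\end{align*}
```

The last line uses only that the shifts commute, $T_aT_b = T_bT_a$. The value 1 is the neutral element of the group of multiplicative forms, which is what the additive notation for the square of $d_\gamma$ being zero refers to.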
We have $d_\gamma^2 = 0$. A form ϕ will be called γ-closed if $d_\gamma\varphi = 0$. Let $\varphi(\gamma) = \{\varphi_{a_1,...,a_k}(\lambda_1, ..., \lambda_N, \gamma)\}$ be a smooth family of multiplicative k-forms such that for all $a_1, ..., a_k$,
\[ \varphi_{a_1,...,a_k}(\lambda, \gamma) = 1 - \gamma\, C_{a_1,...,a_k}(\lambda) + O(\gamma^2) \]
for suitable functions Ca1 ,...,ak (λ). Then the functions {Ca1 ,...,ak (λ)} are skew-symmetric with respect to permutation of the indices, so it is natural to consider a differential P form C = a1 <...
\[ \sum_{n=2}^{\infty} \gamma^n\, C_{n;\, a_1,...,a_k}(\lambda), \]
convergent for small |γ| for a fixed λ ∈ U and such that $\{\varphi_{a_1,...,a_k}(\lambda, \gamma)\}$ is a γ-closed multiplicative k-form.

Lemma 1.1. Every closed holomorphic differential k-form C defined on an open polydisc U is quantizable to a holomorphic multiplicative closed k-form ϕ(γ).

Proof. Since U is a polydisc, we can find a holomorphic (k − 1)-form E on U such that dE = C. Define a multiplicative (k − 1)-form θ on U by $\theta_{a_1...a_{k-1}} = e^{-E_{a_1...a_{k-1}}}$. Set $\varphi(\gamma) = d_\gamma\theta$. Since $d_\gamma^2 = 0$, the form ϕ(γ) is a desired multiplicative closed k-form.

Remark. The Taylor expansion of ϕ(γ) in powers of γ is well defined in U, but for each particular (even very small) nonzero γ, the form ϕ(γ) is defined in a smaller open subset U′(γ) ⊂ U which tends to U as γ → 0.

Now we introduce gauge transformations of quantum dynamical R-matrices, R : h∗ → End(V ⊗ V), of form (1.3.2) with step γ.

(1.4.1) Let $\{\varphi_{ab}\}$ be a meromorphic γ-closed multiplicative 2-form on h∗. Set
\[ R(\lambda) \mapsto \sum_{a=1}^{N} \alpha_{aa}(\lambda)\, E_{aa}\otimes E_{aa} + \sum_{a\ne b} \varphi_{ab}(\lambda)\, \alpha_{ab}(\lambda)\, E_{aa}\otimes E_{bb} + \sum_{a\ne b} \beta_{ab}(\lambda)\, E_{ba}\otimes E_{ab}. \]
(1.4.2) Let the symmetric group $S_N$, the Weyl group of $gl_N$, act on h∗ and V by permutation of coordinates. For any permutation $\sigma \in S_N$, set $R(\lambda) \mapsto (\sigma\otimes\sigma)\, R(\sigma^{-1}\cdot\lambda)\, (\sigma^{-1}\otimes\sigma^{-1})$.
(1.4.3) For a nonzero complex number c, set $R(\lambda) \mapsto c\, R(\lambda)$.
(1.4.4) For a nonzero complex number c and an element µ ∈ h∗, set $R(\lambda) \mapsto R(c\lambda + \mu)$.
It is clear that any gauge transformation of types (1.4.2)–(1.4.3) transforms a quantum dynamical R-matrix with step γ to a quantum dynamical R-matrix with step γ. Any gauge transformation of type (1.4.4) transforms a quantum dynamical R-matrix with step γ to a quantum dynamical R-matrix with step γ/c. In all cases, if the R-matrix is of Hecke type, then the transformed matrix is of Hecke type. If the transformation is of type (1.4.3) and the Hecke parameters of the R-matrix are p and q, then the Hecke parameters of the transformed matrix are cp and cq. Theorem 1.1. Any gauge transformation of type (1.4.1) transforms a quantum dynamical R-matrix with step γ to a quantum dynamical R-matrix with step γ. If the R-matrix is of Hecke type, then the transformed matrix is of Hecke type with the same parameters. Theorem 1.1 is proved in Sect. 1.9. Two R-matrices R : h∗ → End(V ⊗ V ) and R0 : h∗ → End(V ⊗ V ) will be called equivalent if one of them can be transformed into another by a sequence of gauge transformations. 1.5. Classification of quantum dynamical R-matrices of Hecke type with parameters p, q such that q = p. If Hecke parameters satisfy p = q, then the Hecke Eq. (1.3.6) can be written as R21 (λ) R(λ) = q 2 Id. Let X ⊂ {1, ..., N } be a subset. Say that X is decomposed into disjoint intervals, X = X1 ∪ ... ∪ Xn , if every Xk has the form {ak , ak + 1, ..., bk } and ak+1 > bk for k = 1, ..., n − 1. A meromorphic function µ(λ) will be called γ-quasiconstant if δa µ = 0 for all a. Fix a γ-quasiconstant µ : h∗ → h∗ with γ = 1. Define scalar meromorphic γ-quasiconstant functions µab : h∗ → C by µab (λ) = xa (µ(λ)) − xb (µ(λ)). Let λab denote λa − λb . Define R∪Xk : h∗ → End(V ⊗ V ) by R∪Xk (λ) =
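\[ \sum_{a,b=1}^{N} E_{aa}\otimes E_{bb} + \sum_{k=1}^{n} \sum_{\substack{a,b\in X_k \\ a\ne b}} \frac{1}{\lambda_{ab} - \mu_{ab}(\lambda)}\, \bigl( E_{aa}\otimes E_{bb} + E_{ba}\otimes E_{ab} \bigr). \tag{1.5.1} \]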
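Formula (1.5.1) lends itself to a direct numerical sanity check against the QDYB equation (1.1.3). The sketch below is not from the paper: it assumes NumPy, takes X = {1, ..., N} as a single interval, and uses a constant µ (so that $\mu_{ab} = \mu_a - \mu_b$ is trivially quasiconstant). The function names `rational_R`, `qdyb_residual` and `hecke_residual` are hypothetical helpers chosen for this illustration.

```python
import numpy as np

def unit(N, i, j):
    """Matrix unit E_ij, acting by E_ij v_k = delta_jk v_i."""
    E = np.zeros((N, N), dtype=complex)
    E[i, j] = 1.0
    return E

def swap(N):
    """Permutation of factors P on V (x) V: P(v_a (x) v_b) = v_b (x) v_a."""
    return sum(np.kron(unit(N, b, a), unit(N, a, b))
               for a in range(N) for b in range(N))

def rational_R(lam, mu):
    """R(lambda) of (1.5.1) with X = {1..N} one interval, mu_ab = mu[a] - mu[b]."""
    N = len(lam)
    R = np.eye(N * N, dtype=complex)   # sum_{a,b} E_aa (x) E_bb is the identity
    for a in range(N):
        for b in range(N):
            if a != b:
                coeff = 1.0 / ((lam[a] - lam[b]) - (mu[a] - mu[b]))
                R += coeff * (np.kron(unit(N, a, a), unit(N, b, b))
                              + np.kron(unit(N, b, a), unit(N, a, b)))
    return R

def qdyb_residual(lam, mu, gamma):
    """Max-norm of LHS - RHS of (1.1.3) on V (x) V (x) V for the given step."""
    lam = np.asarray(lam, dtype=complex)
    mu = np.asarray(mu, dtype=complex)
    N = len(lam)
    I = np.eye(N)
    e = np.eye(N)                      # e[c] = coordinates of the weight omega_c
    P23 = np.kron(I, swap(N))

    def R12(l): return np.kron(rational_R(l, mu), I)
    def R23(l): return np.kron(I, rational_R(l, mu))
    def R13(l): return P23 @ R12(l) @ P23

    def proj(slot, c):
        # projector onto weight omega_c in the given tensor factor (0, 1 or 2)
        factors = [I, I, I]
        factors[slot] = unit(N, c, c)
        return np.kron(np.kron(factors[0], factors[1]), factors[2])

    # dynamical shifts: lambda is shifted by gamma times the weight of one factor
    R12_h3 = sum(R12(lam - gamma * e[c]) @ proj(2, c) for c in range(N))
    R23_h1 = sum(R23(lam - gamma * e[c]) @ proj(0, c) for c in range(N))
    R13_h2 = sum(R13(lam - gamma * e[c]) @ proj(1, c) for c in range(N))

    lhs = R12_h3 @ R13(lam) @ R23_h1
    rhs = R23(lam) @ R13_h2 @ R12(lam)
    return float(np.max(np.abs(lhs - rhs)))

def hecke_residual(lam, mu):
    """Max-norm of (R^vee - 1)(R^vee + 1), the weak Hecke relation with p = q = 1."""
    lam = np.asarray(lam, dtype=complex)
    mu = np.asarray(mu, dtype=complex)
    N = len(lam)
    Rv = swap(N) @ rational_R(lam, mu)
    Id = np.eye(N * N)
    return float(np.max(np.abs((Rv - Id) @ (Rv + Id))))
```

For a generic point such as `lam = [0.3 + 0.1j, -1.2, 2.5]`, `mu = [0.7, 0.1, -0.4]`, calling `qdyb_residual(lam, mu, 1.0)` should return a value at the level of rounding error, while a step other than 1 gives a visibly nonzero residual. This agrees with the statement that (1.5.1) is a solution with step γ = 1 specifically.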
Theorem 1.2. 1. For every X ⊂ {1, ..., N}, the R-matrix $R_{\cup X_k}$ defined by (1.5.1) is a quantum dynamical R-matrix of Hecke type with parameters p = 1, q = 1 and step γ = 1. 2. Every quantum dynamical R-matrix of Hecke type with parameters p, q, such that p = q, is equivalent to one of the matrices (1.5.1).

Theorem 1.2 is proved in Sect. 1.11.

1.6. Classification of quantum dynamical R-matrices of Hecke type with parameters p, q such that q ≠ p. Assume that for any a, b, a ≠ b, a γ-quasiconstant $\mu_{ab}$ : h∗ → C is given. We say that this collection of quasiconstants is multiplicative if
(1.6.1) For any a, b, we have $\mu_{ab}(\lambda)\, \mu_{ba}(\lambda) = 1$.
(1.6.2) For any a, b, c, we have $\mu_{ac}(\lambda) = \mu_{ab}(\lambda)\, \mu_{bc}(\lambda)$.
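Fix a multiplicative family of γ-quasiconstants with γ = 1. Fix a complex number ε such that $e^{\epsilon} \ne 1$. Let X ⊂ {1, ..., N} be a subset, $X = X_1 \cup ... \cup X_n$ its decomposition into disjoint intervals. For any a, b ∈ {1, ..., N}, a ≠ b, we shall introduce functions $\alpha_{ab}, \beta_{ab}$ : h∗ → C. We shall introduce functions $\beta_{ab}$ and then set $\alpha_{ab} = e^{\epsilon} + \beta_{ab}$. If a, b ∈ $X_k$ for some k, then we set
\[ \beta_{ab}(\lambda) = \frac{e^{\epsilon} - 1}{\mu_{ab}(\lambda)\, e^{\epsilon\lambda_{ab}} - 1}. \tag{1.6.3} \]
Otherwise we set $\beta_{ab}(\lambda) = 0$ if a < b, and $\beta_{ab}(\lambda) = 1 - e^{\epsilon}$ if a > b. Define $R_{\cup X_k,\epsilon}$ : h∗ → End(V ⊗ V) by
\[ R_{\cup X_k,\epsilon}(\lambda) = \sum_{a=1}^{N} E_{aa}\otimes E_{aa} + \sum_{a\ne b} \alpha_{ab}(\lambda)\, E_{aa}\otimes E_{bb} + \sum_{a\ne b} \beta_{ab}(\lambda)\, E_{ba}\otimes E_{ab}. \tag{1.6.4} \]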
Theorem 1.3. 1. For every X ⊂ {1, ..., N}, the R-matrix $R_{\cup X_k,\epsilon}$ defined by (1.6.4) is a quantum dynamical R-matrix of Hecke type with parameters p = 1, $q = e^{\epsilon}$ and step γ = 1. 2. Every quantum dynamical R-matrix of Hecke type with parameters p, q such that q ≠ p is equivalent to one of the matrices (1.6.4).

Theorem 1.3 is proved in Sect. 1.12.

1.7. Quantization of classical dynamical r-matrices of $gl_N$ type. Let V be the N-dimensional h-module considered in Sect. 1.3. Let r : h∗ → End(V ⊗ V) be a zero weight meromorphic function satisfying CDYB (1.2.2). Assume that r satisfies the unitarity condition,
\[ r(\lambda) + r^{21}(\lambda) = \epsilon\, P + \delta\, \mathrm{Id} \tag{1.7.1} \]
for some constants ε, δ ∈ C and all λ. The constant ε is called the coupling constant, the constant δ is called the secondary coupling constant. The zero weight condition implies that r has the form
\[ r(\lambda) = \sum_{a,b=1}^{N} \alpha_{ab}(\lambda)\, E_{aa}\otimes E_{bb} + \sum_{a\ne b} \beta_{ab}(\lambda)\, E_{ab}\otimes E_{ba}. \tag{1.7.2} \]
We recall a classification of such r-matrices. First we introduce gauge transformations of classical dynamical r-matrices.

(1.7.3) Let $\psi = \sum_{a,b} \psi_{ab}(\lambda)\, dx_a\wedge dx_b$ be a closed meromorphic differential 2-form on h∗ (the notion of a closed differential form has the standard meaning). Set
\[ r(\lambda) \mapsto r(\lambda) + \sum_{\substack{a,b=1 \\ a\ne b}}^{N} \psi_{ab}(\lambda)\, E_{aa}\otimes E_{bb}. \]
(1.7.4) For µ ∈ h∗, set $r(\lambda) \mapsto r(\lambda + \mu)$.
(1.7.5) Let the symmetric group $S_N$ act on h∗ and V by permutation of coordinates. For any permutation $\sigma \in S_N$, set $r(\lambda) \mapsto (\sigma\otimes\sigma)\, r(\sigma^{-1}\cdot\lambda)\, (\sigma^{-1}\otimes\sigma^{-1})$.
(1.7.6) For a nonzero complex number c, set $r(\lambda) \mapsto c\, r(c\lambda)$.
(1.7.7) For a nonzero complex number c, set $r(\lambda) \mapsto r(\lambda) + c\, \mathrm{Id}$.

Any gauge transformation transforms a classical dynamical r-matrix to a classical dynamical r-matrix [EV]. Two classical dynamical r-matrices r(λ) and r′(λ) will be called equivalent if one of them can be transformed into another by a sequence of gauge transformations. The gauge transformations of quantum dynamical R-matrices described in Sect. 1.4 are analogs of gauge transformations of classical dynamical r-matrices.

Classification of r-matrices with zero coupling constant, ε = 0. Let X ⊂ {1, ..., N} be a subset, $X = X_1 \cup ... \cup X_n$ its decomposition into disjoint intervals. Define a map $r_{\cup X_k}$ : h∗ → End(V ⊗ V) by
\[ r_{\cup X_k}(\lambda) = \sum_{k=1}^{n} \sum_{\substack{a,b\in X_k \\ a\ne b}} \frac{1}{\lambda_{ba}}\, E_{ba}\otimes E_{ab}. \tag{1.7.8} \]
Theorem 1.4. 1. For any X and its decomposition $X = X_1 \cup ... \cup X_n$ into disjoint intervals, the function $r_{\cup X_k}$ defined by (1.7.8) is a classical dynamical r-matrix with zero coupling constant. 2. Any classical dynamical r-matrix r : h∗ → End(V ⊗ V) with zero coupling constant is equivalent to one of the matrices (1.7.8).

Theorem 1.4 follows from [EV].

Classification of r-matrices with nonzero coupling constant, ε ≠ 0. Let X ⊂ {1, ..., N} be a subset, $X = X_1 \cup ... \cup X_n$ its decomposition into disjoint intervals. For any a, b ∈ {1, ..., N}, a ≠ b, we introduce functions $\beta_{ab}$ : h∗ → C. If a, b ∈ $X_k$ for some k, then we set $\beta_{ab}(\lambda) = \operatorname{cotanh}(\lambda_{ba})$. Otherwise we set $\beta_{ab}(\lambda) = -1$ if a < b, and $\beta_{ab}(\lambda) = 1$ if a > b. Define $r_{\cup X_k}$ : h∗ → End(V ⊗ V) by
\[ r_{\cup X_k}(\lambda) = P + \sum_{a\ne b} \beta_{ab}(\lambda)\, E_{ba}\otimes E_{ab}. \tag{1.7.9} \]
Theorem 1.5. 1. For every X ⊂ {1, ..., N} and its decomposition $X = X_1 \cup ... \cup X_n$ into disjoint intervals, the function $r_{\cup X_k}$ defined by (1.7.9) is a classical dynamical r-matrix with nonzero coupling constant ε = 2 and the secondary coupling constant δ = 0. 2. Every classical dynamical r-matrix r : h∗ → End(V ⊗ V) with nonzero coupling constant is equivalent to one of the matrices (1.7.9).
Theorem 1.5 follows from [EV].

Theorem 1.6. 1. Every classical dynamical r-matrix r with zero coupling constant, holomorphic on an open polydisc U ⊂ h∗, can be quantized to a quantum dynamical R-matrix $R_\gamma$ on U, of Hecke type with parameters p, q such that p = q. 2. Every classical dynamical r-matrix r with nonzero coupling constant, holomorphic on an open polydisc U ⊂ h∗, can be quantized to a quantum dynamical R-matrix $R_\gamma$ on U, of Hecke type with parameters p, q such that p ≠ q.

Proof. The R-matrix
\[ R_{\cup X_k}(\lambda, \gamma) = \sum_{a,b=1}^{N} E_{aa}\otimes E_{bb} + \sum_{k=1}^{n} \sum_{\substack{a,b\in X_k \\ a\ne b}} \frac{\gamma}{\lambda_{ab}}\, \bigl( E_{aa}\otimes E_{bb} + E_{ba}\otimes E_{ab} \bigr) \]
is a quantum dynamical R-matrix of Hecke type with parameters p = q = 1 and step γ. Its quasiclassical limit is
\[ r_0(\lambda) = \sum_{k=1}^{n} \sum_{\substack{a,b\in X_k \\ a\ne b}} \frac{-1}{\lambda_{ab}}\, \bigl( E_{aa}\otimes E_{bb} + E_{ba}\otimes E_{ab} \bigr). \]
Making (1.7.3) corresponding to the closed form P P the gauge transformation −1 λ dx ∧ dx , a b we get the r-matrix r∪Xk defined by (1.7.8). This k a,b∈Xk ,a
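\[ R(\lambda) = \sum_{a=1}^{N} E_{aa}\otimes E_{aa} + \sum_{a\ne b} \alpha_{ab}(\lambda)\, E_{aa}\otimes E_{bb} + \sum_{a\ne b} \beta_{ab}(\lambda)\, E_{ba}\otimes E_{ab}. \tag{1.8.1} \]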
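The Hecke property also implies that for every a, c ∈ {1, ..., N}, a ≠ c, we have
\[ \beta_{ac}(\lambda) + \beta_{ca}(\lambda) = 1 - q, \tag{1.8.2} \]
\[ \beta_{ac}(\lambda)\, \beta_{ca}(\lambda) - \alpha_{ac}(\lambda)\, \alpha_{ca}(\lambda) = -q; \tag{1.8.3} \]
these are the trace and the determinant of $R^\vee$ restricted to $V_{ac}$. Applying both sides of the QDYB Eq. (1.1.3) to a basis vector $v_a\otimes v_a\otimes v_c \in V^{\otimes 3}$, a ≠ c, we get equations
\[ \alpha_{ca}(\lambda-\omega_a)\, \beta_{ac}(\lambda)\, \alpha_{ac}(\lambda-\omega_a) + \beta_{ac}(\lambda-\omega_a)^2 = \beta_{ac}(\lambda-\omega_a), \tag{1.8.4} \]
\[ \beta_{ca}(\lambda-\omega_a)\, \beta_{ac}(\lambda)\, \alpha_{ac}(\lambda-\omega_a) + \alpha_{ac}(\lambda-\omega_a)\, \beta_{ac}(\lambda-\omega_a) = \beta_{ac}(\lambda)\, \alpha_{ac}(\lambda-\omega_a). \tag{1.8.5} \]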
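Relations (1.8.2) and (1.8.3) can be read off from a 2×2 block. The display below is an added check, written in the ordered basis $(v_a\otimes v_c,\ v_c\otimes v_a)$ of $V_{ac}$ (the ordering is our choice) and using $R(\lambda)\, v_a\otimes v_c = \alpha_{ac}\, v_a\otimes v_c + \beta_{ac}\, v_c\otimes v_a$, with eigenvalues p = 1 and −q as in (1.3.5).

```latex
\begin{gather*}
R^\vee(\lambda)\big|_{V_{ac}} =
\begin{pmatrix} \beta_{ac} & \alpha_{ca} \\ \alpha_{ac} & \beta_{ca} \end{pmatrix},\\
\operatorname{tr} = \beta_{ac}+\beta_{ca} = p - q = 1 - q, \qquad
\det = \beta_{ac}\beta_{ca} - \alpha_{ac}\alpha_{ca} = p\cdot(-q) = -q .
\end{gather*}
```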
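Applying both sides of the QDYB Eq. (1.1.3) to a basis vector $v_a\otimes v_b\otimes v_c \in V^{\otimes 3}$ with pairwise distinct a, b, c we get equations
\[ \alpha_{ab}(\lambda-\omega_c)\, \alpha_{ac}(\lambda)\, \alpha_{bc}(\lambda-\omega_a) = \alpha_{bc}(\lambda)\, \alpha_{ac}(\lambda-\omega_b)\, \alpha_{ab}(\lambda), \tag{1.8.6} \]
\[ \alpha_{ac}(\lambda-\omega_b)\, \alpha_{ab}(\lambda)\, \beta_{bc}(\lambda-\omega_a) = \beta_{bc}(\lambda)\, \alpha_{ac}(\lambda-\omega_b)\, \alpha_{ab}(\lambda), \tag{1.8.7} \]
\[ \beta_{ab}(\lambda-\omega_c)\, \alpha_{ac}(\lambda)\, \alpha_{bc}(\lambda-\omega_a) = \alpha_{ac}(\lambda)\, \alpha_{bc}(\lambda-\omega_a)\, \beta_{ab}(\lambda), \tag{1.8.8} \]
\[ \beta_{cb}(\lambda-\omega_a)\, \beta_{ac}(\lambda)\, \alpha_{bc}(\lambda-\omega_a) + \alpha_{bc}(\lambda-\omega_a)\, \beta_{ab}(\lambda)\, \beta_{bc}(\lambda-\omega_a) = \beta_{ac}(\lambda)\, \alpha_{bc}(\lambda-\omega_a)\, \beta_{ab}(\lambda), \tag{1.8.9} \]
\[ \alpha_{cb}(\lambda-\omega_a)\, \beta_{ac}(\lambda)\, \alpha_{bc}(\lambda-\omega_a) + \beta_{bc}(\lambda-\omega_a)\, \beta_{ab}(\lambda)\, \beta_{bc}(\lambda-\omega_a) = \alpha_{ba}(\lambda)\, \beta_{ac}(\lambda-\omega_b)\, \alpha_{ab}(\lambda) + \beta_{ab}(\lambda)\, \beta_{bc}(\lambda-\omega_a)\, \beta_{ab}(\lambda), \tag{1.8.10} \]
\[ \beta_{ac}(\lambda-\omega_b)\, \alpha_{ab}(\lambda)\, \beta_{bc}(\lambda-\omega_a) = \beta_{ba}(\lambda)\, \beta_{ac}(\lambda-\omega_b)\, \alpha_{ab}(\lambda) + \alpha_{ab}(\lambda)\, \beta_{bc}(\lambda-\omega_a)\, \beta_{ab}(\lambda). \tag{1.8.11} \]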
Lemma 1.2. For any a, c, a ≠ c, the functions $\alpha_{ac}(\lambda)$ and $q + \beta_{ac}(\lambda)$ are not identically equal to zero.

Proof. If $\alpha_{ac} \equiv 0$, then Eqs. (1.8.2)$_{ac}$, (1.8.3)$_{ac}$, (1.8.4)$_{ac}$, and (1.8.4)$_{ca}$ give a contradiction. Thus, $\alpha_{ac}$ and $\alpha_{ca}$ are not identically equal to zero. Equations (1.8.2)$_{ac}$, (1.8.3)$_{ac}$ imply
\[ \alpha_{ac}(\lambda)\, \alpha_{ca}(\lambda) = (q + \beta_{ac}(\lambda))\, (q + \beta_{ca}(\lambda)). \tag{1.8.12} \]
The lemma is proved.

1.9. Proof of Theorem 1.1. Let $\{\varphi_{ab}\}$ be a γ-closed multiplicative 2-form on h∗. It is easy to see that Eqs. (1.8.2)–(1.8.11) are invariant with respect to the gauge transformation (1.4.1). This proves Theorem 1.1.

1.10. Relation $\alpha_{ac} = q + \beta_{ac}$. Consider a quantum dynamical R-matrix R(λ) of form (1.3.2). Assume that the matrix is of Hecke type with step γ = 1 and Hecke parameters p = 1 and q. For any a, c, a ≠ c, set
\[ \varphi_{ac}(\lambda) = \frac{q + \beta_{ac}(\lambda)}{\alpha_{ac}(\lambda)}. \tag{1.10.1} \]
Lemma 1.3. The collection of functions $\varphi = \{\varphi_{ac}\}$ is a $\gamma$-closed multiplicative 2-form with $\gamma = 1$.

Corollary 1.1. Apply to the R-matrix $R(\lambda)$ the gauge transformation (1.4.1) corresponding to the multiplicative 2-form $\varphi^{-1}$. Then the coefficients of the transformed matrix satisfy the equation
$$\alpha_{ac} = q + \beta_{ac} \tag{1.10.2}$$
for all $a, c$.

Proof of Lemma 1.3. The equation $\varphi_{ac}\varphi_{ca} = 1$ follows from (1.8.12). The equation $d_\gamma \varphi = 0$ is a direct corollary of (1.8.6) and (1.8.7).

1.11. Proof of Theorem 1.2. Let $R(\lambda)$ be a quantum dynamical R-matrix of Hecke type with parameters $p, q$ such that $p = q$. Using gauge transformations (1.4.3) and (1.4.4) we can make step $\gamma = 1$ and $p = q = 1$. By Lemma 1.3 we may assume that $\alpha_{ac}(\lambda) = 1 + \beta_{ac}(\lambda)$ for all $a \neq c$. By (1.8.2) we have $\beta_{ac}(\lambda) = -\beta_{ca}(\lambda)$ for all $a \neq c$. Fix $a, c$, $a \neq c$, and solve Eqs. (1.8.4)$_{ac}$, (1.8.5)$_{ac}$, (1.8.4)$_{ca}$, (1.8.5)$_{ca}$.
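Lemma 1.4. Any solution $\beta_{ac}(\lambda)$, $\beta_{ca}(\lambda)$ of Eqs. (1.8.4)$_{ac}$, (1.8.5)$_{ac}$, (1.8.4)$_{ca}$, (1.8.5)$_{ca}$ has one of the following two forms.
1. $\beta_{ac} = \beta_{ca} = 0$.
2.
$$\beta_{ac}(\lambda) = \frac{1}{\lambda_{ac} - \mu_{ac}}, \qquad \beta_{ca}(\lambda) = \frac{1}{\lambda_{ca} - \mu_{ca}}, \tag{1.11.1}$$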
where $\mu_{ac} = -\mu_{ca}$ and $\mu_{ac}(\lambda)$ is a meromorphic function periodic with respect to shifts of $\lambda$ by $\omega_a$ and $\omega_c$, $\mu_{ac}(\lambda - \omega_a) = \mu_{ac}(\lambda - \omega_c) = \mu_{ac}(\lambda)$.

Proof. It is easy to see that $\beta_{ac}(\lambda) = \beta_{ca}(\lambda) \equiv 0$ is a solution. Now assume that $\beta_{ac} = -\beta_{ca} \neq 0$. Then (1.8.5)$_{ac}$ gives
$$\frac{1}{\beta_{ac}(\lambda)} - \frac{1}{\beta_{ac}(\lambda - \omega_a)} = 1,$$
and (1.8.5)$_{ca}$ gives
$$\frac{1}{\beta_{ac}(\lambda)} - \frac{1}{\beta_{ac}(\lambda - \omega_c)} = -1.$$
Let $\mu_{ac}(\lambda) = \lambda_{ac} - 1/\beta_{ac}(\lambda)$. Since the shift by $\omega_a$ lowers $\lambda_{ac}$ by 1 and the shift by $\omega_c$ raises it by 1, we get $\mu_{ac}(\lambda - \omega_a) = \mu_{ac}(\lambda)$ and $\mu_{ac}(\lambda - \omega_c) = \mu_{ac}(\lambda)$. Hence
$$\beta_{ac}(\lambda) = \frac{1}{\lambda_{ac} - \mu_{ac}},$$
where $\mu_{ac}(\lambda)$ is a meromorphic function periodic in $\omega_a$ and $\omega_c$. Similarly,
$$\beta_{ca}(\lambda) = \frac{1}{\lambda_{ca} - \mu_{ca}},$$
where $\mu_{ca}(\lambda)$ is a function periodic in $\omega_a$ and $\omega_c$. We have $\mu_{ac} = -\mu_{ca}$ since $\beta_{ac} = -\beta_{ca}$. It is easy to see that these functions $\beta_{ac}$ and $\beta_{ca}$ solve Eqs. (1.8.4)$_{ac}$ and (1.8.4)$_{ca}$. The lemma is proved.

Equation (1.8.7) shows that the function $\beta_{ac}(\lambda)$, and hence the function $\mu_{ac}(\lambda)$, is periodic with respect to shifts of $\lambda$ by $\omega_b$ for any $b$ different from $a$ and $c$.

Consider Eq. (1.8.9)$_{abc}$ on functions $\beta_{ab}(\lambda)$, $\beta_{bc}(\lambda)$, $\beta_{ac}(\lambda)$. It is easy to see that if one of these three functions is identically equal to zero, then there is another function in this triple which is identically equal to zero.

Introduce a relation on the set $\{1, \dots, N\}$. For any $a \in \{1, \dots, N\}$, let $a$ be related to $a$. For any $a, b \in \{1, \dots, N\}$, $a \neq b$, let $a$ be related to $b$ if the function $\beta_{ab}(\lambda)$ is not identically equal to zero. It is easy to see that this is an equivalence relation. Let $Y \subset \{1, \dots, N\}$ be the union of all the equivalence classes containing more than one element. Let $Y = Y_1 \cup \dots \cup Y_n$ be its decomposition into equivalence classes.

If pairwise distinct $a, b, c \in \{1, \dots, N\}$ do not belong to the same equivalence class, then at least two of the three functions $\beta_{ab}(\lambda)$, $\beta_{bc}(\lambda)$, $\beta_{ac}(\lambda)$ are identically equal to zero. Hence this triple of functions satisfies Eq. (1.8.9)$_{abc}$. If all three elements $a, b, c$ belong to the same equivalence class, then Eq. (1.8.9)$_{abc}$ takes the form
$$\frac{1}{\lambda_{cb} - \mu_{cb}}\,\frac{1}{\lambda_{ac} - \mu_{ac}} + \frac{1}{\lambda_{ab} - \mu_{ab}}\,\frac{1}{\lambda_{bc} - \mu_{bc}} = \frac{1}{\lambda_{ac} - \mu_{ac}}\,\frac{1}{\lambda_{ab} - \mu_{ab}}.$$
This implies that $\mu_{ac}(\lambda) = \mu_{ab}(\lambda) + \mu_{bc}(\lambda)$. Therefore there exists a 1-quasiconstant meromorphic map $\mu : \mathfrak{h}^* \to \mathfrak{h}^*$ such that $\mu_{ac}(\lambda) = x_a(\mu(\lambda)) - x_c(\mu(\lambda))$ for all $a, c$ such
that $\mu_{ac}(\lambda)$ is not identically equal to zero. It is easy to see that if the functions $\mu_{ab}(\lambda)$ have this property then Eqs. (1.8.8) and (1.8.10) are also satisfied.

Let $\sigma$ be a permutation of $\{1, \dots, N\}$ which transforms the set $Y$ and the decomposition $Y = Y_1 \cup \dots \cup Y_n$ into a set $X \subset \{1, \dots, N\}$ and its decomposition into disjoint intervals $X = X_1 \cup \dots \cup X_n$. Apply to the R-matrix $R(\lambda)$ the gauge transformation (1.4.2) corresponding to the permutation $\sigma$. Then the transformed R-matrix will have form (1.5.1) corresponding to the constructed decomposition $X = X_1 \cup \dots \cup X_n$. Theorem 1.2 is proved.

1.12. Proof of Theorem 1.3. Let $R(\lambda)$ be a quantum dynamical R-matrix of Hecke type with parameters $p, q$ such that $p \neq q$. Using gauge transformations (1.4.3) and (1.4.4) we can make step $\gamma = 1$ and $p = 1$. Fix a number $\varepsilon$ such that $q = e^{\varepsilon}$. By Lemma 1.3 we may assume that $\alpha_{ac}(\lambda) = q + \beta_{ac}(\lambda)$ for all $a \neq c$. By (1.8.2) we have $\beta_{ca}(\lambda) = 1 - q - \beta_{ac}(\lambda)$ for all $a \neq c$. Fix $a, c$, $a \neq c$, and solve Eqs. (1.8.4)$_{ac}$, (1.8.5)$_{ac}$, (1.8.4)$_{ca}$, (1.8.5)$_{ca}$.

Lemma 1.5. Any solution $\beta_{ac}(\lambda)$, $\beta_{ca}(\lambda)$ of Eqs. (1.8.4)$_{ac}$, (1.8.5)$_{ac}$, (1.8.4)$_{ca}$, (1.8.5)$_{ca}$ has one of the following two forms.
1. $\beta_{ac} = 0$, $\beta_{ca} = 1 - q$ or $\beta_{ca} = 0$, $\beta_{ac} = 1 - q$.
2.
$$\beta_{ac}(\lambda) = \frac{e^{\varepsilon} - 1}{\mu_{ac}(\lambda)\,e^{\varepsilon\lambda_{ac}} - 1}, \qquad \beta_{ca}(\lambda) = \frac{e^{\varepsilon} - 1}{\mu_{ca}(\lambda)\,e^{\varepsilon\lambda_{ca}} - 1}, \tag{1.12.1}$$
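where $\mu_{ac}(\lambda)\,\mu_{ca}(\lambda) = 1$ and $\mu_{ac}(\lambda)$ is a meromorphic function periodic with respect to shifts of $\lambda$ by $\omega_a$ and $\omega_c$, $\mu_{ac}(\lambda - \omega_a) = \mu_{ac}(\lambda - \omega_c) = \mu_{ac}(\lambda)$.

Proof. Equation (1.8.4)$_{ac}$ can be written in the form
$$(q + \beta_{ac}(\lambda - \omega_a))\,(1 - \beta_{ac}(\lambda - \omega_a))\,\beta_{ac}(\lambda) = (1 - \beta_{ac}(\lambda - \omega_a))\,\beta_{ac}(\lambda - \omega_a).$$
Hence $\beta_{ac}(\lambda) \equiv 1$ or
$$(q + \beta_{ac}(\lambda - \omega_a))\,\beta_{ac}(\lambda) = \beta_{ac}(\lambda - \omega_a). \tag{1.12.2}$$
The function $\beta_{ac}(\lambda)$ cannot be identically equal to 1. In fact, if $\beta_{ac}(\lambda) \equiv 1$, then Eq. (1.8.4)$_{ca}$ gives $0 = -q(1 + q)$, which is impossible since we always assume that $-q \neq p$. Equation (1.12.2)$_{ac}$ has constant solutions $\beta_{ac}(\lambda) = 0$ or $\beta_{ac}(\lambda) = 1 - q$, which correspond to the first statement of the lemma.

Now assume that $\beta_{ac}(\lambda)$ is not constant. Introduce a new meromorphic function $y_{ac}(\lambda) = (\beta_{ac}(\lambda) + q - 1)/\beta_{ac}(\lambda)$. It is easy to see that $y_{ac}(\lambda)\,y_{ca}(\lambda) = 1$. Now Eqs. (1.12.2)$_{ac}$, (1.12.2)$_{ca}$ can be written as
$$y_{ac}(\lambda) = q\,y_{ac}(\lambda - \omega_a), \qquad y_{ac}(\lambda) = q^{-1}\,y_{ac}(\lambda - \omega_c). \tag{1.12.3}$$
Set $\mu_{ac}(\lambda) = y_{ac}(\lambda)\,e^{-\varepsilon\lambda_{ac}}$. Then the function $\mu_{ac}(\lambda)$ is periodic with respect to shifts of $\lambda$ by $\omega_a$ and $\omega_c$. We have $\mu_{ac}(\lambda)\,\mu_{ca}(\lambda) = 1$. Returning to the functions $\beta_{ac}(\lambda)$ and $\beta_{ca}(\lambda)$ we get the second type of solutions. The lemma is proved.

Equation (1.8.7) shows that the function $\beta_{ac}(\lambda)$, and hence the function $\mu_{ac}(\lambda)$, is periodic with respect to shifts of $\lambda$ by $\omega_b$ for any $b$ different from $a$ and $c$.

If the function $\beta_{ac}(\lambda)$ has form (1.12.1), then we say that the function $\mu_{ac}(\lambda)$ is finite. If $\beta_{ac}(\lambda) = 1 - q$, then we say that $\mu_{ac}(\lambda) = 0$. If $\beta_{ac}(\lambda) = 0$, then we say that $\mu_{ac}(\lambda) = \infty$. If $\mu_{ac}(\lambda) = 0$, then $\mu_{ca}(\lambda) = \infty$. If $\mu_{ac}(\lambda) = \infty$, then $\mu_{ca}(\lambda) = 0$.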
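The second family of solutions in Lemma 1.5 can be spot-checked numerically. The sketch below is illustrative only: it assumes $q = e^{\varepsilon}$, a constant $\mu_{ac}$ (the simplest periodic function), $\alpha = q + \beta$ as in (1.10.2), and the variable $w = \mu_{ac} e^{\varepsilon\lambda_{ac}}$, so that the shift $\lambda \mapsto \lambda - \omega_a$ becomes $w \mapsto w/q$. The function names and sample values are hypothetical.

```python
import cmath

def beta(w, q):
    """Second-type solution of Lemma 1.5 in the variable w = mu_ac * exp(eps * lambda_ac)."""
    return (q - 1) / (w - 1)

def check_lemma_1_5(eps=0.3 + 0.2j, mu=1.7 - 0.4j, lam_ac=0.9 + 0.1j):
    """Return absolute residuals of (1.8.2), (1.12.2), (1.8.4)_ac, (1.8.5)_ac."""
    q = cmath.exp(eps)
    w = mu * cmath.exp(eps * lam_ac)      # value at lambda
    w_shift = w / q                        # lambda - omega_a lowers lambda_ac by 1
    b, b_shift = beta(w, q), beta(w_shift, q)
    # beta_ca uses mu_ca = 1/mu_ac and lambda_ca = -lambda_ac, i.e. w -> 1/w
    b_ca, b_ca_shift = beta(1 / w, q), beta(1 / w_shift, q)
    a_ac_shift = q + b_shift               # alpha_ac(lambda - omega_a)
    a_ca_shift = q + b_ca_shift            # alpha_ca(lambda - omega_a)
    residuals = {
        "(1.8.2)": b + b_ca - (1 - q),
        "(1.12.2)": (q + b_shift) * b - b_shift,
        "(1.8.4)": a_ca_shift * b * a_ac_shift + b_shift ** 2 - b_shift,
        "(1.8.5)": b_ca_shift * b * a_ac_shift + a_ac_shift * b_shift - b * a_ac_shift,
    }
    return {name: abs(value) for name, value in residuals.items()}

if __name__ == "__main__":
    for name, size in check_lemma_1_5().items():
        print(name, size)
```

All four residuals should vanish up to rounding error; a nonconstant periodic $\mu_{ac}$ would behave the same way, since only its invariance under the two shifts is used.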
For pairwise distinct $a, b, c$, we shall say that the equation
$$\mu_{ab}(\lambda)\,\mu_{bc}(\lambda) = \mu_{ac}(\lambda) \tag{1.12.4}$$
holds if one of the following four conditions is satisfied.
(1.12.5) All three functions $\mu_{ab}(\lambda)$, $\mu_{bc}(\lambda)$, $\mu_{ac}(\lambda)$ are finite and satisfy (1.12.4).
(1.12.6) $\mu_{ac}(\lambda) = \infty$ and at least one of the functions $\mu_{ab}(\lambda)$, $\mu_{bc}(\lambda)$ is equal to $\infty$.
(1.12.7) $\mu_{ac}(\lambda) = 0$ and at least one of the functions $\mu_{ab}(\lambda)$, $\mu_{bc}(\lambda)$ is equal to 0.
(1.12.8) $\mu_{ac}(\lambda)$ is finite, one of the functions $\mu_{ab}(\lambda)$, $\mu_{bc}(\lambda)$ is equal to zero and the other is equal to infinity.
Lemma 1.6. For any pairwise distinct $a, b, c$, Eq. (1.12.4) holds.

The lemma easily follows from Eq. (1.8.9). Introduce
$$Y = \{(a, b) \mid a, b \in \{1, \dots, N\},\ a \neq b,\ \mu_{ab} = \infty\}. \tag{1.12.9}$$
Then
(1.12.10) If $(a, b) \in Y$ and $(b, c) \in Y$, then $(a, c) \in Y$.
(1.12.11) If $(a, b)$ belongs to $Y$, then $(b, a)$ does not belong to $Y$.

By Theorem 3.11 in [EV], there exists a permutation $\sigma$ of the numbers $\{1, \dots, N\}$ such that for the new order on $\{1, \dots, N\}$, if $(a, b) \in Y$, then $a < b$. Apply to the R-matrix $R(\lambda)$ the gauge transformation (1.4.2) corresponding to the permutation $\sigma$. Then the set $Y$ defined by (1.12.9) for the transformed R-matrix is such that if $(a, b) \in Y$, then $a < b$. From now on we denote by $R(\lambda)$ the transformed matrix. Let $Z = \{(a, b) \mid a < b\} \setminus Y$.

Lemma 1.7.
1. If $(a, b)$ belongs to $Z$, then all pairs $(c, c + 1)$, $c = a, a + 1, \dots, b - 1$, belong to $Z$.
2. If for some $a, b$, $a < b$, all pairs $(c, c + 1)$ for $c = a, a + 1, \dots, b - 1$ belong to $Z$, then $(a, b)$ belongs to $Z$.

Lemma 1.7 is a special case of Lemma 3.13 in [EV].

Consider the subset $X \subset \{1, \dots, N\}$ of all $a$ such that there exists $b$ with the property that $(a, b)$ or $(b, a)$ belongs to $Z$. Introduce a relation on the set $X$. For any $a \in X$, let $a$ be related to $a$. For any $a, b \in X$, $a < b$, let $a$ be related to $b$ if $(a, b) \in Z$. Lemma 1.7 implies that this relation is an equivalence relation. Let $X = X_1 \cup \dots \cup X_n$ be the decomposition of $X$ into equivalence classes. Lemma 1.7 implies that $X = X_1 \cup \dots \cup X_n$ is a decomposition into a union of disjoint intervals. It is easy to see that the R-matrix $R(\lambda)$ has form (1.6.4) for the constructed decomposition $X = X_1 \cup \dots \cup X_n$. Theorem 1.3 is proved.

1.13. Quantum dynamical R-matrices as an extrapolation of constant quantum R-matrices. Consider the vector representation $V$ of the quantum group $U_q(gl_N)$. Then its R-matrix $R \in \mathrm{End}(V \otimes V)$ has the form
$$R = \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \alpha_{ab}\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}\, E_{ba} \otimes E_{ab}, \tag{1.13.1}$$
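where the numbers $\alpha_{ab}$, $\beta_{ab}$ are defined as follows: $\alpha_{ab} = q$, $\beta_{ab} = 0$ if $a < b$, and $\alpha_{ab} = 1$, $\beta_{ab} = 1 - q$ if $a > b$. The matrix $R$ is a constant solution of the quantum dynamical Yang–Baxter equation (1.1.3).

For any permutation $\sigma$ of the numbers $\{1, \dots, N\}$ we construct a new constant solution, $R_\sigma$, of the quantum Yang–Baxter equation. $R_\sigma$ has form (1.13.1), where the numbers $\alpha_{ab}$, $\beta_{ab}$ are defined by the rule: $\alpha_{ab} = q$, $\beta_{ab} = 0$ if $\sigma(a) < \sigma(b)$, and $\alpha_{ab} = 1$, $\beta_{ab} = 1 - q$ if $\sigma(a) > \sigma(b)$.

Fix a complex number $\varepsilon$ such that $e^{\varepsilon} = q$. Consider the matrix
$$R(\lambda) = \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \alpha_{ab}(\lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(\lambda)\, E_{ba} \otimes E_{ab}, \tag{1.13.2}$$
where the functions $\alpha_{ab}(\lambda)$ and $\beta_{ab}(\lambda)$ are defined by
$$\beta_{ab}(\lambda) = \frac{e^{\varepsilon} - 1}{e^{\varepsilon\lambda_{ab}} - 1}, \qquad \alpha_{ab} = e^{\varepsilon} + \beta_{ab}.$$
The matrix $R(\lambda)$ is the R-matrix of form (1.6.4) corresponding to the data $X = X_1 = \{1, \dots, N\}$. The R-matrix $R(\lambda)$ extrapolates the constant R-matrices $\{R_\sigma\}$ in the following sense. Let $\rho = \left(\frac{N-1}{2}, \frac{N-3}{2}, \dots, \frac{1-N}{2}\right) \in \mathfrak{h}^*$. Let $\sigma(\rho)$ be the vector obtained from $\rho$ by permutation of coordinates by $\sigma$. Then
$$\lim_{t \to +\infty} R\!\left(\frac{t}{\varepsilon}\,\sigma(\rho)\right) = R_\sigma. \tag{1.13.3}$$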
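The limit (1.13.3) is easy to observe numerically for the identity permutation. The following sketch makes two assumptions that are not part of the text: the parameter $\varepsilon$ with $q = e^{\varepsilon}$ is taken real and positive, and the argument of $R$ is taken to be $t\rho/\varepsilon$. Indices are 0-based, and the helper names are hypothetical.

```python
import math

def off_diagonal_pairs(n):
    return [(a, b) for a in range(n) for b in range(n) if a != b]

def dynamical_coeffs(lam, eps):
    """alpha_ab(lambda), beta_ab(lambda) of (1.13.2) for a != b."""
    q = math.exp(eps)
    beta = {(a, b): (q - 1) / (math.exp(eps * (lam[a] - lam[b])) - 1)
            for a, b in off_diagonal_pairs(len(lam))}
    alpha = {ab: q + value for ab, value in beta.items()}
    return alpha, beta

def constant_coeffs(n, eps):
    """alpha_ab, beta_ab of the constant R-matrix (1.13.1)."""
    q = math.exp(eps)
    alpha = {(a, b): (q if a < b else 1.0) for a, b in off_diagonal_pairs(n)}
    beta = {(a, b): (0.0 if a < b else 1.0 - q) for a, b in off_diagonal_pairs(n)}
    return alpha, beta

def limit_defect(n=4, eps=0.5, t=60.0):
    """Largest deviation between R(t * rho / eps) and the constant R-matrix."""
    rho = [(n - 1) / 2 - k for k in range(n)]   # ((N-1)/2, (N-3)/2, ..., (1-N)/2)
    lam = [t * r / eps for r in rho]
    alpha_t, beta_t = dynamical_coeffs(lam, eps)
    alpha_0, beta_0 = constant_coeffs(n, eps)
    return max(max(abs(alpha_t[ab] - alpha_0[ab]), abs(beta_t[ab] - beta_0[ab]))
               for ab in alpha_t)

if __name__ == "__main__":
    for t in (1.0, 5.0, 20.0, 60.0):
        print(t, limit_defect(t=t))
```

The defect decays like $e^{-t}$, because $\varepsilon\lambda_{ab} = t(\rho_a - \rho_b)$ and neighbouring coordinates of $\rho$ differ by 1.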
2. Quantum Dynamical R-matrices with Spectral Parameter

2.1. Definition. Let $\mathfrak{h}$ be an abelian finite dimensional Lie algebra. Let $V_i$, $i = 1, 2, 3$, be finite dimensional diagonalizable $\mathfrak{h}$-modules, $R_{V_iV_j} : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V_i \otimes V_j)$, $1 \le i < j \le 3$, meromorphic functions, and $\gamma$ a nonzero complex number. The equation in $\mathrm{End}(V_1 \otimes V_2 \otimes V_3)$,
$$R^{12}_{V_1V_2}(z_1 - z_2, \lambda - \gamma h^{(3)})\, R^{13}_{V_1V_3}(z_1 - z_3, \lambda)\, R^{23}_{V_2V_3}(z_2 - z_3, \lambda - \gamma h^{(1)}) = R^{23}_{V_2V_3}(z_2 - z_3, \lambda)\, R^{13}_{V_1V_3}(z_1 - z_3, \lambda - \gamma h^{(2)})\, R^{12}_{V_1V_2}(z_1 - z_2, \lambda), \tag{2.1.1}$$
is called the quantum dynamical Yang–Baxter equation with spectral parameter and step $\gamma$ (QDYB equation). In what follows we use the notation $z_{ij} = z_i - z_j$. A function $R_{V_iV_j} : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V_i \otimes V_j)$ is called a function of zero weight if
$$[R_{V_iV_j}(z, \lambda),\ h \otimes 1 + 1 \otimes h] = 0 \tag{2.1.2}$$
for all $h \in \mathfrak{h}$, $z \in \mathbb{C}$, $\lambda \in \mathfrak{h}^*$. A solution $\{R_{V_iV_j}\}_{1 \le i < j \le 3}$ of the QDYB Eq. (2.1.1) is called a solution of zero weight if each of the functions is of zero weight. If all the spaces $V_i$ are equal to a space $V$, then we consider the QDYB equation on one function $R : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$,
$$R^{12}(z_{12}, \lambda - \gamma h^{(3)})\, R^{13}(z_{13}, \lambda)\, R^{23}(z_{23}, \lambda - \gamma h^{(1)}) = R^{23}(z_{23}, \lambda)\, R^{13}(z_{13}, \lambda - \gamma h^{(2)})\, R^{12}(z_{12}, \lambda). \tag{2.1.3}$$
A zero weight function $R$ satisfying the QDYB Eq. (2.1.3) is called a quantum dynamical R-matrix with spectral parameter. An R-matrix is called unitary if it satisfies the unitarity condition
$$R(z, \lambda)\, R^{21}(-z, \lambda) = 1. \tag{2.1.4}$$

2.2. Quantization and quasiclassical limit. Let $x_1, \dots, x_N$ be a basis in $\mathfrak{h}$. The basis defines a linear system of coordinates on $\mathfrak{h}^*$. For any $\lambda \in \mathfrak{h}^*$, set $\lambda_i = x_i(\lambda)$, $i = 1, \dots, N$. Let $R_\gamma : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ be a smooth family of solutions to Eqs. (2.1.3) and (2.1.4) with step $\gamma$ such that
$$R_\gamma(z, \lambda) = 1 - \gamma\, r(z, \lambda) + O(\gamma^2). \tag{2.2.1}$$
Then the function r : C × h∗ → End(V ⊗ V ) satisfies the zero weight condition [r(z, λ), h ⊗ 1 + 1 ⊗ h] = 0
(2.2.2)
for all h ∈ h, z ∈ C, λ ∈ h∗ , the unitarity condition r(z, λ) + r21 (−z, λ) = 0
(2.2.3)
and the classical dynamical Yang–Baxter equation with spectral parameter (CDYB),
$$\sum_{i=1}^{N} x_i^{(1)} \frac{\partial r^{23}}{\partial x_i}(z_{23}, \lambda) + \sum_{i=1}^{N} x_i^{(2)} \frac{\partial r^{31}}{\partial x_i}(z_{31}, \lambda) + \sum_{i=1}^{N} x_i^{(3)} \frac{\partial r^{12}}{\partial x_i}(z_{12}, \lambda) + [r^{12}(z_{12}, \lambda), r^{13}(z_{13}, \lambda)] + [r^{12}(z_{12}, \lambda), r^{23}(z_{23}, \lambda)] + [r^{13}(z_{13}, \lambda), r^{23}(z_{23}, \lambda)] = 0. \tag{2.2.4}$$
A function $r(z, \lambda)$ with properties (2.2.2)–(2.2.4) is called a classical dynamical r-matrix with spectral parameter. The function $r$ in (2.2.1) is called the quasiclassical limit of $R$, and the function $R$ is called a quantization of $r$.

Let $U \subset \mathfrak{h}^*$ be an open set, and let $R : \mathbb{C} \times U \to \mathrm{End}(V \otimes V)$ be a zero weight meromorphic function on $\mathbb{C} \times U$. We will say that $R$ is a quantum dynamical R-matrix with spectral parameter on $\mathbb{C} \times U$ if the QDYB equation with spectral parameter is satisfied for $R$ whenever it makes sense. A classical dynamical r-matrix $r(z, \lambda)$ with spectral parameter on $\mathbb{C} \times U$ is called quantizable if there exists a power series in $\gamma$,
$$R_\gamma(z, \lambda) = 1 - \gamma\, r(z, \lambda) + \sum_{n=2}^{\infty} \gamma^n r_n(z, \lambda), \tag{2.2.5}$$
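convergent for small $|\gamma|$ for any fixed $(z, \lambda) \in \mathbb{C} \times U$, such that $R_\gamma(z, \lambda)$ is a quantum dynamical R-matrix on $\mathbb{C} \times U$ with spectral parameter and step $\gamma$.

2.3. R-matrices of $gl_N$ type. Let $\mathfrak{h}$ be an abelian Lie algebra of dimension $N$. Let $V$ be a diagonalizable $\mathfrak{h}$-module of the same dimension such that its weights $\omega_1, \dots, \omega_N$ form a basis in $\mathfrak{h}^*$. Let $x_1, \dots, x_N$ be the dual basis of $\mathfrak{h}$. Let $v_1, \dots, v_N$ be an eigenbasis for $\mathfrak{h}$ in $V$ such that $x_i v_j = \delta_{ij} v_j$. Then the $\mathfrak{h}$-module $V \otimes V$ has the weight decomposition
$$V \otimes V = \bigoplus_{a=1}^{N} V_{aa} \oplus \bigoplus_{a<b} V_{ab}, \tag{2.3.1}$$
where $V_{aa} = \mathbb{C}\, v_a \otimes v_a$ and $V_{ab} = \mathbb{C}\, v_a \otimes v_b \oplus \mathbb{C}\, v_b \otimes v_a$.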
A quantum dynamical R-matrix with spectral parameter, $R : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$, for these $\mathfrak{h}$ and $V$ will be called an R-matrix of $gl_N$ type. The zero weight condition implies that the R-matrix preserves the weight decomposition (2.3.1) and has the form
$$R(z, \lambda) = \sum_{a,b=1}^{N} \alpha_{ab}(z, \lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, E_{ba} \otimes E_{ab}, \tag{2.3.2}$$
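where $\alpha_{ab}, \beta_{ab} : \mathbb{C} \times \mathfrak{h}^* \to \mathbb{C}$ are suitable meromorphic functions.

2.4. Gauge transformations. Fix a nonzero complex number $\gamma$. Let $\psi : \mathfrak{h}^* \to \mathbb{C}$ be a function. For any $a, b = 1, \dots, N$, set
$$\partial_a \psi(\lambda) = \psi(\lambda) - \psi(\lambda - \omega_a),$$
$$L_{ab}\psi(\lambda) = \partial_a \psi(\lambda) - \partial_b \psi(\lambda - \omega_a) = \psi(\lambda) - 2\psi(\lambda - \omega_a) + \psi(\lambda - \omega_a - \omega_b).$$
Introduce gauge transformations of quantum dynamical R-matrices, $R : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$, of type (2.3.2) with step $\gamma$.

(2.4.1) Let $\psi$ be a meromorphic function on $\mathfrak{h}^*$. Set
$$R(z, \lambda) \mapsto \sum_{a,b=1}^{N} e^{z\,\partial_a \partial_b \psi(\lambda)}\, \alpha_{ab}(z, \lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} e^{z\,L_{ab}\psi(\lambda)}\, \beta_{ab}(z, \lambda)\, E_{ba} \otimes E_{ab}.$$

(2.4.2) Let $\{\varphi_{ab}\}$ be a meromorphic $\gamma$-closed multiplicative 2-form on $\mathfrak{h}^*$. Set
$$R(z, \lambda) \mapsto \sum_{a=1}^{N} \alpha_{aa}(z, \lambda)\, E_{aa} \otimes E_{aa} + \sum_{a \neq b} \varphi_{ab}(\lambda)\, \alpha_{ab}(z, \lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, E_{ba} \otimes E_{ab}.$$

(2.4.3) Let the symmetric group $S_N$ act on $\mathfrak{h}^*$ and $V$ by permutation of coordinates. For any permutation $\sigma \in S_N$, set
$$R(z, \lambda) \mapsto (\sigma \otimes \sigma)\, R(z, \sigma^{-1} \cdot \lambda)\, (\sigma^{-1} \otimes \sigma^{-1}).$$

(2.4.4) For a nonzero holomorphic scalar function $c(z)$, set
$$R(z, \lambda) \mapsto c(z)\, R(z, \lambda).$$

(2.4.5) For nonzero complex numbers $b, c$ and an element $\mu \in \mathfrak{h}^*$, set
$$R(z, \lambda) \mapsto R(bz, c\lambda + \mu).$$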
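The two expressions given for $L_{ab}\psi$ in Sect. 2.4 can be compared on a test function. The sketch below is only an illustration: it represents $\lambda$ by its coordinate tuple $(\lambda_1, \dots, \lambda_N)$, takes the shift by $\omega_a$ to lower the $a$-th coordinate by 1 as the formulas are written, and uses an arbitrary sample function; all names are hypothetical.

```python
import math

def shift(lam, index, step=1.0):
    """Coordinates of lambda - step * omega_index (0-based index)."""
    return tuple(x - step if i == index else x for i, x in enumerate(lam))

def diff(psi, a):
    """First difference: (d_a psi)(lambda) = psi(lambda) - psi(lambda - omega_a)."""
    return lambda lam: psi(lam) - psi(shift(lam, a))

def L_nested(psi, a, b):
    """L_ab psi written as d_a psi(lambda) - d_b psi(lambda - omega_a)."""
    d_a, d_b = diff(psi, a), diff(psi, b)
    return lambda lam: d_a(lam) - d_b(shift(lam, a))

def L_expanded(psi, a, b):
    """L_ab psi written as psi(l) - 2 psi(l - omega_a) + psi(l - omega_a - omega_b)."""
    return lambda lam: (psi(lam) - 2 * psi(shift(lam, a))
                        + psi(shift(shift(lam, a), b)))

def sample_psi(lam):
    """Arbitrary smooth test function of three coordinates."""
    return lam[0] ** 3 * lam[1] + math.sin(lam[2]) * lam[0]

if __name__ == "__main__":
    point = (0.3, 1.2, -0.7)
    for a in range(3):
        for b in range(3):
            gap = L_nested(sample_psi, a, b)(point) - L_expanded(sample_psi, a, b)(point)
            print(a, b, abs(gap))
```

The same helpers show that the mixed difference $\partial_a\partial_b\psi$ entering (2.4.1) is symmetric in $a$ and $b$, since `diff(diff(psi, a), b)` and `diff(diff(psi, b), a)` agree.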
It is clear that any gauge transformation of type (2.4.3) transforms a (unitary) quantum dynamical R-matrix with spectral parameter and step $\gamma$ to a (unitary) quantum dynamical R-matrix with spectral parameter and step $\gamma$. Any gauge transformation of type (2.4.4) transforms a quantum dynamical R-matrix with spectral parameter and step $\gamma$ to a quantum dynamical R-matrix with spectral parameter and step $\gamma$. If in addition we have $c(z)\,c(-z) = 1$, then the gauge transformation of type (2.4.4) transforms a unitary quantum dynamical R-matrix with spectral parameter and step $\gamma$ to a unitary quantum dynamical R-matrix with spectral parameter and step $\gamma$. Any gauge transformation of type (2.4.5) transforms a (unitary) quantum dynamical R-matrix with spectral parameter and step $\gamma$ to a (unitary) quantum dynamical R-matrix with spectral parameter and step $\gamma/c$.

Theorem 2.1. Any gauge transformation of type (2.4.1) or (2.4.2) transforms a quantum dynamical R-matrix with spectral parameter and step $\gamma$ to a quantum dynamical R-matrix with spectral parameter and step $\gamma$. Moreover, if the initial quantum dynamical R-matrix is unitary, then the transformed R-matrix is unitary.

Theorem 2.1 is analogous to Theorem 1.1 and is also proved by direct verification. Namely, in order to prove Theorem 2.1 it is enough to write the QDYB Eq. (2.1.3) in coordinates, as was done for Eq. (1.1.3) in Sect. 1.8, and then check that if functions $\alpha_{ab}(z, \lambda)$ and $\beta_{ab}(z, \lambda)$ form a solution of the coordinate equations, then the transformed functions also form a solution.

Two R-matrices $R : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ and $R' : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ will be called equivalent if one of them can be transformed into the other by a sequence of gauge transformations.

2.5. Examples.

The elliptic R-matrix. Fix a point $\tau$ in the upper half plane and a complex number $\gamma$. Let
$$\theta(z, \tau) = -\sum_{j \in \mathbb{Z} + \frac{1}{2}} e^{\pi i j^2 \tau + 2\pi i j (z + \frac{1}{2})}$$
be Jacobi's first theta function. Let $\mathfrak{h}$ be the Cartan subalgebra of $gl_N$. It is the abelian Lie algebra of diagonal complex $N \times N$ matrices with the standard basis $x_i = \mathrm{diag}(0, \dots, 0, 1, 0, \dots, 0)$ (with 1 in the $i$-th place), $i = 1, \dots, N$. Its dual space $\mathfrak{h}^*$ has the dual basis $\omega_i$. The vector representation of $gl_N$ is $V = \mathbb{C}^N$ with the standard basis $v_1, \dots, v_N$, $x_i v_j = \delta_{ij} v_j$.

Let $R^{ell}_{\gamma,\tau}(z, \lambda) \in \mathrm{End}(V \otimes V)$ be the R-matrix of the elliptic quantum group $E_{\tau,\gamma/2}(sl_N)$ [F1-2, FV2]. It is a function of the spectral parameter $z \in \mathbb{C}$ and an additional variable $\lambda = (\lambda_1, \dots, \lambda_N) \in \mathfrak{h}^*$. It is a solution of the QDYB Eq. (2.1.3) and satisfies the unitarity condition (2.1.4) [F1-2]. The formula for $R^{ell}_{\gamma,\tau}$ is
$$R^{ell}_{\gamma,\tau}(z, \lambda) = \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \alpha(z, \lambda_{ab})\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta(z, \lambda_{ab})\, E_{ba} \otimes E_{ab}, \tag{2.5.1}$$
where $\lambda_{ab} = \lambda_a - \lambda_b$ and the functions $\alpha$, $\beta$ are ratios of theta functions:
$$\alpha(z, \lambda) = \frac{\theta(\lambda + \gamma, \tau)\,\theta(z, \tau)}{\theta(\lambda, \tau)\,\theta(z - \gamma, \tau)}, \qquad \beta(z, \lambda) = \frac{\theta(z - \lambda, \tau)\,\theta(\gamma, \tau)}{\theta(z - \gamma, \tau)\,\theta(\lambda, \tau)}. \tag{2.5.2}$$
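For readers who want to experiment with (2.5.2), the theta function can be evaluated by truncating its defining series. The sketch below is a minimal illustration rather than a production implementation: the truncation length `terms` and the sample values of $z$ and $\tau$ are arbitrary choices. It checks that $\theta$ is odd in $z$, that $\theta(z + 1) = -\theta(z)$, and that for large $\mathrm{Im}\,\tau$ the series is dominated by the two terms $j = \pm\frac{1}{2}$, which sum to $2e^{\pi i \tau/4}\sin(\pi z)$.

```python
import cmath

def theta(z, tau, terms=8):
    """Truncated series for Jacobi's first theta function, j = -(terms - 1/2), ..., terms - 1/2."""
    total = 0j
    for k in range(-terms, terms):
        j = k + 0.5
        total += cmath.exp(cmath.pi * 1j * j * j * tau
                           + 2 * cmath.pi * 1j * j * (z + 0.5))
    return -total

def leading_term(z, tau):
    """Contribution of j = +1/2 and j = -1/2: 2 * exp(pi i tau / 4) * sin(pi z)."""
    return 2 * cmath.exp(cmath.pi * 1j * tau / 4) * cmath.sin(cmath.pi * z)

if __name__ == "__main__":
    z, tau = 0.3 + 0.1j, 3j
    print("oddness defect:      ", abs(theta(-z, tau) + theta(z, tau)))
    print("theta(z+1) + theta(z):", abs(theta(z + 1, tau) + theta(z, tau)))
    print("relative gap to leading term:",
          abs(theta(z, tau) - leading_term(z, tau)) / abs(leading_term(z, tau)))
```

With $q = e^{2\pi i\tau}$ the leading term is $2q^{1/8}\sin(\pi z)$, and the relative gap printed last is of order $|q|$.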
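Trigonometric R-matrices. Let $X \subset \{1, \dots, N\}$ be a subset, $X = X_1 \cup \dots \cup X_n$ its decomposition into disjoint intervals. For any $a, b \in \{1, \dots, N\}$, $a \neq b$, we introduce functions $\alpha_{ab}, \beta_{ab} : \mathbb{C} \times \mathfrak{h}^* \to \mathbb{C}$. If $a, b \in X_k$ for some $k$, then we set
$$\alpha_{ab}(z, \lambda) = \frac{\sin(\lambda_{ab} + \gamma)\,\sin(z)}{\sin(\lambda_{ab})\,\sin(z - \gamma)}, \qquad \beta_{ab}(z, \lambda) = \frac{\sin(z - \lambda_{ab})\,\sin(\gamma)}{\sin(\lambda_{ab})\,\sin(z - \gamma)}. \tag{2.5.3}$$
Otherwise we set
$$\alpha_{ab}(z, \lambda) = e^{-i\gamma}\,\frac{\sin(z)}{\sin(z - \gamma)}, \qquad \beta_{ab}(z, \lambda) = -e^{iz}\,\frac{\sin(\gamma)}{\sin(z - \gamma)} \tag{2.5.4}$$
if $a < b$, and
$$\alpha_{ab}(z, \lambda) = e^{i\gamma}\,\frac{\sin(z)}{\sin(z - \gamma)}, \qquad \beta_{ab}(z, \lambda) = -e^{-iz}\,\frac{\sin(\gamma)}{\sin(z - \gamma)} \tag{2.5.5}$$
if $a > b$. Define a function $R^{trig}_{\cup X_k, \gamma} : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ by
$$R^{trig}_{\cup X_k, \gamma}(z, \lambda) = \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \alpha_{ab}(z, \lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, E_{ba} \otimes E_{ab}, \tag{2.5.6}$$
where $\alpha_{ab}$ and $\beta_{ab}$ are defined by (2.5.3)–(2.5.5).

Rational R-matrices. Let $X \subset \{1, \dots, N\}$ be a subset, $X = X_1 \cup \dots \cup X_n$ its decomposition into disjoint intervals. For any $a, b \in \{1, \dots, N\}$, $a \neq b$, we introduce functions $\alpha_{ab}, \beta_{ab} : \mathbb{C} \times \mathfrak{h}^* \to \mathbb{C}$. If $a, b \in X_k$ for some $k$, then we set
$$\alpha_{ab}(z, \lambda) = \frac{(\lambda_{ab} + \gamma)\, z}{\lambda_{ab}\,(z - \gamma)}, \qquad \beta_{ab}(z, \lambda) = \frac{(z - \lambda_{ab})\, \gamma}{\lambda_{ab}\,(z - \gamma)}. \tag{2.5.7}$$
Otherwise we set
$$\alpha_{ab}(z, \lambda) = \frac{z}{z - \gamma}, \qquad \beta_{ab}(z, \lambda) = -\frac{\gamma}{z - \gamma}. \tag{2.5.8}$$
Define a function $R^{rat}_{\cup X_k, \gamma} : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ by
$$R^{rat}_{\cup X_k, \gamma}(z, \lambda) = \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \alpha_{ab}(z, \lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, E_{ba} \otimes E_{ab}, \tag{2.5.9}$$
where $\alpha_{ab}$ and $\beta_{ab}$ are defined by (2.5.7)–(2.5.8).

Theorem 2.2. For any subset $X \subset \{1, \dots, N\}$ and its decomposition $X = X_1 \cup \dots \cup X_n$ into disjoint intervals, the functions $R^{trig}_{\cup X_k, \gamma}$ and $R^{rat}_{\cup X_k, \gamma}$ are zero weight solutions of the QDYB Eq. (2.1.3) satisfying the unitarity condition (2.1.4).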
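The unitarity condition (2.1.4) can be tested directly on a two-dimensional block. The sketch below is a numerical illustration under stated assumptions: it restricts to the span of $v_a \otimes v_b$ and $v_b \otimes v_a$ with $a, b$ in the same interval $X_k$, uses $\lambda_{ba} = -\lambda_{ab}$, and takes $R^{21} = P R P$ with $P$ the flip of tensor factors. Function names and sample values are hypothetical.

```python
import cmath

def rational(z, lam, gamma):
    """(alpha_ab, beta_ab) of (2.5.7) as functions of z and lam = lambda_ab."""
    return ((lam + gamma) * z / (lam * (z - gamma)),
            (z - lam) * gamma / (lam * (z - gamma)))

def trigonometric(z, lam, gamma):
    """(alpha_ab, beta_ab) of (2.5.3) as functions of z and lam = lambda_ab."""
    s = cmath.sin
    return (s(lam + gamma) * s(z) / (s(lam) * s(z - gamma)),
            s(z - lam) * s(gamma) / (s(lam) * s(z - gamma)))

def block(coeffs, z, lam, gamma):
    """Matrix of R(z, lambda) in the basis (v_a x v_b, v_b x v_a); columns are images."""
    a_ab, b_ab = coeffs(z, lam, gamma)
    a_ba, b_ba = coeffs(z, -lam, gamma)       # lambda_ba = -lambda_ab
    return [[a_ab, b_ba], [b_ab, a_ba]]

def unitarity_defect(coeffs, z, lam, gamma):
    """Largest entry of R(z) R21(-z) - 1 on the block."""
    m = block(coeffs, z, lam, gamma)
    n = block(coeffs, -z, lam, gamma)
    # Conjugating by the flip P swaps both rows and columns of the block.
    n21 = [[n[1][1], n[1][0]], [n[0][1], n[0][0]]]
    prod = [[sum(m[i][k] * n21[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    z, lam, gamma = 0.37 + 0.2j, 1.1 - 0.3j, 0.25 + 0.1j
    print("rational:     ", unitarity_defect(rational, z, lam, gamma))
    print("trigonometric:", unitarity_defect(trigonometric, z, lam, gamma))
```

Both defects should vanish up to rounding; in the trigonometric case this reduces to the identity $\sin(x + y)\sin(x - y) = \sin^2 x - \sin^2 y$.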
Proof. According to [F1-2] the elliptic R-matrix $R^{ell}_{\gamma,\tau}$ is a zero weight solution of the QDYB Eq. (2.1.3) satisfying the unitarity condition (2.1.4). If $q = e^{2\pi i \tau} \to 0$, then $\theta(z) \sim 2q^{1/8}\sin(\pi z)$. These two facts show that the R-matrix $R_0(z, \lambda)$ of the form (2.3.2), with
$$\alpha_{ab}(z, \lambda) = \frac{\sin(\lambda_{ab} + \gamma)\,\sin(z)}{\sin(\lambda_{ab})\,\sin(z - \gamma)}, \qquad \beta_{ab}(z, \lambda) = \frac{\sin(z - \lambda_{ab})\,\sin(\gamma)}{\sin(\lambda_{ab})\,\sin(z - \gamma)}$$
for all $a \neq b$ and $\alpha_{aa} \equiv 1$ for all $a$, is a zero weight solution of the QDYB Eq. (2.1.3) satisfying the unitarity condition (2.1.4). For any fixed $d \in \mathfrak{h}^*$, the R-matrix $R_0(z, \lambda + d)$ is also a zero weight solution of the QDYB Eq. (2.1.3) satisfying the unitarity condition (2.1.4).

Fix a subset $X \subset \{1, \dots, N\}$ and its decomposition $X = X_1 \cup \dots \cup X_n$ into disjoint intervals. It is easy to see that there exists a sequence of elements $d_i \in \mathfrak{h}^*$, $i = 1, 2, \dots$, such that the R-matrix $R_0(z, \lambda + d_i)$ has a limit when $i$ tends to infinity, and this limit is equal to $R^{trig}_{\cup X_k, \gamma}(z, \lambda)$. This observation shows that $R^{trig}_{\cup X_k, \gamma}(z, \lambda)$ is a zero weight solution of the QDYB Eq. (2.1.3) satisfying the unitarity condition (2.1.4).

Rescale the R-matrix $R^{trig}_{\cup X_k, \gamma}(z, \lambda)$ and consider the matrix $R_\varepsilon(z, \lambda) = R^{trig}_{\cup X_k, \varepsilon\gamma}(\varepsilon z, \varepsilon\lambda)$, where $\varepsilon$ is a new parameter. Let $\gamma, z, \lambda$ be fixed and let $\varepsilon$ tend to 0. Then the limit of $R_\varepsilon(z, \lambda)$ is equal to $R^{rat}_{\cup X_k, \gamma}(z, \lambda)$. Hence, $R^{rat}_{\cup X_k, \gamma}(z, \lambda)$ is a zero weight solution of the QDYB Eq. (2.1.3) satisfying the unitarity condition (2.1.4). Theorem 2.2 is proved.

2.6. Quantization of classical dynamical r-matrices of $gl_N$ type with spectral parameter. Let $V$ be the $N$-dimensional $\mathfrak{h}$-module considered in Sect. 2.3. Let $r : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ be a zero weight meromorphic function satisfying CDYB (2.2.4) and the unitarity condition (2.2.3). The zero weight condition implies that $r$ has the form
$$r(z, \lambda) = \sum_{a,b=1}^{N} \alpha_{ab}(z, \lambda)\, E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, E_{ab} \otimes E_{ba}. \tag{2.6.1}$$
Assume that the function $r$ also satisfies the residue condition
$$\mathrm{Res}_{z=0}\, r(z, \lambda) = \varepsilon P + \delta\, \mathrm{Id}.$$
Here $P \in \mathrm{End}(V \otimes V)$ is the permutation of factors and $\mathrm{Id} \in \mathrm{End}(V \otimes V)$ is the identity operator. The complex numbers $\varepsilon$ and $\delta$ are called the coupling constant and the secondary coupling constant, respectively. We always assume that the coupling constant is not equal to zero.

We recall a classification of such r-matrices. First we introduce gauge transformations of classical dynamical r-matrices with spectral parameter.

(2.6.2) Let $\psi = \sum_{a,b} \psi_{ab}(\lambda)\, dx_a \wedge dx_b$ be a closed meromorphic differential 2-form on $\mathfrak{h}^*$. Set
$$r(z, \lambda) \mapsto r(z, \lambda) + \sum_{a \neq b} \psi_{ab}(\lambda)\, E_{aa} \otimes E_{bb}.$$
(2.6.3) For a holomorphic function $\psi : \mathfrak{h}^* \to \mathbb{C}$, set
$$r(z, \lambda) \mapsto \sum_{a,b=1}^{N} \left(\alpha_{ab}(z, \lambda) + z\,\frac{\partial^2 \psi}{\partial x_a \partial x_b}(\lambda)\right) E_{aa} \otimes E_{bb} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, e^{z\left(\frac{\partial \psi}{\partial x_a}(\lambda) - \frac{\partial \psi}{\partial x_b}(\lambda)\right)}\, E_{ab} \otimes E_{ba}.$$

(2.6.4) For $\mu \in \mathfrak{h}^*$, set
$$r(z, \lambda) \mapsto r(z, \lambda + \mu).$$

(2.6.5) Let the symmetric group $S_N$ act on $\mathfrak{h}^*$ and $V$ by permutation of coordinates. For any permutation $\sigma \in S_N$, set
$$r(z, \lambda) \mapsto (\sigma \otimes \sigma)\, r(z, \sigma^{-1} \cdot \lambda)\, (\sigma^{-1} \otimes \sigma^{-1}).$$

(2.6.6) For a nonzero complex number $c$, set
$$r(z, \lambda) \mapsto c\, r(z, c\lambda).$$

(2.6.7) For an odd scalar meromorphic function $f(z)$, $f(z) + f(-z) = 0$, set
$$r(z, \lambda) \mapsto r(z, \lambda) + f(z)\, \mathrm{Id}.$$

Any gauge transformation transforms a classical dynamical r-matrix with spectral parameter to a classical dynamical r-matrix with spectral parameter [EV]. Two classical dynamical r-matrices $r(z, \lambda)$ and $r'(z, \lambda)$ will be called equivalent if one of them can be transformed into the other by a sequence of gauge transformations.

The gauge transformations of quantum dynamical R-matrices with spectral parameter described in Sect. 2.4 are analogs of the gauge transformations of classical dynamical r-matrices with spectral parameter.

Classification of the classical dynamical r-matrices with spectral parameter.

The elliptic r-matrix. Fix a point $\tau$ in the upper half plane. Introduce the functions
$$\sigma_w(z) = \frac{\theta(w - z, \tau)\,\theta'(0, \tau)}{\theta(w, \tau)\,\theta(z, \tau)}, \qquad \rho(z) = \frac{\theta'(z, \tau)}{\theta(z, \tau)},$$
where $\theta'(z, \tau) = \frac{\partial \theta(z, \tau)}{\partial z}$. Set
$$r^{ell}_\tau(z, \lambda) = \rho(z) \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \sigma_{\lambda_{ba}}(z)\, E_{ab} \otimes E_{ba}. \tag{2.6.8}$$
For every $\tau \in \mathbb{C}$, $\mathrm{Im}\,\tau > 0$, the function $r^{ell}_\tau(z, \lambda)$ is a classical dynamical r-matrix with spectral parameter $z$, coupling constant $\varepsilon = 1$ and secondary coupling constant $\delta = 0$ [FW].

Trigonometric r-matrices. Let $X \subset \{1, \dots, N\}$ be a subset, $X = X_1 \cup \dots \cup X_n$ its decomposition into disjoint intervals. For any $a, b \in \{1, \dots, N\}$, $a \neq b$, we introduce a function $\beta_{ab} : \mathbb{C} \oplus \mathfrak{h}^* \to \mathbb{C}$. If $a, b \in X_k$ for some $k$, then we set
$$\beta_{ab}(z, \lambda) = \frac{\sin(\lambda_{ab} + z)}{\sin(\lambda_{ab})\,\sin(z)}.$$
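Otherwise we set
$$\beta_{ab}(z, \lambda) = \frac{e^{-iz}}{\sin(z)} \ \text{ for } a < b, \qquad \beta_{ab}(z, \lambda) = \frac{e^{iz}}{\sin(z)} \ \text{ for } a > b.$$
We introduce a trigonometric r-matrix $r^{trig}_{\cup X_k} : \mathbb{C} \oplus \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ by
$$r^{trig}_{\cup X_k}(z, \lambda) = \mathrm{cotan}(z) \sum_{a=1}^{N} E_{aa} \otimes E_{aa} + \sum_{a \neq b} \beta_{ab}(z, \lambda)\, E_{ab} \otimes E_{ba}, \tag{2.6.9}$$
where $\mathrm{cotan}(z) = \cos(z)/\sin(z)$.

Rational r-matrices. Let $X \subset \{1, \dots, N\}$ be a subset, $X = X_1 \cup \dots \cup X_n$ its decomposition into disjoint intervals. Set
$$r^{rat}_{\cup X_k}(z, \lambda) = \frac{P}{z} + \sum_{k=1}^{n}\ \sum_{a, b \in X_k,\ a \neq b} \frac{1}{\lambda_{ab}}\, E_{ab} \otimes E_{ba}. \tag{2.6.10}$$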
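The off-diagonal part of (2.6.10) can be recovered numerically from the rational R-matrix coefficients (2.5.7)–(2.5.8). The sketch below is an illustration under explicit assumptions: it uses the convention $R_\gamma = 1 - \gamma r + O(\gamma^2)$ of (2.2.1), reads off the coefficient of $E_{ab} \otimes E_{ba}$ in $R$ as $\beta_{ba}$ (that is, $\beta$ evaluated at $\lambda_{ba} = -\lambda_{ab}$), and replaces the limit $\gamma \to 0$ by a small finite $\gamma$. The diagonal coefficients differ from (2.6.10) by gauge transformations and are not compared. All names are hypothetical.

```python
def beta_same_block(z, lam, gamma):
    """beta_ab of (2.5.7), with lam = lambda_ab and a, b in the same interval X_k."""
    return (z - lam) * gamma / (lam * (z - gamma))

def beta_other(z, gamma):
    """beta_ab of (2.5.8), for a, b not in a common interval."""
    return -gamma / (z - gamma)

def r_offdiag_same_block(z, lam_ab, gamma):
    """Coefficient of E_ab x E_ba in (1 - R)/gamma, obtained from beta_ba."""
    return -beta_same_block(z, -lam_ab, gamma) / gamma

def r_offdiag_other(z, gamma):
    """Same coefficient when a, b are not in a common interval."""
    return -beta_other(z, gamma) / gamma

def classical_same_block(z, lam_ab):
    """Coefficient of E_ab x E_ba in (2.6.10): 1/z from P/z plus 1/lambda_ab."""
    return 1 / z + 1 / lam_ab

def classical_other(z):
    """Only the P/z term contributes outside the intervals."""
    return 1 / z

if __name__ == "__main__":
    z, lam, gamma = 0.7, 1.3, 1e-8
    print(r_offdiag_same_block(z, lam, gamma), classical_same_block(z, lam))
    print(r_offdiag_other(z, gamma), classical_other(z))
```

The two printed pairs agree up to an error of order $\gamma$, which is how the term $1/\lambda_{ab}$ and the pole $P/z$ of (2.6.10) arise from the quantum coefficients.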
Theorem 2.3.
1. For every subset $X \subset \{1, \dots, N\}$ and its decomposition $X = X_1 \cup \dots \cup X_n$ into disjoint intervals, the matrices $r^{trig}_{\cup X_k}$ and $r^{rat}_{\cup X_k}$ are classical dynamical r-matrices with spectral parameter.
2. Every classical dynamical r-matrix $r : \mathbb{C} \times \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ with nonzero coupling constant is equivalent to one of the matrices (2.6.8)–(2.6.10).

Theorem 2.3 follows from [EV].

Theorem 2.4. Let $r(z, \lambda)$ be a unitary classical dynamical r-matrix with spectral parameter and nonzero coupling constant, meromorphic on $\mathbb{C} \times U$, where $U$ is an open polydisc. Assume that for any $\lambda \in U$ there exists $z \in \mathbb{C}$ such that $r$ is holomorphic at $(\lambda, z)$. Then $r$ can be quantized to a unitary quantum dynamical R-matrix $R_\gamma$ on $\mathbb{C} \times U$ of $gl_N$ type. Moreover, if a classical dynamical r-matrix with spectral parameter and nonzero coupling constant is equivalent to the elliptic r-matrix (2.6.8) (resp., a trigonometric r-matrix (2.6.9) or a rational r-matrix (2.6.10)), then it has a quantization equivalent to the elliptic R-matrix (2.5.1) (resp., a trigonometric R-matrix (2.5.6) or a rational R-matrix (2.5.9)).

Proof. We shall prove that if a classical dynamical r-matrix is equivalent to the elliptic r-matrix (2.6.8), then it is quantizable to a quantum dynamical R-matrix equivalent to the elliptic R-matrix (2.5.1). The other statements of the theorem are proved similarly.

Compute the quasiclassical limit of $R^{ell}_{\gamma,\tau}(z, \lambda)$. For the functions $\alpha(z, \lambda, \gamma)$ and $\beta(z, \lambda, \gamma)$ defined in (2.5.2), we have
$$\lim_{\gamma \to 0} \frac{\alpha(z, \lambda, \gamma) - 1}{\gamma} = \frac{\theta'(\lambda)}{\theta(\lambda)} + \frac{\theta'(z)}{\theta(z)}, \qquad \lim_{\gamma \to 0} \frac{\beta(z, \lambda, \gamma)}{\gamma} = \frac{\theta'(0)\,\theta(z - \lambda)}{\theta(\lambda)\,\theta(z)}.$$
Hence
$$R^{ell}_{\gamma,\tau}(z, \lambda) = 1 - \gamma\, r(z, \lambda) + O(\gamma^2),$$
where
$$r(z, \lambda) = -\sum_{a \neq b} \left(\frac{\theta'(\lambda_{ab})}{\theta(\lambda_{ab})} + \frac{\theta'(z)}{\theta(z)}\right) E_{aa} \otimes E_{bb} - \sum_{a \neq b} \frac{\theta'(0)\,\theta(z - \lambda_{ab})}{\theta(\lambda_{ab})\,\theta(z)}\, E_{ba} \otimes E_{ab}$$
$$= -\sum_{a \neq b} \left(\frac{\theta'(\lambda_{ab})}{\theta(\lambda_{ab})} + \frac{\theta'(z)}{\theta(z)}\right) E_{aa} \otimes E_{bb} + \sum_{a \neq b} \sigma_{\lambda_{ba}}(z)\, E_{ab} \otimes E_{ba}.$$
Now applying to the r-matrix $r(z, \lambda)$ the transformation (2.6.2) corresponding to the closed differential 2-form
$$\sum_{a \neq b} \frac{\theta'(\lambda_{ab})}{\theta(\lambda_{ab})}\, dx_a \wedge dx_b,$$
and then applying to the result the transformation (2.6.7) corresponding to the function $f(z) = \theta'(z)/\theta(z)$, we get the matrix $r^{ell}_\tau(z, \lambda)$ defined by (2.6.8). This remark and Lemma 1.1 easily imply the statement of the Theorem concerning the elliptic r-matrix. Theorem 2.4 is proved.

Remark. The elliptic quantum dynamical R-matrix (2.5.1) was invented by G. Felder [F1-2] as a quantization of the classical dynamical r-matrix (2.6.8).

2.7. Formal dynamical R-matrices and gauge fixing conditions. Let $R_\gamma(z, \lambda) = 1 - \gamma r(z, \lambda) + \sum_{n \ge 2} \gamma^n r_n(z, \lambda)$ be a power series in $\lambda$ and $\gamma$, whose coefficients are meromorphic functions of $z$, taking values in $\mathrm{End}(V \otimes V)$. The series $R_\gamma$ is called a formal quantum dynamical R-matrix of $gl_N$ type with spectral parameter and step $\gamma$ if it is of zero weight and satisfies the quantum dynamical Yang–Baxter equation. In addition, $R_\gamma$ is called unitary if it satisfies the unitarity condition (2.1.4).

In this section, for brevity, we will refer to formal quantum dynamical R-matrices of $gl_N$ type with spectral parameter and step $\gamma$ as "formal dynamical R-matrices". As we know, any such R-matrix has form (2.3.2).

The theory of formal dynamical R-matrices is completely analogous to the theory of analytic dynamical R-matrices. In particular, one can define formal classical dynamical r-matrices and formal gauge transformations in an obvious way. If $R_\gamma = 1 - \gamma r + \dots$ is a (unitary) formal dynamical R-matrix, then $r$ is a (unitary) formal dynamical r-matrix.

An example of a formal dynamical R-matrix is the Taylor expansion of an analytic dynamical R-matrix $R_\gamma(z, \lambda)$ at a point $\gamma = 0$, $\lambda = \lambda_0$, such that $R$ is regular at this point for generic values of $z$.

Proposition 2.1. Let $R_\gamma = 1 - \gamma r + \dots$ be a unitary formal dynamical R-matrix, and $z_0 \in \mathbb{C}$ a point where $R_\gamma$ is regular. Let $\alpha_{ab}$, $\beta_{ab}$ be the matrix coefficients of $R_\gamma$, see (2.3.2). Then $R_\gamma$ can be transformed, by a sequence of formal gauge transformations, to a unitary formal dynamical R-matrix satisfying the following conditions:
1) for every $a, b, c$, the ratio $\dfrac{\alpha_{ab}(z, \lambda - \gamma\omega_c)}{\alpha_{ab}(z, \lambda)}$ is independent of $z$;
2) for every $a < b$, $\alpha_{ab}(z_0, \lambda) = 1$;
3) the coefficient $\alpha_{11}(z, \lambda)$ is independent of $z$.
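Proof. The QDYB equation with spectral parameter implies the equation
$$\alpha_{ab}(u, \lambda - \gamma\omega_c)\,\alpha_{ac}(u + v, \lambda)\,\alpha_{bc}(v, \lambda - \gamma\omega_a) = \alpha_{bc}(v, \lambda)\,\alpha_{ac}(u + v, \lambda - \gamma\omega_b)\,\alpha_{ab}(u, \lambda) \tag{2.7.1}$$
for any $a, b, c$. Therefore, we have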
$$\frac{\alpha_{ab}(u, \lambda - \gamma\omega_c)}{\alpha_{ab}(u, \lambda)} = H_{abc}(\lambda)\, e^{D_{abc}(\lambda)\, u}, \tag{2.7.2}$$
for suitable power series $H_{abc}(\lambda)$, $D_{abc}(\lambda)$.

Lemma 2.1. There exists a formal power series $\psi(\lambda)$ such that $D_{abc} = \partial_a \partial_b \partial_c \psi$.

Proof. From (2.7.1) it follows that $D_{abc}$ is symmetric. From (2.7.2) it follows that $\partial_d D_{abc}$ is symmetric. The rest of the proof of the lemma follows from the basic theory of difference equations with infinitesimal shift.

Corollary 2.1. Performing a gauge transformation (2.4.1), we can arrange $D = 0$, i.e. condition 1.

From now on we assume that $D = 0$, i.e.
$$\frac{\alpha_{ab}(u, \lambda - \gamma\omega_c)}{\alpha_{ab}(u, \lambda)} = H_{abc}(\lambda). \tag{2.7.3}$$
This implies that $\alpha_{ab}(u, \lambda) = \alpha^1_{ab}(u)\,\alpha^2_{ab}(\lambda)$, where $\alpha^i_{ab}$ are new functions.

Consider the multiplicative 2-form $\varphi$ defined by $\varphi_{ab}(\lambda) = \alpha_{ab}(z_0, \lambda)$, $a < b$. It follows from (2.7.1) that $d_\gamma \varphi = 0$. Therefore, by a gauge transformation of type (2.4.2) we can arrange $\varphi = 1$, i.e. condition 2.

It remains to arrange condition 3. By (2.7.3), $\alpha_{11}(z, \lambda) = f(z)\,g(\lambda)$ for a suitable formal power series $g(\lambda)$ and a meromorphic function $f(z)$ such that $f(z)\,f(-z) = 1$. Applying transformation (2.4.4) with $c(z) = 1/f(z)$, we get condition 3. The proposition is proved.
We will call conditions 1–3 the gauge fixing conditions.

2.8. Classification of unitary formal dynamical R-matrices with elliptic quasiclassical limit. We will say that a formal classical dynamical r-matrix $r$ is of elliptic, trigonometric, or rational type if it is gauge equivalent (by formal gauge transformations) to an r-matrix of the form (2.6.8), (2.6.9), (2.6.10), respectively, expanded near a point $\lambda_0 \in \mathfrak{h}^*$. It follows from [EV] that any formal classical dynamical r-matrix satisfying the residue condition with coupling constant $\varepsilon \neq 0$ is of elliptic, trigonometric, or rational type.

Theorem 2.5. Let $R_\gamma = 1 - \gamma r + O(\gamma^2)$ be a unitary formal dynamical R-matrix whose quasiclassical limit $r$ is of the elliptic type. Then there exist a point $\lambda_0 \in \mathfrak{h}^*$ and a power series $\tau(\gamma) = \tau_0 + O(\gamma) \in \mathbb{C}[[\gamma]]$, $\mathrm{Im}(\tau_0) > 0$, such that the R-matrix $R_\gamma$ can be transformed, by a sequence of formal gauge transformations, into the Taylor series of $R^{ell}_{\gamma,\tau(\gamma)}(z, \lambda - \lambda_0)$, where $R^{ell}_{\gamma,\tau}(z, \lambda)$ is the elliptic R-matrix (2.5.1).

The proof of this Theorem occupies the next section.

2.9. Proof of Theorem 2.5. Let $X^0$ be the space of unitary formal classical dynamical r-matrices with spectral parameter and a nonzero coupling constant. Let $X^0_*$ be the subset of elements of $X^0$ which satisfy the following gauge fixing conditions:
1c) $\dfrac{\partial}{\partial \lambda_c}\alpha_{ab}(z, \lambda)$ is independent of $z$;
2c) $\alpha_{ab}(z_0, \lambda) = 0$, $a < b$;
3c) $\alpha_{11}(z, \lambda)$ is independent of $z$
(these conditions are quasiclassical analogues of conditions 1–3 above).
According to the results of [EV], the space X∗0 is a connected, finite-dimensional complex manifold (with singularities), and any element of X 0 is gauge equivalent to an element of X∗0 . (i.e. X∗0 is a “cross-section”). Moreover, since r ∈ X∗0 is of elliptic type, the manifold X∗0 is smooth at r. Let X be the space of unitary formal quantum dynamical R-matrices with spectral parameter, and X∗ the subset of elements of X satisfying the gauge fixing conditions 1-3. As we have shown in Sect. 2.7, we can assume that our family Rγ is in X∗ . In this case, r ∈ X∗0 . Now let us prove the statement of the theorem modulo γ m+1 by induction in m. For m = 1, the theorem is a tautology. Suppose we know the theorem for m = k ≥ 1, and want to prove it for m = k + 1. We have a polynomial Rk = 1−γr+...+γ k rk which satisfies the condition Rk ∈ X∗ modulo γ k+1 . We know that Rk satisfies the conclusion of Theorem 2.5 modulo γ k+1 , i.e. is of the form (2.5.1) modulo γ k+1 . Consider any extension of this polynomial to order k + 1: Rk+1 = Rk + γ k+1 rk+1 . The condition that Rk+1 ∈ X∗ modulo γ k+2 can be expressed as a nonhomogeneous linear equation with respect to rk+1 having the form A rk+1 = sk+1 (rk , ..., r2 , r), where A is a linear operator. The obvious, but crucial observation now is the following. Lemma 2.2. Ker A = Tr X∗0 , where Tr X∗0 denotes the tangent space at the point r. Proof. Indeed, it is easy to see by an explicit calculation that the linear homogeneous equation Aρ = 0 is nothing else but the equation for a tangent vector to X∗0 at the point r. Corollary 2.2. The dimension of the space of solutions of Ark+1 = sk+1 is less than or equal to K = dim(X∗0 ). However, by Theorem 2.4, we already have a family of elements of X∗ with K parameters – the quantizations of elements of X∗0 . Therefore, using dimension arguments, we obtain that if rk+1 satisfies Ark+1 = sk+1 , then Rk+1 (γ) has to be in this K-parametric family, which completes the induction step. 
The theorem is proved.
Remark. If r is not elliptic but rational or trigonometric, the result of Theorem 2.5 can be generalized, in the sense that formal dynamical R-matrices $R_\gamma = 1 - \gamma r + \dots$ with rational or trigonometric r can be explicitly classified up to gauge transformations by the same method as above. However, both the statement and the proof in this case are more delicate, as the manifold $X^0_*$ may now be singular at r, and it is necessary to describe these singularities carefully. For simplicity one should first consider the case dim V = 2, and then generalize to an arbitrary dimension. We do not give this argument here.
3. Quantum Dynamical R-matrices and Monoidal Categories
Let us briefly recall some standard notions of category theory [Mac, Kass]. Recall that a morphism a : F → G of two functors from a category C to a category C′ is a choice of a morphism $a_X : F(X) \to G(X)$ for any object X in C, such that for any two objects X, Y ∈ C and any morphism g : X → Y we have $a_Y \circ F(g) = G(g) \circ a_X$. An endomorphism of a functor is just a morphism of this functor into itself.
Quantum Dynamical Yang–Baxter Equation and Dynamical Quantum Groups
Recall that a monoidal category is a category C with a bifunctor ⊗ : C × C → C (i.e. a functor with respect to each factor), called the tensor product, and an isomorphism of functors Φ : (∗ ⊗ ∗) ⊗ ∗ → ∗ ⊗ (∗ ⊗ ∗), called the associativity isomorphism, such that Φ satisfies the pentagon relation, and there exists a unit object 1 ∈ C with certain properties. A braided monoidal category is a monoidal category with a functorial isomorphism $\beta : \otimes \to \otimes^{op}$ called the commutativity isomorphism, which satisfies the hexagon relations. A braided category is called symmetric if $\beta^2 = 1$. A monoidal category will be called a tensor category if it has an additive structure ⊕, such that ⊗ is distributive with respect to ⊕.
3.1. The category of h-vector spaces. Let h be a finite-dimensional commutative Lie algebra over C. Let $M_{h^*}$ denote the field of meromorphic functions on $h^*$. Fix a complex number γ. Let $V_h$ denote the category whose objects are diagonalizable h-modules, and morphisms are defined by $\mathrm{Hom}_{V_h}(X, Y) = \mathrm{Hom}_h(X, Y \otimes_C M_{h^*})$. Let W ⊗ ∗ be the functor of multiplication by W. For any $W \in V_h$ and $f \in \mathrm{End}_{V_h}(W)$, define $f(* - \gamma h^{(2)}) \in \mathrm{End}(W \otimes *)$ by the formula $f_V(\lambda - \gamma h^{(2)})(w \otimes v) = f_V(\lambda - \gamma\mu)w \otimes v$,
(3.1.1)
for any v ∈ V of weight µ (cf. Sect. 1.1).
Define a bifunctor $\bar\otimes : V_h \times V_h \to V_h$ as follows. For any $X, Y \in V_h$, define $X \bar\otimes Y$ to be the usual tensor product X ⊗ Y. For any two morphisms f : X → X′, g : Y → Y′ define the morphism $f \bar\otimes g : X \otimes Y \to X' \otimes Y'$ by the formula $f \bar\otimes g(\lambda) = f^{(1)}(\lambda - \gamma h^{(2)})(1 \otimes g(\lambda))$.
(3.1.2)
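As a quick sanity check, formula (3.1.2) can be unpacked on weight vectors. The sketch below rests on two assumptions that are ours: y ∈ Y has weight ν (a symbol introduced only for this illustration), and composition of morphisms in $V_h$ is pointwise in λ, which is how we read the definition of $\mathrm{Hom}_{V_h}$. Here f′ : X′ → X″ and g′ : Y′ → Y″ denote a second, hypothetical pair of morphisms. Since g is a homomorphism of h-modules, g(λ)y again has weight ν, so $h^{(2)}$ acts on it by ν:

```latex
\begin{align*}
(f \bar\otimes g)(\lambda)(x \otimes y)
  &= f^{(1)}(\lambda - \gamma h^{(2)})\bigl(x \otimes g(\lambda)y\bigr)
   = f(\lambda - \gamma\nu)\,x \otimes g(\lambda)\,y, \\
\bigl((f' \bar\otimes g') \circ (f \bar\otimes g)\bigr)(\lambda)(x \otimes y)
  &= f'(\lambda - \gamma\nu)f(\lambda - \gamma\nu)\,x \otimes g'(\lambda)g(\lambda)\,y \\
  &= \bigl((f' \circ f) \,\bar\otimes\, (g' \circ g)\bigr)(\lambda)(x \otimes y).
\end{align*}
```

Under these assumptions the second computation is exactly the statement that $\bar\otimes$ is functorial in both arguments.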
It is easy to see that the category $V_h$ equipped with the bifunctor $\bar\otimes$ is a tensor category (cf. [Mac]). Indeed, the functors $* \bar\otimes (* \bar\otimes *)$ and $(* \bar\otimes *) \bar\otimes *$ are equal, so $\bar\otimes$ is associative. Moreover, the object 1 = C (the trivial h-module) satisfies the condition $1 = 1 \bar\otimes 1$, and the functors $X \to 1 \bar\otimes X$, $X \to X \bar\otimes 1$ are autoequivalences of $V_h$, so 1 is an identity object in $V_h$. We will call this monoidal category the category of h-vector spaces. If h = 0, the category $V_h$ coincides with the category of complex vector spaces. If γ = 0, the category $V_h$ is equivalent, as a tensor category, to the category of diagonalizable h-modules, with scalars extended from C to $M_{h^*}$. This case is not very interesting, so from now on we will assume that γ ≠ 0. The category $V_h$ depends on γ, but the categories with different nonzero γ are obviously equivalent. We will suppress the dependence of $V_h$ on γ in the notation.
Remark. It is clear that for any two objects $X, Y \in V_h$ the permutation operator $\sigma_{XY} : X \bar\otimes Y \to Y \bar\otimes X$ is an isomorphism in $V_h$. However, if h ≠ 0, then this isomorphism is not functorial in X and Y. In fact, it is quite easy to see that if h ≠ 0, there is no functorial isomorphism between $X \bar\otimes Y$ and $Y \bar\otimes X$: such an isomorphism would have to conjugate $f^{(1)}(\lambda - \gamma h^{(2)})(1 \otimes g(\lambda))$ into $g^{(1)}(\lambda - \gamma h^{(2)})(1 \otimes f(\lambda))$ for any f, g, which is impossible, since there is no relation between f(λ) and f(λ − γµ) for a generic function f. Thus, the category $V_h$ is a tensor category which in general does not admit a braided structure.
3.2. Dynamical quantum R-matrices and tensor functors. It is known from the theory of quantum groups that if we are given a braided monoidal category B, a symmetric
tensor category V, and a tensor functor F : B → V, then for any object X ∈ B we can construct an element $R(B, F, X) \in \mathrm{Aut}_V(F(X) \otimes F(X))$ which satisfies the quantum Yang–Baxter equation, by the formula
$R(B, F, X) = \sigma F(\beta_{XX})$, (3.2.1)
where $\beta_{XY} : X \otimes Y \to Y \otimes X$ is the braiding in B, and σ is the permutation. For brevity we will write R(B, F, X) as $R_X$. Suppose now that we are given a braided monoidal category B and a tensor functor $F : B \to V_h$. Observe that formula (3.2.1) makes sense in this situation. However, since $\sigma_{XY}$ is not a functorial isomorphism, we should not expect $R_X$ to be a solution to the quantum Yang–Baxter equation. Still, it turns out that $R_X$ satisfies a modified version of the quantum Yang–Baxter equation, namely, the quantum dynamical Yang–Baxter Eq. (1.1.3).
Theorem 3.1. The element $R_X$ satisfies the quantum dynamical Yang–Baxter Eq. (1.1.3) in $\mathrm{End}_{V_h}(F(X)^{\bar\otimes 3})$.
Proof. We start with the braid relation (β ⊗ 1)(1 ⊗ β)(β ⊗ 1) = (1 ⊗ β)(β ⊗ 1)(1 ⊗ β).
(3.2.2)
Applying the functor F to (3.2.2), and using the definition of the tensor product of morphisms in $V_h$, we get (1.1.3).
3.3. Representations of a quantum dynamical R-matrix. The notions discussed in this section were introduced in [F1, F2, FV1]. Let $R : h^* \to \mathrm{End}(V \otimes V)$ be a quantum dynamical R-matrix (see Chapter 1).
Definition. A representation of R is an object $W \in V_h$ endowed with an invertible morphism $L \in \mathrm{End}_{V_h}(V \bar\otimes W)$, called the L-operator, such that
$R^{12}(\lambda - \gamma h^{(3)})\, L^{13}(\lambda)\, L^{23}(\lambda - \gamma h^{(1)}) = L^{23}(\lambda)\, L^{13}(\lambda - \gamma h^{(2)})\, R^{12}(\lambda)$ (3.3.1)
in $\mathrm{End}_{V_h}(V \bar\otimes V \bar\otimes W)$.
Examples. 1. The trivial representation: W = C, L = Id. 2. The basic representation: W = V, L = R.
Let (W, L) be a representation of R. Let $A \in \mathrm{Aut}_{V_h}(W)$. Let $L_A(\lambda) := (1 \otimes A(\lambda)^{-1})L(\lambda)(1 \otimes A(\lambda - \gamma h^{(1)}))$.
Lemma 3.1. $(W, L_A)$ is a representation of R.
Proof. Straightforward.
Let $(W, L_W)$ and $(U, L_U)$ be representations of R.
Definition. A morphism $A \in \mathrm{Hom}_{V_h}(W, U)$ is called an R-morphism if $(1 \otimes A(\lambda))L_W(\lambda) = L_U(\lambda)(1 \otimes A(\lambda - \gamma h^{(1)}))$.
(3.3.2)
Denote the space of R-morphisms from W to U by $\mathrm{Hom}_R(W, U)$. It is clear that the composition of two R-morphisms is again an R-morphism. Thus, representations of R form a category, which we denote by Rep(R). This category is additive, with the obvious notion of direct sum.
Definition. The tensor product of W and U is the pair $(W \otimes U, L_{W \otimes U})$, where $L_{W \otimes U}(\lambda) = L_W^{12}(\lambda - \gamma h^{(3)})\, L_U^{13}(\lambda)$.
(3.3.3)
Lemma 3.2. (W ⊗ U, LW ⊗U ) is a representation of R. Proof. Straightforward.
It is clear that (W ⊗ U) ⊗ X = W ⊗ (U ⊗ X).
Lemma 3.3. If W, W′, U, U′ are representations of R and f, g are R-morphisms then $f \bar\otimes g$ is an R-morphism.
Proof. Straightforward.
Thus, we have equipped the category Rep(R) with a structure of a tensor category. Moreover, the forgetful functor $F : \mathrm{Rep}(R) \to V_h$ is naturally a tensor functor. Theorem 3.1 shows that any pair $(B, F : B \to V_h)$ defines a system of quantum dynamical R-matrices. It turns out that conversely, any quantum dynamical R-matrix R defines B, F, and X, such that R = R(B, F, X). The construction of B, F, X is parallel to the case of usual R-matrices (h = 0), where it is well known. Namely, let B be the subcategory of Rep(R) whose objects are tensor powers of V, and morphisms are the same as in Rep(R). It is clearly a monoidal category. Define a braiding β on B by $\beta_{VV} = \sigma R$. It is easy to check using the hexagon axioms for the braiding that there exists a unique braiding on B with such $\beta_{VV}$. Let $F : B \to V_h$ be the forgetful functor. We assign the pair (B, F) to R. It is clear that R = R(B, F, X) if we take X = V.
3.4. Dual representations. It is useful to define the notion of the left and right dual representations.
Definition. Let $(W, L_W)$ be a representation of R. The right dual representation to W is the pair $(W^*, L_{W^*})$, where $W^*$ denotes the h-graded dual of W, and $L_{W^*}(\lambda) = L_W^{-1}(\lambda + \gamma h^{(2)})^{t_2}$,
(3.4.1)
provided that the r.h.s. of (3.4.1) is invertible (here $t_2$ denotes dualization in the second component). The left dual representation to W is the pair $({}^*W, L_{{}^*W})$, where ${}^*W = W^*$, and
$L_{{}^*W}(\lambda) = L_W^{t_2}(\lambda - \gamma h^{(2)})^{-1}$, (3.4.2)
provided that the r.h.s. of (3.4.2) is well defined.
Remark 1. Here $L_W^{-1}(\lambda + \gamma h^{(2)})^{t_2}$ denotes the result of three operations applied successively to $L_W$: inversion, shifting of the argument, and dualization in the second component. Similarly, $L_W^{t_2}(\lambda - \gamma h^{(2)})^{-1}$ denotes the result of three operations applied successively to $L_W$: dualization in the second component, shifting of the argument, and inversion.
Remark 2. We do not define the representation $W^*$ if $L_{W^*}$ is not invertible, and do not define the representation ${}^*W$ if $L_W^{t_2}$ is not invertible.
Lemma 3.4. The right dual representation $(W^*, L_{W^*})$ and the left dual representation $({}^*W, L_{{}^*W})$ are representations of R, and if W has finite dimensional weight subspaces then ${}^*(W^*) = ({}^*W)^* = W$.
Proof. The lemma can be checked by a direct calculation. It also follows from Propositions 4.1 and 4.4 below.
Lemma 3.5. If $A : W_1 \to W_2$ is a homomorphism of representations of R, then the linear map $A^*(\lambda) := A(\lambda + \gamma h^{(1)})^t = A^t(\lambda - \gamma h^{(1)})$ is a homomorphism of representations $W_2^* \to W_1^*$, and is a homomorphism of representations ${}^*W_2 \to {}^*W_1$, when these representations are defined.
Proof. The lemma can be checked by a direct calculation. It also follows from Propositions 4.1 and 4.4.
Remark. It is easy to show that for two finite dimensional representations $W_1, W_2$ of R, the representation $(W_1 \otimes W_2)^*$ is naturally isomorphic to $W_2^* \otimes W_1^*$, and similarly for the left dual, if the corresponding dual representations are defined.
4. h-Hopf Algebroids and Their Dynamical Representations
In this chapter we will define the notion of an h-bialgebroid, and give the simplest nontrivial examples, the dynamical quantum groups associated to quantum dynamical R-matrices from Chapter 1. We will generalize this material in the next chapter.
4.1. h-bialgebroids. Let h be a finite dimensional commutative Lie algebra over C, and γ a nonzero complex number. Recall that $M_{h^*}$ denotes the field of meromorphic functions on $h^*$.
Definition. An h-algebra with step γ is an associative algebra A over C with 1, endowed with an $h^*$-bigrading $A = \oplus_{\alpha,\beta \in h^*} A_{\alpha\beta}$ (called the weight decomposition), and two algebra embeddings $\mu_l, \mu_r : M_{h^*} \to A_{00}$ (the left and the right moment maps), such that for any $a \in A_{\alpha\beta}$ and $f \in M_{h^*}$, we have
$\mu_l(f(\lambda))a = a\mu_l(f(\lambda + \gamma\alpha)), \quad \mu_r(f(\lambda))a = a\mu_r(f(\lambda + \gamma\beta))$. (4.1.1)
A morphism ϕ : A → B of two h-algebras is an algebra homomorphism, preserving the moment maps. By (4.1.1), such a homomorphism also preserves the weight decomposition.
Let A, B be two h-algebras with step γ, and $\mu_l^A, \mu_r^A, \mu_l^B, \mu_r^B$ their moment maps. Define their “matrix tensor product”, $A \tilde\otimes B$, which is also an h-algebra.
Definition. Let
$(A \tilde\otimes B)_{\alpha\delta} := \oplus_\beta A_{\alpha\beta} \otimes_{M_{h^*}} B_{\beta\delta}$, (4.1.2)
where $\otimes_{M_{h^*}}$ means the usual tensor product modulo the relation $\mu_r^A(f)a \otimes b = a \otimes \mu_l^B(f)b$, for any a ∈ A, b ∈ B, $f \in M_{h^*}$.
Introduce a multiplication in $A \tilde\otimes B$ by the rule (a ⊗ b)(a′ ⊗ b′) = aa′ ⊗ bb′. It is easy to show that this product is well defined (cf. Proposition 5.1). Define the moment maps for $A \tilde\otimes B$ by $\mu_l^{A \tilde\otimes B}(f) = \mu_l^A(f) \otimes 1$, $\mu_r^{A \tilde\otimes B}(f) = 1 \otimes \mu_r^B(f)$. It is easy to check that this makes $A \tilde\otimes B$ into an h-algebra. It is clear that $\tilde\otimes$ is functorial with respect to both factors, and $(A \tilde\otimes B) \tilde\otimes C = A \tilde\otimes (B \tilde\otimes C)$. However, $A \tilde\otimes B$ is not, in general, isomorphic to $B \tilde\otimes A$.
Remark. The name “matrix tensor product” is used because formula (4.1.2) is reminiscent of matrix multiplication.
Definition. A coproduct on an h-algebra A is a homomorphism of h-algebras $\Delta : A \to A \tilde\otimes A$.
Let $D_h$ be the algebra of difference operators $M_{h^*} \to M_{h^*}$, i.e. operators of the form $\sum_{i=1}^n f_i(\lambda)T_{\beta_i}$, where $f_i \in M_{h^*}$, and for $\beta \in h^*$ we denote by $T_\beta$ the field automorphism of $M_{h^*}$ given by $(T_\beta f)(\lambda) = f(\lambda + \gamma\beta)$. The algebra $D_h$ is the simplest nontrivial example of an h-algebra. Indeed, if we define the weight decomposition by $D_h = \oplus (D_h)_{\alpha\beta}$, where $(D_h)_{\alpha\beta} = 0$ if α ≠ β, and $(D_h)_{\alpha\alpha} = \{f(\lambda)T_\alpha^{-1} : f \in M_{h^*}\}$, and the moment maps $\mu_l = \mu_r : M_{h^*} \to (D_h)_{00}$ to be the tautological isomorphism, then $D_h$ becomes an h-algebra.
Lemma 4.1. For any h-algebra A, the algebras $A \tilde\otimes D_h$ and $D_h \tilde\otimes A$ are canonically isomorphic to A.
Proof. Straightforward.
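The claim that the difference operators form an h-algebra can be spot-checked numerically. The following is only a sketch under simplifying assumptions that are ours: h is taken one-dimensional, so weights are plain rational numbers, the step γ is fixed to 1/3, and the sample functions f, g, φ are arbitrary rational functions. All helper names are hypothetical. It tests the moment map relation (4.1.1) for an element $g(\lambda)T_\alpha^{-1}$ of weight (α, α).

```python
from fractions import Fraction

GAMMA = Fraction(1, 3)  # step gamma; an arbitrary nonzero value chosen for the demo


def times(f):
    """Moment map mu_l = mu_r on D_h: the operator 'multiply by f'."""
    return lambda phi: (lambda lam: f(lam) * phi(lam))


def weight_element(g, alpha):
    """The operator g(lam) * T_alpha^{-1}, an element of (D_h)_{alpha,alpha}."""
    return lambda phi: (lambda lam: g(lam) * phi(lam - GAMMA * alpha))


def shifted(f, beta):
    """The function lam -> f(lam + GAMMA * beta), i.e. T_beta applied to f."""
    return lambda lam: f(lam + GAMMA * beta)


def relation_holds(alpha, beta, points):
    """Test mu(f) a == a mu(T_beta f) for a = g T_alpha^{-1} on sample data."""
    f = lambda lam: 1 / (lam + 2)
    g = lambda lam: lam * lam + 1
    phi = lambda lam: lam / (lam + 5)
    a = weight_element(g, alpha)
    lhs = times(f)(a(phi))                     # mu(f) composed with a, applied to phi
    rhs = a(times(shifted(f, beta))(phi))      # a composed with mu(T_beta f), applied to phi
    return all(lhs(p) == rhs(p) for p in points)


POINTS = [Fraction(n, 7) for n in range(1, 6)]
```

Calling `relation_holds(2, 2, POINTS)` exercises the relation with the shift by γα required in (4.1.1); passing a different second argument shows that another shift breaks the identity for these sample functions.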
Lemma 4.1 shows that the category of h-algebras equipped with the product $\tilde\otimes$ is a monoidal category, where the unit object is $D_h$.
Definition. A counit on an h-algebra A is a homomorphism of h-algebras $\varepsilon : A \to D_h$.
Definition. An h-bialgebroid is an h-algebra A equipped with a coassociative coproduct Δ (i.e. such that $(\Delta \otimes \mathrm{Id}_A) \circ \Delta = (\mathrm{Id}_A \otimes \Delta) \circ \Delta$), and a counit ε such that $(\varepsilon \otimes \mathrm{Id}_A) \circ \Delta = (\mathrm{Id}_A \otimes \varepsilon) \circ \Delta = \mathrm{Id}_A$.
The property of the counit in the definition makes sense because of Lemma 4.1.
4.2. Dynamical representations of h-bialgebroids. Let W be a diagonalizable h-module, and let $D^\alpha_{h,W} \subset \mathrm{Hom}_C(W, W \otimes D_h)$ be the space of all difference operators on $h^*$ with coefficients in $\mathrm{End}_C(W)$, which have weight α with respect to the action of h in W. Consider the algebra $D_{h,W} = \oplus_\alpha D^\alpha_{h,W}$. This algebra has a weight decomposition $D_{h,W} = \oplus_{\alpha,\beta}(D_{h,W})_{\alpha\beta}$ defined as follows: if $g \in \mathrm{Hom}_C(W, W \otimes M_{h^*})$ is an operator of weight β − α then $gT_\beta^{-1} \in (D_{h,W})_{\alpha\beta}$. Define the moment maps $\mu_l, \mu_r : M_{h^*} \to (D_{h,W})_{00}$ by the formulas $\mu_r(f(\lambda)) = f(\lambda)$, $\mu_l(f(\lambda)) = f(\lambda - \gamma h)$.
Lemma 4.2. The algebra $D_{h,W}$ equipped with this weight decomposition and these moment maps is an h-algebra.
Proof. Straightforward.
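Lemma 4.3. There is a natural embedding of h-algebras $\theta_{WU} : D_{h,W} \tilde\otimes D_{h,U} \to D_{h,W \otimes U}$, given by the formula $fT_\beta \otimes gT_\delta \to (f \bar\otimes g)T_\delta$, where $\bar\otimes$ is defined in Chapter 3, and $f \in \mathrm{Hom}(W, W \otimes M_{h^*})$. This embedding is an isomorphism if W, U are finite-dimensional.
Proof. We have to show that the map $\theta_{WU}$ is well defined, and is an embedding. We also have to show that $\theta_{WU}$ is a homomorphism of h-algebras, which is an isomorphism in the finite-dimensional case.
The fact that $\theta_{WU}$ is well defined follows from the identity $\varphi(\lambda)f \bar\otimes g = f \bar\otimes \varphi(\lambda - \gamma h)g$, for any function $\varphi \in M_{h^*}$. The injectivity of $\theta_{WU}$, and its surjectivity in the finite dimensional case are straightforward. It remains to show that $\theta_{WU}$ is a homomorphism of h-algebras. It is obvious that $\theta_{WU}$ preserves the moment maps, so it remains to show that it is multiplicative. We have
$$\begin{aligned}
&\theta_{WU}\bigl((f(\lambda)T_\beta^{-1} \otimes g(\lambda)T_\delta^{-1})(f'(\lambda)T_{\beta'}^{-1} \otimes g'(\lambda)T_{\delta'}^{-1})\bigr) \\
&\quad= \theta_{WU}\bigl(f(\lambda)f'(\lambda - \gamma\beta)T_{\beta+\beta'}^{-1} \otimes g(\lambda)g'(\lambda - \gamma\delta)T_{\delta+\delta'}^{-1}\bigr) \\
&\quad= f^{(1)}(\lambda - \gamma h^{(2)})\,f'^{(1)}(\lambda - \gamma h^{(2)} - \gamma\beta)\,(1 \otimes g(\lambda)g'(\lambda - \gamma\delta))\,T_{\delta+\delta'}^{-1} \\
&\quad= f^{(1)}(\lambda - \gamma h^{(2)})\,(1 \otimes g(\lambda))\,f'^{(1)}(\lambda - \gamma h^{(2)} - \gamma\delta)\,(1 \otimes g'(\lambda - \gamma\delta))\,T_{\delta+\delta'}^{-1} \\
&\quad= f^{(1)}(\lambda - \gamma h^{(2)})\,(1 \otimes g(\lambda))\,T_\delta^{-1}\,f'^{(1)}(\lambda - \gamma h^{(2)})\,(1 \otimes g'(\lambda))\,T_{\delta'}^{-1} \\
&\quad= \theta_{WU}\bigl(f(\lambda)T_\beta^{-1} \otimes g(\lambda)T_\delta^{-1}\bigr)\,\theta_{WU}\bigl(f'(\lambda)T_{\beta'}^{-1} \otimes g'(\lambda)T_{\delta'}^{-1}\bigr).
\end{aligned}$$ (4.2.1)
The lemma is proved.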
Definition. A dynamical representation of an h-algebra A is a diagonalizable h-module W endowed with a homomorphism of h-algebras $\pi_W : A \to D_{h,W}$. A homomorphism of dynamical representations $\varphi : W_1 \to W_2$ is an element of $\mathrm{Hom}_C(W_1, W_2 \otimes M_{h^*})$ such that $\varphi \circ \pi_{W_1}(x) = \pi_{W_2}(x) \circ \varphi$ for all x ∈ A.
Example. If A has a counit, then it has the trivial representation: W = C, π = ε.
Suppose now that A is an h-bialgebroid. Then, if W and U are two dynamical representations of A, the h-module W ⊗ U also has a natural structure of a dynamical representation, defined by $\pi_{W \otimes U}(x) = \theta_{WU} \circ (\pi_W \otimes \pi_U) \circ \Delta(x)$. It is easy to show that if $f : W_1 \to W_2$ and $g : U_1 \to U_2$ are homomorphisms of dynamical representations, then $f \bar\otimes g$ is a homomorphism $W_1 \otimes U_1 \to W_2 \otimes U_2$ (where $\bar\otimes$ is defined in Chapter 3). This gives a rule of tensoring morphisms. Thus, dynamical representations of A form a monoidal category Rep(A), whose identity object is the trivial representation. Moreover, the category Rep(A) is equipped with a natural tensor functor $\mathrm{Rep}(A) \to V_h$ to the category of h-vector spaces, namely the forgetful functor.
4.3. h-Hopf algebroids and dual representations. Let us introduce the notion of an antipode on an h-bialgebroid. Let A be an h-algebra. A linear map S : A → A is called an antiautomorphism of h-algebras if it is an antiautomorphism of algebras and $S \circ \mu_r = \mu_l$, $S \circ \mu_l = \mu_r$. From these conditions it follows that $S(A_{\alpha\beta}) = A_{-\beta,-\alpha}$. Let A be an h-bialgebroid, and let Δ, ε be the coproduct and counit of A. For a ∈ A, let
$\Delta(a) = \sum_i a_i^1 \otimes a_i^2$. (4.3.1)
Definition. An antipode on the h-bialgebroid A is an antiautomorphism of h-algebras S : A → A such that for any a ∈ A and any presentation (4.3.1) of Δ(a), one has
$\sum_i a_i^1 S(a_i^2) = \mu_l(\varepsilon(a)1), \quad \sum_i S(a_i^1)a_i^2 = \mu_r(\varepsilon(a)1)$, (4.3.2)
where $\varepsilon(a)1 \in M_{h^*}$ is the result of application of the difference operator ε(a) to the constant function 1.
Remark. It is easy to see that $\sum_i a_i^1 S(a_i^2)$ and $\sum_i S(a_i^1)a_i^2$ depend only on a and not on the choice of the presentation (4.3.1).
Definition. An h-bialgebroid with an antipode is called an h-Hopf algebroid.
Remark. If h = 0, the notions of an h-algebra, h-bialgebroid, h-Hopf algebroid coincide with the notions of an algebra, bialgebra, and Hopf algebra, respectively.
For any h-Hopf algebroid A, the category Rep(A) has the following natural notion of the left and right dual representation. If $(W, \pi_W)$ is a dynamical representation of an h-algebra A, we denote by $\pi^0_W : A \to \mathrm{Hom}(W, W \otimes M_{h^*})$ the map defined by $\pi^0_W(x)w = \pi_W(x)w$, w ∈ W (the difference operator $\pi_W(x)$ restricted to the constant functions). It is clear that $\pi_W$ is completely determined by $\pi^0_W$.
Definition. Let $(W, \pi_W)$ be a dynamical representation of A. Then the right dual representation to W is $(W^*, \pi_{W^*})$, where $W^*$ is the h-graded dual to W, and $\pi^0_{W^*}(x)(\lambda) = \pi^0_W(S(x))(\lambda + \gamma h - \gamma\alpha)^t$
(4.3.3)
for $x \in A_{\alpha\beta}$, where t denotes dualization. The left dual representation to W is $({}^*W, \pi_{{}^*W})$, where ${}^*W = W^*$, and $\pi^0_{{}^*W}(x)(\lambda) = \pi^0_W(S^{-1}(x))(\lambda + \gamma h - \gamma\alpha)^t$
(4.3.4)
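for $x \in A_{\alpha\beta}$.
Proposition 4.1. Formulas (4.3.3) and (4.3.4) define dynamical representations of A. Moreover, if $A(\lambda) : W_1 \to W_2$ is a morphism of dynamical representations, then $A^*(\lambda) := A(\lambda + \gamma h)^t$ defines a morphism $W_2^* \to W_1^*$ and ${}^*W_2 \to {}^*W_1$.
Proof. Let $x \in A_{\alpha_x\beta_x}$, $y \in A_{\alpha_y\beta_y}$. Then $\pi^0_W(xy)(\lambda) = \pi^0_W(x)(\lambda)\pi^0_W(y)(\lambda - \gamma\beta_x)$ by the definition of a dynamical representation. Therefore, we have
$$\begin{aligned}
\pi^0_{W^*}(xy)(\lambda)^t &= \pi^0_W(S(xy))(\lambda + \gamma h - \gamma\alpha_x - \gamma\alpha_y) \\
&= \pi^0_W(S(y)S(x))(\lambda + \gamma h - \gamma\alpha_x - \gamma\alpha_y) \\
&= \pi^0_W(S(y))(\lambda + \gamma h - \gamma\alpha_x - \gamma\alpha_y + \gamma\alpha_{S(x)} - \gamma\beta_{S(x)})\,\pi^0_W(S(x))(\lambda + \gamma h - \gamma\alpha_x - \gamma\alpha_y - \gamma\beta_{S(y)}) \\
&= \pi^0_W(S(y))(\lambda + \gamma h - \gamma\alpha_y - \gamma\beta_x)\,\pi^0_W(S(x))(\lambda + \gamma h - \gamma\alpha_x).
\end{aligned}$$ (4.3.5)
Dualizing (4.3.5), we get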
$\pi^0_{W^*}(xy)(\lambda) = \pi^0_W(S(x))(\lambda + \gamma h - \gamma\alpha_x)^t\,\pi^0_W(S(y))(\lambda + \gamma h - \gamma\alpha_y - \gamma\beta_x)^t = \pi^0_{W^*}(x)(\lambda)\,\pi^0_{W^*}(y)(\lambda - \gamma\beta_x)$, (4.3.6)
which implies the first statement of the proposition for $W^*$. The proof for ${}^*W$ is obtained by replacing S by $S^{-1}$. Let us prove the second statement. The intertwining property of A(λ) can be written as
$A(\lambda)\pi^0_W(x)(\lambda) = \pi^0_W(x)(\lambda)A(\lambda - \gamma\beta_x)$. (4.3.7)
Replacing x with S(x) and shifting the arguments, we get
$A(\lambda + \gamma h - \gamma\beta_x)\,\pi^0_W(S(x))(\lambda + \gamma h - \gamma\alpha_x) = \pi^0_W(S(x))(\lambda + \gamma h - \gamma\alpha_x)\,A(\lambda + \gamma h - \gamma\alpha_x - \gamma\beta_{S(x)})$.
(4.3.8)
Dualizing (4.3.8) and using the identity $\beta_{S(x)} + \alpha_x = 0$, we get the second statement of the proposition. The proposition is proved.
4.4. h-bialgebroids associated to a function $R : h^* \to \mathrm{End}(V \otimes V)$. Let h be a finite dimensional commutative Lie algebra, and $V = \oplus_{\alpha \in h^*} V_\alpha$ a finite dimensional diagonalizable h-module. Let R(λ) be a meromorphic function $h^* \to \mathrm{End}(V \otimes V)$ of zero weight, such that R(λ) is invertible for a generic λ. Using R, we will now define an h-bialgebroid $A_R$ which we call the dynamical quantum group corresponding to R. This construction is analogous to the Faddeev–Reshetikhin–Sklyanin–Takhtajan construction of the quantum function algebra on $GL_N$.
As an algebra, $A_R$ by definition is generated by two copies of $M_{h^*}$ (embedded as subalgebras) and certain new generators, which are matrix elements of the operators $L^{\pm 1} \in \mathrm{End}(V) \otimes A_R$. We denote the elements of the first copy of $M_{h^*}$ as $f(\lambda^1)$ and of the second copy as $f(\lambda^2)$, where $f \in M_{h^*}$. We denote by $(L^{\pm 1})_{\alpha\beta}$ the weight components of $L^{\pm 1}$ with respect to the natural h-bigrading on End(V), so that $L^{\pm 1} = (L^{\pm 1}_{\alpha\beta})$, where $L^{\pm 1}_{\alpha\beta} \in \mathrm{Hom}_C(V_\beta, V_\alpha) \otimes A_R$. Then the defining relations for $A_R$ are:
$f(\lambda^1)L_{\alpha\beta} = L_{\alpha\beta}f(\lambda^1 + \gamma\alpha); \quad f(\lambda^2)L_{\alpha\beta} = L_{\alpha\beta}f(\lambda^2 + \gamma\beta); \quad [f(\lambda^1), g(\lambda^2)] = 0$; (4.4.1)
$LL^{-1} = L^{-1}L = 1$;
(4.4.2)
and the dynamical Yang–Baxter relation $R^{12}(\lambda^1)L^{13}L^{23} = {:}L^{23}L^{13}R^{12}(\lambda^2){:}$.
(4.4.3)
Here the :: sign (“normal ordering”) means that the matrix elements of L should be put on the right of the matrix elements of R. Thus, if $\{v_a\}$ is a homogeneous basis of V, and $L = \sum E_{ab} \otimes L_{ab}$, $R(\lambda)(v_a \otimes v_b) = \sum R^{ab}_{cd}(\lambda)v_c \otimes v_d$, then (4.4.3) has the form
$\sum R^{xy}_{ac}(\lambda^1)L_{xb}L_{yd} = \sum R^{bd}_{xy}(\lambda^2)L_{cy}L_{ax}$, (4.4.4)
where we sum over repeated indices. More precisely, the algebra $A_R$ is, by definition, the quotient of the algebra $\tilde A$ freely generated by $M_{h^*} \otimes M_{h^*}$ and elements $L_{ab}$, $(L^{-1})_{ab}$, a, b = 1, ..., dim V, by the ideal defined by relations (4.4.1)–(4.4.3).
Introduce the moment maps for $A_R$ by $\mu_l(f) = f(\lambda^1)$, $\mu_r(f) = f(\lambda^2)$, and the weight decomposition by $f(\lambda^1), f(\lambda^2) \in (A_R)_{00}$, $L_{\alpha\beta} \in \mathrm{Hom}_C(V_\beta, V_\alpha) \otimes (A_R)_{\alpha\beta}$. It is clear that $A_R$ equipped with such structures is an h-algebra.
Now define the coproduct on $A_R$, $\Delta : A_R \to A_R \tilde\otimes A_R$, by the usual Lie-theoretic formulas
$\Delta(L) = L^{12}L^{13}, \quad \Delta(L^{-1}) = (L^{-1})^{13}(L^{-1})^{12}$ (4.4.5)
(here Δ is applied to the second component of L, $L^{-1}$).
Proposition 4.2. Δ extends to a well defined homomorphism $A \to A \tilde\otimes A$.
Proof. From (4.4.5) we get
$\Delta(L_{\alpha\beta}) = \sum_\theta L^{12}_{\alpha\theta}L^{13}_{\theta\beta}$. (4.4.6)
So it remains to show that the defining relations of $A_R$ are invariant under Δ. The invariance of relations (4.4.1) follows directly from (4.4.6). Relation (4.4.2) is obviously invariant. To check the invariance of relation (4.4.3), we have to show that $R^{12}(\lambda^1_1)L^{13}L^{14}L^{23}L^{24} = {:}L^{23}L^{24}L^{13}L^{14}R^{12}(\lambda^2_2){:}$
(4.4.7)
(the subscripts 1, 2 under λ indicate that the corresponding functions are taken from the first and the second components of $A_R$ in the product $A_R \tilde\otimes A_R$; and, as before, the :: sign indicates that the functions of $\lambda_i$ are written on the left of the L-operators). We have
$$\begin{aligned}
R^{12}(\lambda^1_1)L^{13}L^{14}L^{23}L^{24} &= R^{12}(\lambda^1_1)L^{13}L^{23}L^{14}L^{24} = {:}L^{23}L^{13}R^{12}(\lambda^2_1){:}\,L^{14}L^{24} \\
&= L^{23}L^{13}R^{12}(\lambda^1_2)L^{14}L^{24} = L^{23}L^{13}\,{:}L^{24}L^{14}R^{12}(\lambda^2_2){:} = {:}L^{23}L^{24}L^{13}L^{14}R^{12}(\lambda^2_2){:}.
\end{aligned}$$ (4.4.8)
(We replaced $\lambda^2_1$ by $\lambda^1_2$ in the middle of (4.4.8) since $A_R \tilde\otimes A_R$ is by definition inside of the tensor product $A_R \otimes_{M_{h^*}} A_R$, where $M_{h^*}$ is mapped into the first component of $A_R$ by $\mu_r$ and into the second by $\mu_l$, acting from the left.) The proposition is proved.
Now define the counit on the algebra $A_R$. Recall that the counit has to be an algebra homomorphism $\varepsilon : A_R \to D_h$. Define the counit by the formula $\varepsilon(L_{\alpha\beta}) = \delta_{\alpha\beta}\,\mathrm{Id}_{V_\alpha} \otimes T_\alpha^{-1}$, $\varepsilon((L^{-1})_{\alpha\beta}) = \delta_{\alpha\beta}\,\mathrm{Id}_{V_\alpha} \otimes T_\alpha$,
(4.4.9)
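where $\mathrm{Id}_{V_\alpha} : V_\alpha \to V_\alpha$ is the identity operator. We need to check that the counit is well defined, i.e. that the defining relations are annihilated by it. For relations (4.4.1), (4.4.2) it is obvious. Relation (4.4.3) reduces to checking that
$\sum_{\alpha,\beta} \bigl(R^{12}(\lambda)(\mathrm{Id}_{V_\alpha} \otimes \mathrm{Id}_{V_\beta})\bigr) \otimes T_{\alpha+\beta}^{-1} = \sum_{\alpha,\beta} \bigl((\mathrm{Id}_{V_\alpha} \otimes \mathrm{Id}_{V_\beta})R^{12}(\lambda)\bigr) \otimes T_{\alpha+\beta}^{-1}$, (4.4.10)
which holds because R has zero weight.
Proposition 4.3. The counit axiom $(\mathrm{Id} \otimes \varepsilon) \circ \Delta = (\varepsilon \otimes \mathrm{Id}) \circ \Delta = \mathrm{Id}$ is satisfied for $A_R$.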
Proof. We need to check the relations on L. These relations follow from the fact that the elements $T_\alpha^{-1} \otimes L_{\alpha\beta}$, $L_{\alpha\beta} \otimes T_\beta^{-1}$ are mapped to $L_{\alpha\beta}$ under the natural isomorphisms $D_h \tilde\otimes A_R \to A_R$, $A_R \tilde\otimes D_h \to A_R$.
Thus, $A_R$ is an h-bialgebroid. We will call it the dynamical quantum group corresponding to the function R. It is also possible to consider the algebra generated by $f(\lambda^1)$, $f(\lambda^2)$, L (without $L^{-1}$). Denote this algebra by $\bar A_R$. The algebra $\bar A_R$ is an h-bialgebroid, which is naturally mapped to $A_R$.
Remark. The algebra $\bar A_R$ was introduced in [FV1] under the name of “the operator algebra”.
4.5. The antipode on $A_R$. Let A, B be algebras with 1. For X ∈ B ⊗ A, define i(X) to be the inverse of X, and $i_*(X)$ to be the inverse of X in the algebra $B \otimes A^{op}$, where $A^{op}$ is A with the reversed order of multiplication. Clearly, $i^2 = i_*^2 = \mathrm{Id}$. Let I be the group generated by $i, i_*$ with the defining relations $i^2 = i_*^2 = \mathrm{Id}$. We will say that an element X is strongly invertible if for any g ∈ I the element g(X) is well defined.
Definition. An invertible, weight zero matrix function R is said to be rigid if the element $L \in \mathrm{End}(V) \otimes A_R$ is strongly invertible.
Proposition 4.4. R is rigid if and only if $A_R$ admits an antipode S such that $S(L) = L^{-1}$. In this case, $S^{2n}(L) = (i_* i)^n(L)$, $S^{2n+1}(L) = i(i_* i)^n(L)$. In particular, $S(L^{-1}) = i_* i(L)$.
Proof. Suppose that R is rigid. Extend the definition of the antipode by $S(L^{-1}) = i_*(L^{-1}) = i_* i(L)$. It is easy to see that the relations of $A_R$ are preserved, so this indeed defines an antihomomorphism S : A → A. Moreover, S is an isomorphism: the inverse is given by $S^{-1}(L^{-1}) = L$, $S^{-1}(L) = i_*(L)$. Now suppose that S is defined. Then it is easy to check that $(i_* i)^n(L) = S^{2n}(L)$, $i(i_* i)^n(L) = S^{2n+1}(L)$, n ∈ Z. This defines g(L) for all g ∈ I. The proposition is proved.
Remark 1. The proposition shows that for rigidity of R, it is sufficient that $i_*(L)$ and $i_*(L^{-1})$ be defined.
Remark 2. Observe that in general $S^2 \ne 1$.
Thus, if R is rigid then $A_R$ is an h-Hopf algebroid.
4.6. Representation theory of $A_R$. Now consider the representation theory of $A_R$. As was pointed out in [FV1], the category $\mathrm{Rep}(A_R)$ of dynamical representations of $A_R$ is tautologically isomorphic to the category Rep(R) of representations of R.
Proposition 4.5. The tensor categories $\mathrm{Rep}(A_R)$ and Rep(R) are equivalent.
Proof. Define the functor $\Gamma : \mathrm{Rep}(A_R) \to \mathrm{Rep}(R)$ to be the identity at the level of vector spaces, and set
$L_{\Gamma(W)} = \pi^0_W(L)$. (4.6.1)
Define the functor $\Gamma^{-1} : \mathrm{Rep}(R) \to \mathrm{Rep}(A_R)$ by
$\pi^0_{\Gamma^{-1}(W)}(L) = L_W$. (4.6.2)
These functors preserve tensor structure, and are obviously inverse to each other. The proposition is proved.
It is easy to see that the functor Γ commutes with the duality functors. Therefore, if R is rigid, then the representations $W^*$, ${}^*W$ of R are well defined for any W, and the category $\mathrm{Rep}_f(R)$ of finite-dimensional representations of R (= the category $\mathrm{Rep}_f(A_R)$ of finite dimensional dynamical representations of $A_R$) is a rigid tensor category [DM]. This explains our use of the word “rigid”.
Although $A_R$ is an h-Hopf algebroid for any rigid zero weight function R, it does not always have nice properties. For a generic R, this algebra will be very small and will not have interesting dynamical representations. However, if R is a dynamical quantum R-matrix, then the category Rep(R) is nontrivial (it contains the basic representation defined in Chapter 3), so by Proposition 4.5 the category $\mathrm{Rep}(A_R)$ is also nontrivial. Thus, algebras $A_R$ with R being a dynamical quantum R-matrix form a good class of h-Hopf algebroids. From now on we will only consider $A_R$ for R being a dynamical quantum R-matrix.
4.7. Sufficient conditions for rigidity. Unfortunately, the definition of rigidity cannot be effectively checked, since it depends on the properties of the algebra $A_R$, about whose structure we do not know very much. Therefore, we would like to find some effective sufficient conditions of rigidity.
For any function $X : h^* \to \mathrm{End}(V \otimes V)$, define the function $\tilde X : h^* \to \mathrm{End}(V \otimes V)$ as follows. Suppose that for v, w ∈ V one has $X(\lambda)(v \otimes w) = \sum_i f_i(\lambda)v_i \otimes w_i$, where $f_i \in M_{h^*}$ and $w_i$ are homogeneous. Then set $\tilde X(\lambda)(v \otimes w) = \sum_i f_i(\lambda + \gamma\,\mathrm{wt}(w_i))v_i \otimes w_i$, where $\mathrm{wt}(w_i)$ denotes the weight of $w_i$.
Let R be a dynamical quantum R-matrix with step γ. Assume that $i_*(\tilde R)$ is defined. Let us write $\tilde R$ in the form $\tilde R = \sum a_i \otimes b_i$, and $i_*(\tilde R)$ in the form $i_*(\tilde R) = \sum c_i \otimes d_i$. Define the operators $Q = \sum d_i c_i$, $Q' = \sum c_i d_i : h^* \to \mathrm{End}(V)$.
These operators are of weight zero with respect to h, since R is of weight zero.
Proposition 4.6. Suppose R is such that $i_*(\tilde R)$ is defined, and R satisfies the following conditions: (i) The operator Q is invertible for a generic λ. (ii) The operator Q′ is invertible for a generic λ. Then R is rigid, and $i_*(L^{-1}) = S^2(L) = {:}(Q(\lambda^1) \otimes 1)L(Q^{-1}(\lambda^2) \otimes 1){:} = {:}(Q'(\lambda^1 + \gamma h)^{-1} \otimes 1)L(Q'(\lambda^2 + \gamma h) \otimes 1){:}$.
(4.7.1)
Remark. It is clear that (i) and (ii) are satisfied for R = 1 and are open conditions. Therefore, Proposition 4.6 shows that if $R_\gamma$ is a continuous family of quantum dynamical R-matrices with step γ such that $R_0 = 1$, then $R_\gamma$ is rigid for small γ.
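For the case R = 1 mentioned in the Remark, conditions (i) and (ii) can be checked by hand. The short computation below is only a sketch under our reading of the definitions of $\tilde R$, Q and Q′ in this section; writing each tensor as a single term is our choice of presentation.

```latex
\begin{align*}
R = 1 \;&\Longrightarrow\; \tilde R = 1 = 1 \otimes 1, \qquad i_*(\tilde R) = 1 \otimes 1
        \quad (a_1 = b_1 = c_1 = d_1 = 1), \\
Q &= d_1 c_1 = 1, \qquad Q' = c_1 d_1 = 1 .
\end{align*}
```

Both operators are then invertible for every λ, and the formula of Proposition 4.6 collapses to $i_*(L^{-1}) = S^2(L) = L$ in this case.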
Proof. First of all, let us deduce a commutation relation between L and $L^{-1}$. Multiplying the dynamical Yang–Baxter equation by $(L^{-1})^{23}$ on the right, we get
$R^{12}(\lambda^1)L^{13} = {:}L^{23}L^{13}R^{12}(\lambda^2)(L^{23})^{-1}{:}$. (4.7.2)
Let $\{v_a\}$ be an h-homogeneous basis of V, and $L = \sum E_{ab} \otimes L_{ab}$. Denote by $\omega_a$ the weight of $v_a$. Then we have
$$\begin{aligned}
{:}L^{23}L^{13}R^{12}(\lambda^2){:}\,(L^{23})^{-1} &= \sum E^{(2)}_{ab}\,{:}L^{(3)}_{ab}L^{13}R^{12}(\lambda^2)(L^{23})^{-1}{:} \\
&= \sum E^{(2)}_{ab}L^{(3)}_{ab}\,{:}L^{13}R^{12}((\lambda + \gamma\omega_b)^2)(L^{23})^{-1}{:} \\
&= \sum E^{(2)}_{ab}L^{(3)}_{ab}\,{:}L^{13}\tilde R^{12}(\lambda^2)(L^{23})^{-1}{:} = L^{23}\,{:}L^{13}\tilde R^{12}(\lambda^2){:}\,(L^{23})^{-1}.
\end{aligned}$$ (4.7.3)
Therefore, multiplying (4.7.2) on the left by $(L^{23})^{-1}$ we get
$(L^{23})^{-1}\,{:}R^{12}(\lambda^1)L^{13}{:} = {:}L^{13}\tilde R^{12}(\lambda^2)(L^{23})^{-1}{:}$. (4.7.4)
Transforming the left hand side of this equation similarly to (4.7.3), we arrive at the equation
${:}(L^{23})^{-1}\tilde R^{12}(\lambda^1)L^{13}{:} = {:}L^{13}\tilde R^{12}(\lambda^2)(L^{23})^{-1}{:}$, (4.7.5)
which is the desired commutation relation. Now, using property (i), define
$T = {:}(Q(\lambda^1) \otimes 1)L(Q^{-1}(\lambda^2) \otimes 1){:} \in \mathrm{End}(V) \otimes A_R$. (4.7.6)
Let ∗ denote the product in the algebra $\mathrm{End}(V) \otimes (A_R)^{op}$. Let us compute the product $L^{-1} * T$. Set $L^{-1} = \sum E_{ab} \otimes (L^{-1})_{ab}$. Then we get
$L^{-1} * T = \sum (E_{pq}Q(\lambda^2)E_{rs}Q^{-1}(\lambda^1) \otimes 1)(1 \otimes L_{rs}(L^{-1})_{pq})$. (4.7.7)
Using (4.7.5), we can rewrite (4.7.7) in the form
$L^{-1} * T = \sum (d_i(\lambda^2)E_{rs}b_j(\lambda^1)Q(\lambda^1)a_j(\lambda^1)E_{pq}c_i(\lambda^2)Q^{-1}(\lambda^2) \otimes 1)(1 \otimes L_{rs}(L^{-1})_{pq})$. (4.7.8)
Using the definition of Q, we have
$\sum b_i Q a_i = 1$. (4.7.9)
Substituting (4.7.9) into (4.7.8), we get $L^{-1} * T = 1$. Now, using property (ii), define
$T' = {:}(Q'(\lambda^1 + \gamma h)^{-1} \otimes 1)L(Q'(\lambda^2 + \gamma h) \otimes 1){:}$. (4.7.10)
Then, analogously to the above, we get $T' * L^{-1} = 1$. Thus, $T = T' = i_*(L^{-1})$. It is easy to see that
$i_*(L) = {:}(Q^{-1}(\lambda^2) \otimes 1)L(Q(\lambda^1) \otimes 1){:}$. (4.7.11)
Thus, R is rigid.
Now we will show that any rigid quantum dynamical R-matrix satisfies a certain crossing symmetry condition. For an invertible zero weight function $X(\lambda) \in \mathrm{End}(V \otimes V)$, set $\tau(X)(\lambda) = X^{-1}(\lambda + \gamma h^{(2)})^{t_2}$.
(4.7.12)
Corollary 4.1. Let R be a rigid quantum dynamical R-matrix on V. Then τ(R) is invertible, and R satisfies the crossing symmetry condition $\tau^2(R) = (Q(\lambda - \gamma h^{(2)}) \otimes 1)R(\lambda)(Q^{-1}(\lambda) \otimes 1)$.
(4.7.13)
Proof. It is clear that τ 2 (R) = LV ∗∗ , where V is the basic representation of R. Therefore, using (4.7.1) in the basic representation, we get (4.7.12). 4.8. Dynamical quantum groups associated to dynamical R-matrices of glN type. Now suppose that R is a dynamical R-matrix of glN -type. Then it has form (1.3.2), and we can write the defining relations for AR more explicitly. Since all weight subspaces of V are 1-dimensional, weP have (L±1 )αβ ∈ A. For brevity we write (L±1 )ab for (L±1 )ωa ωb . ±1 Eab ⊗ (L±1 )ab . Thus, we have L = In this notation, the defining relations for AR look like LL−1 = L−1 L = 1, f (λ1 )Lbc = Lbc f (λ1 + γωb ), f (λ2 )Lbc = Lbc f (λ2 + γωc ), αst (λ2 ) Lat Las , s 6= t, 1 − βts (λ2 ) (4.8.1) αab (λ1 ) L L , a = 6 b, Lbs Las = as bs 1 − βab (λ1 ) αab (λ1 )Las Lbt − αst (λ2 )Lbt Las = (βts (λ2 ) − βab (λ1 ))Lbs Lat , a 6= b, s 6= t, Las Lat =
where $\alpha_{ab}$, $\beta_{ab}$ are the functions from (1.3.2).

Remark. It is also possible to define dynamical quantum groups associated with dynamical R-matrices with spectral parameter. It is done analogously to the above, and we will do it in detail in a forthcoming paper. For example, if $R(z, \lambda)$ is a quantum dynamical R-matrix with spectral parameter of elliptic type (i.e. of the form (2.5.1)), we will get the elliptic quantum group defined in [F1, F2, FV1, FV2]. Relations (4.8.1) (for dynamical R-matrices of $gl_N$ Hecke type) can be obtained as a limiting case of the defining relations for the elliptic quantum group.

4.9. Rigidity of the rational and the trigonometric dynamical R-matrix. Consider the trigonometric dynamical R-matrix $R(\lambda)$ defined by (1.6.4), with $X = \{1, ..., N\}$, and $\mu_{ab} = 1$.

Proposition 4.7. $R(\lambda)$ is rigid, and the matrices $Q$, $Q'$ are given by the formulas
$$Q = \mathrm{diag}(Q_1, ..., Q_N), \quad Q' = \mathrm{diag}(Q'_1, ..., Q'_N), \quad Q_a(\lambda) = \prod_{i \ne a} \frac{q^{1+\lambda_i} - q^{\lambda_a}}{q^{\lambda_i} - q^{\lambda_a}}, \quad Q'_a(\lambda) = q\, Q_a^{-1}(\lambda), \tag{4.9.1}$$
where q = e .
Proof. First of all, it is not hard to show by a direct computation that the matrix $i^*(\tilde R)$ is defined. So it remains to show that the elements $Q$, $Q'$ are invertible. Let $P(\lambda) = Q'(\lambda + \gamma h)$. Let $P_i$, $Q_i$ be the diagonal entries of $P$, $Q$. As we know, these entries are defined by the following systems of linear equations:
$$Q_a + \sum_{b \ne a} \beta_{ab}(\lambda + \gamma\omega_a)\, Q_b = 1, \qquad P_a + \sum_{b \ne a} \beta_{ba}(\lambda + \gamma\omega_b)\, P_b = 1. \tag{4.9.2}$$
The explicit form of the systems (4.9.2) is
$$Q_a + \sum_{b \ne a} \frac{q - 1}{q^{1+\lambda_a-\lambda_b} - 1}\, Q_b = 1, \qquad P_a + \sum_{b \ne a} \frac{q - 1}{q^{1+\lambda_b-\lambda_a} - 1}\, P_b = 1.$$
Thus, if one of these systems is nondegenerate (which we show below) then $Q(\lambda) = P(-\lambda)$. From now on we consider only the first system. Note that it can be conveniently written as
$$\sum_b \frac{q - 1}{q^{1+\lambda_a-\lambda_b} - 1}\, Q_b = 1. \tag{4.9.3}$$
Define $X_b = q^{\lambda_b} Q_b$. Then (4.9.3) can be written as
$$\sum_b \frac{1}{[1 + \lambda_a] - [\lambda_b]}\, X_b = 1, \tag{4.9.4}$$
where $[x] = \frac{q^x - 1}{q - 1}$, so that $[1 + \lambda_a] - [\lambda_b] = (q^{1+\lambda_a} - q^{\lambda_b})/(q - 1)$. Thus, the vector $X$ is defined by $X = C^{-1}\mathbf{1}$, where $C_{ab} = \frac{1}{[1+\lambda_a] - [\lambda_b]}$, and $\mathbf{1}$ is the vector whose components are all equal to 1. To invert the matrix $C$, we use the well known combinatorial identity (which is called the “Bose-Fermi correspondence” in physics):
$$\det\left(\frac{1}{x_i - y_j}\right) = \frac{\prod_{i < j} (x_i - x_j) \prod_{i < j} (y_j - y_i)}{\prod_{i,j} (x_i - y_j)}. \tag{4.9.5}$$
Applying this identity to $x_i = [1 + \lambda_i]$, $y_i = [\lambda_i]$, and using the usual rule of inverting matrices, we get
$$(C^{-1})_{ab} = \frac{\prod_{(i,j):\, i=b \text{ or } j=a} (x_i - y_j)}{\prod_{j \ne b} (x_b - x_j) \prod_{i \ne a} (y_i - y_a)}. \tag{4.9.6}$$
In particular,
$$X_a = \sum_b (C^{-1})_{ab} = \frac{\prod_i (x_i - y_a)}{\prod_{i \ne a} (y_i - y_a)} \sum_b \frac{\prod_{j \ne a} (x_b - y_j)}{\prod_{j \ne b} (x_b - x_j)}. \tag{4.9.7}$$
Claim 1.
$$\sum_b \frac{\prod_{j \ne a} (x_b - y_j)}{\prod_{j \ne b} (x_b - x_j)} = 1. \tag{4.9.8}$$
Proof of the claim. Consider the expression on the l.h.s. of (4.9.8) as a rational function of $z = x_a$ for fixed $x_b$, $b \ne a$. This function has no more than simple poles at $x_b$, $b \ne a$, and no other singularities; it equals 1 at infinity. Thus, it suffices to show that its residues vanish, which is obvious: only two terms contribute to each residue, and these two terms cancel each other.

Thus, we get:
$$Q_a(\lambda) = q^{-\lambda_a}\, \frac{\prod_i ([1 + \lambda_i] - [\lambda_a])}{\prod_{i \ne a} ([\lambda_i] - [\lambda_a])}, \tag{4.9.9}$$
i.e.
$$Q_a(\lambda) = \prod_{i \ne a} \frac{q^{1+\lambda_i} - q^{\lambda_a}}{q^{\lambda_i} - q^{\lambda_a}}, \qquad P_a(\lambda) = \prod_{i \ne a} \frac{q^{1-\lambda_i} - q^{-\lambda_a}}{q^{-\lambda_i} - q^{-\lambda_a}}. \tag{4.9.10}$$
Therefore,
$$Q'_a(\lambda) = P_a(\lambda - \omega_a) = \prod_{i \ne a} \frac{q^{1-\lambda_i} - q^{1-\lambda_a}}{q^{-\lambda_i} - q^{1-\lambda_a}} = \prod_{i \ne a} \frac{q^{1+\lambda_i} - q^{1+\lambda_a}}{q^{1+\lambda_i} - q^{\lambda_a}} = q\, Q_a^{-1}(\lambda). \tag{4.9.11}$$
Thus, $R$ is rigid, and $Q$, $Q'$ are given by formula (4.9.1). The proposition is proved.
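The two computational steps of the proof above can be checked numerically. The following is a minimal sketch in exact rational arithmetic, under the assumption that `u[i]` stands for the value of $q^{\lambda_i}$; the function names and sample values are illustrative and do not come from the paper. It checks that the product formula (4.9.10) for $Q_a$ solves the linear system (4.9.3), and that the sum in Claim 1 equals 1.

```python
from fractions import Fraction
from math import prod

def q_entries(u, q):
    """Product formula (4.9.10): Q_a = prod_{i != a} (q*u_i - u_a) / (u_i - u_a),
    where u_i is a stand-in for q**lambda_i (an assumption of this sketch)."""
    n = len(u)
    return [prod(((q * u[i] - u[a]) / (u[i] - u[a]) for i in range(n) if i != a),
                 start=Fraction(1))
            for a in range(n)]

def system_lhs(u, q, Q):
    """Left-hand sides of (4.9.3): sum_b (q - 1) / (q*u_a/u_b - 1) * Q_b, one per a."""
    n = len(u)
    return [sum((q - 1) / (q * u[a] / u[b] - 1) * Q[b] for b in range(n))
            for a in range(n)]

def claim1_lhs(x, y, a):
    """Left-hand side of (4.9.8) for a fixed index a."""
    n = len(x)
    return sum(
        prod((x[b] - y[j] for j in range(n) if j != a), start=Fraction(1))
        / prod((x[b] - x[j] for j in range(n) if j != b), start=Fraction(1))
        for b in range(n)
    )

q = Fraction(3)
u = [Fraction(1), Fraction(2), Fraction(5), Fraction(7)]  # sample values of q**lambda_i
Q = q_entries(u, q)
assert system_lhs(u, q, Q) == [1, 1, 1, 1]

# The sum in Claim 1 does not depend on the special form of x and y:
# it is checked here for arbitrary distinct x's and arbitrary y's.
x = [Fraction(2), Fraction(3), Fraction(5), Fraction(11)]
y = [Fraction(1, 2), Fraction(-4), Fraction(7), Fraction(9)]
assert all(claim1_lhs(x, y, a) == 1 for a in range(4))
```

Exact fractions are used instead of floating point so that the identities are tested as equalities rather than up to rounding.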
An analogous theorem holds for the rational dynamical R-matrix (1.5.1) (with $X = \{1, ..., N\}$ and $\mu_{ab} = 0$). The formulas for $Q$, $Q'$ for such $R$ are obtained from (4.9.1) as $q \to 1$. It is easy to show that the property of rigidity is preserved by gauge transformations, so we get

Corollary 4.2. Any quantum dynamical R-matrix $R$ of $gl_N$ Hecke type is rigid.

Clearly, the elements $Q$, $Q'$ for any such $R$ can be easily computed from (4.9.1).

5. H-Biequivariant Hopf Algebroids

In this chapter we generalize the notions of an h-algebra, h-bialgebroid, h-Hopf algebroid to the case when the Lie algebra $\mathfrak{h}$ is not necessarily commutative, and define quantum counterparts of the quasiclassical notions introduced in Chapters 1-2 of [EV]. We will define the notions of an H-biequivariant Hopf algebroid and quantum groupoid. The notion of an H-biequivariant quantum groupoid is a quantum analogue of the notion of an H-biequivariant Poisson groupoid, introduced in [EV]. We will also introduce less general notions of a dynamical quantum groupoid and Hopf algebroid, which are quantum analogues of the notions of a dynamical Poisson groupoid and Hopf algebroid.

In this chapter we will work mostly in the setting of perturbation theory. That is, quantum objects will be defined over $k[[\hbar]]$, where $k$ is some field, and give classical objects modulo $\hbar$ and quasiclassical ones modulo $\hbar^2$. We discuss the relationship between the quasiclassical and quantum objects, and questions regarding quantization.
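To make the phrase "classical objects modulo $\hbar$ and quasiclassical ones modulo $\hbar^2$" concrete, here is a minimal sketch of one standard example: the standard-ordered star product on polynomials in $x$ and $p$ (the symbol calculus of differential operators on the line, with $p$ corresponding to $\hbar\, d/dx$). The data representation, the function names, and the sign convention $\{p, x\} = 1$ are assumptions of this sketch, not notation from the paper.

```python
from math import comb, perm

# A polynomial in x, p and hbar is a dict {(i, j, k): coeff}
# standing for the sum of coeff * x**i * p**j * hbar**k.

def clean(f):
    return {m: c for m, c in f.items() if c != 0}

def add(f, g, sign=1):
    out = dict(f)
    for m, c in g.items():
        out[m] = out.get(m, 0) + sign * c
    return clean(out)

def times(f, g):
    """Ordinary commutative product."""
    out = {}
    for (a, b, k), cf in f.items():
        for (c, d, l), cg in g.items():
            m = (a + c, b + d, k + l)
            out[m] = out.get(m, 0) + cf * cg
    return clean(out)

def star(f, g):
    """Standard-ordered product: sum_n hbar**n / n! * (d/dp)**n f * (d/dx)**n g."""
    out = {}
    for (a, b, k), cf in f.items():
        for (c, d, l), cg in g.items():
            for n in range(min(b, c) + 1):
                m = (a + c - n, b + d - n, k + l + n)
                # comb(b, n) * perm(c, n) = (1/n!) * b!/(b-n)! * c!/(c-n)!
                out[m] = out.get(m, 0) + cf * cg * comb(b, n) * perm(c, n)
    return clean(out)

def diff(f, var):
    """Partial derivative in x (var=0) or in p (var=1)."""
    out = {}
    for (a, b, k), c in f.items():
        e = (a, b)[var]
        if e:
            m = (a - 1, b, k) if var == 0 else (a, b - 1, k)
            out[m] = out.get(m, 0) + e * c
    return clean(out)

def poisson(f, g):
    """Poisson bracket with the convention {p, x} = 1."""
    return add(times(diff(f, 1), diff(g, 0)), times(diff(f, 0), diff(g, 1)), -1)

def order(f, k):
    """Coefficient of hbar**k, returned as a polynomial in x and p."""
    return {(a, b, 0): c for (a, b, kk), c in f.items() if kk == k}

f = {(2, 1, 0): 1, (0, 2, 0): 3}   # x^2 p + 3 p^2
g = {(1, 2, 0): 1, (1, 0, 0): -1}  # x p^2 - x
h = {(3, 0, 0): 1, (0, 1, 0): 1}   # x^3 + p

assert star(star(f, g), h) == star(f, star(g, h))                  # associativity
assert order(star(f, g), 0) == times(f, g)                         # classical limit (mod hbar)
assert order(add(star(f, g), star(g, f), -1), 1) == poisson(f, g)  # quasiclassical limit
```

In this example the product reduces to the commutative one modulo $\hbar$, and the $\hbar$-linear part of the commutator is the Poisson bracket, which is the pattern the definitions of this chapter formalize.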
5.1. Quantization of Poisson algebras. In this section we will recall some well known facts from the theory of deformation quantization. Let $k$ be a field of characteristic zero. Let $K = k[[\hbar]]$. By a topologically free K-module we mean a K-module of the form $V[[\hbar]]$, where $V$ is a k-vector space. All K-modules we will use will be topologically free. By tensor product of two such modules we will always mean completed tensor product over $K$.

Let $A_0$ be a commutative algebra over $k$ with 1. Recall that according to Grothendieck, a linear operator $D : A_0 \to A_0$ is a differential operator of order $\le N$, $N \ge 1$ if for any $a \in A_0$ the operator $f \to D(af) - aDf$ is a differential operator of order $\le N - 1$, and a differential operator of order 0 is the operator of multiplication by an element of $A_0$. If $A_0$ is the algebra of regular functions on a manifold (smooth, analytic, algebraic, formal) then “differential operator of order N” means what it usually means in geometry.

Let $A_0$ be a Poisson algebra over $k$ with 1, with Poisson bracket $\{, \}$. Recall that by a quantization of $A_0$ is meant a K-module $A = A_0[[\hbar]]$ equipped with a K-linear binary operation $* : A \otimes A \to A$, which defines an associative algebra structure on $A$, such that $A/\hbar A = A_0$ as an algebra, and $\frac{1}{\hbar}(f * g - g * f) \bmod \hbar = \{f, g\}$, $f, g \in A_0 \subset A$. In this case $A_0$ is called the quasiclassical limit of $A$. Let $f, g \in A_0$. Then
$$f * g = fg + \hbar c_1(f, g) + \hbar^2 c_2(f, g) + ..., \tag{5.1.1}$$
where $c_i : A_0 \otimes A_0 \to A_0$ are linear maps. A quantization defined by (5.1.1) is called local if $c_i(f, g)$ is a differential operator in $f$ and $g$ for any $i$. If $A_0$ is the algebra $O(X)$ of regular functions on a smooth manifold $X$, and $A$ is a local quantization of $A_0$, then $A$ defines (by formula (5.1.1)) a quantization $A_U$ of the algebra $(A_U)_0 = O(U)$ of regular functions on any open subset $U$ of $X$. In other words, it defines a quantization of the sheaf of regular functions. This holds also in the holomorphic and algebraic situations, if $X$ is affine.

Let $X$ be a manifold, and let $T^*X$ be its cotangent bundle. Let $A_0 = O(T^*X)_p$ be the Poisson algebra of regular functions on $T^*X$ which are fiberwise polynomial of a uniformly bounded degree. This Poisson algebra has a distinguished quantization $A = O_q(T^*X)_p$ called the canonical quantization ($q$ is not a parameter here but the first letter of the word “quantum”). Namely, $A$ is the algebra of formal series of the form $\sum_{n \ge 0} \hbar^n D_n$, where $D_n$ are differential operators on $X$, such that $n \ge \mathrm{order}(D_n)$, and $n - \mathrm{order}(D_n) \to +\infty$, as $n \to \infty$. It is easy to check that this quantization is local, so it defines a quantization $A_U = O_q(U)$ of the Poisson algebra $(A_U)_0 = O(U)$ of regular functions on an open subset $U \subset T^*X$.

Let $\mathfrak{g}$ be a Lie algebra, and $\mathfrak{g}^*$ be its dual space, with the usual Poisson structure. Consider the Poisson algebra $O(\mathfrak{g}^*)_p$ of polynomial functions on $\mathfrak{g}^*$. This algebra has a distinguished quantization $A = O_q(\mathfrak{g}^*)_p$, called the geometric quantization. Namely, $A$ is the algebra of formal series of the form $\sum_{n \ge 0} \hbar^n D_n$, where $D_n \in U(\mathfrak{g})$, $n \ge \mathrm{order}(D_n)$, and $n - \mathrm{order}(D_n) \to +\infty$, $n \to \infty$. It is easy to check that this quantization is local, so it defines a quantization $A_U = O_q(U)$ of the Poisson algebra $(A_U)_0 = O(U)$ of regular functions on an open subset $U \subset \mathfrak{g}^*$.

5.2. H-biequivariant associative algebras. In this section we will introduce the notion of an H-biequivariant associative algebra.
This notion is a quantum analogue of the notion of an H-biequivariant Poisson algebra, introduced in a previous paper [EV]. Let $A$ be an associative algebra over $K$ with 1, which is commutative mod $\hbar$, $H$ a connected affine algebraic group over $k$, and $\psi : A \times H \to A$ be a right algebraic action of $H$ on $A$ by automorphisms, defined over $k$. This means that $A$, as a representation of $H$, has the form $A_0[[\hbar]]$, where $A_0$ is a sum of finite dimensional representations of $H$ over $k$. Let $\mathfrak{h}$ be the Lie algebra of $H$. Let $U \subset \mathfrak{h}^*$ be an H-invariant open set. A homomorphism $\mu : O_q(U) \to A$ is called a quantum moment map for $\psi$ if for any linear function on $U$ given by $a \in \mathfrak{h}$ and any $f \in A$ we have
$$[\mu(a), f] = \hbar\, d\psi|_{h=1}(a, f). \tag{5.2.1}$$
Here $d\psi|_{h=1} : \mathfrak{h} \times A \to A$ is the differential of $\psi$ at $h = 1 \in H$. Using the Leibniz identity for the operator $g \to [\mu(g), f]$, from (5.2.1) one can compute $[\mu(g), f]$ for any rational function $g$. For a left action of $H$ a quantum moment map is defined in the same way, with the only difference that it is an anti-homomorphism rather than a homomorphism.

Definition. An H-biequivariant associative algebra over $U$ is a 5-tuple $(A, l, r, \mu_l, \mu_r)$, where $A$ is an associative algebra with 1 over $K$, which is commutative mod $\hbar$, $l, r$ is a pair of commuting algebraic actions of $H$ on $A$ (a left action and a right action) by algebra automorphisms, defined over $k$, and $\mu_l, \mu_r : O_q(U) \to A$ are quantum moment maps for $l, r$, such that
(i) $\mu_l$, $\mu_r$ are embeddings, and their images commute;
(ii) There exists an $l(H) \times r(H)$-invariant k-subspace $A^l_0$ of $A$ such that the multiplication map $\mu_r(O_q(U)) \otimes A^l_0 \to A$ is a linear isomorphism; there exists an $l(H) \times r(H)$-invariant k-subspace $A^r_0$ of $A$ such that the multiplication map $\mu_l(O_q(U)) \otimes A^r_0 \to A$ is a linear isomorphism.
A morphism of H-biequivariant associative algebras over $U$ is a morphism of algebras which preserves $l, r$ and $\mu_l, \mu_r$.

Remark 1. From $[l, r] = 0$ it follows that $[\mu_l \circ x, \mu_r \circ y]$ is a central element for $x, y \in \mathfrak{h}$, but it does not follow that this commutator equals 0. So we require that it is zero by condition (i).

Remark 2. Condition (ii) is of a technical nature and is not very important in the discussion below.

Denote the category of H-biequivariant associative algebras over $U$ by $A^q_U$ ($q$ stands for “quantum”). For convenience we will write $l(h)a$ as $ha$ and $r(h)a$ as $ah$. Let us now describe the monoidal structure on $A^q_U$. Let $A, B \in A^q_U$. Then the group $H$ acts in $A \otimes B$ by $\Delta(h)(a \otimes b) = ah^{-1} \otimes hb$. We will construct a new H-biequivariant associative algebra $A \widetilde{\otimes} B$, which is obtained by quantum Hamiltonian reduction of $A \otimes B$ by the action of $H$.

Denote by $A * B$ the space $A \otimes_{O_q(U)} B$, where $O_q(U)$ is mapped to $A$ via $\mu^A_r$, and to $B$ via $\mu^B_l$, acting in both algebras from the left. Then $A * B$ is the quotient of $A \otimes B$ by the linear span $I$ of elements of the form $\mu^A_r(f)a \otimes b - a \otimes \mu^B_l(f)b$, $f \in O_q(U)$, $a \in A$, $b \in B$. The space $A * B$ has two commuting actions of $H$ ($l_A \otimes 1$ and $1 \otimes r_B$). But we cannot claim that $A * B \in A^q_U$, since the algebra structure on $A \otimes B$ does not, in general, descend to $A * B$ ($I$ is only a right ideal and not necessarily a left ideal). However, the action $\Delta$ of $H$ on $A \otimes B$ descends to one on $A * B$, so we can define $A \widetilde{\otimes} B := (A * B)^H$, where $H$ acts by $\Delta$.
Proposition 5.1. The algebra structure on $A \otimes B$ descends to one on $A \widetilde{\otimes} B$.

Proof. Let $x, y \in A \widetilde{\otimes} B$. We can regard $x, y$ as elements of $A * B$. Choose their liftings $X = \sum a_i \otimes b_i$, $Y = \sum c_i \otimes d_i$ into $A \otimes B$. By definition, $xy$ is the image of $XY$ in $A * B$. We have to check two things.
1. That $xy$ is H-invariant.
2. That $xy$ does not depend on the choice of liftings $X, Y$.
First we check property 1. Since $x, y$ are H-invariant, we have
$$\begin{aligned}
& \sum [\mu^A_r(z), a_i] \otimes b_i + \sum a_i \otimes [\mu^B_l(z), b_i] \in I, \\
& \sum [\mu^A_r(z), c_i] \otimes d_i + \sum c_i \otimes [\mu^B_l(z), d_i] \in I, \quad z \in \mathfrak{h}.
\end{aligned} \tag{5.2.2}$$
Therefore, since $I$ is a right ideal,
$$\sum [\mu^A_r(z), a_i c_j] \otimes b_i d_j + \sum a_i c_j \otimes [\mu^B_l(z), b_i d_j] \in XI + I. \tag{5.2.3}$$

Lemma 5.1. If $X$ is H-invariant modulo $I$, then $XI \subset I$.

Proof of the Lemma. Since $\sum c_j \otimes d_j$ is H-invariant modulo $I$, for any $z \in \mathfrak{h}$ we have
$$\sum c_j \mu^A_r(z) \otimes d_j - \sum c_j \otimes d_j \mu^B_l(z) \in I. \tag{5.2.4}$$
Therefore, the same equality holds for any rational function $g \in O_q(U)$ instead of $z$. This proves the lemma.

The Lemma shows that the RHS of (5.2.3) is in $I$, i.e. $xy$ is H-invariant. Now we check property 2. If $X'$, $Y'$ are any other liftings of $x$ and $y$, then $X - X' \in I$, and $Y - Y' \in I$. So it remains to show that $X(Y - Y') \in I$. But this follows from the lemma.

Thus, we have shown that the product descends to $A \widetilde{\otimes} B$. The two commuting actions of $H$ on $A \otimes B$ by $(h_1, h_2)(a \otimes b) = h_1 a \otimes b h_2$, and the corresponding quantum moment maps descend to $A \widetilde{\otimes} B$. So, in order to check that $A \widetilde{\otimes} B \in A^q_U$, it suffices to check properties (i) and (ii).

Using properties (i) and (ii) of the quantum moment maps $\mu^A_l$, $\mu^A_r$, $\mu^B_l$, $\mu^B_r$, it is easy to see that $A * B$ is naturally identified with $\mu^A_l(O_q(U)) \otimes A^r_0 \otimes B^r_0$, and $A \widetilde{\otimes} B$ is identified with $\mu^A_l(O_q(U)) \otimes (A^r_0 \otimes B^r_0)^H$, where $H$ acts by $a \otimes b \to ah^{-1} \otimes hb$. This implies properties (i) and (ii) for the quantum moment map $\mu^A_l \otimes 1 : O_q(U) \to A \widetilde{\otimes} B$, corresponding to the left action of $H$ on $A \widetilde{\otimes} B$ (with $(A \widetilde{\otimes} B)^r_0 = (A^r_0 \otimes B^r_0)^H$). For the quantum moment map $1 \otimes \mu^B_r : O_q(U) \to A \widetilde{\otimes} B$ corresponding to the right action, these properties are proved analogously.

Thus, $A \widetilde{\otimes} B \in A^q_U$. It is clear that the assignment $A, B \to A \widetilde{\otimes} B$ is a bifunctor $A^q_U \times A^q_U \to A^q_U$.

Recall [EV] that $(T^*H)_U$ denotes the variety of points $(h, p) \in T^*H$ such that $h^{-1}p \in U$. Consider the algebra $O_q((T^*H)_U)$, which is the canonical quantization of the standard symplectic structure on $(T^*H)_U$. It is equipped with the standard actions $l, r$ of $H$ on left and right given by $(x, p) \to (h_1 x h_2, h_1 p h_2)$ (these actions obviously respect the quantization).
Let $\mu_{l,r} : O_q(U) \to O_q((T^*H)_U)$ be the embeddings, which assign to an element of $U(\mathfrak{h})$ the corresponding right-, respectively left-invariant differential operator on $H$. It is easy to check that $\mu_{l,r}$ are quantum moment maps for $l, r$.
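Let $\mathbf{1} = (O_q((T^*H)_U), l, r, \mu_l, \mu_r)$. It is easy to check that we have natural isomorphisms $A \widetilde{\otimes} \mathbf{1} \equiv A \equiv \mathbf{1} \widetilde{\otimes} A$.

Proposition 5.2. (i) $(A \widetilde{\otimes} B) \widetilde{\otimes} C = A \widetilde{\otimes} (B \widetilde{\otimes} C)$.
(ii) $\mathbf{1}$ is a unit object in $A^q_U$ with respect to $\widetilde{\otimes}$, and $(A^q_U, \widetilde{\otimes}, \mathbf{1})$ is a monoidal category.

Proof. Easy.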
Let $A \in A^q_U$. Denote by $\bar A$ the new object of $A^q_U$ obtained as follows: $\bar A$ is $A^{op}$ (the opposite algebra), with the left and the right actions of $H$ permuted (i.e. the left, respectively right, action of $h$ on $\bar A$ is the right, respectively left, action of $h^{-1}$ on $A$), and the quantum moment maps also permuted. We will call $\bar A$ the dual object to $A$. By a quasireflection on $A$ we will mean a morphism $i : \bar A \to A$. Note that unlike [EV], here we do not require that $i^2 = 1$. Let $A \in A^q_U$ and $i : \bar A \to A$ be a quasireflection. Let $\varphi^i_+, \varphi^i_- : A \otimes A \to A$ be given by the formulas $\varphi^i_+(a \otimes b) = a\, i(b)$, $\varphi^i_-(a \otimes b) = i(a)\, b$. It is easy to see that these maps descend to linear maps $\psi^i_\pm : A \widetilde{\otimes} A \to A$.

5.3. H-biequivariant Hopf algebroids. Now let us define the quantum version of the notion of an H-biequivariant Poisson–Hopf algebroid.

Definition. Let $A$ be an H-biequivariant associative algebra. Then $A$ is called an H-biequivariant Hopf algebroid over $U$ if it is equipped with a coassociative $A^q_U$-morphism $\Delta : A \to A \widetilde{\otimes} A$ called the coproduct, an $A^q_U$-morphism $\varepsilon : A \to \mathbf{1}$ called the counit, and a quasireflection $S : \bar A \to A$ called the antipode, such that
(i) $(\mathrm{id} \bullet \varepsilon) \circ \Delta = (\varepsilon \bullet \mathrm{id}) \circ \Delta = \mathrm{id}$, and
(ii) $\psi^S_+ \circ \Delta = \mu_l \circ P \circ \varepsilon$, $\psi^S_- \circ \Delta = \mu_r \circ P \circ \varepsilon$, where $P : \mathbf{1} \to O_q(U)$ is the map which assigns to a differential operator on $H$ its value at the identity element (which is in $U(\mathfrak{h})$).
The same structure without the antipode will be called an H-biequivariant bialgebroid. If $H = 1$, then these notions coincide with notions of a Hopf algebra and a bialgebra over $K$.

Remark 1. In the above discussion, $U$ is a Zariski open set. If $k = \mathbb{R}$ or $\mathbb{C}$, then we can take $U$ to be an open set in the usual sense, and define $O(U)$ to be the algebra of smooth, respectively analytic, functions on $U$. Then we can repeat Sect. 5.2, 5.3, and thus define the notions of an H-biequivariant associative algebra and Hopf algebroid over $U$. Similarly, one can take $U$ to be the infinitesimal neighborhood of zero in $\mathfrak{h}^*$ (i.e. $O(U) = k[[\mathfrak{h}]]$). The material of Sects. 5.2 and 5.3 can be generalized to this case as well.

Remark 2. In the smooth, analytic, and formal case one has to drop the condition that $A$ is the sum of finite dimensional representations of $H$ (because $O_q(U)$ does not satisfy this condition). One should instead require that $A$ is a representation of $\mathfrak{h}$. One should also impose the locality condition for a quantum moment map $\mu$: for any $f \in A$ the operation $g \to [\mu(g), f]$ is local in $g$, in the sense that $[\mu(g), f] = \sum \mu(D_i g) f_i$, where $f_i \in A$, and $D_i$ are h-adically convergent series of differential operators on $U$. Using (5.2.1) and the locality property, one can compute $[\mu(g), f]$ not only for rational functions $g$ but for arbitrary smooth, holomorphic, or formal functions.
5.4. Quantization of H-biequivariant Poisson–Hopf algebroids and Poisson groupoids. In this section we will heavily use notations and definitions from [EV], Chapters 1 and 2. We advise the reader to look through these chapters before reading this section. Consider the following two settings.
1. Let $A_0$ be an H-biequivariant Poisson algebra (see Sect. 2.3 of [EV]). Let $A = A_0[[\hbar]]$. Suppose that $A$ is equipped with an associative product $*$ in such a way that $A$ is a local quantization of $A_0$ as a Poisson algebra, and the 5-tuple $(A, l, r, \mu_l, \mu_r)$ is an H-biequivariant associative algebra (where $l, r, \mu_l, \mu_r$ are the K-linear extensions of the structure maps of $A_0$ to $A$).
2. Assume that in addition $A_0$ is an H-biequivariant Poisson–Hopf algebroid, i.e. it is equipped with maps $\Delta_0, \varepsilon_0, S_0$ satisfying certain axioms (see Sect. 2.4 of [EV]). Suppose that $A$ is as above, and in addition that $A$ is equipped with maps $\Delta, \varepsilon, S$, which make $A$ an H-biequivariant Hopf algebroid, and equal $\Delta_0, \varepsilon_0, S_0$ modulo $\hbar$.

Definition. In these cases, $A_0$ is called the quasiclassical limit of $A$, and $A$ is called a quantization of $A_0$.

If $H = 1$, then this definition is the usual definition of a quantization of a Poisson and Poisson–Hopf algebra.

Now consider the geometric version of this definition. Let $X$ be an H-biequivariant Poisson manifold over $U$. Let $A_0 = O(X)$. Then $A_0$ satisfies the axioms of an H-biequivariant Poisson algebra, except for maybe property (ii). The notion of quantization of $A_0$ is defined as above. A quantization $A$ of $A_0$ will be called an H-biequivariant quantum space.

If $X$ is in addition an H-biequivariant Poisson groupoid, then $A_0$ satisfies the axioms of an H-biequivariant Poisson–Hopf algebroid, except for property (ii) and the fact that the coproduct $\Delta$ maps $A_0$ to $A^2_0 := O(X \bullet X)[[\hbar]]$, which is a completion of $A_0 \widetilde{\otimes} A_0$, but not to $A_0 \widetilde{\otimes} A_0$ itself (here $X \bullet Y$ is the product of the H-biequivariant Poisson manifolds, defined in [EV]). (This problem already exists for Lie groups, where the coproduct maps $O(G)$ to $O(G \times G)$ and not to $O(G) \otimes O(G)$.) The notion of quantization of $A_0$ is defined as above. The quantization is called local if $f * g$ is a bidifferential operator of $f, g$ modulo any power of $\hbar$, and $\Delta(f) = D\Delta_0(f)$, where $D$ is a differential operator modulo any power of $\hbar$. A local quantization $A$ of $A_0$ will be called an H-biequivariant quantum groupoid.

Suppose that $X = X(G, H, U)$ is a dynamical Poisson groupoid (see Chapter 1 of [EV]), and $A_0 = O(X)$ is as above. In this case a local quantization $A$ of $A_0$ will be called a dynamical quantum groupoid. If the subspace $O(U) \otimes O(G) \otimes O(U)[[\hbar]] \subset A$ is closed under the product, then it is an H-biequivariant Hopf algebroid. Such a Hopf algebroid is called a dynamical Hopf algebroid.

Recall that a preferred quantization of a Poisson Lie group is a quantization in which the coproduct is undeformed. The notion of a preferred quantization of an H-biequivariant Poisson groupoid or Poisson–Hopf algebroid is defined in the same way.

Conjecture. (i) Any dynamical Poisson groupoid admits a quantization. (ii) Any quasitriangular dynamical Poisson groupoid admits a preferred quantization.

In the case $H = 1$ (Poisson–Lie groups), this conjecture goes back to Drinfeld and is proved in [EK1, EK2].
5.5. The case $H = (\mathbb{C}^*)^N$. In this section we will consider the special case when $H = (\mathbb{C}^*)^N$, and establish the connection between the constructions of this chapter and Chapter 4. Let $H = (\mathbb{C}^*)^N$. In this case, the main notions of Chapter 5 are simplified:
1. Since $H$ is commutative, the algebra $O_q(U)$ is just $O(U)[[\hbar]]$.
2. Denote by $P \subset \mathfrak{h}^*$ the lattice of characters of $H$ ($P = \mathbb{Z}^N$). Let $A$ be an H-biequivariant associative algebra. Then the algebra $A$ can be written as $A = \oplus_{\alpha,\beta \in P} A_{\alpha\beta}$, where $A_{\alpha\beta}$ is the set of elements $a \in A$ such that $h_1 a h_2 = \alpha(h_1)\beta(h_2)a$ (the direct sum is understood in the $\hbar$-adically complete sense). The images of the maps $\mu_l$, $\mu_r$ are in $A_{00}$. The product $A \widetilde{\otimes} B$ can be written in the form $(A \widetilde{\otimes} B)_{\alpha\delta} = \oplus_{\beta \in P} A_{\alpha\beta} \otimes_{O(U)} B_{\beta\delta}$, where $O(U)$ is embedded in $A$ via $\mu^A_r$ and in $B$ via $\mu^B_l$, and acts from the left (thus this product is similar to the matrix product).
3. The algebra $O_q((T^*H)_U) = \mathbf{1}$ can be written in form $O(U) \otimes O(H)[[\hbar]] = O(U) \otimes \mathbb{C}[P][[\hbar]]$, where the commutation relations between $P$ and $O(U)$ are given by $f\chi = \chi f^\chi$, $f \in O(U)$, $\chi \in P$, where $f^\chi(u) = f(u + \hbar\chi)$.
In particular, in this case we can replace the algebra $O(U)$ with the field $M_{\mathfrak{h}^*}$ of meromorphic functions on $\mathfrak{h}^*$, imposing the locality condition (see Remark 2, Sect. 5.3). Then Eq. (5.2.1) together with the locality condition implies identities (4.1.1). Now nothing prevents us from setting $\hbar$ to be no longer a formal parameter, but a nonzero complex number $\gamma$. In this situation, it is easy to see that an H-biequivariant algebra (bialgebroid, Hopf algebroid) is the same as an h-algebra (h-bialgebroid, h-Hopf algebroid) with weights belonging to $P \subset \mathfrak{h}^*$. This gives a connection between Chapters 4 and 5.
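The commutation relation $f\chi = \chi f^\chi$ of item 3 can be modelled directly once $\hbar$ is set to a number $\gamma$. The sketch below makes several simplifying assumptions that are not in the paper: $N = 1$ (so $P = \mathbb{Z}$ and $\chi^n$ is labelled by the integer $n$), polynomial functions of one variable $u$ in place of $O(U)$, and exact rational coefficients. An element $\sum_n \chi^n f_n(u)$ is stored as a dict `{n: coefficients of f_n}`; all function names are illustrative.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; polynomials are tuples (c0, c1, ...)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return tuple(p)

def padd(p, r):
    n = max(len(p), len(r))
    return trim([(p[i] if i < len(p) else 0) + (r[i] if i < len(r) else 0)
                 for i in range(n)])

def pmul(p, r):
    if not p or not r:
        return ()
    out = [Fraction(0)] * (len(p) + len(r) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(r):
            out[i + j] += a * b
    return trim(out)

def shift(p, s):
    """Return the polynomial u -> p(u + s), computed by Horner's rule."""
    out = ()
    for c in reversed(p):
        out = padd(pmul(out, (s, 1)), (c,))
    return out

def mul(A, B, gamma):
    """Product determined by f chi = chi f^chi:
    (chi^m f)(chi^n g) = chi^(m+n) f(u + n*gamma) g(u)."""
    out = {}
    for m, f in A.items():
        for n, g in B.items():
            term = pmul(shift(f, n * gamma), g)
            out[m + n] = padd(out.get(m + n, ()), term)
    return {k: v for k, v in out.items() if v}

gamma = Fraction(1, 2)
f = {0: (0, 0, 1)}   # the function u^2
chi = {1: (1,)}      # the generator chi

# f chi = chi f^chi with f^chi(u) = f(u + gamma)
assert mul(f, chi, gamma) == mul(chi, {0: shift(f[0], gamma)}, gamma)

# The product is associative.
A = {1: (1, 2), -1: (0, 1)}
B = {0: (3, 0, 1), 2: (1,)}
C = {1: (0, 1, 1)}
assert mul(mul(A, B, gamma), C, gamma) == mul(A, mul(B, C, gamma), gamma)
```

With $\gamma$ a number rather than a formal parameter, this is the kind of shift (difference) operator algebra that underlies the h-algebra relations of Chapter 4.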
6. h-Bialgebroids Associated to Quantum Dynamical R-Matrices of Hecke Type

6.1. The Hecke condition. Let $R : \mathfrak{h}^* \to \mathrm{End}(V \otimes V)$ be a quantum dynamical R-matrix with step $\gamma$. Consider the h-bialgebroid $\bar A_R$ introduced in Chapter 4. It is clear that if $R = 1$ and $\gamma = 0$ then $\bar A_R = M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes O(\mathrm{End}(V))$. Therefore, for $R \ne 1$ we want the algebra $\bar A_R$ to look like a quantum deformation of $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes O(\mathrm{End}(V))$. A natural formalization of this wish is the PBW property, defined below.

The algebra $\bar A_R$ has a natural $\mathbb{Z}_+$-grading, given by $\deg(f(\lambda^i)) = 0$, $\deg(L_{ab}) = 1$. Denote by $\bar A^n_R$ the degree $n$ component of $\bar A_R$. It is clear that $\bar A^n_R$ are $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*}$-modules, where the two components of $M_{\mathfrak{h}^*}$ act by left multiplication by $f(\lambda^1)$ and $f(\lambda^2)$.

Definition. The algebra $\bar A_R$ is said to satisfy the Poincaré–Birkhoff–Witt (PBW) property if the $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*}$-module $\bar A^n_R$ is isomorphic to the free module $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes S^n \mathrm{End}(V)$.

For a general dynamical R-matrix, the PBW property does not hold. However, the property holds if one imposes an additional “Hecke type” condition on $R$.

Definition. $R$ is said to be of strong Hecke type if
(i) $R$ satisfies Eq. (1.3.6) for some nonzero parameters $p, q \in \mathbb{C}$, $p \ne -q$, such that $q/p$ is not a root of unity, and
(ii) There exists a continuous family $R(t)$, $t \in [0, 1]$, of quantum dynamical R-matrices with step $\gamma(t)$ continuously depending on $t$, satisfying (i) with parameters $p(t)$, $q(t)$, such that $R(0) = 1$, $p(0) = q(0) = 1$, $\gamma(0) = 0$, $R(1) = R$, $p(1) = p$, $q(1) = q$, $\gamma(1) = \gamma$.

Example. It is easy to see from the classification that all dynamical R-matrices of $gl_N$ Hecke type are of strong Hecke type. Thus, for dynamical R-matrices of $gl_N$-type, strong Hecke type is the same as the Hecke type.

Theorem 6.1. If $R$ is of strong Hecke type then $\bar A_R$ satisfies the PBW property.

This theorem explains the meaning of the Hecke type conditions introduced in Chapter 1. If $\mathfrak{h} = 0$, this theorem is well known (see [FRT]).

6.2. Proof of Theorem 6.1. Let $\tilde A$ be the algebra with the same generators as $\bar A_R$ and the same relations except the Yang–Baxter relation. Then, as a vector space, the algebra $\tilde A$ has the form $\oplus_{n \ge 0} \tilde A^n$, $\tilde A^n = M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes (\mathrm{End}(V))^{\otimes n}$, and $\bar A_R$ is the quotient of $\tilde A$ by the Yang–Baxter relation. Let $H_n(v)$ be the Hecke algebra of type $A_n$ with parameter $v$. It is the algebra generated by elements $T_i$, $1 \le i \le n - 1$, with relations
$$[T_i, T_j] = 0,\ |i - j| \ge 2; \qquad T_i T_{i+1} T_i = T_{i+1} T_i T_{i+1}; \qquad (T_i - 1)(T_i + v) = 0. \tag{6.2.1}$$
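For readers who want to see relations (6.2.1) hold in a concrete case, here is a minimal sketch. It uses a standard action of the Hecke algebra on the tensor space $(k^2)^{\otimes n}$ (a Jimbo-type representation, chosen here for illustration; it is not the action (6.2.2) of the paper), and checks the three families of relations for $n = 4$ with exact rational arithmetic. The function names and the value of `v` are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import product

def hecke_generators(n, v):
    """Matrices of T_1, ..., T_{n-1} acting on (k^2)^{tensor n}.
    On the factors i, i+1 of a basis vector e_b, the rule is:
      equal indices      -> e_b
      b_i less than b_{i+1}  -> (1 - v) e_b + v e_swapped
      b_i greater than b_{i+1} -> e_swapped
    """
    basis = list(product(range(2), repeat=n))
    index = {b: k for k, b in enumerate(basis)}
    size = len(basis)
    gens = []
    for i in range(n - 1):
        T = [[Fraction(0)] * size for _ in range(size)]
        for col, b in enumerate(basis):
            swapped = index[b[:i] + (b[i + 1], b[i]) + b[i + 2:]]
            if b[i] == b[i + 1]:
                T[col][col] += 1
            elif b[i] < b[i + 1]:
                T[col][col] += 1 - v
                T[swapped][col] += v
            else:
                T[swapped][col] += 1
        gens.append(T)
    return gens

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def plus_scalar(A, c):
    """Return A + c * identity."""
    n = len(A)
    return [[A[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]

v = Fraction(3)
T1, T2, T3 = hecke_generators(4, v)
zero = [[0] * 16 for _ in range(16)]

assert matmul(T1, T3) == matmul(T3, T1)                            # far commutativity
assert matmul(T1, matmul(T2, T1)) == matmul(T2, matmul(T1, T2))    # braid relation
assert matmul(T2, matmul(T3, T2)) == matmul(T3, matmul(T2, T3))    # braid relation
assert all(matmul(plus_scalar(T, -1), plus_scalar(T, v)) == zero   # (T - 1)(T + v) = 0
           for T in (T1, T2, T3))
```

The quadratic relation shows that each $T_i$ has eigenvalues $1$ and $-v$, which is the feature used in the proof when passing between invariants and coinvariants.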
If $v$ is not a root of unity of degree $n$, this algebra is isomorphic to $\mathbb{C}[S_n]$ and therefore semisimple.

Denote by $R_{ii+1}(\lambda)$ the operator $1^{i-1} \bar\otimes R(\lambda) \bar\otimes 1^{n-i-1} : V^{\otimes n} \to M_{\mathfrak{h}^*} \otimes V^{\otimes n}$, where $\bar\otimes$ has the meaning defined by (3.1.2). If $R$ satisfies condition (i), then we have an action of $H_n(v)$, $v = q/p$, on the $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*}$-module $\tilde A^n$, defined by the formula
$$T_i X = P_{ii+1} : R_{ii+1}(\lambda^1)\, X\, R_{ii+1}(\lambda^2)^{-1} : P_{ii+1}, \tag{6.2.2}$$
where $P_{ii+1}$ is the permutation of the $i$-th and the $(i+1)$-st components in the tensor product $V^{\otimes n}$. This construction explains the origin of the term “Hecke type”.

The Yang–Baxter relation in $A_R$ implies that the degree $n$ component $\bar A^n_R$ of $\bar A_R$ is isomorphic to the space of coinvariants of $T_1, ..., T_{n-1}$ in $\tilde A^n$. By semisimplicity of $H_n(v)$, this space is isomorphic to the space of vectors in $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes (\mathrm{End}(V))^{\otimes n}$, which are invariant under $T_i$. Now recall that $R$ satisfies condition (ii). Let $R(t)$ be the corresponding family. Consider the corresponding modules $\bar A^n_{R(t)}$. Since they can be defined both as coinvariants and invariants, their dimensions cannot jump, which implies that $\bar A^n_{R(0)}$ is isomorphic to $\bar A^n_{R(1)}$ as a $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*}$-module. However, by our assumptions, $\bar A^n_{R(0)} = M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes S^n \mathrm{End}(V)$, while $\bar A^n_{R(1)} = \bar A^n_R$. This proves the theorem.

6.3. Hecke condition and quantization. Theorem 6.1 has the following generalization to the case when the step $\gamma$ is a formal parameter. Let $R_\gamma = 1 - \gamma r + \sum_{n \ge 2} \gamma^n r_n$ be a formal series whose coefficients are meromorphic functions $\mathfrak{h}^* \to \mathrm{End}(V \otimes V)$. Suppose that $R$ is a quantum dynamical R-matrix with step $\gamma$. Let $\bar A_{R_\gamma}$, $A_{R_\gamma}$ denote the algebras over $K := \mathbb{C}[[\gamma]]$ defined as in Chapter 4. It is clear that $\bar A_{R_\gamma} / \gamma \bar A_{R_\gamma} = M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes O(\mathrm{End}(V))$. Thus the analogue of the PBW property for $\bar A_{R_\gamma}$ in this case is the property that the K-module $\bar A_{R_\gamma}$ is a topologically free module, i.e. provides a flat deformation of $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes O(\mathrm{End}(V))$.
Theorem 6.2. If $R_\gamma$ satisfies the Hecke equation (1.3.6) for some $p(\gamma) = 1 + O(\gamma)$, $q(\gamma) = 1 + O(\gamma)$, then $\bar A_{R_\gamma}$ is a flat deformation of $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes O(\mathrm{End}(V))$.

Proof. Analogous to the proof of Theorem 6.1.

Corollary 6.1. Under the assumption of Theorem 6.2, $A_{R_\gamma}$ is a flat deformation of $M_{\mathfrak{h}^*} \otimes M_{\mathfrak{h}^*} \otimes O(GL(V))$.

If $R_\gamma$ is holomorphic in an open set $U \subset \mathfrak{h}^*$ then we can define algebras $\bar A^U_{R_\gamma}$, $A^U_{R_\gamma}$ in the same way as $\bar A_{R_\gamma}$, $A_{R_\gamma}$, except that $M_{\mathfrak{h}^*}$ is replaced with the algebra of holomorphic functions $O(U)$ on $U$. It is clear that Theorem 6.2 and Corollary 6.1 are valid for these algebras:

Proposition 6.1. Under the assumptions of Theorem 6.2, the algebras $\bar A^U_{R_\gamma}$, $A^U_{R_\gamma}$ are topologically free over $K$.

Now let $R_\gamma : U \to \mathrm{End}(V \otimes V)[[\gamma]]$ be a quantum dynamical R-matrix holomorphic on $U$ which satisfies the condition of Theorem 6.2. Let $p(\gamma) = 1 + a\gamma + O(\gamma^2)$, $q(\gamma) = 1 + b\gamma + O(\gamma^2)$, $\gamma \to 0$. Then from the quadratic equation for $R^\vee$ we get the unitarity condition
$$r^{21} + r = (b - a)P - (b + a), \tag{6.3.1}$$
and from the quantum dynamical Yang–Baxter equation for $R$ we get the classical dynamical Yang–Baxter equation for $r$. Thus, according to Chapter 1 of [EV], $r$ defines a structure of a quasitriangular dynamical Poisson groupoid on $U \times GL(V) \times U$. In particular, we have the corresponding dynamical Poisson–Hopf algebroid $A^U_{0r} = O(U) \otimes O(GL(V)) \otimes O(U)$ (here $O(G)$ denotes the algebra of polynomial functions on $G$).

Theorem 6.3. The dynamical Hopf algebroid $A^U_{R_\gamma}$ is a quantization of the dynamical Poisson–Hopf algebroid $A^U_{0r}$.

Proof. Since we know that $A^U_{R_\gamma}$ is topologically free, the proof is the direct computation of the quasiclassical limit and then comparison with Chapter 1 of [EV].

Let $G = GL(V)$, $H$ be a maximal torus in $G$, and $U \subset \mathfrak{h}^*$ a polydisc. Let $X(G, H, U)$ be the Lie groupoid $U \times G \times U$ with two actions of $H$, defined in Chapter 1 of [EV].

Theorem 6.4. Any structure of a quasitriangular dynamical Poisson groupoid on $X(G, H, U)$ admits a preferred quantization.

Proof. The statement follows from Theorem 1.6 and Theorem 6.3.

Remark. Notice that if $R_\gamma$ fails to satisfy the Hecke condition modulo $\gamma^2$, then the algebra $A_{R_\gamma}$ is not topologically free. Indeed, in this case $r$ does not satisfy the unitarity condition, so according to Chapter 1 of [EV] the bracket defined by $r$ on $U \times GL(V) \times U$ is not Poisson (i.e. does not satisfy the Jacobi identity). This means that the corresponding deformation is not flat, since a flat deformation of a commutative algebra induces a Poisson structure on this algebra. Thus, the Hecke condition seems to be intrinsic for good properties of the algebra $A_R$.
References
[AF] Alexeev, A. and Faddeev, L.: (T*G)_t: a toy model of conformal field theory. 141, 413–422 (1991)
[DM] Deligne, P. and Milne, J.: Tannakian categories. Lecture Notes in Math. 900, 1982
[EK1] Etingof, P. and Kazhdan, D.: Quantization of Lie bialgebras, I. q-alg 9506005, Selecta Math. 2, 1, 1–41 (1996)
[EK2] Etingof, P. and Kazhdan, D.: Quantization of Poisson algebraic groups and Poisson homogeneous spaces. q-alg 9510020 (1995). In: Quantum Symmetries, Les Houches, Session LXIV, 1995, Elsevier, 1998
[EV] Etingof, P. and Varchenko, A.: Geometry and classification of solutions of the classical dynamical Yang–Baxter equation. Commun. Math. Phys. 192, 77–120 (1998)
[Fad1] Faddeev, L.: On the exchange matrix of the WZNW model. Commun. Math. Phys. 132, 131–138 (1990)
[F1] Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Preprint hep-th/9407154, to appear in the Proceedings of the ICM, Zurich, 1994
[F2] Felder, G.: Elliptic quantum groups. Preprint hep-th/9412207, to appear in the Proceedings of the ICMP, Paris 1994
[FR] Frenkel, I.B. and Reshetikhin, N.Yu.: Quantum affine algebras and holonomic difference equations. Commun. Math. Phys. 146, 1–60 (1992)
[FRT] Reshetikhin, N.Yu., Takhtadzhyan, L.A. and Faddeev, L.D.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 1, 193–225 (1990)
[FT] Faddeev, L.D. and Takhtajan, L.A.: The quantum method of the inverse problem and the Heisenberg XYZ model. Russ. Math. Surv. 34, 5, 11–68 (1979)
[FTV1] Felder, G., Tarasov, V. and Varchenko, A.: Solutions of the elliptic qKZB equations and Bethe ansatz I. Preprint q-alg/9606005, to appear in the volume dedicated to V.I. Arnold's 60th birthday (1996)
[FTV2] Felder, G., Tarasov, V. and Varchenko, A.: Monodromy of solutions of the elliptic qKZB difference equations. Preprint (1997)
[FV1] Felder, G. and Varchenko, A.: On representations of the elliptic quantum group E_{τ,η}(sl_2). Commun. Math. Phys. 181, 746–762 (1996)
[FV2] Felder, G. and Varchenko, A.: Elliptic quantum groups and Ruijsenaars models. Preprint (1997)
[FV3] Felder, G. and Varchenko, A.: Algebraic Bethe ansatz for the elliptic quantum group E_{τ,η}(sl_2). Nuclear Physics B 480, 485–503 (1996)
[FW] Felder, G. and Wieczerkowski, C.: Conformal blocks on elliptic curves and the Knizhnik–Zamolodchikov–Bernard equations. Commun. Math. Phys. 176, 133 (1996)
[GN] Gervais, J.-L. and Neveu, A.: Novel triangle relation and absence of tachyons in Liouville string field theory. Nucl. Phys. B 238, 125 (1984)
[Kass] Kassel, C.: Quantum groups. Berlin–Heidelberg–New York: Springer-Verlag, GTM 155, 1994
[Lu] Lu, J.H.: Hopf algebroids and quantum groupoids. Inter. J. Math. 7, (1), 47–70 (1996)
[Mac] MacLane, S.: Categories for the working mathematician. Berlin–Heidelberg–New York: Springer-Verlag, 1971
[TV] Tarasov, V. and Varchenko, A.: Geometry of q-hypergeometric functions, quantum affine algebras, and elliptic quantum groups. q-alg 9703044 (1997)
Communicated by G. Felder
Commun. Math. Phys. 196, 641 – 670 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Self-Duality of the $SL_2$ Hitchin Integrable System at Genus 2
Krzysztof Gawędzki1, Pascal Tran-Ngoc-Bich2
1 I.H.E.S., C.N.R.S., 34 route de Chartres, F-91440 Bures-sur-Yvette, France
2 Université de Paris Sud, 15 rue George Clémenceau, F-91405 Orsay Cedex, France
Received: 24 October 1997 / Accepted: 21 January 1998
Abstract: We revisit the Hitchin integrable system [11, 21] whose phase space is the bundle cotangent to the moduli space $\mathcal{N}$ of holomorphic $SL_2$-bundles over a smooth complex curve of genus 2. As shown in [18], $\mathcal{N}$ may be identified with the 3-dimensional projective space of theta functions of the 2nd order, i.e. $\mathcal{N} \cong \mathbb{P}^3$. We prove that the Hitchin system on $T^*\mathcal{N} \cong T^*\mathbb{P}^3$ possesses a remarkable symmetry: it is invariant under the interchange of positions and momenta. This property allows one to complete the work of van Geemen–Previato [21] which, based on the classical results on the geometry of the Kummer quartic surfaces, specified the explicit form of the Hamiltonians of the Hitchin system. The resulting integrable system resembles the classic Neumann systems, which are also self-dual. Its quantization produces a commuting family of differential operators of the 2nd order acting on homogeneous polynomials in four complex variables. As recently shown by van Geemen–de Jong [22], these operators realize the Knizhnik–Zamolodchikov–Bernard–Hitchin connection for the group $SU(2)$ and genus 2 curves.
1. Introduction

In [11], Nigel Hitchin has discovered an interesting family of classical integrable models related to modular geometry of holomorphic vector bundles or to 2-dimensional gauge fields. The input data for Hitchin's construction are a complex Lie group $G$ and a complex curve $\Sigma$ of genus $\gamma$. The configuration space of the integrable system is the moduli space $\mathcal{N}$ of (semi)stable holomorphic $G$-bundles over $\Sigma$. This is a finite-dimensional complex variety and Hitchin's construction is done in the holomorphic category. It exhibits a complete family of Poisson-commuting Hamiltonians on the (complex) phase space $T^*\mathcal{N}$. The Hitchin Hamiltonians have open subsets of abelian varieties as generic level sets on which they induce additive flows [11]. More recently, Hitchin's construction was extended to the case of singular or punctured curves [16, 19, 7], providing a unified construction of a vast family of classical integrable systems. For $\Sigma = \mathbb{C}P^1$ with
punctures, one obtains this way the so-called Gaudin chains and, for $G = SL_N$ and $\Sigma$ of genus 1 with one puncture, the elliptic Calogero-Sutherland models, which found an unexpected application in the supersymmetric 4-dimensional gauge theories [6]. In Sect. 2 of the present paper we briefly recall the basic idea of Hitchin's construction. The main aim of this contribution is to treat in detail the case of $G = SL_2$ and $\Sigma$ of genus 2 (no punctures). The genus 2 curves are hyperelliptic, i.e., given by the equation
\[ \zeta^2 = \prod_{s=1}^{6} (\lambda - \lambda_s), \tag{1.1} \]
where $\lambda_s$ are 6 different complex numbers. The semistable moduli space $\mathcal{N}$ has a particularly simple form for genus 2 [18]: it is the projectivized space of theta functions of the 2nd order:
\[ \mathcal{N} = \mathbb{P}H^0(L_\Theta^2), \tag{1.2} \]
where $L_\Theta$ is the theta-bundle over the Jacobian $J^1$ of (the isomorphism classes of) degree $\gamma - 1 = 1$ line bundles $l$ over $\Sigma$ (we use the multiplicative notation for the tensor product of line bundles). Since $\dim_{\mathbb{C}} H^0(L_\Theta^2) = 4$, we have $\mathcal{N} \cong \mathbb{P}^3$. This picture of $\mathcal{N}$ is related to the realization of $SL_2$-bundles as extensions of degree 1 line bundles. We review some of the results in this direction in Sect. 3 using a less sophisticated language than that of the original work [18]. The relation between the extensions and the theta functions is lifted to the level of the cotangent bundle $T^*\mathcal{N}$ in Sect. 4. The language of extensions proves suitable for a direct description of the Hitchin Hamiltonians on $T^*\mathcal{N}$. The main aim is, however, to present the Hitchin system as an explicit 3-dimensional family of integrable systems on $T^*\mathbb{P}^3$, parametrized by the moduli of the curve. This was first attempted, and almost achieved, in reference [21]. Let us recall that the Hitchin Hamiltonians are components of the map
\[ H : T^*\mathcal{N} \longrightarrow H^0(K^2) \tag{1.3} \]
with values in the (holomorphic) quadratic differentials ($K$ denotes the canonical bundle of $\Sigma$). Due to relation (1.2), the map $H$ may be viewed as an $H^0(K^2)$-valued function of pairs $(\theta, \phi)$, where $\theta \in H^0(L_\Theta^2)$ and $\phi$ from the dual space $H^0(L_\Theta^2)^*$ are s.t. $\langle\theta, \phi\rangle = 0$. Fix a holomorphic trivialization of $L_\Theta$ around $l \in J^1$ and denote by $\phi_l$ the linear form that computes the value of the theta function at $l$. As was observed in [21],
\[ H(\theta, \phi_l) = -\frac{1}{16\pi^2}\,(d\theta(l))^2 \tag{1.4} \]
(with appropriate normalizations). In the above formula, $\theta$ is viewed as a function on $J^1$ and $d\theta(l)$ as an element of $H^0(K)$. Since $\theta(l) = 0$, the equation is consistent with changes of the trivialization of $L_\Theta$. The map $J^1 \ni l \mapsto \phi_l$ induces an embedding of the Kummer surface $J^1/\mathbb{Z}_2$, with $l$ and $l^{-1}K$ identified, into a quartic $\mathcal{K}^*$ in $\mathbb{P}H^0(L_\Theta^2)^*$. The Kummer quartic is a carrier of a rich but classical structure, a subject of intensive study by the nineteenth century geometers, see [13] and also the last chapter of [10]. The reference [21] used the relation (1.4) and a mixture of the classical results and of more modern algebraic geometry to recover an explicit form of the components of the Hitchin map $H$ up to a multiplication by a function on the configuration space. The authors of [21] checked that the simplest
way to fix this ambiguity leads to Poisson-commuting functions but they fell short of showing that the latter coincide with the ones of the Hitchin construction. Among the aims of the present paper is to fill the gap left in [21]. We observe that the proposal of [21] has a remarkable self-duality property: it is invariant under the interchange of the positions and momenta in $T^*\mathbb{P}^3$. We show that the Hitchin construction leads to a system with the same symmetry. This limits the ambiguity left by the analysis of [21] to a multiplication of the components of $H$ by constants. A direct check based on Eq. (1.4) fixes the normalizations and results in a formula for the Hitchin map which uses the hyperelliptic description (1.1) of the curve. Namely,
\[ H = -\frac{1}{128\pi^2} \sum_{1\le s\ne t\le 6} \frac{r_{st}}{(\lambda-\lambda_s)(\lambda-\lambda_t)}\,(d\lambda)^2, \tag{1.5} \]
where $r_{st}$ are explicit polynomials in $(\theta, \phi)$ given, upon representation of $(\theta, \phi)$ by pairs $(q, p) \in \mathbb{C}^4 \times \mathbb{C}^4$, by Eqs. (7.7) below. The above expression for $H$ has a similar form as that for the Hitchin map on the Riemann sphere with 6 insertion points $\lambda_s$, see e.g. Sect. 4 of [9], except for the structure of the terms $r_{st}$. This is not an accident but is connected to the reduction of conformal field theory on genus 2 surfaces to an orbifold theory in genus 0 [14, 23]. We plan to return to this relation in a future publication. Let us discuss in more detail how we establish the self-duality of the Hitchin Hamiltonians. The main tool here is an explicit expression for the values of the Hitchin map off the Kummer quartic $\mathcal{K}^*$ which we obtain in Sect. 5. Our formula for $H(\theta, \phi)$ requires a choice of a pair of perpendicular 2-dimensional subspaces $(\Pi, \Pi^\perp)$, where $\theta \in \Pi \subset H^0(L_\Theta^2)$ and $\phi \in \Pi^\perp \subset H^0(L_\Theta^2)^*$ (there is a complex line of such choices). The plane $\Pi^\perp$ corresponds to a line $\mathbb{P}\Pi^\perp$ in $\mathbb{P}H^0(L_\Theta^2)^*$ which intersects the Kummer quartic $\mathcal{K}^*$ in four points $\mathbb{C}^*\phi_{l_j}$, $j = 1, 2, 3, 4$ (counting with multiplicity). Whereas the analysis of [21] was mainly concerned with the geometry of bitangents to $\mathcal{K}^*$ with two pairs of coincident $\phi_{l_j}$'s, we concentrate on the generic situation with $\phi_{l_j}$'s different. Then any two of them, say $\mathbb{C}^*\phi_{l_1}$ and $\mathbb{C}^*\phi_{l_2}$, span $\Pi^\perp$. $\Pi$ is composed of the 2nd order theta functions vanishing at $l_1$ and $l_2$. In particular,
\[ \phi = a_1\phi_{l_1} + a_2\phi_{l_2} \quad\text{and}\quad \theta(l_1) = 0 = \theta(l_2). \tag{1.6} \]
Let $x_1 + x_2$ and $x_3 + x_4$ be the divisors of $l_1 l_2$ and of $l_1 l_2^{-1}K$, respectively, where $x_i$ are four points in $\Sigma$ (the other two lines of intersection of $\mathbb{P}\Pi^\perp$ with $\mathcal{K}^*$ correspond to $l_3$ and $l_4$ with $l_1 l_3 = O(x_1 + x_3)$, $l_1 l_3^{-1}K = O(x_2 + x_4)$, $l_1 l_4 = O(x_1 + x_4)$, $l_1 l_4^{-1}K = O(x_2 + x_3)$). If $l_1^2 \ne K$, which holds in a general situation, then the quadratic differential $H(\theta, \phi)$ is determined by its values at $x_i$ which, as we show in Sect. 5, are
\[ H(\theta, \phi)(x_i) = -\frac{1}{16\pi^2}\,\big(a_1\, d\theta(l_1) \pm a_2\, d\theta(l_2)\big)^2 (x_i). \tag{1.7} \]
Sign plus is taken for $x_1$ and $x_2$ and sign minus for $x_3$ and $x_4$. Note that for $\phi = \phi_l$ with $\theta(l) = 0$ the above equation reproduces the result (1.4). As we recall at the end of Sect. 3, there exists an almost natural linear isomorphism $\iota$ between $H^0(L_\Theta^2)^*$ and $H^0(L_\Theta^2)$. What follows is independent of the remaining ambiguity in the choice of $\iota$. The identity $\langle\theta, \phi\rangle = \langle\iota(\phi), \iota^{-1}(\theta)\rangle$ implies that if $(\theta, \phi)$ is a perpendicular pair then so is $(\theta', \phi')$, where $\theta' = \iota(\phi)$ and $\phi' = \iota^{-1}(\theta)$. Thus $\iota$ interchanges the positions and momenta in $T^*\mathcal{N}$. We may take $(\Pi', \Pi'^\perp) = (\iota(\Pi^\perp), \iota^{-1}(\Pi))$ as a pair of perpendicular subspaces containing $(\theta', \phi')$. The line $\mathbb{P}\Pi'^\perp$ meets $\mathcal{K}^*$ in four points
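$\mathbb{C}^*\phi_{l'_j}$. Equivalently, $\mathbb{C}^*\iota(\phi_{l'_j})$ are the points of intersection of $\mathbb{P}\Pi$ with the Kummer quartic $\mathcal{K} = \iota(\mathcal{K}^*) \subset \mathbb{P}H^0(L_\Theta^2)$. In a general situation, $\Pi'^\perp$ is spanned by any pair of $\phi_{l'_j}$'s so that
\[ \phi' = a'_1\phi_{l'_1} + a'_2\phi_{l'_2} \quad\text{and}\quad \theta'(l'_1) = 0 = \theta'(l'_2), \tag{1.8} \]
which is the dual version of relations (1.6). Equivalently,
\[ \theta = a'_1\,\iota(\phi_{l'_1}) + a'_2\,\iota(\phi_{l'_2}) \quad\text{and}\quad \langle\iota(\phi_{l'_1}), \phi\rangle = 0 = \langle\iota(\phi_{l'_2}), \phi\rangle. \tag{1.9} \]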
Let $y_i$ be the points associated to $l'_j$ the same way as the points $x_i$ were associated to $l_j$. $l'_j$ may be chosen so that $y_i$ and $x_i$ coincide modulo the natural involution of $\Sigma$ fixing the six Weierstrass points. Formula (1.7) implies then that
\[ H(\theta', \phi')(y_i) = -\frac{1}{16\pi^2}\,\big(a'_1\, d\theta'(l'_1) \pm a'_2\, d\theta'(l'_2)\big)^2 (y_i). \tag{1.10} \]
Points $y_i$ in Eq. (1.10) may be replaced by $x_i$ since two quadratic differentials are equal at a point $x$ if and only if they are equal at the image of $x$ by the involution of $\Sigma$. A direct calculation of the coefficients $a_1, a_2$ and $a'_1, a'_2$ appearing on the right-hand sides of Eqs. (1.7) and (1.10) shows then that both expressions coincide, establishing the self-duality of $H$. The verification of this equality is the subject of Sect. 6. In Sect. 7, we recall the main result of reference [21] and show how the self-duality may be used to complete the analysis performed there and to obtain the explicit form (1.5) of the Hitchin map. We briefly discuss the relation of that form to the classical Yang-Baxter equation. An appropriate quantization of Hitchin Hamiltonians leads to operators acting on holomorphic sections of powers of the determinant line bundle over $\mathcal{N}$ and defining the Knizhnik–Zamolodchikov–Bernard–Hitchin [15, 4, 5, 12] connection. In our case, the sections of the powers of the determinant bundle are simply homogeneous polynomials on $H^0(L_\Theta^2)$. It is easy to quantize the Hamiltonians corresponding to the components of the Hitchin map (1.5) in such a way that one obtains an explicit family of commuting 2nd order differential operators acting on such polynomials. The corresponding connection coincides with the explicit form of the (projective) KZBH connection worked out recently in [22] (we thank B. van Geemen for attracting our attention to ref. [22] and for pointing out that this work may be used to fix indirectly the precise form of the Hitchin map). The quantization of the genus 2 Hitchin system is briefly discussed in the Conclusions, where we also mention other possible directions for further research. Four appendices which close the paper contain some more technical material. We would like to end the presentation of our paper by expressing some regrets. We apologize to Ernst Eduard Kummer and other nineteenth century giants for our insufficient knowledge of their classic work. Apologies are also due to the few contemporary algebraic geometers who could be interested in the present work, for the analytic character of our arguments. To the specialists in integrability we apologize for the as yet incomplete analysis of the integrable system studied here and, finally, we apologize to ourselves for not having finished this work 2 years ago.
2. Hitchin's Construction

Let us assume, for simplicity, that the complex Lie group $G$ is simple, connected and simply connected. We shall denote by $\mathfrak{g}$ its Lie algebra. The complex curve $\Sigma$ will be assumed smooth, compact and connected. Topologically, all $G$-bundles on $\Sigma$ are trivial and the complex structures in the trivial bundle may be described by giving operators $\bar\partial + A$, where $A$ are smooth $\mathfrak{g}$-valued 0,1-forms on $\Sigma$ [1]. Let $\mathcal{A}$ denote the space of such forms (i.e. of chiral gauge fields). The group $\mathcal{G}$ of local (chiral) gauge transformations composed of smooth maps $h$ from $\Sigma$ to $G$ acts on operators $\bar\partial + A$ by conjugation and on the gauge fields $A$ by
\[ A \longmapsto {}^hA \equiv hAh^{-1} + h\,\bar\partial h^{-1}. \]
Two holomorphic $G$-bundles are equivalent iff the corresponding gauge fields are in the same orbit of $\mathcal{G}$. Hence the space of orbits $\mathcal{A}/\mathcal{G}$ coincides with the (moduli) space of inequivalent holomorphic $G$-bundles. It may be supplied with a structure of a variety provided one gets rid of bad orbits. This may be achieved by limiting the considerations to (semi)stable bundles, i.e. such that the vector bundle associated with the adjoint representation of $G$ contains only holomorphic subbundles with negative (non-positive) first Chern number. For genus $\gamma > 1$, the moduli space $\mathcal{N}_s \equiv \mathcal{A}_s/\mathcal{G}$ of stable $G$-bundles is a smooth complex variety with a natural compactification to a variety $\mathcal{N}_{ss}$, the (Seshadri-) moduli space of semistable bundles [18]. The complex cotangent bundle $T^*\mathcal{N}_s$ may be obtained from the infinite-dimensional bundle $T^*\mathcal{A}_s$ by the symplectic reduction. $T^*\mathcal{A}_s$ may be realized as the space of pairs $(A, \Phi)$, where $\Phi$ is a (possibly distributional) $\mathfrak{g}$-valued 1,0-form on $\Sigma$, $A \in \mathcal{A}_s$, and the duality with the vectors $\delta A$ tangent to $\mathcal{A}$ is given by
\[ \int_\Sigma \mathrm{tr}\,\Phi \wedge \delta A \]
with tr standing for the Killing form. The action of the local gauge group $\mathcal{G}$ on $\mathcal{A}_s$ lifts to a symplectic action on $T^*\mathcal{A}_s$ by
\[ \Phi \longmapsto {}^h\Phi \equiv h\Phi h^{-1}. \]
The moment map $\mu$ for the action of $\mathcal{G}$ on $T^*\mathcal{A}_s$ is
\[ \mu(A, \Phi) = \bar\partial\Phi + A \wedge \Phi + \Phi \wedge A \equiv \bar\partial_A \Phi. \]
Note that it takes values in $\mathfrak{g}$-valued 2-forms on $\Sigma$. These may be naturally viewed as elements of the space dual to the Lie algebra of $\mathcal{G}$. The symplectic reduction of $T^*\mathcal{A}_s$ realizes $T^*\mathcal{N}_s$ as the space of $\mathcal{G}$-orbits in the zero level of $\mu$:
\[ T^*\mathcal{N}_s \cong \mu^{-1}(\{0\})/\mathcal{G}. \]
For a homogeneous $G$-invariant polynomial $P$ on $\mathfrak{g}$ of degree $d_P$, the gauge invariant expression $P(\Phi)$ defines a section of the bundle $K^{d_P}$ of $d_P$-differentials on $\Sigma$. If $\Phi$ is in the zero level of $\mu$ then $P(\Phi)$ is also holomorphic. Hence the map $\Phi \mapsto P(\Phi)$ induces a map
\[ H_P : T^*\mathcal{N}_s \longrightarrow H^0(K^{d_P}) \]
into the finite dimensional vector space of holomorphic differentials of degree $d_P$ on $\Sigma$. The components of such vector-valued Hamiltonians clearly Poisson-commute since
upstairs (on $T^*\mathcal{A}_s$) they depend only on the momentum variables $\Phi$. By a beautiful argument, Hitchin showed [11] that taking all polynomials $P$ one obtains a complete system of Hamiltonians in involution and that the collection of maps $H_P$ defines in generic points a foliation of $T^*\mathcal{N}_s$ into (open subsets of) abelian varieties. Let us briefly sketch Hitchin's argument for $G = SL_2$. There is only one (up to normalization) non-trivial invariant polynomial $P_2$ on $sl_2$ given by, say, half of the Killing form. $H \equiv H_{P_2}$ maps into the space of quadratic differentials. A non-trivial holomorphic quadratic differential $\rho$ determines a (spectral) curve $\Sigma' \subset K$ given by the equation
\[ \xi^2 = \rho(\pi(\xi)), \tag{2.1} \]
where $\xi \in K$ and $\pi$ is the projection of $K$ on $\Sigma$. The map $\xi \mapsto -\xi$ gives an involution $\sigma$ of $\Sigma'$. Restriction of $\pi$ to $\Sigma'$ is a 2-fold covering of $\Sigma$ ramified over $4(\gamma - 1)$ points fixed by $\sigma$, the zeros of $\rho$. $\Sigma'$ has genus $\gamma' = 4\gamma - 3$. If $\rho = \frac{1}{2}\mathrm{tr}\,(\Phi)^2$ then relation (2.1) coincides with the eigenvalue equation $\det(\Phi - \xi \cdot I) = 0$ for the Lax matrix $\Phi$. For each $0 \ne \xi \in \Sigma'$, let $l_\xi$ denote the corresponding eigensubspace of $\Phi$. By continuity, $l_\xi$ extend to vanishing $\xi$ in $\Sigma'$ and $\cup_\xi l_\xi$ forms a line subbundle $l$ of $\Sigma' \times \mathbb{C}^2$. In fact, $l$ is a holomorphic subbundle with respect to the complex structure defined on $\Sigma' \times \mathbb{C}^2$ by $\bar\partial + A \circ \pi$. The degree of $l$ is $-2(\gamma - 1)$. Besides,
\[ l\,(\sigma^* l) = \pi^* K^{-1}. \tag{2.2} \]
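Two of the statements above can be checked by a short computation. The following is only a sketch in a local trivialization; the entry names $a$, $b$, $c$ of the traceless matrix are introduced here for illustration and do not appear in the text. The genus count uses the Riemann-Hurwitz formula for the 2-fold covering with $4(\gamma-1)$ simple branch points.

```latex
% Traceless 2x2 Lax matrix in a local trivialization (a, b, c are illustrative names)
\Phi = \begin{pmatrix} a & b \\ c & -a \end{pmatrix}, \qquad
\rho = \tfrac{1}{2}\,\mathrm{tr}\,\Phi^2 = a^2 + bc, \qquad
\det(\Phi - \xi\cdot I) = \xi^2 - (a^2 + bc) = \xi^2 - \rho .

% Riemann-Hurwitz for \Sigma' \to \Sigma, degree 2, ramified over 4(\gamma-1) points
2\gamma' - 2 = 2\,(2\gamma - 2) + 4(\gamma - 1)
\quad\Longrightarrow\quad \gamma' = 4\gamma - 3 .
```

For $\gamma = 2$ this gives a spectral curve of genus $\gamma' = 5$.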
Conversely, given $\Sigma'$ and a holomorphic line bundle $l$ of degree $-2(\gamma - 1)$ on it satisfying (2.2), we may recover a rank 2 holomorphic bundle $E$ of trivial determinant over $\Sigma$ as a pushdown of $l$ to $\Sigma$. Thus for $0 \ne \xi \in \Sigma'$, $E_{\pi(\xi)} = l_\xi \oplus l_{-\xi}$. $E$ corresponds to a unique holomorphic $SL_2$-bundle which, if stable (which happens on an open subset of $l$'s), defines a point in the moduli space $\mathcal{N}_s$. A holomorphic 1,0-form with values in the traceless endomorphisms of $E$ acting as multiplication by $\pm\xi$ on $l_{\pm\xi} \subset E_{\pi(\xi)}$ defines then a unique covector of $T^*\mathcal{N}_s$. Thus $\Sigma'$ encodes the values of the quadratic Hitchin Hamiltonian $H$ (i.e., of the action variables) whereas the line bundles $l$ satisfying relation (2.2) form the abelian (Prym) variety (of the angle variables) describing the level set of $H$.

3. $SL_2$ Moduli Space at Genus 2

We shall present briefly the description of the moduli space $\mathcal{N}_s$ for $G = SL_2$ and $\gamma = 2$ which was worked out in [18]. Let us start by recalling some basic facts about theta functions. We shall use a coordinate rather than an abstract language. The space of degree $\gamma - 1$ holomorphic line bundles forms a Jacobian torus $J^{\gamma-1}$ of complex dimension $\gamma$. Fixing a marking (a symplectic homology basis $(A_a, B_b)$, $a, b = 1, \dots, \gamma$), we may identify $J^{\gamma-1}$ with $\mathbb{C}^\gamma/(\mathbb{Z}^\gamma + \tau\mathbb{Z}^\gamma)$. $\tau \equiv (\tau^{ab})$ is the period matrix, i.e. $\tau^{ab} = \int_{B_b}\omega^a$, where $\omega^a$ are the basic holomorphic forms on $\Sigma$ normalized so that $\int_{A_a}\omega^b = \delta^{ab}$. The point $0 \in \mathbb{C}^\gamma$ corresponds in $J^{\gamma-1}$ to a (marking dependent) spin structure $S_0$, i.e. a degree 1 bundle such that $S_0^2 = K$. $u \in \mathbb{C}^\gamma$ describes the line bundle $V(u)S_0$, where $V(u)$ is the flat
line bundle with the twists $e^{2\pi i u^b}$ along the $B_b$ cycles. The set of degree 1 bundles $l$ with non-trivial holomorphic sections forms a divisor $\Theta$ of a holomorphic line bundle $L_\Theta$ over $J^{\gamma-1}$. Holomorphic sections of the $k$-th power ($k > 0$) of $L_\Theta$ are called theta functions of order $k$. With the use of a marking, they may be represented by holomorphic functions $u \mapsto \theta(u)$ on $\mathbb{C}^\gamma$ satisfying
\[ \theta(u + p + \tau q) = e^{-\pi i k\, q\cdot\tau q - 2\pi i k\, q\cdot u}\,\theta(u) \tag{3.1} \]
for $p, q \in \mathbb{Z}^\gamma$. The functions
\[ \theta_{k,e}(u) = \sum_{n\in\mathbb{Z}^\gamma} e^{\pi i k\,(n+e/k)\cdot\tau(n+e/k) + 2\pi i k\,(n+e/k)\cdot u}, \tag{3.2} \]
where $e \in \mathbb{Z}^\gamma/k\mathbb{Z}^\gamma$, form a basis of the theta functions of order $k$. Hence $\dim H^0(L_\Theta^k) = k^\gamma$. In particular, the Riemann theta function $\theta_{1,0}(u) \equiv \vartheta(u)$ represents the unique (up to normalization) non-trivial holomorphic section of $L_\Theta$. It vanishes on the set
\[ \Big\{ \sum_{i=1}^{\gamma-1}\int_{x_0}^{x_i}\omega - \Delta \;\Big|\; x_1 \in \Sigma, \dots, x_{\gamma-1} \in \Sigma \Big\} \]
representing the divisor $\Theta$. Here $\Delta \in \mathbb{C}^\gamma$ denotes the ($x_0$-dependent) vector of Riemann constants. All theta functions of order 1 and 2 are even functions of $u$. For $\gamma = 2$, the divisor $\Theta$ is formed by the bundles $O(x)$ with divisors $x \in \Sigma$, where $O(x) = V(\int_{x_0}^{x}\omega - \Delta)S_0$. The pullback of the theta bundle $L_\Theta$ by means of the map $x \mapsto O(x)$ is equivalent to the canonical bundle $K$. The equivalence assigns 1,0-forms to functions representing sections of the pullback of $L_\Theta$:
\[ \epsilon^{ab}\,\partial_b\vartheta\Big(\int_{x_0}^{x}\omega - \Delta\Big) \;\longmapsto\; \omega^a(x). \tag{3.3} \]
This is consistent since vanishing of $\vartheta(\int_{x_0}^{x}\omega - \Delta)$ implies that
\[ \partial_a\vartheta\Big(\int_{x_0}^{x}\omega - \Delta\Big)\,\omega^a(x) = 0. \]
Hence any multivalued function on $\Sigma$ picking up a factor $e^{-\pi i\tau^{aa} - 2\pi i(\int_{x_0}^{x}\omega^a - \Delta^a)}$ when $x$ goes around the $B_a$ cycle and univalued around the $A_a$ cycles may be identified with a 1,0-form on $\Sigma$. As already suggested by the discussion at the end of Sect. 2, for the $SL_2$ group it is more convenient to use the language of holomorphic vector bundles (of rank 2 and trivial determinant) than to work with principal $SL_2$-bundles. Of course the first ones are just associated to the second ones by the fundamental representation of $SL_2$. Any stable rank 2 bundle $E$ with trivial determinant is an extension of a degree 1 line bundle $l$ ([18], Lemmas 5.5 and 5.8), i.e. it appears in an exact sequence of holomorphic vector bundles
\[ 0 \longrightarrow l^{-1} \stackrel{\sigma}{\longrightarrow} E \stackrel{\varpi}{\longrightarrow} l \longrightarrow 0. \tag{3.4} \]
The inequivalent extensions (3.4) are classified by the cohomology classes in $H^1(l^{-2})$. This may be seen as follows. Taking a section of $\varpi$, i.e., a smooth bundle homomorphism
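$s : l \to E$ such that $\varpi \circ s = \mathrm{id}_l$, we infer that $\varpi\,\bar\partial s = 0$ and hence that $\bar\partial s = \sigma b$ for $b$ a 0,1-form with values in $\mathrm{Hom}(l, l^{-1}) = l^{-2}$, i.e. $b \in \wedge^{01}(l^{-2})$. $b$ is determined up to $\bar\partial\varphi$, where $\varphi$ is a smooth section of $l^{-2}$, i.e. $\varphi \in \Gamma(l^{-2})$. The class $[b]$ in $\wedge^{01}(l^{-2})/\bar\partial\Gamma(l^{-2}) \cong H^1(l^{-2})$ determines the extension (3.4) up to equivalence. Each $b$ corresponds to an extension: one may simply take $E$ equal to $l^{-1} \oplus l$ with the $\bar\partial$-operator given by $\bar\partial_{l^{-1}\oplus l} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$.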
Proportional $[b]$ correspond to equivalent bundles $E$. If $E$ is a stable bundle then the extension (3.4) is necessarily nontrivial, i.e. $[b] \ne 0$. Let $C_E$ denote the set of degree 1 line bundles $l$ s.t. $H^0(l \otimes E) \ne 0$ (equivalently, s.t. $E$ is an extension of $l$). This is a complex 1-dimensional variety. It was shown in [18] that $C_E$ characterizes the bundle $E$ up to isomorphism and that there exists a theta function $\theta$ of the 2nd order which vanishes exactly on $C_E$. The assignment $E \mapsto \mathbb{C}^*\theta$ gives an injective map
\[ m : \mathcal{N}_s \longrightarrow \mathbb{P}H^0(L_\Theta^2). \tag{3.5} \]
Let $V(u_1)S_0 \equiv l_{u_1} \in C_E$. $E$ may be realized as an extension of $l_{u_1}$ which is characterized by $[b] \in H^1(l_{u_1}^{-2})$. Then one may take
\[ \theta(u) = \int_\Sigma K(x; u_1, u) \wedge b(x), \tag{3.6} \]
where
\[ K(x; u_1, u) = \vartheta\Big(\int_{x_0}^{x}\omega - u_1 - u - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega - u_1 + u - \Delta\Big)\cdot\Big(\epsilon^{ab}\,\partial_b\vartheta\Big(\int_{x_0}^{x}\omega - \Delta\Big)\Big)^{-1}\omega^a(x) \tag{3.7} \]
(it does not depend on the choice of $a = 1, 2$). Let us explain the above formulae. $K(x; u_1, u)$, in its dependence on $x$, is a multivalued holomorphic 1,0-form. More exactly, the function
\[ x \longmapsto \vartheta\Big(\int_{x_0}^{x}\omega - u_1 - u - \Delta\Big) \tag{3.8} \]
is multivalued around the $B_a$-cycles, picking up the factor $e^{-\pi i\tau^{aa} - 2\pi i(\int_{x_0}^{x}\omega^a - u_1^a - u^a - \Delta^a)}$ when $x$ goes around $B_a$, so that it describes an element $s_2 \in H^0(l_{u_1} l_u)$ (non-vanishing if $u_1 + u \notin \mathbb{Z}^2 + \tau\mathbb{Z}^2$). Similarly,
\[ x \longmapsto \vartheta\Big(\int_{x_0}^{x}\omega - u_1 + u - \Delta\Big)\Big(\epsilon^{ab}\,\partial_b\vartheta\Big(\int_{x_0}^{x}\omega - \Delta\Big)\Big)^{-1}\omega^a(x) \]
picks up the factor $e^{2\pi i(u_1^a - u^a)}$ when $x$ goes around $B_a$ and describes a holomorphic 1,0-form $\chi$ with values in $l_{u_1} l_u^{-1}$ (non-vanishing if $u_1 - u \notin \mathbb{Z}^2 + \tau\mathbb{Z}^2$). The product $s_2\chi = K(\,\cdot\,; u_1, u)$ is a holomorphic 1,0-form with values in $l_{u_1}^2$ and it may be paired with $b \in \wedge^{01}(l_{u_1}^{-2})$ via the integral over
$x$ on the r.h.s. of Eq. (3.6). The integral is independent of the choice of the representative $b$ of the cohomology class $[b]$. In its dependence on $u$, $K(x; u_1, u)$ is a theta function of the 2nd order and so is $\theta(u)$. In Appendix 1 we check explicitly that $\theta$ given by Eq. (3.6) possesses the required property. The product of the two shifted Riemann theta functions $\vartheta(u' - u)\,\vartheta(u' + u)$ is a theta function of the 2nd order both in $u'$ and in $u$ (and it is invariant under the interchange $u' \leftrightarrow u$). Let $\iota$ denote the (marking dependent) linear isomorphism between the spaces $H^0(L_\Theta^2)^*$ and $H^0(L_\Theta^2)$ defined by
\[ \iota(\phi)(u) = \langle\vartheta(\,\cdot - u)\,\vartheta(\,\cdot + u),\, \phi\rangle. \tag{3.9} \]
An easy calculation shows that
\[ \vartheta(u' - u)\,\vartheta(u' + u) = \sum_{e}\theta_{2,e}(u')\,\theta_{2,e}(u). \tag{3.10} \]
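The quasi-periodicity (3.1) and the addition formula (3.10) lend themselves to a quick numerical sanity check at genus 2. The sketch below truncates the series (3.2); the period matrix `TAU`, the cutoff and all function names are assumptions chosen for illustration (any symmetric matrix with positive-definite imaginary part would do), not data from the paper.

```python
import cmath
import itertools

# Assumed sample period matrix (not from the paper): symmetric, Im(tau) positive definite.
TAU = [[0.3 + 1.0j, 0.2 + 0.1j],
       [0.2 + 0.1j, -0.1 + 1.2j]]


def dot(a, b):
    """Bilinear (not Hermitian) pairing of two 2-vectors."""
    return a[0] * b[0] + a[1] * b[1]


def tau_times(v, tau=TAU):
    """Matrix-vector product tau * v."""
    return [tau[0][0] * v[0] + tau[0][1] * v[1],
            tau[1][0] * v[0] + tau[1][1] * v[1]]


def theta(k, e, u, tau=TAU, cutoff=6):
    """Series (3.2) for theta_{k,e}(u) at genus 2, truncated to |n_a| <= cutoff."""
    total = 0j
    for n in itertools.product(range(-cutoff, cutoff + 1), repeat=2):
        m = [n[0] + e[0] / k, n[1] + e[1] / k]
        total += cmath.exp(1j * cmath.pi * k * (dot(m, tau_times(m, tau)) + 2 * dot(m, u)))
    return total


def quasi_periodicity_defect(k, e, u, p, q, tau=TAU):
    """|theta(u + p + tau q) - factor * theta(u)| with the factor of Eq. (3.1)."""
    tq = tau_times(q, tau)
    shifted = [u[a] + p[a] + tq[a] for a in range(2)]
    factor = cmath.exp(-1j * cmath.pi * k * dot(q, tq) - 2j * cmath.pi * k * dot(q, u))
    return abs(theta(k, e, shifted, tau) - factor * theta(k, e, u, tau))


def addition_defect(u_prime, u, tau=TAU):
    """|lhs - rhs| of the addition formula (3.10)."""
    minus = [u_prime[a] - u[a] for a in range(2)]
    plus = [u_prime[a] + u[a] for a in range(2)]
    lhs = theta(1, (0, 0), minus, tau) * theta(1, (0, 0), plus, tau)
    rhs = sum(theta(2, e, u_prime, tau) * theta(2, e, u, tau)
              for e in itertools.product((0, 1), repeat=2))
    return abs(lhs - rhs)


if __name__ == "__main__":
    u = [0.13 + 0.07j, -0.21 + 0.05j]
    v = [0.31 - 0.02j, 0.09 + 0.11j]
    print("defect of (3.1): ", quasi_periodicity_defect(2, (1, 0), u, (1, -2), (1, 0)))
    print("defect of (3.10):", addition_defect(u, v))
```

Both defects should come out at the level of floating-point rounding, since the neglected terms of the series are exponentially small for this choice of `TAU`.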
Hence $\iota$ interchanges the basis $(\theta_{2,e})$ of $H^0(L_\Theta^2)$ with the dual basis $(\theta^*_{2,e})$ of $H^0(L_\Theta^2)^*$. Denote by $\phi_u$ the linear form on $H^0(L_\Theta^2)$ that computes the value of the theta function at point $u \in \mathbb{C}^2$. The Kummer quartic $\mathcal{K}^* \subset H^0(L_\Theta^2)^*$, $\mathcal{K}^* = \{\mathbb{C}^*\phi_{u'} \,|\, u' \in \mathbb{C}^2\}$, is mapped by the isomorphism $\iota$ into a quartic $\mathcal{K} \subset H^0(L_\Theta^2)$ of theta functions proportional to $u \mapsto \vartheta(u' - u)\,\vartheta(u' + u)$ for some $u' \in \mathbb{C}^2$. One may define a projective action of $(\mathbb{Z}/2\mathbb{Z})^4$ on $H^0(L_\Theta^2)$ by assigning to an element $(e, e') \in (\mathbb{Z}/2\mathbb{Z})^4$, with $e, e' = (0, 0), (1, 0), (0, 1)$ or $(1, 1)$, a linear transformation $U_{e,e'}$ s.t.
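\[ (U_{e,e'}\theta)(u) = e^{\frac{1}{2}\pi i\, e'\cdot\tau e' + 2\pi i\, e'\cdot u}\;\theta\big(u + \tfrac{1}{2}(e + \tau e')\big). \tag{3.11} \]
The relation $U_{e_1,e'_1}U_{e_2,e'_2} = (-1)^{e_1\cdot e'_2}\,U_{e_1+e_2,\,e'_1+e'_2}$ holds so that $U$ lifts to the Heisenberg group. In the action on the basic theta functions,
\[ U_{e_1,e'_1}\theta_{2,e} = (-1)^{e_1\cdot e}\,\theta_{2,e+e'_1}. \tag{3.12} \]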
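The action (3.11), its composition rule and its effect (3.12) on the basis $\theta_{2,e}$ can also be probed numerically. As before this is a hedged sketch: the sample period matrix `TAU`, the truncation and the helper names are assumptions made for illustration only.

```python
import cmath
import itertools

# Assumed sample period matrix (not from the paper): symmetric, Im(tau) positive definite.
TAU = [[0.3 + 1.0j, 0.2 + 0.1j],
       [0.2 + 0.1j, -0.1 + 1.2j]]
LABELS = list(itertools.product((0, 1), repeat=2))


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def tau_times(v, tau=TAU):
    return [tau[0][0] * v[0] + tau[0][1] * v[1],
            tau[1][0] * v[0] + tau[1][1] * v[1]]


def theta2(e, u, tau=TAU, cutoff=6):
    """Truncated series (3.2) with k = 2, i.e. theta_{2,e}(u)."""
    total = 0j
    for n in itertools.product(range(-cutoff, cutoff + 1), repeat=2):
        m = [n[0] + e[0] / 2, n[1] + e[1] / 2]
        total += cmath.exp(2j * cmath.pi * (dot(m, tau_times(m, tau)) + 2 * dot(m, u)))
    return total


def heisenberg(e, e_prime, func, tau=TAU):
    """Return the function u -> (U_{e,e'} func)(u) defined by Eq. (3.11)."""
    te = tau_times(e_prime, tau)

    def transformed(u):
        prefactor = cmath.exp(0.5j * cmath.pi * dot(e_prime, te)
                              + 2j * cmath.pi * dot(e_prime, u))
        shifted = [u[a] + 0.5 * (e[a] + te[a]) for a in range(2)]
        return prefactor * func(shifted)

    return transformed


def max_basis_action_defect(u, tau=TAU):
    """Largest violation of Eq. (3.12) over all (e1, e1') and basis labels e."""
    worst = 0.0
    for e1, e1p, e in itertools.product(LABELS, repeat=3):
        lhs = heisenberg(e1, e1p, lambda v: theta2(e, v, tau), tau)(u)
        label = tuple((e[a] + e1p[a]) % 2 for a in range(2))
        rhs = (-1) ** dot(e1, e) * theta2(label, u, tau)
        worst = max(worst, abs(lhs - rhs))
    return worst


def composition_defect(e1, e1p, e2, e2p, u, tau=TAU):
    """Violation of U_{e1,e1'} U_{e2,e2'} = (-1)^{e1.e2'} U_{e1+e2,e1'+e2'} on theta_{2,(0,0)}."""
    func = lambda v: theta2((0, 0), v, tau)
    lhs = heisenberg(e1, e1p, heisenberg(e2, e2p, func, tau), tau)(u)
    e_sum = [e1[a] + e2[a] for a in range(2)]
    ep_sum = [e1p[a] + e2p[a] for a in range(2)]
    rhs = (-1) ** dot(e1, e2p) * heisenberg(e_sum, ep_sum, func, tau)(u)
    return abs(lhs - rhs)


if __name__ == "__main__":
    u = [0.13 + 0.07j, -0.21 + 0.05j]
    print("defect of (3.12):", max_basis_action_defect(u))
    print("composition defect:", composition_defect((1, 0), (0, 1), (1, 1), (1, 0), u))
```

The composition check uses the unreduced sums $e_1 + e_2$ and $e'_1 + e'_2$ in formula (3.11), for which the sign rule holds identically for any function, not only for theta functions.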
The marking-dependence of the isomorphism $\iota$ of Eq. (3.9) is given by the action of $(\mathbb{Z}/2\mathbb{Z})^4$. It is easy to check that this action preserves $\mathcal{K}$ and that the transposed action of $(\mathbb{Z}/2\mathbb{Z})^4$ preserves $\mathcal{K}^*$. The $(\mathbb{Z}/2\mathbb{Z})^4$ symmetry of the Kummer quartics allows one to find their defining equation easily, see Appendix 3. It was shown in [18] that the image of $\mathcal{N}_s$ under the map (3.5) contains all nonzero theta functions of the 2nd order except the ones in the Kummer quartic $\mathcal{K}$. The latter correspond, however, to the (Seshadri equivalence classes of) semistable but not stable bundles so that the map $m$ extends to an isomorphism between $\mathcal{N}_{ss}$ and $\mathbb{P}H^0(L_\Theta^2)$, showing that $\mathcal{N}_{ss}$ is a smooth projective variety.
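4. Cotangent Bundle

Let us describe the cotangent space of $\mathcal{N}_s$ at point $E$. The covectors tangent to $\mathcal{N}_s$ at $E$ may be identified with holomorphic 1,0-forms $\Psi$ with values in the bundle of traceless endomorphisms of $E$. We may assume that $E$ is an extension of a line bundle $l$ of degree 1 realized as $l^{-1} \oplus l$ with $\bar\partial_E = \bar\partial_{l^{-1}\oplus l} + B$, where $B = \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$. Then
\[ \Psi = \begin{pmatrix} -\mu & \nu \\ \eta & \mu \end{pmatrix}, \tag{4.1} \]
where $\mu \in \wedge^{10}$, $\nu \in \wedge^{10}(l^{-2})$, $\eta \in \wedge^{10}(l^2)$ and
\[ \bar\partial\mu = -\eta \wedge b, \qquad \bar\partial_{l^2}\eta = 0, \qquad \bar\partial_{l^{-2}}\nu = 2\mu \wedge b. \tag{4.2} \]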
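Relations (4.2) can be read off entry by entry from the holomorphicity of $\Psi$. The sketch below assumes that holomorphicity means $\bar\partial\Psi + B\wedge\Psi + \Psi\wedge B = 0$, in line with the expression $\bar\partial\Phi + A\wedge\Phi + \Phi\wedge A$ of Sect. 2, and uses only that 1-forms anticommute.

```latex
B\wedge\Psi + \Psi\wedge B
  = \begin{pmatrix} b\wedge\eta & b\wedge\mu - \mu\wedge b \\ 0 & \eta\wedge b \end{pmatrix}
  = \begin{pmatrix} -\eta\wedge b & -2\,\mu\wedge b \\ 0 & \eta\wedge b \end{pmatrix},
\qquad
\bar\partial\Psi
  = \begin{pmatrix} -\bar\partial\mu & \bar\partial\nu \\ \bar\partial\eta & \bar\partial\mu \end{pmatrix},

% setting the sum to zero, entry by entry:
\bar\partial\mu = -\eta\wedge b, \qquad
\bar\partial\eta = 0, \qquad
\bar\partial\nu = 2\,\mu\wedge b .
```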
It is easy to relate the above description of covectors tangent to $\mathcal{N}_s$ to the one of Sect. 2. Let $U : l^{-1} \oplus l \to \Sigma \times \mathbb{C}^2$ be a smooth isomorphism of rank 2 bundles with trivial determinant. Then $U\bar\partial_E U^{-1} = \bar\partial + A$ for a certain $sl_2$-valued 0,1-form $A$, and $\Phi = U\Psi U^{-1}$ satisfies $\bar\partial_A\Phi = 0$. The $\mathcal{G}$ orbit of $(A, \Phi)$ is independent of the choice of $U$ and the quadratic Hitchin Hamiltonian takes value $\frac{1}{2}\mathrm{tr}(\Phi)^2$ on it. The latter expression is clearly equal to $\frac{1}{2}\mathrm{tr}(\Psi)^2 = \mu^2 + \eta\nu$ which, as easily follows from relations (4.2), defines a holomorphic quadratic differential. Hence
\[ H(E, \Psi) = \mu^2 + \eta\nu. \tag{4.3} \]
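The holomorphicity claimed for (4.3) can be made explicit in a local holomorphic coordinate $z$ and a local holomorphic trivialization of $l$. The component names $m$, $h$, $n$, $\beta$ below are introduced only for this sketch.

```latex
% local components (illustrative names): \mu = m\,dz,\ \eta = h\,dz,\ \nu = n\,dz,\ b = \beta\,d\bar z
\tfrac{1}{2}\,\mathrm{tr}\,\Psi^2
  = \tfrac{1}{2}\,\mathrm{tr}\begin{pmatrix} \mu^2 + \nu\eta & 0 \\ 0 & \eta\nu + \mu^2 \end{pmatrix}
  = \mu^2 + \eta\nu = (m^2 + h\,n)\,(dz)^2 ,

% relations (4.2) in components, using dz\wedge d\bar z = -\,d\bar z\wedge dz:
\partial_{\bar z} m = h\,\beta, \qquad
\partial_{\bar z} h = 0, \qquad
\partial_{\bar z} n = -2\,m\,\beta ,

\partial_{\bar z}\,(m^2 + h\,n) = 2\,m\,h\,\beta + h\,(-2\,m\,\beta) = 0 .
```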
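We would like to express the latter using the theta function description of $T^*\mathcal{N}_{ss} = T^*\mathbb{P}H^0(L_\Theta^2)$, where the covectors tangent to $\mathcal{N}_{ss}$ at $\mathbb{C}^*\theta$ are represented by linear forms $\phi$ on $H^0(L_\Theta^2)$ s.t. $\langle\theta, \phi\rangle = 0$. Let $l = l_{u_1} \in C_E$, i.e. $\theta(u_1) = 0$ for the theta function corresponding to $E$. We shall assume that $l^2 \ne K$, i.e. that $2u_1 \notin \mathbb{Z}^2 + \tau\mathbb{Z}^2$. An infinitesimal variation $\delta E$ of the bundle $E$ in $\mathcal{N}_s$ may be achieved by changing $\bar\partial_E = \bar\partial_{l^{-1}\oplus l} + B$ with $B = \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$ to
\[ \bar\partial_{l^{-1}\oplus l} + \begin{pmatrix} \pi\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\bar\omega & b + \delta b \\ 0 & -\pi\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\bar\omega \end{pmatrix} \equiv \bar\partial_E + \delta B \tag{4.4} \]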
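(all other variations of $\bar\partial_E$ may be obtained from (4.4) by infinitesimal gauge transformations). Clearly
\[ \langle\delta E, \Psi\rangle = \int_\Sigma \mathrm{tr}\,\Psi \wedge \delta B = -2\pi\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\int_\Sigma \mu \wedge \bar\omega + \int_\Sigma \eta \wedge \delta b. \tag{4.5} \]
Note that the line bundle $l_{u_1}$ with the $\bar\partial$-operator changed to $\bar\partial_{l_{u_1}} - \pi\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\bar\omega$ is equivalent to $l_{u_1 + \delta u_1} \equiv l'$ and the equivalence is established by multiplication by the multivalued function $x \mapsto e^{2\pi i\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\int_{x_0}^{x}\mathrm{Im}\,\omega}$. Hence $l^{-1} \oplus l$ with the $\bar\partial$-operator given by Eq. (4.4) is equivalent to $l'^{-1} \oplus l'$ with the $\bar\partial$-operator $\bar\partial_{l'^{-1}\oplus l'} + \begin{pmatrix} 0 & b + \delta' b \\ 0 & 0 \end{pmatrix}$, where
\[ \delta' b(x) = \delta b - 4\pi i\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\Big(\int_{x_0}^{x}\mathrm{Im}\,\omega\Big)\, b(x). \]
The last bundle corresponds by the relation (3.6) to the theta function
\[ (\theta + \delta\theta)(u) = \int_\Sigma K(x; u_1 + \delta u_1, u) \wedge \big(b(x) + \delta' b(x)\big). \]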
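Hence $\delta E$ is represented by the variation
\[ \delta\theta(u) = -2\pi\,\delta u_1^a (\mathrm{Im}\,\tau)^{-1}_{ab}\int_\Sigma L^b(x; u_1, u) \wedge b(x) + \int_\Sigma K(x; u_1, u) \wedge \delta b(x) \tag{4.6} \]
of the theta function, where
\[ L^a(x; u_1, u) = K(x; u_1, u)\int_{x_0}^{x}(\omega^a - \bar\omega^a) - \frac{1}{2\pi}\,\mathrm{Im}\,\tau^{ab}\,\partial_{u_1^b}K(x; u_1, u). \tag{4.7} \]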
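Note that as functions of $x$, $L^a(x; u_1, u)$ are 1,0-forms with values in $l_{u_1}^2$ (as are $K(x; u_1, u)$). They are not holomorphic: $\bar\partial_x L^a(x; u_1, u) = K(x; u_1, u) \wedge \bar\omega^a(x)$. As functions of $u$, $L^a(x; u_1, u)$ are theta functions of the 2nd order. We would like to find an explicit form of the Lax matrix $\Psi$ representing the linear form $\phi$ on $H^0(L_\Theta^2)$ s.t. $\langle\theta, \phi\rangle = 0$. We shall achieve this goal partially, finding the entries $\eta$ and $\mu$ of the matrix (4.1). The correspondence between $\Psi$ and $\phi$ is determined by the equality $\langle\delta E, \Psi\rangle = \langle\delta\theta, \phi\rangle$. Since the left-hand side is given by Eq. (4.5) and $\delta\theta$ by Eq. (4.6), we obtain
\[ \begin{aligned} &-2\pi\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\int_\Sigma \mu \wedge \bar\omega + \int_\Sigma \eta \wedge \delta b \\ &\qquad = -2\pi\,\delta u_1^a (\mathrm{Im}\,\tau)^{-1}_{ab}\int_\Sigma \langle L^b(x; u_1, \cdot\,), \phi\rangle \wedge b(x) + \int_\Sigma \langle K(x; u_1, \cdot\,), \phi\rangle \wedge \delta b(x). \end{aligned} \tag{4.8} \]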
Taking $\delta u_1 = 0$ we infer that
\[ \eta(x) = \langle K(x; u_1, \cdot\,), \phi\rangle \tag{4.9} \]
is the lower left entry of the matrix $\Psi$ corresponding to the linear form $\phi$. It is easy to find the entry $\mu$ of $\Psi$ representing the linear form $\phi_{u_1}$ (recall that $\phi_{u_1}$ computes the value of a theta function in $H^0(L_\Theta^2)$ at point $u_1$). Since $K(x; u_1, u_1) = 0$, it follows from Eq. (4.9) that $\eta = 0$ in this case. Equation (4.8) reduces then to
\[ -2\pi\,\delta u_1 (\mathrm{Im}\,\tau)^{-1}\int_\Sigma \mu \wedge \bar\omega = \delta u_1^a \int_\Sigma \partial_{u_1^a}K(x; u_1, u_1) \wedge b(x) = -\delta u_1^a \int_\Sigma \partial_{u^a}K(x; u_1, u_1) \wedge b(x) = -\delta u_1^a\,\partial_a\theta(u_1). \]
This fixes $\mu$ uniquely:
\[ \mu = \frac{i}{4\pi}\,\partial_a\theta(u_1)\,\omega^a. \tag{4.10} \]
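The coefficient in (4.10) can be recovered in one line. The sketch below assumes the ansatz $\mu = c_a\omega^a$ (legitimate here since $\eta = 0$ makes $\mu$ holomorphic) and uses the Riemann bilinear relation in the normalization of Sect. 3; the name $c_a$ is introduced only for illustration.

```latex
% Riemann bilinear relation for A-periods \delta^{ab} and B-periods \tau^{ab}
\int_\Sigma \omega^a\wedge\bar\omega^b = \bar\tau^{ba} - \tau^{ab} = -2i\,\mathrm{Im}\,\tau^{ab},

% insert \mu = c_d\,\omega^d into the reduced form of (4.8)
-2\pi\,\delta u_1^a\,(\mathrm{Im}\,\tau)^{-1}_{ab}\; c_d\,\bigl(-2i\,\mathrm{Im}\,\tau^{db}\bigr)
  = 4\pi i\,\delta u_1^a\, c_a
  = -\,\delta u_1^a\,\partial_a\theta(u_1)
\quad\Longrightarrow\quad
c_a = \frac{i}{4\pi}\,\partial_a\theta(u_1).
```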
Let us check that there exists $\nu \in \wedge^{10}(l_{u_1}^{-2})$ such that the last equation of (4.2) holds. For this it is necessary and sufficient that
\[ \int_\Sigma \kappa\,\mu \wedge b = 0 \tag{4.11} \]
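for a non-zero holomorphic section $\kappa$ of $l_{u_1}^2 = V(2u_1)K$ ($\dim H^0(l_{u_1}^2) = 1$ if $2u_1 \notin \mathbb{Z}^2 + \tau\mathbb{Z}^2$). But such a section may be represented by the function
\[ x \longmapsto \vartheta\Big(\int_{x_0}^{x}\omega - 2u_1 - \Delta\Big) \]
so that, recalling the definition (3.7), we obtain
\[ \int_\Sigma \kappa\,\omega^a \wedge b = \epsilon^{ab}\int_\Sigma \partial_{u^b}K(x; u_1, u_1) \wedge b(x) = \epsilon^{ab}\,\partial_b\theta(u_1). \tag{4.12} \]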
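Combining Eqs. (4.10) and (4.12), condition (4.11) reduces to the contraction of an antisymmetric symbol with a symmetric product, as the following short computation shows.

```latex
\int_\Sigma \kappa\,\mu\wedge b
  = \frac{i}{4\pi}\,\partial_a\theta(u_1)\int_\Sigma \kappa\,\omega^a\wedge b
  = \frac{i}{4\pi}\,\epsilon^{ab}\,\partial_a\theta(u_1)\,\partial_b\theta(u_1)
  = 0 ,
\qquad\text{since}\quad \epsilon^{ab} = -\,\epsilon^{ba}.
```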
Hence the relation (4.11) follows for $\mu$ given by Eq. (4.10). The 1,0-form $\nu$ satisfying the last relation of (4.2) is now unique since $H^0(l_{u_1}^{-2}K) = \{0\}$. We would like to find the entry $\mu$ of $\Psi$ corresponding to more general linear forms $\phi$ s.t. $\langle\theta, \phi\rangle = 0$. Recall that $\theta$ with $\theta(u_1) = 0$ may be given by formula (3.6) with $b \in \wedge^{01}(l_{u_1}^{-2})$. Note that any 2nd-order theta function $\delta\theta$ vanishing at $u_1$ and not in the Kummer quartic $\mathcal{K}$ may be written as
\[ \delta\theta(u) = \int_\Sigma K(x; u_1, u) \wedge \delta b(x) \tag{4.13} \]
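with $\delta b \in \wedge^{01}(l_{u_1}^{-2})$ since it corresponds to an extension of $l_{u_1}$. The space of $\delta\theta$ vanishing at $u_1$ is 3-dimensional, as well as the space $H^1(l_{u_1}^{-2})$ of classes $[\delta b]$, and the assumption that $\delta\theta \notin \mathcal{K}$ is obviously superfluous. Set for a linear form $\psi$ on $H^0(L_\Theta^2)$,
\[ \eta_\psi(x) = \langle K(x; u_1, \cdot\,), \psi\rangle. \tag{4.14} \]
$\eta_\psi$ defines a holomorphic 1,0-form with values in $l_{u_1}^2$. We have
\[ \langle\delta\theta, \psi\rangle = \int_\Sigma \eta_\psi \wedge \delta b \tag{4.15} \]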
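for $\delta\theta$ given by Eq. (4.13). By dimensional count, the map $\psi \mapsto \eta_\psi$ is onto $H^0(l_{u_1}^2 K)$ with the 1-dimensional kernel spanned by $\phi_{u_1}$. Specifying Eq. (4.15) to $\delta\theta \propto \theta$, we obtain the relation
\[ \langle\theta, \psi\rangle = \int_\Sigma \eta_\psi \wedge b \tag{4.16} \]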
which determines the class $[b] \in H^1(l_{u_1}^{-2})$ in terms of $\theta$. On the other hand, taking $\psi = \phi$ in Eq. (4.14), we infer that $\eta = 0$ if and only if $\phi$ is proportional to $\phi_{u_1}$, the case studied before. If $\eta_\phi \ne 0$ then $\mu$ depends on the choice of the representative $b$ in the class $[b] \in H^1(l_{u_1}^{-2})$ characterizing $E$ as the extension of $l_{u_1}$. Under the transformation $b \mapsto b + \bar\partial\varphi$, where $\varphi$ is a section of $l_{u_1}^{-2}$,
\[ \eta \longmapsto \eta, \qquad \mu \longmapsto \mu + \varphi\eta, \qquad \nu \longmapsto \nu - 2\varphi\mu - \varphi^2\eta. \]
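Although $\mu$ and $\nu$ separately depend on the representative $b$, the Hamiltonian (4.3) does not. A direct substitution of the transformed entries, treating the local coefficients as commuting quantities, gives:

```latex
(\mu + \varphi\eta)^2 + \eta\,\bigl(\nu - 2\varphi\mu - \varphi^2\eta\bigr)
  = \mu^2 + 2\varphi\,\eta\mu + \varphi^2\eta^2 + \eta\nu - 2\varphi\,\eta\mu - \varphi^2\eta^2
  = \mu^2 + \eta\nu .
```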
The pairing of the theta functions $L^a(x; u_1, \cdot\,)$ of Eq. (4.7) with the linear form $\phi$ gives two 1,0-forms with values in $l_{u_1}^2$:
\[ \chi^a(x) = \langle L^a(x; u_1, \cdot\,), \phi\rangle \quad\text{s.t.}\quad \bar\partial\chi^a = \eta \wedge \bar\omega^a. \tag{4.17} \]
Specifying the equality (4.8) to the case with $\delta b = 0$, we infer the relation
\[ \int_\Sigma \mu \wedge \bar\omega^a = \int_\Sigma \chi^a \wedge b \tag{4.18} \]
which, together with the equation
\[ \bar\partial\mu = -\eta \wedge b, \tag{4.19} \]
determines $\mu$ completely. In Appendix 2, we show that $\mu$ fixed this way satisfies the relation $\int_\Sigma \kappa\,\mu \wedge b = 0$ and hence defines a unique 1,0-form $\nu$ with values in $l_{u_1}^{-2}$ s.t. $\bar\partial\nu = 2\mu \wedge b$.

5. Hitchin Hamiltonians

From the relation (4.3) and the explicit form of $\Psi$ corresponding to $\phi_{u_1}$ ($\eta$ vanishing, $\mu$ given by Eq. (4.10)), one obtains
\[ H(\theta, a_1\phi_{u_1}) = -\frac{1}{16\pi^2}\,a_1^2\,\big(\partial_a\theta(u_1)\,\omega^a\big)^2. \tag{5.1} \]
The right-hand side is a quadratic differential. Equation (5.1), whose projective version was first obtained in [21], is consistent with the rescaling $\theta\mapsto t\theta$ and $\phi\mapsto t^{-1}\phi$ for $t\in\mathbb{C}^*$. It describes the value of the Hitchin map $H$ on the special covectors, namely those represented by the pairs $(\theta,\phi)$ s.t. $\mathbb{C}^*\phi$ is in the intersection $\mathcal{K}^*_E$ of the Kummer quartic $\mathcal{K}^*$ with the plane $\langle\theta,\phi\rangle = 0$. The linear span of $\mathcal{K}^*_E$ gives the whole cotangent space $T^*_E N_{ss}$. Indeed, any theta function of the 2nd order $\delta\theta$ which vanishes on $C_E$ has to be proportional to $\theta$ and defines a zero vector in $T_E N_{ss}$. $\mathcal{K}^*_E$ is itself a quartic. Hence in a general position the restriction of the quadratic polynomial $H$ to six lines in $\mathcal{K}^*_E$ determines $H$ completely.

It is possible to find a more explicit description of the values of $H$ away from $\mathcal{K}^*_E$ and this is the main aim of the rest of the present section. Suppose then that the entry $\eta$ in $\Psi$ does not vanish. Let $x_i$, $i=1,\dots,4$, be its four zeros. We shall assume that $\eta$ cannot be written as $\kappa\omega$ for $\kappa\in H^0(l_{u_1}^2)$ and $\omega\in H^0(K)$. This is true for generic $\phi$. In this case, $\eta = a_2\,\eta_{\phi_{u_2}}$ for some $a_2\in\mathbb{C}^*$ and for $u_2$ satisfying
\[ u_1+u_2 = \int_{x_0}^{x_1}\omega + \int_{x_0}^{x_2}\omega - 2\Delta \qquad\text{and}\qquad u_1-u_2 = \int_{x_0}^{x_3}\omega + \int_{x_0}^{x_4}\omega - 2\Delta, \tag{5.2} \]
$u_1\pm u_2\notin\mathbb{Z}^2+\tau\mathbb{Z}^2$. Indeed, $\eta_{\phi_{u_2}}(x)$ is a holomorphic section of $l_{u_1}^2K$ represented by the multivalued function
\[ \vartheta\Big(\int_{x_0}^{x}\omega - u_1 - u_2 - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega - u_1 + u_2 - \Delta\Big) \]
vanishing exactly at $x_i$ and such a section is unique up to normalization. We infer that in the action on the theta functions of Eq. (4.13), the linear forms $\phi$ and $a_2\phi_{u_2}$ coincide. Since Eq. (4.13) gives all theta functions vanishing at $u_1$, it follows that
\[ \phi = a_1\phi_{u_1} + a_2\phi_{u_2} \tag{5.3} \]
for some $a_1\in\mathbb{C}$. Let us stress that, to fix normalizations, $u_1$ and $u_2$ should be viewed as elements of $\mathbb{C}^2$ with $x_i$ in relations (5.2) belonging to the covering space $\tilde\Sigma$ of $\Sigma$. The relation $\langle\theta,\phi\rangle = 0$ implies that $\theta(u_2) = 0$.
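The last implication can be spelled out in one line. The sketch below assumes that $\phi_u$ denotes the evaluation form $\langle\theta,\phi_u\rangle=\theta(u)$, which is how the forms $\partial_a\phi_{u_1}$ are used later in this section; it then combines the decomposition (5.3) with $\theta(u_1)=0$ and $a_2\in\mathbb{C}^*$.

```latex
0 = \langle\theta,\phi\rangle
  = a_1\,\langle\theta,\phi_{u_1}\rangle + a_2\,\langle\theta,\phi_{u_2}\rangle
  = a_1\,\theta(u_1) + a_2\,\theta(u_2)
  = a_2\,\theta(u_2),
\qquad a_2\neq 0 \;\Longrightarrow\; \theta(u_2)=0 .
```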
K. Gawe¸dzki, P. Tran-Ngoc-Bich
Summarizing, we have shown that a generic pair $(\theta,\phi)$ s.t. $\langle\theta,\phi\rangle = 0$ may be obtained by first choosing $u_1$ and $u_2$ s.t. $2u_1, 2u_2, u_1\pm u_2\notin\mathbb{Z}^2+\tau\mathbb{Z}^2$ and then taking $\theta$ from the 2-dimensional space of theta functions vanishing at $u_1$ and $u_2$ and $\phi$ from the orthogonal subspace. The zeros $x_i$ of $\eta$ are determined from Eqs. (5.2) (as the zeros of $\vartheta(\int_{x_0}^{x}\omega - u_1\pm u_2-\Delta)$). For simplicity, we shall assume that they are distinct (this is true for generic $\phi$). Then the differentials $\partial\eta(x_i)\in(l_{u_1}^2K^2)_{x_i}$ do not vanish. A quadratic differential $\rho\in H^0(K^2)$ is determined by its values at four points $x_i$ which form a divisor of $l_{u_1}^2K\neq K^2$. Since $\dim H^0(K^2) = 3$, there is one linear relation satisfied by all $\rho(x_i)$:
\[ \sum_{i=1}^{4}\rho(x_i)\,\kappa(x_i)\,\partial\eta(x_i)^{-1} = 0 \]
for $0\neq\kappa\in H^0(l_{u_1}^2)$. It expresses the fact that the sum of residues of the meromorphic 1,0-form $\rho\kappa\eta^{-1}$ has to vanish. For $\rho = H(\theta,\phi) = \mu^2+\eta\nu$, $\rho(x_i) = \mu(x_i)^2$ so that it is enough to know $\mu(x_i)$ in order to determine $H(\theta,\phi)$. Note that although the 1,0-form $\mu$ depends on the choice of the representative $b$ of the class $[b]\in H^1(l_{u_1}^{-2})$ defined by Eq. (4.16), the values $\mu(x_i)$ are invariant since under $b\mapsto b+\bar\partial\varphi$ the 1,0-form $\mu$ changes to $\mu+\varphi\eta$.

It remains to find $\mu(x_i)$. Consider the meromorphic function $\eta_\psi\eta^{-1}$. Viewed as a distribution, $\bar\partial(\eta_\psi\eta^{-1})$ is supported at the poles of $\eta_\psi\eta^{-1}$ and
\[ \int_\Sigma \mu\wedge\bar\partial(\eta_\psi\eta^{-1}) = -2\pi i\sum_{i=1}^{4}\mu(x_i)\,\eta_\psi(x_i)\,\partial\eta(x_i)^{-1} \]
for any (smooth) 1,0-form $\mu$. In particular, for $\mu$ satisfying Eq. (4.19) we obtain
\[ \sum_{i=1}^{4}\mu(x_i)\,\eta_\psi(x_i)\,\partial\eta(x_i)^{-1} = \frac{1}{2\pi i}\int_\Sigma\eta_\psi\wedge b = \frac{1}{2\pi i}\,\langle\theta,\psi\rangle. \tag{5.4} \]
Recall that $\eta_\psi$ runs through the three-dimensional space $H^0(l_{u_1}^2K)$. If $\eta_\psi(x_i) = 0$ for all $i$ then $\eta_\psi$ has to be proportional to $\eta = a_2\eta_{\phi_{u_2}}$. Hence the vectors $(\eta_\psi(x_i))$ form a 2-dimensional subspace in $\oplus_i(l_{u_1}^2K)_{x_i}$ and Eqs. (5.4) determine the vector $(\mu(x_i))\in\oplus_iK_{x_i}$ up to a 2-dimensional ambiguity spanned by $(\omega^a(x_i))$ (indeed, as the residues of the meromorphic 1,0-form $\eta_\psi\eta^{-1}\omega^a$, the numbers $\omega^a(x_i)\,\eta_\psi(x_i)\,\partial\eta(x_i)^{-1}$ sum to zero). It is clearly enough to take for $\psi$ in Eq. (5.4) any two linear forms independent of $\phi_{u_1}$ and $\phi_{u_2}$. In the generic situation, we may choose the forms $\partial_a\phi_{u_1}$ defined by $\langle\theta,\partial_a\phi_{u_1}\rangle = \partial_a\theta(u_1)$. Denoting the corresponding 1,0-forms $\eta_\psi$ by $\eta'_a$, we obtain 2 relations for $\mu(x_i)$:
\[ \sum_{i=1}^{4}\mu(x_i)\,\eta'_a(x_i)\,\partial\eta(x_i)^{-1} = \frac{1}{2\pi i}\,\partial_a\theta(u_1). \tag{5.5} \]
Alternatively, we may choose for $\psi$ the linear forms $\partial_a\phi_{u_2}$ corresponding to 1,0-forms $\eta''_a$. This gives the relations
\[ \sum_{i=1}^{4}\mu(x_i)\,\eta''_a(x_i)\,\partial\eta(x_i)^{-1} = \frac{1}{2\pi i}\,\partial_a\theta(u_2). \tag{5.6} \]
$\eta''_a$ must be linearly dependent on $\eta'_a$ and $\eta$ (in the generic situation):
\[ \eta''_a = D_{ab}\,\eta'_b + \eta \tag{5.7} \]
leading via Eqs. (5.5) and (5.6) to the relation $\partial_a\theta(u_2) = D_{ab}\,\partial_b\theta(u_1)$. We need 2 more equations to determine $\mu(x_i)$. They may be obtained from Eqs. (4.18) fixing the holomorphic contributions to $\mu$. Indeed, using the 2nd equation in (4.17), and Eq. (4.19) we infer that
\[ \int_\Sigma\mu\wedge\bar\omega^a = \int_\Sigma(\mu\eta^{-1})\,\eta\wedge\bar\omega^a = \int_\Sigma(\mu\eta^{-1})\,\bar\partial\chi^a = \int_\Sigma\chi^a\wedge\bar\partial(\mu\eta^{-1}) = \int_\Sigma\chi^a\wedge b - 2\pi i\sum_{i=1}^{4}\mu(x_i)\,\chi^a(x_i)\,\partial\eta(x_i)^{-1} \tag{5.8} \]
so that Eq. (4.18) implies that
\[ \sum_{i=1}^{4}\mu(x_i)\,\chi^a(x_i)\,\partial\eta(x_i)^{-1} = 0. \tag{5.9} \]
These are the two missing equations. To see this, repeat the calculation (5.8) for $\mu$ replaced by $\omega^b$. This gives the relation
\[ \frac{1}{\pi}\,\mathrm{Im}\,\tau^{ab} = \sum_{i=1}^{4}\omega^b(x_i)\,\chi^a(x_i)\,\partial\eta(x_i)^{-1}. \]
Suppose now that $d_a\chi^a(x_i) + e\,\eta_\psi(x_i) = 0$ for $i=1,\dots,4$. It follows that
\[ 0 = \sum_{i=1}^{4}\omega^b(x_i)\,\big(d_a\chi^a(x_i) + e\,\eta_\psi(x_i)\big)\,\partial\eta(x_i)^{-1} = \frac{1}{\pi}\,\mathrm{Im}\,\tau^{ab}\,d_a \]
so that $d_a = 0$. Hence the vectors $(\chi^a(x_i))$ span a 2-dimensional subspace of $\oplus_iK_{x_i}$ transversal to the 2-dimensional subspace spanned by the vectors $(\eta_\psi(x_i))$ and the linear equations (5.4) and (5.9) determine $\mu(x_i)$ completely.

It is enough to consider the case $\phi = \phi_{u_2}$. Indeed, the shift $\phi\mapsto\phi+a_1\phi_{u_1}$ results in the change
\[ \mu\;\mapsto\;\mu + \frac{i}{4\pi}\,a_1\,\partial_a\theta(u_1)\,\omega^a, \]
see Eq. (4.10). Identifying 1,0-forms with multivalued functions by the relation (3.3) and setting $\chi_a = 2\pi(\mathrm{Im}\,\tau)^{-1}_{ab}\chi^b$, $w_i = \int_{x_0}^{x_i}\omega - \Delta$, $G_1 = G_{12} = -G_2$ and $G_3 = G_{34} = -G_4$ where
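\[ G_{ij} = \det\begin{pmatrix}\partial_1\vartheta(w_i) & \partial_1\vartheta(w_j)\\ \partial_2\vartheta(w_i) & \partial_2\vartheta(w_j)\end{pmatrix}, \]
we obtain
\begin{align*}
\partial\eta(x_1) &= G_1\,\vartheta(w_1-w_3-w_4), & \partial\eta(x_2) &= G_2\,\vartheta(w_2-w_3-w_4),\\
\partial\eta(x_3) &= G_3\,\vartheta(w_3-w_1-w_2), & \partial\eta(x_4) &= G_4\,\vartheta(w_4-w_1-w_2),\\[4pt]
\chi_a(x_1) &= -\partial_a\vartheta(w_2)\,\vartheta(w_1-w_3-w_4), & \chi_a(x_2) &= -\partial_a\vartheta(w_1)\,\vartheta(w_2-w_3-w_4),\\
\chi_a(x_3) &= -\partial_a\vartheta(w_4)\,\vartheta(w_3-w_1-w_2), & \chi_a(x_4) &= -\partial_a\vartheta(w_3)\,\vartheta(w_4-w_1-w_2),\\[4pt]
\eta'_a(x_1) &= \partial_a\vartheta(w_1)\,\vartheta(w_2+w_3+w_4), & \eta'_a(x_2) &= \partial_a\vartheta(w_2)\,\vartheta(w_1+w_3+w_4),\\
\eta'_a(x_3) &= \partial_a\vartheta(w_3)\,\vartheta(w_1+w_2+w_4), & \eta'_a(x_4) &= \partial_a\vartheta(w_4)\,\vartheta(w_1+w_2+w_3),\\[4pt]
\eta''_a(x_1) &= \partial_a\vartheta(w_2)\,\vartheta(w_1-w_3-w_4), & \eta''_a(x_2) &= \partial_a\vartheta(w_1)\,\vartheta(w_2-w_3-w_4),\\
\eta''_a(x_3) &= -\partial_a\vartheta(w_4)\,\vartheta(w_3-w_1-w_2), & \eta''_a(x_4) &= -\partial_a\vartheta(w_3)\,\vartheta(w_4-w_1-w_2).
\end{align*}
Given these values, it is easy to find the explicit form of the matrix $(D_{ab})$ appearing in the relation between the derivatives $\partial_a\theta$ at $u_1$ and $u_2$ by specifying Eq. (5.7) to two of the points $x_i$. One form of these relations is
\begin{align*}
\partial_2\vartheta(w_3)\,\partial_1\theta(u_2) - \partial_1\vartheta(w_3)\,\partial_2\theta(u_2) &= -\frac{\vartheta(w_3-w_1-w_2)}{\vartheta(w_1+w_2+w_4)}\,\big(\partial_2\vartheta(w_4)\,\partial_1\theta(u_1) - \partial_1\vartheta(w_4)\,\partial_2\theta(u_1)\big),\\
\partial_2\vartheta(w_4)\,\partial_1\theta(u_2) - \partial_1\vartheta(w_4)\,\partial_2\theta(u_2) &= -\frac{\vartheta(w_4-w_1-w_2)}{\vartheta(w_1+w_2+w_3)}\,\big(\partial_2\vartheta(w_3)\,\partial_1\theta(u_1) - \partial_1\vartheta(w_3)\,\partial_2\theta(u_1)\big).
\end{align*}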
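Let us denote $\tilde\mu(x_i) = \mu(x_i)/G_i$. Equations (5.9) have the general solution
\[ (\tilde\mu(x_1),\dots,\tilde\mu(x_4)) = g_1\,(G_{34},\,0,\,G_{23},\,-G_{24}) + g_2\,(0,\,G_{34},\,G_{13},\,-G_{14}) \]
and Eqs. (5.6) fix the values of $g_1$ and $g_2$ to
\[ g_1 = -\frac{\partial_2\vartheta(w_1)\,\partial_1\theta(u_2) - \partial_1\vartheta(w_1)\,\partial_2\theta(u_2)}{4\pi i\,G_{12}G_{34}}, \qquad g_2 = \frac{\partial_2\vartheta(w_2)\,\partial_1\theta(u_2) - \partial_1\vartheta(w_2)\,\partial_2\theta(u_2)}{4\pi i\,G_{12}G_{34}}. \]
This leads to the following simple result:
\[ \mu(x_i) = \pm\frac{i}{4\pi}\,\big(\partial_2\vartheta(w_i)\,\partial_1\theta(u_2) - \partial_1\vartheta(w_i)\,\partial_2\theta(u_2)\big) \tag{5.10} \]
or, in a more abstract notation from the introduction, $\mu(x_i) = \pm\frac{i}{4\pi}\,d\theta(l_{u_2})$ with the plus sign for $i=1,2$ and the minus one for $i=3,4$. Since the Hitchin Hamiltonian is quadratic in $\phi$ and its values on $\phi_{u_1}$ and $\phi_{u_2}$ are given by Eq. (5.1), it follows that
\[ H(\theta, a_1\phi_{u_1}+a_2\phi_{u_2}) = a_1^2\,H(\theta,\phi_{u_1}) + a_2^2\,H(\theta,\phi_{u_2}) + 2a_1a_2\,\big(c_1(\omega^1)^2 + c_2\,\omega^1\omega^2 + c_3(\omega^2)^2\big). \]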
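The mixed term may be found from the linear equations
\[ \frac{i}{4\pi}\,\big(\partial_2\vartheta(w_i)\,\partial_1\theta(u_1) - \partial_1\vartheta(w_i)\,\partial_2\theta(u_1)\big)\,\tilde\mu(x_i)\,G_i = c_1\,\partial_2\vartheta(w_i)\,\partial_2\vartheta(w_i) - c_2\,\partial_2\vartheta(w_i)\,\partial_1\vartheta(w_i) + c_3\,\partial_1\vartheta(w_i)\,\partial_1\vartheta(w_i). \]
Their explicit solution leads to the expression
\begin{align*}
H(\theta, a_1\phi_{u_1}+a_2\phi_{u_2}) = {}& -\frac{1}{16\pi^2}\,\big(a_1\,\partial_a\theta(u_1)\,\omega^a + a_2\,\partial_a\theta(u_2)\,\omega^a\big)^2\\
& + \frac{a_1a_2}{4\pi^2\,G_{13}G_{23}}\,\big(\partial_2\vartheta(w_3)\,\partial_1\theta(u_1) - \partial_1\vartheta(w_3)\,\partial_2\theta(u_1)\big)\\
& \quad\cdot\big(\partial_2\vartheta(w_3)\,\partial_1\theta(u_2) - \partial_1\vartheta(w_3)\,\partial_2\theta(u_2)\big)\,\partial_a\vartheta(w_1)\,\partial_b\vartheta(w_2)\,\omega^a\omega^b. \tag{5.11}
\end{align*}
The second term on the right-hand side is a quadratic differential that vanishes at $x_1$ and $x_2$ and is equal to $\frac{a_1a_2}{4\pi^2}\,\partial_a\theta(u_1)\,\partial_b\theta(u_2)\,\omega^a\omega^b$ at $x_3$ and $x_4$ so that
\[ H(\theta,\phi)(x_i) = -\frac{1}{16\pi^2}\,\big(a_1\,\partial_a\theta(u_1)\,\omega^a(x_i) \pm a_2\,\partial_a\theta(u_2)\,\omega^a(x_i)\big)^2, \tag{5.12} \]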
where sign plus should be taken for $x_1$ and $x_2$ and sign minus for $x_3$ and $x_4$. This is the result (1.7) described in the Introduction.

6. Self-Duality

We would like to compare the values of the Hitchin Hamiltonians on the dual pairs $(\theta,\phi)$ and $(\theta',\phi')$, where $\theta' = \iota(\phi)$ and $\phi' = \iota^{-1}(\theta)$ with $\iota$ defined by Eq. (3.9). Recall that, given $u_1$ s.t. $\theta(u_1) = 0$, we associated to the linear form $\phi$ a 1,0-form $\eta$ by Eq. (4.9). Viewed as a holomorphic section of $l_{u_1}^2K$,
\[ \eta(x) = \Big\langle\vartheta\Big(\int_{x_0}^{x}\omega - u_1 - \cdot - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega - u_1 + \cdot - \Delta\Big),\,\phi\Big\rangle. \]
Let us denote
\[ u'_i = \int_{x_0}^{x_i}\omega - u_1 - \Delta. \tag{6.1} \]
The vanishing of $\eta(x_i)$ implies then that the linear form $\phi$ annihilates the theta functions
\[ u\mapsto\vartheta(u'_i-u)\,\vartheta(u'_i+u) = \iota(\phi_{u'_i})(u) \tag{6.2} \]
and also, if we rewrite $\eta(x_i)$ as $\iota(\phi)(u'_i)$, that $\theta'(u'_i) = 0$. Since $\phi = a_1\phi_{u_1}+a_2\phi_{u_2}$ and $\phi_{u_1}$ annihilates the theta functions (6.2) as well, it follows that they belong to $\Pi$. Hence $\mathbb{C}^*\iota(\phi_{u'_i})$ are the 4 points of intersection of the line $\mathbb{P}\Pi$ with the Kummer quartic $\mathcal{K}$. Equivalently, $\mathbb{C}^*\phi_{u'_i}$ are the points of intersection of $\mathbb{P}\Pi'^{\perp}$ with $\mathcal{K}^*$. In the generic situation, any pair of theta functions $\phi_{u'_i}$ spans $\Pi'^{\perp}$ and since $\phi'\in\Pi'^{\perp}$, we may write
\[ \phi' = a'_1\,\phi_{v_1} + a'_2\,\phi_{v_2} \tag{6.3} \]
or, equivalently,
\[ \theta = a'_1\,\iota(\phi_{v_1}) + a'_2\,\iota(\phi_{v_2}). \tag{6.4} \]
The involution $l\mapsto l^{-1}K$ of the Jacobian $J^1$ lifts to $\mathbb{C}^2$ to the flip of sign of $u$. By restriction to the bundles $\mathcal{O}(x)$, it induces the involution $x\mapsto x'$ of $\Sigma$ which leaves 6 Weierstrass points invariant. The latter involution lifts to an involution (without fixed points) of the covering space $\tilde\Sigma$ determined by the equation
\[ \int_{x_0}^{x}\omega - \Delta = -\int_{x_0}^{x'}\omega + \Delta. \tag{6.5} \]
Definitions (6.1) together with Eqs. (5.2) give the relations
\[ u'_1-u'_2 = \int_{x_0}^{x_1}\omega - \int_{x_0}^{x_2}\omega \qquad\text{and}\qquad u'_1+u'_2 = -\int_{x_0}^{x_3}\omega - \int_{x_0}^{x_4}\omega + 2\Delta \]
holding in $\mathbb{C}^2$, with $x_i\in\tilde\Sigma$. They may be rewritten as
\[ u'_1-u'_2 = \int_{x_0}^{x_1}\omega + \int_{x_0}^{x'_2}\omega - 2\Delta \qquad\text{and}\qquad u'_1+u'_2 = \int_{x_0}^{x'_3}\omega + \int_{x_0}^{x'_4}\omega - 2\Delta, \tag{6.6} \]
which, upon the flip of the sign of $u'_2$ leaving $\phi_{u'_2}$ unchanged, provides the dual version of relations (5.2) corresponding to points $x_1, x'_2, x'_3, x'_4\in\tilde\Sigma$. Applying the previous result (5.12) and using the possibility to exchange a point with its image under the involution of $\Sigma$ in the argument of a quadratic differential, we infer that
\[ H(\theta',\phi')(x_i) = -\frac{1}{16\pi^2}\,\big(a'_1\,\partial_a\theta'(u'_1)\,\omega^a(x_i) \mp a'_2\,\partial_a\theta'(u'_2)\,\omega^a(x_i)\big)^2. \tag{6.7} \]
The sign minus should be taken for $x_1$ and $x_2$ and sign plus for $x_3$ and $x_4$. The exchange of signs in comparison with Eq. (5.12) is due to the flip $u'_2\mapsto-u'_2$. In order to compare expressions (5.12) and (6.7) we shall calculate the coefficients $a_{1,2}$ and $a'_{1,2}$ of the linear combinations (5.3) and (6.3). Note that the definition $\theta' = \iota(\phi)$ implies that
\[ \theta'\Big(\int_{x_0}^{x}\omega - u_1 - \Delta\Big) = a_2\,\vartheta\Big(\int_{x_0}^{x}\omega - u_1 - u_2 - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega - u_1 + u_2 - \Delta\Big). \]
Taking the derivative over $x$ at $x_1$, we obtain
\[ \partial_a\theta'(u'_1)\,\omega^a(x_1) = -a_2\,\vartheta(w_1-w_3-w_4)\,\partial_a\vartheta(w_2)\,\omega^a(x_1), \]
where we employed Eqs. (5.2) and the abbreviated notations $w_i = \int_{x_0}^{x_i}\omega - \Delta$. Hence
\[ a_2 = -\frac{\partial_a\theta'(u'_1)\,\omega^a(x_1)}{\vartheta(w_1-w_3-w_4)\,\partial_a\vartheta(w_2)\,\omega^a(x_1)}. \tag{6.8} \]
Similarly,
\[ \theta'\Big(\int_{x_0}^{x}\omega - u_2 - \Delta\Big) = a_1\,\vartheta\Big(\int_{x_0}^{x}\omega - u_1 - u_2 - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega + u_1 - u_2 - \Delta\Big). \]
Taking the derivative at $x = x_1$ and noting that $w_1-u_2 = -u'_2$, we infer that
\[ a_1 = \frac{\partial_a\theta'(u'_2)\,\omega^a(x_1)}{\vartheta(w_1+w_3+w_4)\,\partial_a\vartheta(w_2)\,\omega^a(x_1)}. \tag{6.9} \]
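To calculate $a'_{1,2}$, we note that Eq. (6.4) implies that
\[ \theta\Big(\int_{x_0}^{x}\omega - v_1 - \Delta\Big) = a'_2\,\vartheta\Big(\int_{x_0}^{x}\omega - u'_1 - u'_2 - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega - u'_1 + u'_2 - \Delta\Big). \]
Upon derivation at $x = x_1$ and with the use of relations (6.6) and (6.5), this gives
\[ a'_2 = -\frac{\partial_a\theta(u_1)\,\omega^a(x_1)}{\vartheta(w_1+w_3+w_4)\,\partial_a\vartheta(w_2)\,\omega^a(x_1)}. \tag{6.10} \]
Finally, since
\[ \theta\Big(\int_{x_0}^{x}\omega + v_2 - \Delta\Big) = a'_1\,\vartheta\Big(\int_{x_0}^{x}\omega - u'_1 + u'_2 - \Delta\Big)\,\vartheta\Big(\int_{x_0}^{x}\omega + u'_1 + u'_2 - \Delta\Big), \]
and $w_1+u'_2 = u_2$ we infer that
\[ a'_1 = -\frac{\partial_a\theta(u_2)\,\omega^a(x_1)}{\vartheta(w_1-w_3-w_4)\,\partial_a\vartheta(w_2)\,\omega^a(x_1)}. \tag{6.11} \]
Substitution of expressions (6.9), (6.8), (6.11) and (6.10) shows equality of the right-hand sides of Eqs. (5.12) and (6.7) for $x_i = x_1$. Indeed, (6.9) and (6.10) give $a_1\,\partial_a\theta(u_1)\,\omega^a(x_1) = -a'_2\,\partial_a\theta'(u'_2)\,\omega^a(x_1)$, while (6.8) and (6.11) give $a_2\,\partial_a\theta(u_2)\,\omega^a(x_1) = a'_1\,\partial_a\theta'(u'_1)\,\omega^a(x_1)$. Since there is a full symmetry between points $x_i$ (hidden in our arbitrary choices of the order and the signs of $u_j$'s and $u'_j$'s), the self-duality
\[ H(\theta,\phi) = H(\theta',\phi') \tag{6.12} \]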
follows.

7. van Geemen–Previato's Result and Beyond

The genus 2 curves are hyperelliptic. The map $H^0(K)\ni\omega\mapsto\omega(x)$ defines an element of $\mathbb{P}H^0(K)^*$ and varying $x\in\Sigma$ one obtains a realization of $\Sigma$ as a ramified double cover of $\mathbb{P}H^0(K)^*\cong\mathbb{P}^1$. One may use the 1,0-forms $\omega^a\in H^0(K)$ to define the homogeneous coordinates on $\mathbb{P}H^0(K)^*$. Then
\[ \lambda(x) = \frac{\omega^2(x)}{\omega^1(x)} = -\frac{\partial_1\vartheta(\int_{x_0}^{x}\omega-\Delta)}{\partial_2\vartheta(\int_{x_0}^{x}\omega-\Delta)} \tag{7.1} \]
becomes the inhomogeneous coordinate of the image in $\mathbb{P}^1$ of the point $x\in\Sigma$. If $x'$ is the image of $x$ under the involution $\mathcal{O}(x)\mapsto\mathcal{O}(-x)K = \mathcal{O}(x')$, i.e. if
\[ \int_{x_0}^{x}\omega + \int_{x_0}^{x'}\omega - 2\Delta \in \mathbb{Z}^2+\tau\mathbb{Z}^2 \qquad\text{then}\qquad \lambda(x) = \lambda(x'). \]
Hence the involution $x\mapsto x'$ permutes the sheets of the covering $\Sigma\to\mathbb{P}^1$ ramified over the 6 Weierstrass points $x_s$, $s=1,\dots,6$, fixed by the involution. $\mathcal{O}(x_s)$ is an odd spin structure, i.e.
\[ \int_{x_0}^{x_s}\omega - \Delta = E_s \mod(\mathbb{Z}^2+\tau\mathbb{Z}^2) \]
and
\[ \lambda_s \equiv \lambda(x_s) = -\frac{\partial_1\vartheta(E_s)}{\partial_2\vartheta(E_s)}, \tag{7.2} \]
where $E_s = \frac{1}{2}(e_s+\tau e'_s)$ with $e_s, e'_s = (1,0)$, $(0,1)$ or $(1,1)$ such that $e_s\cdot e'_s$ is odd. The possibilities are:
\begin{align*}
e_1 &= (1,0),\ e'_1 = (1,0); & e_2 &= (1,1),\ e'_2 = (1,0); & e_3 &= (0,1),\ e'_3 = (0,1);\\
e_4 &= (1,1),\ e'_4 = (0,1); & e_5 &= (0,1),\ e'_5 = (1,1); & e_6 &= (1,0),\ e'_6 = (1,1), \tag{7.3}
\end{align*}
and we shall number the Weierstrass points (in a marking-dependent way) in agreement with this list. $\Sigma$ may be identified with the hyperelliptic curve given by the equation
\[ \zeta^2 = \prod_{s=1}^{6}(\lambda-\lambda_s) \tag{7.4} \]
with the involution mapping $(\lambda,\zeta)$ to $(\lambda,-\zeta)$. The expressions
\[ \omega^1 = C\,\frac{d\lambda}{\zeta} \qquad\text{and}\qquad \omega^2 = C\,\frac{\lambda\,d\lambda}{\zeta}, \tag{7.5} \]
where $C$ is a constant, give the basis of holomorphic 1,0-forms of $\Sigma$ (the right-hand sides vanish exactly where the left-hand sides do).

Let us recall the main result of [21] based on the analysis of the formula (5.1) for the Hitchin Hamiltonians on the Kummer quartic $\mathcal{K}^*$. It will be convenient to identify the pairs $(\theta,\phi)$ s.t. $\langle\theta,\phi\rangle = 0$ with pairs $(q,p)\in\mathbb{C}^4\times\mathbb{C}^4$ s.t. $q\cdot p = 0$ by the relations
\begin{align*}
\theta &= q_1\,\theta_{2,(0,0)} + q_2\,\theta_{2,(1,0)} + q_3\,\theta_{2,(0,1)} + q_4\,\theta_{2,(1,1)},\\
\phi &= p_1\,\theta^*_{2,(0,0)} + p_2\,\theta^*_{2,(1,0)} + p_3\,\theta^*_{2,(0,1)} + p_4\,\theta^*_{2,(1,1)}.
\end{align*}
The symplectic form of $T^*\mathbb{P}^3$ is the standard $dp\wedge dq$ and the isomorphism $\iota$ interchanges $p$ and $q$. By examining the values of the quadratic differentials given by $H$ at the Weierstrass points $x_s$, van Geemen and Previato showed that
\[ Z_s(q) = \{\,p \mid q\cdot p = 0,\ H(q,p)(x_s) = 0\,\} \]
is a union of a pair of bitangents to $\mathcal{K}^*$. Then classical results giving the equations for bitangents to the Kummer surface permitted the authors of [21] to write an almost explicit formula for $H(x_s)$ in the form
\[ H(q,p)(x_s) = h_s\sum_{t\neq s}\frac{r_{st}(q,p)}{\lambda_s-\lambda_t}, \tag{7.6} \]
where $r_{st} = r_{ts}$ are homogeneous polynomials,
\begin{align*}
r_{12}(q,p) &= (q_1p_1+q_2p_2-q_3p_3-q_4p_4)^2, & r_{13}(q,p) &= (q_1p_4-q_2p_3-q_3p_2+q_4p_1)^2,\\
r_{14}(q,p) &= -(q_1p_4+q_2p_3-q_3p_2-q_4p_1)^2, & r_{15}(q,p) &= -(q_1p_3-q_2p_4-q_3p_1+q_4p_2)^2,\\
r_{16}(q,p) &= (q_1p_3+q_2p_4+q_3p_1+q_4p_2)^2, & r_{23}(q,p) &= -(q_1p_4-q_2p_3+q_3p_2-q_4p_1)^2,\\
r_{24}(q,p) &= (q_1p_4+q_2p_3+q_3p_2+q_4p_1)^2, & r_{25}(q,p) &= (q_1p_3-q_2p_4+q_3p_1-q_4p_2)^2,\\
r_{26}(q,p) &= -(q_1p_3+q_2p_4-q_3p_1-q_4p_2)^2, & r_{34}(q,p) &= (q_1p_1-q_2p_2+q_3p_3-q_4p_4)^2,\\
r_{35}(q,p) &= (q_1p_2+q_2p_1+q_3p_4+q_4p_3)^2, & r_{36}(q,p) &= -(q_1p_2-q_2p_1-q_3p_4+q_4p_3)^2,\\
r_{45}(q,p) &= -(q_1p_2-q_2p_1+q_3p_4-q_4p_3)^2, & r_{46}(q,p) &= (q_1p_2+q_2p_1-q_3p_4-q_4p_3)^2,\\
r_{56}(q,p) &= (q_1p_1-q_2p_2-q_3p_3+q_4p_4)^2, \tag{7.7}
\end{align*}
and $h_s\in K^2_{x_s}$ could still depend on $q$. In the original language of pairs $(\theta,\phi)$, and of the $(\mathbb{Z}/2\mathbb{Z})^4$-action (3.12) on $H^0(L_\Theta^2)$ one has
\[ r_{st}(\theta,\phi) = \langle U_{e_s,e'_s}U_{e_t,e'_t}\theta,\phi\rangle\,\langle U_{e_t,e'_t}U_{e_s,e'_s}\theta,\phi\rangle \]
with $e_s, e'_s$ from the list (7.3). The polynomials $r_{st}$ are self-dual:
\[ r_{st}(q,p) = r_{st}(p,q) \tag{7.8} \]
and the self-duality of $H$ proven in the present paper forces coefficients $h_s$ in Eq. (7.6) to be $q$-independent, partially filling the gap left in [21]. An easy but important identity is
\[ \sum_{t\neq s}r_{st}(q,p) = (q\cdot p)^2 = 0 \tag{7.9} \]
for any fixed $s$. It implies that the Hamiltonians (7.6) are preserved up to normalization by the isomorphisms of the hyperelliptic surfaces induced by the fractional action $\lambda\mapsto\lambda' = \frac{a\lambda+b}{c\lambda+d}$ of $SL(2,\mathbb{C})$ on $\mathbb{P}^1$. We would still like to fix the values of the constants $h_s$ in Eqs. (7.6). We claim that they are such that the Hitchin map is given by Eq. (1.5), i.e. that
\[ H(q,p) = -\frac{1}{128\pi^2}\sum_{\substack{s,t=1,\dots,6\\ s\neq t}}\frac{r_{st}(q,p)}{(\lambda-\lambda_s)(\lambda-\lambda_t)}\,(d\lambda)^2. \tag{7.10} \]
First note that the above formula is consistent with the $SL(2,\mathbb{C})$ transformations. Indeed, relations (7.9) imply that
\[ \sum_{s\neq t}\frac{r_{st}}{(\lambda'-\lambda'_s)(\lambda'-\lambda'_t)}\,(d\lambda')^2 = \sum_{s\neq t}\frac{r_{st}}{(\lambda-\lambda_s)(\lambda-\lambda_t)}\,(d\lambda)^2 \]
for $\lambda' = \frac{a\lambda+b}{c\lambda+d}$. Taking, in particular, $\lambda' = \lambda^{-1}$ one verifies that the quadratic differentials (7.10) are regular at infinity. They are also regular at the branching points since $\frac{d\lambda}{\sqrt{\lambda-\lambda_s}}$ is a local holomorphic differential around $x_s$. Hence the r.h.s. of Eq. (7.10) is indeed a (holomorphic) quadratic differential. Thus Eq. (7.10) is equivalent to relations (7.6) with $h_s = \frac{(d\lambda)^2}{(\lambda-\lambda_s)}\big|_{x_s}$, modulo an overall normalization. To prove Eq. (7.10) we shall verify it at a point of the phase space for which $H(q,p)(x_s)\neq 0$ for $s\neq 1$. This will fix $h_s$ for $s\neq 1$ and hence all of them (two quadratic differentials equal at points $x_s$ with $s\neq 1$ have to coincide).
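The two algebraic facts used above, the self-duality (7.8) and the identity (7.9), can be checked numerically from the list (7.7). The following is a minimal sketch, not part of the original argument: the table `R` and the helper names `r` and `row_sum` are introduced here purely for illustration, and the check is performed at random integer points, before the constraint $q\cdot p = 0$ is imposed.

```python
import random

# Table of Eq. (7.7): r_st = sign * (sum of coef * q_i * p_j)^2.
# Entries are (sign, [(coef, i, j), ...]) with 1-based indices i, j.
R = {
    (1, 2): (+1, [(1, 1, 1), (1, 2, 2), (-1, 3, 3), (-1, 4, 4)]),
    (1, 3): (+1, [(1, 1, 4), (-1, 2, 3), (-1, 3, 2), (1, 4, 1)]),
    (1, 4): (-1, [(1, 1, 4), (1, 2, 3), (-1, 3, 2), (-1, 4, 1)]),
    (1, 5): (-1, [(1, 1, 3), (-1, 2, 4), (-1, 3, 1), (1, 4, 2)]),
    (1, 6): (+1, [(1, 1, 3), (1, 2, 4), (1, 3, 1), (1, 4, 2)]),
    (2, 3): (-1, [(1, 1, 4), (-1, 2, 3), (1, 3, 2), (-1, 4, 1)]),
    (2, 4): (+1, [(1, 1, 4), (1, 2, 3), (1, 3, 2), (1, 4, 1)]),
    (2, 5): (+1, [(1, 1, 3), (-1, 2, 4), (1, 3, 1), (-1, 4, 2)]),
    (2, 6): (-1, [(1, 1, 3), (1, 2, 4), (-1, 3, 1), (-1, 4, 2)]),
    (3, 4): (+1, [(1, 1, 1), (-1, 2, 2), (1, 3, 3), (-1, 4, 4)]),
    (3, 5): (+1, [(1, 1, 2), (1, 2, 1), (1, 3, 4), (1, 4, 3)]),
    (3, 6): (-1, [(1, 1, 2), (-1, 2, 1), (-1, 3, 4), (1, 4, 3)]),
    (4, 5): (-1, [(1, 1, 2), (-1, 2, 1), (1, 3, 4), (-1, 4, 3)]),
    (4, 6): (+1, [(1, 1, 2), (1, 2, 1), (-1, 3, 4), (-1, 4, 3)]),
    (5, 6): (+1, [(1, 1, 1), (-1, 2, 2), (-1, 3, 3), (1, 4, 4)]),
}


def r(s, t, q, p):
    """Evaluate r_st(q, p) = r_ts(q, p) for 4-component sequences q and p."""
    sign, terms = R[(min(s, t), max(s, t))]
    form = sum(c * q[i - 1] * p[j - 1] for c, i, j in terms)
    return sign * form * form


def row_sum(s, q, p):
    """Left-hand side of Eq. (7.9): the sum of r_st over all t != s."""
    return sum(r(s, t, q, p) for t in range(1, 7) if t != s)


rng = random.Random(0)
q = [rng.randint(-9, 9) for _ in range(4)]
p = [rng.randint(-9, 9) for _ in range(4)]
qp = sum(a * b for a, b in zip(q, p))

for s in range(1, 7):
    assert row_sum(s, q, p) == qp ** 2      # Eq. (7.9), off the constraint surface
for s, t in R:
    assert r(s, t, q, p) == r(s, t, p, q)   # Eq. (7.8)
```

Exact integer arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.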
Consider a pair $(\theta,\phi_{u_1})$ lying in the product $\mathcal{K}\times\mathcal{K}^*$ of the Kummer quartics with
\[ \theta(u) = e^{\frac{1}{2}\pi i\,e'_1\cdot\tau e'_1 + 2\pi i\,e'_1\cdot u_1}\,\vartheta(u_1+E_1-u)\,\vartheta(u_1+E_1+u) = \sum_{e}(U_{e_1,e'_1}\theta_{2,e})(u_1)\,\theta_{2,e}(u) \tag{7.11} \]
for $e_1 = e'_1 = (1,0)$. Note that $\langle\theta,\phi_{u_1}\rangle = 0$. Equation (5.1) together with the relations (7.5) and the equation
\[ \partial_a\theta(u_1) = -e^{\frac{1}{2}\pi i\,e'_1\cdot\tau e'_1 + 2\pi i\,e'_1\cdot u_1}\,\partial_a\vartheta(E_1)\,\vartheta(2u_1+E_1) \]
results in the identity
\[ H(\theta,\phi_{u_1}) = -\frac{C^2}{16\pi^2}\,e^{\pi i\,e'_1\cdot\tau e'_1 + 4\pi i\,e'_1\cdot u_1}\,(\partial_2\vartheta(E_1))^2\,\vartheta(2u_1+E_1)^2\,(\lambda-\lambda_1)^2\,\frac{(d\lambda)^2}{\zeta^2}, \tag{7.12} \]
where $C$ is the constant appearing in Eq. (7.5). Note that $H(\theta,\phi_{u_1})\neq 0$ as long as $\vartheta(2u_1+E_1)\neq 0$. It follows that $H(\theta,\phi_{u_1})$ is a quadratic differential proportional to $(\lambda-\lambda_1)^2\frac{(d\lambda)^2}{\zeta^2}$ which has the 4th order zero at $x_1$. The latter property characterizes it uniquely up to normalization. It is not difficult to check that Eq. (7.10) gives a quadratic differential with the same property. Indeed, in the language of $q$'s and $p$'s, the linear form $\phi_{u_1}$ corresponds to a vector $p\in\mathbb{C}^4$ and $\theta$ to $q = (p_2,-p_1,p_4,-p_3)$. A straightforward verification shows that $r_{1t}(q,p) = 0$ for all $t\neq 1$. This implies that the quadratic differential given by Eq. (7.10) vanishes to the second order at $x_1$. The condition that it vanishes to the fourth order is
\[ \sum_{\substack{s\neq t\\ s,t\neq 1}}r_{st}\big((p_2,-p_1,p_4,-p_3),\,p\big)\prod_{v\neq 1,s,t}(\lambda_1-\lambda_v) = 0. \]
A direct calculation shows that this is exactly Eq. (A3.2) of the Kummer quartic with the coefficients (A3.4) so that it holds for $p$ corresponding to $\phi_{u_1}$. This establishes proportionality between the Hitchin map and the right-hand side of Eq. (7.10) with a coefficient that may be still curve-dependent. Fixing the overall normalization of the Hitchin map is more involved. We shall calculate the value of the quadratic differential on the right-hand side of Eq. (7.12) at $\lambda = \lambda_2$ and compare it to the value given by Eq. (7.10). Since this is somewhat technical, we defer the argument to Appendix 4.

The system with Hamiltonians (7.6) bears some similarity to the classic Neumann systems$^4$, also anchored in modular geometry [17, 2]. The Hamiltonians of a Neumann system have the form
\[ H_s = \sum_{1\leq t\neq s\leq n}\frac{J_{st}^2}{\lambda_s-\lambda_t}, \tag{7.13} \]
where $J_{st} = q_sp_t - q_tp_s$ are the functions on $T^*\mathbb{C}^n$ generating the infinitesimal action of the complex group $SO_n$:
\begin{align*}
\{J_{st},J_{tv}\} &= -J_{sv} && \text{for } s,t,v \text{ different},\\
\{J_{st},J_{vw}\} &= 0 && \text{for } s,t,v,w \text{ different}. \tag{7.14}
\end{align*}
$^4$ We thank M. Olshanetsky for attracting our attention to this fact.
The fact that the Hamiltonians (7.6) (with constant $h_s$) Poisson commute reduces, as is well known, to the identities
\begin{align*}
\{r_{st}+r_{sv},\,r_{tv}\} &= 0 && \text{and cyclic permutations thereof},\\
\{r_{st},\,r_{vw}\} &= 0 && \text{for } \{s,t\}\cap\{v,w\} = \emptyset. \tag{7.15}
\end{align*}
If we set $r_{st} = J_{st}^2$ for the Neumann system, then Eqs. (7.15) follow from the relations (7.14). It appears that the same algebra stands behind the fact$^5$ that $r_{st}$ given by Eq. (7.7) verify (7.15). The phase space $T^*N_{ss}\cong\{(q,p)\mid q\cdot p = 0\}/\mathbb{C}^*$, where $\mathbb{C}^*$ acts by $(q,p)\mapsto(tq,t^{-1}p)$, may be identified with the coadjoint orbit of the group $SL_4$ composed of the traceless complex $4\times4$ matrices $|p\rangle\langle q|$ of rank 1. Using the isomorphism of the complex Lie algebras $sl_4\cong so_6$, we obtain the functions $J_{st} = -J_{ts}$ on this $SL_4$ orbit which generate the action of $so_6$ and have the Poisson brackets given by (7.14). A straightforward check shows that, for $r_{st}$ of Eq. (7.7),
\[ r_{st} = -4J_{st}^2 \tag{7.16} \]
so that Eq. (7.15) follows from the $so_6$-algebra (7.14). Upon the introduction of the rational functions $\frac{r_{st}}{\lambda}$, Eqs. (7.15) take the form
\begin{align*}
\Big\{\frac{r_{st}}{\lambda_s-\lambda_t},\frac{r_{sv}}{\lambda_s-\lambda_v}\Big\} + \Big\{\frac{r_{st}}{\lambda_s-\lambda_t},\frac{r_{tv}}{\lambda_t-\lambda_v}\Big\} + \Big\{\frac{r_{sv}}{\lambda_s-\lambda_v},\frac{r_{tv}}{\lambda_t-\lambda_v}\Big\} &= 0,\\
\Big\{\frac{r_{st}}{\lambda_s-\lambda_t},\frac{r_{vw}}{\lambda_v-\lambda_w}\Big\} &= 0 \quad\text{for } \{s,t\}\cap\{v,w\} = \emptyset. \tag{7.17}
\end{align*}
The first of these identities is, essentially, the classical Yang-Baxter equation. Note, however, that $r_{st}$, unlike in the Gaudin and Neumann systems, is not an element of a product of two copies of a Poisson algebra of functions: there is no sign of an explicit product structure, or of a reduction thereof, in our phase space. The important question is whether $r_{st}$ come from a rational solution of the CYBE. The conformal field theory work [14, 23] suggests that the answer may be positive, at least in some sense.

The knowledge of the explicit form of the quadratic differentials $H(q,p)$ allows one to write the explicit equations for the genus 5 spectral curve of the $SL_2$ Hitchin system at genus 2, see Eq. (2.1). They take the form
\[ \zeta^2 = \prod_{s=1}^{6}(\lambda-\lambda_s), \qquad \xi^2 = \sum_{s\neq t}r_{st}(q,p)\prod_{v\neq s,t}(\lambda-\lambda_v). \tag{7.18} \]
The involution of the spectral curve flips the sign of $\xi$. To extract explicit formulae for the angle variables describing the point on the Prym variety of the spectral curve, we would need, however, a more explicit knowledge of the entire Lax matrix $\Psi$.

$^5$ This is the classical version of the observation of [22].
8. Conclusions

The main result of the present paper is the proof of self-duality of the Hitchin Hamiltonians on the cotangent bundle to the moduli space of the holomorphic $SL_2$ bundles on a genus 2 complex curve. The result was based on an expression for the Hitchin Hamiltonians off the Kummer quartic on which the values of the Hamiltonians were determined in [21]. Using the self-duality, we were able to complete the analysis of [21] and to obtain the explicit formula (1.5) for the Hitchin map (1.3) giving the action variables of the integrable system. The explicit formula for the angle variables still remains to be found. An interesting open problem is an extension of the present work to the case with insertion points.

Another important problem related to Hitchin's construction is the quantization of the corresponding integrable systems. For the $SL_2$ case such a quantization is essentially provided by the Knizhnik–Zamolodchikov–Bernard–Hitchin connection [15, 4, 5] which describes the variation of conformal blocks of the $SU_2$ WZW conformal field theory under the change of the complex structure of the curve. The (partition function) conformal blocks are holomorphic sections of the $k$th power of the determinant line bundle over the moduli space $N_{ss}$ ($k$ is the level of the WZW theory). In our case, they are simply $k$th-order homogeneous polynomials on $H^0(L_\Theta^2)$. It is easy to quantize the Hitchin Hamiltonians
\[ H_s = \sum_{t\neq s}\frac{r_{st}}{\lambda_s-\lambda_t}. \]
If one keeps the original formulae (7.7) for $r_{st}$ in which $p_i$ stands now for $\frac{1}{i}\partial_{q_i}$, the relations (7.15) or (7.17) still hold after the replacement of the Poisson brackets by the commutators. One obtains this way the commuting operators $H_s$ mapping the space of homogeneous, degree $k$ polynomials in variables $q$ into itself. Note, however, that now
\[ \sum_{t\neq s}r_{st} = -k(k+4) \]
for each fixed $s$ so that the quantization changes the conformal properties of the Hamiltonians. A direct construction of the projective version of the KZBH connection for group $SU_2$ and genus 2 has been recently given in ref. [22] by following Hitchin's approach [12]. It is consistent with the above ad hoc quantization of the classical Hitchin Hamiltonians.

The integral formulae for the conformal blocks [3, 20, 8] or, equivalently, the integral formulae for the scalar product of the conformal blocks [9] have been used at genus 0 and 1 to extract the Bethe Ansatz eigen-vectors and eigen-values of the quantized version of the quadratic Hitchin Hamiltonians. The Bethe-Ansatz type diagonalization of the quantization of the genus 2 Hitchin Hamiltonians is among the issues that will have to be examined. Finally, as we stressed in the text, the relation between the conformal WZW field theory on a genus 2 surface and an orbifold theory in genus 0 requires further study.
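The quantum counterpart of (7.9) stated above, $\sum_{t\neq s}r_{st} = -k(k+4)$ on homogeneous polynomials of degree $k$, can be tested directly. The sketch below is an illustration under explicit assumptions: polynomials in $q_1,\dots,q_4$ are stored as dictionaries from exponent tuples to integer coefficients, each product $q_ip_j$ in (7.7) is ordered with $q_i$ to the left of $p_j = \frac{1}{i}\partial_{q_j}$, and the names `R`, `apply_form`, `row_sum_op` and `scaled` are hypothetical helpers. With this ordering each $r_{st} = \pm(\text{linear form})^2$ becomes $\mp D^2$ with $D = \sum c\,q_i\partial_{q_j}$.

```python
# Table of Eq. (7.7): r_st = sign * (sum of coef * q_i * p_j)^2, 1-based indices.
R = {
    (1, 2): (+1, [(1, 1, 1), (1, 2, 2), (-1, 3, 3), (-1, 4, 4)]),
    (1, 3): (+1, [(1, 1, 4), (-1, 2, 3), (-1, 3, 2), (1, 4, 1)]),
    (1, 4): (-1, [(1, 1, 4), (1, 2, 3), (-1, 3, 2), (-1, 4, 1)]),
    (1, 5): (-1, [(1, 1, 3), (-1, 2, 4), (-1, 3, 1), (1, 4, 2)]),
    (1, 6): (+1, [(1, 1, 3), (1, 2, 4), (1, 3, 1), (1, 4, 2)]),
    (2, 3): (-1, [(1, 1, 4), (-1, 2, 3), (1, 3, 2), (-1, 4, 1)]),
    (2, 4): (+1, [(1, 1, 4), (1, 2, 3), (1, 3, 2), (1, 4, 1)]),
    (2, 5): (+1, [(1, 1, 3), (-1, 2, 4), (1, 3, 1), (-1, 4, 2)]),
    (2, 6): (-1, [(1, 1, 3), (1, 2, 4), (-1, 3, 1), (-1, 4, 2)]),
    (3, 4): (+1, [(1, 1, 1), (-1, 2, 2), (1, 3, 3), (-1, 4, 4)]),
    (3, 5): (+1, [(1, 1, 2), (1, 2, 1), (1, 3, 4), (1, 4, 3)]),
    (3, 6): (-1, [(1, 1, 2), (-1, 2, 1), (-1, 3, 4), (1, 4, 3)]),
    (4, 5): (-1, [(1, 1, 2), (-1, 2, 1), (1, 3, 4), (-1, 4, 3)]),
    (4, 6): (+1, [(1, 1, 2), (1, 2, 1), (-1, 3, 4), (-1, 4, 3)]),
    (5, 6): (+1, [(1, 1, 1), (-1, 2, 2), (-1, 3, 3), (1, 4, 4)]),
}


def apply_form(terms, poly):
    """Apply D = sum coef * q_i * d/dq_j to a polynomial {exponents: coeff}."""
    out = {}
    for coef, i, j in terms:
        for exps, c in poly.items():
            if exps[j - 1] == 0:
                continue  # d/dq_j annihilates monomials without q_j
            e = list(exps)
            factor = e[j - 1]
            e[j - 1] -= 1   # differentiate in q_j ...
            e[i - 1] += 1   # ... then multiply by q_i
            key = tuple(e)
            out[key] = out.get(key, 0) + coef * factor * c
    return {key: v for key, v in out.items() if v != 0}


def row_sum_op(s, poly):
    """Apply sum_{t != s} r_st with p_j -> (1/i) d/dq_j, i.e. r_st -> -sign * D^2."""
    out = {}
    for t in range(1, 7):
        if t == s:
            continue
        sign, terms = R[(min(s, t), max(s, t))]
        for key, v in apply_form(terms, apply_form(terms, poly)).items():
            out[key] = out.get(key, 0) - sign * v
    return {key: v for key, v in out.items() if v != 0}


def scaled(poly, factor):
    """Multiply a polynomial by a number, dropping vanishing coefficients."""
    return {key: factor * v for key, v in poly.items() if factor * v != 0}


# Homogeneous test polynomial of degree k = 3: q1^3 + 2 q1 q2 q4 - 5 q3^2 q4.
f = {(3, 0, 0, 0): 1, (1, 1, 0, 1): 2, (0, 0, 2, 1): -5}
k = 3
for s in range(1, 7):
    assert row_sum_op(s, f) == scaled(f, -k * (k + 4))
```

The value $-k(k+4)$ differs from the naive quantization $-k^2$ of $(q\cdot p)^2$ by the term $-4k$ produced by reordering the operators, which is the shift referred to in the text.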
Appendix 1

Let us check that $\theta$ given by Eq. (3.6) vanishes if and only if
\[ H^0(l_u\otimes E) = \{\,(s_1,s_2) \mid s_2\in H^0(l_ul_{u_1}),\ \bar\partial_{l_ul_{u_1}^{-1}}s_1 + s_2b = 0\,\} \neq 0. \]
For $u-u_1\in\mathbb{Z}^2+\tau\mathbb{Z}^2$ the 1st theta function on the r.h.s. of Eq. (3.7) vanishes but $l_u = l_{u_1}$ and $l_{u_1}\in C_E$. Assume now that $u-u_1\notin\mathbb{Z}^2+\tau\mathbb{Z}^2$. Then $\dim H^0(l_u^{-1}l_{u_1}K) = 1$ with a non-zero $\chi\in H^0(l_u^{-1}l_{u_1}K)$. The necessary and sufficient condition for the solvability of the equation $\bar\partial_{l_ul_{u_1}^{-1}}s_1 + s_2b = 0$ for a given $s_2\in H^0(l_ul_{u_1})$ is
\[ \int_\Sigma\chi s_2b = 0. \tag{A1.1} \]
If $u+u_1\in\mathbb{Z}^2+\tau\mathbb{Z}^2$ then $l_ul_{u_1} = K$ and $\dim H^0(l_ul_{u_1}) = 2$ so that there always is a non-zero solution but also $\theta(u) = 0$ in this case due to the vanishing of the 2nd theta function on the r.h.s. of Eq. (3.7). Finally, if $u\pm u_1\notin\mathbb{Z}^2+\tau\mathbb{Z}^2$ then $s_2\in H^0(l_ul_{u_1})$ has to be proportional to the element defined by (3.8) and the condition (A1.1) coincides with the equation $\theta(u) = 0$.
Appendix 2

Let us show that the 1,0-form $\mu$ satisfying relations (4.18) and (4.19) automatically fulfills the condition
\[ \int_\Sigma\kappa\mu\wedge b = 0. \tag{A2.1} \]
Among the infinitesimal gauge field variations $\delta B$ given by Eq. (4.4) there are ones which are equivalent to infinitesimal gauge transformations:
\[ \delta B = \bar\partial\Lambda + [B,\Lambda]. \]
Explicitly, for $\Lambda = \begin{pmatrix}-\sigma & \varphi\\ \kappa & \sigma\end{pmatrix}$ with $\sigma$ a function, $\varphi$ a section of $l_{u_1}^{-2}$ and $\kappa$ a section of $l_{u_1}^2$, this requires that
\[ \bar\partial\kappa = 0, \qquad \pi\,\delta u_1(\mathrm{Im}\,\tau)^{-1}\bar\omega = -\bar\partial\sigma + \kappa b, \qquad \delta b = \bar\partial\varphi + 2\sigma b. \tag{A2.2} \]
Such variations may only change the normalization of the theta function $\theta$. Integrating the second of the above relations against forms $\omega^a$ and using Eq. (4.12) we find that
\[ \delta u_1^a = -\frac{1}{2\pi i}\,\epsilon^{ab}\,\partial_b\theta(u_1) \tag{A2.3} \]
for the proper normalization of $\kappa$. For such $\delta u_1$ the first term on the right-hand side of Eq. (4.6) gives a theta function vanishing at $u = u_1$ and may be compensated by the second term. The 3rd equation of (A2.2) gives the compensating $\delta b\in\wedge^{0,1}(l_{u_1}^{-2})$. Pairing Eq. (4.6) with the above $\delta u_1$ and $\delta b$ with the linear form $\phi$, we obtain the identity
\[ \frac{1}{i}\,\epsilon^{ab}\,\partial_b\theta(u_1)\,(\mathrm{Im}\,\tau)^{-1}_{ac}\int_\Sigma\chi^c\wedge b + 2\int_\Sigma\sigma\eta\wedge b = 0. \tag{A2.4} \]
On the other hand,
\begin{align*}
\int_\Sigma\kappa\mu\wedge b &= \int_\Sigma\mu\wedge\bar\partial\sigma - \frac{1}{2i}\,\epsilon^{ab}\,\partial_b\theta(u_1)\,(\mathrm{Im}\,\tau)^{-1}_{ac}\int_\Sigma\mu\wedge\bar\omega^c\\
&= -\int_\Sigma\sigma\eta\wedge b - \frac{1}{2i}\,\epsilon^{ab}\,\partial_b\theta(u_1)\,(\mathrm{Im}\,\tau)^{-1}_{ac}\int_\Sigma\chi^c\wedge b = 0,
\end{align*}
where we have subsequently used the 2nd equation in (A2.2) with $\delta u_1$ given by Eq. (A2.3), the relation $\bar\partial\mu = -\eta\wedge b$ and Eq. (4.18) fixing $\mu$ and, finally, the identity (A2.4).
Appendix 3

It is not difficult to see that there exists a non-zero element $P\in S^4H^0(L_\Theta^2)$, a homogeneous polynomial of degree 4 on $H^0(L_\Theta^2)^*$, s.t. $P(\phi_{u'}) = 0$ for all $u'\in\mathbb{C}^2$. Indeed, $\dim S^4H^0(L_\Theta^2) = \binom{7}{3} = 35$ but the map $u'\mapsto P(\phi_{u'})$ defines an even theta function of order 8 and $\dim H^0_{even}(L_\Theta^8) = 34$. $P$ is a quartic expression in $\theta_{2,e}(u')$ which vanishes for all $u'$. It has to be preserved by the $(\mathbb{Z}/2\mathbb{Z})^4$-action (3.12) and hence it must be of the form
\begin{align*}
P = {}& c_1\,\big(\theta_{2,(0,0)}^4 + \theta_{2,(1,0)}^4 + \theta_{2,(0,1)}^4 + \theta_{2,(1,1)}^4\big)\\
& + c_2\,\big(\theta_{2,(0,0)}^2\theta_{2,(1,0)}^2 + \theta_{2,(0,1)}^2\theta_{2,(1,1)}^2\big)\\
& + c_3\,\big(\theta_{2,(0,0)}^2\theta_{2,(0,1)}^2 + \theta_{2,(1,0)}^2\theta_{2,(1,1)}^2\big)\\
& + c_4\,\big(\theta_{2,(0,0)}^2\theta_{2,(1,1)}^2 + \theta_{2,(1,0)}^2\theta_{2,(0,1)}^2\big)\\
& + c_5\,\theta_{2,(0,0)}\theta_{2,(1,0)}\theta_{2,(0,1)}\theta_{2,(1,1)}.
\end{align*}
It is not difficult to calculate the values of coefficients $c_i$. Denoting $\alpha\equiv\theta_{2,(0,0)}(0)$, $\beta\equiv\theta_{2,(1,0)}(0)$, $\gamma\equiv\theta_{2,(0,1)}(0)$ and $\delta\equiv\theta_{2,(1,1)}(0)$, one has
\begin{align*}
c_1 &= (\alpha^2\beta^2-\gamma^2\delta^2)(\alpha^2\gamma^2-\beta^2\delta^2)(\alpha^2\delta^2-\beta^2\gamma^2),\\
c_2 &= -(\alpha^4+\beta^4-\gamma^4-\delta^4)(\alpha^2\gamma^2-\beta^2\delta^2)(\alpha^2\delta^2-\beta^2\gamma^2),\\
c_3 &= -(\alpha^4-\beta^4+\gamma^4-\delta^4)(\alpha^2\beta^2-\gamma^2\delta^2)(\alpha^2\delta^2-\beta^2\gamma^2),\\
c_4 &= -(\alpha^4-\beta^4-\gamma^4+\delta^4)(\alpha^2\beta^2-\gamma^2\delta^2)(\alpha^2\gamma^2-\beta^2\delta^2),\\
c_5 &= 2\alpha\beta\gamma\delta\,\big[(\alpha^4-\beta^4+\gamma^4-\delta^4)^2 - 4(\alpha^2\gamma^2-\beta^2\delta^2)^2\big]. \tag{A3.1}
\end{align*}
If we use the basis dual to $(\theta_{2,e})$ to identify $\phi\in H^0(L_\Theta^2)^*$ with a vector $p = (p_1,p_2,p_3,p_4)\in\mathbb{C}^4$, the equation of the Kummer quartic $\mathcal{K}^*$ becomes
\[ c_1(p_1^4+p_2^4+p_3^4+p_4^4) + c_2(p_1^2p_2^2+p_3^2p_4^2) + c_3(p_1^2p_3^2+p_2^2p_4^2) + c_4(p_1^2p_4^2+p_2^2p_3^2) + c_5\,p_1p_2p_3p_4 = 0. \tag{A3.2} \]
Similarly, identifying $\theta\in H^0(L_\Theta^2)$ with $q = (q_1,q_2,q_3,q_4)\in\mathbb{C}^4$ with the help of the basis $(\theta_{2,e})$, the same equation with $p$ replaced by $q$ defines the Kummer quartic $\mathcal{K}$, compare [13], p. 81.

We shall also need another well known presentation of the above equation using the inhomogeneous coordinates of the Weierstrass points $\lambda_s$ given by Eq. (7.2). It is usually obtained by beautiful geometric considerations about quadratic line complexes, see [10]. It may be also obtained analytically by observing that the multivalued functions
\[ x\mapsto\theta_{2,e}\Big(\int_{x_0}^{x}\omega-\Delta\Big) \]
transform like bilinears in $\partial_a\vartheta(\int_{x_0}^{x}\omega-\Delta)$, i.e., that they represent quadratic differentials. It follows that, writing $w = \int_{x_0}^{x}\omega-\Delta$,
\begin{align*}
\sum_e\theta_{2,e}(E_s)\,\theta_{2,e}(w) &= \vartheta(E_s+w)\,\vartheta(E_s-w)\\
&= D_s\,\big(\partial_1\vartheta(E_{s'})\,\partial_2\vartheta(w) - \partial_2\vartheta(E_{s'})\,\partial_1\vartheta(w)\big)\cdot\big(\partial_1\vartheta(E_{s''})\,\partial_2\vartheta(w) - \partial_2\vartheta(E_{s''})\,\partial_1\vartheta(w)\big), \tag{A3.3}
\end{align*}
where $E_s = \frac{1}{2}(e_s+\tau e'_s)$ is an odd characteristic from the list (7.3) and $E_{s'}$, $E_{s''}$ are the two other ones s.t. $E_s+E_{s'} = E_{s''} \mod(\mathbb{Z}^2+\tau\mathbb{Z}^2)$. The odd characteristics $E_s, E_{s'}, E_{s''}$ are either a permutation of $E_1, E_4, E_5$ or a permutation of $E_2, E_3, E_6$. The relations (A3.3) hold since both sides represent a quadratic differential with double zeros at the Weierstrass points corresponding to $E_{s'}$ and $E_{s''}$. One may obtain expressions for the coefficients $D_s$ by the de l'Hospital rule applied twice at those points. Specifying then $\int_{x_0}^{x}\omega-\Delta$ to $E_s$ or to 3 remaining odd characteristics one obtains relations for quadratic combinations of $\theta_{2,e}(0)$ of the form $\pm\alpha^2\pm\beta^2\pm\gamma^2\pm\delta^2$ with 2 plus and 2 minus signs as well as for $\alpha\beta\pm\gamma\delta$, $\alpha\gamma\pm\beta\delta$ and $\alpha\delta\pm\beta\gamma$. These relations may be used to compute the ratios of the coefficients $c_i$ (A3.1) which become functions of $\lambda_s$ only. One obtains this way an alternative expression for the coefficients $c_i$
\begin{align*}
c_1 &= (\lambda_1-\lambda_2)(\lambda_3-\lambda_4)(\lambda_5-\lambda_6),\\
c_2 &= 2(\lambda_1-\lambda_2)\big((\lambda_3-\lambda_5)(\lambda_4-\lambda_6)+(\lambda_3-\lambda_6)(\lambda_4-\lambda_5)\big),\\
c_3 &= -2(\lambda_3-\lambda_4)\big((\lambda_1-\lambda_5)(\lambda_2-\lambda_6)+(\lambda_1-\lambda_6)(\lambda_2-\lambda_5)\big),\\
c_4 &= 2(\lambda_5-\lambda_6)\big((\lambda_1-\lambda_3)(\lambda_2-\lambda_4)+(\lambda_1-\lambda_4)(\lambda_2-\lambda_3)\big),\\
c_5 &= -2(\lambda_1-\lambda_3)\big((\lambda_4-\lambda_5)(\lambda_2-\lambda_6)+(\lambda_4-\lambda_6)(\lambda_2-\lambda_5)\big)\\
&\quad -2(\lambda_1-\lambda_4)\big((\lambda_3-\lambda_5)(\lambda_2-\lambda_6)+(\lambda_3-\lambda_6)(\lambda_2-\lambda_5)\big)\\
&\quad -2(\lambda_1-\lambda_5)\big((\lambda_2-\lambda_4)(\lambda_3-\lambda_6)+(\lambda_2-\lambda_3)(\lambda_4-\lambda_6)\big)\\
&\quad -2(\lambda_1-\lambda_6)\big((\lambda_2-\lambda_4)(\lambda_3-\lambda_5)+(\lambda_2-\lambda_3)(\lambda_4-\lambda_5)\big) \tag{A3.4}
\end{align*}
equivalent to the previous one up to normalization. Note that the $SL(2,\mathbb{C})$ transformations $\lambda_s\mapsto\frac{a\lambda_s+b}{c\lambda_s+d}$ preserve the form of the quartic equation. The virtue of the analytic approach is that it also provides useful expressions for the non-homogeneous ratios like e.g.
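\[
\frac{\alpha\beta+\gamma\delta}{\alpha^2\gamma^2-\beta^2\delta^2}
= -\,\frac{e^{-\frac{1}{2}\pi i(1,0)\cdot\tau(1,0)}}{2C^2\,(\partial_2\vartheta(E_1))^2}\;
\frac{(\lambda_2-\lambda_5)(\lambda_2-\lambda_6)(\lambda_3-\lambda_4)}{\lambda_1-\lambda_2}.
\]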
C 2 is given by the equations
3 3 2 2 1 (∂1 ϑ)3 ∂2 ϑ−3(∂1 ϑ)2 ∂2 ϑ∂1 ∂2 ϑ+3∂1 ϑ(∂2 ϑ)2 ∂1 ∂2 ϑ−(∂2 ϑ)3 ∂1 ϑ 4 2 (∂2 ϑ)
2
C =
Y
(A3.5)
(λs − λt )
Es t6=s
holding for any fixed s. It is not difficult to see by differentiating twice Eq. (7.1) at x = xs that C is the same constant that appears in Eq. (7.5). The expression (A3.5) is used below to fix the normalization of the Hitchin map.
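The remark on $SL(2,\mathbb{C})$ transformations can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that the quartic depends on the $c_i$ of (A3.4) only through their ratios, and it checks that a Möbius map with $ad - bc = 1$ multiplies all five coefficients by the same factor $\prod_s (c\lambda_s+d)^{-1}$. The function names and the sample values are illustrative choices.

```python
from fractions import Fraction

def quartic_coefficients(lam):
    """Return (c1, ..., c5) of (A3.4) for the six points lam = [λ1, ..., λ6]."""
    l1, l2, l3, l4, l5, l6 = lam
    c1 = (l1 - l2) * (l3 - l4) * (l5 - l6)
    c2 = 2 * (l1 - l2) * ((l3 - l5) * (l4 - l6) + (l3 - l6) * (l4 - l5))
    c3 = -2 * (l3 - l4) * ((l1 - l5) * (l2 - l6) + (l1 - l6) * (l2 - l5))
    c4 = 2 * (l5 - l6) * ((l1 - l3) * (l2 - l4) + (l1 - l4) * (l2 - l3))
    c5 = (-2 * (l1 - l3) * ((l4 - l5) * (l2 - l6) + (l4 - l6) * (l2 - l5))
          - 2 * (l1 - l4) * ((l3 - l5) * (l2 - l6) + (l3 - l6) * (l2 - l5))
          - 2 * (l1 - l5) * ((l2 - l4) * (l3 - l6) + (l2 - l3) * (l4 - l6))
          - 2 * (l1 - l6) * ((l2 - l4) * (l3 - l5) + (l2 - l3) * (l4 - l5)))
    return (c1, c2, c3, c4, c5)

def moebius(lam, a, b, c, d):
    """Apply λ -> (aλ + b)/(cλ + d) to every point (exact rational arithmetic)."""
    return [(a * x + b) / (c * x + d) for x in lam]

# Illustrative sample data: six distinct rational points and a matrix with ad - bc = 1.
lam = [Fraction(k) for k in (0, 1, 3, 7, 12, 20)]
a, b, c, d = 2, 1, 1, 1

before = quartic_coefficients(lam)
after = quartic_coefficients(moebius(lam, a, b, c, d))

# Common factor picked up by every c_i: the product of 1/(cλ_s + d) over s.
factor = Fraction(1)
for x in lam:
    factor /= (c * x + d)

rescaled = all(new == old * factor for old, new in zip(before, after))
print(rescaled)
```

Each $c_i$ is a sum of products of three differences in which every index $1,\dots,6$ occurs exactly once, which is why the same factor appears in all of them.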
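Appendix 4

We shall show here that the overall normalization of the Hitchin map is as in Eq. (7.10). Since
\[
\begin{aligned}
e^{\pi i e'_1\cdot\tau e'_1+4\pi i e'_1\cdot u_1}\,\vartheta(2u_1+E_1)^2
&= -e^{\pi i e'_1\cdot\tau e'_1}\,\vartheta(2u_1+E_1)\,\vartheta(2u_1-E_1)
= -e^{\pi i e'_1\cdot\tau e'_1}\sum_e \theta_{2,e}(E_1)\,\theta_{2,e}(2u_1) \\
&= -e^{\frac{1}{2}\pi i(1,0)\cdot\tau(1,0)}\sum_e (-1)^{(1,0)\cdot e}\,\theta_{2,e+(1,0)}(0)\,\theta_{2,e}(2u_1),
\end{aligned}
\]
the coefficient of $(d\lambda)^2/\zeta^2$ on the right-hand side of Eq. (7.12) takes at $\lambda = \lambda_2$ the value
\[
\frac{C^2}{16\pi^2}\,e^{\frac{1}{2}\pi i(1,0)\cdot\tau(1,0)}\,(\partial_2\vartheta(E_1))^2(\lambda_1-\lambda_2)^2
\big(\beta\,\theta_{2,(0,0)}(2u_1)-\alpha\,\theta_{2,(1,0)}(2u_1)+\delta\,\theta_{2,(0,1)}(2u_1)-\gamma\,\theta_{2,(1,1)}(2u_1)\big)
\tag{A4.1}
\]
in the notations of Appendix 3. This coefficient should coincide with the one obtained from the right-hand side of Eq. (7.10), which is equal to
\[
-\frac{1}{64\pi^2}\sum_{t\neq 2} r_{2t}(q,p)\prod_{v\neq 2,t}(\lambda_2-\lambda_v)
\tag{A4.2}
\]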
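calculated at $(q,p)$ corresponding to $(\theta,\phi_{u_1})$ with $\theta$ given by Eq. (7.11). The respective values of $r_{st}$ are: $r_{1t} = 0$,
\[
\begin{aligned}
r_{23} = 2\big(&-\alpha\gamma^2\theta_{2,(0,0)}(2u_1)-\beta\delta^2\theta_{2,(1,0)}(2u_1)-\gamma\alpha^2\theta_{2,(0,1)}(2u_1)-\delta\beta^2\theta_{2,(1,1)}(2u_1) \\
&-\beta\gamma\delta\,\theta_{2,(0,0)}(2u_1)-\alpha\gamma\delta\,\theta_{2,(1,0)}(2u_1)-\alpha\beta\delta\,\theta_{2,(0,1)}(2u_1)-\alpha\beta\gamma\,\theta_{2,(1,1)}(2u_1)\big), \\
r_{24} = 2\big(&\alpha\gamma^2\theta_{2,(0,0)}(2u_1)+\beta\delta^2\theta_{2,(1,0)}(2u_1)+\gamma\alpha^2\theta_{2,(0,1)}(2u_1)+\delta\beta^2\theta_{2,(1,1)}(2u_1) \\
&-\beta\gamma\delta\,\theta_{2,(0,0)}(2u_1)-\alpha\gamma\delta\,\theta_{2,(1,0)}(2u_1)-\alpha\beta\delta\,\theta_{2,(0,1)}(2u_1)-\alpha\beta\gamma\,\theta_{2,(1,1)}(2u_1)\big), \\
r_{25} = 2\big(&\alpha\delta^2\theta_{2,(0,0)}(2u_1)+\beta\gamma^2\theta_{2,(1,0)}(2u_1)+\gamma\beta^2\theta_{2,(0,1)}(2u_1)+\delta\alpha^2\theta_{2,(1,1)}(2u_1) \\
&+\beta\gamma\delta\,\theta_{2,(0,0)}(2u_1)+\alpha\gamma\delta\,\theta_{2,(1,0)}(2u_1)+\alpha\beta\delta\,\theta_{2,(0,1)}(2u_1)+\alpha\beta\gamma\,\theta_{2,(1,1)}(2u_1)\big), \\
r_{26} = 2\big(&-\alpha\delta^2\theta_{2,(0,0)}(2u_1)-\beta\gamma^2\theta_{2,(1,0)}(2u_1)-\gamma\beta^2\theta_{2,(0,1)}(2u_1)-\delta\alpha^2\theta_{2,(1,1)}(2u_1) \\
&+\beta\gamma\delta\,\theta_{2,(0,0)}(2u_1)+\alpha\gamma\delta\,\theta_{2,(1,0)}(2u_1)+\alpha\beta\delta\,\theta_{2,(0,1)}(2u_1)+\alpha\beta\gamma\,\theta_{2,(1,1)}(2u_1)\big).
\end{aligned}
\tag{A4.3}
\]
Multiplying the coefficients at subsequent $\theta_{2,e}(2u_1)$ in expression (A4.1) by $\alpha$, $-\beta$, $\gamma$ and $-\delta$, respectively, and summing them up, we obtain
\[
\frac{C^2}{8\pi^2}\,e^{\frac{1}{2}\pi i(1,0)\cdot\tau(1,0)}\,(\partial_2\vartheta(E_1))^2(\lambda_1-\lambda_2)^2(\alpha\beta+\gamma\delta).
\]
A similar operation on expression (A4.2) gives
\[
-\frac{1}{16\pi^2}(\lambda_1-\lambda_2)(\lambda_2-\lambda_5)(\lambda_2-\lambda_6)(\lambda_3-\lambda_4)(\alpha^2\gamma^2-\beta^2\delta^2).
\]
The equality of the two expressions follows from Eq. (A3.5). This verifies the correctness of the overall normalization of the Hitchin map in Eq. (7.10).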
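The two contractions used in this appendix are pure algebra in $\alpha, \beta, \gamma, \delta$ and the $\lambda_s$, so they can be checked with integer arithmetic. The sketch below is an illustration under stated assumptions: the coefficient tuples are transcribed from (A4.1) and (A4.3) in the order $\theta_{2,(0,0)}, \theta_{2,(1,0)}, \theta_{2,(0,1)}, \theta_{2,(1,1)}$, the $t = 1$ term is taken to vanish (reading the stated $r_{1t} = 0$ as covering $r_{21}$), and the common prefactors ($C^2/16\pi^2$ times the exponential and derivative factors, resp. $-1/64\pi^2$) are left out. All names and sample values are hypothetical.

```python
from math import prod

def contracted_r2t(al, be, ga, de):
    """Contract the theta-coefficients of r_2t from (A4.3) with (α, -β, γ, -δ)."""
    cubic = (be * ga * de, al * ga * de, al * be * de, al * be * ga)
    sq_a = (al * ga**2, be * de**2, ga * al**2, de * be**2)  # appears in r23, r24
    sq_b = (al * de**2, be * ga**2, ga * be**2, de * al**2)  # appears in r25, r26
    weights = (al, -be, ga, -de)

    def contract(sign_sq, sq, sign_cubic):
        coeffs = [2 * (sign_sq * x + sign_cubic * y) for x, y in zip(sq, cubic)]
        return sum(w * c for w, c in zip(weights, coeffs))

    return {
        1: 0,                          # assumption: the t = 1 term vanishes
        3: contract(-1, sq_a, -1),
        4: contract(+1, sq_a, -1),
        5: contract(+1, sq_b, +1),
        6: contract(-1, sq_b, +1),
    }

def contracted_a42_sum(lam, al, be, ga, de):
    """Sum over t != 2 of contracted r_2t times prod_{v != 2,t} (λ2 - λv)."""
    r = contracted_r2t(al, be, ga, de)
    return sum(
        r[t] * prod(lam[2] - lam[v] for v in range(1, 7) if v not in (2, t))
        for t in (1, 3, 4, 5, 6)
    )

def contracted_a41_bracket(al, be, ga, de):
    """Contract the bracket (β, -α, δ, -γ) of (A4.1) with (α, -β, γ, -δ)."""
    return sum(w * c for w, c in zip((al, -be, ga, -de), (be, -al, de, -ga)))

# Illustrative integer sample values.
al, be, ga, de = 2, 3, 5, 7
lam = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36}

lhs = contracted_a42_sum(lam, al, be, ga, de)
rhs = 4 * (lam[1] - lam[2]) * (lam[2] - lam[5]) * (lam[2] - lam[6]) \
        * (lam[3] - lam[4]) * (al**2 * ga**2 - be**2 * de**2)
bracket = contracted_a41_bracket(al, be, ga, de)
print(lhs == rhs, bracket == 2 * (al * be + ga * de))
```

With the factor $-1/64\pi^2$ restored, the identity `lhs == rhs` is the statement that the contraction of (A4.2) equals $-\frac{1}{16\pi^2}(\lambda_1-\lambda_2)(\lambda_2-\lambda_5)(\lambda_2-\lambda_6)(\lambda_3-\lambda_4)(\alpha^2\gamma^2-\beta^2\delta^2)$, and the factor 2 in the bracket turns $C^2/16\pi^2$ into $C^2/8\pi^2$.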
References 1. Atiyah, M. F., Bott, R.: The Yang–Mills Equation over Riemann Surfaces. Phil. Trans. R. Soc. Lond. A308, 523–165 (1982) 2. Avan, J., Talon, M.: Poisson Structure and Integrability of the Neumann–Moser–Uhlenbeck Model. Int. J. Mod. Phys. A 5, 4477–4488 (1990) 3. Babujian, H. M., Flume, R.: Off-Shell Bethe Ansatz Equation for Gaudin Magnets and Solutions of Knizhnik–Zamolodchikov Equations. Mod. Phys. Lett. A 9, 2029–2040 (1994) 4. Bernard, D.: On the Wess–Zumino–Witten Models on the Torus. Nucl. Phys. B 303, 77–93 (1988) 5. Bernard, D.: On the Wess–Zumino Model on Riemann Surfaces. Nucl. Phys. B 309, 145–174 (1988) 6. Donagi, R., Witten, E.: Supersymmetric Yang–Mills Theory and Integrable Systems. Nucl. Phys. B 460, 299–350 (1996) 7. Enriquez, B. Rubtsov, V.: Hitchin systems, higher Gaudin operators and r-matrices. Math. Res. Lett. 3, 343–57 (1996) 8. Etingof, P., Kirillov, A., Jr.: Representations of Affine Lie Algebras, Parabolic Differential Equations and Lam´e Functions. Duke Math. J. 74, 585–614 (1994) 9. Falceto, F., Gawe¸dzki, K.: Unitarity of the Knizhnik–Zamolodchikov–Bernard Connection and the Bethe Ansatz for the Elliptic Hitchin Systems. Commun. Math. Phys. 183, 267–290 (1997) 10. Griffiths, P. Harris, J.: Principles of algebraic geometry. New York: Wiley and Sons, 1978 11. Hitchin, N. J.: Stable Bundles and Integrable Systems. Duke Math. J. 54, 91–114 (1987) 12. Hitchin, N. J.: Flat Connections and Geometric Quantization. Commun. Math. Phys. 131, 347–380 (1990) 13. Hudson, R.: Kummer’s quartic surface. Cambridge: Cambridge University Press, 1990 14. Knizhnik, V. G.: Analytic Fields on Riemann Surfaces. II. Commun. Math. Phys. 112, 567–590 (1987) 15. Knizhnik, V. G., Zamolodchikov, A. B.: Current Algebra and Wess–Zumino Model in Two Dimensions. Nucl. Phys B 247, 83–103 (1984) 16. Markman, E.: Spectral Curves and Integrable Systems. Comp. Math. 93, 255–290 (1994) 17. Mumford, D.: Tata lectures on theta II, Basel: Birkh¨auser, 1984 18. 
Narasimhan, M. S., Ramanan, S.: Moduli of Vector Bundles on a Compact Riemann Surface. Ann. Math. 89, 19–51 (1969) 19. Nekrasov, N.: Holomorphic Bundles and Many-Body Systems. Commun. Math. Phys. 180, 587–603 (1996)
20. Reshethikin, N., Varchenko, A.: Quasiclassical Asymptotics of Solutions of the KZ Equation. In: Geometry, topology and physics for Raoul Bott, ed. S.-T. Yau, International Press, Cambridge MA 1995, pp. 293–322 21. van Geemen, B., Previato, E.: On the Hitchin System. Duke Math. J. 85, 659–683 (1996) 22. van Geemen, B., de Jong, A. J.: On Hitchin’s Connection. alg-geom/9701007 23. Zamolodchikov, Al. B.: Selected Topics in Conformal Field Theory. In: Functional integration, geometry and strings, eds. Z. Haba and J. Sobczyk, Basel: Birkh¨auser, 1989, pp. 303–347 Communicated by G. Felder
Commun. Math. Phys. 196, 671 – 679 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Strange Attractors Containing a Singularity with Two Positive Multipliers? C.A. Morales, E.R. Pujals Departamento de M´etodos Matem´aticos, Instituto de Matem´aticas-UFRJ, Caixa Postal 68530, Cep 21945-970, Rio de Janeiro, RJ, Brasil. E-mail: [email protected], [email protected] Received: 13 December 1996/ Accepted: 30 January 1998
Abstract: We construct smooth vector fields exhibiting strange attractors that contain a singularity with two positive Lyapunov exponents. We explain how such attractors can be approximated by geometric Lorenz attractors or by homoclinic tangencies.
1. Introduction Let M be a three-dimensional closed manifold and denote X r the set of C r -vector fields in M , endowed with the C r topology, r ≥ 1. An attractor of X ∈ X r is an invariant transitive set A ⊂ M of X such that there is a neighborhood U of A (isolating block) such that Xt (U ) ⊂ U (for t > 0) and A = ∩t>0 Xt (U ), where Xt denotes the flow of X. The attractor A of X is strange if it does not consist of a singularity nor a periodic orbit and robust if there is an isolating block U of A such that ∩t>0 Yt (U ) is a strange attractor for Y close to X. Note that strange attractors containing singularities cannot be hyperbolic sets. The geometric Lorenz attractor [3] is an example of a robust strange attractor that contains a unique singularity with just one positive Lyapunov exponent. The main subject of this paper is to analyze the existence of strange attractors having singularities with two positive Lyapunov exponents. This investigation is motivated by the study of diffeomorphisms with (nonhyperbolic) attractors having several expanding directions in its basin. The reader can find examples of such diffeomorphisms in [9,6] and more recently in [1,11]. We start with a result which says that, under certain conditions, robust C 1 strange attractors containing a singularity with two positive Lyapunov exponents cannot exist. ? The authors thank IMPA for its very kind hospitality. This work was partially supported by CNPq and PRONEX/Dyn. Sys-Brazil.
Proposition. There is no vector field in X 1 with a robust strange attractor A containing a singularity σ such that – σ is hyperbolic and has two positive Lyapunov exponents; – A contains only one branch of W s (σ) \ {σ}. The proof of this result follows using Hayashi’s connecting lemma [4]. For assume that A is an attractor as in the proposition. Then, by the connecting lemma, it follows that there is a C 1 perturbation A0 of A with a homoclinic orbit associated to the continuation σ 0 of σ and such that one branch of W s (σ 0 )\{σ 0 } is not contained in a fixed isolating block U of A0 . Perturbing A0 we obtain a robust strange attractor A00 with isolating block U such that the branches of W s (σ 00 )\{σ 00 } are not contained in U (σ 00 denotes continuation of σ 0 ). And this is a contradiction because at least one branch of W s (σ 00 ) \ {σ 00 } must be contained in A00 since A00 is non-trivial and transitive. It was recently proved that the last hypothesis in the proposition can be removed [7]. Due to the proposition, it is interesting to investigate how strange attractors containing a singularity with two positive Lyapunov exponents are located in the space of flows. The following theorem deals with this problem in the space of C r flows, r ≥ 3. Theorem 1. There are open sets U ⊂ M , U ⊂ X r (r ≥ 3) and a codimension-two submanifold S ⊂ U such that Xt (U ) ⊂ U (X ∈ U , t > 0) and 1X = ∩t≥0 Xt (U ) is a strange attractor containing a singularity with two positive Lyapunov exponents ∀X ∈ S. Moreover, each vector field X ∈ S can be C r -approximated by vector fields exhibiting (in U ) either a geometric Lorenz attractor or homoclinic tangencies. We shall obtain the submanifold S in Theorem 1 through a simple bifurcation. The idea is to connect the unstable manifold of the Lorenz attractor‘s singularity with the stable manifold of a two positive Lyapunov exponents hyperbolic singularity. 
Indeed, we shall exhibit open examples where such a connection can be made successfully as the first bifurcation of the Lorenz attractor. The main construction will be based on a geometric model to be introduced in Sect. 2. Some basic properties, as well as the approximation by geometric Lorenz attractors for this model, are proved in Sects. 3 and 4. The approximation by homoclinic tangencies is proved in Sect. 5 under the assumptions (H1) and (H2) of Sect. 2. Some final remarks are given in Sect. 6.

2. The Model

In this section, we shall describe a vector field by means of a geometric model. Let $X_0 \in \mathcal{X}^r$ be a vector field satisfying Fig. 1. As indicated in this figure, $\sigma_0$ (resp. $\bar\sigma_0$) is a hyperbolic singularity of $X_0$ such that the eigenvalues $\{-\lambda_2, -\lambda_3, \lambda_1\}$ (resp. $\{\rho_3, \rho_2, -\rho_1\}$) of $DX_0(\sigma_0)$ (resp. $DX_0(\bar\sigma_0)$) are real and satisfy $-\lambda_2 < -\lambda_3 < 0 < \lambda_1$ (resp. $\rho_3, \rho_2 > 0$ and $-\rho_1 < 0$). It is assumed that the flow of $X_0$ is linear close to $\sigma_0$ and $\bar\sigma_0$, i.e., there exist two coordinate systems $(x, y, z)$ and $(\bar x, \bar y, \bar z)$ such that $X_0$ has the form
\[
x' = \lambda_1 x, \qquad y' = -\lambda_2 y, \qquad z' = -\lambda_3 z
\tag{1}
\]
and
\[
\bar x' = -\rho_1 \bar x, \qquad \bar y' = \rho_2 \bar y, \qquad \bar z' = \rho_3 \bar z
\tag{2}
\]
close to $\sigma_0$ and $\bar\sigma_0$ respectively. A top square $S$, transversal to $X_0$, is also depicted in Fig. 1. It is close to $\sigma_0$ and has the form $S = \{(x, y, z) : \|(x, y)\| \le \delta,\ z = 1\}$ for some fixed positive number $\delta$. We have $S = S^+ \cup S^- \cup \{(0, y) : |y| \le \delta\}$, where $S^+ = \{(x, y, 1) \in S : x > 0\}$ and $S^- = \{(x, y, 1) \in S : x < 0\}$. Fig. 1 also depicts a fundamental domain $D^u_0$ of $X_0$ restricted to the unstable manifold $W^u(\bar\sigma_0)$ of $\bar\sigma_0$.

Figure 1.
We impose the following conditions on $X_0$:

H1 Every orbit with initial point in $D^u_0$ goes directly to $S$, and there are two critical points $c_0$, $c_1$ between the return set $D$ of $D^u_0$ (under the flow of $X_0$) and the vertical foliation $\{x = \text{cnt.}\}$ in $S$ (see Fig. 1).

H2 According to (H1) there are two points $\bar c_0$, $\bar c_1$ in $D^u_0$ such that the forward orbit of $\bar c_0$ (resp. $\bar c_1$) under $X_0$ goes directly to $c_0$ (resp. $c_1$). We assume that $\bar c_0$ (resp. $\bar c_1$) does not belong to $W^{uu}_{loc}(\bar\sigma_0)$, the local strong unstable manifold of $X_0$ at $\bar\sigma_0$ (see [5]).

Solving the linear system (1), one gets a return map $R_L$ from $S^* = S^+ \cup S^-$ to $\{x = \pm 1\}$. If we endow $S$ with the coordinate system $x_0 = x$, $y_0 = y$ for $(x, y, 1) \in S$, then $R_L(x_0, y_0) = (y_0 |x_0|^{\beta}, |x_0|^{\alpha})$ with $\alpha = \lambda_3/\lambda_1$ and $\beta = \lambda_2/\lambda_1$. Fig. 2 depicts two "cusp" triangles $T$ and $\bar T$, corresponding to the components of the image of $S^+$ and $S^-$ under $R_L$ respectively. We assume that a return map $R_0$ sending $T$ back to $S$ is well defined. The corresponding image is denoted by $T'$ (see Fig. 2). We also assume that the orbits in $T$ return to $S$ without passing close to any other singularity. Furthermore, the orbits in $\bar T$ first pass close to $\bar\sigma_0$ before returning to $S$. We have a return map $R_T$ from $\{x = -1\}$ to $\{\bar x = 1\}$. It turns out that $R_T$ is a smooth diffeomorphism and we shall assume that, for our initial field $X_0$, it sends $(-1, y, z)$ to $(1, \bar y, \bar z)$ with $\bar y = y$ and $\bar z = z$.
Figure 2.
We introduce the coordinate systems $\bar y_1 = \bar y$, $\bar z_1 = \bar z$ on $\{(1, \bar y, \bar z)\}$ and $\bar x_0 = \bar x$, $\bar y_0 = \bar y$ on $\bar S = \{(\bar x, \bar y, 1) : \|(\bar x, \bar y)\| \le \delta\}$. A return map $\bar R_L$, from $\{(1, \bar y, \bar z)\}$ to $\bar S$, is depicted in Fig. 2. This return map is easily obtained from the linear system (2), and the shape of its domain depends on the eigenvalues of $DX_0(\bar\sigma_0)$. Indeed, it follows that $\bar R_L(\bar y_1, \bar z_1) = (\bar z_1^{\rho}, \bar y_1 \bar z_1^{-\gamma})$, where $\rho = \rho_1/\rho_3$ and $\gamma = \rho_2/\rho_3$. In addition, the domain of $\bar R_L$ in $\{\bar x = 1\}$ is $\bar D_L = \{(\bar y_1, \bar z_1) : \bar z_1 \ge (\frac{1}{\delta}|\bar y_1|)^{1/\gamma}\}$ with respect to the coordinate system $(\bar y_1, \bar z_1)$ in $\{\bar x = 1\}$. The composition $R_{in} = \bar R_L \circ R_T \circ R_L$, from $S^-$ to $\bar S$, is well defined and is given by $R_{in}(x_0, y_0) = (|x_0|^{\alpha\rho}, y_0 |x_0|^{\beta - \alpha\gamma})$. Therefore $R_{in}$ sends the leaves $\{x_0 = \text{cnt.}\}$ in $S^-$ into leaves $\{\bar x_0 = \text{cnt.}\}$ in $\bar S$.

To complete the description of $X_0$, we shall assume that there exists a global return map $\bar R_{far}$, from $\bar S$ to $S$, such that $\bar R_0 = \bar R_{far} \circ R_{in}$. Furthermore, $\bar R_{far}$ sends the leaves $\{\bar x_0 = \text{cnt.}\}$ (resp. $\{\bar y_0 = \text{cnt.}\}$) into leaves $\{x_0 = \text{cnt.}\}$ (resp. $\{y_0 = \text{cnt.}\}$) in $S$. The corresponding return map $R_{far}$ exists and is a smooth diffeomorphism sending the leaves $\{y_1 = \text{cnt.}\}$ (resp. $\{z_1 = \text{cnt.}\}$) in $\{x = 1\}$ into leaves $\{y_0 = \text{cnt.}\}$ (resp. $\{x_0 = \text{cnt.}\}$) in $S$. We define $R_0 = R_{far} \circ R_L$ and, in Fig. 2, $T'$ stands for the image of $T$ under $R_{far}$.

We shall make the following eigenvalue assumptions on $X_0$.

EV1 $\lambda_2 \rho_3 > \lambda_3 \rho_2$ and $\rho_3 > \rho_2 + \rho_1$.

EV2 There are no resonances between the eigenvalues $\lambda_1, -\lambda_2, -\lambda_3$ (resp. $-\rho_1, \rho_2, \rho_3$) of $\sigma_0$ (resp. $\bar\sigma_0$). This leads to the existence of $C^2$-linearizing coordinates around the analytic continuation of $\sigma_0$ and $\bar\sigma_0$ for vector fields near $X_0$ [8].

Under such eigenvalue assumptions, we summarize the main features of $X_0$ as follows:

1. $X_0$ has a transversal section $S$, endowed with a coordinate system $(x_0, y_0)$, in which $W^s_{loc}(\sigma_0) \cap S$ is given by $\{(0, y_0) : |y_0| \le \delta\}$.
2. There is a return map RX0 , from S ∗ = S + ∪ S − to S, given by R0 (x0 , y0 ) = (f0 (x0 ), g0 (x0 , y0 )) , if (x0 , y0 ) ∈ S + RX0 (x0 , y0 ) = R0 (x0 , y0 ) = (f 0 (x0 ), g 0 (x0 , y0 )) , otherwise. 3. |∂y0 g0 | and |∂y0 g 0 | are small. 4. The one-dimensional map fX0 defined by f0 (x0 ) , if x0 ∈ (0, δ] fX0 (x0 ) = f 0 (x0 ) ,if x0 ∈ [−δ, 0) is well defined and satisfy √ 0 – fX (x0 ) exists and is greater than 2, ∀x0 ∈ [−δ, δ] \ {0}; 0 0 – limx0 →0+ fX0 (x0 ) = −δ, limx0 →0− fX0 (x0 ) = δ, limx0 →0± |fX (x0 )| = ∞ and 0 further fX0 (0+) = −δ and fX0 (0−) = δ. 5. The limits limq→q± R0 (q) = b± exist ∀q0 ∈ {(0, y0 ) : |y0 | ≤ δ} and they do not 0 depend on q0 . Apart from the singularity σ 0 , the geometric behavior of the vector field X0 resembles the one of the geometric model in [3]. 3. Useful Lemmas Let X0 be the vector field described in the last section. Then it presents a saddle connection between σ0 and σ 0 : the left branch of W u (σ0 ) \ {σ0 } coincides with the right branch of W s (σ 0 ) \ {σ 0 }. This saddle connection clearly persists in a codimension two submanifold N ⊂ X r containing X0 . Throughout σ(X) (resp. σ(X)) denotes the continuation of σ0 (resp. σ 0 ) for X close to X0 . Let us analyze the continuation Rin,X of Rin , for X close to X0 . To do so, denote by RL,X , RT,X and RL,X the continuation of RL , RT and RL respectively, for X close to X0 . Now, it is easy to see that N satisfies N = {X ∈ W0 : RT,X (0, 0) = (0, 0)}, where W0 ⊂ X r is some open set containing X0 . When (y 1 (X), z 1 (X)) = RT,X (0, 0) satisfies 1 z 1 (X) ≥ | δ1 y 1 (X)| γ , we shall see that Rin,X = RL,X ◦ RT,X ◦ RL,X displays a certain amount of hyperbolicity, for X close to X0 . In particular, we shall prove that Rin,X exhibits such behavior for X ∈ N close to X0 . In the statement below, we shall use the following notation. DRin,X denotes the derivative of Rin,X with respect to (x0 , y0 ) ∈ S − . 
The functions AX , B X , C X and DX are the entries of the matrix DRin,X in a way that AX (x0 , y0 ) B X (x0 , y0 ) DRin,X (x0 , y0 ) = . C X (x0 , y0 ) DX (x0 , y0 ) The derivative of AX , B X , C X and DX with respect to (x0 , y0 ) are denoted by ∇AX , ∇B X , ∇C X and ∇DX respectively. RT,X is writen as RT,X (x0 , y0 ) = (y 1,X (x0 , y0 ), z 1,X (x0 , y0 )). In addition, y 1,X and z 1,X stand for y 1,X (RL,X (x0 , y0 )) and z 1,X (RL,X (x0 , y0 )) respectively.
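The return maps of Sect. 2 come from integrating the two linear systems (1) and (2) until the orbit reaches the next cross section. The sketch below is a hedged numerical illustration of that computation for $X_0$ itself, assuming $R_T$ is the identity in the chosen coordinates as stated in Sect. 2. The function names and the sample eigenvalues (picked to satisfy $\lambda_2 > \lambda_3 > 0$ and (EV1)) are illustrative, not taken from the paper. It checks the closed form of $R_L$, the second component $y_0|x_0|^{\beta-\alpha\gamma}$ of $R_{in}$, and the fact that the first component of $R_{in}$ does not depend on $y_0$, which is why vertical leaves are mapped into vertical leaves.

```python
import math

def R_L(x0, y0, lam1, lam2, lam3):
    """Follow system (1) from (x0, y0, 1) until |x| = 1; return (y1, z1)."""
    t = -math.log(abs(x0)) / lam1          # time at which |x0| * exp(lam1 * t) = 1
    return (y0 * math.exp(-lam2 * t), math.exp(-lam3 * t))

def R_L_bar(y1, z1, rho1, rho2, rho3):
    """Follow system (2) from (1, y1, z1) until z = 1; return (x, y) on the top square."""
    t = -math.log(z1) / rho3               # time at which z1 * exp(rho3 * t) = 1
    return (math.exp(-rho1 * t), y1 * math.exp(rho2 * t))

def R_in(x0, y0, lam, rho):
    """Composition R_L_bar o R_T o R_L with R_T taken to be the identity (assumption)."""
    y1, z1 = R_L(x0, y0, *lam)
    return R_L_bar(y1, z1, *rho)

# Illustrative eigenvalues: lam2 > lam3 > 0, lam2*rho3 > lam3*rho2, rho3 > rho2 + rho1.
lam = (1.0, 2.0, 0.5)      # (lam1, lam2, lam3)
rho = (0.5, 1.0, 2.0)      # (rho1, rho2, rho3)
alpha, beta = lam[2] / lam[0], lam[1] / lam[0]
gamma = rho[1] / rho[2]

x0, y0 = -0.1, 0.05
y1, z1 = R_L(x0, y0, *lam)
first_a, second_a = R_in(x0, y0, lam, rho)
first_b, _ = R_in(x0, -0.02, lam, rho)     # same x0, different y0

print(math.isclose(second_a, y0 * abs(x0) ** (beta - alpha * gamma)))
print(math.isclose(first_a, first_b))
```

Since the first component depends on $x_0$ alone, the dynamics transverse to the vertical foliation reduces to a one-dimensional map, which is the structure used in the summary of Sect. 2.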
Lemma 1. There exist positive constants Ki , i = 1, ..., 4, such that if X is close to X0 and satisfies |y 1 (X)| ≤ z 1 (X), then – |AX (x0 , y0 )| ≥ K1 |z 1,X |(ρ−1) |x0 |(α−1) , |B X (x0 , y0 )| ≤ K2 |z 1,X |(ρ−1) |x0 |β ; – |C X (x0 , y0 )| ≤ K3 |z 1,X |(−γ) |x0 |(α−1) , |DX (x0 , y0 )| ≤ K4 |z 1,X |(−γ) |x0 |β . Moreover, the quantities – – – –
|∇AX (x0 , y0 )|.|z 1,X |(1−ρ) |x0 |(1+β+2(1−α)) , |∇B X (x0 , y0 )|.|z 1,X |(1−ρ) |x0 |(1−α) , |∇C X (x0 , y0 )|.|z 1,X |(1−ρ) |x0 |(1+β+2(1−α)) and |∇DX (x0 , y0 )|.|z 1,X |(1−ρ) |x0 |(1−α)
go uniformly to zero when x0 goes to zero. Here α, β, ρ, γ stand for the continuation, for X close to X0 , of the corresponding eigenvalue ratios of X0 (see Sect. 2). Proof. By (EV2) in Sect. 2, X is C 2 -linearizable close to σ(X) and σ(X) for X nearby X0 . As |y 1 (X)| ≤ z 1 (X) we obtain that |y 1,X |(z 1,X )−1 is a bounded function. The remainder is a straighforward computation. The lemma below follows using Lemma 1 and the methods in [2]. In its statement, RX denotes the continuation of the return map RX0 for X close to X0 (see Sect. 2). Lemma 2. If X is close to X0 and satisfies |y 1 (X)| ≤ z 1 (X), then there exists a C 1 s invariant strong stable foliation FX for RX in S. Moreover, the dynamics induced by s this return map in FX is locally eventually onto [3]. 4. Approximation by Geometric Lorenz Attractors In this section we prove the approximation by geometric Lorenz attractors required in Theorem 1. For this we shall study two-parameter families of vector fields Xµ,η , (µ, η) ∈ R2 such that X0,0 is close to X0 and belongs to N . To start with, let us give a simple criterion to test when a parametrized family Xµ,η as above crosses N transversally at (µ, η) = (0, 0). Let Xµ,η depending smoothly on (µ, η) ∈ R2 and define I(µ, η) = (y 1 (µ, η), z 1 (µ, η)) = RT,Xµ,η (0, 0). Then one has Lemma 3. Xµ,η is transversal to N at (µ, η) = (µ0 , η0 ) if and only if Xµ0 ,η0 ∈ N and det(DI(µ0 , η0 )) 6= 0. Because of this lemma, a smooth change of parameters gives rise to the relations y 1,Xµ,η (0, 0) = µ z 1,Xµ,η (0, 0) = η
(3)
for (µ, η) close to (0, 0), provided Xµ,η is transversal to N at (µ, η) = (0, 0). To state our next lemma we shall make the following considerations. Using the arguments in [3], one can prove that X0 (see last section) exhibits the strange attractor 10 = ∩t≥0 (X0 )t (U )
containing σ0 and σ 0 , for a fixed neighborhood U . An attractor as 10 above is called heteroclinic in [10]. In addition, it follows from Lemmas 1 and 2 that there is a neighborhood W of X0 such that every field X ∈ N = W ∩ N displays a continuation 1X of 10 in U , i.e. 1X = ∩t≥0 Xt (U ), is also a strange attractor of X containing σ(X) and σ(X). We establish a lemma which is a direct consequence of (3) and the lemmas in Sect. 3. Lemma 4. Let Xµ,η be a parametrized family crossing N transversally at (µ, η) = (0, 0). Then, for every (µ, η) satisfying |µ| ≤ η 6= 0, 1Xµ,η is the union of a geometric Lorenz attractor and a set of orbits contained in W s (σ(Xµ,η )). In other words, this lemma says that a geometric Lorenz attractor arises from the strange attractor 10,0 of X0 in the parameter region L = {(µ, η) : |µ| ≤ η, η > 0}. Therefore, geometric Lorenz attractors are prevalent in generic unfoldings of Xµ,η : the set of parameter values corresponding to vector fields exhibiting a geometric Lorenz attractor has positive Lebesgue density at the bifurcating value (0, 0). 5. Approximation by Homoclinic Tangencies We shall analyze how the attractors in N are accumulated by homoclinic tangencies. The result is the following. Theorem 2. There are positive constants 1, K > 0 satisfying the following property. Suppose that Xµ,η is transversal to N at (µ, η) = (0, 0). Then for every > 0 there is δ > 0 such that if (µ, η) satisfies ||(µ, η)|| < δ and η < −K|µ|1+1 , then there exists Y ∈ X r such that, – ||Y − Xµ,η ||C r < and Y coincides with Xµ,η outside U (see last section); – Y exhibits a homoclinic tangency associated to some hyperbolic saddle periodic orbit in U . It follows from this theorem that 1X , for X ∈ N , is accumulated by strange Hen´onlike attractors or finitely (infinitely) many attracting periodic orbits. Let us establish some terminology which will be used in the proof of Theorem 2. 
Throughout we deal with vector fields X ∈ X r sufficiently close to X0 (see Sect. 2, where X0 was already defined). Recall that RX stands for the return map associated to X in S (see Sect. 2). We shall say that a basic set B ⊂ S of RX is almost vertical if the local stable s (p)}p∈B , with respect to RX , extends to an almost vertical C 1 foliation manifolds {Wloc s s F in S. We fix a metric d(., .) on S and, as usual, Wloc (B) = ∪p∈B Wloc (p) for a basic set B of RX . An almost vertical basic set of RX in S is -critical if there exist a point p ∈ B and q ∈ s (B)) < W u (p)∩S such that W u (p) is tangent, at q, to some leaf of F and d(q, ∪p∈B Wloc . Notice that every basic set of RX with |y 1 (X)| ≤ z 1 (X) is almost vertical. We can see that if RX possesses an almost vertical -critical basic set in S, then X can be -approximate (in the C r -topology) by a vector field Y satisfying the following properties: – Y coincides with X outside U ; – Y exhibits a homoclinic tangency associated to some periodic orbit in U .
Therefore, the proof of Theorem 2 follows from the following Claim. Suppose that Xµ,η is as in Theorem 2. Then there are constants K, 1 > 0 satisfying the following property. For every there is δ such that if (µ, η) satisfies ||(µ, η)|| < δ and η < −K|µ|1+1 , then Xµ,η displays an almost vertical -critical basic set Bµ,η () ⊂ S of Rµ,η = RXµ,η . Proof. It follows from Lemma 2 that, up to a change of coordinates, the return map RX associated to X ∈ N satisfies RX (x0 , y0 ) = (fX (x0 ), gX (x0 , y0 )), where (4) of Sect. 2 holds (replacing fX0 by fX ) and gX is a contraction in the y0 direction. Denote by P er(fX ) the set of periodic points of fX . We shall say that q, q 0 ∈ i i (q), fX (q 0 ) belong to the open interval in R P er(fX ) (q 6= q 0 ) are related if none of fX with boundary points q, q 0 ∀i ∈ N. Lemma 5. There is δ0 ∈ (0, δ) such that if q, q 0 ∈ P er(fX ) are related and −δ0 < q < 0 < q 0 < δ0 then −n ([fX (q 0 ), q] ∪ [q 0 , fX (q)]) H(q, q 0 ) = ∩n∈N fX
is a Cantor set, P er(fX ) ∩ H(q, q 0 ) is dense in H(q, q 0 ) and fX /H(q, q 0 ) has a dense orbit. Recall that a Cantor set is a compact perfect subset of R. Usually it is obtained by removing open intervals (often called gaps) on a fixed compact interval of R (see [8] for a detailed description). Another result is the following Lemma 6. There is 0 ∈ (0, δ0 ) such that if ∈ (0, 0 ), then there exist q , q0 ∈ P er(fX ) which are related and satisfy − < q < 0 < q0 < . Thus we can associate, for small and X ∈ N , the basic set BX () of RX defined as −n (([fX (q0 ), q ] ∪ [q0 , fX (q )]) × [−δ, δ]), BX () = ∩n∈N RX
where q , q0 come from the above lemma. Clearly BX () is an almost vertical basic set of RX . Now if Xµ,η is as in the statement of Theorem 2, we define B0,0 () = BX0,0 () and it is easy to see that B0,0 () has an almost vertical continuation Bµ,η () for every (µ, η) with small norm. It remains to prove that there is K, 1 > 0 such that Bµ,η () is a -critical basic set of Rµ,η if (µ, η) is close to (0, 0) and satisfies η < K|µ|1+1 . To prove this, choose a small cross section B of X0,0 such that B ∩ S is empty and u c0 , c1 ∈ / B. In addition, we choose B in a way that B ∩ D0,0 is a nontrivial compact u interval of D0,0 . u is the continuation of D0u for X close to X0 (see (H1) and (H2) in Sect. 2) Here DX u u . One can see that RL,µ,η = RL,Xµ,η (recall Sect. 3) has an extension, and D0,0 = DX 0,0 −1
in {x = 1}, in a way that RL,µ,η is well defined in B. −1
Let C tang = RL,µ,η (B). Because of (EV1) in Sect. 2, one has C tang = {(1, y, z) : z < −K|y|1+1 },
for some fixed constants K, 1 > 0 which do not depend on (µ, η). Using (3) we obtain that C tang induces the region {(µ, η) : η < −K|µ|1+1 } in the parameter space. Take K ∈ (0, K) and consider Ctang = {(µ, η) : η < −K|µ|1+1 }. If (µ, η) ∈ Ctang , then the unstable manifold W u (Bµ,η ()) of Bµ,η () under Rµ,η satisfies that there is an arc l ⊂ W u (Bµ,η ()) such that the Xµ,η -orbit of a point q ∈ l passes close to σ(Xµ,η ). u This implies that such a Xµ,η -orbit passes close to DX \ (S ∪ B) as well. µ,η u Then, Rµ,η (l ) ⊂ W (Bµ,η ()) must contain a critical point cµ,η which is close to c0 (or c1 ) if (µ, η) is close to (0, 0). As the length of the gaps of H(q , q0 ) go to zero when s (Bµ,η ()) when → 0+ . This implies that → 0+ , we obtain that cµ,η converges to Wloc Bµ,η () is -critical and completes the proof of the Claim. 6. Final Remarks The main result of this paper says that, in dimension three, strange attractors containing a singularity with two positive Lyapunov exponents can persist in a codimension-two submanifold of the space of C r flows, r ≥ 3. It was also proved that the attractors in such a submanifold are accumulated by geometric Lorenz attractors or else by homoclinic tangencies. In the proposition at Sect. 1 is proved that robust examples of strange attractors containing a singularity with two positive Lyapunov exponent do not exist in in the C 1 topology (see also [7]). These results motivate in part the study of strange attractors with singularities for vector field (singular strange attractors). In particular, the study of singular strange attractors with a unique singularity and a dense set of periodic orbits seems to be interesting. We belive that any of those attractors can be approximated by homoclinic tangencies if the corresponding singularity has two positive Lyapunov exponents. References 1. Bonatti, C., Diaz, L.J.: Persistent nonhyperbolic transitive diffeomorphisms. Ann. of Math. 143, 357–396 (1996) 2. 
Bam´on, R., Labarca, R., Ma˜ne´ , R., Pac´ıfico, M.J.: The explosion of singular cycles. Publ. Math. IHES 78, 207–232 (1993) 3. Guckenheimer, J., Williams, R.F.: Structural stability of Lorenz attractors. Publ. Math. IHES 50, 307– 320 (1979) 4. Hayashi, S.: Connecting invariant manifolds and the solution of the C 1 stability conjecture and stability conjecture for flows. Ann. of Math., 145, 81–137 (1997) 5. Hirsch, M., Pugh, C.C., Shub, M.: Invariant manifolds. Lect. Notes. in Math. 583, Berlin–Heidelberg– New York: Springer-Verlag, 1977 6. Ma˜ne´ , R.: Contributions to the stability conjecture. Topology 17, 383–396 (1978) 7. Morales, C.A., Pacifico M.J., Pujals, E.R.: On C 1 robust transitive sets with singularities for three dimensional flows. C.R. Acad. Sci. Paris, t. 326, S´erie I, 81–86 (1998) 8. Palis, J., Takens, F.: Hyperbolicity and sensitive chaotic dynamics at homoclinic bifurcations. Cambridge: University Press, 1993 9. Shub, M.: Topologically transitive diffeomorphisms on T 4 . Lect. Notes in Math. 206, Berlin–Heidelberg– New York: Springer-Verlag, p. 39, 1971 10. Takens, F.: Heteroclinic Attractors: Time Averages and Moduli of Topological Conjugacy. Bol. Soc. Bras. Mat. 25, 107–120 (1993) 11. Viana, M.: Multidimensional nonhyperbolic attractors. Publ. Math. IHES 85, 63–96 (1997) Communicated by Ya. G. Sinai
Commun. Math. Phys. 196, 681 – 701 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Scaling Limit for a Mechanical System of Interacting Particles II Kˆohei Uchiyama? Department of Mathematics, Tokyo Institute of Technology, Meguro-ku, Tokyo 152, Japan. E-mail: [email protected] Received: 6 October 1997 / Accepted: 5 February 1998
Abstract: A system of a large number of classical particles moving on a one-dimensional segment with virtually reflecting boundaries is studied. The particles interact with one another through repulsive pair-potential forces and are subject to resistance proportional to their velocities. Because of the latter it is only the number of particles that is conserved under the evolution of the system. It is established that under suitable scaling of space and time the normalized counting measure of particle locations converges and its limiting measure is governed by a non-linear evolution equation, which is local (diffusion equation) or non-local (integral equation) according as the tail of the potential is bounded by 1/|x| or obeys the power law 1/|x|γ with 0 < γ < 1. The scaling also depends on the tail of the potential.
0. Introduction This paper is a continuation of Uchiyama [2], in which we studied a scaling limit of a system of a large number of particles that move on a one-dimensional segment with virtually reflecting boundaries according to a classical equation of motion. The particles interact with one another through repulsive pair-potential forces given by a common potential function U (x), x ∈ R \ {0}, and are subject to resistance proportional to their velocities. Because of the latter it is only the number of particles that is conserved under the evolution of the system. In [2] it is established that if the function U (x) is even and convex for x > 0 and has an integrable tail (short range), then under the parabolic scaling the limit density ρ(θ, t) of normalized counting measures of particle locations solves a non-linear diffusion equation ρt = (P (ρ))θθ . For its derivation the local equilibrium structure in the microscopic description of the system plays a decisive role; the function P accordingly reflects the whole shape of U . ?
Research partially supported by Japan Society for the Promotion of Science
In this paper we consider the same system in the case when the potential has a long tail that behaves like the power law $C/|x|^\gamma$ with $0 < \gamma \le 1$ ($C$ is a positive constant). In such cases the local equilibrium is irrelevant for the macroscopic description of the density, and the corresponding limit equation is determined only by the tail of $U$, in contrast with the short range case, as we shall see in this paper. The proof is similar to that employed in [2] except that we do not need to prove (and cannot utilize) the local equilibrium. When the tail of $U$ behaves like $C/|x|^\gamma$ with $0 < \gamma < 1$, the mathematical structure of the problem is that of a typical mean-field model of McKean-Vlasov type with the interaction kernel $(x - y)/|x - y|^{2+\gamma}$, which is singular along the diagonal; all the difficulty comes from this singularity. (Because of the cancellation of the interacting forces it is the singularity $1/|x - y|^\gamma$ that we must control.) In the critical case $U(x) \sim 1/|x|$ (i.e., $\gamma = 1$) we encounter a new type of problem on a singular limit that would lead to the non-linear diffusion $\rho_t = (\rho^2)_{\theta\theta}$.

1. The Model and Main Results

Consider a system of $N$ Newtonian particles of unit mass moving on the one-dimensional open segment $(0, N)$ such that each particle is subject to a resistance equal to its momentum, interacts with the other particles through a pair potential, and is repelled by potential forces exerted by the two 'walls' at $0$ and $N$. The equation of motion for the system is written as follows:
\[ \frac{d}{dt} q_i(t) = p_i(t), \tag{1.1a} \]
\[ \frac{d}{dt} p_i(t) = -p_i(t) - \sum_{j \ne i} U'(q_i(t) - q_j(t)) - W_-'(q_i(t)) - W_+'(q_i(t) - N), \tag{1.1b} \]
where $q_i(t)$ and $p_i(t)$ denote, respectively, the location and the momentum of the $i$-th particle at time $t \ge 0$, and $U'$, $W_\pm'$ the derivatives of the functions $U$, $W_\pm$. The pair potential function $U(x)$, defined for $x \in \mathbb{R} \setminus \{0\}$, is supposed to be continuously differentiable and to satisfy
\[ U(x) = U(-x); \qquad U(0+) = \infty; \tag{1.2a} \]
\[ \liminf_{\delta \downarrow 0} \frac{1}{\psi(\delta)} \inf_{0 < x < \delta} \psi(x) > 0, \tag{1.2b} \]
where
\[ \psi(x) := -x U'(x), \qquad x \ne 0. \]
For the wall potentials $W(x) = W_-(x)$ or $W_+(-x)$, $x > 0$, we suppose
\[ W(0+) = \infty, \qquad W'(x) \le 0, \tag{1.3a} \]
\[ \lim_{x \to \infty} x W'(x) = 0. \tag{1.3b} \]
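The equation of motion (1.1) can be explored numerically. The following is only a minimal sketch under assumptions that are not in the paper: the potentials $U(x) = |x|^{-\gamma}/\gamma$ (so that $\psi(x) = |x|^{-\gamma}$) and $W(x) = x^{-2}$ are illustrative choices compatible with (1.2) and (1.3), and the semi-implicit Euler scheme and all function names are hypothetical.

```python
import math

def pair_force(d, gamma):
    # -U'(d) for the illustrative choice U(x) = |x|**(-gamma) / gamma,
    # for which psi(x) = -x U'(x) = |x|**(-gamma).
    return math.copysign(abs(d) ** (-gamma - 1.0), d)

def wall_force(q, n):
    # -W_-'(q) - W_+'(q - n) for the illustrative wall potential W(x) = x**(-2),
    # with W_-(x) = W(x) and W_+(x) = W(-x).
    return 2.0 * q ** (-3) - 2.0 * (n - q) ** (-3)

def step(q, p, gamma, dt):
    # One semi-implicit Euler step of (1.1): update momenta, then positions.
    n = len(q)
    force = []
    for i in range(n):
        f = -p[i] + wall_force(q[i], n)      # friction plus wall repulsion
        for j in range(n):
            if j != i:
                f += pair_force(q[i] - q[j], gamma)
        force.append(f)
    p_new = [p[i] + dt * force[i] for i in range(n)]
    q_new = [q[i] + dt * p_new[i] for i in range(n)]
    return q_new, p_new

def simulate(q, p, gamma=0.5, dt=1e-3, steps=1000):
    # Integrate the system on the segment (0, N), N = len(q).
    for _ in range(steps):
        q, p = step(q, p, gamma, dt)
    return q, p
```

Because the forces are singular at collisions and at the walls, a fixed small step is used here only for well-separated initial data; an adaptive integrator would be a more careful choice.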
The assumption (1.2b) implies in particular that ψ(x) either diverges to infinity or is bounded off both zero and infinity as x → 0. Thus the pair interaction works repulsively
Scaling Limit for Mechanical System of Interacting Particles
in a neighborhood of the origin. We shall assume additional conditions on $U$ that imply it is repulsive on the whole space or for large values of $|x|$. In our previous paper [2] we studied the case when the potential has short range: $\int_1^\infty |U(x)|\,dx < \infty$, or, what amounts to the same thing if $U'(x) \le 0$ for all sufficiently large values of $x$,
\[ \int_1^\infty \psi(x)\,dx < \infty. \tag{1.4} \]
In the present paper we are concerned with the long range case:
\[ \psi(x) \sim \frac{1}{|x|^\gamma} \quad \text{as } |x| \to \infty, \tag{1.5} \]
where $\gamma$ is a constant such that $0 < \gamma \le 1$ and $\sim$ means that the ratio of the two sides approaches unity. Let us introduce the macroscopic position variables
\[ x_i(t) = \frac{1}{N} q_i(\lambda_N t) \tag{1.6} \]
and the normalized counting measure
\[ \alpha_t^N(d\theta) = \frac{1}{N} \sum_{i=1}^N \delta_{x_i(t)}(d\theta), \qquad \theta \in (0, 1): \]
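for an open subset $A \subset (0, 1)$ it takes the value $\alpha_t^N(A) = N^{-1} \#\{i \mid x_i(t) \in A\}$. The numbers $\lambda_N$ are taken to be $N^2$ in the short range case (1.4). In the long range case (1.5) they are chosen as follows:
\[ \lambda_N = \begin{cases} \dfrac{N^2}{\log N} & \text{if } \gamma = 1, \\[1ex] N^{1+\gamma} & \text{if } \gamma < 1. \end{cases} \tag{1.7} \]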
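The scaling (1.6)-(1.7) and the normalized counting measure can be sketched in a few lines. The function names are hypothetical; the formulas are those of (1.6) and (1.7), and the measure is evaluated on an open interval only.

```python
import math

def time_scale(n, gamma):
    # lambda_N of (1.7) for the long range case 0 < gamma <= 1.
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    if gamma == 1.0:
        return n * n / math.log(n)
    return float(n) ** (1.0 + gamma)

def macroscopic_positions(q):
    # Spatial part of (1.6): microscopic positions in (0, N) mapped to (0, 1).
    n = len(q)
    return [qi / n for qi in q]

def counting_measure(x, a, b):
    # alpha^N of the open interval (a, b): the fraction of particles inside it.
    return sum(1 for xi in x if a < xi < b) / len(x)
```

For example, `counting_measure(macroscopic_positions(q), 0.0, 0.5)` gives the fraction of particles in the left half of the rescaled segment.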
We assume as in [2] the following bound for the total energy of the initial phase $(p_i, q_i)_{i=1}^N$: as $N \to \infty$,
\[ \frac{1}{2} \sum_{i=1}^N p_i^2 + \frac{1}{2} \sum_{i,j(\ne)} U(q_i - q_j) + \sum_{i=1}^N [W_-(q_i) + W_+(q_i - N)] = o(N^3), \tag{1.8} \]
where the second sum extends over all ordered pairs $(i, j)$ with $i \ne j$, $1 \le i, j \le N$. We will regard $\alpha_t^N$ as an element of the space of all probability measures on the closed interval $[0, 1]$, which we endow with the topology of weak convergence: a sequence $(\alpha^N)_N$ in this space converges to an element $\alpha$ if and only if $\alpha^N(J) \to \alpha(J)$ as $N \to \infty$ for every continuous function $J$ on $[0, 1]$, where $\alpha(J)$ stands for the integral of $J$ by $\alpha$. In the following theorems it is supposed that the sequence of initial phases $(p_i, q_i)_{i=1}^N = (p_i(0), q_i(0))_{i=1}^N$, $N = 1, 2, \ldots$, satisfies (1.8) and that
\[ \alpha_0^N \longrightarrow \mu_o \quad \text{as } N \to \infty, \tag{1.9} \]
where µo is a probability measure on [0,1]. In the short range case we have the following result according to [2].
Theorem ([2]). Suppose that $U$ has the short range (1.4) and is strictly convex on $(0, \infty)$, and let $\lambda_N = N^2$ in (1.6). Then $\alpha_t^N$ converges uniformly for $t \le T$ for each $T < \infty$, the limit measure is absolutely continuous relative to the Lebesgue measure $d\theta$, and the density, $\rho(\theta, t)$ say, is a unique weak solution of the non-linear diffusion equation
\[ \frac{\partial}{\partial t} \rho(\theta, t) = \frac{\partial^2}{\partial \theta^2} P(\rho(\theta, t)), \qquad 0 < \theta < 1, \ t > 0 \]
that satisfies the boundary condition $(\partial/\partial\theta)\rho(0, t) = (\partial/\partial\theta)\rho(1, t) = 0$, $t > 0$, as well as the initial condition $\rho(\theta, t)\,d\theta \longrightarrow \mu_o(d\theta)$ as $t \downarrow 0$, where $P(u) = -\sum_{k=1}^\infty k U'(k/u)$ for $u > 0$ and $P(0) = 0$.

In what follows we suppose that $U$ has the long range (1.5) and that
\[ \psi(x) \ge 0 \quad \text{for } x \in \mathbb{R} \tag{1.10a} \]
or
\[ \psi(x)|x| \longrightarrow \infty \quad \text{as } x \to 0. \tag{1.10b} \]
Our main theorems in this paper are now advanced.

Theorem 1. Suppose that (1.5) is satisfied with $\gamma = 1$. Then $\alpha_t^N$ converges uniformly for $t \le T$ for each $T < \infty$, the limit measure is absolutely continuous relative to the Lebesgue measure $d\theta$, and the density, $\rho(\theta, t)$ say, is a unique weak solution of the non-linear diffusion equation
\[ \frac{\partial}{\partial t} \rho(\theta, t) = \frac{\partial^2}{\partial \theta^2} \rho^2(\theta, t), \qquad 0 < \theta < 1, \ t > 0 \tag{1.11} \]
that satisfies the boundary condition $(\partial/\partial\theta)\rho(0, t) = (\partial/\partial\theta)\rho(1, t) = 0$, $t > 0$, as well as the initial condition $\rho(\theta, t)\,d\theta \longrightarrow \mu_o$ as $t \downarrow 0$.

Theorem 2. Suppose that (1.5) is satisfied with $0 < \gamma < 1$. Then $\alpha_t^N$ converges, as $N \to \infty$, to a probability-measure-valued continuous function, $\mu_t$ say, uniformly for $0 \le t \le T$ for each $T < \infty$, and $\mu_t$ is a continuous measure (i.e., has no point mass) for each $t > 0$ and is characterized as a unique solution, starting at the initial measure $\mu_o$, to the non-linear integro-differential equation
\[ \frac{d}{dt} \mu_t(J) = \frac{1}{2} \int_0^1 \!\! \int_0^1 \frac{J'(\theta) - J'(\theta')}{\theta - \theta'} \cdot \frac{\mu_t(d\theta)\mu_t(d\theta')}{|\theta - \theta'|^\gamma}, \qquad t > 0 \tag{1.12} \]
to be valid for every smooth testing function $J$ with $J'(0) = J'(1) = 0$.

If Eq. (1.12) can be written for the whole space $\mathbb{R}$ instead of the unit interval and $\mu_t$ has a density $\rho(\theta, t)$, it is a weak form of the integro-differential equation
\[ \frac{\partial}{\partial t} \rho(\theta, t) = \frac{1}{\gamma} \frac{\partial}{\partial \theta} \left[ \rho(\theta, t) \frac{\partial}{\partial \theta} \int_{-\infty}^\infty \frac{\rho(\theta', t)}{|\theta - \theta'|^\gamma}\,d\theta' \right]. \tag{1.13} \]

An outline of the proofs of Theorems 1 and 2 will be given in Sect. 2. The nonlocality makes it difficult to control the non-linearity, which is of a singular nature. The key idea to overcome this is the transform given by (2.13), which sends the probability measure $\alpha_t^N(d\theta)$ to a function $\hat\alpha_t^N(\theta, \theta')$ of two variables. We shall prepare a few lemmas concerning this transform in Sect. 3 and various moment estimates related to the empirical distribution $\alpha_t^N$ in Sect. 4. Theorems 1 and 2 will be proved in Sects. 5 and 6, respectively.
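To get a feel for the limit equation (1.11), one can integrate it numerically. The sketch below is not part of the paper's argument: it is an explicit finite-volume scheme on a uniform grid with zero flux of $\rho^2$ through both end points (which encodes the boundary condition of Theorem 1), and the function names, grid and step sizes are assumptions chosen for illustration.

```python
import math

def step_diffusion(rho, h, dt):
    # One explicit step of rho_t = (rho^2)_{theta theta} in conservative form.
    m = len(rho)
    flux = [0.0] * (m + 1)              # flux[i] lives on the left face of cell i
    for i in range(1, m):
        flux[i] = (rho[i] ** 2 - rho[i - 1] ** 2) / h
    # flux[0] = flux[m] = 0: no mass crosses the end points 0 and 1.
    return [rho[i] + dt * (flux[i + 1] - flux[i]) / h for i in range(m)]

def solve(rho0, t_end, dt):
    # Advance cell averages rho0 on [0, 1] up to time t_end.
    m = len(rho0)
    h = 1.0 / m
    rho = list(rho0)
    for _ in range(int(round(t_end / dt))):
        rho = step_diffusion(rho, h, dt)
    return rho
```

The explicit scheme is stable only when `dt` is small compared with `h**2 / (4 * max(rho))`; total mass `sum(rho) / len(rho)` is preserved by construction, mirroring the fact that the particle number is the only conserved quantity.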
2. Outline of Proofs

Put $v_i(t) = p_i(\lambda_N t)$. Then the equation of motion (1.1) becomes
\[ \frac{d}{dt} x_i(t) = \lambda_N N^{-1} v_i(t), \tag{2.1a} \]
\[ \frac{d}{dt} v_i(t) = -\lambda_N v_i(t) - \lambda_N \sum_{j \ne i} U'(N(x_i(t) - x_j(t))) - \lambda_N W_-'(N x_i(t)) - \lambda_N W_+'(N(x_i(t) - 1)). \tag{2.1b} \]
For the total energy
\[ E^N(t) := \frac{1}{2} \sum_{i=1}^N v_i^2(t) + \frac{1}{2} \sum_{i,j(\ne)} U(N(x_i(t) - x_j(t))) + \sum_{i=1}^N [W_-(N x_i(t)) + W_+(N(x_i(t) - 1))], \]
we calculate its derivative to obtain
\[ \frac{d}{dt} E^N(t) = -\lambda_N \sum_{i=1}^N v_i^2(t) \le 0; \]
in particular $E^N(t) \le E^N(0)$ and
\[ \epsilon_N \int_0^T \frac{1}{N} \sum_{i=1}^N v_i^2(t)\,dt \le \frac{1}{N^3} E^N(0) + \frac{1}{N} c_U, \tag{2.2} \]
where we put $c_U = -\inf_x U(x)$ and
\[ \epsilon_N = \frac{\lambda_N}{N^2}. \tag{2.3} \]
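Let $J$ be a smooth function on the closed interval $[0, 1]$. As in [2] we obtain
\[ \begin{aligned} \alpha_t^N(J) - \alpha_0^N(J) ={}& \frac{\epsilon_N}{N} \int_0^t \sum_{i=1}^N J''(x_i) v_i^2\,ds - \frac{1}{N^2} \sum_{i=1}^N J'(x_i) v_i \Big|_{s=0}^{t} - \frac{\epsilon_N}{N} \int_0^t \sum_{i=1}^N J'(x_i) B_N(x_i)\,ds \\ &+ \frac{\epsilon_N}{2N} \int_0^t \sum_{i,j(\ne)} \frac{J'(x_i) - J'(x_j)}{x_i - x_j} \psi(N(x_i - x_j))\,ds, \end{aligned} \tag{2.4} \]
where
\[ B_N(x) = N[W_-'(N x) + W_+'(N(x - 1))]. \]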
In view of (2.2) the first term on the right-hand side is dominated by kJ 00 k∞ N −3 E(0); hence converges to zero as N → ∞ owing to the hypothesis (1.8). The second term also vanishes since 1/2 X X 1 −3 2 N |v v sup (t)| ≤ sup (t) ≤ [2N −3 E N (0) + N −1 cU ]1/2 . (2.5) i i N2 t t Taking J(x) = (x − 21 )2 in (2.4) and applying (2.2) and the assumption (1.3b) we see Z
N T X
N 0
i=1
E N (0) |BN (xi )|dt ≤ N + C + 2N N2
Z
T 0
X
ψ(N (xi − xj ))dt. (2.6)
i,j(6=)
Suppose that $J'(0) = J'(1) = 0$. Then the third term must also converge to zero owing to (1.3b) and in view of Lemma 2.1 below. These result in
\[ \alpha_t^N(J) - \alpha_0^N(J) = \frac{\epsilon_N}{2N} \int_0^t \sum_{i,j(\ne)} \frac{J'(x_i) - J'(x_j)}{x_i - x_j} \psi(N(x_i - x_j))\,ds + o(1), \tag{2.7} \]
where $o(1)$ is locally uniform in $t$.

Lemma 2.1. For each $T$,
\[ \epsilon_N \int_0^T \sum_{i,j(\ne)} |\psi(N(x_i(t) - x_j(t)))|\,dt = O(N). \tag{2.8} \]
For $l > 0$ we put
\[ \psi_l^*(x) = \psi(x)\chi(|x| < l), \]
where $\chi(|x| < l)$ denotes the indicator function of the statement $|x| < l$. The essential contribution to the integral in (2.8) comes from those pairs of particles that are far apart from each other in the microscopic length scale, as asserted in

Lemma 2.2. For each $T$ and $l > 0$,
\[ \frac{\epsilon_N}{2N} \int_0^T \sum_{i,j(\ne)} \psi_l^*(N(x_i(t) - x_j(t)))\,dt \to 0 \quad \text{as } N \to \infty. \tag{2.9} \]
Lemmas 2.1 and 2.2 will be proved at the end of Sect. 4. Suppose $\gamma < 1$. Then $\epsilon_N = N^{\gamma - 1}$ and Lemma 2.2 allows us to replace (2.7) by
\[ \alpha_t^N(J) - \alpha_0^N(J) = \int_0^t ds\, \frac{1}{2N^2} \sum_{i,j(\ne)} \frac{J'(x_i) - J'(x_j)}{x_i - x_j} \cdot \frac{1}{(|x_i - x_j| + 1/N)^\gamma} + o(1), \tag{2.10} \]
where $o(1)$ vanishes as $N \to \infty$. By means of $\alpha_t^N$, (2.10) may be expressed as follows:
\[ \alpha_t^N(J) - \alpha_0^N(J) = \int_0^t ds \iint_{\theta' > \theta} \frac{J'(\theta') - J'(\theta)}{\theta' - \theta} \cdot \frac{\alpha_s^N(d\theta)\alpha_s^N(d\theta')}{(\theta' - \theta + 1/N)^\gamma} + o(1). \tag{2.11} \]
Were it not for the singularity of the kernel $|\theta' - \theta|^{-\gamma}$, we could conclude without any difficulty that the limit density of $\alpha_t^N$ solves the integro-differential equation (1.12) in the case $0 < \gamma < 1$. When $\gamma = 1$ the problem is subtler. Since
\[ \epsilon_N = \frac{1}{\log N} \quad \text{if } \gamma = 1, \]
in place of (2.11) we have
\[ \alpha_t^N(J) - \alpha_0^N(J) = \frac{1}{\log N} \int_0^t ds \iint_{\theta' > \theta} \frac{J'(\theta') - J'(\theta)}{\theta' - \theta} \cdot \frac{\alpha_s^N(d\theta)\alpha_s^N(d\theta')}{\theta' - \theta + 1/N} + o(1). \tag{2.12} \]
If $\alpha_t^N(d\theta)$ had converged to $\rho_t(\theta)\,d\theta$ sufficiently strongly, then this integral would converge to
\[ \int_0^t ds \int_0^1 J''(\theta)\rho_s^2(\theta)\,d\theta, \]
yielding the weak form of the desired equation (1.11). We shall, however, not prove the strong convergence of $\rho_t^N(\theta)$ itself. Instead we shall consider the "average"
\[ \hat\alpha_t^N(\theta, \theta') := \frac{1}{\theta' - \theta + 1/N} \int_\theta^{\theta'} \alpha_t^N(dr), \tag{2.13} \]
and prove that $\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})$ converges to $\rho(\theta, t)$ strongly enough as a sequence of functions of three variables $(\theta, t, \lambda) \in [0, 1] \times [0, T] \times [0, 1]$. The proof will then be accomplished by further rewriting the integral on the right-hand side of (2.12) in terms of $\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})$, which may be done in a straightforward way. The introduction of the average is essential in our approach not only for this purpose but also for obtaining some moment bounds necessary to control the singularity involved in (2.11) and (2.12).

To conclude this section we notice that the parameter $\gamma$ is allowed to take values also from the interval $(-1, 0]$ in the condition (1.5), with analogous scaling limit results, as one may readily recognise: the proof is straightforward since no singularity arises in (2.11) for such values of $\gamma$. (Of course this is because the space variable $\theta$ is confined to a compact set; if the system is considered on the whole real line $\mathbb{R}$ instead of the unit interval, this is not the case, although we can still obtain the result under a certain additional condition on the initial configurations (cf. [3]).)
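The average (2.13) is easy to evaluate for a given particle configuration. The sketch below is illustrative only: the function name is hypothetical, and the half-weight treatment of particles sitting exactly at an end point is borrowed from the convention the paper states in (3.1).

```python
def alpha_hat(x, theta, theta_p):
    # (2.13): empirical mass of [theta, theta_p] divided by theta_p - theta + 1/N,
    # where x lists the macroscopic positions of the N particles.
    n = len(x)
    mass = 0.0
    for xi in x:
        if theta < xi < theta_p:
            mass += 1.0 / n
        elif xi == theta or xi == theta_p:
            mass += 0.5 / n          # half weight at end points, as in (3.1)
    return mass / (theta_p - theta + 1.0 / n)
```

For particles spread evenly over $(0, 1)$ the value is close to 1 on any window much longer than $1/N$, which is the sense in which $\hat\alpha_t^N$ approximates a density.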
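3. Lemmas on the Function $\hat\mu(x, y)$

Let $\mu(dx)$ be a finite measure on the unit interval that has no (point) mass at both end points ($\mu(\{0, 1\}) = 0$), and let $a$ and $\gamma$ be constants such that
\[ a > 0 \quad \text{and} \quad 0 \le \gamma \le 1. \]
For a Borel function $\varphi(x)$ and for $0 \le x < y \le 1$ we define
\[ \int_x^y \varphi(r)\mu(dr) = \int_{x+}^{y-} \varphi(r)\mu(dr) + \frac{1}{2}[\varphi(x)\mu\{x\} + \varphi(y)\mu\{y\}] \tag{3.1} \]
($\mu\{x\} = \mu(\{x\})$; $x+$ and $y-$ indicate that the end points $x$ and $y$ are not included in the range of integration). We put
\[ \hat\mu(x, y) = \frac{1}{y - x + a} \int_x^y \mu(dr) \quad \text{for } 0 < x < y < 1. \]

Lemma 3.1. Let $p = 0$ or $1$. Then
\[ \begin{aligned} \iint_{x<y} \frac{\mu(dx)\mu(dy)}{(y - x + a)^{p+\gamma}} \left( \int_x^y \mu(dr) \right)^p ={}& \frac{(p + \gamma)(p + 1 + \gamma)}{(p + 2)(p + 1)} \iint_{x<y} \frac{\hat\mu^{p+2}(x, y)\,dx\,dy}{(y - x + a)^\gamma} \\ &+ \frac{p + \gamma}{(p + 2)(p + 1)} \int_0^1 \left[ \hat\mu^{p+2}(0, x)(x + a)^{1-\gamma} + \hat\mu^{p+2}(x, 1)(1 - x + a)^{1-\gamma} \right] dx \\ &+ \frac{\left( \int_0^1 \mu(dr) \right)^{p+2}}{(p + 2)(p + 1)(1 + a)^{p+\gamma}} - \frac{4p}{3a^{1+\gamma}} \sum_x \left( \frac{\mu\{x\}}{2} \right)^3, \end{aligned} \]
where $\hat\mu^k(x, y) = [\hat\mu(x, y)]^k$ and $\iint_{x<y}$ means that the integration ranges over the triangle $0 < x < y < 1$.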
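As a sanity check, the identity of Lemma 3.1 can be tested numerically in the simplest case. The sketch below takes $\mu$ to be the Lebesgue measure on $[0, 1]$ (an assumption made for illustration; it has no atoms, so the last sum vanishes), for which $\hat\mu(x, y) = (y - x)/(y - x + a)$ and every double integral over the triangle reduces to $\int_0^1 (1 - u)F(u)\,du$ with $u = y - x$. The function names and the midpoint quadrature are hypothetical choices.

```python
def midpoint(f, n=20000):
    # Midpoint rule for the integral of f over [0, 1].
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def lemma31_sides(p, gamma, a):
    # Both sides of Lemma 3.1 for mu = Lebesgue measure on [0, 1].
    lhs = midpoint(lambda u: (1.0 - u) * u ** p * (u + a) ** (-p - gamma))
    c1 = (p + gamma) * (p + 1 + gamma) / ((p + 2) * (p + 1))
    c2 = (p + gamma) / ((p + 2) * (p + 1))
    bulk = midpoint(
        lambda u: (1.0 - u) * (u / (u + a)) ** (p + 2) * (u + a) ** (-gamma))
    left_edge = midpoint(
        lambda x: (x / (x + a)) ** (p + 2) * (x + a) ** (1.0 - gamma))
    right_edge = midpoint(
        lambda x: ((1.0 - x) / (1.0 - x + a)) ** (p + 2)
        * (1.0 - x + a) ** (1.0 - gamma))
    total_mass_term = 1.0 / ((p + 2) * (p + 1) * (1.0 + a) ** (p + gamma))
    rhs = c1 * bulk + c2 * (left_edge + right_edge) + total_mass_term
    return lhs, rhs
```

Calling `lemma31_sides(0, 0.5, 0.1)` or `lemma31_sides(1, 0.5, 0.1)` returns two numbers that should agree up to quadrature error; a measure with atoms would additionally require the correction term with $\mu\{x\}$.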
Proof. Let $p = 1$. From our definition (3.1) it follows that
\[ d_y \left( \int_x^y \mu(dr) \right)^2 = 2\mu(dy) \int_x^y \mu(dr) \]
for $y > x$. By integration by parts
\[ \int_{x+}^1 \frac{\mu(dy) \int_x^y \mu(dr)}{(y - x + a)^{1+\gamma}} = \frac{\left( \int_x^1 \mu(dr) \right)^2}{2(1 - x + a)^{1+\gamma}} - \frac{1}{2a^{1+\gamma}} \left( \frac{\mu\{x\}}{2} \right)^2 + \frac{1 + \gamma}{2} \int_{x+}^1 \frac{\left( \int_x^y \mu(dr) \right)^2}{(y - x + a)^{2+\gamma}}\,dy. \]
Integrating by parts once more, but this time by using
\[ \left( \int_x^y \mu(dr) \right)^2 \mu(dx) = -\frac{1}{3} d_x \left( \int_x^y \mu(dr) \right)^3 - \frac{1}{3} \left( \frac{\mu\{x\}}{2} \right)^2 \mu(dx), \]
we have ZZ Z y µ(dx)µ(dy) µ(dr) (y − x + a)1+γ x x
−
X x
3 Z 1 X µ{x} 1 + γ 1 X (µ{x}/2)3 (µ{x}/2)3 − 1+γ − dy . 3(1 − x + a)1+γ a 2 3 (y − x + a)2+γ 0 x x
The last line is seen to reduce to $-(4/3)a^{-1-\gamma} \sum_x (\mu\{x\}/2)^3$ by carrying out the integration of the last term. Thus we have the equality of the lemma. The proof in the case $p = 0$ (which is much simpler) is included in that of the next lemma as a special case.

Lemma 3.2. Let $g$ be a bounded function on $[0, 1]$. Then
\[ \iint_{x<y} \frac{\mu(dx)\mu(dy)}{(y - x + a)^{1+\gamma}} \int_x^y g(r)\,dr = \frac{1 + \gamma}{2} \iint_{x<y} \frac{\hat\mu^2(x, y)\,dx\,dy}{(y - x + a)^\gamma} \left[ \frac{2 + \gamma}{y - x + a} \int_x^y g(r)\,dr - g(x) - g(y) \right] + R \]
with
\[ R \le 2\|g\|_\infty \left( \int_0^1 \left[ \hat\mu^3(0, x)(x + a)^{1-\gamma} + \hat\mu^3(x, 1)(1 - x + a)^{1-\gamma} \right] dx \right)^{2/3} + \|g\|_\infty \left( \int_0^1 \mu(dr) \right)^2. \]
Proof. The integration by parts turns the left-hand side of the equality of the lemma into R1 R1 Ry ZZ µ(dx) x µ(dr) x g(r)dr 1 ( x µ(dr))2 + g(y)dydx (1 − x + a)1+γ 2 (y − x + a)1+γ 0 x
Z
1
x
By integrating by parts once more this becomes ZZ Z y 1+γ 2+γ dxdy µˆ 2 (x, y) g(r)dr − g(x) − g(y) 2 y−x+a x (y − x + a)2+γ x 0, Z y p ZZ µn (dx)µn (dy) µn (dr) < ∞. (3.2) sup (y − x + 1/n)p+γ n x Then
\[ \iint_{x<y} \mu(dx)\mu(dy)(y - x)^{-\gamma} < \infty \]
and, as $n \to \infty$,
\[ \frac{\mu_n(dx)\mu_n(dy)}{(|y - x| + 1/n)^\gamma} \longrightarrow \frac{\mu(dx)\mu(dy)}{|y - x|^\gamma}, \]
where the convergence takes place in the dual space of the set of continuous functions.

Proof. It suffices to show that
\[ \iint_{x<y} \frac{\mu_n(dx)\mu_n(dy)}{(y - x + 1/n)^\gamma} \longrightarrow \iint_{x<y} \frac{\mu(dx)\mu(dy)}{(y - x)^\gamma}. \tag{3.3} \]
In fact this shows that the function $(|x - y| + 1/n)^{-\gamma}$ is uniformly integrable with respect to $\mu_n(dx)\mu_n(dy)$, i.e.,
\[ \lim_{\epsilon \downarrow 0} \limsup_{n \to \infty} \iint_{|x - y| < \epsilon} \frac{\mu_n(dx)\mu_n(dy)}{(|y - x| + 1/n)^\gamma} = 0. \tag{3.4} \]
For the proof of (3.3) look at the equality of Lemma 3.1 and substitute $\mu_n$ for $\mu$ and $1/n$ for $a$. With the help of Hölder's inequality we then deduce from the hypothesis (3.2) that the following two integrals:
\[ \iint_{0 < y - x < \epsilon} \frac{(\hat\mu_n(x, y))^2\,dx\,dy}{(y - x + 1/n)^\gamma}, \qquad \int_0^\epsilon \left[ (\hat\mu_n(0, x))^2 + (\hat\mu_n(1 - x, 1))^2 \right] (x + 1/n)^{1-\gamma}\,dx \]
converge to zero as $n \to \infty$ and $\epsilon \downarrow 0$ in this order. Now we look at the equality of Lemma 3.1 again, but this time with $p = 0$. We have just observed that the integrals appearing on the right-hand side are uniformly integrable in the sense analogous to (3.4). Since $\hat\mu_n(x, y)$ converges to $\hat\mu(x, y)$ pointwise and boundedly whenever $|x - y| \ge \epsilon$ for each positive $\epsilon$, we can now conclude that (3.3) holds.

4. Moment Bounds for $\hat\alpha_t^N(\theta, \theta')$ and $\sum \psi_l^*(N(x_i - x_j))$
In our proof of Theorem 2 we assume for simplicity that
\[ \psi(x) > 0 \quad \text{for } |x| < 1, \tag{4.1} \]
which gives rise to no loss of generality.

Lemma 4.1. Let $\omega(x)$ be an even function of $x \in \mathbb{R} \setminus \{0\}$ such that $\omega \ge 0$; $\omega(x)$ is positive and continuous for $0 < |x| < 1$; $\omega(x)$ converges to a positive number or diverges to infinity as $x \to 0$ according as $\psi$ is bounded or unbounded. Then for each $\delta > 0$ and $l > 0$ there exists a constant $C$ (independent of $N$) such that
\[ \sum_{i,j(\ne)} |\psi_l^*(N(x_i - x_j))| \le CN + \delta \Psi_N(x), \]
where $\psi_l^*(x) = \psi(x)\chi(|x| < l)$ (as defined in Sect. 2) and
\[ \Psi_N(x) := \sum_{i,j(\ne)} \psi_1^*(N(x_i - x_j)) \left( \sum_{k \ne i} \omega(N(x_i - x_k)) \right). \]
Proof. This is the same as Lemma 3 of [2].
Let $h(x)$ be a non-negative continuous function such that
\[ h(0) > 0, \qquad h(x) = 0 \ \text{for } |x| > 1 \qquad \text{and} \qquad \int_{\mathbb{R}} h\,dx = 1. \]
Lemma 4.2. Suppose that (1.10b) holds, i.e., $|x|\psi(x) \to \infty$ as $|x| \to 0$. Let $\varphi(x)$ be a positive, integrable continuous function on $\mathbb{R}$ such that $\varphi(y)/\varphi(x)$ is bounded on $\{|x| < |y|\}$. Then for each $\delta > 0$ there exists a constant $C$ (independent of $N$) such that
\[ \sum_{i=1}^N \left( \sum_{k \ne i} \varphi(N(x_i - x_k)) \right)^2 \le CN + \delta \sum_{i,j(\ne)} \psi_1^*(N(x_i - x_j)) \left( \sum_{k \ne i} h(N(x_i - x_k)) \right). \]
Proof. For $\ell > 0$ put
\[ n_\ell(\theta) = n_{N,\ell}(\theta; x) := \#\left\{ j : \theta - \frac{\ell}{2N} < x_j \le \theta + \frac{\ell}{2N} \right\}. \]
We may suppose that $h(x) > b$ for $|x| < 1/3$, where $b$ is a positive constant. In [2: Lemma 4] it is shown that the left-hand side of the inequality of the lemma is bounded by $C_1 N + C_2 \sum_i [n_{1/4}(x_i)]^2$. Hence it suffices to show
\[ \sum_i [n_{1/4}(x_i)]^2 \le CN + \delta \sum_{i,j(\ne)} \psi_1^*(N(x_i - x_j)) \left( \sum_{k \ne i} h(N(x_i - x_k)) \right). \tag{4.2} \]
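By (1.10b) for any $\delta > 0$ there exists a positive integer $m$ such that $\psi(x) > m/(2\delta b)$ for $|x| < 1/m$, so that
\[ \sum_{j \ne i} \psi_1^*(N(x_i - x_j)) \sum_{k \ne i} h(N(x_i - x_k)) \ge \frac{m}{2\delta} [n_{1/2m}(x_i) - 1]^+ [n_{2/3}(x_i) - 1]^+. \]
Summing over $i$ we obtain
\[ \sum_{i,j(\ne)} \psi_1^*(N(x_i - x_j)) \left( \sum_{k \ne i} h(N(x_i - x_k)) \right) \ge \frac{m}{2\delta} \sum_l n_{1/2m}\!\left( \frac{l}{2mN} \right) \left[ n_{1/2m}\!\left( \frac{l}{2mN} \right) - 1 \right]^+ \left[ n_{1/2}\!\left( \frac{l}{2mN} \right) - 1 \right]^+. \tag{4.3} \]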
On the other hand X X [n1/4 (xi )]2 = i
Noticing n1/4 (xi ) ≤ by
i
m−1 X
n1/2m xi +
k=−m−1
q p n1/4 (xi ) n1/2 xi +
k 2mN
1 k + 2mN 4mN +
1 4mN
n1/4 (xi ).
, we dominate the last line
s l l × n1/2m n1/2m 2mN 2mN l=−m k=−m−1 s s l+k l+k l+k−1 l+k−1 × n1/2m + n1/2m . n1/2 n1/2 2mN 2mN 2mN 2mN 4mN X
m−1 X
Applying the Schwarz inequality to the sum over $l$, which may be taken first, we conclude
\[ \sum_i [n_{1/4}(x_i)]^2 \le (2m + 1) \sum_l n_{1/2}\!\left( \frac{l}{2mN} \right) \left[ n_{1/2m}\!\left( \frac{l}{2mN} \right) \right]^2. \tag{4.4} \]
It is easy to see that $4(2m + 1) \sum_l n_{1/2m}\!\left( \frac{l}{2mN} \right) n_{1/2}\!\left( \frac{l}{2mN} \right)$ is dominated by the right-hand side of (4.2) (see the proof of Lemma 3 in [2]). Therefore (4.2) follows from (4.3) and (4.4). The proof of Lemma 4.2 is finished.

Lemma 4.3. There exists a function $\omega(x)$ that satisfies the properties stated in the hypothesis of Lemma 4.1 and is such that for each $l > 1$,
\[ \begin{aligned} &\frac{\epsilon_N}{N} \int_0^T \sum_{i,j,k(\ne)} (\psi - \psi_l^*)(N(x_i - x_j)) \frac{1}{x_j - x_i} \int_{x_i}^{x_j} \omega(N(y - x_k))\,dy\,dt + \frac{\epsilon_N}{N} \int_0^T \Psi_N(x)\,dt \\ &\qquad \le C_1 + C_2 \frac{\epsilon_N}{N} \int_0^T \sum_{i,j(\ne)} (\psi - \psi_l^*)(N(x_i - x_j))\,dt, \end{aligned} \tag{4.5} \]
where C1 and C2 are constants independent of N (but may depend on l).
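Proof. In the case $\psi \ge 0$ the lemma is proved in Sect. 3 of [2] (cf. Lemma 1 and (3.8) of it) if the function $\psi - \psi_l^*$ on the right-hand side of (4.5) is replaced by $\psi$. But the contribution of $\psi_l^*$ is absorbed into the second term of the left-hand side of (4.5) owing to Lemma 4.1. The same proof works also in the other case (1.10b) owing to Lemma 4.2.

Lemma 4.4. If $\gamma < 1$, then
\[ \begin{aligned} &N^{1-\gamma}\epsilon_N \int_0^T dt \iint_{\theta < \theta'} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^{1+\gamma}} \int_\theta^{\theta'} \alpha_t^N(dr) + \frac{\epsilon_N}{N} \int_0^T \Psi_N(x)\,dt \\ &\qquad \le C_1 + C_2 N^{1-\gamma}\epsilon_N \int_0^T dt \iint_{\theta < \theta'} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^\gamma}. \end{aligned} \tag{4.6} \]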
(Notice that $N^{1-\gamma}\epsilon_N = 1$ if $\gamma < 1$ and $= 1/\log N$ if $\gamma = 1$.)

Proof. The lemma is essentially a corollary of Lemma 4.3, although working out the details is somewhat involved. The relation (4.6) is obtained by rewriting the sum in (4.5) by means of $\alpha_t^N$. Our task is to ascertain that the error terms arising therein are really negligible. For the right-hand side of (4.5), replacing the function $(\psi - \psi_l^*)(x)$ by $(x + 1)^{-\gamma}$ (for $x > 0$), we observe
\[ \frac{\epsilon_N}{2N} \sum_{i,j(\ne)} (\psi - \psi_l^*)(N(x_i - x_j)) = N^{1-\gamma}\epsilon_N \iint_{0 < \theta < \theta' < 1} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^\gamma} (1 + o(1)), \tag{4.7} \]
where o(1) → 0 as l → ∞ uniformly in N . We look at the first term on the left-hand side of (4.5). Let h(x) be the non-negative function as introduced just prior to Lemma 4.2 and put hN (θ) = N h(N θ). Then obviously X k
≥
1 xj − xi
Z
xj xi
1 xj − xi + l/N −
X k
hN (y − xk )dy Z
1 xj − xi
xj xi
"Z
αtN (dθ) Z
xi
xj +1/N
#
+ xi −1/N
If we put
Z
xj
hN (y − xk )dy.
\[ \tilde h_\pm(x) = \int_0^1 h(x \pm y)\,dy, \]
the last term may be written as $\sum_k (x_j - x_i)^{-1} [\tilde h_-(N(x_i - x_k)) + \tilde h_+(N(x_j - x_k))]$. Therefore, by replacing $(\psi - \psi_l^*)(x)$ by $(x + 1)^{-\gamma}$ (for $x > 0$) as above,
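\[ \begin{aligned} &\frac{\epsilon_N}{2N} \int_0^T \sum_{i,j,k(\ne)} \frac{(\psi - \psi_l^*)(N(x_i - x_j))}{x_j - x_i} \int_{x_i}^{x_j} h(N(y - x_k))\,dy\,dt \\ &\qquad \ge N^{1-\gamma}\epsilon_N \int_0^T dt \iint_{\theta < \theta'} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^{1+\gamma}} \int_\theta^{\theta'} \alpha_t^N(dr)\,(1 + o(1)) - R_N, \end{aligned} \tag{4.8} \]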
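where $o(1) \to 0$ as $l \to \infty$ uniformly in $N$ and
\[ \begin{aligned} R_N ={}& \frac{\epsilon_N}{N} \int_0^T \sum_{i,j,k(\ne)} \frac{(\psi - \psi_l^*)(N|x_i - x_j|)}{N(x_j - x_i)} [\tilde h_+(N(x_i - x_k)) + \tilde h_-(N(x_j - x_k))]\,dt \\ &+ \frac{\epsilon_N}{N} \int_0^T \sum_{i,j,k} \frac{\chi(N|x_j - x_i| \le l)}{(N|x_i - x_j| + 1)^\gamma} \cdot \frac{\chi(N|x_j - x_k| \le l)}{N|x_j - x_k| + 1}\,dt. \end{aligned} \]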
Let (1.10b) hold. Then, since the function $(\psi - \psi_l^*)(x)/|x|$ is integrable and $\tilde h_\pm$ vanishes outside $[-2, 2]$, Lemma 4.2 says that for any $l > 1$ and for any positive number $\delta$ there exists a constant $C$ such that
\[ R_N < C\epsilon_N + \delta \frac{\epsilon_N}{N} \int_0^T \Psi_N(x)\,dt. \tag{4.9} \]
The inequality of Lemma 4.4 now follows from (4.7), (4.8), (4.9) and Lemma 4.3 under (1.10b). If (1.10a) is assumed instead of (1.10b), then, replacing $(\theta' - \theta + 1/N)^{1+\gamma}$ by $(\theta' - \theta + m/N)^{1+\gamma}$ on the right-hand side of (4.8), we can verify (4.9) by taking $m$ large enough (cf. [2; Corollary 3]). The proof of Lemma 4.4 is complete.

Lemma 4.5. For each $l \ge 1$ and $T$ there exists a constant $C$ such that
\[ \frac{\epsilon_N}{N} \int_0^T \Psi_N(x(t))\,dt \le C, \tag{4.10} \]
\[ \int_0^T dt \iint_{\theta < \theta'} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^{1+\gamma}} \int_\theta^{\theta'} \alpha_t^N(dr) \le \begin{cases} C & \text{if } \gamma < 1, \\ C \log N & \text{if } \gamma = 1. \end{cases} \tag{4.11} \]
Proof. Apply Lemma 3.1 with $\mu = \alpha_t^N$ and with $p = 0, 1$ to the inner integrals on both sides of (4.6). Then with the help of Hölder's inequality it is easy to deduce (4.10) and (4.11).

Proof of Lemmas 2.1 and 2.2. We apply Lemma 3.1 to the double integral on the right-hand side of (4.7). Then Lemma 2.1 follows from Hölder's inequality and (4.11). Lemma 2.2 follows from Lemma 4.1 and (4.11).
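5. Proof of Theorem 1

Let $\gamma = 1$. Recall that $\epsilon_N = 1/\log N$. Let $\hat\alpha_t^N$ be defined by (2.13), namely
\[ \hat\alpha_t^N(\theta, \theta') = \frac{1}{\theta' - \theta + 1/N} \int_\theta^{\theta'} \alpha_t^N(dr). \]
We will apply (4.11) in the following form.

Lemma 5.1. If $\gamma = 1$, then for each $T > 0$,
\[ \int_0^T ds \iint_{\theta < \theta'} \frac{[\hat\alpha_s^N(\theta, \theta')]^3\,d\theta\,d\theta'}{\theta' - \theta + 1/N} = O(\log N), \tag{5.1} \]
\[ \int_0^T ds \int_0^1 \left\{ [\hat\alpha_s^N(0, \theta)]^3 + [\hat\alpha_s^N(\theta, 1)]^3 \right\} d\theta = O(\log N). \tag{5.2} \]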
Proof. We have only to write down the inequality (4.11) in terms of $\hat\alpha_s^N(\theta, \theta')$ according to Lemma 3.1.

We have shown Lemmas 2.1 and 2.2 and hence (2.12). Recalling the manner in which (2.12) follows from Lemma 2.2, we see that it can be written as
\[ \alpha_t^N(J) - \alpha_0^N(J) = \frac{2}{\log N} \int_0^t ds \iint_{\theta < \theta'} [J'(\theta') - J'(\theta)] \frac{\alpha_s^N(d\theta)\alpha_s^N(d\theta')}{(\theta' - \theta + 1/N)^2} + o(1), \tag{5.3} \]
where $o(1) \to 0$ as $N \to \infty$ and $J$ is any smooth function on $[0, 1]$ with $J'(0) = J'(1) = 0$. We apply Lemma 3.2 with $g = J''$ and $\gamma = 1$ to rewrite the inner integral on the right-hand side of (5.3) in terms of $\hat\alpha_t^N(\theta, \theta')$. Since the contribution to the integral from outside a neighborhood of the diagonal $\theta = \theta'$ is negligible owing to the factor $1/\log N$, $\hat J''(\theta, \theta')$ may be replaced by $J''(\theta)$. Comparing the resulting expression with those in (5.1) and (5.2) we obtain
\[ \alpha_t^N(J) - \alpha_0^N(J) = \frac{2}{\log N} \int_0^t ds \iint_{\theta < \theta'} J''(\theta) \frac{[\hat\alpha_s^N(\theta, \theta')]^2\,d\theta\,d\theta'}{\theta' - \theta + 1/N} + o(1). \tag{5.4} \]
Lemma 5.2. The family of empirical measures $\alpha_t^N$, $N = 1, 2, \ldots$, if considered as a sequence of functions from a time interval $[0, T]$ into the space of the probability measures on $[0, 1]$, is equi-continuous.

Proof. Since the space of continuous functions on $[0, 1]$ has a dense, countable subset of smooth functions whose derivatives vanish at $0$ and $1$, it suffices to show the equi-continuity of $\alpha_t^N(J)$ for each testing function $J$. From (5.4) this follows if we show
\[ \lim_{\delta \downarrow 0} \limsup_{N \to \infty} \sup_{0 \le \tau < t \le T,\ t - \tau < \delta} \frac{1}{\log N} \int_\tau^t ds \iint_{\theta < \theta'} \frac{[\hat\alpha_s^N(\theta, \theta')]^2\,d\theta\,d\theta'}{\theta' - \theta + 1/N} = 0. \tag{5.5} \]
We apply Hölder's inequality to the integration relative to $s$ in (5.4). Then the first term on the right-hand side is dominated by
\[ (t - \tau)^{1/3} \frac{\|J''\|_\infty}{\log N} \iint_{\theta < \theta'} \left( \int_\tau^t [\hat\alpha_s^N(\theta, \theta')]^3\,ds \right)^{2/3} \frac{d\theta\,d\theta'}{\theta' - \theta + 1/N}. \]
Applying Hölder's inequality once more together with Lemma 5.1, we see that (5.5) holds.

Owing to (5.1) the range of the inner integral on the right-hand side of (5.4) may be restricted to $\theta' > \theta + 1/N$. Introducing the new variable $\lambda$ by the equation $\theta' - \theta = N^{\lambda - 1}$ we can further rewrite (5.4) as
\[ \alpha_t^N(J) - \alpha_0^N(J) = 2 \int_0^t ds \int_0^1 d\theta \int_0^{\kappa(\theta, N)} J''(\theta) [\hat\alpha_s^N(\theta, \theta + N^{\lambda - 1})]^2\,d\lambda + o(1), \tag{5.6} \]
where
\[ \kappa(\theta, N) = \left( 1 + \frac{\log(1 - \theta)}{\log N} \right)^+, \]
where $x^+$ denotes the positive part of $x \in \mathbb{R}$. The family $\{(\alpha_t^N, 0 \le t \le T) : N \ge 2\}$ is relatively compact according to Lemma 5.1. Let $\alpha_t^N$ converge to a measure-valued continuous function $\mu_t$ along a subsequence $N^*$. Then $\left( \int_0^{\kappa(\theta, N)} \hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})\,d\lambda \right) d\theta$ also converges to $\mu_t(d\theta)$ along $N^*$, as is easily seen. From (5.1) it follows that
\[ \sup_N \int_0^T dt \int_0^1 d\theta \int_0^{\kappa(\theta, N)} [\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})]^3\,d\lambda < \infty. \tag{5.7} \]
By Fatou's lemma $\mu_t$ therefore has a density, $\rho(\theta, t)$ say, such that
\[ \int_0^T \!\! \int_0^1 \rho^2(\theta, t)\,d\theta\,dt < \infty. \tag{5.8} \]
(Clearly the power 2 may be replaced by 3.) Since a weak solution of the Cauchy problem for the non-linear diffusion equation (1.11) satisfying this integrability condition is unique (cf. [4]), Theorem 1 follows from (5.6) if we can show that the measures $\left( \int_0^T \!\! \int_0^{\kappa(\theta, N)} [\hat\alpha_s^N(\theta, \theta + N^{\lambda - 1})]^2\,d\lambda\,ds \right) d\theta$ weakly converge to $\left( \int_0^T \rho^2(\theta, t)\,dt \right) d\theta$. Actually we prove the following somewhat stronger result.

Theorem 3. Let $\rho(\theta, t)$ be a weak solution to Eq. (1.11) having initial measure $\mu_o$ and satisfying the moment condition (5.7). Then
\[ \lim_{N \to \infty} \int_0^T dt \int_0^1 d\theta \int_0^{\kappa(\theta, N)} |\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1}) - \rho(\theta, t)|^2\,d\lambda = 0. \]
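The proof of Theorem 3 is done by means of the Young measure. Define a finite measure $\pi^N$ on $X := [0, T] \times [0, 1] \times [0, 1] \times [0, \infty)$ by
\[ \int_X F\,d\pi^N = \int_0^T dt \int_0^1 d\theta \int_0^{\kappa(\theta, N)} F(t, \theta, \lambda, \hat\alpha_t^N(\theta, \theta + N^{\lambda - 1}))\,d\lambda, \]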
where $F(t, \theta, \lambda, u)$ ranges over all bounded continuous functions on $X$. By (5.7) the family $(\pi^N(dt\,d\theta\,d\lambda\,du);\ N = 1, 2, \ldots)$ constitutes a relatively compact set of measures.

Theorem 4. Let $N^*$ be a subsequence of $N$ along which both $\alpha_t^N$ and $\pi^N$ converge. Let $\rho(\theta, t)\,d\theta$ and $\pi(dt\,d\theta\,d\lambda\,du)$ be the respective limit points along $N^*$. Then the Young measure $\pi(dt\,d\theta\,d\lambda\,du)$ is degenerate at $\rho(\theta, t)$, i.e., $\pi(dt\,d\theta\,d\lambda\,du) = dt\,d\theta\,d\lambda\,\delta_{\rho(\theta, t)}(du)$.

The deduction of Theorem 3 from Theorem 4 is straightforward owing to the moment bound (5.7). The proof of Theorem 4 is similar to that given in [5, 1, or 2] for the corresponding results, except that we define the measure $\pi^N$ by means of $\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})$ instead of $\rho_t^N(\theta)$. For the proof of Theorem 4 we prepare two lemmas. We assume that $N^*$ is chosen as in Theorem 4. Let $p_t(x, y)$, $0 \le x, y \le 1$, $t > 0$, be the fundamental solution for the heat equation $\partial_t u = (\partial^2/\partial x^2)u$ on $[0, 1]$ with the reflecting boundary condition.

Lemma 5.3. For each $l > 1$ and $\delta > 0$,
\[ \liminf^*_{N \to \infty} I_N(\delta/N^2) \le \limsup_{b \downarrow 0} \limsup^*_{N \to \infty} I_N(b), \tag{5.9} \]
where $*$ indicates that the limit is taken along the subsequence $N^*$ specified in Theorem 4 and
\[ I_N(\tau) = \frac{\epsilon_N}{2N^2} \int_0^T \sum_{i,j,k(\ne)} (\psi - \psi_l^*)(N(x_i - x_j)) \frac{1}{x_i - x_j} \int_{x_i}^{x_j} p_\tau(y, x_k)\,dy\,dt. \tag{5.10} \]
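The kernel $p_\tau$ entering (5.10) can be evaluated by the image-sum representation that the proof of Lemma 5.3 uses, $p_\tau(x, y) = \tilde p_\tau(y - x) + \tilde p_\tau(x + y)$ with $\tilde p_\tau(x) = \sum_n g_\tau(x + 2n)$. The sketch below is illustrative: the function names and the truncation of the series at a finite number of images are assumptions.

```python
import math

def gauss(x, tau):
    # g_tau(x) = exp(-x^2 / (4 tau)) / sqrt(4 pi tau), the free heat kernel.
    return math.exp(-x * x / (4.0 * tau)) / math.sqrt(4.0 * math.pi * tau)

def p_tilde(x, tau, terms=50):
    # Periodized kernel: sum of g_tau(x + 2n), truncated to |n| <= terms.
    return sum(gauss(x + 2.0 * n, tau) for n in range(-terms, terms + 1))

def p_reflect(x, y, tau, terms=50):
    # Heat kernel on [0, 1] with reflecting boundary condition.
    return p_tilde(y - x, tau, terms) + p_tilde(x + y, tau, terms)
```

Two properties worth checking numerically are the symmetry `p_reflect(x, y, tau) == p_reflect(y, x, tau)` and conservation of mass, i.e. the integral of `p_reflect(x, y, tau)` over `y` in `[0, 1]` equals 1 for every `x`.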
Proof. We will apply the following representation of $p_\tau$: $p_\tau(x, y) = \tilde p_\tau(y - x) + \tilde p_\tau(x + y)$, where
\[ \tilde p_\tau(x) = \sum_{n=-\infty}^\infty g_\tau(x + 2n), \qquad g_\tau(x) = \frac{1}{\sqrt{4\pi\tau}} \exp\left( -\frac{x^2}{4\tau} \right). \]
As in [2; Proof of Theorem 7] we compute
\[ \frac{d}{dt} \sum_{i,j(\ne)} \partial_x p_\tau(x_i, x_j) v_i \]
and deduce from Eq. (2.1)
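\[ \begin{aligned} &\frac{\epsilon_N}{2N^2} \int_0^T \sum_{i,j,k(\ne)} [-N U'(N(x_i - x_j))][\partial_x p_\tau(x_i, x_k) - \partial_x p_\tau(x_j, x_k)]\,dt + \frac{\epsilon_N}{N^2} \int_0^T \sum_{i,j(\ne)} [-N U'(N(x_i - x_j))]\,\tilde p_\tau'(x_i - x_j)\,dt \\ &\quad = \frac{1}{N^3} \sum_{i,j(\ne)} v_i \partial_x p_\tau(x_i, x_j) \Big|_{t=0}^{T} + \frac{\epsilon_N}{2N^2} \sum_{i,j(\ne)} p_\tau(x_i, x_j) \Big|_{t=0}^{T} \\ &\qquad - \frac{\epsilon_N}{2N^2} \int_0^T \sum_{i,j(\ne)} [(v_i - v_j)^2 \partial_\tau \tilde p_\tau(x_i - x_j) + (v_i + v_j)^2 \partial_\tau \tilde p_\tau(x_i + x_j)]\,dt \\ &\qquad + \frac{\epsilon_N}{N^2} \int_0^T \sum_{i,j(\ne)} B_N(x_i) \partial_x p_\tau(x_i, x_j)\,dt, \end{aligned} \tag{5.11} \]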
where $B_N(x) := N W_-'(N x) + N W_+'(N(x - 1))$ as before, $\partial_x$ denotes the partial differentiation with respect to the first argument of $p_\tau$, and the time variable is suppressed from the functions $x_i(t)$, $v_i(t)$, etc. for the sake of brevity. Integrate both sides of (5.11) with respect to $\tau$ on an interval $[\delta/N^2, b]$, where $b$ and $\delta$ are positive constants. Since
\[ \int_a^b [\partial_x p_\tau(x_i, x_k) - \partial_x p_\tau(x_j, x_k)]\,d\tau = \int_{x_i}^{x_j} [p_b(u, x_k) - p_a(u, x_k)]\,du, \]
the first term on the left-hand side then becomes
\[ \frac{\epsilon_N}{2N^2} \int_0^T \sum_{i,j,k(\ne)} \psi(N(x_i - x_j)) \frac{1}{x_i - x_j} \int_{x_i}^{x_j} [p_b(u, x_k) - p_{\delta/N^2}(u, x_k)]\,du\,dt. \tag{5.12} \]
For the proof of the lemma we must prove that this is non-negative in the limit as first $N \to \infty$ and then $b \downarrow 0$. To this end let us examine the other terms one by one. We first observe that if $m$ is a positive constant such that $U'(x) < 0$ for $x > m$, then the contribution of the second term on the left-hand side is bounded above by a constant multiple of
\[ \frac{\epsilon_N}{N} \int_0^T \sum_{i,j(\ne)} |U'(N(x_i - x_j))|\,\chi(1 < N|x_i - x_j| < m)\,dt, \]
since we have $\tilde p_\tau'(x)x \le 0$ for $|x| < 1$ as well as
\[ -\int_a^b g_\tau'(z)\,d\tau = \int_{z/\sqrt{b}}^{z/\sqrt{a}} g_1(u)\,du \le 1 \quad \text{for } z \ge 0. \tag{5.13} \]
This upper bound vanishes as N → ∞ according to Lemma 2.1. We turn to the right-hand side. In view of (2.2) and the inequality (2.5) the contribution of the sum p of the first two terms on the right-hand side of (5.11) is estimated in absolute value by Cb E N (0)/N 3 + Rb N k 0 pτ dτ k∞ and that of the third term is bounded below by 4N kgb k∞ E N (0)/N 3 . Thus they all vanish as N → ∞. Let us show that the negative part of the last term (i.e., the boundary term) also vanishes in the limit. By symmetry we have only to look at the contribution of xi ≤ 1/2.
Owing to (1.3b), $W_+'$ may be ignored. Our estimation will be based on (2.6), which together with Lemma 2.1 implies that
\[ \sup_N \epsilon_N \int_0^T \sum_{i=1}^N [-W_-'(N x_i)]\,dt < \infty. \tag{5.14} \]
Taking these into account and noticing that $\partial_x p_\tau(x, y) < 0$ if $y < x \le 1/2$, we see that it suffices to prove
\[ \lim_{b \downarrow 0} \limsup_{N \to \infty} \frac{\epsilon_N}{N} \int_0^T dt \sum_{i,j(\ne)} |W_-'(N x_i)| \int_{\delta/N^2}^b \partial_x p_\tau(x_i, x_j)\,d\tau\; \chi\!\left( x_i < x_j \le \frac{1}{4} \right) = 0. \]
We divide the summation over i, j(6=) into two parts according to √ √ xi < b or xi ≥ b . The contribution of the second part vanishes owing to (1.3b). As for the first part we apply the equality in (5.13) to see that as b ↓ 0, Z
b δ/N 2
Z ∂x pτ (x, y)dτ ≤
√ (y+x)/ b √
(y−x)/ b
g1 (z)dz + o(e−1/4b )
if
0<x
1 . 4
By choosing small enough the integral on the √ right-hand side can be made arbitrarily small uniformly in x and b as long as 0 < x < b. Thus by (5.14) the contribution of the first Part can be made as small as one wishes. We have shown that the limit infimum of the formula (5.12) as N → ∞ and b ↓ 0 is non-negative. In view of Lemmas 4.1, 4.2 and 2.1 ψ in (5.12) may be replaced by (ψ − ψl∗ ). Thus Lemma 5.3 has been proved. Lemma 5.4.
\[ \int_X u^3\,\pi(dt\,d\theta\,d\lambda\,du) \le \liminf_{\delta \downarrow 0} \liminf^*_{N \to \infty} I_N(\delta/N^2). \]
Proof. We wish to replace $p_{\delta/N^2}(x_i, x_j)$ appearing in $I_N(\delta/N^2)$ by $N h(N(x_i - x_j))$, but this is not allowed since we have to let $\delta$ go to zero. Instead of $h$ we consider $f^\delta(x) := g_\delta(x)\varphi(x)$, where $\varphi$ is a smooth function such that $\chi(|x| < 1/2) \le \varphi(x) \le \chi(|x| < 1)$. The total mass $\int f^\delta\,dx$, though less than one, converges to one as $\delta \downarrow 0$. Let (1.10b) hold. Taking $l$ large enough we may replace $(\psi - \psi_l^*)(x)$ by $(|x| + 1)^{-1}\chi(|x| > l)$ in the defining expression of $I_N(\delta/N^2)$. We apply (4.8) and (4.9) with $\gamma = 1$ and with $f^\delta / \int f^\delta\,dx$ in place of $h$ to deduce
\[ I_N(\delta/N^2) \ge \frac{2\left( \int f^\delta\,dx \right)}{\log N} \int_0^T dt \iint_{\theta < \theta'} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^2} \int_\theta^{\theta'} \alpha_t^N(dr)\,(1 + o(1)), \]
where $o(1) \to 0$ as $N \to \infty$. Rewriting the right-hand side in terms of $\hat\alpha_t^N(\theta, \theta')$ according to Lemma 3.1 and neglecting the boundary terms that come up therein (since they are all non-negative), we obtain
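\[ \liminf^*_{N \to \infty} \int_0^T dt \int_0^1 d\theta \int_0^{\kappa(\theta, \lambda)} [h_N * \hat\alpha_t^N(\theta, \theta + N^{1-\lambda})]^3\,d\lambda \le \liminf_{\delta \downarrow 0} \liminf^*_{N \to \infty} I_N(\delta/N^2). \]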
The inequality of the lemma follows from this one by a simple application of Fatou's lemma. In the case when (1.10a) is assumed we replace $(\theta' - \theta + 1/N)^2$ by $(\theta' - \theta + m/N)^2$ in the argument given above, which forces us to replace $\hat\alpha_t^N(\theta, \theta + N^{1-\lambda})$ by $(N^{1-\lambda} + m)^{-1}(N^{1-\lambda} + 1)\hat\alpha_t^N(\theta, \theta + N^{1-\lambda})$ in the last inequality. Owing to (5.7) this causes no problem since the factor $(N^{1-\lambda} + m)^{-1}(N^{1-\lambda} + 1)$ converges to 1 as $N \to \infty$. The proof of Lemma 5.4 is finished.

Proof of Theorem 4. As in the proof of the last lemma we replace $(\psi - \psi_l^*)(x)$ by $(|x| + 1)^{-1}\chi(|x| > l)$ in $I_N(b)$ and apply Lemmas 2.1 and 2.2 to deduce
\[ I_N(b) \le \frac{2}{\log N} \int_0^T dt \iint_{\theta < \theta'} \frac{\alpha_t^N(d\theta)\alpha_t^N(d\theta')}{(\theta' - \theta + 1/N)^2} \int_\theta^{\theta'} \varphi_t^{N,b}(r)\,dr + o(1), \]
where
\[ \varphi_t^{N,b}(\theta) = \int_0^1 p_b(\theta, \theta')\alpha_t^N(d\theta'). \]
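Rewriting the right-hand side in terms of $\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})$ according to Lemma 3.2, and applying (5.2) of Lemma 5.1, we deduce from the inequalities of Lemmas 5.3 and 5.4 that
\[ \begin{aligned} \int_X u^3\,\pi(dt\,d\theta\,d\lambda\,du) \le \limsup_{b \downarrow 0} \limsup^*_{N \to \infty} \int_0^T dt \int_0^1 d\theta \int_0^{\kappa(\theta, N)} &[\hat\alpha_t^N(\theta, \theta + N^{\lambda - 1})]^2 \\ &\times \{ 3\hat\varphi_t^{N,b}(\theta, \theta + N^{\lambda - 1}) - \varphi_t^{N,b}(\theta) - \varphi_t^{N,b}(\theta + N^{\lambda - 1}) \}\,d\lambda. \end{aligned} \]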
The rest of the proof proceeds as in [5]. Since $p_b(\theta, \theta')$ with $b$ fixed can be uniformly approximated by functions of the form $\alpha_1(\theta)\beta_1(\theta') + \cdots + \alpha_m(\theta)\beta_m(\theta')$, where $\alpha_k$ and $\beta_k$ are smooth, $\varphi_t^{N,b}(\theta)$ converges to
\[ \varphi_b(\theta, t) := \int p_b(\theta, \theta')\rho(\theta', t)\,d\theta' \]
uniformly in $(\theta, t)$. We express the right-hand side as an integral relative to $\pi^N$ and apply the bound (5.7), i.e.,
\[ \sup_N \int_X u^3\,d\pi^N < \infty, \tag{5.15} \]
which allows us to take the limit relative to $N$ under the integral sign. The right-hand side above is thus shown to equal
\[ \limsup_{b \downarrow 0} \int_X u^2 \varphi_b(\theta, t)\,\pi(dt\,d\theta\,d\lambda\,du). \tag{5.16} \]
From the definition it follows that the Young measure π is automatically of the form dtdθdλπt,θ,λ (du). We also have
\[
\rho(\theta,t) = \int u\,\pi_{t,\theta,\lambda}(du) \quad \text{for a.a. } (t,\theta,\lambda). \tag{5.17}
\]
Hölder's inequality yields $\int_0^T dt \int_0^1 [\varphi_b(\theta,t)]^3\,d\theta \le \int_0^T dt \int_0^1 [\rho(\theta,t)]^3\,d\theta$. Since the right-hand side is finite due to (5.15), by approximating $\rho(\theta,t)$ by continuous functions we see that $\varphi_b(\theta,t)$ converges to $\rho(\theta,t)$ in the norm of $L^3(dt\,d\theta\,d\lambda)$. Therefore we may take the limit under the integral sign in (5.16) to have $\int_X u^2\rho(\theta,t)\,\pi(dt\,d\theta\,d\lambda\,du)$ as the limit value. We can now conclude that
\[
\int_X u^3\,\pi(dt\,d\theta\,d\lambda\,du) \le \int_0^T dt \int_0^1 d\theta \int_0^1 d\lambda \int_0^\infty u^2\,\pi_{t,\theta,\lambda}(du) \int_0^\infty u\,\pi_{t,\theta,\lambda}(du).
\]
It would be standard to deduce πt,θ,λ (du) = δρ(θ,t) (du) from this inequality and (5.17). The proof of Theorem 4 is finished.
6. Proof of Theorem 2

Let $\gamma<1$. In place of (5.3) we have
\[
\alpha_t^N(J)-\alpha_0^N(J) = \int_0^t ds \iint_{[0,1]^2} \frac{J'(\theta')-J'(\theta)}{\theta'-\theta}\,\frac{\alpha_s^N(d\theta)\,\alpha_s^N(d\theta')}{(|\theta'-\theta|+1/N)^{\gamma}} + o(1) \tag{6.1}
\]
($J$ is any smooth function on $[0,1]$ with $J'(0)=J'(1)=0$ as before). The same equicontinuity of the empirical measures $\alpha_t^N$ as in Lemma 5.2 holds and is proved in a similar way. We let $N\to\infty$ in (6.1) along a subsequence for which $\alpha_t^N$ has a limit. Owing to (4.11), we may apply Lemma 3.3 of Sect. 3 for taking the limit under the integral sign, which leads to Eq. (1.12) as required. Uniqueness of a solution to the initial value problem for (1.14) is established in Uchiyama [4] at least under
\[
\int_0^T \int_0^1 \int_0^1 \frac{\rho(\theta,t)\,\rho(\theta',t)}{|\theta-\theta'|^{\gamma}}\,d\theta\,d\theta'\,dt < \infty.
\]
This condition follows from Lemma 2.1. Thus Theorem 2 has been proved.

References
1. Olla, S., Varadhan, S.R.S.: Scaling limit for interacting Ornstein-Uhlenbeck processes. Commun. Math. Phys. 135, 355–378 (1991)
2. Uchiyama, K.: Scaling limit for a mechanical system of interacting particles. Commun. Math. Phys. 177, 103–128 (1996)
3. Uchiyama, K.: Scaling limits for large systems of interacting particles. Advances in non-linear partial differential equations and stochastics (S. Kawashima and T. Yanagisawa, eds.), Adv. Math. Appl. Sci. 48 (1998)
4. Uchiyama, K.: Uniqueness of solutions to the initial-value problem for an integro-differential equation. Preprint (submitted to Diff. Integ. Eq.)
5. Varadhan, S.R.S.: Scaling limit for interacting diffusions. Commun. Math. Phys. 135, 313–353 (1991)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 196, 703 – 731 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Number-Theoretical Spin Chain and the Riemann Zeroes
Andreas Knauf
Max-Planck-Institute for Mathematics in the Sciences, Inselstr. 22–26, D-04103 Leipzig, Germany. E-mail: [email protected]
Received: 17 February 1998 / Accepted: 17 February 1998
Abstract: It is an empirical observation that the Riemann zeta function can be well approximated in its critical strip using the Number-Theoretical Spin Chain. A proof of this would imply the Riemann Hypothesis. Here we relate that question to the one of spectral radii of a family of Markov chains. This in turn leads to the question whether certain graphs are Ramanujan. The general idea is to explain the pseudorandom features of certain number-theoretical functions by considering them as observables of a spin chain of statistical mechanics. In an appendix we relate the free energy of that chain to the Lewis Equation of modular theory.

1. Introduction

The Euler product formula
\[
\zeta(s) \equiv \sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}} \frac{1}{1-p^{-s}} \qquad (\mathrm{Re}(s)>1)
\]
for the Riemann zeta function and partial integration imply that
\[
\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \mu(n)\,n^{-s} = s\int_1^{\infty} \frac{M(x)}{x^{s+1}}\,dx \qquad (\mathrm{Re}(s)>1),
\]
where $\mu:\mathbb{N}\to\{-1,0,1\}$ denotes the Möbius function and
\[
M(x) := \sum_{n\le x}\mu(n). \tag{1}
\]
Thus a Mertens type estimate
\[
M(x) = O_{\varepsilon}\bigl(x^{1/2+\varepsilon}\bigr) \tag{2}
\]
for all $\varepsilon>0$ would imply convergence of (1) in the half plane $\mathrm{Re}(s)>\tfrac12$ and thus the Riemann Hypothesis (RH)¹. The values 1 and −1 of the Möbius function have equal densities $3/\pi^2$. This led Good and Churchhouse in [8] to a probabilistic motivation for RH. Indeed (2) would follow with probability one if the $\mu(n)$ were i.i.d. random variables with the above distribution, since then $n\mapsto M(n)$ would correspond to a symmetric random walk. In this spirit, it is conjectured on the basis of the probabilistic law of the iterated logarithm for i.i.d. random variables that (2) is wrong for $\varepsilon=0$. Indeed the original Mertens conjecture $|M(x)|<\sqrt{x}$ for $x>1$ is known² to be wrong. However, arithmetical functions like $\mu$ are of course deterministic, and thus Good and Churchhouse remark that “all our probability arguments are put forward in a purely heuristic spirit without any claim that they are mathematical proofs” [8]. In this paper we convert the above idea into a mathematical framework (which uses the Liouville function $\lambda$ instead of the Möbius function). This approach is based on a statistical mechanics interpretation of the Riemann zeta function. In [10] we interpreted the quotient
\[
Z(s) := \zeta(s-1)/\zeta(s) \tag{3}
\]
of Riemann zeta functions for $\mathrm{Re}(s)>2$ as the partition function of an infinite spin chain at inverse temperature $s$. In that half plane $Z$ has the Dirichlet series
\[
Z(s) = \sum_{n\in\mathbb{N}} \frac{\varphi(n)}{n^{s}}
\]
with the Euler totient function $\varphi(n) := |\{j\in\{1,\dots,n\} \mid \gcd(j,n)=1\}|$. The quotient $Z$ has been shown in [10] to be the thermodynamic limit
\[
\lim_{k\to\infty} Z_k(s) = Z(s) \qquad (\mathrm{Re}(s)>2) \tag{4}
\]
of the partition functions
\[
Z_k(s) := \sum_{\sigma\in\{0,1\}^k} \exp(-s\cdot H_k(\sigma))
\]
of spin chains with $k$ spins. The energy function $H_k := \ln(h_k)$ of that spin chain is defined inductively by
\[
h_0 := 1, \qquad h_{k+1}(\sigma,0) := h_k(\sigma) \quad\text{and}\quad h_{k+1}(\sigma,1) := h_k(\sigma)+h_k(1-\sigma), \tag{5}
\]
the spin configuration $\sigma=(\sigma_1,\dots,\sigma_k)$ being an element of the additive group $G_k := (\mathbb{Z}/2\mathbb{Z})^k$ and $1-\sigma := (1-\sigma_1,\dots,1-\sigma_k)$ being the configuration with all spins inverted.

¹ Eq. (2) is indeed equivalent to RH, see Titchmarsh [25], 14.25.
² See the article [21] by Odlyzko and te Riele.
Writing the hk (σ) in the row number k using the lexicographic order of the σ ∈ {0, 1}k , we obtain what could be called Pascal’s triangle with memory, see Fig. 1. Like in the usual Pascal triangle one writes the sum of neighbouring integers in row no. k into the next row. But in addition one also copies the integers from row No. k to the (k + 1)–st row. Notice that these sequences of integers coincide with the denominators of the modified Farey sequence.
k=0:  1 1
k=1:  1 2 1
k=2:  1 3 2 3 1
k=3:  1 4 3 5 2 5 3 4 1
k=4:  1 5 4 7 3 8 5 7 2 7 5 8 3 7 4 5 1
Fig. 1. Pascal’s triangle with memory
For $n\le k+1$ the multiplicity $\varphi_k(n) := |\{\sigma\in G_k \mid h_k(\sigma)=n\}|$ of $n$ equals $\varphi(n)$. This implies (4), since
\[
Z_k(s) = \sum_{n\in\mathbb{N}} \varphi_k(n)\,n^{-s}.
\]
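The inductive definition (5) and the multiplicity statement above can be checked on small cases. The following is a minimal sketch, not code from the paper: the function names `h_row`, `totient` and `multiplicity` are illustrative, and the list encoding (index $i$ for $\sigma$ read as a binary number with $\sigma_1$ most significant, so that $1-\sigma$ has index $2^k-1-i$) is an assumption made for convenience.

```python
from math import gcd

def h_row(k):
    """Values h_k(sigma), sigma in {0,1}^k in lexicographic order (sigma_1 most significant)."""
    row = [1]                                   # h_0 := 1
    for _ in range(k):
        n = len(row)
        # Index i encodes sigma; the inverted configuration 1 - sigma has index n-1-i.
        # (sigma, 0) copies h_k(sigma); (sigma, 1) adds h_k(1 - sigma), as in (5).
        row = [v for i in range(n) for v in (row[i], row[i] + row[n - 1 - i])]
    return row

def totient(n):
    """Euler totient, straight from its definition."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)

def multiplicity(k, n):
    """phi_k(n): number of configurations sigma in G_k with h_k(sigma) = n."""
    return h_row(k).count(n)

if __name__ == "__main__":
    for k in range(5):
        print(k, h_row(k))
```

For instance, `h_row(3)` gives the eight values 1, 4, 3, 5, 2, 5, 3, 4, and `multiplicity(k, n)` agrees with `totient(n)` whenever $n\le k+1$ in the range tested below.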
As has been worked out in [10–12], in the articles [5, 6] with Contucci, and the one with Guerra [9], this Number-Theoretical Spin Chain has the properties of typical systems considered in statistical mechanics. It has exactly one phase transition, at $s=2$. But from the point of view of number theory the most important point seems to be its ferromagnetic property. That is, the Fourier coefficients $j_k(t)$, $t\in G_k^{*}\cong G_k$, of $-H_k$ (interaction coefficients in the statistical mechanics terminology), with
\[
H_k(\sigma) = -\sum_{t\in G_k^{*}} j_k(t)\cdot(-1)^{\sigma\cdot t},
\]
are positive: $j_k(t)\ge 0$ for $t\ne 0$. The exceptional negative coefficient, namely the mean $j_k(0)$ of $-H_k$, does not affect the Gibbs measure
\[
\sigma \mapsto \exp(-sH_k(\sigma))/Z_k(s) \qquad (\sigma\in G_k)
\]
of the spin chain for inverse temperature $s\ge 0$. The ferromagnetic property is of interest in the context of the Riemann Hypothesis, since the Lee-Yang Theorem of statistical mechanics shows that the partition function of a ferromagnetic Ising system has a zero-free half-plane. In [13] we noted that numerically the functions
\[
\tilde Z_k(s) := \sum_{n=1}^{\infty} \lambda(n)\cdot\varphi_k(n)\cdot n^{-s} = \sum_{\sigma\in G_k} \lambda(h_k(\sigma))\cdot\exp(-s\cdot H_k(\sigma)) \tag{6}
\]
with the Liouville function
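\[
\lambda:\mathbb{N}\to\{\pm 1\}, \qquad \lambda\Bigl(\prod_{p\ \mathrm{prime}} p^{\alpha_p}\Bigr) := (-1)^{\sum_p \alpha_p},
\]
well approximate the function
\[
\tilde Z(s) := \sum_{n=1}^{\infty} \lambda(n)\cdot\varphi(n)\cdot n^{-s} = \frac{Z(2s)\cdot Z(2s-1)}{Z(s)} \tag{7}
\]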
not only in the half plane $\mathrm{Re}(s)>2$ of absolute convergence but even for $\mathrm{Re}(s)>3/2$ (with $\tilde Z(s)$ being defined by analytical continuation). Comparing these statistical-mechanics ensembles with the finite Dirichlet series with the same number of terms obtained by truncation of the Dirichlet series (7), the numerical convergence properties of the $\tilde Z_k(s)$ are much better. Clearly a convergence proof for the half plane $\mathrm{Re}(s)>3/2$ would imply RH, since by (3) the non-trivial zeroes $s\in\mathbb{C}$ of $\zeta$ give rise to poles of $\tilde Z$ at $s+1$.

In this paper we develop a framework supporting this empirical observation. In Sect. 2 we introduce a family of Markov chains with $n^2\times n^2$ transition matrices $T_n$ which control the divisibility by $n$ of the values of $h_k$. In Sect. 3 we show that $T_n$ can be restricted to a subspace on which it is irreducible-aperiodic. However, the spectral radius of this reduced matrix $T_n$ is in general larger than $1/\sqrt{2}$ (which would be the expected value based on RH). Thus in Sect. 4 we introduce a reduced matrix $\bar T_n$ (of size about $n$). We prove in Sect. 5 that all non-real eigenvalues of this doubly stochastic irreducible-aperiodic matrix have modulus $1/\sqrt{2}$. In the course of the analysis of its real eigenvalues we are led in Sect. 6 to a class of three-regular graphs which we conjecture to have the Ramanujan property (that is, their nontrivial spectrum is conjectured to be a subset of the spectrum of the three-regular tree). In the last section we briefly draw some easy consequences from the Perron–Frobenius Theorem. However, we do not proceed in the analysis here, since further progress hinges on a proof of the Ramanujan property for the above graphs (or similar information). In the Appendix the free energy of the Number-Theoretical Spin Chain is related to the solutions of the Lewis Equation,
\[
\psi(z) = \psi(z+1) + z^{-2s}\,\psi(1+1/z).
\]
The holomorphic solutions of that equation on C \ (−∞, 0] with |ψ(0+ )| < ∞ are in bijection with the even Maass wave forms, see Lewis [14]. Our approach based on the Number-Theoretical Spin Chain partly resembles the one followed by A. Connes. He interprets ζ(s) (instead of Z(s)) as the partition function of a statistical mechanics system at inverse temperature s, see [4]. Notation. We denote by P ⊂ N the set of primes. |S| is the cardinality of the set S.
2. The Markov Chain Construction

In (2) one considers the sums of the ensembles $(\mu(1),\dots,\mu(k))$ in the limit $k\to\infty$. Here we estimate sums like $\sum_{\sigma\in G_k}\lambda(h_k(\sigma)) = \tilde Z_k(0)$, in order to gain some understanding of the functions $\tilde Z_k(s)$. Thus we consider arithmetical functions $f:\mathbb{N}\to\mathbb{C}$ like $\lambda$ as random variables $f\circ h_k: G_k\to\mathbb{C}$ with expectations
\[
\langle f\rangle_k := 2^{-k}\sum_{\sigma\in G_k} f\circ h_k(\sigma)
\]
w.r.t. the normalized counting measure on the group $G_k$. In order to estimate the expectation $\langle\lambda\rangle_k$ of the Liouville function $\lambda$, we analyze the divisibility properties of $h_k$. Therefore we start by considering for $m\in\mathbb{N}$ the functions $\chi_m:\mathbb{N}\to\{0,1\}$ and $c_m:\mathbb{N}\to\{-1,1\}$ given by $\chi_m(n) := 1$ for $m\mid n$, $\chi_m(n) := 0$ for $m\nmid n$ and $c_m := (-1)^{\chi_m} = 1-2\chi_m$. These functions are the building blocks of the arithmetical functions $\lambda_p:\mathbb{N}\to\{-1,1\}$, $p\in\mathbb{P}$, which equal $+1$ if the power of $p$ in the prime factorization of the argument is even, and $-1$ if it is odd:
\[
\lambda_p(m) := (-1)^{\sum_{l=1}^{\infty}\chi_{p^l}(m)} = \prod_{l=1}^{\infty} c_{p^l}(m). \tag{8}
\]
of these function (where one needs only take into account those primes p which are smaller than the argument). We want to calculate thermodynamic limits hf i∞ := lim hf ik k→∞
(9)
for the functions χm , cm , λp and, finally, λ. Moreover if these limits are proven to exist, we would like to estimate how fast they are approached. Since |Gk | = 2k , an estimate of the form (10) hf ik − hf i∞ = Oε (2k )−1/2+ε for all ε > 0 would be similar to the one expected for the µ-function as discussed in the Introduction and fit with the Riemann Hypothesis. Our basic idea is to consider the groups Gk as finite probability spaces with the normalized counting measure, so that arithmetical functions composed with hk are random variables on these probability spaces. However, in order to estimate the k-dependence of these quantities, we embed all the finite groups Gk in the countably infinite probability space := N × N, with the help of the maps
\[
I_k: G_k\to\Omega, \qquad I_k(\sigma) := \bigl(h_k(\sigma),\,h_k(1-\sigma)\bigr).
\]
The $I_k$ are indeed injective, have disjoint images $I_k(G_k)\cap I_l(G_l)=\emptyset$ for $k\ne l$, and
\[
\bigcup_{k=0}^{\infty} I_k(G_k) = \{(a,b)\in\Omega \mid \gcd(a,b)=1\}
\]
(see [10], Lemma 2.1 and 3.1). The images of the first groups are shown in Fig. 2.

Fig. 2. The images $I_k(G_k)\subset\Omega$, for $k\le 6$ (both axes range from 0 to 25). Larger groups have lighter shading
The image probability measures $\mu_k$ on $\Omega$ w.r.t. $I_k$ give the elementary events the probabilities
\[
\mu_k(\{(a,b)\}) = \begin{cases} 2^{-k} & (a,b)\in I_k(G_k) \\ 0 & \text{otherwise.} \end{cases}
\]
Now from the inductive definition (5) it follows that for all $k\in\mathbb{N}_0$, $\sigma\in G_k$,
\[
\bigl(h_{k+1}(\sigma,1),\,h_{k+1}(1-\sigma,0)\bigr) = \tau^L\bigl(h_k(\sigma),\,h_k(1-\sigma)\bigr)
\]
and
\[
\bigl(h_{k+1}(\sigma,0),\,h_{k+1}(1-\sigma,1)\bigr) = \tau^R\bigl(h_k(\sigma),\,h_k(1-\sigma)\bigr)
\]
with $\tau^L,\tau^R:\Omega\to\Omega$, $\tau^L(a,b) := (a+b,b)$ and $\tau^R(a,b) := (a,a+b)$ denoting “left” resp. “right” addition.
(11)
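The two relations above say that the image of $G_{k+1}$ is obtained from the image of $G_k$ by applying left and right addition to every point. A small sketch of this generation process follows; the helper names `tau_L`, `tau_R`, `image_levels` and `expectation` are illustrative assumptions, and the measure $\mu_k$ is represented implicitly as the uniform weight $2^{-k}$ on the $k$-th list.

```python
from math import gcd

def tau_L(w):
    a, b = w
    return (a + b, b)       # "left" addition

def tau_R(w):
    a, b = w
    return (a, a + b)       # "right" addition

def image_levels(k_max):
    """levels[k] lists the points of I_k(G_k), generated from (1, 1) by tau_L and tau_R."""
    levels = [[(1, 1)]]
    for _ in range(k_max):
        levels.append([t(w) for w in levels[-1] for t in (tau_L, tau_R)])
    return levels

def expectation(f, k):
    """<f>_k as the mean of f(a) over the points (a, b) of I_k(G_k)."""
    level = image_levels(k)[k]
    return sum(f(a) for a, _ in level) / len(level)
```

On the first levels one can confirm that the images are pairwise disjoint, consist of coprime pairs only, and exhaust the coprime pairs with small $a+b$, in line with the statements quoted from [10].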
Thus
\[
\mu_{k+1} = \tfrac12\bigl(\tau^L_{*}(\mu_k) + \tau^R_{*}(\mu_k)\bigr), \tag{12}
\]
starting with the probability measure $\mu_0$ concentrated on $(1,1)\in\Omega$. If we denote by $\mathrm{pr}:\Omega\to\mathbb{N}$, $(a,b)\mapsto a$ the projection on the left factor, then $\mathrm{pr}\circ I_k = h_k$ so that by (11)
\[
\langle f\rangle_k = \sum_{\omega\in\Omega} f\circ\mathrm{pr}(\omega)\cdot\mu_k(\{\omega\}).
\]
We want to estimate the expectations $\langle\chi_m\rangle_k$ or equivalently $\langle c_m\rangle_k$, since the Liouville function $\lambda$ is a product of the $c_m$. However, in order to do this, one does not need to work on the infinite space $\Omega$. Instead, for any $n\in\mathbb{N}$ with $m\mid n$ we can work on the space $\Omega_n := \mathbb{Z}/n\mathbb{Z}\times\mathbb{Z}/n\mathbb{Z}$. Then the functions $\chi_m\circ\mathrm{pr}$ on $\Omega$ are projectable w.r.t. the natural residue class map
\[
X_n:\Omega\to\Omega_n, \qquad X_n((a,b)) := (a+n\mathbb{Z},\,b+n\mathbb{Z}),
\]
since they are constant on the preimages $X_n^{-1}(\omega)$, $\omega\in\Omega_n$. We denote the projected functions by
\[
\tilde\chi_m:\Omega_n\to\{0,1\} \quad\text{resp.}\quad \tilde c_m:\Omega_n\to\{-1,1\}.
\]
These functions are thus elements of the $n^2$-dimensional Hilbert space
\[
V_n := \{f:\Omega_n\to\mathbb{C}\} \quad\text{with inner product}\quad (f,g) := \sum_{\omega\in\Omega_n} \bar f(\omega)\,g(\omega).
\]
Moreover, the image measures $(X_n)_{*}(\mu_k)$ give probabilities
\[
v_{n,k}(\omega) := \mu_k\bigl(X_n^{-1}(\{\omega\})\bigr) \tag{13}
\]
to the elements $\omega\in\Omega_n$, and thus the expectation of a function $f\in V_n$ w.r.t. the image measure $(X_n)_{*}(\mu_k)$ equals the inner product $(f,v_{n,k})$. In particular for integers $m,n$ with $m\mid n$ one has
\[
\langle\chi_m\rangle_k = (\tilde\chi_m, v_{n,k}). \tag{14}
\]
3. The Matrices $T_n$

Since $X_n:\Omega\to\Omega_n$ acts (mod $n$), “left” and “right” addition $\tau^L,\tau^R$ on $\Omega$ descend to maps $\tau_n^L,\tau_n^R:\Omega_n\to\Omega_n$,
\[
\tau_n^L(a,b) := (a+b,b), \qquad \tau_n^R(a,b) := (a,a+b) \qquad ((a,b)\in\Omega_n), \tag{15}
\]
in the sense that $\tau_n^L\circ X_n = X_n\circ\tau^L$ and $\tau_n^R\circ X_n = X_n\circ\tau^R$. But $\tau_n^L$ and $\tau_n^R$ are permutations. For a permutation $\tau\in S(\Omega_n)$ we denote by $P_\tau: V_n\to V_n$, $P_\tau(f) := f\circ\tau^{-1}$ the permutation representation. Consider the endomorphism
\[
T_n: V_n\to V_n, \qquad T_n := \tfrac12\bigl(P_{\tau_n^L}+P_{\tau_n^R}\bigr).
\]
The matrix representations of these endomorphisms w.r.t. the orthonormal basis $\delta_x$ of characteristic functions of the points $x\in\Omega_n$ are denoted by the same symbol. As a convex combination of permutation matrices $T_n$ is doubly stochastic, that is, its entries are nonnegative, and the sum of each column and each row equals one. In fact, by (12) and by (13) $T_n$ is the transition matrix of a finite Markov chain with state space $\Omega_n$ and probability vectors $v_{n,k}$, i.e.
\[
T_n v_{n,k} = v_{n,k+1} \qquad (k\in\mathbb{N}_0). \tag{16}
\]
Our first goal in the spectral analysis of $T_n$ is to find its ergodic sets $U_n^d\subset\Omega_n$. Thus we consider the orbits in $\Omega_n$ w.r.t. the action of the permutation subgroup generated by $\tau_n^L,\tau_n^R\in S(\Omega_n)$.

Lemma 1. For $d\mid n$ let $\tilde U_n^d := \{(a,b)\in\Omega_n \mid n\mid ad \text{ and } n\mid bd\}$. Then $\tau_n^L(\tilde U_n^d) = \tau_n^R(\tilde U_n^d) = \tilde U_n^d$, and the cardinality $|\tilde U_n^d| = d^2$. For $d\mid n$ and $e\mid n$ one has $\tilde U_n^d\cap\tilde U_n^e = \tilde U_n^{\gcd(d,e)}$.

Remarks 2.
1. The property $n\mid ad$ for $a\in\mathbb{Z}/n\mathbb{Z}$ is independent of the chosen complete residue system so that the above definition is valid.
2. Obviously, $\tilde U_n^n = \Omega_n$ and $\tilde U_n^1 = \{(n,n)\}$. Furthermore, the map $\tilde U_n^d\to\tilde U_d^d$, $(a,b)\mapsto(ad/n,\,bd/n)$ is an isomorphism commuting with the $\tau^L$ and $\tau^R$ maps.

Proof. If $n\mid ad$ and $n\mid bd$ then $n\mid(a+b)d$, showing that $\tau_n^L(\tilde U_n^d)\subset\tilde U_n^d$ and $\tau_n^R(\tilde U_n^d)\subset\tilde U_n^d$. Equality holds since $\tau_n^L$ and $\tau_n^R$ are injective. Furthermore, $\{a\in\mathbb{Z}/n\mathbb{Z} \mid n\mid ad\} = \{kn/d \mid k=1,\dots,d\}$ so that $|\tilde U_n^d| = d^2$. If $d\mid n$ and $e\mid n$, then $n\mid cd$ and $n\mid ce$ implies $n\mid\gcd(cd,ce) = c\gcd(d,e)$, so that $\tilde U_n^d\cap\tilde U_n^e\subset\tilde U_n^{\gcd(d,e)}$. The converse inclusion is trivial.

Lemma 3. For $d\mid n$ let
\[
U_n^d := \tilde U_n^d - \bigcup_{p\in\mathbb{P},\,p\mid d} \tilde U_n^{d/p}.
\]
Then
\[
U_n^d = \{(a,b)\in\Omega_n \mid d = n/\gcd(a,b,n)\}, \qquad \tilde U_n^d = \dot{\bigcup_{e\mid d}}\, U_n^e, \tag{17}
\]
$\tau_n^L(U_n^d) = \tau_n^R(U_n^d) = U_n^d$, and the cardinality
\[
|U_n^d| = \sum_{e\mid d}\mu(d/e)\,e^2 = d^2\cdot\prod_{p\in\mathbb{P},\,p\mid d}(1-p^{-2}). \tag{18}
\]
Proof. We have $\tilde U_n^d = \{(a,b)\in\Omega_n \mid n\mid d\cdot\gcd(a,b)\} = \{(a,b)\in\Omega_n \mid n\mid d\cdot\gcd(a,b,n)\}$, since $\gcd(n,\gcd(a,b)) = \gcd(a,b,n)$. By the definition of $U_n^d$, for $(a,b)\in U_n^d$ one has $n\nmid d'\gcd(a,b)$ for $d'<d$ and $d'\mid d$. This implies the equation $U_n^d = \{(a,b)\in\Omega_n \mid d = n/\gcd(a,b,n)\}$.
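The same equation shows that $U_n^{d_1}\cap U_n^{d_2}=\emptyset$ if $d_1\ne d_2$, and that $\tilde U_n^d = \bigcup_{e\mid d} U_n^e$, proving the second part of (17). The invariance of $U_n^d$ w.r.t. the automorphisms $\tau_n^L$ and $\tau_n^R$ of the $\tilde U_n^d$ follows from
\[
\tau_n^L(U_n^d) = \tau_n^L\Bigl(\tilde U_n^d - \bigcup_{p\in\mathbb{P},\,p\mid d}\tilde U_n^{d/p}\Bigr) = \tau_n^L\bigl(\tilde U_n^d\bigr) - \bigcup_{p\in\mathbb{P},\,p\mid d}\tau_n^L\bigl(\tilde U_n^{d/p}\bigr) = \tilde U_n^d - \bigcup_{p\in\mathbb{P},\,p\mid d}\tilde U_n^{d/p} = U_n^d
\]
and similarly for $\tau_n^R$.

The formula $|U_n^d| = \sum_{e\mid d}\mu(d/e)\,e^2$ for the cardinality follows from $|\tilde U_n^d| = \sum_{e\mid d}|U_n^e|$ and $|\tilde U_n^d| = d^2$, using the Möbius inversion formula (Theorem 2.9 of [1]).

The identity $\sum_{e\mid d}\mu(d/e)(e/d)^2 = \prod_{p\in\mathbb{P},\,p\mid d}(1-p^{-2})$ is certainly true for $d=1$ since then the product is empty. For $d>1$ let $p_1,\dots,p_r\in\mathbb{P}$ be the prime divisors of $d$. Then
\[
\prod_{p\in\mathbb{P},\,p\mid d}(1-p^{-2}) = \prod_{i=1}^{r}(1-p_i^{-2}) = \sum_{k=0}^{r}(-1)^k \sum_{\substack{M\subset\{1,\dots,r\}\\ |M|=k}}\ \prod_{i\in M} p_i^{-2} = \sum_{M\subset\{1,\dots,r\}} \frac{\mu\bigl(\prod_{i\in M}p_i\bigr)}{\bigl(\prod_{i\in M}p_i\bigr)^2} = \sum_{e'\mid d}\frac{\mu(e')}{(e')^2} = \frac{1}{d^2}\sum_{e\mid d}\mu(d/e)\,e^2,
\]
since $\mu(e')=0$ if $e'$ contains a prime factor raised to a power $\ge 2$.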
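Lemma 3 lends itself to a brute-force check for small $n$. The sketch below is an illustration under stated assumptions: residues are represented by $0,\dots,n-1$ (so the class written $n$ in the text appears as 0), and the names `U`, `card_formula` and `prime_divisors` are introduced here, not taken from the paper.

```python
from math import gcd

def prime_divisors(d):
    return [p for p in range(2, d + 1)
            if d % p == 0 and all(p % q for q in range(2, p))]

def U(n, d):
    """U_n^d = {(a, b) in (Z/nZ)^2 : n / gcd(a, b, n) = d}, residues taken in 0..n-1."""
    return {(a, b) for a in range(n) for b in range(n)
            if n // gcd(gcd(a, b), n) == d}

def card_formula(d):
    """d^2 * prod_{p | d} (1 - p^-2), evaluated in exact integer arithmetic."""
    value = d * d
    for p in prime_divisors(d):
        value = value * (p * p - 1) // (p * p)
    return value

def is_invariant(n, d):
    """Check tau_n^L(U_n^d) = tau_n^R(U_n^d) = U_n^d."""
    u = U(n, d)
    left = {((a + b) % n, b) for a, b in u}
    right = {(a, (a + b) % n) for a, b in u}
    return left == u and right == u
```

For example, `len(U(9, 9))` and `card_formula(9)` both give 72, and summing `len(U(n, d))` over the divisors $d$ of $n$ returns $n^2$, which is the disjoint decomposition of the state space.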
In particular we have obtained the decomposition
\[
\Omega_n = \dot{\bigcup_{e\mid n}}\, U_n^e \tag{19}
\]
of the state space $\Omega_n$ into disjoint $\tau_n^L$- and $\tau_n^R$-invariant subsets $U_n^e$. We will show now that these subsets cannot be further decomposed, i.e. that they are the orbits of the subgroup of permutations generated by $\tau_n^L$ and $\tau_n^R$. For $d\mid n$ let $V_n^d := \mathrm{span}(\{\delta_x \mid x\in U_n^d\})\subset V_n$ be the subspace corresponding to the states in $U_n^d$ and $\Xi_n^d: V_n\to V_n^d$ the orthogonal projector on that space. By the invariance property $\tau_n^L(U_n^d) = \tau_n^R(U_n^d) = U_n^d$ of Lemma 3, $T_n^d := T_n|_{V_n^d}$ is an endomorphism of the subspace $V_n^d$ and thus has a doubly stochastic matrix representation w.r.t. the basis vectors $\delta_x$, $x\in U_n^d$. Again, we denote the matrix by the same symbol.

Lemma 4. For $n\in\mathbb{N}$ and $d\mid n$ the matrix $T_n^d$ is irreducible-aperiodic.
Proof. By item 2 of Remarks 2 above it suffices to prove the lemma for the special case $T_d^d$, since $T_n^d$ is mapped to $T_d^d$ under the isomorphism $V_n^d\to V_d^d$, $\delta_{(a,b)}\mapsto\delta_{(ad/n,\,bd/n)}$. We must show that for a certain $l\in\mathbb{N}$ the $l$th power of the matrix $T_d^d$ has only strictly positive entries. But since $T_d^d$ is the restriction $T_d|_{V_d^d}$ of the matrix $T_d = \tfrac12(P_{\tau_d^L}+P_{\tau_d^R})$, one may equivalently show that for every pair $(a_i,b_i),(a_f,b_f)\in U_d^d$ of initial and final states there exists a chain $(a_k,b_k)\in U_d^d$, $k=1,\dots,l$, starting at $(a_1,b_1) := (a_i,b_i)$, ending at $(a_l,b_l) := (a_f,b_f)$ and following the rule
\[
(a_{k+1},b_{k+1}) := \tau^{I_k}(a_k,b_k), \qquad k=1,\dots,l-1,
\]
with some choice of indices $I_k\in\{L,R\}$ of the permutations.

Actually we can drop the condition that all these chains have a common length $l$, for the following reason. For all $a,b\in\mathbb{Z}/d\mathbb{Z}$ with $\gcd(a,d)=\gcd(d,b)=1$, the states $(a,d)$ resp. $(d,b)$ are in $U_d^d$. These states are fixed points of left addition $\tau_d^L$ resp. right addition $\tau_d^R$. Therefore if we have shown that the matrix $T_d^d$ is irreducible or equivalently that any given $(a_i,b_i),(a_f,b_f)\in U_d^d$ can be connected by some chain of states, we can connect $(a_i,b_i)$ to, say, $(1,d)$, perform an arbitrary number of left additions and then connect $(1,d)$ to $(a_f,b_f)$. This then implies the existence of chains between all states of $U_d^d$ with a common length $l$.

Secondly, it is sufficient to show the existence of a chain from an arbitrary state $(a_i,b_i)\in U_d^d$ to $(1,1)\in U_d^d$, since then by the group property of the set of permutations generated by $\tau_d^L$ and $\tau_d^R$ there also exists a chain from $(1,1)$ to an arbitrary state $(a_f,b_f)\in U_d^d$. We build the chain from $(a_i,b_i)$ to $(1,1)$ by joining a chain from $(a_i,b_i)$ to a state on the diagonal with a chain from that state to $(1,1)$.

To go from $(a_i,b_i)$ to the diagonal, we employ the Euclidean algorithm: We set $(a_1,b_1) := (a_i,b_i)$ and assume without loss of generality that $a_1>b_1$ (if $a_1=b_1$ we are already on the diagonal, and if $a_1<b_1$ we interchange $a_k$ and $b_k$ in the following construction). Setting $r_0 := a_1$, $r_1 := b_1$, by the Euclidean algorithm
\[
r_0 = q_1r_1+r_2, \qquad r_1 = q_2r_2+r_3, \qquad \dots, \qquad r_{n-2} = q_{n-1}r_{n-1}+r_n
\]
with $0<r_{i+1}<r_i$ for $i=1,\dots,n-2$, $0<q_i<d$ and $r_n=r_{n-1}$. This implies for the states in $U_d^d$ that
\[
\bigl(r_0+(d-q_1)r_1,\ r_1\bigr) = (r_2,r_1), \qquad \bigl(r_2,\ r_1+(d-q_2)r_2\bigr) = (r_2,r_3), \qquad \dots
\]
so that by first applying $d-q_1>0$ left additions, then $d-q_2>0$ right additions etc., we reach after finitely many (say, $k_1$) steps the element $(a_{k_1},b_{k_1}) = (r_{n-1},r_{n-1})\in U_d^d$ on the diagonal.

Since $a_{k_1}=b_{k_1}=r_{n-1}$, we have $\gcd(a_{k_1},b_{k_1}) = r_{n-1}$, and Lemma 3 implies $\gcd(a,b,d)=1$ for $(a,b)\in U_d^d$, so that $\gcd(r_{n-1},d)=1$. This implies that $r_{n-1}$, considered as an element of $\mathbb{Z}/d\mathbb{Z}$, is invertible, i.e. $q\,r_{n-1} = 1 \pmod d$ for some $q\ge 1$. Thus
\[
\bigl(a_{k_1}+(q-1)b_{k_1},\ b_{k_1}\bigr) = (1,\ b_{k_1})
\]
so that $q-1$ left additions
\[
(a_{i+1},b_{i+1}) := (a_i+b_i,\ b_i), \qquad i = k_1,\dots,k_1+q-2,
\]
lead to $(a_{k_2},b_{k_2}) = (1,b_{k_1})$ with $k_2 := k_1+q-1$. Finally, another $d+1-b_{k_1}$ right additions lead to $(a_{k_3},b_{k_3}) = (1,1)$.
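The statement of Lemma 4 can be tested numerically for small $d$ without any linear algebra, since an entry of $(T_d^d)^l$ is positive exactly when the corresponding state is reachable in exactly $l$ steps. The following sketch is an illustration, not the argument of the proof: the function names are assumptions, residues are taken in $0,\dots,d-1$, and the search is cut off at the Wielandt bound $(N-1)^2+1$ for primitive $N\times N$ matrices.

```python
from math import gcd

def states(d):
    """U_d^d: pairs of residues mod d with gcd(a, b, d) = 1."""
    return [(a, b) for a in range(d) for b in range(d)
            if gcd(gcd(a, b), d) == 1]

def primitive_exponent(d):
    """Smallest l such that every state reaches every state in exactly l steps, or None."""
    U = states(d)
    full = set(U)
    # One step of the chain: left addition or right addition (mod d).
    step = {w: {((w[0] + w[1]) % d, w[1]), (w[0], (w[0] + w[1]) % d)} for w in U}
    reach = {w: {w} for w in U}          # states reachable in exactly 0 steps
    bound = (len(U) - 1) ** 2 + 1        # Wielandt bound for primitive matrices
    for l in range(1, bound + 1):
        reach = {w: set().union(*(step[v] for v in reach[w])) for w in U}
        if all(r == full for r in reach.values()):
            return l
    return None
```

A return value other than `None` means that the $l$th power of $T_d^d$ has only strictly positive entries; for $d=2$ the three states are mutually reachable in exactly two steps.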
The Perron–Frobenius theorem, together with Lemma 4, implies the following facts:
1. The algebraic multiplicity of the eigenvalue 1 of the endomorphism $T_n^d$ is one, and the vector
\[
\mathbb{1}_n^d := \frac{1}{|U_n^d|}\sum_{x\in U_n^d}\delta_x \in V_n^d \tag{20}
\]
is eigenvector w.r.t. that eigenvalue.
2. Denoting by $\Pi_n^d$ the orthogonal projector on $V_n^d$ to $\mathrm{span}(\mathbb{1}_n^d)$, the spectral radius
\[
\mathrm{sr}\bigl(T_n^d-\Pi_n^d\bigr) < 1. \tag{21}
\]
By our direct sum decomposition,
\[
T_n \cong \bigoplus_{d\mid n} T_n^d \quad\text{on}\quad V_n \cong \bigoplus_{d\mid n} V_n^d, \tag{22}
\]
the algebraic multiplicity of the eigenvalue 1 of $T_n$ on $V_n$ being $d(n)\equiv\sum_{d\mid n}1$, and the spectral radius
\[
\mathrm{sr}\Bigl(T_n-\sum_{d\mid n}\Pi_n^d\,\Xi_n^d\Bigr) < 1.
\]
The smaller the spectral radius is, the faster the expectation $\langle f\rangle_k$ converges to its thermodynamical limit $\langle f\rangle_\infty$ as $k\to\infty$. An inequality
\[
\mathrm{sr}\bigl(T_n^d-\Pi_n^d\bigr) \le 1/\sqrt{2} \tag{23}
\]
would be in accordance with (10) but is not valid in general (the first $n\in\mathbb{N}$ for which the spectral radius of $T_n^n$ is strictly larger than $1/\sqrt{2}$ is $n=9$, the corresponding eigenvalues being roots of the polynomial $3+4x^2+16x^3+48x^4+64x^5+64x^6$). The probability vectors $v_{n,k}\in V_n$ are actually elements of the subspace $V_n^n$, since $v_{n,0}$ is supported on $(1,1)\in U_n^n$, by the relation $v_{n,k} = (T_n)^k(v_{n,0})$ following from (16), and (22). So by (14),
\[
\langle\chi_m\rangle_k = \bigl(\tilde\chi_m,\ (T_n^n)^k v_{n,0}\bigr). \tag{24}
\]
But since by definition (20)
\[
(\mathbb{1}_n^n, v_{n,k}) = (\mathbb{1}_n^n, (T_n)^k v_{n,0}) = ((T_n^{*})^k\mathbb{1}_n^n, v_{n,0}) = (\mathbb{1}_n^n, v_{n,0}) = 1/|U_n^n|,
\]
the Perron–Frobenius theorem implies
\[
\lim_{k\to\infty} v_{n,k} = \mathbb{1}_n^n = \Pi_n^n v_{n,0}. \tag{25}
\]
So the thermodynamic limit $\langle\chi_m\rangle_\infty$ defined in (9) is given by
\[
\langle\chi_m\rangle_\infty = (\tilde\chi_m, \mathbb{1}_m^m), \tag{26}
\]
and an explicit formula in terms of $m$ will be given by evaluation of the r.h.s.
4. The Matrices $\bar T_n$

The violation of the estimate (23) seems to outlaw our probabilistic approach to the Riemann Hypothesis. However, since $\tilde\chi_m((a,b))$, with $(a,b)\in\Omega_n$, in (14) is independent of $b$, we do not use the full information encoded in the transition matrix $T_n$ and may therefore reduce it. In this section we thus analyze a new Markov chain with transition matrix $\bar T_n$ which is derived from $T_n$ by lumping of states. $\bar T_n$ has smaller spectral radius than the old chain and suffices for our purpose. As remarked in Sect. 2, we are not only interested in the thermodynamic limit (26), but mainly in the deviation $\langle\chi_m\rangle_k-\langle\chi_m\rangle_\infty$ from that limit for a spin chain of length $k<\infty$. By (24) and (25) for all $m\in\mathbb{N}$ with $m\mid n$,
\[
\langle\chi_m\rangle_k-\langle\chi_m\rangle_\infty = \bigl(\tilde\chi_m, (T_n^n)^k v_{n,0}\bigr) - \bigl(\tilde\chi_m, \Pi_n^n v_{n,0}\bigr) = \bigl(\tilde\chi_m, (T_n^n-\Pi_n^n)^k v_{n,0}\bigr), \tag{27}
\]
the second equation following from $T_n^n\Pi_n^n = \Pi_n^n T_n^n = \Pi_n^n$. From (27) we see that
\[
|\langle\chi_m\rangle_k-\langle\chi_m\rangle_\infty| \le c\cdot\mathrm{sr}(T_n^n-\Pi_n^n)^k
\]
for some $k$-independent $c$, if $T_n^n$ is semi-simple. By the Perron–Frobenius inequality (21) this deviation vanishes exponentially in $k$. However we just remarked that for general $n$ the estimate (23) does not hold true. On the other hand we may use the invariance property $\tilde\chi_m(xa,xb) = \tilde\chi_m(a,b)$ valid for the $x\in\mathbb{Z}/n\mathbb{Z}$ with $\gcd(x,n)=1$ to improve the above estimate. The ring $\mathbb{Z}/n\mathbb{Z}$ acts on the $\mathbb{Z}/n\mathbb{Z}$ module $\Omega_n$ by multiplication. If $n\notin\mathbb{P}$, then $\mathbb{Z}/n\mathbb{Z}$ is not a field (and $\Omega_n$ not a vector space over $\mathbb{Z}/n\mathbb{Z}$). For general $n\in\mathbb{N}$ we consider the action
\[
\alpha_n: U(\mathbb{Z}/n\mathbb{Z})\times\Omega_n\to\Omega_n, \qquad (x,(a,b))\mapsto(xa,xb)
\]
of the multiplicative group $U(\mathbb{Z}/n\mathbb{Z}) := \{x\in\mathbb{Z}/n\mathbb{Z} \mid \gcd(x,n)=1\}$ which is of order $|U(\mathbb{Z}/n\mathbb{Z})| = \varphi(n)$. That group action leaves the ergodic sets of $T_n$ invariant:

Lemma 5. For $n\in\mathbb{N}$, $d\mid n$ and $x\in U(\mathbb{Z}/n\mathbb{Z})$,
\[
\alpha_n(x, U_n^d) = U_n^d.
\]
Proof. We show first that $\alpha_n(x,\tilde U_n^d) = \tilde U_n^d$, $x\in U(\mathbb{Z}/n\mathbb{Z})$. Since $\tilde U_n^d$ is of the form
\[
\tilde U_n^d = \{(k_1n/d,\ k_2n/d) \mid k_1,k_2 = 1,\dots,d\},
\]
it is obvious that $\alpha_n(x,\tilde U_n^d)\subset\tilde U_n^d$. But multiplication by $x\in U(\mathbb{Z}/n\mathbb{Z})$ is an automorphism of the finite set $\Omega_n$ so that the opposite inclusion holds, too. By an argument similar to the one employed to prove $\tau_n^L$-, $\tau_n^R$-invariance of $U_n^d$ in Lemma 3, we conclude that $\alpha_n(x,U_n^d) = U_n^d$, too.
So we obtain a refinement of the partition (19) of $\Omega_n$ into the sets $U_n^d$ by considering the orbits of the group action $\alpha_n$. We denote the restrictions of $\alpha_n$ to $U_n^d$ by $\alpha_n^d: U(\mathbb{Z}/n\mathbb{Z})\times U_n^d\to U_n^d$, the set of $\alpha_n$-orbits by $O_n := \Omega_n/U(\mathbb{Z}/n\mathbb{Z})$, and the set of $\alpha_n^d$-orbits by $O_n^d := U_n^d/U(\mathbb{Z}/n\mathbb{Z})$.

Lemma 6. For $n\in\mathbb{N}$, $d\mid n$ and $\omega\in U_n^d$ the cardinality of the isotropy group of $\omega$ equals
\[
\bigl|\{x\in U(\mathbb{Z}/n\mathbb{Z}) \mid \alpha_n^d(x,\omega)=\omega\}\bigr| = \varphi(n)/\varphi(d), \tag{28}
\]
and the cardinality of the orbit through $\omega$ equals
\[
\bigl|\alpha_n^d(U(\mathbb{Z}/n\mathbb{Z}),\omega)\bigr| = \varphi(d). \tag{29}
\]
The number of $\alpha_n^d$-orbits in $U_n^d$ equals
\[
|O_n^d| = d\prod_{p\in\mathbb{P},\,p\mid d}(1+1/p). \tag{30}
\]
Proof. The proof is based on the following fact (see, e.g., Thm 5.33 of [1]): Consider a reduced residue system $\hat U(\mathbb{Z}/n\mathbb{Z})$ for $U(\mathbb{Z}/n\mathbb{Z})$ and let $d\mid n$. Then $\hat U(\mathbb{Z}/n\mathbb{Z})$ is the disjoint union of $\varphi(d)$ sets, each of which consists of $\varphi(n)/\varphi(d)$ numbers congruent to each other (mod $d$).

To see how formula (28) for the cardinality of the isotropy group follows, we consider the elements $x\in\hat U(\mathbb{Z}/n\mathbb{Z})$ which are congruent to one (mod $d$). By the above fact, that set has cardinality $\varphi(n)/\varphi(d)$ and clearly is a subset of the isotropy group of $\omega$, since $\omega\in U_n^d$ is of the form $(k_1n/d,\ k_2n/d)$. On the other hand, let $x\in\hat U(\mathbb{Z}/n\mathbb{Z})$ be an element of the isotropy group of $\omega=(a,b)\in U_n^d$. Then $n\mid(x-1)a$ and $n\mid(x-1)b$ so that $n\mid(x-1)\gcd(a,b)$ and $n\mid(x-1)\gcd(a,b,n)$. But by (17) one has $\gcd(a,b,n)=n/d$ so that $n\mid(x-1)n/d$. This can only be the case if $x-1$ is a multiple of $d$, which shows the opposite inclusion.

The fact cited above also implies formula (29) for the cardinality of the orbit through $\omega$, since elements $x_1,x_2\in\hat U(\mathbb{Z}/n\mathbb{Z})$ which are not congruent (mod $d$) lead to different points $\alpha_n^d(x_1,\omega)\ne\alpha_n^d(x_2,\omega)$ of the orbit.

Remembering the product representation $\varphi(d) = d\prod_{p\in\mathbb{P},\,p\mid d}(1-1/p)$ of the Euler totient, we obtain the formula (30) for the number of orbits in $U_n^d$ by dividing the cardinality (18) of $U_n^d$ through the (constant) cardinality (29) of the orbits:
\[
|O_n^d| = \frac{d^2\cdot\prod_{p\in\mathbb{P},\,p\mid d}(1-p^{-2})}{d\cdot\prod_{p\in\mathbb{P},\,p\mid d}(1-p^{-1})} = d\cdot\prod_{p\in\mathbb{P},\,p\mid d}(1+p^{-1}).
\]
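The orbit counts of Lemma 6 can be confirmed by direct enumeration for small $n$. This sketch is illustrative only: the names `orbits`, `orbit_count_formula` and `totient` are assumptions introduced here, residues are represented by $0,\dots,n-1$, and the unit group is listed naively.

```python
from math import gcd

def totient(d):
    return sum(1 for j in range(1, d + 1) if gcd(j, d) == 1)

def orbits(n, d):
    """Orbits of U(Z/nZ) acting on U_n^d by (x, (a, b)) -> (xa, xb)."""
    units = [x for x in range(1, n + 1) if gcd(x, n) == 1]
    U = {(a, b) for a in range(n) for b in range(n)
         if n // gcd(gcd(a, b), n) == d}
    seen, result = set(), []
    for w in sorted(U):
        if w not in seen:
            orbit = {(x * w[0] % n, x * w[1] % n) for x in units}
            seen |= orbit
            result.append(orbit)
    return result

def orbit_count_formula(d):
    """d * prod_{p | d} (1 + 1/p), in exact integer arithmetic."""
    value = d
    for p in range(2, d + 1):
        if d % p == 0 and all(p % q for q in range(2, p)):
            value = value * (p + 1) // p
    return value
```

For a prime $d=n$ this returns $d+1$ orbits, each of size $d-1$; for instance `len(orbits(5, 5))` is 6.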
The isomorphism $U_n^d\to U_d^d$, $(a,b)\mapsto(ad/n,\,bd/n)$ maps $\alpha_n^d$-orbits onto $\alpha_d^d$-orbits and thus induces an isomorphism $O_n^d\to O_d^d$. So we need only consider the sets $O_d^d$ of orbits.
• If $d\in\mathbb{P}$, then by (30) $|O_d^d| = d+1$, and $O_d^d$ is isomorphic to the one-dimensional projective space: $O_d^d\cong P^1(\mathbb{Z}/d\mathbb{Z})$. Namely, for $(a,b)\in U_d^d$ the quotient $a/b$ only depends on the orbit $\alpha_d^d(U(\mathbb{Z}/d\mathbb{Z}),(a,b))\in O_d^d$ through the state $(a,b)$, and any two states with the same quotient lie in the same orbit.
• If $d$ is a prime power ($d=p^{\alpha}$ with $p\in\mathbb{P}$), then as a set $O_d^d\cong P^1(\mathbb{Z}/d\mathbb{Z})\times\mathbb{Z}/p^{\alpha-1}\mathbb{Z}$.
• If $d$ and $e$ are relatively prime ($\gcd(d,e)=1$), then for $f := de$, $O_f^f\cong O_d^d\times O_e^e$.

The endomorphism
\[
A_n: V_n\to V_n, \qquad (A_nf)(\omega) := |U(\mathbb{Z}/n\mathbb{Z})|^{-1}\sum_{x\in U(\mathbb{Z}/n\mathbb{Z})} f(x\cdot\omega)
\]
is an orthogonal projection on the space $\bar V_n := A_n(V_n)$ of functions which are invariant on the $\alpha_n$-orbits. By Lemma 5 the subspaces $\bar V_n^d := A_n(V_n^d)$ have the form $\bar V_n^d = V_n^d\cap\bar V_n$. Their dimensions are $\dim(\bar V_n^d) = |O_n^d|$, see (30). By its definition (20), the vector $\mathbb{1}_n^d$ lies in $\bar V_n^d$, that is, $A_n(\mathbb{1}_n^d) = \mathbb{1}_n^d$. We have $(A_n\delta_x, A_n\delta_y) = 0$ if $x,y\in U_n^d$ belong to different $\alpha_n$-orbits and $(A_n\delta_x, A_n\delta_y) = |U(\mathbb{Z}/n\mathbb{Z})|^{-1}$ otherwise. So we find an orthonormal basis $\{\bar e_\omega \mid \omega\in O_n^d\}$ of $\bar V_n^d$ by setting $\bar e_\omega := \sqrt{|U(\mathbb{Z}/n\mathbb{Z})|}\cdot A_n\delta_x$ for an arbitrary point $x\in U_n^d$ of the orbit $\omega\in O_n^d$.

By the distributive law, the permutations $\tau_n^L$ and $\tau_n^R$ of $\Omega_n$ defined in (15) map $\alpha_n$-orbits to $\alpha_n$-orbits and thus induce permutations $\bar\tau_n^L$ and $\bar\tau_n^R$ of $O_n$, leaving the subsets $O_n^d$ invariant. Similar to the last section, we denote the permutation representation of a permutation $\tau\in S(O_n)$ by $\bar P_\tau:\bar V_n\to\bar V_n$, and set $L_n := \bar P_{\bar\tau_n^L}$, $R_n := \bar P_{\bar\tau_n^R}$ for simplicity. The doubly stochastic matrix $T_n$ commutes with $A_n$, since
\[
A_nP_{\tau_n^L}f((a,b)) = |U(\mathbb{Z}/n\mathbb{Z})|^{-1}\sum_{x\in U(\mathbb{Z}/n\mathbb{Z})} f((x(a-b),\,xb)) = |U(\mathbb{Z}/n\mathbb{Z})|^{-1}\sum_{x\in U(\mathbb{Z}/n\mathbb{Z})} f(((xa)-(xb),\,xb)) = P_{\tau_n^L}A_nf((a,b))
\]
and similarly for $\tau_n^R$. Thus we may define $\bar T_n:\bar V_n\to\bar V_n$ by restriction $\bar T_n := T_n|_{\bar V_n}$. The relation
\[
\bar T_n = \tfrac12(L_n+R_n) \tag{31}
\]
holds true. The restrictions $\bar T_n^d := \bar T_n|_{\bar V_n^d}$ to the subspaces $\bar V_n^d$ define endomorphisms of these subspaces, so that
\[
\bar T_n \cong \bigoplus_{d\mid n}\bar T_n^d \quad\text{on}\quad \bar V_n \cong \bigoplus_{d\mid n}\bar V_n^d. \tag{32}
\]
The matrix representations of $\bar T_n$ and $\bar T_n^d$ w.r.t. the basis vectors $\bar e_\omega$ are denoted by the same symbols. By definition, the matrices $\bar T_n$ are doubly stochastic. $\bar T_n^d$ arises from $T_n^d$ by lumping together states in $U_n^d$ belonging to the same orbit. Thus $\bar T_n^d$ is irreducible-aperiodic since $T_n^d$ has that property (see Lemma 4).
Example. For the primes $n=d=2$ resp. 3 one has
\[
\bar T_2^2 = \frac12\begin{pmatrix} 0&1&1\\ 1&1&0\\ 1&0&1 \end{pmatrix}, \qquad \bar T_3^3 = \frac12\begin{pmatrix} 0&2&0&0\\ 0&0&1&1\\ 1&0&1&0\\ 1&0&0&1 \end{pmatrix} \tag{33}
\]
in the basis corresponding to the enumeration $(1,\dots,d,\infty)$ of the projective space $P^1(\mathbb{Z}/d\mathbb{Z})$.

By Perron–Frobenius, the eigenvalue one of $\bar T_n^d$ has algebraic multiplicity one, with eigenvector $\mathbb{1}_n^d$, and the spectral radius
\[
\mathrm{sr}\bigl(\bar T_n^d-\bar\Pi_n^d\bigr) < 1, \tag{34}
\]
$\bar\Pi_n^d := \Pi_n^d|_{\bar V_n^d}$ being the orthogonal projector on $\mathrm{span}(\mathbb{1}_n^d)$. It was remarked in the beginning of this section that $\tilde\chi_m:\Omega_n\to\{0,1\}$ is constant on the $\alpha_n$-orbits so that $A_n\tilde\chi_m = \tilde\chi_m$. This implies that
\[
\langle\chi_m\rangle_k = \bigl(\tilde\chi_m, (\bar T_n^n)^k\bar v_{n,0}\bigr) \tag{35}
\]
with $\bar v_{n,k} := A_nv_{n,k}$, since by (24)
\[
\langle\chi_m\rangle_k = \bigl(\tilde\chi_m, (T_n^n)^kv_{n,0}\bigr) = \bigl(A_n\tilde\chi_m, (T_n^n)^kv_{n,0}\bigr) = \bigl(\tilde\chi_m, A_n(T_n^n)^kv_{n,0}\bigr)
\]
and $A_nT_n^n = T_n^nA_n$.

5. The Spectrum of $\bar T_n$

Now we will analyze $\bar T_n^d$ with more precision, in order to show that (unlike in the case of $T_n^d$) the spectral radius $\mathrm{sr}(\bar T_n^d-\bar\Pi_n^d)\le 1/\sqrt{2}$. By the isomorphism $O_n^d\to O_d^d$ mentioned above it suffices to consider the special case $d=n$. The first remark which will be of importance in the determination of the spectrum $\sigma(\bar T_n^d)$ is that the group of permutations of the set $O_n^d$ generated by $\bar\tau^L$ and $\bar\tau^R$ is much smaller than the full permutation group $S(O_n^d)$. Writing the representatives $(a,b)\in U_n^n$ of the orbits in the form of column vectors $\binom{a}{b}$, left addition $\tau_n^L:\binom{a}{b}\mapsto\binom{a+b}{b}$ is represented by the matrix $\begin{pmatrix}1&1\\0&1\end{pmatrix}$, and right addition $\tau_n^R:\binom{a}{b}\mapsto\binom{a}{a+b}$ is represented by the matrix $\begin{pmatrix}1&0\\1&1\end{pmatrix}$. So the subgroup of permutations generated by $\bar\tau_n^L$ and $\bar\tau_n^R$ is isomorphic to the group of matrices in $\mathrm{Mat}(2,\mathbb{Z}/n\mathbb{Z})$ generated by $\begin{pmatrix}1&1\\0&1\end{pmatrix}$ and $\begin{pmatrix}1&0\\1&1\end{pmatrix}$.
Example. If $n$ is a prime number, then we recover the group of Möbius transformations
\[ z \mapsto \frac{az+b}{cz+d}, \qquad z \in P^1(\mathbb{Z}/n\mathbb{Z}), \]
with $a, b, c, d \in \mathbb{Z}/n\mathbb{Z}$ and $ad - bc = 1$.
Now we represent right addition $\tau^R$ in the form $\tau^R = \tau^I\tau^L\tau^I$, with the permutation $\tau^I(a,b) := (b,a)$, $(a,b) \in U_n^d$. For simplicity we introduce $\tau^M(a,b) := (a,-b)$ as a further permutation on $U_n^d$. As both of these order two permutations act on orbits, they induce permutations $\bar\tau^I$ and $\bar\tau^M$ on $O_n^d$, and $\bar\tau^I\bar\tau^M = \bar\tau^M\bar\tau^I$ although $\tau^I\tau^M \ne \tau^M\tau^I$ on $U_n^d$ for $d > 2$: $\tau^I\tau^M(a,b) = (-b,a)$ and $\tau^M\tau^I(a,b) = (b,-a)$, but $(b,-a) = -1\cdot(-b,a)$ belongs to the orbit through $(-b,a)$. Moreover, we use the shorthands
\[ L := \bar P_{\bar\tau^L}, \qquad R := \bar P_{\bar\tau^R}, \tag{36} \]
$I := \bar P_{\bar\tau^I}$ and $M := \bar P_{\bar\tau^M}$ for the matrices of the permutation representations, and $\bar T \equiv \bar T_d^d$. The following identities (depicted in Fig 3) turn out to be useful:
Lemma 7. $J := MI = LR^{-1}L = L^{-1}RL^{-1} = R^{-1}LR^{-1} = RL^{-1}R = IM$.
Proof. $MI = LR^{-1}L$ since
\[ \tau^L(\tau^R)^{-1}\tau^L(a,b) = \tau^L(\tau^R)^{-1}(a+b,b) = \tau^L(a+b,-a) = (b,-a) = \tau^M\tau^I(a,b). \]
Similarly $IM = RL^{-1}R$. Then we note that $IM = MI = (MI)^{-1}$ since $\bar\tau^I\bar\tau^M = \bar\tau^M\bar\tau^I$, and since both $\bar\tau^I$ and $\bar\tau^M$ are of order two. So $LR^{-1}L = L^{-1}RL^{-1}$ and $R^{-1}LR^{-1} = RL^{-1}R$.
Lemma 8. $\bar T^{-1} = 2\bar T^* - J$.
Proof. We show that $2\bar T\bar T^* - \bar TJ = \mathbb{1}$. Using $\bar T = \frac12(L+R)$, and the orthogonality of the permutation matrices, we have
\[ 2\bar T\bar T^t = \tfrac12(L+R)(L^{-1}+R^{-1}) = \mathbb{1} + \tfrac12(LR^{-1}+RL^{-1}). \]
So we must prove that $\bar TJ = \frac12(LR^{-1}+RL^{-1})$. But
\[ 2\bar TJ = LJ + RJ = L(L^{-1}RL^{-1}) + R(R^{-1}LR^{-1}) = RL^{-1} + LR^{-1}, \]
using Lemma 7.
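Lemmas 7 and 8 are finite matrix identities, so they can be checked exactly for small primes. The following Python sketch is not part of the original argument. It assumes that, for prime $d$, the orbits are represented by the points $z = a/b$ of $P^1(\mathbb{Z}/d\mathbb{Z})$ enumerated as $(1, \ldots, d, \infty)$, and that a permutation $\tau$ is encoded by the matrix with entry 1 at position $(\omega, \tau(\omega))$, which is the convention that reproduces (33). All function names are illustrative.

```python
from fractions import Fraction

def points(d):
    """Points of P^1(Z/dZ), d prime, enumerated as (1, ..., d, oo); d stands for 0."""
    return [(z % d, 1) for z in range(1, d + 1)] + [(1, 0)]

def normalize(a, b, d):
    """Representative (z, 1) or (1, 0) of the orbit through (a, b) != (0, 0)."""
    a, b = a % d, b % d
    return (1, 0) if b == 0 else (a * pow(b, -1, d) % d, 1)

def perm_matrix(tau, d):
    """Matrix with entry 1 at (row omega, column tau(omega))."""
    pts = points(d)
    idx = {p: i for i, p in enumerate(pts)}
    P = [[Fraction(0)] * len(pts) for _ in pts]
    for p in pts:
        P[idx[p]][idx[normalize(*tau(*p), d)]] = Fraction(1)
    return P

def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def operators(d):
    """Return L, R, I, M and the reduced transition matrix T = (L + R)/2."""
    L = perm_matrix(lambda a, b: (a + b, b), d)   # left addition
    R = perm_matrix(lambda a, b: (a, a + b), d)   # right addition
    I = perm_matrix(lambda a, b: (b, a), d)
    M = perm_matrix(lambda a, b: (a, -b), d)
    T = [[(l + r) / 2 for l, r in zip(lrow, rrow)] for lrow, rrow in zip(L, R)]
    return L, R, I, M, T

def check_lemmas(d):
    """Return (Lemma 7 holds, Lemma 8 holds) for the prime d, in exact arithmetic."""
    L, R, I, M, T = operators(d)
    Linv, Rinv = transpose(L), transpose(R)       # permutation matrices are orthogonal
    J = mul(M, I)
    n = len(T)
    one = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    lemma7 = (J == mul(I, M) == mul(mul(L, Rinv), L) == mul(mul(Linv, R), Linv)
              == mul(mul(Rinv, L), Rinv) == mul(mul(R, Linv), R))
    Tinv = [[2 * T[j][i] - J[i][j] for j in range(n)] for i in range(n)]  # 2 T^* - J
    lemma8 = mul(T, Tinv) == one
    return lemma7, lemma8
```

Calling `check_lemmas(d)` for a small prime `d` returns a pair of booleans, and `operators(3)[4]` gives the matrix $\bar T_3^3$ of (33). Exact rational arithmetic is used so that no numerical tolerance has to be chosen.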
[Figure 3: diagram connecting the points $z$, $z+1$, $\frac{z}{z+1}$, $\frac{-1}{z+1}$, $\frac{-z-1}{z}$ and $\frac{-1}{z}$ by arrows labelled $L$, $R^{-1}$ and $J$.]
Fig. 3. Relations between the Möbius transformations and the matrices of the permutation representation
Proposition 9. If $t \in \mathbb{C}$ is an eigenvalue of $\bar T_n^d - \bar\Pi_n^d$, then $t \in (-1, -\frac12] \cup [\frac12, 1)$ or $|t| = 1/\sqrt2$.
Proof. Let $f \in \bar V_n^d$, $(f,f) = 1$ be an eigenvector with eigenvalue $t$. Then from Lemma 8 we conclude that
\[ 2\bar t - t^{-1} = (f, Jf). \tag{37} \]
Moreover, $-1 \le (f, Jf) \le 1$ since $J$ is a self-adjoint involution. Writing the l.h.s. of (37) in the form $\bar t(2 - |t|^{-2})$, one sees that $|t| = 1/\sqrt2$ if $t \notin \mathbb{R}$. If $t \in \mathbb{R}$, one obtains from (37) $t \in [-1, -\frac12] \cup [\frac12, 1]$. On the other hand, we already remarked in (34) that by the Perron–Frobenius theorem the spectral radius $\mathrm{sr}(\bar T_n^d - \bar\Pi_n^d) < 1$.
In order to proceed in the spectral analysis of $\bar T$, we introduce the operators
\[ Y^+ := \tfrac13(\mathbb{1} + R^{-1}L + L^{-1}R) = IY^+I \tag{38} \]
and
\[ Y^- := \tfrac13(\mathbb{1} + LR^{-1} + RL^{-1}) = JY^+J \tag{39} \]
which by Lemma 7 are ortho-projections. Geometrically $Y^+$ ($Y^-$) corresponds to the mean over the orbits generated by the order three transformation $(\bar\tau^R)^{-1}\bar\tau^L$ (resp. $\bar\tau^L(\bar\tau^R)^{-1}$). Considering a pair of projectors $Y^+$, $Y^-$ it is generally useful to introduce the operators
\[ A := Y^+ - Y^- \qquad\text{and}\qquad B := \mathbb{1} - Y^+ - Y^-, \]
see Avron, Seiler and Simon [2]. These meet the relations
\[ A^2 + B^2 = \mathbb{1}, \qquad AB + BA = 0 \]
and $[A^2, Y^\pm] = [B^2, Y^\pm] = 0$. In our case the non-normal operator $\bar T$ is related to the self-adjoint operator $B$ by the following formula
Lemma 10. $(2\bar T + \bar T^{-1})^2 = 9B^2$.
Proof. With the help of Lemma 7 and Lemma 8 one shows the identities
\[ \bar T = \tfrac32 JY^+ - \tfrac12 J, \qquad \bar T^{-1} = 3Y^+J - 2J. \]
So
\[ 2\bar T + \bar T^{-1} = 3(JY^+ + Y^+J - J) = -3JB = -3BJ. \]
This implies the formula, since $J^2 = \mathbb{1}$.
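The relations for $Y^\pm$, $A$, $B$ and Lemma 10 can likewise be verified exactly for small primes. The sketch below is illustrative only: it assumes the same encoding as (33) (points of $P^1(\mathbb{Z}/d\mathbb{Z})$ ordered $(1, \ldots, d, \infty)$, entry 1 at position $(\omega, \tau(\omega))$), it takes $\bar T^{-1} = 2\bar T^* - J$ from Lemma 8, and the helper names `perm_matrix`, `comb` and `check_lemma10` are not from the paper.

```python
from fractions import Fraction

def perm_matrix(tau, d):
    """Permutation matrix of tau on P^1(Z/dZ), d prime, points ordered (1, ..., d, oo)."""
    pts = [(z % d, 1) for z in range(1, d + 1)] + [(1, 0)]
    idx = {p: i for i, p in enumerate(pts)}
    def norm(a, b):
        a, b = a % d, b % d
        return (1, 0) if b == 0 else (a * pow(b, -1, d) % d, 1)
    P = [[Fraction(0)] * len(pts) for _ in pts]
    for p in pts:
        P[idx[p]][idx[norm(*tau(*p))]] = Fraction(1)
    return P

def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def comb(*terms):
    """Linear combination: comb((c1, A1), (c2, A2), ...) = c1*A1 + c2*A2 + ..."""
    n = len(terms[0][1])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(n)] for i in range(n)]

def check_lemma10(d):
    L = perm_matrix(lambda a, b: (a + b, b), d)
    R = perm_matrix(lambda a, b: (a, a + b), d)
    Li = [list(c) for c in zip(*L)]               # inverse = transpose
    Ri = [list(c) for c in zip(*R)]
    n = len(L)
    one = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    zero = [[0] * n for _ in range(n)]
    third = Fraction(1, 3)
    Yp = comb((third, one), (third, mul(Ri, L)), (third, mul(Li, R)))   # (38)
    Ym = comb((third, one), (third, mul(L, Ri)), (third, mul(R, Li)))   # (39)
    A = comb((1, Yp), (-1, Ym))
    B = comb((1, one), (-1, Yp), (-1, Ym))
    J = mul(mul(L, Ri), L)                                              # Lemma 7
    T = comb((Fraction(1, 2), L), (Fraction(1, 2), R))
    Tinv = comb((2, [list(c) for c in zip(*T)]), (-1, J))               # Lemma 8
    W = comb((2, T), (1, Tinv))
    return {
        "projections": mul(Yp, Yp) == Yp and mul(Ym, Ym) == Ym,
        "A^2 + B^2 = 1": comb((1, mul(A, A)), (1, mul(B, B))) == one,
        "AB + BA = 0": comb((1, mul(A, B)), (1, mul(B, A))) == zero,
        "T Tinv = 1": mul(T, Tinv) == one,
        "Lemma 10": mul(W, W) == comb((9, mul(B, B))),
    }
```

`check_lemma10(5)` returns a dictionary of booleans, one per identity, so a failing relation would be visible by name.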
In particular Lemma 10 shows that the algebraic and geometric multiplicity of an eigenvalue $t$ of the Markov transition matrix $\bar T$ coincide if $t \ne \pm1/\sqrt2$ (since $2\bar T + \bar T^{-1}$ is self-adjoint).
We now split the Hilbert space $\bar V \equiv \bar V_n^d$ on which $\bar T$ acts into the orthogonal direct sum
\[ \bar V = \bar V^{\ker} \oplus \bar V^{\mathrm{ran}} \tag{40} \]
with
\[ \bar V^{\ker} := \ker(Y^+) \cap \ker(Y^-), \qquad \bar V^{\mathrm{ran}} := \mathrm{ran}(Y^+) + \mathrm{ran}(Y^-). \]
This splitting induces a decomposition of our operators $B$ and $\bar T$. We begin with the simpler piece. By definition of $B$, the restriction of $B$ to $\bar V^{\ker}$ is the identity operator. By Lemma 10 an eigenvalue $\pm1$ of $B$ corresponds to an eigenvalue $\pm\frac12$ or $\pm1$ of $\bar T$. We already know that the multiplicity of the eigenvalue one of $\bar T$ is one and that $-1$ does not occur in its spectrum ($\bar T$ being an irreducible-aperiodic doubly stochastic operator). So $\bar V^{\ker}$ equals the orthogonal sum of the eigenvalue $\frac12$ and $-\frac12$ subspaces:
\[ \bar V^{\ker} = \ker(\bar T - \tfrac12\mathbb{1}) \oplus \ker(\bar T + \tfrac12\mathbb{1}). \]
Both of them have approximately dimension $d/6$:
Proposition 11. Let $d \in \mathbb{N}$ be a prime number. Then the dimensions $D_{\pm\frac12} := \dim\ker(\bar T \mp \frac12\mathbb{1})$ of the eigenvalue $\pm\frac12$ subspaces of $\bar T \equiv \bar T_d^d$ equal
• for $d = 2$, $D_{+\frac12} = D_{-\frac12} = 1$,
• for $d = 3$, $D_{+\frac12} = 1$ whereas $D_{-\frac12} = 0$,
•
\[
\begin{array}{c|cc}
d \pmod{12} & D_{+\frac12} & D_{-\frac12} \\ \hline
1 & \frac{d-1}{6} - 1 & \frac{d-1}{6} \\
5 & \frac{d+1}{6} & \frac{d+1}{6} + 1 \\
7 & \frac{d-1}{6} & \frac{d-1}{6} - 1 \\
11 & \frac{d+1}{6} + 1 & \frac{d+1}{6}
\end{array} \tag{41}
\]
Proof. The statements for the primes 2 and 3 follow from direct inspection of the matrices (33). So we may assume $2 \nmid d$ and $3 \nmid d$. Lemma 8 implies that the involution $J$, restricted to the $\bar T$-eigenspaces for eigenvalues $\pm\frac12$, equals $\mp1$. So
\[ \ker(\bar T \mp \tfrac12\mathbb{1}) = \bar V^{\ker} \cap \ker(J \pm \mathbb{1}). \]
By the relation (39) between $Y^-$ and $Y^+$ this eigenspace is also characterized by
\[ \ker(\bar T \mp \tfrac12\mathbb{1}) = \ker(Y^+) \cap \ker(J \pm \mathbb{1}). \]
So we count the dimension of the r.h.s., beginning with $\ker(J \pm \mathbb{1})$.
1) The involution $J$ corresponds to the permutation $z \mapsto -1/z$ on $P^1(\mathbb{Z}/d\mathbb{Z})$. This permutation has
• for $d = -1 \pmod 4$ no fixed points
• for $d = 1 \pmod 4$ the two fixed points $\pm\sqrt{-1} \in P^1(\mathbb{Z}/d\mathbb{Z})$,
since the Jacobi symbol equals $\left(\frac{-1}{d}\right) = (-1)^{\frac12(d-1)}$.
The eigenvalue equation $J\varphi = \mu\varphi$ gives one independent equation per pair $(z, -1/z)$ with $z \ne -1/z$. If and only if $\mu = -1$, the fixed points $z = -1/z$ give additional equations (namely $\varphi(z) = 0$). So we obtain the following numbers of independent equations:
\[
\begin{array}{c|cc}
d \pmod{12} & \mu = -1 & \mu = 1 \\ \hline
1 & \frac{d+1}{2} + 1 & \frac{d+1}{2} - 1 \\
5 & \frac{d+1}{2} + 1 & \frac{d+1}{2} - 1 \\
7 & \frac{d+1}{2} & \frac{d+1}{2} \\
11 & \frac{d+1}{2} & \frac{d+1}{2}
\end{array}
\]
2) Next we determine $\dim(\ker(Y^+))$. This equals the dimension $d + 1$ of our Hilbert space $\bar V$, minus the number of orbits of the permutation $z \mapsto -1 - 1/z$ on $P^1(\mathbb{Z}/d\mathbb{Z})$. Since by quadratic reciprocity the Jacobi symbol
\[ \left(\frac{-3}{d}\right) = (-1)^{\frac12(d-1)}\left(\frac{3}{d}\right) = \left(\frac{d}{3}\right), \]
this permutation has
• for $d = -1 \pmod 6$ no fixed points. Thus every orbit consists of three points, and we have $\frac{d+1}{3}$ orbits.
• for $d = 1 \pmod 6$ the two different fixed points $-\frac12(1 + \sqrt{-3})$ and $-\frac12(1 - \sqrt{-3})$ corresponding to the roots of $z^2 + z + 1$ in $P^1(\mathbb{Z}/d\mathbb{Z})$. So in that case we have $\frac{d-1}{3} + 2$ orbits.
However, the set of equations obtained from the condition $\varphi \in \ker(Y^+)$ is not independent from the set of equations obtained by the eigenvalue equation $J\varphi = \mu\varphi$ if $\mu = -1$. Namely we may obtain from $J\varphi = -\varphi$ that $\varphi$ has zero mean value ($\sum_{z \in P^1(\mathbb{Z}/d\mathbb{Z})} \varphi(z) = 0$), a property which already follows from $Y^+\varphi = 0$. So $\varphi \in \ker(Y^+)$ gives us at most
\[
\begin{array}{c|cc}
d \pmod{12} & \mu = -1 & \mu = 1 \\ \hline
1 & \frac{d-1}{3} + 1 & \frac{d-1}{3} + 2 \\
5 & \frac{d+1}{3} - 1 & \frac{d+1}{3} \\
7 & \frac{d-1}{3} + 1 & \frac{d-1}{3} + 2 \\
11 & \frac{d+1}{3} - 1 & \frac{d+1}{3}
\end{array}
\]
independent additional equations.
3) Subtracting the above two lists from the dimension $d + 1$ of the Hilbert space $\bar V$, we see that the dimensions $D_{\pm\frac12}$ are at least as large as stated in our lemma.
To obtain the upper bounds for $D_{\pm\frac12}$, we compare dimensions in (41). As calculated above,
\[ \dim(\mathrm{ran}(Y^+)) = (d+1)/3 \quad\text{if}\quad d = -1 \pmod 6 \tag{42} \]
and
\[ \dim(\mathrm{ran}(Y^+)) = (d-1)/3 + 2 \quad\text{if}\quad d = 1 \pmod 6. \tag{43} \]
The same holds true for the projector $Y^- = JY^+J$. However, there is exactly one relation between the equations characterizing $\ker(Y^+)$ and $\ker(Y^-)$. Each $z \in P^1(\mathbb{Z}/d\mathbb{Z})$ appears exactly once in both sets of equations, which are of the form
\[ \varphi(z) + \varphi(-1 - 1/z) + \varphi(-1/(z+1)) = 0 \tag{44} \]
for $\varphi \in \ker(Y^+)$ respectively
\[ \varphi(z) + \varphi(1 - 1/z) + \varphi(-1/(z-1)) = 0 \tag{45} \]
for $\varphi \in \ker(Y^-)$. All coefficients of these equations equal $+1$. So the sum of the equations (44) minus the sum of the equations (45) equals zero. But this is the only relation between eqs. (44) and (45). So by (42) and (43) the dimension of $\bar V^{\ker} = \ker(Y^+) \cap \ker(Y^-)$ equals
\[ \dim(\bar V) - \dim(\mathrm{ran}(Y^+)) - \dim(\mathrm{ran}(Y^-)) + 1 = d + 2 - 2\dim(\mathrm{ran}(Y^+)) \]
and thus $D_{+\frac12} + D_{-\frac12}$, proving the lemma.
For general integers $d$ the dimensions $D_{\pm\frac12}$ may be determined along the same lines.
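The dimension count of Proposition 11 can be compared with a direct rank computation. The following sketch is only a plausibility check under stated assumptions: it builds $\bar T_d^d$ for prime $d$ on $P^1(\mathbb{Z}/d\mathbb{Z})$ with left addition $z \mapsto z+1$ and right addition $z \mapsto z/(z+1)$, computes $D_{\pm\frac12}$ as the exact nullity of $\bar T \mp \frac12\mathbb{1}$, and encodes the table (41) in the hypothetical helper `table_41`.

```python
from fractions import Fraction

def transition_matrix(d):
    """Reduced transition matrix on P^1(Z/dZ), d prime, points ordered (1, ..., d, oo)."""
    pts = [(z % d, 1) for z in range(1, d + 1)] + [(1, 0)]
    idx = {p: i for i, p in enumerate(pts)}
    def norm(a, b):
        a, b = a % d, b % d
        return (1, 0) if b == 0 else (a * pow(b, -1, d) % d, 1)
    T = [[Fraction(0)] * len(pts) for _ in pts]
    for (a, b) in pts:
        for image in ((a + b, b), (a, a + b)):    # left and right addition
            T[idx[(a, b)]][idx[norm(*image)]] += Fraction(1, 2)
    return T

def rank(A):
    """Rank by Gaussian elimination over the rationals."""
    A = [row[:] for row in A]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
    return r

def eigen_dims(d):
    """(D_{+1/2}, D_{-1/2}): dimensions of the eigenvalue +1/2 and -1/2 subspaces."""
    T = transition_matrix(d)
    n = len(T)
    def shifted(t):
        return [[T[i][j] - (t if i == j else 0) for j in range(n)] for i in range(n)]
    half = Fraction(1, 2)
    return n - rank(shifted(half)), n - rank(shifted(-half))

def table_41(d):
    """Values of table (41) for a prime d > 3."""
    base = (d - 1) // 6 if d % 6 == 1 else (d + 1) // 6
    plus, minus = {1: (-1, 0), 5: (0, 1), 7: (0, -1), 11: (1, 0)}[d % 12]
    return base + plus, base + minus
```

For example, `eigen_dims(5)` can be compared with `table_41(5)`; larger primes can be tried in the same way, at the cost of cubic running time in $d$.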
6. Ramanujan Graphs

Now we turn to the part of the spectrum of $\bar T$ which belongs to the subspace $\bar V^{\mathrm{ran}}$ in the orthogonal decomposition (40) of the Hilbert space $\bar V \equiv \bar V_d^d$. The constant eigenfunction with eigenvalue 1 belongs to that subspace. The dimension of the co-dimension one subspace $\bar V^{\mathrm{ran},\perp} \subset \bar V^{\mathrm{ran}}$ of functions with zero mean is even and follows from (40) and (41) which give
\[ \dim(\bar V^{\mathrm{ran},\perp}) = \dim(\bar V) - 1 - D_{+\frac12} - D_{-\frac12}. \]
So by Prop. 11 it is 0 for $d = 2$, and 2 for $d = 3$. For $d \in \mathbb{P}$ one has $\dim(\bar V^{\mathrm{ran},\perp}) = \frac23(d-1) + 2$ if $d = 1 \pmod 6$ and $\frac23(d+1) - 2$ if $d = -1 \pmod 6$.
We know from Prop. 9 that the non-real eigenvalues of $\bar T$, which are all associated to its restriction to $\bar V^{\mathrm{ran},\perp}$, have absolute value $1/\sqrt2$.
We would like to show that there are no real eigenvalues $t$ of $\bar T|_{\bar V^{\mathrm{ran},\perp}}$, since these would in general enlarge the spectral radius. For the first 50 primes we checked this property, see Fig 4.
[Figure 4: plot titled "Eigenvalues, n=229"; both axes run from −1 to 1.]
Fig. 4. The spectrum of the reduced Markov transition matrix T¯ for the 50th prime n = d = 229
By Lemma 10 there do not exist such additional real eigenvalues if
\[ \mathrm{spec}\bigl(B|_{\bar V^{\mathrm{ran},\perp}}\bigr) \subset \Bigl(-\tfrac{\sqrt8}{3}, \tfrac{\sqrt8}{3}\Bigr). \]
Without restriction to the subspace this corresponds to the property
\[ \mathrm{spec}(B) \subset \{-1\} \cup \Bigl(-\tfrac{\sqrt8}{3}, \tfrac{\sqrt8}{3}\Bigr) \cup \{1\} \tag{46} \]
of $B = \mathbb{1} - Y^+ - Y^-$. We will now relate relation (46) to the so-called Ramanujan property of certain graphs. The key observation is that by (36) the definition (38), (39) of the ortho-projectors $Y^\pm$ is related to the action of the matrices $M_\pm \in SL(2, \mathbb{Z}/d\mathbb{Z})$,
\[ M_+ := \begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix} \qquad\text{and}\qquad M_- := \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}. \tag{47} \]
These group elements of order three are conjugated by
\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} M_+ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = M_-^2 \]
and $M_+ \ne M_-$ iff $d > 2$. $M_+$ corresponds to the transformation $R^{-1}L$, whereas $M_-$ corresponds to $RL^{-1}$. Of course, $M_+$ and $M_-$ also act by left transformations on the group $SL(2, \mathbb{Z}/d\mathbb{Z})$. For $d > 1$ the orbits of these actions are of size three. For $d > 2$ the $M_+$-orbit and the $M_-$-orbit through $g \in SL(2, \mathbb{Z}/d\mathbb{Z})$ have only $g$ in common.
Definition 12. We denote by $V_+$ ($V_-$) the set of $M_+$ ($M_-$)-orbits and consider $V := V_+ \cup V_-$ as the vertex set of an undirected graph $G_d = (V, E)$. A pair $\{v_+, v_-\}$, $v_\pm \in V$ of vertices belongs to the set $E$ of edges iff $v_+ \in V_+$, $v_- \in V_-$ and the orbits $v_+$ and $v_-$ contain a common group element $g \in SL(2, \mathbb{Z}/d\mathbb{Z})$.
For $d > 2$ (and we will henceforth consider only that case) the graph $G_d$ is three-regular, that is, any vertex has three adjacent edges, and it is connected, that is, any two vertices in $V$ are connected by a chain of edges in $E$. $E$ is then naturally isomorphic to $SL(2, \mathbb{Z}/d\mathbb{Z})$. Since $V_-$ and $V_+$ are disjoint, $G_d$ is bipartite.
Example. For d = 3 the order of the group SL(2, Z/dZ) is 24. So |V | = 2 × 8 = 16. In this case G3 can be visualized as follows: Attach two vertices at each corner of a cube, one inside, and one outside the cube. Connect vertices along the edges of the cube, changing between inside and outside along three edges of maximal distance, see Fig 5. By using PSL(2, Z/3Z) instead of SL(2, Z/3Z), we obtain the graph of the cube.
Fig. 5. The graph G3 for the group SL(2, Z/3Z)
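The construction of Def. 12 is easy to carry out by machine for small $d$. The sketch below is one possible encoding, not the author's: group elements are stored as 4-tuples $(a, b, c, e)$ for the matrix with rows $(a, b)$ and $(c, e)$, vertices are tagged with `"+"` or `"-"` to keep $V_+$ and $V_-$ apart, and every group element $g$ is treated as the edge joining its two orbits. The function names are illustrative.

```python
from itertools import product

def sl2(d):
    """All matrices over Z/dZ with determinant 1, as 4-tuples (a, b, c, e)."""
    return [m for m in product(range(d), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % d == 1 % d]

def mat_mul(x, y, d):
    a, b, c, e = x
    p, q, r, s = y
    return ((a * p + b * r) % d, (a * q + b * s) % d,
            (c * p + e * r) % d, (c * q + e * s) % d)

def orbit(m, g, d):
    """Orbit {g, m g, m^2 g} of g under left multiplication by the order-three element m."""
    mg = mat_mul(m, g, d)
    return frozenset((g, mg, mat_mul(m, mg, d)))

def graph(d):
    """Map each group element g (an edge) to its pair of end vertices (v_+, v_-)."""
    m_plus = (-1 % d, -1 % d, 1, 0)     # M_+ of (47)
    m_minus = (-1 % d, 1, -1 % d, 0)    # M_- of (47)
    return {g: (("+", orbit(m_plus, g, d)), ("-", orbit(m_minus, g, d)))
            for g in sl2(d)}

def degrees(edges):
    """Number of edges meeting each vertex."""
    deg = {}
    for ends in edges.values():
        for v in ends:
            deg[v] = deg.get(v, 0) + 1
    return deg

def is_connected(edges):
    """Depth-first search over the vertex set."""
    adj = {}
    for v, w in edges.values():
        adj.setdefault(v, set()).add(w)
        adj.setdefault(w, set()).add(v)
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(adj)
```

For $d = 3$, `graph(3)` has 24 edges and `degrees(graph(3))` has 16 keys, in agreement with the example above. The adjacency matrix needed for $\mu_b$ can be read off from the same dictionary.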
Now we consider the Laplacian $\Delta$ of the graph $G_d$. We remind the reader of the definition of the Laplacian $\Delta$ of a graph $G = (V, E)$.
Consider the Hilbert space $H := H_0 \oplus H_1$ with $H_0 := l^2(V)$ (with counting measure) and $H_1$ being isomorphic to $l^2(E)$. More precisely we double the unoriented edges by setting $\vec E := \{(v, w) \mid \{v, w\} \in E\}$ and consider the subspace
\[ H_1 := \{f \in l^2(\vec E) \mid f((w, v)) = -f((v, w))\} \]
with inner product $(f, g) := \frac12 \sum_{e \in \vec E} \bar f(e) g(e)$. Then the adjoint of
\[ d : H_0 \to H_1, \qquad df((v, w)) := f(w) - f(v) \]
equals
\[ d^* : H_1 \to H_0, \qquad d^*g(v) = -\sum_{(v,w) \in \vec E} g((v, w)). \]
As usual one defines $\Delta : H \to H$ by $\Delta := d^*d + dd^*$, so that $\Delta = \Delta_0 \oplus \Delta_1$ with
\[ \Delta_0 f(v) = \sum_{\{v,w\} \in E} (f(v) - f(w)) \]
and
\[ \Delta_1 g((v, w)) = \sum_{(v,x) \in \vec E} g((v, x)) + \sum_{(x,w) \in \vec E} g((x, w)). \]
Thus we are in a supersymmetric situation (see, e.g., [7]) and, apart from zero eigenvalues, the spectra of $\Delta_0$ and $\Delta_1$ coincide, including multiplicities.
In the case of a $k$-regular graph $G$ (that is, a graph whose vertices have degree $k$), $\Delta_0 = k\mathbb{1} - A$, $A$ being the adjacency matrix of $G$. If $G$ is finite, then $k$ is an eigenvalue (corresponding to the constant eigenfunction) of $A$, and we denote by $\mu \ge 0$ the absolute value of the next to largest eigenvalue (in absolute value) of $A$. If $G$ is bipartite, then the eigenvalues of $A$ are symmetrically distributed around 0, so that also $-k$ is an eigenvalue in the finite case. Thus we denote by $\mu_b \ge 0$ the absolute value of the third to largest eigenvalue (in absolute value) of $A$.
Definition 13. A $k$-regular graph $G$ is called Ramanujan if $\mu \le 2\sqrt{k-1}$. A bipartite $k$-regular graph $G$ is called bipartite Ramanujan if
\[ \mu_b \le 2\sqrt{k-1}. \tag{48} \]
Conjecture. The graphs Gd introduced in Def. 12 for SL(2, Z/dZ) are bipartite Ramanujan.
Proposition 14. If the above conjecture holds true, then instead of (34) we obtain the optimal estimate
\[ \mathrm{sr}\bigl(\bar T_n^d - \bar\Pi_n^d\bigr) \le 1/\sqrt2 \]
for the spectral radius of the reduced transition matrix. If in addition the inequality (48) is strict, then for some $C > 0$,
\[ \bigl\| (\bar T_n^d - \bar\Pi_n^d)^k \bigr\| \le C \cdot 2^{-k/2} \qquad (k \in \mathbb{N}). \tag{49} \]
Proof. The spectrum of $B$ is a subset of the spectrum of its lift $\bar B$ to the group ring $\mathbb{C}[SL(2, \mathbb{Z}/d\mathbb{Z})]$. Now $3\bar B = 3\mathbb{1} - \Delta_1$, with $\Delta_1$ being the edge Laplacian for the graph $G_d$ of Def. 12. So (apart from multiplicity of the eigenvalue 3) the spectrum of $3\bar B$ equals the spectrum of the adjacency matrix $A = 3\mathbb{1} - \Delta_0$. This implies that (46) holds true if the three-regular graph $G_d$ is Ramanujan.
We remarked that as a consequence of Lemma 10 $\bar T$ is semi-simple if no eigenvalue equals $\pm1/\sqrt2$. This is the case under our assumption on the spectrum of the graph. But a semi-simple matrix is normal in some metric, and the norm of a power of a normal matrix equals the power of the norm.
Ramanujan graphs are in a sense optimal, since the estimate $\mu \le c$ is always violated for large $k$-regular graphs if $c < 2\sqrt{k-1}$. They have number-theoretic and engineering applications. In the articles [3] of Chiu, [17] by Lubotzky, Phillips and Sarnak, [18] by Margulis, [22, 23] by Pizer and the books [24] by Sarnak and [16] by Lubotzky one finds different constructions leading to Ramanujan graphs. Ref. [26] by Venkov and Nikitin is a general survey.
Our family of graphs described in Def. 12 does not fall in one of these known families of Ramanujan graphs. However, the following weaker result follows from results on groups similar to Kazhdan groups, see the book [16] by Lubotzky.
Proposition 15. There exists $\varepsilon > 0$ (independent of $d$) such that all graphs $G_d$ with $d > 2$ meet the estimate $\mu_b \le 3 - \varepsilon$.
Definition 16. The Fell Topology on the set $\hat G$ of equivalence classes of irreducible unitary (separable) representations $\rho : G \to U(H)$ of a locally compact (separable) group $G$ is generated by the open neighbourhoods
\[ W(K, \varepsilon, v) := \bigl\{ (H', \rho') \in \hat G \mid \exists v' \in H', \|v'\| = 1, \forall g \in K : |\langle v, \rho(g)v\rangle - \langle v', \rho'(g)v'\rangle| < \varepsilon \bigr\} \]
of $(H, \rho)$, where $K \subset G$ is compact, $\varepsilon > 0$ and $v \in H$ has norm $\|v\| = 1$.
A finitely generated group $G$ has property $\tau$ w.r.t. a family $\{N_i\}_{i \in I}$ of finite index normal subgroups, if the trivial representation is isolated in the set $\{\rho \in \hat G \mid \exists i \in I : N_i \subset \ker(\rho)\}$ of (equivalence classes of) representations.
Proof of Proposition 15. The group $G := SL(2, \mathbb{Z})$ is known to have property $\tau$ w.r.t. the family $N_i \equiv \Gamma_i = \ker(SL(2, \mathbb{Z}) \to SL(2, \mathbb{Z}/i\mathbb{Z}))$, $(i \in \mathbb{N})$ of normal subgroups, see [16], Example 4.3.3 D. Then the proposition follows from Theorem 4.3.2 of [16].
7. Estimating Expectations

Our aim is to estimate expectations $\langle f\rangle_k$, beginning with $\langle\chi_m\rangle_k$. For $m_1, m_2 \in \mathbb{N}$ one has
\[ \chi_{m_1} \cdot \chi_{m_2} = \chi_m \quad\text{with}\quad m := \frac{m_1 m_2}{\gcd(m_1, m_2)} \tag{50} \]
being the least common multiple of $m_1$ and $m_2$. This implies a rule for multiplying the $c_m = 1 - 2\chi_m$. In particular we are interested in products of the functions $\lambda_{p,r} := \prod_{l=1}^r c_{p^l} : \mathbb{N} \to \{-1, 1\}$, $p \in \mathbb{P}$, $r \in \mathbb{N}_0$ which approximate the function $\lambda_p$ defined in (8) in the sense that $\lambda_{p,r}(m) = \lambda_p(m)$ for $m < p^{r+1}$.
Lemma 17.
\[ \lambda_{p,r} = 1 + 2\sum_{l=1}^r (-1)^l \chi_{p^l}. \]
Proof. $\lambda_{p,1} = c_p = 1 - 2\chi_p$ and
\[ \lambda_{p,r+1} = \lambda_{p,r} \cdot c_{p^{r+1}} = \Bigl(1 + 2\sum_{l=1}^r (-1)^l \chi_{p^l}\Bigr)(1 - 2\chi_{p^{r+1}}) = 1 + 2\sum_{l=1}^r (-1)^l \chi_{p^l} - 2\chi_{p^{r+1}}\Bigl(1 + 2\sum_{l=1}^r (-1)^l\Bigr), \]
using (50). But $1 + 2\sum_{l=1}^r (-1)^l = (-1)^r$.
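Lemma 17 can be tested pointwise on integers. The sketch below assumes that $\chi_m$ is the indicator function of the multiples of $m$, which is the reading suggested by the product rule (50); the function names are illustrative.

```python
def chi(m, n):
    """Indicator of 'm divides n' (assumed meaning of chi_m)."""
    return 1 if n % m == 0 else 0

def lam_product(p, r, n):
    """lambda_{p,r}(n) as the product of c_{p^l}(n) = 1 - 2 chi_{p^l}(n), l = 1..r."""
    out = 1
    for l in range(1, r + 1):
        out *= 1 - 2 * chi(p ** l, n)
    return out

def lam_sum(p, r, n):
    """Right-hand side of Lemma 17 evaluated at n."""
    return 1 + 2 * sum((-1) ** l * chi(p ** l, n) for l in range(1, r + 1))
```

Both functions return $\pm1$, and under the stated assumption they agree for every $p$, $r$ and $n$, since only the first $\min(r, v_p(n))$ factors differ from 1.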
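The existence of thermodynamic limits follows from the Perron–Frobenius Theorem:
Proposition 18. For $m \in \mathbb{N}$,
\[ \langle\chi_m\rangle_\infty = m^{-1} \prod_{p \in \mathbb{P},\, p \mid m} (1 + 1/p)^{-1}, \tag{51} \]
for $p \in \mathbb{P}$, $r \in \mathbb{N}$,
\[ \langle\lambda_{p,r}\rangle_\infty = \frac{p^2 + 1 - 2(-1/p)^{r-1}}{(p+1)^2} \quad\text{and}\quad \langle\lambda_p\rangle_\infty = \frac{p^2 + 1}{(p+1)^2}. \tag{52} \]
Furthermore, for any set $\{p_1, \ldots, p_s\} \subset \mathbb{P}$ of primes and any numbers $r_1, \ldots, r_s \in \mathbb{N} \cup \{\infty\}$,
\[ \Bigl\langle \prod_{i=1}^s \lambda_{p_i, r_i} \Bigr\rangle_\infty = \prod_{i=1}^s \langle\lambda_{p_i, r_i}\rangle_\infty. \tag{53} \]
Finally $\langle\lambda\rangle_\infty = 0$.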
Proof. The limit (51) (defined in (9)) exists, since by (35)
\[ \langle\chi_m\rangle_k = (\tilde\chi_m, (\bar T_m^m)^k \bar v_{m,0}), \]
and the doubly stochastic matrix $\bar T_m^m$ is irreducible-aperiodic. This matrix acts on the $|O_m^m| = m \prod_{p \in \mathbb{P},\, p \mid m} (1 + 1/p)$–dimensional vector space $\bar V_m^m$ (see (30)), and $\tilde\chi_m$ is a one-dimensional projection on $\bar V_m^m$. This implies (51).
Then the first formula of (52) follows from Lemma 17. This also implies that $\lambda_{p,2r-1} \le \lambda_p \le \lambda_{p,2r}$, whence the second part of (52).
The independence (53) of the expectations in the thermodynamic limit is a consequence of the fact that the dimension $|O_m^m|$ of $\bar V_m^m$ is a multiplicative function of $m$.
$\langle\lambda\rangle_\infty = 0$, since by (52) and (53) for $l \in \mathbb{N}$,
\[ \ln \Bigl\langle \prod_{p \in \mathbb{P},\, p \le l} \lambda_p \Bigr\rangle_\infty = \sum_{p \in \mathbb{P},\, p \le l} \ln \frac{p^2 + 1}{(p+1)^2}, \]
which diverges to $-\infty$ as $l \to \infty$, since
\[ \frac{p^2 + 1}{(p+1)^2} = 1 - \frac{2p}{(p+1)^2} \sim 1 - \frac{2}{p} \]
and $\sum_{p \in \mathbb{P}} p^{-1} = \infty$.
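The step from (51) and Lemma 17 to the first formula of (52) is a finite geometric sum, which can be confirmed in exact arithmetic. The sketch below takes (51) as given and uses illustrative function names.

```python
from fractions import Fraction

def prime_divisors(m):
    """Distinct prime divisors of m by trial division."""
    ps, q = [], 2
    while q * q <= m:
        if m % q == 0:
            ps.append(q)
            while m % q == 0:
                m //= q
        q += 1
    if m > 1:
        ps.append(m)
    return ps

def chi_limit(m):
    """<chi_m>_infinity according to (51)."""
    val = Fraction(1, m)
    for p in prime_divisors(m):
        val /= 1 + Fraction(1, p)
    return val

def lam_limit_via_lemma17(p, r):
    """<lambda_{p,r}>_infinity from Lemma 17 and linearity of the expectation."""
    return 1 + 2 * sum((-1) ** l * chi_limit(p ** l) for l in range(1, r + 1))

def lam_limit_closed(p, r):
    """Closed form stated in (52), for r >= 1."""
    return (p * p + 1 - 2 * Fraction(-1, p) ** (r - 1)) / Fraction((p + 1) ** 2)
```

For instance, `lam_limit_closed(2, 1)` and `lam_limit_via_lemma17(2, 1)` both evaluate to `Fraction(1, 3)`.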
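The estimate
\[ |\langle\chi_m\rangle_k - \langle\chi_m\rangle_\infty| \le C(m)\, 2^{-k/2} \]
for the deviation (27) from the thermodynamic limit would follow from the validity of (49).

A. The Lewis Equation

In [5] we showed that the free energy
\[ F(\beta) := \lim_{k \to \infty} \frac{-1}{\beta k} \ln(Z_k(\beta)) \]
of the Number-Theoretical Spin Chain equals for $0 < \beta < 2$, $F(\beta) = -\ln(\lambda(\beta))/\beta$, where $\lambda(\beta)$ is the largest eigenvalue of a transfer operator $\tilde C(\beta)$ (by (4) $F(\beta) = 0$ for $\beta \ge 2$). This operator on $l^2(\mathbb{N}_0)$ has matrix elements
\[ \tilde C(\beta)_{m,r} = (-1)^r 2^{-\beta-m-r} \Bigl[ \binom{-\beta-m}{r} + \sum_{s=0}^{m} 2^s \binom{m}{s} \binom{-\beta-m}{r-s} \Bigr], \tag{54} \]
($m, r \in \mathbb{N}_0$), with the binomial coefficients $\binom{a}{b} = \bigl(\prod_{i=0}^{b-1} (a - i)\bigr)/b!$, $a \in \mathbb{R}$, $b \in \mathbb{N}_0$, and $\binom{a}{b} = 0$ if $b < 0$.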
Proposition 19. The eigenvalue $\lambda(\beta)$ coincides with the largest eigenvalue $\lambda$ of the Lewis three-term functional equation
\[ \lambda \cdot \psi(x) = \psi(x+1) + x^{-\beta} \psi(1 + 1/x). \tag{55} \]
Proof. For analytical questions (in particular existence of the Perron–Frobenius eigenvalue of multiplicity one) we refer to [14] resp. [5].
First we transform (55) by substituting $1/x$ for $x$ and then dividing through $x^\beta$:
\[ \lambda \cdot x^{-\beta} \psi(1/x) = x^{-\beta} \psi(1/x + 1) + \psi(1 + x). \]
The r.h.s. coincides with the r.h.s. of (55). Thus
\[ \psi(w) = w^{-\beta} \psi(1/w), \tag{56} \]
and we use (56) to transform the r.h.s. of (55):
\[ K_\beta \psi = \lambda \cdot \psi \tag{57} \]
with $K_\beta : L^2((0,1)) \to L^2((0,1))$,
\[ K_\beta \psi(x) := (x+1)^{-\beta} \Bigl[ \psi\Bigl(\frac{x}{x+1}\Bigr) + \psi\Bigl(\frac{1}{x+1}\Bigr) \Bigr]. \]
Expanding $\psi$ around 1 in the form $\psi(w) = \sum_m a_m (1 - w)^m$, we obtain
\[ (x+1)^{-\beta} \psi\Bigl(\frac{x}{x+1}\Bigr) = \sum_{m=0}^\infty a_m (x+1)^{-\beta-m} = \sum_{m=0}^\infty a_m 2^{-\beta-m} \sum_{r=0}^\infty (-1)^r \binom{-\beta-m}{r} 2^{-r} (1-x)^r, \tag{58} \]
since $(x+1)^\alpha = 2^\alpha (1 + \frac12(x-1))^\alpha = 2^\alpha \sum_{r=0}^\infty \binom{\alpha}{r} 2^{-r} (x-1)^r$.
Similarly
\[ (x+1)^{-\beta} \psi\Bigl(\frac{1}{x+1}\Bigr) = \sum_{m=0}^\infty a_m x^m (x+1)^{-\beta-m} = \sum_{m=0}^\infty a_m 2^{-\beta-m} \sum_{t=0}^\infty \sum_{s=0}^m (-1)^{t+s} \binom{m}{s} \binom{-\beta-m}{t} 2^{-t} (1-x)^{t+s}, \tag{59} \]
since $x^m = \sum_{s=0}^m \binom{m}{s} (x-1)^s$.
The sum of (58) and (59) corresponds to (54). This proves the claim, since by the positivity of the $a_m$ (see [5]) $\psi$ is positive, too.
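The correspondence between (58), (59) and the matrix elements (54) can be checked numerically on a single monomial $\psi(w) = (1-w)^m$, for which $K_\beta\psi(x) = (x+1)^{-\beta-m}(1 + x^m)$. The sketch below is a floating-point plausibility check only; the truncation order `terms` and the function names are choices made here, not part of the paper.

```python
def binom(a, b):
    """Generalized binomial coefficient: prod_{i<b} (a - i) / b!, and 0 for b < 0."""
    if b < 0:
        return 0.0
    out = 1.0
    for i in range(b):
        out *= (a - i) / (i + 1)
    return out

def c_tilde(beta, m, r):
    """Matrix element (54)."""
    alpha = -beta - m
    inner = binom(alpha, r) + sum(2 ** s * binom(m, s) * binom(alpha, r - s)
                                  for s in range(m + 1))
    return (-1) ** r * 2.0 ** (alpha - r) * inner

def k_beta_monomial(beta, m, x):
    """K_beta applied to psi(w) = (1 - w)^m, evaluated at x."""
    return (x + 1) ** (-beta) * ((1 - x / (x + 1)) ** m + (1 - 1 / (x + 1)) ** m)

def series(beta, m, x, terms=80):
    """Truncated expansion sum_r C(beta)_{m,r} (1 - x)^r."""
    return sum(c_tilde(beta, m, r) * (1 - x) ** r for r in range(terms))
```

For $x$ in $(0,1)$ the series converges geometrically, so `series(1.3, 3, 0.7)` and `k_beta_monomial(1.3, 3, 0.7)` should agree to rounding error.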
For $\beta \in \mathbb{Z} - \mathbb{N}$ the operator $\tilde C(\beta)$ leaves the subspace $\{f \in l^2(\mathbb{N}_0) \mid f(m) = 0 \text{ for } m > -\beta\}$ invariant, so that we obtain polynomial solutions of degree $|\beta| + 1$ of (57). The corresponding (eigenvalue one) solutions of
\[ \psi(x) = \psi(x+1) + (x+1)^{2(k-1)} \psi(x/(x+1)) \]
are called period polynomials and are related to the cusp forms of weight $2k$ of the modular group, see Lewis and Zagier [15].
Mayer showed in [19, 20] that the Selberg zeta function
\[ Z_{SL(2,\mathbb{Z})}(s) = \prod_{\substack{\{\gamma\} \in SL(2,\mathbb{Z}) \\ \text{primitive}}} \ \prod_{m=0}^\infty \bigl(1 - \det(\gamma)^m N(\gamma)^{-s-m}\bigr) \qquad (\mathrm{Re}(s) > 1) \]
with $N(\gamma) = \bigl(\frac12 \mathrm{Tr}(\gamma) + \sqrt{(\mathrm{Tr}(\gamma)/2)^2 - \det(\gamma)}\bigr)^2$ can be written in the form
\[ Z_{SL(2,\mathbb{Z})}(s) = \det(\mathbb{1} - L_s) \cdot \det(\mathbb{1} + L_s), \]
$L_s : A_\infty(D) \to A_\infty(D)$ being the transfer operator of the Gauss map:
\[ L_s f(z) := \sum_{n=1}^\infty (n + z)^{-2s} f\Bigl(\frac{1}{n + z}\Bigr). \]
Here $A_\infty(D)$ denotes the Banach space of functions holomorphic on the disk $D := \{z \in \mathbb{C} \mid |z - 1| < 3/2\}$ and continuous on $\bar D$. Since a fixed point $f$ of $L_s$ gives rise to a solution $\psi(x) := f(x - 1)$ of the Lewis equation, we notice a relation with the Selberg zeta function.
Acknowledgement. I am most grateful to John Lewis who explained to me his functional equation and showed how to generalize its relation with free energy from negative integral to arbitrary inverse temperatures.
References
1. Apostol, T.M.: Introduction to Analytic Number Theory. Undergraduate Texts in Mathematics. New York: Springer, 1976
2. Avron, J., Seiler, R., Simon, B.: The Index of a Pair of Projections. J. Funct. Anal. 120, 220–237 (1994)
3. Chiu, P.: Cubic Ramanujan Graphs. Combinatorica 12, 275–285 (1992)
4. Connes, A.: Formule de trace en géométrie non-commutative et hypothèse de Riemann. C. R. Acad. Sci. Paris, Ser. I 323, 1231–1236 (1996)
5. Contucci, P., Knauf, A.: The Phase Transition of the Number-Theoretical Spin Chain. Forum Mathematicum 9, 547–567 (1997)
6. Contucci, P., Knauf, A.: The Low Activity Phase of Some Dirichlet Series. J. Math. Phys. 37, 5458–5475 (1996)
7. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Texts and Monographs in Physics. Berlin: Springer, 1987
8. Good, I.J., Churchhouse, R.F.: The Riemann Hypothesis and Pseudorandom Features of the Möbius Sequence. Mathematics of Computation 22, 857–864 (1968)
9. Guerra, F., Knauf, A.: Free Energy and Correlations of the Number-Theoretical Spin Chain. J. Math. Phys. 39, 3188–3202 (1998)
10. Knauf, A.: On a Ferromagnetic Spin Chain. Commun. Math. Phys. 153, 77–115 (1993)
11. Knauf, A.: Phases of the Number-Theoretical Spin Chain. J. Stat. Phys. 73, 423–431 (1993)
12. Knauf, A.: On a Ferromagnetic Spin Chain. Part II: Thermodynamic Limit. J. Math. Phys. 35, 228–236 (1994)
13. Knauf, A.: Irregular Scattering, Number Theory, and Statistical Mechanics. In: Stochasticity and Quantum Chaos. Z. Haba et al, Eds. Dordrecht: Kluwer, 1995
14. Lewis, J.B.: Spaces of holomorphic functions equivalent to the even Maass cusp forms. Invent. Math. 127, 271–306 (1997)
15. Lewis, J.B., Zagier, D.: Period functions for Maass wave forms. MPI 96 - 112 Preprint (http://www.mpimbonn.mpg.de) (1996)
16. Lubotzky, A.: Discrete Groups, Expanding Graphs, and Invariant Measures. Progress in Mathematics 125, Basel: Birkhäuser, 1994
17. Lubotzky, A., Phillips, R., Sarnak, P.: Ramanujan Graphs. Combinatorica 8, 261–277 (1988)
18. Margulis, G.A.: Explicit Group-Theoretical Constructions of Combinatorial Schemes and their Application to the Design of Expanders and Concentrators. Problemy Peredachi Informatsii 24, 51–60 (1988)
19. Mayer, D.: On the Thermodynamic Formalism for the Gauss Map. Commun. Math. Phys. 130, 311–333 (1990)
20. Mayer, D.: The Thermodynamic Formalism Approach to Selberg's Zeta Function for $PSL(2, \mathbb{Z})$. Bull. AMS 25, 55–60 (1991)
21. Odlyzko, A.M., te Riele, H.J.J.: Disproof of Mertens conjecture. J. f. d. reine und angewandte Math. 357, 138–160 (1985)
22. Pizer, A.: An Algorithm for Computing Modular Forms on $\Gamma_0(N)$. J. Algebra 64, 340–390 (1980)
23. Pizer, A.: Ramanujan Graphs and Hecke Operators. Bull. AMS (New Series) 23, 127–137 (1990)
24. Sarnak, P.: Some applications of modular forms. Cambridge, New York: Cambridge University Press, 1990
25. Titchmarsh, E.C.: The Theory of the Riemann Zeta Function. London: Oxford University Press, 1967
26. Venkov, A., Nikitin, A.: Selberg Trace Formula, Ramanujan Graphs and some Problems of Mathematical Physics. Algebra Anal. 5, 1–76 (1993)
Communicated by P. Sarnak